This commit introduces an optional feature to provide to native emitters
the fully qualified name of the entity they are compiling.
This is achieved by altering the generic ASM API to provide a third
argument to the entry function, containing the name of the entity being
compiled. Currently only the debug emitter uses this feature, as it is
not really useful for other emitters for the time being; in fact the
macros in question just strip the name away.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit expands the implementation of Viper load/store operations
that are optimised for the Xtensa platform.
Now both load and store emitters should generate the shortest possible
sequence in all cases. Redundant specialised operation emitters have
been aliased to the general case implementation - this was the case of
integer-indexed load/store operations with a fixed offset of zero.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit updates the existing specialised implementations for
int-indexed 32-bit load and store operations, and adds a specialised
implementation for int-indexed 16-bit load.
The 32-bit operations relied on the fact that their applicability was
limited to a specific range, falling back on a generic implementation
otherwise. Introducing a single entry point for each int-indexed
load/store operation size would break that assumption. Now those two
operations contain fallback code to generate working code by themselves
instead of raising an exception.
The 16-bit operation instead simply did not have any range check, but it
was not exposed directly to the Viper emitter. When a 16-bit
int-indexed load entry point was introduced, the existing implementation
would fail when accessing memory outside its 0..255 halfwords range. A
specialised implementation is now present, performing fewer operations
than the existing Viper emitter equivalent.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit extends the generic ASM API by adding the rest of the
ASM_{LOAD,STORE}[size]_REG_REG_OFFSET macros whenever applicable.
The Viper int-indexed load/store code generator was changed to use those
API functions if they are available, falling back to backend-specific
implementations if possible and ultimately to a generic implementation.
Right now all backends except for x64 implement load16, load32, and
store32 operations (x64 only implements load16).
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit expands the Xtensa inline assembler to support most if not
all opcodes available on the ESP8266 and LX3 Xtensa cores.
This is meant as a stepping stone to add inline assembler support for
the ESP32 and its LX6 core, along to windowed-specific opcodes and
additional opcodes that are present only on the LX7 core (ESP32-S3 and
later).
New opcodes being added are covered by tests, and the provided tests
were expanded to also include opcodes available in the existing
implementation. Given that the ESP8266 space requirements are tighter
than ESP32's, certain opcodes that won't be commonly used have been put
behind a define to save some space in the general use case.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit cleans up the Viper code generation blocks for
register-indexed load and store operations.
An attempt is made to simplify the code in the common code generator
code block, by moving architecture-specific code to the appropriate
native generation backends whenever possible. This should make that
specific bit of code in the Viper generator clearer and easier to
maintain in the long term.
To achieve this, six generic assembler meta-opcodes have been
introduced, named `ASM_{LOAD,STORE}{8,16,32}_REG_REG_REG`. A
platform-independent implementation for those operations is provided, so
backends that cannot emit a shorter sequence for the requested operation
or are fine with the platform-independent implementation can just not
provide said meta-opcodes.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit improves the emitted code sequences for address generation
in the Viper subsystem when loading/storing 16 and 32 bit values via a
register offset.
The Xtensa opcodes ADDX2 and ADDX4 are used to avoid performing the
extra shifts to align the final operation offset. Those opcodes are
available on both xtensa and xtensawin MicroPython architectures.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
ASM_NOT_REG is optional, it can be synthesised by xor(reg, -1).
ASM_NEG_REG can also be synthesised with a subtraction, but most
architectures have a dedicated instruction for it.
Signed-off-by: Damien George <damien@micropython.org>
`ASM_MOV_REG_IMM_FIX_U16` and `ASM_MOV_REG_IMM_FIX_WORD` are no longer
used anywhere in the code.
See discussion in #12771.
Signed-off-by: Alessandro Gatti <a.gatti@frob.it>
This commit adds optimised l32i/s32i functions that select the best load/
store encoding based on the size of the offset, and uses the function when
necessary in code generation.
Without this, ASM_LOAD_REG_REG_OFFSET() could overflow the word offset
(using a narrow encoding), for example when loading the prelude from the
constant table when there are many (>16) constants.
Fixes issue #8458.
Signed-off-by: Damien George <damien@micropython.org>
This commit adds support for saving and loading .mpy files that contain
native code (native, viper and inline-asm). A lot of the ground work was
already done for this in the form of removing pointers from generated
native code. The changes here are mainly to link in qstr values to the
native code, and change the format of .mpy files to contain native code
blocks (possibly mixed with bytecode).
A top-level summary:
- @micropython.native, @micropython.viper and @micropython.asm_thumb/
asm_xtensa are now allowed in .py files when compiling to .mpy, and they
work transparently to the user.
- Entire .py files can be compiled to native via mpy-cross -X emit=native
and for the most part the generated .mpy files should work the same as
their bytecode version.
- The .mpy file format is changed to 1) specify in the header if the file
contains native code and if so the architecture (eg x86, ARMV7M, Xtensa);
2) for each function block the kind of code is specified (bytecode,
native, viper, asm).
- When native code is loaded from a .mpy file the native code must be
modified (in place) to link qstr values in, just like bytecode (see
py/persistentcode.c:arch_link_qstr() function).
In addition, this now defines a public, native ABI for dynamically loadable
native code generated by other languages, like C.
All architectures now have a dedicated register to hold the pointer to the
native function table mp_fun_table, and so they all need to load this
register at the start of the native function. This commit makes the
loading of this register uniform across architectures by passing the
pointer in the constant table for the native function, and then loading the
register from the constant table. Doing it this way means that the pointer
is not stored in the assembly code, helping to make the code more portable.
After the previous commit this macro is no longer needed by the native
emitter because live heap pointers are no longer stored in generated native
machine code.
Loading a pointer by indexing into the native function table mp_fun_table,
rather than loading an immediate value (via a PC-relative load), uses less
code space.
On x86 archs (both 32 and 64 bit) a bool return value only sets the 8-bit
al register, and the higher bits of the ax register have an undefined
value. When testing the return value of such cases it is required to just
test al for zero/non-zero. On the other hand, checking for truth or
zero/non-zero on an integer return value requires checking all bits of the
register. These two cases must be distinguished and handled correctly in
generated native code. This patch makes sure of this.
For other supported native archs (ARM, Thumb2, Xtensa) there is no such
distinction and this patch does not change anything for them.
All the asm macro names that convert a particular architecture to a generic
interface now follow the convention whereby the "destination" (usually a
register) is specified first.
For archs that have 16-bit pointers, the asmxtensa.h file can give compiler
warnings about left-shift being greater than the width of the type (due to
the inline functions in this header file). Explicitly casting the
constants to uint32_t stops these warnings.