This chapter describes what needs to be done to port libjit
to a new CPU architecture. It is assumed that the reader is familiar
with compiler implementation techniques and the particulars of their
target CPU's instruction set.
We will use ARCH to represent the name of the architecture
in the sections that follow. It is usually the name of the CPU in
lower case (e.g. x86, arm, ppc, etc). By convention, all back end
functions should be prefixed with _jit, because they are not part
of the public API.
20.1 Porting the function apply facility
20.2 Creating the instruction generation macros
20.3 Writing the architecture definition rules
20.4 Allocating registers in the back end
20.1 Porting the function apply facility
The first step in porting libjit to a new architecture is to port
the jit_apply facility. This provides support for calling
arbitrary C functions from your application or from JIT'ed code.
If you are familiar with libffi or ffcall, then jit_apply
provides a similar facility.
Even if you don't intend to write a native code generator, you will
probably still need to port jit_apply to each new architecture.
The libjit library makes use of gcc's __builtin_apply
facility to do most of the hard work of function application.
This gcc facility takes three arguments: a pointer to the function
to invoke, a structure containing register arguments, and a size
value that indicates the number of bytes to push onto the stack
for the call.
Unfortunately, the register argument structure is very system dependent. There is no standard format for it, but it usually looks something like this:
stack_args
Pointer to an array of argument values to push onto the stack.
struct_ptr
Pointer to the buffer to receive a struct return value.
The struct_ptr field is only present if the architecture
passes struct pointers in a special register.
word_reg[0..N]
Values for the word registers. Platforms that pass values in registers will populate these fields. Not present if the architecture does not use word registers for function calls.
float_reg[0..N]
Values for the floating-point registers. Not present if the architecture does not use floating-point registers for function calls.
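As a rough illustration, the layout described above might look like the following C structure on an imaginary platform that passes struct pointers in a special register and uses four word registers and two floating-point registers for arguments. The name demo_apply_args, the register counts, and the field types are assumptions made for this sketch. The real layout is whatever the compiler produces on your platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NUM_WORD_REGS  4   /* assumption: four word argument registers */
#define DEMO_NUM_FLOAT_REGS 2   /* assumption: two float argument registers */

/* Hypothetical register argument structure, following the field list above. */
typedef struct
{
    void   *stack_args;                      /* values to push onto the stack */
    void   *struct_ptr;                      /* buffer for a struct return value */
    long    word_reg[DEMO_NUM_WORD_REGS];    /* word register arguments */
    double  float_reg[DEMO_NUM_FLOAT_REGS];  /* floating-point register arguments */
} demo_apply_args;

/* Place the leading word arguments into registers and return the number
   of arguments that are left over and would have to go onto the stack. */
static int demo_fill_word_args(demo_apply_args *args, const long *values, int count)
{
    int in_regs = count < DEMO_NUM_WORD_REGS ? count : DEMO_NUM_WORD_REGS;
    assert(args != NULL && values != NULL && count >= 0);
    memset(args, 0, sizeof(*args));
    memcpy(args->word_reg, values, (size_t)in_regs * sizeof(long));
    return count - in_regs;
}
```

With six word arguments, this sketch puts four in word_reg and reports two left over for the stack.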
It is possible to automatically detect the particulars of this structure
by making test function calls and inspecting where the arguments end up
in the structure. The gen-apply program in libjit/tools
takes care of this. It outputs the jit-apply-rules.h file,
which tells jit_apply how to operate.
The gen-apply program will normally "just work", but it is possible
that some architectures will be stranger than usual. In that case, you
will need to modify gen-apply to detect this additional strangeness,
and perhaps also modify libjit/jit/jit-apply.c.
If you aren't using gcc to compile libjit, then things may
not be quite this easy. You may have to write some inline assembly
code to emulate __builtin_apply. See the file jit-apply-x86.h
for an example of how to do this.
Be sure to add an #include line to jit-apply-func.h
once you do this.
The other half of jit_apply is closure and redirector support.
Closures are used to wrap up interpreted functions so that they can be
called as regular C functions. Redirectors are used to help compile a
JIT'ed function on-demand, and then redirect control to it.
Unfortunately, you will have to write some assembly code to support
closures and redirectors. The built-in gcc facilities are not complete
enough to handle the task. See jit-apply-x86.c and jit-apply-arm.c
for some examples from existing architectures.
You may be able to get some ideas from the libffi and ffcall
libraries as to what you need to do on your architecture.
20.2 Creating the instruction generation macros
You will need a large number of macros and support functions to
generate the raw instructions for your chosen CPU. These macros are
fairly generic and are not necessarily specific to libjit.
There may already be a suitable set of macros for your CPU in
some other Free Software project.
Typically, the macros are placed into a file called jit-gen-ARCH.h
in the libjit/jit directory. If some of the macros are complicated,
you can place helper functions into the file jit-gen-ARCH.c.
Remember to add both jit-gen-ARCH.h and jit-gen-ARCH.c
to Makefile.am in libjit/jit.
Existing examples that you can look at for ideas are jit-gen-x86.h
and jit-gen-arm.h. The macros in these existing files assume that
instructions can be output to a buffer in a linear fashion, and that each
instruction is relatively independent of the next.
This independence principle may not hold for all CPUs. For example,
the ia64 packs up to three instructions into a single "bundle"
for parallel execution. We recommend that the macros should appear to
use linear output, but call helper functions to pack bundles after the fact.
This will make it easier to write the architecture definition rules.
A similar approach could be used for performing instruction scheduling
on platforms that require it.
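The linear output style described above can be sketched with a handful of macros that write bytes through an advancing pointer. All the demo_ names below are hypothetical and are not the macros found in jit-gen-x86.h. The two x86 encodings used (0xB8 followed by a 32-bit little-endian immediate for "mov eax, imm32", and 0xC3 for "ret") are the standard ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output macros: "inst" is an unsigned char pointer that
   advances as each byte is written. */
#define demo_emit_byte(inst, b) \
    do { *(inst)++ = (unsigned char)(b); } while (0)

/* Write a 32-bit immediate in little-endian byte order. */
#define demo_emit_imm32(inst, v) \
    do { \
        unsigned long demo_v = (unsigned long)(v); \
        demo_emit_byte((inst), demo_v & 0xFF); \
        demo_emit_byte((inst), (demo_v >> 8) & 0xFF); \
        demo_emit_byte((inst), (demo_v >> 16) & 0xFF); \
        demo_emit_byte((inst), (demo_v >> 24) & 0xFF); \
    } while (0)

/* x86 "mov eax, imm32": opcode 0xB8 followed by the immediate. */
#define demo_x86_mov_eax_imm(inst, v) \
    do { demo_emit_byte((inst), 0xB8); demo_emit_imm32((inst), (v)); } while (0)

/* x86 "ret": the single byte 0xC3. */
#define demo_x86_ret(inst) demo_emit_byte((inst), 0xC3)

/* Emit "mov eax, value; ret" into buf and return the number of bytes written. */
static size_t demo_emit_return_const(unsigned char *buf, size_t size, int value)
{
    unsigned char *inst = buf;
    assert(buf != NULL && size >= 6);
    demo_x86_mov_eax_imm(inst, value);
    demo_x86_ret(inst);
    return (size_t)(inst - buf);
}
```

Because each macro only appends to the buffer, a port for a bundling architecture could keep the same interface and defer the packing to a helper function.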
20.3 Writing the architecture definition rules
The architecture definition rules for a CPU are placed into the files
jit-rules-ARCH.h and jit-rules-ARCH.c. You should add
both of these files to Makefile.am in libjit/jit.
You will also need to edit jit-rules.h in two places. First,
place detection logic at the top of the file to detect your platform
and define JIT_BACKEND_ARCH to 1. Further down the file,
you should add the following two lines to the include file logic:
#elif defined(JIT_BACKEND_ARCH)
#include "jit-rules-ARCH.h"
Every rule header file needs to define the macro JIT_REG_INFO to
an array of values that represents the properties of the CPU's
registers. The _jit_reg_info array is populated with
these values. JIT_NUM_REGS defines the number of
elements in the array. Each element in the array has the
following members:
name
The name of the register. This is used for debugging purposes.
cpu_reg
The raw CPU register number. Registers in libjit are
referred to by their pseudo register numbers, corresponding to
their index within JIT_REG_INFO. However, these pseudo
register numbers may not necessarily correspond to the register
numbers used by the actual CPU. This field provides a mapping.
other_reg
The second pseudo register in a 64-bit register pair, or -1 if the current register cannot be used as the first pseudo register in a 64-bit register pair. This field only has meaning on 32-bit platforms, and should always be set to -1 on 64-bit platforms.
flags
Flag bits that describe the pseudo register's properties.
The following flags may be present:
JIT_REG_WORD
This register can hold an integer word value.
JIT_REG_LONG
This register can hold a 64-bit long value without needing a second register. Normally only used on 64-bit platforms.
JIT_REG_FLOAT32
This register can hold a 32-bit floating-point value.
JIT_REG_FLOAT64
This register can hold a 64-bit floating-point value.
JIT_REG_NFLOAT
This register can hold a native floating-point value.
JIT_REG_FRAME
This register holds the frame pointer. You will almost always supply JIT_REG_FIXED for this register.
JIT_REG_STACK_PTR
This register holds the stack pointer. You will almost always supply JIT_REG_FIXED for this register.
JIT_REG_FIXED
This register has a fixed meaning and cannot be used for general allocation.
JIT_REG_CALL_USED
This register will be destroyed by a function call.
JIT_REG_IN_STACK
This register is in a stack-like arrangement.
JIT_REG_GLOBAL
This register is a candidate for global register allocation.
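To make the members and flags above more concrete, here is a sketch of a register table for an imaginary 32-bit CPU. The demo_reginfo_t type, the DEMO_ flag bit values, and the register set are all assumptions made for illustration. A real port would use the JIT_REG_* flags and define JIT_REG_INFO and JIT_NUM_REGS in its rule header.

```c
#include <assert.h>
#include <stddef.h>

/* Flag bit values here are illustrative assumptions, not libjit's. */
#define DEMO_REG_WORD       (1 << 0)
#define DEMO_REG_FLOAT64    (1 << 1)
#define DEMO_REG_FRAME      (1 << 2)
#define DEMO_REG_STACK_PTR  (1 << 3)
#define DEMO_REG_FIXED      (1 << 4)
#define DEMO_REG_CALL_USED  (1 << 5)
#define DEMO_REG_GLOBAL     (1 << 6)

typedef struct
{
    const char *name;       /* register name, for debugging */
    short       cpu_reg;    /* raw CPU register number */
    short       other_reg;  /* second pseudo register of a 64-bit pair, or -1 */
    int         flags;      /* DEMO_REG_* flag bits */
} demo_reginfo_t;

/* The array index is the pseudo register number. */
#define DEMO_REG_INFO \
    {"r0", 0, 1, DEMO_REG_WORD | DEMO_REG_CALL_USED}, \
    {"r1", 1, -1, DEMO_REG_WORD | DEMO_REG_CALL_USED}, \
    {"r4", 4, -1, DEMO_REG_WORD | DEMO_REG_GLOBAL}, \
    {"fp", 11, -1, DEMO_REG_FIXED | DEMO_REG_FRAME}, \
    {"sp", 13, -1, DEMO_REG_FIXED | DEMO_REG_STACK_PTR}, \
    {"f0", 0, -1, DEMO_REG_FLOAT64 | DEMO_REG_CALL_USED}
#define DEMO_NUM_REGS 6

static const demo_reginfo_t demo_reg_info[DEMO_NUM_REGS] = { DEMO_REG_INFO };

/* Map a pseudo register number to its raw CPU register number. */
static int demo_reg_to_cpu(int reg)
{
    assert(reg >= 0 && reg < DEMO_NUM_REGS);
    return demo_reg_info[reg].cpu_reg;
}

/* A register can be used for general allocation only if it is not fixed. */
static int demo_reg_is_allocatable(int reg)
{
    assert(reg >= 0 && reg < DEMO_NUM_REGS);
    return (demo_reg_info[reg].flags & DEMO_REG_FIXED) == 0;
}
```

Note that pseudo register 2 maps to CPU register 4 in this table, which is the kind of mismatch the cpu_reg member exists to handle.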
A CPU may have some registers arranged into a stack. In this case operations can typically only occur at the top of the stack, and may automatically pop values as a side-effect of the operation. An example of such an architecture is the x87 floating-point unit. Such a CPU requires three additional macros.
JIT_REG_STACK
If defined, this indicates the presence of the register stack.
JIT_REG_STACK_START
The index of the first register in the JIT_REG_INFO array that is used in a stack-like arrangement.
JIT_REG_STACK_END
The index of the last register in the JIT_REG_INFO array that is used in a stack-like arrangement.
The entries in the JIT_REG_INFO array from JIT_REG_STACK_START
up to JIT_REG_STACK_END must also have the JIT_REG_IN_STACK
flag set.
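The relationship between the stack range macros and the JIT_REG_IN_STACK flag can be checked mechanically. The sketch below uses hypothetical DEMO_ names, made-up flag values, and a floating-point stack that is only four deep for brevity. It is also slightly stricter than the rule stated above, since it rejects flagged entries outside the range as well.

```c
#include <assert.h>

#define DEMO_REG_WORD      (1 << 0)   /* illustrative flag values */
#define DEMO_REG_FLOAT64   (1 << 1)
#define DEMO_REG_IN_STACK  (1 << 2)

typedef struct
{
    const char *name;
    int         cpu_reg;
    int         other_reg;
    int         flags;
} demo_reginfo_t;

/* Two word registers followed by a four-deep floating-point stack. */
static const demo_reginfo_t demo_reg_info[] = {
    {"eax", 0, -1, DEMO_REG_WORD},
    {"ecx", 1, -1, DEMO_REG_WORD},
    {"st0", 0, -1, DEMO_REG_FLOAT64 | DEMO_REG_IN_STACK},
    {"st1", 1, -1, DEMO_REG_FLOAT64 | DEMO_REG_IN_STACK},
    {"st2", 2, -1, DEMO_REG_FLOAT64 | DEMO_REG_IN_STACK},
    {"st3", 3, -1, DEMO_REG_FLOAT64 | DEMO_REG_IN_STACK}
};

#define DEMO_REG_STACK        1
#define DEMO_REG_STACK_START  2
#define DEMO_REG_STACK_END    5

/* Return 1 if exactly the entries in START..END carry the IN_STACK flag. */
static int demo_stack_range_is_consistent(void)
{
    int reg;
    int count = (int)(sizeof(demo_reg_info) / sizeof(demo_reg_info[0]));
    assert(DEMO_REG_STACK_START <= DEMO_REG_STACK_END);
    assert(DEMO_REG_STACK_END < count);
    for (reg = 0; reg < count; ++reg)
    {
        int in_range = (reg >= DEMO_REG_STACK_START && reg <= DEMO_REG_STACK_END);
        int flagged = (demo_reg_info[reg].flags & DEMO_REG_IN_STACK) != 0;
        if (in_range != flagged)
        {
            return 0;
        }
    }
    return 1;
}
```

A check like this could be run once from a debug build to catch a table that was edited without updating the range macros.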
The rule file may also have definitions of the following macros:
JIT_NUM_GLOBAL_REGS
The number of registers that are used for global register allocation. Set to zero if global register allocation should not be used.
JIT_ALWAYS_REG_REG
Define this to 1 if arithmetic operations must always be performed on registers. Define this to 0 if register/memory and memory/register operations are possible.
JIT_PROLOG_SIZE
If defined, this indicates the maximum size of the function prolog.
JIT_FUNCTION_ALIGNMENT
This value indicates the alignment required for the start of a function. For example, define this to 32 if functions should be aligned on a 32-byte boundary.
JIT_ALIGN_OVERRIDES
Define this to 1 if the platform allows reads and writes on any byte boundary. Define to 0 if only properly-aligned memory accesses are allowed. Normally only defined to 1 under x86.
jit_extra_gen_state
jit_extra_gen_init
jit_extra_gen_cleanup
The jit_extra_gen_state macro can be supplied to add extra fields
to the struct jit_gencode type in jit-rules.h, for
extra CPU-specific code generation state information.
The jit_extra_gen_init macro initializes this extra information,
and the jit_extra_gen_cleanup macro cleans it up when code
generation is complete.
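The three macros work together roughly as in the following sketch. Everything here is a stand-in: demo_gencode plays the part of struct jit_gencode, the demo_extra_gen_* macros play the part of the jit_extra_gen_* macros, and the literal pool is just an invented example of CPU-specific state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical extra fields, spliced into the generic state structure. */
#define demo_extra_gen_state \
    int *literal_pool; \
    int  literal_pool_num

/* Hypothetical initializer for the extra fields. */
#define demo_extra_gen_init(gen) \
    do { \
        (gen)->literal_pool = NULL; \
        (gen)->literal_pool_num = 0; \
    } while (0)

/* Hypothetical cleanup, run when code generation is complete. */
#define demo_extra_gen_cleanup(gen) \
    do { \
        free((gen)->literal_pool); \
        (gen)->literal_pool = NULL; \
        (gen)->literal_pool_num = 0; \
    } while (0)

/* Stand-in for the generic state: common fields plus the CPU-specific ones. */
struct demo_gencode
{
    unsigned char *ptr;     /* current output position */
    unsigned char *limit;   /* end of the output buffer */
    demo_extra_gen_state;
};

/* Record a constant in the literal pool; returns zero if out of memory. */
static int demo_add_literal(struct demo_gencode *gen, int value)
{
    int *pool;
    assert(gen != NULL);
    pool = (int *)realloc(gen->literal_pool,
                          (size_t)(gen->literal_pool_num + 1) * sizeof(int));
    if (!pool)
    {
        return 0;
    }
    pool[gen->literal_pool_num++] = value;
    gen->literal_pool = pool;
    return 1;
}
```

Keeping the extra fields behind a macro means the generic structure definition does not need to know anything about the CPU.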
Initialize the backend. This is normally used to configure registers that may not appear on all CPUs in a given family. For example, only some ARM cores have floating-point registers.
Get the ELF machine and ABI type information for this platform.
The machine field should be set to one of the EM_*
values in jit-elf-defs.h. The abi field should
be set to one of the ELFOSABI_* values in jit-elf-defs.h
(ELFOSABI_SYSV will normally suffice if you are unsure).
The abi_version field should be set to the ABI version,
which is usually zero.
Create instructions in the entry block to initialize the registers and frame offsets that contain the parameters. Returns zero if out of memory.
This function is called when a builder is initialized. It should
scan the signature and decide which register or frame position
contains each of the parameters and then call either
jit_insn_incoming_reg or jit_insn_incoming_frame_posn
to notify libjit of the location.
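The decision step described above can be sketched in isolation. The calling convention used here is invented: the first four word-sized parameters arrive in pseudo registers 0 to 3, and the rest arrive in the frame starting at offset 8. The demo_ names are hypothetical. A real back end would call jit_insn_incoming_reg or jit_insn_incoming_frame_posn at the two marked points rather than filling in an array.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_ARG_REGS        4  /* assumption: four argument registers */
#define DEMO_WORD_SIZE           4  /* assumption: 32-bit words */
#define DEMO_FIRST_FRAME_OFFSET  8  /* assumption: first stack argument offset */

typedef struct
{
    int in_reg;        /* non-zero if the parameter arrives in a register */
    int reg;           /* pseudo register, or -1 if in the frame */
    int frame_offset;  /* frame offset, meaningful when in_reg is zero */
} demo_param_loc;

/* Decide where each word-sized parameter lives on entry to the function. */
static void demo_assign_param_locations(demo_param_loc *locs, int num_params)
{
    int param;
    int next_offset = DEMO_FIRST_FRAME_OFFSET;
    assert(locs != NULL && num_params >= 0);
    for (param = 0; param < num_params; ++param)
    {
        if (param < DEMO_NUM_ARG_REGS)
        {
            /* A real back end would report an incoming register here. */
            locs[param].in_reg = 1;
            locs[param].reg = param;
            locs[param].frame_offset = 0;
        }
        else
        {
            /* A real back end would report an incoming frame position here. */
            locs[param].in_reg = 0;
            locs[param].reg = -1;
            locs[param].frame_offset = next_offset;
            next_offset += DEMO_WORD_SIZE;
        }
    }
}
```

Real conventions also have to deal with floating-point values, 64-bit pairs, and structures, which this sketch ignores.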
Create instructions within func necessary to set up for a
function call to a function with the specified signature.
Use jit_insn_push to push values onto the system stack,
or jit_insn_outgoing_reg to copy values into call registers.
If is_nested is non-zero, then it indicates that we are calling a
nested function within the current function's nested relationship tree.
The nested_level value will be -1 to call a child, zero to call a
sibling of func, 1 to call a sibling of the parent, 2 to call
a sibling of the grandparent, etc. The jit_insn_setup_for_nested
instruction should be used to create the nested function setup code.
If the function returns a structure by pointer, then struct_return must be set to a new local variable that will contain the returned structure. Otherwise it should be set to NULL.
Place the indirect function pointer value into a suitable register or stack location for a subsequent indirect call.
Create instructions within func to clean up after a function call
and to place the function's result into return_value.
This should use jit_insn_pop_stack to pop values off the system
stack and jit_insn_return_reg to tell libjit which
register contains the return value. In the case of a void
function, return_value will be NULL.
Note: the argument values are passed again because it may not be possible to determine how many bytes to pop from the stack from the signature alone, especially if the called function is vararg.
Not all CPUs support all arithmetic, conversion, bitwise, or comparison operators natively. For example, most ARM platforms need to call out to helper functions to perform floating-point operations.
If this function returns zero, then jit-insn.c will output a
call to an intrinsic function that is equivalent to the desired opcode.
This is how you tell libjit that you cannot handle the
opcode natively.
This function can also help you develop your back end incrementally. Initially, you can report that only integer operations are supported, and then once you have them working you can move on to the floating point operations.
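The incremental approach described above might look like the following during the early stages of a port. The demo_opcode enumeration and the function name are hypothetical and do not reproduce libjit's real opcode list. The sketch reports the integer operations as supported and returns zero for everything else, so that the floating-point operations would fall back to intrinsic calls.

```c
#include <assert.h>

/* Hypothetical opcode numbers, invented for this sketch. */
enum demo_opcode
{
    DEMO_OP_IADD,
    DEMO_OP_ISUB,
    DEMO_OP_IMUL,
    DEMO_OP_FADD,
    DEMO_OP_FMUL,
    DEMO_OP_COUNT
};

/* Return non-zero if the back end can generate native code for the opcode. */
static int demo_opcode_is_supported(int opcode)
{
    assert(opcode >= 0 && opcode < DEMO_OP_COUNT);
    switch (opcode)
    {
    case DEMO_OP_IADD:
    case DEMO_OP_ISUB:
    case DEMO_OP_IMUL:
        /* Integer operations are implemented natively. */
        return 1;
    default:
        /* Everything else is not handled yet. */
        return 0;
    }
}
```

As native support for more opcodes is written, their cases move from the default branch into the supported group.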
Generate the prolog for a function into a previously-prepared
buffer area of JIT_PROLOG_SIZE bytes in size. Returns
the start of the prolog, which may be different from buf.
This function is called at the end of the code generation process, not the beginning. At this point, it is known which callee save registers must be preserved, allowing the back end to output the most compact prolog possible.
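One way the returned start address can end up different from buf is if a short prolog is placed at the end of the reserved area, so that it runs straight into the function body that follows. The sketch below assumes that placement strategy. The demo_ names and the 16-byte size are hypothetical, and the prolog bytes are taken as already encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PROLOG_SIZE 16   /* assumption: maximum prolog size in bytes */

/* Copy an already-encoded prolog so that it ends exactly where the
   reserved area ends, and return the address where execution starts. */
static unsigned char *demo_place_prolog(unsigned char *buf,
                                        const unsigned char *prolog,
                                        size_t prolog_len)
{
    unsigned char *start;
    assert(buf != NULL && prolog != NULL);
    assert(prolog_len <= DEMO_PROLOG_SIZE);
    start = buf + (DEMO_PROLOG_SIZE - prolog_len);
    memcpy(start, prolog, prolog_len);
    return start;
}
```

With a 3-byte prolog and a 16-byte area, the returned pointer is buf + 13, and the 13 bytes before it are simply unused.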
Generate a function epilog, restoring the registers that were saved on entry to the function, and then returning.
Only one epilog is generated per function. Functions with multiple
jit_insn_return instructions will all jump to the common epilog.
This is needed because the code generator may not know which callee
save registers need to be restored by the epilog until the full function
has been processed.
Generate code for a redirector, which makes an indirect jump
to the contents of func->entry_point. Redirectors
are used on recompilable functions in place of the regular
entry point. This allows libjit to redirect existing
calls to the new version after recompilation.
Generate instructions to spill a pseudo register to the local variable frame. If other_reg is not -1, then it indicates the second register in a 64-bit register pair.
This function will typically call _jit_gen_fix_value to
fix the value's frame position, and will then generate the
appropriate spill instructions.
Generate instructions to free a register without spilling its value. This is called when a register's contents become invalid, or its value is no longer required. If value_used is set to a non-zero value, then it indicates that the register's value was just used. Otherwise, there is a value in the register but it was never used.
On most platforms, this function won't need to do anything to free the register. But some do need to take explicit action. For example, x86 needs an explicit instruction to remove a floating-point value from the FPU's stack if its value has not been used yet.
Generate instructions to load a value into a register. The value will
either be a constant or a slot in the frame. You should fix frame slots
with _jit_gen_fix_value.
Spill the contents of value from its corresponding global register. This is used in rare cases when a machine instruction requires its operand to be in the specific register that happens to be global. In such cases the register is spilled just before the instruction and loaded back immediately after it.
Load the contents of value into its corresponding global register. This is used at the head of a function to pull parameters out of stack slots into their global register copies.
Generate instructions to exchange the contents of the top stack register with a stack register specified by the reg argument.
It needs to be implemented only by backends that support stack registers.
Generate instructions to copy the contents of the top stack register
into a stack register specified by the reg argument and pop
the top register after this. If reg is equal to the top register
then the top register is just popped without copying it.
It needs to be implemented only by backends that support stack registers.
Generate instructions to spill the top stack register to the local variable frame. The pop argument indicates if the top register is popped from the stack.
It needs to be implemented only by backends that support stack registers.
Fix the position of a value within the local variable frame. If it doesn't already have a position, then assign one for it.
Generate native code for the specified insn. This function should call the appropriate register allocation routines, output the instruction, and then arrange for the result to be placed in an appropriate register or memory destination.
Called to notify the back end that the start of a basic block has been reached.
Called to notify the back end that the end of a basic block has been reached.
Determine if type is a candidate for allocation within global registers.
20.4 Allocating registers in the back end
The libjit library provides a number of functions for
performing register allocation within basic blocks so that you
mostly don't have to worry about it:
Get the pseudo register by its name.
Determine if a type requires a long register pair.
Get the CPU register that corresponds to a pseudo register. "other_reg" will be set to the other register in a pair, or -1 if the register is not part of a pair.
Perform global register allocation on the values in func.
This is called during function compilation just after variable
liveness has been computed.
Initialize the register allocation state for a new block.
Spill all of the temporary registers to memory locations. Normally used at the end of a block, but may also be used in situations where a value must be in a certain register and it is too hard to swap things around to put it there.
Set pseudo register reg to record that it currently holds the
contents of value. The register must not contain any other
live value at this point.
Load the contents of value into pseudo register reg,
spilling out the current contents. This is used to set up outgoing
parameters for a function call.
If value is currently in a register, then force its value out
into the stack frame. The is_dest flag indicates that the value
will be a destination, so we don't care about the original value.
Load a value into any register that is suitable and return that register. If the value needs a long pair, then this will return the first register in the pair. Returns -1 if the value will not fit into any register.
If destroy is non-zero, then we are about to destroy the register,
so the system must make sure that such destruction will not affect
value or any of the other values currently in that register.
If used_again is non-zero, then it indicates that the value is
used again further down the block.
This document was generated by Klaus Treichel on May, 11 2008 using texi2html 1.78.