By default, when you enter in expressions at Guile’s REPL, they are first compiled to bytecode. Then that bytecode is executed to produce a value. If the expression evaluates to a procedure, the result of this process is a compiled procedure.
A compiled procedure is a compound object consisting of its bytecode and
a reference to any captured lexical variables. In addition, when a
procedure is compiled, it has associated metadata written to side
tables, for instance a line number mapping, or its docstring. You can
pick apart these pieces with the accessors in (system vm
program)
. See Compiled Procedures, for a full API reference.
A procedure may reference data that was statically allocated when the procedure was compiled. For example, a pair of immediate objects (see Immediate Objects) can be allocated directly in the memory segment that contains the compiled bytecode, and accessed directly by the bytecode.
Another use for statically allocated data is to serve as a cache for a bytecode. Top-level variable lookups are handled in this way; the first time a top-level binding is referenced, the resolved variable will be stored in a cache. Thereafter all access to the variable goes through the cache cell. The variable’s value may change in the future, but the variable itself will not.
We can see how these concepts tie together by disassembling the
foo
function we defined earlier to see what is going on:
scheme@(guile-user)> (define (foo a) (lambda (b) (vector foo a b))) scheme@(guile-user)> ,x foo Disassembly of #<procedure foo (a)> at #xf1da30: 0 (instrument-entry 164) at (unknown file):5:0 2 (assert-nargs-ee/locals 2 1) ;; 3 slots (1 arg) 3 (allocate-words/immediate 2 3) at (unknown file):5:16 4 (load-u64 0 0 65605) 7 (word-set!/immediate 2 0 0) 8 (load-label 0 7) ;; anonymous procedure at #xf1da6c 10 (word-set!/immediate 2 1 0) 11 (scm-set!/immediate 2 2 1) 12 (reset-frame 1) ;; 1 slot 13 (handle-interrupts) 14 (return-values) ---------------------------------------- Disassembly of anonymous procedure at #xf1da6c: 0 (instrument-entry 183) at (unknown file):5:16 2 (assert-nargs-ee/locals 2 3) ;; 5 slots (1 arg) 3 (static-ref 2 152) ;; #<variable 112e530 value: #<procedure foo (a)>> 5 (immediate-tag=? 2 7 0) ;; heap-object? 7 (je 19) ;; -> L2 8 (static-ref 2 119) ;; #<directory (guile-user) ca9750> 10 (static-ref 1 127) ;; foo 12 (call-scm<-scm-scm 2 2 1 40) 14 (immediate-tag=? 2 7 0) ;; heap-object? 16 (jne 8) ;; -> L1 17 (scm-ref/immediate 0 2 1) 18 (immediate-tag=? 0 4095 2308) ;; undefined? 20 (je 4) ;; -> L1 21 (static-set! 2 134) ;; #<variable 112e530 value: #<procedure foo (a)>> 23 (j 3) ;; -> L2 L1: 24 (throw/value 1 151) ;; #(unbound-variable #f "Unbound variable: ~S") L2: 26 (scm-ref/immediate 2 2 1) 27 (allocate-words/immediate 1 4) at (unknown file):5:28 28 (load-u64 0 0 781) 31 (word-set!/immediate 1 0 0) 32 (scm-set!/immediate 1 1 2) 33 (scm-ref/immediate 4 4 2) 34 (scm-set!/immediate 1 2 4) 35 (scm-set!/immediate 1 3 3) 36 (mov 4 1) 37 (reset-frame 1) ;; 1 slot 38 (handle-interrupts) 39 (return-values)
The first thing to notice is that the bytecode is at a fairly low level. When a program is compiled from Scheme to bytecode, it is expressed in terms of more primitive operations. As such, there can be more instructions than you might expect.
The first chunk of instructions is the outer foo
procedure. It
is followed by the code for the contained closure. The code can look
daunting at first glance, but with practice it quickly becomes
comprehensible, and indeed being able to read bytecode is an important
step to understanding the low-level performance of Guile programs.
The foo
function begins with a prelude. The
instrument-entry
bytecode increments a counter associated with
the function. If the counter reaches a certain threshold, Guile will
emit machine code (“JIT-compile”) for foo
. Emitting machine
code is fairly cheap but it does take time, so it’s not something you
want to do for every function. Using a per-function counter and a
global threshold allows Guile to spend time JIT-compiling only the
“hot” functions.
Next in the prelude is an argument-checking instruction, which checks that it was called with only 1 argument (plus the callee function itself makes 2) and then reserves stack space for an additional 1 local.
Then from ip
3 to 11, we allocate a new closure by allocating a
three-word object, initializing its first word to store a type tag,
setting its second word to its code pointer, and finally at ip
11, storing local value 1 (the a
argument) into the third word
(the first free variable).
Before returning, foo
“resets the frame” to hold only one local
(the return value), runs any pending interrupts (see Asynchronous Interrupts) and
then returns.
Note that local variables in Guile’s virtual machine are usually
addressed relative to the stack pointer, which leads to a pleasantly
efficient sp[n]
access. However it can make the
disassembly hard to read, because the sp
can change during the
function, and because incoming arguments are relative to the fp
,
not the sp
.
To know what fp
-relative slot corresponds to an
sp
-relative reference, scan up in the disassembly until you get
to a “n slots” annotation; in our case, 3, indicating that the
frame has space for 3 slots. Thus a zero-indexed sp
-relative
slot of 2 corresponds to the fp
-relative slot of 0, which
initially held the value of the closure being called. This means that
Guile doesn’t need the value of the closure to compute its result, and
so slot 0 was free for re-use, in this case for the result of making a
new closure.
A closure is code with data. As you can see, making the closure
involved making an object (ip
3), putting a code pointer in it
(ip
8 and 10), and putting in the closure’s free variable
(ip
11).
The second stanza disassembles the code for the closure. After the
prelude, all of the code between ip
5 and 24 is related to
loading the toplevel variable foo
into slot 1. This lookup
happens only once, and is associated with a cache; after the first run,
the value in the cache will be a bound variable, and the code will jump
from ip
7 to 26. On the first run, Guile gets the module
associated with the function, calls out to a run-time routine to look up
the variable, and checks that the variable is bound before initializing
the cache. Either way, ip
26 dereferences the variable into
local 2.
What follows is the allocation and initialization of the vector return
value. Ip
27 does the allocation, and the following two
instructions initialize the type-and-length tag for the object’s first
word. Ip
32 sets word 1 of the object (the first vector slot) to
the value of foo
; ip
33 fetches the closure variable for
a
, then in ip
34 stores it in the second vector slot; and
finally, in ip
35, local b
is stored to the third vector
slot. This is followed by the return sequence.