Guile’s CPS language is composed of continuations. A continuation is a labelled program point. If you are used to traditional compilers, think of a continuation as a trivial basic block. A program is a “soup” of continuations, represented as a map from labels to continuations.
Like basic blocks, each continuation belongs to only one function. Some continuations are special, like the continuation corresponding to a function’s entry point, or the continuation that represents the tail of a function. Others contain a term. A term contains an expression, which evaluates to zero or more values. The term also records the continuation to which it will pass its values. Some terms, like conditional branches, may continue to one of a number of continuations.
Continuation labels are small integers. This makes it easy to sort them and to group them into sets. Whenever a term refers to a continuation, it does so by name, simply recording the label of the continuation. Continuation labels are unique among the set of labels in a program.
Variables are also named by small integers. Variable names are unique among the set of variables in a program.
For example, a simple continuation that receives two values and adds
them together can be matched like this, using the match
form from
(ice-9 match)
:
(match cont (($ $kargs (x-name y-name) (x-var y-var) ($ $continue k src ($ $primcall '+ #f (x-var y-var)))) (format #t "Add ~a and ~a and pass the result to label ~a" x-var y-var k)))
Here we see the most common kind of continuation, $kargs
, which
binds some number of values to variables and then evaluates a term.
Bind the incoming values to the variables vars, with original names names, and then evaluate term.
The names of a $kargs
are just for debugging, and will end
up residualized in the object file for use by the debugger.
The term in a $kargs
is always a $continue
, which
evaluates an expression and continues to a continuation.
Evaluate the expression exp and pass the resulting values (if any)
to the continuation labelled k. The source information associated
with the expression may be found in src, which is either an alist
as in source-properties
or is #f
if there is no associated
source.
There are a number of expression kinds. Above you see an example of
$primcall
.
Perform the primitive operation identified by name
, a well-known
symbol, passing it the arguments args, and pass all resulting
values to the continuation.
param is a constant parameter whose interpretation is up to the
primcall in question. Usually it’s #f
but for a primcall that
might need some compile-time constant information – such as
add/immediate
, which adds a constant number to a value – the
parameter holds this information.
The set of available primitives includes many primitives known to
Tree-IL and then some more; see the source code for details. Note that
some Tree-IL primcalls need to be converted to a sequence of lower-level
CPS primcalls. Again, see (language tree-il compile-cps)
for
full details.
The variables that are used by $primcall
, or indeed by any
expression, must be defined before the expression is evaluated. An
equivalent way of saying this is that predecessor $kargs
continuation(s) that bind the variables(s) used by the expression must
dominate the continuation that uses the expression: definitions
dominate uses. This condition is trivially satisfied in our example
above, but in general to determine the set of variables that are in
“scope” for a given term, you need to do a flow analysis to see what
continuations dominate a term. The variables that are in scope are
those variables defined by the continuations that dominate a term.
Here is an inventory of the kinds of expressions in Guile’s CPS
language, besides $primcall
which has already been described.
Recall that all expressions are wrapped in a $continue
term which
specifies their continuation.
Continue with the constant value val.
Continue with the procedure that implements the primitive operation named by name.
Call proc with the arguments args, and pass all values to
the continuation. proc and the elements of the args list
should all be variable names. The continuation identified by the term’s
k should be a $kreceive
or a $ktail
instance.
Pass the values named by the list args to the continuation.
There are two sub-languages of CPS, higher-order CPS and
first-order CPS. The difference is that in higher-order CPS,
there are $fun
and $rec
expressions that bind functions or
mutually-recursive functions in the implicit scope of their use sites.
Guile transforms higher-order CPS into first-order CPS by closure
conversion, which chooses representations for all closures and which
arranges to access free variables through the implicit closure parameter
that is passed to every function call.
Continue with a procedure. body names the entry point of the
function, which should be a $kfun
. This expression kind is only
valid in higher-order CPS, which is the CPS language before closure
conversion.
Continue with a set of mutually recursive procedures denoted by
names, vars, and funs. names is a list of
symbols, vars is a list of variable names (unique integers), and
funs is a list of $fun
values. Note that the $kargs
continuation should also define names/vars bindings.
The contification pass will attempt to transform the functions declared
in a $rec
into local continuations. Any remaining $fun
instances are later removed by the closure conversion pass. If the
function has no free variables, it gets allocated as a constant.
A constant which is a function whose entry point is label. As a
constant, instances of $const-fun
with the same label will
not allocate; the space for the function is allocated as part of the
compilation unit.
In practice, $const-fun
expressions are reified by CPS-conversion
for functions whose call sites are not all visible within the
compilation unit and which have no free variables. This expression kind
is part of first-order CPS.
Otherwise, if the closure has free variables, it will be allocated at
its definition site via an allocate-words
primcall and its free
variables initialized there. The code pointer in the closure is
initialized from a $code
expression.
Continue with the value of label, which should denote some
$kfun
continuation in the program. Used when initializing the
code pointer of closure objects.
However, If the closure can be proven to never escape its scope then
other lighter-weight representations can be chosen. Additionally, if
all call sites are known, closure conversion will hard-wire the calls by
lowering $call
to $callk
.
Like $call
, but for the case where the call target is known to be
in the same compilation unit. label should denote some
$kfun
continuation in the program. In this case the proc
is simply an additional argument, since it is not used to determine the
call target at run-time.
To summarize: a $continue
is a CPS term that continues to a
single label. But there are other kinds of CPS terms that can continue
to a different number of labels: $branch
, $switch
,
$throw
, and $prompt
.
Evaluate the branching primcall op, with arguments args and constant parameter param, and continue to kt with zero values if the test is true. Otherwise continue to kf.
The $branch
term is like a $continue
term with a
$primcall
expression, except that instead of binding a value and
continuing to a single label, the result of the test is not bound but
instead used to choose the continuation label.
The set of operations (corresponding to op values) that are valid
in a $branch is limited. In the general case, bind the result of
a test expression to a variable, and then make a $branch
on a
true?
op referencing that variable. The optimizer should inline
the branch if possible.
Continue to a label in the list k* according to the index argument arg, or to the default continuation kf if arg is greater than or equal to the length k*. The index variable arg is an unboxed, unsigned 64-bit value.
The $switch
term is like C’s switch
statement. The
compiler to CPS can generate a $switch
term directly, if the
source language has such a concept, or it can rely on the CPS optimizer
to turn appropriate chains of $branch
statements to
$switch
instances, which is what the Scheme compiler does.
Throw a non-resumable exception. Throw terms do not continue at all.
The usual value of op is throw
, with two arguments
key and args. There are also some specific primcalls that
compile to the VM throw/value
and throw/value+data
instructions; see the code for full details.
The advantage of having $throw
as a term is that, because it does
not continue, this allows the optimizer to gather more information from
type predicates. For example, if the predicate is char?
and the
kf continues to a throw, the set of labels dominated by kt
is larger than if the throw notationally continued to some label that
would never be reached by the throw.
Push a prompt on the stack identified by the variable name tag,
which may be escape-only if escape? is true, and continue to
kh with zero values. If the body aborts to this prompt, control
will proceed at the continuation labelled kh, which should be a
$kreceive
continuation. Prompts are later popped by
pop-prompt
primcalls.
At this point we have described terms, expressions, and the most common
kind of continuation, $kargs
. $kargs
is used when the
predecessors of the continuation can be instructed to pass the values
where the continuation wants them. For example, if a $kargs
continuation k binds a variable v, and the compiler decides
to allocate v to slot 6, all predecessors of k should put
the value for v in slot 6 before jumping to k. One
situation in which this isn’t possible is receiving values from function
calls. Guile has a calling convention for functions which currently
places return values on the stack. A continuation of a call must check
that the number of values returned from a function matches the expected
number of values, and then must shuffle or collect those values to named
variables. $kreceive
denotes this kind of continuation.
Receive values on the stack. Parse them according to arity, and
then proceed with the parsed values to the $kargs
continuation
labelled k. As a limitation specific to $kreceive
,
arity may only contain required and rest arguments.
$arity
is a helper data structure used by $kreceive
and
also by $kclause
, described below.
A data type declaring an arity. req and opt are lists of
source names of required and optional arguments, respectively.
rest is either the source name of the rest variable, or #f
if this arity does not accept additional values. kw is a list of
the form ((keyword name var) ...)
, describing
the keyword arguments. allow-other-keys? is true if other keyword
arguments are allowed and false otherwise.
Note that all of these names with the exception of the vars in the kw list are source names, not unique variable names.
Additionally, there are three specific kinds of continuations that are only used in function entries.
Declare a function entry. src is the source information for the
procedure declaration, and meta is the metadata alist as described
above in Tree-IL’s <lambda>
. self is a variable bound to
the procedure being called, and which may be used for self-references.
tail is the label of the $ktail
for this function,
corresponding to the function’s tail continuation. clause is the
label of the first $kclause
for the first case-lambda
clause in the function, or otherwise #f
.
A tail continuation.
A clause of a function with a given arity. Applications of a function
with a compatible set of actual arguments will continue to the
continuation labelled cont, a $kargs
instance representing
the clause body. If the arguments are incompatible, control proceeds to
alternate, which is a $kclause
for the next clause, or
#f
if there is no next clause.