The translation phase takes a top-level form (or body),
and generates a ModuleExp
, which is a top-level expression.
This is done using a Translator
, which keeps track of
lexical bindings and other translation state.
class Translator
{ ...;
public Expression rewrite(Object exp)
{ ... }
public Expression syntaxError
(String message) { ... }
}
The rewrite
method converts a Scheme source form to
an Expression
. The syntaxError
method is called when a syntax error is seen.
It prints out the current source filename and line number with
the given message
.
In addition to handling special forms (such as lambda
),
the rewrite
phase also handles name resolution of
identifiers to their lexical declarations.
This is complicated because a variable reference may appear before
the declaration that defines it, for example in a letrec
.
The solution is to use two phases:
The first “scan” sub-phase looks for declarations, macro-expanding
outermost macro applications to determine if the result is a
declaration.
The second “rewrite” sub-phase takes the result of the
scan phase, using the declarations that scan produced, and
rewrites any deferred forms.
The phases are actually interleaved, because a Scheme “body”
may contain internal definitions. Thus to rewrite a body we first have to
scan it for definitions, before we rewrite the result.
We won't go into further details about the scan phase.
class Syntax
{ ...;
public abstract Expression rewriteForm
(Object obj, Translator tr);
}
The rewrite
method in Translator
checks for syntactic keywords and macros. If the car
of
a “call” is a Syntax
or if it is a Symbol
that is bound to a Syntax
,
then the rewriteForm
method of the Syntax
is called.
As an example, this trivial class implements quote
:
class quote extends Syntax
{ ...;
public Expression rewriteForm (Object form, Translator tr)
{ // Error-checking is left out.
return new QuoteExp((((Pair) form).cdr).car);
}
}
(The actual implementation of quote
is more complex,
since it has to interpolate forms bound to pattern variables.)
A Macro
is a Syntax
where rewriteForm
calls a transformer function:
class Macro extends Syntax
{ ...;
Procedure transformer;
Expression rewriteForm (Object form, Translator tr)
{
return tr.rewrite(expand(form, tr));
}
Object expand (Object form, Translator tr)
{ // Much simplified.
return transformer.apply1(form);
}
}
When Kawa sees a define-syntax
, it creates a
Macro
that contains the transformer
resulting from the transformer expression. The latter may be
a lambda
expression; that is normally the case
when using syntax-case
. Alternatively, the
transformer expression may be a syntax-rules
form.
That gets compiled to a SyntaxRules
object:
This contains an encoded representation of the patterns and templates
in the syntax-rules
.
class SyntaxRules extends Procedure1
{ ...;
SyntaxRule[] rules;
public Object apply1 (Object arg)
{
Translator tr = getCurrentTranslator();
Object[] v = new Object[maxVars];
for (int i = 0; i < rules.length;)
{
SyntaxRule r = rules[i++];
if (r.match (obj, v))
return r.execute(v, tr);
}
return tr.syntaxError
("no matching syntax-rule");
}
}
A problem when writing a macro is that identifiers you use internally in
the macro might conflict with identifiers that appear at the macro call site.
User-supplied code might accidentally reference a temporary variable
in the macro expansion or vice versa.
A macro system is hygienic when it automatically
separates the names in the macro (and the macro definition context) from
those in the macro expansion context.
Conceptually, this is done by magically renaming certain names.
Macros defined using the standard syntax-rules
are by default hygienic, as are by default those defined using
the syntax-case
system [Dybvig93]
[Waddell99] provided by Kawa and some other Scheme
implementations. It is possible to override hygiene when using
syntax-case
, though we won't go into that.
The following discussion will focus on syntax-case
,
since we can view syntax-rules
as a short-hand syntax
for combinating a syntax-case
with a
syntax
template for each alternative.
The key to understanding syntax-case
is that it uses
syntax objects, which combine the
actual Scheme source form (a list, symbol, or literal) with its
syntactic context. A syntax object is
implemented in two ways. An implicit syntax object is just
the Scheme source form. In that case the syntactic context is
implicitly the “current” lexical context.
An explicit syntax object is
implemented using SyntaxForm
:
class SyntaxForm
{
Object form;
TemplateScope scope;
}
The form
is the orginal form from the Scheme reader,
while the scope
field identifies the
lexical context.
A SyntaxForm
object is created when a syntax template
(i.e. a syntax
expression) is evaluated; the
scope
field is the
scope of the syntax
form within
the macro definition,
which may be unrelated to the lexical scope of the expansion context.
If a syntax
form doesn't contain any pattern
variables, then the result is a single SyntaxForm
object
that wraps the syntax
argument expression.
However, any pattern variables are returned as-is. This may require some
destructuring of the template, rather like quasi-quotation.
Consider this example:
(syntax-case form () ((_ e1 e2) (syntax (list a e1 e2 b c))))
The syntax
form is compiled as if it were
(using the cons*
function from SRFI-1):
(cons* (syntax list) (syntax a) e1 e1 (syntax b c))
Note that in Scheme code you'd have to write e1
and e2
as (syntax e1)
and (syntax e2)
, since you're only allowed to
reference pattern variables in syntax
templates.
However, syntax
on a pattern variable is
implemented as the identity function.
When the Kawa compiler sees a syntax
, it translates
the template into a SyntaxTemplate
object.
This object contains a compact encoding of the template,
plus a reference to the current lexical scope.
Evaluating the syntax
form translates into
invoking the execute
method of
the SyntaxTemplate
.
The execute
creates a fresh
TemplateScope
instance,
when is used for the scope
field of
any generated SyntaxForm
objects.
The pattern in a syntax-case
or
syntax-rules
is matched against a syntax object,
which can be an explicit SyntaxForm
object,
an implicit syntax object (a plain Scheme value), or a mixture of the
two. The “destructuring” specified by the pattern
may need to decend into a SyntaxForm
.
Consider for example this syntax object:
`(,(syntax a) ,(syntax (b c d)) f)
being matched against this pattern:
(x (y . z) . r)
Then we get these bindings:
x: (syntax a) y: (syntax b) z: (syntax (c d)) r: (f)
The scope
fields of y
and
z
are taken from the original
(syntax (b c d))
.
Note that the last element of the original list is a plain Scheme form
- i.e. an implicit syntax form, so that's what the pattern variable is
bound to, and that's what get inserted into any template that
references r
.
The Kawa compiler compiles a syntax-case
into a SyntaxPattern
object.
A single rule in a syntax-rules
is compiled to a
SyntaxRule
, which is a combination of a
SyntaxPattern
with
a SyntaxTemplate
.
The whole syntax-rules
is compiled to a
SyntaxRules
, which is basically a collection of
SyntaxRule
objects
that implements Procedure1
so it can be used as the
transformer of a define-syntax
macro.
A TemplateScope
is created when a
syntax
template is expanded.
class TemplateScope extends LetExp
{
}
Initially, a TemplateScope
doesn't contain
any declarations, but it may gain implicit alias declarations as
explained below. In addition, a TemplateScope
inherits declarations from the parent scope, which is set to the
context of the macro definition.
Kawa's rewrite
stage translates an input
syntax object to an Expression
.
The initial syntax object is an implicit syntax object, consisting
of a form from the Scheme reader. If we're processing the
output from a macro, then the syntax object may contain
SyntaxForm
objects.
To rewrite
a SyntaxForm
we temporarily change the current scope to the
TemplateScope
. We then rewrite
the form
value of the SyntaxForm
,
using that TemplateScope
. This means that identifiers
will be looked up in the TemplateScope
. This
normally doesn't have direct declarations, so effectively we're searching
the TemplateScope
's parent, which is the syntactic
context of the macro definition, as desired.
The current scope is restored to the orginal scope when we're done
rewriting the SyntaxForm
.
(Setting and restoring the current scope isn't implemented very efficiently,
though it probably doesn't matter in normal code.)
Things get more complicated if the syntax template creates new definitions, but using names coming from the expansion site rather than the template. Consider this example:
(define-syntax mac (syntax-rules () ((mac v1 init exp) (let ((v1 init)) (let ((i 2)) (list exp i)))))) (define j 10) (define k (mac i 1 (+ i j))) ;; => (11 2)
Here we have a pair of nested let
-expressions, both of
which end up declaring variables named i
.
The list
-expression references the inner
i
literally. After macro-expansion the
exp
also references i
, but its syntactic
scope comes from the macro application, so it should match the
i
declaration whose name also comes from the
macro application - which is that of the outer let
.
This is how it is done:
When rewriting the outer let
we create a normal
LetExp
.
Rewriting the inner let
also creates a
LetExp
, nested in the normal way.
However, the name of this inner i
declaration
comes from the macro template, so we
need to hide it from any code not from the same template.
This is where we use the TemplateScope
:
We place the declaration for the inner i
in
the TemplateScope
, rather than in the
inner LetExp
where it “actually” belongs.
When we rewrite the first operand of the list
,
it uses the normal scoping, so it doesn't see the inner i
,
since that's in the TemplateScope
.
On the other hand the second operand of list
is
a SyntaxForm
whose scope
is the
TemplateScope
, so we change the current scope
to that TemplateScope
. This hides the
outer i
and makes the inner i
visible.
When we're done rewriting the inner let
,
we move the inner i
declaration out of the
TemplateScope
(which gets garbage collected)
and into the inner LetExp
where it stays for
the rest of compilation.
(The implementation doesn't actually move the declaration.
Instead, it initially creates the declaration with a null
name in the LetExp
and an alias that references to
it from the TemplateScope
. When we're done,
it sets the name from the alias.)