A process is a native (operating-system-level) application or program that runs separately from the current virtual machine.
Many programming languages have facilities to allow access to system processes (commands). (For example, Java has java.lang.Process and java.lang.ProcessBuilder.)
These facilities let you send data to the standard input, extract the
resulting output, look at the return code, and sometimes even pipe
commands together. However, this is rarely as easy as it is using
the old Bourne shell; for example command substitution is awkward.
Kawa’s solution is based on these two ideas:
- A “process expression” (typically a function call) evaluates to an LProcess value, which provides access to a Unix-style (or Windows) process.
- In a context requiring a string (or a bytevector), an LProcess is automatically converted to a string (or bytevector) comprising the standard output from the process.
The most flexible way to start a process is with either the run-process procedure or the &`{command} syntax for process literals.
Procedure: run-process process-keyword-argument* command
Creates a process object, specifically a gnu.kawa.functions.LProcess object. A process-keyword-argument can be used to set various options, as discussed below.
The command is the process command-line (name and arguments). It can be an array of strings, in which case those are used as the command arguments directly:
(run-process ["ls" "-l"])
The command can also be a single string, which is split (tokenized) into command arguments separated by whitespace. Quotation groups words together just like traditional shells:
(run-process "cmd a\"b 'c\"d k'l m\"n'o")
  ⇒ (run-process ["cmd" "ab 'cd" "k'l m\"no"])
The syntax shorthand &`{command} or &sh{command} (discussed below) is usually more convenient.
process-keyword-argument ::=
    process-redirect-argument
  | process-environment-argument
  | process-misc-argument
We discuss process-redirect-argument and process-environment-argument later. The process-misc-argument options are just the following:
shell: shell
Currently, shell must be one of #f (which is ignored) or #t. The latter means to use an external shell to tokenize the command. That is, the following are equivalent:
(run-process shell: #t "command")
(run-process ["/bin/sh" "-c" "command"])
directory: dir
Change the working directory of the new process to dir.
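As a minimal sketch of the directory: option, the following runs ls with a different working directory and collects its output as a string. The path "/tmp" is only an illustrative assumption (any existing directory would do), ls is assumed to be on the PATH, and passing the directory as a plain string is also an assumption.

```scheme
;; Sketch only: "/tmp" is an illustrative path, not something Kawa requires.
;; The command is given as an array, so no tokenization is involved.
(define listing ::string
  (run-process directory: "/tmp" ["ls" "-a"]))

;; The listing comes from /tmp, not from Kawa's own working directory.
(display listing)
```

The current directory of the Kawa process itself is not changed; only the child process starts in the given directory.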
A simple process literal is a kind of named literal that uses the backtick character (`) as the cname.
For example:
&`{date --utc}
This is equivalent to:
(run-process "date --utc")
In general the following are roughly equivalent (using string quasi-literals):
&`[args...]{command}
(run-process args... &{command})
The equivalence is only rough because command may contain escaped sub-expressions; in that case &` may process the resulting values differently from plain string substitution, as discussed below.
If you use &sh instead of &` then a shell is used:
&sh{rm *.class}
which is equivalent to:
&`{/bin/sh -c "rm *.class"}
In general, the following are equivalent:
&sh[args...]{command}
&`[shell: #t args...]{command}
The value returned from a call to run-process or a process literal is an instance of gnu.kawa.functions.LProcess. This class extends java.lang.Process, so you can treat it as any other Process object.
#|kawa:1|# (define p1 &`{date --utc})
#|kawa:2|# (p1:toString)
gnu.kawa.functions.LProcess@377dca04
#|kawa:3|# (write p1)
gnu.kawa.functions.LProcess@377dca04
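Because the process value is a java.lang.Process, the methods inherited from that class can be invoked with the same colon notation as toString. The sketch below assumes a Unix-like system with echo on the PATH and a Java 8 or later runtime (for isAlive); the variable names are illustrative.

```scheme
;; Start a short-lived process.
(define p &`{echo hello})

;; java.lang.Process.waitFor() blocks until the process ends
;; and returns its exit code.
(define code ::int (p:waitFor))

;; java.lang.Process.isAlive() reports whether it is still running.
(define still-running ::boolean (p:isAlive))

(display code) (newline)
(display still-running) (newline)
```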
What makes an LProcess interesting is that it is also a blob, which is automatically converted to a string (or bytevector) in a context that requires it. The contents of the blob come from the standard output of the process. The blob is evaluated lazily, so data is only collected when requested.
#|kawa:4|# (define s1 ::string p1)
#|kawa:5|# (write s1)
"Wed Jan 1 01:18:21 UTC 2014\n"
#|kawa:6|# (define b1 ::bytevector p1) (write b1)
#u8(87 101 100 32 74 97 110 ... 52 10)
The display procedure prints it in “human” form, as a string:
#|kawa:7|# (display p1)
Wed Jan 1 01:18:21 UTC 2014
This is also the default REPL formatting:
#|kawa:8|# &`{date --utc}
Wed Jan 1 01:18:22 UTC 2014
When you type a command to a shell, its output goes to the console. Similarly, in a REPL the output from the process is copied to the console output, which can sometimes be optimized by letting the process inherit its standard output from the Kawa process.
To substitute a variable or the result of an expression into the command line, use the usual syntax for quasi-literals:
(define filename (make-temporary-file))
&sh{run-experiment >&[filename]}
Since a process is convertible to a string, we need no special syntax for command substitution:
&`{echo The directory is: &[&`{pwd}]}
or equivalently:
&`{echo The directory is: &`{pwd}}
Things get more interesting when considering the interaction between substitution and tokenization. This is not simple string interpolation. For example, if an interpolated value contains a quote character, we want to treat it as a literal quote, rather than a token delimiter. This matches the behavior of traditional shells. There are multiple cases, depending on whether the interpolation result is a string or a vector/list, and depending on whether the interpolation is inside quotes.
If the value is a string, and we’re not inside quotes, then all non-whitespace characters (including quotes) are literal, but whitespace still separates tokens:
(define v1 "a b'c ")
&`{cmd x y&[v1]z} ⇒ (run-process ["cmd" "x" "ya" "b'c" "z"])
If the value is a string, and we are inside single quotes, all characters (including whitespace) are literal.
&`{cmd 'x y&[v1]z'} ⇒ (run-process ["cmd" "x ya b'c z"])
Double quotes work the same except that newline is an argument separator. This is useful when you have one filename per line, and the filenames may contain spaces, as in the output from find:
&`{ls -l "&`{find . -name '*.pdf'}"}
This solves a problem that is quite painful with traditional shells.
If the value is a vector or list (of strings), and we’re not inside quotes, then each element of the array becomes its own argument, as-is:
(define v2 ["a b" "c\"d"])
&`{cmd &[v2]} ⇒ (run-process ["cmd" "a b" "c\"d"])
However, if the enclosed expression is adjacent to non-space non-quote characters, those are prepended to the first element, or appended to the last element, respectively.
&`{cmd x&[v2]y} ⇒ (run-process ["cmd" "xa b" "c\"dy"])
&`{cmd x&[[]]y} ⇒ (run-process ["cmd" "xy"])
This behavior is similar to how shells handle "$@" (or "${name[@]}" for general arrays), though in Kawa you would leave off the quotes.
Note the equivalence:
&`{&[array]} ⇒ (run-process array)
If the value is a vector or list (of strings), and we are inside quotes, it is equivalent to interpolating a single string resulting from concatenating the elements separated by a space:
&`{cmd "&[v2]"} ⇒ (run-process ["cmd" "a b c\"d"])
This behavior is similar to how shells handle "$*" (or "${name[*]}" for general arrays).
If the value is the result of a call to unescaped-data then it is parsed as if it were literal. For example a quote in the unescaped data may match a quote in the literal:
(define vu (unescaped-data "b ' c d '"))
&`{cmd 'a &[vu]z'} ⇒ (run-process ["cmd" "a b " "c" "d" "z"])
If we’re using a shell to tokenize the command, then we add quotes or backslashes as needed so that the shell will tokenize as described above:
(define authors ["O'Conner" "de Beauvoir"])
&sh{list-books &[authors]}
The command passed to the shell is:
list-books 'O'\''Conner' 'de Beauvoir'
Because the $construct$:sh implementation handles the quoting automatically, common code injection problems are eliminated.
Smart tokenization only happens when using the quasi-literal forms such as &`{command}. You can of course use string templates with run-process:
(run-process &{echo The directory is: &`{pwd}})
However, in that case there is no smart tokenization: the template is evaluated to a string, and then the resulting string is tokenized, with no knowledge of where expressions were substituted.
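The difference can be sketched with a value that contains an apostrophe. This assumes a Unix-like system with echo on the PATH; the variable names v, smart, and cmd are illustrative.

```scheme
(define v "it's here")

;; Process literal: the interpolation sits inside single quotes, so every
;; character of v (the apostrophe and the space included) is literal and
;; echo receives one argument.
(define smart ::string &`{echo '&[v]'})

;; Plain string template: v is simply pasted into the text. If this string
;; were handed to run-process, the tokenizer could not tell the apostrophe
;; in v apart from the quotes written in the template.
(define cmd ::string &{echo '&[v]'})

(write smart) (newline)
(write cmd) (newline)
```

Only the first form is actually run here; the second is just built as a string to show what run-process would be asked to tokenize.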
You can use various keyword arguments to specify standard input, output, and error streams. For example, to lower-case the text in in.txt, writing the result to out.txt, you can do:
&`[in-from: "in.txt" out-to: "out.txt"]{tr A-Z a-z}
or:
(run-process in-from: "in.txt" out-to: "out.txt" "tr A-Z a-z")
A process-redirect-argument can be one of the following:
in: value
The value is evaluated, converted to a string (as if using display), and copied to the input file of the process. The following are equivalent:
&`[in: "text\n"]{command}
&`[in: &`{echo "text"}]{command}
You can pipe the output from command1 to the input of command2 as follows:
&`[in: &`{command1}]{command2}
in-from: path
The process reads its input from the specified path, which can be any value coercible to a filepath.
out-to: path
The process writes its output to the specified path.
err-to: path
Similarly for the error stream.
out-append-to: path
err-append-to: path
Similar to out-to and err-to, but append to the file specified by path, instead of replacing it.
in-from: 'pipe
out-to: 'pipe
err-to: 'pipe
Does not set up redirection. Instead, the specified stream is available using the methods getOutputStream, getInputStream, or getErrorStream, respectively, on the resulting Process object, just like Java’s ProcessBuilder.Redirect.PIPE.
in-from: 'inherit
out-to: 'inherit
err-to: 'inherit
Inherits the standard input, output, or error stream from the current JVM process.
out-to: port
err-to: port
Redirects the standard output or error of the process to the specified port.
out-to: 'current
err-to: 'current
Same as out-to: (current-output-port), or err-to: (current-error-port), respectively.
in-from: port
in-from: 'current
Redirects standard input to read from the port (or (current-input-port)). It is unspecified how much is read from the port. (The implementation is to use a thread that reads from the port, and sends it to the process, so it might read to the end of the port, even if the process doesn’t read it all.)
err-to: 'out
Redirect the standard error of the process to be merged with the standard output.
The default for the error stream (if neither err-to nor err-append-to is specified) is equivalent to err-to: 'current.
Note: Writing to a port is implemented by copying the output or error stream of the process. This is done in a thread, which means we don’t have any guarantees when the copying is finished. (In the future we might change process-exit-wait (discussed later) to wait not only for the process to finish, but also for these helper threads to finish.)
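As a hedged sketch of the 'pipe option described above, the following reads the output of a process through the ordinary java.lang.Process stream API instead of converting the process to a string. It assumes a Unix-like system with echo on the PATH; the names p, reader, and first-line are illustrative.

```scheme
;; out-to: 'pipe leaves standard output available through getInputStream.
(define p (run-process out-to: 'pipe "echo hello"))

;; Wrap the raw stream in standard Java readers to read a line of text.
(define reader
  (java.io.BufferedReader
    (java.io.InputStreamReader (p:getInputStream))))

;; readLine returns the line without its terminator.
(define first-line (reader:readLine))
(define code ::int (process-exit-wait p))

(display first-line) (newline)
```

This style is mainly useful when the output should be consumed incrementally; for small outputs, converting the process to a string is simpler.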
A here document is a form of literal string, typically multi-line, and commonly used in shells for the standard input of a process. You can use string literals or string quasi-literals for this. For example, this passes the string "line1\nline2\nline3\n" to the standard input of command:
(run-process [in: &{
    &|line1
    &|line2
    &|line3
}] "command")
Note the use of &| to mark the end of ignored indentation.
Piping the output of one process as the input of another is in principle easy: just use the in: process argument. However, writing a multi-stage pipeline quickly gets ugly:
&`[in: &`[in: "My text\n"]{tr a-z A-Z}]{wc}
The convenience macro pipe-process makes this much nicer:
(pipe-process "My text\n" &`{tr a-z A-Z} &`{wc})
Syntax: pipe-process input process*
All of the process expressions must be run-process forms, or equivalent &`{command} forms. The result of evaluating input becomes the input to the first process; the output from the first process becomes the input to the second process, and so on. The result of the whole pipe-process expression is that of the last process.
Copying the output of one process to the input of the next is optimized: it uses a copying loop in a separate thread. Thus you can safely pipe long-running processes that produce huge output. This isn’t quite as efficient as using an operating system pipe, but is portable and works pretty well.
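Since the result of pipe-process is that of the last process, it can be converted to a string like any other process value. A small sketch, assuming a Unix-like system with tr on the PATH (the name shouted is illustrative):

```scheme
;; Two stages: upper-case the text, then delete the vowels.
(define shouted ::string
  (pipe-process "my text\n"
                &`{tr a-z A-Z}
                &`{tr -d AEIOU}))

(write shouted) (newline)
```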
By default the new process inherits the system environment of the current (JVM) process as returned by System.getenv(), but you can override it. A process-environment-argument can be one of the following:
env-name: value
In the process environment, set the variable "name" to the specified value. For example:
&`[env-CLASSPATH: ".:classes"]{java MyClass}
NAME: value
Same as using the env-NAME option above, but only if the NAME is uppercase (i.e. if uppercasing NAME yields the same string). For example the previous example could be written:
&`[CLASSPATH: ".:classes"]{java MyClass}
environment: env
The env is evaluated and must yield a HashMap. This map is used as the system environment of the process.
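To see the effect of the env-NAME: and NAME: forms, one can have a shell echo the variable back. This is only a sketch: GREETING is a made-up variable name, a Unix-like /bin/sh is assumed, and it relies on the literal text of an &sh command being passed to the shell unchanged so that the shell expands $GREETING.

```scheme
;; GREETING is a hypothetical variable used only for this illustration.
;; Long form, with the env- prefix:
(define out1 ::string &sh[env-GREETING: "hello"]{echo $GREETING})

;; Short form, allowed because the name is all upper-case:
(define out2 ::string &sh[GREETING: "hello"]{echo $GREETING})

(write out1) (newline)
(write out2) (newline)
```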
When a process finishes, it returns an integer exit code. The code is traditionally 0 on successful completion, while a non-zero code indicates some kind of failure or error.
Procedure: process-exit-wait process
The process expression must evaluate to a process (any java.lang.Process object). This procedure waits for the process to finish, and then returns the exit code as an int.
(process-exit-wait (run-process "echo foo")) ⇒ 0
Procedure: process-exit-ok? process
Calls process-exit-wait, and then returns #true if the process exited with code 0, and returns #false otherwise.
This is useful for emulating the way traditional shells handle logical control flow based on the exit code. For example in sh you might write:
if grep Version Makefile >/dev/null
then echo found Version
else echo no Version
fi
The equivalent in Kawa:
(if (process-exit-ok? &`{grep Version Makefile})
  &`{echo found}
  &`{echo not found})
Strictly speaking these are not quite the same, since the Kawa version silently throws away the output from grep (because no-one has asked for it). To match the output from sh, you can use out-to: 'inherit:
(if (process-exit-ok? &`[out-to: 'inherit]{grep Version Makefile})
  &`{echo found}
  &`{echo not found})
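The relationship between the two procedures can be sketched with the standard Unix commands true and false, which do nothing except exit with a zero and a non-zero code respectively. The helper name run-ok? is hypothetical, and both commands are assumed to be on the PATH.

```scheme
;; Hypothetical helper: run a command and report whether it exited with 0.
(define (run-ok? command)
  (process-exit-ok? (run-process command)))

(define ok (run-ok? "true"))        ; "true" exits with code 0
(define failed (run-ok? "false"))   ; "false" exits with a non-zero code

;; The raw exit code is available from process-exit-wait.
(define code ::int (process-exit-wait (run-process "false")))

(display (if ok "true succeeded" "true failed")) (newline)
(display (if failed "false succeeded" "false failed")) (newline)
```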
Procedure: exit [code]
Exits the Kawa interpreter, and ends the Java session. Returns the value of code to the operating system: the code must be an integer, or one of the special values #f (equivalent to -1) or #t (equivalent to 0). If code is not specified, zero is returned. The code is a status code; by convention a non-zero value indicates a non-standard (error) return.
Before exiting, finally-handlers (as in try-finally, or the after procedure of dynamic-wind) are executed, but only in the current thread, and only if the current thread was started normally. (Specifically if we’re inside an ExitCalled block with non-zero nesting; see gnu.kawa.util.ExitCalled.) Also, JVM shutdown hooks are executed, which includes flushing buffers of output ports. (Specifically Writer objects registered with the WriterManager.)
Procedure: emergency-exit [code]
Exits the Kawa interpreter, and ends the Java session. Communicates an exit value in the same manner as exit. Unlike exit, neither finally-handlers nor shutdown hooks are executed.
Procedure: make-process command envp
Creates a <java.lang.Process> object, using the specified command and envp. The command is converted to an array of Java strings (that is, an object that has type <java.lang.String[]>). It can be a Scheme vector or list (whose elements should be Java strings or Scheme strings); a Java array of Java strings; or a Scheme string. In the latter case, the command is converted using command-parse. The envp is the process environment; it should be either a Java array of Java strings, or the special #!null value.
Except for the representation of envp, this is similar to:
(run-process environment: envp command)
Procedure: system command
Runs the specified command, and waits for it to finish. Returns the return code from the command. The return code is an integer, where 0 conventionally means successful completion. The command can be any of the types handled by make-process.
Equivalent to:
(process-exit-wait (make-process command #!null))
Variable: command-parse
The value of this variable should be a one-argument procedure. It is used to convert a command from a Scheme string to a Java array of the constituent "words". The default binding, on Unix-like systems, returns a new command to invoke "/bin/sh" "-c" concatenated with the command string; on non-Unix systems, it is bound to tokenize-string-to-string-array.
Procedure: tokenize-string-to-string-array command
Uses a java.util.StringTokenizer to parse the command string into an array of words. This splits the command using spaces to delimit words; there is no special processing for quotes or other special characters. (This is the same as what java.lang.Runtime.exec(String) does.)
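The lack of quote handling is easy to see in a small sketch. The String[] type annotation and the use of the length property and array indexing are assumptions about how the returned Java array is accessed from Kawa.

```scheme
;; The quoted phrase is NOT kept together: splitting is on whitespace only.
(define words ::String[]
  (tokenize-string-to-string-array "echo 'a b' c"))

(display words:length) (newline)   ; four words: echo, 'a, b', c
(display (words 1)) (newline)      ; the second word still carries its quote
```

Compare this with run-process, which would treat 'a b' as a single argument.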