Every port has associated input and output buffers. You can think of
ports as being backed by some mutable store, and that store might be far
away. For example, ports backed by file descriptors have to go all the
way to the kernel to read and write their data. To avoid this
round-trip cost, Guile usually reads in data from the mutable store in
chunks, and then services small requests like get-char out of
that intermediate buffer. Similarly, small writes like write-char
first go to a buffer, and are sent to the store when
the buffer is full (or when the port is flushed). Buffered ports speed up
your program by reducing the number of round-trips to the mutable store,
and they do so in a way that is mostly transparent to the user.
There are two major ways, however, in which buffering affects program semantics. Building correct, performant programs requires understanding these situations.
The first case is in random-access read/write ports (see Random Access). These ports, usually backed by a file, logically operate over the same mutable store when both reading and writing. So, if you read a character, causing the buffer to fill, then write a character, the bytes you filled in your read buffer are now invalid. Every time you switch between reading and writing, Guile has to flush any pending buffer. If this happens frequently, the cost can be high. In that case you should reduce the amount that you buffer, in both directions. Similarly, Guile has to flush buffers before seeking. None of these considerations apply to sockets, which don’t logically read from and write to the same mutable store, and are not seekable. Note also that sockets are unbuffered by default. See Network Sockets and Communication.
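As a minimal sketch of the read/write switch described above, the following opens a small file for both reading and writing, reads one character, and then writes one. The scratch file name and the buffer size of 16 bytes are assumptions chosen for illustration only.

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical scratch file; the name is an assumption for this sketch.
(define rw-path
  (string-append "/tmp/guile-rw-demo-" (number->string (getpid)) ".txt"))

;; Create a six-byte file to work on.
(call-with-output-file rw-path
  (lambda (port) (display "abcdef" port)))

;; "r+" opens an existing file for both reading and writing.
(define rw-port (open-file rw-path "r+"))

;; A small buffer keeps the cost of each read/write switch low.
(setvbuf rw-port 'block 16)

;; Reading one character fills the read buffer from the file.
(define first-char (read-char rw-port))

;; Switching to writing makes Guile drop the buffered input, so the
;; write lands at the logical position just after the character read.
(write-char #\X rw-port)
(close-port rw-port)

;; Read the file back to see where the write landed.
(define rw-result (call-with-input-file rw-path read-line))
(delete-file rw-path)
```

The write replaces the second character of the file, not whatever byte follows the read-ahead, because the logical position is what counts. The price is that the bytes read ahead are thrown away at each switch, which is why a smaller buffer can help when switches are frequent.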
The second case is the more pernicious one. If you write data to a
buffered port, it probably doesn’t go out to the mutable store directly.
(This “probably” introduces some nondeterminism in your program: what
goes to the store, and when, depends on how full the buffer is. It is
something that the user needs to be explicitly aware of.) The data is
written to the store later: when the buffer fills up due to another
write, or when force-output is called, or when close-port is called,
or when the program exits, or even when the garbage collector
runs. The salient point is, the errors are signaled then too.
Buffered writes defer error detection (and defer the side effects to the
mutable store), perhaps indefinitely if the port type does not need to
be closed at GC.
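The deferral can be observed by checking the size of the file on disk before and after a flush. This is only a sketch: the scratch file name, the helper bytes-on-disk, and the 4096-byte buffer size are assumptions made for the example.

```scheme
;; Hypothetical scratch file; the name is an assumption for this sketch.
(define demo-path
  (string-append "/tmp/guile-defer-demo-" (number->string (getpid)) ".txt"))

;; Hypothetical helper: size of FILE as the kernel currently sees it.
(define (bytes-on-disk file)
  (stat:size (stat file)))

(define out (open-output-file demo-path))
(setvbuf out 'block 4096)

;; Five bytes go into the port's buffer, not to the file.
(display "hello" out)
(define size-before-flush (bytes-on-disk demo-path))

;; Flushing hands the buffered bytes to the kernel.  A write error,
;; if any, would be signaled here rather than by display.
(force-output out)
(define size-after-flush (bytes-on-disk demo-path))

(close-port out)
(delete-file demo-path)
```

With this setup size-before-flush should be 0 and size-after-flush should be 5. Code that must know a write succeeded before continuing would call force-output at that point instead of relying on a later implicit flush.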
One common heuristic that works well for textual ports is to flush
output when a newline (\n) is written. This line buffering
mode is on by default for TTY ports. Most other ports are block
buffered, meaning that once the output buffer reaches the block size,
which depends on the port and its configuration, the output is flushed
as a block, without regard to what is in the block. Likewise reads are
read in at the block size, though if there are fewer bytes available to
read, the buffer may not be entirely filled.
Note that binary reads or writes that are larger than the buffer size go directly to the mutable store without passing through the buffers. If your access pattern involves many big reads or writes, buffering might not matter so much to you.
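A rough way to see the bypass for large binary writes is to use a deliberately tiny buffer and compare a small write with a large one. The scratch file name, the helper bytes-on-disk, and the sizes 16, 4 and 64 are assumptions picked for this sketch.

```scheme
(use-modules (ice-9 binary-ports)
             (rnrs bytevectors))

;; Hypothetical scratch file; the name is an assumption for this sketch.
(define big-path
  (string-append "/tmp/guile-bigwrite-demo-" (number->string (getpid)) ".bin"))

;; Hypothetical helper: size of FILE as the kernel currently sees it.
(define (bytes-on-disk file)
  (stat:size (stat file)))

(define bin-out (open-file big-path "wb"))
(setvbuf bin-out 'block 16)            ; deliberately tiny buffer

;; A 4-byte write fits in the buffer and stays there.
(put-bytevector bin-out (make-bytevector 4 0))
(define size-after-small (bytes-on-disk big-path))

;; A 64-byte write is larger than the buffer.  The pending 4 bytes are
;; flushed first to preserve ordering, then the 64 bytes are written.
(put-bytevector bin-out (make-bytevector 64 0))
(define size-after-large (bytes-on-disk big-path))

(close-port bin-out)
(delete-file big-path)
```

Here size-after-small should be 0, while size-after-large should already account for all 68 bytes even though the port was never flushed explicitly.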
To control the buffering behavior of a port, use setvbuf.
setvbuf port mode [size]
Set the buffering mode for port. mode can be one of the following symbols:
none
    non-buffered
line
    line buffered
block
    block buffered, using a newly allocated buffer of size bytes. If size is omitted, a default size will be used.
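To compare the three modes side by side, the sketch below writes a short string under each mode and records how many bytes have reached the file before any flush or close. The scratch file name and the helpers bytes-on-disk and bytes-written-without-flush are hypothetical names introduced for this example.

```scheme
;; Hypothetical scratch file; the name is an assumption for this sketch.
(define mode-path
  (string-append "/tmp/guile-setvbuf-demo-" (number->string (getpid)) ".txt"))

;; Hypothetical helper: size of FILE as the kernel currently sees it.
(define (bytes-on-disk file)
  (stat:size (stat file)))

;; Write TEXT to a freshly truncated file port in buffering MODE and
;; report how many bytes are on disk before the port is flushed or closed.
(define (bytes-written-without-flush mode text)
  (let ((port (open-output-file mode-path)))
    (setvbuf port mode)
    (display text port)
    (let ((size (bytes-on-disk mode-path)))
      (close-port port)
      size)))

;; Unbuffered: every write goes straight to the file.
(define unbuffered-size (bytes-written-without-flush 'none "abc"))

;; Line buffered: nothing is written until a newline arrives.
(define line-partial-size (bytes-written-without-flush 'line "abc"))
(define line-complete-size (bytes-written-without-flush 'line "abc\n"))

;; Block buffered: a newline has no special effect.
(define block-mode-size (bytes-written-without-flush 'block "abc\n"))

(delete-file mode-path)
```

Under these assumptions the four results should be 3, 0, 4 and 0 respectively.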
Another way to set the buffering, for file ports, is to open the file
with 0 or l as part of the mode string, for unbuffered or
line-buffered ports, respectively. See File Ports, for more.
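As a short illustration of the mode-string approach, the sketch below opens the same file once with "w0" and once with "wl". The scratch file name and the helper bytes-on-disk are assumptions made for the example.

```scheme
;; Hypothetical scratch file; the name is an assumption for this sketch.
(define open-mode-path
  (string-append "/tmp/guile-openmode-demo-" (number->string (getpid)) ".txt"))

;; Hypothetical helper: size of FILE as the kernel currently sees it.
(define (bytes-on-disk file)
  (stat:size (stat file)))

;; "w0": open for writing, unbuffered.
(define unbuffered-port (open-file open-mode-path "w0"))
(display "abc" unbuffered-port)
(define unbuffered-bytes (bytes-on-disk open-mode-path))
(close-port unbuffered-port)

;; "wl": open for writing (truncating the file again), line buffered.
(define line-port (open-file open-mode-path "wl"))
(display "abc" line-port)
(define line-bytes-before-newline (bytes-on-disk open-mode-path))
(newline line-port)
(define line-bytes-after-newline (bytes-on-disk open-mode-path))
(close-port line-port)

(delete-file open-mode-path)
```

The unbuffered port should show 3 bytes on disk right after the write, while the line-buffered port should show 0 bytes until the newline is written and 4 bytes afterwards.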
Any buffered output data will be written out when the port is closed.
To make sure to flush it at specific points in your program, use
force-output.
force-output [port]
Flush the specified output port, or the current output port if port is omitted. The current output buffer contents, if any, are passed to the underlying port implementation.
The return value is unspecified.
flush-all-ports
Equivalent to calling force-output on all open output ports. The
return value is unspecified.
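The following sketch shows one call flushing several file ports at once. The two scratch file names and the helper bytes-on-disk are assumptions for illustration, and the ports are left in their default block-buffered mode.

```scheme
;; Hypothetical scratch files; the names are assumptions for this sketch.
(define path-a
  (string-append "/tmp/guile-flushall-a-" (number->string (getpid)) ".txt"))
(define path-b
  (string-append "/tmp/guile-flushall-b-" (number->string (getpid)) ".txt"))

;; Hypothetical helper: size of FILE as the kernel currently sees it.
(define (bytes-on-disk file)
  (stat:size (stat file)))

(define port-a (open-output-file path-a))
(define port-b (open-output-file path-b))

;; Both writes are small, so they stay in each port's buffer.
(display "abc" port-a)
(display "abc" port-b)
(define sizes-before
  (list (bytes-on-disk path-a) (bytes-on-disk path-b)))

;; One call flushes every open output port, including these two.
(flush-all-ports)
(define sizes-after
  (list (bytes-on-disk path-a) (bytes-on-disk path-b)))

(close-port port-a)
(close-port port-b)
(delete-file path-a)
(delete-file path-b)
```

Here sizes-before should be (0 0) and sizes-after should be (3 3). Calling flush-all-ports can be convenient before handing control to code that bypasses Guile's ports, such as a child process that shares the same file descriptors.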
Similarly, sometimes you might want to switch from using Guile’s ports
to working directly on file descriptors. In that case, for input ports
use drain-input to get any buffered input from that port.
drain-input port
This procedure clears a port’s input buffers, similar to the way that force-output clears the output buffer. The contents of the buffers are returned as a single string, e.g.,

    (define p (open-input-file ...))
    (drain-input p) => empty string, nothing buffered yet.
    (unread-char (read-char p) p)
    (drain-input p) => initial chars from p, up to the buffer size.
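A self-contained variant of the same idea, using a small file created on the spot, might look like the sketch below. The scratch file name and its contents are assumptions made so the example can run as written.

```scheme
;; Hypothetical scratch file; the name is an assumption for this sketch.
(define drain-path
  (string-append "/tmp/guile-drain-demo-" (number->string (getpid)) ".txt"))

;; Create a small file, well under any plausible buffer size.
(call-with-output-file drain-path
  (lambda (port) (display "hello\n" port)))

(define in (open-input-file drain-path))

;; Nothing has been read yet, so the input buffer is empty.
(define drained-first (drain-input in))

;; Reading one character fills the buffer; unread-char puts it back.
(unread-char (read-char in) in)

;; The buffer now holds the whole (small) file.
(define drained-second (drain-input in))

(close-port in)
(delete-file drain-path)
```

With a six-byte file, drained-first should be the empty string and drained-second should be the full contents, "hello\n". After draining, later reads on the underlying file descriptor continue from where the port's read-ahead stopped.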
All of these considerations are very similar to those of streams in the C library, although Guile’s ports are not built on top of C streams. Still, it is useful to read what other systems do. See Streams in The GNU C Library Reference Manual, for more discussion on C streams.