6.12.6 Buffering

Every port has associated input and output buffers. You can think of ports as being backed by some mutable store, and that store might be far away. For example, ports backed by file descriptors have to go all the way to the kernel to read and write their data. To avoid this round-trip cost, Guile usually reads in data from the mutable store in chunks, and then services small requests like get-char out of that intermediate buffer. Similarly, small writes like write-char first go to a buffer, and are sent to the store when the buffer is full (or when the port is flushed). Buffered ports speed up your program by reducing the number of round-trips to the mutable store, and they do so in a way that is mostly transparent to the user.

There are two major ways, however, in which buffering affects program semantics. Building correct, performant programs requires understanding these situations.

The first case is in random-access read/write ports (see Random Access). These ports, usually backed by a file, logically operate over the same mutable store when both reading and writing. So, if you read a character, causing the buffer to fill, then write a character, the bytes you filled in your read buffer are now invalid. Every time you switch between reading and writing, Guile has to flush any pending buffer. If this happens frequently, the cost can be high. In that case you should reduce the amount that you buffer, in both directions. Similarly, Guile has to flush buffers before seeking. None of these considerations apply to sockets, which don’t logically read from and write to the same mutable store, and are not seekable. Note also that sockets are unbuffered by default. See Network Sockets and Communication.
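As a minimal sketch of the read/write switch described above, the following creates a scratch file, reopens it as a random-access port, and alternates one read with one write. The /tmp name template, the sample contents, and the 64-byte buffer size are assumptions chosen for illustration; whether a smaller buffer actually pays off depends on how often your program switches direction.

```scheme
(use-modules (ice-9 rdelim))

;; Assumption: /tmp exists and is writable; the name template is arbitrary.
(define name (string-copy "/tmp/guile-rw-XXXXXX"))

;; mkstemp! fills in NAME and returns a port open on the new file.
(let ((port (mkstemp! name)))
  (display "abcdef" port)
  (close-port port))

(define rw (open-file name "r+"))
(setvbuf rw 'block 64)               ; small buffer: less to discard per switch

(define first-char (read-char rw))   ; fills the input buffer from the file
(write-char #\X rw)                  ; switching to writing discards that buffer
(close-port rw)                      ; closing flushes the pending write

;; The write landed right after the character that was read.
(define result (call-with-input-file name read-line))
(delete-file name)
```

Here first-char is #\a and result is "aXcdef": the buffered input beyond the first character was thrown away so that the write could go to the logical position of the port.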

The second case is the more pernicious one. If you write data to a buffered port, it probably doesn’t go out to the mutable store directly. (This “probably” introduces some indeterminism in your program: what goes to the store, and when, depends on how full the buffer is. It is something that the user needs to explicitly be aware of.) The data is written to the store later – when the buffer fills up due to another write, or when force-output is called, or when close-port is called, or when the program exits, or even when the garbage collector runs. The salient point is, the errors are signaled then too. Buffered writes defer error detection (and defer the side effects to the mutable store), perhaps indefinitely if the port type does not need to be closed at GC.
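The deferral can be observed from outside the port. The sketch below, which assumes a writable /tmp and uses an arbitrary name template and buffer size, writes a short string to a block-buffered file port and checks the size of the file before and after calling force-output.

```scheme
;; Assumption: /tmp exists and is writable; the name template is arbitrary.
(define name (string-copy "/tmp/guile-deferred-XXXXXX"))
(define out (mkstemp! name))          ; NAME now holds the real file name
(setvbuf out 'block 4096)

(display "hello" out)
(define size-before (stat:size (stat name)))  ; 0: the bytes are still buffered

(force-output out)                    ; a write error would be signaled here
(define size-after (stat:size (stat name)))   ; 5: the bytes reached the file

(close-port out)
(delete-file name)
```

If the underlying write were going to fail (a full disk, say), the call to display would still succeed, and the error would surface only at force-output or at whichever later operation happens to flush the buffer.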

One common heuristic that works well for textual ports is to flush output when a newline (\n) is written. This line buffering mode is on by default for TTY ports. Most other ports are block buffered, meaning that once the output buffer reaches the block size, which depends on the port and its configuration, the output is flushed as a block, without regard to what is in the block. Likewise, input is read in at the block size, though if there are fewer bytes available to read, the buffer may not be entirely filled.

Note that binary reads or writes that are larger than the buffer size go directly to the mutable store without passing through the buffers. If your access pattern involves many big reads or writes, buffering might not matter so much to you.
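A rough way to see this bypass, again assuming a writable /tmp and an arbitrary 4096-byte buffer, is to compare a small and a large put-bytevector on the same block-buffered port:

```scheme
(use-modules (ice-9 binary-ports) (rnrs bytevectors))

;; Assumption: /tmp exists and is writable; the name template is arbitrary.
(define name (string-copy "/tmp/guile-bigwrite-XXXXXX"))
(define out (mkstemp! name))
(setvbuf out 'block 4096)

;; 100 bytes fit in the buffer, so nothing reaches the file yet.
(put-bytevector out (make-bytevector 100 0))
(define size-after-small (stat:size (stat name)))   ; 0

;; 8192 bytes exceed the buffer: pending bytes are flushed first,
;; then the large write goes straight to the file.
(put-bytevector out (make-bytevector 8192 0))
(define size-after-large (stat:size (stat name)))   ; 8292

(close-port out)
(delete-file name)
```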

To control the buffering behavior of a port, use setvbuf.

Scheme Procedure: setvbuf port mode [size]
C Function: scm_setvbuf (port, mode, size)

Set the buffering mode for port. mode can be one of the following symbols:

none
    non-buffered
line
    line buffered
block
    block buffered, using a newly allocated buffer of size bytes. If size is omitted, a default size will be used.

Another way to set the buffering, for file ports, is to open the file with 0 or l as part of the mode string, for unbuffered or line-buffered ports, respectively. See File Ports, for more.
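As an illustration of the none and line modes, the following sketch switches modes on one scratch file port and watches the file size after each write. The /tmp name template is an assumption; the sizes in the comments follow from the mode descriptions above.

```scheme
;; Assumption: /tmp exists and is writable; the name template is arbitrary.
(define name (string-copy "/tmp/guile-setvbuf-XXXXXX"))
(define port (mkstemp! name))

(setvbuf port 'none)                  ; every write goes straight through
(display "ab" port)
(define size-unbuffered (stat:size (stat name)))    ; 2

(setvbuf port 'line)                  ; flush only when a newline is written
(display "cd" port)
(define size-mid-line (stat:size (stat name)))      ; still 2
(newline port)
(define size-end-of-line (stat:size (stat name)))   ; 5

(close-port port)
(delete-file name)
```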

Any buffered output data will be written out when the port is closed. To make sure to flush it at specific points in your program, use force-output.

Scheme Procedure: force-output [port]
C Function: scm_force_output (port)

Flush the specified output port, or the current output port if port is omitted. The current output buffer contents, if any, are passed to the underlying port implementation.

The return value is unspecified.
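A typical use is an interactive prompt that does not end in a newline, and so might otherwise sit in the buffer while the program waits for input. The helper below is hypothetical (prompt is not a Guile procedure) and is only a sketch of the pattern:

```scheme
(use-modules (ice-9 rdelim))

;; Hypothetical helper: show MESSAGE on OUT, then read one line from IN.
(define* (prompt message
                 #:optional
                 (in (current-input-port))
                 (out (current-output-port)))
  (display message out)
  (force-output out)    ; make the prompt visible before blocking on input
  (read-line in))

;; Exercise the helper with string ports instead of a terminal.
(define answer #f)
(define shown
  (call-with-output-string
    (lambda (out)
      (call-with-input-string "Ada\n"
        (lambda (in)
          (set! answer (prompt "Name: " in out)))))))
```

With the string ports above, answer is "Ada" and shown is "Name: ". On a terminal you would simply call (prompt "Name: ").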

Scheme Procedure: flush-all-ports
C Function: scm_flush_all_ports ()

Equivalent to calling force-output on all open output ports. The return value is unspecified.

Similarly, sometimes you might want to switch from using Guile’s ports to working directly on file descriptors. In that case, for input ports use drain-input to get any buffered input from that port.

Scheme Procedure: drain-input port
C Function: scm_drain_input (port)

This procedure clears a port’s input buffers, similar to the way that force-output clears the output buffer. The contents of the buffers are returned as a single string, e.g.,

(define p (open-input-file ...))
(drain-input p)                ; => empty string, nothing buffered yet
(unread-char (read-char p) p)
(drain-input p)                ; => initial chars from p, up to the buffer size
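For a version with concrete values, the following sketch first writes a short scratch file and then drains a port opened on it. The /tmp name template and the file contents are assumptions made for the example.

```scheme
;; Assumption: /tmp exists and is writable; the name template is arbitrary.
(define name (string-copy "/tmp/guile-drain-XXXXXX"))
(let ((port (mkstemp! name)))
  (display "hello world" port)
  (close-port port))

(define p (open-input-file name))
(define nothing-yet (drain-input p))  ; "": nothing has been read or buffered
(define first-char (read-char p))     ; #\h; this read fills the input buffer
(define leftover (drain-input p))     ; "ello world": buffered, not yet consumed

(close-port p)
(delete-file name)
```

Anything returned by drain-input has already been taken from the file descriptor, so code that continues on the descriptor directly needs to process that string first.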

All of these considerations are very similar to those of streams in the C library, although Guile’s ports are not built on top of C streams. Still, it is useful to read what other systems do. See Streams in The GNU C Library Reference Manual, for more discussion on C streams.