18.1 Overview

A regular expression (or regexp, or pattern) is a text string that describes some (mathematical) set of strings. A regexp r matches a string s if s is in the set of strings described by r.

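Using the Regex library, you can match a string against a pattern as a whole, or search within a string for a substring that matches a pattern.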

Some regular expressions match only one string, i.e., the set they describe has only one member. For example, the regular expression ‘foo’ matches the string ‘foo’ and no others. Other regular expressions match more than one string, i.e., the set they describe has more than one member. For example, the regular expression ‘f*’ matches every string made up of any number (including zero) of ‘f’s. As you can see, some characters in regular expressions match themselves (such as ‘f’) and some don’t (such as ‘*’); the ones that don’t match themselves are operators that let you write a single pattern describing many different strings.
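The set-membership view above can be checked with a small sketch. It assumes the POSIX functions regcomp, regexec, and regfree declared in regex.h. The helper name in_set is hypothetical and is not part of the library. Because regexec searches for a match anywhere in the string, the helper wraps the pattern as ‘^(...)$’ in extended syntax so that only a match of the whole string counts.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Regex): return 1 if the whole of s
   is in the set described by pattern, 0 if it is not, and -1 if the
   pattern cannot be compiled or memory runs out. */
int in_set(const char *pattern, const char *s)
{
    /* Build "^(pattern)$" so that a match on a mere substring of s
       does not count. */
    size_t len = strlen(pattern) + 5;   /* "^(" + ")$" + terminating NUL */
    char *anchored = malloc(len);
    if (anchored == NULL)
        return -1;
    snprintf(anchored, len, "^(%s)$", pattern);

    regex_t re;
    int rc = regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB);
    free(anchored);
    if (rc != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);   /* 0 means a match was found */
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under these assumptions, in_set("foo", "foo") yields 1 and in_set("foo", "food") yields 0, while in_set("f*", "") and in_set("f*", "fff") both yield 1.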

To either match or search for a regular expression with the Regex library functions, you must first compile it with a Regex pattern compiling function. A compiled pattern is a regular expression converted to the internal format used by the library functions. Once you’ve compiled a pattern, you can use it for matching or searching any number of times.
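As a minimal sketch of the compile-once, use-many-times pattern, the following uses the POSIX group of functions (regcomp, regexec, regfree). The helper name count_matching and its parameters are hypothetical, chosen only for illustration. The GNU group follows the same compile-then-use sequence with different functions.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): compile pattern once, then
   search each of the n strings with the same compiled pattern.
   Returns how many strings contain a match, or -1 if the pattern
   cannot be compiled. */
int count_matching(const char *pattern, const char *const *strings, size_t n)
{
    regex_t compiled;

    /* Convert the regular expression to the library's internal format. */
    if (regcomp(&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int hits = 0;
    for (size_t i = 0; i < n; i++) {
        /* Reuse the compiled pattern; regexec returns 0 on a match. */
        if (regexec(&compiled, strings[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&compiled);   /* release the compiled pattern's storage */
    return hits;
}
```

The pattern is compiled exactly once, however many strings are searched, and the compiled form is freed only after the last use.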

To use the Regex library, include regex.h. Regex provides three groups of functions for operating on regular expressions: the GNU group, the POSIX group, and the Berkeley Unix group. The GNU group, whose interface was designed specifically for GNU, is more powerful but not completely compatible with the other two.

We wrote this chapter with programmers in mind, not users of programs—such as Emacs—that use Regex. We describe the Regex library in its entirety, not how to write regular expressions that a particular program understands.