18.3.6 List Operators ([] and [^])

A list, also called a bracket expression, is a set of one or more items. An item is a character, a collating symbol, an equivalence class expression, a character class expression, or a range expression. The syntax bits affect which kinds of items you can put in a list. We explain the last four kinds of items in the subsections below. Empty lists are invalid.

A matching list matches a single character represented by one of the list items. You form a matching list by enclosing one or more items within an open-matching-list operator (represented by ‘[’) and a close-list operator (represented by ‘]’).

For example, ‘[ab]’ matches either ‘a’ or ‘b’. ‘[ad]*’ matches the empty string and any string composed of just ‘a’s and ‘d’s in any order. Regex considers a regular expression that has a ‘[’ but no matching ‘]’ to be invalid.
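The following is a minimal sketch of these examples, not part of the Regex sources. It assumes the POSIX `regcomp`/`regexec` interface with `REG_EXTENDED` rather than the GNU syntax-bit interface, and the helper names `list_compiles` and `list_matches_whole` are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if PATTERN compiles as a POSIX extended regex. */
static bool list_compiles(const char *pattern)
{
    regex_t re;

    assert(pattern != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    regfree(&re);
    return true;
}

/* Hypothetical helper: true if PATTERN matches all of TEXT.
   The pattern is wrapped as ^(PATTERN)$ so a partial match does not count. */
static bool list_matches_whole(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    bool ok;
    int n;

    assert(pattern != NULL && text != NULL);
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t) n >= sizeof anchored)
        return false;               /* pattern too long for this sketch */
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* invalid pattern */
    ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

Under these assumptions, `list_matches_whole("[ad]*", "adda")` and `list_matches_whole("[ad]*", "")` are true, `list_matches_whole("[ad]*", "abd")` is false, and `list_compiles("[ab")` is false because the ‘[’ has no matching ‘]’.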

Nonmatching lists are similar to matching lists except that they match a single character not represented by one of the list items. You use an open-nonmatching-list operator (represented by ‘[^’ (4)) instead of an open-matching-list operator to start a nonmatching list.

For example, ‘[^ab]’ matches any character except ‘a’ or ‘b’.

If the syntax bit RE_HAT_LISTS_NOT_NEWLINE is set, then nonmatching lists do not match a newline.
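As a rough illustration of nonmatching lists and newlines, the sketch below again uses the POSIX `regcomp` interface instead of setting syntax bits directly. It relies on the POSIX rule that compiling with `REG_NEWLINE` stops a nonmatching list from matching a newline, which is the behavior RE_HAT_LISTS_NOT_NEWLINE describes; treating the two as equivalent is an assumption of this example, and the helper name `nonmatching_list_found` is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if PATTERN matches somewhere in TEXT.
   EXTRA_CFLAGS is OR-ed into the regcomp flags (for example REG_NEWLINE). */
static bool nonmatching_list_found(const char *pattern, const char *text,
                                   int extra_cflags)
{
    regex_t re;
    bool found;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB | extra_cflags) != 0)
        return false;               /* invalid pattern */
    found = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return found;
}
```

With this helper, `nonmatching_list_found("[^ab]", "a\nb", 0)` is true because the newline is a character other than ‘a’ or ‘b’, while the same call with `REG_NEWLINE` is false. The pattern ‘[^^]’ is a nonmatching list whose only item is ‘^’, so it finds nothing in the string ‘^^’.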

Most characters lose any special meaning inside a list. The characters that remain special inside a list are as follows.

]

ends the list if it’s not the first list item. So, if you want to make the ‘]’ character a list item, you must put it first.

\

quotes the next character if the syntax bit RE_BACKSLASH_ESCAPE_IN_LISTS is set.

[.

represents the open-collating-symbol operator (see Collating Symbol Operators ([..])).

.]

represents the close-collating-symbol operator.

[=

represents the open-equivalence-class operator (see Equivalence Class Operators ([==])).

=]

represents the close-equivalence-class operator.

[:

represents the open-character-class operator (see Character Class Operators ([::])) if the syntax bit RE_CHAR_CLASSES is set and what follows is a valid character class expression.

:]

represents the close-character-class operator if the syntax bit RE_CHAR_CLASSES is set and what precedes it is an open-character-class operator followed by a valid character class name.

-

represents the range operator (see The Range Operator (-)) if it’s not first or last in a list or the ending point of a range.

All other characters are ordinary. For example, ‘[.*]’ matches ‘.’ and ‘*’.
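The rules above for ‘]’, ‘\’, ‘-’ and ‘[:’ can be sketched as follows. This is an illustration under stated assumptions, not the library's own test code: it uses POSIX `regcomp` with `REG_EXTENDED`, a syntax in which character classes are recognized and a backslash inside a list is assumed to be an ordinary character (that is, RE_BACKSLASH_ESCAPE_IN_LISTS is not in effect). The helper name `in_list` is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true if the bracket expression LIST matches all of
   TEXT, which is expected to be a single character. */
static bool in_list(const char *list, const char *text)
{
    char anchored[256];
    regex_t re;
    bool ok;
    int n;

    assert(list != NULL && text != NULL);
    n = snprintf(anchored, sizeof anchored, "^(%s)$", list);
    if (n < 0 || (size_t) n >= sizeof anchored)
        return false;               /* list too long for this sketch */
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* invalid list */
    ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

For instance, `in_list("[]a]", "]")` is true because the ‘]’ comes first, `in_list("[a-]", "-")` is true because the ‘-’ comes last, `in_list("[a-c]", "b")` is true because the ‘-’ forms a range, and `in_list("[.*]", "*")` is true because ‘.’ and ‘*’ are ordinary inside a list.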


Footnotes

(4)

Regex therefore doesn’t consider the ‘^’ to be the first character in the list. If you put a ‘^’ character first in (what you think is) a matching list, you’ll turn it into a nonmatching list.