grep Programs (GNU Grep 3.12)


2.4 `grep` Programs

grep searches the named input files for lines containing a match to the given patterns. By default, grep prints the matching lines. A file named - stands for standard input. If no input is specified, grep searches the working directory (.) when given a command-line option specifying recursion, and standard input otherwise. There are four major variants of grep, controlled by the following options.

-G

--basic-regexp

Interpret patterns as basic regular expressions (BREs). This is the default.

-E

--extended-regexp

Interpret patterns as extended regular expressions (EREs). (-E is specified by POSIX.)

-F

--fixed-strings

Interpret patterns as fixed strings, not regular expressions. (-F is specified by POSIX.)
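As a small illustration of how the three variants described so far differ, the sketch below runs one pattern, ‘a+b’, under -G, -E, and -F. The sample input and the variable names (sample, bre, ere, fixed) are made up for this example.

```shell
# Two sample lines: one contains a literal '+', the other a run of 'a's.
sample=$(printf 'a+b\naab\n')

# -G (BRE, the default): '+' is an ordinary character, so only 'a+b' matches.
bre=$(printf '%s\n' "$sample" | grep -G 'a+b')

# -E (ERE): '+' means "one or more of the preceding item", so only 'aab' matches.
ere=$(printf '%s\n' "$sample" | grep -E 'a+b')

# -F (fixed string): the pattern is taken literally, so only 'a+b' matches.
fixed=$(printf '%s\n' "$sample" | grep -F 'a+b')

printf 'BRE:   %s\nERE:   %s\nfixed: %s\n' "$bre" "$ere" "$fixed"
```

Here -G and -F happen to select the same line because ‘a+b’ contains no character that is special in a BRE; a pattern such as ‘a.b’ would separate them.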

-P

--perl-regexp

Interpret patterns as Perl-compatible regular expressions (PCREs).

For documentation, refer to https://www.pcre.org/, with these caveats:

In a UTF-8 locale, Perl treats data as UTF-8 only under certain conditions, e.g., if perl is invoked with the -C option or the PERL_UNICODE environment variable set appropriately. Similarly, pcre2grep treats data as UTF-8 only if invoked with -u or -U. In contrast, in a UTF-8 locale grep and git grep always treat data as UTF-8.
In Perl and git grep -P, ‘\d’ matches all Unicode digits, even if they are not ASCII. For example, ‘\d’ matches “٣” (U+0663 ARABIC-INDIC DIGIT THREE). In contrast, in ‘grep -P’, ‘\d’ matches only the ten ASCII digits, regardless of locale. In pcre2grep, ‘\d’ ordinarily behaves like Perl and git grep -P, but when given the --posix-digit option it behaves like ‘grep -P’. (On all platforms, ‘\D’ matches the complement of ‘\d’.)
The pattern ‘[[:digit:]]’ matches all Unicode digits in Perl, ‘grep -P’, git grep -P, and pcre2grep, so you can use it to get the effect of Perl’s ‘\d’ on all these platforms. In other words, in Perl and git grep -P, ‘\d’ is equivalent to ‘[[:digit:]]’, whereas in ‘grep -P’, ‘\d’ is equivalent to ‘[0-9]’, and pcre2grep ordinarily follows Perl but when given --posix-digit it follows ‘grep -P’.
(On all these platforms, ‘[[:digit:]]’ is equivalent to ‘\p{Nd}’ and to ‘\p{General_Category: Decimal_Number}’.)
If grep is built with PCRE2 version 10.43 (2024) or later, ‘(?aD)’ causes ‘\d’ to behave like ‘[0-9]’ and ‘(?-aD)’ causes it to behave like ‘[[:digit:]]’.
Although PCRE tracks the syntax and semantics of Perl’s regular expressions, the match is not always exact. Perl evolves and a Perl installation may predate or postdate the PCRE2 installation on the same host, or their Unicode versions may differ, or Perl and PCRE2 may disagree about an obscure construct.
By default, grep applies each regexp to a line at a time, so the ‘(?s)’ directive (making ‘.’ match line breaks) is generally ineffective. However, with -z (--null-data) it can work:
```
$ printf 'a\nb\n' | grep -zP '(?s)a.b'
a
b
```
But beware: with the -z (--null-data) option and a file containing no NUL byte, grep must read the entire file into memory before processing any of it. Thus, it can exhaust memory and fail on some large files.
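Because several of the caveats above depend on the locale and on how grep was built, it can be useful to probe a given installation directly. The following is a minimal sketch, not a definitive test: it assumes grep was built with -P support and that a UTF-8 locale named C.UTF-8 is available (substitute another UTF-8 locale name otherwise). The variable names are made up for this example.

```shell
# Assumption: a UTF-8 locale with this name exists on the host.
utf8_locale=C.UTF-8

# One ASCII digit and one non-ASCII digit (U+0663), one per line.
sample=$(printf '3\n٣\n')

# '\d': per the text above, grep -P should count only the ASCII digit.
d_count=$(printf '%s\n' "$sample" | LC_ALL=$utf8_locale grep -cP '\d' || true)

# '[[:digit:]]': per the text above, grep -P should count both lines
# when the locale really is UTF-8.
class_count=$(printf '%s\n' "$sample" | LC_ALL=$utf8_locale grep -cP '[[:digit:]]' || true)

# '(?s)' without -z: lines are matched one at a time, so 'a.b' cannot span them.
line_count=$(printf 'a\nb\n' | grep -cP '(?s)a.b' || true)

# '(?s)' with -z: the input is a single record, so '.' can match the newline.
record_count=$(printf 'a\nb\n' | grep -czP '(?s)a.b' || true)

echo "lines matching \\d:            $d_count"
echo "lines matching [[:digit:]]:   $class_count"
echo "(?s)a.b matches without -z:   $line_count"
echo "(?s)a.b matches with -z:      $record_count"
```

The `|| true` guards are there only because grep exits with status 1 when nothing matches, which would otherwise abort a script running under `set -e`. If the requested locale is missing, grep falls back to the C locale and the ‘[[:digit:]]’ count will not reflect UTF-8 behavior.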
