Saturday, March 8, 2014

Grep Tutorial


This is a reference booklet for grep and regular expressions. For detailed explanations of the various usages, please refer to a more elaborate guide. grep stands for Global Regular Expression Print. GNU grep supports four pattern types: basic regular expressions, extended regular expressions, fixed strings, and Perl-style regular expressions. By default, grep prints each line of the text that contains the searched string, prefixed with the filename when more than one file is searched. Literals are the normal text characters, whereas metacharacters have special meanings. A backtick-enclosed (``) portion is executed by the shell and replaced with its output before grep runs. Double quotes ("") allow environment variables to be used as part of the search pattern, whereas single quotes ('') pass the pattern to grep unchanged.

There are two ways to search with grep: searching for a fixed string and searching for a pattern. Concatenation is processed before alternation, so ab|cd matches ab or cd rather than a, then b or c, then d. Strings are concatenated simply by being placed next to each other inside the regular expression.
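The precedence rule above can be checked with a short sketch. The input lines are made up for illustration, and -x is used so that only whole-line matches count.

```shell
# Sample input, one candidate per line (made up for illustration)
input=$(printf 'ab\ncd\nabd\nacd\n')

# Concatenation binds tighter than alternation: 'ab|cd' means (ab) or (cd)
whole=$(printf '%s\n' "$input" | grep -E -x 'ab|cd')

# Parentheses limit the alternation to b or c: a, then b or c, then d
grouped=$(printf '%s\n' "$input" | grep -E -x 'a(b|c)d')

printf 'ab|cd matches:\n%s\n' "$whole"
printf 'a(b|c)d matches:\n%s\n' "$grouped"
```

The first pattern selects ab and cd, while the grouped pattern selects abd and acd.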

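grep -E has the advantage of accomplishing the task in fewer characters, because ?, +, { }, | and ( ) need no backslash escapes. If a pattern makes significant use of grouping and backreferences, grep -E is usually the more readable choice.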

grep -F treats the search pattern as a fixed string: metacharacters, escapes, wildcards, and alternations have no special meaning and are matched literally.
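As a rough comparison of the three modes, the sketch below runs the same pattern text, a+b, through grep -F, grep -E, and plain grep. It assumes GNU grep, and the sample lines are invented.

```shell
# Invented sample lines
input=$(printf 'a+b\naab\nab\n')

# -F: + is an ordinary character, so only the literal text a+b matches
fixed=$(printf '%s\n' "$input" | grep -F 'a+b')

# -E: + is a quantifier, so a+b means one or more a followed by b
extended=$(printf '%s\n' "$input" | grep -E 'a+b')

# Default (basic) mode: an unescaped + is a literal character
basic=$(printf '%s\n' "$input" | grep 'a+b')

printf 'fixed: %s\nextended: %s\nbasic: %s\n' "$fixed" "$(printf '%s' "$extended" | tr '\n' ' ')" "$basic"
```

Only the extended mode treats + as a quantifier here, so it selects aab and ab but not the literal line a+b.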

The general syntax of grep is as follows: grep [options] regularexpression [filename ...]

Example: grep -n 'error' logfile.txt
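A minimal sketch of the example above, together with the quoting behavior described earlier. The log file and its contents are hypothetical and are created in a temporary location only for this illustration.

```shell
# Hypothetical log file created only for this illustration
log=$(mktemp)
printf 'start\nerror: disk full\nok\nerror: timeout\n' > "$log"

# -n prefixes each matching line with its line number
numbered=$(grep -n 'error' "$log")

# Double quotes let the shell expand $word before grep sees the pattern
word=timeout
expanded=$(grep -n "error: $word" "$log")

# Backticks run a command and substitute its output into the grep command line
substituted=$(grep -c `echo error` "$log")

printf '%s\n' "$numbered" "$expanded" "$substituted"
rm -f "$log"
```

The first search reports lines 2 and 4, the second only line 4, and the last prints the count of matching lines.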

Single Character Match

.      Any one character
[ ]    Character class: any one of the characters listed in the brackets
[^ ]   Negated character class: any one character not listed in the brackets
\      Escape character: the character after the backslash (\) is used literally (not interpreted)
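The single-character matches can be tried on a few invented words, as in this small sketch:

```shell
# Invented sample words
input=$(printf 'cat\ncut\ncot\nc.t\n')

# . matches any one character, so all four lines match
any=$(printf '%s\n' "$input" | grep -c 'c.t')

# [au] matches a or u
class=$(printf '%s\n' "$input" | grep 'c[au]t')

# [^au] matches any one character except a or u
negated=$(printf '%s\n' "$input" | grep 'c[^au]t')

# \. matches a literal dot only
escaped=$(printf '%s\n' "$input" | grep 'c\.t')

printf 'any=%s\nclass:\n%s\nnegated:\n%s\nescaped: %s\n' "$any" "$class" "$negated" "$escaped"
```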
Position Match

^      Start of a line
$      End of a line
\<     Backslash less-than: start of a word
\>     Backslash greater-than: end of a word
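The position anchors can be sketched as follows. The sample lines are invented, and the \< and \> anchors are assumed to be available, as they are in GNU grep.

```shell
# Invented sample lines
input=$(printf 'error at start\nno error here\nends with error\nerrors everywhere\n')

# ^ anchors the match to the start of the line
starts=$(printf '%s\n' "$input" | grep '^error')

# $ anchors the match to the end of the line
ends=$(printf '%s\n' "$input" | grep 'error$')

# \< requires the match to begin at the start of a word
word_start=$(printf '%s\n' "$input" | grep -c '\<error')

# \> requires the match to stop at the end of a word, which excludes "errors"
word_end=$(printf '%s\n' "$input" | grep -c 'error\>')

printf 'starts:\n%s\nends: %s\nword_start=%s word_end=%s\n' "$starts" "$ends" "$word_start" "$word_end"
```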

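?      Question mark: optional match (zero or one occurrence of the preceding expression)
*      Asterisk: any number of occurrences of the preceding expression, including zero (combined with . as .* it acts as a wildcard)
+      Plus: one or more occurrences of the preceding expression (repetitive match)
{N}    Exact match: matches exactly N times
{N,}   Match at least: matches at least N times
{N,M}  Specified range: matches between the minimum and maximum number of times, e.g. {3,4}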
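A short sketch of the quantifiers, using grep -E so that no backslashes are needed and -x so that the whole line must match. The helper function name match and the sample words are assumptions made for this illustration.

```shell
# Invented sample words
input=$(printf 'color\ncolour\ncolouur\nclr\n')

# Hypothetical helper: whole-line extended-regex match against the sample input
match() { printf '%s\n' "$input" | grep -E -x "$1"; }

optional=$(match 'colou?r')     # u is optional: zero or one
star=$(match 'colou*r')         # any number of u, including zero
plus=$(match 'colou+r')         # one or more u
exact=$(match 'colou{2}r')      # exactly two u
atleast=$(match 'colou{1,}r')   # at least one u
range=$(match 'colou{0,1}r')    # between zero and one u

printf '%s\n' "$optional" "$star" "$plus" "$exact" "$atleast" "$range"
```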

|      Alternation: matches either of the expressions given
( )    Grouping: used to limit the scope of alternation (subpattern)
\1, \2, \3, …  Backreference: matches the text previously matched within the corresponding parentheses
\b     Word boundary: matches the empty string at the edge of a word, e.g. next to a space or period
\\     Literal backslash: matches the backslash (\) character itself
\w     Word character: matches any word character, i.e. letter, number, or underscore
\W     Non-word character: matches anything that is not a word character, i.e. other than letter, number, and underscore
\`     Start of buffer: matches the start of the buffer sent to grep
\'     End of buffer: matches the end of the buffer sent to grep
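A few of these constructs in action, as a sketch that assumes GNU grep (backreferences with -E, \b, \w, and \W are GNU extensions). The sample lines are invented.

```shell
# Invented sample lines
input=$(printf 'abab\nabcd\nfoo bar\nfoobar\n')

# Alternation: either of the two expressions
alt=$(printf '%s\n' "$input" | grep -E 'abcd|foobar')

# Backreference: \1 repeats whatever the first group matched
backref=$(printf '%s\n' "$input" | grep -E '(ab)\1')

# Word boundary: bar as a whole word, not as part of foobar
boundary=$(printf '%s\n' "$input" | grep '\bbar\b')

# A word character, a non-word character, then a word character
nonword=$(printf '%s\n' "$input" | grep '\w\W\w')

printf 'alt:\n%s\nbackref: %s\nboundary: %s\nnonword: %s\n' "$alt" "$backref" "$boundary" "$nonword"
```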

POSIX definition


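[:alpha:]   Any alphabetical character
[:digit:]   Any numerical character
[:alnum:]   Any alphabetical or numerical character
[:blank:]   Space or tab character
[:xdigit:]  Hexadecimal character
[:punct:]   Any punctuation symbol
[:print:]   Any printable character (not control characters)
[:space:]   Any whitespace character
[:graph:]   Any printable character, excluding whitespace
[:upper:]   Any uppercase letter
[:lower:]   Any lowercase letter
[:cntrl:]   Control character

These class names are used inside a bracket expression, e.g. [[:digit:]].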
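The POSIX classes can be sketched as below. Note the doubled brackets: the class name goes inside a bracket expression. The temporary file and its contents are assumptions for this illustration.

```shell
# Temporary sample file; the last line contains only a space, a tab, and a space
f=$(mktemp)
printf 'abc\n123\nabc123\nA-B\n \t \n' > "$f"

alpha_only=$(grep -E -x '[[:alpha:]]+' "$f")       # letters only
digit_only=$(grep -E -x '[[:digit:]]+' "$f")       # digits only
alnum_only=$(grep -E -x '[[:alnum:]]+' "$f")       # letters and digits only
punct=$(grep '[[:punct:]]' "$f")                   # contains punctuation
upper=$(grep -c '[[:upper:]]' "$f")                # lines with an uppercase letter
blank_only=$(grep -c -E -x '[[:blank:]]+' "$f")    # lines of spaces and tabs only

printf '%s\n' "$alpha_only" "$digit_only" "$alnum_only" "$punct" "$upper" "$blank_only"
rm -f "$f"
```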

Basic regular expression
grep or grep -G

-e pattern
Treats pattern as the regular expression to search for, which is useful when it begins with a hyphen, e.g. grep -e -style (matches -style)
-f file
Takes patterns from file. The pattern file must list one pattern per line.
-i (ignore case)
Case insensitive search
-v (invert match)
Returns lines that do not match pattern
-w (word match)
Matches the pattern only as a whole word, bounded by non-word characters.
-x (line match)
Matches only when the pattern matches the entire line, e.g. grep -x 'Hello, World!' filename
-c (count)
Counts the number of matching lines
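-l (files with matches)
Prints only the names of files containing the pattern; scanning of each file stops at the first match
Example: grep -l "error" *.txt
-L (files without match)
Prints only the names of files that do not contain the pattern; scanning of each file stops at the first match
Example: grep -L "error" *.txt
-m num (max count)
Stops reading a file after num lines are matched, e.g. only the first 10 lines that contain the regular expression
Example: grep -m 10 "error" *.txt
-o (only matching)
Prints only the text that matches, not the whole line
Example: grep -o pattern filename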
-q (quiet)
Suppresses all normal output; only the exit status reports whether a match was found
-s (silent, no messages)
Silently discards any error messages resulting from permission errors or nonexistent files
-b (byte offset)
Displays the byte offset of each matching line before the line
-H (with filename)
Includes the name of the file before each line printed (default when more than one file is searched)
-h (no filename)
Suppresses the filename before each output line when more than one filename is given
--label=LABEL (adds label)
Prefixes lines read from standard input with LABEL in place of a filename
-n (line number)
Includes the line number of each line displayed.
-T (initial tab)
Inserts a tab before the content of each matching line so that it lines up after the prefix
-u (Unix byte offsets)
Computes the byte offset as if grep were running on a Unix system (only relevant on MS-DOS and Windows)
-Z (null)
Prints an ASCII NUL (a zero byte) after each filename
-A num (after context)
Prints num lines after each match
-B num (before context)
Prints num lines before each match
-C num, -num (context)
Prints num lines before and after each match
-R or -r (recursive)
Searches all files underneath each directory given as an input file, e.g. grep -R pattern path
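Several of the options listed above can be tried together in one sketch. The directory, file names, and contents are hypothetical and live in a temporary directory. GNU grep is assumed for -L, -o, -m, and -r.

```shell
# Hypothetical sample files in a temporary directory
dir=$(mktemp -d)
orig_dir=$PWD
cd "$dir" || exit 1
printf 'alpha\nerror one\nbeta\nError two\ngamma\n' > a.txt
printf 'nothing to see\n' > b.txt

count=$(grep -c 'error' a.txt)               # matching lines, case-sensitive
count_i=$(grep -c -i 'error' a.txt)          # matching lines, ignoring case
with=$(grep -l 'error' a.txt b.txt)          # files that contain the pattern
without=$(grep -L 'error' a.txt b.txt)       # files that do not contain it
only=$(grep -o -i 'error' a.txt)             # only the matched text
first=$(grep -m 1 -i -n 'error' a.txt)       # stop after the first match
context=$(grep -A 1 -B 1 'error one' a.txt)  # one line before and after
inverted=$(grep -v -i 'error' a.txt)         # lines that do not match
recursive=$(grep -r -l 'error' .)            # search the directory tree

# -q prints nothing; the exit status alone says whether a match exists
if grep -q 'error' a.txt; then found=yes; else found=no; fi

printf '%s\n' "$count" "$count_i" "$with" "$without" "$first" "$found"
cd "$orig_dir" || exit 1
rm -rf "$dir"
```

Because -q reports through the exit status, it fits naturally in shell conditionals such as the if statement above.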
Extended Regular Expressions
egrep or grep -E


?      The item preceding ? may or may not appear in the target string.
+      One or more repetitions of the preceding item, e.g. grep -E 'regex1+' filename (will look for regex1, regex11, regex111, etc.)
{ }    Determines how many times the preceding item must be repeated to match, e.g. grep -E 'regex{4,6}' filename
|      Or: combines several patterns into one expression, e.g. grep -E 'regex1|regex2' filename
( )    Groups particular strings of text for various roles, i.e. backreferences, alternation, or simply readability
[ ]    Matches any one of the enclosed characters; most metacharacters lose their special meaning inside the brackets
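The extended-regex examples from this section can be run against a few invented lines. The last command shows the backslash form that GNU grep accepts in basic mode for the same quantifier, which is assumed here to be a GNU extension.

```shell
# Invented sample lines
input=$(printf 'regex1\nregex111\nregexxxxx\nv1.0\nv1x0\n')

plus=$(printf '%s\n' "$input" | grep -E -x 'regex1+')         # one or more 1
braces=$(printf '%s\n' "$input" | grep -E -x 'regex{4,6}')    # rege followed by 4 to 6 x
either=$(printf '%s\n' "$input" | grep -E -x 'regex1|v1x0')   # one pattern or the other
literal_dot=$(printf '%s\n' "$input" | grep -E 'v1[.]0')      # [.] is a literal dot

# Basic mode needs a backslash for the same quantifier (GNU grep)
same_in_basic=$(printf '%s\n' "$input" | grep -x 'regex1\+')

printf '%s\n' "$plus" "$braces" "$either" "$literal_dot" "$same_in_basic"
```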
Fixed strings / Fast grep
fgrep or grep -F

-c (count)
Counts the number of lines that contain one or more instances of the pattern in a file, e.g. fgrep -c 'regex' filename
-e pattern
Used for searching for more than one pattern or when the pattern begins with a hyphen
-f file (patterns from file)
Reads the search strings from file, one per line. grep has no option that writes the results to a file; to save them instead of printing them to the terminal, redirect the output, e.g. fgrep 'regex' filename > results.txt
-h (no filename)
When the pattern is searched for in more than one file, -h prevents fgrep from displaying filenames before the matched output.
-i (ignores case)
Ignores capitalization in the pattern when matching it.
-l (files with matches)
Displays the names of the files containing the pattern but not the matching lines.
-n (number of the line)
Prints the line number before each line that matches the pattern.
-v (reverse match)
Matches any lines that do not contain the given pattern
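A small fixed-string sketch follows. It uses grep -F, the equivalent spelling of fgrep, and the file names and contents are hypothetical.

```shell
# Hypothetical sample files in a temporary directory
dir=$(mktemp -d)
printf 'price is $5.00\nprice is 5x00\n-v flag\nTotal: 10\n' > "$dir/data.txt"
printf '$5.00\nTotal\n' > "$dir/patterns.txt"

# $ and . are ordinary characters under -F
literal=$(grep -F -c '$5.00' "$dir/data.txt")

# -e protects a search string that begins with a hyphen
hyphen=$(grep -F -e '-v' "$dir/data.txt")

# -e can be repeated to search for more than one string
multi=$(grep -F -n -e '5x00' -e 'Total' "$dir/data.txt")

# -f reads the search strings from a file, one per line
from_file=$(grep -F -f "$dir/patterns.txt" "$dir/data.txt")

# Saving results is done with shell redirection, not a grep option
grep -F -i 'total' "$dir/data.txt" > "$dir/results.txt"
saved=$(cat "$dir/results.txt")

printf '%s\n' "$literal" "$hyphen" "$multi" "$from_file" "$saved"
rm -rf "$dir"
```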
Perl Style grep
grep -P
Perl-Compatible Regular Expression (PCRE)
PCRE-specific escapes


\a     Matches the alarm (bell) character
\cX    Matches ctrl+X, where X is any letter
\e     Matches the escape character
\f     Matches the form feed character
\n     Matches the newline character
\r     Matches the carriage return
\t     Matches the tab character
\d     Matches any decimal digit
\D     Matches any character that is not a decimal digit
\s     Matches any whitespace character
\S     Matches any non-whitespace character
\w     Matches any word character
\W     Matches any non-word character
\b     Matches when at a word boundary
\B     Matches when not at a word boundary
\A     Matches when at the start of the subject
\Z     Matches when at the end of the subject or before a newline at the end
\z     Matches when at the end of the subject
\G     Matches at the first matching position
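Some of the PCRE escapes can be sketched as below. Not every grep build includes -P support, so the sketch first probes for it and skips the examples otherwise. The variable names and sample lines are assumptions for this illustration.

```shell
# Probe for -P support, since some grep builds are compiled without PCRE
if printf 'x\n' | grep -P 'x' >/dev/null 2>&1; then
  pcre=yes
  # Invented sample lines; the third one contains a tab character
  input=$(printf 'id 42\nid x7\ntab\there\nfoo bar\n')

  digits=$(printf '%s\n' "$input" | grep -P -o '\d+')       # runs of decimal digits
  tabs=$(printf '%s\n' "$input" | grep -P -c '\t')          # lines containing a tab
  word=$(printf '%s\n' "$input" | grep -P -o '\bfoo\b')     # foo as a whole word
  starts=$(printf '%s\n' "$input" | grep -P -c '\Aid')      # id at the start of the subject
  pair=$(printf '%s\n' "$input" | grep -P '\S+\s\d+')       # non-space, space, digits

  printf '%s\n' "$digits" "$tabs" "$word" "$starts" "$pair"
else
  pcre=no
  echo "grep -P is not available in this build"
fi
```

grep processes its input line by line, so each line is the subject that \A and \z refer to.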

