Help with Grep command in Linux?

husoski

2012-04-02 16:22:36 UTC

grep searches one or more files for lines containing a match for a given pattern. There are a bunch of options (see the output of "grep --help" and "man grep" for lots of information) but the two most common uses are:

grep pattern filename .... searches a named file for lines containing pattern

somecommand | grep pattern .... pipes the standard output of a command into grep and searches that for the pattern
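
A quick sketch of both forms, assuming a POSIX shell. The file name and its contents are made up for illustration:

```shell
# Scratch file with made-up contents; removed again when the shell exits.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'one\ntwo\nthree\n' > "$tmp/numbers.txt"

# Form 1: grep reads the named file.
from_file=$(grep 't' "$tmp/numbers.txt")
echo "$from_file"

# Form 2: grep reads whatever the previous command writes into the pipe.
from_pipe=$(tr '[:lower:]' '[:upper:]' < "$tmp/numbers.txt" | grep 'T')
echo "$from_pipe"
```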

The pattern is a regular expression. The simplest is just a string of characters to match.

grep include hello.cpp goodbye.c *.h

...will find all occurrences of "include" in the file hello.cpp, the file goodbye.c, and all files with names ending in ".h".

The normal action is to print out each matching line. If multiple files are specified, then the name of the file is included at the front of the output line, followed by the line from the file.
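
To see the difference between one file and several, here is a small sketch. The two source files are invented for the example:

```shell
# Scratch directory with two small, made-up source files.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
printf '#include <stdio.h>\nint main(void) { return 0; }\n' > hello.c
printf '#include "hello.h"\nint x;\n' > other.c

# One file: only the matching line itself is printed.
single=$(grep include hello.c)
echo "$single"

# Several files: each matching line is prefixed with "filename:".
multi=$(grep include hello.c other.c)
echo "$multi"
```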

There are options (anything beginning with - before the pattern) that can change this. You can print out non-matching lines instead (-v), add line numbers to the match output (-n), print counts of matching lines instead of the lines themselves (-c) and more.
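
A minimal sketch of those three options on a made-up file (fruit.txt is just an example name):

```shell
# Scratch file with four made-up lines.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'apple\nbanana\ncherry\napricot\n' > "$tmp/fruit.txt"

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'ap' "$tmp/fruit.txt")
echo "$numbered"

# -v: print the lines that do NOT match.
inverted=$(grep -v 'ap' "$tmp/fruit.txt")
echo "$inverted"

# -c: print only the number of matching lines.
count=$(grep -c 'ap' "$tmp/fruit.txt")
echo "$count"
```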

There are some special characters for matching flexible patterns, and at least three different syntax options. Plain grep uses basic regular expressions. Use grep -E (the command "egrep" is the same as grep -E) for extended POSIX regular expressions, or -P (in GNU grep) for Perl-style regular expressions.

This just scratches the surface. Google for "regular expression tutorial", take a look at the output of "man 7 regex" (a terse description of basic and POSIX extended REs), and/or get a good book on using the Linux shell. There are two books I can recommend:

"A Practical Guide to Linux Commands, Editors and Shell Programming" (I have 2005 ed.)

- by Mark G. Sobell

"Beginning Linux Programming" (I have 3rd ed. from 2004)

- by Neil Matthew and Richard Stones

The second sounds more basic but is actually more advanced. Both are well written.

koppe74

2012-04-02 16:31:04 UTC

grep (the name comes from the ed command g/re/p: globally search for a regular expression and print) - and its cousins fgrep (fixed-grep) and egrep (extended-grep) - is used to find patterns in (mostly) text files, i.e. it's used to find lines (and thus files) containing a particular text. A regular expression works similarly to wildcards in filenames (* and ?), and lets you match alternative spellings.

If grep is given one filename, it will print out all matching lines in the file. If it's given several filenames, it will do the same, but will precede each matching line with the filename. The options -H and -h let you override this, by turning filenames always on or always off. The option -v inverts the match (printing lines not matching the pattern). The option -l writes only the names of matching files, not the content of matching lines (once a match is found, it won't look through the rest of the file). The option -L lists the names of files with no match. The option -c writes the count of matching lines, preceded by the filename when several files are given (including 0 for files that didn't match). The option -i ignores the case of letters (both upper and lower case match). These options work for all the grep commands.
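
Here is a rough sketch of these options in action, assuming GNU or BSD grep (-H, -h and -L are not in the POSIX standard). The files a.txt and b.txt are made up:

```shell
# Scratch directory with two made-up files.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
printf 'Harry\nharry potter\n' > a.txt
printf 'Hermione\n' > b.txt

with_match=$(grep -l 'Harry' a.txt b.txt)     # names of files that match
without_match=$(grep -L 'Harry' a.txt b.txt)  # names of files that do not
counts=$(grep -c 'Harry' a.txt b.txt)         # file:count, including zero counts
nocase=$(grep -i 'harry' a.txt)               # upper and lower case both match
forced_name=$(grep -H 'Harry' a.txt)          # filename shown even for one file
no_name=$(grep -h 'H' a.txt b.txt)            # filename hidden even for two files

printf '%s\n' "$with_match" "$without_match" "$counts" \
    "$nocase" "$forced_name" "$no_name"
```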

fgrep is the simplest to use: you just write the phrase or word you're after in quotes. Several words and phrases can be specified by putting each one on its own line.

fgrep "Hermione
Harry was a wizard
Voldemort" *.txt

This looks in all *.txt files (in the current directory) for lines containing "Hermione", "Harry was a wizard" or "Voldemort" and precedes each line with the filename.
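
The same search as a runnable sketch. It uses grep -F, which is the same thing as fgrep (newer GNU grep releases print a warning when fgrep or egrep is called). The two book files are made up, and the patterns must sit on consecutive lines, because an empty line in the list would match every line:

```shell
# Scratch directory with two made-up text files.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
printf 'Harry was a wizard and he knew it\nNothing here\nVoldemort returned\n' > book1.txt
printf 'Hermione read a book\nHarry was late\n' > book2.txt

# Three fixed strings, one per line, with no empty lines in between.
patterns='Hermione
Harry was a wizard
Voldemort'

# A line is printed if it contains any one of the three strings.
hits=$(grep -F "$patterns" *.txt)
echo "$hits"
```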

In addition, grep lets you replace letters with wildcards:

. (a dot) the wildcard - matches one occurrence of any character

\. (an escaped dot) matches an actual dot (full stop)

[aB5] matches any one of a, B or 5

[a-d] (range) matches lower-case a, b, c and d

[a-zA-Z] matches any letter (don't write [a-Z]; that range is invalid or locale-dependent)

[^aB5] (not) matches all characters *except* a, B and 5

[^a-d] (not range) matches all characters except a,b,c and d

There are also named classes for use inside brackets, such as [[:digit:]] for digits, [[:alpha:]] for letters and [[:alnum:]] for letters and digits.
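
A small sketch of the dot and the bracket forms, run against a made-up word list. LC_ALL=C is set here as an assumption so that ranges like a-d behave the same everywhere:

```shell
# Use the plain C locale so that character ranges are predictable.
LC_ALL=C
export LC_ALL

# Scratch file with five made-up lines.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'a.c\nabc\nB5\nxyz\n123\n' > "$tmp/words.txt"

any_char=$(grep 'a.c' "$tmp/words.txt")        # dot matches any one character
literal_dot=$(grep 'a\.c' "$tmp/words.txt")    # escaped dot matches only "."
in_set=$(grep '[aB5]' "$tmp/words.txt")        # lines containing a, B or 5
negated=$(grep '[^a-d]' "$tmp/words.txt")      # lines with a character outside a-d
digits=$(grep '[[:digit:]]' "$tmp/words.txt")  # lines containing a digit

printf '%s\n' "$any_char" "$literal_dot" "$in_set" "$negated" "$digits"
```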

A character - including the wildcard "." and a bracket expression "[...]" - can be followed by *, ? or {} with the following meaning. With egrep (or grep -E) they are written as shown; with plain grep the braces are written \{n,m\}, and GNU grep accepts \? for ?.

* matches 0 or more occurrences of the previous character

? matches 0 or 1 occurrence of the previous character

{n,m} where n and m are numbers, matches between n and m occurrences of the previous character

Thus grep "ab*a" matches aa, aba, abba, abbba, abbbba and so on

grep "B[a-d]*" matches a B followed by 0, 1 or more a,b,c or d (like Baad)

When finding a match, grep will start as far to the left as possible and then make the match as long as possible.
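
To see exactly which part of a line gets matched, the -o option prints only the matched text, one match per line. This is a sketch assuming GNU or BSD grep (-o is not in POSIX), with made-up input:

```shell
# "ab*a": an a, any number of b, then an a.
star=$(printf 'aa aba abbba ac\n' | grep -o 'ab*a')
echo "$star"

# Basic-syntax braces: an a followed by two or three b.
# The match is as long as allowed, so "abbbb" gives "abbb".
braces=$(printf 'ab abb abbbb\n' | grep -o 'ab\{2,3\}')
echo "$braces"

# Extended syntax (-E): unescaped ? makes the u optional.
optional=$(printf 'color colour colouur\n' | grep -oE 'colou?r')
echo "$optional"
```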

^ matches the beginning of a line

$ matches the end of a line

\< matches the beginning of a word

\> matches the end of a word

So grep "^Dear friend" matches all lines beginning with "Dear friend", but not lines with "Dear friend" elsewhere.

And grep "scar\.$" matches all lines ending with "scar.".
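
A sketch of the anchors on a made-up file. The patterns are in single quotes so the shell leaves the backslashes and the $ alone; \< and \> are assumed to be available, as they are in GNU grep:

```shell
# Scratch file with six made-up lines.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'Dear friend, hello\nMy Dear friend\nHe had a scar.\nscar. tissue\nthe cat sat\nconcatenate\n' > "$tmp/letter.txt"

# ^ anchors the pattern to the start of the line.
line_start=$(grep '^Dear friend' "$tmp/letter.txt")
echo "$line_start"

# $ anchors the pattern to the end of the line.
line_end=$(grep 'scar\.$' "$tmp/letter.txt")
echo "$line_end"

# \< and \> require word boundaries, so "concatenate" is not matched.
whole_word=$(grep '\<cat\>' "$tmp/letter.txt")
echo "$whole_word"
```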

egrep adds "extended regular expressions". This adds +, | and () to be used:

+ works like * or ?, but matches 1 or more (at least one) occurrences.

() makes the characters/expression inside into a unit, which is treated as if it were a single character by *, ? and +.

| OR, is used together with () for alternate spellings or several expressions.

egrep "ab+" matches "ab", "abb" and "abbb"; but not "a"

egrep "a(ab)*" matches "a", "aab", "aabab" and "aababab"; but not all of "aaba" (only its leading "aab"), because "ab" is here an "atom".

egrep "pregnan(t|cy)" matches both pregnant and pregnancy (t OR cy)

egrep "^Dear( old)? friend" makes "old" optional

egrep "He is my (child|son|kid|lad|boy)"
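
The extended patterns above, tried out on made-up text. This sketch uses grep -E, which is the same as egrep, and -o (assumed available, as in GNU or BSD grep) to show only the matched part:

```shell
# Scratch file with seven made-up lines.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'She is pregnant\nAn early pregnancy\nDear friend\nDear old friend\nDear young friend\nHe is my son\nHe is my dog\n' > "$tmp/text.txt"

alt=$(grep -E 'pregnan(t|cy)' "$tmp/text.txt")           # t OR cy
maybe_old=$(grep -E '^Dear( old)? friend' "$tmp/text.txt")  # " old" is optional
choice=$(grep -E 'He is my (child|son|kid|lad|boy)' "$tmp/text.txt")

# + needs at least one b, so the lone "a" is skipped.
plus=$(printf 'a ab abbb\n' | grep -oE 'ab+')

# (ab) is one unit: in "aaba" the first match is "aab", then a separate "a".
atom=$(printf 'aaba\n' | grep -oE 'a(ab)*')

printf '%s\n' "$alt" "$maybe_old" "$choice" "$plus" "$atom"
```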

grep and egrep are obviously great when you don't remember an exact quote or the correct spelling.

The grep commands can be combined with find to list files containing a phrase. As grep is called with one file at a time, the -H option lets you see the filename. The -l option is even more useful, as you get just the list of files. To find all *.txt files containing the word "flashlight" in the current directory or below, use:

find . -name "*.txt" -exec fgrep -l "flashlight" {} \;

find will spawn one fgrep for each .txt file it finds.
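
A sketch of that search in a made-up directory tree, using grep -F in place of fgrep. The second form ends -exec with + instead of \; so that find hands many files to each grep instead of starting one per file. The output is sorted only because find does not promise any particular order:

```shell
# Scratch tree: two .txt files with the word, one without, one non-.txt file.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
mkdir sub
printf 'bring a flashlight\n' > notes.txt
printf 'flashlight, rope\n' > sub/list.txt
printf 'nothing useful\n' > sub/other.txt
printf 'flashlight\n' > readme.md

# One grep process per file found.
one_each=$(find . -name '*.txt' -exec grep -F -l 'flashlight' {} \; | LC_ALL=C sort)
echo "$one_each"

# Many files per grep process.
batched=$(find . -name '*.txt' -exec grep -F -l 'flashlight' {} + | LC_ALL=C sort)
echo "$batched"
```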

Otherwise, try "man grep"; it should be of great help.

?

2012-04-02 15:20:55 UTC

RTFM!

http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples/

Yahaira

2016-02-23 01:27:32 UTC

Try this: "^\S+ cat". Or this one, which will allow leading whitespace: "^\s*\S+ cat". To allow more than one space between words: "^\s*\S+\s+cat". Use these with grep -E or grep -P, since + is not special in basic patterns.
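
Since \s and \S are extensions, here is a sketch of the same three patterns written with the POSIX classes [[:space:]] and [^[:space:]] instead, which is my assumption of a portable equivalent. The sample lines are made up:

```shell
# Scratch file with five made-up lines (the second starts with two spaces).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'big cat sat\n  big cat sat\nbig   cat\ncat alone\nvery big cat\n' > "$tmp/cats.txt"

# First word, exactly one space, then "cat".
strict=$(grep -E '^[^[:space:]]+ cat' "$tmp/cats.txt")
echo "$strict"

# The same, but leading whitespace is allowed.
leading=$(grep -E '^[[:space:]]*[^[:space:]]+ cat' "$tmp/cats.txt")
echo "$leading"

# Leading whitespace and any amount of whitespace between the words.
loose=$(grep -E '^[[:space:]]*[^[:space:]]+[[:space:]]+cat' "$tmp/cats.txt")
echo "$loose"
```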