Lecture 5

CSE4251 The Unix Programming Environment. Lecture 5: Regular Expressions; grep. Regular expressions describe text patterns and filters, and several Unix commands and utilities support them.

Presentation Transcript


  1. CSE4251 The Unix Programming Environment • Lecture 5 • Regular Expressions; grep

  2. Why Regular Expressions • Regular expressions are used to describe text patterns/filters • Unix commands/utilities that support regular expressions: • grep (fgrep, egrep) - search a file for a string or regular expression • sed - stream editor • awk (nawk) - pattern scanning and processing language • There are some minor differences between the regular expressions supported by these programs • We will cover the general matching operators first.

  3. Character Class • […] matches any one of the enclosed chars • [abc] matches a single a, b, or c • [a-z] matches any one of abcdef…xyz • [^…] matches any single character that is not enclosed • [^A-Za-z] matches a single character as long as it is not a letter. • Example: [Dd][Aa][Vv][Ee] • Matches "Dave" or "dave" or "dAVE" • Does not match "ave" or "da"
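
The character classes above can be tried directly with grep. This is a minimal sketch: the sample input is made up for illustration, and the variable names `names`, `dave_hits`, and `nonletter_hits` are hypothetical.

```shell
# Hypothetical sample input, one string per line.
names=$(printf '%s\n' 'Dave' 'dave' 'dAVE' 'ave' 'da' 'x7')

# "dave" in any mix of upper and lower case, spelled out with classes.
dave_hits=$(printf '%s\n' "$names" | grep '[Dd][Aa][Vv][Ee]')
echo "$dave_hits"

# Lines that contain at least one character that is not a letter.
nonletter_hits=$(printf '%s\n' "$names" | grep '[^A-Za-z]')
echo "$nonletter_hits"
```

grep prints whole lines, so `[^A-Za-z]` selects every line that contains a non-letter anywhere, not only lines made up entirely of non-letters.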

  4. Regular Expression Operators • Any character (except a metacharacter!) matches itself. • . Matches any single character except newline. • * Matches 0 or more instances of the immediately preceding R.E. • ? Matches 0 or 1 instances of the immediately preceding R.E. • + Matches 1 or more instances of the immediately preceding R.E. • ^ Anchors the R.E. that follows it to the beginning of the line. • $ Anchors the preceding R.E. to the end of the line. • | Matches the R.E. specified either before or after this symbol. • \ Turns off the special meaning of the character that follows it.
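
A short sketch of these operators in action. It assumes `grep -E` (extended regular expressions), because `?`, `+`, and `|` are extended operators and plain grep treats them differently. The word list and variable names are hypothetical.

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' 'ct' 'cat' 'caat' 'dog' 'a.b' 'axb')

# * : zero or more of the preceding item
star=$(printf '%s\n' "$words" | grep -E '^ca*t$')
# + : one or more of the preceding item
plus=$(printf '%s\n' "$words" | grep -E '^ca+t$')
# ? : zero or one of the preceding item
opt=$(printf '%s\n' "$words" | grep -E '^ca?t$')
# | : either alternative
alt=$(printf '%s\n' "$words" | grep -E '^(cat|dog)$')
# \ : the dot loses its special meaning and matches only a literal dot
lit=$(printf '%s\n' "$words" | grep -E '^a\.b$')

printf 'star: %s\n' "$(echo $star)"
printf 'plus: %s\n' "$(echo $plus)"
printf 'opt:  %s\n' "$(echo $opt)"
printf 'alt:  %s\n' "$(echo $alt)"
printf 'lit:  %s\n' "$(echo $lit)"
```

The `^` and `$` anchors are used in every pattern so that each one has to account for the whole line, which makes the difference between `*`, `+`, and `?` visible.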

  5. R.E. patterns • {n} The preceding item is matched exactly n times. • {n,} The preceding item is matched n or more times. • {n,m} The preceding item is matched at least n times, but not more than m times.
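
The three repetition forms can be compared side by side. This sketch assumes `grep -E`, where the braces are written unescaped; with plain grep (basic regular expressions) the same intervals are written `\{n\}`, `\{n,\}`, and `\{n,m\}`. The input and variable names are hypothetical.

```shell
# Hypothetical input: digit strings of length 1 to 4.
digits=$(printf '%s\n' '1' '12' '123' '1234')

exactly_two=$(printf '%s\n' "$digits" | grep -E '^[0-9]{2}$')    # exactly 2 digits
three_plus=$(printf '%s\n' "$digits" | grep -E '^[0-9]{3,}$')    # 3 or more digits
two_to_three=$(printf '%s\n' "$digits" | grep -E '^[0-9]{2,3}$') # 2 to 3 digits

echo "exactly two: $exactly_two"
echo "three or more: $(echo $three_plus)"
echo "two to three: $(echo $two_to_three)"
```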

  6. R.E. patterns • If you put a subpattern inside parentheses, you can apply +, *, and ? to the entire subpattern. • a(bc)*d matches "ad" and "abcbcd" • It does not match "abcxd" or "bcbcd"
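
The grouping example can be checked with a one-line pipeline. This is a sketch that assumes `grep -E`, where parentheses group without backslashes; `samples` and `group_hits` are hypothetical names.

```shell
# The four strings discussed above, one per line.
samples=$(printf '%s\n' 'ad' 'abcbcd' 'abcxd' 'bcbcd')

# (bc)* applies the star to the whole group "bc", not just to "c".
group_hits=$(printf '%s\n' "$samples" | grep -E 'a(bc)*d')
echo "$group_hits"
```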

  7. R.E. patterns • \< Beginning-of-word anchor: \<abc matches "abcd" but not "dabc" • \> End-of-word anchor: abc\> matches "dabc" but not "abcd" • \(…\) Stores the text matched by the pattern between \( and \): \(abc\)def matches "abcdef" and stores abc in \1, so \(abc\)def\1 matches "abcdefabc" • Up to 9 matches can be stored: \(ab\)c\(de\)f\1\2 matches "abcdefabde"
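
A small sketch of word anchors and back-references with plain grep. It assumes GNU grep, since `\<` and `\>` are extensions that not every grep implements. The sample strings and variable names are hypothetical.

```shell
# Word anchors: abc at the start of a word versus at the end of a word.
anchor_input=$(printf '%s\n' 'abcd' 'dabc')
word_start=$(printf '%s\n' "$anchor_input" | grep '\<abc')
word_end=$(printf '%s\n' "$anchor_input" | grep 'abc\>')

# Back-references: \1 and \2 must repeat the text captured by the groups.
backref_input=$(printf '%s\n' 'abcdefabc' 'abcdefxyz' 'abcdefabde')
one_group=$(printf '%s\n' "$backref_input" | grep '\(abc\)def\1')
two_groups=$(printf '%s\n' "$backref_input" | grep '\(ab\)c\(de\)f\1\2')

echo "starts a word: $word_start"
echo "ends a word:   $word_end"
echo "one group:     $one_group"
echo "two groups:    $two_groups"
```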

  8. Examples of R.E. • x[abc]?x matches "xax" or "xx" • [abc]* matches "aaaaa" or "acbca" • 0*10 matches "010" or "0000010" or "10" • ^(dog)$ matches lines that consist of exactly "dog" • [\t ]* • (A|a)+b*c?
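
Three of these examples can be run as a quick check. The sketch assumes `grep -E`; the extra non-matching strings (`xabx`, `11`, `dog and dog`, `hotdog`) are made up to show what each pattern rejects.

```shell
# Optional character between two x's.
opt_hits=$(printf '%s\n' 'xax' 'xx' 'xabx' | grep -E 'x[abc]?x')

# Any number of leading zeros followed by 10.
zero_hits=$(printf '%s\n' '010' '0000010' '10' '11' | grep -E '0*10')

# With both anchors present, the pattern has to cover the entire line.
dog_hits=$(printf '%s\n' 'dog' 'dog and dog' 'hotdog' | grep -E '^(dog)$')

echo "$opt_hits"
echo "$zero_hits"
echo "$dog_hits"
```

Because `^` and `$` are both present in the last pattern, a line such as "dog and dog" is rejected even though it starts and ends with dog.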

  9. Example
     Sample text:
       Christian Scott lives here and will put on a Christmas party
       There are around 30 to 35 people invited.
       They are:
       Tom
       Dan
       Rhonda Savage
       Nicky and Kimberly.
       Steve, Suzanne, Ginger and Larry
     Patterns:
       ^[A-Z]..$
       ^[A-Z][a-z]*3[0-5]
       [a-z]*\.
       ^ *[A-Z][a-z][a-z]$
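
One way to work through this exercise is to run each pattern against the text and see which lines come back. This is a sketch under two assumptions: each bullet of the sample text is one input line with no leading whitespace, and plain grep (basic regular expressions) is used. The helper `matches` and the variable `party` are hypothetical names.

```shell
# The sample text, one line per bullet (layout assumed).
party=$(printf '%s\n' \
    'Christian Scott lives here and will put on a Christmas party' \
    'There are around 30 to 35 people invited.' \
    'They are:' \
    'Tom' \
    'Dan' \
    'Rhonda Savage' \
    'Nicky and Kimberly.' \
    'Steve, Suzanne, Ginger and Larry')

# Print the lines of the sample text that match the given pattern.
matches() {
    printf '%s\n' "$party" | grep "$1"
}

for pattern in '^[A-Z]..$' '^[A-Z][a-z]*3[0-5]' '[a-z]*\.' '^ *[A-Z][a-z][a-z]$'; do
    printf '== %s\n' "$pattern"
    matches "$pattern" || echo '(no matching lines)'
done
```

Under these assumptions the second pattern selects nothing, because `[a-z]*` cannot cross the spaces between "There" and "30".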

  10. Review: Metacharacters for filename abbreviation (lecture 2) • * Matches anything: ls Test*.doc • ? Matches any single character: ls Test?.doc • [abc…] Matches any of the enclosed characters: ls T[eE][sS][tT].doc • [a-z] Matches any character in a range: ls [a-zA-Z]* • [!abc…] Matches any character except those listed: ls [!0-9]*

  11. Difference • Regular expressions look similar to the metacharacters used in filename expansion, but they are something different. • Filename expansion is done by the shell. • Regular expressions are used by commands (programs). • Because of this overlap, be careful when specifying an RE on the command line. • It is a good idea to always quote an RE that contains special chars ('' or "") on the command line • Example: $ grep '[a-z]*' somefile.txt
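
The effect of quoting can be made visible without running grep at all, by looking at the arguments the shell would hand to a command. This is a sketch; the file names `apple` and `avocado` and the variable names are made up, and `set --` is used only as a convenient way to capture the argument list.

```shell
orig_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch apple avocado

# Unquoted: the shell replaces a* with the matching file names
# before the command ever sees it.
set -- a*
unquoted_args="$*"

# Quoted: the pattern is passed through unchanged.
set -- 'a*'
quoted_args="$*"

echo "unquoted: $unquoted_args"
echo "quoted:   $quoted_args"

cd "$orig_dir" || exit 1
rm -rf "$workdir"
```

With the unquoted form, a command such as grep would receive `apple` as its pattern and `avocado` as a file name, which is rarely what was intended.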

  12. grep - search for a string • grep [-bchilnsvw] PATTERN [filename...] • Reads the named files, or standard/redirected input if no file is given • Searches each line for the specified pattern • Sends matching lines to the standard output • Examples: • $ grep '^X11' * - search all files for lines starting with the string "X11" • $ grep -v text file - print lines that do not match "text" • Exit status: 0 - pattern found; 1 - not found; greater than 1 - an error occurred.
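
The exit status is what makes grep useful in scripts. The sketch below captures it for a pattern that is present and one that is absent; the sample lines and variable names are hypothetical, and `-q` (suppress output) is used so that only the status matters.

```shell
# Hypothetical input lines.
sample=$(printf '%s\n' 'X11 forwarding enabled' 'uses X11 libraries' 'plain text')

# Status 0: at least one line starts with X11.
found_status=0
printf '%s\n' "$sample" | grep -q '^X11' || found_status=$?

# Status 1: no line starts with Wayland.
missing_status=0
printf '%s\n' "$sample" | grep -q '^Wayland' || missing_status=$?

# -v inverts the selection: lines that do NOT contain "text".
inverted=$(printf '%s\n' "$sample" | grep -v 'text')

echo "found: $found_status, missing: $missing_status"
echo "$inverted"
```

In a script the same test is usually written as `if grep -q PATTERN file; then ...; fi`.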

  13. Regular Expressions for grep • c - any non-special character c matches itself • \c - turn off any special meaning of character c • ^ - beginning of line • $ - end of line • . - any single character • […] - any of the characters in range … • [^…] - any single character not in range … • r* - zero or more occurrences of r

  14. grep - options • Some useful options • -c count the number of matching lines • -i ignore case • -l list only the files with matching lines • -L list only the files that do not contain a match • -v display lines that do not match • -n print line numbers • -r recursively search the sub-directories
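
The options are easiest to compare on a couple of throwaway files. This is a sketch that assumes GNU grep (`-L` and `-r` are not part of POSIX grep); the file names `a.log` and `b.log`, their contents, and the variable names are made up.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'Error: disk full' 'all good' 'error: retry' > "$workdir/a.log"
printf '%s\n' 'all good' > "$workdir/b.log"

count=$(grep -c 'error' "$workdir/a.log")          # lines with lowercase "error"
count_nocase=$(grep -ci 'error' "$workdir/a.log")  # same, ignoring case
numbered=$(grep -n 'good' "$workdir/a.log")        # prefix each hit with its line number

# -l names the files that contain a match, -L the files that do not.
files_with=$(cd "$workdir" && grep -il 'error' a.log b.log || true)
files_without=$(cd "$workdir" && grep -iL 'error' a.log b.log || true)

# -r walks the directory tree below the given starting point.
recursive=$(grep -rl 'retry' "$workdir")

echo "count=$count count_nocase=$count_nocase"
echo "numbered: $numbered"
echo "with: $files_with, without: $files_without"
echo "recursive: $recursive"

rm -rf "$workdir"
```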

  15. grep advanced options • -F fixed string; do not interpret the pattern as an R.E. • -m NUM stop reading a file after NUM matching lines • --exclude=GLOB skip files whose names match GLOB • --include=GLOB search only files whose names match GLOB
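
A sketch of these options on two small files, assuming GNU grep. The file names `prices.txt` and `prices.bak`, their contents, and the variable names are hypothetical.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'price: 3.50' 'price: 3x50' 'price: 3.75' > "$workdir/prices.txt"
printf '%s\n' 'price: 3.50' > "$workdir/prices.bak"

# As an R.E. the dot matches any character; with -F it is a literal dot.
regex_hits=$(grep -c '3.50' "$workdir/prices.txt")
fixed_hits=$(grep -cF '3.50' "$workdir/prices.txt")

# -m 1 stops reading after the first matching line.
first_only=$(grep -m 1 'price' "$workdir/prices.txt")

# --include and --exclude filter the files visited by a recursive search.
txt_only=$(grep -rl --include='*.txt' 'price' "$workdir")
no_bak=$(grep -rl --exclude='*.bak' 'price' "$workdir")

echo "regex=$regex_hits fixed=$fixed_hits"
echo "first: $first_only"
echo "$txt_only"
echo "$no_bak"

rm -rf "$workdir"
```

Quoting the GLOB keeps the shell from expanding it before grep sees it.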

  16. gamefile
      northwest NW Charles Main 3.0 .98 3 34
      western WE Sharon Gray 5.3 .97 5 23
      southwest SW Lewis Dalsass 2.7 .8 2 18
      southern SO Suan Chin 5.1 .95 4 15
      southeast SE Patricia Heme 4.0 .7 4 17
      eastern EA TB Savage 4.4 .84 5 20
      northeast NE AM Main Jr. 5.1 .94 3 13
      north NO Margot Webber 4.5 .89 5 9
      central CT Ann Stephens 5.7 .94 5 13
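
To experiment with this data, the file can be recreated from the shell. This is a sketch that assumes the fields are separated by single spaces (the original file may well have used tabs or aligned columns) and writes to a temporary directory rather than the current one. The three sample queries and all variable names are hypothetical.

```shell
workdir=$(mktemp -d)
gamefile="$workdir/gamefile"

# One record per line; single-space separators are an assumption.
printf '%s\n' \
    'northwest NW Charles Main 3.0 .98 3 34' \
    'western WE Sharon Gray 5.3 .97 5 23' \
    'southwest SW Lewis Dalsass 2.7 .8 2 18' \
    'southern SO Suan Chin 5.1 .95 4 15' \
    'southeast SE Patricia Heme 4.0 .7 4 17' \
    'eastern EA TB Savage 4.4 .84 5 20' \
    'northeast NE AM Main Jr. 5.1 .94 3 13' \
    'north NO Margot Webber 4.5 .89 5 9' \
    'central CT Ann Stephens 5.7 .94 5 13' > "$gamefile"

record_count=$(grep -c '' "$gamefile")       # every line matches the empty pattern
south_count=$(grep -c '^south' "$gamefile")  # regions starting with "south"
main_count=$(grep -c 'Main' "$gamefile")     # records mentioning "Main"

echo "records=$record_count south=$south_count main=$main_count"

rm -rf "$workdir"
```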

  17. demos
      $ grep NW gamefile
      northwest NW Charles Main 3.0 .98 3 34
      $ grep '^n' gamefile
      northwest NW Charles Main 3.0 .98 3 34
      northeast NE AM Main Jr. 5.1 .94 3 13
      north NO Margot Webber 4.5 .89 5 9
      $ grep '4$' gamefile
      northwest NW Charles Main 3.0 .98 3 34
      $ grep TB Savage gamefile
      grep: Savage: No such file or directory
      gamefile:eastern EA TB Savage 4.4 .84 5 20
      $ grep -l 'SE' *
      gamefile

  18. demos cont.
      $ grep '5\..' gamefile
      western WE Sharon Gray 5.3 .97 5 23
      southern SO Suan Chin 5.1 .95 4 15
      northeast NE AM Main Jr. 5.1 .94 3 13
      central CT Ann Stephens 5.7 .94 5 13
      $ grep '\(3\)\.[0-9].*\1 *\1' gamefile
      northwest NW Charles Main 3.0 .98 3 34
      $ grep '\<north' gamefile
      northwest NW Charles Main 3.0 .98 3 34
      northeast NE AM Main Jr. 5.1 .94 3 13
      north NO Margot Webber 4.5 .89 5 9
      $ grep '\<north\>' gamefile
      north NO Margot Webber 4.5 .89 5 9

  19. demos cont.
      $ grep -v "Suan Chin" gamefile
      northwest NW Charles Main 3.0 .98 3 34
      western WE Sharon Gray 5.3 .97 5 23
      southwest SW Lewis Dalsass 2.7 .8 2 18
      southeast SE Patricia Heme 4.0 .7 4 17
      eastern EA TB Savage 4.4 .84 5 20
      northeast NE AM Main Jr. 5.1 .94 3 13
      north NO Margot Webber 4.5 .89 5 9
      central CT Ann Stephens 5.7 .94 5 13
      $ grep -c 'west' gamefile
      3
      $ grep -w 'north' gamefile
      north NO Margot Webber 4.5 .89 5 9
      $ grep -i "$LOGNAME" /etc/passwd
      zhengm:x:503:504::/home/zhengm:/bin/bash

  20. grep with pipes • Remember, a pipe can supply grep's input wherever a file is expected • ls -l | grep 'a*'
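
A sketch of grep at the end of a pipe, filtering the output of `ls`. The file names and variable names are made up for illustration.

```shell
workdir=$(mktemp -d)
touch "$workdir/alpha.txt" "$workdir/beta.txt" "$workdir/gamma.c"

# grep reads the listing from standard input instead of from a file.
txt_files=$(ls "$workdir" | grep '\.txt$')

# Caution: a* means "zero or more a's", so it matches every line,
# including names that contain no "a" at all.
all_count=$(ls "$workdir" | grep -c 'a*')

echo "$txt_files"
echo "lines matched by a*: $all_count"

rm -rf "$workdir"
```

To select only names that actually contain an "a", the pattern `a` on its own is enough.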
