
Regular Expressions

Regular Expressions. Dr. Ralph D. Westfall, May 2011. RegEx Object: what does it do? It finds patterns in text: specific strings (all characters identified), strings with "wild cards" (any character), and strings with certain characters but not others. How could it be used in VB.NET?


Presentation Transcript


  1. Regular Expressions Dr. Ralph D. Westfall May, 2011

  2. RegEx Object
     • what does it do?
       • finds patterns in text
         • specific strings (all characters identified)
         • strings with "wild cards" (any character)
         • strings with certain characters, not others
     • how could it be used in VB.NET?
       • validating inputs on an input form, e.g., email addresses, telephone #s, etc.
       • searching the content of a large database

  3. Regular Expressions
     • based on the idea of wild cards
       • e.g., dir s*.doc in DOS/Windows finds any .doc file whose name starts with s, followed by any other character(s)
     • can find much more complicated patterns
       • e.g., any of a specified list of characters, rather than just one or all possibilities
       • e.g., any character except ones in a list

  4. Regular Expression Object
     • put the Imports statement at the top of the code, just below Option Strict On
         Imports System.Text.RegularExpressions
     • create a regular expression object
         Dim reg As Regex
         reg = New Regex("[pattern]")
     • need to create a Regex with a pattern (can't add one later)
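
As a minimal sketch of the steps on this slide, the console program below creates a Regex for a specific string and tests two inputs. The module name, the pattern "cat", and the sample words are illustrative assumptions, and Console.WriteLine stands in for the form controls a real VB.NET application would use.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module RegexCreationDemo
    Sub Main()
        ' The pattern is supplied when the Regex object is constructed.
        Dim reg As Regex
        reg = New Regex("cat")

        ' Hypothetical sample words, used only for illustration.
        Dim words() As String = {"concatenate", "dog"}

        For Each word As String In words
            ' IsMatch reports whether the pattern occurs anywhere in the word.
            Console.WriteLine(word & ": " & reg.IsMatch(word).ToString())
        Next
    End Sub
End Module
```

The first word contains the pattern inside a longer string and the second does not, so the two lines report True and False respectively.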

  5. VB.NET RegEx Special Characters
     • .      ba. finds bat, bal, bac
       • period matches any single character
     • ?      bottles? finds bottle and bottles
       • 0 or 1 of what is before the ?
     • [0-9]  finds 0, 1, … or 9
       • any 1 character in the specified range
     • +      to+ finds to and too
       • 1 or more of what is before the +
     • {#}    b{2} finds any bb
       • # = exact number of repetitions of what is before the {
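
The special characters above can be tried out with a short console sketch like the following. The helper Check and the sample inputs are hypothetical names chosen for illustration; the patterns are the ones from the slide, and Regex.IsMatch(input, pattern) is the shared (static) form of the method.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module SpecialCharactersDemo
    ' Hypothetical helper: prints whether the pattern is found anywhere in the source text.
    Sub Check(pattern As String, source As String)
        Dim found As Boolean = Regex.IsMatch(source, pattern)
        Console.WriteLine(pattern & " in """ & source & """: " & found.ToString())
    End Sub

    Sub Main()
        Check("ba.", "bat")          ' True: the period matches the t
        Check("ba.", "ba")           ' False: nothing is left for the period to match
        Check("bottles?", "bottle")  ' True: the s is optional
        Check("[0-9]", "room 7")     ' True: 7 is in the range 0-9
        Check("to+", "too")          ' True: one or more o after the t
        Check("b{2}", "rabbit")      ' True: two b characters in a row
        Check("b{2}", "rabid")       ' False: only one b
    End Sub
End Module
```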

  6. RegEx Special Characters - 2
     • |  OR operator, e.g., a|b finds a or b
     • \  escape character is used to
       • revert special characters to literal values, e.g., \. (backslash followed by a period) matches an actual period
       • change regular characters to operators, e.g., \b matches a word boundary (the position just before the 1st or just after the last character of a word); inside a character class, [\b], it matches a backspace instead
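
The sketch below illustrates the OR operator and both uses of the backslash. The helper Check and the sample strings are assumptions made for illustration. VB.NET string literals do not treat the backslash specially, so the patterns can be typed exactly as they appear on the slide.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module EscapeAndBoundaryDemo
    ' Hypothetical helper: prints whether the pattern is found anywhere in the source text.
    Sub Check(pattern As String, source As String)
        Dim found As Boolean = Regex.IsMatch(source, pattern)
        Console.WriteLine(pattern & " in """ & source & """: " & found.ToString())
    End Sub

    Sub Main()
        Check("a|b", "xbx")              ' True: b satisfies a|b
        Check("3.14", "3x14")            ' True: an unescaped period matches any character
        Check("3\.14", "3x14")           ' False: \. matches only a literal period
        Check("3\.14", "3.14")           ' True
        Check("\bcat\b", "the cat sat")  ' True: cat stands alone as a word
        Check("\bcat\b", "concatenate")  ' False: cat is inside a longer word
    End Sub
End Module
```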

  7. Combining Operators
     • can group character sequences
     • [a-z]{3} finds any 3 lower case letters in a row (use \b[a-z]{3}\b to find only 3-letter lower case words)
     • [0-9]{4}( |-)? finds any 4 digits
       • followed by a space, a dash, or neither
       • (could use in credit card validation)
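
To see what the combined patterns actually capture, the following sketch prints the matched text rather than only True or False. The sample strings are hypothetical, and the \b variant is included as an assumption about how one might restrict the first pattern to whole words.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module CombiningOperatorsDemo
    Sub Main()
        ' [a-z]{3} matches the first three lower case letters it meets,
        ' even when they are part of a longer word.
        Console.WriteLine(Regex.Match("hello", "[a-z]{3}").Value)             ' hel

        ' Word boundaries (\b) restrict the match to whole 3-letter words.
        Console.WriteLine(Regex.IsMatch("hello", "\b[a-z]{3}\b").ToString())  ' False
        Console.WriteLine(Regex.Match("hello cat", "\b[a-z]{3}\b").Value)     ' cat

        ' Four digits followed by an optional space or dash.
        Console.WriteLine(Regex.Match("1234-5678", "[0-9]{4}( |-)?").Value)   ' 1234-
        Console.WriteLine(Regex.Match("12345678", "[0-9]{4}( |-)?").Value)    ' 1234
    End Sub
End Module
```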

  8. Regular Expression Methods
     • reg.IsMatch([string to search])
       • returns True if the pattern is found in the string
     • reg.Match([string to search])
       • returns a Match object for the first place the pattern is found (use ToString to get the matched text); reg.Matches returns all of the matches
     • reg.Replace([string to search], [replacement string])
       • replaces every match of the pattern in the string to search with the replacement string
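
A short sketch of these methods side by side, assuming a 5-digit pattern and a made-up address string (both are illustrative, not from the slides):

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module RegexMethodsDemo
    Sub Main()
        Dim reg As New Regex("[0-9]{5}")
        ' Hypothetical input containing two 5-digit numbers.
        Dim address As String = "Ship to 90702 or 91768"

        ' IsMatch: is the pattern found anywhere in the string?
        Console.WriteLine(reg.IsMatch(address).ToString())        ' True

        ' Match: the first place the pattern is found.
        Dim firstMatch As Match = reg.Match(address)
        Console.WriteLine(firstMatch.ToString())                  ' 90702

        ' Matches: every place the pattern is found.
        Console.WriteLine(reg.Matches(address).Count.ToString())  ' 2

        ' Replace: every match in the input is swapped for the replacement text.
        Console.WriteLine(reg.Replace(address, "#####"))          ' Ship to ##### or #####
    End Sub
End Module
```

Note that the pattern to replace comes from the Regex object itself; Replace takes the string to search and the replacement text as its two arguments.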

  9. Regular Expression Match Object
     • can tell if a match was found
     • can also return the matching string
       • usually don't need the whole string, just want the matching part
         mach = reg.Match([string to search])
         'returns a Match object for the first place the pattern is found
         If mach.Success Then strZip = mach.ToString

  10. Using Regular Expressions
         Dim strCard As String = TextBox1.Text
         Dim mach As Match
         Dim reg As New Regex("[0-9]{4}( |-)?" & _
             "[0-9]{4}( |-)?[0-9]{4}( |-)?[0-9]{4}") 'shorten?
         mach = reg.Match(strCard)
         If mach.Success Then
             MsgBox("Card is OK")
             ' .Replace("-"c, " "c) changes dashes to spaces
             strCard = mach.ToString.Replace("-"c, " "c)
             TextBox1.Text = strCard
         End If
         ' finds pattern even inside other characters
         ' e.g., XYZ 1234 5678 9012 3456 ABC
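
The "shorten?" comment in the code above suggests the pattern could be more compact. One possible answer, offered as a sketch rather than as the author's intended solution, is to group "four digits plus an optional separator" and repeat that group three times with {3}. The program below compares the two forms on a few hypothetical inputs; the variable names are illustrative.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module ShorterCardPatternDemo
    Sub Main()
        ' Pattern as written on the slide.
        Dim longForm As New Regex("[0-9]{4}( |-)?" & _
            "[0-9]{4}( |-)?[0-9]{4}( |-)?[0-9]{4}")

        ' Assumed shorter equivalent: (4 digits + optional separator) x 3, then 4 digits.
        Dim shortForm As New Regex("([0-9]{4}( |-)?){3}[0-9]{4}")

        ' Hypothetical test inputs.
        Dim samples() As String = {"1234-5678-9012-3456", _
            "XYZ 1234 5678 9012 3456 ABC", "1234567890123456", "1234-5678"}

        For Each sample As String In samples
            Dim a As Match = longForm.Match(sample)
            Dim b As Match = shortForm.Match(sample)
            ' Value is an empty string when no match was found.
            Console.WriteLine(sample & " -> long: [" & a.Value & "] short: [" & b.Value & "]")
        Next
    End Sub
End Module
```

Both forms should report the same matched text for every sample, including an empty result for the last one, which has only two groups of digits.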

  11. Warning
     • a regular expression may say a String is OK even if there are other characters around the match, so need to extract the match from the original String (other processing?), e.g.,
         Dim strCard As String
         Dim ok As Boolean
         strCard = "XYZ1234-4323-9876-6543XYZ"
         ok = reg.IsMatch(strCard) 'credit card pattern
         If ok Then
             MsgBox(strCard & " is OK")
         End If

  12. Extracting Matches from Strings
     • can use the .Match() function to separate the matching part from other characters around it, e.g.,
         Dim strCard As String
         strCard = "XYZ1234-4323-9876-6543XYZ"
         If reg.IsMatch(strCard) Then
             MsgBox(reg.Match(strCard).ToString & " is OK")
         End If
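
The warning and the extraction step can be combined into a check that the input contains nothing except the card number. The sketch below does this by comparing the extracted match with the whole input. The function name IsWholeStringCard and the constant CardPattern are hypothetical, and this comparison is only one possible form of the "other processing" the warning slide mentions.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module WholeStringCheckDemo
    ' Card pattern from the slides: four groups of four digits,
    ' optionally separated by a space or a dash.
    Private Const CardPattern As String = "[0-9]{4}( |-)?" & _
        "[0-9]{4}( |-)?[0-9]{4}( |-)?[0-9]{4}"

    ' Hypothetical helper: True only when the first match is the entire input.
    Function IsWholeStringCard(source As String) As Boolean
        Dim mach As Match = Regex.Match(source, CardPattern)
        Return mach.Success AndAlso mach.Value = source
    End Function

    Sub Main()
        Dim padded As String = "XYZ1234-4323-9876-6543XYZ"
        Dim clean As String = "1234-4323-9876-6543"

        ' IsMatch alone accepts the padded string.
        Console.WriteLine(Regex.IsMatch(padded, CardPattern).ToString())  ' True

        ' Match extracts just the card number.
        Console.WriteLine(Regex.Match(padded, CardPattern).Value)         ' 1234-4323-9876-6543

        ' Comparing the match with the input rejects surrounding characters.
        Console.WriteLine(IsWholeStringCard(padded).ToString())           ' False
        Console.WriteLine(IsWholeStringCard(clean).ToString())            ' True
    End Sub
End Module
```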

  13. More Regular Expressions Info
     • Regular Expressions in .Net
     • JavaScript Regular Expressions Tester
     • A Better .NET Regular Expression Tester
     • Tee-Shirts, etc. (language alert)

  14. RegEx Exercises
     • explain the following e-mail address pattern: [a-z]+@[a-z]+\.com
     • extend it to handle the following endings:
       • com, edu, org, net, gov, mil, int (must appear at least once)
     • modify it to allow numbers, dashes [-] and periods [.] after the 1st character

  15. RegEx Exercises - 2
     • create patterns to validate
       • zip codes both as 5-digit and Zip + 4, e.g., 90702 or 90702-7934
       • phone #s (international, long distance, and local)
       • Social Security #s 123-45-6789 (or spaces)
       • Cal Poly student ID numbers
       • names (including middle initial, von, de, etc.)
       • course #s (CIS, EBZ, CS; 1xx-4xx, etc.)

  16. RegEx Exercises - 3
     • test text or file samples against regular expressions at Regular Expression Library's tester page and report back, using regular expressions
       • from previous pages in this PowerPoint
       • others that you make up
     • search for examples from Regular Expression Library

  17. RegEx Exercises - 4
     • use Regular Expression Library's tester page to Load a Data Source from a URL and find data in the web page using a regular expression that you specify
     • and/or find free (trial) software that will do the same thing
