CIS 2250 Lecture Notes - Lecture 10: Metacharacter, Regular Expression, Parsing
Document Summary
Regular expressions are a powerful tool for text processing. They allow text to be described and parsed, and with additional support, tools that employ regular expressions can also modify text and data.

Example problem: check any number of files and report any occurrences of doubled words, e.g. "the the". The check should be case insensitive, so that "The the" is also reported. If you were checking HTML code, you would also have to disregard HTML tags, e.g. when a tag sits between the two occurrences of "the".

Given a regular expression and files to search, egrep attempts to match the regex against each line of each file, for example: egrep '^(from|search):' email-file. egrep breaks the input file into separate text lines. It has no understanding of higher-level concepts, such as words.

The following invocation of egrep matches the three letters c, a, t in sequence, wherever they occur: egrep 'cat' file. If the file contains the line "while on vacation, we saw a fat dog on the beach", that line matches, because "vacation" contains "cat".
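
The line-by-line matching described above can be sketched in Python. This is only an approximation of egrep's behaviour using Python's re module, not egrep itself. The helper name grep_lines, the DOUBLE_WORD pattern, and the sample strings other than the vacation line are assumptions made for illustration.

```python
import re


def grep_lines(pattern, text, ignore_case=False):
    """Return the lines of `text` in which `pattern` matches anywhere.

    Hypothetical helper that mimics egrep's approach: split the input into
    lines and try the regex on each line independently.
    """
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [line for line in text.splitlines() if regex.search(line)]


# 'cat' matches the three letters wherever they occur, even inside "vacation".
sample = "while on vacation, we saw a fat dog on the beach\nno match here"
print(grep_lines("cat", sample))

# Assumed pattern for the doubled-word problem: a word, whitespace, then the
# same word again (backreference \1), bounded so partial words do not count.
DOUBLE_WORD = r"\b([a-z]+)\s+\1\b"
doc = "This is the the problem.\nThe the again.\nAll fine here."
print(grep_lines(DOUBLE_WORD, doc, ignore_case=True))
```

Because each line is tested on its own, this sketch misses a doubled word that is split across a line break, and it makes no attempt to skip HTML tags. Both limitations follow from treating the input purely as separate lines of text.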