We saw in the searching chapter that we can search for literal strings in our text. This chapter extends that idea by introducing patterns.
A pattern is a specification that describes a set of strings. When a pattern is compared to a "target" string and the pattern describes that string, we say that the target string "matches the pattern". Patterns are generally composed of smaller patterns, called "atoms", arranged in a specific sequence to create the specification.
As a simple example, let's build a pattern that matches phone numbers of the format:
+1(234)567-8910
The first step is to break this string into a sequence of components:
+[country code]([area code])[prefix]-[line number]
Next, we need to specify each component in terms of regular expressions, then combine them into the pattern. One possible final pattern might be:
+\d(\d\{3\})\d\{3\}-\d\{4\}
which is made up of:
- the character class "\d", which specifies that the character in that location must be a digit,
- quantifiers such as "\{3\}" and "\{4\}", which specify how many times the preceding atom (here, a digit) must repeat, and
- literal strings "+", "(", ")", and "-".
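To see the same decomposition at work outside the editor, here is a minimal sketch in Python. It assumes Python's `re` module, which is a different regular expression dialect from the one used in the pattern above: there, "+", "(" and ")" are special and must be escaped to be taken literally, while the braces of a quantifier are written without backslashes. The names `PHONE` and `is_phone_number` are hypothetical and chosen only for illustration.

```python
import re

# Same structure as the pattern above, translated to Python's dialect:
#   \+       literal "+"
#   \d       one digit (country code)
#   \(\d{3}\) literal "(", three digits (area code), literal ")"
#   \d{3}    three digits (prefix)
#   -        literal "-"
#   \d{4}    four digits (line number)
PHONE = re.compile(r"\+\d\(\d{3}\)\d{3}-\d{4}")


def is_phone_number(text):
    """Return True if the whole string matches the phone pattern."""
    return PHONE.fullmatch(text) is not None


print(is_phone_number("+1(234)567-8910"))  # matches the format
print(is_phone_number("+1(234)567-891"))   # line number is one digit short
print(is_phone_number("1(234)567-8910"))   # missing the leading "+"
```

Here `fullmatch` requires the entire string to fit the pattern. A search inside a larger body of text would instead look for the pattern anywhere in the text.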