Lua's String Patterns

In the previous section we looked at format strings, which provide a convenient way to create consistent, well-formatted strings that can contain unique values. In this chapter we look at patterns, which are roughly the opposite: patterns allow us to extract values from well-formatted strings.

A pattern is a specification that describes the content of a string. When a pattern is compared to a "target" string and the pattern describes that string, we say that the target "matches the pattern". Patterns are generally composed of smaller patterns, called "atoms", arranged in a specific sequence to create the specification.
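To make the idea concrete, here is a minimal sketch. The ID format (two uppercase letters followed by three digits) and the name id_pattern are made up for illustration; %u and %d are Lua's character classes for an uppercase letter and a digit.

```lua
-- Hypothetical format: two uppercase letters followed by three digits.
-- Each atom describes exactly one character of the target.
local id_pattern = "%u%u%d%d%d"

print(string.match("AB123", id_pattern)) -- AB123
print(string.match("ab123", id_pattern)) -- nil (lowercase letters)
print(string.match("A1234", id_pattern)) -- nil (only one letter)
```

Each atom has to be satisfied in order, so changing either the kind or the position of a character in the target is enough to make the match fail.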

As a simple example, let's build a pattern that matches phone numbers; we will describe the process in more detail in the coming sections. We will assume that phone numbers follow the format:

+1(234)567-8910

The first step is to break this string into a sequence of components:

+[country code]([area code])[prefix]-[line number]

Next, we need to express each component in terms of atoms, then combine them into the full pattern. One possible final pattern is:

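+%d[(]%d%d%d[)]%d%d%d[-]%d%d%d%d

which is made up of:

  1. A literal +.
  2. The "character class" %d, which matches a single digit; each %d requires a digit at that position.
  3. A literal ( and ), written as the single-character sets [(] and [)] because bare parentheses have a special meaning in patterns, surrounding a sequence of 3 digits (%d%d%d).
  4. Sequences of 3 and 4 digits, separated by a literal -, written as [-] because a bare - following %d would be read as a repetition operator rather than a hyphen.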
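The hyphen deserves a closer look, because in Lua patterns a - that follows a character class is a repetition operator, not a literal character. The sketch below compares three ways of writing the last part of the phone number; the variable names are ours, chosen for illustration.

```lua
local target = "+1(234)567-8910"

-- A bare "-" after %d means "zero or more digits, as few as possible",
-- so this pattern asks for at least six digits in a row and no hyphen.
local bare = string.match(target, "%d%d%d-%d%d%d%d")

-- A one-character set or a % escape both make the hyphen literal.
local in_set = string.match(target, "%d%d%d[-]%d%d%d%d")
local escaped = string.match(target, "%d%d%d%-%d%d%d%d")

print(bare)    -- nil
print(in_set)  -- 567-8910
print(escaped) -- 567-8910
```

The set form [-] and the escaped form %- behave the same here, so the choice between them is a matter of style.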

If we apply this pattern to our target phone number with string.match, the matched text is returned, which is a truthy value indicating that a match occurred. For comparison, we include a few other target strings with slightly different formats, which produce nil (a falsy value), indicating that no match occurred.

local pattern = "+%d[(]%d%d%d[)]%d%d%d[-]%d%d%d%d"

print(string.match("+1(234)567-8910", pattern)) -- +1(234)567-8910
print(string.match("+1(23)4567-8910", pattern)) -- nil
print(string.match("(234)567-8910", pattern)) -- nil
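Since the goal of patterns is to extract values, it may help to preview how that looks for the phone number. The following is a sketch that wraps each component in parentheses (a "capture"), so that string.match returns the components instead of the whole match. The names capture_pattern, country, area, prefix and line are ours, chosen for illustration.

```lua
-- Same structure as the phone-number pattern, with a capture around each component.
local capture_pattern = "+(%d)[(](%d%d%d)[)](%d%d%d)[-](%d%d%d%d)"

-- On a successful match, string.match returns one string per capture.
local country, area, prefix, line =
  string.match("+1(234)567-8910", capture_pattern)

print(country, area, prefix, line) -- prints 1, 234, 567 and 8910, separated by tabs

-- A target that does not fit the format still produces nil.
print(string.match("(234)567-8910", capture_pattern)) -- nil
```

Note that the captured values are strings; converting them with tonumber is needed if they are to be used as numbers.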

With that quick introduction, let's learn how to create patterns.