A pattern matching function is a switch-returning function that is used in a pattern and participates in the
pattern matching process by scanning #current-input. The function's return value is used in the calling pattern
to determine if the pattern matched by the function succeeded or failed.
Here is a very simple pattern matching function that matches text up to and including a specified string:
   define switch function upto-and-including (value string p) as
      return #current-input matches any ** p

   process
      submit "Mary had a little lamb."

   find "Mary" upto-and-including ("little") => t
      output t

   find any
Here the function upto-and-including uses the matches operator to determine if the input data, represented by
#current-input, contains the terminating string value. If it does, matches consumes that portion of the data and
the function returns true, allowing the pattern that called the function to continue.
If the input data does not contain the terminating string, matches returns false, the function returns false, and
the pattern that called the function fails.
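The same approach can be adapted to stop just before the terminating string rather than consuming it. The
following is a minimal sketch, not taken from the original text: the function name up-to-but-excluding is
hypothetical, and the sketch assumes that lookahead can be used as the right-hand operand of ** so that the
terminator is tested but left in the input.

```omnimark
; Hypothetical variant of upto-and-including: match up to, but not
; including, the string p. Assumes "lookahead" is accepted after "**".
define switch function up-to-but-excluding (value string p) as
   return #current-input matches any ** lookahead p

process
   submit "Mary had a little lamb."

; The terminator is left in the input, so the calling pattern
; matches it explicitly after the function returns.
find "Mary" up-to-but-excluding ("little") => t "little"
   output t

find any
```

Leaving the terminator for the calling pattern keeps the captured text free of the delimiter.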
As the code above shows, data matched by a pattern matching function can be captured in a pattern variable in the
usual way. One of the limits of conventional pattern variables is that they cannot be used to build a shelf of
values from a repeated pattern. Pattern matching functions offer a way around this limitation:
   global string patterns variable

   define switch function digit-catcher (modifiable string digits) as
      do scan #current-input
      match digit+ => d
         set new digits to d
         return true
      else
         return false
      done

   process
      submit "(1)(2)(3)(4)"

   find ("(" digit-catcher (patterns) ")")+
      repeat over patterns as p
         output p || "%n"
      again
Pattern matching functions are particularly useful in nested
pattern matching. The following code uses a pattern matching function to handle nested parentheses:
   define switch function between-parentheses () as
      repeat scan #current-input
      match [any \ "()"]+
         ; Keep going.
      match "(" between-parentheses () ")"
         ; We've recursed in.
      match value-end
         return false
      again
      return true

   process
      submit "(1((2)(3))478(954)"

   find "(" between-parentheses () => t ")"
      output t || "%n"

   find any
The function between-parentheses matches data between parentheses. If it encounters an opening parenthesis
character, it calls itself recursively so that any level of parenthetical matter will be matched. If it encounters
a closing parenthesis that is not balanced by a preceding opening parenthesis, the character will not match, the
repeat scan will exit, and the function will return true.
Note that we do not actively match the closing parenthesis. Rather, the closing parenthesis is the only thing we
do not match. This is a common and useful technique in many kinds of balancing operations. Find everything but the
closing delimiter, and allow the repeat scan to exit. This allows the closing delimiter to be matched in
the outer pattern, which is good for two reasons. First, it makes the pattern easier to read. Second, it allows
you to capture the content of the structure without its delimiters (as we do here).
If the function matches the end of the input without seeing the closing parenthesis, it returns false.
If this occurs in a recursive call, value-end will then be matched by each instance of the function as it
unwinds.
Interestingly enough, this function can be written in a slightly more compact fashion:
   define switch function between-parentheses () as
      repeat scan #current-input
      match [any \ "()"]+
         ; Keep going.
      match "(" between-parentheses () ")"
         ; We've recursed in.
      again
      return true

   process
      submit "(1((2)(3))478(954)"

   find "(" between-parentheses () => t ")"
      output t || "%n"

   find any
This form never returns false. It does, however, work almost identically to the original function.
Unless a balancing closing parenthesis is encountered, the function will read to the end of the data, just like
the previous version. It then returns true, rather than false, just as if it had ended with the closing
delimiter. But the pattern that called the function will now fail because it will not be able to match the
closing parenthesis.
You can also use pattern matching functions to process the matched data, though it is important to remember that
the code in a pattern matching function is called and executed before the pattern as a whole is complete. This
means the function could execute even though the pattern as a whole fails. Thus the function could be called and
executed again in a subsequent attempt to match the same data. As a consequence, pattern matching functions with
side-effects can lead to unexpected program behaviour, and should be avoided. For example, the program
   define switch function greeting () as
      put #main-output "*"
      return #current-input matches "Hello, World!"

   process
      submit "Hello, World!%n"

   find greeting ()
      output "Salut, Monde!"

outputs

   *Salut, Monde!*
   *

In this example, the first asterisk is output when greeting succeeds in matching Hello, World!. The second
asterisk is emitted when the pattern matching function is called in attempting to match the newline (%n) that
follows Hello, World!. In this case, the function fails to match, but the asterisk has been output nonetheless.
Finally, the third asterisk is output when similarly attempting to match the end of input.
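One way to avoid the repeated side effect is to restrict the function to matching and move the output into the
body of the rule, which runs only after the whole pattern has succeeded. The following is a minimal sketch of
that rearrangement of the greeting example, offered as an illustration rather than as the only possible fix:

```omnimark
; The function now only tests the input; it writes nothing itself.
define switch function greeting () as
   return #current-input matches "Hello, World!"

process
   submit "Hello, World!%n"

; The asterisk is emitted from the rule body, so it is written only
; when the pattern as a whole has matched.
find greeting ()
   output "*Salut, Monde!"
```

The function is still called for each match attempt, including the failed attempts at the newline and at the end
of input, but because it no longer writes anything, those failed attempts leave no trace in the output.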