Scopes

Scopes play a particularly important role in the design and execution of OmniMark programs. OmniMark has the following kinds of scopes:

  • lexical scopes
  • execution scopes
  • coroutine scopes

Lexical scopes

A lexical scope is a scope in the written structure of the program. For instance, a rule is a lexical scope—it is written as a series of lines one after another. A function is also a lexical scope. Within a rule or function, a repeat loop or a do block is also a lexical scope.

Lexical scopes define the visibility of shelves. You can declare a local shelf in any lexical scope and it will be visible only to code within that scope. Where one lexical scope is nested inside another, shelves declared in the outer scope are visible in the inner scope, unless a shelf of the same name is declared in the inner scope. In this case, the shelf in the outer scope is hidden within the inner scope, but it still exists in the outer scope.

  process
     local string foo initial { "A" }
     local string bar initial { "B" }
  
     output foo || bar
     do
        local string foo initial { "Z" }
  
        set bar to "Y"
        output foo || bar
     done
     output foo || bar
        

In this program the process rule is one lexical scope. The do block is another lexical scope nested inside the lexical scope of the rule. The program outputs ABZYAY: AB before the do block, ZY inside it, and AY after it. The shelf bar, declared in the outer scope, is visible in the inner scope, so when its value is changed in the inner scope, the original shelf is changed. The shelf foo, on the other hand, is a different shelf inside the do block from the one declared in the rule. Giving foo the value "Z" in the do block does not change the value of foo in the outer scope.

Execution scopes

An execution scope (also called a dynamic scope) is a set of lexical scopes that execute together as a unit, in a nested fashion. The most straightforward case of execution nesting is a function call.

  define integer function 
     sum (value integer foo, 
          value integer bar)
  as 
     return foo + bar
           
  
  process
     local string foo initial { "A" }
     local string bar initial { "B" }
  
     output "d" % sum (2, 4)
     output foo || bar

Here the function sum is an entirely separate lexical scope. The shelf names foo and bar used in the function have nothing to do with the shelf names foo and bar in the process rule. But as the program is executed, the execution scope of the function is nested inside the execution scope of the process rule. The program outputs 6AB.

A more common case, in OmniMark, is the nested execution scoping that occurs when a find rule fires as a result of a submit in a rule:

  process
     output "<rhyme>"
     submit "Mary had a little lamb"
     output "</rhyme>"
      
  
  find ("Mary" | "lamb") => person
     output "<person>" || person || "</person>" 
        

This program outputs <rhyme><person>Mary</person> had a little <person>lamb</person></rhyme>. In this program, the execution of the find rule is nested inside the execution of the process rule. The submit initiates the scanning of the input data and invokes the find rules. It is this execution scoping that ensures that the <rhyme> and </rhyme> tags get wrapped around the material output as a result of the submit.

The find rule and the process rule are independent lexical scopes but nested execution scopes. Note, however, that unlike the previous example in which the nested execution scope of the function was directly invoked by the function call, in this case it is the data that determines if and when a find rule will be executed in the execution scope established by the process rule. The fact that the data drives program execution in this way is what makes OmniMark such a powerful text processing tool.

While local shelves are never visible outside their lexical scope, they are still instantiated for as long as their lexical scope is in execution scope, and they may well be active. Consider the following program:

  process
     local stream foo
  
     open foo as file "foo.txt"
     using output as foo
     do
        output "<rhyme>"
        submit "Mary had a little lamb"
        output "</rhyme>"
     done
      
  
  find ("Mary" | "lamb") => person
     output "<person>" || person || "</person>" 

In this case the local stream shelf foo, created in the process rule, is the current output stream for the block bounded by using output as foo do and done. The shelf is not lexically in scope in the find rule, so you cannot put any code in the find rule to address or manipulate it. It is nevertheless very much active there: it is the stream that output goes to when the find rule executes an output action.

As the above example hints, OmniMark uses execution scopes to a larger extent than most other programming languages. You can use declarations and actions to create the following kinds of execution scopes:

Output scopes

The using output as qualifier is used in the example above to establish an output scope. In most languages, the destination of an output action (or its equivalent) must be in lexical scope. In OmniMark, the question of where output goes to is separated from the act of creating output, meaning that the output destination is scoped dynamically. Once a stream is in the current output scope, all output will go to it, no matter what lexical scope the output action occurs in.
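
As a minimal sketch of this separation, the following program routes the output of a function into a stream that the function itself cannot see. The function name emit-greeting and the shelf name captured are hypothetical, chosen only for illustration.

  define function 
     emit-greeting (value string name)
  as
     ; no destination is named here: output goes to the current output scope
     output "Hello, " || name
  
  process
     local stream captured
  
     open captured as buffer
     using output as captured
     do
        emit-greeting ("Mary")
     done
     close captured
     output "captured: " || captured

The output action inside emit-greeting is expected to land in captured rather than on the program's default output, because the function call executes inside the output scope established by using output as. The final output action runs after that scope has ended, so it goes to the default output again.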

Input scopes

We have already seen several examples of an input scope. Every example above that uses submit creates a new input scope, and do xml-parse does the same. Input scopes are the flip side of output scopes. Just as output scopes determine where output goes, so input scopes determine where input comes from. Just as we never have to say where output goes to in an output action, we never have to say where the input comes from when we write a find rule. Output goes to the current output scope. Input comes from the current input scope.
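
An input scope can also be established explicitly. The following is a sketch only, assuming that the using input as prefix and the #current-input source are available in your version of OmniMark; the replacement text "sheep" is arbitrary.

  process
     ; establish an input scope, then scan whatever the current input is
     using input as "Mary had a little lamb"
        submit #current-input
  
  find "lamb"
     ; the rule never says where its input comes from
     output "sheep"

The find rule is written without any reference to a source. Which data it scans is decided entirely by the input scope that is current when the submit executes.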

Referent scopes

Another kind of execution scope is a referent scope, which can be used to control the lifetime of the referents created within it.
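
A minimal sketch of a referent scope follows, assuming the using nested-referents prefix; the referent name "lamb-count" and its value are hypothetical.

  process
     using nested-referents
     do
        output "Lambs counted: "
        ; the referent is output before it has a value
        output referent "lamb-count"
        output "%n"
        ; the value is supplied later, within the same referent scope
        set referent "lamb-count" to "1"
     done

The intent is that the referent "lamb-count" exists only for the duration of the do block and is resolved when that referent scope ends, so the placeholder written by the second output action is filled in with the value set afterwards.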

Coroutine scopes

A coroutine scope consists of two execution scopes, each of which executes in its own coroutine. One of them is an input scope and is called the consumer; the other is an output scope and is called the producer. The producer writes data to its current output, which feeds that data to the consumer through the consumer's current input. Execution alternates between the producer and the consumer depending on the data flow between them.
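
One way to picture a producer and a consumer is a string source function feeding a submit. This is a sketch under the assumption that your version of OmniMark supports string source functions; the function name rhyme and its argument are hypothetical.

  define string source function
     rhyme (value string subject)
  as
     ; producer: writes to its current output
     output subject || " had a little lamb"
  
  process
     ; consumer: scans its current input, which is fed by the producer
     submit rhyme ("Mary")
  
  find "lamb"
     output "sheep"

Here the body of rhyme plays the role of the producer and the submit, together with the find rule it triggers, plays the role of the consumer. Neither side names the other: the producer simply outputs and the consumer simply scans.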