About OmniMark
     

OmniMark is a streaming programming language. As a starting point, you can think of OmniMark as either a rule-based language or an event-based language. If you have ever programmed for a graphical user interface such as Windows, the Mac, or Motif, you are used to event-based programming. In these environments, the operating system captures user actions such as keystrokes or mouse clicks, or hardware actions such as data arriving on a serial port, and sends a message to the current program indicating what event has occurred. Programs consist of a collection of responses to events. The order in which program code is invoked depends on the order in which events occur.

OmniMark programs are written the same way, as a collection of responses to events. The difference is that the events an OmniMark program responds to are not user events or hardware events, but data events. Data events occur in streams of data. Because OmniMark is a streaming language, the management of streams is built into the heart of the language. OmniMark shields you from the details of stream handling just as good GUI programming languages shield you from the details of user input handling and window management.

What is a data event? Quite simply, a data event is something significant occurring in a stream of data. In a typical GUI environment, it is the operating system and its associated hardware that decides what is an event. There is a defined set of events, and programs simply have to respond to those events that interest them. Who decides what is an event in a stream of data? You do.

This is where the rule-based aspect of OmniMark comes into play. An OmniMark program consists of rules that define data events, and actions that take place when data events occur. Suppose you wanted to count the words in the text "Mary had a little lamb". You would write an OmniMark rule that defined the occurrence of a word as an event:

  find letter+

This is an OmniMark find rule. Find rules attempt to match patterns that occur in a data stream, and if they match something completely, they detect an event. This rule matches letters. The "+" sign after the keyword "letter" stands for "one or more", so this rule will go on matching letters until it comes to something that is not a letter, such as punctuation or a space. Having run out of letters, it will see if it needs to match anything else. Since it doesn't, the pattern is complete and the rule is fired. Any actions following the rule are then executed. This rule will fire once for every word in the data, so all that remains to do is increment a counter each time the rule is fired.

If you are used to other languages such as C or Visual Basic, you are probably thinking that there is something odd about the find rule above. Sure, it finds words, but what does it find them in? Where is the reference to the file or variable that contains the data?

Because it deals primarily with events happening in data, OmniMark maintains a current input. Rules automatically apply to the current input, so you don't have to specify what each rule applies to.

Similarly, OmniMark has a built-in output. All output goes to current output. If you need to change where output goes, you change the destination of current output.
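
As a rough sketch of redirecting current output, the fragment below wraps an output action in a "using output as" prefix so that the text goes to a file instead of the default destination. The file name "words.txt" is hypothetical, the exact form of the prefix is an assumption that may vary between OmniMark versions, and the process rule that holds the actions is explained later in this section.

```omnimark
; Sketch only: "words.txt" is a hypothetical file name.
process
   using output as file "words.txt"
      output "Mary had a little lamb%n"
```

The output action itself does not name a destination; it writes to whatever current output is at that point.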

Of course, you are not restricted to using only a single input or output. You can define and use a variety of inputs and outputs, as well as variables. But in OmniMark you generally do not have to concern yourself with opening files, reading the content into variables, and stepping through the content as you would in other languages. In OmniMark, you just name the desired input source and let the data flow; thus, a complete program to count the words in "Mary had a little lamb" looks like this:

  global counter wordcount initial {0}

  process
     submit "Mary had a little lamb"
     output "%d(wordcount)%n"

  find letter+
     increment wordcount 

(In the output statement, "%d" is a format item used to convert the value of the counter "wordcount" to a string and "%n" is a newline.)

To try this program, copy it to a file named "test.xom" and type the following on the command line:

  omnimark -s test.xom

The example above introduces a new kind of rule, the process rule. The process rule, as you would expect, is fired when processing begins. Our program consists of one global variable declaration and two rules. Note that, in this program, it doesn't matter in what order the rules appear since each fires only when a specific event occurs. Thus we could just as easily write the program:

  global counter wordcount initial {0}

  find letter+
     increment wordcount

  process
     submit "Mary had a little lamb"
     output "%d(wordcount)%n"

This program runs just the same as the first. This is not to say that the order of rules never matters in an OmniMark program. If one event could cause more than one find rule to fire, the rule that occurs first will fire, and the one that occurs later will not. This allows you to put more specific rules before more general rules and have the general rules fire only if the specific rule does not. The following two programs produce different output:

  global counter wordcount initial {0}

  process
     submit "Mary had a little lamb"
     output "%d(wordcount)%n"

  find "had"
     output "*"

  find letter+
     increment wordcount

  find any

The program above prints "*4". The program below changes the order of the find rules and produces a different output.

  global counter wordcount initial {0}

  process
     submit "Mary had a little lamb"
     output "%d(wordcount)%n"

  find letter+
     increment wordcount

  find "had"
     output "*"

  find any

This program prints "5".

Why did we add "find any" as a new rule in both these programs? Actually, it fixes an error in all the earlier versions of our word-counting program. The rule "find letter+" matches words. But what about the spaces between the words? What was happening to them? If you actually ran the first program, you might have noticed that it printed its result indented by four spaces. Those are the four unmatched spaces from our input. Any input that is not matched by a find rule goes right through to output; "find any" at the end of a set of find rules soaks up any unmatched input. Of course, if you use "find any", it must always be the last find rule: it matches every character, so any find rule placed after it could never fire.
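
This pass-through behavior can also be put to use. The sketch below, which uses only constructs already shown, has no "find any" rule on purpose: the single find rule replaces one word, and everything it does not match flows through to output unchanged. The replacement word "goat" is just an illustration.

```omnimark
; Only "lamb" is matched; all other input passes straight to current output.
process
   submit "Mary had a little lamb"

find "lamb"
   output "goat"
```

Run with the same command line as before, this should print "Mary had a little goat".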

We said that, in OmniMark, you define data events. This is not always true. Sometimes the data itself contains the definition of the event. Documents written in formal markup languages based on SGML and XML contain tags which break the document up into a set of elements. In such a document, the occurrence of such an element constitutes a data event. Because OmniMark has built-in parsers for XML and SGML, you don't need to worry about how elements are recognized; you just need to write rules to process them when they occur. These are called markup rules.
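
As a minimal sketch of what markup rules look like, the fragment below responds to two kinds of element events. The element name "title" is hypothetical, and the fragment is not a complete program: it assumes the input is being parsed by one of the built-in parsers, and how that parse is started is not shown here. In each rule, "%c" stands for the parsed content of the element.

```omnimark
; Fires for each "title" element (a hypothetical element name).
element title
   output "Title: %c%n"

; Fires for any element that has no rule of its own.
element #implied
   output "%c"
```

As with find rules, neither rule says where the data comes from; each simply describes what to do when its event occurs.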

       

Generated: April 21, 1999 at 2:00:46 pm

Copyright © OmniMark Technologies Corporation, 1988-1999.