Input

As a streaming language, OmniMark processes data by streaming it from one place to another and processing it as it streams. Within a program the source of the streamed data is always the current input scope. To get OmniMark to process a piece of data, therefore, you must:

identify the data to be processed,
create an OmniMark source attached to that data, and
make that source the current input scope.

You can often accomplish this in one step:

  submit "Mary had a little lamb"

Here, the literal string "Mary had a little lamb" is the data to be processed. OmniMark automatically converts strings to sources when they are used as the argument of a scanning action. The submit statement automatically creates an input scope and makes its argument the source for that input scope.

In this example, the source is a file:

  submit file "mary.txt"

Here the file operator returns a source that is attached to the named file. submit creates an input scope using that source.

To access network data sources, you use external connection components. For every component that handles an external data source, there is a function that returns an OmniMark source attached to that external data source. This example uses the tcp.reader function, which works with the tcp.connection data type:

  submit tcp.reader of connection

This statement scans the data from a TCP/IP connection directly as it streams from the sending application.

Notice that it does not matter whether a piece of data is internal or external to the program. In order to process the data you must scan or parse it. Anything from a simple string variable or literal string to a multi-gigabyte data file is processed in the same way by the same scanning and parsing operations, and they are all "input" as far as OmniMark's streaming architecture is concerned. Any distinction between internal and external sources is taken care of by OmniMark or an external function library. All your program sees is a standard OmniMark source, no matter what the original source of the data.
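As a minimal sketch of this uniformity, the program below submits first a literal string and then a file to the same find rule. The file name mary.txt is carried over from the earlier example, and the replacement text "sheep" is purely illustrative.

```omnimark
; The same find rule processes both sources. Data that no find rule
; matches passes through to the current output unchanged.
process
   submit "Mary had a little lamb%n"
   submit file "mary.txt"

find "lamb"
   output "sheep"
```

Neither the process rule nor the find rule needs to know whether the data came from a string or a file; only the argument of submit differs.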

Establishing a current input scope independently

While all of OmniMark's scanning and parsing operations establish a new input scope automatically when they are executed, you can also establish a current input scope independently of any scanning or parsing action with using input as:

  process
     using input as file #args[1]
        submit #current-input

Processing the current input source

You can perform a scanning or parsing operation on the data in the current input scope by scanning or parsing #current-input (as in the sample above). This allows you to start a new scanning process to do a particular job in the scanning of an input source. This is discussed in nested pattern matching.
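The following sketch shows one way this might look, assuming the input file begins with a run of digits that should be handled separately from the rest of the data. The pattern variable name leading-digits and the "header: " label are invented for illustration, and the sketch assumes that data not consumed by the nested scan remains available in the current input scope.

```omnimark
process
   using input as file #args[1]
   do
      ; A nested scan handles one particular job: the leading digits.
      do scan #current-input
         match digit+ => leading-digits
            output "header: " || leading-digits || "%n"
      done
      ; The rest of the same source then goes to the find rules
      ; (or straight to the output if none match).
      submit #current-input
   done
```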

Sources and strings

OmniMark reads sources incrementally as their data is scanned or parsed. This means that you can process very large data sources, or (in the case of network data, perhaps) sources of indeterminate length that run for days or weeks, without worrying about running out of memory or doing any buffering yourself.

Sources and strings both represent linear data sequences. The difference is that a string must be fully read into memory when it is created, while a source is read incrementally as the data is needed.

OmniMark coerces strings to sources and sources to strings as required. You should be careful not to cause a large source to be coerced to a string unnecessarily, as this will result in it being read into memory completely, which may impose a performance or resource consumption penalty. For example, anything passed as a value string argument to a function, or returned by a stream function, is coerced to a string.
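A simple case of unnecessary coercion, sketched here with the mary.txt file from the earlier example and a hypothetical variable named whole-file, is assigning a file to a variable before scanning it:

```omnimark
process
   local stream whole-file

   ; Coerces the source to a string: the entire file is read into
   ; memory before any scanning begins.
   set whole-file to file "mary.txt"
   submit whole-file

   ; Keeps the data as a source: the file is read incrementally as
   ; the scanning process consumes it.
   submit file "mary.txt"
```

For a small file the difference is negligible; for a very large file or an open-ended network source, the second form is the one that avoids the memory cost described above.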
