Asynchronous concurrent context-translations: how they work

Introduction

This is a small but complex example of how OmniMark behaves when executing a context-translation.

Context-translation provides a powerful mechanism for translating generalized text documents to XML/SGML. Context-translation can also be used to add other types of structure to a document, beyond XML/SGML.

In a context-translation, element rules and find rules operate concurrently and asynchronously. Each type of rule operates in one of two "processors": find rules run in the input processor, and element rules run in the output processor.

OmniMark selects rules and executes actions in one processor at a time.

The following example is designed to illustrate the interactions between the find rules and the element rules; it is a complete OmniMark program and can be executed. As will be illustrated, OmniMark can switch processors several times during the execution of the actions of a single rule. From the point of view of the OmniMark programmer, the input and output processors are executing concurrently. The asynchronicity is determined by the information requirements and information-generating capabilities of the markup parser.

The object of this OmniMark program is to move generalized text information "across" the markup parser through buffers: from the find rules to the element rules and vice versa. It is not a practical example in and of itself, but forms the model for existing commercial OmniMark applications.

Sample

  context-translate                                  ;  1
                                                     ;  2
  global stream main                                 ;  3
  global stream find-state                           ;  4
                                                     ;  5
  element main                                       ;  6
     output find-state                               ;  7
     set main to "Processing element 'main'%n"       ;  8
     output "%c"                                     ;  9
                                                     ; 10
  element alpha                                      ; 11
     output "%g(find-state)"                         ; 12
     suppress                                        ; 13
                                                     ; 14
  element beta                                       ; 15
     suppress                                        ; 16
                                                     ; 17
  find "\doc"                                        ; 18
     set find-state to "Processing find 'doc'%n"     ; 19
     output "<main>"                                 ; 20
     set find-state to "%g(main)"                    ; 21
     output "<beta>"                                 ; 22
                                                     ; 23
  find-start                                         ; 24
     output "<!doctype main ["                       ; 25
     output "<!element main  - o (alpha, beta)>"     ; 26
     output "<!element alpha - o empty>"             ; 27
     output "<!element beta  - o empty>]>"           ; 28
                                                     ; 29
  markup-error                                       ; 30

Given a document with the content "\doc", OmniMark will execute this program as follows:

Line 1: The compiler sets this program up as a context-translation. OmniMark begins in the output processor. It first tries to select document-start rules. As there are none, it then attempts to obtain information from the internal markup parser. The markup parser has not yet received any information from the input processor, so it in turn attempts to obtain information from OmniMark in the input processor.

Lines 24 through 28: The find-start rule is the first rule to be selected. OmniMark executes each of its actions in the order they are written. These actions will output a small Document Type Declaration to the #markup-parser stream.

The #markup-parser stream is directed into the internal markup parser. The markup parser will parse the complete DTD and will be in a state ready for a document instance.

The markup parser asks OmniMark for more information. There are no more find-start rules so OmniMark begins to read the document (assume a document opened on the command line). The first (and only in this example) text read from the document is "\doc". OmniMark compares the text to the patterns in its available find rules.

Line 18: The only find rule is selected because its pattern matched the text of the document. OmniMark begins to execute the actions in the find rule in the order written.

Line 19: OmniMark attaches the stream find-state to a buffer and sets the buffer's contents to "Processing find 'doc'%n". The set action accomplishes this by:

opening the stream find-state and attaching it to a buffer. The find-state stream now belongs to the input processor. Only find, find-start, find-end, and markup-error rules may write to this stream.
set then writes "Processing find 'doc'%n" to the find-state stream.
set closes the find-state stream.
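
The three steps performed by set can be modelled outside OmniMark. The following Python sketch is an illustration only: the NamedStream class, its owner field, and its method names are hypothetical and are not part of OmniMark. It assumes that opening a stream discards the buffer's earlier contents, as the description of Line 21 later in this walkthrough states.

```python
import io


class NamedStream:
    """Hypothetical stand-in for an OmniMark stream attached to a buffer."""

    def __init__(self, name):
        self.name = name
        self.owner = None      # processor that last opened the stream
        self.contents = ""     # buffer contents, readable once closed
        self._buffer = None    # the open buffer, or None when closed

    def open(self, owner):
        # Opening attaches a fresh buffer and discards earlier contents.
        self.owner = owner
        self.contents = ""
        self._buffer = io.StringIO()

    def write(self, text):
        if self._buffer is None:
            raise RuntimeError(f"stream {self.name!r} is not open")
        self._buffer.write(text)

    def close(self):
        # Closing makes what was written available to readers.
        self.contents = self._buffer.getvalue()
        self._buffer = None

    def set(self, owner, text):
        # 'set' is modelled as open + write + close in a single step.
        self.open(owner)
        self.write(text)
        self.close()


find_state = NamedStream("find-state")
find_state.set("input", "Processing find 'doc'\n")
print(repr(find_state.contents))
```

In this model a second call to set replaces the contents entirely rather than appending to them.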

Line 20: OmniMark writes <main> to the #markup-parser stream.

The markup parser recognizes <main> as a valid start tag and passes it to OmniMark in the output processor.

Line 6: OmniMark examines the element rules available and selects the main element rule. OmniMark begins to execute the actions in the rule in the order they are written.

Line 7: OmniMark writes the contents of the buffer attached to the find-state stream to the #main-output stream. The #main-output stream "belongs" to the output processor and find, find-start, and find-end rules may not write to this stream.

The find-state buffer contains information placed in it by the find rule on Line 19. The text Processing find 'doc' is written to the #main-output stream.

Line 8: OmniMark attaches the stream main to a buffer and sets the buffer's contents to "Processing element 'main'%n". The set action accomplishes this by:

opening the stream main and attaching it to a buffer. The main stream now belongs to the output processor. find, find-start, and find-end rules are not permitted to write to this stream.
set then writes "Processing element 'main'%n" to the main stream.
set closes the main stream.

Line 9: OmniMark uses a %c operator to process the content of the main element.

OmniMark attempts to obtain more information from the markup parser. The markup parser has no more information to give, so it attempts to obtain information from OmniMark in the input processor. OmniMark continues processing actions in the find rule where it left off previously.

Line 21: OmniMark attaches the stream find-state to a buffer and sets its contents. The previous contents of the buffer are lost permanently. The new contents are copied from the stream main created on Line 8.

Line 22: OmniMark writes <beta> to the #markup-parser stream.

The markup parser now has information. The text <beta> is recognized as a start tag. However, it is out of order. The markup parser expects to see a start tag for alpha prior to beta. The markup parser issues an error message to OmniMark.

Line 30: OmniMark selects the markup-error rule. It has no actions, so OmniMark suppresses the error message.

The markup parser then fabricates an alpha start tag and queues the beta start tag. The markup parser passes the fabricated alpha start tag to OmniMark in the output processor.
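
The recovery step can be sketched in Python. This is a deliberately simplified model, not the markup parser's actual algorithm: the function accept_start_tag and the event names are hypothetical, and the sketch only handles a plain sequence content model such as (alpha, beta).

```python
from collections import deque


def accept_start_tag(expected, tag):
    """Return the events produced when start tag `tag` arrives.

    `expected` is a deque of the element names still required, in
    order, by a simple sequence content model such as (alpha, beta).
    """
    if tag not in expected:
        return [("error", tag)]
    events = []
    if expected[0] != tag:
        events.append(("error", tag))      # report the out-of-order tag
    while expected[0] != tag:
        # Fabricate a start tag for each element that was skipped.
        events.append(("start-fabricated", expected.popleft()))
    events.append(("start", expected.popleft()))   # the queued tag
    return events


remaining = deque(["alpha", "beta"])
for event in accept_start_tag(remaining, "beta"):
    print(event)
```

For the sample's content model, a beta start tag yields the error first, then the fabricated alpha start tag, then the queued beta start tag, which is the order described above.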

OmniMark examines its element rules.

Line 11: OmniMark selects the alpha element rule and begins to execute its actions in the order they are written.

Line 12: OmniMark writes the contents of the find-state buffer. The contents of the find-state stream were created on Line 21. The text Processing element 'main' is written to the #main-output stream.

Line 13: OmniMark suppresses any further output from the alpha element and processes its content.

OmniMark attempts to obtain more information from the markup parser. The markup parser is holding on to the beta start tag, which it provides to OmniMark in the output processor.

Lines 15 and 16: OmniMark selects the beta element rule and executes the suppress action. The suppress action temporarily removes all streams from the current output set and causes OmniMark to process the content of the beta element. OmniMark attempts to obtain further information from the markup parser.

The markup parser has no more information, so it attempts to obtain information from OmniMark in the input processor. OmniMark reads more of the input document and immediately receives an end of file indication. OmniMark immediately provides the markup parser with the end of file indication.

The markup parser expects a main end tag or an end of file indication. It accepts the end of file as a valid input and fabricates a main end tag which it passes to OmniMark in the output processor ahead of the end of file indication.

Line 9: OmniMark resumes the execution of the main element rule immediately after the %c operator.

Line 10: There are no more actions in this rule and there are no suspended rules. OmniMark attempts to obtain more information from the markup parser.

The markup parser returns the end of file indication which causes OmniMark to cease executing this program.
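
The whole sequence can be imitated with Python generators, which suspend and resume in much the same demand-driven way. The sketch below is a model under stated assumptions, not OmniMark itself: the function names are hypothetical, the DTD is hard-coded rather than parsed, only the tags of the sample are handled, and the discarded error message is not modelled. The line-number comments refer to the sample program.

```python
def input_side(document, streams):
    """Find rules. Each yield is one write to the markup parser."""
    # The find-start rule (lines 24-28) is not modelled; the content
    # model (alpha, beta) is hard-coded in markup_parser instead.
    if document == "\\doc":                                   # line 18
        streams["find-state"] = "Processing find 'doc'\n"     # line 19
        yield "main"                                          # line 20
        streams["find-state"] = streams["main"]               # line 21
        yield "beta"                                          # line 22
    # Falling off the end stands for the end of file indication.


def markup_parser(tags):
    """Turn start tags into events, fabricating what the DTD implies."""
    expected = ["alpha", "beta"]          # children of main, in order
    for tag in tags:
        if tag == "main":
            yield ("start", "main")
            continue
        while expected[0] != tag:
            # Out of order. The empty markup-error rule (line 30)
            # discards the message, so only the recovery is modelled.
            missing = expected.pop(0)
            yield ("start", missing)
            yield ("end", missing)        # declared EMPTY
        expected.pop(0)
        yield ("start", tag)
        yield ("end", tag)                # declared EMPTY
    yield ("end", "main")                 # fabricated at end of file


def run(document):
    streams = {"main": "", "find-state": ""}
    written = []                          # stands for #main-output
    events = markup_parser(input_side(document, streams))

    def process_content(name):
        # Plays the role of "%c" and suppress: pull events until the
        # end of the current element, firing rules for child elements.
        for kind, child in events:
            if kind == "end" and child == name:
                return
            if kind == "start":
                rules[child]()

    def main():
        written.append(streams["find-state"])                 # line 7
        streams["main"] = "Processing element 'main'\n"       # line 8
        process_content("main")                               # line 9

    def alpha():
        written.append(streams["find-state"])                 # line 12
        process_content("alpha")                              # line 13

    def beta():
        process_content("beta")                               # line 16

    rules = {"main": main, "alpha": alpha, "beta": beta}
    for kind, name in events:
        if kind == "start":
            rules[name]()
    return "".join(written)


print(run("\\doc"), end="")
# prints:
# Processing find 'doc'
# Processing element 'main'
```

Each request for the next event resumes the parser, which in turn resumes the find rule only when it has nothing left to report. That is why line 21 runs after line 8 even though both rules are still in progress.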

To completely understand the order of events requires some knowledge of exactly how the markup parser processes XML/SGML. Fundamentally, processor switching occurs when the markup parser receives a complete markup token, such as a complete start or end tag, a complete processing instruction, a complete external entity reference, or data.

Each put or output action which writes into the #markup-parser stream is acted upon by the markup parser immediately after the action ends. The parser determines whether it has sufficient information to return to the output processor or whether it must obtain further information from the input processor.
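
The decision made after each write can be sketched as follows. This Python model is a rough approximation: the TokenBuffer class and its feed method are hypothetical, and the pattern recognizes only simple tags and character data, which is far less than a real markup parser handles.

```python
import re

# A complete tag, or a run of character data.
TOKEN = re.compile(r"<[^<>]*>|[^<]+")


class TokenBuffer:
    """Hypothetical model: collects writes, releases complete tokens."""

    def __init__(self):
        self.pending = ""

    def feed(self, text):
        """Called after each write; returns the tokens now complete."""
        self.pending += text
        complete = []
        while self.pending:
            match = TOKEN.match(self.pending)
            if match is None:      # an unfinished tag: wait for more
                break
            complete.append(match.group())
            self.pending = self.pending[match.end():]
        return complete


parser = TokenBuffer()
for written in ["<ma", "in>", "<beta>"]:
    tokens = parser.feed(written)
    # Complete tokens go to the output processor; otherwise the
    # parser has to ask the input processor for more.
    target = "output processor" if tokens else "input processor"
    print(f"{written!r}: {tokens} -> {target}")
```

A write that ends in the middle of a tag produces no tokens, so in this model control stays with the input processor until the tag is completed.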

Generated: April 21, 1999 at 2:01:41 pm