How Asynchronous Concurrent Context Translations Work

OmniMark® Programmer's Guide Version 3

18. How Asynchronous Concurrent Context Translations Work

Previous chapter is Chapter 17, "SGML Document and Subdocument Parsing".

Next chapter is Chapter 19, "Customizing OmniMark Behaviour".

This chapter uses a small but intricate example to describe how OmniMark behaves when executing a complex context translation.

The context-translation provides the most powerful mechanism for translating generalized text documents to SGML, and for translating generalized text documents to another form of generalized text document with the aid of a content model.

In a context translation, ELEMENT rules and FIND rules operate concurrently and asynchronously. Complex interactions between OmniMark and the SGML parser make this possible. Each type of rule operates in what is referred to as a domain: FIND rules in the input processor and ELEMENT rules in the output processor. The two domains can be thought of as sitting on either side of the internal SGML parser. The input processor is on the input side of the SGML parser and the output processor is on the output side.

OmniMark selects rules and executes actions in one domain or the other at a time.

The following example is designed to illustrate the interactions between the FIND rules and the ELEMENT rules. It is a complete OmniMark program and can be executed. As will be illustrated, OmniMark can switch domains several times during the execution of the actions of a single rule. From the point of view of the OmniMark programmer, the input and output processors are executing concurrently. The asynchronicity is determined by the information requirements and the information generation capability of the SGML parser.

The object of this OmniMark program is to move generalized text information "across" the SGML parser through buffers, from the FIND rule to the ELEMENT rules and vice versa. It is not a practical example in and of itself, but forms the model for existing commercial OmniMark applications.

   CONTEXT-TRANSLATE                              ;  1
                                                  ;  2
   GLOBAL STREAM main                             ;  3
   GLOBAL STREAM find-state                       ;  4
                                                  ;  5
   ELEMENT main                                   ;  6
     OUTPUT find-state                            ;  7
     SET main TO "Processing ELEMENT 'main'%n"    ;  8
     OUTPUT "%c"                                  ;  9
                                                  ; 10
   ELEMENT alpha                                  ; 11
     OUTPUT "%g(find-state)"                      ; 12
     SUPPRESS                                     ; 13
                                                  ; 14
   ELEMENT beta                                   ; 15
     SUPPRESS                                     ; 16
                                                  ; 17
   FIND "\doc"                                    ; 18
     SET find-state TO "Processing FIND 'doc'%n"  ; 19
     OUTPUT "<main>"                              ; 20
     SET find-state TO "%g(main)"                 ; 21
     OUTPUT "<beta>"                              ; 22
                                                  ; 23
   FIND-START                                     ; 24
     OUTPUT "<!doctype main ["                    ; 25
     OUTPUT "<!ELEMENT main  - o (alpha, beta)>"  ; 26
     OUTPUT "<!ELEMENT alpha - o empty>"          ; 27
     OUTPUT "<!ELEMENT beta  - o empty>]>"        ; 28
                                                  ; 29
   SGML-ERROR                                     ; 30

Given a document with the content "\doc", OmniMark will execute this program as follows:

Line 1: The compiler sets this program up as a context-translation. OmniMark begins in the output processor. It first tries to select DOCUMENT-START rules. As there are none it then attempts to obtain information from the internal SGML parser. The SGML parser has not received any information from the input processor so it then attempts to obtain information from OmniMark in the input processor.
Lines 24 through 28: The FIND-START rule is the first rule to be selected. OmniMark executes each of its actions in the order they are written. These actions will output a small Document Type Declaration to the #SGML stream. While this is somewhat unusual, it is apropos for this example.
The #SGML stream is directed into the internal SGML parser. The SGML parser will parse the complete DTD and will be in a state ready for a document instance.
The SGML parser asks OmniMark for more information. There are no more FIND-START rules so OmniMark begins to read the document (assume a document opened on the command line). The first (and only in this example) text read from the document is "\doc". OmniMark compares the text to the patterns in its available FIND rules.
Line 18: The only FIND rule is selected because its pattern matched the text of the document. OmniMark begins to execute the actions in the FIND rule in the order written.
Line 19: OmniMark attaches the stream find-state to a buffer and sets the buffer's contents to "Processing FIND 'doc'%n". The SET action accomplishes this by:
- opening the stream find-state and attaching it to a buffer. The find-state stream now belongs to the input processor. Only FIND, FIND-START, FIND-END, and SGML-ERROR rules may write to this stream.
- SET then writes "Processing FIND 'doc'%n" to the find-state stream.
- SET closes the find-state stream.
Line 20: OmniMark writes <main> to the #SGML stream.
The SGML parser recognizes <main> as a valid start tag and passes it to OmniMark in the output processor.
- Line 6: OmniMark examines the ELEMENT rules available and selects the main ELEMENT rule. OmniMark begins to execute the actions in the rule in the order they are written.
- Line 7: OmniMark writes the contents of the buffer attached to the find-state stream to the #MAIN-OUTPUT stream. The #MAIN-OUTPUT stream "belongs" to the output processor and FIND, FIND-START, and FIND-END rules may not write to this stream.
  The find-state buffer contains information placed in it by the FIND rule on line 19. The text Processing FIND 'doc' is written to the #MAIN-OUTPUT stream.
- Line 8: OmniMark attaches the stream main to a buffer and sets the buffer's contents to "Processing ELEMENT 'main'%n". The SET action accomplishes this by:
  - opening the stream main and attaching it to a buffer. The main stream now belongs to the output processor. FIND, FIND-START, and FIND-END rules are not permitted to write to this stream.
  - SET then writes "Processing ELEMENT 'main'%n" to the main stream.
  - SET closes the main stream.
- Line 9: OmniMark writes a "%c" format item to the #MAIN-OUTPUT stream. A "%c" format item tells OmniMark to process the content of the main element.
  OmniMark attempts to obtain more information from the SGML parser. The SGML parser has no more information to give so it attempts to obtain information from OmniMark in the input processor. OmniMark continues processing actions in the FIND rule where it left off previously.
Line 21: OmniMark attaches the stream find-state to a buffer and sets its contents. The previous contents of the buffer are lost permanently. The new contents are copied from the stream main created on Line 8.
Line 22: OmniMark writes <beta> to the #SGML stream.
The SGML parser now has information. The text <beta> is recognized as a start tag. However, it is out of order. The SGML parser expects to see a start tag for alpha prior to beta. The SGML parser issues an error message to OmniMark.
- Line 30: OmniMark selects the SGML-ERROR rule. It has no actions, so OmniMark suppresses the error message.
  The SGML parser then fabricates an alpha start tag and queues the beta start tag. The SGML parser passes the fabricated alpha start tag to OmniMark in the output processor.
  OmniMark examines its ELEMENT rules.
- Line 11: OmniMark selects the alpha ELEMENT rule and begins to execute its actions in the order they are written.
- Line 12: OmniMark writes the contents of the find-state buffer. The contents of the find-state stream were created on Line 21. The text Processing ELEMENT 'main' is written to the #MAIN-OUTPUT stream.
- Line 13: OmniMark suppresses any further output from the alpha element and processes its content.
  OmniMark attempts to obtain more information from the SGML parser. The SGML parser is holding on to the beta start tag which it provides to OmniMark in the output processor.
Lines 15 and 16: OmniMark selects the beta ELEMENT rule and executes the SUPPRESS action. The SUPPRESS action temporarily removes all streams from the current output set and causes OmniMark to process the content of the beta element. OmniMark attempts to obtain further information from the SGML parser.
The SGML parser has no more information so it attempts to obtain information from OmniMark in the input processor. OmniMark reads more of the input document and immediately receives an end of file indication. OmniMark immediately provides the SGML parser with the end of file indication.
The SGML parser expects a main end tag or an end of file indication. It accepts the end of file as a valid input and fabricates a main end tag which it passes to OmniMark in the output processor ahead of the end of file indication.
Line 9: OmniMark resumes the execution of the main ELEMENT rule immediately after the "%c" format item.
Line 10: There are no more actions in this rule and there are no suspended rules. OmniMark attempts to obtain more information from the SGML parser.
The SGML parser returns the end of file indication which causes OmniMark to cease executing this program.
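
The order of events above can be modelled outside OmniMark. The following Python sketch is not OmniMark code and does not reproduce the real SGML parser. It assumes that the input processor can be treated as a generator that pauses at each write to #SGML, that the parser only needs to know the required start-tag order (main, alpha, beta), and that alpha and beta are empty. All function names (translate, input_processor, next_event, run_element, process_content) are invented for illustration. The numbers appended to the trace are the line numbers of the program listing.

```python
from collections import deque


def translate():
    """Toy model of the walkthrough; returns (main output text, trace of line numbers)."""
    buffers = {"main": "", "find-state": ""}     # the two GLOBAL STREAM buffers
    out = []                                     # stands in for #MAIN-OUTPUT
    trace = [1]                                  # line 1: CONTEXT-TRANSLATE
    queue = deque()                              # events the parser has ready
    expected = deque(["main", "alpha", "beta"])  # start-tag order the DTD requires

    def input_processor():
        # Each yield is a write to #SGML, where control may leave this domain.
        trace.append(24)
        yield "DTD"                              # FIND-START, lines 24 to 28
        trace.append(19)
        buffers["find-state"] = "Processing FIND 'doc'\n"
        trace.append(20)
        yield "main"                             # OUTPUT "<main>"
        trace.append(21)
        buffers["find-state"] = buffers["main"]  # old contents are lost
        trace.append(22)
        yield "beta"                             # OUTPUT "<beta>"

    source = input_processor()

    def next_event():
        # Parser: serve queued events first, otherwise resume the input domain.
        # Handles only the tags used in this example.
        while not queue:
            token = next(source, None)
            if token is None:                    # end of file: fabricate </main>
                queue.extend([("end", "main"), ("eof", None)])
            elif token != "DTD":
                while expected[0] != token:      # tag arrived out of order
                    trace.append(30)             # SGML-ERROR rule is selected
                    queue.append(("start", expected.popleft()))  # fabricated tag
                queue.append(("start", expected.popleft()))
        return queue.popleft()

    def process_content():
        # "%c": pull events until the end tag of the current element arrives.
        while True:
            kind, name = next_event()
            if kind != "start":
                return
            run_element(name)

    def run_element(name):
        # Output processor: one branch per ELEMENT rule.
        if name == "main":
            trace.append(7)
            out.append(buffers["find-state"])
            trace.append(8)
            buffers["main"] = "Processing ELEMENT 'main'\n"
            trace.append(9)
            process_content()                    # rule is suspended here
        elif name == "alpha":
            trace.append(12)
            out.append(buffers["find-state"])
            trace.append(13)                     # SUPPRESS; alpha is empty
        elif name == "beta":
            trace.append(16)                     # SUPPRESS; beta is empty

    while True:
        kind, name = next_event()
        if kind == "eof":
            break
        run_element(name)
    return "".join(out), trace


text, trace = translate()
print(text, end="")
print(trace)
```

In this model the FIND rule is interrupted twice, once after line 20 and once after line 22, and the main ELEMENT rule stays suspended at line 9 until the fabricated end tag arrives. The trace lists the action lines in the same order as the walkthrough.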

To completely understand the order of events requires some knowledge of exactly how the SGML parser processes SGML. Fundamentally, domain switching occurs when the SGML parser receives a complete SGML token, such as a complete start or end tag, a complete processing instruction, or a complete external entity reference.

Each PUT or OUTPUT action which writes into the #SGML stream is acted upon by the SGML parser immediately after the action ends. The parser determines whether it has sufficient information to return to the output processor or whether it must obtain further information from the input processor.
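
The "complete token" decision can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the class name SgmlInput and its write method are hypothetical, a token is simplified to a single complete <...> tag, and data characters, declarations, and entity references are ignored. Each call to write stands for one PUT or OUTPUT action on the #SGML stream. An empty result means the parser must go back to the input processor for more text, and a non-empty result means it has something to pass to the output processor.

```python
import re

# Simplification: a token is one complete <...> tag and nothing else.
TOKEN = re.compile(r"<[^<>]*>")


class SgmlInput:
    """Hypothetical stand-in for the parser's view of the #SGML stream."""

    def __init__(self):
        self.pending = ""  # text received so far that is not yet a complete token

    def write(self, text):
        """Accept the text of one action and return the tokens that are now complete."""
        self.pending += text
        tokens = []
        while True:
            match = TOKEN.match(self.pending)
            if match is None:
                break  # incomplete: more input is needed before switching domains
            tokens.append(match.group())
            self.pending = self.pending[match.end():]
        return tokens


parser = SgmlInput()
print(parser.write("<ma"))         # no complete tag yet
print(parser.write("in>"))         # the start tag is now complete
print(parser.write("<alpha><be"))  # one complete tag, one partial tag held back
```

The design choice to examine the buffer only at the end of each write mirrors the statement that the parser acts after the action ends, not in the middle of it.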
