"The Official Guide to Programming with OmniMark"
International Edition
Previous chapter is Chapter 17, "SGML Document and Subdocument Parsing".
Next chapter is Chapter 19, "Customizing OmniMark Behaviour".
This chapter uses a small but complex example to describe how OmniMark behaves when executing a complex context-translation.
The context-translation provides the most powerful mechanism for translating generalized text documents to SGML, and for translating generalized text documents to another form of generalized text document with the aid of a content model.
In a context translation, ELEMENT rules and FIND rules operate concurrently and asynchronously. Complex interactions between OmniMark and the SGML parser make this possible. Each type of rule operates in what is referred to as a domain: FIND rules in the input processor and ELEMENT rules in the output processor. The two domains can be thought of as sitting on either side of the internal SGML parser, with the input processor on its input side and the output processor on its output side.
OmniMark selects rules and executes actions in one domain or the other at a time.
The following example is designed to illustrate the interactions between the FIND rules and the ELEMENT rules. It is a complete OmniMark program and can be executed. As will be illustrated, OmniMark can switch domains several times during the execution of the actions of a single rule. From the point of view of the OmniMark programmer, the input and output processors are executing concurrently. The asynchronicity is determined by the information requirements and the information generation capability of the SGML parser.
The object of this OmniMark program is to move generalized text information "across" the SGML parser through buffers, from the FIND rule to the ELEMENT rules and vice versa. It is not a practical example in and of itself, but forms the model for existing commercial OmniMark applications.
CONTEXT-TRANSLATE                               ; 1
                                                ; 2
GLOBAL STREAM main                              ; 3
GLOBAL STREAM find-state                        ; 4
                                                ; 5
ELEMENT main                                    ; 6
   OUTPUT find-state                            ; 7
   SET main TO "Processing ELEMENT 'main'%n"    ; 8
   OUTPUT "%c"                                  ; 9
                                                ; 10
ELEMENT alpha                                   ; 11
   OUTPUT "%g(find-state)"                      ; 12
   SUPPRESS                                     ; 13
                                                ; 14
ELEMENT beta                                    ; 15
   SUPPRESS                                     ; 16
                                                ; 17
FIND "\doc"                                     ; 18
   SET find-state TO "Processing FIND 'doc'%n"  ; 19
   OUTPUT "<main>"                              ; 20
   SET find-state TO "%g(main)"                 ; 21
   OUTPUT "<beta>"                              ; 22
                                                ; 23
FIND-START                                      ; 24
   OUTPUT "<!doctype main ["                    ; 25
   OUTPUT "<!ELEMENT main - o (alpha, beta)>"   ; 26
   OUTPUT "<!ELEMENT alpha - o empty>"          ; 27
   OUTPUT "<!ELEMENT beta - o empty>]>"         ; 28
                                                ; 29
SGML-ERROR                                      ; 30
Given a document with the content "\doc", OmniMark will execute this program as follows:
The #SGML stream is directed into the internal SGML parser. The SGML parser will parse the complete DTD and will be in a state ready for a document instance.
The SGML parser asks OmniMark for more information. There are no more FIND-START rules so OmniMark begins to read the document (assume a document opened on the command line). The first (and only in this example) text read from the document is "\doc". OmniMark compares the text to the patterns in its available FIND rules.
The SGML parser recognizes <main> as a valid start tag and passes it to OmniMark in the output processor.
The find-state buffer contains information placed in it by the FIND rule on line 19. The text Processing FIND 'doc' is written to the #MAIN-OUTPUT stream.
OmniMark attempts to obtain more information from the SGML parser. The SGML parser has no more information to give so it attempts to obtain information from OmniMark in the input processor. OmniMark continues processing actions in the FIND rule where it left off previously.
The SGML parser now has information. The text <beta> is recognized as a start tag. However, it is out of order. The SGML parser expects to see a start tag for alpha prior to beta. The SGML parser issues an error message to OmniMark.
The SGML parser then fabricates an alpha start tag and queues the beta start tag. The SGML parser passes the fabricated alpha start tag to OmniMark in the output processor.
OmniMark examines its ELEMENT rules.
OmniMark attempts to obtain more information from the SGML parser. The SGML parser is holding on to the beta start tag which it provides to OmniMark in the output processor.
The SGML parser has no more information so it attempts to obtain information from OmniMark in the input processor. OmniMark reads more of the input document and immediately receives an end of file indication. OmniMark immediately provides the SGML parser with the end of file indication.
The SGML parser expects a main end tag or an end of file indication. It accepts the end of file as a valid input and fabricates a main end tag which it passes to OmniMark in the output processor ahead of the end of file indication.
The SGML parser returns the end of file indication which causes OmniMark to cease executing this program.
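The hand-offs described in the steps above can be approximated with a toy model. The sketch below is written in Python, not OmniMark, and rests on several assumptions: generators stand in for the input processor and the SGML parser, a dictionary stands in for the two global buffers, and the FIND-START rule and DTD parsing are left out. The names simulate, find_doc, parser and element_main are hypothetical. The second line of output is what this model predicts from lines 8, 12 and 21 of the listing; the steps above do not state it explicitly.

```python
def simulate():
    """Toy model of the example program; returns (main_output, trace)."""
    bufs = {"main": "", "find-state": ""}   # the two GLOBAL STREAM buffers
    out = []                                 # stands in for #MAIN-OUTPUT
    trace = []                               # which side ran, in order

    def find_doc():
        # Input side: the FIND "\doc" rule (lines 18-22). Each yield models
        # an OUTPUT into #SGML, after which control may pass elsewhere.
        trace.append("input: FIND sets find-state, outputs <main>")
        bufs["find-state"] = "Processing FIND 'doc'\n"       # line 19
        yield "main"                                         # line 20
        trace.append("input: FIND copies main to find-state, outputs <beta>")
        bufs["find-state"] = bufs["main"]                    # line 21
        yield "beta"                                         # line 22
        trace.append("input: end of file")

    def parser(tags):
        # Middle: enforces the content model (alpha, beta) inside main and
        # fabricates any start tag that was skipped.
        expected = ["alpha", "beta"]
        for tag in tags:
            if tag == "main":
                yield tag
                continue
            while expected and expected[0] != tag:
                missing = expected.pop(0)
                trace.append(f"parser: error, fabricates <{missing}>")
                yield missing
            if expected:
                expected.pop(0)
            yield tag
        trace.append("parser: fabricates </main>")

    def element_main(events):
        # Output side: ELEMENT main (lines 6-9) and its children.
        trace.append("output: ELEMENT main")
        out.append(bufs["find-state"])                       # line 7
        bufs["main"] = "Processing ELEMENT 'main'\n"         # line 8
        for child in events:                                 # line 9, "%c"
            trace.append(f"output: ELEMENT {child}")
            if child == "alpha":
                out.append(bufs["find-state"])               # line 12
            # beta only has SUPPRESS (line 16), so nothing is written

    events = parser(find_doc())
    first = next(events)      # pull until the parser has a complete <main>
    assert first == "main"
    element_main(events)
    return "".join(out), trace


if __name__ == "__main__":
    output, trace = simulate()
    print("\n".join(trace))
    print(output, end="")
```

Because Python generators run only when something asks them for the next value, the FIND rule in this model is suspended after each output and resumed only when the output side needs more, which mirrors the one-domain-at-a-time behaviour described above. In this model the text collected for #MAIN-OUTPUT is "Processing FIND 'doc'" followed by "Processing ELEMENT 'main'", each on its own line.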
A complete understanding of the order of events requires some knowledge of exactly how the SGML parser processes SGML. Fundamentally, domain switching occurs when the SGML parser receives a complete SGML token, such as a complete start or end tag, a complete processing instruction, or a complete external entity reference.
Each PUT or OUTPUT action that writes into the #SGML stream is acted upon by the SGML parser immediately after the action ends. The parser determines whether it has sufficient information to return to the output processor or whether it must obtain further information from the input processor.
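The decision the parser makes after each write can be sketched as a small buffer that only releases complete tokens. This is a Python illustration, not OmniMark, and it makes simplifying assumptions: the class name ToyParser and its write method are hypothetical, only tags of the form <...> are treated as tokens, and data characters, processing instructions and entity references are not modelled.

```python
import re

# A complete tag: "<", anything except angle brackets, then ">".
TAG = re.compile(r"<[^<>]*>")


class ToyParser:
    """Holds text written to a stand-in for #SGML until a tag is complete."""

    def __init__(self):
        self.pending = ""   # text received so far that is not yet a token

    def write(self, text):
        """Model one OUTPUT action; return the complete tags now available.

        An empty list means the parser needs more from the input side.
        A non-empty list means it can hand tokens to the output side.
        """
        self.pending += text
        tokens = []
        while True:
            match = TAG.match(self.pending)
            if match is None:
                break
            tokens.append(match.group(0))
            self.pending = self.pending[match.end():]
        return tokens


if __name__ == "__main__":
    p = ToyParser()
    print(p.write("<ma"))          # incomplete tag: nothing to hand over yet
    print(p.write("in>"))          # the tag is now complete
    print(p.write("<alpha><be"))   # one complete tag, one still pending
```

In this sketch, writing "<ma" yields no tokens, so control would stay with the input side; writing "in>" completes <main>, at which point a switch to the output side becomes possible. A real SGML parser bases the decision on more than tag boundaries.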
Copyright © OmniMark Technologies Corporation, 1988-1997. All rights reserved.
EUM27, release 2, 1997/04/11.