external-text-entity
Full Description
Syntax
  external-text-entity entity-name condition?
     action*


Purpose

A rule used to provide OmniMark's SGML parser with the text of an external text entity (that is, an external entity that is not cdata, sdata, ndata or subdoc) whenever such an entity is referenced in an SGML document. An external-text-entity rule looks like an external-data-entity rule. Its most important property is that everything written to the #markup-parser stream within the rule is considered part of the entity's text.

For example, this rule specifies that the text of the entity named "version" is the content of the file named "version.txt":

  external-text-entity version
     output file "version.txt"

If an external-text-entity rule outputs no text, the SGML parser treats the entity's replacement text as having zero characters. This is not an error.

In a context-translation, an external-text-entity rule can be performed while the find-start rules are being performed. This happens if the find-start rules output the text of an SGML declaration and there are external-text-entity rules for processing the entities represented by the public identifiers in the SGML declaration.

The output action is usually used inside an external-text-entity rule to provide the SGML parser with the entity's replacement text. In an external-text-entity rule, the default #current-output stream contains only the #markup-parser stream. That allows the replacement text of the entity to be fed to the parser using output actions.

Because anything written to the #markup-parser stream in an external-text-entity rule becomes part of the entity's text, the entity's text can be made up of one or more pieces from one or more sources.
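As a minimal sketch of this, the following rule builds the entity's text from a literal string followed by the content of a file. The entity name "notice" and the file name "notice.txt" are hypothetical and are used only for illustration:

  external-text-entity notice
     ; First piece: literal text supplied by the program
     output "<p>Draft copy</p>"
     ; Second piece: the content of a file (hypothetical name)
     output file "notice.txt"

Both pieces are written to the #markup-parser stream, so the parser sees them as one continuous replacement text.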

If the replacement text of some of the external entities is small, all of the entities can be defined in a single file. This technique can be used to construct a "control file" for configurable documents.

The external-text-entity rule is unique in OmniMark in that its parts are executed in different domains. The header of the rule, and any associated condition, is tested in the output processor. If the rule is selected, the actions within the rule body are performed in the input processor.

The reason for this split is:

  • The entity reference which triggers the external-text-entity rule is recognized by the SGML parser. Events recognized by the SGML parser cause output processor rules to be triggered. Therefore, the header of the external-text-entity rule must belong to the output processor in order for it to be triggered.
  • The contents of the external text entity must be submitted to the SGML parser to be parsed. Only actions that belong to the input processor can supply text to the SGML parser. Therefore, the actions of the external-text-entity rule must be executed in the input processor.

In practice, it will be irrelevant to most programmers that the rule header is tested in the markup processor (the output processor) and the actions are executed in the pattern processor (the input processor). However, this split has the following implications:

  • The currently active groups in the markup processor affect which external-text-entity rules can be selected.
  • Any function called from within the condition at the head of an external-text-entity rule is performed within the markup processor. Any groups mentioned or modified in any next group is or using group action are those of the markup processor. In particular, a next group is action in such a function will change the markup processor's currently active groups.
  • In the body of the external-text-entity rule the groups mentioned and modified in any next group is or using group action are those of the pattern processor. In particular, any next group is action changes the pattern processor's currently active groups, and any submit in the body of the external-text-entity rule selects find rules using the pattern processor's currently active groups.

Versions of OmniMark prior to V3 treated the rule header for the external-text-entity rule as if it were evaluated in the pattern processor, so there may be a change in the behavior for external-text-entity rules which are not in the #implied group. It is not expected that this will be a significant change for most programs.

Where an external text entity's text does not need processing, an external-text-entity rule can simply use an output or put action to provide the file's text to the SGML parser.

Even if some processing is required, it can be done with a do scan or repeat scan in the external-text-entity rule, where each match emits the processed text using output or put.
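A minimal sketch of this kind of light processing follows. It assumes the entity's file contains a placeholder string "@VERSION@" that should be replaced with "3.1" before parsing; both the placeholder and the value are hypothetical, as is the pattern variable name c:

  external-text-entity #implied when entity is system
     repeat scan file "%eq"
     match "@VERSION@"
        ; Replace the (hypothetical) placeholder with fixed text
        output "3.1"
     match any => c
        ; Pass every other character through unchanged
        output c
     again

Everything written by the output actions inside the scan goes to the #markup-parser stream, so the parser receives the processed text as the entity's replacement text.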

On the other hand, if substantial processing is required, it is often more appropriate to submit the text of the file to be processed by find rules. In this case, any output of the find rules that process the submitted text is considered part of the text of the entity.

If the find rules are different from those used to process the main input, it will be necessary to use a using group prefix on the submit action to specify which find rules are used to process the submitted text. For example:

  external-text-entity #implied when entity is (public & in-library)
     using group entity-processing
        submit file "%pq"
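The find rules in the entity-processing group named above might look like the following sketch. The placeholder "@VERSION@" and its replacement are hypothetical; the point is only that these rules belong to the group selected by the using group prefix and so apply to the submitted entity text rather than to the main input:

  group entity-processing

  find "@VERSION@"
     ; Output from a find rule triggered by the submit becomes
     ; part of the entity's replacement text
     output "3.1"

Submitted text that is not matched by any find rule is passed through to the current output, which here is the #markup-parser stream.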

Note that when a category is handled with a condition, all instances failing the condition must be handled using other rules. For example, when an #implied rule is qualified with a when condition, a second #implied rule must be used to handle the entities for which that condition fails.
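One way to cover both cases is sketched below, assuming the program only knows how to resolve entities that have a system identifier. The wording of the message is illustrative:

  external-text-entity #implied when entity is system
     ; Entities with a system identifier: read the named file
     output file "%eq"

  external-text-entity #implied unless entity is system
     ; All remaining entities: report them and supply no text
     put #error "No system identifier for entity %q%n"

Because the second rule writes nothing to the #markup-parser stream, the parser treats those entities as having empty replacement text.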

An output-to action is allowed in an external-text-entity rule. output-to in an external-text-entity rule remains in effect until the end of the rule, unless it is overridden by a further output-to.

Usually, the only active output stream in an external-text-entity rule is the #markup-parser stream, so text written using the output action becomes part of the replacement text of the external text entity. The output-to action allows the OmniMark programmer to redirect the output to another destination.

The following code shows how external-text-entity rules can be used to match named entities. The first rule will match all entities named in the program, except the #dtd entity and those used in the SGML declaration (because they don't have names). This allows the #dtd entity to be processed differently from named entities. The second rule matches all entities in the DTD and the document instance by including both the #dtd and #implied entities:

  external-text-entity #implied when
  ...

  external-text-entity (#implied | #dtd) when
  ...

Entity replacement text can also be constructed from multiple sources.

The following external-text-entity rule processes any external text entity that has a system identifier. It treats the system identifier as a sequence of file names, separated by semicolons, and concatenates the text from all of the files together as the entity's replacement text.

  external-text-entity #implied when entity is system
     repeat scan "%eq"
     match [any except ";"]+ => file-name
        output file file-name
     match ";"
        ; Ignore any semicolon
     again

An example of an entity with multiple file names is a general entity that represents the chapters that make up the "advanced" part of a textbook:

  <!ENTITY advanced SYSTEM "chapter7.sgm;chapter8.sgm;chapter9.sgm">

Copyright © OmniMark Technologies Corporation, 1988-1998.