OmniMark® Programmer's Guide Version 3

14. SGML Objects

Previous chapter is Chapter 13, "The Current Output Stream Set".

Next chapter is Chapter 15, "Processing SGML Errors".

OmniMark provides powerful SGML query facilities for retrieving information about the current SGML context. The following information is available:

In addition, OmniMark also provides:

14.1 Elements

OmniMark provides powerful and expressive facilities to obtain information about the element context at the current position in the document, and to obtain detailed information about any or all of the elements that are open at that position.

14.1.1 Element Qualifiers

Normally, in OmniMark, element references and attribute references refer to the innermost open element at the current point in the document instance. The references can be modified with element qualifiers to specify a different element.

The terms used in element qualifiers are:

The following element qualifiers are always available:

The element-name-list above is a parenthesized list of element-names separated by either the "|" operator or the keyword OR:

Syntax

   ( element-name ( (OR | |) element-name)* )

Except for "OF DOCTYPE", element qualifiers can themselves be qualified. Thus, constructs such as the following are permissible:

   ... PARENT OF PARENT OF ANCESTOR listitem ...

14.1.2 Element Context Tests

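OmniMark provides a variety of ways to determine the element context at the current point in a document. There are tests for open elements, recently closed elements, data content, the proper or included status of elements, element counts, and element declarations. Each group is described in a subsection below.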

14.1.2.1 Testing Open Elements

Several tests pertain to open elements. Often these tests refer to an element with a specified element name. These tests are defined below.

In these test definitions, element-qualifiers represents the optional use of one or more of the above element qualifiers. If no element qualifiers are present, the test always applies to the current element.

All of these tests allow the programmer to specify a parenthesized list of element-names separated by OR or "|" instead of a single element name. The test succeeds if the reference element is identified by any of the names in the element-name-list.
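
Because every test accepts an element-name-list, one condition can cover several element names. The following is a minimal sketch; the element names title, chapter and appendix are hypothetical and do not come from any particular DTD:

   ELEMENT title
      ; Succeeds when the parent is either a chapter or an appendix.
      DO WHEN PARENT IS (chapter | appendix)
         OUTPUT "%n%c%n"
      ELSE
         OUTPUT "%c"
      DONE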

14.1.2.1.1 Testing the Identity of an Element

Syntax

   ELEMENT element-qualifier* (IS | ISNT)
      (element-name | element-name-list)

The "ELEMENT IS" test checks whether the element defined by the element-qualifiers has one of the specified names.

14.1.2.1.2 Testing the Identity of the Parent Element

Syntax

   PARENT element-qualifier* (IS | ISNT)
       (element-name | element-name-list)

The "PARENT IS" test checks whether the parent of the qualified element has a particular element-name.

14.1.2.1.3 Testing the Existence of a Particular Ancestor Element

Syntax

   ANCESTOR element-qualifier* (IS | ISNT)
       (element-name | element-name-list)

The "ANCESTOR IS" test checks if the element indicated by the qualifiers has any ancestor with a particular element name.
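
As an illustration, the sketch below marks every paragraph that lies at any depth inside a footnote. The element names par and footnote are hypothetical:

   ELEMENT par
      ; True for any enclosing footnote, not just the parent.
      OUTPUT "[note] " WHEN ANCESTOR IS footnote
      OUTPUT "%c%n"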

14.1.2.1.4 Testing the Existence of an Ancestor of the Parent Element

Syntax

   PREPARENT element-qualifier* (IS | ISNT)
      (element-name | element-name-list)

The "PREPARENT IS" test checks if a specified element has any preparent with a particular element name. In other words, it checks if the parent of the specified element has a particular ancestor.
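
The difference from "PARENT IS" and "ANCESTOR IS" can be sketched as follows, using the hypothetical element names title and appendix. The test succeeds for a title whose parent lies somewhere inside an appendix, but not for a title whose only enclosing appendix is its own parent:

   ELEMENT title
      ; The parent is skipped: only ancestors of the parent count.
      DO WHEN PREPARENT IS appendix
         OUTPUT "[nested in appendix] %c%n"
      ELSE
         OUTPUT "%c%n"
      DONE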

14.1.2.1.5 Testing If a Particular Element Is Open

Syntax

   OPEN ELEMENT element-qualifier* (IS | ISNT)
      (element-name | element-name-list)

The "OPEN ELEMENT IS" test checks whether the specified element, or any of its ancestors, has a particular element name.

The "OPEN ELEMENT IS" test is equivalent to, but more efficient than, the "ANCESTOR IS" test combined with the "ELEMENT IS" test. The following examples are equivalent:

Example A

   DO WHEN  OPEN ELEMENT IS chapter
      ...
   DONE

Example B

   DO WHEN  ANCESTOR IS chapter
         | ELEMENT IS chapter
      ...
   DONE

Example C

   DO WHEN PREPARENT IS chapter
         | PARENT IS chapter
         | ELEMENT IS chapter
      ...
   DONE

14.1.2.2 Testing Recently Closed Elements

OmniMark also permits testing recently closed elements.

14.1.2.2.1 Testing the Identity of the Preceding Proper Element

Syntax

   PREVIOUS element-qualifier* (IS | ISNT)
      (element-name | element-name-list)

The "PREVIOUS IS" test succeeds if the qualified element is not the first subelement of its parent, and if the previous subelement of the parent has a specified element name. Otherwise, it fails.

For instance,

   DO WHEN PREVIOUS IS par
      ...
   DONE

tests if the current element follows a paragraph element.

The "PREVIOUS IS" test ignores inclusions and data content.

14.1.2.2.2 Testing the Identity of the Last Subelement

Syntax

   LAST PROPER? SUBELEMENT element-qualifier*
      (IS | ISNT) (element-name | element-name-list)

The "LAST SUBELEMENT IS" test succeeds if the most recently closed subelement of the qualified element has a specified element name.

When the keyword PROPER is specified, the "LAST SUBELEMENT IS" test ignores included elements. If all of the subelements of the referenced element have been included elements, then the test will fail.

When the "LAST SUBELEMENT IS" test is applied to the current element from an ELEMENT rule, it fails if the content has not yet been processed.
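
One way to apply the test to the current element is therefore to place it after the content has been processed. The following sketch assumes hypothetical element names section and list, and assumes that the test can succeed once "%c" has been output, as the paragraph above implies:

   ELEMENT section
      ; Placed here, before "%c", the test would always fail.
      OUTPUT "%c"
      ; The content has now been processed.
      OUTPUT "[section ended with a list]%n"
         WHEN LAST SUBELEMENT IS list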

When invoked on ancestor elements, the "LAST SUBELEMENT IS" test is related to the "PREVIOUS IS" test. The following two examples are equivalent:

Example A

   DO WHEN PREVIOUS IS par
      ...
   DONE

Example B

   DO WHEN LAST PROPER SUBELEMENT OF PARENT IS par
      ...
   DONE

14.1.2.3 Data Content Tests

OmniMark also provides a contextual test that can test for the presence of data content as well as specific elements.

14.1.2.3.1 Testing the Last Content or Subelement

Syntax

   LAST PROPER? CONTENT element-qualifier* (IS | ISNT)
      (#DATA | element-name | content-identifier-list)

The "LAST CONTENT IS" test succeeds if the last content of the specified element was one of the specified elements or, when #DATA is specified, if the last content was data content.

When the keyword PROPER is specified, the "LAST CONTENT IS" test ignores included elements. If all of the subelements of the referenced element have been included elements, and if there was no data content in the element, then the test will fail.

The following example precedes the content of the keyword subelement with a slash ("/") only if it is preceded in its containing element by data content or an entity reference:

   ELEMENT keyword
      OUTPUT "/" WHEN LAST CONTENT OF PARENT IS #DATA
      OUTPUT "%sc"

Note that at the start of an ELEMENT rule, there is no "LAST CONTENT" of that element, so any test for it will fail.

The test "LAST CONTENT IS #DATA" is true in a DATA-CONTENT rule only if the data content is immediately preceded either by a non-SGML external entity reference or by one or more processing instructions which are in turn preceded either by data content or by a non-SGML external entity reference. It never applies to the content processed by the DATA-CONTENT rule.

In the following example, the test succeeds only if the text processed by the rule is itself preceded by other data content or an entity reference. It does not succeed just because the data content of this rule has been processed prior to the test:

   DATA-CONTENT
      OUTPUT "%c"
      OUTPUT " (again)" WHEN LAST CONTENT IS #DATA

14.1.2.4 Testing the Status of Proper and Included Elements

There are three ways to classify elements based on why they are permitted at a given point in the document.

  1. The document element is the outermost element of the document instance. It is specified in the DTD.
  2. A proper subelement is one which is permitted at the current point by a content model group in the declaration of the parent element.
  3. An included subelement is one which is permitted by an inclusion exception in an open element.

OmniMark provides the STATUS test to determine why an element was allowed:

Syntax

   STATUS OF LAST SUBELEMENT? element-qualifier*
       (IS | ISNT) (PROPER | INCLUSION)

The STATUS of the specified element is

These are opposites. "STATUS IS PROPER" is equivalent to "STATUS ISNT INCLUSION", and "STATUS IS INCLUSION" is equivalent to "STATUS ISNT PROPER".

If the "OF LAST SUBELEMENT" phrase is used, then the test is performed on the last subelement of the specified element. Otherwise, the test is performed on the specified element itself.

The following test succeeds only if the subelement immediately preceding the current element was an inclusion:

   OUTPUT "\par %n" WHEN STATUS OF LAST SUBELEMENT OF PARENT
          IS INCLUSION

If the specified element does not exist, then the test fails.

14.1.2.5 Counting Elements

There are two ways to check the position of an element with respect to the surrounding elements:

14.1.2.5.1 Counting Repetitions of the Current Element

Syntax

   OCCURRENCE element-qualifier*

The OCCURRENCE operator returns the number of consecutive subelements of the parent of the current element that have the same element name as the current element. If OCCURRENCE is followed by an element-qualifier, the indicated element is tested instead of the current element.

Inclusions are counted as well as proper subelements.

The first subelement of a parent, any subelement that is not of the same element type as its immediately previous sibling, and a subelement that immediately follows data content in its parent element will always have an OCCURRENCE count of one (1).
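
A common use is numbering a run of like elements. The sketch below assumes a hypothetical element name item and a local counter named n. As described above, the numbering restarts whenever the run is interrupted by data content or by an element of another type:

   ELEMENT item
      LOCAL COUNTER n
      ; Position of this item within the current run of items.
      SET n TO OCCURRENCE
      OUTPUT "%d(n). %c%n"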

14.1.2.5.2 Counting Subelements

Syntax

   CHILDREN element-qualifier*

When an element-qualifier is not given, the CHILDREN operator returns the number of subelements of the current element. If used before the content of the element has been processed, it will always return zero.

When an element-qualifier is given, the children of the indicated element are counted, including the currently open child of the indicated element and all its preceding subelements.

Inclusions are counted as well as proper subelements.
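
Since the unqualified count is zero until the content has been processed, a total is only meaningful after "%c". A minimal sketch, assuming a hypothetical element name list and a local counter named n:

   ELEMENT list
      LOCAL COUNTER n
      OUTPUT "%c"
      ; All subelements of list have now been processed.
      SET n TO CHILDREN
      OUTPUT "(%d(n) subelements)%n"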

The following condition will succeed if the element in which the test occurs is the first element in its parent:

   DO WHEN CHILDREN OF PARENT = 1
      ...
   DONE

The following condition will succeed if the element is the first in a series of like elements, but not the first subelement of its parent:

   DO WHEN OCCURRENCE = 1
       & CHILDREN OF PARENT > 1
      ...
   DONE

14.1.2.6 Element Declaration Tests

OmniMark also allows the programmer to retrieve information from the Document Type Declaration about elements that have been encountered in the document instance.

14.1.2.6.1 Testing the Declared Content of an Element

Syntax

   CONTENT element-qualifier* (IS | ISNT)
      (content-type | content-type-list)

"CONTENT IS" tests the declared content type of the element.

The content-type is one of the following:

A content-type-list is a list of content-types separated by OR or "|":

Syntax

   ( content-type ((OR | |) content-type)* )

If element-qualifiers are specified, the test applies to the qualified element, not the current element.
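
A content-type-list lets one branch handle several declared content types. The following sketch uses only content types that appear elsewhere in this section; the bracketed labels in the output are illustrative:

   ELEMENT #IMPLIED
      ; One branch covers both kinds of character data content.
      DO WHEN CONTENT IS (CDATA | RCDATA)
         OUTPUT "[character data] %c"
      ELSE WHEN CONTENT IS EMPTY
         OUTPUT "[empty]%c"
      ELSE
         OUTPUT "%c"
      DONE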

The following example shows one way of analyzing every element in a document. For the following input:

   <!DOCTYPE a
   [
    <!ELEMENT a    - - (b1|b2|b3)+      -- content type: ELEMENT -->
    <!ELEMENT b1   - O EMPTY            -- EMPTY -->
    <!ELEMENT b2   - O CDATA            -- CDATA -->
    <!ELEMENT b3   - O (c|d|e|#PCDATA)* -- MIXED -->
    <!ELEMENT c    - O RCDATA           -- RCDATA -->
    <!ELEMENT d    - O ANY              -- ANY -->
    <!ELEMENT e    - O (#PCDATA)        -- MIXED (CONREF) -->

    <!ATTLIST e con CDATA #CONREF>
   ]>
   <a>
   <b1>
   <b2>Some text for b2</b2>
   <b3>
   <c>Some text for c</c>
   <d>Text for d
   <b1>
   <b2>b2 text inside d</b2>
   </d>
   <e>e without the conref
   <e con="e with the conref">
   </a>

the following program prints an analysis:

   DOWN-TRANSLATE

   ELEMENT #IMPLIED
      OUTPUT "%n<%q>: "
      DO WHEN CONTENT IS ELEMENT
        OUTPUT "element"
      ELSE WHEN CONTENT IS ANY
        OUTPUT "any"
      ELSE WHEN CONTENT IS MIXED
        OUTPUT "mixed"
      ELSE WHEN CONTENT IS CDATA
        OUTPUT "cdata"
      ELSE WHEN CONTENT IS RCDATA
        OUTPUT "rcdata"
      ELSE WHEN CONTENT IS EMPTY
        OUTPUT "empty"
      DONE
      OUTPUT " (conref)" WHEN CONTENT IS CONREF
      OUTPUT "%n"

      ; Finally, process the content:
      OUTPUT "%c"

Note that:

14.1.2.6.2 Testing for Short Reference Maps

Syntax

   USEMAP element-qualifiers* (#NONE | #EMPTY | usemap-name)
      ( | (usemap-name | #NONE | #EMPTY))*

The usemap-name is the name of a USEMAP declaration in the DTD.

Elements which have had short reference maps associated with them by USEMAP declarations can be identified by the "USEMAP IS" test. For example:

   OUTPUT "<!USEMAP #EMPTY>" UNLESS USEMAP IS (#NONE | #EMPTY)

The USEMAP test checks whether the short reference map associated with the currently opened element (or with the element specified by the element qualifier following the keyword USEMAP) is one of the maps named in the test.

The test succeeds:

Knowing whether or not an element type has an associated USEMAP is important if the OmniMark program is creating an SGML document from an input SGML document: the USEMAP may affect the way short reference delimiters are interpreted. When "normalizing" a document, it is usually safest to issue a "<!USEMAP #EMPTY>" declaration at the start of any element with which a USEMAP declaration for a map other than #EMPTY has been associated in the DTD.

This test is for detecting the association of USEMAPs with elements in the DTD, not in the document instance.

If the element-qualifier specifies an element which does not exist, then the test fails.

Short reference map names are subject to NAMELEN and NAMECASE GENERAL.

The following syntactic variations are permitted:

14.1.2.7 Combining SGML Enquiry and Comparison

In versions of OmniMark prior to V3, certain SGML enquiries fail rather than abort when their component enquiries fail. For example, in the following, if there is no ANCESTOR chapter, then the test fails, and the "=" test is never done:

   LOCAL COUNTER n
   ...
   DO WHEN OCCURRENCE OF ANCESTOR chapter = n
      ...
   DONE

Contrast this case with the following, in which the test is in error if there are fewer than seven items on the word-count shelf:

   LOCAL COUNTER word-count
   LOCAL COUNTER n
   ...
   DO WHEN word-count @ 7 = n
      ...
   DONE

Or the following:

   LOCAL COUNTER n
   ...
   DO WHEN n = OCCURRENCE OF ANCESTOR chapter
      ...
   DONE

which would both seem to be similar to the first case in their intended effect, but which are both in error if there is no ANCESTOR chapter.

OmniMark V3 standardizes these cases so that SGML enquiries of the first type are now also in error when the identified element does not exist. This new interpretation means that all of the following are equivalent in their behaviour:

   WHEN OCCURRENCE OF ANCESTOR chapter = n
   WHEN n = OCCURRENCE OF ANCESTOR chapter
   WHEN (OCCURRENCE OF ANCESTOR chapter) = n
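
Under the V3 interpretation, a program that cannot be sure the ancestor exists can test for it first. The following is a minimal sketch, assuming hypothetical element names par and chapter. The enquiry is nested inside the existence test so that it is only evaluated when a chapter ancestor is open:

   ELEMENT par
      ; Check that the ancestor exists before enquiring about it.
      DO WHEN ANCESTOR IS chapter
         OUTPUT "[first chapter] "
            WHEN OCCURRENCE OF ANCESTOR chapter = 1
      DONE
      OUTPUT "%c"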

14.1.3 Retrieving the Names of Open Elements

The name of the current element can be obtained using the format item "%q" everywhere except in an EXTERNAL-TEXT-ENTITY or EXTERNAL-DATA-ENTITY rule. In those rules, the "%q" format item yields the name of the entity being processed.

The names of other elements can be obtained using the operator "NAME OF".

14.1.3.1 Formatting Element Names

Syntax

   % format-modifier* q

The "%q" format item refers to the currently opened element everywhere except in EXTERNAL-TEXT-ENTITY and EXTERNAL-DATA-ENTITY rules. In functions, even if the function is called from an EXTERNAL-TEXT-ENTITY or EXTERNAL-DATA-ENTITY rule, "%q" still refers to the currently opened element. This is to ensure that a function always behaves the same regardless of what rule it is called from.

The use of the "%q" format item in EXTERNAL-DATA-ENTITY or EXTERNAL-TEXT-ENTITY rules is described in Section 14.2.1, "Formatting Entity Names".

When referring to an element, the "%q" format can have the following modifiers:

14.1.3.2 Element Names

The "NAME OF" operator can be used to provide the name of any opened element. It can be used in a number of different ways:

The element qualifier that follows "NAME OF" must identify a currently opened element. Otherwise, it is an error.
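
For example, a rule can avoid the error by checking the element depth before asking for the name of the parent. This sketch assumes that PARENT alone is accepted as the element qualifier after "NAME OF", and uses a hypothetical element name title:

   ELEMENT title
      ; The document element has no parent, so guard the enquiry.
      DO WHEN NUMBER OF CURRENT ELEMENTS > 1
         OUTPUT "["
         OUTPUT NAME OF PARENT
         OUTPUT "] "
      DONE
      OUTPUT "%c%n"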

14.1.3.3 The Document Element Name

The form "NAME OF DOCTYPE" will, within a document instance, return the name of the document element (i.e. the topmost element). As described in Section 16.4.1, "The Public Identifier at the Start of the DTD", it is often of interest to know the name of the document element outside of the instance, in particular when processing the external identifier at the start of the DTD. For this purpose, the #DOCTYPE stream is provided. It contains the name of the document element, even outside of the document instance.

The #DOCTYPE stream is "attached" as soon as OmniMark encounters the document element name at the start of the DTD, following the DOCTYPE keyword. Prior to that point, the #DOCTYPE stream is "unattached". The "STREAM #DOCTYPE IS ATTACHED" test can be used to distinguish whether or not the document element name is available.

The #DOCTYPE stream is "read-only". Its value cannot be changed by an OmniMark program, nor can the #DOCTYPE "stream shelf" be cleared or added to (with NEW).

14.1.4 The Current Element Stack

At any one point in processing an SGML document instance there are one or more elements open, starting with the document element. "CURRENT ELEMENTS" provides shelf-like access to all of these elements and their attributes.

Like a shelf, the "CURRENT ELEMENTS" stack is an ordered set of things (elements) with names.

It differs from a shelf in the following ways:

As a consequence of these differences between "CURRENT ELEMENTS" and other shelf-like things, different terminology is used. The relationship qualifiers are used to select an element instead of keys, the "NAME OF" operation is used instead of "KEY OF", and a variant of "NUMBER OF" is used instead of "ITEM OF".

"CURRENT ELEMENTS" makes a number of things easy to do:

14.1.4.1 Determining Element Depth

Syntax

   NUMBER OF CURRENT ELEMENTS element-qualifier*

The number of currently opened elements, including the currently opened element (if any), is available using "NUMBER OF CURRENT ELEMENTS":

   ELEMENT #IMPLIED
      LOCAL COUNTER depth
      SET depth TO NUMBER OF CURRENT ELEMENTS
      OUTPUT "Element %q is nested %d(depth) deep.%n"
      OUTPUT "%c"

Similarly, the number of opened elements down to and including a specified opened element can be determined using "NUMBER OF CURRENT ELEMENTS" together with an element-qualifier. For example,

   NUMBER OF CURRENT ELEMENTS OF ANCESTOR chapter

is the "depth" of the most recently opened (and still open) chapter element, starting with the document element as having depth 1 (one). "NUMBER OF CURRENT ELEMENTS OF DOCTYPE" is always 1 when there are any opened elements.

If "NUMBER OF CURRENT ELEMENTS" is followed by OF and an element-qualifier, then that qualifier must identify a currently opened element. On the other hand, "NUMBER OF CURRENT ELEMENTS" can always be used without a qualifier. If there are no opened elements, then "NUMBER OF CURRENT ELEMENTS" without a following element-qualifier has a value of zero. This happens, for example, if the following macro is used in a DOCUMENT-START or DOCUMENT-END rule:

   MACRO where am I IS
      DO WHEN NUMBER OF CURRENT ELEMENTS = 0
         OUTPUT
         "%"Where am I%" was called outside the document element.%n"
      ELSE
         OUTPUT
         "%"Where am I%" was called inside the document element.%n"
      DONE
   MACRO-END

14.1.4.2 Iterating Over the Opened Element Context

Syntax

   REPEAT OVER REVERSED? CURRENT ELEMENTS element-qualifier*
         AS alias-name
      local-declaration*
      action*
   AGAIN

The opened elements can be iterated over and information about them accessed using a new form of the "REPEAT OVER" action. For example, the following action lists the currently opened elements, document element first, together with each of the element's non-implied attribute values (listed following each element name, indented by three spaces):

   REPEAT OVER CURRENT ELEMENTS AS this-element
      OUTPUT NAME OF CURRENT ELEMENT this-element
      OUTPUT "%n"
      REPEAT OVER ATTRIBUTES OF CURRENT ELEMENT this-element
                  AS this-attribute
         DO UNLESS ATTRIBUTE this-attribute IS IMPLIED
            OUTPUT "   "
            OUTPUT KEY OF ATTRIBUTE this-attribute
            OUTPUT " = %"%v(this-attribute)%"%n"
         DONE
      AGAIN
   AGAIN

Within a "REPEAT OVER CURRENT ELEMENTS" action, "CURRENT ELEMENT" alias-name (as in "CURRENT ELEMENT" this-element) identifies the element selected by the current iteration. ELEMENT alone always identifies the currently opened element, and not the current item of the iteration.

"REPEAT OVER CURRENT ELEMENTS" must define an "alias" for the opened element selected on each iteration. The form "CURRENT ELEMENT" followed by the alias name is used to identify any reference to the selected element (see Section 14.1.4.3, "Element Alias Names"). (This technique must also be used when repeating over the attributes of an element.)

If the OmniMark program is to process the opened elements in most-recently-opened-first order, it can use the REVERSED option: "REPEAT OVER REVERSED CURRENT ELEMENTS".

The element selected by the current iteration is always identified by "CURRENT ELEMENT" followed by the element alias name given in the head of the "REPEAT OVER" action. (There can be more than one such element, because a single "REPEAT OVER" can be applied to both "CURRENT ELEMENTS" and "REVERSED CURRENT ELEMENTS" simultaneously.)

A "%q" format item, references to attributes, and element tests are not affected by a "REPEAT OVER CURRENT ELEMENTS" action. Within the loop, these apply to the same element as they would outside of it. In other words, a "%q" in such a "REPEAT OVER" action does not give the name of the element selected by the current iteration, but rather that of the most recently opened element.
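
The contrast can be seen in a rule that prints the full element path. In this sketch, which assumes an alias named this-element, "NAME OF CURRENT ELEMENT" follows the iteration, while "%q" names the most recently opened element throughout:

   ELEMENT #IMPLIED
      ; Document element first, innermost element last.
      REPEAT OVER CURRENT ELEMENTS AS this-element
         OUTPUT "/"
         OUTPUT NAME OF CURRENT ELEMENT this-element
      AGAIN
      ; "%q" is unaffected by the loop above.
      OUTPUT " (innermost: %q)%n"
      OUTPUT "%c"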

14.1.4.3 Element Alias Names

The alias-name must be used with "REPEAT OVER CURRENT ELEMENTS". The alias-name is used to name the element selected by each iteration over the set of currently opened elements. "REPEAT OVER CURRENT ELEMENTS" is like "REPEAT OVER ATTRIBUTES" or "REPEAT OVER DATA-ATTRIBUTES" in requiring an alias-name.

An element alias-name can only be used as an "element name" following the keyword "CURRENT ELEMENT".

"CURRENT ELEMENT" alias-name can be used in a number of contexts within a "REPEAT OVER CURRENT ELEMENTS" loop:

It is possible to use a name as both an element alias-name and as a "real" element name. If such a name is used in any context other than immediately following the keyword "CURRENT ELEMENT", then it refers to the element with that name and not to the alias-name.

Element alias names are subject to the setting of the "GENERAL NAMECASE" declaration in the same way as all other element names, even though they are not, in a strict sense, SGML element names.

14.1.4.4 Buffering Input to the SGML Parser

If a "REPEAT OVER" action in the input processor uses "CURRENT ELEMENTS" then all the text written to the #SGML stream within that "REPEAT OVER" action is "buffered": none of the text is actually passed to OmniMark's built-in SGML parser until after the end of the "REPEAT OVER" action. So, for example, in the following (assuming it is performed in the input processor) all of the currently opened elements will be closed, but only after the end of the "REPEAT OVER":

   REPEAT OVER REVERSED CURRENT ELEMENTS AS this-element
      OUTPUT "</"
      OUTPUT NAME OF CURRENT ELEMENT this-element
      OUTPUT ">"
   AGAIN

Similarly, if a "REPEAT OVER" action in the input processor iterates over a set of attributes or over the tokens of an attribute, or if an attribute or attribute token is identified by a USING prefix (ATTRIBUTE, ATTRIBUTES, DATA-ATTRIBUTES or "USING ATTRIBUTE") then all text written to the #SGML stream within that "REPEAT OVER" action is "buffered" in the same manner as for "CURRENT ELEMENTS".

A "USING ATTRIBUTE" or "USING ATTRIBUTES" prefix will also cause input to the SGML parser to be buffered until the action to which it applies completes.

Buffering #SGML stream text ensures that the current elements don't change in the middle of a "REPEAT OVER CURRENT ELEMENTS".


14.2 Entities

Entities are an important component in SGML document instances. They are used for a number of different purposes:

OmniMark processes different kinds of entities in different ways:

This section documents the operations that can be applied to entity references:

14.2.1 Formatting Entity Names

In EXTERNAL-DATA-ENTITY and EXTERNAL-TEXT-ENTITY rules, the "%q" format item can be used to return the name of the entity currently being processed.

Syntax

   % format-modifier* q

The "%q" format item only refers to the current entity in the actions of an EXTERNAL-DATA-ENTITY or EXTERNAL-TEXT-ENTITY rule. It refers to the current element everywhere else, including functions which are called from the EXTERNAL-DATA-ENTITY or EXTERNAL-TEXT-ENTITY rule.

When referring to an entity, the "%q" format can have the following modifiers:

The "%q" format item also has modifiers that cause it to return other information about the current entity:

These modifiers can be combined as follows:

If an entity has no system identifier, then the "e" format modifier acts like "ep".

If an entity has no public identifier, or if the program has no LIBRARY rule to associate a system identifier with the entity's public identifier, then it is an error to use the "ep" format modifier combination. If such an entity also does not declare a system identifier in the entity declaration, then it is also an error to use the "e" format modifier alone.

The same observation applies to the system identifier of the entity's notation when using the above format modifiers in combination with the "o" format modifier.

All of the above combinations may be further combined with the "l" or "u" format modifiers. Additionally, the "o" format modifier can also be combined with the "f" and the "k" format modifiers, provided it is not also combined with the "e" or "p" modifiers.

The "f" and the "k" format modifiers can only be used with entity names and notation names.

14.2.2 Entity Tests

Several tests can be applied to entities.

The entity tests are:

14.2.2.1 INTERNAL Entities

Syntax

   ENTITY (IS | ISNT) INTERNAL

The "IS INTERNAL" test succeeds if the entity is an internal entity.

The "IS INTERNAL" test is primarily useful when testing the entities named by the values of ENTITY or ENTITIES attributes. (See Section 14.4.3.3, "Entity and Notation Attribute Tests".) This test will always be false in an EXTERNAL-TEXT-ENTITY or EXTERNAL-DATA-ENTITY rule.

14.2.2.2 Entities That Are External

Syntax

   ENTITY (IS | ISNT)  EXTERNAL

"IS EXTERNAL" succeeds only if the entity is an external entity.

The "IS EXTERNAL" test is primarily useful when testing the entities named by the values of ENTITY or ENTITIES attributes. (See Section 14.4.3.3, "Entity and Notation Attribute Tests".) This test will always be true in an EXTERNAL-TEXT-ENTITY or EXTERNAL-DATA-ENTITY rule.

14.2.2.3 External Entities With Public Identifiers

Syntax

   ENTITY (IS | ISNT)  PUBLIC

"IS PUBLIC" succeeds only if the entity is an external entity and was declared with a public identifier.

14.2.2.4 External Entities With System Identifiers

Syntax

   ENTITY (IS | ISNT)  SYSTEM

"IS SYSTEM" succeeds only if the entity is an external entity and was declared with a system identifier.

14.2.2.5 External Entities Defined In A Library Declaration

Syntax

   ENTITY (IS | ISNT)  IN-LIBRARY

The LIBRARY declaration explained in Section 19.1.4.1, "Mapping Public Ids To System Ids" associates system identifiers (usually file names) with public identifiers. The IN-LIBRARY test is used to determine whether the program contains LIBRARY rules for the specified entity.

"IS IN-LIBRARY" succeeds only if the entity is an external entity with a public identifier that is mapped to a system identifier in an OmniMark LIBRARY declaration.

14.2.2.6 CDATA Entities

Syntax

   ENTITY (IS | ISNT) CDATA-ENTITY

The "IS CDATA-ENTITY" test succeeds if the entity is an external or internal CDATA entity.

14.2.2.7 SDATA Entities

Syntax

   ENTITY (IS | ISNT) SDATA-ENTITY

The "IS SDATA-ENTITY" test succeeds if the entity is an external or internal SDATA entity.

14.2.2.8 NDATA Entities

Syntax

   ENTITY (IS | ISNT) NDATA-ENTITY

The "IS NDATA-ENTITY" test succeeds if the entity is an external NDATA entity.

14.2.2.9 SUBDOC Entities

Syntax

   ENTITY (IS | ISNT) SUBDOC-ENTITY

The "IS SUBDOC-ENTITY" test succeeds if the entity is an external subdocument entity.

14.2.2.10 #DEFAULT Entities

Syntax

   ENTITY (IS | ISNT) DEFAULT-ENTITY

An SGML DTD can contain a declaration for the default general entity. For example:

   <!ENTITY #DEFAULT SYSTEM "default.txt">

Any general entity reference that contains a name that was not defined as a general entity in the DTD is "rerouted" to the default general entity. For the example above, any reference to an undefined entity would get an external text entity with a system identifier of "default.txt". The entity names in ENTITY and ENTITIES attribute values are also satisfied using the default general entity.

If there is no default general entity, a general entity reference containing an undefined name is an error. An undefined parameter entity is always an error: there is no such thing as the "default parameter entity".

The default general entity can be any type of general entity: internal or external, text or non-SGML. It is the job of the DTD designer and the creators of SGML documents to ensure that whenever an undefined general entity is referenced or its name is used in an ENTITY or ENTITIES attribute value, the default general entity is of the appropriate type. For example, if an undefined entity name is used in an ENTITY or ENTITIES attribute value, the default general entity must be a non-SGML entity (CDATA, SDATA, NDATA or SUBDOC) and not a text entity, or an error will result.

The OmniMark program can determine that the default general entity is used by using the "ENTITY IS DEFAULT-ENTITY" test. The OmniMark program may provide some special processing for entities resolved using the default general entity, or may just list undefined entities.

The following example will always output the text "[DEFAULT]" for references to an undefined general entity, independently of any public or system identifier the default general entity may have, so long as the default general entity is an external text entity. In addition, it displays a message on the error output indicating which undefined entities are referenced in the document:

   EXTERNAL-TEXT-ENTITY #IMPLIED WHEN ENTITY IS DEFAULT-ENTITY
      OUTPUT "[DEFAULT]"
      PUT #ERROR "General entity %q is undefined.%n"

It is an error if the referenced entity was not declared.

14.2.2.11 Additional Tests in EXTERNAL-TEXT-ENTITY Rules

External text entities can be either general entities (introduced by "&") or parameter entities (introduced by "%"). EXTERNAL-TEXT-ENTITY rules are processed for both general and parameter external text entities, so the OmniMark programmer has to be ready to handle both kinds of entities.

If an entity has a system identifier or a public identifier (or both) it usually doesn't much matter whether it is a general entity or parameter entity; the rules for how to find the information referenced by the system identifier are usually the same.

However, if only an entity name is provided, as in the following two declarations, and the OmniMark program is written to use the entity name as part of a file name, for example, the program may wish to do so differently for general and parameter entities.

   <!ENTITY chapter1 SYSTEM -- text of the first chapter -->
   <!ENTITY % comdcl SYSTEM -- common declarations -->

There are two tests that can be used in an EXTERNAL-TEXT-ENTITY rule that allow for distinguishing between general and parameter entities:

   ENTITY IS GENERAL
   ENTITY IS PARAMETER

One or the other of these two tests is always true of an entity in an EXTERNAL-TEXT-ENTITY rule, so "ENTITY IS GENERAL" is equivalent to "ENTITY ISNT PARAMETER" and "ENTITY IS PARAMETER" is equivalent to "ENTITY ISNT GENERAL".

These tests can also be used in the EXTERNAL-DATA-ENTITY rule, or with ENTITY or ENTITIES attribute values. However, in both these cases, the entity is always a general entity.

An entity manager designer should note that, in general, parameter entity references usually occur in the DTD and general entity references usually occur in the document instance. There are, however, two exceptions to this general rule:

In summary, both general and parameter external entity references can occur in both the DTD and the document instance.

The test:

Syntax

   ENTITY (IS | ISNT) GENERAL

returns TRUE if the external text entity being processed is a general entity.

The test:

Syntax

   ENTITY (IS | ISNT) PARAMETER

returns TRUE if the external text entity being processed is a parameter entity.
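
As a minimal sketch of how these two tests might be used when only an entity name is available, the following rule builds a file name from the entity name and chooses a different extension for general and parameter entities. The ".sgm" and ".dcl" extensions are assumptions made for this illustration only; they are not OmniMark conventions.

   EXTERNAL-TEXT-ENTITY #IMPLIED
      DO WHEN ENTITY IS GENERAL
         ; general entity: assumed to hold document text
         OUTPUT FILE "%q.sgm"
      ELSE
         ; parameter entity: assumed to hold declarations
         OUTPUT FILE "%q.dcl"
      DONE

Because one of the two tests is always true in an EXTERNAL-TEXT-ENTITY rule, the ELSE part is reached only for parameter entities.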

14.2.2.12 Combining Entity Tests

The above entity tests can be combined:

   EXTERNAL-TEXT-ENTITY #IMPLIED
      OUTPUT FILE "%eq"
          WHEN ENTITY IS (SYSTEM & EXTERNAL)

This example outputs the file identified by the entity's system identifier, but only if the entity is an external entity and has a system identifier.


14.3 Notations

Notation references have the syntax:

Syntax

   NOTATION 

Different aspects of the notation of the current external data entity being processed can be queried using the keyword NOTATION.

14.3.1 Notation Tests

Several tests can be applied to notations.

The notation tests are:

14.3.1.1 Notations With Public Identifiers

Syntax

   NOTATION (IS | ISNT)  PUBLIC

"IS PUBLIC" succeeds only if the notation was declared with a public identifier.

14.3.1.2 Notations With System Identifiers

Syntax

   NOTATION (IS | ISNT)  SYSTEM

"IS SYSTEM" succeeds only if the notation was declared with a system identifier.

14.3.1.3 Notations Defined In A Library Declaration

Syntax

   NOTATION (IS | ISNT)  IN-LIBRARY

The LIBRARY declaration, explained in Section 19.1.4.1, "Mapping Public Ids To System Ids", associates system identifiers (usually file names) with public identifiers. The IN-LIBRARY test is used to determine whether the program contains LIBRARY rules for the public identifier of the specified notation.

"IS IN-LIBRARY" succeeds only if the notation has a public identifier that is mapped to a system identifier in an OmniMark LIBRARY declaration.

14.3.1.4 Combining Notation Tests

The above notation tests can be combined:

   EXTERNAL-DATA-ENTITY #IMPLIED
      OUTPUT "%eoq"
          WHEN NOTATION IS (SYSTEM | IN-LIBRARY)

This example outputs the name of the file identified by the notation's system identifier, but only if the notation has a system identifier or has a public identifier mapped to a system identifier in a LIBRARY declaration.

The tests can be combined by either AND (or "&") or OR (or "|").
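
For comparison, a sketch of the AND form is shown here. It reports, on the error output, entities whose notation was declared with both a public and a system identifier. The message text is made up for this illustration.

   EXTERNAL-DATA-ENTITY #IMPLIED
      DO WHEN NOTATION IS (PUBLIC & SYSTEM)
         PUT #ERROR "Entity %q: notation has both identifiers.%n"
      DONE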

14.3.1.5 Comparing Notation Names

Syntax

   NOTATION (=|!=) (notation-name | notation-name-list)

The "=" test for a NOTATION tests if its name is one of those given as the notation-name or in the notation-name-list. The notation-names must be constant quoted strings or OmniMark names.

   LOCAL STREAM standard-prefix
   ...
   DO WHEN NOTATION = giff
      ...
   DONE

14.4 Attributes

Several OmniMark features allow the programmer to manipulate the attribute values that are a central aspect of the SGML language. There are two kinds of attributes:

OmniMark generally uses DATA-ATTRIBUTE to refer to data attributes, and ATTRIBUTE to refer to element attributes. The exceptions to this are:

In those contexts, either ATTRIBUTE or DATA-ATTRIBUTE can be used.

14.4.1 Attribute References

Attribute references have the syntax:

Syntax

   ATTRIBUTE attribute-name
       element-qualifier* ((ITEM | @) numeric-expression)?

Unlike programmer-defined data types, attribute references always require the ATTRIBUTE herald. This is because attributes are defined in the SGML document and not in the OmniMark program.

Attribute references are always treated as string expressions, even if the attribute was declared in the SGML document to be of type NUMBER. However, string expressions which contain a valid representation of a decimal number can be used anywhere where a numeric expression is permitted, so this interpretation places no restriction on the use of attributes.
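
The following sketch illustrates an attribute value being used where a numeric expression is expected. The element name list and the attribute name start are hypothetical, and the attribute is assumed to hold a decimal number. The guard avoids referencing the attribute when it has no value.

   ELEMENT list
      LOCAL COUNTER next-number
      DO WHEN ATTRIBUTE start IS SPECIFIED | ATTRIBUTE start IS DEFAULTED
         ; the string value of the attribute is used as a number
         SET next-number TO ATTRIBUTE start
         INCREMENT next-number
         OUTPUT "The second item is numbered %d(next-number).%n"
      DONE
      OUTPUT "%c"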

Attributes can always be further identified by following the attribute name with an element-qualifier. Element-qualifiers are described in Section 14.1.1, "Element Qualifiers" and further clarified in Section 14.1.4.3, "Element Alias Names". For example:

   ATTRIBUTE date OF ANCESTOR change

refers to the date attribute of an enclosing element whose element name is change.

Within the EXTERNAL-DATA-ENTITY rule (described in Section 16.2.1, "Processing External Data and Subdocument Entities"), unqualified attributes are data attributes; in other contexts, they are attributes of the current element.

The ITEM (or "@") indexer can be used if the attribute was declared as a list-valued attribute. (See Section 14.4.2, "List-Valued Attributes".) However, unlike shelves, for which the right-most value is selected unless indication is made otherwise, attributes do not have a "default" selected value: the whole attribute value is tested or output as a single unit if no index is specified.

ATTRIBUTE references provide the attribute value unmodified, except for:

In particular, TRANSLATE rule processing is not performed on the value of an attribute when it is referenced using the ATTRIBUTE herald. (This contrasts with references to an attribute value using the "%v" modifier. See Section 14.4.4, "Attribute Format Items".)

Attempting to use an ATTRIBUTE value in a string expression will cause an error:

14.4.1.1 Attribute Qualifiers

When using element-qualifiers in an attribute reference, the programmer should be aware that:

In the header and body of an EXTERNAL-DATA-ENTITY rule, all unqualified references to attributes actually refer to data attributes of the external entity being processed. In all other rules, they refer to element attributes (attributes of element start-tags). See Section 14.4.3.4, "Data Attributes Associated With Entity Attributes" for an explanation of how the DATA-ATTRIBUTE keyword can be used with qualifiers.

Unqualified references to attributes inside functions always refer to element attributes. In order to refer to the data attributes of an external entity being processed, the qualifier "OF ENTITY" must be specified.

Element qualifiers can themselves be qualified. Thus, constructs such as the following are permissible:

   ATTRIBUTE type OF PARENT OF ANCESTOR listitem

This last example refers to an attribute of the parent (presumably a list element) of an ancestor called listitem.

14.4.1.1.1 The Using Prefix and Attribute Values

Syntax

   USING ATTRIBUTE attribute-name element-qualifier*
      ((ITEM | @) numeric-expression)?

The USING prefix allows attributes to be referenced without repeating the element-qualifiers or the ITEM (or "@") indexer. It is also useful when using the "%v" format item on an attribute which does not belong to the current element.

In the action in the following rule, the qualifier "OF PARENT" is implied when the attribute chapno is named:

   ELEMENT section
     USING ATTRIBUTE chapno OF PARENT
       OUTPUT "%v(chapno).%d(sectno). %c%n"

Either the element-qualifier or the ITEM (or "@") indexer must be specified.

14.4.2 List-Valued Attributes

A list-valued attribute is one whose declaration is one of:

When the attribute is a list-valued attribute, a particular item in the list can be accessed with an ITEM (or "@") phrase. For example:

   ATTRIBUTE col-w @ 3

refers to the third item in the list of values specified for attribute col-w.

The selector must not be greater than the number of items in the attribute value. The number of items in a list-valued attribute can be determined by using the "NUMBER OF" operator described in Section 7.4.1, "Determining the Size of a Shelf".
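
As a sketch of how the selector can be kept within range, the following rule checks the number of items before selecting the third one. The element name table is hypothetical; col-w is the attribute from the example above.

   ELEMENT table
      DO WHEN NUMBER OF ATTRIBUTE col-w > 2
         ; safe: the list has at least three items
         OUTPUT "Third column width: "
         OUTPUT ATTRIBUTE col-w @ 3
         OUTPUT "%n"
      DONE
      OUTPUT "%c"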

Determining the system or public identifier or notation of an entity name used in an ENTITIES attribute requires the use of the ITEM (or "@") phrase.

List-valued attributes are different from shelves in the following ways:

14.4.2.1 Counting Tokens in a List-Valued Attribute

Syntax

   NUMBER OF ATTRIBUTE attribute-name element-qualifier*

"NUMBER OF ATTRIBUTE" returns the number of tokens in a list-valued attribute. When the attribute does not have a list-valued type, "NUMBER OF ATTRIBUTE" will always yield the value one (1).

14.4.2.2 Iterating Over List-Valued Attributes

Syntax

   REPEAT OVER ATTRIBUTE attribute-name element-qualifier*
      local-declaration*
      action*
   AGAIN

"REPEAT OVER ATTRIBUTE" iterates over the values in a list-valued attribute.

One or more list-valued attributes and/or shelves can be combined in a single "REPEAT OVER" when they each have the same number of values:

   ...
   ELEMENT e
      LOCAL COUNTER attribute-value-length VARIABLE
      ...
      REPEAT OVER attribute-value-length & ATTRIBUTE multi
         SET attribute-value-length TO LENGTH OF ATTRIBUTE multi
      AGAIN

The previous example initializes a COUNTER shelf with the lengths of the corresponding attribute values.
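
A simpler case, iterating over a single list-valued attribute, might look like the following sketch. The element name table and the attribute name col-w are hypothetical. Inside the loop, the attribute reference yields the current item rather than the whole value.

   ELEMENT table
      REPEAT OVER ATTRIBUTE col-w
         OUTPUT "Column width: "
         OUTPUT ATTRIBUTE col-w
         OUTPUT "%n"
      AGAIN
      OUTPUT "%c"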

14.4.3 Attribute Tests

14.4.3.1 Testing the Source of An Attribute Value

There are three operators for testing where the value of an attribute is set:

Unlike all other attribute references, these tests do not cause an error if the specified attribute does not exist or was not given a value.

If the element-qualifier references an element that does not exist, or if the specified attribute is not declared, then the IS form of the test always fails, and the ISNT form always succeeds.

Since the test does not actually use the value, it is not an error for the value to be unset.

The test:

Syntax

   ATTRIBUTE attribute-name element-qualifier*
      ((ITEM | @) numeric-expression)? (IS|ISNT) SPECIFIED

succeeds when the referenced:

The "IS SPECIFIED" attribute test fails otherwise. Using ISNT instead of IS reverses the result.

The test:

Syntax

   ATTRIBUTE attribute-name element-qualifier*
      ((ITEM | @) numeric-expression)? (IS|ISNT) DEFAULTED

succeeds when the referenced:

The "IS DEFAULTED" attribute test fails otherwise. Using ISNT instead of IS reverses the result.

The test:

Syntax

   ATTRIBUTE attribute-name element-qualifier*
      ((ITEM | @) numeric-expression)? (IS|ISNT) IMPLIED

succeeds when the referenced:

The "IS IMPLIED" attribute test fails otherwise. Using ISNT instead of IS reverses the result.
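
The three tests might be used together as in the following sketch, which reports where the value of an attribute came from. The element name para and the attribute name status are hypothetical. The sketch assumes the usual SGML reading of the terms: a specified value appears in the start tag, and a defaulted value comes from the attribute declaration.

   ELEMENT para
      DO WHEN ATTRIBUTE status IS SPECIFIED
         OUTPUT "[status given in the start tag: %v(status)]%n"
      ELSE
         DO WHEN ATTRIBUTE status IS DEFAULTED
            OUTPUT "[status taken from the declaration: %v(status)]%n"
         ELSE
            ; implied, or not declared for this element
            OUTPUT "[status has no value]%n"
         DONE
      DONE
      OUTPUT "%c"

The value is referenced only in the branches where one of the IS tests has confirmed that it exists.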

14.4.3.2 Attribute Type Tests

The type of an element attribute or data attribute can be tested for. An attribute can be declared with one of the following types:

or it can be declared as a name token group. (A name token group attribute can only have one of the values specified in the parenthesized list of names.)

If the specified attribute was not declared for the qualified element (or entity), or if the qualified element does not exist, then an error message is printed, and OmniMark halts. The error message can be avoided by using the "IS SPECIFIED", "IS DEFAULTED", or "IS IMPLIED" tests.

The type of the attribute can be tested for with the following operators:

Syntax

   ATTRIBUTE attribute-name element-qualifier*
      ((ITEM | @) numeric-expression)? (IS | ISNT)
         (CDATA | NAME | NAMES | NUMBER | NUMBERS |
          NMTOKEN | NMTOKENS | NUTOKEN | NUTOKENS |
          ID | IDREF | IDREFS | NOTATION | ENTITY |
          ENTITIES | GROUP)

This test succeeds if the attribute was declared with the given type. The "IS GROUP" test succeeds if the given attribute was declared to take its values from a name token group.

14.4.3.2.1 Combining Attribute Type Tests

Attribute type tests can be combined by separating the types with OR or "|" and parenthesizing them. For example, the following is legal:

   GLOBAL COUNTER id-uses
   ...
   DO WHEN ATTRIBUTE id IS (IDREF | IDREFS)
      REPEAT OVER ATTRIBUTE id
         DO WHEN id-uses HAS KEY ATTRIBUTE id
            INCREMENT id-uses ^ ATTRIBUTE id
         ELSE
            SET NEW id-uses ^ ATTRIBUTE id TO 1
         DONE
      AGAIN
   DONE

Note that, for a data attribute, a test of its type for ID, IDREF, IDREFS, NOTATION, ENTITY or ENTITIES will always fail, because those types of attributes cannot be associated with a notation.

14.4.3.3 Entity and Notation Attribute Tests

The notation and entity tests (described in Section 14.3, "Notations" and Section 14.2, "Entities") can be applied directly to attribute values which are declared as ENTITY or NOTATION, or directly to an item of an attribute value declared as ENTITIES.

   ...
   DO WHEN ATTRIBUTE IS (EXTERNAL & IN-LIBRARY)
      ...
   DONE

It is usually wise to test that the attribute value is an ENTITY, ENTITIES, or NOTATION attribute before applying the entity or notation tests.
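
A sketch of this precaution follows. The element name figure and the attribute name file are hypothetical. The declared type is tested first, and the entity tests are applied only when the attribute is known to be an ENTITY attribute.

   ELEMENT figure
      DO WHEN ATTRIBUTE file IS ENTITY
         DO WHEN ATTRIBUTE file IS (EXTERNAL & SYSTEM)
            ; external entity with a system identifier
            OUTPUT "Figure source: %ev(file)%n"
         DONE
      DONE
      OUTPUT "%c"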

14.4.3.4 Data Attributes Associated With Entity Attributes

In EXTERNAL-DATA-ENTITY rules, the attributes of the entity that triggered the rule are referred to using the ATTRIBUTE herald.

However, entities named in ENTITY and ENTITIES attributes may also have data attributes. To differentiate these attributes from the attributes that belong to the current element or external entity being processed, the data attributes that belong to an ENTITY or ENTITIES attribute item are heralded with the keyword DATA-ATTRIBUTE.

The keyword DATA-ATTRIBUTE can be used in the same way that ATTRIBUTE is used elsewhere. The data attribute name must always be followed by OF and, in parentheses, the identification of the attribute value or item containing the entity name. The syntax is:

Syntax

   DATA-ATTRIBUTE data-attribute-name  OF
      ( ATTRIBUTE attribute-name
         element-qualifier* ((ITEM | @) numeric-expression)? )
      ((ITEM | @) numeric-expression)?

The first optional ITEM (or "@") indexer (inside the parentheses) is associated with the attribute of the qualified element or the external entity currently being processed. It is this attribute which must be declared as ENTITY or ENTITIES.

The second optional ITEM (or "@") indexer (outside the parentheses) is associated with the data attribute named in the parenthesized attribute value item.

The parentheses eliminate the confusion between the two ITEM (or "@") indexers.

For example:

   OUTPUT DATA-ATTRIBUTE widths OF (ATTRIBUTE name) @ 1
   OUTPUT "%v(widths)"

finds the attribute named name in the current element, verifies that it is of type ENTITY, finds the data attribute widths associated with it, and prints out its first item.

Note that the ATTRIBUTE keyword is used in EXTERNAL-DATA-ENTITY rules to refer to a data attribute of the current entity. The DATA-ATTRIBUTE keyword is only used to refer to the data attributes of an ENTITY or ENTITIES attribute value.

For simplicity from this point on, the syntax of the DATA-ATTRIBUTE will be given as:

Syntax

   DATA-ATTRIBUTE data-attribute-name  OF
      ( attribute-reference )
      ((ITEM | @) numeric-expression)?

where attribute-reference is understood to mean an ATTRIBUTE reference with optional element-qualifiers and an optional ITEM (or "@") indexer.

14.4.3.4.1 Using Data Attribute References

DATA-ATTRIBUTE references can be used in any context that permits ATTRIBUTE references.

For example, if tableref is an element that references external entities containing tables, and name is an ENTITY attribute giving the name of the external entity, then the following will compute the number of columns in a table from the number of column widths entered in a list-valued attribute:

   ELEMENT tableref
      LOCAL COUNTER column-count
      ...
      SET column-count TO NUMBER OF
         DATA-ATTRIBUTE colwidth OF (ATTRIBUTE name)

Note that DATA-ATTRIBUTE references are also permitted inside the parentheses of other DATA-ATTRIBUTE references, allowing as many levels of indirection as necessary:

   ELEMENT tableref
      LOCAL COUNTER column-count
      ...
      SET column-count TO NUMBER OF
         DATA-ATTRIBUTE colwidth OF
             (DATA-ATTRIBUTE table OF (ATTRIBUTE name))

Iterating Over List-Valued Data Attributes

Syntax

   REPEAT OVER DATA-ATTRIBUTE attribute-name OF
         ( attribute-reference )
      local-declaration*
      action*
   AGAIN

The "REPEAT OVER" action can be applied to DATA-ATTRIBUTE references in exactly the same way as ATTRIBUTE references.

Inside the "REPEAT OVER", the data attribute being iterated over can either be referred to using the DATA-ATTRIBUTE herald, or the ATTRIBUTE herald.

For instance, the following are equivalent:

Example A

   REPEAT OVER DATA-ATTRIBUTE col-width OF (ATTRIBUTE table-ref)
      OUTPUT DATA-ATTRIBUTE col-width
   AGAIN

Example B

   REPEAT OVER DATA-ATTRIBUTE col-width OF (ATTRIBUTE table-ref)
      OUTPUT ATTRIBUTE col-width
   AGAIN

Example C

   REPEAT OVER DATA-ATTRIBUTE col-width OF (ATTRIBUTE table-ref)
      OUTPUT "%zv(col-width)"
   AGAIN

The "z" format modifier is used to turn off TRANSLATE rules to make the "%v" format item behave exactly like the ATTRIBUTE reference. (See Section 14.4.4, "Attribute Format Items".)

The Using Prefix and Data Attribute Values

Syntax

   USING DATA-ATTRIBUTE attribute-name OF
      ( attribute-reference )
      ((ITEM | @) numeric-expression)?

The USING prefix can also be applied to DATA-ATTRIBUTE references in exactly the same way as ATTRIBUTE references. In the action within the USING, the keyword ATTRIBUTE can be used to indicate the selected attribute instead of the keyword DATA-ATTRIBUTE.

The following examples are equivalent:

Example A

   OUTPUT DATA-ATTRIBUTE size OF (ATTRIBUTE id OF PARENT)

Example B

   USING DATA-ATTRIBUTE size OF (ATTRIBUTE id OF PARENT)
     OUTPUT DATA-ATTRIBUTE size

Example C

   USING DATA-ATTRIBUTE size OF (ATTRIBUTE id OF PARENT)
     OUTPUT ATTRIBUTE size

Example D

   USING DATA-ATTRIBUTE size OF (ATTRIBUTE id OF PARENT)
     OUTPUT "%zv(size)"

The "z" format modifier is used to turn off TRANSLATE rules to make the "%v" format item behave exactly like the ATTRIBUTE reference. (See Section 14.4.4, "Attribute Format Items".)

14.4.4 Attribute Format Items

This section describes the "%v" format item used to format attribute values. It also describes the different format modifiers that are available for different types of attribute values.

The same principles that apply to the "%q" format item also apply to the "%v" format item. It refers to the attributes of the currently opened element, except when used directly in the body of an EXTERNAL-DATA-ENTITY rule. There it refers to the entity's data attributes. Unlike "%q", "%v" in an EXTERNAL-TEXT-ENTITY rule refers to the currently opened element's attributes -- there are no such things as "text" attributes.

The USING prefix is helpful when an attribute of other than the current element or entity reference needs to be manipulated or displayed. It can also be used to select a particular token of a list-valued attribute.

14.4.4.1 Accessing Attribute Values

Syntax

   % format-modifier* v( attribute-name )

In ELEMENT rules, the named attribute must be an attribute of the element; in EXTERNAL-DATA-ENTITY rules it must be a data attribute of the entity being processed. In all other rules the named attribute must be an attribute of the containing element.

The following modifiers can always be used with the "%v" format.

14.4.4.2 "%v" Modifiers Specific to CDATA Attributes

If the attribute has a CDATA declared type the following modifiers can also be used:

14.4.4.3 "%v" Modifiers Specific to External ENTITY or ENTITIES Attributes

If the attribute's declared type is ENTITY or ENTITIES, and the entity name refers to an external entity, the following modifiers can also be specified:

These modifiers can be combined as follows:

If an entity has no system identifier, then "e" acts like "ep". It is an error if either "e" or "ep" is used and the entity has neither a system identifier nor a public identifier bound by a LIBRARY rule to a system identifier.

This format accesses letters within system and public identifiers in upper-case or lower-case as they appear in the entity declaration. Letters in element, entity, or notation names appear in upper-case or lower-case as they appear in the processed document unless the SGML Declaration specifies upper-case substitution for that class of name. If so, the name is accessed with letters forced to upper-case. Thus, in the Reference Concrete Syntax, by default, element and notation names appear in upper-case while entity names appear as entered in the document.

Only the "o" format modifier can be combined with the "f", "k", "u", or "l" format modifiers.

For an ENTITIES attribute, if the attribute value contains more than one entity name, the USING prefix, described in Section 14.4.1.1.1, "The Using Prefix and Attribute Values", must be used to select one entity whose system or public identifier is to be manipulated or displayed.

14.4.4.4 "%v" Modifiers Specific to Internal ENTITY or ENTITIES Attributes

If the value of an ENTITY or ENTITIES attribute is the name of an internal CDATA or SDATA entity then the "%ev" format can be used to determine the replacement text of the internal entity. (The only kinds of internal entities that can be used in an ENTITY or ENTITIES attribute are CDATA or SDATA.) See the example below for the different handling of external and internal entities.

In the following example, the element as-is has a single required ENTITY attribute text. The entity named by the attribute value simply provides the text that is to replace the element, wherever it occurs in a document.

   <!ELEMENT as-is - O EMPTY>
   <!ATTLIST as-is text ENTITY #REQUIRED>

The following ELEMENT rule for processing the as-is element does the following:

Example

   ELEMENT as-is
      DO WHEN ATTRIBUTE text IS ENTITY
         DO WHEN ATTRIBUTE text IS EXTERNAL
            OUTPUT FILE "%ev(text)"
         ELSE
            OUTPUT "%ev(text)"
         DONE
      DONE
      SUPPRESS

Note that "%ev" returns one of two things, depending on whether the entity named by the attribute to which it is applied is INTERNAL or EXTERNAL:

The EXTERNAL test, and other tests that can be used for attributes like text are described in Section 14.4.3.3, "Entity and Notation Attribute Tests". These tests allow the OmniMark program not only to distinguish between internal and external entities, but also whether an attribute is an ENTITY or ENTITIES attribute in the first place, and whether an internal or external entity is CDATA or SDATA.

14.4.4.5 "%v" Modifiers Specific to NOTATION Attributes

Some of the format modifiers available for ENTITY or ENTITIES attributes are also available for NOTATION attributes. Specifically, the following modifiers can be specified:

If a notation has no system identifier, then "e" acts like "ep". It is an error if either "e" or "ep" is used and the notation has neither a system identifier nor a public identifier bound by a LIBRARY rule to a system identifier.

These formats access letters within system and public identifiers in upper-case or lower-case as they appear in the notation declaration. Letters in element, entity, or notation names appear in upper-case or lower-case as they appear in the processed document unless the SGML Declaration specifies upper-case substitution for that class of name. If so, the name is accessed with letters forced to upper-case. Thus, in the Reference Concrete Syntax, by default, element and notation names appear in upper-case while entity names appear as entered in the document.

None of them can be used with the "f", "k", "l", or "u" format modifiers.

14.4.5 Attribute Sets

There are many types of applications where it is undesirable to reference each attribute by name. Some examples are:

For such applications, OmniMark provides the ATTRIBUTES object which allows all the declared attributes or specified attributes of an element or external entity to be treated as a shelf in the following way:

It differs from other shelves in:

References to an ATTRIBUTES shelf have the syntax:

Syntax

   SPECIFIED? ATTRIBUTES element-qualifier*

References to a DATA-ATTRIBUTES shelf have the syntax:

Syntax

   SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))?

The utility of the ATTRIBUTES shelf can be seen by the following OmniMark code fragment, which outputs "normalized" start and end tags around the content of the current element, with all specified attribute values included:

   ELEMENT #IMPLIED
      OUTPUT "<%q"
      REPEAT OVER SPECIFIED ATTRIBUTES AS this-attribute
         OUTPUT " "
             || KEY OF ATTRIBUTE this-attribute
             || "=%"%v(this-attribute)%""
      AGAIN
      OUTPUT ">" _
             "%c"
      OUTPUT "</%q>" WHEN CONTENT ISNT (EMPTY | CONREF)

This example can be used as a complete but simple OmniMark program that "normalizes" an SGML document. In practice, such a program will also need to:

A complete normalizer is distributed with OmniMark.

14.4.5.1 Indexing the ATTRIBUTES Shelf

Attributes can be accessed using KEY (or "^") and ITEM (or "@") indexers:

Syntax

   SPECIFIED? ATTRIBUTES
      (KEY | ^) string-expression

Syntax

   SPECIFIED? ATTRIBUTES
      (ITEM | @) numeric-expression

The key value used to index should always be in upper-case when NAMECASE GENERAL YES applies to the SGML document being processed (e.g. "IDENT" in the examples below). Unlike names in OmniMark programs, string values used as keys are not automatically upper-cased; doing so is the OmniMark programmer's responsibility. Upper-casing can be done by directly entering the appropriate values in upper-case or by using the "u" format modifier.

For example, the following actions output the same value:

Example A

   OUTPUT ATTRIBUTE ident

Example B

   OUTPUT ATTRIBUTES ^ "IDENT"

Example C

   LOCAL STREAM attribute-name

   SET attribute-name TO "ident"
   OUTPUT ATTRIBUTES ^ "%ug(attribute-name)"

The ATTRIBUTES shelf differs from other shelves in not having a "current item". So the following is always invalid:

   OUTPUT ATTRIBUTES

In addition, even though some of the items of the ATTRIBUTES shelf may be list-valued, "double indexing" is not allowed. The following is invalid:

   OUTPUT ATTRIBUTES ^ "IDENT" @ 1

The "double indexing" can be accomplished with the "USING ATTRIBUTES" prefix (Section 14.4.5.3, "Selecting an Item of the ATTRIBUTES and DATA-ATTRIBUTES Shelf"), which selects an attribute from the set so that a token can then be selected from the attribute value:

   USING ATTRIBUTES ^ "IDENT" AS ident
      OUTPUT ATTRIBUTE ident @ 1

DATA-ATTRIBUTES shelves can be accessed in the same way:

Syntax

   SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))?
      (KEY | ^) string-expression

Syntax

   SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))?
      (ITEM | @) numeric-expression

An item of an ATTRIBUTES or DATA-ATTRIBUTES shelf can also be used in the parenthesized portion of a DATA-ATTRIBUTES or DATA-ATTRIBUTE reference:

Example A

   ...
   OUTPUT DATA-ATTRIBUTE OF (ATTRIBUTES ^ "REF")

Example B

   ...
   OUTPUT DATA-ATTRIBUTE
      OF (DATA-ATTRIBUTES OF (ATTRIBUTES ^ "REF") @ 1)

14.4.5.1.1 Testing For Declared Attributes

Syntax

   SPECIFIED? ATTRIBUTES HAS KEY string-expression

"HAS KEY" can be applied to the attribute shelf to determine whether an element has an attribute declared for it. Note that just because an attribute is declared does not mean that it has a value:

   GLOBAL STREAM id-name
   ...
   OUTPUT ATTRIBUTES ^ id-name
       WHEN ATTRIBUTES HAS KEY id-name &
            ATTRIBUTES ^ id-name ISNT IMPLIED

The "HAS KEY" operator can be very useful when combined with tests that determine the declared type of an attribute. (Remember that those tests cause an error when the attribute that is being referenced was not declared for the specified element.) The following example normalizes an SGML document and adds id attributes to elements that can have them, depending on their type.

   DOWN-TRANSLATE

   GLOBAL COUNTER id-count

   ELEMENT #IMPLIED
      OUTPUT "<%q"
      DO WHEN ATTRIBUTES HAS KEY 'ID'
         DO WHEN ATTRIBUTE id IS CDATA
            OUTPUT " id='%q/%d(id-count)'"
         ELSE
            OUTPUT " id='%d(id-count)'"
         DONE
         INCREMENT id-count
      DONE
      OUTPUT ">%c"
      OUTPUT "</%q>" WHEN ELEMENT ISNT #EMPTY

The "HAS KEY" test is also useful in conjunction with attribute tests when the attribute name is known. The following two tests are equivalent:

Example A

   OUTPUT ATTRIBUTE ident
      WHEN ATTRIBUTES HAS KEY "IDENT" & ATTRIBUTE ident ISNT IMPLIED

Example B

   OUTPUT ATTRIBUTE ident
      WHEN ATTRIBUTE ident IS SPECIFIED | ATTRIBUTE ident IS DEFAULTED

"HAS KEY" can be used on the DATA-ATTRIBUTES shelf as well:

Syntax

   SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))?
      HAS KEY string-expression
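
As a sketch, the tableref example from Section 14.4.3.4.1 could be guarded so that the data attribute is referenced only when it has been declared for the entity. The key is given in upper-case on the assumption that NAMECASE GENERAL YES applies. Note that, as with element attributes, a declared data attribute does not necessarily have a value.

   ELEMENT tableref
      LOCAL COUNTER column-count
      DO WHEN DATA-ATTRIBUTES OF (ATTRIBUTE name) HAS KEY "COLWIDTH"
         SET column-count TO NUMBER OF
            DATA-ATTRIBUTE colwidth OF (ATTRIBUTE name)
      DONE
      OUTPUT "Columns: %d(column-count)%n%c"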

14.4.5.1.2 Getting The Name of an Attribute

Syntax

   KEY OF SPECIFIED? ATTRIBUTES
       element-qualifier* @ numeric-expression

Syntax

   KEY OF SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))? @ numeric-expression

Syntax

   KEY OF ATTRIBUTE attribute-name element-qualifier*

Syntax

   KEY OF DATA-ATTRIBUTE OF ( attribute-reference )

Each of the above forms returns the "true" name of the attribute. The name is upper-cased when NAMECASE GENERAL YES applies to the SGML document.

When "KEY OF" is applied to an item on the ATTRIBUTES or DATA-ATTRIBUTES shelf, the "@" (ITEM) indexer is required.

Because attributes are always considered to be part of a shelf of attributes, it always makes sense to ask for the "key" of an attribute, even if the ATTRIBUTE form is used rather than ATTRIBUTES.

This is most useful when retrieving the real name of an attribute that is being referenced through an alias. For example:

   REPEAT OVER ATTRIBUTES AS this-one
      DO WHEN ATTRIBUTE this-one ISNT IMPLIED
         OUTPUT KEY OF ATTRIBUTE this-one
         OUTPUT "='%v(this-one)'%n"
      DONE
   AGAIN

14.4.5.1.3 Getting The Position of an Attribute

Syntax

   ITEM OF SPECIFIED? ATTRIBUTES
       element-qualifier* ^ string-expression

Syntax

   ITEM OF SPECIFIED? DATA-ATTRIBUTES
       (OF ( attribute-reference ))? ^ string-expression

When "ITEM OF" is applied to an item on the ATTRIBUTES or DATA-ATTRIBUTES shelf, the "^" (KEY) indexer is required.

Just as "KEY OF" gives the name of an attribute, "ITEM OF" can be used to determine the position of an attribute on the shelf. When SPECIFIED is omitted, this is the order in which the attributes were declared: the first declared attribute has item number 1, the second 2, and so on.

For example, the following outputs a line of text only when the IDENT attribute is the first one declared in its ATTLIST:

   OUTPUT "IDENT is the first declared attribute%n"
          WHEN ITEM OF ATTRIBUTES ^ "IDENT" = 1

14.4.5.1.4 Getting The Number of Attributes

Syntax

   NUMBER OF SPECIFIED? ATTRIBUTES
       element-qualifier*

Syntax

   NUMBER OF SPECIFIED? DATA-ATTRIBUTES
      (OF ( attribute-reference ))?

The number of attributes declared for an element can be determined by applying "NUMBER OF" to ATTRIBUTES, as in:

   ...
      SET attribute-count TO NUMBER OF ATTRIBUTES
      OUTPUT "Element %q has %d(attribute-count) attribute(s).%n"
      SET attribute-count TO NUMBER OF ATTRIBUTES OF PARENT
      OUTPUT "Element %q's parent has %d(attribute-count) attribute(s).%n"

14.4.5.1.5 The Order of Attributes

The ATTRIBUTES and the DATA-ATTRIBUTES shelves are indexed in the following order:

The following gives the value of the first attribute declared for the current element, no matter what its name is or where its value is specified in the start tag. (It is an error if there are no attributes declared for the currently opened element, or if the first declared attribute has neither a default nor a specified value.)

   OUTPUT ATTRIBUTES @ 1

In the next case the value of the first attribute specified in the start tag is output, no matter what its declared order. (In this case it is an error if no attributes are specified in the start tag, even if there are declared attributes and they all have default values.)

   OUTPUT SPECIFIED ATTRIBUTES @ 1
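
Where it is not known whether the start tag specifies any attributes, the error case just described can be avoided by testing the size of the shelf first. The following is a sketch only; it relies on the "NUMBER OF" form described in Section 14.4.5.1.4, "Getting The Number of Attributes", and assumes the numeric comparison ">":

   OUTPUT SPECIFIED ATTRIBUTES @ 1
          WHEN NUMBER OF SPECIFIED ATTRIBUTES > 0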

The following examples can give different values, because the two shelves can contain different numbers of attributes, and the "SPECIFIED ATTRIBUTES" shelf can be in a different order.

Example A

   LOCAL COUNTER id-pos
   SET id-pos TO ITEM OF ATTRIBUTES ^ "POS"

Example B

   LOCAL COUNTER id-pos
   SET id-pos TO ITEM OF SPECIFIED ATTRIBUTES ^ "POS"

14.4.5.2 Iterating Over All of the Attributes

Syntax

   REPEAT OVER SPECIFIED? ATTRIBUTES
           element-qualifier* AS alias-name
      local-declaration*
      action*
   AGAIN

The "REPEAT OVER" action can be used to iterate over all the attributes:

For example:

   REPEAT OVER SPECIFIED ATTRIBUTES AS this-one
      OUTPUT KEY OF ATTRIBUTE this-one
      OUTPUT "='%v(this-one)'%n"
   AGAIN

The order in which the attributes are repeated over is described in Section 14.4.5.1.5, "The Order of Attributes".

"REPEAT OVER ATTRIBUTES" must specify an alias-name (following AS). This name is used to identify the attribute selected on each iteration within the "REPEAT OVER" action. Any name can be used; this-one in the example above was chosen arbitrarily.

Inside the "REPEAT OVER", use of the keyword ATTRIBUTE followed by the alias-name without any element-qualifiers always refers to the attribute identified by the alias.

If the OmniMark programmer wishes to refer to an attribute of the currently opened element with the same name as the alias-name being used, the element-qualifier "OF ELEMENT" can be used. An element-qualifier always indicates that the attribute belongs to the identified currently opened element.

For example, the following action outputs the value of the attribute named "THIS-ONE" of the currently opened element, even if there is an active attribute alias also using the name "THIS-ONE":

   ...
   OUTPUT ATTRIBUTE this-one OF ELEMENT
   ...

"REPEAT OVER" can also be applied to the DATA-ATTRIBUTES shelf, in which case the syntax is:

Syntax

   REPEAT OVER SPECIFIED? DATA-ATTRIBUTES
           (OF ( attribute-reference ))?
           element-qualifier* AS alias-name
      local-declaration*
      action*
   AGAIN

Within a "REPEAT OVER DATA-ATTRIBUTES", the attribute being iterated over can be referenced with either the keyword DATA-ATTRIBUTE or ATTRIBUTE, followed by the alias-name. For example:

   REPEAT OVER SPECIFIED DATA-ATTRIBUTES OF (ATTRIBUTE ref) AS this-one
      OUTPUT KEY OF DATA-ATTRIBUTE this-one
      OUTPUT "='%v(this-one)'%n"
   AGAIN

14.4.5.3 Selecting an Item of the ATTRIBUTES and DATA-ATTRIBUTES Shelf

Syntax

   USING SPECIFIED? ATTRIBUTES
          element-qualifier* indexer AS alias-name
       action

Syntax

   USING SPECIFIED? DATA-ATTRIBUTES
         (OF ( attribute-reference ))?
         indexer AS alias-name
      action

USING can be applied to the ATTRIBUTES or the DATA-ATTRIBUTES shelf in the same way as it is applied to programmer-defined shelves. The indexer is required and has one of the following forms:

Syntax

   (ITEM | @) numeric-expression

Syntax

   (KEY | ^) string-expression

In the action following the USING prefix, the selected item is referenced by the keyword ATTRIBUTE followed by the alias-name defined in the USING prefix. The keyword DATA-ATTRIBUTE can be used when it is an item of the DATA-ATTRIBUTES shelf that has been selected.

The USING prefix must be used when accessing individual items from a list-valued attribute from the ATTRIBUTES or DATA-ATTRIBUTES shelf.

The following USING prefix and action output, one per line, the tokens of one attribute of the currently opened element's parent. The attribute selected is the one whose name is the value of the stream variable name-to-be-used:

   GLOBAL STREAM name-to-be-used
   ...
   USING ATTRIBUTES OF PARENT ^ name-to-be-used
         AS named-attribute
      REPEAT OVER ATTRIBUTE named-attribute
         OUTPUT ATTRIBUTE named-attribute
         OUTPUT "%n"
      AGAIN

In this example, the attribute selected from the parent is assigned the alias-name named-attribute, and that alias is used to refer to the selected attribute within the action prefixed by "USING ATTRIBUTES".

Inside the "REPEAT OVER" action above, named-attribute serves two purposes:

14.4.5.3.1 More About Attribute Aliases

Attribute aliases can be defined by "REPEAT OVER" and USING even when the name of the attribute is known. The attribute alias serves to simplify attribute identification when more than one opened element has an attribute with the same name. For example, in the following, the attribute alias parent-type means that the parent's type attribute can be easily identified, especially in "%v" formats, even if the currently opened element has an attribute named "type":

   USING ATTRIBUTE type OF PARENT AS parent-type
      DO WHEN ATTRIBUTE type != ATTRIBUTE parent-type
         OUTPUT "Type attributes differ:%n" _
                "  current: %v(type)%n" _
                "  parent's: %v(parent-type)%n"
      DONE

An attribute alias can be defined on any form of "REPEAT OVER" or USING. Within such a context, the attribute alias-name takes precedence over an element attribute or data attribute with the same name: use of the alias-name refers to the attribute associated with the alias, and not to the attribute (if any) whose "real" name is the alias name.

The only place where an attribute alias is required is when "REPEAT OVER" or USING is used with ATTRIBUTES, "SPECIFIED ATTRIBUTES", DATA-ATTRIBUTES or "SPECIFIED DATA-ATTRIBUTES". In these cases an alias is required so that the selected attribute or attributes can be identified within the "REPEAT OVER" or USING.

14.4.5.3.2 Retrieving the Key of an Attribute Alias

When the operator "KEY OF" is applied to the selected or iterated item inside a "USING ATTRIBUTES" or "REPEAT OVER ATTRIBUTES", it returns the real name of the attribute, and not the alias name.
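
For instance, in the following sketch the alias first-one (a name chosen here only for illustration) selects the first declared attribute of the parent element. "KEY OF" then outputs that attribute's declared name, never the text "first-one". The sketch assumes the parent element has at least one declared attribute:

   USING ATTRIBUTES OF PARENT @ 1 AS first-one
      OUTPUT KEY OF ATTRIBUTE first-one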

14.4.5.3.3 Locking the SGML Parser Input

In the input processor, the input to the SGML parser is "locked" whenever "REPEAT OVER" or USING is applied to any of:

This means that the available set of attributes does not change while the action prefixed by the USING is being performed.

While the USING or "REPEAT OVER" is being performed, any text written to the #SGML stream is "buffered" and not passed to the SGML parser. The SGML parser is therefore quiescent, so there is no question about which set of attributes is identified by the USING or "REPEAT OVER".

In this regard, these forms are like "REPEAT OVER CURRENT ELEMENTS", which also needs to "lock" the SGML parser input in the same circumstances for the same reasons.

14.4.5.4 Attribute Tests on Items of the ATTRIBUTES Shelf

All the tests that can normally be applied to attributes can be used with any identification of an attribute, whether the attribute is identified by the attribute's name, by using an attribute alias name, or by selecting an item of ATTRIBUTES.

The following "REPEAT OVER" action only outputs attributes whose values are defaulted or specified (excluding the #IMPLIED and unspecified ones) and whose values consist entirely of letters. The attributes are output in the order that they are declared. It does so by testing an attribute identified by an attribute alias name:

   OUTPUT "<%q"
   REPEAT OVER ATTRIBUTES AS this-attribute
      DO WHEN ATTRIBUTE this-attribute ISNT IMPLIED &
              ATTRIBUTE this-attribute MATCHES (LETTER+ VALUE-END)
         OUTPUT " "
         OUTPUT KEY OF ATTRIBUTE this-attribute
         OUTPUT "=%"%v(this-attribute)%""
      DONE
   AGAIN
   OUTPUT ">"

The following test succeeds only if the first declared attribute for an element has a value of "FIRST":

   OUTPUT "First attribute has a value of %"FIRST%"%n"
          WHEN ATTRIBUTES @ 1 = "FIRST"

Because all attributes of "SPECIFIED ATTRIBUTES" are, by definition, specified, the "IS SPECIFIED", "IS DEFAULTED" and "IS IMPLIED" tests don't make much sense when applied to it: "IS SPECIFIED" always succeeds, and "IS DEFAULTED" and "IS IMPLIED" always fail.
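
These tests are more useful on the ATTRIBUTES shelf, which holds every declared attribute however it obtained its value. As a sketch, the following reports which of the declared attributes were specified in the start tag (the alias this-one is arbitrary):

   REPEAT OVER ATTRIBUTES AS this-one
      DO WHEN ATTRIBUTE this-one IS SPECIFIED
         OUTPUT KEY OF ATTRIBUTE this-one
         OUTPUT " was specified%n"
      DONE
   AGAIN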


14.5 The #APPINFO Stream

The APPINFO parameter of an SGML declaration is used to provide processing information to an application. The #APPINFO stream provides access to this information. The #APPINFO stream may appear only in a string expression, either in the format item:

   "...%g(#appinfo)..."

or as the name of a stream:

   #APPINFO

The #APPINFO stream may not be opened, written to, closed, or discarded. It is not ATTACHED if an SGML declaration was not given, if APPINFO NONE was specified in the SGML Declaration, or in cross-translations. If it is ATTACHED, it is also CLOSED.
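
A program that cannot assume an APPINFO parameter was given may wish to check the stream before using its value. The following is a minimal sketch, and it assumes that the stream test "IS ATTACHED" can be applied to #APPINFO in the same way as to a programmer-defined stream:

   OUTPUT "Application information: %g(#appinfo)%n"
          WHEN #APPINFO IS ATTACHED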

Copyright © OmniMark Technologies Corporation, 1988-1997. All rights reserved.
EUM27, release 2, 1997/04/11.
