Markup declarations and events

A DTD provides constraints that an SGML or XML document must respect to pass validation. These constraints are defined by markup declarations which describe, among other things, the elements, attributes and entities that a conforming SGML or XML document may use, and in what combinations.

When parsing against a DTD, OmniMark provides access to the declarations that DTD contains. New element, attribute and entity declarations can also be created, and new element events can be derived from any of these declarations.

Accessing element and entity declarations declared by a DTD

This example demonstrates how to access element and entity declarations from a compiled DTD.

  process
     local entity-declaration ents variable
  
     do xml-parse document creating xml-dtds{"reference"} scan file "book.dtd"
        suppress
     done
  
     repeat over declared-elements of xml-dtds{"reference"} as elmt
        local content-model model initial { content of elmt }
  
        output name of elmt || " ("
        do when model == any-content-model
           output "any"
        else when model == cdata-content-model
           output "cdata"
        else when model == element-content-model
           output "element"
        else when model == mixed-content-model
           output "mixed"
        else when model == rcdata-content-model
           output "rcdata"
        else
           output "other"
        done
        output " content)"
  
        do when number of attributes of elmt > 0
           output "%n  "
           repeat over attributes of elmt as attr
              output key of attr || ", "
           again
        done
        output "%n"
     again
  
     output "%nGeneral Entities:%n"
     repeat over declared-general-entities of xml-dtds{"reference"} as ent
        output "   " || key of ent || "%n"
     again
  
     output "%nParameter Entities:%n"
     repeat over declared-parameter-entities of xml-dtds{"reference"} as ent
        output "   " || key of ent || "%n"
     again
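
The same shelves can also be queried for a single declaration instead of being iterated in full. The following is a minimal sketch, assuming the same "book.dtd" file as above; the element name "chapter" is hypothetical and stands in for any element the DTD might declare. It uses only the has key test and the keyed lookup on declared-elements that appear later in this section.

  process
     local element-declaration decl

     ; Compile the DTD exactly as in the example above.
     do xml-parse document creating xml-dtds{"reference"} scan file "book.dtd"
        suppress
     done

     ; "chapter" is a hypothetical element name used for illustration.
     do when declared-elements of xml-dtds{"reference"} has key "chapter"
        set decl to (declared-elements of xml-dtds{"reference"}){"chapter"}

        ; List the names of the attributes declared for the element.
        repeat over attributes of decl as attr
           output key of attr || "%n"
        again
     else
        output "chapter is not declared%n"
     done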
          

Emitting new elements: basic form

The following is a useful markup source function for emitting new element events into an output stream. It first creates a new markup-element-event from an element-declaration and a shelf of specified-attributes. It then emits into the #current-output stream:

  1. the start tag of the new event
  2. any provided nested content
  3. the end tag

Note that we use signal throw to emit the start and end tags, and that the very same markup-element-event is used to ensure that the end tag corresponds to the start tag. The optional argument src allows for nested content to be provided, and because it is itself a markup source, it may contain its own markup events.

  define overloaded markup source function
     emit-element (value     element-declaration declaration,
                   read-only specified-attribute attribute-values,
                   value     markup source       src               optional)
  as
     local markup-element-event event initial { create-element-event declaration attributes attribute-values }
  
     signal throw #markup-start event
     output src
        when src is specified
     signal throw #markup-end event
          

Here is an example which makes use of our emit-element function to re-emit an element, removing all attributes. We pass %c as the third argument. This is a markup source that will cause the markup parse to continue in a streaming fashion when the output statement in the definition of emit-element is executed.

  element "example"
     output emit-element (declaration of #current-markup-event, {}, "%c")
          

Emitting new elements: from an element event

We can build on this basic version of emit-element by adding an overloading that accepts a markup-element-event rather than an element-declaration. The declaration of function is used to retrieve the element declaration from the element event passed as the first argument.

  define overloaded markup source function
     emit-element (value     markup-element-event source-event,
                   read-only specified-attribute  attribute-values,
                   value     markup source        src               optional)
  as
     output emit-element (declaration of source-event, attribute-values, src)
          

Here is an example which makes use of our emit-element function to re-emit an element after making some changes to its attributes.

  element "example"
     local specified-attribute attrs variable
  
     copy specified attributes to attrs
     remove attrs{"attribute-to-remove"}
     set key of attrs{"attribute-to-rename"} to "renamed-attribute"
     set new? attrs{"new-attribute-name"}
        to create-specified-attribute      declaration of attribute "new-attribute-name"
                                      from "new value"
  
     output emit-element (#current-markup-event, attrs, "%c")
          

Emitting new elements: string-based and well-formed

Our next example is a version of emit-element, named emit-element-well-formed, that accepts string arguments for the element name and attributes, and which works with well-formed XML. As we will be trampolining to the previous emit-element function, we must prepare:

  1. an element-declaration and
  2. a shelf of specified-attributes.

To create the element-declaration, we need:

  1. the element name
  2. a shelf of attribute-declarations
  3. a content model

So we end up creating a shelf of attribute-declarations in order to create the element-declaration, and a shelf of specified-attribute instances needed by emit-element. We use the fallback behavior of create-element-declaration by creating a keyless entry on the global constant shelf implied-cdata. Its single entry will be used as the attribute-declaration for every element-declaration created. For well-formed XML we have chosen to make all attributes implied and to use the any content model. Also, note that we use the built-in conversion function from string to specified-attribute in the line which creates new entries on the valid-attributes shelf.

  constant attribute-declaration implied-cdata
     initial { create-attribute-declaration         attribute-declared-cdata
                                            default attribute-declared-implied }
  
  define overloaded markup source function
     emit-element-well-formed (value     string        element-name,
                               read-only string        attribute-values,
                               value     markup source src               optional)
  as
     local specified-attribute valid-attributes variable
     local element-declaration decl
  
     repeat over attribute-values as a
        set new valid-attributes{key of a} to a
     again
  
     set decl to create-element-declaration            element-name
                                            attributes implied-cdata
                                               content any-content-model
  
     output emit-element (decl, valid-attributes, src)
          

Here is an example which makes use of our emit-element-well-formed function. It contains a nested call to emit-element-well-formed as its third argument. This is permitted because that argument is of type markup source, which is the declared return type of emit-element-well-formed.

  output emit-element-well-formed ("foo", {"a" with key "attr1", "b" with key "attr2"},
                                   emit-element-well-formed ("bar", {"c" with key "attr3"}))
          

Emitting new elements: string-based and validated

We will create a validating form of emit-element that accepts string arguments for the element name and attributes. The function will have a required dtd argument. Rather than creating a well-formed element-declaration, this function will retrieve the declaration of the element to be created from the compiled DTD. It will then create specified-attributes having properties as declared in the DTD. It also uses the element-declaration to ensure that these attributes are permitted for the element in question.

  define overloaded markup source function
     emit-element (value     string        element-name,
                   read-only string        attribute-values,
                   value     dtd           compiled-dtd,
                   value     markup source src               optional)
  as
     local element-declaration declaration
     local specified-attribute valid-attributes variable
  
     assert declared-elements of compiled-dtd has key element-name
        message "The DTD does not declare the element named %"%g(element-name)%"."
  
     set declaration to (declared-elements of compiled-dtd){element-name}
  
     using attributes of declaration as declared-attributes
     repeat over attribute-values as a
        assert declared-attributes has key (key of a)
           message "The element %"%g(element-name)%" cannot have attribute %"" || key of a || "%"."
        set new valid-attributes{key of a} to create-specified-attribute declared-attributes{key of a} from a
     again
  
     output emit-element (declaration, valid-attributes, src)
          

Here is an example that uses this function.

  import "omxmlwrite.xmd" prefixed by xml.
  
  global string sample initial {   "<!DOCTYPE a [%n"
                                || "<!ELEMENT a         (#PCDATA)>%n"
                                || "<!ATTLIST a%n"
                                || "          one CDATA #IMPLIED%n"
                                || "          two CDATA #REQUIRED>%n"
                                || "<!ELEMENT b         EMPTY>%n"
                                || "<!ATTLIST b%n"
                                || "          three CDATA #IMPLIED%n"
                                || "          four  CDATA #REQUIRED>%n"
                                || "]>%n"
                                || "<a two=%"2%">a</a>" }
  
  process
     do xml-parse document scan sample
        output xml.written from "%c"
     done
  
  element "a"
     output emit-element ("a", {"1" with key "one", "2" with key "two"}, #current-dtd, #content)
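
To illustrate the validation step, here is a hedged variation on the element rule above, meant to replace it rather than sit alongside it. It passes the attribute "three", which the sample DTD declares for element b but not for element a. Going by the definition of emit-element given earlier, the assertion on declared-attributes should fail for this call, with a message naming the element "a" and the attribute "three". This is a sketch based on reading that definition, not on a recorded run.

  element "a"
     ; "three" is declared for b, not for a, so the
     ; "cannot have attribute" assertion in emit-element is expected to fail.
     output emit-element ("a", {"3" with key "three", "2" with key "two"}, #current-dtd, #content)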
          

Emitting new elements: markup sink formulation

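It is sometimes more convenient to use a markup sink rather than a markup source. In the following function, whatever is written to the sink (read here as #current-input) becomes the nested content of the new element, and the resulting markup is written to src.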

  define overloaded markup sink function
     element-emitter-well-formed (value     string      element-name,
                                  read-only string      attribute-values,
                                  value     markup sink src               optional)
  as
     using output as src
        output emit-element-well-formed (element-name, attribute-values, #current-input)
          

Here is an example of a processing pipeline created by chaining together markup sink functions.

  using output as xml.writer into #current-output
     using output as element-emitter-well-formed ("foo", {"a" with key "attr1", "b" with key "attr2"}, #current-output)
        using output as element-emitter-well-formed ("bar", {"c" with key "attr3"}, #current-output)
           output emit-element-well-formed ("baz", {"d" with key "attr4"})
               || emit-element-well-formed ("baz", {"d" with key "attr5"})
               || emit-element-well-formed ("baz", {"d" with key "attr6"})