Entities

An XML or SGML document may contain entities. Entities are defined in the DTD and may be either internal or external.

Internal entities have their values defined in the DTD. The parser resolves internal entities. External entities are defined by reference to a SYSTEM or PUBLIC identifier. It is your responsibility to resolve external entities (though in some cases you can let OmniMark assume this responsibility for you).

There are two kinds of external entities: text and data. A text entity is one whose value forms part of the parsable text of the current document. It should be read by the parser as if it were part of the current document. A data entity is an entity whose value is referred to by the current document, but does not form part of the parsable text of the document. Data entities should not be read by the parser, at least not as part of the current parse. SGML supports both kinds. XML supports text entities; its only data entities are unparsed (NDATA) entities, which can be referenced only from attribute values, never from document content.

There are two kinds of internal entities: text and character data. The parser resolves both kinds of internal entities; it parses the content of text entities and treats the content of character data entities as text, disregarding any markup characters they may contain.

Entities also come in two forms: general entities, which can be used in a document instance, and parameter entities, which can be used in a DTD.

SGML also allows the creation of attributes of type entity, that is, attributes whose value is an entity name. To resolve the attribute value, you must resolve the entity.

A DTD is a special case of an external text entity. All or part of the DTD of a document may be referenced by a PUBLIC or SYSTEM identifier in the DOCTYPE declaration.

Retrieving information on entities

There are a number of tests you can use to determine the type or status of entities. Here is a complete list of the available tests:

Resolving entities

How you resolve an entity depends on whether the entity is internal or external, text or data, and whether or not it occurs as an attribute value. (Whether an entity is a general or parameter entity does not affect how you resolve it.)

Internal text entities are given their declared values by the parser. This is done silently, so you cannot tell that the substitution has occurred. (That is, you can find the replacement text of the entity with a translate rule, but you cannot tell if that text occurred in the source as a text entity or as verbatim text.)
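The silent substitution described above can be observed with any conforming XML parser. The following is a minimal sketch in Python rather than OmniMark, using the standard library's ElementTree parser. The entity name "product" and both sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the internal text entity "product" is declared in the
# DTD subset and referenced in the document instance.
WITH_ENTITY = (
    '<!DOCTYPE doc [<!ENTITY product "OmniMark">]>'
    '<doc>&product; resolves entities.</doc>'
)

# The same content with the replacement text typed verbatim.
VERBATIM = '<doc>OmniMark resolves entities.</doc>'


def parsed_text(source: str) -> str:
    """Return the character data of the root element as the parser reports it."""
    return ET.fromstring(source).text


entity_text = parsed_text(WITH_ENTITY)
verbatim_text = parsed_text(VERBATIM)

# The application sees identical text in both cases, so it cannot tell
# which document used the entity reference.
print(entity_text == verbatim_text)
```

Once parsing is complete, nothing in the reported text distinguishes the two sources.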

Internal character data entities are given their declared values by the parser. However, you can detect that a piece of text is the expansion of a character data entity using translate rules. See Entities, internal.

External text entities, including the DTD, but excluding entities occurring as attribute values, can be captured and processed with an external-text-entity rule. See Entities, external.
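The division of labor for external text entities (the parser reports the reference, the application supplies the content) is not specific to OmniMark. As a rough analogue only, the Python sketch below uses the standard library's expat binding, where a handler plays a role similar to an external-text-entity rule. The system identifier "chapter1.xml" and the in-memory resource table are assumptions made for the example. A real application would read from the file system or a catalog.

```python
from xml.parsers import expat


def parse_with_external_entities(document: bytes, resources: dict) -> str:
    """Parse `document`, resolving external text entities from `resources`.

    `resources` is a hypothetical stand-in for an entity manager: it maps
    SYSTEM identifiers to the bytes of the entity's replacement text.
    """
    chunks = []
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def resolve(context, base, system_id, public_id):
        # The parser only reports the reference; supplying the content is
        # the application's job. Parse the entity text in a sub-parser so
        # that it is treated as part of the current document.
        sub = parser.ExternalEntityParserCreate(context)
        sub.CharacterDataHandler = chunks.append
        sub.Parse(resources[system_id], True)
        return 1  # a non-zero return tells expat the entity was handled

    parser.ExternalEntityRefHandler = resolve
    parser.Parse(document, True)
    return "".join(chunks)


# Hypothetical sample document and entity content.
DOCUMENT = (
    b'<!DOCTYPE doc [<!ENTITY chapter SYSTEM "chapter1.xml">]>'
    b'<doc>Intro. &chapter;</doc>'
)
RESOURCES = {"chapter1.xml": b"<chapter>Chapter one.</chapter>"}

result = parse_with_external_entities(DOCUMENT, RESOURCES)
print(result)
```

If no handler is installed, expat does not fetch the entity's content itself, which mirrors the point that resolving external entities is the application's responsibility.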

Entities occurring as attribute values can be captured and processed in element rules.
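To illustrate what resolving an entity-valued attribute involves, here is a hedged Python sketch using the standard library's expat binding. It is not OmniMark code. The start-element handler stands in loosely for an element rule. The element, attribute, entity, and notation names are all invented for the example, and "resolving" is reduced to looking up the declared system identifier.

```python
from xml.parsers import expat

# Hypothetical sample: the attribute "image" is declared with type ENTITY,
# and its value names the data (NDATA) entity "logo".
DOCUMENT = b"""<!DOCTYPE figure [
<!NOTATION png SYSTEM "image/png">
<!ENTITY logo SYSTEM "logo.png" NDATA png>
<!ATTLIST figure image ENTITY #REQUIRED>
]>
<figure image="logo"/>"""


def resolve_entity_attributes(document: bytes) -> list:
    """Return (element, attribute, system identifier) for ENTITY attributes."""
    declared = {}           # entity name -> system identifier
    entity_attrs = set()    # (element, attribute) pairs declared as ENTITY
    resolved = []
    parser = expat.ParserCreate()

    def on_unparsed_entity(name, base, system_id, public_id, notation):
        declared[name] = system_id

    def on_attlist(element, attribute, attr_type, default, required):
        if attr_type == "ENTITY":
            entity_attrs.add((element, attribute))

    def on_start_element(name, attributes):
        # The attribute value is only an entity name; the entity itself
        # must be looked up to find what the attribute refers to.
        for attr_name, value in attributes.items():
            if (name, attr_name) in entity_attrs and value in declared:
                resolved.append((name, attr_name, declared[value]))

    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start_element
    parser.Parse(document, True)
    return resolved


print(resolve_entity_attributes(DOCUMENT))
```

The sketch stops at the system identifier. What to do with the referenced data (copy it, link to it, convert it) is left to the application, as it is in an element rule.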