Introduction
The pattern shown here can be used to parse any formal public identifier.
This pattern can also be used in an external-text-entity rule, an external-data-entity rule, or when processing an entity name valued element attribute.
If this pattern were used to parse the public identifier "-//All Mine//TEXT Chapter3//EN", it would result in these pattern variable assignments:
unregistered-owner-identifier would have the value "All Mine".
public-text-class would have the value "TEXT".
public-text-description would have the value "Chapter3".
public-text-language would have the value "EN".
registered-owner-identifier, iso-owner-identifier, unavailable-text-indicator, public-text-designating-sequence, and public-text-display-version would fail the "is specified" test.
This pattern is rather lengthy, both because it is general and because it uses long pattern variable names. Most applications will not need all parts of the public identifier. Shorter pattern variable names can be used; the names in the pattern are the terms the SGML standard uses for the parts of a formal public identifier. On the other hand, some OmniMark programmers will want to extend the pattern to extract details of an ISO owner identifier, public text description, or designating sequence.
   match ("+//" ([any except "/"]+ | "/" lookahead not "/")*
             => registered-owner-identifier |
          "-//" ([any except "/"]+ | "/" lookahead not "/")*
             => unregistered-owner-identifier |
          ([any except "/"]+ | "/" lookahead not "/")*
             => iso-owner-identifier)
         "//" [any except "%_"]+ => public-text-class
         " " ("-//" => unavailable-text-indicator)?
         ([any except "/"]+ | "/" lookahead not "/")*
            => public-text-description
         "//" (letter {2} => public-text-language value-end |
               ([any except "/"]+ | "/" lookahead not "/")*
                  => public-text-designating-sequence
               ("//" any* => public-text-display-version)?)
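To check how the pattern splits an identifier without running OmniMark, the sketch below approximates it with a Python regular expression. It is not OmniMark code. The names parse_fpi, FPI, and SEG are hypothetical and exist only for this illustration, and regex backtracking is not the same as OmniMark pattern matching, so treat it as an approximation. A part that comes back as None corresponds to a pattern variable that was not assigned.

```python
import re
from typing import Dict, Optional

# Assumption: SEG stands in for ([any except "/"]+ | "/" lookahead not "/")*,
# that is, any run of characters that does not contain "//".
SEG = r"(?:[^/]|/(?!/))*"

# Assumption: a regex analogue of the OmniMark pattern, one named group per
# pattern variable (hyphens become underscores in group names).
FPI = re.compile(
    rf"""
    (?: \+// (?P<registered_owner_identifier>{SEG})
      | -//  (?P<unregistered_owner_identifier>{SEG})
      |      (?P<iso_owner_identifier>{SEG})
    )
    //
    (?P<public_text_class>[^ ]+)
    [ ]
    (?P<unavailable_text_indicator>-//)?
    (?P<public_text_description>{SEG})
    //
    (?: (?P<public_text_language>[A-Za-z]{{2}}) \Z
      | (?P<public_text_designating_sequence>{SEG})
        (?: // (?P<public_text_display_version>.*) )?
    )
    """,
    re.VERBOSE | re.DOTALL,
)


def parse_fpi(public_id: str) -> Dict[str, Optional[str]]:
    """Split a formal public identifier into its named parts.

    Parts that did not take part in the match are returned as None.
    """
    m = FPI.match(public_id)
    if m is None:
        raise ValueError(f"not a formal public identifier: {public_id!r}")
    # Restore the hyphenated names used in the OmniMark pattern.
    return {name.replace("_", "-"): value for name, value in m.groupdict().items()}


if __name__ == "__main__":
    parts = parse_fpi("-//All Mine//TEXT Chapter3//EN")
    for name, value in parts.items():
        if value is not None:
            print(f"{name}: {value}")
```

An application that needs only some of the parts can drop the names from the groups it does not care about, which shortens the expression in the same way that the OmniMark pattern can be shortened.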
Related Concepts
Public identifiers: parsing