Markup languages
In OmniMark documentation, almost all references to "markup languages" are references to element-based markup languages created using either SGML (Standard Generalized Markup Language) or XML (Extensible Markup Language). A markup language is a complete set of markup instructions that can be used to describe the structure and information content of a piece of text. Markup tags are the actual pieces of code that are added to the electronic document.
When you create a markup language using SGML or XML, you are defining a set of tags that can then be used to demarcate the structure of your documents. Because SGML and XML are used to create sets of markup tags, they are "metalanguages": languages that describe other languages. Using SGML or XML to create a markup language has two main benefits. First, both are internationally recognized standards, so documents marked up with them are portable across platforms. Second, SGML and XML let you create a fully customized language that addresses your particular markup requirements.
Markup instructions can be interpreted by applications that use the markup to determine the formatting of a document. When used this way, the markup usually has an immediate and specific effect on the text, either by changing the appearance of the characters (by rendering them in a bold or italic font, for example), or by affecting the positioning of the text (such as by changing the margin, indent, and spacing values).
For example, HTML is a markup language whose elements describe the formatting of a document when that document is processed by an HTML browser or similar application:
<html>
<head>
  <title>Hamlet</title>
</head>
<body bgcolor="#ffffff" text="#000000">
  <div align=center>
    <font size=5>
    <b>Hamlet</b>
    <p><font size=3><b>Act I, Scene I</b>
    <p><i>Francisco at his post. Enter to him Bernardo</i>
  </div>
  <p><b>Bernardo:</b> Who's there?
  <p><b>Francisco:</b> Nay, answer me: stand and unfold yourself.
  <p><b>Bernardo:</b> Long live the king!
</body>
</html>
The elements in this short HTML document affect the alignment, size, and appearance of the text.
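To make this concrete, the following is a minimal sketch, in Python rather than OmniMark, of how an application might pick the formatting instructions out of an HTML document like the Hamlet example. It uses the standard library's `html.parser` module. The names `FormattingCollector`, `formatting_instructions`, and `FORMATTING_TAGS` are hypothetical, and the choice of which tags count as "formatting" is an assumption made for illustration only.

```python
from html.parser import HTMLParser

# The sample HTML document from this topic, embedded so the sketch is self-contained.
HTML_DOC = """
<html>
<head>
  <title>Hamlet</title>
</head>
<body bgcolor="#ffffff" text="#000000">
  <div align=center>
    <font size=5>
    <b>Hamlet</b>
    <p><font size=3><b>Act I, Scene I</b>
    <p><i>Francisco at his post. Enter to him Bernardo</i>
  </div>
  <p><b>Bernardo:</b> Who's there?
  <p><b>Francisco:</b> Nay, answer me: stand and unfold yourself.
  <p><b>Bernardo:</b> Long live the king!
</body>
</html>
"""


class FormattingCollector(HTMLParser):
    """Records each start tag that carries a formatting instruction."""

    # Assumption: an illustrative subset of presentational tags, not a full list.
    FORMATTING_TAGS = {"div", "font", "b", "i"}

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lowercase and attributes as pairs.
        if tag in self.FORMATTING_TAGS:
            self.found.append((tag, dict(attrs)))


def formatting_instructions(html_text):
    """Return the (tag, attributes) pairs found, in document order."""
    collector = FormattingCollector()
    collector.feed(html_text)
    collector.close()
    return collector.found


if __name__ == "__main__":
    for tag, attributes in formatting_instructions(HTML_DOC):
        print(tag, attributes)
```

Each pair that the sketch reports corresponds to one instruction about alignment, size, or appearance. None of them says anything about what the marked-up text is.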
When interpreted by applications, markup languages can also describe the structure of a document, identifying the various internal components of which it is made. These components can include paragraphs, headings, sections, subsections, names, titles, chapters, volumes, articles, and so on. The list of possible document components is endless, but any given markup language identifies only a small set of them.
For example, the following document is marked up using a very simple language created with XML:
<play>
  <title>Hamlet</title>
  <act><scene>
    <scenedesc>Elsinore. A platform before the castle.</scenedesc>
    <stagedir>Francisco at his post. Enter to him Bernardo</stagedir>
    <char>Bernardo</char>
    <line>Who's there?</line>
    <char>Francisco</char>
    <line>Nay, answer me: stand, and unfold yourself.</line>
    <char>Bernardo</char>
    <line>Long live the king!</line>
  </scene></act>
</play>
The elements in this short XML document are used to identify, to an XML application, the components and structure of the information it contains.
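As an illustration of what "identifying components to an application" can mean in practice, here is a minimal sketch, in Python rather than OmniMark, that reads the play markup with the standard library's `xml.etree.ElementTree` module and pairs each speaker with the lines that follow. The function name `speeches` is hypothetical. The rule that a `<line>` belongs to the most recent preceding `<char>` is an assumption based on this sample, not something the markup language itself states.

```python
import xml.etree.ElementTree as ET

# The sample XML document from this topic, embedded so the sketch is self-contained.
PLAY_XML = """
<play>
  <title>Hamlet</title>
  <act><scene>
    <scenedesc>Elsinore. A platform before the castle.</scenedesc>
    <stagedir>Francisco at his post. Enter to him Bernardo</stagedir>
    <char>Bernardo</char>
    <line>Who's there?</line>
    <char>Francisco</char>
    <line>Nay, answer me: stand, and unfold yourself.</line>
    <char>Bernardo</char>
    <line>Long live the king!</line>
  </scene></act>
</play>
"""


def speeches(xml_text):
    """Return (speaker, line) pairs in document order.

    Assumption: each <line> is spoken by the nearest preceding <char>
    within the same <scene>.
    """
    root = ET.fromstring(xml_text.strip())
    result = []
    for scene in root.iter("scene"):
        speaker = None
        for child in scene:
            if child.tag == "char":
                speaker = child.text
            elif child.tag == "line" and speaker is not None:
                result.append((speaker, child.text))
    return result


if __name__ == "__main__":
    play_title = ET.fromstring(PLAY_XML.strip()).findtext("title")
    print(play_title)
    for speaker, line in speeches(PLAY_XML):
        print(f"{speaker}: {line}")
```

Because the element names describe what each piece of text is, the same document could be processed in other ways, such as listing the stage directions or counting lines per character, without changing the markup.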
Wherever possible, OmniMark uses the same names and terminology as the SGML and XML specifications.