XML Parsing and UTF-8 Encoding
Version 4.0.1 of the OmniMark programming language supports UTF-8 encoding as part of the XML parser. So that characters can be processed uniformly, regardless of how they reach the XML parser, OmniMark converts decimal character references (such as "&#161;") and hexadecimal character references (such as "&#xA1;") into their corresponding UTF-8 encodings.
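
To make the conversion concrete, here is a minimal sketch in Python rather than OmniMark. It is only an illustration of the idea, not the parser's actual implementation; the names `CHAR_REF` and `expand_char_refs` are hypothetical, and the sketch does not validate that a code point is legal in XML.

```python
import re

# Matches a decimal reference (&#161;) or a hexadecimal one (&#xA1;).
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(text: str) -> bytes:
    """Replace each character reference with its character, then encode as UTF-8."""

    def to_char(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code_point)

    return CHAR_REF.sub(to_char, text).encode("utf-8")
```

Both forms of reference to U+00A1 produce the same two-byte UTF-8 sequence (C2 A1), which is what lets later processing treat the character the same way however it was written in the source.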
Version 4.0 of OmniMark supported XML and UTF-8, but had some problems with character references. These problems have been fixed in version 4.0.1.
Version 4.0.1 fixes the following problems that occurred in version 4.0:
The following translate rule shows one way to convert UTF-8 encoded characters outside the ASCII range back into hexadecimal character references:
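  translate utf8-char => c
     ; n holds the Unicode code point of the matched character
     local counter n
     set n to utf8-char-number c
     ; ASCII characters (code points up to 127, hex 7F) pass through unchanged
     do when n <= 127
        output c
     else
        ; everything else becomes a hexadecimal character reference
        output "&#x%16rud(n);"
     done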
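
For readers who do not have OmniMark at hand, the logic of the translate rule above can be sketched in Python. This is an illustrative equivalent under the assumption that the input is well-formed UTF-8; the function name `escape_non_ascii` is hypothetical.

```python
def escape_non_ascii(data: bytes) -> str:
    """Keep ASCII characters as-is; write all others as &#x...; references."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on malformed input
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point <= 0x7F:
            pieces.append(ch)
        else:
            # Uppercase hexadecimal, matching the rule's output format.
            pieces.append(f"&#x{code_point:X};")
    return "".join(pieces)
```

As in the rule, the test is applied per character rather than per byte, so a multi-byte UTF-8 sequence yields a single reference.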
Prerequisite Concepts: XML document processing