XML Parsing and UTF-8 Encoding
Version 4.0.1 of the OmniMark programming language supports UTF-8 encoding as part of the XML parser. So that characters can be processed uniformly, regardless of how they reach the XML parser, OmniMark converts decimal character references (such as "&#161;") and hexadecimal character references (such as "&#xA1;") into their corresponding UTF-8 encodings.
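To make the conversion concrete, the following is a minimal sketch in Python (not OmniMark) of what expanding character references into UTF-8 amounts to. It illustrates the idea only and is not how the OmniMark parser is implemented; the function name `expand_char_refs` is hypothetical, and the sketch assumes every reference names a valid, non-surrogate code point.

```python
import re

# Matches a decimal reference (&#161;) or a hexadecimal reference (&#xA1;).
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def expand_char_refs(text: str) -> bytes:
    """Replace each character reference with its character, then encode as UTF-8."""

    def _replace(match):
        hex_digits, dec_digits = match.groups()
        # Exactly one of the two groups is set, depending on the reference form.
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code_point)

    return _CHAR_REF.sub(_replace, text).encode("utf-8")
```

Both reference forms for the same code point yield the same byte sequence, which is what lets later processing treat the character uniformly.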
Version 4.0 of OmniMark supported XML and UTF-8, but had some problems with character references. These problems have been fixed in version 4.0.1.
Version 4.0.1 fixes the following problems that occurred in version 4.0:
The following translate rule converts UTF-8 encoded characters outside the ASCII range back into hexadecimal character references:
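   translate utf8-char => c
      local counter n
      set n to utf8-char-number c
      ; characters in the ASCII range (up to 127, hexadecimal 7F) pass through unchanged
      do when n <= 127
         output c
      else
         ; any other character is written as a hexadecimal character reference
         output "&#x%16rud(n);"
      done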
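For readers who want to check the expected output of the rule, here is a rough Python equivalent (not OmniMark). It is a sketch under the assumption that the input is well-formed UTF-8; the function name `escape_non_ascii` is hypothetical.

```python
def escape_non_ascii(data: bytes) -> str:
    """Decode UTF-8 and rewrite every non-ASCII character as &#xHEX;."""
    text = data.decode("utf-8")
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point <= 0x7F:
            # ASCII characters are passed through unchanged.
            pieces.append(ch)
        else:
            # Everything else becomes an uppercase hexadecimal character reference.
            pieces.append(f"&#x{code_point:X};")
    return "".join(pieces)
```

For example, the two-byte UTF-8 sequence C2 A1 (the character "¡") comes out as `&#xA1;`, while plain ASCII text is returned as is.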
Prerequisite Concepts: XML document processing