XML Parsing and UTF-8 Encoding

Version 4.0.1 of the OmniMark programming language supports UTF-8 encoding as part of the XML parser. To allow characters to be processed uniformly, regardless of how they reach the XML parser, OmniMark converts decimal character references (such as "&#161;") and hexadecimal character references (such as "&#xA1;") into their corresponding UTF-8 encodings.
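The conversion described above can be illustrated outside OmniMark. The following is a minimal Python sketch, not OmniMark's implementation: the function name resolve_char_refs and the regular expression are assumptions made for illustration. It shows that a decimal and a hexadecimal reference to the same code point (161, or hexadecimal A1) yield the same two-byte UTF-8 sequence.

```python
import re

# Matches hexadecimal (&#xA1;) and decimal (&#161;) character references.
# Hypothetical helper pattern, written for this illustration only.
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def resolve_char_refs(text: str) -> bytes:
    """Replace numeric character references, then encode the result as UTF-8."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code_point)

    return _CHAR_REF.sub(_replace, text).encode("utf-8")


if __name__ == "__main__":
    # Both spellings name code point 161 and give the same bytes.
    print(resolve_char_refs("&#161;").hex())
    print(resolve_char_refs("&#xA1;").hex())
```

Because both forms end up as identical bytes, later processing does not need to know which form appeared in the input.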

Version 4.0 of OmniMark supported XML and UTF-8, but had some problems with character references. Version 4.0.1 fixes these problems.

The following translate rule converts UTF-8 encoded characters outside the ASCII range back into hexadecimal character references:

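  ; Copy ASCII characters through unchanged; re-encode every other
  ; UTF-8 character as a hexadecimal character reference.
  translate utf8-char => c
     local counter n
     set n to utf8-char-number c
     do when n <= 127          ; 127 = hexadecimal 7F, the last ASCII character
        output c
     else
        output "&#x%16rud(n);"
     done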
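For readers who want to check the expected output of the rule above, the same logic can be sketched in Python. This is an illustrative equivalent under stated assumptions, not part of OmniMark: the function name utf8_to_char_refs is hypothetical, and the input is assumed to be well-formed UTF-8.

```python
def utf8_to_char_refs(data: bytes) -> str:
    """Return the text with every non-ASCII character written as &#xHEX;."""
    parts = []
    for ch in data.decode("utf-8"):  # assumes well-formed UTF-8 input
        n = ord(ch)                  # code point of the character
        if n <= 0x7F:
            parts.append(ch)         # ASCII: copy through unchanged
        else:
            parts.append(f"&#x{n:X};")  # uppercase hexadecimal reference
    return "".join(parts)


if __name__ == "__main__":
    print(utf8_to_char_refs(b"a\xc2\xa1b"))
```

The two-byte sequence C2 A1 is treated as a single character with code point A1, so the reference is built from the code point rather than from the individual bytes.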

Prerequisite Concepts
     XML document processing
 
   
----


Generated: April 21, 1999 at 2:00:52 pm

Copyright © OmniMark Technologies Corporation, 1988-1999.