EUC (OMFFEUC)

This library contains one string source function and one string sink function, as follows:

  • reader is a string source function that reads its value string source argument and returns that data converted from an EUC encoding to a UTF-8 encoding. That is, the provided source is in EUC, but the program sees UTF-8.
  • writer is a string sink function that accepts UTF-8 encoded data and writes that data to its value string sink argument, converted from a UTF-8 encoding to an EUC encoding. That is, the program writes UTF-8, but the provided output receives EUC.

The data formats are interpreted and produced according to the Japanese Industrial Standards JIS X 0201, JIS X 0208 and JIS X 0212. The EUC data format is transformed using the JIS⟺EUC conversion algorithms.
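
As an illustration of what a JIS⟺EUC conversion involves, the following Python sketch shows the commonly described byte-level mapping for EUC-JP: JIS X 0208 bytes have their high bit set, JIS X 0201 half-width katakana are prefixed with the single-shift byte 0x8E, and JIS X 0212 characters are prefixed with 0x8F. This is not the library's implementation; the function names are hypothetical and exist only for this example.

```python
# Illustrative sketch only: hypothetical helper names, not part of OMFFEUC.

SS2 = 0x8E  # single shift 2: introduces one JIS X 0201 half-width katakana byte
SS3 = 0x8F  # single shift 3: introduces one two-byte JIS X 0212 character


def _check_jis_pair(jis: bytes) -> None:
    """Require exactly two bytes, each in the 7-bit JIS range 0x21-0x7E."""
    if len(jis) != 2 or not all(0x21 <= b <= 0x7E for b in jis):
        raise ValueError("expected two bytes in the range 0x21-0x7E")


def jis0208_to_euc(jis: bytes) -> bytes:
    """JIS X 0208 to EUC: set the high bit of both bytes."""
    _check_jis_pair(jis)
    return bytes(b | 0x80 for b in jis)


def euc_to_jis0208(euc: bytes) -> bytes:
    """EUC to JIS X 0208: clear the high bit of both bytes."""
    if len(euc) != 2 or not all(0xA1 <= b <= 0xFE for b in euc):
        raise ValueError("expected two bytes in the range 0xA1-0xFE")
    return bytes(b & 0x7F for b in euc)


def jis0201_kana_to_euc(kana: int) -> bytes:
    """JIS X 0201 half-width katakana (0xA1-0xDF) to EUC: prefix with SS2."""
    if not 0xA1 <= kana <= 0xDF:
        raise ValueError("expected a byte in the range 0xA1-0xDF")
    return bytes([SS2, kana])


def jis0212_to_euc(jis: bytes) -> bytes:
    """JIS X 0212 to EUC: prefix with SS3 and set the high bit of both bytes."""
    _check_jis_pair(jis)
    return bytes([SS3]) + bytes(b | 0x80 for b in jis)
```

For example, the hiragana letter あ has the JIS X 0208 code 0x2422, so jis0208_to_euc(b"\x24\x22") yields the EUC bytes 0xA4 0xA2.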

The only kind of error that can occur is a conversion error: encountering a character that has no equivalent in the other character set. In this case, the converted value used is DEL (0x7F) in the JIS encoding, and the replacement character (U+FFFD) in the Unicode (UTF-8) encoding.
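
The substitution behaviour described above can be mimicked with Python's built-in euc_jp codec, which is used here purely as a stand-in for the library's conversion. In this sketch the error-handler name "del_replace" and both function names are hypothetical; Python's codec may also differ from OMFFEUC in which characters it considers convertible.

```python
import codecs


def _del_on_error(exc):
    """Encoding error handler: emit DEL (0x7F) for each unconvertible character."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # Return the replacement text and the position at which to resume encoding.
    return "\x7f" * (exc.end - exc.start), exc.end


# "del_replace" is a name invented for this example.
codecs.register_error("del_replace", _del_on_error)


def euc_to_utf8_text(euc_bytes: bytes) -> str:
    """Decode EUC data; undecodable bytes become U+FFFD."""
    return euc_bytes.decode("euc_jp", errors="replace")


def text_to_euc(text: str) -> bytes:
    """Encode text as EUC; characters with no EUC equivalent become DEL."""
    return text.encode("euc_jp", errors="del_replace")
```

With these definitions, a Hangul syllable such as 한 (U+D55C), which has no EUC-JP equivalent, is written as a single 0x7F byte, and a stray 0xFF byte in EUC input is read as U+FFFD.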

These functions are based on [1] Ken Lunde, “Understanding Japanese Information Processing”, O'Reilly 1993, ISBN 1-56592-043-0.

Usage Note

To use OMFFEUC, you must import it into your program using an import declaration such as:

  import "omffeuc.xmd" prefixed by euc.