Excel to XML conversions
A context-translation is the most general of the built-in translation types. A context-translation converts data from one form to another, using XML/SGML as an intermediate form. A context-translation can be viewed as an up-translation that produces an intermediate XML/SGML document combined with a simultaneous down-translation of that XML/SGML document.
Patterns in the original document suggest its structure and allow a (possibly partial) conversion to XML/SGML. OmniMark parses the XML/SGML document and, using the XML/SGML parser, corrects structural errors. The final output makes use of the structure discovered by the parser to produce a fully marked-up document, a minimized document, or some other form of data.
A context-translation begins with context-translate. A context-translation combines the best features of an up-translation and a down-translation with the powerful error recovery and context-tracking capability of the parser.
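The two halves of a context-translation can be illustrated with a minimal sketch in Python rather than OmniMark. It is only an analogy: the function names (up_translate, down_translate, context_translate), the tab-delimited input, and the sheet/row/cell element names are assumptions made for illustration, and the two phases run one after the other here, whereas OmniMark performs them simultaneously.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def up_translate(text):
    """Up-translation: use patterns in the raw text (line breaks and tabs)
    to produce an intermediate XML document."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure, so skip them
        cells = "".join(
            "<cell>{}</cell>".format(escape(value)) for value in line.split("\t")
        )
        rows.append("<row>{}</row>".format(cells))
    return "<sheet>{}</sheet>".format("".join(rows))


def down_translate(xml_text):
    """Down-translation: parse the intermediate XML and build the final
    output from the structure the parser reports."""
    root = ET.fromstring(xml_text)
    lines = []
    for row in root.iter("row"):
        lines.append(" | ".join(cell.text or "" for cell in row.iter("cell")))
    return "\n".join(lines)


def context_translate(text):
    """Chain the two phases, with XML as the intermediate form."""
    return down_translate(up_translate(text))
```

Because the intermediate form is real XML, the second phase never has to re-examine the raw text; it works only from the elements the parser hands back.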
Converting an Excel spreadsheet (.XLS) file to XML
This sample describes a context-translation program designed to convert non-XML files to XML; in particular, it converts the content of an Excel spreadsheet (.XLS) to XML.
The conversion is done in two steps:
The first step in any conversion is to analyze the structure of the input file. You can do this either manually (by "eyeballing" the input file) or by designing an OmniMark program to perform this task. Since the input file in this sample is very simple, a manual approach makes the most sense. The following figure shows the original XLS spreadsheet:
Fig 2. Excel spreadsheet source file
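For larger inputs, where eyeballing is impractical, the structural analysis could be done by a small program. The following is a rough sketch in Python rather than OmniMark, and it assumes the spreadsheet has first been exported as comma-separated text. The function name summarize_structure and the sample data in the usage note are hypothetical and are not taken from the spreadsheet in the figure.

```python
import csv
import io
from collections import Counter


def summarize_structure(csv_text):
    """Report how many non-empty rows the export has, how many columns
    each row uses, and what the first row (often a header) contains."""
    reader = csv.reader(io.StringIO(csv_text))
    # Keep only rows that contain at least one non-blank cell.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    column_counts = Counter(len(row) for row in rows)
    return {
        "rows": len(rows),
        "column_counts": dict(column_counts),
        "first_row": rows[0] if rows else [],
    }
```

Calling summarize_structure("Name,Qty\nApple,3\n\nPear,5\n") on this made-up export reports three rows, all two columns wide, with the first row holding the column names. A uniform column count like this suggests a simple row-and-cell structure.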
The sample spreadsheet has the following basic structure:
The resulting XML file includes main elements that reflect this structure: