Literate Programming Using OmniMark
4. Tangling
The tangling process consists of re-assembling the pieces of
the literate program into a form that can be used as an
executable program.

In the weaving process, we did not want to re-order the input
document, under the assumption that the author has already
chosen the best order for the presentation. In the tangling
process, however, the entire task consists of re-ordering the
sections so that they will make syntactic and semantic sense to
the language tool (e.g., the compiler).

Knuth chose the name tangle for a reason: compilers need
things in very specific orders (e.g., function signatures
must be defined before the function is called). The result of
the tangling process is a tangled mess, and is not meant for
human consumption. Given this, we can ignore any formatting
issues whatsoever, and concentrate on generating things in the
correct order.

The bulk of the tangling process takes place in the rule for
the code element. The algorithm for assembling a tangled
program from a literate program is relatively simple:
unidentified code blocks are concatenated together

<24 tangling code unidentified> =
    using output as tangled-file
        output "%c"

and any cross-references to identified code blocks are
replaced by the code blocks themselves:

<25 tangling code identified> =
    set referent ("lg" % attribute "id") with (referents-allowed & append)
        to "%c"
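
As a language-neutral illustration of the two steps above, here is a minimal
Python sketch. It is not the OmniMark implementation: the tangle function,
the (block_id, text) input model, and the <<name>> reference notation are all
assumptions made for this example. Ids are lower-cased on the assumption that
the "lg" format in the rule above lower-cases the id attribute.

```python
import re

# Hypothetical reference notation used only in this sketch; the original
# document marks references with entities, not with <<name>>.
REFERENCE = re.compile(r"<<([^<>]+)>>")


def tangle(blocks):
    """Tangle a list of (block_id, text) pairs into one program text.

    block_id is None for unidentified code blocks.
    """
    main_parts = []  # unidentified blocks, kept in document order
    named = {}       # identified blocks, keyed by lower-cased id

    for block_id, text in blocks:
        if block_id is None:
            main_parts.append(text)
        else:
            key = block_id.lower()
            # Blocks that share an id accumulate, like the "append" flag.
            named[key] = named.get(key, "") + text

    def expand(text, active):
        # Replace every reference by the (recursively expanded) block body.
        def substitute(match):
            key = match.group(1).strip().lower()
            if key not in named:
                raise KeyError(f"undefined code block: {key}")
            if key in active:
                raise ValueError(f"circular reference: {key}")
            return expand(named[key], active | {key})

        return REFERENCE.sub(substitute, text)

    return expand("".join(main_parts), frozenset())
```

For example, tangling one unidentified block that refers to "Body", plus two
blocks identified as "body", yields the unidentified text with both "body"
blocks spliced in, in document order, at the point of reference.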

There are a few special cases to handle, however. If the output
attribute is specified, the code block should be output to the
specified file: this is useful for keeping (say) a DTD and the
associated processing program together.

<26 tangling code output> =
    assert attribute "do-tangle" = ul"no-tangle"
        message "ERROR: A code-block cannot be output"
             || " and tangled at the same time."
    set file generate-filename to "%c"

The assertion is there to make sure that the author is not
trying to output a code block to more than one location. There
is nothing wrong with this, but it seems a little nonsensical.
Better to disallow it right from the start, until we find a
pressing need for it.

If the do-tangle attribute is specified as no-tangle, then the
code block is being used to provide an example in the woven
output, so the tangling process can ignore it:

<27 tangling code no tangle> =
    suppress
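
The cases described so far amount to a dispatch on the attributes of the code
block. The following Python sketch shows one way to read them; it is an
illustration under stated assumptions, not the author's rule. The function
name, the dictionary model of attributes, and the returned action labels are
hypothetical. The output test has to come before the no-tangle test, because
an output block is required to carry do-tangle="no-tangle"; the position of
the id test is an assumption.

```python
def classify_code_block(attributes):
    """Return the action the tangler takes for one code block.

    attributes is a plain dict standing in for the element's attributes
    (a hypothetical model). The result is one of the labels
    "write-to-file", "skip", "store" or "concatenate".
    """
    # Case-insensitive comparison, in the spirit of ul"no-tangle".
    do_tangle = attributes.get("do-tangle", "").lower()

    if "output" in attributes:
        # Mirrors the assertion: a block sent to its own file must not
        # also be tangled into the main program.
        if do_tangle != "no-tangle":
            raise ValueError(
                "ERROR: A code-block cannot be output"
                " and tangled at the same time."
            )
        return "write-to-file"

    if do_tangle == "no-tangle":
        return "skip"         # example code, shown only in the woven output

    if "id" in attributes:
        return "store"        # kept under its id until it is referenced

    return "concatenate"      # unidentified block, goes to the tangled file
```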

Putting this all together, we have

<28 tangling code> =

When an undefined general entity is encountered in a code
block, the external-text-entity rule
(<33 handling a cross-reference>) fires, translating the entity into
a processing instruction. This processing instruction is then
translated into a referent to the code block's content:

<29 tangling a code reference> =
    processing-instruction "code-reference " any+ => reference-name
        output referent reference-name

With this approach, OmniMark's referent mechanism will take
care of inserting the body of the code block wherever it
is referenced.

The remainder of the tangling process is fairly mechanical.
The global shelf tangled-file is used as an output stream:

<2 global shelves> +=
    global stream tangled-file

The program element is used to open the output file and
attach it to tangled-file. The filename is specified by
the output attribute.

<30 tangling a program> =
    element "program"
        open tangled-file with referents-allowed as file attribute "output"
        using output as tangled-file
            output "%c"
        close tangled-file
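
Because the stream is opened with referents-allowed, a reference can be
written to it before the referenced code block has been seen. A rough Python
analogy of that deferred substitution is sketched below. The ReferentStream
class and the reference_name helper are hypothetical names for this example,
and resolving everything in close() is a simplification rather than a
description of when OmniMark actually resolves referents. Nested references
inside a referent's value are not handled here.

```python
PI_PREFIX = "code-reference "


def reference_name(pi_text):
    """Return the name carried by a code-reference instruction, else None."""
    if pi_text.startswith(PI_PREFIX) and len(pi_text) > len(PI_PREFIX):
        return pi_text[len(PI_PREFIX):]
    return None


class ReferentStream:
    """Output buffer that accepts placeholders and fills them in on close."""

    def __init__(self):
        self._parts = []      # ("text", string) or ("ref", name), in order
        self._referents = {}  # name -> accumulated value

    def write(self, text):
        self._parts.append(("text", text))

    def write_referent(self, name):
        # Only a placeholder is recorded; the value may not exist yet.
        self._parts.append(("ref", name))

    def set_referent(self, name, value):
        # Values accumulate, in the spirit of the "append" flag.
        self._referents[name] = self._referents.get(name, "") + value

    def close(self):
        """Resolve every placeholder and return the finished text."""
        resolved = []
        for kind, payload in self._parts:
            if kind == "text":
                resolved.append(payload)
            else:
                resolved.append(self._referents[payload])  # KeyError if unset
        return "".join(resolved)
```

In this sketch the placeholder for a block can be written first and its value
supplied afterwards; the final text is only assembled when close() is called.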

All other elements can be safely ignored by the tangling
process.

<31 tangling miscellaneous elements> =
    element ("title" | "section" | "p" | "b" | "i" | "tt")
        suppress

All that is left is to define the group that contains the
rules for the tangling process.

<32 tangling> =