"The Official Guide to Programming with OmniMark"
International Edition
Previous chapter is Chapter 5, "Organizing Rules With Groups".
Next chapter is Chapter 7, "Shelves".
OmniMark supports three kinds of data types: switches, counters, and streams.
This chapter discusses the following topics:
In OmniMark, switches, counters, and streams are all different kinds of shelves. A shelf is an aggregate, which means that it can contain more than one item. Specific items on a shelf can be selected by using an indexer. The syntax descriptions in this chapter show the positions where indexers are permitted. Indexers are explained in Section 7.2, "Indexing Items on a Shelf".
In OmniMark, variable declarations must be present for all variables, or for none of them.
If a program contains even one variable declaration, then all variables must be declared. In a production environment, it could be catastrophic if a misspelled variable name were silently treated as a new, undeclared variable.
Declaration-free programs are very useful for quick programs that are used a few times and then discarded. They are also convenient for prototyping parts of a large system.
Declaration-free programs have the following restrictions:
Wherever this manual indicates an optional type herald, that herald is required for declaration-free programs.
Global variable declarations declare a variable which can be referenced anywhere in the program from that point on.
If the name of a declared variable is the same as an OmniMark keyword, then that keyword loses its meaning for the entire program, including that part of the program that precedes the variable declaration. This behaviour can be changed with the "DECLARE HERALDED-NAMES" declaration.
No global variable may have the same name as another global variable, even if they are of different types. This behaviour too can be changed with the "DECLARE HERALDED-NAMES" declaration.
Global variable declarations, rules, declarations, function definitions, and macro definitions can appear in any order, and they can be intermixed. Global variable declarations cannot appear inside of rules.
Examples of global variable declarations are:
GLOBAL SWITCH is-present
GLOBAL COUNTER number-of-lines
GLOBAL STREAM out
Global variable declarations are discussed further in: Section 7.1.1, "Declaring Global Shelves".
Local variables are declared at the beginning of a local scope. In a rule, a local scope always begins immediately after the rule header, and ends immediately before the next rule, declaration, function definition, or macro definition. Thus, local variable declarations can appear before the actions in a rule.
If the name of a local variable is the same as an OmniMark keyword, then that keyword loses its meaning for the entire local scope. This includes any declarations that precede the local variable declaration. (For example, this means that the name local cannot be used for a local variable.) This behaviour can be changed with the "DECLARE HERALDED-NAMES" declaration.
No local variable may have the same name as another local variable in the same local scope. It is often reasonable to give the same name to two local variables in two different rules. That is allowed because the two declarations will be in different local scopes. This restriction can be changed with the "DECLARE HERALDED-NAMES" declaration.
Examples of local variable declarations are:
FIND-START
   LOCAL STREAM tmp
   LOCAL COUNTER sum
Local scopes are discussed further in: Section 7.1.2, "Declaring Local Shelves" and Chapter 7, "Shelves".
Versions of OmniMark prior to V3 required a type herald to precede every variable name in any context where variables of different types are allowed. The herald had three purposes:
With the punctuational operators, heralds can actually detract from readability. Therefore, as of OmniMark V3, heralds are no longer required in most contexts. Heralds are only needed for OmniMark variables in two cases:
These cases are described in Section 21.1, "The Reduced Language". OmniMark always requires heralds for the names of SGML objects.
In OmniMark, the names of built-in variables begin with the octothorpe ("#") character. In rare cases, programmers may need to define their own variables which contain an octothorpe. This may happen when an OmniMark program is generated from another source and the programmer does not have control over the variable names.
To prevent confusion with built-in variables, programmer-defined variable names that contain a "#" must be quoted. (This is the only circumstance in which a quoted variable name refers to a variable different from the unquoted variable name.)
Format items which apply to variables contain the variable name that they are formatting. An extra set of quotation marks is not permitted inside of a format item, so OmniMark provides an alternate method of differentiating between a programmer-defined name containing a "#" and a built-in variable.
To reference the programmer-defined variable name inside a format item, the octothorpe character ("#") must be escaped:
LOCAL COUNTER "#item"
LOCAL COUNTER sizes VARIABLE
...
SET "#item" to 0
; catch the sizes here
REPEAT OVER sizes
   INCREMENT "#item" BY sizes
   OUTPUT "Count %d(#item), sizes = %d(%#item)%n"
AGAIN
In the example above, the first format item prints the value of the built-in counter #ITEM, and the second prints the value of the programmer-defined variable.
Programmer-defined variable names containing "#" characters are almost always unnecessary, and are deprecated.
A switch is an object that may have one of two different values: TRUE or FALSE. The value of a switch can be used anywhere a test expression is permitted.
Switches are automatically initialized to FALSE when they are created. It is recommended that programmers explicitly initialize switch items before using them. Forgetting to initialize (or re-initialize) variables is a frequent source of errors.
SET SWITCH? switch-name indexer? TO test-expression
Example A
...
LOCAL SWITCH foo
SET foo TO FALSE
Example B
...
LOCAL SWITCH foo
SET foo TO x > y
Example C
LOCAL SWITCH foo
LOCAL SWITCH fie
...
SET foo TO fie
Example D
LOCAL SWITCH foo
...
SET foo TO ! foo
The SET action can be used to set the value of a switch item. The herald, SWITCH, can be specified if desired, but is not necessary.
ACTIVATE switch-name indexer?
DEACTIVATE switch-name indexer?
In OmniMark prior to V3, setting a switch value to TRUE was referred to as activating that switch. Setting the value to FALSE was referred to as deactivating the switch. OmniMark provided the ACTIVATE and the DEACTIVATE actions to set the value of the switch.
ACTIVATE can be used to set a switch item's value to TRUE. The keyword SWITCH is not permitted before the switch-name because ACTIVATE cannot apply to any other variable type.
DEACTIVATE can be used to set a switch item's value to FALSE. The herald SWITCH is not permitted before the switch-name because DEACTIVATE cannot apply to any other variable type.
Some programmers may find this terminology more natural.
The following syntax can be used to set multiple switches at the same time. This form is deprecated, because it does not significantly enhance readability, and may, in fact, detract from it.
Syntax
(ACTIVATE | DEACTIVATE) switch-name indexer? (& switch-name indexer?)*
The following syntactic variations are permitted:
A reference to a switch item value is a valid test expression in itself. The syntax for this is:
Syntax
SWITCH? switch-name indexer?
Example
LOCAL SWITCH foo
...
DO WHEN foo
   ...
DONE
The herald SWITCH is not required.
ACTIVE switch-name indexer?
Example
...
LOCAL SWITCH foo
...
DO WHEN ACTIVE foo
   ...
DONE
OmniMark programs for versions prior to V3 used the operator ACTIVE to test switch values. The ACTIVE test simply yields the value of the switch. This operator is no longer required in OmniMark V3. A herald is not permitted after the ACTIVE keyword because ACTIVE can only apply to switches.
The above example using ACTIVE is functionally equivalent to the preceding example which just references the switch directly.
The ACTIVE test has another form which can test more than one switch value simultaneously. The syntax is one of:
Syntax
ACTIVE switch-name indexer?
Syntax
ACTIVE ( switch-name indexer? (& switch-name indexer?)* )
Syntax
ACTIVE ( switch-name indexer? (| switch-name indexer?)* )
The following syntactic variations are permitted:
Example A
...
LOCAL SWITCH foo
LOCAL SWITCH fie
...
DO WHEN ACTIVE (foo & fie)
   ...
DONE
Example B
...
LOCAL SWITCH foo
LOCAL SWITCH fie
...
DO WHEN foo & fie
   ...
DONE
The above two examples have exactly the same meaning.
The ACTIVE test can only be used on a list of switches when the connectors are either all "&" or all "|". This means that a test that mixes the two connectors, such as:
...
LOCAL SWITCH foo
LOCAL SWITCH fie
LOCAL SWITCH fum
...
DO WHEN foo & fie | fum
   ...
DONE
must, if it is written with ACTIVE, be expressed as:
...
LOCAL SWITCH foo
LOCAL SWITCH fie
LOCAL SWITCH fum
...
DO WHEN ACTIVE (foo & fie) | ACTIVE fum
   ...
DONE
OmniMark counters store integer values for counting, manipulating measurements, and performing general arithmetic. Counters can have positive or negative values. The maximum magnitude of a counter is 2,147,483,647. This section presents actions for setting, incrementing, and decrementing counters, and methods for representing counters in strings.
Counters are automatically initialized to 0. It is recommended that programmers always explicitly initialize their counters to make the initial value obvious to the reader of the program and to avoid errors resulting from forgetting to initialize (or re-initialize) a counter.
SET COUNTER? counter-name indexer? TO numeric-expression
The SET action can be used to set the value of a counter item. The herald, COUNTER, can be specified if desired, but is not necessary.
Example A
...
LOCAL COUNTER foo
SET foo TO 0
Example B
...
LOCAL COUNTER foo
SET foo TO x + y
Example C
LOCAL COUNTER foo
LOCAL COUNTER fie
...
SET foo TO fie
Example D
LOCAL COUNTER foo
...
SET foo TO -foo
Example E
LOCAL COUNTER foo
LOCAL STREAM s
...
SET foo TO s
The last example shows how string expressions can be used as numeric values. It depends on the stream s being closed and containing a valid string representation of a decimal number. (See Section 9.1, "Numeric Expressions" for more detail.)
The INCREMENT action increases the value stored in a counter by a specified amount. If the BY part is omitted, the counter is incremented by 1.
INCREMENT counter-name indexer? (BY numeric-expression)?
No herald is permitted before the counter name because INCREMENT can only be applied to counters.
CROSS-TRANSLATE
GLOBAL COUNTER lines
GLOBAL COUNTER chars

FIND "%n"
   INCREMENT lines
   INCREMENT chars

FIND ANY-TEXT+ => line
   INCREMENT chars BY LENGTH OF line
The DECREMENT action decreases the value stored in a counter by a specified amount. If the BY part is omitted, the counter is decremented by 1.
DECREMENT counter-name indexer? (BY numeric-expression)?
No herald is permitted before the counter name because DECREMENT can only be applied to counters.
RESET counter-name indexer? (TO numeric-expression)?
Versions of OmniMark prior to V3 used RESET instead of SET (or "SET COUNTER"). If the TO part is not specified, the value of the counter item is set to 0. (Note that SET requires a TO part.)
No herald is permitted before the counter name because RESET can only be applied to counters.
RESET has been superseded by SET and should no longer be used.
OmniMark is able to represent the value of a counter in the following ways:
The format item used to represent counter values as symbols is described in Section 19.1.4.4, "The Symbol Format".
% format-modifier* d( counter-name )
Example
CROSS-TRANSLATE
GLOBAL COUNTER line-number

FIND-START
   SET line-number TO 1
   OUTPUT-TO #SUPPRESS

FIND "%n"
   INCREMENT line-number

FIND-END
   OUTPUT-TO OUTPUT
   OUTPUT "The file contains %d(line-number) lines.%n"
The "%d" format formats the value of the specified counter item in arabic numerals. By default, the numbers are expressed in decimal (base 10) notation. If the value is negative, a minus sign is output before the number.
The following format modifiers can be used with the "%d" option:
This is the decimal point modifier. It consists of just a positive number with no identifying letter after it.
The decimal point modifier inserts a decimal point in the string representation of the counter value. The number indicates how many digits must be placed to the right of the decimal point. This allows counter values to simulate fixed-point numbers.
The "s" modifier strips trailing zeros from the result of the decimal point modifier. If there are no non-zero digits after the decimal point in the resulting representation, the decimal point is removed too. The "s" modifier requires that the decimal point modifier be specified as well.
The "f" modifier is the field-width modifier. The "f" modifier requires a preceding positive number. If the specified number is less than the minimum number of characters needed to print the value, the "f" modifier is ignored. If it is greater, space characters are added to the right of the value to fill it out to the field width.
"k" is the right justification modifier. It causes padding to be done on the left side of the field.
The "k" modifier requires the "f" modifier.
The "k" modifier is not allowed with the "z" modifier.
"z" specifies that the number should be padded on the left instead of the right, and it should be padded with zeros instead of with spaces.
The "z" modifier requires the "f" modifier.
The "z" modifier is not allowed with the "k" modifier.
"r" specifies the radix of the string representation of the counter value. It converts the value according to the radix specified by the number, which must be in the range 2 to 36. The lower-case letters "a" through "z" are used for the digits 10 to 35 when the radix is greater than 10.
When the "r" modifier is used, counter values are treated as unsigned numbers, so that negative numbers are represented as large positive numbers. This also means that although the modifier "10r" does not change the representation of non-negative numbers, it does change negative numbers to positive numbers.
"l" is the lower-casing modifier. It is allowed only when the radix modifier is specified. It causes the lower-case letters "a" through "z" to be used when the radix is greater than 10. This is equivalent to the default behaviour.
The "l" modifier requires the "r" modifier.
The "l" modifier cannot be used with the "u" modifier.
"u" is the upper-casing modifier. It is allowed only when the radix modifier is specified. It causes the upper-case letters "A" through "Z" to be used when the radix is greater than 10.
The "u" modifier requires the "r" modifier.
The "u" modifier cannot be used with the "l" modifier.
The following table illustrates the effect of these modifiers:
Format   Value   Result
%3d      23456   23.456
%3d      20      0.020
%3sd     23456   23.456
%3sd     20      0.02
%3sd     23000   23
The following is an example of using field widths and radices with a counter with a value of 29. The slashes are used only to show the number of spaces each format item produces:
Format            Value   Result
/%d(n)/           29      /29/
/%8fd(n)/         29      /29      /
/%8fkd(n)/        29      /      29/
/%8fzd(n)/        29      /00000029/
/%2rd(n)/         29      /11101/
/%8rd(n)/         29      /35/
/%16rd(n)/        29      /1d/
/%16r8fzud(n)/    29      /0000001D/
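The modifier rules above can be checked against a small model. The following Python sketch is not OmniMark code and is not taken from the OmniMark implementation; it only mimics the behaviour described in this section. The function name format_d and its keyword arguments are hypothetical stand-ins for the modifiers (decimals for the decimal point modifier, strip for "s", width for "f", right for "k", zero for "z", radix for "r", upper for "u"). Two details are assumptions: counters are taken to be 32 bits wide when "r" treats them as unsigned, and a minus sign is placed ahead of any zero padding.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def format_d(value, decimals=0, strip=False, width=0,
             right=False, zero=False, radix=None, upper=False):
    """Model of the "%d" format item and its modifiers (illustrative only)."""
    if radix is None:
        sign = "-" if value < 0 else ""
        body = str(abs(value))
    else:
        # "r": the value is treated as an unsigned number
        # (assumed here to be 32 bits wide).
        sign = ""
        n = value & 0xFFFFFFFF
        body = ""
        while True:
            n, digit = divmod(n, radix)
            body = DIGITS[digit] + body
            if n == 0:
                break
        if upper:
            body = body.upper()          # "u": upper-case digits
    if decimals > 0:
        # Decimal point modifier: `decimals` digits go to the right.
        body = body.rjust(decimals + 1, "0")
        body = body[:-decimals] + "." + body[-decimals:]
        if strip:
            # "s": drop trailing zeros, then a dangling decimal point.
            body = body.rstrip("0").rstrip(".")
    if zero and width > len(sign) + len(body):
        # "z": pad with zeros on the left (sign kept in front: assumption).
        body = body.rjust(width - len(sign), "0")
    text = sign + body
    # "k" pads with spaces on the left; the default pads on the right.
    # A width smaller than the text leaves the text unchanged.
    return text.rjust(width) if right else text.ljust(width)
```

For example, format_d(29, width=8, zero=True, radix=16, upper=True) models "%16r8fzud" and format_d(20, decimals=3, strip=True) models "%3sd".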
% format-modifier* a( counter-name )
The "%a" format represents the value of a counter item as a letter in the usual alphabetic sequence. A value of 1 would be represented as "a", 2 as "b", etc.
If the value of the counter item is greater than 26, then a multi-character string would be generated: 27 as "aa", 28 as "ab", 29 as "ac", etc.
If the value of the counter is zero the digit "0" is output instead of a letter. If the value of the counter is negative, a minus sign prefixes the generated alphabetic code.
The following modifiers can be used with the "%a" format:
The "j" modifier causes the letters "i", "l", and "o" to be omitted from the alphabetic sequence. This is a frequent convention in environments where these letters might be confused with the digits "1" and "0".
For example, suppose that the value of counter i is 9. Processed by "%ua(i)", the value appears as "I". However, if the format "%uja(i)" is used, the value will be "J".
The "l" modifier causes lower-case characters to be used. Since this is the default, the "l" can always be omitted. The programmer can use the "l" modifier to emphasize to the reader that lower-case letters are being used.
The "l" modifier cannot be used with the "u" modifier.
The "u" modifier causes upper-case characters to be used.
The "u" modifier cannot be used with the "l" modifier.
The "w" modifier changes the convention for representing values greater than 26 so that 27 is represented as "aa", 28 as "bb", 29 as "cc", 52 as "zz", 53 as "aaa", and so on.
The field-width modifier is allowed with the "%a" format. If the number is less than the minimum number of characters needed to print the value, the modifier is ignored. If it is greater, space characters are added to the right of the value to fill it out to the field width.
This modifier is allowed when the field-width modifier is given. It causes padding to be done on the left side of the field instead of the right.
The "k" modifier requires the "f" modifier.
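The alphabetic numbering described above can be modelled in a few lines. The following Python sketch is an illustration of the stated rules, not OmniMark code; the function name format_a and its keyword arguments (upper for "u", skip_ilo for "j", repeat for "w") are hypothetical. How "j" interacts with values that need more than one letter is not stated in this section, so the sketch assumes that the same scheme simply continues over the shortened 23-letter alphabet. The field-width modifiers are left out.

```python
import string

def format_a(value, upper=False, skip_ilo=False, repeat=False):
    """Model of the "%a" format item (illustrative only)."""
    if value == 0:
        return "0"                      # zero prints as the digit "0"
    sign = "-" if value < 0 else ""
    n = abs(value)
    letters = string.ascii_lowercase
    if skip_ilo:
        # "j": leave "i", "l", and "o" out of the sequence.
        letters = "".join(c for c in letters if c not in "ilo")
    base = len(letters)
    if repeat:
        # "w": one letter repeated, so 27 -> "aa", 28 -> "bb", 53 -> "aaa".
        count, index = divmod(n - 1, base)
        text = letters[index] * (count + 1)
    else:
        # Default: 26 -> "z", 27 -> "aa", 28 -> "ab", and so on.
        text = ""
        while n > 0:
            n, index = divmod(n - 1, base)
            text = letters[index] + text
    if upper:
        text = text.upper()             # "u": upper-case letters
    return sign + text
```

With these definitions, format_a(9, upper=True) gives "I" and format_a(9, upper=True, skip_ilo=True) gives "J", matching the example given for the "j" modifier.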
% format-modifier* i( counter-name )
The "%i" format item represents the value of a counter item as a Roman numeral.
If the value of the counter is zero the digit "0" is output instead of a Roman numeral. If the value of the counter is negative, the Roman numeral is prefixed by a minus sign.
Unless changed by a modifier, lower-case letters are used.
The following modifiers can be used with the "%i" format:
The "l" modifier reinforces the default convention of generating lower-case Roman numerals.
The "l" modifier cannot be used with the "u" modifier.
The "u" modifier causes upper-case Roman numerals to be generated.
The "u" modifier cannot be used with the "l" modifier.
The field-width modifier is allowed with the "%i" format. If the specified number is less than the minimum number of characters needed to print the value, the modifier is ignored. If it is greater, space characters are added to the right of the value to fill it out to the field width.
This modifier is allowed when the field-width modifier is given. It causes padding to be done on the left side of the field instead of the right.
The "k" modifier requires the "f" modifier.
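As a rough model of the "%i" rules above, the following Python sketch converts an integer to a Roman numeral with the zero, sign, case, and padding behaviour described. It is not OmniMark code. The name format_i and its arguments (upper for "u", width for "f", right for "k") are hypothetical, and the use of the conventional subtractive forms ("iv", "ix", "xl", and so on) is an assumption, since this section does not spell out the numeral system in detail. Values above 3999 simply repeat "m" in this sketch.

```python
ROMAN = [
    (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
    (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
    (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
]

def format_i(value, upper=False, width=0, right=False):
    """Model of the "%i" format item (illustrative only)."""
    if value == 0:
        text = "0"                      # zero prints as the digit "0"
    else:
        text = "-" if value < 0 else ""
        n = abs(value)
        for amount, numeral in ROMAN:
            count, n = divmod(n, amount)
            text += numeral * count
    if upper:
        text = text.upper()             # "u": upper-case numerals
    # "k" pads on the left; the default pads on the right.
    # A width smaller than the text leaves the text unchanged.
    return text.rjust(width) if right else text.ljust(width)
```

For instance, format_i(1994, upper=True) yields "MCMXCIV", and format_i(3, width=5, right=True) yields the numeral right-justified in a five-character field.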
% format-modifier* b( counter-name )
The "%b" format item represents the value of the counter item as a binary string. The characters are determined by converting each byte in the binary representation of the counter into the character represented by the value of that byte. A counter with a value of zero (0) would be expressed in one byte as "%0#", a value of 1 as "%1#", and so on.
In fact, when the field-width is 1, the "%b" format item is the dynamic analog of the constant "%#" modifier.
The following modifiers can be used with the "%b" format:
The byte order modifier consists of just a positive number. It does not contain an identifying character.
The byte order modifier causes the bytes to be output according to the order as described in Section 9.2.2.2, "Converting Binary Data To Numbers".
The field-width modifier is allowed with the "%b" format. It specifies the maximum number of bytes to convert the value to. The specified number cannot be greater than the maximum size of a counter representation which is currently 4.
To convert a number to a multi-character string, both the length of the desired string and the order of the characters in it need to be known. The following outputs the value in the counter "temp" in four characters with high-to-low ordering, using the "f" modifier to specify the length:
LOCAL COUNTER temp
...
OUTPUT "%4f0b(temp)"
Note that when both the ordering and field-width modifiers are specified, the field-width modifier must appear first. Otherwise, OmniMark cannot tell when the first modifier ends and the second begins.
The default length is 1 (one). If the specified field-width is greater than the number of bytes needed to print the value, zero-value bytes ("%0#") are used to print the value, and they are placed as specified by the order.
The byte ordering is determined by the following (in order of precedence):
For example, if we were converting the value 16,909,060 (1 * 256^3 + 2 * 256^2 + 3 * 256 + 4 * 1) to a string, the following table shows the string each byte order modifier value would produce:
(The "%{}" format item is described in Section 9.2.1.1.4, "Entering Numeric Characters ("%#" and "%{}")".)
The "4f" modifier is necessary to convert all four bytes of the value to a string. Otherwise, only the least significant byte would be used.
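The effect of the field width on "%b" can be sketched with a small model. The following Python function is illustrative only and is not OmniMark code. The name format_b is hypothetical, and the boolean high_first stands in for the byte order modifier because the numeric byte order codes are not reproduced here. Masking to 32 bits reflects the four-byte counter size mentioned above; how negative values are laid out is an assumption.

```python
def format_b(value, width=1, high_first=True):
    """Model of the "%b" format item (illustrative only)."""
    if not 1 <= width <= 4:
        raise ValueError("field width must be between 1 and 4")
    # Four-byte representation, most significant byte first.
    raw = (value & 0xFFFFFFFF).to_bytes(4, "big")
    # Keep only the `width` least significant bytes.
    kept = raw[4 - width:]
    # Emit high-to-low, or reverse for low-to-high ordering.
    return kept if high_first else kept[::-1]
```

With the value 16,909,060 used above, format_b(16909060, width=4) produces the bytes 1, 2, 3, 4, while the default width of 1 keeps only the least significant byte, 4.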
OmniMark provides a single data structure, called a stream, for:
A stream must be opened before characters can be written to it. Streams that hold characters for later use must be closed before their contents can be accessed.
This section describes the common features of streams.
A stream must be written to and closed in the domain that "owns" it. OmniMark defines two domains: the input processor and the output processor.
Streams belong to the input processor if they are opened in a rule or function that is executing in the input processor. They belong to the output processor if they are opened in the output processor. (Remember that the header of an EXTERNAL-TEXT-ENTITY rule is tested in the output processor, but the body executes in the input processor.)
It is an error to write to a stream which belongs to a different domain. For example, it is an error for an action in an ELEMENT rule to write to a stream which belongs to the input processor.
This restriction can be avoided by opening the stream with the open modifier DOMAIN-FREE. See Section 6.4.3.1.1, "Open Modifiers" for more information.
Closing a stream removes the domain ownership from a stream. Closed streams may be accessed, opened, or reopened in either domain. Opening or reopening a closed stream in a different domain moves ownership to that domain.
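The ownership rules above can be summarized in a toy model. The following Python sketch is not OmniMark code and does not describe how OmniMark is implemented; the class Stream, the exception DomainError, and the domain names "input" and "output" are hypothetical. It models only what this section states: a stream is owned by the domain that opened it, writing or closing from the other domain is an error unless the stream is DOMAIN-FREE, and closing releases ownership so that either domain may reopen the stream.

```python
class DomainError(Exception):
    """Raised when a stream is used from a domain that does not own it."""

class Stream:
    def __init__(self):
        self.owner = None          # None means closed and unowned
        self.domain_free = False
        self.pieces = []

    def open(self, domain, domain_free=False):
        self.pieces = []           # opening discards the old contents
        self.owner = domain
        self.domain_free = domain_free

    def reopen(self, domain):
        self.owner = domain        # contents kept; ownership moves

    def _check(self, domain):
        if self.owner is None:
            raise DomainError("stream is not open")
        if not self.domain_free and domain != self.owner:
            raise DomainError("stream belongs to the %s domain" % self.owner)

    def write(self, domain, text):
        self._check(domain)
        self.pieces.append(text)

    def close(self, domain):
        self._check(domain)
        self.owner = None          # closing removes domain ownership

    def contents(self):
        if self.owner is not None:
            raise DomainError("stream must be closed before it is read")
        return "".join(self.pieces)
```

In this model, a stream opened from "input" rejects a write from "output", but once it is closed it can be reopened from "output" and appended to there.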
The context translation in Chapter 18, "How Asynchronous Concurrent Context Translations Work" further explains how streams are processed.
A stream is always opened with a destination, called an attachment. There are three kinds of stream attachments:
Buffers serve a function analogous to string variables in other programming languages. They are used to store the results of string expressions for later use within the program.
OmniMark can create and write to files in the external file system. This is a useful way of saving information that can be used by applications that will run at a later time.
Other file operations are described in Section 10.4, "Manipulating Files".
Referents are a feature designed specifically for hypertext applications. They are used to resolve forward references (references into a part of the document that has not been processed yet). Briefly put:
These operations can be done in any order; OmniMark will remember where the referents have been written to, and will replace the referent with its final contents when the referent resolution is done.
Referent resolution is normally performed after the program completes. The programmer can also define nested referent scopes. The referents created and written in a nested referent scope will be resolved when the scope completes.
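The idea of deferred resolution can be illustrated with a toy model. The following Python sketch is not OmniMark code; the class Output and its method names are hypothetical. It records where each referent was written, lets the referent's value be set before or after that point, and substitutes the final value only when resolve is called, which corresponds to the end of a referent scope. Nested scopes and referents that are never given a value are not modelled.

```python
class Output:
    """Toy model of an output that defers referent resolution."""

    def __init__(self):
        self.pieces = []           # ("text", value) or ("referent", name)
        self.referents = {}

    def put(self, text):
        self.pieces.append(("text", text))

    def put_referent(self, name):
        # Remember the position; the value may not exist yet.
        self.pieces.append(("referent", name))

    def set_referent(self, name, value):
        # A later call replaces the earlier value; the last one wins.
        self.referents[name] = value

    def resolve(self):
        # Replace every referent with its final contents.
        return "".join(
            value if kind == "text" else self.referents[value]
            for kind, value in self.pieces
        )
```

A forward reference then works as expected: writing "See page ", the referent "p", and "." and only afterwards setting "p" to "42" resolves to "See page 42.".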
Referents are explained in detail in Chapter 11, "Cross-Referencing and Hypertext Linking".
Streams are initially unattached.
The basic actions that can be performed on a stream are opening it, writing to it, and closing it.
OPEN stream-name indexer? open-modifiers? AS attachment
The permitted attachments are:
Opens the stream as a buffer.
Opens the stream as a referent whose name is given by the string expression.
Opens the stream as a file whose name is given by the string expression. This is the recommended way to open files.
Opens the stream as a "connection" to an external output function. When data is written to the stream, it is processed by the external output function. This extends the ways in which OmniMark interacts with the external environment simply by adding external function libraries. These functions are described further in Section 12.3.4, "External Output Functions".
(For backwards compatibility, OmniMark allows the FILE keyword to be omitted from the FILE attachment. This practice is deprecated in OmniMark V3. See Section 6.4.3.1.2, "The Keyword FILE in Stream Attachments".)
Opening a stream causes the previous contents of the attached object to be lost, just as if the stream were discarded. (See Section 6.4.4.3, "Discarding a Stream".)
A stream can be opened if it is:
(A #CURRENT-OUTPUT set is saved when a PUT action, or a "USING OUTPUT AS" prefix is executed. These actions save the old #CURRENT-OUTPUT set and establish a new set for the duration of the action. See Chapter 13, "The Current Output Stream Set" for more details.)
Examples of opening a stream are:
Example A
LOCAL STREAM temp
...
OPEN temp AS BUFFER
Example B
LOCAL STREAM chapid
...
OPEN chapid AS REFERENT "%v(id)"
Example C
LOCAL STREAM def
...
OPEN def AS FILE "%v(docname).def"
The last two examples use the value of the attributes id and docname respectively as the basis of the referent name and the file name.
OmniMark automatically performs whatever actions are required by a particular computer system to close each open file at the end of a translation.
Open-modifiers are modifiers that affect how the stream is opened. The phrase begins with the keyword WITH. If more than one open modifier is specified, they must be enclosed in parentheses and separated by "&".
The syntax of the open modifiers phrase is:
Syntax
WITH open-modifier (& open-modifier)*
The following syntactic variations are permitted:
Allowed modifiers are:
This is a constant string expression containing one or more of the format modifiers "s", "z", "h", "u", and "l". These are described in Section 4.1.2, "Processing Content".
Since "u" and "l" describe contradictory states (upper-case and lower-case, respectively), they cannot be used together.
This consists of the keyword BINARY followed by a numeric expression that evaluates to a valid binary ordering number (0,1,2, or 3).
The BINARY modifier specifies the default byte ordering code to use in "%b" format items written to that stream as described in Section 6.3.4.4, "Binary Representations". It overrides the default established by the BINARY-OUTPUT declaration.
This modifier is only effective on streams attached to files. It causes data to be written to the file in "binary mode". This means that no translation is done on the newline characters. Each character is written "as is" to the file.
This open modifier has the same syntax as the BREAK-WIDTH declaration:
Syntax
BREAK-WIDTH numeric-expression (TO? numeric-expression)?
The first numeric-expression sets the preferred width, and the second sets the maximum width. The keyword TO is required if the second numeric-expression is not an integer. The maximum width must be the same as or greater than the preferred width.
OmniMark will attempt to break lines at the preferred width, if possible. If lines exceed the maximum width, OmniMark will issue error messages. No messages are produced if the maximum width is not specified.
Streams opened with DOMAIN-FREE may be written to, reopened, closed, and discarded from any domain, not just the domain in which the stream was opened. This option is useful for files which contain processing messages from the program.
Streams opened with the DOMAIN-FREE modifier also have the "h" and "z" element content format modifiers turned on by default.
This modifier allows referents to be written to this stream.
Files opened with the REFERENTS-ALLOWED modifier will not be written to disk until the referents are resolved. This means that the contents of such a file are not available until the end of the referent scope is reached. If the file was opened in the global referent scope, the referents are not resolved until the program ends, so these files cannot be used as input to a later part of the same program.
The contents of buffers and referents opened with the REFERENTS-ALLOWED modifier and then closed, can only be written to other streams which have REFERENTS-ALLOWED, until the referents are resolved.
This modifier also allows referents to be written to this stream. The referents are immediately converted to a string expression identifying the name of the referent and its current value. This string expression is an ordinary value. No replacement is done when referents are resolved.
The purpose of this modifier is to allow referents to be written to streams without buffering those streams until referents are resolved, and to display something useful when doing so.
This is primarily useful in programmer-generated error, warning, or information messages where the contents of a buffer containing referents needs to be displayed.
The REFERENTS-DISPLAYED option is described in Section 11.2, "Allowing Referents to be Written to a Stream".
This is the default. It causes an error whenever a referent is written to a stream that has this modifier applied. See Section 11.2, "Allowing Referents to be Written to a Stream".
Different operating systems have different newline sequences in text files. To make programs uniform across platforms, OmniMark automatically converts between the system-specific newline sequence and OmniMark's internal representation: a single line feed character (ASCII 10). Using a single character makes the sequence easier to manipulate, especially in patterns. By using the same character on all systems, OmniMark programs are made more portable.
To achieve this, the system-specific newline sequence is translated to a single line feed when files are read in TEXT-MODE. When writing files in TEXT-MODE, the reverse translation is automatically done.
This behaviour can corrupt binary data, so the BINARY-MODE modifier is provided to specify that no such translation is done.
TEXT-MODE is the default, unless the deprecated NEWLINE declaration is used.
Each option can only be specified once.
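The newline translation performed in TEXT-MODE, and the reason BINARY-MODE exists, can be shown with a short model. The following Python sketch is not OmniMark code; the function names are hypothetical, and the system-specific newline sequence is passed in explicitly (a carriage return plus line feed is used in the usage note as one example of such a sequence).

```python
def read_text_mode(raw, system_newline):
    """Translate the system newline sequence to a single line feed."""
    return raw.replace(system_newline, b"\n")

def write_text_mode(data, system_newline):
    """Translate each line feed back to the system newline sequence."""
    return data.replace(b"\n", system_newline)

def write_binary_mode(data):
    """Write every byte as is, with no newline translation."""
    return data
```

Text survives a round trip: write_text_mode(b"a\nb\n", b"\r\n") gives b"a\r\nb\r\n", and read_text_mode turns that back into the original. Binary data does not: the three bytes 1, 10, 2 become four bytes when written in text mode with that newline sequence, because the byte with value 10 is treated as a line feed.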
For example, the following action
LOCAL STREAM contents
...
OPEN contents WITH ("u" & REFERENTS-ALLOWED) AS FILE "contents.out"
will cause all element data content written to the stream contents to be forced to upper-case. The modifier can be overridden by a subsequent PUT or OUTPUT-TO action.
For backwards compatibility, in the OPEN and "REOPEN ... AS" actions only, BREAK-WIDTH is allowed without a heralding WITH, "&", or AND. This form is deprecated, and is provided for compatibility with previous releases of OmniMark.
In OmniMark V3, the FILE keyword can be used as an operator which returns the contents of a file as a string expression, or it can be used as a stream attachment.
The following examples illustrate how these two usages are resolved in the attachment part of an OPEN action:
Example A
OPEN s AS "my-file"
Example B
OPEN s AS FILE "my-file"
Example C
OPEN s AS FILE FILE "my-file"
The first two examples are equivalent, and the second one is the recommended form. The third form gets the name of the file to be opened from the contents of the file "my-file". For instance, if the file "my-file" contained the text "whatsit", then "whatsit" would be the name of the file opened and attached to the stream s.
A more interesting example is the following. The first two actions below are equivalent, but their meaning differs from that of the third:
LOCAL STREAM x
...
OPEN x AS BINARY-MODE FILE "y"
OPEN x AS FILE BINARY-MODE FILE "y"
OPEN x WITH BINARY-MODE AS FILE "y"
In the first action, note that binary-mode file "y" is a string expression, whose value is the contents of the binary file named "y". This is the deprecated form of "OPEN ... AS FILE" (with the keyword FILE omitted), so x is opened as a file whose name is contained in the binary file "y". x will be opened in TEXT-MODE on systems where that is the default, and in BINARY-MODE on other systems. (See Section 10.4.2.1, "Binary And Text-Mode Files".)
The second action is a slightly clearer way of doing the same thing. This form is strongly encouraged over the first form.
The third action opens x in BINARY-MODE with the name "y".
REOPEN stream-name indexer? open-modifiers? (AS attachment)?
The permitted attachments for REOPEN are the same as for OPEN. (See Section 6.4.3.1, "Opening Streams".)
The REOPEN action is used to open an existing or new object so that text can be appended to it. Unlike OPEN, the contents of the attached object are not erased when it is REOPENed.
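The difference between OPEN and REOPEN on a file attachment resembles the difference between truncating and appending file modes in other languages. The following Python sketch is only an analogy and is not OmniMark code; the helper names open_stream and reopen_stream, and the file name used, are hypothetical.

```python
import os
import tempfile


def open_stream(path):
    # Analogue of OPEN ... AS FILE: any existing contents are erased.
    return open(path, "w")


def reopen_stream(path):
    # Analogue of REOPEN ... AS FILE: existing contents are kept and new
    # text is appended; the file is created if it does not yet exist.
    return open(path, "a")


def demo():
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "contents.out")
        with open_stream(path) as s:
            s.write("first ")
        with reopen_stream(path) as s:
            s.write("second")
        with open(path) as f:
            appended = f.read()   # both pieces of text are present
        with open_stream(path) as s:
            s.write("third")
        with open(path) as f:
            erased = f.read()     # only the text written after OPEN remains
    return appended, erased
```

Calling demo() returns the file contents after the REOPEN-style append and after the later OPEN-style truncation, which makes the contrast easy to inspect.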
The modifiers available for REOPEN depend on whether the attachment is specified or not. (For more information about the open modifiers, see Section 6.4.3.1.1, "Open Modifiers".)
When the attachment is specified, all of the open modifiers available for the OPEN action are also available for REOPEN.
When the attachment is not specified, all of the open modifiers available for the OPEN action are also available for REOPEN except for TEXT-MODE and BINARY-MODE.
When the attachment is specified, REOPEN behaves as follows:
When the attachment is not specified:
REOPEN s WITH ""
As with OPEN, it is an error to REOPEN a stream which is part of any #CURRENT-OUTPUT set (active or saved), for either domain.
A stream opened with REFERENTS-ALLOWED cannot be reopened once it is closed; it is effectively "locked" once closed.
For backwards compatibility with previous releases of OmniMark, in the OPEN and "REOPEN ... AS" actions only, BREAK-WIDTH is allowed without a heralding WITH, "&", or AND. This unheralded form is deprecated. If the attachment is not specified and the BREAK-WIDTH open modifier is applied, the appropriate WITH, "&", or AND herald must precede the BREAK-WIDTH keyword.
The #PROCESS-OUTPUT and #ERROR streams are always open, and cannot be closed or discarded. They can only be reopened to change their modifiers. REOPEN cannot specify a stream attachment when applied to either of these streams. The following restrictions apply to open modifiers specified in the REOPEN action:
Examples of reopening #PROCESS-OUTPUT are:
REOPEN #PROCESS-OUTPUT
REOPEN #PROCESS-OUTPUT WITH BINARY 2
REOPEN #PROCESS-OUTPUT WITH "us"
REOPEN #PROCESS-OUTPUT WITH ("l" & BINARY 0)
The first example does nothing at all. The second is used to change the default BINARY-OUTPUT modifier on the #PROCESS-OUTPUT stream (see Section 6.3.4.4, "Binary Representations"). The third changes the element content modifiers (see Section 4.1.2, "Processing Content"). The fourth changes both the element content modifiers and binary ordering. The two modifiers may appear in either order.
The "h" and "z" element content modifiers are always on, so line-breaking and translation are never applied to text written to the #PROCESS-OUTPUT or #ERROR streams. By default, no case-conversion or stripping is done on element content, and the binary output value is the value specified in the BINARY-OUTPUT declaration if one is given, or 0 if not.
Text can be written to one or more streams with the PUT action.
PUT stream-name indexer? open-modifiers? (& stream-name indexer? open-modifiers?)* string-expression
Example
PUT #MAIN-OUTPUT "The quick brown fox"
The PUT action writes the given string-expression to one or more streams. The open-modifiers permitted are described in Section 6.4.4.1.1, "Open Modifiers Allowed With PUT".
The string-expression is evaluated only as needed. If the string-expression consists of any of the following, alone or in a JOIN, they are processed "as needed":
Only the following open-modifiers are permitted with PUT:
The specified open-modifiers replace the ones currently in force. If no open-modifiers are specified, the ones currently in force are used. See Section 6.4.3.1.1, "Open Modifiers".
The following syntactic variations are permitted:
Example
...
PUT (#MAIN-OUTPUT & aux) "The quick brown fox"
Before writing the text to the specified stream items, the PUT action saves the old #CURRENT-OUTPUT set, and creates a new #CURRENT-OUTPUT set consisting of the specified stream items. After the writing has completed, the previous #CURRENT-OUTPUT is restored.
This causes #CURRENT-OUTPUT sets to be nested within each other. This useful feature of #CURRENT-OUTPUT becomes important when the string expression contains a "%c" format item.
#CURRENT-OUTPUT may be given as one of the stream-names. In that case, the text is written to the set of streams that are currently active as well as the other ones specified. No modifiers can be specified for #CURRENT-OUTPUT.
The concept of the #CURRENT-OUTPUT set is discussed in Chapter 13, "The Current Output Stream Set".
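The save-and-restore behaviour of PUT described above can be pictured as a stack of output sets. The following Python sketch is a toy model under that assumption, not OmniMark code: the class OutputContext, its methods, and the use of Python lists to stand in for streams are all hypothetical. The callback passed to put() stands in for whatever writes to #CURRENT-OUTPUT while the string expression (for example one containing "%c") is being processed.

```python
class OutputContext:
    """Toy model of the #CURRENT-OUTPUT set (hypothetical, for illustration)."""

    def __init__(self, initial):
        self.current = list(initial)  # the active output set
        self.saved = []               # previously active sets, innermost last

    def output(self, text):
        # Write to every stream in the active set.
        for stream in self.current:
            stream.append(text)

    def put(self, streams, body):
        # Save the old set and make the named streams the current set.
        self.saved.append(self.current)
        self.current = list(streams)
        try:
            body(self)                # may call put() again, nesting the sets
        finally:
            # Restore the previous set once writing has completed.
            self.current = self.saved.pop()


def demo():
    main, aux, inner = [], [], []
    ctx = OutputContext([main])

    def nested(c):
        c.output("B")                                  # written to aux
        c.put([inner], lambda c2: c2.output("C"))      # written to inner
        c.output("D")                                  # aux again, after restore

    ctx.output("A")
    ctx.put([aux], nested)
    ctx.output("E")                                    # main again, after restore
    return "".join(main), "".join(aux), "".join(inner)
```

In this model each nested put() sees only its own streams, and the enclosing set becomes active again as soon as the inner write finishes.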
CLOSE stream-name indexer? (& stream-name indexer?)*
The CLOSE action explicitly closes a stream. Once a stream attached to a file has been closed, the same stream name can be used to open another file. If an OPEN attempts to open a stream that is already open, the stream is first closed, as if by the CLOSE action, and then opened.
Any of the following forms can be used to close several streams at once:
Example A
CLOSE s1
CLOSE s2
CLOSE s3
Example B
CLOSE s1 & s2 & s3
Example C
CLOSE (s1 & s2 & s3)
Example D
CLOSE s1 AND s2 AND s3
Example E
CLOSE (s1 AND s2 AND s3)
The first form is recommended, especially if indexers are being applied to the streams.
It is an error to close a stream that is part of any #CURRENT-OUTPUT (for any currently open rule or function).
A stream must be explicitly closed in one domain before it can be opened or reopened in a different domain.
The following syntactic variations are permitted:
DISCARD stream-name indexer? (& stream-name indexer?)*
The DISCARD action breaks the association between the stream and the object that it is attached to. If the stream was attached to a buffer, DISCARD destroys the buffer.
If this stream is later reopened (with the REOPEN action) its content will be empty. An attempt to access the content of a discarded stream is an error. Discarded streams are neither OPEN, CLOSED, nor ATTACHED.
The following syntactic variations are permitted:
Any of the following forms can be used to discard several streams at once:
Example A
DISCARD s1
DISCARD s2
DISCARD s3
Example B
DISCARD s1 & s2 & s3
Example C
DISCARD (s1 & s2 & s3)
Example D
DISCARD s1 AND s2 AND s3
Example E
DISCARD (s1 AND s2 AND s3)
The first form is recommended, especially if indexers are being applied to the streams.
It is an error to discard a stream that is part of any #CURRENT-OUTPUT stream set.
SET STREAM? stream-name indexer? open-modifiers? TO string-expression
The SET (or "SET STREAM") action applied to a stream is used to associate a buffer with the named stream and to initialize the contents of the buffer.
(Prior to OmniMark V3, "SET BUFFER" was used instead of SET or "SET STREAM". OmniMark V3 still supports "SET BUFFER" but its use is deprecated.)
The string-expression is evaluated only as needed. If the string-expression consists of any of the following, alone or in a JOIN, they are processed "as needed":
Any of the open modifiers permitted for OPEN may be specified in the SET (or "SET STREAM") action. See Section 6.4.3.1.1, "Open Modifiers".
For example, the following rule would open stream stuff as a buffer, and set it to the contents of element date, and then close it.
GLOBAL STREAM stuff
...
ELEMENT date
  SET stuff TO "%c"
In the SET action, the string-expression is evaluated before the stream is opened. Thus, the following two examples are equivalent:
Example A
LOCAL STREAM s1
...
SET s1 WITH "u" TO "...%c..."
Example B
LOCAL STREAM s1
...
DO
  LOCAL STREAM tmp-s
  OPEN tmp-s WITH "u" AS BUFFER
  PUT tmp-s "...%c..."
  CLOSE tmp-s
  OPEN s1 WITH "u" AS BUFFER
  PUT s1 tmp-s
  CLOSE s1
DONE
Unlike other open modifiers, BREAK-WIDTH must be parenthesized within the SET (or "SET STREAM") action. This is to avoid a potential ambiguity in the interpretation of the TO keyword in situations like the following:
LOCAL STREAM x
...
SET x WITH BREAK-WIDTH 72 TO "80" . . .
The BREAK-WIDTH part must be delimited by parentheses, e.g.:
LOCAL STREAM x
...
SET x WITH (BREAK-WIDTH 72 TO "80") . . .
or
LOCAL STREAM x
...
SET x WITH (BREAK-WIDTH 72) TO "80" . . .
The contents of a stream previously opened as a buffer or referent, and then closed, can also be accessed in a string expression. The following examples are equivalent:
Example A
LOCAL STREAM my-stream
...
OUTPUT my-stream
Example B
LOCAL STREAM my-stream
...
OUTPUT "%g(my-stream)"
% format-modifier* g( stream-name )
OmniMark buffers and referents contain character strings of any size, from a few characters to large portions of a document. Buffers can be constructed from document content, attribute values, or strings in an OmniMark program. Once a buffer has been created, the "%g" format item can be used to access its content.
The stream must be attached to a referent or a buffer and closed before the "%g" format is used. The following modifiers are allowed:
"f": The field-width modifier, given as a number followed by "f". If the specified number is less than the minimum number of characters needed to format the value, the modifier is ignored. If it is greater, space characters are added to the right of the value to fill it out to the field width.
"l": Converts all of the text to lower-case. The "l" modifier cannot be used with the "u" modifier.
"k": Allowed only when the field-width modifier is given. It causes padding to be done on the left side of the field instead of the right. The "k" modifier requires the "f" modifier.
"u": Converts all of the text to upper-case. The "u" modifier cannot be used with the "l" modifier.
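The interaction of the field-width, "k", "l", and "u" modifiers can be summarised with a small model. The following Python sketch is not OmniMark code and only restates the rules above; the function format_g and its parameter names are hypothetical.

```python
def format_g(value, width=None, pad_left=False, case=None):
    """Toy model of the "%g" modifiers (hypothetical helper, for illustration).

    width    -- the field width, or None if no field-width modifier is given
    pad_left -- True models the "k" modifier
    case     -- "l", "u", or None; one argument, so the two cannot be combined
    """
    if pad_left and width is None:
        raise ValueError('the "k" modifier requires the field-width modifier')
    if case not in (None, "l", "u"):
        raise ValueError('case must be "l", "u", or None')

    if case == "l":
        value = value.lower()
    elif case == "u":
        value = value.upper()

    # A width smaller than the text is ignored; a larger one pads with spaces.
    if width is not None and width > len(value):
        padding = " " * (width - len(value))
        value = padding + value if pad_left else value + padding
    return value
```

For instance, format_g("abc", width=5) pads on the right, while adding pad_left=True moves the padding to the left, and a width of 2 leaves the three-character value unchanged.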
This section describes operators that act on streams.
STREAM? stream-name indexer? (HAS | HASNT) NAME
Streams that are attached to files and referents have names. The "HAS NAME" test will return TRUE if the specified stream item has a name, and FALSE otherwise.
If the stream has a name, the "NAME OF" operator can be applied to the stream item to return a string expression containing the name of the stream.
The built-in stream #MAIN-OUTPUT has a name when it has been attached to a file on the command line with the -of or -aof options.
The herald STREAM is optional.
NAME OF STREAM? stream-name indexer?
If the specified stream is attached to a file or a referent, the "NAME OF" operator will return the name of the file or referent. Invoking the "NAME OF" operator on a stream that is not attached to a file or a referent is an error.
The herald STREAM is optional.
It is an error to use the "NAME OF" operator if the stream does not have a name. When in doubt, the operator should be preceded by a "HAS NAME" test, as in the following example:
GLOBAL STREAM new-output
...
DO WHEN new-output HAS NAME
  DO UNLESS NAME OF new-output = "new-out.txt"
    PUT #ERROR "STREAM new-output has unexpected name: "
    PUT #ERROR NAME OF new-output
    HALT WITH 1
  DONE
ELSE
  PUT #ERROR "new-output was never opened " _
             "as file or referent%n"
  HALT WITH 1
DONE
The first kind of stream test checks for the state of a stream, and has the form:
Syntax
STREAM? stream-name indexer? (IS | ISNT) (OPEN | CLOSED)
The herald STREAM is optional.
The tests have the following meaning. A stream is:
A stream must be CLOSED to be used in a "%g" format item or in a string expression. An example of these tests is:
DOCUMENT-START
  LOCAL STREAM s
  ...
  OPEN s AS BUFFER WHEN s ISNT OPEN
  ...
  CLOSE s WHEN s ISNT CLOSED
  ...
The type of the object that a stream is attached to can be tested. A stream whose contents have been discarded (see Section 6.4.4.3, "Discarding a Stream") is not ATTACHED. If a stream is ATTACHED it must be either OPEN or CLOSED.
A stream attachment type test has the syntax:
Syntax
STREAM? stream-name indexer? (IS | ISNT) (BUFFER | FILE | REFERENT | EXTERNAL | SGML-PARSER | ATTACHED)
A stream is a:
OmniMark allows combined tests of the form:
LOCAL STREAM my-stream
LOCAL STREAM your-stream
...
DO WHEN my-stream IS (OPEN & BUFFER)
  ...
DONE
DO WHEN your-stream IS (BUFFER | FILE)
  ...
DONE
However, the OmniMark compiler will report as fatal errors combinations that don't make sense, such as:
LOCAL STREAM my-stream
LOCAL STREAM your-stream
...
DO WHEN my-stream IS (OPEN & CLOSED)
  ...
DONE
DO WHEN your-stream IS (BUFFER & FILE)
  ...
DONE
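The state and attachment tests described in this section can be pictured as queries on a small state machine. The following Python sketch is a toy model, not OmniMark code: the class StreamModel and its method names are hypothetical, and treating a never-opened stream the same as a discarded one is an assumption made for the model.

```python
class StreamModel:
    """Toy model of the stream tests (hypothetical, for illustration)."""

    def __init__(self):
        self.state = None       # "OPEN", "CLOSED", or None when not attached
        self.attachment = None  # for example "BUFFER" or "FILE"

    def open(self, attachment):
        # Opening an already open stream closes it first, then opens it.
        if self.state == "OPEN":
            self.close()
        self.attachment = attachment
        self.state = "OPEN"

    def close(self):
        # Only the OPEN -> CLOSED transition is modelled here.
        if self.state == "OPEN":
            self.state = "CLOSED"

    def discard(self):
        # DISCARD breaks the association with the attached object.
        self.state = None
        self.attachment = None

    def _holds(self, test):
        if test in ("OPEN", "CLOSED"):
            return self.state == test
        if test == "ATTACHED":
            return self.attachment is not None
        return self.attachment == test  # BUFFER, FILE, and so on

    def is_all(self, *tests):
        # Models a combined test such as IS (OPEN & BUFFER).
        return all(self._holds(t) for t in tests)

    def is_any(self, *tests):
        # Models a combined test such as IS (BUFFER | FILE).
        return any(self._holds(t) for t in tests)
```

In this model a combination such as is_all("OPEN", "CLOSED") can never be true, which is the kind of test the compiler rejects, and after discard() none of the OPEN, CLOSED, or ATTACHED tests holds.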
Copyright © OmniMark Technologies Corporation, 1988-1997. All rights reserved.
EUM27, release 2, 1997/04/11.