"The Official Guide to Programming with OmniMark"
International Edition
An OmniMark program can define two different kinds of functions: internal functions and external functions.
Internal functions serve the following purposes:
External functions are used to gain access to external applications or operating system services which are not available directly in OmniMark.
There are two aspects to a function:
Additionally, a function may also have a function predefinition which simply defines the function interface. Function predefinitions are discussed in Section 12.5.2, "Function Predefinitions".
A function, like a variable, must be declared before it is used. This allows OmniMark to know how to interpret the arguments passed to the function, and to know what kind of result it is getting back from the function.
Functions are defined using a function definition, which has the syntax:
Syntax
DEFINE EXTERNAL? result-type? FUNCTION function-name argument-list AS function-body
An example is:
DEFINE FUNCTION print-n-spaces VALUE COUNTER n AS
   LOCAL COUNTER spaces-yet-to-print
   SET spaces-yet-to-print TO n
   REPEAT
      EXIT WHEN spaces-yet-to-print <= 0
      OUTPUT "%_"
      DECREMENT spaces-yet-to-print
   AGAIN
The keyword EXTERNAL is used to indicate an external function. External functions are discussed in Section 12.3, "External Functions".
Recursively defined functions may need to be used before their definitions, so there is a special way of "predefining" a function described in Section 12.5.2, "Function Predefinitions".
A function's name must be a "plain" name -- it cannot be quoted, the way variable names are.
When a function is defined in an OmniMark program, the name of the function becomes a keyword in that program. If the function's name just happens to be the same as an OmniMark keyword, the OmniMark keyword cannot be used in the program -- that keyword always refers to the function.
It is generally best simply not to use OmniMark keywords as function names. (It's actually best not to use OmniMark keywords as variable names either.) However, there are enough different OmniMark keywords that OmniMark doesn't take the strong position of disallowing them as function names; it just says that the two uses can't coexist in the same program. This more flexible position also means that OmniMark programs will continue to work when new keywords are added to the language.
Note that the rule that you can't use a word as both a function name and an OmniMark keyword in the same program applies to the whole program -- it is an error to use a keyword early on in a program, and to later on define it as a function name.
The result-type is generally one of the keywords SWITCH, COUNTER, or STREAM. It can also be used to indicate when an external function is an external output function or an external source function. (See Section 12.3.4, "External Output Functions" or Section 12.3.3, "Externally-Defined Sources" for more information.) When result-type is given, it always precedes the function name.
Every called function must eventually return to its caller (unless it does a HALT or contains a terminating error). The explicit way of returning from a function is to use a RETURN action, which can appear anywhere within a function definition. The form of the RETURN action depends on the result type defined for the function, which is specified at the start of the function, as follows:
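For a function with result type COUNTER:
Syntax
RETURN numeric-expression
For a function with result type STREAM:
Syntax
RETURN string-expression
For a function with result type SWITCH:
Syntax
RETURN test-expression
For a function with no result type:
Syntax
RETURN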
An example of a value-returning function is:
DEFINE COUNTER FUNCTION double VALUE COUNTER n AS
   RETURN n * 2
In turn, the result of calling the function double is itself used in a numeric expression in the following example (which sets the value of the counter m to be 15):
LOCAL COUNTER m
...
SET m TO double 7 + 1
The result type of a function determines where the function can be called in the OmniMark program: a COUNTER-returning function can be called in a numeric expression, a SWITCH-returning function in a test, and a STREAM-returning function in a string expression. Value-returning functions can be called anywhere that an operator returning the same type can be called.
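As a sketch of the other two result types, the functions below return a SWITCH and a STREAM value respectively. The names is-even and bracketed are hypothetical, chosen only for illustration. The example assumes that the MODULO operator and the "||" concatenation operator are available, as they are elsewhere in OmniMark.

```omnimark
; Hypothetical SWITCH-returning function: RETURN takes a test expression.
DEFINE SWITCH FUNCTION is-even (VALUE COUNTER n) AS
   RETURN (n MODULO 2) = 0

; Hypothetical STREAM-returning function: RETURN takes a string expression.
DEFINE STREAM FUNCTION bracketed (VALUE STREAM s) AS
   RETURN "[" || s || "]"

FIND-START
   ; A SWITCH-returning function is called where a test is expected ...
   OUTPUT "four is even%n" WHEN is-even (4)
   ; ... and a STREAM-returning function where a string expression is expected.
   OUTPUT bracketed ("note")
```

Both functions use the parenthesized argument form so that the extent of each argument in the call is unambiguous.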
If the function has no return type, then the function call can only be used in the place of an action:
...
GLOBAL SWITCH registry VARIABLE

DEFINE FUNCTION register VALUE STREAM string AS
   NEW registry ^ string
...
ELEMENT name
   LOCAL STREAM name-value
   SET name-value TO "%sc"
   register name-value
...
Note that a function does not require a RETURN action if it does not return a value. In that case, the function simply returns when there are no more actions to execute.
The argument-list in a function definition describes the class and type of each argument to a function, if any. It can be parenthesized or unparenthesized. It has one of the following forms:
Syntax
( (argument-template ( argument-separator argument-template)*)? )
or
Syntax
( argument-separator? argument-template (argument-separator argument-template)*)?
In a parenthesized argument list, the argument-separator can either be a comma (",") or an unquoted OmniMark name.
In an unparenthesized argument list, the argument-separator must be an unquoted OmniMark name.
The maximum number of function arguments for an individual function call is 16383. This should be sufficient for most purposes.
An argument-template describes how the caller passes the argument to the function. Each argument passed must match its corresponding argument template.
An argument-template has the basic syntax:
Syntax
argument-class shelf-type argument-name (OPTIONAL default-value?)?
The argument-class is one of:
Syntax
VALUE | READ-ONLY | MODIFIABLE | REMAINDER
The shelf-type is one of:
Syntax
COUNTER | SWITCH | STREAM
External functions cannot have OPTIONAL arguments.
A function argument class is the first of the three components of a function argument template. An argument class indicates how an argument value in a call is interpreted: whether an actual value or a reference to a shelf is passed, and what kind of expression is allowed in the function call. There are four different function argument classes: VALUE, READ-ONLY, MODIFIABLE and REMAINDER.
External functions cannot have REMAINDER arguments.
A VALUE argument is used to pass an expression as a function argument. Any expression can be passed as a VALUE argument of a compatible type:
Unlike READ-ONLY arguments, when a shelf item reference is passed as a VALUE argument, only the value of that item is passed. Inside the function, that VALUE argument cannot be used to access the key of that item (if any) or the values or keys of any of the other items on the shelf. (See Section 12.1.3.2.5, "The Difference Between VALUE and READ-ONLY Arguments".)
Within a function, a VALUE argument is treated as if it were a "read-only" shelf with a single item (initialized to the passed value) and no key. (This is often referred to as a scalar shelf.) A VALUE argument never has a key, even if a shelf item which does have a key is passed as a VALUE argument.
The following function illustrates the use of VALUE arguments:
DEFINE COUNTER FUNCTION factorial VALUE COUNTER n AS
   DO WHEN n = 0
      RETURN 1
   ELSE
      RETURN n * factorial (n - 1)
   DONE
VALUE arguments are used whenever it is desirable to pass the result of an expression as an argument.
A VALUE argument may be passed to another called function as a VALUE or READ-ONLY argument or as part of a REMAINDER argument.
A READ-ONLY argument is a reference to an existing shelf. A READ-ONLY argument is used to pass a shelf as a whole. It allows a caller to share a shelf with a function being called.
Unlike a VALUE argument, a READ-ONLY argument cannot be used to pass the value of an expression. (See Section 12.1.3.2.5, "The Difference Between VALUE and READ-ONLY Arguments".)
A READ-ONLY argument is used to make the shelf accessible to the function, so that the function can read values and keys from it, but it can't be modified in any way. READ-ONLY is essentially a promise by the function that it won't change the shelf in any way. That promise is enforced by OmniMark.
OmniMark does not permit any action on a READ-ONLY argument that would modify the shelf. It also does not permit a READ-ONLY argument to be passed to another function as a MODIFIABLE argument. These checks are done at compile-time and do not affect the run-time performance of OmniMark.
Certain predefined shelves are considered to be "read-only" and may only be passed as READ-ONLY arguments. They include #APPINFO, #DOCTYPE, #ITEM, #FIRST and #LAST. (The values of individual items may be passed as VALUE arguments as well, but that is not equivalent to passing the entire shelf.)
The following function definition is an example of using a READ-ONLY argument. It counts the number of active switch values on a switch shelf, but doesn't need to modify that shelf:
DEFINE COUNTER FUNCTION count-active-switches (READ-ONLY SWITCH switch-shelf) AS
   LOCAL COUNTER active-count
   SET active-count TO 0
   REPEAT OVER switch-shelf
      INCREMENT active-count WHEN switch-shelf
   AGAIN
   RETURN active-count
The value of an item on a READ-ONLY argument shelf may be passed to another function as a VALUE argument or as part of a REMAINDER argument. A READ-ONLY shelf may be passed to another function as a READ-ONLY shelf.
A MODIFIABLE argument is also a reference to an existing shelf. It is MODIFIABLE because the called function can modify the passed shelf in any of the ways that shelves can be modified (as long as the original shelf permits the operation). A MODIFIABLE shelf can have the key or value of any of its items changed, can be CLEARed or NEWed, and can be COPYed to.
A MODIFIABLE argument is used in preference to a READ-ONLY one when the argument is used to return results. It is useful when the single result value that can be returned from a function is not sufficient -- for example, where more than one value needs to be returned.
A simple example of using a modifiable shelf is the following, which splits a space-delimited sentence up into its individual words, and returns the words in a shelf:
DEFINE FUNCTION split-up-sentence (VALUE STREAM sentence,
                                   MODIFIABLE STREAM words) AS
   CLEAR words
   REPEAT SCAN sentence
   MATCH WHITE-SPACE* [ANY-TEXT EXCEPT BLANK]+ => word
      SET NEW words TO word
   AGAIN
An item on a MODIFIABLE argument shelf can be passed to another function individually as a VALUE argument, or as part of a REMAINDER argument. The entire shelf can be passed as either a READ-ONLY argument or a MODIFIABLE argument.
A REMAINDER argument is a special kind of argument. It is used to pass a number of discrete values as a single argument to a function. OmniMark does this by "creating" a READ-ONLY shelf argument initialized by the values being passed.
If a function has a REMAINDER argument, it must be the last one. A REMAINDER argument has an argument template that looks like:
Syntax
REMAINDER shelf-type argument-name (argument-separator ...)?
Within the function, the REMAINDER argument behaves exactly like a READ-ONLY shelf. The size of the shelf depends on the number of values passed.
If given, the argument-separator that precedes the ellipsis ("...") is used to herald the second and subsequent values being passed. Otherwise the argument-separator that precedes the REMAINDER argument is used to herald every one of the values being passed.
In the following example, the second herald for the REMAINDER argument is omitted. Therefore the first herald is used both as the introductory herald for the argument and as the item separator:
DEFINE COUNTER FUNCTION sum (VALUE COUNTER x0, REMAINDER COUNTER xx) AS
   LOCAL COUNTER s
   SET s TO x0
   REPEAT OVER xx
      INCREMENT s BY xx
   AGAIN
   RETURN s
A call to "sum" would look like:
SET k TO sum (1, 2, 3, 4, 5) ; "k" gets the value 15.
In the case that the REMAINDER is the first (and only) argument of a function, and it has no leading herald, the second herald must be defined, as in:
DEFINE COUNTER FUNCTION sum (REMAINDER COUNTER x, ...) AS
   LOCAL COUNTER s
   SET s TO 0
   REPEAT OVER x
      INCREMENT s BY x
   AGAIN
   RETURN s
Note that the two examples above are equivalent when one or more values are passed. The functions are called with exactly the same syntax.
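For contrast, the following sketch uses the unparenthesized form with two different heralds: one that introduces the first value and a second, given before the ellipsis, that introduces each later value. The function name total and the heralds "of" and "plus" are hypothetical choices made for illustration.

```omnimark
; Hypothetical function: "of" heralds the first value,
; "plus" heralds the second and subsequent values.
DEFINE COUNTER FUNCTION total of REMAINDER COUNTER xx plus ... AS
   LOCAL COUNTER s
   SET s TO 0
   REPEAT OVER xx
      INCREMENT s BY xx
   AGAIN
   RETURN s

FIND-START
   LOCAL COUNTER k
   ; The three values 1, 2 and 3 arrive as a three-item shelf "xx".
   SET k TO total of 1 plus 2 plus 3
```

Choosing distinct heralds lets the call read more like a phrase, at the cost of requiring every caller to use exactly those names.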
REMAINDER arguments cannot be OPTIONAL. If no values are passed in the REMAINDER argument, then the created argument shelf is still specified. It simply has no items. See Section 12.1.3.5, "Optional Function Arguments".
An item on a REMAINDER argument shelf may be passed to a function individually as a VALUE argument or as part of a REMAINDER shelf argument. The entire REMAINDER argument shelf can be passed as a READ-ONLY argument.
External functions cannot have REMAINDER arguments.
The fundamental difference between VALUE and READ-ONLY arguments is that a VALUE argument is used to pass the result of a single expression, and a READ-ONLY argument is used to pass an entire shelf.
Use a VALUE argument when:
Use a READ-ONLY argument when a shelf is being passed as an argument and:
In general, if none of the conditions that require a READ-ONLY argument applies, then it is probably better to make the argument a VALUE argument.
The shelf-type defines what kind of shelf or expression can be passed as that argument. Shelves can only be passed as READ-ONLY or MODIFIABLE arguments. Expressions can only be passed as VALUE arguments, or as items of a REMAINDER argument.
STREAM arguments that have a class of VALUE or REMAINDER must be passed a valid string expression. That means that if an item on a STREAM shelf is being passed as a VALUE argument or as an item of a REMAINDER argument, then the STREAM must be attached to a BUFFER or a REFERENT, and must be closed.
The argument-separator depends on the type of function call.
When a function has a parenthesized argument list, the argument-separator must either be a comma (",") or an unquoted OmniMark name. The comma-and-parentheses style is the familiar function call form used in many other programming languages.
When a function has an unparenthesized argument list, unquoted OmniMark names are used as the argument-separators. A programmer can use such separators to clarify the purpose of the arguments.
Below are two equivalent examples. The first example uses the keyword form of the argument-separator which makes the roles of the arguments a little clearer than in the second example:
Example A
DEFINE FUNCTION multiply-shelf MODIFIABLE COUNTER shelf
                               by VALUE COUNTER val AS
   REPEAT OVER shelf
      SET shelf TO shelf * val
   AGAIN
...
FIND DIGIT+ => val
   LOCAL COUNTER values
   ...
   multiply-shelf values by val
Example B
DEFINE FUNCTION multiply-shelf (MODIFIABLE COUNTER shelf,
                                VALUE COUNTER val) AS
   REPEAT OVER shelf
      SET shelf TO shelf * val
   AGAIN
...
FIND DIGIT+ => val
   LOCAL COUNTER values
   ...
   multiply-shelf (values, val)
Calls to a function must use exactly the same argument-separators as in the function definition. If parentheses are present in the function definition, they must always be present in the calls to that function. If they are absent in the function definition, then they must be absent in every call to that function as well.
Arguments must occur in the same order in the function call as in the function definition.
For functions which do not take any arguments, the programmer can choose either of the following forms:
Example A
DEFINE FUNCTION f () AS ...
Example B
DEFINE FUNCTION f AS ...
If the empty parentheses are specified in the definition of a function, they must also be specified in every call to that same function. Similarly, if the empty parentheses are not specified in the definition, then calls to that function are not permitted to specify them.
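The rule can be illustrated with a minimal sketch. The function names end-line and end-paragraph are hypothetical.

```omnimark
; Hypothetical function defined WITH empty parentheses.
DEFINE FUNCTION end-line () AS
   OUTPUT "%n"

; Hypothetical function defined WITHOUT parentheses.
DEFINE FUNCTION end-paragraph AS
   OUTPUT "%n%n"

FIND-END
   end-line ()      ; parentheses required: the definition has them
   end-paragraph    ; parentheses not permitted: the definition has none
```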
In the unparenthesized form of the function argument list, the programmer can specify a leading argument-separator to herald the first argument. This allows a more natural syntax for some functions:
DEFINE COUNTER FUNCTION shelf-total of READ-ONLY COUNTER shelf AS
   LOCAL COUNTER total INITIAL {0}
   REPEAT OVER shelf
      INCREMENT total BY shelf
   AGAIN
   RETURN total
...
GLOBAL COUNTER values VARIABLE
...
FIND-END
   LOCAL COUNTER sum
   SET sum TO shelf-total of values
Using the initial argument-separator "of" in the above example provides a more natural interface to the function than if the function were called "shelf-total-of".
A first argument-separator can also be used when there is no natural choice for an unheralded first argument, or where the meaning of any unheralded argument could be ambiguous. For instance, the calling syntax of the following function makes it quite clear what the arguments mean:
DEFINE COUNTER FUNCTION calculate-cylinder-volume
      radius VALUE COUNTER radius
      height VALUE COUNTER height AS
   RETURN 314 * radius * radius * height / 100
...
FIND "cyl r=" digit+ => r " h=" digit+ => h
   LOCAL COUNTER volume
   SET volume TO calculate-cylinder-volume radius r height h
It is always clear whether the arguments have been specified correctly simply by reading the call, without referring back to the function definition. OmniMark will issue an error if the heralds are in the wrong order, and it is easy for the reader to ensure that the values passed in match the heralds.
A function's argument can be declared as optional by placing the keyword OPTIONAL following the argument name as in:
DEFINE FUNCTION increment MODIFIABLE COUNTER x
                          by VALUE COUNTER y OPTIONAL INITIAL {1} AS
   SET x TO x + y
This is a replacement of the OmniMark INCREMENT action by a functionally equivalent function. (Unlike the INCREMENT action though, the second argument requires parenthesization if it is an expression. Argument recognition is discussed in Section 12.4.2, "Argument Recognition".)
An OPTIONAL argument is omitted from a call by omitting the preceding function argument separator, together with the value.
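As a sketch of the two calling forms, the following repeats the increment definition from above and calls it with and without the optional argument. The counter name page-count is hypothetical.

```omnimark
GLOBAL COUNTER page-count

; Same definition as in the text above, repeated so the sketch stands alone.
DEFINE FUNCTION increment MODIFIABLE COUNTER x
                          by VALUE COUNTER y OPTIONAL INITIAL {1} AS
   SET x TO x + y

FIND-START
   SET page-count TO 0
   ; The herald "by" and its value are both left out, so y takes its default.
   increment page-count
   ; The herald and the value are both present.
   increment page-count by 10
```

Note that the herald and the value are omitted together; supplying "by" without a value, or a value without "by", would not match the definition.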
In the parenthesized argument form, a required argument cannot follow an optional one, because OmniMark does not have any way of recognizing whether the argument is intended to be the optional one or the required one.
A VALUE argument that is declared OPTIONAL can also be provided with a default value or values in curly braces, as in the example above. MODIFIABLE and READ-ONLY arguments can also be declared OPTIONAL, but they cannot have default values. REMAINDER arguments cannot be declared OPTIONAL.
When an optional argument is omitted from a function call and it has no default value, it is illegal to access that argument from within the function in any way, except to test whether it was specified. Whether or not an OPTIONAL argument was specified in a call can be determined by using the "IS SPECIFIED" argument test.
External functions cannot have OPTIONAL arguments.
Syntax
shelf-type? argument-name (IS|ISNT) SPECIFIED
The "IS SPECIFIED" argument test returns TRUE if the OPTIONAL argument was specified in the call, and FALSE if it was not. An example is:
DEFINE COUNTER FUNCTION count-elements
      saving MODIFIABLE COUNTER current-value OPTIONAL AS
   SET current-value TO NUMBER OF CURRENT ELEMENTS
      WHEN current-value IS SPECIFIED
   RETURN NUMBER OF CURRENT ELEMENTS
(This function simply returns the number of currently opened elements, and, if the optional argument is specified, puts the count there as well.)
The "IS SPECIFIED" test can only be applied to OPTIONAL arguments. Whether or not the "IS SPECIFIED" test succeeds depends on whether a value was specified for the argument in the function call. Whether or not an "OPTIONAL VALUE" argument has a default value does not affect the "IS SPECIFIED" test.
In particular, for an argument whose value is not specified in the function call:
The "IS SPECIFIED" test is not permitted on a REMAINDER argument. If the caller does not pass any values to a REMAINDER argument, then the argument shelf will have zero items.
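The following sketch combines an optional argument that has a default value with one that does not. The function name print-rule and its heralds are hypothetical. It relies only on the rules stated above: an omitted argument with a default may be read, an omitted argument without a default may only be tested, and the "IS SPECIFIED" test reflects the call rather than the presence of a default.

```omnimark
; Hypothetical function: "width" has a default value, "label" does not.
DEFINE FUNCTION print-rule
      width VALUE COUNTER width OPTIONAL INITIAL {3}
      label VALUE STREAM label OPTIONAL AS
   LOCAL COUNTER remaining
   ; Reading "width" is safe even when it was omitted: the default applies.
   SET remaining TO width
   REPEAT
      EXIT WHEN remaining <= 0
      OUTPUT "-"
      DECREMENT remaining
   AGAIN
   ; "label" has no default, so it is only read when it was specified.
   OUTPUT " " || label WHEN label IS SPECIFIED
   ; The test fails for an omitted argument even though a default exists.
   OUTPUT " (default width)" WHEN width ISNT SPECIFIED
   OUTPUT "%n"

FIND-START
   print-rule                        ; both arguments omitted
   print-rule width 5 label "end"    ; both arguments specified
```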
A function body is the same sort of thing as a rule body. It has zero or more local declarations followed by zero or more actions. In the case of a function one of those actions can be a RETURN action, although in the case of functions that don't return results, an explicit RETURN is not necessary.
Another simple example of a function definition is the following:
DEFINE COUNTER FUNCTION count-words-in VALUE STREAM s AS
   LOCAL COUNTER word-count
   SET word-count TO 0
   REPEAT SCAN s
   MATCH WHITE-SPACE* [ANY-TEXT EXCEPT BLANK]+
      INCREMENT word-count
   AGAIN
   RETURN word-count
In this case the function has a result type (COUNTER), and as a consequence must explicitly return a result of this type. In this example, the last line returns the value of the COUNTER variable containing the number of words.
One principle that might help programmers understand what's allowed in functions is that function definitions are essentially rules. Functions differ from rules only in that the programmer specifies when functions are performed and what arguments are passed to them by specifying a function call, whereas, in general, the input data determines where rules are performed and what information (pattern variables and attribute values) is passed to them.
Apart from that, however, rules and function definitions are the same sort of thing. Furthermore, whatever can be done in a rule that calls a function can also be done in a function. Three major examples of this are the following:
A simple example is the following function that provides the appropriate wrappers for RTF emphasis commands:
DEFINE FUNCTION process-emphasis VALUE STREAM command-name AS
   OUTPUT "{\%g(command-name) %sc}"
This function could be called as follows:
ELEMENT EMPH WHEN ATTRIBUTE type = "BOLD"
   process-emphasis "b"
Content processing functions such as this become very useful when they do some non-trivial job, such as table formatting.
OmniMark allows programs to be written without declaring any of the COUNTER, SWITCH or STREAM variables when the "DECLARE HERALDED-NAMES" declaration is specified or when the -herald command-line option is specified. This makes it easy to write small prototype programs.
The basic rule that governs the use of declarations in these cases is that either everything or nothing (variable-wise) must be declared. As part of this basic rule, if an OmniMark program contains any function definitions, all variables must be declared. Function definitions, because they declare function names, are considered to be declarations for the purpose of this rule.
Passing a shelf reference as a function argument sets up an interaction between the function and the passed shelf in its caller. This section describes some of the major aspects of that interaction.
When a shelf is passed as a function argument, the selected item becomes the current item of the function argument. If the shelf being passed in has an indexer, then the item selected by the indexer becomes the current item of the shelf argument. If the shelf being passed in has no indexer, then the currently selected item of that shelf is used as the current item of the function argument. As an example of using this, the following function takes a reference to a STREAM variable and removes all the white space characters from it:
DEFINE FUNCTION trim-string (MODIFIABLE STREAM s) AS
   LOCAL STREAM new-s
   OPEN new-s AS BUFFER
   REPEAT SCAN s
   MATCH [ANY-TEXT EXCEPT BLANK]+ => non-white-space
      PUT new-s non-white-space
   MATCH WHITE-SPACE+
      ; skip it
   AGAIN
   CLOSE new-s
   SET s TO new-s
This function can then be called with an item of a shelf, and that item will be "trimmed". For example:
trim-string (STREAM file-names @ 3)
In this case, item 3 of the file-names shelf is the one that is "trimmed".
The currently selected item can also be inherited, with the same effect, as in:
...
GLOBAL STREAM file-names VARIABLE
...
USING file-names @ 3
   trim-string (file-names)
Care must be taken in writing functions that have READ-ONLY or MODIFIABLE arguments. It must not be assumed that the "lastmost" item of a passed shelf is its current item. The following function illustrates this point by setting up a USING to ensure that references to the STREAM code-set refer to the newly created item:
DEFINE FUNCTION add-new-codes (MODIFIABLE STREAM code-set,
                               VALUE STREAM new-codes) AS
   USING code-set LASTMOST
   DO
      NEW code-set
      OPEN code-set AS BUFFER
      REPEAT SCAN new-codes
      MATCH LETTER => one-code
         PUT code-set "%ux(one-code)"
      MATCH ANY
         ; skip
      AGAIN
      CLOSE code-set
   DONE
A shelf passed as a READ-ONLY argument cannot be modified when accessed by its argument name.
However, if the argument that was passed refers to a global shelf, the function can access the shelf by its global name, and modify it that way. Since the READ-ONLY argument still refers to the same shelf, it appears as if the READ-ONLY argument has been modified.
The same kind of trick can be done by passing the same shelf as both a READ-ONLY argument and a MODIFIABLE argument. Since they refer to the same shelf, modifying the MODIFIABLE argument shelf effectively modifies the READ-ONLY shelf.
The following example illustrates the modification of a READ-ONLY argument. In the example, the function total-shelf sums the item values of the shelf passed to it and resets the GLOBAL shelf totals to be a single-item shelf whose item value is the sum. The trouble occurs when the GLOBAL totals is itself passed to total-shelf: the sum is zero, independently of what was on the totals shelf when it was passed.
GLOBAL COUNTER totals
...
DEFINE FUNCTION total-shelf (READ-ONLY COUNTER shelf-to-sum) AS
   CLEAR totals
   SET NEW totals TO 0
   REPEAT OVER shelf-to-sum
      INCREMENT totals BY shelf-to-sum
   AGAIN

FIND-START
   SET totals TO 7
   SET NEW totals TO 23
   total-shelf (totals)
Needless to say, passing a shelf to a function where there is a possibility that the passed GLOBAL shelf may be modified by the function is deprecated in general, whether or not a READ-ONLY or MODIFIABLE argument is involved. Doing so tends, as in the example, to produce unexpected behaviour.
A similar problem does not exist with VALUE and REMAINDER arguments, because the value is captured as part of the call, and no later modification of any GLOBAL value within the function is going to modify the passed value.
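A minimal sketch of this behaviour follows. The function name reset-and-report and the global name total are hypothetical, and the "%d" format item is assumed to format a COUNTER as a decimal number.

```omnimark
GLOBAL COUNTER total

; Hypothetical function: modifies the global after receiving its value.
DEFINE FUNCTION reset-and-report (VALUE COUNTER before) AS
   SET total TO 0
   ; "before" holds the value captured at the call,
   ; not the value "total" has now.
   OUTPUT "before = %d(before), now = %d(total)%n"

FIND-START
   SET total TO 30
   reset-and-report (total)
```

Under the rule described above, the call would be expected to report 30 for before and 0 for total, because the VALUE argument no longer has any connection to the global shelf.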
As with local variables, SAVE and SAVE-CLEAR cannot be applied to function arguments. The purpose of a SAVE is to create a local shelf, but associate it with a global name, so that all references to that global name actually use the new instantiation. Therefore the shelf name used in a SAVE or a SAVE-CLEAR declaration must always be a global name.
SAVE or SAVE-CLEAR can be applied to a GLOBAL shelf name, even if that global shelf was passed as a MODIFIABLE or a READ-ONLY argument to a function. However, the function argument still applies to the shelf as it was before it was saved. This means:
In other words, SAVE and SAVE-CLEAR change the association of a GLOBAL shelf name, but not that of a function argument name.
An OmniMark programmer can declare and call functions written in a programming language other than OmniMark. The external function declaration defines the calling sequence and initially specifies where these external functions are to be found on the system on which OmniMark is running. The programmer can redefine where to find the external functions at any time.
The interface that external functions use to access the OmniMark environment is designed to make it easy to write external functions that behave just like internal functions. In particular, external function programmers are urged to make use of the current output set and the currently selected items of function arguments wherever applicable.
The interface provided to the external function programmer is described in detail in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].
An external function definition starts out looking like an OmniMark function definition: it has an optional result type, a function name and an argument list. The keyword EXTERNAL follows DEFINE to make it clear that an external function is being declared, but it's what follows the AS that really differentiates an external function from internal functions.
For an external function OmniMark requires an external function name and an optional external function library name, specifying where to find the function. As an example:
DEFINE EXTERNAL FUNCTION get-db-record (VALUE STREAM key,
                                        VALUE STREAM value)
   AS "gdbrec" IN FUNCTION-LIBRARY "mylib.dll"
The only other difference between an external function definition and an internal function definition is that external functions cannot have OPTIONAL or REMAINDER arguments.
How the external function name and the external function library name are interpreted depends on the system on which the OmniMark program is being run. They have to be constant string expressions in the function definition, but they can be changed by the actions "SET EXTERNAL-FUNCTION" and "SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION", as described in Section 12.3.2.
The "IN FUNCTION-LIBRARY" part can only be omitted when the "DECLARE FUNCTION-LIBRARY" declaration is used to specify a default library.
Additional information about external functions can be found in the following documents:
A "DECLARE FUNCTION-LIBRARY" declaration can be used to declare a default function library. This is especially useful where multiple declared external functions come from the same library.
Syntax
DECLARE FUNCTION-LIBRARY string-expression
This declaration sets the default function library for any following external function definitions that do not have an "IN FUNCTION-LIBRARY" part. More than one "DECLARE FUNCTION-LIBRARY" declaration can appear in a program, in which case each "DECLARE FUNCTION-LIBRARY" declaration applies to the function definitions which follow until another "DECLARE FUNCTION-LIBRARY" declaration is encountered.
The "IN FUNCTION-LIBRARY" part of an external function definition can only be omitted if the definition is preceded by a "DECLARE FUNCTION-LIBRARY" declaration -- there is no "built-in" or "system-supplied" default external function library.
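The scoping of the default library can be sketched as follows. The library names, the function put-db-record and the external name "pdbrec" are hypothetical and serve only to show which declaration applies to which definition.

```omnimark
; First default library (hypothetical name).
DECLARE FUNCTION-LIBRARY "mylib.dll"

; No IN FUNCTION-LIBRARY part, so the default "mylib.dll" applies.
DEFINE EXTERNAL FUNCTION get-db-record (VALUE STREAM key,
                                        VALUE STREAM value)
   AS "gdbrec"

; A later declaration replaces the default for the definitions that follow it.
DECLARE FUNCTION-LIBRARY "otherlib.dll"

; Hypothetical function: picks up "otherlib.dll" as its library.
DEFINE EXTERNAL FUNCTION put-db-record (VALUE STREAM key,
                                        VALUE STREAM value)
   AS "pdbrec"
```

A definition that carries its own "IN FUNCTION-LIBRARY" part is unaffected by either declaration.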
An external function's external name and external library do not actually have to be specified in an external function definition. They can be set later as the OmniMark program runs. For example, an external function could be declared with an (almost certainly) invalid external function name and library name:
DEFINE EXTERNAL FUNCTION get-db-record (VALUE STREAM key,
                                        VALUE STREAM value)
   AS "*" IN FUNCTION-LIBRARY "*"
During the running of the OmniMark program, the names can be determined and assigned to the external function with the actions:
Syntax
SET EXTERNAL-FUNCTION function-name TO string-expression
Syntax
SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION function-name TO string-expression
For example:
SET EXTERNAL-FUNCTION get-db-record TO "gdbrec"
SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION get-db-record TO "mylib.so"
The values used to set the external function name and external function library name can be constant string expressions, determined by program logic or derived from some input data -- it's entirely up to the program.
An external function has to have a valid external function name and valid function library name (unless it defaults) at the point of every call made to that function. These names can be changed at any time, and they take effect at the next call. This allows calls to database access functions, for example, to be rebound to different database systems during a run.
If a programmer wants to know what external function name or external function library name is currently associated with an external function, they can ask using the EXTERNAL-FUNCTION or "FUNCTION-LIBRARY OF EXTERNAL-FUNCTION" enquiries.
EXTERNAL-FUNCTION OF function-name
FUNCTION-LIBRARY OF EXTERNAL-FUNCTION OF function-name
Each of these enquires takes the OmniMark name of an external function and returns a string result: the external function name or the library name. For example:
OUTPUT "For external function %"get-db-record%",%n" _ " external name = %"" || EXTERNAL-FUNCTION OF get-db-record || "%",%n" _ " function library name = %"" || FUNCTION-LIBRARY OF EXTERNAL-FUNCTION OF get-db-record || "%".%n"
OmniMark allows functions to be declared as externally-defined sources. An externally-defined source is an external function which returns its data incrementally (much like the built-in FILE operator).
These types of functions can be very useful for external functions which establish a "session-based" communication link with another application, or for functions which return a potentially large amount of data.
An externally-defined source function is declared in a manner similar to an external function, except that its return value must be SOURCE.
Syntax
DEFINE EXTERNAL SOURCE FUNCTION function-name argument-list AS string-expression (IN FUNCTION-LIBRARY string-expression)?
An externally-defined source function can be called anywhere that a stream-returning function can be called. The functionality that distinguishes the two kinds of functions is isolated within the implementation of the function, and within OmniMark itself.
More information on how an externally-defined source function interacts with OmniMark is provided in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].
An external output function is a function to which a STREAM item can be attached. Such a stream is called an external output stream. Any data written to the stream item is processed by the function.
An external output function is declared with the syntax:
Syntax
DEFINE EXTERNAL OUTPUT FUNCTION function-name argument-list AS string-expression (IN FUNCTION-LIBRARY string-expression)?
Streams attached to external output functions support all of the usual stream operations: OPEN, REOPEN, PUT, CLOSE, "HAS NAME", and "NAME OF". Since the external output function is not required to provide a name, the "HAS NAME" test can be used to ensure the name is present before trying to retrieve it.
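The stream operations listed above might be combined as in the following sketch. It assumes an external output function named http-connection implemented as "http_con" in a library "http.dll"; the stream name reply, the argument "client-42" and the text written are hypothetical. The declaration follows the general DEFINE EXTERNAL result-type FUNCTION pattern, which is itself an assumption here.

```omnimark
; Hypothetical external output function; the external name and
; library name are assumptions made for this sketch.
DEFINE EXTERNAL OUTPUT FUNCTION http-connection (VALUE STREAM requesting-id)
   AS "http_con" IN FUNCTION-LIBRARY "http.dll"

PROCESS
   LOCAL STREAM reply

   ; Attach the stream to the external output function and open it.
   OPEN reply AS http-connection ("client-42")

   ; The function is not required to supply a name, so test first.
   DO WHEN reply HAS NAME
      OUTPUT "writing to " || NAME OF reply || "%n"
   DONE

   ; Everything written to the stream is handed to the function.
   PUT reply "status: ok%n"
   CLOSE reply
```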
External output streams can support all of the usual stream open modifiers. Information about the modifiers is passed to the external output function for handling. (Normally, the external output function delegates the handling of the open modifiers to OmniMark. However, it has the information if other handling is also required.)
More information on how an external output function interacts with OmniMark is provided in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].
SET stream-output-function-call open-modifiers TO string-expression
The SET action can be used directly on external output functions in the same way that it can be used on files and referents. For example:
DEFINE EXTERNAL OUTPUT FUNCTION http-connection (VALUE STREAM requesting-id) AS "http_con" IN FUNCTION-LIBRARY "http.dll"
...
FIND ...
   LOCAL STREAM requesting-id
   LOCAL STREAM requested-file-name
   ...
   SET http-connection (requesting-id) TO FILE (requested-file-name)
As with "SET FILE", the specified stream attachment (in this case an externally-defined output stream) is opened, the result of evaluating the right-hand side is written to it directly, and then it is closed.
Streams opened with REFERENTS-ALLOWED differ from all others in that the text written to the stream is "buffered" until it is possible to resolve the values of all referents written to the stream. OmniMark creates and maintains the buffer into which text written to the stream is initially placed.
Once the values of all referents in the referent scope have been resolved (i.e. at the end of the scope), the text of the stream, together with the resolved values of the referents written to it, is written to the stream's "final destination", if any. In the case of a stream attached to an external output function, this final destination is that function.
This section describes function calls and function argument recognition in detail.
A function that does not return a value can be called from anywhere an action is allowed. A value-returning function can only be called from places where an expression of the corresponding type can appear.
The following illustrates some of the possibilities, given that f-sw is a SWITCH-returning function, f-c1, f-c2 and f-c3 are COUNTER-returning functions and f-s1 and f-s2 are STREAM-returning functions:
DO WHEN f-sw
SET n TO f-c1 + f-c2
SET s TO f-s1 || f-s2 ||* f-c3
OmniMark programmers should understand how function arguments are recognized to avoid syntax errors or ambiguous constructs.
Function argument recognition behaves differently for functions with the parenthesized comma-separated argument list, and those whose arguments are heralded by names.
An argument in the parenthesized, comma-separated argument list begins after the opening parenthesis that begins the argument list, or the comma that ends the previous argument.
The argument continues until the comma that ends it is found, or until the closing parenthesis that ends the argument list is found.
In the following function call, the middle argument is a + b.
DEFINE FUNCTION f (VALUE COUNTER x, VALUE COUNTER y, VALUE COUNTER z) AS ...
GLOBAL COUNTER a
GLOBAL COUNTER b
PROCESS
   f (1, a + b, 2)
An argument heralded by a name actually consists only of the immediately-following subexpression. OmniMark stops recognizing the argument as soon as it has a valid expression, regardless of whether the expression has the correct type, or whether the code following the expression could also be interpreted as part of the expression.
In the next example, the single argument to the function call is just a. The "+ b" is not considered part of the argument; it would instead be added to the function's result when the function returns, if that were legal. Because f does not return a value, the addition is not legal and the example is in error.
DEFINE FUNCTION f VALUE COUNTER a ...
GLOBAL COUNTER a
GLOBAL COUNTER b
PROCESS
   f a + b
Parentheses must be used to ensure the correct interpretation:
DEFINE FUNCTION f VALUE COUNTER a ...
GLOBAL COUNTER a
GLOBAL COUNTER b
PROCESS
   f (a + b)
A function which takes a single argument can actually be considered a monadic operator like "NAME OF" or like a negative sign ("-") applied to a numeric value. In OmniMark, monadic operators always bind more tightly than other operators. Functions which take a single argument behave exactly the same way.
When a function takes more than one argument, the recognition of each argument is the same as it is for the last argument: the argument is only recognized up until a valid expression has been formed.
The following example is another syntactically illegal function call, because the first argument is again interpreted only as a. The "+ b" is not part of the argument. The error occurs because there is text between the end of the first argument and the herald for the second argument (to).
DEFINE FUNCTION capture VALUE COUNTER low TO VALUE COUNTER high AS ...
GLOBAL COUNTER a
GLOBAL COUNTER b
PROCESS
   capture a + b to 100
To ensure that the above example is interpreted correctly, parentheses must be used:
DEFINE FUNCTION capture VALUE COUNTER low TO VALUE COUNTER high AS ...
GLOBAL COUNTER a
GLOBAL COUNTER b
PROCESS
   capture (a + b) to 100
By using the same method to recognize every argument in a function, the programmer can be sure that adding optional arguments to the end of a previously-correct function definition will have the least impact on existing programs.
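The effect on existing calls can be sketched as follows. This is an illustration only: the herald step, the argument name stride and the function body are hypothetical additions to the capture example, and the body simply reports that the call was made.

```omnimark
GLOBAL COUNTER a
GLOBAL COUNTER b

; The earlier definition had two arguments:
;    DEFINE FUNCTION capture VALUE COUNTER low TO VALUE COUNTER high AS ...
; Here a trailing OPTIONAL argument with its own herald has been added.
DEFINE FUNCTION capture VALUE COUNTER low
                        TO VALUE COUNTER high
                        step VALUE COUNTER stride OPTIONAL AS
   OUTPUT "capture called%n"

PROCESS
   ; A call written before "step" was added is recognized exactly as before.
   capture (a + b) to 100

   ; A new call can supply the optional argument.
   capture (a + b) to 100 step 5
```

Because each argument ends as soon as a valid expression has been formed, the first call is parsed the same way whether or not the optional argument exists in the definition.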
This section discusses three types of ambiguity or apparent ambiguity that can occur in the recognition of function call arguments and their separators.
There are no "reserved" keywords when it comes to the choice of names used as function argument separators. All apparent ambiguities in function definition headers are resolved by the OmniMark compiler by looking one or two symbols beyond an apparently ambiguously used name.
The argument heralds used in a function's definition cannot be "ambiguous" amongst themselves. For example, the following produces an error:
DEFINE FUNCTION ambig VALUE COUNTER x with VALUE COUNTER y OPTIONAL with VALUE COUNTER z AS . . .
The error is that once an argument value for "x" has been found, the herald "with" can either designate the optional argument "y" or the following argument "z".
If two or more arguments can be recognized at the same time then each must have a distinct function argument separator. It is an error for two such arguments to have the same function argument separator.
The rule for determining which arguments can be recognized is:
Function calls that occur in arguments of other function calls can have trailing OPTIONAL or REMAINDER arguments that have the same function argument separator as following arguments in the enclosing function call.
Where an argument could be recognized as belonging to both an embedded and enclosing function call, it is always recognized as belonging to the inner function call. For example, with the following definitions:
DEFINE FUNCTION ambig1 VALUE COUNTER x with VALUE COUNTER y OPTIONAL AS . . .
DEFINE FUNCTION ambig2 VALUE COUNTER a with VALUE COUNTER b OPTIONAL AS . . .
the following two examples are equivalent:
Example A
ambig1 ambig2 3 with 7
Example B
ambig1 (ambig2 3 with 7)
In this case the argument "with 7" is recognized as the second, OPTIONAL, argument of ambig2 (b). It is not the second argument of ambig1 (y).
Within a function call argument, recognition of the function argument separator that heralds an argument that can follow the current one predominates over the recognition of a name as a keyword or function name. For example, given the following function definition:
DEFINE FUNCTION copy-item MODIFIABLE STREAM s item VALUE COUNTER i AS
   SET s TO s item i
the following call
copy-item column-names item 3 item 7
is processed using the first "item" as the function argument separator and "3" as the second argument value. When the second "item" is encountered by OmniMark, it is considered to be in error. The example is considered to be the same as:
(copy-item column-names item 3) item 7
which is clearly illegal.
A function argument separator for the next argument is recognized in this way:
The following call to the copy-item defined in the earlier example is valid, and identifies the first "item" as the indexer of the first argument, and the second "item" as the function argument separator in the copy-item function call:
copy-item (column-names item 3) item 7
Recognition of separators occurs following macro call recognition and expansion, so parentheses, brackets or braces used in macro calls do not, in and of themselves, suppress recognition of separators.
Parentheses, brackets and braces occurring in the expansion of a macro do suppress separator recognition. It is always wise to wrap macro expansion text in parentheses if there is any possibility of a keyword in the expansion being misrecognized.
OmniMark functions can be called "recursively". That is, a function can call itself. This section describes:
Recursion is a "structured" looping technique in which a function calls itself to do a subpart of its work. A classic numerical example is calculating the "factorial" of a number -- the product of all the positive integers up to and including that number:
DEFINE COUNTER FUNCTION factorial (VALUE COUNTER n) AS
   DO WHEN n <= 0
      RETURN 1
   ELSE
      RETURN n * factorial (n - 1)
   DONE
Recursive techniques are especially useful in "divide-and-conquer" applications such as sorting, where the data being sorted is typically "partitioned" and the sorting algorithm applied recursively to each of the partitions.
No special provision is required when declaring a function that calls itself. But if two functions are written such that each calls the other, it is impossible to define each function prior to its first use: one must precede the other. To support such mutually recursive functions, OmniMark provides function "predefinitions". A function predefinition looks just like a function definition except that it has no body. Instead of AS and a following set of local declarations and actions, there is just the keyword ELSEWHERE. For example:
DEFINE FUNCTION analyse-command-set (VALUE STREAM command-set) ELSEWHERE
Functions must be defined or predefined before they are used. For mutually recursive functions, one or more of the functions must be predefined prior to the first actual function definition.
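A small sketch of mutual recursion using a predefinition follows. The function names check-even and check-odd are hypothetical; each returns 1 or 0 as a COUNTER rather than a SWITCH, purely to stay within constructs already shown in this chapter.

```omnimark
; Predefinition: announces check-odd so that check-even can call it.
DEFINE COUNTER FUNCTION check-odd (VALUE COUNTER n) ELSEWHERE

; Returns 1 when n is even, 0 otherwise.
DEFINE COUNTER FUNCTION check-even (VALUE COUNTER n) AS
   DO WHEN n <= 0
      RETURN 1
   ELSE
      RETURN check-odd (n - 1)
   DONE

; The actual definition repeats the header given in the predefinition.
DEFINE COUNTER FUNCTION check-odd (VALUE COUNTER n) AS
   DO WHEN n <= 0
      RETURN 0
   ELSE
      RETURN check-even (n - 1)
   DONE

PROCESS
   DO WHEN check-even (10) = 1
      OUTPUT "10 is even%n"
   DONE
```

Only check-odd needs a predefinition here, because check-even is fully defined before check-odd's body refers to it.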
There are some rules for using function predefinitions:
OmniMark supports a special type of recursion called tail recursion. If calling itself or another function is the very last thing a function does, then instead of "calling" that function, OmniMark terminates the current function and "jumps" to the other function. The main effect of this is that, with a bit of care, function calls can be made that seem to be millions of levels deep without actually using up any more computer memory than a single call.
The OmniMark programmer doesn't have to do anything to make a function tail recursive. The OmniMark compiler examines each function call in a function and determines if it can be made into a tail recursive call.
The rule that a function call must be the very last thing in a function is interpreted in a very strict way. In particular, a function will not be tail recursive when:
In all these cases something has to be done (or undone) by the calling function after returning from the called function. This cleanup prevents a "jump" to the called function from being done.
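The difference can be illustrated by comparing two ways of writing factorial. This is a sketch only: the name factorial-from and its product-so-far argument are hypothetical, and whether the compiler actually treats the second form as a tail call still depends on the conditions described above.

```omnimark
; Not a tail call: the multiplication by n has to be done after
; the inner call returns, so the caller cannot simply be discarded.
DEFINE COUNTER FUNCTION factorial (VALUE COUNTER n) AS
   DO WHEN n <= 0
      RETURN 1
   ELSE
      RETURN n * factorial (n - 1)
   DONE

; Candidate for a tail call: the running product is passed along as
; an argument, and the result of the inner call is returned unchanged.
DEFINE COUNTER FUNCTION factorial-from (VALUE COUNTER n,
                                        VALUE COUNTER product-so-far) AS
   DO WHEN n <= 0
      RETURN product-so-far
   ELSE
      RETURN factorial-from (n - 1, n * product-so-far)
   DONE
```

A caller would start the second form with a running product of 1, as in factorial-from (5, 1).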
OmniMark places very few restrictions on the side effects a function can produce. Those few restrictions are nonetheless important. They are:
Because of this non-determinism, functions called from the rule headers of unselected rules may or may not be called. OmniMark programmers should not count on any side effects either being performed, or not being performed in such function calls.
In general, functions which have side effects should not be called from rule headers.
(The non-determinism of rule selection is described in Section 12.6.1, "Function Side Effects In Rule Headers".)
Well-written programs have very few side effects, so most programs should not encounter the problems introduced by functions changing values "behind the scenes".
When selecting a rule to be performed, OmniMark examines the rules in the program. OmniMark:
If a function call that has a side effect occurs within any such condition or pattern, then whether or not that side effect occurs depends entirely on whether the function call is actually performed during rule selection.
The header conditions and header patterns that are actually evaluated during the selection of a rule are evaluated in the order in which the rules that contain them appear in the OmniMark program.
For the situations where it is not defined whether OmniMark evaluates a part of a rule header, different releases of OmniMark may behave differently. Furthermore, the same release of OmniMark may handle these cases differently when the OmniMark program is modified, even though the change may not seem to affect that rule.
As a consequence, OmniMark programmers are strongly discouraged from depending on any undocumented behaviour that they discover from experimenting with OmniMark programs.
Copyright © OmniMark Technologies Corporation, 1988-1997. All rights reserved.
EUM27, release 2, 1997/04/11.