OmniMark® Programmer's Guide Version 3

12. Functions

Previous chapter is Chapter 11, "Cross-Referencing and Hypertext Linking".

Next chapter is Chapter 13, "The Current Output Stream Set".

An OmniMark program can define two different kinds of functions: internal functions and external functions.

Internal functions serve the following purposes:

External functions are used to gain access to external applications or operating system services which are not available directly in OmniMark.

There are two aspects to a function: its definition and the calls made to it.

A function may also have a function predefinition, which simply defines the function interface. Function predefinitions are discussed in Section 12.5.2, "Function Predefinitions".

12.1 The Function Definition

A function, like a variable, must be declared before it is used. This allows OmniMark to know how to interpret the arguments passed to the function, and to know what kind of result it is getting back from the function.

Functions are defined using a function definition, which has the syntax:

Syntax

   DEFINE EXTERNAL? result-type? FUNCTION function-name
       argument-list AS
       function-body

An example is:

   DEFINE FUNCTION print-n-spaces VALUE COUNTER n AS
      LOCAL COUNTER spaces-yet-to-print
      SET spaces-yet-to-print TO n
      REPEAT
         EXIT WHEN spaces-yet-to-print <= 0
         OUTPUT "%_"
         DECREMENT spaces-yet-to-print
      AGAIN

The keyword EXTERNAL is used to indicate an external function. External functions are discussed in Section 12.3, "External Functions".

Recursively defined functions may need to be used before their definitions, so there is a special way of "predefining" a function described in Section 12.5.2, "Function Predefinitions".

12.1.1 Function Names

A function's name must be a "plain" name -- unlike a variable name, it cannot be quoted.

When a function is defined in an OmniMark program, the name of the function becomes a keyword in that program. If the function's name just happens to be the same as an OmniMark keyword, the OmniMark keyword cannot be used in the program -- that keyword always refers to the function.

It is generally best simply not to use OmniMark keywords as function names. (It's actually best not to use OmniMark keywords as variable names either.) However, there are so many OmniMark keywords that OmniMark doesn't take the strong position of disallowing them as function names; it just says that the keyword and the function can't coexist in the same program. This more flexible position also means that OmniMark programs will continue to work when new keywords are added to the language.

Note that the rule that you can't use a word as both a function name and an OmniMark keyword in the same program applies to the whole program -- it is an error to use a keyword early on in a program, and to later on define it as a function name.

12.1.2 The Function Result Type

The result-type is generally one of the keywords SWITCH, COUNTER, or STREAM. It can also be used to indicate when an external function is an external output function or an external source function. (See Section 12.3.4, "External Output Functions" or Section 12.3.3, "Externally-Defined Sources" for more information.) When result-type is given, it always precedes the function name.

Every called function must eventually return to its caller (unless it does a HALT or contains a terminating error). The explicit way of returning from a function is to use a RETURN action, which can appear anywhere within a function definition. The form of the RETURN action depends on the result type defined for the function, which is specified at the start of the function, as follows:

12.1.2.1 Value-Returning Functions

An example of a value-returning function is:

   DEFINE COUNTER FUNCTION double VALUE COUNTER n AS
      RETURN n * 2

In turn, the result of calling the function double is itself used in a numeric expression in the following example (which sets the value of the counter m to be 15):

   LOCAL COUNTER m
   ...
   SET m TO double 7 + 1

The result type of a function determines where the function can be called in the OmniMark program: a COUNTER-returning function can be called in a numeric expression, a SWITCH-returning function in a test, and a STREAM-returning function in a string expression. Value-returning functions can be called anywhere that an operator returning the same type can be called.
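As a sketch of the other two result types, the hypothetical functions is-positive and bracket below return a SWITCH and a STREAM respectively. The sketch assumes that RETURN takes a test for a SWITCH result and a string expression for a STREAM result, by analogy with the COUNTER example above:

   DEFINE SWITCH FUNCTION is-positive (VALUE COUNTER n) AS
      RETURN n > 0

   DEFINE STREAM FUNCTION bracket (VALUE STREAM s) AS
      RETURN "[%g(s)]"

   ...
   LOCAL COUNTER n
   ...
   ; is-positive is called in a test, bracket in a string expression
   OUTPUT bracket ("positive") WHEN is-positive (n)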

12.1.2.2 Functions With No Result Type

If the function has no return type, then the function call can only be used in the place of an action:

   ...
   GLOBAL SWITCH registry VARIABLE

   DEFINE FUNCTION register
      VALUE STREAM string
   AS
      NEW registry ^ string

   ...
   ELEMENT name
      LOCAL STREAM name-value
      SET name-value TO "%sc"
      register name-value
      ...

Note that a function does not require a RETURN action if it does not return a value. In that case, the function simply returns when there are no more actions to execute.

12.1.3 The Function Argument List

The argument-list in a function definition describes the class and type of each argument to a function, if any. It can be parenthesized or unparenthesized. It has one of the following forms:

Syntax

   ( (argument-template
      ( argument-separator argument-template)*)? )

or

Syntax

   ( argument-separator? argument-template
      (argument-separator argument-template)*)?

In a parenthesized argument list, the argument-separator can either be a comma (",") or an unquoted OmniMark name.

In an unparenthesized argument list, the argument-separator must be an unquoted OmniMark name.

The maximum number of function arguments for an individual function call is 16383. This should be sufficient for most purposes.

12.1.3.1 Function Argument Templates

An argument-template describes how the caller passes the argument to the function. Each argument passed must match its corresponding argument template.

An argument-template has the basic syntax:

Syntax

   argument-class shelf-type argument-name (OPTIONAL default-value?)?

The argument-class is one of:

Syntax

   VALUE | READ-ONLY | MODIFIABLE | REMAINDER

The shelf-type is one of:

Syntax

   COUNTER | SWITCH | STREAM

External functions cannot have OPTIONAL arguments.

12.1.3.2 Function Argument Classes

A function argument class is the first of the three components of a function argument template. An argument class indicates how an argument value in a call is interpreted: whether an actual value or a reference to a shelf is passed, and what kind of expression is allowed in the function call. There are four different function argument classes: VALUE, READ-ONLY, MODIFIABLE and REMAINDER.

External functions cannot have REMAINDER arguments.

12.1.3.2.1 Passing Expressions As Arguments

A VALUE argument is used to pass an expression as a function argument. Any expression of a compatible type can be passed as a VALUE argument.

Unlike READ-ONLY arguments, when a shelf item reference is passed as a VALUE argument, only the value of that item is passed. Inside the function, that VALUE argument cannot be used to access the key of that item (if any) or the values or keys of any of the other items on the shelf. (See Section 12.1.3.2.5, "The Difference Between VALUE and READ-ONLY Arguments".)

Within a function, a VALUE argument is treated as if it were a "read-only" shelf with a single item (initialized to the passed value) and no key. (This is often referred to as a scalar shelf.) A VALUE argument never has a key, even if a shelf item which does have a key is passed as a VALUE argument.

The following function illustrates the use of VALUE arguments:

   DEFINE COUNTER FUNCTION factorial VALUE COUNTER n AS
      DO WHEN n = 0
         RETURN 1
      ELSE
         RETURN n * factorial (n - 1)
      DONE

VALUE arguments are used whenever it is desirable to pass the result of an expression as an argument.

A VALUE argument may be passed to another called function as a VALUE or READ-ONLY argument or as part of a REMAINDER argument.

12.1.3.2.2 Passing a Shelf as a Read-Only Argument

A READ-ONLY argument is a reference to an existing shelf. A READ-ONLY argument is used to pass a shelf as a whole. It allows a caller to share a shelf with a function being called.

Unlike a VALUE argument, a READ-ONLY argument cannot be used to pass the value of an expression. (See Section 12.1.3.2.5, "The Difference Between VALUE and READ-ONLY Arguments".)

A READ-ONLY argument makes the shelf accessible to the function so that the function can read values and keys from it but cannot modify it in any way. READ-ONLY is essentially a promise by the function that it won't change the shelf, and that promise is enforced by OmniMark.

OmniMark does not permit any action on a READ-ONLY argument that would modify the shelf. It also does not permit a READ-ONLY argument to be passed to another function as a MODIFIABLE argument. These checks are done at compile-time and do not affect the run-time performance of OmniMark.

Certain predefined shelves are considered to be "read-only" and may only be passed as READ-ONLY arguments. They include #APPINFO, #DOCTYPE, #ITEM, #FIRST and #LAST. (The values of individual items may be passed as VALUE arguments as well, but that is not equivalent to passing the entire shelf.)

The following function definition is an example of using a READ-ONLY argument. It counts the number of active switch values on a switch shelf, but doesn't need to modify that shelf:

   DEFINE COUNTER FUNCTION count-active-switches
                           (READ-ONLY SWITCH switch-shelf) AS
      LOCAL COUNTER active-count
      SET active-count TO 0
      REPEAT OVER switch-shelf
         INCREMENT active-count WHEN switch-shelf
      AGAIN
      RETURN active-count

The value of an item on a READ-ONLY argument shelf may be passed to another function as a VALUE argument or as part of a REMAINDER argument. A READ-ONLY shelf may be passed to another function as a READ-ONLY shelf.
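A call to count-active-switches might look like the following sketch. The global shelf options and the surrounding FIND-END rule are hypothetical, chosen only to show a whole shelf being passed:

   GLOBAL SWITCH options VARIABLE
   ...
   FIND-END
      LOCAL COUNTER n
      ; the whole shelf is passed, not the value of one item
      SET n TO count-active-switches (options)
      OUTPUT "%d(n) options are active%n"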

12.1.3.2.3 Passing a Shelf as a Modifiable Argument

A MODIFIABLE argument is also a reference to an existing shelf. It is MODIFIABLE because the called function can modify the passed shelf in any of the ways that shelves can be modified (as long as the original shelf permits the operation). A MODIFIABLE shelf can have the key or value of any of its items changed, can be CLEARed or NEWed, and can be COPYed to.

A MODIFIABLE argument is used in preference to a READ-ONLY one when the argument is used to return results. It is useful when the single result value that can be returned from a function is not sufficient -- for example, where more than one value needs to be returned.

A simple example of using a modifiable shelf is the following, which splits a space-delimited sentence up into its individual words, and returns the words in a shelf:

   DEFINE FUNCTION split-up-sentence
                   (VALUE STREAM sentence,
                    MODIFIABLE STREAM words) AS
      CLEAR words
      REPEAT SCAN sentence
      MATCH WHITE-SPACE* [ANY-TEXT EXCEPT BLANK]+ => word
         SET NEW words TO word
      AGAIN

An item on a MODIFIABLE argument shelf can be passed to another function individually as a VALUE argument, or as part of a REMAINDER argument. The entire shelf can be passed as either a READ-ONLY argument or a MODIFIABLE argument.
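As an illustration, split-up-sentence could be called as sketched below. The local shelf words and the sample sentence are assumptions made for the example. After the call the shelf should hold one item per word:

   FIND-START
      LOCAL STREAM words VARIABLE
      split-up-sentence ("the quick brown fox", words)
      ; one line of output for each item the function added
      REPEAT OVER words
         OUTPUT "%g(words)%n"
      AGAIN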

12.1.3.2.4 Passing a Variable Number of Values as a Single Argument

A REMAINDER argument is a special kind of argument. It is used to pass a number of discrete values as a single argument to a function. OmniMark does this by "creating" a READ-ONLY shelf argument initialized by the values being passed.

If a function has a REMAINDER argument, it must be the last one. A REMAINDER argument has an argument template that looks like:

Syntax

   REMAINDER shelf-type argument-name (argument-separator ...)?

Within the function, the REMAINDER argument behaves exactly like a READ-ONLY shelf. The size of the shelf depends on the number of values passed.

If given, the argument-separator that precedes the ellipsis ("...") is used to herald the second and subsequent values being passed. Otherwise the argument-separator that precedes the REMAINDER argument is used to herald every one of the values being passed.

In the following example, the second herald for the REMAINDER argument is omitted. Therefore the first herald is used both as the introductory herald for the argument and as the item separator:

   DEFINE COUNTER FUNCTION sum
                  (VALUE COUNTER x0, REMAINDER COUNTER xx) AS
      LOCAL COUNTER s
      SET s TO x0
      REPEAT OVER xx
         INCREMENT s BY xx
      AGAIN
      RETURN s

A call to "sum" would look like:

   SET k TO sum (1, 2, 3, 4, 5) ; "k" gets the value 15.
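For contrast, here is a sketch of the case where the second herald is given. The function name total and the heralds "of" and "also" are hypothetical, and the form simply follows the REMAINDER argument template shown above. "of" introduces the first value and "also" introduces each later one:

   DEFINE COUNTER FUNCTION total
                  of REMAINDER COUNTER x also ... AS
      LOCAL COUNTER s
      SET s TO 0
      REPEAT OVER x
         INCREMENT s BY x
      AGAIN
      RETURN s

   ...
   SET k TO total of 1 also 2 also 3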

In the case that the REMAINDER is the first (and only) argument of a function, and it has no leading herald, the second herald must be defined, as in:

   DEFINE COUNTER FUNCTION sum (REMAINDER COUNTER x, ...) AS
      LOCAL COUNTER s
      SET s TO 0
      REPEAT OVER x
         INCREMENT s BY x
      AGAIN
      RETURN s

Note that the two examples above are equivalent when at least one value is passed. The functions are called with exactly the same syntax.

REMAINDER arguments cannot be OPTIONAL. If no values are passed in the REMAINDER argument, then the created argument shelf is still specified. It simply has no items. See Section 12.1.3.5, "Optional Function Arguments".

An item on a REMAINDER argument shelf may be passed to a function individually as a VALUE argument or as part of a REMAINDER shelf argument. The entire REMAINDER argument shelf can be passed as a READ-ONLY argument.

External functions cannot have REMAINDER arguments.

12.1.3.2.5 The Difference Between VALUE and READ-ONLY Arguments

The fundamental difference between VALUE and READ-ONLY arguments is that a VALUE argument is used to pass the result of a single expression, and a READ-ONLY argument is used to pass an entire shelf.

Use a VALUE argument when:

Use a READ-ONLY argument when a shelf is being passed as an argument and:

In general, if none of the conditions that require a READ-ONLY argument applies, then it is probably better to make the argument a VALUE argument.

12.1.3.3 Function Argument Shelf Types

The shelf-type defines what kind of shelf or expression can be passed as that argument. Shelves can only be passed as READ-ONLY or MODIFIABLE arguments. Expressions can only be passed as VALUE arguments, or as items of a REMAINDER argument.

STREAM arguments that have a class of VALUE or REMAINDER must be passed a valid string expression. That means that if an item on a STREAM shelf is being passed as a VALUE argument or as an item of a REMAINDER argument, then the STREAM must be attached to a BUFFER or a REFERENT, and must be closed.

12.1.3.4 Function Argument Separators

The argument-separator depends on the type of function call.

When a function has a parenthesized argument list, the argument-separator must either be a comma (",") or an unquoted OmniMark name. The comma-and-parentheses style is the familiar function call form used in many other programming languages.

When a function has an unparenthesized argument list, unquoted OmniMark names are used as the argument-separators. A programmer can use such separators to clarify the purpose of the arguments.

Below are two equivalent examples. The first example uses the keyword form of the argument-separator, which makes the roles of the arguments a little clearer than in the second example:

Example A

   DEFINE FUNCTION multiply-shelf
      MODIFIABLE COUNTER shelf
      by VALUE COUNTER val
   AS
      REPEAT OVER shelf
         SET shelf TO shelf * val
      AGAIN
   ...

   FIND DIGIT+ => val
      LOCAL COUNTER values

      ...
      multiply-shelf values by val

Example B

   DEFINE FUNCTION multiply-shelf
      (MODIFIABLE COUNTER shelf,
       VALUE COUNTER val)
   AS
      REPEAT OVER shelf
         SET shelf TO shelf * val
      AGAIN
   ...

   FIND DIGIT+ => val
      LOCAL COUNTER values

      ...
      multiply-shelf (values, val)

Calls to a function must use exactly the same argument-separators as in the function definition. If parentheses are present in the function definition, they must always be present in the calls to that function. If they are absent in the function definition, then they must be absent in every call to that function as well.

Arguments must occur in the same order in the function call as in the function definition.

12.1.3.4.1 Functions With No Arguments

For functions which do not take any arguments, the programmer can choose either of the following forms:

Example A

   DEFINE FUNCTION f () AS
     ...

Example B

   DEFINE FUNCTION f AS
     ...

If the empty parentheses are specified in the definition of a function, they must also be specified in every call to that same function. Similarly, if the empty parentheses are not specified in the definition, then calls to that function are not permitted to specify them.

12.1.3.4.2 Argument Separators for the First Argument

In the unparenthesized form of the function argument list, the programmer can specify a leading argument-separator to herald the first argument. This allows a more natural syntax for some functions:

   DEFINE COUNTER FUNCTION shelf-total of READ-ONLY COUNTER shelf AS
      LOCAL COUNTER total INITIAL {0}

      REPEAT OVER shelf
         INCREMENT total BY shelf
      AGAIN
      RETURN total
   ...
   GLOBAL COUNTER values VARIABLE
   ...
   FIND-END
      LOCAL COUNTER sum
      SET sum TO shelf-total of values

Using the initial argument-separator "of" in the above example provides a more natural interface to the function than if the function were called "shelf-total-of".

A first argument-separator can also be used when there is no natural choice for an unheralded first argument, or where the meaning of any unheralded argument could be ambiguous. For instance, the calling syntax of the following function makes it quite clear what the arguments mean:

   DEFINE COUNTER FUNCTION calculate-cylinder-volume
      radius VALUE COUNTER radius
      height VALUE COUNTER height
   AS
      RETURN 314 * radius * radius * height / 100

   ...
   FIND "cyl r=" digit+ => r " h=" digit+ => h
      LOCAL COUNTER volume

      SET volume TO calculate-cylinder-volume radius r height h

It is always clear whether the arguments have been specified correctly simply by reading the call, without referring back to the function definition. OmniMark will issue an error if the heralds are in the wrong order, and it is easy for the reader to ensure that the values passed in match the heralds.

12.1.3.5 Optional Function Arguments

A function's argument can be declared as optional by placing the keyword OPTIONAL after the argument name, as in:

   DEFINE FUNCTION increment MODIFIABLE COUNTER x
                   by VALUE COUNTER y OPTIONAL INITIAL {1} AS
      SET x TO x + y

This is a replacement of the OmniMark INCREMENT action by a functionally equivalent function. (Unlike the INCREMENT action though, the second argument requires parenthesization if it is an expression. Argument recognition is discussed in Section 12.4.2, "Argument Recognition".)

An OPTIONAL argument is omitted from a call by omitting the preceding function argument separator, together with the value.
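The two calling forms of the increment function defined above can be sketched as follows. The local counter n is hypothetical. Note that because the function is named increment, the INCREMENT action itself is not available in the same program (see Section 12.1.1, "Function Names"):

   FIND-START
      LOCAL COUNTER n
      SET n TO 5
      increment n          ; "by" and its value omitted: y takes its default, n becomes 6
      increment n by 10    ; y is 10, n becomes 16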

In the parenthesized argument form, a required argument cannot follow an optional one, because OmniMark does not have any way of recognizing whether the argument is intended to be the optional one or the required one.

A VALUE argument that is declared OPTIONAL can also be provided with a default value or values in curly braces, as in the example above. MODIFIABLE and READ-ONLY arguments can also be declared OPTIONAL, but they cannot have default values. REMAINDER arguments cannot be declared OPTIONAL.

When an optional argument that has no default value is omitted from a function call, it is illegal to access that argument from within the function in any way, except to test whether it was specified. Whether or not an OPTIONAL argument was specified in a call can be determined by using the "IS SPECIFIED" argument test.

External functions cannot have OPTIONAL arguments.

12.1.3.5.1 Testing The Presence Of An Optional Argument

Syntax

   shelf-type? argument-name (IS|ISNT) SPECIFIED

The "IS SPECIFIED" argument test returns TRUE if the OPTIONAL argument was specified in the call, and FALSE if it was not. An example is:

   DEFINE COUNTER FUNCTION count-elements
                  saving MODIFIABLE COUNTER current-value OPTIONAL AS
      SET current-value TO NUMBER OF CURRENT ELEMENTS
          WHEN current-value IS SPECIFIED
      RETURN NUMBER OF CURRENT ELEMENTS

(This function simply returns the number of currently opened elements, and, if the optional argument is specified, puts the count there as well.)

The "IS SPECIFIED" test can only be applied to OPTIONAL arguments. Whether or not the "IS SPECIFIED" test succeeds depends on whether a value was specified for the argument in the function call. Whether or not an "OPTIONAL VALUE" argument has a default value does not affect the "IS SPECIFIED" test.

In particular, for an argument whose value is not specified in the function call:

The "IS SPECIFIED" test is not permitted on a REMAINDER argument. If the caller does not pass any values to a REMAINDER argument, then the argument shelf will have zero items.
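The interaction between a default value and the "IS SPECIFIED" test can be sketched as follows. The function show-width, its herald "width" and the default of 72 are hypothetical. The point is that an omitted argument with a default value can still be read, even though the test reports it as not specified:

   DEFINE FUNCTION show-width
      width VALUE COUNTER w OPTIONAL INITIAL {72}
   AS
      DO WHEN w IS SPECIFIED
         OUTPUT "width given: %d(w)%n"
      ELSE
         ; w was omitted, so it holds its default value
         OUTPUT "width defaulted: %d(w)%n"
      DONE

   ...
   show-width width 60     ; the test succeeds
   show-width              ; the test fails, and w has its default value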

12.1.4 Function Bodies

A function body is the same sort of thing as a rule body. It has zero or more local declarations followed by zero or more actions. In the case of a function one of those actions can be a RETURN action, although in the case of functions that don't return results, an explicit RETURN is not necessary.

Another simple example of a function definition is the following:

   DEFINE COUNTER FUNCTION count-words-in VALUE STREAM s AS
      LOCAL COUNTER word-count
      SET word-count TO 0
      REPEAT SCAN s
      MATCH WHITE-SPACE* [ANY-TEXT EXCEPT BLANK]+
         INCREMENT word-count
      AGAIN
      RETURN word-count

In this case the function has a result type (COUNTER), and as a consequence must explicitly return a result of this type. In this example, the last line returns the value of the COUNTER variable containing the number of words.

12.1.4.1 What Can Be Done In a Function?

One principle that might help programmers understand what's allowed in functions is that function definitions are essentially rules. Functions differ from rules only in that the programmer specifies when functions are performed and what arguments are passed to them by specifying a function call, whereas, in general, the input data determines where rules are performed and what information (pattern variables and attribute values) is passed to them.

Apart from that, however, rules and function definitions are the same sort of thing. Furthermore, whatever can be done in a rule that calls a function can also be done in a function. Three major examples of this are the following:

12.1.5 Functions and Declaration-Free Programs

OmniMark allows programs to be written without declaring any of the COUNTER, SWITCH or STREAM variables when the "DECLARE HERALDED-NAMES" declaration is specified or when the -herald command-line option is specified. This makes it easy to write small prototype programs.

The basic rule that governs the use of declarations in these cases is that either everything or nothing (variable-wise) must be declared. As part of this basic rule, if an OmniMark program contains any function definitions, all variables must be declared. Function definitions, because they declare function names, are considered to be declarations for the purpose of this rule.


12.2 Functions -- Shelves and Arguments

Passing a shelf reference as a function argument sets up an interaction between the function and the passed shelf in its caller. This section describes some of the major aspects of that interaction.

12.2.1 The "Current Item"

When a shelf is passed as a function argument, the selected item becomes the current item of the function argument. If the shelf being passed in has an indexer, then the item selected by the indexer becomes the current item of the shelf argument. If the shelf being passed in has no indexer, then the currently selected item of that shelf is used as the current item of the function argument. As an example of using this, the following function takes a reference to a STREAM variable and removes all the white space characters from it:

   DEFINE FUNCTION trim-string (MODIFIABLE STREAM s) AS
      LOCAL STREAM new-s
      OPEN new-s AS BUFFER
      REPEAT SCAN s
      MATCH [ANY-TEXT EXCEPT BLANK]+ => non-white-space
         PUT new-s non-white-space
      MATCH WHITE-SPACE+
         ; skip it
      AGAIN
      CLOSE new-s
      SET s TO new-s

This function can then be called with an item of a shelf, and that item will be "trimmed". For example:

   trim-string (STREAM file-names @ 3)

In this case the third item of the shelf (@ 3) will be the one "trimmed".

The currently selected item can also be inherited, with the same effect, as in:

   ...
   GLOBAL STREAM file-names VARIABLE
   ...
   USING file-names @ 3
      trim-string (file-names)

Care must be taken in writing functions that have READ-ONLY or MODIFIABLE arguments. It must not be assumed that the "lastmost" item of a passed shelf is its current item. The following function illustrates this point by setting up a USING to ensure that references to the STREAM code-set refer to the newly created item:

   DEFINE FUNCTION add-new-codes (MODIFIABLE STREAM code-set,
                                  VALUE STREAM new-codes) AS
      USING code-set LASTMOST
      DO
         NEW code-set
         OPEN code-set AS BUFFER
         REPEAT SCAN new-codes
         MATCH LETTER => one-code
            PUT code-set "%ux(one-code)"
         MATCH ANY
            ; skip
         AGAIN
         CLOSE code-set
      DONE

12.2.2 Modifying Read-Only Arguments

A shelf passed as a READ-ONLY argument cannot be modified when accessed by its argument name.

However, if the argument that was passed refers to a global shelf, the function can access the shelf by its global name, and modify it that way. Since the READ-ONLY argument still refers to the same shelf, it appears as if the READ-ONLY argument has been modified.

The same kind of trick can be done by passing the same shelf as both a READ-ONLY argument and a MODIFIABLE argument. Since they refer to the same shelf, modifying the MODIFIABLE argument shelf effectively modifies the READ-ONLY shelf.

The following example illustrates the modification of a READ-ONLY argument. In the example, the function total-shelf sums the item values of the shelf passed to it and resets the GLOBAL shelf totals to be a single-item shelf whose item value is the sum. The trouble occurs when the GLOBAL totals is itself passed to total-shelf: the sum is zero, independently of what was on the totals shelf when it was passed.

   GLOBAL COUNTER totals
   ...
   DEFINE FUNCTION total-shelf (READ-ONLY COUNTER shelf-to-sum) AS
      CLEAR totals
      SET NEW totals TO 0
      REPEAT OVER shelf-to-sum
         INCREMENT totals BY shelf-to-sum
      AGAIN
   FIND-START
      SET totals TO 7
      SET NEW totals TO 23
      total-shelf (totals)
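The second trick, passing one shelf as both a READ-ONLY and a MODIFIABLE argument, can be sketched in the same way. The function copy-doubled and the shelf values are hypothetical. When both arguments refer to the same shelf, the CLEAR empties the shelf that the loop was meant to read, so the loop body never runs:

   GLOBAL COUNTER values VARIABLE
   ...
   DEFINE FUNCTION copy-doubled (READ-ONLY COUNTER source,
                                 MODIFIABLE COUNTER target) AS
      CLEAR target
      REPEAT OVER source
         SET NEW target TO source * 2
      AGAIN
   ...
      ; source and target are the same shelf here
      copy-doubled (values, values)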

Needless to say, passing a shelf to a function where there is a possibility that the passed GLOBAL shelf may be modified by the function is deprecated in general, whether or not a READ-ONLY or MODIFIABLE argument is involved. Doing so tends, as in the example, to produce unexpected behaviour.

A similar problem does not exist with VALUE and REMAINDER arguments, because the value is captured as part of the call, and no later modification of any GLOBAL value within the function is going to modify the passed value.

12.2.3 Applying SAVE To Function Arguments

As with local variables, SAVE and SAVE-CLEAR cannot be applied to function arguments. The purpose of a SAVE is to create a local shelf, but associate it with a global name, so that all references to that global name actually use the new instantiation. Therefore the shelf name used in a SAVE or a SAVE-CLEAR declaration must always be a global name.

SAVE or SAVE-CLEAR can be applied to a GLOBAL shelf name, even if that global shelf was passed as a MODIFIABLE or a READ-ONLY argument to a function. However, the function argument still applies to the shelf as it was before it was saved. This means:

In other words, SAVE and SAVE-CLEAR change the association of a GLOBAL shelf name, but not that of a function argument name.


12.3 External Functions

An OmniMark programmer can declare and call functions written in a programming language other than OmniMark. The external function declaration defines the calling sequence and initially specifies where these external functions are to be found on the system on which OmniMark is running. The programmer can redefine where to find the external functions at any time.

The interface that external functions use to access the OmniMark environment is designed to make it easy to write external functions that behave just like internal functions. In particular, external function programmers are urged to make use of the current output set and the currently selected items of function arguments wherever applicable.

The interface provided to the external function programmer is described in detail in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].

12.3.1 Declaring an External Function

An external function definition starts out looking like an OmniMark function definition: it has an optional result type, a function name and an argument list. The keyword EXTERNAL follows DEFINE to make it clear that an external function is being declared, but it's what follows the AS that really differentiates an external function from internal functions.

For an external function OmniMark requires an external function name and an optional external function library name, specifying where to find the function. As an example:

   DEFINE EXTERNAL FUNCTION get-db-record (VALUE STREAM key,
                                           VALUE STREAM value) AS
          "gdbrec" IN FUNCTION-LIBRARY "mylib.dll"

The only other difference between an external function definition and an internal function definition is that external functions cannot have OPTIONAL or REMAINDER arguments.

How the external function name and the external function library name are interpreted depends on the system on which the OmniMark program is being run. They have to be constant string expressions in the function definition, but they can be changed by the actions "SET EXTERNAL-FUNCTION" and "SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION", as described in the next section.

The "IN FUNCTION-LIBRARY" part can only be omitted when the "DECLARE FUNCTION-LIBRARY" declaration is used to specify a default library.

Additional information about external functions can be found in the following documents:

12.3.1.1 Declaring a Default Function Library

A "DECLARE FUNCTION-LIBRARY" declaration can be used to declare a default function library. This is especially useful where multiple declared external functions come from the same library.

Syntax

   DECLARE FUNCTION-LIBRARY string-expression

This declaration sets the default function library for any following external function definitions that do not have an "IN FUNCTION-LIBRARY" part. More than one "DECLARE FUNCTION-LIBRARY" declaration can appear in a program, in which case each "DECLARE FUNCTION-LIBRARY" declaration applies to the function definitions which follow until another "DECLARE FUNCTION-LIBRARY" declaration is encountered.

The "IN FUNCTION-LIBRARY" part of an external function definition can only be omitted if the definition is preceded by a "DECLARE FUNCTION-LIBRARY" declaration -- there is no "built-in" or "system-supplied" default external function library.
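
As a sketch of how the default applies, the following declares one default library and then defines two external functions without an "IN FUNCTION-LIBRARY" part. The second function, put-db-record, and its external name "pdbrec" are hypothetical, invented only for illustration:

   DECLARE FUNCTION-LIBRARY "mylib.dll"

   ; Both definitions below are looked up in "mylib.dll".
   DEFINE EXTERNAL FUNCTION get-db-record (VALUE STREAM key,
                                           VALUE STREAM value) AS
          "gdbrec"

   ; Hypothetical second function from the same library.
   DEFINE EXTERNAL FUNCTION put-db-record (VALUE STREAM key,
                                           VALUE STREAM value) AS
          "pdbrec"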

12.3.2 Identifying An External Function

An external function's external name and external library do not actually have to be specified in an external function definition. They can be set later as the OmniMark program runs. For example, an external function could be declared with an (almost certainly) invalid external function name and library name:

   DEFINE EXTERNAL FUNCTION get-db-record
      (VALUE STREAM key, VALUE STREAM value)
      AS "*" IN FUNCTION-LIBRARY "*"

During the running of the OmniMark program, the names can be determined and assigned to the external function with the actions:

Syntax

   SET EXTERNAL-FUNCTION function-name TO string-expression

Syntax

   SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION function-name
      TO string-expression

For example:

   SET EXTERNAL-FUNCTION get-db-record TO "gdbrec"
   SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION get-db-record TO "mylib.so"

The values used to set the external function name and external function library name can be constant string expressions, determined by program logic or derived from some input data -- it's entirely up to the program.

An external function has to have a valid external function name and a valid function library name (unless the library name comes from a "DECLARE FUNCTION-LIBRARY" default) at the point of every call made to that function. These names can be changed at any time, and they take effect at the next call. This allows calls to database access functions, for example, to be rebound to different database systems during a run.
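
A minimal sketch of such rebinding follows. It assumes that get-db-record has been defined as in the earlier examples; the switch use-backup-db, the library name "backup.dll" and the argument values are hypothetical:

   GLOBAL SWITCH use-backup-db        ; hypothetical switch

   PROCESS
      ; Choose the library before the call is made.
      DO WHEN use-backup-db
         SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION get-db-record
            TO "backup.dll"
      ELSE
         SET FUNCTION-LIBRARY OF EXTERNAL-FUNCTION get-db-record
            TO "mylib.dll"
      DONE
      ; This call uses whichever library is bound at this point.
      get-db-record ("some-key", "some-value")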

12.3.2.1 Enquiring About the Identity of an External Function

If a programmer wants to know what external function name or external function library name is currently associated with an external function, they can ask using the EXTERNAL-FUNCTION or "FUNCTION-LIBRARY OF EXTERNAL-FUNCTION" enquiries.

Syntax

   EXTERNAL-FUNCTION OF function-name

Syntax

   FUNCTION-LIBRARY OF EXTERNAL-FUNCTION OF function-name

Each of these enquiries takes the OmniMark name of an external function and returns a string result: the external function name or the library name. For example:

   OUTPUT "For external function %"get-db-record%",%n" _
          "    external name = %"" ||
          EXTERNAL-FUNCTION OF get-db-record ||
          "%",%n" _
          "    function library name = %"" ||
          FUNCTION-LIBRARY OF EXTERNAL-FUNCTION OF get-db-record ||
          "%".%n"

12.3.3 Externally-Defined Sources

OmniMark allows functions to be declared as externally-defined sources. An externally-defined source is an external function which returns its data incrementally (much like the built-in FILE operator).

These types of functions can be very useful for external functions which establish a "session-based" communication link with another application, or for functions which return a potentially large amount of data.

An externally-defined source function is declared in a manner similar to an external function, except that its return value must be SOURCE.

Syntax

   DEFINE EXTERNAL SOURCE FUNCTION function-name
       argument-list
       AS string-expression
       (IN FUNCTION-LIBRARY string-expression)?

An externally-defined source function can be called anywhere that a stream-returning function can be called. The functionality that distinguishes the two kinds of functions is isolated within the implementation of the function, and within OmniMark itself.
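
For illustration, a definition and a call might look like the following sketch. The function name read-db-records, its external name "rdbrecs" and the query text are hypothetical, and the FUNCTION keyword is assumed to follow SOURCE in the same way that it follows the result type in the general syntax of Section 12.1:

   DEFINE EXTERNAL SOURCE FUNCTION read-db-records (VALUE STREAM query) AS
          "rdbrecs" IN FUNCTION-LIBRARY "mylib.dll"

   PROCESS
      ; The call appears where a string expression is expected. The
      ; external function supplies the data incrementally.
      OUTPUT read-db-records ("all-records")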

More information on how an externally-defined source function interacts with OmniMark is provided in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].

12.3.4 External Output Functions

An external output function is a function to which a STREAM item can be attached. Such a stream is called an external output stream. Any data written to the stream item is processed by the function.

An external output function is declared with the syntax:

Syntax

   DEFINE EXTERNAL OUTPUT FUNCTION function-name argument-list
       AS string-expression
       (IN FUNCTION-LIBRARY string-expression)?

Streams attached to external output functions support all of the usual stream operations: OPEN, REOPEN, PUT, CLOSE, "HAS NAME", and "NAME OF". Since the external output function is not required to provide a name, the "HAS NAME" test can be used to ensure the name is present before trying to retrieve it.

External output streams can support all of the usual stream open modifiers. Information about the modifiers is passed to the external output function for handling. (Normally, the external output function delegates the handling of the open modifiers to OmniMark. However, it has the information if other handling is also required.)
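
The following sketch shows the stream operations listed above applied to an external output stream. It assumes that OPEN attaches a stream to an external output function with the same "OPEN ... AS" form used for other attachments; the function record-writer, its external name "recwr" and the argument value are hypothetical:

   DEFINE EXTERNAL OUTPUT FUNCTION record-writer (VALUE STREAM session-id) AS
          "recwr" IN FUNCTION-LIBRARY "mylib.dll"

   PROCESS
      LOCAL STREAM sink
      OPEN sink AS record-writer ("session-1")
      PUT sink "first line%n"
      ; The function is not required to supply a name, so test first.
      DO WHEN sink HAS NAME
         OUTPUT "attached to " || NAME OF sink || "%n"
      DONE
      CLOSE sink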

More information on how an external output function interacts with OmniMark is provided in the companion manual, OmniMark 3 External Functions Programmer's Guide [etr23j].

12.3.4.1 Setting Output Functions

Syntax

   SET stream-output-function-call
      open-modifiers TO string-expression

The SET action can be used directly on external output functions in the same way that it can be used on files and referents. For example:

   DEFINE EXTERNAL OUTPUT FUNCTION http-connection
         (VALUE STREAM requesting-id) AS "http_con"
         IN FUNCTION-LIBRARY "http.dll"
   ...
   FIND ...
      LOCAL STREAM requesting-id
      LOCAL STREAM requested-file-name
      ...
      SET http-connection (requesting-id) TO
         FILE (requested-file-name)

As with "SET FILE", the specified stream attachment (in this case an externally-defined output stream) is opened, the result of evaluating the right-hand side is written to it directly, and then it is closed.

12.3.4.2 Writing Referents to External Output Streams

Streams opened with REFERENTS-ALLOWED differ from all others in that the text written to the stream is "buffered" until it is possible to resolve the values of all referents written to the stream. OmniMark creates and maintains the buffer into which text written to the stream is initially placed.

Once the values of all referents in the referent scope have been resolved (i.e. at the end of the scope), the text of the stream, together with the resolved values of the referents written to it, is written to the stream's "final destination", if any. In the case of a stream attached to an external output function, this final destination is that function.


12.4 Function Calls

This section describes function calls and function argument recognition in detail.

12.4.1 Where Can A Function Be Called From?

A function that does not return a value can be called from anywhere an action is allowed. A value-returning function can only be called from places where an expression of the corresponding type can appear.

The following illustrates some of the possibilities, given that f-sw is a SWITCH-returning function, f-c1, f-c2 and f-c3 are COUNTER-returning functions and f-s1 and f-s2 are STREAM-returning functions:

   DO WHEN f-sw
      SET n TO f-c1 + f-c2
      SET s TO f-s1 || f-s2 ||* f-c3
   DONE
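
A function that does not return a value is called as an action in its own right. As a sketch, using the print-n-spaces function defined in Section 12.1:

   PROCESS
      OUTPUT "["
      print-n-spaces 5        ; called where an action is allowed
      OUTPUT "]%n"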

12.4.2 Argument Recognition

OmniMark programmers should understand how function arguments are recognized to avoid syntax errors or ambiguous constructs.

Function argument recognition behaves differently for functions with the parenthesized comma-separated argument list, and those whose arguments are heralded by names.

12.4.2.1 Argument Recognition in the Parenthesis-and-Comma Form

An argument in the parenthesized, comma-separated argument list begins after the opening parenthesis that begins the argument list, or the comma that ends the previous argument.

The argument continues until the comma that ends it is found, or until the closing parenthesis that ends the argument list is found.

In the following function call, the middle argument is a + b.

   DEFINE FUNCTION f (VALUE COUNTER x, VALUE COUNTER y, VALUE COUNTER z) AS
      ...

   GLOBAL COUNTER a
   GLOBAL COUNTER b

   PROCESS
      f (1, a + b, 2)

12.4.2.2 Argument Recognition in the Name-Heralded Form

An argument heralded by a name actually consists only of the immediately-following subexpression. OmniMark stops recognizing the argument as soon as it has a valid expression, regardless of whether the expression has the correct type, or whether the code following the expression could also be interpreted as part of the expression.

In the next example, the single argument to the function call is just a. The "+ b" is not considered part of the argument; it would be added to the function result when the function returns, if that were legal. Since f returns no value, it is not legal, and the example is in error.

   DEFINE FUNCTION f VALUE COUNTER a AS
      ...

   GLOBAL COUNTER a
   GLOBAL COUNTER b

   PROCESS
      f a + b

Parentheses must be used to ensure the correct interpretation:

   DEFINE FUNCTION f VALUE COUNTER a AS
      ...

   GLOBAL COUNTER a
   GLOBAL COUNTER b

   PROCESS
      f (a + b)

A function which takes a single argument can actually be considered a monadic operator like "NAME OF" or like a negative sign ("-") applied to a numeric value. In OmniMark, monadic operators always bind more tightly than other operators. Functions which take a single argument behave exactly the same way.
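
The effect of this binding can be seen with a value-returning function. In the following sketch, the function twice and the shelf r are hypothetical names introduced only for illustration:

   DEFINE COUNTER FUNCTION twice VALUE COUNTER n AS
      RETURN n * 2

   GLOBAL COUNTER a
   GLOBAL COUNTER b
   GLOBAL COUNTER r

   PROCESS
      SET a TO 3
      SET b TO 4
      SET r TO twice a + b       ; same as (twice a) + b, giving 10
      SET r TO twice (a + b)     ; the argument is a + b, giving 14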

When a function takes more than one argument, the recognition of each argument is the same as it is for the last argument: the argument is only recognized up until a valid expression has been formed.

The following example is another syntactically illegal function call, because the first argument is again interpreted only as a. The "+ b" is not part of the argument. The error occurs because there is text between the end of the first argument and the herald for the second argument (to).

   DEFINE FUNCTION capture VALUE COUNTER low TO VALUE COUNTER high AS
      ...

   GLOBAL COUNTER a
   GLOBAL COUNTER b

   PROCESS
      capture a + b to 100

To ensure that the above example is interpreted correctly, parentheses must be used:

   DEFINE FUNCTION capture VALUE COUNTER low TO VALUE COUNTER high AS
      ...

   GLOBAL COUNTER a
   GLOBAL COUNTER b

   PROCESS
      capture (a + b) to 100

By using the same method to recognize every argument in a function, the programmer can be sure that adding optional arguments to the end of a previously-correct function definition will have the least impact on existing programs.

12.4.2.3 Resolving Ambiguous Argument Separators

This section discusses three types of ambiguity or apparent ambiguity that can occur in the recognition of function call arguments and their separators.

There are no "reserved" keywords when it comes to the choice of names used for function argument separators. All apparent ambiguities in function definition headers are resolved by the OmniMark compiler by looking one or two symbols beyond an apparently ambiguously used name.

12.4.2.3.1 Duplicate Separators in a Function Definition

The argument heralds used in a function's definition cannot be "ambiguous" amongst themselves. For example, the following produces an error:

   DEFINE FUNCTION ambig VALUE COUNTER x
                   with VALUE COUNTER y OPTIONAL
                   with VALUE COUNTER z AS
      . . .

The error is that once an argument value for "x" has been found, the herald "with" can either designate the optional argument "y" or the following argument "z".

If two or more arguments can be recognized at the same time then each must have a distinct function argument separator. It is an error for two such arguments to have the same function argument separator.

The rule for determining which arguments can be recognized is:

12.4.2.3.2 Using the Same Separator in Different Functions

Function calls that occur in arguments of other function calls can have trailing OPTIONAL or REMAINDER arguments that have the same function argument separator as following arguments in the enclosing function call.

Where an argument could be recognized as belonging to both an embedded and enclosing function call, it is always recognized as belonging to the inner function call. For example, with the following definitions:

   DEFINE FUNCTION ambig1 VALUE COUNTER x with VALUE COUNTER y OPTIONAL AS
      . . . 
   DEFINE COUNTER FUNCTION ambig2 VALUE COUNTER a with VALUE COUNTER b OPTIONAL AS
      . . . 

the following two examples are equivalent:

Example A

   ambig1 ambig2 3 with 7

Example B

   ambig1 (ambig2 3 with 7)

In this case the argument "with 7" is recognized as the second, OPTIONAL, argument of ambig2 (b). It is not the second argument of ambig1 (y).
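
If "with 7" is meant for ambig1 instead, parentheses can be used to end the inner call early. A sketch, using the definitions above:

   ambig1 (ambig2 3) with 7

Here ambig2 receives only the argument 3, and "with 7" heralds the second argument of ambig1 (y).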

12.4.2.3.3 Greedy Argument Separator Recognition

Within a function call argument, recognition of the function argument separator that heralds an argument that can follow the current one predominates over the recognition of a name as a keyword or function name. For example, given the following function definition:

   DEFINE FUNCTION copy-item MODIFIABLE STREAM s
                        item VALUE COUNTER i AS
      SET s TO s item i

the following call

   copy-item column-names item 3 item 7

is processed using the first "item" as the function argument separator and "3" as the second argument value. When the second "item" is encountered by OmniMark, it is considered to be in error. The example is considered to be the same as:

   (copy-item column-names item 3) item 7

which is clearly illegal.

A function argument separator for the next argument is recognized in this way:


12.5 Recursive Functions

OmniMark functions can be called "recursively". That is, a function can call itself. This section describes:

12.5.1 Recursion

Recursion is a "structured" looping technique in which a function calls itself to do a subpart of its work. A classic numerical example is calculating the "factorial" of a number -- the product of all the positive integers up to and including that number:

   DEFINE COUNTER FUNCTION factorial (VALUE COUNTER n) AS
      DO WHEN n <= 0
         RETURN 1
      ELSE
         RETURN n * factorial (n - 1)
      DONE

Recursive techniques are especially useful in "divide-and-conquer" applications such as sorting, where the data being sorted is typically "partitioned" and the sorting algorithm applied recursively to each of the partitions.

12.5.2 Function Predefinitions

No special provision is required when declaring a function that calls itself. But if two functions are written such that each calls the other, it is impossible to define each function prior to its first use: one must precede the other. To support such mutually recursive functions, OmniMark provides function "predefinitions". A function predefinition looks just like a function definition except that it has no body. Instead of AS and a following set of local declarations and actions, there is just the keyword ELSEWHERE. For example:

   DEFINE FUNCTION analyse-command-set (VALUE STREAM command-set)
          ELSEWHERE

Functions must be defined or predefined before they are used. For mutually recursive functions, one or more of the functions must be predefined prior to the first actual function definition.
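
As a minimal sketch of mutual recursion, the following pair of functions (hypothetical names) reports whether a non-negative counter is even or odd, returning 1 for "yes" and 0 for "no". The function is-odd is predefined so that is-even can call it before its body appears:

   ; Predefinition: interface only, the body follows later.
   DEFINE COUNTER FUNCTION is-odd (VALUE COUNTER n) ELSEWHERE

   DEFINE COUNTER FUNCTION is-even (VALUE COUNTER n) AS
      DO WHEN n = 0
         RETURN 1
      ELSE
         RETURN is-odd (n - 1)
      DONE

   ; The actual definition of the predefined function.
   DEFINE COUNTER FUNCTION is-odd (VALUE COUNTER n) AS
      DO WHEN n = 0
         RETURN 0
      ELSE
         RETURN is-even (n - 1)
      DONE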

There are some rules for using function predefinitions:

12.5.3 Tail Recursion

OmniMark supports a special type of recursion called tail recursion. If calling itself or another function is the very last thing a function does, then instead of "calling" that function, OmniMark terminates the current function and "jumps" to the other function. The main effect of this is that, with a bit of care, function calls can be made that seem to be millions of levels deep without actually using up any more computer memory than a single call.

The OmniMark programmer doesn't have to do anything to make a function tail recursive. The OmniMark compiler examines each function call in a function and determines if it can be made into a tail recursive call.

The rule that a function call must be the very last thing in a function is interpreted in a very strict way. In particular, a function will not be tail recursive when:

In all these cases something has to be done (or undone) by the calling function after returning from the called function. This cleanup prevents a "jump" to the called function from being done.
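
The factorial function of Section 12.5.1 illustrates the point: the multiplication by n happens after the recursive call returns, so the call is not the last thing the function does. The following sketch (fact-acc is a hypothetical name) passes the running product as a second argument, so that nothing remains to be done after the recursive call. Whether a particular release of OmniMark actually turns this call into a "jump" depends on the strict rules described above:

   DEFINE COUNTER FUNCTION fact-acc (VALUE COUNTER n,
                                     VALUE COUNTER product-so-far) AS
      DO WHEN n <= 1
         RETURN product-so-far
      ELSE
         ; No work is left to do here once the call returns.
         RETURN fact-acc (n - 1, n * product-so-far)
      DONE

Called as fact-acc (5, 1), the function returns 120.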


12.6 Function Side Effects

OmniMark places very few restrictions on what side effects a function can produce. Those few restrictions are, nonetheless, important. They are:

Well-written programs will always have very few side effects, and most programs shouldn't encounter the problems introduced by functions changing values "behind the scenes".

12.6.1 Function Side Effects In Rule Headers

When selecting a rule to be performed, OmniMark examines the rules in the program. OmniMark:

If a function call that has a side effect appears within any such condition or pattern, then whether or not that side effect occurs depends entirely on whether the function call is actually performed during rule selection.

The header conditions and header patterns that are actually evaluated during the selection of a rule are evaluated in the order in which the rules that contain them appear in the OmniMark program.

For the situations where it is not defined whether OmniMark evaluates a part of a rule header, different releases of OmniMark may behave differently. Furthermore, the same release of OmniMark may handle these cases differently when the OmniMark program is modified, even though the change may not seem to affect that rule.
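
The following sketch shows the kind of construct at issue. The function counted and the shelf check-count are hypothetical names; the function's side effect is to count its own calls:

   GLOBAL COUNTER check-count

   ; Hypothetical function with a side effect.
   DEFINE COUNTER FUNCTION counted (VALUE COUNTER n) AS
      INCREMENT check-count
      RETURN n

   ; The condition calls the function during rule selection.
   FIND "abc" WHEN counted (1) > 0
      OUTPUT "matched%n"

The final value of check-count depends on how often OmniMark chooses to evaluate the condition while selecting rules, so a program should not rely on it.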

As a consequence, OmniMark programmers are strongly discouraged from depending on any undocumented behaviour that they discover by experimenting with OmniMark programs.

Copyright © OmniMark Technologies Corporation, 1988-1997. All rights reserved.
EUM27, release 2, 1997/04/11.
