Names and symbols

An OmniMark program consists of tokens. A token is a sequence of characters that represents a discrete item in the structure of the program.

There are five kinds of tokens in an OmniMark program:

  • OmniMark keywords: the words defined by the OmniMark language itself.
  • OmniMark names: the names of the objects you define in your program. These objects include shelves, functions, catches, macros, and groups.
  • OmniMark symbols: the symbols that you can use as the names of functions that you define.
  • Markup names: the names of elements, attributes, entities, and notations in an XML or SGML document.
  • Values: literal strings and numbers.

An OmniMark name must start with a letter of the Roman alphabet or a Unicode character with a code point over 127. The remaining characters may be letters of the Roman alphabet, Unicode characters over 127, digits, hyphens (-), underscores (_), or periods (.). Names are case insensitive. By default, only the characters of the Roman alphabet have defined uppercase and lowercase equivalents. You can define uppercase and lowercase equivalents for other characters with declare name-letters.
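
As a minimal sketch of these rules (the shelf names are hypothetical and chosen only for illustration), the following program declares two names that use hyphens, underscores, digits, and periods, and then refers to them with different capitalization:

  process
     local integer Page-Count      initial { 12 }
     local integer chapter_2.total initial { 30 }
  
     ; Names are case insensitive, so these refer to the shelves declared above
     output "d" % page-count
     output "%n"
     output "d" % CHAPTER_2.TOTAL
     output "%n"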

The colon character : is the field selection operator. It is used to select a particular field of a record. The colon character is not allowed in OmniMark names.

All OmniMark names must be declared before they are used. Different types of names are declared in different ways.

All OmniMark names share a common name space. You cannot have a function and a catch with the same name. Function, catch, macro and group names are always global in scope. Shelf names may be global or local. The same shelf name can be declared in different lexical scopes. Declaring a name in a local scope hides the same name in a wider scope. Note that this applies to all names, so a local shelf with the same name as a function will hide the function name in the local scope.
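
The hiding rule can be sketched as follows, assuming a nested do ... done block as the inner lexical scope. The shelf name foo is declared twice; inside the block it refers to the inner shelf, and after the block it refers to the outer shelf again:

  process
     local integer foo initial { 1 }
  
     do
        local integer foo initial { 2 }
  
        ; The inner declaration hides the outer foo inside do ... done
        output "d" % foo
        output "%n"
     done
  
     ; Outside the inner scope, foo refers to the outer shelf again
     output "d" % foo
     output "%n"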

OmniMark names and keywords are in the same name space, so if you use an OmniMark name that is the same as a keyword, the token will be recognized as the name, not the keyword. The keyword will then be unavailable in your program. The main advantage of this behavior is that if new keywords are introduced in a later version of OmniMark, and they happen to match names you have used in your program, your program will still run properly. Generally speaking, you should not use a name that matches a keyword.

You cannot declare a keyword as a global name if you have already used that keyword as a keyword.

The hyphen character is a legitimate part of a name, so when you use it as a minus sign (-), use a space between a preceding name and the minus sign or the minus sign will be treated as part of the name.
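
A short sketch of the difference, using hypothetical shelf names: with spaces around the minus sign the expression is a subtraction, whereas price-discount written without spaces would be read as a single name, which has not been declared here.

  process
     local integer price    initial { 10 }
     local integer discount initial { 3 }
     local integer net
  
     ; Spaces around the minus sign make this a subtraction of two shelves
     set net to price - discount
     output "d" % net
     output "%n"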

The maximum length of an OmniMark name is 2,048 characters.

Markup names

Markup names follow the naming restrictions of the appropriate markup language (XML or the Reference Concrete Syntax of SGML). You may use a markup name unquoted in an OmniMark program when it is also a legal OmniMark name. Markup names that are not legal OmniMark names must be quoted. For instance, if your XML document uses namespaces, your XML names will include the colon character, which is not allowed in an OmniMark name, so those names must be quoted. For consistency, we recommend that you quote all markup names in your code.

  element "foo"
     output "%c"
  
  
  element "bar:baz"
     output "%c"
          

Character set encodings

You are allowed to use Unicode characters in OmniMark names. Beware, however, that there are different encodings of Unicode characters. Only UTF-8, the most common encoding, is supported by the OmniMark compiler.

OmniMark symbols

You can use symbols as function names. This is particularly useful if you want to write a function to overload an existing OmniMark operator to work with a new data type. You can also define your own symbols for use as function names. For example, this program defines a function to format a BCD number as a dollar amount and uses the symbol $ as the function name:

  import "ombcd.xmd" unprefixed
  
  define string function 
     $ value bcd money
  as
     return "<$,NNZ.ZZ>" % money
  
  
  process
     local bcd cash initial { 1276.759 }
  
     output $ cash
          

An OmniMark symbol used as an infix function name must start with one of the following characters:

  ! - + % ? / ~ $ \ @ ^ * = | < > &
          

The first character may be followed by one or more of the following characters:

  * = | < > &
          

An OmniMark symbol used as a regular function name must start with one of the following characters:

  ! - + % ? / ~ $ \ @ ^
          

The first character may be followed by one or more of the following characters:

  * = | < > &
          

Note that OmniMark names and OmniMark symbols are two distinct types of tokens. You can use either an OmniMark name or an OmniMark symbol as a function name, but you cannot mix and match characters from the name and the symbol character sets.

Quoted names and symbols

If you need to have a name or symbol that does not meet the requirements of a legal name or symbol, you can use a quoted name. To create a quoted name, you must place the name in quotes and precede it with a # symbol.

  process
     local integer foo
     local integer #"3 blind mice" initial { 3 }
     local integer #"*&^$"         initial { 5 }
  
  
     set foo to #"3 blind mice" + #"*&^$"
     output "d" % foo
          

The main benefit of this feature is that it allows you to use symbolic operators from modules in situations in which it is necessary to import the module with a prefix. For instance, if you wanted to import the OMFLOAT library with a prefix of float., that would result in the overloaded + operator being exported from the module with the name float.+, which is not a legal OmniMark name. You can use quoted names to handle this situation:

  import "omfloat.xmd" prefixed by float.
  
  process
     local float.float foo
     local float.float bar
     local float.float baz
  
     set foo to 3.75
     set bar to 2.50
     set baz to foo #"float.+" bar
  
     output "d" #"float.%%" baz
          

You can use static format items, such as the %% in the example above, within quoted names, but not dynamic format items. You cannot use quoted names to generate variable names at runtime.