define string sink function

declaration/definition

Syntax
  define string sink function 
     function-name function-argument-list
  (as function-body | elsewhere)

or

  define external string sink function 
     function-name function-argument-list
  as external-name (in function-library library-name)?
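
As an illustration only, the three forms might be written as follows. The names pass-through, compress-to, "CompressTo" and "examplelib" are hypothetical; they are not part of OmniMark or of any shipped library.

       ; Forward declaration: the body is supplied later in the program.
       define string sink function
          pass-through (value string sink s)
       elsewhere

       ; Internal definition: copies its input unchanged to s.
       define string sink function
          pass-through (value string sink s)
       as
          using output as s
             output #current-input

       ; External definition, implemented in a hypothetical function library.
       define external string sink function
          compress-to (value string sink destination)
       as "CompressTo" in function-library "examplelib"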


Purpose

Defines a function that serves as a sink (or destination) of strings. This replaces define external output function from previous versions of OmniMark; that form is now deprecated. It also extends the functionality to internal functions.

A string sink function is like an ordinary action function, in that it does not return a value to its calling context. Instead, it reads strings output from the scope it is invoked from.

A return action without a value can be used to end a string sink function; alternatively, the function can be allowed to fall off its end.

There are few restrictions on what can be done in the body of an internal string sink function. The main one is that #current-output is not attached at the beginning of the function, so the function must establish its own output before writing anything. Most other operations are available.
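
To illustrate the two preceding points, here is a minimal sketch that assumes a hypothetical function named first-line. Because #current-output is not attached on entry, the function directs its output to its argument with using output as before writing anything. It ends with a return once it has seen the first line end. What happens to any input still unread at that point is not described here, so the sketch makes no claim about it.

       define string sink function
          first-line (value string sink s)
       as
          ; #current-output is unattached on entry, so attach it to s.
          using output as s
          repeat scan #current-input
          match "%n"
             ; First line end reached: end the function here.
             return
          match any-text+ => t
             output t
          again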

Unlike an ordinary function, a string sink function executes concurrently with its caller: its input is streamed incrementally from its calling context, without being buffered.

Invoking string sink functions

A string sink function is invoked by calling it in a context that expects a destination for strings. An internal string sink function can only be invoked in a well-scoped context: the destination of a using output as scope, the destination of a put action, or a value string sink argument to another function. An externally-defined string sink function can be invoked in any context that expects a destination for strings: as the argument of output-to, as the attachment on an open action, on the left-hand side of the set action, and so on. For example, a string sink function called uppercase can be invoked as follows:

       define string sink function uppercase (value string sink s) elsewhere
  
       process
          local stream s
  
          open s as buffer
          using output as uppercase (s)
             output "Hello, World!%n"
          close s
  
          output s
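
As a further sketch of the well-scoped contexts listed above, the same uppercase function could serve as the destination of a put action, or be passed as a value string sink argument to another string sink function. The function pass-through is hypothetical, and the exact put form shown here is an assumption that should be checked against the documentation of the put action.

       define string sink function uppercase (value string sink s) elsewhere
       define string sink function pass-through (value string sink s) elsewhere

       process
          ; Destination of a put action.
          put uppercase (#main-output) "Hello, World!%n"

          ; Value string sink argument to another function.
          using output as pass-through (uppercase (#main-output))
             output "Hello again!%n"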

String sink functions as filters

A filter can be written using a string sink function that takes an argument of type value string sink. The string sink function scans its #current-input, performs any filtering operations necessary, and outputs the filtered data to its string sink argument.

For example, the string sink function uppercase mentioned above might be implemented as follows:

       define string sink function
          uppercase (value string sink s)
       as
          using output as s
          repeat scan #current-input
          match letter+ => t
             output "ug" % t
  
          match [any \ letter]+ => t
             output t
          again

uppercase filters its #current-input using a repeat scan loop, generating an uppercased version of the same string. It outputs this string to its value string sink argument s.

A longer example: filtering markup output

A string sink function provides a simple way of filtering the output of markup processing. The following process rule parses some SGML input, and outputs the result of its processing to a file:

       process
          using output as file "output.txt"
          do sgml-parse document scan file "input.sgml"
             output "%c"
          done
  
  
       element #implied
          output "%c"

If we wanted to uppercase the output from the markup parse, we could capture the data in a buffer, perform some post-processing, and output the data to the destination file. However, this buffers the entire document output from the markup parse, which can be prohibitively expensive for large inputs. Alternatively, we could wrap the invocation of the SGML parser in a string source function:

       define string source function
          wrapper (value string filename)
       as
          do sgml-parse document scan file filename
             output "%c"
          done
  
  
       process
          using output as file "output.txt"
          repeat scan wrapper ("input.sgml")
          match letter+ => t
             output "ug" % t
  
          match [any \ letter]+ => t
             output t
          again

However, a less invasive solution is to use the uppercase function described earlier, and to write the process rule as follows:

       process
          using output as uppercase (file "output.txt")
          do sgml-parse document scan file "input.sgml"
             output "%c"
          done

This solution has minimal impact on the existing code, and all data is streamed from one process to another.
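
Because a call to a string sink function can itself be passed as a value string sink argument, filters of this kind can be chained. The following is a sketch only, assuming a hypothetical second filter named strip-digits written in the same style as uppercase:

       define string sink function
          strip-digits (value string sink s)
       as
          using output as s
          repeat scan #current-input
          match digit+
             ; Digits are dropped: nothing is output for them.
          match [any \ digit]+ => t
             output t
          again


       process
          using output as strip-digits (uppercase (file "output.txt"))
          do sgml-parse document scan file "input.sgml"
             output "%c"
          done

In this arrangement the output of the markup rules passes through strip-digits first, then through uppercase, and only then reaches the file.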