Network programming

Network programming refers to all those operations that happen behind the scenes and involve cooperation between multiple applications running on multiple machines. These are the programs the user does not see, but which do the real work on the Internet and other networked environments.

The workhorse of network programming is the server. A server is a program that provides services to other programs. A server waits for a request from another program, decodes that request, and sends a response. Servers run unattended for days and weeks, so they must be robust.

The program requesting a service from a server is a client. A client issues a request and waits for a response from the server.

A middleware application is a program that acts as a go-between for two other programs. Generally, a middleware program adds value to the transaction. An active content server that receives requests from a web server and queries a database to fill the request is acting as middleware. It manages the communication between the web server and the database server and adds value by wrapping the database results in HTML tagging.

A middleware application acts as both a client and a server. Multi-tier architectures are possible in which multiple middleware applications broker and add value to communications across a network. Though they sound exotic, servers and middleware applications are easy to program in OmniMark using the connectivity libraries.

When embarking on a network programming project, you will need to know a little bit about protocols. A protocol is simply an agreement between a client and a server about how they will communicate. If you use a common published protocol, or publish your own protocol, you can enable any number of clients to communicate with your server. On the other hand, if you keep your protocol private and encrypted you can help to secure your server against intrusion.

There are two important types of protocol you need to know about:

  • transport protocols, and
  • application protocols.

Transport protocols get messages across the network from one machine to another in good order. TCP/IP is the transport protocol used on the Internet, and it is supported by OmniMark's network library through the tcp.service and tcp.connection data types.

An application protocol is an agreement about what constitutes a message and what the message means. While disk files have natural ends, a network message is just a sequence of bytes over an open connection. You have to look at the data itself to determine if you have found the whole message. The OMTCP library supports all the common methods of delimiting a message.
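
The idea of delimiting a message can be illustrated outside OmniMark. The following is a minimal Python sketch, not the OMTCP implementation: read_message is a hypothetical helper that reads from any byte stream until it sees a delimiter, which is roughly what an end-delimited protocol has to do.

```python
import io

def read_message(stream, delimiter=b"\n"):
    """Read bytes up to the delimiter and return the message without it.

    Returns None if the stream ends before any byte arrives, and the
    partial data if the stream ends before the delimiter is seen.
    (Hypothetical helper for illustration only.)
    """
    buffer = bytearray()
    while True:
        byte = stream.read(1)      # one byte at a time keeps the sketch simple
        if not byte:               # the stream (connection) has ended
            return bytes(buffer) if buffer else None
        buffer += byte
        if buffer.endswith(delimiter):
            return bytes(buffer[:-len(delimiter)])

# Two messages arrive back to back on one "connection".
wire = io.BytesIO(b"Mary\r\nTom\r\n")
first = read_message(wire)     # b"Mary\r" (the CR is still part of the message)
second = read_message(wire)    # b"Tom\r"
```

Note that the delimiter is removed but any preceding carriage return is not; the decoding step has to deal with it.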

Once you have a complete message you must decode it to see what it means, and then generate and send the appropriate response. OmniMark is the ideal language for decoding network protocols. Its streaming features make it very easy to interpret a message and to formulate a response quickly.

Simple Server

The following is a simple OmniMark server program. This server returns the first line of a nursery rhyme when it receives a message naming the principal character of that rhyme.

  import "omtcp.xmd" prefixed by tcp.
  
  declare catch kill-server ()
  
  process
     local tcp.service service initial { tcp.create-service on 5432 }
  
     repeat
        local tcp.connection connection initial { tcp.accept-connection from service }
  
        using output as tcp.writer of connection
           submit tcp.reader of connection protocol tcp.end-delimited "%10#"
  
      catch #external-exception identity identity message message location location
        put #log identity || " : " || message || "%n"
              || location || "%n"
     again
  
   catch kill-server ()
     ; EMPTY ... the server will end by virtue of the fact that the program will terminate.
  
  find "Mary" "%13#"? value-end
     output "Mary had a little lamb%13#%10#"
  
  find "Tom" "%13#"? value-end
     output "Tom, Tom, the piper's son.%13#%10#"
  
  find "die" "%13#"? value-end
     output "Argh!%13#%10#"
     throw kill-server ()

A server operates rather like a telephone. First we place it in service by assigning it a telephone number. Then it must wait for a call. When a call comes it must answer it, listen to the message, and make an appropriate response. The conversation may consist of a single exchange, or of multiple exchanges. When the conversation is over, it hangs up and goes back to waiting for the next call.

The essential operation of a server, then, comes down to three things:

  • start up: put the server in service,
  • request loop: wait for calls, respond, and repeat, and
  • shut down: take the server out of service.

Because it runs for a long time and has to handle many requests, a server has two overriding performance requirements:

  • no matter what happens while servicing a request, the server must not crash: it must stay running, and
  • no matter what happens while servicing a request, the server must always return to a consistent ready state when the request is complete: if the server was in a different state for each request, its responses would not be reliable.

Let's look at how our sample server meets these requirements, line by line:

  process
     local tcp.service service initial { tcp.create-service on 5432 }

This is the code that puts the server in service. It uses an instance of the tcp.service data type to establish a service on port 5432 of the machine it is running on. The server's address (its phone number) will be the machine's network address combined with the port number. Many different servers can run on the same machine using different ports.

     repeat
        local tcp.connection connection initial { tcp.accept-connection from service }
        ; ...
     again

This is the code that listens for an incoming call. tcp.accept-connection waits for a client to connect. When it receives a connection, it returns a value that represents the connection to the client. connection is declared inside the repeat loop so that it goes out of scope at the end of each iteration, providing automatic closure and cleanup of the connection.

        ; ...
           submit tcp.reader of connection protocol tcp.end-delimited "%10#"

An instance of tcp.connection provides an OmniMark source so that data can be read from the client. Reading data from a network connection, however, is different from reading from a file. While you can either read from a file or write to it, but not both, a network connection, like a telephone connection, is two-way. This means that OmniMark cannot detect the end of a message on a network connection the way it detects the end of a file: the connection stays open, and there could always be more characters coming. For this reason, all network data communication requires a specific application protocol for determining the end of a message. OmniMark supports the common methods used for this purpose through the I/O protocols provided in the OMTCP library. In this case we are using a line-based protocol: the end of a message is signaled by the line-end combination %13#%10#. This can be recognized by using the end-delimited protocol with the delimiter set to %10#. We submit data from that source to our find rules, which analyze the message and generate the appropriate response.

     repeat
        local tcp.connection connection initial { tcp.accept-connection from service }
  
        using output as tcp.writer of connection
           submit tcp.reader of connection protocol tcp.end-delimited "%10#"
        ; ...
     again

Our connection represents a two-way network connection. Not only must we get a source from it to read data, we must also attach an output to it so that we can send data over the connection to the client. We do this with the tcp.writer function. With our submit prefixed by using output as tcp.writer of connection, the find rules read from and write to the network connection.

  find "Mary" "%13#"? value-end
     output "Mary had a little lamb%13#%10#"
     
  find "Tom" "%13#"? value-end
     output "Tom, Tom, the piper's son.%13#%10#"

Ours is a line-based protocol, but line ends differ between platforms: CR+LF on Windows, LF on UNIX. Across a network, which can include machines from different platforms, we have to pick one for ourselves. Our protocol specifically requires CR+LF. For matching purposes, however, we use %10# as the delimiter with an optional preceding %13#, so that we can still read the message even if the client sends a bare LF instead of the required line-end sequence. When we send, we explicitly send %13#%10# rather than %10#. In this we are following an important maxim of network programming: be liberal in what you accept, conservative in what you send.

  find "die" "%13#"? value-end
     output "Argh!%13#%10#"
     throw kill-server ()

This is the find rule that detects the poison pill message. To ensure an orderly shutdown, we provide a method of terminating our server by sending it a message to shut itself down. (In a production system, you might want to pick a slightly less obvious message for the poison pill.) Shutting down the server is an exception to normal processing. We accomplish it by initiating a throw to a catch label named kill-server ().

  process
     ...
     repeat
        ...
     again
  
   catch kill-server ()

We catch the throw to kill-server () after the end of the server loop. OmniMark cleans up local scopes on the way, ensuring a clean and orderly shutdown. We are at the end of the process rule now, so the program exits normally.
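
The line-end rule and the poison pill are not specific to OmniMark. As a rough illustration only, here is the same request handling sketched in Python. The names normalize_request, frame_response, and respond are hypothetical, and returning an empty reply for an unknown name is an assumption of this sketch, not documented behavior of the server above.

```python
RHYMES = {
    b"Mary": b"Mary had a little lamb",
    b"Tom": b"Tom, Tom, the piper's son.",
}
POISON_PILL = b"die"

def normalize_request(line):
    """Liberal in what we accept: drop an optional CR left before the LF delimiter."""
    return line[:-1] if line.endswith(b"\r") else line

def frame_response(text):
    """Conservative in what we send: always terminate the reply with CR+LF."""
    return text + b"\r\n"

def respond(line):
    """Return (reply, shut_down) for one request line whose LF is already removed."""
    name = normalize_request(line)
    if name == POISON_PILL:
        return frame_response(b"Argh!"), True
    reply = RHYMES.get(name)
    # Assumption of this sketch: an unknown name gets an empty reply.
    return (frame_response(reply) if reply is not None else b""), False
```

A request of b"Mary\r" and a request of b"Mary" produce the same reply, while the reply itself always ends in CR+LF.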

Error and Recovery

A server needs to stay running despite any errors that occur in servicing a particular request. On the other hand it should shut down if it cannot run reliably. The following code provides for both these situations:

      catch #external-exception identity identity message message location location
        put #log identity || " : " || message || "%n"
              || location || "%n"

If there is an error in processing a request, OmniMark initiates a throw to #external-exception. We catch the throw at the end of the server loop. This provides for an automatic cleanup of any resources in use in servicing the request in progress, and ensures that the server returns to its stable ready state. (No attempt is made to rescue the specific request in which the error occurred. In a production server you would want to provide such error recovery, but make sure you always have a fallback that aborts the current request and returns to a stable ready state.)
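
The same recovery pattern can be sketched in Python, purely as an illustration. The function serve and its handle callback are hypothetical names. The point is only that the error handler sits inside the request loop, so a failed request is logged and abandoned while the loop carries on.

```python
import sys

def serve(listener, handle):
    """Accept connections until handle() asks for shutdown.

    listener is any object whose accept() returns (connection, address);
    handle(connection) services one request and returns True to stop.
    (Hypothetical names, for illustration only.)
    """
    while True:
        conn, _address = listener.accept()
        try:
            with conn:                  # the connection is closed however the request ends
                if handle(conn):
                    return              # orderly shutdown requested
        except Exception as error:      # one bad request must not stop the server
            print(f"request failed: {error!r}", file=sys.stderr)
```

With Python's standard socket module, a suitable listener could be created with socket.create_server(("", 5432)).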

This simple server program has the essential structure of a robust and usable production server. You would need to adapt the code to the protocol you are using, but apart from that, once input and output are bound to the connection, everything else is just regular OmniMark programming.

Simple Client

Any client program, written in any language, can use our server as long as it knows the protocol. Here is a simple client written in OmniMark:

  import "omtcp.xmd" prefixed by tcp.
  
  process
     local tcp.connection connection initial { tcp.connect to "localhost" on 5432 }
  
     using output as tcp.writer of connection
        output #args[1] || "%13#%10#"
  
     output tcp.reader of connection

This client is called with the name of the nursery rhyme character on the command line and prints out the line it receives from the server. Let's go through it line by line:

     local tcp.connection connection initial { tcp.connect to "localhost" on 5432 }

Like the server program, the client uses an instance of the tcp.connection data type to create a connection. Unlike the server, it does not require an instance of tcp.service, as it is not establishing a service, but simply making a connection to a service established elsewhere. The client takes a more active role than the server, however. While the server waits for a call, the client must take the initiative and make a call. It does this with the tcp.connect function. The tcp.connect function takes the network address and port of a server, and when the connection is made it returns an instance of tcp.connection, which we can write to and read from just as we did in the server program.

     output tcp.reader of connection

When we read the data returned from the server we actually have two choices. Since ours is a line-based protocol, we could use the protocol parameter of tcp.reader to read the response. But we also know that the server will drop the connection as soon as it has finished sending data. (This behavior is part of our protocol as well.) So we choose to keep reading data until the connection is dropped. This way we will get at least partial data even if something goes wrong and the server never sends the line end. Be conservative in what you send and liberal in what you accept.
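
To illustrate that a client in another language only needs to know the protocol, here is a minimal Python sketch of the same client. It is an illustration under assumptions, not part of OmniMark: ask and main are hypothetical names, and the host and port simply mirror the example above.

```python
import socket

def ask(conn, name):
    """Send one request line and return everything the server sends back."""
    conn.sendall(name.encode("ascii") + b"\r\n")   # conservative: always send CR+LF
    chunks = []
    while True:
        data = conn.recv(4096)
        if not data:        # empty read: the server has dropped the connection
            break
        chunks.append(data)
    return b"".join(chunks)

def main(name):
    # Assumes the sample server is listening on localhost, port 5432.
    with socket.create_connection(("localhost", 5432)) as conn:
        print(ask(conn, name).decode("ascii"), end="")
```

Like the OmniMark client, this one reads until the connection is dropped rather than looking for the line end, so it still returns partial data if the server never sends one.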

Clients for Common Servers

Most of the client programs you write in OmniMark will probably be for well-known servers such as HTTP (Web), FTP, or SMTP/POP (Mail). OmniMark's connectivity libraries provide direct support for these and other common protocols, greatly simplifying the task of retrieving data from these servers.