I wrote a very basic network library a little while back for Idris, but recently decided to scrap it and start again. In this post, I’ll describe how to use the new library, the guarantees that dependent types give us, and show some examples. I’ll also describe my plans for the coming months. Any comments or criticisms would be very much appreciated!
First off, it’s important that I say that this is very much a work in progress right now – there are a number of limitations that I hope to iron out soon (for example, we can only operate on strings) but I hope this will be enough to get us started.
You will need:
- A copy of the library (http://www.github.com/SimonJF/IdrisNet2)
- The latest git revision of Idris (http://www.github.com/idris-lang/Idris-dev)
- The new Effects library (http://www.github.com/edwinb/Eff-new) Once you have installed all of these (in order to do this, run idris –install <packagename> _from the directory for the packages, or just _make for Idris), we can start having a look at how everything works:
The Low Level Stuff
The library is largely in two parts: the higher-level effects for verified TCP communication, and the lower-level socket interface.
The Network.Socket library is a low-level abstraction over the C socket library, and it contains all of the goodness you would expect: socket, bind, connect, send, recv and so on. This is to allow people to build things on top of the basics of the library (so, if you fancy doing some nice raw socket goodness, this should provide a nice starting point!).
There’s also a thin C library, idrisnet.c, which provides some helper functionality to bridge the gap imposed by the FFI.
I haven’t generated the documentation just yet, but the functionality should be fairly clear from the file, and accessible to anyone with knowledge of programming with C sockets.
The Higher-Level Stuff
Primer: The New Effects Library
If you haven’t already, I strongly recommend you read the details about the “old” (I hesitate to call it that!) effects library here. The paper details how to model side effects algebraically, and shows how, since effectual operations may be predicated on the state of an associated resource, the library may be used to enforce resource usage protocols.
One issue we found when using the library for a larger project was that it was quite difficult to encapsulate failure. Initial versions of IdrisWeb (a POC web framework) and IdrisNet encapsulated failures by parameterising a resource over the particular step of the computation (for example, Bound or Listening) and recording any failure at the value level. This wasn’t ideal, since many extra handler functions needed to be created, and it was rather difficult to propagate failures in the program.
It’s not really my place to describe this, since the new Effects library was written by Edwin, but the general gist is that we may now use the result of an effectual operation to calculate the output resource. A good example is the ‘Hello World’ of resource-aware DSLs: file handling. Taken from here (not written by me!), we firstly specify an ADT to abstractly model the effectual operations:
data FileIO : Effect where
Open : String -(m : Mode) -
{() ==if result then OpenFile m else ()} FileIO Bool
Close : {OpenFile m ==()} FileIO ()
ReadLine : {OpenFile Read} FileIO String
WriteLine : String -{OpenFile Write} FileIO ()
EOF : {OpenFile Read} FileIO Bool
For those of you who are familiar with the existing Effects syntax, you will notice some differences. Firstly, the { … } notation is syntactic sugar for a resource transition. Let’s take Open as an example:
Open : String -> (m : Mode) -> {() ==> if result then OpenFile m else ()} FileIO Bool
This means that the Open operation takes a String, a Mode (for example, read or write) and returns a Bool. This Boolean return value is bound to the variable _result _in the curly-bracketed expression:
{() ==if result then OpenFile m else ()}
This indicates that we start the operation with the resource (). If the operation is successful (so, if it returns True) then the resulting resource is (OpenFile m), where m is the mode we specified before. If not, then the resource remains ().
We also then (as in the original paper) supply handler functions, which handle the abstract effects in a given execution context. To use these operations, we must wrap them in a function like so:
open : Handler FileIO e =>
String -> (m : Mode) ->
{ [FILE_IO ()] ==> [FILE_IO (if result then OpenFile m else ())] } EffM e Bool
open f m = Open f m
We once again use syntactic sugar to show how the resource changes as a result of the operation.
TCPClient
The TCPClient.idr file allows us to build TCP Clients (funny, that!), which are guaranteed to follow the correct resource-usage protocol and are therefore guaranteed not to make erroneous calls to the underlying library. We can also guarantee freedom from handle leaks, since the types will specify that all allocated resources must also be freed. Additionally, the state (in our case, the client socket) is contained within the resource, so the user doesn’t have to worry about maintaining and using the descriptor.
Let’s have a bit more of a look at the TCPClient effect itself. We want to keep track of whether a client is connected (and therefore may send and receive data along the TCP channel) and also if an error has occurred (meaning that the channel may not be used but must still be closed). To do this, we define two resources, ClientConnected and ErrorState.
data ClientConnected : Type where
CC : Socket -> ClientConnected
data ErrorState : Type where
ES : Socket -> ErrorState
We then define the effect.
-- TCP Client Effect
data TCPClient : Effect where
Connect : SocketAddress -> Port -> {() ==> interpConnectRes result}
TCPClient (SocketOperationRes Socket)
Close : { ClientConnected ==> ()} TCPClient ()
Finalise : { ErrorState ==> () } TCPClient ()
WriteString : String -> { ClientConnected ==> interpOperationRes result}
TCPClient (SocketOperationRes ByteLength)
ReadString : ByteLength ->
{ ClientConnected ==> interpOperationRes result }
TCPClient (SocketOperationRes (String, ByteLength))
I think a diagram makes this more clear…
[
Essentially, we start in the uninitialised state (). When we try to connect using tcpConnect (the wrapper function around the Connect operation), we either successfully connect and therefore transition to ClientConnected, or the connection fails and we remain in the uninitialised state. By closing the connection using tcpClose, we transition back to the uninitialised state.
The more interesting transitions happen as a result of the possibly failing resources tcpSend and tcpRecv. These will either succeed, and remain in the ClientConnected state, or fail, and transition into the ErrorState state.
To implement this transition, we can essentially distil the results of a TCP socket operation into 4 cases:
- The operation succeeds, returning the number of bytes sent on a send operation, or the number of bytes received on a receive operation.
- The operation fails, but returns the codes EAGAIN or EWOULDBLOCK. These codes indicate that the operation failed, but can be safely retried.
- The operation fails with a fatal error. In this case, we may not use the socket link again, and should close it at the next opportunity.
-
The remote socket has closed the connection. To represent this programmatically, we define the following data type:
data SocketOperationRes a = OperationSuccess a | FatalError SocketError – Most socket errors are fatal. | RecoverableError SocketError – EAGAIN / EWOULDBLOCK | ConnectionClosed
The interesting part comes with linking this to the resource transition. To do this, we make use of the fact that Idris treats types as first-class terms, thereby allowing them to be computed, and implement a function which calculates the output resource, given the result of the operation:
interpOperationRes : SocketOperationRes a -Type
interpOperationRes (OperationSuccess _) = ClientConnected
interpOperationRes (FatalError _) = ErrorState
interpOperationRes (RecoverableError _) = ClientConnected
interpOperationRes ConnectionClosed = ()
So, if the operation succeeds or a recoverable error occurs, we remain in the ClientConnected state. If we encounter a fatal error, we transition to the FatalError state, and finally if the connection is closed, we transition to the uninitialised state.
TCPServer
Well, we can connect to things, but it’d be good to have something to connect to. We do this in a similar way to TCPClient, defining resources representing each state of the server: uninitialised, bound, listening, and in an error state.
data ServerBound : Type where
SB : Socket -ServerBound
data ServerListening : Type where
SL : Socket -ServerListening
data ServerError : Type where
SE : Socket -ServerError
And the corresponding effect and state diagram:
data TCPServer : Effect where
Bind : SocketAddress -> Port -> { () ==> interpBindRes result }
TCPServer (SocketOperationRes ())
Listen : { ServerBound ==> interpListenRes result }
TCPServer (SocketOperationRes ())
Accept : ClientProgram t ->
{ ServerListening ==> interpOperationRes result }
TCPServer (SocketOperationRes t)
CloseBound : { ServerBound ==> () } TCPServer ()
CloseListening : { ServerListening ==> () } TCPServer ()
Finalise : { ServerError ==> () } TCPServer ()
Let us look in detail at accepting a client, since this is (in my opinion) quite cool. The Accept operation has the following signature:
Accept : ClientProgram t ->
{ ServerListening ==interpOperationRes result }
TCPServer (SocketOperationRes t)
Note the ClientProgram parameter: this is a type synonym for the following effectual operation:
ClientProgram : Type -Type
ClientProgram t = {[TCPSERVERCLIENT (ClientConnected)] ==
[TCPSERVERCLIENT ()]} EffM IO t
What this means is that we present the Accept operation with an effectual program which assumes a connected client—in this case, the socket gained from the accept operation—and results in the socket being closed. This is then run if a client is successfully accepted, and we have a static guarantee that the client behaves properly and is cleaned up.
(I’ve omitted the details of TCPSERVERCLIENT here since this is already far too long for a blog post, but essentially it’s like TCPCLIENT but without the connection operation).
Using the Libraries
I’ve written two basic applications making use of the libraries: EchoClient and EchoServer. They work as you might expect: the EchoClient application sends a message to the EchoServer, and the EchoServer sends it back. We’ll discuss EchoClient here, although both are on the repository.
At the top level, we use the Effects DSL to run the effectful echoClient program:
main : IO ()
main = run [(), ()] (echoClient (IPv4Addr 127 0 0 1) 1234)
We then attempt to connect to the server, using the given details:
echoClient : SocketAddress -Port -
{ [TCPCLIENT (), STDIO] ==[TCPCLIENT (), STDIO]} EffM IO ()
echoClient sa port = do
connect_res <- tcpConnect sa port
case connect_res of
OperationSuccess sock = do putStr "Connected!\n"
getAndSend
RecoverableError _ = echoClient sa port
ConnectionClosed =do putStr "Unable to connect :(\n"
return ()
FatalError _ = do putStr "Unable to connect :( \n"
return ()
The important concept to recognise here is that due to the correspondence between the result of the tcpConnect operation, by case-splitting, we may ascertain the resource given by the result in each branch. This allows us to ensure that the operations within the branch correspond to the resource. To give a concrete example, we may invoke the getAndSend function (discussed below) which requires a ClientConnected resource in the OperationSuccess branch, but attempting to do the same in the FatalError branch will result in a compile-time error.
The getAndSend and recvAndPrint functions take a very similar form, making use of the send and receive functions to communicate with the server. They also make use of the composability of resources to also allow messages to be printed to the console. I’m going to omit them in the interest of brevity (although, this post is now knocking on 2000 words, so that’s a bit rich…) but you can see them in the repository.
The Future…
If you’ve got this far, firstly, thanks for your time and I hope you found something at least a little bit interesting somewhere along the way! This is still early work, and there’s still a lot to do. Most seriously, this only works with strings at the moment: ideally, we’d have some kind of stream of Bits8 characters. The programs written with the effectual TCP library are also pretty verbose at the moment, and ideally it would be nice to find some way of making them more concise whilst retaining the guarantees we have at the moment. Basic concurrency support should also be along soon.
More interestingly, I’m currently working on a DSL to specify packet structures and marshal lower-level, much like the one described in this paper. That should come along soon, and I’ll hopefully use it as part of bigger projects.
Thanks for reading, and any comments would be gratefully received!