Planet Haskell

June 03, 2023

Abhinav Sarkar

Implementing Co, a Small Language With Coroutines #4: Adding Channels

In the previous post, we added coroutines to Co, the small language we are implementing in this series of posts. In this post, we add channels to it to be able to communicate between coroutines.

  1. Implementing Co #1: The Parser
  2. Implementing Co #2: The Interpreter
  3. Implementing Co #3: Adding Coroutines
  4. Implementing Co #4: Adding Channels
  5. Implementing Co #5: Adding Sleep

This post was originally published on


With coroutines, we can now have multiple Threads of Computation (ToCs) in a Co program. However, right now these ToCs work completely independently of each other. Often in such concurrent systems, we need to communicate between these ToCs. For example, one coroutine may produce some data that other coroutines need to consume. Or, one coroutine may need to wait for some other coroutine to complete a task before it can proceed. For that, we need synchronization between coroutines.

There are various ways to synchronize ToCs: Locks, Semaphores, Promises, Actors, Channels, Software Transactional Memory, etc. In particular, channels are generally used with coroutines for synchronization in many languages such as Go, Kotlin, and Python, and we are going to do the same.

Channels are a synchronization primitive based on Communicating Sequential Processes (CSP). CSP is a formal language for describing patterns of interaction between concurrent processes. In CSP, processes communicate with each other by sending and receiving messages over channels.

A process can send a message to a channel only if the channel is not full, and blocks otherwise. Similarly, a process can receive a message from a channel only if the channel is not empty, blocking otherwise. Thus, channels provide a way for processes to synchronize with each other, and at the same time, communicate by passing messages.

Before we implement channels, we have to decide how they are going to work.

Channel Design

There are various design decisions that we need to make while implementing channels. Depending on what we choose, we end up with different kinds. Some of the major design decisions are:

Buffered vs Unbuffered
A buffered channel has a buffer to store messages. A send operation on a buffered channel succeeds if the buffer is not full, even if there are no pending receive operations. On the other hand, a send operation on an unbuffered channel blocks until the message is received by some other process. For example, in Java, LinkedBlockingQueue is a buffered channel, while SynchronousQueue is an unbuffered channel [1].

Bounded vs Unbounded
A bounded channel has a buffer of fixed capacity, and can hold only a fixed number of messages at maximum. A send operation on a bounded channel blocks if the buffer is full and there are no pending receive operations. An unbounded channel has a buffer with no fixed capacity, and can hold any number of messages. A send operation on an unbounded channel never blocks. For example, in Java, ArrayBlockingQueue is a bounded channel, while LinkedBlockingQueue (when created without an explicit capacity) is effectively an unbounded one.

Synchronous vs Asynchronous
A synchronous channel blocks on send until the message is received by some other process, even if the channel has an unbounded buffer. An asynchronous channel does not block on send if the channel’s buffer has space. For example, in Java LinkedTransferQueue is a synchronous channel, while ArrayBlockingQueue is an asynchronous channel.

Blocking vs Non-blocking
A blocking channel blocks on send if the channel’s buffer is full, or on receive if it is empty. A non-blocking channel never blocks on send or receive, and instead returns a sentinel value (usually the Null value), or throws an error to indicate that the operation could not be executed. For example, in Java BlockingQueue.put is a blocking send operation, while BlockingQueue.offer is a non-blocking send operation.

Fair vs Unfair
A fair channel ensures that the order of sends and receives is preserved. That means, if there are multiple pending sends and receives, they are executed in the order they were requested. An unfair channel does not guarantee any order. For example, in Java, ArrayBlockingQueue supports fair and unfair modes by passing a boolean flag to its constructor.

Locking vs Lock-free
A locking channel uses locks to synchronize access to the channel. A lock-free channel uses atomic operations for the same. For example, in Java LinkedBlockingQueue is a locking channel, while ConcurrentLinkedQueue is a lock-free channel.

Selectable vs Non-selectable
A selectable channel can be used in a select-like operation to wait for a message on multiple channels at once. A non-selectable channel cannot be used in such an operation. For example, channels in Go and Clojure core.async are selectable, while the aforementioned Java queues are not.

In our implementation for Co, we have both buffered and unbuffered channels. The buffered channels are bounded, with a fixed capacity. The channels are asynchronous, blocking, fair, lock-free, and non-selectable.

Enough of theory, let’s see how channels work in Co.

Channel Operations

In this section, we explore the various scenarios for send and receive operations on a channel in Co using diagrams. These diagrams are for buffered channels. For unbuffered channels, the send operation acts as for a fully buffered channel, and the receive operation acts as for an empty buffered channel.

Each channel has three internal queues: a send queue, a receive queue, and a buffer [2]. The send and receive queues are used to store pending send and receive operations (as coroutines) respectively. The buffer is used to store the data of the messages. The send and receive queues are always bounded, because otherwise any number of send and receive operations could be blocked on a channel, thus defeating the point of a bounded buffer. In extreme cases, it could cause the program to run out of memory.

The invariants we must maintain for the channel operations are:

  1. There can never be pending send operations while there are pending receive operations, and vice versa. This is because a send operation will complete immediately if there are pending receive operations, and vice versa.
  2. There can never be pending receive operations while there are messages in the buffer. This is because a receive operation will complete immediately by dequeuing the oldest message in the buffer.
  3. There can never be pending send operations while there is room in the buffer. This is because a send operation will complete immediately by enqueuing the message in the buffer.

With these invariants in mind, let’s look at the different scenarios in detail:

  • When a program tries to receive from a channel, and the channel has nothing in its buffer and there are no pending sends, the program blocks. The program’s continuation is captured as a coroutine, and is enqueued to the receive queue. Note that the coroutine is not queued into the interpreter’s global coroutine queue.
[Figure: Receive when no pending sends and buffer empty]
  • The corresponding scenario for a send operation is when the channel has pending receives. In this case, the send operation completes immediately, and the first coroutine in the receive queue is dequeued and resumed with the message.
[Figure: Send when pending receives]
  • When there are no pending receives and the buffer is not full, the message is enqueued to the buffer, and the send operation completes immediately.
[Figure: Send when no pending receives and buffer not full]
  • In the corresponding scenario for a receive operation, when there are no pending sends, and there are messages in the buffer, the oldest message is dequeued, and the receive operation completes immediately with it.
[Figure: Receive when no pending sends and buffer not empty]
  • When the buffer is full, the program trying to do a send operation is blocked and its continuation is captured as a coroutine and queued into the send queue. Note that the coroutine is not queued into the interpreter’s global coroutine queue.
[Figure: Send when buffer full]
  • In the corresponding scenario for a receive operation, when the buffer is full, the oldest message is dequeued from the buffer, and the receive operation completes immediately with it. If there are pending sends, the oldest coroutine in the send queue is dequeued and resumed, and its message is enqueued to the buffer.
[Figure: Receive when pending sends and buffer full]
  • When the send queue is full and the buffer is full as well, an error is thrown when trying to do a send operation.
[Figure: Send when send queue and buffer full]
  • Similarly, when the receive queue is full and the buffer is empty, an error is thrown when a receive operation is attempted.
[Figure: Receive when receive queue full and buffer empty]
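
The scenarios above can also be read as a small state machine. The following is a minimal sketch of that reading as a pure Haskell model, not the interpreter code we write in the next section. The names Chan, Outcome, send, receive, and invariantsHold are hypothetical, messages are plain Ints, and blocked receivers are only counted rather than stored as coroutines. The queue limit of 4 matches the limit used later in this post.

```haskell
-- Hypothetical pure model of a channel, for illustration only.
data Chan = Chan
  { capacity  :: Int   -- buffer capacity; 0 models an unbuffered channel
  , buffer    :: [Int] -- buffered messages, oldest first
  , senders   :: [Int] -- messages of blocked senders, oldest first
  , receivers :: Int   -- number of blocked receivers
  } deriving (Show, Eq)

data Outcome = Completed | Blocked | QueueFull
  deriving (Show, Eq)

-- Assumed limit for the send and receive queues.
maxQueueSize :: Int
maxQueueSize = 4

send :: Int -> Chan -> (Outcome, Chan)
send msg ch
  -- a pending receiver takes the message directly
  | receivers ch > 0 = (Completed, ch { receivers = receivers ch - 1 })
  -- room in the buffer
  | length (buffer ch) < capacity ch = (Completed, ch { buffer = buffer ch ++ [msg] })
  -- buffer full (or unbuffered): the sender blocks
  | length (senders ch) < maxQueueSize = (Blocked, ch { senders = senders ch ++ [msg] })
  | otherwise = (QueueFull, ch)

receive :: Chan -> (Outcome, Maybe Int, Chan)
receive ch = case (buffer ch, senders ch) of
  -- take the oldest buffered message, move the oldest blocked send into the buffer
  (b : bs, s : ss) -> (Completed, Just b, ch { buffer = bs ++ [s], senders = ss })
  (b : bs, []) -> (Completed, Just b, ch { buffer = bs })
  -- unbuffered channel with a blocked sender
  ([], s : ss) -> (Completed, Just s, ch { senders = ss })
  ([], [])
    | receivers ch < maxQueueSize -> (Blocked, Nothing, ch { receivers = receivers ch + 1 })
    | otherwise -> (QueueFull, Nothing, ch)

-- The three invariants listed above.
invariantsHold :: Chan -> Bool
invariantsHold ch =
  not (receivers ch > 0 && not (null (senders ch)))
    && not (receivers ch > 0 && not (null (buffer ch)))
    && not (not (null (senders ch)) && length (buffer ch) < capacity ch)

main :: IO ()
main = do
  let ch0 = Chan { capacity = 1, buffer = [], senders = [], receivers = 0 }
      (o1, ch1) = send 1 ch0
      (o2, ch2) = send 2 ch1
      (o3, got, ch3) = receive ch2
  print (o1, o2, o3, got)
  print (all invariantsHold [ch0, ch1, ch2, ch3])
```

With a buffer of capacity 1, the first send completes, the second blocks, and the receive returns the first message while moving the second one into the buffer.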

That captures all scenarios for send and receive operations on a channel. In the next section, we implement channels in Co.

Adding Channels

Let’s start with defining the Channel type:

data Channel = Channel
  { channelCapacity :: Int,
    channelBuffer :: Queue Value,
    channelSendQueue :: Queue (Coroutine (), Value),
    channelReceiveQueue :: Queue (Coroutine Value)
  }

newChannel :: Int -> Interpreter Channel
newChannel size = Channel size <$> newQueue <*> newQueue <*> newQueue

A channel has a buffer, a send queue, and a receive queue. The buffer is a queue of Co values, the receive queue is a queue of coroutines, and the send queue is a queue of coroutine-value pairs. A channel also has a capacity, which is the capacity of the buffer [3].
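
As a reminder of the shape of the queue operations used here, the following is a rough, self-contained sketch of an immutable queue wrapped in an IORef and modified with atomicModifyIORef'. It is an assumption-laden stand-in, not the series' actual Queue: it uses Data.Sequence from the containers package as the underlying queue and runs in IO rather than in the Interpreter monad. The function names mirror the ones used in this post.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.Sequence (Seq, ViewL (..), viewl, (|>))
import qualified Data.Sequence as Seq

-- Hypothetical stand-in for the Queue type used by the interpreter.
newtype Queue a = Queue (IORef (Seq a))

newQueue :: IO (Queue a)
newQueue = Queue <$> newIORef Seq.empty

-- Add an element at the back of the queue.
enqueue :: a -> Queue a -> IO ()
enqueue x (Queue ref) = atomicModifyIORef' ref $ \s -> (s |> x, ())

-- Remove and return the oldest element, if there is one.
dequeue :: Queue a -> IO (Maybe a)
dequeue (Queue ref) = atomicModifyIORef' ref $ \s -> case viewl s of
  EmptyL -> (s, Nothing)
  x :< rest -> (rest, Just x)

queueSize :: Queue a -> IO Int
queueSize (Queue ref) = Seq.length <$> readIORef ref

main :: IO ()
main = do
  q <- newQueue
  mapM_ (`enqueue` q) [1, 2, 3 :: Int]
  oldest <- dequeue q
  size <- queueSize q
  print (oldest, size) -- prints (Just 1,2)
```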

Now, we add Channel to the Value type:

data Value
  = Null
  | Boolean Bool
  | Str String
  | Num Integer
  | Function Identifier [Identifier] [Stmt] Env
  | BuiltinFunction Identifier Int ([Expr] -> Interpreter Value)
  | Chan Channel

Finally, we introduce some new built-in functions to create channels:

builtinEnv :: IO Env
builtinEnv = Map.fromList <$> traverse (traverse newIORef) [
    ("print", BuiltinFunction "print" 1 executePrint)
  , ("newChannel",
     BuiltinFunction "newChannel" 0 $ fmap Chan . const (newChannel 0))
  , ("newBufferedChannel",
     BuiltinFunction "newBufferedChannel" 1 executeNewBufferedChannel)
  , ("sleep", BuiltinFunction "sleep" 1 executeSleep)
  , ("getCurrentMillis",
     BuiltinFunction "getCurrentMillis" 0 executeGetCurrentMillis)
  ]

The newChannel function creates an unbuffered channel, and the newBufferedChannel function creates a buffered channel with the given capacity:

executeNewBufferedChannel :: [Expr] -> Interpreter Value
executeNewBufferedChannel argEs = evaluate (head argEs) >>= \case
  Num capacity | capacity >= 0 -> Chan <$> newChannel (fromIntegral capacity)
  _ -> throw "newBufferedChannel call expected a non-negative number argument"

Wiring Channels

Next, we wire the channels into the existing interpreter implementation. First, we add a new constructor for send statements to the Stmt type:

data Stmt
  = ExprStmt Expr
  | VarStmt Identifier Expr
  | AssignStmt Identifier Expr
  | IfStmt Expr [Stmt]
  | WhileStmt Expr [Stmt]
  | FunctionStmt Identifier [Identifier] [Stmt]
  | ReturnStmt (Maybe Expr)
  | YieldStmt
  | SpawnStmt Expr
  | SendStmt Expr Expr
  deriving (Show, Eq)

type Program = [Stmt]

And another for receive expressions to the Expr type:

data Expr
  = LNull
  | LBool Bool
  | LStr String
  | LNum Integer
  | Variable Identifier
  | Binary BinOp Expr Expr
  | Call Expr [Expr]
  | Lambda [Identifier] [Stmt]
  | Receive Expr
  deriving (Show, Eq)

type Identifier = String

We have already written the code to parse these statements and expressions in the first post, so that’s taken care of. We need to modify the execute and evaluate functions to handle these new statements and expressions. Let’s start with execute:

execute :: Stmt -> Interpreter ()
execute = \case
  ExprStmt expr -> void $ evaluate expr
  VarStmt name expr -> evaluate expr >>= defineVar name
  AssignStmt name expr -> evaluate expr >>= assignVar name
  IfStmt expr body -> do
    cond <- evaluate expr
    when (isTruthy cond) $
      traverse_ execute body
  while@(WhileStmt expr body) -> do
    cond <- evaluate expr
    when (isTruthy cond) $ do
      traverse_ execute body
      execute while
  ReturnStmt mExpr -> do
    mRet <- traverse evaluate mExpr
    throwError . Return . fromMaybe Null $ mRet
  FunctionStmt name params body -> do
    env <- State.gets isEnv
    defineVar name $ Function name params body env
  YieldStmt -> yield
  SpawnStmt expr -> spawn expr
  SendStmt expr chan -> evaluate chan >>= \case
    Chan channel -> do
      val <- evaluate expr
      channelSend val channel
    v -> throw $ "Cannot send to a non-channel: " <> show v
  where
    isTruthy = \case
      Null -> False
      Boolean b -> b
      _ -> True

To execute a SendStmt, we evaluate its arguments to get the channel and the value to send. Then we call the channelSend function to send the value over the channel.

Similarly, to evaluate a Receive expression, we evaluate its argument to get the channel, and then call the channelReceive function to receive a value from the channel:

evaluate :: Expr -> Interpreter Value
evaluate = \case
  LNull -> pure Null
  LBool bool -> pure $ Boolean bool
  LStr str -> pure $ Str str
  LNum num -> pure $ Num num
  Variable v -> lookupVar v
  Lambda params body -> Function "<lambda>" params body <$> State.gets isEnv
  binary@Binary {} -> evaluateBinaryOp binary
  call@Call {} -> evaluateFuncCall call
  Receive expr -> evaluate expr >>= \case
    Chan channel -> channelReceive channel
    val -> throw $ "Cannot receive from a non-channel: " <> show val

Now comes the core of the implementation: the channelSend and channelReceive functions. Let’s look into them in detail.

Sending and Receiving

The channelSend function takes a value and a channel, and sends the value over the channel, blocking if necessary.

channelSend :: Value -> Channel -> Interpreter ()
channelSend value Channel {..} = do
  bufferSize <- queueSize channelBuffer
  sendQueueSize <- queueSize channelSendQueue

  dequeue channelReceiveQueue >>= \case
    -- there are pending receives
    Just coroutine@Coroutine {..} ->
      scheduleCoroutine $ coroutine { corCont = const $ corCont value }

    -- there are no pending receives and the buffer is not full
    Nothing | channelCapacity > 0 && bufferSize < channelCapacity ->
      enqueue value channelBuffer

    -- there are no pending receives and
    -- (the buffer is full or the channel is unbuffered)
    Nothing | sendQueueSize < maxSendQueueSize -> do
      env <- State.gets isEnv
      callCC $ \cont -> do
        coroutine <- newCoroutine env cont
        enqueue (coroutine, value) channelSendQueue

    -- the send queue is full
    Nothing -> throw "Channel send queue is full"
  where
    maxSendQueueSize = 4

This is a direct implementation of the algorithm we discussed earlier using diagrams. We dequeue a coroutine from the receive queue. Then:

  • If there is a coroutine, we schedule it to be run with the sent value. The send call does not block.
  • If there is no coroutine, and
    • the channel is buffered and the buffer is not full, we enqueue the sent value to the buffer. The send call does not block.
    • the buffer is full, we create a new coroutine with the current continuation, and enqueue the coroutine and the value to the send queue. The send call blocks.
  • If the send queue is full, we throw an error.
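
The blocking cases rest on one idea: the rest of the computation is captured as a continuation and parked in a queue instead of being run, and some other party resumes it later. The following is a minimal, self-contained sketch of that idea. It is not the interpreter's code: it uses the ContT constructor from the transformers package directly instead of callCC, and park and demo are hypothetical names.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Cont (ContT (..), evalContT)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Capture the rest of the computation and store it in the waiting list
-- instead of running it. Whoever calls the stored function resumes it.
park :: IORef [Int -> IO ()] -> ContT () IO Int
park waiting = ContT $ \k -> modifyIORef waiting (++ [k])

-- Runs a "receiver" that parks itself, then a "sender" that resumes it.
-- Returns the log of what happened, in order.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  waiting <- newIORef []
  let say msg = modifyIORef logRef (++ [msg])
  evalContT $ do
    liftIO (say "receiver: waiting")
    v <- park waiting -- the receiver stops here
    liftIO (say ("receiver: got " ++ show v))
  say "sender: sending 42"
  readIORef waiting >>= mapM_ ($ 42) -- resume every parked receiver
  readIORef logRef

main :: IO ()
main = demo >>= mapM_ putStrLn
```

The receiver's last line runs only after the sender resumes the parked continuation, which is the same ordering we get when a blocked coroutine is taken out of a channel's queue and scheduled.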

Next, let’s write the channelReceive function:

channelReceive :: Channel -> Interpreter Value
channelReceive Channel {..} = do
  mSend <- dequeue channelSendQueue
  mBufferedValue <- dequeue channelBuffer
  recieveQueueSize <- queueSize channelReceiveQueue

  case (mSend, mBufferedValue) of
    -- the channel is unbuffered and there are pending sends
    (Just (sendCoroutine, sendValue), Nothing) -> do
      scheduleCoroutine sendCoroutine
      return sendValue

    -- the buffer is full and there are pending sends
    (Just (sendCoroutine, sendValue), Just bufferedValue) -> do
      scheduleCoroutine sendCoroutine
      enqueue sendValue channelBuffer
      return bufferedValue

    -- the buffer is empty and there are no pending sends
    (Nothing, Nothing) | recieveQueueSize < maxReceiveQueueSize -> do
      env <- State.gets isEnv
      callCC $ \receive -> do
        coroutine <- newCoroutine env receive
        enqueue coroutine channelReceiveQueue
        return Null

    -- the receive queue is full
    (Nothing, Nothing) -> throw "Channel receive queue is full"

    -- the buffer is not empty and there are no pending sends
    (Nothing, Just bufferedValue) -> return bufferedValue
  where
    maxReceiveQueueSize = 4

This is also a straightforward implementation of the algorithm. We dequeue a coroutine and its value from the send queue, and another value from the buffer. Then:

  • If there is a coroutine,
    • but no buffered value, we schedule the coroutine to be resumed, and return its value. The returned value becomes the value that is received from the channel. The receive call does not block.
    • and a buffered value, we schedule the coroutine to be resumed, enqueue its value to the buffer, and return the buffered value. The receive call does not block.
  • If there is no coroutine and no buffered value, and the receive queue is not full, we create a new coroutine with the current continuation, and enqueue it to the receive queue. The receive call blocks.
  • If the receive queue is full, we throw an error.

We hardcode the capacity of the send and receive queues to 4.

That’s it for the implementation of channels. Since we broke down the scenarios for send and receive operations, the implementation is not complicated. Let’s see it in action next.

Pubsub using Channels

In this demo, we implement a pubsub system using channels. The pubsub system consists of a server and a set of workers. The server sends messages to the workers over a channel. The workers print the messages and send acks back to the server over another channel. After sending all the messages, the server waits for the acks from the workers, and then stops the workers.

Diagrammatically, the pubsub system looks like this:

[Figure: Pubsub using channels]

The boxes with double borders are ToCs, and the ones with single borders are channels. The arrows show how the ToCs and channels are connected.

Pubsub code
// server sends messages to workers.
function startServer(messageCount, messageChan) {
  print("server starting");
  var i = 1;
  while (i < messageCount + 1) {
    print("server sending: " + i);
    i -> messageChan;
    print("server sent: " + i);
    i = i + 1;
  }
}

// workers receive messages over a channel, print them,
// and send an ack back to the sender on a channel.
function worker(name, messageChan, ackChan) {
  print("worker " + name + " starting");
  var message = null;
  while (true) {
    message = <- messageChan;
    print("worker " + name + " received: " + message);
    if (message == null) {
      print("worker " + name + " stopped");
      return;
    }
    print("worker " + name + " sending: " + message);
    message -> ackChan;
    print("worker " + name + " sent: " + message);
  }
}

// start workers.
function startWorkers(workerCount, messageChan, ackChan) {
  print("workers starting");
  var i = 1;
  while (i < workerCount + 1) {
    function(name) {
      spawn worker(name, messageChan, ackChan);
    }(i);
    i = i + 1;
  }
  print("workers scheduled to be started");
}

// server waits for acks from workers.
function waitForWorkers(messageCount, ackChan, doneChan) {
  print("server waiting for acks");
  var i = 1;
  var message = null;
  while (i < messageCount + 1) {
    message = <- ackChan;
    print("server received: " + message);
    i = i + 1;
  }
  print("server received all acks");
  null -> doneChan;
}

// stop workers.
function stopWorkers(workerCount, messageChan, doneChan) {
  var done = <- doneChan;
  print("workers stopping");
  var i = 1;
  while (i < workerCount + 1) {
    null -> messageChan;
    i = i + 1;
  }
  print("workers scheduled to be stopped");
}

var workerCount = 3;
var messageCount = 7;
var messageBufferSize = 5;
var ackBufferSize = 1;
var messageChan = newBufferedChannel(messageBufferSize);
var ackChan = newBufferedChannel(ackBufferSize);
var doneChan = newChannel();

startWorkers(workerCount, messageChan, ackChan);
spawn waitForWorkers(messageCount, ackChan, doneChan);
startServer(messageCount, messageChan);
stopWorkers(workerCount, messageChan, doneChan);

Running the program produces this output:

Pubsub output
workers starting
workers scheduled to be started
server starting
server sending: 1
server sent: 1
server sending: 2
server sent: 2
server sending: 3
server sent: 3
server sending: 4
server sent: 4
server sending: 5
server sent: 5
server sending: 6
worker 1 starting
worker 1 received: 1
worker 1 sending: 1
worker 1 sent: 1
worker 1 received: 2
worker 1 sending: 2
worker 2 starting
worker 2 received: 3
worker 2 sending: 3
worker 3 starting
worker 3 received: 4
worker 3 sending: 4
server waiting for acks
server received: 1
server received: 2
server received: 3
server received: 4
server sent: 6
server sending: 7
server sent: 7
worker 1 sent: 2
worker 1 received: 5
worker 1 sending: 5
worker 1 sent: 5
worker 1 received: 6
worker 1 sending: 6
worker 1 sent: 6
worker 1 received: 7
worker 1 sending: 7
worker 2 sent: 3
worker 3 sent: 4
server received: 5
server received: 6
server received: 7
server received all acks
worker 1 sent: 7
workers stopping
workers scheduled to be stopped
worker 2 received: null
worker 2 stopped
worker 3 received: null
worker 3 stopped
worker 1 received: null
worker 1 stopped

The output shows how the server and worker coroutines yield control to each other when they are waiting for messages or acks [4].

Bonus Round: Emulating Actors

The Actor model is a concurrent programming paradigm where computation is carried out by lightweight processes called Actors that can only communicate with each other by sending messages. This makes them ideal for building concurrent and distributed systems.

In this section, we emulate actors in Co using channels:

function start(process) {
  var inbox = newChannel();
  spawn (function () {
    var val = null;
    while (true) {
      val = <- inbox;
      if (val == null) { return; }
      process(val);
    }
  })();
  return function (message) { message -> inbox; };
}

function send(actor, message) { actor(message); }
function stop(actor) { actor(null); }

Actors are implemented as wrappers around channels. By sending messages to an actor’s channel, we can send messages to the actor. However, we cannot expose the channels directly, so we wrap them in functions.

The start function creates and starts an actor by creating a new channel, and spawning a coroutine that receives messages from the channel in a loop and passes them to the process function taken as a parameter by the start function. Upon receiving a null value, the coroutine returns, which stops the actor.

The start function returns a function to send messages to the actor, which works by sending the messages to the actor’s channel.

The send function is a convenience function to send a message to an actor. The stop function stops an actor by sending it a null message.

It was easy, wasn’t it? Now let’s use actors in some different ways.

Let’s start with a simple example of an actor that prints the received messages:

var printer = start(print);
spawn send(printer, "world");
send(printer, "hello");

The process parameter here is the print function. Running this program produces the following output:


Next, let’s write an actor that counts. For that, first we need to create a 2-Tuple data structure using closures, named Pair [5]:

function Pair(first, second) {
  return function (command) {
    if (command == "first") { return first; }
    if (command == "second") { return second; }
    return null;
  };
}

function first(pair) { return pair("first"); }
function second(pair) { return pair("second"); }

Now we implement the counter actor:

function makeCounter() {
  var value = 0;
  return start(function (message) {
    var command = first(message);
    var arg = second(message);

    if (command == "inc") { value = value + arg; }
    if (command == "get") { send(arg, value); }
  });
}

The makeCounter function creates a counter actor. The counter actor is started with a processing function that takes a message as a Pair, extracts the command and the argument from the message, and increments the counter value or sends the counter value back depending on the command. We exercise the counter like this:

var printer = start(print);
var counter1 = makeCounter();

send(counter1, Pair("inc", 1));
send(counter1, Pair("get", printer));

send(counter1, Pair("inc", 2));
send(counter1, Pair("get", printer));

var counter2 = makeCounter();
send(counter2, Pair("inc", 5));
send(counter2, Pair("get", printer));

The output of the program is:


And for the grand finale, let’s reimplement the ping-pong program using actors:

function makePingPonger(name) {
  var self = null;
  function pingPong(message) {
    var value = first(message);
    var other = second(message);

    if (value == "done") {
      print(name + " done");
      spawn (function () { stop(self); } ());
      return;
    }

    print(name + " " + value);
    if (value == 0) {
      print(name + " done");
      send(other, Pair("done", self));
      spawn (function () { stop(self); } ());
      return;
    }

    send(other, Pair(value - 1, self));
  }
  self = start(pingPong);
  return self;
}

The makePingPonger function creates a ping-ponger actor. The ping-ponger actor is started with a processing function that takes a message as a Pair of the value to print and the other actor to send the next message to. The processing function prints the value, decrements it, and sends it to the other actor. If the value is 0, it sends a done message to the other actor and stops itself. If the value is done, it stops itself.

Upon running it like this:

var pinger = makePingPonger("ping");
var ponger = makePingPonger("pong");
send(pinger, Pair(10, ponger));

It produces the same output as the original ping-pong program:

ping 10
pong 9
ping 8
pong 7
ping 6
pong 5
ping 4
pong 3
ping 2
pong 1
ping 0
ping done
pong done

In this post, we added channels to Co, and used them to create a variety of concurrent programs. We learned about CSP and how to implement it using coroutines and channels. In the next post, we will add support for sleep to Co.

The code for the complete Co interpreter is available here.


Many thanks to Steven Deobald for reviewing a draft of this article.

  1. Recently, Java added support for virtual threads. Although they are not cooperatively scheduled like coroutines, they are scheduled by the JVM and are very lightweight. With virtual threads, the various Java queues can be considered channels as defined in CSP.

  2. The design of channels in Co is inspired by the design of channels in Clojure core.async. It is a simplified version that does not support some of the features of core.async, such as transducers and alts.

  3. Recall that the Queue type is an immutable queue data structure wrapped in an IORef, which we manipulate using the atomic operation atomicModifyIORef'.

  4. You can try running the program with different values for the workerCount, messageCount, messageBufferSize and ackBufferSize variables to see how it behaves. You can also try changing the order of the function calls at the end of the program, or prefixing them with spawn to see how it affects the output. In some cases, the program may deadlock and hang, and in some other cases, it may throw an error. Try to understand why.

  5. We used the same trick to create a binary tree data structure in the previous post.

If you liked this post, please leave a comment.

by Abhinav Sarkar at June 03, 2023 12:00 AM

June 02, 2023

Brent Yorgey

Dynamic programming in Haskell: lazy immutable arrays

This is part 1 of a promised multi-part series on dynamic programming in Haskell. As a reminder, we’re using Zapis as a sample problem. In this problem, we are given a sequence of opening and closing brackets (parens, square brackets, and curly braces) with question marks, and have to compute the number of different ways in which the question marks could be replaced by brackets to create valid, properly nested bracket sequences.

Last time, we developed a recurrence for this problem and saw some naive, directly recursive Haskell code for computing it. Although this naive version is technically correct, it is much too slow, so our goal is to implement it more efficiently.

Mutable arrays?

Someone coming from an imperative background might immediately reach for some kind of mutable array, e.g. STUArray. Every time we call the function, we check whether the corresponding array index has already been filled in. If so, we simply return the stored value; if not, we compute the value recursively, and then fill in the array before returning it.
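
For concreteness, here is a minimal sketch of that mutable-array style of memoization. It uses Fibonacci numbers rather than the Zapis recurrence to keep it short. The name fibMemo and the use of -1 as a "not yet computed" marker are choices made for this illustration only.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)

-- Top-down memoization with a mutable unboxed array.
-- A stored value of -1 means "not computed yet".
fibMemo :: Int -> Int
fibMemo n = runST $ do
  memo <- newArray (0, max 1 n) (-1) :: ST s (STUArray s Int Int)
  let go i
        | i < 2 = pure i
        | otherwise = do
            cached <- readArray memo i
            if cached >= 0
              then pure cached -- already filled in: just return it
              else do
                v <- (+) <$> go (i - 1) <*> go (i - 2)
                writeArray memo i v -- fill in the table before returning
                pure v
  go n

main :: IO ()
main = print (fibMemo 50)
```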

This would work, but there is a better way!

Immutable arrays

While mutable arrays occasionally have their place, we can surprisingly often get away with immutable arrays, where we completely define the array up front and then only use it for fast lookups afterwards.

  • If the type of the array elements is suitable, and we can initialize the array elements all at once from a list using some kind of formula, map, scan, etc., we should use UArray since it is much faster than Array.
  • However, UArray is strict in the elements, and the elements must be of a type that can be stored unboxed. If we need a more complex element type, or we need to compute the array recursively (where some elements depend on other elements), we can use Array.

What about the vector library, you ask? Well, it’s a very nice library, and quite fast, but unfortunately it is not available on many judging platforms, so I tend to stick to array to be safe. However, if you’re doing something like Advent of Code or Project Euler where you get to run the code on your own machine, then you should definitely reach for vector.

Lazy, recursive, immutable arrays

In my previous post on topsort we already saw the basic idea: since Arrays are lazy in their elements, we can define them recursively; the Haskell runtime then takes care of computing the elements in a suitable order. Previously, we saw this applied to automatically compute a topological sort, but more generally, we can use it to fill out a table of values for any recurrence.
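
As a tiny warm-up for the idea, and not part of the Zapis solution, here is a sketch of a recursively defined lazy array for Fibonacci numbers. The name fibs is made up for this example, and it assumes n >= 1 so that index 1 is in bounds. Each element refers to earlier elements of the same array, and laziness ensures each one is computed at most once, on demand.

```haskell
import Data.Array (Array, array, (!))

-- A lazily evaluated table of Fibonacci numbers; assumes n >= 1.
fibs :: Int -> Array Int Integer
fibs n = table
  where
    table =
      array (0, n) $
        [(0, 0), (1, 1)]
          ++ [(i, table ! (i - 1) + table ! (i - 2)) | i <- [2 .. n]]

main :: IO ()
main = print (fibs 50 ! 50) -- prints 12586269025
```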

So, as a first attempt, let’s just replace our recursive c function from last time with an array. I’ll only show the solve function for now; the rest of the code remains the same. (Spoiler alert: this solution works, but it’s ugly. We’ll develop much better solutions later.)

solve :: String -> Integer
solve str = c!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    c :: Array (Int, Int) Integer
    c = array ((0,0),(n,n)) $
      [ ((i,i), 1) | i <- [0..n] ]
      ++
      [ ((i,j),0) | i <- [0..n], j <- [0..n], even i /= even j ]
      ++
      [ ((i,j),v)
      | i <- [0..n], j <- [0..n], i /= j, even i == even j
      , let v = sum [ m (s!i) (s!k) * c!(i+1,k) * c!(k+1,j) | k <- [i+1, i+3 .. j-1]]
      ]

We use the array function to create an array, which takes first a pair of indices specifying the index range, and then a list of (index, value) pairs. (The listArray function can also be particularly useful, when we have a list of values which are already in index order, as in the definition of s.)

This solution is accepted, and it’s quite fast (0.04s for me). However, it’s really ugly, and although it’s conceptually close to our directly recursive function from before, the code is almost unrecognizably different. It’s ugly that we have to repeat conditions like i /= j and even i == even j, and binders like i <- [0..n]; the multiple list comprehensions and nested pairs like ((i,j),v) are kind of ugly, and the fact that this is implementing a recurrence is completely obscured.

However, I included this solution as a first step because for a long time, after I learned about using lazy immutable arrays to implement dynamic programming in Haskell, this was the kind of solution I wrote! Indeed, if you just think about the idea of creating a recursively defined array, this might be the kind of thing you come up with: we define an array c using the array function, then we have to list all its elements, and we get to refer to c along the way.

Mutual recursion to the rescue

Most of the ugliness comes from losing sight of the fact that there is a function mapping indices to values: we simply listed out all the function’s input/output pairs without getting to use any of Haskell’s very nice facilities for defining functions! So we can clean up the code considerably if we make a mutually recursive pair of an array and a function: the array values are defined using the function, and the function definition can look up values in the array.

solve :: String -> Integer
solve str = cA!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    cA :: Array (Int, Int) Integer
    cA = array ((0,0),(n,n)) $
      [ ((i,j), c (i,j)) | i <- [0 .. n], j <- [0 .. n] ]

    c :: (Int, Int) -> Integer
    c (i,j)
      | i == j           = 1
      | even i /= even j = 0
      | otherwise        = sum
        [ m (s!i) (s!k) * cA ! (i+1,k) * cA ! (k+1,j)
        | k <- [i+1, i+3 .. j-1]
        ]

Much better! The c function looks much the same as our naive version from before, with the one difference that instead of calling itself recursively, it looks up values in the array cA. The array, in turn, is simply defined as a lookup table for the outputs of the function.

Generalized tabulation

One nice trick we can use to simplify the code a bit more is to use the range function to generate the list of all valid array indices, and then just map the c function over this. This also allows us to use the listArray function, since we know that the range will generate the indices in the right order.

cA :: Array (Int, Int) Integer
cA = listArray rng $ map c (range rng)
  where
    rng = ((0,0), (n,n))

In fact, we can abstract this into a useful little function to create a lookup table for a function:

tabulate :: Ix i => (i,i) -> (i -> a) -> Array i a
tabulate rng f = listArray rng (map f $ range rng)

(We can generalize this even more to make it work for UArray as well as Array, but I’ll stop here for now. And yes, I intentionally named this to echo the tabulate function from the adjunctions package; Array i is indeed a representable functor, though it’s not really possible to express without dependent types.)
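
To see that tabulate is not specific to this problem, here is a small sketch applying it to a different recurrence: counting monotone lattice paths in a grid. The latticePaths function is a made-up example, not something from the solution; it uses the same pattern of a function that looks up its recursive results in the table.

```haskell
import Data.Array (Array, Ix, listArray, range, (!))

tabulate :: Ix i => (i, i) -> (i -> a) -> Array i a
tabulate rng f = listArray rng (map f $ range rng)

-- Number of monotone lattice paths from (0,0) to (r,c), moving only
-- right or down. The helper go never calls itself directly; it looks up
-- smaller subproblems in the lazily built table.
latticePaths :: Int -> Int -> Integer
latticePaths r c = table ! (r, c)
  where
    table = tabulate ((0, 0), (r, c)) go
    go (i, j)
      | i == 0 || j == 0 = 1
      | otherwise        = table ! (i - 1, j) + table ! (i, j - 1)
```

Here latticePaths 2 2 is 6, and latticePaths 10 10 is 184756; each table entry is computed at most once.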

The solution so far

Putting it all together, here’s our complete solution so far. It’s pretty good, and in fact it’s organized in a very similar way to Soumik Sarkar’s dynamic programming solution to Chemist’s Vows. (However, there’s an even better solution coming in my next post!)

import Control.Arrow
import Data.Array

main = interact $ lines >>> last >>> solve >>> format

format :: Integer -> String
format = show >>> reverse >>> take 5 >>> reverse

tabulate :: Ix i => (i,i) -> (i -> a) -> Array i a
tabulate rng f = listArray rng (map f $ range rng)

solve :: String -> Integer
solve str = cA!(0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    cA :: Array (Int, Int) Integer
    cA = tabulate ((0,0),(n,n)) c

    c :: (Int, Int) -> Integer
    c (i,j)
      | i == j           = 1
      | even i /= even j = 0
      | otherwise        = sum
        [ m (s!i) (s!k) * cA ! (i+1,k) * cA ! (k+1,j)
        | k <- [i+1, i+3 .. j-1]
        ]

m '(' ')'                = 1
m '[' ']'                = 1
m '{' '}'                = 1
m '?' '?'                = 3
m b '?' | b `elem` "([{" = 1
m '?' b | b `elem` ")]}" = 1
m _ _                    = 0

Coming up next: automatic memoization!

So what’s not to like about this solution? Well, I still don’t like the fact that we have to define a mutually recursive array and function. Conceptually, I want to name them both c (or whatever) since they are really isomorphic representations of the exact same mathematical function. It’s annoying that I have to make up a name like cA or c' or whatever for one of them. I also don’t like that we have to remember to do array lookups instead of recursive calls in the function—and if we forget, Haskell will not complain! It will just be really slow.

Next time, we’ll see how to use some clever ideas from Conal Elliott’s MemoTrie package (which themselves ultimately came from a paper by Ralf Hinze) to solve these remaining issues and end up with some really beautiful code!

by Brent at June 02, 2023 04:12 PM

Competitive programming in Haskell: introduction to dynamic programming

In my previous post, I challenged you to solve Zapis. In this problem, we are given a sequence of opening and closing brackets (parens, square brackets, and curly braces) with question marks, and have to compute the number of different ways in which the question marks could be replaced by brackets to create valid, properly nested bracket sequences.

For example, given (??), the answer is 4: we could replace the question marks with any matched pair (either (), [], or {}), or we could replace them with )(, resulting in ()().

An annoying aside

One very annoying thing to mention about this problem is that it requires us to output the last 5 digits of the answer. At first, I interpreted that to mean “output the answer modulo 10^5”, which would be a standard sort of condition for a combinatorics problem, but that’s not quite the same thing, in a very annoying way: for example, if the answer is 2, we are supposed to output 2; but if the answer is 1000000002, we are supposed to output 00002, not 2! So simply computing the answer modulo 10^5 is not good enough; if we get a final answer of 2, we don’t know whether we are supposed to pad it with zeros. I could imagine keeping track of both the result modulo 10^5 along with a Boolean flag telling us whether the number has ever overflowed; we have to pad with zeros iff the flag is set at the end. I’m pretty sure this would work. But for this problem, it turns out that the final answer is at most “only” about 100 digits, so we can just compute the answer exactly as an Integer and then literally show the last 5 digits.
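
The difference between the two interpretations is easy to see in a small sketch. The names lastFive and modFive are made up for this illustration: the first literally keeps the last five characters of the decimal representation, and the second is what "the answer modulo 10^5" would print.

```haskell
-- Keep the last five digits of the decimal representation, as the
-- problem requires. Shorter numbers are returned unchanged.
lastFive :: Integer -> String
lastFive = reverse . take 5 . reverse . show

-- What "the answer modulo 10^5" would print: leading zeros are lost.
modFive :: Integer -> String
modFive x = show (x `mod` 10 ^ 5)
```

With these definitions, lastFive 2 is "2" and lastFive 1000000002 is "00002", whereas modFive 1000000002 is just "2".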

A recurrence

Now, how to compute the answer? For this kind of problem the first step is to come up with a recurrence. Let s[0 \dots n-1] be the given string, and let c(i,j) be the number of ways to turn the substring s[i \dots j-1] into a properly nested sequence of brackets, so ultimately we want to compute the value of c(0,n). (Note we make c(i,j) correspond to the substring which includes i but excludes j, which means, for example, that the length of the substring is j-i.) First, some base cases:

  • c(i,i) = 1 since the empty string always counts as properly nested.
  • c(i,j) = 0 if i and j have different parity, since any properly nested string must have even length.

Otherwise, s[i] had better be an opening bracket of some kind, and we can try matching it with each of s[i+1], s[i+3], s[i+5], …, s[j-1]. In general, matching s[i] with s[k] can be done in either 0, 1, or 3 ways depending on whether they are proper opening and closing brackets and whether any question marks are involved; then we have c(i+1,k) ways to make the substring between s[i] and s[k] properly nested, and c(k+1,j) ways for the rest of the string following s[k]. These are all independent, so we multiply them. Overall, we get this:

c(i,j) = \begin{cases} 1 & i = j \\ 0 & i \not \equiv j \pmod 2 \\ \displaystyle \sum_{k \in [i+1, i+3, \dots, j-1]} m(s[i], s[k]) \cdot c(i+1,k) \cdot c(k+1,j) & \text{otherwise} \end{cases}

where m(x,y) counts the number of ways to make x and y into a matching pair of brackets: it returns 0 if the two characters cannot possibly be a matching open-close pair (either because they do not match or because one of them is the wrong way around); 1 if they match, and at most one of them is a question mark; and 3 if both are question marks.
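
As a sanity check, here is the recurrence worked by hand on the example (??) from above, where s[0] = (, s[1] = s[2] = ?, s[3] = ), and n = 4. The only nonzero subproblems needed are c(2,4) and c(1,3):

```latex
\begin{align*}
c(2,4) &= m(\texttt{?}, \texttt{)}) \cdot c(3,3) \cdot c(4,4) = 1 \cdot 1 \cdot 1 = 1 \\
c(1,3) &= m(\texttt{?}, \texttt{?}) \cdot c(2,2) \cdot c(3,3) = 3 \cdot 1 \cdot 1 = 3 \\
c(0,4) &= m(s[0], s[1]) \cdot c(1,1) \cdot c(2,4) + m(s[0], s[3]) \cdot c(1,3) \cdot c(4,4) \\
       &= 1 \cdot 1 \cdot 1 + 1 \cdot 3 \cdot 1 = 4
\end{align*}
```

The first term of c(0,4) corresponds to the single replacement ()(), and the second to the three matched pairs that can fill the inner question marks, which agrees with the expected answer of 4.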

How do we come up with such recurrences in the first place? Unfortunately, Haskell doesn’t really make this any easier—it requires some experience and insight. However, what we can say is that Haskell makes it very easy to directly code a recurrence as a recursive function, to play with it and ensure that it gives correct results for small input values.

A naive solution

To that end, if we directly code up our recurrence in Haskell, we get the following naive solution:

import Control.Arrow
import Data.Array

main = interact $ lines >>> last >>> solve >>> format

format :: Integer -> String
format = show >>> reverse >>> take 5 >>> reverse

solve :: String -> Integer
solve str = c (0,n)
  where
    n = length str
    s = listArray (0,n-1) str

    c :: (Int, Int) -> Integer
    c (i,j)
      | i == j           = 1
      | even i /= even j = 0
      | otherwise        = sum
        [ m (s!i) (s!k) * c (i+1,k) * c (k+1,j)
        | k <- [i+1, i+3 .. j-1]
        ]

m '(' ')'                = 1
m '[' ']'                = 1
m '{' '}'                = 1
m '?' '?'                = 3
m b '?' | b `elem` "([{" = 1
m '?' b | b `elem` ")]}" = 1
m _ _                    = 0

This solution is correct, but much too slow—it passes the first four test cases but then fails with a Time Limit Exceeded error. In fact, it takes exponential time in the length of the input string, because it has a classic case of overlapping subproblems. Our goal is to compute the same function, but in a way that is actually efficient.

Dynamic programming, aka memoizing recurrences

I hate the name “dynamic programming”—it conveys zero information about the thing that it names, and was essentially invented as a marketing gimmick. Dynamic programming is really just memoizing recurrences in order to compute them more efficiently. By memoizing we mean caching some kind of mapping from input to output values, so that we only have to compute a function once for each given input value; on subsequent calls with a repeated input we can just look up the corresponding output. There are many, many variations on the theme, but memoizing recurrences is really the heart of it.

In imperative languages, dynamic programming is often carried out by filling in tables via nested loops—the fact that there is a recurrence involved is obscured by the implementation. However, in Haskell, our goal will be to write code that is as close as possible to the above naive recursive version, but still actually efficient. Over the next few posts we will discuss several techniques for doing just that.

  • In part 1, we will explore the basic idea of using lazy, recursive, immutable arrays (which we have already seen in a previous post).
  • In part 2, we will use ideas from Conal Elliott’s MemoTrie package (and ultimately from a paper by Ralf Hinze) to clean up the code and make it a lot closer to the naive version.
  • In part 3, we’ll discuss how to memoize functions with infinite (or just very large) domains.
  • There may very well end up being more parts… we’ll see where it ends up!

Along the way I’ll also drop more links to relevant background. This will ultimately end up as a chapter in the book I’m slowly writing, and I’d like to make it into the definitive reference on dynamic programming in Haskell—so any thoughts, comments, links, etc. are most welcome!

by Brent at June 02, 2023 04:10 PM


Well-Typed at GHC Contributors' Workshop and ZuriHac 2023

This year, Well-Typed is delighted to support both the GHC Contributors’ Workshop and ZuriHac.

GHC Contributors’ Workshop

The GHC Contributors’ Workshop organised by the Haskell Foundation is taking place from 7–9 June 2023. This is a great chance to learn about GHC development from the experts! In-person registration is closed, but you can participate online by registering here by 5th June.

Several of the Well-Typed team will be presenting at the event and are looking forward to welcoming you at the workshop:

  • Ben Gamari will be speaking on the GHC runtime system (RTS),
  • Duncan Coutts on how GHC interfaces with Cabal and the Haskell tooling ecosystem,
  • Sam Derbyshire on GHC’s frontend and in particular the Renamer, and
  • Zubin Duggal on Haskell Language Server.


This year’s ZuriHac will take place from 10–12 June 2023 in Rapperswil. Many of us from Well-Typed will be around over the weekend and are looking forward to catching up with Haskellers from all over the world.

As in previous years, Well-Typed are happy to offer a free training workshop at ZuriHac:

Lazy Evaluation by Andres Löh

Sunday 11 June 2023, 1200–1500 CEST

In this workshop, we are going to take a deep dive into lazy evaluation, looking at several examples and reasoning about how they get evaluated. The goal is to develop a strong understanding of how Haskell’s evaluation strategy works. Hopefully, we will see why laziness is a compelling idea with a lot of strong points, while also learning how some common sources of space leaks can be avoided.

The workshop will be accessible to anyone who has mastered the basics of Haskell and is looking to understand the language in more depth, whether they are a student or professional developer. We are not going to use any advanced features of the language, and you do not have to be a Haskell expert to attend!

This workshop will give you a glimpse of the kind of material covered on our training courses, such as our Haskell Performance and Optimization course. If you are interested in our courses or other services, check our Training page, or just send us an email.

If you cannot make it to ZuriHac, you can still check out recordings of some of our previous workshops in 2020 and 2021, or watch our new video series The Haskell Unfolder.

by christine, andres, adam at June 02, 2023 12:00 AM

June 01, 2023

Mark Jason Dominus

Why does this phrase sound so threatening?

Screenshot of tweet from Ari Cohn (@AriCohn) saying “If you are the lawyer for the Village of Melrose Park, this phrasing is really not what you want to see at the opening of the opinion.”  Below that is Cohn's screenshot of the opening words of a 2022 opinion of U.S. District Judge Steven C. Seeger: “The Village of Melrose Park decided that it would be a good idea”.

I took it the same way:

The Village of Melrose Park decided that it would be a good idea

is a menacing way to begin, foreboding bad times ahead for the Village.

But what about this phrasing communicates that so unmistakably? I can't put my finger on it. Is it “decided that”? If so, why? What would have been a less threatening way to say the same thing? Does “good idea” contribute to the sense of impending doom? Why or why not?

(The rest of the case is interesting, but to avoid distractions I will post about it separately. The full opinion is here.)

by Mark Dominus at June 01, 2023 12:04 AM

Tweag I/O

Testing Control-Flow Translations in GHC

In November 2022, Tweag engineers merged a WebAssembly back end into the Glasgow Haskell Compiler (GHC). The back end includes a new translation for control flow, which enables GHC to avoid depending on external tools like Binaryen. Because the translation is new, we wanted to test it before submitting a merge request. And classic unit testing was not a good fit—we would have needed to know what WebAssembly code was expected to be generated from any given fragment of Haskell, and that’s a job for a compiler, not an engineer. Fortunately, we don’t care how generated code is written; we care about how it behaves.

Code’s behavior can be tested by running it, but we wanted to test the translation before integrating it with GHC. So we simulated translated code by tracing possible executions symbolically.

Translation from Cmm to WebAssembly

In GHC’s low-level intermediate form, Cmm, control flow is represented as an arbitrary graph. But in WebAssembly, control flow is represented as structured code:

[Figure: diamond control-flow graph and equivalent WebAssembly]

In both representations, each letter A, B, C, or D stands for an action that can change the state of the machine. And after the action, the machine decides what action to perform next. The decision made after action A is conditional; it is based on an observation of a value that depends on the state of the machine. That value is the value of an expression: a condition in a conditional branch (Cmm) or a condition associated with an if statement (WebAssembly).

Cmm’s control flow is expressed through unconditional branches, conditional branches, and Switch. WebAssembly’s control flow is expressed through multiple syntactic forms: blocks, if statements, loop statements, return, and some multipurpose br forms. A br form can exit any block or if in which it appears, and it can branch to the start of any loop in which it appears. As an example, the loop, br, and return forms are used in the translation of a loop:

[Figure: loopy control-flow graph]

Translations from Cmm to WebAssembly, including the two shown here, can be tested while leaving the actual decisions and observations in abstract form (letters A through F, not actual code).

Reasoning about correctness

If a translation from a source function to a target function is correct, then when both functions are started in the same initial state, they take the same actions and they finish in the same final state. (In truth, the functions are started in related states and take related actions, but this post keeps things simple.) If an action A_i designates a function from states to states, then the final state after a sequence of actions operating on an initial state \sigma_0 can be written as

\sigma_0 \mathbin{\triangleright} A_1 \mathbin{\triangleright} A_2 \mathbin{\triangleright} \cdots \mathbin{\triangleright} A_n

where \triangleright is a reverse function-application symbol, as in Elm. Such states can be compared symbolically.
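
In Haskell, reverse function application is available as (&) from Data.Function, so the final state of a run can be sketched as a left fold. The MachineState type and runActions name below are invented for illustration; the post's simulator keeps states abstract rather than using a concrete type like this.

```haskell
import Data.Function ((&))

-- A hypothetical machine state: just a counter, for illustration only.
type MachineState = Int

-- Final state after applying the actions from left to right, i.e.
-- sigma0 & a1 & a2 & ... & an.
runActions :: MachineState -> [MachineState -> MachineState] -> MachineState
runActions = foldl (&)
```

For example, runActions 1 [(+ 2), (* 10)] gives the same result as 1 & (+ 2) & (* 10), namely 30.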

After an action A_i is taken, the subsequent action A_{i+1} is determined by a decision, which is made by control logic on the basis of the current machine state. But the simulator’s states are so abstract that it doesn’t know how the control logic will make the decision—for example, at a conditional branch, the simulator doesn’t know what the value of the Boolean condition will be, so it doesn’t know which way the decision will go. But it does know that a Boolean condition must be observed to be True or False, so it simulates both alternatives—and it records the observed condition.

A translation is deemed correct if for both source and target functions, the simulator records the same sequence of actions and observations. That condition suffices because

  • Every decision can be identified with the action that follows it.

  • If two functions start in the same state and they take the same action, they wind up in the same successor state.

  • If two functions observe the value of the same expression in the same state, they observe the same value.

In both Cmm and WebAssembly, control logic can observe just two forms of value: Boolean and integer. A Boolean observation determines the decision made by a conditional branch or if statement. An integer observation determines the decision made by a Switch or a br_table, which are used to implement Haskell case expressions.

Testing a single run

A translation is tested by simulating runs of both source code and target code. Each simulated run produces a list of events, where an event is an action or an observation:

data Event stmt expr = Action stmt
                     | Predicate expr Bool
                     | Switch expr (Integer,Integer) Integer

An Action value records what action (stmt) was simulated. A Predicate value records an observation of the Boolean value of the given expression (expr), from a conditional branch or an if statement. A Switch value records an observation of the given integer expression, e.g., in the translation of a case expression. A Switch also records an integer pair that specifies the range of possible observations (bounds of a jump table).

The list of events is produced by interpreting Cmm code or WebAssembly code in a monad that supplies observations and remembers events:

class MonadFail m => ControlTestMonad stmt expr m where
  takeAction    :: stmt -> m ()
  evalPredicate :: expr -> m Bool
  evalEnum      :: expr -> (Integer,Integer) -> m Integer

The takeAction function simply adds an action event to the list of remembered events, but the evalPredicate and evalEnum functions do more: each one not only remembers an event but also returns an observation, which is drawn from the monad’s supply.

The monad is used by an interpreter for Cmm functions:

evalGraph :: forall stmt expr m .
             ControlTestMonad stmt expr m
            => (Label -> Block CmmNode O O -> stmt)
            -> (Label -> CmmExpr -> expr)
            -> CmmGraph
            -> m ()

The first two arguments determine the representations of stmt and expr, which are left up to client code. The third argument, of type CmmGraph, represents a Cmm function as a control-flow graph.

Function evalGraph starts an interpreter run at the given graph’s entry label:

evalGraph stmt expr g = run (g_entry g)
  where run :: Label -> m ()
        run label = do
          takeAction @stmt @expr (stmt label (actionOf label))
          case lastNode (blockOf label) of
            CmmBranch l -> run l
            CmmCondBranch e t f _ -> do
                b <- evalPredicate @stmt @expr (expr label e)
                run (if b then t else f)
            CmmSwitch e targets -> do
                i <- evalEnum @stmt @expr (expr label e) $
                     extendRight $ switchTargetsRange targets
                run $ labelIn i targets
            CmmCall { cml_cont = Nothing } -> return ()
            ... more cases ...

Inside run, auxiliary function actionOf returns the action associated with a labeled node in the control-flow graph, and lastNode extracts the decision-making instruction from the end of the basic block that is stored at the node. (Implementations of auxiliary functions are not shown.)

Monad m is instantiated as BitConsumer stmt expr. A bit consumer is run by supplying a sequence of Booleans:

eventsFromBits :: BitConsumer stmt expr () -> [Bool] -> [Event stmt expr]

The Booleans determine the observations supplied by the monad. (This design makes it possible to test translations using randomly generated bit strings, perhaps using QuickCheck.) The BitConsumer’s evalPredicate method supplies a Boolean observation by taking one Boolean from the sequence. The evalEnum method supplies an integer observation by taking as many Booleans as are needed to code for an integer in the given range.
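
The actual BitConsumer implementation is not shown here, so the following is only a minimal sketch of how such a monad could work, under several assumptions: the Switch event is omitted, running out of bits is modelled with Nothing, and recordedEvents is a made-up stand-in for eventsFromBits (whose real result type is not wrapped in Maybe). The diamond function is a hand-written stand-in for an interpreted diamond-shaped control-flow graph.

```haskell
-- Events as described above, with the Switch case left out for brevity.
data Event stmt expr
  = Action stmt
  | Predicate expr Bool
  deriving (Eq, Show)

-- A hypothetical bit consumer: a function from the remaining bits to a
-- result, the events recorded, and the bits left over. Nothing means the
-- supply of bits ran out (an assumption of this sketch).
newtype BitConsumer stmt expr a = BitConsumer
  { runBits :: [Bool] -> Maybe (a, [Event stmt expr], [Bool]) }

instance Functor (BitConsumer stmt expr) where
  fmap f (BitConsumer g) = BitConsumer $ \bs -> case g bs of
    Nothing            -> Nothing
    Just (a, es, rest) -> Just (f a, es, rest)

instance Applicative (BitConsumer stmt expr) where
  pure a = BitConsumer $ \bs -> Just (a, [], bs)
  mf <*> ma = mf >>= \f -> fmap f ma

instance Monad (BitConsumer stmt expr) where
  BitConsumer g >>= k = BitConsumer $ \bs -> case g bs of
    Nothing            -> Nothing
    Just (a, es, rest) -> case runBits (k a) rest of
      Nothing              -> Nothing
      Just (b, es', rest') -> Just (b, es ++ es', rest')

-- Remember an action without consuming any bits.
takeAction :: stmt -> BitConsumer stmt expr ()
takeAction s = BitConsumer $ \bs -> Just ((), [Action s], bs)

-- Consume one bit as the observed value of a Boolean condition.
evalPredicate :: expr -> BitConsumer stmt expr Bool
evalPredicate e = BitConsumer $ \bs -> case bs of
  []       -> Nothing
  b : rest -> Just (b, [Predicate e b], rest)

-- Diamond-shaped function: A; if c then B else C; D
diamond :: BitConsumer String String ()
diamond = do
  takeAction "A"
  b <- evalPredicate "c"
  takeAction (if b then "B" else "C")
  takeAction "D"

-- Run a consumer on a bit string and keep only the recorded events.
recordedEvents :: BitConsumer stmt expr () -> [Bool] -> Maybe [Event stmt expr]
recordedEvents m bs = fmap (\(_, es, _) -> es) (runBits m bs)
```

Running recordedEvents diamond [True] yields the events A, the observation that c was True, B, and D, in that order. Two interpreters written against the same interface can be compared by running both on the same bits and checking the event lists for equality.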

Target code is represented by a value of type WasmControl stmt expr, which is interpreted by evalWasm:

evalWasm :: ControlTestMonad stmt expr m => WasmControl stmt expr -> m ()

When evalGraph is applied to a Cmm function and evalWasm is applied to its translation, both bit consumers can be run on the same sequence of bits. If the resulting lists of events differ, there is a fault in the translator.

Identifying test inputs

To identify sequences of Booleans to pass to eventsFromBits, we consider what execution paths we wish to test. Most functions contain more than one path, all of which ought to be tested, but if a function contains loops, all the paths can’t be tested: there are infinitely many. However! Our tests compare symbolic executions. And symbolic executions compose. Therefore if a simulation of every loop and every loop-free path meets the correctness criterion, the simulation of any possible path will also meet the correctness criterion.

Paths are enumerated by a depth-first search of the Cmm control-flow graph. At each node in the control-flow graph, the path enumerator visits all possible successor nodes. And when a node has more than one successor—that is, when the node makes a nontrivial decision—each successor is associated with an observation. For example, a conditional node has two successors, which are respectively associated with the observations True and False. The enumerator also remembers the action taken at each node, so it produces a list of paths, where each path is a sequence of events:

type CmmPath = [Event Stmt Expr]
cmmPaths :: CmmGraph -> [CmmPath]

Function cmmPaths returns every path through the control-flow graph in which at most one node appears more than once. In other words, the search stops every time it reaches a node that already appears on the path so far, but it does include that node. This termination condition ensures that every loop is among the paths that are enumerated, and so is every non-looping path.
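
The termination condition can be sketched on a plain graph of integer nodes. This is a simplification, not the real cmmPaths: it records only the nodes visited rather than events, and the Graph type, the paths function, and the example graph are all invented for illustration.

```haskell
type Node = Int

-- A graph as an association list from each node to its successors.
type Graph = [(Node, [Node])]

-- Depth-first enumeration of paths from the entry node. A path ends
-- either at a node with no successors or at the first node that already
-- appears on the path; that repeated node is included, so each loop is
-- traversed exactly once.
paths :: Graph -> Node -> [[Node]]
paths g entry = map reverse (go [] entry)
  where
    go visited n
      | n `elem` visited = [n : visited]
      | otherwise =
          case maybe [] id (lookup n g) of
            []    -> [n : visited]
            succs -> concatMap (go (n : visited)) succs

-- A small loopy graph: 1 -> 2, 2 -> 3 or 4, 3 -> back to 2, 4 exits.
example :: Graph
example = [(1, [2]), (2, [3, 4]), (3, [2]), (4, [])]
```

On this graph, paths example 1 produces [1,2,3,2], which covers the loop once, and [1,2,4], which is the loop-free path to the exit.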

After the Cmm paths are enumerated, each one is converted to a sequence of bits, which can then be fed to a BitConsumer built from the WebAssembly interpreter. A path is converted to bits in the EntropyTransducer module, which also exports a function that is used to invert the integer conversion during symbolic execution.

traceBits :: [Event a b] -> [Bool]
rangeSelect :: (Integer, Integer) -> [Bool] -> Maybe (Integer, [Bool])

Function rangeSelect consumes only as many bits as are needed to code for an integer in the given range. Any bits left over are returned along with the integer. If there aren’t enough input bits to code for an integer, rangeSelect returns Nothing.
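
The encoding used by the real rangeSelect is not given here, so the following is only a guess at one way to meet the description. It assumes that the bounds are inclusive, that bits are read most significant first, and that values beyond the range wrap around; the names rangeSelectSketch and bitsNeeded are made up.

```haskell
-- Number of bits needed to distinguish the given number of alternatives.
bitsNeeded :: Integer -> Int
bitsNeeded size = length (takeWhile (< size) (iterate (* 2) 1))

-- Consume just enough bits to pick an integer in the inclusive range
-- (lo, hi), returning it together with the unused bits. Returns Nothing
-- if the range is empty or there are too few bits.
rangeSelectSketch :: (Integer, Integer) -> [Bool] -> Maybe (Integer, [Bool])
rangeSelectSketch (lo, hi) bits
  | hi < lo           = Nothing
  | length chosen < k = Nothing
  | otherwise         = Just (lo + value `mod` size, rest)
  where
    size = hi - lo + 1
    k = bitsNeeded size
    (chosen, rest) = splitAt k bits
    -- Interpret the chosen bits as a binary number, most significant first.
    value = foldl (\acc b -> 2 * acc + (if b then 1 else 0)) 0 chosen
```

With these assumptions, rangeSelectSketch (0,3) [True,False,True] consumes two bits and returns Just (2,[True]), while rangeSelectSketch (5,7) [True] returns Nothing because two bits are needed.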


The infrastructure described above was used to test several components of GHC’s new translator:

The first three components are now part of GHC.

Testing uncovered a handful of faults, the most notable of which were as follows:

  • A misunderstanding about the node-splitting algorithm resulted in an infinite loop. The fault was corrected by reimplementing the algorithm.

  • When generating code for a case expression that scrutinizes type Int, GHC specifies a jump table of size 2^{64}. Attempting to generate a WebAssembly table with that many entries made GHC run out of memory. The fault was corrected by calling cmmImplementSwitchPlans. This function converts an insanely large jump table to a decision tree, each leaf of which holds a jump table of reasonable size.

  • When generating loop forms, the translation to WebAssembly mistakenly put some code before the loop header; this code should have gone after the loop header. The fault was corrected by swapping two lines of code in the translator.

Testing also failed to uncover a notable fault: I misread the WebAssembly specification and thought that reaching the end of a loop went back to the beginning of the loop. In fact, it exits. This fault was not uncovered until the translation was integrated into GHC. The fault was corrected by changing one line in the translator: I removed a function call that made the “fallthrough context” of a loop equal to its entry point. To make the tests pass again, I made a corresponding change to evalWasm: on encountering the end of a loop, the corrected code no longer pushes the loop back onto its evaluation stack.


GHC is a big beast, and it’s hard to wrangle. We’re pleased that we were able to test the trickiest part of the WebAssembly code generator before fully integrating it with GHC and a live WebAssembly platform. And the tests remain present, so if someone forgets to enable the node-splitting algorithm in the production back end (cough, cough), we can quickly rule out the translation as the source of the fault.

June 01, 2023 12:00 AM

May 31, 2023

Mark Jason Dominus

More about _Cozzi v. Village of Melrose Park_

Earlier today I brought up the case of Cozzi v. Village of Melrose Park and the restrained but unmistakably threatening tone of the opening words of the judge's opinion in that case:

The Village of Melrose Park decided that it would be a good idea

I didn't want to distract from the main question, so I have put the details in this post instead. The case is Cozzi v. Village of Melrose Park N.D.Ill. 21-cv-998, and the judge's full opening paragraph is:

The Village of Melrose Park decided that it would be a good idea to issue 62 tickets to an elderly couple for having lawn chairs in their front yard. The Village issued ticket after ticket, imposing fine after fine, to two eighty-year-old residents, Plaintiffs Vincent and Angeline Cozzi.

The full docket is available on CourtListener. Mr. Cozzi died in February 2022, sometime before the menacing opinion was written, and the two parties are scheduled to meet for settlement talks next Thursday, June 8.

The docket also contains the following interesting entry from the judge:

On December 1, 2021, George Becker, an attorney for third-party deponent Brandon Theodore, wrote a letter asking to reschedule the deposition, which was then-set for December 2. He explained that a "close family member who lives in my household has tested positive for Covid-19." He noted that he "need[ed] to reschedule it" because "you desire this deposition live," which the Court understands to mean in-person testimony. That cancellation made perfect sense. We're in a pandemic, after all. Protecting the health and safety of everyone else is a thoughtful thing to do. One might have guessed that the other attorneys would have appreciated the courtesy. Presumably Plaintiff's counsel wouldn't want to sit in a room with someone possibly exposed to a lethal virus. But here, Plaintiff's counsel filed a brief suggesting that the entire thing was bogus. "Theodore's counsel cancelled the deposition because of he [sic] claimed he was exposed to Covid-19.... Plaintiff's counsel found the last minute cancellation suspect.... " That response landed poorly with the Court. It lacked empathy, and unnecessarily impugned the integrity of a member of the bar. It was especially troubling given that the underlying issue involves a very real, very serious public health threat. And it involved a member of Becker's family. By December 16, 2021, Plaintiff's counsel must file a statement and reveal whether Plaintiff's counsel had any specific reason to doubt the candor of counsel about a family member contracting the virus. If not, then the Court suggests a moment of quiet reflection, and encourages counsel to view the filing as a good opportunity for offering an apology.

by Mark Dominus at May 31, 2023 11:33 PM

May 30, 2023

Tweag I/O

Announcing Tf-Ncl 0.1

With Nickel releasing 1.0 I’m excited to announce the 0.1 release of Tf-Ncl, an experimental tool for writing Terraform deployments with Nickel instead of HCL.

Tf-Ncl enables configurations to be checked against Terraform provider-specific contracts, before calling Terraform to perform the deployment. Nickel can natively generate outputs as JSON, YAML or TOML; since Terraform can accept its deployment configuration as JSON, you can straightforwardly export a Nickel configuration, adhering to the right format, to Terraform. Tf-Ncl provides a framework for ensuring a Nickel configuration has this specific format. Specifically, Tf-Ncl is a tool to generate Nickel contracts that describe the configuration schema expected by a set of Terraform providers.

This approach means that Terraform doesn’t need to know or care that Nickel has generated its deployment configuration. State management is entirely unaffected. And deployments written with Nickel can instruct Terraform to use existing HCL modules, making it possible to migrate a configuration incrementally. You can start using Nickel’s programming features without committing to a complete rewrite of all your configuration at once. Having the full power of Nickel available makes it possible to describe the important parameters of your deployment in a format that suits your application while minimizing duplication. Then you can write Nickel code to generate the necessary Terraform resource definitions in all their complexity. For example, you could maintain a list of user accounts with associated data like team membership and admin status, and then generate appropriate Terraform resources setting up the referenced teams and their member accounts. Later in this post, I’ll show you how to achieve a simplified version of this.

Tf-Ncl is a tech demo to show what is possible with Nickel and should be considered experimental at this time. But we do hope to improve it and your feedback will be essential for that.

Trying It Out

The quickest and easiest way to set up an example project is to use Nix flakes:

nix flake init -t github:tweag/tf-ncl#hello-tf

This will leave you with two files in the current directory, flake.nix and main.ncl. The flake.nix file defines a Nix flake which provides a shell environment with: nickel, the Nickel CLI; nls, the Nickel language server; and topiary, Tweag’s Tree-sitter based formatter. It also contains shell scripts to link the generated Nickel contracts into the current directory and to call Terraform with the result of a Nickel evaluation. Enter the development shell environment with:

nix develop

Now you can evaluate the Nickel configuration in main.ncl using:

run-nickel

Calling run-nickel doesn’t perform any Terraform operations yet; it just evaluates the Nickel code in main.ncl to produce a JSON file. The latter can be understood by Terraform and is treated just like an HCL configuration would be. In the hello-tf example, the deployment consists of a single null_resource with a local-exec provisioner that just prints Hello, world!. Continuing with our example, you can now initialize Terraform and apply the Terraform deployment to get your greeting:

terraform init
terraform apply

You can also combine the Nickel evaluation with the call to Terraform using the run-terraform wrapper script:

run-terraform apply

Let’s take a look at this tiny example deployment. It is configured in main.ncl:

let Tf = import "./tf-ncl-schema.ncl" in
{
  config.resource.null_resource.hello-world = {
    provisioner.local-exec = [
      { command = "echo 'Hello, world!'" }
    ],
  },
} | Tf.Config

This Nickel code first imports the contracts generated by Tf-Ncl and binds them to the name Tf. Then it defines a record which contains the overall configuration and declares it to be a Terraform configuration using the syntax | Tf.Config. For this toy example the deployment consists of just a null_resource with an attached local provisioner, that greets everyone it sees.

Let’s try to use this scaffolding for writing an example deployment. Let’s say we want to take a list of GitHub user names and add those to our GitHub organization.

The first thing to do is to declare to Tf-Ncl that we want to use the github Terraform provider. This can be done by adjusting the flake.nix file. The outputs section of the flake defines a devShell using a Tf-Ncl provided function. This function is what we need to customize:

outputs = inputs: inputs.utils.lib.eachDefaultSystem (system:
    devShell =${system}.mkDevShell {
      providers = p: {
        inherit (p) null;

This is the place where you can specify which Terraform providers your deployment will need. These are also the providers for which Tf-Ncl will generate Nickel contracts. To have Tf-Ncl generate contracts for the GitHub Terraform provider as well as the Terraform internal null provider, you would replace the function passed as providers to the mkDevShell function, i.e.:

providers = p: {
  inherit (p) null github;
};

Having done that, you need to re-enter the development environment by exiting the current one and running nix develop again. Afterwards the wrapper scripts run-nickel and run-terraform will all use the new contracts including the GitHub provider. Now, let’s write some Nickel to turn a list of GitHub user names into Terraform resources. Start with the hello-tf scaffold, remove the null_resource and add the users list:

let Tf = import "./tf-ncl-schema.ncl" in
{
  users = [ "alice", "bob", "charlie" ],
  config = {
    provider.github = [
      {
        token = "<placeholder-token>", # Don't do this in production!
        owner = "<placeholder-organization>",
      }
    ],
  },
} | Tf.Config

I’ve also added a provider section that will tell the GitHub Terraform provider which organization it should manage. If you do this for real, don’t put an authorization token in the configuration directly. Rather, use Terraform variables or data sources to retrieve secrets. The next step will be to process the list of usernames into github_membership resource blocks for Terraform. For that, you can use Nickel’s standard library to map over the users array.

This will leave you with an array of records. But what’s needed is a single record containing all the fields. The Nickel library function std.record.merge_all provides that functionality. Nickel has the F#- and OCaml-inspired |> operator, which makes writing this kind of pipelined function application quite ergonomic. Here’s how to use it for defining memberships:

memberships =
  users
  |> std.array.map (fun user => {
    resource.github_membership."%{user}-membership" = {
      username = user,
      role = "member",
    }
  })
  |> std.record.merge_all
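For readers more at home in Python, the map-then-merge pipeline can be sketched roughly as follows. This is only an analogy, under the assumption that merging records behaves like a recursive dictionary merge that rejects conflicting leaf values; deep_merge and membership are hypothetical helper names and are not part of Nickel or Tf-Ncl.

```python
from functools import reduce


def deep_merge(left: dict, right: dict) -> dict:
    """Recursively merge two nested dicts, refusing conflicting leaves."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        elif merged[key] != value:
            raise ValueError(f"conflicting values for {key!r}")
    return merged


def membership(user: str) -> dict:
    """Build the nested record for one user's github_membership resource."""
    return {
        "resource": {
            "github_membership": {
                f"{user}-membership": {"username": user, "role": "member"}
            }
        }
    }


users = ["alice", "bob", "charlie"]

# Map each user to a one-resource record, then fold them into one record.
memberships = reduce(deep_merge, map(membership, users), {})
```

A plain dict.update would not do here: every per-user record shares the same top-level resource key, so a shallow merge would keep only the last user.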

Finally, the resulting memberships record needs to be combined with the provider configuration in the field config. That can be done with Nickel’s merging operator &. In summary, here’s the deployment:

let Tf = import "./tf-ncl-schema.ncl" in
{
  users = [ "alice", "bob", "charlie" ],
  memberships = users
    |> std.array.map
      (fun user => {
        resource.github_membership."%{user}-membership" = {
          username = user,
          role = "member",
        }
      })
    |> std.record.merge_all,
  config = {
    provider.github = [{
      token = "<placeholder-token>",
      owner = "<placeholder-organization>",
    }],
  }
  & memberships,
} | Tf.Config

Try to have Terraform generate a plan for the deployment:

$ run-terraform plan

Terraform will perform the following actions:

  # github_membership.alice-membership will be created
  + resource "github_membership" "alice-membership" {
      + etag     = (known after apply)
      + id       = (known after apply)
      + role     = "member"
      + username = "alice"
    }

  # github_membership.bob-membership will be created
  + resource "github_membership" "bob-membership" {
      + etag     = (known after apply)
      + id       = (known after apply)
      + role     = "member"
      + username = "bob"
    }

  # github_membership.charlie-membership will be created
  + resource "github_membership" "charlie-membership" {
      + etag     = (known after apply)
      + id       = (known after apply)
      + role     = "member"
      + username = "charlie"
    }

Plan: 3 to add, 0 to change, 0 to destroy.

It works 🎉 You can take a look at the entire example in the Tf-Ncl repository or by using a Nix flake template:

$ nix flake init -t github:tweag/tf-ncl#github-simple

If you happen to have existing HCL modules, those can be included in the Nickel configuration for an incremental migration. For example, let’s say example-module/ contains the following module:

variable "greeting" {
  type = string
}

resource "null_resource" "greeter" {
  provisioner local-exec {
    command = "echo ${var.greeting}"
  }
}

Then this can be included from the top-level main.ncl by modifying the config attribute to include an instruction to Terraform to instantiate the module with some parameters. That is, you could use the following:

{
  # [...]
  config = {
    # [...]
    module.greeter = {
      source = "./example-module",
      greeting = "Hello, world!",
    }
  }
  & memberships,
} | Tf.Config

Future Directions

At this point, Tf-Ncl should be considered a tech demo for Nickel. While it can produce working deployments for Terraform, there are various areas that still need improvement. For one, the generated contracts can be huge for featureful providers. While this is actually a great benchmark for Nickel’s evaluator, it can cause problems; for example, asking the Nickel language server for completion candidates may time out for very large contracts. I’m looking into changing the structure of the Tf-Ncl contracts to make them more modular and easier to process piecewise. There are also limitations to Tf-Ncl’s handling of provider computed fields for Terraform. But more on that in a coming deep dive blog post on the technical challenges of building Tf-Ncl.

Tf-Ncl is a new tool. Feedback is essential for improving it. Please try out Nickel and Tf-Ncl, find new uses, break it and, most importantly, tell us about it!

May 30, 2023 12:00 AM

May 28, 2023

Mark Jason Dominus

The Master of the Pecos River returns

Lately I have been enjoying Adam Unikowsky's Legal Newsletter which is thoughtful, informative, and often very funny.

For example a recent article was titled “Why does doctrine get so complicated?”:

After reading Reed v. Goertz, one gets the feeling that the American legal system has failed. Maybe Reed should get DNA testing and maybe he shouldn’t. But whatever the answer to this question, it should not turn on Article III, the Rooker-Feldman doctrine, sovereign immunity, and the selection of one from among four different possible accrual dates. Some disputes have convoluted facts, so one would expect the legal analysis to be correspondingly complex. But this dispute is simple. Reed says DNA testing would prove his innocence. The D.A. says it wouldn’t. If deciding this dispute requires the U.S. Supreme Court to resolve four difficult antecedent procedural issues, something has gone awry.

Along the way Unikowsky wanted to support the claim that:

law requires the shallowest degree of subject-matter expertise of any intellectual profession

and, comparing the law with fields such as medicine, physics or architecture which require actual expertise, he explained:

After finishing law school, many law students immediately become judicial law clerks, in which they are expected to draft judicial opinions in any area of law, including areas to which they had zero exposure in law school. If a judge asks a law clerk to prepare a judicial opinion in (say) an employment discrimination case, and the student expresses concern that she did not take Employment Law in law school, the judge will assume that the law clerk is making a whimsical joke.

I laughed at that.

Still from “Arrested Development” of Michael Bluth's hand holding a brown paper bag labeled ‘DEAD DOVE Do Not Eat!’ which he has just found in his refrigerator.  In the show, Michael looks inside, makes a face, and says “I don't know what I expected.”

Anyway, that was not what I planned to talk about. For his most recent article, Unikowsky went over all the United States Supreme Court cases from the last ten years, scored them on a five-axis scale of interestingness and importance, and published his rankings of “the least significant cases of the decade”.

Reading this was a little bit like the time I dropped into Reddit's r/notinteresting forum, which I joined briefly, and then quit when I decided it was not interesting.

I think I might have literally fallen asleep while reading about U.S. Bank v. Lakeridge, despite Unikowsky's description of it as “the weirdest cert grant of the decade”:

There was some speculation at the time that the Court meant to grant certiorari on the substantive issue of “what’s a non-statutory insider?” but made a typographical error in the order granting certiorari, but didn’t realize its error until after the baffled parties submitted their briefs, after which the Court decided, whatever, let’s go with it.

Even when the underlying material was dull, Unikowsky's writing was still funny and engaging. There were some high points. Check out his description of the implications of the decision in Amgen, or the puzzled exchange between Justice Sotomayor and one of the attorneys in National Association of Manufacturers.

But one of the cases on his list got me really excited:

The decade’s least significant original-jurisdiction case, selected from a small but august group of contenders, was Texas v. New Mexico, 141 S. Ct. 509 (2020). In 1988, the Supreme Court resolved a dispute between Texas and New Mexico over equitable apportionment of the Pecos River’s water.

Does this ring a bell? No? I don't know that many Supreme Court cases, but I recognized that one. If you have been paying attention you will remember that I have blogged about it before!

I love when this happens. It is a bit like when you have a chance meeting with a stranger while traveling in a foreign country, spend a happy few hours with them, and then part, expecting never to see them again, but then years later you are walking in a different part of the world and there they are going the other way on the same sidewalk.

by Mark Dominus at May 28, 2023 09:11 PM

May 26, 2023

Mark Jason Dominus

Hieroglyphic monkeys holding stuff

I recently had occasion to mention this Unicode codepoint with the undistinguished name EGYPTIAN HIEROGLYPHIC SIGN E058A:

In a slightly more interesting world it would have been called STANDING MONKEY HOLDING SEVERED HEAD.

Unicode includes a group of eight similar hieroglyphic signs of monkeys holding stuff. Screenshots are from Unicode proposal N1944, Encoding Egyptian Hieroglyphs in Plane 1 of the UCS. The monkeys are on page 27. The names are my own proposals.


That monkey looks altogether too pleased with itself for my liking.


I have no idea what the triangle thingy is supposed to be. A thorn? A bread cone maybe? The object on the monkey's head is the crown of northern Egypt.


What if you want to type the character for a standing monkey holding the left eye of Ra? I suppose you have to compose several codepoints?


Is it a ball? An orb? A bowl? A dolerite pounder?


I have no idea what the flower thingy is supposed to represent. Budge's dictionary classifies it with the “trees, plants, flowers, etc.” but assigns it only a phonetic value. (Budge, E. Wallis; An Egyptian Hieroglyphic Dictionary (London 1920), v.1, p. cxxiii)


The monkey is holding, but not wearing, the crown of southern Egypt.


This last one is amazing.

I think the hook by the monkey's foot is a sign with no meaning other than the ‘s’ sound.

The object in the monkey's left hand is quite common in hieroglyphic writing but I do not know what it is. Budge (p.cxxxiii) says it is a “sacred object worshipped in the Delta” and that it is pronounced “tcheṭ” or “ṭeṭ”, but I have not been able to find what it is called at present. Hmmm…

Aha! It is called djed:

It is a pillar-like symbol in Egyptian hieroglyphs representing stability. It is associated with the creator god Ptah and Osiris, the Egyptian god of the afterlife, the underworld, and the dead. It is commonly understood to represent his spine.

Thanks to Wikipedia's list of hieroglyphs.

Addendum: This morning I feel a little foolish because I found tcheṭ in the “list of hieroglyphic characters” section of Budge's dictionary, but when I didn't know what it was, it didn't occur to me to actually look it up in the dictionary.

Screencap of the entry from Budge's dictionary, defining tcheṭ.  The glyph is a sort of pillar or column with a fluted middle and a sort of vertebral thing on top.  The definition reads: “an amulet that was supposed to endue the wearer with the permanence and stability of the backbone of Osiris”.  Then there is another hieroglyph that incorporates tcheṭ as a component, glossed as “the backbone of Osiris, the sacrum bone”.

by Mark Dominus at May 26, 2023 01:55 PM

GHC Developer Blog

GHC 9.2.8 is now available

Zubin Duggal - 2023-05-26

The GHC developers are happy to announce the availability of GHC 9.2.8. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing one issue found in 9.2.7:

  • Fix a bug with RTS linker failing with ‘internal error: m32_allocator_init: Failed to map’ on newer Linux kernels (#19421).

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling,

  • Zubin

by ghc-devs at May 26, 2023 12:00 AM

May 25, 2023

Mark Jason Dominus

Egyptian crocodile hieroglyphs in Unicode

A while back Rik Signes brought my attention to the Unicode codepoint with the long and peculiar name TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR (U+0c7e) and this inspired me to write an article about what it was for.

Recently I was looking into how Egyptian hieroglyphic characters are encoded in Unicode. The possible character set is quite large; for example here's the name of the god Osiris:

Hieroglyph consisting of three components.  At right, the figure of a bearded, kneeling man.  At left, a polygon representing a throne, above a human eye.

Is this a single codepoint? No, there are codepoints for the three components of the hieroglyph (the kneeling bearded man, the eye, and the polygon thingy that represents a throne), and then some combining characters to say how they should be assembled, also combining characters to indicate notations like cartouches and rubrics.

(I learned this hieroglyphic with the eye part uppermost and the man and throne side-by-side below, but I suppose Egyptian spelling must have changed over the millennia.)

The codepoints themselves have disappointing names like EGYPTIAN HIEROGLYPHIC SIGN A049, which I think is the designation for the man-with-beard component. But the original proposal hints at something greater. It suggests, and immediately rejects, a descriptive nomenclature including such names as BABY CHICK, OWL, HARE,





If you want to look it up, it is known as EGYPTIAN HIEROGLYPHIC SIGN I098, found on pages 36–37 of the formal proposal. The suggested glyph looks like this:

Hieroglyphic symbol of a recumbent crocodile with cobra headdress and flagellum.

Addendum: Rik informs me that he brought the Telugu fraction to my attention not, as I remembered, because it was longest but because it was curious. At the time the longest designation was


which has since been supplanted by the twins


I still regret that STANDING MONKEY HOLDING SEVERED HEAD is not the name of a Unicode codepoint.

Hieroglyphic symbol of a standing monkey holding a severed head.

[ Addendum 20230526: More hieroglyphic monkeys holding stuff. ]

by Mark Dominus at May 25, 2023 03:21 PM

Tweag I/O

Functional Python, Part III: The Ghost in the Machine

Tweagers have an engineering mantra — Functional. Typed. Immutable. — that begets composable software which can be reasoned about and avails itself to static analysis. These are all “good things” for building robust software, which inevitably lead us to using languages such as Haskell, OCaml and Rust. However, it would be remiss of us to snub languages that don’t enforce the same disciplines, but are nonetheless popular choices in industry. Ivory towers are lonely places, after all.

Last time I wrote about how we can use Python’s1 abstract base classes to express useful concepts and primitives that are common in functional programming languages. In this final episode, I’ll cover testing strategies that can be learnt from functional programming and applied to your Python code.

I, Test

It’s hardly a revelation that testing the code we write — and automating that process — is an essential element of software engineering practice. Besides the headline case of catching bugs and regressions, a well-written test suite can exemplify how a codebase is intended to be used and can even guide how code is structured for the better.2

“Well-written”, however, is the operative word. One can easily fall into the folly of optimising for coverage, say, rather than focusing on what matters, such as the expected behaviour of the system under test or how its parts interface with each other. Dogmatic adherents to various methodologies can waste time ticking boxes, while bugs slip through the cracks. In a dynamically typed language, like Python, it can be tempting to just throw tests together and call it a day!

How, besides discipline, can we do better? Code often contains a tantalising amount of metadata, such as type signatures. Can we leverage that to facilitate testing; to reduce the cognitive burden on the engineer? Is it possible for the machine to find edge cases — to improve coverage and resiliency — rather than relying on our inherently fallible selves?

As you may suspect, all these questions have been answered with a resounding “Yes!” in functional programming ecosystems. So let’s look at what we can do in Python.

Property-Based Testing for Fun and Profit

It’s common to see tests that use hard-coded examples, asserting against an expected output. These examples are often cherry-picked to elicit the (presumed) behaviour of the system under test. Better still is if the examples are engineered to trigger edge cases and failure modes.

The problem here is that the input space could be vast and the combinatorial effect will quickly overcome a human’s ability to reliably identify exemplars. Moreover, the relationship between the input and the expected output is, at best, only implicitly expressed; subtleties are easily lost.

Take, for example, sorting: A selection of inputs and circumstantially sorted outputs only gives anecdotal evidence of what’s being tested. The tests may be backed up by comments or the name of the test itself, but is this really sufficient? Say we’re testing a stable sort; while the before-and-after states will imply its correctness under scrutiny, this is not going to be immediately obvious to a casual reader.

Property-based testing (PBT) turns this around by having you define that relationship (i.e., the “properties” of the system) and generating the examples for you. That generation is particularly clever, where most PBT libraries will steer the examples towards common minimal failure states; a process known as “shrinking”.3 Your test suite is thus transformed into a specification of the system under test, which is a far stronger guarantee that it’s doing what it’s supposed to, while simultaneously documenting those expectations.
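To make generation and shrinking concrete, here is a deliberately naive sketch that uses only the standard library. It is not how Hypothesis or QuickCheck work internally; find_counterexample, shrink and the intentionally wrong property reverse_is_identity are hypothetical names chosen for illustration.

```python
import random
from typing import Callable, Optional


def shrink(xs: list[int], fails: Callable[[list[int]], bool]) -> list[int]:
    """Greedily simplify a failing input for as long as it keeps failing."""
    improved = True
    while improved:
        improved = False
        # Candidate simplifications: drop one element...
        candidates = [xs[:i] + xs[i + 1:] for i in range(len(xs))]
        # ...or halve one non-zero element towards zero.
        candidates += [
            xs[:i] + [x // 2 if x > 0 else -(-x // 2)] + xs[i + 1:]
            for i, x in enumerate(xs)
            if x != 0
        ]
        for candidate in candidates:
            if fails(candidate):
                xs, improved = candidate, True
                break
    return xs


def find_counterexample(
    prop: Callable[[list[int]], bool], trials: int = 200, seed: int = 0
) -> Optional[list[int]]:
    """Try random integer lists; return a shrunk failing input, if any."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]
        if not prop(xs):
            return shrink(xs, lambda ys: not prop(ys))
    return None


# A deliberately wrong property: "reversing a list never changes it".
def reverse_is_identity(xs: list[int]) -> bool:
    return list(reversed(xs)) == xs


counterexample = find_counterexample(reverse_is_identity)
```

The random generator will typically stumble on a long, noisy failing list first; the shrinking pass then pares it down to a much shorter list with small values, which is far easier to reason about.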

Back to our stable sort example, the properties might be that, given an input list:

  • Each element in the input is accounted for in the output; no more, no fewer.
  • Each element (after the first) in the output should be greater than or equal to the one preceding it, however element comparison is defined.
  • The relative order of equal elements in the input must be preserved in the output.
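
These three properties translate fairly directly into executable checks. The following is a minimal sketch using only the standard library, with Python's built-in sorted (which is documented to be stable) standing in for the sort under test; check_stable_sort is a hypothetical helper name.

```python
from collections import Counter
from typing import Any, Callable, Hashable, Sequence


def check_stable_sort(
    sort: Callable[..., list],
    xs: Sequence[Hashable],
    key: Callable[[Any], Any],
) -> None:
    """Assert the three stable-sort properties for a single input."""
    out = sort(xs, key=key)

    # 1. Same elements with the same multiplicities: no more, no fewer.
    assert Counter(out) == Counter(xs)

    # 2. Each element is >= the one preceding it, under the given key.
    assert all(key(a) <= key(b) for a, b in zip(out, out[1:]))

    # 3. Elements with equal keys keep their relative input order.
    for k in {key(x) for x in xs}:
        assert [x for x in out if key(x) == k] == [x for x in xs if key(x) == k]


# Pairs are compared by their first component only; the second component
# lets us tell apart elements that compare as equal.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d"), (0, "e")]
check_stable_sort(sorted, records, key=lambda pair: pair[0])
```

A sort that orders the keys correctly but scrambles equal elements would pass the first two checks and fail only the third, which is exactly the subtlety a handful of hard-coded examples tends to hide.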

If you’ve ever learnt Haskell, you may have been introduced to QuickCheck — which popularised PBT — often to entice you with what can be achieved. In the 20+ intervening years, PBT libraries now exist for many programming languages; Python is no different, with Hypothesis as its de facto PBT offering. So let’s stop navel gazing and get our hands dirty!

One, Two. One, Two. This is Just a Test.

In our previous discussions, we defined a List type and its monoid. We demonstrated, in the Python REPL, that, say, concatenation appears to behave as we expect, but as stated above, this is not a rigorous test. Instead, let’s look at what properties we’d expect from list concatenation and test those instead. The obvious ones might be:

  1. We should start by reassuring ourselves that our “so called monoid” actually obeys the monoid rules: associativity and the existence of an identity element.

  2. As a gate-keeping sanity check, the length of the concatenated list ($A \star B$) must be equal to the sum of the lengths of its inputs ($A$ and $B$, respectively): $\|A \star B\| = \|A\| + \|B\|$.

  3. The (0-indexed) $i$th element of $A \star B$ should equal:

    • the $i$th element of $A$, when $0 \le i < \|A\|$;

    • the $(i - \|A\|)$th element of $B$, when $\|A\| \le i < \|A \star B\|$.

We can quite easily convert these properties into code:4

from typing import TypeVar

T = TypeVar("T")

def test_monoid(a: List[T], b: List[T], c: List[T]):
    # Associativity: (a * b) * c == a * (b * c)
    assert (a.mappend(b)).mappend(c) == a.mappend(b.mappend(c))

    # Identity: e * a == a == a * e
    identity = List.mempty()
    assert identity.mappend(a) == a == a.mappend(identity)

def test_concatenation(a: List[T], b: List[T]):
    concatenated = a.mappend(b)

    assert len(concatenated) == len(a) + len(b)

    for i in range(len(concatenated)):
        if i < len(a):
            assert concatenated[i] == a[i]
        else:
            assert concatenated[i] == b[i - len(a)]

I hope you’d agree that these tests are clearly defining the expected behaviour. While the entire input space won’t be used for our monoid rule test — it’s a test, not a proof — having many generated examples is surely better than a handful. The question now becomes, “Where do the generated inputs come from?”

Enter Hypothesis, to help us out.

Like many other PBT libraries, Hypothesis provides an extensive suite — which it calls “strategies” — of primitive value generators, including from type introspection, as well as combinators which allow you to build more complex value generators. These can then be utilised by Hypothesis in a test harness, which generates the input values under test.

These candidate values are pseudorandom and the quantity configurable. The more examples that are generated, the greater confidence you have that your code correctly satisfies the properties you’ve defined; albeit at the expense of test runtime.5 That said, Hypothesis keeps state, so falsifying examples will be retried when they’re found, and it will focus its search around common failure states.

Let’s demonstrate with a simple example. Consider the property of real numbers being positive when squared. We can formulate this as a Hypothesis test6 as:

from hypothesis import given
from hypothesis.strategies import floats

# NOTE IEEE-754 floats encode a NaN value, which we want to avoid
@given(floats(allow_nan=False))
def test_square(v: float):
    assert v * v > 0

Oops! Running this will fail, because we got our property wrong. The squares of real numbers aren’t strictly positive; they are non-negative. Hypothesis rightfully complains and shows us where things went wrong:

Falsifying example: test_square(

So our task, to test the properties of our list monoid, is to write a strategy, using Hypothesis’ primitives, to generate Lists.7 To keep things simple, we’ll generate Lists of integers — the type of the elements shouldn’t affect concatenation, after all — and we won’t bother parametrising size limits, like the Hypothesis lists strategy does.

So, as a first approximation:

from hypothesis import strategies as st

@st.composite
def lists(draw: st.DrawFn) -> List[int]:
    # Draw an integer element, or None to signal the list terminator
    element = draw(st.one_of(st.none(), st.integers()))

    match element:
        case None:
            return Nil()

        case _:
            # Recursively draw from this strategy to extend the list
            return Cons(element, draw(lists()))

The magic is in the composite decorator and the draw function it provides. It’s a very clean API that makes writing complex strategies straightforward.

We can now use this strategy with the property test functions we wrote earlier:

@given(lists(), lists(), lists())
def test_monoid(a: List[int], b: List[int], c: List[int]):
    ...  # body unchanged from the earlier definition

@given(lists(), lists())
def test_concatenation(a: List[int], b: List[int]):
    ...  # body unchanged from the earlier definition

Lo and behold, it works!

To reassure ourselves, let’s try to break it by purposely introducing a bug in the List.mappend implementation, keeping our properties as previously defined. For example, we can swap the input lists around in the concatenation easily:

def mappend(self, rhs: List[Monoid[T]]) -> List[Monoid[T]]:
    # Should be: foldr(Cons, rhs, self)
    return foldr(Cons, self, rhs)

Immediately, Hypothesis complains. The monoid rules still hold, but the concatenation is obviously incorrect:

AssertionError: Cons(0, Nil()) * Cons(1, Nil()) != Cons(1, Cons(0, Nil()))
Falsifying example: test_append(
    a=Cons(0, Nil()),
    b=Cons(1, Nil()),
)

At this point we should feel pretty smug about our List implementation!

The Devil’s in the Details

You’d be forgiven for thinking of PBT as a panacea — and let’s be honest: in many ways it is — however, it is not without its practical shortcomings. In particular, despite being more resilient than its human counterparts, a machine is still limited by the combinatorial explosion problem I mentioned earlier.

It is so easy to write strategies that one can get carried away. If your domain values are particularly complicated or deeply nested, Hypothesis (or any PBT library) will take an increasing amount of time to compute examples, to the point that testing becomes intractable. Decomposing the problem (another instance of a testing strategy influencing how code is structured) can help alleviate this, but it’s not always possible.

However, of course, using PBT in your project is not some kind of Faustian bargain. It is a tool — a particularly powerful one — that ought to be wielded liberally, but wisely. It wouldn’t make sense, for example, to agonise over a suite of complex strategies to simulate coherent state for an integration test, when rigging up that integration test is ultimately going to be cheaper.

We should also take care not to decry example testing outright. While PBT is the more robust technique, throwing in a handful of examples can provide stronger assurances against known failure states, which we cannot guarantee will be generated by the PBT library. Examples also have illustrative power, which greatly benefits the reader when it comes to maintenance. Implementing these as unit tests, rather than in documentation, also ensures their assertions won’t drift from the truth.

Aside: Terms and Conditions

Complementary to PBT is the concept of “design by contract”, which was originally developed for Eiffel, an object-orientated programming language. Contracts, much like their real-world analogue, can be used to enforce the interface of, say, a function in terms of its expectations (pre-conditions), guarantees (post-conditions) and the state which the function maintains (invariants). Ordinarily, these are checked at runtime — during development — where any violations will cause the program to crash.

There are a few ways to specify contracts in Python:

  • Informally, with asserts;
  • Using a library like Deal or icontract;
  • There’s even a long-dormant proposal to standardise them.
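As a minimal sketch of the first, informal option, a pre-condition and a post-condition can be written as plain asserts. The clamp function below is a hypothetical example, not something from this series, and note that Python skips assert statements entirely when run with the -O flag.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the closed interval [low, high]."""
    # Pre-condition: the caller must supply a well-formed interval.
    assert low <= high, "pre-condition violated: low must not exceed high"

    result = max(low, min(value, high))

    # Post-condition: the function guarantees the result lies in the interval.
    assert low <= result <= high, "post-condition violated: result out of range"
    return result
```

Calling clamp(1, 3, 0) violates the pre-condition and raises AssertionError at runtime, which is the "crash during development" behaviour described above.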

The fine-print of contracts is by the bye and runtime checking is definitely not what this series is about. So why do I mention it? It turns out there are tools that can perform static analysis against contracts. One such tool, in the Python ecosystem, is CrossHair.8

Whereas Hypothesis will generate pseudorandom inputs and shrink them to common failure states, CrossHair works by analysing the execution paths of any code under contract with an SMT solver to find falsifying examples. It understands various kinds of contracts, including those outlined above as well as pre-conditions specified by Hypothesis’ given decorator.

CrossHair is still alpha software. As of writing, it appears to handle our List monoid, with the Hypothesis pre-conditions that we defined above. (Although it starts to misbehave when we introduce a bug.) That said, it’s a promising development in this space that’s worth keeping an eye on.


Testing can be a chore. A necessary evil. Writing good tests, which faithfully and clearly represent the logic of your codebase, is harder still. PBT is like a super-power that inverts the testing burden from hard-coding examples — which, although beginner-friendly, is highly limited — into thinking more deeply about what your code should be doing, leaving the machine to do the leg work. It’s almost addictive!

Throughout this series, I’ve had one goal in mind: to show that techniques from functional programming can be productively applied to “the second best language for everything”. PBT is yet another example of this and Hypothesis is a mature Python library to enable this super-power; of which, I’ve barely scratched the surface. Nonetheless, despite my introductory tour, I hope this and a glimpse of things to come has convinced you to give it — and the other techniques explored in this series — a try.

Thanks to Julien Debon, Guillaume Desforges, Johan Herland, Mark Karpov and Vince Reuter for their reviews of this article.

  1. We are not limited to Python; these techniques can be applied in any language with suitable support, libraries and tooling.
  2. The value of dependency inversion, for example, is not completely obvious until you start writing tests. At which point, you’ll wish you’d done it sooner!
  3. My colleague, Julien Debon, wrote a great article on shrinking strategies in OCaml’s port of QuickCheck. (Julien also takes the credit for introducing me to PBT proper.)
  4. For the sake of the examples, we assume that our List type is equatable, indexable and sized (i.e., implements __eq__, __getitem__ and __len__, respectively) and its element type is equatable. This is left as an exercise for the reader.
  5. For local development, it can be useful to set the number of generated examples fairly low, to keep the development loop tight. However, it makes sense to increase that significantly in the CI/CD environment to increase assurance.
  6. “Hypothesis” is a slightly annoying name for a testing library, as searches for “Hypothesis test” inevitably return results about statistical hypothesis testing! This is one of life’s many trials.
  7. Because we use generics, Hypothesis is not able to derive the generator for our List from type annotations alone. If we lose the class hierarchy that simulates ADTs and stick with concrete element types, then from_type can work against Cons. That’s not a good trade-off, when writing a generator by hand is so straightforward.
  8. My colleague, Conner Baker, first brought CrossHair to my attention. This article was just going to be about PBT, but even at this early stage, this tool is too cool not to briefly mention!

May 25, 2023 12:00 AM

Sandy Maguire

Certainty by Construction Progress Report 3

The following is a progress report for Certainty by Construction, a new book I’m writing on learning and effectively wielding Agda. Writing a book is a tedious and demoralizing process, so if this is the sort of thing you’re excited about, please do let me know!

Week three, and this update is coming in hot, a whole day early! This week I worked on the ring solving chapter, realizing that I can make a very much non-toy solver, and pack it into a chapter. We now build a multivariate semiring solver, discuss how and why it works, and then do some dependent-type shenanigans to put a delightful user interface in front of the whole thing.

In addition, it came with some excellent opportunities to discuss where semantics come from, and let me talk about homomorphisms earlier than I was otherwise hoping to.

My plan for the week was to tackle the remainder of the setoids chapter, but setoids are awful and it’s hard to motivate myself to do that, since I avoid using them in my day-to-day life whenever I can. Which is always. We’ll see what happens with this chapter, but maybe it’ll get melted down into something else. Nevertheless, understanding setoids is important for actually doing anything with the stdlib, so I dunno.

On the typesetting front, I spent an hour today fighting with LaTeX trying to ensure that it has glyphs for every unicode character in the book. I’ve got all but one of them sorted out now, and in the process, learned way more about LaTeX than any human should need to know.

The plan for next week is to clean up the extremely WIP backmatter chapters. There’s a bunch of crap in there about me trying to do math math and failing, because math math doesn’t give two sniffs about constructability, and so none of it works out. If I’m feeling particularly plucky, I might try my hand at defining the reals, just because it might be fun.

As of today’s update, the book is now 360 pages long! I estimate it’ll be about 450 when it’s done, so we’re clearly making progress.

Anyway, that’s all for today. If you’ve already bought the book, you can get the updates for free on Leanpub. If you haven’t, might I suggest doing so? Your early support and feedback helps inspire me and ensure the book is as good as it can possibly be.

May 25, 2023 12:00 AM

May 24, 2023

Brent Yorgey

Competitive programming in Haskell: parsing with an NFA

In my previous post, I challenged you to solve Chemist’s Vows. In this problem, we have to decide which words can be made by concatenating atomic element symbols. So this is another parsing problem; but unlike the previous problem, element symbols are not prefix-free. For example, B and Be are both element symbols. So, if we see BE..., we don’t immediately know whether we should parse it as Be, or as B followed by an element that starts with E (such as Er).

A first try

A parsing problem, eh? Haskell actually shines in this area because of its nice parser combinator libraries. The Kattis environment does in fact have the parsec package available; and even on platforms that don’t have parsec, we can always use the Text.ParserCombinators.ReadP module that comes in base. So let’s try throwing one of those packages at the problem and see what happens!

If we try using parsec, we immediately run into problems; honestly, I don’t even know how to solve the problem using parsec. The problem is that <|> represents left-biased choice. If we parse p1 <|> p2 and parser p1 succeeds, then we will never consider p2. But for this parsing problem, because the symbols are not prefix-free, sometimes we can’t know which of two options we should have picked until later.

ReadP, on the other hand, explicitly has both biased and unbiased choice operators, and can return a list of possible parses instead of just a single parse. That sounds promising! Here’s a simple attempt using ReadP: to parse a single element, we use an unbiased choice over all the element names; then we use many parseElement <* eof to parse each word, and check whether there are any successful parses at all.

{-# LANGUAGE OverloadedStrings #-}

import           Control.Arrow
import           Data.Bool
import qualified Data.ByteString.Lazy.Char8   as C
import           Text.ParserCombinators.ReadP (ReadP, choice, eof, many,
                                               readP_to_S, string)

main = C.interact $
  C.lines >>> drop 1 >>> map (solve >>> bool "NO" "YES") >>> C.unlines

solve :: C.ByteString -> Bool
solve s = case readP_to_S (many parseElement <* eof) (C.unpack s) of
  [] -> False
  _  -> True

elements :: [String]
elements = words $
  "h he li be b c n o f ne na mg al si p s cl ar k ca sc ti v cr mn fe co ni cu zn ga ge as se br kr rb sr y zr nb mo tc ru rh pd ag cd in sn sb te i xe cs ba hf ta w re os ir pt au hg tl pb bi po at rn fr ra rf db sg bh hs mt ds rg cn fl lv la ce pr nd pm sm eu gd tb dy ho er tm yb lu ac th pa u np pu am cm bk cf es fm md no lr"

parseElement :: ReadP String
parseElement = choice (map string elements)

Unfortunately, this fails with a Time Limit Exceeded error (it takes longer than the allotted 5 seconds). The problem is that backtracking and trying every possible parse like this is super inefficient. One of the secret test inputs is almost certainly constructed so that there are an exponential number of ways to parse some prefix of the input, but no way to parse the entire thing. As a simple example, the string crf can be parsed as either c rf (carbon + rutherfordium) or cr f (chromium + fluorine), so by repeating crf n times we can make a string of length 3n which has 2^n different parses. If we fed this string to the ReadP solution above, it would quickly succeed with more or less the first thing that it tried. However, if we stick a letter on the end that does not occur in any element symbol (such as q), the result will be an unparseable string, and the ReadP solution will spend a very long time backtracking through exponentially many parses that all ultimately fail.
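The 2^n count can be sanity-checked with a short sketch, written here in Python rather than Haskell and not part of the submitted solution. The function name count_parses is hypothetical. It counts the ways to split a word into element symbols by dynamic programming over prefixes, so it never enumerates the parses themselves.

```python
# Element symbols, copied from the elements list in the post.
SYMBOLS = (
    "h he li be b c n o f ne na mg al si p s cl ar k ca sc ti v cr mn "
    "fe co ni cu zn ga ge as se br kr rb sr y zr nb mo tc ru rh pd ag cd "
    "in sn sb te i xe cs ba hf ta w re os ir pt au hg tl pb bi po at rn "
    "fr ra rf db sg bh hs mt ds rg cn fl lv la ce pr nd pm sm eu gd tb dy "
    "ho er tm yb lu ac th pa u np pu am cm bk cf es fm md no lr"
).split()
ELEMENTS = frozenset(SYMBOLS)


def count_parses(word: str) -> int:
    """Count the ways to write word as a concatenation of element symbols."""
    # ways[i] is the number of parses of the first i characters.
    ways = [0] * (len(word) + 1)
    ways[0] = 1  # the empty prefix has exactly one (empty) parse
    for end in range(1, len(word) + 1):
        # The last symbol is a single letter.
        if word[end - 1:end] in ELEMENTS:
            ways[end] += ways[end - 1]
        # The last symbol is a two-letter symbol.
        if end >= 2 and word[end - 2:end] in ELEMENTS:
            ways[end] += ways[end - 2]
    return ways[len(word)]
```

Here count_parses("crf" * 10) gives 1024, and appending a q drops the count to 0. Those are the parses a backtracking parser has to explore and reject one by one, whereas the count itself takes time linear in the length of the word.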


The key insight is that we don’t really care about all the different possible parses; we only care whether the given string is parseable at all. At any given point in the string, there are only two possible states we could be in: we could be finished reading one element symbol and about to start reading the next one, or we could be in the middle of reading a two-letter element symbol. We can just scan through the string and keep track of the set of (at most two) possible states; in other words, we will simulate an NFA which accepts the language of strings composed of element symbols.

First, some setup as before.

{-# LANGUAGE OverloadedStrings #-}

import           Control.Arrow              ((>>>))
import           Data.Array                 (Array, accumArray, (!))
import           Data.Bool                  (bool)
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.List                  (partition, nub)
import           Data.Set                   (Set)
import qualified Data.Set                   as S

main = C.interact $
  C.lines >>> drop 1 >>> map (solve >>> bool "NO" "YES") >>> C.unlines

elements :: [String]
elements = words $
  "h he li be b c n o f ne na mg al si p s cl ar k ca sc ti v cr mn fe co ni cu zn ga ge as se br kr rb sr y zr nb mo tc ru rh pd ag cd in sn sb te i xe cs ba hf ta w re os ir pt au hg tl pb bi po at rn fr ra rf db sg bh hs mt ds rg cn fl lv la ce pr nd pm sm eu gd tb dy ho er tm yb lu ac th pa u np pu am cm bk cf es fm md no lr"

Now, let’s split the element symbols into one-letter and two-letter symbols:

singles, doubles :: [String]
(singles, doubles) = partition ((==1).length) elements

We can now make boolean lookup arrays that tell us whether a given letter occurs as a single-letter element symbol (single) and whether a given letter occurs as the first letter of a two-letter symbol (lead). We also make a Set of all two-letter element symbols, for fast lookup.

mkAlphaArray :: [Char] -> Array Char Bool
mkAlphaArray cs = accumArray (||) False ('a', 'z') (zip cs (repeat True))

single, lead :: Array Char Bool
[single, lead] = map (mkAlphaArray . map head) [singles, doubles]

doubleSet :: Set String
doubleSet = S.fromList doubles

Now for simulating the NFA itself. There are two states we can be in: START means we are about to start and/or have just finished reading an element symbol; SEEN c means we have seen the first character of some element (c) and are waiting to see another.

data State = START | SEEN Char
  deriving (Eq, Ord, Show)

Our transition function takes a character c and a state and returns a set of all possible next states (we just use a list since these sets will be very small). If we are in the START state, we could end up in the START state again if c is a single-letter element symbol; we could also end up in the SEEN c state if c is the first letter of any two-letter element symbol. On the other hand, if we are in the SEEN x state, then we have to check whether xc is a valid element symbol; if so, we return to START.

delta :: Char -> State -> [State]
delta c START    = [START | single!c] ++ [SEEN c | lead!c]
delta c (SEEN x) = [START | [x,c] `S.member` doubleSet]

We can now extend delta to act on a set of states, giving us the set of all possible resulting states; the drive function then iterates this one-letter transition over an entire input string. Finally, to solve the problem, we start with the singleton set [START], call drive using the input string, and check whether START (which is also the only accepting state) is an element of the resulting set of states.

trans :: Char -> [State] -> [State]
trans c sts = nub (sts >>= delta c)

drive :: C.ByteString -> ([State] -> [State])
drive = C.foldr (\c -> (trans c >>>)) id

solve :: C.ByteString -> Bool
solve s = START `elem` drive s [START]

And that’s it! This solution is accepted in 0.27 seconds (out of a maximum allowed 5 seconds).

For next time

  • If you want to practice the concepts from my past couple posts, give Haiku a try.
  • For my next post, I challenge you to solve Zapis!

by Brent at May 24, 2023 11:03 AM

May 23, 2023

GHC Developer Blog

GHC 9.6.2 is now available

Ben Gamari - 2023-05-23

The GHC developers are happy to announce the availability of GHC 9.6.2. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing a few issues found in 9.6.1. These include:

  • A number of simplifier and specialisation issues (#22761, #22549)

  • A bug resulting in crashes of programs using the new listThreads# primop (#23071).

  • A compiler crash triggered by certain uses of quantified constraints (#23171)

  • Various bugs in the JavaScript backend have been fixed (#23399, #23360, #23346)

  • A missing write barrier in the non-moving collector’s handling of selector thunks, resulting in undefined behavior (#22930).

  • The non-moving garbage collector’s treatment of weak pointers has been revamped which should allow more reliable finalization of Weak# closures (#22327)

  • The non-moving garbage collector now bounds the amount of marking it will do during the post-marking stop-the-world phase, greatly reducing tail latencies in some programs (#22929)

A full accounting of changes can be found in the release notes. As some of the fixed issues do affect correctness, users are encouraged to upgrade promptly.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling,

- Ben

by ghc-devs at May 23, 2023 12:00 AM

May 22, 2023

Brent Yorgey

New ko-fi page: help me attend ICFP!

tl;dr: if you appreciate my past or ongoing contributions to the Haskell community, please consider helping me get to ICFP by donating via my new ko-fi page!


Working at a small liberal arts institution has some tremendous benefits (close interaction with motivated students, freedom to pursue the projects I want rather than jump through a bunch of hoops to get tenure, fantastic colleagues), and I love my job. But there are also downsides; the biggest ones for me are the difficulty of securing enough travel funding, and, relatedly, the difficulty of cultivating and maintaining collaborations.

I would really like to be able to attend ICFP in Seattle this September; the last time I was able to attend ICFP in person was 2019 in Berlin. With transportation, lodging, food, and registration fees, it will probably come to about $3000. I can get a grant from my institution to pay for up to $1200, but that still leaves a big gap.

As I was brainstorming other sources of funding, it dawned on me that there are probably many people who have been positively impacted by my contributions to the Haskell community (e.g. CIS 194, the Typeclassopedia, diagrams, split, MonadRandom, burrito metaphors…) and/or would like to support my ongoing work (competitive programming in Haskell, swarm, disco, ongoing package maintenance…) and would be happy to donate a bit.

So, to that end, I have set up a ko-fi page.

  • If you have been positively impacted by my contributions and would like to help me get to ICFP this fall, one-time donations — even very small amounts — are greatly appreciated! I’m not going to promise any particular rewards, but if you’re at ICFP I will definitely find you to say thanks!

  • Thinking beyond this fall, ideally this could even become a reliable source of funding to help me travel to ICFP or other collaboration opportunities every year. To that end, if you’re willing to sign up for a recurring monthly donation, that would be amazing — think of it as supporting my ongoing work: blog posts (and book) on competitive programming in Haskell, Swarm and Disco development, and ongoing package maintenance. I will post updates on ko-fi with things I’m thinking about and working on; I am also going to try to publish more frequent blog posts, at least in the near term.

Thank you, friends — I hope to see many people in Seattle! Next up: back to your regularly scheduled competitive programming!

by Brent at May 22, 2023 07:27 PM

May 18, 2023

Sandy Maguire

Certainty by Construction Progress Report 2

The following is a progress report for Certainty by Construction, a new book I’m writing on learning and effectively wielding Agda. Writing a book is a tedious and demoralizing process, so if this is the sort of thing you’re excited about, please do let me know!

It’s week two of regular updates on Certainty by Construction, baby! This week I made 17 commits to the repository, half of which were towards the goal of improving the book’s typesetting. Spurred on by a bug report asking “what the hell does AgdaCmd:MakeCase mean?” I decided to upgrade the book’s build system. Now you should see explicit keystrokes to press when the book asks you to run a command alongside.

You’ll also notice intra-prose syntax highlighting, meaning that if the book mentions a type, it will now be presented in a beautiful blue, among other things in other colors. Agda has some janky support for this, but I couldn’t get it working, which means I annotated each and every piece of syntax highlighting by hand. Please file a bug if you notice I’ve missed any.

Content-wise, the old chapter on “structured sets” has become “relations”, and it has several new sections fleshing out the idea and giving several more examples. I’m now in the middle of rewriting the setoids chapter, but it too has three new sections, and thus the whole thing is no longer all about modular arithmetic.

Next week I’m going to continue powering on with the setoids chapter—including a big digression on what congruence entails under a setoid—and then I think I’ll tackle the ring solving chapter.

For the first time, this book seems like I might not be working on it for the rest of my life. It’s nowhere near done, but the topic and style are finally hashed out, and the content is mostly in an alpha state. From here it’s really just to continue grinding, rewriting all the crap bits over and over again, until they’re no longer crap.

Anyway, that’s all for today. If you’ve already bought the book, you can get the updates for free on Leanpub. If you haven’t, might I suggest doing so? Your early support and feedback helps inspire me and ensure the book is as good as it can possibly be.

May 18, 2023 12:00 AM

May 17, 2023

Tweag I/O

Announcing Nickel 1.0

Today, I am very excited to announce the 1.0 release of Nickel.

A bit more than one year ago, we released the very first public version of Nickel (0.1). Throughout various write-ups and public talks (1, 2, 3), we’ve been telling the story of our dissatisfaction with the state of configuration management.

The need for a New Deal

Configuration is everywhere. The manifest of a web app, the configuration of an Apache virtual host, an Infrastructure-as-Code (IaC) cloud deployment (Terraform, Kubernetes, etc.).

Configuration is more often than not considered a second-class engineering discipline, like a side activity of software engineering. That’s a dangerous mistake: with the advent of IaC for the cloud, configuration has become an important aspect of modern software systems, and a critical point of failure.

All the different but connected configurations composing a system are scattered across many languages, tools, and services (JSON, YAML, HCL, Puppet, Apache’s Tcl, and so on). Configuration management is indeed a cross-cutting concern, making failures harder to predict and often spectacular.

In the last decade, studies have shown that misconfigurations were the second largest cause of service-level disruptions in one of Google’s main production services 1. Misconfigurations also contribute to 16% of production incidents at Facebook 2, including the worst-ever outage of Facebook and Instagram that occurred in March 2019 4. Fastly’s3 outage on the 8th June 2021, which basically broke a substantial part of the internet, was triggered by a configuration issue.

Modern configurations are complex. They require new tools to be dealt with, which is why we developed the Nickel configuration language.

Nickel 1.0

Nickel is a lightweight and generic configuration language. It can replace YAML as your new application’s configuration language, or it can generate static configuration files (YAML, JSON or TOML) to be fed to existing tools.

Unlike YAML, though, it anticipates large configurations by being programmable and modular. To minimize the risk of misconfigurations, Nickel features (opt-in) static typing and contracts, a powerful and extensible data validation framework.

Since the initial release (0.1), we’ve refined the semantics enough to be confident in a core design that is unlikely to radically change.

Since the previous stable version (0.3.1), efforts have been made on three principal fronts: tooling (in particular the language server), the core language semantics (contracts, metadata, and merging), and the surface language (the syntax and the stdlib). Please see the release notes for more details.

Tooling & set-up

Follow the getting started guide from Nickel’s website to get a working binary. Nickel also comes with:

  • An LSP language server.
  • A REPL nickel repl, a markdown documentation generator nickel doc and a nickel query command to retrieve metadata, types and contracts from code.
  • Plugins for (Neo)Vim, VSCode, Emacs, and a tree-sitter grammar.
  • A code formatter, thanks to Tweag’s tree-sitter-based Topiary.

Watch the companion video for a tour of these tools and features.

A primer on Nickel

Let me walk you through a small example showing what you can do in Nickel 1.0. You can try the code examples from this section in the online Nickel playground.

Just a fancy JSON

I’ll use a basic Kubernetes deployment of a MySQL service as a working example. The following is a direct conversion of a good chunk of mysql.yaml into Nickel syntax, omitting uninteresting values:

  apiVersion = "v1",
  kind = "Pod",
  metadata = {
    name = "mysql", = "mysql",

  spec = {
    containers = [
        resources = {
          image = "mysql",
          name = "mysql",
          ports = [
              containerPort = 3306,
              name = "mysql",
          volumeMounts = [
              # name must match the volume name below
              name = "mysql-persistent-storage",
              # mount path within the container
              mountPath = "/var/lib/mysql",
    volumes = [
        name = "mysql-persistent-storage",
        cinder = {
          volumeID = "bd82f7e2-wece-4c01-a505-4acf60b07f4a",
          fsType = "ext4",

This snippet looks like JSON, with a few minor syntax differences. Nickel indeed has the same primitive data types as JSON: numbers (arbitrary precision rationals), strings, arrays, and records (objects in JSON).

The previous example has a lot of repetition: the string "mysql", the name of the app, occurs several times.

Besides, a comment mentions that the name inside the volumeMounts field must match the name of the volume defined inside volumes. Developers are responsible for maintaining this invariant by hand. The absence of a single source of truth might lead to inconsistencies.

Finally, imagine that you now need to reuse the previous configuration several times with slight variations, where the app name and the MySQL port number may change.

This can’t be solved in pure YAML: all you can do is to copy and paste data, and try to manually ensure that copies always all agree. Unlike YAML though, Nickel is programmable.

Reusable configuration

Let’s upgrade our very first example:

# mysql-module.ncl
  config | not_exported = {
    volume_name = "mysql-persistent-storage",

  apiVersion = "v1",
  kind = "Pod",
  metadata = {
    name = config.app_name, = config.app_name,

  spec = {
    containers = [
        resources = {
          limits.cpu = 0.5,
          image = "mysql",
          name = config.app_name,
          ports = [
              containerPort = config.port,
              name = "mysql",
          volumeMounts = [
              name = config.volume_name,
              # mount path within the container
              mountPath = "/var/lib/mysql",
    volumes = [
        name = config.volume_name,
        cinder = {
          volumeID = "bd82f7e2-wece-4c01-a505-4acf60b07f4a",
          fsType = "ext4",

The most important change is the new config field. The rest is almost the same as before, but I replaced the strings "mysql" with config.app_name, "mysql-persistent-storage" with config.volume_name and the hard-coded port number 3306 with config.port.

We can directly use the new fields from within other fields. Indeed, in Nickel, you can refer to another part of a configuration you are defining from inside the very same configuration. It’s a natural way to describe data dependencies, when parts of the configuration are generated from other parts.

The | symbol attaches metadata to fields. Here, | not_exported indicates that config is an internal value that shouldn’t appear in the final exported YAML.

This explains most of the new machinery. But something is still off with this configuration. Let us try to export it:

$ nickel export -f mysql-module.ncl --format yaml
error: missing definition for `port`
   ┌─ mysql.ncl:33:36
 2 │     config | not_exported = {
   │ ╭───────────────────────────'
 3 │ │     port,
 4 │ │     app_name,
 5 │ │     volume_name = "mysql-persistent-storage",
 6 │ │   },
   │ ╰───' in this record
   · │
33 │               containerPort = config.port,
   │                               -------^^^^
   │                               │      │
   │                               │      required here
   │                               accessed here

Indeed, port and app_name don’t have a value! In fact, you should view this snippet as a partial configuration, a configuration with holes to be filled.

To recover a complete configuration, we need the merge operator &. Merging is a primitive operation that recursively combines records together. Merge is also able to provide a definite value to the fields app_name and port:

# file: mysql-final.ncl
(import "mysql-module.ncl") & {
  config = {
    app_name = "mysql-backend",
    port = 10500,

Running nickel export --format yaml -f mysql-final.ncl will now produce a valid YAML configuration.

Partial configurations bear similarities with functions: both are a solution to the problem of how to make repetitive code reusable and composable. But the former seems better suited to writing reusable configuration snippets.

It’s trivial to assemble several partial configurations together (as long as there’s no conflict): just merge them together.
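To make "recursively combines records" concrete, here is a rough analogy in Python using nested dicts. It is only a sketch under stated assumptions: the merge function and its conflict rule are hypothetical, and it models just a small part of what Nickel's & operator does (no priorities, defaults, contracts, or fields that refer to each other).

```python
from typing import Any, Dict

Record = Dict[str, Any]


def merge(left: Record, right: Record) -> Record:
    """Recursively combine two records, failing on conflicting leaves."""
    result = dict(left)
    for key, value in right.items():
        if key not in result:
            # Field defined on one side only: take it as-is.
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            # Both sides define a record: merge them field by field.
            result[key] = merge(result[key], value)
        elif result[key] != value:
            # Both sides define different plain values: a conflict.
            raise ValueError(f"conflicting definitions for field {key!r}")
    return result


# A partial configuration and the record that fills its holes.
module = {"kind": "Pod", "config": {"volume_name": "mysql-persistent-storage"}}
final = merge(module, {"config": {"app_name": "mysql-backend", "port": 10500}})
```

In this analogy, final["config"] ends up with all three fields, while merging {"a": 1} with {"a": 2} raises an error, mirroring the "as long as there's no conflict" caveat.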

Partial configurations are data, which can be queried, inspected, and transformed, as long as the missing fields aren’t required. For example, you can get the list of fields of our partial configuration:

$ nickel query -f mysql-module.ncl

Available fields
• apiVersion
• config
• kind
• metadata
• spec

Finally, partial configurations are naturally overridable. Recall that mysql-final.ncl contains the final complete configuration, and consider this example:

# file: override.ncl
(import "mysql-final.ncl") & {
  config.app_name | force = "mysql_overridden",
  metadata.labels.overridden = "true",

Exporting override.ncl produces a configuration where all the values depending on config.app_name (directly or indirectly) are updated to use the new value "mysql_overridden", and where metadata.labels has a new entry overridden set to "true".

Overriding is useful to tweak existing code that you don’t control, and wasn’t written to be customizable in the first place. It’s probably better to do without whenever possible, but for some application domains, it simply can’t be avoided.

Partial configurations are automatically extensible: you don’t have to even think about it.

On the other hand, functions are opaque values, which need to be fed with arguments before you can do anything with them. Assembling many of them, inspecting them before they are applied or making their result overridable range from technically possible but much more cumbersome to downright impossible.

Correct configurations

Nickel has two main mechanisms to ensure correctness and prevent misconfigurations as much as possible: static typing and contracts.

Static typing is particularly well suited to small reusable functions. The typechecker is rigorous and catches errors early, before the code path is even triggered.

For configuration data, we tend to use contracts. Contracts are a principled way of writing and applying runtime data validators. It feels like typing when used, relying on annotations and contract constructors (array contract, function contract, etc.), but the checks are performed at runtime. In return, you can easily define your own contracts and compose them with the existing ones.

Contracts are useful to validate the produced configuration (in that case, they will probably come from a library or be automatically generated from external schemas such as when writing Terraform configurations in Nickel). They can validate inputs as well:

# file: mysql-module-safe.ncl
let StartsWith = fun prefix =>
  std.contract.from_predicate (std.string.is_match "^%{prefix}")
in
{
  config | not_exported = {
    app_name
      | String
      | StartsWith "mysql"
      | doc m%"
          The name of the mysql application. The name must start with `"mysql"`,
          as enforced by the `StartsWith` contract.
        "%
      | default
      = "mysql",
    port_number
      | Number
      | default
      = 3306,
    volume_name = "mysql-persistent-storage",
  },

  # ...
}

The builtin String contract and the custom contract StartsWith "mysql" have been attached to app_name. The latter enforces that the value starts with "mysql", if we have this requirement for some reason. We won’t go into the details of custom contracts here, but it’s a pretty straightforward function.

We’ve used other metadata as well: doc is for documentation, while default indicates that the following definition is a default value. Metadata are leveraged by the Nickel tooling (the LSP, nickel query, nickel doc, etc.).

If we provide a faulty value for app_name, the evaluation will raise a proper error.

Going further

You can look at the main README for a general description of the project. The first blog post series explains the inception of Nickel, and the following posts focus on specific aspects of the language. The most complete source remains the user manual.

Configure your configuration

Zooming out from technical details, I would like to paint a broader picture here. My overall impression is that using bare YAML for e.g. Kubernetes is like programming in assembly. It’s certainly doable, but not desirable. It’s tedious, low-level, and lacks the minimal abstractions to make it scalable.

Some solutions do make configuration more flexible: YAML templating (Helm), tool-specific configuration languages (HCL), etc. But many of them feel like a band-aid over pure JSON or YAML which somehow accidentally grew up to become semi-programming languages.

Our final example is a snippet mapping a pair of high-level values (app_name and port, the only values we care about configuring) to a “low-level” Kubernetes configuration. Once written, the only remaining thing to do is to… configure your configuration!

(import "mysql-module.ncl") & {
  config.app_name = "mysql_backup",
  config.port_number = 10400
}

Nickel lets you provide a well-defined interface for reusable parts of configuration, while still providing an escape hatch to override anything else when you need it. Such configuration snippets can be reused, composed, and validated thanks to types and contracts. Although Nickel undeniably brings in complexity, it might paradoxically empower users to make configuration simple again.

If JSON is enough for your use-case, that’s perfect! There’s really no better place to be. But if you’ve ever felt underequipped to handle, write and evolve large configurations, Nickel might be for you.


We are happy to announce the 1.0 milestone for the Nickel configuration language. You can use it wherever you would normally use JSON, YAML, or TOML, but feel limited by using static text or ad-hoc templating languages. You can use it if your configuration is spread around many tools and ad-hoc configuration languages. Nickel could become a one-size-fits-all configuration language, enabling the sharing of abstractions, code, schemas (contracts) and tooling across your whole stack. Don’t hesitate to check the companion video to help you get set up.

In our next blog post, we’ll show how to configure Terraform providers using Nickel, and thereby gain the ability to check the code against provider-specific contracts, ahead of performing the deployments. Stay tuned!

Your feedback, ideas, and opinions are invaluable: please use Nickel, break it, do cool things we haven’t even imagined, and most importantly, please let us know about it! Email us at or go to Nickel’s open source GitHub repository.

  1. L. A. Barroso, U. Hölzle, and P. Ranganathan. The Datacenter as a Computer: An Introduction to the Design of Warehouse-scale Machines (Third Edition). Morgan and Claypool Publishers, 2018.
  2. C. Tang, T. Kooburat, P. Venkatachalam, A. Chander, Z. Wen, A. Narayanan, P. Dowell, and R. Karl. Holistic Configuration Management at Facebook, in Proceedings of the 25th ACM Symposium on Operating System Principles (SOSP’15), October 2015
  3. Facebook blames a misconfigured server for yesterday’s outage, TechCrunch
  4. Fastly’s statement

May 17, 2023 12:00 AM

May 16, 2023

Philip Wadler

Naomi Klein on AI Hallucinations

Amongst all the nonsense, something sensible in the press about AI: "AI machines aren’t ‘hallucinating’. But their makers are" in The Guardian. Written by Naomi Klein, the author of one of my favourite books, This Changes Everything.

But first, it’s helpful to think about the purpose the utopian hallucinations about AI are serving. What work are these benevolent stories doing in the culture as we encounter these strange new tools? Here is one hypothesis: they are the powerful and enticing cover stories for what may turn out to be the largest and most consequential theft in human history. Because what we are witnessing is the wealthiest companies in history (Microsoft, Apple, Google, Meta, Amazon …) unilaterally seizing the sum total of human knowledge that exists in digital, scrapable form and walling it off inside proprietary products, many of which will take direct aim at the humans whose lifetime of labor trained the machines without giving permission or consent.

This should not be legal. In the case of copyrighted material that we now know trained the models (including this newspaper), various lawsuits have been filed that will argue this was clearly illegal. Why, for instance, should a for-profit company be permitted to feed the paintings, drawings and photographs of living artists into a program like Stable Diffusion or Dall-E 2 so it can then be used to generate doppelganger versions of those very artists’ work, with the benefits flowing to everyone but the artists themselves?

The painter and illustrator Molly Crabapple is helping lead a movement of artists challenging this theft. “AI art generators are trained on enormous datasets, containing millions upon millions of copyrighted images, harvested without their creator’s knowledge, let alone compensation or consent. This is effectively the greatest art heist in history. Perpetrated by respectable-seeming corporate entities backed by Silicon Valley venture capital. It’s daylight robbery,” a new open letter she co-drafted states.

The trick, of course, is that Silicon Valley routinely calls theft “disruption” – and too often gets away with it. We know this move: charge ahead into lawless territory; claim the old rules don’t apply to your new tech; scream that regulation will only help China – all while you get your facts solidly on the ground. By the time we all get over the novelty of these new toys and start taking stock of the social, political and economic wreckage, the tech is already so ubiquitous that the courts and policymakers throw up their hands.

We saw it with Google’s book and art scanning. With Musk’s space colonization. With Uber’s assault on the taxi industry. With Airbnb’s attack on the rental market. With Facebook’s promiscuity with our data. Don’t ask for permission, the disruptors like to say, ask for forgiveness. (And lubricate the asks with generous campaign contributions.)

by Philip Wadler at May 16, 2023 06:10 PM

May 12, 2023

Sandy Maguire

Certainty by Construction Progress Report 1

The following is a progress report for Certainty by Construction, a new book I’m writing on learning and effectively wielding Agda. Writing a book is a tedious and demoralizing process, so if this is the sort of thing you’re excited about, please do let me know!

As part of a new ~quarterly goal, I’m going to be publishing updates to Certainty by Construction every Friday. This is for a few reasons: one, things get done much more quickly when you’re not doing them in private; two, relatedly, it’s good to get some exposure here and keep myself accountable.

Anyway, there are 26 new pages since last week, although a good deal of that is code without any prose around it yet. I’m in the process of cannibalizing the sections on relations and setoids into a single chapter. It’s a discussion of mathematical relations, their properties, and several examples. We explore different pre-orders, partial orders and total orders, and have a lengthy digression about effectively designing indices for data types.

This last point arose from me spending a few hours trying to work out under which circumstances exactly Agda gets confused about whether or not a computing index will give rise to a constructor. My findings are that it’s not really about computing indices, so much as it is about Agda running out of variables in which it can pack constraints. I suspect this knowledge can be exploited to make more interesting constructors than I thought possible, but I haven’t worked out how to do it yet.

I’ve also been working on how to simplify some bigger setoid proofs, where you have a bunch of equational reasoning you’d like to do under congruence. The folklore on this is generally to introduce a lemma somewhere else, but this has always struck me as a disappointing solution. Modulo the concrete syntax, this seems to work pretty well:

_≈nested_[_]_
    : A
    → {f : A → A}
    → (cong : {x y : A} → x ≈ y → f x ≈ f y)
    → {x y z : A}
    → x IsRelatedTo y
    → f y IsRelatedTo z
    → f x IsRelatedTo z
_ ≈nested cong [ relTo x=y ] (relTo fy=z)
    = relTo (trans (cong x=y) fy=z)
infixr 2 _≈nested_[_]_

which lets you focus in on a particular sub-expression, and use a new equational reasoning block to rewrite that, before popping your results back to the full expression. As an example:

  ⌊ ((a *H c) *x+ 0#) +H b *S c +H d *S a ⌋ * x + b * d
≈nested (+-congʳ ∘ *-congʳ) [  -- focus on subexpr
  ⌊ ((a *H c) *x+ 0#) +H b *S c +H d *S a ⌋
≈⟨ +H-+-hom (((a *H c) *x+ 0#) +H b *S c) (d *S a) x ⟩
  ⌊ ((a *H c) *x+ 0#) +H b *S c ⌋ + ⌊ d *S a ⌋
≈⟨ +-congʳ (+H-+-hom ((a *H c) *x+ 0#) (b *S c) x) ⟩
  ⌊ a *H c ⌋ * x + 0# + ⌊ b *S c ⌋ + ⌊ d *S a ⌋
≈⟨ …via… *S-*-hom ⟩
  ⌊ a *H c ⌋ * x + (b * ⌊ c ⌋) + (d * ⌊ a ⌋)
≈⟨ +-congʳ (+-congʳ (*-congʳ (*H-*-hom a c x))) ⟩
  ⌊ a ⌋ * ⌊ c ⌋ * x + b * ⌊ c ⌋ + d * ⌊ a ⌋
∎ ]                            -- pop back
  (⌊ a ⌋ * ⌊ c ⌋ * x + b * ⌊ c ⌋ + d * ⌊ a ⌋) * x + (b * d)

The attentive reader here will notice that I have also clearly been improving the chapter on ring solving. Maybe I’m just better at proofs these days, but the whole thing feels much less challenging than my first few times looking at it.

Anyway, that’s all for today. If you’ve already bought the book, you can get the updates for free on Leanpub. If you haven’t, might I suggest doing so? Your early support and feedback help inspire me and ensure the book is as good as it can possibly be.

May 12, 2023 12:00 AM

May 11, 2023

Tweag I/O

A journey through the auditing process of a smart contract

Smart contracts are critical programs running on a blockchain. Tweag’s High Assurance Software Group has become a well-known and trusted actor in the Cardano blockchain ecosystem when it comes to auditing and improving the reliability of these products. Let me guide you through our auditing process, how we improved upon classic testing techniques, and where we’d like to go next with formal methods.

The fictional company FooCorp is planning to provide a service AwesomeFoo to its clients through the deployment of a smart contract on the Cardano blockchain. For the sake of this blog post, they asked us to audit it.

What is AwesomeFoo exactly?

Just like you and me, Alice and Bob have an account at their local bank. What they possess there is described by a single number: the balance. When they spend or receive money, this balance is updated accordingly. Some blockchains (such as Ethereum) have a similar concept of account and balance, but Alice and Bob are also regular users of the Cardano blockchain, which follows another design. Instead of accounts, Alice and Bob each own a wallet containing objects called UTxOs (Unspent Transaction Outputs). A UTxO is like a one-use banknote or coupon: it has a value, and you can spend it once. Spending it allows you to create new UTxOs with values and owners of your choice, as long as their values add up to the value spent.

Let’s look at an example. Consider the UTxOs in black in the following diagram:

Illustration of transactions spending and producing UTxOs

Alice has two UTxOs: one with 10 Ada (the currency on the Cardano blockchain) and another with 5. She wants to pay 12 Ada to Bob. No UTxO has that exact value so she submits a transaction request (in blue) in which she spends her two UTxOs (15 Ada in total) and creates a 12 Ada one that goes in Bob’s wallet, and a 3 Ada one (the remainder) to herself. The green transaction from Bob is another example. This is how value is exchanged with no need for the concept of centralized accounts.

The diagram shows transaction requests that have been accepted on the chain, but some requests can be rejected if they don’t follow the rules:

  • Rule #1: The total input and output value must match[1].
  • Rule #2: If a UTxO from a wallet is spent, the owner of the wallet must sign the transaction request.

Some UTxOs don’t belong to individuals like Alice and Bob, but are associated with programs instead. When you submit a transaction request in which you spend such a UTxO, the associated program is executed. It is a pure function that returns a boolean. Given the context of the transaction (inputs, outputs, signers, etc.), it says whether spending the UTxO is allowed (True) or not (False).

  • Rule #3: If a UTxO associated with a program is spent, the program must return True given the context of the transaction request as parameters.
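
To make the three rules concrete, here is a minimal Haskell sketch of how a ledger could decide whether to accept a transaction request. It is a toy model and not Cardano's actual ledger code: the types UTxO, Lock, and TxContext and the function accepted are names made up for this illustration, and fees, minting, and real cryptographic signatures are ignored.

```haskell
type Owner = String

-- Who controls a UTxO: a wallet owner, or a validator program
-- (a pure function from the transaction context to a boolean).
data Lock
  = Wallet Owner
  | Script (TxContext -> Bool)

data UTxO = UTxO { utxoValue :: Integer, utxoLock :: Lock }

-- Hypothetical, simplified view of a transaction request.
data TxContext = TxContext
  { txInputs  :: [UTxO]   -- UTxOs being spent
  , txOutputs :: [UTxO]   -- UTxOs being created
  , txSigners :: [Owner]  -- who signed the request
  }

-- Rule #1 (ignoring fees and minting): value in equals value out.
valuePreserved :: TxContext -> Bool
valuePreserved tx = total (txInputs tx) == total (txOutputs tx)
  where
    total = sum . map utxoValue

-- Rule #2 for wallet UTxOs, Rule #3 for program UTxOs.
inputAuthorized :: TxContext -> UTxO -> Bool
inputAuthorized tx utxo = case utxoLock utxo of
  Wallet owner     -> owner `elem` txSigners tx
  Script validator -> validator tx

accepted :: TxContext -> Bool
accepted tx = valuePreserved tx && all (inputAuthorized tx) (txInputs tx)

-- Alice spends 10 + 5 Ada to pay Bob 12 and keep 3 as change.
alicePaysBob :: TxContext
alicePaysBob = TxContext
  { txInputs  = [UTxO 10 (Wallet "alice"), UTxO 5 (Wallet "alice")]
  , txOutputs = [UTxO 12 (Wallet "bob"), UTxO 3 (Wallet "alice")]
  , txSigners = ["alice"]
  }

main :: IO ()
main = do
  print (accepted alicePaysBob)                     -- True
  print (accepted alicePaysBob { txSigners = [] })  -- False: Rule #2 fails
```

Because a program lock is just a function of the whole context, it sees every input, output, and signer of the request, not only the UTxO it guards.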

The AwesomeFoo smart contract is a collection of such programs. Smart contracts are autonomous and make it possible to run complex services and protocols on the chain such as decentralized exchanges (DEX), stablecoins (e.g. Djed), crowdfunding, auctions, staking, etc.

Why is FooCorp right to ask for an audit?

  • High risks
    • No updates. As you noticed on the diagram, each UTxO on the chain has an address. For wallet UTxOs, it is a key belonging to the owner. For programs, it is the hash of the program. This means that updating the program changes the address, but all existing UTxOs are still associated with the previous address. These will go through the old version when spent. It is only possible to migrate, not update. Therefore, it is crucial to get the product right from the start.
    • Distributed design. In complex smart contracts, users craft transaction requests in which multiple UTxOs, sometimes belonging to different programs, are spent at the same time to achieve some result. In such cases, every individual program has to be designed with that collective result and the other programs in mind. Design and implementation are prone to many hard-to-detect errors, as in distributed or concurrent programming.
  • High stakes. In case of unexpected behavior, bugs, or attacks exploiting vulnerabilities in the product, FooCorp and its clients could suffer from major and unrecoverable financial losses.
  • Testing is limited. An audit has a fair chance to spot specification or design flaws which no amount of testing can identify. Besides, an external pair of eyes is likely to catch issues FooCorp may have missed in internal tests.

We help FooCorp by uncovering vulnerabilities, bugs, weaknesses, specification issues, design flaws, and wastes of space/time in their smart contract. We also suggest mitigations when applicable.

In the High Assurance Software Group, we have audited around a dozen Cardano smart contracts, a few of them twice when the client wished for a second audit. In around 75% of our audits, we found at least one concern that would expose parties of the smart contract to risks of abuse (e.g. weaknesses or direct exploits to steal money).

The audit begins

FooCorp has just provided us with:

  • The specification documents of AwesomeFoo including high level requirements, use cases, examples, and diagrams.
  • The implementation in Haskell of the programs comprising the AwesomeFoo smart contract.

In a few weeks, we are expected to deliver:

  • The audit report listing our concerns and suggestions sorted by type (unclear specifications, bugs, vulnerabilities/exploits, code quality, optimization) and ranked by severity (from critical to very low).
  • The audit code: an audit-oriented API using our homemade framework to interact with AwesomeFoo along with all the scenarios we tried, which FooCorp can use as regression tests or expand/tinker with if they want.

First look at AwesomeFoo

First, we need to really understand what AwesomeFoo is about, so we study the specs and skim through the code.

New finding. We quickly notice an unclear item in the documentation: it states that users of AwesomeFoo can’t perform some action after a deadline but it doesn’t say if the validity interval includes the deadline, nor what’s to become of any unspent deposit once the deadline has elapsed. The code tells us what happens, but not what’s intended to happen.

We simply ask the engineers at FooCorp about it! The auditing process involves a continuous dialogue: we inform them of every important discovery or question as we study their product. Here, they help us clarify the above item but we will log it in our report anyway and suggest an improvement.

Simple tests

In the industry, audits often consist of a manual inspection of specs and code. At Tweag, we go way beyond that. We have been continuously developing and improving our homemade auditing toolbox, cooked-validators, in parallel with our audits. It lets us interface with smart contracts to experiment with test scenarios. According to many of our clients who joined our user base, cooked-validators is easier and more efficient at that task than existing libraries.

A test has a sequence of transaction requests that are submitted one by one to a simulated blockchain, which either accepts or rejects them. The test specifies the expected outcome.

Basic test cases

To write such tests, we need to design transaction requests. We use cooked-validators to create an API to build requests that correspond to meaningful AwesomeFoo interactions. We don’t reuse FooCorp’s API because:

  • It is written with the end user in mind. It ensures transaction requests are well-formed from the start and fit the expectations of the smart contract, whereas we want to craft weird and unexpected transactions to find undesired outcomes. We write a more flexible API to that end and use our experience from previous audits.
  • Writing our own API brings us an in-depth understanding of AwesomeFoo, after the overview we get from the specification and skimming through the code.

Once we’re done with the API, we warm up with a few unit test cases covering common usage scenarios. We also test a few properties using property-based checking[2].

New finding. An implementation bug escaped FooCorp’s test suite: a utility function that generates supposedly-unique identifiers actually loops back at some point on existing ones. For now, AwesomeFoo has an undocumented implicit limit on the number of resources using the identifiers, so the issue is dormant. We report it to FooCorp immediately and explain how it may become critical if they ever increase this limit.

At this stage, our audit may look like a glorified API and test suite. Actually, this test suite is ammunition for the secret weapons hidden in cooked-validators.

The trace modification framework

Remember the structure of a test case:

Structure of a test case

Each test case has a sequence of transaction requests, the trace, detailing a scenario. cooked-validators provides a powerful framework to modify, generate, and combine transaction requests and traces. If you are interested in the technical details, refer to our previous posts: Part 1 presents the underlying trace abstraction that uses a freer monad, and Part 2 the LTL[3]-based language used to specify modifications. A library of ready-to-use higher-level modifiers, which we are about to detail next, is built on top of this lower level framework. The idea is to derive lots of new meaningful test scenarios from existing ones, with little effort.


Tweaks are functions to modify transaction requests (add, delete, change):

Types of tweaks

You can combine them to easily create a new trace from another. They make it possible to step outside the space of what the base AwesomeFoo API can produce at specific points, without having to manually expand or generalize said API. This makes it extremely convenient to think outside the box and quickly implement many new ideas.

Example of a tweak

In the above, we modify a test scenario into another in which Bob tries to create the necessary conditions to steal money. This scenario is expected to fail. Thankfully for FooCorp, it does.

We use tweaks to expand coverage of our unit test suite.

Expanding the test suite with tweaks

Automated attacks: find targets and apply tweaks on a large scale

We know a bunch of common vulnerabilities in smart contracts. We often want to attempt attacks that exploit them without a specific setup in mind. Manually writing new traces is tedious and likely to miss specific conditions in which an attack is successful. To maximize our chances of finding a setup where the attack works, we try it on all the test scenarios we have so far. Our framework makes it possible to target locations in traces to apply the required modifications with tweaks.

Automated attacks can produce several new test cases from a single one such as in the following example.

Expanding the test suite with tweaks

Then, we can automatically attempt common attacks on all our previous scenarios, turning dozens of test cases into hundreds of meaningful test cases.

Expanding the test suite with automated attacks
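
The idea of deriving many scenarios from one can be sketched in a few lines of Haskell. This is not the cooked-validators API: TxRequest, Tweak, redirectTo, tweakEverywhere, and attackAll are hypothetical names for a deliberately simplified model, in which a tweak either rewrites one transaction request or does not apply to it.

```haskell
-- Hypothetical, heavily simplified transaction request.
data TxRequest = TxRequest
  { sender    :: String
  , recipient :: String
  , amount    :: Integer
  } deriving (Eq, Show)

-- A trace is the sequence of requests making up one test scenario.
type Trace = [TxRequest]

-- A tweak rewrites a request, or returns Nothing when it does not apply.
type Tweak = TxRequest -> Maybe TxRequest

-- Example attack: redirect a payment to the attacker.
redirectTo :: String -> Tweak
redirectTo attacker tx
  | recipient tx == attacker = Nothing
  | otherwise                = Just tx { recipient = attacker }

-- Apply the tweak at each applicable position, one position per new trace.
tweakEverywhere :: Tweak -> Trace -> [Trace]
tweakEverywhere _ [] = []
tweakEverywhere tweak (tx : rest) =
  [ tx' : rest | Just tx' <- [tweak tx] ]
    ++ map (tx :) (tweakEverywhere tweak rest)

-- Run one attack over a whole test suite of traces.
attackAll :: Tweak -> [Trace] -> [Trace]
attackAll tweak = concatMap (tweakEverywhere tweak)

baseTrace :: Trace
baseTrace =
  [ TxRequest "alice" "bob" 12
  , TxRequest "bob" "carol" 5
  , TxRequest "carol" "mallory" 1
  ]

main :: IO ()
main = mapM_ print (attackAll (redirectTo "mallory") [baseTrace])
```

Here the single base trace yields two attack traces, because the tweak applies to the first two requests but not the third. Each derived trace would then be submitted to the simulated chain with the expectation that it is rejected.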

New finding. The automated attack reveals a vulnerability in AwesomeFoo. It is actually due to a typo in a function that checks whether two values are equal. One case has been forgotten by FooCorp. Neither manual code inspection nor tests carried out by FooCorp and us in the early stages of the audit had detected it.

Our experience with other techniques

  • Unit testing: writing individual tests is tedious and likely to miss interesting cases. The tweaks make it possible to write additional tests extremely easily by slightly changing previous test cases and stepping outside the box, and the automated attacks considerably increase coverage afterwards.
  • Model checking is exhaustive. All of the (finite) input space is tested. We explored that path before but it was impossible to verify interesting properties on smart contracts in a reasonable time without reducing the state space to some trivially small and irrelevant subset.
  • Property-based checking compromises by only considering randomly generated inputs. Writing good generators for these inputs is hard. By applying our automated attacks to existing meaningful traces (including manually tweaked traces), we cover inputs that are naturally more likely to yield interesting results compared to random distributions.

The following diagrams illustrate how the different techniques differ in terms of coverage.

Illustration of input space coverage by different testing techniques

Looking ahead: towards formal verification

As we told FooCorp, no audit is guaranteed to catch all bugs and vulnerabilities. Our auditing process is constantly evolving. Although we have improved upon existing testing techniques, we would ideally like to formally verify (demonstrate mathematically) that the implementation of a smart contract fits its specifications. Part of our research activity is dedicated to the use of formal methods in the pursuit of this goal. In particular, we’re developing Pirouette, a language-agnostic functional symbolic evaluation engine with a backend for smart contracts on Cardano. We’re also working on using LiquidHaskell, which uses refinement types and SMT solvers, to prove properties on smart contracts.

  1. This is only partially true in practice. A tiny amount of Ada, a fee, is lost in every transaction. The Cardano blockchain also supports custom tokens (e.g. custom coins, NFTs) that can be minted or burned during some transaction provided they follow rules defined in a program called a minting policy.
  2. cooked-validators interfaces with Tasty, HUnit, and QuickCheck.
  3. Linear Temporal Logic

May 11, 2023 12:00 AM

Matt Parsons

Working with Haskell CallStack

GHC Haskell provides a type CallStack with some magic built-in properties. Notably, there’s a constraint you can write - HasCallStack - that GHC will automagically figure out for you. Whenever you put that constraint on a top-level function, it will figure out the line and column, and either create a fresh CallStack for you, or it will append the source location to the pre-existing CallStack in scope.

Getting the current CallStack

To grab the current CallStack, you’ll write callStack - a value-level term that summons a CallStack from GHC’s magic.

import GHC.Stack

emptyCallStack :: IO ()
emptyCallStack = putStrLn $ show callStack

If we evaluate this in a compiled executable, then GHC will print out [] - a CallStack list with no entries! This isn’t much use. Let’s add a HasCallStack constraint.

giveCallStack :: HasCallStack => IO ()
giveCallStack = putStrLn $ show callStack

Running this in our test binary gives us the following entry, lightly formatted:

[ ( "giveCallStack"
  , SrcLoc
    { srcLocPackage = "main"
    , srcLocModule = "Main"
    , srcLocFile = "test/Spec.hs"
    , srcLocStartLine = 18
    , srcLocStartCol = 9
    , srcLocEndLine = 18
    , srcLocEndCol = 22
    }
  )
]

We get a [(String, SrcLoc)]. The String represents the function that was called, and the SrcLoc tells us the package, module, file, and a begin and end to the source location of the call site - not the definition site.
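
Here is a small self-contained sketch of how that list grows across nested calls. The names stackNames, inner, outer, and isolated are made up for this example; only HasCallStack, callStack, and getCallStack come from GHC.Stack. Each function carrying a HasCallStack constraint contributes an entry for its call site, while a function without the constraint starts a fresh stack for whatever it calls.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Names of the functions recorded in the current CallStack,
-- most recent call first.
stackNames :: HasCallStack => [String]
stackNames = map fst (getCallStack callStack)

inner :: HasCallStack => [String]
inner = stackNames

-- Has the constraint, so its own call site is added to the stack.
outer :: HasCallStack => [String]
outer = inner

-- No HasCallStack constraint: the stack starts fresh at the call to inner.
isolated :: [String]
isolated = inner

main :: IO ()
main = do
  print outer     -- ["stackNames","inner","outer"]
  print isolated  -- ["stackNames","inner"]
```

The entry for isolated is missing in the second result because nothing passed a CallStack into it, so GHC had nothing to append to.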

Let’s construct a helper that gets the current SrcLoc.

getSrcLoc :: HasCallStack => SrcLoc
getSrcLoc = snd $ head $ getCallStack callStack

I’m going to call print getSrcLoc in my test binary, and this is the output (again, formatted for legibility):

SrcLoc
    { srcLocPackage = "main"
    , srcLocModule = "Main"
    , srcLocFile = "test/Spec.hs"
    , srcLocStartLine = 27
    , srcLocStartCol = 15
    , srcLocEndLine = 27
    , srcLocEndCol = 24
    }

We can use this to construct a link to a GitHub project - suppose that we called that inside the esqueleto repository, and we want to create a link that’ll go to that line of code. Normally, you’d want to shell out and grab the commit and branch information, but let’s just bake that into the link for now.

mkGithubLink :: HasCallStack => String
mkGithubLink =
    concat
        [ ""
        , srcLocFile srcLoc
        , "#L", show $ srcLocStartLine srcLoc
        , "-"
        , "L", show $ srcLocEndLine srcLoc
        ]
  where
    srcLoc = getSrcLoc

Let’s call that from our test binary now:

main = do
    -- snip...
    example "mkGithubLink" do
        putStrLn mkGithubLink

The output is given:


But - that’s not right! That’s giving us the source location for getSrcLoc inside of mkGithubLink. We want it to give us the location of the callsite of mkGithubLink.

Fortunately, we can freeze the current CallStack, which will prevent getSrcLoc from adding to the existing CallStack.

Freezing the CallStack

GHC.Stack provides a function withFrozenCallStack, with a bit of a strange type signature:

withFrozenCallStack :: HasCallStack => (HasCallStack => a) -> a

This function freezes the CallStack for the argument of the function. This is useful if you want to provide a wrapper around a function that manipulates or reports on the CallStack in some way, but you don’t want that polluting any other CallStack.

Let’s call that before getSrcLoc and see what happens.

mkGithubLinkFrozen :: HasCallStack => String
mkGithubLinkFrozen =
    concat
        [ ""
        , srcLocFile srcLoc
        , "#L", show $ srcLocStartLine srcLoc
        , "-"
        , "L", show $ srcLocEndLine srcLoc
        ]
  where
    srcLoc = withFrozenCallStack getSrcLoc

-- in test binary,
main = do
    -- snip
    example "mkGithubLinkFrozen" do putStrLn mkGithubLinkFrozen




More real-world examples

As an example, the library annotated-exception attaches CallStacks to thrown exceptions, and each function like catch or onException that touches exceptions will append the current source location to the existing CallStack.

However, handle is implemented in terms of catch, which is implemented in terms of catches, and we wouldn’t want every single call-site of handle to mention catch and catches, and we wouldn’t want every call site of catch to mention catches - that’s just noise. So, we can freeze the CallStack:

handle
    :: (HasCallStack, Exception e, MonadCatch m)
    => (e -> m a) -> m a -> m a
handle handler action =
    withFrozenCallStack catch action handler

catch
    :: (HasCallStack, Exception e, MonadCatch m)
    => m a -> (e -> m a) -> m a
catch action handler =
    withFrozenCallStack catches action [Handler handler]

catches
    :: (HasCallStack, MonadCatch m)
    => m a -> [Handler m a] -> m a
catches action handlers =
    Safe.catches action (withFrozenCallStack mkAnnotatedHandlers handlers)

mkAnnotatedHandlers :: (HasCallStack, MonadCatch m) => [Handler m a] -> [Handler m a]
mkAnnotatedHandlers xs =
    xs >>= \(Handler hndlr) ->
        [ Handler $ \e ->
            checkpointCallStack $ hndlr e
        , Handler $ \(AnnotatedException anns e) ->
            checkpointMany anns $ hndlr e
        ]

Now, there’s something interesting going on here: consider these two possible definitions of handle:

handle handler action = 
    withFrozenCallStack catch action handler
handle handler action = 
    withFrozenCallStack $ catch action handler

It’s a Haskell instinct to write function $ argument, and it seems a bit odd to see withFrozenCallStack - a function - applied without a dollar. This is a subtle distinction - withFrozenCallStack applied to catch alone just freezes the CallStack for catch, but not for handler or action. If we apply withFrozenCallStack $ catch action handler, then we’ll freeze the CallStack for our arguments, too. This is usually not what you want.

Freezing Functions

Let’s explore the above subtle distinction in more depth.

wat :: HasCallStack => IO ()
wat = do
    wrap "unfrozen" (printSrcLoc getSrcLoc)
    withFrozenCallStack $ wrap "dolla" (printSrcLoc getSrcLoc)
    withFrozenCallStack wrap "undolla" (printSrcLoc getSrcLoc)

printSrcLoc :: SrcLoc -> IO ()
printSrcLoc = putStrLn . prettySrcLoc

wrap :: HasCallStack => String -> IO a -> IO a
wrap message action = do
    putStrLn $ concat
        [ "Beginning ", message
        , ", called at ", prettySrcLoc getSrcLoc
        ]
    a <- action
    putStrLn $ "Ending " <> message
    pure a

Before seeing the answer and discussion below, consider and predict what SrcLoc you expect to see printed out when wat is called.

Let’s zoom in on that:

    wrap "unfrozen" (printSrcLoc getSrcLoc)
    withFrozenCallStack $ wrap (print getSrcLoc)
    withFrozenCallStack wrap (print getSrcLoc)

Both withFrozenCallStack lines type check just fine. The difference is in which CallStacks are frozen. The first of them freezes the CallStack for the entire expression, wrap (print getSrcLoc). The second only freezes the CallStack for the wrap function - the CallStack for the (print getSrcLoc) is unfrozen.

Let’s see what happens when we run that:

Beginning unfrozen, called at src/Lib.hs:51:40 in callstack-examples-
src/Lib.hs:40:34 in callstack-examples-
Ending unfrozen
Beginning dolla, called at test/Spec.hs:34:19 in main:Main
test/Spec.hs:34:19 in main:Main
Ending dolla
Beginning undolla, called at test/Spec.hs:34:19 in main:Main
src/Lib.hs:42:53 in callstack-examples-
Ending undolla

For unfrozen, wrap prints the SrcLoc that corresponds to the getSrcLoc call inside its own putStrLn $ concat [..., getSrcLoc] expression. This always points to the wrap definition site - we’d want to freeze that getSrcLoc if we wanted the call site of wrap in that case. The next line (src/Lib.hs:40:34 ...) comes from the printSrcLoc getSrcLoc action provided to wrap. That SrcLoc points to the call site of getSrcLoc in the file for that function.

For dolla, we’ve frozen the CallStack for both wrap and the function argument. That means the SrcLoc we get for both cases is the same - so we’re not really returning the exact SrcLoc, but the most recent SrcLoc before the entire CallStack was frozen. This SrcLoc corresponds to the call-site of wat in the test suite binary, not the library code that defined it.

For undolla, we’ve only frozen the CallStack for wrap, and we leave it untouched for printSrcLoc getSrcLoc. The result is that wrap prints out the frozen CallStack pointing to the callsite of wat in the test binary, while the function argument printSrcLoc getSrcLoc is able to access the CallStack with new frames added.

It’s easiest to see what’s going on here with explicit function parenthesization. Haskell uses whitespace for function application, which makes the parentheses implicit for multiple argument functions. Let’s write the above expressions with explicit parens around withFrozenCallStack:

    (withFrozenCallStack (wrap "dolla" (printSrcLoc getSrcLoc)))
    (withFrozenCallStack wrap) "undolla" (printSrcLoc getSrcLoc)

I almost wish that withFrozenCallStack always required parentheses, just to make this clearer - but that’s not possible to enforce.
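To make the parenthesization point concrete, here is a minimal sketch that counts visible stack frames instead of printing source locations. The names frames, both and demo are hypothetical helpers invented for this illustration (both plays the role of wrap); only GHC.Stack is assumed.

```haskell
import GHC.Stack

-- Hypothetical helper: how many frames are visible where this is used.
frames :: HasCallStack => Int
frames = length (getCallStack callStack)

-- Hypothetical stand-in for wrap: pairs the frame count seen inside the
-- function with the frame count seen by its (lazily evaluated) argument.
both :: HasCallStack => Int -> (Int, Int)
both argFrames = (frames, argFrames)

demo :: HasCallStack => [(String, (Int, Int))]
demo =
  [ ("unfrozen",         both frames)
    -- the "dolla" shape: function and argument both see the frozen stack
  , ("whole expression", withFrozenCallStack (both frames))
    -- the "undolla" shape: only the function sees the frozen stack
  , ("function only",    (withFrozenCallStack both) frames)
  ]
```

When demo is called from a main that has no HasCallStack constraint of its own, the pairs should come out as (3,2), (1,1) and (1,2): freezing the whole expression stops new frames from being pushed for both the function and its argument, while freezing only the function leaves the argument free to push its own frame.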

Unfortunately, yeah, mkGithubLinkFrozen is broken if we’ve frozen the CallStack externally:

-- test
main :: HasCallStack => IO ()         
main = do                             -- === line 16
    -- snip...

    example "frozen githublink" do
        putStrLn (withFrozenCallStack mkGithubLinkFrozen)
                                      -- ^^^ line 37

This outputs:

frozen githublink

Line 16 points to main, where we’ve included our HasCallStack constraint. What if we omit that constraint?

main :: IO ()         
main = do                             -- === line 16
    -- snip...

    example "frozen githublink" do
        putStrLn (withFrozenCallStack mkGithubLinkFrozen)
                                      -- ^^^ line 37

This outputs:

frozen githublink
callstack-examples-test: Prelude.head: empty list

Uh oh!

Well, GHC.Stack doesn’t provide a utility for us to unfreeze the CallStack, which makes sense - that would break whatever guarantee that withFrozenCallStack is providing.

If we look at the internal definitions for CallStack, we’ll see that it’s a list-like type:

data CallStack
  = EmptyCallStack
  | PushCallStack [Char] SrcLoc CallStack
  | FreezeCallStack CallStack

Then we can see withFrozenCallStack’s implementation:

withFrozenCallStack :: HasCallStack
                    => ( HasCallStack => a )
                    -> a
withFrozenCallStack do_this =
  -- we pop the stack before freezing it to remove
  -- withFrozenCallStack's call-site
  let ?callStack = freezeCallStack (popCallStack callStack)
  in do_this
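The effect of the FreezeCallStack constructor can be seen without any implicit parameters by building stacks by hand with the functions GHC.Stack exports. This is only a sketch: fakeLoc and the frame names "f" and "g" are made up for the example.

```haskell
import GHC.Stack

-- A made-up source location, purely for illustration.
fakeLoc :: SrcLoc
fakeLoc = SrcLoc "demo-pkg" "Demo" "Demo.hs" 1 1 1 1

-- A stack holding a single frame for "f".
base :: CallStack
base = pushCallStack ("f", fakeLoc) emptyCallStack

-- Pushing onto an ordinary stack adds a frame on top.
unfrozenNames :: [String]
unfrozenNames = map fst (getCallStack (pushCallStack ("g", fakeLoc) base))

-- Pushing onto a frozen stack is silently ignored.
frozenNames :: [String]
frozenNames =
  map fst (getCallStack (pushCallStack ("g", fakeLoc) (freezeCallStack base)))
```

Here unfrozenNames is ["g", "f"], while frozenNames is just ["f"]: once a stack is frozen, later pushes leave it unchanged.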

That ?callStack syntax is GHC’s ImplicitParams extension - it’s an implementation detail that GHC may change at any point in the future. Let’s rely on that detail! It has remained true for 10 major versions of base, and we can always try and upstream this officially…

import GHC.Stack.Types

thawCallStack :: CallStack -> CallStack
thawCallStack stack =
    case stack of
        FreezeCallStack stk -> stk
        _ -> stack

withThawedCallStack :: HasCallStack => (HasCallStack => r) -> r
withThawedCallStack action =
    let ?callStack = thawCallStack (popCallStack callStack)
     in action

Unfortunately, we can’t call this within mkGithubLink - that unfreezes the CallStack, but at that point, it’s too late.

Yet another “safe” use of head that turns out to be unsafe! Only in Haskell might we have a totally empty stack trace…

Propagating CallStack

When you write a top-level function, you can include a HasCallStack constraint. Any time you call error, the existing CallStack will be attached to the thrown ErrorCall exception, which you can see by matching on ErrorCallWithLocation instead of plain ErrorCall.

CallStack propagation is fragile. Any function which does not include a HasCallStack constraint will break the chain, and you’ll only have the lowest level of the CallStack. Consider boom and boomStack:

boom :: Int
boom = error "oh no"

boomStack :: HasCallStack => Int
boomStack = error "oh no, but with a trace"

If we evaluate these, then we’ll see very different information. error will summon its own CallStack, which will include the callsite of error itself:

callstack-examples-test: oh no
CallStack (from HasCallStack):
  error, called at src/Lib.hs:76:8 in callstack-examples-

Line 76 and column 8 point exactly to where error is called in the definition of boom. Let’s evaluate boomStack now:

callstack-examples-test: oh no, but with a trace
CallStack (from HasCallStack):
  error, called at src/Lib.hs:79:13 in callstack-examples-
  boomStack, called at test/Spec.hs:40:15 in main:Main
  main, called at test/Spec.hs:16:1 in main:Main

Now, we see the entry for error’s call-site, as well as boomStack’s call site, and finally main - the entire chain!
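The same chain-breaking behaviour can be observed without throwing anything, by listing the function names on the stack. The three definitions below are hypothetical and exist only for this sketch.

```haskell
import GHC.Stack

-- Hypothetical helper: the function names currently on the CallStack,
-- innermost first.
frameNames :: HasCallStack => [String]
frameNames = map fst (getCallStack callStack)

-- Carries the constraint, so it shows up in the chain.
viaConstraint :: HasCallStack => [String]
viaConstraint = frameNames

-- No constraint: the chain starts afresh at the call to frameNames.
viaPlain :: [String]
viaPlain = frameNames
```

Evaluated from a main without a HasCallStack constraint, viaConstraint gives ["frameNames", "viaConstraint"], whereas viaPlain gives only ["frameNames"].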

Remembering to put HasCallStack constraints everywhere is a bit of a drag, which is another motivation for my annotated-exception library - all of the functions which touch exceptions in any way will push a stack frame onto any exception that has been thrown. This means that any catch or finally or similar will do a much better job of keeping track of the stack frame. Diagnosing problems becomes far easier.

We can do this for ErrorCall, but it’s annoying, because the location is a String.

mkStackFrameLines :: CallStack -> [String]
mkStackFrameLines =
    map formatFrame . getCallStack
  where
    formatFrame (fn, srcLoc) =
        fn <> ", called at " <> prettySrcLoc srcLoc

addStackFrame :: HasCallStack => IO a -> IO a
addStackFrame action = do
    let newLines =
            map ("  " <>) $ mkStackFrameLines callStack
        -- the location is a single String, so join the lines back up
        appendLoc locs =
            unlines
                (locs : newLines)
    action `catch` \(ErrorCallWithLocation err loc) ->
        throwIO $ ErrorCallWithLocation err (appendLoc loc)

-- These functions are used here.
-- Try and predict what their output will be!

moreContextPlease :: IO ()
moreContextPlease =
    addStackFrame $ do
        print boom

moreContextPleaseStacked :: HasCallStack => IO ()
moreContextPleaseStacked =
    addStackFrame $ do
        print boom

When we evaluate moreContextPlease, we’ll see this:

callstack-examples-test: oh no
CallStack (from HasCallStack):
  error, called at src/Lib.hs:77:8 in callstack-examples-
  addStackFrame, called at src/Lib.hs:84:5 in callstack-examples-

This gives us a little more context - we at least have that addStackFrame call. But addStackFrame happily adds everything in the trace, and moreContextPleaseStacked has an unbroken line:

callstack-examples-test: oh no
CallStack (from HasCallStack):
  error, called at src/Lib.hs:77:8 in callstack-examples-
  addStackFrame, called at src/Lib.hs:89:5 in callstack-examples-
  moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main
  main, called at test/Spec.hs:16:1 in main:Main

Wow! A complete stack trace, all the way from error to main. You never see that.

Unfortunately, the String makes deduplicating lines more challenging. boomStack included the HasCallStack, which would be an unbroken chain too - let’s see how that plays out…

moreContextPleaseStacked :: HasCallStack => IO ()
moreContextPleaseStacked =
    addStackFrame $ do
        print boomStack

Evaluating this now gives us:

callstack-examples-test: oh no, but with a trace
CallStack (from HasCallStack):
  error, called at src/Lib.hs:80:13 in callstack-examples-
  boomStack, called at src/Lib.hs:90:15 in callstack-examples-
  moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main
  main, called at test/Spec.hs:16:1 in main:Main
  addStackFrame, called at src/Lib.hs:89:5 in callstack-examples-
  moreContextPleaseStacked, called at test/Spec.hs:40:9 in main:Main
  main, called at test/Spec.hs:16:1 in main:Main

We get error, boomStack, moreContextPleaseStacked, main - the original stack trace. Then we append to that addStackFrame, which also adds in moreContextPleaseStacked and main again. So, clearly, this is noisier than it needs to be - ideally, we would not include duplicates. This should be possible - addStackFrame could potentially parse the location String and, if it finds a shared suffix (in this case, starting at moreContextPleaseStacked), insert only the addStackFrame call above that suffix and drop the rest.
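As a rough sketch of that deduplication idea, assume the location String has already been split into lines. The function mergeFrames below is hypothetical and not part of any library: it finds the longest suffix shared by the existing trace and the new frames, then slots the remaining new frames in just above it.

```haskell
-- | Merge new stack-frame lines into an existing trace, keeping a
--   shared suffix only once. Both lists are innermost-frame first.
mergeFrames :: [String] -> [String] -> [String]
mergeFrames old new = oldOnly ++ newOnly ++ shared
  where
    -- Walk both traces from the outermost frame inwards while they agree.
    sharedLen = length (takeWhile id (zipWith (==) (reverse old) (reverse new)))
    shared    = drop (length old - sharedLen) old
    oldOnly   = take (length old - sharedLen) old
    newOnly   = take (length new - sharedLen) new
```

With the trace above, merging the four original lines with the three lines contributed by addStackFrame would leave error, boomStack, addStackFrame, moreContextPleaseStacked and main, each appearing once. Comparing whole lines is a simplification; it relies on the two traces formatting shared frames identically.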


I’ve mentioned annotated-exception a few times. This library extends the CallStack machinery to any exception that is thrown by the library or passes through an exception handler. Additionally, you can provide additional metadata information on your exceptions, which makes debugging them much more useful. You can now transparently add, say, the logged in user ID to every single exception that gets thrown in a code block.

The source code for this blog post is available at this GitHub repository.

May 11, 2023 12:00 AM

May 03, 2023

Brent Yorgey

Competitive programming in Haskell: tries

In my previous post, I challenged you to solve Alien Math, which is about reading numbers in some base B, but with a twist. We are given a list of B strings representing the names of the digits 0 through B-1, and a single string describing a number, consisting of concatenated digit names. For example, if B = 3 and the names of the digits are zero, one, two, then we might be given a string like twotwozerotwoone, which we should interpret as 22021_3 = 223_{10}. Crucially, we are also told that the digit names are prefix-free, that is, no digit name is a prefix of any other. But other than that, the digit names could be really weird: they could be very different lengths, some digit names could occur as substrings (just not prefixes) of others, digit names could share common prefixes, and so on. So this is really more of a parsing problem than a math problem; once we have parsed the string as a list of digits, converting from base B is the easy part.

One simple way we can do this is to define a map from digit names to digits, and simply look up each prefix of the given string until we find a hit, then chop off that prefix and start looking at successive prefixes of the remainder. This takes something like O(n^2 \lg n) time in the worst case (I think)—but this is actually fine since n is at most 300. This solution is accepted and runs in 0.00 seconds for me.
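For concreteness, the simple approach might look something like the following sketch. It uses plain String rather than ByteString, and the name decodeNaive is made up here; it is not the submitted solution.

```haskell
import           Data.List (inits)
import qualified Data.Map  as M

-- | Repeatedly find the shortest non-empty prefix that is a digit name,
--   emit its digit, and continue with the rest of the input.
decodeNaive :: M.Map String Integer -> String -> Maybe [Integer]
decodeNaive _     [] = Just []
decodeNaive names s  =
  case [ (d, drop (length p) s)
       | p <- drop 1 (inits s)            -- non-empty prefixes, shortest first
       , Just d <- [M.lookup p names] ] of
    (d, rest) : _ -> (d :) <$> decodeNaive names rest
    []            -> Nothing              -- no digit name matches here
```

For the example above, with zero, one and two mapped to 0, 1 and 2, decodeNaive applied to twotwozerotwoone gives Just [2,2,0,2,1].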


However, I want to talk about a more sophisticated solution that has better asymptotic time complexity and generalizes nicely to other problems. Reading a sequence of strings from a prefix-free set should make you think of Huffman coding, if you’ve ever seen that before. In general, the idea is to define a trie containing all the digit names, with each leaf storing the corresponding digit. We can then scan through the input one character at a time, keeping track of our current position in the trie, and emit a digit (and restart at the root) every time we reach a leaf. This should run in O(n) time.

Let’s see some generic Haskell code for tries (this code can also be found at byorgey/comprog-hs/Trie.hs on GitHub). First, some imports, a data type definition, and emptyTrie and foldTrie for convenience:

module Trie where

import           Control.Monad              ((>=>))
import qualified Data.ByteString.Lazy.Char8 as C
import           Data.List                  (foldl')
import           Data.Map                   (Map, (!))
import qualified Data.Map                   as M
import           Data.Maybe                 (fromMaybe, isJust)

data Trie a = Trie
  { trieSize :: !Int
  , value    :: !(Maybe a)
  , children :: !(Map Char (Trie a))
  }
  deriving Show

emptyTrie :: Trie a
emptyTrie = Trie 0 Nothing M.empty

-- | Fold a trie into a summary value.
foldTrie :: (Int -> Maybe a -> Map Char r -> r) -> Trie a -> r
foldTrie f (Trie n b m) = f n b (M.map (foldTrie f) m)

A trie has a cached size (we could easily generalize this to store any sort of monoidal annotation), a possible value (i.e. the value associated with the empty string key, if any), and a map from characters to child tries. The cached size is not needed for this problem, but is included since I needed it for some other problems.

Now for inserting a key/value pair into a Trie. This code honestly took me a while to get right! We fold over the given string key, producing for each key suffix a function which will insert that key suffix into a trie. We have to be careful to correctly update the size, which depends on whether the key being inserted already exists—so the recursive go function actually returns a pair of a new Trie and an Int representing the change in size.

-- | Insert a new key/value pair into a trie, updating the size
--   appropriately.
insert :: C.ByteString -> a -> Trie a -> Trie a
insert w a t = fst (go w t)
  where
    go = C.foldr
      -- non-empty suffix: descend into (or create) the child for c
      (\c insSuffix (Trie n v m) ->
         let (t', ds) = insSuffix (fromMaybe emptyTrie (M.lookup c m))
         in  (Trie (n+ds) v (M.insert c t' m), ds)
      )
      -- empty suffix: store the value; size grows only for a new key
      (\(Trie n v m) ->
         let ds = if isJust v then 0 else 1
         in  (Trie (n+ds) (Just a) m, ds)
      )

Now we can create an entire Trie in one go by folding over a list of key/value pairs with insert:

-- | Create an initial trie from a list of key/value pairs.  If there
--   are multiple pairs with the same key, later pairs override
--   earlier ones.
mkTrie :: [(C.ByteString, a)] -> Trie a
mkTrie = foldl' (flip (uncurry insert)) emptyTrie

A few lookup functions: one to look up a single character and return the corresponding child trie, and then on top of that we can build one to look up the value associated to an entire string key.

-- | Look up a single character in a trie, returning the corresponding
--   child trie (if any).
lookup1 :: Char -> Trie a -> Maybe (Trie a)
lookup1 c = M.lookup c . children

-- | Look up a string key in a trie, returning the corresponding value
--   (if any).
lookup :: C.ByteString -> Trie a -> Maybe a
lookup = C.foldr ((>=>) . lookup1) value

Finally, a function that often comes in handy for using a trie to decode a prefix-free code. It takes an input string and looks it up character by character; every time it encounters a key which exists in the trie, it emits the corresponding value and then starts over at the root of the trie.

decode :: Trie a -> C.ByteString -> [a]
decode t = reverse . snd . C.foldl' step (t, [])
  where
    step (s, as) c =
      let Just s' = lookup1 c s
      in  maybe (s', as) (\a -> (t, a:as)) (value s')

These tries are limited to string keys, since that is most useful in a competitive programming context, but it is of course possible to make much more general sorts of tries — see Hinze, Generalizing Generalized Tries.
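To see the decoding idea in isolation, here is a self-contained miniature. It is a deliberately simplified variant, not the Trie module above: it uses String keys, omits the cached size, and returns Nothing on malformed input rather than relying on a partial pattern match. The names T, emptyT, insertT and decodeMaybe are invented for this sketch.

```haskell
import qualified Data.Map as M

-- A stripped-down trie: an optional value plus children.
data T a = T (Maybe a) (M.Map Char (T a))

emptyT :: T a
emptyT = T Nothing M.empty

insertT :: String -> a -> T a -> T a
insertT []       a (T _ m) = T (Just a) m
insertT (c : cs) a (T v m) =
  T v (M.insert c (insertT cs a (M.findWithDefault emptyT c m)) m)

-- | Decode a concatenation of keys from a prefix-free set. Fails if a
--   character leaves the trie or the input stops in the middle of a key.
decodeMaybe :: T a -> String -> Maybe [a]
decodeMaybe root = go False root
  where
    -- The Bool records whether we are part-way through a key.
    go midKey _ [] = if midKey then Nothing else Just []
    go _ (T _ m) (c : cs) =
      case M.lookup c m of
        Nothing             -> Nothing
        Just (T (Just a) _) -> (a :) <$> go False root cs  -- emit, restart at root
        Just child          -> go True child cs
```

Building the trie from zero, one and two and decoding twotwozerotwoone yields Just [2,2,0,2,1], while an input that ends part-way through a digit name, such as tw, yields Nothing.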


Finally, we can use our generic tries to solve the problem: read the input, build a trie mapping digit names to values, use the decode function to read the given number, and finally interpret the resulting list of digits in the given base.

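{-# LANGUAGE RecordWildCards #-}

import Control.Arrow ((>>>))
import qualified Data.ByteString.Lazy.Char8 as C
import Data.List (foldl')
import ScannerBS
import Trie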

main = C.interact $ runScanner tc >>> solve >>> showB

data TC = TC { base :: Integer, digits :: [C.ByteString], number :: C.ByteString }
  deriving (Eq, Show)

tc :: Scanner TC
tc = do
  base <- integer
  TC base <$> (fromIntegral base >< str) <*> str

solve :: TC -> Integer
solve TC{..} = foldl' (\n d -> n*base + d) 0 (decode t number)
  where
    t = mkTrie (zip digits [0 :: Integer ..])
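The fold in solve is Horner's rule for evaluating a digit list in a given base. Pulled out on its own (fromDigits is a name chosen just for this sketch), it can be checked against the example from the problem statement:

```haskell
import Data.List (foldl')

-- | Interpret a most-significant-first digit list in the given base.
fromDigits :: Integer -> [Integer] -> Integer
fromDigits base = foldl' (\n d -> n * base + d) 0
```

For the digits 2,2,0,2,1 in base 3 the accumulator goes 2, 8, 24, 74, 223, so fromDigits 3 [2,2,0,2,1] is 223, matching 22021_3 = 223_{10}.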

Practice problems

Here are a few other problems where you can profitably make use of tries. Some of these can be solved directly using the Trie code given above; others may require some modifications or enhancements to the basic concept.

For next time

For next time, I challenge you to solve Chemist’s vows!

by Brent at May 03, 2023 07:57 PM

May 01, 2023

Monday Morning Haskell

Spring Sale: Final Day!

Today is the final day to subscribe and get 20% off any of our paid courses! Here are the potential sale prices you might get:

  1. Haskell From Scratch - Our Comprehensive Beginners Course | $79.20
  2. Practical Haskell - Learn about Useful Web Libraries and Project Concepts | $119.20
  3. Making Sense of Monads - Learn Haskell's Key Concept | $23.20
  4. Effectful Haskell - Take a Step Further with Monadic Effect Systems | $31.20
  5. Haskell Brain - Combine TensorFlow and Haskell | $31.20

In addition, you can also take a look at our new free course, Setup.hs. This course teaches you how to set up your basic Haskell toolchain, including IDE integrations!

So if you want that 20% discount code, make sure to subscribe to our mailing list before the end of the day!

by James Bowen at May 01, 2023 03:01 PM

April 27, 2023


GHC activities report: February–March 2023

We’re happy to support the Haskell Foundation in planning a workshop for new GHC contributors on 7th-9th June. Workshop registration has now closed, but you can still register to attend ZuriHac on 10th-12th June!

This is the seventeenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of February and March 2023. You can find the previous editions collected under the ghc-activities-report tag.

We are delighted that Hasura have recently started sponsoring our GHC maintenance work, building on our long-running collaboration on Haskell tooling. Many thanks to them and to the other sponsors, Juspay and GitHub via the Haskell Foundation. In addition, Mercury are funding specific work on improved performance for GHC, HLS and related projects. We are also grateful to our past sponsors, including Microsoft Research and IOG.

However, some of our sponsorship agreements are coming to an end and we need more sponsorship to sustain the team! If your company might be able to contribute funding to ensure that we can continue this valuable work, please read about how you can help or get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!


The existing GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal, Sam Derbyshire and Rodrigo Mesquita. Many others within Well-Typed are contributing to GHC more occasionally.


  • Ben released GHC 9.6.1, the first major release in the 9.6 series, having previously prepared several alphas and the release candidate.

  • Zubin released GHC 9.2.6 and GHC 9.2.7, bugfix releases in the 9.2 series.


  • Sam and Adam ensured that the changes to overloaded labels from GHC proposal 170 didn’t steal syntax by preventing the . character from appearing in labels. This avoids a breaking change that was discovered in the 9.6 alphas.

Typechecker and renamer

  • Sam completed a significant overhaul of the treatment of duplicate record fields in the renamer and typechecker.

    Some highlights:

    • Template Haskell support for duplicate record fields.
    • Improve disambiguation of record updates involving sum types (#21443).
    • Stop displaying mangled field names such as $sel:MkD:fld to the user.

    Fixed tickets: #13352, #14848, #17381, #17551, #19664, #21443, #21444, #21720, #21898, #21946, #21959, #22125, #22160, #22424, #23010, #23062, #23063 and #23177.

  • Sam introduced a warning when users rely on a certain constraint solver bug involving expansion of superclass constraints (!9921 !10020). This was done to introduce a migration cycle, so that people have the time to update their programs, based on feedback from the 9.6 alphas.

  • Sam fixed a small bug with type variable scoping in SPECIALISE pragmas (#22913).

  • Sam identified and fixed a missing type-synonym expansion in the typechecker (#22985).

  • Rodrigo fixed a bug in which strictness annotations in Template Haskell quotes could be dropped (#23036).

Error messages and warnings

  • Adam implemented GHC proposal 541, which allows users to assign custom warning categories to WARNING pragmas, for example:

    {-# WARNING in "x-partial" head "This is a partial function...." #-}

    With this change, it became possible to attach custom warnings to head/tail which can be silenced selectively. This allowed completion of CLC proposal 87 and CLC proposal 114.

  • Matt migrated error messages related to interface loading to the new diagnostic-based infrastructure (!10135).

  • Sam reviewed and assisted other contributors with several contributions migrating error messages to the new diagnostic infrastructure.

  • Rodrigo stopped ghci’s own Prelude import from annoyingly triggering the missing-import-lists warning.


  • Sam updated a few pointer equality operations, such as sameMutableArray#, sameMutVar# and eqStableName#, to be levity-polymorphic (!9976).

Code generation

  • Andreas fixed a tag inference bug for the bytecode interpreter (#22840).

  • Ben improved the code generation of bitmasks on AArch64 by precisely checking whether a value is a bitmask immediate (#23030).

  • Ben improved code generation for atomic reads and writes on AArch64, by using in-line primops (#22115).

  • Zubin fixed a bug in which multiline-comments were incorrectly handled in the AArch64 native code generator (#23002).

Debug information

  • In an ongoing quest to reduce the size of compiler results containing debug information, Finley implemented compression of info-table provenance entry map data (!9893). With this feature, -finfo-table-map enabled build results are roughly 20% smaller in size.

Runtime system

  • Ben fixed a bug in the alignment of capabilities, fixing the segfault reported in #22965.

  • Ben has been working to track down a variety of memory ordering issues uncovered by newer AArch64 implementations (#23185, #22872, #23222).

  • Ben inserted write barriers for IND and IND_STATIC info tables. The fact that these were missing caused segfaults seen in #22872.

Garbage collector

  • Ben fixed a plethora of bugs in the non-moving garbage collector, in particular refactoring the treatment of weak pointers. He also reduced maximum pause times during synchronisation. See !9609.

    Fixed bugs: #22327, #22926, #22927, #22929, #22930, and #22931.

  • Ben prevented slop from being zeroed when using the non-moving garbage collector, as this could cause races (#23170).

Runtime performance

  • Matthew and Andreas investigated various regressions that were reported in the text and bytestring packages. They discovered:

    • That patches to use sub-word sized instructions, such as using eqWord8# on Word8s instead of extending to Word before comparing, were regressing performance due to partial register stalls (see e.g. #20405). Reverting those patches recovered the original performance characteristics.

    • That the inlining of join points can introduce heap checks in hot code paths, as described in #22936. The patch which made GHC more eager to inline certain join points was reverted.

    • That knock-on inlining differences caused by recent changes to the GHC.Unicode module were causing regressions in functions such as toUpper. Adding a NOINLINE pragma to a certain large function (!9958) resolved the issue.


  • Ben bumped the win32-tarballs submodule, alleviating certain ABI incompatibility issues resulting from the move to a Clang toolchain in GHC 9.4.

  • Matthew fixed some inconsistencies in version numbers in the documentation generated by Hadrian (#23121).

Testsuite & CI

  • Ben and Rodrigo made a few tests more consistent by ensuring that handles are flushed at the correct point.

GHC build system

  • Ben stopped Hadrian from unnecessarily packaging the lib/settings file (#20253, #22982).

  • Ben fixed an issue where running ./configure twice on MacOS could cause strange errors in FIND_CXX_STD_LIB (#23116).

  • Rodrigo and Matthew worked towards plugin loading soundness by giving GHC a proper unit-id and making Hadrian add hashes to package IDs (#20742, !10119).


  • Ben worked on improving reliability of head.hackage, as several spurious failures had been triggered by cabal suddenly finding different build plans after a package was updated on hackage.

    This included adding a freeze file to freeze the Hackage index state, as well as introducing extra constraints to avoid ever including certain outdated packages in a build plan.

  • Ben continued work in moving head.hackage to use the foliage tool for Hackage repository generation. This work will both simplify head.hackage’s infrastructure and allow the repository to be frozen with cabal’s index-state field.

GHC Proposals

  • Adam drafted several GHC proposals:

    • GHC proposal 579 modifies the format of proposals and their evaluation criteria to make backwards compatibility a more explicit concern.

    • GHC proposal 581 extends ExplicitNamespaces, building on a previous design by others.

    • GHC proposal 583 proposes changes to the HasField class used for overloaded record fields, to unblock the implementation of OverloadedRecordUpdate.

by ben, andreask, matthew, zubin, sam, finley, adam at April 27, 2023 12:00 AM

April 26, 2023

Chris Reade

Graphs, Kites and Darts – Empires and SuperForce

We have been exploring properties of Penrose’s aperiodic tilings with kites and darts using Haskell.

Previously in Diagrams for Penrose tiles we implemented tools to draw finite tilings using Haskell diagrams. There we also noted that legal tilings are only correct tilings if they can be continued infinitely and are incorrect otherwise. In Graphs, Kites and Darts we introduced a graph representation for finite tilings (Tgraphs) which enabled us to implement operations that use neighbouring tile information. In particular we implemented a force operation to extend a Tgraph on any boundary edge where there is a unique choice for adding a tile.

In this note we find a limitation of force, show a way to improve on it (superForce), and introduce boundary coverings which are used to implement superForce and calculate empires.

Properties of Tgraphs

A Tgraph is a collection of half-tile faces representing a legal tiling and a half-tile face is either an LD (left dart), RD (right dart), LK (left kite), or RK (right kite), each with 3 vertices to form a triangle. Faces of the Tgraph which are not half-tile faces are considered external regions and those edges round the external regions are the boundary edges of the Tgraph. The half-tile faces in a Tgraph are required to be connected and locally tile-connected, which means that there are exactly two boundary edges at any boundary vertex (no crossing boundaries).

As an example Tgraph we show kingGraph (the three darts and two kites round a king vertex), where

  kingGraph = makeTgraph
    [LD (1,2,3),RD (1,11,2),LD (1,4,5),RD (1,3,4),LD (1,10,11)
    ,RD (1,9,10),LK (9,1,7),RK (9,7,8),RK (5,7,1),LK (5,6,7)
    ]

This is drawn in figure 1 using

  hsep 1 [dashJVGraph kingGraph, drawGraph kingGraph]

which shows vertex labels and dashed join edges (left) and without labels and join edges (right). (hsep 1 provides a horizontal separator of unit length.)

Figure 1: kingGraph with labels and dashed join edges (left) and without (right).

Properties of forcing

We know there are at most two legal possibilities for adding a half-tile on a boundary edge of a Tgraph. If there are zero legal possibilities for adding a half-tile to some boundary edge, we have a stuck tiling/incorrect Tgraph.

Forcing deals with all cases where there is exactly one legal possibility for extending on a boundary edge. That means forcing either fails at some stage with a stuck Tgraph (indicating the starting Tgraph was incorrect) or it enlarges the starting Tgraph until every boundary edge has exactly two legal possibilities for adding a half-tile so a choice would need to be made to grow the Tgraph any further.

Figure 2 shows force kingGraph with kingGraph shown red.

Figure 2: force kingGraph with kingGraph shown red.

If g is a correct Tgraph, then force g succeeds and the resulting Tgraph will be common to all infinite tilings that extend the finite tiling represented by g. However, we will see that force g is not a greatest lower bound of (infinite) tilings that extend g. Firstly, what is common to all extensions of g may not be a connected collection of tiles. This leads to the concept of empires which we discuss later. Secondly, even if we only consider the connected common region containing g, we will see that we need to go beyond force g to find this, leading to an operation we call superForce.

Our empire and superForce operations are implemented using boundary coverings which we introduce next.

Boundary edge covering

Given a successfully forced Tgraph fg, a boundary edge covering of fg is a list of successfully forced extensions of fg such that

  1. no boundary edge of fg remains on the boundary in each extension, and
  2. the list takes into account all legal choices for extending on each boundary edge of fg.

[Technically this is a covering of the choices round the boundary, but each extension is also a cover of the boundary edges.] Figure 3 shows a boundary edge covering for a forced kingGraph (force kingGraph is shown red in each extension).

Figure 3: A boundary edge covering of force kingGraph.

In practice, we do not need to explore both choices for every boundary edge of fg. When one choice is made, it may force choices for other boundary edges, reducing the number of boundary edges we need to consider further.

The main function is boundaryECovering working on a BoundaryState (which is a Tgraph with extra boundary information). It uses covers which works on a list of extensions each paired with the remaining set of the original boundary edges not yet covered. (Initially covers is given a singleton list with the starting boundary state and the full set of boundary edges to be covered.) For each extension in the list, if its uncovered set is empty, that extension is a completed cover. Otherwise covers replaces the extension with further extensions. It picks the (lowest numbered) boundary edge in the uncovered set, tries extending with a half-dart and with a half-kite on that edge, forcing in each case, then pairs each result with its set of remaining uncovered boundary edges before adding the resulting extensions back at the front of the list to be processed again. If one of the choices for a dart/kite leads to an incorrect tiling (a stuck tiling) when forced, that choice is dropped (provided the other choice succeeds). The final list returned consists of all the completed covers.

  boundaryECovering:: BoundaryState -> [BoundaryState]
  boundaryECovering bs = covers [(bs, Set.fromList (boundary bs))]

  covers:: [(BoundaryState, Set.Set Dedge)] -> [BoundaryState]
  covers [] = []
  covers ((bs,es):opens) 
    | Set.null es = bs:covers opens -- bs is complete
    | otherwise   = covers (newcases ++ opens)
       where (de,des) = Set.deleteFindMin es
             newcases = fmap (\b -> (b, commonBdry des b))
                             (atLeastOne $ tryDartAndKite bs de)

Here we have used

  type Try a = Either String a
  tryDartAndKite:: BoundaryState -> Dedge -> [Try BoundaryState]
  atLeastOne    :: [Try a] -> [a]

We frequently use Try as a type for results of partial functions where we need to continue computation if there is a failure. For example we have a version of force (called tryForce) that returns a Try Tgraph so it does not fail by raising an error, but returns a result indicating either an explicit failure situation or a successful result with a final forced Tgraph. The function tryDartAndKite tries adding an appropriate half-dart and half-kite on a given boundary edge, then uses tryForceBoundary (a variant of tryForce which works with boundary states) on each result and returns a list of Try results. The list of Try results is converted with atLeastOne which collects the successful results but will raise an error when there are no successful results.
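As a minimal sketch of the behaviour just described, atLeastOne could be written along the following lines. The signature is the one given above; the body and the error message are assumptions, and the library's own definition may differ.

```haskell
import Data.Either (lefts, rights)

-- Same alias as in the text: Left carries a failure report.
type Try a = Either String a

-- | Collect the successful results, raising an error only when every
-- attempt failed. (Sketch only; not necessarily the library's definition.)
atLeastOne :: [Try a] -> [a]
atLeastOne results =
  case rights results of
    [] -> error ("atLeastOne: no successful results\n" ++ unlines (lefts results))
    ok -> ok
```

With this definition, a failed dart choice is silently dropped as long as the kite choice succeeds (and vice versa), which is the behaviour covers relies on.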

Boundary vertex covering

You may notice in figure 3 that the top right cover still has boundary vertices of kingGraph on the final boundary. We use a boundary vertex covering rather than a boundary edge covering if we want to exclude these cases. This involves picking a boundary edge that includes such a vertex and continuing the process of growing possible extensions until no boundary vertices of the original remain on the boundary.


A partial example of an empire was shown in a 1977 article by Martin Gardner [1]. The full empire of a finite tiling would consist of the common faces of all the infinite extensions of the tiling. This will include at least the force of the tiling but it is not obviously finite. Here we confine ourselves to the empire in finite local regions.

For example, we can calculate a local empire for a given Tgraph g by finding the common faces of all the extensions in a boundary vertex covering of force g (which we call empire1 g).

This requires an efficient way to compare Tgraphs. We have implemented guided intersection and guided union operations which, when given a common edge starting point for two Tgraphs, proceed to compare the Tgraphs face by face and produce an appropriate relabelling of the second Tgraph to match the first Tgraph only in the overlap where they agree. These operations may also use geometric positioning information to deal with cases where the overlap is not just a single connected region. From these we can return a union as a single Tgraph when it exists, and an intersection as a list of common faces. Since the (guided) intersection of Tgraphs (the common faces) may not be connected, we do not have a resulting Tgraph. However we can arbitrarily pick one of the argument Tgraphs and emphasise which are the common faces in this example Tgraph.

Figure 4 (left) shows empire1 kingGraph where the starting kingGraph is shown in red. The grey-filled faces are the common faces from a boundary vertex covering. We can see that these are not all connected and that the force kingGraph from figure 2 corresponds to the connected set of grey-filled faces around and including the kingGraph in figure 4.

Figure 4: King's empire (level 1 and level 2).

We call this a level 1 empire because we only explored out as far as the first boundary covering. We could instead find further boundary coverings for each of the extensions in a boundary covering. This grows larger extensions in which to find common faces. On the right of figure 4 is a level 2 empire (empire2 kingGraph) which finds the intersection of the combined boundary edge coverings of each extension in a boundary edge covering of force kingGraph. Obviously this process could be continued further but, in practice, it is too inefficient to go much further.


We might hope that (when not discovering an incorrect tiling), force g produces the maximal connected component containing g of the common faces of all infinite extensions of g. This is true for the kingGraph as noted in figure 4. However, this is not the case in general.

The problem is that forcing will not discover if one of the two legal choices for extending a resulting boundary edge always leads to an incorrect Tgraph. In such a situation, the other choice would be common to all infinite extensions.

We can use a boundary edge covering to reveal such cases, leading us to a superForce operation. For example, figure 5 shows a boundary edge covering for the forced Tgraph shown in red.

Figure 5: One choice cover.

This example is particularly interesting because in every case, the leftmost end of the red forced Tgraph has a dart immediately extending it. Why is there no case extending one of the leftmost two red edges with a half-kite? The fact that such cases are missing from the boundary edge covering suggests they are not possible. Indeed we can check this by adding a half-kite to one of the edges and trying to force. This leads to a failure showing that we have an incorrect tiling. Figure 6 illustrates the Tgraph at the point that it is discovered to be stuck (at the bottom left) by forcing.

Figure 6: An incorrect extension.

Our superForce operation starts by forcing a Tgraph. After a successful force, it creates a boundary edge covering for the forced Tgraph and checks to see if there is any boundary edge of the forced Tgraph for which each cover has the same choice. If so, that choice is made to extend the forced Tgraph and the process is repeated by applying superForce to the result. Otherwise, just the result of forcing is returned.
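The control flow of this loop can be sketched independently of the Tgraph details. In the sketch below the three function parameters (a forcing step, a boundary edge covering, and a search for a choice shared by every cover) are hypothetical stand-ins, and superForceWith is an illustrative name rather than the library's API.

```haskell
type Try a = Either String a

-- | Repeat "force, cover, extend with a common choice" until no boundary
-- edge has the same choice in every cover. All three parameters are
-- hypothetical stand-ins for the real Tgraph operations.
superForceWith
  :: (g -> Try g)           -- forcing step (fails on an incorrect tiling)
  -> (g -> [g])             -- boundary edge covering of a forced graph
  -> (g -> [g] -> Maybe g)  -- extend by a choice common to all covers, if any
  -> g
  -> Try g
superForceWith force covering commonChoice = go
  where
    go g = do
      fg <- force g
      case commonChoice fg (covering fg) of
        Just extended -> go extended  -- a forced choice was found: repeat
        Nothing       -> Right fg     -- nothing more to add: return the force
```

Whether the loop terminates depends on the supplied functions; the sketch assumes that a common choice is eventually no longer found.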

Figure 7 shows a chain of examples (rockets) where superForce has been used. In each case, the starting Tgraph is shown red, the additional faces added by forcing are shown black, and any further extension produced by superForce is shown in blue.

Figure 7: SuperForce rockets.


We still do not know if forcing decides that a Tgraph is correct/incorrect. Can we conclude that if force g succeeds then g (and force g) are correct? We found examples (rockets in figure 7) where force succeeds but one of the 2 legal choices for extending on a boundary edge leads to an incorrect Tgraph. If we find an example g where force g succeeds but both legal choices on a boundary edge lead to incorrect Tgraphs we will have a counter-example. If such a g exists then superForce g will raise an error. [The calculation of a boundary edge covering will call atLeastOne where both branches have led to failure for extending on an edge.]

This means that when superForce succeeds every resulting boundary edge has two legal extensions, neither of which will get stuck when forced.

I would like to thank Stephen Huggett who suggested the idea of using graphs to represent tilings and who is working with me on proof problems relating to the kite and dart tilings.

Reference [1] Martin Gardner (1977) MATHEMATICAL GAMES. Scientific American, 236(1), (pages 110 to 121).

by readerunner at April 26, 2023 05:13 PM

Magnus Therning

Some practical Haskell

As I'm nearing the end of my time with my current employer I thought I'd put together some bits of practical Haskell that I've put into production. We only have a few services in Haskell, and basically I've had to sneak them into production. I'm hoping someone will find something useful. I'd be even happier if I get pointers on how to do this even better.


I've written about that earlier in three posts:

  1. A take on log messages
  2. A take on logging
  3. Logging with class

Final exception handler

After reading about the uncaught exception handler in Serokell's article I've added the following snippet to all the services.

main :: IO ()
main = do
    originalHandler <- getUncaughtExceptionHandler
    setUncaughtExceptionHandler $ handle originalHandler . lastExceptionHandler logger

lastExceptionHandler :: Logger -> SomeException -> IO ()
lastExceptionHandler logger e = do
    fatalIO logger $ lm $ "uncaught exception: " <> displayException e

Handling signals

To make sure the platform we're running our services on is happy with a service it needs to handle SIGTERM, and when running it locally during development, e.g. for manual testing, it's nice if it also handles SIGINT.

The following snippet comes from a service that needs to make sure that every iteration of its processing is completed before shutting down, hence the IORef that's used to signal whether processing should continue or not.

main :: IO ()
main = do
    cont <- newIORef True
    void $ installHandler softwareTermination (Catch $ sigHandler logger cont) Nothing
    void $ installHandler keyboardSignal (Catch $ sigHandler logger cont) Nothing

sigHandler :: Logger -> IORef Bool -> IO ()
sigHandler logger cont = do
    infoIO logger "got a signal, shutting down"
    writeIORef cont False
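
The processing loop itself is not shown above. A minimal sketch of how such a loop might consult the IORef between iterations follows; loopWhile is a hypothetical helper, not taken from the service.

```haskell
import Control.Monad (when)
import Data.IORef

-- | Run one iteration at a time, checking the flag only between
-- iterations, so a signal never interrupts an iteration half way.
-- (Hypothetical helper; the name and shape are assumptions.)
loopWhile :: IORef Bool -> IO () -> IO ()
loopWhile cont step = do
  keepGoing <- readIORef cont
  when keepGoing $ do
    step
    loopWhile cont step
```

Because the flag is only read at the top of the loop, the iteration in progress when SIGTERM arrives always runs to completion.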


Due to some details about how networking works in our platform it's currently not possible to use network-based probing. Instead we have to use files. There are two probes that are of interest

  • A startup probe, existence of the file signals that the service has started and is about to begin processing.
  • A progress probe, a timestamp signals the time the most recent iteration of processing finished.

I've written a little bit about the latter before in A little Haskell: epoch timestamp, but here I'm including both functions.

createPidFile :: FilePath -> IO ()
createPidFile fn = getProcessID >>= writeFile fn . show

writeTimestampFile :: MonadIO m => FilePath -> m ()
writeTimestampFile fn = liftIO $ do
    getPOSIXTime >>= (writeFile fn . show) . truncate @_ @Int64 . (* 1000)



The actual probing is then done using a command that compares the saved timestamp with the current time. As long as the difference is smaller than a threshold the probe succeeds.
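
The probe is an external command, but the comparison it performs can be sketched in Haskell for illustration. The names isFresh and probeProgress are hypothetical, and the sketch assumes the file holds milliseconds since the epoch as written by writeTimestampFile.

```haskell
import Data.Int (Int64)
import Data.Time.Clock.POSIX (getPOSIXTime)

-- | True when the saved timestamp is recent enough. All values are
-- milliseconds since the epoch, matching writeTimestampFile.
isFresh :: Int64 -> Int64 -> Int64 -> Bool
isFresh thresholdMs nowMs savedMs = nowMs - savedMs < thresholdMs

-- | Read the timestamp file and compare it with the current time.
-- (Hypothetical helper; the real probe is an external command.)
probeProgress :: Int64 -> FilePath -> IO Bool
probeProgress thresholdMs fn = do
  saved <- read <$> readFile fn
  now   <- truncate . (* 1000) <$> getPOSIXTime
  pure (isFresh thresholdMs now saved)
```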

April 26, 2023 11:54 AM

April 21, 2023

JP Moresmau

Data queries and transformations via WebAssembly plugins

Following on the previous research I've done on using WebAssembly as a way to write plugins possibly in several languages and running them in a Rust app via the runtime, I've started building something more extensive.

The plugins now allow writing data querying and transformation code without having to deal with the low level details on how to connect to the underlying database. A plugin can:

- define the input parameters it needs to run.

- based on the actual provided values for these parameters, generate a SQL query string and bound parameters that actually need to run on the database. So a plugin can have full control on the SQL generation. Some plugins could always use the same SQL query and just use bound parameters coming from their input parameters, or could do any kind of preprocessing to generate the SQL, for things that bound parameters don't allow.

- the runtime will run the query with the bound parameters on the underlying database, and return the result row by row to the plugin. The plugin can then do whatever processing on each row it needs, and return intermediate results or only return results at the end.

Since I'm using Wasmer, let's define the interface we need in WAI format:

// Get the metadata of the query.
metadata: func() -> query-metadata

// Start the query processing with the given variables.
start: func(variables: list<variable>) -> execution

// Encapsulates the query row processing.
resource execution {
    // The actual query to run.
    query-string: func() -> string

    // The variables to use in the query.
    variables: func() -> list<variable>

    // Callback on each data row, returning potential intermediate results.
    row: func(data: list<variable>) -> option<query-result>

    // Callback on query end, returning potential final results.
    // Columns are passed in case no data was returned.
    end: func(columns: list<string>) -> option<query-result>
}

The simple data types are left out for brevity (they are defined in their own file), but hopefully the intent of this interface should be clear. The metadata function returns a description of what the plugin does and what parameters it takes (a parameter is strongly typed). The start function takes actual values for the parameters and returns an execution resource. This execution exposes the actual query string and bound parameters for that query (via the variables method). The runtime will then call the row function for each result row and the end function at the end. Each of these can return a possibly partial result. The end function takes the names of the columns so that proper metadata is known even if the query returned no rows.

Examples of very basic plugins used in tests can be seen here and here. They just collect the data passed to them and return it in the end method.

Each plugin can then be compiled to WASM for example via the cargo wapm --dry-run command provided by Wasmer.

The current runtime I've built is very simple: it takes all plugins from a folder and database connections are defined in a YAML file, and only Sqlite and Postgres are supported. An executable is provided to be able to run plugins from the command line.

Using a WAI interface and not having to deal with low level WASM code is great. cargo expand is your friend to understand what Wasmer generates as structures are generated differently between the import! and export! macros, so some structures own their data while some take references, which can sometimes trip you up.

Of course I would need to test this for performance, to determine how much copying of data is done between the Rust runtime and the WASM plugins.

Let me know if you have use cases where this approach could be interesting! As usual, all code is available on Github.

by JP Moresmau at April 21, 2023 07:39 AM

April 18, 2023


falsify: Hypothesis-inspired shrinking for Haskell

Consider this falsify property test that tries to verify the (obviously false) property that all elements of all lists of up to 10 binary digits are the same (we will explain the details below; hopefully the intent is clear):

prop_list :: Property ()
prop_list = do
    n  <- gen $ Gen.integral $ Range.between (0, 10)
    xs <- gen $ replicateM n $ Gen.int $ Range.between (0, 1)
    assert $ P.pairwise P.eq .$ ("xs", xs)

we might get a counter-example such as this:

failed after 9 shrinks
(xs !! 0) /= (xs !! 1)
xs     : [0,1]
xs !! 0: 0
xs !! 1: 1

More interesting than the counter-example itself is how falsify arrived at that counter-example; if we look at the shrink history (--falsify-verbose), we see that the list shrunk as follows:

~> [1,1,0]       -- shrink the list length
~> [0,1,0]       -- shrink an element of the list
~> [0,1]         -- shrink the list length again

The test runner is able to go back and forth between shrinking the length of the list, and shrinking elements in the list. That is, we have integrated shrinking (like in hedgehog: we do not specify a separate generator and shrinker), which is internal: it works across monadic bind. The Python Hypothesis library showed the world how to achieve this. In this blog post we will introduce falsify, a new library that provides property based testing in Haskell and has an approach to shrinking that is inspired by Hypothesis. As we shall see, however, the details are quite different.


In this first section we will discuss some of the background behind falsify; the next section will be a more tutorial-style introduction on how to use it. This section is not meant to be an exhaustive discussion of the theory behind falsify, or how the theory differs from that of Hypothesis; both of those topics will be covered in a paper, currently under review. However, a basic understanding of these principles will help to use the library more effectively, and so that will be our goal in this first section.

Unit testing versus property based testing

In unit testing (for example using tasty-hunit), a test for a function f might look something like this:

test :: Assertion
test =
    unless (f input == expected) $
      assertFailure "not equal"

That is, we apply f to specific input, and then verify that we get an expected result. By contrast, in property based testing, we do not specify a specific input, but instead generate a random input using some generator genInput, and then verify that the input and the output are related by some property prop:

test_property :: Property ()
test_property = do
    input <- gen $ genInput
    unless (prop input (f input)) $
      testFailed "property not satisfied"

This blog post is not intended as an introduction to property-based testing; merely observe that generation of input values is a critical ingredient in property based testing. However, if you are not familiar with the topic, or not yet convinced that you should be using it, I can highly recommend watching Testing the Hard Stuff and Staying Sane, and then reading How to Specify It!: A Guide to Writing Properties of Pure Functions, both by the world’s foremost property-based testing guru John Hughes.

The importance of shrinking

Suppose we want to test the (false) property that for all numbers x and y, x - y == y - x:

prop_shrinking :: Property ()
prop_shrinking = do
    x <- gen $ Gen.int $ Range.between (0, 99)
    y <- gen $ Gen.int $ Range.between (0, 99)
    unless (x - y == y - x) $
      testFailed "property not satisfied"

Since the property is false, we will get a counter-example. Without shrinking, one such counter-example might be

x = 38
y = 23

However, although that is indeed a counter-example to the property, it’s not a great counter-example. Why these specific numbers? Is there something special about them? To quote John Hughes in Experiences with QuickCheck: Testing the Hard Stuff and Staying Sane:

Random tests contain a great deal of junk—that is their purpose! Junk provokes unexpected behaviour and tests scenarios that the developer would never think of. But tests usually fail because of just a few features of the test case. Debugging a test that is 90% irrelevant is a nightmare; presenting the developer with a test where every part is known to be relevant to the failure, simplifies the debugging task enormously.

(emphasis mine). For our example, the numbers 38 and 23 are not particularly relevant to the failure; with shrinking, however, the counter-example we will get is

x = 0
y = 1

Indeed, this is the only counter-example we will ever get: 0 is the smallest number possible (“least amount of detail”), and the only thing that is relevant about the second number is that it’s not equal to the first.

Parsing versus generation

Generation of inputs relies on pseudo-random number generators (PRNGs). The typical interface to a PRNG is something like this:

next :: PRNG -> (Word, PRNG)

Given such an interface, we might define the type of generators as

newtype Gen a = Gen (PRNG -> (a, PRNG))

This covers generation, but not shrinking. The traditional approach in QuickCheck to shrinking is to pair a generator with a shrinking function, a function of type

shrink :: a -> [a]

This works, but it’s not without its problems; see my blog post Integrated versus Manual Shrinking for an in-depth discussion. The key insight of the Hypothesis library is that instead of shrinking generated values, we instead shrink the samples produced by the PRNG. Suppose we unfold a PRNG to a stream of random samples:

unfoldLinear :: PRNG -> [Word]
unfoldLinear prng =
    let (s, prng') = next prng
    in s : unfoldLinear prng'

Then we can shift our perspective: rather than thinking of generating random values from a PRNG we instead parse this stream of random samples:

newtype Parser a = Parser ([Word] -> (a, [Word]))

Instead of having a separate shrinking function, we now simply shrink the list of samples, and then re-run the parser. This is the Hypothesis approach in a nutshell; parsers of course need to ensure that the produced value shrinks as the samples are shrunk. For example, here is a very simple (proof of concept) generator for Bool:

parseBool :: Parser Bool
parseBool = Parser $ \(s:ss) -> (
    if s >= maxBound `div` 2 then True else False
  , ss
  )
Assuming that the sample is chosen uniformly in the full Word range, this parser will choose uniformly between True and False; and as the sample is shrunk towards zero, the boolean will shrink towards False.
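
To make "shrink the list of samples, and then re-run the parser" concrete, here is a small self-contained sketch. It is not how Hypothesis or falsify implement shrinking: shrinkWord, shrinkSamples and shrinkWith are hypothetical helpers, the shrinker is a naive greedy loop over a finite sample list, and parseBool is given a fallback for an empty list as an extra assumption.

```haskell
newtype Parser a = Parser ([Word] -> (a, [Word]))

runParser :: Parser a -> [Word] -> a
runParser (Parser p) = fst . p

parseBool :: Parser Bool
parseBool = Parser $ \samples -> case samples of
  s : ss -> (s >= maxBound `div` 2, ss)
  []     -> (False, [])  -- assumption: an exhausted list reads as zero

-- | Smaller candidates for a single sample: zero, then half.
shrinkWord :: Word -> [Word]
shrinkWord 0 = []
shrinkWord s = 0 : [h | let h = s `div` 2, h /= 0]

-- | Candidates that shrink exactly one sample in the list.
shrinkSamples :: [Word] -> [[Word]]
shrinkSamples []       = []
shrinkSamples (s : ss) =
     [s' : ss | s' <- shrinkWord s]
  ++ [s : ss' | ss' <- shrinkSamples ss]

-- | Greedily shrink the samples while the re-parsed value still fails.
shrinkWith :: Parser a -> (a -> Bool) -> [Word] -> [Word]
shrinkWith p stillFails = go
  where
    go samples =
      case filter (stillFails . runParser p) (shrinkSamples samples) of
        smaller : _ -> go smaller
        []          -> samples
```

Note that the shrinker never looks at the Bool directly: it only proposes smaller samples and asks whether the value parsed from them is still a counter-example.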

Streams versus trees

If you look at the definition of Gen in QuickCheck, you will see it’s actually different to the definition we showed above:

newtype Gen a = Gen (PRNG -> a)

Like our definition above, this generator takes a PRNG as input, but it does not return an updated PRNG. This might seem confusing: suppose we are generating two numbers, as in our example above; how do we ensure those two numbers are generated from different PRNGs?

To solve this problem, we will need a PRNG that in addition to next, also provides a way to split the PRNG into two new PRNGs:

next  :: PRNG -> (Word, PRNG)  -- as before
split :: PRNG -> (PRNG, PRNG)  -- new

Then to run two generators, we first split the PRNG:

both :: Gen a -> Gen b -> Gen (a, b)
both (Gen g1) (Gen g2) = Gen $ \prng ->
    let (l, r) = split prng
    in (g1 l, g2 r)

The advantage of this approach is laziness: we can produce the second value of type b without generating the value of type a first. Indeed, if we never demand the value of a, we will not generate it at all! This is of critical importance if we have generators for infinite values; for example, it is what enables us to Generate Functions.

The falsify definition of Gen

If we apply the insight from Hypothesis (that is, parse samples rather than generate using PRNGs) to this new setting where splitting PRNGs is a fundamental operation, we arrive at the definition of Gen in falsify. First, unfolding a PRNG does not give us an infinite stream of samples, but rather an infinite tree of samples:

data STree = STree Word STree STree

unfold :: PRNG -> STree
unfold prng =
    let (s, _) = next  prng
        (l, r) = split prng
    in STree s (unfold l) (unfold r)

A generator is then a function that takes a part of a sample tree, parses it, and produces a value together with a list of shrunk versions of that sample tree:

newtype Gen a = Gen (STree -> (a, [STree]))

This does not reintroduce dependencies between generators: each generator will be run against a different subtree, and update only that subtree. For example, here is how we might run two generators:

both :: Gen a -> Gen b -> Gen (a, b)
both (Gen g1) (Gen g2) = Gen $ \(STree s l r) ->
    let (a, ls) = g1 l
        (b, rs) = g2 r
    in ( (a, b)
       ,    [STree s l' r  | l' <- ls]
         ++ [STree s l  r' | r' <- rs]
       )

Note that we are focussing on the core concepts here, and are glossing over various details. In particular, the actual definition in falsify has an additional constructor Minimal, which is a finite representation of the infinite tree that is zero everywhere. This is a key component in making this work with infinite data structures; see upcoming paper for an in-depth discussion. Users of the library however generally do not need to be aware of this (indeed, the sample tree abstraction is not part of the public API).
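
The simplified definitions above can be turned into a small runnable sketch. Everything beyond STree, Gen and both is an assumption made for illustration: prim (a generator that returns the root sample), constTree (a tree with the same sample everywhere) and shrinkWord are hypothetical helpers, and the real library differs in the ways just mentioned.

```haskell
data STree = STree Word STree STree

newtype Gen a = Gen (STree -> (a, [STree]))

runGen :: Gen a -> STree -> (a, [STree])
runGen (Gen g) = g

-- | Infinite tree holding the same sample everywhere (for experiments).
constTree :: Word -> STree
constTree s = t where t = STree s t t

-- | Smaller candidates for a single sample: zero, then half.
shrinkWord :: Word -> [Word]
shrinkWord 0 = []
shrinkWord s = 0 : [h | let h = s `div` 2, h /= 0]

-- | Return the root sample; shrink candidates change only that sample.
prim :: Gen Word
prim = Gen $ \(STree s l r) -> (s, [STree s' l r | s' <- shrinkWord s])

-- | Run two generators against different subtrees, as in the text.
both :: Gen a -> Gen b -> Gen (a, b)
both (Gen g1) (Gen g2) = Gen $ \(STree s l r) ->
  let (a, ls) = g1 l
      (b, rs) = g2 r
  in ( (a, b)
     ,    [STree s l' r  | l' <- ls]
       ++ [STree s l  r' | r' <- rs]
     )
```

Re-running both prim prim on each shrink candidate changes only one component of the pair at a time, which is the independence property described above.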

Consequences of using sample trees

Arguably all of the key differences between Hypothesis and falsify stem from the difference in representation of samples: a linear stream in Hypothesis and an infinite tree in falsify. In this section we will discuss two consequences of this choice.

Shrinking the sample tree

First, we need to decide how to shrink a sample tree. In Hypothesis, the sample stream (known as a “choice sequence”) is subjected to all kinds of passes (15 and counting, according to Test-Case Reduction via Test-Case Generation: Insights from the Hypothesis Reducer), which shrink the sample stream according to lexicographical ordering; for example:

.., x, ..        < .., x', ..       -- shrink an element (x' < x)
.., x, y, z, ..  < .., x, z, ..     -- drop an element from the stream
.., x, y, z, ..  < .., y, z, x, ..  -- sort part of the stream (y < z < x)

When we are dealing with infinite sample trees, such a total ordering does not exist. For example, consider the following two trees:

tree1 = STree ..         tree2 = STree ..
         (STree 1 ..)              (STree 2 ..)
         (STree 4 ..)              (STree 3 ..)

Sample 1 in tree1 is less than the corresponding sample 2 in tree2, but sample 4 in tree1 is greater than the corresponding sample 3 in tree2. Hence, we have neither tree1 < tree2 nor tree2 < tree1: these two trees are incomparable. Instead, falsify works with a partial ordering; instead of the multitude of shrinking passes of Hypothesis, falsify has precisely one pass [1]: shrink an individual sample in the tree.

Distributing samples to parsers

When we have a stream of values that we need to use for multiple parsers, we need to decide which samples go to which parser. In Hypothesis, this essentially happens on a first-come-first-served basis: any samples left unused by the first parser will be used by the next. As discussed, falsify parsers do not return “samples left unused.” Instead, the sample tree is split each time we compose parsers, like we did in both, shown above. In practice, this happens primarily when using applicative <*> or monadic >>=.


These two differences are rather technical in nature; how do they affect users? Suppose we have a generator that produces a list and then a number:

listThenNum :: Gen ([Bool], Int)
listThenNum = do
    xs <- Gen.list ..
    n  <- Gen.int ..
    return (xs, n)

If we are using a stream of samples, Hypothesis style, and then drop a random sample from that stream, the generator for int might suddenly be run against an entirely different sample; it might increase in value! Similarly, if we run that int generator against the first sample left over by the list generator, and if that list generator uses fewer samples as it shrinks, we might also run int against an unrelated sample, and its value might again increase.

This is not necessarily a problem; after all, we can then start to decrease that new int value again. However, that is only possible if the generated value with the larger int is still a counter-example to whatever property is being tested. If that is not the case, then we might not be able to shrink the list, and we might end up with a non-minimal counter-example. That can make debugging more difficult (we haven’t gotten rid of all the “junk”), and it can be difficult for users to understand why this might not shrink any further; even if the library offers facilities for showing why shrinking stopped (for example, showing which shrunk examples were rejected; verbose mode in falsify), it can still be quite puzzling why the library is trying to increase a value during shrinking.

Neither of these problems can arise in falsify: it never drops samples at all (instead, only shrinking individual samples), and since monadic bind splits the sample tree, we are guaranteed that the behaviour of int is entirely unaffected by the behaviour of list. This makes the shrinking behaviour in falsify more predictable and easier to understand. [2]

Monadic bind

We mentioned above that QuickCheck’s approach to shrinking has its problems, without going into detail about what those problems are. Instead, we referred to the blogpost Integrated versus Manual Shrinking; this blog post discusses not only the problems in QuickCheck, but also shows one alternative approach, known as integrated shrinking, used by QuviQ QuickCheck and made popular in the Haskell world by the library hedgehog.

The problem with integrated shrinking is that it does not work across monadic bind. The linked blogpost explains this in great detail, but the essence of the problem is not hard to see. Consider the type of monadic bind:

(>>=) :: Gen a -> (a -> Gen b) -> Gen b

We cannot shrink the right hand side of (>>=) independent from the left hand side, because the right hand side is not a generator. We only have a generator once we apply the supplied function to the result of the first generator. This means that we cannot shrink these two generators independently: if, after shrinking the right hand side, we go back and then shrink the left hand side, we get an entirely different generator, and the shrinking we did previously is just wasted.

In practice what this means is that once we start shrinking the right hand side, we will never go back anymore and shrink the left hand side. In the example from the introduction we first generated a list length, and then the elements of the list:

prop_list :: Property ()
prop_list = do
    n  <- gen $ Gen.integral $ Range.between (0, 10)
    xs <- gen $ replicateM n $ Gen.int $ Range.between (0, 1)
    assert $ P.pairwise P.eq .$ ("xs", xs)

With integrated shrinking, once we start shrinking elements from the list, we will never go back anymore and shrink the list length. With internal shrinking, however, we can go back and forth across monadic bind. This is the raison d’être of internal shrinking: it doesn’t matter that we cannot shrink the two generators independently, because we are not shrinking generators! Instead, we just shrink the samples that feed into those generators.
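
A toy model shows why this works. The sketch below reuses the simplified Gen from the background section and adds a bind that splits the sample tree; prim, constTree, shrinkWord, pureGen, listOf and lengthThenList are hypothetical names, and this is not the library's actual Monad instance.

```haskell
data STree = STree Word STree STree

newtype Gen a = Gen (STree -> (a, [STree]))

runGen :: Gen a -> STree -> (a, [STree])
runGen (Gen g) = g

constTree :: Word -> STree
constTree s = t where t = STree s t t

shrinkWord :: Word -> [Word]
shrinkWord 0 = []
shrinkWord s = 0 : [h | let h = s `div` 2, h /= 0]

-- | Return the root sample; shrink candidates change only that sample.
prim :: Gen Word
prim = Gen $ \(STree s l r) -> (s, [STree s' l r | s' <- shrinkWord s])

pureGen :: a -> Gen a
pureGen x = Gen $ \_ -> (x, [])

-- | Bind that splits the tree: the first generator parses the left
-- subtree, the generator chosen by its result parses the right subtree.
bind :: Gen a -> (a -> Gen b) -> Gen b
bind (Gen g) k = Gen $ \(STree s l r) ->
  let (a, ls) = g l
      (b, rs) = runGen (k a) r
  in (b, [STree s l' r | l' <- ls] ++ [STree s l r' | r' <- rs])

-- | A list of exactly n raw samples.
listOf :: Word -> Gen [Word]
listOf 0 = pureGen []
listOf n =
  prim `bind` \x ->
  listOf (n - 1) `bind` \xs ->
  pureGen (x : xs)

-- | First generate a length, then that many elements.
lengthThenList :: Gen [Word]
lengthThenList = prim `bind` listOf
```

Running lengthThenList on constTree 2 yields [2,2], and its shrink candidates re-parse to both shorter lists and lists with a smaller element. Every candidate tree again offers both kinds of shrink, so shrinking can move back and forth across the bind.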

Selective functors

It is important to understand the limitations of internal shrinking: it is certainly not a silver bullet. For example, consider this combinator that takes two generators, flips a coin (generates a boolean, shrinking towards True), and then executes one of the two generators:

choose :: Gen a -> Gen a -> Gen a -- Suboptimal definition
choose g g' = do
    b <- Gen.bool True
    if b then g else g'

This combinator works, but it’s not optimal. Suppose the initial value of b is False, and so we use g'; and let’s suppose furthermore that we spend some time shrinking the sample tree using g'. Consider what happens if b now shrinks to True. When this happens we will now run g against the sample tree as it was left after shrinking with g'. Although we can do that, it very much depends on the specific details of g and g' whether it’s useful to do it, and we will certainly lose the predictability we discussed above.

We could try to make the two generators shrink independent from each other by simply running both of them, and using the boolean only to choose which result we want. After all, Haskell is lazy, and so this should be fine:

choose :: Gen a -> Gen a -> Gen a -- Bad definition!
choose g g' = do
    x <- g
    y <- g'
    b <- Gen.bool True
    return $ if b then x else y

While it is true that generation using this definition of choose will work just fine (and laziness ensures that we will in fact only run whatever generator is used), this combinator shrinks very poorly. The problem is that if we generate a value but then don’t use it, the part of the sample tree that we used to produce that value is irrelevant, and so by definition we can always replace it by the sample tree that is zero everywhere. This means that if we later want to switch to that generator, we will only be able to do so if the absolute minimum value that the generator can produce happens to work for whatever property we’re testing. This is an important lesson to remember:

Do not generate values and then discard them: such values will always shrink to their minimum. (Instead, don’t generate the value at all.)

To solve this problem, we need to make it visible to the library when we need a generator and when we do not, so that it can avoid shrinking that part of the sample tree while the generator is not in use. Selectively omitting effects is precisely what selective applicative functors give us. A detailed discussion of this topic would take us well outside the scope of this blog post; in the remainder of this section we will discuss the basics only.

Gen is a selective functor, which means that it is an instance of Selective, which has a single method called select:

select :: Gen (Either a b) -> Gen (a -> b) -> Gen b

The intuition is that we run the first generator; if that produces Left a, we run the second generator to get a b; if the first generator produces Right b, we skip the second generator completely. Like for applicative <*> and monadic (>>=), the two generators are run against different subtrees of the sample tree, but the critical difference is that we will not try to shrink the right subtree for the second generator unless that generator is used.

If that all sounds a bit abstract, perhaps it suffices to say that any selective functor supports

ifS :: Selective f => f Bool -> f a -> f a -> f a

which we can use to implement choose in a way that avoids reusing the sample tree of the first generator for the second:

choose :: Gen a -> Gen a -> Gen a
choose = ifS (bool True)

Indeed, this is precisely the definition in the falsify library itself.
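To make "selectively omitting effects" a little more concrete, here is a minimal sketch that uses only base. The Log type, say, ifA and the local Selective class are all hypothetical stand-ins introduced for illustration; this is not how falsify's Gen is implemented. The only "effect" here is appending to a log, which lets us see which branches were actually run.

```haskell
module Main (main) where

import Data.Bool (bool)

-- A toy effect: a value together with a log of the effects that were run.
newtype Log a = Log ([String], a)

runLog :: Log a -> ([String], a)
runLog (Log p) = p

instance Functor Log where
  fmap f (Log (w, a)) = Log (w, f a)

instance Applicative Log where
  pure a = Log ([], a)
  Log (w1, f) <*> Log (w2, a) = Log (w1 ++ w2, f a)

-- Local stand-in for the Selective class, with the same shape of 'select'.
class Applicative f => Selective f where
  select :: f (Either a b) -> f (a -> b) -> f b

instance Selective Log where
  -- Left: the second effect is needed, so its log entries are included.
  select (Log (w1, Left a)) (Log (w2, f)) = Log (w1 ++ w2, f a)
  -- Right: the second effect is skipped entirely.
  select (Log (w1, Right b)) _ = Log (w1, b)

-- Choose between two continuations, running only the one that is needed.
branch :: Selective f => f (Either a b) -> f (a -> c) -> f (b -> c) -> f c
branch x l r = select (select (fmap (fmap Left) x) (fmap (fmap Right) l)) r

-- Selective if-then-else: only the chosen branch contributes effects.
ifS :: Selective f => f Bool -> f a -> f a -> f a
ifS c t e = branch (bool (Right ()) (Left ()) <$> c) (const <$> t) (const <$> e)

-- Applicative-only choice: always runs both branches, then discards one.
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c t e = (\b x y -> if b then x else y) <$> c <*> t <*> e

-- Produce a value while recording a log entry.
say :: String -> a -> Log a
say msg a = Log ([msg], a)

main :: IO ()
main = do
  print (runLog (ifA (say "cond" True)  (say "then" 'x') (say "else" 'y')))
  print (runLog (ifS (say "cond" True)  (say "then" 'x') (say "else" 'y')))
  print (runLog (ifS (say "cond" False) (say "then" 'x') (say "else" 'y')))
```

With ifA the log always contains both "then" and "else"; with ifS it contains only the branch that was selected. This mirrors the difference between the bad definition of choose and the one based on ifS.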


With the background out of the way, let’s now consider how to actually use the library. Probably the easiest way to get started is to use the tasty integration. Here is a minimal template to get you started:

module Main (main) where

import Test.Tasty
import Test.Tasty.Falsify

main :: IO ()
main = defaultMain $ testGroup "MyTestSuite" [
      testProperty "myFirstProperty" prop_myFirstProperty
    ]

prop_myFirstProperty :: Property ()
prop_myFirstProperty = return ()

This depends on the tasty package, as well as falsify of course. If you want, you can also use the Test.Falsify.Interactive module to experiment with falsify in ghci.

Getting started

Suppose we want to test that if we multiply a number by two, the result must be even. Here’s how we could do it:

prop_multiply2_even :: Property ()
prop_multiply2_even = do
    x <- gen $ Gen.int $ Range.withOrigin (-100, 100) 0
    unless (even (x * 2)) $ testFailed "not even"

Some observations:

  • Property is a monad, so the usual combinators (such as unless) for monads are available
  • gen runs a generator, and adds the output of the generator to the test log. (The test log is only shown when the property fails.)
  • Gen.int is an alias for Gen.integral, which can produce values for any Integral type. There is no analogue of QuickCheck’s Arbitrary class in falsify: like in hedgehog and in Hypothesis, every generator must be explicitly specified. For a justification of this choice, see Jacob Stanley’s excellent Lambda Jam 2017 presentation Gens N’ Roses: Appetite for Reduction (Jacob is the author of hedgehog).
  • The specified Range tells the generator two things: in which range to produce a value, and how to shrink that value. In our example, withOrigin takes an “origin” as explicit value (here, 0), and the generator will shrink towards that origin.
  • testFailed is the primitive way to make a test fail, but we shall see a better way momentarily.


Suppose we mistakenly think we need to multiply a number by three to get an even number:

prop_multiply3_even :: Property ()
prop_multiply3_even = do
    x <- gen $ Gen.int $ Range.withOrigin (-100, 100) 0
    unless (even (x * 3)) $ testFailed "not even"

If we run this test, we will get a counter-example:

multiply3_even: FAIL
  failed after 14 shrinks
  not even
  Logs for failed test run:
  generated 1 at CallStack (from HasCallStack):
    gen, called at demo/Demo/Blogpost.hs:217:10 in main:Demo.Blogpost

This test output is not awful: it gives us the counter-example (1), and that counter-example is minimal. We can however do much better; the idiomatic way in falsify to test properties of values is to use a Predicate. A predicate of type

Predicate '[a, b, c, ..]

is essentially a function

a -> b -> c -> .. -> Bool

but in such a way that it can produce a meaningful message if the predicate is not satisfied. Here’s how we might use it for our example:

prop_multiply3_even_pred :: Property ()
prop_multiply3_even_pred = do
    x <- gen $ Gen.int $ Range.withOrigin (-100, 100) 0
    assert $ P.even `P.dot` P.fn ("multiply3", (* 3)) .$ ("x", x)

Some comments:

  • P.even, like even from the prelude, is a predicate that checks that its argument is even.
  • P.dot, like (.) from the prelude, composes a predicate with a function. In addition to the function itself, you also specify the name of the function, so that that name can be used in error messages.
  • (.$), like ($) from the prelude, applies a predicate to a named argument.

The use of predicates is not required, but can be very helpful indeed. For our running example, this will produce this test failure message instead:

multiply3_even_pred: FAIL
  failed after 2 successful tests and 13 shrinks
  not (even (multiply3 x))
  x          : 1
  multiply3 x: 3
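To give some intuition for how a predicate can be "essentially a function" and still produce a message like the one above, here is a minimal sketch using only base. The Pred type and the functions evenP, dot and at are hypothetical stand-ins for illustration; they are much simpler than falsify's Predicate (one argument only) and do not reflect its actual implementation.

```haskell
module Main (main) where

-- Hypothetical one-argument predicate: it can evaluate itself, and it can
-- render the expression it was applied to.
data Pred a = Pred
  { render :: String -> String
    -- text of the predicate applied to an argument expression
  , eval   :: String -> a -> (Bool, [(String, String)])
    -- result, plus the named intermediate values computed along the way
  }

evenP :: Integral a => Pred a
evenP = Pred
  { render = \e -> "even (" ++ e ++ ")"
  , eval   = \_ x -> (even x, [])
  }

-- Compose a predicate with a named function, like (.) but keeping the name.
dot :: Show b => Pred b -> (String, a -> b) -> Pred a
dot p (name, f) = Pred
  { render = \e -> render p (name ++ " " ++ e)
  , eval   = \e x ->
      let e'         = name ++ " " ++ e
          y          = f x
          (ok, vals) = eval p e' y
      in  (ok, (e', show y) : vals)
  }

-- Apply a predicate to a named argument, like ($) but keeping the name.
at :: Show a => Pred a -> (String, a) -> Either [String] ()
at p (name, x) =
  case eval p name x of
    (True, _)     -> Right ()
    (False, vals) ->
      Left $ ("not (" ++ render p name ++ ")")
           : [k ++ ": " ++ v | (k, v) <- (name, show x) : vals]

main :: IO ()
main =
  case (evenP `dot` ("multiply3", (* 3))) `at` ("x", 1 :: Int) of
    Right ()  -> putStrLn "ok"
    Left msgs -> mapM_ putStrLn msgs
```

Because every function and argument carries a name, the failure message can show the expression that failed together with the intermediate values, without the test author writing any message by hand.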

Ranges, Labelling

We saw the use of withOrigin already, and earlier in this blog post we used between; a generator such as

Gen.integral $ Range.between (10, 100)

will produce a value between 10 and 100 (inclusive), shrinking towards 10; it is also possible to flip the two bounds to shrink towards 100 instead.

The other very useful Range constructor is skewedBy. A generator such as

Gen.integral $ Range.skewedBy 5 (0, 100)

will produce values between 0 and 100, like between does, but skewed towards zero; a negative skew value will instead skew towards 100 (but still shrink towards zero). As an example use case, suppose that for a certain property we need a list of Int and an Int, and sometimes that separate Int should be a member of the list, sometimes not:

prop_skew :: Double -> Property ()
prop_skew skew = do
    xs <- gen $ Gen.list rangeListLen $ Gen.integral rangeValues
    x  <- gen $ Gen.integral rangeValues
    collect "elem" [x `elem` xs]
  where
    rangeListLen, rangeValues :: Range Word
    rangeListLen = Range.between (0, 10)
    rangeValues  = Range.skewedBy skew (0, 100)

This example is a property that always passes, but we use collect to collect some statistics; specifically, in what percentage of the tests x is an element of xs. If we run this with a skew of 0, we might see something like:

100000 successful tests

Label "elem":
   94.6320% False
    5.3680% True

In only 5% of cases the element appears in the list. There are various ways in which we could change that distribution of test data, but the simplest way is simply to generate more values towards the lower end of the range; if we run the test with a skew of 5 we get

Label "elem":
   41.8710% False
   58.1290% True


Nearly all generators are built using prim as their basic building block, which returns the next sample from the sample tree. Higher-level generators split into two categories: “simple” (non-compound) generators that produce a value given some arguments, and generator combinators that take generators as input and produce new generators. Some important examples in the first category are:

  • integral, which we discussed already
  • bool, which produces a Bool, shrinking towards a choice of True or False
  • elem, which picks a random element from a list, and shuffle, which shuffles a list
  • etc.

The library also offers a number of generator combinators; here we list the most important ones:

  • choose we saw when we discussed Selective functors, and chooses (uniformly) between two generators, shrinking towards the first.

  • list takes a range for the list length and a generator and produces a list of values. Unlike the simple “pick a length and then call replicateM” approach from the example from the introduction, this generator can drop elements anywhere in the list (it does this by using the combinator mark to mark elements in the list; as the marks shrink towards “drop,” the element is removed, up to the specified Range).

  • frequency, similar to the like-named function in QuickCheck, takes a list of generators and frequencies, and chooses a generator according to the specified frequency. This is another way in which we can tweak the distribution of test data.

    The implementation of frequency ensures that the generators can shrink independently from each other. This could be defined just using Selective, but for improved performance it makes use of a low-level combinator called perturb; see also bindIntegral, which generalizes Selective bindS, and has significantly better performance than bindS.

Generating functions

One of the most impressive aspects of QuickCheck is that it can generate, show and shrink functions. This is due to a functional pearl by Koen Claessen called Shrinking and showing functions; the presentation is available on YouTube and is well worth a watch. We have adapted the QuickCheck approach (and simplified it slightly) to falsify; the generator is called fun. Here is an example (Fn is a pattern synonym; you will essentially always need it when generating functions):

prop_fn1 :: Property ()
prop_fn1 = do
    Fn (f :: [Int] -> Bool) <- gen $ Gen.fun $ Gen.bool False
    assert $
         P.eq
         `P.on` P.fn ("f", f)
      .$ ("x", [1, 2, 3])
      .$ ("y", [4, 5, 6])

This property says that for any function f :: [Int] -> Bool, if we apply that function to the list [1, 2, 3] we must get the same result as when we apply it to the list [4, 5, 6]. Of course, that is not true, and when we run this test, falsify will give us a counter-example:

failed after 53 shrinks
(f x) /= (f y)
x  : [1,2,3]
y  : [4,5,6]
f x: True
f y: False

Logs for failed test run:
generated {[1,2,3]->True, _->False} at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:244:32 in main:Demo.Blogpost

Notice the counter-example we get: a function that returns True for the list [1, 2, 3], and False everywhere else.3 It truly is quite astonishing that this works: we can see that the list [1, 2, 3] is special by inspecting the source code, but of course falsify (or indeed, QuickCheck; this is not unique to falsify) cannot! Instead, falsify will generate a random infinitely large description of functions from [Int] -> Bool, that covers all possible input lists, and then start shrinking this description, throwing away values for inputs it doesn’t need, until a minimal test case remains. Truly a testament to the power of laziness in Haskell.

For a more realistic example, let’s port an example from Koen Claessen’s presentation to falsify. This example is testing the (wrong) property that for all functions f and predicates p,

map f . filter p == filter p . map f

In falsify, we might express this as:

prop_mapFilter :: Property ()
prop_mapFilter = do
    Fn (f :: Int -> Int)  <- gen $ Gen.fun genInt
    Fn (p :: Int -> Bool) <- gen $ Gen.fun genBool
    xs :: [Int] <- gen $ Gen.list (Range.between (0, 100)) genInt
    assert $
       P.eq
      `P.split` (P.fn ("map f", map f), P.fn ("filter p", filter p))
      `P.split` (P.fn ("filter p", filter p), P.fn ("map f", map f))
      .$ ("xs", xs)
      .$ ("xs", xs)
  where
    genInt :: Gen Int
    genInt = Gen.int $ Range.between (0, 100)

    genBool :: Gen Bool
    genBool = Gen.bool False

We generate a random function f, a random predicate p, a random list xs, and then assert the property; and of course, falsify will happily give us a counter-example:

failed after 25 shrinks
(map f (filter p xs)) /= (filter p (map f xs))
xs                 : [96]
xs                 : [96]
filter p xs        : [96]
map f xs           : [0]
map f (filter p xs): [0]
filter p (map f xs): []

Logs for failed test run:
generated {_->0} at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:254:30 in main:Demo.Blogpost
generated {96->True, _->False} at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:255:30 in main:Demo.Blogpost
generated [96] at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:256:20 in main:Demo.Blogpost

It generated a function that maps anything to 0, a predicate that is True for 96 and False for everything else, and a list containing only the value 96; this is indeed a nice counter-example, as the output from the assert explains.
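We can check this counter-example by hand in plain Haskell, without any testing library. The definitions below simply transcribe the generated function, predicate and list from the log above; the names lhs and rhs are introduced here for illustration.

```haskell
module Main (main) where

-- The generated function {_->0}.
f :: Int -> Int
f = const 0

-- The generated predicate {96->True, _->False}.
p :: Int -> Bool
p = (== 96)

-- The generated list.
xs :: [Int]
xs = [96]

lhs, rhs :: [Int]
lhs = (map f . filter p) xs   -- filter keeps 96, then map sends it to 0
rhs = (filter p . map f) xs   -- map produces [0], which filter then rejects

main :: IO ()
main = do
  print lhs
  print rhs
  print (lhs == rhs)
```

The two sides evaluate to [0] and [] respectively, which matches the last two lines of the assertion output.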

Side note: in an ideal world that value 96 would be shrunk too. However, this would require shrinking both the 96 in the generated list and the 96 in the generated predicate at the same time. Like QuickCheck, falsify never takes more than one shrink step at once, to ensure that shrinking is O(n) and avoid exponential explosion. Section Dependencies between commands of my blog post “An in-depth look at quickcheck-state-machine” discusses this kind of problem in the context of quickcheck-state-machine.

Testing shrinking

When we use internal (or indeed, integrated) shrinking, we don’t write a separate shrinking function, but that doesn’t mean we cannot get shrinking wrong. Shrinking never truly comes for free! As a simple example, consider writing a generator that produces any value below a given maximum (essentially, a more limited form of integral). A first attempt might be:

below :: Word64 -> Gen Word64
below n = (`mod` n) <$> Gen.prim

While this generator does in fact produce values in the range 0 <= x < n, it does not shrink very well! As the value produced by prim shrinks, the value produced by below will cycle. We can discover this by writing a property that tests the shrinking behaviour of below, using testShrinkingOfGen:

prop_below_shrinking :: Property ()
prop_below_shrinking = do
    n <- gen $ Gen.integral $ Range.between (1, 1_000)
    testShrinkingOfGen $ below n

This property will fail:

failed after 4 successful tests and 14 shrinks
original < shrunk
original: 0
shrunk  : 1

Logs for failed test run:
generated 2 at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:281:10 in main:Demo.Blogpost
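The cycling is easy to see if we look only at the pure core of below, outside of any generator. The sketch below uses only base; belowPure and violations are hypothetical helper names, and we assume (as described above) that shrinking prim means moving to a smaller sample.

```haskell
module Main (main) where

import Data.Word (Word64)

-- Pure core of the 'below' generator: map a raw sample to a value below n.
belowPure :: Word64 -> Word64 -> Word64
belowPure n sample = sample `mod` n

-- Pairs (sample, smaller sample) for which the smaller sample yields a
-- *larger* value, i.e. shrinking the sample grows the output.
violations :: Word64 -> Word64 -> [(Word64, Word64)]
violations n maxSample =
  [ (s, s')
  | s  <- [1 .. maxSample]
  , s' <- [0 .. s - 1]
  , belowPure n s' > belowPure n s
  ]

main :: IO ()
main = do
  -- With n = 2, the decreasing samples 3, 2, 1, 0 map to 1, 0, 1, 0.
  print (map (belowPure 2) [3, 2, 1, 0])
  print (take 3 (violations 2 5))
```

For n = 2, going from sample 2 to sample 1 takes the output from 0 to 1, which is the same shape of failure as the one reported above (original 0, shrunk 1).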

In addition to testing individual shrinking steps, we can also test that for a particular property and generator, we can generate a particular minimum using testMinimum. Let’s consider the naive list generator from the introduction one more time:

naiveList :: Range Int -> Gen a -> Gen [a]
naiveList r g = do
    n  <- Gen.integral r
    replicateM n g

Suppose we want to verify that if we use this generator to test the (false) property that “all elements of all lists are always equal” we should always get either [0, 1] or [1, 0] as a counter-example; after all, those are the two minimal counter-examples. We could test this as follows:

prop_naiveList_minimum :: Property ()
prop_naiveList_minimum =
    testMinimum (P.elem .$ ("expected", [[0,1], [1,0]])) $ do
      xs <- gen $ naiveList
                    (Range.between (0, 10))
                    (Gen.int (Range.between (0, 1)))
      case P.eval $ P.pairwise P.eq .$ ("xs", xs) of
        Left _   -> testFailed xs
        Right () -> return ()

The counter-example reported by falsify is (somewhat shortened):

naiveList_minimum: FAIL
  failed after 0 shrinks
  minimum `notElem` expected
  minimum : [0,0,1]
  expected: [[0,1],[1,0]]

Logs for failed test run:
generated [0,0,1] at CallStack (from HasCallStack):
  gen, called at demo/Demo/Blogpost.hs:294:13 in main:Demo.Blogpost

Logs for rejected potential next shrinks:

** Rejected run 0
generated [] at CallStack (from HasCallStack):

** Rejected run 3
generated [0] at CallStack (from HasCallStack):

** Rejected run 4
generated [0,0] at CallStack (from HasCallStack):

** Rejected run 8
generated [0,0,0] at CallStack (from HasCallStack):

This is telling us that the minimum value it produced was [0, 0, 1], instead of one of the two lists that we expected. It also tells us what shrink steps were rejected (because they weren’t counter-examples). This is informative, because none of those shrink steps is [0, 1]: the naive list generator, unlike the real one (which does pass this property) cannot drop elements from the start of the list.
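A deliberately simplified model makes it clear why [0, 1] is out of reach. In the sketch below (base only) we assume that the naive generator is fully determined by a length and a fixed supply of elements; the Samples type and the shrinks function are hypothetical and are not falsify's actual sample tree. A single shrink step either lowers the length or lowers one element.

```haskell
module Main (main) where

-- Simplified model (an assumption, not falsify's real representation):
-- a chosen length plus the supply of elements the list is taken from.
data Samples = Samples Int [Int]

-- The naive generator: pick the length first, then take that many elements.
naive :: Samples -> [Int]
naive (Samples n elems) = take n elems

-- One-step shrinks: lower the length, or lower one of the elements in use.
shrinks :: Samples -> [Samples]
shrinks (Samples n elems) =
     [ Samples n' elems | n' <- [0 .. n - 1] ]
  ++ [ Samples n (take i elems ++ [x'] ++ drop (i + 1) elems)
     | (i, x) <- zip [0 ..] (take n elems)
     , x' <- [0 .. x - 1]
     ]

main :: IO ()
main = do
  let start      = Samples 3 [0, 0, 1]
      candidates = map naive (shrinks start)
  print candidates
  print ([0, 1] `elem` candidates)
```

In this model the candidates reachable from [0, 0, 1] are exactly [], [0], [0, 0] and [0, 0, 0], the same four lists that appear as rejected runs above. Lowering the length only ever removes elements from the end, so the leading 0 can never be dropped.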

The falsify test suite uses testShrinkingOfGen (and its generalization testShrinking) as well as testMinimum extensively to test falsify’s own generators.


Finally, falsify offers two combinators shrinkWith and fromShrinkTree which provide compatibility with QuickCheck style shrinking

shrinkWith :: (a -> [a]) -> Gen a -> Gen a

and hedgehog style shrinking

fromShrinkTree :: Tree a -> Gen a

respectively. The implementation of these combinators depends on a minor generalization of the sample tree representation; the details are discussed in the paper.
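For readers less familiar with the two styles, the sketch below shows the kind of input each combinator expects: a manual shrinking function of type a -> [a], and a tree of shrunk values. It uses only base and defines its own Tree type; shrinkNat and unfoldShrinks are hypothetical examples, and nothing here says anything about how falsify implements shrinkWith or fromShrinkTree.

```haskell
module Main (main) where

import Data.List (nub)

-- QuickCheck-style manual shrinker: candidate smaller values for a
-- non-negative number, most aggressive first.
shrinkNat :: Int -> [Int]
shrinkNat n = nub [c | c <- [0, n `div` 2, n - 1], c >= 0, c < n]

-- hedgehog-style shrink tree: a value together with the trees of its shrinks.
data Tree a = Node a [Tree a]

-- Any manual shrinker can be unfolded into a (lazy) shrink tree.
unfoldShrinks :: (a -> [a]) -> a -> Tree a
unfoldShrinks shrink x = Node x (map (unfoldShrinks shrink) (shrink x))

root :: Tree a -> a
root (Node x _) = x

children :: Tree a -> [Tree a]
children (Node _ ts) = ts

main :: IO ()
main = do
  let t = unfoldShrinks shrinkNat 4
  print (shrinkNat 4)
  print (map root (children t))
  print (map (map root . children) (children t))
```

The unfolding shows that the two styles carry closely related information: the tree simply records, for every value, what the shrinking function would have returned.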


Shrinking is an essential component of any approach to property based testing. In the Haskell world, two libraries offered two competing approaches to shrinking: manual shrinking offered by QuickCheck, where users are entirely responsible for writing shrinkers for their generators, and integrated shrinking, offered by hedgehog. Integrated shrinking is nice, but does not work well with monadic bind. The Python Hypothesis library taught us how we can have “internal” shrinking: like in integrated shrinking, we do not write a separate generator and shrinker, but unlike in integrated shrinking, this approach does work across monadic bind.

The Haskell falsify library takes the core ideas of Hypothesis and applies them in the context of Haskell. As we have seen, however, the actual details of how these two libraries work differ quite significantly. The falsify approach is more suitable to Haskell where we might deal with infinite data structures, provides the user with more predictable shrinking, and provides the user with tools for controlling generator independence (through the use of selective functors).


  1. Admittedly, it depends a bit on how you count: we can also replace any tree in a single step by the tree that is zero everywhere, which could be considered a separate “pass.”↩︎

  2. Hypothesis does try to avoid redistributing samples during shrinking as part of its “hierarchical delta debugging,” essentially recovering some kind of tree structure, but this is not under the control of the user, and does not provide any guarantees.↩︎

  3. It can also produce {[4,5,6]->True, _->False}, depending on the seed.↩︎

by edsko at April 18, 2023 12:00 AM

GHC Developer Blog

GHC 9.4.5 is now available


Zubin Duggal - 2023-04-18

The GHC developers are happy to announce the availability of GHC 9.4.5. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing a few issues found in 9.4.4. These include:

  • Fixes for a number of bugs in the simplifier (#22623, #22718, #22913, #22695, #23184, #22998, #22662, #22725).
  • Many bug fixes to the non-moving and parallel GCs (#22264, #22327, #22926, #22927, #22929, #22930, #17574, #21840, #22528)
  • A fix for a bug with the alignment of RTS data structures that could result in segfaults when compiled with high optimisation settings on certain platforms (#22975, #22965).
  • Bumping gmp-tarballs to a version which doesn’t use the reserved x18 register on AArch64/Darwin systems, and also has fixes for CVE-2021-43618 (#22497, #22789).
  • A number of improvements to recompilation avoidance with multiple home units (#22675, #22677, #22669, #22678, #22679, #22680)
  • Fixes for regressions in the typechecker and constraint solver (#22647, #23134, #22516, #22743)
  • Easier installation of binary distribution on MacOS platforms by changing the installation Makefile to remove the quarantine attribute when installing.
  • … and many more. See the release notes for a full accounting.

As some of the fixed issues do affect correctness, users are encouraged to upgrade promptly.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling,

- Zubin

by ghc-devs at April 18, 2023 12:00 AM

April 17, 2023

Monday Morning Haskell

This is How to Build Haskell with GNU Make (and why it's worth trying)

In a previous article I showed the GHC commands you need to compile a basic Haskell executable without explicitly using the source files from its dependencies. But when you're writing your own Haskell code, 99% of the time you want to be using a Haskell build system like Stack or Cabal for your compilation needs instead of writing your own GHC commands. (And you can learn how to use Stack in my new free course, Setup.hs).

But part of my motivation for solving that problem was that I wanted to try an interesting experiment:

How can I build my Haskell code using GNU Make?

GNU Make is a generic build system that allows you to specify components of your project, map out their dependencies, and dictate how your build artifacts are generated and run.

I wanted to structure my source code the same way I would in a Cabal-style application, but rely on GNU Make to chain together the necessary GHC compilation commands. I did this to help gain a deeper understanding of how a Haskell build system could work under the hood.

In a Haskell project, we map out our project structure in the .cabal file. When we use GNU Make, our project is mapped out in the makefile. Here's the Makefile we'll ultimately be constructing:

GHC = ~/.ghcup/ghc/9.2.5/bin/ghc
BIN = ./bin
EXE = ${BIN}/hello

LIB_DIR = ${BIN}/lib
SRCS = $(wildcard src/*.hs)
LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)

library: ${SRCS}
  @mkdir -p ${LIB_DIR}
  @${GHC} ${SRCS} -hidir ${LIB_DIR} -odir ${LIB_DIR}

generate_run: app/Main.hs library
  @mkdir -p ${BIN}
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}
  @${GHC} ${BIN}/Main.o ${LIB_OBJS} -o ${EXE}

run: generate_run
  @${EXE}

TEST_DIR = ${BIN}/test
TEST_EXE = ${TEST_DIR}/run_test

generate_test: test/Spec.hs library
  @mkdir -p ${TEST_DIR}
  @cp ${LIB_DIR}/*.hi ${TEST_DIR}
  @${GHC} -i${TEST_DIR} -c test/Spec.hs -hidir ${TEST_DIR} -odir ${TEST_DIR}
  @${GHC} ${TEST_DIR}/Main.o ${LIB_OBJS} -o ${TEST_EXE}

test: generate_test
  @${TEST_EXE}

clean:
  rm -rf ./bin

Over the course of this article, we'll build up this solution piece-by-piece. But first, let's understand exactly what Haskell code we're trying to build.

Our Source Code

We want to lay out our files like this, separating our source code (/src directory), from our executable code (/app) and our testing code (/test):

├── app
│   └── Main.hs
├── makefile
├── src
│   ├── MyFunction.hs
│   └── MyStrings.hs
└── test
    └── Spec.hs

Here's the source code for our three primary files:

-- src/MyStrings.hs
module MyStrings where

greeting :: String
greeting = "Hello"

-- src/MyFunction.hs
module MyFunction where

modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

And here's what our simple "Spec" test looks like. It doesn't use a testing library, it just prints different messages depending on whether or not we get the expected output from modifyString.

-- test/Spec.hs
module Main where

import MyFunction (modifyString)

main :: IO ()
main = do
  test "abcd" "bcda bcda"
  test "Hello" "elloH elloH"

test :: String -> String -> IO ()
test input expected = do
  let actual = modifyString input
  putStr $ "Testing case: " <> input <> ": "
  if expected /= actual
    then putStrLn $ "Incorrect result! Expected: " <> expected <> " Actual: " <> actual
    else putStrLn "Correct!"

The files are laid out the way we would expect for a basic Haskell application. We have our "library" code in the src directory. We have a single "executable" in the app directory. And we have a single "test suite" in the test directory. Instead of having a Project.cabal file at the root of our project, we'll have our makefile. (At the end, we'll actually compare our Makefile with an equivalent .cabal file).

But what does the Makefile look like? Well it would be overwhelming to construct it all at once. Let's begin slowly by treating our executable as a single file application.

Running a Single File Application

So for now, let's adjust Main.hs so it's an independent file without any dependencies on our library modules:

-- app/Main.hs
module Main where

main :: IO ()
main = putStrLn "Hello"

The simplest way to run this file is runghc. So let's create our first makefile rule that will do this. A rule has a name, a set of prerequisites, and then a set of commands to run. We'll call our rule run, and have it use runghc on app/Main.hs. We'll also include the app/Main.hs as a prerequisite, since the rule will run differently if that file changes.

run: app/Main.hs
  runghc app/Main.hs

And now we can run this rule using make run, and it will work!

$ make run
runghc app/Main.hs

Notice that it prints the command we're running. We can change this by using the @ symbol in front of the command in our Makefile. We'll do this with almost all our commands:

run: app/Main.hs
  @runghc app/Main.hs

And it now runs our application without printing the command.

Using runghc is convenient, but if we want to use dependencies from different directories, we'll eventually need to use multiple stages of compilation. So we'll want to create two distinct rules. One that generates the executable using ghc, and another that actually runs the generated executable.

So let's create a generate_run rule that will produce the build artifacts, and then run will use them.

generate_run: app/Main.hs
  @ghc app/Main.hs

run: generate_run
  @./app/Main

Notice that run can depend on generate_run as a prerequisite, instead of the source file now. This also generates three build artifacts directly in our app directory: the interface file Main.hi, the object file Main.o, and the executable Main.

It's bad practice to mix build artifacts with source files, so let's use GHC's arguments (-hidir, -odir and -o) to store these artifacts in their own directory called bin.

generate_run: app/Main.hs
  @mkdir -p ./bin
  @ghc app/Main.hs -hidir ./bin -odir ./bin -o ./bin/hello

run: generate_run
  @./bin/hello

We can then add a third rule to "clean" our project. This would remove all binary files so that we can do a fresh recompilation if we want.

clean:
  rm -rf ./bin

For one final flourish in this section, we can use some variables. We can make one for the GHC compiler, referencing its absolute path instead of a symlink. This would make it easy to switch out the version if we wanted. We'll also add a variable for our bin directory and the hello executable, since these are used multiple times.

# Could easily switch versions if desired
# e.g. GHC = ~/.ghcup/ghc/9.4.4/bin/ghc
GHC = ~/.ghcup/ghc/9.2.5/bin/ghc
BIN = ./bin
EXE = ${BIN}/hello

generate_run: app/Main.hs
  @mkdir -p ${BIN}
  @${GHC} app/Main.hs -hidir ${BIN} -odir ${BIN} -o ${EXE}

run: generate_run
  @${EXE}

clean:
  rm -rf ./bin

And all this still works as expected!

$ make generate_run
[1 of 1] Compiling Main (app/Main.hs, bin/Main.o)
Linking ./bin/hello
$ make run
$ make clean
rm -rf ./bin

So we have some basic rules for our executable. But remember our goal is to depend on a library. So let's add a new rule to generate the library objects.

Generating a Library

For this step, we would like to compile src/MyStrings.hs and src/MyFunction.hs. Each of these will generate an interface file (.hi) and an object file (.o). We want to place these artifacts in a specific library directory within our bin folder.

We'll do this by means of a new rule, library, which will use our two source files as its prerequisites. It will start by creating the library artifacts directory:

LIB_DIR = ${BIN}/lib

library: src/MyStrings.hs src/MyFunction.hs
  @mkdir -p ${LIB_DIR}

But now the only thing we have to do is use GHC on both of our source files, using LIB_DIR as the destination point.

LIB_DIR = ${BIN}/lib

library: src/MyStrings.hs src/MyFunction.hs
  @mkdir -p ${LIB_DIR}
  @ghc src/MyStrings.hs src/MyFunction.hs -hidir ${LIB_DIR} -odir ${LIB_DIR}

Now when we run the target, we'll see that it produces the desired files:

$ make library
$ ls ./bin/lib
MyFunction.hi MyFunction.o MyStrings.hi MyStrings.o

Right now though, if we added a new source file, we'd have to modify the rule in two places. We can fix this by adding a variable that uses wildcard to match all our source files in the directory (src/*.hs).

LIB_DIR = ${BIN}/lib
SRCS = $(wildcard src/*.hs)

library: ${SRCS}
  @mkdir -p ${LIB_DIR}
  @${GHC} ${SRCS} -hidir ${LIB_DIR} -odir ${LIB_DIR}

While we're learning about wildcard, let's make another variable to capture all the produced object files. We'll use this in the next section.

LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)

So great! We're producing our library artifacts. How do we use them?

Linking the Library

In this section, we'll link our library code with our executable. We'll begin by assuming our Main file has gone back to its original form with imports, instead of the simplified form:

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

When we try to run generate_run, compilation fails because it cannot find the modules we're trying to import:

$ make generate_run
Could not find module 'MyStrings'
Could not find module 'MyFunction'

As we went over in the previous article, the general approach to compiling the Main module with its dependencies has two steps:

1. Compile with the -c option (to stop before the linking stage) using -i to point to a directory containing the interface files.

2. Compile the generated Main.o object file together with the library .o files to produce the executable.

So we'll be modifying our generate_run rule with some extra steps. First of course, it must now depend on the library rule. Then our first new command will be to copy the .hi files from the lib directory into the top-level bin directory.

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}

We could have avoided this step by generating the library artifacts in bin directly. I wanted to have a separate location for all of them though. And while there may be some way to direct the next command to find the interface files in the lib directory, none of the obvious ways worked for me.

Regardless, our next step will be to modify the ghc call in this rule to use the -c and -i arguments. The rest stays the same:

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}

Finally, we invoke our final ghc call, linking the .o files together. At the command line, this would look like:

$ ghc ./bin/Main.o ./bin/lib/MyStrings.o ./bin/lib/MyFunction.o -o ./bin/hello

Recalling our LIB_OBJS variable from up above, we can fill in the rule in our Makefile like so:

LIB_OBJS = $(wildcard ${LIB_DIR}/*.o)

generate_run: app/Main.hs library
  @mkdir -p ./bin
  @cp ${LIB_DIR}/*.hi ${BIN}
  @${GHC} -i${BIN} -c app/Main.hs -hidir ${BIN} -odir ${BIN}
  @${GHC} ${BIN}/Main.o ${LIB_OBJS} -o ${EXE}

And now our program will work as expected! We can clean it and jump straight to the make run rule, since this will run its prerequisites make library and make generate_run automatically.

$ make clean
rm -rf ./bin
$ make run
[1 of 2] Compiling MyFunction (src/MyFunction.hs, bin/lib/MyFunction.o)
[2 of 2] Compiling MyStrings (src/MyStrings.hs, bin/lib/MyStrings.o)
elloH elloH

So we've covered the library and an executable, but most Haskell projects have at least one test suite. So how would we implement that?

Adding a Test Suite

Well, a test suite is basically just a special executable. So we'll make another pair of rules, generate_test and test, that will mimic generate_run and run. Very little changes, except that we'll make another special directory within bin for our test artifacts.

TEST_DIR = ${BIN}/test
TEST_EXE = ${TEST_DIR}/run_test

generate_test: test/Spec.hs library
  @mkdir -p ${TEST_DIR}
  @cp ${LIB_DIR}/*.hi ${TEST_DIR}
  @${GHC} -i${TEST_DIR} -c test/Spec.hs -hidir ${TEST_DIR} -odir ${TEST_DIR}
  @${GHC} ${TEST_DIR}/Main.o ${LIB_OBJS} -o ${TEST_EXE}

test: generate_test
  @${TEST_EXE}

Of note here is that at the final step, we're still using Main.o instead of Spec.o. Since it's an executable module, it also compiles as Main.

But we can then use this to run our tests!

$ make clean
$ make test
[1 of 2] Compiling MyFunction (src/MyFunction.hs, bin/lib/MyFunction.o)
[2 of 2] Compiling MyStrings (src/MyStrings.hs, bin/lib/MyStrings.o)
Testing case: abcd: Correct!
Testing case: Hello: Correct!

So now we have all the different components we'd expect in a normal Haskell project. So it's interesting to consider how our makefile definition would compare against an equivalent .cabal file for this project.

Comparing to a Cabal File

Suppose we want to call our project HaskellMake and store its configuration in HaskellMake.cabal. We'd start our Cabal file with four metadata lines:

cabal-version: 1.12
name: HaskellMake
build-type: Simple

Now our library would expose its two modules, using the src directory as its root. The only "dependency" is the Haskell base packages. Finally, default-language is a required field.

library
  exposed-modules:
      MyStrings
    , MyFunction
  hs-source-dirs:
      src
  build-depends:
      base
  default-language: Haskell2010

The executable would similarly describe where the files are located and state a base dependency as well as a dependency on the library itself.

executable hello
  main-is: Main.hs
  hs-source-dirs:
      app
  build-depends:
      base
    , HaskellMake
  default-language: Haskell2010

Finally, our test suite would look very similar to the executable, just with a different directory and filename.

test-suite make-test
  type: exitcode-stdio-1.0
  main-is: Spec.hs
  hs-source-dirs:
      test
  build-depends:
      base
    , HaskellMake
  default-language: Haskell2010

And, if we add a bit more boilerplate, we could actually then compile our code with Stack! First we need a stack.yaml specifying the resolver and the package location:

# stack.yaml
resolver: lts-20.12

packages:
  - .

Then we need Setup.hs:

-- Setup.hs

import Distribution.Simple
main = defaultMain

And now we could actually run our code!

$ stack build
$ stack exec hello
elloH elloH
$ stack test
Testing case: abcd: Correct!
Testing case: Hello: Correct!

Now observant viewers will note that we don't use any Hackage dependencies in our code - only base, which GHC always knows how to find. It would require a lot of work for us to replicate dependency management. We could download a .zip file with curl easily enough, but tracking the whole dependency tree would be extremely difficult.

And indeed, many engineers have spent a lot of time getting this process to work well with Stack and Cabal! So while it would be a useful exercise to try to do this manually with a simple dependency, I'll leave that for a future article.

When comparing the two file definitions, the .cabal definition is undoubtedly more concise and human readable, but it hides a lot of implementation details. Most of the time, this is a good thing! This is exactly what we expect from tools in general; they should allow us to work more quickly without having to worry about details.

But there are times when we might, on our own time, want to try out a more adventurous path like the one in this article, which avoids relying too much on modern tooling. So why was this article a "useful exercise"™?

What's the Point?

So obviously, there's no chance this Makefile approach is suddenly going to supplant Cabal and Stack for building Haskell projects. Stack and Cabal are "better" for Haskell precisely because they account for the intricacies of Haskell development. In fact, by their design, GHC and Cabal both already incorporate some key ideas and features from GNU Make, especially with avoiding re-work through dependency calculation.

But there's a lot you can learn by trying this kind of exercise.

First of all, we learned about GNU Make. This tool can be very useful if you're constructing a program that combines components from different languages and systems. You could even build your Haskell code with Stack, but combine it with something else in a makefile.

A case in point for this is my recent work with Haskell and AWS. The commands for creating a Docker image, authenticating to AWS and deploying it are lengthy and difficult to remember. A makefile can, at the very least, serve as a rudimentary aliasing tool. You could run make deploy and have it automatically rebuild your changes into a Docker image and deploy that to your server.

But beyond this, it's important to take time to deliberately understand how our tools work. Stack and Cabal are great tools. But if they seem like black magic to you, then it can be a good idea to spend some time understanding what is happening at an internal level - like how GHC is being used under the hood to create our build artifacts.

Most of the fun in programming comes in effectively applying tools to create useful programs quickly. But if you ever want to make good tools in the future, you have to understand what's happening at a deeper level! At least a couple times a year, you should strive to go one level deeper in your understanding of your programming stack.

For me this time, it was understanding just a little more about GHC. Next time I might dive into dependency management, or a different topic like the internal workings of Haskell data structures. These kinds of topics might not seem immediately applicable in your day job, but you'll be surprised at the times when deeper knowledge will pay dividends for you.

Getting Better at Haskell

But enough philosophizing. If you're completely new to Haskell, going "one level deeper" might simply mean the practical ability to use these tools at a basic level. If your knowledge is more intermediate, you might want to explore ways to improve your development process. These thoughts can lead to questions like:

1. What's the best way to set up my Haskell toolchain in 2023?

2. How do I get more efficient and effective as a Haskell programmer?

You can answer these questions by signing up for my new free course Setup.hs! This will teach how to install your Haskell toolchain with GHCup and get you started running and testing your code.

Best of all, it will teach you how to use the Haskell Language Server to get code hints in your editor, which can massively increase your rate of progress. You can read more about the course in this blog post.

If you subscribe to our monthly newsletter, you'll also get an extra bonus - a 20% discount on any of our paid courses. This offer is good for two more weeks (until May 1) so don't miss out!

by James Bowen at April 17, 2023 02:30 PM

April 12, 2023


Announcing new YouTube series: The Haskell Unfolder

We are happy to announce The Haskell Unfolder, a new YouTube series.

In each episode, we will discuss technical topics around programming in Haskell. Topics range from beginner-friendly to advanced and—once in a while—esoteric.

We encourage audience participation! During the episodes, we will monitor the YouTube chat and try to address questions and comments you submit there. We are also open to feedback and topic suggestions by email.

At this point, we are announcing the first two episodes. We will announce subsequent episodes and their time slots on YouTube and on Twitter.

Episode 1: unfoldr

Wednesday, 19 April 2023, 1830 UTC (11:30 am PDT, 2:30 pm EDT, 7:30 pm BST, 20:30 CEST)

In the first episode, to honour the name of our show, we will take a look at the unfoldr function and discuss how it works and how it can be used. This episode should be suitable for everyone interested in Haskell, including beginners.

Episode 2: quantified constraints

Wednesday, 3 May 2023, 1830 UTC (11:30 am PDT, 2:30 pm EDT, 7:30 pm BST, 20:30 CEST)

In this episode, we will discuss the QuantifiedConstraints language extension. For this episode we will assume familiarity with type classes. An understanding of type families will be helpful for a part of the episode, but is not a requirement.

All episodes will be available for watching after the stream. We hope to see many of you live and appreciate any feedback you might have.

by edsko, andres at April 12, 2023 12:00 AM

April 11, 2023

JP Moresmau

WebAssembly: bidirectional communication between components and host

Still investigating the use of WebAssembly to implement a plugin system using Rust and Wasmer. In the previous post, I could load a WebAssembly plugin that implemented an interface that the host knew about. But since WebAssembly is very limited by design in terms of API, as soon as a plugin wants to interact with the outside world in some shape or form, it needs to be able to call an API the host will provide. Thus the host can ensure safety, performance, etc. without allowing the WebAssembly code direct access to real resources.

So let's say we'd like to customize our greeting message based on the time of day. We're going to need a very simple function to tell us the current hour of the day:

hour: func() -> u32

This goes into host.wai. In our english-rs module we can then import! this file with standard web assembly Rust generation and use it in our implementation of greet:


struct Greeter;

impl crate::greeter::Greeter for Greeter {
    /// The language we greet in.
    fn language() -> String {
        String::from("English")
    }

    /// Greet the given name, picking the phrase from the host-provided hour.
    fn greet(name: String) -> String {
        let hour = host::hour();
        if hour < 12 {
            format!("Good morning, {name}!")
        } else if hour < 18 {
            format!("Good afternoon, {name}!")
        } else {
            format!("Good evening, {name}!")
        }
    }
}

Note how we call the host::hour function and get the hour. Of course, here we are doing something trivial using standard WebAssembly types like u32; things would become more complicated as the types get more complex.
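
The dependency on the host can also be made explicit in plain Rust, with no WebAssembly involved. In the sketch below, the hypothetical greet_with function takes the hour source as a closure standing in for the imported host::hour. Both the function name and the closure parameter are assumptions made for illustration, not part of the generated bindings. It may help to show that the plugin logic relies only on the func() -> u32 signature:

```rust
/// Build the greeting from any hour source shaped like `hour: func() -> u32`.
/// `hour` stands in for the host-provided import (an assumption of this sketch).
fn greet_with(name: &str, hour: impl Fn() -> u32) -> String {
    let hour = hour();
    if hour < 12 {
        format!("Good morning, {name}!")
    } else if hour < 18 {
        format!("Good afternoon, {name}!")
    } else {
        format!("Good evening, {name}!")
    }
}

fn main() {
    // A fixed hour plays the role of the host here.
    println!("{}", greet_with("JP Moresmau", || 15));
}
```

Swapping the closure for a fixed value like this is also a cheap way to exercise each branch without a running host.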

The host code needs to provide an implementation of the hour function and inject it via imports:

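fn hour() -> u32 {
    use chrono::Timelike; // provides `hour()`
    let now = chrono::Local::now();
    now.hour()
}

fn main() -> Result<()> {
    // ... engine, store and module setup elided ...
    let imports = imports! {
        "host" => { "hour" => Function::new_typed(&mut store, hour) }
    };
    let instance = Instance::new(&mut store, &module, &imports)?;
    // ... plugin calls elided ...
}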

Et voilà!

cargo run -- "JP Moresmau"
Language: English
Greeting: Good afternoon, JP Moresmau!

All code can be found as before at

Happy WebAssembly and Rust hacking!

by JP Moresmau at April 11, 2023 12:35 PM

April 10, 2023

JP Moresmau

Web Assembly Interfaces help integration of WASM libraries

In the previous post, I showed how to run plugins generated from Rust code with wasm-bindgen, using the wasmtime crate. I then discovered Wasmer, so I rewrote the runtime to use the Wasmer API, which is very similar. But I see Wasmer has a lot more tooling available than just a runtime, so let's see how it can help!

I followed first the tutorial at

You can first define the Web Assembly interface you want to expose in a type of IDL file - think protobuf or Corba, depending on your age :-). Easy enough in our case:

language: func() -> string

greet: func(name: string) -> string

We can put that file (greeter.wai) in our english-rs crate folder, and remove all wasm-bindgen dependencies and related code. We can then use the export! macro of the wai-bindgen-rust crate to automatically generate a trait that defines both functions, and then provide an implementation:


struct Greeter;

impl crate::greeter::Greeter for Greeter {
    /// The language we greet in.
    fn language() -> String {
        String::from("English")
    }

    /// Greet the given name.
    fn greet(name: String) -> String {
        format!("Hello, {name}!")
    }
}

That's it! The only change from the previous code apart from the impl block is that the name parameter is an owned String and not a &str.

Then I can publish this library to the wasmer WebAssembly libraries repositories via the cargo-wapm command. It now lives at You can download the .wasm file from there!

What about the runtime? There's not a lot of documentation yet because a lot of this is still beta (and the specs of WebAssembly Interfaces and related concepts are still in flux), but it's possible to use the import! macro of the wai-bindgen-wasmer crate to generate code to interact with the module, using the same wai file: the file that defines the interface generates both the trait we need to implement and the client struct. This is what our greeter now looks like:

use std::{env, fs};

use anyhow::{anyhow, Result};
use greeter::{Greeter, GreeterData};
use wasmer::*;
use wasmer_compiler_llvm::LLVM;


/// Greet using all the plugins.
fn main() -> Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        return Err(anyhow!("Usage: i18n-greeter <name>"));
    }
    let compiler_config = LLVM::default();
    let engine = EngineBuilder::new(compiler_config).engine();

    let paths = fs::read_dir("./plugins").unwrap();

    for path in paths {
        let path = path?;
        let mut store = Store::new(&engine);

        let module = Module::from_file(&store, path.path())?;

        let imports = imports! {};
        let instance = Instance::new(&mut store, &module, &imports)?;
        let env = FunctionEnv::new(&mut store, GreeterData {});
        let greeter = Greeter::new(&mut store, &instance, env)?;

        let language = greeter.language(&mut store)?;
        println!("Language: {language}");
        let greeting = greeter.greet(&mut store, &args[1])?;
        println!("Greeting: {greeting}");
    }
    Ok(())
}

No more requirement to understand how to call WASM functions and get the return value, the Wasmer WAI generated code does that for you! The API could still be cleaner (maybe without that store parameter everywhere) but the convenience is already a clear win.

All this code can be found at

Happy WebAssembly hacking!

by JP Moresmau at April 10, 2023 02:55 PM

Monday Morning Haskell

How to Make ChatGPT Go Around in Circles (with GHC and Haskell)

As part of my research for the recently released (and free!) Setup.hs course, I wanted to explore the different kinds of compilation commands you can run with GHC outside the context of a build system.

I wanted to know…

Can I use GHC to compile a Haskell module without its dependent source files?

The answer, obviously, should be yes. When you use Stack or Cabal to get dependencies from Hackage, you aren't downloading and recompiling all the source files for those libraries.

And I eventually managed to do it. It doesn't seem hard once you know the commands already:

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello

But, being unfamiliar with the inner workings of GHC, I struggled for a while to find this exact combination of commands, especially with their arguments.

So, like I did last week, I turned to the latest tool in the developer's toolkit: ChatGPT. But once again, everyone's new favorite pair programmer had some struggles of its own on the topic! So let's start by defining exactly the problem we're trying to solve.

The Setup

Let's start with a quick look at our initial file tree.

├── app
│   └── Main.hs
└── src
    ├── MyFunction.hs
    └── MyStrings.hs

This is meant to look the way I would organize my code in a Stack project. We have two "library" modules in the src directory, and one executable module in the app directory that will depend on the library modules. These files are all very simple:

-- src/MyStrings.hs
module MyStrings where

greeting :: String
greeting = "Hello"

-- src/MyFunction.hs
module MyFunction where

modifyString :: String -> String
modifyString x = base <> " " <> base
  where
    base = tail x <> [head x]

-- app/Main.hs
module Main where

import MyStrings (greeting)
import MyFunction (modifyString)

main :: IO ()
main = putStrLn (modifyString greeting)

Our goal is to compile and run the executable with two constraints:

1. Use only GHC (no Stack or Cabal involved)

2. Compile the library separately, so that the Main module could be compiled using only the library's build artifacts, and not the source files.

Trying to Compile

Now, there are two easy ways to compile this code if we're willing to violate our constraints (particularly the second one). If all three files are in the same directory, GHC can immediately find the modules we're importing, so we can just call ghc Main.hs.

└── src
    ├── Main.hs
    ├── MyFunction.hs
    └── MyStrings.hs


$ cd src
$ ghc Main.hs
$ ./Main
elloH elloH

If we keep the original file structure, it's still pretty easy if we're willing to violate constraint 2 by using the library source files in the ghc command:

$ ghc app/Main.hs src/MyStrings.hs src/MyFunction.hs
$ ./Main
elloH elloH

However, with my extra constraint of not using the library source files directly, things got a little more difficult. I originally thought "Hey, let's just include the build artifacts in the GHC call." Each library module generates an interface file (.hi) and an object file (.o). So let's try including those in the GHC call:

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ls -l src/
MyFunction.hi MyFunction.hs MyFunction.o MyStrings.hi MyStrings.hs MyStrings.o
$ ghc app/Main.hs src/MyFunction.hi src/MyFunction.o src/MyStrings.hi src/MyStrings.o
Could not find module 'MyStrings'
Could not find module 'MyFunction'

As we can see, it doesn't work! Even though we're including the build artifacts as direct arguments of ghc, this doesn't help our main module to find them.

So, feeling a little too lazy to dig deep through GHC documentation, and feeling like this problem might have too many arbitrary constraints to phrase in a Google query, I decided to try our aforementioned new favorite tool.

Enter: ChatGPT

The First Session

(Conversation has been abbreviated for relevance)


How do I link existing .o files with a new source module using the 'ghc' command?

Chat GPT

To link existing object (.o) files with a new source module using the ghc command, you need to include the object files in the list of input files passed to the ghc command.

Assuming you have an existing .o file called example.o and a new source module called MyModule.hs, you can link them together into an executable program using the following command:

$ ghc -o MyProgram MyModule.hs example.o

This command tells ghc to compile and link the MyModule.hs source file with the example.o object file and create an executable program called MyProgram. The -o flag specifies the output file name.


Is it possible to use a ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module?

Chat GPT

Yes, it is possible to use the ghc command to link a Haskell object file (.o) with a Haskell source file that imports the other module. Here's an example:

Suppose you have two files, MyModule.hs and OtherModule.o, and MyModule.hs imports a module that is defined in OtherModule.o. To link the two files into an executable program, you can use the following command:

$ ghc -o MyProgram MyModule.hs OtherModule.o

This command tells ghc to compile and link the MyModule.hs source file with the OtherModule.o object file, and create an executable program called MyProgram.


Hmmm I don't think this works. I created the following file OtherModule.hs:

module OtherModule where

greeting :: String
greeting = "Hello"

And I'm trying to include it in another file:

module Main where

import OtherModule (greeting)

main :: IO ()
main = putStrLn greeting

I started by compiling OtherModule:

$ ghc src/OtherModule.hs

Then I moved OtherModule.o to the app directory (with MyModule.hs), and tried running this command:

$ cd app
$ ghc -o MyProgram MyModule.hs OtherModule.o

I get the error:

"Could not find module 'OtherModule'"

Chat GPT

I apologize for the confusion in my previous answer. You are correct that simply linking the object file with the executable does not work in this case.

When GHC compiles a module, it generates an interface file (.hi) that contains information about the module's name, type signatures, and other details. The compiled object file (.o) contains the compiled machine code for the functions defined in the module.

When you try to import a module in a source file, GHC looks for the interface file (OtherModule.hi) to find information about the module's name and type signatures. If the interface file is not found, GHC will raise an error.

In your case, when you compile OtherModule.hs, GHC generates an interface file called OtherModule.hi. To compile MyModule.hs, GHC needs to be able to find OtherModule.hi in order to resolve the import of OtherModule. Simply linking the OtherModule.o object file is not enough.

To make this work, you need to compile OtherModule.hs and MyModule.hs together, like this:

ghc -o MyProgram MyModule.hs src/OtherModule.hs

This command tells GHC to compile MyModule.hs and OtherModule.hs together, and create an executable program called MyProgram.

I hope this clears up the confusion. Again, I apologize for the confusion in my previous answer.


Ah, so I need to include both source files, and it can't be done by linking an object file with another source file?

Chat GPT

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

When you compile a source file with GHC, it generates both an object file (.o) and an interface file (.hi). The object file contains the compiled machine code for the functions defined in the module, while the interface file contains information about the module's name, type signatures, and other details.

When you compile a source file that imports another module, GHC needs to be able to find the interface file for that module in order to compile the importing module. If the interface file is not found, GHC will raise an error.

So in your case, you need to compile both MyModule.hs and OtherModule.hs together so that GHC can generate the necessary interface files and link the resulting object files together to produce the executable program.

It is not possible to directly link an object file with a source file that imports it, as GHC needs the interface file to be present in order to compile the importing module.


Suspiciously, ChatGPT suggests a less detailed command than the previous one that failed for me!

$ ghc -o MyProgram MyModule.hs OtherModule.o

It includes only the object file, instead of the object and interface file. And, of course this fails, like I showed above.

Now there's an interesting statement that will come into play later, though I didn't realize it!

Yes, that's correct. When you import a module in a source file, GHC needs to be able to find the interface file (.hi) for that module in order to resolve the import. If the interface file is not found, GHC will raise an error.

I assumed that to "find" the interface file meant providing it on the command line (as in my original command). So with that idea having failed, I fixated on one part of the final statement:

It is not possible to directly link an object file with a source file that imports it...

And concluded that getting the right information from ChatGPT might be a lost cause. It seemed like it was sure this was possible, and then just as sure that it was impossible. But details matter, as we'll see!

Working Around It

So feeling slightly defeated, I decided to poke around the areas of the code where Cabal actually saves packages to see if there was a pattern I could follow. And I managed to cobble together a convoluted workaround.

What I observed was that GHC could use any module from a library in the Cabal package database. On my system's setup, this could be found in the .ghcup cache:

$ cd ~/.ghcup/ghc/9.2.5/lib/ghc-9.2.5/
$ ls
$ ls package.conf.d

And indeed, the package directories did not have source files, only various forms of compiled object and interface files. And each package had a corresponding .conf file in the package.conf.d directory.

So I could...

  1. Make my own directory in the package database (MyPackage-
  2. Place the .hi and .o files in there
  3. Create a phony MyPackage- and put it in package.conf.d. (Base it on an existing .conf file but make some substitutions)

And it worked. But manually manipulating a package database wasn't the solution I had in mind. Surely there's something easier, right?

The Second Session

After a week of sitting on this, I decided to come back to ChatGPT and see if it had any other ideas. I decided to frame my question in a different way, stating a lot more details up front. Lo and behold, it had a different answer this time!


Suppose I have a Haskell module called MyStrings:

module MyStrings where

greeting :: String
greeting = "Hello"

I can compile this module:

$ ghc MyStrings.hs

This generates an interface file, MyStrings.hi, and an object file, MyStrings.o.

Now suppose I have an executable file, Main.hs, that imports MyStrings:

module Main where

import MyStrings (greeting)

main :: IO ()
main = putStrLn greeting

Can I use ghc to compile this Main file by using the .hi and .o files, without including the imported source file, MyStrings.hs in the command?

Chat GPT


Yes, you can use the .hi and .o files generated by compiling MyStrings.hs to compile Main.hs without including MyStrings.hs in the command. To do so, you can use the -c option to tell GHC to compile Main.hs to an object file, and then use the -o option to link the object files together into an executable:

$ ghc -c Main.hs
$ ghc Main.o MyStrings.o -o myprogram

The first command compiles Main.hs to an object file Main.o, without attempting to link it. The second command links Main.o with MyStrings.o and creates an executable called myprogram. The .hi files are not needed for linking, so you don't need to include them in the command.

Note that in order for this to work, the interface file MyStrings.hi must be present in the same directory as Main.hs when you compile it. If you move MyStrings.hi to a different directory, you'll need to use the -i option to tell GHC where to find it:

$ ghc -i/path/to/MyStrings -c Main.hs
$ ghc Main.o /path/to/MyStrings/MyStrings.o -o myprogram

Here, -i/path/to/MyStrings tells GHC to look for MyStrings.hi in the directory /path/to/MyStrings.


So there are two key ingredients that weren't mentioned at all in our first pass: the -c option and the -i option.

Using -c performs compilation without linking. With this option, GHC is able to produce an object file for our Main module without needing its dependent objects. So .o files are unnecessary in this stage.

We still need the .hi files though. But instead of providing them on the command line, we use the -i argument to supply them. It's an odd argument, because we put the path right after the i, without any spacing.

After we're done with the first phase, then we can link all our object files together.

Solving It

And sure enough, this approach works!

$ ghc src/MyStrings.hs src/MyFunction.hs
$ ghc -c app/Main.hs -i./src
$ ghc app/Main.o ./src/MyStrings.o ./src/MyFunction.o -o hello
$ ./hello
elloH elloH

And if we want to be a little cleaner about putting our artifacts in a single location, we can use the -hidir and -odir arguments for storing everything in a bin directory.

$ mkdir bin
$ ghc src/MyStrings.hs src/MyFunction.hs -hidir ./bin -odir ./bin
$ ghc -c app/Main.hs -i./bin -hidir ./bin -odir ./bin
$ ghc bin/Main.o ./bin/MyStrings.o ./bin/MyFunction.o -o ./bin/hello
$ ./bin/hello
elloH elloH

And we're done! Our program is compiling as we wanted it to, without our "Main" compilation command directly using the library source files.


So with that fun little adventure concluded, what can we learn from this? Well first of all, prompts matter a great deal when you're using a Chatbot. The more detailed your prompt, and the more you spell out your assumptions, the more likely you'll get the answer you're looking for. My second prompt was waaay more detailed than my first prompt, and the solution was much better as a result.

But a more pertinent lesson for Haskellers might be that using GHC by itself can be a big pain. So if you're a beginner, you might be asking:

What's the normal way to build Haskell Code?

You can learn all about building and running your Haskell code in our new free course, Setup.hs. This course will teach you the easy steps to set up your Haskell toolchain, and show you how to build and run your code using Stack, Haskell's most popular build system. You'll even learn how to get Haskell integrations in several popular code editors so you can learn from your mistakes much more quickly. Learn more about it on the course page.

And if you subscribe to our monthly newsletter, you'll get a code for 20% off any of our paid courses until May 1st! So don't miss out on that offer!

by James Bowen at April 10, 2023 02:30 PM

April 07, 2023

JP Moresmau

A WebAssembly plugin system

Wow, it's been 4 years since I last blogged something! I usually just lurk on social media and I guess I was too busy to find the time to write about something reasonably interesting...

But hopefully today I did something I can present. I wanted to see how I could use WebAssembly to write plugins I could then load dynamically in Rust code. So far, results are good but only work with the wasm-bindgen conventions for calling functions and getting results from them. Still!

A lot of WebAssembly samples show you how to pass around the basic numeric types like i32, so I wanted something with Strings for a little extra complexity and interest. So our plugins will expose two methods:

- language takes no argument and returns a string indicating which (human, not programming) language the plugin handles

- greet takes one argument, a person's name, and returns a greeting in the plugin's language

As you can see, not too involved, but a nice little use case.
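
Before any WebAssembly gets involved, the contract can be sketched as a native Rust trait. This is only an illustration: the GreeterPlugin trait, the English struct and the greet_all helper are hypothetical names, and the real plugins are loaded from .wasm files rather than linked in like this.

```rust
/// The contract each plugin fulfils, written as a native Rust trait.
/// (Hypothetical name; the real plugins export plain functions.)
trait GreeterPlugin {
    /// Which human language the plugin handles.
    fn language(&self) -> String;
    /// A greeting for `name` in that language.
    fn greet(&self, name: &str) -> String;
}

struct English;

impl GreeterPlugin for English {
    fn language(&self) -> String {
        String::from("English")
    }

    fn greet(&self, name: &str) -> String {
        format!("Hello, {name}!")
    }
}

/// Ask every plugin to greet the same person, one line per plugin.
fn greet_all(plugins: &[Box<dyn GreeterPlugin>], name: &str) -> Vec<String> {
    plugins
        .iter()
        .map(|plugin| format!("{}: {}", plugin.language(), plugin.greet(name)))
        .collect()
}

fn main() {
    let plugins: Vec<Box<dyn GreeterPlugin>> = vec![Box::new(English)];
    for line in greet_all(&plugins, "JP Moresmau") {
        println!("{line}");
    }
}
```

The runner later in this post has the same shape, except that each plugin is a WebAssembly module and the strings have to cross the module boundary as bytes in linear memory.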

The first plugin in Rust

I followed the instructions at the Rust and WebAssembly guide to get started, this was really painless. The Rust code for my first plugin is simply (including some generated code I didn't touch):

mod utils;

use wasm_bindgen::prelude::*;

// When the `wee_alloc` feature is enabled, use `wee_alloc` as the global
// allocator.
#[cfg(feature = "wee_alloc")]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

/// The language we greet in.
#[wasm_bindgen]
pub fn language() -> String {
    String::from("English")
}

/// Greet the given name.
#[wasm_bindgen]
pub fn greet(name: &str) -> String {
    format!("Hello, {name}!")
}

Running wasm-pack build gives us a .wasm file that exports language and greet, good!

The plugin runner in Rust

I used the wasmtime crate to get a runtime engine capable of loading and calling WebAssembly modules. I didn't look for any utility functions and implemented the string handling functions necessary to interact with the wasm-bindgen exposed functions myself.

So Cargo.toml has very few dependencies:

wasmtime = "1.0.0"
anyhow = "1.0.70"
byteorder = "1.4.3"

The main function is fairly straightforward: it gets the arguments, creates the WASM engine and asks each plugin it can find in a folder to greet the person:

fn main() -> Result<()> {
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        return Err(anyhow!("Usage: i18n-greeter <name>"));
    }
    let engine = Engine::default();
    let linker = Linker::new(&engine);

    let paths = fs::read_dir("./plugins").unwrap();

    for path in paths {
        let path = path?;
        let module = Module::from_file(&engine, path.path())?;
        let mut runtime = Runtime::new(&engine, &linker, &module)?;
        let language = runtime.language()?;
        println!("Language: {language}");
        let greeting = runtime.greet(&args[1])?;
        println!("Greeting: {greeting}");
    }
    Ok(())
}

The magic is inside the Runtime struct, that actually handles the nitty-gritty of initializing all the necessary things for wasmtime to do its thing:

struct Runtime {
    store: Store<()>,
    memory: Memory,
    /// Pointer to currently unused memory.
    pointer: usize,
    language: TypedFunc<i32, ()>,
    greet: TypedFunc<(i32, i32, i32), ()>,
}

We have to manage the WebAssembly linear memory ourselves, so we do it very simply by keeping a pointer to where in the memory we can put stuff.

Initializing the runtime:

fn new(engine: &Engine, linker: &Linker<()>, module: &Module) -> Result<Self> {
    let mut store = Store::new(engine, ());

    let instance = linker.instantiate(&mut store, module)?;

    let memory = instance
        .get_memory(&mut store, "memory")
        .ok_or(anyhow::format_err!("failed to find `memory` export"))?;
    let language = instance
        .get_func(&mut store, "language")
        .ok_or(anyhow::format_err!(
            "`language` was not an exported function"
        ))?
        .typed::<i32, (), _>(&store)?;
    let greet = instance
        .get_func(&mut store, "greet")
        .ok_or(anyhow::format_err!("`greet` was not an exported function"))?
        .typed::<(i32, i32, i32), (), _>(&store)?;

    Ok(Self {
        store,
        memory,
        pointer: 0,
        language,
        greet,
    })
}

With this we can do our own very basic memory management: reserving an area of memory, for example to read and write strings as UTF-8 byte arrays, using the wasm-bindgen conventions:

/// Get a new pointer to store the given size in memory.
/// Grows memory if needed.
fn new_pointer(&mut self, size: usize) -> Result<i32> {
    let current = self.pointer;
    self.pointer += size;
    while self.pointer > self.memory.data_size(&self.store) {
        self.memory.grow(&mut self.store, 1)?;
    }
    Ok(current as i32)
}

/// Reset pointer, so memory can get overwritten.
fn reset_pointer(&mut self) {
    self.pointer = 0;
}

/// Read string from memory.
fn read_string(&self, offset: i32, length: i32) -> Result<String> {
    let mut contents = vec![0; length as usize];
    self.memory
        .read(&self.store, offset as usize, &mut contents)?;
    Ok(String::from_utf8(contents)?)
}

/// Read bounds from memory.
fn read_bounds(&self, offset: i32) -> Result<(i32, i32)> {
    let mut buffer = [0u8; 8];
    self.memory
        .read(&self.store, offset as usize, &mut buffer)?;
    let start = (&buffer[0..4]).read_i32::<LittleEndian>()?;
    let length = (&buffer[4..]).read_i32::<LittleEndian>()?;
    Ok((start, length))
}

/// Write string into memory.
fn write_string(&mut self, str: &str) -> Result<(i32, i32)> {
    let data = str.as_bytes();
    let offset = self.new_pointer(data.len())?;
    self.memory.write(&mut self.store, offset as usize, data)?;
    Ok((offset, str.len() as i32))
}

Basically we pass two i32 values when we need to transfer a string: the offset and the length that we use on the linear memory to read or write the bytes.

Using these, wrapping our plugin functions is easy:

/// Call language function.
fn language(&mut self) -> Result<String> {
    let offset = self.new_pointer(16)?;
    self.language.call(&mut self.store, offset)?;
    let (offset, length) = self.read_bounds(offset)?;
    let s = self.read_string(offset, length)?;
    Ok(s)
}

/// Call greet function.
fn greet(&mut self, name: &str) -> Result<String> {
    let offset = self.new_pointer(16)?;
    let (start, length) = self.write_string(name)?;
    self.greet.call(&mut self.store, (offset, start, length))?;
    let (offset, length) = self.read_bounds(offset)?;
    let s = self.read_string(offset, length)?;
    Ok(s)
}

So if we copy the wasm file compiled from our first plugin into the plugins directory and run the program with my name:

cargo run "JP Moresmau"
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/i18n-greeter 'JP Moresmau'`
Language: English
Greeting: Hello, JP Moresmau!

Further steps

Of course, what would be very cool now is to be able to write plugins in other languages, but it looks like Go, for example, even with TinyGo, is still very much tied to a JavaScript runtime. Maybe the wasm-bindgen conventions will be ported to languages other than Rust in the future?

Trying it yourself

All the code can be found at

by JP Moresmau at April 07, 2023 01:21 PM

April 03, 2023

Gabriella Gonzalez

Stop calling everything "Nix"

One of my pet peeves is when people abuse the term “Nix” without qualification when trying to explain the various components of the Nix ecosystem.

As a concrete example, a person might say:

“I hate Nix’s syntax”

… and when you dig into this criticism you realize that they’re actually complaining about the Nixpkgs API, which is not the same thing as the syntax of the Nix expression language.

So one of the goals of this post is to introduce some unambiguous terminology that people can use to refer to the various abstraction layers of the Nix ecosystem in order to avoid confusion. I’ll introduce each abstraction layer from the lowest level abstractions to the highest level abstractions.

Another reason I explain “Nix” in terms of these abstraction layers is that this helps people consult the correct manual. The Nix ecosystem provides three manuals that you will commonly need to refer to in order to become more proficient:

  • The Nix manual
  • The Nixpkgs manual
  • The NixOS manual

… and I hope by the end of this post it will be clearer which manual interests you for any given question.

Edit: Domen Kožar pointed out that there is an ongoing effort to standardize terminology here:

I’ll update the post to match the agreed-upon terminology when that is complete.

Layer #0: The Nix store

I use the term “Nix store” to mean essentially everything you can manage with the nix-store command-line tool.

That is the simplest definition, but to expand upon that, I mean the following files:

  • Derivations: /nix/store/*.drv
  • Build products: /nix/store/* without a .drv extension
  • Log files: /nix/var/log/nix/drvs/**
  • Garbage collection roots: /nix/var/nix/gcroots/**

… and the following operations:

  • Realizing a derivation

    i.e. converting a .drv file to the corresponding build products using nix-store --realise

  • Adding static files to the /nix/store

    i.e. nix-store --add

  • Creating GC roots for build products

    i.e. the --add-root option to nix-store

  • Garbage collecting derivations not protected by a GC root

    i.e. nix-store --gc

There are other things the Nix store supports (like profile management), but these are the most important operations.

CAREFULLY NOTE: the “Nix store” is independent of the “Nix language” (which we’ll define below). In other words, you could replace the front-end Nix programming language with another language (e.g. Guile Scheme, as Guix does). This is because the Nix derivation format (the .drv files) and the nix-store command-line interface are both agnostic of the Nix expression language. I have a talk which delves a bit more into this subject:

Layer #1: The Nix language

I use the term “Nix language” to encompass three things:

  • The programming language: source code we typically store in .nix files
  • Instantiation: the interpretation of Nix code to generate .drv files
  • Flakes: pure evaluation and instantiation caching

To connect this with the previous section, the typical pipeline for converting Nix source code to a build product is:

Nix source code (*.nix)            │ Nix language
      ↓ Instantiation              ├─────────────
Nix derivation (/nix/store/*.drv)  │
      ↓ Realization                │ Nix store
Nix build product (/nix/store/*)   │

In isolation, the Nix language is “just” a purely functional programming language with simple language constructs. For example, here is a sample Nix REPL session:

nix-repl> 2 + 2
4

nix-repl> x = "world"

nix-repl> "Hello, " + x
"Hello, world"

nix-repl> r = { a = 1; b = true; }

nix-repl> if r.b then r.a else 0
1

However, as we go up the abstraction ladder the idiomatic Nix code we’ll encounter will begin to stray from that simple functional core.

NOTE: Some people will disagree with my choice to include flakes at this abstraction layer since flakes are sometimes marketed as a dependency manager (similar to niv). I don’t view them in this way and I treat flakes primarily as a mechanism for purifying evaluation and caching instantiation, as outlined in this post:

… and if you view flakes in that capacity then they are a feature of the Nix language since evaluation/instantiation are the primary purpose of the programming language.

Layer #2: The Nix build tool

This layer encompasses the command-line interface to both the “Nix store” and the “Nix language”.

This includes (but is not limited to):

  • nix-store (the command, not the underlying store)
  • nix-instantiate
  • nix-build
  • nix-shell
  • nix subcommands, including:
    • nix build
    • nix run
    • nix develop
    • nix log
    • nix flake

I make this distinction because the command-line interface enables some additional niceties that are not inherent to the underlying layers. For example, the nix build command has some flake integration so that you can say nix build someFlake#somePackage and this command-line API nicety is not necessarily inherent to flakes (in my view).

Also, many of these commands operate at both Layer 0 and Layer 1, which can blur the distinction between the two. For example the nix-build command can accept a layer 1 Nix program (i.e. a .nix file) or a layer 0 derivation (i.e. a .drv file).

Another thing that blurs the distinction is that the Nix manual covers all three of the layers introduced so far, ranging from the Nix store to the command-line interface. However, if you want to better understand these three layers then that is the correct place to begin:

Layer #3: Nixpkgs

Nixpkgs is a software distribution (a.k.a. “distro”) for Nix. Specifically, all of the packaging logic for Nixpkgs is hosted on GitHub here:

This repository contains a large number of Nix expressions for building packages across several platforms. If the “Nix language” is a programming language then “Nixpkgs” is a gigantic “library” authored within that language. There are other Nix “libraries” outside of Nixpkgs but Nixpkgs is the one you will interact with the most.

The Nixpkgs repository establishes several widespread idioms and conventions, including:

  • The standard environment (a.k.a. stdenv) for authoring a package
    • There are also language-specific standard-environments, too
  • A domain-specific language for overriding individual packages or sets of packages

When people complain about “Nix’s syntax”, most of the time they’re actually complaining about Nixpkgs and more specifically complaining about the Nixpkgs system for overriding packages. However, I can see how people might mistake the two.

The reason for the confusion is that the Nixpkgs support for overrides is essentially an embedded domain-specific language, meaning that you still express everything in the Nix language (layer 1), but the way in which you express things is fundamentally different from how you would express them using only low-level Nix language features.

As a contrived example, this “layer 1” Nix code:

  x = 1;

  y = x + 2;

… would roughly correspond to the following “layer 3” Nixpkgs overlay:

self: super: {
  x = 1;

  y = self.x + 2;
}

The reason why Nixpkgs doesn’t do the simpler “layer 1” thing is that Nixpkgs is designed to support “late binding” of expressions, meaning that everything can be overridden, even dependencies deep within the dependency tree. Moreover, this overriding is done in such a way that everything “downstream” of the override (i.e. all reverse dependencies) picks up the change correctly.

As a more realistic example, the following program:

  pkgs = import <nixpkgs> { };

  fast-tags =

  fast-tags-no-tests =
    pkgs.haskell.lib.dontCheck fast-tags;


… is simpler, but is not an idiomatic use of Nixpkgs because it is not using the overlay system and therefore does not support late binding. The more idiomatic analog would be:

  overlay = self: super: {
    fast-tags =

    fast-tags-no-tests =

  pkgs = import <nixpkgs> { overlays = [ overlay ]; };


You can learn more about this abstraction layer by consulting the Nixpkgs manual:

Layer #4: NixOS

NixOS is an operating system that is (literally) built on Nixpkgs. Specifically, there is a ./nixos/ subdirectory of the Nixpkgs repository for all of the NixOS-related logic.

NixOS is based on the NixOS module system, which is yet another embedded domain-specific language. In other words, you configure NixOS with Nix code, but the idioms of that Nix code depart even more wildly from straightforward “layer 1” Nix code.

NixOS modules were designed to look more like Terraform modules than Nix code, but they are still technically Nix code. For example, this is what the NixOS module for the lorri service looks like at the time of this writing:

{ config, lib, pkgs, ... }:

let
  cfg = config.services.lorri;
  socketPath = "lorri/daemon.socket";
in {
  options = {
    services.lorri = {
      enable = lib.mkOption {
        default = false;
        type = lib.types.bool;
        description = lib.mdDoc ''
          Enables the daemon for `lorri`, a nix-shell replacement for project
          development. The socket-activated daemon starts on the first request
          issued by the `lorri` command.
        '';
      };
      package = lib.mkOption {
        default = pkgs.lorri;
        type = lib.types.package;
        description = lib.mdDoc ''
          The lorri package to use.
        '';
        defaultText = lib.literalExpression "pkgs.lorri";
      };
    };
  };

  config = lib.mkIf cfg.enable {
    systemd.user.sockets.lorri = {
      description = "Socket for Lorri Daemon";
      wantedBy = [ "sockets.target" ];
      socketConfig = {
        ListenStream = "%t/${socketPath}";
        RuntimeDirectory = "lorri";
      };
    };

    systemd.user.services.lorri = {
      description = "Lorri Daemon";
      requires = [ "lorri.socket" ];
      after = [ "lorri.socket" ];
      path = with pkgs; [ config.nix.package git gnutar gzip ];
      serviceConfig = {
        ExecStart = "${cfg.package}/bin/lorri daemon";
        PrivateTmp = true;
        ProtectSystem = "strict";
        ProtectHome = "read-only";
        Restart = "on-failure";
      };
    };

    environment.systemPackages = [ cfg.package ];
  };
}

You might wonder how NixOS relates to the underlying layers. For example, if Nix is a build system, then how do you “build” NixOS? I have another post which elaborates on that subject here:

Also, you can learn more about this abstraction layer by consulting the NixOS manual:

Nix ecosystem

I use the term “Nix ecosystem” to describe all of the preceding layers and other stuff not mentioned so far (like hydra, the continuous integration service).

This is not a layer of its own, but I mention this because I prefer to use “Nix ecosystem” instead of “Nix” to avoid ambiguity, since the latter can easily be mistaken for an individual abstraction layer (especially the Nix language or the Nix build tool).

However, when I do hear people say “Nix”, then I generally understand it to mean the “Nix ecosystem” unless they clarify otherwise.


Hopefully this passive aggressive post helps people express themselves a little more precisely when discussing the Nix ecosystem.

If you enjoy this post, you will probably also like this other post of mine:

… since that touches on the Nixpkgs and NixOS embedded domain-specific languages and how they confound the user experience.

I’ll conclude this post with the following obligatory joke:

I’d just like to interject for a moment. What you’re referring to as Nix, is in fact, NixOS, or as I’ve recently taken to calling it, Nix plus OS. Nix is not an operating system unto itself, but rather another free component of a fully functioning ecosystem made useful by the Nix store, Nix language, and Nix build tool comprising a full OS as defined by POSIX.

Many Guix users run a modified version of the Nix ecosystem every day, without realizing it. Through a peculiar turn of events, the operating system based on Nix which is widely used today is often called Nix, and many of its users are not aware that it is basically the Nix ecosystem, developed by the NixOS foundation.

There really is a Nix, and these people are using it, but it is just a part of the system they use. Nix is the expression language: the program in the system that specifies the services and programs that you want to build and run. The language is an essential part of the operating system, but useless by itself; it can only function in the context of a complete operating system. Nix is normally used in combination with an operating system: the whole system is basically an operating system with Nix added, or NixOS. All the so-called Nix distributions are really distributions of NixOS!

by Gabriella Gonzalez at April 03, 2023 06:48 PM

Ergonomic newtypes for Haskell strings and numbers

This blog post summarizes a very brief trick I commonly recommend whenever I see something like this:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving (Show)

newtype Age = Age { getAge :: Natural }
    deriving (Show)

data Person = Person { name :: Name, age :: Age }
    deriving (Show)

example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }

… where the newtypes are not opaque (i.e. the newtype constructors are exported), so the newtypes are more for documentation purposes rather than type safety.

The issue with the above code is that the newtypes add extra boilerplate for both creating and displaying those types. For example, in order to create the Name and Age newtypes you need to explicitly specify the Name and Age constructors (like in the definition for example above) and they also show up when displaying values for debugging purposes (e.g. in the REPL):

>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}

Fortunately, you can easily elide these noisy constructors if you follow these rules of thumb:

  • Derive IsString for newtypes around string-like types

  • Derive Num for newtypes around numeric types

  • Change the Show instances to use the underlying Show for the wrapped type

For example, I would suggest amending the original code like this:

{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings          #-}

module Example1 where

import Data.Text (Text)
import Data.String (IsString)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)

newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)

data Person = Person { name :: Name, age :: Age }
    deriving stock (Show)

example :: Person
example = Person{ name = "John Doe", age = 42 }

… and now the Age and Name constructors are invisible, even when displaying these types (using their Show instances):

>>> example
Person {name = "John Doe", age = 42}

That is the entirety of the trick, but if you still don’t follow, I’ll expand upon that below.


Revisiting the starting code:

{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving (Show)

newtype Age = Age { getAge :: Natural }
    deriving (Show)

data Person = Person { name :: Name, age :: Age }
    deriving (Show)

example :: Person
example = Person{ name = Name "John Doe", age = Age 42 }

… the first thing we’re going to do is to enable the DerivingStrategies language extension because I’m going to lean pretty heavily on Haskell’s support for deriving typeclass instances in this post and I want to be more explicit about how these instances are being derived:

{-# LANGUAGE DerivingStrategies #-}

newtype Name = Name { getName :: Text }
    deriving stock (Show)

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)

I’ve changed the code to explicitly specify that we’re deriving Show using the “stock” deriving strategy, meaning that Haskell has built-in language support for deriving Show and we’re going to use that.

The next step is that we’re going to add an IsString instance for Name because it wraps a string-like type (Text). However, at first we’ll write out the instance by hand:

import Data.String (IsString(..))

instance IsString Name where
    fromString string = Name (fromString string)

This IsString instance works in conjunction with Haskell’s OverloadedStrings so that we can directly use a string literal in place of a Name, like this:

example :: Person
example = Person{ name = "John Doe", age = Age 42 }
                      -- ↑
                      -- No more Name constructor required here

… and the reason that works is because the compiler implicitly inserts fromString around all string literals when you enable OverloadedStrings, as if we had written this:

example :: Person
example = Person{ name = fromString "John Doe", age = Age 42 }

The IsString instance for Name:

instance IsString Name where
    fromString string = Name (fromString string)

… essentially defers to the IsString instance for the underlying wrapped type (Text). In fact, this pattern of deferring to the underlying instance is common enough that Haskell provides a language extension for this purpose: GeneralizedNewtypeDeriving. If we enable that language extension, then we can simplify the IsString instance to this:

{-# LANGUAGE GeneralizedNewtypeDeriving #-}

newtype Name = Name { getName :: Text }
    deriving stock (Show)
    deriving newtype (IsString)

The deriving newtype indicates that we’re explicitly using the GeneralizedNewtypeDeriving extension to derive the implementation for the IsString instance.

In this particular case we don’t have to specify the deriving strategy; we could have just said deriving (IsString) and it still would have worked because it wasn’t ambiguous; no other deriving strategy would have worked in this case. However, as we’re about to see there are cases where you want to explicitly disambiguate between multiple possible deriving strategies.

The next step is that we implement Num for our Age type since it wraps a numeric type (Natural):

instance Num Age where
    Age x + Age y = Age (x + y)

    Age x - Age y = Age (x - y)

    Age x * Age y = Age (x * y)

    negate (Age x) = Age (negate x)

    abs (Age x) = Age (abs x)

    signum (Age x) = Age (signum x)

    fromInteger integer = Age (fromInteger integer)

Bleh! That’s a lot of work to do when really we were most interested in the fromInteger method (so that we could use numeric literals directly to create an Age).

The reason we care about the fromInteger method is because Haskell lets you use integer literals for any type that implements Num (without any language extension; this is part of the base language). So, for example, we can further simplify our example Person to:

example :: Person
example = Person{ name = "John Doe", age = 42 }
                                        -- ↑
                                        -- No more Age constructor required here

… and the reason that works is because the compiler implicitly inserts fromInteger around all integer literals, as if we had written this:

example :: Person
example = Person{ name = "John Doe", age = fromInteger 42 }

It would be nice if Haskell had a dedicated class for just the fromInteger method (e.g. IsInteger), but alas if we want ergonomic support for numeric literals then we have to add support for other numeric operations, too, even if they might not necessarily make sense for our newtype.

Like before, though, we can use the GeneralizedNewtypeDeriving extension to derive Num instead:

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
    deriving newtype (Num)

Much better!
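As a minimal sketch of what these two derived instances buy us (assuming the text package is available; birthday and defaultName are hypothetical names introduced only for this illustration), literals and arithmetic now work directly on the newtypes:

```haskell
{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE OverloadedStrings          #-}

import Data.String (IsString)
import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }
    deriving stock (Show)
    deriving newtype (IsString)

newtype Age = Age { getAge :: Natural }
    deriving stock (Show)
    deriving newtype (Num)

-- Hypothetical helper: the derived Num instance reuses Natural's addition,
-- and the literal 1 becomes an Age via fromInteger.
birthday :: Age -> Age
birthday age = age + 1

-- Hypothetical value: the string literal becomes a Name via fromString.
defaultName :: Name
defaultName = "John Doe"
```

Here getAge (birthday 41) evaluates to 42 without the Age constructor appearing anywhere at the use site.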

However, we’re not done, yet, because at the moment these Name and Age constructors still appear in the debug output:

>>> example
Person {name = Name {getName = "John Doe"}, age = Age {getAge = 42}}


Okay, so the final step is to change the Show instances for Name and Age to defer to the Show instances for their underlying types:

instance Show Name where
    show (Name string) = show string

instance Show Age where
    show (Age natural) = show natural

These are still valid Show instances! The Show class requires that the displayed representation should be valid Haskell code for creating a value of that type, and in both cases that’s what we get.

For example, if you show a value like Name "John Doe" you will get "John Doe", and that’s valid Haskell code for creating a Name if you enable OverloadedStrings.

Note: You might argue that this is not a valid Show instance because it requires the use of a language extension (e.g. OverloadedStrings) in order to be valid code. However, this is no different than the Show instance for Text (which is also only valid if you enable OverloadedStrings), and most people do not take issue with that Show instance for Text either.

Similarly, if you show a value like Age 42 you will get 42, and that’s valid Haskell code for creating an Age.
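To make the two claims above concrete, here is a small self-contained sketch (assuming the text package is available; renderedName and renderedAge are hypothetical names used only for this check) that applies the hand-written Show instances:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Numeric.Natural (Natural)

newtype Name = Name { getName :: Text }

newtype Age = Age { getAge :: Natural }

-- Defer to the Show instance of the wrapped Text.
instance Show Name where
    show (Name string) = show string

-- Defer to the Show instance of the wrapped Natural.
instance Show Age where
    show (Age natural) = show natural

-- Hypothetical values: the rendered forms of the two newtypes.
renderedName :: String
renderedName = show (Name "John Doe")

renderedAge :: String
renderedAge = show (Age 42)
```

In this sketch renderedName is the quoted string "John Doe" (including the quotation marks) and renderedAge is 42, with no constructor or field names in either.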

So with those two new Show instances our Person type now renders much more compactly:

>>> example
Person {name = "John Doe", age = 42}

… but we’re not done! The last part of the trick is to use GeneralizedNewtypeDeriving to derive the Show instances, like this:

newtype Name = Name { getName :: Text }
    deriving newtype (IsString, Show)

newtype Age = Age { getAge :: Natural }
    deriving newtype (Num, Show)

… and this is where the DerivingStrategies language extension really matters! Without that extension there would be no way to tell the compiler to derive Show by deferring to the underlying type. By default, if you don’t specify the deriving strategy then the compiler assumes that derived Show instances use the stock deriving strategy.
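The difference between the two strategies can be seen side by side in the following sketch. The types AgeDefault and AgeNewtype are hypothetical and exist only to contrast the default behavior with the explicit newtype strategy:

```haskell
{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Numeric.Natural (Natural)

-- No strategy given: for Show the compiler uses the stock strategy,
-- so the constructor and field name appear in the output.
newtype AgeDefault = AgeDefault { getAgeDefault :: Natural }
    deriving (Show)

-- Explicit newtype strategy: Show defers to the instance for Natural.
newtype AgeNewtype = AgeNewtype { getAgeNewtype :: Natural }
    deriving newtype (Show)

-- Hypothetical values used to compare the two renderings.
defaultRendering :: String
defaultRendering = show (AgeDefault 42)

newtypeRendering :: String
newtypeRendering = show (AgeNewtype 42)
```

In this sketch defaultRendering is AgeDefault {getAgeDefault = 42}, while newtypeRendering is just 42.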


There’s one last bonus to doing things in this way: you might now be able to hide the newtype constructor by not exporting it! I think this is actually the most important benefit of all because a newtype with an exposed constructor doesn’t really improve upon the type safety of the underlying type.

When a newtype like Name or Age exposes the newtype constructor then the newtype serves primarily as documentation and I’m not a big fan of this “newtypes as documentation” design pattern. However, I’m not that strongly opposed to it either; I wouldn’t use it in my own code, but I also wouldn’t insist that others don’t use it. Another post which takes a stronger stance on this is Names are not type safety, especially the section on “Newtypes as tokens”.

I’m personally okay with other people using newtypes in this way, but if you do use “newtypes as documentation” then please add IsString / Num / Show instances as described in this post so that they’re more ergonomic for others to use.

by Gabriella Gonzalez at April 03, 2023 02:33 PM

Monday Morning Haskell

How to Stump ChatGPT in 1 Easy Step (Hint: Ask a Windows Question)

While doing some research for Setup.hs, my new free course, I confronted a fact of Haskell development with Stack.

The commands for running individual test suites are cumbersome.

For example, if you have a parse-tests suite in your Project.cabal file, you can run it (without running other suites) like so:

$ stack build Project:test:parse-tests

The structure isn't hard once you're used to it, but remembering the exact names of your test suites isn't always easy on larger projects. This leads to the question:

How can I create aliases for 'stack build' commands?

On Unix-based systems, it's quite easy. Just slap something like this in your .bashrc (or equivalent) and you're good to go:

alias ppt="stack build Project:test:parse-tests"


$ source ~/.bashrc
$ ppt
(Tests run)

Windows though is, as always, a special beast. And much as I generally prefer to use Windows Subsystem for Linux, I wanted to help any users who might be working with a pure Windows setup. This article will focus on:

How do I enable 'stack' aliases in Windows PowerShell?

Now, when I lay out the code I put in my PowerShell profile, it looks a little dense but fairly innocuous:

Function ProjectParseTest { Invoke-Expression "stack build Project:test:parse-tests" }
New-Alias -Name ppt -Value ProjectParseTest

But discovering the exact formula for getting all this to work was quite difficult. And as we'll see, in fact, even ChatGPT, the new kid on the block, struggles with this question. After a brief while struggling through Windows documentation, I turned to the chat bot. Let's review the first question:

ChatGPT Question 1


How can I make command aliases in Windows 11 Powershell?

Chat GPT

You can create command aliases in Windows 11 PowerShell by using the New-Alias cmdlet. Here's an example that creates an alias ls for the Get-ChildItem cmdlet:

New-Alias -Name ls -Value Get-ChildItem

This will create an alias named ls that can be used just like the Get-ChildItem cmdlet. For example:

ls .

This will list the contents of the current directory, just as if you had typed Get-ChildItem ..

To persist your alias across sessions, you need to add it to your PowerShell profile. To do that, first determine the path to your profile by running the following command:

$ $profile

If the profile does not exist yet, you can create it by running the following command:

New-Item -Type File -Path $profile -Force

Then, open the profile in your favorite text editor and add the New-Alias command. Save the file and restart your PowerShell session. Your alias will be available every time you start PowerShell.


Now, everything Chat has told me is technically true. The command it gives successfully creates an alias. Notably though, if you want to try this yourself, you should use a different alias, as PowerShell already includes the alias ls in the most recent Windows versions (previously you would need to use the alias dir to get that effect).

Additionally, I found that the PROFILE variable was indeed set, but the file did not exist. In fact, the directory it was supposed to be in (Documents/WindowsPowerShell) did not exist either! So you'll have to make that before creating the file. But placing an alias command in there will persist the alias across sessions.

However, notice that this doesn't quite answer our question, because the subject of the alias was an established Windows expression (Get-ChildItem, also known as a cmdlet), rather than a terminal command we would write.

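Let's suppose we just want to alias sb=stack build. It does not work to simply use a string for the command:

PS > New-Alias -Name sb -Value "stack build"
PS > sb
The term 'stack build' is not recognized as the name of a cmdlet, function, script file, or operable program.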

There are some clues to our eventual solution in the error message though. It tells us we need a cmdlet, a function, a script file, or an operable program.

Now stack by itself is an operable program. So we could actually do the following:

PS > New-Alias -Name st -Value "stack"
PS > st
(stack usage output)

But we want to provide arguments as part of the alias. So let's see if ChatGPT knows how. We'll see that its answers are not quite right.

ChatGPT Question 2


How do I make the alias run a command, like "stack build"

Chat GPT

You can create an alias that runs a command by using the Invoke-Expression cmdlet to evaluate a string as a PowerShell expression. Here's an example that creates an alias sb for the stack build command:

New-Alias -Name sb -Value { Invoke-Expression "stack build" }

This will create an alias named sb that can be used to run the stack build command. For example:

sb

This will run the stack build command, just as if you had typed it into the PowerShell prompt.

As with the previous example, you can add this alias to your PowerShell profile to persist it across sessions.


I get an error "Cannot evaluate parameter 'Value' because its argument is specified as a script block and there is no input"

How do I specify input in this alias?

Chat GPT

To specify input in a script block alias, you can use the Param keyword to declare input parameters and then pass the input to the underlying command. Here's an example that creates an alias sb for the stack build command, allowing you to specify the target to build as input:

New-Alias -Name sb -Value {
    Param ($target)
    Invoke-Expression "stack build $target"
}

This will create an alias named sb that can be used to run the stack build command with a specified target. For example:

sb my-target

This will run the stack build my-target command, just as if you had typed it into the PowerShell prompt.

As with the previous examples, you can add this alias to your PowerShell profile to persist it across sessions.


Both of the proposed approaches fail! I commented on the issue explicitly after the first response, and then got the same error running the second approach ("Cannot evaluate parameter 'Value' because its argument is specified as a script block..."). So I gave up on my interrogation and started searching for documentation again.

Solving the Problem

I don't remember exactly where I found the solution. I think it involved cobbling together points from multiple Stack Overflow posts. Windows documentation was not very helpful. Let's recall the error message from earlier:

The term 'stack build' is not recognized as the name of a cmdlet, function, script file, or operable program.

We can't make an official Windows PS cmdlet out of our program, nor can we make an operable program with the arguments we want. We could make a script file, but running scripts from PowerShell is surprisingly difficult (there are some extra steps with permissions). The answer is that we can make a function that our alias will refer to.

This function will, incidentally, use the Invoke-Expression idea ChatGPT recommended as well, just not directly as the alias value!

The following code will go in our $PROFILE. First, we make a function that invokes our expression. We can give this function an arbitrary name, but I used a capitalized name (ProjectParseTest) to distinguish from any potential aliases.

Function ProjectParseTest { Invoke-Expression "stack build Project:test:parse-tests" }

Now we can use this function as the object of a New-Alias call! So we use the command ChatGPT suggested, just substituting our function for the -Value instead of providing a raw Invoke-Expression command.

Function ProjectParseTest { Invoke-Expression "stack build Project:test:parse-tests" }
New-Alias -Name ppt -Value ProjectParseTest

This alias succeeds now, and by putting this in my PowerShell profile, I can persist the alias across sessions!

PS > ppt
(Test runs)

Haskell on Windows

Now aliases are just one small piece of the Haskell puzzle. So if you're trying to get started with Haskell, but don't have a Mac, and aren't familiar with Linux, you might want to know:

How do I set up my Haskell toolchain on Windows?

My new free course Setup.hs goes over all the basics of setting up your Haskell toolchain, including how to get started with code hints in three of the most popular editors out there. Plus, every lecture includes a walkthrough video for Windows* so you can learn what kinds of odd quirks you might come across! You can read more about the course in this article.

Plus, if you subscribe to our monthly newsletter, you'll also get a 20% discount code for all our paid courses that is good until May 1! So don't miss out on your chance to learn about Haskell!

*Separate walkthrough videos for MacOS and Linux are also included.

by James Bowen at April 03, 2023 02:30 PM

April 01, 2023

Michael Snoyman

The Opposite of Partial Functions

It would be fair, if cliche, to say "partial functions considered harmful." For those unfamiliar: partial functions are functions which are undefined for some valid inputs. A classic example would be Haskell's head function. It takes a list and gives you back the first item from the list. All well and good, but what should that function do in the case of an empty list?

Partial function enthusiasts would argue that head is fine as it is. It's the caller's responsibility to confirm that they don't pass in an empty list, e.g.:

myFunction :: IO () -- lol not a function
myFunction = do
    someList <- getListFromEther
    if null someList
        then putStrLn "The list is empty!"
        else putStrLn $ "First item on list: " ++ show (head someList)

Opponents of partial functions will argue that this unnecessarily introduces possibilities for failure. One solution is to change the output type of head to use an explicitly optional value using Maybe. Then we're forced to deal with the possibility of an empty list:

headMay :: [a] -> Maybe a -- implementation left as an exercise to the reader

myFunction = do
    someList <- getListFromEther
    case headMay someList of
        Nothing -> putStrLn "The list is empty!"
        Just x -> putStrLn $ "First item on list: " ++ show x

Another alternative is to change the input type to the function to make empty lists impossible. getListFromEther could return a NonEmptyList and we can provide a completely safe head :: NonEmptyList a -> a.

Both of these approaches are examples of converting a partial function into a total function. We do that here by modifying either the input (or domain for math nerds) or the output (or range for math nerds, or codomain for younger math nerds) of the function. Total functions say that for every possible value in the domain, the function will return a non-bottom value in the codomain.

Side note to build up to a larger point: I just said "non-bottom value." You might think that a "top value" would be the opposite of a "bottom value." But that's simply not true. You could have middle values, almost-bottom values, tippy-top values, pretty-high-but-not-quite-the-top, etc. So we need to talk about "bottom" and "non-bottom."

"But Michael," you say, "that's not at all what opposite means. You're talking about the negation of bottom, not its opposite. Top is literally the opposite of bottom." That may be true, but I'm redefining words my way. My blog, my dictionary, my rules.

Anyway, coming back to the original point... people often get confused and think the opposite of partial functions is total functions. But partial and total functions are more similar than different. Consider head: for all three possible implementations we described above, it will return the same value for the vast, vast majority of possible lists. The only time it returns a different result is in the one special case of empty lists.

Therefore, using exceptional[1] language skills and my patent-pending Snoyberg Dictionary, I would like to introduce a brand new concept that will revolutionize the programming world.

Impartial functions

When you say "I'm partial to pistachio ice cream," you mean two things:

  1. You have a tendency towards picking pistachio ice cream versus other flavors
  2. You have very bad taste

Partial functions say the same thing. A partial head function is partial to non-bottom values, treating each of them separately and doing real work with them. The true opposite of a partial function is an impartial function. A function that makes no distinction on different input values. Let's take a properly written headMay function (yes, I said it was an exercise for you, but you were too lazy to write it, so I had to do it for you):

headMay :: [a] -> Maybe a
headMay [] = Nothing
headMay (x:_) = Just x

Look at how partial that function is, only returning Nothing for empty list! Here is a true impartial implementation:

headMay :: [a] -> Maybe a
headMay _ = Nothing

We can do the same for the original head function by adding a typeclass constraint for the well regarded and highly recommended Default typeclass:

head :: Default a => [a] -> a
head _ = def

Constant functions

The astute reader--and I'm guessing you're astute--may have noticed that our current definition of impartial functions is actually the same as a constant function. One solution to this is simply to delete the dictionary definition for constant functions, which is actually my preference. But since the following code is so awesomely bad, let's go on.

I'm going to claim that constant functions are a strict subset of impartial functions. Impartial functions cannot be partial to different values. But they can be partial to non-values. Let's go back to the definition of total functions: for all well defined input (meaning non-bottom), the function must return a non-bottom value. A true impartial function can go in the same direction, and say that for all bottom input we provide non-bottom output, and for all non-bottom input, we provide bottom output.


OK, that was a bit convoluted, but I'm sure this code will clear up any confusion immediately:

{-# LANGUAGE ScopedTypeVariables #-}
import Control.DeepSeq (NFData, ($!!))
import Control.Exception (SomeException, try)
import Prelude hiding (head)
import System.IO.Unsafe (unsafePerformIO)

class Default a where
  def :: a
instance Default Int where
  def = 42 -- obviously

head :: forall a. (Default a, NFData a) => [a] -> a
head list =
  case unsafePerformIO $ try $ pure $!! list :: Either SomeException [a] of
    Left _ -> def
    Right _ -> error "Oops! Not bottom!"

main :: IO ()
main = do
  print $ head (undefined :: [Int])
  print $ head ([5] :: [Int])

This code is pretty straightforward and easy to follow, and hits all known Haskell best practices and code formatting recommendations, so we don't need to give it any more thought.

Final observation

What we just explored demonstrates my true point here. We know, because words, that partial and impartial functions must be the opposite of each other. And we also just demonstrated that impartial functions are literally the opposite of total functions. That means we have a double negation: !!partial == !impartial == total. And since !!x == x in all languages (especially Javascript), we now know that partial and total functions are exactly the same thing!

We've been living a lie for decades. Don't let the haters stop you. Partial functions are just as safe as total functions. They're the same thing. And when you're ready to up your game, embrace impartial functions.

[1] I know no one will believe me, but that's actually a very clever pun.

April 01, 2023 12:00 AM

March 30, 2023

Magnus Therning

More on tree-sitter and consult

Here's a few things that I've gotten help with figuring out during the last few days. Both things are related to my playing with tree-sitter that I've written about earlier, here and here.

You might also be interested in the two repositories where the full code is. (I've linked to the specific commits as of this writing.)

Anonymous nodes and matching in tree-sitter

In the grammar for Cabal I have a rule for sections that looks like this

sections: $ => repeat1(choice(

where each section followed this pattern

benchmark: $ => seq(
    'benchmark',
    field('name', $.section_name),
    field('properties', $.property_block),
),

This made it a little bit difficult to capture the relevant parts of each section to implement consult-cabal. I thought a pattern like this ought to work

  (_ _ @type
     name: (section_name)? @name)))

but it didn't; I got way too many things captured in type. Clearly I had misunderstood something about the wildcards, or the query syntax. I attempted to add a field name to the anonymous node, i.e. change the sections rules like this

benchmark: $ => seq(
    field('type', 'benchmark'),
    field('name', $.section_name),
    field('properties', $.property_block),
),

It was accepted by tree-sitter generate, but the field type was nowhere to be found in the parse tree.

Then I changed the query to list the anonymous nodes explicitly, like this

  (_ ["benchmark" "common" "executable" ...] @type
     name: (section_name)? @name)))

That worked, but listing all the sections like that in the query didn't sit right with me.

Luckily there's a discussions area in tree-sitter's GitHub, so a fairly short discussion later I had answers to why my query behaved like it did and a solution that would allow me to not list all the section types in the query. The trick is to wrap the string in a call to alias to make it a named node. After that it works to add a field name to it as well, of course. The section rules now look like this

benchmark: $ => seq(
    field('type', alias('benchmark', $.section_type)),
    field('name', $.section_name),
    field('properties', $.property_block),
),

and the final query looks like this

   type: (section_type) @type
   name: (section_name)? @name)))

With that in place I could improve on the function that collects all the items for consult-cabal so it now shows the section's type and name instead of the string representation of the tree-sitter node.

State in a consult source for preview of lines in a buffer

I was struggling with figuring out how to make a good state function in order to preview the items in consult-cabal. The GitHub repo for consult doesn't have discussions enabled, but after a discussion in an issue I'd arrived at a state function that works very well.

The state function makes use of functions in consult and looks like this

(defun consult-cabal--state ()
  "Create a state function for previewing sections."
  (let ((state (consult--jump-state)))
    (lambda (action cand)
      (when cand
        (let ((pos (get-text-property 0 'section-pos cand)))
          (funcall state action pos))))))

The trick here was to figure out how the function returned by consult--jump-state actually works. On the surface it looks like it takes an action and a candidate, (lambda (action cand) ...). However, the argument cand shouldn't be the currently selected item, but rather a position (ideally a marker), so I had to attach another text property on the items (section-pos, which is fetched in the inner lambda). This position is then what's passed to the function returned by consult--jump-state.

In hindsight it seems so easy, but I was struggling with this for an entire evening before finally asking the question the morning after.

March 30, 2023 03:37 PM

March 29, 2023


Calling Purgatory from Heaven: Binding to Rust in Haskell

Calling hell from heaven and heaven from hell is a classic paper from the previous century, introducing the Haskell foreign function interface (FFI). It describes the facilities that Haskell offers for calling functions written in C (and vice versa). In this blog post, we will consider how to call functions written in Rust instead: not quite hell, but not quite heaven either.

We will make use of two libraries that we wrote to streamline this process: a Haskell-side library called foreign-rust, and a Rust-side library called haskell-ffi. We developed these libraries as part of the development of Be, a (smart) contract platform; we are thankful to them for making these libraries open source. That said, this blog post should also be useful for people who do not want to use these libraries, and indeed, we will also show examples of interop that do not rely on them. The source code for the examples discussed in this blog post can be found at

Getting started

Binding to Rust functions from Haskell is not quite as convenient as binding to C functions. The common denominator between Rust and Haskell is C, and so we have to do two things: we have to write a Rust-side wrapper that exposes the functionality we want to bind against as C functions, and then write Haskell-side bindings against these C functions.

Our running example in this blog post will therefore consist of a Rust library which we will call rust-wrapper, and a Haskell library which we will call haskell-wrapper. To get us started, let’s see if we can pass two numbers from Haskell to Rust, add them Rust-side, and then print the result Haskell-side.


Create a new Rust crate for our new rust-wrapper library, and then add the following to the Cargo.toml file:

[dependencies]
haskell-ffi.git = ""
haskell-ffi.rev = "2bf292e2e56eac8e9fb0fb2e1450cf4a4bd01274"

[features]
capi = []

[package.metadata.capi.library]
versioning = false

After declaring the dependency on the haskell-ffi library, the features and package.metadata.capi.library sections are for the benefit of cargo cbuild; we will see momentarily how to use this tool. First, however, add this function to the library’s

#[no_mangle]
pub extern "C" fn rust_wrapper_add(x: u64, y: u64) -> u64 {
    x + y
}

The extern "C" directive tells the Rust compiler that this function should use the C calling convention. The no_mangle attribute ensures that the Rust compiler won’t change the name of our function to something unrecognizable, so that we know what the function is called in our Haskell bindings. This does mean that the function name should be unique across any C libraries that we might link against, which is why we will prefix the names of all external functions with rust_wrapper_. (See Calling Rust code from C from the Rustonomicon for more details.)

Finally, we need to configure cbindgen and tell it what kind of header file we want; we don’t need to call it manually (cargo cbuild will do that for us), but we do need to tell it that we want a C header file, not a C++ header file. Create a file called cbindgen.toml in the project root (alongside Cargo.toml) with the following three lines:

include_guard = "RUST_WRAPPER_H"
include_version = false
language = "C"

Now compile the library with

cargo cbuild

(you might need to install the cargo-c applet for cargo first.) This will create a bunch of files, but three are of particular interest:

  • target/<arch>/debug/ this is the shared object that our Haskell application will need to link against.

  • target/<arch>/debug/rust_wrapper.h: this is the C header file that we will need to compile our Haskell-side bindings. For our running example so far, this header will contain

    uint64_t rust_wrapper_add(uint64_t x, uint64_t y);
  • target/<arch>/debug/rust_wrapper-uninstalled.pc: this is a pkg-config file which contains the C compiler and linker flags (including paths) that we will need Haskell-side to know where the .so and .h files that we just described are.1

Finally, we will need to set two environment variables; the first will ensure that we can find the pkg-config file, and the second ensures that when we run our application (after building and linking it), the .so file can still be found:

export PKG_CONFIG_PATH=<path>/rust-wrapper/target/<arch>/debug
export LD_LIBRARY_PATH=<path>/rust-wrapper/target/<arch>/debug


On the Haskell side, create a new package, and then add this to the library section of the .cabal file:

build-depends: .., foreign-rust
build-tool-depends: c2hs:c2hs
pkgconfig-depends: rust_wrapper-uninstalled

The build-tool-depends line declares a dependency on c2hs, a preprocessor that we will use to write our bindings; the pkgconfig-depends line declares the dependency on the Rust library: cabal will use pkg-config to figure out which compiler and linker flags are required (thereby also figuring out where that Rust library is). While the foreign-rust library is not yet released to Hackage, we’ll need to add the repo to our cabal.project file:

source-repository-package
  type: git
  tag: 90b1c210ae4e753c39481a5f3b141b74e6b6d96e

For this simple example we don’t benefit much from c2hs, but will nonetheless use it to bind to our add function, to give us a chance to introduce its basic syntax. (For a detailed discussion of the syntax of c2hs, see the c2hs User Guide.)

module C.GettingStarted (rustWrapperAdd) where

#include "rust_wrapper.h"

import Data.Word

{# fun pure unsafe rust_wrapper_add as rustWrapperAdd
     { `Word64'
     , `Word64'
     }
  -> `Word64'
#}

This declares a function which

  • is called rust_wrapper_add C-side, but should be called rustWrapperAdd Haskell-side

  • has two arguments, both of type Word64

  • has a result also of type Word64

  • is pure: the signature of the function should be

    rustWrapperAdd :: Word64 -> Word64 -> Word64

    rather than

    rustWrapperAdd :: Word64 -> Word64 -> IO Word64 -- unnecessary

    (because calling the function twice with the same inputs will give the same results)

  • uses an unsafe call: unsafe calls can be used for foreign functions that do not call back into the Haskell runtime; this gives the RTS some guarantees which it can take advantage of to make the foreign call more efficient (see also Guaranteed call safety in the ghc manual)

This should be sufficient; if we now start a repl (cabal repl) and import our module, we should be able to test our function:

ghci> rustWrapperAdd 2 3
5

Marshalling data

It is easy enough to ferry individual Word64 over and back, but Haskell and Rust are both languages with rich type systems. If we want to transfer more complex values across the language barrier, we have two choices: we can either serialize and deserialize, or we can use pointers to the data. The first option is the easier and less fragile, as it avoids Haskell-side managing of values that live on the Rust-side heap; it is this approach that the haskell-ffi and foreign-rust libraries aim to streamline. We will consider the second option in section Avoiding serialization.

For a more realistic example, therefore, we will consider binding against a Rust function that constructs a self-signed x509 certificate with corresponding secret key, given a list of domain names:

fn generate_simple_self_signed(alt_names: Vec<String>) -> (Certificate, SecretKey)

Interlude: (no) orphans in Rust

Although of course the precise definitions differ, the basic concept of an orphan instance is similar in Haskell and in Rust. An orphan instance is an instance of some type class (or trait) C for some type T

instance C T where ..

or

impl C for T { .. }

which is not “bundled with” either the definition of C or the definition of T (where “bundled with” means “same module” in Haskell, and “same package” in Rust). Avoiding orphan instances ensures coherence: it can never happen that we get two different instances in scope (for the same C and T) when we import two different modules.

However, where the introduction of an orphan instance in Haskell is merely a compiler warning, which we can choose to ignore (thereby taking on the responsibility ourselves to ensure coherence), in Rust it is an error: we really cannot introduce an orphan instance.

This can be quite a serious limitation. For example, suppose we want to have a (Rust side) type class for “things we can marshall to Haskell.” If this type class is defined in an external package, and we want to marshall a type defined in a different package, then unless there is an instance already defined in either of these two packages, we are stuck: we cannot provide an instance ourselves (since it would be an orphan). The haskell-ffi library therefore adopts a workaround, which we will discuss now.

Marshalling in haskell-ffi

Central to the haskell-ffi library is the definition of two traits (type classes), for data that can be marshalled to and from Haskell respectively:

use std::io::Write;
use std::marker::PhantomData;

pub type Error     = Box<dyn std::error::Error + Send + Sync>;
pub type Result<T> = core::result::Result<T, Error>;

pub trait ToHaskell<Tag> {
    fn to_haskell<W: Write>(&self, writer: &mut W, tag: PhantomData<Tag>) -> Result<()>;
}

pub trait FromHaskell<Tag>: Sized {
    fn from_haskell(buf: &mut &[u8], tag: PhantomData<Tag>) -> Result<Self>;
}

These are similar to the BorshSerialize and BorshDeserialize traits from the borsh crate (package), and indeed, ToHaskell and FromHaskell have all the standard instances that make it compatible with the Borsh binary serialization format.

The definition of these two traits might look a bit obscure to a Haskell programmer; a rough Haskell translation might be

class ToHaskell tag a where -- illustration only
  toHaskell :: forall w. Write w => a -> IORef w -> Proxy tag -> IO ()

class Sized a => FromHaskell tag a where -- illustration only
  fromHaskell :: IORef (Vector Word8) -> Proxy tag -> IO a

Some points:

  • The definition of Error is describing a boxed value of existential type, which is required to satisfy a few instances (aka implement a few traits), most notably std::error::Error; this corresponds nearly 1:1 with SomeException in Haskell:

    data SomeException = forall e. Exception e => SomeException e
  • W: Write is known as a trait bound in Rust, and corresponds to ad-hoc polymorphism in Haskell.

  • Rust does not really have multi-parameter type classes; the additional Tag parameter is an example of what is (confusingly) called generics in Rust, and corresponds roughly to parametric polymorphism in Haskell (although the two concepts don’t align perfectly).

  • PhantomData in Rust, like Proxy in Haskell, serves only as a hint to the type checker: here, to determine the type Tag.
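To make the Rust-side ingredients above a little more concrete, here is a small self-contained sketch that combines the same Error and Result aliases with a W: Write trait bound and the buf: &mut &[u8] cursor style used by from_haskell. The helpers write_u32_le and read_u32_le are hypothetical names invented for this illustration; they are not part of haskell-ffi, and the encoding is simply little-endian bytes.

```rust
use std::io::Write;

// Same aliases as in the text: any error type can be boxed into `Error`.
pub type Error = Box<dyn std::error::Error + Send + Sync>;
pub type Result<T> = std::result::Result<T, Error>;

/// Hypothetical helper: write a u32 in little-endian form to any writer.
/// `W: Write` is the trait bound; `?` boxes a possible `io::Error` into `Error`.
pub fn write_u32_le<W: Write>(value: u32, writer: &mut W) -> Result<()> {
    writer.write_all(&value.to_le_bytes())?;
    Ok(())
}

/// Hypothetical helper: read a u32 from the front of `buf` and advance `buf`
/// past the consumed bytes, so the caller can keep reading from the rest.
pub fn read_u32_le(buf: &mut &[u8]) -> Result<u32> {
    if buf.len() < 4 {
        // A string literal converts into the boxed error type via `into`.
        return Err("unexpected end of input".into());
    }
    let (head, rest) = buf.split_at(4);
    let value = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    *buf = rest;
    Ok(value)
}
```

Writing into a Vec<u8> and then reading back from a slice of it round-trips the value and leaves the slice empty, which is the same "consume from the front" behaviour that the &mut &[u8] argument of from_haskell suggests.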

The Tag argument allows us to work around the no-orphans limitation. The haskell-ffi library can introduce instances that are polymorphic in a choice of tag, such as

impl<Tag, T: ToHaskell<Tag>> ToHaskell<Tag> for Option<T>

corresponding to

instance ToHaskell tag t => ToHaskell tag (Maybe t) -- illustration only

but, as we shall see momentarily, we can also introduce additional instances in other libraries (such as our rust-wrapper library), provided that we choose a specific tag.
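The shape of this workaround can be sketched without the real library. The trait ToBytes below is a simplified, hypothetical stand-in for ToHaskell, and MyTag and encode_with_my_tag are names made up for the example. Within a single file the orphan rule never comes into play, so this only illustrates the structure: imagine the trait and the Option instance living in an upstream crate, and everything from MyTag downwards living in ours. As far as I understand Rust's orphan rules, the u32 instance would be accepted in the downstream crate because the locally defined tag appears as a type parameter of the trait.

```rust
use std::marker::PhantomData;

// "Upstream" part: a simplified stand-in for a tag-indexed marshalling trait.
pub trait ToBytes<Tag> {
    fn to_bytes(&self, out: &mut Vec<u8>, tag: PhantomData<Tag>);
}

// "Upstream" instance that is polymorphic in the tag: it works for any tag
// for which the payload type has an instance.
impl<Tag, T: ToBytes<Tag>> ToBytes<Tag> for Option<T> {
    fn to_bytes(&self, out: &mut Vec<u8>, tag: PhantomData<Tag>) {
        match self {
            None => out.push(0),
            Some(x) => {
                out.push(1);
                x.to_bytes(out, tag);
            }
        }
    }
}

// "Downstream" part: an empty type that only serves as a type-level tag.
pub enum MyTag {}

// An instance for a type we do not own (u32), but only for our own tag.
impl ToBytes<MyTag> for u32 {
    fn to_bytes(&self, out: &mut Vec<u8>, _tag: PhantomData<MyTag>) {
        out.extend_from_slice(&self.to_le_bytes());
    }
}

/// Convenience wrapper that fixes the tag to `MyTag`.
pub fn encode_with_my_tag<T: ToBytes<MyTag>>(value: &T) -> Vec<u8> {
    let mut out = Vec::new();
    value.to_bytes(&mut out, PhantomData);
    out
}
```

Note how the tag-polymorphic Option instance composes with the tag-specific u32 instance: encoding Some(1u32) produces the presence byte followed by the four little-endian bytes of the number.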

Example: Rust

Let’s now get back to our example. Recall that we want to bind to a Rust function with this signature:

fn generate_simple_self_signed(alt_names: Vec<String>) -> (Certificate, SecretKey)

To do that, we need to write a wrapper function that we expose as a C function. The wrapper will have two arguments for each argument of the function we are wrapping, as well as two arguments for the result:

#[no_mangle]
pub extern "C" fn rust_wrapper_generate_simple_self_signed(
    alt_names: *const u8,
    alt_names_len: usize,
    out: *mut u8,
    out_len: &mut usize,
) {
    // body elided for now
}

For each argument of the original function, we have a pair of C arguments: the first is a pointer to a buffer containing a serialized form of the argument, and the second is the length of that buffer. For the result of the original function we likewise have a pair of C arguments: the first points to a buffer that the result will be serialized to, and the second must initially contain the size of that buffer, and is overwritten to contain the required size of the buffer when the function returns (so that the caller can try again if the buffer is too small; see Using the C function, below).
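The output half of this protocol can be sketched with the standard library alone. The code below is not how haskell-ffi or foreign-rust implement it; write_result and call_with_retry are hypothetical names, and the "serialized result" is just a byte slice. The first function plays the role of the Rust side: it copies the data only if the caller's buffer is large enough, and always reports the required size. The second plays the role of the caller, retrying once with a buffer of the reported size.

```rust
use std::ptr;

/// Callee side (hypothetical): copy `data` into the caller's buffer if it
/// fits, and always overwrite `out_len` with the number of bytes required.
///
/// # Safety
/// `out` must be valid for writes of `*out_len` bytes.
pub unsafe fn write_result(data: &[u8], out: *mut u8, out_len: &mut usize) {
    if data.len() <= *out_len {
        // SAFETY: the caller promises `out` has room for `*out_len` bytes,
        // and we have just checked that `data` is no longer than that.
        unsafe { ptr::copy_nonoverlapping(data.as_ptr(), out, data.len()) };
    }
    *out_len = data.len();
}

/// Caller side (hypothetical): try with an initial buffer and, if the callee
/// reports that more space is needed, try once more with the right size.
pub fn call_with_retry(data: &[u8], initial_capacity: usize) -> Vec<u8> {
    let mut buf = vec![0u8; initial_capacity];
    let mut len = buf.len();
    unsafe { write_result(data, buf.as_mut_ptr(), &mut len) };

    if len > buf.len() {
        // The buffer was too small: `len` now holds the required size.
        buf = vec![0u8; len];
        unsafe { write_result(data, buf.as_mut_ptr(), &mut len) };
    }

    // Keep only the bytes that were actually written.
    buf.truncate(len);
    buf
}
```

One design consequence is worth noting: if the buffer is too small, the wrapped function has already run once, so a retry runs it again. For a function that produces a fresh value on every call, the second call may return a different result than the first would have.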

Before we can write the body of the wrapper, we need to choose a Tag to use (see previous section):

pub enum RW {}
pub const RW: PhantomData<RW> = PhantomData;

RW (for rust-wrapper) is just an empty type; it will only serve as a type-level tag. The body of our wrapper function is now simple:

#[no_mangle]
pub extern "C" fn rust_wrapper_generate_simple_self_signed(
    alt_names: *const u8,
    alt_names_len: usize,
    out: *mut u8,
    out_len: &mut usize,
) {
    let alt_names: Vec<String> = marshall_from_haskell_var(alt_names, alt_names_len, RW);
    let result = generate_simple_self_signed(alt_names);
    marshall_to_haskell_var(&result, out, out_len, RW);
}

We will discuss the _var suffix in the section on bounded size data.

Example: Haskell

On the Haskell side, we first have to decide what we want to do with the serialized data we get from Rust. We can try to deserialize it, or we could just leave it in serialized form, relying on Rust-side functions to interact with it. The foreign-rust library helps us with deserialization if we choose to do so, and provides tools for working with serialized data if we choose not to.

For our example it is simplest to just leave the data in serialized form:

newtype Certificate = Certificate Strict.ByteString
  deriving newtype (BorshSize, ToBorsh, FromBorsh)
  deriving newtype (IsRaw)
  deriving (Show, Data.Structured.Show, IsString) via AsBase64 Certificate

newtype SecretKey = SecretKey (FixedSizeArray 32 Word8)
  deriving newtype (BorshSize, BorshMaxSize, ToBorsh, FromBorsh)
  deriving newtype (IsRaw)
  deriving (Show, Data.Structured.Show, IsString) via AsBase64 SecretKey

Some comments:

  • BorshSize, ToBorsh, FromBorsh and BorshMaxSize come from the Haskell borsh library. We will see the use of BorshSize and BorshMaxSize when we discuss bounded size data.

  • IsRaw is a type class from the foreign-rust library capturing “raw” values: values that are essentially just bytestrings:

    class IsRaw a where
      rawSize :: a -> Word32
      toRaw   :: a -> Lazy.ByteString
      fromRaw :: Lazy.ByteString -> Either String a
  • FixedSizeArray is a datatype from the Haskell borsh package which corresponds to bytestrings of a known, static, length; FixedSizeArray 32 Word8 is a precise analogue of [u8; 32] in Rust.

  • Data.Structured.Show, from foreign-rust, is like Show from the prelude, but producing a structured value, which can be pretty-printed a bit nicer. It’s similar to the PrettyVal class from the pretty-show package, but unlike PrettyVal (and like Show from the prelude), its pretty-printed values are valid Haskell.

  • Finally, AsBase64 is a newtype that can be used to conveniently derive Show, Data.Structured.Show and IsString instances, all using a base-64 encoding. Similarly, foreign-rust also provides AsBase16, AsBase58, and AsDecimal (list of decimal values).

With our datatypes defined, we can now bind our function:

{# fun unsafe rust_wrapper_generate_simple_self_signed as rustWrapperSelfSigned
     { toBorshVar*  `[Text]'&
     , getVarBuffer `Buffer (Certificate, SecretKey)'&
     }
  -> `()'
#}

This is not really any more difficult than the function which just passed numbers around: c2hs provides explicit support for arguments that correspond to a single argument Haskell-side and two arguments C-side (that’s what the ampersand & means), and it allows us to define specific marshalling functions; we use toBorshVar and getVarBuffer, both from foreign-rust. The syntax is a bit arcane, but the good news is that all functions you wrap will look very similar.

Using the C function

The signature of the Haskell function that c2hs made for us is not quite as convenient as we might like:

rustWrapperSelfSigned :: [Text] -> Buffer (Certificate, SecretKey) -> IO ()

When we discussed the Rust-side function, we mentioned that it expects a buffer to write its output to, along with the size of that buffer. Since we are trying to avoid managing memory allocated Rust-side in Haskell, or vice versa, we will create that buffer Haskell side; but what size buffer should we allocate? The generated function just punts on this question, and doesn’t allocate a buffer at all.

But not to worry, foreign-rust has us covered. The main function it provides here is withBorshVarBuffer:

withBorshVarBuffer :: (FromBorsh a, ..) => (Buffer a -> IO ()) -> IO a

It will allocate a 1kB buffer and then call the function; if it turns out this buffer is not large enough, the Rust-side function will tell it what the right size buffer is, and so it will just try once more with a larger buffer. We can use this to provide a selfSigned function with the signature we’d expect:

selfSigned :: [Text] -> IO (Certificate, SecretKey)
selfSigned = withBorshVarBuffer . rustWrapperSelfSigned

We can try this now in ghci:

ghci> selfSigned [""]

Working with foreign values

We now have a Haskell side representation of the Rust Certificate type but we can’t yet do much with it; in this section we will therefore explore this a bit more.

Binding another function

Let’s bind another function, which returns the certificate’s “subject.” Rust-side, we can define

#[no_mangle]
pub extern "C" fn rust_wrapper_get_certificate_subject(
    cert: *const u8,
    cert_len: usize,
    out: *mut u8,
    out_len: &mut usize,
) {
    let cert: Certificate = marshall_from_haskell_var(cert, cert_len, RW);
    let result = format!("{}", cert.tbs_certificate.subject);
    marshall_to_haskell_var(&result, out, out_len, RW);
}

This function has exactly the same shape as the previous one we wrote (indeed, an important goal of the haskell-ffi/foreign-rust library pair is precisely to make this kind of work as “boring” as possible). The c2hs declaration Haskell-side is also very similar:

{# fun unsafe rust_wrapper_get_certificate_subject as rustWrapperCertificateSubject
     { toBorshVar*  `Certificate'&
     , getVarBuffer `Buffer Text'&
     }
  -> `()'
#}

Unlike selfSigned, however, which really must live in IO (each time the function is called, it produces a different certificate), this function is morally pure:

certificateSubject :: Certificate -> Text
certificateSubject = withPureBorshVarBuffer . rustWrapperCertificateSubject

We can try it out:

ghci> (cert, pkey) <- selfSigned [""]
ghci> certificateSubject cert
"CN=rcgen self signed cert"

Using the IsString instance

We mentioned above that foreign-rust introduces Data.Structured.Show, to replace PrettyVal from pretty-show, in order to ensure that pretty-printed values are still valid Haskell. Indeed, we derived IsString for Certificate above, which means that we can denote Rust-side values in our Haskell code:

ghci> certificateSubject "MIIB..uZ04" -- same string that we got above
"CN=rcgen self signed cert"

This can be very useful when experimenting, in (regression) tests, etc.


It’s nice that we can show and even denote Rust-side values in Haskell, but a long base-64 string is not the most informative. To make things like debugging a little easier, foreign-rust therefore provides a way to annotate values:

class CanAnnotate a where
  type Annotated a :: Type
  annotate       :: a -> Annotated a
  dropAnnotation :: Annotated a -> a

In many (but by no means all) cases, an annotated form of a value just pairs that value with some additional value; we can use this for Certificate:

deriving
  via PairWithAnnotation Certificate
  instance CanAnnotate Certificate

type instance Annotation Certificate = Text

instance ComputeAnnotation Certificate where
  computeAnnotation = certificateSubject

Trying it out:

ghci> (cert, pkey) <- selfSigned [""]
ghci> annotate cert
WithAnnotation {value = "MIIB..uZ04", annotation = "CN=rcgen self signed cert"}

or we can use Data.Structured to make this a little cleaner:

ghci> Data.Structured.print $ annotate cert
WithAnnotation {
  value = "MIIB..uZ04"
, annotation = "CN=rcgen self signed cert"
}

Annotations are designed to be “transitive” (and there is support for generically deriving CanAnnotate for your own types if you just want to transitively get all annotations). As a simple example, here’s what we get if we annotate something of type [Maybe Certificate]:

ghci> Data.Structured.print $ annotate [Just cert]
WithAnnotation {
  value = [
      Just
        WithAnnotation {
          value = "MIIB..uZ04"
        , annotation = "CN=rcgen self signed cert"
        }
    ]
, annotation = Length 1
}

This can be very helpful during debugging (there is also dropAnnotation which goes in the opposite direction).

Fixed size data

When we discussed binding rust_wrapper_self_signed, we said that withBorshVarBuffer will allocate an initial buffer of a certain size, then call the Rust function, hoping the buffer will be big enough, and then call the Rust function a second time if it turns out the buffer was too small after all.

If we know ahead of time how big the value will be, however, we can do better. For example, we know that the size of (this type of) a secret key is always 32 bytes; indeed, we said so right in the type:

newtype SecretKey = SecretKey (FixedSizeArray 32 Word8)


Let’s consider binding to a Rust function which constructs an example key, generated from a PRNG with specified seed:

pub extern "C" fn rust_wrapper_example_key(seed: u64, out: *mut u8, out_len: usize) {
    let mut prng: StdRng = StdRng::seed_from_u64(seed);
    let result: SecretKey = SecretKey::random(&mut prng);
    marshall_to_haskell_fixed(&result, out, out_len, RW);
}

The seed is a simple C value so no need for any serialization. This is not true for the result of the function, but unlike before, the size of the output is statically known. This means we can use marshall_to_haskell_fixed Rust-side, instead of marshall_to_haskell_var; usage is almost identical, except that the out_len is now a simple usize, rather than a pointer to a usize: the haskell-ffi Rust code will verify the size of the buffer allocated Haskell-side, and panic if it’s not of the right size (this would be a bug), but there is no need for it to communicate a new size back to the Haskell code.

This depends on an additional trait which provides the size:

pub trait HaskellSize<Tag> {
    fn haskell_size(tag: PhantomData<Tag>) -> usize;
}

This trait comes with all the instances we’d expect for the Borsh serialization format; for example, we have

impl<Tag, T: HaskellSize<Tag>, const N: usize> HaskellSize<Tag> for [T; N] {
    fn haskell_size(tag: PhantomData<Tag>) -> usize {
        T::haskell_size(tag) * N
    }
}

(This instance uses what is known as “const generics” in Rust parlance; in the Haskell world that const N: usize parameter would correspond to a KnownNat n constraint.) There is also a macro available you can use to derive HaskellSize for your own structs (enums do not have a statically known size).


The c2hs declaration of this function looks like this:

{# fun pure unsafe rust_wrapper_example_key as exampleKey
     {                   `Word64'
     , allocFixedBuffer- `SecretKey'& fromBorsh*
     }
  -> `()'
#}

Since we are sure we only need to call the Rust function once (with an appropriately sized buffer), we can do everything right within the c2hs incantation: allocFixedBuffer will allocate the appropriate buffer before calling the function, and fromBorsh will get the value from the buffer afterwards. Moreover, since this function is now morally pure, we can use the c2hs keyword for pure functions, and the signature of the function constructed by c2hs is exactly what we’d expect, with no further wrapping required:

exampleKey :: Word64 -> SecretKey
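The size bookkeeping described in this section can be sketched without any dependencies. This is an illustration only: `FixedSize` and `write_fixed` are hypothetical stand-ins (the real HaskellSize trait also takes a Tag parameter), and the sizes assume a Borsh-style encoding in which fixed-size arrays carry no length prefix.

```rust
/// Hypothetical stand-in for a trait like HaskellSize, without the Tag parameter.
pub trait FixedSize {
    fn size() -> usize;
}

impl FixedSize for u8 {
    fn size() -> usize {
        1
    }
}

impl FixedSize for u64 {
    fn size() -> usize {
        8
    }
}

/// Arrays: element size times the compile-time length N (no length prefix assumed).
impl<T: FixedSize, const N: usize> FixedSize for [T; N] {
    fn size() -> usize {
        T::size() * N
    }
}

/// Copy a fixed-size value into a caller-allocated buffer.
/// A wrong buffer size is a bug in the caller, so we panic instead of reporting a new size.
pub fn write_fixed<const N: usize>(value: &[u8; N], out: &mut [u8]) {
    assert_eq!(out.len(), <[u8; N] as FixedSize>::size(), "buffer has the wrong size");
    out.copy_from_slice(value);
}
```

Because the size is known before the call, the caller can allocate exactly once, and the callee only has to check the length rather than negotiate it.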

Bounded size data

For the case where there is no fixed size, but there is a bound on the size, we have marshall_to_haskell_max on the Rust side (depending on a HaskellMaxSize trait) and allocMaxBuffer on the Haskell side (depending on a BorshMaxSize class). The most important use case for this is Rust’s Option type, corresponding to Maybe in Haskell. For example, here is a Rust function which deserializes a secret key in PEM format:

pub extern "C" fn rust_wrapper_key_from_pem(
    key: *const u8,
    key_len: usize,
    out: *mut u8,
    out_len: usize,
) {
    let key: String = marshall_from_haskell_var(key, key_len, RW);
    let result: Option<SecretKey> = match elliptic_curve::SecretKey::from_sec1_pem(&key) {
        Ok(key) => Some(key),
        Err(elliptic_curve::Error) => None,
    };
    marshall_to_haskell_max(&result, out, out_len, RW);
}

and here is the corresponding c2hs binding:

{# fun pure unsafe rust_wrapper_key_from_pem as fromPem
     { toBorshVar*     `Text'&
     , allocMaxBuffer- `Maybe SecretKey'& fromBorsh*
     }
  -> `()'
#}

As before, no additional wrapping is necessary Haskell-side; the signature of the function constructed by c2hs is

fromPem :: Text -> Maybe SecretKey
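To see why an optional value has a bound but no fixed size, here is a small dependency-free sketch. `MaxSize` and `encode_option` are hypothetical names, not the haskell-ffi API, and the layout (one tag byte, followed by the payload only in the `Some` case) is the Borsh-style encoding assumed here.

```rust
/// Hypothetical upper bound on the serialized size of a type.
pub trait MaxSize {
    fn max_size() -> usize;
}

impl<const N: usize> MaxSize for [u8; N] {
    fn max_size() -> usize {
        N
    }
}

/// An optional value needs at most one tag byte plus the payload.
impl<T: MaxSize> MaxSize for Option<T> {
    fn max_size() -> usize {
        1 + T::max_size()
    }
}

/// Encode an optional byte array: tag 0 for None, tag 1 followed by the bytes for Some.
/// Returns the number of bytes actually used, which may be less than the bound.
pub fn encode_option<const N: usize>(value: &Option<[u8; N]>, out: &mut [u8]) -> usize {
    assert!(
        out.len() >= <Option<[u8; N]> as MaxSize>::max_size(),
        "buffer is smaller than the bound"
    );
    match value {
        None => {
            out[0] = 0;
            1
        }
        Some(bytes) => {
            out[0] = 1;
            out[1..=N].copy_from_slice(bytes);
            1 + N
        }
    }
}
```

A buffer of `max_size()` bytes is always sufficient, so a single call is enough even though `None` uses only one byte of it.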

Composite types

Suppose we have this datatype Rust-side:

#[derive(serde::Serialize, serde::Deserialize, BorshSerialize, BorshDeserialize, HaskellSize)]
pub struct Color {
    r: f64,
    g: f64,
    b: f64,
}

We can piggyback on the BorshSerialize and BorshDeserialize instances derived by macros from the borsh crate to define our FromHaskell and ToHaskell instances:

impl<Tag> ToHaskell<Tag> for Color {
    fn to_haskell<W: Write>(&self, writer: &mut W, _tag: PhantomData<Tag>) -> Result<()> {
        self.serialize(writer)?;
        Ok(())
    }
}

impl<Tag> FromHaskell<Tag> for Color {
    fn from_haskell(buf: &mut &[u8], _tag: PhantomData<Tag>) -> Result<Self> {
        let x = Color::deserialize(buf)?;
        Ok(x)
    }
}

Here is a simple function that constructs the “red” color:

pub extern "C" fn rust_wrapper_red(out: *mut u8, out_len: usize) {
    let result = Color {
        r: 1.0,
        g: 0.0,
        b: 0.0,
    };
    marshall_to_haskell_fixed(&result, out, out_len, RW);
}

We now have two choices for how we represent this datatype Haskell-side: we can represent it by a proper Haskell value, or we can leave the Haskell-side representation opaque. We will consider these separately.

Structured Haskell representation

The cleanest representation of this datatype is of course the corresponding Haskell datatype

data Color = Color { r :: Double, g :: Double, b :: Double }
  deriving stock (Show, GHC.Generic)
  deriving anyclass (SOP.Generic, SOP.HasDatatypeInfo)
  deriving anyclass (Data.Structured.Show)
  deriving CanAnnotate via NoAnnotation Color
  deriving (BorshSize, ToBorsh, FromBorsh) via AsStruct Color

After we have derived the necessary instances, interacting with Rust is trivial; for example, here’s how we can bind the red function:

{# fun pure unsafe rust_wrapper_red as red
     { allocFixedBuffer- `Color'& fromBorsh*
     }
  -> `()'
#}

Nothing else to do.

ghci> red
Color {r = 1.0, g = 0.0, b = 0.0}

Opaque Haskell representation

However, in some cases we don’t want to try and parse the value Haskell-side; perhaps it’s just unnecessarily difficult, or perhaps we consider the exact serialized form of the Rust value an implementation detail of the Rust code. Perhaps we don’t even have a Haskell-side representation, and all we have is a pointer to a value on the Rust heap (see section Avoiding serialization).

For example, we might represent Color as²

newtype Color = Color (FixedSizeArray 24 Word8)
  deriving newtype (BorshSize, ToBorsh, FromBorsh)
  deriving newtype (IsRaw)

Even with this representation, however, we might still prefer a more informative Show instance. The final trick that foreign-rust has up its sleeve is support for “Rust side JSON serialization/deserialization.” This works as follows. First, we define functions Rust-side that convert a value to and from JSON. In most cases, this is easy, because we can derive serde Serialize and Deserialize instances, and then use serde_json:³

pub extern "C" fn rust_wrapper_color_from_json(
    json: *const u8,
    json_len: usize,
    out: *mut u8,
    out_len: &mut usize,
) {
    let json: String = marshall_from_haskell_var(json, json_len, RW);
    let result: core::result::Result<Color, serde_json::Error> = serde_json::from_str(&json);
    marshall_result_to_haskell_var(&result, out, out_len, RW);
}

pub extern "C" fn rust_wrapper_color_to_json(
    color: *const u8,
    color_len: usize,
    out: *mut u8,
    out_len: &mut usize,
) {
    let color: Color = marshall_from_haskell_var(color, color_len, RW);
    let json: String = serde_json::to_string(&color).unwrap();
    marshall_to_haskell_var(&json, out, out_len, RW);
}

Binding to these functions follows the now-familiar pattern:

{# fun unsafe rust_wrapper_color_from_json as rustWrapperColorFromJSON
     { toBorshVar*  `Text'&
     , getVarBuffer `Buffer (Either Text Color)'&
     }
  -> `()'
#}

{# fun unsafe rust_wrapper_color_to_json as rustWrapperColorToJSON
     { toBorshFixed* `Color'&
     , getVarBuffer  `Buffer Text'&
     }
  -> `()'
#}

with corresponding wrappers:

colorToJSON :: Color -> JSON
colorToJSON = withPureBorshVarBuffer . rustWrapperColorToJSON

colorFromJSON :: HasCallStack => JSON -> Either Failure Color
colorFromJSON = first mkFailure . withPureBorshVarBuffer . rustWrapperColorFromJSON

where JSON is a newtype wrapper around a lazy bytestring, and Failure pairs a Text error message with a CallStack. With these functions defined, we can provide the ToJSON and FromJSON instances from Foreign.Rust.External.JSON:

instance External.ToJSON Color where
  toJSON = colorToJSON

instance External.FromJSON Color where
  fromJSON = colorFromJSON

This gives us two things: we can, if we wish, derive standard aeson FromJSON and ToJSON instances; but we can also use JSON in our Show instance:

deriving via UseExternalJSON Color instance Aeson.ToJSON   Color
deriving via UseExternalJSON Color instance Aeson.FromJSON Color

deriving via AsJSON Color instance Show                 Color
deriving via AsJSON Color instance Data.Structured.Show Color

Trying it out:

ghci> red
asJSON @Color
  [aesonQQ|
    {
      "b": 0.0
    , "g": 0.0
    , "r": 1.0
    }
  |]

As mentioned above, the library always attempts to ensure that Show produces valid Haskell expressions. If we are using JSON, it does this by using the aesonQQ quasi-quoter, along with a wrapper and a type annotation, to avoid ambiguous type errors.

Avoiding serialization

Finally, we will consider when we don’t want to use serialization to transfer values between Haskell and Rust (because it’s too expensive), or we can’t use it (perhaps it’s a value that cannot be serialized). For example, suppose we have this Rust-side type of handles:

pub struct Handle(usize);
impl Drop for Handle
pub fn new_handle() -> Handle

We can expose C functions in our Rust code that allocate a handle, query a handle’s ID, and free a handle:

pub extern "C" fn rust_wrapper_new_handle() -> *mut Handle {
    let handle: Box<Handle> = Box::new(new_handle());
    // Hand ownership to the caller as a raw pointer
    Box::into_raw(handle)
}

pub extern "C" fn rust_wrapper_handle_id(handle: *mut Handle) -> usize {
    let handle: &Handle = unsafe { &*handle };
    handle.0
}

pub extern "C" fn rust_wrapper_free_handle(handle: *mut Handle) {
    // Reclaim ownership; the handle is dropped when `_handle` goes out of scope
    let _handle: Box<Handle> = unsafe { Box::from_raw(handle) };
}

On the Haskell side, we can use c2hs to create a newtype around a pointer to a handle, expose C functions that allocate a handle and query its ID, and use rust_wrapper_free_handle as the finalizer (called by the garbage collector):

{#pointer *Handle foreign finalizer rust_wrapper_free_handle newtype #}

{# fun unsafe rust_wrapper_new_handle as newHandle
      { }
   -> `Handle'
#}

{# fun pure unsafe rust_wrapper_handle_id as handleId
      { `Handle'
      }
   -> `Word64'
#}

The signatures generated by c2hs are

newtype Handle = Handle (ForeignPtr Handle)

newHandle :: IO Handle
handleId  :: Handle -> Word64
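The ownership hand-off that makes this work can be checked in plain Rust, independently of Haskell. This is a sketch under assumptions: the `DROPPED` counter and the function names are hypothetical and exist only to make the single `drop` observable; in the real setup the free function is called by the Haskell garbage collector as a finalizer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many handles have been dropped (for illustration only).
pub static DROPPED: AtomicUsize = AtomicUsize::new(0);

pub struct Handle(usize);

impl Drop for Handle {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Move the handle to the heap and give up ownership: nothing is dropped here.
pub fn new_handle_ptr(id: usize) -> *mut Handle {
    Box::into_raw(Box::new(Handle(id)))
}

/// Borrow through the pointer without taking ownership.
pub fn handle_id(ptr: *mut Handle) -> usize {
    let handle: &Handle = unsafe { &*ptr };
    handle.0
}

/// Reclaim ownership; the Box (and the Handle inside it) is dropped here.
pub fn free_handle(ptr: *mut Handle) {
    drop(unsafe { Box::from_raw(ptr) });
}
```

Querying the ID never runs the destructor; only the free function does, and it must be called exactly once per pointer.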

The biggest drawback of this approach is that we now no longer have any representation of these values Haskell-side; we cannot provide a “legal” Show instance. This can be quite inconvenient, especially in tests. Managing values on the Rust heap through the Haskell GC (even if we are using the Rust-side deallocator) is also simply more error-prone, and if things go wrong, hard to debug. It is probably only the better choice if serialization is either impossible or prohibitively expensive.


The design of haskell-ffi and foreign-rust is optimized for ease of integration, not necessarily for optimal performance. This is almost certainly fine for most applications, but you probably don’t want the overhead of serialization and deserialization when doing FFI calls to Rust in a tight Haskell loop (of course, it is almost never a good idea to do that anyway).

Alongside withBorshVarBuffer, foreign-rust offers withBorshBufferOfInitSize, which lets you specify a different initial buffer size and so reduce the probability that a second round-trip is necessary (in the case that the initial buffer was not big enough). In principle, you could use this in conjunction with a Rust-side function that computes the required buffer size, but there isn’t much point: this would still require two FFI calls with the same parameters. If there is a cheap way Rust-side to compute the necessary buffer size, then that behaviour should just be baked into the one Rust function: check if the allocated buffer is big enough before doing anything else. The standard marshalling functions offered by haskell-ffi do not do this, since in general it is difficult to know exactly how large the serialized form of some data is without actually serializing it.

In the case where a Rust function must really only be called once (perhaps because it has side effects), you can choose to forgo serialization altogether, as we described above in Avoiding serialization. Alternatively, foreign-rust offers a hybrid approach, where we allocate a buffer Rust-side, pass a pointer to the buffer to Haskell, deserialize it Haskell-side, and then free the buffer when no longer required. For our example where we convert a secret key to PEM, the Rust-side wrapper would look like this:

pub extern "C" fn rust_wrapper_key_to_pem_external(key: *const u8, key_len: usize) -> *mut Vec<u8> {
    // .. construction of `result` exactly as before
    marshall_to_haskell_external(result, RW)
}

with Haskell counter-part:

{# fun pure unsafe rust_wrapper_key_to_pem_external as toPemExternal
     { toBorshFixed* `SecretKey'&
     }
  -> `Text' fromExternalBorsh*
#}

The type of the function constructed by c2hs is then SecretKey -> Text. The advantages of this approach are that no initial buffer size needs to be estimated (we just use whatever buffer was allocated Rust-side), a second round-trip is guaranteed not to be needed, and we avoid copying the buffer. We still have the serialization/deserialization overhead, of course, and, perhaps more importantly, it is difficult to predict quite how long we will hold on to that Rust-allocated buffer. The deserializer might return values that directly or indirectly point to that buffer, and since these buffers are allocated on the Rust heap, not the Haskell heap, memory profiling might be difficult. In most cases, this approach is therefore probably not the right choice.
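The life cycle of such a Rust-allocated buffer can be sketched as three steps: export, read, free. The function names below are hypothetical and are not the haskell-ffi API; the sketch only assumes that the buffer is a boxed `Vec<u8>` handed over as a raw pointer, as the return type `*mut Vec<u8>` above suggests.

```rust
/// Move a serialized buffer to the heap and hand it out as a raw pointer.
pub fn export_buffer(bytes: Vec<u8>) -> *mut Vec<u8> {
    Box::into_raw(Box::new(bytes))
}

/// Expose the data pointer and length so the other side can read without copying.
pub fn buffer_parts(ptr: *const Vec<u8>) -> (*const u8, usize) {
    let v: &Vec<u8> = unsafe { &*ptr };
    (v.as_ptr(), v.len())
}

/// Free the buffer once nothing refers to its contents any more.
pub fn free_buffer(ptr: *mut Vec<u8>) {
    drop(unsafe { Box::from_raw(ptr) });
}
```

Anything that still points into the buffer must be gone before the free step, which is exactly what makes the lifetime of these buffers hard to predict.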


This was a long blog post, so let’s summarize:

  • Expose extern "C" functions in your Rust code; you can use the Rust library haskell-ffi to serialize and deserialize data in a convenient manner.
  • Build your Rust library with cargo cbuild, to generate a header file and a pkg-config file.
  • Declare a pkg-config dependency on the Rust library in your cabal file, as well as a dependency on the build tool (preprocessor) c2hs.
  • Use c2hs to add bindings to the C functions; the Haskell library foreign-rust is a companion library to haskell-ffi that makes this process very streamlined.
  • For data types with a fixed size encoding, the c2hs declaration might be all you need; otherwise you will write a simple wrapper function, again using functionality from foreign-rust.
  • To Show Rust-side values, foreign-rust offers several options, which show a value in base-16, base-58, base-64, or JSON format; each of these generates valid Haskell, so that you can denote Rust-side values within your Haskell source code.
  • In addition, foreign-rust offers functionality for annotating values with additional information, which can be quite helpful to get further information about Rust-side values during debugging.
  • Finally, if serialization of Rust-side values is undesirable or impossible, you can just pass pointers back and forth, using the Haskell garbage collector to call the Rust-side deallocator when a value is no longer in use. However, when you do this, you will have no way of denoting these values Haskell-side.

  1. There is also rust_wrapper.pc, which can be used if the Rust library is installed system-wide. Here we will assume that we will link against the library in its build directory.↩︎

  2. We could even use a ByteString, like we did for Certificate. If we do, we just need to update the Rust code to ensure that the ToHaskell and FromHaskell instances include a length prefix; “Borsh in Borsh” style.↩︎

  3. marshall_result_to_haskell_var is a thin wrapper around marshall_to_haskell_var which can be used in the common case that we have a Result<T, E> for some library specific error type E; it just calls format on the error before calling marshall_to_haskell_var.↩︎

by edsko at March 29, 2023 12:00 AM

March 27, 2023

Magnus Therning

Cabal, tree-sitter, and consult

After my last post I thought I'd move on to implement the rest of the functions in haskell-mode's major mode for Cabal, functions like haskell-cabal-goto-library-section and haskell-cabal-goto-executable-section. Then I realised that what I really want is a way to quickly jump to any section, that is, I want consult-cabal!

What follows is very much a work-in-progress, but hopefully it'll show enough promise.

Listing the sections

As I have a tree-sitter parse tree to hand it is fairly easy to fetch all the nodes corresponding to sections. Since the last post I've made some improvements to the parser and now the parse tree looks like this (I can recommend the function treesit-explore-mode to explore the parse tree, I've found it invaluable ever since I realised it existed)

(cabal
 (properties ...)
 (sections
  (common common (section_name) ...)
  (library library ...)
  (executable executable (section_name) ...)))

That is, all the sections are children of the node called sections.

The function to use for fetching all the nodes is treesit-query-capture. It needs a node to start on, which in this case should be the root of the full parse tree, i.e. (treesit-buffer-root-node 'cabal), and a query string. Given the structure of the parse tree, and that I want to capture all children of sections, a query string like this one works

"(cabal (sections (_)* @section))"

Finally, by default treesit-query-capture returns a list of tuples of the form (<capture> . <node>), but in this case I only want the list of nodes, so the full call will look like this

(treesit-query-capture (treesit-buffer-root-node 'cabal)
                       "(cabal (sections (_)* @section))"
                       nil nil t)

Hooking it up to consult

As I envision adding more things to jump to in the future, I decided to make use of consult--multi. That in turn means I need to define a "source" for the sections. After a bit of digging and rummaging in the consult source I put together this

(defvar consult-cabal--source-section
  `(:name "Sections"
    :category location
    :action ,#'consult-cabal--section-action
    :items ,#'consult-cabal--section-items)
  "Definition of source for Cabal sections.")

which means I need two functions, consult-cabal--section-action and consult-cabal--section-items. I started with the latter.

Getting section nodes as items for consult

It took me a while to understand how this would ever be able to work. The function that :items points to must return a list of strings, but how would I ever be able to use just a string to jump to the correct location?

The solution is in a comment in the documentation of consult--multi:

:items - List of strings to select from or function returning list of strings. Note that the strings can use text properties to carry metadata, which is then available to the :annotate, :action and :state functions.

I'd never come across text properties in Emacs before, so at first I completely missed those two words. Once I'd looked up the concept in the documentation everything fell into place. The function consult-cabal--section-items would simply attach the relevant node as a text property to the strings in the list.

My current version, obviously a work-in-progress, takes a list of nodes and turns them naïvely into a string and attaches the node. I split it into two functions, like this

(defun consult-cabal--section-to-string (section)
  "Convert a single SECTION node to a string."
  (propertize (format "%S" section)
              :treesit-node section))

(defun consult-cabal--section-items ()
  "Fetch all sections as a list of strings."
  (let ((section-nodes (treesit-query-capture (treesit-buffer-root-node 'cabal)
                                              "(cabal (sections (_)* @section))"
                                              nil nil t)))
    (mapcar #'consult-cabal--section-to-string section-nodes)))

Implementing the action

The action function is called with the selected item, i.e. with the string and its properties. That means, to jump to the selected section the function needs to extract the node property, :treesit-node, and jump to the start of it. The function to use is get-text-property, and as all characters in the string will have the property I just picked the first one. The jumping itself I copied from the navigation functions I'd written before.

(defun consult-cabal--section-action (item)
  "Go to the section referenced by ITEM."
  (when-let* ((node (get-text-property 0 :treesit-node item))
              (new-pos (treesit-node-start node)))
    (goto-char new-pos)))

Tying it together with consult--multi

The final function, consult-cabal, looks like this

(defun consult-cabal ()
  "Choose a Cabal construct and jump to it."
  (interactive)
  (consult--multi '(consult-cabal--source-section)
                  :sort nil))

Conclusions and where to find the code

The end result works as intended, but it's very rough. I'll try to improve it a bit more. In particular I want

  1. better strings - (format "%S" node) is all right to start with, but in the long run I want strings that describe the sections, and
  2. preview as I navigate between items - AFAIU this is what the :state field is for, but I still haven't looked into how it works.

The source can be found here.

March 27, 2023 07:20 PM

Monday Morning Haskell

New Free Course: Setup.hs!

You can read all the Haskell articles you want, but unless you write the code for yourself, you'll never get anywhere! But there are so many different tools and ideas floating around out there, so how are you supposed to know what to do? How do you even get started writing a Haskell project? And how can you make your development process as efficient as possible?

The first course I ever made, Your First Haskell Project, was designed to help beginners answer these questions. But over the years, it's become a bit dated, and I thought it would be good to sunset that course and replace it with a new alternative, Setup.hs. Like its predecessor, Setup.hs is totally free!

Setup.hs is a short course designed for many levels of Haskellers! Newcomers will learn all the basics of building and running your code. More experienced Haskellers will get some new tools for managing all your Haskell-related programs, as well as some tips for integrating Haskell features into your code editor!

Here's what you'll learn in the course:

  1. How to install and manage all of the core Haskell tools (GHC, Cabal, Stack)
  2. What components you need in your Haskell project and how you can build and run them all
  3. How to get your editor to use advanced features, like flagging compilation errors and providing autocomplete suggestions.

We'll do all of this in a hands-on way with detailed, step-by-step exercises!


Setup.hs makes a few notable updates and improvements compared to Your First Haskell Project.

First, it uses GHCup to install all the necessary tools instead of the now-deprecated Haskell Platform. GHCup allows for seamless switching between the different versions of all our tools, which can be very useful when you have many projects on your system!

Second, it goes into more detail about pretty much every topic, whether that's project organization, Stack snapshots, or extra dependencies.

Third and probably most importantly, Setup.hs will teach you how to get Haskell code hints in three of the most common code editors (VS Code, Vim & Emacs) using Haskell Language Server. Even if these lectures don't cover the particular editor you use, they'll give you a great idea of what you need to search for to learn how. I can't overstate how useful these kinds of integrations are. They'll massively speed up your development and, if you're a beginner, they'll rapidly accelerate your learning.

If all this sounds super interesting to you, head over to the course page and sign up!

by James Bowen at March 27, 2023 02:30 PM

March 24, 2023

Philip Wadler

Benchmarking best practices


A handy summary prepared by Jesse Sigal. Thanks, Jesse!


- Determine what is relevant for you to actually benchmark (areas include accuracy, computational complexity, speed, memory usage, average/best/worst case, power usage, degree of achievable parallelism, probability of failure, clock time, performance vs time for anytime algorithms).

- Make sure you run on appropriate data, including generating random (but representable) data and running statistical analysis.

- Consider using multiple datasets and cross-validation.

- Consider the extreme cases as well.
- Find benchmarks the field will care about.



- “Writing for Computer Science” by Justin Zobel

- “The art of computer systems performance analysis” (1990) by Raj Jain



- A. Crapé and L. Eeckhout, “A Rigorous Benchmarking and Performance Analysis Methodology for Python Workloads,” 2020 IEEE International Symposium on Workload Characterization (IISWC), Beijing, China, 2020, pp. 83-93, doi: 10.1109/IISWC50251.2020.00017.

- A. Georges, D. Buytaert, L. Eeckhout, “Statistically rigorous java performance evaluation,” OOPSLA '07: Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems, languages and applications, October 2007.

- Benchmarking Crimes: An Emerging Threat in Systems Security. van der Kouwe, E.; Andriesse, D.; Bos, H.; Giuffrida, C.; and Heiser, G. Technical Report arXiv preprint arXiv:1801.02381, January 2018.

- Hoefler, Torsten, and Roberto Belli. "Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results." Proceedings of the international conference for high performance computing, networking, storage and analysis. 2015.

- Hunold, Sascha, and Alexandra Carpen-Amarie. "Reproducible MPI benchmarking is still not as easy as you think." IEEE Transactions on Parallel and Distributed Systems 27.12 (2016): 3617-3630.


Online resources

by Philip Wadler at March 24, 2023 05:27 PM

Sandy Maguire

The Co-Blub Paradox

The following is an excerpt from Certainty by Construction, a new book I’m writing on learning and effectively wielding Agda. Writing a book is a tedious and demoralizing process, so if this is the sort of thing you’re excited about, please do let me know!

It is widely acknowledged that the languages you speak shape the thoughts you can think; while this is true for natural language, it is doubly so in the case of programming languages. And it’s not hard to see why; while humans have dedicated neural circuitry for natural language, it would be absurd to suggest there is dedicated neural circuitry for fiddling around with the semantics of pushing around arcane symbols abstractly encoded as electrical potentials over a conductive metal.

Because programming—and mathematics more generally—does not come easily to us humans, it can be hard to see the forest for the trees. We have no built-in intuition as to what should be possible, and thus, this intuition is built by observing the artifacts created by more established practitioners. In these more “artificial” of human endeavors, newcomers to the field are truly constructivists—their methods for practicing the art are shaped only by their previously-observed patterns. Because different programming languages support different features and idioms, the imaginable shape of what programming is must be shaped by the languages we understand.

In a famous essay, “Beating the Averages,” Paul Graham points out the so-called Blub paradox. This, Graham says, is the ordering of programming languages by powerfulness; a programmer who thinks in a middle-of-the-road language along this ordering (call it Blub) can identify less powerful languages, but not those which are more powerful. The idea rings true; one can arrange languages in power by the features they support, and subsequently check to see if a language supports all the features felt to be important. If it doesn’t, it must be less powerful. However, this technique doesn’t work to identify more powerful languages—at best, you will see that the compared language supports all the features you’re looking for, but you don’t know enough to ask for more.

More formally, we can describe the Blub paradox as a semi-decision procedure. That is, given an ordering over programming languages (here, by “power”), we can determine whether a language is less than our comparison language, but not whether it is more than. We can determine when the answer is definitely “yes,” but not when it is “no!”

Over two decades of climbing this lattice of powerful languages, I have come to understand a lesser-known corollary of the Blub paradox, coining it the Co-Blub paradox¹. This is the observation that knowledge of lesser languages is actively harmful in the context of a more powerful language. The hoops you unwittingly jumped through in Blub due to lacking feature X are anti-patterns in the presence of feature X. This is obviously true when stated abstractly, but insidious when one is in the middle of it.

Let’s look at a few examples over the ages, to help motivate the problem before we get into our introspection proper. In the beginning, people programmed directly in machine code. Not assembly, mind you, but in raw binary-encoded op-codes. They had a book somewhere showing them what bits needed to be set in order to cajole the machine into performing any given instruction. Presumably if this were your job, you’d come to memorize the bit patterns for common operations, and it wouldn’t be nearly as tedious as it seems today.

Then came assembly languages, which provided human-meaningful mnemonics for the computer’s opcodes. No longer did we need to encode a jump as 11111000110000001100; now it was jl 16. Still mysterious, to be sure, but the gains in legibility were significant. When encoded directly in machine code, programs were, for the most part, write-only. But assembly languages don’t come for free; first you need to write an assembler: a program that reads the mnemonics and outputs the raw machine code. If you were already proficient at writing machine code directly, you can imagine the task of implementing an assembler feeling like make-work: a tool to automate a problem you don’t have. In the context of the Co-Blub paradox, knowing the direct encodings of your opcodes is an anti-pattern when you have an assembly language, as it makes your contributions inscrutable to your peers.
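As a rough illustration of what an assembler does, here is a toy one in Python. The opcode table and the 12-bit instruction layout are invented for a made-up machine; they are assumptions for this sketch and do not reproduce the bit pattern quoted above or any real instruction set.

```python
# Hypothetical 4-bit opcodes for a made-up machine; not a real instruction set.
OPCODES = {"load": 0b0001, "add": 0b0010, "jl": 0b1111}


def assemble_line(line: str) -> str:
    """Translate one 'mnemonic operand' line into a 12-bit binary string."""
    mnemonic, operand = line.split()
    opcode = OPCODES[mnemonic]
    value = int(operand)
    if not 0 <= value < 256:
        raise ValueError(f"operand out of range: {value}")
    # Layout assumed for this sketch: 4 bits of opcode, then 8 bits of operand.
    return f"{opcode:04b}{value:08b}"


def assemble(source: str) -> list:
    """Assemble every non-blank line of the source text."""
    return [assemble_line(line) for line in source.splitlines() if line.strip()]


program = assemble("load 7\nadd 3\njl 16")
for word in program:
    print(word)
```

The mnemonic source is something a colleague can read; the list of bit strings it produces is what the machine-code programmer would have written by hand.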

Programming directly in assembly eventually hit its limits. Every computer had a different assembly language, which meant if you wanted to run the same program on a different computer you’d have to completely rewrite the whole thing; often needing to translate between extremely different concepts and limitations. Ignoring a lot of history, C came around with the big innovation that software should be portable between different computers: the same C program should work regardless of the underlying machine architecture. If you were an assembly programmer, you ran into the anti-pattern that while you could squeeze more performance and perform clever optimizations if you were aware of the underlying architecture, this fundamentally limited you to that platform.

By virtue of being, in many ways, a unifying assembly language, C runs very close to what we think of as “the metal.” Although different computer architectures have minor differences in registers and ways of doing things, they are all extremely similar variations on a theme. They all expose storable memory indexed by a number, operations for performing basic logic and arithmetic tasks, and means of jumping around to what the computer should consider to be the next instruction. As a result, C exposes this abstraction of what a computer is to its programmers, who are thus required to think about mutable memory and about how to encode complicated objects as sequences of bytes in that memory. But then came Java, whose contribution to mainstream programming was to popularize the idea that memory is cheap and abundant, and thus OK to waste some in order to alleviate the headache of needing to track it all yourself. As a C programmer coming to Java, you must unlearn the idea that memory is sacred and scarce, that you can do a better job of keeping track of it than the runtime can, and, hardest of all, that it is an important thing to track in the first place.

There is a clear line of progression here; as we move up the lattice of powerful languages, we notice that more and more details of what we thought were integral parts of programming turn out to be not particularly relevant to the actual task at hand. However, the examples discussed so far are already familiar to the modern programmer. Let’s take a few steps further, into languages deemed esoteric in the present day. It’s easy to see and internalize examples from the past, but those staring us in the face are much more difficult to spot.

Compare Java then to Lisp, which, among many things, makes the argument that functions, and even programs themselves, are objects just as meaningful as numbers and records. Where Java requires the executable pieces to be packaged up and moved around with explicit dependencies on the data they require, Lisp just lets you write and pass around functions, which automatically carry around all the data they reference. Java has a design pattern for this called the “command pattern,” which requires much ado and ink to be spilled, while in Lisp it just works in a way that is hard to understand if you are used to thinking about computer programs as static sequences of instructions. Indeed, the command pattern is bloated and ultimately unnecessary in Lisp, and practitioners must first unlearn it before they can begin to see the beauty of Lisp.
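The contrast can be sketched in a single language. The Python below puts a hand-packaged command object next to a closure that captures the same data automatically. All names here (`GreetCommand`, `make_greeter`, and the two runner functions) are hypothetical, and this is neither idiomatic Java nor Lisp, only an approximation of the two styles.

```python
from typing import Callable


# Command-pattern style: behaviour and the data it needs are packaged by hand.
class GreetCommand:
    def __init__(self, greeting: str, name: str) -> None:
        # Every piece of data the action depends on is stored explicitly.
        self.greeting = greeting
        self.name = name

    def execute(self) -> str:
        return f"{self.greeting}, {self.name}!"


def run_command(command: GreetCommand) -> str:
    """Stand-in for code that receives a command object to run later."""
    return command.execute()


# Closure style: the inner function captures what it references automatically.
def make_greeter(greeting: str, name: str) -> Callable[[], str]:
    def greet() -> str:
        return f"{greeting}, {name}!"

    return greet


def run_function(action: Callable[[], str]) -> str:
    """Stand-in for code that receives a plain function to run later."""
    return action()


command_result = run_command(GreetCommand("Hello", "Lisp"))
closure_result = run_function(make_greeter("Hello", "Lisp"))
print(command_result)  # Hello, Lisp!
print(closure_result)  # Hello, Lisp!
```

Both versions produce the same result; the difference is that the class exists only to carry `greeting` and `name` to the call site, which is bookkeeping the closure does for free.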

Haskell takes a step further than Lisp, in that it restricts when and where side-effects are allowed to occur in a program. This sounds like heresy (and feels like it for the first six months of programming in Haskell) until you come to appreciate that almost none of a program needs to perform side-effects. As it happens, side-effects are the only salient observation of the computer’s execution model, and by restricting their use, Haskell frees its programmers from needing to think about how the computer will execute their code, promising only that it will. As a result, Haskell code looks much more like mathematics than it looks like a traditional computer program. Furthermore, by abstracting away the execution model, the runtime is free to parallelize and reorder code, often even eliding unnecessary execution altogether. The programmer who refuses to acknowledge this reality and insists on coding with side-effects pays a great price in the amount of code they need to write, in its long-term reusability, and, most importantly, in the correctness of their computations.

All of this brings us to Agda, which is as far