Planet Haskell

December 02, 2021

Matthew Sackman

Let's build! A distributed, concurrent editor: Part 1 - Introduction

The sun is bright and the sky’s blue when I get to the office in London. As the lift doors ping open and I step out, a pair of heads near my desk swivel and look towards me. A sense of shame and embarrassment immediately forms in the pit of my stomach as I spot Alice and Bob sat waiting for me. I’d forgotten the meeting with them this morning that I’m late for. Alice shoots a death-stare my way, and Bob looks pointedly at his watch. I hurry over and mutter profuse apologies: I didn’t sleep well due to worry over a board meeting later today. Quickly getting down to business, I ask: “How can I help?”

Alice and Bob want to write a document together. But they don’t want to be sat next to each other to do this - they want to be able to work apart, but on the same document. They’ve tried emailing around documents but that always gets into a mess - they both now have several files called the_document_version_3 and they really don’t know which is the actual version 3. So they want something where they can see instantly the changes that each other is making and both make changes at the same time.

I briefly interrupt to query this use of the word instantly:

“If you’re involving computers and networks, then it’s very likely it’s going to take tens of milliseconds to relay changes from one to the other, and probably hundreds of milliseconds if you’re geographically far apart.”

They think this is probably OK: if this is as fast as it can be then it’ll have to do, and those numbers sound small enough that it’s unlikely to be too detrimental to the interactive editing experience.

Digging into this a bit further, I present them with a scenario:

“Let’s say”, says I, “that you’re both looking at the document on your own computers, and you each have your cursor at the same place in the document. At the exact same moment, one of you types x and the other types y. What should happen?”

The discussion bounces around for a while - it’s clear there is no one correct answer. In the end we agree it most likely doesn’t matter too much, provided the following is true:

  • Either the x or the y or both should be added to the document at the cursor position (and nothing else!), and if both, any order is acceptable.
  • Once Alice’s computer and Bob’s computer have received updates from each other, they should both see the exact same document.

They point out to me that they’re colleagues, and want to collaborate. Their nature is that if they see one of them is editing a particular word or sentence, the other is likely to wait to see what changes are made, rather than wade in with fists full of fury and hammer on the keyboard without thought to what the other person is doing. So to them, the scenario I posed isn’t likely to be that common. I smile internally: I rather suspect they’ve never tried to work with Eve, down in security.

By now, my computer has booted up and I do some quick searching online. It turns out that Google docs has now been sunsetted, and Dropbox paper was never invented, nor several other tools you might be thinking of. I realise I’m going to have to build this myself.

I query them on the notion of saving the document - earlier in the discussion they’d mentioned emailing documents around so I want to know how and where their document should be saved.

“Obviously we want it to be saved in the cloud”, Alice asserts. They look at me like I’m from the past. Right. So the cloud exists, but Dropbox doesn’t. Strange world, but fine, whatever! I half nod, half shrug, and suggest:

“This means there will need to be some sort of server which will provide persistent storage for the document.”

“That sounds fine”, says Bob, not caring one iota. “Also, it should save instantly. Like, every key-press. We should never have to think about saving.”

I wonder what else they might want: the 2nd moon on a stick perhaps? Thankfully the words never leave my mouth. My thoughts turn to disconnected operation and the impact of network disconnects.

“Erm, are you both planning on working on this document when you’re traveling? On trains, aeroplanes - stuff like that?”

“Absolutely”. They both nod vigorously.

“OK, so if you’re on a plane then you can’t be connected to a network, and if you’re on a train and you go through the countryside or a tunnel or something, you may lose connectivity.”

Blank stares.

“So”, I continue, “that means if you’re both typing and you don’t have connectivity, there is no way I can relay changes between you both, nor store your changes on the server.”

This foxes them a little and a fair amount of discussion and debate follows once again. I want to push them towards a “if you’re not connected then you can’t edit the document at all” world, but they don’t like that - they definitely want to be able to edit the document when on long flights. Looking at my watch I realise we’re fast running out of time for this meeting, but I was late so let’s overrun for a bit if necessary.

“How about you can edit it, and the edits show up on your computer immediately, and once you reconnect to the network it will try to sync with the server?” They’re both looking hopefully at me; so far so good. “But, there’s a chance that if the other one of you has edited the same part of the document in the mean time then you might have a conflict and maybe some changes get lost?”. The hopeful expressions seem diminished.

“I think we can work with that”, Alice suggests cautiously. “If I’m going to do a large amount of work on a section of the document on a flight, I’ll just have to let Bob know my plans and he’ll know not to touch that area until I say I’m done.” Bob looks less convinced but I guess my face is putting out several expressions too, so he takes the diplomatic route: “How about we start with this, see how it goes, and if we don’t like it then we revisit this issue later?” We all nod in agreement, though as is often the way, with different agendas internally.

“How do you want to edit this document?” I ask. “Is a browser-based UI OK?” I regret the words before they’ve even formed in my mouth. Why did I just say that? There is nothing that drives me up the wall more than trying to program browsers. Horrible misdesigned things.

“Yeah sure, that’d be great” says Bob. He doesn’t care. Nor does Alice. No one actually wants to use a browser. They just want to write their document. But I’ve gone and said it now, like the idiot I am.

They begin to get up to leave. “Oh yes, one other thing,” Alice starts. Somehow I already know I’m being ambushed. Alice looks at Bob: “Oh right! Yes! OK, so you know how we said earlier that we’d been emailing around Word documents?”

“Well I don’t think you mentioned they were Word documents, but sure, go on.”

“Well, it’s really useful that the undo (and redo) history is saved inside the document.”

“Exactly!” Alice joins back in. “We really want the undo/redo history to be part of the document and saved too. Thanks! Great meeting!”

“Try not to be late next time. Bye!” Nice rejoinder there from Bob. Not going to let my tardiness go then is he? And like that, they were gone.

Undo and redo. In a distributed editor. Have they even thought about how they want that to work? Oh well, no time to dwell on that. Quick bathroom break and then this board meeting. At least I’ve got something to tell the board now.

I give it my best shot. I know the details are a bit rough, and the slides were hurried. More than that, there’s just something missing from the plan. But still, I try to deliver the notion of this editor for Alice and Bob and their document to the board.

I finish my presentation. Silence descends for 30 seconds. And then almost in perfect synchronicity they all get out their sharpened pencils and pens (presumably sharp, but not sharpened) and start writing out their reasons for firing me. I panic. I make mistakes no one’s ever made before. I rapidly embellish my plans, promising technology that could never be delivered in the time frame available, all just to placate the board and save my skin.

“Wait! I forgot to mention! It’s not just one document, it’s any number of documents!”

Most pens are still moving, but a pencil or two is stationery.

“Oh! And, erm, erm, it’s not just Alice and Bob, oh no! It’s many many users. Like however many you like. All editing their own documents, in their own groups and teams and stuff!”

A stay of execution may be in reach.

“So like, we could have millions of users and billions of documents, and once we’ve been that disruptive, and have that much data and that many users, we’ll definitely be able to make money from them all and stuff!”

Utter silence. You could hear a mouse fart.

Suddenly, jubilation! Back-slapping, high-fiving, a case of champagne is opened, wild cheering breaks out. I blush, I thank them for their input and for really driving me the extra mile. They assure me they’ve all got detailed notes of everything I just rashly promised and will be cutting my head off if I fail to deliver. “Well that’s only fair!” I joke with unwise joviality. Several of them give me a look that I’ll remember to my dying day. I bid them farewell, until next time, and so forth, and retreat from the room. I spend 20 minutes vomiting in the toilets. What have I just done?

Let’s build!

So, with the scene set, let’s build this thing! I enjoy building distributed systems, I like the challenge. It’s also something that I think needs practice. It’s an area where I think it really helps to understand a bit of theory, and practice thinking about all the different orders in which events can occur.

I’m planning to write several articles on this project over the next few weeks and months. Initially, I’ll be talking a bit about theory and how that informs our protocol design between the browser and the server (and in fact how the undo/redo feature ends up meaning simple HTTP REST calls are not the most appropriate route to take). I’ll cover how we store this document, including its undos and redos (i.e. its editing history). There are tradeoffs everywhere: so many different ways you could make this work, none of them perfect, all a little ugly in places, but several of them more than good enough. I’ll show bits of the code (which will be open source - at the time of writing this, the code is complete and working, but I’m yet to write tests). And finally, I’ll cover testing this. I will write end-to-end soak tests, and hopefully fuzz tests too. Yup, I want to fuzz test a distributed concurrent system.

Oh yes, and hopefully, what I build will work for any data structure. Not just a document. Should be fun!

December 02, 2021 11:01 AM

Tweag I/O

Implementing a content-addressed Nix

Although the feature is still marked as experimental, the release of Nix 2.4 marks the entry of content-addressed derivations in a released Nix version. This is something to celebrate, but also an occasion to look back and see what this addition means, implementation-wise.

This new feature indeed required some deep changes, both in the high-level algorithm describing the build loop (and, as a consequence, in its low-level details) and in the “plumbing” code around it.

Note: Because it touches the implementation, this post will be more technical than the others, and it assumes that you’re somewhat familiar with what content-addressed derivations mean and how they work. If you haven’t already, I invite you to read the previous blog articles on the topic (1, 2 and 3), and if you’re really curious, the corresponding RFC.

High-level overview of the build loop

The first (and most obvious) change is that the flow of the build loop itself had to be adapted to take this new way of realising derivations into account.

Before looking at what changed, let’s try to understand how things used to work. What happens (or rather happened) when someone runs nix-build default.nix? The first thing, of course, is that the Nix expression evaluator will happily evaluate this default.nix file. We won’t detail this process here; suffice it to say that this will return a derivation which represents the thing that we want to build.

This derivation will be passed on to the “build loop”, whose behavior is roughly the following (I’m using a Python-like pseudocode, but the actual implementation is in C++):

def buildDerivation(derivation : Derivation) -> None:
    remainingOutputs = tryToSubstituteOutputs(derivation)
    if (remainingOutputs == []):
        return # Everything could be substituted, nothing left to do
    buildInputs(derivation)
    doBuild(derivation)
    registerOutputPaths(derivation)

In plain English, what happens is that the builder will first try to substitute the outputs of the derivation from the configured binary caches. If some couldn’t be substituted, then it will recursively build all the inputs of the derivation, build the derivation itself and eventually register its output paths in the database. Simple and straightforward (although as you might imagine the devil is in the details, like always).

So what changes with content-addressed derivations? Quite a bunch of things in fact.

Building resolved derivations

The first big change is that for early cutoff to work, we don’t really want to build the derivation that’s given to us, but rather its resolved version, which is the same derivation but where all the inputs are replaced by their content-addressed paths (this is explained in more detail in the corresponding RFC). So our buildDerivation should actually be redefined as

def buildDerivation(derivation : Derivation) -> Set[Realisation]:
    remainingOutputs = tryToSubstituteOutputs(derivation)
    if (remainingOutputs == []):
        return # Everything was substituted
    return buildResolvedDerivation(Derivation.resolve(derivation))

where buildResolvedDerivation : ResolvedDerivation -> Set[Realisation] is the one that will do most of the work.
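To make the effect of resolution concrete, here is a toy sketch in the same Python-ish spirit (the `resolve` and `drv_hash` helpers and the dict representation are hypothetical, nothing like the real C++): two derivations whose inputs were built differently but realised to the same content-addressed path resolve identically, which is exactly what makes early cutoff possible.

```python
import hashlib

def resolve(derivation: dict, realisations: dict) -> dict:
    # Replace each input derivation output ("foo.drv!out") by the
    # concrete store path it was realised to.
    resolved = dict(derivation)
    resolved["inputs"] = sorted(realisations[i] for i in derivation["inputs"])
    return resolved

def drv_hash(derivation: dict) -> str:
    # Toy stand-in for hashing a derivation.
    return hashlib.sha256(repr(sorted(derivation.items())).encode()).hexdigest()
```

If foo-v1 and foo-v2 realise to the same content-addressed path, anything depending on them resolves (and therefore hashes) identically, so its outputs can be substituted instead of rebuilt.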

Calculated output paths

Another change is that while the output paths used to be a given, they are now a product of the build (it’s the whole point of content-addressed derivations). It means that:

  1. We must register what path each output maps to in the database, and
  2. This mapping must be passed to any piece of code that needs to access the output paths as it can’t just be inferred anymore.

The first point is handled by a new function registerRealisation : Realisation -> (), where a Realisation associates a given derivation output (foo.drv!out) to a given store path (/nix/store/…-foo-out).

For the second point, we must change the registerOutputPaths function a bit: the way doBuild works is that it will build everything in a predetermined temporary location. Then registerOutputPaths will do all the magic to move these paths to their final content-addressed location, as described in my post about self-references. Eventually, this function will return a set of Realisation materializing the newly built derivation outputs.
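As a rough sketch of the new mapping (hypothetical Python, not the actual C++ or the real database schema), a realisation and its registry could be modelled like this:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Realisation:
    # Which derivation output this is, e.g. "foo.drv!out"
    drv_output: str
    # The content-addressed store path it was built at
    out_path: str

class Db:
    """Toy stand-in for the Nix database."""
    def __init__(self) -> None:
        self._realisations: dict = {}

    def register_realisation(self, r: Realisation) -> None:
        self._realisations[r.drv_output] = r.out_path

    def query_realisation(self, drv_output: str) -> Optional[str]:
        # Returns None when the output hasn't been built yet: output
        # paths can no longer be inferred, they must be looked up.
        return self._realisations.get(drv_output)
```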

Our new buildResolvedDerivation now looks like:

def buildResolvedDerivation(derivation : ResolvedDerivation)
      -> Set[Realisation]:
    # Maybe a substituter didn’t know about the original derivation,
    # but knows about the resolved one, so let’s try substituting
    # again
    remainingOutputs = tryToSubstituteOutputs(derivation)
    if (remainingOutputs == []):
        return # Everything was substituted
    # No need to try building the inputs as by definition
    # a resolved derivation already has all its inputs available
    doBuild(remainingOutputs)
    newRealisations = registerOutputPaths(remainingOutputs)
    for realisation in newRealisations:
        registerRealisation(realisation)
    return newRealisations

Registering realisations

In addition to the two points above, we must also register the new realisations for the original (non-resolved) derivation, meaning that buildDerivation should take the set of Realisation that buildResolvedDerivation provides, and register them as if they were its own build outputs. Something like:

def buildDerivation(derivation : Derivation) -> Set[Realisation]:
  remainingOutputs = tryToSubstituteOutputs(derivation)
  if (remainingOutputs == []):
      return # Everything was substituted
  newRealisations = buildResolvedDerivation(
      Derivation.resolve(derivation))
  registerRealisationsFromResolved(newRealisations)
  return newRealisations

Mixing content-addressed and non-content-addressed

These changes obviously make the build process slightly more involved, but in a way this only makes explicit some things that were implicitly passed around. The actual build loop is even more complex than that right now (even at this level of abstraction), because there’s something else on top of this: since content-addressed derivations are an experimental feature, this new build loop must only be followed when the feature is enabled. So there’s a bunch of if (settings.isExperimentalFeatureEnabled(Xp::CaDerivations)) in a bunch of places. I won’t write the full algorithm, because it’s messier than interesting, and it’s (hopefully) only temporary, as the feature is supposed to get out of its experimental state one way or another at some point.

The build scheduling system

Getting more into the details and closer to the actual implementation, things don’t look that nicely structured, mostly because everything is written in a weird CPS-like style that seems to come out of nowhere.

The reason for this style is that this whole process has to be asynchronous, so that the internal scheduler can properly supervise an arbitrary number of concurrent builds and keep the user informed as to what’s going on. And because this part of the code base predates all the nice async frameworks that we have right now, it uses its own framework, which does the job but imposes a cost in terms of maintainability and readability of the code1.

Explaining this mechanism in detail would be too long for this post2, so I’ll just give a quick overview of the bits that really matter here.

At its core, this framework is a cooperative scheduling system. Each high-level task is represented by a subclass of Goal, which is some kind of state machine. The interface of a Goal is essentially:

class Goal:
    # type State = Goal -> None
    State = Callable[[Goal], None]
    state : State

    dependencies : [Goal]

    # Called when a goal wants to say that it doesn’t have more to do
    def amDone(self) -> None: ...

The idea is that state should point to a method of the goal (which represents the next thing to do for the goal). The method pointed to by state should eventually return, and set state to a new method representing the next thing to run (so morally yielding the control back to the scheduler). On top of that, a scheduler will look at all the goals, and will schedule all the goals with no dependencies by calling their state method, until they call amDone to signal that they can be destroyed.

A consequence of this design is that there’s no function inside a goal, just “actions” (the States) that take no argument except the goal itself and return nothing. So any information that must be passed to or returned from a State must be present in a global environment (with respect to the current goal), meaning that it must be a field of the class (which is neither memory-efficient nor nice to work with, but has the advantage of making the whole scheduling system much simpler). Likewise, no information can be passed from one goal to another, except by side-effects.
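To illustrate the mechanism, here is a deliberately minimal cooperative loop in the same style (the `run` driver and `PrintGoal` are hypothetical toys, not the real scheduler): each goal’s state points at its next action, and the scheduler keeps calling it until the goal declares itself done, only running goals whose dependencies have finished.

```python
class Goal:
    def __init__(self):
        self.state = None        # next action to run
        self.dependencies = []
        self.done = False

    def am_done(self):
        self.done = True

def run(goals):
    # Repeatedly run the `state` action of every goal whose
    # dependencies have all finished, until every goal is done.
    while not all(g.done for g in goals):
        for g in goals:
            if not g.done and all(d.done for d in g.dependencies):
                g.state()

class PrintGoal(Goal):
    """Two-step goal: each action logs, then points `state` onward."""
    def __init__(self, log, name):
        super().__init__()
        self.log, self.name = log, name
        self.state = self.start

    def start(self):
        self.log.append(self.name + ":start")
        self.state = self.finish   # yield control back to the scheduler

    def finish(self):
        self.log.append(self.name + ":finish")
        self.am_done()
```

Note how `start` communicates with `finish` only through fields of the goal, exactly the ambient-state pattern described above.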

For example, a simplified version of the interface of the DerivationGoal class (the one corresponding to the buildDerivation function above) would be something like:

class DerivationGoal(Goal):
    # type State = DerivationGoal -> None
    State = Callable[[DerivationGoal], None]

    derivation : Derivation

    outputsToBuild : List[OutputName] = []
    def tryToSubstituteOutputs(self) -> None:
        # Will fill `outputsToBuild` with the name of the outputs that couldn’t
        # be substituted.
        state = self.outputsSubstituted

    def outputsSubstituted(self) -> None:
        if (outputsToBuild == []):
            self.amDone() # Everything was substituted, nothing to build
        else:
            state = self.doBuild

    def doBuild(self) -> None:
        # Builds everything in a predetermined temporary location
        state = self.registerOutputPaths

    newRealisations : List[Realisation] = []
    def registerOutputPaths(self) -> None:
        # Moves the outputs to their final content-addressed location
        # and sets `newRealisations`
        state = self.registerRealisations

    def registerRealisations(self) -> None:
        # Records `newRealisations` in the database
        self.amDone()
As can be seen above, this “state machine” isn’t very modular since everything has to be in the same scope. Besides, this is also a minefield, as most States rely on some implicit state (yes, the wording is a bit tricky): each one assumes that a previous State has set the right fields of the class. Failure to meet these requirements will generally result at best in a segfault, at worst in some nonsensical behavior. Also, because of this reliance on ambient state, sharing code between two different Goal types is much more complex.

This means that the relatively complex logic needed to build content-addressed derivations has been pretty painful to implement. In a nutshell, it works by having a single goal class, DerivationGoal, which implements both the buildDerivation and buildResolvedDerivation functions above, and behaves as one or the other depending on the shape of the given derivation. If the derivation doesn’t have the shape of a resolved one, then right at the end, tryToSubstituteOutputs will resolve the derivation, declare a dependency on a new DerivationGoal (with the resolved derivation, this time), and switch to a new resolvedFinished state that will

  1. Query the database for the newly registered realisations (since they couldn’t be passed explicitly between the two goals)
  2. Implement the registerRealisationsFromResolved function described above. This obviously also meant adding a bunch of new fields in the DerivationGoal class to handle all the CA-specific values that had to be passed around between goals.

Outside of the build loop

So far, we’ve been deliberately ignoring everything happening at instantiation time, and focused on what’s happening at build time. Let’s now look at some of the plumbing.

At the interface between the Nix evaluator and the builder lie the “derivations”. These are the data-structures that are returned by the evaluation and which describe in low-level terms how to build store paths.

Amongst other things, a derivation contains

  • A set of inputs
  • A build script
  • A list of outputs

In an input-addressed world, these outputs are plain store paths (computed by hashing all the rest of the derivation). This isn’t the case in a content-addressed world (as the output paths can only be computed once the build is done). But we still need a way to refer to these outputs in the derivation. In particular, we want to export each output path as an environment variable before the start of the build, so that we can do for example echo Foo > $out to create a file as the out output of the derivation. To be able to do that, we assign to each output a “placeholder”, a long chain of characters (also computed from a hash of the rest of the derivation, so that it is deterministic but can’t be guessed a priori). Right before the build, this placeholder will be replaced by the temporary output path used for the build.
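The placeholder dance can be sketched as follows (a toy sketch: the helper names are hypothetical and the hashing scheme is only an approximation of what Nix actually does):

```python
import hashlib

def placeholder(output_name: str) -> str:
    # Deterministic (so evaluation is reproducible) but effectively
    # unguessable a priori, since it is derived from a hash.
    h = hashlib.sha256(("nix-output:" + output_name).encode()).hexdigest()
    return "/" + h[:32]

def prepare_env(output_names) -> dict:
    # Before the build: export each output as its placeholder, so the
    # build script can say e.g. `echo Foo > $out`.
    return {name: placeholder(name) for name in output_names}

def rewrite_for_build(env: dict, tmp_paths: dict) -> dict:
    # Right before the build: swap each placeholder for the temporary
    # path the output will actually be produced at.
    result = {}
    for var, value in env.items():
        for name, tmp in tmp_paths.items():
            value = value.replace(placeholder(name), tmp)
        result[var] = value
    return result
```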

More generally, the fact that we don’t know the output paths in advance led to some global changes in the code base, as a lot of places assumed otherwise. For example, the logic for nix build was something along the lines of:

class Buildable:
    """Represents a derivation that can be built """
    drvPath : StorePath
    outputs : Map[str,StorePath]

buildable : Buildable = evalCode(input)
symlink(buildable.outputs["result"], "result")

The Buildable class obviously can’t represent unbuilt content-addressed derivations, so it had to be changed. In a first step, we changed the type of the outputs field to Map[str,Optional[StorePath]] to take into account the fact that for some derivations in some contexts, we don’t know the output paths. This change isn’t the best one semantically speaking, since in most places we should know statically whether the output paths are known or not. But it had the advantage of being very simple to implement and get to work with input-addressed derivations (just add some indirection whenever we access the output field as for input-addressed derivations the field will always be set). And from that, we could progressively migrate to take into account content-addressed derivations by making sure that we weren’t blindly dereferencing the optional in places where it was potentially empty.

Then, we could go one step further and replace this class by two different ones:

class Buildable:
    """Something that we might want to build"""
    drvPath: StorePath
    outputNames: Set[str]

class Built:
    """Something that we have built"""
    drvPath: StorePath
    outputs : Map[str,StorePath]

The only ways to create a Built value are to either build a Buildable or query the database to (optionally) get a prebuilt version of it, so we now have our safety back, and we can be sure that internally, Nix will never try to access the path of an unbuilt derivation output.
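That safety argument can be sketched like so (a hypothetical Python mirror of the classes above, with a `build` function standing in for the real machinery): a Built is only ever produced from a Buildable by actually building, so every output path it contains is known.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

@dataclass(frozen=True)
class Buildable:
    """Something that we might want to build."""
    drv_path: str
    output_names: Set[str]

@dataclass(frozen=True)
class Built:
    """Something that we have built."""
    drv_path: str
    outputs: Dict[str, str]  # every output has a known store path

def build(b: Buildable, run_build: Callable[[str], str]) -> Built:
    # `run_build` maps an output name to the store path it was built
    # at; past this point no optional dereferencing is needed.
    return Built(b.drv_path, {name: run_build(name) for name in b.output_names})
```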

Another issue we had to deal with is that Nix is (or can be used as) a distributed system: the most common setup involves a client and a daemon, and distributed builds and binary caches mean that several versions of the tool may have to work together. Obviously, this means that different versions have to play well together (as much as the intersection of their features allows), and the introduction of content-addressed derivations isn’t an excuse for breaking this.

Amongst other things, this means that

  1. If the daemon or a remote builder doesn’t handle CA derivations, then things must either work or fail gracefully. In the case of the remote builder, we reused a trick from the recursive-nix work: content-addressed derivations will only be sent to builders which advertise the ca-derivations feature. That way a network of Nix machines can freely mix ca-aware and non-ca-aware machines.
  2. Remote caches must also transparently support both content-addressed and input-addressed derivations. Speaking in terms of HTTP binary caches, this was actually quite natural: as far as remote caching is concerned, content-addressed derivations build on top of already existing mechanisms, and only require adding a (totally separate) endpoint to query for the realisations. So the client is free to ignore the substituter if it doesn’t provide that endpoint, and conversely if the client doesn’t query that endpoint… well, nothing happens, but that’s precisely what we want.


Overall, implementing this was quite a bumpy ride, but also an awesome occasion to dig into the internals of Nix (and try to improve them along the way). Hopefully, this post was also an occasion for you to join me on this wild ride, and gave you a taste of how your favorite package manager works internally. And maybe it will even make you want to contribute to it?

  1. If this makes you wonder whether we could do better, the answer is yes, we just need more contributors to be able to tackle this sort of deep refactorings

  2. Which also gives me a nice escape hatch so that I don’t have to admit that I don’t really understand this framework


December 02, 2021 12:00 AM

November 30, 2021

Matt Parsons

RankNTypes via Lambda Calculus

RankNTypes is a language extension in Haskell that allows you to write even more polymorphic programs. The most basic explanation is that it allows the implementer of a function to pick a type, rather than the caller of the function. A very brief version of this explanation follows:

The Typical Explanation

Consider the identity function id, and const:

id :: a -> a
id x = x

const :: a -> b -> a
const a b = a

These functions work for any types that the caller of the function picks. Which means that, as implementers, we can’t know anything about the types involved.

Let’s say we want to apply a function to each element in a tuple. Without a type signature, we can write:

applyToBoth f (a, b) = (f a, f b)

Vanilla Haskell will provide this type:

applyToBoth :: (a -> a) -> (a, a) -> (a, a)

This is a perfectly useful type, but what if we want to apply it to a tuple containing two different types? Well, we can’t do anything terribly interesting with that - if we don’t know anything about the type, the only thing we can provide is id.

applyToBoth :: (forall x. x -> x) -> (a, b) -> (a, b)
applyToBoth f (a, b) = (f a, f b)

And that forall x inside of a parentheses is a RankNType. It allows the implementer of the function to select the type that the function will be used at, and the caller of the function must provide something sufficiently polymorphic.

This explanation is a bit weird and difficult, even though it captures the basic intuition. It’s not super obvious why the caller or the implementer gets to pick types, though. Fortunately, by leveraging the lambda calculus, we can make this more precise!

Whirlwind Tour of Lambda

Feel free to skip this section if you’re familiar with the lambda calculus. We’re going to work from untyped, to simply typed, and finally to the polymorphic lambda calculus. This will be sufficient for us to get a feeling for what RankNTypes are.

Untyped Lambda Calculus

The untyped lambda calculus is an extremely simple programming language with three things:

  1. Variables
  2. Anonymous Functions (aka lambdas)
  3. Function Application

This language is Turing complete, surprisingly. We’ll use Haskell syntax, but basically, you can write things like:

id = \x -> x

const = \a -> \b -> a

apply = \f -> \a -> f a

Simply Typed Lambda Calculus

The simply typed lambda calculus adds an extremely simple type system to the untyped lambda calculus. All terms must be given a type, and we will have a pretty simple type system - we’ll only have Unit and function arrows. A lambda will always introduce a function arrow, and a function application always eliminates it.

id :: Unit -> Unit
id = \(x :: Unit) -> x

idFn :: (Unit -> Unit) -> (Unit -> Unit)
idFn = \(f :: Unit -> Unit) -> f

const :: Unit -> Unit -> Unit
const = \(a :: Unit) -> \(b :: Unit) -> a

apply :: (Unit -> Unit) -> Unit -> Unit
apply = \(f :: Unit -> Unit) -> \(a :: Unit) -> f a

This is a much less powerful programming language - it is not even Turing complete. This makes sense - type systems forbid certain programs that are syntactically valid.

The type system in this is only capable of referring to the constants that we provide. Since we only have Unit and -> as valid type constants, we have a super limited ability to write programs. We can still do quite a bit - natural numbers and Boolean types are perfectly expressible, but many higher order combinators are impossible.

Let’s add polymorphic types.

Polymorphic Lambda Calculus

The magic of the lambda calculus is that we have a means of introducing variables. The problem with the simply typed lambda calculus is that we don’t have variables at the type level. So we can introduce type variables.

Like Haskell, we’ll use forall to introduce type variables. In a type signature, the syntax will be the same. However, unlike Haskell, we’re going to have explicit type variable application and introduction at the value level as well.

Let’s write id with our new explicit type variables.

id :: forall a. a -> a
id = forall a. \(x :: a) -> x

Let’s write const and apply.

const :: forall a. forall b. a -> b -> a
const = forall a. forall b. \(x :: a) -> \(y :: b) -> x

apply :: forall a. forall b. (a -> b) -> a -> b
apply = forall a. forall b. \(f :: a -> b) -> \(x :: a) -> f x

Finally, let’s apply some type variables.

constUnit :: Unit -> Unit -> Unit
constUnit = 
    const @Unit @Unit 

idUnitFn :: (Unit -> Unit) -> (Unit -> Unit)
idUnitFn = 
    id @(Unit -> Unit)

idReturnUnitFn :: forall a. (a -> Unit) -> (a -> Unit)
idReturnUnitFn =
    forall a. id @(a -> Unit)

constUnitFn :: Unit -> (Unit -> Unit) -> Unit
constUnitFn = 
    const @Unit @(Unit -> Unit)

We’re passing types to functions. With all of these simple functions, the caller gets to provide the type. If we want the implementer to provide a type, then we’d just put the forall inside parentheses. Let’s look at applyBoth from above. This time, we’ll have explicit type annotations and introductions!

applyBoth
    :: forall a. forall b. (forall x. x -> x) -> (a, b) -> (a, b)
applyBoth =
    forall a. forall b.           -- [1]
    \(f :: forall x. x -> x) ->   -- [2]
    \((k, h) :: (a, b)) ->        -- [3]
        (f @a k, f @b h)          -- [4]

There’s a good bit going on here, so let’s break it down on a line-by-line basis.

  1. Here, we’re introducing our type variables a and b so that we can refer to them in the type signatures of our variables, and apply them to our functions.
  2. Here, we’re introducing our first value parameter - the function f, which itself has a type that accepts a type variable.
  3. Now, we’re accepting our second value parameter - a tuple (k, h) :: (a, b). We can refer to a and b in this signature because we’ve introduced them in step 1.
  4. Finally, we’re supplying the type @a to our function f in the left hand of the tuple, and the type @b to the type in the right. This allows our types to check.

Let’s see what it looks like to call this function. To give us some more interesting types to work with, we’ll include Int and Bool literals.

foo :: (Int, Bool)
foo = 
    applyBoth
        @Int @Bool 
        _f
        (3, True)

We haven’t decided what _f will look like exactly, but the type of the value is forall x. x -> x. So, syntactically, we’ll introduce our type variable, then our value-variable:

foo :: (Int, Bool)
foo = 
    applyBoth
        @Int @Bool 
        (forall x. \(a :: x) -> (_ :: x))
        (3, True)

As it happens, the only value we can possibly plug in here is a :: x to satisfy this. We know absolutely nothing about the type x, so we cannot do anything with it.

foo :: (Int, Bool)
foo = 
    applyBoth
        @Int @Bool 
        (forall x. \(a :: x) -> a)
        (3, True)
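For comparison, the explicit notation above corresponds directly to real GHC Haskell with the RankNTypes and TypeApplications extensions (a sketch; GHC would also happily infer the type applications):

```haskell
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

-- Value-level "forall a. forall b." becomes the quantifiers in the signature;
-- the explicit type applications @a and @b are written with TypeApplications.
applyBoth :: forall a b. (forall x. x -> x) -> (a, b) -> (a, b)
applyBoth f (k, h) = (f @a k, f @b h)

foo :: (Int, Bool)
foo = applyBoth @Int @Bool (\a -> a) (3, True)
```

GHC checks the lambda `\a -> a` against the polymorphic type `forall x. x -> x`, just as our calculus demands.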

Tug of War

applyBoth is an awful example of RankNTypes because there’s literally nothing useful you can do with it. The reason is that we don’t give the caller of the function any options! By giving the caller of the function more information, they can do more useful and interesting things with the results.

This mirrors the guarantee of parametric polymorphism. The less that we know about our inputs, the less we can do with them - until we get to types like const :: a -> b -> a where the implementation is completely constrained.

What this means is that we provide more information as arguments to the callback function!

Let’s consider this other type:

applyBothList :: (forall x. [x] -> Int) -> ([a], [b]) -> (Int, Int)
applyBothList f (as, bs) = 
    (f as, f bs)

Now the function knows a good bit more: we have a list as our input (even if we don’t know anything about the element type), and the output is an Int. Let’s translate this to our polymorphic lambda calculus.

applyBothList =
    forall a. forall b.
    \(f :: forall x. [x] -> Int) ->
    \((as, bs) :: ([a], [b])) ->
    ( f @a as, f @b bs )

When we call this function, this is what it looks like:

applyBothList
    @Int @Char
    (forall x. \(xs :: [x]) -> length @x xs * 2)
    ( [1, 2, 3], ['a', 'b', 'c', 'd'] )
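Rendered as real GHC Haskell (with RankNTypes), this type-checks and runs; the example call and expected result below are mine:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The callback now knows its argument is a list, so it can use length.
applyBothList :: (forall x. [x] -> Int) -> ([a], [b]) -> (Int, Int)
applyBothList f (as, bs) = (f as, f bs)

example :: (Int, Int)
example = applyBothList (\xs -> length xs * 2) ([1, 2, 3 :: Int], "abcd")
-- example == (6, 8)
```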


Type Class Dictionaries

In Haskell, a type class constraint is elaborated into a record-of-functions that is indexed by the type.

class Num a where
    fromInteger :: Integer -> a
    (+) :: a -> a -> a
    (*) :: a -> a -> a
    (-) :: a -> a -> a
    -- etc...

-- under the hood, this is the same thing:

data NumDict a = NumDict
    { fromInteger :: Integer -> a
    , (+) :: a -> a -> a
    , (-) :: a -> a -> a
    , (*) :: a -> a -> a
    }

When you have a function that accepts a Num a argument, GHC turns it into a NumDict a and passes it explicitly.

-- Regular Haskell:
square :: Num a => a -> a
square a = a * a

-- What happens at runtime:
square :: NumDict a -> a -> a
square NumDict {..} a = a * a

Or, for a simpler variant, let’s consider Eq.

-- Regular Haskell:
class Eq a where
    (==) :: a -> a -> Bool

-- Runtime dictionary:
newtype EqDict a = EqDict { (==) :: a -> a -> Bool }

-- Regular:
allEqual :: (Eq a) => a -> a -> a -> Bool
allEqual x y z =
    x == y && y == z && x == z

-- Runtime dictionary:
allEqual :: EqDict a -> a -> a -> a -> Bool
allEqual (EqDict (==)) x y z =
    x == y && y == z && x == z

(Note that binding a variable name to an operator is perfectly legal!)

One common way to extend the power or flexibility of a RankNTypes program is to include allowed constraints in the callback function. Knowing how and when things come into scope can be tricky, but if we remember our polymorphic lambda calculus, this becomes easy.

Consider this weird signature:

weirdNum :: (forall a. Num a => a) -> String
weirdNum someNumber = 
    show (someNumber @Int)

This isn’t exactly a function. What sort of things can we call here?

Well, we have to produce an a. And we know that we have a Num a constraint. This means we can call fromInteger :: Integer -> a. And, we can also use any other Num methods - so we can add to it, double it, square it, etc.

So, calling it might look like this:

main = do
    putStrLn $ weirdNum (fromInteger 3 + 6 * 2)

Let’s elaborate this to our lambda calculus. We’ll convert the type class constraint into an explicit dictionary, and then everything should work normally.

weirdNum =
    \(number :: forall a. NumDict a -> a) ->
        show @Int intShowDict (number @Int intNumDict)

Now, let’s call this:

weirdNum
    ( forall a. 
      \(dict :: NumDict a) -> 
          fromInteger dict 3
    )
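Here is the same dictionary elaboration as runnable GHC Haskell (a sketch of mine; field names like fromIntegerD are my own, chosen to avoid clashing with Prelude):

```haskell
{-# LANGUAGE RankNTypes #-}

-- An explicit Num dictionary, with just the methods we need.
data NumDict a = NumDict
    { fromIntegerD :: Integer -> a
    , plusD        :: a -> a -> a
    , timesD       :: a -> a -> a
    }

-- The dictionary GHC would build for Num Int.
intNumDict :: NumDict Int
intNumDict = NumDict fromInteger (+) (*)

-- weirdNum with the constraint elaborated into an explicit dictionary argument.
weirdNumD :: (forall a. NumDict a -> a) -> String
weirdNumD number = show (number intNumDict)

-- fromInteger 3 + 6 * 2, elaborated into explicit dictionary passing:
exampleD :: String
exampleD = weirdNumD
    (\d -> plusD d (fromIntegerD d 3)
                   (timesD d (fromIntegerD d 6) (fromIntegerD d 2)))
-- exampleD == "15"
```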

More on the Lambda Calculus

If you’ve found this elaboration interesting, you may want to consider reading Type Theory and Formal Proof. This book is extremely accessible, and it taught me almost everything I know about the lambda calculus.

November 30, 2021 12:00 AM

November 29, 2021

Monday Morning Haskell

See and Believe: Visualizing with Gloss

Last week I discussed AI for the first time in a while. We learned about the Breadth-First-Search (BFS) algorithm, which is useful in a lot of simple AI applications. But of course writing abstract algorithms isn't as interesting as seeing them in action. So this week I'll re-introduce Gloss, a really neat framework I've used to make some simple games in Haskell.

This framework simplifies a lot of the graphical work one needs to do to make stuff show up on screen and it allows us to provide Haskell code to back it up and make all the logic interesting. I think Gloss also gives a nice demonstration of how we really want to structure a game and, in some sense, any kind of interactive program. We'll break down how this structure works as we make a simple display showing the BFS algorithm in practice. We'll actually have a "player" piece navigating a simple maze by itself.

To see the complete code, take a look at this GitHub repository! The Gloss code is all in the Game module.

Describing the World

In Haskell, the first order of business is usually to define our most meaningful types. Last week we did that by specifying a few simple aliases and types to use for our search function:

type Location = (Int, Int)
data Cell = Empty | Wall
  deriving (Eq)
type Grid = A.Array Location Cell

When we're making a game though, there's one type that is way more important than the rest, and this is our "World". The World describes the full state of the game at any point, including both mutable and immutable information.

In describing our simple game world, we might view three immutable elements, the fundamental constraints of the game. These are the "start" position, the "end" position, and the grid itself. However, we'll also want to describe the "current" position of our player, which can change each time it moves. This gives us a fourth field.

data World = World
  { playerLocation :: Location
  , startLocation :: Location
  , endLocation :: Location
  , worldGrid :: Grid
  }

We can then supplement this by making our "initial" elements. We'll have a base grid that just puts up a simple wall around our destination, and then make our starting World.

-- looks like:
-- S o o o
-- o x x o
-- o x F o
-- o o o o
baseGrid :: Grid
baseGrid =
  (A.listArray ((0, 0), (3, 3)) (replicate 16 Empty))
  A.//
  [((1, 1), Wall), ((1, 2), Wall), ((2, 1), Wall)]

initialWorld :: World
initialWorld = World (0, 0) (0, 0) (2, 2) baseGrid

Playing the Game

We've got our main type in place, but we still need to pull it together in a few different ways. The primary driver function of the Gloss library is play. We can see its signature here.

play :: Display -> Color -> Int
  -> world
  -> (world -> Picture)
  -> (Event -> world -> world)
  -> (Float -> world -> world)
  -> IO ()

The main pieces of this are driven by our World type. But it's worth briefly addressing the first three. The Display describes the viewport that will show up on our screen. We can give it particular dimensions and offset:

windowDisplay :: Display
windowDisplay = InWindow "Window" (200, 200) (10, 10)

The next two values just indicate the background color of the screen, and the tick rate (how many game ticks occur per second). And after those, we just have our initial world value as we made above.

main :: IO ()
main = play
  windowDisplay white 1 initialWorld

But now we have three more functions that are clearly driven by our World type. The first is a drawing function. It takes the current state of the world and creates a Picture to show on screen.

The second function is an input handler, which takes a user input event as well as the current world state, and returns an updated world state, based on the event. We won't address this in this article.

The third function is an update function. This describes how the world naturally evolves without any input from tick to tick.

For now, we'll make type signatures as we prepare to implement these functions for ourselves. This allows us to complete our main function:

main :: IO ()
main = play
  windowDisplay white 20 initialWorld
  drawingFunc inputHandler updateFunc

drawingFunc :: World -> Picture

inputHandler :: Event -> World -> World

updateFunc :: Float -> World -> World

Let's move on to these different world-related functions.

Updating the World

Now let's handle updates to the world. To start, we'll make a stubbed out input-handler. This will just return the input world each tick.

inputHandler :: Event -> World -> World
inputHandler _ w = w

Now let's describe how the world will naturally evolve/update with each game tick. For this step, we'll apply our BFS algorithm. So all we really need to do is retrieve the locations and grid out of the world and run the function. If it gives us a non-empty list, we'll substitute the first square in that path for our new location. Otherwise, nothing happens!

updateFunc :: Float -> World -> World
updateFunc _ w@(World playerLoc _ endLoc grid) =
  case path of
    (first : _) -> w {playerLocation = first}
    _ -> w
  where
    path = bfsSearch grid playerLoc endLoc

Note that this function receives an extra "float" argument. We don't need to use this.
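The update function relies on bfsSearch from last week's post. For a self-contained picture, here is a minimal stand-in with the same signature (an illustrative sketch of mine, not the original implementation):

```haskell
import qualified Data.Array as A
import qualified Data.Map as M
import qualified Data.Set as Set

type Location = (Int, Int)
data Cell = Empty | Wall
  deriving (Eq)
type Grid = A.Array Location Cell

-- Breadth-first search from start to end; returns the path (excluding the
-- start square, with the first step at the head), or [] if no path exists.
bfsSearch :: Grid -> Location -> Location -> [Location]
bfsSearch grid start end = go (Set.singleton start) [start] M.empty
  where
    go _ [] _ = []  -- queue exhausted: no path
    go seen (loc : rest) parents
      | loc == end = reverse (walkBack loc parents)
      | otherwise =
          let next = [ n | n <- neighbors loc
                         , A.inRange (A.bounds grid) n
                         , grid A.! n == Empty
                         , not (n `Set.member` seen) ]
          in go (foldr Set.insert seen next)
                (rest ++ next)
                (foldr (\n -> M.insert n loc) parents next)
    neighbors (r, c) = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    -- follow parent pointers from the goal back to (but excluding) the start
    walkBack loc parents
      | loc == start = []
      | otherwise = loc : walkBack (parents M.! loc) parents
```

On the baseGrid above, this finds a six-step path from (0, 0) around the walls to (2, 2).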


Drawing the World

Finally, we need to draw our world so we can see what is going on! To start, we need to remember the difference between the "pixel" positions on the screen, and the discrete positions in our maze. The former are floating point values up to (200.0, 200.0), while the latter are integer numbers up to (3, 3). We'll make a type to store the center and corner points of a given cell, as well as a function to generate this from a Location.

A lot of this is basic arithmetic, but it's easy to go wrong with sign errors and off-by-one errors!

data CellCoordinates = CellCoordinates
  { cellCenter :: Point
  , cellTopLeft :: Point
  , cellTopRight :: Point
  , cellBottomRight :: Point
  , cellBottomLeft :: Point
  }

-- First param: (X, Y) offset from the center of the display to center of (0, 0) cell
-- Second param: Full width of a cell
locationToCoords :: (Float, Float) -> Float -> Location -> CellCoordinates
locationToCoords (xOffset, yOffset) cellSize (x, y) = CellCoordinates
  (centerX, centerY)
  (centerX - halfCell, centerY + halfCell) -- Top Left
  (centerX + halfCell, centerY + halfCell) -- Top Right
  (centerX + halfCell, centerY - halfCell) -- Bottom Right
  (centerX - halfCell, centerY - halfCell) -- Bottom Left
  where
    (centerX, centerY) = (xOffset + (fromIntegral x) * cellSize, yOffset - (fromIntegral y) * cellSize)
    halfCell = cellSize / 2.0

Now we need to use these calculations to draw pictures based on the state of our world. First, let's write a conversion that factors in the specifics of the display, which allows us to pinpoint the center of the player marker.

drawingFunc :: World -> Picture
drawingFunc world = Pictures [] -- we'll fill in this list of pictures next
  where
    conversion = locationToCoords (-75, 75) 50
    (px, py) = cellCenter (conversion (playerLocation world))

Now we can draw a circle to represent that! We start by making a Circle that is 10 pixels in diameter. Then we translate it by the coordinates. Finally, we'll color it red. We can add this to a list of Pictures we'll return.

drawingFunc :: World -> Picture
drawingFunc world = Pictures
  [ playerMarker ]
  where
    -- Player Marker
    conversion = locationToCoords (-75, 75) 50
    (px, py) = cellCenter (conversion (playerLocation world))
    playerMarker = Color red (translate px py (Circle 10))

Now we'll make Polygon elements to represent special positions on the board. Using the corner elements from CellCoordinates, we can draw a blue square for the start position and a green square for the final position.

drawingFunc :: World -> Picture
drawingFunc world = Pictures
  [ startPic, endPic, playerMarker ]
  where
    -- Player Marker
    conversion = locationToCoords (-75, 75) 50
    (px, py) = cellCenter (conversion (playerLocation world))
    playerMarker = Color red (translate px py (Circle 10))

    -- Start and End Pictures
    (CellCoordinates _ stl str sbr sbl) = conversion (startLocation world)
    startPic = Color blue (Polygon [stl, str, sbr, sbl])
    (CellCoordinates _ etl etr ebr ebl) = conversion (endLocation world)
    endPic = Color green (Polygon [etl, etr, ebr, ebl])

Finally, we do the same thing with our walls. First we have to filter all the elements in the grid to get the walls. Then we must make a function that will take the location and make the Polygon picture. Finally, we combine all of these into one picture by using a Pictures list, mapped over these walls. Here's the final look of our function:

drawingFunc :: World -> Picture
drawingFunc world = Pictures
  [ gridPic, startPic, endPic, playerMarker ]
  where
    -- Player Marker
    conversion = locationToCoords (-75, 75) 50
    (px, py) = cellCenter (conversion (playerLocation world))
    playerMarker = Color red (translate px py (Circle 10))

    -- Start and End Pictures
    (CellCoordinates _ stl str sbr sbl) = conversion (startLocation world)
    startPic = Color blue (Polygon [stl, str, sbr, sbl])
    (CellCoordinates _ etl etr ebr ebl) = conversion (endLocation world)
    endPic = Color green (Polygon [etl, etr, ebr, ebl])

    -- Drawing the Pictures for the Walls
    walls = filter (\(_, w) -> w == Wall) (A.assocs $ worldGrid world)
    mapPic (loc, _) = let (CellCoordinates _ tl tr br bl) = conversion loc
                      in Color black (Polygon [tl, tr, br, bl])
    gridPic = Pictures (map mapPic walls)

And now when we play the game, we'll see our circle navigate to the goal square!


Next time, we'll look at a more complicated version of this kind of game world!

by James Bowen at November 29, 2021 03:30 PM

November 28, 2021

Joachim Breitner

Zero-downtime upgrades of Internet Computer canisters

TL;DR: Zero-downtime upgrades are possible if you stick to the basic actor model.


DFINITY’s Internet Computer provides a kind of serverless compute platform, where the services are WebAssembly programs called “canisters”. These services run without stopping (or at least that’s what it feels like from the service’s perspective; this is called “orthogonal persistence”), and process one message after another. Messages not only come from the outside (“ingress” calls), but are also exchanged between canisters.

On top of these uni-directional messages, the system provides the concept of “inter-canister calls”, which associates a response message with the outgoing message, and guarantees that a response will come. This RPC-like interface allows canister developers to program in the popular async/await model, where these inter-canister calls look almost like normal function calls, and the subsequent code is suspended until the response comes back.

The problem

This is all very well, until you try to upgrade your canister, i.e. install new code to fix a bug or add a feature. Because if you used the await pattern, there may still be suspended computations waiting for the response. If you swap out the program now, the code of that suspended computation will no longer be present, and the response cannot be handled! Worse, because of an infelicity with the current system’s API, when the response comes back, it may actually corrupt your service’s state.

That is why upgrading a canister requires stopping it first, which means waiting for all outstanding calls to come back. During this time, your canister is not available for new calls (so there is downtime), and worse, the length of the downtime is at the whims of the canisters you called – they could withhold the response ad infinitum, rendering your canister unupgradeable.

Clearly, this is not acceptable for any serious application. In this post, I’ll explore some of the ways to mitigate this problem, and how to create canisters that are safely and instantaneously (no downtime) upgradeable.

It’s a spectrum

Some canisters are trivially upgradeable, for others all hope is lost; it depends on what the canister does and how. As an overview, here is the spectrum:

  1. A canister that never performs inter-canister calls can always be upgraded without stopping.
  2. A canister that only does one-way calls, and does them in a particular way (see below), can always be upgraded without stopping.
  3. A canister that performs calls, and where it is acceptable to simply drop outstanding responses, can always be upgraded without stopping, once the System API has been improved and your Canister Development Kit (CDK; Motoko or Rust) has adapted.
  4. A canister that performs calls, but uses explicit continuations to handle responses instead of the await convenience, based on an eventually fixed System API, can be upgraded without stopping, and will even handle responses afterwards.
  5. A canister that uses await to do inter-canister calls cannot be upgraded without stopping.

In this post I will explain 2, which is possible now, in more detail. Variants 3 and 4 only become reality if and when the System API has improved.

One-way calls

A one-way call is a call where you don’t care about the response; neither the replied data, nor possible failure conditions.

Since you don’t care about the response, you can pass an invalid continuation to the system (technical detail: a Wasm table index of -1). Because it is invalid for any (realistic) Wasm module, it will stay invalid even after an upgrade, and the problem of silent corruption mentioned above is avoided. And otherwise it’s fine for this to be invalid: it means the canister “traps” once the response comes back, which is harmless (and possibly even cheaper than a do-nothing computation).

This requires your CDK to support this kind of call. Somewhat incidentally, Motoko (and Candid) actually have the concept of a one-way call in their type system, namely shared functions with return type () instead of async ... (Motoko is actually older than the system, and not every prediction about what the system will provide has proven successful). So, pending this PR to be released, Motoko will implement one-way calls in this way. On Rust, you have to use the System API directly or wait for cdk-rs to provide this ability (patches welcome, happy to advise).

You might wonder: How are calls useful if I don’t get to look at the response? Of course, this is a set-back – calls with responses are useful, and await is convenient. And if you have to integrate with an existing service that only provides normal calls, you are out of luck.

But if you get to design the canister and all called canisters together, it may be possible to use only one-way messages. You’d be programming in the plain actor model now, with all its advantages (simple concurrency, easy to upgrade, general robustness).

Consider for example a token ledger canister, not unlike the ICP ledger canister. For the most part, it doesn’t have to do any outgoing calls (and is thus trivially upgradeable). But say we need to add notify functionality, where the ledger canister tells other canisters about a transaction. This is a good example for a one-way call: maybe the ledger canister doesn’t care if that notification was received? The ICP ledger does care (once it comes back successful, this particular notification cannot be sent again), but maybe your ledger can do it differently: let the other canister confirm the receipt via another one-way call, instead of via the reply; or simply charge for each notification and do not worry about repeated notifications.

Maybe you want to add archiving functionality, where the ledger canister streams its data to an archive canister. There, again, instead of using successful responses to confirm receipt, the archive canister can ping the ledger canister with the latest received index directly.

Yes, it changes the programming model a bit, and all involved parties have to play together, but the gain (zero-downtime upgrades) is quite valuable, and removes a fair number of other sources of issues.

And in the future?

The above is possible with today’s Internet Computer. If the System API improves the way I hope it will, you have a possible middle ground: you still don’t get to use await and instead have to write your response handlers as separate functions, but this way you can call any canister again, and you get the system’s assistance in mapping responses to calls. With this in place, any canister can be rewritten to a form that supports zero-downtime upgrades, without affecting its interface or what the canister can do.

by Joachim Breitner at November 28, 2021 05:11 PM

November 27, 2021

Magnus Therning

Fallback of actions

In a tool I'm writing I want to load a file that may reside on the local disk, but if it isn't there I want to fetch it from the web. Basically it's very similar to having a cache and dealing with a miss, except in my case I don't populate the cache.

Let me first define the functions to play with

loadFromDisk :: String -> IO (Either String Int)
loadFromDisk k@"bad key" = do
    putStrLn $ "local: " <> k
    pure $ Left $ "no such local key: " <> k
loadFromDisk k = do
    putStrLn $ "local: " <> k
    pure $ Right $ length k

loadFromWeb :: String -> IO (Either String Int)
loadFromWeb k@"bad key" = do
    putStrLn $ "web: " <> k
    pure $ Left $ "no such remote key: " <> k
loadFromWeb k = do
    putStrLn $ "web: " <> k
    pure $ Right $ length k

Discarded solution: using the Alternative of IO directly

It's fairly easy to get the desired behaviour but Alternative of IO is based on exceptions which doesn't strike me as a good idea unless one is using IO directly. That is fine in a smallish application, but in my case it makes sense to use tagless style (or ReaderT pattern) so I'll skip exploring this option completely.

First attempt: lifting into the Alternative of Either e

There's an instance of Alternative for Either e in version 0.5 of transformers. It's deprecated and it's gone in newer versions of the library as one really should use Except or ExceptT instead. Even if I don't think it's where I want to end up, it's not an altogether bad place to start.

Now let's define a function using liftA2 (<|>) to make it easy to see what the behaviour is

fallBack ::
    Applicative m =>
    m (Either String res) ->
    m (Either String res) ->
    m (Either String res)
fallBack = liftA2 (<|>)

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such remote key: bad key"

The first example shows that it falls back to loading from the web, and the second one shows that it's only the last failure that survives. The latter part, that only the last failure survives, isn't ideal but I think I can live with that. If I were interested in collecting all failures I would reach for Validation from validation-selective (there's one in validation that should work too).

So far so good, but the next example shows a behaviour I don't want

λ> loadFromDisk "good key" `fallBack` loadFromWeb "good key"
local: good key
web: good key
Right 8

or to make it even more explicit

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:79:14 in base:GHC.Err
  undefined, called at <interactive>:451:36 in interactive:Ghci4

There's no short-circuiting!1

The behaviour I want is of course that if the first action is successful, then the second action shouldn't take place at all.

It looks like either <|> is strict in its second argument, or maybe it's liftA2 that forces it. I've not bothered digging into the details, it's enough to observe it to realise that this approach isn't good enough.
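A quick way to convince yourself that both actions always run under liftA2 (a sketch of mine, counting effects with an IORef, and using a hand-written orElse in place of the deprecated (<|>) for Either — it combines results the same way):

```haskell
import Control.Applicative (liftA2)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- liftA2 g a b = g <$> a <*> b, and in IO (<*>) performs both effects
-- before g ever looks at the results — hence no short-circuiting.
countBoth :: IO (Either String Int, Int)
countBoth = do
    counter <- newIORef (0 :: Int)
    let action v = modifyIORef counter (+ 1) >> pure v
        orElse l@(Right _) _ = l
        orElse (Left _) r = r
    res <- liftA2 orElse (action (Right 1 :: Either String Int)) (action (Right 2))
    ran <- readIORef counter
    pure (res, ran)
```

Running countBoth yields (Right 1, 2): the first result wins, yet the counter shows both actions executed.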

Second attempt: cutting it short, manually

Fixing the lack of short-circuiting the evaluation after the first success isn't too difficult to do manually. Something like this does it

fallBack ::
    Monad m =>
    m (Either String a) ->
    m (Either String a) ->
    m (Either String a)
fallBack first other = do
    first >>= \case
        r@(Right _) -> pure r
        r@(Left _) -> (r <|>) <$> other

It does indeed show the behaviour I want

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such remote key: bad key"

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
Right 8

Excellent! And to switch over to use Validation one just has to switch constructors: Right becomes Success and Left becomes Failure. Though collecting the failures by concatenating strings isn't the best idea of course. Switching to some other Monoid (that's the constraint on the failure type) isn't too difficult.

fallBack ::
    (Monad m, Monoid e) =>
    m (Validation e a) ->
    m (Validation e a) ->
    m (Validation e a)
fallBack first other = do
    first >>= \case
        r@(Success _) -> pure r
        r@(Failure _) -> (r <|>) <$> other

Third attempt: pulling failures out to MonadPlus

After writing the fallBack function I still wanted to explore other solutions. There's almost always something more out there in the Haskell ecosystem, right? So I asked in the #haskell-beginners channel on the Functional Programming Slack. The way I asked the question resulted in answers that iterate over a list of actions, cutting at the first success.

The first suggestion had me a little confused at first, but once I re-organised the helper function a little it made more sense to me.

mFromRight :: MonadPlus m => m (Either err res) -> m res
mFromRight = (either (const mzero) return =<<)

To use it put the actions in a list, map the helper above, and finally run asum on it all2. I think it makes it a little clearer what happens if it's rewritten like this.

firstRightM :: MonadPlus m => [m (Either err res)] -> m res
firstRightM = asum . fmap go
  where
    go m = m >>= either (const mzero) return

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "good key"]
local: bad key
web: good key

λ> firstRightM [loadFromDisk "good key", undefined]
local: good key

So far so good, but I left out the case where both fail, because that's sort of the fly in the ointment here

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"]
local: bad key
web: bad key
*** Exception: user error (mzero)

It's not nice to be back to deal with exceptions, but it's possible to recover, e.g. by appending <|> pure 0.

λ> firstRightM [loadFromDisk "bad key", loadFromWeb "bad key"] <|> pure 0
local: bad key
web: bad key

However that removes the ability to deal with the situation where all actions fail. Not nice! Add to that the difficulty of coming up with a good MonadPlus instance for an application monad; one basically has to resort to the same thing as for IO, i.e. to throw an exception. Also not nice!

Fourth attempt: wrapping in ExceptT to get its Alternative behaviour

This was another suggestion from the Slack channel, and it is the one I like the most. Again it was suggested as a way to stop at the first successful action in a list of actions.

firstRightM ::
    (Foldable t, Functor t, Monad m, Monoid err) =>
    t (m (Either err res)) ->
    m (Either err res)
firstRightM = runExceptT . asum . fmap ExceptT

Which can be used similarly to the previous one. It's also easy to write a variant of fallBack for it.

fallBack ::
    (Monad m, Monoid err) =>
    m (Either err res) ->
    m (Either err res) ->
    m (Either err res)
fallBack first other = runExceptT $ ExceptT first <|> ExceptT other

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "good key"
local: bad key
web: good key
Right 8

λ> loadFromDisk "good key" `fallBack` undefined
local: good key
Right 8

λ> loadFromDisk "bad key" `fallBack` loadFromWeb "bad key"
local: bad key
web: bad key
Left "no such local key: bad keyno such remote key: bad key"

Yay! This solution has the short-circuiting behaviour I want, as well as collecting all errors on failure.


I'm still a little disappointed that liftA2 (<|>) isn't short-circuiting as I still think it's the easiest of the approaches. However, it's a problem that one has to rely on a deprecated instance of Alternative for Either String, but switching to use Validation would be only a minor change.

Manually writing the fallBack function, as I did in the second attempt, results in very explicit code, which is nice as it often reduces the cognitive load for the reader. It's a contender, but using the deprecated Alternative instance is problematic, and introducing Validation, an arguably not very common type, takes away a little of the appeal.

In the end I prefer the fourth attempt. It behaves exactly like I want, and even though ExceptT lives in transformers (I pull it in via mtl) I feel it is in such wide use that most Haskell programmers will be familiar with it.

One final thing to add is that the documentation of Validation is an excellent inspiration when it comes to the behaviour of its instances. I wish that the documentation of other packages, in particular commonly used ones like base, transformers, and mtl, would be more like it.



I'm not sure if it's a good term to use in this case as Wikipedia says it's for Boolean operators. I hope it's not too far a stretch to use it in this context too.


In the version of base I'm using there is no asum, so I simply copied the implementation from a later version:

asum :: (Foldable t, Alternative f) => t (f a) -> f a
asum = foldr (<|>) empty

November 27, 2021 10:31 AM

November 25, 2021

Tweag I/O

The Varieties of the Haskelling Experience

Recently, a group of Haskellers within Tweag had a knowledge-sharing event where we shared our varied Haskell setups with each other, and learned some nice tricks and tips for every-day life.

The idea was raised of sharing these findings via the blog, so without further ado, let’s explore the varieties of the Tweag Haskelling experience by covering a few of our personal setups!

Nicolas Frisby

  • Editor: emacs
  • Other main tools: grep, cabal, inotify/entr around cabal build all, my own prototype tags generator.
  • Explanation: barebones emacs = syntax highlighting + basic Haskell mode for minimal indentation suggestions + a hack of gtags.el for querying my Tweag tags db
  • I love: fast start-up, tags for instance declarations, tags only require parse not typecheck, never hangs, never crashes, all state is persistent, independence of all the tools
  • Could be better: my tags prototype only handles cabal.project, just one at a time, not the source-repository-package entries (a cabal limitation, but fix on its way), and I still use grep for finding use sites, no Haddocks integration
  • Trivia: aliases of any kind worry me because I won’t be able to work in a fresh/foreign machine/login; these two are my only exceptions

    • git config --global alias.lg "log --first-parent --color --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit"
    • git config --global alias.lgtree "log --graph --color --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit"

Richard Eisenberg

  • Editor: Visual Studio Code
  • Other main tools: haskell-language-server, cabal, cabal-env, ghci, emacs’s smerge-mode.
  • Explanation: A recent convert from emacs, I’m enjoying the easier discoverability of commands in VSCode — and the power of the Haskell Language Server.
  • I love: Not worrying about directories when finding files in VSCode
  • Could be better: The switch has made me realize I need to upgrade my 5-year-old laptop; VSCode’s merge conflict resolution is not as good as smerge-mode.
  • Trivia: alias gitlog='git log --graph --full-history --all --color --pretty=format:"%x1b[31m%h%x09%x1b[32m%d%x1b[0m%x20%s"'. Try it. Be happy.

Facundo Domínguez

  • Editor: Vim
  • Other main tools: bash, hasktags, grep, hoogle,
  • Explanation: Every project has its own ways of building, and in my experience, figuring out IDE configuration for each one takes so long that I don’t feel it helps me read or write code fast enough to deserve it. Vim is almost always already installed when I approach a new development environment.
  • I love: I can focus on the task to solve as fast as possible.
  • Could be better: hasktags sometimes sends you to the wrong definition. A more sophisticated indexing solution could give more precision, without necessarily complicating the setup.
  • Trivia: I rely a lot on the undo history and the language-agnostic autocompletion that vim provides.

Thomas Bagrel

  • Editor: VS Code + Haskell extension pack (HLS, Hlint, formatter using Ormolu)
  • Other main tools: nix, hoogle,
  • Explanation: I’m still a beginner with haskell, and I usually rely a lot on IDE features (especially autocompletion, real-time type inference, documentation at call site) to learn a new programming language. However, this strategy doesn’t work that well with haskell (for various reasons I could detail if someone is interested). Anyway, VS Code with haskell extensions gives the closest experience to what I’m used to have. Previously, I extensively used JetBrains IDEs (for Java, Scala, Rust, JavaScript and Python dev).
  • I love: the IDE is decently fast (compared to JetBrains IDEs for example)
  • Could be better:

    • I still don’t know how to navigate in the code quickly with VSC,
    • I still need to learn the keyboard shortcuts,
    • Refactoring capabilities and “find references” actions are nowhere near as good as IntelliJ for Rust dev for example,
    • Documentation popup (at call site) works well most of the time, but I can’t jump to the documentation of a type itself in some function signature for example,
    • I can’t jump to the definition of something defined in a library.
  • Conclusion: I really like large (and slow) JetBrains IDEs, because I can often access everything from the IDE (I don’t need to fetch documentation online most of the time, ease-of-discoverability is great, and most importantly, muscle memory is here). But because their Haskell support is not really good at the moment (but maybe I need to try harder, and tweak/configure the haskell plugin better), I fall back to VSCode. I still feel quite lost to be honest.

Clément Hurlin

  • Editor: NeoVim
  • Other main tools:

    • Plugins:
    • fzf for fuzzy-finding things (files, symbols),
    • coc.nvim as the LSP client, which delegates to haskell-language-server under the hood,
    • neoformat for formatting with ormolu,
    • vim-grepper to search for text with git ls-files under the hood, although alternatively I rely more and more on the LSP “search” feature (which opens a nice preview window).
    • Other tools:
    • Usually I have ghcid running in a terminal on the side, because the LSP sometimes doesn’t list all errors (the error appears only when you open the file).
    • hls-tactics-plugin for automatic case splitting (hole filling never works :-/)
    • hoogle in a browser nearby :-)
  • Trivia: My history: like Thomas, I used to exclusively use (and actually develop) large IDEs such as Eclipse, but I always used the vim mode in them. When starting Haskell I began with vscode, but it became too slow as my project grew. By contrast, neovim with coc.nvim and the LSP under the hood is really fast and easy to set up 👍 (and I use it with OCaml too).

Torsten Schmits

  • Editor: NeoVim
  • Other main tools: nix, zsh, tmux, ghcid, HLS, CoC, tree-sitter
  • Explanation: I maintain my own haskell-nix toolkit that’s rather specific to my workflow for the projects I work on mostly alone, and do some ad-hoc setup for those I collaborate on. Luckily, HLS is making it easier to hack on a new project. I’ve also built a few Neovim plugins in Haskell that implement some IDE-like features, like running ghcid in a tmux pane.
  • I love: Nix’s reproducibility is a total paradigm shift for setting up dev environments, and I’m very happy about the features that HLS does well.
  • Could be better: Maybe it’s my idiosyncratic workflow, but HLS still has significant problems. Before version 1.2 it kept crashing after 1-5 minutes on lots of my projects (it was an insane relief when this was fixed), and it still doesn’t work well with multi-project setups.
  • Trivia: Did you know I wrote the Haskell grammar for tree-sitter?

Guillaume Desforges

  • Editor: VS Code
  • Other main tools: Nix, “Nix Environment Selector” (VSCode extension), “Haskell” (VSCode extension)
  • Explanation: I make a shell.nix to get stack and haskell-language-server (and requirements for plugins), load it into the VSCode env, then create my project with stack. I write a small hie.yaml to make the cradle explicit. You need to edit settings.json so that the Haskell VSCode extension actually uses the HLS from Nix.
  • I love: “it just works”™, ease of setup (at least relatively), getting full HLS
  • Could be better: first load of the nix shell can be a bit long, and Nix Environment selector has no output, so I’d advise loading it in shell before loading it using the extension.
  • Trivia: if you are on NixOS, stack enables nix build automatically so you might need to define a stack-shell.nix and specify it in stack.yaml.

Jeff Young

  • Editor: Spacemacs (emacs)
  • Other main tools: Nix! I’ve used NixOS+Xmonad as a daily driver for 3 years now. Fish shell. Rigrep and fzf for searching. Magit, a legendary git UX; I don’t know how people live without it.
  • Explanation: I mostly work on Haskell and GHC, but in my spare time I’ve been contributing and exploring J, BQN, and Dyalog-APL. So I’ve been working on better packaging in nixpkgs for each of these and writing concomitant layers for Spacemacs. This means I’m in way too deep into emacs to switch now. I track all my goals in org-mode including clocking time, and I have syncthing set up to sync my org files on a home server I built. This then syncs to my phone for notifications, so the interface between all digital objects in my home is emacs+org. All of this to say that I’m probably a lifer when it comes to emacs. Although I am thinking of switching and contributing to Doom emacs (Spacemacs has been on version 0.3 for 3 years now?!)
  • I love:

    • A small set of extremely powerful tools. Once you grok it, it is the hammer for every problem (and therein lies the problem with emacs!).
    • Available everywhere and is free. Focus on backwards compatibility.
    • Built in documentation for everything. I simply hit SPC h d f over any function and get documentation for it. This integrates in varying ways with Haskell docs but nothing is as good as common lisp’s SLIME or emacs-lisp’s built in documentation.
    • The docs do what they say, say what they mean and are up to date. I can read some documentation that describes a configuration change, then go do that exact change and the predicted effect will occur.
    • Emacs is a lisp machine. Don’t like something? Then sling some code and overwrite it with a hook, or if it is old then directly since emacs-lisp is dynamically scoped.
    • TRAMP mode allows me to SPC f f, write /sudo::/etc, and get a sudo’d shell into /etc. This same feature of emacs allows me to remote into any server and use my emacs configuration from my desktop. This means I never need to export my emacs configuration to the remote server; I simply need to use TRAMP to ssh in and I’m all set.
    • Copy-paste with the kill ring allows me to hit SPC r y and see the last n things I’ve killed or copied, in contrast with most clipboards, which just save the last thing copied.
    • emacs-everywhere
  • Could be better:

    • Latency, although this is improving with emacsGcc (a jit’d version of emacs)
    • No true Async!
    • Various packages are missing documentation. I’m sure this bites people trying to use emacs to work on Haskell. For me I’ve gotten used to reading source code in lieu of reading docs.
    • It is easy to get lost. There are a million ways to do the same thing and there are numerous packages that solve the same problem. Just understanding org-mode means reading docs for org org-babel org-ox org-publish etc.
    • Setting up HLS and LSP in spacemacs on NixOS is a very involved process and is not beginner friendly in any way. I struggled to get it working, but now it is pretty good! I even have it set up for GHC, which was a minor miracle.
  • Trivia: I’d regularly lose workflow without fish’s history autocompletion.

Karol Czulkowski

  • Editor: emacs + haskell-mode + haskell-language-server (lsp, lsp-ui)
  • Other main tools: nix/nixos, helm, key-chord, eno, projectile, silver-searcher, nix-direnv, envrc (thx Steve!), yasnippet, hoogle search (from haskell-mode)
  • Explanation: My journey with emacs started from terrible scala support in IntelliJ. The frontend compiler for this IDE showed fake errors, which confused me many, many times. Then I found ensime, which unfortunately was abandoned, but simultaneously Metals was presented to the wider audience. This LSP experience was something I had been looking for. When it comes to Haskell & emacs, I started with Intero, which was… inevitably abandoned and somehow not working on NixOs. Then I found Dante, but the overall experience for me was worse in comparison to Scala Metals. I gave another chance to HLS, and now this is really, really close to my best LSP experience so far.
  • I love:

    • emacs: Although I’m not an elisp hacker, I know there is always a way to set up anything I want in the way I want. I can use different plugins like jumping between windows/words/braces, regexp expressions, key-chords, dired-helm, projectile, multiple-cursors for all the projects or tech-stacks I work on.
    • projectile: all the boost it gives me when it comes to navigating among projects.
    • hls: Feedback it gives. Immediately.
  • Could be better:

    • hls: jump to definition works only for local definitions
    • hls: haskell-language-server & bazel integration :)
  • Trivia: I feel like I’m still not utilizing all the power that emacs gives (especially after reading Jeff’s setup :) ).

Adrian Robert

  • Editor: VS Code + NixEnv + HLS.
  • Other main tools: rg (ripgrep), Cabal, haskell.nix
  • Explanation: I first tried IntelliJ but found it reminiscent of my experience with Eclipse for Scala – not good enough. So I then became a first-time user of VS Code and liked it. Why not Emacs? Somehow the collective relative friction of setting up, the more specialized UI, and a general feeling of greater speed have me on the VS Code side now. I’ll continue experimenting.
  • I love: Quick feedback (underline + hover) on type mismatch problems. As a Haskell newbie I’d go through a lot more compile cycles without this.
  • Could be better: The experience still feels pretty basic, say like using XCode for Objective-C ten years ago. I appreciate what’s there, but nowadays one misses having things like refactoring and code templating. I don’t know if this is a fundamental limitation in the LSP paradigm or in VS Code, or a lack in the Haskell implementation or weakness in the Haskell plugin.
  • Trivia: For Scala I still haven’t managed to get our Scala/Gradle project up and running properly in VS Code with Metals, which I suspect has to do with its multidirectory and subproject structure, together with too many parameters in Gradle versions and things it pulls in. Frustrating, though since I don’t work on this very often it hasn’t been frustrating enough for me to get it working. :-}

Noon van der Silk

  • Editor: NeoVim
  • Other main tools: XMonad, nix, zsh, rip-grep, fzf
  • Explanation: I use a fairly simple vim setup, with a strong reliance on fuzzy-finding and grep.
  • I love: How fast everything starts up; independence of all the tools.
  • Could be better: I’m jealous of people using HLS.
  • Trivia: I rely on a lot of zsh aliases to live a convenient command-line life.

Julien Debon

  • Editor: Spacemacs
  • Other main tools:

  • Explanation: I have been a heavy user of IDEs like Eclipse and IntelliJ for many years, but after several frustrations about configuring/fine tuning, I moved to VS Code for a while, and then a year ago to Emacs, via Spacemacs. While the learning curve is steep, I would never go back: I can exactly customize my editor as I wish with minimal effort, and nearly all languages and features are already supported. On the Haskell front, a lot of changes have happened in the past few years. Using HLS is currently pretty good, though I still occasionally miss a good debugger or advanced refactoring tools.
  • I love:

    • Emacs: Emacs extensibility and Spacemacs mnemonics: almost all shortcuts just make sense.
    • Hoogle: I am a user and abuser of Hoogle: I can’t believe I have developed all these years in various languages without this game-changing tool which allows looking for functions by signature. Whatever function, type, typeclass or package I am looking for, Hoogle finds the perfect solution 99% of the time. It baffles me that most languages don’t have an equivalent feature, including other languages I use on a regular basis.
    • HLS: I love code navigation, documentation display, instant typechecking, and call hierarchy.
  • Could be better:

    • Emacs: every once in a while I would like to scroll a buffer without moving the cursor, but it’s impossible in Emacs. I think it is the only feature from my past editors that is not simply possible in Emacs.
    • HLS: I learn a lot by browsing dependency code, so I really, really wish we could navigate to dependency code. Currently I browse dependency code via Hackage/Stackage, but this is nowhere near as comfortable as editor integration.
  • Trivia: I only discovered Magit a year ago, but if anyone forced me to go back to using vanilla Git, the situation would escalate pretty quickly. Similarly to Hoogle, I just don’t understand why it’s not more widespread, considering how many people interact with Git every day.

What’s your preferred way of working in Haskell? We’d love to hear from you! Share your setup with us on Twitter!

November 25, 2021 12:00 AM

November 23, 2021

Edward Z. Yang

Interactive scraping with Jupyter and Puppeteer

One of the annoying things about scraping websites is bouncing back and forth between the browser where you are using Dev Tools to work out what selectors you should be using to scrape out data, and your actual scraping script, which is usually some batch program that may have to take a few steps before the step you are debugging. A batch script is fine once your scraper is up and running, but while developing, it's really handy to pause the scraping process at some page and fiddle around with the DOM to see what to do.

This interactive-style development is exactly what Jupyter notebooks shine at; when used in conjunction with a browser-based scraping library like Puppeteer, you can have exactly this workflow. Here's the setup:

  1. Puppeteer is a JavaScript library, so you'll need a JavaScript kernel for Jupyter to run it. As an extra complication, Puppeteer is also async, so you'll need a kernel that supports async execution. Fortunately, ijavascript-await provides exactly this. Note that on recent versions of node this package does not compile; you can install the PR that makes it work. Hypothetically, we should be able to use stock ijavascript once node supports top-level await, but this currently does not work.
  2. Inside the directory where you will store your notebooks, you'll need to npm install puppeteer so that it's available to your notebooks.
  3. Launch Puppeteer with let puppeteer = require('puppeteer'); let browser = await puppeteer.launch({headless: false}); and profit!

There will be a live browser instance which you can poke at using Dev Tools, and you can type commands into the Jupyter notebook and see how they affect the browser state.

I tweeted about this and the commenters had some good suggestions about other things you could try:

  • You don't have to use Puppeteer; Selenium can also drive the browser, and it has a Python API to boot (so no faffing about with alternate Jupyter kernels necessary). I personally prefer working in JavaScript for crawlers, since the page scripting itself is also in JavaScript, but this is mostly a personal preference thing.
  • For simple interactions, where all you really want is to just do a few interactions and record them, Headless Recorder provides a nice extension for just directly recording operations in your browser and then getting them out in executable form. I haven't tried it out yet but it seems like it would be very easy to use.

by Edward Z. Yang at November 23, 2021 02:28 PM

Mark Jason Dominus

Consecutive squareful numbers

On Saturday I was thinking about how each of is a multiple of a square number, and similarly . No such sequence of four numbers came immediately to mind. The smallest example turns out to be .

Let's say a number is “squareful” if it has the form $$a\cdot b^2$$ for . The opposite, “squarefree”, is a standard term, but “non-squarefree” sounds even worse than “squareful”. Do ten consecutive squareful numbers exist, and if so, how can we find them?

I did a little algebraic tinkering but didn't come up with anything. If are consecutive squareful numbers, then so are , except they aren't consecutive, but maybe we could find the right so that and are also squareful. I couldn't make this work though, so I wrote some brute-force search programs to get the lay of the land.
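The search programs themselves aren't shown in the post; a hedged Haskell sketch of what such a brute-force search could look like (the definitions below are mine, not the post's):

```haskell
-- Brute-force sketch (not the post's actual program): n is squareful
-- if some square d^2 with d >= 2 divides it.
squareful :: Int -> Bool
squareful n = any (\d -> n `mod` (d * d) == 0) [2 .. isqrt n]
  where isqrt = floor . (sqrt :: Double -> Double) . fromIntegral

-- Start of the first run of k consecutive squareful numbers.
firstRun :: Int -> Int
firstRun k = head [n | n <- [1 ..], all squareful [n .. n + k - 1]]
```

For small k this reproduces the starts of the first runs of lengths 1 through 4: 4, 8, 48, 242, consistent with the A045882 values quoted below (217070 for seven, 1092747 for eight), though those larger cases take noticeably longer with this naive search.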

The computer quickly produced sequences of length 7:

$$\begin{array}{rcrr} 217070 & = & 4430 \ · & 7^2 \\ 217071 & = & 24119 \ · & 3^2 \\ 217072 & = & 54268 \ · & 2^2 \\ 217073 & = & 17 \ · & 113^2 \\ 217074 & = & 1794 \ · & 11^2 \\ 217075 & = & 8683 \ · & 5^2 \\ 217076 & = & 54269 \ · & 2^2 \end{array} $$

and length 8:

$$\begin{array}{rcrr} 1092747 & = & 3027 \ · & 19^2 \\ 1092748 & = & 273187 \ · & 2^2 \\ 1092749 & = & 22301 \ · & 7^2 \\ 1092750 & = & 43710 \ · & 5^2 \\ 1092751 & = & 9031 \ · & 11^2 \\ 1092752 & = & 273188 \ · & 2^2 \\ 1092753 & = & 121417 \ · & 3^2 \\ 1092754 & = & 6466 \ · & 13^2 \\ \end{array} $$

Neither of these suggested anything to me, and nor did any of the other outputs, so I stuck into OEIS. With numbers like that you don't even have to ask for the whole sequence, you only need to ask for the one number. Six sequences came up but the first five are all the same and were what I was looking for: A045882: Smallest term of first run of (at least) n consecutive integers which are not squarefree.

This led me to Louis Marmet's paper First occurrences of square-free gaps and an algorithm for their computation. The paper provides the earliest sequences of consecutive squareful numbers, for , and bounds for how far out these sequences must be when . This is enough to answer the questions I originally asked:

  • Are there ten consecutive squareful numbers? Yes, the smallest example starts at , found in 1999 by D. Bernier.

  • How can we find them? Marmet gives a sieve method that starts simple and becomes increasingly elaborate.

The paper is from 2007, so it seems plausible that the same algorithms on 2021 computers could produce some previously unknown results.

[ Addendum 20211124: The original version of this article ended “The general problem, of whether there are arbitrarily long sequences of squareful numbers, seems to be open.” This is completely wrong. Daniel Wagner and Shreevatsa R. pointed out that the existence of arbitrarily long sequences is quite elementary. Pick any squares you like that do not share a common factor, say for example and . Now (because Chinese remainder theorem) you can find consecutive numbers that are multiples of those specific squares; in this case . ]
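To make the addendum's argument concrete with hypothetical values (the specific numbers in the original did not survive formatting, so the squares 4 and 9 below are my own choice): the Chinese remainder theorem guarantees some n with 4 | n and 9 | (n + 1), and a tiny search finds the smallest.

```haskell
-- Sketch: find consecutive multiples of the coprime squares 4 and 9.
-- CRT guarantees a solution modulo 36; brute force finds the smallest.
crtExample :: Int
crtExample = head [n | n <- [1 ..], n `mod` 4 == 0, (n + 1) `mod` 9 == 0]
```

Here the answer is n = 8, giving the squareful pair 8, 9; the same construction with k coprime squares yields k consecutive squareful numbers.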

[ In only vaguely related news, I was driving Toph to school this morning, and a car in front of mine had license plate number . ]

by Mark Dominus ( at November 23, 2021 04:14 AM

November 22, 2021

Monday Morning Haskell

AI Revisited: Breaking Down BFS


So we're approaching the end of the year, and of all the topics that I've tended to focus on in my writings, there's one that I haven't really written about in probably over a year, and this is AI and Machine Learning. I've still been doing some work behind the scenes, as you'll know if you keep following the blog for the next few weeks. But I figured I'd spend the last few weeks of the year with some AI related topics. This week, I'll go over an algorithm that is really useful to understand when it comes to writing simple AI programs, and this is Breadth-First-Search.

All the code for the next few weeks can be found in this GitHub repository! For this week, all the code can be found in the BFS module.

The Algorithm

To frame this problem in a more concrete way, let's imagine we have a 2-dimensional grid. Some spaces are free, other spaces are "walls". We want to use breadth first search to find a path from a start point to a destination point.


So our algorithm will take two locations, and return a path from location A to location B, or an empty list if no path can be found.

The key data structure when executing a breadth-first-search is a queue. Our basic approach is this: we will place our starting location in the queue. Then, we'll go through a loop as long as the queue is not empty. We'll pull an item off, and then add each of the empty neighbors on to the back of the queue, as long as they haven't been added yet. If we dequeue the destination, we're done! But if we reach an empty queue, then we don't have a valid path.

The last tricky part is that we need to track the "parent" of each location. That is, which of its neighbors placed it on the queue? This will allow us to reconstruct the path we need to take to get from point A to point B.

So let's imagine we have a simple graph like in the ASCII art above. We start at (0,0). Our queue will operate like this.

It contains (0,0). We'll then enqueue (0, 1) and (1, 0), since those are the neighbors of (0, 0).

(0, 0) <-- Current
(0, 1)
(1, 0)

Then we're done with (0, 0), so we dequeue (0, 1). Its only unvisited neighbor is (0, 2), so that gets placed at the end of the queue.

(0, 1) <-- Current
(1, 0)
(0, 2)

And then we repeat the process with (1, 0), placing (2, 0).

(1, 0) <-- Current
(0, 2)
(2, 0)

We keep doing this until we navigate around to our destination at (2,2).

Types First

How do we translate this to Haskell? My favorite approach to problems like this is to use a top-down, type driven, compile-first method of writing the algorithm. Because before we can really get started in earnest, we have to define our data structures and our types. First, let's alias an integer tuple as a "Location":

type Location = (Int, Int)

Now, we're going to imagine we're navigating a 2D grid, and we'll represent this with an array where the indices are tuples which represent locations, and each value is either "empty" or "wall". We can move through empty spaces, but we cannot move through walls.

data Cell = Empty | Wall
  deriving (Eq)
type Grid = A.Array Location Cell
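As a concrete instance of these types (the data here is mine, mirroring the walkthrough's grid; the article's repository defines its own), the 3x3 grid with a wall in the middle could be built with Data.Array's listArray, which fills tuple-indexed arrays in row-major order:

```haskell
import qualified Data.Array as A

type Location = (Int, Int)

data Cell = Empty | Wall
  deriving (Eq)

type Grid = A.Array Location Cell

-- A 3x3 grid, listed row by row: a single wall at (1,1),
-- all other cells empty.
sampleGrid :: Grid
sampleGrid = A.listArray ((0, 0), (2, 2))
  [ Empty, Empty, Empty
  , Empty, Wall,  Empty
  , Empty, Empty, Empty
  ]
```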

Now we're ready to define the type signature for our function. This takes the grid as an input, as well as the start and end location:

bfsSearch :: Grid -> Location -> Location -> [Location]

We'll need one more type to help frame the problem. This algorithm will use the State monad, because there's a lot of state we need to track here. First off, we need the queue itself. We represent this with the Sequence type in Haskell. Then, we need our set of visited locations. Each time we enqueue a location, we'll save it here. Last, we need our "parents" map. This will help us determine the path at the very end.

data BFSState = BFSState
  { queue :: S.Seq Location
  , visited :: Set.Set Location
  , parents :: M.Map Location Location
  }

A Stateful Skeleton

With these types, we can start framing the problem a bit more. First, we want to construct our initial state. Everything is empty except our queue has the starting location on it.

bfsSearch :: Grid -> Location -> Location -> [Location]
bfsSearch grid start finish = ...
  where
    initialState = BFSState (S.singleton start) Set.empty M.empty

Now we want to pass this state to a stateful computation that returns our list. So we'll imagine we have a helper in the State monad that returns our list of locations. We'll call this bfsSearch'. We can then fill in our original function with evalState.

bfsSearch :: Grid -> Location -> Location -> [Location]
bfsSearch grid start finish = evalState (bfsSearch' grid finish) initialState
  where
    initialState = BFSState (S.singleton start) Set.empty M.empty

bfsSearch' :: Grid -> Location -> State BFSState [Location]

Base Case

Now within our stateful helper, we can recognize that this will be a recursive function. We dequeue an element, enqueue its neighbors, and then repeat the process. So let's handle the base cases first. We'll retrieve our sequence from the state and check if it's empty or not. If it's empty, we return the empty list. This means that we couldn't find a path.

bfsSearch' :: Grid -> Location -> State BFSState [Location]
bfsSearch' grid finish = do
  (BFSState q v p) <- get
  case S.viewl q of
    (top S.:< rest) -> ...
    _ -> return []

Now another base case is where the top of our queue is the destination. In this case, we're ready to "unwind" the path from that destination in our stateful map. Let's imagine we have a function to handle this unwinding process. We'll fill it in later.

bfsSearch' :: Grid -> Location -> State BFSState [Location]
bfsSearch' grid finish = do
  (BFSState q v p) <- get
  case S.viewl q of
    (top S.:< rest) -> if top == finish
      then return (unwindPath p [finish])
      else ...
    _ -> return []

unwindPath :: M.Map Location Location -> [Location] -> [Location]

The General Case

Now let's write out the steps for our general case.

  1. Get the neighbors of the top element on the queue
  2. Append these to the "rest" of the queue (discarding the top element).
  3. Insert this top element into our "visited" set v.
  4. For each new location, insert it into our "parents" map with the current top as its "parent".
  5. Update our final state and recurse!

Each of these statements is 1-2 lines in our function, except we'll want to make a helper for the first line. Let's imagine we have a function that can give us the unvisited neighbors of a space in our grid. This will require passing the location, the grid, and the visited set.

let validAdjacent = getValidNeighbors top grid v

getValidNeighbors ::
  Location -> Grid -> Set.Set Location -> [Location]

The next lines involve data structure manipulation, with a couple tricky folds. First, appending the new elements into the queue.

let newQueue = foldr (flip (S.|>)) rest validAdjacent

Next, inserting the top into the visited set. This one's easy.

let newVisited = Set.insert top v

Now, insert each new neighbor into the parents map. The new location is the "key", and the current top is the value.

let newParentsMap = foldr (\loc -> M.insert loc top) p validAdjacent

Last of all, we replace the state and recurse!

put (BFSState newQueue newVisited newParentsMap)
bfsSearch' grid finish

Here's our complete function!

bfsSearch' :: Grid -> Location -> State BFSState [Location]
bfsSearch' grid finish = do
  (BFSState q v p) <- get
  case S.viewl q of
    (top S.:< rest) -> if top == finish
      then return (unwindPath p [finish])
      else do
        let validAdjacent = getValidNeighbors top grid v
        let newQueue = foldr (flip (S.|>)) rest validAdjacent
        let newVisited = Set.insert top v
        let newParentsMap = foldr (\loc -> M.insert loc top) p validAdjacent
        put (BFSState newQueue newVisited newParentsMap)
        bfsSearch' grid finish
    _ -> return []

Filling in Helpers

Now we just need to fill in our helper functions. Unwinding the map is a fairly straightforward tail-recursive problem. We get the parent of the current element, and keep an accumulating list of the places we've gone:

unwindPath :: M.Map Location Location -> [Location] -> [Location]
unwindPath parentsMap currentPath = case M.lookup (head currentPath) parentsMap of
  Nothing -> tail currentPath
  Just parent -> unwindPath parentsMap (parent : currentPath)

Finding the neighbors is slightly trickier. For each direction (right, down, left, and up), we have to consider if the "neighbor" cell is in bounds. Then we have to consider if it's empty. Finally, we need to know if it is still "unvisited". As long as all three of these conditions hold, we can potentially add it. Here's what this process looks like for finding the "right" neighbor.

getValidNeighbors :: Location -> Grid -> Set.Set Location -> [Location]
getValidNeighbors (r, c) grid v = ...
  where
    (rowMax, colMax) = snd . A.bounds $ grid
    right = (r, c + 1)
    right' = if c + 1 <= colMax && grid A.! right == Empty && not (Set.member right v)
      then Just right
      else Nothing

We do this in every direction, and we'll use catMaybes so we only get the correct ones in the end!

getValidNeighbors :: Location -> Grid -> Set.Set Location -> [Location]
getValidNeighbors (r, c) grid v = catMaybes [right', down', left', up']
  where
    (rowMax, colMax) = snd . A.bounds $ grid
    right = (r, c + 1)
    right' = if c + 1 <= colMax && grid A.! right == Empty && not (Set.member right v)
      then Just right
      else Nothing
    down = (r + 1, c)
    down' = if r + 1 <= rowMax && grid A.! down == Empty && not (Set.member down v)
      then Just down
      else Nothing
    left = (r, c - 1)
    left' = if c - 1 >= 0 && grid A.! left == Empty && not (Set.member left v)
      then Just left
      else Nothing
    up = (r - 1, c)
    up' = if r - 1 >= 0 && grid A.! up == Empty && not (Set.member up v)
      then Just up
      else Nothing


This basic structure can also be adapted to use depth-first search! The main difference is that you must treat the Sequence as a stack instead of a queue, appending new items to the left side of the sequence. Both of these algorithms are guaranteed to find a path if one exists. But BFS will find the shortest path in this kind of scenario, whereas DFS probably won't!
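That queue-versus-stack difference can be sketched with a toy search parameterized over the frontier discipline. This is illustrative code, not the article's, and the example graph in main is hypothetical:

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence (Seq, ViewL (..), (<|), (|>))
import qualified Data.Set as Set

-- BFS and DFS differ only in which end of the frontier new cells join:
-- BFS enqueues on the right (a queue); DFS pushes on the left (a stack).
search :: Ord a => Bool -> (a -> [a]) -> a -> [a]
search useStack next start = go (Seq.singleton start) (Set.singleton start)
  where
    go q seen = case Seq.viewl q of
      EmptyL -> []
      v :< rest -> v : go frontier seen'
        where
          fresh = [n | n <- next v, not (Set.member n seen)]
          seen' = foldr Set.insert seen fresh
          frontier
            | useStack  = foldr (<|) rest fresh  -- DFS: push on the left
            | otherwise = foldl (|>) rest fresh  -- BFS: enqueue on the right

main :: IO ()
main = do
  -- a small binary-tree-shaped graph on the vertices 1..7
  let next x = [y | y <- [2 * x, 2 * x + 1], y <= (7 :: Int)]
  print (search False next 1)  -- BFS visits level by level
  print (search True  next 1)  -- DFS dives down one branch first
```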

Next week, we'll continue a basic AI exploration by putting this algorithm to work in a game environment with Gloss!

by James Bowen at November 22, 2021 03:30 PM

November 21, 2021

Sandy Maguire

Automatically Migrating Eq of No (/=)

We’ve all spent more time talking about Eq of no (/=) than it deserves. Today Bodigrim published Migration guide for Eq of no (/=) which describes all the steps you’ll need to take in order to update your codebase for the century of the fruitbat.

But that made me think — why do humans need to do this by hand? Computers are good at this sort of thing. So I wrote a tiny little comby config that does the replacements we want. Comby is a fantastic “parser parser combinator” — which is to say, a little DSL for writing program transformations. You just write the pattern you want to match, and comby lifts it to work over whitespace, and ensures that your greedy matches are parenthesis-aware, and that sort of thing. It’s quite lovely. The config I wrote is listed at the end of this post.

Here’s a problematic module that will be very broken by Eq of no (/=):

module Neq where

import Prelude (Eq (..), Bool(..), (||))
import Data.Eq (Eq (..))

data A = A Bool Bool

instance Eq A where
  A x1 x2 /= A y1 y2 = x1 /= y1 || x2 /= x2

data B = B Bool

instance Eq B where
  B x == B y = x == y
  B x /= B y = x /= y

data C a = C a

instance Eq a => Eq (C a) where
  C x == C y = x == y
  C x /= C y = x /= y

data D = D Bool

instance Eq D where
  D x /= D y =
    x /= y
  D x == D y =
    x == y

data E = E Bool

instance Eq E where
  E x /= E y =
    let foo = x /= y in foo

After running comby, we get the following diff:

 module Neq where

-import Prelude (Eq (..), Bool(..), (||))
-import Data.Eq (Eq (..))
+import Prelude (Eq, (==), (/=), Bool(..), (||))
+import Data.Eq (Eq, (==), (/=))

 data A = A Bool Bool

 instance Eq A where
-  A x1 x2 /= A y1 y2 = x1 /= y1 || x2 /= x2
+  A x1 x2 == A y1 y2 = not $ x1 /= y1 || x2 /= x2

 data B = B Bool

 instance Eq B where
   B x == B y = x == y
-  B x /= B y = x /= y

 data C a = C a

 instance Eq a => Eq (C a) where
   C x == C y = x == y
-  C x /= C y = x /= y

 data D = D Bool

 instance Eq D where
-  D x /= D y = x /= y
   D x == D y = x == y

 data E = E Bool

 instance Eq E where
-  E x /= E y =
-    let foo = x /= y in foo
+  E x == E y = not $ let foo = x /= y in foo

Is it perfect? No, but it’s pretty good for the 10 minutes it took me to write. A little effort here goes a long way!

My config file to automatically migrate Eq of no (/=) (each rule is a match template, followed immediately by the rewrite template that replaces it):

instance :[ctx]Eq :[name] where
  :[x] /= :[y] = :[z\n]
instance :[ctx]Eq :[name] where
  :[x] == :[y] = not $ :[z]

instance :[ctx]Eq :[name] where
  :[x1] == :[y1] = :[z1\n]
  :[x2] /= :[y2] = :[z2\n]
instance :[ctx]Eq :[name] where
  :[x1] == :[y1] = :[z1]

instance :[ctx]Eq :[name] where
  :[x2] /= :[y2] = :[z2\n]
  :[x1] == :[y1] = :[z1\n]
instance :[ctx]Eq :[name] where
  :[x1] == :[y1] = :[z1]

import Prelude (:[pre]Eq (..):[post])
import Prelude (:[pre]Eq, (==), (/=):[post])

import Data.Eq (:[pre]Eq (..):[post])
import Data.Eq (:[pre]Eq, (==), (/=):[post])

Save this file as eq.toml, and run comby in your project root via:

$ comby -config eq.toml -matcher .hs -i -f .hs

Comby will find and make all the changes you need, in place. Check the diff, and make whatever changes you might need. In particular, it might bork some of your whitespace — there’s an issue to get comby to play more nicely with layout-aware languages. A more specialized tool that had better awareness of Haskell’s idiosyncrasies would help here, if you have some spare engineering cycles. But when all’s said and done, comby does a damn fine job.

November 21, 2021 01:38 PM

November 18, 2021

Mark Jason Dominus

In simple English, what does it mean to be transcendental?

I've been meaning to write this up for a while, but somehow never got around to it. In my opinion, it's the best Math Stack Exchange post I've ever written. And also remarkable: its excellence was widely recognized. Often I work hard and write posts that I think are really good, and they get one or two upvotes; that's okay, because the work is its own reward. And sometimes I write posts that are nothing at all that get a lot of votes anyway, and that is okay because the Math.SE gods are fickle. But this one was great and it got what it deserved.

I am really proud of it, and in this post I am going to boast as shamelessly as I can.

The question was:

In simple English, what does it mean to be transcendental?

There were several answers posted immediately that essentially recited the definition, some better than others. At the time I arrived, the most successful of these was by Akiva Weinberger, which already had around fifty upvotes.

… Numbers like this, that satisfy polynomial equations, are called algebraic numbers. … A real (or complex) number that's not algebraic is called transcendental.

If you're going to essentially quote the definition, I don't think you can do better than to explain it the way Akiva Weinberger did. It was a good answer!

Once one answer gets several upvotes, it moves to the top of the list, right under the question itself. People see it first, and they give it more votes. A new answer has zero votes, and is near the bottom of the page, so people tend to ignore it. It's really hard for new answers to surpass a highly-upvoted previous answer. And while fifty upvotes on some Stack Exchange sites is not a large number, on Math SE fifty is a lot; fewer than 0.2% of answers score that high.

I was unhappy with the several quoting-the-definition answers. Because honestly "numbers… that satisfy polynomial equations" is not “simple English” or “layman's terms” as the OP requested. Okay, transcendental numbers have something to do with polynomial equations, but why do we care about polynomial equations? It's just explaining one obscure mathematical abstraction in terms of a second one.

I tried to think a little deeper. Why do we care about polynomials? And I decided: it's because the integer polynomials are the free ring over the integers. That's not simple English either, but the idea is simple and I thought I could explain it simply. Here's what I wrote:

We will play a game. Suppose you have some number x. You start with x and then you can add, subtract, multiply, or divide by any integer, except zero. You can also multiply by x. You can do these things as many times as you want. If the total becomes zero, you win.

For example, suppose x is 2/3. Multiply by 3, then subtract 2. The result is zero. You win!

Suppose x is ∛7. Multiply by x, then by x again, then subtract 7. You win!

Suppose x is √2+√3. Here it's not easy to see how to win. But it turns out that if you multiply by x, subtract 10, multiply by x twice, and add 1, then you win. (This is not supposed to be obvious; you can try it with your calculator.)

But if you start with x = π, you cannot win. There is no way to get from π to 0 if you add, subtract, multiply, or divide by integers, or multiply by π, no matter how many steps you take. (This is also not supposed to be obvious. It is a very tricky thing!)

Numbers like 2/3 from which you can win are called algebraic. Numbers like π with which you can't win are called transcendental.

Why is this interesting? Each algebraic number is related arithmetically to the integers, and the winning moves in the game show you how so. The path to zero might be long and complicated, but each step is simple and there is a path. But transcendental numbers are fundamentally different: they are not arithmetically related to the integers via simple steps.
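The quoted game steps are easy to verify mechanically. A minimal sketch (assuming, per the original Math.SE answer, that the tricky starting number is √2 + √3; floating-point arithmetic only gets us close to zero, not exactly there):

```haskell
-- Numeric check of the winning moves for the tricky example:
-- start with x, multiply by x, subtract 10, multiply by x twice, add 1.
main :: IO ()
main = do
  let x = sqrt 2 + sqrt 3 :: Double
      total = (x * x - 10) * x * x + 1
  -- total is zero up to floating-point rounding error
  print (abs total < 1e-9)
```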

This answer was an immediate hit. It rocketed past the previous top answer into the stratosphere. Of 190,000 Math SE answers, there are twenty with scores over 500; mine is 13th.

The original version left off the final paragraph (“Why is this interesting?”). Fortunately, someone posted a comment pointing out the lack. They were absolutely right, and I hastened to fix it.

I love this answer for several reasons:

  • It's not as short as possible, but it's short enough.

  • It's almost completely jargonless. It doesn't use the word “coefficient”. You don't have to know what a polynomial is. You only have to understand grade-school arithmetic. You don't even need to know what a square root is; you can still try the example if you have a calculator with a square root button.

  • Sometimes to translate a technical concept into plain language, one must sacrifice perfect accuracy, or omit important details. This explanation is technically flawless.

  • One often sees explanations of “irrational number” that refer to the fact that such a number has a nonrepeating decimal expansion. While this is true, it's not what irrationality is really about, but a secondary property. The real root of the matter is that an irrational number is not the ratio of any two integers.

    My post didn't use the word “polynomial” and took a somewhat different path than the typical explanation, but it nevertheless hit directly at the root of the topic, not at a side issue.

  • Also I had some unusually satisfying exchanges with critical commenters. There are a few I want to call out for triumphant mockery, but I have a policy of not mocking private persons on this blog, and this is just the kind of situation I intended to apply it to.

This is some good work. When I stand in judgement and God asks me if I did my work as well as I could, this is going to be one of the things I bring up.

by Mark Dominus ( at November 18, 2021 12:09 PM

November 17, 2021

Brent Yorgey

Competitive programming in Haskell: BFS, part 4 (implementation via STUArray)

In a previous post, we saw one way to implement our BFS API, but I claimed that it is not fast enough to solve Modulo Solitaire. Today, I want to demonstrate a faster implementation. (It’s almost certainly possible to make it faster still; I welcome suggestions!)

Once again, the idea is to replace the HashMaps from last time with mutable arrays, but in such a way that we get to keep the same pure API—almost. In order to allow arbitrary vertex types, while storing the vertices efficiently in a mutable array, we will require one extra argument to our bfs function, namely, an Enumeration specifying a way to map back and forth between vertices and array indices.

So why not instead just restrict vertices to some type that can be used as keys of a mutable array? That would work, but would unnecessarily restrict the API. For example, it is very common to see competitive programming problems that are “just” a standard graph algorithm, but on a non-obvious graph where the vertices are conceptually some more complex algebraic type, or on a graph where the vertices are specified as strings. Typically, competitive programmers just implement a mapping from vertices to integers on the fly—using either some math or some lookup data structures on the side—but wouldn’t it be nicer to be able to compositionally construct such a mapping and then have the graph search algorithm automatically handle the conversion back and forth? This is exactly what the Enumeration abstraction gives us.
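For reference, here is a sketch of what the Enumeration abstraction looks like. The field names card, select, and locate match how bfs uses them below, but the real definition lives in the Enumeration module from an earlier post, so treat this as an approximation:

```haskell
-- Sketch of the Enumeration abstraction: a cardinality together with
-- inverse mappings between vertices and array indices.
data Enumeration a = Enumeration
  { card   :: Int       -- how many values of type a there are
  , select :: Int -> a  -- array index -> vertex
  , locate :: a -> Int  -- vertex -> array index
  }

-- The identity enumeration on the interval [0 .. n), as used later
-- for Modulo Solitaire via finiteE m.
finiteE :: Int -> Enumeration Int
finiteE n = Enumeration n id id

main :: IO ()
main = print (card (finiteE 5), select (finiteE 5) 3, locate (finiteE 5) 4)
```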

This post is literate Haskell; you can obtain the source from the darcs repo. The source code (without accompanying blog post) can also be found in my comprog-hs repo.

{-# LANGUAGE FlexibleContexts    #-}
{-# LANGUAGE RankNTypes          #-}
{-# LANGUAGE RecordWildCards     #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Graph where

import Enumeration

import           Control.Arrow       ((>>>))
import           Control.Monad
import           Control.Monad.ST
import qualified Data.Array.IArray   as IA
import           Data.Array.ST
import           Data.Array.Unboxed  (UArray)
import qualified Data.Array.Unboxed  as U
import           Data.Array.Unsafe   (unsafeFreeze)
import           Data.Sequence       (Seq (..), ViewL (..), (<|), (|>))
import qualified Data.Sequence       as Seq

infixl 0 >$>
(>$>) :: a -> (a -> b) -> b
(>$>) = flip ($)
{-# INLINE (>$>) #-}

exhaustM is like exhaust from the last post, but in the context of an arbitrary Monad. Each step will now be able to have effects (namely, updating mutable arrays) so needs to be monadic.

exhaustM :: Monad m => (a -> m (Maybe a)) -> a -> m a
exhaustM f = go
  where
    go a = do
      ma <- f a
      maybe (return a) go ma

The BFSResult type is the same as before.

data BFSResult v =
  BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }

Instead of using HashMaps in our BFSState as before, we will use STUArrays.1 These are unboxed, mutable arrays which we can use in the ST monad. Note we also define V as a synonym for Int, just as a mnemonic way to remind ourselves which Int values are supposed to represent vertices.

type V = Int
data BFSState s =
  BS { level :: STUArray s V Int, parent :: STUArray s V V, queue :: Seq V }

To initialize a BFS state, we allocate new mutable level and parent arrays (initializing them to all -1 values), and fill in the level array and queue with the given start vertices. Notice how we need to be explicitly given the size of the arrays we should allocate; we will get this size from the Enumeration passed to bfs.

initBFSState :: Int -> [V] -> ST s (BFSState s)
initBFSState n vs = do
  l <- newArray (0,n-1) (-1)
  p <- newArray (0,n-1) (-1)

  forM_ vs $ \v -> writeArray l v 0
  return $ BS l p (Seq.fromList vs)

The bfs' function implements the BFS algorithm itself. Notice that it is not polymorphic in the vertex type; we will fix that with a wrapper function later. If you squint, the implementation looks very similar to the implementation of bfs from my previous post, with the big difference that everything has to be in the ST monad now.

bfs' :: Int -> [V] -> (V -> [V]) -> (V -> Bool) -> ST s (BFSState s)
bfs' n vs next goal = do
  st <- initBFSState n vs
  exhaustM bfsStep st
  where
    bfsStep st@BS{..} = case Seq.viewl queue of
      EmptyL -> return Nothing
      v :< q'
        | goal v -> return Nothing
        | otherwise -> v >$> next >>> filterM (fmap not . visited st)
            >=> foldM (upd v) (st{queue=q'}) >>> fmap Just

    upd p b@BS{..} v = do
      lp <- readArray level p
      writeArray level v (lp + 1)
      writeArray parent v p
      return $ b{queue = queue |> v}

visited :: BFSState s -> V -> ST s Bool
visited BS{..} v = (/= -1) <$> readArray level v
{-# INLINE visited #-}

The bfs function is a wrapper around bfs'. It presents the same API as before, with the exception that it requires an extra Enumeration v argument, and uses it to convert vertices to integers for the inner bfs' call, and then back to vertices for the final result. It also handles freezing the mutable arrays returned from bfs' and constructing level and parent lookup functions that index into them. Note, the use of unsafeFreeze seems unavoidable, since runSTUArray only allows us to work with a single mutable array; in any case, it is safe for the same reason the use of unsafeFreeze in the implementation of runSTUArray itself is safe: we can see from the type of toResult that the s parameter cannot escape, so the type system will not allow any further mutation to the arrays after it completes.

bfs :: forall v. Enumeration v -> [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs Enumeration{..} vs next goal
  = toResult $ bfs' card (map locate vs) (map locate . next . select) (goal . select)
  where
    toResult :: (forall s. ST s (BFSState s)) -> BFSResult v
    toResult m = runST $ do
      st <- m
      (level' :: UArray V Int) <- unsafeFreeze (level st)
      (parent' :: UArray V V) <- unsafeFreeze (parent st)
      return $
        BFSR
          ((\l -> guard (l /= -1) >> Just l) . (level' IA.!) . locate)
          ((\p -> guard (p /= -1) >> Just (select p)) . (parent' IA.!) . locate)

Incidentally, instead of adding an Enumeration v argument, why don’t we just make a type class Enumerable, like this?

class Enumerable v where
  enumeration :: Enumeration v

bfs :: forall v. Enumerable v => [v] -> ...

This would allow us to keep the same API for BFS, up to only different type class constraints on v. We could do this, but it doesn’t particularly seem worth it. It would typically require us to make a newtype for our vertex type (necessitating extra code to map in and out of the newtype) and to declare an Enumerable instance; in comparison, the current approach with an extra argument to bfs requires us to do nothing other than constructing the Enumeration itself.

Using this implementation, bfs is finally fast enough to solve Modulo Solitaire, like this:

main = C.interact $ runScanner tc >>> solve >>> format

data Move = Move { a :: !Int, b :: !Int } deriving (Eq, Show)
data TC = TC { m :: Int, s0 :: Int, moves :: [Move] } deriving (Eq, Show)

tc :: Scanner TC
tc = do
  m <- int
  n <- int
  TC m <$> int <*> n >< (Move <$> int <*> int)

type Output = Maybe Int

format :: Output -> ByteString
format = maybe "-1" showB

solve :: TC -> Output
solve TC{..} = getLevel res 0
  where
    res = bfs (finiteE m) [s0] (\v -> map (step m v) moves) (== 0)

step :: Int -> Int -> Move -> Int
step m v (Move a b) = (a*v + b) `mod` m
{-# INLINE step #-}

It’s pretty much unchanged from before, except for the need to pass an Enumeration to bfs (in this case we just use finiteE m, which is the identity on the interval [0 .. m)).

Some remaining questions

This is definitely not the end of the story.

  • Submitting all this code (BFS, Enumeration, and the above solution itself) as a single file gives a 2x speedup over submitting them as three separate modules. That’s annoying—why is that?

  • Can we make this even faster? My solution to Modulo Solitaire runs in 0.57s. There are faster Haskell solutions (for example, Anurudh Peduri has a solution that runs in 0.32s), and there are Java solutions as fast as 0.18s, so it seems to me there ought to be ways to make it much faster. If you have an idea for optimizing this code I’d be very interested to hear it! I am far from an expert in Haskell optimization.

  • Can we generalize this nicely to other kinds of graph search algorithms (at a minimum, DFS and Dijkstra)? I definitely plan to explore this question in the future.

For next time: Breaking Bad

Next time, I want to look at a few other applications of this BFS code (and perhaps see if we can improve it along the way); I challenge you to solve Breaking Bad.

  1. Why not use Vector, you ask? It’s probably even a bit faster, but the vector library is not supported on as many platforms.↩

by Brent at November 17, 2021 03:52 PM

Tweag I/O

Safe Sparkle: a resource-safe interface with linear types

During my internship at Tweag, I worked on creating a safe interface for the sparkle library under the excellent mentorship of Facundo Domínguez. sparkle is a Haskell library that provides bindings for Apache Spark using inline-java’s Java interop capabilities. As such, this new safe interface accomplishes the same thing that inline-java’s safe interface does; it helps to ensure safe management of Java references at compile-time using linear types.

As discussed in this earlier post, we need to be careful to free Java references provided by inline-java when they’re done being used, and we also shouldn’t use them after they’ve been freed. Since sparkle manipulates references to Java objects defined in the Spark Java library, any potential users must take care to safely manage references when using sparkle as well. Hence, the goal of creating a safe interface for sparkle was clear: ensure that users manage references to Spark objects safely, using linear types. However, actually designing a safe interface that achieved this goal in the best way possible involved a couple of non-obvious design decisions along the way.

In this post, I will discuss some of the more important design choices I made, both as a way to introduce people to the new safe interface for sparkle, and possibly, as a more general guideline for things to consider when designing a library that achieves safe resource management in a linear monad.

Porting Strategy

The first design decision I had to make, although not a user-facing one, was how I wanted to port the unsafe sparkle library over to a safe version. For the most part, the jvm and jni libraries (on top of which both inline-java and sparkle are built) structure their safe interfaces as wrappers around the corresponding unsafe ones. That is, a typical function defined in one of these libraries will involve some unwrapping of data types, followed by an unsafe call to the underlying unsafe function of interest, plus maybe some extra reference management. For example, the getArrayLength function in the safe version of jni is essentially a wrapper around the original implementation in the unsafe version:

getArrayLength :: MonadIO m => JArray a %1-> m (JArray a, Ur Int32)
getArrayLength = Unsafe.toLinear $ \o ->
    liftPreludeIO ((,) o . Ur <$> JNI.getArrayLength (unJ o))

In this case, the library writer was careful to check that the “unsafe” version of getArrayLength was actually safe and didn’t delete or modify the original array o that was passed in, justifying a call to Unsafe.toLinear. And indeed, there’s not much of a choice here but to use Unsafe.toLinear. The actual implementation of getArrayLength involves a call to inline-c, for which there is no linearly-typed safe interface.

In the case of sparkle, however, there was another option: reimplement all the functionality in the safe interface using the safe interfaces from inline-java, jvm, and jni. Since sparkle is primarily built on top of these libraries, we no longer run into the problem of primitives whose implementation is inherently unsafe/nonlinear. The main benefits of this approach are that the new implementations are more likely to be safe, as they’re built from safe building blocks (whereas using Unsafe.toLinear requires us to be very careful each time we use it) and that many of sparkle’s bindings work out of the box when we switch the underlying libraries. The main downsides are that there is more code repetition between the safe and unsafe interfaces and that some functions may be more complicated to implement if we limit ourselves to only using linear types. For example, we may need to use folds instead of maps in order to thread some shared, immutable linear resource across a sequence of actions.

In the end, I went for the latter approach, as it guarantees more safety in the implementation itself (the entire safe interface uses Unsafe.toLinear only once), and the process of adapting pre-existing code to work with the safe interfaces for inline-java, jvm, and jni turned out to be pretty straightforward in most cases.

When to delete references?

The second major design point I had to address when designing the safe interface was that of when references would be deleted. In any interface that deals with the safe management of some resource, there must necessarily be some place where the resource is ultimately consumed (or freed, or deleted). Technically speaking, if we have some value bound in linear IO, let’s say a reference to a Spark RDD:
  rdd <- parallelize sc [1,2,3,4]

Then at some point, we need to pass rdd to exactly one function that consumes it linearly. Let’s say we want to filter the elements in our RDD. Then filter should consume rdd linearly. The Spark filter function also returns an RDD, so we would probably assign filter a type signature as follows:

filter :: (Static (Reify a), Typeable a) => Closure (a -> Bool) -> RDD a %1 -> Linear.IO (RDD a)

Ignoring the complexities with distributed closures, this just says that filter takes a filtering function, consumes a reference to an RDD linearly, then returns a reference to that RDD with the filter transformation applied. RDDs are immutable, so the returned reference refers to a different Java object than rdd did, but this means that we might reasonably still want to do something with the original RDD referred to by rdd! We could manually create another reference to rdd before calling filter:

   (rdd0, rdd1) <- newLocalRef rdd
   filteredRDD <- filter (static (closure (> 3))) rdd0

But we might also be equally as justified in making the type signature of filter as follows, instead:

filter :: (Static (Reify a), Typeable a) => Closure (a -> Bool) -> RDD a %1 -> Linear.IO (RDD a, RDD a)

In this case we return a reference to the input RDD, as well as a reference to the new, filtered one. The only problem here is that sometimes we might not need the reference to the original RDD anymore, in which case we would have to manually delete it:

   (oldRDD, filteredRDD) <- filter (static (closure (> 3))) rdd
   deleteLocalRef oldRDD

So what’s the right type signature for filter? Both of these possible signatures are equally expressive, so we need to determine which option is better in practice.

Ultimately, the answer is somewhat subjective and will likely vary based on resource usage patterns, but I largely opted for the first option (in which we do not return a copy of the original reference) in designing the safe interface for sparkle. My reasons for choosing this approach are that it allows for better composition, compatibility with the unsafe interface, compatibility with inline-java, and practical ease of use.


Deleting input references by default allows sparkle functions to compose more easily. For example, imagine that we wish to define a function that takes an RDD of words and tells us how many of them are palindromes. If we adopt the convention that sparkle functions always return a reference to the input RDD, then this function would look something like this:

countPalindromes :: RDD Text %1 -> IO (RDD Text, Ur Int64)
countPalindromes rdd =
  filter (static (closure isPalindrome)) rdd >>= \(oldRDD, newRDD) ->
    count newRDD >>= \(newRDD', res) ->
      deleteLocalRef newRDD' >>
        pure (oldRDD, res)

Here, we have countPalindromes return the input RDD in keeping with the chosen convention. Using the other convention, however, our function would look like this:

countPalindromes :: RDD Text %1 -> IO (Ur Int64)
countPalindromes rdd = filter (static (closure isPalindrome)) rdd >>= count

As we can see, not having to wrap and unwrap tuples in the output of functions allows for more seamless composition.


Additionally, this approach does not fundamentally alter the return type of any functions, so they can be used in much the same way as their unsafe counterparts (which makes porting unsafe code to the safe interface easier). Similarly, our chosen convention is the same one that safe inline-java uses in its quasi-quotations, so many sparkle functions behave exactly as one would expect their inline-java-implemented analogs to behave. For example, subtract can be defined as nothing more than a wrapper around the corresponding inline-java quasiquotation:

subtract :: RDD a %1 -> RDD a %1 -> IO (RDD a)
subtract rdd1 rdd2 = [java| $rdd1.subtract($rdd2) |]

Usage patterns

Finally, this option fits better with common usage patterns in Spark. A resource like a file handle, which the user typically needs to use repeatedly, probably should be returned from every function that consumes it. But in Spark, pipelines of transformations and actions on a single entity (RDD, Dataset, etc.) are fairly common (take the countPalindromes function above as a minimal example). That is, it is not generally the case that one needs an older reference after doing something with it, so it seems a bit cleaner to create a few extra references when you need them than to have to clean up unused references.

Note that the above applies only to references to immutable Spark objects. While all functions that deal with immutable objects follow this convention, I dealt with functions taking references to mutable objects on a case-by-case basis. For example, if we perform an action that mutates a mutable object, we typically want to use the object for something else afterwards (e.g. setting a field in a configuration object before initializing a process with that configuration), so it makes sense to return a reference to that object.

Unrestricted Values

In discussing what to return from functions, it’s also worth briefly mentioning what happens when a function returns something other than a reference to a Java object. In this case, the function would just return a normal Haskell value, and while everything may be embedded in linear IO, we don’t care about managing Haskell values in a linear fashion whatsoever, so I adopted the convention of wrapping all returned Haskell values in Ur, signifying that these values are unrestricted and may be used any number of times (including none at all). While it’s certainly possible to return Haskell values that aren’t wrapped in Ur, doing so would be needlessly limiting (see the section “Escaping Linearity” in this post). Note that this is also the convention suggested by the safe reify_ function from jvm:

reify_ :: (Reify a, MonadIO m) => J (Interp a) %1-> m (Ur a)

Global References

Finally, one of the trickiest points involved in porting sparkle over to a safe interface was the issue of global references.

In some cases, sparkle deals with objects that are “global” in some sense. For example the static, final BooleanType field from the Spark DataTypes class is global in the sense that any reference to this field will refer to the same piece of memory in the JVM, and the value of this object will never change. For an entity like this one, it seems a bit unnecessary to have to keep track of a bunch of local references to it if it’s always the same. It would be simpler to just have some kind of global reference that we could always use to refer to this object. Ideally, we would engineer this so that we would be able to avoid unnecessary copying of local references and unnecessary JNI calls from the Haskell side each time we want to do something with this object.

At first, simply using global references as defined in jni seems like the most straightforward solution; however, a mere coercion doesn’t work when we want to pass global references to safe Spark functions.

As it stands, Spark objects are represented in safe sparkle as wrappers around safe local references. For example:

newtype DataType = DataType (J ('Class "org.apache.spark.sql.types.DataType"))

And safe local references are themselves just wrappers around unsafe references:

newtype J (a :: JType) = J (Unsafe.J a)

A global reference has the same type as a local reference, and many functions from jni can work on both kinds of references. But unfortunately, there are different calls to delete each of them: deleteLocalRef and deleteGlobalRef. Applying the wrong one to a reference results in undefined behavior.

Suppose we want to pass a global reference to a function that takes a DataType. We could try to disguise our unsafe global reference as follows:

-- global reference to `BooleanType`
booleanTypeRef :: Unsafe.J ('Class "org.apache.spark.sql.types.DataType")

-- Takes a safe reference to a DataType and deletes it after using, as is the convention
safeSparkFunction :: DataType %1 -> IO ()

someFunc' = do
  disguisedRef <- pure $ DataType (Safe.J booleanTypeRef) -- disguise unsafe global ref as safe local ref
  safeSparkFunction disguisedRef                          -- RUNTIME ERROR or UNDEFINED BEHAVIOR

In this case, safeSparkFunction consumes its argument linearly, meaning that it will delete any reference passed in, and we would get an error since inline-java would use deleteLocalRef, which is invalid to call on a global reference. Moreover, nothing prevents the library user from using booleanTypeRef after it has been deleted!

So as it stands, there’s not really a good way to pass a global reference into a safe function taking a DataType, suggesting that perhaps the definition of DataType is actually what needs to change. There’s no way to wrap an unsafe global reference into a DataType without covering up the fact that this reference is global. Ideally, the global reference would be kept valid for as long as it is needed by the program. It may be possible to change the internals of the safe jni, jvm, and inline-java (in particular by making the safe J type into a union type) to allow safe sparkle functions to take either global or local references as arguments safely.
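To make the union-type idea a bit more concrete, here is a minimal sketch. All names here are hypothetical, not the actual jni API, and deletion is modeled by returning the name of the call that would be made: the point is only that a reference which remembers its own kind lets a consumer always pick the correct delete call.

```haskell
-- Hypothetical sketch of a "union" safe reference: it records whether
-- it is local or global, so the right deletion call can be chosen at
-- the point of consumption.
data RefKind = LocalRef | GlobalRef deriving (Eq, Show)

data SafeRef a = SafeRef RefKind a deriving (Eq, Show)

-- Stand-in for the real deleteLocalRef / deleteGlobalRef calls: we
-- just report which JNI call would be issued.
deleteRef :: SafeRef a -> String
deleteRef (SafeRef LocalRef  _) = "deleteLocalRef"
deleteRef (SafeRef GlobalRef _) = "deleteGlobalRef"
```

In the real library the kind tag would live inside the safe J type, so safe sparkle functions could consume either kind of reference linearly without risking the wrong deletion call.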

Overall, while the potential changes mentioned above are likely the best solution, I made the simpler, yet still workable, compromise of just using safe local references everywhere a global reference might be preferable. The main downsides of this approach are that the user may have to do some extra manual reference management where it is not strictly necessary, and that we lose a few performance optimizations that global references provide (such as avoiding unnecessary copying or JNI calls). But the major advantage of this solution is simplicity for the user, who never has to worry about whether a given reference is local or global (any solution involving global references would try to minimize how much the user needs to care about this, but the distinction is sure to surface somewhere). By treating all references as safe local references, reference management becomes uniform across the entire interface, at the cost of some extra verbosity and minor performance hits.

Closing remarks

We have now seen some of the problems involved in designing a library that enforces safe reference management in linear IO. We went through a case study of linear types in Haskell and the safe interface of inline-java, and I hope it can serve as motivation for thinking carefully about compatibility and ease of use when designing safe resource-management libraries that use linear types. These two factors will play a large role in the future adoption of linear types in Haskell, and understanding the relevant design choices will be essential in scaling the safe resource-management library ecosystem.

November 17, 2021 12:00 AM

FP Complete

Levana NFT Launch

FP Complete Corporation, headquartered in Charlotte, North Carolina, is a global technology company building next-generation software to solve complex problems.  We specialize in Server-Side Software Engineering, DevSecOps, Cloud-Native Computing, Distributed Ledger, and Advanced Programming Languages. We have been a full-stack technology partner in business for 10+ years, delivering reliable, repeatable, and highly secure software.  Our team of engineers, strategically located in over 13 countries, offers our clients one-stop advanced software engineering no matter their size.

For the past few months, the FP Complete engineering team has been working with Levana Protocol on a DeFi platform for leveraged assets on the Terra blockchain. But more recently, we've additionally been helping launch the Levana Dragons meteor shower. This NFT launch completed in the middle of last week, and to date is the largest single NFT event in the Terra ecosystem. We were very excited to be a part of this. You can read more about the NFT launch itself on the Levana Protocol blog post.

We received a lot of positive feedback about the smoothness of this launch, which was pretty wonderful feedback to hear. People expressed interest in learning about the technical decisions we made that led to such a smooth event. We also had a few hiccups occur during the launch and post-launch that are worth addressing as well.

So strap in for a journey involving cloud technologies, DevOps practices, Rust, React, and—of course—Dragons.

Overview of the event

The Levana Dragons meteor shower was an event consisting of 44 separate "showers", or drops during which NFT meteors would be issued. Participants in a shower competed by contributing UST (a Terra-specific stablecoin tied to US Dollars) to a specific Terra wallet. Contributions from a single wallet across a shower were aggregated into a single contribution, and higher contributions resulted in better meteors. At the least granular level, this meant stratification into legendary, ancient, rare, and common meteors. Higher contributions also led to a greater likelihood of receiving an egg inside your meteor.
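The per-wallet aggregation just described can be sketched in a few lines. This is a toy model in Haskell (the actual tooling was written in Rust), and the function name and the wallet addresses below are mine, purely for illustration:

```haskell
import qualified Data.Map.Strict as M

-- Collapse all contributions made by a single wallet during a shower
-- into one total per wallet, as described above.
aggregate :: [(String, Double)] -> M.Map String Double
aggregate = M.fromListWith (+)
```

For example, two separate contributions of 10 and 7 UST from the same wallet count as a single 17 UST contribution when deciding meteor rarity.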

Each shower was separated from the next by 1 hour, and we opened up the site about 24 hours before the first shower occurred. That means the site was active for contributions for about 67 hours straight. Then, following the showers, we needed to mint the actual NFTs, ship them to users' wallets, and open up the "cave" page where users could view their NFTs.

So all told, this was an event that spanned many days, had bouts of high activity, and involved a game incorporating many financial transactions; any downtime, slowness, or poor behavior could result in user frustration or worse. On top of that, given the short timeframe the event was intended to be active, an attack such as a DDoS taking down the site could have been catastrophic for the success of the showers. And the absolute worst case would be a compromise allowing an attacker to redirect funds to a different wallet.

All that said, let's dive in.

Backend server

A major component of the meteor drop was to track contributions to the destination wallet, and provide high level data back to users about these activities. This kind of high level data included the floor prices per shower, the timestamps of the upcoming drops, total meteors a user had acquired so far, and more. All this information is publicly available on the blockchain, and in principle could have been written as frontend logic. However, the overhead of having every visitor to the site downloading essentially the entire history of transactions with the destination wallet would have made the site unusable.

Instead, we implemented a backend web server. We used Rust (with Axum) for this for multiple reasons:

  • We're very familiar with Rust
  • Rust is a high performance language, and there were serious concerns about needing to withstand surges in traffic and DDoS attacks
  • Due to CosmWasm already heavily leveraging Rust, Rust was already in use on the project

The server was responsible for keeping track of configuration data (like the shower timestamps and destination wallet address), downloading transaction information from the blockchain (using the Full Client Daemon), and answering queries to the frontend (described next) providing this information.

We could have kept data in a mutable database like PostgreSQL, but instead we decided to keep all data in memory and download from scratch from the blockchain on each application load. Given the size of the data, these two decisions initially seemed very wise. We'll see some outcomes of this when we analyze performance and look at some of our mistakes below.

React frontend

The primary interface users interacted with was a standard React frontend application. We used TypeScript, but otherwise stuck with generic tools and libraries wherever possible. We didn't end up using any state management libraries or custom CSS systems. Another thing to note is that this frontend is going to expand and evolve over time to include additional functionality around the evolving NFT concept; some of that has already happened, as we'll discuss below.

One specific item that popped up was mobile optimization. Initially, the plan was for the meteor shower site to be desktop-only. After a few beta runs, it became apparent that the majority of users were using mobile devices. As a DAO, a primary goal of Levana is to allow for distributed governance of all products and services, and therefore we felt it vital to be responsive to this community request. Redesigning the interface for mobile and then rewriting the relevant HTML and CSS took up a decent chunk of time.

Hosting infrastructure

Many DApps sites are exclusively client side, leveraging frontend logic interacting with the blockchain and smart contracts exclusively. For these kinds of sites, hosting options like Vercel work out very nicely. However, as described above, this application was a combo frontend/backend. Instead of splitting the hosting between two different options, we decided to host both the static frontend app and the backend dynamic app in a single place.

At FP Complete, we typically use Kubernetes for this kind of deployment. In this case, however, we went with Amazon ECS. This isn't a terribly large delta from our standard Kubernetes deployments, following many of the same patterns: container-based application, rolling deployments with health checks, autoscaling and load balancers, externalized TLS cert management, and centralized monitoring and logging. No major issues there.

Additionally, to help reduce burden on the backend application and provide a better global experience for the site, we put Amazon CloudFront in front of the application, which allowed caching the static files in data centers around the world.

Finally, we codified all of this infrastructure using Terraform, our standard tool for Infrastructure as Code.


GitLab is a standard part of our FP Complete toolchain. We leverage it for internal projects for its code hosting, issue tracking, Docker registry, and CI integration. While we will often adapt our tools to match our client needs, in this case we ended up using our standard tool, and things went very well.

We ended up with a four-stage CI build process:

  1. Lint and build the frontend code, producing an artifact with the built static assets
  2. Build a static Rust application from the backend, embedding the static files from (1), and run standard Rust lints (clippy and fmt), producing an artifact with the single file compiled binary
  3. Generate a Docker image from the static binary in (2)
  4. Deploy the new Docker image to either the dev or prod ECS cluster

Steps (3) and (4) are set up to only run on the master and prod branches. This kind of automated deployment setup made it easy for our distributed team to get changes into a real environment for review quickly. However, it also opened a security hole we needed to address.

AWS lockdown

Due to the nature of this application, any kind of downtime during the active showers could have resulted in a lot of egg on our faces and a missed opportunity for the NFT raise. However, there was a far scarier potential outcome. Changing a single config value in production—the destination wallet—would have enabled a nefarious actor to siphon away funds intended for NFTs. This was the primary concern we had during the launch.

We considered multiple social engineering approaches to the problem, such as advertising the correct wallet address to potential users. However, we decided that most users would likely not check addresses before sending their funds. We did set up an emergency "shower halted" page and put in place an on-call team to detect problems and deploy such measures if necessary, but fortunately nothing along those lines occurred.

However, during the meteor shower, we did instate an AWS account lockdown. This included:

  • Switching Zehut, a tool we use for granting temporary AWS credentials, into read-only credentials mode
  • Disabling GitLab CI's production credentials, so that GitLab users could not cause a change in prod

We additionally vetted all other components in the pipeline of DNS resolution, such as domain name registrar, Route 53, and other AWS services for hosting.

These are generally good practices, and over time we intend to refine the AWS permissions setup for Levana's AWS account in general. However, this launch was the first time we needed to use AWS for app deployment, and time did not permit a thorough AWS permissions analysis and configuration.

During the shower

As I just mentioned, during the shower we had an on-call team ready to jump into action and a playbook to address potential issues. Issues essentially fell into three categories:

  1. Site is slow/down/bad in some way
  2. Site is actively malicious, serving the wrong content and potentially scamming people
  3. Some kind of social engineering attack is underway

The FP Complete team were responsible for observing (1) and (2). I'll be honest that this is not our strong suit. We are a team that typically builds backends and designs DevOps solutions, not an on-call operations team. However, we were the experts in both the DevOps hosting, as well as the app itself. Fortunately, no major issues popped up, and the on-call team got to sit on their hands the whole time.

Out of an abundance of caution, we did take a few extra steps before the showers started to try and ensure we were ready for any attack:

  1. We bumped the replica count in ECS from 2 desired instances to 5. We had autoscaling in place already, but we wanted extra buffer just to be safe.
  2. We increased the instance size from 512 CPU units to 2048 CPU units.

In all of our load testing pre-launch, we had seen that 512 CPU units was sufficient to handle 100,000 requests per second per instance with 99th percentile latency of 3.78ms. With these bumped limits in production, and in the middle of the highest activity on the site, we were very pleased to see the following CPU and memory usage graphs:

CPU usage

Memory usage

This was a nice testament to the power of a Rust-written web service, combined with proper autoscaling and CloudFront caching.

Image creation

Alright, let's put the app itself to the side for a second. We knew that, at the end of the showers, we would need to quickly mint NFTs for every wallet that contributed more than $8 during a single shower. There were a few problems with this:

  • We had no idea how many users would contribute.
  • Generating the images is a relatively slow process.
  • Making the images available on IPFS—necessary for how NFTs work—was potentially going to be a bottleneck.

What we ended up doing was writing a Python script that pregenerated 100,000 or so meteor images. We did this generation directly on an Amazon EC2 instance. Then, instead of uploading the images to an IPFS hosting/pinning service, we ran the IPFS daemon directly on this EC2 instance. We additionally backed up all the images on S3 for redundant storage. Then we launched a second EC2 instance for redundant IPFS hosting.

This Python script not only generated the images, but also generated a CSV file mapping the image Content ID (IPFS address) together with various pieces of metadata about the meteor image, such as the meteor body. We'll use this CID/meteor image metadata mapping for correct minting next.

All in all, this worked just fine. However, there were some hurdles getting there, and we have plans to change this going forward in future stages of the NFT evolution. We'll mention those below.


Once the shower finished, we needed to get NFTs into user wallets as quickly as possible. That meant we needed two different things:

  1. All the NFT images on IPFS, which we had.
  2. A set of CSV files providing the NFTs to be generated, together with all of their metadata and owners.

The former was handled by the previous step. The latter was handled by additional Rust tooling we wrote, leveraging the same internal libraries we wrote for the backend application. The purpose of this tooling was to:

  • Aggregate the total set of contributions from the blockchain.
  • Stratify contributions into individual meteors of different rarity.
  • Apply the appropriate algorithms to randomly decide which meteors receive an egg and which don't.
  • Assign eggs among the meteors.
  • Assign additional metadata to the meteors.
  • Choose an appropriate and unique meteor image for each meteor based on its needed metadata. (This relies on the Python-generated CSV file above.)
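The stratification step in the list above can be sketched like this. This is a simplified Haskell model, not the real Rust tooling, and the UST cutoff values below are invented for illustration; the actual thresholds were set by the event's rules:

```haskell
-- The four rarity tiers mentioned above.
data Rarity = Common | Rare | Ancient | Legendary
  deriving (Eq, Ord, Show)

-- Map a wallet's aggregated UST contribution for a shower to a rarity
-- tier. The cutoffs here are hypothetical.
rarity :: Double -> Rarity
rarity ust
  | ust >= 1000 = Legendary
  | ust >= 100  = Ancient
  | ust >= 25   = Rare
  | otherwise   = Common
```

The real tooling layered the egg-assignment randomness and the metadata on top of a stratification of this general shape.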

This process produced a few different pieces of data:

  • CSV files for meteor NFT generation. There's nothing secret about these; you could reconstruct them yourself by analyzing the NFT minting on the blockchain.
  • The distribution of attributes (such as essence, crystals, distance, etc.) among the meteors for calculating rarity of individual traits. Again, this can be derived easily from public information.
  • A file that tracks the meteor/egg mapping. This is the one outcome from this process that is a closely guarded secret.

This final point is also influencing the design of the next few stages of this project. Specifically, while a smart contract would be the more natural way to interact with NFTs in general, we cannot expose the meteor/egg mapping on the blockchain. Therefore, the "cracking" phase (which will allow users to exchange meteors for their potential eggs) will need to work with another backend application.

In any event, this metadata-generation process was something we tested multiple times on data from our beta runs, and we were ready to produce the files and send them over for minting soon after the shower. I believe users got NFTs in their wallets within 8 hours of the end of the shower, which was a pretty good timeframe overall.

Opening the cave

The final step was opening the cave, a new page on the meteor site that allows users to view their meteors. This phase was achieved by updating the configuration values of the backend to include:

  • The smart contract address of the NFT collection
  • The total number of meteors
  • The trait distribution

Once we switched the config values, the cave opened up, and users were able to access it. Besides pulling the static information mentioned above from the server, all cave page interactions occur fully client side, with the client querying the blockchain using the Terra.js library.

And that's where we're at today. The showers completed, users got their meteors, the cave is open, and we're back to work on implementing the cracking phase of this project. W00t!


Overall, this project went pretty smoothly in production. However, there were a few gotcha moments worth mentioning.

FCD rate limiting

The biggest issue we hit during the showers, and the one that had the biggest potential to break everything, was FCD rate limiting. We'd done extensive testing prior to the real showers on testnet, with many volunteer testers in addition to bots. We never ran into a single example that I'm aware of where rate limiting kicked in.

However, the real production showers ran into such rate limiting issues about 10 showers into the event. (We'll look at how the issues manifested in a moment.) There are multiple potentially contributing factors:

  • There was simply far greater activity in the real event than we had tested for.
  • Most of our testing was limited to just 10 showers, and the real event went for 44.
  • There may be different rate limiting rules for FCD on mainnet versus testnet.

Whatever the case, we began to notice the rate limiting when we tried to roll out a new feature. We implemented the Telescope functionality, which allowed users to see the historical floor prices in previous showers.


After pushing the change to ECS, however, we noticed that the new deployment didn't go live. The reason was that, during the initial data load process, the new processes were receiving rate limiting responses and dying. We tried fixing this by adding a delay or other kinds of retry logic. However, none of these combinations allowed the application to begin processing requests within ECS's readiness check period. (We could have simply turned off health checks, but that would have opened a new can of worms.)

This problem was fairly critical. Not being able to roll out new features or bug fixes was worrying. But more troubling was the lack of autohealing. The existing instances continued to run fine, because they only needed to download small amounts of data from FCD to stay up-to-date, and therefore never triggered the rate limiting. But if any of those instances went down, ECS wouldn't be able to replace them with healthy instances.

Fortunately, we had already written the majority of a caching solution in prior weeks, and had not finished the work because we thought it wasn't a priority. After a few hair-raising hours of effort, we got a solution in place which:

  • Saved all transactions to a YAML file (a binary format would have been a better choice, but YAML was the easiest to roll out)
  • Uploaded this YAML file to S3
  • Ran this save/upload process on a loop, updating every 10 minutes
  • Modified the application logic to start off by first downloading the YAML file from S3, and then doing a delta load from there using FCD

This reduced startup time significantly, bypassed the rate limiting completely, and allowed us to roll out new features and not worry about the entire site going down.
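The snapshot-plus-delta-load approach can be sketched like this. This is a simplified Haskell model of the idea, not the real application (which is in Rust); Tx, txHeight, and fetchAbove are stand-ins for the real transaction type and FCD query:

```haskell
-- A stand-in transaction type: we only track the block height here.
data Tx = Tx { txHeight :: Int } deriving (Eq, Show)

-- Start from a cached snapshot (e.g. the YAML file pulled from S3)
-- and fetch only transactions newer than what we already have,
-- instead of re-downloading the entire history from FCD.
deltaLoad :: [Tx] -> (Int -> [Tx]) -> [Tx]
deltaLoad snapshot fetchAbove = snapshot ++ fetchAbove newest
  where
    newest = if null snapshot then 0 else maximum (map txHeight snapshot)
```

Because the delta is small, startup only issues a handful of FCD requests and stays well under the rate limit.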

IPFS hosting

FP Complete's DevOps approach is decidedly cloud-focused. For large blob storage, our go-to solution is almost always cloud-based blob storage, which would be S3 in the case of Amazon. We had zero experience with large scale IPFS data hosting prior to this project, which presented a unique challenge.

As mentioned, we didn't want to go with one of the IPFS pinning services, since the rate limiting may have prevented us from uploading all the pregenerated images. (Rate limiting is beginning to sound like a pattern here...) Being comfortable with S3, we initially tried hosting the images using go-ds-s3, a plugin for the ipfs CLI that uses S3 for storage. We still don't know why, but this never worked correctly for us. Instead, we reverted to storing the raw image data on Amazon EBS, which is more expensive and less durable, but actually worked. To fix the durability issue, we backed up all the raw image files to S3.

Overall, however, we're not happy with this outcome. The cost for this hosting is relatively high, and we haven't set up a truly fault-tolerant, highly available hosting. At this point, we would like to switch over to an IPFS pinning service, such as Pinata. Now that the images are available on IPFS, issuing API calls to pin those files should be easier than uploading the complete images. We're planning on using this as a framework going forward for other images, namely:

  • Generate the raw images on EC2
  • Upload for durability to S3
  • Run ipfs locally to make the images available on IPFS
  • Pin the images to a service like Pinata
  • Take down the EC2 instance

The next issue we ran into was... RATE LIMITING, again. This time, we discovered that Cloudflare's IPFS gateway was rate limiting users on downloading their meteor images, resulting in a situation where users would see only some of their meteors appear in their cave page. We solved this one by sticking CloudFront in front of the S3 bucket holding the meteor images and serving from there instead.

Going forward, when it's available, Cloudflare R2 is a promising alternative to the S3+CloudFront offering, due to reduced storage cost and entirely removed bandwidth costs.

Lessons learned

This project was a great mix of leveraging existing expertise and pairing with some new challenges. Some of the top lessons we learned here were:

  1. We got a lot of experience with working directly with the LCD and FCD APIs for Terra from Rust code. Previously, with our DeFi work, this almost exclusively sat behind Terra.js usage.
  2. IPFS was a brand-new topic for us, and we got to play with some pretty extreme cases right off the bat. Understanding the concepts in pinning and gateways will help us immensely with future NFT work.
  3. Since ECS is a relatively unusual technology for us, we got to learn quite a few of the idiosyncrasies it has versus Kubernetes, our more standard toolchain.
  4. While rate limiting is a concept we're familiar with and have worked with many times in the past, these particular obstacles were all new, and each of them surprising in different ways. Typically, we would have some simpler workarounds for these rate limiting issues, such as using authenticated requests. Having to solve each problem in such an extreme way was surprising.
  5. And while we've been involved in blockchain and smart contract work for years, this was our first time working directly with NFTs. This was probably the simplest lesson learned. The API for querying the NFTs contracts is fairly straightforward, and represented a small portion of the time spent on this project.


We're very excited to have been part of such a successful event as the Levana Dragons NFT meteor shower. This was a fun site to work on, with a huge and active user base, and some interesting challenges. It was great to pair together some of our standard cloud DevOps practices with blockchain and smart contract common practices. And using Rust brought some great advantages we're quite happy with.

Going forward, we're looking forward to getting to continue evolving the backend, frontend, and DevOps of this project, just like the NFTs themselves will be evolving. Happy dragon luck to all!

Interested in learning more? Check out these relevant articles

Does this kind of work sound interesting? Consider applying to work at FP Complete.

November 17, 2021 12:00 AM

November 16, 2021

Mark Jason Dominus

What is not portable

I had a small dispute this week about whether the Osborne 1 computer from 1981 could be considered a “laptop”. It certainly can't:

The Osborne 1 is a beige box the size of a toolbox.  Its lid conceals a full-sized keyboard attached to the rest of the box by a ribbon cable.  Packed into the box are a very small CRT monitor, two 5¼ inch diskette drives, assorted I/O receptacles, and, not visible, the computer itself. Bilby, CC BY 3.0 via Wikimedia Commons

The Osborne was advertised as a “portable” computer. Wikipedia describes it, more accurately, as “luggable”. I had a friend who owned one, and at the time I remarked “those people would call the Rock of Gibraltar ‘portable’, if it had a handle.”

Looking into it a little more this week, I learned that the Osborne weighed 24½ pounds. Or, in British terms, 1¾ stone. If your computer has a weight measurable in “stone”, it ain't portable.

by Mark Dominus at November 16, 2021 03:44 PM

November 15, 2021

Brent Yorgey

Competitive programming in Haskell: Enumeration

I’m in the middle of a multi-part series on implementing BFS in Haskell. In my last post, we saw one implementation, but I claimed that it is not fast enough to solve Modulo Solitaire, and I promised to show off a faster implementation in this post, but I lied; we have to take a small detour first.

The main idea to make a faster BFS implementation is to replace the HashMaps from last time with mutable arrays, but hopefully in such a way that we get to keep the same pure API. Using mutable arrays introduces a few wrinkles, though.

  1. The API we have says we get to use any type v for our vertices, as long as it is an instance of Ord and Hashable. However, this is not going to work so well for mutable arrays. We still want the external API to allow us to use any type for our vertices, but we will need a way to convert vertices to and from Int values we can use to index the internal mutable array.

  2. A data structure like HashMap is dynamically sized, but we don’t have this luxury with arrays. We will have to know the size of the array up front.

In other words, we need to provide a way to bijectively map vertices to a finite prefix of the natural numbers; that is, we need what I call invertible enumerations. This idea has come up for me multiple times: in 2016, I wrote about using such an abstraction to solve another competitive programming problem, and in 2019 I published a library for working with invertible enumerations. I’ve now put together a lightweight version of that library for use in competitive programming. I’ll walk through the code below, and you can also find the source code in my comprog-hs repository.

First, some extensions and imports.

{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

module Enumeration where

import qualified Data.List as L
import Data.Hashable
import qualified Data.Array as A
import qualified Data.HashMap.Strict as HM

An Enumeration a consists of a cardinality, and two functions, select and locate, which together form a bijection between (some subset of) values of type a and a finite prefix of the natural numbers. We can convert an Enumeration into a list just by mapping the select function over that finite prefix.

data Enumeration a = Enumeration
  { card   :: !Int
  , select :: Int -> a
  , locate :: a -> Int
  }

enumerate :: Enumeration a -> [a]
enumerate e = map (select e) [0 .. card e-1]

Since a occurs both positively and negatively, Enumeration is not a Functor, but we can map over Enumerations as long as we provide both directions of a bijection a <-> b.

mapE :: (a -> b) -> (b -> a) -> Enumeration a -> Enumeration b
mapE f g (Enumeration c s l) = Enumeration c (f . s) (l . g)

We have various fundamental ways to build enumerations: empty and unit enumerations, and an identity enumeration on a finite prefix of natural numbers.

voidE :: Enumeration a
voidE = Enumeration 0 (error "select void") (error "locate void")

unitE :: Enumeration ()
unitE = singletonE ()

singletonE :: a -> Enumeration a
singletonE a = Enumeration 1 (const a) (const 0)

finiteE :: Int -> Enumeration Int
finiteE n = Enumeration n id id

We can automatically enumerate all the values of a Bounded Enum instance. This is useful, for example, when we have made a custom enumeration type.

boundedEnum :: forall a. (Enum a, Bounded a) => Enumeration a
boundedEnum = Enumeration
  { card   = hi - lo + 1
  , select = toEnum . (+lo)
  , locate = subtract lo . fromEnum
  }
  where
    lo, hi :: Int
    lo = fromIntegral (fromEnum (minBound @a))
    hi = fromIntegral (fromEnum (maxBound @a))

We can also build an enumeration from an explicit list. We want to make sure this is efficient, since it is easy to imagine using this e.g. on a very large list of vertex values given as part of the input of a problem. So we build an array and a hashmap to allow fast lookups in both directions.

listE :: forall a. (Hashable a, Eq a) => [a] -> Enumeration a
listE as = Enumeration n (toA A.!) (fromA HM.!)
  where
    n = length as
    toA :: A.Array Int a
    toA = A.listArray (0,n-1) as
    fromA :: HM.HashMap a Int
    fromA = HM.fromList (zip as [0 :: Int ..])

Finally, we have a couple ways to combine enumerations into more complex ones, via sum and product.

(>+<) :: Enumeration a -> Enumeration b -> Enumeration (Either a b)
a >+< b = Enumeration
  { card   = card a + card b
  , select = \k -> if k < card a then Left (select a k) else Right (select b (k - card a))
  , locate = either (locate a) ((+card a) . locate b)
  }

(>*<) :: Enumeration a -> Enumeration b -> Enumeration (a,b)
a >*< b = Enumeration
  { card = card a * card b
  , select = \k -> let (i,j) = k `divMod` card b in (select a i, select b j)
  , locate = \(x,y) -> card b * locate a x + locate b y
  }
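As a quick usage example of the product combinator, here is a tiny self-contained program. It restates just the definitions it needs from above so that it runs on its own; the enumerations being combined are arbitrary small examples:

```haskell
-- Restated from the post so this example is self-contained.
data Enumeration a = Enumeration
  { card   :: !Int
  , select :: Int -> a
  , locate :: a -> Int
  }

enumerate :: Enumeration a -> [a]
enumerate e = map (select e) [0 .. card e - 1]

finiteE :: Int -> Enumeration Int
finiteE n = Enumeration n id id

(>*<) :: Enumeration a -> Enumeration b -> Enumeration (a, b)
a >*< b = Enumeration
  { card   = card a * card b
  , select = \k -> let (i, j) = k `divMod` card b in (select a i, select b j)
  , locate = \(x, y) -> card b * locate a x + locate b y
  }

-- enumerate (finiteE 2 >*< finiteE 3)
--   == [(0,0),(0,1),(0,2),(1,0),(1,1),(1,2)]
-- locate (finiteE 2 >*< finiteE 3) (1,2) == 5
```

Note how select and locate invert each other: locating a pair and then selecting at that index gives back the same pair, which is exactly the bijection the BFS implementation will rely on.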

There are a few more combinators in the source code but I don’t know whether I’ll ever use them. You can read about them if you want. For now, let’s try using this to solve a problem!

…ah, who am I kidding, I can’t find any problems that can be directly solved using this framework. Invertibility is a double-edged sword—we absolutely need it for creating an efficient BFS with arbitrary vertices, and the combinators will come in quite handy if we want to use some complex type for vertices. However, requiring invertibility also limits the expressiveness of the library. For example, there is no Monad instance. This is why my simple-enumeration library has both invertible and non-invertible variants.

by Brent at November 15, 2021 10:29 PM

November 13, 2021

Mark Jason Dominus

I said it was obvious but it was false

In a recent article about Norbert Wiener's construction of ordered pairs, I wrote:

The first thing I thought of was $$\pi_2(p) = \bigcup \{x\in p: |x| = 1\}$$ (“give me the contents of the singleton elements of \(p\)”) but one could also do $$\pi_2(p) = \{x: \{\{x\}\} \in p\},$$ and it's obvious that both of these are correct.

I said “it's obvious that both are correct”, but actually no! The first one certainly seems correct, but when I thought about it more carefully, I realized that the first formula is extremely complex, vastly more so than the second. That \(|x| = 1\) notation, so intuitive and simple-seeming, is hiding a huge amount of machinery. It denotes the cardinality of the set \(x\), and cardinality is not part of the base language of set theory. Normally we don't worry about this much, but in this context, where we are trying to operate at the very bottom bare-metal level of set theory, it is important.

To define \(|x|\) requires a lot of theoretical apparatus. First we have to define the class of ordinals. Then we can show that every well-ordered set is order-isomorphic to some ordinal. Zermelo's theorem says that \(x\) has a well-ordering \(W\), and then \(\langle x, W\rangle\) is order-isomorphic to some ordinal. There might be multiple such ordinals, but Zermelo's theorem guarantees there is at least one, and we can define \(|x|\) to be the smallest such ordinal. Whew! Are we done yet?

We are not. There is still more apparatus hiding in the phrases “well-ordering” and “order-isomorphic”: a well-ordering is a certain kind of relation, and an order isomorphism is a certain kind of function. But we haven't defined relations and functions yet, and we can't, because we still don't have ordered pairs! And what is that \(\langle x, W\rangle\) notation supposed to mean anyway? Oh, it's just the… um… whoops.

So the idea is not just inconveniently complex, but circular.

We can get rid of some of the complexity if we restrict the notion of cardinality to finite sets (that eliminates the need for the full class of ordinals, for the axiom of choice, and for the full generality of Zermelo's theorem), and we can simplify it even more if we strip down our definition to the minimum requirements, which is that we only need to recognize cardinalities as large as 2.

But if we go that way we may as well dispense with a general notion of cardinality entirely, which is what the second definition does. Even without cardinality, we can still define a predicate which is true if and only if \(x\) is a singleton. That's simply: $$\operatorname{singleton}(x) \equiv_{def} \exists y. x = \{y\}.$$

One might want to restrict the domain of quantification here, depending on context.¹

To say that \(x\) has exactly two elements² we proceed analogously: $$\exists y. \exists z. z\ne y \land x = \{y, z\}.$$ And similarly we can define a predicate “has exactly \(n\) elements” for any fixed number \(n\), even for infinite \(x\). But a general definition of cardinality at this very early stage is far out of reach.

¹ This issue is not as complex as it might appear. Although it appears that \(y\) must range over all possible sets, we can sidestep this by restricting our investigation to some particular universe of discourse, which can be any set of sets that we like. The restricted version of \(\operatorname{singleton}\) is a true function, an actual set in the universe, rather than a class function.

² There does not seem to be any standard jargon for a set with exactly two elements. As far as I can tell, “doubleton” is not used. “Unordered pair” is not quite right, because the unordered pair \(\{a, a\}\) has only one element, not two.

by Mark Dominus at November 13, 2021 05:30 AM

November 12, 2021

Mark Jason Dominus

Stack Exchange is a good place to explain initial and terminal objects in the category of sets

The fact that singleton sets are terminal in the category of sets, and the empty set is initial, is completely elementary, so it's often passed over without discussion. But understanding it requires understanding the behavior of empty functions, and while there is nothing complex about that, novices often haven't thought it through, because empty functions are useless except for the important role they play in Set. So it's not unusual to see questions like this one:

I have trouble understanding the difference between initial and terminal objects in category theory. … Why there can be morphism from empty set to any other set? And why there is not morphism to empty set as well?

I'm happy with the following answer, which is of the “you already knew this, you only thought you didn't” type. It doesn't reveal any new information, it doesn't present any insights. All it does is connect together some things that the querent hasn't connected before.

This kind of connecting is an important part of pedagogy, one that Math Stack Exchange is uniquely well-suited to deal with. It is not well-handled by the textbook (which should not be spending time or space on such an elementary issue) or in lectures (likewise). In practice it's often handled by the TA (or the professor), during office hours, which isn't a good way to do it: the TA will get bored after the second time, and most students never show up to office hours anyway. It can be well-handled if the class has a recitation section where a subset of the students show up at a set time for a session with the TA, but upper-level classes like category theory don't usually have enough students to warrant this kind of organization. When I taught at math camp, we would recognize this kind of thing on the fly and convene a tiny recitation section just to deal with the one issue, but again, very few category theory classes take place at math camp.

Stack Exchange, on the other hand, is a great place to do this. There are no time or space limitations. One person can write up the answer, and then later querents can be redirected to the pre-written answer.

Your confusion seems to be not so much about initial and terminal objects, but about what those look like in the category of sets. Looking at the formal definition of “function” will help make clear some of the unusual cases such as functions with empty domains.

A function from \(A\) to \(B\) can be understood as a set of pairs $$\langle a,b\rangle$$ where \(a\in A\) and \(b\in B\). And:

There must be exactly one pair \(\langle a, b\rangle\) for each element \(a\) of \(A\).

Exactly one, no more and no less, or the set of pairs is not a function.

For example, the function that takes an integer and yields its square can be understood as the (infinite) set of ordered pairs:

$$\{ \ldots ,\langle -2, 4\rangle, \langle -1, 1\rangle, \langle 0, 0\rangle ,\langle 1, 1\rangle, \langle 2, 4\rangle\ldots\}$$

And for each integer \(n\) there is exactly one pair \(\langle n, n^2\rangle\). Some numbers can be missing on the right side (for example, there is no pair with \(3\) on the right) and some numbers can be repeated on the right (for example the function contains both \(\langle -2, 4\rangle\) and \(\langle 2, 4\rangle\)) but on the left each number appears exactly once.

Now suppose \(A\) is some set \(\{a_1, a_2, a_3, \ldots\}\) and \(B\) is a set with only one element \(b\). What does a function from \(A\) to \(B\) look like? There is only one possible function: it must be: $$\{ \langle a_1, b\rangle, \langle a_2, b\rangle, \ldots\}.$$ There is no choice about the left-side elements of the pairs, because there must be exactly one pair for each element of \(A\). There is also no choice about the right-side element of each pair. \(B\) has only one element, \(b\), so the right-side element of each pair must be \(b\).

So, if \(B\) is a one-element set, there is exactly one function from \(A\) to \(B\). This is the definition of “terminal”, and one-element sets are terminal.

Now what if it's \(A\) that has only one element? We have \(A = \{a\}\) and \(B = \{b_1, b_2, b_3, \ldots\}\). How many functions are there now? Only one?

One function is $$\{\langle a, b_1\rangle\}$$ another is $$\{\langle a, b_2\rangle\}$$ and another is $$\{\langle a, b_3\rangle\}$$ and so on. Each function is a set of pairs where the left-side elements come from \(A\), and each element of \(A\) is in exactly one pair. \(A\) has only one element, so there can only be one pair in each function. Still, the functions are all different.

You said:

I would find it more intuitive if one-element set would be initial object too.

But for a one-element set \(A\) to be initial, there must be exactly one function \(A\to B\) for each \(B\). And we see above that usually there are many functions \(A\to B\).

Now we do functions on the empty set. Suppose \(A\) is \(\{a_1, a_2, \ldots\}\) and \(B\) is empty. What does a function from \(A\) to \(B\) look like? It must be a set of pairs, it must have exactly one pair for each element of \(A\), and the right-side of each pair must be an element of \(B\). But \(B\) has no elements, so this is impossible: $$\{\langle a_1, ?\rangle, \langle a_2, ?\rangle, \ldots\}.$$

There is nothing to put on the right side of the pairs. So there are no functions from \(A\) to \(B\) when \(B\) is empty. (There is one exception to this claim, which we will see in a minute.)

What if \(A\) is empty and \(B\) is not, say \(B = \{b_1, b_2, b_3, \ldots\}\)? A function from \(A\) to \(B\) is a set of pairs that has exactly one pair for each element of \(A\). But \(A\) has no elements. No problem, the function has no pairs! $$\{\}$$

A function is a set of pairs, and the set can be empty. This is called the “empty function”. When \(A\) is the empty set, there is exactly one function from \(A\) to \(B\), the empty function, no matter what \(B\) is. This is the definition of “initial”, so the empty set is initial.

Does the empty set have an identity morphism? It does; the empty function is its identity morphism. This is the one exception to the claim that there are no functions into an empty set: if \(A\) is also empty, the empty function is such a function, the only one.
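The counting arguments above can be checked mechanically. In this sketch (`allFunctions` is my name, not the article's), a function from a finite \(A\) to \(B\) is modelled as an association list with exactly one pair per element of \(A\), and the list applicative enumerates every such list:

```haskell
-- A function A -> B as a set of pairs: exactly one pair per element
-- of A, with the right side drawn from B. traverse with the list
-- applicative enumerates all the possible sets of pairs.
allFunctions :: [a] -> [b] -> [[(a, b)]]
allFunctions as bs = traverse (\a -> [ (a, b) | b <- bs ]) as

main :: IO ()
main = do
  print (length (allFunctions [1, 2, 3 :: Int] "x"))  -- one-element codomain: 1 function
  print (length (allFunctions "a" [1, 2, 3 :: Int]))  -- one-element domain: 3 functions
  print (allFunctions ([] :: [Int]) "xy")             -- empty domain: just the empty function
  print (allFunctions [1 :: Int] "")                  -- empty codomain, nonempty domain: none
```

The four printed results correspond exactly to the four cases argued in prose: terminality of one-element sets, non-initiality of one-element sets, initiality of the empty set, and the impossibility of mapping into the empty set from anything nonempty.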

The issue for topological spaces is exactly the same:

  • When \(B\) has only one element, there is exactly one continuous map \(A\to B\) for every \(A\).
  • When \(A\) is empty, there is exactly one continuous map \(A\to B\) for every \(B\): the empty function is the continuous map.
  • When \(A\) has only one element, there are usually many continuous maps \(A\to B\), one different one for each element of \(B\).

There are categories in which the initial and terminal objects are the same:

  • In the category of groups, the trivial group (with one element) is both initial and terminal.

  • A less important but simpler example is Set*, the category of pointed sets, whose objects are nonempty sets in which one element has been painted red. The morphisms of Set* are ordinary functions that map the red element in the domain to the red element of the codomain.

I hope this was some help.

[ Thanks to Rupert Swarbrick for pointing out that I wrote “homeomorphism” instead of “continuous map” ]

by Mark Dominus at November 12, 2021 03:27 PM

Sandy Maguire

Dragging Haskell Kicking and Screaming into the Century of the Fruitbat

Yesterday, the new Haskell CLC decided to remove (/=) from the Eq typeclass. As expected, the community has embraced this change with characteristic grace. The usual state of affairs is that people who like the changes are silent, while people who don’t are exceptionally vocal. The result: people working hard to improve things only ever get yelled at, and nobody says “hey, you’re doing a great job. Thank you!”

To the new Haskell CLC, with my deepest sincerity:

You’re doing a great job! Thank you.

Today I’d like to talk a little about the problems I see in the Haskell ecosystem. These are by no means insurmountable problems. They’re not even technical problems. They’re social problems, which suck, because those are the sort that are hard to solve.

The (/=) proposal has caused a surprising amount of uproar about such a stupid change. But why? Nobody actually cares technically about whether we can define not-equals instead of equals. I mean, I don’t care. You don’t care. I’m willing to bet dollars to donuts that nobody reading this essay has actually ever defined (/=) in an instance.

No, the outcry is because (/=) is a proxy for the real issue. As best I can tell, there are three camps, and everyone is talking past one another. This is my attempt to organize the camps, steel-man them (well, two out of three), and respond with my commentary.

Personally, I support the removal of (/=) because it’s a tiny, inconspicuous change that nobody is going to notice. It’s a bright beacon in the darkness of the ecosystem saying “hey, the situation sucks, but we’re willing to improve the situation.” Haskell has tons of problems like this, none of which are controversial — things like how head is antithetical to everything we like about Haskell, and yet it’s included by default in every fucking module of all time. Yes, there are valid arguments as to why it shouldn’t be removed, but nobody would argue to put it in if we were designing Prelude today.

In my eyes, removing (/=) from Eq shows that we as a community are willing to pay the cost to move out of a bad Nash equilibrium. Because as far as I see it, there are only two options:

  1. break backwards compatibility and pay the migration costs
  2. don’t break backwards compatibility, and pay the costs of having a shitty standard library forever

Asymptotically, option 1 is better than option 2. There is a constant amount of code today that we will need to fix, compared to a linear (or worse!) amount of work in the future to work around having a shitty standard library. And yes, that cost probably is forever, because if we have too much code to break today, we’re going to have even more code that will break tomorrow.

I want us to be able to remove (/=), because it gives me hope that one day we will be able to add Foldable1, and get join and liftA2 into Prelude, and the other hundreds of tiny improvements that would make all of our lives better. Not profoundly, but each one saves us a few seconds, multiplied by the number of practitioners, integrated over all of our careers.

And yes, those things will come at a cost, but it’s a cost that is getting bigger the longer we don’t pay it.

In the particular example of (/=), there isn’t even any cost. There are ~50 packages that define (/=), and the solution is just to delete those definitions. Yes, it’s churn, but I personally am willing to send a patch to each package. If you’re the maintainer of such a package, email me and I’ll just go fix it for you.

This is not a big issue.

The second camp as best I can tell, are educators who aren’t particularly up to date on what Haskell is in 2021. These are the people saying “this will break our tutorials” (and, I suspect, are also the ones who say “we need head in Prelude because it’s good for beginners”.) While this group clearly has good intentions, I don’t think they get it. Educational materials for everything go obsolete, made much worse by the half-life of knowledge. If this is truly a thing you care about, just update your tutorials. There is no shortage of people in the community writing explanatory material, and I guarantee they will rush in to fill any void.

Of much more importance is the third camp. They also seem not to care about (/=) in particular. But they are concerned about “backwards compatibility at all costs.” And to them, it seems, (/=) is a slippery slope. If we can’t maintain backwards compatibility for something as stupid as (/=), we’ve got no chance of having long-term maintainability. It’s a perfectly good argument.

To quote Dmitrii:

The size of breakage doesn’t matter. Breakage is breakage.

My personal rebuttal against this attitude is that it gets us stuck in extremely suboptimal equilibria. If we can never break anything, then every mistake must exist for perpetuity. By virtue of being human, we’re unfortunately fallible, and thus are going to make mistakes.

However, I can definitely sympathize here. If every week you are called on to fix a tiny, breaking change, sooner than later you’re going to quit. We should not encourage any behavior that leads to burnout in our best players. That’s a bad long term plan.

But in my experience, it’s not breaking changes that are the problem. It’s lots of breaking changes that are the problem. Breaking changes are annoying, sure, but what’s worse is the context shifting necessary to fix a breaking change. Doing the work is \(O(n)\) with respect to the breakages, but there is an extremely high constant factor. With this in mind, one big breakage is significantly better than lots of tiny breakages.

Since we’re just about to break things, might I again suggest we add Foldable1, and clean up Prelude? If breakage is breakage (it is), we might as well take advantage of it and do as much cleanup as possible. This is an excellent opportunity. The status quo is for all of us to argue about it every week, with half the people saying “these are bad circumstances and we should fix them,” and the other half saying “yes, they are bad circumstances, but breakage is worse.”

But given that we now have breakage, let’s make the most of it.

November 12, 2021 12:15 PM

Daniel Mlot (duplode)

Divisible and the Monoidal Quartet

A recent blog post by Gabriella Gonzalez, Co-Applicative programming style, has sparked discussion on ergonomic ways to make use of the Divisible type class. The conversation pointed to an interesting rabbit hole, and jumping into it resulted in these notes, in which I attempt to get a clearer picture of the constellation of monoidal functor classes that Divisible belongs to. The gist of it is that “Divisible is a contravariant Applicative, and Decidable is a contravariant Alternative” is not a full picture of the relationships between the classes, as there are a few extra connections that aren’t obvious to the naked eye.

Besides Gabriella’s post, which is an excellent introduction to Divisible, I recommend as background reading Tom Ellis’ Alternatives convert products to sums, which conveys the core intuition about monoidal functor classes in an accessible manner. There is a second post by Tom, The Mysterious Incomposability of Decidable, that this article will be in constant dialogue with, in particular as a source of examples. From now on I will refer to it as “the Decidable post”. Thanks to Gabriella and Tom for inspiring the writing of this article.

For those of you reading with GHCi on the side, the key general definitions in this post are available from this .hs file.

Applicative


As I hinted in the introduction, this post is not solely about Divisible, but more broadly about monoidal functor classes. To start from familiar ground and set up a reference point, I will first look at the best known of those classes, Applicative. We won’t, however, stick with the usual presentation of Applicative in terms of (<*>), as it doesn’t generalise to the other classes we’re interested in. Instead, we will switch to the monoidal presentation: 1

zipped :: Applicative f => f a -> f b -> f (a, b)
zipped = liftA2 (,)

-- An operator spelling, for convenience.
(&*&) :: Applicative f => f a -> f b -> f (a, b)
(&*&) = zipped
infixr 5 &*&

unit :: Applicative f => f ()
unit = pure ()

-- Laws:

-- unit &*& v ~ v
-- u &*& unit ~ u
-- (u &*& v) &*& w ~ u &*& (v &*& w)

(Purely for the sake of consistency, I will try to stick to the Data.Functor.Contravariant.Divisible naming conventions for functions like zipped.)

The matter with (<*>) (and also liftA2) that stops it from being generalised for our purposes is that it leans heavily on the fact that Hask is a Cartesian closed category, with pairs as the relevant product. Without that, the currying and the partial application we rely on when writing in applicative style become unfeasible.
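Conversely, Cartesian closure is exactly what lets (<*>) be recovered from the monoidal presentation. A quick sketch (apFromZipped is my name, not the post's):

```haskell
import Control.Applicative (liftA2)

-- Recover (<*>) from the monoidal presentation: pair up the wrapped
-- function and the wrapped argument, then collapse the pair with
-- application. The final step relies on functions being first-class
-- values, i.e. on Hask being Cartesian closed.
apFromZipped :: Applicative f => f (a -> b) -> f a -> f b
apFromZipped u v = uncurry ($) <$> liftA2 (,) u v

main :: IO ()
main = print (apFromZipped [(+ 1), (* 10)] [1, 2 :: Int])  -- [2,3,10,20]
```

It is this last `uncurry ($)` step that has no analogue in the contravariant classes, which is why the monoidal presentation is the one that generalises.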

While keeping ourselves away from (<*>) and liftA2, we can recover, if not the full flexibility, the power of applicative style with a variant of liftA2 that takes an uncurried function:

lizip :: Applicative f => ((a, b) -> c) -> f a -> f b -> f c
lizip f u v = fmap f (zipped u v)

(That is admittedly a weird name; all the clashing naming conventions around this topic have left me with few good options.)

On a closing note for this section, my choice of operator for zipped is motivated by the similarity with (&&&) from Control.Arrow:

(&&&) :: Arrow p => p a b -> p a c -> p a (b, c)

In particular, (&*&) for the function Applicative coincides with (&&&) for the function Arrow.

Leaning on connections like this one, I will use Control.Arrow combinators liberally, beginning with the definition of the following two convenience functions that will show up shortly:

dup :: a -> (a, a)
dup = id &&& id

forget :: Either a a -> a
forget = id ||| id


As summarised at the beginning of the Decidable post, while Applicative converts products to products covariantly, Divisible converts products to products contravariantly. From that point of view, I will take divided, the counterpart to zipped, as the fundamental combinator of the class:

-- This is the divided operator featured on Gabriella's post, soon to
-- become available from Data.Functor.Contravariant.Divisible
(>*<) :: Divisible k => k a -> k b -> k (a, b)
(>*<) = divided
infixr 5 >*<

-- Laws:

-- conquered >*< v ~ v
-- u >*< conquered ~ u
-- (u >*< v) >*< w ~ u >*< (v >*< w)

Recovering divide from divided is straightforward, and entirely analogous to how lizip can be obtained from zipped:

divide :: Divisible k => (a -> (b, c)) -> k b -> k c -> k a
divide f u v = contramap f (divided u v)

Lessened currying aside, we might say that divide plays the role of liftA2 in Divisible.

It’s about time for an example. For that, I will borrow the one from Gabriella’s post:

data Point = Point { x :: Double, y :: Double, z :: Double }
    deriving Show

nonNegative :: Predicate Double
nonNegative = Predicate (0 <=)

-- (>$<) = contramap
nonNegativeOctant :: Predicate Point
nonNegativeOctant =
    adapt >$< nonNegative >*< nonNegative >*< nonNegative
  where
    adapt = x &&& y &&& z

The slight distortion to Gabriella’s style in using (&&&) to write adapt pointfree is meant to emphasise how that deconstructor can be cleanly assembled out of the component projection functions x, y and z. Importantly, that holds in general: pair-producing functions a -> (b, c) are isomorphic to (a -> b, a -> c) pairs of projections. That gives us a variant of divide that takes the projections separately:

tdivide :: Divisible k => (a -> b) -> (a -> c) -> k b -> k c -> k a
tdivide f g u v = divide (f &&& g) u v
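The isomorphism appealed to above can be witnessed directly (split and fuse are my names for the two directions):

```haskell
import Control.Arrow ((&&&))

-- A pair-producing function is the same data as a pair of projections.
split :: (a -> (b, c)) -> (a -> b, a -> c)
split f = (fst . f, snd . f)

fuse :: (a -> b, a -> c) -> (a -> (b, c))
fuse (g, h) = g &&& h

main :: IO ()
main = print (fuse (split (\n -> (n + 1, n * 2))) (3 :: Int))  -- (4,6)
```

Going around the loop in either direction gets you back where you started, which is what licenses passing the projections separately, as tdivide does.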

Besides offering an extra option with respect to ergonomics, tdivide hints at extra structure available from the Divisible class. Let’s play with the definitions a little:

tdivide f g u v
divide (f &&& g) u v
contramap (f &&& g) (divided u v)
contramap ((f *** g) . dup) (divided u v)
(contramap dup . contramap (f *** g)) (divided u v)
contramap dup (divided (contramap f u) (contramap g v))
divide dup (contramap f u) (contramap g v)

divide dup, which duplicates input in order to feed each of its arguments, is a combinator worthy of a name, or even two:

dplus :: Divisible k => k a -> k a -> k a
dplus = divide dup

(>+<) :: Divisible k => k a -> k a -> k a
(>+<) = dplus
infixr 5 >+<

So we have:

tdivide f g u v = dplus (contramap f u) (contramap g v)

Or, using the operators:

tdivide f g u v = f >$< u >+< g >$< v

An alternative to using the projections to set up a deconstructor to be used with divide is to contramap each projection to its corresponding divisible value and combine the pieces with (>+<). That is the style favoured by Tom Ellis, 2 which is where the “t” prefix of tdivide comes from. For instance, Gabriella Gonzalez’s example would be spelled as follows in this style:

nonNegativeOctantT :: Predicate Point
nonNegativeOctantT =
    x >$< nonNegative >+< y >$< nonNegative >+< z >$< nonNegative

Alternative


The (>+<) combinator defined above is strikingly similar to (<|>) from Alternative, down to its implied monoidal nature: 3

(>+<) :: Divisible k => k a -> k a -> k a

(<|>) :: Alternative f => f a -> f a -> f a

It is surprising that (>+<) springs forth in Divisible rather than Decidable, which might look like the more obvious candidate to be Alternative’s contravariant counterpart. To understand what is going on, it helps to look at Alternative from the same perspective we have used here for Applicative and Divisible. For that, first of all we need an analogue to divided. Let’s borrow the definition from Alternatives convert products to sums:

combined :: Alternative f => f a -> f b -> f (Either a b)
combined u v = Left <$> u <|> Right <$> v

(-|-) :: Alternative f => f a -> f b -> f (Either a b)
(-|-) = combined
infixr 5 -|-

-- We also need a suitable identity:
zero :: Alternative f => f Void
zero = empty

-- Laws:

-- zero -|- v ~ v
-- u -|- zero ~ u
-- (u -|- v) -|- w ~ u -|- (v -|- w)

(I won’t entertain the various controversies about the Alternative laws here, nor any interaction laws involving Applicative. Those might be interesting matters to think about from this vantage point, though.)

A divide analogue follows:

combine :: Alternative f => (Either a b -> c) -> f a -> f b -> f c
combine f u v = fmap f (combined u v)

Crucially, Either a b -> c can be split in a way dual to what we have seen earlier with a -> (b, c): an Either-consuming function amounts to a pair of functions, one to deal with each component. That being so, we can use the alternative style trick done for Divisible by dualising things:

tcombine :: Alternative f => (a -> c) -> (b -> c) -> f a -> f b -> f c
tcombine f g = combine (f ||| g)

tcombine f g u v
combine (f ||| g) u v
fmap (f ||| g) (combined u v)
fmap (forget . (f +++ g)) (combined u v)
fmap forget (combined (fmap f u) (fmap g v))
combine forget (fmap f u) (fmap g v)
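Dually to the pair case, the splitting of Either-consuming functions used in the derivation above can be made explicit (again, the names are mine):

```haskell
import Control.Arrow ((|||))

-- An Either-consuming function is the same data as a pair of handlers,
-- one per summand.
split' :: (Either a b -> c) -> (a -> c, b -> c)
split' f = (f . Left, f . Right)

fuse' :: (a -> c, b -> c) -> (Either a b -> c)
fuse' (g, h) = g ||| h

main :: IO ()
main = print (map (fuse' (split' (either negate (+ 1))))
                  [Left 3, Right 3 :: Either Int Int])  -- [-3,4]
```

This duality is what makes the alternative, (<|>)-based style possible for Alternative, just as (>+<) makes it possible for Divisible.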

To keep things symmetrical, let’s define:

aplus :: Alternative f => f a -> f a -> f a
aplus = combine forget
-- (<|>) = aplus

So that we end up with:

tcombine f g u v = aplus (fmap f u) (fmap g v)

tcombine f g u v = f <$> u <|> g <$> v

For instance, here is the Alternative composition example from the Decidable post…

alternativeCompose :: [String]
alternativeCompose = show <$> [1,2] <|> reverse <$> ["hello", "world"]

… and how it might be rendered using combine/(-|-):

alternativeComposeG :: [String]
alternativeComposeG = merge <$> [1,2] -|- ["hello", "world"]
  where
    merge = show ||| reverse

There is, therefore, something of a subterranean connection between Alternative and Divisible. The function arguments to both combine and divide, whose types are dual to each other, can be split in a way that not only reveals an underlying monoidal operation, (<|>) and (>+<) respectively, but also allows for a certain flexibility in using the class combinators.

Decidable


Last, but not least, there is Decidable to deal with. Data.Functor.Contravariant.Divisible already provides chosen as the divided analogue, so let’s just supply the operator variant: 4

(|-|) :: Decidable k => k a -> k b -> k (Either a b)
(|-|) = chosen
infixr 5 |-|

-- Laws:

-- lost |-| v ~ v
-- u |-| lost ~ u
-- (u |-| v) |-| w ~ u |-| (v |-| w)

choose can be recovered from chosen in the usual way:

choose :: Decidable k => (a -> Either b c) -> k b -> k c -> k a
choose f u v = contramap f (chosen u v)

The a -> Either b c argument of choose, however, is not amenable to the function splitting trick we have used for divide and combine. Either-producing functions cannot be decomposed in that manner, as the case analysis to decide whether to return Left or Right cannot be disentangled. This is ultimately what Tom’s complaint about the “mysterious incomposability” of Decidable is about. Below is a paraphrased version of the Decidable example from the Decidable post:

data Foo = Bar String | Baz Bool | Quux Int
    deriving Show

pString :: Predicate String
pString = Predicate (const False)

pBool :: Predicate Bool
pBool = Predicate id

pInt :: Predicate Int
pInt = Predicate (>= 0)

decidableCompose :: Predicate Foo
decidableCompose = analyse >$< pString |-| pBool |-| pInt
  where
    analyse = \case
        Bar s -> Left s
        Baz b -> Right (Left b)
        Quux n -> Right (Right n)

The problem identified in the post is that there is no straightforward way around having to write “the explicit unpacking into an Either” performed by analyse. In the Divisible and Alternative examples, it was possible to avoid tuple or Either shuffling by decomposing the counterparts to analyse, but that is not possible here. 5

In the last few paragraphs, we have mentioned Divisible, Alternative and Decidable. What about Applicative, though? The Applicative example from the Decidable post is written in the usual applicative style:

applicativeCompose :: [[String]]
applicativeCompose =
    f <$> [1, 2] <*> [True, False] <*> ["hello", "world"]
  where
    f = \a b c -> replicate a (if b then c else "False")

As noted earlier, though, applicative style is a fortunate consequence of Hask being Cartesian closed, which makes it possible to turn (a, b) -> c into a -> b -> c. If we leave out (<*>) and restrict ourselves to (&*&), we end up having to deal explicitly with tuples, which is a dual version of the Decidable issue:

monoidalCompose :: [[String]]
monoidalCompose =
    consume <$> [1, 2] &*& [True, False] &*& ["hello", "world"]
  where
    consume (a, (b, c)) = replicate a (if b then c else "False")

Just like a -> Either b c functions, (a, b) -> c functions cannot be decomposed: the c value can be produced by using the a and b components in arbitrary ways, and there is no easy way to disentangle that.

Decidable, then, relates to Applicative in an analogous way to how Divisible does to Alternative. There are a few other similarities between them that are worth pointing out:

  • Neither Applicative nor Decidable offers a monoidal f a -> f a -> f a operation like the ones of Alternative and Divisible. A related observation is that, for example, Op’s Decidable instance inherits a Monoid constraint from Divisible but doesn’t actually use it in the method implementations.

  • choose Left and choose Right can be used to combine consumers so that one of them doesn’t actually receive input. That is analogous to how (<*) = lizip fst and (*>) = lizip snd combine applicative values while discarding the output from one of them.

  • Dually to how zipped/&*& for the function functor is (&&&), chosen for decidables such as Op and Predicate amounts to (|||). My choice of |-| as the corresponding operator hints at that.
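The second bullet's claim about discarding one side can be checked concretely with lizip from earlier (restated here so the sketch stands alone):

```haskell
import Control.Applicative (liftA2)

-- lizip, as defined earlier in the post.
lizip :: Applicative f => ((a, b) -> c) -> f a -> f b -> f c
lizip f u v = f <$> liftA2 (,) u v

main :: IO ()
main = do
  -- Keep the left output while still running both "effects":
  print (lizip fst [1, 2 :: Int] "ab" == ([1, 2] <* "ab"))  -- True
  -- Keep the right output:
  print (lizip snd [1, 2 :: Int] "ab" == ([1, 2] *> "ab"))  -- True
```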

In summary

To wrap things up, here is a visual summary of the parallels between the four classes:

Diagram of the four monoidal functor classes under consideration, with Applicative and Decidable in one diagonal, and Alternative and Divisible in the other.

To my eyes, the main takeaway of our figure of eight trip around this diagram has to do with its diagonals. Thanks to a peculiar kind of duality, classes in opposite corners of it are similar to each other in quite a few ways. In particular, the orange diagonal classes, Alternative and Divisible, have monoidal operations of f a -> f a -> f a signature that emerge from their monoidal functor structure.

That Divisible, from this perspective, appears to have more to do with Alternative than with Applicative leaves us a question to ponder: what does that mean for the relationship between Divisible and Decidable? The current class hierarchy, with Decidable being a subclass of Divisible, mirrors the Alternative-Applicative relationship on the other side of the covariant-contravariant divide. That, however, is not the only reasonable arrangement, and possibly not even the most natural one. 6


dplus is a monoidal operation

If we are to show that (>+<) is a monoidal operation, first of all we need an identity for it. conquer :: f a sounds like a reasonable candidate. It can be expressed in terms of conquered, the unit for divided, as follows:

-- conquer discards input.
conquer = const () >$< conquered

The identity laws do come out all right:

conquer >+< v = v  -- Goal
conquer >+< v  -- LHS
dup >$< (conquer >*< v)
dup >$< ((const () >$< conquered) >*< v)
dup >$< (first (const ()) >$< (conquered >*< v))
first (const ()) . dup >$< (conquered >*< v)
-- conquered >*< v ~ v
first (const ()) . dup >$< (snd >$< v)
snd . first (const ()) . dup >$< v
v  -- LHS = RHS

u >+< conquer = u  -- Goal
u >+< conquer  -- LHS
dup >$< (u >*< conquer)
dup >$< (u >*< (const () >$< conquered))
dup >$< (second (const ()) >$< (u >*< conquered))
second (const ()) . dup >$< (u >*< conquered)
-- u >*< conquered ~ u
second (const ()) . dup >$< (fst >$< u)
fst . second (const ()) . dup >$< u
u  -- LHS = RHS

And so does the associativity one:

(u >+< v) >+< w = u >+< (v >+< w)  -- Goal
(u >+< v) >+< w  -- LHS
dup >$< ((dup >$< (u >*< v)) >*< w)
dup >$< (first dup >$< ((u >*< v) >*< w))
first dup . dup >$< ((u >*< v) >*< w)
u >+< (v >+< w)  -- RHS
dup >$< (u >*< (dup >$< (v >*< w)))
dup >$< (second dup >$< (u >*< (v >*< w)))
second dup . dup >$< (u >*< (v >*< w))
-- (u >*< v) >*< w ~ u >*< (v >*< w)
-- assoc (x, (y, z)) = ((x, y), z)
second dup . dup >$< (assoc >$< ((u >*< v) >*< w))
assoc . second dup . dup >$< ((u >*< v) >*< w)
first dup . dup >$< ((u >*< v) >*< w)  -- LHS = RHS
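The laws above can also be checked concretely for Predicate. Below is a small self-contained sketch using only base's Data.Functor.Contravariant, with (>*<), conquer (here pConquer, to avoid needing the contravariant package) and (>+<) written out directly:

```haskell
import Data.Functor.Contravariant (Predicate (..), contramap)

-- divided for Predicate, written out (the contravariant package
-- provides this via the Divisible class)
(>*<) :: Predicate a -> Predicate b -> Predicate (a, b)
Predicate p >*< Predicate q = Predicate (\(x, y) -> p x && q y)

-- conquer for Predicate: the proposed identity for (>+<)
pConquer :: Predicate a
pConquer = Predicate (const True)

-- dplus: duplicate the input and feed it to both predicates
(>+<) :: Predicate a -> Predicate a -> Predicate a
u >+< v = contramap (\x -> (x, x)) (u >*< v)

main :: IO ()
main = do
    let u = Predicate ((> 2) :: Int -> Bool)
        v = Predicate even
        w = Predicate (< 10)
        run p = map (getPredicate p) [1 .. 12 :: Int]
    print (run (pConquer >+< v) == run v)                   -- left identity
    print (run (v >+< pConquer) == run v)                   -- right identity
    print (run ((u >+< v) >+< w) == run (u >+< (v >+< w)))  -- associativity
```

All three checks print True: for Predicate, (>+<) is simply pointwise (&&), which is monoidal with the constantly-True predicate as identity.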

Handling nested Either

The examples in this appendix are available as a separate .hs file.

There is a certain awkwardness in dealing with nested Either as anonymous sums that is hard to get rid of completely. Prisms are a tool worth looking into in this context, as they are largely about expressing pattern matching in a composable way. Let’s bring lens into Tom’s Decidable example, then:

data Foo = Bar String | Baz Bool | Quux Int
    deriving (Show)
makePrisms ''Foo

A cute trick with prisms is using outside to fill in the missing cases of a partial function (in this case, (^?! _Quux)):

anonSum :: APrism' s a -> (s -> b) -> s -> Either a b
anonSum p cases = set (outside p) Left (Right . cases)

decidableOutside :: Predicate Foo
decidableOutside = analyse >$< pString |-| pBool |-| pInt
  where
    analyse = _Bar `anonSum` (_Baz `anonSum` (^?! _Quux))

An alternative is using matching to write it in a more self-explanatory way:

matchingL :: APrism' s a -> s -> Either a s
matchingL p = view swapped . matching p

decidableMatching :: Predicate Foo
decidableMatching =
    choose (matchingL _Bar) pString $
    choose (matchingL _Baz) pBool $
    choose (matchingL _Quux) pInt $
    error "Missing case in decidableMatching"

These implementations have a few inconveniences of their own, the main one perhaps being that there is nothing to stop us from forgetting one of the prisms. The combinators from the total package improve on that by incorporating exhaustiveness checking for prisms, at the cost of requiring the sum type to be defined in a particular way.

There presumably also is the option of bringing in heavy machinery, and setting up an anonymous sum wrangler with Template Haskell or generics. In fact, it appears the shapely-data package used to offer precisely that. It might be worth taking a moment to make it build with recent GHCs.
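For comparison, the same shape can be had with no dependencies at all, by writing choose specialised to Predicate and the matchers as plain functions. This is only a sketch: the leaf predicates pString, pBool and pInt are hypothetical stand-ins, since the originals are not shown in this excerpt:

```haskell
import Data.Functor.Contravariant (Predicate (..), contramap)

data Foo = Bar String | Baz Bool | Quux Int

-- Decidable's choose, written out for Predicate
chooseP :: (a -> Either b c) -> Predicate b -> Predicate c -> Predicate a
chooseP f (Predicate p) (Predicate q) = Predicate (either p q . f)

-- hypothetical leaf predicates, standing in for the post's pString/pBool/pInt
pString :: Predicate String
pString = Predicate (not . null)

pBool :: Predicate Bool
pBool = Predicate id

pInt :: Predicate Int
pInt = Predicate (> 0)

-- plain-function analogues of matchingL _Bar and matchingL _Baz
matchBar :: Foo -> Either String Foo
matchBar (Bar s) = Left s
matchBar other   = Right other

matchBaz :: Foo -> Either Bool Foo
matchBaz (Baz b) = Left b
matchBaz other   = Right other

decidablePlain :: Predicate Foo
decidablePlain =
    chooseP matchBar pString $
    chooseP matchBaz pBool $
    contramap (\foo -> case foo of { Quux n -> n; _ -> error "missing case" }) pInt

main :: IO ()
main = mapM_ (print . getPredicate decidablePlain) [Bar "hi", Baz False, Quux 3]
```

It shares the same inconvenience as the prism versions, of course: nothing stops us from forgetting a case, as the error branch shows.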

All in all, these approaches feel like attempts to approximate extra language support for juggling sum types. As it happens, though, there is a corner of the language which does provide extra support: arrow notation. Converting the example to arrows provides a glimpse of what might be:

-- I'm going to play nice, rather than making b phantom and writing a
-- blatantly unlawful Arrow instance just for the sake of the notation.
{-# LANGUAGE Arrows, LambdaCase, TupleSections #-}

import Prelude hiding (id, (.))
import Control.Category
import Control.Arrow

newtype Pipecate a b = Pipecate { getPipecate :: a -> (Bool, b) }

instance Category Pipecate where
    id = Pipecate (True,)
    Pipecate q . Pipecate p = Pipecate $ \x ->
        let (bx, y) = p x
            (by, z) = q y
        in (bx && by, z)

instance Arrow Pipecate where
    arr f = Pipecate (const True &&& f)
    first (Pipecate p) = Pipecate $ \(x, o) ->
         let (bx, y) = p x
         in (bx, (y, o))

instance ArrowChoice Pipecate where
    left (Pipecate p) = Pipecate $ \case
        Left x ->
            let (bx, y) = p x
            in (bx, Left y)
        Right o -> (True, Right o)

fromPred :: Predicate a -> Pipecate a ()
fromPred (Predicate p) = Pipecate (p &&& const ())

toPred :: Pipecate a x -> Predicate a
toPred (Pipecate p) = Predicate (fst . p)

decidableArrowised :: Predicate Foo
decidableArrowised = toPred $ proc foo -> case foo of
    Bar s -> fromPred pString -< s
    Baz b -> fromPred pBool -< b
    Quux n -> fromPred pInt -< n

decidableArrowised corresponds quite closely to the various Decidable-powered implementations. Behind the scenes, case commands in arrow notation give rise to nested eithers. Said eithers are dealt with by the arrows, which are combined in an appropriate way with (|||). (|||), in turn, can be seen as an arrow counterpart to chosen/(|-|). Even the -< “feed” syntax, which the example above doesn’t really take advantage of, amounts to slots for contramapping. If someone ever feels like arranging a do-esque notation for Decidable to go with Gabriella’s DivisibleFrom, it seems case commands would be a nice starting point.

  1. See the relevant section of the Typeclassopedia for a brief explanation of it.↩︎

  2. See, for instance, this Twitter conversation, or the Divisible example in the Decidable post. Note that, though I’m using (>$<) here for ease of comparison, the examples in this style arguably look tidier when spelled with contramap.

    Speaking of operator usage, it is not straightforward to decide on the right fixities for all those operators, and it is entirely possible that I have overlooked something. I have picked them aiming to have both styles work without parentheses, and to have the pairs associated to the right, that is:

    adapt >$< u >*< v >*< w
      = adapt >$< (u >*< (v >*< w))
    f >$< u >+< g >$< v >+< h >$< w
      = (f >$< u) >+< (g >$< v) >+< (h >$< w)
  3. A proof that (>+<) is indeed monoidal is in an end note to this post.

    On a related note, my choice of >+< as the dplus operator is, in part, a pun on (<+>) from ArrowPlus. (>+<) for many instances works very much like (<+>), monoidally combining outputs, even if there probably isn’t a sensible way to actually make the types underlying the various Divisible functors instances of ArrowPlus.↩︎

  4. Both dhall and co-log-core define (>|<) as chosen-like operators. To my eyes, though, >|< fits dplus better. As a compromise, I opted not to use >|< for either of them here.↩︎

  5. I will play with a couple of approaches to nested Either ergonomics at the end of the post, in an appendix.↩︎

  6. See also contravariant issue #64, which suggests no longer making Decidable a subclass of Divisible. Though the argument made by Zemyla is a different one, there are resonances with the observations made here. On a related development, semigroupoids has recently introduced a Conclude class, which amounts to “Decidable without a superclass constraint on Divisible”.↩︎

by Daniel Mlot at November 12, 2021 02:50 AM

November 10, 2021

Douglas M. Auclair (geophf)

November, 2021 1HaskellADay 1Liners

  • 2021-11-09: You have: \k _v -> f k. Curry away the arguments.
  • 2021-11-09: Hello, all. It's been a minute.

    Here's a #1Liner #Haskell problem

    You have m :: Map a b

    You want to filter it by s :: Set a

    so that m has keys only in s.

    How would you do that?

    • O_O @dysinger: let map = Data.Map.fromList [(1, "one"), (2, "two"), (3, "three")]
      set = Data.Set.fromList [1,3,5,7,9]
      in Data.Map.fromList [ elem | elem <- Data.Map.toList map, Data.Set.member (fst elem) set ]
    • ephemient @ephemient: Map.filterWithKey (\k _ -> Set.member k set) map
    • ephemient @ephemient: Map.intersectionWith const map $ Map.fromDistinctAscList [(k, ()) | k <- Set.toAscList set]
    • じょお @gotoki_no_joe Map.intersection m (Map.fromSet (const ()) s)
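For completeness: since containers 0.5.8 there is also Map.restrictKeys, which does exactly this in one call. A quick check that it agrees with the intersection-based answer:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

main :: IO ()
main = do
  let m = Map.fromList [(1, "one"), (2, "two"), (3, "three")]
      s = Set.fromList [1, 3, 5, 7, 9]
  -- restrictKeys keeps exactly the keys of m that are members of s
  print (Map.restrictKeys m s)
  print (Map.intersection m (Map.fromSet (const ()) s))
  -- both print: fromList [(1,"one"),(3,"three")]
```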

by geophf at November 10, 2021 12:12 AM

November 09, 2021

Joachim Breitner

How to audit an Internet Computer canister

I was recently called upon by Origyn to audit the source code of some of their Internet Computer canisters (“canisters” are services or smart contracts on the Internet Computer), which were written in the Motoko programming language. Both the application model of the Internet Computer as well as Motoko bring with them their own particular pitfalls and possible sources for bugs. So given that I was involved in the creation of both, they reached out to me.

In the course of that audit work I collected a list of things to watch out for, and general advice around them. Origyn generously allowed me to share that list here, in the hope that it will be helpful to the wider community.

Inter-canister calls

The Internet Computer system provides inter-canister communication that follows the actor model: Inter-canister calls are implemented via two asynchronous messages, one to initiate the call, and one to return the response. Canisters process messages atomically (and roll back upon certain error conditions), but not complete calls. This makes programming with inter-canister calls error-prone. Possible common sources for bugs, vulnerabilities or simply unexpected behavior are:

  • Reading global state before issuing an inter-canister call, and assuming it to still hold when the call comes back.

  • Changing global state before issuing an inter-canister call, changing it again in the response handler, but assuming nothing else changes the state in between (reentrancy).

  • Changing global state before issuing an inter-canister call, and not handling failures correctly, e.g. when the code handling the callback rolls back.

If you find such patterns in your code, you should analyze whether a malicious party can trigger them, and assess the severity of the effect.

These issues apply to all canisters, and are not Motoko-specific.


Rollbacks

Even in the absence of inter-canister calls the behavior of rollbacks can be surprising. In particular, rejecting (i.e. throw) does not roll back state changes done before, while trapping (e.g. Debug.trap, assert …, out-of-cycles conditions) does.

Therefore, one should check all public update call entry points for unwanted state changes or unwanted rollbacks. In particular, look for methods (or rather, messages, i.e. the code between commit points) where a state change is followed by a throw.

These issues apply to all canisters, and are not Motoko-specific, although other CDKs may not turn exceptions into rejects (which don’t roll back).

Talking to malicious canisters

Talking to untrustworthy canisters can be risky, for the following (likely incomplete) reasons:

  • The other canister can withhold a response. Although the bidirectional messaging paradigm of the Internet Computer was designed to guarantee a response eventually, the other party can busy-loop for as long as they are willing to pay for before responding. Worse, there are ways to deadlock a canister.

  • The other canister can respond with invalidly encoded Candid. This will cause a Motoko-implemented canister to trap in the reply handler, with no easy way to recover. Other CDKs may give you better ways to handle invalid Candid, but even then you will have to worry about Candid cycle bombs that will cause your reply handler to trap.

Many canisters do not even do inter-canister calls, or only call other trustworthy canisters. For the others, the impact of this needs to be carefully assessed.

Canister upgrade: overview

For most services it is crucial that canisters can be upgraded reliably. This can be broken down into the following aspects:

  1. Can the canister be upgraded at all?
  2. Will the canister upgrade retain all data?
  3. Can the canister be upgraded promptly?
  4. Is there a recovery plan for when upgrading is not possible?

Canister upgradeability

A canister that traps, for whatever reason, in its canister_preupgrade system method is no longer upgradeable. This is a major risk. The canister_preupgrade method of a Motoko canister consists of the developer-written code in any system func preupgrade() block, followed by the system-generated code that serializes the content of any stable var into a binary format, and then copies that to stable memory.

Since the Motoko-internal serialization code will first serialize into a scratch space in the main heap, and then copy that to stable memory, canisters with more than 2GB of live data will likely be unupgradeable. But this is unlikely to be the first limit you hit:

The system imposes an instruction limit on upgrading a canister (spanning both canister_preupgrade and canister_postupgrade). This limit is a subnet configuration value, separate from (and likely higher than) the normal per-message limit, and not easily determined. If the canister’s live data becomes too large to be serialized within this limit, the canister becomes non-upgradeable.

This risk cannot be eliminated completely, as long as Motoko and Stable Variables are used. It can be mitigated by appropriate load testing:

Install a canister, fill it up with live data, and exercise the upgrade. If this succeeds with a live data set exceeding the expected amount of data by a margin, this risk is probably acceptable. Bonus points for adding functionality that will prevent the canister’s live data from growing above a certain size.

If this testing is to be done on a local replica, extra care needs to be taken to make sure the local replica actually performs instruction counting and has the same resource limits as the production subnet.

An alternative mitigation is to avoid canister_preupgrade as much as possible. This means no use of stable var (or restricting it to small, fixed-size configuration data). All other data could be

  • mirrored off the canister (possibly off chain), and manually re-hydrated after an upgrade.
  • stored in stable memory manually, during each update call, using the ExperimentalStableMemory API. While this matches what high-assurance Rust canisters (e.g. the Internet Identity) do, this requires manual binary encoding of the data, and is marked experimental, so I cannot recommend it at the moment.
  • not put into a Motoko canister until Motoko has a scalable solution for stable variables (for example keeping them in stable memory permanently, with smart caching in main memory, and thus obliterating the need for pre-upgrade code).

Data retention on upgrades

Obviously, all live data ought to be retained during upgrades. Motoko automatically ensures this for stable var data. But often canisters want to work with their data in a different format (e.g. in objects that are not shared and thus cannot be put in stable vars, such as HashMap or Buffer objects), and thus may follow the following idiom:

stable var fooStable = …;
var foo = fooFromStable(fooStable);
system func preupgrade() { fooStable := fooToStable(foo); };
system func postupgrade() { fooStable := (empty); };

In this case, it is important to check that

  • All non-stable global vars, or global lets with mutable values, have a stable companion.
  • The assignments to foo and fooStable are not forgotten.
  • fooToStable and fooFromStable form a bijection (each is the other’s inverse).

An example would be HashMaps stored as arrays via Iter.toArray(….entries()) and HashMap.fromIter(….vals()).
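As a concrete (hypothetical) sketch of this idiom for a HashMap from Text to Nat, mirrored into a stable array of pairs; the names here are illustrative, not taken from any audited code:

```motoko
import HashMap "mo:base/HashMap";
import Iter "mo:base/Iter";
import Text "mo:base/Text";

actor {
  // stable mirror: an array of pairs survives upgrades
  stable var usersStable : [(Text, Nat)] = [];

  // working copy: rebuilt from the stable mirror on (re)start
  var users = HashMap.fromIter<Text, Nat>(
    usersStable.vals(), 10, Text.equal, Text.hash);

  system func preupgrade() {
    usersStable := Iter.toArray(users.entries());
  };

  system func postupgrade() {
    usersStable := [];
  };
};
```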

It is worth pointing out that a code review will only look at a single version of the code, but cannot check whether code changes will preserve data on upgrade. This can easily go wrong if the names and types of stable variables are changed in an incompatible way. The upgrade may fail loudly in these cases, but in bad cases, the upgrade may even succeed, losing data along the way. This risk needs to be mitigated by thorough testing, and possibly backups (see below).

Prompt upgrades

Motoko and Rust canisters cannot be safely upgraded when they are still waiting for responses to inter-canister calls (the callback would eventually reach the new instance, and because of infelicities of the IC’s System API, could possibly call arbitrary internal functions). Therefore, the canister needs to be stopped before upgrading, and started again afterwards. If the inter-canister calls take a long time, this means that upgrading may take a long time, which may be undesirable. Again, this risk is reduced if all calls are made to trustworthy canisters, and elevated when possibly untrustworthy canisters are called, directly or indirectly.

Backup and recovery

Because of the above risk around upgrades it is advisable to have a disaster recovery strategy. This could involve off-chain backups of all relevant data, so that it is possible to reinstall (not upgrade) the canister and re-upload all data.

Note that reinstall has the same issue as upgrade described above in “prompt upgrades”: the canister ought to be stopped first to be safe.

Note that the instruction limit for messages, as well as the message size limit, limit the amount of data returned. If the canister needs to hold more data than that, the backup query method might have to return chunks or deltas, with all the extra complexity that entails, e.g. state changes between downloading chunks.

If large data load testing is performed (as I recommend anyway to test upgradeability), one can test whether the backup query method works within the resource limits.

Time is not strictly monotonic

The timestamps for “current time” that the Internet Computer provides to its canisters are guaranteed to be monotonic, but not strictly monotonic. The same value can be returned repeatedly, even across messages, as long as those messages are processed in the same block. Timestamps should therefore not be used to detect “happens-before” relations.

Instead of using and comparing time stamps to check whether Y has been performed after X happened last, introduce an explicit var y_done : Bool state, which is set to False by X and then to True by Y. When things become more complex, it will be easier to model that state via an enumeration with descriptive tag names, and update this “state machine” along the way.

Another solution to this problem is to introduce a var v : Nat counter that you bump in every update method, and after each await. Now v is your canister’s state counter, and can be used like a timestamp in many ways.
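A minimal sketch of that counter idea (illustrative only; doX and the surrounding actor are hypothetical):

```motoko
actor {
  // v is the canister's state counter: bump it in every update method
  // and after each await. Recorded values of v then give a reliable
  // happens-before order, unlike timestamps.
  var v : Nat = 0;

  public func doX() : async () {
    v += 1;
    // ... perform X, storing the current v wherever an ordering
    // witness is needed ...
  };
};
```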

While we are talking about time: the system time (typically) changes across an await. So if you bind the current time with let now = … and then await, the value in now may no longer be what you want.

Wrapping arithmetic

The Nat64 data type, and the other fixed-width numeric types, provide opt-in wrapping arithmetic (e.g. +%, fromIntWrap). Unless explicitly required by the current application, this should be avoided, as usually a too-large or negative value is a serious, unrecoverable logic error, and trapping is the best one can do.

Cycle balance drain attacks

Because of the IC’s “canister pays” model, all canisters are prone to DoS attacks by draining their cycle balance, and this risk needs to be taken into account.

The most elementary mitigation strategy is to monitor the cycle balance of canisters and keep it far from the (configurable) freezing threshold.

On the raw IC-level, further mitigation strategies are possible:

  • If all update calls are authenticated, perform this authentication as quickly as possible, possibly before decoding the caller’s argument. This way, a cycle drain attack by an unauthenticated attacker is less effective (but still possible).

  • Additionally, implementing the canister_inspect_message system method allows the above checks to be performed before the message even is accepted by the Internet Computer. But it does not defend against inter-canister messages and is therefore not a complete solution.

  • If an attack from an authenticated user (e.g. a stakeholder) is to be expected, the above methods are not effective, and an effective defense might require relatively involved additional program logic (e.g. per-caller statistics) to detect such an attack, and react (e.g. rate-limiting).

  • Such defenses are pointless if there is only a single method where they do not apply (e.g. an unauthenticated user registration method). If the application is inherently attackable this way, it is not worth the bother to raise defenses for other methods.

   Related: a justification of why the Internet Identity does not use canister_inspect_message.

A Motoko-implemented canister currently cannot perform most of these defenses: argument decoding happens unconditionally before any user code that may reject a message based on the caller, and canister_inspect_message is not supported. Furthermore, Candid decoding is not very cycle defensive, and one should assume that it is possible to construct Candid messages that require many instructions to decode, even for “simple” argument type signatures.

The conclusion for the audited canisters is to rely on monitoring to keep the cycle balance up, even during an attack, if the expense can be borne, and maybe pray for IC-level DoS protections to kick in.

Large data attacks

Another DoS attack vector exists if public methods allow untrustworthy users to send data of unlimited size that is persisted in the canister memory. Because of the translation of async-await code into multiple message handlers, this applies not only to data that is obviously stored in global state, but also local data that is live across an await point.

The effectiveness of such attacks is limited by the Internet Computer’s message size limit, which is in the order of a few megabytes, but many of those also add up.

The problem becomes much worse if a method has an argument type that allows a Candid space bomb: It is possible to encode very large vectors with all values null in Candid, so if any method has an argument of type [Null] or [?t], a small message will expand to a large value in the Motoko heap.

Other types to watch out for:

  • Nat and Int: These are unbounded numbers, and thus can be arbitrarily large. The Motoko representation will however not be much larger than the Candid encoding (so this does not qualify as a space bomb).

   It is still advisable to check if the number is reasonable in size before storing it or doing an await. For example, when it denotes an index in an array, throw early if it exceeds the size of the array; if it denotes a token amount to transfer, check it against the available balance; if it denotes time, check it against reasonable bounds.

  • Principal: A Principal is effectively a Blob. The Interface specification says that principals are at most 29 bytes in length, but the Motoko Candid decoder does not check that currently (fixed in the next version of Motoko). Until then, a Principal passed as an argument can be large (the principal in msg.caller is system-provided and thus safe). If you cannot wait for the fix to reach you, manually check the size of the principal (via Principal.toBlob) before doing the await.

Shadowing of msg or caller

Don’t use the same name for the “message context” of the enclosing actor and the methods of the canister: it is dangerous to write shared(msg) actor, because now msg is in scope across all public methods. As long as these also use public shared(msg) func …, and thus shadow the outer msg, it is correct, but if one accidentally omits or mis-types the msg, no compiler error would occur, and suddenly msg.caller would be the original controller, likely defeating an important authorization step.

Instead, write shared(init_msg) actor or shared({caller = controller}) actor to avoid using msg.


Conclusion

If you write a “serious” canister, whether in Motoko or not, it is worth going through the code and watching out for these patterns. Or better, have someone else review your code, as it may be hard to spot issues in your own code.

Unfortunately, such a list is never complete, and there are surely more ways to screw up your code – in addition to all the non-IC-specific ways in which code can be wrong. Still, things get done out there, so best of luck!

by Joachim Breitner at November 09, 2021 05:34 PM

November 04, 2021

Tweag I/O

A Haskell memory leak in way too much detail with Cachegrind

Haskell’s laziness gives us great benefits, but comes with costs. One of the most visible costs of laziness is hard-to-predict memory behavior, and so we have, and need, tools to profile and measure the heap of our Haskell programs. Most of the time, we live in a high-level world of type classes, monoids, and isomorphisms. But at the end of the day our code needs to run on actual machines, and when our code is run it needs to be performant. Typically, “be performant” means “have reasonable computational complexity”, and so we usually focus on good asymptotic behavior instead of constant costs. But when our programs meet our CPUs, constant costs are important, and thus having low level tools is helpful for low level optimizations, i.e., when you want to squeeze that last 10% out of your program.

Fortunately, the low level profiling story has improved. As of version 8.10, GHC has basic support to export DWARF-compliant debugging information. This means that we should be able to use Valgrind to inspect low level details of our programs.

In spirit, the result of our valgrind analysis will be similar to ticky profiling, which gives us allocations and entry counts for our functions. However, with valgrind we can get lower level details, such as the raw instruction counts per function of our Haskell programs!

So, if you’re interested in low-level optimization of your Haskell programs then this article is for you. I’ll demonstrate the use of a valgrind tool, cachegrind, to inspect the behavior of the canonical leaky Haskell program: a lazy left fold. You’ll get three things: (1) a step-by-step tutorial on running cachegrind on your programs. (2) a section-by-section breakdown of the cachegrind profile, and (3) some guidance on interpreting the results.

What is Cachegrind?

Cachegrind is a cache profiling and branch prediction tool. It takes your program and inspects how the program interacts with your machine’s cache hierarchy and branch predictor. Why does this matter? Well some data structures, such as the Hash Array Mapped Tries (HAMTs) in the unordered-containers library greatly benefit from caching behavior in modern CPUs. Because HAMTs store arrays, modern CPUs will load the entire array into the CPU caches on a write or a read, which leads to cache hits and avoids CPU cycles that are wasted waiting for the needed memory locations to load into the CPU caches. This is the reason why HAMTs are so heavily used in JVM based languages such as Scala and Clojure; the JVM is very good at detecting and performing this optimization. So even though we do not typically concern ourselves with the CPU caching behavior of our programs, it becomes important when we want to squeeze that last bit of performance. As always in these matters, throughput is the goal, latency is the problem, and caching is the key.

The small example

Consider this program, whose primary feature is to leak memory:

-- Main.hs
module Main where

import Data.Foldable (foldr',foldl')

main :: IO ()
main = print $ "hello " ++ show (foo [1 .. 1000000 :: Integer])

foo :: [Integer] -> (Integer, Integer)
foo = foldl' go (0, 1)
  where go (a,b) x = (x + a, x - b)

The memory leak is in this line:

  where go (a,b) x = (x + a, x - b)

Even though we have used the strict left fold foldl' our accumulation function is still too lazy because the tuple (a,b) is a lazy tuple in both its fst (a) and snd (b) arguments.

The fix is simple; we force the thunks in the fst and snd positions by adding bang patterns inside the tuple:

{-# LANGUAGE BangPatterns #-} -- new

module Main where

import Data.Foldable (foldr',foldl')

main :: IO ()
main = print $ "hello " ++ show (foo [1 .. 1000000 :: Integer])

foo :: [Integer] -> (Integer, Integer)
foo = foldl' go (0, 1)
  where go (!a,!b) x = (x + a, x - b) -- notice the bang patterns on a and b

So we know we have a memory leak and how to fix it; our goal is to detect that leak with cachegrind to inspect how that leak manifests in the CPU cache hierarchy. To use cachegrind (or, more generally, valgrind), we’ll need to instruct GHC to generate debugging information, e.g., cabal build --ghc-options="-g -O2". Note that you should compile with -O2 to get a binary as similar as possible to the binary you would actually ship.

We know we have a memory leak, so we should see a lot of cache misses and a higher instruction count, because not only will we need to chase more pointers, but Haskell’s runtime system will have more work to do. To understand cachegrind’s output, we’ll look at the non-leaky version first to have a good idea of what to expect if there isn’t a problem.

The invocation is simple, and expect your program to run much slower than normal:

$ valgrind --tool=cachegrind ./dist-newstyle/build/x86_64-linux/ghc-8.10.4/leak-
==18410== Cachegrind, a cache and branch-prediction profiler
==18410== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote et al.
==18410== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==18410== Command: ./dist-newstyle/build/x86_64-linux/ghc-8.10.4/leak-
--18410-- warning: L3 cache found, using its data for the LL simulation.
"hello (500000500000,500001)"

Notice that I called valgrind on the binary ./dist-newstyle/build/x86_64-linux/ghc-8.10.4/leak- rather than cabal run. For whatever reason, using cabal run, e.g., valgrind --tool=cachegrind cabal run loses the DWARF symbols produced by the -g GHC flag and thus creates an empty cachegrind profile.

The result of calling cachegrind produces two kinds of output. Lines beginning with ==18410==, are cachegrind summary output; these can be safely ignored for now. The second kind of output is the line "hello" ..., which is the output of our program.

The important result is the cachegrind profile produced in the same directory as the invocation was called, in this case that is ~/tmp/leak. The profile is a file called cachegrind.out.<pid> where <pid> is the pid number of the process created by your shell, on my machine this file was cachegrind.out.19438. These files are raw output. To transform them into a human readable format we use a tool called cg_annotate (short for cachegrind annotate) that is packaged with valgrind, like this:

$ cg_annotate cachegrind.out.19438 > cachegrind.out.not-leaky

And now we can view our report. I’ll only show the important pieces. (If you do not have a wide monitor, then this report will be very ugly.)

Each section of the report is separated by dashes (------). The first section is a summary of the simulated machine that cachegrind uses. It generates the summary by inspecting your machine, mine looks like this:

I1 cache:         65536 B, 64 B, 4-way associative
D1 cache:         32768 B, 64 B, 8-way associative
LL cache:         16777216 B, 64 B, 16-way associative
Command:          ./Main
Data file:        cachegrind.out.16623
Events recorded:  Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw
Events shown:     Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw
Event sort order: Ir I1mr ILmr Dr D1mr DLmr Dw D1mw DLmw
Thresholds:       0.1 100 100 100 100 100 100 100 100
Include dirs:
User annotated:
Auto-annotation:  on

These details are important, but for our purposes we’ll skip over them. Just know that they’ll change depending on your machine’s chipset. For a deeper dive I recommend the relevant section in the cachegrind manual.

The second section:

Ir                   I1mr           ILmr           Dr                  D1mr             DLmr           Dw                  D1mw               DLmw
315,065,912 (100.0%) 7,790 (100.0%) 4,021 (100.0%) 87,831,142 (100.0%) 134,139 (100.0%) 6,312 (100.0%) 46,945,020 (100.0%) 1,759,240 (100.0%) 68,891 (100.0%)  PROGRAM TOTALS

is a summary of totals for the program. Each column name is defined in the cachegrind manual but I’ll reproduce it here. There are 3 kinds of simulated CPU caches, a first-level instruction cache (I1), a data cache (D1), and a “Last-Level” cache (LL), and several kinds of metrics about these caches:

Ir   --> instruction cache reads, i.e., the number of instructions executed
I1mr --> instruction cache read misses (the instruction was not in the I1 cache)
ILmr --> the "Last-Level" (LL) cache instruction read misses. Think an L3 cache
Dr   --> number of memory reads
D1mr --> number of data cache read misses
DLmr --> number of last level data cache read misses
Dw   --> number of memory writes
D1mw --> number of data cache write misses
DLmw --> number of last level data cache write misses

So now we can make sense of the report. From the program-totals section we see that running our program required executing 315,065,912 instructions. The number of cache misses was low: 7,790 instruction read misses and 4,021 last-level read misses. Similarly, our program performed 87,831,142 data reads, and had quite a few data cache write misses (1,759,240). These results shouldn't be surprising; almost every program will have some read and write misses, but this gives us a good baseline for comparison. For the remainder of this post, I'll be omitting the I1mr and ILmr columns, and the ratios (100.0%). These columns are important, but not our primary concern, so I've removed them to avoid line wraps such as the one in PROGRAM TOTALS above. When relevant, I'll reproduce the ratios in the text rather than in the tables.
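To put "low" and "quite a few" on a common scale, the totals can be turned into miss rates. The helper below is mine for illustration, not part of cachegrind's output:

```haskell
-- Hypothetical helper (not part of cachegrind): express cache misses
-- as a percentage of the corresponding accesses.
missRatePct :: Double -> Double -> Double
missRatePct misses accesses = 100 * misses / accesses

-- D1 read-miss rate:  missRatePct 134139  87831142  -- roughly 0.15%
-- D1 write-miss rate: missRatePct 1759240 46945020  -- roughly 3.75%
```

So even the "quite a few" write misses amount to under 4% of all data writes in the non-leaky run.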

Let’s continue into the report. The third section:

Ir           Dr          D1mr    DLmr   Dw          D1mw       DLmw     file:function
146,150,652  42,041,086     314      1  18,013,694    506,849  18,916   ???:ghczmbignum_GHCziNumziInteger_integerAdd_info
 69,999,966  20,999,989      27      1   7,999,996    246,575   9,202   ???:ghczmbignum_GHCziNumziInteger_integerSub_info
 53,068,513  13,013,697       2      0  16,013,703  1,000,001  37,318   /home/doyougnu/tmp/leak/app//Main.hs:???
 39,000,049   9,000,012      81      3   4,000,005          1       1   ???:ghczmbignum_GHCziNumziInteger_integerGtzh_info
  1,807,138     903,491  27,372      0     246,435         26       0   ???:stg_gc_noregs
    996,435     359,937  27,619      0      27,810         27       0   /store/Programming/ghc-master/rts/sm/Storage.c:resetNurseries
    515,043     340,482  55,940      0      59,267          4       0   /store/Programming/ghc-master/rts/sm/BlockAlloc.c:countBlocks
    405,151     131,609   1,818    504      60,468          4       4   ???:do_lookup_x

is the same data broken down by function, in descending order of instruction count (Ir). There are several things to notice. First, function names are z-encoded, although they are still readable. Unsurprisingly, most of our instructions went to adding (ghczmbignum_GHCziNumziInteger_integerAdd_info, with 146,150,652 (46.39%) instructions) and subtracting (ghczmbignum_GHCziNumziInteger_integerSub_info, with 69,999,966 (22.22%)). Our Main module is responsible for a large share of data writes (16,013,703 (34.11%)) and, for the D1 cache alone, the majority of write misses (1,000,001 (56.84%)).
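If the z-encoded names bother you, they are easy to decode by hand. Here is a partial decoder covering only the escape sequences that appear in this report; the full table lives in GHC's Encoding module:

```haskell
-- A partial decoder for GHC's z-encoding, handling only the sequences
-- that show up in this report: zm -> '-', zi -> '.', zh -> '#',
-- zd -> '$', zu -> '_', zz -> 'z'.  (GHC defines many more.)
zDecode :: String -> String
zDecode ('z':'m':rest) = '-' : zDecode rest
zDecode ('z':'i':rest) = '.' : zDecode rest
zDecode ('z':'h':rest) = '#' : zDecode rest
zDecode ('z':'d':rest) = '$' : zDecode rest
zDecode ('z':'u':rest) = '_' : zDecode rest
zDecode ('z':'z':rest) = 'z' : zDecode rest
zDecode (c:rest)       = c   : zDecode rest
zDecode []             = []
```

For example, zDecode "ghczmbignum_GHCziNumziInteger_integerAdd_info" gives "ghc-bignum_GHC.Num.Integer.integerAdd_info", and the Main_zdwgo_info we will meet later decodes to Main_$wgo_info, the worker for go.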

The last section:

-- Auto-annotated source: /home/doyougnu/tmp/leak/app//Main.hs
Ir         Dr        D1mr       DLmr       Dw         D1mw      DLmw

         .         . .          .                   .         .      .   {-# LANGUAGE BangPatterns #-}
         .         . .          .                   .         .      .
         .         . .          .                   .         .      .   module Main where
         .         . .          .                   .         .      .
         .         . .          .                   .         .      .   import Data.Foldable (foldr',foldl')
         .         . .          .                   .         .      .
         .         . .          .                   .         .      .   main :: IO ()
        30         0 0          0                   4         0      0   main = print $ "hello " ++ (show $ foo [1 .. 1000000 :: Integer])
         .         . .          .                   .         .      .
         .         . .          .                   .         .      .   foo :: [Integer] -> (Integer, Integer)
15,041,101 4,013,698 0          0           3,000,004         1      1   foo = foldl' go (0, 1)
38,027,462 9,000,003 2          0          13,013,711 1,000,001 37,318     where go (!a,!b) x = (x + a, x - b)

is the same information, broken out line-by-line per file. This is the section that is worth most of your time. For our tiny program we have only one file, Main.hs, so it is the only file to inspect. In the real report you'll also see a bunch of annotated source from the runtime system, which I have omitted. The annotated source points directly at the local function go, as expected: most of the work happens in the accumulator of the strict left fold. Most of our data write misses come from that accumulator function, while foo itself incurred only a single missed write in each of the D1 and LL caches. We'll return to this point later.

Alright, now for the leaky version:

Ir                      Dr                     D1mr                DLmr                Dw                     D1mw                DLmw
17,722,854,207 (100.0%) 6,937,629,327 (100.0%) 64,898,306 (100.0%) 39,650,087 (100.0%) 3,749,019,039 (100.0%) 18,682,395 (100.0%) 12,551,725 (100.0%)  PROGRAM TOTALS

We can see that our instruction count (Ir) has exploded from 315,065,912 to 17,722,854,207, roughly a 56-fold increase! That's many more instructions to process. Let's look at the function section.

Ir                     D1mr                DLmr                D1mw                DLmw                file:function
2,851,984,623 (16.09%)  4,325,884 ( 6.67%)    789,401 ( 1.99%)        315 ( 0.00%)        65 ( 0.00%)  ???:evacuate
1,865,100,704 (10.52%)  4,141,338 ( 6.38%)  2,857,644 ( 7.21%) 11,806,905 (63.20%) 9,754,926 (77.72%)  ???:copy_tag
1,330,588,095 ( 7.51%)          0                   0                 569 ( 0.00%)         8 ( 0.00%)  ???:LOOKS_LIKE_INFO_PTR
1,285,525,857 ( 7.25%)     35,179 ( 0.05%)      5,617 ( 0.01%)        436 ( 0.00%)        23 ( 0.00%)  ???:LOOKS_LIKE_INFO_PTR_NOT_NULL
  956,209,814 ( 5.40%) 19,897,093 (30.66%)  6,802,815 (17.16%)        360 ( 0.00%)        21 ( 0.00%)  ???:LOOKS_LIKE_CLOSURE_PTR
  940,144,172 ( 5.30%)        541 ( 0.00%)         69 ( 0.00%)          6 ( 0.00%)         0           ???:scavenge_block
  851,875,808 ( 4.81%)          0                   0                 774 ( 0.00%)         6 ( 0.00%)  ???:INFO_PTR_TO_STRUCT
  791,197,818 ( 4.46%)        338 ( 0.00%)         84 ( 0.00%)        128 ( 0.00%)         6 ( 0.00%)  ???:alloc_in_moving_heap
  498,427,284 ( 2.81%)          0                   0               3,133 ( 0.02%)         0           ???:Bdescr
  478,104,914 ( 2.70%)          0                   0                 202 ( 0.00%)         3 ( 0.00%)  ???:UNTAG_CONST_CLOSURE
  446,714,992 ( 2.52%)          2 ( 0.00%)          0                  16 ( 0.00%)         4 ( 0.00%)  ???:alloc_for_copy
  417,480,380 ( 2.36%)    233,588 ( 0.36%)    193,590 ( 0.49%)          1 ( 0.00%)         0           ???:get_itbl
  326,467,351 ( 1.84%)          0                   0                 117 ( 0.00%)         7 ( 0.00%)  ???:GET_CLOSURE_TAG
  302,466,470 ( 1.71%)  2,976,163 ( 4.59%)  1,342,229 ( 3.39%)          0                  0           ???:evacuate_BLACKHOLE
  301,493,426 ( 1.70%)     34,069 ( 0.05%)        221 ( 0.00%)          0                  0           ???:scavenge_stack
  291,518,962 ( 1.64%)          0                   0                   1 ( 0.00%)         0           ???:UNTAG_CLOSURE
  273,950,830 ( 1.55%)         16 ( 0.00%)         12 ( 0.00%)          0                  0           ???:scavenge_thunk_srt
  177,938,980 ( 1.00%)    506,224 ( 0.78%)     10,352 ( 0.03%)         27 ( 0.00%)         0           ???:scavenge_mutable_list
  176,504,972 ( 1.00%) 19,583,208 (30.18%) 19,255,219 (48.56%)         27 ( 0.00%)         2 ( 0.00%)  ???:countBlocks
  170,901,058 ( 0.96%)        648 ( 0.00%)        387 ( 0.00%)          0                  0           ???:STATIC_LINK
  166,097,651 ( 0.94%)  1,810,517 ( 2.79%)  1,649,510 ( 4.16%)  1,121,003 ( 6.00%)   628,281 ( 5.01%)  ???:integerzmwiredzmin_GHCziIntegerziType_plusInteger_info
   82,121,083 ( 0.46%)  1,248,932 ( 1.92%)  1,187,596 ( 3.00%)  1,120,936 ( 6.00%)   637,117 ( 5.08%)  ???:integerzmwiredzmin_GHCziIntegerziType_minusInteger_info
   66,204,923 ( 0.37%)     20,660 ( 0.03%)         57 ( 0.00%)          0                  0           ???:closure_sizeW_
   64,000,464 ( 0.36%)          0                   0                  61 ( 0.00%)         0           ???:overwritingClosure
   63,945,983 ( 0.36%)     15,725 ( 0.02%)        312 ( 0.00%)    253,048 ( 1.35%)    13,443 ( 0.11%)  ???:recordMutableCap
   60,343,797 ( 0.34%)        245 ( 0.00%)         14 ( 0.00%)  2,000,002 (10.71%)    90,901 ( 0.72%)  /home/doyougnu/tmp/leak//app/Main.hs:Main_zdwgo_info
   57,140,616 ( 0.32%)    147,133 ( 0.23%)        171 ( 0.00%)          9 ( 0.00%)         1 ( 0.00%)  ???:alloc_todo_block
   43,000,043 ( 0.24%)        124 ( 0.00%)          9 ( 0.00%)          1 ( 0.00%)         1 ( 0.00%)  ???:integerzmwiredzmin_GHCziIntegerziType_gtIntegerzh_info
   40,990,314 ( 0.23%)          2 ( 0.00%)          0                   0                  0           ???:push_scanned_block
   32,982,071 ( 0.19%)    236,653 ( 0.36%)    188,783 ( 0.48%)     29,739 ( 0.16%)    29,696 ( 0.24%)  ???:freeGroup
   32,000,132 ( 0.18%)  1,107,641 ( 1.71%)  1,055,166 ( 2.66%)  1,004,515 ( 5.38%)   996,263 ( 7.94%)  /home/doyougnu/tmp/leak//app/Main.hs:???

Our table has similarly exploded, so much so that I've removed several rows, as well as the Dr and Dw columns, for a decent display; I've preserved the ratios, though, to show how the processing work has shifted. Look at all the extra processing the runtime system had to do, just from missing two bangs! In fact, almost all the work comes from the runtime system, which is expected for a memory leak. However, a lot of this information is noise: we want to know where the leak is in our program. Ignoring the runtime-system entries, there are only two rows from the program itself:

Ir                         D1mr                DLmr                  D1mw                DLmw
   60,343,797 ( 0.34%)          245 ( 0.00%)         14 ( 0.00%)  2,000,002 (10.71%)    90,901 ( 0.72%)  /home/doyougnu/tmp/leak//app/Main.hs:Main_zdwgo_info
   32,000,132 ( 0.18%)    1,107,641 ( 1.71%)  1,055,166 ( 2.66%)  1,004,515 ( 5.38%)   996,263 ( 7.94%)  /home/doyougnu/tmp/leak//app/Main.hs:???

and one of those, Main_zdwgo_info, is the z-encoded go function.

Let’s check the annotated source:

-- Auto-annotated source: /home/doyougnu/tmp/leak//app/Main.hs
Ir          Dr          D1mr       DLmr       Dw          D1mw       DLmw

         .           .          .          .           .          .        .   {-# LANGUAGE BangPatterns #-}
         .           .          .          .           .          .        .
         .           .          .          .           .          .        .   module Main where
         .           .          .          .           .          .        .
         .           .          .          .           .          .        .   import Data.Foldable (foldr',foldl')
         .           .          .          .           .          .        .
         .           .          .          .           .          .        .   main :: IO ()
        43           5          0          0           8          0        0   main = print $ "hello " ++ show (foo [1 .. 1000000 :: Integer])
         .           .          .          .           .          .        .
         .           .          .          .           .          .        .   foo :: [Integer] -> (Integer, Integer)
60,343,822  18,125,017        245         14  21,062,521  2,000,002   90,901   foo = foldl' go (0, 1)
32,000,135  10,000,042  1,107,641  1,055,166   8,000,038  1,004,515  996,263     where go (a,b) x = (x + a, x - b)

There are several things to notice. First, the instruction count for foo has risen from 15,041,101 to 60,343,822 (the percentages are relative to each run's totals, which is why foo's share drops from 4.77% to 0.34% even as its absolute count quadruples). Second, go's instruction count has dropped from 38,027,462 to 32,000,135, because go now has less work to do: it only needs to allocate thunks! Third, and this is the crucial point, the foldl' line now shows data cache read and write misses. The lazy version has 245 D1 and 14 LL read misses, plus 2,000,002 D1mw and 90,901 DLmw write misses, while the strict version had no read misses and only a single write miss in each cache.

The read misses are interesting, not because of the difference in magnitude, but because we shouldn’t expect any read misses for a tight fold such as foo. If foo does not have any problems, then we would expect it to be tail-call optimized into a tight loop. Thus, we would expect no read misses because the memory location that must be read from should be in the cache while foo computes, leading to cache hits. That the lazy version shows any cache read misses indicates a problem.

The more interesting result is in the cache write misses. A cache write miss occurs when we want to write some data, say an assembly operand, to the cache: before the write occurs, we check whether that operand is already in the cache. If it is, we have a cache hit; if it is not, a cache miss. So this should make sense: we know that foo is going to write an (Integer, Integer) to the data cache. We should expect foo to compute and then write the memory address containing the result to the cache only once. If it writes more than once, then it is storing intermediate data which is later read to finish the computation, i.e., it is allocating thunks! And indeed the strict version has a single write miss in each of the D1 and LL caches, most likely because the memory operand was not yet in the cache. It shouldn't have been: before calling foo we had not computed the result. In contrast, the lazy version has over 2 million D1 write misses, clearly indicating a memory leak.
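For reference, here are the two programs from the annotated source side by side; the only difference is the pair of bang patterns on the accumulator:

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- The non-leaky version: the bang patterns force both components of
-- the accumulator at every step, so no thunk chain can build up.
fooStrict :: [Integer] -> (Integer, Integer)
fooStrict = foldl' go (0, 1)
  where go (!a, !b) x = (x + a, x - b)

-- The leaky version: foldl' forces the accumulator only to the pair
-- constructor; the components stay behind as ever-growing thunks.
fooLazy :: [Integer] -> (Integer, Integer)
fooLazy = foldl' go (0, 1)
  where go (a, b) x = (x + a, x - b)
```

Both compute the same result; only the heap behavior, and hence the cache behavior, differs.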

Summary and Guidance

To summarize: we can use GHC's -g flag to generate DWARF symbols for our Haskell programs. With these symbols we can retrieve fine-grained data, such as instruction counts per line of source code. This information helps us identify hot spots, detect memory leaks, and begin optimizing our programs. This article has been a very light introduction to cachegrind, and I haven't covered everything; for example, cachegrind can also inspect the branch-prediction behavior of programs. If you're interested, see the cachegrind manual linked below.

To close, I’d like to give some recommendations on how to use cachegrind information. Getting the information is the easy part; understanding its message, however, is more difficult. So here are my recommendations:

  1. Create the smallest representative example first. Cachegrind executes the input program much more slowly than normal and so creating a minimal example is beneficial not just for debugging but also to reduce the turn-around time of your profiling.
  2. Try to use GHC’s heap profiling tools first. If you have a memory leak it is likely that it will be more observable with a heap profile. Use a ticky or prof profile to find functions with many allocations and entry points, then use cachegrind to dig deeper. You should be using cachegrind when you really need fine-grained, line-by-line details, or if you know you are writing something that should interact with CPU caches in a particular way, such as HAMTs.
  3. When you decide to use cachegrind, look at instruction counts first. You're specifically looking for lines of code that correspond to a large number of instructions. Even if it does not indicate a memory leak, identifying a hot spot in your code is still useful knowledge. And always remember: the lower the instruction count, the better.
  4. The data is important, but situating the data in the context of your program is more important. You need to be able to ask “how many times do I expect this function to write or read?” and “do these numbers make sense?”. For example, we could only make sense of the write-miss data by thinking about how many times our strict version should write. In practice, this is the hardest part, but also the most rewarding. I prefer, and recommend, the tried and true method of staring at it until it makes sense, but your mileage may vary.

Extra reading and sources

November 04, 2021 12:00 AM

November 03, 2021

Matthew Sackman

NixOS, again!

Now I promise this isn’t just a NixOS advocacy site; I do have some other content planned, honest! I just wanted to give a couple of examples of using Nix as part of your daily workflow of building stuff.

In something like Debian, you may use .deb files/packages often, but it’s normally without thinking about it too much, and you certainly don’t create a deb just to build a document, for example. But with Nix, because all the time you’re building these little functions to build something in a fairly pure, referentially-transparent way, you can also reuse them and combine them in new ways, which can be quite neat.

So, previously, I’ve shown a slightly esoteric example of setting up LaTeX in Nix. We ended up with a file ~/.dotfiles/myfonts.nix which can build a number of things, including a TexLive derivation with some custom fonts added to it. Typically, when working in LaTeX, I’ll create a Makefile which contains something like this:

cv.pdf: cv.tex $(shell find . -type f -name '*.svg')
	pdflatex -shell-escape cv && pdflatex -shell-escape cv && pdflatex -shell-escape cv

clean:
	- rm *.aux *.bbl *.blg *.log *.ptb cv.pdf

(I’m currently updating my CV, hence this example.) This approach, with a Makefile, works just fine. But, it does rely a lot on finding its dependencies in the current environment - it’s not referentially-transparent by any stretch of the imagination. But if we use Nix, we can take advantage of the work we’ve done previously and make it all a little purer. So in the same directory, we’ll add a default.nix with this:

{ pkgs ? import <nixpkgs> {} }:

let
  fonts = import /home/matthew/.dotfiles/myfonts.nix { inherit pkgs; };
in
pkgs.runCommandLocal "cv.pdf" {
  src = ./cv.tex;
  buildInputs = [ fonts.mytexlive ];
} ''
  pdflatex -shell-escape $src && pdflatex -shell-escape $src && pdflatex -shell-escape $src
  cp $(basename $src .tex).pdf $out
''
We import the file we’ve made previously, and then it’s just a single call to the pkgs.runCommandLocal function, providing a reasonably arbitrary name, the environment in which we declare the src to be the local tex file, and buildInputs to be our custom TexLive derivation, and then the shell commands for what to do.

Now, instead of running make, we can run nix-build. In fact, we can run okular $(nix-build) because nix-build puts all logs over stderr and the only thing that comes out of stdout is the $out path created in the nix-store, which is the path of the PDF itself. Is this really any better than running make? No, not massively. It’s a little cute, and it means that I can take mytexlive out of my ~/.config/nixpkgs/home.nix if I want to, but this isn’t exactly going to save me lots of time. It makes things a little tidier.

Hugo, direnv etc

This website is generated by Hugo. I’ve only just started using it, but it seems quite nice: hugo behaves for me and hasn’t required a lot of configuring. There are already a couple of blog posts out there on using hugo with Nix, and they seem fairly similar. They use direnv, which I’ve not used before and seems interesting: the idea is that you can have scripts run when you cd into a directory. In Nix, we can harness this to add programs (e.g. hugo) to our profile, but only when we’re working within a directory. Which seems nice. I’ve gone a bit further than those existing blog posts though, in order to drive generating the whole site through Nix too.

Setting up direnv

Home-manager supports direnv, so in ~/.config/nixpkgs/home.nix I have added:

programs.direnv = {
  enable = true;
  enableBashIntegration = true;
  nix-direnv.enable = true;
};

Now run home-manager switch as normal. I’m going to be working in ~/websites/wellquite/ from here on:

$ mkdir -p ~/websites/wellquite && cd ~/websites/wellquite
$ echo "use nix" >> .envrc
$ direnv allow

Open a new terminal and cd ~/websites/wellquite. You need a new terminal / shell so that it runs the direnv shell hooks.

Generating the website

Now, I want to use pkgs.mkShell to add hugo and some theme setup stuff to my profile whenever direnv activates itself. But, I also want to be able to run nix-build in here and essentially have it run hugo in a clean state and generate the entire site. Direnv will look for a shell.nix file and run that. nix-build uses default.nix by default. So the plan is to declare as much as possible in default.nix, and then shell.nix will import that, and make minor further tweaks.

So, default.nix:

{ pkgs ? import <nixpkgs> {}}:

let
  hugo-theme-indigo = pkgs.stdenvNoCC.mkDerivation {
    name = "hugo-theme-indigo";
    src = pkgs.fetchFromGitHub {
      owner = "AngeloStavrow";
      repo = "indigo";
      rev = "7cfe70c0014d6162b81e40358df67c566bbe0296";
      sha256 = "15mj9yfyfl69kn5myspzcljg2msd5mlfj4dz351b1608icba4g8r";
    };
    installPhase = ''
      cp -r $src $out
    '';
    preferLocalBuild = true;
  };

  wellquite = pkgs.stdenvNoCC.mkDerivation {
    name = "wellquite";

    src = with builtins; filterSource
      (path: type: substring 0 1 (baseNameOf path) != "." && (baseNameOf path) != "default.nix" && (baseNameOf path) != "shell.nix" && type != "symlink")
      ./.;

    dontConfigure = true;
    buildInputs = [ pkgs.hugo hugo-theme-indigo ];
    preferLocalBuild = true;
    installPhase = ''
      runHook preInstall

      mkdir -p themes
      ln -snf "${hugo-theme-indigo}" themes/hugo-theme-indigo
      hugo -d $out

      if [ -d ./overlays ]; then
        cp -R ./overlays/* $out/
      fi

      runHook postInstall
    '';
  };
in
{
  inherit hugo-theme-indigo wellquite;
  inherit (pkgs) hugo;
}
Two things going on here:

  1. hugo-theme-indigo is just how to fetch a theme I’m using from Github. It’s properly pinned so I know it’ll never change without my say-so.
  2. wellquite then uses that theme, and hugo itself, to generate the whole site.

In wellquite, src looks a little elaborate. The more obvious thing to do is just src = ./.;. If you do that, though, the directory’s contents will change whenever, say, a result symlink changes, which happens every time you run nix-build. You probably don’t want to provide the previous build output as a source to the current build. So there’s a bit of careful filtering going on to make sure that no dotfiles, no symlinks, and neither default.nix nor shell.nix get treated as sources. This ensures that repeated calls to nix-build can spot when no real changes have happened to our source, in which case there may be nothing to (re)build.

I have some other plain HTML files sitting in ./overlays which I wish to just have copied into the output, hence those extra few lines of shell. With all this, I can now call nix-build -A wellquite and I should then find the whole site built in ./result, which I can then rsync up to my server.

Hooking into the shell

With the bulk of the work now done, I just want a small shell.nix which can reuse a bunch of things from default.nix and add hugo to my user profile so that I can do handy things like run hugo serve when writing articles. Hence, shell.nix:

{ pkgs ? import <nixpkgs> {}}:

let
  myhugo = import ./default.nix { inherit pkgs; };

in pkgs.mkShell {
  buildInputs = [ myhugo.hugo myhugo.hugo-theme-indigo ];
  shellHook = ''
    mkdir -p themes
    ln -snf "${myhugo.hugo-theme-indigo}" themes/hugo-theme-indigo
  '';
}
Now, whenever I cd ~/websites/wellquite, the shell.nix will be run. It’ll import from default.nix thus reusing code we’ve already written, and then just add a symlink so that my chosen theme is available locally. I can then run hugo serve and it all just works.

The ability to easily reuse things you’ve previously built is quite neat, and it’s just much better done than having a whole load of shell scripts lying around all over the place. It’s not going to radically change my life, but it does feel a little more robust and defensive, which I like.

November 03, 2021 08:10 PM

November 01, 2021


Haskell teaching and development job with Well-Typed

tl;dr If you’d like a job with us, and in particular if you are enthusiastic about teaching Haskell, send your application as soon as possible.

We are looking for a Haskell expert to join our team at Well-Typed to focus on teaching. This is a great opportunity for someone who is passionate about Haskell and who is keen to improve and promote Haskell in a professional context.

About Well-Typed

We are a team of top notch Haskell experts. Founded in 2008, we were the first company dedicated to promoting the mainstream commercial use of Haskell. To achieve this aim, we help companies that are using or moving to Haskell by providing a range of services including consulting, development, training, and support and improvement of the Haskell development tools. We work with a wide range of clients, from tiny startups to well-known multinationals. We have established a track record of technical excellence and satisfied customers.

Our company has a strong engineering culture. All our managers and decision makers are themselves Haskell developers. Most of us have an academic background and we are not afraid to apply proper computer science to customers’ problems, particularly the fruits of FP and PL research.

We are a self-funded company so we are not beholden to external investors and can concentrate on the interests of our clients, our staff and the Haskell community.

About the job

The role is not tied to a single specific project or task, and is fully remote.

However, somewhat deviating from our usual modus operandi, this time we are interested in hiring someone who is particularly enthusiastic about (and ideally experienced in) teaching.

At Well-Typed, teaching Haskell has been one of our fundamental services nearly from the start. Some amount of teaching is often a natural part of consulting work, but we are also regularly hired by clients for dedicated training courses.

The topics range from general introductions to Haskell and functional programming all the way to advanced material tailored to the needs of a particular client, sometimes moving beyond Haskell into other areas such as security, formal methods, or smart contracts.

Currently, primarily due to the Covid restrictions, most of our teaching is delivered remotely, but in the past, we have also been delivering courses on-site, and we may return to that in the future.

There is a variety of ways in which you could help us scale up our teaching efforts, including but not limited to the following:

  • helping us to develop new training materials and exercises,

  • maintaining and improving existing training materials and exercises,

  • developing and improving tooling around teaching, for example for partially auto-reviewing solutions to tasks, so that individual feedback can really focus on individual aspects of the solutions of participants,

  • delivering course sessions, answering questions by participants, and reviewing assignments submitted by course participants.

In addition to the teaching-related tasks, there will probably still be scope for regular consulting tasks, which may involve:

  • working on GHC, libraries and tools;

  • Haskell application development;

  • working directly with clients to solve their problems.

We try wherever possible to arrange tasks within our team to suit peoples’ preferences and to rotate to provide variety and interest.

About you

Having a good understanding of Haskell is essential, as is being able to communicate this understanding effectively. Familiarity with other languages and good software engineering practices are also useful.

It helps if you are confident in presenting in English, talking about various technical topics, both in front of a live audience and in a remote video call scenario. It also helps if you are good at preparing clear written materials to support the courses. It is furthermore helpful if you can teach not just based on your own materials, but also based on materials that have been prepared by others.

You are likely to have a bachelor’s degree or higher in computer science or a related field, although this isn’t a requirement.

Further (optional) bonus skills:

  • experience with Cardano and/or Plutus,

  • experience of consulting or running a business,

  • knowledge of and experience in applying formal methods,

  • familiarity with (E)DSL design,

  • knowledge of concurrency and/or systems programming,

  • experience with working on GHC,

  • experience with web programming (in particular front-end),

  • … (you tell us!)

Offer details

The offer is initially for one year full time, with the intention of a long term arrangement. Living in England is not required. We may be able to offer either employment or sub-contracting, depending on the jurisdiction in which you live.

If you are interested, please apply by email to . Tell us why you are interested and why you would be a good fit for Well-Typed, and attach your CV. Please indicate how soon you might be able to start.

We are looking to fill this position as soon as possible, but depending on various factors, we may be able to hire more than one person. In any case, please try to get your application to us by 15 November 2021.

by andres, duncan, adam, christine at November 01, 2021 12:00 AM

October 31, 2021

Joachim Breitner

A mostly allocation-free optional type

The Motoko programming language has a built-in data type for optional values, named ?t with values null and ?v (for v : t); this is the equivalent of Haskell’s Maybe, Ocaml’s option or Rust’s Option. In this post, I explain how Motoko represents such optional values (almost) without allocation.

I neither claim nor expect that any of this is novel; I just hope it’s interesting.

Uniform representation

The Motoko programming language, designed by Andreas Rossberg and implemented by a pretty cool team at DFINITY is a high-level language with strict semantics and a strong, structural, equi-recursive type system that compiles down to WebAssembly.

Because the type system supports polymorphism, it represents all values uniformly. Simplified for the purpose of this blog post, they are always pointers into a heap object where the first word of the heap object, the heap tag, contains information about the value:

│ tag │ …

The tag is something like array, int64, blob, variant, record, …, and it has two purposes:

  • The garbage collector uses it to understand what kind of object it is looking at, so that it can move it around and follow pointers therein. Variable size objects such as arrays include the object size in a subsequent word of the heap object.

  • Some types have values that may have different shapes on the heap. For example, the ropes used in our text representation can either be a plain blob, or a concatenation node of two blobs. For these types, the tag of the heap object is inspected.

The optional type, naively

The optional type (?t) is another example of such a type: Its values can either be null, or ?v for some value v of type t, and the primitive operations on this type are the two introduction forms, an analysis function, and a projection for non-null values:

null : () -> ?t
some : t -> ?t
is_some : ?t -> bool
project : ?t -> t     // must only be used if is_some holds

It is natural to use the heap tag to distinguish the two kinds of values:

  • The null value is a simple one-word heap object with just a tag that says that this is null:

    │ null │
  • The other values are represented by a two-word object: The tag some, indicating that it is a ?v, and then the payload, which is the pointer that represents the value v:

    │ some │ payload │

With this, the four operations can be implemented as follows:

def null():
  ptr <- alloc(1)
  ptr[0] = NULL_TAG
  return ptr

def some(v):
  ptr <- alloc(2)
  ptr[0] = SOME_TAG
  ptr[1] = v
  return ptr

def is_some(p):
  return p[0] == SOME_TAG

def project(p):
  return p[1]

The problem with this implementation is that null() and some(v) both allocate memory. This is bad if they are used very often, and very liberally, and this is the case in Motoko: For example the iterators used for the for (x in e) construct have type

type Iter<T> = { next : () -> ?T }

and would unavoidably allocate a few words on each iteration. Can we avoid this?

Static values

It is quite simple to avoid this for null: Just statically create a single null value and use it every time:

static_null = [NULL_TAG]

def null():
  return static_null

This way, at least null() doesn’t allocate. But we gain more: Now every null value is represented by the same pointer, and since the pointer points to static memory, it does not change even with a moving garbage collector, so we can speed up is_some:

def is_some(p):
  return p != static_null

This is not a very surprising change so far, and most compilers out there can and will do at least the static allocation of such singleton constructors.

For example, in Haskell, there is only a single empty list ([]) and a single Nothing value in your program, as you can see in my videos exploring the Haskell heap.

But can we get rid of the allocation in some(v) too?

Unwrapped optional values

If we don’t want to allocate in some(v), what can we do? How about simply

def some(v):
  return v

That does not allocate! But it is also broken: at type ??Int, the values null and ?null are distinct, yet with this definition both would be represented by static_null.

Or, more formally, the following laws should hold for our four primitive operations:

  1. is_some(null()) = false
  2. ∀v. is_some(some(v)) = true
  3. ∀p. project(some(p)) = p

But with the new definition of some, we’d get is_some(some(null())) = false. Not good!

But note that we only have a problem if we are wrapping a value that is null or some(v). So maybe take the shortcut only then, and write the following:

def some(v):
  if v == static_null || v[0] == SOME_TAG:
    ptr <- alloc(2)
    ptr[0] = SOME_TAG
    ptr[1] = v
    return ptr
  else:
    return v

The definition of is_some can stay as it is: It is still the case that all null values are represented by static_null. But the some values are now no longer all of the same shape, so we have to change project():

def project(p):
  if p[0] == SOME_TAG:
    return p[1]
  else:
    return p

Now things fall into place: A value ?v can, in many cases, be represented the same way as v, and no allocation is needed. Only when v is null or ?null or ??null or ???null etc. do we need to use the some heap object, and thus have to allocate.

In fact, it doesn’t cost much to avoid allocation even for ?null:

static_some_null = [SOME_TAG, static_null]

def some(v):
  if v == static_null:
    return static_some_null
  else if v[0] == SOME_TAG:
    ptr <- alloc(2)
    ptr[0] = SOME_TAG
    ptr[1] = v
    return ptr
  else:
    return v

So unless one nests the ? type two levels deep, there is no allocation whatsoever, and the only cost is a bit more branching in some and project.
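To convince oneself that the laws still hold, here is a small executable model of this final scheme in Python (my own sketch, not Motoko's implementation: tuples stand in for heap objects, and Python object identity plays the role of pointer equality into static memory):

```python
# A toy model of the allocation-free optional representation.
# ("null",) models the static {N} object; ("some", v) models a [S, v]
# heap object; any other Python value models an unwrapped payload.
STATIC_NULL = ("null",)
STATIC_SOME_NULL = ("some", STATIC_NULL)  # the static {S,{N}} object

def null():
    return STATIC_NULL                    # shared, never allocated

def some(v):
    if v is STATIC_NULL:
        return STATIC_SOME_NULL           # ?null is also static
    if isinstance(v, tuple) and v[0] == "some":
        return ("some", v)                # v is a ?-tower: must box ("allocate")
    return v                              # otherwise ?v shares v's representation

def is_some(p):
    return p is not STATIC_NULL

def project(p):
    if isinstance(p, tuple) and p[0] == "some":
        return p[1]
    return p
```

In this model, some(some(23)) is just 23, while some(some(null())) produces a boxed ("some", STATIC_SOME_NULL), mirroring the table of representations below.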

That wasn’t hard, but quite rewarding, as one can now use idioms like the iterator shown above with greater ease.


The following table shows the representation of various values before and after. Here […] is a pointed-to dynamically allocated heap object, {…} a statically allocated heap object, N = NULL_TAG and S = SOME_TAG.

type    value   before          after
Null    null    {N}             {N}
?Int    null    {N}             {N}
?Int    ?23     [S,23]          23
??Int   null    {N}             {N}
??Int   ?null   [S,{N}]         {S,{N}}
??Int   ??23    [S,[S,23]]      23
???Int  null    {N}             {N}
???Int  ?null   [S,{N}]         {S,{N}}
???Int  ??null  [S,[S,{N}]]     [S,{S,{N}}]
???Int  ???23   [S,[S,[S,23]]]  23

Concluding remarks

  • If you know what parametric polymorphism is, and wonder how this encoding can work in a language that has that, note that this representation is on the low-level of the untyped run-time value representation: We don’t need to know the type of v in some(v), merely its heap representation.

  • The uniform representation in Motoko is a bit more involved: The pointers are tagged (by subtracting 1) and some scalar values are represented directly in place (shifted left by 1 bit). But this is luckily orthogonal to what I showed here.

  • Could Haskell do this for its Maybe type? Not so easily:

    • The Maybe type is not built-in, but rather a standard library-defined algebraic data type. But the compiler could feasibly detect that it is option-like?

    • Haskell is lazy, so at runtime, a value of type Maybe could be Nothing, or Just v, or, and this is crucial, a yet-to-be-evaluated expression, also called a thunk. And one definitely needs to distinguish between a thunk t :: Maybe a that may evaluate to Nothing, and a value Just t :: Maybe a that definitely is Just, but contains a value which may itself be a thunk.

    Something like this may work for a strict Maybe type or unlifted sums like (# (# #) | a #), but may clash with other tricks in GHC, such as pointer tagging.

  • As I said above, I don’t expect this to be novel, and I am happy to add references to prior art here.

  • Given that a heap object with tag SOME_TAG now always encodes a tower ?ⁿnull for n>0, one could try to optimize that even more by just storing the n:

    │ some │  n  │

    But that seems inadvisable: It is only a win if you have deep towers, which is probably rare. Worse, the project function would now have to return such a heap object with n decremented, so projection might have to allocate, which goes against the cost model expected by the programmer.

  • If you rather want to see code than blog posts, feel free to check out Motoko PR #2115.

  • Does this kind of stuff excite you? Motoko is open source, so your contributions may be welcome!

by Joachim Breitner at October 31, 2021 09:01 PM

October 30, 2021

Matthew Sackman

LaTeX, fonts, & NixOS

Back in 2006, for reasons that made sense to me at the time, and make no sense to anyone at all now, I decided to buy some fonts and use them when writing my final-year thesis for university. I was writing my thesis in LaTeX. What followed was a week in which I read some pretty thorough documentation and slowly figured out how to convert fonts in normal TrueType and OpenType formats into the variety of formats that LaTeX needs, and how to install them. Having been through this pain once, I then decided to use these fonts and this setup for pretty much every document I would ever write from that point onwards. As I was already using Debian, whenever I set up a new machine, I would copy all these font files to the new machine, putting them in the same places as before, and everything Just Worked.

I’ve just left my previous job, so it’s probably time I updated my CV (Résumé). Which is in LaTeX, using these fonts, and I’ve just switched OS from Debian to NixOS. So in order to be able to build my CV, I need to figure out how LaTeX works in NixOS, and how to install my own fonts in it. Now although I’m sure the number of people still using LaTeX is high (as far as I know, it’s still the standard for many subjects in academia), I suspect the number of people using LaTeX in NixOS is rather low. And the number of people who have installed their own fonts is even lower. The reason I think this is because there was precious little documentation about how to do this, and whilst it turned out to not require a lot of code, it took a solid day to figure it all out. That said, I’m definitely still a NixOS novice. And yes, I realise this is all ridiculous.

So in an attempt to improve the documentation situation, here’s a guide. Kinda.

We’re going to be creating some Nix derivations that:

  • Make your TrueType and OpenType fonts available for the system as a whole so programs like libreoffice, chrome, inkscape etc can all find and use them.
  • Create a LaTeX derivation that combines your custom fonts (plus any .sty files you’ve created) with a LaTeX derivation from nixpkgs.

I assume you’ve already done everything necessary to get your fonts ready for LaTeX. If the whole Berry naming scheme and all that jazz is still A Thing then I trust you’ve found and followed the guides and got your font files prepared.

System-wide fonts

Because fonts need to be added to various databases, we need to add derivations into the system-wide fonts.fonts variable. So, in our /etc/nixos/configuration.nix file, we’re going to need something like this:

{ config, pkgs, ... }:

let
  myfonts = import /home/matthew/.dotfiles/myfonts.nix { inherit pkgs; };
in {
  fonts.fonts = [ myfonts.fonts ];
}

You can guess that ~/.dotfiles is my repo for storing all my dotfiles and system configuration bits and bobs.

Firstly, let’s get the directory structure right. In my case, I’m installing MyriadPro which is from Adobe (and I have in OpenType format), and also Plantin, which is from Monotype (and I have in TrueType format). So:

$ cd ~/.dotfiles
$ mkdir -p myfonts/share/fonts/opentype/adobe/myriad/
$ mkdir -p myfonts/share/fonts/truetype/monotype/plantin/

Populate those directories with the relevant files. Now, let’s start ~/.dotfiles/myfonts.nix:

{ pkgs ? import <nixpkgs> {} }:

let
  fonts = pkgs.stdenvNoCC.mkDerivation {
    pname = "myfonts";
    version = "1.0.0";
    src = ./myfonts;
    dontConfigure = true;

    installPhase = ''
      runHook preInstall
      cp -R share $out/
      runHook postInstall
    '';

    meta = {
      description = "Adobe Myriad and Monotype Plantin fonts";
    };
  };
in {
  inherit fonts;
}

This file gets a bit bigger with time, which is why it’s set up the way it is, and returns attributes rather than a single derivation.

With these two changes done, we should be able to do a nixos-rebuild switch and now find our fonts are available to normal desktop programs.


LaTeX

LaTeX in NixOS is provided in a number of different schemes depending on quite how much of LaTeX (or rather TexLive) you want installed. I decided not to muck about and just install everything. So the basic derivation I’m trying to extend is pkgs.texlive.scheme-full. You may be aware that installing fonts into LaTeX requires:

  1. Putting all the font files in their various formats in the right place.
  2. Enabling your new font map files, and then running updmap --sys.

NixOS provides a pkgs.texlive.combine function that returns a derivation combining a bunch of LaTeX packages. That’s where we’ll start: we’ll provide our own package as an argument to combine. This will at least get all the files in the right place. Again, let’s begin by putting our LaTeX-suitable font files in the right layout for the source:

$ cd ~/.dotfiles/myfonts
$ mkdir -p tex/latex/adobe/myriad             # Put .sty and .fd files for Myriad in here
$ mkdir -p tex/latex/monotype/plantin         # Similarly .sty and .fd files.
$ mkdir -p share/fonts/vf/adobe/myriad        # For .vf files.
$ mkdir -p share/fonts/vf/monotype/plantin
$ mkdir -p share/fonts/type1/adobe/myriad     # For .type1 files.
$ mkdir -p share/fonts/type1/monotype/plantin
$ mkdir -p share/fonts/afm/adobe/myriad       # For .afm files.
$ mkdir -p share/fonts/afm/monotype/plantin
$ mkdir -p share/fonts/tfm/adobe/myriad       # For .tfm files.
$ mkdir -p share/fonts/tfm/monotype/plantin
$ mkdir -p share/fonts/map/dvips/myriad       # For .map files.
$ mkdir -p share/fonts/map/dvips/plantin

Keep the myfonts/share/fonts/opentype and myfonts/share/fonts/truetype directories from the previous steps unaltered.

With those directories all populated, we want to extend our myfonts.nix file:

{ pkgs ? import <nixpkgs> {} }:

let
  fonts = pkgs.stdenvNoCC.mkDerivation {
    pname = "myfonts";
    version = "1.0.0";
    passthru.tlType = "run";
    src = ./myfonts;
    dontConfigure = true;

    installPhase = ''
      runHook preInstall
      cp -R share $out/
      cp -R tex $out/
      runHook postInstall
    '';

    meta = {
      description = "Adobe Myriad and Monotype Plantin fonts";
    };
  };

  latexFonts = { pkgs = [ fonts ]; };

  mytexlive = pkgs.texlive.combine {
    inherit (pkgs.texlive) scheme-full;
    inherit latexFonts;
  };
in {
  inherit fonts mytexlive;
}

A few things to note:

  • We’re copying both tex and share now to $out: our single fonts derivation will work both to provide font files to LaTeX and to the system as a whole.
  • We’ve added passthru.tlType - I guess tl stands for TexLive. I’m honestly not certain about the "run" value, but I had a quick read through some of the TexLive derivations in nixpkgs and the only other value I could see was "bin" which then went into code that was doing wrapping of binaries and such like. So "run" seems right.
  • The latexFonts wrapping seems to be something that’s just required to present the package in the right way to combine.
  • mytexlive is the result of combining the provided pkgs.texlive.scheme-full derivation with our custom fonts derivation.

(Update: An earlier version of this post had both name and pname as attributes in the derivation. I was confused. I now believe that the “right” thing to do is always have pname and version and never worry much about name.)

So we should now be able to build this:

$ cd ~/.dotfiles
$ nix-build -A mytexlive myfonts.nix

If we look inside the ./result directory, we should find all our extra font files, and hopefully in the right places. However, attempting to use them won’t work because we’ve not run updmap yet. Building some LaTeX documents may not error now, but we’ll get font substitutions happening, and most likely wind up with Computer Modern being used. Bleugh!

Running updmap

Back in the Debian world, there’s some dpkg-reconfigure incantation I used to run to do the necessary work. For some reason I’d hoped that Nix would just magically spot the extra .map files appearing in the output directories and automatically run updmap. Alas, it does not. So, we need to add some commands to be run right at the end of our mytexlive derivation being built. The way to do this is using overrideAttrs. This requires a little change to our myfonts.nix file:

  postCombineOverride = oldAttrs: {
    postBuild = oldAttrs.postBuild + ''
      updmap --sys --enable Map=myriad.map
      updmap --sys --enable Map=plantin.map
      updmap --sys
    '';
  };

  mytexlive = (pkgs.texlive.combine {
    inherit (pkgs.texlive) scheme-full;
    inherit latexFonts;
  }).overrideAttrs postCombineOverride;

We’ve added this postCombineOverride thing, and appended to the mytexlive section. What we’re doing here is that we’re replacing the postBuild commands from the derivation that results from the pkgs.texlive.combine function. Thankfully, we’re provided the existing value via oldAttrs, and so we just add onto the end of that. We enable the two extra map files we’ve provided, and then we regenerate the font databases by calling updmap --sys.

That should do it. Running nix-build -A mytexlive myfonts.nix again should get us a LaTeX which fully knows about our extra fonts.

Adding to home-manager

Just as we edited /etc/nixos/configuration.nix to make the OpenType and TrueType fonts available to the whole system, we can also add to home-manager to make our custom LaTeX system available to ourselves. So editing ~/.config/nixpkgs/home.nix:

{ config, pkgs, ... }:

let
  myfonts = import /home/matthew/.dotfiles/myfonts.nix { inherit pkgs; };

  homeManager = {
    fonts.fontconfig.enable = true;
    home.packages = [
      myfonts.mytexlive
    ];
  };
in homeManager

And then a final home-manager switch should add our customised LaTeX to our normal user profile.

Not a crazy amount of work, but a bit fiddly, and not easy to figure all this out. So I hope this is useful to someone!

October 30, 2021 04:54 PM

Moving to NixOS

I started using Linux as my main desktop OS in about the year 2000. I bought the Debian 2.1 “slink” CD-ROM set. At the time I also dabbled with other OSes too: Windows, BeOS, and also Minix. Over the last 20 years, through University and my career so far as a software engineer, Debian has been incredibly reliable, and has been my go-to choice whenever setting up a new machine. About five years ago I came across NixOS. I spent a fair amount of time learning NixOS for a project at work and getting an initial understanding of its concepts and design. But then I changed job and put it away for a while.

As a software engineer, I increasingly despair at the level of accidental complexity that we, as a community and industry, have built for ourselves. It’s utterly absurd, not just how anything works, but the way we write software these days. In a no doubt futile quest to get a handle on and manage this complexity, I’ve decided to switch my main laptops and desktops to NixOS. I have no real intention for this piece to be evangelism. There are bits of the design of NixOS that I really like, and I will continue to use it. For me, being able to declare and construct environments which contain only the things I want, and nothing I don’t want, is valuable.

Here I’m just going to note a few bits and pieces that I’ve solved for myself that might come in handy for anyone walking the same path.

Emacs, GoPls and Systemd

I write quite a lot of Go; I still use emacs; I like using gopls; I dislike the amount of memory gopls uses (and sometimes CPU too) and I want to constrain this. So here’s the plan:

  1. Run gopls from a user systemd unit, and limit its memory. This boils down to running gopls -listen=... to create the server.
  2. Configure emacs to connect to the systemd-managed gopls, via unix domain socket. Which boils down to running gopls -remote=... to create the client and connect to the server.

I’m using home-manager, so I can create my own systemd units by adding to ~/.config/nixpkgs/home.nix:

systemd.user.services = {
  gopls = {
    Unit = {
      Description = "Run gopls as a daemon";
    };
    Install = {
      WantedBy = [ "default.target" ];
    };
    Service = {
      ExecStart = "${homeManager.home.homeDirectory}/go/bin/gopls -listen=unix;%t/gopls";
      ExecStopPost = "/run/current-system/sw/bin/rm -f %t/gopls";
      Restart = "always";
      RestartSec = 3;
      MemoryHigh = "1.5G";
      MemoryMax = "2G";
    };
  };
};
systemd.user.startServices = "sd-switch";

I’ve chosen to install gopls manually with the normal go install which means the binary has ended up at ~/go/bin/gopls. I could choose to use gopls as provided by Nix, but then I’d have to do more work if I wanted to upgrade it ahead of the nixpkgs repository. It’s a tradeoff: I’ve decided that I’m not worried about having it pinned to any particular version, and I just want the latest and I’ll be in charge of upgrading it every few days if I want to. But you might decide you don’t want to live on the bleeding edge so much, you want the Nix-provided version and so all you’d need to do is add pkgs.gopls to home.packages and then set ExecStart in the above to "${pkgs.gopls}/bin/gopls ..." or something like that.

So here we’re declaring a systemd unit, and putting some resource limits on it. After running home-manager switch we should be able to see gopls has been launched:

$ systemctl status --user gopls
● gopls.service - Run gopls as a daemon
Loaded: loaded (/nix/store/cg631mc569q6132avg4kn54751ivgl5v-home-manager-files/.config/systemd/user/gopls.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2021-10-30 12:37:34 BST; 1h 36min ago
Main PID: 1557 (gopls)
Tasks: 18 (limit: 37762)
Memory: 25.5M (high: 1.5G max: 2.0G)
CPU: 1.896s
CGroup: /user.slice/user-1000.slice/user@1000.service/app.slice/gopls.service
└─1557 /home/matthew/go/bin/gopls -listen=unix;/run/user/1000/gopls

Oct 30 12:37:34 rocket gopls[1557]: serve.go:114: Gopls daemon: listening on unix network, address /run/user/1000/gopls...
Oct 30 12:37:34 rocket systemd[1547]: Started Run gopls as a daemon.

In the systemd unit declaration, note the %t in the ExecStart and ExecStopPost lines. That gets expanded to the XDG_RUNTIME_DIR which is the right place to put a unix domain socket. So we’re creating (and removing) a socket at $XDG_RUNTIME_DIR/gopls. We now need to tell emacs how to connect to that.

In emacs, I’m using lsp-mode, go-mode and a bunch of other hooks and features (check out treemacs for some nice extensions which get almost-IDE-like features). We need to configure the lsp-go-gopls-server-args variable.

Snippet from ~/.emacs:

(use-package lsp-mode
  :ensure t
  :custom (lsp-go-gopls-server-args (list (format "-remote=unix;%s/gopls" (getenv "XDG_RUNTIME_DIR"))))
  :commands (lsp lsp-deferred)
  :config (progn
            ;; use flycheck, not flymake
            (setq lsp-prefer-flymake nil)))

(I don’t use home-manager to manage my ~/.emacs, but I do have that file checked in to a repo for safe-keeping and easy reproducibility.)

The variable lsp-go-gopls-server-args wants a list of strings, which is why I’m using the list function to put the string result of format into a list. I’ve gone to a little effort to grab that XDG_RUNTIME_DIR value from the environment and avoid hard-coding any paths there.

And that’s it, for now. It seems to work OK.

October 30, 2021 12:20 PM

October 29, 2021

Brent Yorgey

Competitive programming in Haskell: BFS, part 3 (implementation via HashMap)

In a previous post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, and we also explored some alternatives. I had a very interesting discussion with Andrey Mokhov in the comments about potential designs for an even more general API; more on that in a future post, perhaps!

For today, though, I want to finally show one way to implement this API efficiently. Spoiler alert: this implementation ultimately won’t be fast enough for us, but it will be a helpful stepping stone on our way to a yet faster implementation (which will of course get its own post in due time).

This post is literate Haskell; you can obtain the source from the darcs repo. We begin with a few LANGUAGE pragmas and imports.

{-# LANGUAGE RecordWildCards            #-}
{-# LANGUAGE ScopedTypeVariables        #-}
{-# LANGUAGE TupleSections              #-}

module BFS where

import           Control.Arrow               ((>>>))
import           Data.Hashable               (Hashable)
import           Data.HashMap.Strict         (HashMap, (!))
import qualified Data.HashMap.Strict         as HM
import           Data.List                   (foldl')
import           Data.Sequence               (Seq (..), ViewL (..), (|>))
import qualified Data.Sequence               as Seq

Now a couple utility functions: (>$>) is just flipped function application, and exhaust iterates an (a -> Maybe a) function as many times as possible, returning the last non-Nothing value.

infixl 0 >$>
(>$>) :: a -> (a -> b) -> b
(>$>) = flip ($)
{-# INLINE (>$>) #-}

exhaust :: (a -> Maybe a) -> a -> a
exhaust f = go
  where
    go a = maybe a go (f a)

Here is the BFSResult record that we ultimately want to produce; it should be familiar from previous posts.

data BFSResult v =
  BFSR { getLevel :: v -> Maybe Int, getParent :: v -> Maybe v }

While running our BFS, we’ll keep track of three things: the level of each vertex that has been encountered; a mapping from each encountered vertex to its parent; and a queue of vertices that have been encountered but not yet processed. We use a Seq from Data.Sequence to represent the queue, since it supports efficient (amortized constant-time) insertion and removal from either end of the sequence. There are certainly other potential ways to represent a queue in Haskell (and this probably deserves its own blog post) but Data.Sequence seems to give good performance for minimal effort (and in any case, as we’ll see, it’s not the performance bottleneck here). We use a pair of HashMaps to represent the level and parent maps.

data BFSState v =
  BS { level :: HashMap v Int, parent :: HashMap v v, queue :: Seq v }

Given a list of starting vertices, we can create an initial state, with a queue containing the starting vertices and all of them set to level 0.

initBFSState :: (Eq v, Hashable v) => [v] -> BFSState v
initBFSState vs = BS (HM.fromList (map (,0) vs)) HM.empty (Seq.fromList vs)

Now, here is our implementation of BFS, using the API discussed previously: it takes a list of starting vertices, a function giving the neighbors of each vertex, and a function identifying “target vertices” (so we can stop early), and returns a BFSResult record. We create an initial state, run bfsStep as much as possible, and convert the end state into a result.

bfs :: forall v. (Eq v, Hashable v) => [v] -> (v -> [v]) -> (v -> Bool) -> BFSResult v
bfs vs next goal = toResult $ exhaust bfsStep (initBFSState vs)
  where

Converting the final BFSState into a BFSResult is easy: just return functions that do a lookup into the relevant map.

    toResult BS{..} = BFSR (`HM.lookup` level) (`HM.lookup` parent)

To do a single step of BFS, try to remove the next vertex v from the queue. If the queue is empty, or the next vertex is a goal vertex, return Nothing to signal that we are done.

    bfsStep st@BS{..} = case Seq.viewl queue of
      EmptyL -> Nothing
      v :< q'
        | goal v    -> Nothing

Otherwise, use the next function to find the neighbors of v, keep only those we haven’t encountered before (i.e. those which are not keys in the level map), and use each one to update the BFS state (being sure to first set the queue to the new one with v removed).

        | otherwise ->
          v >$> next >>> filter (not . (`HM.member` level)) >>>
            foldl' (upd v) (st{queue=q'}) >>> Just

To update the BFS state based on a newly visited vertex, we record its parent, insert it into the level map with a level one greater than its parent, and add it to the end of the queue.

    upd p BS{..} v = BS
      (HM.insert v l level)
      (HM.insert v p parent)
      (queue |> v)
      where
        l = level!p + 1

And that’s it! This is good enough to solve many BFS problems on Open Kattis, such as Breaking Bad, ARMPIT Computations, and Folding a Cube. (I will leave you the pleasure of solving these problems yourself; I am especially fond of my Haskell solution to Folding a Cube.)
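For cross-checking the logic without a Haskell toolchain, the same algorithm can be transcribed almost line-for-line into Python (my own sketch, not part of the post: a dict stands in for each HashMap and collections.deque for the Seq):

```python
from collections import deque

def bfs(starts, next_fn, is_goal):
    """Return (get_level, get_parent) lookups, mirroring BFSResult."""
    level = {v: 0 for v in starts}   # starting vertices are at level 0
    parent = {}
    queue = deque(starts)
    while queue:
        v = queue.popleft()
        if is_goal(v):               # stop early at a target vertex
            break
        for w in next_fn(v):
            if w not in level:       # keep only unvisited neighbours
                level[w] = level[v] + 1
                parent[w] = v
                queue.append(w)
    return level.get, parent.get

# Example: breadth-first search from vertex 0 around a 6-cycle.
get_level, get_parent = bfs([0], lambda v: [(v + 1) % 6, (v - 1) % 6],
                            lambda v: False)
```

Here get_level(3) is 3 (vertex 3 sits opposite the start on the cycle), and get_level returns None for a vertex that was never reached, just as getLevel returns Nothing.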

Unfortunately, it is not fast enough to solve Modulo Solitaire, which I picked specifically because it seems to be one of the most computationally demanding BFS problems I’ve seen. My solution using this HashMap-based implementation solves a bunch of initial test cases, but exceeds the 2 second time limit on one of the later test cases. Next time, I’ll show how to adapt this into an even faster implementation which is actually fast enough to solve Modulo Solitaire.

by Brent at October 29, 2021 09:55 PM

Ken T Takusagawa

[gnsalavv] counting sentences with parentheses

consider the following rules defining a "sentence":

  1. a sentence is a string of letters, spaces, and parentheses.
  2. the first character in a sentence may not be a space.
  3. the final character in a sentence may not be a space.
  4. there may not be 2 or more spaces in a row.
  5. parentheses must be balanced.
  6. the string between balanced parentheses must be a valid sentence.

any string of letters forms a word: there is no dictionary or rules concerning permitted words.  there is no grammar about where words can go in a sentence.

(previously, without parentheses.)

there are three possibilities for rules concerning a space before an open-parenthesis and after a close-parenthesis:

  1. "spaces required": a space is always required before an open parenthesis and after a close parenthesis, except at the beginning or end of a sentence.  in other words, a parenthetical substitutes for a word.
  2. "spaces forbidden": spaces are forbidden before an open parenthesis and after a close parenthesis.  this is more compact than #1, taking the following view: a parenthesis is a separator, so a space would be redundant.
  3. "spaces optional": no rule, i.e., space permitted but not required in those locations.  sentences otherwise identical can differ by the presence or absence of such spaces.  in other words, a parenthetical can be a portion of a word.

we also consider a 4th possibility for completeness: "no spaces anywhere" = spaces forbidden everywhere.  sentences consist of letters and matched parentheses only.  previously.

(note that, because the string inside parentheses must be a valid sentence, and because sentences may not begin or end with spaces, spaces are never permitted after an open parenthesis or before a close parenthesis in any of the possibilities.)

possibility 1: spaces required

space required before an open parenthesis and after a close parenthesis.

assume an alphabet of exactly one letter for simplicity.

number of possible sentences of a given length, for an alphabet of 1 letter:

0: 1
1: 1
2: 2
3: 3
4: 7
5: 13
6: 28
7: 57
8: 122
9: 260
10: 563
20: 1803320
50: 234639824043294365
100: 2401704148722713218387226080782103540
200: 675802081149514595440924093540053480236325063284358890328823379454019858491
500: 82616673865208219840168987274343611550170249309187075398615393168193465853781619472575603429097239515922012333672624393300660923478883779991454212379508123961866005624093205144548567401645612

here are the 7 sentences exactly 4 characters long.  we use period as the space character to make it visible:

(()) ().a (aa) a.() a.aa aa.a aaaa
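these counts are straightforward to reproduce by brute force.  the following Python sketch (mine, not the code used to produce the tables here) checks possibility-1 validity recursively and counts sentences of a given length over the 1-letter alphabet:

```python
from itertools import product

def valid(s):
    """possibility 1 ("spaces required"); '.' stands for the space character."""
    if s == "":
        return True                       # the empty sentence is valid
    if s[0] == "." or s[-1] == "." or ".." in s:
        return False                      # rules 2, 3, 4
    i, n = 0, len(s)
    while i < n:
        if s[i] == ")":
            return False                  # unbalanced close parenthesis
        if s[i] == "(":
            if i > 0 and s[i - 1] != ".":
                return False              # space required before '(' unless at start
            depth, j = 1, i + 1
            while j < n and depth > 0:    # find the matching ')'
                depth += {"(": 1, ")": -1}.get(s[j], 0)
                j += 1
            if depth != 0:
                return False              # unbalanced open parenthesis
            if not valid(s[i + 1 : j - 1]):
                return False              # rule 6: inside must be a sentence
            if j < n and s[j] != ".":
                return False              # space required after ')' unless at end
            i = j
        else:
            i += 1
    return True

def count(n):
    return sum(valid("".join(t)) for t in product("a.()", repeat=n))
```

count(n) for n = 0..5 gives 1, 1, 2, 3, 7, 13, matching the table, and the 7 valid strings of length 4 are exactly the ones listed above.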

number of possible sentences of a given length, for an alphabet of X letters:

0: 1
1: X + 0
2: X^2 + 0 X + 1
3: X^3 + 1 X^2 + 1 X + 0
4: X^4 + 2 X^3 + 1 X^2 + 2 X + 1
5: X^5 + 3 X^4 + 2 X^3 + 5 X^2 + 1 X + 1
6: X^6 + 4 X^5 + 4 X^4 + 8 X^3 + 4 X^2 + 6 X + 1
7: X^7 + 5 X^6 + 7 X^5 + 12 X^4 + 13 X^3 + 12 X^2 + 4 X + 3
8: X^8 + 6 X^7 + 11 X^6 + 18 X^5 + 28 X^4 + 22 X^3 + 22 X^2 + 12 X + 2
9: X^9 + 7 X^8 + 16 X^7 + 27 X^6 + 50 X^5 + 46 X^4 + 60 X^3 + 28 X^2 + 19 X + 6
10: X^10 + 8 X^9 + 22 X^8 + 40 X^7 + 81 X^6 + 94 X^5 + 123 X^4 + 84 X^3 + 79 X^2 + 24 X + 7

the constant terms of the polynomials are OEIS A329691.  these count the sentences formed from an empty alphabet: sentences containing only spaces and parentheses, no letters.

possibility 2: spaces forbidden

spaces forbidden next to parentheses.

number of possible sentences of a given length, for an alphabet of 1 letter:

0: 1
1: 1
2: 2
3: 5
4: 10
5: 24
6: 54
7: 128
8: 305
9: 738
10: 1812
20: 22305995
50: 209085861583658998988
100: 2896163012069368328112246952862062350725222
200: 1538301376752204151873892519966611640610203038661778430235437213872294001191888210488379
500: 1296046302196493674416139158608742818799860682561639815148641768773774389295081119244166279227434997260848624438315999195498401549080170742562237181628517294428031369534378524852432685463153189441233522913434883852344497135

here are the 10 sentences exactly 4 characters long: (()) ()() ()aa (a)a (aa) a(a) a.aa aa() aa.a aaaa

number of possible sentences of a given length, for an alphabet of X letters:

0: 1
1: X + 0
2: X^2 + 0 X + 1
3: X^3 + 1 X^2 + 3 X + 0
4: X^4 + 2 X^3 + 5 X^2 + 0 X + 2
5: X^5 + 3 X^4 + 8 X^3 + 3 X^2 + 9 X + 0
6: X^6 + 4 X^5 + 12 X^4 + 10 X^3 + 22 X^2 + 0 X + 5
7: X^7 + 5 X^6 + 17 X^5 + 22 X^4 + 44 X^3 + 9 X^2 + 30 X + 0
8: X^8 + 6 X^7 + 23 X^6 + 40 X^5 + 81 X^4 + 44 X^3 + 96 X^2 + 0 X + 14
9: X^9 + 7 X^8 + 30 X^7 + 65 X^6 + 140 X^5 + 126 X^4 + 234 X^3 + 30 X^2 + 105 X + 0
10: X^10 + 8 X^9 + 38 X^8 + 98 X^7 + 229 X^6 + 284 X^5 + 505 X^4 + 192 X^3 + 415 X^2 + 0 X + 42

the constant terms of the polynomials, corresponding to an empty alphabet, are OEIS A126120, the Catalan sequence interspersed with zeros.

possibility 3: spaces optional

number of possible sentences of a given length, for an alphabet of 1 letter:

0: 1
1: 1
2: 2
3: 5
4: 13
5: 35
6: 97
7: 275
8: 794
9: 2327
10: 6905
20: 543861219
50: 1128333693246416457543548
100: 119628765495379222079660420511035921565831351707845
200: 3649654665420403223682340275125820396477186776743324179123217492162687232294919994358381100091839560848
500: 573687344404198770128089072739235272009117916044030235515101699291546310381453636803458852283006441770075054703364427541927298229216743855593043191380481458147380991797504771043559300958504387502948387474913767477040710263802219479301907466467399365606135187581

here are the 13 sentences exactly 4 characters long: (()) ()() ().a ()aa (a)a (aa) a()a a(a) a.() a.aa aa() aa.a aaaa

number of possible sentences of a given length, for an alphabet of X letters:

0: 1
1: X + 0
2: X^2 + 0 X + 1
3: X^3 + 1 X^2 + 3 X + 0
4: X^4 + 2 X^3 + 6 X^2 + 2 X + 2
5: X^5 + 3 X^4 + 11 X^3 + 9 X^2 + 10 X + 1
6: X^6 + 4 X^5 + 18 X^4 + 24 X^3 + 33 X^2 + 12 X + 5
7: X^7 + 5 X^6 + 27 X^5 + 51 X^4 + 88 X^3 + 60 X^2 + 38 X + 5
8: X^8 + 6 X^7 + 38 X^6 + 94 X^5 + 200 X^4 + 204 X^3 + 176 X^2 + 60 X + 15
9: X^9 + 7 X^8 + 51 X^7 + 157 X^6 + 403 X^5 + 555 X^4 + 620 X^3 + 356 X^2 + 156 X + 21
10: X^10 + 8 X^9 + 66 X^8 + 244 X^7 + 740 X^6 + 1296 X^5 + 1805 X^4 + 1480 X^3 + 930 X^2 + 284 X + 51

the constant terms of the polynomials, corresponding to an empty alphabet, are OEIS A167638.

possibility 4: no spaces anywhere

number of possible sentences of a given length, for an alphabet of 1 letter.  this is OEIS A001006, the "Motzkin numbers".

0: 1
1: 1
2: 2
3: 4
4: 9
5: 21
6: 51
7: 127
8: 323
9: 835
10: 2188
20: 50852019
50: 2837208756709314025578
100: 737415571391164350797051905752637361193303669
200: 135992213076511509724385771675602872730799169849536546904800645523735889653182288054195352807
500: 4743905065248174705076073568924606940687938425039221715559064934518576868883249780520846342406761520737185061080260761031842147475829201295002628662481146249642585695179800443503586610960281508335011647289023649206194869780049646575619

here are the 9 sentences exactly 4 characters long: (()) ()() ()aa (a)a (aa) a()a a(a) aa() aaaa
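as a cross-check of the values above, here is a throwaway snippet (my own, separate from the memoized counting program) computing the Motzkin numbers by their standard convolution recurrence:

```haskell
-- Motzkin numbers, OEIS A001006:
--   M(0) = 1,  M(n+1) = M(n) + sum_{k=0}^{n-1} M(k) * M(n-1-k)
motzkin :: [Integer]
motzkin = 1 : [ m n | n <- [0 ..] ]
  where
    m n = motzkin !! n
        + sum [ motzkin !! k * motzkin !! (n - 1 - k) | k <- [0 .. n - 1] ]
```

the first terms agree with the table above: 1, 1, 2, 4, 9, 21, 51, 127, 323, 835, 2188.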

number of possible sentences of a given length, for an alphabet of X letters.  there are many zero coefficients:

0: 1
1: X + 0
2: X^2 + 0 X + 1
3: X^3 + 0 X^2 + 3 X + 0
4: X^4 + 0 X^3 + 6 X^2 + 0 X + 2
5: X^5 + 0 X^4 + 10 X^3 + 0 X^2 + 10 X + 0
6: X^6 + 0 X^5 + 15 X^4 + 0 X^3 + 30 X^2 + 0 X + 5
7: X^7 + 0 X^6 + 21 X^5 + 0 X^4 + 70 X^3 + 0 X^2 + 35 X + 0
8: X^8 + 0 X^7 + 28 X^6 + 0 X^5 + 140 X^4 + 0 X^3 + 140 X^2 + 0 X + 14
9: X^9 + 0 X^8 + 36 X^7 + 0 X^6 + 252 X^5 + 0 X^4 + 420 X^3 + 0 X^2 + 126 X + 0
10: X^10 + 0 X^9 + 45 X^8 + 0 X^7 + 420 X^6 + 0 X^5 + 1050 X^4 + 0 X^3 + 630 X^2 + 0 X + 42

the constant terms of the polynomials, corresponding to an empty alphabet, are OEIS A126120, the Catalan sequence interspersed with zeros, same as possibility 2.

alphabet of size 26

next, we consider an alphabet of size X=26, in contrast to X=1 above.

number of possible sentences of length 18, for the 4 possibilities of spaces around parentheses:

53015540666536623728839718 = 10^ 25.724403194816087 (spaces required)

55175049982065362539432242 = 10^ 25.741742735253272 (spaces forbidden)

64004269864112471169216906 = 10^ 25.806208947680233 (spaces optional)

36555393489772334041228182 = 10^ 25.56295146310067 (no spaces anywhere)

number of possible sentences ("tweets") of length 140, for the 4 possibilities:

195934261323419694751910685930748987539493098537206064673150995769544781921774070731971968699512843893026062219012006297509999482003429684592290894981480864413992714261505322833423124487191189991245116 = 10^ 200.29211038394115 (spaces required)

279651170699176161293244273654763429207797519615693389895644252455841236471855853136070695512690866656907376176359966857585855146382978692320648870033749655770564609113840755678875203578876312834446376 = 10^ 200.44661664174802 (spaces forbidden)

82796358617823453501990692494372407625581449023146800949812002179372327361110631037893437558483173977254085396667494514825913324957849322646741729490697338529492350764942937726613380909403414602288119700 = 10^ 202.91801123694236 (spaces optional)

958111401189737767248075704214258864131901794586118314462994973720998051971111561994445027276127246857724857253761295408341276986563999954106395387550976416770852601625771829448219928651378985693500472 = 10^ 200.98141600814867 (no spaces anywhere)

in general, the number of possible sentences increases through "spaces required", "spaces forbidden", then "spaces optional".  possibility #4, "no spaces anywhere", does not fit in a consistent place among them.


computations were done with a Haskell program.  source code here.

we used the poly package to do arithmetic on polynomials.

the number of possibilities of a given length is often expressed in terms of counts for smaller lengths, so we used memoization to reuse previously calculated smaller results.  we manually implemented memoization using a State monad, storing results in a collection of Maps.  (using Data.Map to store a collection indexed by consecutive integers was a little bit of overkill.)

(previously, seemingly magical automatic memoization without a monad using the MemoTrie package.)
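the State-plus-Map idea looks roughly like this (a minimal sketch; the names memoize and fib are my own, not the program's):

```haskell
import Control.Applicative (liftA2)
import Control.Monad.State.Strict
import qualified Data.Map as Map

-- look up a previously computed result, or run the computation and
-- record its result in the Map held in State
memoize :: Ord k => k -> State (Map.Map k v) v -> State (Map.Map k v) v
memoize key calc = do
  table <- get
  case Map.lookup key table of
    Just v -> return v
    Nothing -> do
      v <- calc
      modify (Map.insert key v)
      return v

-- example use: naive Fibonacci becomes linear-time once memoized
fib :: Integer -> State (Map.Map Integer Integer) Integer
fib n
  | n < 2 = return n
  | otherwise = memoize n (liftA2 (+) (fib (n - 1)) (fib (n - 2)))
```

running with an initially empty table, e.g. evalState (fib 50) Map.empty, reuses every smaller result instead of recomputing it.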

an important space optimization was to fully evaluate values with seq before inserting into a memoization map:

  answer :: Poly.VPoly Integer <- calcanswer;
  -- poly (version 0.3.1) does not yet support DeepSeq; it was added in 0.3.2,
  -- so we force full evaluation through Eq instead of rnf
  seq (answer == answer) $ State.modify (setter n answer);

length 500 was approximately the longest sentence length for which we could count the number of sentences.  beyond that, computation time grew to hours and memory usage to gigabytes.  the large space usage was partially because we memoized polynomials, not integers, only evaluating a polynomial at the end if we were given a concrete size of the alphabet.

to avoid boilerplate for referring to State components, we used fclabels to create lenses for manipulating State.  the mkLabels declaration (the Template Haskell "splice") has to be placed carefully in the code: it must come after the data declaration, and before any code that uses the lenses.  moreover, any code before mkLabels cannot refer to anything declared after mkLabels, even if neither does anything with the lenses.  these rules were quite surprising; i discovered them by trial and error, having never used Template Haskell before.  in a normal Haskell program, all declarations other than import statements can be in any order.

normally i prefer to put the main function near the beginning of the program, but that was impossible here.

the -ddump-splices compilation flag dumps the splices generated by Template Haskell.  the generated code looks formidable, heavily using arrows, categorical points, and categories, so i chose not to try to understand what it is doing.

we tried both Control.Monad.State.Strict and Control.Monad.State.Lazy.  Strict had less memory usage, but Lazy allows output of infinite lists, which is convenient, demonstrated in "cumulative1family" which computes the running sum of all sentences of n characters or less.
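the kind of streaming that Lazy enables can be sketched like this (a standalone toy running-sum, not the actual cumulative1family code):

```haskell
import Control.Monad.State.Lazy

-- with the lazy State monad, a stateful traversal of an infinite list
-- can still be consumed lazily: taking a prefix of the result only
-- forces as many state transitions as needed
runningSums :: [Integer] -> [Integer]
runningSums xs = evalState (mapM step xs) 0
  where
    step x = do
      s <- get
      let s' = s + x
      put s'
      return s'
```

take 5 (runningSums [1..]) terminates with the lazy State monad; with Control.Monad.State.Strict the same expression would loop forever trying to traverse the whole infinite list.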

generating all possible sentences of a given length is equivalent to running a parser in reverse, with additional bookkeeping to keep track of the number of characters still left to output.  we used list as the nondeterminism monad.  we did not do memoization while generating all possible sentences because we never did generation on large sentence lengths.

because we use a state monad for memoization of counting and the nondeterminism monad for generation of sentences, the corresponding monadic codes are structurally similar.  below are some examples.  (the suffixes "1family" and "2family" refer to possibilities 1 (spaces required) and 2 (spaces forbidden) respectively.)

gensentence1family :: Integer -> [String];
gensentence1family n | n<0 = Monad.mzero
| n==0 = return ""
| True = do {
  i <- [1..n];
  Applicative.liftA2 (++)
   (genword1family i)
   (genrestsentence1family $ n-i);
};

countsentence1family :: Integer -> Monad1family Mypoly;
countsentence1family n | n<0 = return 0
| n==0 = return 1
| True = memoizelens n lenssentence1family $ sumM [1..n] (\i -> Applicative.liftA2 (*)
  (countword1family i)
  (countrestsentence1family $ n-i));


genparenthesized2family :: Integer -> [String];
genparenthesized2family n = do {
  y <- gensentence2family $ n-2;
  return $ ('(':y)++")";
};

countparenthesized2family :: Integer -> Monad2family Mypoly;
countparenthesized2family n = countsentence2family $ n-2;

it seems within the realm of imagination to unify the structurally similar codes, parameterizing over a polymorphic monad.  we did not pursue this.

it also seems possible to explicitly define a grammar then derive the generation and counting functions generically from grammars.  avoid ambiguous grammars to prevent generating duplicates and double-counting.  we did not pursue this.

future work: if there are N possible sentences, define and compute both directions of a bijection between sentences and the numbers 1 to N.

by Unknown at October 29, 2021 04:30 AM

GHC Developer Blog

GHC 9.2.1 is now available

bgamari - 2021-10-29

The GHC developers are very happy to at long last announce the availability of GHC 9.2.1. Binary distributions, source distributions, and documentation are available at

GHC 9.2 brings a number of exciting features including:

  • A native code generation backend for AArch64, significantly speeding compilation time on ARM platforms like the Apple M1.

  • Many changes in the area of records, including the new RecordDotSyntax and NoFieldSelectors language extensions, as well as support for DuplicateRecordFields with PatternSynonyms.

  • Introduction of the new GHC2021 language extension set, giving users convenient access to a larger set of language extensions which have been long considered stable.

  • Merging of ghc-exactprint into the GHC tree, providing infrastructure for source-to-source program rewriting out-of-the-box.

  • Introduction of a BoxedRep RuntimeRep, allowing for polymorphism over levity of boxed objects (#17526)

  • Implementation of the UnliftedDataTypes extension, allowing users to define types which do not admit lazy evaluation (proposal)

  • The new -hi profiling mechanism, which provides significantly improved insight into thunk leaks.

  • Support for the ghc-debug out-of-process heap inspection library

  • Significant improvements in the bytecode interpreter, allowing more programs to be efficiently run in GHCi and Template Haskell splices.

  • Support for profiling of pinned objects with the cost-centre profiler (#7275)

  • Faster compilation and a smaller memory footprint

  • Introduction of Haddock documentation support in TemplateHaskell (#5467)
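Of these, the record-dot changes are the easiest to demonstrate. A minimal sketch (my own example; the extension GHC 9.2 ships for this syntax is named OverloadedRecordDot, and Person/greet are made-up names):

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- with OverloadedRecordDot enabled, fields can be selected with
-- dot syntax on the record value, as in many other languages
data Person = Person { name :: String, age :: Int }

greet :: Person -> String
greet p = "Hello, " ++ p.name ++ "!"
```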

Finally, thank you to Microsoft Research, GitHub, IOHK, the Zw3rk stake pool, Tweag I/O, Serokell, Equinix, SimSpace, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Moreover, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do open a ticket if you see anything amiss.

Happy testing,

  • Ben

by ghc-devs at October 29, 2021 12:00 AM

October 28, 2021

Tweag I/O

Hacking on Ormolu: An internship report

After being convinced of the usefulness of code formatters by the excellent scalafmt, which is very widely used in the Scala ecosystem, I was on the lookout for a similar tool for Haskell. In 2018, most options did not completely cut it for me, due to e.g. only formatting certain parts of the source file or having many unfortunate bugs. Upon discovering Ormolu in 2019, I was first appalled by its style, but this reaction was only short-lived, as I found these stylistic preferences to be easily malleable via familiarisation.

I started using Ormolu for all personal projects, and submitted bug reports and minor pull requests. Therefore, an internship to work on Ormolu full-time came like a call!

As described in the announcement post, I worked on support for recent GHC versions, improved the CI setup, fixed various bugs and could also bring in my own suggestions. Let’s get into it!

Upgrading ghc-lib-parser

Like most modern tools operating on Haskell source files, Ormolu leverages the parser in GHC via ghc-lib-parser. As major GHC upgrades often result in significant changes in the exposed compiler API, upgrading Ormolu to a new version of ghc-lib-parser often involves a non-trivial amount of work. At the start of my internship, the upgrade to ghc-lib-parser-9.0 was long overdue. After playing type tetris for a while to get everything to compile, I had to dive into the details in order to debug subtle failures in the test suite.

One cool change in the GHC API which allowed me to simplify the code at several places works like this: In GHC 8.8, the pattern match coverage checker got smarter in detecting that constructors containing a Void-like type cannot occur.

-- A type with no inhabitants
data Void

-- Ex falso quodlibet
absurd :: Void -> a
absurd = \case {}

data Music a
  = AutoDetect !a
  | Opus
  | Flac

Now values of type Music FilePath could either be AutoDetect filePath, Opus or Flac, but values of type Music Void will always be either Opus or Flac. Types like this are ubiquitous in the GHC API due to a technique called Trees that grow, and can be thought of as a way to emulate anonymous sum types.

Note that the strictness annotation on a is crucial here to ensure that it is impossible to plug in something like undefined to create a value of type Music Void other than Opus and Flac. In GHC 9.0, these strictness annotations were added in the appropriate places, which allowed me to rewrite code like

isLossy :: Music Void -> Bool
isLossy = \case
  AutoDetect x -> absurd x
  Opus -> True
  Flac -> False

as

isLossy :: Music Void -> Bool
isLossy = \case
  Opus -> True
  Flac -> False

which is a nice reduction in cognitive load.

Getting to appreciate Nix

Ormolu provides binary releases, as compiling Ormolu from scratch takes a long time, especially due to the dependency on ghc-lib-parser. I had previously contributed a simple GitHub Actions workflow to do this, but it did not use the existing Nix setup, and in particular it possibly did not use the exact same set of dependencies as Ormolu's CI.

This sparked the idea of using haskell.nix in Ormolu’s Nix setup. In our case, the following features were particularly nice:

  • haskell.nix uses the build plan of cabal, instead of using a fixed package set. This is very convenient, as one does not have to manually override the version of one's dependencies if the default ones are unsuitable.
  • It is trivial to create a fully static (musl-based) Linux binary, and, amazingly, even to cross-compile to Windows! The following is the entire Nix code which is responsible for creating Ormolu’s standalone binaries:
binaries = {
  Linux = hsPkgs.projectCross.musl64.hsPkgs.ormolu.components.exes.ormolu;
  macOS = pkgs.runCommand "ormolu-macOS" {
    buildInputs = [ pkgs.macdylibbundler ];
  } ''
    mkdir -p $out/bin
    cp ${ormoluExe}/bin/ormolu $out/bin/ormolu
    chmod 755 $out/bin/ormolu
    dylibbundler -b -x $out/bin/ormolu -d $out/bin -p '@executable_path'
  '';
  Windows = hsPkgs.projectCross.mingwW64.hsPkgs.ormolu.components.exes.ormolu;
};

In addition, haskell.nix was for a long time the only way to reliably use recent GHCJS versions, which will be relevant as seen in the next section.

Reviving Ormolu Live

Earlier, I enjoyed using Ormolu Live, which allowed one to play around with Ormolu in the browser without installation. I suggested to revive this project as part of my internship, which was met with encouragement.

The original incarnation of Ormolu Live relied on reflex-platform, which does not yet support GHC 8.10, yet ghc-lib-parser-9.2 requires at least version 8.10. Therefore, I rewrote Ormolu Live using miso, a small Elm-like framework, and added some configurability and the option to view the GHC AST in the process.

The new Ormolu Live now lives in the Ormolu repo and is updated automatically on every commit to master. Feel free to play around with it here!

New features in Ormolu

Of course, I did not only work on peripheral tasks, but also on Ormolu itself directly. Two highlights:

Respecting .cabal files

In many projects, certain GHC language extensions are enabled for all modules in the project’s .cabal file:

  default-extensions: BangPatterns LambdaCase PatternSynonyms
  default-language: Haskell2010

Now, specifying the --cabal-default-extensions flag will make Ormolu automatically take these into consideration when parsing your Haskell source files.

As I am guilty of always pasting a huge set of extensions into my .cabal file for personal projects and found it very annoying to have to add these manually to Ormolu as CLI arguments, I am happy to have got this implemented!

This feature is also enabled by default in ormolu-action, the official way to run Ormolu via GitHub Actions.

Robust support for CPP and disabling formatting locally

Unfortunately, some Haskell code is impossible to format correctly automatically, like complex usage of the CPP language extension, or preserving a very specific code layout of a single function. This necessarily requires one to make tradeoffs, which was an interesting process with rewarding discussions. I want to thank @kukimik on GitHub for suggesting the basic idea we ended up incorporating.

We decided to replace the previous mechanism to handle these cases with a more principled approach, so in particular, you can now be confident that text between Ormolu’s magic comments won’t be touched at all:

{- ORMOLU_DISABLE -}
U can't touch this!
{- ORMOLU_ENABLE -}

We follow a simple but effective strategy: At first, all lines between these magic comments, but also lines between #if and #endif and similar constructs of CPP, are marked. Then all contiguous regions of unmarked lines are formatted individually, with the raw marked lines being interspersed at the end.
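The grouping step can be sketched as follows (a simplified illustration, not Ormolu's actual code; the per-line isRaw predicate stands in for the real marking pass, which also tracks state for magic-comment pairs and matching #if/#endif):

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- classify each line as raw or formattable, then group contiguous
-- lines of the same kind: formattable groups get formatted on their
-- own, raw groups pass through verbatim
data Region = Raw [String] | Formattable [String]
  deriving (Eq, Show)

splitRegions :: (String -> Bool) -> [String] -> [Region]
splitRegions isRaw = map toRegion . groupBy ((==) `on` isRaw)
  where
    toRegion ls
      | all isRaw ls = Raw ls
      | otherwise = Formattable ls
```

For example, with a predicate marking CPP directive lines, a file with an #if/#endif pair splits into three formattable regions separated by two raw ones.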

There are files using CPP that cannot be formatted correctly with this strategy, but with a basic mental model of how Ormolu works, as well as appropriately inserted magic comments, even the more complex cases should not be hard to adapt.

Bugs, bugs, bugs

Even though Ormolu is continuously tested on thousands of lines of Haskell code, various special cases of less used language features were still lurking around, waiting to disrupt someone's workflow. All such known incidents are now resolved. In particular, support of the Arrows extension is now significantly more robust, and a long-standing bug involving misplaced Haddock comments has been fixed.

As one of its goals, Ormolu strives to be idempotent, meaning that formatting twice will always yield the same result as only formatting once. It does a pretty good job at this, but as soon as comments are added to the mix, there are still many cases where one has to format twice (or sometimes even more often) to get to an idempotent snippet. This is not a perfect state of affairs, but fixing these kinds of issues is often very brittle and intricate with little real benefit, so we decided that these bugs should not be the primary focus of my internship. Possibly, an entirely new approach to printing comments might be necessary to get to the root of this problem.


In summary, the internship was an excellent experience. I learned many new things about the GHC API and finally got my hands dirty with Nix. I really enjoyed talking to many awesome people as part of numerous coffee breaks, and had a lot of fun with my mentor Mark!

October 28, 2021 12:00 AM

October 27, 2021

Haskell Foundation blog

Into the Future

by Andrew Boardman

In the previous post I talked about what the Haskell Foundation has been up to for the first seven months, now I want to discuss where we’re heading.

The Haskell Foundation could undertake a vast array of initiatives, so restricting the scope of what we will use our resources on has been a major part of our recent efforts.


Amplify Haskell’s impact on humanity.

We have selected this mission statement to focus our strategy. It is general enough to make sure we can support the community in the ways we need to, but gives us something to measure proposed strategies against.

Strategic Focus: 2022

Increase the productivity of junior, professional Haskell developers.

For our first major strategy we looked to generate positive feedback loops. Not only do we want to amplify Haskell’s impact on humanity, but we want to improve our community’s ability to make changes that do the same.


This focus translates into a bias for accepting Haskell Foundation Tech Proposals (HFTPs) that relate to tooling enhancements that make junior, professional Haskellers more productive. We will also create HFTPs as we find compelling use cases.

Update: To clarify, we mean developers who are junior in terms of professional Haskell usage, not in terms of their development career in general.

That does not mean that HFTPs need to be specific to junior, professional Haskellers. We chose this focus because we believe it will generate a broad array of additional improvements that benefit the entire community, while also ensuring that we get a specific set of improvements fully finished. Therefore, as we review proposals, we will prioritize those whose impact on junior, professional Haskellers is clear, but having a broader impact will be a plus.

Emily Pillmore and the Haskell Foundation Technical Track (HFTT) are responsible for both evaluating proposals from the community, as well as incubating ones where appropriate.


I just read the phrase “algorithmic attention arms race” by Mark Xu Neyer, and it sums up the state of our world in a really elegant way. The best way to break out of the shackles of consuming culture is to make, and the most leveraged way to make is to produce better tools, to make making faster, easier, and more efficient.


If you caught my Haskell Love talk, you know that I feel passionately about developer tools, the developer experience, and the impact that has on our ability to make an impact in the larger world. To recap the ideas I presented there:

  • Polished, functional, professional tooling allows developers to work at a higher abstraction level; the details they would otherwise have to juggle in their brains can be trivially accessible in their IDE.
  • A tricky part of learning Haskell is understanding how the concepts fit together, and how they translate into the Haskell run time. An IDE that simulates the runtime and shows developers how their code translates into a running system not only helps programmers be more productive, it helps them learn the language better and faster.
  • Making the language (and runtime) easier and faster to learn, understand, and debug, addresses some of the top reasons why Haskell’s reputation for production work can be rough.
  • A truly interactive, simulating IDE experience also takes care of an issue for bigger projects: the edit -> compile -> run -> test loop needs to be instantaneous from the developer’s perspective for maximum productivity. We must not allow our tools to interrupt developer flow.

The Haskell language deserves an integrated development environment that takes advantage of its pure, lazy, strongly typed, functional programming fundamentals and provides an experience that delivers an order of magnitude improvement in productivity.

Junior, Professional Haskellers

We believe the sweet spot for focusing on tooling improvements is the workflow of professional Haskellers who are at the beginner to intermediate level.

  • Hiring is both a strength and a weakness of our community. The self-selection bias of learning and sticking with Haskell gives a rich talent pool, but smaller than many languages.
  • Senior Haskellers are rare, and those with experience translating business needs into production quality code rarer still.
  • Managers therefore have their effectiveness gated on their ability to get quality work out of the beginner to intermediate engineers on their team, including their ability to hire them.
  • Shortening the time for Haskellers to become seniors and leads both fills the existing talent gap and makes our community as a whole much stronger.

Next Steps

The exciting new phase of Haskell Foundation operations is to identify, in our various task forces and committees, work that we can do to support our strategy, what resources to allocate to them, and get to work!

If you have an idea for a project that fits, are working on one already, or want to get involved, you can email the HF and we will direct you to the appropriate person or task force!


by Haskell Foundation at October 27, 2021 08:28 PM

October 26, 2021

Lysxia's blog

On proving lists infinite

It’s obvious what an infinite list is. But when formalizing things, we must be picky about definitions to not get tangled up in a messy web of concepts. This post will present some ways of saying that “a list is infinite” formally in Coq.1

Coinductive lists

Imports and options
From Coq Require Import Arith Lia.

Set Primitive Projections.
Set Implicit Arguments.
Set Maximal Implicit Insertion.
Set Contextual Implicit.

First, define the type of lists. Lists are made of Cons (::) and Nil. As it is a recursive type, we also have to decide whether to make it inductive, so that only finite lists can be constructed, or coinductive, so that lists might also be infinite sequences of Cons. We start by introducing the type’s base functor ColistF a _, presenting the two list constructors without recursion. We obtain the coinductive type Colist a as a fixed point of ColistF a : Type -> Type.

Inductive ColistF (a : Type) (x : Type) :=
| Nil : ColistF a x
| Cons : a -> x -> ColistF a x.

CoInductive Colist (a : Type) : Type :=
  Delay { force : ColistF a (Colist a) }.

Thus the type Colist a has a destructor force : Colist a -> ColistF a (Colist a) (the final coalgebra of ColistF a) and a constructor Delay : ColistF a (Colist a) -> Colist a. This ceremony may look all mysterious if you’re new to this; after living with coinductive types for a while, you will assimilate their philosophy of “destructors first”—unlike inductive types’ “constructors first”.

Notation prep
Add Printing Constructor Colist.

Declare Scope colist_scope.
Delimit Scope colist_scope with colist.
Local Open Scope colist_scope.

Some familiar notations, [] for Nil and :: for Cons.

Notation "'[' ']'" := Nil : colist_scope.
Notation "x :: xs" := (Cons x xs) : colist_scope.

Some simple definitions

Recursive definitions involving lists mostly look as you would expect in Coq as in any functional programming language, but every output list is wrapped in an explicit Delay, and every input list of a match is wrapped in a force. It’s as if you were handling lazy data structures in an eagerly evaluated programming language. Coq is a pure and total language, so evaluation order doesn’t matter as much as in partial languages, but the operational semantics is still careful to not reduce coinductive definitions unless they are forced.

Here is the map function that any self-respecting type of list must provide.

CoFixpoint map {a b} (f : a -> b) (xs : Colist a) : Colist b := Delay
  match force xs with
  | [] => []
  | x :: xs => f x :: map f xs
  end.

Another example is the list nats of all natural numbers. It relies on the more general definition of lists of numbers greater than an arbitrary natural number n.

CoFixpoint nats_from (n : nat) : Colist nat := Delay
  (n :: nats_from (S n)).

Definition nats := nats_from 0.
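For contrast, here is what these definitions look like in a language with pervasive laziness such as Haskell, where no Delay/force wrappers are needed (my own sketch, not from the post):

```haskell
-- in Haskell every data type is lazy by default, so possibly-infinite
-- lists are just the ordinary list type and map needs no wrapping
mapL :: (a -> b) -> [a] -> [b]
mapL f xs = case xs of
  [] -> []
  x : rest -> f x : mapL f rest

natsFrom :: Integer -> [Integer]
natsFrom n = n : natsFrom (n + 1)

nats :: [Integer]
nats = natsFrom 0
```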

Let’s put that aside for now. We will be needing map and nats later.

Never-ending lists

We will now say “infinite lists” in an informal you-know-what-I-mean sense, as we explore different ways of making it more formal, which will have their own names.

A list is infinite when it never ends with a Nil. But in constructive mathematics we never say never—it’s not even obvious how you could say it in this instance. A list is infinite when it, and its tails, always evaluate to a Cons.

A more “incremental” rephrasing of the above is that a list xs is infinite when xs evaluates to a Cons, and its tail is also infinite. That definition of infinite lists is recursive, so that you can “unfold” it iteratively to establish that every tail evaluates to a Cons. But because it is recursive, it’s not a priori well-defined.

Let us forget about “is infinite” for a second, and talk more generally about properties P that somehow subscribe to that definition: if xs satisfies P, then xs evaluates to a Cons, and the tail of xs satisfies P. Let us call such a P a never-ending invariant.

Definition Neverending_invariant {a} (P : Colist a -> Prop) : Prop :=
  forall xs, P xs -> exists x xs', force xs = Cons x xs' /\ P xs'.

The intuition is that if xs satisfies any never-ending invariant P, then xs must be infinite. This leads to our first characterization of infinite lists, “never-ending” lists.

Never-ending: definition

A list is never-ending when it satisfies some never-ending invariant.

Definition Neverending {a} (xs : Colist a) : Prop :=
  exists (P : Colist a -> Prop),
    Neverending_invariant P /\ P xs.

The key property that makes the notion of never-ending lists useful is the following unfolding lemma: a never-ending list is a Cons, and its tail is never-ending.

Note: you can hover and click on the tactics in proof scripts (Proof. ... Qed.) to see the intermediate proof states.2

Lemma unfold_Neverending {a} (xs : Colist a)
  : Neverending xs ->
    exists x xs',
      force xs = Cons x xs' /\ Neverending xs'.
Proof.
  intros NE.
  unfold Neverending in NE.
  destruct NE as [P [NE Hxs]].
  unfold Neverending_invariant in NE.
  apply NE in Hxs.
  destruct Hxs as [x [xs' [Hxs Hxs']]].
  exists x, xs'.
  split; [assumption | ].
  unfold Neverending.
  exists P.
  split; [ | assumption ].
  exact NE.
Qed.

Doesn’t that lemma’s statement remind you of Neverending_invariant above?

That lemma means exactly that the property of “being never-ending” is itself a never-ending invariant!

Lemma Neverending_invariant_Neverending {a}
  : Neverending_invariant (Neverending (a := a)).
Proof.
  unfold Neverending. (* This goal looks funny -> *)
  exact (@unfold_Neverending a).
Qed.

The definition of Neverending makes it the weakest never-ending invariant: all never-ending invariants imply Neverending.

Lemma Neverending_weakest {a} (P : Colist a -> Prop) (xs : Colist a)
  : Neverending_invariant P -> P xs -> Neverending xs.
Proof.
  intros INV H.
  unfold Neverending.
  exists P.
  split; assumption.
Qed.

This is actually an instance of a pretty general way of defining recursive properties (and recursive types, by Curry-Howard) without using recursion. You introduce a class of “invariants” identified by the recursive definition, and then you pick the strongest or weakest one, depending on the situation (inductive or coinductive).3

Lists with too many elements

This next property is sufficient but not necessary: a list must be infinite if it contains infinitely many distinct elements. While this sounds circular, we care only about defining “infinite lists”, and for that we can leverage other “infinities” already lying around, like the natural numbers. Note that an infinite list may not satisfy that property by repeating the same finitely many elements (e.g., repeat 0).

One way to show that a set is infinite is to exhibit an injective function from the natural numbers (or any other infinite set): distinct elements are mapped to distinct elements, or conversely, every image element has a unique antecedent.

Definition injective {a b} (f : a -> b) : Prop :=
  forall x y, f x = f y -> x = y.

Now we need to tie those elements to a list, using the membership relation In. That relation is defined inductively: an element x is in a list xs if either x is the head of xs or x is in the tail of the list.

Unset Elimination Schemes. (* Don't generate induction principles for us. *)
Inductive In {a : Type} (x : a) (xs : Colist a) : Prop :=
| In_split y ys : force xs = Cons y ys -> x = y \/ In x ys -> In x xs.
Lemma In_ind (a : Type) (x : a) (P : Colist a -> Prop)
    (H : forall xs (y : a) (ys : Colist a),
         force xs = y :: ys -> x = y \/ (In x ys /\ P ys) -> P xs)
  : forall xs, In x xs -> P xs.
  fix SELF 2; intros xs [].
  eapply H; eauto.
  destruct H1; [ left | right ]; auto.
Qed.

Lemma not_In_Nil {a} (x : a) xs : force xs = [] -> In x xs -> False.
  intros ? []; congruence.
Qed.
#[global] Hint Resolve not_In_Nil : core.

Naturally, an element cannot be in an empty list. Two distinct elements cannot be in a list of length one. And so on. So if we can prove that infinitely many elements are in a list, then the list must be infinite. Let us call this property “surnumerable”, since it means that we can enumerate a subset of its elements.

Surnumerability: definition

A list xs is surnumerable if there is some injective function f : nat -> a such that f i is in xs for all i.

Definition Surnumerable {a} (xs : Colist a) : Prop :=
  exists f : nat -> a,
    injective f /\ forall i, In (f i) xs.

Surnumerable implies Neverending

A simple approach is to prove that Surnumerable is a never-ending invariant, but that requires decidable equality on a. A more general solution considers the invariant satisfied by lists xs such that Surnumerable (ys ++ xs) for some finite ys. The pigeonhole reasoning for that proof seems challenging, so I haven’t done it myself.

Theorem Surnumerable_Neverending {a} (xs : Colist a)
  : Surnumerable xs -> Neverending xs.
  (* Exercise for the reader. *)
Abort.

Injectivity is not very “constructive”: you have to use a lot of tricks to recover useful information from it. In a proof that surnumerability implies never-ending-ness, a big part is to prove that surnumerability of a list Cons x xs implies (more or less) surnumerability of its tail xs. In other words, given an f which describes an infinite set of elements in Cons x xs, we must construct a new f2 which describes an infinite set of elements all in xs. The challenge is thus to “remove” the head x from the given injective function, if x occurs at all in its image. This would be easier if we had a pseudo-inverse of f, pointing to the antecedent of x under f. The existence of a pseudo-inverse is equivalent to injectivity classically, but it is stronger constructively. In category theory, a function f with a pseudo-inverse is called a split mono(morphism).

Definition splitmono {a b} (f : a -> b) : Prop :=
  exists g : b -> a, forall x, g (f x) = x.

We obtain a variant of Surnumerable using splitmono instead of injective.

Definition SplitSurnumerable {a} (xs : Colist a) : Prop :=
  exists (f : nat -> a),
    splitmono f /\ forall i, In (f i) xs.

The pseudo-inverse makes the proof of never-ending-ness much simpler.

Theorem SplitSurnumerable_Neverending {a} (xs : Colist a)
  : SplitSurnumerable xs -> Neverending xs.
  intros PN. unfold SplitSurnumerable in PN.
  destruct PN as (f & Hf & Hincl).
  unfold Neverending.
  (* Here is the never-ending invariant. *)
  exists (fun xs => exists n, forall i, n <= i -> In (f i) xs).
  split.
  - unfold Neverending_invariant.
    intros xs_ [n Hn].
    destruct (force xs_) as [ | x xs'] eqn:Hforce.
    + exfalso. eauto using not_In_Nil.
    + exists x, xs'; split; [ auto | ].
      destruct Hf as [g Hf].
      exists (max n (S (g x))).
      intros i Hi.
      specialize (Hn i (Nat.max_lub_l _ _ _ Hi)).
      destruct Hn.
      rewrite H in Hforce; inversion Hforce; subst; clear Hforce.
      destruct H0.
      * exfalso. rewrite <- H0 in Hi. rewrite Hf in Hi. lia.
      * assumption.
  - exists 0. auto.
Qed.

Surnumerability may be easier to prove than never-ending-ness in some situations. A proof that a list is never-ending essentially “walks through” the evaluation of the list, but in certain situations the list might be too abstract to inspect, for example when reasoning by parametricity,4 and we can only prove the membership of individual elements one by one.


Our last idea is that infinite lists (with element type a) are in bijection with functions nat -> a. So we can show that a list is infinite by proving that it corresponds to a function nat -> a via such a bijection. We shall use the obvious bijection that sends f to map f nats—and conversely sends an infinite list xs to a function index xs : nat -> a. We will thus say that a list xs is enumerable if it can be written as map f nats for some f.
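As a side note for Haskell readers, this bijection can be sketched with lazy lists standing in for colists (fromFun and toFun are my names, not the post's):

```haskell
-- Restricted to infinite lists, these two functions are mutually
-- inverse up to extensional equality.
nats :: [Int]
nats = [0 ..]

fromFun :: (Int -> a) -> [a]
fromFun f = map f nats

toFun :: [a] -> (Int -> a)
toFun xs i = xs !! i
```

One round trip is easy to check pointwise: toFun (fromFun f) i is map f [0..] !! i, which is f i.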

Equality of colists

Before we can state the equation xs = map f nats, we must choose a notion of equality. One can be readily obtained via the following coinductive relation, which corresponds to the relational interpretation of the type Colist à la Reynolds.5 It interprets the type constructor Colist : Type -> Type as a relation transformer RColist : (a -> b -> Prop) -> (Colist a -> Colist b -> Prop), which can be specialized to an equivalence relation RColist eq; we will write it in infix notation as == in the rest of the post.

Inductive RColistF {a b} (r : a -> b -> Prop) xa xb (rx : xa -> xb -> Prop)
  : ColistF a xa -> ColistF b xb -> Prop :=
| RNil : RColistF r rx [] []
| RCons x xs y ys : r x y -> rx xs ys -> RColistF r rx (Cons x xs) (Cons y ys).

CoInductive RColist {a b} (r : a -> b -> Prop) (xs : Colist a) (ys : Colist b) : Prop :=
  RDelay { Rforce : RColistF r (RColist r) (force xs) (force ys) }.

Notation "x == y" := (RColist eq x y) (at level 70) : colist_scope.

Enumerability: definition

We can now say formally that xs is enumerable by f if xs == map f nats.

Definition Enumerable_by {a} (f : nat -> a) (xs : Colist a) : Prop :=
  xs == map f nats.

Definition Enumerable {a} (xs : Colist a) : Prop :=
  exists f, Enumerable_by f xs.

As mentioned earlier, the equation xs == map f nats exercises one half of the bijection between infinite lists and functions on nat. Formalizing the other half takes more work, and it will actually let us prove that Neverending implies Enumerable.

Neverending implies Enumerable

Essentially, we need to define an indexing function index : Colist a -> nat -> a. However, this is only well-defined for infinite lists. A better type will be a dependent type index : forall (xs : Colist a), Neverending xs -> nat -> a, where the input list xs must be never-ending.

Start with a naive definition having the simpler type, which handles partiality with a default value:

Fixpoint index_def {a} (def : a) (xs : Colist a) (i : nat) : a :=
  match force xs, i with
  | Cons x _, O => x
  | Cons _ xs, S i => index_def def xs i
  | Nil, _ => def
  end.

Given a never-ending list, we are able to extract an arbitrary value to serve as a default—which will be passed to index_def but never actually used. It takes a bit of dependently typed programming, which we dispatch with tactics. And since we don’t actually care about the result, we can keep the definition opaque with Qed (instead of Defined).

Definition head_NE {a} (xs : Colist a) (NE : Neverending xs) : a.
  destruct (force xs) as [ | x xs' ] eqn:Hxs.
  - exfalso. apply unfold_Neverending in NE. destruct NE as [? [? []]]. congruence.
  - exact x.
Qed.

Combining index_def and head_NE, we obtain our index function.

Definition index {a} (xs : Colist a) (NE : Neverending xs) (i : nat) : a :=
  index_def (head_NE NE) xs i.

The remaining code in this post proves that a never-ending list xs is enumerated by index xs.

This first easy lemma says that index_def doesn’t depend on the default value if the list is never-ending.

Lemma index_def_Neverending {a} (def def' : a) (xs : Colist a) (i : nat)
  : Neverending xs -> index_def def xs i = index_def def' xs i.
  revert xs; induction i; intros * NE; cbn.
  all: apply unfold_Neverending in NE.
  all: destruct NE as [x [xs' [Hxs NE]]]. 
  all: rewrite Hxs.
  all: auto.
Qed.

The next lemma does the heavy lifting, constructing an “equality invariant” (or “bisimulation”) that must hold between all respective tails of xs and map (index xs) nats, which then implies ==.

Note that instead of index xs, we actually write index NE where NE is a proof of Neverending xs, since index requires that argument, and xs can be deduced from NE’s type.

Lemma Neverending_Enumerable_ {a} (xs : Colist a) (NE : Neverending xs)
    (f : nat -> a) (n : nat)
  : (forall i, f (n+i) = index NE i) ->
    xs == map f (nats_from n).
  revert xs NE n; cofix SELF; intros * Hf.
  assert (NE' := NE).
  apply unfold_Neverending in NE'.
  destruct NE' as [x [xs' [Hxs NE']]].
  rewrite Hxs; cbn.
  - specialize (Hf 0).
    cbn in Hf. rewrite Nat.add_0_r, Hxs in Hf. auto.
  - apply SELF with (NE := NE'); clear SELF.
    intros i. specialize (Hf (S i)).
    cbn in Hf. rewrite Nat.add_succ_r, Hxs in Hf.
    cbn; rewrite Hf. unfold index.
    apply index_def_Neverending. auto.
Qed.

Here’s the final result. A never-ending list xs is enumerated by index xs.

Theorem Neverending_Enumerable_by {a} (xs : Colist a) (NE : Neverending xs)
  : Enumerable_by (index NE) xs.
  unfold Enumerable_by, nats.
  apply Neverending_Enumerable_ with (NE0 := NE) (n := 0).
  intros i; reflexivity.
Qed.

We can repackage the theorem to hide the enumeration function, more closely matching the English sentence “never-ending-ness implies enumerability”.

Corollary Neverending_Enumerable {a} (xs : Colist a)
  : Neverending xs -> Enumerable xs.
  intros NE; eexists; apply Neverending_Enumerable_by with (NE0 := NE).
Qed.

The converse holds this time. The main insight behind the proof is that the property “xs == map f (nats_from n) for some n” is a never-ending invariant.

Theorem Enumerable_Neverending {a} (xs : Colist a)
  : Enumerable xs -> Neverending xs.
  unfold Enumerable, Enumerable_by. intros [f EB].
  unfold Neverending.
  exists (fun xs => exists n, xs == map f (nats_from n)).
  split.
  - unfold Neverending_invariant. intros xs_ [n EB_].
    destruct EB_ as [EB_]. cbn in EB_. inversion EB_; subst.
    exists (f n), xs0. split; [ auto | ].
    exists (S n). assumption.
  - exists 0; assumption.
Qed.

Reasoning with enumerability

I think Neverending is the most intuitive characterization of infinite lists, but Enumerable can be easier to use. To illustrate the point, let us examine a minimized version of my use case.

Consider an arbitrary function from lists of lists to lists: join : Colist (Colist a) -> Colist a.

Try to formalize the statement

When join is applied to a square matrix, i.e., a list of lists all of the same length, it computes the diagonal.

(NB: An infinite list of infinite lists is considered a square.)

The literal approach is to introduce two functions length (in the extended naturals) and diagonal, so we can translate the above sentence as follows:

forall (xs : Colist (Colist a)),
  (forall row, In row xs -> length row = length xs) ->
  join xs == diagonal xs. 

However, this is unwieldy because the definition of diagonal is not completely trivial. One will have to prove quite a few propositions about diagonal in order to reason effectively about it.

A more parsimonious solution relies on the idea that the “diagonal” is simple to define on functions f : b -> b -> a, as diagonal f := fun x => f x x. That leads to the following translation:

forall (f : b -> b -> a) (xs : Colist b),
  join (map (fun x => map (f x) xs) xs) = map (fun x => f x x) xs

It takes a bit of squinting to recognize the original idea, but the upside is that this is now a purely equational fact, without side conditions.

Rather than constrain a general list of lists to be a square, we generate squares from a binary function f : b -> b -> a and a list xs : Colist b representing the “sides” of the square, containing “coordinates” along one axis. In particular, we can use xs := nats as the side of an “infinite square”, and nats arises readily from Enumerable lists. Any square can be extensionally rewritten in that way. This theorem requires no ad-hoc definition like a separate diagonal function, and instead we can immediately use general facts about map both to prove and to use such a theorem.
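
As a sanity check of the equational statement, here is a finite-list sketch in Haskell (square and joinDiag are my own names; joinDiag plays the role of a diagonal-taking join):

```haskell
-- The "square" generated by a binary function f and a side xs.
square :: (b -> b -> a) -> [b] -> [[a]]
square f xs = map (\x -> map (f x) xs) xs

-- A join that extracts the diagonal of a square list of lists.
joinDiag :: [[a]] -> [a]
joinDiag xss = zipWith (!!) xss [0 ..]

-- The claim, instantiated:
--   joinDiag (square f xs) == map (\x -> f x x) xs
```

For example, joinDiag (square (+) [1,2,3]) and map (\x -> x + x) [1,2,3] both evaluate to [2,4,6].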

To summarize, we have seen three characterizations of infinite lists:

  • Surnumerable: the list contains infinitely many distinct elements (two versions, based on classical injections and split monos).
  • Never-ending: the list never terminates with Nil—always evaluates to Cons.
  • Enumerable: the list identifies with some function on nat.
Print SplitSurnumerable.
(*      ⇓      *)
Print Surnumerable.
(*      ⇓      *)
Print Neverending.
(*      ⇕      *)
Print Enumerable.

Can you think of other characterizations of infinite lists?

  1. Which I’ve used recently in a proof that there is no ZipList monad.↩︎

  2. Plugging Alectryon.↩︎

  3. This is a generalization of the types Mu and Nu as they are named in Haskell. This is also how the paco library defines coinductive propositions.↩︎

  4. Like in the no-ziplist-monad proof.↩︎

  5. See also my previous post.↩︎

by Lysxia at October 26, 2021 12:00 AM

October 25, 2021

Monday Morning Haskell

Monads want to be Free!

(This post is also available as a YouTube video!)

In last week's article I showed how we can use monad classes to allow limited IO effects in our functions. That is, we can get true IO functionality for something small (like printing to the terminal), without allowing a function to run any old IO action (like reading from the file system). In this way monad classes are the building blocks of Haskell's effect structures.

But there's another idea out there called "free monads". Under this paradigm, we can represent our effects with a data type, rather than a typeclass, and this can be a nicer way to conceptualize the problem. In this article I'll show how to use free monads instead of monad classes in the same Nim game example we used last time.
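
Before reaching for a library, it can help to see the core idea in miniature. Here is a self-contained, hand-rolled free monad over a Terminal effect; this is only a sketch of the concept (all names are mine, and this is not the Freer Effects API used below): the effect is just a data type, and interpreters are ordinary functions over it.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- The classic free monad: a tree of effect constructors with pure leaves.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- The Terminal effect as a plain data type.
data TerminalF next
  = LogMessage String next
  | GetInputLine (String -> next)
  deriving Functor

logMessage :: String -> Free TerminalF ()
logMessage s = Free (LogMessage s (Pure ()))

getInputLine :: Free TerminalF String
getInputLine = Free (GetInputLine Pure)

-- A pure interpreter: feed canned input lines, collect logged output.
runPure :: [String] -> Free TerminalF a -> ([String], a)
runPure _   (Pure a)                = ([], a)
runPure ins (Free (LogMessage s k)) =
  let (out, a) = runPure ins k in (s : out, a)
runPure (i : ins) (Free (GetInputLine k)) = runPure ins (k i)
runPure []        (Free (GetInputLine k)) = runPure [] (k "")
```

Because the effect is just data, runPure lets us test terminal logic without any IO; an IO interpreter would pattern match the same way, using putStrLn and getLine.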

The "starter" code for this article is on the monad-class branch here.

The "ending" code is on the eff branch.

And here is a pull request showing all the edits we'll make!

Intro to Free Monads

Free monads are kind of like Haskell Lenses in that there are multiple implementations out there for the same abstract concept. I'm going to use the Freer Effects library. If you use a different implementation, the syntax details might be a bit different, but the core ideas should still be the same.

The first thing to understand about using free monads, at least with this library, is that there's only a single monad, which we call the Eff monad. And to customize the behavior of this monad, it's parameterized by a type level list containing different effects. Now, we can treat any monad like an effect. So we can construct an instance of this Eff monad that contains the State monad over our game state, as well as the IO monad.

playGame :: Eff '[State (Player, Int), IO ] Player

Now in order to use monadic functionality within our Eff monad, we have to use the send function. So let's write a couple of helpers for the state monad to demonstrate this.

getState :: (Member (State (Player, Int)) r) => Eff r (Player, Int)
getState = send (get :: State (Player, Int) (Player, Int))

putState :: (Member (State (Player, Int)) r) => (Player, Int) -> Eff r ()
putState = send . (put :: (Player, Int) -> State (Player, Int) ())

Whereas a typical monad class function won't specify the complete monad m, in this case, we won't specify the complete effect list. We'll just call it r. But then we'll place what is called a Member constraint on this function. We'll say that State (Player, Int) must be a "member" of the effect list r. Then we can just use send in conjunction with the normal monadic functions. We can also add in some type specifiers to make things more clear for the compiler.

Creating an Effect Type

But now let's think about our MonadTerminal class from last time. This doesn't correspond to a concrete monad, so how would we use it? The answer is that instead of using a typeclass, we're going to make a data type representing this effect, called Terminal. This will be a generalized algebraic data type, or GADT. So its definition actually kind of does look like a typeclass. Notice this seemingly extra a parameter as part of the definition.

data Terminal a where
  LogMessage :: String -> Terminal ()
  GetInputLine :: Terminal String

Now we capitalized our function names to make these data constructors. So let's write functions now under the original lowercase names that will allow us to call these constructors. These functions will look a lot like our state functions. We'll say that Terminal must be a member of the type list r. And then we'll just use send except we'll use it with the appropriate constructor for our effect type.

logMessage :: (Member Terminal r) => String -> Eff r ()
logMessage = send . LogMessage

getInputLine :: (Member Terminal r) => Eff r String
getInputLine = send GetInputLine


At this point, you're probably wondering "hmmmm...when do we make these functions concrete"? After all, we haven't used putStrLn yet or anything like that. The answer is that we write an interpretation of the effect type, using a particular monad. This function will assume that our Terminal effect is on top of the effect stack, and it will "peel" that layer off, returning an action that no longer has the effect on the stack.

We call this function runTerminalIO because for this interpretation, we'll assume we are using the IO monad. And hence we will add a constraint that the IO monad is on the remaining stack r.

runTerminalIO :: (Member IO r) => Eff (Terminal ': r) a -> Eff r a
runTerminalIO = ...

To fill in this function, we create a natural transformation between a Terminal action and an IO action. For the LogMessage constructor of course we'll use putStrLn, and for GetInputLine we'll use getLine.

runTerminalIO :: (Member IO r) => Eff (Terminal ': r) a -> Eff r a
runTerminalIO = ...
  where
    terminalToIO :: Terminal a -> IO a
    terminalToIO (LogMessage msg) = putStrLn msg
    terminalToIO GetInputLine = getLine

Then to complete the function, we use runNat, a library function, together with this transformation.

runTerminalIO :: (Member IO r) => Eff (Terminal ': r) a -> Eff r a
runTerminalIO = runNat terminalToIO
  where
    terminalToIO :: Terminal a -> IO a
    terminalToIO (LogMessage msg) = putStrLn msg
    terminalToIO GetInputLine = getLine

Interpreting the Full Stack

Now our complete effect stack will include this Terminal effect, the State effect, and the IO monad. This final stack is like our GameMonad. We'll need to write a concrete function to turn this into a normal IO action.

transformGame :: Eff '[ Terminal, State (Player, Int), IO ] a -> IO a
transformGame = runM . (runNatS (Player1, 0) stateToIO) . runTerminalIO
  where
    stateToIO :: (Player, Int) -> State (Player, Int) a -> IO ((Player, Int), a)
    stateToIO prev act = let (a', nextState) = runState act prev in return (nextState, a')

This function is a bit like our other interpretation function in that it includes a transformation of the state layer. We combine this with our existing runTerminalIO function to get the final interpretation. Instead of runNat, we use runNatS to assign an initial state and allow that state to pass through to other calls.

Final Tweaks

And now there are just a few more edits we need to make. Most importantly, we can change the type signatures of our different functions. They should be in the Eff monad, and for every monad class constraint we used before, we'll now include a Member constraint.

playGame :: Eff '[ Terminal, State (Player, Int), IO ] Player

validateMove :: (Member Terminal r, Member (State (Player, Int)) r) => String -> Eff r (Maybe Int)

promptPlayer :: (Member Terminal r, Member (State (Player, Int)) r) => Eff r ()

readInput :: (Member Terminal r) => Eff r String

That's almost all we need to do! We also have to change the direct get and put calls to use getState and putState, but that's basically it! We can rebuild and play our game again now!

Conclusion: Effectful Haskell!

Now I know this overview was super quick so I could barely scratch the surface of how free monads work and what their benefits are. If you think these sound really cool though, and you want to learn this concept more in depth and get some hands on experience, you should sign up for our Effectful Haskell Course!

This course will teach you all the ins and outs of how Haskell allows you to structure effects, including how to do it with free monads. You'll get to see how these ideas work in the context of a decently-sized project. Even better is that you can get a 20% discount on it by subscribing to Monday Morning Haskell. So don't miss out, follow that link and get learning today!

by James Bowen at October 25, 2021 02:30 PM

October 21, 2021

Gabriel Gonzalez

Co-Applicative programming style


This post showcases an upcoming addition to the contravariant package that permits programming in a “co-Applicative” (Divisible) style that greatly resembles Applicative style.

This post assumes that you are already familiar with programming in an Applicative style, but if you don’t know what that is then I recommend first reading an introduction to Applicative functors.


The easiest way to motivate this is through a concrete example:

{-# LANGUAGE NamedFieldPuns #-}

import Data.Functor.Contravariant (Predicate(..), (>$<))
import Data.Functor.Contravariant.Divisible (Divisible, divided)

nonNegative :: Predicate Double
nonNegative = Predicate (0 <=)

data Point = Point { x :: Double, y :: Double, z :: Double }

nonNegativeOctant :: Predicate Point
nonNegativeOctant = adapt >$< nonNegative >*< nonNegative >*< nonNegative
  where
    adapt Point{ x, y, z } = (x, (y, z))

-- | This operator will be available in the next `contravariant` release
(>*<) :: Divisible f => f a -> f b -> f (a, b)
(>*<) = divided

infixr 5 >*<

This code takes a nonNegative Predicate on Doubles that returns True if the double is non-negative and then uses co-Applicative (Divisible) style to create a nonNegativeOctant Predicate on Points that returns True if all three coordinates of a Point are non-negative.

The key part to zoom in on is the nonNegativeOctant Predicate, whose implementation superficially resembles the Applicative style that we know and love:

nonNegativeOctant = adapt >$< nonNegative >*< nonNegative >*< nonNegative

The difference is that instead of the <$> and <*> operators we use >$< and >*<, which are their dual operators1. For example, you can probably see the resemblance to the following code that uses Applicative style:

readDouble :: IO Double
readDouble = readLn

readPoint :: IO Point
readPoint = Point <$> readDouble <*> readDouble <*> readDouble


I’ll walk through the types involved to help explain how this style works.

First, we will take this expression:

nonNegativeOctant = adapt >$< nonNegative >*< nonNegative >*< nonNegative

… and explicitly parenthesize the expression instead of relying on operator precedence and associativity:

nonNegativeOctant = adapt >$< (nonNegative >*< (nonNegative >*< nonNegative))

So the smallest sub-expression is this one:

nonNegative >*< nonNegative

… and given that the type of nonNegative is:

nonNegative :: Predicate Double

… and the type of the (>*<) operator is:

(>*<) :: Divisible f => f a -> f b -> f (a, b)

… then we can specialize the f in that type to Predicate (since Predicate implements the Divisible class):

(>*<) :: Predicate a -> Predicate b -> Predicate (a, b)

… and further specialize a and b to Double:

(>*<) :: Predicate Double -> Predicate Double -> Predicate (Double, Double)

… and from that we can conclude that the type of our subexpression is:

nonNegative >*< nonNegative
  :: Predicate (Double, Double)

In other words, nonNegative >*< nonNegative is a Predicate whose input is a pair of Doubles.
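
We can confirm this concretely with the published contravariant API, using divided directly (since (>*<) is not yet released):

```haskell
import Data.Functor.Contravariant (Predicate(..))
import Data.Functor.Contravariant.Divisible (divided)

nonNegative :: Predicate Double
nonNegative = Predicate (0 <=)

-- A Predicate on a pair of Doubles, built from two smaller Predicates.
bothNonNegative :: Predicate (Double, Double)
bothNonNegative = divided nonNegative nonNegative

main :: IO ()
main = do
  print (getPredicate bothNonNegative (1.5, 0.0))   -- True
  print (getPredicate bothNonNegative (1.5, -2.0))  -- False
```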

We can then repeat the process to infer the type of this larger subexpression:

nonNegative >*< (nonNegative >*< nonNegative)
  :: Predicate (Double, (Double, Double))

In other words, now the input is a nested tuple of three Doubles.

However, we want to work with Points rather than nested tuples, so we pre-process the input using >$<:

adapt >$< (nonNegative >*< (nonNegative >*< nonNegative))
  where
    adapt :: Point -> (Double, (Double, Double))
    adapt Point{ x, y, z } = (x, (y, z))

… and this works because the type of >$< is:

(>$<) :: Contravariant f => (a -> b) -> f b -> f a

… and if we specialize f to Predicate, we get:

(>$<) :: (a -> b) -> Predicate b -> Predicate a

… and we can further specialize a and b to:

:: (Point -> (Double, (Double, Double)))
-> Predicate (Double, (Double, Double))
-> Predicate Point

… which implies that our final type is:

nonNegativeOctant :: Predicate Point
nonNegativeOctant = adapt >$< (nonNegative >*< (nonNegative >*< nonNegative))
  where
    adapt Point{ x, y, z } = (x, (y, z))


We can better understand the relationship between the two sets of operators by studying their types:

-- | These two operators are dual to one another:
(<$>) :: Functor f => (a -> b) -> f a -> f b
(>$<) :: Contravariant f => (a -> b) -> f b -> f a

-- | These two operators are similar in spirit, but they are not really dual:
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
(>*<) :: Divisible f => f a -> f b -> f (a, b)

Okay, so (>*<) is not exactly the dual operator of (<*>). (>*<) is actually dual to liftA2 (,)2:

(>*<)      :: Divisible   f => f a -> f b -> f (a, b)
liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)

In fact, if we were to hypothetically redefine (<*>) to be liftA2 (,) then we could write Applicative code that is even more symmetric to the Divisible code (albeit less ergonomic):

import Control.Applicative (liftA2)
import Prelude hiding ((<*>))

(<*>) = liftA2 (,)

infixr 5 <*>

readDouble :: IO Double
readDouble = readLn

readPoint :: IO Point
readPoint = adapt <$> readDouble <*> readDouble <*> readDouble
  where
    adapt (x, (y, z)) = Point{ x, y, z }

-- Compare to:
nonNegativeOctant :: Predicate Point
nonNegativeOctant = adapt >$< nonNegative >*< nonNegative >*< nonNegative
  where
    adapt Point{ x, y, z } = (x, (y, z))

It would be nice if we could create a (>*<) operator that was dual to the real (<*>) operator, but I could not figure out a good way to do this.

If you didn’t follow all of that, the main thing you should take away from this going into the next section is:

  • the Contravariant class is the dual of the Functor class
  • the Divisible class is the dual of the Applicative class
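
For reference, here are the two classes spelled out, together with Predicate's instances written from scratch (this mirrors the contravariant package's definitions; the local Predicate shadows the library one so the example is self-contained):

```haskell
class Contravariant f where
  contramap :: (a -> b) -> f b -> f a

class Contravariant f => Divisible f where
  divide  :: (a -> (b, c)) -> f b -> f c -> f a
  conquer :: f a

newtype Predicate a = Predicate { getPredicate :: a -> Bool }

instance Contravariant Predicate where
  contramap g (Predicate p) = Predicate (p . g)

instance Divisible Predicate where
  -- Split the input, test both halves, and require both to pass.
  divide split (Predicate p) (Predicate q) =
    Predicate (\a -> case split a of (b, c) -> p b && q c)
  -- The Predicate that accepts everything.
  conquer = Predicate (\_ -> True)

-- divided (and hence >*<) is definable from divide:
divided :: Divisible f => f a -> f b -> f (a, b)
divided = divide id
```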

Syntactic sugar

GHC supports the ApplicativeDo extension, which lets you use do notation as syntactic sugar for Applicative operators. For example, we could have written our readPoint function like this:

{-# LANGUAGE ApplicativeDo #-}

readPoint :: IO Point
readPoint = do
    x <- readDouble
    y <- readDouble
    z <- readDouble
    return Point{ x, y, z }

… which behaves in the exact same way. Actually, we didn’t even need the ApplicativeDo extension because IO has a Monad instance and anything that has a Monad instance supports do notation without any extensions.

However, the ApplicativeDo language extension does change how the do notation is desugared. Without the extension the above readPoint function would desugar to:

readPoint =
    readDouble >>= \x ->
    readDouble >>= \y ->
    readDouble >>= \z ->
    return Point{ x, y, z }

… but with the ApplicativeDo extension the function instead desugars to only use Applicative operations instead of Monad operations:

-- I don't know the exact desugaring logic, but I imagine it's similar to this:
readPoint = adapt <$> readDouble <*> readDouble <*> readDouble
adapt x y z = Point{ x, y, z }

So could there be such a thing as “DivisibleDo” which would introduce syntactic sugar for Divisible operations?

I think there could be such an extension, and there are several ways you could design the user experience.

One approach would be to permit code like this:

{-# LANGUAGE DivisibleFrom #-}

nonNegativeOctant :: Predicate Point
nonNegativeOctant =
    from Point{ x, y, z }
        x -> nonNegative
        y -> nonNegative
        z -> nonNegative

… which would desugar to the original code that we wrote:

nonNegativeOctant = adapt >$< nonNegative >*< nonNegative >*< nonNegative
  where
    adapt Point{ x, y, z } = (x, (y, z))

Another approach could be to make the syntax look exactly like do notation, except that information flows in reverse:

{-# LANGUAGE DivisibleDo #-}

nonNegativeOctant :: Predicate Point
nonNegativeOctant = do
    x <- nonNegative
    y <- nonNegative
    z <- nonNegative
    return Point{ x, y, z } -- `return` here would actually be a special keyword

I assume that most people will prefer the from notation, so I’ll stick to that for now.

If we were to implement the former DivisibleFrom notation then the Divisible laws stated using from notation would become:

-- Left identity
from x
    x -> m
    x -> conquer

= m

-- Right identity
from x
    x -> conquer
    x -> m

= m

-- Associativity
from (x, y, z)
    (x, y) -> from (x, y)
        x -> m
        y -> n
    z -> o

= from (x, y, z)
    x -> m
    (y, z) -> from (y, z)
        y -> n
        z -> o

= from (x, y, z)
    x -> m
    y -> n
    z -> o

This explanation of how DivisibleFrom would work is really hand-wavy, but if people were genuinely interested in such a language feature I might take a stab at making the semantics of DivisibleFrom sufficiently precise.


The original motivation for the (>*<) operator and Divisible style was to support compositional RecordEncoders for the dhall package.

Dhall’s Haskell API defines a RecordEncoder type which specifies how to convert a Haskell record to a Dhall syntax tree, and we wanted to be able to use the Divisible operators to combine simpler RecordEncoders into larger RecordEncoders, like this:

data Project = Project
    { name :: Text
    , description :: Text
    , stars :: Natural
    }

injectProject :: Encoder Project
injectProject =
    recordEncoder
        ( adapt >$< encodeFieldWith "name" inject
                >*< encodeFieldWith "description" inject
                >*< encodeFieldWith "stars" inject
        )
  where
    adapt Project{..} = (name, (description, stars))

The above example illustrates how one can assemble three smaller RecordEncoders (each of the encodeFieldWith functions) into a RecordEncoder for the Project record by using the Divisible operators.

If we had a DivisibleFrom notation, then we could have instead written:

injectProject =
    recordEncoder from Project{..}
        name -> encodeFieldWith "name" inject
        description -> encodeFieldWith "description" inject
        stars -> encodeFieldWith "stars" inject

If you’d like to view the original discussion that led to this idea you can check out the original pull request.


I upstreamed this (>*<) operator into the contravariant package, which means that you’ll be able to use the trick outlined in this post after the next contravariant release.

Until then, you can define your own (>*<) operator inline within your own project, which is what dhall did while waiting for the operator to be upstreamed.

  1. Alright, they’re not categorically dual in a rigorous sense, but I couldn’t come up with a better term to describe their relationship to the original operators.↩︎

  2. I feel like liftA2 (,) should have already been added to Control.Applicative by now since I believe it’s a pretty fundamental operation from a theoretical standpoint.↩︎

by Gabriella Gonzalez at October 21, 2021 03:26 PM

Sandy Maguire

Proving Commutativity of Polysemy Interpreters

To conclude this series of posts on polysemy-check, today we’re going to talk about how to ensure your effects are sane. That is, we want to prove that correct interpreters compose into correct programs. If you’ve followed along with the series, you won’t be surprised to note that polysemy-check can test this right out of the box.

But first, what does it mean to talk about the correctness of composed interpreters? This idea comes from Yang and Wu’s Reasoning about effect interaction by fusion. The idea is that for a given program, changing the order of two subsequent actions from different effects should not change the program. Too abstract? Well, suppose I have two effects:

foo :: Member Foo r => Sem r ()
bar :: Member Bar r => Sem r ()

Then, the composition of interpreters for Foo and Bar is correct if and only if1 the following two programs are equivalent:

forall m1 m2.
  m1 >> foo >> bar >> m2
    =
  m1 >> bar >> foo >> m2

That is, since foo and bar are actions from different effects, they should have no influence on one another. This sounds like an obvious property; effects correspond to individual units of functionality, and so they should be completely independent of one another. At least — that’s how we humans think about things. Nothing actually forces this to be the case, and extremely hard-to-find bugs will occur if this property doesn’t hold, because it breaks a mental abstraction barrier.
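
As a minimal, pure illustration of the failure mode (plain mtl State, no polysemy; all names here are mine): two actions that secretly share one piece of state do not commute.

```haskell
import Control.Monad.State (State, modify, put, execState)

-- Two "effects" that secretly share one piece of state.
increment :: State (Int, Bool) ()
increment = modify (\(c, b) -> (c + 1, b))

resetWorld :: State (Int, Bool) ()
resetWorld = put (0, False)

main :: IO ()
main = do
  -- Swapping the order changes the result, so they don't commute:
  print (execState (resetWorld >> increment) (5, True))  -- (1,False)
  print (execState (increment >> resetWorld) (5, True))  -- (0,False)
```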

It’s hard to come up with good examples of this property being broken in the wild, so instead we can simulate it with a different broken abstraction. Let’s imagine we’re porting a legacy codebase to polysemy, and the old code hauled around a giant stateful god object:

data TheWorld = TheWorld
  { counter :: Int
  , lots    :: Int
  , more'   :: Bool
  , stuff   :: [String]
  }

To quickly get everything ported, we replaced the original StateT TheWorld IO application monad with a Member (State TheWorld) r constraint. But we know better than to do that for the long haul, and instead are starting to carve out effects. We introduce Counter:

data Counter m a where
  Increment :: Counter m ()
  GetCount :: Counter m Int

makeSem ''Counter

with an interpretation into our god object:

runCounterBuggy
    :: Member (State TheWorld) r
    => Sem (Counter ': r) a
    -> Sem r a
runCounterBuggy = interpret $ \case
  Increment ->
    modify $ \world -> world
                         { counter = counter world + 1
                         }
  GetCount ->
    gets counter

On its own, this interpretation is fine. The problem occurs when we use runCounterBuggy to handle Counter effects that coexist in application code that uses the State TheWorld effect. Indeed, polysemy-check tells us what goes wrong:

quickCheck $
  prepropCommutative @'[State TheWorld] @'[Counter] $
    pure . runState defaultTheWorld . runCounterBuggy

we see:


Effects are not commutative!

k1  = Get
e1 = Put (TheWorld 0 0 False [])
e2 = Increment
k2  = Pure ()

(k1 >> e1 >> e2 >> k2) /= (k1 >> e2 >> e1 >> k2)
(TheWorld 1 0 False [],()) /= (TheWorld 0 0 False [],())

Of course, these effects are not commutative under the given interpreter, because changing State TheWorld will overwrite the Counter state! That’s not to say that this sequence of actions actually exists anywhere in your codebase, but it’s a trap waiting to happen. Better to take defensive action and make sure nobody can ever even accidentally trip this bug!
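The failure mode can be stripped down to plain state-passing functions. The following is an illustrative model only, with TheWorld reduced to just its counter field; none of the names below are polysemy machinery:

```haskell
-- Minimal state-passing model of the bug (illustrative only; this is
-- plain Haskell, not polysemy). putWorld models 'Put': it replaces the
-- whole world, clobbering the counter that increment maintains.
data TheWorld = TheWorld { counter :: Int }
  deriving (Show, Eq)

putWorld :: TheWorld -> TheWorld -> TheWorld
putWorld new _old = new

increment :: TheWorld -> TheWorld
increment w = w { counter = counter w + 1 }

-- Put-then-Increment versus Increment-then-Put from the same start:
runA, runB :: TheWorld -> TheWorld
runA = increment . putWorld (TheWorld 0)  -- counter ends up as 1
runB = putWorld (TheWorld 0) . increment  -- counter ends up as 0
```

Since `runA` and `runB` disagree, the two actions do not commute, which is exactly the shape of the counterexample that polysemy-check reports above.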

The bug is fixed by using a different data store for Counter than TheWorld. Maybe like this:

runCounter
    :: Sem (Counter ': r) a
    -> Sem r a
runCounter = (evalState 0) . reinterpret @_ @(State Int) $ \case
  Increment -> modify (+ 1)
  GetCount -> get

Contrary to the old handler, runCounter now introduces its own anonymous State Int effect (via reinterpret), and then immediately eliminates it. This ensures the state is invisible to all other effects, with absolutely no opportunity to modify it. In general, this evalState . reinterpret pattern is a very good one for implementing pure effects.

Of course, a really complete solution here would also remove the counter field from TheWorld.

Behind the scenes, prepropCommutative is doing exactly what you’d expect — synthesizing monadic preludes and postludes, and then randomly pulling effects from each set of rows and ensuring everything commutes.

At first blush, using prepropCommutative to test all of your effects feels like an \(O(n^2)\) sort of deal. But take heart, it really isn’t! Let’s say our application code requires Members (e1 : e2 : e3 : es) r, and our eventual composed interpreter is runEverything :: Sem ([e] ++ es ++ [e3, e2, e1] ++ impl) a -> IO (f a). Here, we only need \(O(es)\) calls to prepropCommutative:

  • prepropCommutative @'[e2] @'[e1] runEverything
  • prepropCommutative @'[e3] @'[e2, e1] runEverything
  • prepropCommutative @'[e] @(es ++ '[e3, e2, e1]) runEverything

The trick here is that we can think of the composition of interpreters as an interpreter of composed effects. Once you’ve proven an effect commutes with a particular row, you can then add that effect into the row and prove a different effect commutes with the whole thing. Induction is pretty cool!

As of today there is no machinery in polysemy-check to automatically generate this linear number of checks, but it seems like a good thing to include in the library, and you can expect it in the next release.

To sum up these last few posts, polysemy-check is an extremely useful and versatile tool for proving correctness about your polysemy programs. It can be used to show the semantics of your effects (and their interpreters’ adherence to those semantics). It can show the equivalence of interpreters — such as the ones you use for testing, and those you use in production. And now we’ve seen how to use it to ensure that the composition of our interpreters maintains its correctness.

Happy testing!

  1. Well, there is a second condition regarding distributivity that is required for correctness. The paper goes into it, but polysemy-check doesn’t yet implement it.↩︎

October 21, 2021 12:53 AM

October 20, 2021


Induction without core-size blow-up, a.k.a. Large records: anonymous edition

An important factor affecting compilation speed and memory requirements is the size of the core code generated by ghc from Haskell source modules. Thus, if compilation time is an issue, this is something we should be conscious of and optimize for. In part 1 of this blog post we took an in-depth look at why certain Haskell constructs lead to quadratic blow-up in the generated ghc core code, and how the large-records library avoids these problems. Indeed, the large-records library provides support for records, including support for generic programming, with a guarantee that the generated core is never more than O(n) in the number of record fields.

The approach described there does not, however, directly apply to the case of anonymous records. This is the topic we will tackle in this part 2. Unfortunately, it seems that for anonymous records the best we can hope for is O(n log n), and even to achieve that we need to explore some dark corners of ghc. We have not attempted to write a new anonymous records library, but instead consider the problems in isolation; on the other hand, the solutions we propose should be applicable in other contexts as well. Apart from section Putting it all together, the rest of this blog post can be understood without having read part 1.

This work was done on behalf of Juspay; we will discuss the context in a bit more detail in the conclusions.

Recap: The problem of type arguments

Consider this definition of heterogeneous lists, which might perhaps form the (oversimplified) basis for an anonymous records library:

data HList xs where
  Nil  :: HList '[]
  (:*) :: x -> HList xs -> HList (x ': xs)

If we plot the size of the core code generated by ghc for a Haskell module containing only a single HList value

newtype T (i :: Nat) = MkT Word

type ExampleFields = '[T 00, T 01, .., T 99]

exampleValue :: HList ExampleFields
exampleValue = MkT 00 :* MkT 01 :* .. :* MkT 99 :* Nil

we get an unpleasant surprise:

The source of this quadratic behaviour is type annotations. The translation of exampleValue in ghc core is shown below (throughout this blog post I will show “pseudo-core”, using standard Haskell syntax):

exampleValue :: HList ExampleFields
exampleValue =
      (:*) @(T 00) @'[T 01, T 02, .., T 99] (MkT 00)
    $ (:*) @(T 01) @'[      T 02, .., T 99] (MkT 01)
    $ (:*) @(T 99) @'[                    ] (MkT 99)
    $ Nil

Every application of the (:*) constructor records the names and types of the remaining fields; clearly, this list is O(n) in the size of the record, and since there are also O(n) applications of (:*), this term is O(n²) in the number of elements in the HList.
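A quick back-of-the-envelope model makes the count concrete (this is ordinary Haskell arithmetic, not a measurement of actual core output):

```haskell
-- Rough model of the annotation cost: the application of (:*) at
-- position i still has to record the n - i remaining field types, so
-- the total number of types mentioned across all annotations is
-- n + (n - 1) + .. + 1 = n (n + 1) / 2, which is O(n^2).
annotationCount :: Int -> Int
annotationCount n = sum [n - i | i <- [0 .. n - 1]]
```

For the 100-field example above this already gives 5050 recorded types, and the figure grows quadratically as fields are added.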

We covered this in detail in part 1. In the remainder of this part 2 we will not be concerned with values of HList (exampleValue), but only with the type-level indices (ExampleFields).

Instance induction considered harmful

A perhaps surprising manifestation of the problem of quadratic type annotations arises for certain type class dictionaries. Consider this empty class, indexed by a type-level list, along with the customary two instances, for the empty list and an inductive instance for non-empty lists:

class EmptyClass (xs :: [Type])

instance EmptyClass '[]
instance EmptyClass xs => EmptyClass (x ': xs)

requireEmptyClass :: EmptyClass xs => Proxy xs -> ()
requireEmptyClass _ = ()

Let’s plot the size of a module containing only a single usage of this function:

requiresInstance :: ()
requiresInstance = requireEmptyClass (Proxy @ExampleFields)

Again, we plot the size of the module against the number of record fields (number of entries in ExampleFields):

The size of this module is practically identical to the size of the module containing exampleValue, because at the core level they actually look very similar. The translation of requiresInstance looks something like this:

d00  :: EmptyClass '[T 00, T 01, .., T 99]
d01  :: EmptyClass '[      T 01, .., T 99]
d99  :: EmptyClass '[                T 99]
dNil :: EmptyClass '[                    ]

d00  = fCons @'[T 01, T 02, .., T 99] @(T 00) d01
d01  = fCons @'[      T 02, .., T 99] @(T 01) d02
d99  = fCons @'[                    ] @(T 99) dNil
dNil = fNil

requiresInstance :: ()
requiresInstance = requireEmptyClass @ExampleFields d00 (Proxy @ExampleFields)

The two EmptyClass instances we defined above turn into two functions fCons and fNil, with types

fNil  :: EmptyClass '[]
fCons :: EmptyClass xs -> EmptyClass (x ': xs)

which are used to construct dictionaries. These look very similar to the Nil and (:*) constructor of HList, and indeed the problem is the same: we have O(n) calls to fCons, and each of those records all the remaining fields, which is itself O(n) in the number of record fields. Hence, we again have core that is O(n²).

Note that this is true even though the class is empty! It therefore really does not matter what we put inside the class: any induction of this shape will immediately result in quadratic blow-up. This was already implicit in part 1 of this blog post, but it wasn’t as explicit as it perhaps should have been—at least, I personally did not have it as clearly in focus as I do now.

It is true that if both the module defining the instances and the module using them are compiled with optimisation enabled, all of this might eventually be optimised away, but it’s not easy to know precisely when we can depend on that. Moreover, if our goal is to improve compilation time (and therefore the development cycle), we typically do not compile with optimizations enabled anyway.

Towards a solution

For a specific concrete record we can avoid the problem by simply not using induction. Indeed, if we add

instance {-# OVERLAPPING #-} EmptyClass ExampleFields

to our module, the size of the compiled module drops from 25,462 AST nodes to a mere 15 (that’s 15 full stop, not 15k!). The large-records library takes advantage of this: rather than using induction to define instances, it generates a single instance for each record type, which has constraints for every field in the record. The result is code that is O(n) in the size of the record.

The only reason large-records can do this, however, is that every record is explicitly declared. When we are dealing with anonymous records such an explicit declaration does not exist, and we have to use induction of some form. An obvious idea suggests itself: why don’t we try halving the list at every step, rather than reducing it by one, thereby reducing the code from O(n²) to O(n log n)?

Let’s try. To make the halving explicit1, we define a binary Tree type2, which we will use promoted to the type-level:

data Tree a =
    Zero
  | One a
  | Two a a
  | Branch a (Tree a) (Tree a)

EmptyClass is now defined over trees instead of lists:

class EmptyClass (xs :: Tree Type)

instance EmptyClass 'Zero
instance EmptyClass ('One x)
instance EmptyClass ('Two x1 x2)
instance (EmptyClass l, EmptyClass r) => EmptyClass ('Branch x l r)

We could also change HList to be parameterized over a tree instead of a list of fields. Ideally this tree representation should be an internal implementation detail, however, and so we instead define a translation from lists to balanced trees:

-- Evens [0, 1, .. 9] == [0, 2, 4, 6, 8]
type family Evens (xs :: [Type]) :: [Type] where
  Evens '[]            = '[]
  Evens '[x]           = '[x]
  Evens (x ': _ ': xs) = x ': Evens xs

-- Odds [0, 1, .. 9] == [1, 3, 5, 7, 9]
type family Odds (xs :: [Type]) :: [Type] where
  Odds '[]       = '[]
  Odds (_ ': xs) = Evens xs

type family ToTree (xs :: [Type]) :: Tree Type where
  ToTree '[]       = 'Zero
  ToTree '[x]      = 'One x
  ToTree '[x1, x2] = 'Two x1 x2
  ToTree (x ': xs) = 'Branch x (ToTree (Evens xs)) (ToTree (Odds xs))

and then redefine requireEmptyClass to do the translation:

requireEmptyClass :: EmptyClass (ToTree xs) => Proxy xs -> ()
requireEmptyClass _ = ()
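As a sanity check on the shape of this construction, the type families above can be prototyped at the term level, where it is easy to verify that the resulting trees have logarithmic depth (the definitions below mirror Evens, Odds and ToTree; the depth helper is my own):

```haskell
-- Term-level mirror of Evens, Odds and ToTree (for experimentation only).
data Tree a = Zero | One a | Two a a | Branch a (Tree a) (Tree a)
  deriving (Show, Eq)

evens, odds :: [a] -> [a]
evens []           = []
evens [x]          = [x]
evens (x : _ : xs) = x : evens xs

odds []       = []
odds (_ : xs) = evens xs

toTree :: [a] -> Tree a
toTree []       = Zero
toTree [x]      = One x
toTree [x1, x2] = Two x1 x2
toTree (x : xs) = Branch x (toTree (evens xs)) (toTree (odds xs))

-- The whole point of the halving: depth is logarithmic, not linear.
depth :: Tree a -> Int
depth Zero           = 0
depth (One _)        = 1
depth (Two _ _)      = 1
depth (Branch _ l r) = 1 + max (depth l) (depth r)
```

For example, depth (toTree [1 .. 100]) is 7, roughly log₂ 100, whereas the cons-by-cons induction corresponds to a chain of depth 100.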

The use sites (requiresInstance) do not change at all. Let’s measure again:

Total failure. Not only did we not improve the situation, it got significantly worse.

What went wrong?

To understand, let’s look at the case for when we have 10 record fields. The generated code looks about as clean as one might hope for:

dEx :: EmptyClass (ToTree ExampleFields)
dEx = dTree `cast` <..>

dTree :: EmptyClass ('Branch
               (T 0)
               ('Branch (T 1) ('Two (T 3) (T 7)) ('Two (T 5) (T 9)))
               ('Branch (T 2) ('Two (T 4) (T 8)) ('One (T 6))))
dTree =
    fBranch
      @('Branch (T 1) ('Two (T 3) (T 7)) ('Two (T 5) (T 9)))
      @('Branch (T 2) ('Two (T 4) (T 8)) ('One (T 6)))
      @(T 0)
      ( fBranch
           @('Two (T 3) (T 7))
           @('Two (T 5) (T 9))
           @(T 1)
           (fTwo @(T 3) @(T 7))
           (fTwo @(T 5) @(T 9))
      ( -- .. right branch similar

requiresInstance :: ()
requiresInstance = requireEmptyClass @ExampleFields dEx (Proxy @ExampleFields)

Each recursive call to the dictionary construction function fBranch is now indeed happening at half the elements, as intended.

The need for a cast

The problem is in the first line:

dEx = dTree `cast` <..>

Let’s first understand why we have a cast here at all:

  1. The type of the function that we are calling is

    requireEmptyClass :: forall xs. EmptyClass (ToTree xs) => Proxy xs -> ()

    We must pick a value for xs: we pick ExampleFields.

  2. In order to be able to call the function, we must produce evidence (that is, a dictionary) for EmptyClass (ToTree ExampleFields). We therefore have to evaluate ToTree ... to ('Branch ...); as we do that, we construct a proof π that ToTree ... is indeed equal to the result (Branch ...) that we computed.

  3. We now have evidence of EmptyClass ('Branch ...), but we need evidence of EmptyClass (ToTree ...). This is where the cast comes in: we can coerce one to the other given a proof that they are equal; since π proves that

    ToTree ... ~ 'Branch ...

    we can construct a proof that

    EmptyClass (ToTree ...) ~ EmptyClass ('Branch ...)

    by appealing to congruence (if x ~ y then T x ~ T y).

The part in angle brackets <..> that I elided above is precisely this proof.


Let’s take a closer look at what that proof looks like. I’ll show a simplified form3, which I hope will be a bit easier to read and in which the problem is easier to spot:

  ToTree[3] (T 0) '[T 1, T 2, T 3, T 4, T 5, T 6, T 7, T 8, T 9]
; Branch (T 0)
    ( Evens[2] (T 1) (T 2) '[T 3, T 4, T 5, T 6, T 7, T 8, T 9]
    ; Evens[2] (T 3) (T 4) '[T 5, T 6, T 7, T 8, T 9]
    ; Evens[2] (T 5) (T 6) '[T 7, T 8, T 9]
    ; Evens[2] (T 7) (T 8) '[T 9]
    ; Evens[1] (T 9)
    ; ToTree[3] (T 1) '[T 3, T 5, T 7, T 9]
    ; Branch (T 1)
        ( Evens[2] (T 3) (T 5) '[T 7, T 9]
        ; Evens[2] (T 7) (T 9) '[]
        ; Evens[0]
        ; ToTree[2] (T 3) (T 7)
        ( Odds[1] (T 3) '[T 5, T 7, T 9]
        ; Evens[2] (T 5) (T 7) '[T 9]
        ; Evens[1] (T 9)
        ; ToTree[2] (T 5) (T 9)
    ( Odds[1] (T 1) '[T 2, T 3, T 4, T 5, T 6, T 7, T 8, T 9]
    ; .. -- omitted (similar to the left branch)

The “axioms” in this proof – ToTree[3], Evens[2], etc. – refer to specific cases of our type family definitions. Essentially the proof is giving us a detailed trace of the evaluation of the type families. For example, the proof starts with

ToTree[3] (T 0) '[T 1, T 2, T 3, T 4, T 5, T 6, T 7, T 8, T 9]

which refers to the fourth line of the ToTree definition

ToTree (x ': xs) = 'Branch x (ToTree (Evens xs)) (ToTree (Odds xs))

recording the fact that

  ToTree '[T 0, .., T 9]
~ 'Branch (T 0) (ToTree (Evens '[T 1, .. T 9])) (ToTree (Odds '[T 1, .. T 9]))

The proof then splits into two, giving a proof for the left subtree and a proof for the right subtree, and here’s where we see the start of the problem. The next fact that the proof establishes is that

Evens '[T 1, .. T 9] ~ '[T 1, T 3, T 5, T 7, T 9]

Unfortunately, the proof to establish this contains a line for every evaluation step:

  Evens[2] (T 1) (T 2) '[T 3, T 4, T 5, T 6, T 7, T 8, T 9]
; Evens[2] (T 3) (T 4) '[T 5, T 6, T 7, T 8, T 9]
; Evens[2] (T 5) (T 6) '[T 7, T 8, T 9]
; Evens[2] (T 7) (T 8) '[T 9]
; Evens[1] (T 9)

We have O(n) such steps, and each step itself has O(n) size since it records the remaining list. The coercion therefore has size O(n²) at every branch in the tree, leading to an overall coercion also4 of size O(n²).
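That estimate can be double-checked by evaluating the recurrence for the total trace size under halving (a rough model of the proof sizes, not a measurement of actual core):

```haskell
-- Rough model: a node over n elements pays an ~n^2 Evens/Odds trace
-- (n/2 steps, each recording an O(n) list) and then recurses into two
-- halves. T(n) = n^2 + 2 T(n/2) solves to O(n^2) overall, since the
-- per-level costs form the geometric series n^2 (1 + 1/2 + 1/4 + ..).
proofSize :: Int -> Int
proofSize n
  | n <= 2    = 1
  | otherwise = n * n + 2 * proofSize (n `div` 2)
```

So despite the logarithmic depth of the tree, the coercions alone put us back at quadratic size: proofSize 128 evaluates to 32320, just under 2 · 128².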

Six or half a dozen

So far we defined

requireEmptyClass :: EmptyClass (ToTree xs) => Proxy xs -> ()
requireEmptyClass _ = ()

and are measuring the size of the core generated for a module containing

requiresInstance :: ()
requiresInstance = requireEmptyClass (Proxy @ExampleFields)

for some list ExampleFields. Suppose we do one tiny refactoring, and make the caller apply ToTree instead; that is, requireEmptyClass becomes

requireEmptyClass :: EmptyClass t => Proxy t -> ()
requireEmptyClass _ = ()

and we now call ToTree at the use site instead:

requiresInstance :: ()
requiresInstance = requireEmptyClass (Proxy @(ToTree ExampleFields))

Let’s measure again:

The quadratic blow-up disappeared! What happened? Just to add to the confusion, let’s consider one other small refactoring, leaving the definition of requireEmptyClass the same but changing the use site to

requiresInstance = requireEmptyClass @(ToTree ExampleFields) Proxy

Measuring one more time:

Back to blow-up! What’s going on?


We will see in the next section that the difference is due to roles. Before we look at the details, however, let’s first remind ourselves what roles are5. When we define

newtype Age = MkAge Int

then Age and Int are representationally equal but not nominally equal: if we have a function that wants an Int but we pass it an Age, the type checker will complain. Representational equality is mostly a run-time concern (when do two values have the same representation on the runtime heap?), whereas nominal equality is mostly a compile-time concern. Nominal equality implies representational equality, of course, but not vice versa.

Nominal equality is a very strict equality. In the absence of type families, types are only nominally equal to themselves; only type families introduce additional axioms. For example, if we define

type family F (a :: Type) :: Type where
  F Int = Bool
  F Age = Char

then F Int and Bool are considered to be nominally equal.

When we check whether two types T a and T b are nominally equal, we must simply check that a and b are nominally equal. To check whether T a and T b are representationally equal is however more subtle [Safe zero-cost coercions for Haskell, Fig 3]. There are three cases to consider, illustrated by the following examples:

data T1 x = T1
data T2 x = T2 x
data T3 x = T3 (F x)


  • T1 a and T1 b are representationally equal no matter the values of a and b: we say that x has a phantom role.
  • T2 a and T2 b are representationally equal if a and b are: x has a representational role.
  • T3 a and T3 b are representationally equal if a and b are nominally equal: x has a nominal role (T3 Int and T3 Age are certainly not representationally equal, even though Int and Age are!).

This propagates up; for example, in

data ShowType a = ShowType {
      showType :: Proxy a -> String
    }

the role of a is phantom, because a has a phantom role in Proxy. The roles of type arguments are automatically inferred, and usually the only interaction programmers have with roles is through

coerce :: Coercible a b => a -> b

where Coercible is a thin layer around representational equality. Roles thus remain invisible most of the time—but not today.
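For concreteness, here is a small, self-contained illustration of the representational and phantom roles in action via coerce (this is standard GHC behaviour; Age is the newtype from above):

```haskell
import Data.Coerce (coerce)
import Data.Proxy (Proxy (..))

newtype Age = MkAge Int
  deriving (Show, Eq)

-- Representational role: Coercible Age Int holds (the newtype
-- constructor is in scope), so this conversion is free at runtime.
ageToInt :: Age -> Int
ageToInt = coerce

-- Phantom role: Proxy's type argument can be swapped for any other
-- type whatsoever, because no value depends on it.
relabel :: Proxy Int -> Proxy Bool
relabel = coerce
```

The corresponding nominal case is exactly the T3 example above: no Coercible instance relates T3 Int and T3 Age, so the analogous coerce would be rejected.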

Proxy versus type argument

Back to the problem at hand. To understand what is going on, we have to be very precise. The function we are calling is

requireEmptyClass :: EmptyClass xs => Proxy xs -> ()
requireEmptyClass _ = ()

The question we must carefully consider is: what are we instantiating xs to? In the version with the quadratic blowup, where the use-site is

requiresInstance = requireEmptyClass @(ToTree ExampleFields) Proxy

we are being very explicit about the choice of xs: we are instantiating it to ToTree ExampleFields. In fact, we are more explicit than we might realize: we are instantiating it to the unevaluated type ToTree ExampleFields, not to ('Branch ...). Same thing, you might object, and you’d be right—but also wrong. Since we are instantiating it to the unevaluated type, but then build a dictionary for the evaluated type, we must then cast that dictionary; this is precisely the picture we painted in section What went wrong? above.

The more surprising case then is when the use-site is

requiresInstance = requireEmptyClass (Proxy @(ToTree ExampleFields))

In this case, we’re leaving ghc to figure out what to instantiate xs to. It will therefore instantiate it with some fresh variable α. When it then discovers that we also have a Proxy α argument, it must unify α with ToTree ExampleFields. Crucially, when it does so, it instantiates α to the evaluated form of ToTree ExampleFields (i.e., Branch ...), not the unevaluated form. In other words, we’re effectively calling

requiresInstance = requireEmptyClass @(Branch _ _ _) (Proxy @(ToTree ExampleFields))

Therefore, we need and build evidence for EmptyClass (Branch ...), and there is no need for a cast on the dictionary.

However, the need for a cast has not disappeared: if we instantiate xs to (Branch ...), but provide a proxy for (ToTree ...), we need to cast the proxy instead—so why the reduction in core size? Let’s take a look at the equality proof given to the cast:

   Univ(phantom phantom <Tree *>
-- (1)    (2)      (3)     (4)
     :: ToTree ExampleFields         -- (5)
      , 'Branch  (T 0)               -- (6)
           ('Branch (T 1)
               ('Two (T 3) (T 7))
               ('Two (T 5) (T 9)))
           ('Branch (T 2)
               ('Two (T 4) (T 8))
               ('One (T 6))))

This is it! The entire coercion that was causing us trouble has been replaced basically by a single-constructor “trust me” universal coercion. This is the only coercion that establishes “phantom equality”6. Let’s take a closer look at the coercion:

  1. Univ indicates that this is a universal coercion.
  2. The first occurrence of phantom is the role of the equality that we’re establishing.
  3. The second occurrence of phantom is the provenance of the coercion: what justifies the use of a universal coercion? Here this is phantom again (we can use a universal coercion when establishing phantom equality), but this is not always the case; one example is unsafeCoerce, which can be used to construct a universal coercion between any two types at all7.
  4. When we establish the equality of two types τ1 :: κ1 and τ2 :: κ2, we also need to establish that the two kinds κ1 and κ2 are equal. In our example, both types trivially have the same kind, and so we can just use <Tree *>: reflexivity for kind Tree *.

Finally, (5) and (6) are the two types that we are proving to be equal to each other.

This then explains the big difference between these two definitions:

requiresInstance = requireEmptyClass (Proxy @(ToTree ExampleFields))
requiresInstance = requireEmptyClass @(ToTree ExampleFields) Proxy

In the former, we are instantiating xs to (Branch ...) and we need to cast the proxy, which is basically free due to Proxy’s phantom type argument; in the latter, we are instantiating xs to (ToTree ...), and we need to cast the dictionary, which requires the full equality proof.


If a solution that relies on the precise nature of unification, instantiating type variables to evaluated types rather than unevaluated types, makes you a bit uneasy—I would agree. Moreover, even if we did accept that as unpleasant but necessary, we still haven’t really solved the problem. The issue is that

requireEmptyClass :: EmptyClass xs => Proxy xs -> ()
requireEmptyClass _ = ()

is a function we cannot abstract over: the moment we define something like

abstracted :: forall xs. EmptyClass (ToTree xs) => Proxy xs -> ()
abstracted _ = requireEmptyClass (Proxy @(ToTree xs))

in an attempt to make that ToTree call a responsibility of a library instead of use sites (after all, we wanted the tree representation to be an implementation detail), we are back to the original problem.

Let’s think a little harder about these roles. We finished the previous section with a remark that “(..) we need to cast the dictionary, which requires the full equality proof”. But why is that? When we discussed roles above, we saw that the a type parameter in

data ShowType a = ShowType {
      showType :: Proxy a -> String
    }

has a phantom role; yet, when we define a type class

class ShowType a where
  showType :: Proxy a -> String

(which, after all, is much the same thing), a is assigned a nominal role, not a phantom role. The reason for this is that ghc insists on coherence (see Safe zero-cost coercions for Haskell, section 3.2, Preserving class coherence). Coherence simply means that there is only ever a single instance of a class for a specific type; it’s part of how ghc prevents ambiguity during instance resolution. We can override the role of the ShowType dictionary and declare it to be phantom

type role ShowType phantom

but we lose coherence:

data ShowTypeDict a where
  ShowTypeDict :: ShowType a => ShowTypeDict a

showWithDict :: Proxy a -> ShowTypeDict a -> String
showWithDict p ShowTypeDict = showType p

instance ShowType Int where showType _ = "Int"

showTypeInt :: ShowTypeDict Int
showTypeInt = ShowTypeDict

oops :: String
oops = showWithDict (Proxy @Bool) (coerce showTypeInt)

This means that the role annotation for ShowType requires the IncoherentInstances language pragma (there is currently no class-level pragma).

Solving the problem

Despite the problem with roles and potential incoherence discussed in the previous section, role annotations on classes cannot make instance resolution ambiguous or result in runtime type errors or segfaults. We do have to be cautious with the use of coerce, but we can shield the user from this through careful module exports. Indeed, we have used role annotations on type classes to our advantage before.

Specifically, we can redefine our EmptyClass as an internal (not-exported) class as follows:

class EmptyClass' (xs :: Tree Type) -- instances as before
type role EmptyClass' phantom

then define a wrapper class that does the translation from lists to trees:

class    EmptyClass' (ToTree xs) => EmptyClass (xs :: [Type])
instance EmptyClass' (ToTree xs) => EmptyClass (xs :: [Type])

requireEmptyClass :: EmptyClass xs => Proxy xs -> ()
requireEmptyClass _ = ()

Now the translation to a tree has become an implementation detail that users do not need to be aware of, whilst still avoiding (super)quadratic blow-up in core:

Constraint families

There is one final piece to the puzzle. Suppose we define a type family mapping types to constraints:

type family CF (a :: Type) :: Constraint

In the next section we will see a non-contrived example of this; for now, let’s just introduce a function that depends on CF a for no particular reason at all:

withCF :: CF a => Proxy a -> ()
withCF _ = ()

Finally, we provide an instance for HList:

type instance CF (HList xs) = EmptyClass (ToTree xs)

Now let’s measure the size of the core generated for a module containing a single call

satisfyCF :: ()
satisfyCF = withCF (Proxy @(HList ExampleFields))


Shallow thinking

In section Proxy versus type argument above, we saw that when we call

requireEmptyClass @(Branch _ _ _) (Proxy @(ToTree ExampleFields))

we construct a proof π :: Proxy (ToTree ...) ~ Proxy (Branch ...), which then gets simplified. We didn’t spell out in detail how that simplification happens though, so let’s do that now.

As mentioned above, the type checker always works with nominal equality. This means that the proof constructed by the type checker is actually π :: Proxy (ToTree ...) ~N Proxy (Branch ...). Whenever we cast something in core, however, we want a proof of representational equality. The coercion language has an explicit constructor for this8, so we could construct the proof

sub π :: Proxy (ToTree ...) ~R Proxy (Branch ...)

However, the function that constructs this proof first takes a look: if it is constructing an equality proof T π' (i.e., an appeal to congruence for type T, applied to some proof π'), where T has an argument with a phantom role, it replaces the proof (π') with a universal coercion instead. This also happens when we construct a proof that EmptyClass (ToTree ...) ~R EmptyClass (Branch ...) (like in section Solving the problem), provided that the argument to EmptyClass is declared to have a phantom role.

Why doesn’t that happen here? When we call function withCF we need to construct evidence for CF (HList ExampleFields). Just like before, this means we must first evaluate this to EmptyClass (Branch ..), resulting in a proof of nominal equality π :: CF .. ~N EmptyClass (Branch ..). We then need to change this into a proof of representational equality to use as an argument to cast. However, where before π looked like Proxy π' or EmptyClass π', we now have a proof that looks like

  D:R:CFHList[0] <'[T 0, T 1, .., T 9]>_N
; EmptyClass π'

which first proves that CF (HList ..) ~ EmptyClass (ToTree ..), and only then proves that EmptyClass (ToTree ..) ~ EmptyClass (Branch ..). The function that attempts the proof simplification only looks at the top-level of the proof, and therefore does not notice an opportunity for simplification here and simply defaults to using sub π.

The function does not traverse the entire proof because doing so at every point during compilation could be prohibitively expensive. Instead, it does cheap checks only, leaving the rest to be cleaned up by coercion optimization. Coercion optimization is part of the “very simple optimiser”, which runs even with -O0 (unless explicitly disabled with -fno-opt-coercion). In this particular example, coercion optimization will replace the proof (π') by a universal coercion, but it will only do so later in the compilation pipeline; better to reduce the size of the core sooner rather than later. It is also not entirely clear that it will always be able to do so, especially since the coercion optimiser can make things significantly worse in some cases, and so it may be scaled back in the future. I think it’s preferable not to depend on it.

Avoiding deep normalization

As we saw, when we call

withCF (Proxy @(HList ExampleFields))

the type checker evaluates CF (HList ExampleFields) all the way to EmptyClass (Branch ...): although there is a ghc ticket proposing to change this, at present whenever ghc evaluates a type, it evaluates it all the way. For our use case this is frustrating: if ghc were to rewrite CF (HList ..) only to EmptyClass (ToTree ..) (with a tiny proof: just one rule application), we would then be back where we were in section Solving the problem, and we’d avoid the quadratic blow-up. Can we make ghc stop evaluating earlier? Yes, sort of. If instead of

type instance CF (HList xs) = EmptyClass (ToTree xs)

we say

class    EmptyClass (ToTree xs) => CF_HList xs
instance EmptyClass (ToTree xs) => CF_HList xs

type instance CF (HList xs) = CF_HList xs

then CF (HList xs) simply gets rewritten (in one step) to CF_HList xs, which contains no further type family applications. Now the type checker needs to check that CF_HList ExampleFields is satisfied, which will match the above instance, and hence it must check that EmptyClass (ToTree ExampleFields) is satisfied. Now, however, we really are back in the situation from section Solving the problem, and the size of our core looks good again:

Putting it all together: Generic instance for HList

Let’s put theory into practice and give a Generic instance for HList, of course avoiding quadratic blow-up in the process. We will use the Generic class from the large-records library, discussed at length in part 1.

The first problem we have to solve is that when giving a Generic instance for some type a, we need to choose a constraint

Constraints a c :: (Type -> Constraint) -> Constraint

such that when Constraints a c is satisfied, we can get a type class dictionary for c for every field in the record:

class Generic a where
  -- | @Constraints a c@ means "all fields of @a@ satisfy @c@"
  type Constraints a (c :: Type -> Constraint) :: Constraint

  -- | Construct vector of dictionaries, one for each field of the record
  dict :: Constraints a c => Proxy c -> Rep (Dict c) a

  -- .. other class members ..

Recall that Rep (Dict c) a is a vector containing a Dict c x for every field of type x in the record. Since we are avoiding heterogeneous data structures (due to the large type annotations), we effectively need to write a function

hlistDict :: Constraints a c => Proxy a -> [Dict c Any]

Let’s tackle this in quite a general way, so that we can reuse what we develop here for the next problem as well. We have a type-level list of types of the elements of the HList, which we can translate to a type-level tree of types. We then need to reflect this type-level tree to a term-level tree with values of some type f Any (in this example, f ~ Dict c). We will delegate reflection of the values in the tree to a separate class:

class ReflectOne (f :: Type -> Type) (x :: Type) where
  reflectOne :: Proxy x -> f Any

class ReflectTree (f :: Type -> Type) (xs :: Tree Type) where
  reflectTree :: Proxy xs -> Tree (f Any)

type role ReflectTree nominal phantom -- critical!

Finally, we can then flatten that tree into a list, in such a way that it reconstructs the order of the list (i.e., it’s a term-level inverse to the ToTree type family):

treeToList :: Tree a -> [a]
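As a sanity check on that claim, here is a possible term-level sketch. The Tree type and the toTree helper below are assumptions mirroring the type-level definitions from part 1 of the post; the point is only to illustrate that treeToList undoes the evens/odds split:

```haskell
-- Hypothetical term-level mirror of the type-level Tree and ToTree;
-- a sketch only, to illustrate that treeToList inverts the split.
data Tree a = Zero | One a | Two a a | Branch a (Tree a) (Tree a)
  deriving Show

-- Split a list into the balanced tree shape used by ToTree:
-- head at the branch, evens left, odds right.
toTree :: [a] -> Tree a
toTree []     = Zero
toTree [x]    = One x
toTree [x, y] = Two x y
toTree (x:xs) = Branch x (toTree evens) (toTree odds)
  where
    (evens, odds) = split xs
    split (a:b:rest) = let (es, os) = split rest in (a:es, b:os)
    split as         = (as, [])

-- Term-level inverse of ToTree: interleave the two subtrees back
-- into the original element order.
treeToList :: Tree a -> [a]
treeToList Zero           = []
treeToList (One x)        = [x]
treeToList (Two x y)      = [x, y]
treeToList (Branch x l r) = x : interleave (treeToList l) (treeToList r)
  where
    interleave (a:as) bs = a : interleave bs as
    interleave []     bs = bs
```

For example, treeToList (toTree [x0 .. x5]) yields [x0 .. x5] back, matching the requirement that treeToList reconstructs the order of the list.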

The instances for ReflectTree are easy. For example, here is the instance for One:

instance ReflectOne f x => ReflectTree f ('One x) where
  reflectTree _ = One (reflectOne (Proxy @x))

The other instances for ReflectTree follow the same structure (the full source code can be found in the large-records test suite). It remains only to define a ReflectOne instance for our current specific use case:

instance c a => ReflectOne (Dict c) (a :: Type) where
  reflectOne _ = unsafeCoerce (Dict :: Dict c a)

With this definition, a constraint ReflectTree (Dict c) (ToTree xs) for a specific list of xs will result in a c x constraint for every x in xs, as expected. We are now ready to give a partial implementation of the Generic class:

hlistDict :: forall c (xs :: [Type]).
     ReflectTree (Dict c) (ToTree xs)
  => Proxy xs -> [Dict c Any]
hlistDict _ = treeToList $ reflectTree (Proxy @(ToTree xs))

class    ReflectTree (Dict c) (ToTree xs) => Constraints_HList xs c
instance ReflectTree (Dict c) (ToTree xs) => Constraints_HList xs c

instance Generic (HList xs) where
  type Constraints (HList xs) = Constraints_HList xs
  dict _ = Rep.unsafeFromListAny $ hlistDict (Proxy @xs)

The other problem that we need to solve is that we need to construct field metadata for every field in the record. Our (over)simplified “anonymous record” representation does not have field names, so we need to make them up from the type names. Assuming some type family

type family ShowType (a :: Type) :: Symbol

we can construct this metadata in much the same way that we constructed the dictionaries:

instance KnownSymbol (ShowType a) => ReflectOne FieldMetadata (a :: Type) where
  reflectOne _ = FieldMetadata (Proxy @(ShowType a)) FieldLazy

instance ReflectTree FieldMetadata (ToTree xs) => Generic (HList xs) where
  metadata _ = Metadata {
        recordName          = "Record"
      , recordConstructor   = "MkRecord"
      , recordSize          = length fields
      , recordFieldMetadata = Rep.unsafeFromListAny fields
      }
    where
      fields :: [FieldMetadata Any]
      fields = treeToList $ reflectTree (Proxy @(ToTree xs))

The graph below compares the size of the generated core between a straightforward integration with generics-sop and two integrations with large-records, one in which the ReflectTree parameter has its inferred nominal role, and one with the phantom role override:

Both the generics-sop integration and the nominal large-records integration are O(n²). The constant factors are worse for generics-sop, however, because it suffers from both the problems described in part 1 of this blog as well as the problems described in this part 2. The nominal large-records integration only suffers from the problems described in part 2, which are avoided by the O(n log n) phantom integration.

TL;DR: Advice

We considered a lot of deeply technical detail in this blog post, but I think it can be condensed into two simple rules. To reduce the size of the generated core code:

  1. Use instance induction only with balanced data structures.
  2. Use expensive type families only in phantom contexts.

where “phantom context” is short-hand for “as an argument to a datatype, where that argument has a phantom role”. The table below summarizes the examples we considered in this blog post.

Don’t                                    Do
instance C xs =>                         instance (C l, C r) =>
  instance C (x ': xs)                     instance C ('Branch l r)
foo @(F xs) Proxy                        foo (Proxy @(F xs))
type role Cls nominal                    type role Cls phantom
                                           (Use with caution: requires IncoherentInstances)
type instance F a = Expensive a          type instance F a = ClassAlias a
                                           (for F a :: Constraint)


This work was done in the context of trying to improve the compilation time of Juspay’s code base. When we first started analysing why Juspay was suffering from such long compilation times, we realized that a large part of the problem was due to the large core size generated by ghc when using large records: quadratic in the number of record fields. We therefore developed the large-records library, which offers support for records that are guaranteed to result in core code linear in the size of the record. A first integration attempt showed that this improved compilation time by roughly 30%, although this can probably be tweaked a little further.

As explained in MonadFix’s blog post Transpiling a large PureScript codebase into Haskell, part 2: Records are trouble, however, some of the records in the code base are anonymous records, and for these the large-records library is not (directly) applicable, and instead, MonadFix is using their own library jrec.

An analysis of the integration with large-records revealed, however, that these large anonymous records were causing problems similar to those the named records did. Amongst the 13 most expensive modules (in terms of compilation time), 5 modules suffered from huge coercions, with less extreme examples elsewhere in the codebase. In one particularly bad example, one function contained coercion terms totalling nearly 2,000,000 AST nodes! (The other problem that stood out was due to large enumerations, not something I’ve thought much about yet.)

We have certainly not resolved all sources of quadratic core code for anonymous records in this blog post, nor have we attempted to integrate these ideas in jrec. However, the generic instance for anonymous records (using generics-sop generics) was particularly troublesome, and the ideas outlined above should at least solve that problem.

In general the problem of avoiding generating superlinear core is difficult and multifaceted.

Fortunately, ghc gives us just enough leeway that if we are very careful we can avoid the problem. That’s not a solution, obviously, but at least there are temporary work-arounds we can use.

Postscript: Pre-evaluating type families

If the evaluation of type families at compile time leads to large proofs, one for every step in the evaluation, perhaps we can improve matters by pre-evaluating these type families. For example, we could define ToTree as

type family ToTree (xs :: [Type]) :: Tree Type where
  ToTree '[]                       = 'Zero
  ToTree '[x0]                     = 'One x0
  ToTree '[x0, x1]                 = 'Two x0 x1
  ToTree '[x0, x1, x2]             = 'Branch x0 ('One x1)    ('One x2)
  ToTree '[x0, x1, x2, x3]         = 'Branch x0 ('Two x1 x3) ('One x2)
  ToTree '[x0, x1, x2, x3, x4]     = 'Branch x0 ('Two x1 x3) ('Two x2 x4)
  ToTree '[x0, x1, x2, x3, x4, x5] = 'Branch x0 ('Branch x1 ('One x3) ('One x5)) ('Two x2 x4)

perhaps (or perhaps not) with a final case that defaults to regular evaluation. Now ToTree xs can evaluate in a single step for any of the pre-computed cases. Somewhat ironically, in this case we’re better off without phantom contexts:

The universal coercion is now larger than the regular proof, because it records the full right-hand side of the type family:

Univ(phantom phantom <Tree *>
  :: 'Branch
        (T 0)
        ('Branch (T 1) ('Two (T 3) (T 7)) ('Two (T 5) (T 9)))
        ('Branch (T 2) ('Two (T 4) (T 8)) ('One (T 6)))
   , ToTree '[T 0, T 1, T 2, T 3, T 4, T 5, T 6, T 7, T 8, T 9])

The regular proof instead only records the rule that we’re applying:

D:R:ToTree[10] <T 0> <T 1> <T 2> <T 3> <T 4> <T 5> <T 6> <T 7> <T 8> <T 9>

Hence, the universal coercion is O(n log n), whereas the regular proof is O(n). Incidentally, this is also the reason why ghc doesn’t just always use universal coercions; in some cases the universal coercion can be significantly larger than the regular proof (the Rep type family of GHC.Generics being a typical example). The difference between O(n) and O(n log n) is of course significantly less important than the difference between O(n log n) and O(n²), especially since we anyway still have other O(n log n) factors, so if we can precompute type families like this, perhaps we can be less careful about roles.

However, this is only a viable option for certain type families. A full-blown anonymous records library will probably need to do quite a bit of type-level computation on the indices of the record. For example, the classical extensible records theory by Gaster and Jones depends crucially on a predicate lacks that checks that a particular field is not already present in a record. We might define this as

type family Lacks (x :: k) (xs :: [k]) :: Bool where
  Lacks x '[]       = 'True
  Lacks x (x ': xs) = 'False
  Lacks x (y ': xs) = Lacks x xs
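For intuition, the term-level function this family mirrors is just a linear scan. A minimal sketch, where the Eq constraint stands in for type-level equality of field names:

```haskell
-- Term-level analogue of the Lacks type family: True iff x does not
-- occur in xs. Eq plays the role of type-level equality here.
lacks :: Eq a => a -> [a] -> Bool
lacks _ []     = True
lacks x (y:ys) = x /= y && lacks x ys
```

Just like the type family, this takes O(n) steps on a list of length n, one per element inspected.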

It’s not clear to me how to pre-compute this type family in such a way that the right-hand side can evaluate in O(1) steps, without the type family needing O(n²) cases. We might attempt

type family Lacks (x :: k) (xs :: [k]) :: Bool where
  -- 0
  Lacks x '[] = 'True

  -- 1
  Lacks x '[x] = 'False
  Lacks x '[y] = 'True

  -- 2
  Lacks x '[x  , y2] = 'False
  Lacks x '[y1 , x ] = 'False
  Lacks x '[y1 , y2] = 'True

  -- 3
  Lacks x '[x,  y2 , y3] = 'False
  Lacks x '[y1, x  , y3] = 'False
  Lacks x '[y1, y2 , x ] = 'False
  Lacks x '[y1, y2 , y3] = 'True

  -- etc

but even if we only supported records with up to 200 fields, this would require over 20,000 cases. It’s tempting to try something like

type family Lacks (x :: k) (xs :: [k]) :: Bool where
  Lacks x '[]           = 'True
  Lacks x '[y0]         = (x != y0)
  Lacks x '[y0, y1]     = (x != y0) && (x != y1)
  Lacks x '[y0, y1, y2] = (x != y0) && (x != y1) && (x != y2)

but that is not a solution (even supposing we had such an != operator): the right-hand side still has O(n) operations, and so this would still result in proofs of O(n) size, just like the non-pre-evaluated Lacks definition. Type families such as Sort would be more difficult still. Thus, although precomputation can in certain cases help avoid large proofs, and it’s a useful technique to have in the arsenal, I don’t think it’s a general solution.


  1. The introduction of the Tree datatype is not strictly necessary. We could instead work exclusively with lists, and use Evens/Odds directly to split the list in two at each step. The introduction of Tree however makes this structure more obvious, and as a bonus leads to smaller core, although the difference is a constant factor only.↩

  2. This Tree data type is somewhat non-standard: we have values both at the leaves and in the branches, and we have leaves with zero, one or two values. Having values in both branches and leaves reduces the number of tree nodes by one half (leading to smaller core), which is an important enough improvement to warrant the minor additional complexity. The Two constructor only reduces the size of the tree by roughly a further 4%, so not really worth it in general, but it’s useful for the sake of this blog post, as it keeps examples of trees a bit more manageable.↩

  3. Specifically, I made the following simplifications to the actual coercion proof:
    1. Drop the use of Sym, replacing Sym (a ; .. ; c) by (Sym c ; .. ; Sym a) and Sym c by simply c for atomic axioms c.
    2. Appeal to transitivity to replace a ; (b ; c) by a ; b ; c
    3. Drop all explicit use of congruence, replacing T c by simply c, whenever T has a single argument.
    4. As a final step, reverse the order of the proof; in this case at least, ghc seemed to reason backwards from the result of the type family application, but it’s more intuitive to reason forwards instead.↩

  4. See Master Theorem, case 3: Work to split/recombine a problem dominates subproblems.↩

  5. Roles were first described in Generative Type Abstraction and Type-level Computation. Phantom roles were introduced in Safe zero-cost coercions for Haskell.↩

  6. Safe zero-cost coercions for Haskell does not talk about this in great detail, but does mention it. See Fig 5, “Formation rules for coercions”, rule Co_Phantom, as well as the (extremely brief) section 4.2.2., “Phantom equality relates all types”. The paper does not use the terminology “universal coercion”.↩

  7. Confusingly, the pretty-printer uses a special syntax for a universal unsafe coercion, using UnsafeCo instead of Univ.↩

  8. Safe zero-cost coercions for Haskell, section 4.2.1: Nominal equality implies representational equality.↩

by edsko, adam at October 20, 2021 12:00 AM

Lysxia's blog

Initial and final encodings of free monads

Free monads are often introduced as an algebraic data type, an initial encoding:

data Free f a = Pure a | Free (f (Free f a))

Thanks to that, the term “free monads” tends to be confused with that encoding, even though “free monads” originally refers to a representation-independent idea. Dually, there is a final encoding of free monads:

type Free' f a = (forall m. MonadFree f m => m a)

where MonadFree is the following class:

class Monad m => MonadFree f m where
  free :: f (m a) -> m a

The two types Free and Free' are isomorphic. An explanation a posteriori is that free monads are unique up to isomorphism. In this post, we will prove that they are isomorphic more directly,1 in Coq.

In other words, there are two functions:

fromFree' :: Free' f a -> Free f a
toFree' :: Free f a -> Free' f a

such that, for all u :: Free f a,

fromFree' (toFree' u) = u  -- easy

and for all u :: Free' f a,

toFree' (fromFree' u) = u  -- hard

(Also, these functions are monad morphisms.)
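In Haskell, the two directions can be sketched as follows. This is only a sketch, not the code of any particular library: the Monad instance for Free and the MonadFree instance for Free f are spelled out because the roundtrip relies on them.

```haskell
{-# LANGUAGE RankNTypes, MultiParamTypeClasses, FlexibleInstances #-}

-- Initial encoding of the free monad over a functor f.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free h) = Free (fmap (fmap g) h)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> u = fmap g u
  Free h <*> u = Free (fmap (<*> u) h)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Free h >>= k = Free (fmap (>>= k) h)

class Monad m => MonadFree f m where
  free :: f (m a) -> m a

instance Functor f => MonadFree f (Free f) where
  free = Free

-- Final encoding.
type Free' f a = forall m. MonadFree f m => m a

-- The easy direction: specialize the polymorphic value at m ~ Free f.
fromFree' :: Functor f => Free' f a -> Free f a
fromFree' u = u

-- The other direction: fold the initial encoding into any MonadFree.
toFree' :: Functor f => Free f a -> Free' f a
toFree' (Pure a) = return a
toFree' (Free h) = free (fmap toFree' h)
```

With these definitions the first equation, fromFree' (toFree' u) = u, can be checked by structural induction on u; the second is the one that needs parametricity.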

The second equation is hard to prove because it relies on a subtle fact about polymorphism. If you have a polymorphic function forall m ..., it can only interact with m via operations provided as parameters—in the MonadFree dictionary. The equation breaks down if you can perform some kind of case analysis on types, such as isinstanceof in certain languages. This idea is subtle because, how do you turn this negative property “does not use isinstanceof” into a positive, useful fact about the functions of a language?

Parametricity is the name given to such properties. You can get a good intuition for it with some practice. For example, most people can convince themselves that forall a. a -> a is only inhabited by the identity function. But formalizing it so you can validate your intuition is a more mysterious art.

Proof sketch

First, unfolding some definitions, the equation we want to prove will simplify to the following:

foldFree (u @(Free f)) = u @m

where u :: forall m. MonadFree f m => m a is specialized at Free f on the left, at an arbitrary m on the right, and foldFree :: Free f a -> m a is a certain function we do not need to look into for now.

The main idea is that those different specializations of u are related by a parametricity theorem (aka. free theorem).

For all monads m1, m2 that are instances of MonadFree f, and for any relation r between m1 and m2, if r satisfies $CERTAIN_CONDITIONS, then r relates u @m1 and u @m2.

In this case, we will let r relate u1 :: Free f a and u2 :: m a when:

foldFree u1 = u2

As it turns out, r will satisfy $CERTAIN_CONDITIONS, so that the parametricity theorem above applies. This yields exactly the desired conclusion:

foldFree (u @(Free f)) = u @m

It is going to be a gnarly exposition of definitions before we can even get to the proof, and the only reason I can think of to stick around is morbid curiosity. But I had the proof and I wanted to do something with it.2

Formalization in Coq

Imports and setting options
From Coq Require Import Morphisms.

Set Implicit Arguments.
Set Contextual Implicit.

Initial free monads

Right off the bat, the first hurdle is that we cannot actually write the initial Free in Coq. To guarantee that all functions terminate and to prevent logical inconsistencies, Coq imposes restrictions on what recursive types can be defined. Indeed, Free could be used to construct an infinite loop by instantiating it with a contravariant functor f. The following snippet shows how we can inhabit the empty type Void, using only non-recursive definitions, so it’s fair to put the blame on Free:

newtype Cofun b a = Cofun (a -> b)

omicron :: Free (Cofun Void) Void -> Void
omicron (Pure y) = y
omicron (Free (Cofun z)) = z (Free (Cofun z))

omega :: Void
omega = omicron (Free (Cofun omicron))

To bypass that issue, we can tweak the definition of Free into what you might know as the freer monad, or the operational monad. The key difference is that the recursive occurrence of Free f a is no longer under an abstract f, but a concrete (->) instead.

Inductive Free (f : Type -> Type) (a : Type) : Type :=
| Pure : a -> Free f a
| Bind : forall e, f e -> (e -> Free f a) -> Free f a.

Digression on containers

With that definition, it is no longer necessary for f to be a functor—it’s even undesirable because of size issues. Instead, f should rather be thought of as a type of “shapes”, containing “positions” of type e, and that induces a functor by assigning values to those positions (via the function e -> Free f a here); such an f is also known as a “container”.

For example, the Maybe functor consists of two “shapes”: Nothing, with no positions (indexed by Void), and Just, with one position (indexed by ()). Those shapes are defined by the following GADT, the Maybe container:

data PreMaybe _ where
  Nothing_ :: PreMaybe Void
  Just_ :: PreMaybe ()

A container extends into a functor, using a construction that some call Coyoneda:

data Maybe' a where
  MkMaybe' :: forall a e. PreMaybe e -> (e -> a) -> Maybe' a

data Coyoneda f a where
  Coyoneda :: forall f a e. f e -> (e -> a) -> Coyoneda f a

Freer f a (where Freer is called Free here in Coq) coincides with Free (Coyoneda f) a (for the original definition of Free at the top). If f is already a functor, then it is observationally equivalent to Coyoneda f.
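For reference, a Haskell rendering of that freer/operational type might look like the following sketch (the actual freer and operational packages use different names; the Ask effect and runAsk interpreter below are made up for illustration):

```haskell
{-# LANGUAGE GADTs #-}

-- A sketch of the freer ("operational") monad: the recursive
-- occurrence sits under a concrete (->), so f need not be a Functor.
data Freer f a where
  Pure :: a -> Freer f a
  Bind :: f e -> (e -> Freer f a) -> Freer f a

instance Functor (Freer f) where
  fmap g (Pure a)   = Pure (g a)
  fmap g (Bind e k) = Bind e (fmap g . k)

instance Applicative (Freer f) where
  pure = Pure
  Pure g   <*> u = fmap g u
  Bind e k <*> u = Bind e (\x -> k x <*> u)

instance Monad (Freer f) where
  Pure a   >>= h = h a
  Bind e k >>= h = Bind e (\x -> k x >>= h)

-- A tiny hypothetical effect: ask for an Int.
data Ask e where
  Ask :: Ask Int

-- Interpret an Ask computation by supplying the same Int every time.
runAsk :: Int -> Freer Ask a -> a
runAsk _ (Pure a)     = a
runAsk n (Bind Ask k) = runAsk n (k n)
```

Note that Ask is a container in the sense above: the GADT describes a single shape whose positions are indexed by Int, and the continuation e -> Freer f a supplies the functorial part.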

Monad and MonadFree

The Monad class hides no surprises. For simplicity we skip the Functor and Applicative classes. Like in C, return is a keyword in Coq, so we have to settle for another name.

Class Monad (m : Type -> Type) : Type :=
  { pure : forall {a}, a -> m a
  ; bind : forall {a b}, m a -> (a -> m b) -> m b
  }.
(* The braces after `forall` make the arguments implicit. *)

Our MonadFree class below differs from the Haskell one because of the switch from functors to containers (see previous section). In the original MonadFree, the method free takes an argument of type f (m a), where the idea is to “interpret” the outer layer f, and “carry on” with a continuation m a. Containers encode that outer layer without the continuation.3

Class MonadFree {f m : Type -> Type} `{Monad m} : Type :=
  { free : forall {x}, f x -> m x }.

(* Some more implicit arguments nonsense. *)
Arguments MonadFree f m {_}.

Here comes the final encoding of free monads. The resemblance to the Haskell code above should be apparent in spite of some funny syntax.

Definition Free' (f : Type -> Type) (a : Type) : Type :=
  forall m `(MonadFree f m), m a.

Type classes in Coq are simply types with some extra type inference rules to infer dictionaries. Thus, the definition of Free' actually desugars to a function type forall m, Monad m -> MonadFree f m -> m a. A value u : Free' f a is a function whose arguments are a type constructor m, followed by two dictionaries of the Monad and MonadFree classes. We specialize u to a monad m by writing u m _ _, applying u to the type constructor m and two holes (underscores) for the dictionaries, whose contents will be inferred via type class resolution. See for example fromFree' below.

While we’re at it, we can define the instances of Monad and MonadFree for the initial encoding Free.

Fixpoint bindFree {f a b} (u : Free f a) (k : a -> Free f b) : Free f b :=
  match u with
  | Pure a => k a
  | Bind e h => Bind e (fun x => bindFree (h x) k)
  end.

Instance Monad_Free f : Monad (Free f) :=
  {| pure := @Pure f
  ;  bind := @bindFree f
  |}.

Instance MonadFree_Free f : MonadFree f (Free f) :=
  {| free A e := Bind e (fun a => Pure a)
  |}.

Interpretation of free monads

To show that those monads are equivalent, we must exhibit a mapping going both ways.

The easy direction is from the final Free' to the initial Free: with the above instances of Monad and MonadFree, just monomorphize the polymorph.

Definition fromFree' {f a} : Free' f a -> Free f a :=
  fun u => u (Free f) _ _.

The other direction is obtained via a fold of Free f, which allows us to interpret it in any instance of MonadFree f: replace Bind with bind, interpret the first operand with free, and recurse in the second operand.

Fixpoint foldFree {f m a} `{MonadFree f m} (u : Free f a) : m a :=
  match u with
  | Pure a => pure a
  | Bind e k => bind (free e) (fun x => foldFree (k x))
  end.

Definition toFree' {f a} : Free f a -> Free' f a :=
  fun u M _ _ => foldFree u.


In everyday mathematics, equality is a self-evident notion that we take for granted. But if you want to minimize your logical foundations, you do not need equality as a primitive. Equations are just equivalences, where the equivalence relation is kept implicit.

Who even decides what the rules for reasoning about equality are anyway? You decide, by picking the underlying equivalence relation. 4

Here is a class for equality. It is similar to Eq in Haskell, but it is propositional (a -> a -> Prop) rather than boolean (a -> a -> Bool), meaning that equality doesn’t have to be decidable.

Class PropEq (a : Type) : Type :=
  propeq : a -> a -> Prop.

Notation "x = y" := (propeq x y) : type_scope.

For example, for inductive types, a common equivalence can be defined as another inductive type which equates constructors and their fields recursively. Here it is for Free:

Inductive eq_Free f a : PropEq (Free f a) :=
| eq_Free_Pure x : eq_Free (Pure x) (Pure x)
| eq_Free_Bind p (e : f p) k1 k2
  : (forall x, eq_Free (k1 x) (k2 x)) ->
    eq_Free (Bind e k1) (Bind e k2).

(* Register it as an instance of PropEq *)
Existing Instance eq_Free.

Having defined equality for Free, we can state and prove one half of the isomorphism between Free and Free'.

Theorem to_from f a (u : Free f a)
  : fromFree' (toFree' u) = u.

The proof is straightforward by induction, case analysis (which is performed as part of induction), and simplification.

Proof.
  induction u. all: cbn. all: constructor; auto.
Qed.

Equality on final encodings, naive attempts

To state the other half of the isomorphism (toFree' (fromFree' u) = u), it is less obvious what the right equivalence relation on Free' should be. When are two polymorphic values u1, u2 : forall m `(MonadFree f m), m a equal? A fair starting point is that all of their specializations must be equal. “Equality” requires an instance of PropEq, which must be introduced as an extra parameter.

(* u1 and u2 are "equal" when all of their specializations
   (u1 m _ _) and (u2 m _ _) are equal. *)
Definition eq_Free'_very_naive f a (u1 u2 : Free' f a) : Prop :=
  forall m `(MonadFree f m) `(forall x, PropEq (m x)),
    u1 m _ _ = u2 m _ _.

That definition is flagrantly inadequate: so far, a PropEq instance can be any relation, including the empty relation (which never holds), and the Monad instance (as a superclass of MonadFree) might be unlawful. In our desired theorem, toFree' (fromFree' u) = u, the two sides use a priori different combinations of bind and pure, so we expect to rely on laws to be able to rewrite one side into the other.

In programming, we aren’t used to proving that implementations satisfy their laws, so there is always the possibility that a Monad instance is unlawful. In math, the laws are in the definitions; if something doesn’t satisfy the monad laws, it’s not a monad. Let’s irk some mathematicians and say that a lawful monad is a monad that satisfies the monad laws. Thus we will have one Monad class for the operations only, and one LawfulMonad class for the laws they should satisfy. Separating code and proofs that way helps to organize things. Code is often much simpler than the proofs about it, since the latter necessarily involves dependent types.

Class LawfulMonad {m} `{Monad m} `{forall a, PropEq (m a)} : Prop :=
  { Equivalence_LawfulMonad :> forall a, Equivalence (propeq (a := m a))
  ; propeq_bind : forall a b (u u' : m a) (k k' : a -> m b),
      u = u' -> (forall x, k x = k' x) -> bind u k = bind u' k'
  ; bind_pure : forall a (u : m a),
      bind u (pure (a := a)) = u
  ; pure_bind : forall a b (x : a) (k : a -> m b),
      bind (pure x) k = k x
  ; bind_bind : forall a b c (u : m a) (k : a -> m b) (h : b -> m c),
      bind (bind u k) h = bind u (fun x => bind (k x) h)
  }.

The three monad laws should be familiar (bind_pure, pure_bind, bind_bind). In those equations, “=” denotes a particular equivalence relation, which is now a parameter/superclass of the class. Once you give up on equality as a primitive notion, algebraic structures must now carry their own equivalence relations. The requirement that it is an equivalence relation also becomes an explicit law (Equivalence_LawfulMonad), and we expect that operations (in this case, bind) preserve the equivalence (propeq_bind). Practically speaking, that last fact allows us to rewrite subexpressions locally, otherwise we could only apply the monad laws at the root of an expression.

A less naive equivalence on Free' is thus to restrict the quantification to lawful instances:

Definition eq_Free'_naive f a (u1 u2 : Free' f a) : Prop :=
  forall m `(MonadFree f m) `(forall x, PropEq (m x)) `(!LawfulMonad (m := m)),
    u1 m _ _ = u2 m _ _.

That is a quite reasonable definition of equivalence for Free'. In other circumstances, it could have been useful. Unfortunately, it is too strong here: we cannot prove the equation toFree' (fromFree' u) = u with that interpretation of =. Or at least I couldn’t figure out a solution. We will need more assumptions to be able to apply the parametricity theorem of the type Free'. To get there, we must formalize Reynolds’ relational interpretation of types.

Types as relations

The core technical idea in Reynolds’ take on parametricity is to interpret a type t as a relation Rt : t -> t -> Prop. Then, the parametricity theorem is that all terms x : t are related to themselves by Rt (Rt x x is true). If t is a polymorphic type, that theorem connects different specializations of a same term x : t, and that allows us to formalize arguments that rely on “parametricity” as a vague idea.

For example, if t = (forall a, a -> a), then Rt is the following relation, which says that two functions f and f' are related if for any relation Ra (on any types), f and f' send related inputs (Ra x x') to related outputs (Ra (f a x) (f' a' x')).

Rt f f' =
  forall a a' (Ra : a -> a' -> Prop),
  forall x x', Ra x x' -> Ra (f a x) (f' a' x')

If we set Ra x x' to mean “x equals an arbitrary constant z0” (ignoring x', i.e., treating Ra as a unary relation), the above relation Rt amounts to saying that f z0 = z0, from which we deduce that f must be the identity function.

The fact that Rt is a relation is not particularly meaningful to the parametricity theorem, where terms are simply related to themselves, but it is a feature of the construction of Rt: the relation for a composite type t1 -> t2 combines the relations for the components t1 and t2, and we could not get the same result with only unary predicates throughout.5 More formally, we define a relation R[t] by induction on t, between the types t and t', where t' is the result of renaming all variables x to x' in t (including binders). The two most interesting cases are:

  • t starts with a quantifier t = forall a, _, for a type variable a. Then the relation R[forall a, _] between the polymorphic f and f' takes two arbitrary types a and a' to specialize f and f' with, and a relation Ra : a -> a' -> Prop, and relates f a and f' a' (recursively), using Ra whenever recursion reaches a variable a.

  • t is an arrow t1 -> t2, then R[t1 -> t2] relates functions that send related inputs to related outputs.

In summary:

R[forall a, t](f, f') = forall a a' Ra, R[t](f a)(f' a')
R[a](f, f')           = Ra(f, f')
                        -- Ra should be in scope when a is in scope.
R[t1 -> t2](f, f')    = forall x x', R[t1](x, x') -> R[t2](f x, f' x')

That explanation was completely unhygienic, but refer to Reynolds’ paper or Wadler’s Theorems for free! for more formal details.

For sums (Either/sum) and products ((,)/prod), two values are related if they start with the same constructor, and their fields are related (recursively). This can be deduced from the rules above applied to the Church encodings of sums and products.

Type constructors as relation transformers

While types t : Type are associated to relations Rt : t -> t -> Prop, type constructors m : Type -> Type are associated to relation transformers (functions on relations) Rm : forall a a', (a -> a' -> Prop) -> (m a -> m a' -> Prop). It is usually clear what’s what from the context, so we will often refer to “relation transformers” as just “relations”.

For example, the initial Free f a type gets interpreted to the relation RFree Rf Ra defined as follows. Two values u1 : Free f1 a1 and u2 : Free f2 a2 are related by RFree if either:

  • u1 = Pure x1, u2 = Pure x2, and x1 and x2 are related (by Ra); or
  • u1 = Bind e1 k1, u2 = Bind e2 k2, e1 and e2 are related, and k1 and k2 are related (recursively).

We thus have one rule for each constructor (Pure and Bind) in which we relate each field (Ra x1 x2 in RFree_Pure; Rf _ _ _ y1 y2 and RFree Rf Ra (k1 x1) (k2 x2) in RFree_Bind). Let us also remark that the existential type e in Bind becomes an existential relation Re in RFree_Bind.

Inductive RFree {f₁ f₂ : Type -> Type}
    (Rf : forall a₁ a₂ : Type, (a₁ -> a₂ -> Prop) -> f₁ a₁ -> f₂ a₂ -> Prop)
    {a₁ a₂ : Type} (Ra : a₁ -> a₂ -> Prop) : Free f₁ a₁ -> Free f₂ a₂ -> Prop :=
  | RFree_Pure : forall (x₁ : a₁) (x₂ : a₂),
      Ra x₁ x₂ -> RFree Rf Ra (Pure x₁) (Pure x₂)
  | RFree_Bind : forall (e₁ e₂ : Type) (Re : e₁ -> e₂ -> Prop) (y₁ : f₁ e₁) (y₂ : f₂ e₂),
      Rf e₁ e₂ Re y₁ y₂ ->
      forall (k₁ : e₁ -> Free f₁ a₁) (k₂ : e₂ -> Free f₂ a₂),
      (forall (x₁ : e₁) (x₂ : e₂),
        Re x₁ x₂ -> RFree Rf Ra (k₁ x₁) (k₂ x₂)) ->
      RFree Rf Ra (Bind y₁ k₁) (Bind y₂ k₂).

Inductive relations such as RFree, indexed by types with existential quantifications such as Free, are a little terrible to work with out-of-the-box—especially if you’re allergic to UIP. Little “inversion lemmas” like the following make them a bit nicer by reexpressing those relations in terms of some standard building blocks which leave less of a mess when decomposed.

Definition inv_RFree {f₁ f₂} Rf {a₁ a₂} Ra (u₁ : Free f₁ a₁) (u₂ : Free f₂ a₂)
  : RFree Rf Ra u₁ u₂ ->
    match u₁, u₂ return Prop with
    | Pure a₁, Pure a₂ => Ra a₁ a₂
    | Bind y₁ k₁, Bind y₂ k₂ =>
      exists Re, Rf _ _ Re y₁ y₂ /\
        (forall x₁ x₂, Re x₁ x₂ -> RFree Rf Ra (k₁ x₁) (k₂ x₂))
    | _, _ => False
    end.
Proof.
  intros []; eauto.
Qed.

Type classes, which are (record) types, also get interpreted in the same way. Since Monad is parameterized by a type constructor m, the relation RMonad between Monad instances is parameterized by a relation between two type constructors m1 and m2. Two instances of Monad, i.e., two values of type Monad m for some m, are related if their respective fields, i.e., pure and bind, are related. pure and bind are functions, so two instances are related when they send related inputs to related outputs.

Record RMonad (m₁ m₂ : Type -> Type)
    (Rm : forall a₁ a₂ : Type, (a₁ -> a₂ -> Prop) -> m₁ a₁ -> m₂ a₂ -> Prop)
    `{Monad m₁} `{Monad m₂} : Prop :=
  { RMonad_pure : forall (t₁ t₂ : Type) (Rt : t₁ -> t₂ -> Prop) (x₁ : t₁) (x₂ : t₂),
      Rt x₁ x₂ -> Rm t₁ t₂ Rt (pure x₁) (pure x₂)
  ; RMonad_bind : forall (t₁ t₂ : Type) (Rt : t₁ -> t₂ -> Prop) 
      (u₁ u₂ : Type) (Ru : u₁ -> u₂ -> Prop) (x₁ : m₁ t₁) (x₂ : m₂ t₂),
      Rm t₁ t₂ Rt x₁ x₂ ->
      forall (k₁ : t₁ -> m₁ u₁) (k₂ : t₂ -> m₂ u₂),
      (forall (x₁ : t₁) (x₂ : t₂),
         Rt x₁ x₂ -> Rm u₁ u₂ Ru (k₁ x₁) (k₂ x₂)) ->
      Rm u₁ u₂ Ru (bind x₁ k₁) (bind x₂ k₂)
  }.

MonadFree also gets translated to a relation RMonadFree. Related inputs, related outputs.

Record RMonadFree (f₁ f₂ : Type -> Type)
    (Rf : forall a₁ a₂ : Type, (a₁ -> a₂ -> Prop) -> f₁ a₁ -> f₂ a₂ -> Prop)
    (m₁ m₂ : Type -> Type)
    (Rm : forall a₁ a₂ : Type, (a₁ -> a₂ -> Prop) -> m₁ a₁ -> m₂ a₂ -> Prop)
    `{MonadFree f₁ m₁} `{MonadFree f₂ m₂} : Prop :=
  { RMonadFree_free : forall (a₁ a₂ : Type) (Ra : a₁ -> a₂ -> Prop) (x₁ : f₁ a₁) (x₂ : f₂ a₂),
      Rf a₁ a₂ Ra x₁ x₂ -> Rm a₁ a₂ Ra (free x₁) (free x₂)
  }.

Note that RMonad and RMonadFree are “relation transformer transformers”, since they take relation transformers as arguments, to produce a relation between class dictionaries.

We can now finally translate the final Free' to a relation. Two values u1 : Free' f1 a1 and u2 : Free' f2 a2 are related if, for any two monads m1 and m2, with a relation transformer Rm, whose Monad and MonadFree instances are related by RMonad and RMonadFree, Rm relates u1 m1 _ _ and u2 m2 _ _.

Definition RFree' {f₁ f₂} Rf {a₁ a₂} Ra (u₁ : Free' f₁ a₁) (u₂ : Free' f₂ a₂) : Prop :=
  forall m₁ m₂ `(MonadFree f₁ m₁) `(MonadFree f₂ m₂) Rm
    (pm : RMonad Rm) (pf : RMonadFree Rf Rm),
    Rm _ _ Ra (u₁ m₁ _ _) (u₂ m₂ _ _).

The above translation of types into relations can be automated by a tool such as paramcoq. However paramcoq currently constructs relations in Type instead of Prop, which got me stuck in universe inconsistencies. That’s why I’m declaring Prop relations the manual way here.

The parametricity theorem says that any u : Free' f a is related to itself by RFree' (for some canonical relations on f and a). It is a theorem about the language Coq which we can’t prove within Coq. Rather than postulate it, we will simply add the required RFree' _ _ u u assumption to our proposition (from_to below). Given a concrete u, it should be straightforward to prove that assumption case-by-case in order to apply that proposition.

These “relation transformers” are a bit of a mouthful to spell out, and they’re usually guessable from the type constructor (f or m), so they deserve a class that’s a higher-order counterpart to PropEq (like Eq1 is to Eq in Haskell).

Class PropEq1 (m : Type -> Type) : Type :=
  propeq1 : forall a₁ a₂, (a₁ -> a₂ -> Prop) -> m a₁ -> m a₂ -> Prop.

Given a PropEq1 m instance, we can apply it to the relation eq to get a plain relation which seems a decent enough default for PropEq (m a).

Instance PropEq_PropEq1 {m} `{PropEq1 m} {a} : PropEq (m a) := propeq1 eq.
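For readers more at home in Haskell, the Eq1 analogy can be made concrete with liftEq from Data.Functor.Classes, which plays the role of propeq1 specialized to Bool-valued relations. This is a small illustrative sketch of mine, not part of the original development:

```haskell
import Data.Functor.Classes (liftEq)

-- liftEq :: Eq1 f => (a -> b -> Bool) -> f a -> f b -> Bool
-- Like propeq1, it lifts a (possibly heterogeneous) relation on elements
-- to a relation on structures; lists are related pointwise.
related :: Bool
related = liftEq (\x y -> x == y + 1) [3, 4 :: Int] [2, 3]

-- Applying it to (==) recovers plain equality, just as the
-- PropEq_PropEq1 instance above applies propeq1 to eq.
sameAsEq :: Bool
sameAsEq = liftEq (==) "abc" "abc"
```

Both definitions evaluate to True: each element of the first list is one more than the corresponding element of the second.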

Really lawful monads

We previously defined a “lawful monad” as a monad with an equivalence relation (PropEq (m a)). To use parametricity, we will also need a monad m to provide a relation transformer (PropEq1 m), which subsumes PropEq with the instance just above.6 This extra structure comes with additional laws, extending our idea of monads to “really lawful monads”.

Class Trans_PropEq1 {m} `{PropEq1 m} : Prop :=
  trans_propeq1 : forall a₁ a₂ (r : a₁ -> a₂ -> Prop) x₁ x₁' x₂ x₂',
    x₁ = x₁' -> propeq1 r x₁' x₂ -> x₂ = x₂' -> propeq1 r x₁ x₂'.

Class ReallyLawfulMonad m `{Monad m} `{PropEq1 m} : Prop :=
  { LawfulMonad_RLM :> LawfulMonad (m := m)
  ; Trans_PropEq1_RLM :> Trans_PropEq1 (m := m)
  ; RMonad_RLM : RMonad (propeq1 (m := m))
  }.

Class ReallyLawfulMonadFree f `{PropEq1 f} m `{MonadFree f m} `{PropEq1 m} : Prop :=
  { ReallyLawfulMonad_RLMF :> ReallyLawfulMonad (m := m)
  ; RMonadFree_RLMF : RMonadFree (propeq1 (m := f)) (propeq1 (m := m))
  }.

We inherit the LawfulMonad laws from before. The relations RMonad and RMonadFree, defined earlier, must relate m’s instances of Monad and MonadFree, for the artificial reason that that’s roughly what RFree' will require. We also add a generalized transitivity law, which allows us to rewrite either side of a heterogeneous relation propeq1 r using the homogeneous one = (which denotes propeq1 eq).

It’s worth noting that there is some redundancy here, that could be avoided with a bit of refactoring. That generalized transitivity law Trans_PropEq1 implies transitivity of =, which is part of the claim that = is an equivalence relation in LawfulMonad. And the bind component of RMonad implies propeq_bind in LawfulMonad, so these RMonad and RMonadFree laws can also be seen as generalizations of congruence laws to heterogeneous relations, making them somewhat less artificial than they may seem at first.

Restricting the definition of equality on the final free monad Free' to quantify only over really lawful monads yields the right notion of equality for our purposes, which is to prove the from_to theorem below, validating the isomorphism between Free and Free'.

Instance eq_Free' f `(PropEq1 f) a : PropEq (Free' f a) :=
  fun u₁ u₂ =>
    forall m `(MonadFree f m) `(PropEq1 m) `(!ReallyLawfulMonadFree (m := m)),
      u₁ m _ _ = u₂ m _ _.

Quickly, let’s get the following lemma out of the way, which says that foldFree commutes with bind. We’re really saying that foldFree is a monad morphism but no time to say it properly. The proof of the next lemma will need this, but it’s also nice to look at this on its own.

Lemma foldFree_bindFree {f m} `{MonadFree f m} `{forall a, PropEq (m a)} `{!LawfulMonad (m := m)}
    {a b} (u : Free f a) (k : a -> Free f b)
  : foldFree (bindFree u k) = bind (foldFree u) (fun x => foldFree (k x)).
Proof.
  induction u; cbn [bindFree foldFree].
  - symmetry. apply pure_bind with (k0 := fun x => foldFree (k x)).
  - etransitivity; [ | symmetry; apply bind_bind ].
    eapply propeq_bind.
    * reflexivity.
    * auto.
Qed.

Finally the proof

Our goal is to prove an equation in terms of eq_Free', which gives us a really lawful monad as an assumption. We open a section to set up the same context as that and to break down the proof into more digestible pieces.


Context {f m} `{MonadFree f m} `{PropEq1 m} `{!ReallyLawfulMonad (m := m)}.

As outlined earlier, parametricity will yield an assumption RFree' _ _ u u, and we will specialize it with a relation R which relates u1 : Free f a and u2 : m a when foldFree u1 = u2. However, RFree' actually expects a relation transformer rather than a relation, so we instead define R to relate u1 : Free f a1 and u2 : Free f a2 when propeq1 Ra (foldFree u1) u2, where Ra is a relation given between a1 and a2.

Let R := (fun a₁ a₂ (Ra : a₁ -> a₂ -> Prop) u₁ u₂ => propeq1 Ra (foldFree u₁) u₂).

The following two lemmas are the “$CERTAIN_CONDITIONS” mentioned earlier, that R must satisfy, i.e., we prove that R, via RMonad (resp. RMonadFree), relates the Monad (resp. MonadFree) instances for Free f and m.

Lemma RMonad_foldFree : RMonad (m₁ := Free f) R.
Proof.
  constructor; intros.
  - cbn. apply RMonad_RLM; auto.
  - unfold R. eapply trans_propeq1.
    + apply foldFree_bindFree.
    + eapply RMonad_RLM; eauto.
    + reflexivity.
Qed.

Context (Rf : PropEq1 f).
Context (RMonadFree_m : RMonadFree propeq1 propeq1).

Lemma RMonadFree_foldFree : RMonadFree Rf R.
Proof.
  constructor; intros.
  - unfold R.
    eapply trans_propeq1.
    + apply bind_pure.
    + apply RMonadFree_m. eassumption.
    + reflexivity.
Qed.


Here comes the conclusion, which completes our claim that toFree'/fromFree' is an isomorphism (we proved the other half to_from on the way here). This equation is under an assumption which parametricity promises to fulfill, but we will have to step out of the system if we want it right now.

Theorem from_to f (Rf : PropEq1 f) a (u : Free' f a)
  : RFree' Rf eq u u ->
    toFree' (fromFree' u) = u.

In the proof, we get the assumption H : RFree' Rf eq u u, which we apply to the above lemmas, RMonad_foldFree and RMonadFree_foldFree, using the specialize tactic. That yields exactly our desired goal.

Proof.
  do 2 red; intros.
  unfold toFree', fromFree'.
  red in H.
  specialize (H (Free f) m _ _ _ _ _ RMonad_foldFree (RMonadFree_foldFree RMonadFree_RLMF)).
  apply H.
Qed.


If you managed to hang on so far, treat yourself to some chocolate.

To formalize a parametricity argument in Coq, I had to move the goalposts quite a bit throughout the experiment:

  • Choose a more termination-friendly encoding of recursive types.
  • Relativize equality.
  • Mess around with heterogeneous relations without crashing into UIP.
  • Reinvent the definition of a monad, again.
  • Come to terms with the externality of parametricity.

It could be interesting to see a “really lawful monad” spelled out fully.

Another similar but simpler exercise is to prove the equivalence between initial and final encodings of lists. It probably wouldn’t involve “relation transformers” as much. There are also at least two different variants: is your final encoding “foldr”- or “fold”-based (the latter mentions monoids, the former doesn’t)?

I hope that machinery can be simplified eventually, but given the technical sophistication that is currently necessary, prudence is advised when navigating around claims made “by parametricity”.

  1. Answering Iceland_Jack’s question on Twitter.↩︎

  2. Also an excuse to integrate Alectryon in my blog.↩︎

  3. That idea is also present in Kiselyov and Ishii’s paper.↩︎

  4. Those who do know Coq will wonder, what about eq (“intensional equality”)? It is a fine default relation for first-order data (nat, pairs, sums, lists, ASTs without HOAS). But it is too strong for computations (functions and coinductive types) and proofs (of Props). Then a common approach is to introduce extensionality axioms, postulating that “extensional equality implies intensional equality”. But you might as well just stop right after proving whatever extensional equality you wanted.↩︎

  5. Well, if you tried you would end up with the unary variant of the parametricity theorem, but it’s much weaker than the binary version shown here. n-ary versions are also possible and even more general, but you have to look hard to find legitimate uses.↩︎

  6. To be honest, that decision was a little arbitrary. But I’m not sure making things more complicated by keeping EqProp1 and EqProp separate buys us very much.↩︎

by Lysxia at October 20, 2021 12:00 AM

October 19, 2021

Chris Smith 2

You’re invited to the October virtual Haskell CoHack

Hi everyone,

This Saturday, I’m once again hosting a virtual Haskell CoHack. In the past, we’ve had a great time with groups here working on various projects, whether it’s learning Haskell, hacking on GHC, writing documentation, making progress on personal projects, or just hanging out to chat with like-minded folk. You should consider coming if you would be excited to meet fellow Haskell programmers and work or learn together with them.

There are details, including times, on the meetup page:

by Chris Smith at October 19, 2021 01:59 AM

October 18, 2021

Haskell Foundation blog

Haskell Foundation September Seven Month Update Extravaganza

by Andrew Boardman

Seven Months!

It is hard to believe Emily Pillmore and I have been running the Haskell Foundation for seven months already. Similar to parenting, it feels like no time has elapsed, but at the same time it went very slowly.

We want to improve and become more effective, so in this monthly update let’s dive into what we’ve done over the last seven months, what we’ve learned, and where we want to go.


Our first challenge was raising funds so we could continue to have a Haskell Foundation, and it certainly took a while before the first check came in under our watch.

Last year, prior to the selection of the Board or the Executive Team, GitHub’s check was the first to clear; they had stepped up to continue funding GHC work that had previously been supported by Microsoft Research. IOHK came a month later with huge support for the Foundation, followed soon by Well-Typed, Mercury, Flipstone, Tweag, and Obsidian Systems. A month after that, in January 2021, EMQ joined that illustrious group of early sponsors.

Fundraising has a long lag time between initial contact and checks clearing, so our next sponsors joined us in June, with Digital Asset joining us at the Monad level, and ExFreight at Applicative. This broke the drought, and we added TripShot in July, HERP in August (both as Functors), and CarbonCloud in September at Applicative.

Welcome to CarbonCloud as our newest Sponsor!

We talked to at least 37 companies at different stages of using Haskell, got tons of wonderful feedback and insight into what they’re doing, what is working, what is not. We converted those conversations into five new sponsors totaling $140,000, have another company that is in the final stage, and two more that are figuring out payment logistics.

Additionally, we were given an in-kind donation by MLabs: 40 hours / month of the amazing Koz Ross’s time to dedicate towards HF projects.

Lessons Learned

Fundraising is an ongoing process, and my primary focus as Executive Director. We are always looking for more companies to talk to and more opportunities to find funding for the Foundation and our initiatives. We are also now talking to existing sponsors about renewals and, where reasonable, increasing their contributions.

We cannot take our foot off the gas and relax: our resources are a fundamental limit to what we can accomplish.

Technical Track

Much of the work we need to do is fundamentally technical in nature, and we have largely been successful. We had some ideas of what we wanted to accomplish and did some deep dives into Backpack and the Windows platform for Haskell early on.

UTF-8 Text

Andrew Lelechenko (aka Bodigrim) had a very focused proposal in mind: switch the internal Text representation from UTF-16 to UTF-8. This had been attempted before, but bikeshedding and arguing had stalled it out.

Bodigrim created his official proposal in May, disabled implicit fusion in Text in June (he found serious performance issues in basic cases while working on the changes), had a PR up for review in August, and merged the PR in September. Amazing and very well received work!


The next steps are PRs to the GHC codebase, and following the changes to downstream dependencies to ensure smooth updates.

Minimal Windows Installer

For a while Haskell support for Windows was a bit… rough. GHC put a lot of effort into fixing that up, but there was (and still is, but less so) spotty support by the tooling surrounding the compiler.

Julian Ospald stepped up to add proper Windows support for ghcup, and after a marathon of work and collaboration with Tamar Christina, Ben Gamari, and others, got it up and running!

  • If you want a maintained system GHC, Tamar’s Chocolatey package takes care of the complexity (but requires Chocolatey).
  • If you prefer to manage your own installation, and want a “system GHC” experience in Windows, you can now use ghcup.
  • For ease of use, managing multiple GHC installations seamlessly, or if you normally use Stack for your projects, Stack takes care of the complexity behind its CLI.

Lessons Learned

We attempted to use the momentum of this project to consolidate the ecosystem on a single Haskell installer, and unfortunately that did not go well. We did learn a lot from the experience, and made changes to how we go about selecting projects, getting community feedback, and how we assist with project management.

Haskell Foundation Tech Proposal Process

Emily Pillmore created a proposal process proposal, with a template to help guide members in the community on what needs to be thought out, decided, and written when proposing Haskell Foundation involvement in technical work.

It makes sure that we’ve thought through many of the issues that gave us problems: making sure the right people are notified, looking at prior art, determining motivation and deliverables, and what resources are needed to enable success.

Haskell Foundation Technical Task Force Elections

The HFTT is a volunteer group that evaluates the proposals. Emily posted a call for applications to join, received excellent results, and selected the new members. Participation in Haskell Foundation volunteer groups by a wide range of community members is crucial for making sure different points of view and perspectives are involved in the decision making, and we encourage everyone to get involved.

Extended Dependency Generation GHC Proposal

There are other proposal processes in our community as well, the most famous being the GHC Proposal Process. HF Board member Hécate Choutri found this gem, particularly given our love and support of the HLS project, and asked the HFTT to rally support for it. We absolutely agreed, and have requested that the GHC team prioritize it (within reason, given how slammed they are with getting releases out the door).

What does that mean? The Haskell Foundation provides support for GHC development (and we’d love to provide more, please donate and sponsor!), so we get some say in how the work is prioritized. We generally leave it to that team and the Steering Committee’s best judgement, but occasionally when we see a priority to help the ecosystem, we let them know what we’d like moved up in the queue.

Dashboard Proposal

Haskell is a very fast language, but occasionally a performance regression sneaks through CI and testing and ends up in production code. Emily worked with Ben Gamari to draft a proposal to take infrastructure the GHC team had in place, provide better UX, and extend it beyond GHC itself to cover important libraries that are a dependency of a large percentage of Haskell projects.

We want to consistently measure the performance of GHC itself, core libraries, fundamental dependencies, and eventually more. This will allow us to find GHC changes that affect library and application performance, address regressions closer to when the change is made, as well as lock in performance improvements. If you have expertise in DevOps, data visualization, and performance analysis, you can make a big difference here!


A hole in our ecosystem had been the lack of consistent, inclusive maintainership of the Cabal projects. Emily stepped up to fill this need, did substantial work to update the code base to modern practices and styles, and found people to help maintain the project. This has led to new releases, new maintainers, and plans for future releases and features.

Core Libraries Committee

Similarly, the CLC had become operationally defunct, and once again Emily stepped in. She created a new way of working process, largely similar to the HFTP and the new process for Cabal, found which existing members were still active and able to perform their duties, and held an election process to find new members. Clear expectations have been set with a focus on communication and consistency.


The fabulous Hécate Choutri does a fantastic job organizing volunteers around radically improving the documentation in our community. They have rallied efforts around the wiki, improvements in Haddocks, and Haskell School, a community sourced learning resource from the ground up. It has eight contributors, 76 commits (as of this writing), and three languages in development.

If you are passionate about bringing more people into our community faster, join in!

Performance Book

Gil Mizrahi’s Haskell Performance Tuning Book proposal has been submitted and is getting feedback. A critical gap in the knowledge of intermediate level Haskellers is the ability to deeply understand the performance of Haskell applications, how to address issues, how to design performant systems, and how to use the variety of tools available to debug issues.

This proposal offers a solution: a community sourced and maintained online book with the cumulative knowledge of Haskell performance experts, so that set of skills can be widely available and accessible. No matter what your level of expertise, you can help make this a reality!


Matchmaker is a project to address the issue of volunteers who have time and expertise but don’t know what projects need the help they can provide, and the maintainers who need that help and can’t find the volunteers. If you’re looking for a way to help out, here is a project that would help you help others in your exact situation!


Haskell Interlude Podcast

Niki Vazou, Joachim Breitner, Andres Löh, Alejandro Serrano, and Wouter Swierstra proposed a long form podcast to interview guests about Haskell related topics. Originally they released a teaser episode, introducing the hosts, followed by Episode 1, an interview with Emily Pillmore. Episode 2 is out today, featuring Lennart Augustsson!

Code of Conduct

We started with the Guidelines for Respectful Communication as a foundation for how we wish interactions within the community to be conducted. We have determined that the GRC is necessary but not complete, and are working with the Rust Foundation to standardize on a Code of Conduct that would ideally be shared between the two communities.

This is meant to augment and enhance the GRC, not replace it. If you would like to be involved in making our community a friendlier, more inclusive place, please join our Slack and let us know in the #community-track channel.


Affiliation at this time involves adopting the GRC, but we’re also discussing HF endorsement of open source projects depending on the level of support the maintainers are willing to sign up for.

As an example, we could have multiple tiers:

* Core
* Security fixes turned around within 24 or 48 hours
* Stability guarantees, support for the last N releases of GHC
* Maintainership line of succession
* Code of Conduct / GRC adoption

Each level would include all the requirements below it, and would give people choosing which libraries and tools to adopt better information about the state of that project and whether they feel comfortable using it in production given their project’s needs.

Current Affiliated Projects and Teams

Haskell IDE Team
GHC Steering Committee
Clash Lang
Haskell Weekly
Core Libraries Committee
Haskell Love
Zurihac Committee
IHP (Integrated Haskell Platform)

Our Foundation Task Force

Chris Smith and Matthias Toepp have created a new task force aiming to help the community feel ownership of the Haskell Foundation. It is Our Foundation, not theirs (or mine). Their first initiatives are increasing the number of individuals donating to the Foundation, and a grant program to steer Foundation funds to the community. Please consider applying to be part of the task force!

Haskell Teachers’ Forum

A very early stage idea to bring together educators teaching Haskell at all levels. We want to share materials, best practices, and ideas. Send me an email if you’re interested in being involved!

State of Haskell Survey

Taylor Fausak has been the steward of the survey since 2017, and we’re discussing how to use HF resources to understand our community better. We would love more and better data about the state of our community, both so we can measure progress, as well as make more informed decisions.


On June 19th Emily gave a Haskell Foundation status update talk at Zurihac.

Haskell Love

My Haskell Love talk was on September 10th, where I talked about how our tooling is the linchpin for unlocking the potential of Haskell to solve real world problems and be the future of software engineering.

Office Hours

On the first Monday of each month, we host Haskell Foundation Office Hours at 16:00 UTC (9am US west coast) on Andrew’s Twitch channel. We have run two so far; the next one will be on October 4th. It is a ton of fun and we talk about a variety of topics, so please bring your questions and feedback!


There are operational details that need to be taken care of, particularly true when we had just started. Benefits, payroll, legal forms, accounting, and so on. Big thanks to the HF Board Treasurer, Ryan Trinkle, for helping us navigate all of this and making sure we get paid, the HF gets paid, and we have all of the legalities settled.


  • 2021–10–05: Added Stack and Stackage to the list of Affiliated projects.

by Haskell Foundation at October 18, 2021 09:32 PM

Brent Yorgey

Competitive programming in Haskell: BFS, part 2 (alternative APIs)

In my last post, I showed how we can solve Modulo Solitaire (and hopefully other BFS problems as well) using a certain API for BFS, which returns two functions: one, level :: v -> Maybe Int, gives the level (i.e. the length of a shortest path to) of each vertex, and the other, parent :: v -> Maybe v, gives the parent of each vertex in the BFS forest. Before showing an implementation, I wanted to talk a bit more about this API and why I chose it.

In particular, Andrey Mokhov left a comment on my previous post with some alternative APIs:

bfsForest :: Ord a => [a] -> AdjacencyMap a -> Forest a
bfs :: Ord a => [a] -> AdjacencyMap a -> [[a]]

Of course, as Andrey notes, AdjacencyMap is actually a reified graph data structure, which we don’t want here, but that’s not essential; presumably the AdjacencyMap arguments in Andrey’s functions could easily be replaced by an implicit graph description instead. (Note that an API requiring an implicit representation is strictly more powerful, since if you have an explicit representation you can always just pass in a function which does lookups into your explicit representation.) However, Andrey raises a good point. Both these APIs return information which is not immediately available from my API.

  • bfsForest returns an actual forest we can traverse, giving the children of each node. My API only returns a parent function which gives the parent of each node. These contain equivalent information, however, and we can convert back and forth efficiently (where by “efficiently” in this context I mean “in O(n log n) time or better”) as long as we have a list of all vertices. To convert from a Forest to a parent function, just traverse the forest and remember all the parent-child pairs we see, building e.g. a Map that can be used for lookup. To convert back, first iterate over the list of all vertices, find the parent of each, and build an inverse mapping from parents to sets of children. If we want to proceed to building an actual Forest data structure, we can unfold one via repeated lookups into our child mapping.

    However, I would argue that in typical applications, having the parent function is more useful than having a Forest. For example, the parent function allows us to efficiently answer common, classic queries such as “Is vertex v reachable from vertex s?” and “What is a shortest path from s to v?” Answering these questions with a Forest would require traversing the entire Forest to look for the target vertex v.

  • bfs returns a list of levels: that is, the first list is the starting vertices, the next list is all vertices one step away from any starting vertex, the next list is all vertices two steps away, and so on. Again, given a list of all vertices, we can recover a list of levels from the level function: just traverse the list of all vertices, looking up the level of each and adding it to an appropriate mapping from levels to sets of vertices. Converting in the other direction is easy as well.

    A level list lets us efficiently answer queries such as “How many vertices are exactly 5 steps away from s?”, whereas with the level function we can efficiently answer queries such as “What is the length of a shortest path from s to v?” In practice, the latter form of query seems more common.
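The conversions sketched in the bullets above can be written in a few lines. This is my own sketch: the names parentToForest and levelLists are hypothetical, and the roots argument stands in for the list of starting vertices (which the parent function alone cannot distinguish from unreachable vertices):

```haskell
import qualified Data.Map as M
import Data.Tree (Forest, unfoldForest)

-- Invert a parent function into a child mapping, then unfold a Forest
-- from the given roots (the starting vertices of the BFS).
parentToForest :: Ord v => (v -> Maybe v) -> [v] -> [v] -> Forest v
parentToForest parent allVs roots = unfoldForest expand roots
  where
    children = M.fromListWith (++)
      [ (p, [v]) | v <- allVs, Just p <- [parent v] ]
    expand v = (v, M.findWithDefault [] v children)

-- Bucket vertices by their level; Data.Map's ascending key order
-- returns the buckets in BFS order (level 0 first).
levelLists :: (v -> Maybe Int) -> [v] -> [[v]]
levelLists level allVs = M.elems $
  M.fromListWith (flip (++)) [ (l, [v]) | v <- allVs, Just l <- [level v] ]
```

Both directions are a single traversal plus Map operations, consistent with the O(n log n) bound mentioned above.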

In the final version of this BFS API, I will probably include some functions to recover forests and level sets as described above. Some benchmarking will be needed to see whether it’s more efficient to recover them after the fact or to actually keep track of them along the way.

by Brent at October 18, 2021 07:48 PM

Monday Morning Haskell

Using IO without the IO Monad!


(This post is also available as a YouTube video!)

In last week's article, I explained what effects really are in the context of Haskell and why Haskell's structures for dealing with effects are really cool and distinguish it from other programming languages.

Essentially, Haskell's type system allows us to set apart areas of our code that might require a certain effect from those that don't. A function within a particular monad can typically use a certain effect. Otherwise, it can't. And we can validate this at compile time.

But there seems to be a problem with this. So many of Haskell's effects sort of fall under the umbrella of the IO monad. Whether that's printing to the terminal, or reading from the file system, using threads and concurrency, connecting over the network, or even creating a new random number generator.

putStrLn :: String -> IO ()
readFile :: FilePath -> IO String
readMVar :: MVar a -> IO a
httpJSON :: (MonadIO m, FromJSON a) => Request -> m (Response a)
getStdGen :: MonadIO m => m StdGen

Now I'm not going to tell you "oh just re-write your program so you don't need as much IO." These activities are essential to many programs. And often, they have to be spread throughout your code.

But the IO monad is essentially limitless in its abilities. If your whole program uses the IO monad, you essentially don't have any of the guarantees that we'd like to have about limiting side effects. If you need any kind of IO, it seems like you have to allow all sorts of IO.

But this doesn't have to be the case. In this article we're going to demonstrate how we can get limited IO effects within our function. That is, we'll write our type signature to allow a couple specific IO actions, without opening the door to all kinds of craziness. Let's see how this works.

An Example Game

Throughout this article we're going to be using this Nim game example I made. You can see all the code in Game.hs.

Our starting point for this article is the instances branch.

The ending point is the monad-class branch.

You can take a look at this pull request to see all the changes we're going to make in this article!

This program is a simple command line game where players are adding numbers to a sum and want to be the one to get to exactly 100. But there are some restrictions. You can't add more than 10, or add a negative number, or add too much to put it over 100. So if we try to do that we get some of these helpful error messages. And then when someone wins, we see who that is.

Our Monad

Now there's not a whole lot of code to this game. There are just a handful of functions, and they mostly live in this GameMonad we created. The "Game Monad" keeps track of the game state (a tuple of the current player and current sum value) using the State monad. Then it also uses the IO monad below that, which we need to receive user input and print all those messages we were seeing.

newtype GameMonad a = GameMonad
  { gameAction :: StateT (Player, Int) IO a
  } deriving (Functor, Applicative, Monad)

We have a couple of instances, MonadState and MonadIO, for our GameMonad to make our code a bit simpler.

instance MonadIO GameMonad where
  liftIO action = GameMonad (lift action)

instance MonadState (Player, Int) GameMonad where
  get = GameMonad get
  put = GameMonad . put

Now the drawback here, as we talked about before, is that all these GameMonad functions can do arbitrary IO. We just do liftIO and suddenly we can go ahead and read a random file if we want.

playGame :: GameMonad Player
playGame = do
  input <- readInput
  validateResult <- validateMove input
  case validateResult of
    Nothing -> playGame
    Just i -> do
      -- Nothing to stop this!
      readResult <- liftIO $ readFile "input.txt"

Making Our Own Class

But we can change this with just a few lines of code. We'll start by creating our own typeclass. This class will be called MonadTerminal. It will have two functions for interacting with the terminal. First, logMessage, which takes a String and returns (). Then getInputLine, which returns a String.

class MonadTerminal m where
  logMessage :: String -> m ()
  getInputLine :: m String

How do we use this class? Well we have to make a concrete instance for it. So let's make an instance for our GameMonad. This will just use liftIO and run normal IO actions like putStrLn and getLine.

instance MonadTerminal GameMonad where
  logMessage = liftIO . putStrLn
  getInputLine = liftIO getLine

Constraining Functions

At this point, we can get rid of the old logMessage function, since the typeclass uses that name now. Next, let's think about the readInput expression.

readInput :: GameMonad String
readInput = liftIO getLine

It uses liftIO and getLine right now. But this is exactly the same definition we used in MonadTerminal. So let's just replace this with the getInputLine class function.

readInput :: GameMonad String
readInput = getInputLine

Now let's observe that this function no longer needs to be in the GameMonad! We can instead use any monad m that satisfies the MonadTerminal constraint. Since the GameMonad does this already, there's no effect on our code!

readInput :: (MonadTerminal m) => m String
readInput = getInputLine

Now we can do the same thing with the other two functions. They call logMessage and readInput, so they require MonadTerminal. And they call get and put on the game state, so they need the MonadState constraint. But after doing that, we can remove GameMonad from the type signatures.

validateMove :: (MonadTerminal m, MonadState (Player, Int) m) => String -> m (Maybe Int)

promptPlayer :: (MonadTerminal m, MonadState (Player, Int) m) => m ()

And now these functions can no longer use arbitrary IO! They're still using the true IO effects we wrote above, but since MonadIO and GameMonad aren't in the type signature, we can't just call liftIO and do a file read.

Of course, the GameMonad itself still has IO on its Monad stack. That's the only way we can make a concrete implementation for our Terminal class that actually does IO!

But the actual functions in our game don't necessarily use the GameMonad anymore! They can use any monad that satisfies these two classes. And it's technically possible to write instances of these classes that don't use IO. So the functions can't use arbitrary IO functionality! This has a few different implications, but it especially gives us more confidence in the limitations of what these functions do, which, as a reminder, is considered a good thing in Haskell! And it also allows us to test them more easily.
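As a sketch of that testing story, here is a hedged example of such a non-IO instance. All the names below besides the MonadTerminal class itself (TestTerminal, runTestTerminal) are hypothetical, not from the game's code: the monad feeds scripted input lines in and collects logged messages out, with no IO anywhere.

```haskell
-- The class from the article, repeated so this sketch is self-contained:
class MonadTerminal m where
  logMessage   :: String -> m ()
  getInputLine :: m String

-- Hypothetical pure test monad: takes the remaining input lines and
-- returns the result, the leftover input, and the messages logged.
newtype TestTerminal a = TestTerminal
  { runTestTerminal :: [String] -> (a, [String], [String]) }

instance Functor TestTerminal where
  fmap f m = TestTerminal $ \inp ->
    let (a, inp', out) = runTestTerminal m inp in (f a, inp', out)

instance Applicative TestTerminal where
  pure a = TestTerminal $ \inp -> (a, inp, [])
  mf <*> ma = TestTerminal $ \inp ->
    let (f, inp1, out1) = runTestTerminal mf inp
        (a, inp2, out2) = runTestTerminal ma inp1
    in (f a, inp2, out1 ++ out2)

instance Monad TestTerminal where
  m >>= k = TestTerminal $ \inp ->
    let (a, inp1, out1) = runTestTerminal m inp
        (b, inp2, out2) = runTestTerminal (k a) inp1
    in (b, inp2, out1 ++ out2)

instance MonadTerminal TestTerminal where
  logMessage s = TestTerminal $ \inp -> ((), inp, [s])
  getInputLine = TestTerminal $ \inp -> case inp of
    (l:ls) -> (l, ls, [])
    []     -> ("", [], [])
```

Any function with a (MonadTerminal m, ...) signature can then be run against TestTerminal in a test suite, with no terminal involved.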

Conclusion: Effectful Haskell

Hopefully you at least think this is a cool idea. But maybe you're thinking "Woah, this is totally game changing!" If you want to learn more about Haskell's effect structures, I have an offer for you!

If you head to this page, you'll learn about our Effectful Haskell course. This course will give you hands-on experience working with the ideas from this article on a small but multi-functional application. The course starts with learning the different layers of Haskell's effect structures, and it ends with launching this application on the internet.

It's really cool, and if you've read this long, I think you'll enjoy it, so take a look! As a bonus, if you subscribe to Monday Morning Haskell, you can get a code for 20% off on this or any of our courses!

by James Bowen at October 18, 2021 02:30 PM

October 16, 2021

Sandy Maguire

Proving Equivalence of Polysemy Interpreters

Let’s talk more about polysemy-check. Last week we looked at how to do property-testing for a polysemy effects’ laws. Today, we’ll investigate how to show that two interpretations are equivalent.

To continue with last week’s example, let’s say we have an effect that corresponds to having a Stack that we can push and pop:

data Stack s m a where
  Push      :: s -> Stack s m ()
  Pop       :: Stack s m (Maybe s)
  RemoveAll :: Stack s m ()
  Size      :: Stack s m Int

deriving instance Show s => Show (Stack s m a)
deriveGenericK ''Stack

makeSem ''Stack

Since we’d like to prove the equivalence of two interpretations, we’ll need to first write two interpretations. But, to illustrate, we’re going to simulate multiple interpreters via a single interpretation, parameterized by which bugs should be present in it.

For the purposes of brevity, we’ll write a single interpretation of Stack s in terms of State [s], and then interpret that in two different ways. In essence, what we’re really testing here is the equivalence of two State interpretations, but it’s good enough for an example.

We’ll start with the bugs:

data Bug
  = PushTwice
  | DontRemove
  deriving stock (Eq, Ord, Show, Enum, Bounded)

instance Arbitrary Bug where
  arbitrary = elements [minBound..maxBound]

hasBug :: [Bug] -> Bug -> Bool
hasBug = flip elem

The PushTwice bug, as you might expect, dispatches a Push command so that it pushes twice onto the stack. The DontRemove bug causes RemoveAll to be a no-op. Armed with our bugs, we can write a little interpreter for Stack that translates Stack s commands into State [s] commands, and then immediately runs the State effect:

runStack
    :: [Bug]
    -> Sem (Stack s ': r) a
    -> Sem r ([s], a)
runStack bugs =
  (runState [] .) $ reinterpret $ \case
    Push s -> do
      modify (s :)
      when (hasBug bugs PushTwice) $
        modify (s :)

    Pop -> do
      r <- gets listToMaybe
      modify (drop 1)
      pure r

    RemoveAll ->
      unless (hasBug bugs DontRemove) $
        put []

    Size ->
      gets length

For our efforts we are rewarded: runStack gives rise to four interpreters for the price of one, one for each subset of the bugs. With these interpreters out of the way, it’s time to answer our original question: are they equivalent? Which is to say, do they get the same answer for every possible program? Enter prepropEquivalent:

prepropEquivalent
    :: forall effs r1 r2 f
     . ( forall a. Show a => Show (f a)
       , forall a. Eq a => Eq (f a)
       )
    => ( Inject effs r1
       , Inject effs r2
       , Arbitrary (Sem effs Int)
       )
    => (forall a. Sem r1 a -> IO (f a))
    -> (forall a. Sem r2 a -> IO (f a))
    -> Property

All of the functions in polysemy-check have fun type signatures like this one. But despite the preponderance of foralls, it’s not as terrible as you might think. The first ten lines here are just constraints. There are only two arguments to prepropEquivalent, and they are the two interpreters you’d like to test.

This type is crazy, and it will be beneficial to understand it. There are four type variables, three of which are effect rows. We can distinguish between them:

  • effs: The effect(s) you’re interested in testing. In our case, our interpreter handles Stack s, so we let effs ~ Stack s.
  • r1: The effects handled by interpreter 1. Imagine we had an interpreter for Stack s that ran it via IO instead. In that case, r1 ~ '[State s, Embed IO].
  • r2: The effects handled by interpreter 2.

The relationships that must hold between effs, r1 and r2 are \(effs \subset r1\) and \(effs \subset r2\). When running prepropEquivalent, you must type-apply effs, because Haskell isn’t smart enough to figure it out for itself.

The other type variable to prepropEquivalent is f, which allows us to capture the “resulting state” of an interpreter. In runStack :: [Bug] -> Sem (Stack s ': r) a -> Sem r ([s], a), you’ll notice we transform a program returning a into one returning ([s], a), and thus f ~ (,) [s]. If your interpreter doesn’t produce any resulting state, feel free to let f ~ Identity.

We’re finally ready to test our interpreters! For any equivalence relationship, we should expect something to be equivalent to itself. And this is true regardless of which bugs we enable:

prop_reflexive :: Property
prop_reflexive = do
  bugs <- arbitrary
  pure $
    prepropEquivalent @'[Stack Int]
      (pure . run . runStack bugs)  -- pure is getting us into IO
      (pure . run . runStack bugs)

So what’s happening here? Internally, prepropEquivalent is generating random programs of type Sem '[Stack Int] Int, and lifting that into Sem r1 Int and Sem r2 Int, and then running both interpreters and ensuring the result is the same for every program. Note that this means any fundamental non-determinism in your interpretation will break the test! Make sure to use appropriate interpreters for things like clocks and random values!

To strengthen our belief in prepropEquivalent, we can also check that runStack is not equivalent to itself if different bugs are enabled:

prop_bugsNotEquivalent :: Property
prop_bugsNotEquivalent =
  expectFailure $
    prepropEquivalent @'[Stack Int]
      (pure . run . runStack [PushTwice])
      (pure . run . runStack [])

Running this test will give us output like:

+++ OK, failed as expected. Falsified (after 3 tests):
([0,0],1) /= ([0],1)

The counterexample here isn’t particularly helpful (I haven’t yet figured out how to show the generated program that fails), but you can get a hint by noticing that the stack (the [0,0]) is twice as big in the first result as in the second.

Importantly, by specifying @'[Stack Int] when calling prepropEquivalent, we are guaranteed that the generated program will only use actions from Stack Int, so it’s not too hard to track down. This is another win for polysemy in my book — that we can isolate bugs with this level of granularity, even if we can’t yet perfectly point to them.

All of today’s code (and more!) is available as a test in polysemy-check, if you’d like to play around with it. But that’s all for now. Next week we’ll investigate how to use polysemy-check to ensure that the composition of your effects themselves is meaningful. Until then!

October 16, 2021 12:06 PM

October 14, 2021

Brent Yorgey

Competitive programming in Haskell: BFS, part 1

In a previous post, I challenged you to solve Modulo Solitaire. In this problem, we are given a starting number s_0 and are trying to reach 0 in as few moves as possible. At each move, we may pick one of up to 10 different rules (a_i,b_i) that say we can transform s into (a_i s + b_i) \bmod m.

In one sense, this is a straightforward search problem. Conceptually, the numbers 0 through m-1 form the vertices of a graph, with a directed edge from s to t whenever there is some allowed (a_i, b_i) such that t = (a_i s + b_i) \bmod m; we want to do a breadth first search in this graph to find the length of a shortest path from s_0 to 0. However, m can be up to 10^6 and there can be up to 10 rules, giving a total of up to 10^7 edges. In the case that 0 is unreachable, we may have to explore every single edge. So we are going to need a pretty fast implementation; we’ll come back to that later.

Haskell actually has a nice advantage here. This is exactly the kind of problem in which we want to represent the graph implicitly. There is no reason to actually reify the graph in memory as a data structure; it would only waste memory and time. Instead, we can specify the graph implicitly using a function that gives the neighbors of each vertex, which means BFS itself will be a higher-order function. Higher-order functions are very awkward to represent in a language like Java or C++, so when I solve problems like this in Java, I tend to just write the whole BFS from scratch every single time, and I doubt I’m the only one. However, in Haskell we can easily make an abstract interface to BFS which takes a function as input specifying an implicit graph, allowing us to nicely separate out the graph search logic from the task of specifying the graph itself.

What would be my ideal API for BFS in Haskell? I think it might look something like this (but I’m happy to hear suggestions as to how it could be made more useful or general):

data BFSResult v =
  BFSR { level :: v -> Maybe Int, parent :: v -> Maybe v }

bfs ::
  (Ord v, Hashable v) =>
  [v] ->                      -- Starting vertices
  (v -> [v]) ->               -- Neighbors
  (v -> Bool) ->              -- Goal predicate
  BFSResult v

bfs takes a list of vertices to search from (which could be a singleton if there is a single specific starting vertex), a function specifying the out-neighbors of each vertex, and a predicate specifying which vertices are “goal” vertices (so we can stop early if we reach one), and returns a BFSResult record, which tells us the level at which each vertex was encountered, if at all (i.e. how many steps were required to reach it), and the parent of each vertex in the search. If we just want to know whether a vertex was reachable at all, we can see if level returns Just; if we want to know the shortest path to a vertex, we can just iterate parent. Vertices must be Ord and Hashable to facilitate storing them in data structures.
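As a small hedged sketch of "just iterate parent": here is a helper (pathTo is my name, not part of the proposed API) that recovers a shortest path by walking parent links back from a vertex, given the BFSResult record above.

```haskell
-- Repeating the result type from above so the sketch is self-contained:
data BFSResult v =
  BFSR { level :: v -> Maybe Int, parent :: v -> Maybe v }

-- Hypothetical helper: walk parent links back from a vertex to a
-- starting vertex, then reverse to get the path in forward order.
pathTo :: BFSResult v -> v -> [v]
pathTo res v = reverse (go v)
  where
    go u = u : maybe [] go (parent res u)
```

On the Modulo Solitaire problem, pathTo applied to the vertex 0 would yield the sequence of numbers visited along one shortest solution.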

Using this API, the solution is pretty short.

main = C.interact $ runScanner tc >>> solve >>> format

data Move = Move { a :: !Int, b :: !Int } deriving (Eq, Show)
data TC = TC { m :: Int, s0 :: Int, moves :: [Move] } deriving (Eq, Show)

tc :: Scanner TC
tc = do
  m <- int
  n <- int
  TC m <$> int <*> n >< (Move <$> int <*> int)

format :: Maybe Int -> ByteString
format = maybe "-1" showB

solve :: TC -> Maybe Int
solve TC{..} = level res 0
    res = bfs [s0] (\v -> map (step v) moves) (==0)
    step v (Move a b) = (a*v + b) `mod` m

We run a BFS from s_0, stopping when we reach 0, and then look up the level of 0 to see the minimum number of steps needed to reach it.

In part 2, I’ll talk about how to implement this API. There are many viable implementation strategies, but the trick is getting it to run fast enough.

by Brent at October 14, 2021 04:09 PM

Gabriel Gonzalez

Advice for aspiring bloggers


I’m writing this post to summarize blogging advice that I’ve shared with multiple people interested in blogging. My advice (and this post) won’t be very coherent, but I hope people will still find this useful.

Also, this advice is targeted towards blogging and not necessarily writing in general. For example, I have 10 years of experience blogging, but less experience with other forms of writing, such as writing books or academic publications.


Motivation is everything when it comes to blogging. I believe you should focus on motivation before working on improving anything else about your writing. In particular, if you always force yourself to set aside time to write then (in my opinion) you’re needlessly making things hard on yourself.

Motivation can be found or cultivated. Many new writers start off by finding motivation; inspiration strikes and they feel compelled to share what they learned with others. However, long-term consistent writers learn how to cultivate motivation so that their writing process doesn’t become “feast or famine”.

There is no one-size-fits-all approach to cultivating motivation, because not everybody shares the same motivation for writing. However, the first step is always reflecting upon what motivates you to write, which could be:

  • sharing exciting new things you learn
  • making money
  • evangelizing a new technology or innovation
  • launching or switching to a new career
  • changing the way people think
  • improving your own understanding by teaching others
  • settling a debate or score
  • sorting out your own thoughts

The above list is not comprehensive, and people can blog for more than one reason. For example, I find that I’m most motivated to blog when I have just finished teaching someone something new or arguing with someone. When I conclude these conversations I feel highly inspired to write.

Once you clue in to what motivates you, use that knowledge to cultivate your motivation. For example, if teaching people inspires me then I’ll put myself in positions where I have more opportunities to mentor others, such as becoming an engineering manager, volunteering for Google Summer of Code, or mentoring friends earlier in their careers. Similarly, if arguing with people inspires me then I could hang out on social media with an axe to grind (although I don’t do that as much these days for obvious reasons…).

When inspiration strikes

That doesn’t mean that you should never write when you’re not motivated. I still sometimes write when it doesn’t strike my fancy. Why? Because inspiration doesn’t always strike at a convenient time.

For example, sometimes I will get “hot” to write something in the middle of my workday (such as right after a 1-on-1 conversation) and I have to put a pin in it until I have more free time later.

One of the hardest things about writing is that inspiration doesn’t always strike at convenient times. There are a few ways to deal with this, all of which are valid:

  • Write anyway, despite the inconvenience

    Sometimes this entails reneging on your other obligations and writing anyway. This can happen when you just know the idea has to come out one way or another and it won’t necessarily happen on a convenient schedule.

  • Write later

    Some topics will always inspire you every time you revisit them, so even if your excitement wears off it will come back the next time you revisit the subject.

    For example, sometimes I will start to write about something that I’m not excited about at the moment but I remember I was excited about it before. Then as I start to write everything comes flooding back and I recapture my original excitement.

  • Abandon the idea

    Sometimes you just have to completely give up on writing something.

    I’ve thrown away a lot of writing ideas that I was really attached to because I knew I would never have the time. It happens, it’s sad when it happens, but it’s a harsh reality of life.

    Sometimes “abandon the idea” can become “write later” if I happen to revisit the subject years later at a more opportune time, but I generally try to abandon ideas completely, otherwise they will keep distracting me and do more harm than good.

I personally have done all of the above in roughly equal measure. There is no right answer to which approach is correct and I treat it as a judgment call.

Quantity over quality

One common pattern I see is that new bloggers tend to “over-produce” some of their initial blog posts, especially for ideas they are exceptionally attached to. This is not necessarily a bad thing, but I usually advise against it. You don’t want to put all of your eggs in one basket and you should focus on writing more frequent and less ambitious posts rather than a few overly ambitious posts, especially when starting out.

One reason why is that people tend to be poor judges of their own work, in my experience. Not only do you not know when inspiration will strike, but you will also not know when inspiration has truly struck. There will be some times when you think something you produce is your masterpiece, your magnum opus, and other people are like “meh”. There will be other times when you put out something that feels half-baked or like a shitpost and other people will tell you that it changed their life.

That’s not to say that you shouldn’t focus on quality at all. Quite the opposite: the quality of your writing will improve more quickly if you write more often instead of polishing a few posts to death. You’ll get more frequent feedback from a wider audience if you keep putting your work out there regularly.

Great writing is learning how to build empathy for the reader and you can’t do that if you’re not regularly interacting with your audience. The more they read your work and provide feedback the better your intuition will get for what your audience needs to hear and how your writing will resonate with them. As time goes on your favorite posts will become more likely to succeed, but there will always remain a substantial element of luck to the process.


Writing is hard, even for experienced writers like me, because writing is so underconstrained.

Programming is so much easier than writing for me because I get:

  • Tooling support

    … such as an IDE, syntax highlighting or type-checker

  • Fast feedback loop

    For many application domains I can run my code to see if it works or not

  • Clearer demonstration of value

    I can see firsthand that my program actually does what I created it to do

Writing, on the other hand, is orders of magnitude more freeform and nebulous than code. There are so many ways to say or present the exact same idea, because you can vary things like:

  • Choice of words

  • Conceptual approach

  • Sentence / paragraph structure

  • Scope

  • Diagrams / figures

  • Examples

    Oh, don’t get me started on examples. I can spend hours or even days mulling over which example to use that is just right. A LOT of my posts in my drafts have run aground on the choice of example.

There also isn’t a best way to present an idea. One way of explaining things will resonate with some people better than others.

On top of that the feedback loop is sloooooow. Soliciting reviews from others can take days. Or you can publish blind and hope that your own editing process and intuition is good enough.

The way I cope is to add artificial constraints to my writing, especially when first learning to write. I came up with a very opinionated way of structuring everything and saying everything so that I could focus more on what I wanted to say instead of how to say it.

The constraints I created for myself touched upon many of the above freeform aspects of writing. Here are some examples:

  • Choice of words

    I would use a very limited vocabulary for common writing tasks. In fact, I still do in some ways. For example, I still use “For example,” when introducing an example, a writing habit which still lingers to this day.

  • Sentence / paragraph structure

    The Science of Scientific Writing is an excellent resource for how to improve writing structure in order to aid reader comprehension.

  • Diagrams / figures

    I created ASCII diagrams for all of my technical writing. It was extremely low-tech, but it got the job done.

  • Examples

    I had to have three examples. Not two. Not four. Three is the magic number.

In particular, one book stood out as exceptionally helpful in this regard:

The above book provides several useful rules of thumb for writing that new writers can use as constraints to help better focus their writing. You might notice that this post touches only very lightly on the technical aspects of authoring and editing writing, and that’s because all of my advice would boil down to: “go read that book”.

As time went on and I got more comfortable I began to deviate from these rules I had created for myself and then I could more easily find my own “voice” and writing style. However, having those guardrails in place made a big difference to me early on to keep my writing on track.


Sometimes you need to write something over an extended period of time, long after you are motivated to do so. Perhaps this is because you are obligated to, such as writing a blog post for work.

My trick to sustaining interest in posts like these is to always begin each writing session by editing what I’ve written so far. This often puts me back in the same frame of mind that I had when I first wrote the post and gives me the momentum I need to continue writing.


Do not underestimate the power of editing your writing! Editing can easily transform a mediocre post into a great post.

However, it’s hard to edit the post after you’re done writing. By that point you’re typically eager to publish to get it off your plate, but you should really take time to still edit what you’ve written. My rule of thumb is to sleep on a post at least once and edit in the morning before I publish, but if I have extra stamina then I keep editing each day until I feel like there’s nothing left to edit.


I’d like to conclude this post by acknowledging the blog that inspired me to start blogging:

That blog got me excited about the intersection of mathematics and programming and I’ve been blogging ever since trying to convey the same sense of wonder I got from reading about that.

by Gabriella Gonzalez at October 14, 2021 04:01 PM


Remote Interactive Course on Type-level Programming with GHC

We are offering our “Type-level programming with GHC” course again this autumn. This course is now available to book online on a first come, first served basis. If you want to book a ticket but they have sold out, please sign up to the waiting list, in case one becomes available.

Training course details

This course will be a mixture of lectures, discussions and live coding delivered via Google Meet. The maximum course size is deliberately kept small (up to 10 participants) so that it is still possible to ask and discuss individual questions. The course will be led by Andres Löh, who has more than two decades of Haskell experience and has taught many courses to varied audiences.

Type-level Programming with GHC

An overview of Haskell language extensions designed for type-level programming / how to express more properties of your programs statically

8-10th November 2021, 1930-2230 GMT (3 sessions, each 3 hours)

Other Well-Typed training courses

If you are interested in the format, but not the topic or cannot make the time, feel free to drop us a line with requests for courses on other topics or at other times. We can also do courses remotely or on-site for your company, on the topics you are most interested in and individually tailored to your needs. Check out more detailed information on our training services or just contact us.

by christine, andres at October 14, 2021 12:00 AM

October 13, 2021


GHC activities report: August-September 2021

This is the eighth edition of our GHC activities report where we describe the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of August and September 2021.

You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Facebook, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!


Currently, Ben Gamari, Andreas Klebinger, Matthew Pickering and Zubin Duggal are working primarily on GHC-related tasks. Sam Derbyshire has just joined the team at the start of October.

Many others within Well-Typed, including Adam Gundry, Alfredo Di Napoli, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, are contributing to GHC more occasionally.

Haskell Implementors’ Workshop

A few members of our team presented various facets of their work at the Haskell Implementors’ Workshop in late August. More discussion of these presentations can be found in a previous HIW recap post on this blog.

Release management

  • Ben has been handling backports and release planning for the 9.2.1 and 9.0.2 releases.

  • Zubin fixed some bugs with LLVM version detection in the HEAD and 8.10.5 releases (#19973, #19828, #19959).

  • The bindists produced by hadrian now have a very similar structure to the ones produced by make (Matt, !6345, !6349).

Compiler error messages

  • Alfredo continued working on the conversion of GHC diagnostic messages from plain structured documents to richer Haskell types. After porting some errors in the driver code (!6249) he turned his attention to the modules in GHC’s typechecker (!6414), and he’s currently converting GHC’s typeclass-derivation code to use the new diagnostic infrastructure (!6561).


  • Matt has been fixing a number of bugs discovered after recent driver refactoring. Hopefully everything works as before in the 9.4 release! (!6508, !6507, !6412)

  • Matt attempted to implement the splice imports proposal but the specification didn’t correctly enforce level invariants. The latest idea is to introduce a complementary “quote” import to distinguish imports allowed to be used in quotations.

Haddock and documentation

  • Zubin has been finishing up the long-pending hi Haddock work, which should allow Haddock to generate documentation using only GHC interface (.hi) files (!6224). This greatly simplifies Haddock’s implementation, and allows it to skip parsing, renaming and type-checking files if the appropriate information already exists in the interface files, speeding it up greatly in such cases. This also reduces Haddock’s peak memory consumption. Identifiers in Haddock comments will also be renamed by GHC itself, and the results are also serialized into .hi files for tooling to make use of. A number of Haddock bugs were fixed along the way (#20034, haddock #30, haddock #665, haddock #921).

Profiling and debugging

  • Andreas continued looking into using the machine stack register for the Haskell stack, as described in a previous blog post. While we have a branch that uses the machine stack register, there are issues with perf not unwinding properly, as well as issues related to LLVM compatibility. For these reasons we will likely stop looking into this for the time being.

Compiler performance

  • Matt has been investigating several compiler performance regressions when compiling common packages (#19478, #19471).

  • Andreas landed a few performance improvements in !6609.

  • Andreas improved further on tag inference in !5614. As it stands it improves compiler performance while also improving runtime for most programs slightly.

  • During investigation of some edges of the tag inference work Andreas found some edge cases in GHC’s current code generation (#20334, #20333, #20332).

  • Adam has been working on a new approach to improving compilation performance for programs with significant use of type families. The key idea is to introduce a more compact representation of coercions in Core, which will occupy less space and be faster to traverse during optimisation (#8095, !6476).

Runtime performance

  • Matt diagnosed and found an interesting recent regression in the text package benchmarks due to code generation using 8-bit instructions which causes partial register stalling (#20405).

Compiler correctness

  • Ben added build system support for multi-target native toolchains (e.g. clang), allowing GHC to be used robustly on platforms which may run code for multiple platforms (#20162).

  • Ben fixed numerous linking issues affecting musl-based platforms, enabling static linkage on Alpine.

  • Ben fixed a number of build system bugs pertaining to libffi linkage affecting Darwin.

  • Ben fixed a Hadrian bug causing binary distribution installation to use platform parameters from the build environment instead of the installation environment (#20253).

Runtime system

  • Ben found and fixed a few tricky write barrier issues in the non-moving garbage collector (#20399).

CI and infrastructure

  • At the request of Richard, Matt has reduced the default verbosity levels of the Hadrian build output (!6584, !6545).

  • Ben refactored GHC’s CI infrastructure on Darwin platforms, eliminating several sources of fragility in the process.

by ben, matthew, andreask, zubin, alfredo, adam at October 13, 2021 12:00 AM

October 11, 2021

Monday Morning Haskell

Why Haskell?


(This post is also available as a YouTube video!)

When I tell other programmers I do a lot of programming in Haskell, a common question is "Why?" What is so good about Haskell that it's worth learning a language so different from the vast majority of software? There are a few different things I usually think of, but the biggest one that sticks out for me is the way Haskell structures effects. I think these structures have really helped me change the way I think about programming, and knowing these ideas has made me a more effective developer, even in other languages.

Defining Effects

Now you might be wondering, what exactly is an effect? Well to describe effects, let's first think about a "pure" function. A pure function has no inputs besides the explicit parameters, and the only way it impacts our program's behavior is through the value it returns.

// A simple, pure function
public int addWith5(int x, int y) {
  int result = x + y + 5;
  return result;
}
We can define an effect as, well, anything outside of that paradigm. This can be as simple as an implicit mutable input to the function like a global variable.

// Global mutable variable as an "implicit" input
global int z;

public int addWith5(int x, int y) {
  int result = x + y + 5 + z; // < z isn't a function parameter!
  return result;
}
Or it can be something more complicated like writing something to the file system, or making an HTTP request to an API.

// More complicated effects (pseudo-code)
public int addWith5(int x, int y) {
  int result = x + y + 5;
  WriteFile("result.txt", result);
  return result;
}
Once our function does these kinds of operations, its behavior is significantly less predictable, and that can cause a lot of bugs.

Now a common misconception about Haskell is that it does not allow side effects. And this isn't correct. What is true about Haskell is that if a function has side effects, these must be part of its type signature, usually in the form of a monad, which describes the full computational context of the function.

A function in the State monad can update a shared global value in some way.

updateValue :: Int -> State Int Int

A function in the IO monad can write to the file system or even make a network call.

logAndSendRequest :: Req -> IO Result

This type-level documentation helps avoid bugs and provides guarantees about parts of our program at compile time, and this can be a real lifesaver.
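To make this concrete, here is a minimal sketch of what an implementation behind the updateValue signature above might look like. The body is hypothetical (the post only gives the type), and it assumes the mtl library's Control.Monad.State:

```haskell
import Control.Monad.State

-- Hypothetical body for the signature above: add the argument to the
-- shared Int state and return the value that was there before.
updateValue :: Int -> State Int Int
updateValue n = do
  old <- get       -- read the shared state
  put (old + n)    -- effect: overwrite the shared state
  return old
```

Running it with runState makes the effect explicit: runState (updateValue 5) 10 evaluates to (10, 15), the old value paired with the updated state.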

Re-thinking Code

For the last few years I've been writing about Haskell in my free time while using C++ and Python in my day job. This has given me a deeper appreciation for the lessons I learned from Haskell's effect structures, and I've seen that my code in other languages is much better because of them.

New Course: Effectful Haskell!

And this is why I'm excited to introduce my newest course on the Monday Morning Haskell Academy. This one is called Effectful Haskell, and I think it might be the most important course I've made so far, because it really zeroes in on this idea of effects. For me, this is the main idea that separates Haskell from other languages. But at the same time, it can also teach you to be a better programmer in these other languages.

This course is designed to give you hands-on experience with some of the different tools and paradigms Haskell has for structuring effects. It includes video lectures, screencasts, and in-depth coding exercises that culminate in you launching a small but multi-functional web server.

If you've dabbled a bit in Haskell and you understand the core ideas, but you want to see what the language is really capable of, I highly recommend you try out this course. You can head to the course sales page to see an overview of the course as well as the FAQ. I'll mention a couple special items.

First, there is a 30-day refund guarantee if you decide you don't like the course.

And second, if you subscribe (or are already subscribed) to the Monday Morning Haskell newsletter, you'll get a 20% discount code for this and our other courses! So I hope you'll take a look and try it out.

by James Bowen at October 11, 2021 02:30 PM

October 09, 2021

Sandy Maguire

Testing Polysemy With polysemy-check

Last week we covered how to port an existing codebase to polysemy. The “why you might want to do this” was left implicit, but to be more explicit about things, it’s because littering your codebase with IO makes things highly-coupled and hard to test. By forcing yourself to think about effects, you are forced to pull concerns apart, and use the type-system to document what’s going on. But more importantly for today, it gives us a layer of indirection inside of which we can insert testing machinery.

To take an extreme example from the codebase I’m currently working on, compare a function with its original (non-polysemized) type:

api :: Opts -> ServerT API App

which looks very simple, and gives the false impression that api is fairly uninteresting. However, there is an amazing amount of IO hiding inside of App, which becomes significantly more evident when we give this type explicit dependency constraints:

api ::
  Members
    '[ AReqIDStore,
       Error SparError,
       Input Opts,
       Logger (Msg -> Msg),
       Logger String
       -- (further Member constraints elided)
     ]
    r =>
  Opts ->
  ServerT API (Sem r)

Wow! Not so innocent-looking now, is it? Each Member constraint here is a unit of functionality that was previously smuggled in via IO. Not only have we made them more visible, but we’ve now exposed a big chunk of testable surface-area. You see, each one of these members provides an abstract interface, which we can implement in any way we’d like.

Because IO is so hard to test, the idea of polysemy is that we can give several interpretations for our program — one that is pure, lovely, functional, and, importantly, very easy to test. Another interpretation is one that runs fast in IO. The trick then is to decompose the problem of testing into two steps:

  1. show that the program is correct under the model interpreter
  2. show that the model interpreter is equivalent to the real interpreter

This sounds great in principle, but as far as I know, it’s never been actually done in practice. My suspicion is that people using polysemy in the wild don’t get further than step 1 (which is OK — a good chunk of the value in effect systems is in the decomposition itself.) Doing all of the work to show equivalence of your interpreters is a significant amount of work, and until now, there have been no tools to help.

Introducing polysemy-check: a new library for proving all the things you’d want to prove about a polysemy codebase. polysemy-check comes with a few tools for synthesizing QuickCheck properties, plus machinery for getting Arbitrary instances for effects for free.

Using polysemy-check

To get started, you’re going to need to give two instances for every effect in your system-under-test. Let’s assume we have a stack effect:

data Stack s m a where
  Push :: s -> Stack s m ()
  Pop :: Stack s m (Maybe s)
  RemoveAll :: Stack s m ()
  Size :: Stack s m Int

makeSem ''Stack

The instances we need are given by:

deriving instance (Show s, Show a) => Show (Stack s m a)
deriveGenericK ''Stack

where deriveGenericK is a Template Haskell helper from kind-generics (re-exported by polysemy-check). kind-generics is GHC.Generics on steroids: it’s capable of deriving generic code for GADTs.

The first thing that probably comes to mind when you consider QuickCheck is “checking for laws.” For example, we should expect that push s followed by pop should be equal to pure (Just s). Laws of this sort give meaning to effects, and act as sanity checks on their interpreters.

Properties for laws can be created via prepropLaw:

prepropLaw
    :: forall effs r a f
     . ( (forall z. Eq z => Eq (f z))
       , (forall z. Show z => Show (f z))
       )
    => ( Eq a
       , Show a
       , ArbitraryEff effs r
       )
    => Gen (Sem r a, Sem r a)
    -> (forall z. Sem r (a, z) -> IO (f (a, z)))
    -> Property

Sorry for the atrocious type. If you’re looking for Boring Haskell, you’d best look elsewhere.

The first argument here is a QuickCheck generator which produces two programs that should be equivalent. The second argument is the interpreter for Sem under which the programs must be equivalent; if they are not, the resulting Property fails. Thus, we can write the push/pop law above as:

law_pushPop
    :: forall s r f effs res
     . ( -- The type that our generator returns
         res ~ (Maybe s)

         -- The effects we want to be able to synthesize for contextualized
         -- testing
       , effs ~ '[Stack s]

         -- Misc constraints you don't need to care about
       , Arbitrary s
       , Eq s
       , Show s
       , ArbitraryEff effs r
       , Members effs r
       , (forall z. Eq z => Eq (f z))
       , (forall z. Show z => Show (f z))
       )
    => (forall a. Sem r (res, a) -> IO (f (res, a)))
    -> Property
law_pushPop = prepropLaw @effs $ do
  s <- arbitrary
  pure ( push s >> pop
       , pure (Just s)
       )
Sorry. Writing gnarly constraints is the cost of not needing to write gnarly code. If you know how to make this better, please open a PR!

There’s something worth paying attention to in law_pushPop — namely the type of the interpreter (forall a. Sem r (Maybe s, a) -> IO (f (Maybe s, a))). What is this forall a thing doing, and where does it come from? As written, our generator would merely check the equivalence of the exact two given programs, but this is an insufficient test. We’d instead like to prove the equivalence of the push/pop law under all circumstances.

Behind the scenes, prepropLaw is synthesizing a monadic action to run before our given law, as well as some actions to run after it. These actions are randomly pulled from the effects inside the effs ~ '[Stack s] row (and so here, they will only be random Stack actions.) The a here is actually the result of these “contextual” actions. Complicated, but you really only need to get it right once, and can copy-paste it forevermore.

Now we can specialize law_pushPop (plus any other laws we might have written) for a would-be interpreter of Stack s. Any interpreter that passes all the properties is therefore proven to respect the desired semantics of Stack s.
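As an illustration, here is a minimal sketch (not from the post) of what one such would-be interpreter could look like, assuming polysemy's Polysemy.State module: it models the stack as a plain list.

```haskell
{-# LANGUAGE DataKinds, GADTs, LambdaCase, ScopedTypeVariables, TypeOperators #-}

import Polysemy
import Polysemy.State

-- A candidate "model" interpreter for the Stack effect above: the stack
-- is a list threaded through polysemy's State effect.
runStackAsList :: forall s r a. Sem (Stack s ': r) a -> Sem r a
runStackAsList = evalState ([] :: [s]) . reinterpret (\case
  Push s    -> modify (s :)
  Pop       -> get >>= \case
    []       -> pure Nothing
    (x : xs) -> put xs >> pure (Just x)
  RemoveAll -> put ([] :: [s])
  Size      -> gets length)
```

Under this interpreter, run (runStackAsList (push s >> pop)) gives Just s, so it stands a good chance of passing law_pushPop.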

Wrapping Up

polysemy-check can do lots more, but this post is overwhelming already. So next week we’ll discuss how to prove the equivalence of interpreters, and how to ensure your effects are sane with respect to one another.

October 09, 2021 02:23 PM

October 06, 2021

Gabriel Gonzalez

The "return a command" trick


This post illustrates a trick that I’ve taught a few times to minimize the “change surface” of a Haskell program. By “change surface” I mean the number of places Haskell code needs to be updated when adding a new feature.

The motivation

I’ll motivate the trick through the following example code for a simple REPL:

import Control.Applicative ((<|>))
import Data.Void (Void)
import Text.Megaparsec (Parsec)

import qualified Data.Char as Char
import qualified System.IO as IO
import qualified Text.Megaparsec as Megaparsec
import qualified Text.Megaparsec.Char as Megaparsec

type Parser = Parsec Void String

data Command = Print String | Save FilePath String

parsePrint :: Parser Command
parsePrint = do
    Megaparsec.string "print"

    Megaparsec.space  -- consume the whitespace after the command name

    string <- Megaparsec.takeRest

    return (Print string)

parseSave :: Parser Command
parseSave = do
    Megaparsec.string "save"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    Megaparsec.space

    string <- Megaparsec.takeRest

    return (Save file string)

parseCommand :: Parser Command
parseCommand = parsePrint <|> parseSave

main :: IO ()
main = do
    putStr "> "

    eof <- IO.isEOF

    if eof
        then do
            putStrLn ""

        else do
            text <- getLine

            case Megaparsec.parse parseCommand "(input)" text of
                Left e -> do
                    putStr (Megaparsec.errorBundlePretty e)

                Right command -> do
                    case command of
                        Print string -> do
                            putStrLn string

                        Save file string -> do
                            writeFile file string

            main  -- loop back for the next command


This REPL supports two commands: print and save:

> print Hello, world!
Hello, world!
> save number.txt 42

print echoes back whatever string you supply and save writes the given string to a file.

Now suppose that we wanted to add a new load command to read and display the contents of a file. We would need to change our code in four places.

First, we would need to change the Command type to add a new Load constructor:

data Command = Print String | Save FilePath String | Load FilePath

Second, we would need to add a new parser to parse the load command:

parseLoad :: Parser Command
parseLoad = do
    Megaparsec.string "load"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    return (Load file)

Third, we would need to add this new parser to parseCommand:

parseCommand :: Parser Command
parseCommand = parsePrint <|> parseSave <|> parseLoad

Fourth, we would need to add logic for handling our new Load constructor in our main loop:

                    case command of
                        Print string -> do
                            putStrLn string

                        Save file string -> do
                            writeFile file string

                        Load file -> do
                            string <- readFile file

                            putStrLn string

I’m not a fan of this sort of program structure because the logic for how to handle each command isn’t all in one place. However, we can make a small change to our program structure that will not only simplify the code but also consolidate the logic for each command.

The trick

We can restructure our code by changing the type of all of our parsers from this:

parsePrint :: Parser Command

parseSave :: Parser Command

parseLoad :: Parser Command

parseCommand :: Parser Command

… to this:

parsePrint :: Parser (IO ())

parseSave :: Parser (IO ())

parseLoad :: Parser (IO ())

parseCommand :: Parser (IO ())

In other words, our parsers now return an actual command (i.e. IO ()) instead of returning a Command data structure that still needs to be interpreted.

This entails the following changes to the implementation of our three command parsers:

{-# LANGUAGE BlockArguments #-}

parsePrint :: Parser (IO ())
parsePrint = do
    Megaparsec.string "print"

    Megaparsec.space

    string <- Megaparsec.takeRest

    return do
        putStrLn string

parseSave :: Parser (IO ())
parseSave = do
    Megaparsec.string "save"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    Megaparsec.space

    string <- Megaparsec.takeRest

    return do
        writeFile file string

parseLoad :: Parser (IO ())
parseLoad = do
    Megaparsec.string "load"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    return do
        string <- readFile file

        putStrLn string

Now that each parser returns an IO () action, we no longer need the Command type, so we can delete the following datatype definition:

data Command = Print String | Save FilePath String | Load FilePath

Finally, our main loop gets much simpler, because we no longer need to specify how to handle each command. That means that instead of handling each Command constructor:

            case Megaparsec.parse parseCommand "(input)" text of
                Left e -> do
                    putStr (Megaparsec.errorBundlePretty e)

                Right command -> do
                    case command of
                        Print string -> do
                            putStrLn string

                        Save file string -> do
                            writeFile file string

                        Load file -> do
                            string <- readFile file

                            putStrLn string

… we just run whatever IO () command was parsed, like this:

            case Megaparsec.parse parseCommand "(input)" text of
                Left e -> do
                    putStr (Megaparsec.errorBundlePretty e)

                Right io -> do
                    io
Now we only need to make two changes to the code any time we add a new command. For example, all of the logic for the load command is right here:

parseLoad :: Parser (IO ())
parseLoad = do
    Megaparsec.string "load"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    return do
        string <- readFile file

        putStrLn string

… and here:

parseCommand :: Parser (IO ())
parseCommand = parsePrint <|> parseSave <|> parseLoad
--                                          ↑ the new parser

… and that’s it. We no longer need to change our REPL loop or add a new constructor to our Command datatype (because there is no Command datatype any longer).

What’s neat about this trick is that the IO () command we return has direct access to variables extracted by the corresponding Parser. For example:

parseLoad = do
    Megaparsec.string "load"

    Megaparsec.space

    -- The `file` variable that we parse here …
    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    return do
        -- … can be referenced by the corresponding `IO` action here
        string <- readFile file

        putStrLn string

There’s no need to pack our variables into a data structure and then unpack them again later on when we need to use them. This technique promotes tight and “vertically integrated” code where all of the logic is in one place.

Final encodings

This trick is a special case of a more general trick known as a “final encoding” and the following post does a good job of explaining what “initial encoding” and “final encoding” mean:

To briefly explain initial and final encodings in my own words:

  • An “initial encoding” is one where you preserve as much information as possible in a data structure

    This keeps your options as open as possible since you haven’t specified what to do with the data yet

  • A “final encoding” is one where you encode information by how you intend to use it

    This tends to simplify your program if you know in advance how the information will be used

The initial example from this post was an initial encoding because each Parser returned a Command type which preserved as much information as possible without specifying what to do with it. The final example from this post was a final encoding because we encoded our commands by directly specifying what we planned to do with them.
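The distinction also shows up in smaller settings. Here is a tiny standalone illustration using a hypothetical Shape example (not from this post) rather than REPL commands:

```haskell
-- Initial encoding: preserve the data, decide what to do with it later.
data Shape = Circle Double | Square Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s

-- Final encoding: a shape is represented directly by how we intend to
-- use it (here, by its area), so there is nothing left to interpret.
type Shape' = Double

circle :: Double -> Shape'
circle r = pi * r * r

square :: Double -> Shape'
square s = s * s
```

Both give area (Square 3) == square 3 == 9, but the initial encoding still lets you add, say, a perimeter function later, whereas the final encoding has already committed to one use.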


This trick is not limited to returning IO actions from Parsers. For example, the following post illustrates a similar trick in the context of implementing configuration “wizards”:

… where a wizard has type IO (IO ()) (a command that returns another command).
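The IO (IO ()) shape can be sketched in a few lines (hypothetical names; the linked post has the full treatment): the outer action asks its questions now, and returns an inner action to be run later.

```haskell
-- Hypothetical "wizard": the outer IO gathers answers interactively,
-- and the returned inner IO () is the deferred command built from them.
saveWizard :: IO (IO ())
saveWizard = do
    putStr "File name? "
    file <- getLine

    putStr "Contents? "
    contents <- getLine

    -- Nothing has been written yet; we only return the command.
    return (writeFile file contents)
```

A caller can run several wizards to collect all input up front, and only afterwards execute the commands they returned.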

More generally, you will naturally rediscover this trick if you stick to the principle of “keep each component’s logic all in one place”. In the above example the “components” were REPL commands, but this consolidation principle is useful for any sort of plugin-like system.


Here is the complete code for the final version of the running example if you would like to test it out yourself:

{-# LANGUAGE BlockArguments #-}

import Control.Applicative ((<|>))
import Data.Void (Void)
import Text.Megaparsec (Parsec)

import qualified Data.Char as Char
import qualified System.IO as IO
import qualified Text.Megaparsec as Megaparsec
import qualified Text.Megaparsec.Char as Megaparsec

type Parser = Parsec Void String

parsePrint :: Parser (IO ())
parsePrint = do
    Megaparsec.string "print"

    Megaparsec.space

    string <- Megaparsec.takeRest

    return do
        putStrLn string

parseSave :: Parser (IO ())
parseSave = do
    Megaparsec.string "save"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    Megaparsec.space

    string <- Megaparsec.takeRest

    return do
        writeFile file string

parseLoad :: Parser (IO ())
parseLoad = do
    Megaparsec.string "load"

    Megaparsec.space

    file <- Megaparsec.takeWhile1P Nothing (not . Char.isSpace)

    return do
        string <- readFile file

        putStrLn string

parseCommand :: Parser (IO ())
parseCommand = parsePrint <|> parseSave <|> parseLoad

main :: IO ()
main = do
putStr "> "

eof <- IO.isEOF

if eof
then do
putStrLn ""