Planet Haskell

September 25, 2021

Chris Smith 2

September Virtual CoHack Recap

The September chapter of the Virtual Haskell CoHack is now past. Here’s how it went:

  • We had 12 people in attendance overall, ranging from experienced Haskell programmers to the novice and intermediate level. We had a great time getting to know each other and hearing each other’s Haskell stories.
  • One group worked on getting started with the implementation of the lambda cases GHC proposal. They made significant progress on lexing and parsing work to implement the feature.
  • Another group worked on explainable-predicates, implementing a number of minor feature requests and culminating in the implementation of qADT, a powerful Template Haskell combinator for convenient matching of algebraic data types with predicates for the fields. A new version has been released.
  • Yet another group worked on documentation for Data.Foldable, focusing on making the documentation more precise, accessible, and helpful with examples and clear language.

We’ve already got the next Virtual Haskell CoHack scheduled, so go ahead and sign up if you’d like to join the fun next time around.

by Chris Smith at September 25, 2021 11:18 PM

September 24, 2021

Brent Yorgey

Swarm: preview and call for collaboration

For about a month now I have been working on building a game1, tentatively titled Swarm. It’s nowhere near finished, but it has at least reached a point where I’m not embarrassed to show it off. I would love to hear feedback, and I would especially love to have others contribute! Read on for more details.

Swarm is a 2D tile-based resource gathering game, but with a twist: the only way you can interact with the world is by building and programming robots. And there’s another twist: the kinds of commands your robots can execute, and the kinds of programming language features they can interpret, depends on what devices they have installed; and you can create new devices only by gathering resources. So you start out with only very basic capabilities and have to bootstrap your way into more sophisticated forms of exploration and resource collection.

I guess you could say it’s kind of like a cross between Minecraft, Factorio, and Karel the Robot, but with a much cooler programming language (lambda calculus + polymorphism + recursion + exceptions + a command monad for first-class imperative programs + a bunch of other stuff).

The game is far from complete, and especially needs a lot more depth in terms of the kinds of devices and levels of abstraction you can build. But for me at least, it has already crossed the line into something that is actually somewhat fun to play.

If it sounds interesting to you, give it a spin! Take a look at the README and the tutorial. If you’re interested in contributing to development, check out the CONTRIBUTING file and the GitHub issue tracker, which I have populated with a plethora of tasks of varying difficulty. This could be a great project to contribute to especially if you’re relatively new to Haskell; I try to keep everything well-organized and well-commented, and am happy to help guide new contributors.

  1. Can you tell I am on sabbatical?↩

by Brent at September 24, 2021 02:51 AM

September 23, 2021

Tweag I/O

Functional data pipelines with funflow2

As the data science and machine learning fields have grown over the past decade, so has the number of data pipelining frameworks. What started out largely as a set of tools for extract-transform-load (ETL) processes has expanded into a diverse ecosystem of libraries, all of which aim to provide data teams with the ability to move and process their data. Apache Airflow, Beam, Luigi, Azkaban — the list goes on and on.

As users of several of the above frameworks, Tweag has a special interest in data pipelines. While working with them, we have observed a few common shortcomings which make writing and debugging data pipelines more complex than it needs to be. These shortcomings include limited composability of pipeline components, as well as minimal support for static analysis of pipelines. This second issue can be especially annoying when executing pipelines in a machine learning context, where pipelines are often defined as a Directed Acyclic Graph (DAG) of long-running tasks (e.g. training a model). In these situations, early identification of a pipeline doomed to fail due problems like configuration errors can spare great waste of compute time and resources.

Additionally, while many data pipeline frameworks provide some support for choosing the execution environment of the pipeline (e.g. running locally vs. on a Spark or Kubernetes cluster), not all provide control over the execution logic of the tasks themselves. This is a common limitation of workflow-oriented frameworks, like Apache Airflow, that couple a task’s definition to its execution logic by combining them in a single class. This lack of modularity makes it difficult to do things like write integration tests for pipeline components.

Enter: funflow2

funflow2 is the successor of funflow, a Haskell library for writing workflows in a functional style. funflow2 makes use of kernmantle to offer several advantages over the original funflow including greater extensibility, additional type-level controls, and a more feature-rich interpretation layer for analyzing pipelines before execution.

An ICFP Haskell Symposium 2020 paper provides a deep dive into kernmantle’s design and features.

funflow2 aims to address the limitations we have observed in other data pipeline frameworks while providing a simple, high-level API to express pipelines in a functional style. Let’s take a closer look.


In funflow2, a pipeline is represented by a Flow data type. Flows have the nice property of being composable, which allows pipeline subcomponents to be recombined with other components to create new pipelines.

To illustrate this point, we will compare two simple Apache Airflow and funflow2 pipelines, each with three sequential tasks that execute a function. In Airflow, this can be written as follows.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

dag = DAG(
   default_args={"start_date": datetime(2021, 10, 1)},

gen_data = PythonOperator(
   python_callable=lambda: 1,

add_one = PythonOperator(
   python_callable=lambda x: x + 1,

print_result = PythonOperator(
   python_callable=lambda x: print(x),

In Airflow it’s simple to define a DAG of tasks, but it’s tricky to re-use it among pipelines. Airflow provides support for a SubDagOperator to trigger other DAGs, but in the end we end up with two completely separate DAG objects, forcing us to manage them individually.

With Funflow2 reusability is much simpler. Since Flows are designed to be composable with Arrows, we can just write:

{-# LANGUAGE FlexibleContexts #-}

import Funflow (pureFlow, ioFlow)
import Control.Arrow ((>>>))

-- Note: the >>> operator is used to chain together two Flows sequentially
genData = pureFlow (const 1)
addOne = pureFlow (\x -> x + 1)
printResult = ioFlow (\x -> print x)
example1 = genData >>> addOne >>> printResult

anotherFlow = example1 >>> pureFlow (\_ -> "anotherFlow")

It’s as simple as that!

Identifying errors early

One of the most frustrating things that can happen when executing a pipeline of long-running tasks is to discover that one of the tasks towards the end of the pipeline was misconfigured - leading to wasted developer time and compute resources. Our original funflow package sought to alleviate this pain point by emphasizing resumability. Resumability was achieved through caching results as pipeline execution proceeded. funflow2 supports resumability using the same caching approach as the original funflow.

The potential time and resource savings provided by caching are limited, however, in at least two ways. First, since pipeline tasks are structured as atomic units, a failed pipeline cannot recover the work that was successfully completed within a failing task before the failure occurred. Perhaps even more importantly, with resumability a failed pipeline may still cause lost efficiency due to mental context switching. For instance, maybe you start a pipeline run, switch to another project (perhaps over multiple days), and then find that your pipeline has failed. You might find yourself asking the question, “what was I trying to achieve in the task that failed?” Earlier discovery of errors is less wasteful with mental resources and less stressful for a pipeline user.

As a pipeline author it is useful to identify misconfigurations early in the development process, either through compilation errors or through early failures at run-time. While many pipelining frameworks divide a program’s lifecycle into compile-time and run-time, kernmantle (and thus funflow) distinguishes three phases: compile-time, config-time and run-time. In the intermediate config-time phase, the pipeline is interpreted and configured which allow for static analysis. As we’ll see below, funflow2 makes use of both compile- and execution-time phases to ensure early failure.

Type errors, detectable at compilation time

There are a number of ways in which a pipeline author can accidentally misconfigure a pipeline. One example of misconfiguration is when there’s a mismatch between a task’s input type and the type of argument passed by an upstream task. This kind of issue can plague pipelines built with a library written in a dynamically typed language like Python (e.g. Airflow). It’s less common of an issue, though, for pipelines written in statically typed languages like Java or Scala (e.g. scio). Since funflow2 pipelines are written in Haskell, the language’s static type checking allows us to catch issues like this at compile time.

For example, the following pipeline will fail to compile since flow3 attempts to pass the string output of flow1 as an input to flow2, which expects an integer.

-- Note: `const` is a built in function which just returns the
-- specified value
flow1 :: Flow () String
flow1 = pureFlow (const "42")

flow2 :: Flow Integer Integer
flow2 = pureFlow (+1)
flow3 = flow1 >>> flow2
Couldn't match type ‘Integer’ with ‘String’

Value errors, detectable at config time

A more subtle and less easily detectable kind of misconfiguration occurs when a task is configured in a way that passes compile-time type checks but is guaranteed to be invalid at runtime. Funflow2 is built on top of kernmantle, which provides us with a convenient layer for extracting static information from tasks in a pre-execution phase called config-time, after the pipeline has been loaded but before any task has run. This layer can be used for early detection of errors in a pipeline and ensuring early failure if something is awry. For example, let’s try running a Flow which attempts to run a Docker container with an image tag that does not exist. This pipeline can be compiled without complaint but is doomed to fail at run-time.

{-# LANGUAGE OverloadedStrings #-}

import Funflow (dockerFlow)
import Funflow.Tasks.Docker (DockerTaskConfig (..), DockerTaskInput (..))

failingFlow = dockerFlow $
  DockerTaskConfig {
    image = "alpine:this-tag-does-not-exist",
    command = "echo",
    args = ["hello"]

-- Flows created with dockerFlow expect a DockerTaskInput,
-- and we can just create an empty value for the sake of this
-- example.
emptyInput = DockerTaskInput {inputBindings = [], argsVals = mempty}

flow = ioFlow (\_ -> print "start of flow")
  >>> pureFlow (const emptyInput)          -- Constructs dockerFlow input
  >>> failingFlow
  >>> ioFlow (\_ -> print "end of flow")

Attempting to run this pipeline gives us the following error.

runFlow flow ()
Found docker images, pulling...
Pulling docker image: alpine:this-tag-does-not-exist
ImagePullError "Request failed with status 404: \"Not Found\" …"

Note that “start of flow” is never printed; the first task in the pipeline, which would print that, never actually executes. This is because funflow2’s default pipeline runner extracts the identifiers of all images required to run a pipeline at config-time and attempts to pull them before starting the actual pipeline run.

For another example of this fail-fast behavior when a pipeline is misconfigured, have a look at the configuration tutorial.

Aside: workflows and build systems

The connection between build systems and the kinds of task-based workflows / pipelines mentioned here is no secret and is a topic we have mentioned in previous blog posts. Despite serving different domains, workflow tools like funflow2 and build tools like Bazel or mill all seek to provide users with the ability to execute a graph of tasks in a repeatable way. Here we briefly consider a couple of key parallels between build systems and funflow2.

Early cutoff is a common property of build systems and refers to a build system’s ability to halt once it has determined that no dependency of a downstream task has changed since the most recent build1. While funflow2 is not a build system, it can be used to mimic one and even exhibits early cutoff, owing to its caching strategy using content-addressable storage. If the hash determined by the hash of a task and its inputs has not changed since a previous pipeline run, work need not be repeated.

Another property which disinguishes build systems is whether or not they support dynamic dependencies, or those for which the relationships themselves may vary with task output(s). This property depends on the build systems’ approach to modeling graphs of tasks, and in a functional paradigm is determined by whether or not tasks are modeled using monadic effects1. funflow2 uses arrows and not monads for composing its task graph and therefore falls within the camp of tools which do not support dynamic dependencies such as Bazel. The use of a static task graph is the key to pre-execution dependency collection and is what allows funflow2 to support the execution-free interpretation of tasks to detect invalid configuration. If tasks were linked in a monadic fashion, this would be impossible.


In addition to limited composability and fail-late behavior, tight coupling between a task and its execution logic is another common weakness among existing task-based data pipelining frameworks: the data defining a task is inseparably bound to the specific logic for how the task is executed. In a closely coupled context, altering task execution logic requires entire redefinition of the task itself, e.g. by subclassing. This confounds granular testing and nudges pipeline validation toward an end-to-end style. The resulting coarse resolution view of a test failure complicates diagnosis and remedy of a bug.

In funflow2, an interpreter defines task execution logic and is separate from the task definition itself. Effectively, an interpreter transforms a task, which is treated as data, into an executable action. Separating concerns this way allows development of custom interpreters for integration testing, project-specific execution requirements, and more. Using interpreters, you can do things like flip between test and deployment execution contexts while working with the same pipeline definition. Furthermore, this modularity allows for a greater static analysis, yielding enhanced fail-fast behavior.

While an example implementation is out of scope for this post, an example of custom tasks and interpreters is available in the extensibility tutorial on the funflow2 project website.

Similarities with funflow

funflow2 uses a completely new backend (kernmantle) and therefore has major API differences from the original funflow. Nonetheless, many of the useful features of funflow are available in funflow2:

Arrow syntax

The most immediately evident similarity between the original funflow and funflow2 is syntactic. Since funflow2 still models flows using arrows, Arrow syntax can still be used to define and compose flows.


Like its predecessor, the funflow2 project uses the cas-store package which provides caching functionality using content-addressable storage.


While funflow2’s API differs significantly from funflow, it provides a module with aliases for some of the core functions from funflow such as step and stepIO to help simplify migration. Many of the core example from funflow have also been ported to funflow2 such as the C compilation tutorial or custom make example discussed in an earlier blog post.

Next steps

Interested in trying out some hands-on examples? We’ve prepared a set of tutorials on the funflow2 website. Alternatively, each tutorial notebook can be run locally using the provided Nix shell. You may also initialize a funflow2 project using our cookiecutter template.

funflow2 is still in its early stages, and so far most of the development on it has focused on building the Flow API, along with a few tasks and interpreters. One area in which other data pipeline frameworks excel is in providing a high level of interoperability through a wide range of predefined tasks for common operations, including running database queries and uploading data to cloud storage. We would welcome external contributions to provide these and other common task types, so please open an issue or pull request if this interests you.

Thanks for reading, and stay tuned for future updates on funflow2!

  1. Refer to the paper “Build Systems a la Carte” for much more discussion of minimality, early cutoff, and other properties of build systems and pipelines.

September 23, 2021 12:00 AM

Gil Mizrahi

A new project-oriented Haskell book

This is an announcement and a 'request for comments' for a new project-oriented, online, and free, Haskell book: Learning Haskell by building a static blog generator (or LHBG for short).

This book is yet another attempt of mine to try and help make Haskell a bit more approachable. I think Haskell is a great language and I've been enjoying using it for most of my projects for several years now.

I wanted to create a short book aimed at complete beginners to the language which focuses on the practice of writing Haskell programs, including idioms, techniques, and general software development, rather than the concepts and language features - for which there are many resources available.

This book is still pretty much in an 'alpha' kind of stage, meaning that it probably needs some feedback-driven editing, but I think it is ready to be shared. I hope this book will be useful to some people trying to learn Haskell and write programs with it.

Do let me know if you have any comments or feedback on it. Especially if you are a new Haskeller getting started with the language!

by Gil at September 23, 2021 12:00 AM

September 22, 2021

Brent Yorgey

Competitive programming in Haskell: Codeforces Educational Round 114

Yesterday morning I competed in Educational Round 114 on, using only Haskell. It is somewhat annoying since it does not support as many Haskell libraries as Open Kattis (e.g. no unordered-containers, split, or vector); but on the other hand, a lot of really top competitive programmers are active there, and I enjoy occasionally participating in a timed contest like this when I am able.

WARNING: here be spoilers! Stop reading now if you’d like to try solving the contest problems yourself. (However, Codeforces has an editorial with explanations and solutions already posted, so I’m not giving anything away that isn’t already public.) I’m going to post my (unedited) code for each problem, but without all the imports and LANGUAGE extensions and whatnot; hopefully that stuff should be easy to infer.

Problem A – Regular Bracket Sequences

In this problem, we are given a number n and asked to produce any n distinct balanced bracket sequences of length 2n. I immediately just coded up a simple recursive function to generate all possible bracket sequences of length 2n, and then called take n on it. Thanks to laziness this works great. I missed that there is an even simpler solution: just generate the list ()()()()..., (())()()..., ((()))()..., i.e. where the kth bracket sequence starts with k nested pairs of brackets followed by n-k singleton pairs. However, I solved it in only four minutes anyway so it didn’t really matter!

readB = C.unpack >>> read

main = C.interact $
  C.lines >>> drop 1 >>> concatMap (readB >>> solve) >>> C.unlines

bracketSeqs 0 = [""]
bracketSeqs n =
  [ "(" ++ s1 ++ ")" ++ s2
  | k <- [0 .. n-1]
  , s1 <- bracketSeqs k
  , s2 <- bracketSeqs (n - k - 1)

solve n = map C.pack . take n $ bracketSeqs n

Problem B – Combinatorics Homework

In this problem, we are given numbers a, b, c, and m, and asked whether it is possible to create a string of a A’s, b B’s, and c C’s, such that there are exactly m adjacent pairs of equal letters. This problem requires doing a little bit of combinatorial analysis to come up with a simple Boolean expression in terms of a, b, c, and m; there’s not much to say about it from a Haskell point of view. You can refer to the editorial posted on Codeforces if you want to understand the solution.

readB = C.unpack >>> read

main = C.interact $
  C.lines >>> drop 1 >>> map (C.words >>> map readB >>> solve >>> bool "NO" "YES") >>> C.unlines

solve :: [Int] -> Bool
solve [a,b,c,m] = a + b + c - m >= 3 && m >= z - (x+y) - 1
    [x,y,z] = sort [a,b,c]

Problem C – Slay the Dragon

This problem was super annoying and I still haven’t solved it. The idea is that you have a bunch of “heroes”, each with a numeric strength, and there is a dragon described by two numbers: its attack level and its defense level. You have to pick one hero to fight the dragon, whose strength must be greater than or equal to the dragon’s defense; all the rest of the heroes will stay behind to defend your castle, and their combined strength must be greater than the dragon’s attack. This might not be possible, of course, so you can first spend money to level up any of your heroes, at a rate of one coin per strength point; the task is to find the minimum amount of money you must spend.

The problem hinges on doing some case analysis. It took me a good while to come up with something that I think is correct. I spent too long trying to solve it just by thinking hard; I really should have tried formal program derivation much earlier. It’s easy to write down a formal specification of the correct answer which involves looping over every hero and taking a minimum, and this can be manipulated into a form that doesn’t need to do any looping.

In the end it comes down to (for example) finding the hero with the smallest strength greater than or equal to the dragon’s defense, and the hero with the largest strength less than or equal to it (though one of these may not exist). The intended way to solve the problem is to sort the heroes by strength and use binary search; instead, I put all the heroes in an IntSet and used the lookupGE and lookupLE functions.

However, besides my floundering around getting the case analysis wrong at first, I got tripped up by two other things: first, it turns out that on the Codeforces judging hardware, Int is only 32 bits, which is not big enough for this problem! I know this because my code was failing on the third test case, and when I changed it to use Int64 instead of Int (which means I also had to switch to Data.Set instead of Data.IntSet), it failed on the sixth test case instead. The other problem is that my code was too slow: in fact, it timed out on the sixth test case rather than getting it wrong per se. I guess Data.Set and Int64 just have too much overhead.

Anyway, here is my code, which I think is correct, but is too slow.

data TC = TC { heroes :: ![Int64], dragons :: ![Dragon] }
data Dragon = Dragon { defense :: !Int64, attack :: !Int64 }

main = C.interact $
  runScanner tc >>> solve >>> map (show >>> C.pack) >>> C.unlines

tc :: Scanner TC
tc = do
  hs <- numberOf int64
  ds <- numberOf (Dragon <$> int64 <*> int64)
  return $ TC hs ds

solve :: TC -> [Int64]
solve (TC hs ds) = map fight ds
    heroSet = S.fromList hs
    total = foldl' (+) 0 hs
    fight (Dragon df atk) = minimum $
      [ max 0 (atk - (total - hero)) | Just hero <- [mheroGE] ]
      [ df - hero + max 0 (atk - (total - hero)) | Just hero <- [mheroLE]]
        mheroGE = S.lookupGE df heroSet
        mheroLE = S.lookupLE df heroSet

I’d like to come back to this later. Using something like vector to sort and then do binary search on the heroes would probably be faster, but vector is not supported on Codeforces. I’ll probably end up manually implementing binary search on top of something like Data.Array.Unboxed. Doing a binary search on an array also means we can get away with doing only a single search, since the two heroes we are looking for must be right next to each other in the array.

Edited to add: I tried creating an unboxed array and implementing my own binary search over it; however, my solution is still too slow. At this point I think the problem is the sorting. Instead of calling sort on the list of heroes, we probably need to implement our own quicksort or something like that over a mutable array. That doesn’t really sound like much fun so I’m probably going to forget about it for now.

Problem D – The Strongest Build

In this problem, we consider a set of k-tuples, where the value for each slot in a tuple is chosen from among a list of possible values unique to that slot (the values for a slot are given to us in sorted order). For example, perhaps the first slot has the possible values 1, 2, 3, the second slot has possible values 5, 8, and the third slot has possible values 4, 7, 16. In this case there would be 3 \times 2 \times 3 possible tuples, ranging from (1,5,4) up to (3,8,16). We are also given a list of forbidden tuples, and then asked to find a non-forbidden tuple with the largest possible sum.

If the list of slot options is represented as a list of lists, with the first list representing the choices for the first slot, and so on, then we could use sequence to turn this into the list of all possible tuples. Hence, a naive solution could look like this:

solve :: Set [Int] -> [[Int]] -> [Int]
solve forbidden =
  head . filter (`S.notMember` forbidden) . sortOn (Down . sum) . sequence

Of course, this is much too slow. The problem is that although k (the size of the tuples) is limited to at most 10, there can be up to 2 \cdot 10^5 choices for each slot (the choices themselves can be up to 10^8). The list of all possible tuples could thus be truly enormous; in theory, there could be up to (2 \cdot 10^5)^{10} \approx 10^{53}), and generating then sorting them all is out of the question.

We can think of the tuples as forming a lattice, where the children of a tuple t are all the tuples obtained by downgrading exactly one slot of t to the next smaller choice. Then the intended solution is to realize that the largest non-forbidden tuple must either be the top element of the lattice (the tuple with the maximum possible value for every slot), OR a child of one of the forbidden tuples (it is easy to see this by contradiction—any tuple which is not the child of a forbidden tuple has at least one parent which has a greater total value). So we can just iterate over all the forbidden tuples (there are at most 10^5), generate all possible children (at most 10) for each one, and take the maximum.

However, that’s not how I solved it! I started thinking from the naive solution above, and wondered whether there is a way to do sortOn (Down . sum) . sequence more efficiently, by interleaving the sorting and the generation. If it can be done lazily enough, then we could just search through the beginning of the generated ordered list of tuples for the first non-forbidden one, without having to actually generate the entire list. Indeed, this reminded me very much of Richard Bird’s implementation of the Sieve of Eratosthenes (see p. 11 of that PDF). The basic idea is to make a function which takes a list of choices for a slot, and a (recursively generated) list of tuples sorted by decreasing sum, and combines each choice with every tuple, merging the results so they are still sorted. However, the key is that when combining the best possible choice for the slot with the largest tuple in the list, we can just immediately return the resulting tuple as the first (best) tuple in the output list, without needing to involve it in any merging operation. This affords just enough laziness to get the whole thing off the ground. I’m not going to explain it in more detail than that; you can study the code below if you like.

I’m quite pleased that this worked, though it’s definitely an instance of me making things more complicated than necessary.

data TC = TC { slots :: [[Choice]], banned :: [[Int]] }

tc = do
  n <- int
  TC <$> (n >< (zipWith Choice [1 ..] <$> numberOf int)) <*> numberOf (n >< int)

main = C.interact $
  runScanner tc >>> solve >>> map (show >>> C.pack) >>> C.unwords

solve :: TC -> [Int]
solve TC{..} = choices . fromJust $ find ((`S.notMember` bannedSet) . choices) bs
    bannedSet = S.fromList banned
    revSlots = map reverse slots
    bs = builds revSlots

data Choice = Choice { index :: !Int, value :: !Int }

data Build = Build { strength :: !Int, choices :: [Int] }
  deriving (Eq, Show, Ord)

singletonBuild :: Choice -> Build
singletonBuild (Choice i v) = Build v [i]

mkBuild xs = Build (sum xs) xs

-- Pre: all input lists are sorted descending.
-- All possible builds, sorted in descending order of strength.
builds :: [[Choice]] -> [Build]
builds []     = []
builds (i:is) = chooseFrom i (builds is)

chooseFrom :: [Choice] -> [Build] -> [Build]
chooseFrom [] _  = []
chooseFrom xs [] = map singletonBuild xs
chooseFrom (x:xs) (b:bs) = addToBuild x b : mergeBuilds (map (addToBuild x) bs) (chooseFrom xs (b:bs))

addToBuild :: Choice -> Build -> Build
addToBuild (Choice i v) (Build s xs) = Build (v+s) (i:xs)

mergeBuilds xs [] = xs
mergeBuilds [] ys = ys
mergeBuilds (x:xs) (y:ys) = case compare (strength x) (strength y) of
  GT -> x : mergeBuilds xs (y:ys)
  _  -> y : mergeBuilds (x:xs) ys

Problems E and F

I didn’t even get to these problems during the contest; I spent too long fighting with problem C and implementing my overly complicated solution to problem D. I might attempt to solve them in Haskell too; if I do, I’ll write about them in another blog post!

by Brent at September 22, 2021 11:05 AM

September 21, 2021

Abhinav Sarkar

Implementing Co, a Small Interpreted Language With Coroutines #2: The Interpreter

In the previous post, we wrote the parser for Co, the small interpreted language we are building in this series of posts. The previous post was all about the syntax of Co. In this post we dive into the semantics of Co, and write an interpreter for its basic features.

This post was originally published on

This is the second post in a series of posts:

  1. Implementing Co #1: The Parser
  2. Implementing Co #2: The Interpreter
  3. Implementing Co #3: Continuations, Coroutines, and Channels

Previously, on …

Here’s a quick recap. The basic features of Co that we are aiming to implement in this post are:

  • Dynamic and strong typing.
  • Null, boolean, string and integer literals, and values.
  • Addition and subtraction arithmetic operations.
  • String concatenation operation.
  • Equality and inequality checks on booleans, strings and numbers.
  • Less-than and greater-than comparison operations on numbers.
  • Variable declarations, usage and assignments.
  • if and while statements.
  • Function declarations and calls, with support for recursion.
  • First class functions.
  • Mutable closures.

Note that some parts of code snippets in this post have been faded away. These are the part which add support for coroutines and channels. You can safely ignore these parts for now. We’ll go over them in the next post.

We represent the Co Abstract Syntax Tree (AST) as a pair of Haskell Algebraic Data Types (ADTs), one for Expressions:

data Expr
  = LNull
  | LBool Bool
  | LStr String
  | LNum Integer
  | Variable Identifier
  | Binary BinOp Expr Expr
  | Call Identifier [Expr]
  | Receive Expr
  deriving (Show, Eq)

type Identifier = String

data BinOp = Plus | Minus | Equals | NotEquals | LessThan | GreaterThan
  deriving (Show, Eq)

And another for Statements:

data Stmt
  = ExprStmt Expr
  | VarStmt Identifier Expr
  | AssignStmt Identifier Expr
  | IfStmt Expr [Stmt]
  | WhileStmt Expr [Stmt]
  | FunctionStmt Identifier [Identifier] [Stmt]
  | ReturnStmt (Maybe Expr)
  | YieldStmt
  | SpawnStmt Expr
  | SendStmt Expr Identifier
  deriving (Show, Eq)

type Program = [Stmt]

Also, program is the parser for Co programs. To parse code, run the program parser with the runParser function like this:

> runParser program "var x = 1 + s;"
Right [VarStmt "x" (Binary Plus (LNum 1) (Variable "s"))]

Now, off to the new stuff.

Running a Program

There are many ways to run a program. If the program is written in Machine Code, you can run it directly on the matching CPU. But machine code is too low-level, and writing programs in it is very tedious and error-prone. Thus, programmers prefer to write code in high-level programming languages, and turn it into machine code to be able to run it. Here’s where different ways of running code come in:

  • We can run the high-level code through a Compiler to turn it into machine code to be able to run it directly. Example: compiling C++ using GCC.
  • We can run the code through a compiler which turns it into a relatively lower-level programming language code, and then run that lower-level program through another compiler to turn it into machine code. Example: compiling Haskell into LLVM IR using GHC, which can then be run through the LLVM toolchain to generate machine code.
  • We can run the code through a Transpiler (also called Source-to-source compiler) to turn it into code in a programming language that is of similar level, and then run the resultant code with that language’s toolchain. Example: transpiling Purescript into JavaScript, and running it with node.js.
  • We can compile the source code to Bytecode and run the bytecode on a Virtual Machine. Example: Java virtual machine running Java bytecode compiled from Clojure source code by the Clojure compiler.
  • We can parse the code to an AST, and immediately execute the AST using an AST Interpreter. Example: PHP version 3, Bash. 1
  • We can also mix-and-match parts of the above options to create hybrids, like Just-in-time compilation to machine code within a virtual machine.

Many ways to run a program
Many ways to run a program
Many ways to run a program
Many ways to run a program

For running Co programs, we will implement an AST-walking interpreter. The interpreter implemented in this post will support only the basic features of Co. In the next post, we’ll extend it to support coroutines and channels.

The complete code for the interpreter is here. You can load it in GHCi using stack (by running stack co-interpreter.hs), and follow along while reading this article.

Runtime Values

An AST-walking interpreter takes an AST as its input, and recursively walks down the AST nodes, from top to bottom. While doing this, it evaluates expressions to runtime values, and executes the statements to do their effects.

The runtime values are things that can be passed around in the code during the program run time. Often called “first-class”, these values can be assigned to variables, passed as function arguments, and returned from functions. If Co were to support data structures like lists and maps, these values could be stored in them as well. The Value ADT below represents these values:

data Value
  = Null
  | Boolean Bool
  | Str String
  | Num Integer
  | Function Identifier [Identifier] [Stmt] Env
  | Chan Channel

Other than the usual values like null, booleans, strings, and numbers, we also have functions as first-class runtime values in Co. So, a function can be assigned to a variable, and passed as an argument to another function. Thus, Co supports higher-order functions.

We also write instances to show and check equality for these values:

instance Show Value where
  show = \case
    Null -> “null”
    Boolean b -> show b
    Str s -> s
    Num n -> show n
    Function name _ _ _ -> “function ” <> name
    Chan Channel {} -> “Channel”

instance Eq Value where
  Null == Null = True
  Boolean b1 == Boolean b2 = b1 == b2
  Str s1 == Str s2 = s1 == s2
  Num n1 == Num n2 = n1 == n2
  _ == _ = False

Note that only null, booleans, strings and numbers can be checked for equality in Co. Also, only values of same type can be equals. A string can never be equal to a number2.

So, how do we go about turning the expressions to values, and executing statements? Before learning that, we must take a detour into some theory of programming languages.

Readers familiar with the concepts of environments, scopes, closures and early returns can skip the next sections, and jump directly to the implementation.

Environment Model of Evaluation

Let’s say we have this little Co program to run:

var a = 2;
function twice(x) { return x + x; }

We need to evaluate twice(a) to a value to print it. One way to do that is to substitute variables for their values, quite literally. twice is a variable, value of which is a function. And a is another variable, with value 2. We can do repeated substitution to arrive at a resultant value like this:

=> print(twice(2));
=> print(2 + 2);
=> print(4);

This is called the Substitution model of evaluation. This works for the example we have above, and for a large set of programs3. However, it breaks down as soon as we add mutability to the mix:

var a = 2;
function incA() {
  var b = a + 1;
  return b;
a = 3;

Running this with the Co interpreter results in the output:


We can’t use the substitution model here because we can’t consider variables like a to be substitutable with single values anymore. Now, we must think of them more as places in which the values are stored. Also, the stored values may change over the lifetime of the program execution. We call this place where the variable values are stored, the Environment, and this understanding of program execution is called the Environment Model of Evaluation.

Value of a variable may change over time
Value of a variable may change over time
Value of a variable may change over time
Value of a variable may change over time

A pair of a variable’s name and its value at any particular time is called a Binding. An Environment is a collection of zero-or-more bindings. To fully understand environments, first we have to learn about scopes.


Let’s consider the twice function again:

function twice(x) { return x + x; }

Calling twice with different arguments prints different results. The function seems to forget the value of its parameter x after each call. This may feel very natural to programmers, but how does it really work? The answer is Scopes.

A scope is a region of the program lifetime during which a variable name-to-value binding is in effect. When the program execution enters a scope, the variables in that scope become defined and available to the executing code4. When the program execution exits the scope, the variables become undefined and inaccessible (also known as going out of scope).

Lexical scoping is a specific style of scoping where the structure of the program itself shows where a scope begins and ends5. Like most modern languages, Co is lexically scoped. A function in Co starts a new scope which extends over the entire function body, and the scope ends when the function ends. Functions are the only way of creating new scopes in Co6.

That’s how repeated invocation of functions don’t remember the values of their parameters across the calls. Every time a new call is started, a new scope is created with the parameter names bound to the value of the arguments of the call. And when the call returns, this new scope is destroyed.

Scopes can be enclosed within other scopes. In Co, this can be done by defining a function inside the body of another function. All programs have at least one scope, which is the program’s top-level scope, often called the global scope.

Scopes are intimately related to the environment. In fact, the structure of the environment is how scopes are implemented.

Scopes are implemented by the environment
Scopes are implemented by the environment
Scopes are implemented by the environment
Scopes are implemented by the environment

An environment can be thought of as a stack of frames, with one frame per enclosed scope. A frame contains zero-or-more bindings. The bindings in enclosed scopes (frames higher in the environment stack) hide the bindings (called shadowing) in enclosing scopes (frames lower in the environment stack). Program’s global scope is represented by the lowermost frame in the stack.

The above diagram shows the frames of the two calls to the twice function. The scope of the twice function is enclosed in the global scope. To find the value of a variable inside the function, the interpret first looks into the topmost frame that represents the scope of the twice function. If the binding is not found, then the interpreter goes down the stack of frames, and looks into the frame for the global scope.

What happens when a function body tries to access variables not defined in the function’s scope? We get Closures.


If a function body refers to variables not defined in the function’s scope, such variables are called Free Variables. In lexically scoped languages, the value of a free variable is determined from the scope in which the function is defined. A function along with the references to all its free variables, is called a Closure7.

Closures are prevalent in programming languages with first-class functions. Co—with its support for first-class functions—also supports closures. Closures in Co are mutable, meaning the values of the free variables of a function can change over time, and the changes are reflected in the behavior of the function8.

We already saw an example of closures earlier:

var a = 2;
function incA() {
  var b = a + 1;
  return b;
a = 3;

This is how the frames exist over time for the two invocations of the incA function:

a is a free variable of the function incA
a is a free variable of the function incA
a is a free variable of the function incA
a is a free variable of the function incA

Here, a is a free variable of the function incA. Its value is not present in the scope of incA, but is obtained from the global scope. When its value in the global scope changes later, the value returned by incA changes as well. In other words, incA and a together form a closure.

The following example demonstrates a closure with a mutable free variable and enclosed scopes:

function makeCounter(name) {
  var count = 0;
  function inc() {
    count = count + 1;
    print(name + " = " + count);
  return inc;

var countA = makeCounter("a");
var countB = makeCounter("b");


Here, both name and count are free variables referred in the function inc. While name is only read, count is changed in the body of inc.

Running the above code prints:

a = 1
a = 2
b = 1
a = 3

Note that the two functions countA and countB refer to two different instances of the count variable, and are not affected by each other. In other words, countA and countB are two different closures for the same function inc.

Now for one last thing to learn about before we jump to the implementation: early returns.

Early Returns

Statement oriented programming languages often allow returning from a function before the entire function is done executing. This is called an Early return. We saw an example of this in the fibonacci function in the previous post:

function fib(n) {
  if (n < 2) {
    return n;
  return fib(n - 2)
    + fib(n - 1);

In the above code, when the input n is less than 2, the code returns early from the function at the line 3.

Expression oriented programming languages, like Haskell, have no early returns. Every function is an expression in Haskell, and has to be evaluated entirely9 to get back a value. Since our AST-walking interpreter itself is written in Haskell, we need to figure out how to support early returns in the Co code being interpreted. The interpreter should be able to stop evaluating at an AST node representing a return statement, and jump to the node representing the function’s caller.

One way to implement this is Exceptions. Exceptions let us abort the execution of code at any point of execution, and resume from some other point in the lower in the function call stack. Although Haskell supports exceptions as we know them from languages like Java and Python, it also supports exceptions as values using the Error monad. That’s what we will leverage for implementing early returns in our interpreter.

Finally, we are really to start implementing the interpreter.

The Interpreter

The interpreter is implemented as a Haskell newtype over a stack of monad using the monad transformers and typeclasses from the mtl library:

newtype Interpreter a = Interpreter
  { runInterpreter ::
      ExceptT Exception
            (Either Exception ())
            (StateT InterpreterState IO))
    ( Functor,
      MonadBase IO,
      MonadState InterpreterState,
      MonadError Exception,

From bottom to top, the monad stack is comprised of:

  1. the IO monad to be able to print to the console,
  2. the State monad transformer to track the state of the interpreter, and
  3. the Except monad transformer to propagate exceptions while interpreting the code.

We model the environment as Map of variable names to IORefs of values:

type Env = Map.Map Identifier (IORef Value)

The interpreter state contains the environment used for interpretation. The state changes as variables come in and go out of scopes.

type Queue a = IORef (Seq.Seq a)

data InterpreterState = InterpreterState
  { isEnv :: Env,
    isCoroutines :: Queue (Coroutine ())

newInterpreterState :: IO InterpreterState
newInterpreterState = do
  coroutines <- newIORef Seq.empty
  return $ InterpreterState Map.empty coroutines

The immutable nature of Map and the mutable nature of IORef allow us to correctly model scopes, frames and closures in Co, as we see in the later sections of this post.

When trying to interpret wrong code like 1 + true, the interpreter throws runtime errors. We roll these errors along with early returns into an ADT for exceptions:

data Exception
  = Return Value
  | RuntimeError String
  | CoroutineQueueEmpty

That’s it for defining the types for the interpreter. Next, we implement the functions to interpret Co programs, starting with functions to work with environments.

Manipulating Environments

In Co, variables must be initialized when being defined. Additionally, only the already defined variables can be referenced or assigned.

To define a new variable, we create a new IORef with the variable’s value, insert it in the current environment map with the variable name as the key, and replace the interpreter state with the new environment map.

defineVar :: Identifier -> Value -> Interpreter ()
defineVar name value = do
  env <- State.gets isEnv
  env' <- defineVarEnv name value env
  setEnv env'

defineVarEnv :: Identifier -> Value -> Env -> Interpreter Env
defineVarEnv name value env = do
  valueRef <- newIORef value
  return $ Map.insert name valueRef env

setEnv :: Env -> Interpreter ()
setEnv env = State.modify' $ \is -> is {isEnv = env}

We extract two helper functions defineVarEnv and setEnv that we reuse in later sections.

To lookup and assign a variable, we get the current environment, lookup the IORef in the map by the variable’s name, and then read the IORef for lookup, or write the new value to it for assignment.

lookupVar :: Identifier -> Interpreter Value
lookupVar name =
  State.gets isEnv >>= findValueRef name >>= readIORef

assignVar :: Identifier -> Value -> Interpreter ()
assignVar name value =
  State.gets isEnv >>= findValueRef name >>= flip writeIORef value

We use the helper function findValueRef to lookup a variable name in the environment map. It throws a runtime error if the variable is not already defined.

findValueRef :: Identifier -> Env -> Interpreter (IORef Value)
findValueRef name env =
  case Map.lookup name env of
    Just ref -> return ref
    Nothing -> throw $ "Unknown variable: " <> name

throw :: String -> Interpreter a
throw = throwError . RuntimeError

These functions are enough for us to implement the evaluation of expressions and execution of statements.

Evaluating Expressions

Co expressions are represented by the Expr ADT. The evaluate function below shows how they are evaluated to runtime values.

evaluate :: Expr -> Interpreter Value
evaluate = \case
  LNull -> pure Null
  LBool bool -> pure $ Boolean bool
  LStr str -> pure $ Str str
  LNum num -> pure $ Num num
  Variable v -> lookupVar v
  binary@Binary {} -> evaluateBinaryOp binary
  call@Call {} -> evaluateFuncCall call
  Receive expr ->
    evaluate expr >>= \case
      Chan channel -> channelReceive channel
      val -> throw $ “Cannot recieve from a non-channel: ” <> show val

Literals null, booleans, strings, and numbers evaluate to themselves. Variables are looked up from the environment using the lookupVar function we wrote earlier. Binary operations and function call expressions are handled by helper functions explained below.

evaluateBinaryOp :: Expr -> Interpreter Value
evaluateBinaryOp ~(Binary op leftE rightE) = do
  left <- evaluate leftE
  right <- evaluate rightE
  let errMsg msg = msg <> ": " <> show left <> " and " <> show right
  case (op, left, right) of
    (Plus, Num n1, Num n2) -> pure $ Num $ n1 + n2
    (Plus, Str s1, Str s2) -> pure $ Str $ s1 <> s2
    (Plus, Str s1, _) -> pure $ Str $ s1 <> show right
    (Plus, _, Str s2) -> pure $ Str $ show left <> s2
    (Plus, _, _) -> throw $ errMsg "Cannot add or append"

    (Minus, Num n1, Num n2) -> pure $ Num $ n1 - n2
    (Minus, _, _) -> throw $ errMsg "Cannot subtract"

    (LessThan, Num n1, Num n2) -> pure $ Boolean $ n1 < n2
    (LessThan, _, _) -> throw $ errMsg "Cannot compare non-numbers"
    (GreaterThan, Num n1, Num n2) -> pure $ Boolean $ n1 > n2
    (GreaterThan, _, _) -> throw $ errMsg "Cannot compare non-numbers"

    (Equals, _, _) -> pure $ Boolean $ left == right
    (NotEquals, _, _) -> pure $ Boolean $ left /= right

To evaluate a binary operation, first we recursively evaluate its left and right operands by calling evaluate on them. Then, depending on the operation and types of the operands, we do different things.

The + operation can be used to either add two numbers, or to concat two operands when one or both of them are strings. In all other cases, it throws runtime errors.

The -, >, and < operations can be invoked only on numbers. Other cases throw runtime errors.

The == and != operations run corresponding Haskell operations on their operands.

That’s all for evaluating binary operations. Next, let’s look at how to execute statements. We come back to evaluating function calls after that.

Executing Statements

Co statements are represented by the Stmt ADT. The execute function below uses a case expression to execute the various types of statements in different ways:

execute :: Stmt -> Interpreter ()
execute = \case
  ExprStmt expr -> void $ evaluate expr
  VarStmt name expr -> evaluate expr >>= defineVar name
  AssignStmt name expr -> evaluate expr >>= assignVar name
  IfStmt expr body -> do
    cond <- evaluate expr
    when (isTruthy cond) $
      traverse_ execute body
  while@(WhileStmt expr body) -> do
    cond <- evaluate expr
    when (isTruthy cond) $ do
      traverse_ execute body
      execute while
  ReturnStmt mExpr -> do
    mRet <- traverse evaluate mExpr
    throwError . Return . fromMaybe Null $ mRet
  FunctionStmt name params body -> do
    env <- State.gets isEnv
    defineVar name $ Function name params body env
  YieldStmt -> yield
  SpawnStmt expr -> spawn (void $ evaluate expr)
  SendStmt expr chan -> do
    val <- evaluate expr
    evaluate (Variable chan) >>= \case
      Chan channel -> channelSend val channel
      val’ -> throw $ “Cannot send to a non-channel: ” <> show val’
    isTruthy = \case
      Null -> False
      Boolean b -> b
      _ -> True

Expressions in expression statements are evaluated by calling evaluate on them, and the resultant values are discarded.

For variable definition and assignment statements, first we evaluate the value expressions, and then define or assign variables with the given variable names and the resultant values.

For if statements, first we evaluate their conditions, and if conditions yield truthy10 values, we recursively execute the statement bodies. while statements are executed in a similar fashion, except we recursively execute the while statements again after executing their bodies.

For return statements, we evaluate their optional return value expressions, and then throw the resultant values as exceptions wrapped with the Return constructor.

Execution of function statements is more interesting. First thing that we do is to capture the current environment from the interpreter state. Then we define a new variable11 with the function’s name and a runtime function value that encapsulates the function’s name, parameter names, and body statements, as well as, the captured environment. This is how closures record the values of functions’ free variables from their definition contexts.

In the next section, we see how the captured environments and returns as exceptions are used to evaluate function calls.

Evaluating Function Calls

The capability of defining and calling functions is the cornerstone of abstraction in programming languages. In Co, functions are first-class, support recursion12, and are also the means of implementing scopes and closures. Hence, this section is the most important and involved one.

We start by matching on the called function’s name to support built-in functions like print:

evaluateFuncCall :: Expr -> Interpreter Value
evaluateFuncCall ~(Call funcName argEs) = case funcName of
  “newChannel” -> Chan <$> newChannel
  “print” -> executePrint argEs
  funcName -> lookupVar funcName >>= \case
    func@Function {} -> evaluateFuncCall’ func argEs
    val -> throw $ “Cannot call a non-function: ” <> show val
    executePrint = \case
      [expr] -> evaluate expr >>= liftIO . print >> return Null
      _ -> throw “print must be called with exactly one argument”

If the function name is print, we evaluate the argument and print it using Haskell’s print function. Else, we find the user-defined function by looking up in the environment, failing which we throw a runtime error. If we find a function, we evaluate the call with the helper function evaluateFuncCall'. But before diving into it, let’s take a look at how the world looks from inside a function.

function makeGreeter(greeting) {
  function greeter(name) {
    var say = greeting + " " + name;
  return greeter;

var hello = makeGreeter("hello");
var namaste = makeGreeter("namaste");


In the above Co code, the function greeter has a free variable greeting, a bound parameter name, and a local variable say. Upon executing the code with the interpreter, we get the following output:

hello Arthur
namaste Ford

The output makes sense when we understand the variables hello and namaste are closures over the function greeter. The environment seen from inside greeter when it is being executed is a mix of the scope (and hence, the environment) it is defined in, and the scope it is called in.

Function environment is a mix of its caller and definition environments
Function environment is a mix of its caller and definition environments
Function environment is a mix of its caller and definition environments
Function environment is a mix of its caller and definition environments

More specifically, the free variables come from the definition scope, and the parameters come from the caller scope. Local variables can be derived from any combinations of free variables and parameters. With this understanding, let’s see how we evaluate function calls:

evaluateFuncCall' :: Value -> [Expr] -> Interpreter Value
    ~func@(Function funcName params body funcDefEnv) argEs = do
  funcCallEnv <- State.gets isEnv
  retVal <- executeBody funcCallEnv
  setEnv funcCallEnv
  return retVal
    checkArgCount = when (length argEs /= length params) $
      throw $ funcName <> " call expected " <> show (length params)
              <> " argument(s) but received " <> show (length argEs)

    setupFuncEnv = do
      args <- traverse evaluate argEs
      funcDefEnv' <- defineVarEnv funcName func funcDefEnv
      setEnv funcDefEnv'
      for_ (zip params args) $ uncurry defineVar

    executeBody funcCallEnv =
      (traverse_ execute body >> return Null) `catchError` \case
        Return val -> return val
        err -> setEnv funcCallEnv >> throwError err

Let’s go over the above code, step by step:

  1. evaluateFuncCall' is called with the function to evaluate. We get access to the function’s name, its parameter names, body statements, and the environment it is defined in. We also get the argument expressions for the function call. (Line 2–3)
  2. First, we check that the count of arguments is same as the count of the function parameter in checkArgCount, failing which we throw a runtime error. (Line 4, 11–13)
  3. Then, we capture the current environment from the interpreter state. This is the function’s caller’s environment. (Line 5)
  4. Next, we set up the environment in which the function will be executed (line 6). In setupFuncEnv:
    1. We evaluate the argument expressions in the current (caller’s) environment13. (Line 16)
    2. We bind the callee function itself to its name in its own environment. This lets our function to recursively call itself. (Line 17)
    3. We set the current environment in the interpreter state to the functions’s environment. (Line 18)
    4. We bind the argument values to their parameter names in the function’s environment. This lets the function body access the arguments being called with. (Line 19)
  5. With the function environment set up, we execute the function body in executeBody (line 7):
    1. We execute each statement in the body, and return null in case there was no explicit return in the function. (Line 22)
    2. If the body contains a return statement, or if its execution throws a runtime error, we handle the exception in the catchError case statement.
      1. For return, we pass along the return value. (Line 23)
      2. For a runtime error, first we set the current environment back to the caller’s environment that we captured in step 3, and then we throw the error. The error is eventually handled in the interpret function described in the next section. (Line 24)
    3. We capture the value returned from executing the body. (Line 7)
  6. We set the current environment back to the caller’s environment that we captured in step 3. (Line 8)
  7. We return the captured return value from evaluateFuncCall'. The function call is complete now. (Line 9)

Curious readers may wonder, why do we need to use State monad, Maps, and IORefs together, when all of them do similar work of storing and mutating variables? Because, together they let us implement function calls, scopes and closures, as described below:

  1. State monad lets us swap the current environment for a function’s definition environment when a function call is made, and to restore the calling environment after the call is complete.
  2. Immutable maps are perfect for implementing scopes. Adding variables in an immutable map returns a modified map without changing the original map. This lets us shadow variables defined in outer scopes when entering inner scopes, while also being able to easily restore the shadowed variables by just restoring the original map after the inner scopes end. There is no need to use a stack of mutable maps, which is how environments are generally implemented in interpreters which do not use immutable maps.
  3. Lastly, putting IORefs as values of immutable maps lets us implement mutable closures. All closures of same function share the same references to the IORefs. This allows variable mutations made from one closure to be visible to all others. If we had used just immutable maps, changes made to variable values would not propagate between closures because of immutability.

So that’s how function calls—the most crucial part of the interpreter—work. That completes the guts of our interpreter for the basic features of Co. In the next and last section, we put everything together.

Interpreting a Program

We are down to the last step. We interpret a program returned from the parser written in the previous post to run it.

interpret :: Program -> IO (Either String ())
interpret program = do
  state <- newInterpreterState
  retVal <- flip evalStateT state
    . flip runContT return
    . runExceptT
    . runInterpreter
    $ (traverse_ execute program <* awaitTermination)
  case retVal of
    Left (RuntimeError err) -> return $ Left err
    Left (Return _) -> return $ Left “Cannot return for outside functions”
    Left CoroutineQueueEmpty -> return $ Right ()
    Right _ -> return $ Right ()

We run the list of statements in the program by running the execute function on them. Then we run the monad transformer stack, layer by layer, to get the return value. Finally, we case match on the return value to catch errors, and we are done.

We package the parser and the interpreter together to create the runFile function that takes a file path, reads and parses the file, and then interprets the AST:

runFile :: FilePath -> IO ()
runFile file = do
  code <- readFile file
  case runParser program code of
    Left err -> hPutStrLn stderr err
    Right program -> interpret program >>= \case
      Left err -> hPutStrLn stderr $ "ERROR: " <> err
      _ -> return ()

Finally, we can run the interpreter on the Co files:

> runFile ""

That’s all for now. We implemented the interpreter for the basic features for Co, and learned about how function calls, scopes and closures work. In the next part, we’ll extend our interpreter to add support for coroutines and channels in Co.

The full code for the interpreter can be seen here. You can discuss this post on lobsters, r/haskell, discourse, twitter or in the comments below.

  1. It’s hard to find examples of real-world programming languages that are run with AST interpreters. This is because AST interpreters are too slow for real-world usage. However, they are the easiest to understand and implement, and hence are widely using in teaching programming languages theory.↩︎

  2. This is called Strong typing in programming languages parlance. JavaScript, on the other hand, is a weakly typed language. In JavaScript, 1 == '1' evaluates to true, whereas in Co, it evaluates to false.↩︎

  3. The property of being able to substitute expressions for their corresponding values without changing the meaning of the program is called Referential transparency. Pure functions—like twice here—that do not have any side-effects are referentially transparent.↩︎

  4. I’m being a little hand-wavy here because most programmers have at least an intuitive understanding of scopes. Read literature for accurate details.↩︎

  5. This is in contrast to Dynamic scoping where the a variable’s scope is essentially global, and is defined by function’s execution context instead of definition context, as in lexical scoping.↩︎

  6. Blocks are another widely used structure that support lexical scoping. Co doesn’t have blocks in the interest of simplicity of implementation.↩︎

  7. The function is said to close its free variables over its closure. Hence, the name Closure.↩︎

  8. Some programming languages like Java support a limited version of closures, which require values of the free variables of functions to not change over time.↩︎

  9. Well, not entirely, because Haskell is a lazily evaluated language.↩︎

  10. In Co, only null and false evaluate to false. All other values evaluate to true. This is implemented by the isTruthy function.↩︎

  11. Functions are just variables in Co. That is to say, functions definitions and variable definitions share the same namespace. This is how it works in many programming languages like JavaScript and Python. But some languages like Common Lisp have separate namespaces for functions and variables.↩︎

  12. Co does not support mutual recursion though. This is because a function in Co only sees the bindings done before its own definition. This can be fixed by either adding a special syntax for mutually recursive functions, or by hositing all the bindings in a scope to the top of the scope, like how JavaScript does.↩︎

  13. Evaluating function arguments before the function body is called the Strict evaluation strategy. Most of the modern programming languages work this way, for example, Java, Python, JavaScript, Ruby etc. This is in contrast to Non-strict evaluation in programming languages like Haskell, where the arguments to functions are evaluated only when their values are needed in the function bodies.↩︎

If you liked this post, please leave a comment.

by Abhinav Sarkar ( at September 21, 2021 12:00 AM

September 20, 2021

FP Complete

Combining Axum, Hyper, Tonic, and Tower for hybrid web/gRPC apps: Part 4

This is the fourth and final post in a series on combining web and gRPC services into a single service using Tower, Hyper, Axum, and Tonic. The full four parts are:

  1. Overview of Tower
  2. Understanding Hyper, and first experiences with Axum
  3. Demonstration of Tonic for a gRPC client/server
  4. Today's post: How to combine Axum and Tonic services into a single service

Single port, two protocols

That heading is a lie. Both an Axum web application and a gRPC server speak the same protocol: HTTP/2. It may be more fair to say they speak different dialects of it. But importantly, it's trivially easy to look at a request and determine whether it wants to talk to the gRPC server or not. gRPC requests will all include the header Content-Type: application/grpc. So our final step today is to write something that can accept both a gRPC Service and a normal Service, and return one unified service. Let's do it! For reference, complete code is in src/bin/

Let's start off with our main function, and demonstrate what we want this thing to look like:

async fn main() {
    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));

    let axum_make_service = axum::Router::new()
        .route("/", axum::handler::get(|| async { "Hello world!" }))

    let grpc_service = tonic::transport::Server::builder()

    let hybrid_make_service = hybrid(axum_make_service, grpc_service);

    let server = hyper::Server::bind(&addr).serve(hybrid_make_service);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);

We set up simplistic axum_make_service and grpc_service values, and then use the hybrid function to combine them into a single service. Notice the difference in those names, and the fact that we called into_make_service for the former and into_service for the latter. Believe it or not, that's going to cause us a lot of pain very shortly.

Anyway, with that yet-to-be-explained hybrid function, spinning up a hybrid server is a piece of cake. But the devil's in the details!

Also: there are simpler ways of going about the code below using trait objects. I avoided any type erasure techniques, since (1) I thought the code was a bit clearer this way, and (2) it turns into a nicer tutorial in my opinion. The one exception is that I am using a trait object for errors, since Hyper itself does so, and it simplifies the code significantly to use the same error representation across services.

Defining hybrid

Our hybrid function is going to return a HybridMakeService value:

fn hybrid<MakeWeb, Grpc>(make_web: MakeWeb, grpc: Grpc) -> HybridMakeService<MakeWeb, Grpc> {
    HybridMakeService { make_web, grpc }

struct HybridMakeService<MakeWeb, Grpc> {
    make_web: MakeWeb,
    grpc: Grpc,

I'm going to be consistent and verbose with the type variable names throughout. Here, we have the type variables MakeWeb and Grpc. This reflects the difference between what Axum and Tonic provide from an API perspective. We'll need to provide Axum's MakeWeb with connection information in order to get the request-handling Service. With Grpc, we won't have to do that.

In any event, we're ready to implement our Service for HybridMakeService:

impl<ConnInfo, MakeWeb, Grpc> Service<ConnInfo> for HybridMakeService<MakeWeb, Grpc>
    MakeWeb: Service<ConnInfo>,
    Grpc: Clone,
    // ...

We have the two expected type variables MakeWeb and Grpc, as well as ConnInfo, to represent whatever connection information we're given. Grpc won't care about that at all, but the ConnInfo must match up with what MakeWeb is receiving. Therefore, we have the bound MakeWeb: Service<ConnInfo>. The Grpc: Clone bound will make sense shortly.

When we receive an incoming connection, we'll need to do two things:

  • Get a new Service from MakeWeb. Doing this may happen asynchronously, and may have some an error.
    • SIDE NOTE If you remember the actual implementation of Axum, we know for a fact that neither of these are true. Getting a Service from an Axum IntoMakeService will always succeed, and never does any async work. But there are no APIs in Axum exposing this fact, so we're stuck behind the Service API.
  • Clone the Grpc we already have.

Once we have the new Web Service and the cloned Grpc, we'll wrap these up into a new struct, HybridService. We're also going to need some help to perform the necessary async actions, so we'll create a new helper Future type. This all looks like:

type Response = HybridService<MakeWeb::Response, Grpc>;
type Error = MakeWeb::Error;
type Future = HybridMakeServiceFuture<MakeWeb::Future, Grpc>;

fn poll_ready(
    &mut self,
    cx: &mut std::task::Context,
) -> std::task::Poll<Result<(), Self::Error>> {

fn call(&mut self, conn_info: ConnInfo) -> Self::Future {
    HybridMakeServiceFuture {
        grpc: Some(self.grpc.clone()),

Note that we're deferring to self.make_web to say it's ready and passing along its errors. Let's tie this piece off by looking at HybridMakeServiceFuture:

struct HybridMakeServiceFuture<WebFuture, Grpc> {
    web_future: WebFuture,
    grpc: Option<Grpc>,

impl<WebFuture, Web, WebError, Grpc> Future for HybridMakeServiceFuture<WebFuture, Grpc>
    WebFuture: Future<Output = Result<Web, WebError>>,
    type Output = Result<HybridService<Web, Grpc>, WebError>;

    fn poll(self: Pin<&mut Self>, cx: &mut std::task::Context) -> Poll<Self::Output> {
        let this = self.project();
        match this.web_future.poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(Err(e)) => Poll::Ready(Err(e)),
            Poll::Ready(Ok(web)) => Poll::Ready(Ok(HybridService {
                grpc: this.grpc.take().expect("Cannot poll twice!"),

We need to pull in pin_project to allow us to project the pinned web future inside our poll implementation. (If you're not familiar with pin_project, don't worry, we'll describe things later on with HybridFuture.) When we poll web_future, we could end up in one of three states:

  • Pending: the MakeWeb isn't ready, so we aren't ready either
  • Ready(Err(e)): the MakeWeb failed, so we pass along the error
  • Ready(Ok(web)): the MakeWeb is successful, so package up the new web value with the grpc value

There's some funny business with that this.grpc.take() to get the cloned Grpc value out of the Option. Futures have an invariant that, once they return Ready, they cannot be polled again. Therefore, it's safe to assume that take will only ever be called once. But all of this pain could be avoided if Axum exposed an into_service method instead.


The previous types will ultimately produce a HybridService. Let's look at what that is:

struct HybridService<Web, Grpc> {
    web: Web,
    grpc: Grpc,

impl<Web, Grpc, WebBody, GrpcBody> Service<Request<Body>> for HybridService<Web, Grpc>
    Web: Service<Request<Body>, Response = Response<WebBody>>,
    Grpc: Service<Request<Body>, Response = Response<GrpcBody>>,
    Web::Error: Into<Box<dyn std::error::Error + Send + Sync + 'static>>,
    Grpc::Error: Into<Box<dyn std::error::Error + Send + Sync + 'static>>,
    // ...

This HybridService will take Request<Body> as input. The underlying Web and Grpc will also take Request<Body> as input, but they'll produce slightly different output: either Response<WebBody> or Response<GrpcBody>. We're going to need to somehow unify those body representations. As mentioned above, we're going to use trait objects for error handling, so no unification there is necessary.

type Response = Response<HybridBody<WebBody, GrpcBody>>;
type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
type Future = HybridFuture<Web::Future, Grpc::Future>;

The associated Response type is going to be a Response<...> as well, but its body is going to be the HybridBody<WebBody, GrpcBody> type. We'll get to that later. Similarly, we have two different Futures that may get called, depending on the kind of request. We need to unify over that with a HybridFuture type.

Next, let's look at poll_ready. We need to check for both Web and Grpc being ready for a new request. And each check can result in one of three cases: Pending, Ready(Err), or Ready(Ok). This function is all about pattern matching and unifying the error representation using .into():

fn poll_ready(
    &mut self,
    cx: &mut std::task::Context<'_>,
) -> std::task::Poll<Result<(), Self::Error>> {
    match self.web.poll_ready(cx) {
        Poll::Ready(Ok(())) => match self.grpc.poll_ready(cx) {
            Poll::Ready(Ok(())) => Poll::Ready(Ok(())),
            Poll::Ready(Err(e)) => Poll::Ready(Err(e.into())),
            Poll::Pending => Poll::Pending,
        Poll::Ready(Err(e)) => Poll::Ready(Err(e.into())),
        Poll::Pending => Poll::Pending,

And finally, we can see call, where the real logic we're trying to accomplish lives. This is where we get to look at the request and determine where to route it:

fn call(&mut self, req: Request<Body>) -> Self::Future {
    if req.headers().get("content-type").map(|x| x.as_bytes()) == Some(b"application/grpc") {
    } else {

Amazing. All of this work for essentially 5 lines of meaningful code!


That's it, we're at the end! The final type we're going to analyze in this series is HybridFuture. (There's also a HybridBody type, but it's similar enough to HybridFuture that it doesn't warrant its own explanation.) The struct's definition is:

#[pin_project(project = HybridFutureProj)]
enum HybridFuture<WebFuture, GrpcFuture> {
    Web(#[pin] WebFuture),
    Grpc(#[pin] GrpcFuture),

Like before, we're using pin_project. This time, let's explore why. The interface for the Future trait requires pinned pointers in memory. Specifically, the first argument to poll is self: Pin<&mut Self>. Rust itself never gives any guarantees about object permanence, and that's absolutely critical to writing an async runtime system.

The poll method on HybridFuture is therefore going to receive an argument of type Pin<&mut HybridFuture>. The problem is that we need to call the poll method on the underlying WebBody or GrpcBody. Assuming we have the Web variant, the problem we face is that pattern matching on HybridFuture will give us a &WebFuture or &mut WebFuture. It won't give us a Pin<&mut WebFuture>, which is what we need!

pin_project makes a projected data type, and provides a method .project() on the original that gives us those pinned mutable references instead. This allows us to implement the Future trait for HybridFuture correctly, like so:

impl<WebFuture, GrpcFuture, WebBody, GrpcBody, WebError, GrpcError> Future
    for HybridFuture<WebFuture, GrpcFuture>
    WebFuture: Future<Output = Result<Response<WebBody>, WebError>>,
    GrpcFuture: Future<Output = Result<Response<GrpcBody>, GrpcError>>,
    WebError: Into<Box<dyn std::error::Error + Send + Sync + 'static>>,
    GrpcError: Into<Box<dyn std::error::Error + Send + Sync + 'static>>,
    type Output = Result<
        Response<HybridBody<WebBody, GrpcBody>>,
        Box<dyn std::error::Error + Send + Sync + 'static>,

    fn poll(self: Pin<&mut Self>, cx: &mut std::task::Context) -> Poll<Self::Output> {
        match self.project() {
            HybridFutureProj::Web(a) => match a.poll(cx) {
                Poll::Ready(Ok(res)) => Poll::Ready(Ok(,
                Poll::Ready(Err(e)) => Poll::Ready(Err(e.into())),
                Poll::Pending => Poll::Pending,
            HybridFutureProj::Grpc(b) => match b.poll(cx) {
                Poll::Ready(Ok(res)) => Poll::Ready(Ok(,
                Poll::Ready(Err(e)) => Poll::Ready(Err(e.into())),
                Poll::Pending => Poll::Pending,

We unify together the successful response bodies with the HybridBody enum and use a trait object for error handling. And now we're presenting a single unified type for both types of requests. Huzzah!


Thank you dear reader for getting through these posts. I hope it was helpful. I definitely felt more comfortable with the Tower/Hyper ecosystem after diving into these details like this. Let's sum up some highlights from this series:

  • Tower provides a Rusty interface called Service for async functions from inputs to outputs, or requests to responses, which may fail
    • Don't forget, there are two levels of async behavior in this interface: checking whether the Service is ready and then waiting for it to complete processing
  • HTTP itself necessitates two levels of async functions: a type InnerService = Request -> IO Response for individual requests, and type OuterService = ConnectionInfo -> IO InnerService for the overall connection
  • Hyper provides a concrete server implementation that can accept things that look like OuterService and run them
    • It uses a lot of traits, some of which are not publicly exposed, to generalize
    • It provides significant flexibility in the request and response body representation
    • The helper functions service_fn and make_service_fn are a common way to create the two levels of Service necessary
  • Axum is a lightweight framework sitting on top of Hyper, and exposing a lot of its interface
  • gRPC is an HTTP/2 based protocol which can be hosted via Hyper using the Tonic library
  • Dispatching between an Axum service and gRPC is conceptually easy: just check the content-type header to see if something is a gRPC request
  • But to make that happen, we need a bunch of helper "hybrid" types to unify the different types between Axum and Tonic
  • A lot of the time, you can get away with trait objects to enable type erasure, but hybridizing Either-style enums work as well
    • While they're more verbose, they may also be clearer
    • There's also a potential performance gain by avoiding dynamic dispatch

If you want to review it, remember that a complete project is available on GitHub at

Finally, some more subjective takeaways from me:

  • I'm overall liking Axum, and I'm already using it for a new client project.
  • I do wish it was a little higher level, and that the type errors weren't quite as intimidating. I think there may be some room in this space for more aggressive type erasure-focused frameworks, exchanging a bit of runtime performance for significantly simpler ergonomics.
  • I'm also looking at rewriting our Zehut product to leverage Axum. So far, it's gone pretty well, but other responsibilities have taken me off of that work for the foreseeable future. And there are some painful compilation issues to be aware of.
  • I do miss strongly typed routes, but overall I'd rather use something like Axum than push farther with routetype. In the future, though, I may look into providing some routetype/axum bridge.

If this kind of content was helpful, and you're interested in more in the future, please consider subscribing to our blog. Let me know (on Twitter or elsewhere) if you have any requests for additional content like this.

If you're looking for more Rust content, check out:

September 20, 2021 12:00 AM

September 17, 2021

Brent Yorgey

Automatically updated, cached views with lens

Recently I discovered a nice way to deal with records where certain fields of the record cache some expensive function of other fields, using the lens library. I very highly doubt I am the first person to ever think of this, but I don’t think I’ve seen it written down anywhere. I’d be very happy to be learn of similar approaches elsewhere.

The problem

Suppose we have some kind of record data structure, and an expensive-to-calculate function which computes some kind of “view”, or summary value, for the record. Like this:

data Record = Record
  { field1 :: A, field2 :: B, field3 :: C }

expensiveView :: A -> B -> C -> D
expensiveView = ...

(Incidentally, I went back and forth on whether to put real code or only pseudocode in this post; in the end, I decided on pseudocode. Hopefully it should be easy to apply in real situations.)

If we need to refer to the summary value often, we might like to cache the result of the expensive function in the record:

data Record = Record
  { field1 :: A, field2 :: B, field3 :: C, cachedView :: D }

expensiveView :: A -> B -> C -> D
expensiveView = ...

However, this has several drawbacks:

  1. Every time we produce a new Record value by updating one or more fields, we have to remember to also update the cached view. This is easy to miss, especially in a large codebase, and will most likely result in bugs that are very difficult to track down.

  2. Actually, it gets worse: what if we already have a large codebase that is creating updated Record values in various places? We now have to comb through the codebase looking for such places and modifying them to update the cachedExpensive field too. Then we cross our fingers and hope we didn’t miss any.

  3. Finally, there is nothing besides comments and naming conventions to prevent us from accidentally modifying the cachedExpensive field directly.

The point is that our Record type now has an associated invariant, and invariants which are not automatically enforced by the API and/or type system are Bad ™.

Lens to the rescue

If you don’t want to use lens, you can stop reading now. (Honestly, given the title, I’m not even sure why you read this far.) In my case, I was already using it heavily, and I had a lightbulb moment when I realized how I could leverage it to add a safe cached view to a data type without modifying the rest of my codebase at all!

The basic idea is this:

  1. Add a field to hold the cached value as before.
  2. Don’t use lens’s TemplateHaskell utilites to automatically derive lenses for all the fields. Instead, declare them manually, such that they automatically update the cached field on every set operation.
  3. For the field with the cached value itself, declare a Getter, not a Lens.
  4. Do not export the constructor or field projections for your data type; export only the type and the lenses.

In pseudocode, it looks something like this:

module Data.Record
  (Record, field1, field2, field3, cachedView)

import Control.Lens

data Record = Record
  { _field1 :: A, _field2 :: B, _field3 :: C, _cachedView :: D }

expensiveView :: A -> B -> C -> D
expensiveView = ...

recache :: Record -> Record
recache r = r { _cachedView = expensiveView (_field1 r) (_field2 r) (_field3 r) }

cachingLens :: (Record -> a) -> (Record -> a -> Record) -> Lens' Record a
cachingLens get set = lens get (\r a -> recache $ set r a)

field1 :: Lens' Record A
field1 = cachingLens _field1 (\r x -> r { _field1 = x })

field2 :: Lens' Record B
field2 = cachingLens _field2 (\r x -> r { _field2 = x })

field3 :: Lens' Record C
field3 = cachingLens _field3 (\r x -> r { _field3 = x })

cachedView :: Getter Record D
cachedView = to _cachedView

This solves all the problems! (1) We never have to remember to update the cached field; using a lens to modify the value of another field will automatically cause the cached view to be recomputed as well. (3) We can’t accidentally set the cached field, since it only has a Getter, not a Lens. In fact, this even solves (2), the problem of having to update the rest of our codebase: if we are already using lens to access fields in the record (as I was), then the rest of the codebase doesn’t have to change at all! And if we aren’t using lens already, then the typechecker will infallibly guide us to all the places we have to fix; once our code typechecks again, we know we have caught every single access to the record in the codebase.

Variant for only a few fields

What if we have a large record, and the cached summary value only depends on a few of the fields? In that case, we can save a bit of work for ourselves by getting lens to auto-generate lenses for the other fields, and only handcraft lenses for the fields that are actually involved. Like this:

{-# LANGUAGE TemplateHaskell #-}

data Record = Record
  { _field1 :: A, _field2 :: B, _cachedView :: C, ... }

expensiveView :: A -> B -> C
expensiveView = ...

let exclude = ['_field1, '_field2, '_cachedView] in
    (lensRules & lensField . mapped . mapped %~ \fn n ->
      if n `elem` exclude then [] else fn n)

field1 :: Lens' Record A
field1 = ... similar to before ...

field2 :: Lens' Record B
field2 = ...

cachedView :: Getter Record C
cachedView = to _cachedView

But what about the lens laws?

You might worry that having a lens for one field automatically update the value of another field might break the lens laws somehow, but it’s perfectly legal, as we can check.

  1. view l (set l v s) ≡ v clearly holds: setting the cachedView on the side doesn’t change the fact that we get back out whatever we put into, say, field1.
  2. set l v' (set l v s) ≡ set l v' s also clearly holds. On the left-hand side, the cached summary value will simply get overwritten in the same way that the other field does.
  3. set l (view l s) s ≡ s is actually a bit more subtle. If we view the value of field1, then set it with the same value again, how do we know the value of the overall record s doesn’t change? In particular, could we end up with a different cachedView even though field1 is the same? But in fact, in this specific scenario (putting the same value back into a field that we just read), the value of the cachedView won’t change. This depends on two facts: first, that the expensiveView is a deterministic function which always returns the same summary value for the same input record. Of course this is guaranteed by the fact that it’s a pure function. Second, we must maintain the invariant that the cachedView is always up-to-date, so that recomputing the summary value after setting a field to the same value it already had will simply produce the same summary value again, because we know the summary value was correct to begin with. And of course, maintaining this invariant is the whole point; it’s guaranteed by the way we only export the lenses (and only a Getter for the cachedView) and not the record constructor.

And that’s it! I’ve been using this approach very successfully in a current project (the same project that got me to implement Hindley-Milner with unification-fd—watch this space for an announcement soon!). If you know of similar approaches that have been written about elsewhere, or if you end up using this technique in your own project, I’d love to hear about it.

by Brent at September 17, 2021 03:45 PM

September 16, 2021

Neil Mitchell

Reflecting on the Shake Build System

Summary: As a medium-sized build system, Shake has some good bits and some bad bits.

I first developed the Shake build system at Standard Chartered in 2008, rewriting an open source version in my spare time in 2011. I wrote a paper on Shake for ICFP 2012 and then clarified some of the details in a JFP 2020 paper. Looking back, over a decade later, this post discusses what went well and what could be improved.

The first thing to note is that Shake is a medium sized build system. If you have either 1 source file or 1 million source files, Shake probably isn't a good fit. In this post I'm going to go through how Shake does as a medium-sized build system, and two other posts reflect on what I think a small build system or huge build system might look like.

The most important thing Shake got right was adding monadic/dynamic dependencies. Most build systems start with a static graph, and then, realising that can't express the real world, start hacking in an unprincipled manner. The resulting system becomes a bunch of special cases. Shake embraced dynamic dependencies. That makes some things harder (no static cycle detection, less obvious parallelism, must store dependency edges), but all those make Shake itself harder to write, while dynamic dependencies make Shake easier to use. I hope that eventually all build systems gain dynamic dependencies.

In addition to dynamic dependencies, Shake has early cut-off, meaning files that rebuild but don't change don't invalidate rules that depend upon them. This feature is something increasingly becoming standard in build systems, which is great to see.

Shake is written as a Haskell DSL, which means users of Shake are writing a Haskell program that happens to heavily leverage the Shake library. That choice was a double-edged sword. There are some significant advantages:

  • I didn't need to invent a special purpose language. That means I get to reuse existing tooling, existing learning materials, and existing libraries.
  • Since Shake is just a library, it can be documented with the normal tools like Haddock.
  • Users can extend Shake using Haskell, and publish libraries building on top of Shake.
  • The modelling of monadic dependencies in Haskell is pretty elegant, given a dedicated syntax for expressing monadic computations (the do keyword).
  • Projects like Haskell Language Server can build on top of Shake in fairly fundamental ways. See our recent IFL 2020 paper for the benefits that brought.

But there are also some downsides:

  • Most significantly, the audience of Shake is somewhat limited by the fact that Shake users probably have to learn some Haskell. While the user manual aims to teach enough Haskell to write Shake without really knowing Haskell, it's still a barrier.
  • Haskell has some significant weaknesses, e.g. it has two competing package managers, countless distribution mechanisms, and none of these things are consistent for long periods of time. Haskell has a poor on-ramp, and thus so does Shake.

The choice of an embedded DSL for Shake also leads to the issue that Shake doesn't know when a rule has changed, since a rule is opaque Haskell code. As a consequence, if you modify a command line in a .hs file, Shake is unaware and won't rebuild the necessary files. There are a bunch of techniques for dealing with this limitation (see the Shake functions shakeVersion, versioned), but none are pleasant, and it remains an easy mistake to make. A potential way out is to build a system which reads configuration files not in Haskell and interprets them, which I gave a talk about, and I've seen deployed in practice. But it's something where each user ends up rolling their own.

Another limitation is that Shake is (deliberately) quite low-level. It gives you a way to depend on a file, and a way to run a command line. It doesn't give you a way to express a C++ library. The hope from the beginning was that Shake would be language neutral, and that libraries would arise that built on top of Shake providing access to standard libraries. If you were writing a Python/C++/Ruby build script, you'd simply import those libraries, mix them together, and have a working build system. There are libraries that have gone in that direction, the libraries shake-language-c and shake-cpp provide C++ rules, avr-shake lets you work with AVR Crosspack. Unfortunately, there aren't enough libraries to just plug together a build system. I think a fundamental problem is that it's not immediately obvious how such libraries would compose, and without that composability, it's hard to build the libraries that would rely on composability.

Finally, the most surprising aspect about developing Shake is that a large part of the effort has gone into writing an ergonomic and cross-platform process executor. The result is found at Development.Shake.Command, and can be used outside Shake, letting users write:

cmd "gcc -c" [src]

This example invokes gcc, ensuring that src is properly escaped if it has spaces or other special characters. A significant amount of the engineering work in Shake has gone into that facility, when it's totally orthogonal to the build system itself.

In the next two parts of this series, I'll go through why I think Shake isn't a great build system for tiny projects (and what might be), followed by why Shake isn't great for absolutely huge projects (and what would need to be fixed).

by Neil Mitchell ( at September 16, 2021 08:51 PM

Small Project Build Systems

Summary: Forward build systems might work better for small projects.

Yesterday's post talked about how Shake is a good medium sized build system - but what about smaller projects? Is the Shake model right for them? Shake can be considered a backwards build system. Each rule says how to produce a file, given some input files (which are dependencies) and an action. Almost all build systems (e.g. Make, Buck, Bazel, CMake, SCons, Ninja) fit this model, which is analysed in the Build Systems a la Carte paper. While this model works, it has two disadvantages:

  • You have to explicitly list dependencies, or infer them from include files etc. That means either dependencies are insufficient (you probably forgot some), or they are excessive (you added some you don't need). Usually both.
  • You have to think backwards. When you ask someone how to build an executable from a C file, no one talks about linking first, but to program a build system you have to.

The alternative to a backwards build system is a forwards build system, of which Memoize was the first. You just write out the commands in order, and dependency tracing figures out if they have changed. To compile a C program it can be as simple as:

gcc -c util.c
gcc -c main.c
gcc -o main main.o util.o

That build script is incredibly simple - so simple it could also be treated as a shell script.

A few years ago I wrote such a system, called Rattle, and wrote a paper about it at OOPSLA 2020 with my co-authors Sarah Spall and Sam Tobin-Hochstadt. Sarah gave a talk about Rattle at OOPSLA, and I gave a talk at Build Meetup 2021. We were able to compile projects like NodeJS faster than the NodeJS build system (which uses Make), showing the idea might be feasible.

If forward build systems are so great, why do I think they are most suitable for small projects? There are four reasons, the first three of which have mitigations, but the final one sets a limit on the size at which forward build systems are suitable.

  1. Forward build systems rely on tracing which files are dependencies of a command. Doing that quickly in a cross-platform manner is a nightmare. There are tricks like hooking system calls etc, but it presents a significant engineering hurdle, especially on MacOS, which makes this task harder with every release.
  2. Forward build systems are immature. The earliest examples no longer work. Rattle is a relatively practical research system - it could evolve into a production system - but it's not there yet. And compared to the alternatives, Rattle is probably one of the closest to production, in large part because it builds off a lot of common infrastructure from Shake which is much more mature.
  3. Forward build systems lack parallelism, since if you want to express parallelism, you need to think about dependencies once more, and it's easy to go wrong. Rattle mostly solves the parallelism by automatically inferring when it is safe to parallelise, which is how we were able to remain competitive with Make.

And finally, the biggest issue is that forward build systems are not compositional, while backward build systems are. If you want to write a 1 million build rule system, in a backwards system, each rule looks like any other. Whereas in a forward build system, assuming you need to give an order, writing down that order in a compositional way is hard - in fact, whenever I've tried it, you start expressing the dependencies between entries and end up with a backwards build system.

Happily, people are continuing to research forward build system. Rattle adds parallelism, Stroll removes the requirement for an order, Fac allows some dependencies and infers the remaining ones, LaForge finds greater incrementality. Perhaps all those ideas can be combined, along with a lot of engineering, to produce a practical forward build system.

Rattle has shown a well engineered forward build system would be feasible for small projects. It's unclear how much larger the concept might be able to scale, probably never to millions of files, but for small projects it might provide a significantly lower effort path to writing build systems.

by Neil Mitchell ( at September 16, 2021 08:05 PM

Huge Project Build Systems

Summary: Shake won't scale to millions of files, this post says what would be required to make it do so.

While Shake has compiled projects with hundreds of thousands of files, it's never scaled out to millions of files, and it would be unlikely to work well at that size. The most popular build systems that operate at that scale are Buck (from Facebook) and Bazel (from Google). In this post I go through the changes that would need to be made to make Shake scale.

The first issue is covered in my previous post, that Shake doesn't know if you change the build rules themselves. As you scale up, it becomes much more important that if you change the rules, everything is properly tracked. As the number of people involved in a project increases, the rate at which the build system changes will also increase. Both Buck and Bazel solve this problem using a deterministic Python-based configuration language called Starlark. If Shake stopped being a Haskell DSL, I'd argue that it stops being Shake and becomes something different, so it's unclear what could be done there.

The next issue is that every time Shake is invoked, it checks the modification time of every file, and then walks the entire dependency graph. That works fine at 10K files, but as you move to 1M files, it takes too long. The solution is two-fold, first be informed which files have changed using notification APIs (e.g. the Watchman tool), and then use reverse dependencies to only explore the portion of the graph that has changed. Happily, Pepe already has a patch adding reverse dependencies to Shake, so that isn't too infeasible.

The final issue is that Shake was designed as a single-machine build system, not for sharing results between users. When I first wrote Shake, I didn't have access to servers, and AWS was brand new. Now, over a decade later, servers are easy to obtain and large scale build systems need to share results, so that if one user builds a file, no one else needs to. Within the realm of multi-user build systems, there are two basic operations - sharing results and offloading commands.

Shake, with it's new cloud features, is able to share results between users using a shared drive. It works, and big companies are using it for real, but I'd consider it fairly experimental. For execution, Shake is unable to run actions remotely, so can't make use of something like Bazel's remote execution API. Since dependencies are specified at the rule level, and remote execution operates at the command level, there is a bit of a mismatch, and it's unclear what that might look like in Shake.

While Shake won't work at huge scales, it is still quite an effective build system at quite large scales. But, given the limitations, I imagine it will never get to the scale of Buck/Bazel. At the same time, Buck/Bazel lack dynamic dependencies, which makes them unable to express rules such as Haskell effectively.

Happily, I am involved with a new build system, the next generation of Buck. I joined Facebook two years ago, and since that time have been focused on this project. It's written in Rust, configured with Starlark (I've spent a lot of time working on an open-source Starlark interpreter in Rust), and should work at huge scales. It's not yet open source, but it will be - we are targeting early next year.

I think Shake is still a build system with a lot to offer, and continue to maintain and improve it. For people who want to scale beyond the range of Shake, I'd definitely recommend using the next generation of Buck, once it is available.

by Neil Mitchell ( at September 16, 2021 03:46 PM

September 14, 2021

Joachim Breitner

A Candid explainer: Quirks

This is the fifth and final post in a series about the interface description language Candid.

If you made it this far, you now have a good understanding of what Candid is, what it is for and how it is used. For this final post, I’ll put the spotlight on specific aspects of Candid that are maybe surprising, or odd, or quirky. This section will be quite opinionated, and could maybe be called “what I’d do differently if I’d re-do the whole thing”.

Note that these quirks are not serious problems, and they don’t invalidate the overall design. I am writing this up not to discourage the use of Candid, but merely help interested parties to understand it better.

References in the wire format

When the work on Candid began at DFINITY, the Internet Computer was still far away from being a thing, and many fundamental aspects about it were still in the air. I particular, there was still talk about embracing capabilities as a core feature of the application model, which would be implemented as opaque references on the system level, likely building on WebAssembly’s host reference type proposal (which only landed recently), and could be used to model access permissions, custom tokens and many other things.

So Candid is designed with that in mind, and you’ll find that its wire format is not just a type table and a value table, but actually

a triple (T, M, R), where T (“type”) and M (“memory”) are sequences of bytes and R (“references”) is a sequence of references.

Also the wire format for values of function service tyeps have an extra byte to distinguish between “public references” (represented by a principal and possible a method name in the data part), and these opaque references.

Alas, references never made it into the Internet Computer, so all Candid implementations simply ignore that part of the specification. But it’s still in the spec, and if it confused you before, now you know why.

Hashed field names

Candid record and variant types look like they have textual field names:

type T = record { syndactyle : nat; trustbuster: bool }

But that is actually only true superficially. The wire format for Candid only stores hashes of field names. So the above is actually equivalent to

type T = record { 4260381820 : nat; 3504418361 : bool }

or, for that matter, to

type T = record { destroys : bool; rectum : nat }

(Yes, I used an english word list to find these hash collisions. There aren’t that many actually.)

The benefit of such hashing is that the messages are a bit smaller in most (not all) cases, but it is a big annoyance for dynamic uses of Candid. It’s the reason why tools like dfx, if they don’t know the Candid interface of a service, will print the result with just the numerical hash, letting you guess which field is which.

It also complicates host languages that derive Candid types from the host language, like Motoko, as some records (e.g. record { trustbuster: bool; destroys : int }) with field name hash collisions can not be represented in Candid, and either the host language’s type system needs to be Candid aware now (as is the case of Motoko), or serialization/deserialization will fail at runtime, or odd bugs can happen.

(More discussion of this issue).


Many languages have a built-in notion of a tuple type (e.g. (Int, Bool)), but Candid does not have such a type. The only first class product type is records.

This means that tuples have to encoded as records somehow. Conveniently(?) record fields are just numbers after all, so the type (Int, Bool) would be mapped to the type

record { 0 : int; 1 : bool }

So tuples can be expressed. But from my experience implementing the type mappings for Motoko and Haskell this is causing headaches. To get a good experience when importing from Candid, the tools have to try to reconstruct which records may have been tuples originally, and turn them into tuples.

The main argument for the status quo is that Candid types should be canonical, and there should not be more than one product type, and records are fine, and there needs to be no anonymous product type. But that has never quite resonated with me, given the practical reality of tuple types in many host languages.

Argument sequences

Did I say that Candid does not have tuple types? Turns out it does, sort of. There is no first class anonymous product, but since functions take sequences of arguments and results, there is a tuple type right there:

func foo : (bool, int) -> (int, bool)

Again, I found that ergonomic interaction with host languages becomes relatively unwieldy by requiring functions to take and return sequences of values. This is especially true for languages where functions take one argument value or return one result type (the latter being very common). Here, return sequences of length one are turned into that type directly, longer argument sequences turn into the host language’s tuple type, and nullary argument sequences turn into the idiomatic unit type. But this means that the types (int, bool) -> () and (record { 0: int, 1: bool}) -> () may be mapped to the same host language type, which causes problems when you hope to encode all necessary Candid type information in the host language.

Another oddity with argument and result sequences is that you can give names to the entries, e.g. write

func hello : (last_name : text; first_name : text) -> ()

but these names are completely ignored! So while this looks like you can, for example, add new optional arguments in the middle, such as

func hello : (last_name : text; middle_name: opt text, first_name : text) -> ()

without breaking clients, this does not have the effect you think it has and will likely break.

My suggestion is to never put names on function arguments and result values in Candid interfaces, and for anything that might be extended with new fields or where you want to name the arguments, use a single record type as the only argument:

func hello : (record { last_name : text; first_name : text}) -> ()

This allows you to add and remove arguments more easily and reliably.

Type “shorthands”

The Candid specification defines a system of types, and then adds a number of “syntactic short-hands”. For example, if you write blob in a Candid type description, it ought to means the same as vec nat8.

My qualm with that is that it doesn’t always mean the same. A Candid type description is interpreted by a number of, say, “consumers”. Two such consumers are part of the Candid specification:

  • The specification that defines the wire format for that type
  • The upgrading (subtyping) rules

But there are more! For every host language, there is some mapping from Candid types to host language types, and also generic tools like Candid UI are consumers of the type algebra. If these were to take the Candid specification as gospel, they would be forced to treat blob and vec nat8 the same, but that would be quite unergonomic and might cause performance regressions (most language try to map blob to some compact binary data type, while vec t tends to turn into some form of array structure).

So they need to be pragmatic and treat blob and vec nat8 differently. But then, for all practical purposes, blob is not just a short-hand of vec nat8. They are different types that just happens to have the same wire representations and subtyping relations.

This affects not just blob, but also “tuples” (record { blob; int; bool }) and field “names”, as discussed above.

The value text format

For a while after defining Candid, the only implementation was in Motoko, and all the plumbing was automatic, so there was never a need for users to to explicitly handle Candid values, as all values were Motoko values. Still, for debugging and testing and such things, we eventually needed a way to print out Candid values, so the text format was defined (“To enable convenient debugging, the following grammar specifies a text format for values…”).

But soon the dfx tool learned to talk to canisters, and now users needed to enter Candid value on the command line, possibly even when talking to canisters for which the interface was not known to dfx. And, sure enough, the textual interface defined in the Candid spec was used.

Unfortunately, since it was not designed for that use case, it is rather unwieldy:

  • It is quite verbose. You have to write record { … }, not just { … }. Vectors are written vec { …; …} instead of some conventional syntax like […, …]. Variants are written as variant { error = "…"} with braces that don’t any value here, and something like #error "…" might have worked as well.

    With a bit more care, a more concise and ergonomic syntax might have been possible.

  • It wasn’t designed to be sufficient to create a Candid value from it. If you write 5 it’s unclear whether that’s a nat or an int16 or what (and all of these have different wire representations). Type annotations were later added, but are relatively unwieldy, and don’t cover all cases (e.g. a service reference with a recursive type cannot be represented in the textual format at the moment).

  • Not really the fault of the textual format, but some useful information about the types is not reflected in the type description that’s part of the wire format. In particular not the field names, and whether a value was intended to be binary data (blob) or a list of small numbers (vec nat8), so pretty-printing such values requires guesswork. The Haskell library even tries to brute-force the hash to guess the field name, if it is short or in a english word list!

In hindsight I think it was too optimistic to assume that correct static type information is always available, and instead of actively trying to discourage dynamic use, Candid might be better if we had taken these (unavoidable?) use cases into account.

Custom wire format

At the beginning of this post, I have a “Candid is …” list. The list is relatively long, and the actual wire format is just one bullet point. Yes, defining a wire format that works isn’t rocket science, and it was easiest to just make one up. But since most of the interesting meat of Candid is in other aspects (subtyping rules, host language integration), I wonder if it would have been better to use an existing, generic wire format, such as CBOR, and build Candid as a layer on top.

This would give us plenty of tools and libraries to begin with. And maybe it would have reduced barrier of entry for developers, which now led to the very strange situation that DFINITY advocates for the universal use of Candid on the Internet Computer, so that all services can smoothly interact, but two of the most important services on the Internet Computer (the registry and the ledger) use Protobuf as their primary interface format, with Candid interfaces missing or an afterthought.

Sideways Interface Evolution

This is not a quirk of Candid itself, but rather an idiom of how you can use Candid that emerged from our solution for record extensions and upgrades.

Consider our example from before, a service with interface

service { add_user : (record { login : text; name : text }) -> () }

where you want to add an age field, which should be a number.

The “official” way of doing that is to add that field with an optional type:

service { add_user : (record { login : text; name : text; age : opt nat }) -> () }

As explained above, this will not break old clients, as the decoder will treat a missing argument as null. So far so good.

But often when adding such a field you don’t want to bother new clients with the fact that this age was, at some point in the past, not there yet. And you can do that! The trick is to distinguish between the interface you publish and the interface you implement. You can (for example in your documentation) state that the interface is

service { add_user : (record { login : text; name : text; age : nat }) -> () }

which is not a subtype of the old type, but it is the interface you want new clients to work with. And then your implementation uses the type with opt nat. Calls from old clients will come through as null, and calls from new clients will come through as opt 42.

We can see this idiom used in the Management Canister of the Internet Computer. The current documented interface only mentions a controllers : vec principal field in the settings, but the implementation still can handle both the old controller : principal and the new controllers field.

It’s probably advisable to let your CI system check that new versions of your service continue to implement all published interfaces, including past interfaces. But as long as the actual implementation’s interface is a subtype of all interfaces ever published, this works fine.

This pattern is related to when your service implements, say, http_request (so its implemented interface is a subtype of that common interface), but does not include that method in the published documentation (because clients of your service don’t need to call it).

Self-describing Services

As you have noticed, Candid was very much designed assuming that all parties always have the service type of services they want to interact with. But the Candid specification does not define how one can obtain the interface of a given service, and there isn’t really a an official way to do that on the Internet Computer.

That is unfortunate, because many interesting features depend on that: Such as writing import C "ic:7e6iv-biaaa-aaaaf-aaada-cai" in your Motoko program, and having it’s type right there. Or tools like, that allow you to interact with any canister right there.

One reason why we don’t really have that feature yet is because of disagreements about how dynamic that feature should be. Should you be able to just ask the canister for its interface (and allow the canister to vary the response, for example if it can change its functionality over time, even without changing the actual wasm code)? Or is the interface a static property of the code, and one should be able to query the system for that data, without the canister’s active involvement. Or, completely different, should interfaces be distributed out of band, maybe together with API documentation, or in some canister registry somewhere else?

I always leaned towards the first of these options, but not convincingly enough. The second options requires system assistance, so more components to change, more teams to be involved that maybe intrinsically don’t care a lot about this feature. And the third might have emerged as the community matures and builds such infrastructure, but that did not happen yet.

In the end I sneaked in an implementation of the first into Motoko, arguing that even if we don’t know yet how this feature will be implemented eventually, we all want the feature to exist somehow, and we really really want to unblock all the interesting applications it enables (e.g. Candid UI). That’s why every Motoko canister, and some rust canisters too, implements a method

__get_candid_interface_tmp_hack : () -> (text)

that one can use to get the Candid interface file.

The name was chosen to signal that this may not be the final interface, but like all good provisional solutions, it may last longer than intended. If that’s the case, I’m not sorry.

This concludes my blog post series about Candid, for now. If you want to know more, feel free to post your question on the DFINTY developer forum, and I’ll probably answer.

by Joachim Breitner ( at September 14, 2021 06:13 PM

September 13, 2021

Monday Morning Haskell

Haskellings Demo Video!

Haskellings Intro.jpg

If you've been following my Twitch stream, you know I've been continuing to work quite a bit on the Haskellings automated tutorial. This week I'm releasing a short YouTube video demonstrating how to get started with this program and use it! If you've been waiting for the kick to get started with learning Haskell, this is it! Download Haskellings and get started!

And if you like it, don't forget to subscribe to Monday Morning Haskell for our subscriber-only resources!

by James Bowen at September 13, 2021 02:30 PM

Chris Reade

Diagrams for Penrose Tiles

Penrose Kite and Dart Tilings with Haskell Diagrams

Revised version (no longer the full program in this literate Haskell)

Infinite non-periodic tessellations of Roger Penrose’s kite and dart tiles.


As part of a collaboration with Stephen Huggett, working on some mathematical properties of Penrose tilings, I recognised the need for quick renderings of tilings. I thought Haskell diagrams would be helpful here, and that turned out to be an excellent choice. Two dimensional vectors were well-suited to describing tiling operations and these are included as part of the diagrams package.

This literate Haskell uses the Haskell diagrams package to draw tilings with kites and darts. It also implements the main operations of inflate and decompose which are essential for constructing tilings (explained below).

Firstly, these 5 lines are needed in Haskell to use the diagrams package:

{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts          #-}
{-# LANGUAGE TypeFamilies              #-}
import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

and we will also import a module for half tiles (explained later)

import HalfTile

These are the kite and dart tiles.

Kite and Dart
Kite and Dart

The red line marking here on the right hand copies, is purely to illustrate rules about how tiles can be put together for legal (non-periodic) tilings. Obviously edges can only be put together when they have the same length. If all the tiles are marked with red lines as illustrated on the right, the vertices where tiles meet must all have a red line or none must have a red line at that vertex. This prevents us from forming a simple rombus by placing a kite top at the base of a dart and thus enabling periodic tilings.

All edges are powers of the golden section \phi which we write as phi.

phi = (1.0 + sqrt 5.0) / 2.0

So if the shorter edges are unit length, then the longer edges have length phi. We also have the interesting property of the golden section that phi^2 = phi + 1 and so 1/phi = phi-1, phi^3 = 2phi +1 and 1/phi^2 = 2-phi.

All angles in the figures are multiples of tt which is 36 deg or 1/10 turn. We use ttangle to express such angles (e.g 180 degrees is ttangle 5).

ttangle:: Int -> Angle Double
ttangle n = (fromIntegral (n `mod` 10))*^tt
             where tt = 1/10 @@ turn


In order to implement inflate and decompose, we need to work with half tiles. We now define these in the separately imported module HalfTile with constructors for Left Dart, Right Dart, Left Kite, Right Kite

data HalfTile rep = LD rep -- defined in HalfTile module
                  | RD rep
                  | LK rep
                  | RK rep

where rep is a type variable allowing for different representations. However, here, we want to use a more specific type which we will call Piece:

type Piece = HalfTile (V2 Double)

where the half tiles have a simple 2D vector representation to provide orientation and scale. The vector represents the join edge of each half tile where halves come together. The origin for a dart is the tip, and the origin for a kite is the acute angle tip (marked in the figure with a red dot).

These are the only 4 pieces we use (oriented along the x axis)

ldart,rdart,lkite,rkite:: Piece
ldart = LD unitX
rdart = RD unitX
lkite = LK (phi*^unitX)
rkite = RK (phi*^unitX)

Perhaps confusingly, we regard left and right of a dart differently from left and right of a kite when viewed from the origin. The diagram shows the left dart before the right dart and the left kite before the right kite. Thus in a complete tile, going clockwise round the origin the right dart comes before the left dart, but the left kite comes before the right kite.

When it comes to drawing pieces, for the simplest case, we just want to show the two tile edges of each piece (and not the join edge). These edges are calculated as a list of 2 new vectors, using the join edge vector v. They are ordered clockwise from the origin of each piece

pieceEdges:: Piece -> [V2 Double]
pieceEdges (LD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 9) v
pieceEdges (RD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 1) v
pieceEdges (RK v) = [v',v ^-^ v'] where v' = rotate (ttangle 9) v
pieceEdges (LK v) = [v',v ^-^ v'] where v' = rotate (ttangle 1) v

Now drawing lines for the 2 outer edges of a piece is simply

drawPiece:: Piece -> Diagram B
drawPiece = strokeLine . fromOffsets . pieceEdges

It is also useful to calculate a list of the 4 tile edges of a completed half-tile piece clockwise from the origin of the tile. (This is useful for colour filling a tile)

tileEdges:: Piece -> [V2 Double]
tileEdges (LD v) = pieceEdges (RD v) ++ map negated (reverse (pieceEdges (LD v)))
tileEdges (RD v) = tileEdges (LD v)
tileEdges (LK v) = pieceEdges (LK v) ++ map negated (reverse (pieceEdges (RK v)))
tileEdges (RK v) = tileEdges (LK v)

To fill whole tiles with colours, darts with dcol and kites with kcol we can use fillDK'. This uses only the left pieces to identify the whole tile and ignores right pieces so that a tile is not filled twice.

fillDK':: Colour Double -> Colour Double -> Piece -> Diagram B
fillDK' dcol kcol c =
     case c of (LD _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c)  # fc dcol
               (LK _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c)  # fc kcol
               _      -> mempty

To show half tiles separately, we can use drawJPiece which adds the join edge of each half tile to make loops with 3 edges.

drawJPiece:: Piece -> Diagram B
drawJPiece = strokeLoop . closeLine . fromOffsets . pieceEdges

For an alternative fill operation we can use drawJPiece with the line colour on the join edge matching the fill colour.

fillDK:: Colour Double -> Colour Double -> Piece -> Diagram B
fillDK dcol kcol piece = drawPiece piece <> (drawJPiece piece # fc col # lc col) where
    col = case piece of (LD _) -> dcol
                        (RD _) -> dcol
                        (LK _) -> kcol
                        (RK _) -> kcol

By making Pieces transformable we can reuse generic transform operations. These 4 lines of code are required to do this

type instance N (HalfTile a) = N a
type instance V (HalfTile a) = V a
instance Transformable a => Transformable (HalfTile a) where
    transform t ht = fmap (transform t) ht

So we can also scale a piece, translate a piece by a vector, and rotate a piece by an angle. (Positive rotations are in the anticlockwise direction.)

scale:: Double -> Piece -> Piece
rotate :: Angle Double -> Piece -> Piece
translate:: V2 Double -> Piece -> Piece


A patch is a list of located pieces (each with a 2D point)

type Patch = [Located Piece]

To turn a whole patch into a diagram using some function cd for drawing the pieces, we use

patchWith cd patch = position $ fmap (viewLoc . mapLoc cd) patch

Here mapLoc applies a function to the piece in a located piece – producing a located diagram in this case, and viewLoc returns the pair of point and diagram from a located diagram. Finally position forms a single diagram from the list of pairs of points and diagrams.

The common special case drawPatch uses drawPiece on each piece

drawPatch = patchWith drawPiece

Patches are automatically inferred to be transformable now Pieces are transformable, so we can also scale a patch, translate a patch by a vector, and rotate a patch by an angle.

scale :: Double -> Patch -> Patch
rotate :: Angle Double -> Patch -> Patch
translate:: V2 Double -> Patch -> Patch

As an aid to creating patches with 5-fold rotational symmetry, we combine 5 copies of a basic patch (rotated by multiples of ttangle 2 successively).

penta:: Patch -> Patch
penta p = concatMap copy [0..4] 
            where copy n = rotate (ttangle (2*n)) p

This must be used with care to avoid nonsense patches. But two special cases are

sun =  penta [rkite `at` origin, lkite `at` origin]
star = penta [rdart `at` origin, ldart `at` origin]

This figure shows some example patches, drawn with drawPatch The first is a star and the second is a sun.

tile patches
tile patches

The tools so far for creating patches may seem limited (and do not help with ensuring legal tilings), but there is an even bigger problem.

Correct Tilings

Unfortunately, correct tilings – that is, tilings which can be extended to infinity – are not as simple as just legal tilings. It is not enough to have a legal tiling, because an apparent (legal) choice of placing one tile can have non-local consequences, causing a conflict with a choice made far away in a patch of tiles, resulting in a patch which cannot be extended. This suggests that constructing correct patches is far from trivial.

The infinite number of possible infinite tilings do have some remarkable properties. Any finite patch from one of them, will occur in all the others (infinitely many times) and within a relatively small radius of any point in an infinite tiling. (For details of this see links at the end)

This is why we need a different approach to constructing larger patches. There are two significant processes used for creating patches, namely inflate (also called compose) and decompose.

To understand these processes, take a look at the following figure.


Here the small pieces have been drawn in an unusual way. The edges have been drawn with dashed lines, but long edges of kites have been emphasised with a solid line and the join edges of darts marked with a red line. From this you may be able to make out a patch of larger scale kites and darts. This is an inflated patch arising from the smaller scale patch. Conversely, the larger kites and darts decompose to the smaller scale ones.


Since the rule for decomposition is uniquely determined, we can express it as a simple function on patches.

decompose :: Patch -> Patch
decompose = concatMap decompPiece

where the function decompPiece acts on located pieces and produces a list of the smaller located pieces contained in the piece. For example, a larger right dart will produce both a smaller right dart and a smaller left kite. Decomposing a located piece also takes care of the location, scale and rotation of the new pieces.

decompPiece lp = case viewLoc lp of
  (p, RD vd)-> [ LK vd  `at` p
               , RD vd' `at` (p .+^ v')
               ] where v'  = phi*^rotate (ttangle 1) vd
                       vd' = (2-phi) *^ (negated v') -- (2-phi) = 1/phi^2
  (p, LD vd)-> [ RK vd `at` p
               , LD vd' `at` (p .+^ v')
               ]  where v'  = phi*^rotate (ttangle 9) vd
                        vd' = (2-phi) *^ (negated v')  -- (2-phi) = 1/phi^2
  (p, RK vk)-> [ RD vd' `at` p
               , LK vk' `at` (p .+^ v')
               , RK vk' `at` (p .+^ v')
               ] where v'  = rotate (ttangle 9) vk
                       vd' = (2-phi) *^ v' -- v'/phi^2
                       vk' = ((phi-1) *^ vk) ^-^ v' -- (phi-1) = 1/phi
  (p, LK vk)-> [ LD vd' `at` p
               , RK vk' `at` (p .+^ v')
               , LK vk' `at` (p .+^ v')
               ] where v'  = rotate (ttangle 1) vk
                       vd' = (2-phi) *^ v' -- v'/phi^2
                       vk' = ((phi-1) *^ vk) ^-^ v' -- (phi-1) = 1/phi

This is illustrated in the following figure for the cases of a right dart and a right kite.


The symmetric diagrams for left pieces are easy to work out from these, so they are not illustrated.

With the decompose operation we can start with a simple correct patch, and decompose repeatedly to get more and more detailed patches. (Each decomposition scales the tiles down by a factor of 1/phi but we can rescale at any time.)

This figure illustrates how each piece decomposes with 4 decomposition steps below each one.

four decompositions of pieces
four decompositions of pieces
thePieces =  [ldart, rdart, lkite, rkite]  
fourDecomps = hsep 1 $ fmap decomps thePieces # lw thin where
        decomps pc = vsep 1 $ fmap drawPatch $ take 5 $ decompositions [pc `at` origin] 

We have made use of the fact that we can create an infinite list of finer and finer decompositions of any patch, using:

decompositions:: Patch -> [Patch]
decompositions = iterate decompose

We could get the n-fold decomposition of a patch as just the nth item in a list of decompositions.

For example, here is an infinite list of decomposed versions of sun.

suns = decompositions sun

The coloured tiling shown at the beginning is simply 6 decompositions of sun displayed using fillDK'

sun6 = suns!!6
filledSun6 = patchWith (fillDK' red blue) sun6 # lw ultraThin

The earlier figure illustrating larger kites and darts emphasised from the smaller ones is also sun6 but this time drawn with

experimentFig = patchWith experiment sun6 # lw thin

where pieces are drawn with

experiment:: Piece -> Diagram B
experiment pc = emph pc <> (drawJPiece pc # dashingN [0.002,0.002] 0 # lw ultraThin)
  where emph pc = case pc of
          (LD v) -> (strokeLine . fromOffsets) [v] # lc red   -- emphasise join edge of darts in red
          (RD v) -> (strokeLine . fromOffsets) [v] # lc red 
          (LK v) -> (strokeLine . fromOffsets) [rotate (ttangle 1) v] -- emphasise long edge for kites
          (RK v) -> (strokeLine . fromOffsets) [rotate (ttangle 9) v]


You might expect inflation (also called composition) to be a kind of inverse to decomposition, but it is a bit more complicated than that. With our current representation of pieces, we can only inflate single pieces. This amounts to embedding the piece into a larger piece that matches how the larger piece decomposes. There is thus a choice at each inflation step as to which of several possibilities we select as the larger half-tile. We represent this choice as a list of alternatives. This list should not be confused with a patch. It only makes sense to select one of the alternatives giving a new single piece.

The earlier diagram illustrating how decompositions are calculated also shows the two choices for inflating a right dart into either a right kite or a larger right dart. There will be two symmetric choices for a left dart, and three choices for left and right kites.

Once again we work with located pieces to ensure the resulting larger piece contains the original in its original position in a decomposition.

inflate :: Located Piece -> [Located Piece]
inflate lp = case viewLoc lp of
  (p, RD vd)-> [ RD vd' `at` (p .+^ v')
               , RK vk  `at` p
               ] where v'  = (phi+1) *^ vd                  -- vd*phi^2
                       vd' = rotate (ttangle 9) (vd ^-^ v')
                       vk  = rotate (ttangle 1) v'
  (p, LD vd)-> [ LD vd' `at` (p .+^ v')
               , LK vk `at` p
               ] where v'  = (phi+1) *^ vd                  -- vd*phi^2
                       vd' = rotate (ttangle 1) (vd ^-^ v')
                       vk  = rotate (ttangle 9) v'
  (p, RK vk)-> [ LD vk  `at` p
               , LK lvk' `at` (p .+^ lv') 
               , RK rvk' `at` (p .+^ rv')
               ] where lv'  = phi*^rotate (ttangle 9) vk
                       rv'  = phi*^rotate (ttangle 1) vk
                       rvk' = phi*^rotate (ttangle 7) vk
                       lvk' = phi*^rotate (ttangle 3) vk
  (p, LK vk)-> [ RD vk  `at` p
               , RK rvk' `at` (p .+^ rv')
               , LK lvk' `at` (p .+^ lv')
               ] where v0 = rotate (ttangle 1) vk
                       lv'  = phi*^rotate (ttangle 9) vk
                       rv'  = phi*^rotate (ttangle 1) vk
                       rvk' = phi*^rotate (ttangle 7) vk
                       lvk' = phi*^rotate (ttangle 3) vk

As the result is a list of alternatives, we need to select one to do further inflations. We can express all the alternatives after n steps as inflations n where

inflations :: Int -> Located Piece -> [Located Piece]
inflations 0 lp = [lp]
inflations n lp = do
    lp' <- inflate lp
    inflations (n-1) lp'

This figure illustrates 5 consecutive choices for inflating a left dart to produce a left kite. On the left, the finishing piece is shown with the starting piece embedded, and on the right the 5-fold decomposition of the result is shown.

five inflations
five inflations
fiveInflate = hsep 1 $ fmap drawPatch [[ld,lk'], multiDecomp 5 [lk']] where -- two separate patches
       ld  = (ldart `at` origin)
       lk  = inflate ld  !!1
       rk  = inflate lk  !!1
       rk' = inflate rk  !!2
       ld' = inflate rk' !!0
       lk' = inflate ld' !!1

Finally, at the end of this literate haskell program we choose which figure to draw as output.

fig::Diagram B
fig = filledSun6
main = mainWith fig

That’s it. But, What about inflating whole patches?, I hear you ask. Unfortunately we need to answer questions like what pieces are adjacent to a piece in a patch and whether there is a corresponding other half for a piece. These cannot be done with our simple vector representations. We would need some form of planar graph representation, which is much more involved. That is another story.

Many thanks to Stephen Huggett for his inspirations concerning the tilings. A library version of the above code is available on GitHub

Further reading on Penrose Tilings

As well as the Wikipedia entry Penrose Tilings I recommend two articles in Scientific American from 2005 by David Austin Penrose Tiles Talk Across Miles and Penrose Tilings Tied up in Ribbons.

There is also a very interesting article by Roger Penrose himself: Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989.

More information about the diagrams package can be found from the home page Haskell diagrams

by readerunner at September 13, 2021 02:01 PM

Joachim Breitner

A Candid explainer: Language integration

This is the forth post in a series about the interface description language Candid.

Now for something completely different: How does Candid interact with the various host languages, i.e. the actual programming languages that you write your services and clients in?

There are two facets to that question:

  1. How is the Candid interface represented inside the language?

    Some languages with rich type systems can express all relevant information about a Candid method or service within its own type system, and then the concrete serialization/deserialization code can be derived from that type (e.g. using type classes in Haskell, Traits in Rust, or built into the compiler in Motoko).

    Other languages have a less rich type system (e.g. C), no type-driven generic programming or are simply dynamically typed. In these cases, the Candid type has to be either transformed into specific code for that service by some external tool (e.g. JavaScript and TypeScript) or the Candid description has to be parsed and interpreted at runtime.

    Either approach will give rise to a type mapping between the Candid types and the host language types. Developers will likely have to know which types correspond to which, which is why the Candid manual’s section on types explicitly lists that.

  2. How is the Candid interface description produced and consumed?

    This is maybe the even more important question; what comes first: The code written in the host language, or the Candid description. There are multiple ways to tackle this, all of which have their merits, so I let’s look at some typical approaches.

Generating candid from the host language

In many case you don’t care too much about the interface of your service, and you just want to write the functionality, and get the interface definition for free. This is what you get when you write Motoko services, where the compiler calculates the Candid interface based on the Motoko types of your actor methods, and the build tool (dfx) puts that Candid file where it needs to go. You can thus develop services without ever writing or even looking at Candid type definitions.

The Candid library for Rust supports that mode as well, although you have to add some macros to your program to use it.

A downside of this model is that you have only indirect control over the generated Candid. Since it is type-driven, whenever there is more than one sensible Candid type for a given host language type, the translation tool has to make a choice, and if that does not suit you, that can be a problem.

In the case of Motoko we were able to co-design it with Candid, and their type systems are similar enough that this works well in practice. We have a specification of Candid-Motoko-type-mappings, and the type export from from Motoko to Candid is almost surjective. (Almost, because of Candid’s float32 type, which Motoko simply does not have, and because of service types with methods names that are not valid Motoko identifiers.)

Checking host language against Candid

The above works well when you (as the service developer) get to define the service’s interface as you go. But sometimes you want to develop a service that adheres to a given Candid interface. For example, in order to respond to HTTP requests in an Internet Computer service, you should provide a method http_request that implements this interface (simplified):

type HeaderField = record { text; text; };

type HttpRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;

type HttpResponse = record {
  status_code: nat16;
  headers: vec HeaderField;
  body: blob;

service : {
  http_request: (request: HttpRequest) -> (HttpResponse) query;

Here, a suitable mode of operation is to generate the Candid description of the service that you built, and then compare it against this expected interface with a tool that implements the Candid subtyping relation. This would then complain if what you wrote was not compatible with the above interface. The didc check tool that comes with the Rust library can do that. If your service has to implement multiple such pre-defined interfaces, its actual interface will end up being a subtype of each of these interfaces.

Importing Candid

If you already have a Candid file, in particular if you are writing a client that wants to talk to an existing service, you can also import that Candid file into your language. This is a very common mode of operation for such interface descriptions languages (e.g. Protobuf). The details depend a lot on the host language, though:

  • In Motoko, you can import typed actor references: Just write import C "canister:foo" where foo is the name of another canister in your project, and your build tool (dfx) will pass the Candid interface of foo to the Motoko compiler, which then translates that Candid service type into a Motoko actor type for you to use.

    The mapping from Candid types to Motoko types is specified in as well, via the function i(…).

    The vision was always that you can import a reference to any canister on the Internet Computer this way (import C "ic:7e6iv-biaaa-aaaaf-aaada-cai"), and the build tool would fetch the interface automatically, but that has not been implemented yet, partly because of disagreement about how canisters expose their interface (see below). The Motoko compiler is ready, though, as described in its build tool interface specification.

    What is still missing in Motoko is a way to import a Candid type alone (without a concrete canister reference), to use the types somewhere (in function arguments, or to assert the type of the full canister).

  • The Rust library does not support importing Candid, as far as I know. If you write a service client in Rust, you have to know which Rust types map to the right Candid types, and manually get that right.

  • For JavaScript and TypeScript, the didc tool that comes with the Rust library can take a Candid file and produce JS resp. TS code that gives you an object representing the service with properly typed methods that you can easily interact with.

  • With the Haskell Candid library you can import Candid inline or from a file, and it uses metaprogramming (Template Haskell) to generate suitable Haskell types for you, which you can encode/decode at. This is for example used in the test suite of the Internet Identity, which executes that service in a simulated Internet Computer environment.

Generic and dynamic use

Finally, it’s worth mentioning that Candid can be used generically and dynamically:

  • Since Canisters can indicate their interface, a website like that enumerates Canisters can read that interface, and provide a fully automatically generated UI for them. For example, you can not only see the Candid interface of the Internet Identity canister at, but actually interact with it!

  • Similarly, during development of a service, it is very useful to be able to interact with it before you have written your actual front-end. The Candid UI tool provides that functionality, and can be used during local development or on-chain. It is also integrated into the Internet Computer playground.

  • Command line tools like dfx allow you to make calls to services, even without having their actual Candid interface description around. However, such dynamic use was never part of the original design of Candid, so it is a bit rough around the edges – the textual format is a bit unwieldy, you need to get the types right, and field name in the responses may be missing.

Do you want to know why field names are missing then? Then check out the next and final post of this series, where I discuss a number of Candid’s quirks.

by Joachim Breitner ( at September 13, 2021 01:47 PM

FP Complete

Combining Axum, Hyper, Tonic, and Tower for hybrid web/gRPC apps: Part 3

This is the third of four posts in a series on combining web and gRPC services into a single service using Tower, Hyper, Axum, and Tonic. The full four parts are:

  1. Overview of Tower
  2. Understanding Hyper, and first experiences with Axum
  3. Today's post: Demonstration of Tonic for a gRPC client/server
  4. How to combine Axum and Tonic services into a single service

Tonic and gRPC

Tonic is a gRPC client and server library. gRPC is a protocol that sits on top of HTTP/2, and therefore Tonic is built on top of Hyper (and Tower). I already mentioned at the beginning of this series that my ultimate goal is to be able to serve hybrid web/gRPC services over a single port. But for now, let's get comfortable with a standard Tonic client/server application. We're going to create an echo server, which provides an endpoint that will repeat back whatever message you send it.

The full code for this is available on GitHub. The repository is structured as a single package with three different crates:

  • A library crate providing the protobuf definitions and Tonic-generated server and client items
  • A binary crate providing a simple client tool
  • A binary crate providing the server executable

The first file we'll look at is the protobuf definition of our service, located in proto/echo.proto:

syntax = "proto3";

package echo;

service Echo {
  rpc Echo (EchoRequest) returns (EchoReply) {}

message EchoRequest {
  string message = 1;

message EchoReply {
  string message = 1;

Even if you're not familiar with protobuf, hopefully the example above is fairly self-explanatory. We need a file to use tonic_build to compile this file:

fn main() {
        .compile(&["proto/echo.proto"], &["proto"])

And finally, we have our mammoth src/ providing all the items we'll need for implementing our client and server:


There's nothing terribly interesting about the client. It's a typical clap-based CLI tool that uses Tokio and Tonic. You can read the source on GitHub.

Let's move onto the important part: the server.

The server

The Tonic code we put into our library crate generates an Echo trait. We need to implement that trait on some type to make our gRPC service. This isn't directly related to our topic today. It's also fairly straightforward Rust code. I've so far found the experience of writing client/server apps with Tonic to be a real pleasure, specifically because of how easy these kinds of implementations are:

use tonic_example::echo_server::{Echo, EchoServer};
use tonic_example::{EchoReply, EchoRequest};

pub struct MyEcho;

impl Echo for MyEcho {
    async fn echo(
        request: tonic::Request<EchoRequest>,
    ) -> Result<tonic::Response<EchoReply>, tonic::Status> {
        Ok(tonic::Response::new(EchoReply {
            message: format!("Echoing back: {}", request.get_ref().message),

If you look in the source on GitHub, there are two different implementations of main, one of them commented out. That one's the more straightforward approach, so let's start with that:

async fn main() -> anyhow::Result<()> {
    let addr = ([0, 0, 0, 0], 3000).into();



This uses Tonic's Server::builder to create a new Server value. It then calls add_service, which looks like this:

impl<L> Server<L> {
    pub fn add_service<S>(&mut self, svc: S) -> Router<S, Unimplemented, L>
        S: Service<Request<Body>, Response = Response<BoxBody>>
            + NamedService
            + Clone
            + Send
            + 'static,
        S::Future: Send + 'static,
        S::Error: Into<crate::Error> + Send,
        L: Clone

We've got another Router. This works like in Axum, but it's for routing gRPC calls to the appropriate named service. Let's talk through the type parameters and traits here:

  • L represents the layer, or the middlewares added to this server. It will default to Identity, to represent the no middleware case.
  • S is the new service we're trying to add, which in our case is an EchoServer.
  • Our service needs to accept the ever-familiar Request<Body> type, and respond with a Response<BoxBody>. (We'll discuss BoxBody on its own below.) It also needs to be NamedService (for routing).
  • As usual, there are a bunch of Clone, Send, and 'static bounds too, and requirements on the error representation.

As complicated as all of that appears, the nice thing is that we don't really need to deal with those details in a simple Tonic application. Instead, we simply call the serve method and everything works like magic.

But we're trying to go off the beaten path and get a better understanding of how this interacts with Hyper. So let's go deeper!


In addition to the serve method, Tonic's Router type also provides an into_service method. I'm not going to go into all of its glory here, since it doesn't add much to the discussion but adds a lot to the reading you'll have to do. Instead, suffice it to say that

  • into_service returns a RouterService<S> value
  • S must implement Service<Request<Body>, Response = Response<ResBody>>
  • ResBody is a type that Hyper can use for response bodies

OK, cool? Now we can write our slightly more long-winded main function. First we create our RouterService value:

let grpc_service = tonic::transport::Server::builder()

But now we have a bit of a problem. Hyper expects a "make service" or an "app factory", and instead we just have a request handling service. So we need to go back to Hyper and use make_service_fn:

let make_grpc_service = make_service_fn(move |_conn| {
    let grpc_service = grpc_service.clone();
    async { Ok::<_, Infallible>(grpc_service) }

Notice that we need to clone a new copy of the grpc_service, and we need to play all the games with splitting up the closure and the async block, plus Infallible, that we saw before. But now, with that in place, we can launch our gRPC service:

let server = hyper::Server::bind(&addr).serve(make_grpc_service);

if let Err(e) = server.await {
    eprintln!("server error: {}", e);

If you want to play with this, you can clone the tonic-example repo and then:

  • Run cargo run --bin server in one terminal
  • Run cargo run --bin client "Hello world!" in another

However, trying to open up http://localhost:3000 in your browser isn't going to work out too well. This server will only handle gRPC connections, not standard web browser requests, RESTful APIs, etc. We've got one final step now: writing something that can handle both Axum and Tonic services and route to them appropriately.


Let's look into that BoxBody type in a little more detail. We're using the tonic::body::BoxBody struct, which is defined as:

pub type BoxBody = http_body::combinators::BoxBody<bytes::Bytes, crate::Status>;

http_body itself provides its own BoxBody, which is parameterized over the data and error. Tonic uses the Status type for errors, and represents the different status codes a gRPC service can return. For those not familiar with Bytes, here's a quick excerpt from the docs

Bytes is an efficient container for storing and operating on contiguous slices of memory. It is intended for use primarily in networking code, but could have applications elsewhere as well.

Bytes values facilitate zero-copy network programming by allowing multiple Bytes objects to point to the same underlying memory. This is managed by using a reference count to track when the memory is no longer needed and can be freed.

When you see Bytes, you can semantically think of it as a byte slice or byte vector. The underlying BoxBody from the http_body crate represents some kind of implementation of the http_body::Body trait. The Body trait represents a streaming HTTP body, and contains:

  • Associated types for Data and Error, corresponding to the type parameters to BoxBody
  • poll_data for asynchronously reading more data from the body
  • Helper map_data and map_err methods for manipulating the Data and Error associated types
  • A boxed method for some type erasure, allowing us to get back a BoxBody
  • A few other helper methods around size hints and HTTP/2 trailing data

The important thing to note for our purposes is that "type erasure" here isn't really complete type erasure. When we use boxed to get a trait object representing the body, we still have type parameters to represent the Data and Error. Therefore, if we end up with two different representations of Data or Error, they won't be compatible with each other. And let me ask you: do you think Axum will use the same Status error type to represent errors that Tonic does? (Hint: it doesn't.) So when we get to it next time, we'll have some footwork to do around unifying error types.

Almost there!

We'll tie up next week with the final post in this series, tying together all the different things we've seen so far.

Read part 4 now

If you're looking for more Rust content, check out:

September 13, 2021 12:00 AM

September 09, 2021

Gabriel Gonzalez

Optics are monoids


This post documents my favorite lens trick of all time. Also, this trick works for any optics package based on van Laarhoven lenses, like lens-family-core or microlens.

This post assumes some familiarity with lenses, so if you are new to lenses then you might want to first read:

The title is slightly misleading and the precise statement is that Folds are Monoids, and all of the following optics are subtypes of Folds:

  • Getter
  • Lens
  • Traversal
  • Prism
  • Iso

… but I couldn’t fit that all of that in the title.

That means that if you combine any of the above optic types using <>, you will get a new optic that can be used as a Fold that combines their targets. For example:

>>> toListOf _1 (True, False)          -- _1 is a Lens
>>> toListOf _2 (True, False) -- _2 is a Lens
>>> toListOf (_1 <> _2) (True, False) -- (_1 <> 2) is a Fold

Also, mempty is the “empty” Fold that targets nothing:

>>> toListOf mempty (True, False)

There’s more to this trick, though, and we can build upon this idea to create optics that traverse complex data structures in a single pass.

Realistic example

To illustrate the trick, I’ll use a realistic example inspired by one of my interpreter side projects. I’ll keep things simple by reducing the original example to the following syntax tree for a toy lambda calculus implementation:

data Syntax
= Variable String
| Lambda String Syntax
| Apply Syntax Syntax
| Let String Syntax Syntax

example :: Syntax
example = Lambda "x" (Lambda "y" (Apply (Variable "x") (Variable "y")))

Now suppose we’d like to write a function that verifies that our syntax tree has no empty variable names. Without optics, we could write something like this:

wellFormed :: Syntax -> Bool
wellFormed (Variable name) =
not (null name)
wellFormed (Lambda name body) =
not (null name) && wellFormed body
wellFormed (Apply function argument) =
wellFormed function && wellFormed argument
wellFormed (Let name assignment body) =
not (null name) && wellFormed assignment && wellFormed body

… which works as expected on a few smoke tests:

>>> wellFormed example
>>> wellFormed (Variable "")

This implementation is simple enough for now. However, real interpreters tend to add a whole bunch of other constructors to the syntax tree. For example, each new keyword or datatype we add to the language will add another constructor to the syntax tree and each new constructor increases the boilerplate code for functions like wellFormed.

Thankfully, the lens and generic-lens packages provide useful utilities that simplify recursive functions like wellFormed. All we have to do is derive Plated and Generic for our Syntax type, like this:

{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric #-}

module Example where

import Data.Generics.Product (the)
import Data.Generics.Sum (_As)
import GHC.Generics (Generic)

import Control.Lens
import Data.Data (Data)

data Syntax
= Variable String
| Lambda String Syntax
| Apply Syntax Syntax
| Let String Syntax Syntax
deriving (Data, Generic, Show)

instance Plated Syntax

example :: Syntax
example = Lambda "x" (Lambda "y" (Apply (Variable "x") (Variable "y")))

Once we derive Plated we can use the cosmos lens to zoom in on all sub-expressions:

>>> toListOf cosmos example
[ Lambda "x" (Lambda "y" (Apply (Variable "x") (Variable "y")))
, Lambda "y" (Apply (Variable "x") (Variable "y"))
, Apply (Variable "x") (Variable "y")
, Variable "x"
, Variable "y"

… and when we derive Generic we can further narrow down the results using _As and the from the generic-lens package:

>>> :set -XTypeApplications
>>> :set -XDataKinds
>>> toListOf (cosmos . _As @"Variable") example -- Get all Variable names

>>> toListOf (cosmos . _As @"Lambda" . the @1) example -- Get all Lambda names

>>> toListOf (cosmos . _As @"Let" . the @1) example -- Get all Let names

So we can combine these tricks to implement our wellFormed function using optics to handle the automatic tree traversal:

wellFormed :: Syntax -> Bool
wellFormed syntax =
hasn't (cosmos . _As @"Variable" . only "") syntax
&& hasn't (cosmos . _As @"Lambda" . the @1 . only "") syntax
&& hasn't (cosmos . _As @"Let" . the @1 . only "") syntax

… but we’re not done here!

The cosmos Traversal factored away some of the repetition because we no longer need to recursively descend into subexpressions any longer. We also no longer need to explicitly handle constructors that have no variable names, like Apply.

Our wellFormed function is still repetitive, though, because three times in a row we write:

hasn't (cosmos .. only "") syntax

… and we’d like to factor out this repetition.

Our first instinct might be to factor out the repetition with a helper function, like this:

wellFormed :: Syntax -> Bool
wellFormed syntax =
noEmptyVariables (_As @"Variable")
&& noEmptyVariables (_As @"Lambda" . the @1)
&& noEmptyVariables (_As @"Let" . the @1)
noEmptyVariables fold =
hasn't (cosmos . fold . only "") syntax

… and that does work, but there is actually a better way. We can instead use the fact that Traversals are also Folds and Folds are Monoids to write this:

wellFormed :: Syntax -> Bool
wellFormed syntax =
( cosmos
. ( _As @"Variable"
<> _As @"Lambda" . the @1
<> _As @"Let" . the @1
. only ""

In other words, we can take the following three Traversals that each focus on a different source of variable names:

  • _As @"Variable" - Focus in on variable names
  • _As @"Lambda" . the @1 - Focus in on Lambda-bound variables
  • _As @"Let" . the @1 - Focus in on Let-bound variables

… and when we combine them using <> we get a Fold that focuses on all possible sources of variable names. We can then chain this composite Fold in between cosmos and only to find all of the empty variable names.

In fact, we’re not limited to using <>. Any utility that works on Monoids will work, like mconcat, so we can refactor our code even further like this:

wellFormed :: Syntax -> Bool
wellFormed = hasn't (cosmos . names . only "")

-- | Get all variable names within the current constructor
names :: Monoid m => Getting m Syntax String
names =
[ _As @"Variable"
, _As @"Lambda" . the @1
, _As @"Let" . the @1

Now we have factored out a useful and reusable names Fold1 that we can combine with cosmos to get all names within our syntax tree:

>>> toListOf (cosmos . names) example

This means that any new contributor to our interpreter can register a new source of variable names by adding a Traversal to that list and all downstream functions that use names will automatically update to do the right thing.

Why this trick?

I briefly touched on some other cool tricks in the course of this post, including:

  • The use of Plated for simplifying recursive Traversals
  • The use of generic-lens for boilerplate-free optics

… so why do I consider “optics as monoids” to be the coolest trick of them all? After all, Plated and generic-lens did most of the heavy lifting in the above example.

The reason why I particularly love the Monoid instance for lenses is because (as far as I know) nobody ever designed lenses to do this; this is purely an emergent property of independent design choices spread out over time.

This shouldn’t surprise us too much, because Haskell’s mathematically inspired type classes sort of follow the Unix philosophy to Do one thing and do it well. If each piece does one small thing correctly, then if you combine multiple pieces then the composite result is correct by construction.

However, you don’t need to take my word for it. I’ll walk through in detail how this works, first at the type level and then at the term level.

Type level

The first piece of the puzzle is the following Semigroup and Monoid instances for functions in base:

instance Semigroup b => Semigroup (a -> b) where
(f <> g) x = f x <> g x

instance Monoid b => Monoid (a -> b) where
mempty x = mempty

These instances says that a function is a Monoid if the function’s result is also a Monoid. We combine two functions by combining their outputs (when given the same input) and the empty function ignores its input and produces an empty output.

The second piece of the puzzle is the Const type (the constant Functor), which has a Semigroup and Monoid instance, too:

newtype Const a b = Const { getConst :: a }

instance Monoid a => Monoid (Const a b) where
mempty = Const mempty

instance Semigroup a => Semigroup (Const a b) where
Const x <> Const y = Const (x <> y)

These instances are so simple that we could have just derived them (and indeed, that is what the base package does):

newtype Const a b = Const { getConst :: a }
deriving newtype (Semigroup, Monoid)

In other words, Const a b is a Monoid if a is a Monoid. Combining two Consts combines their stored value, and the empty Const stores an empty value.

The final piece of the puzzle is that a Fold from the lens package is just a higher-order function over Consts:

-- This not the real type, but it's equivalent
type Fold a b = forall m . Monoid m => (b -> Const m b) -> (a -> Const m a)

… and that type is a valid Monoid, because:

  • (b -> Const m b) -> (a -> Const m a) is a Monoid if (a -> Const m a) is a Monoid

    … according to the Monoid instance for functions

  • a -> Const m a is a Monoid if Const m a is a Monoid

    … also according to the Monoid instance for functions

  • Const m a is a Monoid if m is a Monoid

    … according to the Monoid instance for Const

  • m is a Monoid

    … according to the Monoid m => constraint in the type of Fold

Therefore, all Folds are Monoids.

Term level

However, knowing that a Fold type-checks as a Monoid is not enough. We want to build an intuition for what happens when we use Monoid operations on Folds.

The most compact way we can state our intuition is by the following two laws:

toListOf (l <> r) a = toListOf l a <> toListOf r a

toListOf mempty a = mempty

In other words, if you combine two Folds then you combine their “targets”, and the empty Fold has no targets.

These laws are also known as “monoid morphism laws”. In other words, toListOf is a function that transforms one type of Monoid (Folds) into another type of Monoid (lists).

We can verify those two laws using equational reasoning, but in order to do so we need to use the following simplified definition for toListOf:

{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const(..))

toListOf :: Fold a b -> a -> [b]
toListOf fold a = getConst (fold (\b -> Const [b]) a)

The real toListOf function from the lens package has a different, but equivalent, implementation. The real implementation is more efficient, but takes more steps when proving things using equational reasoning, so I prefer to use this simpler implementation.

Now let’s prove the two monoid morphism laws. The proof for the first law is:

toListOf (l <> r) a

-- Substitute `toListOf` with its definition
= getConst ((l <> r) (\b -> Const [b]) a)

-- If `l` and `r` are functions, then according to the `Semigroup` instance for
-- functions:
-- (f <> g) x = f x <> g x
-- … where in this case:
-- f = l
-- g = r
-- x = \b -> Const [b]
= getConst ((l (\b -> Const [b]) <> r (\b -> Const [b])) a)

-- Use the `Semigroup` instance for functions again, except this time:
-- f = l (\b -> Const [b])
-- g = r (\b -> Const [b])
-- x = a
= getConst (l (\b -> Const [b]) a <> r (\b -> Const [b]) a)

-- Now use the `Semigroup` instance for `Const`, which (essentially) says:
-- getConst (x <> y) = getConst x <> getConst y
-- … where:
-- x = l (\b -> Const [b]) a
-- y = r (\b -> Const [b]) a
= getConst (l (\b -> Const [b]) a) <> getConst (r (\b -> Const [b]) a)

-- Now apply the definition of `toListOf`, but in reverse:
= toListOf l a <> toListOf r a

… and the proof for the second law is:

toList mempty a

-- Substitute `toListOf` with its definition
= getConst (mempty (\b -> Const [b]) a)

-- If `mempty` is a function, then according to the `Monoid` instance for
-- functions:
-- mempty x = mempty
-- … where in this case:
-- x = \b -> Const [b]
= getConst (mempty a)

-- Use the `Monoid` instance for functions again, except this time:
-- x = a
= getConst mempty

-- Now use the `Monoid` instance for `Const`, which says:
-- mempty = Const mempty
= getConst (Const mempty)

-- getConst (Const x) = x
= mempty


Hopefully that gives you a taste for how slick and elegant Haskell’s lens package is. If you like this post, you might also like these other posts:

Also, I know that I skimmed through the subjects of Plated and generic-lens, which are interesting topics in their own right. I hope to cover those in more detail in future posts.

I don’t know if this trick can be made to work for optics (an alternative to lens that uses an abstract interface to improve error messages). I know that it does not work at the time of this writing, but perhaps a Monoid instance could be added for the Optic type? I also have no idea if this trick or a related trick works for profunctor-optics (a different formulation of lenses based on profunctors).

I haven’t benchmarked whether combining Folds is faster than doing separate passes over the same data structure. I believe it’s more lazy, though, to process the data structure in a single pass using a composite Fold.


Here is the complete code example if you want to test this out locally:

{-# LANGUAGE DataKinds          #-}
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeApplications #-}

module Example where

import Data.Generics.Product (the)
import Data.Generics.Sum (_As)
import GHC.Generics (Generic)

import Control.Lens
import Data.Data (Data)

data Syntax
= Variable String
| Lambda String Syntax
| Apply Syntax Syntax
| Let String Syntax Syntax
deriving (Data, Generic, Show)

instance Plated Syntax

example :: Syntax
example = Lambda "x" (Lambda "y" (Apply (Variable "x") (Variable "y")))

wellFormed :: Syntax -> Bool
wellFormed = hasn't (cosmos . names . only "")

names :: Monoid m => Getting m Syntax String
names =
[ _As @"Variable"
, _As @"Lambda" . the @1
, _As @"Let" . the @1

  1. The type of names is morally Fold Syntax String, which according to the lens documentation is the essentially same type, but only the Getting version of the type will type-check here.↩︎

by Gabriella Gonzalez ( at September 09, 2021 02:34 PM

Brent Yorgey

Competitive programming in Haskell: Kadane’s algorithm

I will be giving a talk on competitive programming in Haskell tomorrow, September 10, at Haskell Love. You should attend! It’s free!

In my last competitive programming post, I challenged you to solve Purple Rain. We are presented with a linear sequence of cells, each colored either red or blue, and we are supposed to find the (contiguous) segment of cells with the maximal absolute difference between the number of red and blue. For example, below is shown one of the sample inputs, with the solution highlighted: the segment from cell 3 to cell 7 (the cells are numbered from 1) has four red cells compared to only one blue, for an absolute difference of three. You can verify that no other segment does better.

Transforming the problem

The obvious way to do this is to generate a list of all segments, compute the absolute difference between the number of red and blue cells for each, and take the maximum. However, that approach is doomed to exceed the time limit in any programming language: it would take O(n^3) time (O(n^2) possible segments times O(n) to sum each one), and the problem states that n can be up to 10^5. With 10^8 operations per second as a good rule of thumb, O(n^3) with n = 10^5 is clearly too slow. (In fact, any time you see an input size of 10^5, it is a dead giveaway that you are expected to find an O(n) or O(n \lg n) solution. 10^5 is big enough to make an O(n^2) solution prohibitively slow, but not so big that I/O itself becomes the bottleneck.)

The first insight is that we can transform this into the classic problem of finding the maximum sum subarray (also known as the maximum segment sum; either way I will abbreviate it as MSS) in two steps: first, turn each red cell into a 1, and each blue into -1. The sum of a segment then tells us how many more red than blue cells there are. Now, we actually want the biggest absolute difference between red and blue; but if we have an algorithm to find the MSS we can just run it twice: once to find the maximum excess of red over blue, and again with 1 and -1 flipped to find the maximum excess of blue over red.

The MSS problem has a long history in the functional programming community, being one of the flagship problems to demonstrate the techniques of program derivation in the style of the Bird-Meertens Formalism, aka Squiggol and The Algebra of Programming. It is possible to start out with a naive-but-obviously-correct O(n^3) implementation, and do a series of equational transformations to turn it into an efficient O(n) algorithm! If you’ve never seen that kind of thing before, I highly recommend checking it out; the Wikipedia page on the Bird-Meertens Formalism, linked above, is a good place to start. Certainly getting good at such derivations can be a handy skill when doing competitive programming in Haskell. But in any case, today I want to approach the problem from a different point of view, namely, coming up with a good functional equivalent to an existing imperative algorithm.

Kadane’s algorithm

Kadane’s algorithm, first proposed by Jay Kadane sometime in the late 1970s, is a linear-time algorithm for solving the MSS problem. It is actually quite simple to implement (the tricky part is understanding why it works!).

The idea is to loop through an array while keeping track of two things: a current value cur, and a best value. The best value is just the greatest value cur has ever taken on, so keeping it updated is easy: on every loop, we compare cur to best, and save the value of cur into best if it is higher. To keep cur updated, we simply add each new array value to it—but if it ever falls below zero, we just reset cur to zero. Here is some Java code:

public static int kadane(int[] a) {
    int best = 0, cur = 0;
    for (int i = 0; i < a.length; i++) {
        cur += a[i];
        if (cur < 0) cur = 0;
        if (cur > best) best = cur;
   return best;

Again, it is not at all obvious why this works, though putting in the effort to understand a proof is well worth the time. That is not the purpose of this blog post, however, so I’ll leave you to read about it on your own!

Translating Kadane’s algorithm to Haskell

In the imperative version, we iterate through a list, keep track of a current value, and also keep track of the best value we have seen so far. It is possible to translate this directly to Haskell: create a record with two fields, one for the current thing and one for the best thing, then iterate through the list with foldl', doing the appropriate update at each step:

data State s = State { curThing :: s, bestThing :: s }

-- Given a way to update the current s value with the next list
-- element of type a, update a State s value which keeps track of the
-- current s value as well as the best s value seen so far.
step :: Ord s => (s -> a -> s) -> (State s -> a -> State s)
step update (State cur best) a = State next (max best next)
    next = update cur a

bestIntermediate :: Ord s => (s -> a -> s) -> s -> [a] -> s
bestIntermediate update init = bestThing . foldl' (step update) (State init init)

But there’s a much better way! Note that the update function has the right type to be used with foldl'. But if we just computed foldl' update init directly, we would get only the single s value at the very end. But our goal is to get the best out of all the intermediate values. No problem: a scan is just a fold that returns all the intermediate values instead of only the final one! So instead of all this complicated and quasi-imperative State stuff, we just do a scanl' followed by taking the maximum:

bestIntermediate :: Ord s => (s -> a -> s) -> s -> [a] -> s
bestIntermediate update init = maximum . scanl' update init

Ah, much better! Using bestIntermediate, we can now translate Kadane’s algorithm as follows:

kadane1 :: [Int] -> Int
kadane1 = bestIntermediate next 0
    next s a = max 0 (s + a)

Whenever I write down an algorithm like this in Haskell—especially if I have “translated” it from an existing, imperative algorithm—I like to figure out how I can generalize it as much as possible. What structure is assumed of the inputs that makes the algorithm work? Can we replace some concrete monomorphic types with polymorphic ones? What type class constraints are needed? Often these sorts of generalizations make for supposedly tricky competitive programming problems. For example, if you have only ever seen Dijkstra’s algorithm presented in terms of finding shortest paths by summing edge weights, it takes quite a bit of insight to realize that the algorithm really just finds the “best” path with respect to operations that play nicely with each other in certain ways. For example, if we use the operations max and min in place of min and (+), Dijkstra’s algorithm finds the path with the maximum bottleneck. (This will probably end up being its own blog post at some point…)

Anyway, how can we generalize kadane1? The obvious starting point is that if we just let GHC infer a type for kadane1, we would get something more general:

kadane2 :: (Num a, Ord a) => [a] -> a
kadane2 = bestIntermediate next 0
    next s a = max 0 (s + a)

The only thing we do with the input list elements is add them and compare them; we also need 0 to have the same type as the input list elements. So the algorithm works for anything that has Ord and Num instances.

But wait—do we really need Num if all we are using is 0 and +? Really we are using the monoid of integers under addition. So we can generalize again, to any ordered monoid:

kadane :: (Monoid a, Ord a) => [a] -> a
kadane = bestIntermediate next mempty
    next s a = max mempty (s <> a)

In fact, if you study the proof of Kadane’s algorithm, you will see that this works just so long as the monoid operation interacts nicely with the ordering, that is, if x < y implies x \diamond z < y \diamond z and z \diamond x < z \diamond y for all z (this is what is usually meant by an “ordered monoid”).

Finding the best segment

So far, our code finds the best segment sum, but it doesn’t tell us which segment it was that was best—and for this problem we are actually supposed to output the starting and ending indices of the best segment, not the maximum red-blue difference itself.

If I were doing this in Java, I would probably just add several more variables: one to record where the segment currently being considered starts (which gets reset to i+1 when cur is reset to 0), and two to record the start and end indices of the best segment so far. This gets kind of ugly. Conceptually, the values actually belong in triples representing a segment: the start and end index together with the sum. In Java, it would be too heavyweight to construct a class to store these three values together, so in practice I would just do it with a mess of individual variables as described above. Fortunately, in Haskell, this is very lightweight, and we should of course create a data type to represent a segment.

It’s also worth noting that in Haskell, we were naturally led to make a polymorphic bestIntermediate function, which will work just as well with a segment type as it does Int. Only our kadane function itself will have to change. We will make a data type to represent segments, with an appropriate Ord instance to specify when one segment is better than another, and we will update the next helper function to update a segment instead of just updating a sum.

The solution

So let’s get started! First, some LANGUAGE pragmas and imports we will need, and a basic solution framework.

{-# LANGUAGE DerivingStrategies         #-}
{-# LANGUAGE DerivingVia                #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE LambdaCase                 #-}
{-# LANGUAGE ViewPatterns               #-}

import           Control.Arrow         ((>>>))
import           Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as C
import           Data.List             (scanl')
import           Data.Semigroup

showB :: Show a => a -> C.ByteString
showB = show >>> C.pack

main = C.interact $ solve >>> format

Next, a data type to represent a segment, along with an Ord instance, and a function to format the answer for this problem. Note that we make all the fields of Segment strict (this makes a big difference in runtime—you should pretty much always use strict fields unless there is a really good reason not to). Notice also that the Ord instance is where we encode the problem’s specific instructions about how to break ties: “If there are multiple possible answers, print the one that has the Westernmost (smallest-numbered) starting section. If there are multiple answers with the same Westernmost starting section, print the one with the Westernmost ending section.” So we first compare Segments by sum, then by left index, then by right index, being careful to reverse the order of comparison for the indices, since better for indices means smaller.

-- The segment [l,r) (i.e. l inclusive, r exclusive), with its sum
data Segment a = I { l :: !Int, r :: !Int, s :: !a } deriving (Eq, Show)

instance Ord a => Ord (Segment a) where
  compare (I l1 r1 s1) (I l2 r2 s2)
    = compare s1 s2 <> compare l2 l1 <> compare r2 r1

format :: Segment a -> ByteString
format (I l r _) = C.unwords [showB l, showB (r-1)]
  -- (r-1) since r is exclusive but we are supposed to show
  -- the index of the last element

And now for Kadane’s algorithm itself. The bestIntermediate function is unchanged. kadane changes to start with an appropriate “empty” segment, and to use a next function that updates the current segment [l,r) based on the next list element a (the one at index r+1). The right index always gets incremented to r+1. If the current sum combined with a is “negative” (that is, less than mempty), we reset the left index to r+1 as well (making the segment empty), and the running sum to mempty. Otherwise, we leave the left index unchanged and add a to the running sum. I also added an argument to indicate the index of the first list element, since in this problem the list is supposed to be indexed from 1, but future problems might be indexed from 0, and I would never remember which it is. (I suppose we could also take a list of pairs (i,a) where i is the index of a. That would even work for non-consecutive indices.)

bestIntermediate :: Ord s => (s -> a -> s) -> s -> [a] -> s
bestIntermediate update init = maximum . scanl' update init

kadane :: (Monoid a, Ord a) => Int -> [a] -> Segment a
kadane ix = bestIntermediate next (I ix ix mempty)
    next (I l r s) a
      | s<>a < mempty = I (r+1) (r+1) mempty
      | otherwise     = I l     (r+1) (s<>a)

Finally, we can write the solve function itself. We map R’s and B’s in the input to 1 and -1, respectively, to generate a list of 1’s and -1’s called mountain. Then we call kadane on mountain and on map negate mountain, and pick whichever gives us a better answer. But wait, not quite! Remember that kadane needs a Monoid instance for the list elements, and Int has none. So we can whip up a quick newtype, Step, that has the right instances (note how deriving Num also allows us to use the literal values 1 and -1 as Step values). DerivingVia is quite handy in situations like this.

solve :: ByteString -> Segment Step
solve (C.init -> b) = max (kadane 1 mountain) (kadane 1 (map negate mountain))
    mountain = map (\case { 'R' -> 1; 'B' -> -1 }) (C.unpack b)

newtype Step = Step Int
  deriving (Semigroup, Monoid) via Sum Int
  deriving (Eq, Ord, Num)

Next time: Modulo Solitaire and fast BFS

For next time, I invite you to solve Modulo Solitaire. Warning: this is a straightforward BFS problem; the issue is getting a Haskell solution to run fast enough! I struggled with this for quite some time before coming up with something that worked. My ultimate goal in this case is to develop a polished library with an elegant, functional API for doing BFS, but that runs fast under the hood. I’m very curious to see how others might approach the problem.

by Brent at September 09, 2021 11:26 AM

Matt Parsons

Deferred Derivation

justifiably lazy orphans

(alternative subtitle: “I used the TemplateHaskell to destroy the TemplateHaskell”)

At the day job, we use the aeson-typescript library to generate TypeScript types from our Haskell types. One problem is that the library uses TemplateHaskell to do this, which means we have TemplateHaskell in almost all of our datatype-defining modules. Due to the TemplateHaskell recompilation avoidance bug (any module that uses TemplateHaskell must always be recompiled if any transitive dependency is changed), this means we spend a lot of time recompiling a lot of modules that don’t need to change. Freeing a module from TemplateHaskell speeds up our build tremendously - not because TemplateHaskell is slow (it’s very fast) but because compiling at all is slow. The best way to speed something up is to spend 0 time doing it - don’t do it at all!

Anyway, I investigated switching the library to a Generics based approach, but it’s complicated and looked tricky, so I decided not to pursue that. The library defines a type class HasJSONOptions, which lets you re-use the same JSON options for both aeson’s JSON encoding classes and the TypeScript class.

I had recently done some work with discoverEntities, a TemplateHaskell function which gathers the PersistEntity instances in scope and collects their entityDef (Proxy :: Proxy a). I started wondering - can I use this trick to defer the derivation and omit the TemplateHaskell? In preparation, I wrote discover-instances, which generalized the above pattern (and was a fun exercise in typed TemplateHaskell quotes).

Now, we might have had an instance like this:

data X = X { ... }

$(deriveJSON defaultOptions ''X)
$(deriveTypeScript defaultOptions ''X)

We can excise the TemplateHaskell entirely by using Generics-based derivation for the JSON classes, and specify an instance for the HasJSONOptions class.

data X = X { ... }
    deriving stock Generic

instance HasJSONOptions X where
    getJSONOptions _ =

instance ToJSON X where
    toJSON =
        genericToJSON (getJSONOptions (Proxy :: Proxy X))

instance FromJSON X where
    parseJSON =
        genericFromJSON (getJSONOptions (Proxy :: Proxy X))

But we don’t yet have that TypeScript instance.

Let’s look at the module that actually generates our TypeScript code.

module MakeTypeScript where

import Model.X
import Model.Y
import Model.Z

writeTypeScript :: IO ()
writeTypeScript = do
    writeFile frontendTypes $ concat
        [ makeTypeScriptFor (Proxy :: Proxy X)
        , makeTypeScriptFor (Proxy :: Proxy Y)
        , makeTypeScriptFor (Proxy :: Proxy Z)

This is the only place those instances are ever used. With the above change (eg deleting the deriveTypeScript stuff), there’s no longer an instance present. But I can fix that by deriving the instance in this file.

module MakeTypeScript where

import Model.X
import Model.Y
import Model.Z

$(deriveTypeScript (getJSONOptions (Proxy :: Proxy X)) ''X)
$(deriveTypeScript (getJSONOptions (Proxy :: Proxy Y)) ''Y)
$(deriveTypeScript (getJSONOptions (Proxy :: Proxy Z)) ''Z)

writeTypeScript :: IO ()
writeTypeScript = do
    writeFile frontendTypes $ concat
        [ makeTypeScriptFor (Proxy :: Proxy X)
        , makeTypeScriptFor (Proxy :: Proxy Y)
        , makeTypeScriptFor (Proxy :: Proxy Z)

This is deeply unsatisfying to me. We have three points of repetition:

  1. importing a type
  2. deriveing an instance for it
  3. Mentioning the type in writeTypeScript.

Let’s improve this.

First, we want to use discoverInstances to splice in all the HasJSONOptions instances that are visible. Second, we’ll iterate over each instance, and derive the code there.

module MakeTypeScript where

import Model.X
import Model.Y
import Model.Z

    decs <-
            $$(discoverInstances @HasJSONOptions)
            $ \proxy@(Proxy :: Proxy a) -> do
                deriveTypeScript (getJSONOptions proxy) (nameForType proxy)
    pure (concat decs)

writeTypeScript :: IO ()
writeTypeScript = do
    writeFile frontendTypes $ concat
        [ makeTypeScriptFor (Proxy :: Proxy X)
        , makeTypeScriptFor (Proxy :: Proxy Y)
        , makeTypeScriptFor (Proxy :: Proxy Z)

forInstances is a convenience function that lets you operate on the proxy with the constraint in scope.

This totally works. We’ve derived all of the instances of TypeScript, and now we’re using them quite happily.

Now we’re down to two points of repetition - importing a type and writing it specifically in makeTypeScriptFor. We can’t get rid of imports, so let’s look at the writeFile list. This is pretty easy to get rid of using our SomeDict TypeScript.

module MakeTypeScript where

import Model.X
import Model.Y
import Model.Z

    decs <-
            $$(discoverInstances @HasJSONOptions)
            $ \proxy@(Proxy :: Proxy a) -> do
                deriveTypeScript (getJSONOptions proxy) (nameForType proxy)
    pure (concat decs)

writeTypeScript :: IO ()
writeTypeScript = do
    writeFile frontendTypes $ concat $ 
            $$(discoverInstances @TypeScript)
            $ \proxy -> makeTypeScriptFor proxy

And we’re done!


  1. No more TemplateHaskell requirement in a module that defines an API type!
  2. No more boring repetition!
  3. We got to use fancy types!


  1. Orphan instances :(
  2. There’s still TemplateHaskell even if it’s localized
  3. Generic-based deriving is slower, so a clean build will be worse

September 09, 2021 12:00 AM

Oleg Grenrus

Leibniz coercion

Posted on 2021-09-09 by Oleg Grenrus

Recently Ryan Scott wrote an article about Leibniz (and Martin-Löf) equality. Interestingly we can do the same thing for coercions (which are representational equalities). That exercise stumbles on the first-orderness of role system in GHC, making it a good example to be solved. (I’d like to have a solution, but I only have a problem).

{-# LANGUAGE GADTs, RankNTypes, QuantifiedConstraints #-}
{-# LANGUAGE StandaloneKindSignatures, PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables, TypeOperators, TypeApplications #-}
import Data.Coerce (Coercible, coerce)
import Data.Kind (Type)
import Data.Functor.Identity (Identity (..))

The Coercion type defined in Data.Type.Coercion is akin to Martin-Löf equality (:~:):

type Coercion :: k -> k -> Type
data Coercion a b where
  Coercion :: Coercible a b => Coercion a b

compare that to alternative way1 to write (:~:):

type (:~:) :: k -> k -> Type
data a :~: b where
  Refl :: a ~ b => a :~: b

There is an analog of castWith function, called coerceWith, a type-safe cast using representational equality:

coerceWith' :: Coercion a b -> a -> b
coerceWith' Coercion x = coerce x

Symmetry, transitivity etc combinators are defined similarly by pattern matching on the constructor.

Leibniz coercion

What would Leibniz variant look like? Recall identity of indiscernibles: two objects are equal if and only if they satisfy the same properties. We encode that in Haskell as:

type (:=) :: k -> k -> Type
newtype a := b = Leibniz { subst' :: forall c. c a -> c b }

where c is all properties, or contexts. For Coercion we need to restrict that to the representational 2 contexts. We can encode representational contexts using QuantifiedConstraints extension:

type Koerzion :: k -> k -> Type
newtype Koerzion a b = Koerzion
  { subst :: forall c. (forall x y. Coercible x y => Coercible (c x) (c y))
          => c a -> c b

(Leibniz was German, and this is the first name I thought of. I do not know German, I’m sorry).

Another way to think about this is:

  • Everything should respect the nominal equality.

  • However, only representational things respect representational equality, i.e. coercions.

Next, lets defined define some basic combinators. Reflexivity is defined similarly as for the (nominal) Leibniz equality:

refl :: Koerzion a a
refl = Koerzion id

We can generalise it to arbitrary Coercible types, that is the point of coercions:

koerzion :: Coercible a b => Koerzion a b
koerzion = Koerzion coerce

The type-safe cast do not pose any problems either:

coerceWith :: Koerzion a b -> a -> b
coerceWith ab a = runIdentity (subst ab (Identity a))

With transitivity we step into a problem:

trans :: forall a b c. Koerzion a b -> Koerzion b c -> Koerzion a c
trans ab bc = subst bc ab

doesn’t work. Let’s ignore a problem for now, and follow David Feuer comment and defined transitivity as:

trans :: forall a b c. Koerzion a b -> Koerzion b c -> Koerzion a c
trans ab bc = Koerzion (subst bc . subst ab)

So far so good. We have defined reflexivity, transitivity, and type-safe cast.


If we try to define symmetry, following the nominal Leibniz equality example:

type Symm :: k -> k -> Type
newtype Symm a b = Symm { unsymm :: Koerzion b a }

sym :: forall a b. Koerzion a b -> Koerzion b a
sym ab = unsymm (subst ab (Symm refl))

we get an error (which is similar to what we get in the first transitivity attempt):

Couldn't match representation of type: c y
                               with that of: c x
        arising from a use of ‘subst’

though I’d expected

Couldn't match representation of type: Symm a a
                               with that of: Symm a b
        arising from a use of ‘subst’

However, I believe that it caused by the same underlying reason: Koerzion roles are not representational:

*Main> :i Koerzion 
type role Koerzion nominal nominal nominal
type Koerzion :: forall k. k -> k -> *

In comparison, GADT variant Coercion is:

*Main> :i Coercion
type Coercion :: forall k. k -> k -> *
type role Coercion nominal representational representational

This highlights the first-orderness of the role system. Koerzion is morally representational in a and b, but GHC doesn’t want to infer it. (And there is no way to say trust me). The QuantifiedConstraints "hack" to internalize representational contexts is not integrated into roles system and Coercible constraint solver (AFAIK), so the things do not work out.

Equivalence of Leibniz and Martin-Löf-like coercions

However, Coercion and Koerzion are equivalent:

toKoerzion :: Coercion a b -> Koerzion a b
toKoerzion Coercion = Koerzion coerce

fromKoerzion :: forall a b. Koerzion a b -> Coercion a b
fromKoerzion ab = subst ab @(Coercion a) (Coercion :: Coercion a a)

And we can define sym via that equivalence:

sym :: forall a b. Koerzion a b -> Koerzion b a
sym = toKoerzion . sym' . fromKoerzion

sym' :: Coercion a b -> Coercion b a
sym' Coercion = Coercion

which justifies my "Koerzion is morally representational" claim, because Coercion is representational, Koerzion should be too.

It would be very nice if GHC had higher-order roles. There are plenty of practical examples which run into the same problem (e.g. the core type in pipes is not representational in any of its type arguments, though it should be). But maybe this Koerzion example will nerd-snip someone to solve this problem :)

  1. I’m not sure that the representation of this and original variant is exactly the same, i.e. whether the equality constraint is unboxed in both variants. That details is not important for us now.↩︎

  2. Another Ryan’s blog post has a section about roles. GHC manual is also a good source of knowledge.↩︎

September 09, 2021 12:00 AM

September 08, 2021

Brent Yorgey

Implementing Hindley-Milner with the unification-fd library

For a current project, I needed to implement type inference for a Hindley-Milner-based type system. (More about that project in an upcoming post!) If you don’t know, Hindley-Milner is what you get when you add polymorphism to the simply typed lambda calculus, but only allow \forall to show up at the very outermost layer of a type. This is the fundamental basis for many real-world type systems (e.g. OCaml or Haskell without RankNTypes enabled).

One of the core operations in any Hindley-Milner type inference algorithm is unification, where we take two types that might contain unification variables (think “named holes”) and try to make them equal, which might fail, or might provide more information about the values that the unification variables should take. For example, if we try to unify a -> Int and Char -> b, we will learn that a = Char and b = Int; on the other hand, trying to unify a -> Int and (b, Char) will fail, because there is no way to make those two types equal (the first is a function type whereas the second is a pair).

I’ve implemented this from scratch before, and it was a great learning experience, but I wasn’t looking forward to implementing it again. But then I remembered the unification-fd library and wondered whether I could use it to simplify the implementation. Long story short, although the documentation for unification-fd claims it can be used to implement Hindley-Milner, I couldn’t find any examples online, and apparently neither could anyone else. So I set out to make my own example, and you’re reading it. It turns out that unification-fd is incredibly powerful, but using it can be a bit finicky, so I hope this example can be helpful to others who wish to use it. Along the way, resources I found especially helpful include this basic unification-fd tutorial by the author, Wren Romano, as well as a blog post by Roman Cheplyaka, and the unification-fd Haddock documentation itself. I also referred to the Wikipedia page on Hindley-Milner, which is extremely thorough.

This blog post is rendered automatically from a literate Haskell file; you can find the complete working source code and blog post on GitHub. I’m always happy to receive comments, fixes, or suggestions for improvement.

Prelude: A Bunch of Extensions and Imports

We will make GHC and other people’s libraries work very hard for us. You know the drill.

> {-# LANGUAGE DeriveAnyClass        #-}
> {-# LANGUAGE DeriveFoldable        #-}
> {-# LANGUAGE DeriveFunctor         #-}
> {-# LANGUAGE DeriveGeneric         #-}
> {-# LANGUAGE DeriveTraversable     #-}
> {-# LANGUAGE FlexibleContexts      #-}
> {-# LANGUAGE FlexibleInstances     #-}
> {-# LANGUAGE GADTs                 #-}
> {-# LANGUAGE LambdaCase            #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE PatternSynonyms       #-}
> {-# LANGUAGE StandaloneDeriving    #-}
> {-# LANGUAGE UndecidableInstances  #-}
> import           Control.Category ((>>>))
> import           Control.Monad.Except
> import           Control.Monad.Reader
> import           Data.Foldable              (fold)
> import           Data.Functor.Identity
> import           Data.List                  (intercalate)
> import           Data.Map                   (Map)
> import qualified Data.Map                   as M
> import           Data.Maybe
> import           Data.Set                   (Set, (\\))
> import qualified Data.Set                   as S
> import           Prelude                    hiding (lookup)
> import           Text.Printf
> import           Text.Parsec
> import           Text.Parsec.Expr
> import           Text.Parsec.Language       (emptyDef)
> import           Text.Parsec.String
> import qualified Text.Parsec.Token          as L
> import           Control.Unification        hiding ((=:=), applyBindings)
> import qualified Control.Unification        as U
> import           Control.Unification.IntVar
> import           Data.Functor.Fixedpoint
> import           GHC.Generics               (Generic1)
> import           System.Console.Repline

Representing our types

We’ll be implementing a language with lambas, application, and let-expressions—as well as natural numbers with an addition operation, just to give us a base type and something to do with it. So we will have a natural number type and function types, along with polymorphism, i.e. type variables and forall. (Adding more features like sum and product types, additional base types, recursion, etc. is left as an exercise for the reader!)

So notionally, we want something like this:

type Var = String
data Type = TyVar Var | TyNat | TyFun Type Type

However, when using unification-fd, we have to encode our Type data type (i.e. the thing we want to do unification on) using a “two-level type” (see Tim Sheard’s original paper).

> type Var = String
> data TypeF a = TyVarF Var | TyNatF | TyFunF a a
>   deriving (Show, Eq, Functor, Foldable, Traversable, Generic1, Unifiable)
> type Type = Fix TypeF

TypeF is a “structure functor” that just defines a single level of structure; notice TypeF is not recursive at all, but uses the type parameter a to mark the places where a recursive instance would usually be. unification-fd provides a Fix type to “tie the knot” and make it recursive. (I’m not sure why unification-fd defines its own Fix type instead of using the one from Data.Fix, but perhaps the reason is that it was written before Data.Fix existed—unification-fd was first published in 2007!)

We have to derive a whole bunch of instances for TypeF which fortunately we get for free. Note in particular Generic1 and Unifiable: Unifiable is a type class from unification-fd which describes how to match up values of our type. Thanks to the work of Roman Cheplyaka, there is a default implementation for Unifiable based on a Generic1 instance—which GHC derives for us in turn—and the default implementation works great for our purposes.

unification-fd also provides a second type for tying the knot, called UTerm, defined like so:

data UTerm t v
  = UVar  !v               -- ^ A unification variable.
  | UTerm !(t (UTerm t v)) -- ^ Some structure containing subterms.

It’s similar to Fix, except it also adds unification variables of some type v. (If it means anything to you, note that UTerm is actually the free monad over t.) We also define a version of Type using UTerm, which we will use during type inference:

> type UType = UTerm TypeF IntVar

IntVar is a type provided by unification-fd representing variables as Int values, with a mapping from variables to bindings stored in an IntMap. unification-fd also provies an STVar type which implements variables via STRefs; I presume using STVars would be faster (since no intermediate lookups in an IntMap are required) but forces us to work in the ST monad. For now I will just stick with IntVar, which makes things simpler.

At this point you might wonder: why do we have type variables in our definition of TypeF, but also use UTerm to add unification variables? Can’t we just get rid of the TyVarF constructor, and let UTerm provide the variables? Well, type variables and unification variables are subtly different, though intimately related. A type variable is actually part of a type, whereas a unification variable is not itself a type, but only stands for some type which is (as yet) unknown. After we are completely done with type inference, we won’t have a UTerm any more, but we might have a type like forall a. a -> a which still contains type variables, so we need a way to represent them. We could only get rid of the TyVarF constructor if we were doing type inference for a language without polymorphism (and yes, unification could still be helpful in such a situation—for example, to do full type reconstruction for the simply typed lambda calculus, where lambdas do not have type annotations).

Polytype represents a polymorphic type, with a forall at the front and a list of bound type variables (note that regular monomorphic types can be represented as Forall [] ty). We don’t need to make an instance of Unifiable for Polytype, since we never unify polytypes, only (mono)types. However, we can have polytypes with unification variables in them, so we need two versions, one containing a Type and one containing a UType.

> data Poly t = Forall [Var] t
>   deriving (Eq, Show, Functor)
> type Polytype  = Poly Type
> type UPolytype = Poly UType

Finally, for convenience, we can make a bunch of pattern synonyms that let us work with Type and UType just as if they were directly recursive types. This isn’t required; I just like not having to write Fix and UTerm everywhere.

> pattern TyVar :: Var -> Type
> pattern TyVar v = Fix (TyVarF v)
> pattern TyNat :: Type
> pattern TyNat = Fix TyNatF
> pattern TyFun :: Type -> Type -> Type
> pattern TyFun t1 t2 = Fix (TyFunF t1 t2)
> pattern UTyNat :: UType
> pattern UTyNat = UTerm TyNatF
> pattern UTyFun :: UType -> UType -> UType
> pattern UTyFun t1 t2 = UTerm (TyFunF t1 t2)
> pattern UTyVar :: Var -> UType
> pattern UTyVar v = UTerm (TyVarF v)


Here’s a data type to represent expressions. There’s nothing much interesting to see here, since we don’t need to do anything fancy with expressions. Note that lambdas don’t have type annotations, but let-expressions can have an optional polytype annotation, which will let us talk about checking polymorphic types in addition to inferring them (a lot of presentations of Hindley-Milner don’t talk about this).

> data Expr where
>   EVar  :: Var -> Expr
>   EInt  :: Integer -> Expr
>   EPlus :: Expr -> Expr -> Expr
>   ELam  :: Var -> Expr -> Expr
>   EApp  :: Expr -> Expr -> Expr
>   ELet  :: Var -> Maybe Polytype -> Expr -> Expr -> Expr

Normally at this point we would write parsers and pretty-printers for types and expressions, but that’s boring and has very little to do with unification-fd, so I’ve left those to the end. Let’s get on with the interesting bits!

Type inference infrastructure

Before we get to the type inference algorithm proper, we’ll need to develop a bunch of infrastructure. First, here’s the concrete monad we will be using for type inference. The ReaderT Ctx will keep track of variables and their types; ExceptT TypeError of course allows us to fail with type errors; and IntBindingT is a monad transformer provided by unification-fd which supports various operations such as generating fresh variables and unifying things. Note, for reasons that will become clear later, it’s very important that the IntBindingT is on the bottom of the stack, and the ExceptT comes right above it. Beyond that we can add whatever we like on top.

> type Infer = ReaderT Ctx (ExceptT TypeError (IntBindingT TypeF Identity))

Normally, I would prefer to write everything in a “capability style” where the code is polymorphic in the monad, and just specifies what capabilites/effects it needs (either just using mtl classes directly, or using an effects library like polysemy or fused-effects), but the way the unification-fd API is designed seems to make that a bit tricky.

A type context is a mapping from variable names to polytypes; we also have a function for looking up the type of a variable in the context, and a function for running a local subcomputation in an extended context.

> type Ctx = Map Var UPolytype
> lookup :: Var -> Infer UType
> lookup x = do
>   ctx <- ask
>   maybe (throwError $ UnboundVar x) instantiate (M.lookup x ctx)
> withBinding :: MonadReader Ctx m => Var -> UPolytype -> m a -> m a
> withBinding x ty = local (M.insert x ty)

The lookup function throws an error if the variable is not in the context, and otherwise returns a UType. Conversion from the UPolytype stored in the context to a UType happens via a function called instantiate, which opens up the UPolytype and replaces each of the variables bound by the forall with a fresh unification variable. We will see the implementation of instantiate later.

We will often need to recurse over UTypes. We could just write directly recursive functions ourselves, but there is a better way. Although the unification-fd library provides a function cata for doing a fold over a term built with Fix, it doesn’t provide a counterpart for UTerm; but no matter, we can write one ourselves, like so:

> ucata :: Functor t => (v -> a) -> (t a -> a) -> UTerm t v -> a
> ucata f _ (UVar v) = f v
> ucata f g (UTerm t) = g (fmap (ucata f g) t)

Now, we can write some utilities for finding free variables. Inexplicably, IntVar does not have an Ord instance (even though it is literally just a newtype over Int), so we have to derive one if we want to store them in a Set. Notice that our freeVars function finds free unification variables and free type variables; I will talk about why we need this later (this is something I got wrong at first!).

> deriving instance Ord IntVar
> class FreeVars a where
>   freeVars :: a -> Infer (Set (Either Var IntVar))

Finding the free variables in a UType is our first application of ucata. First, to find the free unification variables, we just use the getFreeVars function provided by unification-fd and massage the output into the right form. To find free type variables, we fold using ucata: we ignore unification variables, capture a singleton set in the TyVarF case, and in the recursive case we call fold, which will turn a TypeF (Set ...) into a Set ... using the Monoid instance for Set, i.e. union.

> instance FreeVars UType where
>   freeVars ut = do
>     fuvs <- fmap (S.fromList . map Right) . lift . lift $ getFreeVars ut
>     let ftvs = ucata (const S.empty)
>                      (\case {TyVarF x -> S.singleton (Left x); f -> fold f})
>                      ut
>     return $ fuvs `S.union` ftvs

Why don’t we just find free unification variables with ucata at the same time as the free type variables, and forget about using getFreeVars? Well, I looked at the source, and getFreeVars is actually a complicated beast. I’m really not sure what it’s doing, and I don’t trust that just manually getting the unification variables ourselves would be doing the right thing!

Now we can leverage the above instance to find free varaibles in UPolytypes and type contexts. For a UPolytype, we of course have to subtract off any variables bound by the forall.

> instance FreeVars UPolytype where
>   freeVars (Forall xs ut) = (\\ (S.fromList (map Left xs))) <$> freeVars ut
> instance FreeVars Ctx where
>   freeVars = fmap S.unions . mapM freeVars . M.elems

And here’s a simple utility function to generate fresh unification variables, built on top of the freeVar function provided by unification-fd:

> fresh :: Infer UType
> fresh = UVar <$> lift (lift freeVar)

One thing to note is the annoying calls to lift we have to do in the definition of FreeVars for UType, and in the definition of fresh. The getFreeVars and freeVar functions provided by unification-fv have to run in a monad which is an instance of BindingMonad, and the BindingMonad class does not provide instances for mtl transformers. We could write our own instances so that these functions would work in our Infer monad automatically, but honestly that sounds like a lot of work. Sprinkling a few lifts here and there isn’t so bad.

Next, a data type to represent type errors, and an instance of the Fallible class, needed so that unification-fd can inject errors into our error type when it encounters unification errors. Basically we just need to provide two specific constructors to represent an “occurs check” failure (i.e. an infinite type), or a unification mismatch failure.

> data TypeError where
>   UnboundVar   :: String -> TypeError
>   Infinite     :: IntVar -> UType -> TypeError
>   Mismatch     :: TypeF UType -> TypeF UType -> TypeError
> instance Fallible TypeF IntVar TypeError where
>   occursFailure   = Infinite
>   mismatchFailure = Mismatch

The =:= operator provided by unification-fd is how we unify two types. It has a kind of bizarre type:

(=:=) :: ( BindingMonad t v m, Fallible t v e
         , MonadTrans em, Functor (em m), MonadError e (em m))
      => UTerm t v -> UTerm t v -> em m (UTerm t v)

(Apparently I am not the only one who thinks this type is bizarre; the unification-fd source code contains the comment -- TODO: what was the reason for the MonadTrans madness?)

I had to stare at this for a while to understand it. It says that the output will be in some BindingMonad (such as IntBindingT), and there must be a single error monad transformer on top of it, with an error type that implements Fallible. So =:= can return ExceptT TypeError (IntBindingT Identity) UType, but it cannot be used directly in our Infer monad, because that has a ReaderT on top of the ExceptT. So I just made my own version with an extra lift to get it to work directly in the Infer monad. While we’re at it, we’ll make a lifted version of applyBindings, which has the same issue.

> (=:=) :: UType -> UType -> Infer UType
> s =:= t = lift $ s U.=:= t
> applyBindings :: UType -> Infer UType
> applyBindings = lift . U.applyBindings

Converting between mono- and polytypes

Central to the way Hindley-Milner works is the way we move back and forth between polytypes and monotypes. First, let’s see how to turn UPolytypes into UTypes, hinted at earlier in the definition of the lookup function. To instantiate a UPolytype, we generate a fresh unification variable for each variable bound by the Forall, and then substitute them throughout the type.

> instantiate :: UPolytype -> Infer UType
> instantiate (Forall xs uty) = do
>   xs' <- mapM (const fresh) xs
>   return $ substU (M.fromList (zip (map Left xs) xs')) uty

The substU function can substitute for either kind of variable in a UType (right now we only need it to substitute for type variables, but we will need it to substitute for unification variables later). Of course, it is implemented via ucata. In the variable cases we make sure to leave the variable alone if it is not a key in the given substitution mapping. In the recursive non-variable case, we just roll up the TypeF UType into a UType by applying UTerm. This is the power of ucata: we can deal with all the boring recursive cases in one fell swoop. This function won’t have to change if we add new types to the language in the future.

> substU :: Map (Either Var IntVar) UType -> UType -> UType
> substU m = ucata
>   (\v -> fromMaybe (UVar v) (M.lookup (Right v) m))
>   (\case
>       TyVarF v -> fromMaybe (UTyVar v) (M.lookup (Left v) m)
>       f -> UTerm f
>   )

There is one other way to convert a UPolytype to a UType, which happens when we want to check that an expression has a polymorphic type specified by the user. For example, let foo : forall a. a -> a = \x.3 in ... should be a type error, because the user specified that foo should have type forall a. a -> a, but then gave the implementation \x.3 which is too specific. In this situation we can’t just instantiate the polytype—that would create a unification variable for a, and while typechecking \x.3 it would unify a with nat. But in this case we don’t want a to unify with nat—it has to be held entirely abstract, because the user’s claim is that this function will work for any type a.

Instead of generating unification variables, we instead want to generate what are known as Skolem variables. Skolem variables do not unify with anything other than themselves. So how can we get unification-fd to do that? It does not have any built-in notion of Skolem variables. What we can do instead is to just embed the variables within the UType as UTyVars instead of UVars! unification-fd does not even know those are variables; it just sees them as another rigid part of the structure that must be matched exactly, just as a TyFun has to match another TyFun. The one remaining issue is that we need to generate fresh Skolem variables; it certainly would not do to have them collide with Skolem variables from some other forall. We could carry around our own supply of unique names in the Infer monad for this purpose, which would probably be the “proper” way to do things; but for now I did something more expedient: just get unification-fd to generate fresh unification variables, then rip the (unique! fresh!) Ints out of them and use those to make our Skolem variables.

> skolemize :: UPolytype -> Infer UType
> skolemize (Forall xs uty) = do
>   xs' <- mapM (const fresh) xs
>   return $ substU (M.fromList (zip (map Left xs) (map toSkolem xs'))) uty
>   where
>     toSkolem (UVar v) = UTyVar (mkVarName "s" v)

When unification-fd generates fresh IntVars it seems that it starts at minBound :: Int and increments, so we can add maxBound + 1 to get numbers that look reasonable. Again, this is not very principled, but for this toy example, who cares?

> mkVarName :: String -> IntVar -> Var
> mkVarName nm (IntVar v) = nm ++ show (v + (maxBound :: Int) + 1)

Next, how do we convert from a UType back to a UPolytype? This happens when we have inferred the type of a let-bound variable and go to put it in the context; typically, Hindley-Milner systems generalize the inferred type to a polytype. If a unification variable still occurs free in a type, it means it was not constrained at all, so we can universally quantify over it. However, we have to be careful: unification variables that occur in some type that is already in the context do not count as free, because we might later discover that they need to be constrained.

Also, just before we do the generalization, it’s very important that we use applyBindings. unification-fd has been collecting a substitution from unification variables to types, but for efficiency’s sake it does not actually apply the substitution until we ask it to, by calling applyBindings. Any unification variables which still remain after applyBindings really are unconstrained so far. So after applyBindings, we get the free unification variables from the type, subtract off any unification variables which are free in the context, and close over the remaining variables with a forall, substituting normal type variables for them. It does not particularly matter if these type variables are fresh (so long as they are distinct). But we can’t look only at unification variables! We have to look at free type variables too (this is the reason that our freeVars function needs to find both free type and unification variables). Why is that? Well, we might have some free type variables floating around if we previously generated some Skolem variables while checking a polymorphic type. (A term which illustrates this behavior is \y. let x : forall a. a -> a = y in x 3.) Free Skolem variables should also be generalized over.

> generalize :: UType -> Infer UPolytype
> generalize uty = do
>   uty' <- applyBindings uty
>   ctx <- ask
>   tmfvs  <- freeVars uty'
>   ctxfvs <- freeVars ctx
>   let fvs = S.toList $ tmfvs \\ ctxfvs
>       xs  = map (either id (mkVarName "a")) fvs
>   return $ Forall xs (substU (M.fromList (zip fvs (map UTyVar xs))) uty')

Finally, we need a way to convert Polytypes entered by the user into UPolytypes, and a way to convert the final UPolytype back into a Polytype. unification-fd provides functions unfreeze : Fix t -> UTerm t v and freeze : UTerm t v -> Maybe (Fix t) to convert between terms built with UTerm (with unification variables) and Fix (without unification variables). Converting to UPolytype is easy: we just use unfreeze to convert the underlying Type to a UType.

> toUPolytype :: Polytype -> UPolytype
> toUPolytype = fmap unfreeze

When converting back, notice that freeze returns a Maybe; it fails if there are any unification variables remaining. So we must be careful to only use fromUPolytype when we know there are no unification variables remaining in a polytype. In fact, we will use this only at the very top level, after generalizing the type that results from inference over a top-level term. Since at the top level we only perform inference on closed terms, in an empty type context, the final generalize step will generalize over all the remaining free unification variables, since there will be no free variables in the context.

> fromUPolytype :: UPolytype -> Polytype
> fromUPolytype = fmap (fromJust . freeze)

Type inference

Finally, the type inference algorithm proper! First, to check that an expression has a given type, we infer the type of the expression and then demand (via =:=) that the inferred type must be equal to the given one. Note that =:= actually returns a UType, and it can apparently be more efficient to use the result of =:= in preference to either of the arguments to it (although they will all give equivalent results). However, in our case this doesn’t seem to make much difference.

> check :: Expr -> UType -> Infer ()
> check e ty = do
>   ty' <- infer e
>   _ <- ty =:= ty'
>   return ()

And now for the infer function. The EVar, EInt, and EPlus cases are straightforward.

> infer :: Expr -> Infer UType
> infer (EVar x)      = lookup x
> infer (EInt _)      = return UTyNat
> infer (EPlus e1 e2) = do
>   check e1 UTyNat
>   check e2 UTyNat
>   return UTyNat

For an application EApp e1 e2, we infer the types funTy and argTy of e1 and e2 respectively, then demand that funTy =:= UTyFun argTy resTy for a fresh unification variable resTy. Again, =:= returns a more efficient UType which is equivalent to funTy, but we don’t need to use that type directly (we return resTy instead), so we just discard the result.

> infer (EApp e1 e2) = do
>   funTy <- infer e1
>   argTy <- infer e2
>   resTy <- fresh
>   _ <- funTy =:= UTyFun argTy resTy
>   return resTy

For a lambda, we make up a fresh unification variable for the type of the argument, then infer the type of the body under an extended context. Notice how we promote the freshly generated unification variable to a UPolytype by wrapping it in Forall []; we do not generalize it, since that would turn it into forall a. a! We just want the lambda argument to have the bare unification variable as its type.

> infer (ELam x body) = do
>   argTy <- fresh
>   withBinding x (Forall [] argTy) $ do
>     resTy <- infer body
>     return $ UTyFun argTy resTy

For a let expression without a type annotation, we infer the type of the definition, then generalize it and add it to the context to infer the type of the body. It is traditional for Hindley-Milner systems to generalize let-bound things this way (although note that GHC does not generalize let-bound things with -XMonoLocalBinds enabled, which is automatically implied by -XGADTs or -XTypeFamilies).

> infer (ELet x Nothing xdef body) = do
>   ty <- infer xdef
>   pty <- generalize ty
>   withBinding x pty $ infer body

For a let expression with a type annotation, we skolemize it and check the definition with the skolemized type; the rest is the same as the previous case.

> infer (ELet x (Just pty) xdef body) = do
>   let upty = toUPolytype pty
>   upty' <- skolemize upty
>   check xdef upty'
>   withBinding x upty $ infer body

Running the Infer monad

We need a way to run computations in the Infer monad. This is a bit fiddly, and it took me a long time to put all the pieces together. (But typed holes are sooooo great! It would have taken me way longer without them…) I’ve written the definition of runInfer using the backwards function composition operator, (>>>), so that the pipeline flows from top to bottom and I can intersperse it with explanation.

> runInfer :: Infer UType -> Either TypeError Polytype
> runInfer

The first thing we do is use applyBindings to make sure that we substitute for any unification variables that we know about. This results in another Infer UType.

>   =   (>>= applyBindings)

We can now generalize over any unification variables that are left, and then convert from UPolytype to Polytype. Again, this conversion is safe because at this top level we know we will be in an empty context, so the generalization step will definitely get rid of all the remaining unification variables.

>   >>> (>>= (generalize >>> fmap fromUPolytype))

Now all that’s left is to interpret the layers of our Infer monad one by one. As promised, we start with an empty type context.

>   >>> flip runReaderT M.empty
>   >>> runExceptT
>   >>> evalIntBindingT
>   >>> runIdentity

Finally, we can make a top-level function to infer a polytype for an expression, just by composing infer and runInfer.

> inferPolytype :: Expr -> Either TypeError Polytype
> inferPolytype = runInfer . infer


To be able to test things out, we can make a very simple REPL that takes input from the user and tries to parse, typecheck, and interpret it, printing either the results or an appropriate error message.

> eval :: String -> IO ()
> eval s = case parse expr "" s of
>   Left err -> print err
>   Right e -> case inferPolytype e of
>     Left tyerr -> putStrLn $ pretty tyerr
>     Right ty   -> do
>       putStrLn $ pretty e ++ " : " ++ pretty ty
>       when (ty == Forall [] TyNat) $ putStrLn $ pretty (interp e)
> main :: IO ()
> main = evalRepl (const (pure "HM> ")) (liftIO . eval) [] Nothing Nothing (Word (const (return []))) (return ()) (return Exit)

Here are a few examples to try out:

HM> 2 + 3
2 + 3 : nat
HM> \x. x
\x. x : forall a0. a0 -> a0
HM> \x.3
\x. 3 : forall a0. a0 -> nat
HM> \x. x + 1
\x. x + 1 : nat -> nat
HM> (\x. 3) (\y.y)
(\x. 3) (\y. y) : nat
HM> \x. y
Unbound variable y
HM> \x. x x
Infinite type u0 = u0 -> u1
HM> 3 3
Can't unify nat and nat -> u0
HM> let foo : forall a. a -> a = \x.3 in foo 5
Can't unify s0 and nat
HM> \f.\g.\x. f (g x)
\f. \g. \x. f (g x) : forall a2 a3 a4. (a3 -> a4) -> (a2 -> a3) -> a2 -> a4
HM> let f : forall a. a -> a = \x.x in let y : forall b. b -> b -> b = \z.\q. f z in y 2 3
let f : forall a. a -> a = \x. x in let y : forall b. b -> b -> b = \z. \q. f z in y 2 3 : nat
HM> \y. let x : forall a. a -> a = y in x 3
\y. let x : forall a. a -> a = y in x 3 : forall s1. (s1 -> s1) -> nat
HM> (\x. let y = x in y) (\z. \q. z)
(\x. let y = x in y) (\z. \q. z) : forall a1 a2. a1 -> a2 -> a1

And that’s it! Feel free to play around with this yourself, and adapt the code for your own projects if it’s helpful. And please do report any typos or bugs that you find.

Below, for completeness, you will find a simple, recursive, environment-passing interpreter, along with code for parsing and pretty-printing. I don’t give any commentary on them because, for the most part, they are straightforward and have nothing to do with unification-fd. But you are certainly welcome to look at them if you want to see how they work. The one interesting thing to say about the parser for types is that it checks that types entered by the user do not contain any free variables, and fails if they do. The parser is not really the right place to do this check, but again, it was expedient for this toy example. Also, I tend to use megaparsec for serious projects, but I had some parsec code for parsing something similar to this toy language lying around, so I just reused that.


> data Value where
>   VInt :: Integer -> Value
>   VClo :: Var -> Expr -> Env -> Value
> type Env = Map Var Value
> interp :: Expr -> Value
> interp = interp' M.empty
> interp' :: Env -> Expr -> Value
> interp' env (EVar x) = fromJust $ M.lookup x env
> interp' _   (EInt n) = VInt n
> interp' env (EPlus ea eb)   =
>   case (interp' env ea, interp' env eb) of
>     (VInt va, VInt vb) -> VInt (va + vb)
>     _ -> error "Impossible! interp' EPlus on non-Ints"
> interp' env (ELam x body) = VClo x body env
> interp' env (EApp fun arg) =
>   case interp' env fun of
>     VClo x body env' ->
>       interp' (M.insert x (interp' env arg) env') body
>     _ -> error "Impossible! interp' EApp on non-closure"
> interp' env (ELet x _ xdef body) =
>   let xval = interp' env xdef
>   in  interp' (M.insert x xval env) body


> lexer :: L.TokenParser u
> lexer = L.makeTokenParser emptyDef
>   { L.reservedNames = ["let", "in", "forall", "nat"] }
> parens :: Parser a -> Parser a
> parens = L.parens lexer
> identifier :: Parser String
> identifier = L.identifier lexer
> reserved :: String -> Parser ()
> reserved = L.reserved lexer
> reservedOp :: String -> Parser ()
> reservedOp = L.reservedOp lexer
> symbol :: String -> Parser String
> symbol = L.symbol lexer
> integer :: Parser Integer
> integer = L.natural lexer
> parseAtom :: Parser Expr
> parseAtom
>   =   EVar  <$> identifier
>   <|> EInt  <$> integer
>   <|> ELam  <$> (symbol "\\" *> identifier)
>             <*> (symbol "." *> parseExpr)
>   <|> ELet  <$> (reserved "let" *> identifier)
>             <*> optionMaybe (symbol ":" *> parsePolytype)
>             <*> (symbol "=" *> parseExpr)
>             <*> (reserved "in" *> parseExpr)
>   <|> parens parseExpr
> parseApp :: Parser Expr
> parseApp = chainl1 parseAtom (return EApp)
> parseExpr :: Parser Expr
> parseExpr = buildExpressionParser table parseApp
>   where
>     table = [ [ Infix (EPlus <$ reservedOp "+") AssocLeft ]
>             ]
> parsePolytype :: Parser Polytype
> parsePolytype = do
>   pty@(Forall xs ty) <- parsePolytype'
>   let fvs :: Set Var
>       fvs = flip cata ty $ \case
>         TyVarF x       -> S.singleton x
>         TyNatF         -> S.empty
>         TyFunF xs1 xs2 -> xs1 `S.union` xs2
>       unbound = fvs \\ S.fromList xs
>   unless (S.null unbound) $ fail $ "Unbound type variables: " ++ unwords (S.toList unbound)
>   return pty
> parsePolytype' :: Parser Polytype
> parsePolytype' =
>   Forall <$> (fromMaybe [] <$> optionMaybe (reserved "forall" *> many identifier <* symbol "."))
>           <*> parseType
> parseTypeAtom :: Parser Type
> parseTypeAtom =
>   (TyNat <$ reserved "nat") <|> (TyVar <$> identifier) <|> parens parseType
> parseType :: Parser Type
> parseType = buildExpressionParser table parseTypeAtom
>   where
>     table = [ [ Infix (TyFun <$ symbol "->") AssocRight ] ]
> expr :: Parser Expr
> expr = spaces *> parseExpr <* eof

Pretty printing

> type Prec = Int
> class Pretty p where
>   pretty :: p -> String
>   pretty = prettyPrec 0
>   prettyPrec :: Prec -> p -> String
>   prettyPrec _ = pretty
> instance Pretty (t (Fix t)) => Pretty (Fix t) where
>   prettyPrec p = prettyPrec p . unFix
> instance Pretty t => Pretty (TypeF t) where
>   prettyPrec _ (TyVarF v) = v
>   prettyPrec _ TyNatF = "nat"
>   prettyPrec p (TyFunF ty1 ty2) =
>     mparens (p > 0) $ prettyPrec 1 ty1 ++ " -> " ++ prettyPrec 0 ty2
> instance (Pretty (t (UTerm t v)), Pretty v) => Pretty (UTerm t v) where
>   pretty (UTerm t) = pretty t
>   pretty (UVar v)  = pretty v
> instance Pretty Polytype where
>   pretty (Forall [] t) = pretty t
>   pretty (Forall xs t) = unwords ("forall" : xs) ++ ". " ++ pretty t
> mparens :: Bool -> String -> String
> mparens True  = ("("++) . (++")")
> mparens False = id
> instance Pretty Expr where
>   prettyPrec _ (EVar x) = x
>   prettyPrec _ (EInt i) = show i
>   prettyPrec p (EPlus e1 e2) =
>     mparens (p>1) $
>       prettyPrec 1 e1 ++ " + " ++ prettyPrec 2 e2
>   prettyPrec p (ELam x body) =
>     mparens (p>0) $
>       "\\" ++ x ++ ". " ++ prettyPrec 0 body
>   prettyPrec p (ELet x mty xdef body) =
>     mparens (p>0) $
>       "let " ++ x ++ maybe "" (\ty -> " : " ++ pretty ty) mty
>             ++ " = " ++ prettyPrec 0 xdef
>             ++ " in " ++ prettyPrec 0 body
>   prettyPrec p (EApp e1 e2) =
>     mparens (p>2) $
>       prettyPrec 2 e1 ++ " " ++ prettyPrec 3 e2
> instance Pretty IntVar where
>   pretty = mkVarName "u"
> instance Pretty TypeError where
>   pretty (UnboundVar x)     = printf "Unbound variable %s" x
>   pretty (Infinite x ty)    = printf "Infinite type %s = %s" (pretty x) (pretty ty)
>   pretty (Mismatch ty1 ty2) = printf "Can't unify %s and %s" (pretty ty1) (pretty ty2)
> instance Pretty Value where
>   pretty (VInt n) = show n
>   pretty (VClo x body env)
>     = printf "<%s: %s %s>"
>       x (pretty body) (pretty env)
> instance Pretty Env where
>   pretty env = "[" ++ intercalate ", " bindings ++ "]"
>     where
>       bindings = map prettyBinding (M.assocs env)
>       prettyBinding (x, v) = x ++ " -> " ++ pretty v

by Brent at September 08, 2021 06:37 PM


Haskell Implementors' Workshop 2021

We enjoyed taking part in the Haskell Implementors’ Workshop (HIW 2021) this year, as part of ICFP 2021. Many thanks to the program chair, Ningning Xie, and the other organisers!

This year Andres served on the programme committee and Edsko co-hosted the workshop, with both of them chairing sessions on the day. Edsko and Douglas gave talks, and Ben helped deliver the annual GHC status update along with Simon Peyton Jones.

The full videos from the day are now available. Read on for more details of contributions to HIW from the Well-Typed team.

Avoiding quadratic GHC core code size

Edsko de Vries, Andres Löh


There are (at least) two important sources of quadratic code size in GHC core: (1) we have no ability to inspect or update part of a record (or indeed type class dictionary) without referring to every field in the record individually, and (2) it is difficult or impossible to control sharing at the type level. This lack of type level sharing means for example that a chain of n applications of the Applicative <*> operator is O(n^2) in size.

This does not matter for small chains and small types, but it can make compilation slow and memory hungry for large types. In this talk we will take a closer look at the various sources of quadratic core code size, and then introduce the large-records library. This library makes it possible to define large records with an O(n) core size, at the cost of using an untyped internal representation. We discuss how the library works and the API that users are provided for working with these records, including an generics interface; the latter is styled on generics-sop but differs quite in a bit in the details to avoid all O(n^2) pitfalls. For a module containing just a single record with 100 fields, using large-records reduces the GHC core code size by more than two orders of magnitude.

Improvements to GHC’s parallel garbage collector

Douglas Wilson


GHC’s parallel garbage collector periodically “stops the world” to perform collections. This requires synchronisation between all threads in the program, which must which must all be stopped before a collection can begin. This synchronisation imposes a significant cost on the runtime on both GHC and all parallel programs build with GHC. In its pre-9.2 implementation, this cross-thread synchronisation is achieved through spin locks and (on Linux) calls to sched_yield. We have modified this implementation to use mutexes and condition variables. These modifications will be available in GHC 9.2 We present benchmarks showing dramatic performance improvements for GHC itself while building the Cabal library, and provide some explanation for the surprising magnitude of the improvements.

The state of GHC

Simon Peyton Jones (Microsoft Research), Ben Gamari


This presentation (a regular fixture at HIW) summarised the latest news about developments in GHC. We were particularly happy to see improvements in GHC’s compile-time performance being reported.

by christine, andres, edsko, douglas, ben at September 08, 2021 12:00 AM

Tweag I/O

Building a Go project using Bazel

In this post, I’ll show how to build a Go project with Bazel, using Gazelle to generate the build files. Why Bazel? With Bazel, I get incremental, reproducible, and cacheable builds. It’s designed for large, polyglot monorepos and can be extended to support more languages than the many already available. Why Gazelle? Gazelle looks at the Go source files and generates Bazel BUILD files for me.

I used tendermint as an example Go project, because it’s large enough to include several interesting problems:

  • protocol buffers
  • external dependencies
  • runtime dependencies
  • flaky and exclusive tests

How will we be able to tell when it’s finished? At the end, we’ll be able to build targets for all of the Go sources,

$ bazel build //...

run tests,

$ bazel test //...

and execute the tendermint binary.

$ bazel run //cmd/tendermint

To see the final results, check out my fork of tendermint.

Let’s get started!

But where should we start? As it is usual for Bazel, we start with the WORKSPACE file, which lives at the project root and describes it, and the BUILD files, which define packages within the project. The documentation for rules_go and Gazelle helpfully gives some code snippets for these. Starting with a fresh clone of tendermint, I create a WORKSPACE file that looks like this:

workspace(name = "com_github_tendermint_tendermint")

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    name = "io_bazel_rules_go",
    sha256 = "8e968b5fcea1d2d64071872b12737bbb5514524ee5f0a4f54f5920266c261acb",
    url = "",

    name = "bazel_gazelle",
    sha256 = "62ca106be173579c0a167deb23358fdfe71ffa1e4cfdddf5582af26520f1c66f",
    url = "",

load("@io_bazel_rules_go//go:deps.bzl", "go_register_toolchains", "go_rules_dependencies")
load("@bazel_gazelle//:deps.bzl", "gazelle_dependencies")


go_register_toolchains(version = "1.16.5")


The first line of the WORKSPACE file gives the project a name, and the rest loads the dependencies on rules_go and Gazelle. Gazelle generates Bazel BUILD files, so that I don’t have to write them all myself, and updates the Bazel files after changes are done to Go source files.

Even though there aren’t any Go sources to build in the root directory, let’s write a BUILD file in the root directory. What is the point of that? The root BUILD file contains a Gazelle target, so that we can use Bazel to run Gazelle. The gazelle:prefix directive tells Gazelle how other Go projects can import this one.

load("@bazel_gazelle//:def.bzl", "gazelle")

# gazelle:prefix
gazelle(name = "gazelle")

Let’s try it out and see what happens.

$ bazel run //:gazelle

Wow, BUILD files everywhere! There is now one generated BUILD file in every folder which contains Go source files. And I’m really glad that I didn’t have to write those all myself.

Proto problems

What happens if we try building the targets in those BUILD files? Let’s find out.

$ bazel build //...
ERROR: /tendermint/third_party/proto/gogoproto/BUILD.bazel:5:14: no such package '@com_google_protobuf//': The repository '@com_google_protobuf' could not be resolved and referenced by '//third_party/proto/gogoproto:gogoproto_proto'
ERROR: /tendermint/third_party/proto/gogoproto/BUILD.bazel:5:14: every rule of type proto_library implicitly depends upon the target '@com_google_protobuf//:protoc', but this target could not be found because of: no such package '@com_google_protobuf//': The repository '@com_google_protobuf' could not be resolved
ERROR: Analysis of target '//third_party/proto/gogoproto:gogoproto_proto' failed; build aborted: Analysis failed

Uh oh, there’s a problem.

Here we’ve run into a dependency problem about using the proto_library rule for the protocol buffers in tendermint. Could I add that dependency and move on? Well yes, it can be done. However, according to the rules_go documentation, it’s probably easier to use the existing .pb.go files generated by the protocol buffer compiler, since tendermint already has those files. By default, Gazelle makes go_proto_library and proto_library to compile protos and generate .pb.go files. Instead of using that default, I include a Gazelle directive in the root BUILD file to disable the protos.

# gazelle:proto disable_global

I remove the generated BUILD files, generate new ones, and attempt to build everything again.

$ bazel run //:gazelle

Depending on the Outside World

$ bazel build //...
ERROR: /tendermint/internal/blockchain/v2/BUILD.bazel:3:11: no such package '@com_github_go_kit_kit//metrics/prometheus': The repository '@com_github_go_kit_kit' could not be resolved and referenced by '//internal/blockchain/v2:blockchain'
ERROR: Analysis of target '//internal/blockchain/v2:blockchain' failed; build aborted: Analysis failed

Oh no! What happened now?

To see why it’s failing, let’s take a look at the definition of the target //internal/blockchain/v2:blockchain in internal/blockchain/v2/BUILD.bazel.

    name = "blockchain",
    srcs = [
    importpath = "",
    visibility = ["//:__subpackages__"],
    deps = [

The target’s name is blockchain, and it builds a library from the Go sources listed. The import path tells us how another project can import it in Go, and the visibility says that only other targets within tendermint can depend on blockchain.

Finally, each dependency is explicitly listed. The dependencies that begin with // refer to packages within tendermint, and the rest are external dependencies. The error message points to one of those external dependencies: @com_github_go_kit_kit//metrics/prometheus. Why can’t Bazel find it? I haven’t given any definition of com_github_go_kit_kit, or any of the other external dependencies, so Bazel can’t figure out where to find it. Do I have to tell Bazel exactly where to find all of those dependencies? Bazel needs exact instructions about where to find external dependencies, but Gazelle’s update-repos command can use the go.mod file from the Go build to figure out the external dependencies, so I don’t have to write them all out myself. In the root BUILD file, I make another gazelle target, and this one uses the update-repos command.

    name = "gazelle-update-repos",
    args = [
    command = "update-repos",

And I run it the same way as the other gazelle target.

$ bazel run //:gazelle-update-repos

What exactly did update-repos do? There’s now a new file in the root of the project, called deps.bzl. Along with many other external dependencies, it gives a definition of com_github_go_kit_kit.

        name = "com_github_go_kit_kit",
        build_file_proto_mode = "disable_global",
        importpath = "",
        sum = "h1:dXFJfIHVvUcpSgDOV+Ne6t7jXri8Tfv2uOLHUZ2XNuo=",
        version = "v0.10.0",

How does Bazel know to look at deps.bzl? The update-repos command also adds a few lines in the WORKSPACE file.

load("//:deps.bzl", "go_dependencies")

# gazelle:repository_macro deps.bzl%go_dependencies

Note that the flag -build_file_proto_mode=disable_global sets the go_repository attribute build_file_proto_mode. This indicates that I don’t want to have proto_library and go_proto_library targets in my external dependencies either, and instead follow the recommendation to use pre-generated .pb.go files.

And, with all that, everything builds.

$ bazel build //...

One more quick note about the build: The current convention is that Gazelle generates names for go_library targets using the last component of the path. However, the convention used to be that all go_library targets were called go_default_library. When other projects depend on tendermint, they might use either of those conventions, so to accommodate both, we include another directive in the root BUILD file.

# gazelle:go_naming_convention import_alias

After running Gazelle again, there are now alias targets to the go_library target. For example the alias to //internal/blockchain/v2:blockchain is:

    name = "go_default_library",
    actual = ":blockchain",
    visibility = ["//:__subpackages__"],

It refers to the same blockchain library, but allows it to use another name.

Testing… Testing…

What next? On to the tests! Let’s run them all.

$ bazel test //... --test_output=errors

Finding Data

Most of them pass, but a few fail. Let’s look at one of them in particular.

$ bazel test //... --test_output=errors
==================== Test output for //state/indexer/sink/psql:psql_test:
--- FAIL: TestType (14.49s)
        	Error Trace:	psql_test.go:364
        	Error:      	Expected nil, but got: &fmt.wrapError{msg:"failed to read sql file from 'schema.sql': open schema.sql: no such file or directory", err:(*fs.PathError)(0xc0005f5f50)}
        	Test:       	TestType

The test needs the schema.sql file, but can’t find it. Why not? The file definitely exists in the folder along with the BUILD file and the Go sources. However, Bazel needs to know explicitly about dependencies, and Gazelle doesn’t know about these runtime dependencies, so we have to add them ourselves.

    name = "psql_test",
    srcs = ["psql_test.go"],
    data = ["schema.sql"],
    embed = [":psql"],
    deps = [

Since schema.sql is only necessary at runtime, we list it as a data dependency. Note that there is another type of dependency in this target: an embed dependency. Embedding is a specific type of dependency in rules_go which is usually used to include a library in a binary. In this case, it is a library embedded in a test target.

Now we can run the test again:

$ bazel test //state/indexer/sink/psql:psql_test --test_output=errors

And it passes! Onwards.

$ bazel test //... --test_output=errors
==================== Test output for //state/indexer:indexer_test:
--- FAIL: TestIndexerServiceIndexesBlocks (50.58s)
        	Error Trace:	indexer_service_test.go:184
        	Error:      	Expected nil, but got: &fmt.wrapError{msg:"failed to read sql file from './sink/psql/schema.sql': open ./sink/psql/schema.sql: no such file or directory", err:(*fs.PathError)(0xc0000aaea0)}
        	Test:       	TestIndexerServiceIndexesBlocks

Oh look! Here’s another test that also requires the same schema.sql file.

    data = ["schema.sql"],

It might be tempting to add the dependency exactly the same way as before, but that fails, because schema.sql is in a different package. Each BUILD file defines a different package, and state/indexer/sink/psql/BUILD.bazel defines the package that contains schema.sql. Instead, we need to explicitly mention the package name.

    data = ["//state/indexer/sink/psql:schema.sql"],

But that’s not all.

$ bazel test //... --test_output=errors
ERROR: /tendermint/state/indexer/BUILD.bazel:21:8: in go_test rule //state/indexer:indexer_test: target '//state/indexer/sink/psql:schema.sql' is not visible from target '//state/indexer:indexer_test'. Check the visibility declaration of the former target if you think the dependency is legitimate. To set the visibility of that source file target, use the exports_files() function
ERROR: Analysis of target '//state/indexer:indexer_test' failed; build aborted: Analysis of target '//state/indexer:indexer_test' failed

By default, files are only available to targets within the package. How can we change this default? An explicit declaration using exports_files in the package //state/indexer/sink/psql will do the trick.

    visibility = ["//state/indexer:__subpackages__"],

And finally, this test passes too.

$ bazel test //state/indexer:indexer_test --test_output=errors

Exclusive Tests

Let’s try something weird. Let’s add the flag --nocache_test_results so that Bazel doesn’t re-use the cached test results from before.

$ bazel test //... --test_output=errors --nocache_test_results
==================== Test output for //state/indexer/sink/psql:psql_test:
--- FAIL: TestType (114.24s)
        	Error Trace:	psql_test.go:365
        	Error:      	Expected nil, but got: &fmt.wrapError{msg:"Error unlocking while returning from other err: Migration '2021-07-09 20:30:37.648524208 +0000 UTC db schema' Failed:\ndriver: bad connection\nread tcp> read: connection reset by peer", err:(*fmt.wrapError)(0xc000527cc0)}
        	Test:       	TestType

Yuck! What is this? Wasn’t this test passing before? It’s a network error, and it turns out that this test does pass when run by itself, and passes over and over again, even with --nocache_test_results.

$ bazel test //state/indexer/sink/psql:psql_test --test_output=errors --nocache_test_results

Running the test separately is really annoying, but is necessary to guarantee that the network connections are available. Is there a better way to do it? Yes, there’s an attribute tags = ["exclusive"], that ensures no other tests run in parallel with it. Is there any way that those tests could run in parallel? Yes, it is possible to change the tests, but that goes beyond building with Bazel, so I work around the issue by not running them in parallel.

Flaky Tests

After running all of those tests so many times, I notice something strange. What is it? The test //light:light_test sometimes fails and sometimes passes. In fact, this test was known to be flaky. Is there anything we can do about it? Testing rules have a flaky attribute, and adding flaky = True tells Bazel to run the test up to three times before failing it.

Are all the tests passing now? There are a few failing tests left, but these also fail on my machine with go test, so it’s not a Bazel-specific issue. To prevent those tests from running with bazel test //..., I add tags = ["manual"] to those test targets.

Are we done yet?

The final thing that we’ll check in this post is the main tendermint binary.

$ bazel run //cmd/tendermint

And it runs, printing out help for using tendermint.

Yes! It’s finished

Now I have a Bazel build of tendermint, and I can run tests and the tendermint command. For more examples of using Bazel, check out posts which build projects with Haskell and OCaml.

September 08, 2021 12:00 AM

September 07, 2021

Joachim Breitner

A Candid explainer: Opt is special

This is the third post in a series about the interface description language Candid.

The record extension problem

Initially, the upgrade rules of Candid were a straight-forward application of the canonical subtyping rules. This worked and was sound, but it forbid one very commonly requested use case: Extending records in argument position.

The subtyping rule for records say that

   record { old_field : nat; new_field : int }
<: record { old_field : nat }

or, in words, that subtypes can have more field than supertypes. This makes intuitive sense: A user of a such a record value that only expects old_field to be there is not bothered if suddenly a new_field appears. But a user who expects new_field to be there is thrown off if it suddenly isn’t anymore. Therefore it is ok to extend the records returned by a service with new fields, but not to extend the record in your methods’s argument types.

But users want to extend their record types over time, also in argument position!

In fact, they often want to extend them in both positions. Consider a service with the following interface, where the CUser record appears both in argument and result position:

type CUser = record { login : text; name : text };
service C : {
  register_user : (CUser) -> ();
  get_user_data : (text) -> (CUser);

It seems quite natural to want to extend the record with a new field (say, the age of the user). But simply changing the definition of CUser to

type CUser = record { login : text; name : text; age : nat }

is not ok, because now register_user requires the age field, but old clients don’t provide it.

So how can we allow developers to make changes like this (because they really really want that), while keeping the soundness guarantees made by Candid (because we really really want that)? This question has bothered us for over two years, and we even created a meta issue that records the dozen approached we considered, discussed and eventually ditched.

Dynamic subtyping checks in opt

I will spare you the history lesson, though, and explain the solution we eventually came up with.

In the example above it seems unreasonable to let the developer add a field age of type nat. Since there may be old clients around, the service unavoidably has to deal with records that don’t have an age field. If the code expects an age value, what should it do in that case?

But one can argue that changing the record type to

type CUser = record { login : text; name : text; age : opt nat }

could work: If the age field is present, use the value. If the field is absent, inject a null value during decoding.

In the first-order case, this rather obvious solution would work just fine, and we’d be done. But Candid aims to solve the higher order case, and I said things get tricky here, didn’t I?

Consider another, independent service D with the following interface:

type DUser = record { login : text; name : text };
service D : {
  on_user_added : (func (DUser) -> ()) -> ()

This service has a method that takes a method reference, presumably with the intention of calling it whenever a new user was added to service D. And because the types line up nicely, we can compose these two services, by once calling D.on_user_added(C.register_user), so that from now on D calls C.register_user(…). These kind of service mesh-ups are central to the vision of the Internet Computer!

But what happens if the services now evolve their types in different ways, e.g. towards

type CUser = record { login : text; name : text; age : opt nat }


type DUser = record { login : text; name : text; age : opt variant { under_age; adult }}

Individually, these are type changes that we want to allow. But now the call from D to C may transmit a value of record { …; age = opt variant { under_age } } when C expects a natural number! This is precisely what we want to prevent by introducing types, and it seems we have lost soundness.

The best solution we could find is to make the opt type somewhat special, and apply these extra rules when decoding at an expected type of opt t.

  • If this was a record field, and the provided record value doesn’t even have a field of that name, don’t fail but instead treat this as null. This handles the first-order example above.

  • If there is a value, look at its type (which, in Candid, is part of the message).

    • If it is opt t' and t' <: t, then decode as normal.

    • If it is opt t' but t' <: t does not hold, then treat that as null.

      This should only happen in these relatively obscure higher-order cases where services evolved differently and incompatibly, and makes sure that the calls that worked before the upgrades will continue to work afterwards.

      It is not advisable to actually use that rule when changing your service’s interface. Tools that assist the developer with an upgrade check should prevent or warn about the use of this rule.

  • Not strictly required here, but since we are making opt special anyways:

    If its type t' is not of the form opt …, pretend it was opt t', and also pretend that the given value was wrapped in opt.

    This allows services to make record field in arguments that were required in the old version optional in a new version, e.g. as a way to deprecated them. So it is mildly useful, although I can report that it makes the meta-theory and implementation rather complex, in particular together with equirecursion and beasts like type O = opt O. See this discussion for a glimpse of that.

In the above I stress that we look at the type of the provided value, and not just the value. For example, if the sender sends the value opt vec {} at type opt vec nat, and the recipient expects opt vec bool, then this will decode as null, even though one could argue that the value opt vec {} could easily be understood at type opt vec bool. We initially had that, but then noticed that this still breaks soundness when there are references inside, and we have to do a full subtyping check in the decoder. This is very unfortunate, because it makes writing a Candid decoder a noticeably harder task that requires complicated graph algorithms with memoization (which I must admit Motoko has not implemented yet), but it’s the least bad solution we could find so far.

Isn’t that just dynamic typing?

You might have noticed that with these rules, t <: opt t' holds for all types t and t'. In other words, every opt … is a top type (like reserved), thus all optional types are equivalent. One could argue that we are throwing all of the benefits of static typing over board this way. But it’s not quite as bad: It’s true that decoding Candid values now involves a dynamic check inserting null values in certain situations, but as long as everyone plays by the rules (i.e. upgrades their services according to the Candid safe upgrade rules, and heeds the warning alluded to above), these certain situations will only happen in the obscurest of cases.

It is, by the way, not material to this solution that the special subtyping behavior is applied to “the” opt type, and a neighboring point in the design space would be a dedicated upgraded t type constructor. That would allow developers to use the canonical opt type with the usual subtyping rules and communicate their intent (“clients should consider this field optional” vs. “old clients don’t know about this field, but new clients should use it”) more cleanly – at the expense of having more “non-canonical” types in the type system.

To see why it is convenient if Candid has mostly just the “normal” types, read the next post, which will describe how Candid can be integrated into a host language.

by Joachim Breitner ( at September 07, 2021 04:53 PM

September 06, 2021

Philip Wadler

Concurrency, State, and Cardano


Recently, there have been a flurry of messages on Twitter about limitations to Cardano, many of them misleading. Sundae Swap have put together an excellent summary of the issues.

After months of testing with smaller groups, the Cardano public testnet was recently upgraded to support smart contracts. The surge of activity that followed included many dApp tests and experiments, with developers eager to perform a large-scale test and show off their hard work. This effort has created a ferocious discussion around some of the design decisions behind Cardano. Many critics are using this discussion as an opportunity to point to Cardano, misrepresent the problem, and ultimately underestimate the potential of one of the giants of the crypto industry. Misconceptions are now floating around suggesting that Cardano only supports one transaction per block, only one user can interact with a smart contract at a time, and that cardano is ultimately destined for centralization. All of these are inaccurate, and we present below a new framing and the start of a few solutions that dApp builders might choose.

by Philip Wadler ( at September 06, 2021 06:25 PM

Monday Morning Haskell

Hpack in Video Form!


For a couple more weeks we'll be continuing with videos from recent streams! As a reminder you can catch me streaming on Twitch every Monday at 7:30 pm PDT (UTC-07). In the latest video, I walk through refactoring Haskellings to use Hpack instead of relying on the normal .cabal file format. Enjoy!

by James Bowen at September 06, 2021 02:30 PM

FP Complete

Combining Axum, Hyper, Tonic, and Tower for hybrid web/gRPC apps: Part 2

This is the second of four posts in a series on combining web and gRPC services into a single service using Tower, Hyper, Axum, and Tonic. The full four parts are:

  1. Overview of Tower
  2. Today's post: Understanding Hyper, and first experiences with Axum
  3. Demonstration of Tonic for a gRPC client/server
  4. How to combine Axum and Tonic services into a single service

I recommend checking out the first post in the series if you haven't already.

Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

Quick recap

  • Tower provides a Service trait, which is basically an asynchronous function from requests to responses
  • Service is parameterized on the request type, and has an associated type for Response
  • It also has an associated Error type, and an associated Future type
  • Service allows async behavior in both checking whether the service is ready to accept a request, and for handling the request
  • A web application ends up having two sets of async request/response behavior
    • Inner: a service that accepts HTTP requests and returns HTTP responses
    • Outer: a service that accepts the incoming network connections and returns an inner service

With that in mind, let's look at Hyper.

Services in Hyper

Now that we've got Tower under our belts a bit, it's time to dive into the specific world of Hyper. Much of what we saw above will apply directly to Hyper. But Hyper has a few additional curveballs to deal with:

  • Both the Request and Response types are parameterized over the representation of the request/response bodies
  • There are a bunch of additional traits and type parameterized in the public API, some not appearing in the docs at all, and many that are unclear

In place of the run function we had in our previous fake server example, Hyper follows a builder pattern for initializing HTTP servers. After providing configuration values, you create an active Server value from your Builder with the serve method. Just to get it out of the way now, this is the type signature of serve from the public docs:

pub fn serve<S, B>(self, new_service: S) -> Server<I, S, E>
    I: Accept,
    I::Error: Into<Box<dyn StdError + Send + Sync>>,
    I::Conn: AsyncRead + AsyncWrite + Unpin + Send + 'static,
    S: MakeServiceRef<I::Conn, Body, ResBody = B>,
    S::Error: Into<Box<dyn StdError + Send + Sync>>,
    B: HttpBody + 'static,
    B::Error: Into<Box<dyn StdError + Send + Sync>>,
    E: NewSvcExec<I::Conn, S::Future, S::Service, E, NoopWatcher>,
    E: ConnStreamExec<<S::Service as HttpService<Body>>::Future, B>,

That's a lot of requirements, and not all of them are clear from the docs. Hopefully we can bring some clarity to this. But for now, let's start off with something simpler: the "Hello world" example from the Hyper homepage:

use std::{convert::Infallible, net::SocketAddr};
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};

async fn handle(_: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new("Hello, World!".into()))

async fn main() {
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));

    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle))

    let server = Server::bind(&addr).serve(make_svc);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);

This follows the same pattern we established above:

  • handle is an async function from a Request to a Response, which may fail with an Infallible value.
    • Both Request and Response are parameterized with Body, a default HTTP body representation.
  • handle gets wrapped up in service_fn to produce a Service<Request<Body>>. This is like app_fn above.
  • We use make_service_fn, like app_factory_fn above, to produce the Service<&AddrStream> (we'll get to that &AddrStream shortly).
    • We don't care about the &AddrStream value, so we ignore it
    • The return value from the function inside make_service_fn must be a Future, so we wrap with async
    • The output of that Future must be a Result, so we wrap with an Ok
    • We need to help the compiler out a bit and provide a type annotation of Infallible, otherwise it won't know the type of the Ok(service_fn(handle)) expression

Using this level of abstraction for writing a normal web app is painful for (at least) three different reasons:

  • Managing all of these Service pieces manually is a pain
  • There's very little in the way high level helpers, like "parse the request body as a JSON value"
  • Any kind of mistake in your types may lead to very large, non-local error messages that are difficult to diagnose

So we'll be more than happy to move on from Hyper to Axum a bit later. But for now, let's continue exploring things at the Hyper layer.

Bypassing service_fn and make_service_fn

What I found most helpful when trying to grok Hyper was implementing a simple app without service_fn and make_service_fn. So let's go through that ourselves here. We're going to create a simple counter app (I'm nothing if not predictable). We'll need two different data types: one for the "app factory", and one for the app itself. Let's start with the app itself:

struct DemoApp {
    counter: Arc<AtomicUsize>,

impl Service<Request<Body>> for DemoApp {
    type Response = Response<Body>;
    type Error = hyper::http::Error;
    type Future = Ready<Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, _cx: &mut std::task::Context) -> Poll<Result<(), Self::Error>> {

    fn call(&mut self, _req: Request<Body>) -> Self::Future {
        let counter = self.counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
        let res = Response::builder()
            .header("Content-Type", "text/plain; charset=utf-8")
            .body(format!("Counter is at: {}", counter).into());

This implementation uses the std::future::Ready struct to create a Future which is immediately ready. In other words, our application doesn't perform any async actions. I've set the Error associated type to hyper::http::Error. This error would be generated if, for example, you provided invalid strings to the header method call, such as non-ASCII characters. As we've seen multiple times, poll_ready just advertises that it's always ready to handle another request.

The implementation of DemoAppFactory isn't terribly different:

struct DemoAppFactory {
    counter: Arc<AtomicUsize>,

impl Service<&AddrStream> for DemoAppFactory {
    type Response = DemoApp;
    type Error = Infallible;
    type Future = Ready<Result<Self::Response, Self::Error>>;

    fn poll_ready(&mut self, _cx: &mut std::task::Context) -> Poll<Result<(), Self::Error>> {

    fn call(&mut self, conn: &AddrStream) -> Self::Future {
        println!("Accepting a new connection from {:?}", conn);
        std::future::ready(Ok(DemoApp {
            counter: self.counter.clone()

We have a different parameter to Service, this time &AddrStream. I did initially find the naming here confusing. In Tower, a Service takes some Request. And with our DemoApp, the Request it takes is a Hyper Request<Body>. But in the case of DemoAppFactory, the Request it's taking is a &AddrStream. Keep in mind that a Service is really just a generalization of failable, async functions from input to output. The input may be a Request<Body>, or may be a &AddrStream, or something else entirely.

Similarly, the "response" here isn't an HTTP response, but a DemoApp. I again find it easier to use the terms "input" and "output" to avoid the name overloading of request and response.

Finally, our main function looks much the same as the original from the "Hello world" example:

async fn main() {
    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));

    let factory = DemoAppFactory {
        counter: Arc::new(AtomicUsize::new(0)),

    let server = Server::bind(&addr).serve(factory);

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);

If you're looking to extend your understanding here, I'd recommend extending this example to perform some async actions within the app. How would you modify Future? If you use a trait object, how exactly do you pin?

But now it's time to take a dive into a topic I've avoided for a while.

Understanding the traits

Let's refresh our memory from above on the signature of serve:

pub fn serve<S, B>(self, new_service: S) -> Server<I, S, E>
    I: Accept,
    I::Error: Into<Box<dyn StdError + Send + Sync>>,
    I::Conn: AsyncRead + AsyncWrite + Unpin + Send + 'static,
    S: MakeServiceRef<I::Conn, Body, ResBody = B>,
    S::Error: Into<Box<dyn StdError + Send + Sync>>,
    B: HttpBody + 'static,
    B::Error: Into<Box<dyn StdError + Send + Sync>>,
    E: NewSvcExec<I::Conn, S::Future, S::Service, E, NoopWatcher>,
    E: ConnStreamExec<<S::Service as HttpService<Body>>::Future, B>,

Up until preparing this blog post, I have never tried to take a deep dive into understanding all of these bounds. So this will be an adventure for us all! (And perhaps it should end up with some documentation PRs by me...) Let's start off with the type variables. Altogether, we have four: two on the impl block itself, and two on this method:

  • I represents the incoming stream of connections.
  • E represents the executor.
  • S is the service we're going to run. Using our terminology from above, this would be the "app factory." Using Tower/Hyper terminology, this is the "make service."
  • B is the choice of response body the service returns (the "app", not the "app factory", using nomenclature above).

I: Accept

I needs to implement the Accept trait, which represents the ability to accept a new connection from some a source. The only implementation out of the box is for AddrIncoming, which can be created from a SocketAddr. And in fact, that's exactly what Server::bind does.

Accept has two associated types. Error must be something that can be converted into an error object, or Into<Box<dyn StdError + Send + Sync>>. This is the requirement of (almost?) every associated error type we look at, so from now on I'll just skip over them. We need to be able to convert whatever error happened into a uniform representation.

The Conn associated type represents an individual connection. In the case of AddrIncoming, the associated type is AddrStream. This type must implement AsyncRead and AsyncWrite for communication, Send and 'static so it can be sent to different threads, and Unpin. The requirement for Unpin bubbles up from deeper in the stack, and I honestly don't know what drives it.

S: MakeServiceRef

MakeServiceRef is one of those traits that doesn't appear in the public documentation. This seems to be intentional. Reading the source:

Just a sort-of "trait alias" of MakeService, not to be implemented by anyone, only used as bounds.

Were you confused as to why we were receiving a reference with &AddrStream? This is the trait that powers that transformation. Overall, the trait bound S: MakeServiceRef<I::Conn, Body, ResBody = B> means:

  • S must be a Service
  • S will accept input of type &I::Conn
  • It will in turn produce a new Service as output
  • That new service will accept Request<Body> as input, and produce Response<ResBody> as output

And while we're talking about it: that ResBody has the restriction that it must implement HttpBody. As you might guess, the Body struct mentioned above implements HttpBody. There are a number of implementations too. When we get to Tonic and gRPC, we'll see that there are, in fact, other response bodies we have to deal with.

NewSvcExec and ConnStreamExec

The default value for the E parameter is Exec, which does not appear in the generated docs. But of course you can find it in the source. The concept of Exec is to specify how tasks are spawned off. By default, it leverages tokio::spawn.

I'm not entirely certain of how all of these plays out, but I believe the two traits in the heading allow for different handling of spawning for the connection service (app factory) versus the request service (app).

Using Axum

Axum is the new web framework that kicked off this whole blog post. Instead of dealing directly with Hyper like we did above, let's reimplement our counter web service using Axum. We'll be using axum = "0.2". The crate docs provide a great overview of Axum, and I'm not going to try to replicate that information here. Instead, here's my rewritten code. We'll analyze a few key pieces below:

use axum::extract::Extension;
use axum::handler::get;
use axum::{AddExtensionLayer, Router};
use hyper::{HeaderMap, Server, StatusCode};
use std::net::SocketAddr;
use std::sync::atomic::AtomicUsize;
use std::sync::Arc;

#[derive(Clone, Default)]
struct AppState {
    counter: Arc<AtomicUsize>,

async fn main() {
    let addr = SocketAddr::from(([0, 0, 0, 0], 3000));

    let app = Router::new()
        .route("/", get(home))

    let server = Server::bind(&addr).serve(app.into_make_service());

    if let Err(e) = server.await {
        eprintln!("server error: {}", e);

async fn home(state: Extension<AppState>) -> (StatusCode, HeaderMap, String) {
    let counter = state
        .fetch_add(1, std::sync::atomic::Ordering::SeqCst);
    let mut headers = HeaderMap::new();
    headers.insert("Content-Type", "text/plain; charset=utf-8".parse().unwrap());
    let body = format!("Counter is at: {}", counter);
    (StatusCode::OK, headers, body)

The first thing I'd like to get out of the way is this whole AddExtensionLayer/Extension bit. This is how we're managing shared state within our application. It's not directly relevant to our overall analysis of Tower and Hyper, so I'll suffice with a link to the docs demonstrating how this works. Interestingly, you may notice that this implementation relies on middlewares, which does in fact leverage Tower, so it's not completely separate.

Anyway, back to our point at hand. Within our main function, we're now using this Router concept to build up our application:

let app = Router::new()
    .route("/", get(home))

This says, essentially, "please call the home function when you receive a request for /, and add a middleware that does that whole extension thing." The home function uses an extractor to get the AppState, and returns a value of type (StatusCode, HeaderMap, String) to represent the response. In Axum, any implementation of the appropriately named IntoResponse trait can be returned from handler functions.

Anyway, our app value is now a Router. But a Router cannot be directly run by Hyper. Instead, we need to convert it into a MakeService (a.k.a. an app factory). Fortunately, that's easy: we call app.into_make_service(). Let's look at that method's signature:

impl<S> Router<S> {
    pub fn into_make_service(self) -> IntoMakeService<S>
        S: Clone;

And going down the rabbit hole a bit further:

pub struct IntoMakeService<S> { /* fields omitted */ }

impl<S: Clone, T> Service<T> for IntoMakeService<S> {
    type Response = S;
    type Error = Infallible;
    // other stuff omitted

The type Router<S> is a value that can produce a service of type S. IntoMakeService<S> will take some kind of connection info, T, and produce that service S asynchronously. And since Error is Infallible, we know it can't fail. But as much as we say "asynchronously", looking at the implementation of Service for IntoMakeService, we see a familiar pattern:

fn poll_ready(&mut self, _cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {

fn call(&mut self, _target: T) -> Self::Future {
    future::MakeRouteServiceFuture {
        future: ready(Ok(self.service.clone())),

Also, notice how that T value for connection info doesn't actually have any bounds or other information. IntoMakeService just throws away the connection information. (If you need it for some reason, see into_make_service_with_connect_info.) In other words:

  • Router<S> is a type that lets us add routes and middleware layers
  • You can convert a Router<S> into an IntoMakeService<S>
  • But IntoMakeService<S> is really just a fancy wrapper around an S to appease the Hyper requirements around app factories
  • So the real workhorse here is just S

So where does that S type come from? It's built up by all the route and layer calls you make. For example, check out the get function's signature:

pub fn get<H, B, T>(handler: H) -> OnMethod<H, B, T, EmptyRouter>
    H: Handler<B, T>,

pub struct OnMethod<H, B, T, F> { /* fields omitted */ }

impl<H, B, T, F> Service<Request<B>> for OnMethod<H, B, T, F>
    H: Handler<B, T>,
    F: Service<Request<B>, Response = Response<BoxBody>, Error = Infallible> + Clone,
    B: Send + 'static,
    type Response = Response<BoxBody>;
    type Error = Infallible;
    // and more stuff

get returns an OnMethod value. And OnMethod is a Service that takes a Request<B> and returns a Response<BoxBody>. There's some funny business at play regarding the representations of bodies, which we'll eventually dive into a bit more. But with our newfound understanding of Tower and Hyper, the types at play here are no longer inscrutable. In fact, they may even be scrutable!

And one final note on the example above. Axum works directly with a lot of the Hyper machinery. And that includes the Server type. While the axum crate reexports many things from Hyper, you can use those types directly from Hyper instead if so desired. In other words, Axum is pretty close to the underlying libraries, simply providing some convenience on top. It's one of the reasons I'm pretty excited to get a bit deeper into my experiments with Axum.

So to sum up at this point:

  • Tower provides an abstraction for asynchronous functions from input to output, which may fail. This is called a service.
  • HTTP servers have two levels of services. The lower level is a service from HTTP requests to HTTP responses. The upper level is a service from connection information to the lower level service.
  • Hyper has a lot of additional traits floating around, some visible, some invisible, which allow for more generality, and also make things a bit more complicated to understand.
  • Axum sits on top of Hyper and provides an easier to use interface for many common cases. It does this by providing the same kind of services that Hyper is expecting to see. And it seems to be doing a bunch of fancy footwork around HTTP body representations.

Next step on our journey: let's look at another library for building Hyper services. We'll follow up on this in our next post.

Read part 3 now

If you're looking for more Rust content from FP Complete, check out:

September 06, 2021 12:00 AM

September 05, 2021

Philip Wadler

Reading list


Kevin Doran wrote to me requesting a reading list on logic. He recommends an introductory guide to logic by Peter Smith. Smith's appendix lists several textbooks on logic, but misses my two favourites.

  • Jean-Yves Girard, Yves Lafont, and Paul Taylor. Proofs and Types. CUP, 1989. The master logician (discoverer of System F and linear logic), aided by two superb computer scientists, covers the basics. Explains additive and multiplicative proof rules, and the fundamentals of propositions as types.
  • Jean van Heijenoort, From Frege to Gödel A Source Book in Mathematical Logic, 1879-1931, Harvard University Press, 1967. My favourite entry is the correspondence between Russell and Frege---it is only a few pages long and so a quick read; don't skip van Heijenoort's introduction. See also Anita Feferman's biography of van Heijenoort, who lived quite a life: he served as Trotsky's secretary and was shot to death by a jealous lover.

by Philip Wadler ( at September 05, 2021 11:09 AM

September 02, 2021

Matt Parsons

Family Values

I wrote a big thread on the company Slack to compare type families: open vs closed vs associated. I also ended up discussing data families, as well, since they are a good complement to type families. I’ll probably edit this further and include it in my book, Production Haskell, but here’s the Slack transcript for now:

An associated type family is an open type family, but with the requirement that it be defined for any type that’s also an instance of the type class. Consider the mono-traversable library. We have:

type family Element o :: Type

class MonoFunctor o where
  omap :: (Element o -> Element o) -> o -> o

class MonoFoldable o where
  ofoldMap :: Monoid m => (Element o -> m) -> o -> m

class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where
  otraverse :: (Applicative f) => (Element o -> f (Element o)) -> o -> f o

If we wanted to put Element as an associated type, like:

class MonoFunctor o where
  type Element o :: Type
  omap :: (Element o -> Element o) -> o -> o

Then we’d be a bit stuck - we need to require class (MonoFunctor o) => MonoFoldable o. This is a stronger requirement than regular Foldable, and that’s a problem. Or we define a separate type family on MonoFoldable. But then - what do we use with MonoTraversable?

If we’re desperate to avoid open type families, then we can work around this by having a dummy class that only carries the type family.

class HasElement o where
   type Element o :: Type

class (HasElement o) => MonoFunctor o where ...
class (HasElement o) => MonoFoldable o where ...
class (MonoFunctor o, MonoFoldable o) => MonoTraversable o where ...

But “a dummy class that only has a type family” seems obviously worse to me than “a type family” :shrug:

The question: “Should I use an open type family or a closed type family?” has an analog to simpler language features: type classes and sum types.

If you want a closed set, you use a sum type. If you want an open set, you use a type class. So if you’re familiar with the trade-offs there, the trade-offs with open/closed type familes are easier to evaluate

A closed set means you can be exhaustive - “every case is handled.” If you’re pattern matching on a datakind, like type family Foo (x :: Bool), then you can know that handling Foo 'True and Foo 'False that you’ve handled all cases. You don’t have to worry that some user is going to add a case and blow things up. (Well, you kinda do, because even closed type families aren’t actually closed, due to stuckness semantics, but, uhhhhh, that’s a bit of very advanced trickery, talk to csongor if you’re curious)

An open set is a way of allowing easy extensibility. So you’re going to accept something of kind Type or possibly a polymorphic kind variable to allow people to define their own types, and their own instances of these types. For example, if I want to associate a type with a string, I can write:

type family TypeString (sym :: Symbol) :: Type

type instance TypeString "Int" = Int
type instance TypeString "Char" = Char

And that lets me run programs at the type level, that end users can extend. Much like you can write a type class and end users can extend your functionality.

But ultimately you need to do something at the value level. Which means you need to take some type information and translate it to the value level. This is precisely what type classes do - they are morally “a function from a type to a value.” We can write a super basic function, like:

typeStringProxy :: Proxy sym -> Proxy (TypeString sm)
typeStringProxy _ = Proxy

But this is still not useful without further classes. The Default class assigns a special value to any type, and we could do something like this:

typeStringDefault :: forall sm. Default (TypeString sm) => Proxy sm -> TypeString sm
typeStringDefault = def @(TypeString sm)

Since associating a type class and an open type family is so common, it’s almost always better to use an associated type unless you know that the type family is going to be shared across multiple type classes.

“So how do you associate a closed type family with values?” That’s a great question. We can do the same trick with Proxy functions:

type family Closed (a :: Bool) where
  Closed 'True = Int
  Closed 'False = Char

closed :: Proxy b -> Proxy (Closed b)
closed _ = Proxy

But, until we know what b is, we can’t figure out what Closed b is. To pattern match on a type, we need a type class.

class RunClosed (b :: Bool) where
  runClosed :: Proxy b -> Closed b

instance RunClosed 'True where
    runClosed _ = 3

instance RunClosed 'False where
    runClosed _ = 'a'

I’m going to detour a bit here and mention data families. A data family is like a type family, but instead of allowing you to refer to any type, you have to specify the constructors inline. To take an example from persistent,

data family Key a

newtype instance Key User = UserKey { unUserKey :: UUID }
type UserId = Key User

newtype instance Key Organization = OrganizationKey { unOrganizationKey :: UUID }
type OrganizationId = Key Organization

An advantage of this is that, since you specify the constructors, you can know the type of Key a by knowing the constructor in use - the value specifies the type. OrganizationKey :: UUID -> Key Organization.

It looks a lot like an “open type family,” and in fact is completely analogous. But we don’t call them “open data families,” even though that’s an appropriate name for it. It should make you wonder - is there such a thing as a closed data family?

Aaaaaand - the answer is “yes”, but we call them GADTs instead.

The nice thing about an “open data family” is that you can learn about types by inspecting values - by knowing a value (like OrganizationKey uuid), I can work ‘backwards’ and learn that I have an Key Organization. But, I can’t write a case expression over all Key a - it’s open! and case only works on closed things. So this code does not work:

whatKey :: Key a -> Maybe UUID
whatKey k = case k of
  UserKey uuid -> Just uuid
  OrganizationKey uuid -> Just uuid
  _ -> Nothing

Indeed, we need a type class to allow us to write get :: Key a -> SqlPersistT m (Maybe a).

A GADT - as a closed data family - allows us to work from a value to a type, and since it is exhaustive, we can write case expressions on them.

data Key a where
    UserKey :: { unUserKey :: UUID } -> Key User
    OrganizationKey :: { unOrganizationKey :: UUID } -> Key Organization

If I have this structure, then I can actually write get without a type class.

get :: Key a -> SqlPersistT IO (Maybe a) 
get k = case k of
  UserKey uuid -> do
    [userName, userAge] <- rawSql "SELECT * FROM users WHERE id = ?" [toPersistValue uuid]
    pure User {..}
  OrganizationKey uuid -> ...

A GADT is ‘basically’ a closed type family that gives you constructor tags for applying that type family,. If we look at Closed, we can inline this:

type family ClosedTy (b :: Bool) where
  ClosedTy True = Int
  ClosedTy False = Char

data ClosedData (a :: Type) where
  ClosedData :: Proxy b -> ClosedData (ClosedTy b)

-- inlining:
data Closed (a :: Type) where
  ClosedTrue :: Proxy 'True -> Closed Int
  ClosedFalse :: Proxy 'False -> Closed Char

When we case on a Closed value, we get:

runClosed :: Closed a -> a
runClosed closed = 
  case closed of
    ClosedTrue (Proxy :: Proxy 'True) -> 3
    ClosedFalse (Proxy :: Proxy 'False) -> 'a'

I’m going to conclude now, with this distillation:

  • Open type family + type class = extensible, open programming, but no exhaustivity.
  • Closed type family + GADT + functions = exhaustive handling of types, but not extensible
  • An open type family + a GADT isn’t much fun.
  • A closed type family + a type class isn’t much fun

September 02, 2021 12:00 AM

September 01, 2021

Gabriel Gonzalez

Forward and reverse proxies explained


This post explains what is the difference between a forward proxy and a reverse proxy. This post will likely be most useful for:

  • engineers designing on-premises enterprise software to forward traffic in restrictive networking environments

  • authors of forward and reverse proxy packages / frameworks

  • people who want to learn more about networking in general

I’m writing this mainly because I had to puzzle through all of this when I was designing part of our product’s network architecture at work and I frequently explain what I learned to my coworkers. Most of the existing explanations of proxies were not helpful to me when I learned this for the first time, so I figured I would try to explain things in my own words so that I can reference this post when teaching others.

This post will not touch upon the use cases for forward and reverse proxies and instead will mostly focus on the architectural differences between them. If you are interested in learning more about how they are used in practice then I recommend reading the Wikipedia article on proxies:

Also, this post assumes some familiarity with the HTTP and HTTPS protocols.

The difference

The simplest difference between the two types of proxies is that:

  • A reverse proxy is a proxy where the proxy selects the origin server
  • A forward proxy is a proxy where the client selects the origin server

… and “origin server” means the server where the HTTP resource originates from that the proxy forwards the HTTP request to.

There are a few alternative definitions for forward proxy and reverse proxy, but in my experience the above definitions promote the correct intuition.

curl examples

I sometimes explain the difference between forward and reverse proxies in terms of curl commands.

For example, if I host a reverse proxy at that forwards requests to, then a sample HTTP request going through the reverse proxy might look like the following curl command:

$ curl

In a reverse proxy, the client (e.g. curl) does not select the origin server (e.g. In this particular example the origin server is hard-coded into the reverse proxy and there would be no way for the client to specify that the HTTP request should be forwarded to instead.

The data flow for such a request looks something like this:

│ │
│ │
│ │
↑ ↓
"GET / HTTP/1.1" "HTTP/1.1 200 OK"
↑ ↓
│ │
│ │
│ │
↑ ↓
"GET / HTTP/1.1" "HTTP/1.1 200 OK"
↑ ↓
│ │
│ curl │
│ │

In other words:

  • The client (e.g. curl) sends an HTTP request to the reverse proxy
  • The proxy (e.g. forwards the HTTP request to the origin server
  • The origin server (e.g. responds to the HTTP request
  • The proxy forwards the HTTP response back to the client

Now suppose that I host a forward proxy at A sample HTTP request destined for going through the forward proxy might look like this:

$ curl --proxy

In a forward proxy, the client (e.g. curl) selects the origin server (e.g. and could have potentially selected a different origin server, such as, like this:

$ curl --proxy

… and then the forward proxy would forward the request to instead1.

The data flow for a request going through a forward proxy depends on whether the client connects to the origin server using HTTP or HTTPS.

If the client uses HTTP then the data flow for a forward proxy typically looks like this:

│ │
│ │
│ │
↑ ↓
"GET / HTTP/1.1" "HTTP/1.1 200 OK"
↑ ↓
│ │
│ │
│ │
↑ ↓
"GET HTTP/1.1" "HTTP/1.1 200 OK"
↑ ↓
│ │
│ curl │
│ │

In other words:

  • The client (e.g. curl) sends an HTTP request to the proxy with the origin server in the request line
  • The proxy (e.g. forwards the request to the origin server specified on the request line
  • The origin server (e.g. responds to the HTTP request
  • The proxy forwards the HTTP response back to the client

The main difference from the previous example is the HTTP request line that curl sends to Forward proxies receive an absolute URI from the client (e.g. instead of a relative URI (like /). This is how the forward proxy knows where to forward the client’s request.

Now contrast that with the data flow for a forward proxy when the client uses HTTPS:

│ │
│ │
│ │


│ │ │
│ │ │
│ │ │
↑ │
↑ ↓
│ │
│ curl │
│ │

This is a different flow of information:

  • The client (e.g. curl) sends a CONNECT request to the proxy
  • The proxy (e.g. opens a TCP connection to the origin server (e.g.
  • The proxy forwards the rest of the client’s connection as raw TCP traffic directly to the origin server
    • This TCP connection is encrypted, so by default the proxy cannot intercept or modify HTTPS traffic

However, despite the difference in data flow both HTTP forward proxies and HTTPS forward proxies let the client select the origin server. This is why we still call them both forward proxies even though they are architecturally different server types.

nc examples

If you’re curious, you can verify the difference in curl’s behavior when using HTTP vs HTTPS forward proxies on the command line. If you set up nc to listen on port 8000:

$ nc -l 8000

… and then ask curl to use localhost:8000 as a forward proxy, you’ll see different results depending on whether the origin server URI uses an HTTP or HTTPS scheme. For example, if you run:

$ curl --proxy http://localhost:8000 --data 'A secret!'

… then nc will print the following incoming request from curl:

$ nc -l 8000
User-Agent: curl/7.64.1
Accept: */*
Proxy-Connection: Keep-Alive
Content-Length: 9
Content-Type: application/x-www-form-urlencoded

A secret!

Note that the request line has an absolute URI specifying the origin server to connect to. Also, the HTTP forward proxy has complete access to the contents of the request (including the headers and payload) and can tamper with them before sending the request further upstream.

Contrast that with an HTTPS request that goes through a forward proxy:

$ curl --proxy http://localhost:8000 --data 'A secret!'

… where now nc will print an incoming request that looks like this:

$ nc -l 8000
User-Agent: curl/7.64.1
Proxy-Connection: Keep-Alive

This time the HTTP method is CONNECT (regardless of what the original method was) and the client only divulges enough information to the proxy to establish the TCP tunnel to

Blurring the line

Sometimes people configure reverse proxies to behave like a “poor person’s forward proxy”. For example, you could imagine a reverse proxy configured to select the origin server based on the value of an HTTP header:

$ curl --header 'Origin:'

… or based on the URI:

$ curl

… or based on a subdomain (one per supported origin server):

$ # Yes, I have actually seen this in the wild
$ curl

This sort of thing is possible, and there are some constrained use cases for doing things this way, but you should err on the side of using a forward proxy if you intend to let the client select the origin server. Forward proxy software (like squid) will better support this use case than reverse proxy software (like haproxy) and HTTP clients (like your browser or curl) will also better support forward proxies for this use case.

Example pseudocode

To further clarify the difference, here is some example pseudocode for how one would implement the various types of proxies.

The request handler for a hand-written reverse proxy might look something like the following Python pseudocode:

def handleRequest(request):
# Forward the incoming request further upstream to some predefined origin
# server (e.g., typically with some changes to the HTTP headers
# that we won't cover in this post.
response = httpRequest(
host = "",
method = request.method,
headers = fixRequestHeaders(request.headers),
path = request.path,
query = request.query,
body = request.body

# Now forward the origin server's response back to the client
headers = fixResponseHeaders(response.headers),
statusCode = response.statusCode,
body = response.body

The request handler for an HTTP forward proxy would look something like this:

def handleRequest(request):
# A request to an HTTP forward proxy has a request line with an absolute
# URI:
# … which is how the proxy knows where to forward the request further
# upstream.
upstreamURI = request.absoluteURI

if upstreamURI is not None:
# Other than obtaining the origin server from the request line, the rest
# is similar to a reverse proxy.
response = httpRequest(
host =,
method = request.method,
headers = fixRequestHeaders(request.headers),
path = upstreamURI.path,
query = upstreamURI.query,
body = request.body,

headers = fixResponseHeaders(response.headers),
statusCode = response.statusCode,
body = response.body
statusCode = 400,
body = "The request line did not have an absolute URI",

On the other hand, a HTTPS forward proxy would look something like this:

def handleRequest(request):
# A request to an HTTPS forward proxy has a request line of the form:
if request.connect is not None:
host =,
port = request.connect.port,
socket = request.socket
statusCode = 400,
body = "The HTTP method was not CONNECT"

Combining a forward and reverse proxy

This section illustrates a scenario where you might combine a forward proxy and a reverse proxy to further reinforce the difference between the two.

Suppose that we wish to create a forward proxy reachable at using squid. The problem is that squid does not provide out-of-the-box support for listening to HTTPS connections to squid itself.

Carefully note that squid can easily forward an HTTPS connection (using a CONNECT tunnel, as explained above), but squid does not support encrypting the client’s initial HTTP request to squid itself requesting the establishment of the CONNECT tunnel.

However, we can run a reverse proxy that can handle TLS termination (such as haproxy) listening on The reverse proxy’s sole job is to accept encrypted HTTPS requests sent to and then forward the decrypted HTTP request to the squid service running on the same machine. squid then forwards the request further depending on the origin server specified in the HTTP request line.

In this scenario, squid is still acting as a forward proxy, in the sense that squid uses the HTTP request line provided by the client to determine where to forward the request. Vice versa, haproxy is acting as a reverse proxy because haproxy always forwards all incoming requests to squid regardless of what the client specifies.

This scenario illustrates one reason why I don’t define forward or reverse proxies in terms of which network or machine the proxy runs on. In this example, both the forward proxy and reverse proxy are running on the same network and machine.

There are other types of proxies that this post didn’t cover, but if you want to learn more you should also check out:

  • Transparent proxies

    These are proxies where the client doesn’t specify the proxy at all because it intercepts all client traffic. I personally view transparent proxies as a third type of forward proxy (the client still selects the origin server), although transparent proxies are architecturally very different from non-transparent forward proxies.

  • SSL Forward Proxy

    This is a variation on a forward proxy that can decrypt HTTPS traffic. However, this requires the client to opt into this decryption in some way.

  1. Some forward proxies do not forward all HTTP requests. Instead, a forward proxy might specify which hostnames, protocols, and ports to permit in an access control list. This means that the example from the text would only work if the forward proxy’s access control lists permitted requests to and↩︎

by Gabriella Gonzalez ( at September 01, 2021 04:20 PM

Mark Jason Dominus

On having the right theorem

A recent Math StackExchange question asks “Prove every permutation of the alphabet contains a subset of six letters in order”. That is, you take a string of length 26 that contains each letter once; you can find a subsequence of six letters that is either increasing or decreasing.

Choosing a permutation at random, suppose we have:

    q y x u l i n g w o c k j d r p f v t s h e a z b m

Then the sequence x w v t s e b has length 7 and is in descending order. Or:

    t h e q u i c k b r o w n f x j m p s v r l z y d g

This contains the ascending sequence h q r s v y. Also the descending sequence t q o n m l d.

I thought about this for a while but couldn't make any progress. But OP had said “I know I have to construct a partially ordered set and possibly use Dilworth's Theorem…” so I looked up Dilworth's theorem.

It turns out I did actually know Dilworth's theorem. It is about partially-ordered sets. Dilworth's theorem says that if you partition a partially-ordered set into totally-ordered subsets, called ‘chains’, then the number of such chains is at least as big as the size of the largest “antichain”. An antichain is a subset in which no two elements are comparable.

This was enough of a hint, and I found the solution pretty quickly after that. Say that is the position of letter in the string . Define the partial order : $$ i\prec j \qquad \equiv \qquad i < j \text{ and } S[i] < S[j] $$

That is, means that is alphabetically earlier than and its position in is to the left of . This is obviously a partial ordering of the letters of the alphabet. Chains under are, by definition, ascending sequences of letters from . It's easy to show that antichains are descending sequences.

Partition into chains. If any chain has length or more, that is an ascending sequence of letters and we win.

If not, then no chain has more than 5 letters, and so there must be at least chains, because and there are letters. Since there are at least chains, Dilworth's theorem tells us there is an antichain of at least letters, and hence a descending sequence of that length, and we win.

Once you have the idea to use Dilworth's theorem, the problem collapses. (You also have to invent the right relation, but there's really only one possible choice.)

Maybe I'll write a followup article about how just having a theorem doesn't necessarily mean you have an algorithm. Mathematicians will say “partition into chains,” but actually programming something like that might be nontrivial. Then finding the antichain among the chains might also be nontrivial.

by Mark Dominus ( at September 01, 2021 01:42 PM

August 30, 2021

Joachim Breitner

A Candid explainer: Safe higher-order upgrades

This is the second post in a series about the interface description language Candid.

Safe upgrades

A central idea behind Candid is that services evolve over time, and so also their interfaces evolve. As they do, it is desirable to keep the interface usable by clients who have not been updated. In particular on a blockchainy platform like the Internet Computer, where some programs are immutable and cannot be changed to accommodate changes in the interface of the services they use, this is of importance.

Therefore, Candid defines which changes to an interface are guaranteed to be backward compatible. Of course it’s compatible to add new methods to a service, but some changes to a method signature can also be ok. For example, changing

service A1 : {
  get_value : (variant { current; previous : nat })
    -> (record { value : int; last_change : nat })


service A2 : {
  get_value : (variant { current; previous : nat; default })
    -> (record { value : int; last_change : nat; committed : bool })

is fine: It doesn’t matter that clients that still use the old interface don’t know about the new constructor of the argument variant. And the extra field in the result record will silently be ignored by old clients.

In the Candid spec, this relation is written as A2 <: A1 (note the order!), and the formal footing this builds on is subtyping. We thus say that “it is safe to upgrade a service to a subtype”, and that A2’s interface is a subtype of A1’s.

In small examples, I often use nat and int, because every nat is also a int, but some int values are not not nat values, namely the negative ones. We say nat is a subtype of int, nat <: int. So a function that in the first returns a int can in the new version return a nat without breaking old clients. But the other direction doesn’t work: If the old version’s method had a return type of nat, then changing that to int would be a breaking change, as old clients would not be prepared to handle negative numbers.

Note that arguments of function types follow different rules. In fact, the rules are simply turned around, and in argument position (also called “negative position”), we can evolve the interface to accept supertypes. Concretely, a function that originally expected an nat can be changed to expect an int, but the other direction doesn’t work, because there might still be old clients around that send negative numbers. This reversing-of-the-rules is called contravariance.

The vision is that the developer’s tooling will warn the developer before they upgrade the code of a running service to a type that is not a subtype of the old interface.

Let me stress, though, that all this is about not breaking existing clients that use the old interface. It does not mean that a client developer who fetches the new interface for your service won’t have to change their code to make his programs compile again.

Higher order composition

So far, we considered the simple case of one service with a bunch of clients, i.e. the “first order” situation. Things get much more interesting if we have multiple services that are composed, and that pass around references to each other, and we still want everything to be nicely typed and never fail even as we upgrade services.

Therefore, Candid’s type system can express that a service’s method expects or a returns a reference to another service or method, and the type that this service or method should have. For example

service B : { add_listener : (text, func (int) -> ()) -> () }

says that the service with interface B has a method called add_listener which takes two arguments, a plain value of type text and a reference to a function that itself expects a int-typed argument.

The contravariance of the subtyping rules explained above also applies to the types of these function reference. And because the int in the above type is on the left of two function arrows, the subtyping rule direction flips twice, and it is therefore safe to upgrade the service to accept the following interface:

service B : { add_listener : (text, func (nat) -> ()) -> () }

Soundness theorem and Coq proof

The combination of these higher order features and the safe upgrade mechanism is maybe the unique selling point of Candid, but also what made its design quite tricky sometimes. And although the conventional subtyping rules work well, we wanted to do some less conventional things (more on that below), and more than once thought we had a good solution, only to notice that it did not hold water in complicated higher-order cases.

But what does it mean to hold water? I felt the need to precisely define a soundness criteria for higher order interface description languages, which you can find in the document An IDL Soundness Proposition. This is a general framework which you can instantiate with your concrete IDL language and rules, and then it tells you what you have to prove to consider your language to be sound. Intuitively, it simulates all possible ways in which services can be defined and upgraded, and in which they can pass around references to each other, and the soundness property is that then that for all messages sent between services, they can always be understood.

The document also shows, in general, that any system that builds on “canonical subtyping” in sound, as expected. That is still a generic theorem that you can instantiate with a concrete system, but now there is less to do.

Because these proofs can get tricky in corner cases, it is valuable to mechanize them in an interactive theorem prover and have the computer check the proofs. So I have created a Coq formalization that defines the soundness criteria, including the reduction to canonical subtyping. It also defines MiniCandid, a greatly simplified variant of Candid, proves various properties about it (transitivity etc.) and finally instantiates the soundness theorem.

I am particularly fond of the use of coinduction to model the equirecursive types, and the use of named cases, as we know them from Isabelle, to make the proofs a bit more readable and maintainable.

I am less fond of how much work it seems to be to extend MiniCandid with more of Candid’s types. Already adding vec was more work than it’s worth it, and I defensively blame Coq’s not-so-great support for nested recursion.

The soundness relies on subtyping, and that is all fine and well as long as we stick to canonical rules. Unfortunately, practical considerations force us to deviate from these a bit, as we see in the next post of this series.

by Joachim Breitner ( at August 30, 2021 09:21 PM

Tweag I/O

Daily ICFP: Day 5

Daily ICFP

Several Tweagers all agreed to gather notes here about our ICFP (International Conference on Functional Programming) experiences.

Other posts in this series:

Day 5 - Friday

These notes follow Arnaud, Noon and Richard through their day at ICFP, roughly in order. (The talks are not publicly posted at the time of publication of this post, but we expect they will be over the coming weeks.)

ArnaudParafuzz: Coverage-guided Property Fuzzing for Multicore OCaml programs: this is in the context of multicore OCaml (multicore OCaml has algebraic effects). The authors use algebraic effects to abstract over concurrency primitives, so that they can test concurrency property with the AFL fuzzer (AFL tries to generate random inputs, here random schedules, while trying very hard to trigger code paths that were not covered by previous runs).

ArnaudWibbily Wobbly Timey Camly: an OCaml library to deal with time, timezones, time intervals, etc… The talk focused a lot on finding out whether a particular date belong to a set of dates. These sets are built by assembling constraints. It’s an unusual, but very convincing design.

Noon — FARM starts today! I’m so excited; I’ve wanted to attend FARM for years, and this is my first opportunity!

Noonmimium: a self-extensible programming language for sound and music

  • Nice language that has scheduling and state.
  • Has a focus on composition over live-coding (c.f. extempore, say.)

Noon — Unfortunately, the next speaker wasn’t able to make it, so there was no talk.

Noon — Bit disappointed that FARM and Haskell Symposium are on at the same time, I’ve ended up attending Haskell, and I’ll hope to watch FARM at a later point.

Practical Normalization by Evaluation for EDSLs, by Nachiappan Villiappan, Alejandro Russo, and Sam Lindley


  • Prefer shallow over deep embedding to piggy-back features from the host language.
  • Downside is we’ve lost syntax for the operations.
  • Maybe one way is to interconvert between expression representation and host (?) representation
  • Doesn’t always work; sometimes there’s no unique choice.


I had heard the term “normalization by evaluation” a number of times in the literature, but I never really understood it until watching this talk: the idea is to take a bit of syntax, interpret it into a semantic domain (that is, reinterpret it as an expression in the host language) and then inject the result back into the encoding of the syntax of the object language. The only problem is, now that I’ve described it thusly, it feels vacuous once again: what is the difference between “normalization by evaluation” and just evaluating terms in your object language? The term keeps cropping up, but I don’t really get it.

Let me think a bit more about what the words mean: normalization is the process of finding a normal form, where a normal form is (generally) a distinguished member of an equivalence class, useful for checking membership in the equivalence class. That is, if we want to identify all semantically equal expressions, we want to notice that 1 + 1 and 2 are semantically equal; thus, they are in the same equivalence class. A good way to notice this is to choose one distinguished member of each equivalence class: this is the normal form. For 1 + 1 and 2, 2 will be this normal form. Evaluation is a separate process by which we take an expression and simplify it according to well-established rules of evaluation.

Putting this together, we might surmise that “normalization by evaluation” is an approach for checking semantic equality of expressions, using this recipe:

To determine whether e1 and e2 are equal:

  1. Evaluate e1 to yield value v1.
  2. Evaluate e2 to yield value v2.
  3. Declare that e1 is semantically equal to e2 iff v1 is syntactically equal to v2.

This process does not really care what v1 and v2 are — just whether they’re syntactically equal. So maybe that really is the essence of normalization by evaluation. And I’ve probably struggled to understand this because this process seems “obvious” to me, and so I cannot really imagine another way of checking semantic equality of expressions.

In the end, I’m not sure whether this definition of normalization-by-evaluation is correct, and I’m afraid I got stuck on this during the talk and did not extract other useful bits.

NoonSafe Mutation with Algebraic Effects

  • Concurrency is frustrating; we don’t want non-determinism!
  • Maybe algebraic effects can help?
  • Idea: Annotate the resources and modify these to control what can be done.
  • It all works!

RichardSeeking Stability by being Lazy and Shallow: Lazy and shallow instantiation is user friendly, by Gert-Jan Bottu and Richard A. Eisenberg.

I will not comment further on this talk other than to include this shameless plug, and to publicly state how much I enjoyed working with former Tweag intern Gert-Jan on developing this paper. I do hope to submit a proposal requesting lazy instantiation in future versions of GHC.

Noon — I had wanted to watch the Linear Haskell, Today and Tomorrow talk, but it was scheduled alongside ShutdownPL (and FARM); I really want to support ICFPs efforts for more DEI (diversity, equity, and inclusion) content so I feel compelled to attend ShutdownPL; I’ll have to catch the other talks at a later time.

RichardLinear Haskell, Today and Tomorrow, keynote by former Tweager Jean-Philippe Bernardy

This is a fine keynote describing the state of linear types in Haskell today and sketching out how we can extract more from them in the future. In particular, it describes the long-term plan for supporting pure mutable-in-place structures, as long as we can verify statically (via linear types) that there are never two live copies of the structures.

NoonShutdownPL - Seeking Good Trouble Before It Goes Bad: Anti-Jerk Systems in STEM

  • This was a bold and strong talk. The main discussion was around the need to address problematic people at the end of the so-called “pipeline problem” in STEM; i.e. there’s no point bringing people in to a community if they are just going to leave because they are not welcomed.
  • So, one of the key ideas I got from this talk was to consider why people leave. I think this is a very powerful concept and often quite hard to do.
  • Another key idea from this talk was highlighting the damage done to communities by supporting, publicly, people known to be engaging in problematic behaviour. While I think it’s perhaps very obvious, it also can be quite subtle; the talk goes into a discussion of legal issues some of the community leaders, in the specific example, were facing, and the cost of speaking up; which is often very high; especially high if you are not in a privileged position.
  • One of the most practical ideas that I came out of this talk with, is this: Survey people who are in/leaving your community, and find out what they are thinking. It doesn’t have to be entirely negative; it could also solicit comments on things that are going well, or new ideas, etc. But one idea that came up, in discussions with my partner afterwards, was the idea of what I’m calling a “Living Code of Conduct”: It’s a mechanism for flagging behaviour (good or bad!) and aligning it to a specific code-of-conduct item managed by the community. It’s probably best done anonymously; but doesn’t have to be; there could be lots of variants. In any case, if you’re at all interested in this idea, do reach out to me; I’d love to chat more about it!

NoonDeadlock-Free Session Types in Linear Haskell

  • Really enjoying the presentations by Wen; she has a very engaging style.
  • First time I’ve heard about Session types, but I think I got a good feeling.
  • Overall I really enjoyed it!

NoonEvaluating Linear Functions to Symmetric Monoidal Categories

  • Box and wire diagrams.
  • Symmetric Monoidal Categories (SMCs).
  • Issue with Arrows is that they don’t represent these box-and-write diagrams very well; and in particular fail at the parallelisation.
  • Idea is to implement a nice notation for working with SMCs in Haskell.
  • Perhaps has applications to quantum computing!
  • So of course I’m interested, and will try and do a bit more reading.



  • Overall, my experience at ICFP has been great.
  • I feel very full of FP/Type theory knowledge! I feel like I learned a lot of new words and concepts that I will inevitably have to google again, but now they’ll spark a pleasant memory in my mind :)
  • I met several very friendly people, and also watched many lovely talks that I enjoyed; hopefully some friendly faces from ICFP will be appearing on the Compositional podcast in the next few months!
  • Thanks to the organisers and all the speakers for their hard work!

August 30, 2021 12:00 AM

FP Complete

Combining Axum, Hyper, Tonic, and Tower for hybrid web/gRPC apps: Part 1

I've played around with various web server libraries and frameworks in Rust, and found various strengths and weaknesses with them. Most recently, I put together an FP Complete solution called Zehut (which I'll blog about another time) that needed to combine a web frontend and gRPC server. I used Hyper, Tonic, and a minimal library I put together called routetype. It worked, but I was left underwhelmed. Working directly with Hyper, even with the minimal routetype layer, felt too ad-hoc.

When I recently saw the release of Axum, it seemed to be speaking to many of the needs I had, especially calling out Tonic support. I decided to make an experiment of replacing the direct Hyper+routetype usage I'd used with Axum. Overall the approach works, but (like the routetype work I'd already done) involved some hairy business around the Hyper and Tower APIs.

I've been meaning to write some blog post/tutorial/experience report for Hyper+Tower for a while now. So I decided to take this opportunity to step through these four libraries (Tower, Hyper, Axum, and Tonic), with the specific goal in mind of creating hybrid web/gRPC apps. It turned out that there was more information here than I'd anticipated. To make for easier reading, I've split this up into a four part blog post series:

  1. Today's post: overview of Tower
  2. Understanding Hyper, and first experiences with Axum
  3. Demonstration of Tonic for a gRPC client/server
  4. How to combine Axum and Tonic services into a single service

Let's dive in!

Subscribe to our blog via email
Email subscriptions come from our Atom feed and are handled by Blogtrottr. You will only receive notifications of blog posts, and can unsubscribe any time.

What is Tower?

The first stop on our journey is the tower crate. To quote the docs, which state this succinctly:

Tower provides a simple core abstraction, the Service trait, which represents an asynchronous function taking a request and returning either a response or an error. This abstraction can be used to model both clients and servers.

This sounds fairly straightforward. To express it in Haskell syntax, I'd probably say Request -> IO Response, leveraging the fact that IO handles both error handling and asynchronous I/O. But the Service trait is necessarily more complex than that simplified signature:

pub trait Service<Request> {
    type Response;
    type Error;

    // This is what it says in the generated docs
    type Future: Future;

    // But this more informative piece is in the actual source code
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn poll_ready(
        &mut self,
        cx: &mut Context<'_>
    ) -> Poll<Result<(), Self::Error>>;
    fn call(&mut self, req: Request) -> Self::Future;

Service is a trait, parameterized on the types of Requests it can handle. There's nothing specific about HTTP in Tower, so Requests may be lots of different things. And even within Hyper, an HTTP library leveraging Tower, we'll see that there are at least two different types of Request we care about.

Anyway, two of the associated types here are straightforward: Response and Error. Combining the parameterized Request with Response and Error, we basically have all the information we care about for a Service.

But it's not all the information Rust cares about. To provide for asynchronous calls, we need to provide a Future. And the compiler needs to know the type of the Future we'll be returning. This isn't really useful information to use as a programmer, but there are plenty of pain points already around async code in traits.

And finally, what about those last two methods? They are there to allow the Service itself to be asynchronous. It took me quite a while to fully wrap my head around this. We have two different components of async behavior going on here:

  • The Service may not be immediately ready to handle a new incoming request. For example (coming from the docs on poll_ready), the server may currently be at capacity. You need to check poll_ready to find out whether the Service is ready to accept a new request. Then, when it's ready, you use call to initiate handling of a new Request.
  • The handling of the request itself is also async, returning a Future, which can be polled/awaited.

Some of this complexity can be hidden away. For example, instead of giving a concrete type for Future, you can use a trait object (a.k.a. type erasure). Stealing again from the docs, the following is a perfectly valid associated type for Future:

type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>;

However, this incurs some overhead for dynamic dispatch.

Finally, these two layers of async behavior are often unnecessary. Many times, our server is always ready to handle a new incoming Request. In the wild, you'll often see code that hard-codes the idea that a service is always ready. To quote from those docs for the final time in this section:

fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {

This isn't saying that request handling is synchronous in our Service. It's saying that request acceptance always succeeds immediately.

Going along with the two layers of async handling, there are similarly two layers of error handling. Both accepting the new request may fail, and processing the new request may fail. But as you can see in the code above, it's possible to hard-code something which always succeeds with Ok(()), which is fairly common for poll_ready. When processing the request itself also cannot fail, using Infallible (and eventually the never type) as the Error associated type is a good call.

Fake web server

That was all relatively abstract, which is part of the problem with understanding Tower (at least for me). Let's make it more concrete by implementing a fake web server and fake web application. My Cargo.toml file looks like:

name = "learntower"
version = "0.1.0"
edition = "2018"

tower = { version = "0.4", features = ["full"] }
tokio = { version = "1", features = ["full"] }
anyhow = "1"

I've uploaded the full source code as a Gist, but let's walk through this example. First we define some helper types to represent HTTP request and response values:

pub struct Request {
    pub path_and_query: String,
    pub headers: HashMap<String, String>,
    pub body: Vec<u8>,

pub struct Response {
    pub status: u32,
    pub headers: HashMap<String, String>,
    pub body: Vec<u8>,

Next we want to define a function, run, which:

  • Accepts a web application as an argument
  • Loops infinitely
  • Generates fake Request values
  • Prints out the Response values it gets from the application

The first question is: how do you represent that web application? It's going to be an implementation of Service, with the Request and Response types being those we defined above. We don't need to know much about the errors, since we'll simply print them. These parts are pretty easy:

pub async fn run<App>(mut app: App)
    App: Service<crate::http::Request, Response = crate::http::Response>,
    App::Error: std::fmt::Debug,

But there's one final bound we need to take into account. We want our fake web server to be able to handle requests concurrently. To do that, we'll use tokio::spawn to create new tasks for handling requests. Therefore, we need to be able to send the request handling to a separate task, which will require bounds of both Send and 'static. There are at least two different ways of handling this:

  • Cloning the App value in the main task and sending it to the spawned task
  • Creating the Future in the main task and sending it to the spawned task

There are different runtime impacts of making this decision, such as whether the main request accept loop will be blocked or not by the application reporting that it's not available for requests. I decided to go with the latter approach. So we've got one more bound on run:

App::Future: Send + 'static,

The body of run is wrapped inside a loop to allow simulating an infinitely running server. First we sleep for a bit and then generate our new fake request:


let req = crate::http::Request {
    path_and_query: "/fake/path?page=1".to_owned(),
    headers: HashMap::new(),
    body: Vec::new(),

Next, we use the ready method (from the ServiceExt extension trait) to check whether the service is ready to accept a new request:

let app = match app.ready().await {
    Err(e) => {
        eprintln!("Service not able to accept requests: {:?}", e);
    Ok(app) => app,

Once we know we can make another request, we get our Future, spawn the task, and then wait for the Future to complete:

let future =;
tokio::spawn(async move {
    match future.await {
        Ok(res) => println!("Successful response: {:?}", res),
        Err(e) => eprintln!("Error occurred: {:?}", e),

And just like that, we have a fake web server! Now it's time to implement our fake web application. I'll call it DemoApp, and give it an atomic counter to make things slightly interesting:

pub struct DemoApp {
    counter: Arc<AtomicUsize>,

Next comes the implementation of Service. The first few bits are relatively easy:

impl tower::Service<crate::http::Request> for DemoApp {
    type Response = crate::http::Response;
    type Error = anyhow::Error;
    type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send>>;

    // Still need poll_ready and call

Request and Response get set to the types we defined, we'll use the wonderful anyhow crate's Error type, and we'll use a trait object for the Future. We're going to implement a poll_ready which is always ready for a Request:

fn poll_ready(
    &mut self,
    _cx: &mut std::task::Context<'_>,
) -> Poll<Result<(), Self::Error>> {
    Poll::Ready(Ok(())) // always ready to accept a connection

And finally we get to our call method. We're going to implement some logic to increment the counter, fail 25% of the time, and otherwise echo back the request from the user, with an added X-Counter response header. Let's see it in action:

fn call(&mut self, mut req: crate::http::Request) -> Self::Future {
    let counter = self.counter.clone();
    Box::pin(async move {
        println!("Handling a request for {}", req.path_and_query);
        let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
        anyhow::ensure!(counter % 4 != 2, "Failing 25% of the time, just for fun");
            .insert("X-Counter".to_owned(), counter.to_string());
        let res = crate::http::Response {
            status: 200,
            headers: req.headers,
            body: req.body,
        Ok::<_, anyhow::Error>(res)

With all that in place, running our fake web app on our fake web server is nice and easy:

async fn main() {


One thing that's particularly unsatisfying about the code above is how much ceremony it takes to write a web application. I need to create a new data type, provide a Service implementation for it, and futz around with all that Pin<Box<Future>> business to make things line up. The core logic of our DemoApp is buried inside the call method. It would be nice to provide a helper of some kind that lets us define things more easily.

You can check out the full code as a Gist. But let's talk through it here. We're going to implement a new helper app_fn function which takes a closure as its argument. That closure will take in a Request value, and then return a Response. But we want to make sure it asynchronously returns the Response. So we'll need our calls to look something like:

app_fn(|req| async { some_code(req).await })

This app_fn function needs to return a type which provides our Service implementation. Let's call it AppFn. Putting these two things together, we get:

pub struct AppFn<F> {
    f: F,

pub fn app_fn<F, Ret>(f: F) -> AppFn<F>
    F: FnMut(crate::http::Request) -> Ret,
    Ret: Future<Output = Result<crate::http::Response, anyhow::Error>>,
    AppFn { f }

So far, so good. We can see with the bounds on app_fn that we'll accept a Request and return some Ret type, and Ret must be a Future that produces a Result<Response, Error>. Implementing Service for this isn't too bad:

impl<F, Ret> tower::Service<crate::http::Request> for AppFn<F>
    F: FnMut(crate::http::Request) -> Ret,
    Ret: Future<Output = Result<crate::http::Response, anyhow::Error>>,
    type Response = crate::http::Response;
    type Error = anyhow::Error;
    type Future = Ret;

    fn poll_ready(
        &mut self,
        _cx: &mut std::task::Context<'_>,
    ) -> Poll<Result<(), Self::Error>> {
        Poll::Ready(Ok(())) // always ready to accept a connection

    fn call(&mut self, req: crate::http::Request) -> Self::Future {

We have the same bounds as on app_fn, the associated types Response and Error are straightforward, and poll_ready is the same as it was before. The first interesting bit is type Future = Ret;. We previously went the route of a trait object, which was more verbose and less performant. This time, we already have a type, Ret, that represents the Future the caller of our function will be providing. It's really nice that we get to simply use it here!

The call method leverages the function provided by the caller to produce a new Ret/Future value per incoming request and hand it back to the web server for processing.

And finally, our main function can now embed our application logic inside it as a closure. This looks like:

async fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    fakeserver::run(util::app_fn(move |mut req| {
        // need to clone this from the closure before moving it into the async block
        let counter = counter.clone();
        async move {
            println!("Handling a request for {}", req.path_and_query);
            let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
            anyhow::ensure!(counter % 4 != 2, "Failing 25% of the time, just for fun");
                .insert("X-Counter".to_owned(), counter.to_string());
            let res = crate::http::Response {
                status: 200,
                headers: req.headers,
                body: req.body,
            Ok::<_, anyhow::Error>(res)

Side note: the extra clone

From bitter experience, both my own and others I've spoken with, that let counter = counter.clone(); above is likely the trickiest piece of the code above. It's all too easy to write code that looks something like:

let counter = Arc::new(AtomicUsize::new(0));
fakeserver::run(util::app_fn(move |_req| async move {
    let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
        "Just demonstrating the problem, counter is {}",

This looks perfectly reasonable. We move the counter into the closure and then use it. However, the compiler isn't too happy with us:

error[E0507]: cannot move out of `counter`, a captured variable in an `FnMut` closure
   --> src\
95  |       let counter = Arc::new(AtomicUsize::new(0));
    |           ------- captured outer variable
96  |       fakeserver::run(util::app_fn(move |_req| async move {
    |  _________________________________________________________^
97  | |         let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
    | |                       -------
    | |                       |
    | |                       move occurs because `counter` has type `Arc<AtomicUsize>`, which does not implement the `Copy` trait
    | |                       move occurs due to use in generator
98  | |         Err(anyhow::anyhow!(
99  | |             "Just demonstrating the problem, counter is {}",
100 | |             counter
101 | |         ))
102 | |     }))
    | |_____^ move out of `counter` occurs here

It's a slightly confusing error message. In my opinion, it's confusing because of the formatting I've used. And I've used that formatting because (1) rustfmt encourages it, and (2) the Hyper docs encourage it. Let me reformat a bit, and then explain the issue:

let counter = Arc::new(AtomicUsize::new(0));
fakeserver::run(util::app_fn(move |_req| {
    async move {
        let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
            "Just demonstrating the problem, counter is {}",

The issue is that, in the argument to app_fn, we have two different control structures:

  • A move closure, which takes ownership of counter and produces a Future
  • An async move block, which takes ownership of counter

The issue is that there's only one counter value. It gets moved first into the closure. That means we can't use counter again outside the closure, which we don't try to do. All good. The second thing is that, when that closure is called, the counter value will be moved from the closure into the async move block. That's also fine, but it's only fine once. If you try to call the closure a second time, it would fail, because the counter has already been moved. Therefore, this closure is a FnOnce, not a Fn or FnMut.

And that's the problem here. As we saw above, we need at least a FnMut as our argument to the fake web server. This makes intuitive sense: we will call our application request handling function multiple times, not just once.

The fix for this is to clone the counter inside the closure body, but before moving it into the async move block. That's easy enough:

fakeserver::run(util::app_fn(move |_req| {
    let counter = counter.clone();
    async move {
        let counter = counter.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
            "Just demonstrating the problem, counter is {}",

This is a really subtle point, hopefully this demonstration will help make it clearer.

Connections and requests

There's a simplification in our fake web server above. A real HTTP workflow starts off with a new connection, and then handles a stream of requests off of that connection. In other words, instead of having just one service, we really need two services:

  1. A service like we have above, which accepts Requests and returns Responses
  2. A service that accepts connection information and returns one of the above services

Again, leaning on some terse Haskell syntax, we'd want:

type InnerService = Request -> IO Response
type OuterService = ConnectionInfo -> IO InnerService

Or, to borrow some beautiful Java terminology, we want to create a service factory which will take some connection information and return a request handling service. Or, to use Tower/Hyper terminology, we have a service, and a make service. Which, if you've ever been confused by the Hyper tutorials like I was, may finally explain why "Hello World" requires both a service_fn and make_service_fn call.

Anyway, it's too detailed to dive into all the changes necessary to the code above to replicate this concept, but I've provided a Gist showing an AppFactoryFn.

And with that... we've finally played around with fake stuff long enough that we can dive into real life Hyper code. Hurrah!

Next time

Up until this point, we've only played with Tower. The next post in this series is available, where we try to understand Hyper and experiment with Axum.

Read part 2 now

If you're looking for more Rust content from FP Complete, check out:

August 30, 2021 12:00 AM

August 29, 2021

Joachim Breitner

A Candid explainer: The rough idea

One of the technologies that was created by DFINITY as we built the “Internet Computer” is Candid. Candid is

  • a language to describe the interface of a service, with named methods, arguments and results,
  • a type system for these arguments and results that maps to many possible implementation languages, thus enabling interoperability between these languages,
  • a wire format (encoding) of values of these types,
  • a notion of “safe upgrade” that allows service developers to evolve their interface with the assurance that no clients break, even in the presence of higher-order composition,
  • a meta-theory of safe interface description languages and a formal proof that Candid has these properties, and
  • a bunch of tools and libraries.

In this in-depth blog post series I want to shed some light onto these aspects of Candid. The target audience is mainly anyone who wants to deeply understand Candid, e.g. researchers, implementors of Candid tools, anyone who wants to builds a better alternative, but no particular prior Candid knowledge is expected. Also, much of what is discussed here is independent of the Internet Computer. Some posts may be more theoretical or technical than others; if you are lost in one I hope you’ll rejoin for the subsequent post

Much in these posts is not relevant for developers who just want to use Candid in their projects, and also much that they want to know is missing. The Internet Computer dev docs will hopefully cater to that audience.

Blog post series overview

  1. The rough idea (this post)

    Announcing the blog post series, and very briefly outlining what an Interface Description Language even is.

  2. Safe higher-order upgrades

    Every can do first-order interface description languages. What makes Candid special is its support for higher-order service composition. This post also contains some general meta-theory and a Coq proof!

  3. Opt is special

    Extending records in argument position is surprisingly tricky. Lean why, and how we solved it.

  4. Language integration

    The point of Candid is inter-op, so let’s look the various patterns of connecting Candid to your favorite programming language.

  5. Quirks

    The maybe most interesting post in this series: Candid has a bunch of quirks (hashed field names, no tuples but sequnces, references that nobody used, and more). I’ll explain some of them, why they are there, and what else we could have done. Beware: This post will be opinionated.

The rough idea

The idea of an interface description language is easy to begin with. Something like

service A : {
  add : (int) -> ();
  get : () -> (int);

clearly communicates that a service called A provides two methods add and get, that add takes an integer as an argument, and that get returns such a number. Of course, this only captures the outer shape of the service, but not what it actually does – we might assume it implements a counter of sorts, but that is not part of the Candid interface description.

In order to now use the service, we also need a (low-level) transport mechanism. Candid itself does not specify that, but merely assumes that it exists, and is able to transport the name of the invoked method and a raw sequence of bytes to the service, and a response (again a raw sequence of bytes) back.

Already on the Internet Computer we have two distinct transport mechanisms used are external calls via an HTTP-based RPC interface and on-chain inter-canister calls. Both are handled by the service in the same way. Here we can see that Candid succeeds in abstracting over differences in the low-level transport mechanism. Because of this abstraction, it is possible to use Candid over other transportation mechanisms as well (conventional HTTPS, E-Mail, avian carrier). I think Candid has potential there as well, so even if you are not interested in the Internet Computer, this post may be interesting to you.

The translation of argument and result values (e.g. numbers of type int) to the raw sequence of bytes is specified by Candid, defining a wire format. For example, the type int denotes whole numbers of arbitrary size, and they are encoded using the LEB128 scheme. The Candid wire format also contains a magic number (DIDL, which is 0x4449444c in hex) and a type description (more on that later). So when passing the value 2342 to the add, the raw bytes transported will be 0x4449444c00017da612.

Of course we want to transfer more data than just integers, and thus Candid supports a fairly complete set of common basic types (nat, int, nat8, nat16, nat32, nat64, int8, int16, int32, int64, float32, float64, bool, text, blob) and composite types (vectors, optional values , records and variants). The design goal is to provide a canonical set of types – enough to express most data that you might want to pass, but no more than needed, so that different host languages can support these types easily.

The Candid type system is structural. This means that two types are the same when they are defined the same way. Imagine two different services defining the same

type User = record { name : text; user_id : nat }

then although User is defined in two places, it’s still the same type. In other words, these name type definitions are always just simple aliases, and what matters is their right-hand side.

Because Candid types can be recursive (e.g. type Peano = opt Peano), this means we have an equirecursive type system, which makes some things relatively hard.

So far, nothing too special about Candid compared to other interface definition languages (although a structural equirecursive type system is already something). In the next post we look at reference types, and how such higher order features make Candid an interesting technology.

by Joachim Breitner ( at August 29, 2021 03:51 PM

Donnacha Oisín Kidney

Weighted Search Package

Posted on August 29, 2021
Tags: Haskell

I have packaged up the more interesting bits from the Algebras for Weighted Search paper and put it up on hackage.

You can see it here.

It contains the HeapT monad, the Monus class, and an implementation of Dijkstra’s algorithm, the Viterbi algorithm, and probabilistic parsing.

Check it out!

by Donnacha Oisín Kidney at August 29, 2021 12:00 AM

August 28, 2021

Mark Jason Dominus

Mo in an alternate universe

In a transparent attempt to capitalize on the runaway success of The Wonderful Wizard of Oz, the publishers of L. Frank Baum's earlier book A New Wonderland re-released it under the title The Magical Monarch of Mo. What if this ploy had actually worked? Would the book have inspired a movie?

We're off to see the Monarch,
The Marvelous Monarch of Mo…

Naah, it kinda falls apart after that.

by Mark Dominus ( at August 28, 2021 07:44 PM

How to fix hiring?

On Twitter, Mike Coutermarsh suggested:

Job interview: “algorithms”

Reality: “Turn a 127 message deep slack thread between 5 engineers into a decision”

I suppose this was meant facetiously but I think it might contain the germ of a good idea.

Applicants are usually given timed a programming quiz. What if instead, the candidate was supplied with the 127-message Slack thread and given 24 hours to write up a proposal document? I honestly think this might produce good results.

Such a submission would be extremely probative of the candidate's talents and abilities, including:

  • reading and understanding technical arguments
  • balancing engineering tradeoffs
  • foreseeing potential issues
  • writing clear English
  • planning
  • seriousness
  • writing coherent, well-organized, and persuasive documents

It is much more difficult to cheat on this task than on a typical programming exercise. The candidate certainly can't submit a prewritten essay that they found somewhere; that would be easy to detect. A candidate who can take someone else's prewritten essay and quickly rewrite it to plausibly appear original is probably quite well-qualified on many of the important metrics! (Plus an additional important one: the ability to do research. They had to locate, recognize, and read the essay they rewrote.)

It shouldn't be hard to change up the essay topic periodically, since the engineers will be producing several of those 127-message Slack threads every month. This also tends to impede cheating.

When a good candidate comes for an in-person interview, you have a ready-made topic of conversation. Instead of coding at the whiteboard, you can ask them to discuss their proposal.

Complaints that this would discriminate against candidates with poor command of English do not hold water. Good command of English is one of the job requirements, and the whole point of a job interview is to discriminate against unqualified candidates. Besides, if the hiring process encourages candidates to improve their English writing abilities, rather than cramming a bunch of red-black-tree algorithms, language trivia, or irrelevant brainteasters, so much the better for everyone.

by Mark Dominus ( at August 28, 2021 04:57 PM

Why is the S combinator an S?

The most important combinator in combinatory logic is the combinator, defined simply:

$$ S x y z ⇒ (x z)(y z) $$

or in -calculus terms:

$$ S = \lambda x y z. (x z)(y z). $$

A wonderful theorem states that any -expression with no free variables can be converted into a combinator expression that contains only the combinators and , where is really the only interesting one of the three, being merely the identity function, and a constructor of constant functions:

$$ \begin{align} I x & = x \\ K x y & = x \\ \end{align} $$

In fact one can get along without since .

A not-too-infrequently-asked question is why the three combinators are named as they are. The is an identity function and pretty obvious stands for “identity”.

Similarly the constructs constant functions: is the combinator which ignores its argument and yields . So it's not hard to imagine that is short for Konstant, which is German for “constant”; no mystery there.

But why ? People typically guess that it stands for “substitution”, the idea being that if you have some application $$A\,B$$ then allows one to substitute some term for a free variable in both and prior to the application:

$$ S\, A\, B\, T = A[v/T]\, B[v/T]. $$

Although this seems plausible, it's not correct.

Combinatory logic was introduced in a 1924 paper of Moses Schönfinkel. In it, he defines a family of combinators including the standard , , and ; he shows that only and are required. His initial set of combinators comprises the following:

$$ \begin{array}{cllrl} I & \textit{Identitätsfunktion} & \text{“identity function”}& I\,x =& x \\ C & \textit{Konstanzfunktion} & \text{“constancy function”} & C\,x\,y =& x \\ T & \textit{Vertauschungsfunktion} & \text{“swap function”} & T\,x\,y\,z=& x\,z\,y \\ Z & \textit{Zusammensetzungsfunktion} & \text{“composition function”} & Z\,x\,y\,z=& x\,(y\,z) \\ S & \textit{Verschmelzungsfunktion} & \text{“fusion function”} & S\,x\,y\,z=& x\,z\,(y\,z) \end{array} $$

(Schönfinkel also had combinators representing logical operations (one corresponding to the Sheffer stroke, which had been discovered in 1913), and to quantification, but those don't concern us right now.)

and are now usually called and . These names probably originated in Curry's Grundlagen der kombinatorischen Logik (1930). Curry 1930 is probably also the origin of the change from to . I have no idea why Schönfinkel chose to abbreviate Konstanzfunktion as instead of . Curry notes that for Schönfinkel has , but does not explain his changes. In Curry and Feys’ influential 1958 book on combinatory logic, the and combinators given names that are are literal translations of Schönfinkel's: “elementary permutator” and “elementary compositor”.

Returning to the combinator, one sees that its German name in Schönfinkel's paper, Verschmelzungsfunktion, begins with the letter V, but so does Vertauschungsfunktion, so abbreviating either with V would have been ambiguous. Schönfinkel instead chose to abbreviate Verschmelzungsfunktion with S for its root schmelzen, “fusion”, and Vertauschungsfunktion with T for its root tauschen, “swap”. The word schmelzen is akin to English words “melt” and “smelt”.

The “swap” is straightforward: the combinator swaps the order of the arguments to in : $$T\,x\,y\,z = x\,z\,y$$ but does not otherwise alter the structure of the expression.

But why is the “melting” or “fusion” combinator? It's because Schönfinkel was interested in reducing abitrary mathematical expressions to combinators. He will sometimes have an application and he wants to ‘fuse’ the two occurrences of . He can do this by rewriting the expression as . Schönfinkel says:

Der praktische Nutzen der Function besteht ersichtlich darin, daß sie es ermöglicht, mehrmals auftresnde Veränderliche — und bis zu einem gewissen Grade auch individuelle Functionen — nur einmal auftreten zu lassen.

Clearly, the practical use of the function will be to enable us to reduce the number of occurrences of a variable — and to some extent also of a particular function — from several to a single one.

(Translation from van Heijenoort, p. 362.)

So there you have it: the combinator is so-named not for substitution, but because S is the first letter of schmelzen, ‘to fuse’.


  • Schönfinkel, M. “Über die Bausteine der mathematischen Logik” (“On the building-blocks of mathematical logic”), Mathematische Annalen (1969), p. 305–316; Berlin, Göttingen, Heidelberg.

    English translation in Van Heijenoort, Jean (ed.) From Frege to Gödel: a Source Book in Mathematical Logic, 1879–1931 (1967) pp. 355–366 Harvard University Press; Cambridge and London.

  • Curry, H.B. “Grundlagen der kombinatorischen Logik” (“Fundamentals of combinatory logic”), _American Journal of Mathematics Vol. 52, No. 3 (Jul., 1930), pp. 509-536.

  • Curry, H.B. and Robert Feys Combinatory Logic (1958) p. 152 North-Holland Publishing Company, Amsterdam.

by Mark Dominus ( at August 28, 2021 04:02 PM

August 27, 2021

Gabriel Gonzalez

Naming function arguments in Dhall


This post showcases some neat Dhall language features for improving the readability of types that I think other languages should steal.

To motivate this post, consider the following Haskell type for Data.Text.replace:

replace :: Text -> Text -> Text -> Text

This function replaces all occurrences of a substring with another substring, but you wouldn’t be able to easily guess which argument is which from the type alone.

Fortunately, the function does have Haddock-level documentation in the form of comments for each function argument:

:: Text
-- ^ @needle@ to search for. If this string is empty, an
-- error will occur.
-> Text
-- ^ @replacement@ to replace @needle@ with.
-> Text
-- ^ @haystack@ in which to search.
-> Text

… but what if you could provide a hint to what each argument does within the type itself?

Naming function arguments using / forall

Well, in Dhall you can, and here is the equivalent Dhall type:

$ dhall repl
:type Text/replace

∀(needle : Text) → ∀(replacement : Text) → ∀(haystack : Text) → Text

… or if you prefer ASCII syntax then that is the same as this type:

forall (needle : Text) ->
forall (replacement : Text) ->
forall (haystack : Text) ->

Here we’ve tagged each function argument with the argument’s name. In fact, you don’t need to do this explicitly. Dhall will automatically infer named function argument types when you create user-defined functions:

:type λ(name : Text) → "Hello, ${name}"

∀(name : Text) → Text

In many cases these names are “cosmetic”, meaning that they have no effect on type-checking. For example, as far as the type-checker is concerned the following function type:

∀(name : Text) → Text

… is the exact same as the following simpler function type (they are α-equivalent):


However, sometimes these names are not cosmetic! The simplest example is a polymorphic function (a.k.a. a “generic” function), like the following polymorphic identity function:

let identity
: ∀(a : Type) → a → a
= λ(a : Type) → λ(x : a) → x

in identity

Here we’ve used the to name the first function argument a, which lets us reference that name downstream within the same type.

let identity
-- We're naming our first function argument `a` …
-- ↓
: ∀(a : Type) → a → a
-- ↑ ↑
-- … so that we can reference the value of the first function
-- argument downstream within the same type
= λ(a : Type) → λ(x : a) → x

in identity

However, unlike other languages, Dhall also lets us name the second function argument using the exact same syntax, even though our second function argument is not a Type:

let identity
: ∀(a : Type) → ∀(x : a) → a
= λ(a : Type) → λ(x : a) → x

In fact, this is actually the type that the interpreter would have inferred if we had omitted the type annotation. The name of this second function argument is cosmetic, though.

Another example where the argument name is not cosmetic is the following hypothetical Dhall type for a safe division function that rejects 0 denominators at type-checking time:

: ∀(denominator : Natural) →
∀(proof : Natural/isZero denominatorFalse) →
∀(numerator : Natural) →

This division function takes three arguments instead of the usual two:

  • The first argument is the denominator

    … as you may have guessed from the helpful type-level name

  • The second function argument requires a proof that denominator is non-zero

    Side note: This proof obligation is satisfied by assert : False ≡ False so long as denominator is not 0

  • The third argument is the numerator

By naming the first argument denominator, we can refer to the value of the first argument within the type of the second argument:

-- We're referring to the value of the first argument …
-- ↓
: ∀(denominator : Natural) →
-- … within the type of the second argument
-- ↓
∀(proof : Natural/isZero denominatorFalse) →
-- ↑
-- These names are still cosmetic, though
-- ↓
∀(numerator : Natural) →

You might wonder why Dhall uses the / forall keyword for this purpose, since typically most languages only use forall to create polymorphic (a.k.a. “generic”) functions. This is because Dhall is implemented as a pure type system, meaning that Dhall uniformly handles term-level function arguments and type-level function arguments in the exact same way. They’re both special cases of Π types.

The following paper does a really good job explaining pure type systems and Π types:

… and that’s actually where I stole this trick from.

Really the correct symbol to use is Π and not , but there isn’t a great way to input Π as ASCII. The Henk paper suggests using |~| as the ASCII counterpart to Π, but that didn’t seem right to me. I went with / forall because there’s already prior art for using those for type arguments.

Anonymous record types

There is another approach to naming function arguments that doesn’t require a pure type system at all: anonymous record types.

For example, we could have changed the Text/replace function to have this type:

{ needle : Text, replacement : Text, haystack : Text } → Text

That’s pretty clear, too, and can be done in any language that supports anonymous record types, including Dhall, F#, PureScript, Elm, and OCaml.

Naming type parameters

You can also use the above two tricks to name type arguments for type constructors, too.

For example, we can define a Map type constructor to be a list of key-value pairs:

let Map =
λ(key : Type) →
λ(value : Type) →
List { mapKey : key, mapValue : value }

… which we would use like this:

Map Text Natural

List { mapKey : Text, mapValue : Natural }

… and the inferred type names the arguments to our Map type constructor:

:type Map

∀(key : Type) → ∀(value : Type) → Type

We can also use anonymous records to name type arguments, too! For example, we could have instead defined Map as:

let Map =
λ(args : { key : Type, value : Type }) →
List { mapKey : args.key, mapValue : args.value }

… which we would use like this:

Map { key = Text, value = Natural }

List { mapKey : Text, mapValue : Natural }

… and this latter Map would have an inferred type of:

:type Map

∀(args : { key : Type, value : Type }) → Type

Pretty neat! Not many languages can pass type constructor arguments as named fields of a record.


Hopefully this gives language designers some ideas for how they can add language support for naming function arguments.

Unfortunately, my favorite language (Haskell) does not exactly support these features, so sometimes people work around this by using newtypes to name function arguments. I’ve never been of fan of this approach, especially if the newtype is not opaque and this post does a good job of explaining why:

However, you can simulate this trick in Haskell using something like what the vulkan package does, which is to use DataKinds and TypeOperators to create a type-level operator that lets you associate arbitrary name data with types:

type (name :: k) ::: a = a

… which the package uses like this:

:: forall io . (MonadIO io)
=> -- | @commandBuffer@ is the command buffer into which the command is
-- recorded.
-> -- | @vertexCount@ is the number of vertices to draw.
("vertexCount" ::: Word32)
-> -- | @instanceCount@ is the number of instances to draw.
("instanceCount" ::: Word32)
-> -- | @firstVertex@ is the index of the first vertex to draw.
("firstVertex" ::: Word32)
-> -- | @firstInstance@ is the instance ID of the first instance to draw.
("firstInstance" ::: Word32)
-> io ()

These names are ignored by the type-checker, just like the equivalent cosmetic names in Dhall.

by Gabriella Gonzalez ( at August 27, 2021 03:22 PM

Chris Smith 2

Nascent GHC Proposal: Source Rewrite Rules and Optional Constraints

Lately, I’ve been thinking more and more about something that would be nice to have in GHC.

Motivation #1: HMock Predicates

Consider the Predicate type from HMock:

data Predicate a = Predicate
{ showPredicate :: String,
showNegation :: String,
accept :: a -> Bool,
explain :: a -> String

The purpose of this type is to be a function a -> Bool that can also explain itself. That is, you can get Strings to explain what it’s testing, and why a particular argument matches or not. There are three important things to know about Predicate:

  1. It’s important that Predicate works on arbitrary types, including those without Show instances. Indeed, using predicates is the only way to match function calls in HMock whose arguments don’t have Eq and Show instances.
  2. The exact strings produced by showPredicate, showNegation, and explain are entirely unspecified, by design. It’s nice for users if they are as descriptive as possible, but they only get used in error messages and diagnostics, so no kind of real correctness depends upon their exact value.
  3. Predicates are often polymorphic. For instance, there is eq :: (Show a, Eq a) => a -> Predicate a, which compares checks for equality with a specific value. There are even more complex predicates, too, like unorderedElemsAre :: MonoFoldable t => [Predicate (Element t)] -> Predicate t, which uses a Ford-Fulkerson max-flow algorithm to match elements and child predicates, then reports on the best available matching, and whether there were either elements or child predicates that couldn’t be matched.

Here’s the fundamental problem: what do I do with Show instances? Even in the description above, there are some unsatisfying choices:

  • The eq predicate requires a Show constraint. This is used to implement showPredicate, showNegation, and explain. However, if it weren’t available, I could still provide the core functionality of eq at the cost of somewhat poorer descriptive text. Leaving out the Show constraint would have doomed all users to receive poor descriptions. In the end, I left it in, making eq less useful. I could also define a separate nonShowableEq predicate, which is identical to eq in all ways except that it lacks Show. This is ugly, and doing this everywhere would double the exposed symbols in this module (from 50 to 100 exports!)
  • On the other hand, unorderedElemsAre doesn’t have an Eq constraint on Element t. If it did, I could provide better explanatory text by listing which specific element values in the collection were unexpected. Yet this seems too limiting on such a useful combinator, which is often useful for higher-order types. Therefore, I left the constraint out. This is a real loss of functionality. Again, I could define showableUnorderedElemsAre, but at the cost of exploding the API if I did this everywhere.

What I really want is for GHC to check the types looking for a Show constraint, then use one or the other implementation depending on whether a Show constraint exists. This might seem suspect. Indeed, it is! It’s only reasonable because the results of the two implementations are equivalent up to the unspecified explanatory text.

Motivation #2: Improving efficiency

This isn’t the only time something like this comes up. A similar situation often arises with polymorphic algorithms. With only an Eq constraint, the best you can do for Haskell’s nub function (which removes duplicates from a list) is O(n²). Add an Ord constraint, and you can do the same thing in O(n log n). It requires a bit of care to do it in a way that preserves the original order, but it’s still possible. Add a Hashable constraint, and you can do it in O(n) expected time.

In practice, Haskell offers nub, and common libraries offer orbNub to do the same thing more efficiently. That’s again an extra API for the same function, just with better performance at the cost of more assumptions about its input. It’s ugly that this is necessary.

Existing (and partial) solutions

It’s not surprising that people have thought about this in the past. There are several bits of prior work worth mentioning.

  • In C++, there’s an approach known as SFINAE: “Specialization Failure Is Not An Error”. What this means is that you can just write two implementations of the same template, and if the more specific one doesn’t compile, you don’t get an error! Instead, the compiler ignores it and substitutes the more general implementation. Since C++ monomorphizes all templates at compile time, it all comes down to trying, and dropping the specialization if possible. This means C++ programs can make these trade-offs without exposing different names and APIs.
  • Mike Izbicki has written a Haskell library called ifcxt for Haskell, which partially solves the problem. The good news is that you can use the library to define optional constraints, and different implementations based on whether the constraint is satisfied or not. The bad news is that it requires Template Haskell to write a bunch of boilerplate. Unfortunately, new boilerplate is needed for every new type, so this Template Haskell cannot be internal to the implementation and needs to be invoked from the client code.
  • GHC already has a mechanism called rewrite rules, which allow certain expressions to be matched and rewritten by the compiler into a more efficient form. However, this happens during the Core stage of GHC, which is after type class resolution has occurred. This means that rewrite rules are not able to add additional type classes.

It seems that what’s needed here is a GHC plugin or extension.

Proposed solution #1: Source rewrite rules

This seems very similar to a rewrite rule, but just needs to fire at a different stage of the compilation. Therefore, if I were proposing this, I would propose a new kind of rewrite rule. Something like this:

"nub/Ord" forall x. nub x = ordNub x

When encountering a rule like this during type checking, GHC would attempt to perform the given substitution. If the resulting program type checks, fine. Otherwise, it would fall back to not applying the rule rather than failing the compile. This is very like the SFINAE rule in C++.

Proposed solution #2: Optional constraints

There’s a slight problem with the solution above: it’s not very compositional when we have separate compilation. If I write this:

foo :: a -> a
foo = id
fooNum :: Num a => a -> a
fooNum a = a + 1
"fooNum" forall x. foo x = fooNum x
bar x = foo x

What is the type of bar? If bar has type a -> a, then the fooNum rule never fires. On the other hand, if bar has type Num a => a -> a, then it is insufficiently general.

I propose that it should infer a -> a, but with an explicit type signature, one could write bar :: ?Num a => a -> a. This would inform GHC that it should generate two versions of bar, one specialized implementation with a Num constraint, and one general implementation without. It should then automatically generate a new source rule forall x. bar x = barNum x, that tries to rewrite the more general implementation into the specialized one when possible. (If there are no source rules that depend on the constraint, GHC could realize that both implementations are the same, and decline to generate a specialization at all, possibly with a warning about a redundant optional constraint.)


What do you think? Is this a direction you’d like to see GHC move? Is ifCxt enough?

by Chris Smith at August 27, 2021 01:40 AM

Tweag I/O

Daily ICFP: Day 4

Daily ICFP

Several Tweagers all agreed to gather notes here about our ICFP (International Conference on Functional Programming) experiences.

Other posts in this series:

Day 4 - Thursday

These notes follow Noon and Richard through their day at ICFP, roughly in order. (The talks are not publicly posted at the time of publication of this post, but we expect they will be over the coming weeks.)

Noon — Second to last day of ICFP. I’m becoming sad that it’s going to be over! But that said, I’m really looking forward to today! I feel a bit relaxed as well, because from my timezone (UK) I get a bit of a break in the morning (as I’m starting my day at the Haskell track.)

Chesskell: A Two-Player Game at the Type Level, by Toby Bailey and Michael Gale


  • What if you were taught chess slowly by a compiler?
  • Chesskell uses a little bit (a lot) more memory than the Witcher 3.
  • Overall a very fun talk about pushing the limits of type-level programming in GHC.


This was such a fun talk. Chesskell is an implementation of chess such that any illegal move is a type error. You play by typing in a Haskell buffer to append moves to the game.

To be clear: this is not an implementation of a chess game. It’s really only a fun experiment to see how far Haskell’s type system can go. In this sense, this project is a little dangerous: I wouldn’t want, say, a Java programmer to come across Chesskell and think this is how one would implement chess in Haskell! With that out of the way, though, this is indeed really fun. It’s amazing this can be done at all.

The talk focused mostly on limitations in GHC around its evaluation of type families, which is unpredictably slow. We at GHC know this and are working on it (led by Adam Gundry of Well-Typed), but this talk really pointed out all our flaws. It also describes an EDSL (embedded domain-specific language) that struggles with the new simplified subsumption in GHC 9. (See the proposal for a description.) It’s the first example I’ve come across of a program that is anti-simplified-subsumption in a way that cannot be easily repaired.

Express: applications of dynamically typed Haskell expressions, by Rudy Matela


  • Talking about this library
  • Very cool library.
  • Really cool capability to generate conjectures (equations that are true from testing) from expressions; i.e. learn things such as xs ++ [] = xs, and other interesting ideas.
  • Uses this to generalise property-based testing counter-examples.
  • Also does function synthesis!
  • Overall I loved this talk and the idea and I think the library will only get much more awesome over time!


I really liked this talk, too. It starts with a simple idea — Haskell’s Dynamic — and extends it with the ability to track function application and store abstract variables (that is, an x that is not bound to a value, like we would see in an algebraic equation). The talk then explores all manner of applications of this simple idea. The whole approach (simple idea leading to multifarious applications) is so Haskelly.

Haskell⁻¹: Automatic Function Inversion in Haskell, by Finn Teegen, Kai-Oliver Prott and Niels Bunkenburg.


  • When inverting, how to deal with functions that are not injective? (i.e. no unique inverse)
  • Idea: Just allow multiple inverses!
  • With the plugin, any function in standard Haskell can be inverted.
  • Want a “functional logic” version of Haskell, similar to Prolog.
  • Went into a bit more detail about the function inversion process.
  • Overall pretty interesting, and gives me more motivation that one day I’d love to investigate how GHC plugins work and what is possible with them!


This nice talk shows another superpower of GHC: allowing plugins that can manipulate Haskell source — this one, computing inverses of functions.

Why functional programming with linear types matters, by Tweag CEO Mathieu Boespflug


  • My favourite quote: “The Spleen of Reality”


This talk explores why Tweag is interested in linear types, focusing on two key benefits: extra safety (the example given was that we don’t want our functional program to accidentally duplicate one pizza into two) and extra performance (linear types can be used to ensure resource disposal, and so provides a potential alternative to garbage collection and the costly latency GC can introduce).

The talk also includes an excellent introduction to linear types for anyone who does not know about them.

Design Patterns for Parser Combinators (Functional Pearl), by Jamie Willis and Nicolas Wu



Amazing talk. Do watch. And, when you do, keep in mind that everything you’re seeing is actually live.

This is one of a series of amazing talks by Nick Wu. I will endeavor to watch the talk of any paper of his into perpetuity.

Oh, and there was content: a very nice description of useful design patterns for parser combinators. Nothing earth-shattering here, but it’s really great to have all this material in one place. When I reach for parser combinators next (or am mentoring someone who is), I will point them to this paper.

NoonGraded Monads and Type-Level Programming for Dependence Analysis, by Finnbar Keating and Michael Gale

  • Imagine working with a robot.
  • Want types to reflect what we do: such as reading from the screen, writing to the screen, etc.
  • Graded monads can help us do this.

NoonSylvester: Unified, typed, notation for symbolic mathematics and proofs (short talk), by Allister Beharry

  • Popped into this talk because I was curious about the topic.
  • It’s based in F#; not a language I’ve used a lot recently.
  • Main idea is to have a language that allows for computer-algebra and general-purpose programming.
  • Integrates with other tools (say, Z3).
  • I really like the idea in general, and it was nice to watch this and step outside my usual comfort zone a bit.

Richard — In the afternoon PLTea, an interesting conversation led to a new insight: GHC has for some time thought about doing finer-grained analysis for recompilation avoidance. That is, if I change one function, then I have to recompile only modules that depend on that function. But I don’t think the current analysis is always quite that clever. And so, being cleverer would avoid spurious recompilation. The new insight is that this kind of cleverness would be very helpful to IDEs trying to provide live code feedback: in order for an IDE to know where (say) a type error is, it has to run the code through GHC. If we can compile just a tiny part of the file at a time (part of what would power the fine-grained recompilation-avoiding dependency analysis), then this live feedback would come much faster. So it’s nice to see multiple benefits from one feature! Now we just have to design and implement the feature…

August 27, 2021 12:00 AM

GHC Developer Blog

GHC 8.10.7 is now available

GHC 8.10.7 is now available

Zubin Duggal - 2021-08-27

The GHC team is very pleased to announce the availability of GHC 8.10.7. Source and binary distributions are available at the usual place.

This is a small bugfix release, fixing one linking portability issue (#19950) present in GHC 8.10.5 and GHC 8.10.6 on some x86_64 macOS toolchains, which resulted in undefined symbol errors for ___darwin_check_fd_set_overflow.

Issue #19950 is caused by a bug in newer Apple toolchains (specifically XCode 12) where programs compiled with affected versions of XCode are not backwards compatible with configurations running older version of XCode (certain versions of XCode 11).

We claimed to have fixed this in GHC 8.10.6, but alas this wasn’t the case. The fix was originally tested on the master branch, which uses a different build configuration from the 8.10 branch. We have now tested the fix on the GHC 8.10 branch and finally squashed the bug.

We would like to thank Microsoft Research, GitHub, IOHK, the Zw3rk stake pool, Tweag I/O, Serokell, Equinix, SimSpace, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. We would also like to thank the hundreds of open-source contributors whose work makes GHC possible.

A complete list of bug fixes and improvements can be found in the release notes.

As always, feel free to report any issues you encounter via

by ghc-devs at August 27, 2021 12:00 AM

August 26, 2021


Dependent Types are a Runtime Maybe

Awhile back I was discussing dependent types with someone and we ended up concluding that dependent types can always be replaced by a runtime Maybe.  This seemed to me then (and still does today) as a fairly surprising and provocative conclusion.  So I thought I'd put the idea out there and see what people think.

Let's look at a few examples that are commonly used to illustrate dependent types:

  • Vectors of length N
  • Matrices of size m x n
  • Sorted lists
  • Height-balanced trees (trees where the height of subtrees differ by at most one)
The argument is roughly as follows.  All of these examples ultimately boil down to enforcing some kind of constraint on some data types.  The more powerful your dependent type system, the more rich and expressive will be the constraints that you can enforce.  If we take this thought experiment to its logical conclusion, we end up with a dependent type system that allows us to enforce any constraint that can be computed.

The important realization here is that every one of the above dependent type examples are a constraint that can also be enforced by a smart constructor.  The smart constructor pattern is roughly this:

module Foo
  ( Foo
  , mkFoo
  -- Any other functionality that Foo supplies

data Foo = ...

mkFoo :: FooInputs -> Maybe Foo

You can express all of the above dependent type constraints using this pattern. VecN can simply hold a vector and the length N constraint can be enforced in mkVecN.  Similarly, SortedList can simple hold a list and the mkSortedList can sort its input and/or return Nothing if its input isn't sorted.  The smart constructor mkFoo can contain arbitrarily complex Turing-complete constraints and return a Just whenever they're satisfied or a Nothing if they're not.

The key difference between dependent types and a smart constructor is that with dependent types the constraint is enforced at compile time and with a smart constructor it is checked at runtime.  This suggests a rule of thumb for answering the question of whether you should use dependent types:

If the cost (TCO...i.e. the sum total of dev time, readability of the resulting code, and all the ongoing maintenance) of using a dependent type is less than the cost of handling the Nothing cases at runtime, then you should use a dependent type.  Otherwise, you should just use a smart constructor.

The interesting thing here is that Haskell gives us a fantastic set of tools for handling runtime Nothing values.  The Maybe type is has instances of Functor, Applicative, and Monad which allow us to avoid a lot of the code overhead of checking the failure cases and handling them appropriately.  It is often possible to front load the checking of the constraint with a case statement near the top level:

case mkFoo inputs of
  Nothing -> handleError
  Just a -> handleSuccess a

Then all the rest of your code will be working with Foo, which is structurally guaranteed to have the properties you want and allows the use of simplified code that doesn't bother checking the error conditions.

My takeaway from this argument is that you should only reach for dependent types when dealing with situations where you can't front-load the error handling and the cost of having Maybe a's floating around your code exceeds the cost of the dependent type machinery.

What do you think?  Am I missing something here?  I'd love to see if anyone has practical examples of dependent types that can't be boiled down to this kind of smart constructor and runtime Maybe or where the cost of doing so is exceptionally high.

by mightybyte ( at August 26, 2021 02:23 PM

Tweag I/O

Daily ICFP: Day 3

Daily ICFP

Several Tweagers all agreed to gather notes here about our ICFP (International Conference on Functional Programming) experiences.

Other posts in this series:

Day 3 - Wednesday

These notes follow Noon, Arnaud, and Richard through their day at ICFP, roughly in order.

NoonOn Continuation-Passing Transformations and Expected Cost Analysis

  • Discussion of the difficulty of costing probabilistic algorithms (i.e. you don’t know what it will do.)
  • Turns out using continuations makes it a bit easier to reason about (I didn’t quite follow the trick that makes it true),
  • But, it gives a way to think compute average-cost analysis, which seems pretty useful!

NoonAutomatic Amortized Resource Analysis with the Quantum Physicist’s Method

  • Quantum physics? I’m interested.
  • “This talk doesn’t involve any quantum physics at all”. Oh well.
  • Idea is to think in terms of an “energy bank” and how much your program will need over time.
  • Can use this to think about time, and space.
  • Analogy to Alice and Bob having some fixed money and some fixed task (buying candy).
  • Their idea is to take a (kind-of) super-position of all kinds of ways of splitting up resources (money).
  • This helps with some technical detail of the standard amortized resource approaches (I didn’t quite get why.)
  • Really enjoyable.
  • Definitely curious to take a look at this paper!

ArnaudGhostCell: Separating Permissions from Data in Rust

This talk addresses an important problem in the Rust programming language: how does one write a structure with pointer sharing in Rust (typically: a doubly-linked list). Until now, it was either using unsafe features (danger!) or using a Cell type (slow and icky). In this talk a new safe and efficient approach is proposed. Which would be remarkable already. But the authors also proved their approach safe in Coq, as part of the RustBelt project. Impressive.

NoonSkipping the Binder Bureaucracy with Mixed Embeddings in a Semantics Course

  • In learning about proofs and correctness, how to get to the good stuff faster?
  • Answer: By using features of the host/meta language.
  • I’m personally unconvinced (not that I’m an expert); it seems worse to me to have implicit implementation details that students don’t quite understand.
  • But I do understand the desire to speed up getting to the good parts!
  • And maybe this is a good technique when not teaching? Just for your own experimenting? Probably.

RichardSkipping the Binder Bureaucracy with Mixed Embeddings in a Semantics Course (Functional Pearl), by Adam Chlipala

This was a nice talk about a technique Adam uses in his course on formal verification of programming languages using the Coq proof assistant. His goal is to take undergraduates, in a single semester, from no experience with Coq to being able to do non-trivial proofs on programming languages with e.g. effects or concurrency. The talk describes the use of mixed embeddings in his formalization of effectful languages: use a deep embedding of the command language but a shallow embedding of expressions. That is, the command structure (e.g. write to memory, goto, etc.) is expressed using a datatype, but expressions (that evaluate to some value) are just written in Coq’s native syntax. This allows for e.g. proofs about commands without needing to worry at all about expressions (and variable binding!).

I had previously viewed shallow embeddings as a bit of a parlor trick: if you’re describing a programming language that has behavior very similar to the one you’re writing in, you can skip some of the description. But the whole trick depends on a close correspondence between the object language (that is, the one you’re implementing) and the host language (that is, the one you’re writing in). Yet this presentation made me realize that shallow embeddings are considerably cheaper than deep ones — in the cases when the trick works. Cheap and cheerful is sometimes indeed better than fully general — especially if it allows undergrads to access advanced material so quickly.

RichardCatala: A Programming Language for the Law, by Denis Merigoux, Nicolas Chataing, and Jonathan Protzenko

I was so excited when I saw this paper in the program, as I always wondered how we could take legal statutes and formalize them. This paper attempts this for the tax code, where the rules are very algorithmic. (Formalizing, e.g., slander laws seems harder.) The authors worked with lawyers to design a language, Catala, that is a fully expressive functional language while using a syntax easily readable by lawyers (i.e. domain experts). In the act of translating tax statutes into Catala, lawyers frequently discovered new ambiguities in the law. In the Q&A session, we even imagined writing QuickCheck properties over laws. (Example: you would probably want a monotonic tax system, where an increase in gross income never yields a decrease in take-home income. Indeed, Jonathan explained that non-monotonic tax law is actually unconstitutional in France — yet he believes the existing tax law is non-monotonic anyway.)

This talk filled me with optimism at the receptiveness of lawyers to this new technology. The research group is actively seeking new groups to collaborate with.

NoonCatala: A Programming Language for the Law

  • One of the few (only?) talk about a paper that I had actually read before the conference!
  • I’ve been fascinated with tax calculations since reading the various Piketty books.
  • Motivation: Converting law into code is hard and sometimes subjective.
  • IRS income tax program implemented in assembly from the 60s!
  • Idea: Mix law and code together (as both are necessary for determining correctness.)
  • Interesting fact: The language was at least partly co-designed by lawyers who helped pick the keywords,
  • Quite liked this idea and the talk!

Noon — PLTea

  • Some interesting conversations.

Noon — Programming Contest Report

  • An astonishing amount of work goes into this.
  • Richard: Yes, it does! I’m amazed every year by the generosity of the organizers of this contest in planning and executing it.

Richard — Program Chair Report

Many conferences include a report from the program chair, including information about the number of submissions, the selection process, and how many papers got accepted. Interestingly, the pandemic has had only a small effect on the number of submissions to ICFP, despite anecdotal evidence I’ve heard that submissions are down in other scientific fields. (Plausible theories: less time for focused thought during school shutdowns; less ability to focus due to anxiety in the news; less research time for faculty who teach due to the need to redesign courses to work well in a remote format.) Maybe this means that interest in ICFP is actually going up, cancelling out the negative effect of the pandemic. Huge thanks to Ron Garcia for chairing the program committee and to Sukyoung Ryu for chairing the conference!

Noon — Industrial Reception

  • Had a really enjoyable chat with the various visitors to the Tweag table. Thanks everyone for stopping by!

Richard — I second that comment about the Industrial Reception. We had a very nice crew of people interested both in Tweag and in research ideas. Among more Tweag-centric discussions, I had a small but fascinating conversation about which is more fundamental: math or computer science. I’ve felt for a few years that computer science is just applied mathematics, but the way in which dependent types can describe fundamental mathematical concepts makes me wonder sometimes whether I’ve gotten the relationship backwards. As a case in point, I wondered aloud with a visitor to the Tweag table (I did not ask permission for posting their name) about how to re-express the fundamental definitions about differential calculus on manifolds using dependent types. This sounds very, very fun, but I worry it will be too distracting from “real” work (like continuing to improve GHC)! We’ll see whether I can fit this in.

RichardHow to design co-programs, by Jeremy Gibbons

This presentation gives a wonderfully elegant, simple approach to taking the first steps toward writing functions. There are no advanced concepts (don’t trip over the word “corecursion”!) and the video shows how typed functional programming is such an aide to reasoning about how to write programs.

RichardLeibniz equality is isomorphic to Martin-Löf identity, parametrically, by Andreas Abel, Jesper Cockx, Dominique Devriese, Amin Timany, and Philip Wadler (presented by Phil)

This paper presents a proof that Leibniz equality and Martin-Löf equality are isomorphic. The talk gives very nice introductions to both of these concepts, so I am going to skip that here. As the talk concedes, the isomorphism between these two notions of equality is not necessarily all that useful, but it’s nice to know how the proof can be done, and it’s interesting that the proof requires an assumption of parametricity. The paper (and presentation!) are both literate Agda, meaning that rendering the LaTeX or slides also type-checks the technical content, giving greater assurance of the result.

This talk is a nice peek into the power of dependent types to write formal proofs!

Richard — If other attendees at ICFP are reading this, please spend more time in the lounge! A few times I cruised through the lounge looking for new folks to meet and left disappointed. Somehow, last year’s ICFP did not suffer from this problem: I had many great out-of-band conversations. Maybe the novelty of online conferences has worn off? I’ve still had a number of fine conversations this year, but it definitely feels a step down from last year.

August 26, 2021 12:00 AM

August 24, 2021

Matt Parsons

Designing New

I want a better way of constructing Haskell records.

Let’s compare and contrast the existing ways. We’ll be using this datatype as an example:

data Env = Env
    { accountId :: String
    , accountPassword :: String
    , requestHook :: Request -> IO Request
    , responseHook :: Response -> IO Response

This type is an Env that you might see in a ReaderT Env IO integration with some external service. We can attach request hooks and response hooks.

Function Arguments

The simplest and most boring way is to pass function arguments.

env :: Env
env = Env "asdfadf" "hunter42" pure pure

This is undesirable for a few reasons:

  1. We have no idea what those parameters mean without looking at the datatype definition.
  2. We have to pass arguments in a specific order.
  3. If the type of the Env changes, then this also changes.
  4. … but we don’t get a break if the field order is changed in a way that respects the types!

Consider swapping the order of accountId and accountPassword in our data definition. Now everything breaks mysteriously with no type errors.

Using the function-style for constructing records is probably a bad idea.

Record Labels

The second most boring way is to use record construction syntax:

env :: Env
env = Env
    { accountId = "asdfasdf"
    , accountPassword = "hunter42"
    , requestHook = pure
    , responseHook = pure

This solves basically all the problems with function arguments. However, we’re still sensitive to changes in the record constructor. If we add a new field, we must account for that in all creation sites. This is annoying, especially since many new fields in records like this are designed to accommodate new functionality or customization, and most existing users want to just ignore them.

A Default Record

Instead of constructing a record, we’ll have end users modify an existing record.

defaultEnv :: Env
defaultEnv = Env
    { accountId = ""
    , accountPassword = ""
    , requestHook = pure
    , responseHook = pure

env :: Env
env = defaultEnv
    { accountId = "asdfasdf"
    , accountPassword = "hunter42"

However, this is gross, for a few reasons. The first is that we provide a dummy value of accountId and accountPassword, and the end user is required to fill them in. There’s actually no way for us to give a warning or error if they fail to provide it.

The standard solution is to accept function arguments, but this has a nasty problem: record syntax binds tighter than anything else, even function application, so we need to do this:

defaultEnv :: String -> String -> Env
defaultEnv a p = Env a p pure pure -- brevity, forgive me

env :: Env
env = (defaultEnv "asdfasdf" "hunter42")
    { requestHook = \req -> do
        logRequest req
        pure req

That’s right - we gotta put parens around our constructor. We can’t use $ here, either, because the syntax explicitly requires a value { field0 = val0, ... fieldN = valN } form.

Also now we’re back at the same problem with defaultEnv - we can mismatch our function arguments.

An Args Record

The pattern I chose for SqlBackend in persistent is to have an *Args record.

{-# language DuplicateRecordFields #-}
{-# language RecordWildCards #-}

data EnvArgs = EnvArgs
    { accountId :: String
    , accountPassword :: String

mkEnv :: EnvArgs -> Env
mkEnv EnvArgs {..} = Env
    { requestHook = pure
    , responseHook = pure
    , ..

env :: Env
env = mkEnv EnvArgs 
    { accountId = "asdfasdf"
    , accountPassword = "hunter42"

This solves all of the above problems, but it’s a bit unsatisfying - we can’t also modify the requestHook and responseHook parameters directly in mkEnv, we have to do it outside.

fullEnv :: Env
fullEnv = 
    (mkEnv EnvArgs {..}) 
        { requestHook = \req -> do
            log req
            pure req

Hmm, slightly annoying syntax, again. But, hey, whatever, it works.

Codependent Records

No, I’m not talking about some fancy type theory. Record syntax is essentially codependent on the value it is modifying, or the constructor it is using. We can’t pass in a ‘record’ of stuff and use it in ways that are clever or useful.

Let’s talk about the “whitespace operator.” We can imagine defining it like this, for regular functions:

( ) :: (a -> b) -> a -> b
f a = f a

OK, it’s special built in syntax, the definition doesn’t make any sense. But let’s try and write it for records now. Remember we need to support update and creation.

( ) :: (AllowableRecord con rec result)
    => con -> rec -> result
con rec = implementRecord con rec

class AllowableRecord con rec result where
    implementRecord :: con -> rec -> result

Now rec is something that can stand alone - it is freed from the codependent relationship with the values and constructors it serves. What is that something, though?

It could be a row type, like PureScript. That’d be awesome.

Well now I’ve just worked myself up into a Mood about GHC’s record syntax. Even with OverloadedRecordDot, Haskell’s records are still bad, they’re just not awful.

Ignore Records, Use Functions

This approach eschews records entirely for updates and uses set* functions. It makes for a pretty clean interface.

env :: Env
env = 
    addRequestHook (\req -> log req >> pure req)
    $ mkEnv EnvArgs
        { accountId = "asdfasdf"
        , accountPassword = "hunter42"

addRequestHook :: (Request -> IO Request) -> Env -> Env
addRequestHook newHook env = env
    { requestHook = \req -> do
        requestHook env req
        newHook req

This is pretty tedious as a library author to write, but it gives you a better interface.

It would be nice if we could use this for construction, too. But this is a challenge because the type would change with each new addition to the record. The { ... } record syntax can know ahead of time how many fields there are, and GHC can issue warnings (or errors) if any are missing.

Type Changing Updates

We can use a type parameter for each field that is required to be set.

data EnvP a b = EnvP
    { accountId :: a
    , accountPassword :: b
    , requestHook :: Request -> IO Request
    , responseHook :: Response -> IO Response

type Env = EnvP String String

data Void

defaultEnv :: EnvP Void Void
defaultEnv = EnvP
    { requestHook = pure
    , responseHook = pure

GHC will issue warnings here, but that’s okay - we know they’re undefined at the type level.

Now we can write our set functions:

setAccountId :: String -> EnvP a b -> EnvP String b
setAccountId str env = env { accountId = str }

setAccountPassword :: String -> EnvP a b -> EnvP a String
setAccountPassword str env = env { accountPassword = str }

env :: Env
env = 
    setAccountId "asdfasdf" 
    $ setAccountPassword "hunter42"
    $ defaultEnv

And, well, this actually works out. If we only expose the Env type (and maybe a pattern synonym for construction/deconstruction), this interface should be pretty safe and straightforward. A final mkEnv call could even put it behind a newtype wrapper, or a similar datatype, similar to the *Args pattern above. The boilerplate sucks, but would be easy to TemplateHaskell away.

Can OverloadedRecordDot help us here? With some of the tricks in Stealing impl From Rust, sort of. We can write simple setters:

data User = User { name :: String }

instance HasField "setName" User (String -> User) where
    getField self newName = 
        self { name = newName }

And, using the One Weird Trick to defeat functional dependencies, we can write type-changing setters, too!

    HasField "setAccountId" (EnvP a b) (x -> EnvP x b)
    HasField "setAccountId" (EnvP a b) (x -> EnvP x b)
    getField self x = self { accountId = x }

Now, to provide a good UX, we’d want to require this be String, possibly with a nice TypeError constraint that complains. But this’ll work for now - we can totally write this:

env :: EnvP String Void
env = defaultEnv.setAccountId "asdfasdf"

Unfortunately, chaining this isn’t really feasible.

env :: EnvP String String
env = defaultEnv.setAccountId "asdfasdf".setAccountPassword "hunter42" 

This fails with an error, as .setAccountPassword is attaching to "asdfasdf", not the return of defaultEnv.setAccountId "asdfasdf". So we can work around this with parens:

env :: EnvP String String
env = 
    (defaultEnv.setAccountId "asdfasdf").setAccountPassword "hunter42" 

This gets annoying, especially as the chaining goes up. Assigning to intermediate values also works:

env :: EnvP String String
env = 
        withId = defaultEnv.setAccountId "asdfasdf"
        withPassword = withId.setAccountPassword "hunter42" 

But, at this point, I’m wondering how this is any better than just writing

env :: EnvP String String
env = setAccountId "asdfadsf" $ setAccountPassword "hunter42" defaultEnv

Unfortunately, the type errors can get a bit weird and annoying carrying around the EnvP value. Wrapping it in a newtype or translating to a separate data structure can make errors better. It also distinguishes the “create this record” and “use this record” scenarios.

Back to Args

And, yeah, ultimately, I think Args is probably the right way to go.

There’s not really much to a library for it. You’d define a class like this:

class New a where
    type Args a = r | r -> a
    new :: Args a -> a

You want the TypeFamilyDependencies annotation on Args because you want the argument type to inform the result type. A data family would also work, but it would not allow you to define it separately and document it with a separate type name. Maybe a problem, maybe not. It may also be nice to vary the return type, allowing IO, for example. That looks like this:

class New a where
    type Args a = r | r -> a
    type Return a = r | r -> a
    type Return a = a

    new :: Args a -> Return a

But now we’ve just, got, like, this type class, where it takes a thing, and returns another thing (maybe in IO, maybe not?? who knows). And this is so general and lawless that making a library for it seems a bit silly.

So, instead of writing a library, I wrote a blog post.

August 24, 2021 12:00 AM

August 23, 2021

Monday Morning Haskell

New Quicksort Video!


This week we've got a new video out! It goes in depth into the Quicksort algorithm. We compare implementations in Haskell and Python, and also consider what it really means for us to have an "In Place" algorithm that mutates our data.

I've written about this topic before, so check out these articles if you prefer written content!

But this new video includes some neat visuals showing Quicksort in action, so check it out!

All the code in the video is visible on Github as well!

If you like this content, make sure to subscribe to our mailing list!

by James Bowen at August 23, 2021 02:30 PM


The new GHC diagnostic infrastructure

TL;DR Starting from version 9.4, GHC will have a completely revamped API to deal with diagnostics (i.e. warnings or errors), moving away from loosely-structured strings in favour of richer Haskell types. This will make it easier to develop IDEs and other tools that work with GHC’s diagnostics.

Well-Typed was able to carry out this work thanks to Richard Eisenberg, as part of NSF grant number 1704041.


The topic of “Can we please have better IDEs?” crops up now and again within the Haskell community. Over the years tools like Haskell Language Server dramatically improved the situation, but all these tools had to deal with one constant factor: the limitations of the GHC API, in particular how it emitted diagnostics.

Previously, diagnostics were emitted as structured documents (SDocs), which can be seen as strings with a richer API to control their layout when being pretty-printed. Once GHC emitted a diagnostic as an SDoc, the best tools could do was to parse the text hoping to regain some structure out of it. Consider the following error:

AnyM.hs:12:3: error:
    Illegal bang-pattern (use BangPatterns):
    ! res
12 |   !res <- anyM (pure . (==) 5) [1..]
   |   ^^^^

Here most tools have to work unnecessarily hard, in order to:1

  1. Find out the diagnostic’s precise location, by parsing the filename:line:column on the first line.
  2. Understand whether this is an error or a warning (and perhaps the specific warning flag used) by parsing the rest of the first line.
  3. Determine the meaning of the diagnostic (e.g. that this is a parse error due to use of a bang-pattern when the BangPatterns extension is not enabled).
  4. If the diagnostic includes extra context like involved types or variables, further parse and analyse the text to extract such information for later use.
  5. Parse any hints present in the diagnostic (in this case use BangPatterns), turning them into automated refactorings that the user may wish to apply.

This is surely not very ergonomic for GHC API users, and with this in mind, Alp Mestanogullari wrote GHC proposal #306 to improve the situation, and this was subsequently implemented by Alfredo Di Napoli working with Richard Eisenberg.

In this blog post, we will explore what lies in store in future releases of GHC when it comes to diagnostic messages. Part 1 explains the new diagnostic API design at a high level, and Part 2 summarizes further possibilities that this work enables, including a low-barrier way to get started contributing to GHC. If you’d prefer a larger example, the Appendix demonstrates a tiny tool that uses the GHC API to parse a module and give customised errors and hints for some categories of diagnostics.

Part 1: Representing diagnostics in the GHC API

A diagnostic is a fact that GHC emits about the compiled program. These diagnostics always arise for a particular reason, such as a warning flag being enabled or a type error GHC can’t recover from. However, there is a fluid relationship between warnings and errors in GHC: for example, -Werror turns warnings into errors. Thus we refer to “diagnostics” or “messages” to encompass both.

Our work focused on creating a richer hierarchy of diagnostic types that can be returned by the GHC API functions instead of an opaque SDoc. The key idea is to have datatypes describing the meaning of errors, rather than their presentation. As an example, the TcRnMessage type describes diagnostics that may be emitted by the type-checker:

data TcRnMessage where
  TcRnUnknownMessage      :: (Diagnostic a, Typeable a) => a -> TcRnMessage
  TcRnUnusedPatternBinds  :: HsBind GhcRn -> TcRnMessage
  TcRnDodgyImports        :: RdrName -> TcRnMessage

The GhcMessage type unifies diagnostics that may be generated across different phases of compilation:

data GhcMessage where
  -- | A message from the parsing phase.
  GhcPsMessage      :: PsMessage -> GhcMessage
  -- | A message from typecheck/renaming phase.
  GhcTcRnMessage    :: TcRnMessage -> GhcMessage
  -- | A message from the desugaring (HsToCore) phase.
  GhcDsMessage      :: DsMessage -> GhcMessage
  -- | A message from the driver.
  GhcDriverMessage  :: DriverMessage -> GhcMessage
  -- | An \"escape\" hatch which can be used when we don't know the source of
  -- the message or if the message is not one of the typed ones.
  GhcUnknownMessage :: (Diagnostic a, Typeable a) => a -> GhcMessage

Each message is then wrapped into a MsgEnvelope (as we will see later) to store metadata such as the source position. Given a TcRnMessage, tools can simply pattern match to interpret the diagnostic and extract the information they need. However, they will also need to render the message for the user or retrieve any hints, and this is where the new Diagnostic typeclass comes into play.

The Diagnostic typeclass

The Diagnostic typeclass is defined as such:

class Diagnostic a where
  diagnosticMessage :: a -> DecoratedSDoc
  diagnosticReason  :: a -> DiagnosticReason
  diagnosticHints   :: a -> [GhcHint]

This specifies the common interface supported by diagnostic types such as TcRnMessage. It includes:

  • The diagnosticMessage: how this diagnostic can be presented to the user as text (structured for pretty-printing).

  • The diagnosticReason: why this diagnostic arose, for example due to an error or a warning controlled by a specific flag. There is a subtle but important nuance behind this field, discussed below.

  • The diagnosticHints: a list of hints that tools can use to present users with refactoring suggestions. For example, this may include a value like SuggestExtension BangPatterns if enabling the extension may fix the error.

Representing “unknown” diagnostics

The astute reader probably noticed the following data constructor for a TcRnMessage (and similarly for GhcMessage):

  TcRnUnknownMessage :: (Diagnostic a, Typeable a) => a -> TcRnMessage

This constructor serves two purposes. First of all, it allows us to gradually port the existing GHC SDoc diagnostics to the new API without having to do it all in one go. GHC emits a lot of diagnostics, so only a subset have been ported and new errors and warnings are converted day after day. In the meantime, we can simply wrap all the existing SDoc into a generic DiagnosticMessage (which has a suitable Diagnostic instance) and pass it to the TcRnUnknownMessage.

Second, this constructor makes the diagnostic infrastructure extensible: tools building on the GHC API and performing their own checks, such as the LiquidHaskell GHC plugin, will be able to define their own diagnostic types.

Messages and envelopes

A diagnostic type such as TcRnMessage or GhcMessage captures the meaning of the warning or error, but not the context in which it arose. For that, the GHC API wraps it in a MsgEnvelope:

data MsgEnvelope e = MsgEnvelope
   { errMsgSpan        :: SrcSpan
   , errMsgContext     :: PrintUnqualified
   , errMsgDiagnostic  :: e
   , errMsgSeverity    :: Severity
   } deriving (Functor, Foldable, Traversable)

Hopefully this type is fairly self explanatory:

  • The errMsgSpan carries the range of source positions to which the message relates.
  • The errMsgContext is a minor detail, determining whether names are printed with or without module qualifiers.
  • The errMsgDiagnostic is the payload of the message, for example a TcRnMessage or GhcMessage.
  • The errMsgSeverity is the severity of the message, to which we turn now.

Reason and severity

It isn’t immediately obvious why a MsgEnvelope contains a Severity, while the Diagnostic class includes a function returning a DiagnosticReason, since these might seem overlapping. Let’s take a look at their definitions first:

data Severity
  = SevIgnore
  | SevWarning
  | SevError

data DiagnosticReason
  = WarningWithoutFlag
  | WarningWithFlag !WarningFlag
  | ErrorWithoutFlag
  deriving (Eq, Show)

While it looks like they might be unified, they actually serve two different purposes: the Severity is an enumeration that indicates how GHC will report this message to the user (or not at all, in case of a SevIgnore). The DiagnosticReason instead gives the reason why the diagnostic was generated by GHC in the first place.

This arises from the fluid relationship between errors and warnings in GHC. For example, a diagnostic might be created due to the -Wunused-imports warning flag, but with -Werror enabled, so it should be treated as an error, not a warning. Thus the Severity will be SevError whereas the DiagnosticReason will be WarningWithFlag Opt_WarnUnusedImports. Keeping separate the “nature of the message” vs. “how the message should be treated”, we are able to capture both concepts without information loss.

The DiagnosticReason is determined by the diagnosticReason class method as a fixed function of the diagnostic type, and never changes. In contrast, the Severity is computed dynamically depending on the flags enabled at the point in the GHC session where the message is emitted, and hence must be stored as part of the MsgEnvelope.

Part 2: Applications and further work

Refactoring GHC this way has been a long and sometimes tricky process, but we hope it will bring many benefits to the ecosystem. In this section we will explore next steps and possible projects that could put this work to use.

Completing the refactoring

We have completed the foundations, but there is still lots to be done, as described in #19905. The good news is that the majority of diagnostics still not ported are not too hard to convert, and these kind of tasks are well suited as a first ticket for somebody who is looking for an opportunity to contribute to GHC.

We have written a collection of self-contained and self-explanatory tickets, which are labelled error messages+newcomer+task in the GHC issue tracker. A typical ticket (such as #20119) gives a high level overview of what needs to be done together with a possible plan of action, mentioning a couple of key modules to get people started.

We have already been delighted to see some new GHC contributors getting started through this work!

Haskell Language Server integration

A key motivation for this refactoring work was to enable HLS to consume GHC’s diagnostics more conveniently, rather than needing to parse them with regular expressions. This is being discussed on HLS issue #2014. It will take a bit of time before HLS can start incorporating the new infrastructure and reap its benefits, because HLS will need to be adapted to deal with the substantial changes in the GHC API in versions 9.2 and 9.4.

GHC unique diagnostic codes

There was a recent surge of interest in GHC proposal #325, which suggests diagnostics should be given unique reference IDs a la Rust, for example:

> class a

<interactive>:5:7: error:
    [GHCERR_b356c55] Malformed head of type or class declaration: a

The idea is that the unique ID (here GHCERR_b356c55) is easier to search for, or look up in a reference document, than a longer message that may change between compiler releases.

Doing this using the old SDoc-based infrastructure in GHC would have been daunting and potentially very error prone: using the new diagnostic infrastructure this seems fairly easy now. As a small proof of concept, we could simply add an ADT enumerating all the diagnostics:

data DiagnosticId
  = GHCERR_01
  | GHCERR_02
  | GHCWARN_01
  | ...

Then we could extend the Diagnostic typeclass to require that each diagnostic must return an identifier:

class Diagnostic a where
  diagnosticId      :: a -> DiagnosticId
  diagnosticMessage :: a -> DecoratedSDoc
  diagnosticReason  :: a -> DiagnosticReason
  diagnosticHints   :: a -> [GhcHint]

Last but not least, when displaying messages we could now pretty-print the relevant diagnostic and its ID (together with a URL pointing to the relevant section in the GHC manual, for example). The extra typeclass method will ensure that diagnosticId is automatically accessible as part of the normal GHC API.

Diagnostic message plugins

GHC already has an extensive plugin mechanism that allows developers to modify certain stages of the compilation pipeline, for example to add optimization passes or adjust the type-checker’s behaviour. A “diagnostic message plugin” would allow users to intercept a diagnostic message before it gets printed, so that they can manipulate it, for example to add domain-specific error information. This could just be a hook in the form of an effectful function GhcMessage -> m GhcMessage that would be called by GHC before we pretty-print the message, where the monad m would allow side-effects such as IO, looking up data in the GHC API session and so on.

For example, a plugin to search for unknown function identifiers on Hoogle might look something like this2:

data HoogleSeacher = HoogleSeacher {
    originalMessage  :: TcRnMessage
  , foundIdentifiers :: [JSON.Value]

instance Diagnostic HoogleSeacher
  diagnosticReason = diagnosticReason . originalMessage
  diagnosticMessage (HoogleSeacher msg foundIdentifiers) = diagnosticReason msg `unionDecoratedSDoc` ...
     -- Print the original message, together with any identifiers fetched from Hoogle

hooglePlugin :: DiagnosticPlugin
hooglePlugin = defaultDiagnosticPlugin { onGhcMessage = searchHoogle }
    searchHoogle :: GhcMonad m => GhcMessage -> m GhcMessage
    searchHoogle = \case
      GhcTcRnMessage (msg@TcRnOutOfScope{outOfScopeName})
        -> do let query = "" <> outOfScopeName
              results <- -- issue a HTTP query and decode the resulting JSON
              pure $ GhcUnknownMessage (HoogleSeacher msg results)
      x -> pure x

JSON output from GHC

It would be nice to give GHC the ability to emit structured diagnostics in JSON, when a particular flag is set. This would mean that tools not able to use the GHC API directly could simply call GHC with this flag and parse the output JSON into something structured they can manipulate.

GHC already supports a -ddump-json flag, but its semantics is largely unspecified and it does not currently leverage the new representation of diagnostic messages. There has been some discussion in ticket #19278 on what a potential JSON interface could look like. The final design hasn’t been decided yet, so if you have any valuable input or feedback on what you would like to see, that ticket is the one to monitor closely.


I (Alfredo) would like to personally thank Richard Eisenberg for the valuable contributions during this work and for all the “rebuttals” which ultimately led to the final design. The original “diagnostic message plugin” idea was suggested to me by my colleague Andres Löh.

While doing this refactoring work, we kept in mind real world use cases, trying to come up with an API that would maximise reuse in IDEs and other third party tools. Having said that, it’s always hard to guess what would be most useful to others, and this is why we would love to hear from you if you have a tool you think could benefit from the new GHC API. It would be great to receive feedback on whether or not this work is actually making your life easier. Please get in touch via the GHC issue tracker (in particular #19905) or the ghc-devs mailing list.

Well-Typed are always open to working on projects that benefit GHC and the surrounding Haskell ecosystem. Please email if you’d like to discuss how we can help with your open-source or commercial project.

Appendix: Example tool to customize errors

Here’s an example of a tiny tool that uses the GHC API to parse a module and give customised errors and hints for some categories of diagnostics. It runs an interactive GHC session via the runGhc function, tries to parse the input module and returns either a collection of diagnostics including some errors, or a successfully parsed module. If there are parse errors, it pretty-prints them using custom functions.

Ignoring the technical details, the key functions are prettyPrintError and reworkBangPatterns, where we were able to work with a typed representation of diagnostics and their hints. This was not possible with previous versions of GHC: in their case, errs would be just a collection of SDocs, and the best thing we could have done would have been to parse the SDocs to recover any extra information.

The full code is available here. It requires a recent build of GHC HEAD or GHC 9.4 (when available).3

playground :: FilePath
           -- ^ The module we would like to compile, with extension (e.g. "AnyM.hs")
           -> IO ()
playground filename = do
  res <- runGhc (Just myGhcLibDir) $ do
    df <- getSessionDynFlags

    setSessionDynFlags $ df { ghcLink        = LinkInMemory
                            , ghcMode        = CompManager
                            , extensionFlags = EnumSet.empty

    hsc_env <- getSession
    mb_emod <- first (fmap GhcDriverMessage) <$> liftIO (summariseFile hsc_env [] fn Nothing Nothing)
    case mb_emod of
      Left errs  -> pure $ Left errs
      Right emod -> handleSourceError (pure . Left . srcErrorMessages)
                                      (Right <$> parseModule (emsModSummary emod))

  case res of
    Left errs -> do
      putStrLn "Errors:"
      putStrLn $ showPprUnsafe . ppr $ formatBulleted defaultSDocContext $
        (mkDecorated $ map (prettyPrintError . errMsgDiagnostic) (bagToList . getMessages $ errs))
    Right ps  -> do
      -- Do something with the parsed module.
      putStrLn $ showPprUnsafe (pm_parsed_source ps)


    prettyPrintError :: GhcMessage -> SDoc
    prettyPrintError msg =
      let body  = case msg of
            GhcPsMessage (PsErrNumUnderscores _)
              -> vcat [ text "You are trying to use the underscore (_) to separate the digits"
                      , text "but this syntax is not standard Haskell2010 syntax."
            _ -> vcat . unDecorated $ diagnosticMessage msg
          hints = map reworkBangPatterns (diagnosticHints msg)
      in vcat [
           , hang (text "Hints:") 2 (vcat hints)

    reworkBangPatterns :: GhcHint -> SDoc
    reworkBangPatterns h = ppr $ case h of
      SuggestSingleExtension _ LangExt.BangPatterns
        -> text "Uh-oh, you need to enable BangPatterns! :)"
      x -> ppr x


For example, here’s a trivial Haskell program containing some parse errors:

module Main where

import Data.Foldable

anyM :: (Monad m, Foldable t) => (a -> m Bool) -> t a -> m Bool
anyM f = foldrM (\v acc -> do { v <- f v; if v then pure True else pure acc}) False

main :: IO ()
main = do
  !res <- anyM (pure . (==) 5) [10_000 ..]
  print res

Running the program above on this program (saved as AnyM.hs) produces something like this:

/Users/adinapoli/programming/haskell/playground/AnyM.hs:10:3: error:
    Illegal bang-pattern
    Suggested fix: Perhaps you intended to use BangPatterns
10 |   !res <- anyM (pure . (==) 5) [10_000 ..]
   |   ^^^^

/Users/adinapoli/programming/haskell/playground/AnyM.hs:10:33: error:
    Illegal underscores in integer literals
    Suggested fix: Perhaps you intended to use NumericUnderscores
10 |   !res <- anyM (pure . (==) 5) [10_000 ..]
   |                                 ^^^^^^
* You are trying to use the underscore (_) to separate the digits
  but this syntax is not standard Haskell2010 syntax.
  Hints: Perhaps you intended to use NumericUnderscores
* Illegal bang-pattern
  Hints: Uh-oh, you need to enable BangPatterns! :)

The first two messages are part of the standard output that GHC would normally emit, whereas the last part is our little tool in action.

  1. If we were dealing directly with the output of a GHC API call rather than a diagnostic as printed on stdout by GHC we would have avoided steps 1 and 2 but the rest would still have been necessary.↩︎

  2. This is obviously a fictional example, just to demonstrate a semi-interesting usage of the plugin. Furthermore, at the time of writing, the TcRnOutOfScope constructor has not yet been ported.↩︎

  3. The GHC version will need to include commit 06d1ca856d3374bf8dac952740cfe4cef76a350d. Of course it is possible that subsequent GHC API changes will require changes to the code.↩︎

by alfredo at August 23, 2021 12:00 AM

August 20, 2021


Avoiding quadratic core code size with large records

Edsko will be talking about the problems discussed in this blog post in his Haskell Implementors’ Workshop talk this Sunday, Aug 22. The talk will be broadcast live on YouTube.

Consider a module that contains nothing but the definition of a single large record and some type class instances:

{-# OPTIONS_GHC -fplugin=RecordDotPreprocessor #-}

module Example where

import Data.Aeson
import Generics.SOP.JSON
import Generics.SOP.TH
import GHC.TypeLits

newtype T (i :: Nat) = MkT Word
  deriving (Show, Eq, ToJSON)

data R = MkR {
      f00 :: T 00
    , f01 :: T 01
      -- .. lots more ..
    , f98 :: T 98
    , f99 :: T 99
  deriving (Eq, Show)

deriveGeneric ''R

instance ToJSON R where
  toJSON = gtoJSON defaultJsonOptions

As it stands—using ghc’s standard representation for records, along with the code generated by the RecordDotPreprocessor plugin and the Generic instance generated by generics-sop—this results in a core representation of a whopping 450,000 terms/types/coercions, takes 3 seconds to compile, and requires 500M of RAM to compile.1

If we change this module to

module Example where

import Data.Aeson (ToJSON(..))
import Data.Record.Generic.JSON
import Data.Record.TH

import Test.Record.Size.Infra

largeRecord defaultLazyOptions [d|
    data R = MkR {
          f00 :: T 00
        , f01 :: T 01
          -- .. lots more ..
        , f98 :: T 98
        , f99 :: T 99
      deriving (Eq, Show)

instance ToJSON R where
  toJSON = gtoJSON

we get a module with essentially the same functionality, but with a core size of a mere 14,000 terms/types/coercions, which compiles within 1 second and requires roughly 100M of RAM.

In this blog post we describe why this simple module generates so much code, and how the large-records library manages to reduce this by more than an order of magnitude.

We wrote this library because Juspay recently engaged Well-Typed’s services, and one of their requests to us was to try and improve compilation time and compilation memory requirements for their code base. Juspay very generously allowed us to make large-records open source, and it is now available on Hackage.

Quadratic code size at every level

The reason the core representation of our example module is so large is that unfortunately there are many examples of ghc and other libraries being accidentally quadratic. Before we look at some concrete examples, let’s first investigate where this quadratic code size is coming from. As we will see, it arises at every level: terms, types, type classes, and type level programming.

Warmup: terms

For our running example, we will want to have a record with lots of fields. To avoid some “accidental optimizations”, we’ll give each of those fields a different type. To make that a little bit easier, we’ll just introduce a single type that is indexed by a natural number, so that this one type definition gives us as many different types as we need:

data T (n :: Nat) = MkT Int

That out of the way, consider a record with lots of fields, such as

data R = MkR {
    f00 :: T 00
  , f01 :: T 01
  , f02 :: T 02
  -- .. lots more ..
  , f98 :: T 98
  , f99 :: T 99

When we define a record, ghc will generate field accessors for all fields in the record. In other words, it will derive functions such as

f00 :: R -> T 0

These functions are not difficult to generate, of course. Each function is just a simple case statement:

f00 = \(r :: R) ->
    case r of
      MkR x00 x01 x02 x03 x04 x05 x06 x07 x08 x09
          x10 x11 x12 x13 x14 x15 x16 x17 x18 x19
          -- .. lots more ..
          x90 x91 x92 x93 x94 x95 x96 x97 x98 x99 ->

Although simple, this case statement mentions a lot of unused variables (99 of them, in fact). Moreover, each of those variables is annotated with their type. This means that this one function is actually rather large; ghc reports that it contains 5 terms and 202 types. The size of this function is clearly linear in the number n of fields we have; moreover, ghc will generate one function for each of those n fields; that means that simply declaring the record will already generate code that is O(n²) in size.

More subtle: types

Suppose we define an applicative “zip” function for R, something like

zipMyRecordWith ::
     Applicative f
  => (forall n. T n -> T n -> f (T n))
  -> R -> R -> f R
zipMyRecordWith f r r' =
        pure MkR
    <*> f (f00 r) (f00 r')
    <*> f (f01 r) (f01 r')
    <*> f (f02 r) (f02 r')
    -- .. lots more ..
    <*> f (f98 r) (f98 r')
    <*> f (f99 r) (f99 r')

Clearly the size of this function is at least linear in the number of record fields, that much is expected. However, -ddump-simpl tells us that this function contains 50,818 types! Where are all of those coming from?

Recall the type of (<*>):

(<*>) :: Applicative f => f (a -> b) -> f a -> f b

Those type variables need to be instantiated; f is always instantiated to the f type parameter passed to zipMyRecordWith, a is the type of the next field we’re applying, but what about b? Let’s annotate zipMyRecordWith with the types of a and b in pseudo-Haskell:

zipMyRecordWith ::
     Applicative f
  => (forall n. T n -> T n -> f (T n))
  -> R -> R -> f R
zipMyRecordWith f r r' =
        pure MkR
    <*> @(T 00) @(T 01 -> T 02 -> T 03 -> .. -> T 99 -> R) f (f00 r) (f00 r')
    <*> @(T 01) @(        T 02 -> T 03 -> .. -> T 99 -> R) f (f01 r) (f01 r')
    <*> @(T 02) @(                T 03 -> .. -> T 99 -> R) f (f02 r) (f02 r')
    -- .. lots more ..
    <*> @(T 98) @(                              T 99 -> R) f (f98 r) (f98 r')
    <*> @(T 99) @(                                      R) f (f99 r) (f99 r')

The first instantiation of (<*>) mentions the type of every single field; the second mentions the types of all-but-one field, the next of all-but-two, etc. This means that the size of this single function is once again O(n²) in the number of record fields.

Type class dictionaries

Suppose we wanted to capture the concept “some constraints c applied to the types of all fields in our record”:

class (
    c (T 00)
  , c (T 01)
  , c (T 02)
  -- .. lots more ..
  , c (T 98)
  , c (T 99)
  ) => Constraints_R c

This should be fine right? Right? Wrong.

When we declare a type class, we’re effectively constructing a record type with fields for each of the methods of the class; this record is known as the “dictionary” for the class. Superclass constraints translate to “subdictionaries”: when a class such as Ord has a superclass constraint on Eq, then the dictionary for Ord will have a field for an Eq dictionary.

This means that this definition of Constraints_R is actually of a very similar nature to the definition of R itself: it defines a record with 100 fields. And just like for the record, ghc will generate “field accessors” to extract the fields of this dictionary; put another way, those field accessors “prove” that if we know Constraints_R c, we also know c (T 00), c (T 01), etc. What do those field accessors look like? You guessed it, a big pattern match; in pseudo-Haskell:

$p1Constraints_R :: Constraints_R c => c (T 0)
$p1Constraints_R = \dict ->
    case dict of
      Constraints_R d00 d01 d02 d03 d04 d05 d06 d07 d08 d09
                    d10 d11 d12 d13 d14 d15 d16 d17 d18 d19
                    -- .. lots more ..
                    d90 d91 d92 d93 d94 d95 d96 d97 d98 d99 ->

Since ghc generates a projection like this for each superclass constraint, this once again results in code of size that is O(n²) in the number of record fields.

Type level induction

So far all our examples have been simple Haskell; for our next example, we’ll get a bit more advanced. Just like we can have list values, we can also have list types; for example, here is a type-level list of the indices of the T types used inside our running example record R:

type IndicesR = '[
    00, 01, 02, 03, 04, 05, 06, 07, 08, 09
  , 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
  -- .. lots more ..
  , 90, 91, 92, 93, 94, 95, 96, 97, 98, 99

There are many use cases for type level lists. For example, we can define a type NP such that NP f [x, y, .., z] is basically the same as (f x, f y, .., f z):

data NP (f :: k -> Type) (xs :: [k]) where
  Nil  :: NP f '[]
  (:*) :: f x -> NP f xs -> NP f (x ': xs)

If we have a T for every index in IndicesR, we can construct a value of our record:

npToR :: NP T IndicesR -> R
npToR (  f00 :* f01 :* f02 :* f03 :* f04 :* f05 :* f06 :* f07 :* f08 :* f09