Planet Haskell

December 01, 2022

Monday Morning Haskell

Day 1 - Intro Problem

As a reminder, these writeups won't be super detailed, since I have to do one every day. I'll try to focus on the key ideas though, and I'll always link to my code!

Solution code on GitHub

All 2022 Problems

Subscribe to Monday Morning Haskell!

Problem Overview

This year we're dealing with elves. Each elf is carrying some snack items with a certain number of calories. Our input has one calorie count per line, and an empty line denotes that we have reached the end of one elf's snack collection and started another.

1000
2000
3000

4000

5000
6000

7000
8000
9000

10000
For the first part, we just want to find the elf with the most calories. This is the 4th elf, with a total of 24000 calories (7000+8000+9000).

For the second part, we want the sum of calories from the three elves with the most. So we take the 24000 from the elf with the most, and add the 3rd elf (11000 calories) and the 5th elf (10000 calories). This gives a total of 45000.

Full Description

Solution Approach and Insights

Nothing complicated here. Once we parse into list-of-lists-of-ints, we just use map sum and either take the maximum or the sum of the top 3.

Relevant Utilities

Function parseFile

Parsing the Input

Here's our parsing code. One nuance: I needed to add an extra empty line to the given inputs in order to make this parse work. Dealing with empty-line separators is a little tricky with megaparsec (or at least I haven't mastered the right pattern yet), because the "chunk separator" is the same as the "line separator" within each chunk (the eol parser).

parseInput :: (MonadLogger m) => ParsecT Void Text m [[Int]]
parseInput =
  sepEndBy1 parseIntLines eol
  where
    parseIntLines = some parseIntLine
    parseIntLine = do
      i <- parsePositiveNumber
      eol
      return i
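As an aside, for this particular input shape a plain split on blank lines sidesteps the separator problem (and the extra-empty-line nuance) entirely. Here's a minimal sketch of ours, independent of the megaparsec utilities above and using only the Prelude:

```haskell
-- Split the raw input into chunks on blank lines, reading one Int per line.
parseSimple :: String -> [[Int]]
parseSimple = filter (not . null) . foldr step [[]] . lines
  where
    step "" acc         = [] : acc                -- blank line: start a new chunk
    step l (cur : rest) = (read l : cur) : rest   -- number: add to current chunk
    step l []           = [[read l]]              -- unreachable fallback
```

For example, parseSimple "1000\n2000\n\n3000" gives [[1000,2000],[3000]], with no trailing newline required.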

Getting the Solution

As above, nothing complicated here. Use map sum and take the maximum.

processInputEasy :: (MonadLogger m) => [[Int]] -> m Int
processInputEasy intLists = return $ maximum (map sum intLists)

With the hard part, we sort, reverse, take 3, and then take another sum.

processInputHard :: (MonadLogger m) => [[Int]] -> m Int
processInputHard intLists = return $ sum $ take 3 $ reverse $ sort (map sum intLists)
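As a side note, the Down wrapper from Data.Ord gives the descending sort directly, avoiding the reverse. A sketch equivalent to the body of processInputHard (minus the MonadLogger wrapper):

```haskell
import Data.List (sortOn)
import Data.Ord (Down (..))

-- Sum each elf's snacks, sort the totals descending, keep the top three.
topThreeSum :: [[Int]] -> Int
topThreeSum = sum . take 3 . sortOn Down . map sum
```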

Answering the Question

And no additional processing is needed - we have our answer! (My standard template has the answer always wrapped in Maybe to account for failure cases).

solveEasy :: FilePath -> IO (Maybe Int)
solveEasy fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputEasy input

solveHard :: FilePath -> IO (Maybe Int)
solveHard fp = runStdoutLoggingT $ do
  input <- parseFile parseInput fp
  Just <$> processInputHard input


Link to YouTube

by James Bowen at December 01, 2022 04:00 PM

Tweag I/O

Higher-orderness is first-order interaction

There is an inherent beauty to be found in simple, pervasive ideas that shift our perspective on familiar objects. Such ideas can help tame the complexity of abstruse abstractions by offering a more intuitive angle from which to understand them.

The aim of this post is to present an alternative angle — that of interactive semantics — from which to view one of the fundamental notions of functional programming: higher-order functions.

Interactive semantics provide an intuitive understanding of the concept of higher-order functions, which is a worthy mathematical investigation in itself. But this approach is also practical, shedding new light on existing programming techniques and programming language features. We will review the example of higher-order contracts in this post. We will also present direct applications of interactive semantics in the design and implementation of programming languages.

Denotational semantics

Take the following programs, written respectively in Java, Rust, and Haskell:

public int incrByTwo(int x) {
  return x + 2;
}

fn incr_by_two(x: i32) -> i32 {
    let constant = 1;
    let offset = 1;
    constant + offset + x
}

incrByTwo :: Int -> Int
incrByTwo x = 1 + x + 1

While they look different on the surface, our intuition tells us that they are also somehow all the same. But what does “being the same” even mean for functions defined in such different languages?

One point of view is that syntax is merely a way to represent a more fundamental object, and that each of the above examples in fact represents the exact same object. From a purely mathematical point of view, these programs all fundamentally represent the function which adds 2 to its argument:

f : ℤ → ℤ = x ↦ x + 2.

The process of stripping away the purely syntactic details of a program to discover the mathematical objects at its core is the main concern of the field of denotational semantics. We refer to the mathematical object a program represents as its denotation. The idea being that by ridding ourselves of the unimportant details of a particular syntax we can focus better on the essence of the program.

The motivation for doing this partly stems from fundamental philosophical questions, such as: “what really is a program?” Attempting to answer such questions unveils deep connections between computer science and the rest of mathematics. However, stripping a program down to its substance can also provide us with techniques to answer much more concrete questions. For example: proving that two given programs always behave in the same way.

While incrByTwo operates on integers, even the most bare-bones functional language features much more complex objects: functions.


A higher-order function is a function which manipulates other functions. The various instances of incrByTwo only represent a first-order function, since their sole argument is a number. On the other hand, the usual map operation on lists is higher-order, as it takes as an argument a function describing how to transform each element of the list. This can be seen clearly in the Haskell syntax for the type of map, in particular the presence of the function type (a -> b) as the first argument:

map :: (a -> b) -> [a] -> [b]

Integers are easy to grok. They are static pieces of data that one can inspect and pass around. Functions are a different matter: they are opaque entities that can only be interacted with by handing them data or other functions.

This distinction is not only theoretical but also practical: while choosing a concrete representation of integers on a CPU is often relatively simple, selecting a representation for functions and closures, together with a calling convention, is not.

Traditionally, mathematicians have simply encoded functions as data. In set theory, the formal lingua franca of modern mathematics, a function is a (potentially infinite) set of tuples pairing each input with the corresponding output. Our incrByTwo denotation n ↦ n + 2 is represented as the infinite set:

{…, (−1, 1), (0, 2), (1, 3), …} = { (n, n + 2) | n ∈ ℤ }

While the set representation of functions is useful for mathematics, a static, infinite dictionary with virtually instant lookup turns out not to be a great model for computation, for a number of philosophical and technical reasons1. At its core, the notion of functions as sets ignores a fundamental concept of computation: time, and its direct manifestation, interaction.
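To make the set-of-pairs picture concrete, here is a small Haskell sketch of our own (not from the original post) that takes the encoding literally: the denotation of incrByTwo as an infinite table, where "application" is lookup rather than computation:

```haskell
-- The graph of incrByTwo: an enumeration of {(n, n+2) | n ∈ ℤ}.
graph :: [(Integer, Integer)]
graph = [ (n, n + 2) | n <- interleave [0 ..] [-1, -2 ..] ]
  where
    interleave (x : xs) ys = x : interleave ys xs
    interleave []       ys = ys

-- "Applying" the function means scanning the table for the input;
-- no step of this corresponds to actually adding 2.
applyGraph :: Integer -> Integer
applyGraph n = head [ y | (x, y) <- graph, x == n ]
```

The lookup only terminates because we enumerate ℤ in an order that eventually reaches every integer, which already hints at what the static set forgets: the time it takes to produce an answer.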

Interactive semantics

Game Semantics (GS hereafter) is a line of thought which has its roots in dialectical interpretations of logic. In GS, we not only consider the inputs and outputs of a higher-order function, but also all the interactions with other functions given as arguments. That is, we consider the traces of the function (the Player in GS), viewed as a dialogue with an Opponent, representing the environment in which the function executes (the calling context).

Take a simple higher-order function:

negate :: (Bool -> Bool) -> Bool -> Bool
negate f x = if f x then False else True

The evaluation of the call negate (\y -> y) False now corresponds to a play in a game defined by the type of negate. Let’s first attach a unique label to each occurrence of Bool:

(Bool -> Bool) -> Bool -> Bool
( B1 ->   B2 ) ->  B3  ->  B4

The play goes like this 2:

  • Opponent (caller): hey, could you give me your return value (B4)?
  • Player (negate): sure, but first give me the return value of f (evaluating f x, B2)
  • O: ok, but first give the value of its parameter y (B1)
  • P: ok, then I need the value of my parameter x (B3)
  • O: x is false (B3)
  • P: then y is false (B1)
  • O: then f returns false (B2)
  • P: then I finally return true (B4)

The full denotation of negate is then a strategy for this game.
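You can watch a version of this dialogue unfold in a lazy language. Here is a sketch of ours: a Haskell rendition of negate instrumented with Debug.Trace (labels follow the B1–B4 scheme above), so each demand announces itself as GHC's non-strict evaluation proceeds:

```haskell
import Debug.Trace (trace)

-- negate, instrumented so that each Bool announces when it is demanded.
negate' :: (Bool -> Bool) -> Bool -> Bool
negate' f x =
  let x' = trace "P: I need my parameter x (B3)" x
      fx = trace "P: I need the result of f (B2)" (f x')
   in if fx then False else True
```

Evaluating negate' (\y -> trace "O: f demands its parameter y (B1)" y) False prints the B2, B1 and B3 demands in that order before returning True, mirroring the middle of the play above.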


The game partitions the occurrences of Bool into outputs/positive, where the Opponent asks first and the Player answers, and inputs/negative, where roles are switched. This distinction is called polarity.

Consider the anonymous function \y -> y from the previous call to negate (let’s call it f).

The play for f False, from the point of view of f, looks like:

  • O: asks for return value
  • P (f): asks for parameter y
  • O: answers y is false
  • P: returns false

If you come back to the first play of negate and hide the moves external to the call f x, the above play exactly matches a subset of the original one, just with the polarities flipped!

  • Opponent (caller): hey, could you give me your return value (B4)?
  • Player (negate): sure, but first give me the return value of f (evaluating f x, B2)
  • O: sure, but first give the value of its parameter y (B1)
  • P: ok, then I need the value of my parameter x (B3)
  • O: x is false (B3)
  • P: then y is false (B1)
  • O: then f returns false (B2)
  • P: then I finally return true (B4)

The Player/Opponent distinction is perfectly symmetric. Indeed, from the point of view of f in the subcall f x, the caller is negate, which thus becomes the opponent.

Determining the polarity is easy: look at the type of the function and compute the path from the root to a type occurrence in terms of going to the left or to the right of an arrow. The occurrence is positive if the number of lefts is even (including 0), and negative otherwise. For negate, working on a type with labels and parentheses ((B1 ->a B2) ->b (B3 ->c B4)):

  • B1 is positive (left of ->b, left of ->a)
  • B2 is negative (left of ->b, right of ->a)
  • B3 is negative (right of ->b, left of ->c)
  • B4 is positive (right of ->b, right of ->c)
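This counting rule is easy to mechanise. A self-contained sketch (our own minimal type AST, not from the post) that recovers exactly the table above:

```haskell
-- Minimal arrow-type AST: base types and function types.
data Ty = Base String | Ty :-> Ty

infixr 5 :->

-- Pair each base-type occurrence with its polarity (True = positive):
-- stepping to the left of an arrow flips the polarity.
polarities :: Ty -> [(String, Bool)]
polarities = go True
  where
    go pos (Base b)  = [(b, pos)]
    go pos (a :-> r) = go (not pos) a ++ go pos r
```

For negate's type, polarities ((Base "B1" :-> Base "B2") :-> (Base "B3" :-> Base "B4")) marks B1 and B4 positive, B2 and B3 negative.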

The essence of GS is to model the execution of a higher-order function as an interaction over basic values. The beauty lies in the simplicity of the concept and the perfect symmetry between Player and Opponent. Polarity tells us if a value is an input, which must be provided by the environment, or an output, which must be provided by the function, either directly or indirectly through subcalls.

From a theoretical perspective, GS was the first technique to provide a class of denotational models that satisfy a strong form of correspondence between programs and their denotations (they gave the first fully abstract model for PCF). Game semantics seems to hit a sweet-spot by hiding unessential aspects of programs without forgetting the essential dynamic of interaction.

But the GS point of view is also practical. Let’s illustrate a few applications equipped with our new interactive lens.


Higher-order contracts

At Tweag, I am working on a configuration language called Nickel. Nickel features contracts, a form of higher-order dynamic type-checking. Contracts enable safe interaction between typed code and untyped code by preventing the untyped code from injecting ill-typed parameters.

Take a variant of our negate function in Nickel:

let negate : (Bool -> Bool) -> Bool -> Bool = fun f x => !(f x)

When calling e.g. negate (fun y => y) false from untyped code, the interpreter will check that the values bound to x, y, f x, and !(f x) are booleans.

Now, if we break the contract of f by calling negate (fun y => 2) false, the first line of the output will be:

error: contract broken by the caller

Conversely, if we define negate to break the contract of f3:

negate | (Bool -> Bool) -> Bool -> Bool = fun f x => !(f 5)

And make a legal call negate (fun y => y) false, the error becomes:

error: contract broken by a function

Higher-order contracts are precisely exploiting the same idea of breaking higher-orderness into first-order interactions! A contract for a higher-order function decomposes into primitive contracts (here Bool), incurring one check for each type occurrence. The blame distinction caller/function corresponds to the polarity of GS.

The trace of the second example looks like (labelling the type as (B1 -> B2) -> B3 -> B4):

  • Opponent (caller): let’s check that negate returns a Bool (evaluating negate (fun y => y) false, B4)
  • Player (negate): sure, but first let’s check that f returns a Bool (evaluating f x, B2)
  • O: ok, but then I need a Bool for y (B1)
  • P: ok, then I need a Bool for x (B3)
  • O: x is false (B3)
  • P: ok, false is a Bool. Then y is 5 (B1)
  • O: hey, 5 is not a Bool! I blame the player (B1)

For the typed version of negate, Player represents the internal, type-safe boundaries. Opponent is the external world, potentially untyped, and the contract is the border police meticulously controlling everything that crosses the boundary.

Negate contract boundaries
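For readers who think best in code, here is a heavily simplified Haskell sketch of blame-tracking contracts in the Findler–Felleisen style (the names and the Val encoding are ours, not Nickel's). The key point: the domain check swaps the blamed party, which is exactly the polarity flip of GS.

```haskell
-- Values of a tiny "untyped" language embedded in Haskell.
data Val = VBool Bool | VFun (Val -> Val)

data Party = Caller | Function deriving Show

-- A contract checks a value, blaming one party on failure.
type Contract = Party -> Val -> Val

-- Flat contract: a first-order check at the boundary.
boolC :: Contract
boolC _ v@(VBool _) = v
boolC p _           = error ("contract broken by " ++ show p)

-- Function contract: arguments come from the other side of the
-- boundary, so the domain check blames the opposite party.
arrC :: Contract -> Contract -> Contract
arrC dom cod p (VFun f) = VFun (cod p . f . dom (opposite p))
  where
    opposite Caller   = Function
    opposite Function = Caller
arrC _ _ p _ = error ("contract broken by " ++ show p)
```

With the full contract arrC (arrC boolC boolC) (arrC boolC boolC), a caller supplying an ill-behaved f gets blamed as Caller, while a function passing a non-boolean to its argument gets blamed as Function, matching the two Nickel error messages above.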

Circuits and distributed computing

If you look at the plays of the previous section, they have a strong message-passing feel: the function exchanges primitive data with the environment. This view has been exploited, for example, to design compilation techniques and a language runtime that make it trivial to break down terms and run them on distinct distributed nodes. In contrast, making a classic stack-based virtual machine distributed is not trivial.

The interactive interpretation has also been used to compile high-level functional programs down to integrated circuits, precisely by reducing the complexity of higher-orderness to the exchange of first-order messages4.


Interactive semantics like Game Semantics have advanced our understanding of the nature of programs and computations by incorporating an aspect forgotten by a naive set-based semantics: interaction. Such interactive semantics have proven theoretically fruitful and particularly flexible (they can model side effects, concurrency, and more). Game Semantics also has practical applications, serving as a guideline for compilation techniques and language design.

But in the end, I think that the richness of interactive semantics resides in its surprisingly simple and intuitive foundation (who has never played games!). My humble hope for this post is that in a not-so-distant future, you may stare at a language feature, an abstract concept or a programming technique and suddenly exclaim:

Of course! It’s just that higher-orderness is first-order interaction.

  1. Non-exhaustively:

  2. The dataflow may look funny if you’re used to languages using the common strict evaluation strategies. Here, we first enter the body of the function, and only ask for and evaluate arguments when their value is actually used: Haskell programmers may have recognized a non-strict evaluation strategy (here, call-by-name). GS can also model strict evaluation, but we stick to the traditional presentation of games for simplicity.

  3. negate isn’t well-typed anymore, so we use a contract annotation | which eschews typechecking but keeps the runtime checks

  4. Geometry of Synthesis I, II, III, and IV.

December 01, 2022 12:00 AM

November 30, 2022

Monday Morning Haskell

Advent of Code 2022!

Tomorrow is December 1st, which means that tonight (midnight Eastern American time, 9pm Pacific) is the start of Advent of Code! This is an informal, annual coding contest where you can solve one programming puzzle each day from December 1st up through Christmas on December 25th. The problems generally get more challenging as the month goes on, as you'll start needing to use more advanced data structures and solution techniques.

Last year I did Advent of Code for the first time, writing up all my solutions in Haskell (of course). This year I will be doing the contest again, and this time I plan to create more blog content as I go, rather than doing writeups way after the fact. Of course, I might not be able to do everything every day, but I'll try to keep up!

Here are all the ways I'll be trying to put my solutions out there for you to learn from (and critique!).


I will push all my code to my GitHub repository, on the aoc-2022 branch. So once my solutions are done you'll be able to see them for yourself!


I will also attempt to do daily write-ups on the blog, giving a rough descriptive outline of each solution. These won't be as detailed as the write-ups I did in the last month or so, but all the code will be there and I'll describe all the key insights and general solution approach.


I'll also be recording myself as I solve the problems so you can watch my solution process in real time. I'll post these videos to my YouTube channel. These videos will generally be unedited since I won't have time to go back through everything every day. I also won't be able to do these as much when it gets closer to Christmas as I'll be traveling and away from my recording setup. Some of these videos might have more commentary, some might have less. I haven't decided yet and it will vary from day to day.


I will not have a regular streaming schedule. As much as possible, I plan to attempt to solve problems as soon as they come out, and the contest rules request that people do not stream solutions until the leaderboard (for the fastest solutions) is filled for that particular problem. This is in order to prevent someone from copying the solution and getting on the leaderboard without effort. (For what it's worth, I doubt I'll actually be fast enough to get on the leaderboard).

If I get behind on my solutions, then it's very possible I'll do some streaming sessions while I catch up. You can follow me on Twitter or on my Twitch stream to know when I'm going live!


I'll try to solve the problem every day and keep up with content, but life gets busy, so I can't make any guarantees! But hopefully I'll have all the solutions published by the end of the year!

I encourage you to try out the Advent of Code problems for yourself! It's a great tool for learning a new programming language (especially Haskell!).

I'll also be doing a couple newsletter updates over the course of this month, so make sure to subscribe to our mailing list to get those and stay up to date!

by James Bowen at November 30, 2022 03:30 PM

November 29, 2022

Theory Lunch (Institute of Cybernetics, Tallinn)

A Remarkable Property of Real-Valued Functions on Intervals of the Real Line

Today, 17 October 2019, I discussed a very remarkable fixed point theorem discovered by the Ukrainian mathematician Oleksandr Mykolayovych Sharkovsky.

We recall that a periodic point of period n\geq1 for a function f:X\to{X} is a point x_n such that f^n(x_n)=x_n. With this definition, a periodic point of period n is also periodic of period m for every m which is a multiple of n. If f^n(x_n)=x_n but f^k(x_n)\neq{x_n} for every k from 1 to n-1, we say that n is the least period of x_n.

Theorem 1. (Sharkovsky’s “little” theorem) Let I\subseteq\mathbb{R} be an interval and let f:I\to\mathbb{R} be a continuous function. If f has a point of least period 3, then it has points of arbitrary least period; in particular, it has a fixed point.

Note that no hypothesis is made on I being open or closed, bounded or unbounded.
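Before the proof, a quick numerical illustration (our own example, not from the talk). The quadratic map f(x)=-\frac{3}{2}x^2+\frac{5}{2}x+1 sends 0\mapsto1\mapsto2\mapsto0, a cycle of least period 3, so Theorem 1 promises a fixed point; a few lines of Haskell locate it by bisection:

```haskell
-- A continuous map with the 3-cycle 0 -> 1 -> 2 -> 0.
f :: Double -> Double
f x = -1.5 * x * x + 2.5 * x + 1

-- Bisection on g(x) = f(x) - x over [1, 2], where g changes sign
-- (g 1 = 1 > 0, g 2 = -2 < 0), to locate the promised fixed point.
fixedPoint :: Double
fixedPoint = go 1 2
  where
    g x = f x - x
    go lo hi
      | hi - lo < 1e-12   = mid
      | g lo * g mid <= 0 = go lo mid
      | otherwise         = go mid hi
      where
        mid = (lo + hi) / 2
```

The result is approximately 1.4574, the positive root of \frac{3}{2}x^2-\frac{3}{2}x-1=0.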

Our proof of Sharkovsky’s “little” theorem follows the one given in (Sternberg, 2010), and could even be given in a Calculus 1 course: the most advanced result will be the intermediate value theorem.

Lemma 1. Let I=[a,b] be a compact interval of the real line, let f:I\to\mathbb{R} be a continuous function. Suppose that for some compact interval J it is I\subseteq{J}\subseteq{f(I)}. Then f has a fixed point in J.

Proof. Let m and M be the minimum and the maximum of f in I, respectively. As I\subseteq{f(I)}, it is m\leq{a} and M\geq{b}. Choose u,v\in{I} such that f(u)=m and f(v)=M. Then g(x)=f(x)-x is nonpositive at x=u and nonnegative at x=v. By the intermediate value theorem applied to g, f must have a fixed point in the closed and bounded interval (possibly reduced to a single point) delimited by u and v, which is a subset of J. \Box

Lemma 2. In the hypotheses of Lemma 1, let K be a closed and bounded interval contained in f(I). Then there exists a closed and bounded subinterval J of I such that f(J)=K.

Proof. Let K=[c,d]. We may suppose c<d, otherwise the statement is trivial. Let u\in{I} be the largest such that f(u)=c. Two cases are possible.

  1. There exists x\in(u,b] such that f(x)=d. Let v be the smallest such x, and let J=[u,v]. Then surely f(J)\supset{K}, but if for some x\in(u,v) we had either f(x)<c or f(x)>d, then by the intermediate value theorem, for some y\in(u,v) we would also have either f(y)=c or f(y)=d, against our choice of u and v.
  2. f(x)<d for every x\in(u,b]. Let then w be the largest x\in[a,u] such that f(x)=d, and let J=[w,u]. Then f(J)=K for reasons similar to those of the previous point.


Proof of Sharkovsky’s “little” theorem. Let a,b,c\in\mathbb{R} be such that f(a)=b, f(b)=c, and f(c)=a. Up to cycling between these three values and replacing f(x) with -f(-x), we may suppose a<b<c. Fix a positive integer n: we will prove that there exists x_{n}\in{I} such that f^n(x_{n})=x_{n} and f^i(x_{n})\neq{x_{n}} for every i<n.

Let L=[a,b] and R=[b,c] be the “left” and “right” sides of the closed and bounded interval [a,c]: then R\subseteq{f(L)} and L\cup{R}\subseteq{f(R)} by the intermediate value theorem. In particular, R\subseteq{f(R)}, and Lemma 1 immediately tells us that f has a fixed point x_{1} in R. Also, L\subseteq{f(R)}\subseteq{f^2(L)}, so f also has a point of period 2 in L, again by Lemma 1: call it x_{2}. This point x_{2} cannot be a fixed point of f: by construction f(x_{2})\in{R}, so if it were fixed it would belong to L\cap{R}=\{b\}, but b has least period 3. As we can obviously take x_{3}=b, we only need to consider the case n\geq4.

By Lemma 2, there exists a closed and bounded subinterval A_1 of R such that f(A_1)=R. In turn, as A_1\subseteq{R}, there also exists a closed and bounded subinterval A_2 of A_1 such that f(A_2)=A_1, again by Lemma 2: but then, f^2(A_2)=f(A_1)=R. By iterating the procedure, we find a sequence of closed and bounded intervals A_i such that, for every i\geq1, A_{i+1}\subseteq{A_i} and f^i(A_i)=R.

We stop at i=n-2 and recall that R\subseteq{f(L)}: we are still in the situation of Lemma 2, with A_{n-2} in the role of K. So we choose A_{n-1} as a closed and bounded subinterval not of A_{n-2}, but of L, such that f(A_{n-1})=A_{n-2}. In turn, as L\subseteq{f(R)}, there exists a closed and bounded subinterval A_n of R such that f(A_n)=A_{n-1}. Following the chain of inclusions we obtain f^n(A_n)=R. By Lemma 1, f^n has a fixed point x_n in A_n, which is a periodic point of period n for f.

Can the least period of x_n for f be smaller than n? No, it cannot, for the following reason. If x_{n} has period m\leq{n}, then so has y=f(x_{n}), and in addition n is divisible by m. But f(x_n)\in{L} while f^i(x_n)\in{R} for every i\in[2:n]: consequently, if x_{n} has period m<n, then y\in{L}\cap{R}=\{b\}. But this is impossible, because f^{2}(y)=f^{3}(x_{n})\in{R} by construction as n\geq4, while f^{2}(b)=a\not\in{R}. \Box

Theorem 1 is a special case of a much more general, and complex, result also due to Sharkovsky. Before stating it, we need to define a special ordering on positive integers.

Definition. The Sharkovsky ordering \rhd between positive integers is defined as follows:

  • Identify the number n=2^k\cdot{m}, with m odd integer, with the pair (k,m).
  • Sort the pairs with m>1 in lexicographic order.
    That is: first, list all the odd numbers, in increasing order; then, all the doubles of the odd numbers, in increasing order; then, all the quadruples of the odd numbers, in increasing order; and so on.
    For example, 17\rhd243 and 4095\rhd6.
  • Set (k,m)\rhd(h,1) for every m>1 and k,h\geq0.
    That is: the powers of 2 follow, in the Sharkovskii ordering, any number which has an odd factor.
    For example, 17000000000000\rhd2.
  • Sort the pairs of the form (k,1)—i.e., the powers of 2—in reverse order.
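The ordering is easy to mechanise; here is a self-contained Haskell sketch (ours) that compares two positive integers according to \rhd by mapping each number to a sortable key:

```haskell
-- Decompose n = 2^k * m with m odd.
decomp :: Integer -> (Integer, Integer)
decomp n
  | odd n     = (0, n)
  | otherwise = let (k, m) = decomp (n `div` 2) in (k + 1, m)

-- precedes a b holds exactly when a comes before b in the Sharkovsky ordering.
precedes :: Integer -> Integer -> Bool
precedes a b = key a < key b
  where
    key n = case decomp n of
      (k, 1) -> (1 :: Int, -k, 0)  -- powers of 2 come last, in reverse order
      (k, m) -> (0, k, m)          -- others: lexicographic on (k, m)
```

For instance, precedes 17 243, precedes 4095 6 and precedes 17000000000000 2 all hold, matching the examples above.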

The set of positive integers with the Sharkovsky ordering then has the form:

3\rhd5\rhd7\rhd\ldots\rhd2\cdot3\rhd2\cdot5\rhd\ldots\rhd2^2\cdot3\rhd2^2\cdot5\rhd\ldots\rhd\ldots\rhd2^3\rhd2^2\rhd2\rhd1
Note that \rhd is a total ordering.

Theorem 2. (Sharkovsky’s “great” theorem) Let I be an interval on the real line and let f:I\to\mathbb{R} be a continuous function.

  1. If f has a point of least period m, and m\rhd{n}, then f has a point of least period n. In particular, if f has a periodic point, then it has a fixed point.
  2. For every integer m\geq1 it is possible to choose I and f so that f has a point of least period m and no points of least period k for any k\rhd{m}. In particular, there are functions whose only periodic points are fixed points.


  • Keith Burns and Boris Hasselblatt. The Sharkovsky theorem: A natural direct proof. The American Mathematical Monthly 118(3) (2011), 229–244. doi:10.4169/amer.math.monthly.118.03.229
  • Robert L. Devaney. An Introduction to Chaotic Dynamical Systems. Second edition, Westview Press, 2003.
  • Shlomo Sternberg. Dynamical Systems. Dover, 2010.

by Silvio Capobianco at November 29, 2022 05:58 PM

November 28, 2022

Monday Morning Haskell

Black Friday Sale Ends Today!

Today is Cyber Monday, which marks the last day of our Black Friday Thanksgiving sale! This is your last chance to get the biggest deals of the year on all of our online courses here at Monday Morning Haskell!

For the rest of the day, you can get 20% off any of our courses by using the code BLACKFRIDAY22 at checkout. And you can get an extra discount (up to 30% off) if you subscribe to our monthly newsletter!

Here's one final review of our different courses.

Haskell From Scratch

This is our full-length beginners course. It will give you a full introduction to Haskell's syntax and core concepts. You'll also get the chance to start developing your Haskell problem solving skills. It's the best option if you've never written a full Haskell project before!

Course Description

Making Sense of Monads

This shorter course focuses strictly on monads and other functional structures. If monads have been a tricky subject for you in the past, hopefully this course can help you finally conquer them! The course includes two mini-projects for you to hone your skills!

Course Description

Effectful Haskell

Effectful Haskell goes a step beyond our introductory monads course. You'll learn some practical applications for advanced monadic ideas - like how to use monad classes and free monads to organize effects in your program. Effectful Haskell also includes some basic practice in deploying an application to Heroku.

Course Description

Practical Haskell

Practical Haskell is our second full-length course. Over the course of five modules, you'll build out the different layers of a full-stack application. You'll learn how to interact with a database, build a web server and develop a web frontend with Elm!

Course Description

Haskell Brain

Haskell Brain is our machine-learning course. It will teach you how to use TensorFlow in conjunction with Haskell, as well as a couple other related libraries and techniques!

Course Description


So don't miss out on these offers! Remember the code BLACKFRIDAY22 at checkout for 20% off, and you can subscribe to our mailing list for an even better offer!

Later this week, we'll be back with the start of Advent of Code, so there will be a ton of new content in the next month!

by James Bowen at November 28, 2022 03:30 PM


Haskell development job with Well-Typed

tl;dr If you’d like a job with us, send your application as soon as possible.

Over the next few months, we are looking for one or more Haskell experts to join our team at Well-Typed. At the moment, we are looking particularly for someone who is knowledgeable and interested in one or more of the following areas:

  • GHC development.
  • General Haskell development, with a good understanding of issues relating to performance.

This is a great opportunity for someone who is passionate about Haskell and who is keen to improve and promote Haskell in a professional context.

About Well-Typed

We are a team of top notch Haskell experts. Founded in 2008, we were the first company dedicated to promoting the mainstream commercial use of Haskell. To achieve this aim, we help companies that are using or moving to Haskell by providing a range of services including consulting, development, training, and support and improvement of the Haskell development tools. We work with a wide range of clients, from tiny startups to well-known multinationals. We have established a track record of technical excellence and satisfied customers.

Our company has a strong engineering culture. All our managers and decision makers are themselves Haskell developers. Most of us have an academic background and we are not afraid to apply proper computer science to customers’ problems, particularly the fruits of FP and PL research.

We are a self-funded company so we are not beholden to external investors and can concentrate on the interests of our clients, our staff and the Haskell community.

About the job

The role is not tied to a single specific project or task, and is fully remote.

In general, work for Well-Typed could cover any of the projects and activities that we are involved in as a company. The work may involve:

  • working on GHC, libraries and tools;

  • Haskell application development;

  • working directly with clients to solve their problems;

  • teaching Haskell and developing training materials.

We try wherever possible to arrange tasks within our team to suit people's preferences and to rotate to provide variety and interest.

Well-Typed has a variety of clients. For some we do proprietary Haskell development and consulting. For others, much of the work involves open-source development and cooperating with the rest of the Haskell community: the commercial, open-source and academic users.

About you

Our ideal candidate has excellent knowledge of Haskell, whether from industry, academia or personal interest. Familiarity with other languages, low-level programming and good software engineering practices are also useful. Good organisation and the ability to manage your own time and reliably meet deadlines are important. You should also have good communication skills.

You are likely to have a bachelor’s degree or higher in computer science or a related field, although this isn’t a requirement.

Further (optional) bonus skills:

  • experience in teaching Haskell or other technical topics,

  • experience of consulting or running a business,

  • experience with Cardano and/or Plutus,

  • knowledge of and experience in applying formal methods,

  • familiarity with (E)DSL design,

  • knowledge of networking, concurrency and/or systems programming,

  • experience with working on GHC,

  • experience with web programming (in particular front-end),

  • … (you tell us!)

Offer details

The offer is initially for one year full time, with the intention of a long term arrangement. Living in England is not required. We may be able to offer either employment or sub-contracting, depending on the jurisdiction in which you live. The salary range is 50k–90k GBP per year.

If you are interested, please apply by email to . Tell us why you are interested and why you would be a good fit for Well-Typed, and attach your CV. Please indicate how soon you might be able to start.

We are looking to fill at least one position as soon as possible, so please send in your application as soon as you can. That said, we expect to be continuously hiring over the next few months, and are willing to consider applications from expert Haskell developers at any time, so there is no firm application deadline.

by andres, duncan, adam, christine at November 28, 2022 12:00 AM

November 27, 2022

Mark Jason Dominus

Whatever became of the Peanuts kids?

One day I asked Lorrie if she thought that Schroeder actually grew up to be a famous concert pianist. We agreed that he probably did. Or at least Schroeder has as good a chance as anyone does. To become a famous concert pianist, you need to have talent and drive. Schroeder clearly has talent (he can play all that Beethoven and Mozart on a toy piano whose black keys are only painted on) and he clearly has drive. Not everyone with talent and drive does succeed, of course, but he might make it, whereas some rando like me has no chance at all.

That led to a longer discussion about what became of the other kids. Some are easier than others. Who knows what happens to Violet, Sally, (non-Peppermint) Patty, and Shermy? I imagine Violet going into realty for some reason.

As a small child I did not understand that Lucy's “psychiatric help 5¢” lemonade stand was hilarious, or that she would have been literally the worst psychiatrist in the world. (Schulz must have known many psychiatrists; was Lucy inspired by any in particular?) Surely Lucy does not become an actual psychiatrist. The world is cruel and random, but I refuse to believe it is that cruel. My first thought for Lucy was that she was a lawyer, perhaps a litigator. Now I like to picture her as a union negotiator, and the continual despair of the management lawyers who have to deal with her.

Her brother Linus clearly becomes a university professor of philosophy, comparative religion, Middle-Eastern medieval literature, or something like that. Or does he drop out and work in a bookstore? No, I think he's the kind of person who can tolerate the grind of getting a graduate degree and working his way into a tenured professorship, with a tan corduroy jacket with patches on the elbows, and maybe a pipe.

Peppermint Patty I can imagine as a high school gym teacher, or maybe a yoga instructor or massage therapist. I bet she'd be good at any of those. Or if we want to imagine her at the pinnacle of achievement, coach of the U.S. Olympic softball team. Marcie is calm and level-headed, but a follower. I imagine her as a highly competent project manager.

In the conversation with Lorrie, I said “But what happens to Charlie Brown?”

“You're kidding, right?” she asked.

“No, why?”

“To everyone's great surprise, Charlie Brown grows up to be a syndicated cartoonist and a millionaire philanthropist.”

Of course she was right. Charlie Brown is good ol' Charlie Schulz, whose immense success surprised everyone, and nobody more than himself.

Charles M. Schulz was born 100 years ago last Saturday.

by Mark Dominus at November 27, 2022 05:46 PM

November 26, 2022

Mark Jason Dominus

Wombat coprolites

I was delighted to learn some time ago that there used to be giant wombats, six feet high at the shoulders, unfortunately long extinct.

It's also well known (and a minor mystery of Nature) that wombats have cubical poop.

Today I wondered, did the megafauna wombat produce cubical megaturds? And if so, would they fossilize (as turds often do) and leave ten-thousand-year-old mineral cubescat littering Australia? And if so, how big are these and where can I see them?

A look at Intestines of non-uniform stiffness mold the corners of wombat feces (Yang et al, Soft Matter, 2021, 17, 475–488) reveals a nice scatter plot of the dimensions of typical wombat scat, informing us that for (I think) the smooth-nosed (common) wombat:

  • Length: 4.0 ± 0.6 cm
  • Height: 2.3 ± 0.3 cm
  • Width: 2.5 ± 0.3 cm

Notice though, not cubical! Clearly longer than they are thick. And I wonder how one distinguishes the width from the height of a wombat turd. Probably the paper explains, but the shitheads at Soft Matter want £42.50 plus tax to look at the paper. (I checked, and Alexandra was not able to give me a copy.)

Anyway the common wombat is about 40 cm long and 20 cm high, while the extinct giant wombats were nine or ten times as big: 400 cm long and 180 cm high, let's call it ten times. Then a proportional giant wombat scat would be a cuboid approximately 24 cm (9 in) wide and tall, and 40 cm (16 in) long. A giant wombat poop would be as long as… a wombat!

But not the imposing monoliths I had been hoping for.

Yang also wrote an article Duration of urination does not change with body size, something I have wondered about for a long time. I expected bladder size (and so urine quantity) to scale with the body volume, the cube of the body length. But the rate of urine flow should be proportional to the cross-sectional area of the urethra, only the square of the body length. So urination time should be roughly proportional to body size. Yang and her coauthors are decisive that this is not correct:

we discover that all mammals above 3 kg in weight empty their bladders over nearly constant duration of 21 ± 13 s.
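In symbols, the naive argument above was (with L a characteristic body length):

```latex
V_{\text{bladder}} \propto L^{3}, \qquad
\dot{V} \propto A_{\text{urethra}} \propto L^{2}, \qquad
t = \frac{V_{\text{bladder}}}{\dot{V}} \propto \frac{L^{3}}{L^{2}} = L
```

That is, emptying time should grow linearly with body size, which is what the measurement contradicts.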

What is wrong with my analysis above? It's complex and interesting:

This feat is possible, because larger animals have longer urethras and thus, higher gravitational force and higher flow speed. Smaller mammals are challenged during urination by high viscous and capillary forces that limit their urine to single drops. Our findings reveal that the urethra is a flow-enhancing device, enabling the urinary system to be scaled up by a factor of 3,600 in volume without compromising its function.

Wow. As Leslie Orgel said, evolution is cleverer than you are.

However, I disagree with the conclusion: 21±13 is not “nearly constant duration”. This is a range of 8–34s, with some mammals taking four times as long as others.

The appearance of the Fibonacci numbers here is surely coincidental, but wouldn't it be awesome if it wasn't?

[ Addendum: I wondered if this was the only page on the web to contain the bigram “wombat coprolites”, but Google search produced this example from 2018:

Have wombats been around for enough eons that there might be wombat coprolites to make into jewelry? I have a small dinosaur coprolite that is kind of neat but I wouldn't make that turd into a necklace, it looks just like a piece of poop.

]
by Mark Dominus at November 26, 2022 03:43 PM

Monday Morning Haskell

Black Friday Spotlight: Haskell Brain!

Machine learning is one of the most important skills in software today. The field has typically been dominated by languages like Python (through TensorFlow and PyTorch) and R. So it's a bit frustrating for Haskell fans who want to use this awesome language as widely as possible but struggle to apply it to this critical domain.

However, there are a few tools out there that allow us to use Haskell for machine learning! Chief among these are the Haskell Tensorflow bindings. They aren't easy to use though, and there aren't many tutorials either!

The Haskell Brain seeks to fill this gap. This course will walk you through all the important questions about getting started with Haskell and TensorFlow.

  1. What system setup is required?
  2. How are tensors represented in Haskell?
  3. How can I train a machine learning model with tensors?

If you're ready to start answering these questions, head to the course sales page!

For more details about what's included in the course, including FAQ, head over to our course description page.

The best part of it is that for the next few days, you can get 20% off this course through our Black Friday sale! Just use the code BLACKFRIDAY22 at checkout. If you want an even better deal of 30% off, you can subscribe to our mailing list! You'll get a special code when you sign up. So don't miss out!

by James Bowen at November 26, 2022 03:30 PM

November 25, 2022

Monday Morning Haskell

Black Friday Spotlight: Practical Haskell

How do you actually do something in Haskell? A programming language is only helpful if we can use it to solve real problems. Perhaps you've written up some neat and tidy solutions to small problems with Haskell. But does the language actually have the libraries and tools to build useful programs?

The answer to this question is a resounding Yes! Not only does Haskell have useful libraries for practical problems, but the "Haskell Approach" to these problems often has clear advantages! For instance, in Haskell you can:

  1. Write database queries that are type-safe, interoperating seamlessly with well-defined Haskell types.
  2. Define a web server where the API is clearly laid out and defined in terms of Haskell types.
  3. Link your Haskell types to frontend types that will populate your Web UI.
  4. Organize "effects" within your system so that the capabilities of different functions are explicitly defined and known to your program at compile-time.
  5. Use monads to describe a test case in plain language.

These ideas can supercharge your Haskell abilities! But they aren't necessarily easy to pick up. It takes a fair amount of commitment to learn them well enough to use in your own projects.

Luckily, Monday Morning Haskell has a great tool for you to learn these skills! Our Practical Haskell Course will teach you how to build a functional application that integrates systems like databases, web servers, frontend web pages, and behavioral tests.

If this sounds like exactly what you've been looking for to rapidly improve your Haskell, head to the course page to get started!

If you're curious for more details, head to our course description page to learn about what you can expect in the course.

Don't forget, you've got a couple more days to take advantage of our Black Friday Sale! You can use the code BlackFriday22 to get 20% off any of our courses, including Practical Haskell. If you subscribe to our mailing list, you can get an even better code for 30% off! So don't miss out on those savings!

by James Bowen at November 25, 2022 03:30 PM

November 24, 2022

Tweag I/O

Threads and messages with Rust and WebAssembly

On most systems, you can implement concurrency using either threads or processes, where the main difference between the two is that threads share memory and processes don’t. Modern web browsers support concurrency through the Web Workers API. Although Web Workers are by default closer to a multi-process model, when used with WebAssembly you can opt-in to a more thread-like experience. Just like in systems programming, the choice of threads vs. processes comes with various trade-offs and performance implications; I’ll be covering some of them in this post. These examples will be in Rust, but similar trade-offs should apply to other languages compiled to WASM.

The Web Workers API (multi-processing on the web)

When used from JavaScript, the Web Workers API is very simple: call new Worker("/path/to/worker.js") and your browser will fetch worker.js and start running it concurrently. Inter-worker communication works in a very JavaScripty way, by setting message handler callbacks and then sending messages. To use Web Workers from compiled WASM code, you’ll need to go “through” JavaScript: you need a little JavaScript glue for spawning the worker, and you need to do the message sending and callback handling using some JavaScript bindings. Here’s a little example that spawns a worker, sends a message, and gets a reply:

// Spawn a worker and communicate with it.
fn spawn_worker() {
  let worker = web_sys::Worker::new("./worker.js").expect("failed to spawn worker");
  let callback = wasm_bindgen::Closure::<dyn FnMut(web_sys::MessageEvent)>::new(|msg: web_sys::MessageEvent| {
    assert_eq!(msg.data().as_f64(), Some(2.0));
  });
  // Set up a callback to be invoked whenever we receive a message from the worker.
  // .as_ref().unchecked_ref() turns a wasm_bindgen::Closure into a &js_sys::Function.
  worker.set_onmessage(Some(callback.as_ref().unchecked_ref()));

  // Send a message to the worker.
  worker.post_message(&JsValue::from(1.0)).expect("failed to post");

  // Did you notice that `set_onmessage` took a borrow? We still own `callback`, and we'd
  // better not free it too soon!
  std::mem::forget(callback); // FIXME: memory management is hard
}

// An entry point for the JavaScript worker to call back into WASM.
#[wasm_bindgen]
pub fn worker_entry_point(arg: i32) {
  // Add 1 to our argument and send it back to the main thread.
  // Yeah, the js_sys/web_sys bindings are ... low-level.
  js_sys::global()
    .unchecked_into::<web_sys::DedicatedWorkerGlobalScope>()
    .post_message(&JsValue::from(arg + 1))
    .expect("failed to post");
}

And here’s the JavaScript glue code in worker.js, which receives messages and calls worker_entry_point:

self.onmessage = async event => {
  // Initialize the WASM module, then call the entry point with the message payload.
  const { worker_entry_point } = await wasm_bindgen(/* path to the generated .wasm */);
  worker_entry_point(Number(event.data));
};
Note that when using the Web Workers API, all of the messages you send are JsValues. This is fine for sending primitive types, but it becomes annoying if you want to send structured types, which must be converted into JsValues and back. You can simplify this process by using a helper crate like gloo-worker, which provides a convenient way to send structured data between workers. Under the hood, it serializes and deserializes data to and from a js_sys::ArrayBuffer.

Dealing with large data can also be tricky, because post_message requires that you copy the data. To avoid large data copies, you can use a SharedArrayBuffer (a buffer that can be accessed by multiple workers) or the post_message_with_transfer function, which allows for transferring the ownership of certain JavaScript objects from one worker to another without copying. The downside of this workaround is that it doesn’t work directly with objects living in WASM memory. For example, if you have a Vec<u8> that you want to send to another worker, you’ll need to either copy it to an ArrayBuffer and transfer it, or copy it to a SharedArrayBuffer and share it.

Shared memory in WebAssembly (multi-threading on the web)

Workers that share an address space can communicate with less boilerplate and minimal data-copying. To create shared memory workers, note that wasm_bindgen’s auto-generated initialization function takes a second (optional) parameter: a WASM memory object for the module to use. Memory chunks can be shared between WASM modules, so we can instantiate a new module using the same memory as the first one, and the two modules will share it.

Having two WASM workers sharing the same memory opens the door to more expressive inter-worker communication. For example, we can easily write a function for executing a closure in another worker, just like how the std::thread::spawn function works. The trick is to create a closure and send its address to the other worker. Since the memory space is shared, the receiving worker can cast that address back into a closure and execute it.

// A function imitating `std::thread::spawn`.
pub fn spawn(f: impl FnOnce() + Send + 'static) -> Result<web_sys::Worker, JsValue> {
  let worker = web_sys::Worker::new("./worker.js")?;
  // Double-boxing because `dyn FnOnce` is unsized and so `Box<dyn FnOnce()>` is a fat pointer.
  // But `Box<Box<dyn FnOnce()>>` is just a plain pointer, and since wasm has 32-bit pointers,
  // we can cast it to a `u32` and back.
  let ptr = Box::into_raw(Box::new(Box::new(f) as Box<dyn FnOnce()>));
  let msg = js_sys::Array::new();
  // Send the worker a reference to our memory chunk, so it can initialize a wasm module
  // using the same memory.
  msg.push(&wasm_bindgen::memory());
  // Also send the worker the address of the closure we want to execute.
  msg.push(&JsValue::from(ptr as u32));
  worker.post_message(&msg)?;
  Ok(worker)
}

// This function is here for `worker.js` to call.
#[wasm_bindgen]
pub fn worker_entry_point(addr: u32) {
  // Interpret the address we were given as a pointer to a closure to call.
  let closure = unsafe { Box::from_raw(addr as *mut Box<dyn FnOnce()>) };
  (*closure)();
}

The JavaScript worker glue must be changed slightly, to use the received memory chunk when initializing its WASM module.

self.onmessage = async event => {
  // event.data[0] should be the Memory object, and event.data[1] is the address to pass
  // into worker_entry_point.
  const { worker_entry_point } = await wasm_bindgen(/* path to the generated .wasm */, event.data[0]);
  worker_entry_point(event.data[1]);
};
And now we can spawn closures on another thread just like in native multi-threaded code, using the spawn function above instead of std::thread::spawn. You can even use Rust’s native inter-thread communication tools, like std::sync::mpsc, to transfer data between threads without copying! Our first worker example becomes as simple as:

let (to_worker, from_main) = std::sync::mpsc::channel();
let (to_main, from_worker) = std::sync::mpsc::channel();
spawn(move || { to_main.send(from_main.recv().unwrap() + 1.0).unwrap(); }).expect("failed to spawn");
to_worker.send(1.0).unwrap();
assert_eq!(from_worker.recv().unwrap(), 2.0);
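The address round-trip that spawn relies on can be exercised in plain, non-WASM Rust, with an ordinary thread standing in for the worker. This is a sketch: to_addr and run_addr are illustrative names, and native pointers are cast to usize rather than u32.

```rust
use std::sync::mpsc;
use std::thread;

// Double-box the closure so the fat `dyn FnOnce` pointer becomes a thin one,
// then smuggle it across as a plain integer address.
fn to_addr(f: impl FnOnce() + Send + 'static) -> usize {
    Box::into_raw(Box::new(Box::new(f) as Box<dyn FnOnce() + Send>)) as usize
}

// The receiving side reinterprets the address and runs the closure.
// Safety: `addr` must come from `to_addr` and be consumed exactly once.
unsafe fn run_addr(addr: usize) {
    let closure = Box::from_raw(addr as *mut Box<dyn FnOnce() + Send>);
    closure();
}

fn main() {
    let (to_main, from_worker) = mpsc::channel();
    let addr = to_addr(move || {
        to_main.send(1.0 + 1.0).unwrap();
    });
    // Hand the address to another thread, as post_message hands it to a worker.
    thread::spawn(move || unsafe { run_addr(addr) })
        .join()
        .unwrap();
    assert_eq!(from_worker.recv().unwrap(), 2.0);
}
```

In the browser the integer travels through post_message instead of a captured variable, but the ownership story is identical: the sender leaks the double box, and the receiver reconstitutes and consumes it exactly once.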

Ok, there are some caveats. Shared memory WASM modules need some features that weren’t in the first iteration of the WASM spec, so you’ll need to build with some extra target features. You’ll also need to rebuild the standard library with those features, which requires a nightly compiler and unstable flags. Something like this will do the trick:

RUSTFLAGS="-C target-feature=+atomics,+bulk-memory,+mutable-globals" cargo build --target=wasm32-unknown-unknown --release -Z build-std=panic_abort,std

And then you’ll need to configure your web server with some special headers, because shared WASM memory builds on SharedArrayBuffer.
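Specifically, SharedArrayBuffer is only available on cross-origin isolated pages. As a sketch (nginx syntax; adapt to your own server), the headers in question are:

```nginx
# Enable cross-origin isolation, which SharedArrayBuffer requires.
add_header Cross-Origin-Opener-Policy same-origin;
add_header Cross-Origin-Embedder-Policy require-corp;
```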

But there’s a more serious issue with shared-memory workers: our example called from_worker.recv() in the main thread, and most browsers will throw an exception if you try to block the main thread, even for a very short time. Since Rust doesn’t have any tooling for checking non-blockingness (see here for some discussion), this might be difficult to ensure.

If the extra discipline is just too onerous or unreliable, you can guarantee a non-blocked main thread by moving all shared-memory WASM modules off of it: from the main thread, use the JavaScript message-passing methods to communicate with one or more workers, which are free to communicate amongst each other using whichever (possibly blocking) methods they want.

How much does all of this actually matter?

To measure the performance implications of the various options, I made some buffers and sent them back and forth repeatedly between workers while measuring the round-trip time. I repeated the experiment with two different buffer sizes (a large 20 MB buffer, and a small 16 B one) and three different message-passing methods. The timings were done on Firefox 101, and the code is available here.

                             20 MB buffer   16 B buffer
post_message                 28 ms          0.028 ms
post_message_with_transfer   0.033 ms       0.033 ms
std::sync::mpsc::channel     0.0062 ms      0.0062 ms

You’ll notice that Rust-native shared memory is the fastest by a substantial factor but not a very large absolute amount, unless you really need to send a lot of messages. Between the JavaScript methods, post_message_with_transfer has some small overhead compared to post_message for small buffers, but this is dwarfed by the copying time if you have substantial data to send.

At Tweag, we’ve been working with a client on an optimized WASM library that caches and doles out largish (around 20MB each) chunks of data. We tried various threading architectures and ended up making do without shared memory. Our heavy use of non-lock-free primitives made it hard to keep the main browser thread happy when using shared memory, while the hybrid architecture described above forced us into too many expensive copies (we couldn’t just transfer the data to the main thread because we needed a copy in cache). With a separate-memory architecture, we arranged our data processing so that large buffers are only ever transferred, never copied. And the small overhead of post_message_with_transfer was negligible compared to the other processing we were doing.

Your ideal architecture might be different from ours. By explaining some of the trade-offs involved, I hope this post will help you find it!

November 24, 2022 12:00 AM

November 23, 2022

Well-Typed


Announcing a live tutorial on eventlog2html and ghc-debug

We are happy to announce that we will be live-streaming a free tutorial on Haskell debugging tools via YouTube:

Finley McIlwaine, 2022-12-01, 1900–2100 GMT

Understanding and analysing the memory usage of Haskell programs is a notoriously difficult yet important problem. Recent improvements to GHC’s profiling capabilities, along with better tooling, have made it much easier to deeply and precisely analyse the memory usage characteristics of even large Haskell programs.

This workshop aims to present two such tools that allow high and low level memory usage analysis of Haskell programs: eventlog2html and ghc-debug. We will learn how to set up and use eventlog2html to generate high-level visuals and statistics of our program’s execution. We will also learn how to set up and use ghc-debug to precisely and programmatically explore our program’s low-level memory usage profile.

We will examine these tools by using them on several pre-prepared Haskell programs. The workshop aims to be beneficial to Haskell programmers of all levels. Beginner Haskell programmers can expect to gain a deeper understanding of lazy evaluation and the impacts it can have on program performance. Experienced Haskell programmers can expect to gain an understanding of exactly what these tools have to offer and the skills necessary to use these tools on their own Haskell programs.

This is a re-run of a similar workshop Finley presented at MuniHac 2022, which unfortunately was not recorded.

We hope that many of you will join us next Thursday for this stream! There will be an option to ask questions during the presentation via the YouTube chat. There is no need to register.

by finley, andres at November 23, 2022 12:00 AM

November 22, 2022

Chris Reade

Graphs, Kites and Darts

Figure 1: Three Coloured Patches

Non-periodic tilings with Penrose’s kites and darts

We continue our investigation of the tilings using Haskell with Haskell Diagrams. What is new is the introduction of a planar graph representation. This allows us to define more operations on finite tilings, in particular forcing and composing.

Previously in Diagrams for Penrose Tiles we implemented tools to create and draw finite patches of Penrose kites and darts (such as the samples depicted in figure 1). The code for this and for the new graph representation and tools described here can be found on GitHub.

To describe the tiling operations it is convenient to work with the half-tiles: LD (left dart), RD (right dart), LK (left kite), RK (right kite) using a polymorphic type HalfTile (defined in a module HalfTile)

data HalfTile rep 
 = LD rep | RD rep | LK rep | RK rep   deriving (Show,Eq)

Here rep is a type variable for a representation to be chosen. For drawing purposes, we chose two-dimensional vectors (V2 Double) and called these Pieces.

type Piece = HalfTile (V2 Double)

The vector represents the join edge of the half tile (see figure 2) and thus the scale and orientation are determined (the other tile edges are derived from this when producing a diagram).
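Because HalfTile is polymorphic in rep, operations that do not depend on the representation can be written once for all four constructors. A small self-contained sketch (tileRep, isDart and isKite as written here are illustrative helpers, not necessarily the library's own definitions):

```haskell
data HalfTile rep
  = LD rep | RD rep | LK rep | RK rep
  deriving (Show, Eq)

-- Extract the representation, whichever half-tile constructor we have.
tileRep :: HalfTile rep -> rep
tileRep (LD r) = r
tileRep (RD r) = r
tileRep (LK r) = r
tileRep (RK r) = r

-- Classify half-tiles: darts versus kites.
isDart, isKite :: HalfTile rep -> Bool
isDart (LD _) = True
isDart (RD _) = True
isDart _      = False
isKite = not . isDart
```

Such helpers work unchanged whether rep is a drawing vector or, as below, a triple of graph vertices.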

Figure 2: The (half-tile) pieces showing join edges (dashed) and origin vertices (red dots)

Finite tilings or patches are then lists of located pieces.

type Patch = [Located Piece]

Both Piece and Patch are made transformable so rotate, and scale can be applied to both and translate can be applied to a Patch. (Translate has no effect on a Piece unless it is located.)

In Diagrams for Penrose Tiles we also discussed the rules for legal tilings and specifically the problem of incorrect tilings which are legal but get stuck so cannot continue to infinity. In order to create correct tilings we implemented the decompose operation on patches.

The vector representation that we use for drawing is not well suited to exploring properties of a patch such as neighbours of pieces. Knowing about neighbouring tiles is important for being able to reason about composition of patches (inverting a decomposition) and to find which pieces are determined (forced) on the boundary of a patch.

However, the polymorphic type HalfTile allows us to introduce our alternative graph representation alongside Pieces.

Tile Graphs

In the module Tgraph.Prelude, we have the new representation which treats half tiles as triangular faces of a planar graph – a TileFace – by specialising HalfTile with a triple of vertices (clockwise starting with the tile origin).

type Vertex = Int
type TileFace = HalfTile (Vertex,Vertex,Vertex)

For example

LD (1,3,4)       RK (6,4,3)

When we need to refer to particular vertices from a TileFace we use originV (the first vertex – red dot in figure 2), oppV (the vertex at the opposite end of the join edge – dashed edge in figure 2), wingV (the remaining vertex not on the join edge).

originV, oppV, wingV :: TileFace -> Vertex


The Tile Graphs implementation uses a type Tgraph which has a list of tile faces and a maximum vertex number.

data Tgraph = Tgraph { maxV  :: Vertex
                     , faces :: [TileFace]
                     }  deriving (Show)

For example, fool (short for a fool’s kite) is a Tgraph with 6 faces and 7 vertices, shown in figure 3.

fool = Tgraph { maxV = 7
              , faces = [RD (1,2,3),LD (1,3,4),RK (6,2,5)
                        ,LK (6,3,2),RK (6,4,3),LK (6,7,4)
                        ]
              }

(The fool is also called an ace in the literature.)

Figure 3: fool

With this representation we can investigate how composition works with whole patches. Figure 4 shows a twice decomposed sun on the left and a once decomposed sun on the right (both with vertex labels). In addition to decomposing the right graph to form the left graph, we can also compose the left graph to get the right graph.

Figure 4: sunD2 and sunD

After implementing composition, we also explore a force operation and an emplace operation to extend tilings.

There are some constraints we impose on Tgraphs.

  • No spurious vertices. The vertices of a Tgraph are the vertices that occur in the faces of the Tgraph (and maxV is the largest number occurring).
  • Connected. The collection of faces must be a single connected component.
  • No crossing boundaries. By this we mean that vertices on the boundary are incident with exactly two boundary edges. The boundary consists of the edges between the Tgraph faces and exterior region(s). This is important for adding faces.
  • Tile connected. Roughly, this means that if we collect the faces of a Tgraph by starting from any single face and then add faces which share an edge with those already collected, we get all the Tgraph faces. This is important for drawing purposes.

In fact, if a Tgraph is connected with no crossing boundaries, then it must be tile connected. (We could define tile connected to mean that the dual graph excluding exterior regions is connected.)

Figure 5 shows two excluded graphs which have crossing boundaries at 4 (left graph) and 13 (right graph). The left graph is still tile connected but the right is not tile connected (the two faces at the top right do not have an edge in common with the rest of the faces.)

Although we have allowed for Tgraphs with holes (multiple exterior regions), we note that such holes cannot be created by adding faces one at a time without creating a crossing boundary. They can be created by removing faces from a Tgraph without necessarily creating a crossing boundary.

Important We are using face as an abbreviation for half-tile face of a Tgraph here, and we do not count the exterior of a patch of faces to be a face. The exterior can also be disconnected when we have holes in a patch of faces and the holes are not counted as faces either. In graph theory, the term face would generally include these other regions, but we will call them exterior regions rather than faces.

Figure 5: A tile-connected graph with crossing boundaries at 4, and a non tile-connected graph

In addition to the constructor Tgraph we also use

checkedTgraph :: [TileFace] -> Tgraph

which creates a Tgraph from a list of faces, but also performs checks on the required properties of Tgraphs. We can then remove or select faces from a Tgraph and then use checkedTgraph to ensure the resulting Tgraph still satisfies the required properties.

selectFaces, removeFaces  :: [TileFace] -> Tgraph -> Tgraph
selectFaces fcs g = checkedTgraph (faces g `intersect` fcs)
removeFaces fcs g = checkedTgraph (faces g \\ fcs)

Edges and Directed Edges

We do not explicitly record edges as part of a Tgraph, but calculate them as needed. Implicitly we are requiring

  • No spurious edges. The edges of a Tgraph are the edges of the faces of the Tgraph.

To represent edges, a pair of vertices (a,b) is regarded as a directed edge from a to b. A list of such pairs will usually be regarded as a directed edge list. In the special case that the list is symmetrically closed [(b,a) is in the list whenever (a,b) is in the list] we will refer to this as an edge list rather than a directed edge list.

The following functions on TileFaces all produce directed edges (going clockwise round a face).

type Dedge = (Vertex,Vertex)
  -- join edge - dashed in figure 2
joinE  :: TileFace -> Dedge 
  -- the short edge which is not a join edge
shortE :: TileFace -> Dedge   
-- the long edge which is not a join edge
longE  :: TileFace -> Dedge
  -- all three directed edges clockwise from origin
faceDedges :: TileFace -> [Dedge]

For the whole Tgraph, we often want a list of all the directed edges of all the faces.

graphDedges :: Tgraph -> [Dedge]
graphDedges g = concatMap faceDedges (faces g)

Because our graphs represent tilings they are planar (can be embedded in a plane) so we know that at most two faces can share an edge and they will have opposite directions of the edge. No two faces can have the same directed edge. So from graphDedges g we can easily calculate internal edges (edges shared by 2 faces) and boundary directed edges (directed edges round the external regions).

internalEdges, boundaryDedges :: Tgraph -> [Dedge]

The internal edges of g are those edges which occur in both directions in graphDedges g. The boundary directed edges of g are the missing reverse directions in graphDedges g.
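That calculation can be sketched directly over a precomputed directed edge list (the primed names here are hypothetical helpers, not the library's graphDedges-based functions):

```haskell
type Vertex = Int
type Dedge = (Vertex, Vertex)

-- Internal edges occur in both directions among the faces' directed edges.
internalEdges' :: [Dedge] -> [Dedge]
internalEdges' es = [e | e@(a, b) <- es, (b, a) `elem` es]

-- Boundary directed edges are the reverses of the unmatched face edges,
-- so they run round the exterior in the opposite sense to the (clockwise) faces.
boundaryDedges' :: [Dedge] -> [Dedge]
boundaryDedges' es = [(b, a) | (a, b) <- es, (b, a) `notElem` es]
```

For example, for two triangles sharing the edge between vertices 1 and 2, only (1,2) and (2,1) are internal, and the remaining four edges are reversed to walk the boundary.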

We also refer to all the long edges of a Tgraph (including kite join edges) as phiEdges (both directions of these edges).

phiEdges :: Tgraph -> [Dedge]

This is so named because, when drawn, these long edges are phi times the length of the short edges (phi being the golden ratio which is approximately 1.618).

Drawing Tgraphs (Patches and VPinned)

The module Tgraph.Convert contains functions to convert a Tgraph to our previous vector representation (Patch) defined in TileLib so we can use the existing tools to produce diagrams.

makePatch :: Tgraph -> Patch

drawPatch :: Patch -> Diagram B -- defined in module TileLib

drawGraph :: Tgraph -> Diagram B
drawGraph = drawPatch . makePatch

However, it is also useful to have an intermediate stage (a VPinned) which contains both faces and locations for each vertex. This allows vertex labels to be drawn and for faces to be identified and retained/excluded after the location information is calculated.

data VPinned  = VPinned {vLocs :: VertexLocMap
                        ,vpFaces :: [TileFace]
                        }

A VPinned has a map from vertices to locations and a list of faces. We make VPinned transformable so it can also be an argument type for rotate, translate, and scale.

The conversion functions include

makeVPinned   :: Tgraph -> VPinned
dropLabels :: VPinned -> Patch -- discards vertex information
drawVPinned   :: VPinned -> Diagram B  -- draws labels as well

drawVGraph   :: Tgraph -> Diagram B
drawVGraph = drawVPinned . makeVPinned

One consequence of using abstract graphs is that there is no unique predefined way to orient or scale or position the patch arising from a graph representation. Our implementation selects a particular join edge and aligns it along the x-axis (unit length for a dart, phi length for a kite), and tile-connectedness ensures the rest of the patch can be calculated from this.

We also have functions to re-orient a VPinned and lists of VPinneds using chosen pairs of vertices. [Simply doing rotations on the final diagrams can cause problems if these include vertex labels. We do not, in general, want to rotate the labels, so we need to orient the VPinned before converting to a diagram.]

Decomposing Graphs

We previously implemented decomposition for patches which splits each half-tile into two or three smaller scale half-tiles.

decompose :: Patch -> Patch

We now have a Tgraph version of decomposition in the module Tgraphs:

decomposeG :: Tgraph -> Tgraph

Graph decomposition is particularly simple. We start by introducing one new vertex for each long edge (the phiEdges) of the Tgraph. We then build the new faces from each old face using the new vertices.
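The first step, introducing one new vertex per long edge, might be sketched like this (illustrative names, not the library's API; the key point is that both directions of a phi edge must map to the same new vertex):

```haskell
import qualified Data.Map as Map

type Vertex = Int
type Dedge  = (Vertex, Vertex)

-- Assign one fresh vertex (starting after maxV) to each undirected
-- phi edge. We key the map on the normalised (smaller, larger) pair
-- so that (a,b) and (b,a) share the same new vertex.
newPhiVertices :: Vertex -> [Dedge] -> Map.Map Dedge Vertex
newPhiVertices maxV phiDedges = Map.fromList (zip norms [maxV + 1 ..])
  where
    norms = Map.keys (Map.fromList [ (normalise e, ()) | e <- phiDedges ])
    normalise (a,b) = (min a b, max a b)

-- Look up the new vertex on an edge, given either direction.
vertexOn :: Map.Map Dedge Vertex -> Dedge -> Maybe Vertex
vertexOn m (a,b) = Map.lookup (min a b, max a b) m
```

The new faces of the decomposition can then be built from each old face by looking up the new vertices on its long edges.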

As a running example we take fool (mentioned above) and its decomposition foolD

*Main> foolD = decomposeG fool

*Main> foolD
Tgraph { maxV = 14
       , faces = [LK (1,8,3),RD (2,3,8),RK (1,3,9)
                 ,LD (4,9,3),RK (5,13,2),LK (5,10,13)
                 ,RD (6,13,10),LK (3,2,13),RK (3,13,11)
                 ,LD (6,11,13),RK (3,14,4),LK (3,11,14)
                 ,RD (6,14,11),LK (7,4,14),RK (7,14,12)
                 ,LD (6,12,14)]
       }

which are best seen together (fool followed by foolD) in figure 6.

Figure 6: fool and foolD (= decomposeG fool)

Composing graphs, and Unknowns

Composing is meant to be an inverse to decomposing, and one of the main reasons for introducing our graph representation. In the literature, decomposition and composition are defined for infinite tilings and in that context they are unique inverses to each other. For finite patches, however, we will see that composition is not always uniquely determined.

In figure 7 (Two Levels) we have emphasised the larger scale faces on top of the smaller scale faces.

Figure 7: Two Levels

How do we identify the composed tiles? We start by classifying vertices which are at the wing tips of the (smaller) darts, as these determine how things compose. In the interior of a graph/patch (e.g. in figure 7), a dart wing tip always coincides with a second dart wing tip, and either

  1. the 2 dart halves share a long edge. The shared wing tip is then classified as a largeKiteCentre and is at the centre of a larger kite. (See left vertex type in figure 8), or
  2. the 2 dart halves touch at their wing tips without sharing an edge. This shared wing tip is classified as a largeDartBase and is the base of a larger dart. (See right vertex type in figure 8)
Figure 8: largeKiteCentre (left) and largeDartBase (right)

[We also call these (respectively) a deuce vertex type and a jack vertex type later in figure 10]

Around the boundary of a graph, the dart wing tips may not share with a second dart. Sometimes the wing tip has to be classified as unknown but often it can be decided by looking at neighbouring tiles. In this example of a four times decomposed sun (sunD4), it is possible to classify all the dart wing tips as largeKiteCentres or largeDartBases so there are no unknowns.

If there are no unknowns, then we have a function to produce the unique composed graph.

composeG:: Tgraph -> Tgraph

Any correct decomposed graph without unknowns will necessarily compose back to its original. This makes composeG a left inverse to decomposeG provided there are no unknowns.

For example, with an (n times) decomposed sun we will have no unknowns, so these will all compose back up to a sun after n applications of composeG. For n=4 (sunD4 – the smaller scale shown in figure 7) the dart wing classification returns 70 largeKiteCentres, 45 largeDartBases, and no unknowns.

Similarly with the simpler foolD example, if we classify the dart wings we get

largeKiteCentres = [14,13]
largeDartBases = [3]
unknowns = []

In foolD (the right hand graph in figure 6), nodes 14 and 13 are new kite centres and node 3 is a new dart base. There are no unknowns so we can use composeG safely

*Main> composeG foolD
Tgraph { maxV = 7
       , faces = [RD (1,2,3),LD (1,3,4),RK (6,2,5)
                 ,RK (6,4,3),LK (6,3,2),LK (6,7,4)]
       }

which reproduces the original fool (left hand graph in figure 6).

However, if we now check the unknowns for fool we get

largeKiteCentres = []
largeDartBases = []
unknowns = [4,2]    

So both nodes 2 and 4 are unknowns. It had looked as though fool would simply compose into two half kites back-to-back (sharing their long edge not their join), but the unknowns show there are other possible choices. Each unknown could become a largeKiteCentre or a largeDartBase.

The question is then what to do with unknowns.

Partial Compositions

In fact our composeG resolves two problems when dealing with finite patches. One is the unknowns, and the other is critical missing faces needed to make up a new face (e.g. the absence of any half dart).

It is implemented using an intermediary function for partial composition

partCompose:: Tgraph -> ([TileFace],Tgraph) 

partCompose will compose everything that is uniquely determined, but will leave out faces round the boundary which cannot be determined or cannot be included in a new face. It returns the faces of the argument graph that were not used, along with the composed graph.

Figure 9 shows the result of partCompose applied to two graphs. [These are force kiteD3 and force dartD3 on the left. Force is described later]. In each case, the excluded faces of the starting graph are shown in pale green, overlaid by the composed graph on the right.

Figure 9: partCompose for two graphs (force kiteD3 top row and force dartD3 bottom row)

Then composeG is simply defined to keep the composed faces and ignore the unused faces produced by partCompose.

composeG:: Tgraph -> Tgraph
composeG = snd . partCompose 

This approach avoids making a decision about unknowns when composing, but it may lose some information by throwing away the uncomposed faces.

For correct Tgraphs g, if decomposeG g has no unknowns, then composeG is a left inverse to decomposeG. However, if we take g to be two kite halves sharing their long edge (not their join edge), these decompose to fool, which produces an empty graph when recomposed. Thus we do not have g = composeG (decomposeG g) in general. On the other hand, we do have g = composeG (decomposeG g) for correct whole-tile Tgraphs g (whole-tile meaning every half-tile of g has its matching half-tile on its join edge in g).

Later (figure 21) we show another exception to g = composeG(decomposeG g) with an incorrect tiling.

We make use of

selectFacesVP    :: [TileFace] -> VPinned -> VPinned
removeFacesVP    :: [TileFace] -> VPinned -> VPinned
selectFacesGtoVP :: [TileFace] -> Tgraph -> VPinned
removeFacesGtoVP :: [TileFace] -> Tgraph -> VPinned

for creating VPinneds from selected tile faces of a Tgraph or VPinned. This allows us to represent and draw a subgraph which need not be connected nor satisfy the no crossing boundaries property, provided the Tgraph it was derived from had these properties.


Forcing

When building up a tiling, following the rules, there is often no choice about what tile can be added alongside certain tile edges at the boundary. Such additions are forced by the existing patch of tiles and the rules. For example, if a half tile has its join edge on the boundary, the unique mirror half tile is the only possibility for adding a face to that edge. Similarly, the short edge of a left (respectively, right) dart can only be matched with the short edge of a right (respectively, left) kite.

We also make use of the fact that only 7 types of vertex can appear in (the interior of) a patch, so at a boundary vertex we sometimes have enough of the faces to determine the vertex type. These are given the following names in the literature (shown in figure 10): sun, star, jack (=largeDartBase), queen, king, ace, deuce (=largeKiteCentre).

Figure 10: Vertex types

The function

force :: Tgraph -> Tgraph

will add some faces on the boundary that are forced (i.e. new faces where there is exactly one possible choice). For example:

  • When a join edge is on the boundary – add the missing half tile to make a whole tile.
  • When a half dart has its short edge on the boundary – add the half kite that must be on the short edge.
  • When a vertex is both a dart origin and a kite wing (it must be a queen or king vertex) – if there is a boundary short edge of a kite half at the vertex, add another kite half sharing the short edge, (this converts 1 kite to 2 and 3 kites to 4 in combination with the first rule).
  • When two half kites share a short edge their common oppV vertex must be a deuce vertex – add any missing half darts needed to complete the vertex.

Figure 11 shows foolDminus (which is foolD with 3 faces removed) on the left, and the result of forcing, i.e. force foolDminus, on the right, which is the same graph we get from force foolD.

foolDminus = 
    removeFaces [RD(6,14,11), LD(6,12,14), RK(5,13,2)] foolD
Figure 11: foolDminus and force foolDminus = force foolD

Figures 12, 13 and 14 illustrate the result of forcing a 5-times decomposed kite, a 5-times decomposed dart, and a 5-times decomposed sun (respectively). The first two figures reproduce diagrams from an article by Roger Penrose illustrating the extent of influence of tiles round a decomposed kite and dart. [Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989. (fig 14)].

Figure 12: force kiteD5 with kiteD5 shown in red
Figure 13: force dartD5 with dartD5 shown in red
Figure 14: force sunD5 with sunD5 shown in red

In figure 15, the bottom row shows successive decompositions of a dart (dashed blue arrows from right to left), so applying composeG to each dart will go back (green arrows from left to right). The black vertical arrows are force. The solid blue arrows from right to left are (force . decomposeG) being applied to the successive forced graphs. The green arrows in the reverse direction are composeG again and the intermediate (partCompose) figures are shown in the top row with the ignored faces in pale green.

Figure 15: Arrows: black = force, green = composeG, solid blue = (force . decomposeG)

Figure 16 shows the forced graphs of the seven vertex types (with the starting graphs in red) along with a kite (top right).

Figure 16: Relating the forced seven vertex types and the kite

These are related to each other as shown in the columns. Each graph composes to the one above (an empty graph for the ones in the top row) and the graph below is its forced decomposition. [The rows have been scaled differently to make the vertex types easier to see.]

Adding Faces to a Tgraph

This is technically tricky because we need to discover what vertices (and implicitly edges) need to be newly created and which ones already exist in the Tgraph. This goes beyond a simple graph operation and requires use of the geometry of the faces. We have chosen not to do a full conversion to vectors to work out all the geometry, but instead we introduce a local representation of angles at a vertex allowing a simple equality test.

Integer Angles

All vertex angles are integer multiples of 1/10th turn (mod 10) so we use these integers for face internal angles and boundary external angles. The face adding process always adds to the right of a given directed edge (a,b) which must be a boundary directed edge. [Adding to the left of an edge (a,b) would mean that (b,a) will be the boundary direction and so we are really adding to the right of (b,a)]. Face adding looks to see if either of the two other edges already exist in the graph by considering the end points a and b to which the new face is to be added, and checking angles.
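A minimal sketch of this angle arithmetic (the representation and names here are assumptions for illustration): directions at a vertex are integers mod 10, turning by a face's internal angle is addition mod 10, and discovering an existing edge reduces to a simple integer equality test.

```haskell
-- Directions and angles at a vertex, in tenths of a full turn (mod 10).
type IntAngle = Int

-- Turn from the direction of one edge at a vertex by an internal angle
-- to get the direction in which another face edge should be sought.
turnBy :: IntAngle -> IntAngle -> IntAngle
turnBy dir angle = (dir + angle) `mod` 10

-- Two directions match exactly when the integers are equal,
-- avoiding any floating point comparison.
sameDirection :: IntAngle -> IntAngle -> Bool
sameDirection = (==)
```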

This allows an edge in a particular sought direction to be discovered. If it is not found it is assumed not to exist. However, this will be undermined if there are crossing boundaries. In this case there must be more than two boundary directed edges at the vertex and there is no unique external angle.

Establishing the no crossing boundaries property ensures these failures cannot occur. We can easily check this property for newly created graphs (with checkedTgraph) and the face adding operations cannot create crossing boundaries.

Touching Vertices and Crossing Boundaries

When a new face to be added on (a,b) has neither of the other two edges already in the graph, the third vertex needs to be created. However it could already exist in the Tgraph – it is not on an edge coming from a or b but from another non-local part of the Tgraph. We call this a touching vertex. If we simply added a new vertex without checking for a clash this would create a nonsense graph. However, if we do check and find an existing vertex, we still cannot add the face using this because it would create a crossing boundary.

Our version of forcing prevents face additions that would create a touching vertex/crossing boundary by calculating the positions of boundary vertices.

No conflicting edges

There is a final (simple) check when adding a new face, to prevent a long edge (phiEdge) sharing with a short edge. This can arise if we force an incorrect graph (as we will see later).

Implementing Forcing

Our order of forcing prioritises updates (face additions) which do not introduce a new vertex. Such safe updates are easy to recognise and they do not require a touching vertex check. Surprisingly, this pretty much removes the problem of touching vertices altogether.

As an illustration, consider foolDminus again on the left of figure 11. Adding the left dart onto edge (12,14) is not a safe addition (and would create a crossing boundary at 6). However, adding the right dart RD(6,14,11) is safe and creates the new edge (6,14) which then makes the left dart addition safe. In fact it takes some contrivance to come up with a Tgraph with an update that could fail the check during forcing when safe cases are always done first. Figure 17 shows such a contrived Tgraph formed by removing the faces shown in green from a twice decomposed sun on the left. The forced result is shown on the right. When there are no safe cases, we need to try an unsafe one. The four green faces at the bottom are blocked by the touching vertex check. This leaves any one of 9 half-kites at the centre which would pass the check. But after just one of these is added, the check is not needed again. There is always a safe addition to be done at each step until all the green faces are added.

Figure 17: A contrived example requiring a touching vertex check

Boundary information

The implementation of forcing has been made more efficient by calculating some boundary information in advance. This boundary information uses a type Boundary

data Boundary 
  = Boundary
    { bDedges     :: [Dedge]
    , bvFacesMap  :: Mapping Vertex [TileFace]
    , bvLocMap    :: Mapping Vertex (Point V2 Double)
    , allFaces    :: [TileFace]
    , allVertices :: [Vertex]
    , nextVertex  :: Vertex
    } deriving (Show)

This records the boundary directed edges (bDedges) plus a mapping of the boundary vertices to their incident faces (bvFacesMap) plus a mapping of the boundary vertices to their positions (bvLocMap). It also keeps track of all the faces and vertices. The boundary information is easily incremented for each face addition without being recalculated from scratch, and a final graph with all the new faces is easily recovered from the boundary information when there are no more updates.

makeBoundary  :: Tgraph -> Boundary
recoverGraph  :: Boundary -> Tgraph

The saving that comes from using boundaries lies in efficient incremental changes to boundary information and, of course, in avoiding the need to consider internal faces. As a further optimisation we keep track of updates in a mapping from boundary directed edges to updates, and supply a list of affected edges after an update so the update calculator (update generator) need only revise these. The boundary and mapping are combined in a force state.

type UpdateMap = Mapping Dedge Update
type UpdateGenerator = Boundary -> [Dedge] -> UpdateMap
data ForceState = ForceState
       { boundaryState:: Boundary
       , updateMap:: UpdateMap
       }

Forcing then involves using a specific update generator (allUGenerator) and initialising the state, then using the recursive forceAll which keeps doing updates until there are no more, before recovering the final graph.

force:: Tgraph -> Tgraph
force = forceWith allUGenerator

forceWith:: UpdateGenerator -> Tgraph -> Tgraph
forceWith uGen = recoverGraph . boundaryState . 
                 forceAll uGen . initForceState uGen

forceAll :: UpdateGenerator -> ForceState -> ForceState
initForceState :: UpdateGenerator -> Tgraph -> ForceState

In addition to force we can easily define

wholeTiles:: Tgraph -> Tgraph
wholeTiles = forceWith wholeTileUpdates 

which just uses the first forcing rule to make sure every half-tile has a matching other half.

We also have a version of force which counts to a specific number of face additions.

stepForce :: Int -> ForceState -> ForceState

This proved essential in uncovering problems of accumulated inaccuracy in calculating boundary positions (now fixed).

Some Other Experiments

Below we describe results of some experiments using the tools introduced above. Specifically: emplacements, sub-Tgraphs, incorrect tilings, and composition choices.


Emplacements

The finite set of rules used in forcing is based on local boundary vertex and edge information only. We may be able to improve on this by considering a composition and forcing at the next level up before decomposing and forcing again, which takes slightly broader local information into account. In fact we can iterate this process to all the higher levels of composition. Some graphs produce an empty graph when composed, so we can regard those as maximal compositions. For example, composeG fool produces an empty graph.

The idea now is to take an arbitrary graph and apply (composeG . force) repeatedly to find its maximally composed graph, then to force the maximal graph before applying (force . decomposeG) repeatedly back down to the starting level (so the same number of decompositions as compositions).

We call the function emplace, and call the result the emplacement of the starting graph as it shows a region of influence around the starting graph.

With earlier versions of forcing when we had fewer rules, emplace g often extended force g for a Tgraph g. This allowed the identification of some new rules. Since adding the new rules we have not yet found graphs with different results from force and emplace. [Update: We now have an example where force includes more than emplace].


Sub-Tgraphs

In figure 18 on the left we have a four times decomposed dart dartD4, followed by two sub-Tgraphs brokenDart and badlyBrokenDart which are constructed by removing faces from dartD4 (but retaining the connectedness condition and the no crossing boundaries condition). These all produce the same forced result (depicted middle row left in figure 15).

Figure 18: dartD4, brokenDart, badlyBrokenDart

However, if we do compositions without forcing first we find badlyBrokenDart fails because it produces a graph with crossing boundaries after 3 compositions. So composeG on its own is not always safe, where safe means guaranteed to produce a valid Tgraph from a valid correct Tgraph.

In other experiments we tried force on Tgraphs with holes and on incomplete boundaries around a potential hole. For example, we have taken the boundary faces of a forced, 5 times decomposed dart, then removed a few more faces to make a gap (which is still a valid Tgraph). This is shown at the top in figure 19. The result of forcing reconstructs the complete original forced graph. The bottom figure shows an intermediate stage after 2200 face additions. The gap cannot be closed off to make a hole as this would create a crossing boundary, but the channel does get filled and eventually closes the gap without creating a hole.

Figure 19: Forcing boundary faces with a gap (after 2200 steps)

Incorrect Tilings

When we say a Tgraph g is a correct graph (respectively: incorrect graph), we mean g represents a correct tiling (respectively: incorrect tiling). A simple example of an incorrect graph is a kite with a dart on each side (called a mistake by Penrose) shown on the left of figure 20.

*Main> mistake
Tgraph { vertices = [1,2,4,3,5,6,7,8]
       , faces = [RK (1,2,4),LK (1,3,2),RD (3,1,5)
                 ,LD (4,6,1),LD (3,5,7),RD (4,8,6)]
       }

If we try to force (or emplace) this graph it produces an error in construction which is detected by the test for conflicting edge types (a phiEdge sharing with a non-phiEdge).

*Main> force mistake
Tgraph {vertices = *** Exception: doUpdate:(incorrect tiling)
Conflicting new face RK (11,1,6)
with neighbouring faces
[RK (9,1,11),LK (9,5,1),RK (1,2,4),LK (1,3,2),RD (3,1,5),LD (4,6,1),RD (4,8,6)]
in boundary
Boundary ...

In figure 20 on the right, we see that after successfully constructing the two whole kites on the top dart short edges, there is an attempt to add an RK on edge (1,6). The process finds an existing edge (1,11) in the correct direction for one of the new edges so tries to add the erroneous RK (11,1,6) which fails a noConflicts test.

Figure 20: An incorrect graph (mistake), and the point at which force mistake fails

So it is certainly true that incorrect graphs may fail on forcing, but forcing cannot create an incorrect graph from a correct graph.

If we apply decomposeG to mistake it produces another incorrect graph (which is similarly detected if we apply force), but will nevertheless still compose back to mistake if we do not try to force.

Interestingly, though, the incorrectness of a graph is not always preserved by decomposeG. If we start with mistake1, which is mistake with just two of the half darts (and also an incorrect tiling), we still get a similar failure on forcing, but decomposeG mistake1 is no longer incorrect. If we apply composeG to the result (or force it and then apply composeG), the mistake is thrown away, leaving just a kite (see figure 21). This is an example where composeG is not a left inverse to either decomposeG or (force . decomposeG).

Figure 21: mistake1 with its decomposition, forced decomposition, and recomposed.

Composing with Choices

We know that unknowns indicate possible choices (although some choices may lead to incorrect graphs). As an experiment we introduce

makeChoices :: Tgraph -> [Tgraph]

which produces 2^n alternatives for the 2 choices of each of n unknowns (prior to composing). This uses forceLDB which forces an unknown to be a largeDartBase by adding an appropriate joined half dart at the node, and forceLKC which forces an unknown to be a largeKiteCentre by adding a half dart and a whole kite at the node (making up the 3 pieces for a larger half kite).
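The 2^n expansion can be sketched abstractly by recording the decision at each unknown rather than building graphs (Choice and choiceLists here are illustrative stand-ins for applying forceLDB/forceLKC to a Tgraph):

```haskell
type Vertex = Int

-- Record of the decision taken at an unknown vertex.
data Choice = LDB Vertex  -- forced to be a largeDartBase
            | LKC Vertex  -- forced to be a largeKiteCentre
            deriving (Eq, Show)

-- For n unknown vertices, produce all 2^n decision lists.
choiceLists :: [Vertex] -> [[Choice]]
choiceLists = foldr addChoice [[]]
  where addChoice v css = concatMap (\cs -> [LDB v : cs, LKC v : cs]) css
```

For fool, with unknowns [4,2], this yields the four alternatives shown in figure 22.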

Figure 22 illustrates the four choices for composing fool this way. The top row has the four choices of makeChoices fool (with the fool shown embedded in red in each case). The bottom row shows the result of applying composeG to each choice.

Figure 22: makeChoices fool (top row) and composeG of each choice (bottom row)

In this case, all four compositions are correct tilings. The problem is that, in general, some of the choices may lead to incorrect tilings. More specifically, a choice of one unknown can determine what other unknowns have to become with constraints such as

  • a and b have to be opposite choices
  • a and b have to be the same choice
  • a and b cannot both be largeKiteCentres
  • a and b cannot both be largeDartBases

This analysis of constraints on unknowns is not trivial. The potentially exponential number of results from choices suggests we should compose and force as much as possible and only consider unknowns of a maximal graph.

For calculating the emplacement of a graph, we first find the forced maximal graph before decomposing. We could also consider using makeChoices at this top step when there are unknowns, i.e. a version of emplace which produces these alternative results (emplaceChoices).

The result of emplaceChoices is illustrated for foolD in figure 23. The first force and composition is unique, producing the fool level, at which point we get 4 alternatives, each of which composes further as previously illustrated in figure 22. Each of these is forced, then decomposed and forced, decomposed and forced again back down to the starting level. In figure 23 foolD is overlaid on the 4 alternative results. What they have in common is (as you might expect) emplace foolD, which equals force foolD and is the graph shown on the right of figure 11.

Figure 23: emplaceChoices foolD

Future Work

I am collaborating with Stephen Huggett who suggested the use of graphs for exploring properties of the tilings. We now have some tools to experiment with but we would also like to complete some formalisation and proofs. For example, we do not know if force g always produces the same result as emplace g. [Update (August 2022): We now have an example where force g strictly includes emplace g].

It would also be good to establish that g is incorrect iff force g fails.

We have other conjectures relating to subgraph ordering of Tgraphs and Galois connections to explore.

by readerunner at November 22, 2022 10:43 AM

Tweag I/O

WebAssembly backend merged into GHC

Tweag has been working on a GHC WebAssembly backend for some time. Recently, the WebAssembly backend merge request has landed in GHC, and is on course to appear in the upcoming 9.6 release series. This post will give a quick demonstration of how to try it out locally, and explain what comes in this patch and what will be coming next.

Playing with WASM locally

If you’re using nix on x86_64-linux, compiling a Haskell program to a self-contained wasm module is as simple as:

$ nix shell
$ echo 'main = putStrLn "hello world"' > hello.hs
$ wasm32-wasi-ghc hello.hs -o hello.wasm
[1 of 2] Compiling Main             ( hello.hs, hello.o )
[2 of 2] Linking hello.wasm
$ wasmtime ./hello.wasm
hello world

There’s also a non-nix installation script. Check the ghc-wasm-meta repo’s README for details.

What’s interesting about the example above? It doesn’t need any companion JavaScript code, and runs on a variety of wasm engines that support wasi, including but not limited to: wasmtime, wasmedge, wasmer and wasm3. Compared to the legacy asterius project, there are also a few other serious benefits:

  • The killer feature is being able to use GHC’s own RTS code for garbage collection and other runtime functionality. The GHC RTS is way more robust, feature-complete and performant than asterius’s legacy JavaScript runtime. Lots of Haskell features that never worked in asterius (e.g. STM or profiling) now work out of the box.
  • It has proper support for compiling and linking C/C++ code. Terms and conditions apply here, but there’s still a high chance the cbits in your packages will work out of the box.
  • Since it uses LLVM for linking, the linking step is orders of magnitude faster than asterius, which uses a custom object format and linking logic.
  • GHC CI tests a program that uses the GHC API to parse a Haskell module. ghc is a big package and depends on everything in the boot libraries, so even having only a part of GHC frontend working in pure wasm is already pretty cool, and it certainly provides more assurance than a simple “hello world”. asterius never had ghc in its boot libraries.

What is in this merge request

The GHC wasm backend merge request’s commit history is carefully structured to contain mostly small and easy to review patches. The changeset can be roughly grouped into:

  • Enhancing the build system, making it aware of the wasm32-wasi target, and avoiding compiling stuff not supported on that target
  • Avoiding the usage of POSIX features not supported on wasm32-wasi – various places need to be patched, like the RTS, base or unix
  • Doing various other RTS fixes, for issues that didn’t break other GHC targets by pure luck
  • Enhancing the GHC driver with certain wasm-specific logic – most of the time due to the need to work around some upstream issues in LLVM
  • Modeling the wasm structured control flow, and implementing the algorithm to translate arbitrary Cmm control flow graphs to it – this part of the work was done by my colleague Norman Ramsey, and well explained in his ICFP 2022 paper
  • Implementing the wasm native code generator (NCG), which translates Cmm to assembly code – unlike NCGs for other targets, the wasm NCG uses a dependently-typed IR to preserve type safety of the wasm value stack, and this has proved to be helpful in catching some errors early on when writing the NCG
  • Serving the binary distributions as CI artifacts, and there’s already some basic testing

GHC is a rapidly evolving project, and merging the wasm backend does not make it immune to potential future breakages. For me, it’s not just an honor to implement wasm support, but also a personal commitment to maintain it, prevent bit-rotting, and make sure that the bus factor of this work goes beyond 1 in the future. This is made possible by Tweag’s long term support.

What comes next

JavaScript FFI

asterius had a rich JavaScript FFI implementation, allowing one to import JavaScript functions into Haskell, pass arbitrary JavaScript values as first-class Haskell values, and export Haskell functions to be called by JavaScript. Furthermore, the JavaScript async functions worked naturally with the Haskell threading system, so that when a Haskell thread is blocked on an async JavaScript call, the runtime executes other threads instead of blocking completely.

This is the first of asterius's main features that I plan to port to GHC's wasm backend. You don't pay for JavaScript if you don't use it. We've already gained good experience with wasm/js interoperability, but this time I will need to do non-trivial refactorings in the GHC RTS storage manager and scheduler to achieve the same. So this will take some time and may not make it into GHC 9.6.1.

Template Haskell

asterius had limited support for Template Haskell. Template Haskell requires dynamically linking Haskell code, but how dynamic linking is supposed to work in wasm is still unclear, so asterius cheated by doing static linking each time a TH splice was evaluated. Since the runtime heap state isn’t preserved between splice evals, when the TH splices are stateful, this approach won’t work, but it’s been proven to work surprisingly well for a lot of TH splices in the wild.

I plan to add Template Haskell support for GHC’s wasm backend in a similar way. Pure TH splices (e.g. generating optics for datatypes) are likely to work, and work much faster than asterius thanks to the much improved linking performance. But splices with side effects (e.g. gitrev that needs to spawn a git subprocess), may not work if the side effect isn’t a supported WASI operation.
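To make the distinction concrete, here is a minimal sketch (my own illustration; the splices are hypothetical examples, not taken from asterius): a pure splice only constructs syntax, while an effectful one uses runIO to perform IO at compile time, which under wasm/WASI only works if the effect is a supported operation.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH (Exp (LitE), Lit (IntegerL), runIO)
import System.IO (hPutStrLn, stderr)

-- A pure splice: it only builds syntax, so evaluating it
-- needs nothing beyond the compiler's own heap.
answer :: Int
answer = $(pure (LitE (IntegerL 42)))

-- An effectful splice: runIO performs IO while the module is
-- being compiled (here, printing to stderr). On a wasm/WASI
-- host this works only if the IO operation is supported there.
traced :: Int
traced = $(runIO (hPutStrLn stderr "evaluated during compilation")
             >> pure (LitE (IntegerL 1)))

main :: IO ()
main = print (answer + traced)  -- prints 43
```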

Since implementing proper dynamic linking isn’t planned yet, ghci wouldn’t work in GHC’s wasm backend in the near future.

More things to come

There are also other things planned in addition to the above features, including but not limited to:

  • Using the GHC issue tracker for bugfixes/feature planning and discussions, for better transparency of my work
  • Running the full GHC testsuite and nofib benchmarks
  • Supporting cross-compiling to wasm from more host systems
  • Wasm-related patches to common Hackage dependencies, or a Hackage overlay for wasm

November 22, 2022 12:00 AM

November 21, 2022

Michael Snoyman

Seeking new Stackage Curator

The Stackage Curator team is responsible for ongoing maintenance tasks for Stackage: creating builds, adding manual bounds, merging pull requests, and more. The responsibilities and general workflow are described in detail in the curators document, but in short:

  • There are a total of 8 curators
  • Each curator takes a one-week slot in rotation
  • During that week, the curator reviews incoming PRs, ensures Stackage Nightly builds, and puts out an LTS release

I am planning on stepping down from my position as one of the Stackage Curators. With personal and work responsibilities, I simply don't have the time to dedicate to my curator responsibilities. What time I do have available I intend to devote instead to higher level topics, such as toolchain fixes.

And thus this blog post: I'm putting out a call for a new Stackage Curator to join the team. As a Stackage Curator, you're providing a valuable service to the entire Haskell community by helping keep builds running and packages moving forward. You'll also have more impact on deciding when Stackage steps forward to new versions of GHC and other dependencies.

If you're interested in joining the curator team, please fill out this form.

Thank you

Now seems as good a time as any to say this. I want to express a huge thank you to the entire Haskell community that has been part of Stackage, and in particular to the Stackage Curator team. By raw number of contributors (742 at time of writing), it is the most active project I've ever started. And I never could have kept it running without the rest of the curator team to shoulder the burden. Adam, Alexey, Chris, Dan, Jens, Joe, and Mihai: it's been a pleasure being a curator with you. Thank you for everything, and I'm looking forward to continued involvement on my reduced schedule.

November 21, 2022 12:00 AM

November 19, 2022

Stackage Blog

LTS 20 release for ghc-9.2 and Nightly now on ghc-9.4

Stackage LTS 20 has been released

The Stackage team is very happy to announce that the first Stackage LTS version 20 snapshot has been released this week, based on GHC stable version 9.2.5.

LTS 20 includes many package changes, and is the first LTS release with over 3000 packages!! Thank you for all the nightly contributions that made this possible.

If your package is missing from LTS 20 but builds there, you can easily request to have it added using our straightforward process: just open a GitHub issue in the lts-haskell project and follow the steps in the template.

Stackage Nightly updated to ghc-9.4.3

At the same time we are also excited to have moved Stackage Nightly to GHC 9.4.3 now!

Almost 500 Nightly packages had to be disabled as part of the upgrade to 9.4. Please help to update your packages to build with ghc-9.4 and get them back into Stackage Nightly, thank you!

Big thank you to the community for all your help and support, and do keep the contributions coming!

(Note for Linux users of older glibc < 2.32: at the time of writing stack setups for ghc-9.4 default to the fedora33 bindist which uses glibc-2.32. Some possible workarounds are mentioned in this issue though the Stackage team has not verified the suggestions.)

November 19, 2022 02:00 PM

November 18, 2022

Well-Typed

Funding GHC, Cabal and HLS maintenance

tl;dr Please get in touch if you can help fund development of the core Haskell tools.

Ever since it was founded in 2008, Well-Typed has supported the development and maintenance of the Glasgow Haskell Compiler (GHC) as an open-source project, supplying expert Haskellers to work on essential tasks such as triaging and diagnosing bugs, improving performance, and managing releases. More recently we have expanded our work to include maintenance of the Cabal build tool and the Haskell Language Server (HLS).

We would love to be able to spend more engineering time improving GHC, Cabal and HLS, but we need funding. If your company uses Haskell and might be able to contribute to the future of Haskell development, please contact us!

For many years work on GHC was funded by Microsoft Research. It is currently supported by GitHub via the Haskell Foundation, IOG, and a small number of other commercial sponsors. In addition, Cabal maintenance is supported by IOG, and recently the HLS Open Collective has begun supporting HLS release management. We are very grateful to the sponsors for making our work possible.

Today, the GHC/HLS maintenance team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire. Cabal maintenance is undertaken by Mikolaj Konarski.

We post regular activity reports from the GHC team to give an idea of the kind of work being undertaken. Besides regular maintenance work, we have recently been collaborating with Hasura on debugging and developer tooling and working on improving HLS performance on behalf of Mercury. We have previously implemented major features for clients, such as the nonmoving garbage collector.

We can offer:

  • Significantly reduced rates for sponsoring the work of the GHC, Cabal and HLS teams

  • Development of specific features or bug fixes

  • Expert support with use of Haskell development tools at your company (e.g. reducing build times or improving the developer experience for your engineers)

Of course, the GHC development community is much bigger than one company. Our approach has always been to support the fantastic volunteers who work on GHC, so the maintenance fund primarily covers activities for which recruiting volunteers is difficult. Implementing new language features is sometimes feasible as an academic research project or fun to do as a hobby, but fixing old bugs is less so!

The part our team plays is clearly recognised by core GHC developers:

I really appreciate the skill, collegiality, and effectiveness of the team at Well Typed.

Simon Peyton Jones

As a wishing-I-were-more-frequent GHC contributor, I just want to say how much I appreciate the work this team is doing. Over the past year or so (maybe a little longer?), this team has expanded significantly and has become more systematized. The effect is simply wonderful. I no longer worry that GHC tickets get lost, and the responsiveness is excellent.

Richard Eisenberg

If you might be able to help fund this important work, why not get in touch today?

by adam at November 18, 2022 12:00 AM

November 17, 2022

Tweag I/O

JupyterWith Next

JupyterWith has been around for several years with growing popularity. Over the years, we found that researchers struggled with the Nix language and jupyterWith API. Since researchers are our primary target audience, we decided to improve the usability of jupyterWith.

Today, we are proud to announce the release of a new version! The new simplified API makes jupyterWith easier to use and provides more options for creating kernels.

What is jupyterWith?

JupyterLab is a web-based interactive development environment for notebooks, code, and data. These notebooks can be shared with other users, and the code they contain can be rerun, providing repeatability.

The Jupyter ecosystem allows users to produce and repeat research and results, but it falls short of facilitating reproducible results. Repeatable and reproducible may sound the same, but there is a meaningful difference: reproducibility guarantees that our code and results will be exactly the same, while repeatability does not.

While many Jupyter kernels are available as Python packages, just as many are not (e.g. Haskell and Julia). Projects such as PDM and JupyterLab Requirements can create reproducible environments but are restricted to Python kernels.

jupyterWith was announced in early 2019 and provides a Nix-based framework for declarative and reproducible JupyterLab environments with configurable kernels. It actively supports over a dozen kernels and provides example setups and notebooks for users to try out. jupyterWith can create entirely reproducible JupyterLab environments for any kernel.

Why jupyterWith?

If you can run an experiment multiple times in the same environment and get to the same conclusion, you have repeatability. In our case, running the same code on the same machine should give the same outputs. Consider what would happen if you handed off your code to another user and they ran it on their system. Different operating systems or different versions of the same operating system may fetch different versions of the same package. Fetching the same package at different times may not return the same version due to patch or security updates. If you can guarantee the same outputs given all that has changed, then you have reproducibility.

With repeatability, we cannot guarantee that the packages and dependencies of our code will remain constant. Using jupyterWith we can give that guarantee and ensure that on any system, run by any user, and given identical inputs, the code will produce identical outputs. This guarantee is what makes our code and therefore our research reproducible.

What is new?

This release focuses on helping users quickly and easily get their project started, and making it easier to extend kernels to fit their needs.

New templates

The new version of jupyterWith provides new kernel templates which make it easier for users to bootstrap their project using Nix flakes. They are small, easily digestible, and ready to be customized.

Better Python kernels

It used to be difficult to select particular Python packages because we were tied to nixpkgs. jupyterWith now uses Poetry and poetry2nix to install kernels that are packaged with Python and their dependencies. Poetry allows users to easily select the desired version of a package and can resolve dependencies. poetry2nix greatly simplifies the kernel files, which helps with readability and maintainability.

Better kernel definition interface

Finally, we have simplified and standardized the interfaces for kernel files. This makes it easier for users to implement completely new kernels.

Getting Started

The following code will initialize a new project directory with a flake template from the jupyterWith repository and start the JupyterLab environment. With a renewed focus on user ease, this is all that is necessary to get started.

$ mkdir my-project
$ cd my-project
$ nix flake init --template github:tweag/jupyterWith
$ nix run

Each kernel provided will generally only have the standard libraries and packages available, but a README provided with the template includes instructions on extending existing kernels, creating a custom kernel, and installing extensions.


If you have used jupyterWith in the past, you are probably used to seeing kernel files like the ipython kernel example below. The version of Python used and the packages available to the kernel can be set using the python3 and packages attributes respectively.

Old interface

  iPython = iPythonWith {
    # Identifier that will appear on the Jupyter interface.
    name = "nixpkgs";
    # Libraries to be available to the kernel.
    packages = p: with p; [ numpy pandas ];
    # Optional definition of `python3` to be used.
    # Useful for overlaying packages.
    python3 = pkgs.python3Packages;
    # Optional flag; set to true to ignore file collisions inside the package environment.
    ignoreCollisions = false;
  };

The new interface is similar, but there are a few key differences. All kernels are provided through availableKernels, and kernels are named by language rather than by kernel project name: where there used to be iPythonWith and iHaskellWith, there are now availableKernels.python and availableKernels.haskell. The version of Python used is passed through the python attribute, and additional packages are provided with the extraPackages attribute. There is one new attribute, editablePackageSources, which is used by poetry2nix to add packages to the environment in editable mode.

New interface!

availableKernels.python.override {
  name = "python-with-numpy"; # must be unique
  displayName = "python with numpy"; # name that appears in JupyterLab Web UI
  python = pkgs.python3;
  extraPackages = ps: [ ps.numpy ];
  editablePackageSources = {};
}

Both of these are still subject to the package versions available in nixpkgs. However, with Poetry, we can create a completely custom kernel with a pyproject.toml file and specify exactly which package versions we want. The full details are available in the How To and Tutorials sections of the documentation.


Usability has been improved, but there is much more to do. The next major items on the roadmap include:

  • Updating and improving the flake templates.
  • Updating and improving documentation on configuring existing kernels and packaging new kernels.
  • Providing better MacOS support.
  • Adding new and improving existing kernels.
  • Creating a website indexing kernels that can be used and configured with jupyterWith.

Join us in contributing to the project. You can find the repository here.

November 17, 2022 12:00 AM

November 15, 2022

Tweag I/O

Staged programming with typeclasses

Staged programming consists of evaluating parts of a program at compile time for greater efficiency at runtime, as some computations would have already been executed or made more efficient during compilation. The poster child for staged programming is the exponential function: to compute a^b, if b is known at compile time, a^b can be replaced by b explicit multiplications. Staged programming allows you to write a^5, but have the expression compile to a*a*a*a*a.

In Haskell, the traditional way to do staged programming is to reach for Template Haskell. Template Haskell is, after all, designed for this purpose and gives you strong guarantees that the produced code is indeed a*a*a*a*a, as desired. On the other hand it does feel a little heavyweight and programmers, in practice, tend to avoid exposing Template Haskell in their interfaces.

In this blog post, I want to present another way to do staged programming that is more lightweight, and feels more like a native Haskell solution, but, in exchange, offers fewer guarantees. At its core, what is needed for staged programming is to distinguish between what is statically known and what is dynamically known. In Template Haskell, static and dynamic information is classified by whether an expression is within a quotation or not. But there is another way to signal statically-known information in Haskell: types.

This is what we are going to do in this blog post: passing statically-known arguments at the type level. I’ve used this technique in linear-base.

Natural numbers at the type level

Haskell offers a native kind Nat of type-level natural numbers. We could pass the (statically known) exponent as a Nat (in fact, we eventually will), but it is difficult to consume numbers of kind Nat because GHC doesn’t know enough about them (for instance, GHC doesn’t know that n+1 is equivalent to 1+n).

Instead, we will use an inductive encoding of the natural numbers: the Peano encoding.

data Peano
  = Z         -- zero
  | S Peano   -- successor of another peano number

In this encoding, 3 is written S (S (S Z)).
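As a quick term-level sanity check (my own addition, with a hypothetical helper toInt), a Peano number converts back to an ordinary Int by counting its S constructors:

```haskell
module Main where

data Peano
  = Z         -- zero
  | S Peano   -- successor of another peano number

-- Count the S constructors to recover an ordinary Int.
toInt :: Peano -> Int
toInt Z     = 0
toInt (S n) = 1 + toInt n

main :: IO ()
main = print (toInt (S (S (S Z))))  -- prints 3
```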

Normally, Peano would live at the type level, and both Z and S would live at the term level (they’re data constructors after all). But thanks to the DataKinds extension – which allows data constructors to be promoted to types – we can also use Peano as the kind of type-level Z and S.

Now let’s return to the power function. We will first create a typeclass RecurseOnPeano, that will contain the power function (and that could host any other recursive metaprogramming function that operates on Peanos):

class RecurseOnPeano (n :: Peano) where
  power :: Int -> Int

The power function only needs one term-level parameter: the number that will be multiplied by itself n times. Indeed, the exponent is already “supplied” as a type-level parameter n. In fact, the signature of the power function outside the typeclass would be:

power :: forall (n :: Peano). RecurseOnPeano n => Int -> Int

At a call site, the type-level parameter n will be supplied to the function through a type application, using the dedicated @ symbol (e.g. power @(S (S Z)) 4). It isn’t possible to omit the type parameter n at a call site because there is no way for GHC to deduce it from the type of a term-level parameter of the function. So we need to enable the AllowAmbiguousTypes extension.

The implementation of the power function will be defined through two instances of the RecurseOnPeano typeclass – one for the base case (n = Z), and one for the recursive case (n = S n') – as one would do in a term-level recursive function.

The first instance is relatively straightforward as x^0 = 1 for every positive integer x:

instance RecurseOnPeano Z where
  power _ = 1

For the second instance we want to write power @(S n) x = x * power @n x. But to use power @n x, n needs to fulfill the RecurseOnPeano constraint too. In the end, that yields:

instance RecurseOnPeano n => RecurseOnPeano (S n) where
  power x = x * power @n x

We now have a first working example:

-- <<<<<<<<<<<<< file CompileRecurse.hs >>>>>>>>>>>>>

{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ScopedTypeVariables  #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
module CompileRecurse where
import GHC.TypeLits

data Peano = Z | S Peano

class RecurseOnPeano (n :: Peano) where
  power :: Int -> Int

instance RecurseOnPeano Z where
  power _ = 1
  {-# INLINE power #-}
instance RecurseOnPeano n => RecurseOnPeano (S n) where
  power x = x * power @n x
  {-# INLINE power #-}

-- <<<<<<<<<<<<< file Main.hs >>>>>>>>>>>>>

{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
module Main where
import CompileRecurse

main :: IO ()
main = print $ power @(S (S (S Z))) 2  -- this should print 8

Many language extensions are required for this example to work:

  • KindSignatures permits the syntax (n :: Peano) to restrict the RecurseOnPeano class to types of the Peano kind.
  • TypeApplications gives the @type syntax to supply type-level parameters.
  • DataKinds allows us to promote the Peano data type to the kind level.
  • ScopedTypeVariables is needed to be able to refer to n in the body of power in the second instance of RecurseOnPeano.
  • AllowAmbiguousTypes is needed when we declare a typeclass function in which the term-level parameters (if there are any) are not sufficient to infer the type-level parameters (and thus require an explicit type application at the call site).

I also added {-# INLINE #-} pragmas on the power implementations, because we indeed want GHC to inline these to achieve our initial goal. For such a simple example, GHC would inline them by default, but it’s better to be explicit about our intent here.

You can now validate that the power @(S (S (S Z))) 2 encoding for 2^3 indeed prints 8 on the terminal.

From Peano type-level integers to GHC Nats

Writing S (S (S Z)) is not very convenient. We would definitely prefer to write 3 instead. And that is possible, if we allow a bit more complexity in our code.

Number literals, such as 3, when used at the type level are of kind Nat from GHC.TypeLits.
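As an aside (my own sketch, not part of the post), GHC.TypeLits also lets you recover a type-level Nat at the term level, via the KnownNat class and natVal; note that the same AllowAmbiguousTypes situation as before arises, since n appears only in the constraint:

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
module Main where

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Recover the type-level n as an ordinary Integer.
natAsInteger :: forall (n :: Nat). KnownNat n => Integer
natAsInteger = natVal (Proxy @n)

main :: IO ()
main = print (natAsInteger @3)  -- prints 3
```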

Unfortunately, if we completely replace our home-made Peanos with GHC Nats, we will run into some issues of overlapping instances in the RecurseOnPeano typeclass.1

A solution can be found by using the {-# OVERLAPPING #-} and {-# OVERLAPPABLE #-} pragmas, but it is quite fragile: instance selection is no longer driven by types or structure but rather by a manual override. And the rules for such an override are rather complex, especially when more than two instances are involved; in the case at hand, we might want to add a third instance with a specific implementation for n = 1.
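To see the problem concretely, here is a sketch (the class and method names are my own, hypothetical ones) of a direct Nat-based attempt. GHC accepts both instance declarations, but the second head matches every n, including 0, so any use site would be faced with overlapping instances:

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
module Main where

import GHC.TypeLits (Nat)

class RecurseOnNat (n :: Nat) where
  powerNat :: Int -> Int

-- Base case for the literal 0...
instance RecurseOnNat 0 where
  powerNat _ = 1

-- ...but this head also matches n = 0, so at any use site GHC
-- sees two matching instances and reports an overlap.
instance RecurseOnNat n where
  powerNat _ = 0  -- recursive body elided; the overlapping heads are the point

note :: String
note = "both instances are accepted; the overlap error appears at use sites"

main :: IO ()
main = putStrLn note
```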

Instead, we will add a type family (that is, a function from types to types) to convert from Nats to Peanos, and add an auxiliary function power' that will take a type-level Nat instead of a type-level Peano:

-- <<<<<<<<<<<<< add to file CompileRecurse.hs >>>>>>>>>>>>>

{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE FlexibleContexts #-}

type family NatToPeano n where
  NatToPeano 0 = Z
  NatToPeano n = S (NatToPeano (n - 1))

-- 'RecurseOnPeano (NatToPeano n) =>' means that the ¨Peano equivalent of n
-- must be an instance of RecurseOnPeano to get access to 'power'
power' :: forall (n :: Nat). (RecurseOnPeano (NatToPeano n)) => Int -> Int
power' = power @(NatToPeano n)

-- <<<<<<<<<<<<< change in file Main.hs >>>>>>>>>>>>>

main = print $ power' @3 2  -- this should still print 8

Our function is still working as expected, and is now more convenient to use!

A look under the hood

Our initial goal was to unroll the power' function at compile time. Let’s check whether this promise holds.

We will create a new test file test/CompileRecurseTests.hs and set specific GHC options so that we can take a look at the generated Core2 code for our project:

{-# OPTIONS_GHC -O -ddump-simpl -dsuppress-all -dsuppress-uniques -ddump-to-file #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
module Main where

import CompileRecurse

myFunc :: Int -> Int
myFunc x = power' @3 x + 1

main :: IO ()
main = return ()

The following GHC flags are used:

  • -O enables optimizations in GHC.
  • -ddump-simpl requests the Core code after the output of the simplifier.
  • -dsuppress-all and -dsuppress-uniques reduce the verbosity of the output (otherwise, searching for a specific piece of code would become very tedious).
  • Finally, -ddump-to-file asks for the output to be written to a file in the build directory.

With the above options, compiling and running the test suite creates a file CompileRecurseTests.dump-simpl deep down in the build tree.3 If we ignore all the lines about $trModule, we get:

-- RHS size: {terms: 12, types: 3, coercions: 0, joins: 0/0}
myFunc
  = \ x -> case x of { I# x1 -> I# (+# (*# x1 (*# x1 x1)) 1#) }

I# is the “boxing” constructor for integers, that is, the one taking an unboxed integer (Int#) and creating a Haskell Int (an integer behind a pointer). +# and *# are the equivalent of arithmetic functions + and * for unboxed integers Int#.

We can see that myFunc

  • takes an Int,
  • unboxes its value,
  • performs the two multiplications corresponding to the inlined power' @3 x,
  • adds 1, and finally,
  • boxes the result once again to produce an Int.

There is no mention of power' here, so the function has been successfully inlined!

Inspection testing

Checking manually whether or not the inlining has happened – by looking through the .dump-simpl file after every change – is really impractical. Instead, it is possible to use the inspection-testing and tasty-inspection-testing libraries to automate such a process.

To do this, we simply need to introduce a function myFunc' – corresponding to what we expect to be the optimized and inlined form of myFunc – and then we check that both myFunc and myFunc' result in the same generated Core code by using the specific === comparison operator (and a little bit of Template Haskell too):

{-# OPTIONS_GHC -O -dno-suppress-type-signatures -fplugin=Test.Tasty.Inspection.Plugin #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TemplateHaskell #-}

module Main where

import Test.Tasty
import Test.Tasty.Inspection
import CompileRecurse

myFunc :: Int -> Int
myFunc x = power' @3 x + 1

myFunc' :: Int -> Int
myFunc' x = x * (x * x) + 1

main :: IO ()
main = defaultMain . testGroup "Inspection testing of power'" $
  [ $(inspectTest $ 'myFunc === 'myFunc') ]

Running the test suite gives:

Inspection testing of power'
  myFunc === myFunc': OK

All 1 tests passed (0.01s)

If both functions didn’t result in the same generated Core code – e.g. if we wrote (x * x) * x + 1 instead of x * (x * x) + 1 in myFunc' – we would get:

Inspection testing of power'
  myFunc === myFunc': FAIL
        [ ... ]
          = \ (x [Dmd=<S,1*U(U)>] :: Int) ->
              case x of { I# x1 -> I# (+# (*# x1 (*# x1 x1)) 1#) }
        [ ... ]
          = \ (x [Dmd=<S,1*U(U)>] :: Int) ->
              case x of { I# x -> I# (+# (*# (*# x x) x) 1#) }

1 out of 1 tests failed (0.01s)
typeclass-blogpost> Test suite inspection-tests failed

In this way, the correct inlining of power' can be checked automatically after each change to the codebase!


This was a brief introduction to staged programming in Haskell, leveraging the type (and typeclass) system as a lightweight alternative to Template Haskell. The technique detailed in this article has been implemented in real-world contexts to create variadic functions like printf, and I hope that you will find many other useful applications for it!

I would like to give a special thank you to Arnaud Spiwack who both taught me this technique in the first place, and then helped me to greatly improve this blog post.

  1. In short, this is because GHC can’t distinguish between the base and recursive instances with Nats as easily as it can with Peanos.
  2. Core is the main intermediate language used inside GHC.
  3. In my case, the full path was: .stack-work/dist/x86_64-linux-nix/Cabal-

November 15, 2022 12:00 AM

November 14, 2022

Michael Snoyman

Why my video calls sucked (and how I fixed it)

A few years ago, I wrote a blog post about how I set up the networking in our house following some major construction. I was really excited about how much better the internet would be. No more WiFi dead zones, and a wired connection to my computer to provide extra speed and stability to my office. Overall, the results were great. Speed tests anywhere in my house showed I was getting the full 500mbps promised by my cable company. However, not everything was working as expected:

  • Video calls stuttered, a lot. The most egregious and confusing behavior was that, during a call, I would continue to see the video moving while people went silent for between 10 and 20 seconds.
  • My kids complained off-and-on about problems in online games (Minecraft in this case), and had trouble talking with friends over Discord audio chat.

If you're looking for a summary "try this if you're having trouble," here are my three recommendations:

  • Diagnosis: try tethering to your smartphone instead of using the internet in your house and see if the behavior is better. You'll almost certainly have slower speeds, but video calls and gaming may be more consistent.
  • Stopgap measure: try running a VPN of some kind and see if that improves the situation. One possibility is trying out Cloudflare Warp. This helped significantly for me, but wasn't perfect.
  • Real fix for the underlying problem: buy a new router, connect it to the modem/router from your Internet Service Provider (ISP), and put the modem into bridge mode.

The rest of this blog post will try to explain what the problem is. We're going to get into the technical details, but I'm hoping the content will make sense to anyone with basic experience on the internet, not just networking engineers.

Finally, it's worth calling out two coworkers for their involvement in this story. First is Niklas Hambüchen, who years ago warned me of the perils of ISP-provided routers. I should have listened to him then. The second is Neil Mayhew, who not only helped me debug this along the way, but also accidentally gave me the clue I needed to isolate the problem.

Analyzing the problem

If you're suffering from the problems I describe above, it can be incredibly frustrating. Not only do video calls turn into a source of endless pain and miscommunication, but no one will believe you. Call the ISP, and they'll tell you your speed tests are fine. Same with hardware manufacturers, operating systems, and the video software itself in most cases. Nothing sees the problem. You know something is broken, but you're essentially told you're crazy.

The big hint to me that something more complicated was happening under the surface was which things worked well and which didn't. Watch some videos online? No problem at all. Browse websites? Fine. Massive downloads (pretty common in my line of work)? Incredibly fast. The fact that calls and gaming were broken was the first indication something was weird.

The final puzzle piece hit a few weeks ago. The aforementioned Neil had told me for a while how great VR gaming was, especially a game called Echo Arena, and so we ordered an Oculus. I loaded up the game, went into the lobby... and timed out. I tried that a few more times, and it kept happening. Then I tried using my phone as a mobile hotspot, and the game worked perfectly.

Before I explain why that was so important, we have to talk about a few lower level details of networks.

Packets, addresses, and ports

We often use terms like "connecting" to a website. In a physical sense, that doesn't happen. When I go to YouTube, I don't have a physical cable, radio signal, or any other physical manifestation of a connection between my computer and some computer at YouTube headquarters. Instead, the way the internet works is a series of computers that connect to each other and pass data around to each other. This is known as routing.

Every computer on a network has an Internet Protocol (IP) address. These are numbers that look like 192.168.1.1. You've probably seen them at some point. The basic idea of routing traffic is I say to the next computer in the line "hey, I want to talk to a computer with that address." The next computer may have connections to 5 other computers, and it knows which of those computers is closest to that IP address. It figures this out using a routing table. The data then "hops" from that computer to the next one, from there to another computer, and so on until it reaches its destination.

But like I said, there aren't any "connections." Instead, internet traffic is made up of a bunch of "packets." You can think of these as envelopes. They have an IP address on the outside, and a small amount of data inside. When you "connect" to another computer, you're actually sending a bunch of these packets over the network. The computers in the middle route your traffic by looking at the outside of the envelope (called the header). And your packets make it to their destination.

One other thing to keep in mind. Each computer can talk to lots of other computers at the same time. Each computer may provide different ways to talk to it (known as protocols, such as the web, or email, or video calling). To allow a single computer to do all these things at the same time, we have one more important number: the port number. This is a number between 1 and 65,535, and it tells the computer which "connection" traffic is trying to use. When you send a packet, your header includes the destination IP address and destination port number. It also includes the source IP address and source port number. This allows the other computer to respond to you.
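As a toy illustration (entirely my own sketch, with made-up field names and example addresses), a packet header can be modeled as a record, and a reply is formed by swapping the source and destination pairs:

```haskell
module Main where

-- A toy packet header: the destination pair tells routers where to
-- send the packet, while the source pair tells the recipient where
-- to send its reply.
data Header = Header
  { srcAddr :: String
  , srcPort :: Int
  , dstAddr :: String
  , dstPort :: Int
  } deriving (Show, Eq)

-- To answer, the other computer swaps source and destination.
reply :: Header -> Header
reply (Header sa sp da dp) = Header da dp sa sp

main :: IO ()
main = print (reply (Header "203.0.113.5" 50000 "198.51.100.7" 443))
```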

Packet loss, UDP, and TCP

OK, one more topic to learn. There are actual physical mechanisms that carry network traffic. It could be a network cable, a WiFi signal, a satellite connection to Starlink, or the 4G cellular signal on your phone. They all share one thing in common: they can lose data. Maybe you're driving through a tunnel and the cell signal is interrupted. Maybe you live in my neighborhood, and the cable company still hasn't properly protected their cables from water, so you lose internet every time it rains. Whatever the case, and for whatever reason, it's entirely possible to lose some of the data. This is known as packet loss.

There are two basic approaches in networking to dealing with packet loss, each with their own advantages and disadvantages.

  • User Datagram Protocol, or UDP, is a protocol that works as a "fire and forget" message. I send data to the other side, and I have no guarantee that it arrived or of the order it will arrive in (maybe packet 513 will get there before packet 512). Maybe the same packet will be received multiple times. No one knows.
  • Transmission Control Protocol, or TCP, is what people normally think of as a "connection" on the internet. TCP adds a bunch of bookkeeping rules to address the limitations of UDP. It makes sure packets arrive, resending them if they didn't get to the other side, and it delivers them in the correct order. And it lets you know if the other side breaks the connection.

You may be wondering: why in the world would anyone ever use UDP? It sounds terrible! Overall, TCP is more commonly used for sure, because most people need those guarantees most of the time. But there are some use cases where UDP is far superior. (And finally this weird tangent will connect back to the beginning of the blog post.) UDP is great when:

  • You don't actually need every single bit of data to arrive on the other side.
  • You care much more about raw speed than other factors.

There are two great examples of this:

  1. Audio calls! It turns out that if you take a stream of audio, you can break it down into a whole bunch of tiny data packets, each containing a slice of time, and send them over the network. If one of those packets is lost, the other side can usually still understand what you said from all the other audio packets. And the bookkeeping TCP adds to ensure all the data arrives would cause audio calls to become "laggy," or, to be more technical, would introduce latency.
  2. Gaming. In lots of video games, we don't need to have every single bit of data about what the user did. There are ways to write gaming protocols that say things like "don't tell me the user pressed up for 5 seconds, tell me their new position." If one of those updates gets lost, it's no big deal, the next update will give the newer position. You may "glitch" a bit in the game and jump around, but again, overall, the speed is more important than every piece of data.

There are other examples of UDP being superior, but I won't bother covering them here, because now we know enough to see what was happening in my house.

What's in a router?

It turns out that Echo Arena, the game I tried playing, was communicating with the server over UDP. And for whatever reason, it was more sensitive to the breakage in my house than other things like video calls and Minecraft. The root cause: the router in our house was mishandling UDP packets.

Most people get a router from their cable, DSL, or fiberoptic company when they pay them for internet access. But this device generally is not just a router. It's actually doing three different jobs most of the time, and we need to separate those out:

  1. Modem. The term modem means a device that converts one kind of physical connection into a network connection. Cable modems, for example, convert the TV cable wires already running into your house into a network signal, something they weren't originally designed for. Fiber modems will convert the fiberoptic light-based signals into a network signal. DSL does the same with phone lines. Even old-school dial-up modems are simply using audio over the phone line for the same purpose.
  2. Wireless access point. You can connect to your "router" by plugging in a network cable to the back and connecting that to your computer. But most people these days are using a WiFi signal instead. A wireless access point is the translator between WiFi signals and your wired network. In the case of your modem/wireless router combo, it's built into the device, but you could use an external one. (And this is a great way to extend the range of your wireless network if you need to.)
  3. Router. Before we get into that though, there are two more things we need to learn about IP addresses:
  • There are some IP addresses that have been reserved as "private," meaning they can be used inside people's homes or businesses, but can't go on the internet. This includes anything that starts with 10. or 192.168.. If those look familiar... just wait a second, we'll get to it.
  • There are only 4 billion IP addresses possible. That may seem like a lot, but it turns out that it isn't nearly enough for all the people, servers, Internet of Things devices, and everything else that wants to be on the internet. We have an IP address shortage.
    • Side note: the current common IP address standard is called IPv4, and is what I'm referring to. There's a new standard, called IPv6, that totally solves this problem by introducing an insane number of addresses. To get a sense of how big:

      340,282,366,920,938,463,463,374,607,431,768,211,456, which is approximately 340 undecillion addresses...

      So we could assign an IPv6 address to EVERY ATOM ON THE SURFACE OF THE EARTH, and still have enough addresses left to do another 100+ earths.

Unfortunately, IPv6 is having trouble taking off, so we're stuck with IPv4 and a shortage of IP addresses. And this is where your router comes in. Its job is to get a public IP address from your ISP, and then create a local network inside your house. It does this with a few different technologies:

  • The router creates a private IP address for itself, commonly an address from one of those reserved private ranges.
  • It runs something called a DHCP server that lets other computers on the network ask for an IP address and connection information. It will hand out private addresses from those same ranges.
  • And finally, the part we care about the most: your router does Network Address Translation, or NAT, to convert your packets from private to public addresses.

What's in a NAT?

Let's break this down. Suppose you're trying to connect to a website like YouTube. Your computer will look up that website's IP address (using a different system called DNS, which I'm not covering right now). I want to connect to it as a secure website (HTTPS), and the standard port number for that is 443. My computer knows its own private IP address, and randomly chooses an unused port number (let's say 4001). Then my computer makes a packet that looks like:

  • Destination IP:
  • Destination port: 443
  • Source IP:
  • Source port: 4001
  • Data: please start a TCP connection with me

Then, it sends that packet to my router so the router can pass it on to the rest of the internet. However, as it stands right now, that packet will be blocked, because private IP addresses are not allowed on the internet. And that's a good thing, because lots of computers in the world share that same private IP address, and YouTube wouldn't know which computer to send the response to.

Instead, the router translates the network address (hence the name NAT). The router has some public IP address it got from my ISP. It will then translate the header on the packet above to say:

  • Source IP:
  • Source port: 54542 (we'll come back to why this is different in a moment)

The router has to remember that it did this translation, and it sends off the packet to the internet. Eventually YouTube receives the request, processes it, and sends out a packet of its own that looks like this:

  • Destination IP:
  • Destination port: 54542
  • Source IP:
  • Source port: 443
  • Data: OK, starting a TCP connection

The router receives this packet, notices the destination port, and remembers "hey, I know which computer inside the house should get these." It then replaces the destination IP and port with:

  • Destination IP:
  • Destination port: 4001

It then sends that data into the local network inside my house, where my computer receives it, and thinks it's talking directly to YouTube.

Side point: why did the router change from 4001 to 54542? Because other computers in my network may also be using source port 4001, and the router needs to distinguish which computer should receive these packets.

This is a horribly ugly hacky workaround for not having enough IP addresses. But it (mostly) works just fine, and the entire internet is built on it right now.
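As a toy illustration of the bookkeeping described above, here's a sketch of a NAT table in Haskell. This is not real router code; the private address is a made-up example from the 192.168. range, and the ports are the ones used in the walkthrough.

```haskell
import qualified Data.Map as Map

-- The private (IP, port) pair that originated some traffic.
type PrivateAddr = (String, Int)

-- A toy NAT table: the public source port the router picked,
-- mapped back to the private address behind it.
type NatTable = Map.Map Int PrivateAddr

-- Outbound packet: pick a public port and remember the mapping.
recordOutbound :: Int -> PrivateAddr -> NatTable -> NatTable
recordOutbound publicPort private = Map.insert publicPort private

-- Inbound packet: look up the destination port to find which
-- machine inside the house should receive it. A missing entry is
-- exactly the "forgotten mapping" failure that broke my calls.
routeInbound :: Int -> NatTable -> Maybe PrivateAddr
routeInbound = Map.lookup

main :: IO ()
main = do
  let table = recordOutbound 54542 ("192.168.1.10", 4001) Map.empty
  print (routeInbound 54542 table)  -- Just ("192.168.1.10",4001)
  print (routeInbound 54543 table)  -- Nothing: the packet gets stuck
```

A good router keeps these entries alive for as long as the "connection" is in use; my ISP's router was dropping them too eagerly.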

If you want to see evidence of this happening, check your local computer's settings and see what IP address it thinks it has. Then compare it with the number a "what's my IP" website reports. That website is seeing the IP address of your router, not your local computer, so you'll almost certainly get two different numbers.

About that "mostly"

I said this all mostly works. Let's start with TCP. TCP defines a whole protocol for opening and closing a connection. The router understands this, looks at the headers and the data, and remembers the mapping between the original source IP/port and the new source port. Almost every router under the sun handles this situation really well.

Unfortunately, the situation isn't as good for UDP. That's because there's no real "connection." UDP is just a bunch of packets. Good routers handle UDP really well, keep track of the mappings, and intelligently decide when a source port has been unused for long enough that it's allowed to forget about it.

And that brings me to my video call problems. The router included with the modem from my ISP sucks. It would forget about these mappings at the wrong time. The result would be that, in the middle of a call, the UDP packets carrying the audio from the other side would suddenly get "stuck" on the router and not get sent to my computer. Eventually, the router would remember a new port mapping and the call would resume. But I'd lose 10-20 seconds of audio while that happened.

For various technical reasons that I'm no expert at and aren't really relevant, the video data in calls often goes over TCP instead of UDP, and that's why I would continue to see the video move while people went silent.

Similarly, the kids could play Minecraft for a while before packet loss ensued and they'd get sent to "limbo." Discord calls would work until they'd glitch for a bit. And finally, the final puzzle piece: Echo Arena detected the situation much faster than anything else and simply refused to play at all.

The solution

With the problem identified, the solution is simple: don't use the router in the modem I got from my ISP. I bought a new router, plugged it into the modem, and switched the modem into "bridge mode." This disables the router functionality in the modem. Now my shiny new router got a public IP address and could send data directly to the internet. It's responsible for giving out IP addresses in my house and doing all the NAT work. And since it's a good router, it does this all correctly. With this device installed, video calls instantly became near-perfect, my kids stopped complaining about Minecraft, and I could play Echo Arena (which I still suck at, but hey, that's what I get for writing blog posts instead of practicing my video game skills).

In my case, I already had Wireless Access Points (WAPs) throughout the house, so I did not need a wireless router. Instead, I bought an ER605 from TP-Link. I've been very happy with the EAP245 WAPs I got from TP-Link before, and this is part of the same business class of devices. However, if you don't have your own WAPs, it's probably a better idea to get a wireless router, which includes both router and WAP functionality.

Anyway, I hope that explanation is helpful to someone else. When discussing with Neil, he pointed out how sad it is that many people in the world are probably affected by this crappy-internet problem and have no way of diagnosing it themselves. (Hell, I'm a network engineer and it took about three years for me to figure it out!) Good luck to all!

November 14, 2022 12:00 AM

November 12, 2022

Philip Wadler

IO Scotfest: The Age of Voltaire - Nov 18-19

IOHK/IOG will be hosting a meeting at Edinburgh next week. Available online, plus an in-person meetup for folk near Edinburgh.

Let’s celebrate the dawning of a new era for #Cardano together. Join us for a virtual event that will showcase the community’s achievements over the last 5 years & discuss IOG’s vision for the future of Cardano. Learn more:


by Philip Wadler ( at November 12, 2022 12:11 PM

November 08, 2022

Mark Jason Dominus

Addenda to recent articles 202210

I haven't done one of these in a while. And there have been addenda. I thought hey, what if I ask Git to give me a list of commits from October that contain the word ‘Addendum’. And what do you know, that worked pretty well. So maybe addenda summaries will become a regular thing again, if I don't forget by next month.

Most of the addenda resulted in separate followup articles, which I assume you will already have seen. ([1] [2] [3]) I will not mention this sort of addendum in future summaries.

  • In my discussion of lazy search in Haskell I had a few versions that used do-notation in the list monad, but eventually abandoned it in favor of explicit concatMap. For example:

          s nodes = nodes ++ (s $ concatMap childrenOf nodes)

    I went back to see what this would look like with do notation:

          s nodes = (nodes ++) . s $ do
              n <- nodes
              childrenOf n


  • Regarding the origin of the family name ‘Hooker’, I rejected Wiktionary's suggestion that it was an occupational name for a maker of hooks, and speculated that it might be a fisherman. I am still trying to figure this out. I asked about it on English Language Stack Exchange but I have not seen anything really persuasive yet. One of the answers suggests that it is a maker of hooks, spelled hocere in earlier times.

    (I had been picturing wrought-iron hooks for hanging things, and wondered why the occupational term for a maker of these wasn't “Smith”. But the hooks are supposedly clothes-fastening hooks, made of bone or some similar finely-workable material. )

    The OED has no record of hocere, so I've asked for access to the Dictionary of Old English Corpus of the Bodleian library. This is supposedly available to anyone for noncommercial use, but it has been eight days and they have not yet answered my request.

    I will post an update, if I have anything to update.

by Mark Dominus ( at November 08, 2022 11:34 PM

November 07, 2022

GHC Developer Blog

GHC 9.2.5 is now available

GHC 9.2.5 is now available

Zubin Duggal - 2022-11-07

The GHC developers are happy to announce the availability of GHC 9.2.5. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing a few issues found in 9.2.4. These include:

  • Code generation issues in the AArch64 native code generator backend resulting in incorrect runtime results in some circumstances (#22282, #21964)
  • Fixes for a number of issues with the simplifier leading to core lint errors and suboptimal performance (#21694, #21755, #22114)
  • A long-standing interface-file determinism issue where full paths would leak into the interface file (#22162)
  • A runtime system bug where creating empty mutable arrays resulted in a crash (#21962)
  • … and a few more. See the release notes for a full accounting.

As some of the fixed issues do affect correctness, users are encouraged to upgrade promptly.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprise this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling,

  • Zubin

by ghc-devs at November 07, 2022 12:00 AM

November 05, 2022

Mark Jason Dominus

A map of Haskell's numeric types

I keep getting lost in the maze of Haskell's numeric types. Here's the map I drew to help myself out. (I think there might have been something like this in the original Haskell 1998 report.)

(PNG version) (Original DOT file (The SVG above is hand-edited graphviz output))

Ovals are typeclasses. Rectangles are types. Black mostly-straight arrows show instance relationships. Most of the defined functions have straightforward types; the few exceptions are shown by wiggly colored arrows.

Basic plan

After I had meditated for a while on this picture I began to understand the underlying organization. All numbers support basic arithmetic (the Num operations). And there are three important properties numbers might additionally have:

  • Ord : ordered; supports (<), (<=), etc.
  • Fractional : supports division
  • Enum: supports ‘pred’ and ‘succ’

Integral types are both Ord and Enum, but they are not Fractional because integers aren't closed under division.

Floating-point and rational types are Ord and Fractional but not Enum because there's no notion of the ‘next’ or ‘previous’ rational number.

Complex numbers are numbers but not Ord because they don't admit a total ordering. That's why Num plus Ord is called Real: it's ‘real’ as contrasted with ‘complex’.
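A quick illustration of the basic scheme; the example values are mine, not from the original map, but the behavior is standard Prelude:

```haskell
main :: IO ()
main = do
  print (7 `div` 2 :: Int)   -- Integral division: 3
  print (7 / 2 :: Double)    -- Fractional division: 3.5
  print (succ (41 :: Int))   -- Enum: 42
  print ((3 :: Int) < 4)     -- Ord: True
-- By contrast, 7 / 2 :: Int fails to compile: Int has no Fractional
-- instance, because the integers aren't closed under division.
```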

More stuff

That's the basic scheme. There are some less-important elaborations:

Real plus Fractional is called RealFrac.

Fractional numbers can be represented as exact rationals or as floating point. In the latter case they are instances of Floating. The Floating types are required to support a large family of functions like exp, sin, and the constant π.

You can construct a Ratio a type for any a; that's a fraction whose numerators and denominators are values of type a. If you do this, the Ratio a that you get is a Fractional, even if a wasn't one. In particular, Ratio Integer is called Rational and is (of course) Fractional.
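For example, Rational arithmetic is exact, with no floating-point rounding (the particular fractions here are my own):

```haskell
import Data.Ratio

main :: IO ()
main = do
  let x = 1 % 3 + 1 % 6 :: Rational   -- Rational = Ratio Integer
  print x                              -- 1 % 2: exact arithmetic
  print (numerator x, denominator x)   -- (1,2)
```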

Shuff that don't work so good

Complex Int and Complex Rational look like they should exist, but they don't really. Complex a is only an instance of Num when a is floating-point. This means you can't even do 3 :: Complex Int — there's no definition of fromInteger. You can construct values of type Complex Int, but you can't do anything with them, not even addition and subtraction. I think the root of the problem is that Num requires an abs function, and for complex numbers you need the sqrt function to be able to compute abs.
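For contrast, everything works fine when the component type is floating-point; here's a small check of the point above (the values are my own):

```haskell
import Data.Complex

main :: IO ()
main = do
  let z = 3 :+ 4 :: Complex Double
  print (abs z)   -- 5.0 :+ 0.0: abs needs sqrt, which Double supplies
  print (z * z)   -- (-7.0) :+ 24.0
-- Try the same with Complex Int and it won't compile: the Num
-- instance for Complex a requires RealFloat a.
```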

Complex Int could in principle support most of the functions required by Integral (such as div and mod) but Haskell forecloses this too because its definition of Integral requires Real as a prerequisite.

You are only allowed to construct Ratio a if a is integral. Mathematically this is a bit odd. There is a generic construction, called the field of quotients, which takes a ring and turns it into a field, essentially by considering all the formal fractions a/b (where b ≠ 0), and with a/b considered equivalent to a′/b′ exactly when ab′ = a′b. If you do this with the integers, you get the rational numbers; if you do it with a ring of polynomials, you get a field of rational functions, and so on. If you do it to a ring that's already a field, it still works, and the field you get is trivially isomorphic to the original one. But Haskell doesn't allow it.

I had another couple of pages written about yet more ways in which the numeric class hierarchy is a mess (the draft title of this article was "Haskell's numbers are a hot mess") but I'm going to cut the scroll here and leave the hot mess for another time.

[ Addendum: Updated SVG and PNG to version 1.1. ]

by Mark Dominus ( at November 05, 2022 01:12 AM

November 03, 2022

GHC Developer Blog

GHC 9.4.3 released

GHC 9.4.3 released

bgamari - 2022-11-03

The GHC developers are happy to announce the availability of GHC 9.4.3. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing a few issues found in 9.4.2. These include:

  • An issue where recursive calls could be speculatively evaluated, resulting in non-termination (#20836)
  • A code generation issue in the AArch64 native code generator backend resulting in incorrect runtime results in some circumstances (#22282)
  • A crash on Darwin when running executables compiled with IPE support (#22080)
  • A long-standing interface-file determinism issue where full paths would leak into the interface file (#22162)
  • A bug in the process library where file handles specified as NoStream would still be usable in the child (process#251)

Note that, as GHC 9.4 is the first release series where the release artifacts are all generated by our new Hadrian build system, it is possible that there will be packaging issues. If you encounter trouble while using a binary distribution, please open a ticket. Likewise, if you are a downstream packager, do consider migrating to Hadrian to run your build; the Hadrian build system can be built using cabal-install, stack, or the in-tree bootstrap script. See the accompanying blog post for details on migrating packaging to Hadrian.

We would also like to emphasize that GHC 9.4 must be used in conjunction with Cabal-3.8 or later. This is particularly important for Windows users due to changes in GHC’s Windows toolchain.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprise this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy Haskelling,

  • Ben

by ghc-devs at November 03, 2022 12:00 AM

November 02, 2022

Matt Parsons

Break Gently with Pattern Synonyms

This is a really brief post to call out a nice trick for providing users a nice migration message when you delete a constructor in a sum type.

The Problem

You have a sum type, and you want to delete a redundant constructor to refactor things.

data Foo 
    = Bar Int 
    | Baz Char
    | Quux Double

That Quux is double trouble. But if we simply delete it, then users will get a Constructor not found: Quux. This isn’t super helpful. They’ll have to go find where Quux came from, what package defined it, and then go see if there’s a Changelog. If not, then they’ll have to dig through the Git history to see what’s going on. This isn’t a fun workflow.

But, let’s say you really need end users to migrate off Quux. So we’re interested in giving a compile error that has more information than Constructor not in scope.

Here’s what some calling code looks like:

blah :: Foo -> Int
blah x = case x of
    Bar i -> i
    Baz c -> fromEnum c
    Quux a -> 3

will give the output:

/home/matt/patsyn.hs:24:5: error:
    Not in scope: data constructor Quux
24 |     Quux a -> 3
   |     ^^^^
Failed, no modules loaded.

Fortunately, we can make this nicer.

GHC gives us a neat trick called PatternSynonyms. They create constructor-like things that we can match on and construct with, but that are a bit smarter.


Let’s redefine Quux as a pattern synonym on Foo. We’ll also export it as part of the datatype definition.

{-# language PatternSynonyms, ViewPatterns #-}

module Wow (Foo (.., Quux)) where

data Foo
    = Bar Int
    | Baz Char

pattern Quux :: a -> Foo
pattern Quux i <- (const Nothing -> Just i)

This does something tricky: we always throw away the input with the ViewPattern, and we can summon whatever we want in the left hand side. This allows us to provide whatever a is needed to satisfy the type. This match will never succeed - so Quux behavior will never happen.

Now, we get a warning for the match:

[1 of 1] Compiling Main             ( /home/matt/patsyn.hs, interpreted )

/home/matt/patsyn.hs:25:5: warning: [-Woverlapping-patterns]
    Pattern match is redundant
    In a case alternative: Quux a -> ...
25 |     Quux a -> 3
   |     ^^^^^^^^^^^
Ok, one module loaded.

But an error for constructing:

[1 of 1] Compiling Main             ( /home/matt/patsyn.hs, interpreted )

/home/matt/patsyn.hs:28:10: error:
    • non-bidirectional pattern synonym ‘Quux’ used in an expression
    • In the expression: Quux 3
      In an equation for ‘blargh’: blargh = Quux 3
28 | blargh = Quux 3
   |          ^^^^
Failed, no modules loaded.

So we need to construct with it, too. We can modify the pattern synonym by providing a where, and specifying how to construct with it. Since we’re intending to prevent folks from using it, we’ll just use undefined.

pattern Quux :: a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined

With this, we get just the warning about a redundant pattern match. Now it’s time to step up our game by providing a message to the end user.


GHC gives us the ability to write {-# WARNING Quux "migrate me pls" #-}. This can make sense if we expect that the runtime behavior of a program won’t be changed by our pattern synonym.

So let’s write a warning:

pattern Quux :: a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined

{-# WARNING Quux
    "Please migrate away from Quux in some cool manner. \
    \See X resource for migration tips."
  #-}

Now, when compiling, we’ll see the warnings:

/home/matt/patsynimp.hs:11:5: warning: [-Wdeprecations]
    In the use of data constructor ‘Quux’ (imported from PatSyn):
    "Please migrate away from Quux in some cool manner. See X resource for migration tips."
11 |     Quux _ -> 3
   |     ^^^^

/home/matt/patsynimp.hs:11:5: warning: [-Woverlapping-patterns]
    Pattern match is redundant
    In a case alternative: Quux _ -> ...
11 |     Quux _ -> 3
   |     ^^^^^^^^^^^

/home/matt/patsynimp.hs:14:10: warning: [-Wdeprecations]
    In the use of data constructor ‘Quux’ (imported from PatSyn):
    "Please migrate away from Quux in some cool manner. See X resource for migration tips."
14 | blargh = Quux (3 :: Int)
   |          ^^^^

But this may not be good enough. We may want to give them an error, so they can’t build.


base defines a type TypeError, which GHC treats specially - it raises a type error. This isn’t generally useful, but can be great for marking branches of a type family or type class instance as “impossible.” The error message can be fantastic for guiding folks towards writing correct code.

PatternSynonyms can have two sets of constraints: the first is required when constructing, and the second is provided when matching. So let’s just put an error in the first and see what happens:

pattern Quux
    :: (TypeError ('Text "please migrate ..."))
    => ()
    => a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined

Unfortunately, GHC blows up immediately while compiling the synonym!

[1 of 2] Compiling PatSyn           ( PatSyn.hs, interpreted )

PatSyn.hs:20:1: error: please migrate ...
20 | pattern Quux
   | ^^^^^^^^^^^^...
Failed, no modules loaded.

We can’t even -fdefer-type-errors this one. Are we hosed?

What about the second position? Same problem. We can’t put a bare TypeError in there at all.

Fortunately, we can have a lil’ bit of laziness by introducing it as a constraint.

class DeferredError
instance (TypeError ('Text "please migrate ...")) => DeferredError

pattern Quux
    :: DeferredError
    => DeferredError
    => a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined

This actually does give us a warning now - at the const Nothing -> Just i line, we have a deferred type error.

This gives us the error behavior we want!

/home/matt/patsynimp.hs:14:10: error:
    • please migrate ...
    • In the expression: Quux (3 :: Int)
      In an equation for ‘blargh’: blargh = Quux (3 :: Int)
14 | blargh = Quux (3 :: Int)
   |          ^^^^^^^^^^^^^^^
Failed, one module loaded.

We only get the one error - but if we delete it, we can see the other error:

[2 of 2] Compiling Main             ( /home/matt/patsynimp.hs, interpreted )

/home/matt/patsynimp.hs:11:5: error:
    • please migrate ...
    • In the pattern: Quux _
      In a case alternative: Quux _ -> 3
      In the expression:
        case x of
          Bar i -> i
          Baz c -> fromEnum c
          Quux _ -> 3
11 |     Quux _ -> 3
   |     ^^^^^^
Failed, one module loaded.

What’s fun is that we can actually provide two different messages. Constructing something will give both error messages, and pattern matching only uses the “required” constraint.

This should make it much easier for end users to migrate to new versions of your library.

Final Code and Errors

{-# language PatternSynonyms #-}
{-# language KindSignatures #-}
{-# language FlexibleContexts #-}
{-# language FlexibleInstances #-}
{-# language ViewPatterns #-}
{-# language MultiParamTypeClasses #-}
{-# language UndecidableInstances #-}
{-# language DataKinds #-}

{-# OPTIONS_GHC -fdefer-type-errors #-}

module PatSyn where

import Prelude
import GHC.Exts
import GHC.TypeLits

data Foo
    = Bar Int
    | Baz Char

class DeferredError (a :: ErrorMessage)
instance (TypeError a) => DeferredError a

pattern Quux
    :: DeferredError ('Text "please migrate (required constraint)")
    => DeferredError ('Text "please migrate (provided constraint)")
    => a -> Foo
pattern Quux i <- (const Nothing -> Just i) where
    Quux _ = undefined

Matching a constructor:

[2 of 2] Compiling Main             ( /home/matt/patsynimp.hs, interpreted )

/home/matt/patsynimp.hs:11:5: error:
    • please migrate (required constraint)
    • In the pattern: Quux _
      In a case alternative: Quux _ -> 3
      In the expression:
        case x of
          Bar i -> i
          Baz c -> fromEnum c
          Quux _ -> 3
11 |     Quux _ -> 3
   |     ^^^^^^
Failed, one module loaded.

Using a constructor:

[2 of 2] Compiling Main             ( /home/matt/patsynimp.hs, interpreted )

/home/matt/patsynimp.hs:14:10: error:
    • please migrate (required constraint)
    • In the expression: Quux (3 :: Int)
      In an equation for ‘blargh’: blargh = Quux (3 :: Int)
14 | blargh = Quux (3 :: Int)
   |          ^^^^^^^^^^^^^^^

/home/matt/patsynimp.hs:14:10: error:
    • please migrate (provided constraint)
    • In the expression: Quux (3 :: Int)
      In an equation for ‘blargh’: blargh = Quux (3 :: Int)
14 | blargh = Quux (3 :: Int)
   |          ^^^^^^^^^^^^^^^
Failed, one module loaded.

November 02, 2022 12:00 AM

October 31, 2022

Mark Jason Dominus

Emoji for U.S. presidents

Content warning: something here to offend almost everyone

A while back I complained that there were no emoji portraits of U.S. presidents. Not that there a Chester A. Arthur portrait would see a lot of use. But some of the others might come in handy.

I couldn't figure them all out. I have no idea what a Chester Arthur emoji would look like. And I assigned 🧔� to all three of Garfield, Harrison, and Hayes, which I guess is ambiguous but do you really need to be able to tell the difference between Garfield, Harrison, and Hayes? I don't think you do. But I'm pretty happy with most of the rest.

George Washington 💵
John Adams
Thomas Jefferson 📜
James Madison
James Monroe
John Quincy Adams �
Andrew Jackson
Martin Van Buren 🌷
William Henry Harrison 🪦
John Tyler
James K. Polk
Zachary Taylor
Millard Fillmore ⛽
Franklin Pierce
James Buchanan
Abraham Lincoln �
Andrew Johnson 💩
Ulysses S. Grant �
Rutherford B. Hayes 🧔�
James Garfield 🧔�
Chester A. Arthur
Grover Cleveland 🔂
Benjamin Harrison 🧔�
Grover Cleveland 🔂
William McKinley
Theodore Roosevelt 🧸
William Howard Taft �
Woodrow Wilson �
Warren G. Harding 🫖
Calvin Coolidge 🙊
Herbert Hoover ⛺
Franklin D. Roosevelt 👨�🦽
Harry S. Truman �
Dwight D. Eisenhower 🪖
John F. Kennedy �
Lyndon B. Johnson 🗳�
Richard M. Nixon �
Gerald R. Ford �
Jimmy Carter 🥜
Ronald Reagan 💸
George H. W. Bush 👻
William J. Clinton �
George W. Bush �
Barack Obama 🇰🇪
Donald J. Trump �
Joseph R. Biden 🕶�

Honorable mention: Benjamin Franklin �

Dishonorable mention: J. Edgar Hoover 👚

If anyone has better suggestions I'm glad to hear them. Note that I considered, and rejected � for Lincoln because it doesn't look like his actual hat. And I thought maybe McKinley should be �� but since they changed the name of the mountain back I decided to save it in case we ever elect a President Denali.

(Thanks to Liam Damewood for suggesting Harding, and to Colton Jang for Clinton's saxophone.)

[ Addendum 20221106: Twitter user Simon suggests emoji for UK prime ministers. ]

[ Addendum 20221108: Rasmus Villemoes makes a good suggestion of 😼 for Garfield. I had considered this angle, but abandoned it because there was no way to be sure that the cat would be orange, overweight, or grouchy. Also the 🧔� thing is funnier the more it is used. But I had been unaware that there is CAT FACE WITH WRY SMILE until M. Villemoes brought it to my attention, so maybe. (Had there been an emoji resembling a lasagna I would have chosen it instantly.) ]

[ Addendum 20221108: January First-of-May has suggested 🌷 for Maarten van Buren, a Dutch-American whose first language was not English but Dutch. Let it be so! ]

by Mark Dominus ( at October 31, 2022 09:50 PM

October 29, 2022

Matt Parsons

Spooky Masks and Async Exceptions

Everyone loves Haskell because it makes concurrent programming so easy! forkIO is great, and you’ve got STM and MVar and other fun tools that are pleasant to use.

Well, then you learn about asynchronous exceptions. The world seems a little scarier - an exception could be lurking around any corner! Anyone with your ThreadId could blast you with a killThread or throwTo and you would have no idea what happened.

The async library hides a lot of this from you by managing the forkIO and throwTo stuff for you. It also makes it easy to wait on a thread to finish, and receive exceptions that the forked thread died with. Consider how nice the implementation of timeout is here:

timeout :: Int -> IO a -> IO (Maybe a)
timeout microseconds action = do
  withAsync (Just <$> action) $ \a0 ->
  withAsync (Nothing <$ threadDelay microseconds) $ \a1 ->
      either id id <$> waitEither a0 a1

The async library uses asynchronous exceptions to signal that a thread must die. The withAsync function guarantees that the forked thread is killed off when the inner action is complete. So timeout will fork a thread to run Just <$> action, and then fork another thread to threadDelay. waitEither accepts an Async a and an Async b and returns an IO (Either a b) - whichever one finishes first determines the return type. If threadDelay finishes first, then we get a Right Nothing as the return value, and timeout exits. This spells doom for the action thread.

But if our brave hero is able to escape before the deadline, it’s the threadDelay that gets killed!

Indeed, this is a specialization of race :: IO a -> IO b -> IO (Either a b), which runs two IO actions in separate threads. The first to complete returns the value, and the remaining thread is sacrificed to unspeakable horrors.
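To make the connection concrete, here is a sketch of timeout written directly in terms of race. This is my own rewrite, not the library's actual definition, and `timeout'` is a hypothetical name:

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (race)

-- Sketch: timeout as a thin wrapper over race. Whichever action wins
-- determines the result; the losing thread is killed with an async
-- exception by race's cleanup.
timeout' :: Int -> IO a -> IO (Maybe a)
timeout' microseconds action =
    either Just (const Nothing) <$> race action (threadDelay microseconds)
```

If action wins we get Left and hence Just; if the delay wins we get Right and hence Nothing, matching the withAsync version above.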

But, you really shouldn’t catch or handle async exceptions yourself. GHC uses them to indicate “you really need to shut down extremely quickly, please handle your shit right now.” ThreadKilled is used to end a thread’s execution, and UserInterrupt means that you got a SIGINT signal and need to stop gracefully. The async package uses AsyncCancelled to, well, cancel threads. However, the base package’s Control.Exception has a footgun: if you catch-all-exceptions by matching on SomeException, then you’ll catch these async exceptions too!

Now, you should pretty much never be catching SomeException, unless you really really know what you’re doing. But I see it all the time:

import Control.Exception (catch)

blah = 
    Just <$> coolThing 
        `catch` \(SomeException e) -> do
            reportException e
            pure Nothing

If coolThing receives a ThreadKilled or an AsyncCancelled or UserInterrupt or anything else from throwTo, it’ll catch it, report it, and then your program will continue running. Then the second Ctrl-C comes from the user, and your program halts immediately without running any cleanup. This is pretty dang bad! You really want your finally calls to run.
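With only base in hand, one defensive pattern is to check whether the caught exception opted into the SomeAsyncException hierarchy and rethrow it; this is roughly the behaviour safe-exceptions gives you for free. A sketch, where `reportException` stands in for whatever logging the surrounding code uses:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Exception

-- Sketch: catch everything, but rethrow anything wrapped in
-- SomeAsyncException (ThreadKilled, UserInterrupt, AsyncCancelled all
-- use asyncExceptionToException) so shutdown signals still propagate.
blahSafer :: IO a -> (SomeException -> IO ()) -> IO (Maybe a)
blahSafer coolThing reportException =
    (Just <$> coolThing) `catch` \e ->
        case fromException e of
            Just (ae :: SomeAsyncException) -> throwIO ae
            Nothing -> do
                reportException e
                pure Nothing
```

This only helps for exceptions that were thrown through the async-exception machinery; a plain throwIO ThreadKilled would still be swallowed, which is another reason to prefer a library that handles this for you.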

You search for a bit, and you find the safe-exceptions package. It promises to make things a lot nicer by not catching async exceptions by default. So our prior code block, with just a change in import, becomes much safer:

import Control.Exception.Safe (catch)

blah = 
    Just <$> coolThing 
        `catch` \(SomeException e) -> do
            reportException e
            pure Nothing

This code will no longer catch and report an async exception. However, the blocks in your finally and bracket for cleanup will run!

Unfortunately, the safe-exceptions library (and the unliftio package which uses the same behavior), have a dark secret…

*thunder claps in the distance, as rain begins to fall*

… they wear spooky masks while cleaning! WowowoOOOoOoOooOooOOooOooOOo

No, really, they do something like this:

bracket provide cleanup action = 
    Control.Exception.bracket
        provide
        (\a -> 
            Control.Exception.uninterruptibleMask_ $ 
                cleanup a)
        action

This code looks pretty innocuous. It even says that it’s good! “Your cleanup function is guaranteed not to be interrupted by an asynchronous exception.” So if you’re cleaning things up, and BAMM a vampire ThreadKills you, you’ll finish your cleanup before rethrowing. This might just be all you need to make it out of the dungeon alive.

Behind the sweet smile and innocent demeanor of the safe-exceptions package, though, is a dark mystery - and a vendetta for blood. Well, maybe not blood, but I guess “intercompatibility of default expectations”?

A Nightmare Scenario: Night of the Living Deadlock

Once, a brave detective tried to understand how slow the database was. But in her studies, she accidentally caused the entire app to deadlock and become an unkillable zombie?!

There are three actors in this horror mystery. Mr DA, the prime suspect. Alice, our detective. And Bob, the unassuming janitor.

Mr Database Acquisition

One of the suspected villains is Mr. Database Acquisition, a known rogue. Usually, Mr. Database Acquisition works quickly and effectively, but sometimes everything stops and he’s nowhere to be found. We’re already recording how long he takes by measuring the job completion time, but if the job never finishes, we don’t know anything.

The database connection is provided from a resource-pool Pool, which is supposed to be thread safe and guarantee resource allocation. But something seems shady about it…


Alice is a performance engineer and lead detective. She’s interested in making the codebase faster, and to do so, she sets up inspection points to log how long things are taking.

Alice cleverly sets up a phantom detective - a forked thread that occasionally checks in on Mr Database.

withAcquisitionTimer
    :: (IO () -> IO r) -> IO r
withAcquisitionTimer action = do
    timeSpent <- newIORef 0
    let tracker = 
            forever $ do
                threadDelay 1000
                elapsed <- atomicModifyIORef' timeSpent (\a -> (a + 1000, a + 1000))
                recordMetric runningWait elapsed

        report = do
            elapsed <- readIORef timeSpent
            recordMetric totalWait elapsed

    withAsync (tracker `finally` report) $ \a ->
        action (cancel a)

The actual implementation is a bit more robust and sensible, but this gets the gist across. Pretend we’re in a campy low budget horror movie.

The tracker thread wakes up every millisecond to record how long we’re waiting, and continues running until the thread is finally cancelled, or killed with an async exception, or the action finishes successfully, or a regular exception causes action to exit early. withAsync will cancel the tracker thread, ensuring that we don’t leak threads. Part of cancel’s API is that it doesn’t return until the thread is totally, completely, certainly dead - so when withAsync returns, you’re guaranteed that the thread is dead.

Alice sets the tracker up for every database acquisition, and waits to see what’s really going on.

Bob, the Janitor

theSceneOfTheCrime =
    bracket
        (runDB startProcess) 
        (\processId -> runDB (closeProcess processId)) 
        $ \processId -> do
            doWorkWith processId
            {- ... snip ... -}

There’s a great big mess - it appears that someone was thrown from a high building! Foul play is suspected from the initial detective work. But after the excitement dies down, the janitor, Bob, is left to clean up the mess.

One of the perks of being a janitor is protection from all sorts of evil. While you’re cleaning stuff up, nothing spooky can harm you - no async exceptions are allowed. You might expect there’s a loophole here, but it’s foolproof. It’s such a strong protection that the janitor is even able to bestow it upon anyone that works for him to help clean up.

Bob begins cleaning up by recording the work he’s doing in the database. To do this, he requests a database connection from Mr Database. However, this provides Mr Database with the same protections: no one can kill him, or anyone that works for him!

Now, by the particular and odd rules of this protection magic, you don’t have to know that someone is working for you. So the phantom tracker that Alice set up is similarly extended this protection.

Mr Database provides the database connection to Bob in a prompt manner, and Bob completes his task. However, when Bob attempts to release the database back, he can’t! The database connection is permanently stuck to his hand. Mr Database can’t accept it back and put it in the pool, and he can’t continue to his next job. The entire application comes grinding to a halt, as no one can access the database.

What kind of bizarre curse is this?

The Gift of Safety

withAsync wants to be safe - it wants to guarantee that the forked thread is killed when the block exits. It accomplishes this by effectively doing:

withAsync thread action = 
    bracket
        (async thread)
        uninterruptibleCancel
        action

async forks the thread and prepares the Async:

async action = do
   var <- newEmptyTMVarIO
   threadId <- mask $ \restore ->
          forkIO $ try (restore action) >>= atomically . putTMVar var
   return Async 
      { asyncThreadId = threadId 
      , _asyncWait = readTMVar var
      }

async is careful to mask the forkIO call, which ensures that the forked thread is masked. That allows action to receive async exceptions, but outside of action, it’s guaranteed that if try succeeds, then the atomically . putTMVar var also succeeds. Since try will catch async exceptions, this means that the async exception will definitely be registered in the putTMVar call.

uninterruptibleCancel cancels the thread in an uninterruptible state. cancel waits for the thread to complete - either with an exception or a real value.
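The behaviour of cancel can be sketched roughly like this (not the library's exact code): deliver AsyncCancelled to the thread, then block until the thread has actually finished, one way or the other.

```haskell
import Control.Concurrent (throwTo)
import Control.Concurrent.Async

-- Sketch of cancel's behaviour: throw AsyncCancelled at the thread,
-- then wait in waitCatch until it has genuinely terminated.
cancelSketch :: Async a -> IO ()
cancelSketch a = throwTo (asyncThreadId a) AsyncCancelled <* waitCatch a
```

The waitCatch step is what gives cancel its "doesn't return until the thread is dead" guarantee, and it is also why an unkillable thread makes cancel hang forever.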

Meanwhile, bracket is also cursed with safety:

module UnliftIO.Exception where

bracket make clean action = 
    withRunInIO $ \runInIO ->
        Control.Exception.bracket
            (runInIO make)
            (\a -> uninterruptibleMask_ $ runInIO $ clean a)
            (\a -> runInIO $ action a)

The Curse of Two Gifts

Unspeakable magical rules dictate that two gifts form a curse, under the usual laws for associativity and commutativity.

To understand what’s going on, we start by inlining the bracket.

crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        Control.Exception.bracket
            (runInIO $ runDB createProcess)
            (\pid -> 
                uninterruptibleMask_ $ do
                    runInIO $ runDB $ do
                        closeProcess pid)
            {- snip -}

We know that the make and action managed to complete, so we’re interested in the cleanup. Let’s expand runDB and omit some noise:

crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        uninterruptibleMask_ $ do
            runInIO $ do
                sqlPool <- getSqlPool
                withAcquisitionTimer $ \stop ->
                    flip runSqlPool sqlPool $ do
                        closeProcess pid

Hmm! That withAcquisitionTimer is new! Enhance!!

crimeSceneCleanedUp =
    withRunInIO $ \runInIO ->
        uninterruptibleMask_ $ do
            runInIO $ do
                sqlPool <- getSqlPool
                withAsync (task `finally` record) $ \async ->
                    flip runSqlPool sqlPool $ do
                        cancel async 
                        closeProcess pid

Uh oh. Let’s zoom in on withAsync (and get rid of some indentation):

crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        sqlPool <- getSqlPool
        bracket
            (async (task `finally` record))
            uninterruptibleCancel
            $ \async ->
                flip runSqlPool sqlPool $ do
                    cancel async 
                    closeProcess pid

One more level!

crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        sqlPool <- getSqlPool
        bracket
            (do
                var <- newEmptyTMVarIO
                threadId <- mask $ \restore ->
                    forkIO $ do
                        eres <- try $ restore $ 
                            task `finally` record 
                        atomically $ putTMVar var eres
                return Async 
                    { asyncThreadId = threadId 
                    , _asyncWait = readTMVar var
                    })
            uninterruptibleCancel
            $ \async ->
                flip runSqlPool sqlPool $ do
                    cancel async 
                    closeProcess pid

Uh oh. forkIO inherits the masking state from the parent thread. This means that the uninterruptibleMask_ state, set by bracket’s cleanup, is inherited by our forkIO.
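You can see this inheritance directly with a tiny base-only experiment; getMaskingState reports the current thread's masking state:

```haskell
import Control.Concurrent
import Control.Exception

-- A thread forked inside uninterruptibleMask_ inherits that masking
-- state: the child reports MaskedUninterruptible, not Unmasked.
main :: IO ()
main = do
    box <- newEmptyMVar
    uninterruptibleMask_ $ do
        _ <- forkIO (getMaskingState >>= putMVar box)
        pure ()
    takeMVar box >>= print  -- MaskedUninterruptible
```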

Let’s zoom back out on that async call and inline the task:

crimeSceneCleanedUp =
    uninterruptibleMask_ $ do
        {- snip -}
        withAsync
            ((forever $ do
                threadDelay 1000
                {- hmm -})
             `finally` record) $ \async ->
                {- snip -}

Ah! That’s the zombie. Reducing it to its most basic nature, we have:

zombie :: IO (Async a)
zombie =
    uninterruptibleMask_ $
        async $ 
            forever $ 
                threadDelay 1000

uninterruptibleMask_ means “I cannot be killed by async exceptions.” async allows the forked thread to inherit the masking state of the parent. But about half of the API of async requires that the forked thread can be killed by async exceptions. race is completely broken with unkillable Asyncs.

The solution is to use withAsyncWithUnmask:

safeWithAsync thread action =
    withAsyncWithUnmask (\unmask -> unmask thread) action

This unmasks the child thread, revealing it to be an imposter all along.

And I would have ~gotten away with it~ never exited and consumed all resources, if it weren’t for you danged kids!!!

The unmasked phantom thread, free from its curse of safety, was killed and returned to the phantom aether to be called upon in other sorcery.

October 29, 2022 12:00 AM

October 27, 2022

Well-Typed
GHC activities report: August-September 2022

This is the fourteenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of August and September 2022. You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools and improvements to HLS. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!


The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire. Many others within Well-Typed are contributing to GHC more occasionally.


  • August and September were quiet months due to holidays; now our attention is turning to the forthcoming 9.4.3 and 9.2.5 releases and the 9.6 branch.

  • Zubin has been preparing GHC 9.2.5 and backporting critical runtime performance fixes like #21755 to the branch.


  • Matt finished an extension to interface files which allows the whole core program to be stored in an interface file. This improves restart times of GHCi and can massively improve compilation times of projects involving many Template Haskell splices. (!7502)

  • Matt investigated a number of issues to do with interface file determinism and added some CI jobs to try to check that we are producing deterministic interfaces. (!8895)

Compiler performance

  • Andreas investigated the benefit of being less aggressive in pruning specializations in #18532, where it turned out the current behaviour is already optimal.

  • Andreas investigated GHC and core lint performance in #22200. This resulted in !9055 where a few key improvements improved compile times for builds using -dcore-lint by ~15% in the common case and ~30% in edge cases.

  • Matt did another round of space usage investigation and fixed a number of leaks in the simplifier. These changes are most obvious when compiling very large modules. The fixes focused on making sure we didn’t retain old bindings across simplifier passes. (#22102, !8896)


  • Sam improved how GHC picks Given quantified constraints when trying to solve a Wanted constraint, by picking the quantified constraint with the weakest precondition (if one exists). This fixes #22216 and #22223.

Error messages

  • Sam finalised and landed a patch adding error codes to error messages (!8849). All errors and warnings that GHC emits using the new diagnostic infrastructure now come with an associated unique code, e.g. error: [GHC-53633] Pattern match is redundant. These can then be used for looking up documentation, for example in the Haskell Error Index. This is part of a Haskell Foundation proposal.

Code generation

  • Andreas changed the tag inference pass to apply in a few more situations in !8747. In particular, code returning variables that are statically known to be properly tagged, as well as dataToTag, benefit from this change and will produce more efficient code.

  • Ben and Andreas fixed code generation bug #21968 which sometimes caused incorrect results when compiling without optimization.

  • Andreas fixed #22042 where GHC sometimes produced invalid bytecode inside GHCi.

  • Ben fixed a bug in code generation for intMulMayOflo# on AArch64 (#21624) and updated test-primops to exercise the affected codepath.

  • Ben fixed a bug in code generation on x86_64 (#21968) where under some conditions a switch discriminator could clobber a live value.

  • Ben fixed a bug in the LLVM code generator which would break GHC-generated initializers. (#22019)

Core-to-Core pipeline

  • Andreas investigated #21960 about regressions in the 9.2/9.4 point releases. They were tracked back to the simple fix for #21694. Simon Peyton Jones provided a more robust solution in !8862.

  • Andreas identified #22075 where GHC would sometimes enter an infinite loop while compiling a program involving recursive top level bindings, which Simon Peyton Jones then fixed in !8905.

Runtime system

  • Ben fixed a bug in newArrayArray# in 9.2 where zero-sized arrays would cause an integer underflow when initializing the card array. (#21962)
  • Ben continued work on teaching the runtime linker about constructor/destructor priorities, improving reliability of interoperation with C++ code. (#21947)
  • Ben diagnosed and fixed a bug in the nonmoving collector where objects could be inappropriately scavenged. (#21885)


  • Ben reworked GHC’s handling of IPE information, significantly reducing the on-disk size and improving initialization efficiency. (!8868)


  • Ben finished and merged his thread introspection branch, allowing user programs to enumerate the threads of a program and query the label, state, and stack of each. (!2816)
  • Ben reworked the exception provenance proposal and rewrote his prototype implementation.


  • Ben fixed a number of packaging issues (#21901, #21965, #21713, #21974, #21506, #21956, #21988, #21976, #21974) and worked to improve CI to ensure that such regressions are caught in the future.

  • Matt fixed a number of packaging issues to do with Hackage documentation and added CI jobs which generate documentation suitable for upload to Hackage. (!8846, !8841)


  • Matt improved Hadrian build times by increasing the amount of parallelism available. On a full build the total time is 75% of the time before these patches. (!8879)
  • Sam made some improvements to Hadrian bootstrapping on Windows.


  • Ben worked to fix various CI issues (#21986) and began work on testing cross-compilers under CI. (#21480)
  • Ben finished his work removing the make build system from GHC’s source tree, dropping over 10kLoC from the repository. (#17527)
  • Ben looked into the feasibility of notarizing macOS binary distributions. (#17418)

by ben, andreask, matthew, zubin, sam at October 27, 2022 12:00 AM

October 24, 2022

Comonad Reader

Domains, Sets, Traversals and Applicatives

Last time I looked at free monoids, and noticed that in Haskell lists don't really cut it. This is a consequence of laziness and general recursion. To model a language with those properties, one needs to use domains and monotone, continuous maps, rather than sets and total functions (a call-by-value language with general recursion would use domains and strict maps instead).

This time I'd like to talk about some other examples of this, and point out how doing so can (perhaps) resolve some disagreements that people have about the specific cases.

The first example is not one that I came up with: induction. It's sometimes said that Haskell does not have inductive types at all, or that we cannot reason about functions on its data types by induction. However, I think this is (technically) inaccurate. What's true is that we cannot simply pretend that our types are sets and use the induction principles for sets to reason about Haskell programs. Instead, one has to figure out what inductive domains would be, and what their proof principles are.

Fortunately, there are some papers about doing this. The most recent (that I'm aware of) is Generic Fibrational Induction. I won't get too into the details, but it shows how one can talk about induction in a general setting, where one has a category that roughly corresponds to the type theory/programming language, and a second category of proofs that is 'indexed' by the first category's objects. Importantly, it is not required that the second category is somehow 'part of' the type theory being reasoned about, as is often the case with dependent types, although that is also a special case of their construction.

One of the results of the paper is that this framework can be used to talk about induction principles for types that don't make sense as sets. Specifically:

newtype Hyp = Hyp ((Hyp -> Int) -> Int)

the type of "hyperfunctions". Instead of interpreting this type as a set, where it would effectively require a set that is isomorphic to the power set of its power set, they interpret it in the category of domains and strict functions mentioned earlier. They then construct the proof category in a similar way as one would for sets, except instead of talking about predicates as subsets, we talk about sub-domains instead. Once this is done, their framework gives a notion of induction for this type.

This example is suitable for ML (and suchlike), due to the strict functions, and sort of breaks the idea that we can really get away with only thinking about sets, even there. Sets are good enough for some simple examples (like flat domains where we don't care about ⊥), but in general we have to generalize induction itself to apply to all types in the 'good' language.

While I haven't worked out how the generic induction would work out for Haskell, I have little doubt that it would, because ML actually contains all of Haskell's data types (and vice versa). So the fact that the framework gives meaning to induction for ML implies that it does so for Haskell. If one wants to know what induction for Haskell's 'lazy naturals' looks like, they can study the ML analogue of:

data LNat = Zero | Succ (() -> LNat)

because function spaces lift their codomain, and make things 'lazy'.


The other example I'd like to talk about hearkens back to the previous article. I explained how foldMap is the proper fundamental method of the Foldable class, because it can be massaged to look like:

foldMap :: Foldable f => f a -> FreeMonoid a

and lists are not the free monoid, because they do not work properly for various infinite cases.

I also mentioned that foldMap looks a lot like traverse:

foldMap  :: (Foldable t   , Monoid m)      => (a -> m)   -> t a -> m
traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)

And of course, we have Monoid m => Applicative (Const m), and the functions are expected to agree in this way when applicable.
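The agreement can be stated as a definition; this is essentially foldMapDefault from Data.Traversable, spelled out here as a sketch:

```haskell
import Data.Functor.Const (Const (..))

-- foldMap recovered from traverse by instantiating the Applicative at
-- Const m, whose (<*>) just mappends the accumulated monoid values.
foldMapViaTraverse :: (Traversable t, Monoid m) => (a -> m) -> t a -> m
foldMapViaTraverse f = getConst . traverse (Const . f)
```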

Now, people like to get in arguments about whether traversals are allowed to be infinite. I know Ed Kmett likes to argue that they can be, because he has lots of examples. But, not everyone agrees, and especially people who have papers proving things about traversals tend to side with the finite-only side. I've heard this includes one of the inventors of Traversable, Conor McBride.

In my opinion, the above disagreement is just another example of a situation where we have a generic notion instantiated in two different ways, and intuition about one does not quite transfer to the other. If you are working in a language like Agda or Coq (for proving), you will be thinking about traversals in the context of sets and total functions. And there, traversals are finite. But in Haskell, there are infinitary cases to consider, and they should work out all right when thinking about domains instead of sets. But I should probably put forward some argument for this position (and even if I don't need to, it leads somewhere else interesting).

One example that people like to give about finitary traversals is that they can be done via lists. Given a finite traversal, we can traverse to get the elements (using Const [a]), traverse the list, then put them back where we got them by traversing again (using State [a]). Usually when you see this, though, there's some subtle cheating in relying on the list to be exactly the right length for the second traversal. It will be, because we got it from a traversal of the same structure, but I would expect proving that the function is actually total to be a lot of work. Thus, I'll use this as an excuse to do my own cheating later.
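The list trick can be sketched with base alone; mapAccumL is the State [a] traversal in disguise, and the refill step is exactly where the cheating lives, since its pattern match assumes the list is precisely long enough:

```haskell
import Data.Functor.Const (Const (..))
import Data.Traversable (mapAccumL)

-- Collect the elements: the Const [a] traversal.
contents :: Traversable t => t a -> [a]
contents = getConst . traverse (Const . (: []))

-- Put replacements back in the same positions: the State [a] traversal,
-- written with mapAccumL. Partial: the (x:xs) match relies on the list
-- being exactly the right length - the subtle cheat mentioned above.
refill :: Traversable t => t a -> [b] -> t b
refill t bs = snd (mapAccumL (\(x : xs) _ -> (xs, x)) bs t)
```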

Now, the above uses lists, but why are we using lists when we're in Haskell? We know they're deficient in certain ways. It turns out that we can give a lot of the same relevant structure to the better free monoid type:

newtype FM a = FM (forall m. Monoid m => (a -> m) -> m) deriving (Functor)
instance Applicative FM where
  pure x = FM ($ x)
  FM ef <*> FM ex = FM $ \k -> ef $ \f -> ex $ \x -> k (f x)
instance Monoid (FM a) where
  mempty = FM $ \_ -> mempty
  mappend (FM l) (FM r) = FM $ \k -> l k <> r k
instance Foldable FM where
  foldMap f (FM e) = e f
newtype Ap f b = Ap { unAp :: f b }
instance (Applicative f, Monoid b) => Monoid (Ap f b) where
  mempty = Ap $ pure mempty
  mappend (Ap l) (Ap r) = Ap $ (<>) <$> l <*> r
instance Traversable FM where
  traverse f (FM e) = unAp . e $ Ap . fmap pure . f

So, free monoids are Monoids (of course), Foldable, and even Traversable. At least, we can define something with the right type that wouldn't bother anyone if it were written in a total language with the right features, but in Haskell it happens to allow various infinite things that people don't like.

Now it's time to cheat. First, let's define a function that can take any Traversable to our free monoid:

toFreeMonoid :: Traversable t => t a -> FM a
toFreeMonoid f = FM $ \k -> getConst $ traverse (Const . k) f

Now let's define a Monoid that's not a monoid:

data Cheat a = Empty | Single a | Append (Cheat a) (Cheat a)
instance Monoid (Cheat a) where
  mempty = Empty
  mappend = Append

You may recognize this as the data version of the free monoid from the previous article, where we get the real free monoid by taking a quotient. Using this, we can define an Applicative that's not valid:

newtype Cheating b a =
  Cheating { prosper :: Cheat b -> a } deriving (Functor)
instance Applicative (Cheating b) where
  pure x = Cheating $ \_ -> x
  Cheating f <*> Cheating x = Cheating $ \c -> case c of
    Append l r -> f l (x r)

Given these building blocks, we can define a function to relabel a traversable using a free monoid:

relabel :: Traversable t => t a -> FM b -> t b
relabel t (FM m) = prosper (traverse (const hope) t) (m Single)
 where
  hope = Cheating $ \c -> case c of
    Single x -> x

And we can implement any traversal by taking a trip through the free monoid:

slowTraverse
  :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
slowTraverse f t = fmap (relabel t) . traverse f . toFreeMonoid $ t

And since we got our free monoid via traversing, all the partiality I hid in the above won't blow up in practice, rather like the case with lists and finite traversals.

Arguably, this is worse cheating. It relies on the exact association structure to work out, rather than just the number of elements. The reason is that for infinitary cases, you cannot flatten things out, and there's really no way to detect when you have something infinitary. The finitary traversals have the luxury of being able to reassociate everything to a canonical form, while the infinite cases force us to not do any reassociating at all. So this might be somewhat unsatisfying.

But, what if we didn't have to cheat at all? We can get the free monoid by tweaking foldMap, and it looks like traverse, so what happens if we do the same manipulation to the latter?

It turns out that lens has a type for this purpose, a slight specialization of which is:

newtype Bazaar a b t =
  Bazaar { runBazaar :: forall f. Applicative f => (a -> f b) -> f t }

Using this type, we can reorder traverse to get:

howBizarre :: Traversable t => t a -> Bazaar a b (t b)
howBizarre t = Bazaar $ \k -> traverse k t

But now, what do we do with this? And what even is it? [1]

If we continue drawing on intuition from Foldable, we know that foldMap is related to the free monoid. Traversable has more indexing, and instead of Monoid uses Applicative. But the latter are actually related to the former; Applicatives are monoidal (closed) functors. And it turns out, Bazaar has to do with free Applicatives.

If we want to construct free Applicatives, we can use our universal property encoding trick:

newtype Free p f a =
  Free { gratis :: forall g. p g => (forall x. f x -> g x) -> g a }

This is a higher-order version of the free p, where we parameterize over the constraint we want to use to represent structures. So Free Applicative f is the free Applicative over a type constructor f. I'll leave the instances as an exercise.
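Since the instances are left as an exercise, here is a hedged sketch of what they can look like, specializing the constructor kind to Type -> Type and working at p ~ Applicative, the case used below. The helper names liftF and lowerFree are my own, not from the post:

```haskell
{-# LANGUAGE RankNTypes, ConstraintKinds, FlexibleInstances, KindSignatures #-}
import Data.Functor.Identity (Identity (..))
import Data.Kind (Constraint, Type)

newtype Free (p :: (Type -> Type) -> Constraint) (f :: Type -> Type) a =
  Free { gratis :: forall g. p g => (forall x. f x -> g x) -> g a }

-- Generators embed into the free structure...
liftF :: f x -> Free Applicative f x
liftF fx = Free $ \k -> k fx

-- ...and any natural interpretation of f into an Applicative eliminates it.
lowerFree :: Applicative g => (forall x. f x -> g x) -> Free Applicative f a -> g a
lowerFree k (Free e) = e k

-- The instances simply delegate to the constrained target functor g.
instance Functor (Free Applicative f) where
  fmap h (Free e) = Free $ \k -> fmap h (e k)

instance Applicative (Free Applicative f) where
  pure x = Free $ \_ -> pure x
  Free ef <*> Free ex = Free $ \k -> ef k <*> ex k
```

Lowering into a concrete Applicative such as Maybe then recovers an ordinary effectful value from a free expression.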

Since the free monoid is a monad, we'd expect Free p to be a monad, too. In this case, it is a McBride style indexed monad, as seen in The Kleisli Arrows of Outrageous Fortune.

type f ~> g = forall x. f x -> g x
embed :: f ~> Free p f
embed fx = Free $ \k -> k fx
translate :: (f ~> g) -> Free p f ~> Free p g
translate tr (Free e) = Free $ \k -> e (k . tr)
collapse :: Free p (Free p f) ~> Free p f
collapse (Free e) = Free $ \k -> e $ \(Free e') -> e' k

That paper explains how these are related to Atkey style indexed monads:

data At key i j where
  At :: key -> At key i i
type Atkey m i j a = m (At a j) i
ireturn :: IMonad m => a -> Atkey m i i a
ireturn = ...
ibind :: IMonad m => Atkey m i j a -> (a -> Atkey m j k b) -> Atkey m i k b
ibind = ...

It turns out, Bazaar is exactly the Atkey indexed monad derived from the Free Applicative indexed monad (with some arguments shuffled) [2]:

hence :: Bazaar a b t -> Atkey (Free Applicative) t b a
hence bz = Free $ \tr -> runBazaar bz $ tr . At
forth :: Atkey (Free Applicative) t b a -> Bazaar a b t
forth fa = Bazaar $ \g -> gratis fa $ \(At a) -> g a
imap :: (a -> b) -> Bazaar a i j -> Bazaar b i j
imap f (Bazaar e) = Bazaar $ \k -> e (k . f)
ipure :: a -> Bazaar a i i
ipure x = Bazaar ($ x)
(>>>=) :: Bazaar a j i -> (a -> Bazaar b k j) -> Bazaar b k i
Bazaar e >>>= f = Bazaar $ \k -> e $ \x -> runBazaar (f x) k
(>==>) :: (s -> Bazaar i o t) -> (i -> Bazaar a b o) -> s -> Bazaar a b t
(f >==> g) x = f x >>>= g

As an aside, Bazaar is also an (Atkey) indexed comonad, and the one that characterizes traversals, similar to how indexed store characterizes lenses. A Lens s t a b is equivalent to a coalgebra s -> Store a b t. A traversal is a similar Bazaar coalgebra:

  s -> Bazaar a b t
  s -> forall f. Applicative f => (a -> f b) -> f t
  forall f. Applicative f => (a -> f b) -> s -> f t

It so happens that Kleisli composition of the Atkey indexed monad above (>==>) is traversal composition.

Anyhow, Bazaar also inherits Applicative structure from Free Applicative:

instance Functor (Bazaar a b) where
  fmap f (Bazaar e) = Bazaar $ \k -> fmap f (e k)
instance Applicative (Bazaar a b) where
  pure x = Bazaar $ \_ -> pure x
  Bazaar ef <*> Bazaar ex = Bazaar $ \k -> ef k <*> ex k

This is actually analogous to the Monoid instance for the free monoid; we just delegate to the underlying structure.

The more exciting thing is that we can fold and traverse over the first argument of Bazaar, just like we can with the free monoid:

bfoldMap :: Monoid m => (a -> m) -> Bazaar a b t -> m
bfoldMap f (Bazaar e) = getConst $ e (Const . f)
newtype Comp g f a = Comp { getComp :: g (f a) } deriving (Functor)
instance (Applicative f, Applicative g) => Applicative (Comp g f) where
  pure = Comp . pure . pure
  Comp f <*> Comp x = Comp $ liftA2 (<*>) f x
btraverse :: (Applicative f) => (a -> f a') -> Bazaar a b t -> f (Bazaar a' b t)
btraverse f (Bazaar e) = getComp $ e (Comp . fmap ipure . f)

This is again analogous to the free monoid code. Comp is the analogue of Ap, and we use ipure in btraverse. I mentioned that Bazaar is a comonad:

extract :: Bazaar b b t -> t
extract (Bazaar e) = runIdentity $ e Identity

And now we are finally prepared to not cheat:

honestTraverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
honestTraverse f = fmap extract . btraverse f . howBizarre

So, we can traverse by first turning our Traversable into some structure that's kind of like the free monoid, except having to do with Applicative, traverse that, and then pull a result back out. Bazaar retains the information that we're eventually building back the same type of structure, so we don't need any cheating.
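As a sanity check, the pieces above assemble into one self-contained module. The Bazaar definitions are transcribed from the post; checking honestTraverse against the standard traverse is my addition:

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Applicative (liftA2)
import Data.Functor.Identity (Identity (..))

newtype Bazaar a b t =
  Bazaar { runBazaar :: forall f. Applicative f => (a -> f b) -> f t }

instance Functor (Bazaar a b) where
  fmap f (Bazaar e) = Bazaar $ \k -> fmap f (e k)

instance Applicative (Bazaar a b) where
  pure x = Bazaar $ \_ -> pure x
  Bazaar ef <*> Bazaar ex = Bazaar $ \k -> ef k <*> ex k

ipure :: a -> Bazaar a i i
ipure x = Bazaar ($ x)

howBizarre :: Traversable t => t a -> Bazaar a b (t b)
howBizarre t = Bazaar $ \k -> traverse k t

-- Composition of applicatives, used to traverse under the Bazaar.
newtype Comp g f a = Comp { getComp :: g (f a) }

instance (Functor g, Functor f) => Functor (Comp g f) where
  fmap h (Comp x) = Comp (fmap (fmap h) x)

instance (Applicative f, Applicative g) => Applicative (Comp g f) where
  pure = Comp . pure . pure
  Comp h <*> Comp x = Comp (liftA2 (<*>) h x)

btraverse :: Applicative f => (a -> f a') -> Bazaar a b t -> f (Bazaar a' b t)
btraverse f (Bazaar e) = getComp $ e (Comp . fmap ipure . f)

extract :: Bazaar b b t -> t
extract (Bazaar e) = runIdentity (e Identity)

honestTraverse :: (Applicative f, Traversable t) => (a -> f b) -> t a -> f (t b)
honestTraverse f = fmap extract . btraverse f . howBizarre
```

On lists, honestTraverse agrees with traverse for both Maybe and the list applicative.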

To pull this back around to domains, there's nothing about this code to object to if done in a total language. But, if we think about our free Applicative-ish structure, in Haskell, it will naturally allow infinitary expressions composed of the Applicative operations, just like the free monoid will allow infinitary monoid expressions. And this is okay, because some Applicatives can make sense of those, so throwing them away would make the type not free, in the same way that even finite lists are not the free monoid in Haskell. And this, I think, is compelling enough to say that infinite traversals are right for Haskell, just as they are wrong for Agda.

For those who wish to see executable code for all this, I've put files here and here. The latter also contains some extra goodies at the end that I may talk about in further installments.

[1] Truth be told, I'm not exactly sure.

[2] It turns out, you can generalize Bazaar to have a correspondence for every choice of p:

newtype Bizarre p a b t =
  Bizarre { bizarre :: forall f. p f => (a -> f b) -> f t }

hence and forth above go through with the more general types. This can be seen here.

by Dan Doel at October 24, 2022 05:47 PM

Free Monoids in Haskell

It is often stated that Foldable is effectively the toList class. However, this turns out to be wrong. The real fundamental member of Foldable is foldMap (which should look suspiciously like traverse, incidentally). To understand exactly why this is, it helps to understand another surprising fact: lists are not free monoids in Haskell.

This latter fact can be seen relatively easily by considering another list-like type:

data SL a = Empty | SL a :> a
instance Monoid (SL a) where
  mempty = Empty
  mappend ys Empty = ys
  mappend ys (xs :> x) = (mappend ys xs) :> x
single :: a -> SL a
single x = Empty :> x

So, we have a type SL a of snoc lists, which are a monoid, and a function that embeds a into SL a. If (ordinary) lists were the free monoid, there would be a unique monoid homomorphism from lists to snoc lists. Such a homomorphism (call it h) would have the following properties:

h [] = Empty
h (xs <> ys) = h xs <> h ys
h [x] = single x

And in fact, this (together with some general facts about Haskell functions) should be enough to define h for our purposes (or any purposes, really). So, let's consider its behavior on two values:

h [1] = single 1
h [1,1..] = h ([1] <> [1,1..]) -- [1,1..] is an infinite list of 1s
          = h [1] <> h [1,1..]

This second equation can tell us what the value of h is at this infinite value, since we can consider it the definition of a possibly infinite value:

x = h [1] <> x = fix (single 1 <>)
h [1,1..] = x

(single 1 <>) is a strict function, so the fixed point theorem tells us that x = ⊥.

This is a problem, though. Considering some additional equations:

[1,1..] <> [n] = [1,1..] -- true for all n
h [1,1..] = ⊥
h ([1,1..] <> [1]) = h [1,1..] <> h [1]
                   = ⊥ <> single 1
                   = ⊥ :> 1
                   ≠ ⊥

So, our requirements for h are contradictory, and no such homomorphism can exist.

The issue is that Haskell types are domains. They contain these extra partially defined values and infinite values. The monoid structure on (cons) lists has infinite lists absorbing all right-hand sides, while the snoc lists are just the opposite.
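For contrast, on finite lists the homomorphism is perfectly definable. A sketch, with the Monoid instance routed through Semigroup as current GHC requires, and with h defined by folding (only claimed to be a homomorphism on finite inputs):

```haskell
-- Snoc lists, as in the post, with derived equality for checking.
data SL a = Empty | SL a :> a
  deriving (Eq, Show)

instance Semigroup (SL a) where
  ys <> Empty     = ys
  ys <> (xs :> x) = (ys <> xs) :> x

instance Monoid (SL a) where
  mempty = Empty

single :: a -> SL a
single x = Empty :> x

-- The candidate homomorphism; fine on finite lists, ⊥ on infinite ones.
h :: [a] -> SL a
h = foldr (\x acc -> single x <> acc) Empty
```

On finite inputs the three homomorphism equations hold; it is only the infinite values of the domain that make them contradictory.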

This also means that finite lists (or any method of implementing finite sequences) are not free monoids in Haskell. They, as domains, still contain the additional bottom element, and it absorbs all other elements, which is incorrect behavior for the free monoid:

pure x <> ⊥ = ⊥
h ⊥ = ⊥
h (pure x <> ⊥) = [x] <> h ⊥
                = [x] ++ ⊥
                = x:⊥
                ≠ ⊥

So, what is the free monoid? In a sense, it can't be written down at all in Haskell, because we cannot enforce value-level equations, and because we don't have quotients. But, if conventions are good enough, there is a way. First, suppose we have a free monoid type FM a. Then for any other monoid m and embedding a -> m, there must be a monoid homomorphism from FM a to m. We can model this as a Haskell type:

forall a m. Monoid m => (a -> m) -> FM a -> m

Where we consider the Monoid m constraint to be enforcing that m actually has valid monoid structure. Now, a trick is to recognize that this sort of universal property can be used to define types in Haskell (or, GHC at least), due to polymorphic types being first class; we just rearrange the arguments and quantifiers, and take FM a to be the polymorphic type:

newtype FM a = FM { unFM :: forall m. Monoid m => (a -> m) -> m }

Types defined like this are automatically universal in the right sense. [1] The only thing we have to check is that FM a is actually a monoid over a. But that turns out to be easily witnessed:

embed :: a -> FM a
embed x = FM $ \k -> k x
instance Monoid (FM a) where
  mempty = FM $ \_ -> mempty
  mappend (FM e1) (FM e2) = FM $ \k -> e1 k <> e2 k

Demonstrating that the above is a proper monoid delegates to instances of Monoid being proper monoids. So as long as we trust that convention, we have a free monoid.
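To see the encoding in action, here is a small check of that claim, with the Monoid instance routed through Semigroup as current GHC requires: one FM value, eliminated into two different monoids:

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Monoid (Sum (..))

newtype FM a = FM { unFM :: forall m. Monoid m => (a -> m) -> m }

embed :: a -> FM a
embed x = FM $ \k -> k x

instance Semigroup (FM a) where
  FM e1 <> FM e2 = FM $ \k -> e1 k <> e2 k

instance Monoid (FM a) where
  mempty = FM $ \_ -> mempty

-- One free-monoid value...
xs :: FM Int
xs = embed 1 <> embed 2 <> embed 3
-- ...interpreted additively via Sum, or as a list via (:[]).
```

The same xs yields 6 under Sum and [1,2,3] under the list monoid, each interpretation determined solely by the embedding a -> m.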

However, one might wonder what a free monoid would look like as something closer to a traditional data type. To construct that, first ignore the required equations, and consider only the generators; we get:

data FMG a = None | Single a | FMG a :<> FMG a

Now, the proper FM a is the quotient of this by the equations:

None :<> x = x = x :<> None
x :<> (y :<> z) = (x :<> y) :<> z

One way of mimicking this in Haskell is to hide the implementation in a module, and only allow elimination into Monoids (again, using the convention that Monoid ensures actual monoid structure) using the function:

unFMG :: forall a m. Monoid m => FMG a -> (a -> m) -> m
unFMG None _ = mempty
unFMG (Single x) k = k x
unFMG (x :<> y) k = unFMG x k <> unFMG y k

This is actually how quotients can be thought of in richer languages; the quotient does not eliminate any of the generated structure internally, it just restricts the way in which the values can be consumed. Those richer languages just allow us to prove equations, and enforce properties by proof obligations, rather than conventions and structure hiding. Also, one should note that the above should look pretty similar to our encoding of FM a using universal quantification earlier.

Now, one might look at the above and have some objections. For one, we'd normally think that the quotient of the above type is just [a]. Second, it seems like the type is revealing something about the associativity of the operations, because defining recursive values via left nesting is different from right nesting, and this difference is observable by extracting into different monoids. But aren't monoids supposed to remove associativity as a concern? For instance:

ones1 = embed 1 <> ones1
ones2 = ones2 <> embed 1

Shouldn't we be able to prove these are the same, because of an argument like:

ones1 = embed 1 <> (embed 1 <> ...)
      ... reassociate ...
      = (... <> embed 1) <> embed 1
      = ones2

The answer is that the equation we have only specifies the behavior of associating three values:

x <> (y <> z) = (x <> y) <> z

And while this is sufficient to nail down the behavior of finite values, and finitary reassociating, it does not tell us that infinitary reassociating yields the same value back. And the "... reassociate ..." step in the argument above was decidedly infinitary. And while the rules tell us that we can peel any finite number of copies of embed 1 to the front of ones1 or the end of ones2, it does not tell us that ones1 = ones2. And in fact it is vital for FM a to have distinct values for these two things; it is what makes it the free monoid when we're dealing with domains of lazy values.

Finally, we can come back to Foldable. If we look at foldMap:

foldMap :: (Foldable f, Monoid m) => (a -> m) -> f a -> m

we can rearrange things a bit, and get the type:

Foldable f => f a -> (forall m. Monoid m => (a -> m) -> m)

And thus, the most fundamental operation of Foldable is not toList, but toFreeMonoid, and lists are not free monoids in Haskell.

[1]: What we are doing here is noting that (co)limits are objects that internalize natural transformations, but the natural transformations expressible by quantification in GHC are already automatically internalized using quantifiers. However, one has to be careful that the quantifiers are actually enforcing the relevant naturality conditions. In many simple cases they are.

by Dan Doel at October 24, 2022 05:47 PM

Fast Circular Substitution

Emil Axelsson and Koen Claessen wrote a functional pearl last year about Using Circular Programs for Higher-Order Syntax.

About 6 months ago I had an opportunity to play with this approach in earnest, and realized we can speed it up a great deal. This has kept coming up in conversation ever since, so I've decided to write up an article here.

In my bound library I exploit the fact that monads are about substitution to make a monad transformer that manages substitution for me.

Here I'm going to take a more coupled approach.

To have a type system with enough complexity to be worth examining, I'll adapt Dan Doel's UPTS, which is a pure type system with universe polymorphism. I won't finish the implementation here, but from the point where we stop it should be obvious how to finish the job.

Unlike Axelsson and Claessen I'm not going to bother to abstract over my name representation.

To avoid losing the original name from the source, we'll just track names as strings with an integer counting the number of times it has been 'primed'. The name is purely for expository purposes, the real variable identifier is the number. We'll follow the Axelsson and Claessen convention of having the identifier assigned to each binder be larger than any one bound inside of it. If you don't need the original source names you can cull them from the representation, but they can be useful if you are representing a syntax tree for something you parsed and/or that you plan to pretty print later.

data Name = Name String Int
   deriving (Show,Read)
hint :: Name -> String
hint (Name n _) = n
nameId :: Name -> Int
nameId (Name _ i) = i
instance Eq Name where
  (==) = (==) `on` nameId
instance Ord Name where
  compare = compare `on` nameId
prime :: String -> Int -> Name
prime n i = Name n (i + 1)

So what is the language I want to work with?

type Level = Int
data Constant
  = Level
  | LevelLiteral {-# UNPACK #-} !Level
  | Omega
  deriving (Eq,Ord,Show,Read,Typeable)
data Term a
  = Free a
  | Bound {-# UNPACK #-} !Name
  | Constant !Constant
  | Term a :+ {-# UNPACK #-} !Level
  | Max  [Term a]
  | Type !(Term a)
  | Lam   {-# UNPACK #-} !Name !(Term a) !(Term a)
  | Pi    {-# UNPACK #-} !Name !(Term a) !(Term a)
  | Sigma {-# UNPACK #-} !Name !(Term a) !(Term a)
  | App !(Term a) !(Term a)
  | Fst !(Term a)
  | Snd !(Term a)
  | Pair !(Term a) !(Term a) !(Term a)
  deriving (Show,Read,Eq,Ord,Functor,Foldable,Traversable,Typeable)

That is perhaps a bit paranoid about remaining strict, but it seemed like a good idea at the time.

We can define capture avoiding substitution on terms:

subst :: Eq a => a -> Term a -> Term a -> Term a
subst a x y = y >>= \a' ->
  if a == a'
    then x
    else return a'

Now we finally need to implement Axelsson and Claessen's circular programming trick. Here we'll abstract over terms that allow us to find the highest bound value within them:

class Bindable t where
  bound :: t -> Int

and instantiate it for our Term type

instance Bindable (Term a) where
  bound Free{}        = 0
  bound Bound{}       = 0 -- intentional!
  bound Constant{}    = 0
  bound (a :+ _)      = bound a
  bound (Max xs)      = foldr (\a r -> bound a `max` r) 0 xs
  bound (Type t)      = bound t
  bound (Lam b t _)   = nameId b `max` bound t
  bound (Pi b t _)    = nameId b `max` bound t
  bound (Sigma b t _) = nameId b `max` bound t
  bound (App x y)     = bound x `max`  bound y
  bound (Fst t)       = bound t
  bound (Snd t)       = bound t
  bound (Pair t x y)  = bound t `max` bound x `max` bound y

As in the original pearl we avoid traversing into the body of the binders, hence the _'s in the code above.

Now we can abstract over the pattern used to create a binder in the functional pearl, since we have multiple binder types in this syntax tree, and the code would get repetitive.

binder :: Bindable t =>
  (Name -> t) ->
  (Name -> t -> r) ->
  String -> (t -> t) -> r
binder bd c n e = c b body where
  body = e (bd b)
  b = prime n (bound body)
lam, pi, sigma :: String -> Term a -> (Term a -> Term a) -> Term a
lam s t   = binder Bound (`Lam` t) s
pi s t    = binder Bound (`Pi` t) s
sigma s t = binder Bound (`Sigma` t) s

We may not always want to give names to the variables we capture, so let's define:

lam_, pi_, sigma_ :: Term a -> (Term a -> Term a) -> Term a
lam_   = lam "_"
pi_    = pi "_"
sigma_ = sigma "_"

Now, here's the interesting part. The problem with Axelsson and Claessen's original trick is that every substitution is being handled separately. This means that if you were to write a monad for doing substitution with it, it'd actually be quite slow. You have to walk the syntax tree over and over and over.

We can fuse these together by making a single pass:

instantiate :: Name -> t -> IntMap t -> IntMap t
instantiate = IntMap.insert . nameId
rebind :: IntMap (Term b) -> Term a -> (a -> Term b) -> Term b
rebind env xs0 f = go xs0 where
  go = \case
    Free a       -> f a
    Bound b      -> env IntMap.! nameId b
    Constant c   -> Constant c
    m :+ n       -> go m :+ n
    Type t       -> Type (go t)
    Max xs       -> Max (fmap go xs)
    Lam b t e    -> lam   (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    Pi b t e     -> pi    (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    Sigma b t e  -> sigma (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    App x y      -> App (go x) (go y)
    Fst x        -> Fst (go x)
    Snd x        -> Snd (go x)
    Pair t x y   -> Pair (go t) (go x) (go y)

Note that the Lam, Pi and Sigma cases just extend the current environment.

With that now we can upgrade the pearl's encoding to allow for an actual Monad in the same sense as bound.

instance Applicative Term where
  pure = Free
  (<*>) = ap
instance Monad Term where
  return = Free
  (>>=) = rebind IntMap.empty

To show that we can work with this syntax tree representation, let's write an evaluator from it to weak head normal form:

First we'll need some helpers:

apply :: Term a -> [Term a] -> Term a
apply = foldl App
rwhnf :: IntMap (Term a) ->
  [Term a] -> Term a -> Term a
rwhnf env stk     (App f x)
  = rwhnf env (rebind env x Free:stk) f
rwhnf env (x:stk) (Lam b _ e)
  = rwhnf (instantiate b x env) stk e
rwhnf env stk (Fst e)
  = case rwhnf env [] e of
  Pair _ e' _ -> rwhnf env stk e'
  e'          -> Fst e'
rwhnf env stk (Snd e)
  = case rwhnf env [] e of
  Pair _ _ e' -> rwhnf env stk e'
  e'          -> Snd e'
rwhnf env stk e
  = apply (rebind env e Free) stk

Then we can start off the whnf by calling our helper with an initial starting environment:

whnf :: Term a -> Term a
whnf = rwhnf IntMap.empty []

So what have we given up? Well, bound automatically lets you compare terms for alpha equivalence by quotienting out the placement of "F" terms in the syntax tree. Here we have a problem in that the identifiers we get assigned aren't necessarily canonical.

But we can get the same identifiers out by just using the monad above:

alphaEq :: Eq a => Term a -> Term a -> Bool
alphaEq = (==) `on` liftM id

It makes me a bit uncomfortable that our monad is only up to alpha equivalence and that liftM swaps out the identifiers used throughout the entire syntax tree, and we've also lost the ironclad protection against exotic terms.

But overall, this is a much faster version of Axelsson and Claessen's trick and it can be used as a drop-in replacement for something like bound in many cases, and unlike bound, it lets you use HOAS-style syntax for constructing lam, pi and sigma terms.

With pattern synonyms you can prevent the user from doing bad things as well. Once 7.10 ships you'd be able to use a bidirectional pattern synonym for Pi, Sigma and Lam to hide the real constructors behind. I'm not yet sure of the "best practices" in this area.

Here's the code all in one place:

[Download Circular.hs]

Happy Holidays,

by Edward Kmett at October 24, 2022 05:47 PM

On the unsafety of interleaved I/O

One area where I'm at odds with the prevailing winds in Haskell is lazy I/O. It's often said that lazy I/O is evil, scary and confusing, and it breaks things like referential transparency. Having a soft spot for it, and not liking most of the alternatives, I end up on the opposite side when the topic comes up, if I choose to pick the fight. I usually don't feel like I come away from such arguments having done much good at giving lazy I/O its justice. So, I thought perhaps it would be good to spell out my whole position, so that I can give the best defense I can give, and people can continue to ignore it, without costing me as much time in the future. :)

So, what's the argument that lazy I/O, or unsafeInterleaveIO on which it's based, breaks referential transparency? It usually looks something like this:

swap (x, y) = (y, x)
setup = do
  r1 <- newIORef True
  r2 <- newIORef True
  v1 <- unsafeInterleaveIO $ do writeIORef r2 False ; readIORef r1
  v2 <- unsafeInterleaveIO $ do writeIORef r1 False ; readIORef r2
  return (v1, v2)
main = do
  p1 <- setup
  p2 <- setup
  print p1
  print . swap $ p2

I ran this, and got:

(True, False)
(True, False)

So this is supposed to demonstrate that the pure values depend on evaluation order, and we have broken a desirable property of Haskell.

First a digression. Personally I distinguish the terms, "referential transparency," and, "purity," and use them to identify two desirable properties of Haskell. The first I use for the property that allows you to factor your program by introducing (or eliminating) named subexpressions. So, instead of:

f e e

we are free to write:

let x = e in f x x

or some variation. I have no argument for this meaning, other than it's what I thought it meant when I first heard the term used with respect to Haskell, it's a useful property, and it's the best name I can think of for the property. I also (of course) think it's better than some of the other explanations you'll find for what people mean when they say Haskell has referential transparency, since it doesn't mention functions or "values". It's just about equivalence of expressions.

Anyhow, for me, the above example is in no danger of violating referential transparency. There is no factoring operation that will change the meaning of the program. I can even factor out setup (or inline it, since it's already named):

main = let m = setup
        in do p1 <- m
              p2 <- m
              print p1
              print . swap $ p2

This is the way in which IO preserves referential transparency, unlike side effects, in my view (note: the embedded language represented by IO does not have this property, since otherwise p1 could be used in lieu of p2; this is why you shouldn't spend much time writing IO stuff, because it's a bad language embedded in a good one).

The other property, "purity," I pull from Amr Sabry's paper, What is a Purely Functional Language? There he argues that a functional language should be considered "pure" if it is an extension of the lambda calculus in which there are no contexts which observe differences in evaluation order. Effectively, evaluation order must only determine whether or not you get an answer, not change the answer you get.

This is slightly different from my definition of referential transparency earlier, but it's also a useful property to have. Referential transparency tells us that we can freely refactor, and purity tells us that we can change the order things are evaluated, both without changing the meaning of our programs.

Now, it would seem that the original interleaving example violates purity. Depending on the order that the values are evaluated, opponents of lazy I/O say, the values change. However, this argument doesn't impress me, because I think the proper way to think about unsafeInterleaveIO is as concurrency, and in that case, it isn't very strange that the results of running it would be non-deterministic. And in that case, there's not much you can do to prove that the evaluation order is affecting results, and that you aren't simply very unlucky and always observing results that happen to correspond to evaluation order.

In fact, there's something I didn't tell you. I didn't use the unsafeInterleaveIO from base. I wrote my own. It looks like this:

unsafeInterleaveIO :: IO a -> IO a
unsafeInterleaveIO action = do
  iv <- new
  forkIO $
    randomRIO (1,5) >>= threadDelay . (*1000) >>
    action >>= write iv
  return . read $ iv

iv is an IVar (I used ivar-simple). The pertinent operations on them are:

new :: IO (IVar a)
write :: IVar a -> a -> IO ()
read :: IVar a -> a

new creates an empty IVar, and we can write to one only once; trying to write a second time will throw an exception. But this is no problem for me, because I obviously only attempt to write once. read will block until its argument is actually set, and since that can only happen once, it is considered safe for read to not require IO. [1]
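A minimal model of this interface in terms of MVar, as a sketch of the intended semantics (not necessarily how ivar-simple is actually implemented):

```haskell
import Control.Concurrent.MVar
import System.IO.Unsafe (unsafePerformIO)
import Prelude hiding (read)

newtype IVar a = IVar (MVar a)

new :: IO (IVar a)
new = IVar <$> newEmptyMVar

-- Writing succeeds at most once; a second write throws.
write :: IVar a -> a -> IO ()
write (IVar m) x = do
  ok <- tryPutMVar m x
  if ok then pure () else error "IVar written twice"

-- read blocks until the value arrives; since that happens at most once,
-- the result is deterministic, which justifies the pure type.
-- (In a serious implementation one would worry about NOINLINE and sharing.)
read :: IVar a -> a
read (IVar m) = unsafePerformIO (readMVar m)
```

With this, the forkIO-based unsafeInterleaveIO above can be run as-is.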

Using this and forkIO, one can easily write something like unsafeInterleaveIO, which accepts an IO a argument and yields an IO a whose result is guaranteed to be the result of running the argument at some time in the future. The only difference is that the real unsafeInterleaveIO schedules things just in time, whereas mine schedules them in a relatively random order (I'll admit I had to try a few times before I got the 'expected' lazy IO answer).

But, we could even take this to be the specification of interleaving. It runs IO actions concurrently, and you will be fine as long as you aren't attempting to depend on the exact scheduling order (or whether things get scheduled at all in some cases).

In fact, thinking of lazy I/O as concurrency turns most spooky examples into threading problems that I would expect most people to consider rather basic. For instance:

  • Don't pass a handle to another thread and close it in the original.
  • Don't fork more file-reading threads than you have file descriptors.
  • Don't fork threads to handle files if you're concerned about the files being closed deterministically.
  • Don't read from the same handle in multiple threads (unless you don't care about each thread seeing a random subsequence of the stream).

And of course, the original example in this article is just non-determinism introduced by concurrency, but not of a sort that requires fundamentally different explanation than fork. The main pitfall, in my biased opinion, is that the scheduling for interleaving is explained in a way that encourages people to try to guess exactly what it will do. But the presumption of purity (and the reordering GHC actually does based on it) actually means that you cannot assume that much more about the scheduling than you can about my scheduler, at least in general.

This isn't to suggest that lazy I/O is appropriate for every situation. Sometimes the above advice means that it is not appropriate to use concurrency. However, in my opinion, people are over eager to ban lazy I/O even for simple uses where it is the nicest solution, and justify it based on the 'evil' and 'confusing' ascriptions. But, personally, I don't think this is justified, unless one does the same for pretty much all concurrency.

I suppose the only (leading) question left to ask is which should be declared unsafe, fork or ivars, since together they allow you to construct a(n even less deterministic) unsafeInterleaveIO?

[1] Note that there are other implementations of IVar. I'd expect the most popular to be in monad-par by Simon Marlow. That allows one to construct an operation like read, but it is actually less deterministic in my construction, because it seems that it will not block unless perhaps you write and read within a single 'transaction,' so to speak.

In fact, this actually breaks referential transparency in conjunction with forkIO:

deref = runPar . get
randomDelay = randomRIO (1,10) >>= threadDelay . (1000*)
myHandle m = m `catch` \(_ :: SomeException) -> putStrLn "Bombed"
mySpawn :: IO a -> IO (IVar a)
mySpawn action = do
  iv <- runParIO new
  forkIO $ randomDelay >> action >>= runParIO . put_ iv
  return iv
main = do
  iv <- mySpawn (return True)
  myHandle . print $ deref iv
  myHandle . print $ deref iv

Sometimes this will print "Bombed" twice, and sometimes it will print "Bombed" followed by "True". The latter will never happen if we factor out the deref iv however. The blocking behavior is essential to deref maintaining referential transparency, and it seems like monad-par only blocks within a single runPar, not across multiples. Using ivar-simple in this example always results in "True" being printed twice.

It is also actually possible for unsafeInterleaveIO to break referential transparency if it is implemented incorrectly (or if the optimizer mucks with the internals in some bad way). But I haven't seen an example that couldn't be considered a bug in the implementation rather than some fundamental misbehavior. And my reference implementation here (with a suboptimal scheduler) suggests that there is no break that isn't just a bug.

by Dan Doel at October 24, 2022 05:47 PM

Categories of Structures in Haskell

In the last couple posts I've used some 'free' constructions, and not remarked too much on how they arise. In this post, I'd like to explore them more. This is going to be something of a departure from the previous posts, though, since I'm not going to worry about thinking precisely about bottom/domains. This is more an exercise in applying some category theory to Haskell, "fast and loose".

(Advance note: for some continuous code to look at see this file.)

First, it'll help to talk about how some categories can work in Haskell. For any kind k made of * and (->), [0] we can define a category of type constructors. Objects of the category will be first-class [1] types of that kind, and arrows will be defined by the following type family:

newtype Transformer f g = Transform { ($$) :: forall i. f i ~> g i }
type family (~>) :: k -> k -> * where
  (~>) = (->)
  (~>) = Transformer
type a <-> b = (a -> b, b -> a)
type a <~> b = (a ~> b, b ~> a)

So, for a base case, * has monomorphic functions as arrows, and categories for higher kinds have polymorphic functions that saturate the constructor:

  Int ~> Char = Int -> Char
  Maybe ~> [] = forall a. Maybe a -> [a]
  Either ~> (,) = forall a b. Either a b -> (a, b)
  StateT ~> ReaderT = forall s m a. StateT s m a -> ReaderT s m a

We can of course define identity and composition for these, and it will be handy to do so:

class Morph (p :: k -> k -> *) where
  id :: p a a
  (.) :: p b c -> p a b -> p a c
instance Morph (->) where
  id x = x
  (g . f) x = g (f x)
instance Morph ((~>) :: k -> k -> *)
      => Morph (Transformer :: (i -> k) -> (i -> k) -> *) where
  id = Transform id
  Transform f . Transform g = Transform $ f . g
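Specializing the component arrows inside Transformer to plain functions, the Morph machinery can be checked on concrete natural transformations; a standalone sketch without the kind-indexed family (maybeToList' and safeHead are my own example arrows):

```haskell
{-# LANGUAGE RankNTypes, PolyKinds #-}
import Prelude hiding (id, (.))

-- Transformer with its components taken directly at (->).
newtype Transformer f g = Transform { ($$) :: forall i. f i -> g i }

class Morph p where
  id :: p a a
  (.) :: p b c -> p a b -> p a c

instance Morph (->) where
  id x = x
  (g . f) x = g (f x)

-- Identity and composition lift pointwise from (->).
instance Morph Transformer where
  id = Transform id
  Transform f . Transform g = Transform (f . g)

maybeToList' :: Transformer Maybe []
maybeToList' = Transform (maybe [] (: []))

safeHead :: Transformer [] Maybe
safeHead = Transform go where
  go (x : _) = Just x
  go []      = Nothing
```

Composing the two arrows in the category at kind * -> * behaves as expected: safeHead . maybeToList' acts as the identity on Maybe values.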

These categories can be looked upon as the most basic substrates in Haskell. For instance, every type of kind * -> * is an object of the relevant category, even if it's a GADT or has other structure that prevents it from being nicely functorial.

The category for * is of course just the normal category of types and functions we usually call Hask, and it is fairly analogous to the category of sets. One common activity in category theory is to study categories of sets equipped with extra structure, and it turns out we can do this in Haskell, as well. And it even makes some sense to study categories of structures over any of these type categories.

When we equip our types with structure, we often use type classes, so that's how I'll do things here. Classes have a special social status, in that we expect people to define only instances that adhere to certain equational rules. This takes the place of equations that we are unable to state in the Haskell type system, because it lacks dependent types. So using classes allows us to define more structures than we otherwise could, if only by convention.

So, if we have a kind k, then a corresponding structure will be σ :: k -> Constraint. We can then define the category (k,σ) as having objects t :: k such that there is an instance σ t. Arrows are then taken to be f :: t ~> u such that f "respects" the operations of σ.

As a simple example, we have:

  k = *
  σ = Monoid :: * -> Constraint
  Sum Integer, Product Integer, [Integer] :: (*, Monoid)
  f :: (Monoid m, Monoid n) => m -> n
    if f mempty = mempty
       f (m <> n) = f m <> f n

This is just the category of monoids in Haskell.
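To make this concrete, here is a small runnable illustration of my own (not from the original development): Sum . length is an arrow in this category, since it sends list concatenation to addition and the empty list to zero.

```haskell
import Data.Monoid (Sum (..))

-- An arrow in the category of monoids: a function between monoids
-- that preserves mempty and (<>).
lengthHom :: [a] -> Sum Int
lengthHom = Sum . length

main :: IO ()
main = do
  -- f mempty = mempty
  print (lengthHom ([] :: String) == mempty)
  -- f (m <> n) = f m <> f n
  print (lengthHom ("ab" <> "cde") == lengthHom "ab" <> lengthHom "cde")
```

These are spot-checks at sample values, of course, not a proof of the homomorphism laws.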

As a side note, we will sometimes want to quantify over these "categories of structures". There isn't really a good way to package together a kind and a structure so that they work as a unit, but we can just add a constraint to the quantification. So, to quantify over all Monoids, we'll use 'forall m. Monoid m => ...'.

Now, once we have these categories of structures, there is an obvious forgetful functor back into the unadorned category. We can then look for free and cofree functors as adjoints to this. More symbolically:

  Forget σ :: (k,σ) -> k
  Free   σ :: k -> (k,σ)
  Cofree σ :: k -> (k,σ)
  Free σ ⊣ Forget σ ⊣ Cofree σ

However, what would be nicer (for some purposes) than having to look for these is being able to construct them all systematically, without having to think much about the structure σ.

Category theory gives a hint at this, too, in the form of Kan extensions. In category terms they look like:

  p : C -> C'
  f : C -> D
  Ran p f : C' -> D
  Lan p f : C' -> D

  Ran p f c' = end (c : C). Hom_C'(c', p c) ⇒ f c
  Lan p f c' = coend (c : C). Hom_C'(p c, c') ⊗ f c

where ⇒ is a "power" and ⊗ is a copower, which are like being able to take exponentials and products by sets (or whatever the objects of the hom category are), instead of by other objects within the category. Ends and coends are like universal and existential quantifiers (as are limits and colimits, but ends and coends involve mixed variance).

Some handy theorems relate Kan extensions and adjoint functors:

  if L ⊣ R
  then L = Ran R Id and R = Lan L Id

  if Ran R Id exists and is absolute
  then Ran R Id ⊣ R

  if Lan L Id exists and is absolute
  then L ⊣ Lan L Id

  Kan P F is absolute iff forall G. (G . Kan P F) ~= Kan P (G . F)

It turns out we can write down Kan extensions fairly generally in Haskell. Our restricted case is:

  p = Forget σ :: (k,σ) -> k
  f = Id :: (k,σ) -> (k,σ)
  Free   σ = Ran (Forget σ) Id :: k -> (k,σ)
  Cofree σ = Lan (Forget σ) Id :: k -> (k,σ)
  g :: (k,σ) -> j
  g . Free   σ = Ran (Forget σ) g
  g . Cofree σ = Lan (Forget σ) g

As long as the final category is like one of our type constructor categories, ends are universal quantifiers, powers are function types, coends are existential quantifiers and copowers are product spaces. This only breaks down for our purposes when g is contravariant, in which case they are flipped. For higher kinds, these constructions occur point-wise. So, we can break things down into four general cases, each with cases for each arity:

newtype Ran0 σ p (f :: k -> *) a =
  Ran0 { ran0 :: forall r. σ r => (a ~> p r) -> f r }
newtype Ran1 σ p (f :: k -> j -> *) a b =
  Ran1 { ran1 :: forall r. σ r => (a ~> p r) -> f r b }
-- ...
data RanOp0 σ p (f :: k -> *) a =
  forall e. σ e => RanOp0 (a ~> p e) (f e)
-- ...
data Lan0 σ p (f :: k -> *) a =
  forall e. σ e => Lan0 (p e ~> a) (f e)
data Lan1 σ p (f :: k -> j -> *) a b =
  forall e. σ e => Lan1 (p e ~> a) (f e b)
-- ...
data LanOp0 σ p (f :: k -> *) a =
  LanOp0 { lan0 :: forall r. σ r => (p r -> a) -> f r }
-- ...

The more specific proposed (co)free definitions are:

type family Free   :: (k -> Constraint) -> k -> k
type family Cofree :: (k -> Constraint) -> k -> k
newtype Free0 σ a = Free0 { gratis0 :: forall r. σ r => (a ~> r) -> r }
type instance Free = Free0
newtype Free1 σ f a = Free1 { gratis1 :: forall g. σ g => (f ~> g) -> g a }
type instance Free = Free1
-- ...
data Cofree0 σ a = forall e. σ e => Cofree0 (e ~> a) e
type instance Cofree = Cofree0
data Cofree1 σ f a = forall g. σ g => Cofree1 (g ~> f) (g a)
type instance Cofree = Cofree1
-- ...

We can define some handy classes and instances for working with these types, several of which generalize existing Haskell concepts:

class Covariant (f :: i -> j) where
  comap :: (a ~> b) -> (f a ~> f b)
class Contravariant f where
  contramap :: (b ~> a) -> (f a ~> f b)
class Covariant m => Monad (m :: i -> i) where
  pure :: a ~> m a
  join :: m (m a) ~> m a
class Covariant w => Comonad (w :: i -> i) where
  extract :: w a ~> a
  split :: w a ~> w (w a)
class Couniversal σ f | f -> σ where
  couniversal :: σ r => (a ~> r) -> (f a ~> r)
class Universal σ f | f -> σ where
  universal :: σ e => (e ~> a) -> (e ~> f a)
instance Covariant (Free0 σ) where
  comap f (Free0 e) = Free0 (e . (.f))
instance Monad (Free0 σ) where
  pure x = Free0 $ \k -> k x
  join (Free0 e) = Free0 $ \k -> e $ \(Free0 e) -> e k
instance Couniversal σ (Free0 σ) where
  couniversal h (Free0 e) = e h
-- ...

The only unfamiliar classes here should be (Co)Universal. They are for witnessing the adjunctions that make Free σ the initial σ and Cofree σ the final σ in the relevant way. Only one direction is given, since the opposite is very easy to construct with the (co)monad structure.

Free σ is a monad and couniversal, Cofree σ is a comonad and universal.
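As an aside, the couniversal property is what makes these free structures usable in practice: an element of Free0 Monoid a can be interpreted into any monoid by supplying a function on its generators. Here is a self-contained sketch, specializing the Free0 definition above to kind * (where ~> is just ->) so it compiles standalone:

```haskell
{-# LANGUAGE ConstraintKinds, KindSignatures, RankNTypes #-}

import Data.Kind (Constraint, Type)
import Data.Monoid (Sum (..))

-- Free0 from the post, with the kind annotation spelled out.
newtype Free0 (σ :: Type -> Constraint) a =
  Free0 { gratis0 :: forall r. σ r => (a -> r) -> r }

-- The couniversal arrow: interpret into any σ-structure.
couniversal0 :: σ r => (a -> r) -> Free0 σ a -> r
couniversal0 h (Free0 e) = e h

-- A free-monoid element over the generators 1, 2, 3.
example :: Free0 Monoid Int
example = Free0 $ \k -> k 1 <> k 2 <> k 3

main :: IO ()
main = do
  print (couniversal0 (: []) example)        -- interpret into lists
  print (getSum (couniversal0 Sum example))  -- interpret into addition
```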

We can now try to convince ourselves that Free σ and Cofree σ are absolute. Here are some examples:

free0Absolute0 :: forall g σ a. (Covariant g, σ (Free σ a))
               => g (Free0 σ a) <-> Ran σ Forget g a
free0Absolute0 = (l, r)
 where
 l :: g (Free σ a) -> Ran σ Forget g a
 l g = Ran0 $ \k -> comap (couniversal $ remember0 . k) g
 r :: Ran σ Forget g a -> g (Free σ a)
 r (Ran0 e) = e $ Forget0 . pure
free0Absolute1 :: forall (g :: * -> * -> *) σ a x. (Covariant g, σ (Free σ a))
               => g (Free0 σ a) x <-> Ran σ Forget g a x
free0Absolute1 = (l, r)
 where
 l :: g (Free σ a) x -> Ran σ Forget g a x
 l g = Ran1 $ \k -> comap (couniversal $ remember0 . k) $$ g
 r :: Ran σ Forget g a x -> g (Free σ a) x
 r (Ran1 e) = e $ Forget0 . pure
free0Absolute0Op :: forall g σ a. (Contravariant g, σ (Free σ a))
                 => g (Free0 σ a) <-> RanOp σ Forget g a
free0Absolute0Op = (l, r)
 where
 l :: g (Free σ a) -> RanOp σ Forget g a
 l = RanOp0 $ Forget0 . pure
 r :: RanOp σ Forget g a -> g (Free σ a)
 r (RanOp0 h g) = contramap (couniversal $ remember0 . h) g
-- ...

As can be seen, the definitions share a lot of structure. I'm quite confident that with the right building blocks these could be defined once for each of the four types of Kan extensions, with types like:

  :: forall g σ a. (Covariant g, σ (Free σ a))
  => g (Free σ a) <~> Ran σ Forget g a
  :: forall g σ a. (Covariant g, σ (Cofree σ a))
  => g (Cofree σ a) <~> Lan σ Forget g a
  :: forall g σ a. (Contravariant g, σ (Free σ a))
  => g (Free σ a) <~> RanOp σ Forget g a
  :: forall g σ a. (Contravariant g, σ (Cofree σ a))
  => g (Cofree σ a) <~> LanOp σ Forget g a

However, it seems quite difficult to structure things in a way such that GHC will accept the definitions. I've successfully written freeAbsolute using some axioms, but turning those axioms into class definitions and the like seems impossible.

Anyhow, the punchline is that we can prove absoluteness using only the premise that there is a valid σ instance for Free σ and Cofree σ. This tends to be quite easy; we just borrow the structure of the type we are quantifying over. This means that in all these cases, we are justified in saying that Free σ ⊣ Forget σ ⊣ Cofree σ, and we have a very generic presentation of (co)free structures in Haskell. So let's look at some.

We've already seen Free Monoid, and last time we talked about Free Applicative, and its relation to traversals. But, Applicative is to traversal as Functor is to lens, so it may be interesting to consider constructions on that. Both Free Functor and Cofree Functor make Functors:

instance Functor (Free1 Functor f) where
  fmap f (Free1 e) = Free1 $ fmap f . e
instance Functor (Cofree1 Functor f) where
  fmap f (Cofree1 h e) = Cofree1 h (fmap f e)

And of course, they are (co)monads, covariant functors and (co)universal among Functors. But, it happens that I know some other types with these properties:

data CoYo f a = forall e. CoYo (e -> a) (f e)
instance Covariant CoYo where
  comap f = Transform $ \(CoYo h e) -> CoYo h (f $$ e)
instance Monad CoYo where
  pure = Transform $ CoYo id
  join = Transform $ \(CoYo h (CoYo h' e)) -> CoYo (h . h') e
instance Functor (CoYo f) where
  fmap f (CoYo h e) = CoYo (f . h) e
instance Couniversal Functor CoYo where
  couniversal tr = Transform $ \(CoYo h e) -> fmap h (tr $$ e)
newtype Yo f a = Yo { oy :: forall r. (a -> r) -> f r }
instance Covariant Yo where
  comap f = Transform $ \(Yo e) -> Yo $ (f $$) . e
instance Comonad Yo where
  extract = Transform $ \(Yo e) -> e id
  split = Transform $ \(Yo e) -> Yo $ \k -> Yo $ \k' -> e $ k' . k
instance Functor (Yo f) where
  fmap f (Yo e) = Yo $ \k -> e (k . f)
instance Universal Functor Yo where
  universal tr = Transform $ \e -> Yo $ \k -> tr $$ fmap k e

These are the types involved in the (co-)Yoneda lemma. CoYo is a monad, couniversal among functors, and CoYo f is a Functor. Yo is a comonad, universal among functors, and is always a Functor. So, are these equivalent types?

coyoIso :: CoYo <~> Free Functor
coyoIso = (Transform $ couniversal pure, Transform $ couniversal pure)
yoIso :: Yo <~> Cofree Functor
yoIso = (Transform $ universal extract, Transform $ universal extract)

Indeed they are. And similar identities hold for the contravariant versions of these constructions.
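A practical consequence worth noting: CoYo, the free Functor, lets us fmap over type constructors that are not Functors at all. Data.Set.Set is the classic example, since Set.map requires an Ord constraint. This sketch (my own illustration, monomorphized to kind * -> *) defers all the mapping into CoYo and pays for the Ord constraint only once, when lowering:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import qualified Data.Set as Set

-- The free Functor, specialized to ordinary type constructors.
data CoYo f a = forall e. CoYo (e -> a) (f e)

instance Functor (CoYo f) where
  fmap f (CoYo h e) = CoYo (f . h) e

liftCoYo :: f a -> CoYo f a
liftCoYo = CoYo id

-- Lowering is the couniversal property; for Set it is Set.map.
lowerCoYo :: Ord a => CoYo Set.Set a -> Set.Set a
lowerCoYo (CoYo h s) = Set.map h s

main :: IO ()
main = print . Set.toList . lowerCoYo
     . fmap negate . fmap (+ 1)      -- no Ord needed for these fmaps
     . liftCoYo $ Set.fromList [1, 2, 3 :: Int]
```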

I don't have much of a use for this last example. I suppose to be perfectly precise, I should point out that these uses of (Co)Yo are not actually part of the (co-)Yoneda lemma. They are two different constructions. The (co-)Yoneda lemma can be given in terms of Kan extensions as:

yoneda :: Ran Id f <~> f
coyoneda :: Lan Id f <~> f

But, the use of (Co)Yo to make Functors out of things that aren't necessarily is properly thought of in other terms. In short, we have some kind of category of Haskell types with only identity arrows---it is discrete. Then any type constructor, even non-functorial ones, is certainly a functor from said category (call it Haskrete) into the normal one (Hask). And there is an inclusion functor from Haskrete into Hask:

 Haskrete --F--> Hask
      |        ↗
      |       /
      |      /
Incl  |     /
      |    /  Ran/Lan Incl F
      |   /
      |  /
      v /
    Hask

So, (Co)Free Functor can also be thought of in terms of these Kan extensions involving the discrete category.

To see more fleshed out, loadable versions of the code in this post, see this file. I may also try a similar Agda development at a later date, as it may admit the more general absoluteness constructions easier.

[0]: The reason for restricting ourselves to kinds involving only * and (->) is that they work much more simply than data kinds. Haskell values can't depend on type-level entities without using type classes. For *, this is natural, but for something like Bool -> *, it is more natural for transformations to be able to inspect the booleans, and so should be something more like forall b. InspectBool b => f b -> g b.

[1]: First-class types are what you get by removing type families and synonyms from consideration. The reason for doing so is that these can't be used properly as parameters and the like, except in cases where they reduce to some other type that is first-class. For example, if we define:

type I a = a

even though GHC will report I :: * -> *, it is not legal to write Transformer I I.

by Dan Doel at October 24, 2022 05:47 PM

Gabriella Gonzalez

How to correctly cache build-time dependencies using Nix


Professional Nix users often create a shared cache of Nix build products so that they can reuse build products created by continuous integration (CI). For example, CI might build Nix products for each main development branch of their project or even for every pull request and it would be nice if those build products could be shared with all developers via a cache.

However, uploading build products to a cache is a little non-trivial if you don’t already know the “best” solution, which is the subject of this post.

The solution described in this post is:

  • Simple

    It only takes a few lines of Bash code because we use the Nix command-line interface idiomatically

  • Efficient

    It is very cheap to compute which build products to upload and requires no additional builds nor an exorbitant amount of disk space

  • Accurate

    It uploads the build products that most people would intuitively want to upload

Note: Throughout this post I will be using the newer Nix command-line interface and flakes, which requires either adding this line to your nix.conf file:

extra-experimental-features = nix-command flakes

… and restarting your Nix daemon (if you have a multi-user Nix installation), or alternatively adding these flags to the beginning of all nix commands throughout this post:

$ nix --option extra-experimental-features 'nix-command flakes' …

Wrong solution #0

As a running example, suppose that our CI builds a top-level build product using a command like this:

$ nix build .#example

The naïve way to upload that to the cache would be:

$ nix store sign --key-file "${KEY_FILE}" --recursive .#example

$ nix copy --to s3:// .#example

Note: You will need to generate a KEY_FILE using the nix-store --generate-binary-cache-key command if you haven’t already. For more details, see the following documentation from the manual:

Click to expand to see the documentation
Operation --generate-binary-cache-key
nix-store --generate-binary-cache-key key-name secret-key-file

This command generates an Ed25519 key pair that can be used to create
a signed binary cache. It takes three mandatory parameters:

1. A key name that is used to look up keys on the client when it
verifies signatures. It can be anything, but it’s suggested to use the
host name of your cache with a suffix denoting the number of the key
(to be incremented every time you need to revoke a key).

2. The file name where the secret key is to be stored.

3. The file name where the public key is to be stored.

That seems like a perfectly reasonable thing to do, right? However, the problem with that is that it is incomplete, meaning that the cache would still be missing several useful build products that developers would expect to be there.

Specifically, the above command only copies the “run-time” dependencies of our build product, whereas most developers expect the cache to also include “build-time” dependencies, and I’ll explain the distinction between the two.

Run-time vs. Build-time

Many paths in the /nix/store are not “valid” in isolation. They typically depend on other paths within the /nix/store.

For example, suppose that I build the GNU hello package, like this:

$ nix build nixpkgs#hello

I can query all of the other paths within the /nix/store that the hello package transitively depends on at run-time using this command:

$ nix-store --query --requisites ./result

… or I can print the same information in tree form like this:

$ nix-store --query --tree ./result

On my macOS machine, it has two run-time dependencies (other than itself) within the /nix/store: libobjc and apple-framework-CoreFoundation-11.0.

Note: there might be other run-time dependencies, because I believe Nixpkgs support for macOS requires some impure system dependencies, but I’m not an expert on this so I could be wrong.

These are called “run-time” dependencies because we cannot run our hello executable without them.

Nix prevents us from getting into situations where a /nix/store path is missing its run-time dependencies. For example, if I were to nix copy the hello build product to any cache, then Nix would perform the following steps, in order:

  • Copy libobjc to the cache

    … since that has no dependencies

  • Copy apple-framework-CoreFoundation to the cache

    … since its libobjc dependency is now satisfied within the cache

  • Copy hello to the cache

    … since its apple-framework-CoreFoundation dependency is now satisfied within the cache

However, Nix also has a separate notion of “build-time” dependencies, which are the dependencies that we need in order to build the hello package.

Note: The reason we’re interested in build-time dependencies for our project is that we want developers to be able to rebuild the project if they make any changes to the source code. If we were to cache only the run-time dependencies of our project, that wouldn’t cache the development environment that developers need.

In order to query these dependencies I need to first get the “derivation” (.drv file) for hello:

$ DERIVATION="$(nix path-info --derivation nixpkgs#hello)"

$ declare -p DERIVATION
typeset DERIVATION=/nix/store/4a78f0s4p5h2sbcrrzayl5xas2i7zq1m-hello-2.12.1.drv

You can think of a derivation file as a build recipe that contains instructions for how to build the corresponding build product (the hello package in this case).

I can query the direct dependencies of that derivation using this command:

$ nix-store --query --references "${DERIVATION}"

Many of these dependencies are themselves derivations (.drv files), meaning that they represent other packages that Nix might have to build or fetch from a cache.

Note: the .drv files are actually not the build-time dependencies, but rather the instructions for building them. You can convert any .drv file to the matching product it is supposed to build using the same nix build command, like this:

$ nix build /nix/store/labgzlb16svs1z7z9a6f49b5zi8hb11s-bash-5.1-p16.drv

Does that mean that these build-time dependencies are on our machine if we built nixpkgs#hello? Not necessarily. In fact, in all likelihood the nixpkgs#hello build was cached, meaning that nix build nixpkgs#hello only downloaded hello and its run-time dependencies, and no build-time dependencies were required nor installed by Nix.

However, I could in principle force Nix to build the hello package instead of downloading it from a cache, like this:

$ nix build nixpkgs#hello --rebuild

… and that would download the direct build-time dependencies of the hello package in order to rebuild the package.

Wrong solution #1

By this point you might suppose that you have enough information to come up with a better set of /nix/store paths to cache. Your solution might look like this:

  • Get the derivation for the top-level build product

  • Get the direct build-time dependencies of that derivation

  • Build the top-level build product and its direct build-time dependencies

  • Cache the top-level build product and its direct build-time dependencies

In other words, something like this Nix code:

$ DERIVATION="$(nix path-info --derivation "${BUILD}")"

$ DEPENDENCIES=($(nix-store --query --references "${DERIVATION}"))

$ nix build "${BUILD}" "${DEPENDENCIES[@]}"

$ nix store sign --key-file "${KEY_FILE}" --recursive "${BUILD}" "${DEPENDENCIES[@]}"

$ nix copy --to "${CACHE}" "${BUILD}" "${DEPENDENCIES[@]}"

This is better, but still not good enough!

The problem with this solution is that it only works well if your dependencies never change and you only modify your top-level project. If you upgrade or patch any of your direct build-time dependencies then you need to have their build-time dependencies cached so that you can quickly rebuild them.

In fact, going two layers deep is still not enough; in practice you can’t easily anticipate in advance how deep in the build-time dependency tree you might need to patch or upgrade things. For example, you might need to patch or upgrade your compiler, which is really deep in your build-time dependency tree.

Wrong solution #2

Okay, so maybe we can try to build and cache all of our build-time dependencies?

Wrong again. There are way too many of them. You can query them by replacing --references with --requisites, and you’ll get a giant list of results, even for “small” packages. For example:

$ DERIVATION=$(nix path-info --derivation nixpkgs#hello)

$ nix-store --query --requisites "${DERIVATION}"
… 🌺 500+ derivations later 🌺 …
Click to expand and see the full list of build-time dependencies

The above command not only lists the build-time dependencies for the hello package, but also their transitive build-time dependencies. In other words, these are all the derivations needed to build the hello package “from scratch” in the absence of any cached products. We can see the complete tree of build-time dependencies like this:

$ nix-store --query --tree "${DERIVATION}"
│ ├───/nix/store/7kcayxwk8khycxw1agmcyfm9vpsqpw4s-bootstrap-tools.drv
│ │ ├───/nix/store/3glray2y14jpk1h6i599py7jdn3j2vns-mkdir.drv
│ │ ├───/nix/store/50ql5q0raqkcydmpi6wqvnhs9hpdgg5f-cpio.drv
│ │ ├───/nix/store/81xahsrhpn9mbaslgi5sz7gsqra747d4-unpack-bootstrap-tools…
│ │ ├───/nix/store/gxzl4vmccqj89yh7kz62frkxzgdpkxmp-sh.drv
│ │ └───/nix/store/pjbpvdy0gais8nc4sj3kwpniq8mgkb42-bzip2.drv
│ ├───/nix/store/3lhw0v2wyzimzl96xfsk6psfmzh38irh-bash51-007.drv
│ │ ├───/nix/store/7kcayxwk8khycxw1agmcyfm9vpsqpw4s-bootstrap-tools.drv […]
│ │ ├───/nix/store/nbxwxwqwcr9rrmxb6gb532f18102815x-bootstrap-stage0-stdenv…
│ │ ├───/nix/store/i65va14cylqc74y80ksgnrsaixk39mmh-mirrors-list.drv
… (output truncated; the full tree runs to hundreds of entries)

If we were to build and cache all of these build-time dependencies then our local /nix/store and cache would explode in size. Also, we do not need to do this because there is a better solution …

Correct solution

The solution that provides the best value is to cache all transitive build-time dependencies that are present within the current /nix/store after building the top-level build product. In other words, don’t bother to predict which build-time dependencies we need; instead, empirically infer which ones to cache based on which ones Nix installed and used along the way.

This is not only more accurate, but it’s also more efficient: we don’t need to build or download anything new because we’re only caching things we already locally installed.

As a matter of fact, the nix-store command already supports this use case quite well. If you consult the documentation for the --requisites flag, you’ll find this gem:

       • --requisites; -R
Prints out the closure of the given store paths.

This query has one option:

• --include-outputs Also include the existing output paths of store
derivations, and their closures.

This query can be used to implement various kinds of deployment. A
source deployment is obtained by distributing the closure of a store
derivation. A binary deployment is obtained by distributing the closure
of an output path. A cache deployment (combined source/binary
deployment, including binaries of build-time-only dependencies) is
obtained by distributing the closure of a store derivation and
specifying the option --include-outputs.

We’re specifically interested in a “cache deployment”, so we’re going to do exactly what the documentation says and use the --include-outputs flag in conjunction with the --requisites flag. In other words, the --include-outputs flag was expressly created for this use case!

So here is the simplest, but least robust, version of the script for computing the set of build-time dependencies to cache, as a Bash array:

$ # Continue reading before using this code; there's a more robust version later

$ # Optional: Perform the build if you haven't already
$ nix build "${BUILD}"

$ DERIVATION="$(nix path-info --derivation "${BUILD}")"

$ DEPENDENCIES=($(nix-store --query --requisites --include-outputs "${DERIVATION}"))

$ nix store sign --key-file "${KEY_FILE}" --recursive "${DEPENDENCIES[@]}"

$ nix copy --to "${CACHE}" "${DEPENDENCIES[@]}"

The above code is simple and clear enough to illustrate the idea, but we’re going to make a few adjustments to make this code more robust.

Specifically, we’re going to:

  • Change the code to support an array of build targets

    i.e. BUILDS instead of BUILD

  • Use mapfile instead of ($(…)) to create intermediate arrays

    See: SC2207

  • Use xargs to handle command line length limits

… which gives us:

$ # Optional: Perform the build if you haven't already
$ echo "${BUILDS[@]}" | xargs nix build

$ mapfile -t DERIVATIONS < <(echo "${BUILDS[@]}" | xargs nix path-info --derivation)

$ mapfile -t DEPENDENCIES < <(echo "${DERIVATIONS[@]}" | xargs nix-store --query --requisites --include-outputs)

$ echo "${DEPENDENCIES[@]}" | xargs nix store sign --key-file "${KEY_FILE}" --recursive

$ echo "${DEPENDENCIES[@]}" | xargs nix copy --to "${CACHE}"

… where you:

  • replace BUILDS with a Bash array containing what you want to build

    e.g. .#example or nixpkgs#hello

  • replace CACHE with whatever store you use as your cache

    e.g. s3://

  • replace KEY_FILE with the path to your cache signing key


That last script is the pedantically robust way to do this in Bash if you want to be super paranoid. The above script might not work in other shells, but hopefully this post was sufficiently clear that you can adapt the script to your needs.

If I made any mistakes in the above post, let me know and I can fix them.

by Gabriella Gonzalez at October 24, 2022 12:26 PM

October 22, 2022

Gabriella Gonzalez

What does "isomorphic" mean (in Haskell)?


Sometimes you’ll hear someone describe two things as being “isomorphic” to one another and I wanted to explain what that means.

You might have already guessed that “isomorphic” is a synonym for “equivalent”, and that would have been a pretty good guess. Really, the main difference between the two words is that “isomorphic” has a more precise and more general definition than “equivalent”.

In this post I will introduce a more precise definition of “isomorphic”, using Haskell code. This definition won’t be the fully general definition, but I still hope to give you some taste of how “isomorphic” can denote something more than just “equivalent”.

The simple version

The simplest and least general definition of “isomorphic” (in Haskell) is:

Two types, A and B, are isomorphic if there exist two functions, forward and backward, of the following types:

forward :: A -> B

backward :: B -> A

… such that the following two equations (which I will refer to as the “isomorphism laws”) are true:

forward . backward = id

backward . forward = id

id here is the identity function from Haskell’s Prelude, defined like this:

id :: a -> a
id x = x

… and (.) is the function composition operator (also from Haskell’s Prelude), defined like this:

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(f . g) x = f (g x)

According to the above definition, the types Bool -> a and (a, a) are isomorphic, because we can define two functions:

forward :: (Bool -> a) -> (a, a)
forward function = (function False, function True)

backward :: (a, a) -> (Bool -> a)
backward (first, second) False = first
backward (first, second) True = second

… and we can prove that those two functions satisfy the isomorphism laws using equational reasoning.

Proof of the isomorphism laws (click to expand)

Here’s the proof of the first isomorphism law:

forward . backward

-- (f . g) = \x -> f (g x)
-- … where:
-- f = forward
-- g = backward
= \x -> forward (backward x)

-- x = (first, second)
= \(first, second) -> forward (backward (first, second))

-- forward function = (function False, function True)
= \(first, second) ->
(backward (first, second) False, backward (first, second) True)

-- backward (first, second) False = first
-- backward (first, second) True = second
= \(first, second) -> (first, second)

-- x = (first, second)
-- … in reverse
= \x -> x

-- id x = x
-- … in reverse
= \x -> id x

-- η-reduction
= id

… and here is the proof of the second isomorphism law:

backward . forward

-- (f . g) = \x -> f (g x)
-- … where:
-- f = backward
-- g = forward
-- x = function
= \function -> backward (forward function)

-- forward function = (function False, function True)
= \function -> backward (function False, function True)

-- η-expand
= \function bool -> backward (function False, function True) bool

-- There are two possible cases:
-- Case #0: bool = False
-- Case #1: bool = True

-- Proof for case #0: bool = False
= \function bool -> backward (function False, function True) False

-- backward (first, second) False = first
-- … where:
-- first = function False
-- second = function True
= \function bool -> function False

-- bool = False
-- … in reverse
= \function bool -> function bool

-- η-reduction
= \function -> function

-- id x = x
-- … in reverse
= \function -> id function

-- η-reduction
= id

-- Proof for case #1: bool = True
= \function bool -> backward (function False, function True) True

-- backward (first, second) True = second
-- … where:
-- first = function False
-- second = function True
= \function bool -> function True

-- b = True
-- … in reverse
= \function bool -> function bool

-- η-reduction
= \function -> function

-- id x = x
-- … in reverse
= \function -> id function

-- η-reduction
= id

We’ll use the notation A ≅ B as a short-hand for “A is isomorphic to B”, so we can also write:

Bool -> a ≅ (a, a)

Whenever we declare that two types are isomorphic we need to actually specify what the forward and backward conversion functions are and prove that they satisfy the isomorphism laws. The existence of forward and backward functions of the correct input and output types is not enough to establish that the two types are isomorphic.

For example, suppose we changed the definition of forward to:

forward :: (Bool -> a) -> (a, a)
forward function = (function True, function False)

Then forward . backward and backward . forward would still type-check and have the right type, but they would no longer be equal to id.

In other words, when discussing isomorphic types, it’s technically not enough that the two types are equivalent. The way in which they are equivalent matters, too, if we want to be pedantic. In practice, though, if there’s only one way to implement the two conversion functions then people won’t bother to explicitly specify them.

The reason why this is important is because an isomorphism also gives us an explicit way to convert between the two types. We're not just declaring that they're equivalent, but we're spelling out exactly how to transform each type into the other type, which is very useful!
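Equational reasoning is the real proof, but it can be reassuring to spot-check the laws at a few sample points. A quick runnable check of the Bool -> a ≅ (a, a) isomorphism above (finite checks only, not a proof):

```haskell
forward :: (Bool -> a) -> (a, a)
forward function = (function False, function True)

backward :: (a, a) -> (Bool -> a)
backward (first, _)      False = first
backward (_,     second) True  = second

main :: IO ()
main = do
  -- forward . backward = id, checked at one sample pair
  print (forward (backward (1 :: Int, 2)) == (1, 2))
  -- backward . forward = id, checked pointwise at both Bools
  print (map (backward (forward not)) [False, True] == map not [False, True])
```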

More examples

Let’s speedrun through a few more examples of isomorphic types, which all parallel the rules of arithmetic:

-- 0 + a = a
Either Void a ≅ a

-- a + (b + c) = (a + b) + c
Either a (Either b c) ≅ Either (Either a b) c

-- 1 × a = a
((), a) ≅ a

-- a × (b × c) = (a × b) × c
(a, (b, c)) ≅ ((a, b), c)

-- 0 × a = 0
(Void, a) ≅ Void

-- a × (b + c) = (a × b) + (a × c)
(a, Either b c) ≅ Either (a, b) (a, c)

-- a ^ 1 = a
() -> a ≅ a

-- a ^ 0 = 1
Void -> a ≅ ()

-- (c ^ b) ^ a = (c ^ a) ^ b
a -> b -> c ≅ b -> a -> c

-- (c ^ b) ^ a = c ^ (a × b)
a -> b -> c ≅ (a, b) -> c

Exercise: implement the forward and backward functions for some of the above types and prove the isomorphism laws for each pair of functions. It will probably be very tedious to prove all of the above examples, so pick the ones that interest you the most.
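As a worked sample for one pair from the list above, the currying isomorphism a -> b -> c ≅ (a, b) -> c has conversion functions that are just curry and uncurry from the Prelude:

```haskell
-- Conversion functions for: a -> b -> c ≅ (a, b) -> c
forward :: ((a, b) -> c) -> (a -> b -> c)
forward = curry

backward :: (a -> b -> c) -> ((a, b) -> c)
backward = uncurry

-- Both isomorphism laws follow by expanding the definitions of
-- curry and uncurry and then η-reducing.
```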

Intermediate tricks

This section will introduce some more advanced tricks for proving that two types are isomorphic.

First, let’s start with a few ground rules for working with all isomorphisms:

  • Reflexivity: a ≅ a

  • Symmetry: If a ≅ b then b ≅ a

  • Transitivity: If a ≅ b and b ≅ c then a ≅ c

Now let’s get into some Haskell-specific rules:

A newtype in Haskell is isomorphic to its underlying type if the newtype constructor is public.

For example, if we were to define:

newtype Name = Name { getName :: String }

… then Name and String would be isomorphic (Name ≅ String), where:

forward :: Name -> String
forward = getName

backward :: String -> Name
backward = Name

One such newtype that shows up pretty often when reasoning about isomorphic types is the Identity type constructor from Data.Functor.Identity:

newtype Identity a = Identity { runIdentity :: a }

… where Identity a ≅ a.

To see why Identity is useful, consider the following two types:

newtype State s a = State { runState :: s -> (a, s) }

newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

The latter newtype is from the transformers package, which is how we layer on the “state” effect within a monad transformer stack. If you don’t understand what that means, that’s okay; it’s not that relevant to the point.

However, the transformers package doesn’t define State as above. Instead, it defines State like this:

type State s = StateT s Identity

The latter type synonym definition for State is equivalent (“isomorphic”) to the newtype definition for State I provided above. In order to prove that, though, I’ll need to distinguish between the two State type constructors, so I’ll use a numeric subscript to tell them apart:

import Data.Functor.Identity (Identity)

newtype State₀ s a = State₀ { runState :: s -> (a, s) }

newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

type State₁ s = StateT s Identity

… and then we can prove that State₀ is isomorphic to State₁ like this:

  • State₀ s a ≅ s -> (a, s)

    … because the State₀ newtype is isomorphic to the underlying type

  • s -> (a, s) ≅ s -> Identity (a, s)

    … because the Identity newtype is isomorphic to the underlying type

  • s -> Identity (a, s) ≅ StateT s Identity a

    … because the StateT newtype is isomorphic to the underlying type

  • StateT s Identity a = State₁ s a

    … because of how the State₁ type synonym is defined.

Therefore, by transitivity, we can conclude:

  • State₀ s a ≅ State₁ s a
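For completeness, here is a sketch (mine, not from the post) of the conversion functions that this chain of isomorphisms composes to. I use the ASCII names State0 and State1 in place of the subscripted names:

```haskell
import Data.Functor.Identity (Identity(..))

newtype State0 s a = State0 { runState0 :: s -> (a, s) }

newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

type State1 s = StateT s Identity

-- Wrap the result in Identity …
forward :: State0 s a -> State1 s a
forward (State0 f) = StateT (Identity . f)

-- … and unwrap it on the way back.
backward :: State1 s a -> State0 s a
backward (StateT f) = State0 (runIdentity . f)
```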

Okay, now let’s introduce an extremely useful rule related to isomorphic types:

If f is a Functor then forall r . (a -> r) -> f r is isomorphic to f a.

Or in other words:

Functor f => (forall r . (a -> r) -> f r) ≅ f a

… and here are the two conversion functions:

{-# LANGUAGE RankNTypes #-}

forward :: Functor f => (forall r . (a -> r) -> f r) -> f a
forward f = f id

backward :: Functor f => f a -> (forall r . (a -> r) -> f r)
backward fa k = fmap k fa

This is essentially the Yoneda lemma in Haskell form, which is actually a bit tricky to prove. If you don’t believe me, try proving the isomorphism laws for the above forward and backward functions and see how far you get. It’s much easier to rely on the fact that someone else already did the hard work of proving those isomorphism laws for us.

Here’s a concrete example of the Yoneda lemma in action. Suppose that I want to prove that there is only one implementation of the identity function, id. I can do so by proving that the type of the identity function (forall a . a -> a) is isomorphic to the () type (a type inhabited by exactly one value):

(forall a . a -> a) ≅ ()

Here’s how you prove that by chaining together several isomorphic types:

  (forall a . a -> a)
-- a ≅ () -> a
≅ (forall a . (() -> a) -> a)
-- a ≅ Identity a
≅ (forall a . (() -> a) -> Identity a)
-- ✨ Yoneda lemma (where f = Identity) ✨
≅ Identity ()
≅ ()

… so since the () type is inhabited by exactly one value (the () term) and the () type is isomorphic to the type of id, then there is exactly one way to implement id (which is id x = x).

Note: To be totally pedantic, there is exactly one way to implement id “up to isomorphism”. This is how we say that there might be several syntactically different ways of implementing id, such as:

id x = x

id y = y

id = \x -> x

id x = y
y = x

… but all of those ways of implementing id are isomorphic to one another (in a slightly different sense that I have not covered), so there is essentially only one way of implementing id.

Similarly, we can prove that there are exactly two ways to implement a function of type forall a . a -> a -> a by showing that such a type is isomorphic to Bool (a type inhabited by exactly two values):

  (forall a . a -> a -> a)
-- a -> b -> c ≅ (a, b) -> c
≅ (forall a . (a, a) -> a)
-- (a, a) ≅ Bool -> a
≅ (forall a . (Bool -> a) -> a)
-- a ≅ Identity a
≅ (forall a . (Bool -> a) -> Identity a)
-- ✨ Yoneda lemma (where f = Identity) ✨
≅ Identity Bool
≅ Bool

… and in case you’re curious, here are the only two possible ways to implement that type (up to isomorphism):

{-# LANGUAGE ExplicitForAll #-}

false :: forall a . a -> a -> a
false f t = f

true :: forall a . a -> a -> a
true f t = t

Here’s one last example of using the Yoneda lemma to prove that:

(forall r . (a -> r) -> r) ≅ a

… which you can prove like this:

  (forall r . (a -> r) -> r)
-- Identity r ≅ r
≅ (forall r . (a -> r) -> Identity r)
-- ✨ Yoneda lemma (where f = Identity) ✨
≅ Identity a
≅ a
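Here are the corresponding conversion functions (a sketch): specialize the Yoneda forward and backward to f = Identity and fuse away the Identity wrapper.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Conversion functions for: (forall r . (a -> r) -> r) ≅ a
forward :: (forall r . (a -> r) -> r) -> a
forward f = f id

backward :: a -> (forall r . (a -> r) -> r)
backward x k = k x
```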

Exercise: Prove that these two types are isomorphic:

(forall r . (b -> r) -> (a -> r)) ≅ a -> b
Solution (click to expand)
  (forall r . (b -> r) -> (a -> r))
-- a -> b -> c ≅ b -> a -> c
≅ (forall r . a -> (b -> r) -> r)
-- r ≅ Identity r
≅ (forall r . a -> (b -> r) -> Identity r)
-- ✨ Yoneda lemma (where f = Identity) ✨
≅ a -> Identity b
-- Identity b ≅ b
≅ a -> b
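The chain of isomorphisms in the solution corresponds to these conversion functions (my sketch, following the same Yoneda-at-Identity recipe as before):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Conversion functions for: (forall r . (b -> r) -> (a -> r)) ≅ a -> b
forward :: (forall r . (b -> r) -> (a -> r)) -> (a -> b)
forward f = f id

backward :: (a -> b) -> (forall r . (b -> r) -> (a -> r))
backward g k = k . g
```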


So far we’ve only used the word “isomorphic” but there is a related word we should cover: “isomorphism”.

In Haskell, if the types A and B are “isomorphic” then an “isomorphism” between them is the corresponding pair of functions converting between them (i.e. forward and backward).

The easiest way to explain this is to actually define an isomorphism type in Haskell:

data Isomorphism a b = Isomorphism
    { forward :: a -> b
    , backward :: b -> a
    }

For example:

exampleIsomorphism :: Isomorphism ((a, b) -> c) (a -> b -> c)
exampleIsomorphism = Isomorphism{ forward = curry, backward = uncurry }

However, this is not the only way we can encode an isomorphism in Haskell. For example, the lens package has an Iso type which can also represent an isomorphism:

import Control.Lens (Iso', iso)

exampleIso :: Iso' ((a, b) -> c) (a -> b -> c)
exampleIso = iso curry uncurry

These two types are equivalent. In fact, you might even say they are … isomorphic 👀.

{-# LANGUAGE NamedFieldPuns #-}

import Control.Lens (AnIso', Iso', cloneIso, iso, review, view)

data Isomorphism a b = Isomorphism
    { forward :: a -> b
    , backward :: b -> a
    }

-- | We have to use `AnIso'` here instead of `Iso'` for reasons I won't go into
isomorphismIsomorphism :: Isomorphism (Isomorphism a b) (AnIso' a b)
isomorphismIsomorphism = Isomorphism{ forward, backward }
  where
    forward :: Isomorphism a b -> AnIso' a b
    forward (Isomorphism f b) = iso f b

    backward :: AnIso' a b -> Isomorphism a b
    backward anIso = Isomorphism
        { forward = view (cloneIso anIso)
        , backward = review (cloneIso anIso)
        }

Generalized isomorphisms

I mentioned earlier that the isomorphism definition we began with was not the fully general definition. In this section we’ll slightly generalize the definition, while still sticking to something ergonomic to express within Haskell:

Two types, A and B, are isomorphic if there exist two morphisms, forward and backward, of the following types:

forward :: cat A B

backward :: cat B A

… such that cat is an instance of the Category type class and the following two equations (which I will refer to as the “isomorphism laws”) are true:

forward . backward = id

backward . forward = id

… where (.) and id are the methods of the Category type class and not necessarily the (.) and id from the Prelude.

This definition is based on the Category type class from the Control.Category module, which is defined like this:

class Category cat where
    -- | the identity morphism
    id :: cat a a

    -- | morphism composition
    (.) :: cat b c -> cat a b -> cat a c

… and all instances of the Category class must satisfy the following three “category laws”:

(f . g) . h = f . (g . h)

f . id = f

id . f = f

In other words, you can think of the Category class as generalizing our notion of functions to become “morphisms” so that we replace values of type a -> b (functions) with values of type cat a b (“morphisms”). When we generalize our notion of functions to morphisms then we can similarly generalize our notion of isomorphisms.

Of course, Haskell functions are one instance of this Category class:

instance Category (->) where
    id = Prelude.id

    (.) = (Prelude..)

… so if we take our more general definition of isomorphisms and replace cat with (->) then we get back the less general definition of isomorphisms that we started with.

However, things other than functions can be instances of this Category class, too. For example, “monadic” functions of type Monad m => a -> m b can implement Category, too, if we wrap them in a newtype:

import Control.Category (Category(..))
import Control.Monad ((<=<))

-- Note: This type and instance already exists in the `Control.Arrow` module
newtype Kleisli m a b = Kleisli{ runKleisli :: a -> m b }

instance Monad m => Category (Kleisli m) where
    id = Kleisli return

    Kleisli f . Kleisli g = Kleisli (f <=< g)

… and that satisfies the category laws because:

(f <=< g) <=< h = f <=< (g <=< h)

f <=< return = f

return <=< f = f
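To see this Category instance in action, here is a quick usage example (mine, not from the post) using the existing Kleisli type from Control.Arrow, which ships with exactly this instance: two failure-prone steps composed with the Category (.).

```haskell
import Prelude hiding ((.), id)
import Control.Category (Category(..))
import Control.Arrow (Kleisli(..))

-- Two steps that can each fail in Maybe …
safeRecip :: Kleisli Maybe Double Double
safeRecip = Kleisli (\x -> if x == 0 then Nothing else Just (1 / x))

safeSqrt :: Kleisli Maybe Double Double
safeSqrt = Kleisli (\x -> if x < 0 then Nothing else Just (sqrt x))

-- … composed with the Category (.), which threads failure through.
sqrtOfRecip :: Kleisli Maybe Double Double
sqrtOfRecip = safeSqrt . safeRecip
```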

Fun fact: The above category laws for the Kleisli type constructor are isomorphic to the monad laws (in a different sense of the word "isomorphic" that I have not covered).

Once we begin to use Category instances other than functions we can begin to explore more interesting types of “morphisms” and “isomorphisms”. However, in order to do so we need to generalize our Isomorphism type like this:

data Isomorphism cat a b = Isomorphism
    { forward :: cat a b
    , backward :: cat b a
    }

… so that we can store morphisms that are not necessarily functions.

With that generalized Isomorphism type in hand we can now create a sample Isomorphism in a Kleisli Category:

import Data.Monoid (Sum(..))
import Control.Monad.Writer (Writer, tell)

writerIsomorphism :: Isomorphism (Kleisli (Writer (Sum Integer))) () ()
writerIsomorphism = Isomorphism{ forward, backward }
  where
    forward :: Kleisli (Writer (Sum Integer)) () ()
    forward = Kleisli (\_ -> tell (Sum 1))

    backward :: Kleisli (Writer (Sum Integer)) () ()
    backward = Kleisli (\_ -> tell (Sum (-1)))

Like before, we still require that:

forward . backward = id

backward . forward = id

… but in this case the (.) and id in these two isomorphism laws will be the ones for our Kleisli type instead of the ones for functions.

Proof of isomorphism laws (click to expand)

I’ll skip over several steps for this proof to highlight the relevant parts:

forward . backward

= Kleisli (\_ -> tell (Sum 1)) . Kleisli (\_ -> tell (Sum (-1)))

= Kleisli ((\_ -> tell (Sum 1)) <=< (\_ -> tell (Sum (-1))))

= Kleisli (\_ -> tell (Sum 0))

= Kleisli return

= id
The proof of backward . forward = id is essentially the same thing, except flipped.

Note that our Isomorphism effectively says that the type () is isomorphic to the type () within this Kleisli (Writer (Sum Integer)) Category, which is not a very interesting conclusion. Rather, for this Isomorphism the (slightly more) interesting bit is in the “morphisms” (the forward and backward definitions), which are inverses of one another.

Here is one last example of a non-trivial Category instance with an example isomorphism:

import Prelude hiding ((.), id)

-- Note: This is not how the lens package works, but it's still a useful example
data Lens a b = Lens{ view :: a -> b, over :: (b -> b) -> (a -> a) }

instance Category Lens where
    id = Lens{ view = id, over = id }

    Lens{ view = viewL, over = overL } . Lens{ view = viewR, over = overR } =
        Lens{ view = viewL . viewR, over = overR . overL }

lensIsomorphism :: Isomorphism Lens Bool Bool
lensIsomorphism = Isomorphism{ forward, backward }
  where
    forward :: Lens Bool Bool
    forward = Lens{ view = not, over = \f -> not . f . not }

    -- There is no rule that the two morphisms can't be the same
    backward :: Lens Bool Bool
    backward = forward

Again, it’s not very interesting to say that Bool is isomorphic to Bool, but it is more interesting to note that the forward lens is essentially its own inverse.

There’s one last category I want to quickly mention, which is … Isomorphism!

Yes, the Isomorphism type we introduced is itself an instance of the Category class:

instance Category cat => Category (Isomorphism cat) where
    Isomorphism forwardL backwardL . Isomorphism forwardR backwardR =
        Isomorphism (forwardL . forwardR) (backwardR . backwardL)

    id = Isomorphism id id

You might even say that an “isomorphism” is a “morphism” in the above Category. An “iso”-“morphism”, if you will (where “iso” means “same”).

Furthermore, we can create an example Isomorphism in this Category of Isomorphisms:

nestedIsomorphism :: Isomorphism (Isomorphism (->)) Integer Integer
nestedIsomorphism = Isomorphism
    { forward = Isomorphism{ forward = (+ 1), backward = subtract 1 }
    , backward = Isomorphism{ forward = subtract 1, backward = (+ 1) }
    }
Okay, perhaps that’s going a bit too far, but I just wanted to end this post with a cute example of how you can keep chaining these ideas together in new ways.


In my experience, the more you train your ability to reason formally about isomorphisms the more you broaden your ability to recognize disparate things as equivalent and draw interesting connections between them.

For example, fluency with many common isomorphisms is a useful skill for API design because often there might be a way to take an API which is not very ergonomic and refactor it into an equivalent (isomorphic) API which is more ergonomic to use.

by Gabriella Gonzalez ( at October 22, 2022 08:12 PM

October 17, 2022

Donnacha Oisín Kidney

Lazily Grouping in Haskell

Posted on October 17, 2022
Tags: Haskell

Here’s a cool trick:

minimum :: Ord a => [a] -> a
minimum = head . sort

This is 𝒪(n) in Haskell, not 𝒪(n log n) as you might expect. And this isn’t because Haskell is using some weird linear-time sorting algorithm; indeed, the following is 𝒪(n log n):

maximum :: Ord a => [a] -> a
maximum = last . sort

No: since the implementation of minimum above only demands the first element of the list, and since sort has been carefully implemented, only a linear amount of work will be done to retrieve it.

It’s not easy to structure programs to have the same property as sort does above: to be maximally lazy, such that unnecessary work is not performed. Today I was working on a maximally lazy implementation of the following program:

groupOn :: Eq k => (a -> k) -> [a] -> [(k,[a])]
groupOn = ...

>>> groupOn (`rem` 2) [1..5]
[(1,[1,3,5]),(0,[2,4])]

>>> groupOn (`rem` 3) [5,8,3,6,2]
[(2,[5,8,2]),(0,[3,6])]

This function groups the elements of a list according to some key function. The desired behaviour here is a little subtle: we don’t want to just group adjacent elements, for instance.

groupOn (`rem` 3) [5,8,3,6,2] ≢ [(2,[5,8]),(0,[3,6]),(2,[2])]

And we don’t want to reorder the elements of the list by the keys:

groupOn (`rem` 3) [5,8,3,6,2] ≢ [(0,[3,6]),(2,[5,8,2])]

These constraints make it especially tricky to make this function lazy. In fact, at first glance, it seems impossible. What should, for instance, groupOn id [1..] return? It can’t even fill out the first group, since it will never find another 1. However, it can fill out the first key. And, in fact, the second. And it can fill out the first element of the first group. Precisely:

groupOn id [1..] ≡ [(1,1:⊥), (2,2:⊥), (3,3:⊥), ...

Another example is groupOn id (repeat 1), or groupOn id (cycle [1,2,3]). These each have partially-defined answers:

groupOn id (repeat 1)      ≡ (1,repeat 1):⊥

groupOn id (cycle [1,2,3]) ≡ (1,repeat 1):(2,repeat 2):(3,repeat 3):⊥

So there is some kind of well-defined lazy semantics for this function. The puzzle I was interested in was defining an efficient implementation for these semantics.

The Slow Case

The first approximation to a solution I could think of is the following:

import qualified Data.Map as Map

groupOn :: Ord k => (a -> k) -> [a] -> [(k, [a])]
groupOn k = Map.toList . Map.fromListWith (++) . map (\x -> (k x, [x]))

In fact, if you don’t care about laziness, this is probably the best solution: it’s 𝒪(n log n), it performs well (practically as well as asymptotically), and it has the expected results.

However, there are problems. Primarily this solution cares about ordering, which we don’t want. We want to emit the results in the same order that they were in the original list, and we don’t necessarily want to require an ordering on the elements (for the efficient solution we will relax this last constraint).

Instead, let’s implement our own “map” type that is inefficient, but more general.

type Map a b = [(a,b)]

insertWith :: Eq a => (b -> b -> b) -> a -> b -> Map a b -> Map a b
insertWith f k v [] = [(k,v)]
insertWith f k v ((k',v'):xs)
  | k == k'   = (k',f v v') : xs
  | otherwise = (k',v') : insertWith f k v xs

groupOn :: Eq k => (a -> k) -> [a] -> [(k, [a])]
groupOn k = foldr (uncurry (insertWith (++))) [] . map (\x -> (k x, [x]))

The problem here is that it’s not lazy enough. insertWith is strict in its last argument, which means that using foldr doesn’t gain us anything laziness-wise.

There is some extra information we can use to drive the result: we know that the result will have keys that are in the same order as they appear in the list, with duplicates removed:

groupOn :: Eq k => (a -> k) -> [a] -> [(k, [a])]
groupOn k xs = map _ ks
  where
    ks = map k xs

From here, we can get what the values should be from each key by filtering the original list:

groupOn :: Eq k => (a -> k) -> [a] -> [(k,[a])]
groupOn key xs = map (\k -> (k, filter ((k==) . key) xs)) (nub (map key xs))

Using a kind of Schwartzian transform yields the following slight improvement:

groupOn :: Eq k => (a -> k) -> [a] -> [(k,[a])]
groupOn key xs = map (\k -> (k, map snd (filter ((k ==) . fst) ks))) (nub (map fst ks))
  where
    ks = map (\x -> (key x, x)) xs

But this traverses the same list multiple times unnecessarily. The problem is that we’re repeating a lot of work between nub and the rest of the algorithm.

The following is much better:

groupOn :: Eq k => (a -> k) -> [a] -> [(k,[a])]
groupOn key = go . map (\x -> (key x, x))
  where
    go [] = []
    go ((k,x):xs) = (k, x : map snd y) : go ys
      where
        (y,ys) = partition ((k ==) . fst) xs

First, we perform the Schwartzian transform optimisation. The work of the algorithm is done in the go helper. The idea is to filter out duplicates as we encounter them: when we encounter (k,x) we can keep it immediately, but then we split the rest of the list into the components that have the same key as this element, and the ones that differ. The ones that have the same key can form the collection for this key, and those that differ are what we recurse on.

This partitioning also avoids re-traversing elements we know to be already accounted for in a previous group. I think that this is the most efficient (modulo some inlining and strictness improvements) algorithm that can do groupOn with just an Eq constraint.
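To sanity-check this version, here it is as a self-contained snippet (restated with its import) together with the motivating example from the start of the post:

```haskell
import Data.List (partition)

-- Eq-only grouping: keys appear in first-encounter order, and each group's
-- remaining members are partitioned out of the tail in one pass.
groupOn :: Eq k => (a -> k) -> [a] -> [(k, [a])]
groupOn key = go . map (\x -> (key x, x))
  where
    go [] = []
    go ((k, x) : xs) = (k, x : map snd y) : go ys
      where
        (y, ys) = partition ((k ==) . fst) xs

-- groupOn (`rem` 3) [5,8,3,6,2] evaluates to [(2,[5,8,2]),(0,[3,6])]:
-- non-adjacent duplicates are merged, and key order follows the input.
```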

A Faster Version

The reason that the groupOn above is slow is that every element returned has to traverse the entire rest of the list to remove duplicates. This is a classic pattern of quadratic behaviour: we can improve it by using the same trick as quick sort, by partitioning the list into lesser and greater elements on every call.

groupOnOrd :: Ord k => (a -> k) -> [a] -> [(k,[a])]
groupOnOrd key = go . map (\x -> (key x, x))
  where
    go [] = []
    go ((k,x):xs) = (k, x : e) : go lt ++ go gt
      where
        (e,lt,gt) = foldr split ([],[],[]) xs

        split ky@(k',y) ~(e,lt,gt) = case compare k' k of
          LT -> (e, ky:lt, gt)
          EQ -> (y:e, lt, gt)
          GT -> (e, lt, ky:gt)

While this is 𝒪(n log n), and it does group elements, it also reorders the underlying list. Let’s fix that by tagging the incoming elements with their positions, and then using those positions to order them back into their original configuration:

groupOnOrd :: Ord k => (a -> k) -> [a] -> [(k,[a])]
groupOnOrd k = map (\(_,k,xs) -> (k,xs)) . go . zipWith (\i x -> (i, k x, x)) [0..]
  where
    go [] = []
    go ((i, k, x):xs) = (i, k, x : e) : merge (go l) (go g)
      where
        (e, l, g) = foldr split ([],[],[]) xs

        split ky@(_,k',y) ~(e, l, g) = case compare k' k of
          LT -> (e  , ky : l,      g)
          EQ -> (y:e,      l,      g)
          GT -> (e  ,      l, ky : g)

    merge [] gt = gt
    merge lt [] = lt
    merge (l@(i,_,_):lt) (g@(j,_,_):gt)
      | i <= j    = l : merge lt (g:gt)
      | otherwise = g : merge (l:lt) gt

This is close, but still not right: it isn’t yet lazy, because the merge function is strict in both arguments.

However, we have all the information we need to unshuffle the lists without having to inspect them. In split, we know which direction we put each element: we can store that info without using indices.

groupOnOrd :: Ord k => (a -> k) -> [a] -> [(k,[a])]
groupOnOrd k = catMaybes . go . map (\x -> (k x, x))
  where
    go [] = []
    go ((k,x):xs) = Just (k, x : e) : merge m (go l) (go g)
      where
        (e, m, l, g) = foldr split ([],[],[],[]) xs

        split ky@(k',y) ~(e, m, l, g) = case compare k' k of
          LT -> (  e, LT : m, ky : l,      g)
          EQ -> (y:e, EQ : m,      l,      g)
          GT -> (  e, GT : m,      l, ky : g)

    merge []        lt     gt     = []
    merge (EQ : xs) lt     gt     = Nothing : merge xs lt gt
    merge (LT : xs) (l:lt) gt     = l       : merge xs lt gt
    merge (GT : xs) lt     (g:gt) = g       : merge xs lt gt

What we generate here is a [Ordering]: this list tells us the result of all the compare operations on the input list. Then, in merge, we invert the action of split, rebuilding the original list without inspecting either lt or gt.

And this solution works! It’s 𝒪(n log n), and fully lazy.

>>> map fst . groupOnOrd id $ [1..]
[1,2,3,4,5,…

>>> groupOnOrd id $ cycle [1,2,3]
(1,repeat 1):(2,repeat 2):(3,repeat 3):⊥

>>> groupOnOrd (`rem` 3) [1..]
(1,[1,4,7,10,…]):(2,[2,5,8,11,…]):(0,[3,6,9,12,…]):⊥

The finished version of these two functions, along with some benchmarks, is available here.

by Donnacha Oisín Kidney at October 17, 2022 12:00 AM

October 14, 2022

Ken T Takusagawa

[ykoqomhu] summing reciprocals minimizing round-off error

to decrease round-off error when summing a collection of positive floating-point numbers, sum numbers from smallest to largest.  however, it is not as simple as just first sorting the input list: a partial sum could become large compared to the next number to be added.  better is to put all the numbers in a priority queue (heap), then repeatedly pop off the two smallest numbers, add them, and push them back into the priority queue.  (increased precision comes at the cost of a factor of log n time.)  we demonstrate this in Haskell, using Data.PQueue.Min in the pqueue package as our priority queue.  we sum exact Rational numbers for simplicity, and we keep track of what got added to what in an expression tree.
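a minimal stdlib-only sketch of the pairwise scheme described above (my own; a sorted list stands in for the priority queue, and plain numbers stand in for the Expr tree):

```haskell
import Data.List (insert, sort)

-- repeatedly pop the two smallest numbers, add them, and push the sum back.
-- with a sorted list the "push" is Data.List.insert, which is O(n) per step;
-- a real heap (e.g. Data.PQueue.Min) makes it O(log n) instead.
sumSmallestFirst :: (Ord a, Num a) => [a] -> a
sumSmallestFirst = go . sort
  where
    go []           = 0
    go [x]          = x
    go (x : y : xs) = go (insert (x + y) xs)
```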

future work: keep track of roundoff error.

(related work, not implemented here: Kahan summation is another way to decrease round-off error.  Kahan was previously mentioned in the context of trying to avoid catastrophic loss of precision when doing trigonometry.)

source code.

here is the tail-recursive function that sums the contents of a priority queue.

reduceto1 :: Pqueue.MinQueue Expr -> Expr;
reduceto1 q = let {
  (a::Expr, q2) = Pqueue.deleteFindMin q
} in case Pqueue.minView q2 of {
  Nothing -> a;
  Just(b::Expr, q3) -> reduceto1 $ flip Pqueue.insert q3 $ Plus a b; -- smaller number on the left side of the plus sign
};

first, we demonstrate adding the first N reciprocals of integers (partial sums of the harmonic series).

1: 1/1

2: (1/2 + 1/1)

3: ((1/3 + 1/2) + 1/1)

4: (1/1 + (1/2 + (1/4 + 1/3)))

5: (1/1 + (1/2 + (1/3 + (1/5 + 1/4))))

6: (1/1 + ((1/4 + 1/3) + ((1/6 + 1/5) + 1/2)))

7: (1/1 + (((1/7 + 1/6) + 1/3) + ((1/5 + 1/4) + 1/2)))

8: ((1/2 + (1/4 + (1/8 + 1/7))) + ((1/3 + (1/6 + 1/5)) + 1/1))

9: ((1/2 + (1/4 + (1/7 + 1/6))) + ((1/3 + (1/5 + (1/9 + 1/8))) + 1/1))

10: ((1/2 + ((1/8 + 1/7) + 1/3)) + (((1/6 + 1/5) + ((1/10 + 1/9) + 1/4)) + 1/1))

11: ((1/2 + ((1/7 + 1/6) + 1/3)) + ((((1/11 + 1/10) + 1/5) + ((1/9 + 1/8) + 1/4)) + 1/1))

12: (((1/4 + (1/8 + 1/7)) + (1/3 + (1/6 + (1/12 + 1/11)))) + (((1/5 + (1/10 + 1/9)) + 1/2) + 1/1))

13: (((1/4 + (1/7 + (1/13 + 1/12))) + (1/3 + (1/6 + (1/11 + 1/10)))) + (((1/5 + (1/9 + 1/8)) + 1/2) + 1/1))

14: ((((1/8 + 1/7) + ((1/14 + 1/13) + 1/6)) + (1/3 + ((1/12 + 1/11) + 1/5))) + ((((1/10 + 1/9) + 1/4) + 1/2) + 1/1))

15: (((((1/15 + 1/14) + 1/7) + ((1/13 + 1/12) + 1/6)) + (1/3 + ((1/11 + 1/10) + 1/5))) + ((((1/9 + 1/8) + 1/4) + 1/2) + 1/1))

16: ((((1/7 + (1/14 + 1/13)) + 1/3) + ((1/6 + (1/12 + 1/11)) + (1/5 + (1/10 + 1/9)))) + (1/1 + (1/2 + (1/4 + (1/8 + (1/16 + 1/15))))))

17: ((((1/7 + (1/13 + 1/12)) + 1/3) + ((1/6 + (1/11 + 1/10)) + (1/5 + (1/9 + (1/17 + 1/16))))) + (1/1 + (1/2 + (1/4 + (1/8 + (1/15 + 1/14))))))

18: (((((1/14 + 1/13) + 1/6) + 1/3) + (((1/12 + 1/11) + 1/5) + ((1/10 + 1/9) + ((1/18 + 1/17) + 1/8)))) + (1/1 + (1/2 + (1/4 + ((1/16 + 1/15) + 1/7)))))

19: (((((1/13 + 1/12) + 1/6) + 1/3) + (((1/11 + 1/10) + 1/5) + (((1/19 + 1/18) + 1/9) + ((1/17 + 1/16) + 1/8)))) + (1/1 + (1/2 + (1/4 + ((1/15 + 1/14) + 1/7)))))

20: (((1/3 + (1/6 + (1/12 + 1/11))) + ((1/5 + (1/10 + (1/20 + 1/19))) + ((1/9 + (1/18 + 1/17)) + 1/4))) + (1/1 + (1/2 + ((1/8 + (1/16 + 1/15)) + (1/7 + (1/14 + 1/13))))))

21: (((1/3 + (1/6 + (1/11 + (1/21 + 1/20)))) + ((1/5 + (1/10 + (1/19 + 1/18))) + ((1/9 + (1/17 + 1/16)) + 1/4))) + (1/1 + (1/2 + ((1/8 + (1/15 + 1/14)) + (1/7 + (1/13 + 1/12))))))

22: (((1/3 + ((1/12 + 1/11) + ((1/22 + 1/21) + 1/10))) + ((1/5 + ((1/20 + 1/19) + 1/9)) + (((1/18 + 1/17) + 1/8) + 1/4))) + (1/1 + (1/2 + (((1/16 + 1/15) + 1/7) + ((1/14 + 1/13) + 1/6)))))

23: (((1/3 + (((1/23 + 1/22) + 1/11) + ((1/21 + 1/20) + 1/10))) + ((1/5 + ((1/19 + 1/18) + 1/9)) + (((1/17 + 1/16) + 1/8) + 1/4))) + (1/1 + (1/2 + (((1/15 + 1/14) + 1/7) + ((1/13 + 1/12) + 1/6)))))

24: ((((1/6 + (1/12 + (1/24 + 1/23))) + ((1/11 + (1/22 + 1/21)) + 1/5)) + (((1/10 + (1/20 + 1/19)) + (1/9 + (1/18 + 1/17))) + 1/2)) + (1/1 + ((1/4 + (1/8 + (1/16 + 1/15))) + ((1/7 + (1/14 + 1/13)) + 1/3))))

25: ((((1/6 + (1/12 + (1/23 + 1/22))) + ((1/11 + (1/21 + 1/20)) + 1/5)) + (((1/10 + (1/19 + 1/18)) + (1/9 + (1/17 + 1/16))) + 1/2)) + (1/1 + ((1/4 + (1/8 + (1/15 + 1/14))) + ((1/7 + (1/13 + (1/25 + 1/24))) + 1/3))))

26: ((((1/6 + ((1/24 + 1/23) + 1/11)) + (((1/22 + 1/21) + 1/10) + 1/5)) + ((((1/20 + 1/19) + 1/9) + ((1/18 + 1/17) + 1/8)) + 1/2)) + (1/1 + ((1/4 + ((1/16 + 1/15) + 1/7)) + (((1/14 + 1/13) + ((1/26 + 1/25) + 1/12)) + 1/3))))

27: ((((1/6 + ((1/23 + 1/22) + 1/11)) + (((1/21 + 1/20) + 1/10) + 1/5)) + ((((1/19 + 1/18) + 1/9) + ((1/17 + 1/16) + 1/8)) + 1/2)) + (1/1 + ((1/4 + ((1/15 + 1/14) + 1/7)) + ((((1/27 + 1/26) + 1/13) + ((1/25 + 1/24) + 1/12)) + 1/3))))

28: (((((1/12 + (1/24 + 1/23)) + (1/11 + (1/22 + 1/21))) + (1/5 + (1/10 + (1/20 + 1/19)))) + (((1/9 + (1/18 + 1/17)) + 1/4) + 1/2)) + (1/1 + (((1/8 + (1/16 + 1/15)) + (1/7 + (1/14 + (1/28 + 1/27)))) + (((1/13 + (1/26 + 1/25)) + 1/6) + 1/3))))

29: (((((1/12 + (1/23 + 1/22)) + (1/11 + (1/21 + 1/20))) + (1/5 + (1/10 + (1/19 + 1/18)))) + (((1/9 + (1/17 + 1/16)) + 1/4) + 1/2)) + (1/1 + (((1/8 + (1/15 + (1/29 + 1/28))) + (1/7 + (1/14 + (1/27 + 1/26)))) + (((1/13 + (1/25 + 1/24)) + 1/6) + 1/3))))

30: ((((((1/24 + 1/23) + 1/11) + ((1/22 + 1/21) + 1/10)) + (1/5 + ((1/20 + 1/19) + 1/9))) + ((((1/18 + 1/17) + 1/8) + 1/4) + 1/2)) + (1/1 + ((((1/16 + 1/15) + ((1/30 + 1/29) + 1/14)) + (1/7 + ((1/28 + 1/27) + 1/13))) + ((((1/26 + 1/25) + 1/12) + 1/6) + 1/3))))

31: ((((((1/23 + 1/22) + 1/11) + ((1/21 + 1/20) + 1/10)) + (1/5 + ((1/19 + 1/18) + 1/9))) + ((((1/17 + 1/16) + 1/8) + 1/4) + 1/2)) + (1/1 + (((((1/31 + 1/30) + 1/15) + ((1/29 + 1/28) + 1/14)) + (1/7 + ((1/27 + 1/26) + 1/13))) + ((((1/25 + 1/24) + 1/12) + 1/6) + 1/3))))

32: (((((1/11 + (1/22 + 1/21)) + 1/5) + ((1/10 + (1/20 + 1/19)) + (1/9 + (1/18 + 1/17)))) + 1/1) + ((1/2 + (1/4 + (1/8 + (1/16 + (1/32 + 1/31))))) + ((((1/15 + (1/30 + 1/29)) + 1/7) + ((1/14 + (1/28 + 1/27)) + (1/13 + (1/26 + 1/25)))) + (1/3 + (1/6 + (1/12 + (1/24 + 1/23)))))))

33: (((((1/11 + (1/21 + 1/20)) + 1/5) + ((1/10 + (1/19 + 1/18)) + (1/9 + (1/17 + (1/33 + 1/32))))) + 1/1) + ((1/2 + (1/4 + (1/8 + (1/16 + (1/31 + 1/30))))) + ((((1/15 + (1/29 + 1/28)) + 1/7) + ((1/14 + (1/27 + 1/26)) + (1/13 + (1/25 + 1/24)))) + (1/3 + (1/6 + (1/12 + (1/23 + 1/22)))))))

34: ((((((1/22 + 1/21) + 1/10) + 1/5) + (((1/20 + 1/19) + 1/9) + ((1/18 + 1/17) + ((1/34 + 1/33) + 1/16)))) + 1/1) + ((1/2 + (1/4 + (1/8 + ((1/32 + 1/31) + 1/15)))) + (((((1/30 + 1/29) + 1/14) + 1/7) + (((1/28 + 1/27) + 1/13) + ((1/26 + 1/25) + 1/12))) + (1/3 + (1/6 + ((1/24 + 1/23) + 1/11))))))

35: ((((((1/21 + 1/20) + 1/10) + 1/5) + (((1/19 + 1/18) + 1/9) + (((1/35 + 1/34) + 1/17) + ((1/33 + 1/32) + 1/16)))) + 1/1) + ((1/2 + (1/4 + (1/8 + ((1/31 + 1/30) + 1/15)))) + (((((1/29 + 1/28) + 1/14) + 1/7) + (((1/27 + 1/26) + 1/13) + ((1/25 + 1/24) + 1/12))) + (1/3 + (1/6 + ((1/23 + 1/22) + 1/11))))))

36: ((((1/5 + (1/10 + (1/20 + 1/19))) + ((1/9 + (1/18 + (1/36 + 1/35))) + ((1/17 + (1/34 + 1/33)) + 1/8))) + 1/1) + ((1/2 + (1/4 + ((1/16 + (1/32 + 1/31)) + (1/15 + (1/30 + 1/29))))) + (((1/7 + (1/14 + (1/28 + 1/27))) + ((1/13 + (1/26 + 1/25)) + 1/6)) + (1/3 + ((1/12 + (1/24 + 1/23)) + (1/11 + (1/22 + 1/21)))))))

37: ((((1/5 + (1/10 + (1/19 + (1/37 + 1/36)))) + ((1/9 + (1/18 + (1/35 + 1/34))) + ((1/17 + (1/33 + 1/32)) + 1/8))) + 1/1) + ((1/2 + (1/4 + ((1/16 + (1/31 + 1/30)) + (1/15 + (1/29 + 1/28))))) + (((1/7 + (1/14 + (1/27 + 1/26))) + ((1/13 + (1/25 + 1/24)) + 1/6)) + (1/3 + ((1/12 + (1/23 + 1/22)) + (1/11 + (1/21 + 1/20)))))))

38: ((((1/5 + ((1/20 + 1/19) + ((1/38 + 1/37) + 1/18))) + ((1/9 + ((1/36 + 1/35) + 1/17)) + (((1/34 + 1/33) + 1/16) + 1/8))) + 1/1) + ((1/2 + (1/4 + (((1/32 + 1/31) + 1/15) + ((1/30 + 1/29) + 1/14)))) + (((1/7 + ((1/28 + 1/27) + 1/13)) + (((1/26 + 1/25) + 1/12) + 1/6)) + (1/3 + (((1/24 + 1/23) + 1/11) + ((1/22 + 1/21) + 1/10))))))

39: ((((1/5 + (((1/39 + 1/38) + 1/19) + ((1/37 + 1/36) + 1/18))) + ((1/9 + ((1/35 + 1/34) + 1/17)) + (((1/33 + 1/32) + 1/16) + 1/8))) + 1/1) + ((1/2 + (1/4 + (((1/31 + 1/30) + 1/15) + ((1/29 + 1/28) + 1/14)))) + (((1/7 + ((1/27 + 1/26) + 1/13)) + (((1/25 + 1/24) + 1/12) + 1/6)) + (1/3 + (((1/23 + 1/22) + 1/11) + ((1/21 + 1/20) + 1/10))))))

40: (((((1/10 + (1/20 + (1/40 + 1/39))) + ((1/19 + (1/38 + 1/37)) + 1/9)) + (((1/18 + (1/36 + 1/35)) + (1/17 + (1/34 + 1/33))) + 1/4)) + 1/1) + ((1/2 + ((1/8 + (1/16 + (1/32 + 1/31))) + ((1/15 + (1/30 + 1/29)) + 1/7))) + ((((1/14 + (1/28 + 1/27)) + (1/13 + (1/26 + 1/25))) + 1/3) + ((1/6 + (1/12 + (1/24 + 1/23))) + ((1/11 + (1/22 + 1/21)) + 1/5)))))

next, sums of the reciprocals of the primes up to N.  like the harmonic series, this sum diverges when taken over all primes.

2: 1/2

3: (1/3 + 1/2)

5: (1/2 + (1/5 + 1/3))

7: (1/2 + (1/3 + (1/7 + 1/5)))

11: (1/2 + (1/3 + (1/5 + (1/11 + 1/7))))

13: ((1/5 + (1/7 + (1/13 + 1/11))) + (1/3 + 1/2))

17: (((1/11 + (1/17 + 1/13)) + 1/3) + ((1/7 + 1/5) + 1/2))

19: ((((1/19 + 1/17) + 1/7) + 1/3) + (((1/13 + 1/11) + 1/5) + 1/2))

23: ((((1/17 + 1/13) + 1/7) + 1/3) + (((1/11 + (1/23 + 1/19)) + 1/5) + 1/2))

29: (((1/7 + (1/13 + (1/29 + 1/23))) + 1/3) + ((1/5 + (1/11 + (1/19 + 1/17))) + 1/2))

31: (((1/7 + (1/13 + 1/11)) + 1/3) + ((1/5 + ((1/23 + 1/19) + (1/17 + (1/31 + 1/29)))) + 1/2))

37: (((1/7 + ((1/29 + 1/23) + 1/11)) + 1/3) + ((1/5 + ((1/19 + 1/17) + ((1/37 + 1/31) + 1/13))) + 1/2))

41: (((((1/31 + 1/29) + 1/13) + (1/11 + (1/23 + (1/41 + 1/37)))) + 1/3) + ((1/5 + ((1/19 + 1/17) + 1/7)) + 1/2))

43: ((1/3 + ((1/13 + (1/29 + 1/23)) + (1/11 + ((1/43 + 1/41) + 1/19)))) + ((1/5 + ((1/17 + (1/37 + 1/31)) + 1/7)) + 1/2))

47: ((1/3 + ((1/13 + (1/23 + (1/47 + 1/43))) + (1/11 + ((1/41 + 1/37) + 1/19)))) + ((1/5 + ((1/17 + (1/31 + 1/29)) + 1/7)) + 1/2))

53: ((1/3 + ((1/13 + 1/11) + 1/5)) + ((((1/23 + (1/43 + 1/41)) + (1/19 + 1/17)) + (((1/37 + 1/31) + (1/29 + (1/53 + 1/47))) + 1/7)) + 1/2))

59: ((1/3 + ((((1/59 + 1/53) + 1/23) + 1/11) + 1/5)) + (((((1/47 + 1/43) + (1/41 + 1/37)) + (1/19 + 1/17)) + (1/7 + ((1/31 + 1/29) + 1/13))) + 1/2))

61: ((1/3 + ((((1/53 + 1/47) + 1/23) + 1/11) + 1/5)) + (1/2 + ((((1/43 + 1/41) + 1/19) + (1/17 + (1/37 + 1/31))) + (1/7 + (((1/61 + 1/59) + 1/29) + 1/13)))))

67: ((1/3 + (((1/23 + (1/47 + 1/43)) + 1/11) + 1/5)) + (1/2 + ((((1/41 + 1/37) + 1/19) + (1/17 + ((1/67 + 1/61) + 1/31))) + (1/7 + ((1/29 + (1/59 + 1/53)) + 1/13)))))

71: ((1/3 + ((1/11 + (1/23 + (1/43 + 1/41))) + 1/5)) + (1/2 + (((1/19 + (1/37 + (1/71 + 1/67))) + (1/17 + (1/31 + (1/61 + 1/59)))) + (1/7 + ((1/29 + (1/53 + 1/47)) + 1/13)))))

73: ((1/3 + ((1/11 + ((1/47 + 1/43) + (1/41 + 1/37))) + 1/5)) + (1/2 + (((1/19 + 1/17) + (((1/73 + 1/71) + (1/67 + 1/61)) + (1/31 + 1/29))) + (1/7 + (1/13 + ((1/59 + 1/53) + 1/23))))))

79: ((1/3 + ((1/11 + ((1/43 + 1/41) + 1/19)) + 1/5)) + (1/2 + (((((1/79 + 1/73) + 1/37) + 1/17) + (((1/71 + 1/67) + 1/31) + ((1/61 + 1/59) + 1/29))) + (1/7 + (1/13 + ((1/53 + 1/47) + 1/23))))))

83: ((1/3 + ((1/11 + ((1/41 + (1/83 + 1/79)) + 1/19)) + 1/5)) + (1/2 + ((((1/37 + (1/73 + 1/71)) + 1/17) + (((1/67 + 1/61) + 1/31) + (1/29 + (1/59 + 1/53)))) + (1/7 + (1/13 + (1/23 + (1/47 + 1/43)))))))

89: ((1/3 + ((1/11 + ((1/41 + (1/79 + 1/73)) + 1/19)) + 1/5)) + (1/2 + ((((1/37 + (1/71 + 1/67)) + 1/17) + ((1/31 + (1/61 + 1/59)) + (1/29 + (1/53 + 1/47)))) + (1/7 + (1/13 + (1/23 + (1/43 + (1/89 + 1/83))))))))

97: ((1/3 + (((1/23 + (1/43 + 1/41)) + (((1/83 + 1/79) + 1/37) + 1/19)) + 1/5)) + (1/2 + (((1/17 + ((1/73 + 1/71) + (1/67 + 1/61))) + 1/7) + (((1/31 + 1/29) + 1/13) + (((1/59 + 1/53) + (1/47 + (1/97 + 1/89))) + 1/11)))))

101: ((1/3 + ((((1/47 + 1/43) + ((1/89 + 1/83) + 1/41)) + (1/19 + ((1/79 + 1/73) + 1/37))) + 1/5)) + (1/2 + (((1/17 + ((1/71 + 1/67) + 1/31)) + 1/7) + ((((1/61 + 1/59) + 1/29) + 1/13) + (((1/53 + (1/101 + 1/97)) + 1/23) + 1/11)))))

103: ((1/3 + (1/5 + ((((1/97 + 1/89) + 1/43) + (1/41 + (1/83 + 1/79))) + (1/19 + (1/37 + (1/73 + 1/71)))))) + (1/2 + (((1/17 + ((1/67 + 1/61) + 1/31)) + 1/7) + (((1/29 + (1/59 + 1/53)) + 1/13) + ((((1/103 + 1/101) + 1/47) + 1/23) + 1/11)))))

107: ((1/3 + (1/5 + (((1/43 + (1/89 + 1/83)) + (1/41 + (1/79 + 1/73))) + (1/19 + (1/37 + (1/71 + 1/67)))))) + (1/2 + (((1/17 + (1/31 + (1/61 + 1/59))) + 1/7) + (((1/29 + (1/53 + (1/107 + 1/103))) + 1/13) + ((((1/101 + 1/97) + 1/47) + 1/23) + 1/11)))))

109: ((1/3 + (1/5 + (((1/43 + 1/41) + ((1/83 + 1/79) + 1/37)) + (1/19 + 1/17)))) + (1/2 + (((((1/73 + 1/71) + (1/67 + 1/61)) + (1/31 + 1/29)) + 1/7) + ((((1/59 + (1/109 + 1/107)) + (1/53 + (1/103 + 1/101))) + 1/13) + (((1/47 + (1/97 + 1/89)) + 1/23) + 1/11)))))

113: ((1/3 + (1/5 + ((((1/89 + 1/83) + 1/41) + 1/19) + (((1/79 + 1/73) + 1/37) + 1/17)))) + (1/2 + (((((1/71 + 1/67) + 1/31) + ((1/61 + 1/59) + 1/29)) + 1/7) + (((((1/113 + 1/109) + 1/53) + ((1/107 + 1/103) + (1/101 + 1/97))) + 1/13) + ((1/23 + (1/47 + 1/43)) + 1/11)))))

127: ((((1/13 + (((1/109 + 1/107) + 1/53) + ((1/103 + 1/101) + 1/47))) + ((1/23 + ((1/97 + 1/89) + 1/43)) + 1/11)) + (1/5 + (((1/41 + (1/83 + 1/79)) + 1/19) + ((1/37 + (1/73 + 1/71)) + 1/17)))) + (1/2 + (((((1/67 + 1/61) + 1/31) + (((1/127 + 1/113) + 1/59) + 1/29)) + 1/7) + 1/3)))

131: ((((1/13 + ((1/53 + (1/107 + 1/103)) + ((1/101 + 1/97) + 1/47))) + ((1/23 + (1/43 + (1/89 + 1/83))) + 1/11)) + (1/5 + (((1/41 + (1/79 + 1/73)) + 1/19) + ((1/37 + (1/71 + 1/67)) + 1/17)))) + (1/2 + ((((((1/131 + 1/127) + 1/61) + 1/31) + (1/29 + (1/59 + (1/113 + 1/109)))) + 1/7) + 1/3)))

137: ((((1/13 + ((1/53 + (1/103 + 1/101)) + (1/47 + (1/97 + 1/89)))) + (1/11 + (1/23 + (1/43 + 1/41)))) + (1/5 + ((((1/83 + 1/79) + 1/37) + 1/19) + (((1/73 + 1/71) + (1/67 + (1/137 + 1/131))) + 1/17)))) + (1/2 + ((((1/31 + (1/61 + (1/127 + 1/113))) + (1/29 + (1/59 + (1/109 + 1/107)))) + 1/7) + 1/3)))

139: ((((1/13 + (((1/107 + 1/103) + (1/101 + 1/97)) + 1/23)) + (1/11 + ((1/47 + 1/43) + ((1/89 + 1/83) + 1/41)))) + (1/5 + ((1/19 + ((1/79 + 1/73) + 1/37)) + (1/17 + ((1/71 + (1/139 + 1/137)) + (1/67 + (1/131 + 1/127))))))) + (1/2 + ((((1/31 + (1/61 + 1/59)) + (1/29 + ((1/113 + 1/109) + 1/53))) + 1/7) + 1/3)))

149: ((((1/13 + (((1/103 + 1/101) + 1/47) + 1/23)) + (1/11 + (((1/97 + 1/89) + 1/43) + (1/41 + (1/83 + 1/79))))) + (1/5 + ((1/19 + (1/37 + (1/73 + (1/149 + 1/139)))) + (1/17 + ((1/71 + 1/67) + ((1/137 + 1/131) + 1/61)))))) + (1/2 + ((((1/31 + ((1/127 + 1/113) + 1/59)) + (1/29 + ((1/109 + 1/107) + 1/53))) + 1/7) + 1/3)))

151: ((((1/13 + (((1/101 + 1/97) + 1/47) + 1/23)) + (1/11 + ((1/43 + (1/89 + 1/83)) + (1/41 + (1/79 + (1/151 + 1/149)))))) + (1/5 + ((1/19 + (1/37 + (1/73 + 1/71))) + (1/17 + (((1/139 + 1/137) + 1/67) + ((1/131 + 1/127) + 1/61)))))) + (1/2 + ((((1/31 + 1/29) + ((1/59 + (1/113 + 1/109)) + (1/53 + (1/107 + 1/103)))) + 1/7) + 1/3)))

157: ((((1/13 + ((1/47 + (1/97 + 1/89)) + 1/23)) + (1/11 + ((1/43 + 1/41) + ((1/83 + 1/79) + ((1/157 + 1/151) + 1/73))))) + (1/5 + ((1/19 + (1/37 + ((1/149 + 1/139) + 1/71))) + (1/17 + ((1/67 + (1/137 + 1/131)) + 1/31))))) + (1/2 + (((((1/61 + (1/127 + 1/113)) + 1/29) + ((1/59 + (1/109 + 1/107)) + (1/53 + (1/103 + 1/101)))) + 1/7) + 1/3)))

163: ((((1/13 + (1/23 + (1/47 + 1/43))) + (1/11 + (((1/89 + 1/83) + 1/41) + (((1/163 + 1/157) + 1/79) + 1/37)))) + (1/5 + ((1/19 + (((1/151 + 1/149) + 1/73) + (1/71 + (1/139 + 1/137)))) + (1/17 + ((1/67 + (1/131 + 1/127)) + 1/31))))) + (1/2 + ((1/7 + (((1/61 + 1/59) + 1/29) + (((1/113 + 1/109) + 1/53) + ((1/107 + 1/103) + (1/101 + 1/97))))) + 1/3)))

167: (((((((1/109 + 1/107) + 1/53) + ((1/103 + 1/101) + 1/47)) + (1/23 + ((1/97 + 1/89) + 1/43))) + (1/11 + (((1/83 + (1/167 + 1/163)) + 1/41) + 1/19))) + (1/5 + ((((1/79 + (1/157 + 1/151)) + 1/37) + ((1/73 + (1/149 + 1/139)) + (1/71 + 1/67))) + (1/17 + (((1/137 + 1/131) + 1/61) + 1/31))))) + (1/2 + ((1/7 + ((((1/127 + 1/113) + 1/59) + 1/29) + 1/13)) + 1/3)))

173: ((((((1/53 + (1/107 + 1/103)) + ((1/101 + 1/97) + 1/47)) + (1/23 + ((1/89 + (1/173 + 1/167)) + 1/43))) + (1/11 + ((1/41 + (1/83 + (1/163 + 1/157))) + 1/19))) + (1/5 + ((((1/79 + (1/151 + 1/149)) + 1/37) + ((1/73 + 1/71) + ((1/139 + 1/137) + 1/67))) + (1/17 + (((1/131 + 1/127) + 1/61) + 1/31))))) + (1/2 + ((1/7 + ((1/29 + (1/59 + (1/113 + 1/109))) + 1/13)) + 1/3)))

do these expression trees have any rhyme or reason?  the parenthesized representation is not good for seeing structural patterns.  future work: draw them as trees.

the expression trees define a unique binary tree for each integer, or for each prime.
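The shapes above are consistent with a Huffman-style construction that repeatedly combines the two smallest subtrees, though the post doesn't say how they were produced. Here is a minimal Haskell sketch of that construction; the names Tree, value, render and build are mine, and I haven't verified that it reproduces these exact parenthesizations:

```haskell
import Data.List (insertBy, sortOn)
import Data.Ratio (denominator, (%))

data Tree = Leaf Rational | Node Tree Tree

value :: Tree -> Rational
value (Leaf r)   = r
value (Node l r) = value l + value r

render :: Tree -> String
render (Leaf r)   = "1/" ++ show (denominator r)
render (Node l r) = "(" ++ render l ++ " + " ++ render r ++ ")"

-- Repeatedly combine the two subtrees with the smallest sums,
-- keeping the work list sorted by subtree value.
build :: [Rational] -> Tree
build = go . sortOn value . map Leaf
  where
    go []         = error "build: empty list"
    go [t]        = t
    go (a:b:rest) = go (insertBy cmp (Node a b) rest)
    cmp x y = compare (value x) (value y)

-- e.g. render (build [1 % p | p <- [2, 3, 5, 7]])
```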

by Unknown at October 14, 2022 11:16 PM

October 10, 2022

Joachim Breitner

rec-def: Minesweeper case study

I’m on the train back from MuniHac, where I gave a talk about the rec-def library that I have excessively blogged about recently (here, here, here and here). I got quite flattering comments about that talk, so if you want to see if they were sincere, I suggest you watch the recording of “Getting recursive definitions off their bottoms” (but it’s not necessary for the following).

After the talk, Franz Thoma approached me and told me a story of how he was once implementing the game Minesweeper in Haskell, and in particular the part of the logic where, after the user has uncovered a field, the game would automatically uncover all fields that are next to a “neutral” field, i.e. one with zero adjacent bombs. He was using a comonadic data structure, which makes a “context-dependent parallel computation” such as uncovering one field quite natural, and was hoping that using a suitable fix-point operator, he could elegantly obtain not just the next step, but directly the result of recursively uncovering all these fields. But, much to his disappointment, that did not work out: Due to the recursion inherent in that definition, a knot-tying fixed-point operator will lead to a cyclic definition.

Microsoft Minesweeper

He was wondering if the rec-def library could have helped him, and we sat down to find out, and this is the tale of this blog post. I will avoid the comonadic abstractions and program it more naively, though, to not lose too many readers along the way. Maybe read Chris Penner’s blog post and Finch’s functional pearl “Getting a Quick Fix on Comonads” if you are curious about that angle.

Minesweeper setup

Let’s start with defining a suitable data type for the grid of the minesweeper board. I’ll use the Array data type; its Ix-based indexing is quite useful for grids:

The library lacks a function to generate an array from a generating function, but it is easy to add:

Let’s also fix the size of the board, as a pair of lower and upper bounds (this is the format that the Ix type class needs):

Now board is simply a grid of boolean values, with True indicating that a bomb is there:

It would be nice to be able to see these boards in a nicer way. So let us write a function that prints a grid, including a frame, given a function that prints something for each coordinate. Together with a function that prints a bomb (as *), we can print the board:

The expression b ! c looks up the coordinate in the array, and is True when there is a bomb at that coordinate.

So here is our board, with two bombs:

ghci> putStrLn $ pBombs board1
#    #
#*   #
#*   #
#    #

But that’s not what we want to show to the user: Every field should have a number that indicates the number of bombs in the surrounding fields. To that end, we first define a function that takes a coordinate, and returns all adjacent coordinates. This also takes care of the border, using inRange:
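The post's code for this function isn't shown here, but a sketch might look as follows, assuming (Int, Int) coordinates and a globally fixed bounds value (bnds is my name for it):

```haskell
import Data.Ix (inRange)

type C = (Int, Int)

-- hypothetical bounds for the 4x4 board used in the post
bnds :: (C, C)
bnds = ((0, 0), (3, 3))

neighbors :: C -> [C]
neighbors (x, y) =
  [ c
  | dx <- [-1, 0, 1], dy <- [-1, 0, 1]
  , (dx, dy) /= (0, 0)             -- a field is not its own neighbor
  , let c = (x + dx, y + dy)
  , inRange bnds c                 -- inRange takes care of the border
  ]
```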

With that, we can calculate what to display in each cell – a bomb, or a number:

With a suitable printing function, we can now see the full board:

And here it is:

ghci> putStrLn $ pBoard board1
#11  #
#*2  #
#*2  #
#11  #

Next we have to add masks: We need to keep track of which fields the user already sees. We again use a grid of booleans, and define a function to print a board with the masked fields hidden behind ?:

So this is what the user would see

ghci> putStrLn $ pMasked board1 mask1
#11 ?#

Uncovering some fields

With that setup in place, we now implement the piece of logic we care about: Uncovering all fields that are next to a neutral field. Here is the first attempt:

The idea is that we calculate the new mask m1 from the old one m0 by the following logic: A field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields are visible.
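A sketch of what solve0 could look like, under some assumptions: Board and Mask are Bool arrays indexed by coordinates, genArray is my name for the array-from-generating-function helper the post mentions adding, neighbors is the adjacency function from above, and isNeutral (my name) tests whether a field has zero adjacent bombs:

```haskell
import Data.Array (Array, Ix, bounds, listArray, range, (!))

type C     = (Int, Int)
type Board = Array C Bool  -- True = bomb
type Mask  = Array C Bool  -- True = visible

-- builds an array from a generating function (name assumed)
genArray :: Ix i => (i, i) -> (i -> a) -> Array i a
genArray bs f = listArray bs (map f (range bs))

solve0 :: Board -> Mask -> Mask
solve0 b m0 = m1
  where
    -- a field is visible if it was visible before, or some adjacent
    -- neutral field was already visible
    m1 = genArray (bounds b) $ \c ->
      m0 ! c || or [ m0 ! c' | c' <- neighbors c, isNeutral b c' ]
```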

This works so far: I uncovered the three fields next to the one neutral visible field:

ghci> putStrLn $ pMasked board1 $ solve0 board1 mask1
#11  #
#?2  #

But that’s not quite what we want: We want to keep doing that to uncover all fields.

Uncovering all fields

So what happens if we change the logic to: A field is visible if it was visible before (m0 ! c), or if any of its neighboring, neutral fields will be visible.

In the code, this is just a single character change: Instead of looking at m0 to see if a neighbor is visible, we look at m1:
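Reusing the same assumed helpers (genArray, neighbors, isNeutral) as in the solve0 sketch, the changed definition might read:

```haskell
solve1 :: Board -> Mask -> Mask
solve1 b m0 = m1
  where
    m1 = genArray (bounds b) $ \c ->
      m0 ! c || or [ m1 ! c' | c' <- neighbors c, isNeutral b c' ]
      --             ^^ the only change: m0 became m1,
      -- turning the definition into a knot-tying self-reference
```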

(This is roughly what happened when Franz started to use the kfix comonadic fixed-point operator in his code, I believe.)

Does it work? It seems so:

ghci> putStrLn $ pMasked board1 $ solve1 board1 mask1
#11  #
#?2  #
#?2  #
#?1  #

Amazing, isn’t it!

Unfortunately, it seems to work by accident. If I start with a different mask:

which looks as follows:

ghci> putStrLn $ pMasked board1 mask2
#??? #

Then our solve1 function does not work, and just sits there:

ghci> putStrLn $ pMasked board1 $ solve1 board1 mask2

Why did it work before, but not now?

It fails to work because as the code tries to figure out if a field will be uncovered, it needs to know if the next field will be uncovered. But to figure that out, it needs to know if the present field will be uncovered. With the normal boolean connectives (|| and or), this does not make progress.

It worked with mask1 more or less by accident: None of the fields in the first column have neutral neighbors, so nothing happens there. And for all the fields in the third and fourth columns, the code will know for sure that they will be uncovered based on their upper neighbors, which come first in the neighbors list, and due to the short-circuiting properties of ||, it doesn’t have to look at the later cells, and the vicious cycle is avoided.

rec-def to the rescue

This is where rec-def comes in: By using the RBool type in m1 instead of plain Bool, the recursive self-reference is not a problem, and it simply works:

Note that I did not change the algorithm, or the self-reference through m1; I just replaced Bool with RBool, || with RB.|| and or with RB.or. And used RB.get at the end to get a normal boolean out. And here we go:
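Under the same assumptions as the earlier sketches, and using the rec-def names the post itself mentions (RB.||, RB.or, RB.get), solve2 might look like this; I am additionally assuming RB.true and RB.false for lifting constant booleans into RBool:

```haskell
import qualified Data.Recursive.Bool as RB

solve2 :: Board -> Mask -> Mask
solve2 b m0 = RB.get <$> m1  -- read plain Bools back out at the end
  where
    m1 = genArray (bounds b) $ \c ->
      (if m0 ! c then RB.true else RB.false)
        RB.|| RB.or [ m1 ! c' | c' <- neighbors c, isNeutral b c' ]
```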

ghci> putStrLn $ pMasked board1 $ solve2 board1 mask2
#11  #
#?2  #
#?2  #
#?1  #

That’s the end of this repetition of “let’s look at a tying-the-knot problem and see how rec-def helps”, which always ends up a bit anti-climactic because it “just works”, at least in these cases. Hope you enjoyed it nevertheless.

by Joachim Breitner at October 10, 2022 08:22 AM

October 08, 2022

Oleg Grenrus

Simple(r?) simplices

Posted on 2022-10-08 by Oleg Grenrus agda

This post is a literate Agda file, where I try to define a category Δ of finite ordinals and monotone maps. Reed Mullanix wrote a post "Simple Simplices" around a year and a half ago about the topic, suggesting an option.

That option, called Δ⇒, is implemented in the agda-categories package in the Categories.Category.Instance.Simplex module.

Reed asks for a decomposition:

decompose : (Fin m → Fin n) → (m Δ⇒ n)

I think I got it.

Agda setup

module 2022-10-08-simplex where
import Data.Nat as ℕ

open ℕ using (ℕ; zero; suc; z≤n; s≤s; _∸_)
open import Data.Nat.Properties using (≤-refl; ≤-trans)
open import Data.Fin using (Fin; zero; suc; _≤_; _<_; toℕ)
open import Data.Product using (Σ; _×_; _,_; proj₁; proj₂; map)
open import Data.Fin.Properties using (suc-injective; toℕ<n)
open import Relation.Binary.PropositionalEquality
  using (_≡_; refl; cong; sym; trans)

open Relation.Binary.PropositionalEquality.≡-Reasoning

variable
  n m p : ℕ

Monotone maps

Reed mentions two options for implementing simplex category

  1. Define Δ as the category of finite ordinals and monotone maps.
  2. Define Δ as a free category generated by face and degeneracy maps, quotient by the simplicial identities.

Second one is just awful.

I assume the first option goes something like:

First we define the isMonotone predicate on Fin n → Fin m functions.

isMonotone : (Fin n → Fin m) → Set
isMonotone f = ∀ i j → i ≤ j → f i ≤ f j

Then a monotone function is a function together with a proof it is monotone

Monotone : ℕ → ℕ → Set
Monotone n m = Σ (Fin n → Fin m) isMonotone

And because it's a function in (ordinary) Agda we need to define an equality:

_≐_ : Monotone n m → Monotone n m → Set
(f , _) ≐ (g , _) = ∀ i → f i ≡ g i

The pointwise equality works well, and we don't actually care about the isMonotone proof. (Though I think it can be shown that it is an hProp, so this is justified.)

Reed mentions that this formulation is nice, except that we want to be able to define simplicial sets by how they act on the face and degeneracy maps, not some random monotonic map!

I actually don't know anything about face and boundary maps, but I trust others on that. (E.g. nLab also says that all morphism are generated by face and degeneracy maps)

Reed then proceeds to define a third variant, which resembles the free category definition, yet he doesn't quotient by simplicial identities; instead he defines equality using the semantics (i.e. pointwise on a function "applying" his description to a finite ordinal).

Fourth formulation

... but there is fourth (?) option to encode monotone maps.

And it is very simple! (It does resemble the thinnings I wrote about recently; more on them below.)

data Mono : ℕ → ℕ → Set where
  base :                  Mono zero    zero
  skip : Mono n m       → Mono n       (suc m)
  edge : Mono n (suc m) → Mono (suc n) (suc m)

The base and skip constructors are similar to those of thinnings, but edge is different from keep. Where keep always introduced a new "output", edge requires there to be an existing output and maps the new input to that same output.

So if we have a Mono which looks like:

[Diagram: a Mono 4 5, inputs 0–3 on the left, outputs 0–4 on the right, with edges 0↦0, 1↦1, 2↦2 and 3↦4; output 3 has no incoming edge.]

We can add a new edge which goes to already existing output with edge:

[Diagram: the same map extended by edge: a new input 4 (shown in red) is connected to the already-used output 4, giving a Mono 5 5.]

It took some time to get this right.
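As a concrete example of my own (not from the post): the monotone map on Fin 3 sending both 0 and 1 to 0, and 2 to 1, is written as a Mono 3 2 term like so:

```agda
ex : Mono 3 2
ex = edge (edge (skip (edge (skip base))))
-- apply ex: zero ↦ zero; suc zero ↦ zero; suc (suc zero) ↦ suc zero
```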

keep as in thinnings can be defined as first adding a new output with skip and then connecting an edge there:

pattern keep f = edge (skip f)

We can define identity morphism and composition:

id : Mono n n
id {zero} = base
id {suc n} = keep id

-- I'm rebel, using ⨟ for the composition
_⨟_ : Mono n m → Mono m p → Mono n p
base ⨟ g = g
skip f ⨟ skip g = skip (skip f ⨟ g)
skip f ⨟ edge g = f ⨟ g
edge f ⨟ skip g = skip (edge f ⨟ g)
edge f ⨟ edge g = edge (f ⨟ edge g)

I leave as an exercise to prove that category laws are satisfied.

Next we can define the semantics, i.e. how Mono maps finite ordinals: The definition is simple, encoding the graphical intuition from above.

apply : Mono n m → Fin n → Fin m
apply (skip f) i       = suc (apply f i)
apply (edge f) zero    = zero
apply (edge f) (suc i) = apply f i

We can show that apply, id and ⨟ work together as expected.

apply-id : (i : Fin n) → apply id i ≡ i
apply-id zero    = refl
apply-id (suc i) = cong suc (apply-id i)

apply-⨟ : (f : Mono n m) (g : Mono m p) (i : Fin n)
        → apply g (apply f i) ≡ apply (f ⨟ g) i
apply-⨟ (skip f) (skip g) i       = cong suc (apply-⨟ (skip f) g i)
apply-⨟ (skip f) (edge g) i       = apply-⨟ f g i
apply-⨟ (edge f) (skip g) i       = cong suc (apply-⨟ (edge f) g i)
apply-⨟ (edge f) (edge g) zero    = refl
apply-⨟ (edge f) (edge g) (suc i) = apply-⨟ f (edge g) i

Mono has a very nice property: it uniquely represents a monotone map. In other words, if two values f g : Mono n m act the same for all i : Fin n, then f and g are propositionally equal:

apply-inj : (f g : Mono n m) → (∀ i → apply f i ≡ apply g i) → f ≡ g
apply-inj base     base     p = refl
apply-inj (skip f) (skip g) p =
  cong skip (apply-inj f g λ i → suc-injective (p i))
apply-inj (skip f) (edge g) p with p zero
... | ()
apply-inj (edge f) (skip g) p with p zero
... | ()
apply-inj (edge f) (edge g) p = cong edge (apply-inj f g λ i → p (suc i))

As a sanity check, apply f is indeed monotone:

isMonotone-apply : (f : Mono n m) → isMonotone (apply f)
isMonotone-apply (skip f) i       j       i≤j       = s≤s (isMonotone-apply f i j i≤j)
isMonotone-apply (edge f) zero    j       0≤j       = z≤n
isMonotone-apply (edge f) (suc i) (suc j) (s≤s i≤j) = isMonotone-apply f i j i≤j

Combining the previous, we can map from Mono (data) to Monotone (Agda function).

Mono→Monotone : Mono n m → Monotone n m
Mono→Monotone f = apply f , isMonotone-apply f

From Agda function to data

Because the Mono definition is so simple, we should try to convert back. The code in this section can be improved, but for now we only need the final result.

First we define "subtraction" and "addition" of finite ordinals. The ∸ is monus on natural numbers (i.e. safe subtraction, defaulting to zero). In the same vein, lower doesn't require an i ≤ j proof.

-- kind of j - i, no i ≤ j requirement, "monus"
lower : (i j : Fin (suc m)) → Fin (suc (m ∸ toℕ i))
lower             zero    j       = j
lower {m = suc m} (suc i) zero    = zero  -- extra case, here i≤j
lower {m = suc m} (suc i) (suc j) = lower i j

raise : (i : Fin (suc m)) → Fin (suc (m ∸ toℕ i)) → Fin (suc m)
raise             zero    j = j
raise {m = suc m} (suc i) j = suc (raise i j)

We can show that raise and lower cancel out; here we need a j ≤ i proof. (I noticed that I'm not consistent with the i and j variables, but hopefully you can follow along.)

raise∘lower≡id : (i j : Fin (suc m)) (j≤i : j ≤ i) → i ≡ raise j (lower j i)
raise∘lower≡id i zero j≤i = refl
raise∘lower≡id {m = suc m} (suc i) (suc j) (s≤s j≤i) =
  cong suc (raise∘lower≡id i j j≤i)

Then we need a handful of lemmas.

lower for fixed k is monotone:

isMonotone-lower : ∀ (k : Fin (suc m)) → isMonotone (lower k)
isMonotone-lower             zero    i       j       i≤j       = i≤j
isMonotone-lower {m = suc m} (suc k) zero    j       z≤0       = z≤n -- redundant case
isMonotone-lower {m = suc m} (suc k) (suc i) (suc j) (s≤s i≤j) = isMonotone-lower k i j i≤j

We can raise the Mono, so we can commute raise and apply

raise-mono' : ∀ p → Mono n (suc (m ∸ p)) → Mono n (suc m)
raise-mono'             zero    f = f
raise-mono' {m = zero}  (suc p) f = f
raise-mono' {m = suc m} (suc p) f = skip (raise-mono' p f)

raise-mono : ∀ (k : Fin (suc m)) → Mono n (suc (m ∸ toℕ k)) → Mono n (suc m)
raise-mono k = raise-mono' (toℕ k)

Then the idea is to define the Monotone to Mono conversion by looking at the f zero input, and trimming f using lower.

For lack of a better name I call this new function next:

next-f : (f : Fin (suc n) → Fin (suc m)) → isMonotone f
       → Fin n → Fin (suc (m ∸ toℕ (f zero)))
next-f f f-mono i = lower (f zero) (f (suc i))

And next-f f is monotone if f is:

next-mono : (f : Fin (suc n) → Fin (suc m)) (f-mono : isMonotone f)
          → isMonotone (next-f f f-mono)
next-mono f f-mono i j i≤j = isMonotone-lower
  (f zero)
  (f (suc i))
  (f (suc j))
  (f-mono (suc i) (suc j) (s≤s i≤j))

next : (f : Monotone (suc n) (suc m))
     → Monotone n (suc (m ∸ toℕ (proj₁ f zero)))
next (f , f-mono) = next-f f f-mono , next-mono f f-mono

Now we have (almost) all the ingredients to define the Monotone→Mono function:

absurd : Mono zero n
absurd {zero} = base
absurd {suc n} = skip absurd

Monotone→Mono' : (f : Fin n → Fin m) → isMonotone f → Mono n m
Monotone→Mono' {zero}         f f-mono = absurd
Monotone→Mono' {suc n} {zero} f f-mono with f zero
... | ()
Monotone→Mono' {suc n} {suc m} f f-mono = raise-mono (f zero)
  (edge (Monotone→Mono' (next-f f f-mono) (next-mono f f-mono)))

And Monotone→Mono just packages that:

Monotone→Mono : Monotone n m  Mono n m
Monotone→Mono (f , f-mono) = Monotone→Mono' f f-mono

Monotone ↔ Mono isomorphism

Monotone→Mono and Mono→Monotone are each other's inverses.

First two lemmas, showing that raise and apply "commute" in a special case we need:

raise-edge-apply-zero : (j : Fin (suc m))
                      → (f : Mono n (suc (m ∸ toℕ j)))
                      → j ≡ apply (raise-mono j (edge f)) zero
raise-edge-apply-zero zero                f = refl
raise-edge-apply-zero {m = suc m} (suc j) f =
  cong suc (raise-edge-apply-zero j f)

raise-edge-apply-suc : (j : Fin (suc m))
                     → (i : Fin n)
                     → (f : Mono n (suc (m ∸ toℕ j)))
                     → raise j (apply f i)
                     ≡ apply (raise-mono j (edge f)) (suc i)
raise-edge-apply-suc             zero    i f = refl
raise-edge-apply-suc {m = suc m} (suc j) i f =
  cong suc (raise-edge-apply-suc j i f)

Using which we can show that apply ∘ Monotone→Mono is the identity function: (Agda proofs are wide, layout of my blog looks horrible with those, I'm sorry).

apply-Monotone→Mono : (f : Monotone n m)
                    → (i : Fin n)
                    → proj₁ f i ≡ apply (Monotone→Mono f) i
apply-Monotone→Mono {suc n} {zero} f i with proj₁ f zero
... | ()
apply-Monotone→Mono {suc n} {suc m} f zero = begin
  proj₁ f zero                                                   ≡⟨ raise-edge-apply-zero (proj₁ f zero) (Monotone→Mono (next f)) ⟩
  apply (Monotone→Mono f) zero                                   ∎
apply-Monotone→Mono {suc n} {suc m} f (suc i) = begin
  proj₁ f (suc i)                                                ≡⟨ raise∘lower≡id (proj₁ f (suc i)) (proj₁ f zero) (proj₂ f zero (suc i) z≤n) ⟩
  raise (proj₁ f zero) (lower (proj₁ f zero) (proj₁ f (suc i)))  ≡⟨ cong (raise (proj₁ f zero)) (apply-Monotone→Mono (next f) i) ⟩
  raise (proj₁ f zero) (apply (Monotone→Mono (next f)) i)        ≡⟨ raise-edge-apply-suc (proj₁ f zero) i _ ⟩
  apply (Monotone→Mono f) (suc i)                                ∎

And that is the same as saying that we can convert a Monotone to Mono and back, and we get what we started with (in a sense):

Monotone→Mono→Monotone : (f : Monotone n m)
                       → f ≐ Mono→Monotone (Monotone→Mono f)
Monotone→Mono→Monotone = apply-Monotone→Mono

The other direction, i.e. starting with Mono, is simple to show as well using the apply-inj lemma, which is the benefit of Mono having a unique representation:

Monotone→Mono→Mono : (f : Mono n m)
                   → f ≡ Monotone→Mono (Mono→Monotone f)
Monotone→Mono→Mono f = apply-inj
  (Monotone→Mono (Mono→Monotone f))
  (apply-Monotone→Mono (Mono→Monotone f))

In this section we have shown that Mono and Monotone types are isomorphic. Great news!

Interlude: Thinnings and contractions

Recall thinnings:

data Thin : ℕ → ℕ → Set where
  baseₜ : Thin zero zero
  skipₜ : Thin n m → Thin n (suc m)
  keepₜ : Thin n m → Thin (suc n) (suc m)

applyₜ : Thin n m → Fin n → Fin m
applyₜ (skipₜ f) i       = suc (applyₜ f i)
applyₜ (keepₜ f) zero    = zero
applyₜ (keepₜ f) (suc i) = suc (applyₜ f i)

These are strictly monotone functions:

isStrictlyMonotone : (Fin n → Fin m) → Set
isStrictlyMonotone f = ∀ i j → i < j → f i < f j

isStrictlyMonotone-applyₜ : (f : Thin n m) → isStrictlyMonotone (applyₜ f)
isStrictlyMonotone-applyₜ (skipₜ f) i       j       i<j       = s≤s (isStrictlyMonotone-applyₜ f i j i<j)
isStrictlyMonotone-applyₜ (keepₜ f) zero    (suc j) (s≤s i<j) = s≤s z≤n
isStrictlyMonotone-applyₜ (keepₜ f) (suc i) (suc j) (s≤s i<j) = s≤s (isStrictlyMonotone-applyₜ f i j i<j)

Similarly: unique representation

applyₜ-inj : (f g : Thin n m) → (∀ i → applyₜ f i ≡ applyₜ g i) → f ≡ g
applyₜ-inj baseₜ     baseₜ     p = refl
applyₜ-inj (skipₜ f) (skipₜ g) p =
  cong skipₜ (applyₜ-inj f g λ i → suc-injective (p i))
applyₜ-inj (skipₜ f) (keepₜ g) p with p zero
... | ()
applyₜ-inj (keepₜ f) (skipₜ g) p with p zero
... | ()
applyₜ-inj (keepₜ f) (keepₜ g) p =
  cong keepₜ (applyₜ-inj f g λ i → suc-injective (p (suc i)))

But applyₜ f maps are also injective, i.e. they map different Fin ns to different Fin ms:

applyₜ-inj₂ : (f : Thin n m) (i j : Fin n) → applyₜ f i ≡ applyₜ f j → i ≡ j
applyₜ-inj₂ (skipₜ f) i       j       p = applyₜ-inj₂ f i j (suc-injective p)
applyₜ-inj₂ (keepₜ f) zero    zero    p = refl
applyₜ-inj₂ (keepₜ f) (suc i) (suc j) p = cong suc (applyₜ-inj₂ f i j (suc-injective p))

Thinnings can be converted to Mono:

Thin→Mono : Thin n m → Mono n m
Thin→Mono baseₜ     = base
Thin→Mono (skipₜ f) = skip (Thin→Mono f)
Thin→Mono (keepₜ f) = keep (Thin→Mono f)

Thins are injective monotone maps. Can we represent the surjective ones? Yes! This looks very similar:

data Cntr : ℕ → ℕ → Set where
  baseₖ : Cntr zero zero
  edgeₖ : Cntr n (suc m) → Cntr (suc n) (suc m)
  keepₖ : Cntr n m → Cntr (suc n) (suc m)

edgeₖ' : Cntr (suc n) m → Cntr (suc (suc n)) m
edgeₖ' (edgeₖ f) = edgeₖ (edgeₖ f)
edgeₖ' (keepₖ f) = edgeₖ (keepₖ f)

applyₖ : Cntr n m → Fin n → Fin m
applyₖ (edgeₖ f) zero    = zero
applyₖ (edgeₖ f) (suc i) = applyₖ f i
applyₖ (keepₖ f) zero    = zero
applyₖ (keepₖ f) (suc i) = suc (applyₖ f i)

isMonotone-applyₖ : (f : Cntr n m) → isMonotone (applyₖ f)
isMonotone-applyₖ (edgeₖ f) zero    j        0≤j      = z≤n
isMonotone-applyₖ (edgeₖ f) (suc i) (suc j) (s≤s i≤j) = isMonotone-applyₖ f i j i≤j
isMonotone-applyₖ (keepₖ f) zero    j       0≤j       = z≤n
isMonotone-applyₖ (keepₖ f) (suc i) (suc j) (s≤s i≤j) = s≤s (isMonotone-applyₖ f i j i≤j)

applyₖ-surjective : (f : Cntr n m) (j : Fin m) → Σ (Fin n) λ i → applyₖ f i ≡ j
applyₖ-surjective (edgeₖ f) j with applyₖ-surjective f j
... | i , p = suc i , p
applyₖ-surjective (keepₖ f) zero    = zero , refl
applyₖ-surjective (keepₖ f) (suc j) with applyₖ-surjective f j
... | i , p = suc i , cong suc p

Cntr→Mono : Cntr n m → Mono n m
Cntr→Mono baseₖ = base
Cntr→Mono (edgeₖ f) = edge (Cntr→Mono f)
Cntr→Mono (keepₖ f) = keep (Cntr→Mono f)

We can show that Mono can be decomposed into a composition of Cntr and Thin.

We can define the type and smart constructors:

Cntr×Thin : ℕ → ℕ → Set
Cntr×Thin n m = Σ ℕ λ p → Cntr n p × Thin p m

baseₖₜ : Cntr×Thin zero zero
baseₖₜ = 0 , baseₖ , baseₜ

skipₖₜ : Cntr×Thin n m → Cntr×Thin n (suc m)
skipₖₜ (p , f , g) = p , f , skipₜ g

edgeₖₜ : Cntr×Thin n (suc m) → Cntr×Thin (suc n) (suc m)
edgeₖₜ (p , f , skipₜ g) = suc p , keepₖ f , keepₜ g
edgeₖₜ (p , f , keepₜ g) = p , edgeₖ f , keepₜ g

Then conversion from Mono is trivial to define:

Mono→Cntr×Thin : (f : Mono n m) → Cntr×Thin n m
Mono→Cntr×Thin base     = baseₖₜ
Mono→Cntr×Thin (skip f) = skipₖₜ (Mono→Cntr×Thin f)
Mono→Cntr×Thin (edge f) = edgeₖₜ (Mono→Cntr×Thin f)

The other direction isn't tricky either:

Cntr×Thin→Mono : Cntr×Thin n m → Mono n m
Cntr×Thin→Mono (_ , f , g) = Cntr→Mono f ⨟ Thin→Mono g

We can show that starting from Mono we can convert to a pair of Cntr and Thin, and if we convert back, we get what we started with:

skip-⨟ : (f : Mono n m) (g : Mono m p) → f ⨟ skip g ≡ skip (f ⨟ g)
skip-⨟ base     g = refl
skip-⨟ (skip f) g = refl
skip-⨟ (edge f) g = refl

skip-pres : (f : Cntr×Thin n m) → Cntr×Thin→Mono (skipₖₜ f) ≡ skip (Cntr×Thin→Mono f)
skip-pres (p , f , g) = skip-⨟ (Cntr→Mono f) (Thin→Mono g)

edge-pres : (f : Cntr×Thin n (suc m)) → Cntr×Thin→Mono (edgeₖₜ f) ≡ edge (Cntr×Thin→Mono f)
edge-pres (p     , f , skipₜ g) = refl
edge-pres (suc p , f , keepₜ g) = refl

Mono→CT→Mono : (f : Mono n m) → Cntr×Thin→Mono (Mono→Cntr×Thin f) ≡ f
Mono→CT→Mono base = refl
Mono→CT→Mono (skip f) = trans (skip-pres (Mono→Cntr×Thin f)) (cong skip (Mono→CT→Mono f))
Mono→CT→Mono (edge f) = trans (edge-pres (Mono→Cntr×Thin f)) (cong edge (Mono→CT→Mono f))

This is an example of factoring a function into a composition of a surjective function followed by an injective one.
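At the value level the same epi-mono factorization can be sketched in Haskell (an illustration added here, not part of the Agda development): factor a function on finite sets through its sorted image, giving a surjection followed by an injection. For a monotone input, both pieces are monotone.

```haskell
import Data.List (nub, sort, elemIndex)
import Data.Maybe (fromJust)

-- Factor f : {0..n-1} -> {0..m-1} through its image:
-- 'surj' is surjective onto {0..k-1} (k = image size),
-- 'inj' is injective back into the original codomain.
factorFn :: Int -> (Int -> Int) -> (Int -> Int, Int -> Int)
factorFn n f = (surj, inj)
  where
    image  = nub (sort (map f [0 .. n - 1]))
    surj i = fromJust (elemIndex (f i) image)
    inj j  = image !! j
```

Composing the two pieces recovers the original function on its domain.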

Isomorphism with Reed's formulation

Reed's "Simple Simplices" blog post ended with a challenge: writing

decompose : (Fin m → Fin n) → (m Δ⇒ n)


As we can convert Monotone to Mono, maybe we can get close?

Let's try.

open import Categories.Category.Instance.Simplex

The other direction, from Δ⇒ to Mono, can be defined in a systematic way. We define faceₘ and degenₘ and show that they behave like the face and degen maps:

faceₘ : Fin (suc n) → Mono n (suc n)
faceₘ             zero    = skip id
faceₘ {n = suc n} (suc i) = keep (faceₘ i)

apply-faceₘ : (i : Fin (suc n)) (j : Fin n) → face i j ≡ apply (faceₘ i) j
apply-faceₘ zero    j       = cong suc (sym (apply-id j))
apply-faceₘ (suc i) zero    = refl
apply-faceₘ (suc i) (suc j) = cong suc (apply-faceₘ i j)

degenₘ : Fin n → Mono (suc n) n
degenₘ zero    = edge id
degenₘ (suc i) = keep (degenₘ i)

apply-degenₘ : (i : Fin n) (j : Fin (suc n)) → degen i j ≡ apply (degenₘ i) j
apply-degenₘ {suc n} zero    zero    = refl
apply-degenₘ {suc n} zero    (suc j) = sym (apply-id j)
apply-degenₘ {suc n} (suc i) zero    = refl
apply-degenₘ {suc n} (suc i) (suc j) = cong suc (apply-degenₘ i j)

That is enough to define the Δ→Mono map. As we have already shown that identity and composition respect apply, we can show that Δ→Mono does too.

Δ→Mono : n Δ⇒ m → Mono n m
Δ→Mono ε       = id
Δ→Mono (δ i)   = faceₘ i
Δ→Mono (σ j)   = degenₘ j
Δ→Mono (f ⊚ g) = Δ→Mono g ⨟ Δ→Mono f

apply-Δ→Mono : (f : n Δ⇒ m) (i : Fin n) → apply (Δ→Mono f) i ≡ ⟦ f ⟧ i
apply-Δ→Mono ε       j = apply-id j
apply-Δ→Mono (δ i)   j = sym (apply-faceₘ i j)
apply-Δ→Mono (σ i)   j = sym (apply-degenₘ i j)
apply-Δ→Mono (f ⊚ g) j = begin
  apply (Δ→Mono (f ⊚ g)) j                ≡⟨ sym (apply-⨟ (Δ→Mono g) (Δ→Mono f) j) ⟩
  apply (Δ→Mono f) (apply (Δ→Mono g) j)   ≡⟨ cong (apply (Δ→Mono f)) (apply-Δ→Mono g j) ⟩
  apply (Δ→Mono f) (⟦ g ⟧ j)              ≡⟨ apply-Δ→Mono f (⟦ g ⟧ j) ⟩
  ⟦ f ⊚ g ⟧ j                             ∎

The actual direction we are interested in is similar. We define smart constructors, and then proceed by structural induction.

The first smart constructor is (maybe surprisingly) keepₚ:

Note: it doesn't make the Δ⇒ any bigger; it still has the same structure and just as many face and degen maps.

keepₚ : n Δ⇒ m → suc n Δ⇒ suc m
keepₚ ε       = ε
keepₚ (δ i)   = δ (suc i)
keepₚ (σ j)   = σ (suc j)
keepₚ (f ⊚ g) = keepₚ f ⊚ keepₚ g

keepₚ-apply-zero : (f : n Δ⇒ m) → ⟦ keepₚ f ⟧ zero ≡ zero
keepₚ-apply-zero ε = refl
keepₚ-apply-zero (δ i) = refl
keepₚ-apply-zero (σ j) = refl
keepₚ-apply-zero (f ⊚ g) = trans (cong ⟦ keepₚ f ⟧ (keepₚ-apply-zero g)) (keepₚ-apply-zero f)

keepₚ-apply-suc : (f : n Δ⇒ m) (i : Fin n) → ⟦ keepₚ f ⟧ (suc i) ≡ suc (⟦ f ⟧ i)
keepₚ-apply-suc ε       j = refl
keepₚ-apply-suc (δ i)   j = refl
keepₚ-apply-suc (σ i)   j = refl
keepₚ-apply-suc (f ⊚ g) j = trans (cong ⟦ keepₚ f ⟧ (keepₚ-apply-suc g j)) (keepₚ-apply-suc f (⟦ g ⟧ j) )

The base case is simple:

baseₚ : zero Δ⇒ zero
baseₚ = ε

Skip uses a face map:

skipₚ : n Δ⇒ m → n Δ⇒ suc m
skipₚ f = δ zero ⊚ f

skipₚ-apply : (f : n Δ⇒ m) (i : Fin n) → ⟦ skipₚ f ⟧ i ≡ suc (⟦ f ⟧ i)
skipₚ-apply f i = refl

And edge uses a degen map:

edgeₚ : n Δ⇒ suc m → suc n Δ⇒ suc m
edgeₚ f = σ zero ⊚ keepₚ f

edgeₚ-apply-zero : (f : n Δ⇒ suc m) → ⟦ edgeₚ f ⟧ zero ≡ zero
edgeₚ-apply-zero f = cong (degen zero) (keepₚ-apply-zero f)

edgeₚ-apply-suc : (f : n Δ⇒ suc m) (i : Fin n) → ⟦ edgeₚ f ⟧ (suc i) ≡ ⟦ f ⟧ i
edgeₚ-apply-suc f i = cong (degen zero) (keepₚ-apply-suc f i)

Conversion from Mono to Δ⇒ is then easy once you have the pieces. The resulting Δ⇒ has n face maps and m degen maps, even for the identity map. Thus it's not minimal in any sense, but it isn't enormous either.

Mono→Δ : Mono n m → n Δ⇒ m
Mono→Δ base     = baseₚ
Mono→Δ (skip f) = skipₚ (Mono→Δ f)
Mono→Δ (edge f) = edgeₚ (Mono→Δ f)

Finally we can show that Mono→Δ and Δ→Mono form an isomorphism:

apply-Mono→Δ : (f : Mono n m) (i : Fin n) → ⟦ Mono→Δ f ⟧ i ≡ apply f i
apply-Mono→Δ (skip f) i       = trans (skipₚ-apply (Mono→Δ f) i) (cong suc (apply-Mono→Δ f i))
apply-Mono→Δ (edge f) zero    = edgeₚ-apply-zero (Mono→Δ f)
apply-Mono→Δ (edge f) (suc i) = trans (edgeₚ-apply-suc (Mono→Δ f) i) (apply-Mono→Δ f i)

Mono→Δ→Mono : (f : Mono n m) → Δ→Mono (Mono→Δ f) ≡ f
Mono→Δ→Mono f = apply-inj (Δ→Mono (Mono→Δ f)) f λ i → trans (apply-Δ→Mono (Mono→Δ f) i) (apply-Mono→Δ f i)

Δ→Mono→Δ' : (f : n Δ⇒ m) (i : Fin n) → ⟦ Mono→Δ (Δ→Mono f) ⟧ i ≡ ⟦ f ⟧ i
Δ→Mono→Δ' f i = trans (apply-Mono→Δ (Δ→Mono f) i) (apply-Δ→Mono f i)

Δ→Mono→Δ : (f : n Δ⇒ m) → Mono→Δ (Δ→Mono f) ≗ f
Δ→Mono→Δ f = Δ-eq λ {i} → Δ→Mono→Δ' f i

Using this result, and the isomorphism between Mono and Monotone, we can define a conversion from Monotone to Δ⇒:

Monotone→Δ : Monotone n m → n Δ⇒ m
Monotone→Δ f = Mono→Δ (Monotone→Mono f)

Monotone→Δ-correct : (f : Monotone n m) (i : Fin n)
                   → proj₁ f i ≡ ⟦ Monotone→Δ f ⟧ i
Monotone→Δ-correct f i = begin
  proj₁ f i                  ≡⟨ apply-Monotone→Mono f i ⟩
  apply (Monotone→Mono f) i  ≡⟨ sym (apply-Mono→Δ (Monotone→Mono f) i) ⟩
  ⟦ Monotone→Δ f ⟧ i         ∎

The Monotone→Δ is almost the decompose Reed was asking about. We need to know that the argument is also monotone to do the conversion. I think it's possible to define

  monotonise : (Fin m → Fin n) → Monotone m n

such that it is involutive on monotonic maps:

  monotonise-inv : (f : Monotone n m) → f ≐ monotonise (proj₁ f)

But if we have monotonise, then we can define

decompose : (Fin n → Fin m) → n Δ⇒ m
decompose f = Monotone→Δ (monotonise f)


First the maximum function, and a few lemmas:

infix 5 _∨_

_∨_ : Fin n → Fin n → Fin n
zero  ∨ j      = j
suc i ∨ zero   = suc i
suc i ∨ suc j  = suc (i ∨ j)

i≤j∨i : (i j : Fin n) → i ≤ j ∨ i
i≤j∨i zero    j       = z≤n
i≤j∨i (suc i) zero    = ≤-refl
i≤j∨i (suc i) (suc j) = s≤s (i≤j∨i i j)

i≤i∨j : (i j : Fin n) → i ≤ i ∨ j
i≤i∨j zero    j       = z≤n
i≤i∨j (suc i) zero    = ≤-refl
i≤i∨j (suc i) (suc j) = s≤s (i≤i∨j i j)

i≤j→i∨k≤i∨k : (i j k : Fin n) → i ≤ j → i ∨ k ≤ j ∨ k
i≤j→i∨k≤i∨k zero    j       k       0≤j       = i≤j∨i k j
i≤j→i∨k≤i∨k (suc i) (suc j) zero    i≤j       = i≤j
i≤j→i∨k≤i∨k (suc i) (suc j) (suc k) (s≤s i≤j) = s≤s (i≤j→i∨k≤i∨k i j k i≤j)

i≤j→j≡j∨i : (i j : Fin n) → i ≤ j → j ≡ j ∨ i
i≤j→j≡j∨i zero    zero    0≤0       = refl
i≤j→j≡j∨i zero    (suc j) i<j       = refl
i≤j→j≡j∨i (suc i) (suc j) (s≤s i≤j) = cong suc (i≤j→j≡j∨i i j i≤j)

Then we can write an algorithm to make an arbitrary f monotone:

The idea is to raise the floor for larger inputs:

monotonise-f' : (Fin (suc n) → Fin m) → (Fin n → Fin m)
monotonise-f' f k = f (suc k) ∨ f zero

monotonise-f : (Fin n → Fin m) → (Fin n → Fin m)
monotonise-f f zero    = f zero
monotonise-f f (suc i) = monotonise-f (monotonise-f' f) i
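Unfolding the recursion, monotonise-f is just a running maximum: the output at i is the maximum of f over {0..i}, which "raises the floor" for larger inputs exactly as described. A Haskell sketch of the same idea (an illustration added here, on plain Ints rather than Fin):

```haskell
-- Running-maximum version of monotonise-f: monotone by construction,
-- and the identity on functions that were already monotone.
monotoniseF :: (Int -> Int) -> Int -> Int
monotoniseF f i = maximum [ f j | j <- [0 .. i] ]
```

For example, the non-monotone sequence 3, 1, 4, 1, 5 is flattened to 3, 3, 4, 4, 5, while a monotone function comes back unchanged.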

The monotonised f is greater than just f:

monotonise-f-≤ : (f : Fin n → Fin m) (i j : Fin n)
               → i ≤ j
               → f i ≤ monotonise-f f j
monotonise-f-≤ f zero zero i≤j = ≤-refl
monotonise-f-≤ {n = suc (suc n)} f zero (suc j) i≤1+j = ≤-trans
  (i≤j∨i (f zero) (f (suc zero)))
  (monotonise-f-≤ (monotonise-f' f) zero j z≤n)
monotonise-f-≤ f (suc i) (suc j) (s≤s i≤j) = ≤-trans
  (i≤i∨j (f (suc i)) (f zero))
  (monotonise-f-≤ (monotonise-f' f) i j i≤j)

And the result is indeed monotone:

monotonise-mono : (f : Fin n → Fin m) → isMonotone (monotonise-f f)
monotonise-mono f zero    zero    0≤0       = ≤-refl
monotonise-mono f zero    (suc j) 0≤j       = monotonise-f-≤ f zero (suc j) z≤n
monotonise-mono f (suc i) (suc j) (s≤s i≤j) = monotonise-mono (monotonise-f' f) i j i≤j

So we can convert an arbitrary function to Monotone n m:

monotonise : (Fin n → Fin m) → Monotone n m
monotonise f = monotonise-f f , monotonise-mono f

Finally we can prove that monotonise is "involutive" when applied to already monotone functions:

monotonise-f'-mono : (f : Fin (suc n) → Fin m)
                   → isMonotone f
                   → isMonotone (monotonise-f' f)
monotonise-f'-mono f f-mono i j i≤j = i≤j→i∨k≤i∨k
  (f (suc i))
  (f (suc j))
  (f zero)
  (f-mono (suc i) (suc j) (s≤s i≤j))

monotonise-inv' : (f : Fin n → Fin m) → isMonotone f → ∀ i → f i ≡ monotonise-f f i
monotonise-inv' f f-mono zero    = refl
monotonise-inv' f f-mono (suc i) = begin
  f (suc i)                           ≡⟨ i≤j→j≡j∨i (f zero) (f (suc i)) (f-mono zero (suc i) z≤n) ⟩
  monotonise-f' f i                   ≡⟨ monotonise-inv' (monotonise-f' f) (monotonise-f'-mono f f-mono) i ⟩
  monotonise-f (monotonise-f' f) i    ∎

monotonise-inv : (f : Monotone n m) → f ≐ monotonise (proj₁ f)
monotonise-inv (f , f-mono) = monotonise-inv' f f-mono

And finally we can define decompose!

decompose : (Fin n → Fin m) → n Δ⇒ m
decompose f = Monotone→Δ (monotonise f)

October 08, 2022 12:00 AM

October 07, 2022

Matt Parsons

Femoroacetabular Impingement

Apparently, I’ve spent my entire life with a condition called “femoroacetabular impingement.” The bones in my hips are deformed - the femoral neck is too thick and misshapen, and I have a “pincer” on my acetabulum which restricts range of motion even further.

As a result, I was hardly able to internally rotate my hips at all - I had a single degree of range of motion (normal for the population is 45 degrees). I can get my knee to about 90 degrees, but that’s it - for my knees to come up any further, I need to flex my low back. This makes sitting, cycling, weightlifting, yoga, and, uh, pretty much everything a painful and difficult experience.

For a long time, I thought I just had “tight hamstrings,” and would occasionally get really into mobility exercises and stretching to try and improve it. Nothing ever worked. In fact, all of that stretching and mobilization was really stretching my low back, not my hamstrings, because the joint was already fully flexed - bone-on-bone contact.

And, yeah, bone-on-bone. From squatting, deadlifting, sitting in a chair and programming, and cycling, I’ve pretty much shredded the labrum on each side of my hip. Turns out, the weird aching pains in the front of my legs are hip arthritis.

I found out about all of this in such a roundabout way. Last year, my girlfriend wanted to join a bike racing team. She found a team ride/race for No Ride Around, which happened to be the team for my favorite local bike shop. I love cycling and wanted to support her, so I joined too, even though racing isn’t really my thing.

Being on a race team, especially a really supportive one, is a fantastic motivation. The team leader recommended Denver Fit Loft for a race bike fit. Charles Van Atta, the fitter, was surprised at my limited range of motion, and recommended that I consult an orthopedic surgeon for hip impingement.

Fortunately, Denver has a really great sports medicine scene. In my Google research, I found Dr. James Genuario, a world leading expert in exactly this sort of thing. Within a few weeks of the bicycle fitting, I had X-rays confirming a severe case of hip impingement. In a normal hip, there’s a number called the “alpha angle” that describes how round the femoral head is. A normal alpha angle is 45 degrees, and 50-55 degrees is considered “pathological” and warrants surgical intervention. My alpha angle is 69 degrees. Based on my current hip condition, I was looking at a total hip replacement in 5-15 years if I didn’t act quick.

Yet another fortunate coincidence - another member of my race team worked in medical device support, and knew many of the surgeons in the area. He gave me a bunch of advice, and spoke very well of Dr. Genuario.

I spent six weeks going to PT twice a week. Lots of weird stretches and exercises did - well, nothing at all. Insurance companies require six weeks of PT before they’ll pay for the MRI and CT scans required for surgery, much less the surgery itself. Apparently, about half of the folks that initially report these problems can resolve with stretching. Given my seriously messed up bone anatomy, I wish we could have skipped that step.

After six weeks, I got my MRI scan - and fortunately my connective tissue is good enough to warrant corrective surgery. A month of waiting, and I was able to get the CT scan, which provides a highly detailed 3D picture of my hip. The CT scan goes to Germany, where they construct (in software) a 3D model of a “healthy” version of my hip. This is the blueprint. The surgeon will use that to trim my bones to the right shape. What’s fun is that I found a video of this procedure on YouTube. They literally use a fancy dremel tool to shave the bone down.

On September 22nd, I received my first surgery. The doctor said that he wasn’t sure if he could repair the labrum, and I may need a reconstruction - which is a fancy way of saying “get a dead person’s labrum and stitch it in there.” Once I signed all the consent forms, they gave me a Valium, and started hooking me up to an IV. The nurse was jovial as I was being wheeled away - “we got you on the good drugs, it’s party time.” To which my drugged out self responded - “double fisting valium and whatever this is.” That’s my last memory before going under.

On waking up, the doctor said that he couldn’t repair the existing labrum - something about it looking like “crab meat.” Given that I was still high on the anesthesia, I said “hell yeah i’m part zombie.”

I was in a fog all that day, and for two days afterward, I was taking narcotics. I weaned myself off pretty quick, since I dislike the side effects, and they don’t work that great on me anyway. After a few days, my hip was feeling totally fine, but every single medication I was on otherwise had “constipation” as a side effect, including the anti-nausea medications. So when my stomach started to feel bad, I took all the nausea meds, which only made things worse. The 29th (my birthday) was the hardest day - I was completely laid up in bed. Once I determined the real cause of the stomach discomfort, it was pretty easy to manage.

I’m at two weeks post-op right now, and recovery is great. Dr. Genuario’s skill as a surgeon is remarkable - he was able to bring my alpha angle to 45 degrees. Despite removing so much bone, there is no pain at this point. I’m supposed to be weaning off of crutches starting next week, but truth be told, I’m only using a single crutch most of the time anyway. The range of motion in my operative leg is much better than the non-operative leg.

My second surgery is scheduled for November 3rd. Another three weeks in crutches, and I’ll be able to walk unassisted for Thanksgiving. Another three weeks of recovery and PT, and I’ll be able to ride a bike outside - hopefully in time for the winter solstice (would hate to lose my Solstice Century streak). I should be back to full strength and regular activity by April.

I’m incredibly grateful for everyone involved in the process. But the person who has helped the most is my partner. She’s supported me through all of this, helped me with my physical therapy, and changed my wound dressings.

October 07, 2022 12:00 AM


October 06, 2022

Brent Yorgey

Swarm alpha release!

The Swarm development team is very proud to announce the very first alpha release of the game. There are still many missing features (for example, saving games is not yet possible) and known bugs, but at this point it’s quite playable (and, dare we say, fun!) and ready for some intrepid souls to try it out and give us some feedback.

What is it?

Swarm is a 2D, open-world programming and resource gathering game with a strongly-typed, functional programming language and a unique upgrade system. Unlocking language features is tied to collecting resources, making it an interesting challenge to bootstrap your way into the use of the full language.

Notable changes since the last progress update include:

  • An all-new in-game tutorial consisting of a sequence of guided challenges
  • Several new challenge scenarios (mazes! towers of hanoi!), and documentation on how to make your own
  • Lots more in-game help and info, including help on currently available commands + recipes, and a dialog showing all live robots
  • Many more entities, recipes, and language features to explore and collect
  • Better mouse support
  • Backwards incremental search and tab completion in the REPL
  • Many, many small bug fixes and improvements!

Give it a try!

To install, check out the installation instructions: you can download a binary release (for now, Linux only, but MacOS binaries should be on the horizon), or install from Hackage. Give it a try and send us your feedback, either via a github issue or via IRC!

Future plans & getting involved

We’re still hard at work on the game, and will next turn our attention to some big features.

Of course, there are also tons of small things that need fixing and polishing too! If you’re interested in getting involved, check out our contribution guide, come join us on IRC (#swarm on Libera.Chat), or take a look at the list of issues marked “low-hanging fruit”.

Brought to you by the Swarm development team:

  • Brent Yorgey
  • Ondřej Šebek
  • Tristan de Cacqueray

With contributions from:

  • Alexander Block
  • Daniel Díaz Carrete
  • Huw Campbell
  • Ishan Bhanuka
  • Jacob
  • Jens Petersen
  • José Rafael Vieira
  • Joshua Price
  • lsmor
  • Noah Yorgey
  • Norbert Dzikowski
  • Paul Brauner
  • Ryan Yates
  • Sam Tay

…not to mention many others who gave valuable suggestions and feedback. Want to see your name listed here in the next release? See how you can contribute!

by Brent at October 06, 2022 08:01 PM

September 30, 2022

Oleg Grenrus

Three different thinnings

Posted on 2022-09-30 by Oleg Grenrus agda

I was lately again thinking about thinnings.

Thinnings are a weaker form of renamings, which we use in well-scoped or well-typed implementations of programming languages. (Their proper name is order-preserving embeddings; mathematicians may know them as morphisms in the augmented simplex category Δ₊.)

There is one well-known and widely used implementation of them. It's simple to use and to write proofs about. However, it's not super great. Especially not in Haskell, as it cannot be given a Category instance. (Though you almost never need thinnings in Haskell, so the point is a bit moot.)

I'll show two other implementations, and show that they are equivalent, using Cubical Agda to state the equivalences. Before we dive in, Agda module prologue:

{-# OPTIONS --cubical --safe #-}
module 2022-09-30-thinnings where

open import Cubical.Core.Everything
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
open import Cubical.Data.Nat
open import Cubical.Data.Empty
open import Cubical.Data.Sigma
open import Cubical.Relation.Nullary

I will show only well-scoped thinnings, so contexts are simply natural numbers. As there are plenty of them, let us define a few common variables.

variable
  n m p r : ℕ

Orthodox thinnings

For the sake of this post, I call the well-known thinnings orthodox, and use the ₒ subscript to indicate that.

data _⊑ₒ_ : ℕ → ℕ → Type where
  nilₒ   : zero ⊑ₒ zero
  skipₒ  : n ⊑ₒ m → n ⊑ₒ suc m
  keepₒ  : n ⊑ₒ m → suc n ⊑ₒ suc m

Orth = _⊑ₒ_

An example thinning:

exₒ : 5 ⊑ₒ 7
exₒ = keepₒ (skipₒ (keepₒ (skipₒ (keepₒ (keepₒ (keepₒ nilₒ))))))

Which would look like:

(Diagram: the thinning exₒ drawn as an order-preserving embedding of {0,…,4} into {0,…,6}; it sends 0↦0, 1↦1, 2↦2, 3↦4 and 4↦6, skipping targets 3 and 5.)

We can define identity thinning:

idₒ : n ⊑ₒ n
idₒ {zero}   = nilₒ
idₒ {suc n}  = keepₒ idₒ

Note how it pattern matches on the size (of the context). That is what makes it impossible to define a Category instance in Haskell.
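To make the Haskell remark concrete, here is a rough sketch of the orthodox encoding using type-level naturals (an illustration added here): identity has to recurse on a runtime size witness, so there is no `id :: Thin n n` and hence no plain Category instance.

```haskell
{-# LANGUAGE DataKinds, GADTs #-}

data Nat = Z | S Nat

-- runtime witness of a type-level Nat
data SNat n where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

data Thin n m where
  Nil  :: Thin 'Z 'Z
  Skip :: Thin n m -> Thin n ('S m)
  Keep :: Thin n m -> Thin ('S n) ('S m)

-- identity cannot be written without the size witness
idThin :: SNat n -> Thin n n
idThin SZ     = Nil
idThin (SS n) = Keep (idThin n)

-- size of the codomain, as a cheap observable
width :: Thin n m -> Int
width Nil      = 0
width (Skip f) = 1 + width f
width (Keep f) = 1 + width f
```

Since `id` in Category must have type `Thin n n` with no extra argument, this encoding cannot provide it.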

We can also define composition, and weakening on top of the context:

_⦂ₒ_ : n ⊑ₒ m → m ⊑ₒ p → n ⊑ₒ p
δ₁        ⦂ₒ nilₒ      = δ₁
δ₁        ⦂ₒ skipₒ δ₂  = skipₒ (δ₁ ⦂ₒ δ₂)
keepₒ δ₁  ⦂ₒ keepₒ δ₂  = keepₒ (δ₁ ⦂ₒ δ₂)
skipₒ δ₁  ⦂ₒ keepₒ δ₂  = skipₒ (δ₁ ⦂ₒ δ₂)

wkₒ : n ⊑ₒ suc n
wkₒ = skipₒ idₒ

As said, the proofs about this formulation are simple. Plenty of equalities hold definitionally:

keep-id≡idₒ : keepₒ idₒ ≡ idₒ {suc n}
keep-id≡idₒ = refl

Separate thinning

As mentioned in the previous section, the orthodox thinning is not very efficient. For example, when implementing normalization by evaluation (NbE) we run into problems: there we need an identity thinning when evaluating every application, so we pay a price proportional to the size of the current context!

In his work, András Kovács uses a variant swapping nilₒ for idₒ. However, thinnings then no longer have a unique representation, and proofs become more inconvenient to write.

We can make a special case for the identity thinning without sacrificing unique representation, at the cost of a slightly more complicated definition. We just need to consider the identity thinning and the non-identity ones separately.

data _⊏ₛ_ : ℕ → ℕ → Type where
  wkₛ    : n ⊏ₛ suc n
  keepₛ  : n ⊏ₛ m → suc n ⊏ₛ suc m
  skipₛ  : n ⊏ₛ m → n ⊏ₛ suc m

data _⊑ₙ_ : ℕ → ℕ → Type where
  idₙ    : n ⊑ₙ n
  strict : n ⊏ₛ m → n ⊑ₙ m

Strict = _⊏ₛ_
NonStr = _⊑ₙ_

We can implement most operations without much trouble. Note that wkₙ also has a small, context-size-independent representation.

nilₙ : zero ⊑ₙ zero
nilₙ = idₙ

wkₙ : ∀ {n} → n ⊑ₙ suc n
wkₙ = strict wkₛ

skipₙ : n ⊑ₙ m → n ⊑ₙ suc m
skipₙ idₙ         = wkₙ
skipₙ (strict x)  = strict (skipₛ x)

keepₙ : n ⊑ₙ m → suc n ⊑ₙ suc m
keepₙ idₙ         = idₙ
keepₙ (strict δ)  = strict (keepₛ δ)

keep-id≡idₙ : keepₙ idₙ ≡ idₙ {suc n}
keep-id≡idₙ = refl

Composition is a bit more complicated than for the orthodox variant, but not considerably:

_⦂ₛ_ : n ⊏ₛ m → m ⊏ₛ p → n ⊏ₛ p
δ₁        ⦂ₛ wkₛ       = skipₛ δ₁
δ₁        ⦂ₛ skipₛ δ₂  = skipₛ (δ₁ ⦂ₛ δ₂)
wkₛ       ⦂ₛ keepₛ δ₂  = skipₛ δ₂
keepₛ δ₁  ⦂ₛ keepₛ δ₂  = keepₛ (δ₁ ⦂ₛ δ₂)
skipₛ δ₁  ⦂ₛ keepₛ δ₂  = skipₛ (δ₁ ⦂ₛ δ₂)

_⦂ₙ_ : n ⊑ₙ m → m ⊑ₙ p → n ⊑ₙ p
δ₁         ⦂ₙ idₙ        = δ₁
idₙ        ⦂ₙ strict δ₂  = strict δ₂
strict δ₁  ⦂ₙ strict δ₂  = strict (δ₁ ⦂ₛ δ₂)
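This separated encoding is exactly the structure that does admit a Haskell Category instance: the explicit identity constructor means id needs no size witness. A rough sketch mirroring ⊏ₛ and ⊑ₙ (an illustration added here):

```haskell
{-# LANGUAGE DataKinds, GADTs #-}
import Control.Category
import Prelude hiding (id, (.))

data Nat = Z | S Nat

data Strict n m where
  Wk    :: Strict n ('S n)
  KeepS :: Strict n m -> Strict ('S n) ('S m)
  SkipS :: Strict n m -> Strict n ('S m)

data NonStr n m where
  Id :: NonStr n n
  St :: Strict n m -> NonStr n m

-- diagrammatic composition of the strict parts, as in _⦂ₛ_
compS :: Strict n m -> Strict m p -> Strict n p
compS f         Wk        = SkipS f
compS f         (SkipS g) = SkipS (compS f g)
compS Wk        (KeepS g) = SkipS g
compS (KeepS f) (KeepS g) = KeepS (compS f g)
compS (SkipS f) (KeepS g) = SkipS (compS f g)

instance Category NonStr where
  id = Id
  -- (.) composes right-to-left: g . f first applies f
  g    . Id   = g
  Id   . St f = St f
  St g . St f = St (compS f g)

-- how much the codomain grows, as a cheap observable
growth :: NonStr n m -> Int
growth Id     = 0
growth (St f) = go f
  where
    go :: Strict n m -> Int
    go Wk         = 1
    go (KeepS f') = go f'
    go (SkipS f') = 1 + go f'
```

Note that Category's `(.)` runs its right argument first, so `g . f` corresponds to the diagrammatic `f ⦂ₙ g` above.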

Are the orthodox and the separate thinnings the same?

Are ⊑ₒ and ⊑ₙ the same? We can construct an isomorphism between them to answer that question positively.

Orth→NonStr : n ⊑ₒ m → n ⊑ₙ m
Orth→NonStr nilₒ        = nilₙ
Orth→NonStr (keepₒ δ)   = keepₙ (Orth→NonStr δ)
Orth→NonStr (skipₒ δ)   = skipₙ (Orth→NonStr δ)

Strict→Orth : n ⊏ₛ m → n ⊑ₒ m
Strict→Orth wkₛ         = wkₒ
Strict→Orth (keepₛ δ)   = keepₒ (Strict→Orth δ)
Strict→Orth (skipₛ δ)   = skipₒ (Strict→Orth δ)

NonStr→Orth : n ⊑ₙ m → n ⊑ₒ m
NonStr→Orth idₙ         = idₒ
NonStr→Orth (strict δ)  = Strict→Orth δ

It is not enough to define the conversion functions; we also need to show that they cancel out. Luckily this is not difficult; we need a few auxiliary homomorphism lemmas.

NonStr→Orth-keepₒ : (δ : n ⊑ₙ m) → NonStr→Orth (keepₙ δ) ≡ keepₒ (NonStr→Orth δ)
NonStr→Orth-skipₒ : (δ : n ⊑ₙ m) → NonStr→Orth (skipₙ δ) ≡ skipₒ (NonStr→Orth δ)
Orth→NonStr-id≡id : ∀ n → Orth→NonStr idₒ ≡ idₙ {n}
NonStr→Orth-keepₒ idₙ         = refl
NonStr→Orth-keepₒ (strict _)  = refl

NonStr→Orth-skipₒ idₙ         = refl
NonStr→Orth-skipₒ (strict _)  = refl

Orth→NonStr-id≡id zero    = refl
Orth→NonStr-id≡id (suc n) = cong keepₙ (Orth→NonStr-id≡id n)

And finally we can show that Orth→NonStr and NonStr→Orth are each other's inverses.

Orth→NonStr→Orth    : (δ : n ⊑ₒ m) → NonStr→Orth (Orth→NonStr δ) ≡ δ
Strict→Orth→NonStr  : (δ : n ⊏ₛ m) → Orth→NonStr (Strict→Orth δ) ≡ strict δ
NonStr→Orth→NonStr  : (δ : n ⊑ₙ m) → Orth→NonStr (NonStr→Orth δ) ≡ δ
Orth→NonStr→Orth nilₒ       = refl
Orth→NonStr→Orth (keepₒ δ)  = NonStr→Orth-keepₒ (Orth→NonStr δ) ∙ cong keepₒ (Orth→NonStr→Orth δ)
Orth→NonStr→Orth (skipₒ δ)  = NonStr→Orth-skipₒ (Orth→NonStr δ) ∙ cong skipₒ (Orth→NonStr→Orth δ)

Strict→Orth→NonStr wkₛ        = cong skipₙ (Orth→NonStr-id≡id _)
Strict→Orth→NonStr (keepₛ δ)  = cong keepₙ (Strict→Orth→NonStr δ)
Strict→Orth→NonStr (skipₛ δ)  = cong skipₙ (Strict→Orth→NonStr δ)

NonStr→Orth→NonStr idₙ         = Orth→NonStr-id≡id _
NonStr→Orth→NonStr (strict δ)  = Strict→Orth→NonStr δ

In Cubical Agda we can promote the above isomorphism to an equality.

Orth≡NonStr-pointwise : (n ⊑ₒ m) ≡ (n ⊑ₙ m)
Orth≡NonStr-pointwise = isoToPath
  (iso Orth→NonStr NonStr→Orth NonStr→Orth→NonStr Orth→NonStr→Orth)

Orth≡NonStr : Orth ≡ NonStr
Orth≡NonStr i n m = Orth≡NonStr-pointwise {n} {m} i

But are they still the same?

Even if the types are the same, are the operations we defined on them the same? We still need to show that the operations give the same results.

I'll define a simplified "category operations" type, with an identity and a composition:

CatOps : (ℕ → ℕ → Type) → Type
CatOps _↝_
  = (∀ {n} → n ↝ n)                        -- identity
  × (∀ {n m p} → n ↝ m → m ↝ p → n ↝ p)    -- composition

Orthodox category ops are:

CatOps-Orth : CatOps Orth
CatOps-Orth = idₒ , _⦂ₒ_

And NonStr ops are:

CatOps-NonStr : CatOps NonStr
CatOps-NonStr = idₙ , _⦂ₙ_

And we can transport the orthodox ops along Orth≡NonStr to get the other variant:

CatOps-NonStrₜ : CatOps NonStr
CatOps-NonStrₜ = subst CatOps Orth≡NonStr CatOps-Orth

The goal is to show that all these are equal.

First, we construct a path between the two CatOps NonStr structures.

For the identity part we need an identity homomorphism:

Orth→NonStr-id : Orth→NonStr idₒ ≡ idₙ {n}
Orth→NonStr-id {zero}  = refl
Orth→NonStr-id {suc n} = cong keepₙ (Orth→NonStr-id {n})

Then we can extract the transported identity, and show it is the same as idₙ:

idₙₜ : n ⊑ₙ n
idₙₜ = fst CatOps-NonStrₜ

idₙₜ≡idₙ : idₙₜ ≡ idₙ {n}
idₙₜ≡idₙ = transportRefl (Orth→NonStr idₒ) ∙ Orth→NonStr-id

The composition is slightly more complicated.

skip-⦂ₙ : (δ₁ : n ⊑ₙ m) → (δ₂ : m ⊑ₙ p)
        → skipₙ (δ₁ ⦂ₙ δ₂) ≡ (δ₁ ⦂ₙ skipₙ δ₂)
skip-⦂ₙ idₙ         idₙ         = refl
skip-⦂ₙ (strict _)  idₙ         = refl
skip-⦂ₙ idₙ         (strict _)  = refl
skip-⦂ₙ (strict _)  (strict _)  = refl

skip-keep-⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p)
             → skipₙ (δ₁ ⦂ₙ δ₂) ≡ (skipₙ δ₁ ⦂ₙ keepₙ δ₂)
skip-keep-⦂ₙ δ₁          idₙ         = refl
skip-keep-⦂ₙ idₙ         (strict _)  = refl
skip-keep-⦂ₙ (strict _)  (strict _)  = refl

keep-keep-⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p)
             → keepₙ (δ₁ ⦂ₙ δ₂) ≡ (keepₙ δ₁ ⦂ₙ keepₙ δ₂)
keep-keep-⦂ₙ δ₁          idₙ         = refl
keep-keep-⦂ₙ idₙ         (strict x)  = refl
keep-keep-⦂ₙ (strict _)  (strict _)  = refl

We can show that Orth→NonStr preserves composition.

Orth→NonStr-⦂ : (δ₁ : n ⊑ₒ m) (δ₂ : m ⊑ₒ p)
              → Orth→NonStr (δ₁ ⦂ₒ δ₂) ≡ Orth→NonStr δ₁ ⦂ₙ Orth→NonStr δ₂
Orth→NonStr-⦂ δ₁          nilₒ        = refl
Orth→NonStr-⦂ δ₁          (skipₒ δ₂)  = cong skipₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ skip-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)
Orth→NonStr-⦂ (skipₒ δ₁)  (keepₒ δ₂)  = cong skipₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ skip-keep-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)
Orth→NonStr-⦂ (keepₒ δ₁)  (keepₒ δ₂)  = cong keepₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ keep-keep-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)

Using the above facts, we can show that the transported composition and _⦂ₙ_ are pointwise equal. The proof looks complicated, but is pretty straightforward in the end.

_⦂ₙₜ_ : n ⊑ₙ m → m ⊑ₙ p → n ⊑ₙ p
_⦂ₙₜ_ = snd CatOps-NonStrₜ

⦂ₙₜ≡⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p) → δ₁ ⦂ₙₜ δ₂ ≡ δ₁ ⦂ₙ δ₂
⦂ₙₜ≡⦂ₙ {n} {m} {p} δ₁ δ₂ =
  transport refl expr₁  ≡⟨ transportRefl expr₁ ⟩
  expr₁                 ≡⟨ expr₁≡expr₂ ⟩
  expr₂                 ≡⟨ Orth→NonStr-⦂ (NonStr→Orth δ₁) (NonStr→Orth δ₂) ⟩
  expr₃                 ≡⟨ (λ i → NonStr→Orth→NonStr δ₁ i ⦂ₙ
                                  NonStr→Orth→NonStr δ₂ i) ⟩
  δ₁ ⦂ₙ δ₂ ∎
  where
    expr₁ = Orth→NonStr (NonStr→Orth (transport refl δ₁) ⦂ₒ
                         NonStr→Orth (transport refl δ₂))
    expr₂ = Orth→NonStr (NonStr→Orth δ₁ ⦂ₒ NonStr→Orth δ₂)
    expr₃ = Orth→NonStr (NonStr→Orth δ₁) ⦂ₙ Orth→NonStr (NonStr→Orth δ₂)

    expr₁≡expr₂ : expr₁ ≡ expr₂
    expr₁≡expr₂ i = Orth→NonStr (NonStr→Orth (transportRefl δ₁ i) ⦂ₒ
                                 NonStr→Orth (transportRefl δ₂ i))

And finally we can state the first equality:

CatOps-NonStr≡ : CatOps-NonStrₜ ≡ CatOps-NonStr
CatOps-NonStr≡ i = idₙₜ≡idₙ i , λ δ₁ δ₂ → ⦂ₙₜ≡⦂ₙ δ₁ δ₂ i

and the equality we actually wanted to state: that CatOps-Orth and CatOps-NonStr are equal (if we equate their types by Orth≡NonStr)!

CatOps-Orth≡NonStr : (λ i → CatOps (Orth≡NonStr i))
  [ CatOps-Orth ≡ CatOps-NonStr ]
CatOps-Orth≡NonStr = toPathP CatOps-NonStr≡

Higher-inductive type

Cubical Agda also supports higher inductive types (HITs), i.e. types with additional equalities. We can formalize Andras' better-performing thinning as a HIT by throwing in an additional equality. Agda will then ensure that we always respect it.

data _⊑ₕ_ : ℕ → ℕ → Type where
  idₕ    :           n      ⊑ₕ n
  keepₕ  : n ⊑ₕ m →  suc n  ⊑ₕ suc m
  skipₕ  : n ⊑ₕ m →  n      ⊑ₕ suc m

  -- it is what it says: keepₕ idₕ ≡ idₕ
  keep-id≡idₕ : ∀ n → keepₕ (idₕ {n = n}) ≡ idₕ {n = suc n}

HIT = _⊑ₕ_

Composition for HIT-thinning looks very similar to the orthodox version...

_⦂ₕ_ : n ⊑ₕ m → m ⊑ₕ p → n ⊑ₕ p
δ₁        ⦂ₕ idₕ       = δ₁
δ₁        ⦂ₕ skipₕ δ₂  = skipₕ (δ₁ ⦂ₕ δ₂)
idₕ       ⦂ₕ keepₕ δ₂  = keepₕ δ₂
keepₕ δ₁  ⦂ₕ keepₕ δ₂  = keepₕ (δ₁ ⦂ₕ δ₂)
skipₕ δ₁  ⦂ₕ keepₕ δ₂  = skipₕ (δ₁ ⦂ₕ δ₂)

... except that we have extra cases which deal with the extra equality we threw in.

We have to show that the equations are consistent with the keep-id≡idₕ equality. The goals may look obfuscated, but they are relatively easy to fill.

keep-id≡idₕ n i ⦂ₕ keepₕ δ₂ = goal i
  where
  lemma : ∀ {n m} (δ : HIT n m) → idₕ ⦂ₕ δ ≡ δ
  lemma idₕ = refl
  lemma (keepₕ δ) = refl
  lemma (skipₕ δ) = cong skipₕ (lemma δ)
  lemma (keep-id≡idₕ n i) j = keep-id≡idₕ n i

  goal : keepₕ (idₕ ⦂ₕ δ₂) ≡ keepₕ δ₂
  goal i = keepₕ (lemma δ₂ i)

idₕ               ⦂ₕ keep-id≡idₕ n i = keep-id≡idₕ n i
keepₕ δ₁          ⦂ₕ keep-id≡idₕ n i = keepₕ δ₁
skipₕ δ₁          ⦂ₕ keep-id≡idₕ n i = skipₕ δ₁
keep-id≡idₕ .n i  ⦂ₕ keep-id≡idₕ n j = goal i j
  where
   goal : Square refl (keep-id≡idₕ n) refl (keep-id≡idₕ n)
   goal i j = keep-id≡idₕ n (i ∧ j)

We can try to prove that the HIT variant is the same as the orthodox one. The conversion functions are extremely simple, because the data-type is almost the same:

Orth→HIT : n ⊑ₒ m → n ⊑ₕ m
Orth→HIT nilₒ      = idₕ
Orth→HIT (keepₒ δ) = keepₕ (Orth→HIT δ)
Orth→HIT (skipₒ δ) = skipₕ (Orth→HIT δ)

HIT→Orth : n ⊑ₕ m → n ⊑ₒ m
HIT→Orth idₕ                = idₒ
HIT→Orth (keepₕ δ)          = keepₒ (HIT→Orth δ)
HIT→Orth (skipₕ δ)          = skipₒ (HIT→Orth δ)
HIT→Orth (keep-id≡idₕ n i)  = keep-id≡idₒ {n} i

Converting the orthodox representation to the HIT and back doesn't change the thinning. The proof is straightforward structural induction.

Orth→HIT→Orth : (δ : Orth n m) → HIT→Orth (Orth→HIT δ) ≡ δ
Orth→HIT→Orth nilₒ       = refl
Orth→HIT→Orth (keepₒ δ)  = cong keepₒ (Orth→HIT→Orth δ)
Orth→HIT→Orth (skipₒ δ)  = cong skipₒ (Orth→HIT→Orth δ)

The opposite direction, on the other hand, is tricky.

The easy part is to show that Orth→HIT preserves the identity, which shows that idₕ roundtrips.

Orth→HIT-id : ∀ n → Orth→HIT idₒ ≡ idₕ {n}
Orth→HIT-id zero     = refl
Orth→HIT-id (suc n)  = cong keepₕ (Orth→HIT-id n) ∙ keep-id≡idₕ n

We also have to show that keep-id≡idₕ roundtrips. This is considerably more challenging. Luckily, if you squint enough (and are familiar with the cubical library), you notice the pattern:

lemma : ∀ n → Square
  (cong keepₕ (Orth→HIT-id n))
  (cong keepₕ (Orth→HIT-id n) ∙ keep-id≡idₕ n)
  (refl {x = keepₕ (Orth→HIT idₒ)})
  (keep-id≡idₕ n)
lemma n = compPath-filler
  {x = keepₕ (Orth→HIT idₒ)}
  (cong keepₕ (Orth→HIT-id n))
  (keep-id≡idₕ n)

(In general, proving equalities about equalities in Cubical Agda, i.e. filling squares and cubes, feels like black magic.)

Using these lemmas we can finish the equality proof:

HIT→Orth→HIT : (δ : HIT n m) → Orth→HIT (HIT→Orth δ) ≡ δ
HIT→Orth→HIT idₕ                  = Orth→HIT-id _
HIT→Orth→HIT (keepₕ δ)            = cong keepₕ (HIT→Orth→HIT δ)
HIT→Orth→HIT (skipₕ δ)            = cong skipₕ (HIT→Orth→HIT δ)
HIT→Orth→HIT (keep-id≡idₕ n i) j  = lemma n i j

Orth≡HIT-pointwise : (n ⊑ₒ m) ≡ (n ⊑ₕ m)
Orth≡HIT-pointwise =
  isoToPath (iso Orth→HIT HIT→Orth HIT→Orth→HIT Orth→HIT→Orth)

Orth≡HIT : Orth ≡ HIT
Orth≡HIT i n m = Orth≡HIT-pointwise {n} {m} i

And we can show that this thinning's identity and composition behave like the orthodox ones. The identity homomorphism we have already proven; composition is easy, as the HIT's structure resembles the structure of the orthodox thinning:

Orth→HIT-⦂ : ∀ {n m p} (δ₁ : Orth n m) (δ₂ : Orth m p)
           → Orth→HIT (δ₁ ⦂ₒ δ₂) ≡ Orth→HIT δ₁ ⦂ₕ Orth→HIT δ₂
Orth→HIT-⦂ δ₁           nilₒ       = refl
Orth→HIT-⦂ δ₁          (skipₒ δ₂)  = cong skipₕ (Orth→HIT-⦂ δ₁ δ₂)
Orth→HIT-⦂ (keepₒ δ₁)  (keepₒ δ₂)  = cong keepₕ (Orth→HIT-⦂ δ₁ δ₂)
Orth→HIT-⦂ (skipₒ δ₁)  (keepₒ δ₂)  = cong skipₕ (Orth→HIT-⦂ δ₁ δ₂)

Then we can repeat what we did with the previous thinning.

CatOps-HIT : CatOps HIT
CatOps-HIT = idₕ , _⦂ₕ_

CatOps-HITₜ : CatOps HIT
CatOps-HITₜ = subst CatOps Orth≡HIT CatOps-Orth

Identities are equal:

idₕₜ : n ⊑ₕ n
idₕₜ = fst CatOps-HITₜ

idₕₜ≡idₕ : idₕₜ ≡ idₕ {n}
idₕₜ≡idₕ = transportRefl (Orth→HIT idₒ) ∙ Orth→HIT-id _

and so is composition (literally the same code as in the previous section; this could be automated, but it's not worth it for a blog post):

_⦂ₕₜ_ : n ⊑ₕ m → m ⊑ₕ p → n ⊑ₕ p
_⦂ₕₜ_ = snd CatOps-HITₜ

⦂ₕₜ≡⦂ₕ : (δ₁ : n ⊑ₕ m) (δ₂ : m ⊑ₕ p) → δ₁ ⦂ₕₜ δ₂ ≡ δ₁ ⦂ₕ δ₂
⦂ₕₜ≡⦂ₕ {n} {m} {p} δ₁ δ₂ =
  transport refl expr₁  ≡⟨ transportRefl expr₁ ⟩
  expr₁                 ≡⟨ expr₁≡expr₂ ⟩
  expr₂                 ≡⟨ Orth→HIT-⦂ (HIT→Orth δ₁) (HIT→Orth δ₂) ⟩
  expr₃                 ≡⟨ (λ i → HIT→Orth→HIT δ₁ i ⦂ₕ HIT→Orth→HIT δ₂ i) ⟩
  δ₁ ⦂ₕ δ₂ ∎
  where
    expr₁ = Orth→HIT (HIT→Orth (transport refl δ₁) ⦂ₒ
                      HIT→Orth (transport refl δ₂))
    expr₂ = Orth→HIT (HIT→Orth δ₁ ⦂ₒ HIT→Orth δ₂)
    expr₃ = Orth→HIT (HIT→Orth δ₁) ⦂ₕ Orth→HIT (HIT→Orth δ₂)

    expr₁≡expr₂ : expr₁ ≡ expr₂
    expr₁≡expr₂ i = Orth→HIT (HIT→Orth (transportRefl δ₁ i) ⦂ₒ
                              HIT→Orth (transportRefl δ₂ i))

And the equalities of CatOps:

CatOps-HIT≡ : CatOps-HITₜ ≡ CatOps-HIT
CatOps-HIT≡ i = idₕₜ≡idₕ i , λ δ₁ δ₂ → ⦂ₕₜ≡⦂ₕ δ₁ δ₂ i

CatOps-Orth≡HIT : (λ i → CatOps (Orth≡HIT i)) [ CatOps-Orth ≡ CatOps-HIT ]
CatOps-Orth≡HIT = toPathP CatOps-HIT≡


We have seen three definitions of thinnings: the orthodox one, the one with an identity constructor yet unique representations, and the variant using an additional equality. Using Cubical Agda we verified that these three definitions are equal, and that their identity and composition operations behave the same.

What can we learn from this?

Well. It is morally correct to define

data Thin n m where
  ThinId   ::             Thin    n     n
  ThinSkip :: Thin n m -> Thin    n  (S m)
  ThinKeep :: Thin n m -> Thin (S n) (S m)

as long as you pay attention not to differentiate between ThinKeep ThinId and ThinId, you are safe. GHC won't tell you if you write something inconsistent.

For example checking whether the thinning is an identity:

isThinId :: Thin n m -> Maybe (n :~: m)
isThinId ThinId = Just Refl
isThinId _      = Nothing

is not correct, but will be accepted by GHC. (It won't be by Cubical Agda.)
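To make the inconsistency concrete, here is a hypothetical, self-contained sketch (with an explicit promoted Nat index; names otherwise as in the post). ThinKeep ThinId and ThinId denote the same thinning, yet isThinId answers differently for them, and GHC accepts this without complaint:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Type.Equality ((:~:) (..))

data Nat = Z | S Nat

data Thin (n :: Nat) (m :: Nat) where
  ThinId   :: Thin n n
  ThinSkip :: Thin n m -> Thin n ('S m)
  ThinKeep :: Thin n m -> Thin ('S n) ('S m)

-- The incorrect check from above: it only recognizes the
-- ThinId representation of the identity thinning.
isThinId :: Thin n m -> Maybe (n :~: m)
isThinId ThinId = Just Refl
isThinId _      = Nothing

-- Both terms below represent the identity thinning on one variable,
-- but isThinId answers Just Refl for the first and Nothing for the second.
idA, idB :: Thin ('S 'Z) ('S 'Z)
idA = ThinId
idB = ThinKeep ThinId
```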

But if you don't trust yourself, you can go for the slightly more complicated

data Thin n m where
  ThinId ::              Thin n n
  Thin'  :: Thin' n m -> Thin n m

data Thin' n m where
  ThinWk   ::              Thin'    n  (S n)
  ThinSkip :: Thin' n m -> Thin'    n  (S m)
  ThinKeep :: Thin' n m -> Thin' (S n) (S m)
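With this split representation, the identity check can be written honestly. A hypothetical sketch (again with an explicit promoted Nat index): since every Thin' strictly increases the index, answering Nothing in the Thin' case is genuinely justified rather than a silent lie.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Type.Equality ((:~:) (..))

data Nat = Z | S Nat

-- the split representation from the post
data Thin (n :: Nat) (m :: Nat) where
  ThinId :: Thin n n
  Thin'  :: Thin' n m -> Thin n m

data Thin' (n :: Nat) (m :: Nat) where
  ThinWk   :: Thin' n ('S n)
  ThinSkip :: Thin' n m -> Thin' n ('S m)
  ThinKeep :: Thin' n m -> Thin' ('S n) ('S m)

-- A strict thinning always has n < m, so the identity thinning has
-- exactly one representation, ThinId, and this check cannot lie.
isThinId :: Thin n m -> Maybe (n :~: m)
isThinId ThinId    = Just Refl
isThinId (Thin' _) = Nothing
```

GHC still cannot verify the "n < m" invariant of Thin', but at least there is no longer a second representation of the identity for isThinId to miss.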

In either case you will be able to write a Category instance:

instance Category Thin where
  id = ThinId
  (.) = _look_above_in_the_Agda_Code

which is not possible with an orthodox thinning definition.
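The composition elided above can be transcribed from the Agda _⦂ₕ_ clauses. Here is a hypothetical sketch for the first representation, together with the action on variables, so we can check that composing and then acting agrees with acting twice (compose, thin, and varToInt are my names, not from the post):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Thin (n :: Nat) (m :: Nat) where
  ThinId   :: Thin n n
  ThinSkip :: Thin n m -> Thin n ('S m)
  ThinKeep :: Thin n m -> Thin ('S n) ('S m)

data Var (n :: Nat) where
  VZ :: Var ('S n)
  VS :: Var n -> Var ('S n)

-- composition, following the Agda _⦂ₕ_ clauses
compose :: Thin n m -> Thin m p -> Thin n p
compose d1 ThinId         = d1
compose d1 (ThinSkip d2)  = ThinSkip (compose d1 d2)
compose ThinId        (ThinKeep d2) = ThinKeep d2   -- must agree with ThinKeep ThinId!
compose (ThinKeep d1) (ThinKeep d2) = ThinKeep (compose d1 d2)
compose (ThinSkip d1) (ThinKeep d2) = ThinSkip (compose d1 d2)

-- the action on variables, mirroring thinₕ
thin :: Thin n m -> Var n -> Var m
thin ThinId       x      = x
thin (ThinSkip d) x      = VS (thin d x)
thin (ThinKeep _) VZ     = VZ
thin (ThinKeep d) (VS x) = VS (thin d x)

-- for inspecting results
varToInt :: Var n -> Int
varToInt VZ     = 0
varToInt (VS x) = 1 + varToInt x
```

The third clause of compose is the delicate one: compose ThinId (ThinKeep d2) must give the same result as compose (ThinKeep ThinId) (ThinKeep d2) would, which holds because idₕ is a left identity of composition. Cubical Agda checks this; GHC takes it on trust.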


open import Cubical.Data.Nat.Order

-- thinnings can be converted to less-than-or-equal-to relation:
⊑ₕ→≤ : n ⊑ₕ m → n ≤ m
⊑ₕ→≤ idₕ = 0 , refl
⊑ₕ→≤ (keepₕ δ) with ⊑ₕ→≤ δ
... | n , p = n  , +-suc n _ ∙ cong suc p
⊑ₕ→≤ (skipₕ δ) with ⊑ₕ→≤ δ
... | n , p = suc n , cong suc p
⊑ₕ→≤ (keep-id≡idₕ n i) = lemma' i where
  lemma' : ⊑ₕ→≤ (keepₕ idₕ) ≡ ⊑ₕ→≤ (idₕ {suc n})
  lemma' = Σ≡Prop (λ m → isSetℕ (m + suc n) (suc n)) (refl {x = 0})

-- Then we can check whether thinning is an identity.
-- Agda forces us to not cheat.
-- (Well, and also → Dec (n ≡ m))
isThinId : n ⊑ₕ m → Dec (n ≡ m)
isThinId idₕ = yes refl
isThinId (keepₕ δ) with isThinId δ
... | yes p = yes (cong suc p)
... | no ¬p = no λ p → ¬p (injSuc p)
isThinId {n} {m} (skipₕ δ) with ⊑ₕ→≤ δ
... | (r , p) = no λ q → ¬m+n<m {m = n} {n = 0}
  (r , (r + suc (n + 0)    ≡⟨ +-suc r (n + 0) ⟩
        suc (r + (n + 0))  ≡⟨ cong (λ x → suc (r + x)) (+-zero n) ⟩
        suc (r + n)        ≡⟨ cong suc p ⟩
        suc _              ≡⟨ sym q ⟩
        n                  ∎))

isThinId (keep-id≡idₕ n i) = yes (λ _ → suc n)

-- Same for orthodox
⊑ₒ→≤ : n ⊑ₒ m → n ≤ m
⊑ₒ→≤ nilₒ = 0 , refl
⊑ₒ→≤ (skipₒ δ) with ⊑ₒ→≤ δ
... | n , p = suc n , cong suc p
⊑ₒ→≤ (keepₒ δ) with ⊑ₒ→≤ δ
... | n , p = n  , +-suc n _ ∙ cong suc p

-- if indices match, δ is idₒ
⊥-elim : {A : Type} → ⊥ → A
⊥-elim ()

idₒ-unique : (δ : n ⊑ₒ n) → δ ≡ idₒ
idₒ-unique nilₒ      = refl
idₒ-unique (skipₒ δ) = ⊥-elim (¬m<m (⊑ₒ→≤ δ))
idₒ-unique (keepₒ δ) = cong keepₒ (idₒ-unique δ)

-- or idₕ, for which direct proof is trickier.
idₕ-unique : (δ : n ⊑ₕ n) → δ ≡ idₕ
idₕ-unique {n} = subst {A = Σ _ CatOps}
   (λ { (_⊑_ , (id , _⦂_)) → (δ : n ⊑ n) → δ ≡ id })
   (λ i → Orth≡HIT i , CatOps-Orth≡HIT i)
   idₒ-unique

More extras

The most important operation thinnings support is their action on variables.

data Var : ℕ → Type where
  vz :          Var (suc n)
  vs : Var n →  Var (suc n)

Let us define the action for each of the variants:

thinₒ : n ⊑ₒ m → Var n → Var m
thinₒ nilₒ      ()
thinₒ (skipₒ δ) x      = vs (thinₒ δ x)
thinₒ (keepₒ δ) vz     = vz
thinₒ (keepₒ δ) (vs x) = vs (thinₒ δ x)

thinₛ : n ⊏ₛ m → Var n → Var m
thinₛ wkₛ       x      = vs x
thinₛ (skipₛ δ) x      = vs (thinₛ δ x)
thinₛ (keepₛ δ) vz     = vz
thinₛ (keepₛ δ) (vs x) = vs (thinₛ δ x)

thinₙ : n ⊑ₙ m → Var n → Var m
thinₙ idₙ        x = x
thinₙ (strict δ) x = thinₛ δ x

It's worth noticing that the HIT forces us to take the keep-id≡idₕ equality into account, so we cannot do silly stuff in the keepₕ cases.

thinₕ : n ⊑ₕ m → Var n → Var m
thinₕ idₕ       x      = x
thinₕ (skipₕ δ) x      = vs (thinₕ δ x)
thinₕ (keepₕ δ) vz     = vz
thinₕ (keepₕ δ) (vs x) = vs (thinₕ δ x)

thinₕ (keep-id≡idₕ n i) vz     = vz
thinₕ (keep-id≡idₕ n i) (vs x) = vs x

Let us prove that these definitions are compatible. First we need a simple lemma, that thinₒ idₒ is an identity function.

thin-idₒ : (x : Var n) → thinₒ idₒ x ≡ x
thin-idₒ {suc n} vz     = refl
thin-idₒ {suc n} (vs x) = cong vs (thin-idₒ x)

Action : ℕ → ℕ → (ℕ → ℕ → Type) → Type
Action n m _⊑_ = n ⊑ m → Var n → Var m

thinₙₜ : n ⊑ₙ m → Var n → Var m
thinₙₜ {n} {m} = subst (Action n m) Orth≡NonStr thinₒ

Strict→Orth-thin : (δ : n ⊏ₛ m) (x : Var n) → thinₒ (Strict→Orth δ) x ≡ thinₛ δ x
Strict→Orth-thin wkₛ       x      = cong vs (thin-idₒ x)
Strict→Orth-thin (skipₛ δ) x      = cong vs (Strict→Orth-thin δ x)
Strict→Orth-thin (keepₛ δ) vz     = refl
Strict→Orth-thin (keepₛ δ) (vs x) = cong vs (Strict→Orth-thin δ x)

NonStr→Orth-thin : (δ : n ⊑ₙ m) (x : Var n) → thinₒ (NonStr→Orth δ) x ≡ thinₙ δ x
NonStr→Orth-thin idₙ        x = thin-idₒ x
NonStr→Orth-thin (strict δ) x = Strict→Orth-thin δ x

thinₙₜ≡thinₙ-pointwise : (δ : n ⊑ₙ m) (x : Var n) → thinₙₜ δ x ≡ thinₙ δ x
thinₙₜ≡thinₙ-pointwise {n} {m} δ x
  = transportRefl (thinₒ (NonStr→Orth (transp (λ i → n ⊑ₙ m) i0 δ)) (transp (λ j → Var n) i0 x))
  ∙ cong₂ thinₒ (cong NonStr→Orth (transportRefl δ)) (transportRefl x)
  ∙ NonStr→Orth-thin δ x

thinₙₜ≡thinₙ : (thinₙₜ {n} {m}) ≡ thinₙ
thinₙₜ≡thinₙ i δ x = thinₙₜ≡thinₙ-pointwise δ x i

thinₒ≡thinₙ : (λ i → Action n m (Orth≡NonStr i)) [ thinₒ ≡ thinₙ ]
thinₒ≡thinₙ = toPathP thinₙₜ≡thinₙ

The HIT version is not much trickier, if at all.

thinₕₜ : n ⊑ₕ m → Var n → Var m
thinₕₜ {n} {m} = subst (Action n m) Orth≡HIT thinₒ

HIT→Orth-thin : (δ : n ⊑ₕ m) (x : Var n) → thinₒ (HIT→Orth δ) x ≡ thinₕ δ x
HIT→Orth-thin idₕ       x      = thin-idₒ x
HIT→Orth-thin (skipₕ δ) x      = cong vs (HIT→Orth-thin δ x)
HIT→Orth-thin (keepₕ δ) vz     = refl
HIT→Orth-thin (keepₕ δ) (vs x) = cong vs (HIT→Orth-thin δ x)

HIT→Orth-thin (keep-id≡idₕ n i) vz     = refl
HIT→Orth-thin (keep-id≡idₕ n i) (vs x) = cong vs (thin-idₒ x)

thinₕₜ≡thinₕ-pointwise : (δ : n ⊑ₕ m) (x : Var n) → thinₕₜ δ x ≡ thinₕ δ x
thinₕₜ≡thinₕ-pointwise {n} {m} δ x
  = transportRefl (thinₒ (HIT→Orth (transp (λ i → n ⊑ₕ m) i0 δ)) (transp (λ j → Var n) i0 x))
  ∙ cong₂ thinₒ (cong HIT→Orth (transportRefl δ)) (transportRefl x)
  ∙ HIT→Orth-thin δ x

thinₕₜ≡thinₕ : (thinₕₜ {n} {m}) ≡ thinₕ
thinₕₜ≡thinₕ i δ x = thinₕₜ≡thinₕ-pointwise δ x i

thinₒ≡thinₕ : (λ i → Action n m (Orth≡HIT i)) [ thinₒ ≡ thinₕ ]
thinₒ≡thinₕ = toPathP thinₕₜ≡thinₕ

In the end we have three variants of thinnings, with identity and composition, all of which act on variables the same way.

Now, if we prove properties of these operations, e.g. identity laws, associativity of composition, or that composition and the action commute, it is enough to prove them for the orthodox implementation; then we can simply transport the proofs.

In other words, whatever we prove about one structure will hold for the two others, like idₕ-unique in the previous section.

Some proofs are simple:

thin-idₕ : (x : Var n) → thinₕ idₕ x ≡ x
thin-idₕ x = refl

but we can get them through the equality anyway:

thin-idₕ' : (x : Var n) → thinₕ idₕ x ≡ x
thin-idₕ' {n} x = subst
  {A = Σ _ (λ _⊑_ → Action n n _⊑_ × (n ⊑ n))}                 -- structure
  (λ { (_⊑_ , thin , id) → thin id x ≡ x })                    -- motive
  (λ i → Orth≡HIT i , thinₒ≡thinₕ i , CatOps-Orth≡HIT i .fst)  -- proof that the structures are equal
  (thin-idₒ x)                                                 -- proof to transport

September 30, 2022 12:00 AM

September 26, 2022

Matthew Sackman

Complexity and software engineering

OK, it’s definitely not just the software industry. If you’ve seen the film The Big Short you may remember the seemingly endless secondary markets, adding complexity that led to no one understanding what risks they were exposed to. The motivation there seemed to be purely making money.

Look at the food on your plate at dinner, and try thinking about the complexity of where all the ingredients came from to make that meal. If you have meat on your plate it might have been grown in the same country as you, but maybe not for the food the animal ate. You’re probably also eating animal antibiotics (or the remains of them). Where were they made? How can you start to get a hold on the incredible complexity of the human food chain? The motivation here seems also to be to make money: if you can make the same product as your competitors, but cheaper, then you can undercut your competitors a little, have bigger margins, and make more money. Who cares if it requires enormous environmental damage, right? Products are sure as hell not priced to reflect the damage done to the environment to create, maintain, or dispose of them.

As an aside, have you ever marvelled at how incredible plants are? They literally convert dirt, sunlight, water, and a few minerals, into food. Ultimately we’re all just the result of dirt, sunlight, water, and a few minerals. Bonkers.

Software does seem a little different though. We seem to utterly fetishize complexity, mostly for bragging rights. I’ve certainly been guilty in the past (and I suspect in the future too), of building far more complicated things than necessary, because I can. In a number of cases I could concoct a benchmark which showed the new code was faster, thus justifying the increased complexity of the code, and the consequence of a more difficult code-base to maintain. I definitely get a buzz from making a complex thing work, and I suspect this is quite common. I’ve been told that at Amazon, promotion requires being able to demonstrate that you’ve built or maintained complex systems. Well I love hearing about unintended 2nd-order effects. The consequence here is pretty obvious: a whole bunch of systems get built in ludicrously complex ways just so that people can apply for promotion. I guess the motivation there is money too.

As I say, when building something complex, it can be rewarding when it works. Six months later I’ve often come to regret it, when the complexity is directly opposed to being able to fix a bug that’s surfaced. It can cause silos by creating “domain experts” (i.e. a ball and chain around your feet). I’ve had cases where I’ve had to build enormously complex bits of code in order to work around equally bonkers levels of tech-debt, which can’t be removed because of “business reasons”. The result is unnecessary complexity on top of unnecessary complexity. No single person can understand how the whole thing works (much less write down some semantics, or any invariants) because the code-base is now too large, too complex, and riddled with remote couplings, broken abstractions, poor design, and invalid assumptions. Certain areas of the code-base become feared, and more code gets added to avoid confronting the complexity. Developer velocity slows to an absolute crawl, developers get frustrated and head for the door. No one is left who understands much. With some time and space since that particular situation, it’s now easy for me to sit here and declare that sort of thing a red-flag, and that I should have run away from it sooner. Who knows what’ll happen next time?

I find it easy to convince myself that complexity I’ve built, or claim to understand, is acceptable, whilst complexity I don’t understand is abhorrent.

As an industry we seem to love to kid ourselves that we’re all solving the same problems as Google, Facebook, or Amazon etc. At no job I’ve ever worked do I believe the complexity that comes from use of Kubernetes is justified, and yet it seems to have become “best practice”, which I find baffling. On a recent project I decided to rent a single (virtual) server, and I deploy everything with a single nixos-rebuild --target-host myhost switch. Because everything on the server is running under systemd, and because of the way nixos restarts services, downtime is less than 2 seconds and is only visible to long-lived connections (WebSockets in this case): systemd can manage listening-sockets itself and pass them to the new executable, maintaining any backlog of pending connections.

To me, this “simplicity” is preferable to trying to achieve 100% service availability. I’m not going to lose any money because of 2 seconds of downtime, even if that happens several times a day. It’s much more valuable to me to be able to get code changes deployed quickly. Is this really simpler than using something like Kubernetes? Maybe: there are certainly fewer moving parts and all the nixos stuff is only involved when deploying updates. Nevertheless, it’s not exactly simple; but I believe I understand enough of it to be happy to build, use, and rely on it.

I was recently reading Nick Cameron’s blog post on 10 challenges for Rust. The 9th point made me think about the difficulty of maintaining the ability to make big changes to any large software project. We probably all know to say the right words about avoiding hidden or tight couplings, but evidence doesn’t seem to suggest that it’s possible in large sophisticated software projects.

We are taught to fear the “big rewrite” project, citing 2nd-system-syndrome, though the definition of that seems to be about erroneously replacing “small, elegant, and successful systems”. It’s not about replacing giant, bug riddled, badly understood and engineered systems (to be super clear, I’m talking about this in general, not about the Rust compiler which I know nothing about). I do think we are often mistaken to fear rebuilding systems: I suspect we look at the size of the existing system and estimate the difficulty of recreating it. But of course we don’t want to recreate all those problems. We’ve learnt from them and can carry that knowledge forwards (assuming you manage to stop an employee exodus). There’s no desire to recreate the mountains of code that stem from outdated assumptions, inevitable mistakes in code design, unnecessary and accidental complexity, tech-debt, and its workarounds.

I’ve been thinking about parallels in other industries. Given the current price of energy in the UK and how essential it is to improve the heating efficiency of our homes, it’s often cheaper to knock down existing awful housing and rebuild from scratch. No fear of 2nd-system-syndrome here: it’s pretty unarguable that a lot of housing in the UK is dreadful, both from the point of view of how we use rooms these days, and energy efficiency. Retrofitting can often end up being more expensive, less effective, slower, and addresses only a few problems. But incremental improvement doesn’t require as much up-front capital.

If you look at the creative arts, artists, authors, and composers all create a piece of work and then it’s pretty much done. Yes, there are plenty of examples of composers going back and revising works after they’ve been performed (Bruckner and Sibelius for example), sometimes for slightly odd reasons such as establishing or extending copyright (for example Stravinsky). But a piece of art is not built by a slowly changing team over a period of 10 years (film may be an interesting exception to this). When it’s time to start a new piece of art, well, it’s time. Knowledge, style, preferences, techniques: these are carried forwards. Shostakovich always sounds unmistakably like Shostakovich. But his fifth symphony is not his fourth with a few bug fixes.

At the other end of the spectrum, take the economic philosophy known as Georgism. From what I can gather, no serious economist on the left or right believes it would be a bad idea to implement, and it seems like it would have a great many benefits. But large landowners (people who own a lot of land, not people who own any amount of land and happen to be large) would probably have to pay more tax. Large landowners tend to currently have a lot of political power. Consequently Georgism never gets implemented in the West. So despite it being almost universally accepted as a good idea, because we can’t “start again”, we’re never going to have it. From what I can see, literally the only chance would be a successful violent uprising.

Finally, recently I came across “When Do Startups Scale? Large-scale Evidence from Job Postings” by Lee and Kim. Now this paper isn’t specifically looking at software, and they use the word “experiment” to mean changing the product the company is creating in order to find the best fit for their market – they’re not talking about experimenting with software. Nevertheless:

We find that startups that begin scaling within the first 12 months of their founding are 20 to 40% more likely to fail. Our results show that this positive correlation between scaling early and firm failure is negated for startups that engage in experimentation through A/B testing.

It’s definitely a big stretch, but in the case of software this could be evidence that delaying writing lots of code, for as long as possible, is beneficial. Avoid complexity; continue to experiment with prototypes and throw-away code and treat nothing as sacrosanct for as long as possible. Do not acquiesce to complexity: give it an inch and it’ll take a mile before you even realise what’s happened.

So what to do? I’ve sometimes thought that say, once a month, companies should run some git queries and identify the oldest code in their code-bases. This code hasn’t been touched for years. The authors have long since left. It may be that no one understands what it even does. What happens if we delete it? Now in many ways (probably all ways) this is a completely mad idea: if it ain’t broke, don’t fix it, and why waste engineering resources on recreating code and functionality that no one had a problem with?

But at the same time, if this was the culture across the entire company and was priced in, then it might enable some positive things:

  • There would be more eyes on, and understanding of, ancient code. Thus less ancient code, and more understanding in general.

  • This ancient code may well embody terribly outdated assumptions about what the product does. Can it be updated with the current ideas and assumptions?

  • This ancient code may also encode invariants about the product which are no longer true. There may be a way to change or relax them. By doing so you might be able to delete various workarounds and tech-debt that exists higher up.

Now because I would guess a lot of ancient code is quite foundational, changing it may very well be quite dangerous. One fix could very quickly demand another, and before you know it you’ve embarked upon rewriting the whole thing. Maybe that’s the goal: maybe you should aim to be able to rewrite huge sections of the product within a month if it is judged to be beneficial to the code-base. But of course this requires such ideas to be taken seriously and valued right across the company. For the engineering team to have a strong voice at the top table. And really is this so different from just keeping a list of areas of the code that no one likes and dedicating time to fixing those? I guess if nothing else, it might give a starting point for making such a list.

Unnecessary complexity in software seems endemic, and is frequently worshipped. This, and a fear of experiments to rewrite, blunts the drive to simplify. Yet the benefits of a smaller and simpler code-base are unarguable: with greater understanding of how the product works, a small team can move much faster.

September 26, 2022 04:01 PM

Philip Wadler

Angry Reviewer


Angry Reviewer is a tool to provide feedback on your writing. I look forward to trying it out.

by Philip Wadler at September 26, 2022 12:08 PM

September 24, 2022

Magnus Therning

Annotate projects in Emacs

Every now and then I've wished to write comments on files in a project, but I've never found a good way to do that. annotate.el and org-annotate-file both collect annotations in a central place (in my $HOME), while marginalia puts annotations in files next to the source files but in a format that's rather cryptic and tends to be messed up when attached to multiple lines. None of them is ideal, I'd like the format to be org-mode, but not in a central file. At the same time having one annotation file per source file is simply too much.

I tried wrapping org-annotate-file, setting org-annotate-file-storage-file and taking advantage of elisp's dynamic binding. However, it opens the annotation file in the current window, and I'd really like to split the window and open the annotations to the right. Rather than trying to sort of "work it out backwards" I decided to write a small package and use as much of the functionality in org-annotate-file.el as possible.

First off I decided that I want the annotation file to be called

(defvar org-projectile-annotate-file-name ""
  "The name of the file to store project annotations.")

Then I wanted a slightly modified version of org-annotate-file-show-section, I wanted it to respect the root of the project.

(defun org-projectile-annotate--file-show-section (storage-file)
  "Add or show annotation entry in STORAGE-FILE and return the buffer."
  ;; modified version of org-annotate-file-show-section
  (let* ((proj-root (projectile-project-root))
         (filename (file-relative-name buffer-file-name proj-root))
         (line (buffer-substring-no-properties (point-at-bol) (point-at-eol)))
         (annotation-buffer (find-file-noselect storage-file)))
    (with-current-buffer annotation-buffer
      (org-annotate-file-annotate filename line))
    annotation-buffer))

The main function can then simply work out where the file with annotations should be located and call org-projectile-annotate--file-show-section.

(defun org-projectile-annotate ()
  (interactive)
  (let ((annot-fn (file-name-concat (projectile-project-root)
                                    org-projectile-annotate-file-name)))
    (set-window-buffer (split-window-right)
                       (org-projectile-annotate--file-show-section annot-fn))))

When testing it all out I noticed that org-store-link makes a link with a search text. In my case it would be much better to have links with line numbers. I found there's a hook to modify the behaviour of org-store-link, org-create-file-search-functions. So I wrote a function to get the kind of links I want, but only when the project annotation file is open in a buffer.

(defun org-projectile-annotate-file-search-func ()
  "A function returning the current line number when called in a
project while the project annotation file is open.

This function is designed for use in the hook
'org-create-file-search-functions'. It changes the behaviour of
'org-store-link' so it constructs a link with a line number
instead of a search string."
  ;; TODO: find a way to make the link description nicer
  (when (and (projectile-project-p)
             (get-buffer-window org-projectile-annotate-file-name))
    (number-to-string (line-number-at-pos))))

That's it, now I only have to wait until the next time I want to comment on a project to see if it improves my way of working.

September 24, 2022 08:42 PM

September 21, 2022

Lysxia's blog

The quantified constraint trick

My favorite Haskell trick is how to use quantified constraints with type families. Kudos to Iceland_jack for coming up with it.

Quantified constraints and type families

QuantifiedConstraints is an extension from GHC 8.6 that lets us use forall in constraints.

It lets us express constraints for instances of higher-kinded types like Fix:

newtype Fix f = Fix (f (Fix f))

deriving instance (forall a. Eq a => Eq (f a)) => Eq (Fix f)

Other solutions existed previously, but they’re less elegant:

deriving instance Eq (f (Fix f)) => Eq (Fix f)

instance Eq1 f => Eq (Fix f) where ...
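To see the quantified-constraint deriving in action, here is a small self-contained sketch; the ListF base functor and the nil/cons helpers are my own illustration, not from the post:

```haskell
{-# LANGUAGE QuantifiedConstraints #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}

newtype Fix f = Fix (f (Fix f))

-- Eq (Fix f) holds whenever f preserves Eq.
deriving instance (forall a. Eq a => Eq (f a)) => Eq (Fix f)

-- A base functor for lists, so we can build concrete Fix values.
data ListF a r = NilF | ConsF a r deriving Eq

nil :: Fix (ListF a)
nil = Fix NilF

cons :: a -> Fix (ListF a) -> Fix (ListF a)
cons x xs = Fix (ConsF x xs)

main :: IO ()
main = do
  print (cons 1 (cons 2 nil) == cons 1 (cons 2 nil :: Fix (ListF Int)))  -- True
  print (cons 1 nil == (nil :: Fix (ListF Int)))                         -- False
```

The derived instance compares fixpoints structurally: the quantified context supplies Eq (ListF Int r) at every level of unrolling.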

It also lets us say that a monad transformer indeed transforms monads:

class (forall m. Monad m => Monad (t m)) => MonadTrans t where
  lift :: m a -> t m a

(Examples lifted from the GHC User Guide on QuantifiedConstraints, section Motivation.)
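The payoff of that quantified superclass is that code polymorphic in the transformer gets Monad (t m) for free. A minimal sketch: the local MonadTrans and IdentityT below mirror what transformers >= 0.6 ships, but are defined inline so the example stands alone.

```haskell
{-# LANGUAGE QuantifiedConstraints #-}

-- Local copy of the class; transformers >= 0.6 declares it this way.
class (forall m. Monad m => Monad (t m)) => MonadTrans t where
  lift :: Monad m => m a -> t m a

newtype IdentityT m a = IdentityT { runIdentityT :: m a }

instance Monad m => Functor (IdentityT m) where
  fmap f (IdentityT m) = IdentityT (fmap f m)
instance Monad m => Applicative (IdentityT m) where
  pure = IdentityT . pure
  IdentityT f <*> IdentityT x = IdentityT (f <*> x)
instance Monad m => Monad (IdentityT m) where
  IdentityT m >>= k = IdentityT (m >>= runIdentityT . k)

instance MonadTrans IdentityT where
  lift = IdentityT

-- Monad (t m) comes from the quantified superclass alone:
-- no extra constraint needed to sequence lifted actions.
twice :: (MonadTrans t, Monad m) => m a -> t m a
twice act = lift act >> lift act

main :: IO ()
main = print =<< runIdentityT (twice (pure (5 :: Int)))  -- prints 5
```

Without the quantified superclass, twice would need an explicit Monad (t m) constraint, which callers would have to discharge for every concrete transformer.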

One restriction is that the conclusion of a quantified constraint cannot mention a type family.

type family F a

-- (forall a. C (F a))  -- Illegal type family application in a quantified constraint

A quantified constraint can be thought of as providing a local instance, and it is subject to a restriction similar to the one on instance heads, so that instance resolution can match required constraints against the heads of existing instances.

Type families are not matchable: we cannot decide whether an applied type family F a matches a type constructor T in a way that satisfies the properties instance resolution requires (“coherence”). So type families can’t appear in the conclusion of a quantified constraint.

The quantified constraint trick

Step 1

To legalize type families in quantified constraints, all we need is a class synonym:

class    C (F a) => CF a
instance C (F a) => CF a

That CF a is equivalent to C (F a), and forall a. CF a is legal.

Step 2?

Since GHC 9.2, Step 1 alone solves the problem. It Just Works™. And I don’t know why.

Before that, for GHC 9.0 and prior, we also needed to hold the compiler’s hand and tell it how to instantiate the quantified constraint.

Indeed, functions may now have constraints of the form forall a. CF a, which should imply C (F x) for any x. But although CF and C (F x) are logically related, requiring C (F x) triggers a search for instances of the class C, not of CF, which is what the quantified constraint provides. The search fails unless we give the compiler a hint.

When you require a constraint C (F x), insert a type annotation mentioning the CF x constraint (using the CF class instead of C).

_ {- C (F x) available here -} :: CF x => _

Inside the annotation (to the left of ::), we are given CF x, from which C (F x) is inferred as a superclass. Outside the annotation, we are requiring CF x, which is trivially solved by the quantified constraint forall a. CF a.


-- Mixing quantified constraints with type families --

class C a
type family F a

-- forall a. C (F a)  -- Nope.

class    C (F a) => CF a  -- Class synonym
instance C (F a) => CF a

-- forall a. CF a     -- Yup.

-- Some provided function we want to call.
f :: C (F t) => t

-- A function we want to implement using f.
g :: (forall a. CF a) => t
g = f               -- OK on GHC >= 9.2
g = f :: CF t => t  -- Annotation needed on GHC <= 9.0

The part of that type annotation that really matters is the constraint. The rest of the type to the right of the arrow is redundant. Another way to write only the constraint uses the following identity function with a fancy type:

with :: forall c r. (c => r) -> (c => r)
with x = x

So you can supply the hint like this instead:

g :: forall t. (forall a. CF a) => t
g = with @(CF t) f

Application: generic-functor

What do I need that trick for? It comes up in generic metaprogramming.

Imagine deriving Functor for Generic types (no Generic1, which is not as general as you might hope). One way is to implement the following class on generic representations:

class RepFmap a a' rep rep' where
  repFmap :: (a -> a') -> rep -> rep'

A type constructor f :: Type -> Type will be a Functor when its generic representation (Rep) implements RepFmap a a'… for all a, a'.

-- Class synonym for generically derivable functors
class    (forall a. Generic (f a), forall a a'. RepFmap a a' (Rep (f a) ()) (Rep (f a') ())) => GFunctor f
instance ...   -- idem (class synonym)

-- Wait a second...

But that is illegal, because the type family Rep occurs in the conclusion of a quantified constraint.

Time for the trick! We give a new name to the conclusion:

class    RepFmap a a' (Rep (f a) ()) (Rep (f a') ()) => RepFmapRep a a' f
instance ...  -- idem (class synonym)

And we can use it in a quantified constraint:

-- Now this works!
class    (forall a. Generic (f a), forall a a'. RepFmapRep a a' f) => GFunctor f
instance ...   -- idem (class synonym)

To obtain the final generic implementation of fmap, we wrap repFmap between to and from.

gfmap :: forall f a a'. GFunctor f => (a -> a') -> f a -> f a'
gfmap f =
  with @(RepFmapRep a a' f)             -- Hand-holding for GHC <= 9.0
    (to @_ @() . repFmap f . from @_ @())

Et voilà.

(Gist of this example)

Appendix: Couldn’t we do this instead?

If you’ve followed all of that, there’s one other way you might try defining gfmap without QuantifiedConstraints, by just listing the three constraints actually needed in the body of the function.

-- Dangerous gfmap!
gfmap ::
  Generic (f a) =>
  Generic (f a') =>
  RepFmap a a' (Rep (f a) ()) (Rep (f a') ()) =>
  (a -> a') -> f a -> f a'
gfmap f = to @_ @() . repFmap f . from @_ @()

This is okay as long as it is only ever used to implement fmap as in:

fmap = gfmap

Any other use voids a guarantee you didn’t know you expected.

The thing I haven’t told you is that RepFmap is implemented with… incoherent instances!