Planet Haskell

September 30, 2022

Oleg Grenrus

Three different thinnings

Posted on 2022-09-30 by Oleg Grenrus

Lately I have again been thinking about thinnings.

Thinnings are a weaker form of renamings, which we use in well-scoped or well-typed implementations of programming languages. (Their proper name is order-preserving embeddings; mathematicians may know them as morphisms in the augmented simplex category Δ₊.)

There is one well-known and widely used implementation of them. It's simple to use and to write proofs about. However, it's not super great. In particular it's not great in Haskell, as it cannot be given a Category instance. (Though you almost never need thinnings in Haskell, so the reason is a bit moot.)

I'll show two other implementations, and show that they are equivalent, using Cubical Agda to state the equivalences. Before we dive in, Agda module prologue:

{-# OPTIONS --cubical --safe #-}
module 2022-09-30-thinnings where

open import Cubical.Core.Everything
open import Cubical.Foundations.Prelude
open import Cubical.Foundations.Isomorphism
open import Cubical.Data.Nat
open import Cubical.Data.Empty
open import Cubical.Data.Sigma
open import Cubical.Relation.Nullary

I will show only well-scoped thinnings, so the contexts are simply natural numbers. As there are plenty of them, let us define a few common variables.

variable
  n m p r : ℕ

Orthodox thinnings

For the sake of this post, I call the well-known thinnings orthodox, and use the ₒ subscript to indicate that.

data _⊑ₒ_ : ℕ → ℕ → Type where
  nilₒ   :           zero   ⊑ₒ zero
  skipₒ  : n ⊑ₒ m →  n      ⊑ₒ suc m
  keepₒ  : n ⊑ₒ m →  suc n  ⊑ₒ suc m

Orth = _⊑ₒ_

An example thinning looks like this:

exₒ : 5 ⊑ₒ 7
exₒ = keepₒ (skipₒ (keepₒ (skipₒ (keepₒ (keepₒ (keepₒ nilₒ))))))

Which would look like:

[Figure: source variables 0 to 4 on the left, target variables 0 to 6 on the right. Lines connect 0 to 0, 1 to 1, 2 to 2, 3 to 4, and 4 to 6; targets 3 and 5 are drawn hollow, i.e. they are skipped.]

We can define identity thinning:

idₒ : n ⊑ₒ n
idₒ {zero}   = nilₒ
idₒ {suc n}  = keepₒ idₒ

Note how it pattern matches on the size (of the context). That is what makes it impossible to define a Category instance in Haskell.
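To make the Haskell remark concrete, here is a minimal sketch of the orthodox definition as a GADT. The names Nat, SNat, Orth, NilO, SkipO, KeepO, idO and sizeO are made up for this illustration and are not part of the post's code. Because the identity is built by recursion on the context size, it needs a run-time witness of that size (a singleton argument here), whereas Category's id :: cat a a takes no argument at all.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level natural numbers (hypothetical names for this sketch).
data Nat = Z | S Nat

-- Singleton for Nat: a run-time witness of a type-level number.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- Orthodox thinnings, mirroring nilₒ / skipₒ / keepₒ.
data Orth (n :: Nat) (m :: Nat) where
  NilO  :: Orth 'Z 'Z
  SkipO :: Orth n m -> Orth n ('S m)
  KeepO :: Orth n m -> Orth ('S n) ('S m)

-- The identity is defined by recursion on the context size,
-- so the size has to be passed in explicitly.
idO :: SNat n -> Orth n n
idO SZ     = NilO
idO (SS n) = KeepO (idO n)

-- Number of constructors used to represent a thinning.
sizeO :: Orth n m -> Int
sizeO NilO      = 1
sizeO (SkipO d) = 1 + sizeO d
sizeO (KeepO d) = 1 + sizeO d
```

With this representation, sizeO (idO k) grows linearly with the size of the context witnessed by k.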

We can also define composition, and weakening on top of the context:

_⦂ₒ_ : n ⊑ₒ m → m ⊑ₒ p → n ⊑ₒ p
δ₁        ⦂ₒ nilₒ      = δ₁
δ₁        ⦂ₒ skipₒ δ₂  = skipₒ (δ₁ ⦂ₒ δ₂)
keepₒ δ₁  ⦂ₒ keepₒ δ₂  = keepₒ (δ₁ ⦂ₒ δ₂)
skipₒ δ₁  ⦂ₒ keepₒ δ₂  = skipₒ (δ₁ ⦂ₒ δ₂)

wkₒ : n ⊑ₒ suc n
wkₒ = skipₒ idₒ

As said, the proofs about this formulation are simple. Plenty of equalities hold definitionally:

keep-id≡idₒ : keepₒ idₒ ≡ idₒ {suc n}
keep-id≡idₒ = refl

Separate thinning

As mentioned in the previous section, the orthodox thinning is not very efficient. For example, when implementing normalization by evaluation (NbE) we run into problems: we need the identity thinning when evaluating every application, so we pay a price proportional to the size of the current context!

In his work Andras Kovacs makes a variant swapping nilₒ for idₒ. However, thinnings then no longer have a unique representation, and proofs become more inconvenient to write.

We can make a special case for the identity thinning without sacrificing unique representation, at the cost of a slightly more complicated definition. We just need to consider the identity thinning and the non-identity ones separately.

data _⊏ₛ_ : ℕ → ℕ → Type where
  wkₛ    :           n      ⊏ₛ suc n
  keepₛ  : n ⊏ₛ m →  suc n  ⊏ₛ suc m
  skipₛ  : n ⊏ₛ m →  n      ⊏ₛ suc m

data _⊑ₙ_ : ℕ → ℕ → Type where
  idₙ :              n ⊑ₙ n
  strict : n ⊏ₛ m →  n ⊑ₙ m

Strict = _⊏ₛ_
NonStr = _⊑ₙ_

We can implement most operations without much trouble. Note that wkₙ also has a small, context-size-independent representation.

nilₙ : zero ⊑ₙ zero
nilₙ = idₙ

wkₙ : ∀ {n} → n ⊑ₙ suc n
wkₙ = strict wkₛ

skipₙ : n ⊑ₙ m → n ⊑ₙ suc m
skipₙ idₙ         = wkₙ
skipₙ (strict x)  = strict (skipₛ x)

keepₙ : n ⊑ₙ m → suc n ⊑ₙ suc m
keepₙ idₙ         = idₙ
keepₙ (strict δ)  = strict (keepₛ δ)

keep-id≡idₙ : keepₙ idₙ ≡ idₙ {suc n}
keep-id≡idₙ = refl

Composition is a bit more complicated than for the orthodox variant, but not considerably:

_⦂ₛ_ : n ⊏ₛ m → m ⊏ₛ p → n ⊏ₛ p
δ₁        ⦂ₛ wkₛ       = skipₛ δ₁
δ₁        ⦂ₛ skipₛ δ₂  = skipₛ (δ₁ ⦂ₛ δ₂)
wkₛ       ⦂ₛ keepₛ δ₂  = skipₛ δ₂
keepₛ δ₁  ⦂ₛ keepₛ δ₂  = keepₛ (δ₁ ⦂ₛ δ₂)
skipₛ δ₁  ⦂ₛ keepₛ δ₂  = skipₛ (δ₁ ⦂ₛ δ₂)

_⦂ₙ_ : n ⊑ₙ m → m ⊑ₙ p → n ⊑ₙ p
δ₁         ⦂ₙ idₙ        = δ₁
idₙ        ⦂ₙ strict δ₂  = strict δ₂
strict δ₁  ⦂ₙ strict δ₂  = strict (δ₁ ⦂ₛ δ₂)

Are the orthodox thinnings and these thinnings the same?

Are ⊑ₒ and ⊑ₙ the same? We can construct an isomorphism between them to answer that question positively.

Orth→NonStr : n ⊑ₒ m → n ⊑ₙ m
Orth→NonStr nilₒ        = nilₙ
Orth→NonStr (keepₒ δ)   = keepₙ (Orth→NonStr δ)
Orth→NonStr (skipₒ δ)   = skipₙ (Orth→NonStr δ)

Strict→Orth : n ⊏ₛ m → n ⊑ₒ m
Strict→Orth wkₛ         = wkₒ
Strict→Orth (keepₛ δ)   = keepₒ (Strict→Orth δ)
Strict→Orth (skipₛ δ)   = skipₒ (Strict→Orth δ)

NonStr→Orth : n ⊑ₙ m → n ⊑ₒ m
NonStr→Orth idₙ         = idₒ
NonStr→Orth (strict δ)  = Strict→Orth δ

It is not enough to define the conversion functions; we also need to show that they cancel out. Luckily this is not difficult: we need a few auxiliary homomorphism lemmas.

NonStr→Orth-keepₒ : (δ : n ⊑ₙ m) → NonStr→Orth (keepₙ δ) ≡ keepₒ (NonStr→Orth δ)
NonStr→Orth-skipₒ : (δ : n ⊑ₙ m) → NonStr→Orth (skipₙ δ) ≡ skipₒ (NonStr→Orth δ)
Orth→NonStr-id≡id : ∀ n → Orth→NonStr idₒ ≡ idₙ {n}

NonStr→Orth-keepₒ idₙ         = refl
NonStr→Orth-keepₒ (strict _)  = refl

NonStr→Orth-skipₒ idₙ         = refl
NonStr→Orth-skipₒ (strict _)  = refl

Orth→NonStr-id≡id zero    = refl
Orth→NonStr-id≡id (suc n) = cong keepₙ (Orth→NonStr-id≡id n)

And finally we can show that Orth→NonStr and NonStr→Orth are each other's inverses.

Orth→NonStr→Orth    : (δ : n ⊑ₒ m) → NonStr→Orth (Orth→NonStr δ) ≡ δ
Strict→Orth→NonStr  : (δ : n ⊏ₛ m) → Orth→NonStr (Strict→Orth δ) ≡ strict δ
NonStr→Orth→NonStr  : (δ : n ⊑ₙ m) → Orth→NonStr (NonStr→Orth δ) ≡ δ

Orth→NonStr→Orth nilₒ       = refl
Orth→NonStr→Orth (keepₒ δ)  = NonStr→Orth-keepₒ (Orth→NonStr δ) ∙ cong keepₒ (Orth→NonStr→Orth δ)
Orth→NonStr→Orth (skipₒ δ)  = NonStr→Orth-skipₒ (Orth→NonStr δ) ∙ cong skipₒ (Orth→NonStr→Orth δ)

Strict→Orth→NonStr wkₛ        = cong skipₙ (Orth→NonStr-id≡id _)
Strict→Orth→NonStr (keepₛ δ)  = cong keepₙ (Strict→Orth→NonStr δ)
Strict→Orth→NonStr (skipₛ δ)  = cong skipₙ (Strict→Orth→NonStr δ)

NonStr→Orth→NonStr idₙ         = Orth→NonStr-id≡id _
NonStr→Orth→NonStr (strict δ)  = Strict→Orth→NonStr δ

In Cubical Agda we can promote the above isomorphism to an equality.

Orth≡NonStr-pointwise : (n ⊑ₒ m) ≡ (n ⊑ₙ m)
Orth≡NonStr-pointwise = isoToPath
  (iso Orth→NonStr NonStr→Orth NonStr→Orth→NonStr Orth→NonStr→Orth)

Orth≡NonStr : Orth ≡ NonStr
Orth≡NonStr i n m = Orth≡NonStr-pointwise {n} {m} i

But are they still the same?

Even if the types are the same, are the operations we defined on them the same? We still need to show that the operations give the same results.

I'll define a simplified "category operations" type, with an identity and a composition:

CatOps : (ℕ → ℕ → Type) → Type
CatOps _↝_
  = (∀ {n} → n ↝ n)                         -- identity
  × (∀ {n m p} → n ↝ m → m ↝ p → n ↝ p )    -- composition

Orthodox category ops are:

CatOps-Orth : CatOps Orth
CatOps-Orth = idₒ , _⦂ₒ_

And NonStr ops are:

CatOps-NonStr : CatOps NonStr
CatOps-NonStr = idₙ , _⦂ₙ_

And we can transport the orthodox ops along Orth≡NonStr to get another variant:

CatOps-NonStrₜ : CatOps NonStr
CatOps-NonStrₜ = subst CatOps Orth≡NonStr CatOps-Orth

The goal is to show that all these are equal.

First we construct a path between the two CatOps NonStr structures.

For the identity part we need the identity homomorphism:

Orth→NonStr-id : Orth→NonStr idₒ ≡ idₙ {n}
Orth→NonStr-id {zero}  = refl
Orth→NonStr-id {suc n} = cong keepₙ (Orth→NonStr-id {n})

Then we can extract the transported identity, and show it is the same as idₙ:

idₙₜ : n ⊑ₙ n
idₙₜ = fst CatOps-NonStrₜ

idₙₜ≡idₙ : idₙₜ ≡ idₙ {n}
idₙₜ≡idₙ = transportRefl (Orth→NonStr idₒ) ∙ Orth→NonStr-id

The composition is slightly more complicated.

skip-⦂ₙ : (δ₁ : n ⊑ₙ m) → (δ₂ : m ⊑ₙ p)
        → skipₙ (δ₁ ⦂ₙ δ₂) ≡ (δ₁ ⦂ₙ skipₙ δ₂)
skip-⦂ₙ idₙ         idₙ         = refl
skip-⦂ₙ (strict _)  idₙ         = refl
skip-⦂ₙ idₙ         (strict _)  = refl
skip-⦂ₙ (strict _)  (strict _)  = refl

skip-keep-⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p)
             → skipₙ (δ₁ ⦂ₙ δ₂) ≡ (skipₙ δ₁ ⦂ₙ keepₙ δ₂)
skip-keep-⦂ₙ δ₁          idₙ         = refl
skip-keep-⦂ₙ idₙ         (strict _)  = refl
skip-keep-⦂ₙ (strict _)  (strict _)  = refl

keep-keep-⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p)
             → keepₙ (δ₁ ⦂ₙ δ₂) ≡ (keepₙ δ₁ ⦂ₙ keepₙ δ₂)
keep-keep-⦂ₙ δ₁          idₙ         = refl
keep-keep-⦂ₙ idₙ         (strict x)  = refl
keep-keep-⦂ₙ (strict _)  (strict _)  = refl

We can show that Orth→NonStr preserves composition.

Orth→NonStr-⦂ : (δ₁ : n ⊑ₒ m) (δ₂ : m ⊑ₒ p)
              → Orth→NonStr (δ₁ ⦂ₒ δ₂) ≡ Orth→NonStr δ₁ ⦂ₙ Orth→NonStr δ₂
Orth→NonStr-⦂ δ₁          nilₒ        = refl
Orth→NonStr-⦂ δ₁          (skipₒ δ₂)  = cong skipₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ skip-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)
Orth→NonStr-⦂ (skipₒ δ₁)  (keepₒ δ₂)  = cong skipₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ skip-keep-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)
Orth→NonStr-⦂ (keepₒ δ₁)  (keepₒ δ₂)  = cong keepₙ (Orth→NonStr-⦂ δ₁ δ₂) ∙ keep-keep-⦂ₙ (Orth→NonStr δ₁) (Orth→NonStr δ₂)

Using the above fact, we can show that _⦂ₙₜ_ and _⦂ₙ_ are pointwise equal. The proof looks complicated, but is pretty straightforward in the end.

_⦂ₙₜ_ : n ⊑ₙ m → m ⊑ₙ p → n ⊑ₙ p
_⦂ₙₜ_ = snd CatOps-NonStrₜ

⦂ₙₜ≡⦂ₙ : (δ₁ : n ⊑ₙ m) (δ₂ : m ⊑ₙ p) → δ₁ ⦂ₙₜ δ₂ ≡ δ₁ ⦂ₙ δ₂
⦂ₙₜ≡⦂ₙ {n} {m} {p} δ₁ δ₂ =
  transport refl expr₁  ≡⟨ transportRefl expr₁ ⟩
  expr₁                 ≡⟨ expr₁≡expr₂ ⟩
  expr₂                 ≡⟨ Orth→NonStr-⦂ (NonStr→Orth δ₁) (NonStr→Orth δ₂) ⟩
  expr₃                 ≡⟨ (λ i → NonStr→Orth→NonStr δ₁ i ⦂ₙ
                                  NonStr→Orth→NonStr δ₂ i) ⟩
  δ₁ ⦂ₙ δ₂ ∎
  where
    expr₁ = Orth→NonStr (NonStr→Orth (transport refl δ₁) ⦂ₒ
                         NonStr→Orth (transport refl δ₂))
    expr₂ = Orth→NonStr (NonStr→Orth δ₁ ⦂ₒ NonStr→Orth δ₂)
    expr₃ = Orth→NonStr (NonStr→Orth δ₁) ⦂ₙ Orth→NonStr (NonStr→Orth δ₂)

    expr₁≡expr₂ : expr₁ ≡ expr₂
    expr₁≡expr₂ i = Orth→NonStr (NonStr→Orth (transportRefl δ₁ i) ⦂ₒ
                                 NonStr→Orth (transportRefl δ₂ i))

And finally we can state the first equality:

CatOps-NonStr≡ : CatOps-NonStrₜ ≡ CatOps-NonStr
CatOps-NonStr≡ i = idₙₜ≡idₙ i , λ δ₁ δ₂ → ⦂ₙₜ≡⦂ₙ δ₁ δ₂ i

and the equality we actually wanted to state: that CatOps-Orth and CatOps-NonStr are equal (if we equate their types by Orth≡NonStr)!!!

CatOps-Orth≡NonStr : (λ i → CatOps (Orth≡NonStr i))
  [ CatOps-Orth ≡ CatOps-NonStr ]
CatOps-Orth≡NonStr = toPathP CatOps-NonStr≡

Higher-inductive type

Cubical Agda also supports higher inductive types (HITs), i.e. types with additional equalities. We can formalize Andras's better-performing thinning as a HIT by throwing in an additional equality. Agda will then ensure that we always respect it.

data _⊑ₕ_ : ℕ → ℕ → Type where
  idₕ    :           n      ⊑ₕ n
  keepₕ  : n ⊑ₕ m →  suc n  ⊑ₕ suc m
  skipₕ  : n ⊑ₕ m →  n      ⊑ₕ suc m

  -- it is what it says: keep idₕ ≡ idₕ
  keep-id≡idₕ : ∀ n → keepₕ (idₕ {n = n}) ≡ idₕ {n = suc n}

HIT = _⊑ₕ_

Composition for HIT-thinning looks very similar to the orthodox version...

_⦂ₕ_ : n ⊑ₕ m → m ⊑ₕ p → n ⊑ₕ p
δ₁        ⦂ₕ idₕ       = δ₁
δ₁        ⦂ₕ skipₕ δ₂  = skipₕ (δ₁ ⦂ₕ δ₂)
idₕ       ⦂ₕ keepₕ δ₂  = keepₕ δ₂
keepₕ δ₁  ⦂ₕ keepₕ δ₂  = keepₕ (δ₁ ⦂ₕ δ₂)
skipₕ δ₁  ⦂ₕ keepₕ δ₂  = skipₕ (δ₁ ⦂ₕ δ₂)

... except that we have extra cases which deal with an extra equality we threw in.

We have to show that equations are consistent with keep-id≡idₕ equality. The goals may be obfuscated, but relatively easy to fill.

keep-id≡idₕ n i ⦂ₕ keepₕ δ₂ = goal i
  where
  lemma : ∀ {n m} → (δ : HIT n m) → idₕ ⦂ₕ δ ≡ δ
  lemma idₕ = refl
  lemma (keepₕ δ) = refl
  lemma (skipₕ δ) = cong skipₕ (lemma δ)
  lemma (keep-id≡idₕ n i) j = keep-id≡idₕ n i

  goal : keepₕ (idₕ ⦂ₕ δ₂) ≡ keepₕ δ₂
  goal i = keepₕ (lemma δ₂ i)

idₕ               ⦂ₕ keep-id≡idₕ n i = keep-id≡idₕ n i
keepₕ δ₁          ⦂ₕ keep-id≡idₕ n i = keepₕ δ₁
skipₕ δ₁          ⦂ₕ keep-id≡idₕ n i = skipₕ δ₁
keep-id≡idₕ .n i  ⦂ₕ keep-id≡idₕ n j = goal i j
  where
   goal : Square refl (keep-id≡idₕ n) refl (keep-id≡idₕ n)
   goal i j = keep-id≡idₕ n (i ∧ j)

We can try to prove that the HIT variant is the same as orthodox one. The conversion functions are extremely simple, because the data-type is almost the same:

Orth→HIT : n ⊑ₒ m → n ⊑ₕ m
Orth→HIT nilₒ      = idₕ
Orth→HIT (keepₒ δ) = keepₕ (Orth→HIT δ)
Orth→HIT (skipₒ δ) = skipₕ (Orth→HIT δ)

HIT→Orth : n ⊑ₕ m → n ⊑ₒ m
HIT→Orth idₕ                = idₒ
HIT→Orth (keepₕ δ)          = keepₒ (HIT→Orth δ)
HIT→Orth (skipₕ δ)          = skipₒ (HIT→Orth δ)
HIT→Orth (keep-id≡idₕ n i)  = keep-id≡idₒ {n} i

Converting orthodox representation to HIT and back doesn't change the thinning. The proof is straightforward structural induction.

Orth→HIT→Orth : (δ : Orth n m) → HIT→Orth (Orth→HIT δ) ≡ δ
Orth→HIT→Orth nilₒ       = refl
Orth→HIT→Orth (keepₒ δ)  = cong keepₒ (Orth→HIT→Orth δ)
Orth→HIT→Orth (skipₒ δ)  = cong skipₒ (Orth→HIT→Orth δ)

On the other hand the opposite direction is tricky.

The easy part is to show that Orth→HIT preserves the identity; that will show that idₕ roundtrips.

Orth→HIT-id : ∀ n → Orth→HIT idₒ ≡ idₕ {n}
Orth→HIT-id zero     = refl
Orth→HIT-id (suc n)  = cong keepₕ (Orth→HIT-id n) ∙ keep-id≡idₕ n

We also have to show that keep-id≡idₕ roundtrips. This is considerably more challenging. Luckily if you squint enough (and are familiar with cubical library), you notice the pattern:

lemma : ∀ n → Square
  (cong keepₕ (Orth→HIT-id n))
  (cong keepₕ (Orth→HIT-id n) ∙ keep-id≡idₕ n)
  (refl {x = keepₕ (Orth→HIT idₒ)})
  (keep-id≡idₕ n)
lemma n = compPath-filler
  {x = keepₕ (Orth→HIT idₒ)}
  (cong keepₕ (Orth→HIT-id n))
  (keep-id≡idₕ n)

(In general, proving equalities about equalities in Cubical Agda, i.e. filling squares and cubes, feels like black magic.)

Using these lemmas we can finish the equality proof:

HIT→Orth→HIT : (δ : HIT n m) → Orth→HIT (HIT→Orth δ) ≡ δ
HIT→Orth→HIT idₕ                  = Orth→HIT-id _
HIT→Orth→HIT (keepₕ δ)            = cong keepₕ (HIT→Orth→HIT δ)
HIT→Orth→HIT (skipₕ δ)            = cong skipₕ (HIT→Orth→HIT δ)
HIT→Orth→HIT (keep-id≡idₕ n i) j  = lemma n i j

Orth≡HIT-pointwise : n ⊑ₒ m ≡ n ⊑ₕ m
Orth≡HIT-pointwise =
  isoToPath (iso Orth→HIT HIT→Orth HIT→Orth→HIT Orth→HIT→Orth)

Orth≡HIT : Orth ≡ HIT
Orth≡HIT i n m = Orth≡HIT-pointwise {n} {m} i

And we can show that the identity and composition of this thinning behave like the orthodox ones. We have already proven the identity homomorphism; composition is trivial, as the structure of the HIT resembles the structure of the orthodox thinning:

Orth→HIT-⦂ : ∀ {n m p} (δ₁ : Orth n m) (δ₂ : Orth m p)
  → Orth→HIT (δ₁ ⦂ₒ δ₂) ≡ Orth→HIT δ₁ ⦂ₕ Orth→HIT δ₂
Orth→HIT-⦂ δ₁           nilₒ       = refl
Orth→HIT-⦂ δ₁          (skipₒ δ₂)  = cong skipₕ (Orth→HIT-⦂ δ₁ δ₂)
Orth→HIT-⦂ (keepₒ δ₁)  (keepₒ δ₂)  = cong keepₕ (Orth→HIT-⦂ δ₁ δ₂)
Orth→HIT-⦂ (skipₒ δ₁)  (keepₒ δ₂)  = cong skipₕ (Orth→HIT-⦂ δ₁ δ₂)

Then we can repeat what we did with previous thinning.

CatOps-HIT : CatOps HIT
CatOps-HIT = idₕ , _⦂ₕ_

CatOps-HITₜ : CatOps HIT
CatOps-HITₜ = subst CatOps Orth≡HIT CatOps-Orth

Identities are equal:

idₕₜ : n ⊑ₕ n
idₕₜ = fst CatOps-HITₜ

idₕₜ≡idₕ : idₕₜ ≡ idₕ {n}
idₕₜ≡idₕ = transportRefl (Orth→HIT idₒ) ∙ Orth→HIT-id _

and composition (literally the same code as in the previous section; it could be automated, but that's not worth it for a blog post):

_⦂ₕₜ_ : n ⊑ₕ m → m ⊑ₕ p → n ⊑ₕ p
_⦂ₕₜ_ = snd CatOps-HITₜ

⦂ₕₜ≡⦂ₕ : (δ₁ : n ⊑ₕ m) (δ₂ : m ⊑ₕ p) → δ₁ ⦂ₕₜ δ₂ ≡ δ₁ ⦂ₕ δ₂
⦂ₕₜ≡⦂ₕ {n} {m} {p} δ₁ δ₂ =
  transport refl expr₁  ≡⟨ transportRefl expr₁ ⟩
  expr₁                 ≡⟨ expr₁≡expr₂ ⟩
  expr₂                 ≡⟨ Orth→HIT-⦂ (HIT→Orth δ₁) (HIT→Orth δ₂) ⟩
  expr₃                 ≡⟨ (λ i → HIT→Orth→HIT δ₁ i ⦂ₕ HIT→Orth→HIT δ₂ i) ⟩
  δ₁ ⦂ₕ δ₂ ∎
  where
    expr₁ = Orth→HIT (HIT→Orth (transport refl δ₁) ⦂ₒ
                      HIT→Orth (transport refl δ₂))
    expr₂ = Orth→HIT (HIT→Orth δ₁ ⦂ₒ HIT→Orth δ₂)
    expr₃ = Orth→HIT (HIT→Orth δ₁) ⦂ₕ Orth→HIT (HIT→Orth δ₂)

    expr₁≡expr₂ : expr₁ ≡ expr₂
    expr₁≡expr₂ i = Orth→HIT (HIT→Orth (transportRefl δ₁ i) ⦂ₒ
                              HIT→Orth (transportRefl δ₂ i))

And the equalities of CatOps:

CatOps-HIT≡ : CatOps-HITₜ ≡ CatOps-HIT
CatOps-HIT≡ i = idₕₜ≡idₕ i , λ δ₁ δ₂ → ⦂ₕₜ≡⦂ₕ δ₁ δ₂ i

CatOps-Orth≡HIT : (λ i → CatOps (Orth≡HIT i)) [ CatOps-Orth ≡ CatOps-HIT ]
CatOps-Orth≡HIT = toPathP CatOps-HIT≡


We have seen three definitions of thinnings: the orthodox one, one with an identity constructor yet unique representation, and a variant using an additional equality. Using Cubical Agda we verified that these three definitions are equal, and that their identity and composition behave the same.

What can we learn from it?

Well. It is morally correct to define

data Thin n m where
  ThinId   ::             Thin    n     n
  ThinSkip :: Thin n m -> Thin    n  (S m)
  ThinKeep :: Thin n m -> Thin (S n) (S m)

as long as you take care not to differentiate between ThinKeep ThinId and ThinId, you are safe. GHC won't warn you if you write something inconsistent, though.

For example checking whether the thinning is an identity:

isThinId :: Thin n m -> Maybe (n :~: m)
isThinId ThinId = Just Refl
isThinId _      = Nothing

is not correct, but will be accepted by GHC. (Won't be by Cubical Agda).
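A version that respects the equation between ThinKeep ThinId and ThinId has to look through ThinKeep, much like the Agda isThinId in the extras does. A minimal sketch, assuming the one-level Thin definition from above; the hand-rolled Nat kind is an assumption of this sketch:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Type.Equality ((:~:) (..))

-- Type-level naturals (assumed here; the post does not define them).
data Nat = Z | S Nat

data Thin (n :: Nat) (m :: Nat) where
  ThinId   ::             Thin n n
  ThinSkip :: Thin n m -> Thin n ('S m)
  ThinKeep :: Thin n m -> Thin ('S n) ('S m)

-- Recurses under ThinKeep, so ThinKeep ThinId and ThinId
-- are treated alike.
isThinId :: Thin n m -> Maybe (n :~: m)
isThinId ThinId       = Just Refl
isThinId (ThinKeep d) = case isThinId d of
  Just Refl -> Just Refl   -- n ~ m gives 'S n ~ 'S m
  Nothing   -> Nothing
-- A thinning that skips something cannot have equal indices.
isThinId (ThinSkip _) = Nothing
```

GHC still does not check that the ThinKeep clause agrees with the ThinId clause; that remains the programmer's obligation.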

But if you don't trust yourself, you can go for slightly more complicated

data Thin n m where
  ThinId ::              Thin n n
  Thin'  :: Thin' n m -> Thin n m

data Thin' n m where
  ThinWk   ::              Thin'    n  (S n)
  ThinSkip :: Thin' n m -> Thin'    n  (S m)
  ThinKeep :: Thin' n m -> Thin' (S n) (S m)
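Composition for this two-level variant can be sketched by transcribing _⦂ₛ_ and _⦂ₙ_ from the Agda code. The function names compS and comp, the Nat kind and the derived Show instances are assumptions of this sketch:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, StandaloneDeriving #-}

-- Type-level naturals (assumed here; the post does not define them).
data Nat = Z | S Nat

data Thin (n :: Nat) (m :: Nat) where
  ThinId ::              Thin n n
  Thin'  :: Thin' n m -> Thin n m

data Thin' (n :: Nat) (m :: Nat) where
  ThinWk   ::              Thin' n ('S n)
  ThinSkip :: Thin' n m -> Thin' n ('S m)
  ThinKeep :: Thin' n m -> Thin' ('S n) ('S m)

deriving instance Show (Thin' n m)
deriving instance Show (Thin n m)

-- Strict thinnings, in diagrammatic order: first d1, then d2.
compS :: Thin' n m -> Thin' m p -> Thin' n p
compS d1            ThinWk        = ThinSkip d1
compS d1            (ThinSkip d2) = ThinSkip (compS d1 d2)
compS ThinWk        (ThinKeep d2) = ThinSkip d2
compS (ThinKeep d1) (ThinKeep d2) = ThinKeep (compS d1 d2)
compS (ThinSkip d1) (ThinKeep d2) = ThinSkip (compS d1 d2)

-- Possibly-identity thinnings: the identity is a unit on both sides.
comp :: Thin n m -> Thin m p -> Thin n p
comp d1         ThinId     = d1
comp ThinId     (Thin' d2) = Thin' d2
comp (Thin' d1) (Thin' d2) = Thin' (compS d1 d2)
```

Neither function needs to know the size of the context, and composing with ThinId is a constant-time operation.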

In either case you will be able to write a Category instance:

instance Category Thin where
  id = ThinId
  (.) = _look_above_in_the_Agda_Code

which is not possible with an orthodox thinning definition.
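For the one-level variant, the typed hole above can be filled by transcribing the clauses of _⦂ₕ_, leaving out the path-constructor cases, which GHC cannot express. This is a sketch; comp, the Nat kind and the derived Show instance are illustrative choices. Note that Category's (.) composes right to left, while the Agda operators compose left to right:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, StandaloneDeriving #-}

import Prelude hiding (id, (.))
import Control.Category (Category (..))

-- Type-level naturals (assumed here; the post does not define them).
data Nat = Z | S Nat

data Thin (n :: Nat) (m :: Nat) where
  ThinId   ::             Thin n n
  ThinSkip :: Thin n m -> Thin n ('S m)
  ThinKeep :: Thin n m -> Thin ('S n) ('S m)

deriving instance Show (Thin n m)

-- Diagrammatic composition (first d1, then d2), following _⦂ₕ_.
comp :: Thin n m -> Thin m p -> Thin n p
comp d1            ThinId        = d1
comp d1            (ThinSkip d2) = ThinSkip (comp d1 d2)
comp ThinId        (ThinKeep d2) = ThinKeep d2
comp (ThinKeep d1) (ThinKeep d2) = ThinKeep (comp d1 d2)
comp (ThinSkip d1) (ThinKeep d2) = ThinSkip (comp d1 d2)

instance Category Thin where
  id    = ThinId
  g . f = comp f g   -- (.) takes the second thinning first
```

The result is not normalised: comp may return ThinKeep ThinId where ThinId would do, which is harmless only as long as no other function distinguishes the two.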


open import Cubical.Data.Nat.Order

-- thinnings can be converted to less-than-or-equal-to relation:
⊑ₕ→≤ : n ⊑ₕ m → n ≤ m
⊑ₕ→≤ idₕ = 0 , refl
⊑ₕ→≤ (keepₕ δ) with ⊑ₕ→≤ δ
... | n , p = n  , +-suc n _ ∙ cong suc p
⊑ₕ→≤ (skipₕ δ) with ⊑ₕ→≤ δ
... | n , p = suc n , cong suc p
⊑ₕ→≤ (keep-id≡idₕ n i) = lemma' i where
  lemma' : ⊑ₕ→≤ (keepₕ idₕ) ≡ ⊑ₕ→≤ (idₕ {suc n})
  lemma' = Σ≡Prop (λ m → isSetℕ (m + suc n) (suc n)) (refl {x = 0})

-- Then we can check whether thinning is an identity.
-- Agda forces us to not cheat.
-- (Well, and also → Dec (n ≡ m))
isThinId : n ⊑ₕ m → Dec (n ≡ m)
isThinId idₕ = yes refl
isThinId (keepₕ δ) with isThinId δ
... | yes p = yes (cong suc p)
... | no ¬p = no λ p → ¬p (injSuc p)
isThinId {n} {m} (skipₕ δ) with ⊑ₕ→≤ δ
... |  (r , p) = no λ q → ¬m+n<m {m = n} {n = 0}
  (r , (r + suc (n + 0)    ≡⟨ +-suc r (n + 0) ⟩
        suc (r + (n + 0))  ≡⟨ cong (λ x → suc (r + x)) (+-zero n) ⟩
        suc (r + n)        ≡⟨ cong suc p ⟩
        suc _              ≡⟨ sym q ⟩
        n                  ∎))

isThinId (keep-id≡idₕ n i) = yes (λ _ → suc n)

-- Same for orthodox
⊑ₒ→≤ : n ⊑ₒ m → n ≤ m
⊑ₒ→≤ nilₒ = 0 , refl
⊑ₒ→≤ (skipₒ δ) with ⊑ₒ→≤ δ
... | n , p = suc n , cong suc p
⊑ₒ→≤ (keepₒ δ) with ⊑ₒ→≤ δ
... | n , p = n  , +-suc n _ ∙ cong suc p

-- if indices match, δ is idₒ
⊥-elim : {A : Type} → ⊥ → A
⊥-elim ()

idₒ-unique : (δ : n ⊑ₒ n) → δ ≡ idₒ
idₒ-unique nilₒ      = refl
idₒ-unique (skipₒ δ) = ⊥-elim (¬m<m (⊑ₒ→≤ δ))
idₒ-unique (keepₒ δ) = cong keepₒ (idₒ-unique δ)

-- or idₕ, for which direct proof is trickier.
idₕ-unique : (δ : n ⊑ₕ n) → δ ≡ idₕ
idₕ-unique {n} = subst {A = Σ _ CatOps}
  (λ { (_⊑_ , (id , _⦂_)) → (δ : n ⊑ n) → δ ≡ id})
  (λ i → Orth≡HIT i , CatOps-Orth≡HIT i)
  idₒ-unique

More extras

The most important operation thinnings support is their action on variables.

data Var : ℕ → Type where
  vz :         Var (suc n)
  vs : Var n → Var (suc n)

Using each of the variants let us define the action:

thinₒ : n ⊑ₒ m → Var n → Var m
thinₒ nilₒ      ()
thinₒ (skipₒ δ) x      = vs (thinₒ δ x)
thinₒ (keepₒ δ) vz     = vz
thinₒ (keepₒ δ) (vs x) = vs (thinₒ δ x)

thinₛ : n ⊏ₛ m → Var n → Var m
thinₛ wkₛ       x      = vs x
thinₛ (skipₛ δ) x      = vs (thinₛ δ x)
thinₛ (keepₛ δ) vz     = vz
thinₛ (keepₛ δ) (vs x) = vs (thinₛ δ x)

thinₙ : n ⊑ₙ m → Var n → Var m
thinₙ idₙ        x = x
thinₙ (strict δ) x = thinₛ δ x

It's worth noticing that the HIT forces us to take the keep-id≡idₕ equality into account, so we cannot do silly stuff in the keepₕ cases.

thinₕ : n ⊑ₕ m → Var n → Var m
thinₕ idₕ       x      = x
thinₕ (skipₕ δ) x      = vs (thinₕ δ x)
thinₕ (keepₕ δ) vz     = vz
thinₕ (keepₕ δ) (vs x) = vs (thinₕ δ x)

thinₕ (keep-id≡idₕ n i) vz     = vz
thinₕ (keep-id≡idₕ n i) (vs x) = vs x
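
The same action can be sketched in Haskell for the one-level Thin variant discussed earlier in the post. Nothing checks the ThinKeep ThinId case against ThinId here, so keeping them in agreement is the programmer's responsibility. Var, VZ, VS, thin, varToInt and the Nat kind are illustrative names:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level naturals (assumed here; the post does not define them).
data Nat = Z | S Nat

-- Well-scoped de Bruijn variables, mirroring vz / vs.
data Var (n :: Nat) where
  VZ ::          Var ('S n)
  VS :: Var n -> Var ('S n)

data Thin (n :: Nat) (m :: Nat) where
  ThinId   ::             Thin n n
  ThinSkip :: Thin n m -> Thin n ('S m)
  ThinKeep :: Thin n m -> Thin ('S n) ('S m)

-- Action of a thinning on variables, following thinₕ.
thin :: Thin n m -> Var n -> Var m
thin ThinId       x      = x
thin (ThinSkip d) x      = VS (thin d x)
thin (ThinKeep _) VZ     = VZ
thin (ThinKeep d) (VS x) = VS (thin d x)

-- Forget the scope index: the de Bruijn index as a plain number.
varToInt :: Var n -> Int
varToInt VZ     = 0
varToInt (VS x) = 1 + varToInt x
```

With these clauses, thin (ThinKeep ThinId) and thin ThinId both send every variable to itself, as the extra equation demands.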

Let us prove that these definitions are compatible. First we need a simple lemma, that thinₒ idₒ is an identity function.

thin-idₒ : (x : Var n) → thinₒ idₒ x ≡ x
thin-idₒ {suc n} vz     = refl
thin-idₒ {suc n} (vs x) = cong vs (thin-idₒ x)

Action : ℕ → ℕ → (ℕ → ℕ → Type) → Type
Action n m _⊑_ = n ⊑ m → Var n → Var m

thinₙₜ : n ⊑ₙ m → Var n → Var m
thinₙₜ {n} {m} = subst (Action n m) Orth≡NonStr thinₒ

Strict→Orth-thin : (δ : n ⊏ₛ m) (x : Var n) → thinₒ (Strict→Orth δ) x ≡ thinₛ δ x
Strict→Orth-thin wkₛ       x      = cong vs (thin-idₒ x)
Strict→Orth-thin (skipₛ δ) x      = cong vs (Strict→Orth-thin δ x)
Strict→Orth-thin (keepₛ δ) vz     = refl
Strict→Orth-thin (keepₛ δ) (vs x) = cong vs (Strict→Orth-thin δ x)

NonStr→Orth-thin : (δ : n ⊑ₙ m) (x : Var n) → thinₒ (NonStr→Orth δ) x ≡ thinₙ δ x
NonStr→Orth-thin idₙ        x = thin-idₒ x
NonStr→Orth-thin (strict δ) x = Strict→Orth-thin δ x

thinₙₜ≡thinₙ-pointwise : (δ : n ⊑ₙ m) (x : Var n) → thinₙₜ δ x ≡ thinₙ δ x
thinₙₜ≡thinₙ-pointwise {n} {m} δ x
  = transportRefl (thinₒ (NonStr→Orth (transp (λ i → n ⊑ₙ m) i0 δ)) (transp (λ j → Var n) i0 x))
  ∙ cong₂ thinₒ (cong NonStr→Orth (transportRefl δ)) (transportRefl x)
  ∙ NonStr→Orth-thin δ x

thinₙₜ≡thinₙ : (thinₙₜ {n} {m}) ≡ thinₙ
thinₙₜ≡thinₙ i δ x = thinₙₜ≡thinₙ-pointwise δ x i

thinₒ≡thinₙ : (λ i → Action n m (Orth≡NonStr i)) [ thinₒ ≡ thinₙ ]
thinₒ≡thinₙ = toPathP thinₙₜ≡thinₙ

The HIT version is not much trickier, if at all.

thinₕₜ : n ⊑ₕ m → Var n → Var m
thinₕₜ {n} {m} = subst (Action n m) Orth≡HIT thinₒ

HIT→Orth-thin : (δ : n ⊑ₕ m) (x : Var n) → thinₒ (HIT→Orth δ) x ≡ thinₕ δ x
HIT→Orth-thin idₕ       x      = thin-idₒ x
HIT→Orth-thin (skipₕ δ) x      = cong vs (HIT→Orth-thin δ x)
HIT→Orth-thin (keepₕ δ) vz     = refl
HIT→Orth-thin (keepₕ δ) (vs x) = cong vs (HIT→Orth-thin δ x)

HIT→Orth-thin (keep-id≡idₕ n i) vz     = refl
HIT→Orth-thin (keep-id≡idₕ n i) (vs x) = cong vs (thin-idₒ x)

thinₕₜ≡thinₕ-pointwise : (δ : n ⊑ₕ m) (x : Var n) → thinₕₜ δ x ≡ thinₕ δ x
thinₕₜ≡thinₕ-pointwise {n} {m} δ x
  = transportRefl (thinₒ (HIT→Orth (transp (λ i → n ⊑ₕ m) i0 δ)) (transp (λ j → Var n) i0 x))
  ∙ cong₂ thinₒ (cong HIT→Orth (transportRefl δ)) (transportRefl x)
  ∙ HIT→Orth-thin δ x

thinₕₜ≡thinₕ : (thinₕₜ {n} {m}) ≡ thinₕ
thinₕₜ≡thinₕ i δ x = thinₕₜ≡thinₕ-pointwise δ x i

thinₒ≡thinₕ : (λ i → Action n m (Orth≡HIT i)) [ thinₒ ≡ thinₕ ]
thinₒ≡thinₕ = toPathP thinₕₜ≡thinₕ

At the end we have three variants of thinnings with identity and composition, and which act on variables the same way.

Now, if we prove properties of these operations, e.g. identity laws, composition associativity, or that composition and action commute, it would be enough to prove these for the orthodox implementation, then we can simply transport the proofs.

In other words, whatever we prove about one structure will hold for the other two, like idₕ-unique in the previous section.

Some proofs are simple:

thin-idₕ : (x : Var n) → thinₕ idₕ x ≡ x
thin-idₕ x = refl

but we can get them through the equality anyway:

thin-idₕ' : (x : Var n) → thinₕ idₕ x ≡ x
thin-idₕ' {n} x = subst
  {A = Σ _ (λ _⊑_ → Action n n _⊑_ × (n ⊑ n))}                -- structure
  (λ { (_⊑_ , thin , id) → thin id x ≡ x })                   -- motive
  (λ i → Orth≡HIT i , thinₒ≡thinₕ i , CatOps-Orth≡HIT i .fst) -- proof that structures are equal
  (thin-idₒ x)                                                -- proof to transport

September 30, 2022 12:00 AM

September 29, 2022

Monday Morning Haskell

Using Haskell in Vim: The Basics

Last week I went over some of the basic principles of a good IDE setup. Now in this article and the next, we're going to do this for Haskell in a couple different environments.

A vital component of almost any Haskell setup (at least the two we'll look at) is getting Haskell Language Server running and being able to switch your global GHC version. We covered all that in the last article with GHCup.

In this article we'll look at creating a Haskell environment in Vim. We'll cover how Vim allows us to perform all the basic actions we want, and then we'll add some of the extra Haskell features we can get from using HLS in conjunction with a plugin.

One thing I want to say up front, because I know how frustrating it can be to try repeating something from an article and have it not work: this is not an exhaustive tutorial for installing Haskell in Vim. I plan to do a video on that later. There might be extra installation details I'm forgetting in this article, and I've only tried this on Windows Subsystem for Linux. So hopefully in the future I'll have time to try this out on more systems and have a more detailed look at the requirements.

Base Features

But, for now, let's start checking off the various boxes from last week's list. We had an original list of 7 items for basic functionality. Here are five of them:

  1. Open a file in a tab
  2. Switch between tabs
  3. Open files side-by-side (and switch between them)
  4. Open up a terminal to run commands
  5. Switch between the terminal and file to edit

Now Vim is a textual editor, meant to be run from a command prompt or terminal. Thus you can't really use the mouse at all in Vim! This is disorienting at first, but it means that all of these actions we have to take must have keyboard commands. Once you learn all these, your coding will get much faster!

To open a new file in a tab, we would use :tabnew followed by the file name (and we can use autocomplete to get the right file). We can then flip between tabs with the commands :tabn (tab-next) and :tabp (tab-previous).

To see multiple files at the same time, we can use the :split command, followed by the file name. This gives a horizontal split. My preference is for a vertical split, which is achieved with :vs and the file name. Instead of switching between files with :tabn and :tabp, we use the command Ctrl+WW to go back and forth.

Finally, we can open a terminal using the :term command. By default, this puts the terminal at the bottom of the screen:

We can also get a side-by-side terminal with :vert term.

Switching between terminals is the same as switching between split screens: Ctrl+WW.

And of course, obviously, Vim has "Vim movement" keys so you can move around the file very quickly!

Now the two other items on the list are related to having a sidebar, another useful base feature in your IDE.

  1. Open & close a navigation sidebar
  2. Use sidebar to open files

We saw above that it's possible to open new files. But on larger projects, you can't keep the whole project tree in your head, so you'll probably need a graphical reference to help you.

Vim doesn't support such a layout natively. But with Vim (and pretty much every editor), there is a rich ecosystem of plugins and extensions to help improve the experience.

In fact, with Vim, there are multiple ways of installing plugins. The one I ended up deciding on is Vim Plug. I used it to install a Plugin called NerdTree, which gives a nice sidebar view where I can scroll around and open files.

In general, to make a modification to your Vim settings, you modify a file in your home directory called .vimrc. To use NerdTree (after installing Vim Plug), I just added the following lines to that file.

call plug#begin('~/.vim/plugged')
Plug 'preservim/nerdtree'
call plug#end()

Here's what it looks like:

All that's needed to bring this menu up is the command :NERDTree. Switching focus remains the same with Ctrl+WW and so does closing the tab with :q.

Configurable Commands

Another key factor with IDEs is being able to remap commands for your own convenience. I found some of Vim's default commands a bit cumbersome. For example, switching tabs is a common enough task that I wanted to make it really fast. I wanted to do the same with opening the terminal, while also doing so with a vertical split instead of the default horizontal split. Finally, I wanted a shorter command to open the NerdTree sidebar.

By putting the following commands in my .vimrc file, I can get these remappings:

nnoremap <Leader>q :tabp<CR>
nnoremap <Leader>r :tabn<CR>
nnoremap <Leader>t :vert term<CR>
nnoremap <Leader>n :NERDTree<CR>

In these statements, <Leader> refers to a special key that is backslash (\) by default, but also customizable. So now I can switch tabs using \q and \r, open the terminal with \t, and open the sidebar with \n.

Language Specific Features

Now the last (and possibly most important) aspect of setting up the IDE is to get the language-specific features working. Luckily, from the earlier article, we already have the Haskell Language Server running thanks to GHCup. Let's see how to apply this with Vim.

First, we need another Vim plugin to work with the language server. This plugin is called "CoC", and we can install it by including this line in our .vimrc in the plugins section:

call plug#begin('~/.vim/plugged')
Plug 'neoclide/coc.nvim', {'branch': 'release'}
call plug#end()

After installing the plugin (re-open .vimrc or :source the file), we then have to configure the plugin to use the Haskell Language Server. To do this, we have to use the :CocConfig command within Vim, and then add the following lines to the file:

{
  "languageserver": {
    "haskell": {
      "command": "haskell-language-server-wrapper",
      "args": ["--lsp"],
      "rootPatterns": ["*.cabal", "stack.yaml", "cabal.project", "package.yaml", "hie.yaml"],
      "filetypes": ["haskell", "lhaskell"]
    }
  }
}

Next, we have to use GHCup to make sure the "global" version of GHC matches our project's version. So, as an example, we can examine the stack.yaml file and find the resolver:


The 19.13 resolver corresponds to GHC 9.0.2, so let's go ahead and set that using GHCup:

>> ghcup set ghc 9.0.2

And now we just open our project file and we can start seeing Haskell tips! Here's an example showing a compilation error:

The ability to get autocomplete suggestions from library functions works as well:

And we can also get a lint suggestion (I wish it weren't so "yellow"):

Note that in order for this to work, you must open your file from the project root where the .cabal file is. Otherwise HLS will not work correctly!

# This works!
>> cd MyProject
>> vim src/MyCode.hs

# This does not work!
>> cd MyProject/src
>> vim MyCode.hs
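
If you tend to start Vim from a subdirectory anyway, a small shell wrapper can do the cd for you. This is a sketch, not part of the setup above: find_project_root, has_cabal_file, and hvim are hypothetical names, and it assumes the project root is marked by a .cabal, stack.yaml, or cabal.project file.

```shell
# Succeed if the given directory contains at least one .cabal file.
has_cabal_file() {
  local f
  for f in "$1"/*.cabal; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Walk upwards from a directory (default: the current one) until a project
# marker is found. Prints the project root, or returns 1 if there is none.
find_project_root() {
  local dir
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -e "$dir/stack.yaml" ] || [ -e "$dir/cabal.project" ] || has_cabal_file "$dir"; then
      printf '%s\n' "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && return 1
    dir=$(dirname "$dir")
  done
}

# Open a file in Vim with the project root as the working directory.
# Falls back to the file's own directory if no project root is found.
hvim() {
  local dir file root
  dir=$(cd "$(dirname "$1")" && pwd) || return 1
  file=$dir/$(basename "$1")
  root=$(find_project_root "$dir") || root=$dir
  (cd "$root" && vim "$file")
}
```

With these functions in your shell profile, running hvim MyCode.hs from inside MyProject/src should start Vim from MyProject.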


That's all for our Haskell Vim setup! Even though this isn't a full tutorial, hopefully this gives you enough ideas that you can experiment with Haskell in Vim for yourself! Next time, we'll look at getting Haskell working in Visual Studio!

If you want to keep up to date with all the latest on Monday Morning Haskell, make sure to subscribe to our mailing list! This will also give you access to our subscriber resources, including beginner friendly resources like our Beginners Checklist!

by James Bowen at September 29, 2022 02:30 PM

Tweag I/O

Four months into The Nix Book

At Tweag, any employee can pitch proposals for internal projects. This is how we got content-addressed derivations in Nix, the formal verification tool Pirouette, or a yearly book budget to support continuous learning.

In May 2022 our Chief Architect Arnaud Spiwack (@aspiwack) accepted my pitch for “The Nix Book”, agreeing to fund work on improving Nix documentation and onboarding experience for three full months. This is a comprehensive report on what happened since and what I learned from it.


The goal of the project was to improve Nix onboarding and documentation experience to increase community growth by writing “The Nix Book”.

Task failed successfully

In short:

  • Failing fast failed. The project took a different course than intended, and this is probably good.
  • We are years away from “The Nix Book”.
  • Writing is hard – many significant improvements are underway, but it needs time.
  • Science: it works. Usability studies, surveys, and expert insights are leading the way. Also: Cognitive biases are lurking everywhere.
  • The challenge is social, not technical. A documentation team was formed to tackle this.
  • We should focus on enabling occasional contributors and help them grow into maintainers.

Since it is a very long text, each section is designed to be read on its own:

The report in full detail will mostly be interesting for active contributors or those inclined to contribute to the Nix ecosystem. It may also be interesting for people who are working or want to work on a software project that is in a similar situation to Nix.[1]


My work on Nix documentation started in March 2022 by participating in the regular Nix UX meeting. There I encountered Nix contributor and developer at Obsidian Systems John Ericson (@Ericson2314), who set out to document Nix’s architecture. While reviewing his pull request together, we quickly agreed that I should take the editorial lead. I learned a lot about Nix internals during our review sessions we had over multiple weeks, trying to sort out the facts and to present them in a consistent, readable manner.

The insights we found during these discussions helped refine the idea that got me excited about Nix in the first place: making software build and run is no different from writing the software to begin with – it’s all just programming. We can do it effectively or clumsily, depending on the (mental) tools we employ.

I was eager to write down what apparently was there all along, but somewhat hidden between the lines in Eelco Dolstra’s PhD thesis and Build Systems à la Carte. This led to my blog post Taming Unix with functional programming, which illustrates how we can think about building and deploying software in terms of programming language theory.

Most importantly, the principles underlying Nix and many of its mechanisms are amazingly simple – it’s just that most often they are not explained well. This discrepancy between contents and presentation in the Nix ecosystem always struck me as painful… and unnecessary.

During this time the desire to extend the scope of improving Nix documentation and learning material culminated in an internal pitch to compile and write what I would boldly but tentatively call The Nix Book.

The pitch had a fairly broad mission statement[2]:


  • Improve the autodidactic Nix onboarding experience to increase community growth


  • Write a book actually explaining Nix and its surrounding ecosystem at a high level of abstraction
  • Overhaul the Nix manual to make it a focused technical reference
  • Improve discoverability of existing learning material
  • Lead a Summer of Nix 2022 project to help achieve this

Success of the project hinged on the extent to which existing material already served its purpose, and whether attempts to improve it would be fruitful. This report shows what became of “The Nix Book”.

Key results

  • Taming Unix with functional programming

    The article took a while to fully develop, but the result turned out to be highly successful with over 13,000 readers (making it the most viewed Tweag blog post so far) and a day on the Hacker News front page.

  • Detailed 2022 Nix community survey results

    I needed more information than available in the original report, so I reached out to the Nix marketing team to get my hands on the raw data. From there, I compiled graphs for all quantitative questions. While this only represents a portion of what’s going on in the Nix community, it is evidence. This is a significant improvement over anecdata and intuition and can be used to more confidently reason through strategic decisions, such as prioritizing Nix over NixOS for onboarding.

  • Nix documentation team

    Multiple discussions led to a conclusion that a team should be formed to serve as a sounding board for the effort.

    • I had encounters with Nix author Eelco Dolstra (@edolstra), NixOS contributor and NixOS Wiki maintainer Jörg Thalheim (@Mic92), Cachix author Domen Kožar (@domenkozar), and my colleagues, Matthias Meschede (@MMesch) and Rok Garbas (@garbas).
    • Encouraged by the simultaneous developments around reforming the NixOS Foundation, I had additional exchanges with my colleagues Théophane Hufschmitt (@thufschmitt), Silvan Mosberger (@infinisil), Tweag’s Founder Mathieu Boespflug (@mboes), Flox’ CEO Ron Efroni (@ron), and TVL developer Vincent Ambo (@tazjin).

    Most importantly, the team provides a widely visible point of contact for potential contributors.

  • 2022 Summer of Nix documentation stream

    Led by Matthias (@MMesch), we organized a part of the Summer of Nix program dedicated to improving documentation. In addition, we drafted a line-up of presentations, one of which was also on Nix documentation.

  • How to contribute to documentation

    Based on discussions, feedback, the practice of helping contributors, and the need to accommodate Summer of Nix participants’ work, I drafted a contribution guide to documentation in the Nix ecosystem to have as a reference.

  • Usability study

    10 sessions with (absolute or relative) Nix beginners of different software development backgrounds quickly produced some observations:

    • People love control and reproducibility.
    • Developers just want to get things done and do not care how it works.
    • Engineers usually care most about one specific programming language or framework.
    • People do not read, only skim.
    • Navigation often does not meet user needs.
    • Information about the Nix ecosystem is perceived as being highly dispersed and disorganized. Confusion and disorientation quickly kick in and often result in “tab explosion”.
    • The learning curve is perceived as extremely steep.
    • The Nix language is easy for Haskell users, and obscure to almost everyone else without appropriate instructions.
  • Nix language tutorial

    Identified as the highest-value objective by Rok (@garbas), based on a comprehensive comparison of existing Nix language tutorials, and already partially validated by testing it with beginners, it should hopefully become the centerpiece of future efforts to learn and teach Nix.

  • No Nix Book this year

    As agreed upon by the Nix documentation team, work should continue. Contrary to what was originally envisioned, it should happen in a strictly incremental fashion, slowly migrating material towards more curated resources and as close to the source code as possible.

Additional results

Contributions to merged pull requests
Contributions to unmerged pull requests
Ongoing discussions

Measuring success

Some of the following insights appear almost trivial in retrospect. Yet, contributing to a major open source project is a set of skills on its own, one most people don’t learn at university or work.

As a high-level summary, one could say:

1. Do your homework first.

This presents a dilemma:

  • Becoming competent at making improvements requires time which will not be available to actually making those improvements.
  • Trying to make improvements without the necessary competence requires maintainers’ time, which was highly limited to begin with, or may backfire by making matters worse.

Therefore, what follows is an attempt to share experience, and, based on that experience, proposals to deal with the above dilemma.

Again, as a high-level summary:

2. Make it easier for others to do their homework.

Failing fast

Following best practices, the pitch contained abort criteria to avoid the sunk-cost fallacy:

  1. until 2022-05-31: Summer of Nix 2022 project proposal rejected by organization team (excluding Tweag staff)
  2. until 2022-05-31: not enough participants to cover planned tasks
  3. until 2022-07-15: preliminary questionnaires demonstrate satisfactory effectiveness of existing material for defined learning goals
    • may leave room for conducting targeted incremental improvements
  4. until 2022-07-15: setting up surveys and collecting results shows that timeline is not realistic
  5. until 2022-07-31: elaborating outline and surveying existing material shows that timeline is not realistic
    • may leave room for cutting scope

While the Summer of Nix proposal was accepted (1) and user testing showed desperate need for improvements (3), it was not clear how many Summer of Nix participants would actually want to focus on documentation (2) until the end of July when the program had started. We estimated that of 20 participants, multiple would contribute to documentation in one way or another. It later turned out that only one would work on documentation specifically, and a few others would decide to write a blog post about their ongoing work.

At this point it was already quite evident that a more incremental approach would be inevitable. Both the evidence (4) and envisioned scope for the book (5) were unambiguous in that there was an order-of-magnitude divide between the possible and the desired.

By the beginning of July, the newly founded Nix documentation team decided to focus on more immediate problems and only briefly discussed “The Nix Book” as a long-term vision. Cutting scope occurred naturally: from now on, the focus would be on just the part up to teaching the Nix language.

Around that time I updated Mathieu (@mboes) on the current state and changed strategy, where we re-iterated on the cost-benefit estimate of spending internal budget and project goals – improving Nix onboarding and increasing Tweag’s visibility.

Note that I originally estimated the project to take six months, not three or four. While the changed schedule partially invalidated the relation between estimate and objectives, the time constraints forced me to keep even stricter focus on priorities. At the same time, clearly not being able to deliver on the vision due to the problem size removed most pressure with regard to producing specific artifacts and, thus, any temptation to cut corners.

Instead of spending three months, I spent four months. High time to evaluate.

What went well

Community building

Taking the time to listen and talk to people helped a lot with understanding the problem space and honoring Chesterton’s fence. Getting key people on board helped to build commitment and momentum as well as weed out bad ideas through critical discussion. It also meant that actual changes have to go through at least partial consensus, which requires each of those changes to be fairly small.

In the original pitch, I assumed that I would have to rely on Aaron Swartz’ Wikipedia authorship principle (which suggests that most open source contributors engage only occasionally yet together provide much of the substance, while regulars typically work on cosmetics). The assumption turned out to be true and Swartz’ findings were confirmed again.

Providing a central point of contact, naming directly responsible individuals, contacting potential contributors immediately, and actively setting examples appears to have resulted in a modest but noticeable increase of attention towards documentation issues, as well as many small and multiple significant pull requests.

User studies

Immediately starting with user studies quickly helped pinpoint concrete user needs and some obvious issues, and either validated or debunked some preconceptions that (at least as far as I perceived) had been discussed mostly based on intuition and oral history. It will still take more time to sort through them again and match the notes to GitHub issues and pull requests. This is to help (1) maintainers to keep track of what can be done and (2) whoever consults the session notes to keep up with what has already been resolved. These studies should continue if possible: at the least, to validate new material (as for example with the new Nix language tutorial) and measure the reduction in onboarding time after improvements have been implemented.

Increasing visibility

Publishing regular updates such as meeting notes, participating in ongoing discussions, and linking to relevant posts, issues, and pull requests seem to have increased awareness of the trajectory of the Nix ecosystem and of what Tweag is doing.

Getting involved consistently and staying active in a constructive manner helped a lot.

All feedback from within the community so far has been positive. Beginners and regular users found the changes in organization and the specific documentation work we got done helpful. Expert users and contributors are vocally happy about the efforts. (At least those whom I did not annoy by nagging too much about phrasing and terminology.)

Note: there are also outside voices on the internet who doubt that this (or any) effort will lead to a significant improvement in terms of user experience.

Overview and visualization

Presenting high-level summaries and diagrams at the very beginning of introducing people to various topics was perceived as very helpful, both in the usability tests and in multiple informal interactions. It increases the readers’ confidence, and allows them to set realistic expectations before going into details. This is supported by scientific evidence.

I think there should be many more such overviews at the top of learning resources and reference materials.


A particularly inspiring example of making complex problems accessible through visuals is Life cycle of a Poetry project by Attila Gulyas (@toraritte).

Brute-force analysis

Nix, Nixpkgs, and NixOS have a multitude of features and obscure corner cases which are barely, badly, or not at all documented. There are many resources of varying quality which have overlapping contents. The only way to get on top of things, apart from experimentation or diving into source code, often turns out to be research and an exhaustive analysis of prior art: to avoid the Dunning-Kruger effect (“I can easily do better.”), to account for Chesterton’s fence (“This is not good and can be removed/must be changed.”), and finally to simply get things right.

Incorrect documentation is often worse than no documentation.

— attributed to Bertrand Meyer

Improving over the current state is only reliably possible if the current state, and how it came about, is known.

Despite a thorough initial overview, it took me a while to even stumble upon relevant materials after sitting down again and again to dig through countless Discourse threads and NixOS Wiki pages:

Oftentimes, such research reveals underlying problems (as opposed to mere symptoms) or what caused those problems in the first place.

This kind of due diligence takes a lot of time and concentration, and can be very challenging work. There is also an enormous overhead of preserving insights for the future. However, I am convinced that if no one else has to repeat the effort, the results are worth it in the long run. Each time I tried the brute-force approach, the quality of work turned out to be convincing, as opposed to my other, less well-prepared proposals, which (rightfully) received substantial headwind.

Scientific method

Leveraging insights from scientific evidence (unsurprisingly) proved to be highly effective, and as a side effect removed most uncertainty about procedure.

The most important resources which shape my day-to-day documentation work:

  • How Learning Works

    Practical advice on effective teaching and learning, backed by broad and deep evidence.

    The best-written and probably most important book I have ever read.

  • Diátaxis

    A framework for structuring software documentation around user needs.

  • Plain language guidelines

    A set of guidelines to write clearly in English.

This is meant quite literally – I refer to each of them every day, one way or another.

Many thanks to my colleague Andrea Bedini (@andreabedini) for recommending How Learning Works by Ambrose, et al. – it keeps changing my life for the better. I have heard multiple times that Visible Learning is the state of the art in learning science. For teaching in the context of software development, I also recommend the ideas behind Software Carpentry.

Lessons learned

Gather more context in the beginning

Looking beyond one’s own backyard by collecting testimonials from other software projects would have been helpful to more quickly see the big picture and, as a result, identify the most pressing, underlying issues. While I talked to many Nix experts and did much research on internal proceedings, I spent very little time on how other projects approached similar problems and which strategies were successful.


Talk to people who solved similar problems in different contexts.


Ron (@ron) interviewed several leaders of open source foundations when preparing the NixOS Foundation reform, which surfaced very helpful, non-obvious insights – and also tales of caution.

Avoid planning fallacy

The idea that one person could get even close to a complete book in less than half a year was, while not fully serious, quite presumptuous, and a pathological case of planning fallacy. It was not evident to me in the beginning, but the ecosystem is simply too large, the problems too numerous, and the high-level tasks too big to tackle at once.

Things are moving, but they are moving very, very slowly. Writing the architecture documentation chapter, which covers at most 60% of the topics that it would need to be considered comprehensive, took 8 weeks of wall-clock time. Writing a Nix language tutorial took 4 weeks of wall-clock time. This is extremely frustrating, but unavoidable due to lack of prior experience, some essential complexity, much accidental complexity, external factors considered in the other sections, and – of course – planning fallacy.


Find out how much time it took to produce comparable results, and take it seriously.


The version history goes back to 2016, and has been actively developed on the side since May 2020. In the time-span of about 2 years, 12 original articles were produced, i.e., one article per two months. After the fact, this matches my own experience very closely.

Identifying low-effort high-impact tasks

The Nix ecosystem is large and fragmented. There are many people involved, each with different – and sometimes diverging – interests. It is not enough to ask users what they need, because they will usually instead answer with what they want.

I spent some time dabbling at working on assorted issues before converging on a more systematic approach.


Take enough time to identify issues (user studies) and sort them by effort-impact ratio (brute-force analysis) before delving into work.


  • The “Writing Nix Expressions” chapter in the Nix manual was a pretty bad introduction to the Nix language, throwing many people off early on (including myself, back in the day). Due to the sheer amount and length of other Nix language tutorials, it was not clear before actually working through all of them that this specific one really did not contain anything uniquely valuable. Removing the section was quick and painless.
  • The Nix Pills cover advanced topics and have been reported to be confusing to beginners (including myself, back in the day) many times. The problem was that they were touted as beginner material in many places. Reordering recommendations and rewording the description appears to have helped substantially. (Although better guidance is still needed, see below.)

Make incremental changes

My colleague Clément Hurlin (@smelc) already wrote about this in Getting Things Merged. In an open source project’s community, where essentially everyone can be considered a volunteer, reviewers’ time is even more limited. There is no chance of getting a large pull request merged without having close allies among maintainers – people naturally won’t do more than take a glance, it’s too much work.

This imposes a significant additional cost on authoring pull requests, which has to be taken into account. One has to keep the big picture in mind while only presenting the next obvious step towards a vision. So far even merging a rendering of the vision appears to be too large a task.

On the other hand, small changes keep cognitive load manageable and allow for easier switching between tasks: simply because small tasks get finished quickly.


Never stop asking the question, “What is the smallest possible change required for a tangible improvement?”


  • Unfortunately the pull request documenting Nix architecture still has not been merged properly. Therefore, it is not yet visible in the Nix manual. Although it is limited to the parts I felt confident publishing, it is a large addition. It will have to be split up into multiple parts to ease review.
  • The Nix language tutorial is not finished yet. It takes 1-2 hours just to work through it – not to mention the time needed to make a review. Good progress so far was only possible due to Silvan Mosberger’s (@infinisil) persistent involvement and patient reviews.

Focus on the basics first

The usability study was particularly helpful in demonstrating the gap between what we may wish to have and what people actually need to succeed.

The problems people got stuck on were often trivial, such as not understanding a term or not finding a crucial bit of information to continue. This could be addressed with much less effort than required for creating full-blown tutorials or meticulously working out precise reference material.


  • Reorganize the web site.
  • Establish materials to help guide beginners across different problem domains in the ecosystem.
  • Make improving documentation more appealing for contributors.
  • Provide guidance to navigate each source repository.

Enable contributors

I was expecting to adjust the original goals and targets during the process, since observing actual users dealing with the material would unfold further requirements to guide my work.

However, the usability study results, as well as my own experience, showed that it is much more difficult to improve upon the overall situation than it originally appeared.

I believe coordinated incremental improvements will be more effective than having a few people attack large problems at full-steam. It’s not just that the sheer number of entangled issues is overwhelming, but also that Nix experts are subject to the curse of knowledge. Nix beginners, seeing our work with fresh eyes, have time and again proven invaluable allies by pointing out and often themselves addressing problems that tend to become invisible after getting used to them.

Systematically building momentum, setting examples, creating a culture, and enabling volunteers to contribute appears much more promising and is already bearing fruit.


Focus future efforts on enabling contributors, by providing comprehensive guidance into the process of developing the Nix ecosystem.

This is not to say to stop improving the onboarding process for beginners. Becoming a contributor should instead be considered part of that process.


More blog posts, tutorials, guides, and reference documentation are in the making.
The source code of beginner-oriented materials is opaque to outsiders.

Next steps

I agree with Tweag’s VP of Engineering Steve Purcell’s (@purcell) and Flox’ CEO Ron Efroni’s (@ron) assessment that Nix is at the inflection point towards a trajectory to mass adoption. At the same time I fear that we as a community are not ready for the corresponding influx of users and potential contributors – both in terms of documentation and organization.

Teaching in person is, of course, the most effective way of getting people into Nix, and Tweagers enjoy this rare privilege by default. But it does not scale. To handle more than 0.5%[3] of the world’s 24 million software developers, we have to leverage that most of them are self-taught.

My current estimate to get Nix documentation and learning material into a shape that allows for growing the community to scale is on the order of multiple person-years. For comparison, this is how long it took to get flakes and the new CLI “almost ready”. Or how long it took to create “The Rust Book”, which was started end of 2015 and saw intense development activity for over two years.

Taming and helping to navigate the (mostly accidental) complexities of the Nix ecosystem continues to be a huge undertaking. It requires dedicated work and coordination, which is mostly about learning and teaching, communication, and social problems. Having been in both roles, I am more than ever convinced that volunteers are not sufficient to handle this on their own, and that we need more paid regular contributors.

With the new NixOS Foundation board and its corporate backing, in principle we have all the means to systematically grow the Nix pie for everyone. I am looking forward to the Foundation’s board delivering on their promise to develop a roadmap and to enable teams by providing organizational structure, the necessary permissions, and leadership’s attention. In my opinion, part of this endeavour should be a long-term funding scheme for ongoing development and maintenance targeting key objectives, including improving onboarding and documentation.

I would love to help with setting this up. And, while I don’t care who does the job as long as it’s being done, I thoroughly enjoyed doing what I did in the past four months, and would just as passionately continue if there was a possibility.

  1. Haskell may be a good example. Nix and Haskell, both as software projects and communities, share many features (reliability, expressive power, highly motivated contributors, pluralistic governance) and problems (learning curve, documentation, diverging feature sets). Maybe not coincidentally, there is also a significant overlap of users and contributors between the two.
  2. Most of the original pitch is reproduced in the Summer of Nix 2022 project proposal. The quote is slightly reworded from the tl;dr to make it consistent with the proposal’s contents.
  3. This number is based on informal estimates that at most 100,000 people have heard of or are using Nix.

September 29, 2022 12:00 AM

September 26, 2022

Matthew Sackman

Complexity and software engineering

OK, it’s definitely not just the software industry. If you’ve seen the film The Big Short you may remember the seemingly endless secondary markets, adding complexity that led to no one understanding what risks they were exposed to. The motivation there seemed to be purely making money.

Look at the food on your plate at dinner, and try thinking about the complexity of where all the ingredients came from to make that meal. If you have meat on your plate it might have been grown in the same country as you, but maybe not for the food the animal ate. You’re probably also eating animal antibiotics (or the remains of them). Where were they made? How can you start to get a hold on the incredible complexity of the human food chain? The motivation here seems also to be to make money: if you can make the same product as your competitors, but cheaper, then you can undercut your competitors a little, have bigger margins, and make more money. Who cares if it requires enormous environmental damage, right? Products are sure as hell not priced to reflect the damage done to the environment to create, maintain, or dispose of them.

As an aside, have you ever marvelled at how incredible plants are? They literally convert dirt, sunlight, water, and a few minerals, into food. Ultimately we’re all just the result of dirt, sunlight, water, and a few minerals. Bonkers.

Software does seem a little different though. We seem to utterly fetishize complexity, mostly for bragging rights. I’ve certainly been guilty in the past (and I suspect in the future too), of building far more complicated things than necessary, because I can. In a number of cases I could concoct a benchmark which showed the new code was faster, thus justifying the increased complexity of the code, and the consequence of a more difficult code-base to maintain. I definitely get a buzz from making a complex thing work, and I suspect this is quite common. I’ve been told that at Amazon, promotion requires being able to demonstrate that you’ve built or maintained complex systems. Well I love hearing about unintended 2nd-order effects. The consequence here is pretty obvious: a whole bunch of systems get built in ludicrously complex ways just so that people can apply for promotion. I guess the motivation there is money too.

As I say, when building something complex, it can be rewarding when it works. Six months later I’ve often come to regret it, when the complexity is directly opposed to being able to fix a bug that’s surfaced. It can cause silos by creating “domain experts” (i.e. a ball and chain around your feet). I’ve had cases where I’ve had to build enormously complex bits of code in order to work around equally bonkers levels of tech-debt, which can’t be removed because of “business reasons”. The result is unnecessary complexity on top of unnecessary complexity. No single person can understand how the whole thing works (much less write down some semantics, or any invariants) because the code-base is now too large, too complex, and riddled with remote couplings, broken abstractions, poor design, and invalid assumptions. Certain areas of the code-base become feared, and more code gets added to avoid confronting the complexity. Developer velocity slows to an absolute crawl, developers get frustrated and head for the door. No one is left who understands much. With some time and space since that particular situation, it’s now easy for me to sit here and declare that sort of thing a red-flag, and that I should have run away from it sooner. Who knows what’ll happen next time?

I find it easy to convince myself that complexity I’ve built, or claim to understand, is acceptable, whilst complexity I don’t understand is abhorrent.

As an industry we seem to love to kid ourselves that we’re all solving the same problems as Google, Facebook, or Amazon etc. At no job I’ve ever worked do I believe the complexity that comes from use of Kubernetes is justified, and yet it seems to have become “best practice”, which I find baffling. On a recent project I decided to rent a single (virtual) server, and I deploy everything with a single nixos-rebuild --target-host myhost switch. Because everything on the server is running under systemd, and because of the way nixos restarts services, downtime is less than 2 seconds and is only visible to long-lived connections (WebSockets in this case): systemd can manage listening-sockets itself and pass them to the new executable, maintaining any backlog of pending connections.

To me, this “simplicity” is preferable to trying to achieve 100% service availability. I’m not going to lose any money because of 2 seconds of downtime, even if that happens several times a day. It’s much more valuable to me to be able to get code changes deployed quickly. Is this really simpler than using something like Kubernetes? Maybe: there are certainly fewer moving parts and all the nixos stuff is only involved when deploying updates. Nevertheless, it’s not exactly simple; but I believe I understand enough of it to be happy to build, use, and rely on it.

I was recently reading Nick Cameron’s blog post on 10 challenges for Rust. The 9th point made me think about the difficulty of maintaining the ability to make big changes to any large software project. We probably all know to say the right words about avoiding hidden or tight couplings, but evidence doesn’t seem to suggest that it’s possible in large sophisticated software projects.

We are taught to fear the “big rewrite” project, citing 2nd-system-syndrome, though the definition of that seems to be about erroneously replacing “small, elegant, and successful systems”. It’s not about replacing giant, bug riddled, badly understood and engineered systems (to be super clear, I’m talking about this in general, not about the Rust compiler which I know nothing about). I do think we are often mistaken to fear rebuilding systems: I suspect we look at the size of the existing system and estimate the difficulty of recreating it. But of course we don’t want to recreate all those problems. We’ve learnt from them and can carry that knowledge forwards (assuming you manage to stop an employee exodus). There’s no desire to recreate the mountains of code that stem from outdated assumptions, inevitable mistakes in code design, unnecessary and accidental complexity, tech-debt, and its workarounds.

I’ve been thinking about parallels in other industries. Given the current price of energy in the UK and how essential it is to improve the heating efficiency of our homes, it’s often cheaper to knock down existing awful housing and rebuild from scratch. No fear of 2nd-system-syndrome here: it’s pretty unarguable that a lot of housing in the UK is dreadful, both from the point of view of how we use rooms these days, and energy efficiency. Retrofitting can often end up being more expensive, less effective, slower, and addresses only a few problems. But incremental improvement doesn’t require as much up-front capital.

If you look at the creative arts, artists, authors, and composers all create a piece of work and then it’s pretty much done. Yes, there are plenty of examples of composers going back and revising works after they’ve been performed (Bruckner and Sibelius for example), sometimes for slightly odd reasons such as establishing or extending copyright (for example Stravinsky). But a piece of art is not built by a slowly changing team over a period of 10 years (film may be an interesting exception to this). When it’s time to start a new piece of art, well, it’s time. Knowledge, style, preferences, techniques: these are carried forwards. Shostakovich always sounds unmistakably like Shostakovich. But his fifth symphony is not his fourth with a few bug fixes.

At the other end of the spectrum, take the economic philosophy known as Georgism. From what I can gather, no serious economist on the left or right believes it would be a bad idea to implement, and it seems like it would have a great many benefits. But large landowners (people who own a lot of land, not people who own any amount of land and happen to be large) would probably have to pay more tax. Large landowners tend to currently have a lot of political power. Consequently Georgism never gets implemented in the West. So despite it being almost universally accepted as a good idea, because we can’t “start again”, we’re never going to have it. From what I can see, literally the only chance would be a successful violent uprising.

Finally, recently I came across “When Do Startups Scale? Large-scale Evidence from Job Postings” by Lee and Kim. Now this paper isn’t specifically looking at software, and they use the word “experiment” to mean changing the product the company is creating in order to find the best fit for their market – they’re not talking about experimenting with software. Nevertheless:

We find that startups that begin scaling within the first 12 months of their founding are 20 to 40% more likely to fail. Our results show that this positive correlation between scaling early and firm failure is negated for startups that engage in experimentation through A/B testing.

It’s definitely a big stretch, but in the case of software this could be evidence that delaying writing lots of code, for as long as possible, is beneficial. Avoid complexity; continue to experiment with prototypes and throw-away code and treat nothing as sacrosanct for as long as possible. Do not acquiesce to complexity: give it an inch and it’ll take a mile before you even realise what’s happened.

So what to do? I’ve sometimes thought that say, once a month, companies should run some git queries and identify the oldest code in their code-bases. This code hasn’t been touched for years. The authors have long since left. It may be that no one understands what it even does. What happens if we delete it? Now in many ways (probably all ways) this is a completely mad idea: if it ain’t broke, don’t fix it, and why waste engineering resources on recreating code and functionality that no one had a problem with?

But at the same time, if this was the culture across the entire company and was priced in, then it might enable some positive things:

  • There would be more eyes on, and understanding of, ancient code. Thus less ancient code, and more understanding in general.

  • This ancient code may well embody terribly outdated assumptions about what the product does. Can it be updated with the current ideas and assumptions?

  • This ancient code may also encode invariants about the product which are no longer true. There may be a way to change or relax them. By doing so you might be able to delete various workarounds and tech-debt that exists higher up.

Now because I would guess a lot of ancient code is quite foundational, changing it may very well be quite dangerous. One fix could very quickly demand another, and before you know it you’ve embarked upon rewriting the whole thing. Maybe that’s the goal: maybe you should aim to be able to rewrite huge sections of the product within a month if it is judged to be beneficial to the code-base. But of course this requires such ideas to be taken seriously and valued right across the company. For the engineering team to have a strong voice at the top table. And really is this so different from just keeping a list of areas of the code that no one likes and dedicating time to fixing those? I guess if nothing else, it might give a starting point for making such a list.
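
A rough sketch of the sort of git query meant here, assuming “oldest” means the tracked files whose most recent commit lies furthest in the past. The function name oldest_files is made up for illustration, and a per-line view of code age would need git blame instead:

```shell
# Hypothetical helper: list the N tracked files that were last
# committed to the longest time ago (default: 10 files).
# Run it from inside a git working tree.
oldest_files() {
  count="${1:-10}"
  git ls-files | while IFS= read -r file; do
    # Date (YYYY-MM-DD) of the most recent commit touching this file.
    printf '%s %s\n' "$(git log -1 --format=%cd --date=short -- "$file")" "$file"
  done | sort | head -n "$count"
}
```

Calling oldest_files 20 at the root of a repository prints up to twenty lines of the form "date path", oldest first. It runs one git log per file, so it will be slow on very large repositories.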

Unnecessary complexity in software seems endemic, and is frequently worshipped. This, and a fear of experiments to rewrite, blunts the drive to simplify. Yet the benefits of a smaller and simpler code-base are unarguable: with greater understanding of how the product works, a small team can move much faster.

September 26, 2022 04:01 PM

Monday Morning Haskell

Using GHCup!

When it comes to starting out with Haskell, I usually recommend installing Stack. Stack is an effective one-stop shop. It automatically installs Cabal for you, and it's also able to install the right version of GHC for your project. It installs GHC to a sandboxed location, so you can easily use different versions of GHC for different projects.

But there's another good program that can help with these needs! This program is called GHCup ("GHC up"). It fills a slightly different role from Stack, and it actually allows more flexibility in certain areas. Let's see how it works!

How do I install GHCup?

Just like Stack, you can install GHCup with a single terminal command. Per the documentation, you can use this command on Linux, Mac, and Windows Subsystem for Linux:

curl --proto '=https' --tlsv1.2 -sSf | sh

See the link above for special instructions on a pure Windows setup.

What does GHCup do?

GHCup can handle the installation of all the important programs that make Haskell work. This includes, of course, the compiler GHC, Cabal, and Stack itself. What makes it special is that it can rapidly toggle between the different versions of all these programs, which can give you more flexibility.

Once you install GHCup, this should install the recommended version of each of these. You can see what is installed with the command ghcup list.

The "currently installed" version of each has a double checkmark as you can see in the picture. When you use each of these commands with the --version argument, you should see the version indicated by GHCup:

>> stack --version
Version 2.7.5
>> cabal --version
cabal-install version
>> ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.0.2

How do I switch versions with GHCup?

Any entry with a single green checkmark is "installed" on your system but not "set". You can set it as the "global" version with the ghcup set command.

>> ghcup set ghc 8.10.7
[ Info ] GHC 8.10.7 successfully set as default version
>> ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.10.7

Versions with a red x aren't installed but are available to download. If a version isn't installed on your system, you can use ghcup install to get it:

>> ghcup install stack 2.7.1

Then you need to set the version to use it:

>> ghcup set stack 2.7.1
>> stack --version
Version 2.7.1

Note that the specific example with Stack might not work if you originally installed Stack through its own installer before using GHCup.

GHCup User Interface

On most platforms, you can also use the command: ghcup tui. This brings up a textual user interface that allows you to make these changes quickly! It will bring up a screen like this on your terminal, allowing you to use the arrow keys to set the versions as you desire.

All the commands are on screen, so it's very easy to use!

Notes on Stack and GHC

An important note on setting the "global" version of GHC is that this does not affect stack sandboxing. Even if you run ghcup set ghc 8.10.7, this won't cause any problems for a stack project using GHC 9.0.2. It will build as normal using 9.0.2.

So why does it even matter what the global version of GHC is? Let's find out!

GHCup and IDEs

Why do I mention GHCup when my last article was talking about IDEs? Well the one other utility you can install and customize with GHCup is the Haskell Language Server, which shows up in the GHCup output as the program hls. This is a special program that enables partial compilation, lint suggestions and library autocompletion within your IDE (among other useful features). As we'll explore in the next couple articles, Haskell Language Server can be a little tricky to use!

Even though Stack uses sandboxed GHC versions, HLS depends on the "global" version of GHC. And changing the "global" version to a particular version you've installed with stack is a little tricky if you aren't super familiar with Haskell's internals and also comfortable with the command line. So GHCup handles this smoothly.

Imagine we have two projects with different Stack resolvers (and in this case different GHC versions).

# stack.yaml #1
# (GHC 9.0.2)

# stack.yaml #2
# (GHC 8.10.7)

If we want to get code suggestions in our first project, we just need to run this command before opening it in the editor:

ghcup set ghc 9.0.2

And if we then want to switch to our second project, we just need one command to get our hints again!

ghcup set ghc 8.10.7
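
If we don't want to remember which GHC version each project uses, we could wrap this in a small shell function. This is only a sketch: sync_ghc is a hypothetical name, and it assumes that running stack ghc -- --numeric-version inside the project directory prints the GHC version selected by that project's resolver.

```shell
# Hypothetical helper: set the "global" GHC (the one HLS sees) to the
# version used by the Stack project in the current directory.
sync_ghc() {
  # Ask Stack's sandboxed GHC for its bare version number, e.g. 9.0.2.
  version="$(stack ghc -- --numeric-version)" || return 1
  # Hand that version to GHCup; this fails if it is not installed yet.
  ghcup set ghc "$version"
}
```

Running sync_ghc from a project's root before opening the editor then replaces the manual ghcup set ghc call. If GHCup has not installed that version yet, ghcup install ghc with the same version number would be needed first.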

And of course, in addition to switching the GHC version, GHCup installs HLS for you and allows you to switch its version to keep up with updates.


With a basic understanding of HLS and switching GHC versions, we're now in a good position to start designing a really strong Haskell IDE! In the next couple of articles, we'll see a couple examples of this!

Keep up to date with all the latest news on Monday Morning Haskell by subscribing to our mailing list! This will also give you access to our subscriber resources!

by James Bowen at September 26, 2022 02:30 PM

Philip Wadler

Angry Reviewer


Angry Reviewer is a tool to provide feedback on your writing. I look forward to trying it out.

by Philip Wadler at September 26, 2022 12:08 PM

September 24, 2022

Magnus Therning

Annotate projects in Emacs

Every now and then I've wished to write comments on files in a project, but I've never found a good way to do that. annotate.el and org-annotate-file both collect annotations in a central place (in my $HOME), while marginalia puts annotations in files next to the source files but in a format that's rather cryptic and tends to be messed up when attached to multiple lines. None of them is ideal, I'd like the format to be org-mode, but not in a central file. At the same time having one annotation file per source file is simply too much.

I tried wrapping org-annotate-file, setting org-annotate-file-storage-file and taking advantage of elisp's dynamic binding. However, it opens the annotation file in the current window, and I'd really like to split the window and open the annotations to the right. Rather than trying to sort of "work it out backwards" I decided to write a small package and use as much of the functionality in org-annotate-file.el as possible.

First off I decided that I want the annotation file to be called

(defvar org-projectile-annotate-file-name ""
  "The name of the file to store project annotations.")

Then I wanted a slightly modified version of org-annotate-file-show-section, I wanted it to respect the root of the project.

(defun org-projectile-annotate--file-show-section (storage-file)
  "Add or show annotation entry in STORAGE-FILE and return the buffer."
  ;; modified version of org-annotate-file-show-section
  (let* ((proj-root (projectile-project-root))
         (filename (file-relative-name buffer-file-name proj-root))
         (line (buffer-substring-no-properties (point-at-bol) (point-at-eol)))
         (annotation-buffer (find-file-noselect storage-file)))
    (with-current-buffer annotation-buffer
      (org-annotate-file-annotate filename line))
    annotation-buffer))

The main function can then simply work out where the file with annotations should be located and call org-projectile-annotate--file-show-section.

(defun org-projectile-annotate ()
  (let ((annot-fn (file-name-concat (projectile-project-root)
                                    org-projectile-annotate-file-name)))
    (set-window-buffer (split-window-right)
                       (org-projectile-annotate--file-show-section annot-fn))))

When testing it all out I noticed that org-store-link makes a link with a search text. In my case it would be much better to have links with line numbers. I found there's a hook to modify the behaviour of org-store-link, org-create-file-search-functions. So I wrote a function to get the kind of links I want, but only when the project annotation file is open in a buffer.

(defun org-projectile-annotate-file-search-func ()
  "A function returning the current line number when called in a
project while the project annotation file is open.

This function is designed for use in the hook
'org-create-file-search-functions'. It changes the behaviour of
'org-store-link' so it constructs a link with a line number
instead of a search string."
  ;; TODO: find a way to make the link description nicer
  (when (and (projectile-project-p)
             (get-buffer-window org-projectile-annotate-file-name))
    (number-to-string (line-number-at-pos))))

That's it, now I only have to wait until the next time I want to comment on a project to see if it improves my way of working.

September 24, 2022 08:42 PM

September 22, 2022

Monday Morning Haskell

What Makes a Good IDE?

Sometimes in the past I've read articles about people's IDE setups and thought "wow, they spend way too much time thinking about this." Now maybe sometimes people do go overboard. But on the other hand, I think it's fair to say I've been neglecting the importance of my development environment in my own practice.

A quick look at some of my videos in the last couple years can show you this fact. This whole playlist is a good example. I'm generally working directly with Vim with virtually no language features beyond syntax highlighting. I think my environment even lacked any semblance of auto-completion, so if I wasn't copying something directly, I would be writing the whole thing out.

If I wanted to compile or run my code, I would switch to a terminal opened in a separate window and manually enter commands. At the very least, I could switch between these terminals pretty easily. But opening new files and trying to compare files side-by-side was a big pain.

So after reflecting on these experiences, one of my resolutions this year has been to improve my Haskell development environment. In this first article on the subject, I'll consider the specific elements of what makes a good IDE and how we can use a systematic approach to build our ideal environment.

Listing Our Needs

Designing a good environment requires us to be intentional. This means thinking carefully about what we're using our environment for, and getting into the details of the specific actions we want.

So a good starting point is to list out the important actions we want to take within our editor. Here's my preliminary list:

  1. Open a file in a tab
  2. Switch between tabs
  3. Open files side-by-side (and switch between them)
  4. Open & close a navigation sidebar
  5. Use sidebar to open files
  6. Open up a terminal to run commands
  7. Switch between the terminal and file to edit

But having these features available is just the start. We also want to do things quickly!

Moving Fast

I used to play Starcraft for a few years. And while I wasn't that good, I was good enough to learn one important lesson to apply to programming. The keyboard is way more efficient than the mouse. The mouse gives the advantage of moving in 2D space. But overall the keyboard is much faster and more precise. So in your development environment, it's very important to learn to do as many things as possible with keyboard shortcuts.

So as a practical matter, I recommend thinking carefully about how you can accomplish your most common tasks (like the features we listed above) with the keyboard. Write these down if you have to!

It helps a lot if you use an editor that allows you to remap keyboard commands. This will give you much more control over how your system works and let you pick key-bindings that are more intuitive for you. Programs like Vim and Emacs allow extensive remapping and overall customization. However, more commercial IDEs like Visual Studio and IntelliJ will still usually allow you some degree of customization.

Speaking of Vim and Emacs, the general movement keys each has (for browsing through and editing your file) are extremely useful for helping improve your general programming speed. Most IDEs have plugins that allow you to incorporate these movement keys.

The faster you're able to move, the more programming will feel like an enjoyable experience, almost like a game! So you should really push yourself to learn these keyboard shortcuts instead of resorting to using the mouse, which might be more familiar at first. In a particular programming session, try to see how long you can go without using the mouse at all!

Language Features

The above features are useful no matter what language or platform you're using. Many of them could be described as "text editor" features, rather than features of a "development environment". But there are also specific things you'll want for the language you're working with.

Syntax highlighting is an essential feature, and autocomplete is extremely important. Basic autocomplete works with the files you have open, but more advanced autocomplete can also use suggestions from library functions.

At the next level of language features, we would want syntax hints, lint suggestions and partial compilation to suggest when we have errors (for compiled languages). These also provide major boosts to your productivity. You can correct errors on the fly, rather than trying to determine the right frequency to switch to your terminal, try compiling your code, and match up the errors with the relevant area of code.

One final area of improvement you could have is integrated build and test commands. Many commercial IDEs have these for particular language setups. But it can be a bit trickier to make these work for Haskell. This is why I still generally rely on opening a terminal instead of using such integrations.


In the next couple articles, I'll go through a few different options I've considered and experimented with for my Haskell development setup. I'll list a few pros and cons of each and give a few tips on starting out with them. I'll also go through a couple tools that are generally useful to making many development environments work with Haskell.

To make sure you're up to date on all our latest news, make sure you've subscribed to our mailing list! This will give you access to all our subscriber resources, including our Beginners Checklist!

by James Bowen at September 22, 2022 02:30 PM

Tweag I/O

Building Nix flakes from Rust workspaces

Did you know that with Nix you can easily define and load development environments with all the tools that you need, without having to install anything (except Nix) on your local machine? It may be as simple as this shell.nix file:

{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = with pkgs; [ rustc cargo cargo-flamegraph ];
}

Now, if you run nix-shell, you’ll be given a shell with Rust, Cargo, and Cargo Flamegraph available to you. That’s pretty neat, but what if you want to take it a step further, and use Nix to package your Rust code? There are many options available, with different trade offs, and it can be quite overwhelming to choose between them, although there is some information on the NixOS Wiki. In this post we are going to try the different options on a simple but not entirely trivial Rust code sample.

But why Nix? Cargo uses lock files and does a good job of keeping track of Rust dependencies. But Nix goes further, also taking into account both system dependencies and the Rust compiler itself. Also, in a polyglot environment, Nix can simplify the build process by not requiring a concoction of compilers and tools to be installed.

Our Rust code includes:

  • an app that we want to compile into a native executable
  • a WebAssembly library
  • a common package used by both of the above

We also want to be able to build Nix flakes, for the hermetic packaging they provide.

The complete sample code for each Nix packaging variant is available on GitHub.

Cover photo by Beth MacDonald on Unsplash

The Rust code sample

The common package fetches cat images, like this:

use serde::Deserialize;

pub struct Cat {
    pub url: String

pub async fn fetch_cats() -> Result<Vec<Cat>, reqwest::Error> {

As you can see, this code depends on serde as well as reqwest.

The native app calls this code and prints the URL of the first cat found, simply assuming that at least one cat was retrieved.

use std::error::Error;
use cats;

async fn main() -> Result<(), Box<dyn Error>> {
    let cats = cats::fetch_cats().await?;
    println!("There's a cat at {}", cats[0].url);

The WebAssembly code does something very similar, but with explicit error handling in order to simplify the interface provided to any Javascript client code.

pub async fn cat_url() -> String {
    let cats = cats::fetch_cats().await.expect("cat response");

When these packages are not connected to a single Cargo workspace, the native app can be built by running cargo build in the app package directory. The WebAssembly package can also be built like that, but it will get compiled into native code, which is not what we want. Instead, we build it by running cargo build --target wasm32-unknown-unknown in the wasm package directory.

If we are using a Cargo workspace, however, things are a little different. Now we can build the entire workspace by running cargo build in the root directory, but as you may imagine, this will compile everything into native code. Of course, running cargo build --target wasm32-unknown-unknown isn’t going to help, because it will try to compile the native app into WebAssembly, which doesn’t work and is not what we want either.

There are two ways to fix this. We can either go into each package directory and run the appropriate commands as before, or we can specify individual workspace members like this:

cargo build -p app
cargo build -p wasm --target wasm32-unknown-unknown

Now let’s see how we can achieve the same from within a Nix flake.

The flake

We’ll define our Nix flake like this:

  description = "A flake for building a Rust workspace using buildRustPackage.";

  inputs = {
    rust-overlay.url = "github:oxalica/rust-overlay";
    flake-utils.follows = "rust-overlay/flake-utils";
    nixpkgs.follows = "rust-overlay/nixpkgs";

  outputs = inputs: with inputs;
    flake-utils.lib.eachDefaultSystem (system:
        pkgs = nixpkgs.legacyPackages.${system};
        code = pkgs.callPackage ./. { inherit nixpkgs system rust-overlay; };
      in rec {
        packages = {
          app =;
          wasm = code.wasm;
          all = pkgs.symlinkJoin {
            name = "all";
            paths = with code; [ app wasm ];
        default = packages.all;

What’s nice about this flake is that we can essentially reuse it for any of the variants we will be trying later. The most interesting part is where we make it call ./., which makes it look for a default.nix file. This is where we put everything that is specific to the tool we are using. The output of default.nix is expected to be a derivation called app and another one called wasm. As you can see, we define both of these as flake output packages, but we also define an output called all which contains both. We set this as the default package, so that when we run nix build, we actually get everything.


The first possibility for setting up default.nix is to use buildRustPackage, which is built into nixpkgs. We can build the native app like this:

app = pkgs.rustPlatform.buildRustPackage {
  pname = "app";
  version = "0.0.1";
  src = ./.;
  cargoBuildFlags = "-p app";

  cargoLock = {
    lockFile = ./Cargo.lock;

  nativeBuildInputs = [ pkgs.pkg-config ];
  PKG_CONFIG_PATH = "${}/lib/pkgconfig";

The WebAssembly code is not as straightforward, though, because buildRustPackage insists on setting the --target flag to either the (native) host system, or to whatever we’re cross-compiling against. The cross-compilation should actually work, but when configuring it, Nix ends up building a Rust compiler from scratch where the target is set to WebAssembly everywhere. This eventually fails.

Instead, we have to override the cargo build step like this:

wasm = rustPlatformWasm.buildRustPackage {
  pname = "wasm";

  buildPhase = ''
    cargo build --release -p wasm --target=wasm32-unknown-unknown
  installPhase = ''
    mkdir -p $out/lib
    cp target/wasm32-unknown-unknown/release/*.wasm $out/lib/

Please note rustPlatformWasm here, which uses the Rust overlay to get a toolchain with support for the wasm32-unknown-unknown target. See the code repository for details.

While buildRustPackage works, it is quite basic. Since each app corresponds to a single Nix derivation, if anything at all changes, such as source code, dependencies, or Nix config, the app needs to be rebuilt entirely.

Interestingly, it also only seems to work when the Rust code is in a workspace. When it’s just three separate packages, the path dependency to ../cats leads to a build error. This is the case for the WebAssembly app, where we override buildPhase, as well as for the native app, where we don’t.

Now let’s see if any of the other tools can do a better job, starting with naersk.


In order to use naersk we add it to our flake inputs, and pass it to our default.nix file:

naersk.url = "github:nix-community/naersk";
code = pkgs.callPackage ./. { inherit nixpkgs system naersk rust-overlay; };

The real changes are in default.nix, but even here it doesn’t look dramatically different from before:


  naerskLib = pkgs.callPackage naersk {};

  naerskLibWasm = pkgs.callPackage naersk {
    rustc = rustWithWasmTarget;
in {
  app = naerskLib.buildPackage {
    name = "app";
    src = ./.;
    cargoBuildOptions = x: x ++ [ "-p" "app" ];
    nativeBuildInputs = [ pkgs.pkg-config ];
    PKG_CONFIG_PATH = "${}/lib/pkgconfig";
  wasm = naerskLibWasm.buildPackage {
    name = "wasm";
    src = ./.;
    cargoBuildOptions = x: x ++ [ "-p" "wasm" ];
    copyLibs = true;
    CARGO_BUILD_TARGET = wasmTarget;

It’s nice that naersk lets us use CARGO_BUILD_TARGET, so we don’t have to override the build and install phase, like we had to for buildRustPackage. Even nicer is that naersk splits the app code and the third-party dependencies into separate derivations, so as long as we don’t update any dependencies, builds are fast. We can even modify our local cats dependency without triggering a full build.

However, like before, we cannot get the build to work unless our packages are structured inside a Cargo workspace. This is a known issue.


We can add crane to our flake inputs like this:

crane.url = "github:ipetkov/crane";

As crane is heavily inspired by naersk, it is no surprise that it works very similarly:


  craneLib = crane.mkLib pkgs;
  craneLibWasm = craneLib.overrideToolchain rustWithWasmTarget;
  app = craneLib.buildPackage {
    src = ./.;
    cargoExtraArgs = "-p app";
    nativeBuildInputs = [ pkgs.pkg-config ];
    PKG_CONFIG_PATH = "${}/lib/pkgconfig";
  wasm = craneLibWasm.buildPackage {
    src = ./.;
    cargoExtraArgs = "-p wasm --target ${wasmTarget}";

    # Override crane's use of --workspace, which tries to build everything.
    cargoCheckCommand = "cargo check --release";
    cargoBuildCommand = "cargo build --release";

Out of the box, the WebAssembly build doesn’t work because crane runs Cargo with the --workspace flag, which means that it tries to also build the native app to WebAssembly. Luckily, we can override this using cargoCheckCommand and cargoBuildCommand.

Like naersk, crane splits the app code and the third-party dependencies into separate derivations, allowing for fast builds as long as dependencies aren’t updated. And, once again, trying to build separate packages outside of a workspace fails.

Where crane tries to improve on naersk is in making it easier to compose different Cargo invocations as completely separate derivations. For example, you can have one derivation that builds all your dependencies, and additional derivations for running Clippy, building the code, running tests with code coverage, etc. These can of course depend on each other, and Nix will make sure that you don’t have to wait for output that has already been built.


Let’s now take a look at cargo2nix. Like before, we need to add a flake input:

cargo2nix.url = "github:cargo2nix/cargo2nix/release-0.11.0";

However, due to a fix we’ll come back to in a moment, we’re using our own fork for now.

While the other tools parsed Cargo.lock implicitly, with cargo2nix we need to explicitly generate a Cargo.nix file like this:

nix run github:cargo2nix/cargo2nix
git add Cargo.nix

Building the native app is quite straightforward:

  pkgs = import nixpkgs {
    inherit system;
    overlays = [cargo2nix.overlays.default];

  rustPkgs = pkgs.rustBuilder.makePackageSet {
    rustVersion = "1.61.0";
    packageFun = import ./Cargo.nix;
in {
  app = ( {}).bin;

Building the WebAssembly should also have been as easy as this:

rustWithWasmTarget = pkgs.rust-bin.stable.${rustVersion}.default.override {
    targets = [ wasmTarget ];

rustPkgsWasm = pkgs.rustBuilder.makePackageSet {
  rustVersion = "1.61.0";
  packageFun = import ./Cargo.nix;
  rustToolchain = rustWithWasmTarget;
  target = wasmTarget;

Unfortunately, this exposes a bug in the way cargo2nix handles target-specific dependencies. It skips native-only dependencies if the host platform is wasm32, but it should really check the target platform. We can work around this by specifying a cross-system that is wasm32, and the one supported by Nix is wasm32-wasi:

pkgsWasm = import nixpkgs {
  inherit system;
  crossSystem = {
    system = "wasm32-wasi";
    useLLVM = true;
  overlays = [cargo2nix.overlays.default];

There is another problem you may run into, because cargo2nix now thinks you’re building for wasm32-unknown-wasi, and not wasm32-unknown-unknown. Your Cargo.nix file may contain dependencies like this:

${ if == "wasi" then "getrandom" else null } = rustPackages."registry+".getrandom."0.2.7" ...

This means we are now getting dependencies we are not supposed to get, and which fail to build. We need to guide cargo2nix to the correct kernel name here:

packageFun = attrs: import ./Cargo.nix (attrs // {
  hostPlatform = attrs.hostPlatform // {
    parsed = attrs.hostPlatform.parsed // { = "unknown";

You may wonder if we cannot similarly work around the problem mentioned above, and altogether skip the crossSystem config, by simply setting = "wasm32". Unfortunately, I have not had success with that. Anyway, we can finally get our WebAssembly:

wasm = (rustPkgsWasm.workspace.wasm {}).out;

Originally, this wouldn’t actually produce any WebAssembly output, but with help from my colleagues Alexei Drake and Yorick van Pelt we were able to submit a fix.

And for the first time, we managed to also build the code as separate crates, not within a common workspace. All we needed to do was to delete the top-level Cargo files, generate Cargo.lock and Cargo.nix for both /app and /wasm, and update the references to Cargo.nix in default.nix.

Like naersk and crane, our own code and the dependencies are split into separate derivations, but cargo2nix doesn’t stop there. All crates get their own derivation, which means that if we update only one dependency (in Cargo.lock and subsequently in Cargo.nix), we may still enjoy a quick build. This is helpful if just one or two of your dependencies change frequently. Or if you need to somehow break a very long CI build into stages, although you would have to be a little creative, because cargo2nix doesn’t have any support for building just some dependencies.

Other options

There are even more tools for building Rust code with Nix, but they were not up to the challenge, for various reasons.

  • carnix is an old tool that is no longer maintained, and superseded by crate2nix.

  • crate2nix also seems to not be maintained much any more, and it doesn’t support building WebAssembly anyway.

  • dream2nix is a very exciting project that aims to unify the many “2nix” converters into a common framework. However, it doesn’t seem to be mature enough for our purposes, with little documentation and apparently no way to specify a WebAssembly build target.

  • nocargo is another option under development, that, like cargo2nix, will build one derivation per crate. But, as its name suggests, it will not depend on Cargo at all, only Rustc. Unfortunately, it wasn’t able to build our sample code. While it seemingly built the Cats library and the native app without issue, the resulting app output was empty. And the WebAssembly build failed with error messages. Also, having path dependencies outside of a workspace wasn’t supported.


Which tool should you use, then? As always, it depends. If you have a simple app and just want to package it with Nix, buildRustPackage may be all you need. But I think you’ll soon appreciate the faster builds that the other tools provide by splitting your code and dependencies into separate derivations. Crane does seem to be an improvement over naersk, and because it delegates all the hard parts to Cargo, it can stay clear of all the problems that cargo2nix will need to handle.

If, for any reason, you need to split your dependencies into one derivation per crate, then cargo2nix seems to be your only option. The downsides are that you need to manage a Cargo.nix file, and that you may run into bugs if you have complicated builds.

Nocargo looks promising, and may well become the preferred choice when it has matured.

September 22, 2022 12:00 AM

September 21, 2022

Lysxia's blog

The quantified constraint trick

My favorite Haskell trick is how to use quantified constraints with type families. Kudos to Iceland_jack for coming up with it.

Quantified constraints and type families

QuantifiedConstraints is an extension from GHC 8.6 that lets us use forall in constraints.

It lets us express constraints for instances of higher-kinded types like Fix:

newtype Fix f = Fix (f (Fix f))

deriving instance (forall a. Eq a => Eq (f a)) => Eq (Fix f)

Other solutions existed previously, but they’re less elegant:

deriving instance Eq (f (Fix f)) => Eq (Fix f)

instance Eq1 f => Eq (Fix f) where ...

It also lets us say that a monad transformer indeed transforms monads:

class (forall m. Monad m => Monad (t m)) => MonadTrans t where
  lift :: m a -> t m a

(Examples lifted from the GHC User Guide on QuantifiedConstraints, section Motivation.)

One restriction is that the conclusion of a quantified constraint cannot mention a type family.

type family F a

-- (forall a. C (F a))  -- Illegal type family application in a quantified constraint

A quantified constraint can be thought of as providing a local instance, and they are subject to a similar restriction on the shape of instance heads so that instance resolution may try to match required constraints with the head of existing instances.

Type families are not matchable: we cannot determine whether an applied type family F a matches a type constructor T in a manner satisfying the properties required by instance resolution (“coherence”). So type families can’t be in the conclusion of a quantified constraint.

The quantified constraint trick

Step 1

To legalize type families in quantified constraints, all we need is a class synonym:

class    C (F a) => CF a
instance C (F a) => CF a

Now CF a is equivalent to C (F a), and forall a. CF a is legal.

Step 2?

Since GHC 9.2, Step 1 alone solves the problem. It Just Works™. And I don’t know why.

Before that, for GHC 9.0 and prior, we also needed to hold the compiler’s hand and tell it how to instantiate the quantified constraint.

Indeed, now functions may have constraints of the form forall a. CF a, which should imply C (F x) for any x. Although CF x and C (F x) are logically related, when C (F x) is required, that triggers a search for instances of the class C, and not of the class CF which is provided by the quantified constraint. The search would fail unless some hint is provided to the compiler.

When you require a constraint C (F x), insert a type annotation mentioning the CF x constraint (using the CF class instead of C).

_ {- C (F x) available here -} :: CF x => _

Inside the annotation (to the left of ::), we are given CF x, from which C (F x) is inferred as a superclass. Outside the annotation, we are requiring CF x, which is trivially solved by the quantified constraint forall a. CF a.


-- Mixing quantified constraints with type families --

class C a
type family F a

-- forall a. C (F a)  -- Nope.

class    C (F a) => CF a  -- Class synonym
instance C (F a) => CF a

-- forall a. CF a     -- Yup.

-- Some provided function we want to call.
f :: C (F t) => t

-- A function we want to implement using f.
g :: (forall a. CF a) => t
g = f               -- OK on GHC >= 9.2
g = f :: CF t => t  -- Annotation needed on GHC <= 9.0

The part of that type annotation that really matters is the constraint. The rest of the type to the right of the arrow is redundant. Another way to write only the constraint uses the following identity function with a fancy type:

with :: forall c r. (c => r) -> (c => r)
with x = x

So you can supply the hint like this instead:

g :: forall t. (forall a. CF a) => t
g = with @(CF t) f
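To see the pieces fit together, here is a minimal self-contained sketch of the trick. The class Named, the type family Wrap and the synonym NamedWrap are made up for illustration (they play the roles of C, F and CF above), and the extension list is an assumption about what GHC wants here; it may be longer than strictly necessary.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE QuantifiedConstraints #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE UndecidableSuperClasses #-}

-- Hypothetical class standing in for C.
class Named a where
  name :: String

-- Hypothetical type family standing in for F.
type family Wrap a where
  Wrap a = [a]

instance Named [a] where
  name = "list"

-- Class synonym standing in for CF: legal in a quantified constraint.
class    Named (Wrap a) => NamedWrap a
instance Named (Wrap a) => NamedWrap a

-- Identity function used only to state a constraint.
with :: forall c r. (c => r) -> (c => r)
with x = x

-- The provided function, which needs the type-family constraint.
f :: forall t. Named (Wrap t) => String
f = name @(Wrap t)

-- Implemented from f using only the quantified constraint.
-- The hint is written out explicitly, as needed on GHC <= 9.0.
g :: forall t. (forall a. NamedWrap a) => String
g = with @(NamedWrap t) (f @t)
```

With these definitions, g @Int evaluates to "list": the quantified constraint is discharged at the call site by the NamedWrap instance, and inside g the superclass of NamedWrap t supplies Named (Wrap t).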

Application: generic-functor

What do I need that trick for? It comes up in generic metaprogramming.

Imagine deriving Functor for Generic types (no Generic1, which is not as general as you might hope). One way is to implement the following class on generic representations:

class RepFmap a a' rep rep' where
  repFmap :: (a -> a') -> rep -> rep'

A type constructor f :: Type -> Type will be a Functor when its generic representation (Rep) implements RepFmap a a'… for all a, a'.

-- Class synonym for generically derivable functors
class    (forall a. Generic (f a), forall a a'. RepFmap a a' (Rep (f a) ()) (Rep (f a') ())) => GFunctor f
instance ...   -- idem (class synonym)

-- Wait a second...

But that is illegal, because the type family Rep occurs in the conclusion of a quantified constraint.

Time for the trick! We give a new name to the conclusion:

class    RepFmap a a' (Rep (f a) ()) (Rep (f a') ()) => RepFmapRep a a' f
instance ...  -- idem (class synonym)

And we can use it in a quantified constraint:

-- Now this works!
class    (forall a. Generic (f a), forall a a'. RepFmapRep a a' f) => GFunctor f
instance ...   -- idem (class synonym)

To obtain the final generic implementation of fmap, we wrap repFmap between to and from.

gfmap :: forall f a a'. GFunctor f => (a -> a') -> f a -> f a'
gfmap f =
  with @(RepFmapRep a a' f)             -- Hand-holding for GHC <= 9.0
    (to @_ @() . repFmap f . from @_ @())

Et voilà.

(Gist of this example)

Appendix: Couldn’t we do this instead?

If you’ve followed all of that, there’s one other way you might try defining gfmap without QuantifiedConstraints, by just listing the three constraints actually needed in the body of the function.

-- Dangerous gfmap!
gfmap ::
  Generic (f a) =>
  Generic (f a') =>
  RepFmap a a' (Rep (f a) ()) (Rep (f a') ()) =>
  (a -> a') -> f a -> f a'
gfmap f = to @_ @() . repFmap f . from @_ @()

This is okay as long as it is only ever used to implement fmap as in:

fmap = gfmap

Any other use voids a guarantee you didn’t know you expected.

The thing I haven’t told you is that RepFmap is implemented with… incoherent instances![1] In fact, this gfmap may behave differently depending on how it is instantiated at compile time.

For example, for a functor with a field of constant type:

data T a b = C Int a b
  deriving Generic

gfmap @(T a) @b @b' where a, b and b' are distinct type variables behaves like fmap should. But gfmap @(T Int) @Int @Int will unexpectedly apply its argument function to every field. They all have type Int, so a function Int -> Int can and will be applied to all fields.

I could demonstrate this if I had implemented RepFmap… Luckily, there is a more general version of this “dangerous gfmap” readily available in my library generic-functor. It can be very incoherent, but if you follow some rules, it can also be very fun to use.

Playing with fire

gsolomap[2] is a function from generic-functor that can implement fmap, and much more.

fmapT :: (b -> b') -> T a b -> T a b'
fmapT = gsolomap

Map over the first parameter if you prefer:

firstT :: (a -> a') -> T a b -> T a' b
firstT = gsolomap

Or map over both type parameters at once:

bothT :: (a -> a') -> T a a -> T a' a'
bothT = gsolomap

I don’t know what to call this, but gsolomap also does what you might guess from this type:

watT ::
  (a -> a') ->
  T (a , a ) ((a  -> a') -> Maybe a ) ->
  T (a', a') ((a' -> a ) -> Maybe a') 
watT = gsolomap

It’s important to specialize gsolomap with distinct type variables (a and a'). You cannot refactor code by inlining a function if its body uses gsolomap, as it risks breaking that requirement.

Witnessing incoherence

For an example of a surprising result caused by incoherence, apply the fmapT defined above to some concrete arguments. See how the result changes when you replace fmapT with its definition, gsolomap.

fmapT    ((+1) :: Int -> Int) (C 0 0 0) == C 0 0 1 :: T Int Int
gsolomap ((+1) :: Int -> Int) (C 0 0 0) == C 1 1 1 :: T Int Int  -- Noooooo...

(Gist of those gsolomap (counter)examples)

This is why gfmap’s signature should use quantified constraints: this guarantees that when the RepFmap constraint is solved, the first two parameters are going to be distinct type variables, thanks to the universal quantification (forall a a'). Thus, incoherence is hidden away.

Following that recipe, generic-functor contains safe implementations of Functor, Foldable, Traversable, Bifunctor, and Bitraversable.

In particular, the type of gfmap guarantees that it has a unique inhabitant satisfying gfmap id = id, and this property is quite straightforward to check by visual inspection of the implementation.

After all, gfmap will essentially do one of three things: (1) it will be id on types that don’t mention the type parameters in its function argument a -> a', (2) it will apply the provided function f, or (3) it will fmap (or bimap, or dimap) itself through a type constructor. In all cases (with some inductive reasoning for (3)), if f = id, then gfmap f = id.

gfmap f = id
gfmap f = f
gfmap f = fmap (gfmap f)

The dangerous gfmap (without QuantifiedConstraints) and gsolomap fail this property, because the extra occurrences of a and a' in their constraints give their signatures a different “shape” from fmap.

The trade-off is that those safe functions can’t do the same crazy things as gsolomap above.

  1. AFAICT there is no way around that with GHC.Generics. Incoherent instances can be avoided with kind-generics.↩︎

  2. gsolomap accepts one function parameter. There is also gmultimap which accepts arbitrarily many functions.↩︎

by Lysxia at September 21, 2022 12:00 AM

September 20, 2022

Tweag I/O

Optimizing Nickel's Array Contracts

Nickel is a gradually-typed, purely functional configuration language with contracts and lazy evaluation.

My internship at Tweag consisted of (attempts at) performance improvements in the Nickel reference interpreter. In this article, I will describe what I think is the most interesting problem I had on my hands: array operations like head were sometimes running in linear time instead of constant time.

What do I mean by “sometimes”? Well, to answer that, we need some familiarity with contract application and lazy evaluation.

Contract application

Contracts are a key feature of Nickel that empowers you to write correct configurations in a flexible way by extending the type system with your custom dynamically-checked properties. Thus you can write types for TCP ports, email addresses, or even the Cargo Manifest format, using runtime checks on your data.

Prime numbers are cool. Let’s write a contract for them[1]:

let Prime = fun label value =>
  if is_prime value then value
  else contract.blame_with label "not a prime number"

An expression such as 42 | Prime will fail, as expected: the interpreter will run the “function” Prime on the blame label, a special value containing error-reporting information, and the value 42. This is syntactic sugar: before evaluation, this expression will be transformed into contract.apply Prime label 42.

For arrays, we can use the constructor Array and write [3, 19] | Array Prime to enforce the prime number contract on all elements of the array. In this case, the expression will be roughly transformed into: (contract.apply Prime label) [3, 19]

This is not completely accurate: the interpreter uses unique labels for each element to avoid blaming the entire array for a contract violation, as that would result in poor error messages.

At this point, you can already see where the linear behavior might come from. Evaluating:

array.head ( (contract.apply Prime label) [3, 19])

will involve two contract applications, instead of one. The argument of array.head will first be evaluated to [3 | Prime, 19 | Prime], meaning that Prime will be applied on both 3 and 19. Only then can we retrieve 3 | Prime.

This effect gets amplified for recursive functions guarded by an array contract, such as:

let rec product
  | Array Prime -> Num
  = fun ps =>
    if array.length ps == 0 then 1
    else array.head ps * product (array.tail ps)

Here, every recursive call will apply the Array Prime contract to the tail of the input array. Consequently, every call to product will be done in linear time, making the full operation quadratic.

Unfortunately, this reasoning doesn’t completely explain the observed behavior, because Nickel is lazy.

Lazy evaluation

During evaluation, Nickel expressions are put behind thunks. Thunks are mutable memory locations that initially hold unevaluated expressions, and are only updated with evaluation results if some part of the program forces them.

This means that evaluating:

let ps = [3, 9 + 10] | Array Prime in array.head ps

will mean mapping Prime over [3, 9 + 10] to get something that resembles [3 | Prime, (9 + 10) | Prime]. Next, 3 | Prime would be extracted and evaluated.

Hence the evaluation result will be 3 and (9 + 10) | Prime won’t be computed. In fact, one could replace 9 + 10 with a computation which never succeeds, such as 0 / 0, and the above snippet will still evaluate to 3.

Thanks to lazy evaluation, array contracts are mapped in linear time but the resulting contract applications are not evaluated unless they’re needed. If you wish to force the entire array to be fully evaluated, you can use builtin.deep_seq. This special function will make sure that the thunk behind each of the array’s elements is updated.
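The same element-level laziness can be sketched in Haskell. This is only an analogy under stated assumptions: prime, arrayOf and firstElem are hypothetical names, contracts are modelled as plain functions that either return the value or raise an error, and Haskell lists are lazy in their spine too, so the linear cost of mapping over a Nickel array is not modelled here.

```haskell
-- Trial-division primality test, good enough for a demo.
isPrime :: Int -> Bool
isPrime n =
  n >= 2 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- A "contract": returns the value unchanged or blames with the label.
prime :: String -> Int -> Int
prime label value
  | isPrime value = value
  | otherwise = error (label ++ ": not a prime number")

-- Array contract: map the element contract over every element.
-- Each application stays an unevaluated thunk until it is demanded.
arrayOf :: (String -> a -> a) -> String -> [a] -> [a]
arrayOf contract label = map (contract label)

firstElem :: [a] -> a
firstElem (x : _) = x
firstElem [] = error "empty array"

-- The second element would fail if it were ever evaluated,
-- but only the first contract application is forced.
example :: Int
example = firstElem (arrayOf prime "ps" [3, 0 `div` 0])
```

Here example evaluates to 3; the division by zero in the second element is never forced.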

A solution

Instead of mapping contract application on arrays, the interpreter should make the array hold on to its contract, and only apply it when data leaves the array. This came in the form of a new internal operation which I will refer to as contract.lazy_apply. This means that our [3, 19] | Array Prime example will be equivalent to:

contract.lazy_apply Prime label [3, 19]

This time, array.head ([3, 19] | Array Prime) will return contract.apply Prime label 3 in constant time, even if the thunk corresponding to the array had never been updated before.
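The idea of an array that holds on to its contract can be sketched in Haskell as follows. This is a toy model, not the interpreter's data structure: Guarded, lazyApply, headG, tailG and productG are hypothetical names, a plain list stands in for the array, and contracts are again modelled as functions that return the value or raise an error.

```haskell
-- Trial-division primality test, good enough for a demo.
isPrime :: Int -> Bool
isPrime n =
  n >= 2 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- A "contract": returns the value unchanged or blames with the label.
prime :: String -> Int -> Int
prime label value
  | isPrime value = value
  | otherwise = error (label ++ ": not a prime number")

-- An array together with the contract still pending on its elements.
data Guarded a = Guarded
  { pending  :: a -> a
  , elements :: [a]
  }

fromList :: [a] -> Guarded a
fromList = Guarded id

-- Attaching a contract touches no element: constant time.
lazyApply :: (a -> a) -> Guarded a -> Guarded a
lazyApply contract (Guarded p xs) = Guarded (contract . p) xs

-- The pending contract is applied only when data leaves the array.
headG :: Guarded a -> Maybe a
headG (Guarded p (x : _)) = Just (p x)
headG _ = Nothing

-- The tail keeps the same pending contract; nothing is re-mapped.
tailG :: Guarded a -> Maybe (Guarded a)
tailG (Guarded p (_ : xs)) = Just (Guarded p xs)
tailG _ = Nothing

-- Counterpart of the recursive product function from earlier.
productG :: Guarded Int -> Int
productG g = case (headG g, tailG g) of
  (Just x, Just rest) -> x * productG rest
  _ -> 1
```

In this model, headG (lazyApply (prime "ps") (fromList [3, 4])) is Just 3 without ever checking 4, and each recursive step of productG applies the contract to exactly one element.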

After refactoring the interpreter to use the new machinery, I realized that builtin.deep_seq was subtly broken. Given arbitrary terms x and y, builtin.deep_seq x y recursively traverses all the thunks of x so that nested records and arrays will be fully evaluated. Once that’s done, builtin.deep_seq makes the interpreter resume normal evaluation of y.

To preserve the semantics of array contracts, any pending contract should be applied during the builtin.deep_seq operation. At first, I made it so that when builtin.deep_seq is called on an array, the interpreter will evaluate each element with the array’s contract applied to it. This worked in the majority of cases, except for record contracts.

The problem with record contracts was that, under certain conditions[2], they have to clone their thunks. This led to cases where calling builtin.deep_seq on a record with a contract applied, such as { x = y + 1, y = 1 } | { x | Num, y | Num }, updated the newly cloned thunks and not the original thunks.

Therefore, in some cases builtin.deep_seq x y left x with unevaluated thunks: x doesn’t “see” the cloned thunks. This isn’t the desired behaviour. Furthermore, serialization functions to JSON and friends, as well as the pretty-printer in the REPL assume that they are given values without thunks, which I couldn’t guarantee anymore.

To address this, I introduced a new internal operator.

Yet another operator

Because of how record contracts work, applying a contract to an arbitrary term may yield a version of said term where some thunks have been cloned. And so given a term t, when builtin.deep_seq t t recursively evaluates arrays with their contracts mapped, some contract application results will not be reflected in t as they live in cloned thunks.

This is where a new internal unary operation comes into the picture: builtin.force.

It works as follows: builtin.force t not only evaluates t, but also returns a new copy of t where everything is guaranteed to be evaluated. In particular, when t is an array of records, builtin.force t will return a new array filled with the records’ (evaluated) cloned thunks, rather than the original, unevaluated thunks.

Future opportunities

Because array contracts are now saved in a pending state for later application, it’s possible to eliminate duplicates using a limited notion of contract equality. I say limited because the terms inside a contract can be arbitrarily complex. Still, it might be worth the effort to optimize for the common cases such as name aliases, and the Nickel Team recently made some headway towards deciding contract equality in many of those cases.


It turns out that this is one of those cases where the obtained performance improvement is significant. Let’s write a function for computing array slices by generating a range of indices from a from index and to index. For simplicity’s sake, an empty array is returned if the provided range is invalid.

let slice
  | Num -> Num -> Array Num -> Array Num
  = fun from to xs =>
    if to < array.length xs && from >= 0 && from <= to then
      let range = array.generate ((+) from) (to - from)
      in (fun idx => array.elem_at idx xs) range
    else []

I ran slice 200 800 on an array of 1000 elements and obtained the following results, courtesy of the excellent command-line tool hyperfine:

Version                    Mean [ms]      Min [ms]   Max [ms]   Relative
Without array.lazy_apply   618.9 ± 21.2   601.3      670.1      7.59 ± 0.36
With array.lazy_apply      81.6 ± 2.6     78.8       94.5       1.00

  1. A more idiomatic way of writing this contract is: contract.from_predicate is_prime. Of course, the two styles are equivalent.
  2. This is due to the interaction between Nickel’s merging primitive and recursive records, as documented in this Pull Request.

September 20, 2022 12:00 AM

September 19, 2022

Monday Morning Haskell

Haskell for High Schoolers: Paradigm Conference!

Here's a quick announcement today, aimed at those younger Haskellers out there. If you're in middle school or high school (roughly age 18 and below), you should consider signing up for Paradigm Conference this coming weekend (September 23-25)! This is a virtual event aimed at teaching younger students about functional programming.

The first day of the conference consists of a Hackathon where you'll get the chance to work in teams to solve a series of programming problems. I've been emphasizing this sort of problem solving a lot in my streaming sessions, so I think it will be a great experience for attendees!

On the second day, there will be some additional coding activities, as well as workshops and talks from speakers, including yours truly. Since I'll be offline the whole weekend, my talk will be pre-recorded, but it will connect a lot of the work I've been doing in the last couple of months with respect to Data Structures and Dijkstra's algorithm. So if you've enjoyed those series independently, you might enjoy the connections I try to make between these ideas. This video talk will include a special offer for Haskell newcomers!

So to sign up and learn more, head over to the conference site, and start getting ready!

by James Bowen at September 19, 2022 02:30 PM

September 15, 2022

Monday Morning Haskell

Everyday Applicatives!

I recently revised the Applicatives page on this site, and it got me wondering...when do I use applicatives in my code? Functors are simpler, and monads are more ubiquitous. But applicatives fill kind of an in-between role where I often don't think of them too much.

But a couple weeks ago I encountered one of those small coding problems in my day job that's easy enough to solve, but difficult to solve elegantly. And as someone who works with Haskell, of course I like to make my code as elegant as possible.

But since my day job is in C++, I couldn't find a good solution. I was thinking to myself the whole time, "there's definitely a better solution for this in Haskell". And it turns out I was right! And the answer, in this case, was to use functions specific to Applicative!

To learn more about applicative functors and other functional structures, make sure to read our Monads series! But for now, let's explore this problem!


So at the most basic level, let's imagine we're dealing with a Message type that has a timestamp:

class Message {
 public:
  Time timestamp;
};

We'd like to compare two messages based on their timestamps, to see which one is closer to a third timestamp. But to start, our messages are wrapped in a StatusOr object for handling errors. (This is similar to Either in Haskell).

void function() {
  Time baseTime = ...;
  StatusOr<Message> message1 = ...;
  StatusOr<Message> message2 = ...;
}

I now needed to encode this logic:

  1. If only one message is valid, do some logic with that message
  2. If both messages are valid, pick the closer message to the baseTime and perform the logic.
  3. If neither message is valid, do a separate branch of logic.

The C++ Solution

So to flesh things out more, I wrote a separate function signature:

void function() {
  Time baseTime = ...;
  StatusOr<Message> message1 = ...;
  StatusOr<Message> message2 = ...;
  std::optional<Message> closerMessage = findCloserMessage(baseTime, message1, message2);
  if (closerMessage.has_value()) {
    // Do logic with "closer" message
  } else {
    // Neither is valid
  }
}

std::optional<Message> findCloserMessage(
    Time baseTime,
    const StatusOr<Message>& message1,
    const StatusOr<Message>& message2);

So the question now is how to fill in this helper function. And it's simple enough if you embrace some branches:

std::optional<Message> findCloserMessage(
    Time baseTime,
    const StatusOr<Message>& message1,
    const StatusOr<Message>& message2) {
    if (message1.isOk()) {
        if (message2.isOk()) {
            // Both valid: pick the message closer to baseTime.
            if (abs(message1.value().timestamp - baseTime) < abs(message2.value().timestamp - baseTime)) {
                return {message1.value()};
            } else {
                return {message2.value()};
            }
        } else {
            // Only message1 is valid.
            return {message1.value()};
        }
    } else {
        if (message2.isOk()) {
            // Only message2 is valid.
            return {message2.value()};
        } else {
            return std::nullopt;
        }
    }
}

Now technically I could combine conditions a bit in the "both valid" case and save myself a level of branching there. But aside from that nothing else really stood out to me for making this better. It feels like we're doing a lot of validity checks and unwrapping with .value()...more than we should really need.

The Haskell Solution

Now with Haskell, we can actually improve on this conceptually, because Haskell's functional structures give us better ways to deal with validity checks and unwrapping. So let's start with some basics.

data Message = Message
  { timestamp :: UTCTime
  }

function :: IO ()
function = do
  let (baseTime :: UTCTime) = ...
  (message1 :: Either IOError Message) <- ...
  (message2 :: Either IOError Message) <- ...
  let closerMessage' = findCloserMessage baseTime message1 message2
  case closerMessage' of
    Just closerMessage -> ...
    Nothing -> ...

findCloserMessage ::
  UTCTime -> Either IOError Message -> Either IOError Message -> Maybe Message
findCloserMessage baseTime message1 message2 = ...

How should we go about implementing findCloserMessage?

The answer is in the applicative nature of Either! We can start by defining a function that operates directly on the messages and determines which one is closer to the base:

findCloserMessage baseTime message1 message2 = ...
  where
    f :: Message -> Message -> Message
    f m1@(Message t1) m2@(Message t2) =
      if abs (diffUTCTime t1 baseTime) < abs (diffUTCTime t2 baseTime)
        then m1 else m2

We can now use the applicative operator <*> to apply this operation across our Either values. The result of this will be a new Either value.

findCloserMessage baseTime message1 message2 = ...
  where
    f :: Message -> Message -> Message
    f m1@(Message t1) m2@(Message t2) =
      if abs (diffUTCTime t1 baseTime) < abs (diffUTCTime t2 baseTime)
        then m1 else m2

    bothValidResult :: Either IOError Message
    bothValidResult = pure f <*> message1 <*> message2

So if both are valid, this will be our result. But if either of our inputs has an error, we'll get this error as the result instead. What happens in this case?

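Well now we can use the Alternative behavior that many applicative functors provide. This lets us use the <|> operator to combine values so that instead of getting the first error, we'll get the first success. (One caveat: base itself does not define an Alternative instance for Either, so the code below assumes such an instance is in scope; converting each Either to a Maybe first works with base alone.) So we'll combine our "closer" message if both are valid with the original messages: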

import Control.Applicative

findCloserMessage baseTime message1 message2 = ...
  where
    f :: Message -> Message -> Message
    f m1@(Message t1) m2@(Message t2) =
      if abs (diffUTCTime t1 baseTime) < abs (diffUTCTime t2 baseTime)
        then m1 else m2

    bothValidResult :: Either IOError Message
    bothValidResult = pure f <*> message1 <*> message2

    allResult :: Either IOError Message
    allResult = bothValidResult <|> message1 <|> message2

The last step is to turn this final result into a Maybe value:

import Control.Applicative
import Data.Either

findCloserMessage ::
  UTCTime -> Either IOError Message -> Either IOError Message -> Maybe Message
findCloserMessage baseTime message1 message2 =
  -- Keep a successful result and drop the error.
  either (const Nothing) Just allResult
  where
    f :: Message -> Message -> Message
    f m1@(Message t1) m2@(Message t2) =
      if abs (diffUTCTime t1 baseTime) < abs (diffUTCTime t2 baseTime)
        then m1 else m2

    bothValidResult :: Either IOError Message
    bothValidResult = pure f <*> message1 <*> message2

    allResult :: Either IOError Message
    allResult = bothValidResult <|> message1 <|> message2

The vital parts of this are just the last 4 lines. We use applicative and alternative operators to simplify the logic that leads to all the validity checks and conditional branching in C++.
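For readers who want something that compiles with base alone, here is a minimal sketch of the same idea. It rests on two assumptions that differ from the code above: timestamps are simplified to Integer seconds (the Time alias is hypothetical, standing in for UTCTime and diffUTCTime), and each Either is converted to a Maybe up front with a hypothetical rightToMaybe helper, because Maybe has an Alternative instance in base.

```haskell
import Control.Applicative ((<|>))

-- Assumption: timestamps are plain seconds so the sketch needs only base.
type Time = Integer

newtype Message = Message {timestamp :: Time}
  deriving (Eq, Show)

-- Discard the error and keep only a success.
rightToMaybe :: Either e a -> Maybe a
rightToMaybe = either (const Nothing) Just

findCloserMessage ::
  Time -> Either e Message -> Either e Message -> Maybe Message
findCloserMessage baseTime message1 message2 =
  -- Both valid: pick the closer one. Otherwise fall back to whichever
  -- message is valid, or Nothing if neither is.
  (closer <$> m1 <*> m2) <|> m1 <|> m2
  where
    m1 = rightToMaybe message1
    m2 = rightToMaybe message2

    closer :: Message -> Message -> Message
    closer a b
      | abs (timestamp a - baseTime) < abs (timestamp b - baseTime) = a
      | otherwise = b
```

For instance, with a base time of 10, two valid messages stamped 8 and 15 give Just (Message 8), a failed first message gives the second one back, and two failures give Nothing.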


Is the Haskell approach better than the C++ approach? Up to you! It feels more elegant to me, but maybe isn't as intuitive for someone else to read. We have to remember that programming isn't a "write-only" activity! But these examples are still fairly straightforward, so I think the tradeoff would be worth it.

Now is it possible to do this sort of refactoring in C++? Possibly. I'm not deeply familiar with the library functions that are possible with StatusOr, but it certainly wouldn't be as idiomatic.

If you enjoyed this article, make sure to subscribe to our monthly newsletter! You should also check out our series on Monads and Functional Structures so you can learn more of these tricks!

by James Bowen at September 15, 2022 02:30 PM

Joachim Breitner

rec-def: Dominators case study

More ICFP-inspired experiments using the rec-def library: In Norman Ramsey’s very nice talk about his Functional Pearl “Beyond Relooper: Recursive Translation of Unstructured Control Flow to Structured Control Flow”, he had the following slide showing the equation for the dominators of a node in a graph:

Norman Ramsey shows a formula

He said “it’s ICFP and I wanted to say the dominance relation has a beautiful set of equations … you can read all these algorithms how to compute this, but the concept is simple”.

This made me wonder: If the concept is simple and this formula is beautiful – shouldn’t this be sufficient for the Haskell programmer to obtain the dominator relation, without reading all those algorithms?

Before we start, we have to clarify the formula a bit: If a node is an entry node (no predecessors) then the big intersection is over the empty set, and that is not a well-defined concept. For these nodes, we need that big intersection to return the empty set, as entry nodes are not dominated by any other node. (Let’s assume that the entry nodes are exactly those with no predecessors.)

Let’s try, first using plain Haskell data structures. We begin by implementing this big intersection operator on Data.Set, and also a function to find the predecessors of a node in a graph:

Now we can write down the formula that Norman gave, quite elegantly:
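A minimal sketch of these definitions, assuming the graph is given as a list of edges between Int nodes. The names Graph, intersections, preds and dominators1 are illustrative choices, not necessarily those of the original listing:

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

-- Assumption: a graph is a list of directed edges between Int nodes.
type Graph = [(Int, Int)]

-- Big intersection; by the convention above it is empty for no arguments.
intersections :: [S.Set Int] -> S.Set Int
intersections [] = S.empty
intersections xs = foldr1 S.intersection xs

-- Predecessors of every node; nodes without predecessors map to [].
preds :: Graph -> M.Map Int [Int]
preds edges = M.fromListWith (++) $
  [ (v1, [])   | (v1, _)  <- edges ] ++  -- make the map total
  [ (v2, [v1]) | (v1, v2) <- edges ]

-- The formula, written as a lazily self-referential map.
dominators1 :: Graph -> M.Map Int [Int]
dominators1 edges = fmap S.toList doms
  where
    doms :: M.Map Int (S.Set Int)
    doms = M.mapWithKey
      (\v vs -> S.insert v (intersections [ doms M.! v' | v' <- vs ]))
      (preds edges)
```

The sketch relies on the lazy Data.Map: each value of doms is a thunk that may look up other entries of doms, which is fine as long as the graph has no cycles.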

Does this work? It seems it does:

But – not surprising if you have read my previous blog posts – it falls over once we have recursion:

So let us reimplement it with Data.Recursive.Set.

The hope is that we can simply replace the operations, and that now it can suddenly handle cyclic graphs as well. Let’s see:

It does! Well, it does return a result… but it looks strange. Clearly nodes 3 and 4 are also dominated by 1, but the result does not reflect that.

But the result is a solution to Norman’s equation. Was the equation wrong? No, but we failed to notice that the desired solution is the largest, not the smallest. And Data.Recursive.Set calculates, as documented, the least fixed point.
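To make the difference between the two fixed points concrete, here is a sketch that computes the largest solution by explicit iteration over plain Data.Set values: start with every node dominated by all nodes and shrink until nothing changes. This is not how rec-def works, and the name dominatorsByIteration is hypothetical; it only serves as a cross-check of what the desired answer looks like.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

-- Greatest fixed point of the dominator equation by downward iteration.
dominatorsByIteration :: [(Int, Int)] -> M.Map Int [Int]
dominatorsByIteration edges = fmap S.toList (go start)
  where
    predMap :: M.Map Int [Int]
    predMap = M.fromListWith (++) $
      [ (v1, [])   | (v1, _)  <- edges ] ++
      [ (v2, [v1]) | (v1, v2) <- edges ]

    allNodes = M.keysSet predMap

    -- Start at the top: every node is "dominated" by all nodes.
    start = M.map (const allNodes) predMap

    bigIntersection [] = S.empty
    bigIntersection xs = foldr1 S.intersection xs

    -- One application of the equation to the current approximation.
    step doms = M.mapWithKey
      (\v ps -> S.insert v (bigIntersection [ doms M.! p | p <- ps ]))
      predMap

    -- Iterate until the approximation is stable.
    go doms
      | doms' == doms = doms
      | otherwise     = go doms'
      where doms' = step doms
```

Starting from the empty set instead and growing upwards would give the least solution, which is the strange-looking result from above.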

What now? Until the library has code for RDualSet a, we can work around this by using the dual formula to calculate the non-dominators. To do this, we

  • use union instead of intersection,
  • use delete instead of insert,
  • instead of S.empty, use the set of all nodes (which requires some extra plumbing),
  • subtract the result from the set of all nodes to get the dominators

and thus the code turns into:

And with this, now we do get the correct result:

ghci> domintors3 [(1,2),(1,3),(2,4),(3,4),(4,3)]
fromList [(1,[1]),(2,[1,2]),(3,[1,3]),(4,[1,4])]

We worked a little bit on how to express the “beautiful formula” in Haskell, but at no point did we have to think about how to solve it. To me, this is the essence of declarative programming.

by Joachim Breitner at September 15, 2022 08:27 AM

Tweag I/O

How to keep a Bazel project hermetic?

A build is hermetic if it is not affected by details of the environment where it is performed. Hermeticity is a prerequisite for generally desirable features like remote caching and remote execution. While certain build systems, such as Nix, impose hermeticity through their design, others rely on their users to do the extra work and be vigilant to get it. Bazel enforces hermeticity to some extent, for example through sandboxing, but is less strict about it than Nix. In this post I’m going to try to enumerate most ways in which hermeticity of a Bazel project can be compromised.

Execution strategy

One source of inhermeticity is the file system. If tools, such as compilers, are invoked in a way that does not limit their access to contents of the file system, the output of these tools can be influenced by extraneous files that might be present during the build. One example could be include files in languages like C or C++. Imagine a shared machine that is used to perform builds with different configurations. One build might generate some header files and place them in a directory that might later be specified as an include directory in a compiler invocation performed by another build. If a generated header file happens to have the right name, it can shadow the correct header file and lead to a build failure that is hard to reproduce and understand. This is not a hypothetical example, but a real problem our client once struggled with. This is why it is important to always use some form of sandbox for your build actions. Sandboxing also guarantees that all build inputs are declared correctly, because otherwise the input files will simply not be available.

The use of a sandbox is controlled by choosing an execution strategy. The following execution strategies are available:

  • local (or standalone, which is the same but deprecated) causes commands to be executed as local subprocesses without sandboxing.
  • sandboxed causes commands to be executed inside a sandbox on the local machine.
  • worker causes commands to be executed using a persistent worker, if available.
  • docker causes commands to be executed inside a docker sandbox on the local machine.
  • remote causes commands to be executed remotely; this is only available if a remote executor has been configured separately.

These are set with --spawn_strategy and --strategy flags.

Without going into details of all the strategies mentioned, it must be noted that local should be avoided if the build is to stay hermetic. In addition to the strategy flags there are several ways to choose local execution:

It should also be noted that, as of this writing, Windows has no support for sandboxing. Therefore build hermeticity on Windows cannot be enforced at that level.

With persistent workers

Another pitfall is related to the worker strategy. While using persistent workers can have performance benefits, these workers will not use sandboxed execution by default. It must be enabled manually by using the --worker_sandboxing flag.


Environment variables can also be a source of inhermeticity. There are many ways to inherit the environment of the machine that executes the build:

Whenever the environment of the host machine is inherited, it becomes an input to the respective build actions. Since it is very hard to ensure identical environments on different machines, especially developer machines, features like remote caching have no chance to work.


While most modern Bazel rules will provide a way to pin the toolchain that is used for the build, others will default to simply picking up binaries from the PATH. Nothing prevents these binaries from varying from machine to machine. The built-in C and C++ rules are notorious for this kind of behavior. It is worth paying attention to what kind of rules you are using and what their guarantees with respect to hermeticity and reproducibility are.

Workspace status

Not a bug, but a feature: workspace status is in the gray area with respect to hermeticity. Activated by the --workspace_status_command command line option, it allows users to call an arbitrary program before the build begins and then use its output to stamp build results (e.g. the status command could return a git commit hash or a time stamp). If an action directly depends on the output of the status command, typically stored as bazel-out/stable-status.txt, then it will likely be invalidated and rebuilt more often than intended and not benefit much from remote caching. Extra care must be exercised so as to pick only relevant bits of information from stable-status.txt, put them in a separate file, and depend on that file only when truly necessary.

Other things to watch for

Unfortunately, there is always a new way to shoot yourself in the foot. Here are some examples:

  • Repository rules can execute arbitrary code outside of the sandbox, so they can potentially break hermeticity. For example, pip_install or npm_install may build native components with whichever compiler is in PATH, linking against whichever system libraries are found. Possible solutions are avoiding such dependencies, importing them in a reproducible way (for example through rules_nixpkgs), or carefully controlling the environment during fetch.
  • Performing any non-deterministic actions. Creating archives (zip, tar, etc.) is a good example: the order of directory listings as well as timestamps are usually non-deterministic. The reproducible-builds project is a great resource to learn about these issues and how to circumvent them.

Detecting hermeticity issues

In general, detecting hermeticity issues is hard. The best strategy, it seems, is to attempt building your project in different environments and have Bazel write execlogs. An execlog is the ground truth about what is going on during the build. This page about troubleshooting remote cache hits describes how to make Bazel write execlogs. Let’s summarize it:

  1. Execute bazel clean in order to force the subsequent build command to perform all necessary actions so that they end up in the execlog.
  2. Execute bazel build //your:target --execution_log_binary_file=/tmp/exec1.log. This will produce a binary execution log.
  3. Re-run the build (preceding it with a bazel clean invocation) in a different environment or even in the same environment if there is a reason to suspect that something could change between two runs in the same environment.
  4. Compare execution logs following the instructions from this section. The procedure involves building a special parser that can convert binary execlogs produced by Bazel into text and then diffing the obtained text files with a tool like diff. Differences found in this way will reveal sources of inhermeticity.

With this approach the main question becomes “how to choose the environments in which builds are performed so as to detect all hermeticity issues.” There is no answer to this question that works in all cases. Varying host name and user name might catch some problems, while others may only reveal themselves in specific circumstances. If you already know what might be a source of potential problems, that can help with choosing the right build environments for these tests. From a pragmatic point of view, choosing environments that are already typically used to perform builds (remote workers, build agents, local developer machines) is probably a good first step.


It is likely true that virtually all users of Bazel wish their builds to be hermetic. This blog post summarizes most ways in which hermeticity can be violated and provides some suggestions about how to avoid the common pitfalls and debug hermeticity issues.

September 15, 2022 12:00 AM

September 14, 2022

Joachim Breitner

rec-def: Program analysis case study

At this week’s International Conference on Functional Programming I showed my rec-def Haskell library to a few people. As this crowd appreciates writing compilers, an example from the realm of program analysis is quite compelling.

To Throw or not to throw

Here is our little toy language to analyze: It has variables, lambdas and applications, non-recursive (lazy) let bindings and, so that we have something to analyze, a way to throw and to catch exceptions:

Given such an expression, we would like to know whether it might throw an exception. Such an analysis is easy to write: We traverse the syntax tree, remembering in the env which of the variables may throw an exception:

The most interesting case is the one for Let, where we extend the environment env with the information about the additional variable env_bind, which is calculated from analyzing the right-hand side e1.
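A minimal sketch of the language and the analysis described above. The constructors Var, Lam, App, Let and Throw appear in the examples in this post; the constructor name Catch, the type synonym Name, and the choice to treat unbound variables as non-throwing are assumptions made for this sketch:

```haskell
import qualified Data.Map as M

type Name = String

-- Toy language; the constructor name Catch is an assumption.
data Exp
  = Var Name
  | Lam Name Exp
  | App Exp Exp
  | Let Name Exp Exp
  | Throw
  | Catch Exp

-- | May evaluating the expression throw an exception?
canThrow1 :: Exp -> Bool
canThrow1 = go M.empty
  where
    -- env records, for each variable in scope, whether using it may throw.
    go :: M.Map Name Bool -> Exp -> Bool
    go env (Var v)       = M.findWithDefault False v env
    go env (Lam v e)     = go (M.insert v False env) e
    go env (App e1 e2)   = go env e1 || go env e2
    go env (Let v e1 e2) = go env' e2
      where
        -- Analyze the right-hand side, then extend the environment.
        env_bind = M.singleton v (go env e1)
        env'     = M.union env_bind env
    go _   Throw         = True
    go _   (Catch _)     = False

someVal :: Exp
someVal = Lam "y" (Var "y")
```

Under this sketch, Throw analyzes to True, Let "x" Throw someVal to False (the throwing variable is never used), and Let "x" Throw (App (Var "x") someVal) to True.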

So far so good:

ghci> someVal = Lam "y" (Var "y")
ghci> canThrow1 $ Throw
ghci> canThrow1 $ Let "x" Throw someVal
ghci> canThrow1 $ Let "x" Throw (App (Var "x") someVal)

Let it rec

To spice things up, let us add a recursive let to the language:

How can we support this new constructor in canThrow1? Let us naively follow the pattern used for Let: Calculate the analysis information for the variables in env_bind, extend the environment with that, and pass it down:

Note that, crucially, we use env', and not just env, when analyzing the right-hand sides. It has to be that way, as all the variables are in scope in all the right-hand sides.

In a strict language, such a mutually recursive definition, where env_bind uses env' which uses env_bind is basically unthinkable. But in a lazy language like Haskell, it might just work.
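The knot-tying definition discussed here might look as follows. This is a self-contained sketch: as before, the constructor name Catch and the handling of unbound variables are assumptions, and it relies on the lazy Data.Map so that the values in env_bind are only computed on demand.

```haskell
import qualified Data.Map as M

type Name = String

-- Toy language with recursive let; Catch is an assumed constructor name.
data Exp
  = Var Name
  | Lam Name Exp
  | App Exp Exp
  | Let Name Exp Exp
  | LetRec [(Name, Exp)] Exp
  | Throw
  | Catch Exp

canThrow1 :: Exp -> Bool
canThrow1 = go M.empty
  where
    go :: M.Map Name Bool -> Exp -> Bool
    go env (Var v)          = M.findWithDefault False v env
    go env (Lam v e)        = go (M.insert v False env) e
    go env (App e1 e2)      = go env e1 || go env e2
    go env (Let v e1 e2)    = go (M.insert v (go env e1) env) e2
    go env (LetRec binds e) = go env' e
      where
        -- env_bind is defined in terms of env', and env' in terms of
        -- env_bind; laziness keeps this from looping immediately.
        env_bind = M.fromList [ (v, go env' rhs) | (v, rhs) <- binds ]
        env'     = M.union env_bind env
    go _   Throw            = True
    go _   (Catch _)        = False

someVal :: Exp
someVal = Lam "y" (Var "y")
```

The keys of env_bind do not depend on env', so the map can be built; only the Bool values are tied back into the knot, and they are forced when a variable is actually looked up.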

Unfortunately, it works only as long as the recursive bindings are not actually recursive, or if they are recursive, they are not used:

ghci> canThrow1 $ LetRec [("x", Throw)] (Var "x")
ghci> canThrow1 $ LetRec [("x", App (Var "y") someVal), ("y", Throw)] (Var "x")
ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "y")

But with genuine recursion, it does not work, and simply goes into a recursive cycle:

ghci> canThrow1 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")

That is disappointing! Do we really have to toss that code and somehow do an explicit fixed-point calculation here? Obscuring our nice declarative code? And possibly having to repeat work (such as traversing the syntax tree) many times that we should only have to do once?

rec-def to the rescue

Not with rec-def! Using RBool from Data.Recursive.Bool instead of Bool, we can write the exact same code, as follows:

And it works!

ghci> canThrow2 $ LetRec [("x", App (Var "x") someVal), ("y", Throw)] (Var "x")
ghci> canThrow2 $ LetRec [("x", App (Var "x") Throw), ("y", Throw)] (Var "x")

I find this much more pleasing than the explicit naive fix-pointing you might do otherwise, where you stabilize the result at each LetRec independently: Not only is all that extra work hidden from the programmer, but now also a single traversal of the syntax tree creates, thanks to the laziness, a graph of RBool values, which are then solved “under the hood”.
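For comparison, the explicit naive fix-pointing mentioned above could be sketched like this, without rec-def. The name canThrowFix is hypothetical and Catch is again an assumed constructor; each LetRec starts with all of its variables marked as non-throwing and re-analyzes the right-hand sides until the result is stable.

```haskell
import qualified Data.Map as M

type Name = String

data Exp
  = Var Name
  | Lam Name Exp
  | App Exp Exp
  | Let Name Exp Exp
  | LetRec [(Name, Exp)] Exp
  | Throw
  | Catch Exp

canThrowFix :: Exp -> Bool
canThrowFix = go M.empty
  where
    go :: M.Map Name Bool -> Exp -> Bool
    go env (Var v)          = M.findWithDefault False v env
    go env (Lam v e)        = go (M.insert v False env) e
    go env (App e1 e2)      = go env e1 || go env e2
    go env (Let v e1 e2)    = go (M.insert v (go env e1) env) e2
    go env (LetRec binds e) = go (M.union (stabilize start) env) e
      where
        -- Least element: no bound variable throws.
        start = M.fromList [ (v, False) | (v, _) <- binds ]
        -- Re-analyze all right-hand sides until nothing changes.
        stabilize cur
          | next == cur = cur
          | otherwise   = stabilize next
          where
            next = M.fromList
              [ (v, go (M.union cur env) rhs) | (v, rhs) <- binds ]
    go _   Throw            = True
    go _   (Catch _)        = False

someVal :: Exp
someVal = Lam "y" (Var "y")
```

The iteration terminates because the analysis is monotone and each variable can only flip from False to True once, but every round traverses the right-hand sides again, which is exactly the repeated work the rec-def version avoids.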

The issue with x=x

There is one downside worth mentioning: canThrow2 fails to produce a result in case we hit x=x:

ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")

This is, after all the syntax tree has been processed and all the map lookups have been resolved, equivalent to

ghci> let x = x in RB.get (x :: RBool)

which also does not work. The rec-def machinery can only kick in if at least one of its functions is used on any such cycle, even if it is just a form of identity (which I have since added to the library):

ghci> idR x = RB.false ||| x
ghci> let x = idR x in getR (x :: R Bool)

And indeed, if I insert a call to idR in the line

then our analyzer will no longer stumble over these nasty recursive equations:

ghci> canThrow2 $ LetRec [("x", Var "x")] (Var "x")

It is a bit disappointing to have to do that, but I do not see a better way yet. I guess the rec-def library expects the programmer to have a similar level of sophistication as other tie-the-knot tricks with laziness (where you also have to ensure that your definitions are productive and that the sharing is not accidentally lost).

by Joachim Breitner at September 14, 2022 09:53 PM

September 13, 2022

Mark Jason Dominus

Coat of arms of Zeppelin-Wappen

"Can we choose a new coat of arms?"

"What's wrong with the one we have?"

"What's wrong with it? It's nauseated goat!"

"I don't know what your problem is, it's been the family crest for generations."

"Please, I'm begging."

"Okay, how about a compromise: I'll get a new coat of arms, but it must include a reference to the old one."

"I guess I can live with that."

The coat of arms of the Zeppelin family, per Wikimedia Commons. A white goat's head with a nauseated expression, its long red tongue sticking out.  Below this is a blue and white cloth, a full-face helmet, and, at the bottom, a blue kite shield with the same picture of the same nauseated goat.

[ Source: Wikipedia ]

by Mark Dominus at September 13, 2022 02:12 AM

Tweag I/O

Construction and analysis of the build and runtime dependency graph of nixpkgs

During my internship under the mentorship of Guillaume Desforges at Tweag, I worked on creating and analyzing a graph of the contents of Nixpkgs. I received help and advice from many talented colleagues on Tweag’s Nix Slack channel during this process.

Nix is a build system and package manager with a focus on reproducible, declarative and reliable packages. Nixpkgs is an enormous collection of software packages that can be installed with the Nix package manager. Due to the way Nix works, all packages must define precisely all of their dependencies (their dependency closure) down to the operating system’s kernel. This rigor-by-design with respect to dependencies is what makes Nix packages highly reproducible, and as a side effect, it gives us a fantastic dataset: the full dependency network that the more than 80000 packages in this collection form.

The interdependence between software packages forms a very complex network. Seemingly insignificant programs maintained by only a handful of people may be the pillar of applications we use every day. Looking at the dependency graph of software allows us to identify such libraries or programs. As a software collection that is also used to build a complete operating system called NixOS, Nixpkgs is not limited to a single ecosystem and contains packages for many languages and programs. Its dependency graph can help us understand the relationship between different software ecosystems and capture some macro features that they have.

Besides, if you plan to contribute to Nixpkgs, knowing some of its basic characteristics in advance can help you get a feeling for how this intricate system works.

In contrast to an earlier blog post that extracted the graph in an ad hoc way directly from Nix’s database of packages, the main purpose of this project is to provide a command line tool that simplifies extracting derivations (Nix’s name for packages) and their dependencies programmatically from Nixpkgs, and injecting them as nodes and edges into a graph database for further examination.

I thus sincerely encourage you to visit the GitHub repository tweag/nixpkgs-graph after reading this post. Now let’s get into it!

How to get nodes and edges from Nixpkgs?

Before we start, let’s see how Nixpkgs can be seen as a graph.

A graph in computer science is a structure made of nodes and edges, where the edges are a set of node pairs. A directed graph (digraph) is a graph in which the edges have a direction, from the first to the second node in an edge’s node pair.


In the context of Nixpkgs, we can interpret nodes as derivations (packages) identified by their name, in the form of <pname>-<version>, where <pname> is a package’s name and <version> its version. In addition, given two derivations A and B, an edge from A to B can be understood as a dependency of A on B, meaning B is either a build-time or runtime dependency of A (in Nix derivations, dependencies are specified in the buildInputs and propagatedBuildInputs attributes). For example, in this interpretation, the Nix derivation chromium has four dependencies, one of them being glib.

How to get a list of all derivations in Nixpkgs?

One of the easiest and most direct ways to get the list of all derivations under Nixpkgs is to use the nix search command:

$ nix search --json nixpkgs

The JSON output is a list of all derivations in Nixpkgs containing each package’s full path in the Nixpkgs collection, its pname (package name), version, and description attributes:

"legacyPackages.aarch64-darwin.zzuf":{"pname":"zzuf","version":"0.15","description":"Transparent application input fuzzer"}

This is the list of nodes, with names and various properties, in the graph that we want to build.

How to list the dependencies of a derivation?

To find the dependency relationships between those nodes, we need to get the dependencies for all derivations from Nixpkgs. To do this, we can go through the full tree of Nixpkgs paths (similar to legacyPackages.aarch64-darwin.zzuf), access each of the derivations and extract their dependencies programmatically. In short, we need to map over Nixpkgs.

Mapping over the full Nixpkgs attribute set

The command nix search is a good start for an initial look at the content of Nixpkgs, but it is not enough for what we need because it doesn’t output dependencies. To go further, we will use the Nix language to inspect Nixpkgs, instead of using Nix’s command-line tools. This will give us more liberty. The key is to correctly understand the structure of Nixpkgs, then obtain the dependencies of derivations under Nixpkgs without building them, because building derivations is quite time consuming and we only need to evaluate them.

In Nix, Nixpkgs is an attribute set (similar to the notion of a dict in Python). Two functions are particularly useful for traversing it: the Nix builtin function mapAttrs and the Nixpkgs function concatMapStrings.

With the help of these two functions, we can iterate through each package in Nixpkgs, get its basic information and dependencies directly and integrate the results into the output. To avoid time-consuming builds, we will use nix-instantiate instead of nix-build. The --eval flag allows the nix-instantiate command to evaluate Nix expressions without instantiating store derivations, which is just what we need.

Then, for each attribute in Nixpkgs, we first check whether it is a derivation using the Nixpkgs library function isDerivation. If so, we extract the information in it with tryEval. We use tryEval because not all derivations can be evaluated; tryEval prevents the program from stopping when the evaluation of some derivation fails. Otherwise, we check whether we can recurse into it (meaning that it is not a derivation but an attribute set that contains derivations, so we need to re-apply the extraction function to this set) or whether it is in the whitelist. If so, we recurse; otherwise we stop. To decide whether or not to recurse, we can rely on the recurseForDerivations and recurseForRelease attributes. However, there are important sets of derivations, such as python3Packages, that are not themselves derivations and have both recurse attributes set to false. Therefore, a whitelist is added for these sets.

Inconsistent result hierarchy due to nested structure

With the above steps we get the pname, version and dependencies of the derivations in Nixpkgs in JSON format. We get something like {n1, n2, {n3, {n4}}}, while what we hope to get is {n1, n2, n3, n4}. All derivations should be in the same level of the JSON file to be readable to other software.

Once again, Nix provides the ace we need: lib.collect. Using collect, we can both flatten a nested structure and select which elements to take. In order to properly filter the packages, we need the previous mapping step to flag the packages that we have evaluated. This can be done for instance by adding a type attribute with value node, and filtering with a function selectNodes = x: (x.type or null) == "node". Then the function collectNodes = pkgs.lib.collect selectNodes will give us all the previously evaluated packages as a flat list.

In the end, we get the result in the following format:

  "buildInputs": "/nix/store/c1pzk30ksbff1x3krxnqzrzzfjazsy3l-gsettings-desktop-schemas-42.0 /nix/store/mmwc0xqwxz2s4j35w7wd329hajzfy2f1-glib-2.72.3-dev /nix/store/64mp60apx1klb14l0205562qsk1nlk39-gtk+3-3.24.34-dev /nix/store/6hdwxlycxjgh8y55gb77i8yqglmfaxkp-adwaita-icon-theme-42.0 ",
  "id": "chromium-103.0.5060.134",
  "package": [
  "pname": "chromium",
  "version": "103.0.5060.134"

Log the graph to NetworkX

For the graph generation and processing, this project uses the NetworkX Python package. NetworkX is a powerful Python package for the creation, manipulation, and study of complex networks. It also has output functions for multiple formats (.csv, .gexf, etc.), which is very helpful for the subsequent analysis of the graph.

Based on the data in json format obtained in the previous section, the generation of the graph consists of the following main steps:

  • Read data and pre-process
  • Create a new graph in networkx.DiGraph format and add nodes and edges to it
  • Complete data
  • Output data

Preprocessing consists mainly of reading data using pandas and splitting buildInputs and propagatedBuildInputs from one single string into a list, in which each item contains only the required id part. In addition, depending on the package set the node belongs to (e.g. pythonPackages), we add a group attribute to it.

Nodes and edges can be added using NetworkX built-in functions. In particular, NetworkX allows us to add various labels to nodes and edges (e.g. a node can contain its id, pname, version, group; an edge can contain the category it belongs to).

Since not all packages in Nixpkgs can be evaluated, some edges point to nodes that were not evaluated. In the NetworkX graph, these nodes only have an id, so a group attribute needs to be added for them. Here, the group attribute of all nodes that cannot be evaluated is set to "nixpkgs".

Finally, we can first export the data in CSV format using pandas. Besides, we can use NetworkX’s built-in functions to output the graph in PNG format, which gives us a general idea of the graph. If we want to go deeper into visualization, NetworkX also allows us to export the graph in GEXF format, which we can then process with Gephi, as well as in GraphML format, which can be read by Neo4j.

Analyze the relationships in Nixpkgs

Now let’s make use of this data.

The command line interface of this project allows the user to customize the version of Nixpkgs to be used. Simply provide the full 40-character SHA-1 hash of a commit and the SHA256 of its tree. The commit of Nixpkgs used in this blog is 481f9b246d200205d8bafab48f3bd1aeb62d775b.

Some basic information

The final directed graph consists of 64205 nodes and 217579 edges. Among them, the top 3 packages with the most direct dependencies are: pleroma-2.4.3: 124, azure-cli-2.34.1: 117, libreoffice- 94. And the most cited 3 nodes are: python3-3.10.6: 7697, texinfo-6.8: 5626, emacs-28.1: 5553. On average, a node has 3.35 direct dependencies. And the longest chain of dependencies in Nixpkgs consists of 41 nodes.

Use Gephi for visualization

Using the GEXF format file provided by default, we can draw the following image with Gephi. As shown below, we set the color of the nodes according to the group they belong to. And the size of each node is nonlinearly and positively related to its in degree (the number of edges coming into a node in a directed graph).

For the layout of the graph, the ForceAtlas2 algorithm is used here. It is a force simulation algorithm that contains both gravitational and repulsive forces. The attractive forces pull the nodes toward their dependencies while the repulsive forces push high-degree nodes away from the nodes around them. Thus we can see that packages belonging to the same ecosystem are clustered together because they have similar dependencies. The nodes with high degree form a blank area around them. In addition, just like the celestial bodies, there is a gravitational force that makes all the nodes clump together to form a circle.


Cycles in the graph

When I first designed the algorithm to calculate the longest chains in Nixpkgs, the algorithm always failed to run. After some analysis, I found that there are some simple cycles in Nixpkgs. Some cycles are of length 1, which means that some derivations have buildInputs or propagatedBuildInputs that contain themselves. There are also some cycles of length 2 or 3. There are six cycles in total:

['gvfs-1.50.2', 'libgdata-0.18.1', 'gnome-online-accounts-3.44.0']
['gvfs-1.50.2', 'gnome-online-accounts-3.44.0']
['pipewire-0.3.51', 'ffmpeg-4.4.2', 'SDL2-2.0.20']

Specifically, to confirm if the error occurred when fetching the Nixpkgs data, I accessed the raw Nixpkgs data:

nix show-derivation nixpkgs#chicken

In the results given by Nix there is the following information:

"/nix/store/1qlyycams6q39ll5r4p1sq57gcvhvgmn-chicken-5.3.0.drv": {
    "env": {
      "buildInputs": "/nix/store/c4ha2dqj3a1jp2dn962wdfq5wqy0gikv-chicken-5.3.0",

This means that cycles do exist in the raw data of Nixpkgs.

How is this possible? If a derivation’s dependencies contained itself, then an infinite loop would occur during the build. That is, building A would require that the environment already contained A. Of course this is not possible.

When checking the full hash-pname-version entry in the store, we can see that it is not actually the same package. The chicken example above shows what we call bootstrapping: Nix first builds an initial version of chicken without chicken in its dependencies, and then uses this initial version to install the final chicken. This is why we see two chicken entries with different hashes. But since our identifiers have the form <pname>-<version>, they are identified as one node, thus forming a cycle.
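How two distinct store paths collapse into one node can be illustrated with a small sketch. It is written in Haskell rather than the Python used by the project, and the function names storePathId and edgesFor are hypothetical; the only assumption is that a store path has the form /nix/store/<hash>-<name>, where the hash itself contains no dash.

```haskell
import Data.List (stripPrefix)

-- | Drop the "/nix/store/" prefix and the hash, keeping only the name part.
storePathId :: String -> Maybe String
storePathId path = do
  rest <- stripPrefix "/nix/store/" path
  case break (== '-') rest of
    (_hash, '-' : name) | not (null name) -> Just name
    _                                     -> Nothing

-- | Edges from a node to every dependency listed in a space-separated
-- buildInputs string; entries that are not store paths are skipped.
edgesFor :: String -> String -> [(String, String)]
edgesFor nodeId buildInputs =
  [ (nodeId, dep) | Just dep <- map storePathId (words buildInputs) ]

-- | An edge whose source and target share the same identifier.
isSelfLoop :: (String, String) -> Bool
isSelfLoop (from, to) = from == to
```

Applied to the chicken derivation shown earlier, edgesFor "chicken-5.3.0" with its buildInputs string yields an edge from chicken-5.3.0 to chicken-5.3.0, because the differing hashes are discarded. Note that this sketch does not strip output suffixes such as -dev.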

Query the graph with Neo4j

Gephi provides us with a nice visualization, but sometimes you may need some precise queries. For this reason, the project also supports exporting the graph to Neo4j.

Neo4j is a tool to manipulate graphs with additional information, such as node and edge labels and properties. More importantly, it allows querying these graphs through a query language called Cypher. It is possible to query the graph by using three main keywords: MATCH, WHERE and RETURN. The first matches nodes and edges by their types and connections, the second checks data properties, and the last returns a result.

First let’s see how to export data from NetworkX to Neo4j. The import file format supported by Neo4j is mainly CSV, which requires reading nodes and edges line by line. But there is a simpler solution: let NetworkX output the graph in GraphML format and then install the APOC plugin for Neo4j to read it. This plugin can be installed on both the desktop and server versions of Neo4j and is very easy to use. If you happen to need to transfer data between NetworkX and Neo4j, you can also use this method.

Now we can start playing with some commands to demonstrate the benefits of Neo4j. For example, if we want to know which groups the Python ecosystem directly relies on most, we can run a query like this:

MATCH (n)-[e]->(m)
| group                 | times |
| "nixpkgs"             | 8966  |
| "gnuradio3_8Packages" | 5409  |
| "xorg"                | 86    |
| "libsForQt5"          | 41    |
| "gst_all_1"           | 18    |
| "gnome2"              | 9     |
| "driversi686Linux"    | 6     |
| "gnome"               | 5     |
| "haskellPackages"     | 4     |
| "llvmPackages"        | 3     |

From the results we can see that Python ecosystem relies mainly on some separate derivations (nixpkgs is the group for separate software). This indicates that Python may rely on many separate programs or libraries, as we mentioned at the very beginning. Secondly Python mainly references Gnuradio (a software development toolkit that provides signal processing blocks to implement software radios). This indicates that Python is widely used for signal processing applications.

To take this a step further, we can compare the differences between Python 3.9 and Python 3.10 replacing the ‘python’ string by ‘python39’ and ‘python310’ in the previous query. The results show that Python’s references to Gnuradio are mainly a Python 3.10 thing.

The above is just a preliminary use of Neo4j for the Nixpkgs graph. But with this example we can see that the Nixpkgs graph is able to show some macro features of the software world that are invisible when we only look at parts of it.


The graph of Nixpkgs allows us, on the one hand, to visualize the interactions between different software ecosystems through Gephi. On the other hand, it allows us to perform precise queries through Neo4j. In addition, with the help of the Python modules NetworkX and pandas, we can obtain many quantitative results, such as the average of 3.35 direct dependencies per node. This project, tweag/nixpkgs-graph, provides users with raw materials and some tools that they can use to explore the graph according to their needs.

In closing, I would like to thank Tweag for giving me this internship opportunity. The value of what I have learned here far outweighs the salary. I would also like to thank all the Tweagers, including my mentor Mr. Guillaume Desforges, who have helped me in this internship program. I hope to have the opportunity to work with you again.

September 13, 2022 12:00 AM

September 10, 2022

Joachim Breitner

rec-def: Behind the scenes

A week ago I wrote about the rec-def Haskell library, which allows you to write more recursive definitions, such as in this small example:

let s1 = RS.insert 23 s2
    s2 = RS.insert 42 s1
in RS.get s1

This will not loop (as it would if you’d just used Data.Set), but rather correctly return the set S.fromList [23,42]. See the previous blog post for more examples and discussion of the user-facing side of this.

For quick reference, these are the types of the functions involved here:

The type of s1 and s2 above is not Set Int, but rather RSet Int, and in this post I’ll explain how RSet works internally.

Propagators, in general

The conceptual model behind a recursive equation like the one above is

  • There are multiple cells that can hold values of an underlying type (here Set)
  • These cells have relations that explain how the values in the cells should relate to each other
  • After registering all the relations, some form of solving happens.
  • If the solving succeeds, we can read off the values from the cells, and they should satisfy the registered relation.

This is sometimes called a propagator network. It is quite a general model: it can support different kinds of relations (e.g. equalities, inequalities, functions), there can be various solving strategies (iterative fixed-points, algebraic solution, unification, etc.), and information can flow along the edges (and hyper-edges), possibly in multiple directions.

For our purposes, we only care about propagator networks where all relations are functional, so they have a single output cell that is declared to be a function of multiple (possibly zero) input cells, without affecting these input cells. Furthermore, every cell is the output of exactly one such relation.

IO-infested propagator interfaces

This suggests that an implementation of such a propagator network could provide an interface with the following three operations:

  • Functions to declare cells
  • Functions to declare relations
  • Functions to read values off cells

This is clearly an imperative interface, so we’ll see monads, and we’ll simply use IO. So concretely for our small example above, we might expect

There is no need for an explicit “solve” function: solving can happen when declareInsert or getCell is called – as a User I do not care about that.

You might be curious about the implementation of newCell, declareInsert and getCell, but I have to disappoint you: This is not the topic of this article. Instead, I want to discuss how to turn this IO-infested interface into the pure interface seen above.

Pure, but too strict

Obviously, we have to get rid of the IO somehow, and have to use unsafePerformIO :: IO a -> a somehow. This dangerous function creates a pure-looking value that, when used the first time, will run the IO-action and turn into that action’s result.

So maybe we can simply write the following:

Indeed, the types line up, but if we try to use that code, nothing will happen. Our insert is too strict to be used recursively: It requires the value of c2 (as it is passed to declareInsert, which we assume to be strict in its arguments) before it can return c1, so the recursive example at the top of this post will not make any progress.

Pure, lazy, but forgetful

To work around this, maybe it suffices if we do not run declareInsert right away, but just remember that we have to do it eventually? So let’s introduce a new data type for RSet a that contains not just the cell (Cell a), but also an action that we still have to run before getting a value:

This is better: insert is now lazy in its arguments (for this it is crucial to pattern-match on RSet only inside the todo code, not in the pattern of insert!) This means that our recursive code above does not get stuck right away.

Pure, lazy, but runs in circles

But it is still pretty bad: Note that we do not run get s2 in the example above, so that cell’s todo, which would declareInsert 42, will never run. This cannot work! We have to (eventually) run the declaration code from all involved cells before we can use getCell!

We can try to run the todo action of all the dependencies as part of a cell’s todo action:

Now we certainly won’t forget to run the second cell’s todo action, so that is good. But that cell’s todo action will run the first cell’s todo action, and that again the second cell’s, and so on.

Pure, lazy, terminating, but not thread safe

This is silly: We only need (and should!) run that code once! So let’s keep track of whether we ran it already:

Ah, much better: It works! Our call to get c1 will trigger the first cell’s todo action, which will mark it as done before calling the second cell’s todo action. When that now invokes the first cell’s todo action, it is already marked done and we break the cycle, and by the time we reach getCell, all relations have been correctly registered.
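As a rough, self-contained sketch of this “run the todo action only once” variant: the toy propagator below (Cell, newCell, grow, declareInsert, getCell) is an assumption made up for illustration, since the post deliberately does not show the real one, and the done flag lives in an IORef created inside insert. Like the version described above, it is not thread-safe.

```haskell
import Control.Monad (unless, when)
import Data.IORef
import Data.Set (Set)
import qualified Data.Set as S
import System.IO.Unsafe (unsafePerformIO)

-- Toy propagator (an assumption, not the real rec-def code): a cell holds
-- the current set plus callbacks that run whenever the set grows.
data Cell a = Cell (IORef (Set a)) (IORef [Set a -> IO ()])

newCell :: IO (Cell a)
newCell = Cell <$> newIORef S.empty <*> newIORef []

-- Merge new elements into a cell; notify watchers only if it actually grew.
grow :: Ord a => Cell a -> Set a -> IO ()
grow (Cell val watchers) new = do
  old <- readIORef val
  let merged = S.union old new
  when (S.size merged > S.size old) $ do
    writeIORef val merged
    readIORef watchers >>= mapM_ ($ merged)

-- declareInsert x c1 c2: the value of c1 ought to be S.insert x (value of c2).
declareInsert :: Ord a => a -> Cell a -> Cell a -> IO ()
declareInsert x c1 (Cell val2 watchers2) = do
  let push s = grow c1 (S.insert x s)
  modifyIORef watchers2 (push :)
  readIORef val2 >>= push

getCell :: Cell a -> IO (Set a)
getCell (Cell val _) = readIORef val

-- A cell together with an action that registers its relation at most once.
data RSet a = RSet (Cell a) (IO ())

insert :: Ord a => a -> RSet a -> RSet a
insert x r2 = unsafePerformIO $ do
  c1 <- newCell
  done <- newIORef False
  let todo = do
        alreadyDone <- readIORef done
        unless alreadyDone $ do
          writeIORef done True     -- mark first: this is what breaks the cycle
          let RSet c2 todo2 = r2   -- r2 is only forced here, inside todo
          declareInsert x c1 c2
          todo2
  return (RSet c1 todo)

get :: RSet a -> Set a
get (RSet c todo) = unsafePerformIO $ do
  todo
  getCell c
```

With these definitions, the recursive example from the top of the post (with s1 and s2 given the monomorphic type RSet Int so that they are shared) should terminate, because each todo action marks itself done before visiting its dependency.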

In a single-threaded world, this would be all good and fine, but we have to worry about multiple threads running get concurrently, on the same or on different cells.

In fact, because we use unsafePerformIO, we have to worry about this even when the program is not using threads.

And the above code has problems. Imagine a second call to get c1 while the first one has already marked it as done, but has not finished processing all the dependencies yet: It will call getCell before all relations are registered, which is bad.

Recursive do-once IO actions

Making this thread-safe seems to be possible, but would clutter both the code and this blog post. So let’s hide that problem behind a nice and clean interface. Maybe there will be a separate blog post about its implementation (let me know if you are curious), or you can inspect the code in the System.IO.RecThunk module yourself. The interface is simply

data Thunk
thunk :: IO [Thunk] -> IO Thunk
force :: Thunk -> IO ()

and the idea is that thunk act will defer the action act until the thunk is passed to force for the first time, and force will not return until the action has been performed (possibly waiting if another thread is doing that at the moment), and also until the actions of all the thunks returned by act have been performed, recursively, without running into cycles.
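To make the contract concrete, here is a minimal single-threaded sketch of the same three names. It is not the implementation from System.IO.RecThunk: it does no locking or waiting at all, so it only illustrates the “run once, then force the dependencies, without looping on cycles” part and is not thread-safe.

```haskell
import Data.IORef

-- A thunk is either still pending (Just act) or already started (Nothing).
newtype Thunk = Thunk (IORef (Maybe (IO [Thunk])))

-- Defer an action; it returns the thunks it depends on.
thunk :: IO [Thunk] -> IO Thunk
thunk act = Thunk <$> newIORef (Just act)

-- Run the deferred action at most once, then force its dependencies.
force :: Thunk -> IO ()
force (Thunk ref) = do
  pending <- readIORef ref
  case pending of
    Nothing  -> pure ()          -- already started: stop here, so cycles end
    Just act -> do
      writeIORef ref Nothing     -- mark before running, like the done flag
      deps <- act
      mapM_ force deps
```

Because the thunk is marked before its action runs, two thunks that list each other as dependencies are each run exactly once.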

We can use this in our definition of RSet and get to the final, working solution:

This snippet captures the essential ideas behind rec-def:

  • Use laziness to allow recursive definitions to describe the propagator graph naturally
  • Use a form of “explicit thunk” to register the propagator graph relations at the right time (not too early/strict, not too late)

And that’s all?

The actual implementation in rec-def has a few more moving parts.

In particular, it tries to support different value types (not just sets), possibly with different implementations, and even mixing them (e.g. in member :: Ord a => a -> RSet a -> RBool), so the generic code is in Data.Propagator.Purify, and supports various propagators underneath. The type RSet is then just a newtype around that, defined in Data.Recursive.Internal to maintain the safety of the abstraction.

I went back and forth on a few variants of the design here, including one where there was a generic R type constructor (R (Set a), R Bool etc.), but the monomorphic interface seems simpler.

Does it really work?

The big remaining question is certainly: Is this really safe and pure? Does it still behave like Haskell?

The answer to these questions certainly depends on the underlying propagator implementation. But it also depends on what we actually mean by “safe and pure”. For example, do we expect the Static Argument Transformation to be semantics preserving? Or is it allowed to turn undefined values into defined ones (as it does here)?

I am unsure myself yet, so I’ll defer this discussion to a separate blog post, after I hopefully had good discussions about this here at ICFP 2022 in Ljubljana. If you are around and want to discuss, please hit me up!

by Joachim Breitner at September 10, 2022 09:08 AM

September 08, 2022

Mark Jason Dominus

Pope Fibonacci

When Albino Luciani was crowned Pope, he chose his papal name by concatenating the names of his two predecessors, John XXIII and Paul VI, to become John Paul. He died shortly after, and was succeeded by Karol Wojtyła who also took the name John Paul. Wojtyła missed a great opportunity to adopt Luciani's strategy. Had he concatenated the names of his predecessors, he would have been Paul John Paul. In this alternative universe his successor, Benedict XVI, would have been John Paul Paul John Paul, and the current pope, Francis, would have been Paul John Paul John Paul Paul John Paul. Each pope would have had a unique name, at the minor cost of having the names increase exponentially in length.
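For the curious, the naming rule can be written down as a lazy list. This is only an illustrative sketch; the names papalNames and nameLengths are made up here. The first two entries are the real seed names, and every later entry is the concatenation of its two predecessors.

```haskell
-- Papal names under the hypothetical "concatenate your two predecessors" rule.
papalNames :: [String]
papalNames = "John" : "Paul" : zipWith joinNames papalNames (tail papalNames)
  where
    joinNames older newer = older ++ " " ++ newer

-- Number of words in each name, which follows the Fibonacci sequence.
nameLengths :: [Int]
nameLengths = map (length . words) papalNames
```

Entries 2 through 5 of papalNames are the four names discussed above, and the word counts begin 1, 1, 2, 3, 5, 8.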

(Now I wonder if any dynasty has ever adopted the less impractical strategy of naming their rulers after binary numerals, say:

King Juan
King Juan Cyril
King Juan Juan
King Juan Cyril Cyril
King Juan Cyril Juan
King Juan Juan Cyril

Or perhaps . There are many variations, some actually reasonable.)

I sometimes fantasize that Philadelphia-area Interstate highways are going to do this. The main east-west highway around here is I-76. (Not so-called, as many imagine, in honor of Philadelphia's role in the American revolution of 1776, but simply because it lies south of I-78 and north of I-74 and I-70.) A connecting segment that branches off of I-76 is known as I-676. Driving to one or the other I often see signs that offer both:

Street view of Philadelphia.  In the center is a lamp post with two red and blue Interstate highway shields affixed, indicating that entrances to I-76 West and I-676 East are ahead.

I often fantasize that this is a single sign for I-76676, and that this implies an infinite sequence of highways designated I-67676676, I-7667667676676, and so on.

Finally, I should mention the cleverly-named fibonacci salad, which you make by combining the leftovers from yesterday's salad and the previous day's.

[ Addendum: Do you know the name of the swagman in Waltzing Matilda? It's Juan. The song says so: “Juan's a jolly swagman…” ]

by Mark Dominus at September 08, 2022 02:41 PM


Lockstep-style testing with quickcheck-dynamic

Recently IOG and QuviQ released a new library for testing stateful systems called quickcheck-dynamic. In this blog post we will take a look at this library, and how it relates to quickcheck-state-machine. We will focus on the state machine testing aspect; quickcheck-dynamic also has support for dynamic logic, but we will not discuss that here.

Specifically, we will consider how we might do lockstep-style testing with quickcheck-dynamic. This is a particular approach to testing that we described in great detail in an earlier blog post, An in-depth look at quickcheck-state-machine. We will recap the general philosophy in this new blog post, but we will focus here on the hows, not necessarily the whys; it might be helpful to be familiar with the previous blog post to understand the larger context of what we’re trying to achieve.

We have developed a library called quickcheck-lockstep which builds on top of quickcheck-dynamic to provide an abstraction called InLockstep which provides support for lockstep-style testing. In this blog post we will describe this library in two parts:

  1. In the first half we will show a test author’s perspective of how to use the abstraction.
  2. In the second half we show how we can implement the abstraction on top of quickcheck-dynamic.

Part one will suffice for users who simply want to use quickcheck-lockstep. Part two serves two purposes:

  • It will give an illustrated example of how to use quickcheck-dynamic for state based testing. We will use most of the core features of the library to implement our abstraction on top of it.
  • Since the goal is to provide the end user with a very similar style of testing that we previously provided for quickcheck-state-machine (see specifically Test.StateMachine.Lockstep.NAry), the implementation will serve as a good testbed for comparing the two libraries.

NOTE: quickcheck-lockstep currently depends on an as-yet unreleased version of quickcheck-dynamic. Once this is released, we will also make a Hackage release of quickcheck-lockstep; at the moment, please refer to the GitHub repository instead. The example that we discuss in part 1 is also available in that repository, as an example use case.

Part 1: Lockstep-style testing

In this section we will show how we can do lockstep-style testing using a new abstraction called InLockstep. In Part 2 we will see how we can implement this new abstraction.

Testing philosophy

Lockstep-style testing of stateful systems is quite simple:

  • We have a stateful API that we want to test; this could be a database, a file system, etc.
  • We will reify that stateful API as a datatype with constructors for each of the API calls.
  • We then write two interpreters for this API: one against the system we want to test, and one against a model.
  • We regard the system as a black box: we cannot see the internal state of the database, the contents of the file system, etc. The only thing we can see is the results of the API calls.
  • Here is why we call this lockstep testing: to test the system, we generate an arbitrary sequence of commands, then execute those against the system under test and against the model. The only thing we check at each point is that both systems return the same results, modulo observability.
  • We cannot insist on exactly the same results: for example, opening a file might result in a file handle, which the model cannot reproduce. The model must be allowed to have its own type for “model handles” that is different from real handles, and we do not want to try and compare those to real handles. If the system somehow returns the “wrong” handle, then this will become evident later in the test when we use that handle.
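The philosophy above can be sketched in a few lines without any testing library. Everything in this block is hypothetical and chosen only for illustration: a tiny stack API (Cmd), an IORef-backed “system under test”, a pure model, and a lockstep function that runs a given command list against both and reports the first disagreement. A real test would generate the commands randomly, and the results here are directly comparable, so the “modulo observability” step is trivial.

```haskell
import Data.IORef

-- Reified API of a hypothetical stack service.
data Cmd = Push Int | Pop
  deriving (Eq, Show)

-- What a caller can observe from either implementation.
type Obs = Maybe Int

-- Interpreter 1: the "system under test", with hidden mutable state.
runSystem :: IORef [Int] -> Cmd -> IO Obs
runSystem ref (Push x) = modifyIORef ref (x :) >> pure Nothing
runSystem ref Pop = do
  xs <- readIORef ref
  case xs of
    []       -> pure Nothing
    (y : ys) -> writeIORef ref ys >> pure (Just y)

-- Interpreter 2: the pure model.
runModel :: Cmd -> [Int] -> (Obs, [Int])
runModel (Push x) xs       = (Nothing, x : xs)
runModel Pop      []       = (Nothing, [])
runModel Pop      (y : ys) = (Just y, ys)

-- Execute the commands in lockstep, comparing only the observable results.
-- Returns the first command on which system and model disagree, if any,
-- together with the system's result and the model's result.
lockstep :: [Cmd] -> IO (Maybe (Cmd, Obs, Obs))
lockstep cmds = newIORef [] >>= \ref -> go ref [] cmds
  where
    go _ _ [] = pure Nothing
    go ref model (c : cs) = do
      real <- runSystem ref c
      let (expected, model') = runModel c model
      if real == expected
        then go ref model' cs
        else pure (Just (c, real, expected))
```

Note that lockstep never inspects the IORef directly: the system stays a black box, and only the results of the API calls are compared.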

Running example

Our running example will be a file system: it will be precisely the same example we used previously when discussing quickcheck-state-machine: same API, same model, same properties we want to test, same considerations regarding labelling tests and shrinking them—but a different testing framework. If you want to follow along, the code is available on GitHub.

The model is a simple model for a file system. It consists of the following functions:

mMkDir :: Dir               -> Mock -> (Either Err ()      , Mock)
mOpen  :: File              -> Mock -> (Either Err MHandle , Mock)
mWrite :: MHandle -> String -> Mock -> (Either Err ()      , Mock)
mClose :: MHandle           -> Mock -> (Either Err ()      , Mock)
mRead  :: File              -> Mock -> (Either Err String  , Mock)

StateModel implementation

StateModel is the central class in quickcheck-dynamic for stateful testing. Instances of StateModel define the datatype that describes the API, how to generate values of that datatype, how to interpret it, etc. When using the InLockstep infrastructure however, we only define the API datatype; everything else is delegated to InLockstep.

We will define the type of our model as

data FsState = FsState Mock Stats

initState :: FsState
initState = FsState Mock.emptyMock initStats

Here, Mock is the mock file system implementation, and Stats keeps some statistics about the running test. We will see why we need these statistics when we discuss labelling.

Let’s now define two type synonyms. First, one for the type of actions:

type FsAct a = Action (Lockstep FsState) (Either Err a)

Here, Action is the associated data type from StateModel, and Lockstep is an opaque datatype from the lockstep infrastructure. All our actions can return errors, and we want to make sure that the model and the real system agree on what those errors are. So, the result of an FsAct is always of the form Either Err a, where Err is also defined in the model.

Secondly, the type of variables:

type FsVar a = ModelVar FsState a

Variables are an essential part of stateful testing: a variable allows us to refer back to the result of a previously executed command. For example, if we want to write to a file, we need to generate an action that says “write this string to the handle that you got when you opened that file a while ago.” ModelVars are a special kind of variable provided by the lockstep infrastructure; we will discuss them in more detail later.

We can now give the StateModel instance:

type FsVar a = ModelVar FsState a
type FsAct a = Action (Lockstep FsState) (Either Err a)

instance StateModel (Lockstep FsState) where
  data Action (Lockstep FsState) a where
    MkDir :: Dir                        -> FsAct ()
    Open  :: File                       -> FsAct (IO.Handle, File)
    Write :: FsVar IO.Handle  -> String -> FsAct ()
    Close :: FsVar IO.Handle            -> FsAct ()
    Read  :: Either (FsVar File) File   -> FsAct String

  initialState    = Lockstep.initialState initState
  nextState       = Lockstep.nextState
  precondition    = Lockstep.precondition
  arbitraryAction = Lockstep.arbitraryAction
  shrinkAction    = Lockstep.shrinkAction

Some comments:

  • Write and Close both take a variable to a handle, rather than an actual handle. This is what enables us to refer to the handles that we got from previous commands.
  • In both cases, the type of that variable is FsVar IO.Handle, but the model implementation requires mock handles instead; we will see how that is resolved in the next section when we discuss relating results from the real system to model results.
  • Open returns the file path of the file it just opened along with the handle, and Read takes either a concrete file path as an argument or a variable to such a file path (e.g., one that might have been returned by Open). This allows us to express “read from the same file that you opened previously in the test”; see the section on Dependencies between commands from the previous post for why this can lead to better (more minimal) counter examples.
  • The lockstep infrastructure provides default implementations for the methods of StateModel. In many cases you can just use them as-is, like we did here, but of course you don’t have to. For example, the default precondition isn’t always strong enough.

From real results to model results

When we open a file in the real file system, we get an IO.Handle, or possibly an exception. In the model however we have

mOpen :: File -> Mock -> (Either Err MHandle, Mock)

We can map the exception to an Err, so that’s not a problem, but we cannot map an IO.Handle to an MHandle or vice versa: we want to allow the model to return something of a different type here.

The Action datatype from quickcheck-dynamic is a GADT, where the type index describes the result of the action. For example, consider this method from the StateModel class:

postcondition :: (state, state) -> Action state a -> LookUp m -> Realized m a -> m Bool

This method is the check that quickcheck-dynamic does after every action. It has the following parameters:

  1. The before and after state of the model
  2. The action that was executed
  3. A way to look up the values of any variables in those actions
  4. The result of the action in the system under test

The type of the result is Realized m a; this is an abstraction introduced in quickcheck-dynamic 2.0 which allows the same tests to be run with different test execution backends; for example, we might run our tests in the real IO monad, or in an IO monad simulator. This is orthogonal to the abstractions provided by InLockstep: no matter the test execution backend, we will always run against the same model. For our purposes (and this will be true for most lockstep-style tests1), we will exclusively run our tests in ReaderT r IO, where quickcheck-dynamic already defines for us that

Realized (ReaderT r IO) a = a

So for the purposes of this blogpost, whenever you see Realized m a, you can translate that to simply a in your head.

In lockstep-style testing, we want to compare that result of type a to the response from the model but, as we saw, the model might return something of a slightly different type. The InLockstep class therefore introduces an associated data type called ModelValue; the idea is that whenever the system under test returns something of type a (technically, Realized m a), we expect the model to return a result of type ModelValue a.

As before, we will define a type synonym:

type FsVal a = ModelValue FsState a

Here’s the definition for FsState:

instance InLockstep FsState where
  data ModelValue FsState a where
    MHandle :: Mock.MHandle -> FsVal IO.Handle

    -- Rest is regular:

    MErr    :: Err    -> FsVal Err
    MFile   :: File   -> FsVal File
    MString :: String -> FsVal String
    MUnit   :: ()     -> FsVal ()

    MEither :: Either (FsVal a) (FsVal b) -> FsVal (Either a b)
    MPair   :: (FsVal a, FsVal b)         -> FsVal (a, b)

  -- .. other members of InLockstep elided

We see that an FsVal a is just a wrapper around an a, unless that a is an IO.Handle in which case FsVal IO.Handle instead wraps a Mock.MHandle.

Recall that we defined

type FsVar a = ModelVar FsState a

We can now be more precise: a ModelVar s a is a variable to a ModelValue s a.

Comparing values

ModelValue allows the model to return something of a different type than the implementation, but when we compare the two, we need something of the same type.2 InLockstep therefore defines a second associated type Observable, which is the observable result. The definition is similar but a bit simpler:

type FsObs a = Observable FsState a

instance InLockstep FsState where
  data Observable FsState a where
    OHandle :: FsObs IO.Handle
    OId     :: (..) => a -> FsObs a
    OEither :: Either (FsObs a) (FsObs b) -> FsObs (Either a b)
    OPair   :: (FsObs a, FsObs b) -> FsObs (a, b)

  -- .. other members of InLockstep elided

This follows a similar structure as ModelValue, with two differences:

  • In the case of a handle, we don’t observe anything. If the system (or the model) returns the wrong handle, we cannot notice this when we open a file; we will only notice it later when we try to read from that file.
  • In the case of ModelValue, we need a guarantee that if we have a value of type FsVal IO.Handle, it really is a Mock.MHandle. We do not need that guarantee for Observable, and so it suffices to define a single constructor OId that can be used for any type at all where the model and the system have a result of the same type.

We also have to explain how to translate from mock results to observable results:

instance InLockstep FsState where
  observeModel :: FsVal a -> FsObs a
  observeModel = \case
      MHandle _ -> OHandle
      MErr    x -> OId x
      MString x -> OId x
      MUnit   x -> OId x
      MFile   x -> OId x
      MEither x -> OEither $ bimap observeModel observeModel x
      MPair   x -> OPair   $ bimap observeModel observeModel x

  -- .. other members of InLockstep elided

We have to do the same for results from the system under test, but we will see that when we discuss actually running the tests. This is a bit of boilerplate, but not difficult to write.
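The relationship between model values and observable values can be tried out in isolation. The following is a cut-down sketch, not the quickcheck-lockstep API: RealHandle and MockHandle are stand-ins for IO.Handle and Mock.MHandle, Val and Obs are plain GADTs rather than associated data types, the constraint on OId is assumed to be just Eq (the post elides the real one), and sameObs is a hypothetical helper that compares two observations.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE LambdaCase #-}

import Data.Bifunctor (bimap)

-- Stand-ins (assumptions) for IO.Handle and Mock.MHandle.
data RealHandle = RealHandle
newtype MockHandle = MockHandle Int

-- Model values: indexed by the *real* result type, but handles wrap mock handles.
data Val a where
  VHandle :: MockHandle -> Val RealHandle
  VString :: String -> Val String
  VUnit   :: () -> Val ()
  VEither :: Either (Val a) (Val b) -> Val (Either a b)
  VPair   :: (Val a, Val b) -> Val (a, b)

-- Observable values: handles carry no information at all.
data Obs a where
  OHandle :: Obs RealHandle
  OId     :: Eq a => a -> Obs a
  OEither :: Either (Obs a) (Obs b) -> Obs (Either a b)
  OPair   :: (Obs a, Obs b) -> Obs (a, b)

-- Translate a model value to what can be observed about it.
observeModel :: Val a -> Obs a
observeModel = \case
  VHandle _ -> OHandle
  VString s -> OId s
  VUnit u   -> OId u
  VEither e -> OEither (bimap observeModel observeModel e)
  VPair p   -> OPair (bimap observeModel observeModel p)

-- Compare two observations of the same type; any two handles are "equal".
sameObs :: Obs a -> Obs a -> Bool
sameObs OHandle OHandle = True
sameObs (OId x) (OId y) = x == y
sameObs (OEither (Left x)) (OEither (Left y)) = sameObs x y
sameObs (OEither (Right x)) (OEither (Right y)) = sameObs x y
sameObs (OPair (a, b)) (OPair (c, d)) = sameObs a c && sameObs b d
sameObs _ _ = False
```

In this sketch an observation of a successful open compares equal to any other with the same file name, whatever the handles were, which matches the idea that a wrong handle only shows up later in the test.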

Interpreter for the model

We can now write the interpreter for the model: a function that takes a value from our reified API, calls the corresponding functions from the model, and then wraps the result in the appropriate constructors of ModelValue:

runMock ::
     ModelLookUp FsState
  -> Action (Lockstep FsState) a
  -> Mock -> (FsVal a, Mock)
runMock lookUp = \case
    MkDir d   -> wrap MUnit     . Mock.mMkDir d
    Open f    -> wrap (mOpen f) . Mock.mOpen f
    Write h s -> wrap MUnit     . Mock.mWrite (getHandle $ lookUp h) s
    Close h   -> wrap MUnit     . Mock.mClose (getHandle $ lookUp h)
    Read f    -> wrap MString   . Mock.mRead (either (getFile . lookUp) id f)
  where
    wrap :: (a -> FsVal b) -> (Either Err a, Mock) -> (FsVal (Either Err b), Mock)
    wrap f = first (MEither . bimap MErr f)

    mOpen :: File -> Mock.MHandle -> FsVal (IO.Handle, File)
    mOpen f h = MPair (MHandle h, MFile f)

    getHandle :: ModelValue FsState IO.Handle -> Mock.MHandle
    getFile   :: ModelValue FsState File      -> File

    getHandle (MHandle h) = h
    getFile   (MFile   f) = f

The only slightly non-trivial thing here is that when we encounter a command with variables, we need to resolve those variables. InLockstep gives us a function of type ModelLookUp FsState, which allows us to resolve any variable we see (the default InLockstep precondition guarantees that this resolution must always succeed). The result of looking up a variable of type a will be a value of type FsVal a; we then need to match on that to extract the wrapped value. In getHandle we see why it’s so important that a FsVal IO.Handle must contain a mock handle, rather than an IO.Handle.

With the interpreter defined, we can complete the next method definition of InLockstep:

instance InLockstep FsState where
  modelNextState :: forall a.
       LockstepAction FsState a
    -> ModelLookUp FsState
    -> FsState -> (FsVal a, FsState)
  modelNextState action lookUp (FsState mock stats) =
      auxStats $ runMock lookUp action mock
    where
      auxStats :: (FsVal a, Mock) -> (FsVal a, FsState)
      auxStats (result, state') =
          (result, FsState state' $ updateStats action result stats)

  -- .. other members of InLockstep elided

All we do here is call the interpreter we just wrote, and then additionally update the statistics (discussed below).

Variables


As discussed above, variables allow us to refer back to the results of previously executed commands. We have been glossing over an important detail, however. Recall the types of Open and Close (with the FsAct type synonym expanded):

Open  :: File            -> Action .. (Either Err (IO.Handle, File))
Close :: FsVar IO.Handle -> Action .. (Either Err ())

The result of opening a file is either an error, or else a pair of a handle and a filepath. In quickcheck-dynamic, we get a single variable for the execution of each command, and this is (therefore) true also for the lockstep infrastructure.3 So, after opening a file, we have a variable of type Either Err (IO.Handle, File), but we don’t want a variable of that type as the argument to Close: instead, we want a variable to an IO.Handle. Most importantly, we want to rule out the possibility of trying to close a file that we never managed to open in the first place; such a test would be nonsensical.4

One of the most important features that the lockstep infrastructure adds on top of core quickcheck-dynamic is a concept of variables with a Functor-esque structure: they support an operation that allows us to change the type of that variable.5 The key datatype is a “generalized variable” GVar; the intuition is that a GVar of type y is actually a Var of some other type x, bundled with a function6 from x -> Maybe y:

data GVar :: Type -> Type where -- not the real definition
  GVar :: Typeable x => Var x -> (x -> Maybe y) -> GVar y

For technical reasons,7 this doesn’t quite work. Instead of that function x -> Maybe y, we instead have essentially a small DSL for defining such functions:

data Op a b where
  OpId    :: Op a a
  OpFst   :: Op (a, b) a
  OpSnd   :: Op (b, a) a
  OpLeft  :: Op (Either a b) a
  OpRight :: Op (Either b a) a
  OpComp  :: Op b c -> Op a b -> Op a c

This DSL can be used to extract the left or right coordinate of a pair, as well as to pattern match on an Either. This will suffice for many test cases but not all, so GVar generalizes over the exact choice of DSL:

data GVar op f where
  GVar :: Typeable x => Var x -> op x y -> GVar op y

InLockstep has an associated type family ModelOp which records the choice of DSL. It defaults to Op, which is just fine for our running example. We do have to specify how to execute this DSL against model values by giving an instance of this class:

class Operation op => InterpretOp op f where
  intOp :: op a b -> f a -> Maybe (f b)

The instance for our FsVal is straightforward:

instance InterpretOp Op (ModelValue FsState) where
  intOp OpId         = Just
  intOp OpFst        = \case MPair   x -> Just (fst x)
  intOp OpSnd        = \case MPair   x -> Just (snd x)
  intOp OpLeft       = \case MEither x -> either Just (const Nothing) x
  intOp OpRight      = \case MEither x -> either (const Nothing) Just x
  intOp (OpComp g f) = intOp g <=< intOp f
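To see the DSL at work outside the lockstep machinery, here is a small sketch that interprets the same Op constructors over plain Haskell values instead of model values. The names applyOp and openedHandle are hypothetical; the real library goes through InterpretOp and intOp as shown above.

```haskell
{-# LANGUAGE GADTs #-}

import Control.Monad ((<=<))

data Op a b where
  OpId    :: Op a a
  OpFst   :: Op (a, b) a
  OpSnd   :: Op (b, a) a
  OpLeft  :: Op (Either a b) a
  OpRight :: Op (Either b a) a
  OpComp  :: Op b c -> Op a b -> Op a c

-- Interpret an Op as a partial function on ordinary values.
applyOp :: Op a b -> a -> Maybe b
applyOp OpId         = Just
applyOp OpFst        = Just . fst
applyOp OpSnd        = Just . snd
applyOp OpLeft       = either Just (const Nothing)
applyOp OpRight      = either (const Nothing) Just
applyOp (OpComp g f) = applyOp g <=< applyOp f

-- Project the handle out of the result of an "open": fails on an error result.
openedHandle :: Op (Either err (handle, file)) handle
openedHandle = OpFst `OpComp` OpRight
```

The Maybe in the result is what rules out nonsensical uses: applying openedHandle to a Left (a failed open) yields Nothing, so no handle variable exists for it.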

The other variable-related thing we need to do in our InLockstep instance is to define which variables are used in all commands:

instance InLockstep FsState where
  usedVars :: LockstepAction FsState a -> [AnyGVar (ModelOp FsState)]
  usedVars = \case
      MkDir{}        -> []
      Open{}         -> []
      Write h _      -> [SomeGVar h]
      Close h        -> [SomeGVar h]
      Read (Left h)  -> [SomeGVar h]
      Read (Right _) -> []

  -- .. other members of InLockstep elided

SomeGVar here is just a way to hide the type of the variable, so that we can have a list of variables of different types:

data AnyGVar op where
  SomeGVar :: GVar op y -> AnyGVar op

Again, the definition of usedVars involves some boilerplate, but it is not difficult to write. It is important to get this function right, however. When a counter-example is found, quickcheck-dynamic will try to shrink the list of actions, to throw out any irrelevant detail. But if, say, a call to Open is removed, then any calls to Close which referenced that open should also be removed. This is done through preconditions, and the default precondition from InLockstep ensures that this will happen by stating that all usedVars must be defined. However, if usedVars misses some variables, the test will fail during shrinking with a confusing error message about undefined variables.

Generating and shrinking actions

The type of the method for generating actions is

type ModelFindVariables state = forall a.
          Typeable a
       => Proxy a -> [GVar (ModelOp state) a]

class (..) => InLockstep state where
  arbitraryWithVars ::
       ModelFindVariables state
    -> state
    -> Gen (Any (LockstepAction state))

  -- .. other members of InLockstep elided

Thus, we need to generate an arbitrary action given the current state of the model and a way to find all available variables of a specified type. For example, if we previously executed an open command, then ModelFindVariables will tell us that we have a variable of type Either Err (IO.Handle, File). If we have such a variable, we can turn it into a variable of the type we need using:

handle :: GVar Op (Either Err (IO.Handle, File)) -> GVar Op IO.Handle
handle = mapGVar (\op -> OpFst `OpComp` OpRight `OpComp` op)

The situation for shrinking is very similar:

class (..) => InLockstep state where
  shrinkWithVars ::
       ModelFindVariables state
    -> state
    -> LockstepAction state a
    -> [Any (LockstepAction state)]

  -- .. other members of InLockstep elided

We will not show the full definition of the generator and the shrinker here. Apart from generating variables, they follow precisely the same lines as the ones we showed previously with quickcheck-state-machine. You can find the full definition in the repository.

Labelling


When we are testing with randomly generated test data, it is important that we understand what kind of data we are testing with. For example, we might want to verify that certain edge cases are being tested. Labelling is one way to do this: we label specific kinds of test inputs, and then check that we see tests being executed with those labels.

For our running example, the labels, or tags, that we previously considered were

data Tag = OpenTwo | SuccessfulRead

The idea was that a test would be labelled with OpenTwo if it opens at least two different files, and with SuccessfulRead if it manages to execute at least one read successfully.

The abstraction that InLockstep provides for tagging is

class (..) => InLockstep state where
  tagStep ::
       Show a
    => (state, state)
    -> LockstepAction state a
    -> ModelValue state a
    -> [String]

  -- .. other members of InLockstep elided

This enables us to tag an action given the before and after state, the action itself, and its result; we do not see any previously executed actions.8 This means that for our OpenTwo tag we need to record in the state how many different files have been opened. This is the purpose of the Stats:

type Stats = Set File

initStats :: Stats
initStats = Set.empty

Updating the statistics is easy (recall that we used this function in modelNextState above). We just look at the action and its result: if the action is an Open, and the result is a Right value (indicating that the Open was successful), we insert the filename into the set:

updateStats :: LockstepAction FsState a -> FsVal a -> Stats -> Stats
updateStats action result  =
   case (action, result) of
     (Open f, MEither (Right _)) -> Set.insert f
     _otherwise                  -> id

Tagging is now equally easy. If it’s a Read, we check to see if the result was successful, and if so we add the SuccessfulRead tag. If it’s an Open, we look at the statistics to see if we have opened at least two files:

tagFsAction :: Stats -> LockstepAction FsState a -> FsVal a -> [Tag]
tagFsAction openedFiles = \case
    Read _ -> \v -> [SuccessfulRead | MEither (Right _) <- [v]]
    Open _ -> \_ -> [OpenTwo        | Set.size openedFiles >= 2]
    _      -> \_ -> []
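
The interplay between the statistics and the tags can be sketched in a self-contained way. The Action type here is a hypothetical simplification (a plain data type plus a Bool saying whether the step succeeded, rather than the GADT and FsVal used above), and tagRun assumes that each step is tagged with the statistics as they are after that step.

```haskell
import           Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical, simplified actions; success is tracked as a separate Bool.
data Action = Open FilePath | Read FilePath | Close FilePath

data Tag = OpenTwo | SuccessfulRead
  deriving (Eq, Ord, Show)

type Stats = Set FilePath

-- Record every file that was opened successfully.
updateStats :: Action -> Bool -> Stats -> Stats
updateStats (Open f) True = Set.insert f
updateStats _        _    = id

-- Tag a single step, given the statistics gathered so far.
tagAction :: Stats -> Action -> Bool -> [Tag]
tagAction _      (Read _) ok = [SuccessfulRead | ok]
tagAction opened (Open _) _  = [OpenTwo | Set.size opened >= 2]
tagAction _      _        _  = []

-- Thread the statistics through a whole run and collect all tags.
-- Assumption: a step is tagged using the statistics after that step.
tagRun :: [(Action, Bool)] -> Set Tag
tagRun = go Set.empty
  where
    go _     []                     = Set.empty
    go stats ((action, ok) : rest) =
      let stats' = updateStats action ok stats
      in  Set.fromList (tagAction stats' action ok) `Set.union` go stats' rest
```

Opening the same file twice does not produce OpenTwo in this sketch, because the set only grows when a new filename is inserted; this is the reason for keeping a set rather than a counter.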

Running the tests

Before we can run any quickcheck-dynamic tests, we have to give an instance of the RunModel class. This class is somewhat confusingly named: its main method perform does not explain how to run the model, but rather how to run the system under test. Name aside, instances of RunModel are simple when using quickcheck-lockstep:

type RealMonad = ReaderT FilePath IO

instance RunModel (Lockstep FsState) RealMonad where
  perform       = \_st -> runIO
  postcondition = Lockstep.postcondition
  monitoring    = Lockstep.monitoring (Proxy @RealMonad)

We have to choose a monad to run our system under test in; we choose ReaderT FilePath IO, where the FilePath is the root directory of the file system that we are simulating. The definitions of postcondition and monitoring come straight from quickcheck-lockstep; we just have to provide an interpreter for actions for the system under test:

runIO :: LockstepAction FsState a -> LookUp RealMonad -> RealMonad (Realized RealMonad a)

Writing this interpreter is straightforward and we will not show it here; the only minor wrinkle is that we need to turn the lookup function for Var that quickcheck-dynamic gives us into a lookup function for GVar; quickcheck-lockstep provides this functionality:9

lookUpGVar :: (..) => Proxy m -> LookUp m -> GVar op a -> Realized m a

The final thing we have to do is provide an instance of RunLockstep; this is a subclass of InLockstep with a single method observeReal; it is a separate class, because InLockstep itself is not aware of the monad used to run the system under test:

instance RunLockstep FsState RealMonad where
  observeReal _proxy = \case
      MkDir{} -> OEither . bimap OId OId
      Open{}  -> OEither . bimap OId (OPair . bimap (const OHandle) OId)
      Write{} -> OEither . bimap OId OId
      Close{} -> OEither . bimap OId OId
      Read{}  -> OEither . bimap OId OId

To actually run our tests, we can make use of this function provided by quickcheck-lockstep:

runActionsBracket ::
     RunLockstep state (ReaderT st IO)
  => Proxy state
  -> IO st         -- ^ Initialisation
  -> (st -> IO ()) -- ^ Cleanup
  -> Actions (Lockstep state) -> Property

For example, if we have a bug in our mock system such that we get a “does not exist” error message instead of an “already exists” error when we create a directory that already exists, the test output might look something like this:

*** Failed! Assertion failed (after 7 tests and 4 shrinks):
 [Var 4 := MkDir (Dir ["x"]),
  Var 6 := MkDir (Dir ["x"])]
State: FsState {.. state1 elided ..}
State: FsState {.. state2 elided ..}
System under test returned: OEither (Left (OId AlreadyExists))
but model returned:         OEither (Left (OId DoesNotExist))

(where we have elided some output). We see the state of the system after every action, as well as the final failed postcondition.

Generating labelled examples

For generating labelled examples, quickcheck-lockstep provides

tagActions ::  InLockstep state => Proxy state -> Actions (Lockstep state) -> Property

(This functionality is not provided by quickcheck-dynamic.10) We can use this with the standard QuickCheck labelledExamples function. As stated, this is very useful both for testing the labelling and for testing the shrinker, because QuickCheck will give us minimal labelled examples. For example, we might get the following minimal example for “successful read”:

*** Found example of Tags: ["SuccessfulRead"]
 [Var 8 := Open (File {dir = Dir [], name = "t0"}),
  Var 9 := Close (GVar (Var 8) (fst . fromRight . id)),
  Var 51 := Read (Left (GVar (Var 8) (snd . fromRight . id)))]

The syntax might be a little difficult to read here, but we (1) open a file, then (2) close the file we opened in step (1), and finally (3) read the file that we opened in step (1).

Part 2: Implementation

Now that we have seen all the ingredients, let’s see how the lockstep abstraction is actually implemented. We will first describe which state we track, and then discuss all of the default implementations for the methods of StateModel; this will serve both as an explanation of the implementation, as well as an example of how to define StateModel instances. Fortunately, we have already seen most of the pieces; it’s just a matter of putting them together now.


During test execution, quickcheck-dynamic internally maintains a mapping from variables to the values as returned by the system under test:

type Env m = [EnvEntry m]

data EnvEntry m where
  (:==) :: Typeable a => Var a -> Realized m a -> EnvEntry m

Variables of different types are distinguished at runtime through dynamic typing; this is common for model testing libraries like this, and is not really visible to end users.

The state maintained by the lockstep infrastructure is the user defined model state, along with an environment similar to Env, but for the values returned by the model:

data Lockstep state = Lockstep {
      lockstepModel :: state
    , lockstepEnv   :: EnvF (ModelValue state)
    }

The definition of EnvF is similar to Env, but maps variables of type a to values of type f a:

data EnvEntry f where
  EnvEntry :: Typeable a => Var a -> f a -> EnvEntry f

newtype EnvF f = EnvF [EnvEntry f]
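
As an illustration of how such an environment can be queried, the sketch below adds insertion and lookup on top of the definitions above. The Var newtype and the function names emptyEnv, insertEnv and lookupEnv are assumptions made for this example; the dynamic type check is done with gcast from Data.Typeable.

```haskell
{-# LANGUAGE GADTs, ScopedTypeVariables #-}

import Data.Typeable (Typeable, gcast)

-- Hypothetical stand-in for the variables of quickcheck-dynamic.
newtype Var a = Var Int

data EnvEntry f where
  EnvEntry :: Typeable a => Var a -> f a -> EnvEntry f

newtype EnvF f = EnvF [EnvEntry f]

emptyEnv :: EnvF f
emptyEnv = EnvF []

insertEnv :: Typeable a => Var a -> f a -> EnvF f -> EnvF f
insertEnv var val (EnvF entries) = EnvF (EnvEntry var val : entries)

-- Find the entry with the right number, then check at runtime that the
-- stored type matches the type the caller asked for.
lookupEnv :: forall f a. Typeable a => Var a -> EnvF f -> Maybe (f a)
lookupEnv (Var n) (EnvF entries) = go entries
  where
    go :: [EnvEntry f] -> Maybe (f a)
    go [] = Nothing
    go (EnvEntry (Var m) val : rest)
      | n == m, Just val' <- gcast val = Just val'
      | otherwise                      = go rest
```

A lookup with the right number but the wrong type simply fails in this sketch; whether that should be an error instead is a design choice this example does not try to settle.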

Initialising and stepping the state

State initialisation is simple:

initialState :: state -> Lockstep state
initialState state = Lockstep {
      lockstepModel = state
    , lockstepEnv   = EnvF.empty
    }

Stepping the state (the implementation of nextState) is one of the functions at the heart of the abstraction, but we have actually already seen nearly all the ingredients:

nextState :: forall state a.
     (InLockstep state, Typeable a)
  => Lockstep state
  -> LockstepAction state a
  -> Var a
  -> Lockstep state
nextState (Lockstep state env) action var =
    Lockstep state' $ EnvF.insert var modelResp env
  where
    modelResp :: ModelValue state a
    state'    :: state
    (modelResp, state') = modelNextState (GVar.lookUpEnvF env) action state

We are given the current state, an action to take, and a fresh variable to hold the result of this action, and must compute the result according to the model and new model state. The model result and the new model state come straight from the modelNextState method of InLockstep; the only other thing left to do is to add the variable binding to our environment.
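
The same stepping logic can be seen in miniature on a toy model. Everything in the following sketch is hypothetical and untyped compared to the real infrastructure: the model is a stack of integers, variables are plain Ints, and the environment is an association list of model responses.

```haskell
-- Toy model (an assumption for illustration): a stack of integers.
data Action = Push Int | Pop
  deriving Show

type Model    = [Int]
type Response = Maybe Int          -- what the model returns for an action
type Var      = Int                -- variables are simply numbered
type Env      = [(Var, Response)]  -- model responses recorded so far

data Lockstep = Lockstep
  { lockstepModel :: Model
  , lockstepEnv   :: Env
  }

-- Run one action against the model: its response and the new model state.
modelNextState :: Action -> Model -> (Response, Model)
modelNextState (Push n) st       = (Nothing, n : st)
modelNextState Pop      []       = (Nothing, [])
modelNextState Pop      (x : st) = (Just x, st)

-- Step the model and bind the fresh variable to the model's response.
nextState :: Lockstep -> Action -> Var -> Lockstep
nextState (Lockstep st env) action var =
    Lockstep st' ((var, resp) : env)
  where
    (resp, st') = modelNextState action st

-- Fold a list of actions, numbering the variables from zero.
runModel :: [Action] -> Lockstep
runModel actions = foldl step (Lockstep [] []) (zip actions [0 ..])
  where
    step ls (action, var) = nextState ls action var
```

After running Push 1, Push 2 and Pop, the model holds [1] and the environment records that variable 2 was bound to Just 2, which is the information later actions would refer to.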

Precondition and postcondition

The only precondition that we have by default is that all variables must be well-defined. This means not only that they have a value, but also that the evaluation of the embedded Op will succeed too. This is verified by

definedInEnvF :: (..) => EnvF f -> GVar op a -> Bool

So the precondition is simply

precondition ::
     InLockstep state
  => Lockstep state -> LockstepAction state a -> Bool
precondition (Lockstep _ env) =
    all (\(SomeGVar var) -> GVar.definedInEnvF env var) . usedVars

The postcondition is also simple: quickcheck-dynamic gives us the action and the result from the system under test; we (re)compute the result from the model and compare “up to observability,” as described above:

checkResponse :: forall m state a.
     RunLockstep state m
  => Proxy m
  -> Lockstep state -> LockstepAction state a -> Realized m a -> Maybe String
checkResponse p (Lockstep state env) action a =
    compareEquality (observeReal p action a) (observeModel modelResp)
  where
    modelResp :: ModelValue state a
    modelResp = fst $ modelNextState (GVar.lookUpEnvF env) action state

    compareEquality :: Observable state a -> Observable state a -> Maybe String
    compareEquality real mock
      | real == mock = Nothing
      | otherwise    = Just $ concat [
            "System under test returned: "
          , show real
          , "\nbut model returned:         "
          , show mock
          ]

postcondition :: forall m state a.
     RunLockstep state m
  => (Lockstep state, Lockstep state)
  -> LockstepAction state a
  -> LookUp m
  -> Realized m a
  -> m Bool
postcondition (before, _after) action _lookUp a =
    pure $ isNothing $ checkResponse (Proxy @m) before action a

Unlike postcondition, which can only return a boolean, checkResponse actually gives a user-friendly error message in case the postcondition is not satisfied. We will reuse this in monitoring below to ensure that this error message is included in the test output.

Generation, shrinking and monitoring

The definitions of arbitraryAction and shrinkAction are thin wrappers around the corresponding methods from InLockstep: we just need to pass them a way to find out which variables are available:

varsOfType ::
     InLockstep state
  => EnvF (ModelValue state) -> ModelFindVariables state
varsOfType env p = map GVar.fromVar $ EnvF.keysOfType p env

This depends on

keysOfType :: Typeable a => Proxy a -> EnvF f -> [Var a]

to find variables of the appropriate type. Action generation and shrinking are now trivial:

arbitraryAction (Lockstep state env) = arbitraryWithVars (varsOfType env) state
shrinkAction    (Lockstep state env) = shrinkWithVars    (varsOfType env) state
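
For completeness, here is one way keysOfType could be written. This is a self-contained sketch under the assumption that variables are a newtype around Int and that the environment is a plain list of entries; it is not necessarily how the library defines it.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Maybe    (mapMaybe)
import Data.Proxy    (Proxy (..))
import Data.Typeable (Typeable, gcast)

-- Hypothetical stand-in for the variables of quickcheck-dynamic.
newtype Var a = Var Int
  deriving (Eq, Show)

data EnvEntry f where
  EnvEntry :: Typeable a => Var a -> f a -> EnvEntry f

-- Keep exactly those variables whose runtime type matches the proxy.
keysOfType :: Typeable a => Proxy a -> [EnvEntry f] -> [Var a]
keysOfType _ =
    mapMaybe (\entry -> case entry of EnvEntry var _ -> gcast var)
```

The proxy argument carries no data; it only fixes the type a at the call site, so the generator can ask for, say, all variables of type Int.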

Finally, quickcheck-dynamic allows us to “monitor” test execution: we can add additional information to running tests. We will use this both to label tests with inferred tags, as well as to add the state after every step and the result of checkResponse to the test-output in case there is a test failure:

monitoring :: forall m state a.
     RunLockstep state m
  => Proxy m
  -> (Lockstep state, Lockstep state)
  -> LockstepAction state a
  -> LookUp m
  -> Realized m a
  -> Property -> Property
monitoring p (before, after) action _lookUp realResp =
      QC.counterexample ("State: " ++ show after)
    . maybe id QC.counterexample (checkResponse p before action realResp)
    . QC.tabulate "Tags" tags
  where
    tags :: [String]
    tags = tagStep (lockstepModel before, lockstepModel after) action modelResp

    modelResp :: ModelValue state a
    modelResp = fst $ modelNextState
                        (GVar.lookUpEnvF $ lockstepEnv before)
                        action
                        (lockstepModel before)


The interface for stateful testing provided by quickcheck-dynamic is fairly minimal. The key methods that a test must implement are:

  • The initial state of the model, and a way to step that state given an action.
  • A precondition which is checked during generation and (importantly) during shrinking to rule out nonsensical tests.
  • A postcondition which is checked after every action and determines whether or not a test is considered successful.
  • Generation and shrinking of actions.
  • Optionally, a way to add additional information to a test.

Although it’s nice to have a minimal API, it leaves end users with a lot of different ways in which they might structure their tests. Sometimes that flexibility is useful, but in many situations a more streamlined approach is preferable. In this blog post we described the quickcheck-lockstep library, which provides support for “lockstep-style” model testing on top of quickcheck-dynamic. The key difference here is that the postcondition is always the same: we insist that the system under test and the model must return the same results, “up to observability.” By default, the precondition is also always the same: we only insist that all variables are defined.

We previously implemented the same kind of infrastructure for quickcheck-state-machine, so implementing it now for quickcheck-dynamic provided a good comparison point between the two libraries.

  • In terms of model based testing, the two libraries basically have feature parity: they provide the same core functionality.
  • In addition to this core functionality, quickcheck-dynamic offers support for dynamic logic, which is absent from quickcheck-state-machine. Conversely, quickcheck-state-machine offers support for parallel test execution, which is currently absent from quickcheck-dynamic. We have not talked about either topic in this blog post.

The differences between the two libraries are mostly technical in nature:

  • Probably the most important downside of quickcheck-dynamic is that there is precisely one variable that records the result of an action.11 This is not the case in quickcheck-state-machine, where the number of variables bound by an action is determined at runtime. We can use this to return no bound variables if the action failed, or indeed multiple bound variables if the action returned multiple values (such as our Open example). In quickcheck-lockstep we therefore provide the GVar abstraction, which provides a way to “map” over the type of variables. It might be useful to lift this abstraction into the main library at some point.
  • At the moment, quickcheck-dynamic does not provide explicit support for generating labelled examples. As we saw, we can implement this functionality on top of quickcheck-dynamic (quickcheck-lockstep offers it), but as with GVar, it might be useful to move (a version of) this functionality into the main library.
  • In quickcheck-state-machine the types of variables are type-level arguments to actions and responses. This means that some functionality such as getting the list of variables used (usedVars) can be defined generically. Moreover, variables can be resolved by the framework, whereas in quickcheck-dynamic test authors are responsible for manually calling the LookUp function whenever necessary. However, we pay a price for this functionality in quickcheck-state-machine; especially when dealing with multiple types of variables, the required type-level machinery gets pretty sophisticated.
  • In quickcheck-dynamic the argument to tests is Actions, which is a list of steps where each step consists of a variable for that step and the action to execute. The corresponding datatype in quickcheck-state-machine is Commands; this is very similar, but in addition to the action, it also records the result of the action. This makes Commands a bit more useful than Actions for things like tagging commands, since we get the full history. In quickcheck-dynamic, tagActions must effectively re-run the full set of actions to construct the right test label.
  • Unlike quickcheck-state-machine, quickcheck-dynamic keeps the definition of the interpreter for Action (RunModel) separate from the StateModel class. This separation is useful, because running the test against the real system often needs some additional state (a database handle, for example) which is not necessary for many other parts of the test framework. In quickcheck-state-machine this can often lead to ugly error "state unused" calls.

All in all, the libraries are quite similar in terms of the core state model testing functionality. For lockstep-style testing, however, quickcheck-lockstep is probably more user-friendly than the corresponding functionality in quickcheck-state-machine because there is less advanced type machinery required. The downside of the single-variable-per-command of quickcheck-dynamic is resolved by the GVar abstraction in quickcheck-lockstep.


  1. If we wanted to execute lockstep-style tests against multiple execution backends, we would have to introduce another abstraction to ensure that we can compare model responses to system responses for all of those backends.↩︎

  2. InLockstep could alternatively require a function compareResult :: a -> ModelValue s a -> Bool, but writing such a function is often a bit cumbersome, whereas equality for Observable s can be derived.↩︎

  3. Core quickcheck-dynamic takes care of variables for the system under test, but not for the model.↩︎

  4. For an Action .. a, all we see in postcondition is a value of type Realized m a. We want to ensure not just that the model and the system under test have the same behaviour when no exceptions are present, but also that they return the same errors. We must therefore reflect the possibility for an error in the result type of Open.↩︎

  5. Perhaps some of this functionality can be merged with the main library; it certainly seems useful beyond lockstep-style testing.↩︎

  6. This is closely related to Coyoneda.↩︎

  7. The quickcheck-dynamic infrastructure insists that actions have Eq and Show instances. Since variables occur in actions, the same must be true for GVar. Secondly, a function from x -> y would not be enough; we would also need a second function of type ModelValue s x -> ModelValue s y. The indirection through the DSL avoids both of these problems: operations have Eq and Show instances, and we can simply insist on two interpreters of Op: one for Identity and one for ModelValue s.↩︎

  8. We use tagStep not just in labelledExamples, but also in the standard StateModel method monitoring, to tag tests as they are executed. While the former would in principle allow us to tag an entire list of actions, the latter does not.↩︎

  9. The proxy argument is necessary because Realized is a non-injective type family; quickcheck-dynamic relies on AllowAmbiguousTypes instead.↩︎

  10. In StateModel we have monitoring, but monitoring cannot really be used with label, as this would result in lots of calls to label as the test executes (once per action) and each of those calls would result in a separate table in the test output. We must therefore use tabulate instead, but this is not supported by QuickCheck’s labelledExamples. Moreover, the only way to turn a list of actions into a PropertyM in quickcheck-dynamic is runActions, which requires the RunModel argument; but RunModel should not be needed for creating labelled examples. In the lockstep infrastructure we provide instead a function tagActions :: Actions (Lockstep state) -> Property, which basically executes all of the actions, collecting the tags as it goes, and then makes a single call to label with the final list of tags. This then works well with the standard labelledExamples functionality from QuickCheck.↩︎

  11. The registry example from quickcheck-dynamic skirts around the problem: some actions fail, and some actions return new ModelThreadId, but there are no actions that can fail or return a new ModelThreadId.↩︎

by edsko at September 08, 2022 12:00 AM

September 07, 2022

Gabriella Gonzalez

nix-serve-ng: A faster, more reliable, drop-in replacement for nix-serve


Our team at Arista Networks is happy to announce nix-serve-ng, a backwards-compatible Haskell rewrite of nix-serve (a service for hosting a /nix/store as a binary cache). It provides better reliability and performance than nix-serve (ranging from ≈ 1.5× to 32× faster). We wrote nix-serve-ng to fix scaling bottlenecks in our cache and we expect other large-scale deployments might be interested in this project, too.

This post will focus more on the background behind the development process and comparisons to other Nix cache implementations. If you don’t care about any of that then you can get started by following the instructions in the repository’s README.


Before we began this project there were at least two other open source rewrites of nix-serve that we could have adopted instead of nix-serve:

  • eris - A Perl rewrite of nix-serve

    Note: the original nix-serve is implemented in Perl, and eris is also implemented in Perl using a different framework.

  • harmonia - A Rust rewrite of nix-serve

The main reason we did not go with these two alternatives is that they are not drop-in replacements for the original nix-serve. We could have fixed that, but given how simple nix-serve is I figured that it would be simpler to just create our own. nix-serve-ng only took a couple of days for the initial version and maybe a week of follow-up fixes and performance tuning.

We did not evaluate the performance or reliability of eris or harmonia before embarking on our own nix-serve replacement. However, after nix-serve-ng was done we learned that it was significantly faster than the alternatives (see the Performance section below). Some of those performance differences are probably fixable, especially for harmonia. That said, we are very happy with the quality of our solution.

Backwards compatibility

One important design goal for this project is to be significantly backwards compatible with nix-serve. We went to great lengths to preserve compatibility, including:

  • Naming the built executable nix-serve

    Yes, even though the project name is nix-serve-ng, the executable built by the project is named nix-serve.

  • Preserving most of the original command-line options, including legacy options

    … even though some are unused.

In most cases you can literally replace pkgs.nix-serve with pkgs.nix-serve-ng and it will “just work”. You can even continue to use the existing services.nix-serve NixOS options.

The biggest compatibility regression is that nix-serve-ng cannot be built on MacOS. It is extremely close to supporting MacOS save for this one bug in Haskell’s hsc2hs tool: haskell/hsc2hs - #26. We left in all of the MacOS shims so that if that bug is ever fixed then we can get MacOS support easily.

For more details on the exact differences compared to nix-serve, see the Result / Backwards-compatibility section of the README.

Performance


nix-serve-ng is faster than all of the alternatives according to both our formal benchmarks and also informal testing. The “Benchmarks” section of our README has the complete breakdown but the relevant part is this table:

Speedups (compared to nix-serve):

Benchmark                 nix-serve  eris  harmonia  nix-serve-ng
Fetch present NAR info ×
Fetch absent NAR info ×
Fetch empty NAR ×10       1.0        0.67  0.59      31.80
Fetch 10 MB NAR ×10       1.0        0.64  0.60      3.35

… which I can summarize like this:

  • nix-serve-ng is faster than all of the alternatives across all use cases
  • eris is slower than the original nix-serve across all use cases
  • harmonia is faster than the original nix-serve for NAR info lookups, but slower for fetching NARs

These performance results were surprising for a few reasons:

  • I was not expecting eris to be slower than the original nix-serve implementation

    … especially not NAR info lookups to be ≈ 20× slower. This is significant because NAR info lookups typically dominate a Nix cache’s performance. In my (informal) experience, the majority of a Nix cache’s time is spent addressing failed cache lookups.

  • I was not expecting harmonia (the Rust rewrite) to be slower than the original nix-serve for fetching NARs

    This seems like something that should be fixable. harmonia will probably eventually match our performance because Rust has a high performance ceiling.

  • I was not expecting a ≈ 30× speedup for nix-serve-ng fetching small NARs

    I had to triple-check that neither nix-serve-ng nor the benchmark was broken when I saw this speedup.

So I investigated these performance differences to help inform other implementations what to be mindful of.

Performance insights

We didn’t get these kinds of speed-ups by being completely oblivious to performance. Here are the things that we paid special attention to in order to keep things efficient, ordered from lowest-hanging to highest-hanging fruit:

  • Don’t read the secret key file on every NAR fetch

    This is a silly thing that the original nix-serve does that is the easiest thing to fix.

    eris and harmonia also fix this, so this optimization is not unique to our rewrite.

  • We bind directly to the Nix C++ API for fetching NARs

    nix-serve, eris, and harmonia all shell out to a subprocess to fetch NARs, by invoking either nix dump-path or nix-store --dump to do the heavy lifting. In contrast, nix-serve-ng binds to the Nix C++ API for this purpose.

    This would definitely explain some of the performance difference when fetching NARs. Creating a subprocess has a fixed overhead regardless of the size of the NAR, which explains why we see the largest performance difference when fetching tiny NARs since the overhead of creating a subprocess would dominate the response time.

    This may also affect throughput for serving large NAR files, by adding unnecessary memory copies/buffering as part of streaming the subprocess output.

  • We minimize memory copies when fetching NARs

    We go to great lengths to minimize the number of intermediate buffers and copies when streaming the contents of a NAR to a client. To do this, we exploit the fact that Haskell’s foreign function interface works in both directions: Haskell code can call C++ code but also C++ code can call Haskell code. This means that we can create a Nix C++ streaming sink from a Haskell callback function and this eliminates the need for intermediate buffers.

    This likely also improves the throughput for serving NAR files. Only nix-serve-ng performs this optimization (since nix-serve-ng is the only one that uses the C++ API for streaming NAR contents).

  • Hand-write the API routing logic

    We hand-write all of the API routing logic to prioritize and optimize the hot path (fetching NAR info).

    For example, a really simple thing that the original nix-serve does inefficiently is to check if the path matches /nix-cache-info first, even though that is an extremely infrequently used path. In our API routing logic we move that check straight to the very end.

    These optimizations likely improve the performance of NAR info requests. As far as I can tell, only nix-serve-ng performs these optimizations.

I have not benchmarked the performance impact of each of these changes in isolation, though. These observations are purely based on my intuition.
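
To illustrate what ordering the routing checks by frequency looks like, here is a small hypothetical sketch. It is not the routing code from nix-serve-ng: the Route type and route function are invented for this example, and the only things assumed about the API are the usual binary cache paths (/<hash>.narinfo, /nar/... and /nix-cache-info).

```haskell
import Data.List (isSuffixOf, stripPrefix)

-- Hypothetical route type for a binary cache server.
data Route
  = NarInfo String   -- /<hash>.narinfo (the hot path)
  | Nar String       -- /nar/<rest>
  | CacheInfo        -- /nix-cache-info (rarely requested)
  | NotFound
  deriving (Eq, Show)

-- Guards are tried top to bottom, so the most frequent request comes
-- first and the rarely used /nix-cache-info check comes last.
route :: String -> Route
route path
  | Just name <- stripPrefix "/" path
  , suffix `isSuffixOf` name
      = NarInfo (take (length name - length suffix) name)
  | Just rest <- stripPrefix "/nar/" path
      = Nar rest
  | path == "/nix-cache-info"
      = CacheInfo
  | otherwise
      = NotFound
  where
    suffix = ".narinfo"
```

The point of the sketch is only the order of the guards; a real server would match on raw bytes and avoid the repeated list traversals that String operations imply.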


nix-serve-ng is not all upsides. In particular, nix-serve-ng is missing features that some of the other rewrites provide, such as:

  • Greater configurability
  • Improved authentication support
  • Monitoring/diagnostics/status APIs

Our focus was entirely on scalability, so the primary reason to use nix-serve-ng is if you prioritize performance and uptime.


We’ve been using nix-serve-ng long enough internally that we feel confident endorsing its use outside our company. We run a particularly large Nix deployment internally (which is why we needed this in the first place), so we have stress tested nix-serve-ng considerably under heavy and realistic usage patterns.

You can get started by following these instructions and let us know if you run into any issues or difficulties.

Also, I want to thank Arista Networks for graciously sponsoring our team to work on and open source this project.

by Gabriella Gonzalez at September 07, 2022 03:56 PM

Ken T Takusagawa

[aybvgyej] prime binary truncations

consider a number N.  if N is odd, test whether N is prime.  if N is even, test whether N+1 is prime.  set N := floor(N/2) and repeat primality testing until N=0.  of all the bitwise right shifts, how many are prime?

for example, start at N := 1580011307924772.  N+1 is prime (1)
N := floor(N/2) = 790005653962386.  N+1 is prime (2)
N := floor(N/2) = 395002826981193.
N := floor(N/2) = 197501413490596.  N+1 is prime (3)
N := floor(N/2) = 98750706745298.
N := floor(N/2) = 49375353372649.
N := floor(N/2) = 24687676686324.
N := floor(N/2) = 12343838343162.
N := floor(N/2) = 6171919171581.
N := floor(N/2) = 3085959585790.  N+1 is prime (4)
N := floor(N/2) = 1542979792895.
N := floor(N/2) = 771489896447.  N is prime (5)
N := floor(N/2) = 385744948223.  N is prime (6)
N := floor(N/2) = 192872474111.  N is prime (7)
N := floor(N/2) = 96436237055.
N := floor(N/2) = 48218118527.  N is prime (8)
N := floor(N/2) = 24109059263.  N is prime (9)
N := floor(N/2) = 12054529631.
N := floor(N/2) = 6027264815.
N := floor(N/2) = 3013632407.  N is prime (10)
N := floor(N/2) = 1506816203.  N is prime (11)
N := floor(N/2) = 753408101.  N is prime (12)
N := floor(N/2) = 376704050.
N := floor(N/2) = 188352025.
N := floor(N/2) = 94176012.  N+1 is prime (13)
N := floor(N/2) = 47088006.  N+1 is prime (14)
N := floor(N/2) = 23544003.
N := floor(N/2) = 11772001.
N := floor(N/2) = 5886000.
N := floor(N/2) = 2943000.  N+1 is prime (15)
N := floor(N/2) = 1471500.  N+1 is prime (16)
N := floor(N/2) = 735750.  N+1 is prime (17)
N := floor(N/2) = 367875.
N := floor(N/2) = 183937.
N := floor(N/2) = 91968.  N+1 is prime (18)
N := floor(N/2) = 45984.
N := floor(N/2) = 22992.  N+1 is prime (19)
N := floor(N/2) = 11496.  N+1 is prime (20)
N := floor(N/2) = 5748.  N+1 is prime (21)
N := floor(N/2) = 2874.
N := floor(N/2) = 1437.
N := floor(N/2) = 718.  N+1 is prime (22)
N := floor(N/2) = 359.  N is prime (23)
N := floor(N/2) = 179.  N is prime (24)
N := floor(N/2) = 89.  N is prime (25)
N := floor(N/2) = 44.
N := floor(N/2) = 22.  N+1 is prime (26)
N := floor(N/2) = 11.  N is prime (27)
N := floor(N/2) = 5.  N is prime (28)
N := floor(N/2) = 2.  N+1 is prime (29)
N := floor(N/2) = 1.
N := floor(N/2) = 0.

thus, the number produces 29 primes, which is the most (a record) among numbers up to that starting N.

when examining N=1, we've arbitrarily chosen not to count N+1 = 2 being prime.

previously, factoring truncations of irrational numbers in binary.

the Pari/GP code below is brute force for pedagogical purposes:

countp(p)=my(numprimes=0); my(bitwidth=0); while(p>0, if(p%2, if(isprime(p), numprimes+=1); p=(p-1)/2, if(isprime(p+1), numprimes+=1); p/=2); bitwidth+=1); [numprimes, bitwidth]

best=0; for(n=0,+oo, a=countp(n); if(a[1]>best, best=a[1]; printbinary(n); print(" ",n," ",a," ",n+1)))
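
the same brute-force count can be sketched in Haskell.  this is an illustrative translation of countp, not the program used for the results below; isPrime here is naive trial division (an assumption, chosen to keep the example self-contained) and is only suitable for small inputs.

```haskell
-- Naive trial division; adequate for small inputs only.
isPrime :: Integer -> Bool
isPrime n
  | n < 2     = False
  | otherwise = all (\d -> n `mod` d /= 0)
                    (takeWhile (\d -> d * d <= n) [2 ..])

-- Number of primes produced by the right shifts of n, and the bit width
-- of n, mirroring the pair returned by countp.
countPrimes :: Integer -> (Int, Int)
countPrimes = go 0 0
  where
    go primes width 0 = (primes, width)
    go primes width n =
      let candidate = if odd n then n else n + 1   -- odd: test N; even: test N+1
          primes'   = if isPrime candidate then primes + 1 else primes
      in  go primes' (width + 1) (n `div` 2)
```

because N=1 is odd, it is tested as 1 itself (not prime), which matches the choice of not counting N+1 = 2.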

here are starting N which set records of producing increasing number of primes.  we give the number in binary (using period to signify zero) (illustrating rich veins of primes which work for a while, peter out, and later revive), in decimal, the number of primes, starting bitwidth, and N+1 (which I think is always prime).

1. 2 [1, 2] 3
1.. 4 [2, 3] 5
1.1. 10 [3, 4] 11
1.11. 22 [4, 5] 23
1.111. 46 [5, 6] 47
1.1..11. 166 [6, 8] 167
1.11..11. 358 [7, 9] 359
1.11..111. 718 [8, 10] 719
1.11..1111. 1438 [9, 11] 1439
1.11..11111. 2878 [10, 12] 2879
1.11..1111111. 11518 [11, 14] 11519
1.11..11111111. 23038 [12, 15] 23039
1.11..11111111... 92152 [13, 17] 92153
1.11..11111111..... 368608 [14, 19] 368609
1.111111.1....1111.. 783420 [15, 20] 783421
1.111111.1....1111..... 6267360 [16, 23] 6267361
1.111111.1....1111...... 12534720 [17, 24] 12534721
1.111111.1.....111.1111111. 100273918 [18, 27] 100273919
1.11..11111111111111.1..11... 377487000 [19, 29] 377487001
1.11..11111111111111.1..11.... 754974000 [20, 30] 754974001
1.11..11111111111111.1..11...... 3019896000 [21, 32] 3019896001
1.11..11111.1..11111...1.11.1111... 24147626872 [22, 35] 24147626873
1.11..11111.1..11111...1.11.1111...1. 96590507490 [23, 37] 96590507491
1.11..111.1......11....11..1.111111111. 385744948222 [24, 39] 385744948223
1.11..111.1......11....11..1.1111111111. 771489896446 [25, 40] 771489896447
1.11..111.1......11....11..1.111111111111. 3085959585790 [26, 42] 3085959585791
1.11..111.1......11....11..1.1111111111111... 24687676686328 [27, 45] 24687676686329
1.111111.1....11111....111..11..1.11..11..11111. 210298272002878 [28, 48] 210298272002879
1.11..111.1......11....11..1.111111111111.1..1..1.. 1580011307924772 [29, 51] 1580011307924773
1.11..111.1......11....11..1.1111111111111...1.1.1111. 12640090463400286 [30, 54] 12640090463400287
1.111111.1....1111.......1111.. 26918107252899406 [31, 55] 26918107252899407

what is the asymptotic growth rate of the records?  it appears worse than O(2^n).

the later entries were calculated with a Haskell program doing branch-and-bound, which is much more efficient than brute force.  below is the key routine.  searching all binary numbers of a given bitwidth is equivalent to traversing a full binary tree of a given height.  because we are looking for records, we know what previous record we need to exceed.  when exploring a node in the middle of the tree, we know how many primes (or primes minus 1) we already have in the path back to the root.  the upper bound of the number of primes left, down to the leaf, is the height above the leaves.  these can be combined to yield an upper bound for the number of primes on this branch of the tree.  if the upper bound is less than the goal, we can prune, abandoning this branch (mzero).

binary numbers are represented as little-endian lists of Bool.  searching the False branch first searches smaller numbers first.  (future work: it's probably better to store the "path so far" as a bitstring.)

dosearch :: forall m . (MonadPlus m) => Integer -> ([Bool], Integer) -> Integer -> m[Bool];
dosearch goal (_, primessofar) numleft | primessofar+numleft < goal = Monad.mzero;
dosearch _ (pathsofar, _) 0 = return pathsofar;
dosearch goal (pathsofar, primessofar) numleft = let {
  nextodd :: [Bool] = True:pathsofar;
  nextprimesofar :: Integer = if isPrime $ binarytointeger nextodd
   then 1+primessofar
   else primessofar;
  nextsearch :: [Bool] -> m [Bool];
  nextsearch path = dosearch goal (path, nextprimesofar) (pred numleft);
} in nextsearch (False:pathsofar) `Monad.mplus` nextsearch nextodd;

(the infix application of mplus eliminates some parentheses.)
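
the routine above calls two helpers that are not shown, binarytointeger and isPrime.  here is a minimal sketch of the first, assuming the little-endian convention described above (head of the list is the least significant bit).  integertobinary is a hypothetical inverse added only for illustration, and isPrime is assumed to come from elsewhere (a library or plain trial division).

```haskell
-- little-endian: the head of the list is the least significant bit.
-- this is a guess at the helper used by dosearch, not the original code.
binarytointeger :: [Bool] -> Integer
binarytointeger = foldr (\b acc -> 2 * acc + (if b then 1 else 0)) 0

-- hypothetical inverse, for non-negative inputs only.
integertobinary :: Integer -> [Bool]
integertobinary 0 = []
integertobinary n = odd n : integertobinary (n `div` 2)
```

with this convention, consing a bit onto the path (as in True:pathsofar) appends a new low-order bit, which matches walking down the tree from the most significant bit.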

future work: parallelize, faster primality testing.

below is a list that includes numbers that tie (not necessarily exceed) the record number of primes.  these were found by brute force, so the list does not go as far as above.  the last number produces 21 primes.  no longer are all starting numbers prime or one less than a prime.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 16 17 18 19 20 21 22 23 36 37 40 41 42 43 44 45 46 47 72 73 82 83 88 89 92 93 94 95 106 107 144 145 146 147 148 149 150 151 156 157 162 163 164 165 166 167 178 179 190 191 292 293 312 313 330 331 332 333 334 335 346 347 352 353 356 357 358 359 382 383 586 587 660 661 716 717 718 719 1320 1321 1432 1433 1436 1437 1438 1439 2876 2877 2878 2879 5756 5757 5758 5759 6120 6121 11278 11279 11496 11497 11512 11513 11514 11515 11516 11517 11518 11519 12240 12241 22992 22993 23026 23027 23028 23029 23036 23037 23038 23039 24480 24481 46072 46073 46076 46077 46078 46079 48960 48961 48962 48963 84718 84719 90238 90239 91968 91969 92040 92041 92106 92107 92110 92111 92118 92119 92144 92145 92146 92147 92152 92153 97926 97927 184080 184081 184290 184291 184304 184305 184306 184307 184308 184309 195852 195853 195854 195855 195862 195863 360946 360947 360952 360953 360958 360959 367878 367879 368160 368161 368162 368163 368398 368399 368442 368443 368446 368447 368578 368579 368580 368581 368582 368583 368608 368609 391710 391711 737158 737159 737216 737217 737218 737219 783412 783413 783420 783421 1474432 1474433 1474438 1474439 1566826 1566827 1566840 1566841 1566842 1566843 1566846 1566847 2943000 2943001 2948712 2948713 2948864 2948865 2948866 2948867 2948872 2948873 2948876 2948877 2948878 2948879 2949118 2949119 3133642 3133643 3133652 3133653 3133654 3133655 3133680 3133681 3133682 3133683 3133684 3133685 3133686 3133687 3133692 3133693 3133694 3133695 3136512 3136513 4864080 4864081 5775126 5775127 5775148 5775149 5775270 5775271 5775336 5775337 5886000 5886001 5886002 5886003 5886006 5886007 5890582 5890583 5890606 5890607 5895118 5895119 5895166 5895167 5895366 5895367 5895418 5895419 5897250 5897251 5897266 5897267 5897278 5897279 5897424 5897425 5897426 5897427 5897728 5897729 5897730 5897731 5897732 5897733 5897734 5897735 5897744 5897745 5897746 5897747 5897752 5897753 5897754 5897755 5897756 5897757 5897758 5897759 5897760 5897761 
5897766 5897767 5897806 5897807 5897818 5897819 5897832 5897833 5897952 5897953 5898236 5898237 5898238 5898239 6237598 6237599 6259972 6259973 6266926 6266927 6267060 6267061 6267070 6267071 6267076 6267077 6267082 6267083 6267284 6267285 6267286 6267287 6267298 6267299 6267304 6267305 6267306 6267307 6267308 6267309 6267310 6267311 6267346 6267347 6267360 6267361 6267366 6267367 6267388 6267389 11550298 11550299 11550540 11550541 11781166 11781167 11795460 11795461 11795470 11795471 11795518 11795519 12534142 12534143 12534568 12534569 12534616 12534617 12534618 12534619 12534720 12534721 12534778 12534779 25068286 25068287 25069138 25069139 25069440 25069441 25069442 25069443 25069468 25069469 25069556 25069557 25069558 25069559 34214398 34214399 47124666 47124667 47124862 47124863 47161342 47161343 47178132 47178133 47178238 47178239 47181960 47181961 50136572 50136573 50136574 50136575 50136660 50136661 50136958 50136959 50138276 50138277 50138278 50138279 50138470 50138471 50138478 50138479 50138880 50138881 50138882 50138883 50138884 50138885 50138886 50138887 50138926 50138927 50138936 50138937 50138938 50138939 50139112 50139113 50139114 50139115 50139116 50139117 50139118 50139119 68428792 68428793 68428796 68428797 68428798 68428799 92402388 92402389 92404198 92404199 92404342 92404343 94176000 94176001 94176012 94176013 94176028 94176029 94176772 94176773 94176778 94176779 94249332 94249333 94249334 94249335 94249338 94249339 94249720 94249721 94249722 94249723 94249724 94249725 94249726 94249727 94321342 94321343 94322684 94322685 94322686 94322687 94325866 94325867 94326666 94326667 94326718 94326719 94356022 94356023 94356264 94356265 94356266 94356267 94356476 94356477 94356478 94356479 94356570 94356571 94358800 94358801 94358808 94358809 94358826 94358827 94363656 94363657 94363680 94363681 94363692 94363693 94363728 94363729 94363768 94363769 94363920 94363921 94363922 94363923 94364026 94364027 94365888 94365889 94365892 94365893 94371750 
94371751 100273056 100273057 100273138 100273139 100273140 100273141 100273144 100273145 100273146 100273147 100273148 100273149 100273150 100273151 100273152 100273153 100273320 100273321 100273322 100273323 100273912 100273913 100273916 100273917 100273918 100273919 100277766 100277767 100277872 100277873 188353558 188353559 188727312 188727313 188727456 188727457 188731786 188731787 188743500 188743501 200546302 200546303 200546640 200546641 200547836 200547837 200547838 200547839 200553106 200553107 200553108 200553109 200555520 200555521 200555526 200555527 200555532 200555533 200555534 200555535 200555542 200555543 200555566 200555567 200555700 200555701 200555712 200555713 200555744 200555745 200555746 200555747 200555752 200555753 273715192 273715193 342841318 342841319 369616798 369616799 369617340 369617341 369621540 369621541 376704000 376704001 376704006 376704007 376707116 376707117 376707118 376707119 376707480 376707481 376997280 376997281 376997338 376997339 376997352 376997353 376998406 376998407 376998888 376998889 376998898 376998899 376998900 376998901 377289658 377289659 377303040 377303041 377303470 377303471 377303532 377303533 377306668 377306669 377425056 377425057 377425912 377425913 377426280 377426281 377454624 377454625 377454626 377454627 377454628 377454629 377454768 377454769 377454912 377454913 377454914 377454915 377454996 377454997 377455692 377455693 377456110 377456111 377463552 377463553 377463570 377463571 377463572 377463573 377463574 377463575 377487000 377487001 401111040 401111041 401111068 401111069 401111400 401111401 401111506 401111507 753408000 753408001 754613338 754613339 754850112 754850113 754851826 754851827 754852560 754852561 754909828 754909829 754927140 754927141 754974000 754974001 1509948000 1509948001 1509948002 1509948003 1604444220 1604444221 1604446026 1604446027 3013632406 3013632407 3018453358 3018453359 3019410240 3019410241 3019708566 3019708567 3019896000 3019896001 6036906718 6036906719 6039792000 
6039792001 6039792002 6039792003 6039792006 6039792007 6417777072 6417777073 6417807238 6417807239 12073713072 12073713073 12073813436 12073813437 12073813438 12073813439 12077629236 12077629237 12078563326 12078563327 12078595558 12078595559 12078834268 12078834269 12079584000 12079584001 12079584002 12079584003 12079584004 12079584005 12079584006 12079584007 12079584012 12079584013 12079584014 12079584015 12079584016 12079584017 12079584028 12079584029 12079584046 12079584047 12079584052 12079584053 12834985258 12834985259 12835553760 12835553761 12835553770 12835553771 12835554144 12835554145 12835554146 12835554147 12835554232 12835554233 12835587846 12835587847 12835614476 12835614477 12835614478 12835614479
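
the brute-force enumeration behind the two lists might look like this in Haskell.  this is a sketch, not the program actually used: countp is a direct port of the Pari/GP function at the top, isPrime is naive trial division (an assumption, far too slow for the later entries), and recordsBy, records and ties are hypothetical names.

```haskell
-- naive trial division; a stand-in for a real primality test.
isPrime :: Integer -> Bool
isPrime n
  | n < 2     = False
  | otherwise = all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- port of the Pari/GP countp: returns (number of primes, bitwidth).
-- an odd p is tested itself; an even p is tested as p+1.
countp :: Integer -> (Int, Int)
countp = go 0 0
  where
    go :: Int -> Int -> Integer -> (Int, Int)
    go primes width p
      | p <= 0    = (primes, width)
      | odd p     = go (bump p) (width + 1) ((p - 1) `div` 2)
      | otherwise = go (bump (p + 1)) (width + 1) (p `div` 2)
      where
        bump q = if isPrime q then primes + 1 else primes

-- lazily enumerate starting numbers whose prime count beats the best so far.
recordsBy :: (Int -> Int -> Bool) -> [Integer]
recordsBy beats = go 0 [0 ..]
  where
    go best (n : ns)
      | c `beats` best = n : go c ns
      | otherwise      = go best ns
      where c = fst (countp n)
    go _ [] = []

records, ties :: [Integer]
records = recordsBy (>)   -- strict records, as in the first table
ties    = recordsBy (>=)  -- records and ties, as in the list above
```

for example, countp 46 evaluates to (5,6), matching the table entry for 46.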

by Unknown at September 07, 2022 12:33 AM

September 03, 2022

Joachim Breitner

More recursive definitions

Haskell is a pure and lazy programming language, and the laziness allows us to write some algorithms very elegantly, by recursively referring to already calculated values. A typical example is the following definition of the Fibonacci numbers, as an infinite stream:
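
The listing did not survive in this copy of the post. The definition meant here is presumably the classic lazy stream, sketched below; the name `fibs` is an assumption.

```haskell
-- The stream refers to itself: each element after the first two is the sum
-- of the stream and its own tail, computed lazily on demand.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
```

Only as much of the list as is demanded gets computed, so something like `take 10 fibs` terminates even though the list is infinite.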

Elegant graph traversals

A maybe more practical example is the following calculation of the transitive closure of a graph:

We represent graphs as maps from each vertex to its successor vertices, and define the resulting map sets recursively: the set of vertices reachable from a vertex v is v itself, plus those reachable from its successors vs, for which we query sets.
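
The code itself is missing from this copy. A sketch that matches the description, using the names transitive1 and sets from the text, could look as follows; the `Graph` type synonym and the use of `Int` vertices are assumptions, and the lazy `Data.Map` is required for the self-reference to work.

```haskell
import qualified Data.Map as M   -- the lazy variant; values are thunks
import qualified Data.Set as S

type Graph = M.Map Int [Int]

-- Reachable sets defined in terms of themselves. Laziness resolves the
-- self-reference as long as the graph has no cycles; every successor is
-- assumed to be a key of the map.
transitive1 :: Graph -> Graph
transitive1 g = M.map S.toList sets
  where
    sets :: M.Map Int (S.Set Int)
    sets = M.mapWithKey (\v vs -> S.insert v (S.unions [ sets M.! v' | v' <- vs ])) g
```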

And, despite this apparent self-referential recursion, it works!

Cyclic graphs ruin it all

These tricks can be very impressive … until someone tries to use it on a cyclic graph and the program just hangs until we abort it:

At this point we are thrown back to implement a more pedestrian graph traversal, typically keeping explicit track of vertices that we have seen already:
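
The original listing is missing here as well. A typical rendering of the seen/todo idiom, under the name transitive2 that the ghci session uses, might be the following sketch; the `Graph` synonym and the helper `go` are assumptions.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

type Graph = M.Map Int [Int]

-- For every vertex, run a worklist traversal that remembers which vertices
-- were already visited, so cycles cannot cause an infinite loop.
transitive2 :: Graph -> Graph
transitive2 g = M.fromList [ (v, S.toList (go S.empty [v])) | v <- M.keys g ]
  where
    go :: S.Set Int -> [Int] -> S.Set Int
    go seen [] = seen
    go seen (v : todo)
      | v `S.member` seen = go seen todo
      | otherwise         = go (S.insert v seen) (M.findWithDefault [] v g ++ todo)
```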

I have written that seen/todo recursion idiom so often in the past, I can almost write it blindly. And indeed, this code handles cyclic graphs just fine:

ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]

But this is a bit anticlimactic – Haskell is supposed to be a declarative language, and transitive1 declares my intent just fine!

We can have it all

It seems there actually is a way to write essentially the code in transitive1, and still get the right result in all cases, and I have just published a possible implementation as rec-def. In the module Data.Recursive.Set we find an API that resembles that of Set, with a type RSet a, and in addition to conversion functions from and to sets, we find the two operations that we needed in transitive1:

Let’s try that:

And indeed it works! Magic!

ghci> transitive2 $ M.fromList [(1,[3]),(2,[1,3]),(3,[])]
fromList [(1,[1,3]),(2,[1,2,3]),(3,[3])]
ghci> transitive2 $ M.fromList [(1,[2,3]),(2,[1,3]),(3,[])]
fromList [(1,[1,2,3]),(2,[1,2,3]),(3,[3])]

To show off some more, here are small examples:

ghci> let s = RS.insert 42 s in RS.get s
fromList [42]
ghci> :{
  let s1 = RS.insert 23 s2
      s2 = RS.insert 42 s1
  in RS.get s1
:}
fromList [23,42]

How is that possible? Is it still Haskell?

The internal workings of the RSet a type will be the topic of a future blog post; let me just briefly mention that it uses unsafe features under the hood, and just keeps applying the equations you gave until a fixed point is reached. Because it starts with the empty set and all operations provided by Data.Recursive.Set are monotone (e.g. there is no set difference), it will eventually find the least fixed point.
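
To make the fixed-point idea concrete without any unsafe machinery, here is a plain, pure sketch of the same iteration for the transitive-closure equations. It is not how rec-def is implemented; `step`, `lfp` and `transitiveFix` are hypothetical names, and the graph is assumed to be finite.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

type Graph = M.Map Int [Int]

-- One round of the equations: each vertex reaches itself plus whatever its
-- successors are currently known to reach.
step :: Graph -> M.Map Int (S.Set Int) -> M.Map Int (S.Set Int)
step g sets =
  M.mapWithKey
    (\v vs -> S.insert v (S.unions [ M.findWithDefault S.empty v' sets | v' <- vs ]))
    g

-- Apply a function repeatedly until the value stops changing.
lfp :: Eq a => (a -> a) -> a -> a
lfp f x
  | x' == x   = x
  | otherwise = lfp f x'
  where x' = f x

-- Start from empty sets; since step is monotone and the sets are bounded by
-- the vertex set, the iteration reaches the least fixed point.
transitiveFix :: Graph -> Graph
transitiveFix g = M.map S.toList (lfp (step g) (M.map (const S.empty) g))
```

On the cyclic example graph this stabilises after a few rounds, because the sets can only grow and there are only finitely many vertices to add.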

Despite the unsafe machinery under the hood, I claim that Data.Recursive.Set is itself nicely safe, and does not destroy Haskell’s nice properties like purity, referential transparency and equational reasoning. If you disagree, I’d like to hear about it (here, on Twitter, Reddit or Discourse)! There is a brief discussion at the end of the tutorial in Data.Recursive.Example.

More than sets

The library also provides Data.Recursive.Bool for recursive equations with booleans (preferring False) and Data.Recursive.DualBool (preferring True), and some operations like member :: Ord a => a -> RSet a -> RBool can actually connect different types. I plan to add other data types (natural numbers, maps, Maybe, with suitable orders) as demand arises and as I come across nice small example use-cases for the documentation (e.g. finding shortest paths in a graph).

I believe this idiom is practically useful in a wide range of applications (which of course all have some underlying graph structure – but then almost everything in Computer Science is a graph). My original motivation was a program analysis. Imagine you want to find out from where in your program you can run into a division by zero. As long as your program does not have recursion, you can simply keep track of a boolean flag while you traverse the program, maintaining a mapping from function names to whether they can divide by zero – all nice and elegant. But once you allow mutually recursive functions, things become tricky. Unless you use RBool! Simply use laziness, pass the analysis result down when analyzing the function’s right-hand sides, and it just works!

by Joachim Breitner at September 03, 2022 12:31 PM

September 01, 2022

Brent Yorgey

Competitive programming in Haskell: Infinite 2D array

If you like number theory, combinatorics, and/or optimizing Haskell code, I challenge you to solve Infinite 2D Array using Haskell.

  • Level 1: can you come up with a general formula to compute F_{x,y}?
  • Level 2: In general, how can you efficiently compute F_{x,y} \pmod {10^9 + 7}?
  • Level 3: Now implement the above ideas in Haskell so your solution actually fits within the 1 second time limit.

I have solved it but it was definitely challenging. In a subsequent blog post I’ll talk about my solution and ask for other optimization ideas.

by Brent at September 01, 2022 05:04 PM

August 29, 2022

Gabriella Gonzalez

Stop calling everything "Nix"


One of my pet peeves is when people abuse the term “Nix” without qualification when trying to explain the various components of the Nix ecosystem.

As a concrete example, a person might say:

“I hate Nix’s syntax”

… and when you dig into this criticism you realize that they’re actually complaining about the Nixpkgs API, which is not the same thing as the syntax of the Nix expression language.

So one of the goals of this post is to introduce some unambiguous terminology that people can use to refer to the various abstraction layers of the Nix ecosystem in order to avoid confusion. I’ll introduce each abstraction layer from the lowest level abstractions to the highest level abstractions.

Another reason I explain “Nix” in terms of these abstraction layers is because this helps people consult the correct manual. The Nix ecosystem provides three manuals that you will commonly need to refer to in order to become more proficient:

  • The Nix manual
  • The Nixpkgs manual
  • The NixOS manual

… and I hope by the end of this post it will be clearer which manual interests you for any given question.

Edit: Domen Kožar pointed out that there is an ongoing effort to standardize terminology here:

I’ll update the post to match the agreed-upon terminology when that is complete.

Layer #0: The Nix store

I use the term “Nix store” to mean essentially everything you can manage with the nix-store command-line tool.

That is the simplest definition, but to expand upon that, I mean the following files:

  • Derivations: /nix/store/*.drv
  • Build products: /nix/store/* without a .drv extension
  • Log files: /nix/var/log/nix/drvs/**
  • Garbage collection roots: /nix/var/nix/gcroots/**

… and the following operations:

  • Realizing a derivation

    i.e. converting a .drv file to the corresponding build products using nix-store --realise

  • Adding static files to the /nix/store

    i.e. nix-store --add

  • Creating GC roots for build products

    i.e. the --add-root option to nix-store

  • Garbage collecting derivations not protected by a GC root

    i.e. nix-store --gc

There are other things the Nix store supports (like profile management), but these are the most important operations.

CAREFULLY NOTE: the “Nix store” is independent of the “Nix language” (which we’ll define below). In other words, you could replace the front-end Nix programming language with another language (e.g. Guile Scheme, as Guix does). This is because the Nix derivation format (the .drv files) and the nix-store command-line interface are both agnostic of the Nix expression language. I have a talk which delves a bit more into this subject:

Layer #1: The Nix language

I use the term “Nix language” to encompass three things:

  • The programming language: source code we typically store in .nix files
  • Instantiation: the interpretation of Nix code to generate .drv files
  • Flakes: pure evaluation and instantiation caching

To connect this with the previous section, the typical pipeline for converting Nix source code to a build product is:

Nix source code (*.nix)            │ Nix language
      ↓ Instantiation              ├─────────────
Nix derivation (/nix/store/*.drv)  │
      ↓ Realization                │ Nix store
Nix build product (/nix/store/*)   │

In isolation, the Nix language is “just” a purely functional programming language with simple language constructs. For example, here is a sample Nix REPL session:

nix-repl> 2 + 2
4

nix-repl> x = "world"

nix-repl> "Hello, " + x
"Hello, world"

nix-repl> r = { a = 1; b = true; }

nix-repl> if r.b then r.a else 0
1

However, as we go up the abstraction ladder the idiomatic Nix code we’ll encounter will begin to stray from that simple functional core.

NOTE: Some people will disagree with my choice to include flakes at this abstraction layer since flakes are sometimes marketed as a dependency manager (similar to niv). I don’t view them in this way and I treat flakes primarily as a mechanism for purifying evaluation and caching instantiation, as outlined in this post:

… and if you view flakes in that capacity then they are a feature of the Nix language since evaluation/instantiation are the primary purpose of the programming language.

Layer #2: The Nix build tool

This layer encompasses the command-line interface to both the “Nix store” and the “Nix language”.

This includes (but is not limited to):

  • nix-store (the command, not the underlying store)
  • nix-instantiate
  • nix-build
  • nix-shell
  • nix subcommands, including:
    • nix build
    • nix run
    • nix develop
    • nix log
    • nix flake

I make this distinction because the command-line interface enables some additional niceties that are not inherent to the underlying layers. For example, the nix build command has some flake integration so that you can say nix build someFlake#somePackage and this command-line API nicety is not necessarily inherent to flakes (in my view).

Also, many of these commands operate at both Layer 0 and Layer 1, which can blur the distinction between the two. For example the nix-build command can accept a layer 1 Nix program (i.e. a .nix file) or a layer 0 derivation (i.e. a .drv file).

Another thing that blurs the distinction is that the Nix manual covers all three of the layers introduced so far, ranging from the Nix store to the command-line interface. However, if you want to better understand these three layers then that is the correct place to begin:

Layer #3: Nixpkgs

Nixpkgs is a software distribution (a.k.a. “distro”) for Nix. Specifically, all of the packaging logic for Nixpkgs is hosted on GitHub here:

This repository contains a large number of Nix expressions for building packages across several platforms. If the “Nix language” is a programming language then “Nixpkgs” is a gigantic “library” authored within that language. There are other Nix “libraries” outside of Nixpkgs but Nixpkgs is the one you will interact with the most.

The Nixpkgs repository establishes several widespread idioms and conventions, including:

  • The standard environment (a.k.a. stdenv) for authoring a package
    • There are also language-specific standard-environments, too
  • A domain-specific language for overriding individual packages or sets of packages

When people complain about “Nix’s syntax”, most of the time they’re actually complaining about Nixpkgs and more specifically complaining about the Nixpkgs system for overriding packages. However, I can see how people might mistake the two.

The reason for the confusion is that the Nixpkgs support for overrides is essentially an embedded domain-specific language, meaning that you still express everything in the Nix language (layer 1), but the ways in which you express things are fundamentally different than if you were simply using low-level Nix language features.

As a contrived example, this “layer 1” Nix code:

x = 1;

y = x + 2;

… would roughly correspond to the following “layer 3” Nixpkgs overlay:

self: super: {
  x = 1;

  y = self.x + 2;
}

The reason why Nixpkgs doesn’t do the simpler “layer 1” thing is because Nixpkgs is designed to support “late binding” of expressions, meaning that everything can be overridden, even dependencies deep within the dependency tree. Moreover, this overriding is done in such a way that everything “downstream” of the override (i.e. all reverse dependencies) picks up the change correctly.
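
For readers who think more easily in Haskell, the late-binding idea can be modelled as a fixed point over a lazy map. This is only a toy model of the self/super convention, not how Nixpkgs implements overlays; `Attrs`, `Overlay`, `base`, `overrideX` and `applyOverlays` are all hypothetical names.

```haskell
import Data.Function (fix)
import qualified Data.Map as M   -- lazy map, so values may refer to the final result

type Attrs   = M.Map String Int
type Overlay = Attrs -> Attrs -> Attrs   -- self -> super -> new or changed attributes

-- The "layer 3" example from above: y looks x up in the final set (self).
base :: Overlay
base self _super = M.fromList [("x", 1), ("y", self M.! "x" + 2)]

-- A later overlay that replaces x.
overrideX :: Overlay
overrideX _self _super = M.fromList [("x", 10)]

-- Fold the overlays left to right, feeding each one the final result as
-- self and the result of the earlier overlays as super.
applyOverlays :: [Overlay] -> Attrs
applyOverlays overlays = fix build
  where
    build self = foldl (extend self) M.empty overlays
    extend self super overlay = overlay self super `M.union` super
```

With only `base` applied, y comes out as 3; adding `overrideX` afterwards changes x to 10 and y follows to 12, even though the definition of y was never touched. That is the “downstream picks up the change” behaviour described above.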

As a more realistic example, the following program:

pkgs = import <nixpkgs> { };

fast-tags =

fast-tags-no-tests =
pkgs.haskell.lib.dontCheck fast-tags;


… is simpler, but is not an idiomatic use of Nixpkgs because it is not using the overlay system and therefore does not support late binding. The more idiomatic analog would be:

overlay = self: super: {
fast-tags =

fast-tags-no-tests =

pkgs = import <nixpkgs> { overlays = [ overlay ]; };


You can learn more about this abstraction layer by consulting the Nixpkgs manual:

Layer #4: NixOS

NixOS is an operating system that is (literally) built on Nixpkgs. Specifically, there is a ./nixos/ subdirectory of the Nixpkgs repository for all of the NixOS-related logic.

NixOS is based on the NixOS module system, which is yet another embedded domain-specific language. In other words, you configure NixOS with Nix code, but the idioms of that Nix code depart even more wildly from straightforward “layer 1” Nix code.

NixOS modules were designed to look more like Terraform modules than Nix code, but they are still technically Nix code. For example, this is what the NixOS module for the lorri service looks like at the time of this writing:

{ config, lib, pkgs, ... }:

let
  cfg =;
  socketPath = "lorri/daemon.socket";
in {
  options = {
    services.lorri = {
      enable = lib.mkOption {
        default = false;
        type = lib.types.bool;
        description = lib.mdDoc ''
          Enables the daemon for `lorri`, a nix-shell replacement for project
          development. The socket-activated daemon starts on the first request
          issued by the `lorri` command.
        '';
      };
      package = lib.mkOption {
        default = pkgs.lorri;
        type = lib.types.package;
        description = lib.mdDoc ''
          The lorri package to use.
        '';
        defaultText = lib.literalExpression "pkgs.lorri";
      };
    };
  };

  config = lib.mkIf cfg.enable {
    systemd.user.sockets.lorri = {
      description = "Socket for Lorri Daemon";
      wantedBy = [ "" ];
      socketConfig = {
        ListenStream = "%t/${socketPath}";
        RuntimeDirectory = "lorri";
      };
    };

    = {
      description = "Lorri Daemon";
      requires = [ "lorri.socket" ];
      after = [ "lorri.socket" ];
      path = with pkgs; [ config.nix.package git gnutar gzip ];
      serviceConfig = {
        ExecStart = "${cfg.package}/bin/lorri daemon";
        PrivateTmp = true;
        ProtectSystem = "strict";
        ProtectHome = "read-only";
        Restart = "on-failure";
      };
    };

    environment.systemPackages = [ cfg.package ];
  };
}

You might wonder how NixOS relates to the underlying layers. For example, if Nix is a build system, then how do you “build” NixOS? I have another post which elaborates on that subject here:

Also, you can learn more about this abstraction layer by consulting the NixOS manual:

Nix ecosystem

I use the term “Nix ecosystem” to describe all of the preceding layers and other stuff not mentioned so far (like hydra, the continuous integration service).

This is not a layer of its own, but I mention this because I prefer to use “Nix ecosystem” instead of “Nix” to avoid ambiguity, since the latter can easily be mistaken for an individual abstraction layer (especially the Nix language or the Nix build tool).

However, when I do hear people say “Nix”, then I generally understand it to mean the “Nix ecosystem” unless they clarify otherwise.


Hopefully this passive aggressive post helps people express themselves a little more precisely when discussing the Nix ecosystem.

If you enjoy this post, you will probably also like this other post of mine:

… since that touches on the Nixpkgs and NixOS embedded domain-specific languages and how they confound the user experience.

I’ll conclude this post with the following obligatory joke:

I’d just like to interject for a moment. What you’re referring to as Nix, is in fact, NixOS, or as I’ve recently taken to calling it, Nix plus OS. Nix is not an operating system unto itself, but rather another free component of a fully functioning ecosystem made useful by the Nix store, Nix language, and Nix build tool comprising a full OS as defined by POSIX.

Many Guix users run a modified version of the Nix ecosystem every day, without realizing it. Through a peculiar turn of events, the operating system based on Nix which is widely used today is often called Nix, and many of its users are not aware that it is basically the Nix ecosystem, developed by the NixOS foundation.

There really is a Nix, and these people are using it, but it is just a part of the system they use. Nix is the expression language: the program in the system that specifies the services and programs that you want to build and run. The language is an essential part of the operating system, but useless by itself; it can only function in the context of a complete operating system. Nix is normally used in combination with an operating system: the whole system is basically an operating system with Nix added, or NixOS. All the so-called Nix distributions are really distributions of NixOS!

by Gabriella Gonzalez at August 29, 2022 02:20 PM

FP Complete

FP Complete Corporation Announces Partnership with Portworx by Pure Storage

FP Complete Corporation Announces Partnership with Portworx by Pure Storage to Streamline World-Class DevOps Consulting Services with State-of-the-Art, End-To-End Storage and Data Management Solution for Kubernetes Projects.

Charlotte, North Carolina (August 31, 2022) – FP Complete Corporation, a global technology partner that specializes in DevSecOps, Cloud Native Computing, and Advanced Server-Side Programming Languages today announced that it has partnered with Portworx by Pure Storage to bring an integrated solution to customers seeking DevSecOps consulting services for the management of persistent storage, data protection, disaster recovery, data security, and hybrid data migrations.

The partnership between FP Complete Corporation and Portworx will be integral in providing FP Complete's DevSecOps and Cloud Enablement clients with a data storage platform designed to run in a container that supports any cloud physical storage on any Kubernetes distribution.

Portworx Enterprise gets right to the heart of what developers and Kubernetes admins want: data to behave like a cloud service. Developers and Admins wish to request Storage based on their requirements (capacity, performance level, resiliency level, security level, access, protection level, and more) and let the data management layer figure out all the details. Portworx PX-Backup adds enterprise-grade point-and-click backup and recovery for all applications running on Kubernetes, even if they are stateless.

Portworx shortens development timelines and headaches for companies moving from on-prem to cloud. In addition, the integration between FP Complete Corporation and Portworx allows the easy exchange of best practices information, so design and storage run in parallel.

Gartner predicts that by 2025, more than 85% of global organizations will be running containerized applications in production, up from less than 35% in 2019¹. As container adoption increases and more applications are being deployed in the enterprise, these organizations want more options to manage stateful and persistent data associated with these modern applications.

"It is my pleasure to announce that Pure Storage can now be utilized by our world-class engineers needing a fully integrated, end-to-end storage and data management solution for our DevSecOps clients with complicated Kubernetes projects. Pure Storage is known globally for its strength in the storage industry, and this partnership offers strong support for our business," said Wes Crook, CEO of FP Complete Corporation.

“There can be zero doubt that most new cloud-native apps are built on containers and orchestrated by Kubernetes. Unfortunately, the early development on containers resulted in lots of data access and availability issues due to a lack of enterprise-grade persistent storage data management and low data visibility. With Portworx and the aid of Kubernetes experts like FP Complete, we can offer customers a rock-solid, enterprise-class, cloud-native development platform that delivers end-to-end application and data lifecycle management that significantly lowers the risks and costs of operating cloud-native application infrastructure,” said Venkat Ramakrishnan, VP, Engineering, Cloud Native Business Unit, Pure Storage.

About FP Complete Corporation
Founded in 2012 by Aaron Contorer, former Microsoft executive, FP Complete Corporation is known globally as the one-stop, full-stack technology shop that delivers agile, reliable, repeatable, and highly secure software. In 2019, we launched our flagship platform, Kube360®, which is a fully managed enterprise Kubernetes-based DevOps ecosystem. With Kube360, FP Complete is now well positioned to provide a complete suite of products and solutions to our clients on their journey towards cloudification, containerization, and DevOps best practices. The Company's mission is to deliver superior software engineering to build great software for our clients. FP Complete Corporation serves more than 200 global clients and employs over 70 people worldwide. It has won many awards and made the Inc. 5000 list in 2020 for being one of the 5000 fastest-growing private companies in America. For more information about FP Complete Corporation, visit its website.

1 Arun Chandrasekaran, Best Practices for Running Containers and Kubernetes in Production, Gartner, August 2020

August 29, 2022 12:00 AM

August 28, 2022

Gabriella Gonzalez

Incrementally package a Haskell program using Nix


This post walks through how to take a standalone Haskell file and progressively package the file using Nix. In other words, we will tour a spectrum of packaging options ranging from simple to fancy.

The running example will be the following standalone single-file Haskell program:

I won’t go into detail about what that program does, although you can study the program if you are curious. Essentially, I’m planning to deliver a talk based on that program at this year’s MuniHac and I wanted to package it up so that other people could collaborate on the program with me during the hackathon.

When I began writing this post, there was no packaging logic for this program; it’s a standalone Haskell file. However, this file has several dependencies outside of Haskell’s standard library, so up until now I needed some way to obtain those dependencies for development.

Stage 0: ghc.withPackages

The most low-tech way that you can hack on a Haskell program using Nix is to use nix-shell to obtain a transient development environment (this is what I had done up until now).

Specifically, you can do something like this:

$ nix-shell --packages 'ghc.withPackages (pkgs: [ pkgs.mtl pkgs.MemoTrie pkgs.containers pkgs.pretty-show ])'

… where pkgs.mtl and pkgs.MemoTrie indicate that I want to include the mtl and MemoTrie packages in my Haskell development environment.

Inside of that development environment I can build and run the file using ghc. For example, I can use ghc -O to build an executable to run:

[nix-shell]$ ghc -O Spire.hs
[nix-shell]$ ./Spire

… or if I don’t care about optimizations I can interpret the file using runghc:

$ runghc Spire.hs

Stage 1: IDE support

Once I’m inside a Nix shell I can begin to take advantage of integrated development environment (IDE) support.

The two most common tools Haskell developers use for rapid feedback are ghcid and haskell-language-server:

  • ghcid provides a command-line interface for fast type-checking feedback but doesn’t provide other IDE-like features

  • haskell-language-server is more of a proper IDE that you use in conjunction with some editor

I can obtain either tool by exiting from the shell and creating a new shell that includes the desired tool.

For example, if I want to use ghcid then I recreate the nix-shell using the following command:

$ nix-shell --packages ghcid 'ghc.withPackages (pkgs: [ pkgs.mtl pkgs.MemoTrie pkgs.containers pkgs.pretty-show ])'

… and then I can tell ghcid to continuously type-check my file using:

[nix-shell]$ ghcid Spire.hs

If I want to use haskell-language-server, then I recreate the nix-shell using this command:

$ nix-shell --packages haskell-language-server 'ghc.withPackages (pkgs: [ pkgs.mtl pkgs.MemoTrie pkgs.containers pkgs.pretty-show ])'

… and then I can explore the code in any editor that supports the language server protocol.

Note that if you use VSCode as your editor then you may need to install some additional plugins; the next section will show how to install VSCode and those plugins using Nix.

However, once you do install those plugins then you can open the file in VSCode from within the nix-shell using:

[nix-shell]$ code Spire.hs

… and once you trust the file the IDE features will kick in.

Stage 2: Global development environment

Sometimes I like to globally install development tools that are commonly shared between projects. For example, if I use ghcid or haskell-language-server across all my projects then I don’t want to have to explicitly enumerate that tool in each project’s Nix shell.

Moreover, my tool preferences might not be shared by other developers. If I share my nix-shell with other developers for a project then I probably don’t want to add editors/IDEs or other command-line tools to that environment because then they have to download those tools regardless of whether they plan to use them.

However, I don’t want to globally install development tools like this:

$ nix-env --install --file '<nixpkgs>' --attr ghcid
$ nix-env --install --file '<nixpkgs>' --attr haskell-language-server

Part of the reason I use Nix is to avoid imperatively managing my development environment. Fortunately, though, nix-env supports a more declarative way of managing dependencies.

What you can do instead is save a file like this to ~/default.nix:

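let
  # For VSCode
  config = { allowUnfree = true; };

  overlay = pkgsNew: pkgsOld: {
    # Here's an example of how to use Nix to install VSCode with plugins managed
    # by Nix, too
    vscode-with-extensions = pkgsOld.vscode-with-extensions.override {
      vscodeExtensions = [
        # Assumed example entries: Haskell extensions packaged in Nixpkgs
        pkgsNew.vscode-extensions.haskell.haskell
        pkgsNew.vscode-extensions.justusadam.language-haskell
      ];
    };
  };

  pkgs = import <nixpkgs> { inherit config; overlays = [ overlay ]; };

in
  { inherit (pkgs)
      # I included some sample useful development tools for Haskell. Feel free
      # to customize.
      cabal-install
      ghcid
      haskell-language-server
      vscode-with-extensions
    ;
  }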

… and once you create that file you have two options.

The first option is that you can set your global development environment to match the file by running:

$ nix-env --remove-all --install --file ~/default.nix

NOTE: At the time of this writing you may also need to add --system x86_64-darwin if you are trying out these examples on an M1 Macbook. For more details, see:

Carefully note the --remove-all, which resets your development environment to match the file, so that nothing from your old development environment is accidentally carried over into your new development environment. This makes our use of the nix-env command truly declarative.

The second option is that you can change the file to create a valid shell, like this:

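let
  config = { allowUnfree = true; };

  overlay = pkgsNew: pkgsOld: {
    vscode-with-extensions = pkgsOld.vscode-with-extensions.override {
      # Assumed example entries: Haskell extensions packaged in Nixpkgs
      vscodeExtensions = [
        pkgsNew.vscode-extensions.haskell.haskell
        pkgsNew.vscode-extensions.justusadam.language-haskell
      ];
    };
  };

  pkgs = import <nixpkgs> { inherit config; overlays = [ overlay ]; };

in
  pkgs.mkShell {
    # Assumed example tools, taken from the ones this post mentions
    packages = [ pkgs.ghcid pkgs.haskell-language-server pkgs.vscode-with-extensions ];
  }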

… and then run:

$ nix-shell ~/default.nix

Or, even better, you can rename the file to ~/shell.nix and then if you’re already in your home directory (e.g. you just logged into your system), then you can run:

$ nix-shell

… which will select ~/shell.nix by default. This lets you get a completely transient development environment so that you never have to install any development tools globally.

These nix-shell commands stack, so you can first run nix-shell to obtain your global development environment and then use nix-shell a second time to obtain project-specific dependencies.

My personal preference is to use the declarative nix-env trick for installing global development tools. In my opinion it’s just as elegant as nix-shell and slightly less hassle.

Stage 3: Cabal

Anyway, enough about global development tools. Back to our Haskell project!

So ghc.withPackages is a great way to just start hacking on a standalone Haskell program when you don’t want to worry about packaging up the program. However, at some point you might want to share the program with others or do a proper job of packaging if you’re trying to productionize the code.

That brings us to the next step, which is packaging our Haskell program with a Cabal file (a Haskell package manifest). We’ll need the cabal-install command-line tool before we proceed further, so you’ll want to add that tool to your global development environment (see the previous section).

To create our .cabal file we can run the following command from the top-level directory of our Haskell project:

$ cabal init --interactive
Should I generate a simple project with sensible defaults? [default: y] n

… and follow the prompts to create a starting point for our .cabal file.

After completing those choices and trimming down the .cabal file (to keep the example simple), I get a file that looks like this:

cabal-version:      2.4
name:               spire
version:            1.0.0
license:            BSD-3-Clause
license-file:       LICENSE

executable spire
    main-is:          Spire.hs
    build-depends:    base ^>=
    default-language: Haskell2010

The only thing I’m going to change for now is to add dependencies to the build-depends section and increase the upper bound on base:

cabal-version:      2.4
name:               spire
version:            1.0.0
license:            BSD-3-Clause
license-file:       LICENSE

executable spire
    main-is:          Spire.hs
    build-depends:    base >= && < 5
                    , MemoTrie
                    , containers
                    , mtl
                    , pretty-show
                    , transformers
    default-language: Haskell2010

Stage 4: cabal2nix --shell

Adding a .cabal file suffices to share our Haskell package with other Haskell developers if they’re not using Nix. However, if we want to Nix-enable our package then we have a few options.

The simplest option is to run the following command from the top-level of the Haskell project:

$ cabal2nix --shell . > shell.nix

That will create something similar to the following shell.nix file:

{ nixpkgs ? import <nixpkgs> {}, compiler ? "default", doBenchmark ? false }:

let

  inherit (nixpkgs) pkgs;

  f = { mkDerivation, base, containers, lib, MemoTrie, mtl
      , pretty-show, transformers
      }:
      mkDerivation {
        pname = "spire";
        version = "1.0.0";
        src = ./.;
        isLibrary = false;
        isExecutable = true;
        executableHaskellDepends = [
          base containers MemoTrie mtl pretty-show transformers
        ];
        license = lib.licenses.bsd3;
      };

  haskellPackages = if compiler == "default"
                       then pkgs.haskellPackages
                       else pkgs.haskell.packages.${compiler};

  variant = if doBenchmark then pkgs.haskell.lib.doBenchmark else pkgs.lib.id;

  drv = variant (haskellPackages.callPackage f {});

in

  if pkgs.lib.inNixShell then drv.env else drv

… and if you run nix-shell within the same directory the shell environment will have the Haskell dependencies you need to build and run the project using cabal:

$ nix-shell
[nix-shell]$ cabal run

… and tools like ghcid and haskell-language-server will also work within this shell. The only difference is that ghcid now takes no arguments, since it will auto-detect the cabal project in the current directory:

[nix-shell]$ ghcid

Note that this nix-shell will NOT include cabal by default. You will need to globally install cabal (see the prior section on “Global development environment”).

This cabal2nix --shell workflow is sufficiently lightweight that you can Nixify other people’s projects on the fly when hacking on them locally. A common thing I do if I need to make a change to a person’s project is to clone their repository, run:

$ cabal2nix --shell . > shell.nix
$ nix-shell

… and start hacking away. I don’t even need to upstream the shell.nix file I created in this way; I just keep it around locally for my own hacking.

In fact, I typically don’t want to upstream such a shell.nix file (even if the upstream author were receptive to Nix), because there are more robust Nix expressions we can upstream instead.

Stage 5: Custom shell.nix file

One disadvantage of cabal2nix --shell is that you have to re-run the command any time your dependencies change. However, if you’re willing to hand-write your own shell.nix file then you can create something more stable:

let
  overlay = pkgsNew: pkgsOld: {
    haskellPackages = pkgsOld.haskellPackages.override (old: {
      overrides = pkgsNew.haskell.lib.packageSourceOverrides {
        spire = ./.;
      };
    });
  };

  pkgs = import <nixpkgs> { overlays = [ overlay ]; };

in
  pkgs.haskellPackages.spire.env

The packageSourceOverrides is the key bit. Under the hood, that essentially runs cabal2nix for you any time your project changes and then generates your development environment from the result. You can also use packageSourceOverrides to specify non-default versions of dependencies, too:

let
  overlay = pkgsNew: pkgsOld: {
    haskellPackages = pkgsOld.haskellPackages.override (old: {
      overrides = pkgsNew.haskell.lib.packageSourceOverrides {
        spire = ./.;

        # Example of how to pin a dependency to a non-default version
        pretty-show = "1.9.5";
      };
    });
  };

  pkgs = import <nixpkgs> { overlays = [ overlay ]; };

in
  pkgs.haskellPackages.spire.env

… although that will only work for packages that have been released prior to the version of Nixpkgs that you’re depending on.

If you want something a bit more robust, you can do something like this:

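let
  overlay = pkgsNew: pkgsOld: {
    haskellPackages = pkgsOld.haskellPackages.override (old: {
      overrides =
        # Assumed combinator: fold the override sets together, starting from
        # any overrides that were already present
        pkgsNew.lib.foldr pkgsNew.lib.composeExtensions
          (old.overrides or (_: _: { }))
          [ (pkgsNew.haskell.lib.packageSourceOverrides {
              spire = ./.;
            })
            (pkgsNew.haskell.lib.packagesFromDirectory {
              directory = ./packages;
            })
          ];
    });
  };

  pkgs = import <nixpkgs> { overlays = [ overlay ]; };

in
  pkgs.haskellPackages.spire.env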

… and then you have the option to also depend on any dependency that cabal2nix knows how to generate:

$ mkdir packages

$ # Add the following file to version control to preserve the directory
$ touch packages/.gitkeep

$ cabal update

$ cabal2nix cabal://${PACKAGE_NAME}-${VERSION} > ./packages/${PACKAGE_NAME}.nix

… and that works even on bleeding-edge Haskell packages that Nixpkgs hasn’t picked up yet.

Stage 6: Pinning Nixpkgs

All of the prior examples are “impure”, meaning that they depend on the ambient nixpkgs channel installed on the developer’s system. This nixpkgs channel might vary from system to system, meaning that each system might have different versions of nixpkgs installed, and then you run into issues reproducing each other’s builds.

For example, if you have a newer version of nixpkgs installed your Nix build for the above Haskell project might succeed, but then another developer might attempt to build your project with an older version of nixpkgs, which might select an older incompatible version of one of your Haskell dependencies.

Or, vice versa, the examples in this blog post might succeed at the time of this writing for the current version of nixpkgs but then as time goes on the examples might begin to fail for future versions of nixpkgs.

You can fix that by pinning Nixpkgs, which this post covers:

For example, we could pin nixpkgs for our global ~/default.nix like this:

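let
  nixpkgs = builtins.fetchTarball {
    url = "";
    sha256 = "14ann7vz7qgfrw39ji1s19n1p0likyf2ag8h7rh8iwp3iv5lmprl";
  };

  config = { allowUnfree = true; };

  overlay = pkgsNew: pkgsOld: {
    vscode-with-extensions = pkgsOld.vscode-with-extensions.override {
      vscodeExtensions = [
        # Assumed example entries: Haskell extensions packaged in Nixpkgs
        pkgsNew.vscode-extensions.haskell.haskell
        pkgsNew.vscode-extensions.justusadam.language-haskell
      ];
    };
  };

  pkgs = import nixpkgs { inherit config; overlays = [ overlay ]; };

in
  { inherit (pkgs)
      # Assumed example tools, taken from the ones this post mentions
      cabal-install ghcid haskell-language-server vscode-with-extensions
    ;
  }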

… which pins us to the tip of the release-22.05 branch at the time of this writing.

We can likewise pin nixpkgs for our project-local shell.nix like this:

let
  nixpkgs = builtins.fetchTarball {
    url = "";
    sha256 = "14ann7vz7qgfrw39ji1s19n1p0likyf2ag8h7rh8iwp3iv5lmprl";
  };

  overlay = pkgsNew: pkgsOld: {
    haskellPackages = pkgsOld.haskellPackages.override (old: {
      overrides = pkgsNew.haskell.lib.packageSourceOverrides {
        spire = ./.;
      };
    });
  };

  pkgs = import nixpkgs { overlays = [ overlay ]; };

in
  pkgs.haskellPackages.spire.env


Stage 7: Flakes

The final improvement we can make is the most important one of all: we can convert our project into a Nix flake:

There are two main motivations for flake-enabling our project:

  • To simplify managing inputs that we need to lock (e.g. nixpkgs)
  • To speed up our shell

To flake-enable our project, we’ll save the following code to flake.nix:

{ inputs = {
    nixpkgs.url = github:NixOS/nixpkgs/release-22.05;

    utils.url = github:numtide/flake-utils;
  };

  outputs = { nixpkgs, utils, ... }:
    utils.lib.eachDefaultSystem (system:
      let
        config = { };

        overlay = pkgsNew: pkgsOld: {
          # Assumed definition: the executable taken from the Haskell package set
          spire =
            pkgsNew.haskell.lib.justStaticExecutables
              pkgsNew.haskellPackages.spire;

          haskellPackages = pkgsOld.haskellPackages.override (old: {
            overrides = pkgsNew.haskell.lib.packageSourceOverrides {
              spire = ./.;
            };
          });
        };

        pkgs =
          import nixpkgs { inherit config system; overlays = [ overlay ]; };

      in
        rec {
          packages.default = pkgs.haskellPackages.spire;

          apps.default = {
            type = "app";

            program = "${pkgs.spire}/bin/spire";
          };

          devShells.default = pkgs.haskellPackages.spire.env;
        }
    );
}

… and then we can delete our old shell.nix because we don’t need it anymore.

Now we can obtain a development environment by running:

$ nix develop

… and the above flake also makes it possible to easily build and run the program, too:

$ nix run    # Run the program
$ nix build # Build the project

In fact, you can even run a flake without having to clone a repository. For example, you can run the example code from this blog post by typing:

$ nix run github:Gabriella439/spire

Moreover, we no longer have to take care of managing hashes for, say, Nixpkgs. The flake machinery takes care of that automatically for you and generates a flake.lock file which you can then add to version control. For example, the lock file I got was:

{
  "nodes": {
    "nixpkgs": {
      "locked": {
        "lastModified": 1661617163,
        "narHash": "sha256-NN9Ky47j8ohgPhA9JZyfkYIbbAo6RJkGz+7h8/exVpE=",
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "0ba2543f8c855d7be8e90ef6c8dc89c1617e8a08",
        "type": "github"
      },
      "original": {
        "owner": "NixOS",
        "ref": "release-22.05",
        "repo": "nixpkgs",
        "type": "github"
      }
    },
    "root": {
      "inputs": {
        "nixpkgs": "nixpkgs",
        "utils": "utils"
      }
    },
    "utils": {
      "locked": {
        "lastModified": 1659877975,
        "narHash": "sha256-zllb8aq3YO3h8B/U0/J1WBgAL8EX5yWf5pMj3G0NAmc=",
        "owner": "numtide",
        "repo": "flake-utils",
        "rev": "c0e246b9b83f637f4681389ecabcb2681b4f3af0",
        "type": "github"
      },
      "original": {
        "owner": "numtide",
        "repo": "flake-utils",
        "type": "github"
      }
    }
  },
  "root": "root",
  "version": 7
}

… and you can easily upgrade to, say, a newer revision of Nixpkgs if you need to.

Additionally, all of the Nix commands are now faster. Specifically, the first time you run a command Nix still needs to download and/or build dependencies, but subsequent runs are faster because Nix can skip the instantiation phase. For more details, see:


Conclusion

Flakes are our final destination, so that’s as far as this post will go. There are technically some more ways that we can overengineer things, but in my experience the idioms highlighted in this post are the ones that provide the highest power-to-weight ratio.

The key thing to take away is that the Nixpkgs Haskell infrastructure lets you smoothly transition from simpler approaches to more powerful approaches, and even the final flake-enabled approach is actually not that complicated.

by Gabriella Gonzalez at August 28, 2022 03:56 PM

August 27, 2022

Brent Yorgey

Types for top-level definitions

I’ve come up with an idea for a type system for first-class (global) definitions, which can serve as a very lightweight alternative to a proper module system. I’m posting it here in the hopes of getting some feedback and pointers to related work.

Commands and expressions

The programming language of Swarm (for lack of a better term I will hereafter refer to it as Swarmlang) has a bunch of imperative commands, and standard monadic sequencing constructs. For example,

move; move

does two move commands in sequence, and

thing <- grab; give friend thing

first executes grab, binding the variable thing to the result, then executes give friend thing. Of course, there is also a rich language of pure expressions, with things like arithmetic, strings, lambdas and function application, pairs, sums, and so on.

Some languages make a syntactic distinction between statements and expressions, but Swarmlang does not: everything is an expression, and some expressions happen to have a command type. If t is a type, then cmd t is the type of an imperative command which, when executed, can have some effects and then returns a result of type t. (Of course this should feel very familiar to Haskell programmers; cmd has many similarities to IO.) This approach makes many things simpler and means that commands are first-class values.

Typechecking definitions

Swarmlang has definitions, which are just expressions with a command type. If e is an expression, then

def x = e end

has type cmd (). When executed, it should have the effect of binding the name x to the expression e, and bringing x into scope for all subsequent commands. Thus, it is valid to sequence this first definition with a second definition that mentions x, like so:

def x = e end;
def y = foo bar x end

Of course, this means that while typechecking the definition of y, we must be able to look up the type of x. However, the type of the first def command is simply cmd (), which does not tell us anything about x or its type. Normally, the typing rule for sequencing of commands would be something like

\displaystyle \frac{\Gamma \vdash c_1 : \mathrm{cmd}\; \tau_1 \qquad \Gamma \vdash c_2 : \mathrm{cmd}\; \tau_2}{\Gamma \vdash c_1 ; c_2 : \mathrm{cmd}\;\tau_2}

but this does not work for def commands, since it does not take into account the new names brought into scope. Up until now, I have dealt with this in a somewhat ad-hoc manner, with some special typechecking rules for def and some ad-hoc restrictions to ensure that def can only syntactically show up at the top level. However, I would really like to put everything on a more solid theoretical basis (which will hopefully simplify the code as well).

Decorating command types

The basic idea is to decorate the \mathrm{cmd} type with extra information about names bound by definitions. As usual, let \Gamma denote a generic context, that is, a finite mapping from variable names to their types. Then we extend the cmd type by adding a context to it:

\mathrm{cmd}\; \tau \Rightarrow \Gamma

is the type of a command which yields a result of type \tau and produces global bindings for some names whose types are recorded in \Gamma. (Of course, we can continue to use \mathrm{cmd}\; \tau as an abbreviation for \mathrm{cmd}\; \tau \Rightarrow \varnothing.) So, for example, def x = 3 end no longer has type \mathrm{cmd}\; (), but rather something like \mathrm{cmd}\; () \Rightarrow \{x : \mathrm{int}\}, representing the fact that although def x = 3 end does not result in an interesting value, it does bind a name, x, whose type is int.

This is slightly unusual in that types and contexts are now mutually recursive, but that doesn’t seem like a big problem. We can now write down a proper typing rule for sequencing that takes definitions into account, something like this:

\displaystyle \frac{\Gamma \vdash c_1 : \mathrm{cmd} \; \tau_1 \Rightarrow \Gamma_1 \qquad \Gamma, \Gamma_1 \vdash c_2 : \mathrm{cmd} \; \tau_2 \Rightarrow \Gamma_2}{\Gamma \vdash c_1 ; c_2 : \mathrm{cmd} \; \tau_2 \Rightarrow \Gamma, \Gamma_1, \Gamma_2}

And of course the typing rule for def looks like this:

\displaystyle \frac{\Gamma \vdash e : \tau}{\Gamma \vdash \texttt{def}\; x = e\; \texttt{end} : \mathrm{cmd}\; () \Rightarrow \{x : \tau\}}

These rules together can now correctly typecheck an expression like

def x = 3 end;
def y = 2 + x end

where the second definition refers to the name defined by the first. The whole thing would end up having type \mathrm{cmd}\; () \Rightarrow \{ x : \mathrm{int}, y : \mathrm{int} \}.
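
The two rules can be tried out in a small checker. The following is only a sketch under stated assumptions, not Swarm's actual typechecker: the names Type, Ctx, Expr and infer are made up for illustration, the language is cut down to integer literals, variables, addition, def and sequencing, and contexts are plain association lists in which newer bindings come first. The result context of a sequence follows the rule as written above (\Gamma, \Gamma_1, \Gamma_2).

```haskell
module Main where

-- Types of the toy language. 'TCmd t g' stands for "cmd t => g".
data Type
  = TInt
  | TUnit
  | TCmd Type Ctx
  deriving (Eq, Show)

-- A context maps names to types; an association list is enough for a sketch.
type Ctx = [(String, Type)]

data Expr
  = Lit Integer
  | Var String
  | Add Expr Expr
  | Def String Expr   -- def x = e end
  | Seq Expr Expr     -- c1 ; c2
  deriving Show

infer :: Ctx -> Expr -> Either String Type
infer _ (Lit _) = Right TInt
infer g (Var x) = maybe (Left ("unbound variable: " ++ x)) Right (lookup x g)
infer g (Add a b) = do
  ta <- infer g a
  tb <- infer g b
  if ta == TInt && tb == TInt
    then Right TInt
    else Left "expected int operands"
-- Rule for def: the command yields () and binds x to the type of e.
infer g (Def x e) = do
  t <- infer g e
  Right (TCmd TUnit [(x, t)])
-- Rule for sequencing: c2 is checked in the context extended with c1's bindings.
infer g (Seq c1 c2) = do
  t1 <- infer g c1
  case t1 of
    TCmd _ g1 -> do
      t2 <- infer (g1 ++ g) c2
      case t2 of
        TCmd r g2 -> Right (TCmd r (g2 ++ g1 ++ g))
        _ -> Left "right operand of ; is not a command"
    _ -> Left "left operand of ; is not a command"

main :: IO ()
main = do
  -- def x = 3 end; def y = 2 + x end
  let prog = Seq (Def "x" (Lit 3)) (Def "y" (Add (Lit 2) (Var "x")))
  print (infer [] prog)
  -- Without the first definition, x is not in scope.
  print (infer [] (Def "y" (Add (Lit 2) (Var "x"))))
```

In the empty context the first program is assigned a command type whose context binds both x and y to int, while the second is rejected because x is unbound.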

…with polymorphism?

All this seems straightforward with only first-order types, as in my example typing rules above. But once you add parametric polymorphism my brain starts to hurt. Clearly, the context associated to a command type could bind variables to polytypes. For example, def id = \x.x end has type \mathrm{cmd}\; () \Rightarrow \{id : \forall \alpha. \alpha \to \alpha\}. But should the context associated to a command type always contain polytypes, or only when the command type is itself a polytype? In other words, how do we deal with the associated contexts in the monotypes that show up during type inference? And what would it mean to unify two command types with their contexts (and would that ever even be necessary)? I hope it’s actually simple and I just need to think about it some more, but I haven’t wrapped my brain around it yet.

Ideas and pointers welcome!

I’d be very happy to hear anyone’s ideas, or (especially) pointers to published work that seems related or relevant! Feel free to comment either here, or on the relevant github issue.

by Brent at August 27, 2022 12:47 PM

August 26, 2022

Philip Wadler

Help, please! Do you know any applications of my work?

When writing an application, it sometimes helps if I can point out that monads and type classes, which my research contributed to, are used to process every post on Facebook. (Via Haxl. Thanks, Simon Marlow!)

Do you know of other applications of my work? If so, please email me or list them in the comments. (You can find my email at the bottom of my home page.)

Possible example: I gather Twitter uses monads and implicits in Scala (where implicits were influenced by type classes), but it's hard to find confirmation online. Do you know whether they are used, and how heavily? (It's easier to find such confirmation for The Guardian.)

Possible example: Do you make heavy use of generics in Java? I contributed to their design.

Possible example: I gather protocols in Swift are in part inspired by type classes, but it is hard to find confirmation online. Can you point me to confirmation?

There are many other possibilities. I hope you know some I haven't dreamed of!

Many thanks for your help. Answers are welcome at any time, but would be most useful if they can be provided by 2 September 2022.

by Philip Wadler at August 26, 2022 01:25 PM

August 23, 2022


Verifying initial conditions in Plutus

On a UTxO-style blockchain such as Cardano, transaction outputs are normally to (the hash of) a public key; in order to spend such an output, the spending transaction must be signed with the corresponding private key as evidence that the party creating the transaction has the right to spend the output.

The extended UTxO model introduces a second kind of output: an output to a smart contract, which is a piece of code f. Along with the contract itself, the output also contains an additional argument d, known as the datum. When a transaction spends such an output, f(d) is executed; when f indicates that all is fine, the transaction is approved; otherwise, it is rejected.

An immediate consequence of this model is that outputs are only verified when they are spent, not when they are created. This can lead to some tricky-to-verify initial conditions. We will explore this problem in this blog post, and suggest some ways in which we can solve it. Along the way, we will recap some often misunderstood Plutus concepts.

This work was done as part of the development of Be, a (smart) contract platform.

Stage restriction: datums vs parameters

Script outputs consist of three things: the value at the output, the hash of the script, and a datum or datum hash. We will ignore the value in the rest of this section.

We can think of a script hash and a datum as a pair of a function f and an argument datum d, which we might write as f(d). Scripts often have arguments other than the datum too; we might write this as f_{x1, .., xN}(d). These additional arguments are commonly referred to as parameters.

The difference between parameters and the datum is one of stage: parameters are applied at the stage of script compilation, whereas the datum is applied at the stage of script execution. Let’s consider this from a few different angles.

  • (Data) values[1] that must be computed off-chain should be parameters; values that must be computed on-chain should be in the datum.

    For example, stateful scripts often have to compute their next state. Since this computation happens on-chain, this state must therefore live in the script’s datum.

  • From the on-chain code’s perspective, f_x and f_x' are different scripts with different hashes, whereas f(d) and f(d') look like the same script, applied to different datums.

  • Sometimes there are reasons to prefer that a particular value is a parameter rather than a datum: after all, datums are just values and cannot a priori be trusted (we will come back to this in detail later). In principle it’s not impossible to compute a parameter on-chain, but it’s difficult: we must have sufficient information on-chain to be able to compute, for a given x, the hash of f_x (that is, the hash of the serialised source code of f_x). In practice, on Cardano this is currently impossible, because the appropriate hashing algorithm is not available on-chain.[2]

  • Conversely, sometimes we might prefer that a value is stored in the datum rather than passed as a parameter. Off-chain code can easily recognize outputs to f(d) in the current UTxO set without needing to be aware of d; however, outputs to f_x cannot be recognized as such without x, since the hash of f_x is different for every x.
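
The difference in stage can be illustrated with a toy model. This is a sketch under loud assumptions, not Plutus code: Script, scriptHash, compileF, Output and outputsTo are hypothetical names, a script is represented by its source text, and the hash is a small non-cryptographic stand-in. It shows that baking a parameter into the script changes the hash, while the datum sits next to the hash in the output.

```haskell
module Main where

import Data.Char (ord)

-- Assumption: a serialised script is represented by its source text.
newtype Script = Script String

-- Toy, non-cryptographic hash, for illustration only.
scriptHash :: Script -> Int
scriptHash (Script src) = foldl (\h c -> (h * 31 + ord c) `mod` 1000003) 7 src

-- "Compiling" f with a parameter x bakes x into the script itself ...
compileF :: Int -> Script
compileF x = Script ("f[param=" ++ show x ++ "]")

-- ... whereas the datum is stored alongside the script hash in the output.
data Output = Output { outScriptHash :: Int, outDatum :: Int }

-- Off-chain code can find outputs to a known script by hash alone,
-- without knowing anything about the datums.
outputsTo :: Script -> [Output] -> [Output]
outputsTo s = filter ((== scriptHash s) . outScriptHash)

main :: IO ()
main = do
  let f1 = compileF 1
      f2 = compileF 2
      utxo =
        [ Output (scriptHash f1) 10
        , Output (scriptHash f1) 20
        , Output (scriptHash f2) 10
        ]
  -- Different parameters give different script hashes.
  print (scriptHash f1 == scriptHash f2)
  -- Outputs to f_1 are found regardless of their datum.
  print (map outDatum (outputsTo f1 utxo))
```

Finding the outputs to f_2 requires knowing the parameter 2 in order to recompute its hash, which is the trade-off described in the last bullet.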

Self-reference: direct and indirect recursion

Scripts often need to be aware of their own hash. For example, a script f may need to check that any transaction spending an output to f also contains an output back to f:

(f, d, V) \xrightarrow{\mspace{30mu}\mathit{Tx}\mspace{30mu}} (f, d', V')

Clearly, the source of f cannot literally contain the hash of that very same source code: the hash would be uncomputable. Scripts must therefore be told their own hash when they run. This has some important consequences.

For example, suppose we have a parameterized minting policy π_f, such that π_f only allows minting of tokens if those tokens are output to f. Suppose furthermore that f needs to check if particular inputs contain π_f tokens. In principle this should be fine: since f is told its own hash, it should be able to compute the hash of π_f. However, we run into the stage restriction discussed above: the script’s own hash is a run-time value, whereas the parameter to π must be provided at compile time. Again, it’s not in principle impossible to solve this problem[3], but in practice it’s difficult, and as we saw, in fact currently impossible on Cardano for technical reasons.

We must therefore tell script f what the hash of π_f is. This cannot be a parameter, because this would lead to uncomputable self-referential hashes again. Indeed, even our notation would break down here:

f_{\displaystyle \pi_{f_{\pi_{f_{\ddots}}}}}

We can put (the hash of) π_f in the datum d instead; (f, (π_f, d), V) is unproblematic for hash computations, but now we have a different problem: f has no way of verifying that the datum contains the right hash. We will come back to this point below.

It may also be possible to break the mutual dependency between f and π_f in different ways. For example, perhaps f has an associated NFT_o, so that we could parameterize both f and π by NFT_o instead. As we shall see, this is also not without problems.

Stateful scripts

In the EUTxO model, state is never actually updated; instead, old state is consumed and new state is created. Typically, this state resides in the datum of a script:

(f, \; d_\mathit{old}) \xrightarrow{\mspace{30mu}\mathit{Tx}\mspace{30mu}} (f, \; d_\mathit{new})

Inductive reasoning

When Tx is validated, f can verify the evolution of d_old to d_new, but it cannot verify d_old; instead, we want to reason inductively, and say that f must at some point have verified that previous state as well:

\cdots \xrightarrow{\mspace{30mu}\mathit{Tx}_n\mspace{30mu}} (f, \; d_n) \xrightarrow{\mspace{30mu}\mathit{Tx}_{n+1}\mspace{30mu}} (f, \; d_{n+1}) \xrightarrow{\mspace{30mu}\mathit{Tx}_{n+2}\mspace{30mu}} (f, \; d_{n+2})

Unfortunately, this is an induction without a base case, because the first output to f is not verified by f:

() \xrightarrow{\mspace{30mu}\mathit{Tx}_0\mspace{30mu}} (f, d_0) \xrightarrow{\mspace{30mu}\mathit{Tx}_1\mspace{30mu}} (f, d_1) \xrightarrow{\mspace{30mu}\mathit{Tx}_2\mspace{30mu}} \cdots \xrightarrow{\mspace{30mu}\mathit{Tx}_n\mspace{30mu}} (f, \; d_n)

The critical transaction here is Tx_1: ideally it should not only verify that d_1 is the correct next state after d_0, but also the base case: d_0 should be the correct initial state.

Unfortunately, the fact that d_0 should be the base case is deducible from Tx_0, but on Cardano that information is not reflected in Tx_1, and Tx_1 is the only context available to f when it validates d_0. This means that contracts do not know when the base case should apply, and hence cannot verify their own base case.

There are two ways to solve this problem.

  • Declare that it is the responsibility of the off-chain code to verify the base case. In rare cases it may be possible to verify this by inspecting the current datum, but in most cases it will mean looking through the chain history to find the original output (Tx_0 in the diagram above).

  • We can solve the problem entirely on-chain through the clever use of NFTs. We will discuss this in detail below.
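
The missing base case can be made concrete with a toy state machine. This is a minimal sketch, assuming a hypothetical script whose state is a counter that must grow by one per transaction; validStep, chainAccepted and validHistory are illustrative names and not part of any Plutus API.

```haskell
module Main where

-- Hypothetical stateful script: the state is a counter that must grow by one.
validStep :: Int -> Int -> Bool
validStep dOld dNew = dNew == dOld + 1

-- On-chain validation only ever sees consecutive pairs (d_n, d_{n+1}),
-- so a whole chain is accepted when every single step is accepted.
chainAccepted :: [Int] -> Bool
chainAccepted ds = and (zipWith validStep ds (drop 1 ds))

-- What off-chain code has to check in addition, by walking back to the
-- first output: the chain must start from the expected initial state.
validHistory :: Int -> [Int] -> Bool
validHistory expectedInitial ds =
  take 1 ds == [expectedInitial] && chainAccepted ds

main :: IO ()
main = do
  print (chainAccepted [0, 1, 2, 3])        -- True
  print (chainAccepted [100, 101, 102])     -- True: the bogus initial state goes unnoticed
  print (validHistory 0 [100, 101, 102])    -- False: only the history check catches it
```

Every step of the second chain is valid, so the script alone never rejects it; only the extra check against the expected initial state does.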

State tokens

Often stateful scripts have an associated NFT, anchored at a random output o, which is included in the value of each script output:

(f, \; d_\mathit{old}, \mathsf{NFT}_o \uplus V_\mathrm{old}) \xrightarrow{\mspace{30mu}\mathit{Tx}\mspace{30mu}} (f, \; d_\mathit{new}, \mathsf{NFT}_o \uplus V_\mathrm{new})

Such an NFT is useful for uniquely identifying the current script output.[4] However, the presence of such an NFT does not change the narrative above in any meaningful way. It is true that if the NFT is minted in Tx_0 (and f checks for the presence of the NFT as part of its checks), then d_0 must indeed be the base case, but this is once again not visible from Tx_1.

Conversely, the presence of this kind of NFT in an input does not mean that that input spends a script output to f; this will only be true once the NFT is locked in the script (at which point f will validate that it will remain locked in the script). The base case should be that the NFT is locked in the contract immediately when it’s minted, but verifying this again involves checking the chain history.

Parameterized NFTs

The NFT minting policy NFT_o simply checks that the enclosing transaction spends output o. Since outputs can only be spent once, this guarantees that the resulting token is indeed a singleton. If we additionally want the guarantee that the NFT is locked in some script f immediately upon minting, we can define an NFT minting policy which is parameterized by an output and a script it’s intended for: NFT_{o,f} would verify that the enclosing transaction spends o, and that the resulting token is locked in a script output to f.

Unfortunately, this leads to precisely the kind of mutual dependency between f and NFT_{o,f} that we discussed above: f needs to know the hash of NFT_{o,f} so that it can verify that the NFT remains locked in the script, and NFT_{o,f} needs to know the hash of f so that it can verify the NFT is locked in the script when it is first minted.

We must therefore store the hash of the NFT in the datum associated with f, but now f has no way of verifying that datum. It can verify that the hash never changes, but it has no way of verifying that the hash is correct. It’s another example of the base case problem discussed above, but in a sense even more severe: even if the script could tell when the base case should apply, it would still not know what the base case should be.

Fortunately, if we use an NFT to identify script outputs, we can solve the base case problem once and for all, by additionally parameterizing the NFT with the expected initial state. NFTo, f, d will allow minting if the enclosing transaction

  • spends o
  • locks the resulting token in an output to script f
  • the associated datum5 is (NFTo, f, d, d)

This means that if we interact with a script through NFTo, f, d, the NFT guarantees the inductive base case. Put another way: we have encoded the base case right in the hash of the NFT.6
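The three minting conditions can be written down as a small pure model. This is only a sketch: the types below (OutRef, Datum, Output, Tx) are simplified, hypothetical stand-ins rather than the Plutus API, and the policy's own hash is passed in as an argument to mirror the fact that scripts are told their own hash when they run.

```haskell
-- All types here are simplified, hypothetical stand-ins, not the Plutus API.
type OutRef     = Int      -- identifies a transaction output
type ScriptHash = String
type PolicyHash = String
type State      = Int      -- the script's own state d

-- A datum pairs the recorded NFT policy hash with the script state.
data Datum = Datum PolicyHash State
  deriving (Eq, Show)

data Output = Output
  { outScript :: ScriptHash    -- script the output is locked in
  , outDatum  :: Datum
  , outTokens :: [PolicyHash]  -- policies that have a token in this output
  }

data Tx = Tx
  { txSpends  :: [OutRef]
  , txOutputs :: [Output]
  }

-- Model of NFT_{o,f,d}: given its own hash, the policy allows minting only
-- if the transaction spends o and the token sits in exactly one output,
-- locked in f, whose datum is (own hash, d).
nftPolicy :: OutRef -> ScriptHash -> State -> PolicyHash -> Tx -> Bool
nftPolicy o f d self tx =
  o `elem` txSpends tx &&
  case filter ((self `elem`) . outTokens) (txOutputs tx) of
    [out] -> outScript out == f && outDatum out == Datum self d
    _     -> False
```

In this model a transaction that mints the token into any other script, or with any other initial state, is rejected, which is the sense in which the base case is encoded in the policy.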

Stateful minting policies

A minting policy π is a script that determines whether a particular token is allowed to be minted or burned. Minting policies are stateless: they do not have associated datums (indeed, there are no outputs to a minting policy), nor can we somehow encode state in the hash of the minting policy itself, because we have no way of “evolving” that hash.

Many minting policies however do need some state. When this happens, the minting policy needs to be paired with a stateful script f. The minting policy π could then check the transaction for an input that spends a script output (f, d), and then use the associated datum d as its state. This is unproblematic as long as the minting policy only needs to read the state, but does not need to modify it. In this case, f is still responsible for checking the state in the continuation output, as normal.

If however the minting policy does need to change the state, we have a problem.

  1. We could attempt to set things up such that the minting policy π informs the regular script f that π will take over duties for verifying the continuation output, by having the minting policy verify that the redeemer7 for f tells it not to check the continuation output. This is however not sound: a malicious user could create a transaction that doesn’t mint but does use that special redeemer value, and would then be able to change the state of f at will.

  2. f could check the transaction to see if it burns or mints any π tokens, and delegate verification of the output datum to π when this is the case. This is sound but results in the same kind of mutual dependency problem that we already encountered above: π now needs the hash of f (to recognize outputs to f), and f needs the hash of π (to check whether or not any π tokens are minted).

  3. Most stateful scripts use some state token NFTo. It is tempting to think that we could break the cycle by having the minting policy π check for the input with NFTo instead of the input spending a script output to f: now π no longer needs to know the hash of f. Unfortunately, as we saw, we are only guaranteed that the presence of the NFT implies that the input spends an output to f if we verify through other means that the NFT is locked in the script f immediately upon minting.8

  4. We must therefore store the hash of the minting policy in the datum of f. We can use the parameterized NFT we discussed above, and use NFTo, f, (πf, d) to ensure that the initial value of the minting policy is correct9 (f itself must ensure that the hash never changes).

Variable base case

When we discussed inductive reasoning, we saw that scripts cannot verify their own base case. We will now consider what happens when the base case for one script is another script:

(g, d', V') \xrightarrow{\mspace{30mu}\mathit{Tx'}\mspace{30mu}} (f, d, V) \xrightarrow{\mspace{30mu}\mathit{Tx}\mspace{30mu}} \ldots

In this case there is no fixed base case for f; instead, the guarantee required by f is that g has verified d; as always, this guarantee must be derivable from information present in Tx only.

We can do this by defining a minting policy Linkg, f, which allows minting in a transaction Tx' if

  • Tx' spends an input to g
  • The only Linkg, f tokens in the outputs of Tx' are locked in outputs to f.

Scripts g and f then have the following responsibilities:

  • g must verify the datums in all outputs to f in Tx'.
  • f must check for the presence of the Linkg, f token in the input it is verifying. It must also check that the only outputs of the Linkg, f token in transaction Tx are back to f.

These tokens are not NFTs, and do not need to be. We just need to make sure the tokens never “leak” (which the rules above do ensure).
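The token rules above can be modelled in a few lines. As before this is a sketch with hypothetical, simplified types (Out, Tx) and not the Plutus API; it covers only the token checks, and leaves out g's verification of the datums in outputs to f.

```haskell
-- Simplified, hypothetical types; not the Plutus API.
type ScriptHash = String
type Token      = String

data Out = Out
  { outScript :: ScriptHash  -- script the output is locked in
  , outTokens :: [Token]     -- tokens held in the output
  }

data Tx = Tx
  { txInputs  :: [Out]       -- the outputs being spent
  , txOutputs :: [Out]
  }

-- Every output that holds the link token is locked in f.
onlyInF :: Token -> ScriptHash -> Tx -> Bool
onlyInF link f tx =
  all ((== f) . outScript) (filter ((link `elem`) . outTokens) (txOutputs tx))

-- Minting rule for Link_{g,f}: the transaction spends an output locked
-- in g, and link tokens only end up in outputs to f.
linkPolicy :: ScriptHash -> ScriptHash -> Token -> Tx -> Bool
linkPolicy g f link tx =
  any ((== g) . outScript) (txInputs tx) && onlyInF link f tx

-- Check performed by f on the input it is validating: the token is
-- present, and it does not leak out of f in this transaction.
fChecksLink :: ScriptHash -> Token -> Out -> Tx -> Bool
fChecksLink f link input tx =
  link `elem` outTokens input && onlyInF link f tx
```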

Once more we have to solve the problem of mutual dependencies between f and Linkg, f and between g and Linkg, f. First, we can assume without loss of generality that g is stateful; after all, if it wasn’t, then this setup provides no benefits over a setup with a fixed base case. We can therefore store the hash of Linkg, f in the state (datum) associated with g, and then rely on the NFTo, g, (Linkg, f, d’) to ensure that the hash is recorded correctly. We can use g to verify that the datum for f also records the correct hash of Linkg, f.

This also avoids any dependencies between f and g directly: g can identify outputs to f by looking for outputs that include a Linkg, f token, and f verifies that its inputs include that token.


Stateful Plutus scripts can verify that the evolution of state happens according to the rules defined in the script, but cannot verify the initial state: they lack the context that would tell them when the base case applies.

Moreover, the lack of on-chain script hash computation (or, put another way, the strict stage separation between script parameters and the script datum) means that we must often include the hash of one script in the datum of another; these hashes cannot be verified at all, even if the script could know when a state should be initial.

Often the verification of these initial conditions is relegated to off-chain code instead, but this is unsatisfactory and dangerous. The same on-chain code could be interacted with by multiple off-chain applications; a single forgotten check in one of those off-chain applications could result in security vulnerabilities.

Stateful scripts often have an associated NFT, used to identify the current script output in a UTxO set. As it turns out, such NFTs are subject to their own unverified initial conditions. However, we showed that we can resolve all of these problems by defining an NFTo, f, d which verifies that the NFT is locked in the script f immediately upon minting and that the initial datum is d. In addition, we saw that if the base case for one script f is another script g, we can use NFTo, g, d to guarantee the base case for g, and a special token Linkg, f to verify the base case for f.


  1. Data values as opposed to currency values.↩︎

  2. Script hashes use BLAKE-224; the available on-chain hashing algorithms are SHA2-256, SHA3-256, and BLAKE-256.↩︎

  3. Specifically, f would need enough information to be able to compute blake224(serialise(πf)).↩︎

  4. This can be useful off-chain, in order to figure out which output in the UTxO set corresponds to the current state of the contract we’re interacting with. It can also be useful on-chain, in case there could be multiple instances of f within a single transaction, and the script needs to figure out which output belongs to which input.↩︎

  5. This latter check is unproblematic because scripts are told their own hash when they run.↩︎

  6. It is true that the hash is carrying a lot of weight here; we are depending on that fixed size hash to encode a lot of information. However, this is no different from representing script outputs in the first place; we are comfortable with hashes representing fx1, .., xN, in which case the hash is also encoding a lot of information: the definition of f as well as all parameters xi.↩︎

  7. As we saw earlier, a script output contains (the hash of) a function fx1, .., xN, and a datum d. When a script output is spent, the spending input additionally provides a redeemer value r; the actual code that is run is then fx1, .., xN(d, r). In Plutus V1 it is in fact not possible for the minting policy to check the redeemer value for other scripts, as this information is not present in the ScriptContext; this is resolved in V2.↩︎

  8. We cannot use NFTo, f, d: this is guaranteed to be locked in the script immediately but, unlike NFTo, still depends on f.↩︎

  9. When π tokens are minted, we could rely on π instead to verify that the right hash is recorded. However, this check will not happen until the first π-token is minted, and we wouldn’t know what happened before then.↩︎

by edsko, finley at August 23, 2022 12:00 AM

August 22, 2022

GHC Developer Blog

GHC 9.4.2 released

bgamari - 2022-08-22

The GHC developers are happy to announce the availability of GHC 9.4.2. Binary distributions, source distributions, and documentation are available at

This release is primarily a bugfix release addressing a few packaging issues found in 9.4.1. See the release notes for a full accounting.

Note that, as GHC 9.4 is the first release series where the release artifacts are all generated by our new Hadrian build system, it is possible that there will be packaging issues. If you encounter trouble while using a binary distribution, please open a ticket. Likewise, if you are a downstream packager, do consider migrating to Hadrian to run your build; the Hadrian build system can be built using cabal-install, stack, or the in-tree bootstrap script. See the accompanying blog post for details on migrating packaging to Hadrian.

We would also like to emphasize that GHC 9.4 must be used in conjunction with Cabal-3.8 or later. This is particularly important for Windows users due to changes in GHC’s Windows toolchain.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy Haskelling,

  • Ben

by ghc-devs at August 22, 2022 12:00 AM

August 21, 2022

Magnus Therning

Patching in Nix

Today I wanted to move one of my Haskell projects to GHC 9.2.4 and found that envy didn't compile: the upper bound on its bytestring dependency didn't allow 0.11.*.

After creating a PR I decided I didn't want to wait for upstream so instead I started looking into options for patching the source of a derivation of a package from Hackage. In the past I've written about building Haskell packages from GitHub and an older one where I used callHackageDirect to build Haskell packages from Hackage. I wasn't sure how to patch up a package from Hackage though, but after a bit of digging through haskell-modules I found appendPatch.

The patch wasn't too hard to put together once I recalled the name of the patch queue tool I used regularly years ago, quilt. I put the resulting patch in the nix folder I already had, and the full override ended up looking like this

hl = haskell.lib;
hsPkgs = haskell.packages.ghc924;

extraHsPkgs = hsPkgs.override {
  overrides = self: super: {
    envy = hl.appendPatch (self.callHackageDirect {
      pkg = "envy";
      ver = "";
      sha256 =
    } { }) ./nix/envy-fix-deps.patch;

August 21, 2022 08:05 PM

August 16, 2022

Matt Parsons

Dynamic Exception Reporting in Haskell

Exceptions kind of suck in Haskell. You don’t get a stack trace. They don’t show up in the types of functions. They incorporate a subtyping mechanism that feels more like Java casting than typical Haskell programming.

A partial solution to the problem is HasCallStack - that gives us a CallStack which gets attached to error calls. However, it only gets attached to error - so you can either have String error messages and a CallStack, or you can have richly typed exceptions with no location information.

A CallStack is a static piece of information about the code. “You called foo, which called bar, which called quuz, which blew up with No parse.” The CallStack answers a single question: “Where did this go wrong?”

But there are often many more interesting questions than simply “Where?” You often want to know Who? When? How? in order to diagnose the big one: why did my code blow up?

In order to help answer these questions and develop robust exception reporting and diagnosing facilities, I created the annotated-exception package.

Better Call Stacks

annotated-exception provides a big improvement in static CallStack behavior. To understand the improvement, let’s dig into the core problem:

Broken Chains and Orphan Stacks

If any function doesn’t include a HasCallStack constraint in your stack, then the chain is broken, and you only get the stack closest to the source.

Consider this trivial example, which has a few ways of blowing up:

import GHC.Stack

foo :: HasCallStack => Int
foo = error "foo"

bar :: HasCallStack => Int
bar = foo

baz :: Int
baz = foo

quux :: HasCallStack => Int
quux = bar

ohno :: HasCallStack => Int
ohno = baz

If we call foo in GHCi, we get the immediate stack trace:

λ> foo
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
    foo, called at <interactive>:14:1 in interactive:Ghci2

Since the bar term has the HasCallStack constraint, it will add its location to the mix:

λ> bar
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
  foo, called at <interactive>:6:7 in interactive:Ghci1
  bar, called at <interactive>:15:1 in interactive:Ghci2

However, baz omits the constraint, which means that you won’t get that function in the stack:

λ> baz
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
    foo, called at <interactive>:8:7 in interactive:Ghci1

The quux term has the call stack, so you get the whole story again:

λ> quux
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
    foo, called at <interactive>:6:7 in interactive:Ghci1
      bar, called at <interactive>:10:8 in interactive:Ghci1
        quux, called at <interactive>:17:1 in interactive:Ghci2

But here’s the crappy thing - ohno does have a HasCallStack constraint. You might expect that it would show up in the backtrace. But it does not:

λ> ohno
*** Exception: foo
CallStack (from HasCallStack):
  error, called at <interactive>:4:7 in interactive:Ghci1
  foo, called at <interactive>:8:7 in interactive:Ghci1

The CallStacks for foo, baz, and ohno are indistinguishable. This makes diagnosing the failure difficult.

To avoid this problem, you must diligently place a HasCallStack constraint on every function in your code base. This is pretty annoying! And if you have any library code that calls your code, the library’s lack of HasCallStack will break your chains for you.
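The broken chain can also be observed without throwing anything, by counting the frames that are visible at the bottom of the call chain. The following is a minimal sketch using only GHC.Stack from base; the names depth, viaConstrained, viaUnconstrained, and outer are made up for illustration.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Number of frames visible at this point in the call chain.
depth :: HasCallStack => Int
depth = length (getCallStack callStack)

-- Has the constraint, so it contributes a frame of its own.
viaConstrained :: HasCallStack => Int
viaConstrained = depth

-- No constraint: the chain starts afresh at the call to depth.
viaUnconstrained :: Int
viaUnconstrained = depth

-- Has the constraint, but calls through viaUnconstrained, so its own
-- frame never reaches depth.
outer :: HasCallStack => Int
outer = viaUnconstrained
```

When these are evaluated from a caller that has no HasCallStack constraint of its own, viaConstrained sees one frame more than depth does, while viaUnconstrained and outer see exactly what depth sees.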

checkpoint to the rescue

annotated-exception introduces the idea of a checkpoint. The simplest one is checkpointCallStack, which attaches the call-site to any exceptions thrown out of the action:

checkpointCallStack
    :: (HasCallStack, MonadCatch m)
    => m a
    -> m a

Let’s replicate the story from above.

import Control.Exception.Annotated

foo :: IO Int
foo = throw (userError "foo")

-- in GHCi, evaluate:
-- λ> foo
*** Exception: 
         { annotations = 
             [ Annotation @CallStack 
                 [ ( "throw"
                   , SrcLoc 
                         { srcLocPackage = "interactive"
                         , srcLocModule = "Ghci1"
                         , srcLocFile = "<interactive>"
                         , srcLocStartLine = 4
                         , srcLocStartCol = 7
                         , srcLocEndLine = 4
                         , srcLocEndCol = 30
         , exception = user error (foo)

I’ve formatted the output to be a bit more legible. Now, instead of a plain IOError, we’ve thrown an AnnotatedException IOError. Inside of it, we have the CallStack from throw, which knows where it was thrown from. That CallStack inside of the exception is reporting the call-site of throw - not the definition site! This is true even though foo does not have a HasCallStack constraint!

Let’s do bar. We’ll do HasCallStack and our checkpointCallStack, just to see what happens:

import GHC.Stack

bar :: HasCallStack => IO Int
bar = checkpointCallStack foo

-- λ> bar
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc { srcLocPackage = "interactive", srcLocModule = "Ghci1", srcLocFile = "<interactive>", srcLocStartLine = 4, srcLocStartCol = 7, srcLocEndLine = 4, srcLocEndCol = 30}
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci2", srcLocFile = "<interactive>", srcLocStartLine = 15, srcLocStartCol = 7, srcLocEndLine = 15, srcLocEndCol = 30}
                , ( "bar"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 17, srcLocStartCol = 1, srcLocEndLine = 17, srcLocEndCol = 4}
        , exception = user error (foo)

We get the source location for throw, checkpointCallStack, and then the use site of bar.

Now, suppose we have our Problem Function again: baz doesn’t have a HasCallStack constraint or a checkpointCallStack. And when we called it through ohno, we lost the stack, even though ohno had the HasCallStack constraint.

baz :: IO Int
baz = bar

ohno :: IO Int
ohno = checkpointCallStack baz

-- λ> ohno
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci1", srcLocFile = "<interactive>", srcLocStartLine = 4, srcLocStartCol = 7, srcLocEndLine = 4, srcLocEndCol = 30}
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci2", srcLocFile = "<interactive>", srcLocStartLine = 15, srcLocStartCol = 7, srcLocEndLine = 15, srcLocEndCol = 30}
                , ( "bar"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 21, srcLocStartCol = 7, srcLocEndLine = 21, srcLocEndCol = 10}
                , ( "checkpointCallStack"
                  , SrcLoc {srcLocPackage = "interactive", srcLocModule = "Ghci3", srcLocFile = "<interactive>", srcLocStartLine = 23, srcLocStartCol = 8, srcLocEndLine = 23, srcLocEndCol = 31}
        , exception = user error (foo)

When we call ohno, we preserve all of the entries in the CallStack. checkpointCallStack in ohno adds itself to the CallStack that is present on the AnnotatedException itself, so it doesn’t need to worry about the stack being broken. It’s perfectly capable of recording that history for you.
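To make the mechanism concrete, here is a rough model of the idea using only base. It is not how annotated-exception is implemented: Annotated and checkpointNote are hypothetical names, notes are plain strings rather than Annotation values, and catching SomeException like this also intercepts asynchronous exceptions, which a real library has to treat with more care.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception

-- Hypothetical, simplified stand-in for the library's wrapper type:
-- a list of notes travelling alongside the original exception.
data Annotated = Annotated [String] SomeException
  deriving Show

instance Exception Annotated

-- Attach a note to any exception escaping the action. If the exception
-- is already wrapped, extend its notes instead of wrapping it again.
checkpointNote :: String -> IO a -> IO a
checkpointNote note action =
  action `catch` \(e :: SomeException) ->
    case fromException e of
      Just (Annotated notes inner) -> throwIO (Annotated (note : notes) inner)
      Nothing                      -> throwIO (Annotated [note] e)
```

Because the notes live on the exception value itself, an intermediate function that knows nothing about them cannot break the chain in this model; each enclosing checkpointNote simply extends the list on the way out.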

Ain’t Just a Checkpoint - catch me later

The type signature for catch in annotated-exception looks like this:

catch
    :: (HasCallStack, Exception e, MonadCatch m)
    => m a
    -> (e -> m a)
    -> m a

That HasCallStack constraint is used to give you a CallStack entry for any time that you catch an exception.

newtype MyException = MyException String
    deriving Show

instance Exception MyException

boom :: IO Int
boom = throw (MyException "boom")

recovery :: IO Int
recovery =
    boom `catch` \(MyException message) -> do
        putStrLn message
        throw (MyException (message ++ " recovered"))

recovery catches the MyException from boom, prints the message, and then throws a new exception with a modified message.

λ> recovery
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 19, srcLocStartCol = 9, srcLocEndLine = 19, srcLocEndCol = 54}
                , ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34}
                , ( "catch"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 17, srcLocStartCol = 5, srcLocEndLine = 19, srcLocEndCol = 54}
        , exception = MyException "boom recovered"

Now, look at that call stack: we have the first throw (from boom), then we have the second throw (in recovery), and finally the catch in recovery.

So we know where the exception originally happened, where it was rethrown, and where it was caught. This is fantastic!

But, even better - these annotations survive even if you throw a different type of Exception. This means you can translate exceptions fearlessly, knowing that any essential annotated context won’t be lost.

Dynamic Annotations

As I said earlier, CallStack is fine, but it’s a static thing. We can figure out “what code called what other code” that eventually led to an exception, but we can’t know anything about the running state of the program.

Enter checkpoint. This function attaches an arbitrary Annotation to thrown exceptions. An Annotation is a wrapper around any value that has an instance of Show and Typeable. The library provides an instance of IsString for this, so you can enable OverloadedStrings and have stringly-typed annotations.

constantAnnotation :: IO String
constantAnnotation =
    checkpoint "from constant annotation" $ do
        msg <- getLine
        if null msg
            then throw (MyException "empty message")
            else pure msg

But the real power is in using runtime data to annotate things.

Let’s imagine you’ve got a web application. You’re reporting runtime exceptions to a service, like Bugsnag. Specific teams “own” routes, so if something breaks, you want to alert the right team.

You can annotate thrown exceptions with the route.

data Route 
    = Login
    | Signup
    | ViewPosts
    | CreatePost
    | EditPost PostId
    deriving Show

dispatch :: Request -> IO Response
dispatch req = 
    case parseRequest req of
        Right route ->
            checkpoint (Annotation route) $ 
                case route of
                    Login ->
                    Signup -> 
                    ViewPosts ->
                    CreatePost ->
                    EditPost postId ->
                        checkpoint (Annotation postId) $
                            handleEditPost postId
        Left _ ->

Now, suppose an exception is thrown somewhere in handleLogin. It’s going to bubble up past dispatch and get handled by the Warp default exception handler. That’s going to dig into the [Annotation] and use that to alter the report we send to Bugsnag. The team that is responsible for handleLogin gets a notification that something broke there.

In the EditPost case, we’ve also annotated the exception with the post ID that we’re trying to edit. This means that, when debugging, we can know exactly which post threw the given exception. Now, when diagnosing and debugging, we can immediately pull up the problematic entry. This gives us much more information about the problem, which makes diagnosis easier.
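A handler that wants to "dig into" a list of annotations needs a way to get typed values back out. One way this could work is sketched below with an existential wrapper and Data.Typeable's cast. Note, lookupNotes, and the cut-down Route type are hypothetical names for illustration and are not the library's API.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Typeable (Typeable, cast)

-- Hypothetical stand-in for Annotation: any value that can be shown
-- and whose type can be recovered at runtime.
data Note = forall a. (Show a, Typeable a) => Note a

-- Keep only the notes whose payload has the requested type.
lookupNotes :: Typeable a => [Note] -> [a]
lookupNotes notes = [x | Note v <- notes, Just x <- [cast v]]

-- A cut-down route type, just for the example.
data Route = Login | Signup
  deriving (Eq, Show)

-- What a reporting handler might do: find the route, if one was attached.
owningRoute :: [Note] -> Maybe Route
owningRoute notes =
  case lookupNotes notes of
    (r : _) -> Just r
    []      -> Nothing
```

The reporting code asks for the type it cares about and ignores every other annotation, so unrelated parts of the program can attach whatever they like without coordinating.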

Likewise, suppose we have a function that gives us the logged in user:

withLoggedInUser :: (Maybe (Entity User) -> IO a) -> IO a
withLoggedInUser action = do
    muser <- getLoggedInUser
    checkpoint (Annotation (fmap entityKey muser)) $ do
        action muser

If the action we pass in to withLoggedInUser throws an exception, that exception will carry the Maybe UserId of whoever was logged in. Now, we can easily know who is having a problem on our service, in addition to what the problem actually is.

The Value of Transparency

But wait - if all exceptions are wrapped with this AnnotatedException type, then how do I catch things? Won’t this pollute my codebase?

And, what happens if I try to catch an AnnotatedException MyException but some other code only threw a plain MyException? Won’t that break things?

These are great questions.

catch and try from other libraries will fail to catch a FooException if the real type of the exception is AnnotatedException FooException. However, catch and try from annotated-exception is capable of “seeing through” the AnnotatedException wrapper.

In fact, we took advantage of this earlier - here’s the code for recovery again:

boom :: IO Int
boom = throw (MyException "boom")

recovery :: IO Int
recovery =
    boom `catch` \(MyException message) -> do
        putStrLn message
        throw (MyException (message ++ " recovered"))

Note how catch doesn’t say anything about annotations. We catch a MyException, exactly like you would in Control.Exception, and the annotations are propagated.
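The "seeing through" behaviour can be modelled with plain Control.Exception. This is a sketch of the idea only, under the assumption of a simplified wrapper type; Annotated and catchThrough are hypothetical names, and unlike the library this model does not carry the notes over to exceptions thrown from the handler.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception

-- Hypothetical, simplified wrapper: notes plus the original exception.
data Annotated = Annotated [String] SomeException
  deriving Show

instance Exception Annotated

-- Run the handler if the exception has the requested type, whether it
-- arrives bare or inside the wrapper; rethrow anything else untouched.
catchThrough :: Exception e => IO a -> (e -> IO a) -> IO a
catchThrough action handler =
  action `catch` \(outer :: SomeException) ->
    case fromException outer of
      Just e  -> handler e
      Nothing -> case fromException outer of
        Just (Annotated _ inner)
          | Just e <- fromException inner -> handler e
        _ -> throwIO outer
```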

But, let’s say you want to catch the AnnotatedException MyException. You just do that.

recoveryAnnotated :: IO Int
recoveryAnnotated =
    boom `catch` \(AnnotatedException annotations (MyException message)) -> do
        putStrLn message
        traverse print annotations
        throw (OtherException (length message))

-- in GHCi,
λ> recoveryAnnotated
Annotation @CallStack [("throw",SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34})]
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 37, srcLocStartCol = 9, srcLocEndLine = 37, srcLocEndCol = 48}
        , exception = OtherException 4

Now, something tricky occurs here: we don’t preserve the annotations on the thrown exception. If you catch an AnnotatedException, the library assumes that you’re going to handle those yourself.

If you want to keep them, you’d need to throw an AnnotatedException:

recoveryAnnotatedPreserve :: IO Int
recoveryAnnotatedPreserve =
    boom `catch` \(AnnotatedException annotations (MyException message)) -> do
        putStrLn message
        traverse print annotations
        throw (AnnotatedException annotations (OtherException (length message)))

-- in GHCi,
λ> recoveryAnnotatedPreserve 
Annotation @CallStack [("throw",SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34})]
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 44, srcLocStartCol = 9, srcLocEndLine = 44, srcLocEndCol = 81}
            , Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "src/annotated.hs", srcLocStartLine = 13, srcLocStartCol = 8, srcLocEndLine = 13, srcLocEndCol = 34}
        , exception = OtherException 4

We’re missing catch, which is unfortunate, but generally you aren’t going to be doing this - you’re either going to be handling an error completely, or rethrowing it, and the [Annotation] won’t be relevant to you… unless you’re writing an integration with Bugsnag, or reporting on them in some other way.

So annotated-exception’s exception handling functions can “see through” an AnnotatedException inner to work only on the inner exception type. But what if I try to catch a DatabaseException as an AnnotatedException DatabaseException?

Turns out, the Exception instance of AnnotatedException allows you to do that.

import qualified Control.Exception

emptyAnnotationsAreCool :: IO ()
emptyAnnotationsAreCool =
    Control.Exception.throwIO (MyException "definitely not annotated?")
        `catch`
            \(AnnotatedException annotations (MyException woah)) -> do
                print annotations
                putStrLn woah

-- in GHCi,
λ> emptyAnnotationsAreCool 
definitely not annotated?

We promote the inner into AnnotatedException [] inner. So the library works regardless of whether the code that throws knows about AnnotatedException. If you call some external library code which throws an exception, it picks up its first annotation at the first annotating function it passes through - including if that’s just catch:

catchPutsACallStack :: IO ()
catchPutsACallStack =
    Control.Exception.throwIO (MyException "definitely not annotated?")
        `catch`
            \(MyException woah) -> do
                throw (OtherException (length woah))

-- in GHCi,
λ> catchPutsACallStack 
*** Exception: 
        { annotations = 
            [ Annotation @CallStack 
                [ ( "throw"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "../", srcLocStartLine = 60, srcLocStartCol = 17, srcLocEndLine = 60, srcLocEndCol = 53})
                , ("catch"
                  , SrcLoc {srcLocPackage = "main", srcLocModule = "Annotated", srcLocFile = "../", srcLocStartLine = 58, srcLocStartCol = 9, srcLocEndLine = 58, srcLocEndCol = 16}
        , exception = OtherException 25

We get throw and catch both showing up in our stack trace. If we’d used Control.Exception.throwIO instead of Control.Exception.Annotated.throw, then we’d still have catch as an annotation.

Do you feel the power?

The primary purpose here is to share the technique and inspire a hunger for dynamic exception annotations.

We’ve been using this technique at Mercury for most of this year. It has dramatically simplified how we report exceptions, the shape of our exceptions, and how much info we get from a Bugsnag report. It’s now much easier to diagnose problems and fix bugs.

The Really Big Deal here is that - we now have something better than other languages. The lack of stack traces in Haskell is really annoying, and a clear way that Haskell suffers compared to Ruby or Java. But now, with annotated-exception, we actually have more powerful and more useful exception annotations than a mere stack trace. And, since this is all just library functions, you can swap to Control.Exception.Annotated with little fuss.

August 16, 2022 12:00 AM

August 13, 2022

Chris Smith 2

Geometry, Dimensions, and Elections

I found this to be an interesting way to ponder the theory of elections and group decision-making, so I’m writing to share. I have not done the research to become aware of what is previously known in this area, and I make no claim that any of the thoughts contained here are new.

It’s common in the United States to approximate political opinions using a spectrum from “left” to “right”, where the left end of the spectrum represents an emphasis on social justice, and the right an emphasis on free markets and traditional values. Libertarians, on the other hand, are famous for advocating their view that politics are better described by two orthogonal dimensions, as epitomized by David Nolan in his Nolan Chart. Leaving aside a bunch of details, the idea of the chart is that an individual’s political opinions can be approximately described by a point in a two-dimensional space.

The Nolan Chart, a well-known Libertarian advocacy tool

There are many legitimate criticisms of the specific choice of dimensions in the Nolan Chart, but it does capture a first step toward the perspective that interests me here. Most generally, we can consider each individual’s political opinions as living in an infinite-dimensional space. However, such a space can then be approximated by its projection down to however many dimensions are convenient for a particular purpose, with a corresponding loss of information as the number of dimensions gets smaller.

This has a fascinating interaction with Condorcet’s paradox. If you’re not familiar with the name, Condorcet’s paradox refers to a phenomenon described by the Marquis de Condorcet in the 18th century.

Condorcet’s Paradox: If each member of a group has consistently ordered individual preferences among three or more options, it is nevertheless still possible that the collective preferences of the group are cyclic. That is, a majority of the group may prefer option A to option B, a majority may prefer option B to option C, and yet a majority may also prefer option C to option A. Cycles are possible of any length greater than or equal to three.

For example, let’s think about an election for book club president, with three candidates: Alice, Bob, and Camille. We will write A>B>C to indicate that a member of the club prefers Alice as their first choice for president, followed by Bob, and finally Camille as their last choice. In all, there are six possible preference orders among the three candidates: A>B>C, A>C>B, B>A>C, B>C>A, C>A>B, and C>B>A. As Condorcet’s paradox predicts, there may be cycles in the overall preferences of the book club. For example, if 10% of members prefer A>B>C, 35% prefer A>C>B, 45% prefer B>C>A, and the remaining 10% prefer C>A>B, then you can verify that 55% of club members prefer Alice over Bob, 55% of voters prefer Bob over Camille, but 55% also prefer Camille over Alice!
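The percentages in this example can be checked mechanically. The following is a minimal sketch; the names `electorate`, `prefers` and `support` are made up for illustration, and ballots are written as strings such as "ACB" for A>C>B.

```haskell
import Data.List (elemIndex)
import Data.Maybe (fromMaybe)

-- A ballot lists the candidates from most to least preferred.
type Ballot = String

-- Share of club members (in percent) holding each preference order,
-- copied from the book club example.
electorate :: [(Ballot, Int)]
electorate = [("ABC", 10), ("ACB", 35), ("BCA", 45), ("CAB", 10)]

-- True when the ballot ranks x ahead of y.
prefers :: Ballot -> Char -> Char -> Bool
prefers ballot x y = rank x < rank y
  where
    rank c = fromMaybe maxBound (elemIndex c ballot)

-- Percentage of members who rank x ahead of y.
support :: Char -> Char -> Int
support x y = sum [share | (ballot, share) <- electorate, prefers ballot x y]
```

Evaluating `support 'A' 'B'`, `support 'B' 'C'` and `support 'C' 'A'` gives 55 each time, which is the cycle described above.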

This is quite inconvenient, because it means that in many elections, it’s possible for there to be no clear winner at all. But how does it relate to the dimensionality of political preferences?

Well, let’s assume for the sake of argument that political opinions are one-dimensional. I’ll describe the opinions as “left” or “right”, but the specific choice of dimension doesn’t matter. In such a model, the only question is how far left or right is optimal. Every voter would have a preference. Maybe it’s left-wing. Maybe it’s center-right. We won’t be concerned with which specific opinions the voter holds on an issue-by-issue basis, because in this world those are completely determined by just measuring how far left or right their opinions are. A voter’s preference among candidates is determined by how far each candidate is from that voter’s preferred political position.

Here are three candidates, as well as the ranges of voters who will express each possible preference. The dotted lines mark the midpoints between each of the three candidate pairs.

Voter preference ranges in a 3-candidate, 1-dimensional model

You may notice two of the six possible preference orders are missing. Voters never prefer A>C>B or C>A>B, because there is simply no position along the one-dimensional left-right axis that is closer to both A and C than it is to B. Because of this, there is also no possibility of a Condorcet cycle among these candidates. Indeed, if either A or C is preferred over B, it can only be because that candidate is the first choice of a majority of voters, and so is preferred over any alternative.
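A small experiment makes the missing orderings visible. This is only a sketch: the candidate positions are invented (all that matters is that B lies between A and C), and voters are sampled on a grid rather than treated as a continuum.

```haskell
import Data.List (nub, sortOn)

-- Hypothetical candidate positions on the left-right axis.
candidates :: [(Char, Double)]
candidates = [('A', -1.0), ('B', 0.2), ('C', 1.5)]

-- A voter ranks candidates by increasing distance from the voter's position.
ranking :: Double -> String
ranking voter = map fst (sortOn (\(_, pos) -> abs (pos - voter)) candidates)

-- Distinct rankings seen for voter positions from -3 to 3 in steps of 0.01.
observed :: [String]
observed = nub [ranking (fromIntegral k / 100) | k <- [-300 .. 300 :: Int]]
```

With these positions `observed` contains only the four orderings ABC, BAC, BCA and CBA; neither ACB nor CAB ever shows up.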

We can go even further in this case: except for exact ties, the one unique candidate who will be preferred over all others by a majority of voters (possibly a different majority for each head-to-head contest, though!) will be the first choice of whichever voter has the median political preference among all the voters. However, I don’t see how to naturally generalize this observation to higher dimensions.

It is considering a second dimension that reveals the possibility of a Condorcet paradox among voters’ true preferences. That’s because the additional dimension lets candidates A and C have similarities that are not shared by B. With a second dimension, voter preferences are divided into areas, like this.

Voter preference ranges in a 3-candidate, 2-dimensional model

If you project this image onto the x axis, the candidates are the same as in the previous model. However, here we’ve added a second dimension, the y axis, in which candidate B differs significantly from A or C. There are now regions of voter preferences in which it is sensible to express candidate orderings A>C>B and C>A>B, restoring the possibility of a Condorcet paradox. Of course, we didn’t create Condorcet’s paradox by choosing to use a two-dimensional model. In a real-world scenario, voters would have expressed the preferences A>C>B and C>A>B anyway. A one-dimensional model would have to reject those voters as behaving irrationally, but a second dimension can explain them.
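The same experiment can be repeated in two dimensions. Again this is a sketch with invented coordinates: the x positions are kept from a one-dimensional line-up with B between A and C, and B is moved away from A and C along the y axis. Euclidean distance is assumed.

```haskell
import Data.List (nub, sortOn)

type Point = (Double, Double)

-- Hypothetical candidate positions; B differs from A and C mainly in y.
candidates :: [(Char, Point)]
candidates = [('A', (-1.0, 0.0)), ('B', (0.2, 2.0)), ('C', (1.5, 0.0))]

-- Euclidean distance between two points.
distance :: Point -> Point -> Double
distance (x1, y1) (x2, y2) = sqrt ((x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2))

-- A voter ranks candidates by increasing distance from the voter's position.
ranking :: Point -> String
ranking voter = map fst (sortOn (distance voter . snd) candidates)

-- Distinct rankings seen on a grid from -4 to 4 in steps of 0.5.
observed :: [String]
observed =
  nub [ ranking (fromIntegral i / 2, fromIntegral j / 2)
      | i <- [-8 .. 8 :: Int], j <- [-8 .. 8 :: Int] ]
```

Here a voter at (0,-1) ranks the candidates A>C>B and a voter at (1,-1) ranks them C>A>B, so all six orderings become available.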

Similarly, suppose we add a fourth candidate into the two-dimensional model. We might see something like this:

Voter preference ranges in a 4-candidate, 2-dimensional model

There are 24 possible candidate orderings among 4 candidates, but only 18 of them appear here. Of the eighteen, 12 are open-ended regions around the outside of the diagram that include extreme positions, while the other 6 are bounded regions that sit strictly between the others as a kind of compromise or centrism. 6 more orderings, though, are missing from the diagram entirely! That is because, like before, the model has too few dimensions to recognize how a voter could adopt one of those preferences. In this case, the six missing preferences are D>A>B>C, D>A>C>B, A>D>C>B, A>D>B>C, D>C>A>B, and D>B>A>C. (Curiously, these are precisely the opposite preferences of the six bounded areas. The same thing occurs in the one-dimensional model, where the two unrepresentable orderings were the opposite preferences for the two bounded regions of the spectrum.)

You can see, then, that a two-dimensional model such as a Nolan Chart may be more expressive than a one-dimensional model, but still fails to capture some voter preferences (and this is entirely setting aside the question of whether the Nolan Chart in particular chooses the best pair of dimensions to consider). Beyond 2 dimensions, it’s more difficult to visualize, but the same things should occur. As more candidates are added, more dimensions will be needed to explain the various preferences voters may have.

There’s definitely some hand-waving involved in the above. The most obvious example is the notion of “distance” that is assumed to accurately determine a voter’s candidate preference. In my models, I used a Euclidean distance. In reality, each voter, in addition to having their own ideal candidate as a point in the space, may also have a different metric expressing how important each dimension is to them. These concerns can be dismissed as just another example of how “all models are wrong”, but this one would need some kind of validation to rely on it for real quantitative predictions. I don’t mean it that way; only as a framework for thinking about what can happen when you apply low-dimensional reasoning to what’s ultimately a high-dimensional concept.

by Chris Smith at August 13, 2022 03:10 AM

August 12, 2022

Chris Reade

Graphs, Kites and Darts

Figure 1: Three Coloured Patches

Non-periodic tilings with Penrose’s kites and darts

We continue our investigation of the tilings using Haskell with Haskell Diagrams. What is new is the introduction of a planar graph representation. This allows us to define more operations on finite tilings, in particular forcing and composing.

Previously in Diagrams for Penrose Tiles we implemented tools to create and draw finite patches of Penrose kites and darts (such as the samples depicted in figure 1). The code for this and for the new graph representation and tools described here can be found on GitHub

To describe the tiling operations it is convenient to work with the half-tiles: LD (left dart), RD (right dart), LK (left kite), RK (right kite) using a polymorphic type HalfTile (defined in a module HalfTile)

data HalfTile rep 
 = LD rep | RD rep | LK rep | RK rep   deriving (Show,Eq)

Here rep is a type variable for a representation to be chosen. For drawing purposes, we chose two-dimensional vectors (V2 Double) and called these Pieces.

type Piece = HalfTile (V2 Double)

The vector represents the join edge of the half tile (see figure 2) and thus the scale and orientation are determined (the other tile edges are derived from this when producing a diagram).

Figure 2: The (half-tile) pieces showing join edges (dashed) and origin vertices (red dots)

Finite tilings or patches are then lists of located pieces.

type Patch = [Located Piece]

Both Piece and Patch are made transformable so rotate and scale can be applied to both and translate can be applied to a Patch. (Translate has no effect on a Piece unless it is located.)

In Diagrams for Penrose Tiles we also discussed the rules for legal tilings and specifically the problem of incorrect tilings which are legal but get stuck so cannot continue to infinity. In order to create correct tilings we implemented the decompose operation on patches.

The vector representation that we use for drawing is not well suited to exploring properties of a patch such as neighbours of pieces. Knowing about neighbouring tiles is important for being able to reason about composition of patches (inverting a decomposition) and to find which pieces are determined (forced) on the boundary of a patch.

However, the polymorphic type HalfTile allows us to introduce our alternative graph representation alongside Pieces.

Tile Graphs

In the module Tgraph.Prelude, we have the new representation which treats half tiles as triangular faces of a planar graph – a TileFace – by specialising HalfTile with a triple of vertices (clockwise starting with the tile origin). For example

LD (1,3,4)       RK (6,4,3)

type Vertex = Int
type TileFace = HalfTile (Vertex,Vertex,Vertex)

When we need to refer to particular vertices from a TileFace we use originV (the first vertex – red dot in figure 2), oppV (the vertex at the opposite end of the join edge – dashed edge in figure 2), wingV (the remaining vertex not on the join edge).

originV, oppV, wingV :: TileFace -> Vertex
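Only the signatures are given here, so the following is a sketch of how these selectors might be written rather than the module's actual code. It assumes that, going clockwise from the origin, the join edge comes first for LD and RK and last for RD and LK; that assumption agrees with the face lists used in this post (for example RD (1,2,3) and LD (1,3,4) then share the join 1-3). The helper `faceVs` is a made-up name.

```haskell
data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

-- The vertex triple stored inside a face (hypothetical helper).
faceVs :: TileFace -> (Vertex, Vertex, Vertex)
faceVs (LD vs) = vs
faceVs (RD vs) = vs
faceVs (LK vs) = vs
faceVs (RK vs) = vs

-- The origin is always listed first.
originV :: TileFace -> Vertex
originV face = a
  where
    (a, _, _) = faceVs face

-- Assumed layout: opp is second for LD and RK, third for RD and LK.
oppV :: TileFace -> Vertex
oppV (LD (_, b, _)) = b
oppV (RD (_, _, c)) = c
oppV (LK (_, _, c)) = c
oppV (RK (_, b, _)) = b

-- The wing is whichever of the last two vertices is not oppV.
wingV :: TileFace -> Vertex
wingV (LD (_, _, c)) = c
wingV (RD (_, b, _)) = b
wingV (LK (_, b, _)) = b
wingV (RK (_, _, c)) = c
```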


The Tile Graphs implementation uses a type Tgraph which has a list of graph vertices and a list of tile faces.

data Tgraph = Tgraph { vertices :: [Vertex]
                     , faces    :: [TileFace]
                     }  deriving (Show)

For example, fool (short for a fool’s kite) is a Tgraph with 6 faces and 7 vertices, shown in figure 3.

fool = Tgraph { vertices = [1,2,3,4,5,6,7]
              , faces = [RD (1,2,3),LD (1,3,4),RK (6,2,5)
                        ,LK (6,3,2),RK (6,4,3),LK (6,7,4)
                        ]
              }

(The fool is also called an ace in the literature.)

Figure 3: fool

With this representation we can investigate how composition works with whole patches. Figure 4 shows a twice decomposed sun on the left and a once decomposed sun on the right (both with vertex labels). In addition to decomposing the right graph to form the left graph, we can also compose the left graph to get the right graph.

Figure 4: sunD2 and sunD

After implementing composition, we also explore a force operation and an emplace operation to extend tilings.

There are some constraints we impose on Tgraphs.

  • No spurious vertices. Every vertex of a Tgraph face must be one of the Tgraph vertices and each of the Tgraph vertices occurs in at least one of the Tgraph faces.
  • Connected. The collection of faces must be a single connected component.
  • No crossing boundaries. By this we mean that vertices on the boundary are incident with exactly two boundary edges. The boundary consists of the edges between the Tgraph faces and exterior region(s). This is important for adding faces.
  • Tile connected. Roughly, this means that if we collect the faces of a Tgraph by starting from any single face and then add faces which share an edge with those already collected, we get all the Tgraph faces. This is important for drawing purposes.

In fact, if a Tgraph is connected with no crossing boundaries, then it must be tile connected. (We could define tile connected to mean that the dual graph excluding exterior regions is connected.)
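Two of these constraints are easy to state as predicates on a face list. The sketch below is not the checking code behind checkedTgraph; the function names are invented, and the crossing-boundary test relies on the assumption that boundary edges form closed loops, so that "exactly two boundary edges at a vertex" is the same as "the vertex starts exactly one boundary directed edge".

```haskell
import Data.List (nub)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

faceVs :: TileFace -> (Vertex, Vertex, Vertex)
faceVs (LD vs) = vs
faceVs (RD vs) = vs
faceVs (LK vs) = vs
faceVs (RK vs) = vs

-- The three directed edges of a face, clockwise from the origin.
faceDedges :: TileFace -> [(Vertex, Vertex)]
faceDedges face = [(a, b), (b, c), (c, a)]
  where
    (a, b, c) = faceVs face

-- Boundary directed edges: the reverse directions that no face supplies.
boundaryDedges :: [TileFace] -> [(Vertex, Vertex)]
boundaryDedges fcs = [(b, a) | (a, b) <- des, (b, a) `notElem` des]
  where
    des = concatMap faceDedges fcs

-- No crossing boundaries: no vertex starts more than one boundary directed edge.
noCrossingBoundaries :: [TileFace] -> Bool
noCrossingBoundaries fcs = starts == nub starts
  where
    starts = map fst (boundaryDedges fcs)

-- No spurious vertices: the declared vertices are exactly those used by the faces.
noSpuriousVertices :: [Vertex] -> [TileFace] -> Bool
noSpuriousVertices vs fcs = all (`elem` used) vs && all (`elem` vs) used
  where
    used = concat [[a, b, c] | (a, b, c) <- map faceVs fcs]

-- The faces of fool, copied from the post.
foolFaces :: [TileFace]
foolFaces =
  [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
  , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]

-- Hypothetical bad example: two kite halves that meet only at vertex 1.
touchingKites :: [TileFace]
touchingKites = [LK (1, 2, 3), RK (1, 4, 5)]
```

`noCrossingBoundaries foolFaces` holds, while `touchingKites` fails the test because vertex 1 starts two boundary directed edges.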

Figure 5 shows two excluded graphs which have crossing boundaries at 4 (left graph) and 13 (right graph). The left graph is still tile connected but the right is not tile connected (the two faces at the top right do not have an edge in common with the rest of the faces.)

Although we have allowed for Tgraphs with holes (multiple exterior regions), we note that such holes cannot be created by adding faces one at a time without creating a crossing boundary. They can be created by removing faces from a Tgraph without necessarily creating a crossing boundary.

Important: We are using face as an abbreviation for half-tile face of a Tgraph here, and we do not count the exterior of a patch of faces to be a face. The exterior can also be disconnected when we have holes in a patch of faces and the holes are not counted as faces either. In graph theory, the term face would generally include these other regions, but we will call them exterior regions rather than faces.

Figure 5: A tile-connected graph with crossing boundaries at 4, and a non tile-connected graph

In addition to the constructor Tgraph we also use

checkedTgraph :: [TileFace] -> Tgraph

which creates a Tgraph from a list of faces, but also performs checks on the required properties of Tgraphs. We can then remove or select faces from a Tgraph and then use checkedTgraph to ensure the resulting Tgraph still satisfies the required properties.

selectFaces, removeFaces  :: [TileFace] -> Tgraph -> Tgraph
selectFaces fcs g = checkedTgraph (faces g `intersect` fcs)
removeFaces fcs g = checkedTgraph (faces g \\ fcs)

Edges and Directed Edges

We do not explicitly record edges as part of a Tgraph, but calculate them as needed. Implicitly we are requiring

  • No spurious edges. The edges of a Tgraph are the edges of the faces of the Tgraph.

To represent edges, a pair of vertices (a,b) is regarded as a directed edge from a to b. A list of such pairs will usually be regarded as a directed edge list. In the special case that the list is symmetrically closed [(b,a) is in the list whenever (a,b) is in the list] we will refer to this as an edge list rather than a directed edge list.

The following functions on TileFaces all produce directed edges (going clockwise round a face).

  -- join edge - dashed in figure 2
joinE  :: TileFace -> (Vertex,Vertex)
  -- the short edge which is not a join edge
shortE :: TileFace -> (Vertex,Vertex)
  -- the long edge which is not a join edge
longE  :: TileFace -> (Vertex,Vertex)
  -- all three directed edges clockwise from origin
faceDedges :: TileFace -> [(Vertex,Vertex)]

For the whole Tgraph, we often want a list of all the directed edges of all the faces.

graphDedges :: Tgraph -> [(Vertex,Vertex)]
graphDedges g = concatMap faceDedges (faces g)

Because our graphs represent tilings they are planar (can be embedded in a plane) so we know that at most two faces can share an edge and they will have opposite directions of the edge. No two faces can have the same directed edge. So from graphDedges g we can easily calculate internal edges (edges shared by 2 faces) and boundary directed edges (directed edges round the external regions).

internalEdges, boundaryDedges :: Tgraph -> [(Vertex,Vertex)]

The internal edges of g are those edges which occur in both directions in graphDedges g. The boundary directed edges of g are the missing reverse directions in graphDedges g.
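These two descriptions translate almost word for word into list comprehensions. This is a minimal sketch rather than the module's implementation: `faceVs` is a made-up helper, and the repeated `elem` tests make it quadratic in the number of edges, which is fine for small examples such as fool.

```haskell
data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

data Tgraph = Tgraph { vertices :: [Vertex], faces :: [TileFace] } deriving (Show)

faceVs :: TileFace -> (Vertex, Vertex, Vertex)
faceVs (LD vs) = vs
faceVs (RD vs) = vs
faceVs (LK vs) = vs
faceVs (RK vs) = vs

-- The three directed edges of a face, clockwise from the origin.
faceDedges :: TileFace -> [(Vertex, Vertex)]
faceDedges face = [(a, b), (b, c), (c, a)]
  where
    (a, b, c) = faceVs face

graphDedges :: Tgraph -> [(Vertex, Vertex)]
graphDedges g = concatMap faceDedges (faces g)

-- Edges occurring in both directions (both directions are returned).
internalEdges :: Tgraph -> [(Vertex, Vertex)]
internalEdges g = [(a, b) | (a, b) <- des, (b, a) `elem` des]
  where
    des = graphDedges g

-- The missing reverse directions.
boundaryDedges :: Tgraph -> [(Vertex, Vertex)]
boundaryDedges g = [(b, a) | (a, b) <- des, (b, a) `notElem` des]
  where
    des = graphDedges g

fool :: Tgraph
fool = Tgraph
  { vertices = [1, 2, 3, 4, 5, 6, 7]
  , faces = [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
            , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
  }
```

For fool this gives six internal edges (twelve directed pairs) and six boundary directed edges, which together trace the single loop 1, 4, 7, 6, 5, 2 and back to 1.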

We also refer to all the long edges of a Tgraph (including kite join edges) as phiEdges (both directions of these edges).

phiEdges :: Tgraph -> [(Vertex, Vertex)]

This is so named because, when drawn, these long edges are phi times the length of the short edges (phi being the golden ratio which is approximately 1.618).
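As a sketch of what phiEdges could look like (not the module's code), one can collect the long edge of every half-tile plus the join edge of every kite half, and then close the list under reversal. The helper name `facePhiDedges` and the vertex-order assumption (join edge first for LD and RK, last for RD and LK) are mine.

```haskell
import Data.List (nub)
import Data.Tuple (swap)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

data Tgraph = Tgraph { vertices :: [Vertex], faces :: [TileFace] } deriving (Show)

-- The golden ratio.
phi :: Double
phi = (1 + sqrt 5) / 2

-- Directed phi edges of one face: its long edge, plus the join edge for a kite half.
facePhiDedges :: TileFace -> [(Vertex, Vertex)]
facePhiDedges (LD (a, _, c)) = [(c, a)]
facePhiDedges (RD (a, b, _)) = [(a, b)]
facePhiDedges (LK (a, b, c)) = [(a, b), (c, a)]
facePhiDedges (RK (a, b, c)) = [(c, a), (a, b)]

-- All phi edges of a graph, in both directions and without repeats.
phiEdges :: Tgraph -> [(Vertex, Vertex)]
phiEdges g = nub (des ++ map swap des)
  where
    des = concatMap facePhiDedges (faces g)

fool :: Tgraph
fool = Tgraph
  { vertices = [1, 2, 3, 4, 5, 6, 7]
  , faces = [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
            , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
  }
```

For fool this yields 14 directed pairs, that is 7 undirected phi edges.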

Drawing Tgraphs (Patches and VPatches)

The module Tgraph.Convert contains functions to convert a Tgraph to our previous vector representation (Patch) defined in TileLib so we can use the existing tools to produce diagrams.

makePatch :: Tgraph -> Patch

drawPatch :: Patch -> Diagram B -- defined in module TileLib

drawGraph :: Tgraph -> Diagram B
drawGraph = drawPatch . makePatch

However, it is also useful to have an intermediate stage (a VPatch = Vertex Patch) which contains both face (vertices) and vectors. This allows vertex labels to be drawn and for faces to be identified and retained/excluded after the vector information is calculated.

data VPatch  = VPatch {lVertices :: [Located Vertex]
                      ,lHybrids :: [Located Hybrid]
                      }

A VPatch has a list of located vertices and a list of located hybrids, where a Hybrid is a HalfTile with a dual representation of the face (vertices) and vector (join edge). We make VPatch transformable so it can also be an argument type for rotate, translate, and scale.

The conversion functions include

makeVPatch   :: Tgraph -> VPatch
dropVertices :: VPatch -> Patch -- discards vertex information
drawVPatch   :: VPatch -> Diagram B  -- draws labels as well

drawVGraph   :: Tgraph -> Diagram B
drawVGraph = drawVPatch . makeVPatch

One consequence of using abstract graphs is that there is no unique predefined way to orient or scale or position the patch arising from a graph representation. Our implementation selects a particular join edge and aligns it along the x-axis (unit length for a dart, phi length for a kite) and tile-connectedness ensures the rest of the patch can be calculated from this.

We also have functions to re-orient a VPatch and lists of VPatches using chosen pairs of vertices. [Simply doing rotations on the final diagrams can cause problems if these include vertex labels. We do not, in general, want to rotate the labels – so we need to orient the VPatch before converting to a diagram]

Decomposing Graphs

We previously implemented decomposition for patches which splits each half-tile into two or three smaller scale half-tiles.

decompose :: Patch -> Patch

We now have a Tgraph version of decomposition in the module Tgraphs:

decomposeG :: Tgraph -> Tgraph

Graph decomposition is particularly simple. We start by introducing one new vertex for each long edge (the phiEdges) of the Tgraph. We then build the new faces from each old face using the new vertices.
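The two steps can be sketched as follows. This is an illustration, not the code in the Tgraphs module: the per-face rules in `decompFace` were reconstructed from the fool and foolD listings in this post, the helper names are invented, and new vertices are numbered in order of first appearance, so the labels need not match the author's numbering.

```haskell
import Data.List (nub)
import Data.Maybe (fromMaybe)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

data Tgraph = Tgraph { vertices :: [Vertex], faces :: [TileFace] } deriving (Show)

-- Write an undirected edge with its smaller vertex first.
norm :: (Vertex, Vertex) -> (Vertex, Vertex)
norm (a, b) = (min a b, max a b)

-- Undirected phi edges of a face: the long edge, plus the join of a kite half.
facePhiEdges :: TileFace -> [(Vertex, Vertex)]
facePhiEdges face = map norm $ case face of
  LD (a, _, c) -> [(c, a)]
  RD (a, b, _) -> [(a, b)]
  LK (a, b, c) -> [(a, b), (c, a)]
  RK (a, b, c) -> [(c, a), (a, b)]

faceVList :: TileFace -> [Vertex]
faceVList face = case face of
  LD (a, b, c) -> [a, b, c]
  RD (a, b, c) -> [a, b, c]
  LK (a, b, c) -> [a, b, c]
  RK (a, b, c) -> [a, b, c]

-- Sketch of decomposition; assumes the graph has at least one vertex.
decomposeG :: Tgraph -> Tgraph
decomposeG g = Tgraph { vertices = nub (concatMap faceVList newFaces), faces = newFaces }
  where
    -- Step 1: one new vertex per phi edge.
    phis = nub (concatMap facePhiEdges (faces g))
    table = zip phis [1 + maximum (vertices g) ..]
    newVFor e = fromMaybe (error "not a phi edge") (lookup (norm e) table)
    -- Step 2: rebuild each old face from old and new vertices.
    newFaces = concatMap decompFace (faces g)
    decompFace face = case face of
      RK (a, b, c) -> let x = newVFor (a, b); y = newVFor (c, a)
                      in [RK (c, x, b), LK (c, y, x), RD (a, x, y)]
      LK (a, b, c) -> let x = newVFor (a, b); y = newVFor (c, a)
                      in [LK (b, c, y), RK (b, y, x), LD (a, x, y)]
      RD (a, b, c) -> let x = newVFor (a, b) in [LK (a, x, c), RD (b, c, x)]
      LD (a, b, c) -> let x = newVFor (a, c) in [RK (a, b, x), LD (c, x, b)]

fool :: Tgraph
fool = Tgraph
  { vertices = [1, 2, 3, 4, 5, 6, 7]
  , faces = [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
            , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
  }
```

Applied to fool, the sketch produces 16 faces over 14 vertices: each dart half splits in two and each kite half in three.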

As a running example we take fool (mentioned above) and its decomposition foolD

*Main> foolD = decomposeG fool

*Main> foolD
Tgraph { vertices = [1,8,3,2,9,4,5,13,10,6,11,14,7,12]
       , faces = [LK (1,8,3),RD (2,3,8),RK (1,3,9)
                 ,LD (4,9,3),RK (5,13,2),LK (5,10,13)
                 ,RD (6,13,10),LK (3,2,13),RK (3,13,11)
                 ,LD (6,11,13),RK (3,14,4),LK (3,11,14)
                 ,RD (6,14,11),LK (7,4,14),RK (7,14,12)
                 ,LD (6,12,14)
                 ]
       }

which are best seen together (fool followed by foolD) in figure 6.

Figure 6: fool and foolD (= decomposeG fool)

Composing graphs, and Unknowns

Composing is meant to be an inverse to decomposing, and one of the main reasons for introducing our graph representation. In the literature, decomposition and composition are defined for infinite tilings and in that context they are unique inverses to each other. For finite patches, however, we will see that composition is not always uniquely determined.

In figure 7 (Two Levels) we have emphasised the larger scale faces on top of the smaller scale faces.

Figure 7: Two Levels

How do we identify the composed tiles? We start by classifying vertices which are at the wing tips of the (smaller) darts as these determine how things compose. In the interior of a graph/patch (e.g. in figure 7), a dart wing tip always coincides with a second dart wing tip, and either

  1. the 2 dart halves share a long edge. The shared wing tip is then classified as a largeKiteCentre and is at the centre of a larger kite. (See left vertex type in figure 8), or
  2. the 2 dart halves touch at their wing tips without sharing an edge. This shared wing tip is classified as a largeDartBase and is the base of a larger dart. (See right vertex type in figure 8)
Figure 8: largeKiteCentre (left) and largeDartBase (right)

[We also call these (respectively) a deuce vertex type and a jack vertex type later in figure 10]

Around the boundary of a graph, the dart wing tips may not share with a second dart. Sometimes the wing tip has to be classified as unknown but often it can be decided by looking at neighbouring tiles. In this example of a four times decomposed sun (sunD4), it is possible to classify all the dart wing tips as largeKiteCentres or largeDartBases so there are no unknowns.
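A deliberately simplified classifier shows the interior rule at work. It is a sketch under stated assumptions: it only looks at the dart halves meeting at a wing tip, it marks every wing tip with a single dart half as unknown, and it does not attempt the extra boundary reasoning based on neighbouring tiles. The names `WingClass` and `classifyDartWings` are invented.

```haskell
import Data.List (nub)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

data WingClass = LargeKiteCentre | LargeDartBase | Unknown deriving (Show, Eq)

-- Wing vertex and undirected long edge of a dart half; nothing for kite halves.
dartWingAndLong :: TileFace -> [(Vertex, (Vertex, Vertex))]
dartWingAndLong (LD (a, _, c)) = [(c, (min a c, max a c))]
dartWingAndLong (RD (a, b, _)) = [(b, (min a b, max a b))]
dartWingAndLong _ = []

-- Classify each dart wing tip from the dart halves that meet there.
classifyDartWings :: [TileFace] -> [(Vertex, WingClass)]
classifyDartWings fcs =
  [(v, classify [e | (w, e) <- darts, w == v]) | v <- nub (map fst darts)]
  where
    darts = concatMap dartWingAndLong fcs
    classify longs
      | length longs < 2 = Unknown                             -- only one dart half seen
      | length (nub longs) < length longs = LargeKiteCentre   -- a shared long edge
      | otherwise = LargeDartBase                              -- wing tips touch, no shared edge

-- Face lists copied from the post.
foolFaces, foolDFaces :: [TileFace]
foolFaces =
  [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
  , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
foolDFaces =
  [ LK (1, 8, 3), RD (2, 3, 8), RK (1, 3, 9)
  , LD (4, 9, 3), RK (5, 13, 2), LK (5, 10, 13)
  , RD (6, 13, 10), LK (3, 2, 13), RK (3, 13, 11)
  , LD (6, 11, 13), RK (3, 14, 4), LK (3, 11, 14)
  , RD (6, 14, 11), LK (7, 4, 14), RK (7, 14, 12)
  , LD (6, 12, 14) ]
```

On foolDFaces the sketch classifies 13 and 14 as large kite centres and 3 as a large dart base, and on foolFaces it reports 2 and 4 as unknown.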

If there are no unknowns, then we have a function to produce the unique composed graph.

composeG :: Tgraph -> Tgraph

Any correct decomposed graph without unknowns will necessarily compose back to its original. This makes composeG a left inverse to decomposeG provided there are no unknowns.

For example, with an (n times) decomposed sun we will have no unknowns, so these will all compose back up to a sun after n applications of composeG. For n=4 (sunD4 – the smaller scale shown in figure 7) the dart wing classification returns 70 largeKiteCentres, 45 largeDartBases, and no unknowns.

Similarly with the simpler foolD example, if we classify the dart wings we get

largeKiteCentres = [14,13]
largeDartBases = [3]
unknowns = []

In foolD (the right hand graph in figure 6), nodes 14 and 13 are new kite centres and node 3 is a new dart base. There are no unknowns so we can use composeG safely

*Main> composeG foolD
Tgraph { vertices = [1,2,3,4,5,6,7]
       , faces = [RD (1,2,3),LD (1,3,4),RK (6,2,5)
                 ,RK (6,4,3),LK (6,3,2),LK (6,7,4)
                 ]
       }

which reproduces the original fool (left hand graph in figure 6).

However, if we now check out unknowns for fool we get

largeKiteCentres = []
largeDartBases = []
unknowns = [4,2]    

So both nodes 2 and 4 are unknowns. It had looked as though fool would simply compose into two half kites back-to-back (sharing their long edge not their join), but the unknowns show there are other possible choices. Each unknown could become a largeKiteCentre or a largeDartBase.

The question is then what to do with unknowns.

Partial Compositions

In fact our composeG resolves two problems when dealing with finite patches. One is the unknowns and the other is critical missing faces needed to make up a new face (e.g. the absence of any half dart).

It is implemented using an intermediary function for partial composition

partCompose :: Tgraph -> ([TileFace],Tgraph)

partCompose will compose everything that is uniquely determined, but will leave out faces round the boundary which cannot be determined or cannot be included in a new face. It returns the faces of the argument graph that were not used, along with the composed graph.

Figure 9 shows the result of partCompose applied to two graphs. [These are force kiteD3 and force dartD3 on the left. Force is described later]. In each case, the excluded faces of the starting graph are shown in pale green, overlaid by the composed graph on the right.

Figure 9: partCompose for two graphs (force kiteD3 top row and force dartD3 bottom row)

Then composeG is simply defined to keep the composed faces and ignore the unused faces produced by partCompose.

composeG :: Tgraph -> Tgraph
composeG = snd . partCompose

This approach avoids making a decision about unknowns when composing, but it may lose some information by throwing away the uncomposed faces.

For correct Tgraphs g, if decomposeG g has no unknowns, then composeG is a left inverse to decomposeG. However, if we take g to be two kite halves sharing their long edge (not their join edge), then these decompose to fool which produces an empty graph when recomposed. Thus we do not have g = composeG (decomposeG g) in general. On the other hand we do have g = composeG (decomposeG g) for correct whole-tile Tgraphs g (whole-tile means all half-tiles of g have their matching half-tile on their join edge in g)

Later (figure 21) we show another exception to g = composeG(decomposeG g) with an incorrect tiling.
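The whole-tile condition used above can be phrased as a simple test on join edges. This is a sketch with an invented name (`isWholeTile`); it only checks that each join edge also appears in the opposite direction as another face's join edge, and does not verify that the partner is the mirror kind of half-tile.

```haskell
data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

-- Directed join edge, clockwise round the face
-- (assumed vertex order: join first for LD and RK, last for RD and LK).
joinE :: TileFace -> (Vertex, Vertex)
joinE (LD (a, b, _)) = (a, b)
joinE (RD (a, _, c)) = (c, a)
joinE (LK (a, _, c)) = (c, a)
joinE (RK (a, b, _)) = (a, b)

-- Whole-tile: every join edge is matched by a reversed join edge of another face.
isWholeTile :: [TileFace] -> Bool
isWholeTile fcs = all hasMate fcs
  where
    joins = map joinE fcs
    hasMate face = let (a, b) = joinE face in (b, a) `elem` joins

foolFaces :: [TileFace]
foolFaces =
  [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
  , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]

-- Two kite halves sharing their long edge 6-3 but not their joins.
backToBackKites :: [TileFace]
backToBackKites = [LK (6, 3, 2), RK (6, 4, 3)]
```

fool passes this test, while the two kite halves sharing a long edge do not, which fits their role as the counterexample.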

We make use of

selectFacesVP    :: [TileFace] -> VPatch -> VPatch
removeFacesVP    :: [TileFace] -> VPatch -> VPatch
selectFacesGtoVP :: [TileFace] -> Tgraph -> VPatch
removeFacesGtoVP :: [TileFace] -> Tgraph -> VPatch

for creating VPatches from selected tile faces of a Tgraph or VPatch. This allows us to represent and draw a subgraph which need not be connected nor satisfy the no crossing boundaries property provided the Tgraph it was derived from had these properties.


When building up a tiling, following the rules, there is often no choice about what tile can be added alongside certain tile edges at the boundary. Such additions are forced by the existing patch of tiles and the rules. For example, if a half tile has its join edge on the boundary, the unique mirror half tile is the only possibility for adding a face to that edge. Similarly, the short edge of a left (respectively, right) dart can only be matched with the short edge of a right (respectively, left) kite. We also make use of the fact that only 7 types of vertex can appear in (the interior of) a patch, so on a boundary vertex we sometimes have enough of the faces to determine the vertex type. These are given the following names in the literature (shown in figure 10): sun, star, jack (=largeDartBase), queen, king, ace, deuce (=largeKiteCentre).

Figure 10: Vertex types

The function

force :: Tgraph -> Tgraph

will add some faces on the boundary that are forced (i.e. new faces where there is exactly one possible choice). For example:

  • When a join edge is on the boundary – add the missing half tile to make a whole tile.
  • When a half dart has its short edge on the boundary – add the half kite that must be on the short edge.
  • When a vertex is both a dart origin and a kite wing (it must be a queen or king vertex) – if there is a boundary short edge of a kite half at the vertex, add another kite half sharing the short edge, (this converts 1 kite to 2 and 3 kites to 4 in combination with the first rule).
  • When two half kites share a short edge their common oppV vertex must be a deuce vertex – add any missing half darts needed to complete the vertex.
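Finding where the first two rules apply needs only the directed edges. The sketch below detects candidates and does not add any faces; the function names and the example `foolMinus` (fool with one kite half removed) are made up for illustration and are not part of the library.

```haskell
import Data.List (delete)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

-- Directed edges, clockwise round each face (assumed vertex order:
-- join first for LD and RK, last for RD and LK; the short edge is always second).
joinE, shortE :: TileFace -> (Vertex, Vertex)
joinE (LD (a, b, _)) = (a, b)
joinE (RD (a, _, c)) = (c, a)
joinE (LK (a, _, c)) = (c, a)
joinE (RK (a, b, _)) = (a, b)
shortE (LD (_, b, c)) = (b, c)
shortE (RD (_, b, c)) = (b, c)
shortE (LK (_, b, c)) = (b, c)
shortE (RK (_, b, c)) = (b, c)

faceDedges :: TileFace -> [(Vertex, Vertex)]
faceDedges (LD (a, b, c)) = [(a, b), (b, c), (c, a)]
faceDedges (RD (a, b, c)) = [(a, b), (b, c), (c, a)]
faceDedges (LK (a, b, c)) = [(a, b), (b, c), (c, a)]
faceDedges (RK (a, b, c)) = [(a, b), (b, c), (c, a)]

isDart :: TileFace -> Bool
isDart (LD _) = True
isDart (RD _) = True
isDart _ = False

-- An edge of a face is on the boundary when no face has it reversed.
onBoundary :: [TileFace] -> (Vertex, Vertex) -> Bool
onBoundary fcs (a, b) = (b, a) `notElem` concatMap faceDedges fcs

-- Rule 1 candidates: half tiles whose join edge is on the boundary.
needsOtherHalf :: [TileFace] -> [TileFace]
needsOtherHalf fcs = [f | f <- fcs, onBoundary fcs (joinE f)]

-- Rule 2 candidates: dart halves whose short edge is on the boundary.
needsKiteOnShortEdge :: [TileFace] -> [TileFace]
needsKiteOnShortEdge fcs = [f | f <- fcs, isDart f, onBoundary fcs (shortE f)]

-- Hypothetical example: fool with the kite half LK (6,3,2) removed.
foolMinus :: [TileFace]
foolMinus = delete (LK (6, 3, 2))
  [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
  , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
```

In foolMinus the first rule picks out RK (6,2,5) and the second picks out RD (1,2,3); both point at the kite half that was removed.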

Figure 11 shows foolDminus (which is foolD with 3 faces removed) on the left and the result of forcing, i.e. force foolDminus, on the right, which is the same graph we get from force foolD.

foolDminus = 
    removeFaces [RD(6,14,11), LD(6,12,14), RK(5,13,2)] foolD
Figure 11: foolDminus and force foolDminus = force foolD

Figures 12, 13 and 14 illustrate the result of forcing a 5-times decomposed kite, a 5-times decomposed dart, and a 5-times decomposed sun (respectively). The first two figures reproduce diagrams from an article by Roger Penrose illustrating the extent of influence of tiles round a decomposed kite and dart. [Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989. (fig 14)].

Figure 12: force kiteD5 with kiteD5 shown in red
Figure 13: force dartD5 with dartD5 shown in red
Figure 14: force sunD5 with sunD5 shown in red

In figure 15, the bottom row shows successive decompositions of a dart (dashed blue arrows from right to left), so applying composeG to each dart will go back (green arrows from left to right). The black vertical arrows are force. The solid blue arrows from right to left are (force . decomposeG) being applied to the successive forced graphs. The green arrows in the reverse direction are composeG again and the intermediate (partCompose) figures are shown in the top row with the ignored faces in pale green.

Figure 15: Arrows: black = force, green = composeG, solid blue = (force . decomposeG)

Figure 16 shows the forced graphs of the seven vertex types (with the starting graphs in red) along with a kite (top right).

Figure 16: Relating the forced seven vertex types and the kite

These are related to each other as shown in the columns. Each graph composes to the one above (an empty graph for the ones in the top row) and the graph below is its forced decomposition. [The rows have been scaled differently to make the vertex types easier to see.]

Adding Faces to a Tgraph

This is technically tricky because we need to discover what vertices (and implicitly edges) need to be newly created and which ones already exist in the Tgraph. This goes beyond a simple graph operation and requires use of the geometry of the faces. We have chosen not to do a full conversion to vectors to work out all the geometry, but instead we introduce a local representation of angles at a vertex allowing a simple equality test.

Integer Angles

All vertex angles are integer multiples of 1/10th turn (mod 10) so we use these integers for face internal angles and boundary external angles. The face adding process always adds to the right of a given directed edge (a,b) which must be a boundary directed edge. [Adding to the left of an edge (a,b) would mean that (b,a) will be the boundary direction and so we are really adding to the right of (b,a)]. Face adding looks to see if either of the two other edges already exist in the graph by considering the end points a and b to which the new face is to be added, and checking angles.

This allows an edge in a particular sought direction to be discovered. If it is not found it is assumed not to exist. However, this will be undermined if there are crossing boundaries. In this case there must be more than two boundary directed edges at the vertex and there is no unique external angle.

Establishing the no crossing boundaries property ensures these failures cannot occur. We can easily check this property for newly created graphs (with checkedTgraph) and the face adding operations cannot create crossing boundaries.
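To make the integer angles concrete, here is one way they could be tabulated. It is a sketch, not the library's representation: it uses the standard half-tile angles (36, 36 and 108 degrees for a dart half; 36, 72 and 72 degrees for a kite half) written in tenths of a turn, assumes the vertex order used elsewhere in this post, and computes the external angle of a boundary vertex as whatever is left of the full turn.

```haskell
data HalfTile rep = LD rep | RD rep | LK rep | RK rep deriving (Show, Eq)

type Vertex = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)

-- Internal angle at each vertex of a face, in tenths of a turn.
-- Dart half: origin 1, wing 1, opp 3.  Kite half: origin 1, wing 2, opp 2.
faceAngles :: TileFace -> [(Vertex, Int)]
faceAngles (LD (a, b, c)) = [(a, 1), (b, 3), (c, 1)]  -- origin, opp, wing
faceAngles (RD (a, b, c)) = [(a, 1), (b, 1), (c, 3)]  -- origin, wing, opp
faceAngles (LK (a, b, c)) = [(a, 1), (b, 2), (c, 2)]  -- origin, wing, opp
faceAngles (RK (a, b, c)) = [(a, 1), (b, 2), (c, 2)]  -- origin, opp, wing

-- Sum of the internal angles of all faces meeting at a vertex.
internalAngleSum :: [TileFace] -> Vertex -> Int
internalAngleSum fcs v = sum [n | f <- fcs, (w, n) <- faceAngles f, w == v]

-- External angle at a boundary vertex (mod 10); only meaningful when the
-- vertex is on the boundary and there are no crossing boundaries.
externalAngle :: [TileFace] -> Vertex -> Int
externalAngle fcs v = (10 - internalAngleSum fcs v) `mod` 10

foolFaces :: [TileFace]
foolFaces =
  [ RD (1, 2, 3), LD (1, 3, 4), RK (6, 2, 5)
  , LK (6, 3, 2), RK (6, 4, 3), LK (6, 7, 4) ]
```

In fool the interior vertex 3 is surrounded by a full turn (angle sum 10), while the boundary vertices 6 and 1 have external angles 6 and 8.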

Touching Vertices and Crossing Boundaries

When a new face to be added on (a,b) has neither of the other two edges already in the graph, the third vertex needs to be created. However it could already exist in the Tgraph – it is not on an edge coming from a or b but from another non-local part of the Tgraph. We call this a touching vertex. If we simply added a new vertex without checking for a clash this would create a nonsense graph. However, if we do check and find an existing vertex, we still cannot add the face using this because it would create a crossing boundary.

Our version of forcing prevents face additions that would create a touching vertex/crossing boundary by calculating the positions of boundary vertices.

No conflicting edges

There is a final (simple) check when adding a new face, to prevent a long edge (phiEdge) sharing with a short edge. This can arise if we force an incorrect graph (as we will see later).

Implementing Forcing

Our order of forcing prioritises updates (face additions) which do not introduce a new vertex. Such safe updates are easy to recognise and they do not require a touching vertex check. Surprisingly, this pretty much removes the problem of touching vertices altogether.

As an illustration, consider foolDMinus again on the left of figure 11. Adding the left dart onto edge (12,14) is not a safe addition (and would create a crossing boundary at 6). However, adding the right dart RD(6,14,11) is safe and creates the new edge (6,14) which then makes the left dart addition safe. In fact it takes some contrivance to come up with a Tgraph with an update that could fail the check during forcing when safe cases are always done first. Figure 17 shows such a contrived Tgraph formed by removing the faces shown in green from a twice decomposed sun on the left. The forced result is shown on the right. When there are no safe cases, we need to try an unsafe one. The four green faces at the bottom are blocked by the touching vertex check. This leaves any one of 9 half-kites at the centre which would pass the check. But after just one of these is added, the check is not needed again. There is always a safe addition to be done at each step until all the green faces are added.

Figure 17: A contrived example requiring a touching vertex check

Boundary information

The implementation of forcing has been made more efficient by calculating some boundary information in advance. This boundary information uses a type Boundary

data Boundary 
  = Boundary
    { bDedges     :: [(Vertex,Vertex)]
    , bvFacesMap  :: Mapping Vertex [TileFace]
    , bvLocMap    :: Mapping Vertex (Point V2 Double)
    , allFaces    :: [TileFace]
    , allVertices :: [Vertex]
    , nextVertex  :: Vertex
    } deriving (Show)

This records the boundary directed edges (bDedges) plus a mapping of the boundary vertices to their incident faces (bvFacesMap) plus a mapping of the boundary vertices to their positions (bvLocMap). It also keeps track of all the faces and vertices. The boundary information is easily incremented for each face addition without being recalculated from scratch, and a final graph with all the new faces is easily recovered from the boundary information when there are no more updates.

makeBoundary  :: Tgraph -> Boundary
recoverGraph  :: Boundary -> Tgraph

The saving that comes from using boundaries lies in efficient incremental changes to boundary information and, of course, in avoiding the need to consider internal faces. As a further optimisation we keep track of updates in a mapping from boundary directed edges to updates, and supply a list of affected edges after an update so the update calculator (update generator) need only revise these. The boundary and mapping are combined in a force state.

type UpdateMap = Mapping DEdge Update
type UpdateGenerator = Boundary -> [DEdge] -> UpdateMap
data ForceState = ForceState
       { boundaryState:: Boundary
       , updateMap:: UpdateMap
       }

Forcing then involves using a specific update generator (allUGenerator) and initialising the state, then using the recursive forceAll which keeps doing updates until there are no more, before recovering the final graph.

force:: Tgraph -> Tgraph
force = forceWith allUGenerator

forceWith:: UpdateGenerator -> Tgraph -> Tgraph
forceWith uGen = recoverGraph . boundaryState . 
                 forceAll uGen . initForceState uGen

forceAll :: UpdateGenerator -> ForceState -> ForceState
initForceState :: UpdateGenerator -> Tgraph -> ForceState

In addition to force we can easily define

wholeTiles:: Tgraph -> Tgraph
wholeTiles = forceWith wholeTileUpdates 

which just uses the first forcing rule to make sure every half-tile has a matching other half.

We also have a version of force which counts to a specific number of face additions.

stepForceWith :: UpdateGenerator -> Int -> ForceState -> ForceState

This proved essential in uncovering problems of accumulated inaccuracy in calculating boundary positions (now fixed).
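The control flow of forceAll and stepForceWith can be illustrated independently of Tgraphs. The following is a minimal sketch under stated assumptions: runAll, runSteps and countTo10 are hypothetical names (not part of the library), and a single update is modelled as a function returning Nothing when no update remains.

```haskell
-- Hypothetical sketch: the state type s stands in for ForceState, and
-- the step function stands in for "do the next update, if there is one".

-- Keep applying updates until the step function reports none are left.
runAll :: (s -> Maybe s) -> s -> s
runAll step s = maybe s (runAll step) (step s)

-- As runAll, but stop after at most n successful updates.
runSteps :: (s -> Maybe s) -> Int -> s -> s
runSteps step n s
  | n <= 0    = s
  | otherwise = maybe s (runSteps step (n - 1)) (step s)

-- Toy update for experimenting: count up to 10, then nothing is left to do.
countTo10 :: Int -> Maybe Int
countTo10 k = if k < 10 then Just (k + 1) else Nothing
```

With the toy update, runSteps countTo10 3 0 stops after three updates, while runAll countTo10 0 runs until no update applies. Stopping early in this way is what makes it possible to inspect an intermediate state.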

Some Other Experiments

Below we describe results of some experiments using the tools introduced above. Specifically: emplacements, sub-Tgraphs, incorrect tilings, and composition choices.

Emplacements

The finite number of rules used in forcing are based on local boundary vertex and edge information only. We may be able to improve on this by considering a composition and forcing at the next level up before decomposing and forcing again. This thus considers slightly broader local information. In fact we can iterate this process to all the higher levels of composition. Some graphs produce an empty graph when composed so we can regard those as maximal compositions. For example composeG fool produces an empty graph.

The idea now is to take an arbitrary graph and apply (composeG . force) repeatedly to find its maximally composed graph, then to force the maximal graph before applying (force . decomposeG) repeatedly back down to the starting level (so the same number of decompositions as compositions).

We call the function emplace, and call the result the emplacement of the starting graph as it shows a region of influence around the starting graph.
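The recursion just described can be sketched generically. This is an illustration only: emplaceWith is a hypothetical name, and the emptiness test, force, compose and decompose operations are passed in as parameters so that the block is self-contained rather than tied to the real Tgraph functions.

```haskell
-- Hypothetical sketch of the emplacement recursion, abstracted over the
-- graph type g. The four arguments play the roles of an emptiness test,
-- force, composeG and decomposeG respectively.
emplaceWith :: (g -> Bool) -> (g -> g) -> (g -> g) -> (g -> g) -> g -> g
emplaceWith isEmpty forceF composeF decomposeF = go
  where
    go g
      | isEmpty g' = fg                             -- fg is the forced maximal graph
      | otherwise  = (forceF . decomposeF . go) g'  -- recurse one level up, then come back down
      where
        fg = forceF g
        g' = composeF fg

-- Toy model: a "graph" is a composition level plus a log of the operations applied.
type Toy = (Int, [String])

toyEmplace :: Toy -> Toy
toyEmplace = emplaceWith (\(l, _) -> l <= 0) forceT composeT decomposeT
  where
    forceT     (l, ops) = (l,     ops ++ ["force"])
    composeT   (l, ops) = (l - 1, ops ++ ["compose"])
    decomposeT (l, ops) = (l + 1, ops ++ ["decompose"])
```

In the toy model, toyEmplace (2, []) logs force, compose, force, decompose, force: one composition up to the maximal level, a force there, and one forced decomposition back down to the starting level.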

With earlier versions of forcing when we had fewer rules, emplace g often extended force g for a Tgraph g. This allowed the identification of some new rules. Since adding the new rules we have not yet found graphs with different results from force and emplace. [Although, the vertex labelling of the result will usually be different].

Sub-Tgraphs

In figure 18 on the left we have a four times decomposed dart dartD4 followed by two sub-Tgraphs brokenDart and badlyBrokenDart which are constructed by removing faces from dartD4 (but retaining the connectedness condition and the no crossing boundaries condition). These all produce the same forced result (depicted middle row left in figure 15).

Figure 18: dartD4, brokenDart, badlyBrokenDart

However, if we do compositions without forcing first we find badlyBrokenDart fails because it produces a graph with crossing boundaries after 3 compositions. So composeG on its own is not always safe, where safe means guaranteed to produce a valid Tgraph from a valid correct Tgraph.

In other experiments we tried force on Tgraphs with holes and on incomplete boundaries around a potential hole. For example, we have taken the boundary faces of a forced, 5 times decomposed dart, then removed a few more faces to make a gap (which is still a valid Tgraph). This is shown at the top in figure 19. The result of forcing reconstructs the complete original forced graph. The bottom figure shows an intermediate stage after 2200 face additions. The gap cannot be closed off to make a hole as this would create a crossing boundary, but the channel does get filled and eventually closes the gap without creating a hole.

Figure 19: Forcing boundary faces with a gap (after 2200 steps)

Incorrect Tilings

When we say a Tgraph g is a correct graph (respectively: incorrect graph), we mean g represents a correct tiling (respectively: incorrect tiling). A simple example of an incorrect graph is a kite with a dart on each side (called a mistake by Penrose) shown on the left of figure 20.

*Main> mistake
Tgraph { vertices = [1,2,4,3,5,6,7,8]
       , faces = [RK (1,2,4),LK (1,3,2),RD (3,1,5)
                 ,LD (4,6,1),LD (3,5,7),RD (4,8,6)
                 ]
       }

If we try to force (or emplace) this graph it produces an error in construction which is detected by the test for conflicting edge types (a phiEdge sharing with a non-phiEdge).

*Main> force mistake
Tgraph {vertices = *** Exception: doUpdate:(incorrect tiling)
Conflicting new face RK (11,1,6)
with neighbouring faces
[RK (9,1,11),LK (9,5,1),RK (1,2,4),LK (1,3,2),RD (3,1,5),LD (4,6,1),RD (4,8,6)]
in boundary
Boundary ...

In figure 20 on the right, we see that after successfully constructing the two whole kites on the top dart short edges, there is an attempt to add an RK on edge (1,6). The process finds an existing edge (1,11) in the correct direction for one of the new edges so tries to add the erroneous RK (11,1,6) which fails a noConflicts test.

Figure 20: An incorrect graph (mistake), and the point at which force mistake fails

So it is certainly true that incorrect graphs may fail on forcing, but forcing cannot create an incorrect graph from a correct graph.

If we apply decomposeG to mistake it produces another incorrect graph (which is similarly detected if we apply force), but will nevertheless still compose back to mistake if we do not try to force.

Interestingly, though, the incorrectness of a graph is not always preserved by decomposeG. If we start with mistake1, which is mistake with just two of the half darts (and also an incorrect tiling), we still get a similar failure on forcing, but decomposeG mistake1 is no longer incorrect. If we apply composeG to the result, or force then composeG, the mistake is thrown away to leave just a kite (see figure 21). This is an example where composeG is not a left inverse to either decomposeG or (force . decomposeG).

Figure 21: mistake1 with its decomposition, forced decomposition, and recomposed.

Composing with Choices

We know that unknowns indicate possible choices (although some choices may lead to incorrect graphs). As an experiment we introduce

makeChoices :: Tgraph -> [Tgraph]

which produces 2^n alternatives for the 2 choices of each of n unknowns (prior to composing). This uses forceLDB which forces an unknown to be a largeDartBase by adding an appropriate joined half dart at the node, and forceLKC which forces an unknown to be a largeKiteCentre by adding a half dart and a whole kite at the node (making up the 3 pieces for a larger half kite).
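The 2^n growth can be seen in a small sketch that ignores the graph itself and only enumerates the assignments. The names Choice and choiceAssignments are hypothetical, introduced here for illustration; the real makeChoices works on Tgraphs and adds faces via forceLDB and forceLKC.

```haskell
-- Hypothetical sketch: enumerate every way of resolving a list of unknowns.
data Choice = LargeDartBase | LargeKiteCentre
  deriving (Eq, Show)

-- Each unknown independently becomes one of the two alternatives, so the
-- list monad (via mapM) yields 2^n assignments for n unknowns.
choiceAssignments :: [v] -> [[(v, Choice)]]
choiceAssignments = mapM (\v -> [(v, LargeDartBase), (v, LargeKiteCentre)])
```

Two unknowns give four assignments, and five unknowns already give 32, which is why the number of alternatives becomes a concern for larger graphs.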

Figure 22 illustrates the four choices for composing fool this way. The top row has the four choices of makeChoices fool (with the fool shown embedded in red in each case). The bottom row shows the result of applying composeG to each choice.

Figure 22: makeChoices fool (top row) and composeG of each choice (bottom row)

In this case, all four compositions are correct tilings. The problem is that, in general, some of the choices may lead to incorrect tilings. More specifically, a choice of one unknown can determine what other unknowns have to become with constraints such as

  • a and b have to be opposite choices
  • a and b have to be the same choice
  • a and b cannot both be largeKiteCentres
  • a and b cannot both be largeDartBases

This analysis of constraints on unknowns is not trivial. The potential exponential results from choices suggests we should compose and force as much as possible and only consider unknowns of a maximal graph.

For calculating the emplacement of a graph, we first find the forced maximal graph before decomposing. We could also consider using makeChoices at this top step when there are unknowns, i.e. a version of emplace which produces these alternative results (emplaceChoices).

The result of emplaceChoices is illustrated for foolD in figure 23. The first force and composition is unique, producing the fool level, at which point we get 4 alternatives, each of which composes further as previously illustrated in figure 22. Each of these is forced, then decomposed and forced, decomposed and forced again back down to the starting level. In figure 23 foolD is overlaid on the 4 alternative results. What they have in common is (as you might expect) emplace foolD which equals force foolD and is the graph shown on the right of figure 11.

Figure 23: emplaceChoices foolD

Future Work

I am collaborating with Stephen Huggett who suggested the use of graphs for exploring properties of the tilings. We now have some tools to experiment with but we would also like to complete some formalisation and proofs. For example, we do not know if force g always produces the same result as emplace g. [Update (August 2022): We now have an example where force g strictly includes emplace g].

It would also be good to establish that g is incorrect iff force g fails.

We have other conjectures relating to subgraph ordering of Tgraphs and Galois connections to explore.

by readerunner at August 12, 2022 04:14 PM

Diagrams for Penrose Tiles

Penrose Kite and Dart Tilings with Haskell Diagrams

Revised version (no longer the full program in this literate Haskell)

Infinite non-periodic tessellations of Roger Penrose’s kite and dart tiles.


As part of a collaboration with Stephen Huggett, working on some mathematical properties of Penrose tilings, I recognised the need for quick renderings of tilings. I thought Haskell diagrams would be helpful here, and that turned out to be an excellent choice. Two dimensional vectors were well-suited to describing tiling operations and these are included as part of the diagrams package.

This literate Haskell uses the Haskell diagrams package to draw tilings with kites and darts. It also implements the main operations of compChoices and decompose which are essential for constructing tilings (explained below).

Firstly, these 5 lines are needed in Haskell to use the diagrams package:

{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts          #-}
{-# LANGUAGE TypeFamilies              #-}
import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

and we will also import a module for half tiles (explained later)

import HalfTile

These are the kite and dart tiles.

Kite and Dart

The red line marking on the right hand copies is purely to illustrate rules about how tiles can be put together for legal (non-periodic) tilings. Obviously edges can only be put together when they have the same length. If all the tiles are marked with red lines as illustrated on the right, the vertices where tiles meet must all have a red line or none must have a red line at that vertex. This prevents us from forming a simple rhombus by placing a kite top at the base of a dart and thus enabling periodic tilings.

All edges are powers of the golden section \phi which we write as phi.

phi = (1.0 + sqrt 5.0) / 2.0

So if the shorter edges are unit length, then the longer edges have length phi. We also have the interesting property of the golden section that phi^2 = phi + 1 and so 1/phi = phi-1, phi^3 = 2phi +1 and 1/phi^2 = 2-phi.
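These identities are easy to check numerically. The sketch below is a stand-alone illustration: it repeats the definition of phi so that it runs on its own, and approxEq and goldenIdentities are hypothetical helper names.

```haskell
phi :: Double
phi = (1.0 + sqrt 5.0) / 2.0

-- Compare floating point values up to a small tolerance.
approxEq :: Double -> Double -> Bool
approxEq x y = abs (x - y) < 1e-12

-- One entry per identity quoted in the text, in the same order.
goldenIdentities :: [Bool]
goldenIdentities =
  [ approxEq (phi ^ (2 :: Int)) (phi + 1)
  , approxEq (1 / phi) (phi - 1)
  , approxEq (phi ^ (3 :: Int)) (2 * phi + 1)
  , approxEq (1 / phi ^ (2 :: Int)) (2 - phi)
  ]
```

Evaluating and goldenIdentities confirms all four identities to within the tolerance. The forms phi-1 and 2-phi are the ones used later as scale factors when decomposing pieces.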

All angles in the figures are multiples of tt which is 36 deg or 1/10 turn. We use ttangle to express such angles (e.g. 180 degrees is ttangle 5).

ttangle:: Int -> Angle Double
ttangle n = (fromIntegral (n `mod` 10))*^tt
             where tt = 1/10 @@ turn
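To see the effect of the mod 10 without loading the diagrams package, here is a plain-number analogue. The name ttDegrees is hypothetical; it returns degrees as a Double instead of an Angle.

```haskell
-- Hypothetical plain-number analogue of ttangle: the argument is reduced
-- mod 10 and then converted to degrees (one tenth of a turn is 36 degrees).
ttDegrees :: Int -> Double
ttDegrees n = fromIntegral (n `mod` 10) * 36
```

Because Haskell's mod takes the sign of the divisor, a negative argument wraps to the equivalent positive angle: ttDegrees (-1) is 324, the same as ttDegrees 9, and ttDegrees 12 is 72.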


In order to implement compChoices and decompose, we need to work with half tiles. We now define these in the separately imported module HalfTile with constructors for Left Dart, Right Dart, Left Kite, Right Kite

data HalfTile rep = LD rep -- defined in HalfTile module
                  | RD rep
                  | LK rep
                  | RK rep

where rep is a type variable allowing for different representations. However, here, we want to use a more specific type which we will call Piece:

type Piece = HalfTile (V2 Double)

where the half tiles have a simple 2D vector representation to provide orientation and scale. The vector represents the join edge of each half tile where halves come together. The origin for a dart is the tip, and the origin for a kite is the acute angle tip (marked in the figure with a red dot).

These are the only 4 pieces we use (oriented along the x axis)

ldart,rdart,lkite,rkite:: Piece
ldart = LD unitX
rdart = RD unitX
lkite = LK (phi*^unitX)
rkite = RK (phi*^unitX)

Perhaps confusingly, we regard left and right of a dart differently from left and right of a kite when viewed from the origin. The diagram shows the left dart before the right dart and the left kite before the right kite. Thus in a complete tile, going clockwise round the origin the right dart comes before the left dart, but the left kite comes before the right kite.

When it comes to drawing pieces, for the simplest case, we just want to show the two tile edges of each piece (and not the join edge). These edges are calculated as a list of 2 new vectors, using the join edge vector v. They are ordered clockwise from the origin of each piece

pieceEdges:: Piece -> [V2 Double]
pieceEdges (LD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 9) v
pieceEdges (RD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 1) v
pieceEdges (RK v) = [v',v ^-^ v'] where v' = rotate (ttangle 9) v
pieceEdges (LK v) = [v',v ^-^ v'] where v' = rotate (ttangle 1) v
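As a sanity check on these formulas, the same case analysis can be replayed on plain pairs of Doubles to confirm that each of the four standard pieces has one long edge of length phi followed by one short edge of length 1. This is a stand-alone sketch: Vec, rotateTT, scaleV, subV, norm, Half, edges and edgeLengths are hypothetical names used here in place of the diagrams vector operations.

```haskell
type Vec = (Double, Double)

phi :: Double
phi = (1 + sqrt 5) / 2

-- Anticlockwise rotation by n tenths of a turn (n taken mod 10).
rotateTT :: Int -> Vec -> Vec
rotateTT n (x, y) = (x * c - y * s, x * s + y * c)
  where
    a = fromIntegral (n `mod` 10) * pi / 5
    c = cos a
    s = sin a

scaleV :: Double -> Vec -> Vec
scaleV k (x, y) = (k * x, k * y)

subV :: Vec -> Vec -> Vec
subV (x1, y1) (x2, y2) = (x1 - x2, y1 - y2)

norm :: Vec -> Double
norm (x, y) = sqrt (x * x + y * y)

-- Half tiles represented by their join vector, as in the text.
data Half = LD Vec | RD Vec | LK Vec | RK Vec

-- Same case analysis as pieceEdges, on plain pairs.
edges :: Half -> [Vec]
edges (LD v) = [v', subV v v'] where v' = scaleV phi (rotateTT 9 v)
edges (RD v) = [v', subV v v'] where v' = scaleV phi (rotateTT 1 v)
edges (RK v) = [v', subV v v'] where v' = rotateTT 9 v
edges (LK v) = [v', subV v v'] where v' = rotateTT 1 v

edgeLengths :: Half -> [Double]
edgeLengths = map norm . edges
```

For a dart with join vector (1,0) and for a kite with join vector (phi,0), edgeLengths gives approximately [phi, 1] in each case, matching the long and short tile edges.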

Now drawing lines for the 2 outer edges of a piece is simply

drawPiece:: Piece -> Diagram B
drawPiece = strokeLine . fromOffsets . pieceEdges

It is also useful to calculate a list of the 4 tile edges of a completed half-tile piece clockwise from the origin of the tile. (This is useful for colour filling a tile)

tileEdges:: Piece -> [V2 Double]
tileEdges (LD v) = pieceEdges (RD v) ++ map negated (reverse (pieceEdges (LD v)))
tileEdges (RD v) = tileEdges (LD v)
tileEdges (LK v) = pieceEdges (LK v) ++ map negated (reverse (pieceEdges (RK v)))
tileEdges (RK v) = tileEdges (LK v)

To fill whole tiles with colours, darts with dcol and kites with kcol we can use leftFillDK. This uses only the left pieces to identify the whole tile and ignores right pieces so that a tile is not filled twice.

leftFillDK:: Colour Double -> Colour Double -> Piece -> Diagram B
leftFillDK dcol kcol c =
  case c of (LD _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c)
                       # fc dcol
            (LK _) -> (strokeLoop $ glueLine $ fromOffsets $ tileEdges c)
                        # fc kcol
            _      -> mempty

To fill half tiles separately, we can use fillPiece which fills without drawing edges of a half tile.

fillPiece:: Colour Double -> Piece -> Diagram B
fillPiece col piece = drawJPiece piece # fc col # lw none

For an alternative fill operation we can use fillDK which fills darts and kites with given colours and draws the edges with drawPiece.

fillDK:: Colour Double -> Colour Double -> Piece -> Diagram B
fillDK dcol kcol piece = drawPiece piece <> fillPiece col piece where
    col = case piece of
            (LD _) -> dcol
            (RD _) -> dcol
            (LK _) -> kcol
            (RK _) -> kcol

By making Pieces transformable we can reuse generic transform operations. These 4 lines of code are required to do this

type instance N (HalfTile a) = N a
type instance V (HalfTile a) = V a
instance Transformable a => Transformable (HalfTile a) where
    transform t ht = fmap (transform t) ht

So we can also scale a piece and rotate a piece by an angle. (Positive rotations are in the anticlockwise direction.)

scale:: Double -> Piece -> Piece
rotate :: Angle Double -> Piece -> Piece


A patch is a list of located pieces (each with a 2D point)

type Patch = [Located Piece]

To turn a whole patch into a diagram using some function cd for drawing the pieces, we use

patchWith cd patch = position $ fmap (viewLoc . mapLoc cd) patch

Here mapLoc applies a function to the piece in a located piece – producing a located diagram in this case, and viewLoc returns the pair of point and diagram from a located diagram. Finally position forms a single diagram from the list of pairs of points and diagrams.

The common special case drawPatch uses drawPiece on each piece

drawPatch = patchWith drawPiece

Patches are automatically inferred to be transformable now Pieces are transformable, so we can also scale a patch, translate a patch by a vector, and rotate a patch by an angle.

scale :: Double -> Patch -> Patch
rotate :: Angle Double -> Patch -> Patch
translate:: V2 Double -> Patch -> Patch

As an aid to creating patches with 5-fold rotational symmetry, we combine 5 copies of a basic patch (rotated by multiples of ttangle 2 successively).

penta:: Patch -> Patch
penta p = concatMap copy [0..4] 
            where copy n = rotate (ttangle (2*n)) p

This must be used with care to avoid nonsense patches. But two special cases are

sun =  penta [rkite `at` origin, lkite `at` origin]
star = penta [rdart `at` origin, ldart `at` origin]

This figure shows some example patches, drawn with drawPatch. The first is a star and the second is a sun.

tile patches

The tools so far for creating patches may seem limited (and do not help with ensuring legal tilings), but there is an even bigger problem.

Correct Tilings

Unfortunately, correct tilings – that is, tilings which can be extended to infinity – are not as simple as just legal tilings. It is not enough to have a legal tiling, because an apparent (legal) choice of placing one tile can have non-local consequences, causing a conflict with a choice made far away in a patch of tiles, resulting in a patch which cannot be extended. This suggests that constructing correct patches is far from trivial.

The infinite number of possible infinite tilings do have some remarkable properties. Any finite patch from one of them will occur in all the others (infinitely many times) and within a relatively small radius of any point in an infinite tiling. (For details of this see links at the end.)

This is why we need a different approach to constructing larger patches. There are two significant processes used for creating patches, namely compChoices and decompose.

To understand these processes, take a look at the following figure.


Here the small pieces have been drawn in an unusual way. The edges have been drawn with dashed lines, but long edges of kites have been emphasised with a solid line and the join edges of darts marked with a red line. From this you may be able to make out a patch of larger scale kites and darts. This is a composed patch arising from the smaller scale patch. Conversely, the larger kites and darts decompose to the smaller scale ones.


Since the rule for decomposition is uniquely determined, we can express it as a simple function on patches.

decompose :: Patch -> Patch
decompose = concatMap decompPiece

where the function decompPiece acts on located pieces and produces a list of the smaller located pieces contained in the piece. For example, a larger right dart will produce both a smaller right dart and a smaller left kite. Decomposing a located piece also takes care of the location, scale and rotation of the new pieces.

decompPiece lp = case viewLoc lp of
  (p, RD vd)-> [ LK vd  `at` p
               , RD vd' `at` (p .+^ v')
               ] where v'  = phi*^rotate (ttangle 1) vd
                       vd' = (2-phi) *^ (negated v') -- (2-phi) = 1/phi^2
  (p, LD vd)-> [ RK vd `at` p
               , LD vd' `at` (p .+^ v')
               ]  where v'  = phi*^rotate (ttangle 9) vd
                        vd' = (2-phi) *^ (negated v')  -- (2-phi) = 1/phi^2
  (p, RK vk)-> [ RD vd' `at` p
               , LK vk' `at` (p .+^ v')
               , RK vk' `at` (p .+^ v')
               ] where v'  = rotate (ttangle 9) vk
                       vd' = (2-phi) *^ v' -- v'/phi^2
                       vk' = ((phi-1) *^ vk) ^-^ v' -- (phi-1) = 1/phi
  (p, LK vk)-> [ LD vd' `at` p
               , RK vk' `at` (p .+^ v')
               , LK vk' `at` (p .+^ v')
               ] where v'  = rotate (ttangle 1) vk
                       vd' = (2-phi) *^ v' -- v'/phi^2
                       vk' = ((phi-1) *^ vk) ^-^ v' -- (phi-1) = 1/phi

This is illustrated in the following figure for the cases of a right dart and a right kite.


The symmetric diagrams for left pieces are easy to work out from these, so they are not illustrated.

With the decompose operation we can start with a simple correct patch, and decompose repeatedly to get more and more detailed patches. (Each decomposition scales the tiles down by a factor of 1/phi but we can rescale at any time.)

This figure illustrates how each piece decomposes with 4 decomposition steps below each one.

four decompositions of pieces
thePieces =  [ldart, rdart, lkite, rkite]  
fourDecomps = hsep 1 $ fmap decomps thePieces # lw thin where
        decomps pc = vsep 1 $ fmap drawPatch $ take 5 $ decompositions [pc `at` origin] 

We have made use of the fact that we can create an infinite list of finer and finer decompositions of any patch, using:

decompositions:: Patch -> [Patch]
decompositions = iterate decompose

We could get the n-fold decomposition of a patch as just the nth item in a list of decompositions.
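To get a feel for how quickly patches grow, the decomposition rule can be replayed on piece kinds alone, ignoring positions and vectors. This is a sketch with hypothetical names (Kind, decompKinds, decompositionKinds, sunKinds, pieceCounts); the constructors mirror those of HalfTile without a representation argument, and the cases follow decompPiece above.

```haskell
-- Half-tile kinds without any geometric representation.
data Kind = LD | RD | LK | RK
  deriving (Eq, Show)

-- Which kinds each half-tile decomposes into (same cases as decompPiece).
decompKinds :: Kind -> [Kind]
decompKinds RD = [LK, RD]
decompKinds LD = [RK, LD]
decompKinds RK = [RD, LK, RK]
decompKinds LK = [LD, RK, LK]

-- Infinite list of successive decompositions, as with decompositions.
decompositionKinds :: [Kind] -> [[Kind]]
decompositionKinds = iterate (concatMap decompKinds)

-- The sun consists of 5 right kites and 5 left kites.
sunKinds :: [Kind]
sunKinds = concat (replicate 5 [RK, LK])

-- Number of half-tiles at each of the first n levels.
pieceCounts :: Int -> [Int]
pieceCounts n = map length (take n (decompositionKinds sunKinds))
```

Here pieceCounts 4 is [10,30,80,210], and the sixth decomposition of the sun has 3770 half-tiles. Indexing into the infinite list with (!!) is the same device used to pick out an n-fold decomposition of a patch.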

For example, here is an infinite list of decomposed versions of sun.

suns = decompositions sun

The coloured tiling shown at the beginning is simply 6 decompositions of sun displayed using leftFillDK

sun6 = suns!!6
filledSun6 = patchWith (leftFillDK red blue) sun6 # lw ultraThin

The earlier figure illustrating larger kites and darts emphasised from the smaller ones is also sun6 but this time drawn with

experimentFig = patchWith experiment sun6 # lw thin

where pieces are drawn with

experiment:: Piece -> Diagram B
experiment pc = emph pc <> (drawJPiece pc # dashingN [0.002,0.002] 0
                            # lw ultraThin)
  where emph pc = case pc of
   -- emphasise join edge of darts in red
          (LD v) -> (strokeLine . fromOffsets) [v] # lc red
          (RD v) -> (strokeLine . fromOffsets) [v] # lc red 
   -- emphasise long edges for kites
          (LK v) -> (strokeLine . fromOffsets) [rotate (ttangle 1) v]
          (RK v) -> (strokeLine . fromOffsets) [rotate (ttangle 9) v]

Compose Choices

You might expect composition to be a kind of inverse to decomposition, but it is a bit more complicated than that. With our current representation of pieces, we can only compose single pieces. This amounts to embedding the piece into a larger piece that matches how the larger piece decomposes. There is thus a choice at each composition step as to which of several possibilities we select as the larger half-tile. We represent this choice as a list of alternatives. This list should not be confused with a patch. It only makes sense to select one of the alternatives giving a new single piece.

The earlier diagram illustrating how decompositions are calculated also shows the two choices for embedding a right dart into either a right kite or a larger right dart. There will be two symmetric choices for a left dart, and three choices for left and right kites.

Once again we work with located pieces to ensure the resulting larger piece contains the original in its original position in a decomposition.

compChoices :: Located Piece -> [Located Piece]
compChoices lp = case viewLoc lp of
  (p, RD vd)-> [ RD vd' `at` (p .+^ v')
               , RK vk  `at` p
               ] where v'  = (phi+1) *^ vd       -- vd*phi^2
                       vd' = rotate (ttangle 9) (vd ^-^ v')
                       vk  = rotate (ttangle 1) v'
  (p, LD vd)-> [ LD vd' `at` (p .+^ v')
               , LK vk `at` p
               ] where v'  = (phi+1) *^ vd        -- vd*phi^2
                       vd' = rotate (ttangle 1) (vd ^-^ v')
                       vk  = rotate (ttangle 9) v'
  (p, RK vk)-> [ LD vk  `at` p
               , LK lvk' `at` (p .+^ lv') 
               , RK rvk' `at` (p .+^ rv')
               ] where lv'  = phi*^rotate (ttangle 9) vk
                       rv'  = phi*^rotate (ttangle 1) vk
                       rvk' = phi*^rotate (ttangle 7) vk
                       lvk' = phi*^rotate (ttangle 3) vk
  (p, LK vk)-> [ RD vk  `at` p
               , RK rvk' `at` (p .+^ rv')
               , LK lvk' `at` (p .+^ lv')
               ] where v0 = rotate (ttangle 1) vk
                       lv'  = phi*^rotate (ttangle 9) vk
                       rv'  = phi*^rotate (ttangle 1) vk
                       rvk' = phi*^rotate (ttangle 7) vk
                       lvk' = phi*^rotate (ttangle 3) vk

As the result is a list of alternatives, we need to select one to make further composition choices. We can express all the alternatives after n steps as compNChoices n where

compNChoices :: Int -> Located Piece -> [Located Piece]
compNChoices 0 lp = [lp]
compNChoices n lp = do
    lp' <- compChoices lp
    compNChoices (n-1) lp'
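The number of alternatives grows quickly with n. The sketch below repeats the structure of compNChoices on piece kinds only, so it can be run without the diagrams package. The names Kind, compKinds and compNKinds are hypothetical, and the order of alternatives follows compChoices above.

```haskell
-- Half-tile kinds without any geometric representation.
data Kind = LD | RD | LK | RK
  deriving (Eq, Show)

-- The kinds of the larger pieces offered by compChoices, in the same order.
compKinds :: Kind -> [Kind]
compKinds RD = [RD, RK]
compKinds LD = [LD, LK]
compKinds RK = [LD, LK, RK]
compKinds LK = [RD, RK, LK]

-- All alternatives after n composition steps, using the list monad.
-- Non-positive n is treated as zero steps.
compNKinds :: Int -> Kind -> [Kind]
compNKinds n k
  | n <= 0    = [k]
  | otherwise = do
      k' <- compKinds k
      compNKinds (n - 1) k'
```

Starting from a left dart, the number of alternatives after 0 to 5 steps is 1, 2, 5, 13, 34 and 89, so selecting a single path by indexing at each step (as in the figure that follows) is the practical way to explore compositions.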

This figure illustrates 5 consecutive choices for composing a left dart to produce a left kite. On the left, the finishing piece is shown with the starting piece embedded, and on the right the 5-fold decomposition of the result is shown.

five inflations
fiveCompChoices = hsep 1 $ fmap drawPatch [[ld,lk'], multiDecomp 5 [lk']] where 
-- two separate patches
       ld  = (ldart `at` origin)
       lk  = compChoices ld  !!1
       rk  = compChoices lk  !!1
       rk' = compChoices rk  !!2
       ld' = compChoices rk' !!0
       lk' = compChoices ld' !!1

Finally, at the end of this literate Haskell program we choose which figure to draw as output.

fig::Diagram B
fig = filledSun6
main = mainWith fig

That’s it. But what about composing whole patches, I hear you ask? Unfortunately we need to answer questions like what pieces are adjacent to a piece in a patch and whether there is a corresponding other half for a piece. These cannot be done easily with our simple vector representations. We would need some form of planar graph representation, which is much more involved. That is another story.

Many thanks to Stephen Huggett for his inspirations concerning the tilings. A library version of the above code is available on GitHub.

Further reading on Penrose Tilings

As well as the Wikipedia entry Penrose Tilings I recommend two articles in Scientific American from 2005 by David Austin Penrose Tiles Talk Across Miles and Penrose Tilings Tied up in Ribbons.

There is also a very interesting article by Roger Penrose himself: Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989.

More information about the diagrams package can be found from the home page Haskell diagrams.

by readerunner at August 12, 2022 10:21 AM

August 07, 2022

GHC Developer Blog

GHC 9.4.1 released


bgamari - 2022-08-07

The GHC developers are happy to announce the availability of GHC 9.4.1. Binary distributions, source distributions, and documentation are available at

This release includes:

  • A new profiling mode, -fprof-late, which adds automatic cost-center annotations to all top-level functions after Core optimisation has run. This provides informative profiles while interfering significantly less with GHC’s aggressive optimisations, making it easier to understand the performance of programs which depend upon simplification.

  • A variety of plugin improvements including the introduction of a new plugin type, defaulting plugins, and the ability for typechecking plugins to rewrite type-families.

  • An improved constructed product result analysis, allowing unboxing of nested structures, and a new boxity analysis, leading to less reboxing.

  • Introduction of a tag-check elision optimisation, bringing significant performance improvements in strict programs.

  • Generalisation of a variety of primitive types to be levity polymorphic. Consequently, the ArrayArray# type can at long last be retired, replaced by standard Array#.

  • Introduction of the \cases syntax from GHC proposal 0302.

  • A complete overhaul of GHC’s Windows support. This includes a migration to a fully Clang-based C toolchain, a deep refactoring of the linker, and many fixes in WinIO.

  • Support for multiple home packages, significantly improving support in IDEs and other tools for multi-package projects.

  • A refactoring of GHC’s error message infrastructure, allowing GHC to provide diagnostic information to downstream consumers as structured data, greatly easing IDE support.

  • Significant compile-time improvements to runtime and memory consumption.

  • An overhaul of our packaging infrastructure, allowing full traceability of release artifacts and more reliable binary distributions.

  • Reintroduction of deep subsumption (which was previously dropped with the simplified subsumption change) as a language extension.

  • … and much more. See the release notes for a full accounting.

Note that, as 9.4.1 is the first release for which the released artifacts will all be generated by our Hadrian build system, it is possible that there will be packaging issues. If you encounter trouble while using a binary distribution, please open a ticket. Likewise, if you are a downstream packager, do consider migrating to Hadrian to run your build; the Hadrian build system can be built using cabal-install, stack, or the in-tree bootstrap script. See the accompanying blog post for details on migrating packaging to Hadrian.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy testing,

- Ben

by ghc-devs at August 07, 2022 12:00 AM

August 05, 2022


GHC activities report: June-July 2022

This is the thirteenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of June and July 2022. You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools and improvements to HLS. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!


The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire.

Many others within Well-Typed, including Adam Gundry, Alfredo Di Napoli, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, are contributing to GHC more occasionally.


  • Zubin released GHC 9.2.4 which contains a backport of the DeepSubsumption language extension.

  • Ben, Matt and Doug have been finalising GHC 9.4.1, which is due to be released at the beginning of August.


  • Matt extended the support for multiple components in GHCi. In particular, enough operations are now supported to be able to use ghcid with multiple components. (!8584, !8548)

  • Zubin fixed Ctrl-C behaving very poorly in GHCi on Windows. This could cause broken terminals, even after exiting GHCi. (#21889)


  • Matt fixed a space leak in --make mode, greatly reducing the memory usage for projects which have module loops (!8710). This patch has been observed to reduce the allocation peak in --make mode when compiling packages which contain many hs-boot files – such as GHC or Agda – by around 25-30%.

  • Matt fixed an issue where dependencies were calculated incorrectly when using multiple home units with hs-boot files (!8573). This led to compiler crashes for projects with hs-boot files when used with multiple home units.

  • Matt fixed a bug where certain build steps were added into the build graph when they would never be executed. This avoids the last compilation build step in --make mode being displayed as, e.g., [7 of 8] instead of [8 of 8] (!8665).

Compiler performance

  • Andreas optimized the rule matching code in !8608, which resulted in a ~2% speedup when compiling the Cabal library.
  • Andreas investigated a compile time regression in #21839. This unearthed some potential improvements in the binary library as well as potential improvements for inlining heuristics described in #21938.
  • Zubin and Matt discovered and fixed a space leak affecting HLS that manifested when using the extendMG function from the GHC API, and also backported a fix to GHC 9.2 (#21816).


  • Sam improved the disambiguation mechanism for record updates, fixing #21443. Now, a record update such as r { fld1 = v1, fld2 = v2, fld3 = v3 } will typecheck when there is a single constructor which has all of the fields fld1, fld2 and fld3. (Beforehand, we would insist there be only one datatype with all those fields, without looking at constructors.)

  • Matt fixed a subtle issue to do with different varieties of built-in syntax which meant re-exports of FUN, TYPE, One and Many didn’t work properly (#21752, #20695, #18302).

  • Matt spent quite a bit of time helping Simon Peyton Jones with the DeepSubsumption patch by testing various iterations of the patch on the head.hackage package set. This uncovered quite a few regressions which we managed to fix before merging the feature.

Error messages

  • Matt added a flag -fsuppress-error-contexts which makes error messages less verbose (!8563, #21722).

  • Sam fixed #21662, a bug in the treatment of dictionaries in the pattern match checker. The pattern-match checker now does a better job at emitting warnings in the presence of class dictionaries.

Code generation

  • Ben debugged #21708, identifying it as a soundness issue in the implementation of the keepAlive# primop where the Cmm pipeline could in very particular cases inappropriately drop the touch# to which keepAlive# was desugared (similar to the issue seen in #14346, the issue which keepAlive# was intended to fix). As a stop-gap measure, he implemented a naive fix, making keepAlive# an out-of-line primop. He then began work on #16098, an optimisation which will allow GHC to eliminate much of the overhead incurred by this approach. Whilst difficult to trigger, the issue manifested when compiling xmonad with GHC 9.2.3 so it was critical to fix promptly in the GHC 9.2.4 release.

  • While working on #21708, Ben identified a soundness issue in GHC’s current usage of runtime-representationally polymorphic primops and started thinking about how this might be mitigated (#21868).

  • Ben debugged and fixed a bug in the AArch64/Darwin NCG which led to incorrect behavior in the presence of foreign calls to functions expecting narrow, signed arguments (#21773, #20735).

  • Ben continued work on !3012, a change introducing a standard thunk for unpacking of string literals. This significantly reduces both compilation time and code size for programs containing many strings.


  • Sam changed the desugaring of the withDict# function to prevent GHC’s typeclass specialiser from performing incorrect, semantic-changing program transformations (#21575). In future releases withDict# is intended to be a compiler primitive which replaces the assumptions that libraries such as reflection make about the internal representation of type class dictionaries.

  • Ben worked on finishing a rework of GHC’s treatment of undersaturated primops, simplifying the code generator’s treatment of primops and reducing the size of the compiler’s symbol table (#20155).

Core-to-Core pipeline

  • Andreas fixed a bug in the worker-wrapper implementation, which caused certain programs compiled with -fmax-worker-args=20 to panic (#21472).

  • Andreas fixed #21770 which caused a debugged build of GHC to panic under certain circumstances.

Runtime system

  • Ben fixed a number of issues in the RTS linker. These included the introduction of proper support for global constructors and destructors, fixing the resolution of DSO handles on Darwin, and fixing unregistration of unwind information on Windows. Together, this work fixed a number of issues in statically-linked configurations as well as enabled the RTS linker to load libc++ on Windows (#21618).

  • Ben debugged and fixed a bug in GHC 9.4’s new adjustor-pool implementation which led to double-allocation of adjustor slots and consequently crashes (#21768).

  • Ben debugged and fixed a subtle bug in the non-moving garbage collector’s scavenging logic which could result in undefined behavior (#21885).

  • Ben debugged and fixed a bug in the biographical profiler which resulted in program crashes in long-running programs (#21880).


  • Andreas fixed #21233 by restoring the textual ticky output for ticky profiling.

  • Andreas improved how the (new in 9.4) profiling mode -fprof-late interacts with the creation of unfoldings, in order to avoid interfering with Core optimizations. This means that profiles produced with late cost centres will be more faithful to their original (optimised, unprofiled) programs (#21249).


  • Ben finished work on a set of interfaces allowing users to introspect on a program’s threads, their labels and statuses (!2816). In the process he came up with a few ideas for extending this infrastructure to enable better diagnostics of looping evaluation and multi-threaded deadlocks (#21877).

  • Zubin discovered and implemented a workaround for a bug in the GHC linker affecting programs which use text-2.0 in Template Haskell splices with a statically linked compiler (#21787, text!453).

  • Doug fixed a long latent concurrency bug in GHC.Event.Thread.closeFdWith (#21651).

  • Doug fixed a bug in forkOn, where the wrong thread was context-switched (#21824).

  • Ben worked around process#247, a spurious failure of process’s posix_spawn backend caused by an infelicity in Darwin’s implementation.


  • Matt added a number of workarounds to fix latent issues in the soon-to-be retired Make build system (for example !8751, !8731).

  • Matt added support to the build system to allow a specific base-url to be used when generating haddock documentation so the result can be uploaded to hackage (!8542).

  • Ben fixed a number of issues noticed when using Hadrian-built binary distributions on Darwin (#21506, #21570).


  • Sam added a Hadrian key-value setting which allows flags to be passed when running Hsc2Hs.

  • Matt added an experimental mode to hadrian which uses GHC’s --make mode rather than -c in order to compile libraries (!8640). In the future, this should improve build times for GHC developers, as using --make mode is faster than using -c.

  • Matt added a new ./hadrian/ghci-multi target which loads the GHC project into a single GHCi multi-repl using the new multiple home units feature.


  • Ben began introducing CI infrastructure to validate building of GHC in a cross-compiled configuration (#11958).

  • Andreas did a maintenance pass over open merge requests to nofib.

by ben, andreask, matthew, zubin, sam, adam, douglas at August 05, 2022 12:00 AM

GHC Developer Blog

Migrating from Make to Hadrian (for packagers)

Sam Derbyshire - 2022-08-05

As the Hadrian build system for GHC has reached maturity and the old Make-based build system is becoming increasingly costly to maintain, the GHC maintenance team has decided that it is finally time to remove GHC’s Make-based build system. GHC 9.4 will be the last release series compatible with Make, which will be limited to booting with GHC 9.0. From 9.6 onwards, the only supported way to build GHC will be to use Hadrian.

This blog post will give an overview of using Hadrian, which should help packagers migrate from the old Make-based build system.

The Hadrian build system

Hadrian is a modular, statically-typed, extensible build system for GHC, introduced in the paper Non-recursive Make Considered Harmful. It consists of a Haskell executable implemented using the shake library, and is used to configure and build GHC.

Building Hadrian

Contributors to GHC will be accustomed to running the ./hadrian/build script, which builds and runs Hadrian. This script calls out to cabal, which fetches the dependencies of the Hadrian package from Hackage before building the resulting Haskell executable. While this is convenient for developers, it isn’t appropriate for build environments in which one doesn’t have access to the network (e.g. in order to enforce a hermetic build environment). For that reason, Hadrian provides a set of scripts for bootstrapping the build system from source tarballs. These can be found in the hadrian/bootstrap directory.

Bootstrapping Hadrian

The Hadrian bootstrap scripts are driven by a set of precomputed build plans; these depend on the version of the bootstrap GHC being used. A typical workflow might look like the following:

  • Locally:
    • Choose a build plan appropriate for the bootstrap GHC version, such as hadrian/bootstrap/plan-bootstrap-8.10.7.json. (These build plans can also be generated manually from a cabal-install build plan; see generate_bootstrap_plans)
    • Fetch the sources needed by the build plan: fetch -w <path_to_ghc> --deps plan-bootstrap-8.10.7.json -o 8_10_7_bootstrap_sources.tar.gz
  • In the build environment:
    • Provision the bootstrap-sources tarball generated above.
  • In your GHC build script:
    • Build Hadrian using the bootstrap script: -w <path_to_ghc> --bootstrap-sources 8_10_7_bootstrap_sources.tar.gz
    • Build GHC using the resulting Hadrian executable, located by default in bin/hadrian, e.g. bin/hadrian -j --flavour=perf+debug_info -w <path_to_ghc>

An example of how to use these bootstrap scripts can be seen in the ghcs-nix repository. This repository contains Nix expressions specifying how to build many GHC versions, with both Make and Hadrian.

From now on, we will assume that you have built Hadrian (either via ./hadrian/build or via bootstrapping), referring to the hadrian executable agnostically.

Using Hadrian

How does Hadrian replace make?

To build GHC, we begin as before by running ./boot (if necessary, i.e. a configure file doesn’t already exist) and then ./configure <args>. As with Make, the build environment is determined by the configure script, which will read provided arguments as well as environment variables. For example, the selection of the bootstrap compiler is done via the GHC environment variable, and the selection of the C compiler uses the CC environment variable. This is unchanged, and details can be found on the GHC wiki.

Once the configure script is run, we replace make commands with hadrian commands. The fundamental command to build GHC with Hadrian is

hadrian <args>

Hadrian stages

GHC is a self-hosted compiler. This means that we need to provide a GHC executable in order to compile GHC. In Hadrian, this is done in stages:

  • The stage0 compiler is the bootstrap compiler: a user-provided executable which will be used to compile GHC. The bootstrap compiler is chosen via the GHC variable passed to the configure script.
  • The stage1 compiler is the compiler built using the stage0 compiler; it runs on the build platform and produces code for the target platform. The stage1 compiler is limited in that it does not support dynamic code loading via the internal bytecode interpreter.
  • The stage2 compiler is the compiler built using the stage1 compiler; it runs on the target platform. The stage2 compiler is necessary for the implementation of Template Haskell and GHC plugins.

In Hadrian, build artifacts are put in a subdirectory of the build folder (by default, _build) corresponding to the stage of the compiler used to perform the build. This means that the stage2 compiler will be found (if using the default build directory, _build) at _build/stage1/bin/ghc.

Hadrian provides meta-targets which can be used to build particular subsets of the compiler. A typical Hadrian command, which builds a library or executable for a given stage, looks like

hadrian <stage>:{lib,exe}:<package name>

For example, hadrian stage2:lib:base will build the stage2 base library, and put it into the _build/stage1 subdirectory.
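
The off-by-one between a stage and the directory holding its artifacts is easy to get wrong in a packaging script. As a minimal sketch, the rule described above can be captured in a small shell helper. The name stage_dir is hypothetical (it is not something Hadrian provides), and the helper only assumes the layout stated in this post.

```shell
#!/bin/sh
# Hypothetical helper, not part of Hadrian: print the directory in which
# the artifacts of a given stage end up, following the rule stated above.
# Usage: stage_dir <stage number> [build root]
stage_dir() {
    stage=$1
    root=${2:-_build}   # _build is the default build directory
    if [ "$stage" -lt 1 ]; then
        # stage0 is the user-provided bootstrap compiler, so it has no
        # directory of its own under the build root.
        echo "stage$stage is not built by Hadrian" >&2
        return 1
    fi
    # Artifacts of stage N are produced by the stage N-1 compiler and are
    # stored in the directory named after that compiler.
    echo "$root/stage$((stage - 1))"
}

stage_dir 2                      # prints _build/stage1
echo "$(stage_dir 2)/bin/ghc"    # prints _build/stage1/bin/ghc
```

A build script could then refer to "$(stage_dir 2)/bin/ghc" rather than hard-coding the path of the stage2 compiler.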

Flavours and flavour transformers

A Hadrian build flavour is a pre-defined collection of build settings that fully define a GHC build. These are described here. The flavour being used determines the ways in which GHC and its libraries will be built, as described in the Hadrian documentation. This replaces the variables of the make build system such as GhcLibWays, DYNAMIC_GHC_PROGRAMS, DYNAMIC_BY_DEFAULT.

A flavour is set using the --flavour command-line argument, e.g. hadrian/build --flavour=perf. As a packager you probably want to use either the release or perf flavour:

flavour name   description
perf           A fully optimised bindist
release        The same configuration as perf, but with additional build products such as interface files containing Haddock docs

Flavours can be modified using flavour transformers. For example, the profiled_ghc flavour transformer compiles the GHC library and executable with cost-centre profiling enabled. One can, e.g., apply the profiled_ghc transformer to the perf flavour with hadrian --flavour=perf+profiled_ghc.

Make variable   Hadrian flavour transformer
GhcProfiled     profiled_ghc
GhcDebugged     debug_ghc
SplitObjs       split_sections

The full list of supported flavour transformers is available here.
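
When porting an existing Make-based build script, the table above can be turned into a lookup. The following is only a sketch: transformer_for and flavour_arg are hypothetical helper names, the lookup covers just the three variables listed in the table, and chaining more than one transformer with + is assumed to work the same way as the single-transformer case shown above.

```shell
#!/bin/sh
# Hypothetical helper: map a Make variable from the table above to the
# corresponding Hadrian flavour transformer.
transformer_for() {
    case $1 in
        GhcProfiled) echo profiled_ghc ;;
        GhcDebugged) echo debug_ghc ;;
        SplitObjs)   echo split_sections ;;
        *) echo "no known flavour transformer for $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: join a base flavour and any number of transformers
# with '+', the syntax accepted by the --flavour argument.
flavour_arg() {
    arg=$1
    shift
    for transformer in "$@"; do
        arg="$arg+$transformer"
    done
    echo "--flavour=$arg"
}

flavour_arg perf "$(transformer_for GhcProfiled)"   # prints --flavour=perf+profiled_ghc
flavour_arg release                                 # prints --flavour=release
```

The printed argument can be passed straight to hadrian, for example hadrian $(flavour_arg perf profiled_ghc).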

Building GHC for distribution

Packagers will be interested in the binary-dist and binary-dist-dir Hadrian targets. For example, the command

hadrian/build --flavour=release binary-dist

will produce a complete release binary distribution tarball, while the binary-dist-dir target produces the directory only (not the tarball). The resulting bindist will be placed in _build/bindist.

When building the binary-dist target, documentation (namely, Haddock documentation and GHC’s Sphinx-built user’s guide) will be built by default. Hadrian’s --docs=... command-line flag disables building various parts of the documentation. For example, if you don’t have Sphinx available, you can disable the parts of the documentation which require it:

# Build only the documentation for the base package, without using sphinx
hadrian {..} docs:base --docs=no-sphinx

Further information about configuring the documentation built by Hadrian can be found in the Hadrian readme.

Large-integer implementation

GHC supports several implementations of the Integer/Natural types and operations on them. The selection of the implementation is done using the --bignum Hadrian argument, e.g. --bignum=gmp to use the GMP library, or --bignum=native to use a pure-Haskell implementation.

Key-value settings

While we expect that the mechanisms described above will suffice for most builds, Hadrian also provides a fine-grained key-value configuration mechanism for modifying the command-lines passed to each of the tools run by Hadrian. For instance, one can pass an additional argument to all GHC invocations via:

hadrian {..} "*.*.ghc.*.opts += -my-ghc-option"

Passing an additional option when compiling the ghc library only:

hadrian {..} "*.ghc.ghc.*.opts += -my-ghc-option"

These settings can also be placed in a hadrian.settings file in the build root (by default _build), instead of passing them in the command line.
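
For packaging scripts it can be convenient to generate that file rather than maintain long command lines. The sketch below assumes that hadrian.settings holds one key-value setting per line, in the same syntax as on the command line; write_hadrian_settings is a hypothetical helper name, not part of Hadrian.

```shell
#!/bin/sh
# Hypothetical helper: write the given key-value settings, one per line,
# to hadrian.settings in the given build root.
# Assumption: the file uses the same "key += value" syntax as the command
# line, with one setting per line.
# Usage: write_hadrian_settings <build root> <setting>...
write_hadrian_settings() {
    root=$1
    shift
    mkdir -p "$root"
    # printf reuses the format for each remaining argument, so every
    # setting lands on its own line. The settings must be quoted by the
    # caller so that the shell does not expand the '*' wildcards.
    printf '%s\n' "$@" > "$root/hadrian.settings"
}

# Example call, using the two settings shown above and the default root:
#   write_hadrian_settings _build \
#       "*.*.ghc.*.opts += -my-ghc-option" \
#       "*.ghc.ghc.*.opts += -my-ghc-option"
```

Keeping the settings in a file has the advantage that the same configuration applies to every later hadrian invocation that uses this build root.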

Hadrian currently supports the following key-value settings:

  • (<stage> or *).(<package name> or *).ghc.{hs, c, cpp, link, deps, *}.opts
    Arguments passed to GHC invocations.
    • hs for arguments passed to GHC when compiling Haskell modules
    • c for arguments passed to the C compiler
    • cpp for arguments passed to the C++ compiler
    • link for arguments passed during linking
    • deps for arguments to a ghc -M command, which outputs dependency information between Haskell modules
  • (<stage> or *).(<package name> or *).cc.{c, deps, *}.opts
    Arguments passed directly to the C compiler.
  • (<stage> or *).(<package name> or *).cabal.configure.opts
    Arguments passed to the cabal configure step.
  • (<stage> or *).(<package name> or *)
    Arguments passed when running Hsc2Hs.

These Hadrian key-value settings are useful to replace the assignment of Make variables, even though Hadrian is not intended to be a one-to-one replacement of Make; recovering the behaviour with Hadrian might require a few tweaks.

Consider for example the way that Make passes flags to GHC when compiling source Haskell files. Make has several different variables, such as SRC_HC_OPTS, WAY_<way>_<pkg>_OPTS, EXTRA_HC_OPTS. These are passed in order, with later flags overriding previous ones. With Hadrian, things are much simpler, and one can usually achieve the same goal by simply setting the *.*.ghc.hs.opts Hadrian key-value setting.

The following table serves as a general guideline in migrating the use of Make variables (bearing the above caveats in mind):

Make variable                             Hadrian key-value setting
GhcLibOpts                                *.*.ghc.*.opts
GhcRtsHcOpts                              *.*.rts.*.opts
SRC_HC_OPTS, EXTRA_HC_OPTS                *.*.ghc.hs.opts
SRC_CC_OPTS, EXTRA_CC_OPTS                *.*.ghc.c.opts with -optc prefix
SRC_CPP_OPTS, EXTRA_CPP_OPTS              a combination of *.*.ghc.c.opts with -optc prefix and *.*.cc.c.opts
SRC_LD_OPTS, EXTRA_LD_OPTS                *.* with -optl prefix
<pkg>_EXTRA_LD_OPTS                       *.<pkg> with -optl prefix
<pkg>_CONFIGURE_OPTS                      *.<pkg>.cabal.configure.opts
utils/hsc2hs_dist-install_EXTRA_LD_OPTS   *.* with -L prefix

To pass module-specific or way-specific options, e.g. passing a C pre-processor option when compiling specific modules in a certain way (as when using a Way_<way>_<pkg>_OPTS Make variable), please use the programmatic interface described below.
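
To illustrate the "with -optc prefix" rows of the table, here is a rough sketch that rewrites the C compiler options previously held in SRC_CC_OPTS or EXTRA_CC_OPTS as Hadrian settings. The helper name optc_settings is hypothetical. It emits one += setting per option, which is a conservative choice made for this sketch rather than a requirement stated in the Hadrian documentation.

```shell
#!/bin/sh
# Hypothetical helper: turn C compiler options (as previously passed via
# SRC_CC_OPTS or EXTRA_CC_OPTS) into Hadrian key-value settings.
# Each option gets GHC's -optc prefix, so that GHC forwards it to the
# C compiler, and is appended to *.*.ghc.c.opts.
optc_settings() {
    for opt in "$@"; do
        printf '%s\n' "*.*.ghc.c.opts += -optc$opt"
    done
}

optc_settings -O2 -g
# prints:
#   *.*.ghc.c.opts += -optc-O2
#   *.*.ghc.c.opts += -optc-g
```

Each printed line can be passed to hadrian as a quoted argument or placed in hadrian.settings.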

Programmatic configuration

If the above configuration mechanisms aren’t sufficient, it is also possible to directly add new configurations to Hadrian. This allows finer-grained changes, such as changing the options when compiling a specific module or set of modules. If you really need to do this, you can read about it in the Hadrian manual. A good starting place to look for inspiration is Settings.Packages, which contains the arguments used to build GHC and the libraries it depends upon. The documentation for Shake is also a helpful resource, as Hadrian uses the Shake EDSL to implement its build rules.

Further support

If you are having any issues with packaging GHC after these changes, or find yourself needing to use the programmatic interface, please open an issue on the issue tracker, so that we can work together to modify Hadrian for your needs.

by ghc-devs at August 05, 2022 12:00 AM

August 04, 2022

Sandy Maguire

Why Is the Web So Monotonous? Google.

Does it ever feel like the internet is getting worse? That’s been my impression for the last decade. The internet feels now like it consists of ten big sites, plus fifty auxiliary sites that come up whenever you search for something outside of the everyday ten. It feels like it’s harder to find amateur opinions on matters, except if you look on social media, where amateur opinions are shared, unsolicited, with much more enthusiasm than they deserve. The accessibility of the top ten seems like it collapses the internet into a monoculture of extremism, and, perhaps even more disappointingly, a monoculture that echoes the offline world.

Contrast this to the internet of yore. By virtue of being hard to access, the internet filtered away the mass appeal it has today. It was hard and expensive to get on, and in the absence of authoring tools, you were only creating internet content if you had something to say. Which meant that, as a consumer, if you found something, you had good reason to believe it was well-informed. Why would someone go through the hassle of making a website about something they weren’t interested in?

In 2022, we have a resoundingly sad answer to that question: advertising. The primary purpose of the web today is “engagement,” which is Silicon Valley jargon for “how many ads can we push through someone’s optical nerve?” Under the purview of engagement, it makes sense to publish webpages on every topic imaginable, regardless of whether or not you know what you’re talking about. In fact, engagement goes up if you don’t know what you’re talking about; your poor reader might mistakenly believe that they’ll find the answer they’re looking for elsewhere on your site. That’s twice the advertising revenue, baby!

But the spirit of the early web isn’t gone: the bookmarks I’ve kept these long decades mostly still work, and many of them still receive new content. There’s still weird, amateur, passion-project stuff out there. It’s just hard to find. Which brings us to our main topic: search.

Google is inarguably the front page of the internet. Maybe you already know where your next destination is, in which case you probably search for the website on Google and click on the first link, rather than typing in the address yourself. Or maybe you don’t already know your destination, and you search for it. Either way, you hit Google first.

When I say the internet is getting worse, what I really mean is that the Google search results are significantly less helpful than they used to be. This requires some qualification. Google has gotten exceedingly good at organizing everyday life. It reliably gets me news, recipes, bus schedules, tickets for local events, sports scores, simple facts, popular culture, official regulations, and access to businesses. It’s essentially the yellow pages and the newspaper put together. For queries like this, which are probably 95% of Google’s traffic, Google does an excellent job.

The difficulties come in for that other 5%, the so-called “long tail.” The long tail is all those other things we want to know about. Things without well-established, factual answers. Opinions. Abstract ideas. Technical information. If you’re cynical, perhaps it’s all the stuff that doesn’t have wide-enough appeal to drive engagement. Whatever the reason, the long tail is the stuff that’s hard to find on the modern internet.

Notice that the long-tail is exactly the stuff we need search for. Mass-appeal queries are, almost by definition, not particularly hard to find. If I need a bus schedule, I know to talk to my local transit authority. If I’m looking to keep up with the Kardashians, I’m not going to have any problems (at least, no search problems.) On the other hand, it’s much less clear where to get information on why my phone starts overheating when I open the chess app.

So what happens if you search for the long tail on Google? If you’re like me, you flail around for ten minutes wasting your time reading crap articles before you remember that Google is awful for the long tail, and you come away significantly more frustrated, not having found what you were looking for in the first place.

Let’s look at some examples. One of my favorite places in the world is Koh Lanta, Thailand. When traveling, I’m always on the lookout for places that give off the Koh Lanta vibe. What does that mean? Hard to say, exactly, but having tourist amenities without being touristy. Charming, slow, cheap. I don’t know exactly; if I did, it’d be easier to find. Anyway, forgetting that Google is bad at long tails, I search for what is the koh lanta of croatia? and get:

  • Koh-Lanta - Wikipedia [note: not the island, the game show]
  • Top 15 Unique Things to Do in Koh Lanta
  • Visit Koh Lanta on a trip to Thailand
  • Beautiful places to travel, Koh lanta, Sunset
  • Holiday Vacation to Koh Lanta: Our favourite beaches and …
  • Koh Lanta Activities: 20 Best Things to Do
  • etc

With the exception of “find a flight from Dubrovnik to Koh Lanta” on page two, you need to get to page five before you see any results that even acknowledge I also searched for croatia. Not very impressive.

When you start paying attention, you’ll notice it on almost every search — Google isn’t actually giving you answers to things you searched for. Now, maybe the reason here is that there aren’t any good results for the query, but that’s a valuable thing to know as well. Don’t just hit me with garbage, it’s an insult to my intelligence and time.

Where Things Go Wrong

I wanted to figure out why exactly the internet is getting worse. What’s going on with Google’s algorithm that leads to such a monotonous, boring, corporate internet landscape? I thought I’d dig into search engine optimization (SEO) — essentially, techniques that improve a website’s ranking in Google searches. I’d always thought SEO was better at selling itself than it was at improving search results, but my god was I wrong.

SEO techniques are extremely potent, and their widespread adoption is what’s wrong with the modern web.

For example, have you ever noticed that the main content of most websites is something like 70% down the page? Every recipe site I’ve ever seen is like this — nobody cares about how this recipe was originally your great-grandmother’s. Just tell us what’s in it. Why is this so prevalent on the web?

Google rewards a website for how long a user stays on it, with the reasoning being that a bad website has the user immediately hit the back button. Seems reasonable, until you notice the problem of incentives here. Websites aren’t being rewarded for having good content under this scheme, they’re rewarded for wasting your time and making information hard to find. Outcome: websites that answer questions, but hide the information somewhere on a giant (ad-filled) page.

Relatedly, have you noticed how every website begins with a stupid paragraph overviewing the thing you’re searching for? It’s always followed by a stupid paragraph describing why you should care about the thing. For example, I just searched for garden irrigation, and the first result is:

Water is vital to plant health, but watering by hand can be a hassle. You have to drag hoses between gardens, move sprinklers around, or take the time to water each plant. Our innovative watering systems take the hassle out of watering. They’re the easiest way to give plants the consistent moisture they need for your biggest harvest and most beautiful blooms.

Water is vital to plant health. Wow, who knew! Why in god’s name would I be searching for garden irrigation if I didn’t know that water was vital to plant health. Why is copy like this so prevalent on the web?

Things become clearer when you look at some of the context of this page:

Url: https://[redacted]/how-to/how-to-choose-a-watering-system/8747.html

Title: How to Choose a Garden Irrigation System

Heading: Soak, Drip or Spray: Which is right for you?

Subheading: Choose the best of our easy, customizable, irrigation systems to help your plants thrive and save water

As it happens, Google rewards websites which use keywords in their url, title, headings, and first 100 words. Just by eyeballing, we can see that this particular website is targeting the keywords “water”, “system”, “irrigation”, and “garden”. Pages like these are hyper-optimized to come up for particular searches. The stupid expository stuff exists only to pack “important keywords” into the first 100 words.

But keyword targeting doesn’t stop there. As I was reading through this SEO stuff (that is, the first page of a Google search for seo tricks,) every single page offered 15-25 great, technical SEO tricks. And then, without fail, the final point on each page was “but really, the best SEO strategy is having great content!” That’s weird. “Great content” isn’t something an algorithm can identify; if it were, you wouldn’t be currently reading the ravings of a madman, angry about the state of the internet.

So, why do all of these highly-optimized SEO pages ubiquitously break form, switching from concrete techniques to platitudes? You guessed it, it’s an SEO technique! Google offers a keyword dashboard, where you can see which keywords group together, and (shudder) which keywords are trending. Google rewards you for having other keywords in the group on your page. And it extra rewards you for having trending keywords. You will not be surprised to learn that “quality content” is a keyword that clusters with “seo,” nor that it is currently a trending keyword.

Think about that for a moment. Under this policy, Google is incentivizing pages to become less focused, by adding text that is only tangentially related. But, how do related keywords come about? The only possible answer here is to find keywords that often cluster on other pages. But this is a classic death spiral, pulling every page in a topic to have the same content.

Another way of looking at it is that if you are being incentivized, you are being disincentivized. Webpages are being penalized for including original information, because original information can’t possibly be in the keyword cluster.

There are a multitude of perverse incentives from Google, but I’ll mention only two more. The first is that websites are penalized for having low-ranking pages. The conventional advice here is to delete “underperforming” pages, which only makes the search problem worse: sites are being rewarded for deleting pages that don’t align with the current search algorithm.

My last point: websites are penalized for even linking to low-ranking pages!

It’s not hard to put all of the pieces together and see why the modern web is so bland and monotonous. Not only is the front-page of the internet aggressively penalizing websites which aren’t bland and monotonous, it’s also punishing any site which has the audacity to link to more interesting parts of the web.

How Culpable is Google?

So the discoverable part of the web sucks. But is that really Google’s fault? I’d argue no. By virtue of being the front-page, Google’s search results are under extreme scrutiny. In the eyes of the non-technical population, especially the older generations, the internet and Google are synonymous. The fact is that Google gets unfairly targeted by legislation because it’s a big, powerful tech company, and we as a society are uncomfortable with that.

Worse, the guys doing the regulation don’t exactly have a grasp on how internet things work.

Society at large has been getting very worried about disinformation. Whose problem is that? Google’s, duh. Google is how we get information on the internet, so it’s up to them to defend us from disinformation.

Unfortunately it’s really hard to spot disinformation. Sometimes even the government lies to us (gasp!). I can think of two ways of avoiding getting in trouble with respect to disinformation. One: link only to official sites, thus changing the problem of trustworthiness to one of authority. If there is no authority, just give back the consensus. Two: don’t return any information whatsoever.

Google’s current strategy seems to be somewhere between one and two. For example, we can try a controversialish search like long covid doesn't exist. The top results at time of writing are:

  1. The search for Long Covid (
  2. Small Study Finds No Obvious Physical Causes for Long COVID (
  3. Fact Check-‘Long COVID’ is not fake, quoted French study did … (
  4. Harvard Medical School expert explains ‘long COVID’ (
  5. Claim that French study showed long COVID doesn’t exist … (
  6. What doctors wish patients knew about long COVID (

I’m not particularly in the know, but I recognize most of these organizations. sounds official. Not only is one of the pages from Harvard, but also it’s from a Harvard Medical School expert. I especially like the fifth one, the metadata says:

Claim: Long COVID is “mostly a mental disease”; the condition long COVID is solely due to a person’s belief, not actual disease; long COVID doesn’t exist

Fact check by Health Feedback: Inaccurate

Every one of these websites comes off as authoritative: not in the sense of “knowing what they’re talking about,” because that’s hard to verify, but in the sense of being the sort of organization we’d trust to answer this question for us. Or, in the case of number five, at least telling us that they fact checked it.

Let’s try a search for something requiring less authority, like “best books.� In the past I would get a list of books considered the best. But now I get:

  1. The Greatest Books: The Best Books of All Time - 1 to 50
  2. The Best Books of All Time |
  3. 100 Best Books of All Time - Reader’s Digest
  4. Best Book Lists - Goodreads
  5. Best Books 2022: Books We Love : NPR

You’ll notice there are no actual books here. There are only lists of best books. Cynical me notes that if you were to actually list a book, someone could find it controversial. Instead, you can link to institutional websites, and let them take the controversy for their picks.

This isn’t the way the web needs to be. Google could just as well have given me personal blogs of people talking about long covid and their favorite books, except (says cynical me) that these aren’t authoritative sources, and thus, linking to them could be considered endorsement. And the web is too big and too fast moving to risk linking to anything that hasn’t been vetted in advance. It’s just too easy to accidentally give a good result to a controversial topic, and have the law makers pounce on you. Instead, punt the problem back to authorities.

The web promised us a democratic, decentralized public forum, and all we got was the stinking yellow pages in digital format. I hope the crypto people can learn a lesson here.

Anyway, all of this is to say that I think lawmakers and liability concerns are the real reason the web sucks. All things being equal, Google would like to give us good results, but it prefers making boatloads of money, and that would be hard to do if it got regulated into nothingness.

A Note on Other Search Engines

Google isn’t the only search engine around. There are others, but it’s fascinating that none of them compete on the basis of providing better results. DDG claims to have better privacy. Ecosia claims to plant trees. Bing exists to keep Microsoft relevant post-2010, and for some reason, ranks websites for being highly-shared on social media (again, things that are, by definition, not hard to find.)

Why don’t other search engines compete on search results? It can’t be hard to do better than Google for the long tail.

What Can We Do?

It’s interesting to note that the problems of regulatory-fear and SEO-capture are functions of Google’s cultural significance. If Google were smaller or less important, there’d be significantly less negative-optimization pressure on it. Google is a victim of its own success.

That is to say, I don’t think all search engines are doomed to fail in the same way that Google has. A small search engine doesn’t need to be authoritative, because nobody is paying attention to it. And it doesn’t have to worry about SEO for the same reason — there’s no money to be made in manipulating its results.

What I dream of is Google circa 2006. A time where a search engine searched what you asked for. A time before aggressive SEO. A time before social media, when the only people on the internet had a reason to be there. A time before sticky headers and full-screen modal pop-ups asking you to subscribe to a newsletter before reading the article. A time before click-bait and subscription-only websites which tease you with a paragraph before blurring out the rest of the content.

These problems are all solvable by a search engine. But that search engine isn’t going to be Google. Let’s de-rank awful sites, and boost personal blogs of people with interesting things to say. Let’s de-rank any website that contains ads. Let’s not index any click-bait websites, which unfortunately in 2022 includes most of the news.

What we need is a search engine, by the people, and for the people. Fuck the corporate interests and the regulatory bullshit. None of this is hard to do. It just requires someone to get started.

August 04, 2022 12:00 AM

August 03, 2022


The Plutus Compilation Pipeline: Understanding Plutus Core(s)

Plutus is a strict, pure functional language. It is developed by IOHK for use on the Cardano blockchain, but in this blog post we will not be concerned with specific applications of the language, but instead look at its compilation pipeline.

Technically speaking, Plutus is not one language, but three, and most people who write “Plutus” are not really writing Plutus at all, but Haskell. These Haskell programs are translated to the Plutus Intermediate Representation (PIR). After that, data types are replaced by their Scott encodings, and recursion is replaced by explicit fixpoints to get to typed Plutus Core (PLC). Finally, types are erased to get to the Untyped Plutus Core (UPLC).

In this blog post, we will explain this entire process, with a particular focus on

  • The consequences of polymorphism in the typed language on the untyped language
  • Scott encoding
  • Type and term level fixpoints

We won’t try to explain the default syntax of Plutus Core in this blog post, because it is very verbose; the syntax we use here is generated by a custom pretty-printer we wrote. If you want to learn more about the default syntax, or need a more gentle introduction to lambda calculus, you might want to check out An Introduction to Plutus Core.


Throughout this blog post we will assume the reader is familiar with Haskell, and we will use Haskell in sections marked In Haskell as a vehicle for explaining some of the concepts needed to understand Plutus code, before showing those concepts in Plutus itself.

In Haskell. Force and Delay

Suppose we need to work with an API that is stricter than we’d like it to be. To have a concrete but simple (if contrived) example, let’s say we want to use this strict implementation of if-then-else:

strictIfThenElse :: Bool -> a -> a -> a
strictIfThenElse !b !t !f = if b then t else f

If we call this function with two arguments t and f, it will evaluate both arguments before even looking at b, even though only one will be required.

> strictIfThenElse True 'a' undefined
*** Exception: Prelude.undefined

If we want to avoid this problem, we somehow need to delay evaluation of the arguments. We could do this by giving them a dummy argument of unit type:

type Delayed a = () -> a

delay :: a -> Delayed a
delay x = const x

Going the other direction, if we have a delayed argument, in order to evaluate it we must force it:

force :: Delayed a -> a
force x = x ()

Using delay and force we can get the behaviour we’d expect:

> force $ strictIfThenElse True (delay 'a') (delay undefined)
'a'
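
The pieces above can be collected into one small module. This is only a sketch: strictIfThenElse, delay and force are taken from the text, while lazyChoice is a hypothetical helper added here for illustration. The BangPatterns extension is needed for the strict argument patterns.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Strict if-then-else from the text: the bang patterns force all three
-- arguments to weak head normal form before the body runs.
strictIfThenElse :: Bool -> a -> a -> a
strictIfThenElse !b !t !f = if b then t else f

-- A delayed value is a function waiting for a dummy unit argument.
type Delayed a = () -> a

delay :: a -> Delayed a
delay x = const x

force :: Delayed a -> a
force x = x ()

-- Hypothetical helper (not in the original post): choose between two
-- values without evaluating the one that is not selected. Forcing a
-- delayed value to WHNF only yields a function, never the payload.
lazyChoice :: Bool -> a -> a -> a
lazyChoice b t f = force (strictIfThenElse b (delay t) (delay f))
```

Loaded into GHCi, lazyChoice True 'a' undefined returns 'a', because the undefined branch stays hidden behind its delay.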

Simple example: function application

With these first preliminaries out of the way, let’s start taking a look at Plutus now. The compilation pipeline is as follows:

  1. Plutus programs are written using Haskell as a surface language.
  2. The Haskell code is translated to the Plutus Intermediate Representation (PIR).
  3. Through a series of transformations the compiler turns PIR into (typed) Plutus Core (PLC). The main difference between PIR and PLC is that the former has support for datatypes and recursion, and the latter does not.
  4. Finally, PLC is turned into Untyped Plutus Core (UPLC).

We will see lots of examples of each of these representations in this blog post, but let’s start with something very simple. Consider the following Haskell code:

apMono :: (Integer -> Integer) -> Integer -> Integer
apMono f x = f x

For these first few simple examples, the PIR and PLC representations are identical:

λ (f :: Integer -> Integer) (x :: Integer) -> f x

PIR and PLC are explicitly typed languages, so every variable binder is given a type annotation. These types are then removed in the translation to UPLC:

λ f x -> f x

No surprises so far, so let’s consider something a bit more interesting:

apId :: (forall a. a -> a) -> Integer -> Integer
apId f x = f x

Same code as above, but with a different type: we now insist that the first argument is polymorphic in a. The only (total) function of type forall a. a -> a is the identity, so apId isn’t particularly useful, but it will serve to illustrate a point. The PIR/PLC translation of apId is:

λ (f :: ∀ a. a -> a) (x :: Integer) -> f {Integer} x

Polymorphic functions in Haskell become functions that take type arguments in PIR; in the body we see f being applied to the type argument Integer and then to the regular argument x. (We will use curly brackets for type arguments and type application.) Still nothing unusual, but in the translation to UPLC we see the first Plutus specific feature:

λ f x -> f! x

That exclamation mark is the force operator we discussed above. UPLC has explicit constructs for delay and force, which we will write as λ ~ -> e and e!, respectively:

λ ~ -> e  ==  delay e
e!        ==  force e

The question is: why is f being forced here?

Polymorphism versus strictness

To illustrate why f was being forced in that last example, let’s consider a slightly more interesting PIR program[1]:

let* !boom = λ {a} -> ERROR
     !id   = λ {a} (x :: a) -> x
in λ (b :: Bool) -> ifThenElse {∀ a. a -> a} b boom id

This defines a function that takes an argument b of type Bool. When the caller passes True, the function evaluates to ERROR (aka undefined in Haskell); otherwise it evaluates to the identity function.

You might expect the untyped version of this program to look something like this:

λ b -> ifThenElse b ERROR (λ x -> x)

However, as we mentioned above, Plutus Core is a strict language: function arguments are evaluated before the function is called. This means that this translation is not correct: the call to ERROR would be evaluated and the program would crash, independent of the value of b.

In the original program this is not the case: the argument passed to ifThenElse is (λ {a} -> ERROR), which is a function. Evaluating a function does nothing, so the original PIR program does have the behaviour we need.[2]

This means that when we erase types to get from PLC to UPLC, we have to make sure that we don’t make programs stricter than they were. The Plutus compiler does this by replacing type arguments by delays. The UPLC translation of our example is:

λ b -> ifThenElse# ! b (λ ~ -> ERROR) (λ ~ y -> y)

Using the definitions from the section on Force and Delay, we could write this UPLC code in (Pseudo-)Haskell as

maybeExplode :: Bool -> Delayed (∀ a. a -> a)
maybeExplode b = strictIfThenElse b (delay undefined) (delay id)

This is nearly identical to the strictIfThenElse example we saw before, except that there is no need to force the final result: we want to return a delayed function, because we are returning a polymorphic function.

Finally, this also explains why we need to force the call to the built-in function ifThenElse#: it too is a polymorphic function, and hence it too is delayed.
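
The pseudo-Haskell maybeExplode above can be made to typecheck with a small detour. The following is a sketch under one assumption of ours: we wrap the polymorphic type (∀ a. a -> a) in a hypothetical newtype PolyId, so that GHC does not need impredicative types to pass it through strictIfThenElse. Everything else follows the definitions from the section on Force and Delay.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE RankNTypes #-}

type Delayed a = () -> a

delay :: a -> Delayed a
delay x = const x

force :: Delayed a -> a
force x = x ()

strictIfThenElse :: Bool -> a -> a -> a
strictIfThenElse !b !t !f = if b then t else f

-- Hypothetical wrapper (an assumption of this sketch): hides the
-- polymorphic type (forall a. a -> a) behind an ordinary type.
newtype PolyId = PolyId (forall a. a -> a)

runPolyId :: PolyId -> a -> a
runPolyId (PolyId f) = f

-- Mirrors the UPLC term: both branches are delayed, so the strict
-- if-then-else can evaluate them without triggering the error.
maybeExplode :: Bool -> Delayed PolyId
maybeExplode b =
  strictIfThenElse b (delay (PolyId undefined)) (delay (PolyId id))
```

Calling maybeExplode True is harmless on its own; the error only surfaces once the delayed result is forced and used.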


In Haskell. Scott encoding

Consider pattern matching on a value of type Maybe a: we will need two cases, one case for Nothing and one case for Just (x :: a). We can capture this notion of “pattern matching on Maybe” as a function:

toSMaybe :: ∀ a. Maybe a -> (∀ r. r -> (a -> r) -> r)
toSMaybe x = λ n j -> maybe n j x

As is clear from the definition, this is just the well-known function maybe from the standard library with the arguments in a different order (the reason for the strange name toSMaybe will become apparent soon). We can also go the other way: from the pattern matching function to the value of type Maybe a:

fromSMaybe :: (∀ r. r -> (a -> r) -> r) -> Maybe a
fromSMaybe x = x Nothing Just

In fact, it’s not difficult to see that these two functions are mutually inverse. This means that the two representations Maybe a and (∀ r. r -> (a -> r) -> r) are isomorphic, and we could define Maybe to be the type of pattern matching functions. This is known as the Scott encoding of Maybe:

type SMaybe a = forall r. r -> (a -> r) -> r

As some further examples of Scott encoding, we will consider the case for booleans, pairs, and unit (we will see the Plutus equivalent of all of these later). In the case of Bool, pattern matching will need a case for True and a case for False; this means that the translation to the Scott encoding turns out to be if-then-else:

type SBool = forall r. r -> r -> r

toSBool :: Bool -> SBool
toSBool x = λ t f -> if x then t else f

fromSBool :: SBool -> Bool
fromSBool x = x True False

For pairs, we only need one case, which will need to accept both components of the pair as arguments. This means that the Scott-encoding is essentially uncurry:

type SPair a b = forall r. (a -> b -> r) -> r

toSPair :: (a, b) -> SPair a b
toSPair x = λ f -> uncurry f x

fromSPair :: SPair a b -> (a, b)
fromSPair x = x (,)

Finally, for unit we only need one case, with no arguments; i.e., the Scott-encoding of the unit value () is the identity function:

type SUnit = forall r. r -> r

toSUnit :: () -> SUnit
toSUnit () = id

fromSUnit :: SUnit -> ()
fromSUnit x = x ()

This last example may seem a bit silly, but we will encounter it again later.
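
To see the encodings at work, here is a minimal sketch that collects the Maybe and pair definitions from this section into one module with ordinary Haskell lambdas. The helper orElse is hypothetical (our own addition); it shows that "pattern matching" on a Scott-encoded value is nothing more than applying it to one continuation per constructor.

```haskell
{-# LANGUAGE RankNTypes #-}

type SMaybe a = forall r. r -> (a -> r) -> r

toSMaybe :: Maybe a -> SMaybe a
toSMaybe x = \n j -> maybe n j x

fromSMaybe :: SMaybe a -> Maybe a
fromSMaybe x = x Nothing Just

type SPair a b = forall r. (a -> b -> r) -> r

toSPair :: (a, b) -> SPair a b
toSPair x = \f -> uncurry f x

fromSPair :: SPair a b -> (a, b)
fromSPair x = x (,)

-- Hypothetical helper: a function written directly against the encoding.
-- The first argument to m handles Nothing, the second handles Just.
orElse :: SMaybe a -> a -> a
orElse m def = m def id
```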

Scott encoding is not Church encoding

For readers familiar with the Church encoding, it may be worth pointing out that the Scott encoding is not a fold; it only captures pattern matching, not recursion. Concretely, the Scott encoding of natural numbers is given by

newtype SNat = SNat (forall r. (SNat -> r) -> r -> r)

Note the recursive occurrence of SNat here. This means for example that unlike with Church encodings, we can easily define a predecessor function:

pred :: SNat -> SNat
pred (SNat x) = x id (error "pred: zero")

This example will look different in Plutus, however; we will come back to it when we discuss recursion.
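
A short sketch may help make the difference concrete. SNat is the type from the text; the names zero, suc, predS and toInt are ours, chosen to avoid clashing with the Prelude. Note that toInt has to recurse explicitly in Haskell, precisely because the Scott encoding does not provide a fold.

```haskell
{-# LANGUAGE RankNTypes #-}

newtype SNat = SNat (forall r. (SNat -> r) -> r -> r)

zero :: SNat
zero = SNat (\_s z -> z)

suc :: SNat -> SNat
suc n = SNat (\s _z -> s n)

-- Predecessor is a single pattern match: the successor case simply
-- returns the stored predecessor.
predS :: SNat -> SNat
predS (SNat x) = x id (error "pred: zero")

-- Recursion is not built into the encoding, so converting to Int needs
-- explicit recursion in the host language.
toInt :: SNat -> Int
toInt (SNat x) = x (\n -> 1 + toInt n) 0
```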

Simple example: booleans

Now that we understand what Scott encodings are, let’s look at how booleans work in Plutus Core, and consider compilation of the Haskell expression

ifThenElse :: Bool -> a -> a -> a
ifThenElse = λ b t f -> if b then t else f

In PIR, this looks like

let* data Bool = True  | False
in λ {a} (b :: Bool) (t :: a) (f :: a) ->
     Bool_match b {∀ _. a} (λ {_} -> t) (λ {_} -> f) {_}

Every PIR program is entirely self-contained (no external libraries), so all datatype and function definitions are included. Unlike in Haskell, datatype declarations are scoped, and automatically introduce an explicit pattern matching function; for Bool this pattern matching function is called Bool_match.

The body of the function does not introduce any concepts we didn’t see before: if-then-else is translated[3] to a call to the pattern matching function on Bool. In order to prevent the evaluation of t and f before the pattern match on b, we delay those arguments by giving them a dummy type argument; this makes them polymorphic functions, and we then force the choice once it is made by applying it to a dummy type argument[4]. As we discussed above, this type abstraction and application will then be translated to force/delay calls when the program is translated to UPLC, as we will see shortly.

Of course, in this particular case, the Plutus compiler is being overly conservative: since t and f are variables (lambda arguments), we know that they must already have been evaluated (Plutus is a strict language after all), and so the force/delay in this example is redundant and could in principle be optimized away.

This is our first example of a program that looks different in PIR and PLC, because PLC does not have data types; instead, we see the Scott encoding of Bool appear:

let* type Bool   = ∀ r. r -> r -> r
     !True       = λ {r} (t :: r) (f :: r) -> t
     !False      = λ {r} (t :: r) (f :: r) -> f
     !Bool_match = λ (x :: ∀ r. r -> r -> r) -> x
in λ {a} (b :: Bool) (t :: a) (f :: a) ->
     Bool_match b {∀ _. a} (λ {_} -> t) (λ {_} -> f) {_}

We can see that Bool is now the Scott-encoding of Bool we saw in the previous section; True and False have become functions that select the first and second argument respectively, and Bool_match is now the identity function because Scott-encoded data types are pattern matching functions.

Finally, the UPLC translation is

λ ~ b t f -> ( b! (λ ~ -> t) (λ ~ -> f) )!

We have now seen all concepts required to understand this code:

  1. The function itself is polymorphic, and so it is delayed; this explains the initial dummy argument.
  2. b is the Scott-encoding of a Bool and is therefore a polymorphic function, which is turned into a function with a dummy argument in UPLC, and so we must force b in order to actually use that function.
  3. We delay both t and f, and once the choice is made, we force the result.

This definition is very similar to the strictIfThenElse example above; the only difference really is the top-level delay; it’s as if we had written this in Haskell:

delay $ λ b t f -> force $ force b (delay t) (delay f)
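
That one-liner can be given Haskell types if we also model the erased booleans. The following is a rough model rather than actual UPLC: UBool, trueU, falseU and ifThenElseU are hypothetical names, and the type parameters only exist to keep GHC happy, since UPLC itself is untyped.

```haskell
type Delayed a = () -> a

delay :: a -> Delayed a
delay x = const x

force :: Delayed a -> a
force x = x ()

-- An erased Scott-encoded boolean: the type abstraction over r has
-- become a delay around the two-argument selector.
type UBool r = Delayed (r -> r -> r)

trueU :: UBool r
trueU = delay (\t _f -> t)

falseU :: UBool r
falseU = delay (\_t f -> f)

-- Model of the UPLC term  λ ~ b t f -> ( b! (λ ~ -> t) (λ ~ -> f) )!
ifThenElseU :: Delayed (UBool (Delayed a) -> a -> a -> a)
ifThenElseU = delay $ \b t f -> force $ force b (delay t) (delay f)
```

To use the function we first force the outer delay, for example force ifThenElseU trueU 1 2, which selects the first branch.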

Types with arguments: pairs

The Bool type discussed in the previous section had no arguments: it had kind Type. As a second example of Scott encodings, let’s consider a type that does have arguments: the type of pairs. This type has two type arguments (the two components), and hence has kind Type -> Type -> Type (also written * -> * -> *).

Let’s consider the compilation of the Haskell program

fst :: (a, b) -> a
fst (x, _y) = x

In PIR, this program looks like:

let* data (Tuple2 :: * -> * -> *) a b = Tuple2 a b
in λ {a} {b} (pair :: Tuple2 a b) ->
     Tuple2_match {a} {b} pair {a} (λ (x :: a) (y :: b) -> x)

Like before, the pattern match on the pair has been replaced by an explicit call to the pattern matching function. Unlike in the Bool case, this pattern matching function now has three type arguments: two for the type arguments of the pair (a and b), and one for the result type (here, a, because we are selecting the first component of the pair).

The PLC version of this program is

let* type Tuple2 :: * -> * -> * = Λ a b. ∀ r. (a -> b -> r) -> r
     !Tuple2       = λ {a} {b} (x :: a) (y :: b) {r} (f :: a -> b -> r) -> f x y
     !Tuple2_match = λ {a} {b} (x :: (Λ a b. ∀ r. (a -> b -> r) -> r) a b) -> x
in λ {a} {b} (pair :: Tuple2 a b) ->
     Tuple2_match {a} {b} pair {a} (λ (x :: a) (y :: b) -> x)

The most interesting thing here is that the type Tuple2 has become a type-level function (syntax Λ a. t), which computes the type of the pattern matching function given two arguments a and b. Note the kind difference between

Λ a b. ... :: * -> * -> *

and (for a given a and b)

∀ r. ... :: *

The rest of the code is reasonably straight-forward: the Tuple2 term is the constructor for pairs, Tuple2_match is the pattern matching function and is again the identity function (because data types are pattern matching functions when we use Scott encodings).

Finally, this program compiles down to the following UPLC:

λ ~ ~ pair -> pair! (λ x y -> x)

No new concepts here; it is perhaps somewhat surprising to see the double delay of this function, arising from the fact that it has two type arguments. Indeed, one might wonder if some of these calls to delay/force might not be avoided through a compiler optimization pass.
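
As with booleans, we can sketch a typed Haskell model of this untyped term. The names pairU and fstU are hypothetical, and pairU models a pair whose two type arguments have already been supplied, so only the delay for the result type r remains.

```haskell
type Delayed a = () -> a

delay :: a -> Delayed a
delay x = const x

force :: Delayed a -> a
force x = x ()

-- An erased Scott-encoded pair: one delay for the result type r,
-- wrapped around the pattern matching function.
pairU :: a -> b -> Delayed ((a -> b -> r) -> r)
pairU x y = delay (\f -> f x y)

-- Model of the UPLC term  λ ~ ~ pair -> pair! (λ x y -> x)
-- The two outer delays correspond to the type arguments a and b.
fstU :: Delayed (Delayed (Delayed ((a -> b -> a) -> a) -> a))
fstU = delay $ delay $ \pair -> force pair (\x _y -> x)
```

A caller has to force fstU twice before applying it, which is the runtime cost of the two erased type arguments.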

Type-level recursion

In Haskell. Type-level fixpoints

When we introduced Scott encoding, we saw that we can take something that we normally think of as a feature of a language (pattern matching) and turn it into something that we can work with in the language (functions). In this section we will see that we can do the same with another language feature: type-level recursion (we will delay a discussion of term level recursion until the next section). For a discussion of why one might do this in Haskell, see for example the paper by Rodriguez et al. on generic programming, or the paper by Bahr and Hvitved on compositional data types; why the Plutus compiler does it is discussed in the Plutus paper.

Consider the standard definition of lists:

data List a = Nil | Cons a (List a)

List takes a single type argument, and hence has kind

List :: * -> *

In order to identify recursion as its own concept, we can abstract out the recursive use of List, and give List an additional argument that will stand in for “recursive calls”:

data ListF f a = Nil | Cons a (f a)

Now f has the same kind that List had previously, which means that we have

ListF :: (* -> *) -> (* -> *)

What should we pass as the value for f? In a sense, we want to pass ListF itself, but if we do, then that ListF also wants an argument, ad infinitum:

ListF (ListF (ListF (ListF ..)))  a

This, of course, is precisely recursion. We can capture this using a type-level fixpoint operator:

newtype Fix (t :: (* -> *) -> (* -> *)) (a :: *) = WrapFix (t (Fix t) a)

The kind of the argument t to Fix is precisely the kind of ListF, so we can define

type List = Fix ListF

Let’s check if this makes sense:

   List a
== Fix ListF a
== ListF (Fix ListF) a
== ListF List a

In Haskell. Generalizing to arbitrary kinds: IFix

If we look closely at the arguments of Fix, we realize that it is more (kind) monomorphic than it needs to be, and there is an obvious generalization:

newtype IFix (t :: (k -> *) -> (k -> *)) (a :: k) = WrapIFix (t (IFix t) a)

The Plutus paper goes to some length to explain why this definition is sufficient to capture all datatypes (Section 3.1, Recursive types). We will not do that in this blog post, but we will see three different example use cases, starting with lists: notice that we get Fix back simply by instantiating k to *:

type List = IFix ListF

Let’s introduce two functions to wrap and unwrap IFix:

wrap :: t (IFix t) a -> IFix t a
wrap = WrapIFix

unwrap :: IFix t a -> t (IFix t) a
unwrap (WrapIFix t) = t

If we now write functions on lists, we must wrap and unwrap at the appropriate times:

nil :: List a
nil = wrap $ Nil

cons :: a -> List a -> List a
cons x xs = wrap $ Cons x xs

null :: List a -> Bool
null xs =
   case unwrap xs of
     Nil      -> True
     Cons _ _ -> False
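
Putting the fragments of this section together gives a module along the following lines. It is a sketch that assumes the functor is called ListF and the resulting list type List; toList is a hypothetical helper we add so that the result can be inspected. PolyKinds is needed for the kind variable k in IFix.

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}

import Data.Kind (Type)
import Prelude hiding (null)

newtype IFix (t :: (k -> Type) -> (k -> Type)) (a :: k) =
  WrapIFix (t (IFix t) a)

-- The list functor: f stands in for the recursive occurrences.
data ListF f a = Nil | Cons a (f a)

type List = IFix ListF

wrap :: t (IFix t) a -> IFix t a
wrap = WrapIFix

unwrap :: IFix t a -> t (IFix t) a
unwrap (WrapIFix t) = t

nil :: List a
nil = wrap Nil

cons :: a -> List a -> List a
cons x xs = wrap (Cons x xs)

null :: List a -> Bool
null xs =
  case unwrap xs of
    Nil      -> True
    Cons _ _ -> False

-- Hypothetical helper: convert back to a built-in list, unwrapping
-- one layer of IFix per element.
toList :: List a -> [a]
toList xs =
  case unwrap xs of
    Nil       -> []
    Cons y ys -> y : toList ys
```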

In Haskell. Combining with Scott encoding

Before we look at how these fixpoints are used in Plutus, let’s stay in Haskell a tiny bit longer, and first see how fixpoints can be combined with Scott encoding.

First, let’s see the Scott encoding of lists without the use of IFix:

newtype List a = List {
      listMatch :: forall r. r -> (a -> List a -> r) -> r
    }

The pattern matching function needs two cases: one for nil and one for cons; the latter then takes two arguments, the head and tail of the list. Some simple examples:

nil :: List a
nil = List $ λ n _c -> n

cons :: a -> List a -> List a
cons x xs = List $ λ _ c -> c x xs

null :: List a -> Bool
null xs = listMatch xs True (λ _ _ -> False)

To combine the Scott encoding with the use of explicit fixpoints, we need to go through the same process as before, and give List an additional argument that captures the recursive uses:

newtype ListF f a = List {
      listMatch :: forall r. r -> (a -> f a -> r) -> r
    }

type List = IFix ListF

The same three functions we defined above now become:

nil :: List a
nil = wrap $ List $ λ n _c -> n

cons :: a -> List a -> List a
cons x xs = wrap $ List $ λ _n c -> c x xs

null :: List a -> Bool
null xs = listMatch (unwrap xs) True (λ _ _ -> False)
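
For completeness, here is a self-contained sketch of the combined version. We assume the functor is named ListF with data constructor List, as in the definitions above, and we write plain parentheses instead of $ around the polymorphic lambdas to stay clear of impredicativity corner cases in GHC. The helper toList is hypothetical.

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE RankNTypes #-}

import Data.Kind (Type)
import Prelude hiding (null)

newtype IFix (t :: (k -> Type) -> (k -> Type)) (a :: k) =
  WrapIFix (t (IFix t) a)

wrap :: t (IFix t) a -> IFix t a
wrap = WrapIFix

unwrap :: IFix t a -> t (IFix t) a
unwrap (WrapIFix t) = t

-- Scott-encoded list functor: a pattern matching function with one
-- case for nil and one for cons; f stands in for the recursive uses.
newtype ListF f a = List
  { listMatch :: forall r. r -> (a -> f a -> r) -> r }

type List = IFix ListF

nil :: List a
nil = wrap (List (\n _c -> n))

cons :: a -> List a -> List a
cons x xs = wrap (List (\_n c -> c x xs))

null :: List a -> Bool
null xs = listMatch (unwrap xs) True (\_ _ -> False)

-- Hypothetical helper: the recursion lives in Haskell, not in the
-- encoding, so we unwrap and match once per element.
toList :: List a -> [a]
toList xs = listMatch (unwrap xs) [] (\y ys -> y : toList ys)
```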

Simple example: lists

We are now finally ready to look at how lists work in Plutus. Let’s start with the definition of nil. In PIR:

let rec data (List :: * -> *) a = Nil  | Cons a (List a)
in Nil

The only new PIR feature we see here is marking the definition of List as recursive: let-bindings in PIR are non-recursive unless otherwise indicated.

Let’s look at the PLC version now, which is considerably more involved; PLC has neither datatypes nor recursion, and instead uses Scott encodings and the IFix combinator[5]:

let* type ListF = Λ (f :: * -> *). Λ a. ∀ r. r -> (a -> (f a) -> r) -> r
     type List  = Λ a. ifix ListF a
     !Nil       = λ {a} ->
                    wrap {ListF} {a} (
                      λ {r} (n :: r) (c :: a -> List a -> r) -> n)
     !Cons      = λ {a} (x :: a) (xs :: List a) ->
                    wrap {ListF} {a} (
                      λ {r} (n :: r) (c :: a -> (List a) -> r) -> c x xs)
     !List_match = λ {a} (x :: List a) ->
                    unwrap x
in Nil

Hopefully this PLC code is no longer difficult to understand; ifix, wrap and unwrap are PLC primitives. To understand the untyped Plutus code, we merely need to know that wrap and unwrap are simply erased, so this whole program compiles down to

λ ~ ~ n c -> n

The double delay comes from the two type parameters: the type of the list elements a, and the type of the continuation r.

As another example, let’s look at null (check if a list is empty). First, in PIR:

let* data Bool = True  | False
     rec data (List :: * -> *) a = Nil  | Cons a (List a)
in λ {a} (xs :: List a) ->
     List_match {a} xs {∀ _. Bool}
       (λ {_} -> True)
       (λ (x :: a) (xs' :: List a) {_} -> False)

The pattern match on the list is replaced by a call to the pattern matching function, and the two arguments for the Nil and Cons cases are delayed, to avoid unnecessary computation; we have seen these patterns before (and, as before, here too the use of delay is unnecessarily conservative). In PLC:

let* -- definition of ListF, List, Nil, Cons, List_match as above
     -- definition of Bool, True, False as discussed in section on datatypes
in λ {a} (xs :: List a) ->
     List_match {a} xs {∀ _. Bool}
       (λ {_} -> True)
       (λ (x :: a) (xs :: List a) {_} -> False)

The main code looks identical, but the definitions have changed from the PIR version: List_match, True and False are now all using the Scott-encoding. In UPLC:

λ ~ xs -> ( xs! (λ ~ ~ t f -> t) (λ x xs ~ ~ t f -> f) )!

Let’s make sure we really understand this code:

  1. The top-level delay (λ ~ xs -> ..) is there because null is a polymorphic function (type argument a).
  2. We must force xs because xs is represented by the Scott-encoding of a list, which is therefore itself a polymorphic function (type argument r).
  3. We give two arguments to xs: the case for nil and the case for cons. We delay both of those arguments, and then force the result once the choice is made.
  4. Finally, True (i.e., (λ t f -> t) in Scott encoding) and False (i.e, (λ t f -> f)) themselves are polymorphic functions (they have their own r type argument), and we therefore have one more delay in both arguments to xs.
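
The four points above can be traced in a typed Haskell model of the untyped term. This is only an approximation under our own assumptions: UBool, UList, nilU, consU and nullU are hypothetical names, the result type r of the list is fixed by a type parameter rather than being erased, and nilU models Nil after it has already been applied to its element type, so it carries a single delay.

```haskell
type Delayed a = () -> a

delay :: a -> Delayed a
delay = const

force :: Delayed a -> a
force x = x ()

-- Erased Scott-encoded booleans: one delay for their own r argument.
type UBool x = Delayed (x -> x -> x)

trueU, falseU :: UBool x
trueU  = delay (\t _f -> t)
falseU = delay (\_t f -> f)

-- Erased Scott-encoded list at result type r: one delay for r.
newtype UList a r = UList (Delayed (r -> (a -> UList a r -> r) -> r))

nilU :: UList a r
nilU = UList (delay (\n _c -> n))

consU :: a -> UList a r -> UList a r
consU x xs = UList (delay (\_n c -> c x xs))

-- Model of  λ ~ xs -> ( xs! (λ ~ ~ t f -> t) (λ x xs ~ ~ t f -> f) )!
-- Each case is delayed once more on top of the boolean's own delay.
nullU :: Delayed (UList a (Delayed (UBool x)) -> UBool x)
nullU = delay $ \(UList xs) ->
  force (force xs (delay trueU) (\_x _xs -> delay falseU))
```

To read off the answer we force the top-level delay, apply the function, and then force the resulting boolean before handing it its two alternatives, as in force (force nullU nilU) True False.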

In Haskell. Types of kind *

Lists are probably the simplest example of IFix, because the kind of ListF

ListF :: (* -> *) -> (* -> *)

lines up very nicely with the kind of IFix:

IFix :: ∀ k. ((k -> *) -> (k -> *)) -> (k -> *)

simply by picking k = *. We’ll now discuss an example that doesn’t line up quite so nicely: natural numbers. Basic definition:

data Nat = Succ Nat | Zero

Abstracting out the recursion:

data NatF f = Succ f | Zero

Since NatF does not take any argument, its kind does not line up with what IFix wants:

NatF :: * -> *

What we will do instead is pick some type SimpleRec such that

IFix SimpleRec :: (* -> *) -> *

In other words, we will pick k = * -> *. This then allows us to define

type Nat = IFix SimpleRec NatF

Unlike in the list example, NatF is now the second argument to IFix, not the first; this is quite a different use of IFix.[6]

Here’s how we can define SimpleRec:

newtype SimpleRec (rec :: (* -> *) -> *) (f :: (* -> *)) = SimpleRec (f (rec f))

Let’s see if this makes sense:

   Nat
== IFix SimpleRec NatF
== SimpleRec (IFix SimpleRec) NatF
== NatF (IFix SimpleRec NatF)
== NatF Nat
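
GHC can check this kind-level bookkeeping for us. The following sketch repeats the definitions of IFix, SimpleRec and NatF from the text and adds three hypothetical helpers (zero, suc and toInt) to show that values of Nat need two layers of wrapping, one for IFix and one for SimpleRec.

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}

import Data.Kind (Type)

newtype IFix (t :: (k -> Type) -> (k -> Type)) (a :: k) =
  WrapIFix (t (IFix t) a)

-- Here k is instantiated to Type -> Type, the kind of NatF.
newtype SimpleRec (rec :: (Type -> Type) -> Type) (f :: Type -> Type) =
  SimpleRec (f (rec f))

data NatF f = Succ f | Zero

type Nat = IFix SimpleRec NatF

-- Hypothetical smart constructors: the payload under both wrappers
-- has type NatF Nat, matching the derivation above.
zero :: Nat
zero = WrapIFix (SimpleRec Zero)

suc :: Nat -> Nat
suc n = WrapIFix (SimpleRec (Succ n))

toInt :: Nat -> Int
toInt (WrapIFix (SimpleRec Zero))     = 0
toInt (WrapIFix (SimpleRec (Succ n))) = 1 + toInt n
```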

Example: natural numbers

Now that we have seen the theory in Haskell, the Plutus translation is not that difficult anymore. Let’s consider the compilation of

data Nat = Zero | Succ Nat

isZero :: Nat -> Bool
isZero Zero = True
isZero _    = False

to Plutus. First, PIR:

let*    data Bool = True | False
let rec data Nat  = Succ Nat | Zero
in λ (n :: Nat) ->
     Nat_match n {∀ _. Bool}
       (λ (n' :: Nat) {_} -> False)
       (λ {_} -> True)

Nothing new here. Let’s look at PLC:

let* type SimpleRec = Λ (rec :: (* -> *) -> *). Λ (f :: * -> *). f (rec f)
     type NatF      = Λ f. ∀ r. (f -> r) -> r -> r
     type Nat       = ifix SimpleRec NatF

     !Succ      = λ (n :: Nat) -> wrap {SimpleRec} {NatF} (
                    λ {r} (s :: Nat -> r) (z :: r) -> s n)
     !Zero      = wrap {SimpleRec} {NatF} (
                    λ {r} (s :: Nat -> r) (z :: r) -> z)
     !Nat_match = λ (x :: Nat) -> unwrap x

     -- definition of Bool, True, False as discussed in section on datatypes

in λ (n :: Nat) ->
     Nat_match n {∀ _. Bool}
       (λ (ipv :: Nat) {_} -> False)
       (λ {_} -> True)

Other than the use of SimpleRec that we discussed in the previous section7, this introduces no new concepts. The UPLC code is straightforward, because the wrapping/unwrapping simply disappears:

λ n -> ( n! (λ ipv ~ ~ t f -> f) (λ ~ ~ t f -> t) )!
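
For readers who prefer to experiment in Haskell, here is a rough sketch of the same Scott-encoded Nat and isZero. It is only an approximation of the PLC code: it relies on a recursive Haskell newtype where PLC uses ifix, it returns Haskell's built-in Bool rather than a Scott-encoded one, and it has no counterpart of the delay/force annotations. The names natMatch, succN and zeroN are made up for this sketch.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Scott-encoded natural number is its own case analysis:
-- given a handler for Succ and a result for Zero, it picks one.
newtype Nat = Nat { natMatch :: forall r. (Nat -> r) -> r -> r }

-- Succ first, Zero second, mirroring the constructor order in the PIR code
succN :: Nat -> Nat
succN n = Nat (\s _ -> s n)

zeroN :: Nat
zeroN = Nat (\_ z -> z)

-- Pattern matching is just application to the two branches
isZero :: Nat -> Bool
isZero n = natMatch n (\_ -> False) True
```

For instance, isZero zeroN selects the second branch, while isZero (succN zeroN) selects the first.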

Term-level recursion

In Haskell. Term-level fixpoints

So far we have been talking about recursive types; let’s now turn our attention to recursive terms. Let’s consider addition on natural numbers:

data Nat = Succ Nat | Zero

add :: Nat -> Nat -> Nat
add Zero     m = m
add (Succ n) m = Succ (add n m)

Just like we did with types, we can abstract out the recursive call:

addF :: (Nat -> Nat -> Nat) -> Nat -> Nat -> Nat
addF f Zero     m = m
addF f (Succ n) m = Succ (f n m)

And as with types, we really want to pass the function itself as its own argument:

addF (addF (addF (addF ..)))

We can do this by defining a term-level fixpoint combinator:

fix :: (a -> a) -> a
fix f = f (fix f)

add :: Nat -> Nat -> Nat
add = fix addF

Let’s verify:

   add
== fix addF
== addF (fix addF)
== addF add

In Haskell. Deriving term-level recursion from type-level recursion

In the previous section, we abstracted out the recursion from add, but we still used recursion in the definition of fix. Stunningly, it is possible to define the term-level fixpoint combinator without using recursion at all! This is the famous Y-combinator. As we saw above, at the heart of recursion is applying a function to itself, and this self-application is also at the heart of the Y-combinator; in pseudo-Haskell:

fix :: forall a. (a -> a) -> a
fix = λ f -> (λ s -> f (s s)) (λ s -> f (s s)) -- not real Haskell

Let’s verify:

   add
== fix addF
== (λ s -> addF (s s)) (λ s -> addF (s s))
== addF ((λ s -> addF (s s)) (λ s -> addF (s s)))
== addF (fix addF)
== addF add

The above definition is pseudo-Haskell, because it does not typecheck. Consider

selfApply :: ??
selfApply f = f f

What would be the type of selfApply? Since we are applying f to something, it must have type a -> b for some a and b; and since we are applying it to itself, we must then also have that a is the same as a -> b. If we tried this in ghc, it would complain that it cannot solve this problem:

• Occurs check: cannot construct the infinite type: a ~ a -> b
• In the first argument of ‘f’, namely ‘f’
  In the expression: f f
  In an equation for ‘selfApply’: selfApply f = f f

However, we can define a type Self b such that Self b is isomorphic to Self b -> b:

newtype Self b = WrapSelf { unwrapSelf :: Self b -> b }

This allows us to define self-application8:

selfApply :: Self b -> b
{-# NOINLINE selfApply #-}
selfApply f = (unwrapSelf f) f

and therefore the Y-combinator:

fix :: forall a. (a -> a) -> a
fix f = selfApply $ WrapSelf $ λ s -> f (selfApply s)

If we remove all the newtype wrapping and unwrapping, we can see that this is indeed the Y-combinator:

   fix f
== selfApply $ WrapSelf $ λ s -> f (selfApply s)
== selfApply $ λ s -> f (selfApply s)
== selfApply $ λ s -> f (s s)
== (λ s -> f (s s)) (λ s -> f (s s))
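
Putting the pieces together, the following sketch assembles the definitions above into one module and uses the recursion-free fix to define add. The helpers fromInt and toInt are hypothetical additions that use ordinary recursion; they are there only to build inputs and observe results, and play no part in the construction itself.

```haskell
newtype Self b = WrapSelf { unwrapSelf :: Self b -> b }

selfApply :: Self b -> b
{-# NOINLINE selfApply #-}
selfApply f = unwrapSelf f f

-- No term-level recursion here: fix does not mention itself
fix :: (a -> a) -> a
fix f = selfApply $ WrapSelf $ \s -> f (selfApply s)

data Nat = Succ Nat | Zero

addF :: (Nat -> Nat -> Nat) -> Nat -> Nat -> Nat
addF _ Zero     m = m
addF f (Succ n) m = Succ (f n m)

add :: Nat -> Nat -> Nat
add = fix addF

-- Hypothetical helpers for testing only (these do use plain recursion)
fromInt :: Int -> Nat
fromInt k = if k <= 0 then Zero else Succ (fromInt (k - 1))

toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n
```

This works in Haskell because of laziness: f (selfApply s) is only unfolded as far as addF demands it.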

In Haskell. Using IFix

We have now managed to define a term-level fixpoint combinator without using term-level recursion, but we still used type-level recursion in the definition of Self. The final step is to define Self using IFix instead. As always, we first abstract out the recursion from Self:

newtype SelfF (self :: * -> *) (a :: *) = SelfF {
      unSelfF :: self a -> a
    }

Fortunately, the kind of SelfF lines up nicely with the kind expected by IFix, so we can simply define

type Self = IFix SelfF

The definition of the fixpoint combinator remains essentially the same; we just need to now wrap/unwrap both the SelfF newtype and the IFix newtype.

fix :: forall a. (a -> a) -> a
fix f = let w = wrap $ SelfF $ λ s -> f (selfApply s) in selfApply w

However, since Plutus is a strict language, it only makes sense to take the fixpoint of functions (you can’t define an infinite list in Plutus, for example). The fixpoint in Plutus therefore is specialized to functions; in Haskell we could define this as:9

fixFun :: forall a b. ((a -> b) -> (a -> b)) -> (a -> b)
fixFun f = let w = wrap $ SelfF $ λ s x -> f (selfApply s) x in selfApply w
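
As a self-contained sketch of this final version, the module below defines Self through IFix and uses fixFun to write a small recursive function. The definition of IFix together with wrap and unwrap is an assumption that matches the kind of IFix used throughout, and sumTo is a made-up example.

```haskell
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}

import Data.Kind (Type)

-- Assumed definition of IFix with its wrap/unwrap pair
newtype IFix (f :: (k -> Type) -> (k -> Type)) (i :: k) = Wrap (f (IFix f) i)

wrap :: f (IFix f) i -> IFix f i
wrap = Wrap

unwrap :: IFix f i -> f (IFix f) i
unwrap (Wrap x) = x

newtype SelfF (self :: Type -> Type) (a :: Type) = SelfF {
      unSelfF :: self a -> a
    }

type Self = IFix SelfF

-- Self-application now has to go through both newtypes
selfApply :: Self a -> a
{-# NOINLINE selfApply #-}
selfApply s = unSelfF (unwrap s) s

fixFun :: forall a b. ((a -> b) -> (a -> b)) -> (a -> b)
fixFun f = let w = wrap $ SelfF $ \s x -> f (selfApply s) x in selfApply w

-- Hypothetical example: sum of 1..n, written without explicit recursion
sumTo :: Int -> Int
sumTo = fixFun (\rec n -> if n <= 0 then 0 else n + rec (n - 1))
```

The extra argument x in fixFun makes no difference in lazy Haskell, but in a strict language it is what stops selfApply s from being evaluated before the function is actually called.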

Example: addition

We are now ready to discuss our final and most involved Plutus program: addition. As a reminder, the Haskell code looked like

data Nat = Succ Nat | Zero

add :: Nat -> Nat -> Nat
add Zero     m = m
add (Succ n) m = Succ (add n m)

In PIR:

let rec data Nat = Succ Nat | Zero
    rec !add = λ (n :: Nat) (m :: Nat) ->
           Nat_match n {∀ _. Nat}
             (λ (n' :: Nat) {_} -> Succ (add n' m))
             (λ {_} -> m)
in add

To go from PIR to PLC, we need to remove the type-level recursion from Nat, remove the term-level recursion from add, and introduce Scott encoding:10

let* type SimpleRec = Λ (rec :: (* -> *) -> *). Λ (f :: * -> *). f (rec f)
     type NatF      = Λ f. ∀ r. (f -> r) -> r -> r
     type Nat       = ifix SimpleRec NatF

     !Succ      = λ (n :: Nat) ->
                   wrap {SimpleRec} {NatF} (
                     λ {r} (s :: Nat -> r) (z :: r) -> s n)
     !Zero      = wrap {SimpleRec} {NatF} (
                     λ {r} (s :: Nat -> r) (z :: r) -> z)