Planet Haskell

May 28, 2022

Mark Jason Dominus

“Llaves” and other vanishing consonants

Lately I asked:

Where did the ‘c’ go in llave (“key”)? It's from Latin clavīs

Several readers wrote in with additional examples, and I spent a little while scouring Wiktionary for more. I don't claim that this list is at all complete; I got bored partway through the Wiktionary search results.

Spanish   English                               Latin antecedent
llagar    to wound                              plāgāre
llama     flame                                 flamma
llamar    to summon, to call                    clāmāre
llano     flat, level                           plānus
llantén   plantain                              plantāgō
llave     key                                   clavis
llegar    to arrive, to get, to be sufficient   plicāre
lleno     full                                  plēnus
llevar    to take                               levāre
llorar    to cry out, to weep                   plōrāre
llover    to rain                               pluere

I had asked:

Is this the only Latin word that changed ‘cl’ → ‘ll’ as it turned into Spanish, or is there a whole family of them?

and the answer is no, not exactly. It appears that llave and llamar are the only two common examples. But there are many examples of the more general phenomenon that

(consonant) + ‘l’ → ‘ll’

including quite a few examples where the consonant is a ‘p’.

Spanish-related notes

  • Eric Roode directed me to this discussion of “Latin CL to Spanish LL” on the WordReference.com language forums. It also contains discussion of analogous transformations in Italian. For example, instead of plānus → llano, Italian has plānus → piano.

  • Alex Corcoles advises me that Fundéu often discusses this sort of issue on the Fundéu web site, and also responds to this sort of question on their Twitter account. Fundéu is the Foundation of Urgent Spanish (Fundación del Español Urgente), a collaboration with the Royal Spanish Academy, which controls the official Spanish language standard.

  • Several readers pointed out that although llave is the key that opens your door, the word for musical keys and for encryption keys is still clave. There is also a musical instrument called the claves, and an associated technical term for the rhythmic role they play. Clavícula (‘clavicle’) has also kept its ‘c’.

  • The connection between plicāre and llegar is not at all clear to me. Plicāre means “to fold”; English cognates include ‘complicated’, ‘complex’, ‘duplicate’, ‘two-ply’, and, farther back, ‘plait’. What this has to do with llegar (‘to arrive’) I do not understand. Wiktionary has a long explanation that I did not find convincing.

  • The levāre → llevar example is a little weird. Wiktionary says "The shift of an initial 'l' to 'll' is not normal".

  • Llaves also appears to be the Spanish name for the curly brace characters { and }. (The square brackets are corchetes.)

Not related to Spanish

  • The llover example is a favorite of the Universe of Discourse, because Latin pluere is the source of the English word plover.

  • French parler (‘to talk’) and its English descendants ‘parley’ and ‘parlor’ are from Latin parabola.

  • Latin plōrāre (‘to cry out’) is obviously the source of English ‘implore’ and ‘deplore’. But less obviously, it is the source of ‘explore’. The original meaning of ‘explore’ was to walk around a hunting ground, yelling to flush out the hidden game.

  • English ‘autoclave’ is also derived from clavis, but I do not know why.

  • Wiktionary's advanced search has options to order results by “relevance” and last-edited date, but not alphabetically!

Thanks

  • Thanks to readers Michael Lugo, Matt Hellige, Leonardo Herrera, Leah Neukirchen, Eric Roode, Brent Yorgey, and Alex Corcoles for hints, clues, and references.

[ Addendum: Andrew Rodland informs me that an autoclave is so-called because the steam pressure inside it forces the door lock closed, so that you can't scald yourself when you open it. ]

by Mark Dominus (mjd@plover.com) at May 28, 2022 03:21 PM

May 27, 2022

GHC Developer Blog

GHC 9.2.3 is now available

GHC 9.2.3 is now available

Zubin Duggal - 2022-05-27

The GHC developers are very happy to announce the availability of GHC 9.2.3. Binary distributions, source distributions, and documentation are available at downloads.haskell.org.

This release includes many bug-fixes and other improvements to 9.2.2, including:

  • a panic involving unbound cycle-breaker variables that prevented several libraries from compiling, such as massiv-io (#20231),
  • a typechecker regression in which GHC refused to use recursive equalities involving type families (#21473, #21515),
  • a number of packaging issues, particularly on Windows.

As some of the fixed issues do affect correctness, users are encouraged to upgrade promptly.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Tweag I/O, Serokell, Equinix, SimSpace, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling,

  • Zubin

by ghc-devs at May 27, 2022 12:00 AM

Lysxia's blog

Formalizing finite sets

Combinatorics studies mathematical structures by counting. Counting may seem like a benign activity, but the same rigor necessary to prevent double- or under-counting mistakes arguably underpins all of mathematics.1

Combining my two favorite topics, I’ve always wanted to mechanize combinatorics in Coq.2 An immediate challenge is to formalize the idea of “set”.3 We have to be able to define the set of things we want to count. It turns out that there are at least two ways of encoding sets in type theory: sets as types, and sets as predicates. They are suitable for defining different classes of operations: sums (disjoint union) are a natural operation on types, while unions and intersections are naturally defined on predicates.

The interplay between these two notions of sets, and finiteness, will then let us prove the standard formula for the cardinality of unions, aka. the binary inclusion-exclusion formula:

#|X ∪ Y| = #|X| + #|Y| - #|X ∩ Y|
Imports and options
From Coq Require Import ssreflect ssrbool.

Set Implicit Arguments.

Sets as types

The obvious starting point is to view a type as the set of its inhabitants.

How do we count its inhabitants? We will say that a set A has cardinality n if there is a bijection between A and the set {0 .. n-1} of natural numbers between 0 and n-1.

Bijections

A bijection is a standard way to represent a one-to-one correspondence between two sets, with a pair of inverse functions. We define the type bijection A B as a record containing the two functions and a proof of their inverse relationship.

Record is_bijection {A B} (to : A -> B) (from : B -> A) : Prop :=
  { from_to : forall a, from (to a) = a
  ; to_from : forall b, to (from b) = b }.

Record bijection (A B : Type) : Type :=
  { bij_to : A -> B
  ; bij_from : B -> A
  ; bij_is_bijection :> is_bijection bij_to bij_from }.

Infix "<-->" := bijection (at level 90) : type_scope.

We say that A and B are isomorphic when there exists a bijection between A and B. Isomorphism is an equivalence relation: reflexive, symmetric, transitive.4

Definition bijection_refl {A} : A <--> A.
Admitted. (* Easy exercise *)

Definition bijection_sym {A B} : (A <--> B) -> (B <--> A).
Admitted. (* Easy exercise *)

Definition bijection_trans {A B C} : (A <--> B) -> (B <--> C) -> (A <--> C).
Admitted. (* Easy exercise *)

Infix ">>>" := bijection_trans (at level 40).
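
For a concrete taste of what these exercises involve, here is one possible solution to the first one (a sketch, named with a prime so it does not clash with the Admitted statement above). The identity function is its own inverse, so both record fields are proved by reflexivity:

Definition bijection_refl' {A} : A <--> A :=
  {| bij_to := fun a => a
  ;  bij_from := fun a => a
  ;  bij_is_bijection := {| from_to := fun _ => eq_refl
                         ;  to_from := fun _ => eq_refl |} |}.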

Finite sets

Our “bijective” definition of cardinality shall rely on a primitive, canonical family of finite types {0 .. n-1} that is taken for granted. We can define them as the following sigma type, using the familiar set comprehension notation, also known as ordinal in math-comp:

Definition fin (n : nat) : Type := { p | p < n }.

An inhabitant of fin n is a pair of a p : nat and a proof object of p < n. Such proof objects are unique for a given p and n, so the first component uniquely determines the second component, and fin n does have exactly n inhabitants.5

Finiteness

We can now say that a type A has cardinality n if there is a bijection between A and fin n, i.e., there is an inhabitant of A <--> fin n. Note that this only defines finite cardinalities, which is fine for doing finite combinatorics. Infinity is really weird so let’s not think about it.

As a sanity check, you can verify the cardinalities of the usual suspects, bool, unit, and Empty_set.

Definition bijection_bool : bool <--> fin 2.
Admitted. (* Easy exercise *)

Definition bijection_unit : unit <--> fin 1.
Admitted. (* Easy exercise *)

Definition bijection_Empty_set : Empty_set <--> fin 0.
Admitted. (* Easy exercise *)

A type A is finite when it has some cardinality n : nat. When speaking informally, it’s common to view finiteness as a property, a thing that a set either is or is not. To prove finiteness is merely to exhibit the relevant data: a number to be the cardinality, and an associated bijection (which we call an enumeration of A, enum for short). Hence we formalize “finiteness” as the type of that data.

Record is_finite (A : Type) : Type :=
  { card : nat
  ; enum : A <--> fin card }.

Further bundling is_finite A proofs with their associated set A, we obtain a concept aptly named “finite type”.6 A finite type is a type A paired with a proof of is_finite A.

Record finite_type : Type :=
  { ft_type :> Type
  ; ft_is_finite :> is_finite ft_type }.

We leverage coercions (indicated by :>) to lighten the notation of expressions involving finite_type.

The first coercion ft_type lets us use a finite_type as a Type. So if E : finite_type, we can write the judgement that “e is an element of E” as e : E, which implicitly expands to the more cumbersome e : ft_type E.

Similarly, the second coercion ft_is_finite lets us access the evidence of finiteness without naming that field. In particular, we can write the cardinality of E : finite_type as card E, as if card were a proper field of E rather than the nested record it actually belongs to. This is a convenient mechanism for overloading, letting us reuse the name card(inality) even though records technically cannot have fields with the same name. With that, we define #|A| as sugar for card A:

Notation "'#|' A '|'" := (card A).
Some notation boilerplate
Declare Scope fintype_scope.
Delimit Scope fintype_scope with fintype.
Bind Scope fintype_scope with finite_type.

Uniqueness of cardinality

The phrase “cardinality of a set” suggests that cardinality is an inherent property of sets. But now we’ve defined “finite type” essentially as a tuple where the cardinality is just one component. What’s to prevent us from putting a different number there, for the same underlying type?

We can prove that this cannot happen. Cardinality is unique: any two finiteness proofs for the same type must yield the same cardinality.

(The proof is a little tedious and technical.)

Theorem card_unique {A} (F1 F2 : is_finite A) : card F1 = card F2.
Admitted. (* Intermediate exercise *)

A slightly more general result is that isomorphic types (i.e., related by a bijection) have the same cardinality. It can first be proved in terms of is_finite, from which a corollary in terms of finite_type follows.

Theorem card_bijection {A B} (FA : is_finite A) (FB : is_finite B)
  : (A <--> B) -> card FA = card FB.
Admitted. (* Like card_unique *)

Theorem card_bijection_finite_type {A B : finite_type}
  : (A <--> B) -> #|A| = #|B|.
Proof.
  apply card_bijection.
Qed.

The converse is also true and useful: two types with the same cardinality are isomorphic.

Theorem bijection_card {A B} (FA : is_finite A) (FB : is_finite B)
  : card FA = card FB -> (A <--> B).
Admitted. (* Easy exercise *)

Theorem bijection_card_finite_type {A B : finite_type}
  : #|A| = #|B| -> (A <--> B).
Proof.
  apply bijection_card.
Qed.

Operations on finite sets

Sum

The sum of sets is also known as the disjoint union.

Inductive sum (A B : Type) : Type :=
| inl : A -> A + B
| inr : B -> A + B
where "A + B" := (sum A B) : type_scope.

sum is a binary operation on types. We must work to make it an operation on finite types.

There is a bijection between fin n + fin m (sum of sets) and fin (n + m) (sum of nats).

Definition bijection_sum_fin {n m} : fin n + fin m <--> fin (n + m).
Admitted. (* Intermediate exercise *)

The sum is a bifunctor.

Definition bijection_sum {A A' B B'}
  : (A <--> B) -> (A' <--> B') -> (A + A' <--> B + B').
Admitted. (* Easy exercise *)

Combining those facts, we can prove that the sum of two finite sets is finite (finite_sum), and the cardinality of the sum is the sum of the cardinalities (card_sum).

Definition is_finite_sum {A B} (FA : is_finite A) (FB : is_finite B)
  : is_finite (A + B) :=
  {| card := #|FA| + #|FB|
  ;  enum := bijection_sum (enum FA) (enum FB) >>> bijection_sum_fin |}.

Definition finite_sum (A B : finite_type) : finite_type :=
  {| ft_type := A + B ; ft_is_finite := is_finite_sum A B |}.

Infix "+" := finite_sum : fintype_scope.
Theorem card_sum {A B : finite_type} : #|(A + B)%fintype| = #|A| + #|B|.
Proof.
  reflexivity.
Qed.

Product

The cartesian product has structure dual to the sum.

Inductive prod (A B : Type) : Type :=
| pair : A -> B -> A * B
where "A * B" := (prod A B) : type_scope.
  • There is a bijection fin n * fin m <--> fin (n * m).
  • The product is a bifunctor.
  • The product of finite sets is finite.
  • The cardinality of the product is the product of the cardinalities.
Coq code
Definition bijection_prod_fin {n m} : fin n * fin m <--> fin (n * m).
Admitted. (* Intermediate exercise *)

Definition bijection_prod {A A' B B'}
  : (A <--> B) -> (A' <--> B') -> (A * A' <--> B * B').
Admitted. (* Easy exercise *)

Definition is_finite_prod {A B} (FA : is_finite A) (FB : is_finite B)
  : is_finite (A * B) :=
  {| card := #|FA| * #|FB|
  ;  enum := bijection_prod (enum FA) (enum FB) >>> bijection_prod_fin |}.

Definition finite_prod (A B : finite_type) : finite_type :=
  {| ft_type := A * B ; ft_is_finite := is_finite_prod A B |}.

Infix "*" := finite_prod : fintype_scope.

Theorem card_prod {A B : finite_type} : #|(A * B)%fintype| = #|A| * #|B|.
Proof.
  reflexivity.
Qed.

Sets as predicates

Two other common operations on sets are union and intersection. However, those operations don’t fit in the view of sets as types. While set membership x ∈ X is a proposition, type inhabitation x : X is a judgement, which is a completely different thing,7 so we need a different approach.

The idea of set membership x ∈ X as a proposition presumes that x and X are entities that exist independently of each other. This suggests that there is some “universe” that elements x live in, and the sets X under consideration are subsets of that same universe. We represent the universe by a type A, and sets (i.e., “subsets of the universe”) by predicates on A.

Definition set_of (A : Type) := (A -> bool).

Hence, if x : A is an element of the universe, and X : set_of A is a set (subset of the universe), we will denote set membership x ∈ X simply as X x (x satisfies the predicate X).

Those predicates are boolean, i.e., decidable. This is necessary in several constructions and proofs here, notably to prove that the union or intersection of finite sets is finite. We rely on a coercion to implicitly convert booleans to Prop: is_true : bool >-> Prop, which is exported by ssreflect.

Union, intersection, complement

Those common set operations correspond to the usual logical connectives.

Section Operations.

Context {A : Type}.

Definition union' (X Y : set_of A) : set_of A := fun a => X a || Y a.
Definition intersection' (X Y : set_of A) : set_of A := fun a => X a && Y a.
Definition complement' (X : set_of A) : set_of A := fun a => negb (X a).

End Operations.

Define the familiar infix notation for union and intersection.

Declare Scope set_of_scope.
Delimit Scope set_of_scope with set_of.
Bind Scope set_of_scope with set_of.

Infix "∪" := union' (at level 40) : set_of_scope.
Infix "∩" := intersection' (at level 40) : set_of_scope.

Finiteness

Again, we will characterize finite sets using bijections to fin n. We first transform the set X into a type to_type X, so we can form the type of bijections to_type X <--> fin n. Like fin, we define to_type X as a sigma type. Thanks to the predicate X being boolean, there is at most one proof p : X a for each a, so the type { a : A | X a } has exactly one inhabitant for each inhabitant a : A satisfying X a.

Definition to_type {A : Type} (X : set_of A) : Type := { a : A | X a }.

Coercion to_type : set_of >-> Sortclass.

We obtain a notion of finite set by imitating the structure of finite_type. The set-as-predicate X is finite if the set-as-type to_type X is finite.

Record finite_set_of (A : Type) : Type :=
  { elem_of :> set_of A
  ; fso_is_finite :> is_finite (to_type elem_of)
  }.

Similarly, a finite_set_of can be coerced to a finite_type.

Definition to_finite_type {A} (X : finite_set_of A) : finite_type :=
  {| ft_type := elem_of X
  ;  ft_is_finite := X |}.

Coercion to_finite_type : finite_set_of >-> finite_type.

Finite unions and intersections

We then prove that the union and intersection of finite sets are finite. This is actually fairly challenging, since proving finiteness means calculating the cardinality of the set and constructing the associated bijection. Unlike sum and product, there is no simple formula for the cardinality of union and intersection. One apparent candidate is the binary inclusion-exclusion formula:

#|X ∪ Y| = #|X| + #|Y| - #|X ∩ Y|

But that only gives the cardinality of the union in terms of the intersection, or vice versa, and we don’t know either yet.

Rather than constructing the bijections directly, we decompose the proof. The intuition is that X ∪ Y and X ∩ Y can easily be “bounded” by known finite sets, namely X + Y and X respectively. By “bounded”, we mean that there is an injection from one set to the other.

The standard definition of injectivity is via an implication f x = f y -> x = y. However, a better definition for our purposes comes from a one-sided inverse property: a function f : A -> B is a section if there exists another function g : B -> A (called a retraction) such that g (f x) = x. Every section is an injection, but the converse requires the law of excluded middle.

Record is_section {A B} (to : A -> B) (from : B -> A) : Prop :=
  { s_from_to : forall a, from (to a) = a }.

Record section (A B : Type) : Type :=
  { s_from : A -> B
  ; s_to : B -> A
  ; s_is_section : is_section s_from s_to }.

The point is that, given a section to a finite set, section A (fin n), we can construct a bijection A <--> fin m for some m no greater than n. We formalize this result with a proof-relevant sigma type.

Definition section_bijection (A : Type) (n : nat)
  : section A (fin n) -> { m & A <--> fin m }.
Admitted. (* Hard exercise *)

This construction is rather involved. It is much more general than what we need for union and intersection specifically, but it is also easier to come up with, as it abstracts away the distracting details of those operations.

Now there is a section from X ∪ Y to X + Y, and from X ∩ Y to X.

Definition section_union {A} (X Y : set_of A)
  : section (X ∪ Y)%set_of (X + Y).
Admitted. (* Easy exercise *)

Definition section_intersection {A} (X Y : set_of A)
  : section (X ∩ Y)%set_of X.
Admitted. (* Easy exercise *)

We can then rely on the finiteness of X and X + Y to extend those sections to fin n for some n via the following theorem:

Theorem section_extend (A B C : Type)
  : section A B -> (B <--> C) -> section A C.
Admitted. (* Easy exercise *)

Definition section_union' {A} (X Y : finite_set_of A)
  : section (X ∪ Y)%set_of (fin (#|X| + #|Y|)).
Proof.
  eapply section_extend.
  - apply section_union.
  - apply is_finite_sum.
Qed.

Definition section_intersection' {A} (X Y : finite_set_of A)
  : section (X ∩ Y)%set_of (fin #|X|).
Proof.
  eapply section_extend.
  - apply section_intersection.
  - apply enum.
Qed.

Finally, by section_bijection, we obtain finiteness proofs of union' and intersection', which let us define union and intersection properly as operations on finite sets.

Theorem is_finite_union {A} {X Y : set_of A}
    (FX : is_finite X) (FY : is_finite Y)
  : is_finite (X ∪ Y)%set_of.
Proof.
  refine {| enum := projT2 (section_bijection _) |}.
  eapply (section_extend (B := X + Y)%type).
  - apply section_union.
  - apply (is_finite_sum FX FY).
Qed.

Theorem is_finite_intersection {A} {X Y : set_of A}
    (FX : is_finite X) (FY : is_finite Y)
  : is_finite (X ∩ Y)%set_of.
Proof.
  refine {| enum := projT2 (section_bijection _) |}.
  eapply section_extend.
  - apply section_intersection.
  - apply (enum FX).
Qed.

Definition union {A} (X Y : finite_set_of A) : finite_set_of A :=
  {| fso_is_finite := is_finite_union X Y |}.

Definition intersection {A} (X Y : finite_set_of A) : finite_set_of A :=
  {| fso_is_finite := is_finite_intersection X Y |}.
Declare Scope fso_scope.
Delimit Scope fso_scope with fso.
Bind Scope fso_scope with finite_set_of.

Infix "∪" := union (at level 40) : fso_scope.
Infix "∩" := intersection (at level 40) : fso_scope.

Hereafter, ∪ and ∩ will denote finite unions and intersections.

#[local] Open Scope fso_scope.

Inclusion-exclusion

#|X ∪ Y| = #|X| + #|Y| - #|X ∩ Y|

To prove that formula, it’s probably not a good idea to look at how ∪ and ∩ compute their cardinalities. A better idea is to construct a bijection, which implies an equality of cardinalities by card_bijection.

To start, subtraction on natural numbers is troublesome (it truncates at zero), so we rewrite the goal without it:

#|X ∪ Y| + #|X ∩ Y| = #|X| + #|Y|

Now we look for a bijection (X ∪ Y) + (X ∩ Y) <--> X + Y. It gets a bit tricky because of the dependent types.

Definition inclusion_exclusion_bijection {A} (X Y : finite_set_of A)
  : (X ∪ Y)%set_of + (X ∩ Y)%set_of <--> X + Y.
Admitted. (* Hard exercise *)

Isomorphic sets have the same cardinality (by theorem card_bijection_finite_type). The resulting equation simplifies to the binary inclusion-exclusion identity, because #|A + B| equals #|A| + #|B| definitionally. So the proof consists simply of applying that theorem with the above bijection.

Theorem inclusion_exclusion {A} (X Y : finite_set_of A)
  : #|X ∪ Y| + #|X ∩ Y| = #|X| + #|Y|.
Proof.
  apply (@card_bijection_finite_type ((X ∪ Y) + (X ∩ Y)) (X + Y)).
  apply inclusion_exclusion_bijection.
Qed.

Conclusion

To formalize mathematics, it’s often useful to revisit our preconceptions about fundamental concepts. To carry out even basic combinatorics in type theory, it’s useful to distinguish two views of the naive notion of set.

For example, when we say “union”, we really mean one of two things depending on the context. Either the sets are obviously disjoint, so we really mean “sum”: this corresponds to viewing sets as types. Or we implicitly know that the two sets contain the same “type” of elements a priori, so the overlap is something we have to worry about explicitly: this corresponds to viewing sets as predicates on a given universe.


  1. Ironically, when making restaurant reservations, I still occasionally forget to count myself.↩︎

  2. The code from this post is part of this project I’ve started here. Also check out Brent Yorgey’s thesis: Combinatorial Species and Labelled Structures (2014).↩︎

  3. Speaking of sets, it’s important to distinguish naive set theory from axiomatic set theory. Naive set theory is arguably what most people think of when they hear “set”. It is a semi-formal system for organizing mathematics: there are sets, they have elements, and there are various operations to construct and analyze sets, but overall we don’t think too hard about what sets are (hence, “semi-formal”). When this blog post talks about sets, it is in the context of naive set theory. Axiomatic set theory is formal, with rules that are clear enough to be encoded in a computer. The name “axiomatic set theory” is a stroke of marketing genius, establishing it as the “standard” way of formalizing naive set theory, and thus, all of mathematics, as can be seen in most introductory courses on formal logic. Historically, Zermelo’s set theory was formulated at around the same time as Russell’s type theory, and type theory is at the root of currently very active areas of programming language theory and formal methods.↩︎

  4. Bijections actually form a groupoid (a “proof-relevant equivalence relation”).↩︎

  5. We could also have defined fin as the inductive type of bounded naturals, which is named Fin.t in the standard library. Anecdotal experience suggests that the sigma type is more beginner-friendly. But past a certain level of familiarity, I think they are completely interchangeable.

    Inductive fin' : nat -> Type :=
    | F0 : forall n, fin' (S n)
    | FS : forall n, fin' n -> fin' (S n).

    The definition of fin as a sigma type relies on details of the definition of the order relation _ < _. Other definitions may allow the proposition p < n to be inhabited by multiple proof objects, causing fin n to have “more” than n inhabitants unless they are collapsed by proof irrelevance.↩︎

  6. math-comp has a different but equivalent definition of fintype.↩︎

  7. … if you know what those words mean.↩︎

by Lysxia at May 27, 2022 12:00 AM

May 26, 2022

Monday Morning Haskell

Sizing Up our Files

Earlier this week we went over some basic mechanics with regard to binary files. This week we'll look at a couple functions for dealing with file size. These are perhaps a bit more useful with binary files, but they also work with normal files, as we'll see.

The two functions are very simple. We can get the file size, and we can set the file size:

hFileSize :: Handle -> IO Integer

hSetFileSize :: Handle -> Integer -> IO ()

Getting the file size does exactly what you would expect. It gives us an integer for the number of bytes in the file. We can use this on our bitmap from last time, but also on a normal text file with the lines "First Line" through "Fourth Line".

main :: IO ()
main = do
  h1 <- openFile "pic_1.bmp" ReadMode
  h2 <- openFile "testfile.txt" ReadMode
  hFileSize h1 >>= print
  hFileSize h2 >>= print

...

822
46

Note, however, that we cannot get the file size of terminal handles, since these aren't, of course, files. One might hope that it would return the number of bytes written to standard out so far, or the number of bytes read from stdin before end-of-file. But it throws an error instead:

main :: IO ()
main = do
  hFileSize stdin >>= print
  hFileSize stdout >>= print

...

<stdin>: hFileSize: inappropriate type (not a regular file)

Now setting the file size is also possible, but it's a tricky and limited operation. First of all, it will not work on a handle in ReadMode:

main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadMode
  hSetFileSize h 34

...

testfile.txt: hSetFileSize: invalid argument (Invalid argument)

In ReadWriteMode however, this operation will succeed. By truncating from 46 to 34, we remove the final line "Fourth Line" from the file (don't forget the newline character!).

main :: IO ()
main = do
  h <- openFile "testfile.txt" ReadWriteMode
  hSetFileSize h 34

... (File content)

First Line
Second Line
Third Line

Setting the file size also works with WriteMode. Remember that opening a file in write mode will erase its existing contents. But we can start writing new contents to the file and then truncate later.

main :: IO ()
main = do
  h <- openFile "testfile.txt" WriteMode
  hPutStrLn h "First Line"
  hPutStrLn h "Second Line"
  hPutStrLn h "Third Line"
  hPutStrLn h "Fourth Line"
  hSetFileSize h 34

... (File content)

First Line
Second Line
Third Line

And, as you can probably tell by now, hSetFileSize only truncates from the end of files. It can't remove content from the beginning. So with our binary file example, we could drop 48 bytes to remove one of the "lines" of the picture, but we can't use this function to remove the 54-byte header:

main :: IO ()
main = do
  h <- openFile "pic_1.bmp" ReadWriteMode
  hSetFileSize h 774

Finally, hSetFileSize can also be used to add space to a file. Of course, the space it adds will all be null characters (byte = 0). But this can still be useful in certain circumstances.

import qualified Data.ByteString as B
import Data.List.Split (chunksOf)

main :: IO ()
main = do
  h <- openFile "pic_1.bmp" ReadWriteMode
  hSetFileSize h 870
  inputBytes <- B.unpack <$> B.hGetContents h
  let lines = chunksOf 48 (drop 54 inputBytes)
  print (last lines)

...

[0,0,0,...]
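
Putting these two functions together, here's a small helper (a sketch; truncateBy is a made-up name, not a library function) that shrinks a file by a given number of bytes, guarding against truncating past the start:

import System.IO

-- Shrink the file at 'path' by 'n' bytes from the end.
truncateBy :: FilePath -> Integer -> IO ()
truncateBy path n = withFile path ReadWriteMode $ \h -> do
  size <- hFileSize h
  hSetFileSize h (max 0 (size - n))

With this, dropping the last "line" of our bitmap is just truncateBy "pic_1.bmp" 48.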

These aren't the most common operations, but perhaps you'll find a use for them! We're almost done with our look at more obscure IO actions. If you've missed some of these articles and want a summary of this month's new material, make sure to subscribe to our monthly newsletter! You'll also get a sneak peek at what's coming next!

by James Bowen at May 26, 2022 02:30 PM

Mark Jason Dominus

Quick Spanish etymology question

Where did the ‘c’ go in llave (“key”)? It's from Latin clavīs, like in “clavicle”, “clavichord”, “clavier” and “clef”.

Is this the only Latin word that changed ‘cl’ → ‘ll’ as it turned into Spanish, or is there a whole family of them?

[ Addendum 20220528: There are more examples. ]

by Mark Dominus (mjd@plover.com) at May 26, 2022 12:43 PM

Tweag I/O

Reproducible probabilistic programming environments

The development of user-friendly and powerful probabilistic programming libraries (PPLs) has been making Bayesian data analysis easily accessible: PPLs allow a statistician to define a statistical model for their data programmatically in either domain-specific or common popular programming languages and provide powerful inference algorithms to sample posterior and other probability distributions. Under the hood, many PPLs make use of big libraries such as TensorFlow, JAX or Theano / Aesara (a fork of Theano) that provide fast, vectorized array and matrix computations and, for gradient-based inference algorithms, automatic differentiation. In practice, if a user wants to use a PPL, they have to make sure these dependencies (and their dependencies!) are installed, too, which often can be difficult and/or irreproducible: want to compile your compute graph to C code? Then you need a C compiler. But what if you can’t just install the required C compiler version, because there’s already a different, incompatible version on your system? Or you want to run your work code on your private laptop, whose Python version is too low. And so on and so on…

We recently packaged and fixed a couple of PPLs and their related dependencies for use with the Nix package manager. In this blog post, we will showcase Nix and how it gives you easy access to a fully reproducible and isolated development environment in which PPLs and all their dependencies are pre-installed.

A really brief introduction to Nix

Nix assumes a functional approach to package management: building a software component is regarded as a pure, deterministic function that has as inputs the component’s source code, a list of dependencies, possibly a compiler, build instructions, etc. — in short, everything you need to build a software component. This concept is implemented very strictly in Nix. For example, a Python application packaged in Nix has not only its immediate Python dependencies exactly declared (with specific versions and all their dependencies), but also the Python version it is supposed to work with and any system dependencies (think BLAS, OpenCV, C compiler, glibc, …).

Nix and its ecosystem form an extremely large and active open source project, as indicated by the Nix package collection containing build instructions for over 80,000 packages. All these packages can be made available to developers in completely reproducible shells that are guaranteed to provide exactly the same software environment on your laptop, your AWS virtual machine or your continuous integration (CI) runner. A limitation of Nix is that it runs only on Linux-like systems and that it requires sudo privileges for installation. You can also use Nix in Windows Subsystem for Linux (WSL) and on MacOS, but support for the latter is not as good as for standard Linux systems. After you have installed Nix, you can get a shell in which, for example, Python 3.9, numpy, and OpenCV are available simply by typing:

$ nix-shell -p python39Packages.numpy opencv

You can now check where this software is stored:

$ python -c "import numpy; print(numpy.__file__)"
/nix/store/bhs02rwyhgsdwriw9f1amkx9020zpir5-python3.9-numpy-1.21.2/lib/python3.9/site-packages/numpy/__init__.py
$ which opencv_version
/nix/store/hyni6hs71dphfy2s5yk8w1h3gzh90a44-opencv-4.5.4/bin/opencv_version

You see that all software provided by Nix is stored in a single directory (the “Nix store”) and is identified by a unique hash. This feature allows you to have several versions of the same software installed side-by-side, without the risk of collision and without any modification to the state of your system: want Python 3.8 and Python 3.9 available in the same shell? Sure, just punch in a nix-shell -p python38 python39!

Example: reproducible development Nix shell with PyMC3

Now that we hopefully made you curious about Nix, let’s finally mix Nix and probabilistic programming. As mentioned in the introduction, we made a couple of probabilistic programming-related libraries available in the Nix package collection. More specifically, we fixed and updated the packaging of PyMC3 and TensorFlow Probability and newly packaged the Theano fork Aesara, which is a crucial dependency for the upcoming PyMC 4 release.

To get a Nix shell that makes, for example, PyMC3 available, you could just run nix-shell -p python39Packages.pymc3, but if you want to add more packages and later recreate this environment, you want to have some permanent definition of your Nix shell. To achieve this, you can write a small file shell.nix in the functional Nix language declaring these dependencies. The shell.nix file essentially describes a function that takes a certain Nix package collection as an input and returns a Nix development shell. An example could look as follows:

let
  # use a specific (although arbitrarily chosen) version of the Nix package collection
  default_pkgs = fetchTarball {
    url = "http://github.com/NixOS/nixpkgs/archive/nixpkgs-unstable.tar.gz";
    # the sha256 makes sure that the downloaded archive really is what it was when this
    # file was written
    sha256 = "0x5j9q1vi00c6kavnjlrwl3yy1xs60c34pkygm49dld2sgws7n0a";
  };
# function header: we take one argument "pkgs" with a default defined above
in { pkgs ? import default_pkgs { } }:
with pkgs;
let
  # we create a Python bundle containing Python 3.9 and a few packages
  pythonBundle =
    python39.withPackages (ps: with ps; [ pymc3 matplotlib numpy ipython ]);
# this is what the function returns: the result of a mkShell call with a buildInputs
# argument that specifies all software to be made available in the shell
in mkShell { buildInputs = [ pythonBundle ]; }

You can then enter the Nix shell defined by this file by simply typing nix-shell in the file’s directory. When doing that for the first time, Nix will download all necessary dependencies from a cache server and perhaps rebuild some dependencies from scratch, which might take a minute. But once all dependencies are built or downloaded, they will be cached in the /nix/store/ directory and available instantaneously for later nix-shell calls.

Now let’s see what we can do with our Nix shell! We first check where Python and PyMC3 are from and then run a small Python script that does a Bayesian linear regression on made-up sample data:

$ nix-shell
[... lots of output about downloading dependencies]
[nix-shell:~/some/dir:]$ which python
/nix/store/rhc1yh5dvhll2db9n8qywpg6ysdv6yif-python3-3.9.10-env/bin/python
# *Not* your system Python, but a Python from your Nix store
[nix-shell:~/some/dir:]$ python -c "import pymc3; print(pymc3.__file__)"
/nix/store/rhc1yh5dvhll2db9n8qywpg6ysdv6yif-python3-3.9.10-env/lib/python3.9/site-packages/pymc3/__init__.py
# PyMC3 is in your Nix store, too. The state of your system Python installation is unchanged
[nix-shell:~/some/dir:]$ python pymc3_linear_regression.py
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [noise, intercept, slope]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 5 seconds.

[plot of the linear regression omitted]

[nix-shell:~/some/dir:]$ exit
# now you're back to your normal shell...
$ which python
/usr/bin/python
# and back to your system Python

As you can see, you entered an isolated development shell that provides the dependencies you specified and that allows you to run PyMC3 without changing the state of your system Python installation. And if you now run the same sequence of commands on a different machine with Nix installed, it will work just the same way as above! Just put the shell.nix file in the same VCS repository as your code and voilà: you’re sharing not only your code, but also the software environment it was developed in.

This is not specific to PyMC3 at all: a reproducible and isolated software environment containing TensorFlow Probability or Aesara can be defined and used similarly; just replace pymc3 by tensorflow-probability or aesara in your Python bundle.
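
For example, switching the shell to Aesara is a one-line change to the Python bundle in the shell.nix above (a sketch, keeping the other packages unchanged):

  pythonBundle =
    python39.withPackages (ps: with ps; [ aesara matplotlib numpy ipython ]);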

On the face of it, that all might seem not so different from a Python virtual environment. But we saw the crucial difference above: Python virtual environments manage only Python dependencies, but no dependencies further down the “dependency tree”. Nix, on the other hand, thus behaves rather like Conda or, although it’s quite a stretch, like a Docker image: it provides system dependencies, too. A detailed comparison of these alternatives to Nix is beyond the scope of this post, though.

Conclusion

In this blog post, we gave a short introduction to the Nix package manager and its development shell feature and demonstrated how to use it to obtain a reproducible software environment that contains PyMC3 and a few other Python dependencies. We also showed that these software environments don’t mess with your system state and thus allow you to fearlessly experiment and try out new software without breaking anything. In this regard, Nix provides an alternative to Docker or Conda, but it can do much, much more — in fact, there is even a whole Linux distribution (NixOS) that is based on the Nix package manager!

You can find the shell.nix file and the PyMC3 example script in Tweag’s blog post resources repository. If you would like to learn more about Nix, visit the Nix website for more resources, browse the Nix Discourse or pop in to #nix:nixos.org on Matrix or #nixos on the Libera.Chat IRC network.

May 26, 2022 12:00 AM

May 24, 2022

GHC Developer Blog

GHC 9.4.1-alpha2 released

GHC 9.4.1-alpha2 released

bgamari - 2022-05-24

The GHC developers are happy to announce the availability of the second alpha release of the GHC 9.4 series. Binary distributions, source distributions, and documentation are available at downloads.haskell.org.

This major release will include:

  • A new profiling mode, -fprof-late, which adds automatic cost-center annotations to all top-level functions after Core optimisation has run. This incurs significantly less performance cost while still providing informative profiles.

  • A variety of plugin improvements including the introduction of a new plugin type, defaulting plugins, and the ability for typechecking plugins to rewrite type-families.

  • An improved constructed product result analysis, allowing unboxing of nested structures, and a new boxity analysis, leading to less reboxing.

  • Introduction of a tag-check elision optimisation, bringing significant performance improvements in strict programs.

  • Generalisation of a variety of primitive types to be levity polymorphic. Consequently, the ArrayArray# type can at long last be retired, replaced by standard Array#.

  • Introduction of the \cases syntax from GHC proposal 0302.

  • A complete overhaul of GHC’s Windows support. This includes a migration to a fully Clang-based C toolchain, a deep refactoring of the linker, and many fixes in WinIO.

  • Support for multiple home packages, significantly improving support in IDEs and other tools for multi-package projects.

  • A refactoring of GHC’s error message infrastructure, allowing GHC to provide diagnostic information to downstream consumers as structured data, greatly easing IDE support.

  • Significant compile-time improvements in both runtime and memory consumption.

  • An overhaul of our packaging infrastructure, allowing full traceability of release artifacts and more reliable binary distributions.

  • … and much more. See the release notes for a full accounting.

Note that, as 9.4.1 is the first release for which the released artifacts will all be generated by our Hadrian build system, it’s possible that there will be packaging issues. If you encounter trouble while using a binary distribution, please open a ticket. Likewise, if you are a downstream packager, do consider migrating to Hadrian to run your build; the Hadrian build system can be built using cabal-install, stack, or the in-tree bootstrap script.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Tweag I/O, Serokell, Equinix, SimSpace, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy testing,

  • Ben

by ghc-devs at May 24, 2022 12:00 AM

May 23, 2022

Monday Morning Haskell

Using Binary Mode in Haskell

So far in our IO adventures, we've only been dealing with plain text files. But a lot of data isn't meant to be read as string data. Some of the most interesting and important problems in computing today are about reading image data and processing it so our programs can understand what's going on. Executable program files are also in a binary format, rather than human readable. So today, we're going to explore how IO works with binary files.

First, it's important to understand that handles have encodings, which we can retrieve using hGetEncoding. For the most part, your files will default to UTF-8.

hGetEncoding :: Handle -> IO (Maybe TextEncoding)

main :: IO ()
main = do
  hGetEncoding stdin >>= print
  hGetEncoding stdout >>= print
  h <- openFile "testfile.txt" ReadMode
  hGetEncoding h >>= print

...

Just UTF-8
Just UTF-8
Just UTF-8

There are other encodings of course, like char8, latin1, and utf16. These are different ways of turning text into bytes, and each TextEncoding expression refers to one of these. If you know you have a file written in UTF16, you can change the encoding using hSetEncoding:

hSetEncoding :: Handle -> TextEncoding -> IO ()

main :: IO ()
main = do
  h <- openFile "myutf16file.txt" ReadMode
  hSetEncoding h utf16
  myString <- hGetLine h
  ...

But now notice that hGetEncoding returns a Maybe value. For binary files, there is no encoding! We are only allowed to read raw data. You can set a file to read as binary by using hSetBinaryMode True, or by just using openBinaryFile.

hSetBinaryMode :: Handle -> Bool -> IO ()

openBinaryFile :: FilePath -> IOMode -> IO Handle

main :: IO ()
main = do
  h <- openBinaryFile "pic_1.bmp" ReadMode
  ...

When it comes to processing binary data, it is best to parse your input into a ByteString rather than a string. Using the unpack function will then allow you to operate on the raw list of bytes:

import qualified Data.ByteString as B

main :: IO ()
main = do
  h <- openBinaryFile "pic_1.bmp" ReadMode
  inputBytes <- B.hGetContents h
  print $ B.length inputBytes

In this example, I've opened up an image file and read its contents into a ByteString; unpacking that with B.unpack would give the raw list of bytes (using the Word8 type).

Further processing of the image will require some knowledge of the image format. As a basic example, I made a 24-bit bitmap with horizontal stripes throughout. The size was 16 pixels by 16 pixels. With 3 bytes (24 bits) per pixel, the total size of the "image" would be 768 bytes. So then upon seeing that my program above printed "822", I could figure out that the first 54 bytes were just header data.

I could then separate my data into "lines" (48-byte chunks) and I successfully observed that each of these chunks followed a specific pattern. Many lines were all white (the only value was 255), and other lines had three repeating values.

import Control.Monad (forM_)
import qualified Data.ByteString as B
import Data.List.Split (chunksOf)

main :: IO ()
main = do
  h <- openBinaryFile "pic_1.bmp" ReadMode
  inputBytes <- B.unpack <$> B.hGetContents h
  let lines = chunksOf 48 (drop 54 inputBytes)
  forM_ lines print

...

[255, 255, 255, ...]
[36, 28, 237, 36, 28, 237, ...]
[255, 255, 255, ...]
[76, 177, 34, 76, 177, 34 ...]
[255, 255, 255, ...]
[36, 28, 237, 36, 28, 237, ...]
[255, 255, 255, ...]
[76, 177, 34, 76, 177, 34 ...]
[255, 255, 255, ...]
[0, 242, 255, 0, 242, 255, ...]
[255, 255, 255, ...]
[232, 162, 0, 232, 162, 0, ...]
[255, 255, 255, ...]
[0, 242, 255, 0, 242, 255, ...]
[255, 255, 255, ...]
[232, 162, 0, 232, 162, 0, ...]

Now that the data is broken into simple numbers, it would be possible to do many kinds of mathematical algorithms on it if there were some interesting data to process!
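
As a tiny taste of that kind of processing, here's a sketch (using the same made-up pic_1.bmp file as above) that computes the average byte value of each 48-byte row:

import Control.Monad (forM_)
import qualified Data.ByteString as B
import Data.List.Split (chunksOf)
import System.IO

main :: IO ()
main = do
  h <- openBinaryFile "pic_1.bmp" ReadMode
  inputBytes <- B.unpack <$> B.hGetContents h
  let rows = chunksOf 48 (drop 54 inputBytes)
      -- compute each row's average in Double to avoid integer division
      average row = fromIntegral (sum (map fromIntegral row) :: Int)
                  / fromIntegral (length row) :: Double
  forM_ rows (print . average)

An all-white row prints 255.0, while the striped rows land somewhere in between.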

In our last couple of IO articles, we'll keep looking at some issues with binary data. If you want monthly summaries of what we're writing here at Monday Morning Haskell, make sure to subscribe to our monthly newsletter! This will also give you access to our subscriber resources!

by James Bowen at May 23, 2022 02:30 PM

GHC Developer Blog

Mid 2022 Release Plans

Mid 2022 Release Plans

Matthew Pickering - 2022-05-23

This post sets out our current plans for the upcoming releases in the next few months.

9.4.1

The next stage of the 9.4.1 release series (alpha2) will be released within the next few days. The main changes in alpha2 relative to alpha1 are improvements to the packaging and release process which have fixed a number of packaging bugs present in alpha1 due to moving to bindists built by hadrian.

The final 9.4.1 release is scheduled for late July.

The release manager for this release is Ben Gamari.

9.2.3

The 9.2.3 release is scheduled to immediately follow the 9.4.1-alpha2. This release fixes some packaging issues on Windows and a few bugs, notably:

  • a panic involving unbound cycle-breaker variables that prevented several libraries from compiling, such as massiv-io (#20231),
  • a typechecker regression in which GHC refused to use recursive equalities involving type families (#21473, #21515).

The release manager for this release is Zubin Duggal.

9.0.* series

We do not intend to produce any more releases in the 9.0.* series. From our perspective there are no serious bugs affecting the 9.0.2 release. It is advised that users start using the 9.2 series, which we intend to stabilise in the same manner as the 8.10 series. We have made this decision for the following reasons:

  • The 9.2 series does not contain significant breakage (when upgrading from 9.0)
  • Anecdotal evidence suggests that many companies are upgrading straight from 8.10.7 to 9.2.2 and skipping the 9.0.* releases.
  • We do not currently have capacity to manage 4 active branches.

Conclusion

This post summarises the latest state of the release cycles and our intent within the next few months. If you have any questions or comments then please be in touch via mailto:ghc-devs@haskell.org.

by ghc-devs at May 23, 2022 12:00 AM

May 21, 2022

Mark Jason Dominus

What's long and hard?

Sometime in the previous millennium, my grandfather told me this joke:

Why is Fulton Street the hottest street in New York?

Because it lies between John and Ann.

I suppose this might have been considered racy back when he heard it from his own grandfather. If you didn't get it, don't worry, it wasn't actually funny.

cropped screenshot from Google Maps, showing a two-block region of lower Manhattan, bounded by John Street on the south and Ann Street on the north. Fulton Street lies between them, parallel.

Today I learned the Philadelphia version of the joke, which is a little better:

What's long and black and lies between two nuts?

Sansom Street.

cropped screenshot from Google Maps, showing a two-block region of West Philadelphia, bounded by Walnut Street on the south and Chestnut Street on the north. Sansom Street lies between them, parallel.

I think that the bogus racial flavor improves it (it looks like it might turn out to be racist, and then doesn't). Some people may be more sensitive; to avoid making them uncomfortable, one can replace the non-racism with additional non-obscenity and ask instead “what's long and stiff and lies between two nuts?”.

There was a “what's long and stiff” joke I heard when I was a kid:

What's long and hard and full of semen?

A submarine.

Eh, okay. My opinion of puns is that they can be excellent, when they are served hot and fresh, but they rapidly become stale and heavy, they are rarely good the next day, and the prepackaged kind is never any good at all.

The antecedents of the “what's long and stiff” joke go back hundreds of years. The Exeter Book, dating to c. 950 CE, contains among other things ninety riddles, including this one I really like:

A curious thing hangs by a man's thigh,
under the lap of its lord. In its front it is pierced,
it is stiff and hard, it has a good position.
When the man lifts his own garment
above his knee, he intends to greet
with the head of his hanging object that familiar hole
which is the same length, and which he has often filled before.

(The implied question is “what is it?”.)

The answer is of course a key. Wikipedia has the original Old English if you want to compare.

Finally, it is off-topic but I do not want to leave the subject of the Exeter Book riddles without mentioning riddle #86. It goes like this:

Wiht cwom gongan
  þær weras sæton
monige on mæðle,
  mode snottre;
hæfde an eage
  ond earan twa,
ond II fet,
  XII hund heafda,
hrycg ond wombe
  ond honda twa,
earmas ond eaxle,
  anne sweoran
ond sidan twa.
  Saga hwæt ic hatte.

I will adapt this very freely as:

What creature has two legs and two feet, two arms and two hands, a back and a belly, two ears and twelve hundred heads, but only one eye?

The answer is a one-eyed garlic vendor.

by Mark Dominus (mjd@plover.com) at May 21, 2022 06:01 PM

May 19, 2022

Monday Morning Haskell

Interactive IO

Today we'll continue our study of IO by looking at an interactive IO program. In this kind of program, the user will enter commands continuously on the command line to interact with our program. The fun part is that we'll find a use for a lesser-known library function called, well, interact!

Imagine you're writing a command line program where you want the user to keep entering input lines, and you do some kind of processing for each line. The most simple example would be an echo program, where we simply repeat the user's input back out to them:

>> Hello
Hello
>> Goodbye
Goodbye

A naive approach to writing this in Haskell would use recursion like so:

main :: IO ()
main = go
  where
    go = do
      input <- getLine
      putStrLn input
      go

However, there's no terminal condition on this loop. It keeps expecting to read a new line. Our only way to end the program is with "ctrl+C". Typically, the cleaner way to end a program is to use the input "ctrl+D" instead, which is the "end of file" character. However, this example will not end elegantly if we do that:

>> Hello
Hello
>> Goodbye
Goodbye
>> (ctrl+D)
<stdin>: hGetLine: end of file

What's happening here is that getLine will throw this error when it reads the "end of file" character. In order to fix this, we can use these helper functions.

hIsEOF :: Handle -> IO Bool

-- Specialized to stdin
isEOF :: IO Bool

These give us a boolean that indicates whether we have reached the "end of file" as our input. The first works for any file handle and the second tells us about the stdin handle. If it returns false, then we are safe to proceed with getLine. So here's how we would rewrite our program:

main :: IO ()
main = go
  where
    go = do
      ended <- isEOF
      if ended
        then return ()
        else do
          input <- getLine
          putStrLn input
          go

Now we won't get that error message when we enter "ctrl+D".

But for these specific problems, there's another tool we can turn to, and this is the "interact" function:

interact :: (String -> String) -> IO ()

The function we supply simply takes an input string and determines what string should be output as a result. It handles all the messiness of looping for us. So we could write our echo program very simply like so:

main :: IO ()
main = interact id

...

>> Hello
Hello
>> Goodbye
Goodbye
>> Ctrl+D

Or if we're a tiny bit more ambitious, we can capitalize each of the user's entries:

import Data.Char (toUpper)

main :: IO ()
main = interact (map toUpper)

...

>> Hello
HELLO
>> Goodbye
GOODBYE
>> Ctrl+D

The function is a little tricky though, because the String -> String function is actually about taking the whole input string and returning the whole output string. The fact that it works line-by-line with simple functions is an interesting consequence of Haskell's laziness.
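
You can see the laziness at work by contrasting with a function that genuinely needs the entire input before producing any output. For example, this sketch (my own, not from the original article) prints the total character count, so nothing appears until you end the input with "ctrl+D":

main :: IO ()
main = interact (\input -> show (length input) ++ "\n")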

However, because the function is taking the whole input string, you can also write your function so that it breaks the input into lines and does a processing function on each line. Here's what that would look like:

import Data.Char (toUpper)

processSingleLine :: String -> String
processSingleLine = map toUpper

processString :: String -> String
processString input = result
  where
    ls = lines input
    result = unlines (map processSingleLine ls)

main :: IO ()
main = interact processString

For our uppercase and id examples, this works the same way. But this would be the only proper way to write our program if we wanted to, for example, parse a simple equation on each line and print the result:

import Data.List.Split (splitOn)

processSimpleAddition :: String -> String
processSimpleAddition input = case splitOn " " input of
  [num1, _, num2] -> show (read num1 + read num2)
  _ -> "Invalid input!"

processString :: String -> String
processString input = result
  where
    ls = lines input
    result = unlines (map processSimpleAddition ls)

main :: IO ()
main = interact processString

...

>> 4 + 5
9
>> 3 + 2
5
>> Hello
Invalid input!
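
Since this lines/unlines pattern comes up so often, you could even factor it into a small reusable helper (a sketch; eachLine is a made-up name, not a library function):

eachLine :: (String -> String) -> IO ()
eachLine f = interact (unlines . map f . lines)

main :: IO ()
main = eachLine processSimpleAddition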

So hIsEOF and interact are just a couple more tools you can add to your arsenal to simplify some of these common types of programs. If you're enjoying these blog posts, make sure to subscribe to our monthly newsletter! This will keep you up to date with our newest posts AND give you access to our subscriber resources!

by James Bowen at May 19, 2022 02:30 PM

May 18, 2022

Gabriel Gonzalez

Namespaced De Bruijn indices

Namespaced De Bruijn indices

In this post I share a trick I use for dealing with bound variables in Dhall that I thought might be useful for other interpreted programming languages. I have no idea if this trick has been introduced before but if it has then just let me know and I’ll acknowledge any prior art here.

Edit: Todd Wilson points out that Mark-Oliver Stehr’s CINNI originally introduced this idea.

The brief explanation of the trick is: instead of choosing between a named or a nameless representation for bound variables you can get the best of both worlds by namespacing De Bruijn indices by variable names. This simplifies the implementation and in some cases improves the end user’s experience.

The rest of this post is a longer explanation of the above summary, starting with an explanation of the trick and followed by a review of the benefits of this approach.

Background

I’d like to first explain what I mean by “named” and “nameless” representations before I explain the trick.

A named representation of the lambda calculus syntax tree typically looks something like this:

data Syntax
    = Variable String
    | Lambda String Syntax
    | Apply Syntax Syntax

For example, if the user wrote the following Haskell-like code:

\f -> \x -> f x

… then that would correspond to this syntax tree:

example :: Syntax
example = Lambda "f" (Lambda "x" (Apply (Variable "f") (Variable "x")))

The named representation has the nice property that it preserves the original variable names … well, sort of. This representation definitely preserves the variable names when you initially parse the code into the syntax tree, but if you β-reduce an expression you can potentially run into problems.

For example, consider this expression:

\x -> (\y -> \x -> y) x

… which corresponds to this syntax tree:

Lambda "x" (Apply (Lambda "y" (Lambda "x" (Variable "y"))) (Variable "x"))

If you try to β-reduce (\y -> \x -> y) x without renaming any variables then you get the following incorrect result:

\x -> \x -> x

This bug is known as “name capture” and capture-avoiding substitution requires renaming one of the variables named x so that the inner x does not shadow the outer x. For example, we could fix the problem by renaming the outer x to x1 like this:

\x1 -> \x -> x1

A nameless representation tries to work around these name capture issues by replacing the variable names with numeric indices (known as De Bruijn indices):

data Syntax
    = Variable Int
    | Lambda Syntax
    | Apply Syntax Syntax

For example, code like this:

\f -> \x -> f x

… corresponds to this nameless representation:

example :: Syntax
example = Lambda (Lambda (Apply (Variable 1) (Variable 0)))

Carefully note that the Lambda constructor now has no field for the bound variable name, so it’s as if the user had instead written:

\ -> \ -> @1 @0

… using @n to represent the variable whose De Bruijn index is n.

The numeric De Bruijn indices refer to bound variables. Specifically, the numeric index 0 refers to the “closest” or “innermost” variable bound by a lambda:

--           This 0 index …
--           ↓
\ -> \ -> @1 @0
--   ↑ … refers to the variable bound by this lambda

… and incrementing the index moves to the next outermost lambda:

--        This 1 index …
--        ↓
\ -> \ -> @1 @0
-- ↑ … refers to the variable bound by this lambda

De Bruijn indices avoid name collisions between bound variables, but they require you to do additional work if you wish to preserve the original variable names. There are several ways to do so, and I’ll present my preferred approach.

The trick - Part 1

We can get the best of both worlds by combining the named and nameless representations into a hybrid representation like this:

data Syntax
    = Variable String Int
    | Lambda String Syntax
    | Apply Syntax Syntax

I call this representation “namespaced De Bruijn indices”.

This is almost exactly the same as our named representation, except that we have now added an Int field to the Variable constructor. This Int field is morally the same as the De Bruijn index in the nameless representation, except that this time the De Bruijn index is “namespaced” to a specific variable name.

The easiest way to explain this is with a few examples.

The following expression:

\x -> \y -> \x -> x@0

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

… and this curried function returns the third function argument:

--                This …
--                ↓
\x -> \y -> \x -> x@0
--          ↑ … refers to this bound variable

… because that is the innermost bound variable named x.

Similarly, the following expression:

\x -> \y -> \x -> y@0

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "y" 0)))

… which returns the second function argument:

--                This …
--                ↓
\x -> \y -> \x -> y@0
--    ↑ … refers to this bound variable

… because that is the innermost bound variable named y.

Carefully note that our variable still has a De Bruijn index of 0, but we ignore the innermost bound variable named x: because we pair our De Bruijn index with the name of the variable we are referring to (y), we only count bound variables named y when resolving the De Bruijn index.

Finally, the following expression:

\x -> \y -> \x -> x@1

… corresponds to this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 1)))

… which returns the first function argument:

--                This …
--                ↓
\x -> \y -> \x -> x@1
-- ↑ … refers to this bound variable

The De Bruijn index is 1, which means that it refers to the second innermost (0-indexed) bound variable named x.

Notice how this representation lets us refer to shadowed variables by their index. These De Bruijn indices are not an internal implementation detail, but are actually available to the user as part of the surface syntax of the language.

However, we want to avoid littering the code with these De Bruijn indices, which brings us to the second part of the trick.

The trick - Part 2

The next step is to add syntactic sugar to the language by allowing users to omit the index in the source code, which defaults the index to 0. This means that an expression that never references shadowed variables never needs to specify a De Bruijn index.

For example, instead of writing this:

\x -> \y -> \x -> x@0

… we can elide the index to simplify the code to:

\x -> \y -> \x -> x

… which will still parse as:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

Similarly, we can simplify this:

\x -> \y -> \x -> y@0

… to this:

\x -> \y -> \x -> y

… which will still parse as:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "y" 0)))

However, we cannot use this syntactic sugar to simplify the final example:

\x -> \y -> \x -> x@1

… since the index is non-zero. Any code that references a shadowed variable still needs to use an explicit De Bruijn index to do so.

Vice versa, we also omit zero indices when pretty-printing code. When we pretty-print this syntax tree:

Lambda "x" (Lambda "y" (Lambda "x" (Variable "x" 0)))

… we don’t include the index:

\x -> \y -> \x -> x
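
For concreteness, here is a minimal sketch of such a pretty-printer for the hybrid Syntax type above (this sketch is only an illustration for this post, not the actual Dhall implementation):

prettyPrint :: Syntax -> String
prettyPrint (Variable name 0) = name  -- the syntactic sugar: elide index 0
prettyPrint (Variable name index) = name ++ "@" ++ show index
prettyPrint (Lambda name body) = "\\" ++ name ++ " -> " ++ prettyPrint body
prettyPrint (Apply function argument) = atom function ++ " " ++ atom argument
  where
    -- Parenthesize anything that is not a bare variable
    atom variable@Variable{} = prettyPrint variable
    atom other = "(" ++ prettyPrint other ++ ")"

The parser applies the same sugar in reverse: a variable written without an @ suffix parses with an index of 0.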

This syntactic sugar ensures that most users do not need to be aware that indices exist at all when writing code. The user only encounters the indices in two scenarios:

  • The user wishes to explicitly reference a shadowed variable

    For example, in the following expression:

    \x -> \y -> \x -> x@1

    … the user might prefer to use the built-in language support for disambiguating variables of the same name rather than renaming one of the two variables named x.

  • The indices appear in a β-reduced result

    For example, this expression has no user-visible De Bruijn indices:

    \x -> (\y -> \x -> y) x

    … but if you β-reduce the expression (I’ll cover how in the Appendix) and pretty-print the β-reduced expression then the result will introduce a non-zero De Bruijn index to disambiguate the two variables named x:

    \x -> \x -> x@1

In fact, the latter scenario is the reason I originally adopted this trick: I wanted to be able to display β-reduced functions to the end user while preserving the original variable names as much as possible.

Note that De Bruijn indices don’t appear when a β-reduced expression does not reference any shadowed variables. For example, if you β-reduce this expression:

(\f -> f f) (\x -> x)

… the result has no De Bruijn index (because the index is 0 and is therefore elided by the pretty-printer):

\x -> x

The trick - Part 3

One of the benefits of the traditional nameless representation using (non-namespaced) De Bruijn indices is that you get α-equivalence for free. Two nameless expressions are α-equivalent if they are syntactically identical. We can build upon this useful property to derive a compact algorithm for α-equivalence of “namespaced De Bruijn indices”.

The trick is to recognize that namespaced De Bruijn indices reduce to ordinary De Bruijn indices in the degenerate case when you rename all variables to the same name. I’ll call this renaming process “α-reduction”.

For example, if we α-reduce the following expression by renaming all of the variables to _:

\x -> \y -> \x -> x@1

… then we get this result:

\_ -> \_ -> \_ -> _@2

See the Appendix for the α-reduction algorithm.

Equipped with α-reduction, we can derive α-equivalence: two expressions are α-equivalent if their α-reduced forms are syntactically identical.

For example, this expression:

\x -> x

… and this expression:

\y -> y

… both α-reduce to:

\_ -> _

… so they are α-equivalent.

Benefits

There are a few benefits of using this trick that motivate me to use this in all of my interpreted languages:

  • This trick improves the readability of β-reduced functions

    β-reduced functions preserve the original variable names and this trick doesn’t suffer from the rename-related name pollution that plagues other capture-avoiding substitution algorithms. In particular, β-reduced expressions only display De Bruijn indices when absolutely necessary (if they reference a shadowed variable) and they otherwise use the original pristine variable names.

  • This trick simplifies the internal implementation

    You don’t need to maintain two separate syntax trees for a named and nameless representation. You can use the same syntax tree for both since any named syntax tree can be α-reduced to give the equivalent nameless syntax tree.

  • This trick enables userland support for referencing shadowed variables

    I know some people think that referencing shadowed variable names is a misfeature. However, I personally feel that resolving name collisions by adding ' or _ characters to the end of variable names is less principled than having language support for resolving name collisions using optional De Bruijn indices.

  • (Not shown) This trick can sometimes improve type errors

    To be precise, this trick improves the inferred types displayed in error messages when using explicit universal quantification.

    Type variables also have to avoid name collisions, so if you use the same namespaced De Bruijn representation for your types then you avoid polluting your inferred types and error messages with junk type variables like a14.

    This post doesn’t cover the equivalent type-level trick, but you can refer to the Dhall standard if you need an example of a language that uses this trick.

Conclusion

I believe that namespaced De Bruijn indices are most appropriate for languages that are (A) strongly normalizing (like Dhall) and (B) interpreted, because such languages tend to support pretty-printing β-reduced functions.

I think this trick is also useful to a lesser extent for all interpreted languages, if only because the implementation is (in my opinion) simpler and more elegant than other algorithms for capture-avoiding substitution (See the Appendix below).

On the other hand, compiled languages will likely not benefit much from this trick since they typically have no need to preserve the original variable names and they also will use an intermediate representation that is very different from the concrete syntax tree.

Appendix - Implementation

This section provides Haskell code specifying how to α-reduce and β-reduce a syntax tree that uses namespaced De Bruijn indices.

This reference implementation is not the most efficient implementation, but it’s the simplest one which I use for pedagogical purposes. If you’re interested in efficiency then check out my Grace project, which mixes this trick with the more efficient normalization-by-evaluation algorithm.

I also don’t include code for the parser or pretty-printer, because the only interesting part is the syntactic sugar for handling variables with a De Bruijn index of 0. Again, check out Grace if you want to refer to a more complete implementation.

-- | Syntax tree
data Syntax
    = Variable String Int
    | Lambda String Syntax
    | Apply Syntax Syntax
    deriving (Eq, Show)

{-| Increase the index of all bound variables matching the given variable name

    This is modified from the Shifting definition in Pierce's \"Types and
    Programming Languages\" by adding an additional argument for the namespace
    to shift
-}
shift
    :: Int
    -- ^ The amount to shift by
    -> String
    -- ^ The variable name to match (a.k.a. the namespace)
    -> Int
    -- ^ The minimum bound for which indices to shift
    -> Syntax
    -- ^ The expression to shift
    -> Syntax
shift offset namespace minIndex syntax =
    case syntax of
        Variable name index -> Variable name index'
          where
            index'
                | name == namespace && minIndex <= index = index + offset
                | otherwise = index

        Lambda name body -> Lambda name body'
          where
            minIndex'
                | name == namespace = minIndex + 1
                | otherwise = minIndex

            body' = shift offset namespace minIndex' body

        Apply function argument -> Apply function' argument'
          where
            function' = shift offset namespace minIndex function

            argument' = shift offset namespace minIndex argument
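
-- For example (this example is ours, not from the original post): shifting
-- namespace "x" by 1 leaves variables in other namespaces untouched:
--
-- >>> shift 1 "x" 0 (Apply (Variable "x" 0) (Variable "y" 0))
-- Apply (Variable "x" 1) (Variable "y" 0)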

{-| Substitute the given variable name and index with an expression

    This is modified from the Substitution definition in Pierce's \"Types and
    Programming Languages\" by adding an additional argument for the variable
    index
-}
substitute
    :: Syntax
    -- ^ The expression to substitute into
    -> String
    -- ^ The name of the variable to replace
    -> Int
    -- ^ The index of the variable to replace
    -> Syntax
    -- ^ The expression to substitute in place of the given variable
    -> Syntax
substitute expression name index replacement =
    case expression of
        Variable name' index'
            | name == name' && index == index' -> replacement
            | otherwise -> Variable name' index'

        Lambda name' body -> Lambda name' body'
          where
            index'
                | name == name' = index + 1
                | otherwise = index

            shiftedBody = shift 1 name' 0 replacement

            body' = substitute body name index' shiftedBody

        Apply function argument -> Apply function' argument'
          where
            function' = substitute function name index replacement

            argument' = substitute argument name index replacement
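
-- For example (ours): under the \x binder, the outer x has index 1, so
-- substituting x@0 from outside the lambda with y yields \x -> y:
--
-- >>> substitute (Lambda "x" (Variable "x" 1)) "x" 0 (Variable "y" 0)
-- Lambda "x" (Variable "y" 0)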

-- | β-reduce an expression
betaReduce :: Syntax -> Syntax
betaReduce syntax =
    case syntax of
        Variable name index -> Variable name index

        Lambda name body -> Lambda name body'
          where
            body' = betaReduce body

        Apply function argument ->
            case function' of
                Lambda name body -> body'
                  where
                    shiftedArgument = shift 1 name 0 argument

                    substitutedBody = substitute body name 0 shiftedArgument

                    unshiftedBody = shift (-1) name 0 substitutedBody

                    body' = betaReduce unshiftedBody

                _ -> Apply function' argument'
          where
            function' = betaReduce function

            argument' = betaReduce argument
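
-- For example (ours), β-reducing the earlier expression
-- \x -> (\y -> \x -> y) x produces \x -> \x -> x@1:
--
-- >>> betaReduce (Lambda "x" (Apply (Lambda "y" (Lambda "x" (Variable "y" 0))) (Variable "x" 0)))
-- Lambda "x" (Lambda "x" (Variable "x" 1))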

-- | α-reduce an expression
alphaReduce :: Syntax -> Syntax
alphaReduce syntax =
    case syntax of
        Variable name index -> Variable name index

        Lambda name body -> Lambda "_" body'
          where
            shiftedBody = shift 1 "_" 0 body

            substitutedBody = substitute shiftedBody name 0 (Variable "_" 0)

            unshiftedBody = shift (-1) name 0 substitutedBody

            body' = alphaReduce unshiftedBody

        Apply function argument -> Apply function' argument'
          where
            function' = alphaReduce function

            argument' = alphaReduce argument

-- | Returns `True` if the two input expressions are α-equivalent
alphaEquivalent :: Syntax -> Syntax -> Bool
alphaEquivalent left right = alphaReduce left == alphaReduce right
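
-- For example (ours):
--
-- >>> alphaEquivalent (Lambda "x" (Variable "x" 0)) (Lambda "y" (Variable "y" 0))
-- True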

Appendix - History

I actually first introduced this feature in Morte, not Dhall. The idea originated from the discussion on this issue.

by Gabriella Gonzalez (noreply@blogger.com) at May 18, 2022 01:56 PM

Dynamic type errors lack relevance


Proponents of statically typed languages commonly motivate types as a way to safely detect bugs ahead of time. For example, consider the following Python program that attempts to increment a number stored in counter.txt:

# ./increment.py

with open('counter.txt', 'r') as handle:
    x = handle.readline()

with open('counter.txt', 'w') as handle:
    handle.write(int(x) + 1)

This program contains a type error, but by the time we find out it’s too late: our program will have already wiped the contents of counter.txt by opening the file as a writable handle:

$ echo -n '0' > ./counter.txt

$ cat counter.txt
0

$ python increment.py
Traceback (most recent call last):
  File "increment.py", line 5, in <module>
    handle.write(int(x) + 1)
TypeError: expected a string or other character buffer object

$ cat counter.txt # The contents of the file were lost

Defenders of dynamically typed languages sometimes counter that these pitfalls do not matter when runtime failures are mostly harmless. If you want to find errors in your program, just run the program!

As an extreme example, Nix is a purely functional language with a dynamic type system, and you can safely interpret a Nix program ahead of time to detect errors since Nix evaluation has no side effects1. Consequently, Nix proponents sometimes reason that these dynamic type errors are functionally indistinguishable from static type errors thanks to Nix’s purity.

However, dynamic types are not a substitute for static types, even in a purely functional language like Nix. To see why, consider the following Nix expression, which attempts to render a structured value as command-line options:

# ./options.nix

let
  pkgs = import <nixpkgs> { };

  enable = option: "${option}=true";

  disable = option: "${option}=false";

in
  pkgs.lib.cli.toGNUCommandLine { }
    { option = [
        "max-jobs=5"
        "cores=4"
        enable "fallback"
      ];
    }

The intention was to produce this result:

[ "--option" "max-jobs=5" "--option" "cores=4" "--option" "fallback=true" ]

… but we actually get a dynamic type error when we interpret the expression:

$ nix-instantiate --eval options.nix --strict
error: evaluation aborted with the following error message: 'generators.mkValueStringDefault: functions not supported: <λ>'

This error message is not very helpful, and it’s not due to a lack of effort, funding, or attention. This sort of poor user experience is inherent to any dynamic type system.

The fundamental issue is that in a dynamically typed language you cannot explain errors to the user in terms of the source code they wrote. In other words, dynamic type errors commonly fail to be relevant to the user.

For example, if Nix had a typical static type system, then the diagnostic might have looked something like this:

# ./options.nix

let
  pkgs = import <nixpkgs> { };

  enable = option: "${option}=true";

  disable = option: "${option}=false";

in
  pkgs.lib.cli.toGNUCommandLine { }
    { option = [
        "max-jobs=5"
        "cores=4"
        enable "fallback"
      # ~~~~~~~~~~~~~~~~~
      # This element of the list is not a string
      ];
    }

This sort of diagnostic helps us more easily discern that we forgot to parenthesize (enable "fallback"), so the enable function is treated as another list element.

In a dynamic type system, type errors can potentially be far removed from the code that the user wrote. From Nix's point of view, the actual error is that somewhere in the middle of interpretation it is trying to apply the mkValueStringDefault utility function to the user's enable function:

mkValueStringDefault enable

… but by that point the Nix interpreter is no longer “thinking” in terms of the original program the user wrote, so any interpreter diagnostics will have difficulty explaining the error in terms that the user can understand. For example:

  • In the middle of interpretation any offending subexpressions are abstract syntax trees, not source code

  • Some of these abstract syntax trees may be functions or closures that cannot be (easily) displayed to the user

    We see this above where the error message is unable to render the enable function so it falls back to displaying <λ>.

  • Intermediate evaluation results might not correspond to the source code at all

    For example, the user might not understand where mkValueStringDefault is originating from in the absence of a stack trace.

  • Even if we could trace subexpressions to their original source code the user still might not be able to work backwards from the dynamic type error to the real problem.

In other words, even if we showed the user the call site for the mkValueStringDefault function they still wouldn't necessarily understand why enable is the function argument.

In fact, the example error message came out better than I expected. The reason is that somebody took the time to add a custom error message to the mkValueStringDefault utility instead of falling back on the interpreter throwing a dynamic type error:

mkValueStringDefault = {}: v: with builtins;
  let err = t: v: abort
        ("generators.mkValueStringDefault: " +
         "${t} not supported: ${toPretty {} v}");
  in
Had they not done so then the error message would have been even further disconnected from the user’s experience. This only reinforces that the relevance of error messages is inversely proportional to the extent to which we avail ourselves of the dynamic type system.

This is why I prefer to lean on static type systems as much as possible to detect errors, because they tend to do a better job of “explaining” what went wrong than dynamic type systems.

Note: The criticisms in this post also apply to exceptions in general (where you can view dynamic types as a special case of exceptions auto-generated by the interpreter). Exceptions also need to be supplemented by stack traces, logging, or debuggers in order to improve their relevance.


  1. Technically, Nix evaluation can trigger builds via “import from derivation”. However, with appropriate sandboxing even builds are mostly harmless. Either way, just assume for the purpose of discussion that Nix evaluation is safe. After all, any unsafety in evaluation only makes the case for static types even stronger.↩︎

by Gabriella Gonzalez (noreply@blogger.com) at May 18, 2022 01:50 PM

May 16, 2022

Monday Morning Haskell

Buffering...Please Wait...

Today we continue our exploration of more obscure IO concepts with the idea of buffering. Buffering determines the more precise mechanics of how our program reads and writes with files. In the right circumstance, using the proper buffering method can make your program work a lot more efficiently.

To start, let's consider the different options Haskell offers us. The BufferMode type has three options:

data BufferMode =
  NoBuffering |
  LineBuffering |
  BlockBuffering (Maybe Int)

Every handle has an assigned buffering mode. We can get and set this value using the appropriate functions:

hGetBuffering :: Handle -> IO BufferMode

hSetBuffering :: Handle -> BufferMode -> IO ()

By default, terminal handles will use NoBuffering and file handles will use BlockBuffering:

main :: IO ()
main = do
  hGetBuffering stdin >>= print
  hGetBuffering stdout >>= print
  (openFile "myfile.txt" ReadMode) >>= hGetBuffering >>= print
  (openFile "myfile2.txt" WriteMode) >>= hGetBuffering >>= print

...

NoBuffering
NoBuffering
BlockBuffering Nothing
BlockBuffering Nothing

So far this seems like some nice trivia to know, but what do these terms actually mean?

Well, when your program reads and writes to files, it doesn't do the "writing" at the exact time you expect. When your program executes hPutStr or hPutStrLn, the given string will be added to the handle's buffer, but depending on the mode, it won't immediately be written out to the file.

If you use NoBuffering though, it will be written immediately. Once the buffer has even a single character, it will write this character to the file. If you use LineBuffering, it will wait until it encounters a newline character.

Finally, there is BlockBuffering. This constructor holds an optional number. The buffer won't write until it contains the given number of bytes. If the value is Nothing, then the underlying number just depends on the operating system.

This idea might sound dangerous to you. Does this mean that it's likely that your program will just leave data unwritten if it doesn't get the right amount? Well no. You can also flush buffers, which will cause them to write their information out no matter what. This happens automatically on important operations like hClose (remember to close your handles!). You can also do this manually with the hFlush function:

hFlush :: Handle -> IO ()
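
For example, here's a quick sketch (the file name and strings are made up) of flushing manually in the middle of a block-buffered write:

import System.IO

main :: IO ()
main = do
  h <- openFile "log.txt" WriteMode
  hSetBuffering h (BlockBuffering (Just 4096))
  hPutStrLn h "first entry"
  -- Without this flush, "first entry" would sit in the buffer until
  -- 4096 bytes accumulate or the handle is closed.
  hFlush h
  hPutStrLn h "second entry"
  hClose h -- closing flushes anything that remains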

For the most part, you won't notice the difference in buffer modes on normal programs. But under certain circumstances, it can make a big difference in performance. The act of writing information to a file is actually a very long and expensive operation as far as programs are concerned. So doing fewer writes with larger amounts of data tends to be more efficient than doing more writes with smaller amounts of data.

Hopefully you can see now why BlockBuffering is an option. Typically, this is the most efficient way if you're writing a large amount of data, while NoBuffering is the least efficient.

To test this out, I wrote a simple program to write out one hundred thousand numbers to a file, and timed it with different buffer modes:

import Control.Monad (forM_)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import System.IO

someFunc :: IO ()
someFunc = do
  let numbers = [1..100000]
  h <- openFile "number.txt" WriteMode
  hSetBuffering h NoBuffering
  timestamp1 <- getCurrentTime
  forM_ numbers (hPrint h)
  hClose h
  timestamp2 <- getCurrentTime
  print $ diffUTCTime timestamp2 timestamp1

When running with NoBuffering, this operation took almost a full second: 0.93938s. However, when I changed to LineBuffering, it dropped to 0.2367s. Finally, with BlockBuffering Nothing, I got a blazing fast 0.05473s. That's around 17x faster! So if you're writing a large amount of data to a file, this can make a definite difference!

If you're writing a program where write-performance is important, I hope this knowledge helps you! Even if not, it's good to know what kinds of things are happening under the hood. If you want to keep up to date with more Haskell knowledge, both obscure and obvious, make sure to subscribe to our monthly newsletter! If you're just starting out, this will give you access to resources like our Beginners Checklist and Recursion Workbook!

by James Bowen at May 16, 2022 02:30 PM

May 14, 2022

Mark Jason Dominus

Cathedrals of various sorts

A while back I wrote a shitpost about octahedral cathedrals and in reply Daniel Wagner sent me this shitpost of a cat-hedron:

A computer graphics drawing of a roughly cat-shaped polyhedron with a glowing blue crucifix stuck on its head.

But that got me thinking: the ‘hedr-’ in “octahedron” (and other -hedrons) is actually the Greek word ἕδρα (/hédra/) for “seat”, and an octahedron is a solid with eight “seats”. The ἕδρα (/hédra/) is akin to Latin sedēs (like in “sedentary”, or “sedate”) by the same process that turned Greek ἡμι- (/hémi/, like in “hemisphere”) into Latin semi- (like in “semicircle”) and Greek ἕξ (/héx/, like in “hexagon”) into Latin sex (like in “sextet”).

So a cat-hedron should be a seat for cats. Such seats do of course exist:

A combination “cat tree” and scratching post sits on the floor of a living room in front of the sofa. The object is about two feet high and has a carpeted platform atop a column wrapped in sisal rope. Hanging from the platform is a cat toy, and on the platform resides a black and white domestic housecat. A second cat investigates the carpeted base of the cat tree.

But I couldn't stop there because the ‘hedr-’ in “cathedral” is the same word as the one in “octahedron”. A “cathedral” is literally a bishop's throne, and cathedral churches are named metonymically for the literal throne they contain or the metaphorical one they represent. A cathedral is where a bishop has his “seat” of power.

So a true cathedral should look like this:

The same picture as before, but the cat has been digitally erased from the platform, and replaced with a gorgeously uniformed cardinal of the Catholic Church, wearing white and gold robes and miter.

by Mark Dominus (mjd@plover.com) at May 14, 2022 09:48 PM

May 12, 2022

Monday Morning Haskell

Using Temporary Files

In the last article we learned about seeking. Today we'll see another context where we can use these tools while learning about another new idea: temporary files.

Our "new" function for today is openTempFile. Its type signature looks like this:

openTempFile :: FilePath -> String -> IO (FilePath, Handle)

The first argument is the directory in which to create the file. The second is a "template" for the file name. The template can look like a normal file name, like name.extension. The name of the file that will actually be created will have some random digits appended to the name. For example, we might get name1207-5.extension.

The result of the function is that Haskell will create the file and pass a handle to us in ReadWrite mode. So our two outputs are the full path to the file and its handle.

Despite the name openTempFile, this function won't do anything to delete the file when it's done. You'll still have to do that yourself. However, it does have some useful built-in mechanics. It is guaranteed to not overwrite an existing file on the system, and it also gives limited file permissions so it can't be used by an attacker.
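
As an aside, if you want the cleanup to happen even when an exception interrupts your work, you could wrap openTempFile in a bracket-style helper. Here's a sketch (the withScratchFile name is our own, not a library function):

import Control.Exception (finally)
import System.Directory (removeFile)
import System.IO

withScratchFile :: FilePath -> String -> (Handle -> IO a) -> IO a
withScratchFile dir template action = do
  (path, handle) <- openTempFile dir template
  -- Close the handle first, then delete the file, even on exceptions.
  (action handle `finally` hClose handle) `finally` removeFile path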

How might we use such a file? Well let's suppose we have some calculation that we break into multiple stages, so that it uses an intermediate file in between. As a contrived example, let's suppose we have two functions. One that writes fibonacci numbers to a file, and another that takes the sum of numbers in a file. We'll have both of these operate on a pre-existing Handle object:

writeFib :: Integer -> Handle -> IO ()
writeFib n handle = writeNum (0, 1) 0
  where
    writeNum :: (Integer, Integer) -> Integer -> IO ()
    writeNum (a, b) x = if x > n then return ()
      else hPutStrLn handle (show a) >> writeNum (b, a + b) (x + 1)

sumNumbers :: Handle -> IO Integer
sumNumbers handle = do
  hSeek handle AbsoluteSeek 0
  nums <- (fmap read . lines) <$> hGetContents handle
  return $ sum nums

Notice how we "seek" to the beginning of the file in our reading function. This means we can use the same handle for both operations, assuming the handle has ReadWrite mode. So let's see how we put this together with openTempFile:

main :: IO ()
main = do
  n <- read <$> getLine
  (file, handle) <- openTempFile "/tmp/fib" "calculations.txt"
  writeFib n handle
  sum <- sumNumbers handle
  print sum
  hClose handle
  removeFile file

A couple of notes here. First, if the directory passed to openTempFile doesn't exist, this will cause an error. Second, we need to print the sum before closing the handle: because of laziness, Haskell will not actually read anything from the handle until the result is demanded, and by then the handle would already be closed!

But aside from these caveats, our function works! If we don't remove the file, then we'll be able to see the file at a location like /tmp/fib/calculations6132-6.txt.

This example doesn't necessarily demonstrate why we would use openTempFile instead of just giving the file the name calculations.txt. The answer is that our process is now safer with respect to concurrency. We could run this same operation on different threads in parallel, and there would be no file conflicts. We'll see exactly how to do that later this year!

For now, make sure you're subscribed to our monthly newsletter so that you can stay up to date with all latest information and offers! If you're already subscribed, take a look at our subscriber resources that can help you improve your Haskell!

by James Bowen at May 12, 2022 02:30 PM

Tweag I/O

Comparing strict and lazy

This blog post covers essentially the same material as the talk I gave at Haskell Exchange 2020 (time truly flies). If you prefer watching it in a talk format, you can watch the recording. Or you can browse the slides.

I first conceived of writing (well, talking, initially) on this subject after one person too many told me “lazy is better than strict because it composes”. You see, this is a sentence that simply doesn't make much sense to me, but it is oft repeated.

Before we get started, let me make quite specific what we are going to discuss: we are comparing programming languages by whether their function calls are lazy or strict by default. Strict languages can (and do) have lazy data structures, and lazy languages can have strict data structures (though it's a little bit harder; in GHC, for instance, full support has only recently been released).

In the 15 years that I’ve been programming professionally, the languages in which I’ve written the most have been Haskell and Ocaml. These two languages are similar enough, but Haskell is lazy and Ocaml is strict. I’ll preface my comparison by saying that, in my experience, when switching between Ocaml and Haskell, I almost never think about laziness or strictness. It comes up sometimes. But it’s far from being a central consideration. I’m pointing this out to highlight that lazy versus strict is really not that important; it’s not something that’s worth the very strong opinions that you can see sometimes.

Locks

With these caveats established, I’d like to put to rest the statement that laziness composes better. Consider the following piece of Haskell

atomicPut :: Handle -> String -> IO ()
atomicPut h line =
  withMVar lock $ \_ -> do
    hPutStrLn h line

This looks innocuous enough: it uses a lock to ensure that the line argument is printed without being interleaved with another call to atomicPut. It also has a severe bug. Don’t beat yourself up if you don’t see why: it’s pretty subtle; and this bit of code existed in a well-used logging library for years (until it broke production on a project I was working on and I pushed a fix). The problem, here, is that line is lazy, hence can contain an arbitrary amount of computation, which is subsequently run by hPutStrLn. Running arbitrary amounts of computation within a locked section is very bad.

The fix, by the way, is to fully evaluate the line before entering the lock

import Control.Concurrent.MVar (withMVar)
import Control.DeepSeq (force) -- from the deepseq package
import Control.Exception (evaluate)

atomicPut :: Handle -> String -> IO ()
atomicPut h line = do
  evaluate $ force line
  withMVar lock $ \_ -> do
    hPutStrLn h line

It goes to show, though, that laziness doesn’t compose with locks. You have to be quite careful too: for each variable used within the locked section, you need to evaluate it at least as much as the locked code will before entering the lock.

Shortcutting fold

When people claim that lazy languages compose better, what they think about is something like this definition

or :: [Bool] -> Bool
or =  foldr (||) False

This is truly very clever, because this implementation will stop traversing the list as soon as it finds a True element. To see why, let's look at the definition of foldr

foldr            :: (a -> b -> b) -> b -> [a] -> b
foldr _ z []     =  z
foldr f z (x:xs) =  f x (foldr f z xs)

When we call foldr recursively, we do that as an argument to f, but since f is lazy, the recursive call is not evaluated until f itself asks for the evaluation. In or, f is (||), which doesn’t evaluate its second argument when the first is True, so the recursive call never happens in this case.
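
You can observe this short-circuiting directly in GHCi: the tail of the list is never demanded once a True element is found, so even an undefined tail is fine.

ghci> or (True : undefined)
True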

It’s absolutely possible to do the same thing in a strict language. But it requires quite a bit more setup:

(* val fold_right_lazily : ('a -> 'b Lazy.t -> 'b Lazy.t) -> 'a List.t -> 'b Lazy.t -> 'b Lazy.t *)
let rec fold_right_lazily f l accu =
  match l with
  | [] -> accu
  | a::l' -> f a (lazy (Lazy.force (fold_right_lazily f l' accu)))

(* val or_ : bool List.t -> bool *)
let or_ l = Lazy.force (fold_right_lazily (fun x y -> x || (Lazy.force y)) l (Lazy.from_val false))

But, honestly, it’s not really worth it. GHC takes a lot of care to optimise lazy evaluation, since it’s so central in its evaluation model. But Ocaml doesn’t give so much attention to lazy values. So fold_right_lazily wouldn’t be too efficient. In practice, Ocaml programmers will rather define or manually

(* val or_ : bool List.t -> bool *)
let rec or_ = function
  | [] -> false
  | b::l -> b || (or_ l) (* || is special syntax which is lazy in the
                           second argument*)

Applicative functors

Another, probably lesser known, way in which laziness shines is applicative functors. At its core, an applicative functor is a data structure which supports zipWith<N> (or map<N> in Ocaml) for all N. For instance, for lists:

zipWith0 :: a -> [a]
zipWith1 :: (a -> b) -> [a] -> [b]
zipWith2 :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith3 :: (a -> b -> c -> d) -> [a] -> [b] -> [c] -> [d]
zipWith4 :: (a -> b -> c -> d -> e) -> [a] -> [b] -> [c] -> [d] -> [e]
zipWith5 :: (a -> b -> c -> d -> e -> f) -> [a] -> [b] -> [c] -> [d] -> [e] -> [f]

Of course, that’s infinitely many functions, and we can’t define them all. Though probably, it’s enough to define 32 of them, but even that would be incredibly tedious. The applicative functor abstraction very cleverly finds a way to summarise all these functions as just three functions:

pure :: a -> [a]
(<$>) :: (a -> b) -> [a] -> [b]
(<*>) :: [a -> b] -> [a] -> [b]

(in Haskell, this would be the Applicative instance of the ZipList type, but let’s not be distracted by that)

Then, zipWith5 is derived simply as:

zipWith5 :: (a -> b -> c -> d -> e -> f) -> [a] -> [b] -> [c] -> [d] -> [e] -> [f]
zipWith5 f as bs cs ds es = f <$> as <*> bs <*> cs <*> ds <*> es

The definition is so simple that you never need to define zipWith5 at all: you just use the definition inline. But there’s a catch: were you to use this definition on a strict data structure, the performance would be abysmal. Indeed, this zipWith5 would allocate 5 lists: each call to (<*>) allocates an intermediate list. But a manual implementation of zipWith5 requires a single list to be allocated. This is very wasteful.

For lazy list it’s alright: you do allocate all 5 lists as well, but in a very different pattern. You first allocate the first cons cell of each of the 5 lists, and discard all 4 intermediate results. Then you allocate the second cons cell of each list, etc… This means that at any point in time, the memory overhead is constant. This is the sort of load that a garbage collector handles very very well.

Now, this is really about lazy data structures versus strict data structures, which I said at the beginning that I wouldn’t discuss. But I think that there is something here: in a strict language, you will usually be handling strict data structures. If you want them to support an efficient applicative interface, you will need a lazy copy of the same data structure. This is a non-trivial amount of extra code. I imagine it could be mitigated by the language letting you derive this lazy variant of your data structure. But I don’t think any language does this yet.

Matching lazy data is weird

That being said, lazy languages come with their lot of mind-boggling behaviour. Pattern-matching can be surprisingly counter-intuitive.

Consider the following

f :: Bool -> Bool -> Int
f _    False = 1
f True False = 2
f _    _     = 3

This second clause may seem completely redundant: after all, anything matched by the second clause is already matched by the first. However it isn't: it forces the evaluation of the first argument. So the mere presence of this clause changes the behaviour of f (it makes it so that f undefined True = undefined, whereas without the clause the third equation would return 3). But f can never return 2.
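
To see the difference concretely (a GHCi session, output abbreviated):

ghci> f undefined False
1
ghci> f undefined True
*** Exception: Prelude.undefined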

This and more examples can be found in Simon Peyton Jones’s keynote talk at Haskell Exchange 2019 Revisiting Pattern Match Overlap Checks as well as in an older paper GADTs meet their match, by Karachalias, Schrijvers, Vytiniotis, and Peyton Jones.

Memory

Consider the following implementation of list length:

length :: [a] -> Integer
length [] = 0
length (_:as) = 1 + length as

It’s not a good implementation: it will take <semantics>O(n)<annotation encoding="application/x-tex">O(n)</annotation></semantics>O(n) stack space, which for big lists, will cause a stack overflow, even though length really only require <semantics>O(1)<annotation encoding="application/x-tex">O(1)</annotation></semantics>O(1) space. It’s worth noting that we are consuming stack space, here, because (+) is strict. Otherwise, we could have been in a case like the or function from earlier, where the recursive call was guarded by the lazy argument and didn’t use space.

For such a strict recursion, the solution is well-known: change the length function to be tail recursive with the help of an accumulator:

length :: [a] -> Integer
length = go 0
  where
    go :: Integer -> [a] -> Integer
    go acc [] = acc
    go acc (_:as) = go (acc+1) as

This transformation is well-understood, straightforward, and also incorrect: while it is true that we are no longer using stack space, we have traded it for just as much heap space. Why? Well, while (+) is strict, Integer is still a lazy type: a value of type Integer is a thunk. During our recursion we never evaluate the accumulator; so what we are really doing is creating a sort of copy of the list in the form of thunks which want to evaluate acc + 1.

This is a common trap of laziness (see also this blog post by Neil Mitchell). It’s much the same as why you typically want to use foldl' instead of foldl in Haskell. The solution is to evaluate intermediate results before making recursive calls, for instance with bang patterns1:

length :: [a] -> Integer
length = go 0
  where
    go :: Integer -> [a] -> Integer
    go !acc [] = acc
    go !acc (_:as) = go (acc+1) as

This is an instance of a bigger problem: it’s often very difficult to reason about memory in lazy languages. The question “do these few lines of code leak memory” sometimes provokes very heated discussions among seasoned Haskell programmers.

On a personal note, I once fixed a pretty bad memory leak. The fix was pushed to production. When I came back to work the next day, the memory leak was still there. What had happened is that there were actually two memory leaks, one of which was caused by laziness: my reproducer forced the guilty thunks, which masked that second leak, so I had only fixed the first one. I got burnt by the fact that the memory usage of applications with a lot of laziness changes when you observe them.

Lazy deserialisation

The flip side, though, is that if you don’t need a whole data structure, you won’t allocate the unnecessary parts without having to think about it. Where this shines the brightest, in my opinion, is in lazy deserialisation.

The scenario is: you get (from disk, from the network, …) a data structure in the form of a serialised byte-string, and you convert it to a linked data structure for manipulation within your application. If the data structure is lazy, you can arrange it so that forcing part of the data structure performs the relevant deserialisation (for instance, JSON can be deserialised lazily).

This is very nice because linked data structures weigh heavily on the garbage collector, which needs to traverse them again and again, while byte-strings are essentially free.

In this scenario you can, guilt-free, convert your byte-string to a value of your data structure. And deserialisation will happen implicitly on demand.
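
To sketch the idea (the Person type and the "name,age" wire format here are invented for illustration): because the fields are bound lazily, each field is only parsed when it is first forced.

import qualified Data.ByteString.Char8 as BS

data Person = Person { name :: String, age :: Int }

decodePerson :: BS.ByteString -> Person
decodePerson bytes = Person lazyName lazyAge
  where
    -- Nothing below runs until a field of the Person is demanded
    (rawName, rawAge) = BS.break (== ',') bytes

    lazyName = BS.unpack rawName

    lazyAge = read (BS.unpack (BS.drop 1 rawAge))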

This can be done in a strict language with a lazy data structure, but this requires more care and more boilerplate, so it can get in the way sometimes.

Conclusion

Comparing laziness and strictness, it’s difficult to find a clear winner. I could have given more points of comparisons (in fact, there are a few more in my talk), but the pattern continues. Of course, among the different advantages and disadvantages of laziness and strictness, there may be some which count more for you. This could make you take a side. But others will want to make a different trade-off.

From where I stand, laziness is not a defining feature of Haskell. In fact, Purescript, which is undeniably a Haskell dialect, is a strict language. Things like type classes, higher-order type quantification, and purity are much more relevant in distinguishing Haskell from the rest of the ML family of languages.

In my opinion, the main role that laziness has played for Haskell is making sure that it stays a pure language. At the time, we didn’t really know how to make pure languages, and a strict language would have likely eventually just added effects. You simply can’t do that with a lazy language, so Haskell was forced to be creative. But this is a thing of the past: we now know how to make pure languages (thanks to Haskell!), we don’t really need laziness anymore.

I honestly think that, nowadays, laziness is a hindrance. Whether you prefer laziness or strictness, I find it difficult to argue that the benefits of laziness are large enough to justify Haskell being the only lazy language out there. So much of the standard tooling assumes strictness (it’s a legitimate question to ask what it would mean to step through a Haskell program with gdb, but there is no doubt about what it means for Ocaml). So if you’re making a new language, I think that you should make it strict.


  1. As it happens, the faulty implementation of length can be found in base. It’s not exposed to the user but it’s used internally. It’s saved by the fact that it uses Int rather than Integer, and the strictness analysis automatically makes the go function strict. I honestly have no idea whether the author was conscious of the fact that they were leveraging the strictness analysis this way, or whether it’s another piece of evidence that it’s very easy to get wrong.

May 12, 2022 12:00 AM

May 11, 2022

Well-Typed.Com

Hasura and Well-Typed collaborate on Haskell tooling

editorial note: This is a cross-post of a post originally published on the Hasura blog.

Well-Typed and Hasura have been working together since 2020 to improve Haskell tooling for commercial Haskell users, taking advantage of Well-Typed’s expertise maintaining the Glasgow Haskell Compiler and Hasura’s experience using Haskell in production at scale. Over the last two years we have continued our productive relationship working on a wide variety of projects, in particular related to the profiling and debugging capabilities of the compiler, many of which have had a reinvention or spruce-up. In this post we’ll look at back at the progress we have made together.

Memory profiling and heap analysis

ghc-debug

One of the first big projects we worked on was ghc-debug, a new heap analysis tool that can gather detailed information about the heap of a running process or analyse a snapshot. This tool gives precise results so it can be used to reliably investigate memory usage issues, and we have used it numerous times to fix bugs in the GHC code base. Within Hasura we have used it to investigate fragmentation issues more closely and also to diagnose a critical memory leak regression before a release.

Since GHC 9.2, ghc-debug is supported natively in GHC. All the libraries and executables are on Hackage so it can be installed and used like any normal Haskell library.

Info table profiling

Also in GHC 9.2 we introduced “info table profiling” (or -hi profiling), a new heap profiling mode that analyses memory usage over time and relates it to source code locations. Crucially, it does not require introducing cost centres and recompiling with profiling enabled (which may distort performance). It works by storing a map from info tables to meta-information such as where it originated, what type it is and so on. The resulting profile can be viewed using eventlog2html to give a detailed table about the memory behaviour of each closure type over the course of a program.

We have used info table profiling extensively on GHC itself to find and resolve memory issues, meaning that GHC 9.2 and 9.4 bring significant reductions in compile-time memory usage.

Understanding memory fragmentation

Our early work with Hasura investigated why there was a large discrepancy between the memory usage reported by the operating system and by the Haskell runtime. The initial hypothesis was that, due to the extensive use of pinned bytestrings in Hasura’s code base, we were losing memory due to heap fragmentation.

We developed an understanding of how exactly fragmentation could occur on a Haskell heap, tooling to analyse the extent of fragmentation and ultimately some fixes to GHC’s memory allocation strategy to reduce fragmentation caused by short-lived bytestring allocations.

This investigation also led to a much deeper understanding of the memory retention behaviour of the GHC runtime and led to some additional improvements in how much memory the runtime will optimistically retain. For long-lived server applications the amount of memory used should return to a steady baseline after being idle for a long period.

This work also highlighted how other compilers trigger idle garbage collections. In particular, we may want to investigate triggering idle collections by allocation rate rather than simple idleness, as applications may still do a small amount of work in their idle periods.

Runtime performance profiling and monitoring

Late cost centre profiling

Cost centre profiling, the normal tool recommended for GHC users profiling their Haskell programs, allows recording both time/allocation and heap profiles. It requires compiling the project in profiling mode, which inserts cost centres into the compiled code. Traditionally, the issue with cost centre profiling has been that adding cost centres severely affects how your program is optimised. This means that the existing strategies for automatically inserting cost centres (such as -fprof-auto) can lead to major skew if they are inserted in an inopportune place.

We have implemented a new cost centre insertion mode in GHC 9.4, -fprof-late, which inserts cost centres after the optimiser has finished running. Therefore the cost centres will not affect how your code is optimised and the profile gives a more accurate view of how your unprofiled program would perform. The trade-off is that the names of the cost centres contain internal names, but they are nearly always easily understandable.

The utility of this mode cannot be overstated: you now get a very fine-grained profile that accurately reflects the actual runtime behaviour of your program. It’s made me start using the cost-centre profiler again!

We also developed a plugin which can be used to approximate this mode if you are using GHC 9.0 or 9.2.

Ticky-ticky profiling

Hasura have a suite of benchmarks that track different runtime metrics, such as bytes allocated.1 Investigating regressions in these benchmarks requires a profiling tool geared towards profiling allocations. GHC has long had support for ticky profiling, which gives a low level view about which functions are allocating. However, in the past ticky profiling has been used almost exclusively by GHC developers, not users, and profiles were only consumable in a rudimentary text-based format.

We added support to emit ticky samples via the eventlog in GHC 9.4, and support for rendering the information in the profile to an interactive HTML table using eventlog2html. In addition, we integrated the new info table mapping (as used by ghc-debug and -hi profiling) to give precise locations for each ticky counter, making it easier to interpret the profile.

Live profiling and monitoring via the eventlog

For a long time we have been interested in unifying GHC’s various profiling mechanisms via the eventlog, and making them easier to monitor. We developed a prototype live monitoring setup for Hasura, eventlog-live, that could attach to the eventlog and read events whilst the program was running. This prototype was subsequently extended thanks to funding from IOG.

Native Stack Pointer register

GHC-compiled programs use separate registers for the C stack and Haskell stack. One consequence of this is that native Linux debugging and statistical profiling tools (such as perf) see only the C stack pointer, and hence provide a very limited window into the behaviour of Haskell programs.

Hasura commissioned some experimental investigatory work to see whether it would be possible to use the native stack pointer register for the Haskell stack, and hence get more useful output from off-the-shelf debugging tools. Unfortunately we ran into issues getting perf to understand the debugging information generated by GHC, and there are challenges related to maintaining LLVM compatibility, but we remain interested in exploring this further.

Haskell Language Server

Lately we have started to support maintenance of the Haskell Language Server (HLS). The language server is now a key part of many developers’ workflows, so it is a priority to make sure it is kept up-to-date and works reliably, and sponsorship from companies like Hasura is crucial to enabling this.

Recently our work on HLS has included:

  • Supporting the GHC 9.2 release series, as Hasura were keen to upgrade and have access to all the improved functionality we discussed in this post.

  • Diagnosing and resolving difficult-to-reproduce segfaults experienced by HLS users. It turned out that the version compatibility checks were not strict enough, and HLS could load incompatible object files when running Template Haskell. In particular, you must build haskell-language-server with exactly the same version of GHC with which you compiled your package dependencies, so that object files for dependencies have the correct ABI.

  • Starting to take advantage of the recently completed support for Multiple Home Units in GHC to make HLS work more robustly for projects consisting of multiple components.

Conclusion

Well-Typed are grateful to Hasura for funding this work, as it will benefit the whole Haskell community. With their help we have made significant progress in the last two years improving debugging and profiling capabilities of the compiler, and improving the developer experience using HLS. We look forward to continuing our productive collaboration in the future.

As well as experimenting with all these tools on Hasura’s code base, we have also been using them to great effect on GHC’s code base, in order to reduce memory usage and increase performance of the compiler itself (e.g. by profiling GHC compiling Hasura’s graphql-engine). The new profiling tools have been useful in finding places to optimise: ghc-debug and -hi profiling made eliminating memory leaks straightforward, the late cost centre patch gives a great overview of where GHC spends time, and ticky profiling gives a low level overview of the allocations. They have also been very helpful for our work on improving HLS performance.

Well-Typed are actively looking for funding to continue maintaining and enhancing GHC and HLS. If your company relies on robust Haskell tooling, and you could support this work, or would like help improving the developer experience for your Haskell engineers, please get in touch with us via info@well-typed.com!


  1. The number of bytes allocated acts as a proxy for the amount of computation performed, since Haskell programs tend to allocate frequently, and allocations are more consistent than CPU or wall clock time.↩︎

by matthew at May 11, 2022 12:00 AM

May 09, 2022

Magnus Therning

Comments and org-static-blog

I'm using org-static-blog to generate the contents of this site. So far I'm very happy with it, but I've gotten a few emails from readers who've wanted to comment on something I've written and they always point out that it's not easy to do. It's actually not a coincidence that it's a bit difficult!

Yesterday I came up with a way that might make it slightly easier, without involving JavaScript from a 3rd party, by making use of the built-in support for adding HTML code for comments. One slight limitation is that it's a single variable holding the code, and I'd really like to allow for both

  • using a link to a discussion site, e.g. reddit, as well as
  • my email address

As the comment support in org-static-blog comes in the form of a single variable this seems a bit difficult to accomplish. However, it isn't difficult at all to do in elisp due to the power of advice-add.

By using the following advice on org-static-blog-publish-file

(advice-add 'org-static-blog-publish-file :around
            (lambda (orig-fn filename &rest args)
              (let*  ((comments-url (with-temp-buffer
                                      (insert-file-contents filename)
                                      (or (cadar (org-collect-keywords '("commentsurl")))
                                          my-blog-default-comments-url)))
                      (org-static-blog-post-comments (concat "Comment <a href=" comments-url ">here</a>.")))
                (apply orig-fn filename args))))

and defining my-blog-default-comments-url to be a mailto:... URL, I get a link to use for commenting by either

  1. setting commentsurl to point to discussion about the post on reddit, or
  2. not setting commentsurl at all and getting the mailto:... URL.

If you look at my previous post you see the result of the former, and if you look below you see the result of the latter.

May 09, 2022 08:10 PM

Gabriel Gonzalez

The golden rule of software distributions


This is a short post documenting a pattern I learned as a user and maintainer of software distributions. I wanted to share this pattern because the lesson was non-obvious to me in my early days as a software engineer.

I call this pattern the “golden rule of software distributions”, which I’ll define the verbose way followed by the concise way.

The verbose version of the golden rule of software distributions is:

If a package manager only permits installing or depending on one version of each package, then a software distribution for that package manager should bless one version of each package. The blessed version for each package must be compatible with the blessed version of every other package.

The concise version of the golden rule of software distributions is:

A locally coherent package manager requires a globally coherent software distribution.

… where:

  • “locally coherent” means that you can only install or depend on one version of each package for a given system or project

  • “globally coherent” means each package has a unique blessed version compatible with every other package’s blessed version

Note that any sufficiently large software distribution will not perfectly adhere to this golden rule. You should view this rule as an ideal that a software distribution aspires to approximate as closely as possible, but there will necessarily be cases where it cannot.

Motivation

I’ll introduce the term “build plan” to explain the motivation behind the golden rule:

A build plan for a package A specifies a version for each dependency of A such that A successfully builds against those dependencies.

To motivate the golden rule, let’s examine what happens when you have a locally coherent package manager but a globally incoherent software distribution:

  • Package users need to do a combinatorial search of their dependencies

    … in order to find a successful build plan. Specifically, they may need to test multiple major versions of their direct and indirect dependencies to find a permutation that successfully builds.

  • Compatible sets of packages become increasingly unlikely at scale

    The likelihood of finding a build plan rapidly diminishes as your dependency tree grows. Beyond a certain number of dependencies a build plan might not even exist, even if every dependency is maintained.

  • Package authors need to support multiple major versions of every dependency

    … in order to maximize the likelihood that downstream packages can find a successful build plan. Maintaining this backwards compatibility greatly increases their maintenance burden.

  • Package authors must test against multiple major versions of each dependency

    … in order to shield their users from build failures due to unexpected build plans. This means a large number of CI runs for every proposed change to the package, which slows down their development velocity.

  • Responsibility for fixing incompatibilities becomes diffuse

    Sometimes you need to depend on two packages (A and B) which transitively depend on incompatible versions of another package (C). Neither package A nor package B can be held responsible for fixing the problem unless there is a blessed version of package C.

These issues lead to a lot of wasted work, which scales exponentially with the number of dependencies. Consequently, software ecosystems that ignore the golden rule run into difficulties scaling their dependency trees, which people work around in the following ways:

  • Culturally discouraging dependencies

  • Vendoring dependencies within their projects

  • Gravitating towards large and monolithic dependencies / frameworks

The fundamental problem

The golden rule is necessary because build plans do not compose for a locally coherent package manager. In other words, if you have a working build plan for package A and another working build plan for package B, you cannot necessarily combine those two build plans to generate a working build plan for a package that depends on both A and B. In particular, you definitely cannot combine the two build plans if A and B depend on incompatible versions of another package C.
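
To make this concrete, here is a minimal sketch (a hypothetical model of my own, not any real package manager's API; it assumes the containers package) where a build plan pins exactly one version per package, so combining two plans fails precisely when they disagree on a shared dependency:

import           Data.Map (Map)
import qualified Data.Map as Map

type Package = String
type Version = Int

-- Local coherence: a build plan pins exactly one version of each package.
type BuildPlan = Map Package Version

-- Two plans combine only if they agree on every package they share.
combine :: BuildPlan -> BuildPlan -> Maybe BuildPlan
combine p q
  | and (Map.intersectionWith (==) p q) = Just (Map.union p q)
  | otherwise                           = Nothing

-- Each plan works on its own, but they pin incompatible versions of C,
-- so no plan exists for a package depending on both A and B.
planA, planB :: BuildPlan
planA = Map.fromList [("A", 1), ("C", 1)]
planB = Map.fromList [("B", 1), ("C", 2)]

main :: IO ()
main = print (combine planA planB)  -- Nothing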

However, build plans can also fail to compose for more subtle reasons. For example, you can depend on multiple packages whose build plans are all pairwise-compatible, but there still might not exist a build plan for the complete set of packages.

The good news is that you can trivially “weaken” a build plan, meaning that if you find a build plan that includes both packages A and B then you can downgrade that to a working build plan for just package A or just package B.

Consequently, the globally optimal thing to do is to find a working build plan that combines as many packages as possible, because then any subset of that build plan is still a working build plan. That ensures that any work spent fixing this larger build plan is not wasted and benefits everybody. Contrast that with work spent fixing the build for a single package (e.g. creating a lockfile), which does not benefit any other package (not even downstream packages, a.k.a. reverse dependencies).

Common sense?

Some people might view the golden rule of software distributions as common sense that doesn’t warrant a blog post, but my experiences with the Haskell ecosystem indicate otherwise. That’s because I began using Haskell seriously around 2011, four years before Stackage was first released.

Before Stackage, I ran into all of the problems described in the previous section because there was no blessed set of Haskell packages. In particular, the worst problem (for me) was the inability to find a working build plan for my projects.

This went on for years; basically everyone in the Haskell ecosystem (myself included) unthinkingly accepted this as the way things were supposed to work. When things went wrong we blamed Cabal (hence “Cabal hell”) for our problems, when the root of the problem had little to do with Cabal.

Stackage fixed all of that when Michael Snoyman essentially introduced the Haskell ecosystem to the golden rule of software distributions. Stackage works by publishing a blessed set of package versions for all of the packages it vets, and these packages are guaranteed to all build together. Periodically, Stackage publishes an updated set of blessed package versions.

After getting used to this, I quickly converted to this way of doing things, which seemed blindingly obvious in retrospect. Also, my professional career arc shifted towards DevOps, including managing upgrades and software distributions, and I discovered that this was a fairly common practice for most large software distributions.

Why this rule is not intuitive

Actually, this insight is not as obvious as people might think. In fact, a person with a superficial understanding of how software ecosystems work might suspect that the larger a software ecosystem grows the more incoherent things get. However, you actually encounter the opposite phenomenon in practice: the larger a software ecosystem gets the more coherent things get (by necessity).

In fact, I still see people argue against the global coherence of software ecosystems, which indicates to me that this isn’t universally received wisdom. Sometimes they argue against global coherence directly (they believe coherence imposes an undue burden on maintainers or users) or they argue against global coherence indirectly (by positing incoherence as a foundation of a larger architectural pattern). Either way, I strongly oppose global incoherence, both for the theoretical reasons outlined in this post and also based on my practical experience managing dependencies in the pre-Stackage days of the Haskell ecosystem.

Indeed, many of the people arguing against globally coherent software ecosystems are actually unwitting beneficiaries of global coherence. There is a massive amount of invisible work that goes on behind the scenes for every software distribution to create a globally coherent package set that benefits everybody (not just the users of those software distributions). For example, all software users benefit from the work that goes into maintaining the Debian, Arch, Nixpkgs, and Brew software distributions even if they don’t specifically use those software distributions or their associated package managers.

Conclusion

This whole post has one giant caveat, which is that all of the arguments assume that the package manager is locally coherent, which is not always the case! In fact, there’s a post that proves that local coherence can be undesirable, because it (essentially) makes dependency resolution NP-complete. For more details, see:

I personally have mixed views on whether local coherence is good or bad. Right now I’m slightly on team “local coherence is good”, but my opinions on that are not fully formed yet.

That said, most package managers tend to require or at least benefit from local coherence so in practice most software distributions also require global coherence. For example, Haskell’s build tooling basically requires global coherence (with some caveats I won’t go into), so global coherence is a good thing for the Haskell ecosystem.

by Gabriella Gonzalez (noreply@blogger.com) at May 09, 2022 01:54 PM

May 08, 2022

Magnus Therning

A little Haskell: epoch timestamp

The need to get the current UNIX time comes up every now and then. Just this week I needed it in order to add a k8s liveness probe1.

While it's often rather straightforward to get the Unix time as an integer in other languages2, in Haskell there's a bit of type tetris involved.

  1. getPOSIXTime gives me a POSIXTime, which is an alias for NominalDiffTime.
  2. NominalDiffTime implements RealFrac and can thus be converted to anything implementing Integral (I wanted it as Int64).
  3. NominalDiffTime also implements Num, so if the timestamp needs better precision than seconds it's easy to do (I needed milliseconds).

The combination of the above is something like

truncate <$> getPOSIXTime

In my case the full function, which writes the timestamp to a file, looks like this

writeTimestampFile :: MonadIO m => Path Abs File -> m ()
writeTimestampFile afn = liftIO $ do
    truncate @_ @Int64 . (* 1000) <$> getPOSIXTime >>= writeFile (fromAbsFile afn) . show
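
For reference, here is a self-contained sketch of the same idea without the path dependency (epochMillis is just a name I'm using for this example; only base and the time package are assumed):

{-# LANGUAGE TypeApplications #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.Int (Int64)
import Data.Time.Clock.POSIX (getPOSIXTime)

-- Current UNIX time in milliseconds, as an Int64.
epochMillis :: MonadIO m => m Int64
epochMillis = liftIO $ truncate @_ @Int64 . (* 1000) <$> getPOSIXTime

main :: IO ()
main = epochMillis >>= print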

Footnotes:

1

Over the last few days I've looked into k8s probes. Since we're using Istio, TCP probes are of very limited use, and as the service in question doesn't offer an HTTP API, I decided to use a liveness command that checks that a file contains a sufficiently recent epoch timestamp.

2

Rust's chrono package has Utc::now().timestamp(). Python has time.time(). Go has time.Now().Unix().

May 08, 2022 05:51 AM

May 05, 2022

Tweag I/O

Existential optics

Optics make it possible to conveniently access and modify data structures in an immutable, composable way. Thanks to that, they catch lots of attention from the functional programming community. Still, you can have a hard time understanding how they work just by looking at data declarations and type definitions.

In this post, I present a way of encoding optics that is different from the usual ones. This encoding, called existential optics, is easier to understand than the other encodings, since it makes the structure of each optic more explicit. It is not new, and it is well known in academic category theory circles. Still, these ideas do not seem to appear in optics libraries for languages like Haskell, Purescript, or Scala.

The best-known type of optic is the lens, which was also the first to be analyzed and used. We will use lenses as our recurring example to compare the several ways we have to encode optics.

Lenses 101

A Lens is effectively the immutable version of a getter/setter pair you will often find in object-oriented programming, and especially in Object-Relational mappers. It allows focusing on a component of a container data type.

For example, consider an Address record that contains a street field. A Lens Address Street allows us to retrieve the Street from an address.

streetLens :: Lens Address Street

view streetLens (Address { street = "Baker Street", number = "221B" })
-- "Baker Street"

We could reuse the same lens to update the street inside the address.

over streetLens toUpper (Address { street = "Baker Street", number = "221B" })
-- Address { street = "BAKER STREET", number = "221B" }

Optics generalize this pattern. Intuitively, while Lenses generalize the notion of a field, Prisms generalize the notion of a constructor, and Traversals generalize both to an arbitrary number of values.

Lenses, and optics in general, compose extremely well. For example, if we also had a User record containing an address field of type Address, we could easily access and modify the street field of the address.

addressLens :: Lens User Address

view (addressLens . streetLens)
  (User { address = Address { street = "Baker Street", number = "221B" }, name = "Sherlock" })
-- "Baker Street"

over (addressLens . streetLens) toUpper
  (User { address = Address { street = "Baker Street", number = "221B" }, name = "Sherlock" })
-- User { address = Address { street = "BAKER STREET", number = "221B" }, name = "Sherlock" }

Now it is time to ask ourselves how a Lens s a could be encoded. Next, I will present some possible alternatives for lenses and compare them with respect to the following aspects:

  • how easy it is to understand what the encoding actually describes;
  • how easily composition works;
  • how easily we can generalize the encoding to other optics.

Explicit encoding

The easiest option is to encode a Lens just as a getter and a setter

data Lens s a = Lens
  { get :: s -> a
  , set :: s -> a -> s
  }

This encoding is extremely easy to grasp, packing together exactly the API we would like to use. On the other hand, it is not immediately obvious how a Lens s a and a Lens a b compose to give a Lens s b. Moreover, this encoding turns out to be rather ad-hoc, and it is not immediately clear how to generalize it to other optics.
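
For what it's worth, composition can be written by hand against this encoding; a sketch (composeLens is my name, reusing the get and set fields from the declaration above) hints at why it doesn't come for free:

composeLens :: Lens s a -> Lens a b -> Lens s b
composeLens outer inner = Lens
  { get = get inner . get outer
  , set = \s b -> set outer s (set inner (get outer s) b)
  }

Note how the outer structure s has to be threaded through both fields by hand; the encodings below avoid exactly this plumbing.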

Van Laarhoven encoding

In 2009 Twan Van Laarhoven came up with a new encoding for lenses which is commonly used, for example by the lens library.

type Lens s a
  =  forall f. Functor f
  => (a -> f a) -> s -> f s

This says that a Lens s a allows us to lift a function a -> f a to a function s -> f s for any possible functor f. For a more in-depth explanation of the Van Laarhoven encoding, please refer to this talk by Simon Peyton Jones.

I would argue that it is harder to understand what this encoding describes. The functionality of the explicit encoding can be recovered with a wise choice of f. For example, when f = Identity we get (up to wrapping) a function (a -> a) -> (s -> s), which allows us to edit the content. Similarly, if we choose f = Const a, we get a function (a -> Const a a) -> s -> Const a s; applying this to the Const constructor (which is morally the identity) and unwrapping the result gives back our get :: s -> a function.
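
Spelled out, those two recoveries might look as follows (a sketch, repeating the alias above; Identity and Const come from base):

{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- f = Identity recovers the modifier.
over :: Lens s a -> (a -> a) -> s -> s
over l h = runIdentity . l (Identity . h)

-- f = Const a, applied to the Const constructor, recovers the getter.
view :: Lens s a -> s -> a
view l = getConst . l Const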

What we gain, though, is a massive improvement with respect to composability. Now, we can use just function composition, i.e. ., to compose lenses.

It also generalizes quite well to traversals, but not so well to prisms; see for example how the definition of Prism in the lens library differs.

Profunctor encoding

An encoding which is commonly used by newer optics libraries is the so-called profunctor encoding. The main idea is to quantify the encoding not over functors, but over profunctors.

type Lens s a
  =  forall p. Strong p
  => p a a -> p s s

In these terms, a Lens is a way to lift a value of type p a a to a value of type p s s for any Strong profunctor p (i.e. a profunctor which allows lifting values with (,)).
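
As a concrete sketch, a getter/setter pair determines such a lifting (lens is my name here; Strong and dimap come from the profunctors package):

{-# LANGUAGE RankNTypes #-}

import Data.Profunctor (Strong (..), dimap)

type Lens s a = forall p. Strong p => p a a -> p s s

-- Split s into its focus and its remainder, lift with first',
-- then put the pieces back together.
lens :: (s -> a) -> (s -> a -> s) -> Lens s a
lens get set = dimap (\s -> (get s, s)) (\(a, s) -> set s a) . first'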

I would argue that this encoding is even less immediate to understand than Van Laarhoven's. On the other hand, since we are still dealing with plain functions, we are still able to compose optics with function composition.

It also becomes extremely easy to generalize this encoding to other types of optics. The type of optic is determined by the constraint we have on the p type variable. In the case of Lens, we have Strong, but if we use just Profunctor, we get Isos, another type of optic which describes isomorphisms. If we use Choice, a typeclass which allows lifting values with Either, we get Prisms.

Now when we want to compose two optics of a different type, we just need to collect all the relevant constraints. For example, if we compose a Lens, which is constrained by Strong p, with a Prism, which is constrained by Choice p, we will get an optic constrained by (Strong p, Choice p).

In short, the profunctor encoding works extremely well with regard to compositionality, but it constrains us with an encoding that is not easy to understand. So the question now is: can we encode optics in another way, that is more expressive and easy to manipulate, possibly giving up a little bit of composability?

Existential encoding

Another equivalent way of expressing what a Lens is, uses the so-called existential encoding, described by Van Laarhoven himself.

data Lens s a
  = forall c. Lens (s -> (c, a)) ((c, a) -> s)

This says that a Lens s a is comprised of two functions: one from s to (c, a) and one back, where we can choose c arbitrarily. Since c appears only in the constructor and not as a parameter of the Lens type constructor, it is called existential.

Coming back to the example used above, with the existential encoding we can implement streetLens as follows:

streetLens :: Lens Address Street
streetLens = Lens f g
  where
    f :: Address -> (Number, Street)
    f address = (number address, street address)

    g :: (Number, Street) -> Address
    g (n, s) = Address { street = s, number = n }

Generally, when we know s and we know a, we can identify c as whatever is left over after removing an a from s, broadly speaking.

Easy to grasp

Another way to read this definition is:

A Lens s a is a proof that there exists a c such that s is isomorphic to (c, a).

This is an extremely explicit way to think about a Lens; it says that whenever we have a Lens, we could actually think about a tuple.

Easy to use

Thanks to the fact that we can easily understand what a Lens is with the existential encoding, it is also easy to understand how to define combinators for it. The idea is that we can deconstruct s into a pair (c, a) and then build it back.

For example, if we want to extract a from s, we simply deconstruct s into the pair (c, a), and then we project it into the second component.

view :: Lens s a -> s -> a
view (Lens f _) = snd . f

Similarly, if we want to lift a function h :: a -> a to a function s -> s, we can decompose s into (c, a), map h over the second component of the pair and then construct a new s from the old c and the new a.

over :: Lens s a -> (a -> a) -> s -> s
over (Lens f g) h = g . fmap h . f

Easy to generalize to other optics

When it comes to generalizing the existential encoding to other optics, it turns out that it is enough to replace the (,) data type with another type constructor of the same kind.

Prisms

For example, if we use Either instead of the (,) we had in the definition of a Lens, we get a Prism:

data Prism s a
  = forall c. Prism (s -> Either c a) (Either c a -> s)

This definition tells us that a Prism s a is just a proof that there exists a c such that s is isomorphic to Either c a.
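
For instance, Maybe a is isomorphic to Either () a, so the Just constructor gives rise to a Prism (a sketch, with the existential c instantiated to (); justPrism is my name):

justPrism :: Prism (Maybe a) a
justPrism = Prism f g
  where
    f (Just a) = Right a
    f Nothing  = Left ()

    g (Right a) = Just a
    g (Left ()) = Nothing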

With this definition it is easy to define the common operations used on a Prism. preview allows us to retrieve the focus if we are on the correct branch of the sum type; review instead allows us to construct an s from an a.

preview :: Prism s a -> s -> Maybe a
preview (Prism f _) = rightToMaybe . f
  where
    rightToMaybe = either (const Nothing) Just

review :: Prism s a -> a -> s
review (Prism _ g) = g . Right

General optics

Generally, an optic has the following shape:

data Optic f s a
  = forall c. Optic (s -> f c a) (f c a -> s)

This amounts to saying that Optic f s a is a proof that there exists a c such that s is isomorphic to f c a. I find this a really clear explanation of what an optic is and how to think about it. It is enough then to plug a concrete data type instead of f to get a concrete family of optics. For example:

type Lens = Optic (,)

type Prism = Optic Either

type Iso = Optic Tagged

where Tagged is the identity functor with an added phantom type.

We could also use other data types with the correct kind, like (->), Affine and PowerSeries, to obtain other optics like Grates, AffineTraversals and Traversals.

type Grate = Optic (->)

type AffineTraversal = Optic Affine

type Traversal = Optic PowerSeries

Composing existential optics

Lenses and optics are well known for how well they compose. Let’s see how the existential encoding behaves with respect to composition.

We can immediately observe that function composition does not work with the existential encoding, since we are not dealing with functions anymore.

Let’s consider first the case where we are composing two optics of the same type

compose :: Optic f s u -> Optic f u a -> Optic f s a

If we try to implement compose just following the types, we will probably arrive at

compose
  :: (forall x. Functor (f x))
  => Optic f s u -> Optic f u a -> Optic f s a
compose (Optic f g) (Optic h l)
  = Optic (_a . fmap h . f) (g . fmap l . _b)

We first deconstruct s into f c u, and then u into f c1 a, and we are left with two typed holes, _a and _b.

_a :: f c (f c1 a) -> f c0 a
_b :: f c0 a -> f c (f c1 a)

Existential associativity

To fill the holes we left, we introduce a type class. Note that c0 in _a and _b is existentially quantified, so we have the freedom to instantiate it however we like.

class ExistentiallyAssociative f where

  type E f a b

  existentialAssociateL :: f a (f b c) -> f (E f a b) c
  existentialAssociateR :: f (E f a b) c -> f a (f b c)

We use an associated type family E to be able to choose what c0 should be for the given f. Notice also how, when E f = f, this type class says that f is associative. And this is in fact what happens with data types such as (,) and Either. You can find instances for other data types here.
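
For example, the instance for (,) (a sketch, with E (,) a b = (a, b)) is just the reassociation of nested pairs:

instance ExistentiallyAssociative (,) where
  type E (,) a b = (a, b)

  existentialAssociateL (a, (b, c)) = ((a, b), c)
  existentialAssociateR ((a, b), c) = (a, (b, c))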

Now, thanks to this new type class, we can fill the holes we left and conclude the definition of compose

compose
  :: ( forall x. Functor (f x)
     , ExistentiallyAssociative f )
  => Optic f s u -> Optic f u a -> Optic f s a

Changing optic type

To compose optics of different types (i.e. defined for different fs), we need first to be able to change the type of an optic. This makes sense because, for example, an Iso can always be seen both as a Lens and as a Prism, and Lenses and Prisms can be seen as AffineTraversals.

What we would like to do is convert an Optic f s a into an Optic g s a. If we try to follow the types, we will probably end up with something like the following:

morph :: Optic f s a -> Optic g s a
morph (Optic h l) = Optic (_a . h) (l . _b)

_a :: f c a -> g c0 a
_b :: g c0 a -> f c a

where _a and _b are typed holes and c0 is existential, meaning that we can choose what it is.

Embedding

As we did for ExistentiallyAssociative, we are going to fill the holes by introducing a type class that provides exactly what we need. Similar to the previous case, we will use an associated type family M to be able to choose c0.

class Morph f g where

  type M f g c

  f2g :: f c a -> g (M f g c) a
  g2f :: g (M f g c) a -> f c a

This class describes how we can embed f into g choosing c0 appropriately.

For example, we can see Tagged (the identity functor with an added phantom type) as a pair (,), choosing the first component of the pair to be ().

instance Morph Tagged (,) where
  type M Tagged (,) c = ()

  f2g :: Tagged c a -> ((), a)
  f2g (Tagged a) = ((), a)

  g2f :: ((), a) -> Tagged c a
  g2f ((), a) = Tagged a

Similarly, we can see Tagged as an Either, choosing the left component to be Void. For more instances, take a look here.
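
Spelled out, that instance might look like this (a sketch; Void and absurd come from Data.Void):

import Data.Void (Void, absurd)

instance Morph Tagged Either where
  type M Tagged Either c = Void

  f2g :: Tagged c a -> Either Void a
  f2g (Tagged a) = Right a

  g2f :: Either Void a -> Tagged c a
  g2f (Right a) = Tagged a
  g2f (Left v) = absurd v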

This new type class allows us to complete the definition of morph we started above.

Composing optics of different type

To compose existential optics of different types, we now need to connect all the pieces we have. To compose an Optic f s u with an Optic g u a we first need to morph them both to a common optic type where we can compose them.

compose'
  :: ( ExistentiallyAssociative h
     , forall x. Functor (h x)
     , Morph f h
     , Morph g h )
  => Optic f s u
  -> Optic g u a
  -> Optic h s a
compose' opticF opticG
  = compose (morph opticF) (morph opticG)

Conclusion

The existential encoding for optics cannot compete with profunctor optics in terms of composability. On the other hand, it scores better on other aspects. In particular:

  • the definition of optics is easy to understand. This is a two-fold perk. On one hand, it makes teaching and learning optics easier. On the other, it makes implementing combinators and consuming the library easier, since the implementer deals only with simple data types.
  • it clarifies what an optic is. Being able to express an optic as a proof that an isomorphism exists gives us a clear picture of what an optic is. This helps to distinguish optics from other constructs, and possibly also to discover new optics.
  • the hierarchy between optics is explicit. It is completely captured by the Morph instances described above, which can be understood in terms of embedding one data type into another.

This blog post would not exist without the precious work of many other programmers and category theorists. Reading Bartosz Milewski’s optics-related work was a particular inspiration.

If you want to dive deeper into the code, you can find a sketch of the ideas explained in this post in this repository.

May 05, 2022 12:00 AM

May 04, 2022

Neil Mitchell

Working on build systems full-time at Meta

Summary: I joined Meta 2.5 years ago to work on build systems. I’m enjoying it.

I joined Meta over two years ago when an opportunity arose to work on build systems full time. I started the Shake build system at Standard Chartered over 10 years ago, and then wrote an open source version a few years later. Since then, I’ve always been dabbling in build systems, at both moderate and small scale. I really enjoyed writing the Build Systems a la Carte paper, and as a result, started to appreciate some of the Bazel and Buck design decisions. I was involved in the Bazel work at Digital Asset, and after that decided that there was still lots of work to be done on build systems. I did some work on Cloud Shake, but the fact that I wasn’t working on it every day, and that I wasn’t personally using it, made it hard to productionize. A former colleague now at Meta reached out and invited me for breakfast — one thing led to another, and I ended up at Meta working on build systems full time.

What I’ve learnt about build systems

The biggest challenge at Meta is the scale. When I joined they already used the Buck build system, which had been developed at Meta. Looking at the first milliseconds after a user starts an incremental build is illustrative:

  • With Shake, it starts the process, loads the database into memory, walks the entire graph calling stat on each input and runs any build actions.
  • With Buck, it connects to a running daemon, talks to a running file watcher (Watchman in the case of Buck) and uses reverse dependencies to jump to the running actions.

For Shake, on repos with 100K files, that process might take ~0.5s, but it is O(n). If you increase to 10M files, it takes 50s, and your users will revolt. With Buck, the overhead is proportional to the number of changed files, which is usually a handful.

While Shake is clearly infeasible at the scale of Meta, Buck was also starting to show its age, and I’ve been working with others to significantly improve Buck, borrowing lessons from everywhere, including Shake. Buck also addresses problems that Shake doesn’t, such as coping with multi-configuration builds (e.g. building for x86 and ARM simultaneously), having separate file and target namespaces, and making effective use of remote execution and caching.

We expect that the new version of Buck will be released open source soon, at which point I’ll definitely be talking more about the design and engineering trade-offs behind it.

What's different moving from finance to tech

My career to date has been in finance, so working at Meta is a very different world. Below are a few things that stand out (I believe most of these are common to other big tech companies too, but Meta is my first one).

Engineering career ladder: In finance the promotion path for a good engineer is to become a manager of engineers, then a manager of managers, and so on up. In my previous few roles I was indeed managing teams, which included setting technical direction and doing coding. At Meta, managers look after people, and help set the team direction. Engineers look after code and services, and set the technical direction. But importantly, you can be promoted as an engineer, without gaining direct reports, and the opportunities and compensation are equivalent to those for managers. There are definitely aspects of management that I like (e.g. mentoring, career growth, starting collaborations), and happily all of these are things engineers can still engage in.

Programmer centric culture: In finance the company is often built around traders and sales people. In tech, the company is built around programmers, which is visible in the culture. There are hardware vending machines, free food, free ice cream, minimal approvals. They’ve done a very good job of providing a relaxing and welcoming environment (with open plan offices, but I don’t mind that aspect). The one complaint I had was that Meta used to have a pretty poor work from home policy, but that’s now been completely rewritten and is now very good.

Reduced hierarchy: I think this may be more true of Meta than other tech, but there is very minimal hierarchy. Programmers are all just programmers, not “senior” or “junior”. I don’t have the power to tell anyone what to do, but in a slightly odd way, my manager doesn’t have that power either. If I want someone to tackle a bug, I have to justify that it is a worthwhile thing to do. One consequence of that is that the ability to form relationships and influence people is much more important. Another consequence that I didn’t foresee is that working with people in different teams is very similar to working with people in your team, since exactly the same skills apply. I can message any engineer at Meta, about random ideas and possible collaborations, and everyone is happy to talk.

Migration is harder: In previous places I worked, if we needed 100 files moved to a new version of a library, someone got told to do it, and they went away and spent a lot of time doing it. At Meta that’s a lot harder — firstly, it’s probably 100K files due to the larger scale, and secondly, telling someone they must do something is a lot less effective. That means there is a greater focus on automation (automatically editing the files), compatibility (doesn’t require editing the files) and benefits (ensuring that moving to the new version of the library will make your life better). All those are definitely better ways to tackle the problem, but sometimes, work must be done that is tedious and time consuming, and that is harder to make happen.

Open source: The process for open sourcing an internal library or tool in the developer infrastructure space is very smooth. The team I work in has open sourced the Starlark programming language (taking over maintenance from Google), the Gazebo Rust utility library and a Rust linter, plus we have a few more projects in the pipeline. As I write code in the internal Meta monorepo, it gets sync’d to GitHub a few minutes later. It’s also easy to contribute to open source projects, e.g. Meta engineers have contributed to my projects such as Hoogle (before I even considered joining Meta).

Hiring: Meta hires a lot of engineers (e.g. 1,000 additional people in London). That means that interviews are more like a production line, with a desire to have a repeatable process, where candidates are assigned teams after the interviews, rather than interviewing with a team. There are upsides and downsides to that—if I interview a strong candidate, it’s really sad to know that I probably won’t get to work closely with them. It also means that the interview process is determined centrally, so I can’t follow my preferences. But it does mean that if a friend is looking for a job there’s often something available for them (you can find details on compilers and programming here and a full list of jobs here), and a repeatable process is good for fairness.

Overall I’m certainly very happy to be working on build systems. The build system is the thing that stands between a user and trying out their changes, so anything I can do to make that process better benefits all developers. I’m very excited to share what I’ve been working on more widely in the near future!

(Disclosure: This blog post had to go through Meta internal review, because it’s an employee talking about Meta, but other than typos, came out unchanged.)

by Neil Mitchell (noreply@blogger.com) at May 04, 2022 03:16 PM

Gabriel Gonzalez

Why does Haskell's take function accept insufficient elements?

This post is a long-form response to a question on Reddit, which asked:

I just started learning haskell, and the take function confuses me.

e.g take 10 [1,2,3,4,5] will yield [1,2,3,4,5]

How does it not generate an error ?

… and I have enough to say on this subject that I thought it would warrant a blog post rather than just a comment reply.

The easiest way to answer this question is to walk through all the possible alternative implementations that can fail when not given enough elements.

Solution 0: Output a Maybe

The first thing we could try would be to wrap the result in a Maybe, like this:

safeTake :: Int -> [a] -> Maybe [a]
safeTake 0 _ = Just []
safeTake n [] = Nothing
safeTake n (x : xs) = fmap (x :) (safeTake (n - 1) xs)

>>> safeTake 3 [0..]
Just [0,1,2]

>>> safeTake 3 []
Nothing

The main deficiency with this approach is that it is insufficiently lazy. The result will not produce a single element of the output list until safeTake finishes consuming the required number of elements from the input list.

We can see the difference with the following examples:

>>> oops = 1 : 2 : error "Too short"

>>> take 1 (take 3 oops)
[1]

>>> safeTake 1 =<< safeTake 3 oops
*** Exception: Too short

Solution 1: Fail with error

Another approach would be to create a partial function that fails with an error if we run out of elements, like this:

partialTake :: Int -> [a] -> [a]
partialTake 0 _ = []
partialTake n (x : xs) = x : partialTake (n - 1) xs

>>> partialTake 3 [0..]
[0,1,2]

>>> partialTake 3 []
*** Exception: Test.hs:(7,1)-(8,51): Non-exhaustive patterns in function partialTake

>>> partialTake 1 (partialTake 3 oops)
[1]

Partial functions like these are undesirable, though, so we won’t go with that solution.

Solution 2: Use a custom list-like type

Okay, but what if we could store a value at the end of the list indicating whether or not the take succeeded? One way to do that would be to define an auxiliary type similar to a list, like this:

{-# LANGUAGE DeriveFoldable #-}

data ListAnd r a = Cons a (ListAnd r a) | Nil r deriving (Foldable, Show)

… where now the empty (Nil) constructor can store an auxiliary value. We can then use this auxiliary value to indicate to the user whether or not the function succeeded:

data Result = Sufficient | Insufficient deriving (Show)

takeAnd :: Int -> [a] -> ListAnd Result a
takeAnd 0 _ = Nil Sufficient
takeAnd n [] = Nil Insufficient
takeAnd n (x : xs) = Cons x (takeAnd (n - 1) xs)

>>> takeAnd 3 [0..]
Cons 0 (Cons 1 (Cons 2 (Nil Sufficient)))

>>> takeAnd 3 []
Nil Insufficient

Also, the ListAnd type derives Foldable, so we can recover the old behavior by converting the ListAnd Result a type into [a] using toList:

>>> import Data.Foldable (toList)

>>> toList (takeAnd 3 [0..])
[0,1,2]

>>> toList (takeAnd 3 [])
[]

>>> toList (takeAnd 1 (toList (takeAnd 3 oops)))
[1]

This is the first total function that has the desired laziness characteristics, but the downside is that the take function now has a much weirder type. Can we solve this only using existing types from base?

Solution 3: Return a pair

Well, what if we were to change the type of take to return a pair containing an ordinary list alongside a Result, like this:

takeWithResult :: Int -> [a] -> ([a], Result)
takeWithResult 0 _ = ([], Sufficient)
takeWithResult n [] = ([], Insufficient)
takeWithResult n (x : xs) = (x : ys, result)
  where
    (ys, result) = takeWithResult (n - 1) xs

>>> takeWithResult 3 [0..]
([0,1,2],Sufficient)

>>> takeWithResult 3 []
([],Insufficient)

Now we don’t need to add this weird ListAnd type to base, and we can recover the old behavior by post-processing the output using fst:

>>> fst (takeWithResult 3 [0..])
[0,1,2]

>>> fst (takeWithResult 3 [])
[]

… and this also has the right laziness characteristics:

>>> fst (takeWithResult 1 (fst (takeWithResult 3 oops)))
[1]

… and we can replace Result with a Bool if we want a solution that depends solely on types from base:

takeWithResult :: Int -> [a] -> ([a], Bool)
takeWithResult 0 _ = ([], True)
takeWithResult n [] = ([], False)
takeWithResult n (x : xs) = (x : ys, result)
  where
    (ys, result) = takeWithResult (n - 1) xs

However, even this solution is not completely satisfactory. There’s nothing that forces the user to check the Bool value before accessing the list, so this is not as safe as, say, the safeTake function which returns a Maybe. The Bool included in the result is more of an informational value rather than a safeguard.

Conclusion

So the long-winded answer to the original question is that there are several alternative ways we could implement take so that it fails when the input list is too small, but in my view each of them has its own limitations.

This is why I think Haskell’s current take function is probably the least worst of the alternatives, even if it’s not the safest possible implementation.

by Gabriella Gonzalez (noreply@blogger.com) at May 04, 2022 04:45 AM

May 03, 2022

Mark Jason Dominus

The disembodied heads of Oz

The Wonderful Wizard of Oz

Certainly the best-known and most memorable of the disembodied heads of Oz is the one that the Wizard himself uses when he first appears to Dorothy:

In the center of the chair was an enormous Head, without a body to support it or any arms or legs whatever. There was no hair upon this head, but it had eyes and a nose and mouth, and was much bigger than the head of the biggest giant.

As Dorothy gazed upon this in wonder and fear, the eyes turned slowly and looked at her sharply and steadily. Then the mouth moved, and Dorothy heard a voice say:

“I am Oz, the Great and Terrible. Who are you, and why do you seek me?”

Original illustration from _The Wonderful Wizard of Oz_. Dorothy, a small girl with pigtails, has her hands behind her back as she faces an enormous gem-studded throne, lit from offpanel by a ray of light. Hovering over the seat of the throne is a disembodied man's head, larger than Dorothy's whole body.  The head is completely bald, with a bulbous nose, protruding ears, and staring eyes with no visible eyelids.  Everything in the picture has been colored the same shade of emerald green, except for the head's eyes, which were left uncolored, emphasizing the staring.

Those Denslow illustrations are weird. I wonder if the series would have lasted as long as it did, if Denslow hadn't been replaced by John R. Neill in the sequel.

This head, we learn later, is only a trick:

He pointed to one corner, in which lay the Great Head, made out of many thicknesses of paper, and with a carefully painted face.

"This I hung from the ceiling by a wire," said Oz; "I stood behind the screen and pulled a thread, to make the eyes move and the mouth open."

The Wonderful Wizard of Oz has not one but two earlier disembodied heads, not fakes but violent decapitations. The first occurs offscreen, in the Tin Woodman's telling of how he came to be made of tin; I will discuss this later. The next to die is an unnamed wildcat that was chasing the queen of the field mice:

So the Woodman raised his axe, and as the Wildcat ran by he gave it a quick blow that cut the beast’s head clean off from its body, and it rolled over at his feet in two pieces.

Later, the Wicked Witch of the West sends a pack of forty wolves to kill the four travelers, but the Woodman kills them all, decapitating at least one:

As the leader of the wolves came on the Tin Woodman swung his arm and chopped the wolf's head from its body, so that it immediately died. As soon as he could raise his axe another wolf came up, and he also fell under the sharp edge of the Tin Woodman's weapon.

After the Witch is defeated, the travelers return to Oz, to demand their payment. The Scarecrow wants brains:

“Oh, yes; sit down in that chair, please,” replied Oz. “You must excuse me for taking your head off, but I shall have to do it in order to put your brains in their proper place.” … So the Wizard unfastened his head and emptied out the straw.

Original spot illustration from _The Wonderful Wizard of Oz_. The Scarecrow's body is sitting comfortably atop the letter ‘N’ that begins the chapter. It is holding a cane and wearing a black suit with a buttoned jacket. From its shoulders protrudes a forked stick.  At right, the Wizard is holding the Scarecrow's head.  The Wizard is a short bald man wearing a white lab coat over his striped trousers and fancy spotted waistcoat. He is looking with interest at the Scarecrow's head, with eyebrows raised.  The Scarecrow's head, clearly a stuffed sack with a face painted on it, is looking back cheerfully.  Locked around the Scarecrow's head are green-tinted spectacles.  The only colors in the illustration are two shades of green.

On the way to the palace of Glinda, the travelers pass through a forest whose inhabitants have been terrorized by a giant spider monster:

Its legs were quite as long as the tiger had said, and its body covered with coarse black hair. It had a great mouth, with a row of sharp teeth a foot long; but its head was joined to the pudgy body by a neck as slender as a wasp's waist. This gave the Lion a hint of the best way to attack the creature… with one blow of his heavy paw, all armed with sharp claws, he knocked the spider's head from its body.

That's the last decapitation in that book. Oh wait, not quite. They must first pass over the hill of the Hammer-Heads:

He was quite short and stout and had a big head, which was flat at the top and supported by a thick neck full of wrinkles. But he had no arms at all, and, seeing this, the Scarecrow did not fear that so helpless a creature could prevent them from climbing the hill.

It's not as easy as it looks:

As quick as lightning the man's head shot forward and his neck stretched out until the top of the head, where it was flat, struck the Scarecrow in the middle and sent him tumbling, over and over, down the hill. Almost as quickly as it came the head went back to the body, …

So not actually a disembodied head. The Hammer-Heads get only a Participation trophy.

Well! That gets us to the end of the first book. There are 13 more.

The Marvelous Land of Oz

One of the principal characters in this book is Jack Pumpkinhead, who is a magically animated wooden golem, with a carved pumpkin for a head.

Original spot illustration from _The Marvelous Land of Oz_. Tip, a young boy in a hat, jacket, and shorts, is sitting in a pumpkin patch, smiling and holding a knife.  He has just finished carving the head of Jack Pumpkinhead from a pumpkin about two feet in diameter.  The carved face has round eyes, a triangular nose, and a broad, toothless smile.

The head is not attached too well. Even before Jack is brought to life, his maker observes that the head is not firmly attached:

Tip also noticed that Jack's pumpkin head had twisted around until it faced his back; but this was easily remedied.

This is a recurring problem. Later on, the Sawhorse complains:

"Even your head won't stay straight, and you never can tell whether you are looking backwards or forwards!"

The imperfect attachment is inconvenient when Jack needs to flee:

Jack had ridden at this mad rate once before, so he devoted every effort to holding, with both hands, his pumpkin head upon its stick…

Unfortunately, he is not successful. The Sawhorse charges into a river:

The wooden body, with its gorgeous clothing, still sat upright upon the horse's back; but the pumpkin head was gone, and only the sharpened stick that served for a neck was visible.… Far out upon the waters [Tip] sighted the golden hue of the pumpkin, which gently bobbed up and down with the motion of the waves. At that moment it was quite out of Tip's reach, but after a time it floated nearer and still nearer until the boy was able to reach it with his pole and draw it to the shore. Then he brought it to the top of the bank, carefully wiped the water from its pumpkin face with his handkerchief, and ran with it to Jack and replaced the head upon the man's neck.

There are four illustrations of Jack with his head detached.

The Sawhorse leaps across the stream.  Tip, Jack, and the Scarecrow are tied to the sawhorse, and all four look very surprised. Jack's head is midair, being left behind. Rear view of the travelers, still astride the Sawhorse, in the stream.  Jack's head is bobbing in the stream a couple of feet behind them. The four travelers are on the opposite banks, still tied together, and dripping wet.  Jack's head, imperturbably cheerful as ever, is floating away. Tip is on his knees on the riverbank, looking worried, and reaching for Jack's head with a long stick.

The Sawhorse (who really is very disagreeable) has more complaints:

"I'll have nothing more to do with that Pumpkinhead," declared the Saw-Horse, viciously. "he loses his head too easily to suit me."

Jack is constantly worried about the perishability of his head:

“I am in constant terror of the day when I shall spoil."

"Nonsense!" said the Emperor — but in a kindly, sympathetic tone. "Do not, I beg of you, dampen today's sun with the showers of tomorrow. For before your head has time to spoil you can have it canned, and in that way it may be preserved indefinitely."

At one point he suggests using up a magical wish to prevent his head from spoiling.

The Woggle-Bug rather heartlessly observes that Jack's head is edible:

“I think that I could live for some time on Jack Pumpkinhead. Not that I prefer pumpkins for food; but I believe they are somewhat nutritious, and Jack's head is large and plump."

At one point, the Scarecrow is again disassembled:

Meanwhile the Scarecrow was taken apart and the painted sack that served him for a head was carefully laundered and restuffed with the brains originally given him by the great Wizard.

There is an illustration of this process, with the Scarecrow's trousers going through a large laundry-wringer; perhaps they sent his head through later.

The Gump's head has long branching antlers, an upturned nose, and a narrow beard.  Its neck is attached to some sort of wooden plaque that can be hung on the wall.

The protagonists need to escape house arrest in a palace, and they assemble a flying creature, which they bring to life with the same magical charm that animated Jack and the Sawhorse. For the creature's head:

The Woggle-Bug had taken from its position over the mantle-piece in the great hallway the head of a Gump. … The two sofas were now bound firmly together with ropes and clothes-lines, and then Nick Chopper fastened the Gump's head to one end.


Once brought to life, the Gump is extremely puzzled:

“The last thing I remember distinctly is walking through the forest and hearing a loud noise. Something probably killed me then, and it certainly ought to have been the end of me. Yet here I am, alive again, with four monstrous wings and a body which I venture to say would make any respectable animal or fowl weep with shame to own.”

The Gump assemblage, as seen in right profile.  The thing is made of two high-backed sofas tied together (only one is visible) with palm fronds for wings, a broom for a tail, and the former Gump's head attached to the end.  Tip is addressing the head with his finger extended for emphasis.

Flying in the Gump thing, the Woggle-Bug cautions Jack:

"Not unless you carelessly drop your head over the side," answered the Woggle-Bug. "In that event your head would no longer be a pumpkin, for it would become a squash."

and indeed, when the Gump crash-lands, Jack's head is again in peril:

Jack found his precious head resting on the soft breast of the Scarecrow, which made an excellent cushion…

Whew. But the peril isn't over; it must be protected from a flock of jackdaws, in an unusual double decapitation:

[The Scarecrow] commanded Tip to take off Jack's head and lie down with it in the bottom of the nest… Nick Chopper then took the Scarecrow to pieces (all except his head) and scattered the straw… completely covering their bodies.

Shortly after, Jack's head must be extricated from underneath the Gump's body, where it has rolled. And the jackdaws have angrily scattered all the Scarecrow's straw, leaving him nothing but his head:

"I really think we have escaped very nicely," remarked the Tin Woodman, in a tone of pride.

"Not so!" exclaimed a hollow voice.

At this they all turned in surprise to look at the Scarecrow's head, which lay at the back of the nest.

"I am completely ruined!" declared the Scarecrow…

They re-stuff the Scarecrow with banknotes.

At the end of the book, the Gump is again disassembled:

“Once I was a monarch of the forest, as my antlers fully prove; but now, in my present upholstered condition of servitude, I am compelled to fly through the air—my legs being of no use to me whatever. Therefore I beg to be dispersed."

So Ozma ordered the Gump taken apart. The antlered head was again hung over the mantle-piece in the hall…

It reminds me a bit of Dixie Flatline. I wonder if Baum was familiar with that episode? But unlike Dixie, the head lives on, as heads in Oz are wont to do:

You might think that was the end of the Gump; and so it was, as a flying-machine. But the head over the mantle-piece continued to talk whenever it took a notion to do so, and it frequently startled, with its abrupt questions, the people who waited in the hall for an audience with the Queen.

The Gump's head makes a brief reappearance in the fourth book, startling Dorothy with an abrupt question.

Ozma of Oz

Oz fans will have been anticipating this section, which is a highlight on any tour of the Disembodied Heads of Oz. For it features the Princess Langwidere:

Now I must explain to you that the Princess Langwidere had thirty heads—as many as there are days in the month.

I hope you're buckled up.

But of course she could only wear one of them at a time, because she had but one neck. These heads were kept in what she called her "cabinet," which was a beautiful dressing-room that lay just between Langwidere's sleeping-chamber and the mirrored sitting-room. Each head was in a separate cupboard lined with velvet. The cupboards ran all around the sides of the dressing-room, and had elaborately carved doors with gold numbers on the outside and jewelled-framed mirrors on the inside of them.

When the Princess got out of her crystal bed in the morning she went to her cabinet, opened one of the velvet-lined cupboards, and took the head it contained from its golden shelf. Then, by the aid of the mirror inside the open door, she put on the head—as neat and straight as could be—and afterward called her maids to robe her for the day. She always wore a simple white costume, that suited all the heads. For, being able to change her face whenever she liked, the Princess had no interest in wearing a variety of gowns, as have other ladies who are compelled to wear the same face constantly.

Princess Langwidere, a graceful woman in a flowing, short-sleeved gown, is facing away from us, looking in one of a row of numbered mirrors. She is holding her head in both hands, apparently lifting it off or putting it on, as it is several inches above the neck.  Both head and neck are cut off clean and straight, as if one had cut through a clay model with a wire.  The head is blonde, with the hair in a sophisticated updo. Evidently the mirrors are attached to cabinet doors, for the one to the left, numbered “18”, is open, and we can see another of Langwidere's heads within, resting on a stand at about chest height. It has black hair, falling in ringlets to the neck, and wears a large flower. It appears to be watching Langwidere with a grave expression.  The picture is colored in shades of grayish blue.

Oh, but it gets worse. Foreshadowing:

After handing head No. 9, which she had been wearing, to the maid, she took No. 17 from its shelf and fitted it to her neck. It had black hair and dark eyes and a lovely pearl-and-white complexion, and when Langwidere wore it she knew she was remarkably beautiful in appearance.

There was only one trouble with No. 17; the temper that went with it (and which was hidden somewhere under the glossy black hair) was fiery, harsh and haughty in the extreme, and it often led the Princess to do unpleasant things which she regretted when she came to wear her other heads.

Langwidere and Dorothy do not immediately hit it off. And then the meeting goes completely off the rails:

"You are rather attractive," said the lady, presently. "Not at all beautiful, you understand, but you have a certain style of prettiness that is different from that of any of my thirty heads. So I believe I'll take your head and give you No. 26 for it."

Dorothy refuses, and after a quarrel, the Princess imprisons her in a tower.

Ozma of Oz contains only this one head-related episode, but I think it surpasses the other books in the quality of the writing and the interest of the situation.

Dorothy and the Wizard in Oz

This loser of a book has no disembodied heads, only barely a threat of one. Eureka the Pink Kitten has been accused of eating one of the Wizard's tiny trained piglets.

[Ozma] was just about to order Eureka's head chopped off with the Tin Woodman's axe…

The Wizard does shoot a Gargoyle in the eye with his revolver, though.

The Road to Oz

In this volume the protagonists fall into the hands of the Scoodlers:

It had the form of a man, middle-sized and rather slender and graceful; but as it sat silent and motionless upon the peak they could see that its face was black as ink, and it wore a black cloth costume made like a union suit and fitting tight to its skin. …

The thing gave a jump and turned half around, sitting in the same place but with the other side of its body facing them. Instead of being black, it was now pure white, with a face like that of a clown in a circus and hair of a brilliant purple. The creature could bend either way, and its white toes now curled the same way the black ones on the other side had done.

"It has a face both front and back," whispered Dorothy, wonderingly; "only there's no back at all, but two fronts."

Okay, but I promised disembodied heads. The Scoodlers want to make the protagonists into soup. When Dorothy and the others try to leave, the Scoodlers drive them back:

Two of them picked their heads from their shoulders and hurled them at the shaggy man with such force that he fell over in a heap, greatly astonished. The two now ran forward with swift leaps, caught up their heads, and put them on again, after which they sprang back to their positions on the rocks.

The problem with this should be apparent.

Toto, a small black terrier, is running, carrying a scoodler's head in his mouth.  The scoodler's head is spherical, with messy, greasy-looking hair. In place of a nose it has a mark like the club in a deck of cards.  Its teeth are large and square and several are missing.  The scoodler's face is in an expression of exaggerated dismay.

The characters escape from their prison and, now on guard for flying heads, they deal with them more effectively than before:

The shaggy man turned around and faced his enemies, standing just outside the opening, and as fast as they threw their heads at him he caught them and tossed them into the black gulf below. …

Shaggy Man is standing on a narrow stone bridge over a deep gulf.  He faces a round, dark cave entrance from which bursts a profusion of flying scoodler heads, all grinning horribly, thrown by the massed scoodlers emerging from the cave mouth.  The Shaggy Man is catching two of the heads.  To his left, the already-caught heads are raining down into the distance.

They should have taken a hint from the Hammer-Heads, who clearly have the better strategy. If you're going to fling your head at trespassers, you should try to keep it attached somehow.

Presently every Scoodler of the lot had thrown its head, and every head was down in the deep gulf, and now the helpless bodies of the creatures were mixed together in the cave and wriggling around in a vain attempt to discover what had become of their heads. The shaggy man laughed and walked across the bridge to rejoin his companions.

Brutal.

A closeup of several falling scoodler heads, including that of the Scoodler Queen, wearing a crown.  The scoodlers all seem to be greatly dismayed.

That is the only episode of head-detachment that we actually see. The shaggy man and Button Bright have their heads changed into a donkey's head and a fox's head, respectively, but manage to keep them attached. Jack Pumpkinhead makes a return, to explain that he need not have worried about his head spoiling:

I've a new head, and this is the fourth one I've owned since Ozma first made me and brought me to life by sprinkling me with the Magic Powder."

"What became of the other heads, Jack?"

"They spoiled and I buried them, for they were not even fit for pies. Each time Ozma has carved me a new head just like the old one, and as my body is by far the largest part of me I am still Jack Pumpkinhead, no matter how often I change my upper end.

He now lives in a pumpkin field, so as to be assured of a ready supply of new heads.

The Emerald City of Oz

By this time Baum was getting tired of Oz, and it shows in the lack of decapitations in this tired book.

In one of the two parallel plots, the ambitious General Guph promises the Nome King that he will conquer Oz. Realizing that the Nome armies will be insufficient, he hires three groups of mercenaries. The first of these aren't quite headless, but:

These Whimsies were curious people who lived in a retired country of their own. They had large, strong bodies, but heads so small that they were no bigger than door-knobs. Of course, such tiny heads could not contain any great amount of brains, and the Whimsies were so ashamed of their personal appearance and lack of commonsense that they wore big heads, made of pasteboard, which they fastened over their own little heads.

Don't we all know someone like that?

To induce the Whimsies to fight for him, Guph promises:

"When we get our Magic Belt," he made reply, "our King, Roquat the Red, will use its power to give every Whimsie a natural head as big and fine as the false head he now wears. Then you will no longer be ashamed because your big strong bodies have such teenty-weenty heads."

The Whimsies hold a meeting and agree to help, except for one doubter:

But they threw him into the river for asking foolish questions, and laughed when the water ruined his pasteboard head before he could swim out again.


While Guph is thus engaged, Dorothy and her aunt and uncle are back in Oz sightseeing. One place they visit is the town of Fuddlecumjig. They startle the inhabitants, who are “made in a good many small pieces… they have a habit of falling apart and scattering themselves around…”

The travelers try to avoid startling the Fuddles, but they are unsuccessful, and enter a house whose floor is covered with little pieces of the Fuddles who live there.

On one [piece] which Dorothy held was an eye, which looked at her pleasantly but with an interested expression, as if it wondered what she was going to do with it. Quite near by she discovered and picked up a nose, and by matching the two pieces together found that they were part of a face.

"If I could find the mouth," she said, "this Fuddle might be able to talk, and tell us what to do next."

They do succeed in assembling the rest of the head, which has red hair:

"Look for a white shirt and a white apron," said the head which had been put together, speaking in a rather faint voice. "I'm the cook."

This is fortunate, since it is time for lunch.

Jack Pumpkinhead makes an appearance later, but his head stays on his body.

The Patchwork Girl of Oz

As far as I can tell, there are no decapitations in this book. The closest we come is an explanation of Jack Pumpkinhead's head-replacement process:

“Just now, I regret to say, my seeds are rattling a bit, so I must soon get another head."

"Oh; do you change your head?" asked Ojo.

"To be sure. Pumpkins are not permanent, more's the pity, and in time they spoil. That is why I grow such a great field of pumpkins — that I may select a new head whenever necessary."

"Who carves the faces on them?" inquired the boy.

"I do that myself. I lift off my old head, place it on a table before me, and use the face for a pattern to go by. Sometimes the faces I carve are better than others--more expressive and cheerful, you know--but I think they average very well."

Some people the protagonists meet in their travels use the Scarecrow as sports equipment, but his head remains attached to the rest of him.

Tik-Tok of Oz

This is a pretty good book, but there are no disembodied heads that I could find.

The Scarecrow of Oz

As you might guess from the title, the Scarecrow loses his head again. Twice.

Only a short time elapsed before a gray grasshopper with a wooden leg came hopping along and lit directly on the upturned face of the Scarecrow’s head.

A full-color illustration of the Scarecrow's head lying on the ground. The head is a stuffed sack with the opening loosely tied. The Scarecrow is smiling calmly (his smile is painted on) and is looking at the grasshopper that is perched on his (flat, painted-on) nose. The grasshopper, whose back leg is wooden, is looking down at the Scarecrow's right eye. Around the two are red wildflowers and stems of tall grass.

The Scarecrow and the grasshopper (who is Cap'n Bill, under an enchantment) have a philosophical conversation about whether the Scarecrow can be said to be alive, and a little later Trot comes by and reassembles the Scarecrow. Later he nearly loses it again:

… the people thought they would like him for their King. But the Scarecrow shook his head so vigorously that it became loose, and Trot had to pin it firmly to his body again.

The Scarecrow is not yet out of danger. In chapter 22 he falls into a waterfall and his straw is ruined. Cap'n Bill says:

“… the best thing for us to do is to empty out all his body an’ carry his head an’ clothes along the road till we come to a field or a house where we can get some fresh straw.”

A full-color illustration of Cap'n Bill holding the Scarecrow's head in his arms. The head is still smiling, but perhaps a little more wanly than usual. Cap'n Bill is an elderly man with a red nose and cheeks, bushy white eyebrows, sideburns, and a beard, but no mustache. He is wearing a fisherman's rain hat and a double-breasted blue overcoat. At the bottom of the picture are Trot, wearing the Scarecrow's boots on her hands, and Betsy, holding another one of the Scarecrow's garments.

This they do, with the disembodied head of the Scarecrow telling stories and giving walking directions.

Rinkitink in Oz

No actual heads are lost in the telling of this story. Prince Inga kills a giant monster by bashing it with an iron post, but its head (if it even has one; it's not clear) remains attached. Rinkitink sings a comic song about a man named Ned:

A red-headed man named Ned was dead;
Sing fiddle-cum-faddle-cum-fi-do!
In battle he had lost his head;
Sing fiddle-cum-faddle-cum-fi-do!
'Alas, poor Ned,' to him I said,
'How did you lose your head so red?'
Sing fiddle-cum-faddle-cum-fi-do!

But Ned does not actually appear in the story, and we only get to hear the first two verses of the song because Bilbil the goat interrupts and begs Rinkitink to stop.

Elsewhere, Nikobob the woodcutter faces a monster named Choggenmugger, hacks off its tongue with his axe, splits its jaw in two, and then chops it into small segments, “a task that proved not only easy but very agreeable”. But there is no explicit removal of its head and indeed, the text and the pictures imply that Choggenmugger is some sort of giant sausage and has no head to speak of.

The Lost Princess of Oz

No disembodied heads either. The nearest we come is:

At once there rose above the great wall a row of immense heads, all of which looked down at them as if to see who was intruding.

These heads, however, are merely the heads of giants peering over the wall.

Two books in a row with no disembodied heads. I am becoming discouraged. Perhaps this project is not worth finishing. Let's see, what is coming next?

Oh.

Right then…

The Tin Woodman of Oz

This is the mother lode of decapitations in Oz. As you may recall, in The Wonderful Wizard of Oz the Tin Woodman relates how he came to be made of tin. He dismembered himself with a cursed axe, and after amputating all four of his limbs, he had them replaced with tin prostheses:

The Wicked Witch then made the axe slip and cut off my head, and at first I thought that was the end of me. But the tinsmith happened to come along, and he made me a new head out of tin.

One would expect that they threw the old head into a dumpster. But no! In The Tin Woodman of Oz we learn that it is still hanging around:

The Tin Woodman had just noticed the cupboards and was curious to know what they contained, so he went to one of them and opened the door. There were shelves inside, and upon one of the shelves which was about on a level with his tin chin the Emperor discovered a Head—it looked like a doll's head, only it was larger, and he soon saw it was the Head of some person. It was facing the Tin Woodman and as the cupboard door swung back, the eyes of the Head slowly opened and looked at him. The Tin Woodman was not at all surprised, for in the Land of Oz one runs into magic at every turn.

"Dear me!" said the Tin Woodman, staring hard. "It seems as if I had met you, somewhere, before. Good morning, sir!"

"You have the advantage of me," replied the Head. "I never saw you before in my life."

This appears to be a draft version of the previous illustration, to which it is quite similar. It cuts off at the Tin Woodman's waist.  The Woodman is smiling broadly and holding his axe.  The head looks less surly but still not happy to meet the Woodman.

This creepy scene is more amusing than I remembered:

"Haven't you a name?"

"Oh, yes," said the Head; "I used to be called Nick Chopper, when I was a woodman and cut down trees for a living."

"Good gracious!" cried the Tin Woodman in astonishment. "If you are Nick Chopper's Head, then you are Me—or I'm You—or—or— What relation are we, anyhow?"

"Don't ask me," replied the Head. "For my part, I'm not anxious to claim relationship with any common, manufactured article, like you. You may be all right in your class, but your class isn't my class. You're tin."

Apparently Neill enjoyed this so much that he illustrated it twice, once as a full-page illustration and once as a spot illustration on the first page of the chapter:

Very similar to previous; it cuts off at the waist, and the Woodman is holding his axe and smiling broadly.  The head still appears annoyed at the intrusion, but less disagreeable.

The chapter, by the way, is titled “The Tin Woodman Talks to Himself”.

Later, we get the whole story from Ku-Klip, the tinsmith who originally assisted the amputated Tin Woodman. Ku-Klip explains how he used leftover pieces from the original bodies of both the Tin Woodman and the Tin Soldier (a completely superfluous character whose backstory is identical to the Woodman's) to make a single man, called Chopfyt:

"First, I pieced together a body, gluing it with the Witch's Magic Glue, which worked perfectly. That was the hardest part of my job, however, because the bodies didn't match up well and some parts were missing. But by using a piece of Captain Fyter here and a piece of Nick Chopper there, I finally got together a very decent body, with heart and all the trimmings complete."

Ku-Klip sits on a folding stool. He is a heavy man with a bald head, white whiskers almost down to the floor, and wire-rimmed spectacles. His sleeves are rolled up, and he is brushing glue onto a disembodied right arm, getting it ready to be attached to the half-built body of Chopfyt. The body is headless, and is missing its right arm and the lower half of the left arm. It has both legs, but they are clearly different lengths and the two knees are at different heights. By Ku-Klip's foot is a large jug labeled MEAT GLUE. On a nearby table, Captain Fyter's head is watching, waiting to be attached to Chopfyt's body.

The Tin Soldier is spared the shock of finding his own head in a closet, since Ku-Klip had used it in Chopfyt.

I'm sure you can guess where this is going.

The Tin Soldier is pointing in shock at Chopfyt, who is sitting in a chair, looking more annoyed than anything else. Chopfyt has reddish-brown hair, a scowling face, and a hooked nose. He is wearing rumpled brown clothes, and red shoes with large toes. His hands are clasped around his right knee. He appears to be very uncomfortable with the situation but determined to ride it out. The Tin Soldier looks just like the Tin Woodman, except with two rows of buttons down his cylindrical body, and instead of the funnel worn by the Tin Woodman he is wearing a tin shako.

Whew, that was quite a ride. Fortunately we are near the end and it is all downhill from here.

The Magic of Oz

This book centers around Kiki Aru, a grouchy Munchkin boy who discovers an extremely potent magical charm for transforming creatures. There are a great many transformations in the book, some quite peculiar and humiliating. The Wizard is turned into a fox and Dorothy into a lamb. Six monkeys are changed into giant soldiers. There is a long episode in which Trot and Cap'n Bill are trapped on an enchanted island, with roots growing out of their feet and into the ground. A giraffe has its tail bitten off, and there is the usual explanation about Jack Pumpkinhead's short shelf life. But I think everyone keeps their heads.

Glinda of Oz

There are no decapitations in this book, so we will have to settle for a consolation prize. The book's plot concerns the political economy of the Flatheads.

Dorothy knew at once why these mountain people were called Flatheads. Their heads were really flat on top, as if they had been cut off just above the eyes and ears.

A cheerful-looking Flathead out with his wife.  He is carrying a halberd, and she a parasol. They are smiling, and their arms are linked.

The Flatheads carry their brains in cans. This is problematic: an ambitious Flathead has made himself Supreme Dictator, and appropriated his enemies’ cans for himself.

The protagonists depose the Supreme Dictator, and Glinda arranges for each Flathead to keep their own brains in their own head where they can't be stolen, in a scene reminiscent of when the Scarecrow got his own brains, way back when.

Glinda is dumping the contents of a can of brains, which look something like vacuum cleaner lint, onto the flat head of a blissful-looking Flathead.

That concludes our tour of the Disembodied Heads of Oz. Thanks for going on this journey with me.

[ Previously, Frank Baum's uncomfortable relationship with Oz. Coming up eventually, an article on domestic violence in Oz. Yes, really ]

by Mark Dominus (mjd@plover.com) at May 03, 2022 08:17 PM

Matt Parsons

Moving the Programming Blog

I’m moving the programming stuff over to https://overcoming.software.

May 03, 2022 12:00 AM

Donnacha Oisín Kidney

Depth Comonads

Posted on May 3, 2022
Tags: Agda

I haven’t written much on this blog recently: since starting a PhD all of my writing output has gone towards paper drafts and similar things. Recently, though, I’ve been thinking about streams, monoids, and comonads, and I haven’t managed to wrangle those thoughts into something coherent enough for a paper. This blog post is a collection of those (pretty disorganised) thoughts. The hope is that writing them down will force me to clarify things, but consider this a warning that the rest of this post may well be muddled and confusing.

Streams

The first thing I want to talk about is streams.

record Stream (A : Type) : Type where
  coinductive
  field head : A
        tail : Stream A

This representation is coinductive: the type above contains infinite values. Agda, unlike Haskell, treats inductive and coinductive types differently (this is why we need the coinductive keyword in the definition). One of the differences is that it doesn’t check termination for construction of these values:
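
In Agda, the value can be defined one field at a time, along these lines (a sketch):

alternating : Stream Bool
alternating .head = true
alternating .tail .head = false
alternating .tail .tail = alternating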

alternating :: [Bool]
alternating = True : False : alternating

We have the equivalent in Haskell just above. We’re also using some fancy syntax for the Agda code: copatterns (Abel and Pientka 2013).

Note that this type is only definable in a language with some notion of laziness. If we tried to define a value like alternating above in OCaml we would loop. Haskell has no problem, and Agda—through its coinduction mechanism—can handle it as well.

Update 4-5-22: thanks to Arnaud Spiwack (@aspiwack) for correcting me on this, it turns out the definition of alternating above can be written in Ocaml, even without laziness. Apparently Ocaml has a facility for strict cyclic data structures. Also, I should be a little more precise with what I’m saying above: even without the extra facility for strict cycles, you can of course write a lazy list with some kind of lazy wrapper type.

There is, however, an isomorphic type that can be defined without coinduction: a plain function from ℕ.
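
Concretely, something like the following (a sketch, written to match the indexing functions used below):

ℕ-Stream : Type → Type
ℕ-Stream A = ℕ → A

ℕ-alternating : ℕ-Stream Bool
ℕ-alternating zero    = true
ℕ-alternating (suc n) = not (ℕ-alternating n)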

(notice that, in this form, the function ℕ-alternating is the same function as even : ℕ → Bool)

In fact, we can convert from the coinductive representation to the inductive one. This conversion function is more familiarly recognisable as the indexing function:

_[_] : Stream A → ℕ-Stream A
xs [ zero  ] = xs .head
xs [ suc n ] = xs .tail [ n ]

I’m not just handwaving when I say the two representations are isomorphic: we can prove this isomorphism, and, in Cubical Agda, we can use this to transport programs on one representation to the other.

Proof of isomorphism

tabulate : ℕ-Stream A → Stream A
tabulate xs .head = xs zero
tabulate xs .tail = tabulate (xs ∘ suc)

stream-rinv : (xs : Stream A) → tabulate (xs [_]) ≡ xs
stream-rinv xs i .head = xs .head
stream-rinv xs i .tail = stream-rinv (xs .tail) i

stream-linv : (xs : ℕ-Stream A) (n : ℕ) → tabulate xs [ n ] ≡ xs n
stream-linv xs zero    = refl
stream-linv xs (suc n) = stream-linv (xs ∘ suc) n

stream-reps : ℕ-Stream A ⇔ Stream A
stream-reps .fun = tabulate
stream-reps .inv = _[_]
stream-reps .rightInv = stream-rinv
stream-reps .leftInv xs = funExt (stream-linv xs)

One final observation about streams: another way to define a stream is as the cofree comonad of the identity functor.

record Cofree (F : Type → Type) (A : Type) : Type where
  coinductive
  field root : A
        step : F (Cofree F A)

𝒞-Stream : Type → Type
𝒞-Stream = Cofree id

Concretely, the Cofree F A type is a possibly infinite tree, with branches shaped like F, and internal nodes labelled with A. It has the following characteristic function:

{-# NON_TERMINATING #-}
trace : ⦃ _ : Functor 𝐹 ⦄ → (A → B) → (A → 𝐹 A) → A → Cofree 𝐹 B
trace ϕ ρ x .root = ϕ x
trace ϕ ρ x .step = map (trace ϕ ρ) (ρ x)

Like how the free monad turns any functor into a monad, the cofree comonad turns any functor into a comonad. Comonads are less popular and widely-used than monads, as there are fewer well-known examples of them. I have found it helpful to think about comonads through spatial analogies. A lot of comonads can represent a kind of walk through some space: the extract operation tells you “what is immediately here”, and the duplicate operation tells you “what can I see from each point”. For the stream, these two operations are inhabited by head and the following:

duplicate : Stream A → Stream (Stream A)
duplicate xs .head = xs
duplicate xs .tail = duplicate (xs .tail)

Generalising Streams

There were three key observations in the last section:

  1. Streams are coinductive. This requires a different termination checker in Agda, and a different evaluation model in strict languages.
  2. They have an isomorphic representation based on indexing. This isomorphic representation doesn’t need coinduction or laziness.
  3. They are a special case of the cofree comonad.

Going forward, we’re going to look at generalisations of streams, and we’re going to see what these observations mean in the contexts of the new generalisations.

The thing we’ll be generalising is the index of the stream. Currently, streams are basically structures that assign a value to every ℕ: what does a stream of—for instance—rational numbers look like? To drive the intuition for this generalisation let’s first look at the comonad instance on the ℕ-Stream type:

ℕ-extract : ℕ-Stream A → A
ℕ-extract xs = xs zero

ℕ-duplicate : ℕ-Stream A → ℕ-Stream (ℕ-Stream A)
ℕ-duplicate xs zero    = xs
ℕ-duplicate xs (suc n) = ℕ-duplicate (xs ∘ suc) n

This is the same instance as is on the Stream type, transported along the isomorphism between the two types (we could have transported the instance automatically, using subst or transport; I have written it out here manually in full for illustration purposes).

The ℕ-duplicate method here can be changed a little to reveal something interesting:

ℕ-duplicate₂ : ℕ-Stream A → ℕ-Stream (ℕ-Stream A)
ℕ-duplicate₂ xs zero    m = xs m
ℕ-duplicate₂ xs (suc n) m = ℕ-duplicate₂ (xs ∘ suc) n m

ℕ-duplicate₃ : ℕ-Stream A → ℕ-Stream (ℕ-Stream A)
ℕ-duplicate₃ xs n m = xs (go n m)
  where
  go : ℕ → ℕ → ℕ
  go zero    m = m
  go (suc n) m = suc (go n m)

ℕ-duplicate₄ : ℕ-Stream A → ℕ-Stream (ℕ-Stream A)
ℕ-duplicate₄ xs n m = xs (n + m)

In other words, duplicate basically adds indices.

There is something distinctly monoidal about what’s going on here: taking the (ℕ, +, 0) monoid as focus, the extract method above corresponds to the monoidal empty element, and the duplicate method corresponds to the binary operator on monoids. In actual fact, there is a comonad for any function from a monoid, often called the Traced comonad.

Traced : Type → Type → Type
Traced E A = E → A

extractᵀ : ⦃ _ : Monoid E ⦄ → Traced E A → A
extractᵀ xs = xs ε

duplicateᵀ : ⦃ _ : Monoid E ⦄ → Traced E A → Traced E (Traced E A)
duplicateᵀ xs e₁ e₂ = xs (e₁ ∙ e₂)

Reifying Traced

The second observation we made about streams was that they had an isomorphic representation which didn’t need coinduction. What we can see above, with Traced, is a representation that also doesn’t need coinduction. So what is the corresponding coinductive representation? What does a generalised reified stream look like?

So the first approach to reifying a function to a data structure is to simply represent the function as a list of pairs.

C-Traced : Type → Type → Type
C-Traced E A = Stream (E × A)

This representation obviously isn’t ideal: it isn’t possible to construct an isomorphism between C-Traced and Traced. We can—kind of—go in one direction, but even that function isn’t terminating:

{-# NON_TERMINATING #-}
lookup-env : ⦃ _ : IsDiscrete E ⦄ → C-Traced E A → Traced E A
lookup-env xs x = if does (x ≟ xs .head .fst)
                     then xs .head .snd
                     else lookup-env (xs .tail) x

I’m not too concerned with being fast and loose with termination and isomorphisms for the time being, though. At the moment, I’m just interested in exploring the relationship between streams and the indexing functions.

As a result, let’s try and push on this representation a little and see if it’s possible to get something interesting and almost isomorphic.

Segmented Streams

To get a slightly nicer representation we can exploit the monoid a little bit. We can do this by storing offsets instead of the absolute indices for each entry. The data structure I have in mind here looks a little like this:

┏━━━━━━━━━━┳━━━━━━┳━━━━━━┉
┃x         ┃y     ┃z     ┉
┡━━━━━━━━━━╇━━━━━━╇━━━━━━┉
╵⇤a╌╌╌╌╌╌╌⇥╵⇤b╌╌╌⇥╵⇤c╌╌╌╌┈

Above is a stream containing the values x, y, and z. Instead of each value corresponding to a single entry in the stream, however, they each correspond to a segment. The value x, for instance, labels the first segment in the stream, which has a length given by a. y labels the second segment, with length b, z with length c, and so on.

The Traced version of the above structure might be something like this:

str :: Traced m a
str i | i < a         = x
      | i < a + b     = y
      | i < a + b + c = z
      | ...

So the index-value mapping is also segmented. The stream, in this way, is kind of like a ruler, where different values mark out different quantities along the ruler, and the index function takes in a quantity and tells you which entry in the ruler that quantity corresponds to.

In code, we might represent the above data structure with the following type:

record Segments (E : Type) (A : Type) : Type where
  field
    length : E
    label  : A
    next   : Segments E A

open Segments

The question is, then, how do we convert this structure to an Traced representation?

Monuses

We need some extra operations on the monoid in the segments in order to enable this conversion to the Traced representation. The extra operations are encapsulated by the monus algebra: I wrote about this in the paper I submitted with Nicolas Wu to ICFP last year (2021). It’s a simple algebra on monoids which basically encapsulates monoids which are ordered in a sensible way.

The basic idea is that we construct an order on monoids which says “x is smaller than y if there is some z that we can add to x to get to y”.

_≼_ : ⦃ _ : Monoid A ⦄ → A → A → Type _
x ≼ y = ∃ z × (y ≡ x ∙ z)

A monus is a monoid where we can extract that z, when it exists. On the monoid (ℕ, +, 0), for instance, this order corresponds to the normal ordering on ℕ.

Extracting the z above corresponds to a kind of difference operator:

_∸_ : ℕ → ℕ → ℕ
x     ∸ zero  = x
suc x ∸ suc y = x ∸ y
_     ∸ _     = zero

This operator is sometimes called the monus. It is a kind of partial, or truncating, subtraction:

_ : 5 ∸ 2 ≡ 3
_ = refl

_ : 2 ∸ 5 ≡ 0
_ = refl

And, indeed, this operator “extracts” the z, when it exists.

∸‿is-monus : ∀ x y → (x≼y : x ≼ y) → y ∸ x ≡ fst x≼y
∸‿is-monus zero    _       (z , y≡0+z) = y≡0+z
∸‿is-monus (suc x) (suc y) (z , y≡x+z) = ∸‿is-monus x y (z , suc-inj y≡x+z)
∸‿is-monus (suc x) zero    (z , 0≡x+z) = ⊥-elim (zero≢suc 0≡x+z)
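
For intuition, the same algebra can be sketched as a Haskell class (hypothetical names; not the paper's formulation):

import Numeric.Natural (Natural)

-- A sketch: a monus is a monoid whose algebraic preorder is total and
-- antisymmetric, which lets us define a truncated subtraction.
class Monoid a => Monus a where
  monus :: a -> a -> a

newtype Plus = Plus Natural deriving (Show, Eq)

instance Semigroup Plus where Plus x <> Plus y = Plus (x + y)
instance Monoid    Plus where mempty = Plus 0

-- On (ℕ, +, 0) the monus is truncated subtraction.
instance Monus Plus where
  monus (Plus x) (Plus y) = Plus (if x >= y then x - y else 0)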

Our definition of a monus is simple: a monus is anything where the order ≼, sometimes called the “algebraic preorder”, is total and antisymmetric. This is precisely what lets us write a function which takes the Segments type and converts it back to the Traced type.

{-# NON_TERMINATING #-}
Segments→Traced : ⦃ _ : Monus E ⦄ → Segments E A → Traced E A
Segments→Traced xs i with xs .length ≤? i
... | yes (j , i≡xsₗ∙j) = Segments→Traced (xs .next) j
... | no  _             = xs .label

This function takes an index and checks whether it is greater than or equal to the length of the first segment in the stream of segments. If it is, it continues searching through the rest of the segments, with the index reduced by the size of that first segment. If not, it returns the label of the first segment.

Taking the old example, we are basically converting to ∸ from +:

str :: Traced m a
str i | i         < a = x
      | i ∸ a     < b = y
      | i ∸ a ∸ b < c = z
      | ...

The first issue here is that this definition is not terminating. That might seem an insurmountable problem at first—we are searching through an infinite stream, after all—but notice that there is one parameter which is decreasing on each recursive call: the index. Well, it only decreases if the segment is non-zero: this can be enforced by changing the definition of the segments type:

record ℱ-Segments (E : Type) ⦃ _ : Monus E ⦄ (A : Type) : Type where
  coinductive
  field
    label    : A
    length   : E
    length≢ε : length ≢ ε
    next     : ℱ-Segments E A

open ℱ-Segments

This type allows us to write the following definition:

module _ ⦃ _ : Monus E ⦄ (wf : WellFounded _≺_) where
  wf-index : ℱ-Segments E A → (i : E) → Acc _≺_ i → A
  wf-index xs i a with xs .length ≤? i
  ... | no _ = xs .label
  wf-index xs i (acc wf) | yes (j , i≡xsₗ∙j) =
    wf-index (xs .next) j (wf j (xs .length , i≡xsₗ∙j ; comm _ _ , xs .length≢ε))

  ℱ-Segments→Traced : ℱ-Segments E A → Traced E A
  ℱ-Segments→Traced xs i = wf-index xs i (wf i)

Trying to build an isomorphism

So the ℱ-Segments type is interesting, but it only really gives one side of the isomorphism. There is no way to write a function Traced E A → ℱ-Segments E A.

The problem is that there’s no way to get the “next” segment from a function E → A. We can find the label of the first segment, by applying the function to ε, but there’s no real way to figure out the size of this segment. We can change Traced a little to provide this size, though.

Ind : ∀ E → ⦃ _ : Monus E ⦄ → Type → Type
Ind E A = E → A × Σ[ length ⦂ E ] × (length ≢ ε)

This new type will return a tuple consisting of the value indicated by the supplied index, along with the distance to the next segment. For instance, on the example stream given in the diagram earlier, supplying an index i that is bigger than a but smaller than a + b, this function should return y along with some j such that i + j ≡ a + b. Diagrammatically:

╷⇤i╌╌╌╌╌╌╌╌⇥╷⇤j╌╌⇥╷
┢━━━━━━━━┳━━┷━━━━━╈━━━━━━┉
┃x       ┃y       ┃z     ┉
┡━━━━━━━━╇━━━━━━━━╇━━━━━━┉
╵⇤a╌╌╌╌╌⇥╵⇤b╌╌╌╌╌⇥╵⇤c╌╌╌╌┈

This can be implemented in code like so:

module _ ⦃ _ : Monus E ⦄ where
  wf-ind : ℱ-Segments E A → (i : E) → Acc _≺_ i → A × ∃ length × (length ≢ ε)
  wf-ind xs i _ with xs .length ≤? i
  ... | no xsₗ≰i =
    let j , _ , j≢ε = <⇒≺ i (xs .length) xsₗ≰i
    in xs .label , j , j≢ε
  wf-ind xs i (acc wf) | yes (j , i≡xsₗ∙j) =
    wf-ind (xs .next) j (wf j (xs .length , i≡xsₗ∙j ; comm _ _ , xs .length≢ε))

  ℱ-Segments→Ind : WellFounded _≺_ → ℱ-Segments E A → Ind E A
  ℱ-Segments→Ind wf xs i = wf-ind xs i (wf i)

Again, if the monus has finite descending chains, this function is terminating. And the nice thing about this is that it’s possible to write a function in the other direction:

Ind→ℱ-Segments : ⦃ _ : Monus E ⦄ → Ind E A → ℱ-Segments E A
Ind→ℱ-Segments ind =
  let x , s , s≢ε = ind ε
  in λ where .label    → x
             .length   → s
             .length≢ε → s≢ε
             .next     → Ind→ℱ-Segments (ind ∘ (s ∙_))

The problem here is that this isomorphism is only half correct. We can prove that converting to Ind and back is the identity, but not the other direction. There are too many functions in Ind.

Nonetheless, it’s still interesting!

State Comonad

There is a comonad on state (Waern 2018; Kmett 2018) that is different from store. Notice that the Ind type above has (almost) the same type as State E A.
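
To make the connection concrete, here is a Haskell sketch of the comonad structure those posts describe on s -> (s, a), for a monoid s (illustrative names):

newtype State s a = State { runState :: s -> (s, a) }

-- extract runs the computation starting from the empty state.
extractS :: Monoid s => State s a -> a
extractS (State f) = snd (f mempty)

-- duplicate records the state produced at this point, and gives the
-- inner computation access to everything "from here onwards".
duplicateS :: Monoid s => State s a -> State s (State s a)
duplicateS (State f) = State $ \s ->
  (fst (f s), State (\t -> f (s <> t)))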

This is interesting in two ways: first, it gives some concrete, spatial intuition for what’s going on with the state comonad.

Second, it gives a kind of interesting monad instance on the stream. If we apply the Ind→ℱ-Segments function to the implementation of join on state, we should get a join on ℱ-Segments. And we do!

First, we need to redefine Ind to the following:

𝒮-Ind : ∀ E → ⦃ _ : Monus E ⦄ → Type → Type
𝒮-Ind E A = (i : E) → A × Σ[ length ⦂ E ] × (i ≺ length)

This is actually isomorphic to the previous definition, but we return the absolute position of the next segment boundary, rather than the distance to it.

𝒮-iso : ⦃ _ : Monus E ⦄ → 𝒮-Ind E A ⇔ Ind E A
𝒮-iso .fun xs i =
  let x , s , k , s≡i∙k , k≢ε = xs i
  in  x , k , k≢ε
𝒮-iso .inv xs i =
  let x , s , s≢ε = xs i
  in  x , i ∙ s , s , refl , s≢ε
𝒮-iso .rightInv _ = refl
𝒮-iso .leftInv  xs p i =
  let x , s           , k , s≡i∙k                   , k≢ε = xs i
  in  x , s≡i∙k (~ p) , k , (λ q → s≡i∙k (~ p ∨ q)) , k≢ε

The implementation of join on this type is the following:

𝒮-join : ⦃ _ : Monus E ⦄ → 𝒮-Ind E (𝒮-Ind E A) → 𝒮-Ind E A
𝒮-join xs i =
  let x , j , i<j = xs i
      y , k , k<j = x j
  in  y , k , ≺-trans i<j k<j

This is the same definition of join as for State, modulo the < fiddling.

On a stream, this operation corresponds to taking a stream of streams and collapsing it to a single stream. It does this by taking a prefix of each internal stream equal in size to the segment of the outer entry. Diagrammatically:

┏━━━━━━━━━━┳━━━━━━┳━━━━━━┉
┃xs        ┃ys    ┃zs    ┉
┡━━━━━━━━━━╇━━━━━━╇━━━━━━┉
╵⇤a╌╌╌╌╌╌╌⇥╵⇤b╌╌╌⇥╵⇤c╌╌╌╌┈
          ╱        ╲
         ╱          ╲
        ╱            ╲
       ╱              ╲
      ╱                ╲
     ╷⇤b╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌⇥╷
     ┢━━━━━━━┳━━━━━━┳━━━┷┉
ys = ┃xʸ     ┃yʸ    ┃zʸ  ┉
     ┡━━━━━━━╇━━━━━━╇━━━━┉
     ╵⇤aʸ╌╌╌⇥╵⇤bʸ╌╌⇥╵⇤cʸ╌┈

Here we start with a stream consisting of the streams xs, ys, and zs, followed by some other streams. Zooming in on ys, we see that it is in a segment of length b, and consists of three values xʸ, yʸ, and zʸ, with segment lengths aʸ, bʸ, and cʸ, respectively.

Calling join on this stream will give us the following stream:

┏━┉━┳━━━━┳━━━━┳━━━━━┳━━━━┉
┃ ┉ ┃xʸ  ┃yʸ  ┃zʸ   ┃    ┉
┡━┉━╇━━━━╇━━━━╇━━━━━╇━━━━┉
│   │⇤aʸ⇥╵⇤bʸ⇥╵⇤╌╌┈⇥│
╵⇤a⇥╵⇤b╌╌╌╌╌╌╌╌╌╌╌╌⇥╵⇤c╌╌┈

Again, we’re focusing on the ys section here, which occupies the segment from a to a ∙ b. After join, this segment is occupied by three elements, xʸ, yʸ, and zʸ.

Notice that this isn’t quite the normal join on streams. That join takes a stream of streams, and turns the ith entry into the ith entry in the underlying stream. It’s a diagonalisation, in other words.

This one is kind of similar, but it takes chunks of the outer stream.

Theory

All of this so far is very hand-wavy. We have an almost isomorphism (a split surjection, to be precise), but not much in the way of concrete theoretical insights, just some vague gesturing towards spatial metaphors and so on.

Thankfully, there are two separate areas of more serious research that seem related to the stuff I’ve talked about here. The first is update monads and directed containers, and the second is graded comonads. I think I understand graded comonads and the related work better out of the two, but update monads and directed containers seem more closely related to what I’m doing here.

Update Monads and Directed Containers

There are a few papers on this topic: Ahman, Chapman, and Uustalu (2012), and Ahman and Uustalu (2013, 2014, 2016).

The first of these, “When Is a Container a Comonad?”, constructs, as the title suggests, a class for containers which are comonads in a standard way.

Here’s the definition of a container:

Container : Type₁
Container = Σ[ Shape ⦂ Type ] × (Shape → Type)

⟦_⟧ : Container → Type → Type
⟦ S , P ⟧ X = Σ[ s ⦂ S ] × (P s → X)

Containers are a generic way to describe a class of well-behaved functors. Any container is a pair of a shape and position. Lists, for instance, are containers, where their shape is described by the natural numbers (the shape here is the length of the list). The positions in such a list are the numbers smaller than the length, in dependently-typed programming we usually use the Fin type for this:

Fin : ℕ → Type
Fin n = ∃ m × (m <ℕ n)

The container version of lists, then, is the following:

ℒ𝒾𝓈𝓉 : Type → Type
ℒ𝒾𝓈𝓉 = ⟦ ℕ , Fin ⟧

Here’s the same list represented in the standard way, and as a container:
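
For instance, rendered schematically in Haskell (with Int standing in for both the shape ℕ and the positions Fin n):

-- The standard representation:
standard :: [Char]
standard = "abc"

-- The container form: a shape (the length) paired with a function
-- from positions to elements.
asContainer :: (Int, Int -> Char)
asContainer = (3, \i -> "abc" !! i)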

The benefit of using containers is that it gives a standard, generic, and composable way to construct functors that have some nice properties (like strict positivity). They’re pretty annoying to use in practice, though, which is a shame.

Directed containers are containers that have three extra operations.

  • A tail-like operation, where a position can be converted into the shape of the container that is the suffix from that position.
  • A head-like operation, where you can always return the root position.
  • A +-like operation, where you take a position on some tail and translate it into a position on the original container, by adding it.

As the paper observes, these are very similar to a “dependently-typed� version of the monoid methods. This seems to me to be very similar to the indexing stuff we were doing earlier on.

The really interesting part is in the paper “Update Monads: Cointerpreting Directed Containers” (Ahman and Uustalu 2014). This paper presents a variant on state monads, called “update monads”.

These are monads that use a monoid action:

record RightAction (𝑃 : Type) (𝑆 : Type) : Type where
  infixl 5 _↓_
  field
    ⦃ monoid⟨𝑃⟩ ⦄ : Monoid 𝑃
    _↓_ : 𝑆 → 𝑃 → 𝑆
    ↓-assoc : ∀ x y z → (x ↓ y) ↓ z ≡ x ↓ (y ∙ z)
    ↓-ε : ∀ x → x ↓ ε ≡ x

A (right) monoid action is a monoid along with a function ↓ that “acts” on some other set, in a way that coheres with the monoid methods. The definition is given above. One way to think about it is that if a monoid 𝑃 has an action on 𝑆, it means that elements of 𝑃 can, in a sense, be transformed into elements of 𝑆 → 𝑆.

This can be used to construct a monad that looks suspiciously like the state monad:

Upd : (𝑃 𝑆 : Type) ⦃ _ : RightAction 𝑃 𝑆 ⦄ → Type → Type
Upd 𝑃 𝑆 X = 𝑆 → 𝑃 × X

η : ⦃ _ : RightAction 𝑃 𝑆 ⦄ → A → Upd 𝑃 𝑆 A
η x s = ε , x

μ : ⦃ _ : RightAction 𝑃 𝑆 ⦄ → Upd 𝑃 𝑆 (Upd 𝑃 𝑆 A) → Upd 𝑃 𝑆 A
μ xs s = let p , x = xs s
             q , y = x (s ↓ p)
         in  (p ∙ q , y)

It turns out that the dependently-typed version of this gives directed containers.

Grading and the Cofree Comonad

I’m still in the early stages of understanding all of this material, but at the moment graded comonads and transformers are concepts that I’m much more familiar and comfortable with.

The idea behind graded monads and comonads is similar to the idea behind any indexed monad: we’re adding an extra type parameter to the monad or type, which can constrain the operations involved. The graded monads and comonads use a monoid as that index. This works particularly nicely, in my opinion: just allowing any index at all sometimes feels a little unstructured. The grading construction seems to constrain things to the right degree: the use of the monoid, as well, works really well with comonads.

That preamble out of the way, here’s the definition of a graded comonad:

record GradedComonad (𝑊 : Type) ⦃ _ : Monoid 𝑊 ⦄ (𝒞 : 𝑊 → Type → Type) : Type₁ where
  field
    extract : 𝒞 ε A → A
    extend  : (𝒞 y A → B) → 𝒞 (x ∙ y) A → 𝒞 x B

This also has a few laws, which are expressed more cleanly using cokleisli composition:

  _=<=_ : (𝒞 x B → C) → (𝒞 y A → B) → 𝒞 (x ∙ y) A → C
  (g =<= f) x = g (extend f x)

  field
    idˡ : (f : 𝒞 x A₀ → B₀) → PathP (λ i → 𝒞 (ε∙ x i) A₀ → B₀) (extract =<= f) f
    idʳ : (f : 𝒞 x A₀ → B₀) → PathP (λ i → 𝒞 (∙ε x i) A₀ → B₀) (f =<= extract) f
    c-assoc : (f : 𝒞 x C₀ → D₀) (g : 𝒞 y B₀ → C₀) (h : 𝒞 z A₀ → B₀) →
          PathP (λ i → 𝒞 (assoc x y z i) A₀ → D₀) ((f =<= g) =<= h) (f =<= (g =<= h))

This seems clearly related to the stream constructions. Grading is all about the monoidal information attached to a comonad: the streams above are a comonad which indexes its entries with a monoid.

There are now two constructions I want to show that suggest a link between the stream constructions and graded comonads. First of these is the Cofree degrading comonad:

record G-CofreeF (𝐹 : Type → Type) (𝒞 : 𝑊 → Type → Type) (A : Type) : Type where
  coinductive; constructor _◃_
  field here : A
        step : 𝐹 (∃ w × 𝒞 w (G-CofreeF 𝐹 𝒞 A))
open G-CofreeF

G-Cofree : ⦃ _ : Monoid 𝑊 ⦄ → (Type → Type) → (𝑊 → Type → Type) → Type → Type
G-Cofree 𝐹 𝒞 A = 𝒞 ε (G-CofreeF 𝐹 𝒞 A)

This construction is similar to the cofree comonad transformer: it is based on the cofree comonad, but with an extra (graded) comonad wrapped around each level. For any functor 𝐹 and graded comonad 𝒞, G-Cofree 𝐹 𝒞 is a comonad. The implementation of extract is simple:

extract′ : ⦃ _ : Monoid 𝑊 ⦄ ⦃ _ : GradedComonad 𝑊 𝒞 ⦄ → G-Cofree 𝐹 𝒞 A → A
extract′ = here ∘ extract

extend is more complex. First, we need a version of extend which takes a proof that the grade is of the right form:

module _ {𝒞 : 𝑊 → Type → Type} where
  extend[_] : ⦃ _ : Monoid 𝑊 ⦄ ⦃ _ : GradedComonad 𝑊 𝒞 ⦄ →
              x ∙ y ≡ z → (𝒞 y A → B) → 𝒞 z A → 𝒞 x B
  extend[ p ] k = subst (λ z → 𝒞 z _ → _) p (extend k)

Then we can implement the characteristic function on the free comonad: traceT. On graded comonads it has the following form:

module Trace ⦃ _ : Monoid 𝑊 ⦄ ⦃ _ : GradedComonad 𝑊 𝒞 ⦄ ⦃ _ : Functor 𝐹 ⦄ where
  module _ {A B} where
    {-# NON_TERMINATING #-}
    traceT : (𝒞 ε A → B) → (𝒞 ε A → 𝐹 (∃ w × 𝒞 w A)) → 𝒞 ε A → G-Cofree 𝐹 𝒞 B
    traceT ϕ ρ = ψ
      where
      ψ : 𝒞 x A → 𝒞 x (G-CofreeF 𝐹 𝒞 B)
      ψ = extend[ ∙ε _ ] λ x → ϕ x ◃ map (map₂ ψ) (ρ x)

This function is basically the unfold for the free degrading comonad. If G-Cofree is an internally-labelled tree, then ϕ above is the labelling function, and ρ is the “next” function, returning the children for some root.

Using this, we can implement extend:

  extend′ : (G-Cofree 𝐹 𝒞 A → B) → G-Cofree 𝐹 𝒞 A → G-Cofree 𝐹 𝒞 B
  extend′ f = traceT f (step ∘ extract)

The relation between this and the stream is that the stream can be defined in terms of this: Stream W = G-Cofree id (GC-Id W).

Finally, the last construction I want to introduce is the following:

module _ ⦃ _ : Monus 𝑊 ⦄ where
  data Prefix-F⊙ (𝐹 : Type → Type) (𝒞 : 𝑊 → Type → Type) (i j : 𝑊) (A : Type) : Type where
    prefix : ((i≤j : i ≤ j) → A × 𝐹 (∃ k × 𝒞 k (Prefix-F⊙ 𝐹 𝒞 k (fst i≤j) A))) → Prefix-F⊙ 𝐹 𝒞 i j A

  Prefix⊙ : (𝐹 : Type → Type) (𝒞 : 𝑊 → Type → Type) (j : 𝑊) (A : Type) → Type
  Prefix⊙ 𝐹 𝒞 j A = 𝒞 ε (Prefix-F⊙ 𝐹 𝒞 ε j A)

  Prefix : (𝐹 : Type → Type) (𝒞 : 𝑊 → Type → Type) (A : Type) → Type
  Prefix 𝐹 𝒞 A = ∀ {i} → Prefix⊙ 𝐹 𝒞 i A

This type is designed to mimic sized type definitions. It has an implicit parameter which can be set, by the user of the type, to some arbitrary depth. Basically the parameter means “explore to this depth”; by using the ∀ we say that it is defined up to any arbitrary depth.

When the ≺ relation on the monus is well founded it is possible to implement traceT:

  module _ ⦃ _ : GradedComonad 𝑊 𝒞 ⦄ ⦃ _ : Functor 𝐹 ⦄ (wf : WellFounded _≺_) {A B : Type} where
    traceT : (𝒞 ε A → B) → (𝒞 ε A → 𝐹 (∃ w × (w ≢ ε) × 𝒞 w A)) → 𝒞 ε A → Prefix 𝐹 𝒞 B
    traceT ϕ ρ xs = extend[ ∙ε _ ] (λ xs′ → prefix λ _ → ϕ xs′ , map (map₂ (ψ (wf _))) (ρ xs)) xs
      where
      ψ : Acc _≺_ y → (x ≢ ε) × 𝒞 x A → 𝒞 x (Prefix-F⊙ 𝐹 𝒞 x y B)
      ψ (acc wf) (x≢ε , xs) =
        extend[ ∙ε _ ]
          (λ x → prefix
            λ { (k , y≡x∙k) →
              ϕ x , map
                (λ { (w , w≢ε , xs) →
                  w , ψ (wf k (_ , y≡x∙k ; comm _ _ , x≢ε)) (w≢ε , xs)}) (ρ x)})
          xs

Conclusion

Comonads are much less widely used than monads in Haskell and similar languages. Part of the reason, I think, is that they’re too powerful in a non-linear language. Monads are often used to model sublanguages where it’s possible to introduce “special� variables which interact with the monadic context.

pyth = do
  x <- [1..10]
  y <- [1..10]
  z <- [1..10]
  guard (x*x + y*y == z*z)
  return (x,y,z)

The x variable here semantically spans over the range [1..10]. In the following two examples we see the semantics of state and maybe:

sum :: [Int] -> Int
sum xs = flip evalState 0 $ do
  put 0
  for_ xs $ \x -> do
    n <- get
    put (n + x)
  m <- get
  return m

data E = Lit Int | E :+: E | E :/: E

eval :: E -> Maybe Int
eval (Lit n) = Just n
eval (xs :+: ys) = do x <- eval xs
                      y <- eval ys
                      return (x + y)
eval (xs :/: ys) = do x <- eval xs
                      y <- eval ys
                      guard (y /= 0)
                      return (x `div` y)

The variables n and m introduced in the state example are “special” because their values depend on the computations that came before. In the maybe example, the computations introducing the variables could be Nothing.

You can’t do the same thing with comonads because you’re always able to extract the “special” variable with extract :: m a -> a. Instead of having special variable introduction, comonads let you have special variable elimination. But, since Haskell isn’t linear, you can always just discard a variable so this isn’t much use.

Looking at the maybe example, we have a function eval :: E -> Maybe Int that introduces an Int variable with a “catch”: it is wrapped in a Maybe. We want to use the eval function as if it were a normal function E -> Int, with all of the bookkeeping managed for us: that’s what monads and do notation (kind of) allow us to do.

An analogous example with comonads might be having a function consume :: m V -> String. This “handles” a V value, but the “catch” is that it needs an m context to do so. If we want to treat the consume function as if it were a normal function V -> String then comonads (and codo notation; Orchard and Mycroft 2013) would be a perfect fit.

The reason that this analogous case doesn’t arise very often is that we don’t have many handlers that look like m V -> String in Haskell. Why? Because if we want to “handle” a V we can just discard it: as a non-linear language, you do not need to perform any ceremony to discard a variable in Haskell.

Graded comonads, though, seem to be much more useful than normal comonads. I think it is because they basically get rid of the m a -> a function, changing it into a much more restricted form. In this way, they give a kind of small linear language, but just for the monoidal type parameter.

And there are a lot of uses for the graded comonads. Above we’ve used them for termination checking. A recursive function might have the form a -> b, where a is the thing being recursed on. If we’re using well-founded recursion to show that it’s terminating, though, we add an extra parameter, an Acc _<_ proof, turning this function into Acc _<_ w × a -> b. The Acc _<_ here is the graded comonad, and this recursive function is precisely the “handler”.

Other examples might be privacy or permissions: a function might be able to work on some value, but only if it has particular permission regarding that value. The permission here is the monoid.

There are other examples, I’m sure; those are just the couple that I have been thinking about.

References

Abel, Andreas, and Brigitte Pientka. 2013. “Wellfounded Recursion with Copatterns” (June). http://www2.tcs.ifi.lmu.de/%7Eabel/icfp13-long.pdf.

Ahman, Danel, James Chapman, and Tarmo Uustalu. 2012. “When Is a Container a Comonad?” In Foundations of Software Science and Computational Structures, 74–88. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg. doi:10.1007/978-3-642-28729-9_5.

Ahman, Danel, and Tarmo Uustalu. 2014. “Update Monads: Cointerpreting Directed Containers”: 23 pages. doi:10.4230/LIPICS.TYPES.2013.1.

———. 2013. “Distributive Laws of Directed Containers.” Progress in Informatics (10) (March): 3. doi:10.2201/NiiPi.2013.10.2.

———. 2016. “Directed Containers as Categories” (April). doi:10.4204/EPTCS.207.5.

Kidney, Donnacha Oisín, and Nicolas Wu. 2021. “Algebras for Weighted Search.” Proceedings of the ACM on Programming Languages 5 (ICFP) (August): 72:1–72:30. doi:10.1145/3473577.

Kmett, Edward. 2018. “The State Comonad.” Blog. The Comonad.Reader. http://comonad.com/reader/2018/the-state-comonad/.

Orchard, Dominic, and Alan Mycroft. 2013. “A Notation for Comonads.” In Implementation and Application of Functional Languages, ed. by Ralf Hinze, 1–17. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. doi:10.1007/978-3-642-41582-1_1.

Waern, Love. 2018. “I made a monad that I haven’t seen before, and I have a few questions about it.” Reddit Post. reddit.com/r/haskell. https://www.reddit.com/r/haskell/comments/7oav51/i_made_a_monad_that_i_havent_seen_before_and_i/.

by Donnacha Oisín Kidney at May 03, 2022 12:00 AM

May 02, 2022

ERDI Gergo

Cheap and cheerful microcode compression

This post is about an optimization to the Intel 8080-compatible CPU that I describe in detail in my book Retrocomputing in Clash. It didn't really fit anywhere in the book, and it isn't as closely related to the FPGA design focus of the book, so I thought writing it as a blog post would be a good idea.

Retrocomputing with Clash

Just like the real 8080 from 1974, my Clash implementation is microcoded: the semantics of each machine code instruction of the Intel 8080 is described as a sequence of steps, each step being the machine code instruction of an even simpler, internal micro-CPU. Each of these micro-instruction steps are then executed in exactly one clock cycle each.

My 8080 doesn't faithfully replicate the hardware 8080's micro-CPU; in fact, it doesn't replicate it at all. It is a from-scratch design based on a black-box understanding of the 8080's instruction set, and the main goal was to make it easy to understand, instead of making it efficient in terms of FPGA resource usage. Of course, since my micro-CPU is different, the micro-instructions have no one-to-one correspondence with the original Intel 8080's, and so the microcode is completely different as well.

In the CPU, after fetching the machine code instruction byte, we look up the microcode for that byte, and then execute it cycle by cycle until we're done. This post is about how to store that microcode efficiently.

An illustrative example

To avoid dealing with the low-level details of what exactly goes on in our microcode, for the rest of this blog post let's use a dictionary of a small handful of English words as our running example. Suppose that we want to store the following table:

0.  shape
1.  shaping
2.  shift
3.  shapeshifting
4.  ape
5.  aping
6.  ship
7.  shipping
8.  grape
9.  elope
10. shard
11. sharding
12. shared
13. geared
    

There's a lot of redundancy between these words, and we will see how to exploit that. But does this make it a poor example that won't generalize to our real use case of storing microcode? Not at all. There are lots of 8080 instructions that are just minimal variations of each other, such as doing the exact same operation but on different general purpose registers; thus, their microcode is also going to be very similar, doing the same setup/teardown around a different kernel.
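
For the sketches that follow, it is handy to have the table as a Haskell value (an illustrative helper; the key of each word is just its list position):

dictionary :: [String]
dictionary =
  [ "shape", "shaping", "shift", "shapeshifting", "ape", "aping"
  , "ship", "shipping", "grape", "elope", "shard", "sharding"
  , "shared", "geared" ]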

Fixed length vectors

Since our eventual goal is designing hardware, everything ultimately needs a fixed size. The most straightforward representation of our dictionary, then, is as a vector that is sized to fit the longest single word:

      type Dictionary = Vec 14 (Vec 13 Char)
    

The longest word "shapeshifting" is 13 characters. For all 14 possible inputs, we store 13 characters, using a special "early termination" marker like '.' in the middle for those words that are shorter:

0.  shape........
1.  shaping......
2.  shift........
3.  shapeshifting
4.  ape..........
5.  aping........
6.  ship.........
7.  shipping.....
8.  grape........
9.  elope........
10. shard........
11. sharding.....
12. shared.......
13. geared.......
    

We can then use this table very easily in a hardware implementation: after fetching the "instruction", i.e. the dictionary key, we look up the corresponding Vec 13 Char in the dictionary ROM, and keep a 4-bit counter of type Index 13 to process it cycle by cycle.
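
In software terms, the lookup amounts to something like this (a sketch, not the book's actual Clash code):

-- Each ROM row is padded to the maximum length with '.' markers.
type Row = String

-- Fetch the row for a key, then read it up to the early-termination marker.
fetch :: [Row] -> Int -> String
fetch rom key = takeWhile (/= '.') (rom !! key)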

This is the equivalent of the microcode representation that we use in Retrocomputing in Clash, but it is easy to see that it is very wasteful. In our illustrative example, we store a total of 14 ⨯ 13 = 182 characters, whereas the total length of all strings is only 85, so we waste about 55% of our storage.

On our 8080-compatible CPU we get similar (slightly worse) numbers: the longest instruction, XTHL, takes 18 cycles. We don't need to store microcode for the first cycle, since that always corresponds to just fetching the instruction byte itself. This leaves us with 17 micro-operations. For all 256 possible machine code instruction bytes, we end up storing a total of 256 ⨯ 17 = 4352 micro-operations, but if we look at the cycle count of each individual 8080 instruction, the useful part is only 1493 micro-operations. That's a waste of about 65%.

Using terminators

No, wait, not this guy.

The obvious way to cut down on some of that fat is to store each word only up to its end. We can use a terminated representation for this, by keeping some end-of-word marker ('.' in the examples below), and concatenating all items:

0.  shape.
6.  shaping.
14. shift.
20. shapeshifting.
34. ape.
38. aping.
44. ship.
49. shipping.
58. grape.
64. elope.
70. shard.
76. sharding.
85. shared.
92. geared.
    

Instead of storing 182 characters, we now only store 99. While this is still more than 85, because we also have to store all those word-separating '.' markers, it is still a big improvement.

However, there's a bit of cheating going on here, because with the above table as given, we'd have no way of looking up words by their original index. For example, word #7 is shipping, but if we started at entry number 7 in this representation, we'd get haping. We need to also store a table of contents that gives us the starting address of each dictionary entry:

0.  0
1.  6
2.  14
3.  20
4.  34
5.  38
6.  44
7.  49
8.  58
9.  64
10. 70
11. 76
12. 85
13. 92
    

If we want to calculate the contribution of the table of contents to the total size, we have to get a bit more precise. Previously, we characterized ROM footprint in units of characters, but now we need to store 7-bit indices as well. To be able to add the two together, we need to also fix the bit width of each character. For now, let's just use 8 bits per character.

The total size in bits, for storing the table of contents and the dictionary in terminated form, comes out to 14 ⨯ 7 + 99 ⨯ 8 = 890. We can compare this to the 14 ⨯ 13 ⨯ 8 = 1456 bits of the fixed-length representation to see that it's a huge improvement.
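
The construction itself fits in a few lines of Haskell (a sketch; termBits counts one 7-bit offset per key plus 8 bits per stored character):

-- Concatenate each word plus its '.' terminator; the running sum of the
-- lengths gives the starting offsets, i.e. the table of contents.
buildTerminated :: [String] -> ([Int], String)
buildTerminated ws = (init offsets, concat terminated)
  where
    terminated = map (++ ".") ws
    offsets    = scanl (+) 0 (map length terminated)

-- For the example dictionary: 14 * 7 + 99 * 8 = 890.
termBits :: [String] -> Int
termBits ws = 7 * length ws + 8 * length (snd (buildTerminated ws))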

Linked lists

Not this guy either.

As we've seen, the table of contents takes up 98 bits, or about 11% of our total footprint in the terminated representation. Can we get rid of it?

One way of doing this is to change the starting address of each word to its key. This is already the case for our first word, shape, since its key is 0 and it starts at address 0. However, the next word, shaping, can't start at address 1, since that is where the second letter of the first word resides.

If we store the next character's address instead of making the assumption that it's going to be the next address, we can start each word at the address corresponding to its key, and then leave subsequent letters to addresses beyond the largest key:

0.  s → @14
1.  s → @18
2.  s → @24
3.  s → @28
4.  a → @40
5.  a → @42
6.  s → @46
7.  s → @49
8.  g → @56
9.  e → @60 
10. s → @64
11. s → @68
12. s → @75 
13. g → @80
14. h → a → p → e → @85
18. h → a → p → i → n → g → @85
24. h → i → f → t → @85
28. h → a → p → e → s → h → i → f → t → i → n → g → @85
40. p → e → @85
42. p → i → n → g → @85
46. h → i → p → @85
49. h → i → p → p → i → n → g → @85
56. r → a → p → e → @85
60. l → o → p → e → @85
64. h → a → r → d → @85
68. h → a → r → d → i → n → g → @85
75. h → a → r → e → d → @85
80. e → a → r → e → d → @85
    

Not only did we get rid of the table of contents, we can also store the terminators more implicitly, by using a special value for the next pointer. In this example, we can use @85 for that purpose, pointing beyond the last cell. This leaves us with just 85 cells compared to the 99 cells with terminators.

However, each cell now contains both a character and a pointer. Since we need to address 85 cells, the latter takes up 7 bits, for a total of 85 ⨯ (8 + 7) = 1275 bits.
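
Reading a word back out of this representation can be sketched like so (illustrative Haskell, with the one-past-the-end address doubling as the terminator):

import Data.Array (Array, bounds, (!))

type Addr = Int

-- Each cell holds a character and the address of the next cell; any
-- address past the last cell ends the word.
readWord :: Array Addr (Char, Addr) -> Addr -> String
readWord cells a
  | a > hi    = []
  | otherwise = let (c, next) = cells ! a in c : readWord cells next
  where (_, hi) = bounds cells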

This is a step back from the terminated representation's 890 bits. We can make a note, though, that if each character were at least 36 bits wide instead of 8, the linked representation would come out ahead. But the real reason we are interested in the linked-list form is that it suggests a further optimization that finally exploits the redundancy between the words in our dictionary.

Common suffix elimination

This is where we get to the actual meat of this post. Let's focus on the following subset of our linked list representation:

12. s → @75 
13. g → @80
75. h → a → r → e → d → @85
80. e → a → r → e → d → @85
    

We are storing the shared suffix ared twice, when instead, we could redirect the second one to the first occurrence, saving 4 cells:

12. s → @75 
13. g → @80
75. h → @76
76. a → r → e → d → @85
80. e → @76
    

If we apply the same idea to all words, we arrive at the following linked representation. Note that we still start every word at the index corresponding to its key, avoiding the need for a table of contents:

0.  s → @14
1.  s → @15
2.  s → @16
3.  s → @20
4.  a → @32
5.  a → @34
6.  s → @35
7.  s → @38
8.  g → @41
9.  e → @42
10. s → @44
11. s → @48
12. s → @52
13. g → @56
14. h → @4
15. h → @5
16. h → i → f → t → @57
20. h → a → p → e → s → h → i → f → t → @29
29. i → n → g → @57
32. p → e → @57
34. p → @29
35. h → i → p → @57
38. h → i → p → @34
41. r → @4
42. l → o → @32
44. h → a → r → @47
47. d → @57
48. h → a → r → d → @29
52. h → @53
53. a → r → e → @47
56. e → @53
    

It's hard to see what exactly is going on here from this textual format, but things become much cleaner if we display it as a graph:

We can compute the size of this representation along the same lines as the linked-list one, except now we only have 57 cells. This also means that the pointers can be 6 bits instead of 7, for a total size of 57 ⨯ (8 + 6) = 798 bits. A 10% save compared to the 890 bits of the terminated representation!

Going back to our real-world use case of 8080 microcode, each micro-instruction is 15 bits wide. We have already computed that the fixed-length representation uses 4352 ⨯ 15 = 65,280 bits; if we do the same calculation for the other representations, we get 28,286 bits in the terminated representation, 37,492 bits with linked lists, and a mere 13,675 bits, that is, just 547 cells, with the common suffixes shared!

So how do we compute this shared-suffix representation? Luckily, it turns out we can do that in just a handful of lines of code.

Reverse trie representation

To come up with the basic idea, let's start by thinking about why we are aiming to share common suffixes instead of common prefixes, such as the prefix between shape and shaping. The answer, of course, is that we need to address each word separately. If we tried to unify shape and shaping, then starting from key 0 or 1 and getting as far as shap wouldn't tell us, on its own, whether to continue with e or with ing for the given key.

On the other hand, once we start a given word, it doesn't matter where subsequent letters are, including even the beginning of other words (as is the case between shaping and aping). So fan-out is bad (it would mean having to make a decision), but fan-in is a-OK.

There's an obvious data structure for exploiting common prefixes: we can put all our words in a trie. We can make one in Haskell by using a finite map from the next key element to the stored data (for terminal nodes) and the rest of the trie:

import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NE
import Data.Maybe (fromMaybe)
import qualified Data.Map as M

newtype Trie k a = MkTrie{ childrenMap :: M.Map k (Maybe a, Trie k a) }

children :: Trie k a -> [(k, Maybe a, Trie k a)]
children t = [(k, x, t') | (k, (x, t')) <- M.toList $ childrenMap t]
    

Note that a terminal node doesn't necessarily mean no children, since one full key may be a proper prefix of another key. For example, if we build a trie that stores "FOO" ↦ 1 and "FOOBAR" ↦ 2, then the node at 'F' → 'O' → 'O' will contain value Just 1 and also the child trie for 'B' → 'A' → 'R'.

The main operation on a Trie that we will need is building one from a list of (key, value) pairs via repeated insertion:

empty :: Trie k a
empty = MkTrie M.empty

insert :: (Ord k) => NonEmpty k -> a -> Trie k a -> Trie k a
insert ks x = insertOrUpdate (const x) ks

insertOrUpdate :: (Ord k) => (Maybe a -> a) -> NonEmpty k -> Trie k a -> Trie k a
insertOrUpdate f = go
  where
    go (k :| ks) (MkTrie ts) = MkTrie $ M.alter (Just . update . fromMaybe (Nothing, empty)) k ts
      where
        update (x, t) = case NE.nonEmpty ks of
            Nothing  -> (Just $ f x, t)
            Just ks' -> (x, go ks' t)

fromList :: (Ord k) => [(NonEmpty k, a)] -> Trie k a
fromList = foldr (uncurry insert) empty
    

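As a quick sanity check, here is a small lookup function (not part of the post's code, just a sketch that follows from the definitions above), together with the "FOO"/"FOOBAR" example from before:

-- Walk the trie along the key; at the end of the key, return the stored
-- value, if any.
lookupTrie :: (Ord k) => NonEmpty k -> Trie k a -> Maybe a
lookupTrie (k :| ks) (MkTrie ts) = do
    (x, t') <- M.lookup k ts
    case NE.nonEmpty ks of
        Nothing  -> x
        Just ks' -> lookupTrie ks' t'

λ> lookupTrie ('F' :| "OO") (fromList [('F' :| "OO", 1), ('F' :| "OOBAR", 2)])
Just 1
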
Ignoring the actual indices for a moment, looking at a small subset of our example consisting of only shape, shaping, and aping, the trie we'd build from it looks like this:

But we want to find common suffixes, not prefixes, so how does all this help us with that? Well, a suffix is just a prefix of the reversed sequence, so watch what happens when we build a trie after reversing each word, and lay it out left-to-right:

In this representation, terminal nodes correspond to starting letters of each word, so we can store the dictionary index as the value associated with them:

At this point, it should be clear how we are going to build our nicely compressed representation: we build a suffix trie, and then flatten it by traversing it bottom up. Before we do that, though, let's take care of one more subtlety: what if we have the same word at multiple indices in our dictionary? This is not an invalid case, and does come up in practice for the multiple NOP instructions of the 8080, all mapping to the exact same microcode. The solution is to simply allow a NonEmpty list of dictionary keys on terminal nodes in the resulting trie:

fromListMany :: (Ord k) => [(NonEmpty k, a)] -> Trie k (NonEmpty a)
fromListMany = foldr (\(ks, x) -> insertOrUpdate ((x :|) . maybe [] NE.toList) ks) empty

suffixTree :: (KnownNat n, Ord a) => Vec n (NonEmpty a) -> Trie a (NonEmpty (Index n))
suffixTree = fromListMany . toList . imap (\i word -> (NE.reverse word, i))
    

Flattening

The pipeline going from a suffix tree to flat ROM payload containing the linked-list representation has three steps:

  1. Allocate an address to each cell, and fill in the links. We can make our lives much easier by using Either (Index n) Int for the addresses: Left i is an index from the original dictionary, and Right ptr means it corresponds to the middle of a word.
  2. Renumber the addresses by reserving the first n addresses to the original indices (remember, this is how we avoid the need for a table of contents), and using the rest for the internal ones.
  3. After step two we have a single continuous address block of cells. All that remains to be done is reordering the elements, sorting each one into its own position.

compress :: forall n a. (KnownNat n, Ord a) => Vec n (NonEmpty a) -> [(a, Maybe Int)]
compress = reorder . renumber . links . suffixTree
  where
    reorder = map snd . sortBy (comparing fst)

    renumber xs = [ (flatten addr, (x, flatten <$> next)) | (addr, x, next) <- xs ]
      where
        offset = snatToNum (SNat @n)

        flatten (Left k) = fromIntegral k
        flatten (Right idx) = idx + offset
    

Since we don't know the full size of the resulting ROM upfront, we have to use Int as the final unified address type; this is not a problem in practice since for our real use case, all this microcode compression code runs at compile time via Template Haskell, so we can dynamically compute the smallest pointer type and just fromIntegral the link pointers into that.

We conclude the implementation with links, the function that computes the next pointers. The cell emitted for each trie node should link to the cell for its parent node, so we pass the parent down as we traverse the trie (this is the next parameter below). The new cell itself is either put in the next empty cell if it is not a terminal node, i.e. if it doesn't correspond to a first letter in our original dictionary; or, it and all its aliases are emitted at their corresponding Left addresses.

links :: Trie k (NonEmpty a) -> [(Either a Int, k, Maybe (Either a Int))]
links = execWriter . flip runStateT 0 . go Nothing
  where
    go next = mapM_ (node next) . children

    node next (k, mx, t') = do
        this <- case mx of
            Nothing -> Right <$> alloc
            Just (x:|xs) -> do
                tell [(Left x', k, next) | x' <- xs]
                return $ Left x
        tell [(this, k, next)]
        go (Just this) t'

    alloc = get <* modify succ
    

If you want to play around with this, you can find the full code on GitHub; in particular, there's Data.Trie for the no-frills trie implementation, and Hardware.Intel8080.Microcode.Compress implementing the microcode compression scheme described in this post.

And finally, just for the fun of it, this is what the 8080 microcode — that prompted all this — looks like in its suffix tree form, clearly showing the clusters of instructions that share the same epilogue:

May 02, 2022 07:05 PM

May 01, 2022

Gabriel Gonzalez

Introductory resources to type theory for language implementers

This post briefly tours resources that helped introduce me to type theory, because I’m frequently asked by others for resources on this subject (even though I never had a formal education in type theory). Specifically, these resources will focus more on how to implement a type checker or type inference algorithm.

Also, my post will be biased against books, because I don’t tend to learn well from reading books. That said, I will mention a few books that I’ve heard others recommend, even if I can’t personally vouch for them.

What worked for me

The first and most important resource that I found useful was this one:

The reason is that the paper shows logical notation side-by-side with Haskell code. That juxtaposition served as a sort of “Rosetta stone” for me to understand the correspondence between logical judgments and code. The paper also introduces some type theory basics (and dependent types!).

Along similar lines, another helpful resource was:

… which, as the name suggests, walks through a Haskell program to type-check Haskell code. This paper, along with the preceding one, helped bring type checkers “down to earth” for me by showing that there isn't any magic or secret sauce to implementing a type checker.

After that, the next thing that helped me improve my understanding was learning about pure type systems. Specifically, this paper was a very clear introduction to pure type systems:

You can think of pure type systems as sort of a “framework” for specifying type systems or talking about them. For example, the simply typed lambda calculus, System F, System FΩ, and the calculus of constructions are some example type systems that you’ll hear the literature refer to, and they’re all special cases of this general framework. You can think of pure type systems as generalizing the lambda cube.

However, none of the above resources introduce how to implement a type system with “good” type inference. To elaborate on that, many simple type systems can infer the types of program outputs from program inputs, but cannot work “in reverse” and infer the types of inputs from outputs. Hindley Milner type inference is one example of a “good” type inference algorithm that can work in reverse.

However, I never learned Hindley Milner type inference all that well, because I skipped straight to bidirectional type checking, which is described in this paper:

I prefer bidirectional type checking because (in my limited experience) it’s easier to extend the bidirectional type checking algorithm with new language features (or, at least, easier than extending Hindley Milner with the same language features).

The other reason I’m a fan of bidirectional type checking is that many cutting edge advances in research slot in well to a bidirectional type checker or even explicitly present their research using the framework of bidirectional type checking.

Books

Like I mentioned, I didn’t really learn that much from books, but here are some books that I see others commonly recommend, even if I can’t personally vouch for them:

Example code

I also created a tutorial implementation of a functional programming language that summarizes everything I know about programming language theory so far, which is my Fall-from-Grace project:

This project is a clean reference implementation of how to implement an interpreted language using (what I believe are) best practices in the Haskell ecosystem.

I also have a longer post explaining the motivation behind the above project:

Conclusion

Note that these are not the only resources that I learned from. This post only summarizes the seminal resources that greatly enhanced my understanding of all other resources.

Feel free to leave a comment if you have any other resources that you’d like to suggest that you feel were helpful in this regard.

by Gabriella Gonzalez (noreply@blogger.com) at May 01, 2022 03:57 PM

Philip Wadler

Object-Oriented Programming — The Trillion Dollar Disaster

Elixir engineer Ilya Suzdalnitski explains from the perspective of an engineer who has used both why he prefers functional programming to object-oriented programming systems. OOPS!

by Philip Wadler (noreply@blogger.com) at May 01, 2022 03:12 PM

GHC Developer Blog

GHC 9.4.1-alpha1 released

bgamari - 2022-05-01

The GHC developers are happy to announce the availability of the first alpha release of the GHC 9.4 series. Binary distributions, source distributions, and documentation are available at downloads.haskell.org.

This major release will include:

  • A new profiling mode, -fprof-late, which adds automatic cost-center annotations to all top-level functions after Core optimisation has run. This incurs significantly less performance cost while still providing informative profiles.

  • A variety of plugin improvements including the introduction of a new plugin type, defaulting plugins, and the ability for typechecking plugins to rewrite type-families.

  • An improved constructed product result analysis, allowing unboxing of nested structures, and a new boxity analysis, leading to less reboxing.

  • Introduction of a tag-check elision optimisation, bringing significant performance improvements in strict programs.

  • Generalisation of a variety of primitive types to be levity polymorphic. Consequently, the ArrayArray# type can at long last be retired, replaced by standard Array#.

  • Introduction of the \cases syntax from GHC proposal 0302

  • A complete overhaul of GHC’s Windows support. This includes a migration to a fully Clang-based C toolchain, a deep refactoring of the linker, and many fixes in WinIO.

  • Support for multiple home packages, significantly improving support in IDEs and other tools for multi-package projects.

  • A refactoring of GHC’s error message infrastructure, allowing GHC to provide diagnostic information to downstream consumers as structured data, greatly easing IDE support.

  • Significant compile-time improvements in both runtime and memory consumption.

  • … and much more

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Tweag I/O, Serokell, Equinix, SimSpace, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy testing,

  • Ben

by ghc-devs at May 01, 2022 12:00 AM

April 30, 2022

Ken T Takusagawa

[ljxgdqve] makeRegexOpts example

we demonstrate how to use the functions makeRegex and makeRegexOpts using the regex-tdfa Haskell regular expression package.

the key point is, you cannot use =~ if you want to use these functions.  if you do, for example:

bad :: String -> Bool;
bad s = s =~ (makeRegex "[[:digit:]]");

you will get inscrutable error messages:

* Ambiguous type variable `source0' arising from a use of `=~' prevents the constraint `(RegexMaker Regex CompOption ExecOption source0)' from being solved.

* Ambiguous type variables `source0', `compOpt0', `execOpt0' arising from a use of `makeRegex' prevents the constraint `(RegexMaker source0 compOpt0 execOpt0 [Char])' from being solved.

instead, you have to use matchTest or similar functions described in Text.Regex.Base.RegexLike in regex-base.  the functions are reexported by, but not documented in, Text.Regex.TDFA.

https://gabebw.com/blog/2015/10/11/regular-expressions-in-haskell is a good explanation.

below is an example program that searches case-insensitively for input lines that contain the substring "gold", equivalent to "grep -i gold".  we need to use makeRegexOpts to disable case sensitivity.

module Main where {
import qualified Text.Regex.TDFA as Regex;

main :: IO();
main = getContents >>= ( mapM_ putStrLn . filter myregex . lines);

myregex :: String -> Bool;
myregex s = Regex.matchTest r s where {
  r :: Regex.Regex;
  r = Regex.makeRegexOpts mycompoptions myexecoptions "gold" ;
  mycompoptions :: Regex.CompOption;
  mycompoptions = Regex.defaultCompOpt {Regex.caseSensitive = False}; -- record syntax
  myexecoptions :: Regex.ExecOption;
  myexecoptions = Regex.defaultExecOpt;
};
}

here is documentation about all the available ExecOption and CompOption for this TDFA regex implementation.

previously, on the lack of substitution in Haskell regexes.

by Unknown (noreply@blogger.com) at April 30, 2022 07:06 PM

April 28, 2022

Tweag I/O

Union and intersection contracts are hard, actually

Nickel, a configuration language that we are developing at Tweag, strives to provide first-class data validation. Nickel does so thanks to its contract system (although contracts are useful beyond data validation). While the whole story is more involved, contracts can be loosely seen as validation functions. They can be nicely combined using built-in constructors: one can form record contracts, array contracts, function contracts, and more.

A natural and useful addition that we soon considered is the pair of boolean combinators or and and: at first sight, implementing them for boolean predicates looks trivial. But contracts are not exactly predicates, and this subtle difference makes the implementation of general union (or) and intersection (and) contracts deceptively difficult. The existing literature, while keenly aware of the fact, doesn't really explain why. This post intends to explain and illustrate this why. You'll find a thorough exposition in our paper Union and intersection contracts are hard, actually, presented at DLS21.

Contracts

Constructors

The generic way to define a custom contract in Nickel is to write a function that takes the value to check (and a label that you can ignore for now), and either fails or returns the value. In principle, this is sufficient to define any contract we may need, but it's not always very ergonomic. Let's write such a contract for an array whose elements are records with a field foo. The value of each foo must also be a number greater than 10.

let Foos = fun label value =>
  if builtin.is_array value then
    value
    |> array.map (fun elem =>
      if builtin.is_record elem && record.fields elem == ["foo"] then
        if builtin.is_num elem.foo && elem.foo >= 10 then
          elem
        else
          contract.blame_with
            "a foo field is not a number greater than 10"
            label
      else
        contract.blame_with "an element is not a record with a foo field" label)

  else
    contract.blame_with "not an array" label in

[{foo = 20}, {foo = "30"}] | Foos

The pipe operator | is used to apply a contract to a value. Now compare the previous definition with a version using the built-in constructors:

let GreaterThan = fun bound =>
  contract.from_predicate (fun value => builtin.is_num value && value >= bound) in
let Foos = Array {foo | GreaterThan 10} in

[{foo = 20}, {foo = "30"}] | Foos

We’ve used the from_predicate, Array and record constructors to assemble the same contract in a more concise, more modular and clearer way. Because such constructors have built-in support, the error messages are also better localized [1].

Currently, Nickel features contract constructors corresponding to native values:

  • primitive contracts (numbers, strings, booleans)
  • record contracts
  • dictionary contracts
  • array contracts
  • function contracts

Unions and intersections

The existing constructors can get us quite far already, but some common contracts are still out of reach. Sometimes, we wish to apply several conditions to the same value. For example, ensuring that a field is not only a valid port number, but also a non-reserved port number (greater than 1023). There is currently no combinator to build this contract out of Port and GreaterThan. If we had intersections (written /\ thereafter), we could [2]:

let NonReservedPort = Port /\ GreaterThan 1023 in
{
  port | NonReservedPort
  ...
}

Unions would be very useful too. A ubiquitous example is nullable contracts, accepting a value that either satisfies some contract A, or can be null. This would simply be A \/ Null.

Beyond nullable values, we may allow a field to accept several alternative representations. For example, a date contract which accepts both a structured record or a valid ISO-8601 string. Once again, with unions, this contract is straightforward to write (assuming prior definitions of Iso8601Date, Day, and so on):

let Date = {day | Day, month | Month, year | Year} \/ Iso8601Date

Unions and intersections are hard

However appealing union and intersection contracts may be, they happen — perhaps surprisingly — to break fundamental properties of the core Nickel language. In the following, I only mention unions for simplicity, but all of the points made have a counterpart for the dual case of intersections.

Laziness

The crux of the issue concerns lazy contracts. By lazy, I mean that the checks embodied by such contracts don’t, and often can’t, fire right when the contract is first evaluated.

Eager contracts

Primitive contracts (Num, Bool and Str), and more generally any contract defined as a boolean predicate (e.g. using contract.from_predicate), are eager. When evaluating exp | Bool, the Bool contract checks the nature of exp and either fails immediately, or returns the boolean value of exp unchanged. The contract won’t interfere with the evaluation anymore.

Union of predicates can be trivially defined as the pointwise or operator ||:

P1 \/ ... \/ PN := fun value => P1 value || ... || PN value

Data contracts

On the other hand, datatype contracts like arrays and records are lazy. To understand why, consider, for example, Nixpkgs: it is a dictionary mapping packages to build recipes. That is, a massive dictionary of over 50,000 key-value pairs. It is absolutely out of the question to evaluate the entirety of this dictionary every time one needs to install 10 new packages: this would result in a painfully slow experience. Outside of Nix, one may want to query just a field or a subset of a configuration, without having to evaluate the whole thing.

To cope with such use-cases, Nickel has been made lazy. Expressions are only evaluated when needed, including the content of arrays and records. To preserve this capability, array and record contracts must be lazy as well. For if they were simple eager predicates, applying a top-level contract like nixpkgs | Packages would require the full evaluation of nixpkgs, in spite of the language's laziness.

Concretely, a record contract {foo | Str, bar | Num} will:

  • check that the value is a record with fields foo and bar. This part happens immediately.
  • lazily maps contracts Str and Num onto the inner fields. That is:

    {foo = 1 + 1, bar = 2} | {foo | Str, bar | Num}
    # evaluates to
    {foo = 1 + 1 | Str, bar = 2 | Num}

Here, the Str contract violation for 1 + 1 will only cause an error once foo is used or serialized to a configuration, but not when the contract is first evaluated. If foo is never used, the contract won’t fail the execution. Lazy contracts return the original value, but with delayed checks buried inside.

Union contracts as a side-effect

The problem with lazy contracts is that union contracts can’t know right away which branch of the union to take. This implies that the implementation of union requires a form of backtracking and exception-like control flow, making them effectful. Take the following example (assume Pos and Neg are contracts for positive and negative numbers):

let Contract = {foo | Pos, bar | Pos } \/ {foo | Neg, bar | Neg} in

let data | Contract = {
  foo = 0 + 5,
  bar = 0 - 7,
}

This contract should fail at some point, because foo and bar are neither both positive nor both negative. However, because of laziness, the union contract can't evaluate foo or bar right away. Hence, it doesn't know which branch of the union to try yet.

If we use data.foo later in the program, the union contract will evaluate the Neg contract, acknowledge its failure, and rule out the second branch {foo | Neg, bar | Neg}. We still can't tell whether bar satisfies Pos yet.

Symmetrically, if we rather use data.bar alone, we can only rule out the first branch of the union, because data.bar fails Pos.

Detecting the violation of the original contract is possible only once we have used both data.bar and data.foo in the same program. Now, imagine that data is defined in a library data.ncl. We import the library in two different files foo.ncl and bar.ncl, each using only one of the fields:

# foo.ncl
let data = import "data.ncl" in data.foo

# bar.ncl
let data = import "data.ncl" in data.bar

They run totally fine in isolation. Now, if in a third program we import and use both foo.ncl and bar.ncl:

let foo = import "foo.ncl" in
let bar = import "bar.ncl" in
foo + bar

The interpreter now reports a contract violation pointing to one of the imports! This is spooky action at a distance, that is, a side-effect.

  • For the programmer, side-effects are hard to reason about because they prevent local reasoning.
  • For the interpreter, side-effects inhibit many optimizations and program transformations. Nickel being lazy and pure, a lot of program optimizations can be applied unconditionally. Not so much once we add unions.

Unions are also complex to implement efficiently. Here, the union contract needs to maintain shared mutable state between all the use points of data. At each contract violation on data.foo or data.bar, we need to update the shared state, and use it to decide if we should actually raise an error. This also implies that not all contract failures immediately stop the execution anymore, which requires turning the simple bail-out semantics of contract.blame into a recoverable exception-like mechanism.

Lazy data contracts may look quite specific to Nickel. Alas, we can recast the same arguments and examples for function contracts, even in a strict (non-lazy) language. This is what we did in the paper. As first-class functions are a founding principle of functional programming, a contract system for functional languages without function contracts would be seriously impaired. Thus, any functional language with contracts faces difficulties when adding unions to the mix.

A way out

We’ve seen through an example why adding general union and intersection contracts to any contract system with lazy data contracts or function contracts incurs a prohibitive cost in complexity.

Those issues arise when one tries to implement unions that must work with arbitrary contracts. We already observed that a number of contracts are not lazy, such as predicates. And indeed, the union of predicates is trivial to implement. If Nickel could distinguish the contracts built from contract.from_predicate and remember their boolean definition, we could form the union of an arbitrary number of predicates together with one arbitrary contract. Just apply each predicate in order, and if one succeeds, return the value. If they all fail, apply the last contract.
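
To make this concrete, here is a minimal Haskell sketch of such a restricted union (my own illustration, not Nickel code): a contract is modelled as a function that either fails with a message or returns the value, and the union tries each predicate in order before deferring to one final arbitrary contract.

-- A contract either fails with an error message or passes the value through.
type Contract a = a -> Either String a

-- Union of any number of predicates with one last arbitrary contract:
-- if some predicate accepts the value, succeed immediately; otherwise,
-- let the final contract decide.
unionContract :: [a -> Bool] -> Contract a -> Contract a
unionContract preds fallback value
  | any ($ value) preds = Right value
  | otherwise           = fallback value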

Even for lazy contracts like records, a lot of unions are actually workable. Take for example {foo | Num} \/ {foo | Num, bar | Str}. Just looking at the shape of the records, we see that the right one has an additional field bar. Nickel could systematically generate a discriminating predicate based on the structure of the operands. Here, that would be record.has_field "bar". We can then implement the union easily, because we can decide right away which branch to try: if the predicate returns true, apply the right contract, otherwise apply the left one. Such a restricted union constructor would bail out on harder cases such as {foo | A} \/ {foo | B}, where there is no obvious discriminating predicate.

A similar analysis and strategy is implemented for unions of function contracts in the Racket language. Having such a theoretically restricted but practically useful union constructor is probably the road we are going to take for Nickel.


  1. Contract constructors are also helpful in interacting with the static side of Nickel's gradual type system, but this is orthogonal to the issues explored in this post.

  2. One can actually attach several contracts to a value, and this particular example is already possible to write as {port | Port | GreaterThan 1023}. However, this only works for the and combinator, and such an and isn't first-class: we can't write a contract for an array of non-reserved ports in a simple way.

April 28, 2022 12:00 AM

April 27, 2022

Philip Wadler

How to Speak

A master class from Patrick Winston of MIT on how to present ideas clearly. Chock-full of useful advice, much of which I've not seen elsewhere. Recommended.

by Philip Wadler (noreply@blogger.com) at April 27, 2022 04:46 PM

FP Complete

The Hidden Dangers of Haskell's Ratio Type

Here's a new Haskell WAT?!

Haskell has a type Rational for working with precisely-valued fractional numbers, and it models the mathematical concept of a rational number. Although it's relatively slow compared with Double, it doesn't suffer from the rounding that's intrinsic to floating-point arithmetic. It's very useful when writing tests because an exact result can be predicted ahead of time. For example, a computation that should produce zero will produce exactly zero rather than a small value within some range that would have to be determined.

Rational is actually a (monomorphic) specialization of the more general (polymorphic) type Ratio (from Data.Ratio). Ratio allows you to specify the underlying type used for the numerator and denominator. For example, to work with rational numbers using Int as the underlying type you can use Ratio Int. For the common case of using Integer as the underlying type, the type synonym Rational is provided:

type Rational = Ratio Integer

It's tempting to use Ratio with a fixed-width type like Int because Int is much faster than Integer. However, let's see what can happen if you do this:

λ> import Data.Int
λ> import Data.Ratio
λ> let r = 1 % 12 :: Rational   in r - r == 0
True
λ> let r = 1 % 12 :: Ratio Int8 in r - r == 0
False

WAT?!

Let's see what those subtracted values evaluate to:

λ> let r = 1 % 12 :: Rational   in r - r
0 % 1
λ> let r = 1 % 12 :: Ratio Int8 in r - r
0 % (-1)

Hmmm, let's see if that Ratio Int8 value is considered equal to 0:

λ> let r = 0 % (-1) :: Ratio Int8 in r == 0
True

WAT?!

Let's see what those manually-entered values are:

λ> 0 % (-1) :: Ratio Int8
0 % 1
λ> 0 :: Ratio Int8
0 % 1

OK, so these values really are equal, but why are the values in the subtraction different? The explanation is two-fold.

First, 0 % (-1) is a denormalized state for Ratio and shouldn't occur. (As you've probably suspected, it arises from integer overflow. More on that in a minute.) It's not too surprising, then, that it isn't equal to 0.

But why is it equal to 0 when we enter it directly? It's because % is a function, not a constructor, and it normalizes the signs of the numerator and denominator before constructing the value:

x % y = reduce (x * signum y) (abs y)

The underlying assumption (the invariant) is that denominators will always be positive.

reduce is a function that reduces the numerator and denominator to their lowest terms, by dividing by the greatest common divisor:

reduce x y = (x `quot` d) :% (y `quot` d)
  where d = gcd x y

Here you can see the constructor that actually creates the values from their components, which is :%. It's not exported from Data.Ratio and the "smart constructor" % is used instead, to ensure that new Ratio values always satisfy the invariant.
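
For example, constructing a value with a negative denominator through % normalizes the signs (a quick GHCi check):

λ> 1 % (-12) :: Rational
(-1) % 12
λ> (-3) % (-12) :: Rational
1 % 4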

Second, addition and subtraction are implemented without trying to minimize the possibility of integer overflow. For example:

(x :% y) - (x' :% y') = reduce (x * y' - x' * y) (y * y')

If y * y' overflows to a negative value, reduce will not normalize the signs. The result of gcd is always non-negative so the signs don't change and denormalized values are never renormalized. That happens only in % when constructing Ratio values.

Let's look at what happens in our example:

λ> x = 1; y = 12; x' = 1; y' = 12
λ> x * y' - x' * y :: Int8
0
λ> y * y' :: Int8
-112
λ> gcd 0 (-112)
112
λ> 0 `quot` 112
0
λ> (-112) `quot` 112
-1

The reduced result of 1 % 12 - 1 % 12 is therefore the denormalized value 0 :% (-1) which isn't considered equal to the normalized value 0 % 1.

Even though 12 is much less than maxBound :: Int8, when squared it results in integer overflow. The implementation of Num for Ratio is not designed to avoid overflows and they can happen very easily with numerators and denominators that are much less than the maxBound for the type.

The implementation could have used a slightly different approach:

(x :% y) - (x' :% y') = reduce (x * z' - x' * z) (y * z')
  where z = y `quot` d
        z' = y' `quot` d
        d = gcd y y'

However, the use of reduce is still necessary (consider 3 % 10 - 2 % 15) so this requires two more divisions and a gcd compared with the actual implementation.

Using a type as small as Int8 might seem a little unrealistic, but the problem can occur with any fixed-width integral type and I used Int8 for the illustration because it's easier to understand the problem when working with small values. I originally encountered it when using Ratio Int even though Int has a very large maxBound. I was writing property tests using QuickCheck for some polymorphic arithmetic code that was supposed to produce a zero sum as a result. The test succeeded with Rational and failed with Ratio Int and I couldn't understand why because the random values being generated by the test framework had numerators and denominators far less than maxBound :: Int. However, they were greater than its square root.
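
Here is a property along those lines, reconstructed as a minimal sketch (not the original test code), which passes for Rational but fails for Ratio Int8:

import Data.Int (Int8)
import Data.Ratio (Ratio, (%))
import Test.QuickCheck

-- Subtracting a rational value from itself should always give zero, but
-- overflow in the denominator product (y * y') breaks this for small
-- fixed-width types.
prop_subSelfIsZero :: Int8 -> Int8 -> Property
prop_subSelfIsZero n d =
  d /= 0 ==> let r = n % d :: Ratio Int8 in r - r == 0

-- λ> quickCheck prop_subSelfIsZero
-- fails, e.g. with n = 1 and d = 12, where r - r evaluates to 0 % (-1)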

The documentation for Ratio says:

Note that Ratio's instances inherit the deficiencies from the type parameter's. For example, Ratio Natural's Num instance has similar problems to Natural's.

However, that doesn't really prepare you for what might happen with other type parameters! The moral of this story is that Ratio isn't much use on its own and you should always use Rational unless you really understand what you're getting into.

Further reading

Like that blog post? Check out the Haskell section of our site with tutorials and other blog posts. You can also check out all Haskell tagged blog posts.

We're hiring. Interested in working with our team on solving these kinds of WAT issues? Check out our jobs page for more information.

April 27, 2022 12:00 AM

April 24, 2022

Gil Mizrahi

Building a bulletin board using twain and friends

This is a port of my previous scotty tutorial for the twain web (micro) framework.

We are going to build a very simple bulletin board website using twain and friends. It'll be so simple that we won't even use a database, but hopefully it'll provide enough information on twain that you can continue it yourselves if you'd like.

But first, we are going to cover some of the basics of web programming, what are WAI and warp, and how to use twain.

Web programming and twain

Twain is a (tiny) server-side web framework, which means it provides a high-level API for describing web apps.

Twain is built on top of WAI, which is a lower level Web Application Interface. Warp is a popular web server implementation that runs WAI apps (also called a WAI handler).

A web server is a network application that receives requests from clients, processes them, and returns responses. The communication between the web client and web server follows the HTTP protocol. The HTTP protocol defines what kind of requests a user can make, such as "I want to GET this file", and what kind of responses the server can return, such as "404 Not Found".

wai provides a fairly low-level mechanism for talking about requests and responses, and twain provides a somewhat more convenient mechanism than raw WAI for defining WAI apps. Warp takes descriptions of web programs written using WAI and provides the actual networking functionality, including concurrent processing.

If you are interested in working with wai directly, Michael Snoyman's video workshop Your First Web App with WAI and Warp is a good place to learn more about it.

How to Run

Twain (and more specifically WAI) apps have the type Application, which can be considered as a specification of a web application. Application is a type alias:

type Application
  = Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived

This type means that a WAI application is a function that takes a user's HTTP Request, and is expected to produce an HTTP Response to the user which it will pass to the function it got as a second argument (which is responsible for actually delivering the response, continuation passing style).
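
For example, here's a minimal Application written by hand, directly against wai (a sketch that is not part of this tutorial; it only assumes the wai and http-types packages):

{-# language OverloadedStrings #-}

import Network.Wai (Application, responseLBS)
import Network.HTTP.Types (status200)

-- | Respond to every request with a plain-text greeting.
helloApp :: Application
helloApp _request respond =
  respond (responseLBS status200 [("Content-Type", "text/plain")] "Hello!")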

In order to run this application, we need to pass this function to a WAI handler which will do all of the networking heavy lifting of actually:

  • Opening sockets on a certain port
  • Receiving messages from the socket
  • Handling threading
  • Sending responses through the socket

...and so on. This is where warp comes in.

Once we have a WAI app, we can run it on a certain port using the run function from the warp package. This function will handle the heavy lifting and will call our web app when a Request from a user comes in, asking us for a Response.

Building a WAI Application with Twain

To build an Application with wai, one needs to take a Request and produce a Response. While this is fairly straightforward in principle, parsing a request and composing a response can be a bit tedious and repetitive. For requests, we need to branch over the HTTP method, parse the route, extract the variable parts and branch on them, and so on. For responses, we need to set the HTTP status, the HTTP headers, the response body, and so on.

Twain provides us with a slightly more convenient API for describing web apps: an API for declaring methods and routes, extracting variable information, composing routes, and creating responses with less boilerplate.

A twain app is generally constructed by listing several HTTP methods + routes to be tried in order, and a matching responder for each method+route.

Let's explore these steps one by one, starting with declaring methods and routes, then defining responders, and finally gluing them all together.

Routing

Two of the most important details that can be found in an HTTP request are which component the user wants to access and in what way. The first is described using a Path, and the second using a Method.

For example, if a user would like to view the bulletin board post number 13, they will send the HTTP request GET /post/13. The first part is the method, and the second is the path.

To construct a route in twain, we need to specify the method using one of the method functions (such as get), and apply it to the route and to an action that generates a response.

Twain provides a textual interface for describing routes using the GHC extension OverloadedStrings. For example, we can describe static paths such as /static/css/style.css by writing the string "/static/css/style.css".

When writing routes, we often want to describe more than just a static path; sometimes we want part of the path to vary. We can give a name to a variable part of the path by prefixing the name with a colon (:).

For example, "/post/:id" will match with /post/17, /post/123, /post/hello and so on, and later, when we construct a response, we will be able to extract this variable part with the function param by passing it the name "id".
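
For example, a handler for that route could look like this (a sketch; send, text, and the ResponderM do-notation used here are all covered in the sections below):

get "/post/:id" $ do
  pid <- param "id"
  send $ text ("You asked for post number " <> pid)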

For our bulletin board we want to create several routes:

get "/" -- Our main page, which will display all of the bulletins
get "/post/:id" -- A page for a specific post
get "/new" -- A page for creating a new post
post "/new" -- A request to submit a new page
post "/post/:id/delete" -- A request to delete a specific post

Next, we'll define what to do if we match on each of these routes.

Responding

Once we match an HTTP method and route, we can decide what to do with it. This action is represented by the type ResponderM a.

ResponderM implements the monadic interface, so we can chain such actions in the same way we are used to with types like IO: this will run one action after the other.

In ResponderM context, we can find out more details about the request, do IO, decide how to respond to the user, and more.

Querying the Request

The request the user sent often has more information than just the HTTP method and route. It can hold request headers, such as which type of content the user expects to get or the "user-agent" they use; in the case of HTTP methods such as POST and PUT, it can include a body carrying additional content; and more.

Twain provides a few utility functions for querying the more common parts of a request, such as body, header and files, or the entire request if needed.

It also provides easy access to the varying parts of the route and body with param and params.

For our case this will come into play when we want to know which post to refer to (what is the :id in the /post/:id route), and what is the content of the post (in the /new route).

Responding to the user

There are several ways to respond to the user; the most common one is to return some kind of data. This can be text, HTML, JSON, a file, or more.

In HTTP, in addition to sending the data, we also need to describe what kind of data we are sending, and even whether the request was successful at all.

Twain handles all that for the common cases by providing utility functions such as text, html, and json.

These functions take the relevant data we want to send to the user and create a WAI Response, which we can then send to the user.

For example, if we want to send a simple html page on the route /hello, we'll write the following

get "/hello" $
  send $
    html "<html><head><link rel=\"stylesheet\" type=\"text/css\" href=\"/style.css\"></head><body>Hello!</body></html>"

The HTTP Response we created with html will automatically set the status code 200 and the Content-Type appropriate for HTML pages. This is also something we can set ourselves if we like, using the raw function and applying it to the status, headers and body directly, instead of calling html. For example:

get "/hello" $
  send $
    raw
      status200
      [("Content-Type", "text/html; charset=utf-8")]
      "<html><head><link rel=\"stylesheet\" type=\"text/css\" href=\"/style.css\"></head><body>Hello!</body></html>"

IO

It is possible to use IO operations in a ResponderM context using the function liftIO. For example:

get "/hello" $ do
  liftIO (putStrLn "They said hello!")
  send $ text "Hello back!"

This way we can write to console, change a song in our music player, or query a database in the middle of processing a request! Fair warning though: Warp runs request processing concurrently, so make sure you avoid race conditions in your code!

Gluing routes together

Each example we have seen above has the type Middleware, which is defined like this:

type Middleware = Application -> Application

As a reminder, Application is also a type alias:

type Application
  = Request -> (Response -> IO ResponseReceived) -> IO ResponseReceived

In essence, a Middleware is a function that takes a WAI Application and can add additional processing to it - it can process the user Request before passing it to the Application it received, and it can do extra processing to the Response the Application generates before calling the Response -> IO ResponseReceived it received.

To illustrate, here's a very simple Middleware that prints some data from a Request before passing it to the web app, and prints some data from the web app's Response before sending it to the user:

mylogger :: Twain.Middleware
mylogger app request respond = do
  print (Twain.requestMethod request)
  app request $ \response -> do
    print (Twain.responseStatus response)
    respond response

Twain uses this mechanism to compose route handlers as well: a route handler is essentially a function that checks the request first and decides whether it wants to handle it (if the route matches) or pass it to the next route handler. So we can compose route handlers using regular function composition!

All that's missing is a final route handler of type Application that will definitely handle all requests that were not processed by the previous handlers. We can use twain's notFound function to send the user a failure message if no other route handler was able to handle their request.

Here's an example of a simple WAI Application with several route handlers:

{-# language OverloadedStrings #-}

import Web.Twain
import Network.Wai.Handler.Warp (run)

main :: IO ()
main = do
  putStrLn "Server running at http://localhost:3000 (ctrl-c to quit)"
  run 3000 app

app :: Application
app =
  ( get "/" (send (text "hello"))
  . get "/echo/hi" (send (text "hi there"))
  . get "/echo/:str" (param "str" >>= \str -> send (text str))
  )
  (notFound (send (text "Error: not found.")))

Note that the order of the routes matters - we try to match the /echo/hi route before the /echo/:str route and provide a custom handler for a specific case; all other cases will be caught by the more general route handler.

And as an aside, I personally don't like to use that many parentheses and find using $ a bit more aesthetically pleasing, but . has precedence over $ so it's not going to work so well here. Fortunately we can place the routes in a list and then fold over the list to compose them instead:

app :: Application
app =
  foldr ($)
    (notFound $ send $ text "Error: not found.")
    [ get "/" $
      send $ text "hello"

    , get "/echo/hi" $
      send $ text "hi there"

    , get "/echo/:str" $ do
      str <- param "str"
      send $ text str
    ]

I like this style a bit more!

Alright, enough chitchat - let's get to work

We now have the basic building blocks with which we can build our bulletin board! There are a few more things we can cover that will make our lives easier, but we'll pick them up as we go.

At the time of writing the most recent version of twain is 2.1.0.0.

Some simple structure

Here's the simple initial structure which we will iterate on to build our bulletin board app:

{-# language OverloadedStrings #-}

-- | A bulletin board app built with twain.
module Bulletin where

import qualified Web.Twain as Twain
import Network.Wai.Handler.Warp (run, Port)

-- | Entry point. Starts a bulletin-board server at port 3000.
main :: IO ()
main = runServer 3000

-- | Run a bulletin-board server at at specific port.
runServer :: Port -> IO ()
runServer port = do
  putStrLn $ unwords
    [ "Running bulletin board app at"
    , "http://localhost:" <> show port
    , "(ctrl-c to quit)"
    ]
  run port mkApp

-- | Bulletin board application description.
mkApp :: Twain.Application
mkApp =
  foldr ($)
    (Twain.notFound $ Twain.send $ Twain.text "Error: not found.")
    routes

-- | Bulletin board routing.
routes :: [Twain.Middleware]
routes =
  -- Our main page, which will display all of the bulletins
  [ Twain.get "/" $
    Twain.send $ Twain.text "not yet implemented"

  -- A page for a specific post
  , Twain.get "/post/:id" $
    Twain.send $ Twain.text "not yet implemented"

  -- A page for creating a new post
  , Twain.get "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to submit a new page
  , Twain.post "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to delete a specific post
  , Twain.post "/post/:id/delete" $
    Twain.send $ Twain.text "not yet implemented"
  ]

We'll start with a very simple routing skeleton. For the sake of simplicity, I'm going to put this code in main.hs and run it using:

stack runghc --package twain-2.1.0.0 --package warp main.hs

Eventually the program will greet us with the following output:

Running bulletin board app at http://localhost:3000 (ctrl-c to quit)

Which means that we can now open firefox and go to http://localhost:3000 and be greeted by our twain application.

  • I've also created a complete cabal project if you'd prefer to use that instead: see the commit

Displaying posts

Next, we are going to need to figure out how to represent our bulletin data and how to keep state around.

We are going to add a few new packages to use for our data representation: text, time, and containers.

Add above:

import qualified Data.Text as T
import qualified Data.Time.Clock as C
import qualified Data.Map as M

And we'll represent a post in the following way:

-- | A description of a bulletin board post.
data Post
  = Post
    { pTime :: C.UTCTime
    , pAuthor :: T.Text
    , pTitle :: T.Text
    , pContent :: T.Text
    }

And we'll use a Map to represent all of the posts:

-- | A mapping from a post id to a post.
type Posts = M.Map Integer Post

Once we have these types, we can thread a value of type Posts to routes, so it will be available to all request and response handlers. We'll change runServer and mkApp a bit and add some dummy data.

-- | Run a bulletin-board server at at specific port.
runServer :: Port -> IO ()
runServer port = do
  app <- mkApp
  putStrLn $ unwords
    [ "Running bulletin board app at"
    , "http://localhost:" <> show port
    , "(ctrl-c to quit)"
    ]
  run port app

-- ** Application and routing

-- | Bulletin board application description.
mkApp :: IO Twain.Application
mkApp = do
  dummyPosts <- makeDummyPosts
  pure $ foldr ($)
    (Twain.notFound $ Twain.send $ Twain.text "Error: not found.")
    (routes dummyPosts)

-- | Bulletin board routing.
routes :: Posts -> [Twain.Middleware]
routes posts =
  -- Our main page, which will display all of the bulletins
  [ Twain.get "/" $
    Twain.send (displayAllPosts posts)

  -- A page for a specific post
  , Twain.get "/post/:id" $ do
    pid <- Twain.param "id"
    Twain.send (displayPost pid posts)

  -- A page for creating a new post
  , Twain.get "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to submit a new page
  , Twain.post "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to delete a specific post
  , Twain.post "/post/:id/delete" $
    Twain.send $ Twain.text "not yet implemented"
  ]

And add some additional business logic to display posts as simple text for now:

-- ** Business logic

-- | Respond with a list of all posts
displayAllPosts :: Posts -> Twain.Response
displayAllPosts =
  Twain.text . T.unlines . map ppPost . M.elems

-- | Respond with a specific post or return 404
displayPost :: Integer -> Posts -> Twain.Response
displayPost pid posts =
  case M.lookup pid posts of
    Just post ->
      Twain.text (ppPost post)

    Nothing ->
      Twain.raw
        Twain.status404
        [("Content-Type", "text/plain; charset=utf-8")]
        "404 Not found."

And add code that defines the types, creates a dummy posts list, and implements ppPost, which converts a Post to text:

-- ** Posts

-- | A mapping from a post id to a post.
type Posts = M.Map Integer Post

-- | A description of a bulletin board post.
data Post
  = Post
    { pTime :: C.UTCTime
    , pAuthor :: T.Text
    , pTitle :: T.Text
    , pContent :: T.Text
    }

-- | Create an initial posts Map with a dummy post
makeDummyPosts :: IO Posts
makeDummyPosts = do
  time <- C.getCurrentTime
  pure $
    M.singleton
      0
      ( Post
        { pTime = time
        , pTitle = "Dummy title"
        , pAuthor = "Dummy author"
        , pContent = "bla bla bla..."
        }
      )

-- | Prettyprint a post to text
ppPost :: Post -> T.Text
ppPost post =
  let
    header =
      T.unwords
        [ "[" <> T.pack (show (pTime post)) <> "]"
        , pTitle post
        , "by"
        , pAuthor post
        ]
    seperator =
      T.replicate (T.length header) "-"
  in
    T.unlines
      [ seperator
      , header
      , seperator
      , pContent post
      , seperator
      ]

Now, when running our program with:

stack runghc --package twain-2.1.0.0 --package warp --package text --package containers main.hs

We should be able to see a post when going to http://localhost:3000, see the same post when going to http://localhost:3000/post/0, and see a not found message when trying to go to a post with a different id such as http://localhost:3000/post/17

We can also create HTTP requests and see the results from the command-line using curl:

To see all posts:

curl -X GET http://localhost:3000

To see the post with id 0:

curl -X GET http://localhost:3000/post/0

Managing mutable state

Now this is a good start but we are still missing a few important parts:

  • Adding new posts
  • Generating new distinct post ids on post creation
  • Making sure all threads access the same state without stepping on each other's toes

While we could use a mutable variable like IORef or MVar, writing code that can run a sequence of commands that use mutable data can be tricky.

For example, when creating a new post, we want to:

  1. Get the current id
  2. Increment it, and use the current id to create a new post
  3. Update the mutable variable to point to the new Map

However, if, for example, two threads manage to get the same id before incrementing the id, we'll get two posts with the same id. Or if two threads create the new Map and ask the mutable variable to point at their new Map, one post will not actually be added and will be lost forever.
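
To see the race concretely, here is a hypothetical sketch using a plain IORef (with a pair of the next id and the Posts map as the state; this is the version we are deliberately avoiding):

import Data.IORef

-- Race-prone: the read and the write are two separate steps, so two
-- threads can both read the same id before either writes the update back.
unsafeNewPost :: Post -> IORef (Integer, Posts) -> IO Integer
unsafeNewPost post ref = do
  (pid, posts) <- readIORef ref                  -- another thread may read here too
  writeIORef ref (pid + 1, M.insert pid post posts)
  pure pid                                       -- both threads may return the same pid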

To combat that, we'll use shared memory via Software Transactional Memory (in short, STM). The stm package provides us with mutable variables that can be shared and updated concurrently in an atomic way. Meaning that we can describe a sequence of operations on shared memory that are guaranteed to run atomically as one transaction, without other operations on the same mutable variables getting mixed in between.

I recommend reading the chapter on STM in PCPH to get a more in-depth overview of stm.

Now we can create a state data type that will contain the posts currently existing in the system, as well as an ever-increasing id to use for the next post added to the system:

-- | Application state.
data AppState
  = AppState
    { asNextId :: Integer -- ^ The id for the next post
    , asPosts :: Posts -- ^ All posts
    }

And then wrap it up in a transactional mutable variable: STM.TVar AppState.

We can create a new TVar in an IO context and pass it to routes so that the twain web app is a closure containing the mutable variable, and that way any thread handling requests and responses will have access to it!

We'll add a new import:

import qualified Control.Concurrent.STM as STM

And we'll edit mkApp to create the TVar and pass it to routes:

mkApp :: IO Application
mkApp = do
  dummyPosts <- makeDummyPosts
  appstateVar <- STM.newTVarIO AppState{asNextId = 1, asPosts = dummyPosts}
  pure $ foldr ($)
    (Twain.notFound $ Twain.send $ Twain.text "Error: not found.")
    (routes appstateVar)

routes :: STM.TVar AppState -> [Middleware]
routes appstateVar = do
  ...

The three most interesting functions we have (for now) to operate on our mutable transactional variable appstateVar are:

readTVar   :: TVar a -> STM a
writeTVar  :: TVar a -> a -> STM ()

atomically :: STM a -> IO a

The STM type we see here is similar to IO: it is a description of a transactional program - a sequence of steps that must run atomically. The atomically function is the one that converts that program into something that the Haskell runtime system can run in an IO context.

So now, creating a new post and adding it to the current state of the system looks like this:

-- | Add a new post to our store.
newPost :: Post -> STM.TVar AppState -> IO Integer
newPost post appstateVar =
  STM.atomically $ do
    appstate <- STM.readTVar appstateVar
    STM.writeTVar
      appstateVar
      ( appstate
        { asNextId = asNextId appstate + 1
        , asPosts = M.insert (asNextId appstate) post (asPosts appstate)
        }
      )
    pure (asNextId appstate)

And these operations are guaranteed to run atomically. (We can also use STM.modifyTVar :: TVar a -> (a -> a) -> STM () for slightly more convenient code.)
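
For example, newPost could be written with modifyTVar like this (a sketch of the same logic as above):

-- | Add a new post to our store, this time using modifyTVar.
newPost :: Post -> STM.TVar AppState -> IO Integer
newPost post appstateVar =
  STM.atomically $ do
    appstate <- STM.readTVar appstateVar
    let pid = asNextId appstate
    STM.modifyTVar appstateVar $ \s ->
      s { asNextId = pid + 1
        , asPosts = M.insert pid post (asPosts s)
        }
    pure pid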

Let's add another import so we can run IO actions inside ResponderM:

import Control.Monad.IO.Class (liftIO)

and change the code of routes to handle viewing posts from our store:

-- | Bulletin board routing.
routes :: STM.TVar AppState -> [Twain.Middleware]
routes appstateVar =
  -- Our main page, which will display all of the bulletins
  [ Twain.get "/" $ do
    posts <- liftIO $ asPosts <$> STM.readTVarIO appstateVar
    Twain.send (displayAllPosts posts)

  -- A page for a specific post
  , Twain.get "/post/:id" $ do
    pid <- Twain.param "id"
    posts <- liftIO $ asPosts <$> STM.readTVarIO appstateVar
    Twain.send (displayPost pid posts)

  -- A page for creating a new post
  , Twain.get "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to submit a new page
  , Twain.post "/new" $
    Twain.send $ Twain.text "not yet implemented"

  -- A request to delete a specific post
  , Twain.post "/post/:id/delete" $
    Twain.send $ Twain.text "not yet implemented"
  ]

Note how we can run IO operations inside a ResponderM context using liftIO.

Let's also add the ability to delete posts:

routes :: STM.TVar AppState -> [Twain.Middleware]
routes appstateVar =
  [ ...

  -- A request to delete a specific post
  , Twain.post "/post/:id/delete" $ do
    pid <- Twain.param "id"
    response <- liftIO $ handleDeletePost pid appstateVar
    Twain.send response
  ]

-- | Delete a post and respond to the user.
handleDeletePost :: Integer -> STM.TVar AppState -> IO Twain.Response
handleDeletePost pid appstateVar = do
  found <- deletePost pid appstateVar
  pure $
    if found
      then
        Twain.redirect302 "/"

      else
        Twain.raw
          Twain.status404
          [("Content-Type", "text/html; charset=utf-8")]
          "404 Not Found."

-- | Delete a post from the store.
deletePost :: Integer -> STM.TVar AppState -> IO Bool
deletePost pid appstateVar =
  STM.atomically $ do
    appstate <- STM.readTVar appstateVar
    case M.lookup pid (asPosts appstate) of
      Just{} -> do
        STM.writeTVar
          appstateVar
          ( appstate
            { asPosts = M.delete pid (asPosts appstate)
            }
          )
        pure True

      Nothing ->
        pure False

We can also test POST requests from the command-line using curl:

To delete the post with id 0:

curl -X POST http://localhost:3000/post/0/delete

HTML and forms

We're going to start writing some HTML to display our data and add a form for adding a new post.

We're going to use lucid. If you are interested in more possible choices of HTML libraries, vrom911's article about Haskell HTML libraries is a good place to start.

Lucid provides a monadic EDSL for writing HTML pages. Its functions are all suffixed with an underscore (_) and represent the corresponding HTML tags.

We'll add this import at the top:

import qualified Lucid as H

And the following type for convenience:

type Html = H.Html ()

And first, we'll create a boilerplate template into which we'll inject our content later:

-- | HTML boilerplate template
template :: T.Text -> Html -> Html
template title content =
  H.doctypehtml_ $ do
    H.head_ $ do
      H.meta_ [ H.charset_ "utf-8" ]
      H.title_ (H.toHtml title)
      H.link_ [ H.rel_ "stylesheet", H.type_ "text/css", H.href_ "/style.css"  ]
    H.body_ $ do
      H.div_ [ H.class_ "main" ] $ do
        H.h1_ [ H.class_ "logo" ] $
          H.a_ [H.href_ "/"] "Bulletin Board"
        content

Notice how the lists represent the attributes of a tag, how tags are sequenced using the monadic interface, and how tags are nested by passing them as input to other tags.

Let's create pages for posts:

-- | All posts page.
allPostsHtml :: Posts -> Html
allPostsHtml posts = do
  H.p_ [ H.class_ "new-button" ] $
    H.a_ [H.href_ "/new"] "New Post"
  mapM_ (uncurry postHtml) $ reverse $ M.toList posts

postHtml :: Integer -> Post -> Html
postHtml pid post = do
  H.div_ [ H.class_ "post" ] $ do
    H.div_ [ H.class_ "post-header" ] $ do
      H.h2_ [ H.class_ "post-title" ] $
        H.a_
          [H.href_ ("/post/" <> T.pack (show pid))]
          (H.toHtml $ pTitle post)

      H.span_ $ do
        H.p_ [ H.class_ "post-time" ] $ H.toHtml (T.pack (show (pTime post)))
        H.p_ [ H.class_ "post-author" ] $ H.toHtml (pAuthor post)

    H.div_ [H.class_ "post-content"] $ do
      H.toHtml (pContent post)

And change our web handlers to use html instead of text:

 -- | Respond with a list of all posts
 displayAllPosts :: Posts -> Twain.Response
 displayAllPosts =
-  Twain.text . T.unlines . map ppPost . M.elems
+  Twain.html . H.renderBS . template "Bulletin board - posts" . allPostsHtml

 -- | Respond with a specific post or return 404
 displayPost :: Integer -> Posts -> Twain.Response
 displayPost pid posts =
   case M.lookup pid posts of
     Just post ->
-      Twain.text (ppPost post)
+      Twain.html $
+        H.renderBS $
+          template "Bulletin board - posts" $
+            postHtml pid post

     Nothing ->
       Twain.raw
         Twain.status404
         [("Content-Type", "text/plain; charset=utf-8")]
         "404 Not found."

In order to delete a post, we need to make a POST request to the URL /post/<post-id>/delete. We can do that from HTML by creating a form, defining its URL and method, and creating an input element of type submit.

    -- delete button
    H.form_
      [ H.method_ "post"
      , H.action_ ("/post/" <> T.pack (show pid) <> "/delete")
      , H.onsubmit_ "return confirm('Are you sure?')"
      , H.class_ "delete-post"
      ]
      ( do
        H.input_ [H.type_ "submit", H.value_ "Delete", H.class_ "deletebtn"]
      )

You can stick this wherever you want in postHtml; I placed it at the end. Now, if you run the program using:

stack runghc --package twain --package text --package containers --package stm --package lucid main.hs

and go to the website (http://localhost:3000), you'll be greeted with beautiful (well, not beautiful, but functional) posts and a delete button for each post.

Submitting data via forms and processing it

Next we are going to add a post. To do that we need to create a new HTML page which will contain another HTML form. This time we will want to capture some input which will then be part of the body of the POST request.

-- | A new post form.
newPostHtml :: Html
newPostHtml = do
  H.form_
    [ H.method_ "post"
    , H.action_ "/new"
    , H.class_ "new-post"
    ]
    ( do
      H.p_ $ H.input_ [H.type_ "text", H.name_ "title", H.placeholder_ "Title..."]
      H.p_ $ H.input_ [H.type_ "text", H.name_ "author", H.placeholder_ "Author..."]
      H.p_ $ H.textarea_ [H.name_ "content", H.placeholder_ "Content..."] ""
      H.p_ $ H.input_ [H.type_ "submit", H.value_ "Submit", H.class_ "submit-button"]
    )

We also need to be able to access the submitted form fields from the request on the server. We can do that using param. So let's implement the relevant parts in routes:

  -- A page for creating a new post
  , Twain.get "/new" $
    Twain.send handleGetNewPost

  -- A request to submit a new page
  , Twain.post "/new" $ do
    title <- Twain.param "title"
    author <- Twain.param "author"
    content <- Twain.param "content"
    time <- liftIO C.getCurrentTime

    response <-
      liftIO $ handlePostNewPost
        ( Post
          { pTitle = title
          , pAuthor = author
          , pContent = content
          , pTime = time
          }
        )
        appstateVar

    Twain.send response

and the handlers:

-- | Respond with the new post page.
handleGetNewPost :: Twain.Response
handleGetNewPost =
  Twain.html $
    H.renderBS $
      template "Bulletin board - posts" $
        newPostHtml

-- | Handle a new post submission and redirect to the new post's page.
handlePostNewPost :: Post -> STM.TVar AppState -> IO Twain.Response
handlePostNewPost post appstateVar = do
  pid <- newPost post appstateVar
  pure $ Twain.redirect302 ("/post/" <> T.pack (show pid))
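
As with deletion, we can test the submission from the command line using curl, passing the form fields with -d (the field values here are only examples):

curl -X POST -d "title=Hello" -d "author=Me" -d "content=A first post" http://localhost:3000/new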

And now we have a fairly functional little bulletin board! Hooray!

Styling

This post is already pretty long, so I will not cover styling in depth.

There are multiple ways to add styling:

  • use an EDSL approach, as we did for HTML with lucid, via a library like clay
  • write the CSS inline in a Haskell module using something like the raw-strings-qq library
  • write it in an external file and embed its contents at compile time using Template Haskell and the file-embed library
  • ship the CSS file along with the executable and use responseFile from the wai package to send it as a file

For each of these - don't forget to set the content type header to "text/css; charset=utf-8"!

We can send some very rudimentary CSS as a string with the css function by adding this to the end of the routes list:

  -- css styling
  , Twain.get "/style.css" $
    Twain.send $ Twain.css ".main { width: 900px; margin: auto; }"
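
For the last option on the list above - serving the CSS from an external file - here is a minimal sketch using responseFile from wai. It assumes a style.css file shipped next to the executable and that OverloadedStrings is enabled; twain responses are plain wai responses, as the Twain.raw calls earlier suggest:

import qualified Network.Wai as Wai
import Network.HTTP.Types (status200)

  -- css styling served from an external file
  , Twain.get "/style.css" $
    Twain.send $
      Wai.responseFile
        status200
        [("Content-Type", "text/css; charset=utf-8")]
        "style.css"
        Nothing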

Logging, Sessions, Cookies, Authentication, etc.

The wai ecosystem has a wide variety of features that can be composed together. These features are usually encapsulated as "middlewares".

Remember, a middleware is a function that takes an Application and returns an Application. Middlewares can add functionality before the request passes to our twain app or after the response is returned.
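
As a small illustration, here is a sketch of a hand-rolled middleware (the header name and value are made up for the example, and OverloadedStrings is assumed) that adds an extra header to every response:

import qualified Network.Wai as Wai

-- | Add an (illustrative) extra header to every outgoing response.
addServerHeader :: Wai.Middleware
addServerHeader app request respond =
  app request $ \response ->
    respond (Wai.mapResponseHeaders (("Server", "bulletin-board") :) response)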

The wai-extra package contains a bunch of middlewares we can use, such as logging, gzip compression of responses, forcing SSL usage, or simple HTTP authentication.

For example, let's add some logging from wai-extra to our bulletin-app. We import a request logger from wai-extra:

import qualified Network.Wai.Middleware.RequestLogger as Logger

And then we can apply our twain app to a function such as logStdoutDev to add request logging to our twain app:

 -- | Run a bulletin-board server at a specific port.
 runServer :: Port -> IO ()
 runServer port = do
   app <- mkApp
   putStrLn $ unwords
     [ "Running bulletin board app at"
     , "http://localhost:" <> show port
     , "(ctrl-c to quit)"
     ]
-  run port app
+  run port (Logger.logStdoutDev app)

Testing

Testing WAI apps can be relatively straightforward with packages such as hspec-wai. Check out this twain test module for example usage.
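
As a taste, here is a minimal sketch using hspec-wai against the mkApp from earlier; the route and expected status are illustrative:

{-# LANGUAGE OverloadedStrings #-}
import Test.Hspec
import Test.Hspec.Wai

main :: IO ()
main = hspec $
  -- with accepts an IO Application, which is exactly what mkApp is
  with mkApp $
    describe "GET /" $
      it "responds with 200" $
        get "/" `shouldRespondWith` 200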

Deploying

I usually create a static executable using ghc-musl and docker so I can deploy my executable on other linux servers.

In a stack project, add the following sections:

Add this to the stack.yaml:

docker:
  enable: true
  image: utdemir/ghc-musl:v24-ghc922

and this to the .cabal file under the executable section:

  ghc-options: -static -optl-static -optl-pthread -fPIC -threaded -rtsopts -with-rtsopts=-N

Check the ghc-musl repo for more instructions.

That's it

I hope you found this tutorial useful. If there's something you feel I did not explain well or that you'd like me to cover, let me know via email or Twitter.

The whole program including the stack and cabal files can be found on Github.

April 24, 2022 12:00 AM

April 21, 2022

Tweag I/O

Announcing WebAuthn

Tweag and Mercury are happy to announce a server-side library for the WebAuthn specification (part of the FIDO2 project), available as webauthn on Hackage!

This library has been developed by a team at Tweag, contracted by Mercury. Mercury felt that WebAuthn support was a missing element in the Haskell ecosystem, and wanted to contribute an open-source library to fill this gap. The library builds upon two previous prototypes, in consultation with their authors: a hackathon project by Arian van Putten and an alternative implementation by Fumiaki Kinoshita (also known as webauthn-0 on Hackage).

In the rest of the blog post we will mostly focus on introducing WebAuthn itself.

The problems with passwords, TOTP and SMS 2FA

Passwords are the dominant way to log into web services, but they have problems, both for users and developers. Users have to choose a password that is not easy to guess and then remember it. A password manager solves this problem, but introduces another attack vector. Furthermore, passwords can be compromised with phishing attempts.

Developers have to handle passwords with care, using a good key derivation function involving hashing and salts, and only transferring the resulting key for storage in a secure database. This can be very error-prone; see the OWASP cheat sheet for password storage for the current best practices.

TOTP and SMS-based two-factor authentication methods provide additional security over standalone passwords. However, these also come with their respective downsides. TOTP is a symmetric algorithm, and therefore provides no additional security against database leaks; SMS is plaintext, unreliable, and subject to local law surrounding automated messages.

Enter WebAuthn

The FIDO2 project, of which WebAuthn is a part, attempts to solve these issues by using public key cryptography instead of passwords for authentication. Public-private key pairs specific to each web service are generated and stored by authenticators like a YubiKey or SoloKey, or platform-specific hardware like a TPM or Apple’s TouchID1.

While the main motivation for WebAuthn is authentication without passwords, it can also be used to add second-factor authentication to password-based authentication.

Ceremonies

WebAuthn can be split into two ceremonies (a ceremony is like a network protocol except that it involves human actions). The first ceremony is one-time registration of a credential, in which a new key pair is generated on an authenticator and its public component transferred to and stored by the web service. The second ceremony is authentication using a previously registered credential, in which it is proven that the user is in possession of the authenticator with the private key corresponding to a given public key.

Since web services can only run sandboxed client-side code, connected authenticators cannot be used directly. Instead, the browser-provided WebAuthn API needs to be called via JavaScript to indirectly interact with authenticators. Both ceremonies also use the shared concept of a public key credential, which is something the user can present to the web service in order to be authenticated.

We will now look at the steps of these ceremonies in some detail.

Registration The image below depicts the steps of the registration ceremony, showing how the web server, the browser and the authenticator interact with each other through the WebAuthn API.

[Figure: Web Authentication API registration component and dataflow diagram. This figure, depicting the registration component of WebAuthn, by Mozilla Contributors, is licensed under CC-BY-SA 2.5.]

Step 0: The client informs the web server that a user wishes to register a credential. This initial message is implementation-specific, but typically it contains the desired username. For username-less login this message can be empty.

Step 1: The web server responds with detailed requirements of the to-be-created key pair and other information, of which the following are of particular interest:

Step 2: The client selects an authenticator based on the above requirements and relays the relevant information to it.

Step 3: From here, the authenticator verifies that a user is present (via a button for example) and generates a new public-private key pair scoped to the web service, optionally with a proof that it originates from a trusted and secure authenticator, which is relayed back to the client.

Step 4: The client combines this information with its own information and relays it back to the web server.

Step 5: The client takes the data created by the authenticator and constructs the PublicKeyCredential, the interesting parts of which are:

  • identifier: The unique identifier of the provided credential.
  • response: The response of the authenticator to the client’s request for a new credential.

    • clientData: Provides the context for which the credential was created. e.g. the challenge and perceived origin of the options.
    • attestationObject: In case attestation was requested this will contain the attestation information.

Step 6: The web server performs validation. This validation includes, but is not limited to:

  1. Checking if the challenge matches the challenge originally sent to the client.
  2. Checking that the origin is the expected origin. If this isn’t the case, the client might have been connected to another server.

In case the web server requested, and was provided with, an attestation object, it may also verify that the attestation is sufficient. Different authenticators provide different methods of attestation. Hence, the web server must be able to handle different formats. WebAuthn Level 2 (the specification used for the implementation of the library) defines 6 verifiable attestation statement formats.

The web server can also choose to look up further information on attested authenticators in the FIDO Alliance Metadata Service (specification). This service provides up-to-date information on registered authenticators. These fields are of particular interest:

  • attestationRootCertificates: The root certificates for many authenticators. For these authenticators, verifying the attestation using the metadata is essential to trust the attestation. For other authenticators (e.g. Apple), the root certificate is hardcoded in the library. Our library automatically handles looking up the authenticator and verifying the attestation if needed, if provided with the required data.
  • AuthenticatorStatus: The status of the authenticator. For instance, an authenticator might have been compromised by physical attacks.

Authentication When a user wishes to authenticate themselves, this happens through the authentication ceremony. The goal of this ceremony is for the user to securely prove that they are in possession of the authenticator that holds the private key corresponding to a public key previously registered to the user’s account.

[Figure: WebAuthn authentication component and dataflow diagram. This figure, depicting the authentication component of WebAuthn, by Mozilla Contributors, is licensed under CC-BY-SA 2.5.]

Step 0: The client informs the web server that a user wishes to authenticate themselves. This initial message is implementation-specific. For typical authentication, this message should contain the username. For username-less login, this message may be void of any information.

Step 1: The web server generates a new challenge and constructs the PublicKeyCredentialRequestOptions. For username-less login, the challenge is in fact the only field that has to be set. The fields of particular interest are:

  • challenge: The authenticator will sign this challenge in the following steps to prove the possession of the private key.
  • allowCredentials: For the given username (if any), the server selects the credentials it has on record and relays them in this field (in order of preference). This allows the client to select the credential for which it knows an authenticator with the correct private key. For username-less login, this field is empty; it is then up to the authenticator to select the correct credential based on its scope.
  • userVerification: Whether user verification (in addition to user presence, which is always done) should be performed, making the result signify multi-factor authentication.

Step 2-4: The client selects an authenticator and relays the relevant information to it. From here, the authenticator verifies if the user is present (using a button for example), signs the challenge (and client data) with its private key, and returns the signature. The authenticator also returns the authenticatorData which contains information about the scope of the credential, the user interaction, and the authenticator itself.

Step 5: The client takes the data created by the authenticator and constructs the PublicKeyCredential, the interesting parts of which are:

  • identifier: The unique identifier of the provided credential.
  • response: The response of the authenticator to the client’s request.

    • clientData: Provides the context for which the signing took place. e.g. the challenge and perceived origin of the options.
    • authenticatorData: Contains information about the credential scope, user interaction with the authenticator for signing, and the signature counter.
    • signature: Contains the signature over the clientDataJSON and authenticatorData.

Step 6: The server verifies that the client data and authenticatorData are as expected, and that the signature is valid.

The Haskell library

The library implements almost the entire WebAuthn Level 2 specification, with extensions being the only major missing part. While the general design of the library isn’t expected to change very much, it should still be considered an alpha version for now. If you have a website with user accounts running on Haskell, we’d love for you to try it out and tell us what could be improved - contributions are welcome too!

To get started, here are our recommendations:


  1. More specifically, the Secure Enclave, which TouchID allows access to.

April 21, 2022 12:00 AM

April 12, 2022

Well-Typed.Com

GHC activities report: February-March 2022

This is the eleventh edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of February and March 2022.

You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Meta, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!

Team

The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire.

Many others within Well-Typed, including Adam Gundry, Alfredo Di Napoli, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, are contributing to GHC more occasionally.

Releases

  • Ben finished backports to GHC 9.2.2 and cut the release.

  • Matt worked on preparing the 9.4 release.

  • Zubin has started preparing the 9.2.3 release.

Typechecker

  • Sam has been implementing syntactic unification, which allows two types to be checked for equality syntactically. This is useful in several places in the typechecker when we don’t want to emit an equality constraint to be processed by the constraint solver (thus giving less work to the constraint solver). This work will allow us to progress towards fixing #13105 (allowing rewriting in RuntimeReps). (!7812)

  • Sam fixed the implementation of isLiftedType_maybe, an internal function in GHC used to determine whether something is definitely lifted (e.g. Int :: Type), definitely unlifted (e.g. Int# :: TYPE IntRep), or unknown (e.g. a :: TYPE r for a type variable r). This function did not correctly account for type families or levity variables, as noted in #20837. This was hiding several bugs, e.g. in strictness analysis and in pattern matching inhabitation tests.

  • Sam allowed HasField constraints to appear in quantified constraints (#20989).

  • Sam added a check preventing users from deriving KnownNat instances, which could be used to cause segfaults as shown in #21087.

  • Sam made the output of GHCi’s :type command more user-friendly, by improving instantiation of types involving out-of-order inferred type variables (#21088) and skipping normalisation for types that aren’t fully instantiated (#20974).

Code generation

  • Ben reworked the x86-64 native code generator to produce more position-independent code where possible. This enables use of GHC with new Windows toolchains, which enable address-space layout randomization by default (#16780).

  • Ben fixed a slew of bugs in GHC’s code generation for unaligned array accesses, revealed by recent work on the bytestring package (#20987, #21015).

  • Ben characterised and fixed a rather tricky bug in the generation of static reference tables for programs containing cyclic binding groups containing CAFs, static functions, and static data constructor applications (#20959).

  • Ben debugged and fixed a recently introduced regression where GHC miscompiled code involving jump tables with sub-word discriminants (#21186).

  • Ben debugged a tricky non-deterministic crash due to a set of missing GC roots (#21141).

  • Spurred by insights from #21141, Ben started investigating how we can reduce the impact of error paths on SRT sizes. Sadly, there are some tricky challenges in this area which will require further work (#21169, #21183).

  • Ben migrated GHC’s Windows distribution towards a fully Clang/LLVM-based toolchain, eliminating a good number of bugs attributable to the previous GNU toolchain. This was a major undertaking involving changes in code generation (!7449), linking (!7774, !7528), the RTS (!7511, !7512, !7446), Cabal (Cabal #8062), the driver (!7448), and packaging and should significantly improve GHC’s maintainability and reliability on Windows platforms. See !7448 for a full overview of all of the moving parts involved in this migration.

  • Sam fixed a bug with code generation of keepAlive#, which was incorrectly being eta reduced even though it is supposed to always be kept eta-expanded (#21090).

Core

  • Sam added a check that prevents unboxed float literals from occurring in patterns, to avoid case expressions needing to implement complicated floating-point equality rules. This didn’t affect any packages on head.hackage.

Runtime system

  • Ben refactored the handling of adjustor thunks, an implementation detail of GHC’s foreign function interface implementation. The new representation significantly reduces their size and contribution to address-space fragmentation, eliminating a known memory leak in GHCi (#20349) and fixing a source of testsuite fragility on Windows and i386 (#21132).

  • With help from Matt, Ben at long last finished and merged his refactoring of GHC’s eventlog initialization logic, eliminating a measurable source of RTS startup overhead and removing the last barrier to enabling eventlog support by default (!4477).

  • Ben rewrote the RTS linker used on Windows platforms, greatly improving link robustness by extending and employing the RTS’s m32 allocator for mapping object code (!7447).

  • Ben identified a GC bug, revealed by recent improvements in pointer tagging consistency, where the GC failed to untag function closures referenced from PAPs (#21254).

  • Matt identified and fixed a discrepancy in the runtime stats calculations which would lead to incorrect CPU time calculations when using multiple GC threads. (!7890)

Error messages

  • Sam migrated more error messages to use the diagnostic infrastructure, such as “Missing signature” errors, and illegal wildcard errors (!7033).

  • Sam improved the treatment of promotion ticks on symbolic operators, which means that GHC now correctly reports unticked promoted symbolic constructors when compiling with -Wunticked-promoted-constructors (#19984).

  • Zubin added warnings that get triggered when file header pragmas like LANGUAGE pragmas are found in the body of the module where they would usually be ignored (#20385).

Parser

  • Sam allowed COMPLETE pragmas involving qualified constructor names to be parsed correctly (!7645).

Driver

  • Ben rebased and extended work by Tamar Christina to improve GHC’s support for linking against C++ libraries, addressing pains revealed by the text library’s recent addition of a dependency on simdutf (#20010).

  • Ben introduced response file support into GHC’s command-line parser, making it possible to circumvent the restrictive limits on command-line-length imposed by some platforms (#16476).

  • Zubin and Matt added more fine grained recompilation checking for modules using Template Haskell, so that they are only recompiled when a dependency actually used in a splice is changed (#20605, !7353, blog post).

  • Matt fixed some more bugs in the dependency calculations in the driver. In particular, some situations involving redundant hs-boot files are now handled correctly.

  • Matt modified the driver to store a cached transitive dependency calculation which can share work of computing a transitive dependency across modules. This is used when computing what instances are in scope for example.

  • Matt once again improved the output of the -Wunused-packages warning to now display some more information about the redundant package imports (!7883).

  • Matt fixed a long-standing bug where in one-shot mode, the compiler would look for interface files in the -i dirs even if the -hidir was set (!7851).

API features

  • Zubin fixed a few bugs with HIE file support, one where certain variable scopes weren’t being calculated properly (#18425) and another one where relationships between derived typeclass instances weren’t being recorded (#20341).

  • Matt and Zubin finished the hi-haddock patch which makes GHC lex and rename Haddock documentation strings. These are then stored in interface files so downstream tools such as HLS and GHCi can directly read this information without having to process the doc strings themselves.

Template Haskell

  • Sam fixed a compiler panic triggered by illegal occurrences of type wildcards resulting from splicing in a Template Haskell type (#15433).

  • Zubin fixed a few issues with the pretty printing of TH syntax (#20868, #20842).

  • Zubin added support for quoting patterns containing negative numeric literals (#20711).

Profiling

  • Andreas added a new profiling mode: -fprof-late. This mode adds cost centres only after the simplifier has had a chance to run, resulting in profiling performance more in line with regular builds, since most optimizations can still fire even with profiling enabled. The user guide has more information and we encourage people to try it out!

  • Andreas and Matt made various changes to the ticky-profiling infrastructure which allow it to be used with the eventlog, and furthermore to use it with eventlog2html. Thanks to Hasura for funding this work!

  • Matt made a few improvements to Source Notes, which are used to give source locations to expressions. This should result in more accurate location information in more situations (!7536).

Libraries

  • Ben fixed a regression in the process library due to the recently-introduced support for posix_spawnp (process #224).

  • Andreas opened up a proposal to export MutableByteArray from Data.Array.Byte. This makes the treatment of MutableByteArray consistent with the treatment of ByteArray with regard to being exported by base. The proposal has been accepted and implemented, and will be in ghc-9.4.

Compiler performance

  • Sam continued work on directed coercions, which avoids large coercions being produced when rewriting type families (#8095). Unfortunately, benchmarking revealed some severe regressions, such as when compiling the singletons package. This is due to coercion optimisation being less effective than it was previously. Sam implemented a workaround (coercion zapping in the coercion optimiser), but due to the complexity of the patch, it was decided it would be better to investigate zapping without directed coercions, as the implementation of the directed coercions patch suggested a way forward that could avoid previous pitfalls. The directed coercions patch has been put on hold for the time being, until a better approach can be found for coercion optimisation. Sam wrote up an overview of the difficulties encountered during the implementation on the GHC wiki here, which should be useful to future implementors.

Packaging

  • All the release bindists are now produced by Hadrian. Starting from the 9.4 release, all the bindists that we distribute will be built in this manner.

  • Zubin and Matt finished the reinstallable GHC patch, which allows GHC and all its libraries to be built using normal cabal-install commands. In the future it’s hoped that this will allow ghc to be rebuilt in cabal build plans, but we still have some issues to work out relating to Template Haskell (#20742).

Runtime performance

  • After a long time, tag inference has finally landed in !5614, implementing #16970.

    This is a new optimization which can allow the compiler to omit branches checking for the presence of a pointer tag if we can infer from the context that a tag must always be present.

    For the nofib benchmark suite, the best result was obtained with -fworker-wrapper-cbv enabled, giving a 4.01% decrease in instructions executed, with a similar benefit in runtime. Sadly, because of #20364, -fworker-wrapper-cbv can prevent RULEs from firing when INLINE[ABLE] pragmas are not used correctly - which turned out to be a fairly common problem!

    For this reason -fworker-wrapper-cbv is off by default, at which point the improvement was “only” a 1.53% reduction in instructions executed. But in general -fworker-wrapper-cbv can be safely enabled for all modules except those defining RULES-relevant functions which are currently not subject to a worker/wrapper split. The user guide has some guidance about when -fworker-wrapper-cbv can be safely enabled.

    The main goal of this optimization was to improve tight loops like the ones performed by the lookup operations in containers. There, a significant amount of performance was lost to redundant checks for pointer tags, and we saw runtime improvements of up to 15% for some of the lookup operations.

Infrastructure

  • Ben and Matt introduced a lint to verify references between GHC’s long-form comments (so-called “Notes”). This helps eliminate a long-standing problem where note references grow stale across refactorings of the compiler (!7482).

  • Sam migrated the linting infrastructure to allow all linting steps to be run locally, so that developers can be confident their merge requests won’t fail during the linting stage in CI (!7578).

  • Matt created a script which generates the main CI pipelines for all supported build combinations. The script is a simple Haskell file which is easier to understand and modify than the old gitlab yaml file which led to various inconsistencies between the build configurations and bugs such as missing build artifacts (!7753).

by ben, andreask, matthew, zubin, sam, adam at April 12, 2022 12:00 AM

April 07, 2022

Philip Wadler

Vote!


It's time once again! The below is copied from https://www.gov.uk/register-to-vote. The site is easy to use and registration takes less than five minutes. I'll be voting for the Scottish Green Party.

Deadline for registering to vote in the 5 May 2022 elections

Register by 11:59pm on 14 April to vote in the following elections on 5 May:

  • local government, combined authority mayoral, mayoral and parish council elections in England
  • local government and community council elections in Wales
  • Northern Ireland Assembly election

Register by 11:59pm on 18 April to vote in the local government elections in Scotland on 5 May.

Who can register

You must be aged 16 or over (or 14 or over in Scotland and Wales).

You must also be one of the following:

  • a British citizen
  • an Irish or EU citizen living in the UK
  • a Commonwealth citizen who has permission to enter or stay in the UK, or who does not need permission
  • a citizen of another country living in Scotland or Wales who has permission to enter or stay in the UK, or who does not need permission

Check which elections you’re eligible to vote in.

You can vote when you’re 18 or over. If you live in Scotland or Wales, you can vote in some elections when you’re 16 or over.

You normally only need to register once - not for every election. You’ll need to register again if you’ve changed your name, address or nationality. 

Register online

It usually takes about 5 minutes.

Start now

by Philip Wadler (noreply@blogger.com) at April 07, 2022 05:47 PM

Haskell in Production

 

Serokell has a series of posts on Haskell in Production. Spotted by Alex Wasey. Thanks, Alex!

by Philip Wadler (noreply@blogger.com) at April 07, 2022 04:24 PM

Ken T Takusagawa

[xcruhlyr] first-class pattern guards in Haskell

patterns are not first class objects in Haskell.  they cannot be assigned to variables nor passed around.

in the function f1 below, the patterns Apple and Banana are hardcoded.

data Fruit = Apple | Banana | Orange ;

f1 :: Fruit -> String;
f1 Apple = "got first choice fruit";
f1 Banana = "got second choice fruit";
f1 _ = "did not get what we want";

however, unlike patterns, pattern guards can be first-class objects.  using them, we can accomplish anything a first-class pattern could do.  in the example below, we pass patterns as boolean predicates to f2 and call them in the pattern guards (to the right of the vertical bar).  applep, bananap, and orangep are patterns turned into boolean functions.

f2 :: (Fruit -> Bool) -> (Fruit -> Bool) -> Fruit -> String;
f2 pattern1 pattern2 fruit
| pattern1 fruit = "got first choice fruit" -- note: no semicolon here
| pattern2 fruit = "got second choice fruit";
f2 _ _ _ = "did not get what we want";

applep :: Fruit -> Bool;
applep Apple = True;
applep _ = False;

bananap :: Fruit -> Bool;
bananap Banana = True;
bananap _ = False;

orangep :: Fruit -> Bool;
orangep Orange = True;
orangep _ = False;

examplef2 :: Fruit -> IO();
examplef2 fruit = do {
putStrLn $ f2 applep bananap fruit;
putStrLn $ f2 orangep applep fruit;
};

we can also do pattern matching, encapsulating extraction of a value from a pattern into a (first-class) function returning Maybe.  below, the function superhero calls its supplied pattern and attempts to match against Just.  if the pattern returns Nothing, the guard fails, and we fall through to "muggle".

data Health = Mind Int | Body Int;

superhero :: (Health -> Maybe Int) -> Health -> String;
superhero pattern health | Just power <- pattern health = if power > 9000
  then "is superhero"
  else "not strong enough";
superhero _ _ = "muggle";

getmind :: Health -> Maybe Int;
getmind (Mind i) = Just i;
getmind _ = Nothing;

getbody :: Health -> Maybe Int;
getbody (Body i) = Just i;
getbody _ = Nothing;

powermeter :: Health -> IO();
powermeter health = do {
putStrLn $ superhero getmind health;
putStrLn $ superhero getbody health;
};

in general, because what we pass is a function, we can build up a pattern guard in the myriad of ways we can build a function in a functional programming language.  however, I have not explored this very far.
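
for instance, here is a hypothetical combinator (not part of the code above) that disjoins two predicates, letting us pass apple-or-banana as a single pattern to f2:

-- combine two first-class patterns into one;
orp :: (a -> Bool) -> (a -> Bool) -> (a -> Bool);
orp p q x = p x || q x;

examplef3 :: Fruit -> IO();
examplef3 fruit = putStrLn $ f2 (orp applep bananap) orangep fruit;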

a pattern guard can have several components, separated by commas.  a comma acts like the boolean AND operator.  each component can be a boolean expression, a pattern match (its boolean value is whether the match succeeds), or an assignment of a local variable with "let" (always evaluates to True).
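
as an illustrative (hypothetical) example combining all three kinds of components in one guard, reusing getbody from above:

strongbody :: Health -> String;
strongbody h
| Just power <- getbody h
, power > 100
, let { msg = "body power " ++ show power }
= msg;
strongbody _ = "weak or mind-based";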

note well: pattern matching in a guard is different from pattern matching in a let in a guard.  unlike the definition above, the following always succeeds, never falling through to "muggle".  if pattern returns Nothing, then a run-time error "Non-exhaustive patterns" occurs.

superhero pattern health | let { Just power = pattern health } = if power ...

by Unknown (noreply@blogger.com) at April 07, 2022 05:36 AM

Lysxia's blog

The pro-PER meaning of "proper"

A convenient proof tactic is to rewrite expressions using a relation other than equality. Some setup is required to ensure that such a proof step is allowed. One important obligation is to prove Proper theorems for the various functions in our library. For example, a theorem like

Instance Proper_f : Proper ((==) ==> (==)) f.

unfolds to forall x y, x == y -> f x == f y, meaning that f preserves some relation (==), so that we can “rewrite x into y under f”. Such a theorem must be registered as an instance so that the rewrite tactic can find it via type class search.

Where does the word “proper” come from? How does Proper ((==) ==> (==)) f unfold to forall x y, x == y -> f x == f y?

You can certainly unfold the Coq definitions of Proper and ==> and voilà, but it’s probably more fun to tell a proper story.

It’s a story in two parts:

  • Partial equivalence relations
  • Respectfulness

Some of the theorems discussed in this post are formalized in this snippet of Coq.

Partial equivalence relations (PERs)

Partial equivalence relations are equivalence relations that are partial. 🤔

In an equivalence relation, every element is at least related to itself by reflexivity. In a partial equivalence relation, some elements are not related to any element, not even themselves. Formally, we simply drop the reflexivity property: a partial equivalence relation (aka. PER) is a symmetric and transitive relation.

Class PER (R : A -> A -> Prop) :=
  { PER_symmetry : forall x y, R x y -> R y x
  ; PER_transitivity : forall x y z, R x y -> R y z -> R x z }.

We may remark that an equivalence relation is technically a “total” partial equivalence relation.

An equivalent way to think about an equivalence relation on a set is as a partition of that set into equivalence classes, such that elements in the same class are related to each other while elements of different classes are unrelated. Similarly, a PER can be thought of as equivalence classes that only partially cover a set: some elements may belong to no equivalence class.

[Figure: on the left, a partition of a set of points into three classes, representing an equivalence relation; on the right, a partial partition into two classes with some points leftover, representing a PER.]

Exercise: define the equivalence classes of a PER; show that they are disjoint.

Solution

The equivalence classes of a PER R : A -> A -> Prop are sets of the form C x = { y ∈ A | R x y }.

Given two equivalence classes C x and C x', we show that these sets are either equal or disjoint. By excluded middle:

  • Either R x x', then R x y -> R x' y by symmetry and transitivity, so y ∈ C x -> y ∈ C x', and the converse by the same argument. Therefore C x = C x'.

  • Or ~ R x x', then we show that ~ (R x y /\ R x' y):

    • assume R x y and R x' y,
    • then R x x' by symmetry and transitivity,
    • by ~ R x x', contradiction.

    Hence, ~ (y ∈ C x /\ y ∈ C x'), therefore C x and C x' are disjoint.

(I wouldn’t recommend trying to formalize this in Coq, because equivalence classes are squarely a set-theoretic concept. We just learn to talk about things differently in type theory.)

A setoid is a set equipped with an equivalence relation. A partial setoid is a set equipped with a PER.

PERs are useful when we have to work in a set that is “too big”. A common example is the set of functions on some setoid. For instance, consider the smallest equivalence relation (≈) on three elements {X, X', Y} such that X ≈ X'. Intuitively, we want to think of X and X' as “the same”, so that the set morally looks like a two-element set.

How many functions {X, X', Y} -> {X, X', Y} are there? If we ignore the equivalence relation, then there are 3³ = 27 functions. But if we think of {X, X', Y} as a two-element set by identifying X and X', there should be 2² = 4 functions. The actual set of functions {X, X', Y} -> {X, X', Y} is “too big”:

  1. it contains some “bad” functions which break the illusion that X and X' are the same, for example by mapping X to X and X' to Y;

    (* A bad function *)
    bad X = X
    bad X' = Y
    bad Y = Y
  2. it contains some “duplicate” functions, for example the constant functions const X and const X' should be considered the same since X ≈ X'.

To tame that set of functions, we equip it with the PER R where R f g if forall x y, x ≈ y -> f x ≈ g y.

Definition R f g : Prop := forall x y, x ≈ y -> f x ≈ g y.

That relation R has the following nice features:

  1. Bad functions are not related to anything: forall f, not (R bad f).

  2. Duplicate functions are related to each other: R (const X) (const X').

Having defined a suitable PER, we now know to ignore the “bad” unrelated elements and to identify elements related to each other. Those remaining “good” elements are called the proper elements.

A proper element x of a relation R is one that is related to itself: R x x.

This is how the Proper class is defined in Coq:

(* In the standard library: From Coq Require Import Morphisms *)
Class Proper {A} (R : A -> A -> Prop) (x : A) : Prop :=
  proper_prf : R x x.

Note that properness is a notion defined for any relation, not only PERs. This story could probably be told more generally. But I think PERs make the motivation more concrete, illustrating how relations let us not only relate elements together, but also weed out badly behaved elements via the notion of properness.

The restriction of a relation R to its proper elements is reflexive. Hence, if R is a PER, its restriction is an equivalence relation. In other words, a PER is really an equivalence relation with an oversized carrier.

Exercise: check that there are only 4 functions {X, X', Y} -> {X, X', Y} if we ignore the non-proper functions and we equate functions related to each other by R.

Solution

The equivalence classes are listed in the following table, one per row, with each sub-row giving the mappings of one function for X, X', Y. There are 4 equivalence classes spanning 15 functions, and 12 “bad” functions that don’t belong to any equivalence class.

      X  X' Y
------------------
1     X  X  X    1
      X  X  X'   2
      X  X' X    3
      X  X' X'   4
      X' X  X    5
      X' X  X'   6
      X' X' X    7
      X' X' X'   8
------------------
2     X  X  Y    9
      X  X' Y   10
      X' X  Y   11
      X' X' Y   12
------------------
3     Y  Y  X   13
      Y  Y  X'  14
------------------
4     Y  Y  Y   15
------------------
Bad   X  Y  X   16
      X  Y  X'  17
      X' Y  X   18
      X' Y  X'  19
      X  Y  Y   20
      X' Y  Y   21
      Y  X  X   22
      Y  X  X'  23
      Y  X' X   24
      Y  X' X'  25
      Y  X  Y   26
      Y  X' Y   27

Exercise: given a PER R, prove that an element is related to itself by R if and only if it is related to some element.

Theorem Prim_and_Proper {A} (R : A -> A -> Prop) :
  PER R ->
  forall x, (R x x <-> exists y, R x y).

(Solution)

Respectfulness

The relation R defined above for functions {X, X', Y} -> {X, X', Y} is an instance of a general construction. Given two sets D and C, equipped with relations RD : D -> D -> Prop and RC : C -> C -> Prop (not necessarily equivalences or PERs), two functions f, g : D -> C are respectful if they map related elements to related elements. Thus, respectfulness is a relation on functions, D -> C, parameterized by relations on their domain D and codomain C:

(* In the standard library: From Coq Require Import Morphisms *)
Definition respectful {D} (RD : D -> D -> Prop)
                      {C} (RC : C -> C -> Prop)
    (f g : D -> C) : Prop :=
  forall x y, RD x y -> RC (f x) (g y).

(Source)

The respectfulness relation is also cutely denoted using (==>), viewing it as a binary operator on relations.

Notation "f ==> g" := (respectful f g) (right associativity, at level 55)
  : signature_scope.

(Source)

For example, this lets us concisely equip a set of curried functions E -> D -> C with the relation RE ==> RD ==> RC. Respectfulness provides a point-free notation to construct relations on functions.

(RE ==> RD ==> RC) f g
<->
forall s t x y, RE s t -> RD x y -> RC (f s x) (g t y)

Respectfulness on D -> C can be defined for any relations on D and C. Two special cases are notable:

  • If RD and RC are PERs, then RD ==> RC is a PER on D -> C (proof), so this provides a concise definition of extensional equality on functions. (This was the case in the example above.)

  • If RD and RC are preorders (reflexive, transitive), then the proper elements of RD ==> RC are exactly the monotone functions.

Proper respectful functions and rewriting

Now consider the proper elements of a respectfulness relation. Recalling the earlier definition of properness, it transforms a (binary) relation into a (unary) predicate:

Proper : (A -> A -> Prop) -> (A -> Prop)

While we defined respectfulness as a binary relation above, we shall also say that a single function f is respectful when it maps related elements to related elements. The following formulations are equivalent; in fact, they are all the same proposition by definition:

forall x y, RD x y -> RC (f x) (f y)
=
respectful RD RC f f
=
(RD ==> RC) f f
=
Proper (RD ==> RC) f

The properness of a function f with respect to the respectfulness relation RD ==> RC is exactly what we need for rewriting. We can view f as a “context” under which we are allowed to rewrite its arguments along the domain’s relation RD, provided that f itself is surrounded by a context that allows rewriting along the codomain’s relation RC. In a proof, the goal may be some proposition in which f x occurs, P (f x); then we may rewrite that goal into P (f y) using an assumption RD x y, provided that Proper (RD ==> RC) f and Proper (RC ==> iff) P, where iff is logical equivalence, with the infix notation <->.

Definition iff (P Q : Prop) : Prop := (P -> Q) /\ (Q -> P).
Notation "P <-> Q" := (iff P Q).

Respectful functions compose:

Proper (RD ==> iff) (fun x => P (f x))
=
forall x y, RD x y -> P (f x) <-> P (f y)

And that, my friends, is the story of how the concept of “properness� relates to the proof technique of generalized rewriting.


Appendix: Pointwise relation

Another general construction of relations on functions is the “pointwise relation”. It only assumes a relation on the codomain RC : C -> C -> Prop. Two functions f, g : D -> C are related pointwise by RC if they map each element to related elements.

(* In the standard library: From Coq Require Import Morphisms *)
(* The domain D is not implicit in the standard library. *)
Definition pointwise_relation {D C} (RC : C -> C -> Prop)
    (f g : D -> C) : Prop :=
  forall x, RC (f x) (g x).

(* Abbreviation (not in the stdlib) *)
Notation pr := pointwise_relation.

(Source)

This is certainly a simpler definition: pointwise_relation RC is equivalent to eq ==> RC, where eq is the standard intensional equality relation.

One useful property is that pointwise_relation RC is an equivalence relation if RC is an equivalence relation. In comparison, we can at most say that RD ==> RC is a PER if RD and RC are equivalence relations. It is not reflexive as soon as RD is bigger than eq (the smallest equivalence relation) and RC is smaller than the total relation fun _ _ => True.

In Coq, the pointwise_relation is also used for rewriting under lambda abstractions. Given a higher-order function f : (E -> F) -> D, we may want to rewrite f (fun z => M z) to f (fun z => N z), using a relation forall z, RF (M z) (N z), where the function bodies M and/or N depend on z so the universal quantification is necessary to bind z in the relation. This can be done using the setoid_rewrite tactic, after having proved a Proper theorem featuring pointwise_relation:

Instance Proper_f : Proper (pointwise_relation RF ==> RD) f.

One disadvantage of pointwise_relation is that it is not compositional. For instance, it is not preserved by function composition:

Definition compose {E D C} (f : D -> C) (g : E -> D) : E -> C :=
  fun x => f (g x).

Theorem not_Proper_compose :
  not
   (forall {E D C}
           (RD : D -> D -> Prop) (RC : C -> C -> Prop),
    Proper (pr RC ==> pr RD ==> pr RC)
           (compose (E := E))).

Instead, at least the first domain of compose should be quotiented by RD ==> RC instead:

Instance Proper_compose {E D C}
    (RD : D -> D -> Prop) (RC : C -> C -> Prop) :
    Proper ((RD ==> RC) ==> pr RD ==> pr RC)
           (compose (E := E)).

We can even use ==> everywhere for a nicer-looking theorem:

Instance Proper_compose' {E D C} (RE : E -> E -> Prop)
    (RD : D -> D -> Prop) (RC : C -> C -> Prop) :
    Proper ((RD ==> RC) ==> (RE ==> RD) ==> (RE ==> RC))
           compose.

Exercise: under what assumptions on relations RD and RC do pointwise_relation RD and RC ==> RD coincide on the set of proper elements of RC ==> RD?

Solution
Theorem pointwise_respectful {D C} (RD : D -> D -> Prop) (RC : C -> C -> Prop)
  : Reflexive RD -> Transitive RC ->
    forall f g, Proper (RD ==> RC) f -> Proper (RD ==> RC) g ->
    pointwise_relation RC f g <-> (RD ==> RC) f g.
(Link to proof)

This table summarizes the above comparison:

                                    pointwise_relation   respectful (==>)
is an equivalence                   yes                  no
allows rewriting under binders      yes                  no
respected by function composition   no                   yes

Appendix: Parametricity

Respectfulness lets us describe relations RD ==> RC on functions using a notation that imitates the underlying type D -> C. More than a cute coincidence, this turns out to be a key component of Reynolds’s interpretation of types as relations: ==> is the relational interpretation of the function type constructor ->. Building upon that interpretation, we obtain free theorems to harness the power of parametric polymorphism.

Free theorems provide useful properties for all polymorphic functions of a given type, regardless of their implementation. The canonical example is the polymorphic identity type ID := forall A, A -> A. A literal reading of that type is that, well, for every type A we get a function A -> A. But this type tells us something more: A is abstract to the function, it cannot inspect A, so the only possible implementation is really the identity function fun A (x : A) => x. Free theorems formalize that intuition.

The type ID := forall A, A -> A is interpreted as the following relation RID:

Definition RID (f g : forall A, A -> A) : Prop :=
  forall A (RA : A -> A -> Prop), (RA ==> RA) (f A) (g A).

where we translated forall A, to forall A RA, and A -> A to RA ==> RA.

The parametricity theorem says that every typed term t : T denotes a proper element of the corresponding relation RT : T -> T -> Prop, i.e., RT t t holds. “For all t : T, RT t t” is the “free theorem” for the type T.

The free theorem for ID says that any function f : ID satisfies RID f f. Unfold definitions:

RID f f
=
forall A (RA : A -> A -> Prop) x y, RA x y -> RA (f A x) (f A y)

Now let z : A be an arbitrary element of an arbitrary type, and let RA := fun x _ => x = z. Then the free theorem instantiates to

x = z -> f A x = z

Equivalently,

f A z = z

that says exactly that f is extensionally equal to the identity function.


More reading

by Lysxia at April 07, 2022 12:00 AM

April 06, 2022

Well-Typed.Com

large-anon: Practical scalable anonymous records for Haskell

The large-anon library provides support for anonymous records; that is, records that do not have to be declared up-front. For example, used as a plugin along with the record-dot-preprocessor plugin, it makes it possible to write code such as this:

magenta :: Record [ "red" := Double, "green" := Double, "blue" := Double ]
magenta = ANON { red = 1, green = 0, blue = 1 }

reduceRed :: RowHasField "red" r Double => Record r -> Record r
reduceRed c = c{red = c.red * 0.9}

The type signatures are not necessary; type inference works as expected for these records. If you prefer to use lenses1, that is also possible:

reduceBlue :: RowHasField "blue" r Double => Record r -> Record r
reduceBlue = over #blue (* 0.9)

The library offers a small but very expressive API, and it scales to large records (with 100 fields and beyond), with excellent compilation time performance and good runtime performance. In this blog post we will first present the library from a user’s perspective, then give an overview of the internals with an aim to better understand the library’s runtime characteristics, and finally show some benchmarks. The library is available from Hackage and is currently compatible with ghc 8.8, 8.10 and 9.0 (extending this to 9.2 should not be too hard).

If you want to follow along, the full source code for all the examples in this blog post can be found in Test.Sanity.BlogPost in the large-anon test suite.

The simple interface

The library offers two interfaces, “simple” and “advanced.” We will present the simple interface first, then explore the advanced interface below.

The simple interface can be summarized as follows:

Data.Record.Anon.Simple

data Record (r :: Row Type)
  deriving (Eq, Ord, Show, Large.Generic, ToJSON, FromJSON)

data Pair a b = a := b
type Row k = [Pair Symbol k]

instance (RowHasField n r a, ..) => HasField n (Record r) a

empty   :: Record '[]
insert  :: Field n -> a -> Record r -> Record (n := a : r)
get     :: RowHasField n r a => Field n -> Record r -> a
set     :: RowHasField n r a => Field n -> a -> Record r -> Record r
project :: SubRow r r' => Record r -> Record r'
inject  :: SubRow r r' => Record r' -> Record r -> Record r
merge   :: Record r -> Record r' -> Record (Merge r r')
where Large.Generic comes from the large-generics package.

In the remainder of this section we will introduce this API by means of examples. When there is a possibility of confusion, we will use the prefix S. to refer to the simple interface (and A. for the advanced interface).

Record construction and field access

In the introduction we used some syntactic sugar: the ANON record constructor makes it possible to use regular record syntax for anonymous records. This syntax is available as soon as you use the large-anon plugin. ANON desugars to calls to empty and insert; it does not depend on any kind of internal or unsafe API, and there is no need to use it if you prefer not to (though see Applying pending changes):

purple :: Record [ "red" := Double, "green" := Double, "blue" := Double ]
purple =
     S.insert #red   0.5
   $ S.insert #green 0
   $ S.insert #blue  0.5
   $ S.empty

Similarly, the example in the introduction used RecordDotSyntax as provided by record-dot-preprocessor, but we can also use get and set:

reduceGreen :: RowHasField "green" r Double => Record r -> Record r
reduceGreen c = S.set #green (S.get #green c * 0.9) c

Constraints

The summary of the simple interface showed that Record has a Show instance. Let’s take a closer look at its precise signature:

instance (KnownFields r, AllFields r Show) => Show (Record r)

The KnownFields constraint says that the field names of r must be known, and the AllFields r Show constraint says that all fields of r must in turn satisfy Show; the Show instance uses this to output records like this:

> magenta
ANON {red = 1.0, green = 0.0, blue = 1.0}

In fact, Show for Record simply uses gshow from large-generics.

The order of the fields is preserved in the output: large-anon regards records with rows that differ only in their order as different types; isomorphic, but different. The project function can be used to translate between records with different field order; we shall see an example when we discuss sequenceA.

The RowHasField, KnownFields, AllFields and SubRow constraints (for project) are solved by the large-anon typechecker plugin, so you will need to add

{-# OPTIONS_GHC -fplugin=Data.Record.Anon.Plugin #-}

at the top of your Haskell file. We will see later how to manually prove such constraints when the plugin cannot.

Project and inject

In the previous section we saw that project can be used to reorder fields, but it is actually more general than that. In addition to reordering fields, we can also omit fields: a SubRow r r' constraint is satisfied whenever the fields of r' are a subset of the fields of r. Moreover, when SubRow r r' holds we can also update the larger record from the smaller one: project and inject together form a lens.
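
Here is a minimal sketch of that lens, using the standard van Laarhoven encoding (the name subRecord is ours, not part of the large-anon API):

{-# LANGUAGE RankNTypes #-}

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Getting is project; setting writes the smaller record back with inject.
subRecord :: SubRow r r' => Lens' (Record r) (Record r')
subRecord f large = (\small -> S.inject small large) <$> f (S.project large)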

Let’s consider an example. Suppose we have some kind of renderer with a bunch of configuration options:

type Config = [
      "margin"   := Double
    , "fontSize" := Int
    , "header"   := String
    , ...
    ]

defaultConfig :: Record Config
defaultConfig = ANON {
      margin   = 1
    , fontSize = 18
    , header   = ""
    , ...
    }

render :: Record Config -> ...

To call render, we would need to construct such a record; for example, we could do2

render $ defaultConfig{margin = 2}

There is an alternative, however. Rather than passing in the full configuration, we could offer an API where the caller only passes in the overrides:

render' :: SubRow Config overrides => Record overrides -> ...
render' overrides = render (S.inject overrides defaultConfig)

Now we no longer need to export a defaultConfig to the user:

render' $ ANON { margin = 2 }

The advanced interface

The key difference between the simple interface and the advanced one is that Record is additionally parameterised by a type constructor f:3

data Record (f :: k -> Type) (r :: Row k)

Intuitively, every field in the record will be wrapped in an application of f. Indeed, the simple interface is but a thin layer around the advanced one, instantiating f to the identity functor I:

magenta' :: A.Record I [ "red" := Double, "green" := Double, "blue" := Double ]
magenta' = S.toAdvanced magenta

The additional type constructor argument makes records a lot more expressive, and consequently the advanced API is much richer than the simple one. We will give some representative examples.

Foldable and zipping

“Folding” (as in Foldable) essentially means “turning into a list.” With records we cannot do that, unless all fields of the record happen to have the same type. We can express this by using the constant functor K:

collapse :: Record (K a) r -> [a]
toList   :: KnownFields r => Record (K a) r -> [(String, a)]
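
For example, here is a small sketch (assuming the advanced interface offers empty and insert analogous to the simple interface, as shown later in this post; the expected results are given in comments):

-- A record in which every field carries a String, via the constant
-- functor K; the field types Int and Bool play no role in the values.
labels :: A.Record (K String) [ "x" := Int, "y" := Bool ]
labels = A.insert #x (K "ex")
       $ A.insert #y (K "why")
       $ A.empty

-- A.collapse labels == ["ex", "why"]
-- A.toList   labels == [("x", "ex"), ("y", "why")]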

Similarly, because every field in the record has a different type, zipping requires a polymorphic function:

zipWith :: (forall x. f x -> g x -> h x) -> Record f r -> Record g r -> Record h r

(There are also monadic and constrained variations of zipping available.)

Example: toList and zipWith

Suppose we want to write a function that translates records to JSON values, but allows the user to provide per-field overrides which can change how the value of that field gets output. That is, we want to enable the user to provide a function of type

newtype FieldToJSON x = FieldToJSON (x -> Value)

for every field of type x. We will do this by providing a record of such functions to our JSON generation function, in addition to the actual record we want to translate:

recordToJSON :: KnownFields r => A.Record FieldToJSON r -> Record r -> Value
recordToJSON fs xs = Aeson.object . map (first fromString) $
    A.toList $ A.zipWith aux fs (S.toAdvanced xs)
  where
    aux :: FieldToJSON x -> I x -> K Value x
    aux (FieldToJSON f) (I x) = K (f x)

The function aux returns K Value x, emphasizing that the result of aux is a Value no matter what the type of the field was; this is what enables the call to toList.

It is worth noting quite how short and simple this function is; try doing this with regular records!

Applicative

Recall the types of pure and (<*>) from the prelude:

pure  :: Applicative f => a -> f a
(<*>) :: Applicative f => f (a -> b) -> f a -> f b

Records are “Applicative-like,” but don’t quite match this interface because, again, every field of the record has a different type. The corresponding functions in the advanced record API are:

pure  :: KnownFields r => (forall x. f x) -> Record f r
cpure :: AllFields r c => Proxy c -> (forall x. c x => f x) -> Record f r
ap    :: Record (f -.-> g) r -> Record f r -> Record g r

A function of type (f -.-> g) x is really a function from f x to g x; thus, the type of ap says: “provided you have a record containing functions from f x to g x for every field of type x in the record, and a corresponding record of arguments of type f x, then I can construct a record of results of type g x.”

Similarly, to construct a record in the first place, we can use pure or cpure. The type of pure is simpler, but it is less often useful: it requires the caller to construct a value of type f x for any x at all. Often that is not possible, and we need to know that some constraint c x holds; cpure can be used in this case.

If you have used large-generics or (more likely) sop-core before, you will find this style familiar. If not, this may look a little intimidating, but hopefully the examples in this blog post will help. You might also like to read the paper True Sums of Products where this style of programming was introduced.

Example: cpure

Our example JSON construction function took as argument a record of FieldToJSON values. In most cases, we just want to use toJSON for every field. We can write a function that constructs such a record for any row using cpure:

defaultFieldToJSON :: AllFields r ToJSON => A.Record FieldToJSON r
defaultFieldToJSON = A.cpure (Proxy @ToJSON) (FieldToJSON toJSON)

Suppose for the sake of an example that we want to generate JSON for our Config example, but that we want to output null for the header if it’s empty:

headerToJSON :: String -> Value
headerToJSON "" = Aeson.Null
headerToJSON xs = toJSON xs

Then

recordToJSON
  defaultFieldToJSON{header = FieldToJSON headerToJSON}
  defaultConfig

will result in something like

{
    "margin": 1,
    "fontSize": 18,
    "header": null
}

Example: ap

Suppose that we want the function that creates the value to also be passed the field name:

newtype NamedFieldToJSON a = NamedFieldToJSON (String -> a -> Value)

Our generation function must now zip three things: the record of functions, a record of names, and the actual record of values. We can get a record of names using

reifyKnownFields :: KnownFields r => proxy r -> Record (K String) r

(We will see reification and reflection of constraints in more detail when we discuss how to manually prove constraints.) However, large-anon does not offer a zipWith3. Not to worry: just as for ordinary Applicative structures we can write

pure f <*> xs <*> ys <*> zs

to combine three structures, we can do the same for records:

recordToJSON' :: forall r.
     KnownFields r
  => A.Record NamedFieldToJSON r -> Record r -> Value
recordToJSON' fs xs = Aeson.object . map (first fromString) $
    A.toList $
             A.pure (fn_3 aux)
      `A.ap` fs
      `A.ap` A.reifyKnownFields (Proxy @r)
      `A.ap` S.toAdvanced xs
  where
    aux :: NamedFieldToJSON x -> K String x -> I x -> K Value x
    aux (NamedFieldToJSON f) (K name) (I x) = K (f name x)

Traversable

The essence of Traversable is that we sequence effects: given some traversable structure of actions, create an action returning the structure:

sequenceA :: (Traversable t, Applicative f) => t (f a) -> f (t a)

We can do the same for records; the advanced API offers

sequenceA' :: Applicative m => Record m r -> m (Record I r)
sequenceA  :: Applicative m => Record (m :.: f) r -> m (Record f r)

and the simplified API offers

sequenceA :: Applicative m => A.Record m r -> m (Record r)

When we are sequencing actions, order matters, and large-anon guarantees that actions are executed in row-order (another reason not to consider rows “up to reordering”).

Example: sequenceA

Let’s go back to our Config running example, and let’s assume we want to write a parser for it. Let’s say that the serialised form of the Config is just a list of values, something like

2.1 14 Example

Then we could write our parser as follows (ANON_F is the equivalent of ANON for the advanced interface):

parseConfig :: Parser (Record Config)
parseConfig = S.sequenceA $ ANON_F {
      margin   = parseDouble
    , fontSize = parseInt
    , header   = parseString
    }

We are using sequenceA to turn a record of parsers into a parser of a record. However, what if the order of the serialised form does not match the order in the record? No problem, we can parse in the right order and then use project to reorder the fields:

parseConfig' :: Parser (Record Config)
parseConfig' = fmap S.project . S.sequenceA $ ANON_F {
      header   = parseString
    , margin   = parseDouble
    , fontSize = parseInt
    }

Of course, reordering first and then sequencing would not work: the parsers would run in the wrong order!

Incidentally, anonymous records have an advantage over regular records here; with normal records we could write something like

parseConfig :: Parser Config
parseConfig =
        MkConfig
    <$> parseDouble
    <*> parseInt
    <*> parseString

but there is no way to use the record field names with Applicative (unless we explicitly give the record a type constructor argument and then write infrastructure for dealing with it), nor is there an easy way to change the order.

Manually proving constraints

This section is aimed at advanced usage of the library; in most cases, use of the API we describe here is not necessary.

The large-anon type checker plugin proves KnownFields, AllFields and SubRow constraints, but only for concrete rows. When this is insufficient, the advanced interface provides three pairs of functions, one pair for proving each of these constraints.

Inductive reasoning over these constraints is not possible. Induction over type-level structures leads to large ghc core size and bad compilation time, and is avoided entirely in large-anon.

Example: reflectAllFields

For reflectAllFields the pair of functions looks like this:

reifyAllFields   :: AllFields r c => proxy c -> Record (Dict c) r
reflectAllFields :: Record (Dict c) r -> Reflected (AllFields r c)

The former turns a constraint AllFields over a record into a record of dictionaries; the latter goes in the opposite direction. The only difference between Dict (defined in sop-core) and Reflected (defined in large-anon) is that the former takes a type constructor argument:

data Dict c a where
  Dict :: c a => Dict c a

data Reflected c where
  Reflected :: c => Reflected c

We’ll consider two examples. First, if a constraint c holds for every field in some larger record, then it should also hold for every field in a record with a subset of the larger record’s fields:

smallerSatisfies :: forall r r' c.
     (SubRow r r', AllFields r c)
  => Proxy c -> Proxy r -> Reflected (AllFields r' c)
smallerSatisfies pc _ =
    A.reflectAllFields $ A.project (A.reifyAllFields pc :: A.Record (Dict c) r)

Second, if a constraint c implies c', then if every field of a record satisfies c, every field should also satisfy c'. For example, Ord implies Eq, and hence:

ordImpliesEq :: AllFields r Ord => Reflected (AllFields r Eq)
ordImpliesEq =
    A.reflectAllFields $ A.map aux (A.reifyAllFields (Proxy @Ord))
  where
    aux :: forall x. Dict Ord x -> Dict Eq x
    aux Dict = Dict

Example: reflectSubRow

For the SubRow constraint, the pair of functions is

data InRow r a where
  InRow :: (KnownSymbol n, RowHasField n r a) => Proxy n -> InRow r a

reifySubRow   :: (KnownFields r', SubRow r r') => Record (A.InRow r) r'
reflectSubRow :: Record (A.InRow r) r' -> Reflected (SubRow r r')

For our final and most sophisticated example of the use of the advanced API, we will show how we can do a runtime check to see if one row can be projected to another. Such a check is useful when dealing with records with existential rows, for example when constructing records from JSON values (see someRecord in the advanced API). The large-anon test suite contains an example of this in Test.Infra.DynRecord.Simple, as well as a slightly better version of checkIsSubRow in Test.Infra.Discovery.

Starting point

We want to write a function of type

checkIsSubRow ::
     (..)
  => proxy r -> proxy' r' -> Maybe (Reflected (SubRow r r'))

We need to use reflectSubRow to do this, so we need to construct a record over r', where every field contains evidence that that field is a member of r.

Let’s consider how to do this one bit at a time, starting with perhaps a non-obvious first step: we will use reifySubRow to construct a record for r with evidence that every field of r is (obviously!) a member of r, and similarly for r':

checkIsSubRow _ _ =
    A.reflectSubRow <$> go A.reifySubRow A.reifySubRow
  where
    go :: A.Record (InRow r ) r
       -> A.Record (InRow r') r'
       -> Maybe (A.Record (InRow r) r')
    go r r' = ...

The strategy is now going to be as follows: we are going to try to translate the evidence of InRow r' to evidence of InRow r, by matching every field of r' with the corresponding field in r (if it exists).

Matching fields

In order to check if we have a match, we need to check two things: the field names need to match, and the field types need to match. For the former we can use

sameSymbol ::
     (KnownSymbol n, KnownSymbol n')
  => Proxy n -> Proxy n' -> Maybe (n :~: n')

from GHC.TypeLits, but to be able to do a runtime type check we need some kind of runtime type information. An obvious choice would be to use Typeable, but here we will stick with something simpler. Let’s suppose the only types we are interested in are Int and Bool; we can implement a runtime type check as follows:

data SupportedType a where
  SupportedInt  :: SupportedType Int
  SupportedBool :: SupportedType Bool

class IsSupportedType a where
  supportedType :: Proxy a -> SupportedType a

instance IsSupportedType Int  where supportedType _ = SupportedInt
instance IsSupportedType Bool where supportedType _ = SupportedBool

sameType :: SupportedType a -> SupportedType b -> Maybe (a :~: b)
sameType SupportedInt  SupportedInt  = Just Refl
sameType SupportedBool SupportedBool = Just Refl
sameType _             _             = Nothing

With this in hand, let’s now go back to our matching function. We have evidence that some field x' is a member of r', and we want evidence that x' is a member of r. We do this by trying to match it against evidence that another field x is a member of r, checking both the field name and the field type:

checkIsMatch :: forall x x'.
     (IsSupportedType x, IsSupportedType x')
  => InRow r' x' -> InRow r x -> K (Maybe (InRow r x')) x
checkIsMatch (InRow x') (InRow x) = K $ do
    Refl <- sameSymbol x x'
    Refl <- sameType (supportedType (Proxy @x)) (supportedType (Proxy @x'))
    return $ InRow x

Now for a given field x' of r', we need to look through all the fields in r, looking for a match:

findField :: forall x'.
      IsSupportedType x'
   => A.Record (InRow r) r -> InRow r' x' -> Maybe (InRow r x')
findField r x' =
    listToMaybe . catMaybes . A.collapse $
      A.cmap (Proxy @IsSupportedType) (checkIsMatch x') r

Finally, we just need to repeat this for all fields of r'; the full implementation of checkIsSubRow is

checkIsSubRow :: forall (r :: Row Type) (r' :: Row Type) proxy proxy'.
     ( KnownFields r
     , KnownFields r'
     , SubRow r  r
     , SubRow r' r'
     , AllFields r  IsSupportedType
     , AllFields r' IsSupportedType
     )
  => proxy r -> proxy' r' -> Maybe (Reflected (SubRow r r'))
checkIsSubRow _ _ =
    A.reflectSubRow <$> go A.reifySubRow A.reifySubRow
  where
    go :: A.Record (InRow r ) r
       -> A.Record (InRow r') r'
       -> Maybe (A.Record (InRow r) r')
    go r r' = A.cmapM (Proxy @IsSupportedType) (findField r) r'

Discussion: choice of InRow

Recall the type of reflectSubRow:

data InRow r a where
  InRow :: (KnownSymbol n, RowHasField n r a) => Proxy n -> InRow r a

reflectSubRow :: Record (A.InRow r) r' -> Reflected (SubRow r r')

This may look obvious in hindsight, but during development of the library it was far from clear what the right representation was for the argument to reflectSubRow; after all, we are dealing with two rows r and r', and it was not evident how to represent this as a single record.

When we finally settled on the above representation it intuitively “felt right,” and this intuition was confirmed in two ways. First, checkIsSubRow previously could only be defined internally in the library by making use of unsafe features; the library is now expressive enough that it can be defined entirely user-side. Indeed, Test.Infra.Discovery in the large-anon test suite also provides an example of the runtime computation of the intersection between two rows, again using safe features of the library only (it turns out that this is a minor generalization of checkIsSubRow).

Secondly, if we look at the generated core for reflectSubRow (and clean it up a bit), we find

reflectSubRow d = unsafeCoerce $ fmap aux (toCanonical d)
  where
    aux (InRow _name index _proxy) = index

so we see that it literally just projects out the indices of each field, which is quite satisfying. In fact, if we didn’t include evidence of KnownSymbol in InRow then reflectSubRow would just be the identity function!

Indeed, the choice to include KnownSymbol evidence in InRow is somewhat unfortunate, as it feels like an orthogonal concern. Ultimately the reason we need it is that the kind of the type constructor argument to Record is k -> Type, rather than Symbol -> k -> Type: it is not passed the field names, and hence the field name must be an existential in InRow.

Internal representation

In this section we will give a short overview of the internal representation of a Record. The goal here is not to provide a detailed overview of the internals of the library, but rather to provide users with a better understanding of its runtime characteristics.

A Record is represented as follows:

data Record (f :: k -> Type) (r :: Row k) =
    NoPending  {-# UNPACK #-} !(Canonical f)
  | HasPending {-# UNPACK #-} !(Canonical f) !(Diff f)

We’ll consider the two cases separately.

No pending changes

When there are no pending changes (that is, updated or added fields), Record just wraps Canonical:

newtype Canonical (f :: k -> Type) = Canonical (StrictArray (f Any))
newtype StrictArray a = WrapLazy { unwrapLazy :: SmallArray a }

In addition, the evidence for RowHasField is just an Int:

class RowHasField (n :: Symbol) (r :: Row k) (a :: k) | n r -> a where
  rowHasField :: Tagged '(n, r, a) Int

This means that reading from a record in canonical form is just an array access, and should be very fast.
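
To illustrate the idea, here is a hypothetical sketch of field access (our own simplification, not the library’s actual code), using indexSmallArray from Data.Primitive.SmallArray, unTagged from Data.Tagged, and unsafeCoerce:

-- Hypothetical sketch: fetch the Int evidence, index into the array,
-- and coerce the untyped f Any back to the field's real type f a.
readField :: forall n r a f. RowHasField n r a => Canonical f -> f a
readField (Canonical arr) =
    unsafeCoerce (indexSmallArray (unwrapLazy arr) ix)
  where
    ix :: Int
    ix = unTagged (rowHasField :: Tagged '(n, r, a) Int)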

Pending changes

Updating is, however, an expensive operation, because the entire array needs to be copied. This is fine for small arrays, but it is not an approach that scales well. Record therefore represents a record with pending changes—added or updated fields—as a combination of the original array along with a Diff:

data Diff (f :: k -> Type) = Diff {
      diffUpd :: !(IntMap (f Any))
    , diffIns :: [FieldName]
    , diffNew :: !(SmallHashMap FieldName (NonEmpty (f Any)))
    }

The details don’t matter too much, but diffUpd contains the new values of updated fields, and diffIns records which new fields have been inserted; diffNew is necessary to deal with shadowing, which is beyond the scope of this blog post.

FieldName is a combination of a precomputed hash and the name of the field:

data FieldName = FieldName {
      fieldNameHash  :: Int
    , fieldNameLabel :: String
    }

instance Hashable FieldName where
  hash = fieldNameHash

These hashes are computed at compile time (through the KnownHash class, defined in large-anon).

The take-away here is that the performance of a Record will degrade to the performance of a hashmap (with precomputed hashes) when there are many pending updates. This makes updating the record faster, but accessing the record slower.

Applying pending changes

The obvious question then is when we apply pending changes, so that we have a flat array again. First of all, the library provides a function to do this:

applyPending :: Record f r -> Record f r

(and similarly in the simplified interface). It might be advisable to call this function after having done a lot of field updates, for example. Of course, we shouldn’t call it after every field update, because that would result in a full array copy for every update again.
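
For example (a small sketch, assuming applyPending here is the simple-interface version just mentioned):

-- Batch several field updates, then flatten back to canonical form once.
updateConfig :: Record Config -> Record Config
updateConfig =
      S.applyPending
    . S.set #margin   2
    . S.set #fontSize 12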

The library also calls applyPending internally in two places:

  • The ANON and ANON_F syntactic sugar call applyPending after the record has been constructed.
  • All of the combinators on records (map, pure, zipWith, etc.) call applyPending on any input records, and only construct records in canonical form. Since these operations are anyway O(n), the additional cost of calling applyPending is effectively hidden.

Benchmarks

So does all this work? Yes, yes it does, and in this section we will show a bunch of benchmarks to prove it. For a baseline, we will compare against superrecord; this is a library which has been optimized for runtime performance, but makes heavy use of type families and induction and consequently suffers from poor compilation times. It could certainly be argued that this is not the library’s fault, and that ghc should do better; for now, however, we must work with what we have. It should also be noted that unlike large-anon, superrecord does treat rows “up to reordering.”

Record construction

In superrecord there are two ways to construct records: a safe API (rnil and rcons), and an unsafe API (unsafeRNil and unsafeRCons). The latter is unsafe in two ways: unsafeRNil must be told how much space to allocate for the record, and unsafeRCons does in-place update of the record, potentially breaking referential transparency if used incorrectly.

The safe API has such bad compilation time performance that we effectively cannot compare it to large-anon. By the time we get to records of 40 fields, we end up with a ghc core size of 25 million AST nodes (terms, types and coercions), and it takes 20 seconds to compile a single record; this time roughly doubles with every 10 more fields.

We will instead compare with the unsafe API:

We see that for records with 80 fields, large-anon results in ghc core that is roughly an order of magnitude smaller, and compilation time that is about 5.5x faster. The left graph here might suggest that the ghc core size generated by large-anon is linear in the size of the record; this is not quite the case:

(We are showing the core size after desugaring, the very simple optimizer, and the simplifier, but in this case all three are basically of identical size.) The green line is what large-anon does out of the box, and we see that it is not linear. The reason is that repeated calls to insert result in O(n²) type arguments (see Avoiding quadratic core code size with large records for a detailed discussion of this problem). We do have experimental support for integration with typelet (see Type-level sharing in Haskell, now), and while it does indeed result in ghc core that is linear in size (blue line), unfortunately it actually makes compilation time worse (although still very good) – at least for this benchmark. Fortunately, compilation time without typelet is linear (again, for this benchmark).

The runtime performance of superrecord is much better, of course:

The most relevant lines here are the red line (unsafe superrecord API) and the green line (default large-anon: no typelet integration, and with a call to applyPending after the record is constructed). We see that superrecord is significantly faster here, by roughly two orders of magnitude. This is not surprising: large-anon first builds up a Map, and then flattens it, whereas superrecord just constructs a single array and then updates it in place (albeit in an unsafe manner).

Accessing record fields

Let’s now consider the performance of reading a bunch of fields from a record. The benchmark here constructs a function that extracts half of the fields of a record (into a non-record datatype).

The ghc core size in large-anon is so small that it is entirely dwarfed by superrecord; it is in fact linear, going up to roughly 3,500 AST nodes for a record of 80 fields, about 3 orders of magnitude better than superrecord. Compilation time is similarly much better, by more than an order of magnitude (50 ms versus 2.5 seconds), and also linear. Showing just large-anon by itself:

Comparing runtime is a bit more difficult, because of the hybrid representation used by large-anon: it very much depends on whether the record has many pending changes or not. We will therefore measure the two extremes: when the record has no pending changes at all, and when the record consists entirely of pending changes, with an empty base array:

Note that when the record is in canonical form (green line), large-anon and superrecord have very similar performance; large-anon is slower by roughly 2x, which can be explained by having to check whether the record is in canonical form on every field access. At the other extreme (blue line), large-anon again degrades to the performance of a Map and is therefore about an order of magnitude slower. Actual performance in any application will fall somewhere between these two extremes.

Updating record fields

The hybrid nature of large-anon here too makes a direct comparison with superrecord a bit difficult. The performance of updating a single field will be different to updating many, and will depend on whether or not we call applyPending. We will therefore show a few different measurements.

Let’s first consider updating a single field. Both superrecord and large-anon have good compilation time performance here; superrecord is non-linear, but in this benchmark we don’t really notice this because compilation is essentially negligible:

In terms of runtime, however, since superrecord needs to copy the entire array, we expect large-anon to do better here:

Indeed, updating a single field has a constant cost in large-anon, since it just adds a single entry to the map.

Of course, in practice we will eventually want to update a bunch of fields, and then call applyPending, so let’s measure that too. First, compilation time:

Here the non-linear compilation time of superrecord really becomes noticeable; for a record of 80 fields, it is again more than an order of magnitude slower (50 ms versus 2.5 seconds).

At runtime, field update in large-anon is slightly slower than superrecord for small arrays, but does better than superrecord for larger records. After all, every single field update results in a full array copy in superrecord, which is inherently O(n²). By contrast, large-anon merely updates the map, and then flattens it out at the end, constructing a single array. This is more expensive for smaller arrays, but is O(n log n) instead and therefore scales better, becoming faster for larger arrays. Of course, it does mean that applyPending must be called at an appropriate moment (see Applying pending changes).

We should emphasize again that the goal of large-anon was not to create a library that would be better than superrecord at runtime, but rather to create a library with good enough runtime performance but excellent compile time performance. Nonetheless, the O(n²) vs O(n log n) cost of updating records may be important for some applications. Moreover, all functions in large-anon that work with entire records (functions such as (c)map and co.) are O(n).

Generics

There is no explicit support for generics in superrecord, but it does support conversions between records and JSON values. We will compare this to the JSON conversion functions in large-anon, which are defined in terms of generics (indeed, they are just the functions defined in large-generics). Thus, toJSON will serve as an example of a generic consumer, and parseJSON as an example of a generic producer. If anything this benchmark is skewed in favour of superrecord, because what we are measuring there is the performance of more specialized functions.

Let’s first consider the generic consumer, i.e., toJSON:

The ghc core size and compilation time of large-anon get dwarfed here by those of superrecord, so let’s consider them by themselves:

We see that the ghc core size in large-anon is beautifully linear, and so is compilation time.4 Moreover, compilation time is roughly two orders of magnitude faster than superrecord (60 ms versus 6 seconds).

Runtime performance:

We see that large-anon is a little more than 2x slower than superrecord, quite acceptable performance for a generic function.

Finally, the generic producer, i.e., parseJSON:

Here the difference in compile time is less extreme, but large-anon is still roughly an order of magnitude faster (with much, much smaller ghc core). Runtime:

We see that superrecord is again roughly 2x faster (slightly less).

Conclusions

The large-anon library provides anonymous records for Haskell, which are

  • practical: the library comes with good syntactic sugar and a very expressive API.
  • scalable: compilation time is linear in the size of records.

For records with 80 fields, compilation time is somewhere between one and two orders of magnitude faster than superrecord. For runtime performance of reading record fields, large-anon lies somewhere between superrecord and Data.Map; for writing record fields and generic operations, large-anon is up to roughly 2x slower than superrecord, but sometimes much faster. The runtime performance of the library can almost certainly be improved; the focus has been on excellent compilation time performance, not excellent runtime performance. That said, I would be pretty certain that for nearly all applications the runtime performance of large-anon is just fine.

The development of large-anon is the latest, and for now probably final, installment in our research on improving compilation time on behalf of Juspay; see the blog posts tagged with compile-time-performance for everything we have written on this topic. In addition, the large-records repo contains a detailed benchmarks report, covering large-records, large-anon, and typelet, as well as the various individual experiments we have done. Besides documenting the research, perhaps this can also help research into compilation time by other people. We are thankful to Juspay for sponsoring this research and improving the Haskell ecosystem.

Other features

We have covered most of the library’s features in this blog post, but not quite all:

  • All examples of the advanced API in this blog post have been over rows of kind Type (*). The actual API is kind polymorphic; Test.Sanity.PolyKinds in the large-anon test suite contains an example of records with types like this:

    Record Boxed ["a" := Lazy Bool, "b" := Strict Int]

    This is taking advantage of kind polymorphism to differentiate between strict and lazy fields. (In practice this is probably overkill; large-anon is strict by default, and to get lazy fields you can just use a box: data Box a = Box a.)

    Indeed, the runtime functions on rows such as checkIsSubRow (see section Example: reflectSubRow above) are also entirely kind polymorphic, and as demonstrated in Test.Infra.DynRecord.Advanced, row discovery for existential records also works for kinds other than Type.

  • Records can also be merged (concatenated):

    merge :: Record f r -> Record f r' -> Record f (Merge r r')

    The Merge type family does not reduce:

    example :: Record Maybe (Merge '[ "a" :=  Bool ] '[])
    example = merge (insert #a (Just True) empty) empty

    HasField constraints can be solved for rows containing applications of Merge, and project can be used to flatten merged records:

    example :: Record Maybe '[ "a" :=  Bool ]
    example = project $ merge (insert #a (Just True) empty) empty
  • We have not covered the full set of combinators, but hopefully the Haddock documentation is helpful here. Moreover, the set of combinators should be entirely familiar to people who have worked with large-generics or sop-core.

  • In principle the library supports scoped labels. After all, insert has no constraints:

    insert :: Field n -> f a -> Record f r -> Record f (n := a : r)

    The absence of any constraints on insert means that a sequence of many calls to insert to construct a large record is perfectly fine in terms of compilation time, but it also means that fields inserted later can shadow fields inserted earlier. Indeed, those newer fields might have different types than their older namesakes. Everything in the library is designed to take this into account, and I believe it makes for a simpler and more uniform design; a small sketch of shadowing follows this list.

    However, the library currently offers no API for making shadowed fields visible again by removing the field that is shadowing them. There is no fundamental reason why this isn’t supported, merely a lack of time. The work by Daan Leijen in scoped labels (for example, Extensible records with scoped labels) may provide some inspiration here.
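
Here is the promised sketch of shadowing in action (we assume, as the design described above implies, that field access resolves to the most recently inserted field):

-- The second insert of #a shadows the first, at a different type.
shadowed :: Record [ "a" := Bool, "a" := Int ]
shadowed = S.insert #a True
         $ S.insert #a (1 :: Int)
         $ S.empty

-- S.get #a shadowed == True  (the newer Bool field wins)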

Alternative approaches

In a previous blog post, Induction without core-size blow-up: a.k.a. Large records: anonymous edition, we discussed some techniques that can be used to do type-level induction in Haskell without resulting in huge ghc core and bad compilation time. The reason we did not go down this path for large-anon was primarily one of usability.

Consider checking whether a field is a member of a (type-level) row. If the row is a list, then the search is necessarily O(n). If we want to reduce this to O(log n), we could index records by type-level balanced trees. We explored this to some degree; in fact, we’ve gone as far as implementing guaranteed-to-be-balanced type-level red-black trees. In the end though this results in a poorer user experience, since these type-level trees then appear in user visible types, error messages, and so on.

Using a plugin resulted in a more practical library. Note, though, that we are using a plugin only for better compile time performance. In principle everything that large-anon does could be done with type families within Haskell itself; this is different to plugins such as Coxswain which really try to implement a theory of rows. The large-anon library does not attempt this; this keeps the library simpler, but also more predictable. Indeed, we have seen various examples above where having rows be ordered is useful.


  1. The large-anon library comes with support for optics out of the box, but of course integration with other flavours of lenses is also possible.↩︎

  2. The record-dot-preprocessor syntax for record field update is r{f = ..}, with no spaces allowed; currently none of r { f = .. }, r{ f = .. } or r {f = ..} are recognized, although this is apparently not quite intentional. See the GitHub ticket about Syntax for updating multiple fields?.↩︎

  3. This technique is used by various records and generic programming libraries, such as barbies, higgledy, sop-core and vinyl.↩︎

  4. Compilation time measurements are inherently somewhat noisy when times are small, which explains the outlier at a record size of 90 fields. This is why we present ghc core size measurements as well, which are much more reliably reproducible.↩︎

by edsko at April 06, 2022 12:00 AM

April 04, 2022

FP Complete

Hiring Haskell Developers

FP Complete is actively seeking multiple engineers to work with our globally distributed team of software engineers. This blog post is to announce a new job opening for a developer role focused on Haskell development. At all times you can find an up-to-date listing of job openings on our jobs page. Below is the job information on the new Haskell position.


Senior Haskell Engineer

FP Complete is an engineering consulting firm specializing in reliable, automated server-side systems. Our customers span the globe and cover such diverse industries as FinTech, life sciences, academia, and blockchain. Our software, systems, and DevOps engineers are a remote-first team who love to solve complicated problems well, delivering elegant and robust solutions.

We're seeking to expand our team of Haskell developers with at least one additional team member. The focus of this role is to augment our existing team working on customer-facing projects. Our goal is to improve stability and performance of the codebases while adding additional features and integration points.

If you're looking to work on interesting projects with a team of experienced Haskell engineers, keep reading for more details, and be sure to send us your CV at jobs@fpcomplete.com.

Location: Fully remote
Type of engagement: Preference for full time, though part time positions may be available for the right candidate.

Requirements

We are looking for software developers with professional development experience. Developers with significant Haskell knowledge but no prior Haskell professional work experience are welcome to apply. We strive to create an environment where theoretical Haskell skills can be applied to real world codebases.

  • No specific location requirements, work from anywhere. You just need a good internet connection and the ability to communicate well in English, both in writing and orally.
  • 4+ years professional software development experience
  • 2+ years experience with Haskell. Professional experience or open source contributions are ideal, though demonstrable knowledge through personal projects will work as well.
  • Passion to learn and hone new skills
  • Ability to communicate clearly and consistently with a remote team, including coworkers and customers
  • Experience with FP Complete approaches to Haskell is a plus, such as the RIO library and exception handling best practices.
  • Experience working with SQL databases, and ideally Haskell libraries for working with SQL such as persistent.

Additionally, the following skills are a huge plus:

  • Experience with CI/CD management, ideally for Haskell projects, but general experience is helpful too
  • Infrastructure management, especially cloud
  • Server software development and debugging
  • Skillsets matching other FP Complete job postings, such as DevOps, Rust, Scala, or frontend development

Why FP Complete

FP Complete is an engineer-driven organization. We strive to foster an environment where engineers can create excellent solutions that they’re proud of. You will have an opportunity to work with, learn from, and mentor other engineers across the globe with a variety of different skill sets, including DevOps engineers, web developers, high performance computing experts, and compiler authors. We try to give every team member opportunities to learn, grow, and thrive. This includes cross training on projects, as well as regular internal collaboration and training meetings on general engineering topics, Haskell, Rust, and DevOps.

For our entire ten-year history, FP Complete has been a remote-first company, with no central office. We offer flexible work hours and location. You don’t need to worry about missing the in-office discussions, as the entire team communicates exclusively remotely.

We service a wide range of industries with customers of various sizes and differing tech stacks. While the work can be challenging, it offers great opportunities to get a broad view of the industry in general.

We are also strong proponents of open-source software. As a company, and as individuals on our team, we maintain a large swath of open-source projects, including many critical pieces of Haskell infrastructure, plus Rust and DevOps projects as well. Our approach to DevOps always follows a strong OSS bias.

Learn more about what we do at https://www.fpcomplete.com/.

How to apply

To apply for this position, please send a cover letter and CV/resume to jobs@fpcomplete.com.

April 04, 2022 12:00 AM

April 01, 2022

Well-Typed.Com

Performance improvements for HLS

TL;DR: Upcoming HLS releases will be substantially faster and more responsive for large codebases using Template Haskell, thanks to work by Well-Typed on behalf of Mercury.

Haskell Language Server (HLS) is a key part of the IDE experience for Haskell. Well-Typed are actively exploring ways to maintain and improve HLS, in particular making it robust and performant on large projects.

Recently, Mercury asked us to deal with performance problems in HLS that were preventing their developers from using it effectively. In particular, for large projects using Template Haskell, HLS would take a long time to start up, and would spend far too much time recompiling after each edit to a file. This made it difficult to use interactively with such projects.

This post describes the progress we have made in improving performance in GHC and HLS for projects using Template Haskell. The work has been primarily completed by Zubin Duggal with help from Matthew Pickering. It involves three key changes:

  • Speeding up recompilation by reducing how often GHC and HLS need to recompile modules after their dependencies are changed.

  • Reducing start-up times by serializing Core code to interface files.

  • Taking advantage of the serialized Core to generate bytecode only when really needed.

We will discuss each of these in turn, followed by benchmarks illustrating the impact of the changes.

Speeding up recompilation given Template Haskell

For interactive development with HLS, a project is fully checked once on startup. Subsequently, when the developer edits a file, that module and any modules that depend on it may then need to be re-checked. In the absence of Template Haskell, HLS can type-check a project and then stop without actually doing code generation, thereby avoiding lots of computation. However, when Template Haskell is used, functions used in TH splices need to be compiled to bytecode or object code in order to run the splices and continue type-checking.

GHC implements a “recompilation check” that decides whether a given module needs to be recompiled when one of its dependencies has changed. For example, if the only function that has changed is never used in the module, there is no need to recompile. This can save substantial amounts of work.

However, in GHC 9.2 and previous versions, where a module enables Template Haskell, that module will be recompiled whenever any of its dependencies are modified at all. The key requirement is that if the definition of any identifier used in a splice is modified, then we need to recompile. By eagerly recompiling if anything changes at all, GHC will always catch the case when something used in a splice is changed, but it’s not a very precise solution, and leads to many unnecessary recompiles.
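
To illustrate the distinction, consider a hypothetical pair of modules (module B and its exports y :: Integer and z :: Int are assumptions made up for this example):

{-# LANGUAGE TemplateHaskell #-}
module A where

import Language.Haskell.TH (litE, integerL)
import B (y, z)

-- B.y is used inside a splice: if its definition changes, A really must
-- be recompiled.
fromSplice :: Int
fromSplice = $(litE (integerL y))

-- B.z is only used in ordinary (non-splice) code: under the improved
-- scheme, a change to B that does not affect y no longer forces A to be
-- recompiled just because A enables Template Haskell.
useZ :: Int
useZ = z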

The crux of the improved recompilation performance is to recompile only if a module which defines a symbol used in a splice is changed, rather than any module at all. We have implemented this scheme in both GHC and HLS:

  • The HLS patch will be available in the next HLS release, and will apply when using HLS with any supported GHC version.

  • The GHC patch will be part of the GHC 9.4 release, and will apply when compiling outside of HLS with GHC or GHCi.

Fine-grained recompilation avoidance in HLS

HLS uses its own build system, hls-graph, based on Shake but adapted to suit the needs of HLS (e.g. storing everything in memory rather than on disk, and minimizing rebuilds). The build system defines rules that specify when changes to one build product may affect another. However, these rules are necessarily somewhat coarser-grained than the fine-grained recompilation avoidance check implemented by GHC.

Previously, the build system could reuse old build products only if the rule wasn’t triggered at all, or if old build products were already available on disk. This meant that even if GHC’s fine-grained recompilation avoidance check knew that recompilation was not required, HLS would still have to recompile in order to have the build products around for subsequently running splices.

Thus another crucial improvement to speed up recompilation is to allow the build rules to have access to the value computed on a previous run. Then when a rule is rebuilt, it can first invoke GHC’s fine-grained recompilation check, and if that indicates no recompilation is necessary, immediately return the previous value.
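
Schematically, the new rule shape looks something like this (a sketch with stand-in types; none of these names are the actual hls-graph API):

-- Stand-ins for hls-graph's real types, to keep the sketch self-contained.
type ModuleKey    = FilePath
data BuildProduct = BuildProduct
type Build        = IO

ghcRecompilationCheck :: ModuleKey -> Build Bool
ghcRecompilationCheck _ = pure False   -- stub for GHC's fine-grained check

compileModule :: ModuleKey -> Build BuildProduct
compileModule _ = pure BuildProduct    -- stub for actual recompilation

-- The rule now receives the value computed on the previous run, and can
-- return it unchanged when GHC says no recompilation is needed.
compileRule :: Maybe BuildProduct -> ModuleKey -> Build BuildProduct
compileRule (Just previous) key = do
    recompileNeeded <- ghcRecompilationCheck key
    if recompileNeeded then compileModule key else pure previous
compileRule Nothing key = compileModule key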

Speeding up start-up with serialized Core

We have discussed recompilation performance during editing, but what about when HLS starts up?

If the project has been compiled previously, then in the absence of Template Haskell, HLS will load interface (.hi) files from disk in order to recreate the internal state without much additional work.

However, for modules which use Template Haskell, the compiled bytecode or object code is needed to run splices, in addition to the contents of the interface files. Prior to our work, HLS could not make use of existing compiled code, so it had to recompile everything from scratch, leading to long start-up times every time the project was opened.

How can we avoid this? It turns out to be difficult to use either bytecode1 or object code2 in this context, so we chose a third option: after compiling modules, serialize the Core to disk as part of a “fat interface file,” and use this to generate bytecode for the modules we need on startup. Core is GHC’s intermediate language, so generating bytecode from Core is not a particularly expensive operation. Moreover, the machinery needed to serialize Core already exists in GHC, because GHC serializes some Core for unfoldings into interface files. Thus it was easy to add fat interface file support into HLS while maintaining support for all the GHC versions it currently compiles with.

This significantly reduces start-up times on codebases that use a lot of Template Haskell (provided the code has been compiled once before). This change has draft implementations in progress for HLS, and in GHC itself.

Generating bytecode on demand

Usually, before typechecking a module with TH and running its splices, HLS has to populate the environment it passes to GHC with the compiled bytecode or object code for all the modules that may be used in a splice (i.e. all the imports of the module).

However, by installing a GHC hook, we can intercept splices just before they are compiled and run, and inspect them to discover precisely which modules are actually used in a particular splice. Once we have the set of modules that a splice actually depends on, we can populate the GHC environment by compiling the bytecode of exactly that set of modules and no more. The set of modules used in TH splices is usually much smaller than the total set of imports, so this avoids having to compile many modules to bytecode.

This optimisation is only made possible with the ability to serialize Core, as we can generate bytecode on demand using the Core file we have written out. Without the serialized Core, we would have to keep the intermediate typechecked or Core ASTs of all modules in memory, which is not feasible due to the heap bloat this would cause. HLS has invariants that try really hard to ensure that intermediate ASTs are only kept in memory for files that the user has opened in their editor, as doing so for every single file in the project is infeasible for projects that span over a few dozen modules.

We have a work-in-progress implementation of this change in HLS, but a few issues remain to be ironed out before it is ready to be reviewed and merged. Doing this in GHC would be more difficult; a possible alternative would be to allow the user to control which identifiers can be used in splices via Explicit Splice Imports.

Benchmarks

We conducted some benchmarks on a large commercial Haskell repository, which consists of around 2000 modules and approximately 200,000 lines of code. These benchmarks used the existing HLS benchmarking infrastructure to start HLS, edit files and perform HLS requests (e.g. to look up a definition or get information about a source code location).

The HLS versions being compared are:

  • HLS prior to these changes (baseline),
  • the new recompilation avoidance scheme (avoid-recompile),
  • the strategy to serialize Core along with the new recompilation avoidance scheme (serialize-core), and
  • the optimisation to serialize-core to allow us to generate bytecode only on demand (on-demand-bytecode).

All benchmarks are conducted with a warm cache (i.e. the project has been compiled at least once in the past, so its .hi files are present on disk).

Set 1: Comparison against baseline

This benchmark consists of edits to 3 files, followed by a “get definition” request in each of the files, repeated 20 times. The 3 files chosen consisted of one file relatively deep in the module hierarchy, one in the upper two-thirds, and another one near the very top. These were chosen so as to provide a realistic accounting of the kinds of edits users might perform in an actual IDE session, taking into account a diverse set of files from across the entire module hierarchy.

version baseline avoid-recompile serialize-core on-demand-bytecode
time waiting for responses 448.5s 423.8s 80.1s 36.2s
time after responding 10852.4s 503.3s 315.0s 98.3s
total time 11300.9s 927.3s 395.2s 134.4s
initial load time 447.6s 429.2s 84.8s 46.9s
average time per response 0.019s 0.010s 0.011s 0.008s
GHC rebuilds 21680 2238 339 339
max residency 7093MiB 4903MiB 3937MiB 3078MiB
bytes allocated 3817GiB 676GiB 533GiB 254GiB

The time measurements are wall-clock times:

  • “time waiting for responses” is the sum of the times between requests being issued and the user seeing responses to their requests (including the initial load in order to respond to the first request).
  • “time after responding” is the sum of the time spent after responses have been returned but HLS is still busy (e.g. recompiling after an edit).
  • “total time” is the time taken to run the entire benchmark.
  • “initial load time” is the time taken for HLS to load the project and come back to idle.
  • “average time per response” is the average amount of time HLS took to respond to a request (not counting the initial load).
  • “GHC rebuilds” counts the total number of times HLS called into the GHC API to rebuild a module.

The total time and number of GHC rebuilds for baseline is vastly higher than the others, because without the recompilation changes we are essentially compiling the entire project on every edit to the file deep in the module hierarchy, while avoid-recompile or serialize-core do a comparatively minimal amount of recompilation work on every edit. The actual factor of improvement is not very meaningful (it could be made arbitrarily high by increasing the number of iterations). However, it does show a very significant improvement to the user experience of the IDE: getting up to date information, warnings and errors much faster compared to the status quo.

Between avoid-recompile and serialize-core the total time decreases further, because on a fresh start, the files can be loaded directly from disk rather than recompiling. Looking at the “GHC rebuilds” column, avoid-recompile needs to rebuild every file once at startup, and then do a minimal amount of recompilation on edits. serialize-core and on-demand-bytecode have to do a much smaller amount of recompilation on startup due to editing a file deep in the hierarchy, and do the same amount of recompilation due to edits as avoid-recompile.

With on-demand-bytecode, we see another dramatic improvement to initial load times as we can avoid compiling many files to bytecode. We also see a dramatic improvement in the total time as we avoid all this work even on recompiles.

Looking at a graph of heap usage against time shows the dramatic impact of the changes:

“getDefinition after edit” - live bytes over time

Set 2: Impact of serializing Core and generating bytecode on demand

This set of varied benchmarks compares just the avoid-recompile, serialize-core and on-demand-bytecode changes, since they are all substantially better than baseline. These benchmarks consisted of 50 repetitions each.

This time, only the two files near the top were edited as part of the benchmark, because the third file is relatively deep in the module hierarchy, and editing it invalidates a large amount of build products we have on disk, somewhat damping the effect of faster startups due to serialized Core.

Full results are given in the Appendix below. The heap usage vs time graph for some of these benchmarks is also included, for example here is the graph for the first benchmark in this set:

“hover after edit” - live bytes over time

The initial upward sloping segment shows us the initial load, and the flat, spiky segment is the section where we are editing, performing requests and recompiling. The vertical lines show the times at which the initial load is complete. It is clear that the initial load times are vastly improved with the serialize-core changes, going from consistently over 300s to under 100s, and improved again by the on-demand-bytecode changes, reducing to around 60s.

Of course, in practice, the impact of these improvements differs quite heavily depending on usage. If the first thing done by the user is to open and edit something deep in the module hierarchy upon which a lot of things depend, the improvements can be quite minimal. If on the other hand something at the top of the module hierarchy is opened first, startup times would be greatly improved because we wouldn’t need to recompile anything below it.

Conclusion

We are grateful to Mercury for funding this work, as it will benefit the whole Haskell community. We have already made significant progress improving the performance of HLS, and are continuing to identify further opportunities for performance improvements.

Well-Typed are actively looking for funding to continue maintaining and enhancing HLS and GHC. If your company relies on GHC or HLS, and you could support this work, or would like help improving the developer experience for your Haskell engineers, please get in touch with us via info@well-typed.com!


Appendix: benchmark results

Hover after edit
version avoid-recompile serialize-core on-demand-bytecode
time waiting for responses 324.6s 94.3s 48.4s
time after responding 295.0s 336.2s 102.5s
total time 619.5s 430.5s 151.0s
initial load time 324.2s 98.5s 52.2s
average time per response 0.005s 0.005s 0.006s
GHC rebuilds 1861 339 339
max residency 4697MiB 3777MiB 2787MiB
bytes allocated 595GiB 533GiB 178GiB
“hover after edit” - live bytes over time
getDefinition after edit
version avoid-recompile serialize-core on-demand-bytecode
time waiting for responses 456.0s 92.9s 93.1s
time after responding 387.8s 315.3s 60.4s
total time 843.8s 408.1s 153.5s
initial load time 450.2s 89.5s 63.2s
average time per response 0.05s 0.04s 0.04s
GHC rebuilds 1861 339 339
max residency 4728MiB 3762MiB 2887MiB
bytes allocated 589GiB 385GiB 178GiB
“getDefinition after edit” - live bytes over time
Completions after edit
version avoid-recompile serialize-core on-demand-bytecode
time waiting for responses 785.2s 440.5s 145.3s
time after responding 247.7s 233.3s 24.2s
total time 1032.9s 673.8s 169.5s
initial load time 443.9s 92.0s 62.0s
average time per response 1.25s 1.24s 0.94s
GHC rebuilds 1861 339 339
max residency 4982MiB 4012MiB 3008MiB
bytes allocated 816GiB 598GiB 198GiB
“completions after edit” - live bytes over time
Code actions after edit
version avoid-recompile serialize-core on-demand-bytecode
time waiting for responses 479.8s 426.1s 139.3s
time after responding 317.8s 72.3s 0.017s
total time 794.6s 494.4s 139.3s
initial load time 314.9s 72.9s 60.7s
average time per response 4.05s 4.34s 0.80s
GHC rebuilds 1861 339 339
max residency 4990MiB 3983MiB 2981MiB
bytes allocated 685GiB 468GiB 241GiB
“code actions after edit” - live bytes over time

Footnotes


  1. Directly serializing bytecode is difficult to implement because the bytecode format hasn’t been very stable across the GHC versions HLS supports, and isn’t designed for easy serialization as it contains many references to things that exist only in memory.↩︎

  2. Unlike bytecode, object code already comes with a serializable format, but has several other drawbacks:

    1. Dynamically loaded code is difficult to unload and practically impossible on some platforms. This can make a big difference with interactive use, as memory usage can grow linearly with edits (older object code is never truly unloaded, newer code just shadows the old code). In addition to this, bugs with code unloading on some GHC versions also posed issues.
    2. Generating and emitting native code is usually much slower than generating bytecode, and even though the emitted code may be faster, the cost of the former usually dominates the benefit of the latter for the small, usually quite simple programs that are executed due to Template Haskell splices.
    3. Linking object code can also be quite expensive.
    ↩︎

by zubin, matthew, adam at April 01, 2022 12:00 AM

Michael Snoyman

April Fools Canceled

Due to totally foreseen conditions, April Fools' Day has been canceled this year. Pic unrelated.

Double Cherry

April 01, 2022 12:00 AM

March 30, 2022

Philip Wadler

Programming language to the stars


TIL that my students can choose among about forty different firms interested in hiring Haskell programmers. Among them is Co-Star, a firm that provides horoscopes to millions of users, which has put up a page detailing why they choose Haskell over other languages. Thanks to Alex Wasey and Dylan Thinnes for the pointer. 

by Philip Wadler (noreply@blogger.com) at March 30, 2022 04:40 PM

March 24, 2022

FP Complete

Canary Deployment with Kubernetes and Istio

Istio is a service mesh that transparently adds various capabilities like observability, traffic management and security to your distributed collection of microservices. It comes with various functionalities like circuit breaking, granular traffic routing, mTLS management, authentication and authorization policies, the ability to do chaos testing, etc.

In this post, we will explore how to do canary deployments of our application using Istio.

What is Canary Deployment

Using the canary deployment strategy, you release a new version of your application to a small percentage of the production traffic. You then monitor your application and gradually expand the new version's share of the production traffic.

For a canary deployment to be shipped successfully, you need good monitoring in place. Based on your exact use case, you might want to check various metrics like performance, user experience or bounce rate.

Prerequisites

This post assumes that the following components are already provisioned or installed:

  • Kubernetes cluster
  • Istio
  • cert-manager (optional; required if you want to provision TLS certificates)
  • Kiali (optional)

Istio Concepts

For this specific deployment, we will be using three specific features of Istio's traffic management capabilities:

  • Virtual Service: A Virtual Service describes how traffic flows to a set of destinations. Using a Virtual Service you can configure how to route requests to a service within the mesh. It contains a set of routing rules that are evaluated, after which a decision is made on where to route the incoming request (or even to reject it if no routes match).
  • Gateway: Gateways are used to manage your inbound and outbound traffic. They allow you to specify the virtual hosts and their associated ports that need to be opened to allow traffic into the cluster.
  • Destination Rule: This is used to configure how a client in the mesh interacts with your service. It's used for configuring the TLS settings of your sidecar, splitting your service into subsets, the load balancing strategy for your clients, etc.

For a canary deployment, the destination rule plays the major role, as it is what we will use to split the service into subsets and route traffic accordingly.

Application deployment

For our canary deployment, we will be using the following versions of the application:

  • httpbin.org: This will be version one (v1) of our application. This is the application that's already deployed, and our aim is to partially replace it with a newer version.
  • websocket app: This will be version two (v2) of the application, which has to be introduced gradually.

Note that in the real world, both applications would share the same code. For our example, we are just taking two arbitrary applications to make testing easier.

Our assumption is that we already have version one of our application deployed. So let's deploy that initially. We will write our usual Kubernetes resources for it. The deployment manifest for the version one application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
  namespace: canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80

And let's create a corresponding service for it:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: httpbin
  name: httpbin
  namespace: canary
spec:
  ports:
  - name: httpbin
    port: 8000
    targetPort: 80
  - name: tornado
    port: 8001
    targetPort: 8888
  selector:
    app: httpbin
  type: ClusterIP

The SSL certificate for the application, which will be provisioned using cert-manager:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: httpbin-ingress-cert
  namespace: istio-system
spec:
  secretName: httpbin-ingress-cert
  issuerRef:
    name: letsencrypt-dns-prod
    kind: ClusterIssuer
  dnsNames:
  - canary.33test.dev-sandbox.fpcomplete.com

And the Istio resources for the application:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
  namespace: canary
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - canary.33test.dev-sandbox.fpcomplete.com
    port:
      name: https-httpbin
      number: 443
      protocol: HTTPS
    tls:
      credentialName: httpbin-ingress-cert
      mode: SIMPLE
  - hosts:
    - canary.33test.dev-sandbox.fpcomplete.com
    port:
      name: http-httpbin
      number: 80
      protocol: HTTP
    tls:
      httpsRedirect: true
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: canary
spec:
  gateways:
  - httpbin-gateway
  hosts:
  - canary.33test.dev-sandbox.fpcomplete.com
  http:
  - route:
    - destination:
        host: httpbin.canary.svc.cluster.local
        port:
          number: 8000

The above resources define the gateway and virtual service. You can see that we are using TLS here and redirecting HTTP to HTTPS.

We also have to make sure that the namespace has Istio injection enabled:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    app.kubernetes.io/component: httpbin
    istio-injection: enabled
  name: canary

I have the above set of k8s resources managed via kustomize. Let's deploy them to get the initial environment which consists of only v1 (httpbin) application:

❯ kustomize build overlays/istio_canary > istio.yaml
❯ kubectl apply -f istio.yaml
namespace/canary created
service/httpbin created
deployment.apps/httpbin created
gateway.networking.istio.io/httpbin-gateway created
virtualservice.networking.istio.io/httpbin created
❯ kubectl apply -f overlays/istio_canary/certificate.yaml
certificate.cert-manager.io/httpbin-ingress-cert created

Now I can go and verify in my browser that my application is actually up and running:

httpbin: Version 1 application

Now comes the interesting part. We have to deploy version two of our application and make sure around 20% of our traffic goes to it. Let's write the deployment manifest for it:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin-v2
  namespace: canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v2
  template:
    metadata:
      labels:
        app: httpbin
        version: v2
    spec:
      containers:
      - image: psibi/tornado-websocket:v0.3
        imagePullPolicy: IfNotPresent
        name: tornado
        ports:
        - containerPort: 8888

And now the destination rule to split the service:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
  namespace: canary
spec:
  host: httpbin.canary.svc.cluster.local
  subsets:
  - labels:
      version: v1
    name: v1
  - labels:
      version: v2
    name: v2

And finally let's modify the virtual service to split 20% of the traffic to the newer version:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: canary
spec:
  gateways:
  - httpbin-gateway
  hosts:
  - canary.33test.dev-sandbox.fpcomplete.com
  http:
  - route:
    - destination:
        host: httpbin.canary.svc.cluster.local
        port:
          number: 8000
        subset: v1
      weight: 80
    - destination:
        host: httpbin.canary.svc.cluster.local
        port:
          number: 8001
        subset: v2
      weight: 20

And now if you go back to the browser and refresh it a number of times (note that we route only 20% of the traffic to the new deployment), you will eventually see the new application:

websocket: Version 2 application

Testing deployment

Let's do 10 curl requests to our endpoint to see how the traffic is getting routed:

❯ seq 10 | xargs -Iz curl -s https://canary.33test.dev-sandbox.fpcomplete.com | rg "<title>"
    <title>httpbin.org</title>
    <title>httpbin.org</title>
    <title>httpbin.org</title>
<title>tornado WebSocket example</title>
    <title>httpbin.org</title>
    <title>httpbin.org</title>
    <title>httpbin.org</title>
    <title>httpbin.org</title>
    <title>httpbin.org</title>
<title>tornado WebSocket example</title>

And you can confirm that, out of the 10 requests, 2 were routed to the websocket (v2) application. If you have Kiali deployed, you can even visualize the above traffic flow:

Kiali visualization

And that summarizes our post on how to achieve canary deployment using Istio. While this post shows a basic example, traffic steering and routing is one of the core features of Istio, and it offers various ways to configure its routing decisions. You can find further details in the official docs. You can also use a controller like Argo Rollouts with Istio to perform canary deployments and use additional features like analysis and experiments.


If you're looking for a solid Kubernetes platform, batteries included with a first class support of Istio, check out Kube360.

If you liked this article, you may also like:

See what Kube360 can do for you

March 24, 2022 12:00 AM

March 23, 2022

Well-Typed.Com

New large-records release: now with 100% fewer quotes

The large-records library provides support for large records in Haskell with much better compilation time performance than vanilla ghc does. Well-Typed and MonadFix are happy to announce a new release of this library, which avoids all Template Haskell or quasi-quote brackets. Example:

{-# ANN type User largeRecordLazy #-}
data User = MkUser {
      name   :: String
    , active :: Bool
    }
  deriving stock (Show, Eq)

instance ToJSON User where
  toJSON = gtoJSON

john :: User
john = MkUser { name = "john", active = True }

setInactive :: User -> User
setInactive u = u{active = False}

This makes for a nicer user experience and provides better integration with tooling (for example, better syntax highlighting, auto-formatting, and auto-completion). Importantly, avoiding Template Haskell also means we avoid the unnecessary recompilations that this incurs1, a significant benefit for a library aimed at improving compilation time.

In this blog post we will briefly discuss how this was achieved.

Avoiding quotation

Record declaration

The previous large-records version used quotation in two places. First, it was using Template Haskell quotes for record definitions, something like:

largeRecord defaultLazyOptions [d|
    data User = MkUser {
          name   :: String
        , active :: Bool
        }
      deriving stock (Show, Eq)
  |]

The new version avoids this by using a ghc source plugin instead of TH. The source plugin generates much the same code as the TH code used to do; if you’d like to see what definitions are generated, you can use

{-# ANN type User largeRecordLazy { debugLargeRecords = True } #-}

Record expressions

Record updates such as

setInactive :: User -> User
setInactive u = u{active = False}

were already supported by the old version (and are still supported by the new), since these rely only on RecordDotSyntax as provided by record-dot-preprocessor. However, record values required quasi-quotation in the previous version:

john :: User
john = [lr| MkUser { name = "john", active = True } |]

Here it was less obvious how to replace this with a source plugin, because we cannot see from the syntax whether or not MkUser is the constructor of a large record. Moreover, the old internal representation of large records (described in detail in Avoiding quadratic core code size with large records) meant that ghc was not even aware of name or active as record fields. This means that the source plugin must run before the renamer: after all, name resolution would fail for these names. This in turn essentially means that the plugin gets the syntax to work with and nothing else.

The solution is an alternative internal representation of records, after a cunning idea from Adam Gundry. For our running example, the code that is generated for User is

data User = forall n a.
       (n ~ String, a ~ Bool)
    => MkUser {
           name   :: n
         , active :: a
         }

This representation achieves two things:

  • ghc won’t generate field accessors for fields with an existential type (avoiding quadratic blow-up)
  • but it still works much like a normal record constructor; in particular, record values such as john work just fine.
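
Both points can be seen in miniature with ordinary ghc, outside large-records. Here is a standalone sketch of the same encoding (our illustration, not generated code; the hand-written getName exists purely to show that matching works):

{-# LANGUAGE ExistentialQuantification, GADTs #-}

-- ghc generates no selector for a field whose type mentions an
-- existential variable, so there is no quadratic accessor code; but
-- pattern matching brings the equality constraints into scope, so the
-- fields are still usable at their concrete types.
data User = forall n a. (n ~ String, a ~ Bool)
    => MkUser { name :: n, active :: a }

getName :: User -> String
getName (MkUser n _) = n  -- n ~ String holds here

In large-records itself you would not write getName by hand; field access goes through HasField and RecordDotSyntax instead.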

This representation does mean that regular record updates won’t work; something like

setInactive :: User -> User
setInactive u = u { active = False }

will result in an error

Record update for insufficiently polymorphic field

When using RecordDotSyntax, however, all is fine; this was already a requirement for using large-records anyway.

Performance

The main benchmark for large-records is a module containing a record declaration with n fields with Eq, Show, Generic and HasField instances, and a ToJSON instance defined using a generic function. See the Benchmarks section of the first blog post on large-records for additional information.

The code generated by the new source plugin is very similar to the code that was previously generated by TH. Critically, it is still linear in the size of the record (unlike standard ghc, which is quadratic); see the full report on the (compile time) performance of large-records for details. We therefore don’t expect any super-linear improvements in compilation time; indeed, improvement of compilation time was not the point of this refactoring (other than avoiding unnecessary recompilations due to TH). It is nonetheless nice to see that the plugin is roughly 25% faster than TH:

Although we didn’t measure it, avoiding quasi-quotation for record values should also help improve compilation time further, depending on how common these are in any particular codebase.

Conclusions

The large-records library is part of our work on improving compilation time on behalf of Juspay. We have written extensively about these compilation time problems before (see blog posts tagged with compile-time-performance), and also have given various presentations on this topic (HIW 2021, HaskellX 2021). This new release of large-records is not fundamentally different to the previous. It still offers the same features:

  • linear-size ghc code and therefore much better compilation time performance
  • stock derivation support (Show, Eq, Ord)
  • Generics support (through large-generics style generics, similar in style to generics-sop)
  • HasField support for integration with record-dot-preprocessor

However, the fact that Template Haskell quotes and quasi-quotation are no longer required in the new version should make for a much better user experience, as well as further speed up compilation for projects with deep module hierarchies.


  1. Suppose module B imports module A. If B uses Template Haskell splices, it will be recompiled whenever A changes, whether or not the change to A is relevant. Specifically, even with optimizations disabled, a change to the implementation of a function f in A will trigger a recompilation of B. The reason is that B might execute f in a splice, and ghc makes no attempt at all to figure out what the splice may or may not execute. We have recently improved this in GHC HEAD; a blog post on that change is coming soon.↩︎
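
As a minimal illustration of the problem this footnote describes (a hypothetical two-module setup, not taken from the post):

-- A.hs
module A where

f :: Int
f = 1  -- changing just this body forces B to recompile

-- B.hs
{-# LANGUAGE TemplateHaskell #-}
module B where

import A (f)
import Language.Haskell.TH (integerL, litE)

-- The splice runs f at compile time, so ghc must conservatively
-- rebuild B whenever A changes, even if the change is irrelevant.
g :: Int
g = $(litE (integerL (fromIntegral f)))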

by edsko at March 23, 2022 12:00 AM

March 22, 2022

Sandy Maguire

Review: Proof-Carrying Code

A few months ago, the excellent David Rusu gave me an impromptu lecture on ring signatures, which are a way of signing something as an anonymous member of a group. That is, you can show someone in the signing pool was actually responsible for signing the thing, but can’t determine which member of the pool actually signed it. David walked me through all the math as to how that actually happens, but I was unable to follow it, because the math was hard and, perhaps more importantly, it felt like hand-compiling a proof.

What do I mean by “hand-compiling” a proof? Well, we have some mathematical object, something like

postulate
  Identity : Set
  Message : Set
  SignedBy : Message → Identity → Set

  use-your-imagination : {A : Set} → A

record SignedMessage {n : ℕ} (pool : Vec Identity n) : Set where
  field
    message : Message
    @erased
      signer : Fin n
    signature : SignedBy message (lookup pool signer)

where @erased is Agda’s runtime irrelevance annotation, meaning the signer field won’t exist at runtime. In fact, attempting to write a function that would extract it results in the following error:

Identifier signer is declared erased, so it cannot be used here
when checking that the expression signer x has type Fin n

Nice one Agda!

Hand-compiling this thing is thus constructing some object that has the desired properties, but doing it in a way that requires BEING VERY SMART, and throwing away any chance at composability in the process. For example, it’d be nice to have the following:

open SignedMessage

weakenL : ∀ {n pool new-id}
        → SignedMessage {n} pool
        → SignedMessage (new-id ∷ pool)
weakenL x = use-your-imagination

weakenR : ∀ {n pool new-id}
        → SignedMessage {n} pool
        → SignedMessage (pool ++ [ new-id ])
weakenR x = use-your-imagination

which would allow us to arbitrarily extend the pool of a signed message. Then, we could trivially construct one:

sign : Message → (who : Identity) → SignedMessage [ who ]
message   (sign msg who) = msg
signer    (sign msg who) = zero
signature (sign msg who) = use-your-imagination

and then obfuscate who signed by some random choice of subsequent weakenLs and weakenRs.

Unfortunately, this is not the case with ring signatures. Ring signatures require you to “bake in” the signing pool when you construct your signature, and you can never again change that pool, short of doing all the work again. This behavior is non-composable, and thus, in my reckoning, unlikely to be a true solution to the problem.

The paper I chose to review this week is Proof-Carrying Code by George Necula, in an attempt to understand if the PL literature has anything to say about this problem.

PCC is an old paper (from 1997, egads!) but it was the first thing I found on the subject. I should really get better at vetting my literature before I go through the effort of going through it, but hey, what are you going to do?

The idea behind PCC is that we want to execute some untrusted machine code. But we don’t want to sacrifice our system security to do it. And we don’t want to evaluate some safe language into machine code, because that would be too slow. Instead, we’ll send the machine code, as well as a safety proof that verifies it’s safe to execute this code. The safety proof is tied to the machine code, such that you can’t just generate a safety proof for an unrelated problem, and then attach it to some malicious code. But the safety proof isn’t obfuscated or anything; the claim is that if you can construct a safety proof for a given program, that program is necessarily safe to run.

On the runtime side, there is a simple algorithm for checking the safety proof, and it is independent of the arguments that the program is run with; therefore, we can get away with checking code once and evaluating it many times. It’s important that the algorithm be simple, because it’s a necessarily trusted piece of code, and it would be bad news if it were to have bugs.

PCC’s approach is a bit… unimaginative. For every opcode we’d like to allow in the programs, we attach a safety precondition, and a postcondition. Then, we map the vector of opcodes we’d like to run into its pre/post conditions, and make sure they are confluent. If they are, we’re good to go. This vector of conditions is called the vector VC in the paper.

So, the compiler computes the VC and attaches it to the code. Think of the VC as a proposition of safety (that is, a type), and a proof of that proposition (the VC itself). In order to validate this, the runtime does a safety typecheck, figuring out what the proposition of safety would have to be. It compares this against the attached proof, and if they match, it typechecks the VC to ensure it has the type it says. If it does, our code is safe.
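
To make the shape of that check concrete, here is a toy Haskell sketch (my reading of the setup, not the paper's formalism), modeling safety conditions as plain sets of facts; OpSpec, vcCheck and the facts themselves are all made up for illustration:

import qualified Data.Set as Set

type Fact = String

-- Each allowed opcode carries a safety precondition and postcondition.
data OpSpec = OpSpec
  { pre  :: Set.Set Fact
  , post :: Set.Set Fact
  }

-- Walk the program, checking that each precondition is already
-- established, and accumulating postconditions as new knowledge.
-- "Confluence" here is just: every step's requirements are covered.
vcCheck :: Set.Set Fact -> [OpSpec] -> Bool
vcCheck _     []         = True
vcCheck known (op : ops) =
  pre op `Set.isSubsetOf` known
    && vcCheck (known `Set.union` post op) ops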

The PCC paper is a bit light on details here, so it’s worth thinking about exactly what’s going on here. Presumably determining the safety preconditions is an easy problem if we can do it at runtime, but proving some code satisfies it is hard, or else we could just do that at runtime too.

I’m a bit hesitant to dive into the details here, because I don’t really care about determining whether some blob of machine code is safe to run. It’s a big ball of poorly typed typing judgments about memory usage. Why do I say poorly typed? Well consider one of the rules from the paper:

$$\frac{m \vdash e : \tau\ \text{list} \qquad e \neq 0}{m \vdash e : \text{addr} \wedge \ldots}$$

Here we have that from e : List τ (and that e isn't 0) we can derive e : addr. At best, if we are charitable in assuming e ≠ 0 means that e isn't nil, there is a type preservation error here. If we are less charitable, there is also some awful type error here involving 0, which might be a null check or something? This seems sufficiently messy that I don't care enough to decipher it.

How applicable is any of this to our original question around ring signatures? Not very, I think, unfortunately. We already have the ring signature math if we’d like to encode a proof, and the verification of it is easy enough. But it’s still not very composable, and I doubt this paper will add much there. Some more promising approaches would be to draw the mystery commutative diagrams à la Adders and Arrows, starting from a specification and deriving a chain of proofs that the eventual implementation satisfies the specification. The value there is in all the intermediary nodes of the commutative diagram, and whether we can prove weakening lemmas there.

But PCC isn’t entirely a loss; I learned about @erased in Agda.

March 22, 2022 12:00 AM

Mike Izbicki

Fixing North Korea's KCNA Webpage

posted on 2022-03-22

I occasionally have Skype calls with computer programmers in North Korea, and one of the things we talk about is how to improve their internet infrastructure. Recently, we talked about how their kcna.kp webpage was using javascript incorrectly. This error prevented other websites from linking to articles published on kcna.kp and prevented Google from searching those articles.

This minor technical problem had geopolitical implications. KCNA is the main newspaper in North Korea, and policy wonks closely analyze KCNA’s articles in order to better understand the North Korean government. A broken KCNA website makes their jobs harder and reduces the quality of discussion about North Korean policy.

As of 22 February, these problems with the KCNA webpage are now fixed.

To illustrate the changes that the KCNA web developers made, we’ll use the Internet Archive’s Wayback Machine to look at old versions of the website. The first snapshot of the kcna.kp webpage is from 20-April-2011.1 The front page shows Kim Jong Il performing on-the-spot guidance, and is the sort of picture that wonks go crazy over:

The webpage is reasonably nice looking, but if you click on any of the article links in the snapshot page, you’ll notice that they don’t work anymore. There’s no way to see the contents of these older articles or their associated images.

Inspecting the HTML source code we can see why. All the link tags look something like

<a href="javascript:onNews('specialnews','2011','410796')">

When you click the link, your browser calls the javascript onNews function. This function is custom written for the KCNA webpage, and makes an AJAX call to display the article’s contents. Unfortunately, web crawlers cannot access the contents of these AJAX calls unless special procedures are followed, and the KCNA webpage did not follow these procedures. So the Internet Archive was not able to archive these links, and this bit of history is lost.2

The Wayback Machine has collected 2395 more snapshots of the KCNA webpage up through today. Looking through these records we can see that the kcna.kp website was redesigned in January 2013, and this redesign broke the webpage even more. The redesigned webpage uses javascript even for displaying the main body of the webpage, and so not even the homepage can be archived. The snapshot from 1-January-2013 is the last working snapshot before this redesign.

After 9 years, the webpage was finally fixed last month on 22-February-2022. The new webpage looks like:

The important part, however, is the underlying HTML code. The link tags now use standard HTML to include the URL directly in the tag with no javascript. For example, the link to the top article about Kim Jong Un above looks like

<a href="/kp/article/q/320150e5ae8e9bc8fdf3d6b8547eaeaf.kcmsf">

Crawlers are able to follow these links. So now, after a 9-year hiatus, the Internet Archive is once again able to archive articles from the KCNA. You can view the article above permanently archived in the Internet Archive repository along with 26 associated pictures. These automated archives of the KCNA are especially important for Western researchers because the KCNA is known to have altered historic articles in response to domestic purges.

Furthermore, Google3 is now able to index the KCNA’s articles. So analysts can do searches like site:kcna.kp united states to find KCNA articles mentioning a topic of interest like the United States:

These usability improvements will help Western researchers navigate the KCNA’s published articles and learn about the DPRK. But there are still unfortunately some major problems with the webpage.

For example, if you click on any of the google links above, you’ll be taken to the “secure” webpage using the HTTPS protocol (instead of the HTTP protocol). Ordinarily, that’s a good thing, but the KCNA webpage uses a self-signed certificate, so you get a scary looking error message. On firefox, it looks like:

At first glance, this error message makes it look like the KCNA webpage might have something dangerous like a virus on it. That’s not the case though. The message just means that the webpage isn’t properly encrypted.

The North Korean government wants to fix these problems, and we should too. It’s in both their interest and ours to improve the communication between our countries’ foreign policy experts. Unfortunately, the current US sanctions regime makes this difficult. I have a standing invitation from my North Korean colleagues to visit them and teach about modern web standards, but the US has banned American passport holders from entering North Korea. So American sanctions are effectively preventing North Korea from improving their internet.


  1. Prior to 2011, the KCNA was hosted online at http://kcna.co.jp, and the Wayback Machine has archives going back to 1997. Like most other webpages of that era, the kcna.co.jp webpage used simple HTML and had a rather crude appearance. The switch to the .kp ccTLD also entailed a rewrite of the interface to make it prettier and more modern. This rewrite introduced the javascript bugs described in this post. An archived post from North Korea Tech describes the switch from the kcna.co.jp domain to kcna.kp.

  2. Technically, the contents of the KCNA articles themselves are not lost, they’re just much more difficult to access. Libraries maintain print copies of KCNA publications, and there is a custom archiver/search engine https://kcnawatch.org that was built specifically for tracking North Korean media. But the average policy researcher or reporter doesn’t have access to these resources, and so from their perspective this history was lost.

  3. Other search engines are able to index kcna.kp now too, but the process takes time, especially for low traffic webpages. As of 22-Mar-2022, Yandex had indexed kcna.kp, but Bing and Baidu had not.

March 22, 2022 12:00 AM

March 20, 2022

Stackage Blog

LTS 19 release and Nightly on ghc-9.2

The Stackage team is very happy to announce the initial Stackage LTS version 19 snapshot release is now available, based on GHC version 9.0.2. This release is significant for several reasons:

  • not only is it the first stable LTS release based on ghc9,
  • it also includes many significant upgrades, including aeson-2.0 with improved security, and
  • it is also the largest stable LTS release we have ever done: just 1 short of 2900 packages!

Of course it is still possible to get your package added to lts-19 if it builds with lts-19 and you missed getting it into Nightly in time for the initial lts-19.0 build. Our process is straightforward: just open a GitHub issue in the lts-haskell project and follow the template there.

Thank you to the great Haskell community for all the many contributions you are making - do keep them coming!

At the same time we are also excited to move Nightly now to GHC 9.2.2 - enjoy! Apparently there will be a 9.2.3 bugfix release coming, so we will update Nightly to that after it is released.

Quite a number of Nightly packages had to be disabled as part of the upgrade to 9.2. This is being tracked in https://github.com/commercialhaskell/stackage/issues/6486: if you find your package listed there, you can help to update it to build with ghc-9.2 and Stackage Nightly, thank you!

Help us make the next LTS 20 an even bigger and better release!

March 20, 2022 05:00 AM

March 19, 2022

Sandy Maguire

Review: Syntax-Guided Synthesis

I was describing my idea from last week to automatically optimize programs to Colin, who pointed me towards Syntax-Guided Synthesis by Alur et al.

Syntax-Guided Synthesis is the idea that free-range program synthesis is really hard, so instead, let’s constrain the search space with a grammar of allowable programs. We can then enumerate those possible programs, attempting to find one that satisfies some constraints. The idea is quite straightforward when you see it, but that’s not to say it’s unimpressive; the paper has lots of quantitative results about exactly how well this approach does.

The idea is we want to find programs with type I → O that satisfy some specification. We’ll do that by picking some Language of syntax, and trying to build our programs there.

All of this is sorta moot, because we assume we have some oracle which can tell us if our program satisfies the spec. But the oracle is probably some SMT solver, and is thus expensive to call, so we’d like to try hard not to call it if possible.
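
To make the enumerate-and-check loop concrete before diving in, here is a toy Haskell sketch (a gloss on the idea, not the paper's algorithm): enumerate terms of a tiny grammar and keep the first one agreeing with the spec on a few test inputs, consulting the expensive oracle only afterwards. The usage line anticipates the max example below:

data Term = X | Y | IfLte Term Term Term Term  -- if a <= b then c else d
  deriving Show

-- Denotation of a term over a pair of inputs.
eval :: Term -> (Int, Int) -> Int
eval X               (x, _) = x
eval Y               (_, y) = y
eval (IfLte a b c d) i      =
  if eval a i <= eval b i then eval c i else eval d i

-- Candidate terms, by increasing nesting depth.
terms :: [Term]
terms = concat (iterate grow [X, Y])
  where grow ts = [IfLte a b c d | a <- ts, b <- ts, c <- ts, d <- ts]

-- The first term agreeing with the spec on every test point.
synth :: ((Int, Int) -> Int) -> [(Int, Int)] -> Term
synth spec tests = head [t | t <- terms, all (\i -> eval t i == spec i) tests]

-- ghci> synth (\(x, y) -> max x y) [(0, 1), (1, 0), (2, 2)]
-- IfLte X Y Y X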

Let’s take an example, and say that we’d like to synthesize the max of two Nats. There are lots of ways of doing that! But we’d like to find a function that satisfies the following:

data MaxSpec (f : ℕ × ℕ → ℕ) : ℕ × ℕ → Set where
  is-max : {x y : ℕ}
          → x ≤ f (x , y)
          → y ≤ f (x , y)
          → ((f (x , y) ≡ x) ⊎ (f (x , y) ≡ y))
          → MaxSpec f (x , y)

If we can successfully produce an element of MaxSpec f, we have a proof that f is an implementation of max. Of course, actually producing such a thing is rather tricky; it’s equivalent to determining if MaxSpec f is Decidable for the given input.

In the first three cases, we have some conflicting piece of information, so we are unable to produce a MaxSpec:

decideMax : (f : ℕ × ℕ → ℕ) → (i : ℕ × ℕ) → Dec (MaxSpec f i)
decideMax f i@(x , y) with f i | inspect f i
... | o | [ fi≡o ] with x ≤?