## April 21, 2021

### Magnus Therning

Nothing much to be proud of, but yesterday I found out that servant-docs was marked broken in nixpkgs even though it builds just fine and this morning I decided to do something about it.

So, with the help of a post on the NixOS discourse I put together my first PR.

# The end of history for programming

history

I spend quite a bit of time thinking about what the end of history for programming might look like. By the “end of history” I mean the point beyond which programming paradigms would not evolve significantly.

I care about programming’s “destiny” because I prefer to work on open source projects that bring us closer to that final programming paradigm. In my experience, this sort of work has greater longevity, higher impact, and helps move the whole field of programming forward.

So what would the end of history look like for programming? Is it:

Some people treat programming as a solved problem and view all new languages and paradigms as reskins of old languages or paradigms. From their point of view all that remains is to refine our craft by slowly optimizing things, weeding out bugs or addressing non-technical issues like people management or funding.

I personally do not subscribe to this philosophy because I believe, at a very minimum, that functional programming paradigms will slowly displace object-oriented and imperative programming paradigms (although functional programming might not necessarily be the final programming paradigm).

• … artificial intelligence?

Maybe machines will translate our natural language instructions into code for us, relieving us of the burden of precisely communicating our intent. Or maybe some sufficiently smart AI-powered IDE will auto-complete most of our program for us.

I don’t believe this either, and I think Dijkstra did an excellent job of dismantling this line of reasoning in his essay: On the foolishness of “natural language programming”.

Truthfully, I’m not certain what is the correct answer, but I’ll present my own guess for what the end of history for programming looks like.

#### Mathematical DSLs

My view is that the next logical step for programming is to split into two non-overlapping programming domains:

• runtime building for …
• … mathematical programming languages

Specifically, I expect programming languages to evolve to become more mathematical in nature where the programs users author to communicate their intent resemble pure mathematical expressions.

For example, consider the following mathematical specifications of the boolean logical “and” operator and the function composition operator:

True  && x     = xx     && True  = xFalse && False = False(f . g)(x) = f(g(x))

These mathematical specifications are also executable Haskell code (albeit with extra parentheses to resemble mainstream function syntax). Haskell is one example of a programming language where the code aspires to resemble pure mathematical expressions and definitions.

Any language presenting such an idealized mathematical interface requires introducing a significant amount of complexity “under the hood” since the real world is messy. This is where runtime building comes into play to gloss over such ugly details.

In other words, I’m predicting that the end of history for programming is to become an interface bridging pure mathematical expressions to the real world.

#### The past and present

Let me give a few examples of where this trend towards mathematical userland code is already playing out:

• Memory management

Manual memory management used to be a “userland” concern for most programming languages, but the general trend for new languages is automatic memory management (with the exception of Rust). Memory management used to be an explicit side effect that programmers had to care about and pushing memory management down into the runtime (via garbage collection or other methods) has made programming languages more pure, allowing them to get one step closer to idealized mathematical expressions.

Indeed, Rust is the exception that proves the rule, as Rust is widely viewed as better suited for building runtimes rather than being used for high-level specification of intent.

• Functional programming (especially purely functional programming)

Functional programming is a large step towards programming in a more mathematical style that prefers:

• expressions over statements
• pure functions over side effects
• algebraic datatypes over objects
• recursion over loops

I cover this in more detail in Why I prefer functional programming.

However, functional programming is not a free win. Supporting higher-order functions and closures efficiently isn’t easy (especially for compiled languages), which is why less sophisticated language implementations tend to be more imperative and less functional.

• Evaluation order (especially laziness)

I’ve always felt that “lazy evaluation” doesn’t do a good job of selling the benefits of Haskell’s evaluation model. I prefer to think of Haskell as having “automatic evaluation management”1. In other words, the programmer specifies the expressions as a graph of dependent computations and the runtime figures out the most efficient order in which to reduce the graph.

This is yet another example of where we push something that used to be a userland concern (order of evaluation) into the runtime. Glossing over evaluation order frees us to specify things in a more mathematical style, because in mathematics the evaluation order is similarly irrelevant.

#### The present and future

A common pattern emerges when we study the above trends:

• Push a userland concern into a runtime concern, which:
• … makes programs more closely resemble pure mathematical expressions, and:
• … significantly increases the complexity of the runtime.

You might wonder: what are some other userland concerns that might eventually get pushed into runtime concerns in the near future? Some examples I can think of are:

• Package management

This one is so near in the future that it’s already happening (see: Nix and Dhall). Both languages provide built-in support for fetching code, rather than handling packages out-of-band using a standalone package management tool. This language-level support allows programs to embed external code as if it were a pure sub-expression, more closely approximating the mathematical ideal.

• Error handling

This one requires more explanation: I view type systems as the logical conclusion of pushing error handling into the “runtime” (actually, into the type-checker, not the runtime, to be pedantic). Dhall is an example of a language that takes this idea to the extreme: Dhall has no userland support for raising or catching errors because all errors are type errors2.

Advances in dependently-typed and total functional programming languages get us closer to this ideal of pushing error handling into a runtime concern.

• Logging

I’m actually surprised that language support for pervasively logging everything hasn’t been done already (or maybe it has happened and I missed it). It seems like a pretty mundane thing languages could implement, especially for application domains that are not performance-sensitive.

Many languages already support profiling, and it seems like it wouldn’t be a big leap to turn profiling support into logging support if the user is willing to spend their performance budget on logging.

Logging is one of those classic side effects that is an ugly detail that “ruins” code that would have otherwise been pure and mathematical.

• Services

Service-oriented architectures are another thing that tends to get in the way of writing pure side-effect-free code.

I’m not exactly sure what a service-oriented language runtime would look like, but I don’t think current “serverless” solutions are what I have in mind. Something like AWS Lambda is still too low-level to promote code that is mathematical in nature. Like, if any part of the programming process involves using a separate tool to deploy or manage the serverless code then that is a significant departure from authoring pure mathematical expressions. There needs to be something like a “Nix or Dhall for serverless code”.

#### Conclusion

You might criticize my prediction by claiming that it’s not falsifiable. It seems like I could explain away any new development in programming as falling into the runtime building or mathematical expression category.

That’s why I’d like to stress a key pillar of the prediction: runtimes and mathematical expressions will become more sharply divided over time. This is the actual substance of the prediction and we can infer a few corrolaries from that prediction.

Currently, many mainstream programming paradigms and engineering organizations conflate the two responsibilities, so you end up with people authoring software projects that mix operational logic (runtime concerns) and “business logic” (mathematical intent).

What I predict will happen is that the field of engineering will begin to generate a sharp demand for people with experience in programming language theory or programming language engineering. These people will be responsible for building special-purpose languages and runtimes that abstract away as many operational concerns as possible to support pure mathematical domain-specific languages for their respective enterprise. These languages will in turn be used by a separate group of people whose aim is to translate human intent into mathematical expressions.

One consequence of this prediction is that you’ll begin to see a Cambrian explosion of programming languages in the near future. Naturally as language engineers push more operational concerns into the runtime they will need to more closely tailor the runtime to specific purposes or organizational needs rather than trying to author a general-purpose language. In other words, there will be a marked fragmentation of language runtimes (and type-checkers) as each new language adapts to their respective niche.

Despite runtime fragmentation, you will see the opposite trend in userland code: programs authored in these disparate languages will begin to more closely resemble one another as they become more mathematical in nature. In a sense, mathematical expressions will become the portable “lingua franca” of userland code, especially as non-mathematical concerns get pushed into each respective language’s runtime.

That is a prediction that is much easier to falsify.

Also, if you like this post then you will probably also enjoy the seminal paper: The Next 700 Programming Languages.

1. Also, this interpretation is not that far from the truth as the Haskell standard only specifies a non-strict evaluation strategy. GHC is lazy, but Haskell the language standard does not require implementations to be lazy. For more details, see: Lazy vs. non-strict↩︎

2. Okay, this is a bit of an over-simplification because Dhall has support for Optional values and you can model errors using unions in Dhall, but they are not commonly used in this way and idiomatically most Dhall code in the wild uses the type-checker to catch errors. Dhall is a total functional programming language and the language goes to significant lengths to discourage runtime errors, like forbidding comparing Text values for equality. Also, the language is technically dependently typed and supports testing arbitrary code at type-checking time to catch errors statically.↩︎

# Continued Fractions: Haskell, Equational Reasoning, Property Testing, and Rewrite Rules in Action

### Abstract

In this article, we’ll develop a Haskell library for continued fractions. Continued fractions are a different representation for real numbers, besides the fractions and decimals we all learned about in grade school. In the process, we’ll build correct and performant software using ideas that are central to the Haskell programming language community: equational reasoning, property testing, and term rewriting.

I posted an early version of this article to Reddit, and I’m grateful to those who responded with insight and questions that helped me complete this journey, especially iggybibi and lpsmith.

### Introduction to continued fractions

Let’s start with a challenge. Assume I know how to write down an integer. Now, how can I write down a real number? In school, we’ve all learned a few answers to this question. The big two are fractions and decimals.

Fractions have simple forms for rational numbers. Just write two integers, for the numerator and denominator. That’s great! But alas, they are no good at all for irrational numbers like π, or the square root of 2. I could, of course, write a fraction that’s close to the irrational number, such as 22/7 as an approximation of π. But this is only good up to a certain limited precision. To get a closer approximation (say, 355/113), I have to throw it away and start over. Sure, I could write an infinite sequence of fractions which converge to the rational number, but every finite prefix of such a sequence is redundant and wasted.

Irrationals are just the limiting case of rational numbers, and very precise rational numbers suffer from this, too. It’s not easy to glance at a fraction like 103993/33102 and get a sense of its value. (Answer: this is another approximation of π.) Or, can you quickly tell whether 5/29 is larger or smaller than 23/128?

Instead, we can turn to decimals. If you’re like me, you don’t think about the mechanics of decimal numbers very often, and conventional notation like 3.14159… really hides what’s going on. I’d like to offer a different perspective by writing it like this:

Here, d₀ is the integer part, and d₁, d₂, d₃, d₄, etc. are called the decimal digits. Notice the symmetry here: after the integer part, a decimal has inside of it another decimal representing its tenths.

This recursive structure leads to the main property that we care about with decimals: we can truncate them. If I have a hundred-digit decimal number, and don’t need that much accuracy, I can simply chop off what I don’t need to get a simpler approximation. This is the same property that lets me write a decimal representation of an irrational number. As long as we accept that infinite sequences exist — and as Haskell programmers, infinite data structures don’t scare us…. much — there is a unique infinite sequence of digits that represents any irrational number. The sequence must be infinite is to be expected, since there are uncountably many irrationals, and any finite representation has only countable values. (Unfortunately, this uniqueness isn’t quite true of rational numbers, since 0.4999… and 0.5 are the same number, for example!)

As nice as that is, though, there are also some significant disadvantages to decimals as a representation:

• Although a few rational numbers have simple decimal representations, almost all rational numbers actually need infinite repeating decimals. Specifically, a rational number whose denominator has any prime factor other than 2 or 5 repeats indefinitely.
• Why 10? This is a question for historians, anthropologists, perhaps biologists, but has no answer within mathematics. It is a choice that matters, too! In a different base, a different subset of the rational numbers will have finite decimals. It’s the choice of base 10 that makes it harder to compute with 1/3. Having to make an arbitrary choice about something that matters so much is unfortunate.

We can make up some ground on these disadvantages with a third representation: continued fractions. A continued fraction is an expression of this form:

Just like with decimals, this gives a representation of a real number as a (possibly finite, or possibly infinite) sequence [n₀, n₁, n₂, n₃, n₄, …]. Now we call the integer parts terms, instead of digits. The first term, n₀, is the integer part. The difference is that when we have a remainder to write, we take its reciprocal and continue the sequence with that.

Note:

• Continued fractions represent all rational numbers as finite sequences of terms, while still accounting for all irrationals using infinite sequences.
• Continued fractions do not depend on an arbitrary choice of base. The reciprocal is the same regardless of how we choose to write numbers.
• Continued fractions, like decimals, can be truncated to produce a rational approximation. Unlike decimals, though, the approximations are optimal in the sense that they are as close as possible to the original number without needing a larger denominator.

With this motivation in mind, let’s see what we can do about using continued fractions to represent real numbers.

### Part 1: Types and Values

Time to write some code. As a declarative language that promotes equational reasoning, Haskell is a joy to use for problems like this one.

If you want to see it all together, the code and tests for this part can be found at https://code.world/haskell#PVxpksYe_YAcGw3CNdvEFzw.

To start, we want a new type for continued fractions. A continued fraction is a (finite or infinite) sequence of terms, so we could simply represent it with a Haskell list. However, I find it more readable to define a new type so that we can choose constructors with suggestive names. (It helps with this decision that standard list combinators like map or ++ don’t have obvious interpretations for continued fractions.) I’ve defined the type like this.

data CFrac where    (:+/) :: Integer -> CFrac -> CFrac    Inf :: CFrac
deriving instance Eq CFracderiving instance Show CFrac
infixr 5 :+/

Here, x :+/ y means x + 1/y, where x is an integer, and y is another continued fraction (the reciprocal of the remainder). This is the basic building block of continued fractions, as we saw above.

The other constructor may be surprising, though. I’ve defined a special continued fraction representing infinity! Why? We’ll use it for terminating fractions. When there is no remainder, we have x = x + 0 = x + 1/∞, so Inf is the reciprocal of the remainder for a continued fraction that represents an integer. That we can represent ∞ itself as a top-level continued fraction wasn’t the goal, but it doesn’t seem worth dodging. Noticing that an empty list of terms acts like infinity is actually a great intuition for things to come.

There is a simple one-to-one correspondence between continued fractions and their lists of terms, as follows:

• The :+/ operator corresponds to the list cons operator, :.
• The Inf value corresponds to the empty list, [].

We can define conversions to witness this correspondence between CFrac and term lists.

terms :: CFrac -> [Integer]terms Inf = []terms (n :+/ x) = n : terms x
fromTerms :: [Integer] -> CFracfromTerms = foldr (:+/) Inf

Rational numbers can be converted into continued fractions by following the process described above, and the code is very short.

cfFromFrac :: Integer -> Integer -> CFraccfFromFrac _ 0 = InfcfFromFrac n d = n div d :+/ cfFromFrac d (n mod d)
cfFromRational :: Rational -> CFraccfFromRational r = cfFromFrac (numerator r) (denominator r)

The real fun of continued fractions, though, is that we can represent irrational numbers precisely, as well!

Rational numbers are precisely the solutions to linear equations. The simplest irrational numbers to write as continued fractions are the quadratic irrationals: numbers that are not rational, but are solutions to quadratic equations, and these are precisely the continued fractions with infinitely repeating terms. We can write a function to build these:

cycleTerms :: [Integer] -> CFraccycleTerms ns = fix (go ns) where go [] x = x go (t : ts) x = t :+/ go ts x

And then we can write a quick catalog of easy continued fractions, including some small square roots and the golden ratio.

sqrt2 :: CFracsqrt2 = 1 :+/ cycleTerms [2]
sqrt3 :: CFracsqrt3 = 1 :+/ cycleTerms [1, 2]
sqrt5 :: CFracsqrt5 = 2 :+/ cycleTerms [4]
phi :: CFracphi = cycleTerms [1]

It’s really worth pausing to think how remarkable it is that these fundamental quantities, whose decimal representations have no obvious patterns at all, are so simple as continued fractions!

And it doesn’t end there. Euler’s constant e, even though it’s a transcendental number, also has a simple pattern as a continued fraction.

exp1 :: CFracexp1 = 2 :+/ fromTerms (concatMap (\n -> [1, 2 * n, 1]) [1 ..])

This really strengthens the claim I made earlier, that being based on the reciprocal instead of multiplying by an arbitrary base makes continued fractions somehow less arbitrary than decimals.

I wish I could tell you that π has a similar nice pattern, but alas, it does not. π has a continued fraction just as random as its representation in other notations, and just as hard to compute to arbitrary precision.

It will be interesting later, though, to look at its first few terms. For that, it’s good enough to use the Double approximation to π from Haskell’s base library. It’s not really π; in fact, it’s a rational number! But it‘s close enough for our purposes. Computing an exact value of π needs some more machinery, which we will develop in the next section.

approxPi :: CFracapproxPi = cfFromRational (realToFrac (pi :: Double))

#### Canonical forms

We’ll now return to a more theoretical concern. We want each real number to have a unique continued fraction representation. In fact, I snuck something by you earlier when I derived an Eq instance for CFrac, because that instance is only valid when the type has unique representations. However, this isn’t quite true in general. Here are some examples of continued fractions that are written differently, but have the same value:

Case (1) deals with negative numbers. These will actually cause quite a few problems — not just now, but in convergence properties of algorithms later, too. I have been quietly assuming up to this point that all the numbers we’re dealing with are non-negative. Let’s make that assumption explicit, and require that all continued fraction terms are positive. That takes care of case (1).

This might seem like a terrible restriction. In the end, though, we can recover negative numbers in the same way humans have done so for centuries: by keeping track of a sign, separate from the absolute value. A signed continued fraction type, then, would wrap CFrac and include a Bool field for whether it’s negative. This is entirely straight-forward, and I leave it as an exercise for the interested reader.

Case (2) involves a term with a value of zero. It’s normal for the integer part of a continued fraction to be zero. After that, though, the remaining terms should never be zero, because they are the reciprocals of numbers less than one. So we will impose a second rule that only the first term of a continued fraction may be zero.

Case (3) involves a rational continued fraction with 1 as its final term. It turns out that even after solving cases (1) and (2), every rational number has two distinct continued fractions: one that has a 1 that is not the integer part as its last term, and one that doesn’t. This is analogous to the fact that terminating decimals have two decimal representations, one ending in an infinite sequence of 0s, and the other in an infinite sequence of 9s. In this case, the trailing 1 is longer than it needs to be, so we’ll disallow it, making a third rule that a term sequence can never end in 1, unless that 1 is the integer part.

Subject to these three rules, we get the canonical forms we wanted. It’s unfortunate that non-canonical forms are representable in the CFrac type (i.e., we have failed to make illegal data unrepresentable), but we can do the next best thing: check that our computations all produce canonical values. To do so, let’s write some code to check that a CFrac obeys these rules.

isCanonical :: CFrac -> BoolisCanonical Inf = TrueisCanonical (n :+/ cont) = n >= 0 && isCanonicalCont cont
isCanonicalCont :: CFrac -> BoolisCanonicalCont Inf = TrueisCanonicalCont (1 :+/ Inf) = FalseisCanonicalCont (n :+/ cont) = n > 0 && isCanonicalCont cont

#### Property testing

One very powerful idea that originated in the Haskell community is property testing. We state a property that we want to verify about our code, and allow a testing framework like to QuickCheck to make up examples and test them. Now is a great time to try this out. Here I’ve defined a few properties around canonical forms. The first is a trivial property that just asserts that we make the right decision about the specific examples above. The second, less trivial, guarantees that our cfFromRational function, the only real semantic function we’ve written, produces canonical forms.

prop_isCanonical_examples :: Boolprop_isCanonical_examples =  not (isCanonical (2 :+/ (-2) :+/ Inf))    && isCanonical (1 :+/ 2 :+/ Inf)    && not (isCanonical (1 :+/ 0 :+/ 2 :+/ Inf))    && isCanonical (3 :+/ Inf)    && not (isCanonical (1 :+/ 1 :+/ Inf))    && isCanonical (2 :+/ Inf)
prop_cfFromRational_isCanonical :: NonNegative Rational -> Boolprop_cfFromRational_isCanonical (NonNegative x) =  isCanonical (cfFromRational x)

(We could try to check that the irrational values are canonical, as well, but alas, we run into non-termination since they are infinite. This will be a theme: we’ll have to be satisfied with checking rational test cases, and relying on the fact that any bugs with irrational values will manifest in some nearby rational value.)

We can now run our code for this section. Try it yourself at https://code.world/haskell#PVxpksYe_YAcGw3CNdvEFzw. In addition to running the tests, we’ll print out our example continued fractions so we can see them.

Testing prop_isCanonical_examples+++ OK, passed 1 test.
Testing prop_cfFromRational_isCanonical+++ OK, passed 100 tests.
3/7 = 0 :+/ (2 :+/ (3 :+/ Inf))sqrt2 = 1 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2...sqrt3 = 1 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1...sqrt5 = 2 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4...phi = 1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1...e = 2 :+/ (1 :+/ (2 :+/ (1 :+/ (1 :+/ (4 :+/ (1 :+/ (1...approxPi = 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/ (1 :+/  (2 :+/ (1 :+/ (3 :+/ (1 :+/ (14 :+/ (3 :+/ (3 :+/ (2 :+/ (1 :+/  (3 :+/ (3 :+/ (7 :+/ (2 :+/ (1 :+/ (1 :+/ (3 :+/ (2 :+/ (42 :+/  (2 :+/ Inf))))))))))))))))))))))))))

### Part 2: Conversions

In the previous section, we converted from rational numbers into continued fractions. We now turn to the inverse conversion: from a continued fraction to a rational number.

The code and tests for this part can be found at https://code.world/haskell#PemataxO6pmYz99dOpaeT9g.

#### Computing convergents

Not all continued fractions correspond to rational numbers. However, one of the key benefits of the continued fraction representation is that its truncated term sequences represent particularly efficient rational approximations. These rational approximations are called the convergents. It’s not hard to write a simple recursive function to compute the convergents.

naiveConvergents :: CFrac -> [Rational]naiveConvergents Inf = []naiveConvergents (n :+/ r) =  fromInteger n :  map (\x -> fromInteger n + 1 / x) (naiveConvergents r)

We can also use QuickCheck to write a property test, verifying that a rational number is its own final convergent. This gives a sanity check on our implementation.

prop_naiveConvergents_end_at_Rational ::  NonNegative Rational -> Propertyprop_naiveConvergents_end_at_Rational (NonNegative r) =  last (naiveConvergents (cfFromRational r)) === r

You may guess from my choice of name that I am not happy with this implementation. The problem with naiveConvergents is that it recursively maps a lambda over the entire tail of the list. As it recurses deeper into the list, we build up a bunch of nested mapped lambdas, and end up evaluating O(n²) of these lambdas to produce only n convergents.

#### Mobius transformations

Let’s fix that quadratic slowdown. A more efficient implementation can be obtained by defunctionalizing. The lambdas have the form \x -> fromInteger n + 1 / x. To avoid quadratic work, we need a representation that lets us compose functions of this form into something that can be applied in constant time. What works is mobius transformations.

A mobius transformation is a function of the form:

For our purposes, the coefficients a, b, c, and d are always integers. Instead of an opaque lambda, we represent a mobius transformation by its four coefficients.

data Mobius where  Mobius :: Integer -> Integer -> Integer -> Integer -> Mobius
deriving instance Eq Mobiusderiving instance Show Mobius

We seek to rewrite our computation of convergents using mobius transformations. First of all, we will restructure it as a left fold, which exposes the mobius transformation itself as an accumulator. We’ll want these building blocks:

• An identity mobius transformation, to initialize the accumulator.
• A composition to combine two mobius transformations into a single one.

In other words, mobius transformations should form a monoid!

instance Semigroup Mobius where  Mobius a1 b1 c1 d1 <> Mobius a2 b2 c2 d2 =    Mobius      (a1 * a2 + b1 * c2)      (a1 * b2 + b1 * d2)      (c1 * a2 + d1 * c2)      (c1 * b2 + d1 * d2)
instance Monoid Mobius where  mempty = Mobius 1 0 0 1

There are axioms for monoids: mempty should act like an identity, and <> should be associative. We can test these with property tests.

This is as good a time as any to set up a QuickCheck generator for mobius transformations. To keep things sane, we want to always want to choose mobius transformations that have non-zero denominators, since otherwise the function is undefined on all inputs. We will also want non-negative coefficients in the denominator, since this ensures the transformation is defined (and monotonic, which will matter later on) for all non-negative input values.

instance Arbitrary Mobius where  arbitrary =    suchThat      ( Mobius          <$> arbitrary <*> arbitrary <*> (getNonNegative <$> arbitrary)          <*> (getNonNegative <$> arbitrary) ) (\(Mobius _ _ c d) -> max c d > 0)  shrink (Mobius a b c d) = [ Mobius a' b' c' d' | (a', b', NonNegative c', NonNegative d') <- shrink (a, b, NonNegative c, NonNegative d), max c' d' > 0 ] With the generator in place, the tests are trivial to write: prop_Mobius_id :: Mobius -> Propertyprop_Mobius_id m = mempty <> m === m .&&. m <> mempty === m prop_Mobius_assoc :: Mobius -> Mobius -> Mobius -> Propertyprop_Mobius_assoc m1 m2 m3 = (m1 <> m2) <> m3 === m1 <> (m2 <> m3) #### Faster convergents Now we return to computing convergents. Here is the improved implementation. convergents :: CFrac -> [Rational]convergents = go mempty where go m Inf = [] go m (n :+/ x) = mobiusLimit m' : go m' x where m' = m <> Mobius n 1 1 0  mobiusLimit (Mobius a _ c _) = a % c In the expression, go m x, m is a mobius transformation that turns a convergent of x into a convergent for the entire continued fraction. When x is the entire continued fraction, m is the identity function. As we recurse into the continued fraction, we compose transformations onto m so that this remains true. The lambda \x -> fromInteger n + 1 / x from the naive implementation is now defunctionalized into Mobius n 1 1 0. The remaining task is to compute a truncated value of the continued fraction at each step. Recall that a terminating continued fraction is one which has an infinite term. To determine this truncated value, then, we want to consider the limit of the accumulated mobius transformation as its input tends to infinity. This is: and that completes the implementation. A quick test helps us to be confident in the result. prop_convergents_matches_naive :: NonNegative Rational -> Propertyprop_convergents_matches_naive (NonNegative r) = convergents (cfFromRational r) === naiveConvergents (cfFromRational r) #### Converting to decimals Let’s also consider the conversion from continued fractions to decimals. This time, both representations can be approximated by just truncating, so instead of producing a sequence of approximations as we did with rational numbers, we will just a single decimal that lazily adds to the representation with more precision. Mobius transformations are still a good tool for this job, but we need to refine our earlier observations. Consider the general form of a mobius transformation again: It’s worth taking the time now to play around with different choices for a, b, c, and d, and what they do to the function. You can do so with the Desmos link below: We previously noted one of these bounds, and now add the other: As long as c and d are positive (remember when we agreed to keep them so?), f is monotonic on the entire interval [0, ∞), so f is bounded by these two fractions. In particular, if a/c and b/d have the same integer part, then we know this is the integer part of the result of f, regardless of the (always non-negative) value of x. We can express this insight as a function: mobiusIntPart :: Mobius -> Maybe IntegermobiusIntPart (Mobius a b c d) | c /= 0 && d /= 0 && n1 == n2 = Just n1 | otherwise = Nothing where n1 = a div c n2 = b div d We can now proceed in a manner similar to the computation of convergents. We maintain a mobius transformation which maps the remaining continued fraction to the remaining decimal. Initially, that is the identity. But this time, instead of blindly emitting an approximation at each step, we make a choice: • If the mobius transformation is bounded to guarantee the integer part of its result, then we can produce that decimal digit. We then update the transformation m to yield the remaining decimal places after that. This will tend to widen its bounds. • Otherwise, we will pop off the integer part of the continued fraction, and update the transformation to expect the remaining continued fraction terms after that. This will tend to narrow its bounds, so we’ll be closer to producing a new decimal digit. • If we ever end up with the zero transformation (that is, a and b are both zero), then all remaining decimal places will be zero, so we can stop. If we encounter an infinite term of input, though, we still need to continue, but we can update b and d to match a and c, narrowing those bounds to a single point to indicate that we now know the exact value of the input. Here is the implementation: cfToBase :: Integer -> CFrac -> [Integer]cfToBase base = go mempty where go (Mobius 0 0 _ _) _ = [] go m x | Just n <- mobiusIntPart m, let m' = Mobius base (-base * n) 0 1 <> m = n : go m' x go m (n :+/ x) = go (m <> Mobius n 1 1 0) x go (Mobius a _ c _) Inf = go (Mobius a a c c) Inf cfToDecimal :: CFrac -> StringcfToDecimal Inf = "Inf"cfToDecimal x = case cfToBase 10 x of [] -> "0.0" [z] -> show z ++ ".0" (z : digits) -> show z ++ "." ++ concatMap show digits To test this, we will compare the result to standard Haskell output. However, Haskell’s typical Show instances for fractional types are too complex, producing output like 1.2e6 that we don’t want to match. The Numeric module has the simpler functionality we want. There are some discrepancies due to rounding error, as well, so the test will ignore differences in the last digit of output from the built-in types. prop_cfToDecimal_matchesRational :: NonNegative Rational -> Propertyprop_cfToDecimal_matchesRational (NonNegative r) = take n decFromRat === take n decFromCF where decFromRat = showFFloat Nothing (realToFrac r :: Double) "" decFromCF = cfToDecimal (cfFromRational r) n = max 10 (length decFromRat - 1) #### Generalized Continued Fractions Earlier, we settled for only an approximation of π, and I mentioned that π doesn’t have a nice pattern as a continued fraction. That’s true, but it’s not the whole story. If we relax the definition of a continued fraction just a bit, there are several known expressions for π as a generalized continued fraction. The key is to allow numerators other than 1. Here’s one that’s fairly nice: Generalized continued fractions don’t have many of the same nice properties that standard continued fractions do. They are not unique, and sometimes converge very slowly (or not at all!), yielding poor rational approximations when truncated. But we can compute with them using mobius transformations, using the same algorithms from above. In particular, we can use the same tools as above to convert from a generalized continued fraction to a standard one. First, we’ll define a type for generalized continued fractions. data GCFrac where (:+#/) :: (Integer, Integer) -> GCFrac -> GCFrac GInf :: GCFrac deriving instance Show GCFrac This time I haven’t defined an Eq instance, because representations are not unique. Converting from a standard to a generalized continued fraction is trivial. cfToGCFrac :: CFrac -> GCFraccfToGCFrac Inf = GInfcfToGCFrac (n :+/ x) = (n, 1) :+#/ cfToGCFrac x The conversion in the other direction, though, requires a mobius-like algorithm. gcfToCFrac :: GCFrac -> CFracgcfToCFrac = go mempty where go (Mobius a _ c _) GInf = cfFromFrac a c go m gcf@((int, numer) :+#/ denom) | Just n <- mobiusIntPart m = n :+/ go (Mobius 0 1 1 (- n) <> m) gcf | otherwise = go (m <> Mobius int numer 1 0) denom We can write a property test to verify that, at least, this gives a correct round trip from continued fractions to generalized, and back to standard again. prop_GCFrac_roundtrip :: NonNegative Rational -> Propertyprop_GCFrac_roundtrip (NonNegative r) = gcfToCFrac (cfToGCFrac x) === x where x = cfFromRational r Here’s the definition of π above, expressed in code, and converted to a continued fraction. I’ve written a test to verify that it mostly agrees with the approximate value we defined earlier, just as a sanity check on the implementation. gcfPi = (0, 4) :+#/ go 1 where go i = (2 * i - 1, i ^ 2) :+#/ go (i + 1) exactPi = gcfToCFrac gcfPi prop_correct_pi :: Propertyprop_correct_pi = take 17 (cfToDecimal approxPi) === take 17 (cfToDecimal exactPi) #### Results We now have built enough to get some interesting results. For example, we have exact representations several irrational numbers, and we can use those to obtain both rational approximations and long decimal representations of each. In addition to running our tests, we will print the continued fraction terms, convergents, and decimal representation of each of our test values. Check out the full code at https://code.world/haskell#PemataxO6pmYz99dOpaeT9g. Testing prop_naiveConvergents_end_at_Rational+++ OK, passed 100 tests. Testing prop_Mobius_id+++ OK, passed 100 tests. Testing prop_Mobius_assoc+++ OK, passed 100 tests. Testing prop_convergents_matches_naive+++ OK, passed 100 tests. Testing prop_cfToDecimal_matchesRational+++ OK, passed 100 tests. Testing prop_GCFrac_roundtrip+++ OK, passed 100 tests. Testing prop_correct_pi+++ OK, passed 1 test. 3/7:- terms: 0 :+/ (2 :+/ (3 :+/ Inf))- frac : [0 % 1,1 % 2,3 % 7]- dec : 0.428571428571428571428571428571428571428571428571 sqrt2:- terms: 1 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2- frac : [1 % 1,3 % 2,7 % 5,17 % 12,41 % 29,99 % 70,239 % 1- dec : 1.414213562373095048801688724209698078569671875376 sqrt3:- terms: 1 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1- frac : [1 % 1,2 % 1,5 % 3,7 % 4,19 % 11,26 % 15,71 % 41,9- dec : 1.732050807568877293527446341505872366942805253810 sqrt5:- terms: 2 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4- frac : [2 % 1,9 % 4,38 % 17,161 % 72,682 % 305,2889 % 129- dec : 2.236067977499789696409173668731276235440618359611 phi:- terms: 1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1- frac : [1 % 1,2 % 1,3 % 2,5 % 3,8 % 5,13 % 8,21 % 13,34 %- dec : 1.618033988749894848204586834365638117720309179805 e:- terms: 2 :+/ (1 :+/ (2 :+/ (1 :+/ (1 :+/ (4 :+/ (1 :+/ (1- frac : [2 % 1,3 % 1,8 % 3,11 % 4,19 % 7,87 % 32,106 % 39,- dec : 2.718281828459045235360287471352662497757247093699 approxPi:- terms: 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/- frac : [3 % 1,22 % 7,333 % 106,355 % 113,103993 % 33102,1- dec : 3.141592653589793115997963468544185161590576171875 exactPi:- terms: 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/- frac : [3 % 1,22 % 7,333 % 106,355 % 113,103993 % 33102,1- dec : 3.141592653589793238462643383279502884197169399375 Most of these are exact irrational values, so we can only look at a finite prefix of the convergents, which go on forever getting more and more accurate (at the expense of larger denominators). I’ve chopped them off at 50 characters, so that they comfortably fit on the screen, but you can change that to see as much of the value as you like. With this data in front of us, we can learn more about continued fractions and how they behave. Did you notice that the convergents of π jump from 355/113 all the way to 103993/33102? That’s because 355/113 is a remarkably good approximation of π. You have to jump to a much larger denominator to get a better approximation. You can observe this same fact directly from the continued fraction terms. The fifth term is 292, which is unusually large. 292 is the next term after the truncation that yields the convergent 355/113. Large terms like that in a continued fraction indicate unusually good rational approximations. Conversely, what if there are no large continued fraction terms? More concretely, what can we say about the number that has all 1s as its continued fraction? That’s the golden ratio! What this means is that there are no particularly good rational approximations to the golden ratio. And you can see that, and hear it! • Flowers tend to grow new petals or seeds at an angle determined by the golden ratio, specifically because there are no small intervals (i.e., small denominators) after which they will overlap. • Musical notes sound harmonic when their frequencies are close to simple rational numbers. The least harmonic frequencies are those related by the golden ratio, and you can hear it! (By the way, did you notice the fibonacci numbers in the convergents of the golden ratio? That’s a whole other topic I won’t get into here.) ### Part 3: Arithmetic Now that we’ve got a nice type and some conversions, let’s turn our attention to computing with continued fractions. The code and tests for this section can be found at https://code.world/haskell#P_T4CkfWGu3bgSwFV0R04Sg. #### Simple computations There are a few computations that are easy, so let’s warm up with those. Computing the reciprocal of a continued fraction turns out to be almost trivial, because continued fractions are built out of reciprocals! All we need to do for this one is shift all the terms by one… in either direction, really. Since our normal form only allows zero as the first term, we’ll make the decision based on whether the whole part is already zero. We also need a special case for 1 to prevent building a non-normal representation. cfRecip :: CFrac -> CFraccfRecip (0 :+/ x) = xcfRecip (1 :+/ Inf) = 1 :+/ InfcfRecip x = 0 :+/ x We will test a few expected properties: • The reciprocal is self-inverse. • The result matches taking a reciprocal in the rationals. • The reciprocal always gives an answer in normal form. prop_recipRecip_is_id :: NonNegative Rational -> Propertyprop_recipRecip_is_id (NonNegative r) = r /= 0 ==> cfRecip (cfRecip (cfFromRational r)) === cfFromRational r prop_cfRecip_matches_Rational :: NonNegative Rational -> Propertyprop_cfRecip_matches_Rational (NonNegative r) = r /= 0 ==> cfFromRational (recip r) === cfRecip (cfFromRational r) prop_cfRecip_isCanonical :: NonNegative Rational -> Propertyprop_cfRecip_isCanonical (NonNegative r) = r /= 0 ==> isCanonical (cfRecip (cfFromRational r)) Comparing two continued fractions is also not too difficult. It’s similar to comparing decimals, in that you look down the sequence for the first position where they differ, and that determines the result. However, since we take reciprocals at each term of a continued fraction, we’ll need to flip the result at each term. cfCompare :: CFrac -> CFrac -> OrderingcfCompare Inf Inf = EQcfCompare _ Inf = LTcfCompare Inf _ = GTcfCompare (a :+/ a') (b :+/ b') = case compare a b of EQ -> cfCompare b' a' r -> r Let’s write a QuickCheck property to verify that the result matches the Ord instance for Rational, and then define an Ord instance of our own. prop_cfCompare_matches_Rational :: NonNegative Rational -> NonNegative Rational -> Propertyprop_cfCompare_matches_Rational (NonNegative x) (NonNegative y) = compare x y === cfCompare (cfFromRational x) (cfFromRational y) instance Ord CFrac where compare = cfCompare #### Arithmetic with rational numbers To make any progress on the rest of basic arithmetic, we must return to our trusty old hammer: the mobius transformation. Recall that we’ve used the mobius transformation in two ways so far: • To produce convergents, we produced mobius transformations from terms, composed them together, and also produced a new approximation at each step. • To produce decimals, we looked at the bounds on the result of a mobius transformation, and proceeded in one of two ways: produce a bit of output, and then add a new mobius transformation to the output side; or consume a term of input, and then add a new mobius transformation to the input side. We will now implement mobius transformations that act on continued fractions and produce new continued fractions. The strategy is the same as it was when producing decimals, but modified to produce continued fraction terms as output instead of decimal digits. That leads to this implementation. cfMobius :: Mobius -> CFrac -> CFraccfMobius (Mobius a _ c _) Inf = cfFromFrac a ccfMobius (Mobius _ _ 0 0) _ = InfcfMobius m x | Just n <- mobiusIntPart m = n :+/ cfMobius (Mobius 0 1 1 (- n) <> m) xcfMobius m (n :+/ x) = cfMobius (m <> Mobius n 1 1 0) x There are two properties to test here. The first is that computations on continued fractions match the same computations on rational numbers. To implement this test, we’ll need an implementation of mobius transformations on rational numbers. Then we’ll test that cfMobius gives results in their canonical form. For both tests, we don’t care about transformations whose rational results are negative, as we will just agree not to evaluate them. mobius :: (Eq a, Fractional a) => Mobius -> a -> Maybe amobius (Mobius a b c d) x | q == 0 = Nothing | otherwise = Just (p / q) where p = fromInteger a * x + fromInteger b q = fromInteger c * x + fromInteger d prop_cfMobius_matches_Rational :: Mobius -> NonNegative Rational -> Propertyprop_cfMobius_matches_Rational m (NonNegative r) = case mobius m r of Just x | x >= 0 -> cfMobius m (cfFromRational r) === cfFromRational x _ -> discard prop_cfMobius_isCanonical :: Mobius -> NonNegative Rational -> Propertyprop_cfMobius_isCanonical m (NonNegative r) = case mobius m r of Just rat | rat >= 0 -> let x = cfMobius m (cfFromRational r) in counterexample (show x) (isCanonical x) _ -> discard The punchline here is that, using an appropriate mobius transformation, we can perform any of the four arithmetic functions — addition, subtraction, multiplication, or division — involving one continuous fraction and one rational number. #### Binary operations on continued fractions This was all sort of a warm-up, though, for the big guns: binary operations on two continued fractions, yielding a new continued fraction. Here, for the first time, our trusted mobius transformations fail us. Indeed, this remained an unsolved problem until 1972, when Bill Gosper devised a solution. That solution uses functions of the form: Again, for our purposes, all coefficients will be integers, and the coefficients in the denominator will always be positive. Since these are the generalization of mobius transformations to binary operations, we can call them the bimobius transformations. data Bimobius where BM :: Integer -> Integer -> Integer -> Integer -> Integer -> Integer -> Integer -> Integer -> Bimobius deriving instance Eq Bimobiusderiving instance Show Bimobius The rest of the implementation nearly writes itself! We simply want to follow the same old strategy. We keep a bimobius transformation that computes the rest of the output, given the rest of the inputs. At each step, we emit more output if possible, and otherwise consume a term from one of two inputs. As soon as either input is exhausted (if, indeed that happens), the remaining calculation reduces to a simple mobius transformation of the remaining argument. First, we need an analogue of mobiusIntPart, which was how we determined whether an output was available. bimobiusIntPart :: Bimobius -> Maybe IntegerbimobiusIntPart (BM a b c d e f g h) | e /= 0 && f /= 0 && g /= 0 && h /= 0 && n2 == n1 && n3 == n1 && n4 == n1 = Just n1 | otherwise = Nothing where n1 = a div e n2 = b div f n3 = c div g n4 = d div h The other building block we used to implement mobius transformations was composition with the Monoid instance. Unfortunately, that’s not so simple, since bimobius transformations have two inputs. There are now three composition forms. First, we can compose the mobius transformation on the left, consuming the output of the bimobius: (<>||) :: Mobius -> Bimobius -> Bimobius-- (mob <>|| bimob) x y = mob (bimob x y)Mobius a1 b1 c1 d1 <>|| BM a2 b2 c2 d2 e2 f2 g2 h2 = BM a b c d e f g h where a = a1 * a2 + b1 * e2 b = a1 * b2 + b1 * f2 c = a1 * c2 + b1 * g2 d = a1 * d2 + b1 * h2 e = c1 * a2 + d1 * e2 f = c1 * b2 + d1 * f2 g = c1 * c2 + d1 * g2 h = c1 * d2 + d1 * h2 We can also compose it on the right side, transforming either the first or second argument of the bimobius. (||<) :: Bimobius -> Mobius -> Bimobius-- (bimob ||< mob) x y = bimob (mob x) yBM a1 b1 c1 d1 e1 f1 g1 h1 ||< Mobius a2 b2 c2 d2 = BM a b c d e f g h where a = a1 * a2 + c1 * c2 b = b1 * a2 + d1 * c2 c = a1 * b2 + c1 * d2 d = b1 * b2 + d1 * d2 e = e1 * a2 + g1 * c2 f = f1 * a2 + h1 * c2 g = e1 * b2 + g1 * d2 h = f1 * b2 + h1 * d2 (||>) :: Bimobius -> Mobius -> Bimobius-- (bimob ||> mob) x y = bimob x (mob y)BM a1 b1 c1 d1 e1 f1 g1 h1 ||> Mobius a2 b2 c2 d2 = BM a b c d e f g h where a = a1 * a2 + b1 * c2 b = a1 * b2 + b1 * d2 c = c1 * a2 + d1 * c2 d = c1 * b2 + d1 * d2 e = e1 * a2 + f1 * c2 f = e1 * b2 + f1 * d2 g = g1 * a2 + h1 * c2 h = g1 * b2 + h1 * d2 That’s a truly dizzying bunch of coefficients! I’m definitely not satisfied without property tests to ensure they match the intended meanings. To do that, we need two pieces of test code. First, we want to generate random bimobius transformations with an Arbitrary instance. As with mobius transformations, the instance will guarantee that the denominators are non-zero and have non-negative coefficients. instance Arbitrary Bimobius where arbitrary = suchThat ( BM <$> arbitrary          <*> arbitrary          <*> arbitrary          <*> arbitrary          <*> (getNonNegative <$> arbitrary) <*> (getNonNegative <$> arbitrary)          <*> (getNonNegative <$> arbitrary) <*> (getNonNegative <$> arbitrary)      )      (\(BM _ _ _ _ e f g h) -> maximum [e, f, g, h] > 0)
  shrink (BM a b c d e f g h) =    [ BM a' b' c' d' e' f' g' h'      | ( a',          b',          c',          d',          NonNegative e',          NonNegative f',          NonNegative g',          NonNegative h'          ) <-          shrink            ( a,              b,              c,              d,              NonNegative e,              NonNegative f,              NonNegative g,              NonNegative h            ),        maximum [e', f', g', h'] > 0    ]

And second, we want a simple and obviously correct base implementation of the bimobius transformation to compare with.

bimobius :: (Eq a, Fractional a) => Bimobius -> a -> a -> Maybe abimobius (BM a b c d e f g h) x y  | q == 0 = Nothing  | otherwise = Just (p / q)  where    p =      fromInteger a * x * y        + fromInteger b * x        + fromInteger c * y        + fromInteger d    q =      fromInteger e * x * y        + fromInteger f * x        + fromInteger g * y        + fromInteger h

With this in mind, we’ll check each of the three composition operators performs as expected on rational number, at least. There’s one slight wrinkle: when composing the two functions manually leads to an undefined result, the composition operator sometimes cancels out the singularity and produces an answer. I’m willing to live with that, so I discard cases where the original result is undefined.

prop_mob_o_bimob ::  Mobius -> Bimobius -> Rational -> Rational -> Propertyprop_mob_o_bimob mob bimob r1 r2 =  case mobius mob =<< bimobius bimob r1 r2 of    Just ans -> bimobius (mob <>|| bimob) r1 r2 === Just ans    Nothing -> discard
prop_bimob_o_leftMob ::  Bimobius -> Mobius -> Rational -> Rational -> Propertyprop_bimob_o_leftMob bimob mob r1 r2 =  case (\x -> bimobius bimob x r2) =<< mobius mob r1 of    Just ans -> bimobius (bimob ||< mob) r1 r2 === Just ans    Nothing -> discard
prop_bimob_o_rightMob ::  Bimobius -> Mobius -> Rational -> Rational -> Propertyprop_bimob_o_rightMob bimob mob r1 r2 =  case (\y -> bimobius bimob r1 y) =<< mobius mob r2 of    Just ans -> bimobius (bimob ||> mob) r1 r2 === Just ans    Nothing -> discard

Now to the task at hand: implementing bimobius transformations on continued fractions.

cfBimobius :: Bimobius -> CFrac -> CFrac -> CFraccfBimobius (BM a b _ _ e f _ _) Inf y = cfMobius (Mobius a b e f) ycfBimobius (BM a _ c _ e _ g _) x Inf = cfMobius (Mobius a c e g) xcfBimobius (BM _ _ _ _ 0 0 0 0) _ _ = InfcfBimobius bm x y  | Just n <- bimobiusIntPart bm =    let bm' = Mobius 0 1 1 (- n) <>|| bm in n :+/ cfBimobius bm' x ycfBimobius bm@(BM a b c d e f g h) x@(x0 :+/ x') y@(y0 :+/ y')  | g == 0 && h == 0 = consumeX  | h == 0 || h == 0 = consumeY  | abs (g * (h * b - f * d)) > abs (f * (h * c - g * d)) = consumeX  | otherwise = consumeY  where    consumeX = cfBimobius (bm ||< Mobius x0 1 1 0) x' y    consumeY = cfBimobius (bm ||> Mobius y0 1 1 0) x y'

There are a few easy cases at the start. If either argument is infinite, then the infinite terms dominate, and the bimobius reduces to a unary mobius transformation of the remaining argument. On the other hand, if the denominator is zero, the result is infinite.

Next is the output case. This follows the same logic as the mobius transformations above:

But since m and m’ are now bimobius transformations, we just use one of the special composition forms defined earlier.

The remaining case is where we don’t have a new term to output, and must expand a term of the input, using the now-familiar equation, but composing with the ||< or ||> operators, instead.

There’s a new wrinkle here, though! Which of the two inputs should we expand? It seems best to make the choice that narrows the bounds the most, but I’ll be honest: I don’t actually understand the full logic behind this choice. I’ve simply copied it from Rosetta Code, where it’s not explained in any detail. We will rely on testing for confidence that it works.

Speaking of testing, we definitely need a few tests to ensure this works as intended. We’ll test the output against the Rational version of the function, and that the result is always in canonical form.

prop_cfBimobius_matches_Rational ::  NonNegative Rational ->  NonNegative Rational ->  Bimobius ->  Propertyprop_cfBimobius_matches_Rational  (NonNegative r1)  (NonNegative r2)  bm =    case bimobius bm r1 r2 of      Just x        | x >= 0 ->          cfFromRational x            === cfBimobius              bm              (cfFromRational r1)              (cfFromRational r2)      _ -> discard
prop_cfBimobius_isCanonical ::  NonNegative Rational ->  NonNegative Rational ->  Bimobius ->  Boolprop_cfBimobius_isCanonical  (NonNegative r1)  (NonNegative r2)  bm =    case bimobius bm r1 r2 of      Just x        | x >= 0 ->          isCanonical            (cfBimobius bm (cfFromRational r1) (cfFromRational r2))      _ -> discard

If you like, it’s interesting to compare the implementations of continued fraction arithmetic here with some other sources, such as:

These sources are great, but I think the code above is nicer. The reason is that with lazy evaluation instead of mutation, you can read the code more mathematically as a set of equations, and see what’s going on more clearly. This is quite different from the imperative implementations, which modify state in hidden places and require a lot of mental bookkeeping to keep track of it all. As I said, Haskell is a joy for this kind of programming.

#### Tying it all together

We now have all the tools we need to write Num and Fractional instances for CFrac. That’s a great way to wrap this all together, so that clients of this code don’t need to worry about mobius and bimobius transformations at all.

checkNonNegative :: CFrac -> CFraccheckNonNegative Inf = InfcheckNonNegative x@(x0 :+/ x')  | x < 0 = error "checkNonNegative: CFrac is negative"  | x > 0 = x  | otherwise = x0 :+/ checkNonNegative x'
instance Num CFrac where  fromInteger n    | n >= 0 = n :+/ Inf    | otherwise = error "fromInteger: CFrac cannot be negative"
  (+) = cfBimobius (BM 0 1 1 0 0 0 0 1)
  x - y = checkNonNegative (cfBimobius (BM 0 1 (-1) 0 0 0 0 1) x y)
  (*) = cfBimobius (BM 1 0 0 0 0 0 0 1)
  signum (0 :+/ Inf) = 0  signum _ = 1
  abs = id
  negate (0 :+/ Inf) = 0 :+/ Inf  negate _ = error "negate: CFrac cannot be negative"
instance Fractional CFrac where  fromRational x    | x < 0 = error "fromRational: CFrac cannot be negative"    | otherwise = cfFromRational x
  recip = cfRecip
  (/) = cfBimobius (BM 0 1 0 0 0 0 1 0)

You can find the complete code and tests for this section at https://code.world/haskell#P_T4CkfWGu3bgSwFV0R04Sg. In addition to running the tests, I’ve included a few computations, demonstrating for example that:

### Part 4: Going faster

Everything we’ve done so far is correct, but there’s one way that it’s not terribly satisfying. It can do a lot of unnecessary work. In this part, we’ll try to fix this performance bug.

You can find the code for this part in https://code.world/haskell#PP8uMTTchtH2F36nJYWT3VQ.

Consider, for example, a computation like x + 0.7. We know (and, indeed, Haskell also knows!) that 0.7 is a rational number. We should, then, be able to work this out by applying cfMobius to a simple mobius transformation.

But that’s not what GHC does. Instead, GHC reasons like this:

• We assume that x is a CFrac. But + requires that both arguments have the same type, so 0.7 must also be a CFrac.
• Since 0.7 is a literal, GHC will implicitly add a fromRational, applied to the rational form 7/10, which (using the Fractional instance defined in the last part) uses cfFromRational to convert 0.7 into the continued
fraction 0 :+/ 1 :+/ 2 :+/ 3 :+/ Inf.
• Now, since we have two CFrac values, we must use the heavyweight cfBimobius to compute the answer, choosing between the two streams at every step. In the process, cfBimobius must effectively undo the conversion of 7/10 into continued fraction form, doubling the unnecessary work.

#### Rewriting expressions

We can fix this with GHC’s rewrite rules. They can avoid this unnecessary conversion, and reduce constant factors at the same time, by rewriting expressions to use cfMobius instead of cfBimobius when one argument need not be a continued fraction. Similarly, when cfMobius is applied to a rational argument, it can be replaced by a simpler call to cfFromFrac after just evaluating the mobius transformation on the Rational directly. Integers are another special case: less expensive because their continued fraction representations are trivial, but but we might as well rewrite them, too.

{-# RULES"cfrac/cfMobiusInt" forall a b c d n.  cfMobius (Mobius a b c d) (n :+/ Inf) =    cfFromFrac (a * n + b) (c * n + d)"cfrac/cfMobiusRat" forall a b c d p q.  cfMobius (Mobius a b c d) (cfFromFrac p q) =    cfFromFrac (a * p + b * q) (c * p + d * q)"cfrac/cfBimobiusInt1" forall a b c d e f g h n y.  cfBimobius (BM a b c d e f g h) (n :+/ Inf) y =    cfMobius (Mobius (a * n + c) (b * n + d) (e * n + g) (f * n + h)) y"cfrac/cfBimobiusRat1" forall a b c d e f g h p q y.  cfBimobius (BM a b c d e f g h) (cfFromFrac p q) y =    cfMobius      ( Mobius          (a * p + c * q)          (b * p + d * q)          (e * p + g * q)          (f * p + h * q)      )      y"cfrac/cfBimobiusInt2" forall a b c d e f g h n x.  cfBimobius (BM a b c d e f g h) x (n :+/ Inf) =    cfMobius (Mobius (a * n + b) (c * n + d) (e * n + f) (g * n + h)) x"cfrac/cfBimobiusRat2" forall a b c d e f g h p q x.  cfBimobius (BM a b c d e f g h) x (cfFromFrac p q) =    cfMobius      ( Mobius          (a * p + b * q)          (c * p + d * q)          (e * p + f * q)          (g * p + h * q)      )      x  #-}

There’s another opportunity for rewriting, and we’ve actually already used it! When mobius or bimobius transformations are composed together, we can do the composition in advance to produce a single transformation. Especially since these transformations usually have simple integer arguments that GHC knows how to optimize, this can unlock a lot of other arithmetic optimizations.

Fortunately, we’ve already defined and tested all of the composition operators we need here.

{-# RULES"cfrac/mobius_o_mobius" forall m1 m2 x.  cfMobius m1 (cfMobius m2 x) =    cfMobius (m1 <> m2) x"cfrac/mobius_o_bimobius" forall m bm x y.  cfMobius m (cfBimobius bm x y) =    cfBimobius (m <>|| bm) x y"cfrac/bimobius_o_mobiusLeft" forall bm m x y.  cfBimobius bm (cfMobius m x) y =    cfBimobius (bm ||< m) x y"cfrac/bimobius_o_mobiusRight" forall bm m x y.  cfBimobius bm x (cfMobius m y) =    cfBimobius (bm ||> m) x y  #-}

#### Inlining, and not

This all looks great… but it doesn’t work. The problem is that GHC isn’t willing to inline the right things at the right times to get to exactly the form we need for these rewrite rules to fire. To fix this, I had to go back through all of the previous code, being careful to annotate many of the key functions with {-# INLINE ... #-} pragmas.

There are two forms of this pragma. When a trivial function might just get in the way of a rewrite rule, it’s annotated with a simple INLINE pragma so that GHC can inline and hopefully eliminate the function.

But when a function actually appears on the left hand side of a rewrite rule, though, we need to step more carefully. Inlining that function can prevent the rewrite rule from firing. We could use NOINLINE for this, but I’ve instead picked the form of inline that includes a phase number. The phase number prevents inlining from occurring prior to that phase. It looks something like:

{-# INLINE [2] cfRecip #-}

Choosing a phase number seems to be something of a black art. I picked 2, and it worked, so… maybe try that one?

All these edits mean we have completely new versions of all of the code, and it’s in these files:

Done… but how do we know it worked?

#### Verifying the rewrites

Rewrite rules are a bit of a black art, and it’s nice to at least get some confirmation that they are working. GHC can tell us what’s going on if we pass some flags. Here are some useful ones:

• -ddump-rule-rewrites will print out a description of every rewrite rule that fires, including the before-and-after code.
• -ddump-simpl will print out the final Core, which is GHC’s Haskell-like intermediate language, after the simplifier (which applies these rules) is done. Adding -dsuppress-all makes the result more readable by hiding metadata we don’t care about.

In CodeWorld, we can pass options to GHC by adding an OPTIONS_GHC pragma to the source code. If you looked at the Part 4 link above, you saw that I’ve done so. The output is long! But there are a few things we can notice.

First, we named our rules, so we can search for the rules by name to see if they have fired. They do! We see this:

...
Rule fired    Rule: cfrac/cfBimobiusInt1    Module: (CFracPart4)
...
Rule fired    Rule: timesInteger    Module: (BUILTIN)    Before: GHC.Integer.Type.timesInteger ValArg 0 ValArg n_aocc    After:  0    Cont:   Stop[RuleArgCtxt] GHC.Integer.Type.Integer
Rule fired    Rule: plusInteger    Module: (BUILTIN)    Before: GHC.Integer.Type.plusInteger              ValArg src<GHC/In an imported module> 0 ValArg 1    After:  1
...
Rule fired    Rule: cfrac/bimobius_o_mobiusLeft    Module: (CFracPart4)
...
Rule fired    Rule: cfrac/cfBimobiusInt2    Module: (CFracPart4)

Good news: our rules are firing!

I highlighted the timesInteger and plusInteger rules, as well. This is GHC taking the result of our rules, which involve a bunch of expressions like a1 * a2 + b1 * c2, and working out the math at compile time because it already knows the arguments. This is great! We want that to happen.

But what about the result? For this, we turn to the -ddump-simpl output. It’s a bit harder to read, but after enough digging, we can find this (after some cleanup):

main4 = 1main3 = 0main2 = 2main1  = $wunsafeTake 50# (cfToDecimal ($wcfMobius main4 main4 main3 main2 sqrt5))

This is essentially the expression take 50 (cfDecimal (cfMobius (Mobius 1 1 0 2) sqrt5)). Recall that what we originally wrote was (1 + sqrt5) / 2. We have succeeded in turning an arithmetic expression of CFrac values into a simple mobius transformation at compile time!

### Conclusion

We finally reach the end of our journey. There are still some loose ends, as there always are. For example:

• There’s still the thing with negative numbers, which I left as an exercise.
• Arithmetic hangs when computing with irrational numbers that produce rational results. For example, try computing sqrt2 * sqrt2, and you’ll be staring at a blank screen while your CPU fan gets some exercise. This might seems like a simple bug, but it’s actually a fundamental limitation of our approach. An incorrect term arbitrarily far into the expression for sqrt2 could change the integer part of the result, so any correct algorithm must read the entire continued fraction before producing an answer. But the entire fraction is infinite! You could try to fix this by special-casing the cyclic continued fractions, but you can never catch all the cases.
• There’s probably plenty more low-hanging fruit for performance. I haven’t verified the rewrite rules in earnest, really. Something like Joachim Breitner’s inspection testing idea could bring more rigor to this. (Sadly, libraries for inspection testing require Template Haskell, so you cannot try it within CodeWorld.)
• Other representations for exact real arithmetic offer more advantages. David Lester, for example, has written a very concise exact real arithmetic library using Cauchy sequences with exponential convergence.

In the end, though, I hope you’ve found this part of the journey rewarding in its own right. We developed a non-trivial library in Haskell, derived the code using equational reasoning (even if it was a bit fuzzy), tested it with property testing, and optimized it with rewrite rules. All of these play very well with Haskell’s declarative programming style.

# Global IORef in Template Haskell

I’m investigating a way to speed up persistent as well as make it more powerful, and one of the potential solutions involves persisting some global state across module boundaries. I decided to investigate whether the “Global IORef Trick” would work for this. Unfortunately, it doesn’t.

On reflection, it seems obvious: the interpreter for Template Haskell is a GHCi-like process that is loaded for each module. Loading an interpreter for each module is part of why Template Haskell imposes a compile-time penalty - in my measurements, it’s something like ~100ms. Not huge, but noticeable on large projects. (I still generally find that DeriveGeneric and the related Generic code to be slower, but it’s a complex issue).

Anyway, let’s review the trick and obseve the behavior.

# Global IORef Trick

This trick allows you to have an IORef (or MVar) that serves as a global reference. You almost certainly do not need to do this, but it can be a convenient way to hide state and make your program deeply mysterious.

Here’s the trick:

module Lib where

import Data.IORef
import System.IO.Unsafe

globalRef :: IORef [String]
globalRef = unsafePerformIO $newIORef [] {-# NOINLINE globalRef #-}  There are two important things to note: 1. You must give a concrete type to this. 2. You must write the {-# NOINLINE globalRef #-} pragma. Let’s say we give globalRef a more general type: globalRef :: IORef [a]  This means that we woudl be allowed to write and read whatever we want from this reference. That’s bad! We could do something like writeIORef globalRef [1,2,3], and then readIORef globalRef :: IO [String]. Boom, your program explodes. Unless you want a dynamically typed reference for some reason - and even then, you’d better use Dynamic. If you omit the NOINLINE pragma, then you’ll just get a fresh reference each time you use it. GHC will see that any reference to globalRef can be inlined to unsafePerformIO (newIORef []), and it’ll happily perform that optimization. But that means you won’t be sharing state through the reference. This is a bad idea, don’t use it. I hesitate to even explain it. # Testing the Trick But, well, sometimes you try things out to see if they work. In this case, they don’t, so it’s useful to document that. We’re going to write a function trackString that remembers the strings that are passed previously, and defines a value that returns those. trackString "hello" -- hello = [] trackString "goodbye" -- goodbye = ["hello"] trackString "what" -- what = ["goodbye", "hello"]  Here’s our full module: {-# language QuasiQuotes #-} {-# language TemplateHaskell #-} module Lib where import Data.IORef import System.IO.Unsafe import Language.Haskell.TH import Language.Haskell.TH.Quote import Language.Haskell.TH.Syntax globalRef :: IORef [String] globalRef = unsafePerformIO$ newIORef []
{-# NOINLINE globalRef #-}

trackStrings :: String -> Q [Dec]
trackStrings input = do
strs <- runIO $readIORef globalRef _ <- runIO$ atomicModifyIORef globalRef (\i -> (input : i, ()))
ty <- [t| [String] |]
pure
[ SigD (mkName input) ty
, ValD (VarP (mkName input)) (NormalB (ListE $map (LitE . stringL)$ strs)) []
]


This works in a single module just fine.

{-# language TemplateHaskell #-}

module Test where

import Lib

trackStrings "what"
trackStrings "good"
trackStrings "nothx"

test :: IO ()
test = do
print what
print good
print nothx


If we evaluate test, we get the following output:

[]
["what"]
["good","what"]


This is exactly what we want.

Unfortunately, this is only module-local state. Given this Main module, we get some disappointing output:



module Main where

import Lib

import Test

trackStrings "hello"

trackStrings "world"

trackStrings "goodbye"

main :: IO ()
main = do
test
print hello
print world
print goodbye

[]
["what"]
["good","what"]
[]
["hello"]
["world","hello"]


To solve my problem, main would have needed to output:

[]
["what"]
["good","what"]
["nothx","good","what"]
["hello","nothx","good","what"]
["world","hello","nothx","good","what"]


# Module-local state in Template Haskell

Fortunately, we don’t even need to do anything awful like this. The Q monad offers two methods, getQ and putQ that allow module-local state.

-- | Get state from the Q monad. Note that the state is
-- local to the Haskell module in which the Template
getQ :: Typeable a => Q (Maybe a)

-- | Replace the state in the Q monad. Note that the
-- state is local to the Haskell module in which the
-- Template Haskell expression is executed.
putQ :: Typeable a => a -> Q ()


These use a Typeable dictionary, so you can store many kinds of state - one for each type! This is a neat way to avoid the “polymorphic reference” problem I described above.

# How to actually solve the problem?

If y’all dare me enough I might write a follow-up where I investigate using a compact region to persist state across modules, but I’m terrified of the potential complexity at play there. I imagine it’d work fine for a single threaded compile, but there’d probably be contention on the file with parallel builds. Hey, maybe I just need to spin up a redis server to manage the file locks… Perhaps I can install nix at compile-time and call out to a nix-shell that installs Redis and runs the server.

## April 20, 2021

### Douglas M. Auclair (geophf)

• 2021-04-20, Tuesday:

I have pairs :: [(a, IO [b])]

but I want pairs' :: IO [(a, b)]

sequence a gives me something like I don't know what: distributing the list monad, not the IO monad. Implement:

sequence' :: [(a, IO [b])] -> IO [(a, b)]

• 2021-04-16, Friday:

You have a monad, or applicative, and you wish to execute the action of the latter but return the result of the former. The simplest representation for me is:

pass :: IO a -> b -> IO b

so:

return 5 >>= pass (putStrLn "Hi, there!")

would return IO 5

GO!

• D Oisín Kidney @oisdk flip (<$) • ⓘ_jack @Iceland_jack ($>)
• 2021-04-12, Monday:

A function that takes the result of another function then uses that result and the original pair of arguments to compute the result:

f :: a -> a -> bg :: b -> a -> a -> c

so:

(\x y -> g (f x y) x y)

curry away the x and y arguments.

• 2021-04-07, Wednesday:
you have (Maybe a, Maybe b)
you want Maybe (a, b)

If either (Maybe a) or (Maybe b) is Nothing
then the answer is Nothing.

If both (Maybe a) and (Maybe b) are (Just ...)
then the answer is Just (a, b)

WHAT SAY YOU?

• Jérôme Avoustin @JeromeAvoustin: bisequence
• p h z @phaazon_ with base: uncurry $liftA2 (,) • greg nwosu @buddhistfist: (,) <$> ma <*> mb

# Odd translation choices

Recently I've been complaining about unforced translation errors. ([1] [2]) Here's one I saw today:

The translation was given as:

“honk honk, your Uber has arrived”

“Oleg, what the fuck”

Now, the Russian text clearly says “beep-beep” (“бип-бип”), not “honk honk”. I could understand translating this as "honk honk" if "beep beep" were not a standard car sound in English. But English-speaking cars do say “beep beep”, so why change the original?

(Also, a much smaller point: I have no objection to translating “Что за херня” as “what the fuck”. But why translate “Что за херня, Олег?” as “Oleg, what the fuck” instead of “What the fuck, Oleg”?)

[ Addendum 20210420: Katara suggested that perhaps the original translator was simply unaware that Anglophone cars also “beep beep”. ]

# This is the video of the keynote talk “Blockchains are functional” that I delivered at...

This is the video of the keynote talk “Blockchains are functional” that I delivered at the ACM SIGPLAN International Conference on Functional Programming 2019. Here is the abstract:

Functional programming and blockchains are a match made in heaven! The immutable and reproducible nature of distributed ledgers is mirrored in the semantic foundation of functional programming. Moreover, the concurrent and distributed operation calls for a programming model that carefully controls shared mutable state and side effects. Finally, the high financial stakes often associated with blockchains suggest the need for high assurance software and formal methods.
Nevertheless, most existing blockchains favour an object-oriented, imperative approach in both their implementation as well as in the contract programming layer that provides user-defined custom functionality on top of the basic ledger. On the one hand, this might appear surprising, given that it is widely understood that this style of programming is particularly risky in concurrent and distributed systems. On the other hand, blockchains are still in their infancy and little research has been conducted into associated programming language technology.
In this talk, I explain the connection between blockchains and functional programming as well as highlight several areas where functional programming, type systems, and formal methods have the potential to advance the state of the art. Overall, I argue that blockchains are not just a well-suited application area for functional programming techniques, but that they also provide fertile ground for future research. I illustrate this with evidence from the research-driven development of the Cardano blockchain and its contract programming platform, Plutus. Cardano and Plutus are implemented in Haskell and Rust, and the development process includes semi-formal specifications together with the use of Agda, Coq, and Isabelle to formalise key components.

# Arrows, through a different lens

Our previous posts on computational pipelines, such as those introducing Funflow and Porcupine, show that Arrows are very useful for data science workflows. They allow the construction of effectful and composable pipelines whose structure is known at compile time, which is not possible when using Monads. However, Arrows may seem awkward to work with at first. For instance, it’s not obvious how to use lenses to access record fields in Arrows.

My goal in this post is to show how lenses and other optics can be used in Arrow-based workflows. Doing so is greatly simplified thanks to Profunctor optics and some utilities that I helped add to the latest version of the lens library.

## Optics on functions

We’re used to think of lenses in terms of getters and setters, but I’m more interested today in the functions over and traverseOf.

-- We will use this prefix for the remaining of the post.
-- VL stands for Van Laarhoven lenses.
import qualified Control.Lens as VL

-- Transform a pure function.
over :: VL.Lens s t a b -> (a -> b) -> (s -> t)

-- Transform an effectful function.
traverseOf :: VL.Traversal s t a b -> (a -> m b) -> (s -> m t)

We would like to use similar functions on Arrow-based workflows, something like

overArrow :: VL.Lens s t a b -> Task a b -> Task s t

However, the type of lenses:

type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t

doesn’t make it very obvious how to define overArrow.

On the other hand, Arrows come equipped with functions first and second:

first :: Task b c -> Task (b, d) (c, d)
second :: Task b c -> Task (d, b) (d, c)

which very much feel like specialised versions of overArrow for the lenses

_1 :: VL.Lens (b, d) (c, d) b c
_2 :: VL.Lens (d, b) (d, c) b c

so maybe there is a common framework that can take both of these into account? The answer is yes, and the solution is lenses — but lenses of a different type.

## Profunctor optics

There is an alternative and equivalent formulation of optics, called Profunctor optics, that works very well with Arrows. Optics in the Profunctor framework have the following shape:

type Optic p s t a b = p a b -> p s t

with more precise optics such as Lens being obtained by imposing constraints to p coming from the different Profunctor classes. In other words, an Optic is precisely a higher-order function acting on some profunctor. Because every Arrow is also a Profunctor1, the shape of an Optic is precisely what is needed to act on Arrows! Moreover, like the optics of the lens library, profunctor optics can be composed like regular functions, with (.).

The lens library now includes a module containing functions that convert between standard and profunctor optics, which makes using them very convenient.

In the following sections, we will go through the use and the intuition of the most common optics: Lens, Prism and Traversal. But first, let’s import the compatibility module for profunctor optics:

-- PL for Profunctor Lenses
import Control.Lens.Profunctor as PL

## Lenses

Standard lenses are all about products — view, for example, is used to deconstruct records:

view _fst :: (a, b) -> a

Therefore, it makes sense for Profunctor lenses to also talk about products. Indeed, that is exactly what happens, through the Strong type class:

class Profunctor p => Strong p where
first' :: p a b -> p (a, c) (b, c)
second' :: p a b -> p (c, a) (c, b)

With profunctor optics, a Lens is defined as follows:

type Lens s t a b = forall p. Strong p => p a b -> p s t

Every Arrow satisfies the Strong class. If we squint, we can rewrite the type of these functions as:

first' :: Lens' (a,c) (b,c) a b
second' :: Lens' (c,a) (c,b) a b

That is, a Strong profunctor is equipped with lenses to reach inside products. One can always convert a record into nested pairs and act on them using Strong — the Lens just makes this much more convenient.

But how do we build a Lens? Besides writing them manually, we can also use all Lenses from lens:

PL.fromLens :: VL.Lens s t a b -> Lens s t a b

which means we can still use all the lenses we know and love. For example, one can apply a task to a tuple of arbitrary size:

PL.fromLens _1 :: Task a b -> Task (a,x,y) (b,x,y)

Summarizing, a Strong profunctor is one we can apply lenses to. Since every Arrow is also a Strong profunctor, one can use Lenses with them.

## Prisms

Standard prisms are all about sums — preview, for example, is used to deconstruct sum-types:

view _Left :: Either a b -> Maybe a

Therefore, it makes sense for Profunctor prisms to also talk about sums. Indeed, that is exactly what happens, through the Choice type class:

class Profunctor p => Choice p where
left' :: p a b -> p (Either a c) (Either b c)
right' :: p a b -> p (Either c a) (Either c b)

With profunctor optics, a Prism is defined as follows:

type Prism s t a b = forall p. Choice p => p a b -> p s t

Every ArrowChoice satisfies the Choice class. Once more, we can rewrite the type of these functions as:

left' :: Prism (Either a c) (Either b c) a b
right' :: Prism (Either c a) (Either c b) a b

That is, a Choice profunctor is equipped with prisms to discriminate sums. One can always convert a sum into nested Eithers and act on them using Choice — the Prism just makes this much more convenient.

But how do we build a Prism? We can also use any prisms from lens with a simple conversion:

PL.fromPrism :: VL.Prism s t a b -> Prism s t a b

For example, one can execute a task conditionally, depending on the existence of the input:

PL.fromPrism _Just :: Action a b -> Action (Maybe a) (Maybe b)

Summarizing, a Choice profunctor is one we can apply prisms to. Since every ArrowChoice can be a Choice profunctor, one can uses prisms with them.

## Traversals

Standard traversals are all about Traversable structures — mapMOf, for example, is used to execute effectful functions:

mapMOf traverse readFile :: [FilePath] -> IO [String]

Therefore, it makes sense for Profunctor traversals to also talk about these traversable structures. Indeed, that is exactly what happens, through the Traversing type class:

class (Choice p, Strong p) => Traversing p where
traverse' :: Traversable f => p a b -> p (f a) (f b)

With profunctor optics, a Traversal is defined as follows:

type Traversal s t a b = forall p. Traversing p => p a b -> p s t

There is no associated Arrow class that corresponds to this class, but many Arrows, such as Kleisli, satisfy it. We can rewrite the type of this functions as:

traverse' :: Traversable f => Traversal (f a) (f b) a b

That is, a Traversing profunctor can be lifted through Traversable functors.

But how do we build a Traversal? We can also use any Traversal from lens with a simple conversion:

PL.fromTraversal :: VL.Traversal s t a b -> Traversal s t a b

For example, one can have a task and apply it to a list of inputs:

PL.fromTraversal traverse :: Action a b -> Action [a] [b]

## Conclusion

Using Arrows does not stop us from taking advantage of the Haskell ecosystem. In particular, optics interact very naturally with Arrows, both in their classical and profunctor formulations. For the moment, the ecosystem is still lacking a standard library for Profunctor optics, but this is not a show stopper — the lens library itself has most of the tools we need. So the next time you are trying out Funflow or Porcupine, don’t shy away from using lens!

1. The fact that these hierarchies are separated is due to historical reasons.

# More soup-guzzling

A couple of days ago I discussed the epithet “soup-guzzling pie-muncher”, which in the original Medieval Italian was brodaiuolo manicator di torte. I had compained that where most translations rendered the delightful word brodaiuolo as something like “soup-guzzler” or “broth-swiller”, Richard Aldington used the much less vivid “glutton”.

A form of the word brodaiuolo appears in one other place in the Decameron, in the sixth story on the first day, also told by Emilia, who as you remember has nothing good to say about the clergy:

… lo 'nquisitore sentendo trafiggere la lor brodaiuola ipocrisia tutto si turbò…

J. M. Rigg (1903), who had elsewhere translated brodaiuolo as “broth-guzzling”, this time went with “gluttony”:

…the inquisitor, feeling that their gluttony and hypocrisy had received a home-thrust…

G. H. McWilliam (1972) does at least imply the broth:

…the inquisitor himself, on hearing their guzzling hypocrisy exposed…

John Payne (1886):

the latter, feeling the hit at the broth-swilling hypocrisy of himself and his brethren…

Cormac Ó Cuilleanáin's revision of Payne (2004):

…the inquisitor himself, feeling that the broth-swilling hypocrisy of himself and his brethren had been punctured…

And what about Aldington (1930), who dropped the ball the other time and rendered brodaiuolo merely as “glutton”? Here he says:

… he felt it was a stab at their thick-soup hypocrisy…

Oh, Richard.

I think you should have tried harder.

# GHC activities report: February-March 2021

This is the fifth edition of our GHC activities report, which is intended to provide regular updates on the work on GHC and related projects that we are doing at Well-Typed. This edition covers roughly the months of February and and March 2021.

The previous editions are here:

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and Facebook, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!

## Release management

• Ben Gamari finalized releases of GHC 8.10.4 and 9.0.1, and started work on 9.0.2. Additionally, he worked to finish and merge the outstanding patches pending for GHC 9.2, culminating in an alpha release candidate (9.2.1-alpha1).
• Ben also prepared a blog post updating the community on the state of GHC on Apple M1 hardware.

## Frontend

• Doug Wilson proposed adding interprocess semaphore support to allow multiple concurrent compiler processes to make better use of available parallelism (!5176).
• Matthew Pickering performed some simulations and experiments of the proposed changes to increase the parallelism in --make mode (#14095) and the effect of the Doug’s -jsem flag (!5176). The results show both changes have potential to improve compile times when many cores are available.
• Matt opened a GHC proposal to modify the import syntax to distinguish between modules which are used at compile time and runtime. This will mean that enabling TemplateHaskell in your project will cause less recompilation when using ghci and haskell-language-server.
• Ben fixed a long-standing bug rendering the GHCi linker unable to locate dynamic libraries when not explicitly added to LD_LIBRARY_PATH (#19350)

## Profiling and Debugging

• Matt landed the last of the ghc-debug patches into GHC, so it will be ready for use when GHC 9.2 is released.
• Matt has finished off a patch which Ben and David Eichmann started which fixes races in the eventlog implementation. This makes it possible to reliably restart the eventlog and opens up the possibility of remote monitoring tooling which connects to the eventlog over a socket. Matt has made a start on implementing libraries which allow monitoring in this way.
• Matt enhanced the restart support for the eventlog by ensuring certain initialisation events are re-posted every time the eventlog is restarted (!5186).
• Matt added new events to the eventlog which track the current number of allocated blocks and some statistics about memory fragmentation (!5126).
• To aid in recent compiler-performance investigations, Ben has been working on merging corediff, a utility for performing structural comparisons on Core, into his ghc-dump tool.

## RTS

• Matt investigated some discrepancies between OS reported memory usage and live bytes used by a program. This resulted in a new RTS flag (-Fd) in !5036 and a fix which reduces fragmentation due to nursery blocks (!5175). This work is described in more detail in a recent blog post.

## Compiler Performance

• Adam Gundry has been investigating the extremely poor performance of type families that do significant computation (#8095), building on previous work by Ben and Simon Peyton Jones on “coercion zapping”. He has an experimental patch that can significantly improve performance in some cases (!5286), although more work is needed.
• Andreas Klebinger refactored the way the simplifier creates uniques (!4804). This reduces compiler allocations by between 0.1% to 0.5% when compiling with -O.
• Andreas also stomped out some sources of thunks in the simplifier (!4808). The impact varies by program but allocations for GHC were reduced by ~1% in common cases and by up to 4% for particular tests.
• Andreas also finished a similar patch for the code generator (!5236). This mostly benefits users frequently compiling with -O0 where we save around 0.2% of allocations.
• Andreas and Ben also applied the one-shot trick (#18202) to a few more monads inside of GHC. In aggregate this should reduce GHC’s allocations by another half percent in the common case.
• Matt started working on compile time performance and fixed some long standing leaks in the demand analyser. He used ghc-debug and -hi profiling in order to quickly find where the leaks were coming from.
• Matt and Ben started work on a public Grafana dashboard to display the long-term trends in compiler benchmarks in an easy to understand way, taking advantage of the performance measurement infrastructure developed over the last 12 months.
• Ben is investigating a significant regression in compiler performance of the aeson library (#19478) in GHC 9.0.
• Ben reworked the derivation logic for Enum instances, significantly reducing the quantity of code required for such instances.

## Runtime performance

• Andreas has fixed a few more obstacles which will allow GHC to turn on -fstrict-dicts by default (!2575) when compiling with -O2.
• Andreas investigated a runtime regression with GHC 9.0.1 (#19474) which ultimately was caused by vector regressing under certain circumstances. This was promptly fixed by Ben upstream.
• After a review by Simon Peyton Jones, Andreas is evaluating a possible simplification in the implementation of his tag inference analysis.
• Ben investigated and fixed a runtime regression in bytestring caused by GHC 9.0’s less-aggressive simplification of unsafeCoerce (#19539).

## Compiler correctness

• Andreas investigated and fixed (!4926) an issue where the behaviour of unsafeDupablePerformIO had changed in GHC 9.0 as a result of more aggressive optimisation of runRW#. In particular when a user wrote code like:

    unsafePerformIO $do let x = f x writeIORef ref x return x Before the fix it was possible for x to be evaluated before the write to the IORef occurred. • Ben fixed a bug resulting in the linker failing to load libraries with long paths on Windows (#19541) • Ben fixed a rather serious garbage-collector bug affecting code unloading in GHC 8.10.4 and 9.0.1 (#19417). ## Compiler functionality and language extensions • Adam assisted with landing the NoFieldSelectors extension (!4743) and the record dot syntax extensions (!4532). Both of these features will be in GHC 9.2, although record dot syntax will be properly supported only for selection, not for record update. Adam is working on a GHC proposal to address outstanding design questions regarding record update. • Ben finished and merged the BoxedRep implementation started by Andrew Martin, allowing true levity polymorphism in GHC 9.2 (#17526) • Ben has been working with a contributor to finish stack snapshotting functionality, which will allow backtraces on all platforms supported by GHC. ## Compiler error messages refactoring ## CI and infrastructure • Andreas fixed a number of issues to allow windows builds to be validated locally by developers. In particular !4935, !5162, !5200 and !5040. • Ben has been working with the Haskell Foundation to diversify the resource pool supporting GHC’s CI infrastructure. ## April 13, 2021 ### Magnus Therning # Nix shell, direnv and XDG_DATA_DIRS A few weeks ago I noticed that I no longer could use haskell-hoogle-lookup-from-website in Emacs. After a bit of experimentation I found that the reason was that I couldn't use xdg-open in a Nix shell. Yesterday I finally got around to look into further. It's caused by direnv overwriting XDG_DATA_DIRS rather than appending to it. Of course someone already reported a bug already. The workaround is to use use nix --keep XDG_DATA_DIRS  ### Mark Jason Dominus # Scrooge A few months ago I was pondering what it might be like to be Donald Trump. Pretty fucking terrible, I imagine. What's it like, I wondered, to wake up every morning and know that every person in your life is only interested in what they can get from you, that your kids are eagerly waiting for you to die and get out of their way, and that there is nobody in the world who loves you? How do you get out of bed and face that bitter world? I don't know if I could do it. It doesn't get him off the hook for his terrible behavior, of course, but I do feel real pity for the man. It got me to thinking about another pitiable rich guy, Ebeneezer Scrooge. Scrooge in the end is redeemed when he is brought face to face with the fact that his situation is similar to Trump's. Who cares that Scrooge has died? Certainly not his former business associates, who discuss whether they will attend his funeral: “It's likely to be a very cheap funeral,” said the same speaker; “for, upon my life, I don't know of anybody to go to it. Suppose we make up a party, and volunteer.” “I don't mind going if a lunch is provided," observed the gentleman with the excresence on his nose. Later, the Spirit shows Scrooge the people who are selling the curtains stolen from his bed and the shirt stolen from his corpse, and Scrooge begs: “If there is any person in the town who feels emotion caused by this man's death," said Scrooge, quite agonized, “show that person to me, Spirit, I beseech you!” The Spirit complies, by finding a couple who had owed Scrooge money, and who will now, because he has died, have time to pay. I can easily replace Scrooge with Trump in any of these scenes, right up to the end of chapter 4. But Scrooge in the end is redeemed. He did once love a woman, although she left him. Scrooge did have friends, long ago. He did have a sister who loved him, and though she is gone her son Fred still wants to welcome him back into the family. Did Donald Trump ever have any of those things? ### Michael Snoyman # Haskell Foundation Board - Meeting Minutes - April 8, 2021 This blog post is a summary of the meeting minutes for the Haskell Foundation board meeting that took place on April 8, 2021. This is the first time I'm writing these up, and potentially the only time I'm putting them on this blog. So this post is going to be a bit weird; we'll start with some questions. ## Questions Why are you writing these meeting minutes? As you'll see below, one of the decisions at the meeting was selection of additional officers for the board. I was selected as Secretary, which seems to put meeting minutes into my camp. OK, but why are you publishing them on your personal blog? As you'll also see below, the new Haskell Foundation website is nearing completion, and in the future I hope these posts go there. But I wanted to kick things off with what I hope is close to the methodology going forward. Isn't a blog post for each meeting excessive? Yes. As I mentioned in my transparency blog post, transparency includes a balance between too much and too little information. I intend to leverage announcement blog posts for things that deserve an announcement. Where are the actual meeting minutes? They're on Google Drive. I have a bit more information on the Google Drive setup below. With those out of the way, let's get into the noteworthy information from this meeting. ## Opening up Slack A lot of the work in the board so far has been in the "ways of working" direction. Basically: how does the foundation operate, what are the responsibilities of the board, what do officers do, how elections work, etc. Included in that, and in my opinion of significant interest to the community, is how we communicate. All of this information, and the decision making process around it, can be followed in the hf/meta repo on gitlab.haskell.org. I'm not going to try and summarize everything in those documents. Instead, the announcement here is around merge request !12. Specifically: • HF is standardizing on Slack as its avenue of text chatting. • We're going to start opening this up to everyone in the community interested in participating in discussions. • On a personal note, I'm very excited about this. I think a central place for ecosystem discussions is vital. • There's some hesitation/concern about moderation, off-topic discussions, and other points, which we'll need to work out over time. I'd encourage anyone interested in joining in the conversation and staying up to date with topics to request an invite. We're already starting separate topic-specific channels to iterate on various technical topics. Note that Slack is not a replacement for Discourse or other existing platforms. We'll still use Discourse for more official discussions and announcements. (This blog post is an example, the official discussion for it lives on Discourse.) Like many things, we'll likely be figuring out the details over time. ## Board officers There are a total of six board officers, who have all now been selected: • Chair: Richard Eisenberg • Vice Chair: Tom Ellis • Treasurer: Ryan Trinkle • Vice Treasurer: José Pedro Magalhães • Secretary: Michael Snoyman • Vice Secretary: Théophile Hécate Choutri ## New Haskell Foundation website There is a new Haskell Foundation website in the works, which should be ready to go live in the next week. It is fully open source and viewable now: I'm hoping that, in the future, I'll be putting posts like this one on that site instead! ## What's an announcement? Where do minutes go? Where's transparency?!? I'm going to try and reserve these kinds of announcement posts to topics that I think will have widespread interest. I may make mistakes in that judgement, I apologize in advance. My goal is that anyone who wants to stay up to speed on large decisions coming from the board will be able to without using up a lot of their bandwidth. That said, every board meeting (and, for that matter, most or all working group meetings) take and keep meeting notes. We've been sharing these on Discourse, but in the future may simply publish them in the hf/minutes repo. Until now, we've been creating a new Discourse thread and posting the meeting minutes for each meeting. I proposed reducing how often we do that, and instead leave the minutes in Google Drive for those interested. My hope is that a combination of "information is all available in Drive" with "important things get an announcement" will cover most people. But if people would like to see a new Discourse thread for each meeting, I'd like to hear about it. Please comment on this (and other topics) in the Discourse thread linked. ## Get involved! We've been in planning and discussion mode for the past few months on the board. The various working groups are beginning to get some clarity around how they want to function, and are ready for more people to get involved. There's technical work to do, documentation, outreach, and much more. If you're excited to be a part of the efforts of the Haskell Foundation to improve the overall state of Haskell, now's a great time to get in and influence the direction. I strongly encourage everyone to check out the Slack, ask questions, interact on Discourse, and overall dive in! ### Gil Mizrahi # Typing polymorphic variants in Giml In the last blog post we covered extensible records and how we infer their types in Giml. In this blog post we'll take a closer look at another interesting feature and the dual of extensible records - polymorphic variants. There are several approaches for typing polymorphic variants and they have different tradeoffs as well. Giml's system is very similar to OCaml's polymorphic variants system and also shares its limitations. If you are looking for different approaches you might want to look at this/this or this. I'm also not sure polymorphic variants is the best we can do to have extensible sum types, as having recursive polymorphic variants is quite problematic. But there are still plenty of usecases where they could be used so it's worth including them in a language. [Word of warning, this is a bit more complicated than previous posts and it might be best to look at this section as a cookbook. If you see something weird or not explained well, it might be a testement to that I'm a little fuzzy on the details myself] ## Polymorphic Variants Polymorphic variants can be seen as the dual of extensible records. Records are product types, they let us hold multiple values together in one structure at the same time, and by giving a label for each value we can refer to each part. Variants are sum types, they let us define multiple values as alternatives to one another, and we label each alternative as well. There are two operations we can do with variants, one is creating a tagged value, for example #Ok 1, the other is pattern matching on variants. For example: withDefault def x = case x of | #Ok v -> v | #Err _ -> def end The types for polymorphic variants are a bit more complicated. There are actually three different kinds of polymorphic variants: closed, lower-bounded and upper-bounded. In a closed variant we list the different labels and the type of their payload, for example: [ Err : String, Ok : Int ] is a type that have two possible alternative , #Err <some-string> and #Ok <some-integer>. This type is something that can be generated by applying a polymorphic variant to a function that take polymorphic variants as input. An upper-bounded polymorphic variant is open, it describes the maximum set of variants that could be represented by this type, and is represented syntactically with a little < right after the opening bracket (for example: [< Err : String, Ok : Int ]). Why does this exist? This type can be used when we want to use a variant in a pattern matching expression, and we have a set of variants that we can handle. The pattern matching example above is one like that. In that pattern matching expression we could match any subset of [ Err : a, Ok : b ]. So we use an upper-bounded polymorphic variant to express that in types. Lower-bounded polymorphic variant is open as well, it describes the minimum set of variants that could be represented by this type, and is represented will a little > right after the opening bracket (for example: [> Err : String, Ok : Int ]). It can unify with any superset of the specified alternatives. Why does this exist? There are a couple of cases: One, this type is used when we want to pass a variant to a function that does pattern matching on that variant, and has the ability to match on unspecified amount of variants, for example with a wildcard (_) or variable pattern. For example: withDefault def x = case x of | #Ok 0 -> def | #Ok v -> v | _ -> def end In this case, even if we passed #Foo 182 this pattern matching should work, it will just match the wildcard pattern and return def. But note that expressions cannot have multiple types (at least not in Giml), therefore we must specify in our case that the #Ok must have a specific type (Int) and we cannot pass #Ok "hello" instead. In the pattern matching above we could match on [ Ok : Int ] or any other type, and we write this like this: [> Ok : Int ]. Another case is the type of variant "literals" themselves! This could be considered the dual of record selection: when we specify a tagged value (say, #Ok 1), what we really want for it is to fit anywhere that expect at least [ Ok : Int ]. ### Implementation #### Type definition So we add 3 new constructors to represent the three variant types: 1. TypeVariant [(Label, Type)] for closed variant 2. TypePolyVariantLB [(Constr, Type)] TypeVar for lower-bounded polymorphic variant 3. TypePolyVariantUB TypeVar [(Constr, Type)] for upper-bounded polymorphic variant One thing that wasn't visible before when we talked about the syntactic representation of variants is the row type variable. Here, just like with records, we use the row type variable to encoded hidden knowledge about our types. For lower-bounded polymorphic variants (from now on, LB), we use the row type variable to represent other variants (similar to how with records the row type variable was used to represent other fields). For upper-bound polymorphic variants (from now on, UB) the row type variable has a different role. We use the row type variable to keep track of which UB we want to merge together (those with the same row type variable), how to convert a UB to LB (constrain the UB row type variable with the merge of the UB and LB fields + the LB row type variable), and when to treat two UBs as normal variants (those with a different row type variable). We'll explore each scenario soon. #### Elaboration and constraint generation ##### Creating variants When we create a variant, we: 1. Elaborate the type of the payload 2. Generate a type variable 3. Annotate the type of the AST node as TypePolyVariantLB [(<label>, <payload-type>)] <typevar> We will see how this unify with other variants. ##### Pattern matching For matching an expression with one or more patterns of polymorphic variants, we elaborate the expression and constrain the type we got with a newly generated type variable which is going to be our hidden row type variable and representative of the type of the expression (and let's call it tv for now). For each variant pattern matching we run into, we add the constraint: Equality (TypeVar tv) (TypePolyVariantUB tv [(<label>, <payload-type>)]) By making all variant patterns equal to tv, we will eventually need to unify all the types that arise from the patterns, and by keeping the hidden row type variable tv the same for all of them, we annotate that these TypePolyVariantUBs should unify by merging the unique fields on each side and unifying the shared fields. Another special case we have is a regular type variable pattern (capture pattern) and wildcard. In our system, a capture pattern should capture any type, including any variant. This case produces the opposite constraint: Equality (TypeVar tv) (TypePolyVariantLB [] tv2) Note that this special type, TypePolyVariantLB [] tv2, unifies with none variant types as if it a type variable, but with variant types it behave like other TypePolyVariantLB. #### Constraint solving We have 3 new types, TypeVariant, TypePolyVariantLB and TypePolyVariantUB. And in total we have 7 unique new cases to handle: • Equality TypeVariant TypeVariant • Equality TypeVariant TypePolyVariantLB • Equality TypeVariant TypePolyVariantUB • Equality TypePolyVariantLB TypePolyVariantLB • Equality TypePolyVariantLB TypePolyVariantUB • Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match) • Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match) The cases are symmetrical, so we'll look at one side. ##### Equality TypeVariant TypeVariant For two TypeVariant, like with two TypeRec, we unify all the fields. If there are any that are only on one side, we fail. ##### Equality TypeVariant TypePolyVariantLB This case is similar to the Equality TypeRec TypeRecExt case, all Fields in the LB side must appear and unify in the TypeVariant side, and the fields that are unique to the TypeVariant should unify with the row type variable of LB. We don't want the LB have any extra fields that are not found in the regular variant side! ##### Equality TypeVariant TypePolyVariantUB When an upper-bounded polymorphic variant meets a regular variant, we treat the UB as a regular variant as well. This is something that can makes seemingly well typed programs to be rejected by the typechecker (in a manner that is similar to not having let-polymorphism) but we do not represent the constraint of a subset/subtype in Giml, only equality, and the two types just aren't equal. ##### Equality TypePolyVariantLB TypePolyVariantLB Two lower-bounded polymorphic variant should unify all the shared fields, and their row type variants should unify with a new LB that contains the non-shared fields (and a new row type variable). ##### Equality TypePolyVariantLB TypePolyVariantUB LB means any superset of the mentioned fields, and UB means any subset of the mentioned fields. Meaning, we don't want to have a field in the LB said that doesn't exist in the UB side. This is basically the same case as Equality TypeVariant TypePolyVariantLB. ##### Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match) When the two row type variables match, this means that the two types represent alternative pattern matches, so we make sure their fields unify and merge them: we generate a constraint that unifies the merged UB (with the same row type variable) with the row type variable. This is so we can establish the most up-to-date type of the row type variable in the substitution. Equality (TypePolyVariantUB tv <merged-fields>) (TypeVar tv) This way we add to the list of alternative variants, add keep track of the row type variable is the substitution and subsequent constraints. ##### Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match) We treat these two as unrelated variants that represent the same subest of fields, so the fields should match just like regular variants (Equality TypeVariant TypeVariant). #### Instantiation Instantiation is straightforward, just like all other cases: we instantiate the row type variables here as well. #### Substitution For substitution we need to handle a few cases: • For UB, if the row type variable maps to another type variable, we just switch the type variable. Otherwise we merge the variant fields we have with the type. • For LB with no variant fields (TypePolyVariantLB [] tv) we return the value mapped in the substitution (remember that this case is also used for wildcards and captures in pattern matching and not just variants). • For LB with variant fields, we try to merge the variant fields with the type. Merging variants happens the same as with records, but we don't have to worry about duplicate variants as the constraint solving algorithm makes sure they have the same type. Note that this is where a UB can flip and become LB (if it's merged with an LB). ### Example As always, let's finish of with an example and infer the types of the following program: withDefault def x = case x of | #Some v -> v | #Nil _ -> def end one = withDefault 0 (#Some 1) #### Elaboration and constraints generation  +---- targ3 -> t4 -> t5 | | +---- targ3 | | | | +----- t4 | | | withDefault def x = +------------ t4 | case x of +---- t7 | | #Some v -> v | #Nil _ -> def | | +---- targ3 end -- Constraints: [ Equality t4 t6 , Equality t5 t7 , Equality t5 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t4 -> t5) , Equality top0 (targ3 -> t4 -> t5) ]  +-- t16 | +--- top0_i11 | | | | +---- Int | | | | | | +---- t14 -> [> Some : t14 | (t13)] | | | | | | | | +--- Int | | | | | one = withDefault 0 (#Some 1) ------- | +--- t15 -- Constraints: [ Equality t12 (t15 -> t16) , Equality top0_i11 (Int -> t12) , Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf top0_i11 top0 ] #### Constraint solving ##### First group [ Equality t4 t6 , Equality t5 t7 , Equality t5 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t4 -> t5) , Equality top0 (targ3 -> t4 -> t5) ]  -- Constraints: [ Equality t4 t6 , Equality t5 t7 , Equality t5 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t4 -> t5) , Equality top0 (targ3 -> t4 -> t5) ] -- Substitution: [] 1. Equality t4 t6 -- => t4 := t6 -- Constraints: [ Equality t5 t7 , Equality t5 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t6 -> t5) , Equality top0 (targ3 -> t6 -> t5) ] -- Substitution: [ t4 := t6 ] 2. Equality t5 t7 -- => t5 := t7 -- Constraints: [ Equality t7 t7 , Equality t7 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t6 -> t7) , Equality top0 (targ3 -> t6 -> t7) ] -- Substitution: [ t4 := t6 , t5 := t7 ] 3. Equality t7 t7 -- => Nothing to do -- Constraints: [ Equality t7 targ3 , Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : t7 | (t6)] , Equality t7 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t6 -> t7) , Equality top0 (targ3 -> t6 -> t7) ] -- Substitution: unchanged 4. Equality t7 targ3 -- => t7 := targ3 -- Constraints: [ Equality t6 [< Nil : t9 | (t6)] , Equality t6 [< Some : targ3 | (t6)] , Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> t6 -> targ3) , Equality top0 (targ3 -> t6 -> targ3) ] -- Substitution: [ t4 := t6 , t5 := targ3 , t7 := targ3 ] 5. Equality t6 [< Nil : t9 | (t6)] -- => t6 := [< Nil : t9 | (t6)] -- note that we don't do occurs check here -- Constraints: [ Equality [< Nil : t9 | (t6)] [< Nil : t9 | (t6)] , Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)] , Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3) , Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3) ] -- Substitution: [ t4 := [< Nil : t9 | (t6)] , t5 := targ3 , t7 := targ3 , t6 := [< Nil : t9 | (t6)] ] 6. Equality [< Nil : t9 | (t6)] [< Nil : t9 | (t6)] -- => The two sides are the same, nothing to do. -- Constraints: [ Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)] , Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3) , Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3) ] -- Substitution: unchanged 7. Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)] -- => [ Equality t6 [< Nil : t9 | Some : targ3 | (t6)], Equality t9 t9 ] -- Constraints: [ Equality t6 [< Nil : t9 | Some : targ3 | (t6)] , Equality t9 t9 [ Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)] , Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3) , Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3) ] -- Substitution: unchanged 8. Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)] -- => t6 := [< Nil : t9 | Some : targ3 | (t6)] -- Constraints: [ Equality t9 t9 , Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3) , Equality top0 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3) ] -- Substitution: [ t4 := [< Nil : t9 | Some : targ3 | (t6)] , t5 := targ3 , t7 := targ3 , t6 := [< Nil : t9 | Some : targ3 | (t6)] ] 9. Equality t9 t9 -- => Nothing to do -- Constraints: [ Equality targ3 t8 , Equality t9 t10 , Equality tfun2 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3) , Equality top0 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3) ] -- Substitution: [ t4 := [< Nil : t9 | Some : targ3 | (t6)] , t5 := targ3 , t7 := targ3 , t6 := [< Nil : t9 | Some : targ3 | (t6)] ] 10. Equality targ3 t8 -- => targ3 := t8 -- Constraints: [ Equality t9 t10 , Equality tfun2 (t8 -> [< Nil : t9 | Some : t8 | (t6)] -> t8) , Equality top0 (t8 -> [< Nil : t9 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t9 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t9 | Some : t8 | (t6)] , targ3 := t8 ] 11. Equality t9 t10 -- => t9 := t10 -- Constraints: [ Equality tfun2 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) , Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 ] 11. Equality tfun2 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) -- => tfun2 := (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) -- Constraints: [ Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 ] 12. Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) -- => top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 -- Constraints: [] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 ] #### Second group [ Equality t12 (t15 -> t16) , Equality top0_i11 (Int -> t12) , Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf top0_i11 top0 ] and after applying the substitution: [ Equality t12 (t15 -> t16) , Equality top0_i11 (Int -> t12) , Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Constraints: [ Equality t12 (t15 -> t16) , Equality top0_i11 (Int -> t12) , Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 ] 1. Equality t12 (t15 -> t16) -- => t12 := t15 -> t16 -- Constraints: [ Equality top0_i11 (Int -> t15 -> t16) , Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := t15 -> t16 ] 2. Equality top0_i11 (Int -> t15 -> t16) -- => top0_i11 := Int -> t15 -> t16 -- Constraints: [ Equality top1 t16 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := t15 -> t16 , top0_i11 := Int -> t15 -> t16 ] 3. Equality top1 t16 -- => top1 := t16 -- Constraints: [ Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := t15 -> t16 , top0_i11 := Int -> t15 -> t16 , top1 := t16 ] 4. Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) -- => [ Equality t14 Int, Equality [> Some : t14 | (t13)] t15 ] -- Constraints: [ Equality t14 Int , Equality [> Some : t14 | (t13)] t15 , Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15) , InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: unchanged 5. Equality t14 Int -- => t14 := Int -- Constraints: [ Equality [> Some : Int | (t13)] t15 , Equality (Int -> [> Some : Int | (t13)]) (Int -> t15) , InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := t15 -> t16 , top0_i11 := Int -> t15 -> t16 , top1 := t16 , t14 := Int ] 6. Equality [> Some : Int | (t13)] t15 -- => t15 := [> Some : Int | (t13)] -- Constraints: [ Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)]) , InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := [> Some : Int | (t13)] -> t16 , top0_i11 := Int -> [> Some : Int | (t13)] -> t16 , top1 := t16 , t14 := Int , t15 := [> Some : Int | (t13)] ] 7. Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)]) -- => [ Equality Int Int, Equality [> Some : Int | (t13)] [> Some : Int | (t13)] ] -- Constraints: [ Equality Int Int , Equality [> Some : Int | (t13)] [> Some : Int | (t13)] , InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) ] -- Substitution: unchanged 8. The two sides are equals so we skip them 9. The two sides are equals so we skip them 10. InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8) -- => [ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17) ] -- (instantiation) -- Constraints: [ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17) ] -- Substitution: unchanged 11. Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17) -- => [ Equality Int t17, Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)], Equality t16 t17 ] -- Constraints: [ Equality Int t17 , Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)] , Equality t16 t17 ] -- Substitution: unchanged 11. Equality Int t17 -- => t17 := Int -- Constraints: [ Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)] , Equality t16 Int ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := [> Some : Int | (t13)] -> t16 , top0_i11 := Int -> [> Some : Int | (t13)] -> t16 , top1 := t16 , t14 := Int , t15 := [> Some : Int | (t13)] , t17 := Int ] 12. Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)] -- => Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ] -- LB ~ UB are solved like LB and normal variant -- Constraints: [ Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ] , Equality t16 Int ] -- Substitution: unchanged 13. Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ] -- => [ Equality Int Int, Equality [ Nil : t18 ] t13 ] -- Constraints: [ Equality Int Int , Equality [ Nil : t18 ] t13 , Equality t16 Int ] -- Substitution: unchanged 14. Two sides are equal, skip 15. Equality [ Nil : t18 ] t13 -- => t13 := [ Nil : t18 ] -- Constraints: [ Equality t16 Int ] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := [ Nil : t18 | Some : Int ] -> t16 -- * substitution of LB with normal variant , top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> t16 -- * here too , top1 := t16 , t14 := Int , t15 := [ Nil : t18 | Some : Int ] -- * and here , t17 := Int ] 16. Equality t16 Int -- => t16 := Int -- Constraints: [] -- Substitution: [ t4 := [< Nil : t10 | Some : t8 | (t6)] , t5 := t8 , t7 := t8 , t6 := [< Nil : t10 | Some : t8 | (t6)] , targ3 := t8 , t9 := t10 , tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8 , t12 := [ Nil : t18 | Some : Int ] -> Int -- * substitution of LB with normal variant , top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> Int -- * here too , top1 := Int , t14 := Int , t15 := [ Nil : t18 | Some : Int ] -- * and here , t17 := Int , t16 := Int ] ... and done. ### Result Let's substitute the type variables in our program and hide the row type variables:  +---- t8 -> [< Nil : t10 | Some : t8 ] -> t8 | | +---- t8 | | | | +----- [< Nil : t10 | Some : t8 ] | | | withDefault def x = +------------ [< Nil : t10 | Some : t8 ] | case x of +---- t8 | | #Some v -> v | #Nil _ -> def | | +---- t8 end  +-- Int | +--- Int -> [ Nil : t18 | Some : Int ] -> Int | | | | +---- Int | | | | | | +---- Int -> [ Nil : t18 | Some : Int ] | | | | | | | | +--- Int | | | | | one = withDefault 0 (#Some 1) ------- | +--- [ Nil : t18 | Some : Int ] Phew, looks correct! ### Alternative approach: simulation in userland Another cool thing I'd like to note is that polymorphic variants could potentially be simulated in userland if you have extensible records in the language (and this is what purescript-variant) does for example). One approach I tweeted about that also helped me reach the polymorphic variant implementation for Giml is to: 1. Make variants functions - functions that take a record of labels to function that take the variant's payload 2. Make patterns are these records from labels to functions on the payload 3. Make pattern matching just applying variants with patterns Though in this approach we can't add a "default handler" label. ## Summary Polymorphic variants are a bit complicated and implementing them is a bit tricky, and they also have several limitations. But I believe that for some use cases they can really make a difference and it's worth having them in the language. If you're interesting to see a video of me live coding polymorphic variants in Giml, check out Part 17 of my live streaming Giml development series. ## April 12, 2021 ### Mark Jason Dominus # Soup-guzzling pie-munchers The ten storytellers in The Decameron aren't all well-drawn or easy to tell apart. In the introduction of my favorite edition, the editor, Cormac Ó Cuilleanáin, says: Early in the book we are given hints that we are going to get to know these ten frame characters…. Among the Decameron storytellers, for instance, Pampinea emerges as being bossy, while Dioneo has a filthy mind. But little further character development takes place. I agree, mostly. I can see Dioneo more clearly than Ó Cuilleanáin suggests. Dioneo reminds me of Roberto Benigni's Roman filthy-minded Roman taxi driver in Night on Earth. I also get a picture of Bocaccio's character Filostrato, who is a whiny emo poet boy who complains that he woman he was simping for got tired of him and dumped him for someone else: To be humble and obedient to her and to follow all her whims as closely as I could, was all of no avail to me, and I was soon abandoned for another. Thus I go from bad to worse, and believe I shall until I die.… The person who gave me the nickname of Filostrato [ “victim of love” ] knew what she was doing. When it's Filostrato's turn to choose the theme for the day's stories, he makes the others tell stories of ill-starred love with unhappy endings. They comply, but are relieved when it is over. (Dioneo, who is excused from the required themes, tells instead a farcical story of a woman who hides her secret lover in a chest after he unwittingly drinks powerful sedative.) Ah, but Emilia. None of the characters in the Decameron is impressed with the manners or morals of priests. But Emilia positively despises them. Her story on the third day is a good example. The protagonist, Tedaldo, is meeting his long-lost mistress Ermellina; she broke off the affair with him seven years ago on the advice of a friar who advised that she ought to remain faithful to her husband. Tedaldo is disguised as a friar himself, and argues that she should resume the affair. He begins by observing that modern friars can not always be trusted: Time was when the friars were most holy and worthy men, but those who today take the name and claim the reputation of friars have nothing of the friar but the costume. No, not even that,… Modern friars, narrates Emilia, "strut about like peacocks" showing off their fine clothes. She goes on from there, complaining about friars' vanity, and greed, and lust, and hypocrisy, getting more and more worked up until you can imagine her frothing at the mouth. This goes on for about fifteen hundred words before she gets back to Tedaldo and Ermellina, just at the same time that I get around to what I actually meant to write about in this article: Emilia has Tedaldo belittle the specific friar who was the original cause of his troubles, who must without a doubt have been some soup-guzzling pie-muncher… This was so delightful that I had to write a whole blog post just to show it to you. I look forward to calling other people soup-guzzling pie-munchers in the coming months. But, as with the earlier article about the two-bit huckster I had to look up the original Italian to see what it really said. And, as with the huckster, the answer was, this was pretty much what Bocaccio had originally written, which was: il qual per certo doveva esser alcun brodaiuolo manicator di torte • Brodaiuolo is akin to “broth”, and it has that disparaging diminutive “-uolo” suffix that we saw before in mercantuolo. • A manicator is a gobbler; it's akin to “munch”, “manger”, and “mandible”, to modern Italian mangia and related French manger. A manicator di torte is literally a gobbler of pies. Delightful! I love Bocaccio. While I was researching this article I ran into some other English translations of the phrase. The translation at Brown University's Decameron Web is by J.M. Rigg: some broth-guzzling, pastry-gorging knave without a doubt which I award full marks. The translation of John Payne has must for certain have been some broth-swilling, pastry-gorger and two revised versions of Payne, by Singleton and Ó Cuilleanáin, translate it similarly. But the translation of Richard Aldington only says: who must certainly have been some fat-witted glutton. which I find disappointing. I often wonder why translators opt to water down their translations like this. Why discard the vivid and specific soup and pie in favor of the abstract "fat-witted glutton"? What could possibly be the justification? Translators have a tough job. A mediocre translator will capture only the surface meaning and miss the subtle allusions, the wordplay, the connotations. But here, Aldington hasn't even captured the surface meaning! How hard is it to see torte and include pie in your translation somewhere? I can't believe that his omitting it was pure carelessness, only that Aldington thought that he was somehow improving on the original. But how, I can't imagine. Well, I can imagine a little. Translations can also be too literal. Let's consider the offensive Spanish epithet pendejo. Literally, this is a pubic hair. But to translate it in English as "pubic hair" would be a mistake, since English doesn't use that term in the same way. A better English translation is "asshole". This is anatomically illogical, but linguistically correct, because the metaphor in both languages has worn thin. When an anglophone hears someone called an “asshole” they don't normally imagine a literal anus, and I think similarly Spanish-speakers don't picture a literal pubic hair for pendejo. Brodaiuolo could be similar. Would a 14th-century Florentine, hearing brodaiuolo, picture a generic glutton, or would they imagine someone literally holding a soup bowl up to their face? We probably don't know. But I'm inclined to think that “soup-guzzler” is not too rich, because by this point in Emilia's rant we can almost see the little flecks of spittle flying out of here mouth. I'm offended by Aldington's omission of pie-munching. [ Addendum 20210414: More translations of brodaiuolo. ] ### Philip Wadler # Vote! The UK holds elections on 6 May. From the gov.uk site: Register by 11:59pm on 19 April to vote in the following elections on 6 May: • local government elections and referendums in England • Police and Crime Commissioner elections in England and Wales • Scottish Parliament elections • Senedd (Welsh Parliament) elections • Mayor of London and London Assembly elections Register Online. It usually takes about 5 minutes. Start Now. Registration is easy: a rare example of a well-designed web site. You can also support your party with a donation. Mine is the Edinburgh Green Party; feel free to add yours via the comments. Current polling shows Green on track to win 10 seats in the regional lists, and Alba on track to get no seats. ## April 10, 2021 ### Gil Mizrahi # Typing extensible records in Giml In the last blog post we covered the general structure and algorithms of Giml's type inference engine. In this blog post we'll take a closer look at extensible records in Giml and how to infer their types. ## Records A record is a collection of values, each associated with a name (also called a label). For example, { name = "Giml", age = 0 }. The type of records is very similar to their structure for example the type of the record above is: { age : Int, name : String }. Records are used to build compound data aggregating multiple values into one value that contains all of the information. The different values (also called fields) inside the record can be accessed (or selected) by their labels, like this: <record>.<label>. Records can also be extended with additional fields: let lang = { name = "Giml", age = 0 } in { website = "https://giml-lang.org" | lang } The result of the expression is a new record that has all of the fields lang has and has one extra field website as well. Note that in Giml, we can't have multiple values with the same label. When we add a new field with an existing label, that previous one is replaced with the new value. There are other reasonable behaviours here such as disallowing such operation or having a scope of labels, but this is the behaviour I chose for Giml :) Records are first class and can be used anywhere an Int can be expected. They can also be pattern matched: case { name = "Giml" } of | { name = "Giml" } -> "Yeah I heard about it, funny name." | { name = "Haskell" } -> "Haskell is pretty cool!" | other -> concat (concat "I'm sure " other.name) " is pretty good too!" end So far we've seen record literals and three operations on records: field selection, extension and pattern matching. When we approach to type inference of features we need to take into consideration a few things: 1. What the types of the feature look like 2. What to do for each operation (elaboration and constraints generation) 3. How to unify the types (constraint solving) 4. How to instantiate 5. How to substitute We are going to first try a naive approach: Just represent the types of record as TypeRec [(Label, Type)] and see how it's just not enough to make this feature useful without type annotations. Let see how we elaborate each operation of Record literals, field selection, record extension, pattern matching: Record literals: simple, we have an expression that looks like this: { label1 = expr1, label2 = expr2, ... }, so we elaborate each expression and annotate the node with the type TypeRec [(label1, t1), ...]. Record selection: Also simple, we have something that looks like expr.label, so we elaborate the type of the expression and constrain it as equals to TypeRec [(label, t)]. But hold up, if we try to work with this scheme we'll see that it is not good enough. Trying to typecheck this expression { a = 1, b = 1 }.a will create this constraint: Equality { a : Int, b : Int } { a : t }, and while they both have a field a, one of them is missing a field! So they are definitely not equal. So we need to go at this differently and there are several approaches, one is to create a new type of constraint that describe the relationship of "subtyping", another is to describe the types differently. For Giml I chose the second approach, we add extra information to the types to encode that "this record type may have more fields". This approach is called row polymorphism. ### Row Polymorphism Row polymorphism provides us with the ability to represent extra information in an additional type variable (called a row type variable) in the type. So in addition to the variant we created TypeRec [(Label, Type)], we add another one: TypeRecExt [(Label, Type)] TypeVar. Syntactically the type will look like this: { <field1> : <type1>, ... | <extension-type-variable> } What this extra type variable does is represent additional fields that may be part of the type that we don't know yet. Let's use this for record field selection, instead of saying "this type is equals to a record with one field of type t", we can now say "this type is equals to a record that has this field with this type, and other fields represented by this type variable". With this scheme we can move forward. Let's try again: ### Elaboration and constraint generation #### Record literals same as before #### Record selection We have something that looks like expr.label, so we: 1. Generate a type variable representing the return type (which is the field type), let's call this t 2. Elaborate the type of the expression 3. Generate a type variable representing the rest of the fields, let's call this ext 4. Constrain the type of the expression as equals to { label : t | ext }. Now when we try to typecheck the expression { a = 1, b = 1 }.a, which generates this constraint: Equality { a : Int, b : Int } { a : t | ext }, we can match the fields and types that we know (a : Int with a : t), and also match the fields that we don't ({ b : Int } with ext) - more on that later. And when we find out what the real type of ext is, we'll substitute it (we'll talk about that later as well). #### Record extension We have something that looks like { <field1> = <expr1>, ... | <expression> }, so we: 1. Elaborate the type of each field 2. Elaborate the type of the expression 3. Generate a type variable for the expression which we will call ext 4. Constrain the type of the expression to be equal to { | ext } 5. Annotate the node with the type { <field1> : <type1>, ... | ext } #### Pattern matching Pattern matching is similar to record literals case, but instead of matching the expression in the case with a rigid TypeRec, we generate a type variable and match with TypeRecExt. That way we can select less fields in the pattern than might be available in the expression type. ### Constraint solving We need to handle 3 new cases - equality between: 1. Two TypeRec 2. A TypeRec and a TypeRecExt 3. Two TypeRecExt #### Two TypeRec The two records should have the same labels and for each label we generate a new constraint of Equality between the types for the label on each side. So for example { a : Int, b : String } and { a : t1, b : t2 }, we generate the two constraint: Equality Int t1 and Equality String t2. If one TypeRec has a field that the other do not have, we throw an error of a missing field. #### A TypeRec and a TypeRecExt Each field in the TypeRecExt should match with the matching field in TypeRec. If there's a field in TypeRecExt that does not exist in TypeRec we throw an error. The other side is different, all the missing field in TypeRecExt that exist in TypeRec will be matched with the row type variable of TypeRecExt. Remember - we said that with row polymorphism we use a type variable as a representative of fields we don't know of yet! So we treat the type variable in TypeRecExt as if it is a TypeVar that matches the fields that exist in the TypeRec but not in the TypeRecExt. So for example in Equality { a : Int, b : String } { a : t1 | ext } we generate 2 new constraints: Equality Int t1 and Equality { b : String } ext. #### Two TypeRecExt This scenario is slightly trickier, of course - the specified fields in both sides should be matched just like in previous cases, but what do we do with the missing fields from each side? Let's check a concrete example, we'll make it simpler by not including matching fields. Equality { a : Int | t1 } { b : Int | t2 } What this constraint mean, is that t2 is equals to { a : Int | t1 } without { b : Int }, and t1 is equals to { b : Int | t2 } without { a : Int }. Since we don't have any notion of subtracting a type from a type (some other type system do support this), we can try and represent this differently, we could represent this subset as a new type variable, and translate the Equality above to these two constraints: Equality { a : Int | t' } t2 Equality { b : Int | t' } t1 Now t' represents t2 without { a : Int } and t' also represents t1 without { b : Int }. So more generally, when we try to unify two TypeRecExt, we match the matching fields and also create a new row type variable to represents all of the unspecified fields in both types, and match each row type variable with the fields specified on the other side. Let's describe this one more time in psuedo code, adding new constraints: Equality { <fields-only-found-on-the-left-side> | new-type-var } extRight Equality { <fields-only-found-on-the-right-side> | new-type-var } extLeft <matched-fields> ### Instantiation Instantiation occurs as usual, the row type variable in TypeRecExt should be instantiated as well. ### Substitution What do we do if the row type variable in TypeRecExt appears in the substitution? One, if the row type variable is mapped to a different type variable, we just replace it. Two, if it's mapped to a TypeRec, we merge it with the existing fields and return a TypeRec with all fields, but it's important to note that some fields in the TypeRec from the substitution may be the same as ones from our TypeRecExt. There are a few approaches one could take here, one is to keep a scope of labels, another is to try and unify the types of the field. In Giml we take the type of the left-most instance, discarding the right one. So semantically this expression is legal: { a = "Hi!" | { a = 1} } and its type is { a : String }. Three, if the row type variable is mapped to a TypeRecExt, we do the same thing as in Two, but return a TypeRecExt instad of a TypeRec with the row type variable from the mapped value as our new row type variable. This is basically what we need to do to infer the type of extensible records. ## Example Let's see how to infer the type of a simple record field access example. This process is a bit much, but I've included it for those who want a live demonstration of how this works. Feel free to skip it! giml = { name = "Giml", age = 0 } getName record = record.name gimlName = getName giml #### Elaboration Through elaboration, we end up with the following AST and constraints: -- Ast: +-- { age : Int, name : String } | | | | | __________|_______________ giml = { name = "Giml", age = 0 } -- Constraints: [ Equality top0 {age : Int, name : String} ] -- Ast: +--- targ4 -> t5 | | +--- targ4 | | | __|___ getName record = record.name ----------- | +------- t5 -- Constraints: [ Equality targ4 {name : t5 | t6} , Equality tfun3 (targ4 -> t5) , Equality top1 (targ4 -> t5) ] -- Ast: +--- t9 | | gimlName = getName giml | | | | | +---- top0_i8 | +---- top1_i7 -- Constraints: [ Equality top1_i7 (top0_i8 -> t9) , Equality top2 t9 , InstanceOf top0_i8 top0 , InstanceOf top1_i7 top1 ] The first stage of inference is to group dependencies and order them topologically. Since we don't have any mutual dependencies, each definition is standalone and the order of dependencies is kept (the last definition uses the previous two). Also note that the names of type variables do not matter to the algorithm. They are a bit different so I have better time knowing where they are introduced. #### Constraint solving Now we go over group by group and solve the constraints: ##### First group -- Constraints: [Equality top0 {age : Int, name : String}] -- Substitution: [] 1. Equality top0 {age : Int, name : String} -- => add top0 := {age : Int, name : String} to the substitution -- Constraints: [] -- Substitution: [top0 := {age : Int, name : String}] We carry the substitution to the next group. ##### Second group [ Equality targ4 {name : t5 | t6} , Equality tfun3 (targ4 -> t5) , Equality top1 (targ4 -> t5) ] We need to apply the substitution to the constraints we are about to handle, but this doesn't change anything in this instance. -- Constraints: [Equality targ4 {name : t5 | t6}, Equality tfun3 (targ4 -> t5), Equality top1 (targ4 -> t5)] -- Substitution: [top0 := {age : Int, name : String}] 1. Equality targ4 {name : t5 | t6} -- => add targ4 := {name : t5 | t6} to the substitution and apply -- Constraints: [Equality tfun3 ({name : t5 | t6} -> t5), Equality top1 ({name : t5 | t6} -> t5)] -- Substitution: [top0 := {age : Int, name : String}, targ4 := {name : t5 | t6}] 2. Equality tfun3 ({name : t5 | t6} -> t5) -- => add tfun3 := ({name : t5 | t6} -> t5) to the substitution and apply -- Constraints: [Equality top1 ({name : t5 | t6} -> t5)] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) ] 3. Equality top1 ({name : t5 | t6} -> t5) -- => add top1 := ({name : t5 | t6} -> t5) and apply -- Constraints: [] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) ] ##### Third group [ Equality top1_i7 (top0_i8 -> t9) , Equality top2 t9 , InstanceOf top0_i8 top0 , InstanceOf top1_i7 top1 ] And after applying the substitution: [ Equality top1_i7 (top0_i8 -> t9) , Equality top2 t9 , InstanceOf top0_i8 {age : Int, name : String} , InstanceOf top1_i7 ({name : t5 | t6} -> t5) ] -- Constraints: [ Equality top1_i7 (top0_i8 -> t9) , Equality top2 t9 , InstanceOf top0_i8 {age : Int, name : String} , InstanceOf top1_i7 ({name : t5 | t6} -> t5) ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) ] 1. Equality top1_i7 (top0_i8 -> t9) -- => add top1_i7 := (top0_i8 -> t9) -- Constraints: [ Equality top2 t9 , InstanceOf top0_i8 {age : Int, name : String} , InstanceOf (top0_i8 -> t9) ({name : t5 | t6} -> t5) ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := (top0_i8 -> t9) ] 2. Equality top2 t9 -- add top2 := t9 and apply -- Constraints: [ InstanceOf top0_i8 {age : Int, name : String} , InstanceOf (top0_i8 -> t9) ({name : t5 | t6} -> t5) ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := (top0_i8 -> t9) , top2 := t9 ] 3. InstanceOf top0_i8 {age : Int, name : String} -- => for InstanceOf constraint we instantiate the second type -- and add a new Equality constraint -- in this case instantiation does nothing because -- there are no type variables on the second type -- so we add Equality top0_i8 {age : Int, name : String} -- which immediately turns into -- so we add top0_i8 := {age : Int, name : String} -- Constraints: [ InstanceOf ({age : Int, name : String} -> t9) ({name : t5 | t6} -> t5) ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := ({age : Int, name : String} -> t9) , top2 := t9 , top0_i8 := {age : Int, name : String} ] 4. InstanceOf ({age : Int, name : String} -> t9) ({name : t5 | t6} -> t5) -- => new constraint: -- Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10) -- Constraints: [ Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10) ] -- Substitution: unchanged 5. Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10) -- => add constraints Equality {age : Int, name : String} {name : ti10 | ti11} -- and Equality t9 ti10 -- Constraints: [ Equality {age : Int, name : String} {name : ti10 | ti11} , Equality t9 ti10 ] -- Substitution: unchanged 6. Equality {age : Int, name : String} {name : ti10 | ti11} -- => This is a TypeRec & TypeRecExt scenario. we match the shared fields -- and match the row type variable with the missing fields from the typeRec -- adding Equality String ti10 and Equality ti11 { age : Int } -- Constraints: [ Equality String ti10 , Equality ti11 { age : Int } , Equality t9 ti10 ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := ({age : Int, name : String} -> t9) , top2 := t9 , top0_i8 := {age : Int, name : String} ] 7. Equality String ti10 -- => ti10 := String -- Constraints: [ Equality ti11 { age : Int } , Equality t9 String ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := ({age : Int, name : String} -> t9) , top2 := t9 , top0_i8 := {age : Int, name : String} , ti10 := String ] 7. Equality ti11 { age : Int } -- => ti11 := { age : Int } -- Constraints: [ Equality t9 String ] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := ({age : Int, name : String} -> t9) , top2 := t9 , top0_i8 := {age : Int, name : String} , ti10 := String , ti11 := { age : Int } ] 8. Equality t9 String -- => t9 := String -- Constraints: [] -- Substitution: [ top0 := {age : Int, name : String} , targ4 := {name : t5 | t6} , tfun3 := ({name : t5 | t6} -> t5) , top1 := ({name : t5 | t6} -> t5) , top1_i7 := ({age : Int, name : String} -> String) , top2 := String , top0_i8 := {age : Int, name : String} , ti10 := String , ti11 := { age : Int } , t9 := String ] Phew, we made it! Let's apply the substitution back to our AST and see the result! #### Result  +-- { age : Int, name : String } | | | | | __________|_______________ giml = { name = "Giml", age = 0 }   +--- {name : t5 | t6} -> t5 | | +--- {name : t5 | t6} | | | __|___ getName record = record.name ----------- | +------- t5  +--- String | | gimlName = getName giml | | | | | +---- {age : Int, name : String} | +---- {age : Int, name : String} -> String This look correct! (and yey me who did the constraint solving by hand) ### Summary I found implementing extensible records using row polymorphism to be relatively straightforward. I hope this article makes it more approachable as well. In the next blog post we'll discuss polymorphic variants and how to infer their types. ## April 09, 2021 ### Ken T Takusagawa # [hupfginj] Nim Nim, the game of heaps of coins, is a solved game. Here is how to compute the nim sum (Grundy value) of a nim position in Haskell. nimsum :: [Integer] -> Integer; nimsum = Data.List.foldl' Data.Bits.xor 0; (It's might be surprising that Integer is an instance of Bits, but we will just run with it.) If the nim sum of a position is zero, then the position is lost: any move loses (and makes the nim sum nonzero). If the nim sum is nonzero, the position can be won: choose a move that reduces the nim sum to zero. (What is the computational complexity of finding such a move? Typically, how many winning moves are there? Below, we will give some positions with more than one winning move.) Important details omitted: if the nim sum is nonzero, there always exists a move that makes it zero. If the nim sum is zero, there are no moves that keep it zero. Call a nim position obviously lost if the following mirroring strategy is applicable. The heaps can be organized into pairs of equal-sized heaps. Whatever the first player does to one heap of a pair, the second player can do to the corresponding other heap of the pair and thereby eventually win. All 2-heap nim positions that are lost are obviously lost. In other words, the lost 2-heap nim positions are exactly those which have 2 equal sized heaps. Call a position non-obviously lost if it is lost but not obviously lost. The simplest non-obviously lost position is [1,2,3]. The positions [1,3,3] [2,2,3] [2,3,3] are the three simplest winning positions that have more than one winning move. All can be reduced to [1,2,3] or to [a,a]. Below are the non-obviously lost 3-heap positions whose smallest heap has size 1. All entries are [1, 2n, 2n+1]. [1,2,3],[1,4,5],[1,6,7],[1,8,9],[1,10,11],[1,12,13],[1,14,15],[1,16,17],[1,18,19],... Below are the non-obviously lost 3-heap positions whose smallest heap has size 2. All entries are of the form [2, 4n, 4n+2] or [2, 4n+1, 4n+3]. [2,4,6],[2,5,7],[2,8,10],[2,9,11],[2,12,14],[2,13,15],[2,16,18],[2,17,19],... Below are the non-obviously lost 3-heap positions whose smallest heap has size 3. All entries are of the form [3, 4n, 4n+3] or [3, 4n+1, 4n+2]. [3,4,7],[3,5,6],[3,8,11],[3,9,10],[3,12,15],[3,13,14],[3,16,19],[3,17,18],... With more coins and more heaps, there continue to be patterns, but they are more complicated, so we will not attempt to describe them. Below are the non-obviously lost 3-heap positions whose smallest heap has size 4. Pattern involves 8n+k. [4,8,12],[4,9,13],[4,10,14],[4,11,15],[4,16,20],[4,17,21],[4,18,22],[4,19,23],[4,24,28],[4,25,29],[4,26,30],[4,27,31],[4,32,36],[4,33,37],[4,34,38],[4,35,39],[4,40,44],[4,41,45],[4,42,46],[4,43,47]... Below are the non-obviously lost 4-heap positions whose largest heap has size at most 15. [2,3,4,5],[1,3,4,6],[1,2,5,6],[1,2,4,7],[1,3,5,7],[2,3,6,7],[4,5,6,7],[2,3,8,9],[4,5,8,9],[6,7,8,9],[1,3,8,10],[4,6,8,10],[5,7,8,10],[1,2,9,10],[5,6,9,10],[4,7,9,10],[1,2,8,11],[5,6,8,11],[4,7,8,11],[1,3,9,11],[4,6,9,11],[5,7,9,11],[2,3,10,11],[4,5,10,11],[6,7,10,11],[8,9,10,11],[1,5,8,12],[2,6,8,12],[3,7,8,12],[1,4,9,12],[3,6,9,12],[2,7,9,12],[2,4,10,12],[3,5,10,12],[1,7,10,12],[3,4,11,12],[2,5,11,12],[1,6,11,12],[1,4,8,13],[3,6,8,13],[2,7,8,13],[1,5,9,13],[2,6,9,13],[3,7,9,13],[3,4,10,13],[2,5,10,13],[1,6,10,13],[2,4,11,13],[3,5,11,13],[1,7,11,13],[2,3,12,13],[4,5,12,13],[6,7,12,13],[8,9,12,13],[10,11,12,13],[2,4,8,14],[3,5,8,14],[1,7,8,14],[3,4,9,14],[2,5,9,14],[1,6,9,14],[1,5,10,14],[2,6,10,14],[3,7,10,14],[1,4,11,14],[3,6,11,14],[2,7,11,14],[1,3,12,14],[4,6,12,14],[5,7,12,14],[8,10,12,14],[9,11,12,14],[1,2,13,14],[5,6,13,14],[4,7,13,14],[9,10,13,14],[8,11,13,14],[3,4,8,15],[2,5,8,15],[1,6,8,15],[2,4,9,15],[3,5,9,15],[1,7,9,15],[1,4,10,15],[3,6,10,15],[2,7,10,15],[1,5,11,15],[2,6,11,15],[3,7,11,15],[1,2,12,15],[5,6,12,15],[4,7,12,15],[9,10,12,15],[8,11,12,15],[1,3,13,15],[4,6,13,15],[5,7,13,15],[8,10,13,15],[9,11,13,15],[2,3,14,15],[4,5,14,15],[6,7,14,15],[8,9,14,15],[10,11,14,15],[12,13,14,15] For casual play, let the initial position be a non-obviously lost position. No move is objectively better than another from a lost position, so this will tend to produce varied games from as early as the first move. (The first player explores different ways to try to swindle a win.) [3,4,7] mentioned above is a nice small 3-heap initial position for casual play. Some nice 4-heap (lost) initial positions that are easy to remember are [2,3,4,5] (14 coins total), [4,5,6,7] (22 coins), and [6,7,8,9] (30 coins). We propose non-obviously lost initial positions with more heaps that satisfy the following aesthetic constraints: distinct heap sizes, minimize total number of coins, minimize largest heap, maximize smallest heap. We avoid heaps of equal size in the initial position to make the mirroring strategy somewhat less likely to be usable toward the beginning of the game. However, equal-sized heaps can easily appear after the first move. The unique 3-heap satisfying these constraints is [1,2,3] (6 coins). 4-heap: [2,3,4,5] (14 coins). 5-heap: [3,4,6,8,9] (30 coins). 6-heap: [1,2,4,6,8,9] (30 coins). The perfect triangle 7-heap [1,2,3,4,5,6,7] satisfies the constraints. It has size 28, smaller than the 5 or 6 heaps above. 8-heap: [2,3,4,5,6,7,8,9] (44 coins). 9-heap: [2,3,4,5,7,8,9,10,12] or [2,3,4,5,6,8,9,11,12] (60 coins). 10-heap: [1,2,3,4,5,6,8,9,10,12] (60 coins). Best 11-heap is another perfect triangle: [1,2,3,4,5,6,7,8,9,10,11] (66 coins). 12-heap: [2,3,4,5,6,7,8,9,10,11,12,13] (90 coins). 13-heap: [1,2,3,4,5,6,7,9,10,12,14,16,17], [1,2,3,4,5,6,7,8,11,12,14,16,17], [1,2,3,4,5,6,7,8,10,13,14,16,17], or [1,2,3,4,5,6,7,8,10,12,15,16,17] (106 coins) The values of n for which a perfect triangle [1,2,3,...,n] is a lost position: 0 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91 95 99 ... The values are 4k-1. There are no non-obviously lost positions with all heaps having the same number of coins. An even number of identical heaps is obviously lost. An odd number of identical heaps is won: just take an entire heap to leave your opponent with an even number of identical heaps, a position which is lost. The following trapezoids (heap sizes of consecutive integers, at least 2 and at most 20 coins in a heap) are non-obviously lost. It appears the number of heaps must be a multiple of 4, and the smallest heap must have an even number of coins. [2,3,4,5],[2,3,4,5,6,7,8,9],[2,3,4,5,6,7,8,9,10,11,12,13],[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17], [4,5,6,7],[4,5,6,7,8,9,10,11],[4,5,6,7,8,9,10,11,12,13,14,15],[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19], [6,7,8,9],[6,7,8,9,10,11,12,13],[6,7,8,9,10,11,12,13,14,15,16,17], [8,9,10,11],[8,9,10,11,12,13,14,15],[8,9,10,11,12,13,14,15,16,17,18,19], [10,11,12,13],[10,11,12,13,14,15,16,17], [12,13,14,15],[12,13,14,15,16,17,18,19], [14,15,16,17], [16,17,18,19] Future work: organize positions by their distance from being obviously lost. (Previously, we analyzed Chomp in this way.) What other classes of positions can be quickly recognized as lost? (We noted [1, 2n, 2n+1] above.) Future work: repeat this analysis for the "subtraction game" in which the number of coins that can be removed in one move is limited (up to k). Surprisingly, optimal play can be calculated quickly: compute nim sum mod (k+1). ## April 08, 2021 ### Gabriel Gonzalez # How to replace Proxy with AllowAmbiguousTypes no-proxy This post is essentially a longer version of this Reddit post by Ashley Yakeley that provides more explanation and historical context. Sometimes in Haskell you need to write a function that “dispatches” only on a type and not on a value of that type. Using the example from the above post, we might want to write a function that, given an input type, prints the name of that type. #### Approach 1 - undefined One naive approach would be to do this: class TypeName a where typeName :: a -> Stringinstance TypeName Bool where typeName _ = "Bool"instance TypeName Int where typeName _ = "Int" … which we could use like this: >>> typeName False"Bool">>> typeName (0 :: Int)"Int" However, this approach does not work well because we must provide the typeName method with a concrete value of type a. Not only is this superfluous (we don’t care which value we supply) but in some cases we might not even be able to supply such a value. For example, consider this instance: import Data.Void (Void)instance TypeName Void where typeName _ = "Void" There is a perfectly valid name associated with this type, but we cannot retrieve the name without cheating because we cannot produce a (total) value of type Void. Instead, we have to use something like undefined: >>> typeName (undefined :: Void)"Void" The base package uses this undefined-based approach. For example, Foreign.Storable provides a sizeOf function that works just like this: class Storable a where -- | Computes the storage requirements (in bytes) of the argument. The value -- of the argument is not used. sizeOf :: a -> Int … … and to this day the idiomatic way to use sizeOf is to provide a fake value using undefined: >>> sizeOf (undefined :: Bool)4 This works because sizeOf never evaluates its argument. It’s technically safe, albeit not very appealing to depend on undefined. #### Approach 2A - Proxy The next evolution of this approach was to use the Proxy type (now part of base in the Data.Proxy module). As the documentation notes: Historically, Proxy :: Proxy a is a safer alternative to the undefined :: a idiom. I’m not exactly sure what the name Proxy was originally meant to convey, but I believe the intention was that a term (the Proxy constructor) stands in as a “proxy” for a type (specifically, the type argument to the Proxy type constructor). We can amend our original example to use the Proxy type like this: import Data.Proxy (Proxy(..))import Data.Void (Void)class TypeName a where typeName :: Proxy a -> Stringinstance TypeName Bool where typeName _ = "Bool"instance TypeName Int where typeName _ = "Int"instance TypeName Void where typeName _ = "Void" … and now we can safely get the name of a type without providing a specific value of that type. Instead we always provide a Proxy constructor and give it a type annotation which “stores” the type we wish to use: >>> typeName (Proxy :: Proxy Bool)"Bool">>> typeName (Proxy :: Proxy Int)"Int">>> typeName (Proxy :: Proxy Void)"Void" We can simplify that a little bit by enabling the TypeApplications language extension, which permits us to write this: >>> :set -XTypeApplications>>> typeName (Proxy @Bool)"Bool">>> typeName (Proxy @Int)"Int">>> typeName (Proxy @Void)"Void" … or this if we prefer: >>> typeName @Bool Proxy"Bool">>> typeName @Int Proxy"Int">>> typeName @Void Proxy"Void" #### Approach 2B - proxy A minor variation on the previous approach is to use proxy (with a lowercase “p”) in the typeclass definition: import Data.Void (Void)class TypeName a where typeName :: proxy a -> String -- ↑instance TypeName Bool where typeName _ = "Bool"instance TypeName Int where typeName _ = "Int"instance TypeName Void where typeName _ = "Void" Everything else works the same, but now neither the author nor the consumer of the typeclass needs to depend on the Data.Proxy module specifically. For example, the consumer could use any other type constructor just fine: >>> typeName ([] :: [Int]) -- Technically legal, but weird"Int" … or (more likely) the consumer could define their own Proxy type to use instead of the one from Data.Proxy, which would also work fine: >>> data Proxy a = Proxy>>> typeName (Proxy :: Proxy Int)"Int" This trick helped back when Proxy was not a part of the base package. Even now that Proxy is in base you still see this trick when people author typeclass instances because it’s easier and there’s no downside. Both of these Proxy-based solutions are definitely better than using undefined, but they are both still a bit unsatisfying because we have to supply a Proxy argument to specify the desired type. The ideal user experience should only require the type and the type alone as an input to our function. #### Approach 3 - AllowAmbiguousTypes + TypeApplications We previously noted that we could shorten the Proxy-based solution by using TypeApplications: >>> typeName @Bool Proxy"Bool" Well, what if we could shorten things even further and just drop the Proxy, like this: >>> typeName @Bool Actually, we can! This brings us to a more recent approach (the one summarized in the linked Reddit post), which is to use AllowAmbiguousTypes + TypeApplications, like this: {-# LANGUAGE AllowAmbiguousTypes #-}import Data.Void (Void)class TypeName a where typeName :: Stringinstance TypeName Bool where typeName = "Bool"instance TypeName Int where typeName = "Int"instance TypeName Void where typeName = "Void" … which we can invoke like this: >>> :set -XTypeApplications>>> typeName @Bool"Bool">>> typeName @Int"Int">>> typeName @Void"Void" The use of TypeApplications is essential, since otherwise GHC would have no way to infer which typeclass instance we meant. Even a type annotation would not work: >>> typeName :: String -- Clearly, this type annotation is not very helpful<interactive>:1:1: error: • Ambiguous type variable ‘a0’ arising from a use of ‘typeName’ prevents the constraint ‘(TypeName a0)’ from being solved. Probable fix: use a type annotation to specify what ‘a0’ should be. These potential instances exist: instance [safe] TypeName Void -- Defined at Example.hs:14:10 instance [safe] TypeName Bool -- Defined at Example.hs:8:10 instance [safe] TypeName Int -- Defined at Example.hs:11:10 • In the expression: typeName :: String In an equation for ‘it’: it = typeName :: String Type applications work here because you can think of a polymorphic function as really having one extra function argument: the polymorphic type. I elaborate on this a bit in my post on Polymorphism for dummies, but the basic idea is that TypeApplications makes this extra function argument for the type explicit. This means that you can directly tell the compiler which type to use by “applying” the function to the right type instead of trying to indirectly persuade the compiler into using the the right type with a type annotation. #### Conclusion My personal preference is to use the last approach with AllowAmbiguousTypes and TypeApplications. Not only is it more concise, but it also appeals to my own coding aesthetic. Specifically, guiding compiler behavior using type-annotations feels more like logic programming to me and using explicit type abstractions and TypeApplications feels more like functional programming to me (and I tend to prefer functional programming over logic programming). However, the Proxy-based approach requires no language extensions, so that approach might appeal to you if you prefer to use the simplest subset of the language possible. ### Tweag I/O # Ad-hoc interpreters with capability The capability library is an alternative to the venerable mtl (see our earlier blog posts on the subject). It features a set of “mtl-style” type classes, representing effects, along with deriving combinators to define interpreters as type class instances. It relies on the -XDerivingVia extension to discharge effects declaratively, close to the definition of the application’s monad. Business logic can be written in a familiar, idiomatic way. As an example, consider the following computation: testParity :: (HasReader "foo" Int m, HasState "bar" Bool m) => m () testParity = do num <- ask @"foo" put @"bar" (even num) This function assumes a Reader effect "foo" of type Int, and a State effect "bar" of type Bool. It computes whether or not "foo" is an even number and stores the result in "bar". Save for the tags "foo" and "bar", used to enable multiple Reader or State effects within the same monad (an impossible thing with mtl), this is fairly standard Haskell: Type classes are used to constrain what kind of effects the function can perform, while decoupling the computation from any concrete implementation. At use-site, it relies on GHC’s built-in resolution mechanism to “inject” required dependencies. Any seasoned Haskeller should feel right at home ! ## Providing instances To actually call this admittedly silly function, we need to provide interpreters for the "foo" and "bar" effects. Following the ReaderT design pattern, we’ll pack everything we need into a single context record, then interpret our effects over this context in the IO monad, using the deriving combinators provided by the library: data Ctx = Ctx { foo :: Int, bar :: IORef Bool } deriving Generic newtype M a = M { runM :: Ctx -> IO a } deriving (Functor, Applicative, Monad) via ReaderT Ctx IO -- Use DerivingVia to derive a HasReader instance. deriving (HasReader "foo" Int, HasSource "foo" Int) via -- Pick the field foo from the Ctx record in the ReaderT environment. Field "foo" "ctx" (MonadReader (ReaderT Ctx IO)) -- Use DerivingVia to derive a HasState instance. deriving (HasState "bar" Bool, HasSource "bar" Bool, HasSink "bar" Bool) via -- Convert a reader of IORef to a state capability. ReaderIORef (Field "bar" "ctx" (MonadReader (ReaderT Ctx IO))) Thus equipped, we can now make use of our testParity function in an actual program: example :: IO () example = do rEven <- newIORef False runM testParity (Ctx 2 rEven) readIORef rEven >>= print How do we test a function such as testParity in isolation? In our contrived example, this is quite easy: the example function could be easily converted into a test-case. In the Real World™, though, our context Ctx could be much bigger, providing a pool of database connections, logging handles, etc. Surely, we don’t want to spawn a database instance to test such a simple function! ## Ad-hoc interpreters In previous iterations of capability, the solution to this problem would have been to create a new monad for testing purposes, leaving out the capabilities we don’t want. While it works, it is not always the best tool for the job: • You need to define a new monad for each combination of effects you want to test. • Test cases are no longer self-contained; their implementation is spread across multiple places. It makes things less readable and harder to maintain. A solution, supported by fancier effect system libraries such as polysemy or fused-effects, is to define ad-hoc interpreters in the executable code itself. At first glance, it might seem like this is not possible in capability. Indeed, since interpreters are provided as type class instances, and type classes are an inherently static mechanism, surely there is no way of specifying those dynamically. Or is there? As of version 0.4.0.0, the capability library features an experimental Capability.Reflection module, addressing this very limitation. It is inspired by, and uses, Edward Kmett’s reflection library, and uses similar type class wrangling magic to let you define interpreters as explicit dictionaries. ### Interpreters as reified dictionaries Making use of those new features, the example function can be rewritten as: import qualified Control.Monad.Reader as MTLReader example :: IO () example = do let runTestParity :: (Int, IORef Bool) -> IO () runTestParity (foo, bar) = flip MTLReader.runReaderT foo$
-- Interpret the effects into 'ReaderT Int IO'.
--
-- Write the 'HasReader "foo" Int' dictionary
-- in terms of mtl functions.
--
} $-- Use 'MonadIO' to write the 'HasState "bar" Bool' dictionary. -- Forward the 'HasReader "foo" Int' capability. -- -- The 'MonadIO' capability is not forwarded, and hence forgotten. interpret @"bar" @'[HasReader "foo" Int] ReifiedState { _state = \f -> do b <- liftIO$ readIORef bar
let (a, b') = f b
liftIO $writeIORef bar b' pure a , _stateSource = ReifiedSource { _await = liftIO$ readIORef bar }
, _stateSink = ReifiedSink
{ _yield = liftIO . writeIORef bar }
}
testParity

rEven <- newIORef False
runTestParity (2, rEven)
readIORef rEven >>= print

Defining a test monad is no longer required: the effects are interpreted directly in terms of the underlying ReaderT Int IO monad. Type-class dictionaries are passed to the interpret function as mere records of functions and superclass dictionaries — just like GHC does under the hood as hidden parameters when we use statically defined instances.

Let’s dissect the ReifiedReader dictionary:

ReifiedReader
}

Omitting the extra Proxy# arguments, which are here for technical reasons, the first two attributes, _reader and _local, correspond directly to the methods of the HasReader t type class:

class (Monad m, HasSource tag r m) => HasReader (tag :: k) (r :: *) (m :: * -> *) | tag m -> r where
local_ :: Proxy# tag -> (r -> r) -> m a -> m a
reader_ :: Proxy# tag -> (r -> a) -> m a

The _readerSource argument, on the other hand, represents the dictionary of the HasSource superclass:

class Monad m => HasSource (tag :: k) (a :: *) (m :: * -> *) | tag m -> a where
await_ :: Proxy# tag -> m a

### Abstracting interpreters

This is quite boilerplatey, though. If we’re writing a lot of test cases, we are bound to redefine those interpreters several times. This is tedious, error-prone, and clutters our beautiful test logic. Maybe this is could all be factored out? Sure thing!

interpretFoo
=> (forall m'. All (HasReader "foo" Int : cs) m' => m' a)
-> m a
interpretFoo =
}

interpretBar
:: forall cs m a. (MonadIO m, All cs m)
=> IORef Bool
-> (forall m'. All (HasState "bar" Bool : cs) m' => m' a)
-> m a
interpretBar bar =
interpret @"bar" @cs ReifiedState
{ _state = \f -> do
b <- liftIO $readIORef bar let (a, b') = f b liftIO$ writeIORef bar b'
pure a
, _stateSource = ReifiedSource
{ _await = liftIO $readIORef bar } , _stateSink = ReifiedSink { _yield = liftIO . writeIORef bar } } These two functions follow a similar pattern. Let’s have a closer look at the type of interpretBar to understand what is going on: interpretBar :: forall cs m a. (MonadIO m, All cs m) => IORef Bool -> (forall m'. All (HasState "bar" Bool : cs) m' => m' a) -> m a • The (typelevel) cs :: [(* -> *) -> Constraint] argument is a list of capabilities that we wish to retain in the underlying action. • Since we interpret the State effect with a mutable IORef reference, we require that the underlying monad be an instance of MonadIO. Moreover, we ask that our target monad also implement all the required capabilities by adding the All cs m constraint to the context (All is a type family that applies a list of capabilities to a monad to generate a single constraint; for example, All '[MonadIO, HasSource "baz" Baz] m is equivalent to (MonadIO m, HasSource "baz" Baz m)). • The IORef used to store our state is passed as a standard function argument. This was not possible without ad-hoc interpreters: we needed to add the IORef to the Ctx type. With ad-hoc interpreters, on the other hand, we can write instances which capture references in their closures. • The last argument is a monadic action that makes use of HasState "bar" Bool along with the forwarded cs capabilities. It is required to be polymorphic in the monad type, which guarantees that the action cannot use other effects. Now that we have factored out the interpretation of the "foo" and "bar" effects into dedicated functions, they can be neatly composed to provide just the effects we need to run testParity: example :: IO () example = do let runTestParity :: (Int, IORef Bool) -> IO () runTestParity (foo, bar) = flip MTLReader.runReaderT foo$
interpretFoo @'[MonadIO] $interpretBar @'[HasReader "foo" Int] bar$
testParity

rEven <- newIORef False
runTestParity (2, rEven)
readIORef rEven >>= print

## Deriving capabilities

Truth be told, in this example, the dictionaries we’ve been writing aren’t so different from a custom type class with capabilities provided by deriving-via. While the extra power that comes with dynamic dictionaries can be very useful, it isn’t always warranted.

There is a middle ground, however: we can provide capabilities locally, but with deriving-via combinators using a function that we call derive. You would typically use derive to derive high-level capabilities from lower-level capabilities. In our case, we can replace:

runTestParity :: (Int, IORef Bool) -> IO ()
runTestParity (foo, bar) = flip MTLReader.runReaderT foo $interpretFoo @'[MonadIO]$
interpretBar @'[HasReader "foo" Int] bar $testParity with: runTestParity :: (Int, IORef Bool) -> IO () runTestParity ctx = flip MTLReader.runReaderT ctx$
derive
-- Strategy
-- New capability
@'[HasState "bar" Bool]
-- Forwarded capability
@'[MTLReader.MonadReader (Int, IORef Bool)] $derive @(Rename 1 :.: Pos 1 _ :.: MonadReader) @'[HasReader "foo" Int] @'[HasState "bar" Bool] testParity thus getting rid of the interpret{Foo,Bar} helpers entirely. For instance, the HasState "bar" Bool capability is derived from the IORef Bool in the second position of the tuple provided by the ambient MonadReader (Int, IORef Bool) instance. Think DerivingVia, but dynamically! ## Conclusion Wrapping things up: • At its core, the capability library is just mtl on steroids, modeling effects with type classes. • The standard way of using capability is to define interpreters declaratively, using the provided combinators; this programming-style does not allow defining ad-hoc interpreters, at runtime. • The new version of capability provides a way of overcoming this limitation with reified dictionaries. • Standard deriving strategies can be used to provide dynamic instances with less boilerplate, using the underlying deriving mechanism. Writing tests is just one example. Another application might be to dynamically select the interpretation of an effect based on a configuration parameter. All this is still experimental: the API and ergonomics are likely to change a bit over the next few releases, but we hope this post motivates you to give it a try. ## April 06, 2021 ### Gil Mizrahi # Giml's type inference engine Giml's type inference engine uses unification-based constraint solving. It has 5 important parts: 1. Topologically order definitions and group those that depend on one another 2. Elaboration and constraint generation 3. Constraint solving 4. Instantiation 5. Substitution In summary, we sort and group definitions by their dependencies, we elaborate the program with type variables and types that we know and collect constraints on those variables, we then solve the constraints for each group using unification and create substitutions which are mapping from type variables to types and merge them together, and we then substitute the type variables we generated in the elaboration stage with their mapped types from the substitution in the program. Instantiation occurs during the elaboration stage and the constraint solving stage, when we want to use a polymorphic type somewhere. ## Group and order definitions Before we start actually doing type inference, we figure out the dependencies between the various definitions. This is because we want to figure out the type of a definition before it's use, and because we want definitions that depend on each other (mutual recursion) to be solved together. So for example if we have this file as input: main = ffi("console.log", odd 17) even x = if int_equals x 0 then True else if int_equals x 1 then False else odd (sub x 1) odd x = if int_equals x 0 then False else if int_equals x 1 then True else even (sub x 1) The output of the reordering and grouping algorithm will be: [["even", "odd"], ["main"]] This is because even and odd are mutually recursive, so we want to typecheck them together, and because main depends on odd we want to typecheck it after we finished typechecking odd. The module implementing this algorithm is Language.Giml.Rewrites.PreInfer.GroupDefsByDeps. ## Elaboration Now the real work begins. We start by going over the program's AST and annotate it with the types we know. When we don't know what the type of something is we "invent" a new fresh type variable as a placeholder and mark that type with a constraint according to its usage. ### Types Language.Giml.Types.Types contains a datatype that can be used to represent Giml types. For example: • TypeCon "Int" represents the type Int • TypeVar "t1" represents a polymorphic type variable • TypeApp (TypeCon "Option") (TypeCon "Int") represents the type Option Int • TypeRec [("x", TypeCon "Int")] represents a record with the type { x : Int } ### Constraints A Constraint between two types describes their relationship. The most common constraint is Equality. Which means that the two types are interchangeable and should unify to the same type. For example from the constraint Equality (TypeVar t1) (TypeCon "Int") we learn that t1 is the same as Int. The other, less common constraint, is InstanceOf. Which means that the first type has an Equality relationship with the instantiation of the second type. an instantiation is a monomorphic instance of a polymorphic type, for example (a -> b) -> List a -> List b is polymorphic, and can be instantiated, among other examples, to (Int -> String) -> List Int -> List String by replacing the polymorphic type variables a and b with monomorphic types such as Int and String. We will talk about solving these constraints a bit later. ### Implementation We also use State, Reader and Except in our algorithm: • The State contains: • A monotonically increasing integer for generating type variables • A list of constraints we generated through the elaboration phase, which we will return as output at the end • An environment of data definitions, some are built-in and others are user defined and added as part of the elaboration stage (maybe I should've put this in the Reader part...) • The Reader contains: • The currently scoped variables in the environment and their type (which might be polymorphic!) • The type of builtin functions and expressions (such as int_equals and pure, which might also be polymorphic) • The Except part can throw a few errors such as "unbound variable" error. Most of the action happens in elaborateExpr and I'm not going to cover every usecase, but lets look at a few examples: #### Elaborating literals Elaborating literals is simple - we already know what the types are - if we see a string literal we annotate the AST node with the type String, if we see an integer literal we annotate it with Int, and we don't need to generate new constraints because we know exactly what the type is. #### Elaborating lambda functions Elaborating lambda expressions is interesting, because we don't know what the type for the arguments is going to be: it might be inferred by the usage like in this case: \n -> add 1 n or not like in this case \x y -> x. What we do here is: 1. Generate a type variable for each function argument 2. Elaborate the body of the lambda with the extended environment containing the mapping between the function arguments and their newly generated types 3. Annotate the lambda AST node with the type of a function from the argument types to the type of the body, which we get from the elaborated node after (2) And we will trust the constraint solving algorithm to unify all of the types. #### Elaborating function application To elaborate a function application we: 1. Elaborate the arguments 2. Elaborate the applied expression 3. Generate a fresh type variable to represent the return type 4. Constrain the type of the applied expression as a function from the arguments to the return type 5. Annotate the function application AST node with the return type (which we just generated) Remember that the type of the applied expression may just be a generated type variable, so we can't examine its type to see what the return type is. For example if we had this expression \f x -> f x, as we saw before we generated type variables for f and x! So we have to make sure the type inference engine knows what their type should be (by generating a constraint) according to the usage and trust that it'll do its job. #### Elaborating variables For variables, we expect an earlier stage (such as elaborating lambdas, lets or definitions) to have already elaborated their type and added them to the environment, so we look them up. We first try in the scoped environment and if that fails we try the built-ins environment. If that fails as well, we throw an error that the variable is unbound. If we do find a variable in the scoped environment, there are two scenarios we need to take care of: 1. The variable was bound in a place that expect it to be monomorphic, such as a lambda function argument 2. The variable was bound in a place that expect it to be polymorphic, such as a term definition Monomorphic types can be used as is, but polymorphic types represent an instantiation of a type. When we find that the variable we are elaborating is polymorphic, we will generate a new type variable for it and constrain it as an InstanceOf the type we found in the environment. Later in the constraints solving stage we will instantiate the polymorphic type we found in the environment and say that the type variable we generate is equals to the instantiation. We delay this because we want the polymorphic type to be solved first before we instantiate it. If we do not find the variable in the scoped environment, we check the builtins environment. If we find the type there, we instantiate it immediately and annotate the AST node with this type. Here we instantiate the type immediately because it is already solved, so we don't need to wait. ## Constraints solving The heart of the type inference engine. Here we visit each the constraints generated from each group of substitutions separately and generate a substitution (mapping from type variables to types) as output. the solveConstraints function takes an accumulated substitution and a list of constraints as input and tries to solve the first constraint on the list. Solving a constraint might generate more constraints to solve, and may return a substitution as well. We then apply this substitution (substitute all occurences of the keys with the values) to the rest of the constraints and the accumulated substitution itself, merge the new substitution with the accumulated substitution, and keep going until there are no more constraints left to solve. (solveConstraint) uses unification to solve a constraint. Here are a few important cases that it handles: #### Equality t1 t2 (where t1 == t2) When the two types are the same (for example Int and Int, or t1 and t1), the unification succeeds without extra work needed, we don't need to create new constraints or create a substitution. #### Equality (TypeVar tv) t2 When one of the types is a type variable, we substitute (replace) it with t2 in all places we know of (accumulated substitution and rest of the constraints). So the rest of the constraints that contain it will use t2 instead, and when substituting it with another type it'll be t2. #### Equality (TypeApp f1 a1) (TypeApp f2 a2) We generate two new constraints: Equality f1 a2 and Equality f2 a2, reducing a constraint of complex types into something we already know how to handle: a constraint on simple types. #### InstanceOf t1 t2 When we run into an InstanceOf constraint, we instantiate the second type t2 and produce an Equality constraint between t1 and the instantiation of t2. Instantiating is basically "copying" the type signature, and producing the Equality constraint with t1 makes the type inference engine find the right type for the instantiated type variables. We use InstanceOf instead of Equality when the type is polymorphic, for example when we want to use id that we defined as id x = x. If we were to use an Equality constraint instead, every use of id would constrain it to have a particular type (so id 1 would make the type of id Int -> Int instead of only where we use it). There are a few more cases, but this is the gist of it. When we fail to unify we throw an error of type mismatch. ### Example Let's see an example for this (very) short program: one = (\x -> x) 1 After the elaboration stage we generate the following AST and constraints:  +-- t1 -> t1 | --------- one = (\x -> x) 1 | | | +-- t2 | +------ Int | +---- t1 [ Equality top0 t2 , Equality (t1 -> t1) (Int -> t2) ] #### How did we get these constraints? 1. The lambda expression case generated the type t1 for x, so it's type is t1 -> t1 2. The literal case annotated 1 as Int 3. The function application generated the return type t2 and generated the constraint Equality (t1 -> t1) (Int -> t2) which is "the type of the applied expression is equals to a function from the argument to the return type I just generated" 4. The definition of one generated the type variable top0 for it so it can be used in other definitions polymorphically, but for this function it should be used monomorphically, so it's constrainted as equal to the return type and generated the constraint Equality top0 t2. #### How do we unify them? Let's go over them in order. We begin with an empty substitution: - Constraints: [Equality top0 t2, Equality (t1 -> t1) (Int -> t2)] - Substitution: [] 1. Equality top0 t2 -- ==> The first is a type variable, so add top0 := t2 to the substitution - Constraints: [Equality (t1 -> t1) (Int -> t2)] - Substitution: [top0 := t2] 2. Equality (t1 -> t1) (Int -> t2) -- ==> replace with two new constraints: -- 1. Equality t1 Int -- 2. Equality t1 t2 - Constraints: [Equality (-> t1) (-> Int), Equality t1 t2] - Substitution: [top0 := t2] 3. Equality (-> t1) (-> Int) -- ==> replace again with two new constraints: -- 1. Equality (->) (->) -- 2. Equality t1 Int - Constraints: [Equality (->) (->), Equality t1 Int, Equality t1 t2] - Substitution: [top0 := t2] 4. Equality (->) (->) -- ==> these two types are identical, so we can continue - Constraints: [Equality t1 Int, Equality t1 t2] - Substitution: [top0 := t2] 5. Equality t1 Int -- ==> The first is a type variable, so add t1 := Int to the substitution - Constraints: [Equality Int t2] - Substitution: [top0 := t2, t1 := Int] 6. Equality Int t2 -- ==> The second is a type variable, so add t2 := Int to the substitution - Constraints: [] - Substitution: [top0 := Int, t1 := Int, t2 := Int] And we're done! The output is the substitution [top0 := Int, t1 := Int, t2 := Int], we can now apply it to our elaborated program and get the fully type checked program:  +-- Int -> Int | --------- one = (\x -> x) 1 | | | +-- Int | +------ Int | +---- Int Two more things we talk to talk about before we wrap this up: ## Instantiation We instantiate a type by looking up all of the type variables in the type and generate a fresh new type variable for each one (though all instances of a type variable should be mapped to the same new type variable). So for example in the type (a -> b) -> List a -> List b, if t1 and t2 are two new type variables we haven't seen before, we can replace a with t1 and b with t2 everywhere in the type and get (t1 -> t2) -> List t1 -> List t2 After that, we let the Equality constraints and the constraint solving algorithm find the right monomorphic type for t1 and t2. ## Substitution When we apply a substitution to a constraint or another substitution, we replace all occurences of a type variable tv with t if t := tv is in the substitution. But, we also need to check that tv does not contain t inside of it! (this is called an occurs check) if tv does contain t, it means that the we ran into an infinite type and we don't support those here, so we will throw an error. A simple way to generate such case is this: x y = x The constraint generated from this definition is: Equality top0 (t1 -> top0), which will fail the occurs check. If you try to desugar this definition into a lambda expression, you'll quickly discover that you've been had: anytime you'll go to replace x with \y -> x you'll just add another level of lambda. # Summary This is the general overview of Giml's type inference engine. If you are interested in learning more about it, check out the source code. In the next blog posts I will cover the type inference of extensible records and polymorphic variants. If this helped you, there's something you want me to clarify, you think there's something I missed, there's something you want to know more about, or there's something I got totally wrong, do contact me on Twitter or via Email. ## April 01, 2021 ### Michael Snoyman # Sockchain As we all know, blockchain is a disruptive technology that has helped us find new efficiencies in our lives. For example, prior to Bitcoin, creation of new currency required wasteful processes like digging gold out of the ground or printing paper currency. Bitcoin replaces that with cheap electricity. The world is better off for the reduced resource usage. Efficiency! More industries are beginning to adopt various blockchain solutions to solve common problems. These are creating synergies in enterprises and forming new markets. But so far, blockchain has been used almost exclusively to solve industrial and financial systems issues. Today, I'm happy to introduce a new blockchain technology aimed at solving problems faced by everyday users. ## The problem It's a perennial issue, with a huge economic impact (detailed below). We estimate that in addition to the massive economic impact, the emotional and psychological damage may even be the larger factor. I'm talking, of course, about mismatched and lost socks. While "the dryer ate my socks" is a common term, we have in fact identified multiple related but separable subproblems: • The standard missing-sock • Similar, yet slightly different, versions of the same socks, leading to excess effort in identifying a match. This problem breaks down further into: • Socks which have been worn a different number of times and thereby stretched differently • Different manufacturing lines leading to microvariations within construction • Despite all claims to the contrary, there are in fact left vs right variations of socks, either due to manufacture or usage. Identifying a correct pairing becomes an NP-complete problem This is in fact just the tip of the iceberg. We fully believe there may be at least 13 further classifications of sock-related inefficiencies. Based just upon our current knowledge, we have calculated the following impacts: • 7 billion sock-wearing individuals worldwide (some populations are, of course, part of the rabid anti-socker cabal). • Individuals tend to change their socks approximately every 1.2 days. • Individuals tend to launder socks approximately every 8 days. • On average, 3.4 individuals run laundry concurrently. • This leads to an average socks pair/laundry of ~23, or ~46 socks per laundry cycle. • As this is an NP-hard problem to solve, with a comparison taking on average 2.3 seconds. • This leads to approximately 75,000 minutes spent sorting socks per laundry cycle. While there may be some ad-hoc optimizations, we expect no greater than a 10x speedup from such optimizations, and believe each laundry cycle requires at least 7,500 minutes, just on sock sorting. • Combining this with our estimates for laundry cycles per individual, we estimate that the global economy receives an impact of 1.3 billion person years wasted per year just on sock sorting! As you can see, sock sorting accounts for easily the greatest modern crisis known to humankind. But further than this is the emotional and psychological harm caused by these situations. We estimate 1 in 4 divorces are caused by mismatched socks. 3 in 10 children will wear a mismatched pair of socks to school at least once per month, with a 27% chance of cyberbullying as a result. ## Enter the sockchain The sockchain is the natural, obvious, and modern solution to this problem. We leverage our novel Proof-of-Work/Proof-of-Stake hybrid technology to create a distributed application that uses Near Field Communication (NFC) to tag each pair of socks. Dryers throughout the world can easily and cheaply be outfitted with NFC detectors. Miners will be able to detect the presence of such socks in distributed drying systems, and provide a searchable, distributed ledger of sock-pair-locations (SPLs). Our fee incentive structure will guarantee payment of at least 2 socks-pairs/block, leading to economies of scale. The fee structure will appropriately respond to the evolving and dynamic socketplace. We envision further development as well. Already, some of our partners are embedded fungus detection systems within our NFCs for athlete's foot elimination, dealing with yet another worldwide pandemic. ## Cost It's right to question the costs of such endeavors. As mentioned above, the average labor cost per year currently is 1.3 billion person years. Even taking a modest estimate of$150/hour of labor, we end up with an average cost per year of nearly two quadrillion dollars, which is almost the marketcap of Bitcoin itself.

We estimate 80 billion active pairs of socks worldwide. At an NFC cost of $20 per pair, we end up with a modest hardware cost of$1.6 trillion, an easily affordable expense given the current human impact. Outfitting the dryers will cost an additional $50 trillion. And we estimate that electricity costs for our novel Proof of Work system to be no greater than$100 trillion. Including training, marketing, and our very slim administrative expenses, the project will cost approximately $300 trillion. Compared to the current cost of$2 quadrillion, we're looking at an 85% cost savings.

It would be inhumane not to undertake such a project.

# Haskell base proposal, part 2: unifying vector-like types

Two weeks back, I wrote a blog post with a proposal for unification of vector-like types in bytestring, text, and vector. This continued with a discussion on Discourse, and has been part of some brainstorming sessions at the Haskell Foundation tech track to lock down some of the concrete details. (Meeting minutes have been posted to Discourse regularly.) I've discussed with a few other interested parties, and invited feedback from people who have been working on related projects. (And I even received some such feedback!)

At this point, I wanted to summarize what's come up, and propose some concrete next steps. Also, since this is one of the first such proposals we're trying to get to happen through the Haskell Foundation, I'll take a step back with some meta comments about the process itself.

## Refined goals

I threw quite a bit into the original proposal. The brainstorming since has helped make it clear what can be achieved, what will have major breaking impacts, and what will deliver the most value. Here are some outcomes:

• As much as I still think a single, unified Vector type that does boxing or unboxing depending on the contained type would be a Very Good Thing, there are technical hurdles, and not everyone is bought into it. I doubt it's going to happen any time soon.
• The temporary pinning of memory is likely not going to happen, since it would be too big a GC change. However, larger ByteArray#s (3kb IIRC) are already pinned, and using the copy-when-unpinned technique will not be too expensive for things under 3kb. So we have a path forward that requires no (or few) changes to GHC.
• It seems like one of the biggest advantages we may get out of this proposal is to move ByteString to unpinned memory. This would be good, since it would reduce memory fragmentation. The downside is the need to copy smaller ByteStrings before performing FFI calls. But overall, as mentioned previously, we don't think that will have a major performance impact.
• There are already efforts underway, and have been for a while, to rewrite text to use UTF-8. Combined with this proposal, we could be seeing some serious improvements to how textual data is consumed and worked with in Haskell, but that's a bit too far off right now.

## Refined design

With that in place, a semi-coherent design is beginning to form around this proposal:

• We're not ready to move types into base yet, but fortunately we already have another package that's a good candidate for shared-vector-types: primitive. It's already used by vector, and can be used by bytestring and text.
• We have a PrimArray type present in primitive, but it doesn't support slicing. Let's add a PrimVector type with the minimal machinery necessary to get slicing in place.
• The big change: let's rewrite ByteString (the strict variant) to be newtype ByteString = ByteString (PrimVector Word8).
• We can recover backwards compatibility in most of the package, including in the .Unsafe module.
• People directly using the .Internal module will likely be broken, though there may be some clever tricks to recover backwards compat there too.
• We'll get the non-memory-fragmentation benefit immediately.
• Not yet discussed, but putting down for future brainstorming: what should we do with ShortByteString? If we move it over to PrimVector, it will end up with a little more overhead for slicing. Do we leave it alone instead? Move it to PrimArray?

There are additional steps I could write around text and vector, but honestly: let's stop it right there. If we get a working bytestring package on top of primitive's PrimVector, I think that's a great time to take a break, analyze the performance impact, and see ecosystem impact, likely by compiling a Stackage snapshot with the tweaked bytestring and primitive packages.

## Action items

Next steps: find out who's interested in doing the work and dive in! This is still proof of concept, so no real buy-in is needed. We're exploring a possibility. There's a bunch of code that needs to be rewritten in bytestring to see if this is possible.

And while brainstorming in calls has been good, I don't think it's great. I'd like to move to discussions in a chat room to make it easier for others to engage. I'll comment a bit more on this below.

## Other packages

I've reached out to some authors of other packages to get their input on this proposal. I've received some, and incorporated some of that here. For example, both Alexey Kuleshevich and Alexey Khudyakov proposed using primitive as the new home for the shared data types. Others have expressed that they'd rather first see a concrete proposal. We'll see if there is further collaboration possible in the future.

Let me summarize, at a high level, what the process was that was involved in this proposal:

1. Free-form discussions on the Haskell Foundation tech track in a live meeting. People discussing different ideas, different concerns, which ultimately triggered the idea of this proposal.
2. Blog post to the world and Discourse thread laying out some initial ideas and looking for feedback.
3. Public discussion.
4. A few private emails reaching out to specific interested parties to get their input.
5. In concert with (3) and (4): further tech track live meetings to discuss details further.
6. This second blog post (and associated Discourse thread) to update everyone and call to action.
7. Hopefully: some actual coding.

Overall, I think this is a good procedure. If I could make one change, it would be towards leveraging asynchronous communication more and live meetings less. I absolutely see huge value in live meetings of people to brainstorm, as happened in (1). But I think one of the best things we can do as the Haskell Foundation is encourage more people to easily get involved in specific topics they care about.

On the bright side, the significant usage of Discourse for collaboration and reporting on meeting minutes has been a Good Thing. I think blog posts like this one and my previous one are a Good Thing for collecting thoughts coherently.

That said, I realize I'm in the driver's seat on this proposal, and have a skewed view of how the outside world sees things. If people have concerns with how this was handled, or ideas for improvement, bring them up. I think figuring out how to foster serious discussion of complex technical issues in the Haskell ecosystem is vital to its continued success.

# Conferences after COVID: An Early Career Perspective

One silver-lining to the cloud of COVID has been the development of virtual forms of participation. A SIGPLAN blog post by five early-career researchers offers their perspective on what we should do next.
We propose that SIGPLAN form a Committee on Conference Data. The committee would be made up of: one organizing-committee representative from each of the flagship SIGPLAN conferences, one early career representative, and, crucially, a professional data collection specialist hired by SIGPLAN. The group would identify and collect key data that is pertinent to conference organization, especially with respect to physical versus virtual conference formats. The committee would make data-driven recommendations to SIGPLAN organizers based on the collected data and guided by core tenets such as community building, inclusivity, research dissemination, and climate responsibility. We realize that this is not a small request, but we are confident that it is both necessary and achievable. If the committee were to form by May 1, 2021, it would be able to start collecting data at PLDI 2021 and continue through the next two years, providing enormous clarity for SIGPLAN organizers at a time when so much is unclear.

# Skin worms?

The King James Version of Job 19:26 says:

And though after my skin worms destroy this body, yet in my flesh shall I see God:

I find this mysterious for two reasons. First, I cannot understand the grammar. How is this supposed to be parsed? I can't come up with any plausible way to parse this so that it is grammatically correct.

Second, how did the worms get in there? No other English translation mentions worms and they appear to be absent from the original Hebrew. Did the KJV writers mistranslate something? (Probably not, there is nothing in the original to mistranslate.) Or is it just an interpolation?

Pretty ballsy, to decide that God left something out the first time around, but that you can correct His omission.

### Michael Snoyman

I've been playing with this idea off-and-on for a few years now, and decided to finally write a blog post to float it. I'm considering shutting down the Haskellers, and turning it into a redirect to some other community resource. The reason for this is pretty straightforward: I created Haskellers site originally as a some kind of a community/discussion/professional connection hub. But essentially all of its functionality seems better served by other resources at this point, and I'd rather promote those other services.

The simplest idea I had was to change it to a redirect to the haskell.org Discourse, though perhaps hosting a single-page site with links to common Haskell resources, together with a very prominent "edit" button for additional contributions, would be worthwhile.

I'm curious if anyone has feedback, either for or against, or ideas on what can or should replace a haskellers.com page. If so, please click the button below to the GitHub issue, to at least click the â€‹â€‹â€‹â€‹â€‹â€‹â€‹ğŸ‘� or ğŸ‘� button.

# Improvements to memory usage in GHC 9.2

Sometimes OS-reported memory usage might be quite different to the live data in your Haskell program. In a couple of previous posts we’ve explored some reasons for this and shown how ghc-debug can help pinpoint fragmentation problems.

In GHC 9.2 I have made two improvements which should make the memory usage reported by long-running applications more closely line up with the amount of memory they need.

1. By default a lot of extra memory was retained (up to 4 times the amount of live data). Now idle collections will start to return some of that memory configured by the -Fd option. (!5036)
2. Allocation of pinned objects was causing fragmentation of the nursery which could lead to the nursery retaining a lot of memory. (!5175)

In the rest of this post I will explain how to estimate the memory usage of your program based on its amount of live data and provide some more detail about the new -Fd option.

## Inspecting memory usage

There are three metrics which are important when understanding memory usage at a high-level. The GHC.Stats module provides means of inspecting these statistics from inside your program, and they are also included in the eventlog.

Name Description GHC.Stats Eventlog
Live Bytes The amount of live heap allocated data in your program gcdetails_live_bytes HeapLive
Heap Size The amount of memory the RTS is currently retaining for the Haskell heap gcdetails_mem_in_use_bytes HeapSize
OS VmRSS The amount of memory the OS thinks your program is using Sample from /proc/pid/status

OS VmRSS and heap size should correspond closely to each other. If they do not then there are two likely reasons.

1. It could mean that off-heap allocations (such as calls to mallocBytes or allocations from C libraries) are contributing significantly to the resident memory of your program. You can analyse off-heap allocations using tools such as heapcheck.
2. By default memory is returned lazily to the OS by marking returned regions using MADV_FREE, the memory is then free to be returned to the OS but the OS only takes it when it needs more memory. The RTS flag --disable-delayed-os-memory-return means that we use MADV_DONTNEED instead, which promptly returns memory.

In the rest of this post I will assume that OS VmRSS and heap size are close to each other.

## Understanding how Heap Size corresponds to Live Bytes

Heap size is the amount of memory that the RTS has currently allocated for the Haskell heap. There are three main factors which determine heap size.

### Collection Strategy

Firstly, the garbage collection strategy used by the oldest generation requires some overhead. By default a copying strategy is used which requires at least 2 times the amount of currently live data in order to perform a major collection. For example, if your program’s live data is 1GB then the heap size needs to be at least 2GB as each major collection will copy all data into a diferent block.

If instead you are using the compacting (-c) or nonmoving (-xn) strategies for the oldest generation then less overhead is required as the strategy immediately reuses already allocated memory by overwriting. For a program with live bytes 1GB then you might expect the heap size to be at minimum a small percentage above 1GB.

### Nursery Size

Secondly, a certain amount of memory is reserved for the nursery. The size of the nursery (per core) can be specified using the -A flag. By default each nursery is 4MB so if there are 8 cores then 8 * 4 = 32MB will be allocated and reserved for the nursery. As you increase the -A flag the baseline memory usage will correspondingly increase.

### Memory Retention Behaviour

Thirdly, after doing some allocation GHC is quite reluctant to decrease its heap size and return the memory to the OS. This is because after performing a major collection the program might still be allocating a lot and it costs to have to request more memory. Therefore the RTS keeps an extra amount to reuse which depends on the -F ⟨factor⟩ option. By default the RTS will keep up to (2 + F) * live_bytes after performing a major collection due to exhausting the available heap. The default value is F = 2 so you can see the heap size to be as high as 4 times the amount used by your program.

Without further intervention, once your program has topped out at this high threshold, no more memory would be returned to the OS so heap size would always remain at 4 times the live data. If you had a server with 1.5G live data, then if there was a memory spike up to 6G for a short period, then heap size would never dip below 6G. This is what happened before GHC 9.2. In GHC 9.2 memory is gradually returned to the OS so OS memory usage returns closer to the theoretical minimums.

## New memory return strategy in GHC 9.2

The -Fd ⟨factor⟩ option controls the rate at which the heap size is decreased and hence memory returned to the OS. On consecutive major collections which are not triggered by heap overflows, a counter (t) is increased and the F factor is inversely scaled according to the value of t and Fd. The factor is scaled by the equation:

\texttt{F}&39; = \texttt{F} \times {2 ^ \frac{- \texttt{t}}{\texttt{Fd}}}

By default Fd = 4, increasing Fd decreases the rate memory is returned.

Major collections which are not triggered by heap overflows arise mainly in two ways.

1. Idle collections (controlled by -I ⟨seconds⟩)
2. Explicit trigger using performMajorGC.

For example, idle collections happen by default after 0.3 seconds of inactivity. If you are running your application and have also set -Iw30, so that the minimum period between idle GCs is 30 seconds, then say you do a small amount of work every 5 seconds, there will be about 10 idle collections every 5 minutes. This number of consecutive idle collections will scale the F factor as follows:

\texttt{F}&39; = 2 \times {2^{\frac{-10}{4}}} \approx 0.35

and hence we will only retain (0.35 + 2) * live_bytes rather than the original 4 times. If you have less frequent idle collections (e.g. due to increasing -I or -Iw) then you should also decrease Fd so that more memory is returned each time a collection takes place.

Enabling idle collections is important if you want your program to return memory to the operating system and promptly run finalisers. In the past it has sometimes been recommended that long running applications disable idle collections in order to avoid unecessary work but now it is better to keep idle collections enabled but configure the -Iw option to avoid them happening too frequently.

If you set -Fd0 then GHC will not attempt to return memory, which corresponds with the behaviour from releases prior to 9.2. You probably don’t want to do this as unless you have idle periods in your program the behaviour will be similar anyway.

### Analysis and Further Tweaking

These two graphs show the difference between -Fd0 and -Fd4, with -Fd4 the memory usage returns to a baseline of around 4GB after spiking at 8GB. With -Fd0, the memory usage never retreats back below 7GB.

If you want to retain a specific amount of memory then it’s better to set -H1G in order to communicate that you are happy with a heap size of 1G. If you do this then heap size will never decrease below this amount if it ever reaches this threshold.

The collecting strategy also affects the fragmentation of the heap and hence how easy it is to return memory to a theoretical baseline. David in a previous post gave a much more in-depth explanation of fragmentation.

In theory the compacting strategy has a lower memory baseline but practically it can be hard to reach the baseline due to how compacting never defragments. On the other hand, the copying collecting has a higher theoretical baseline but we can often get very close to it because the act of copying leads to lower fragmentation.

## Conclusion

In this post I explained the new GHC option -Fd and how it can be configured to return additional memory to the OS during idle periods. Under the default settings this should result in a lower reported memory usage for most long-running applications in GHC 9.2.

A big thank you to Hasura, an open-source GraphQL engine built in Haskell, for partnering with us and making the work presented here possible.

### Tweag I/O

For the three of us, the launch of the Haskell Foundation was one of 2020’s few bright spots.

In November 2020, Simon Peyton Jones announced the formation of the Haskell Foundation. Conceived during the first lockdowns of early 2020, the Haskell Foundation is a non-profit organization aimed at promoting adoption of the Haskell programming language and bolstering its core infrastructure. The Foundation has now raised approximately $500K in cash and financial commitments from sponsors, recruited a volunteer board of 14 directors and a full-time staff of two: Andrew Boardman, Executive Director (ED); and Emily Pillmore, Chief Technology Officer (CTO). Today, the Foundation has its own point of view, plans, sponsors, and community of volunteers. It has reached escape velocity. Our personal opinions now or back then don’t matter much, but we thought it might be nice to share the story of what sparked its inception. In the end, the Haskell Foundation was launched with input from dozens of friends of Haskell. What we thought might be like rolling a boulder uphill turned out to be more like kicking a stone and starting a rockslide. It seems the Haskell Foundation is an idea whose time had come. ## Inception In early 2020, Simon Peyton Jones got a call from Frank Rodorigo, a US-based entrepreneur, who was in the process of reviving Skills Matter, a community of tech creators, users and adopters. Skills Matter had run into financial difficulties in 2019, and Frank, together with his CTO Scott Conley, wanted to make sure Haskell was at the center of their reboot plans. He hoped to explore some ideas with Simon about the best ways to help Haskell. One thing led to the next: under the impetus of the rebooted Skills Matter, we brainstormed about what extra glue the community might need to bolster the lofty goals that so many seemed to have. Over the years, we have been encouraged by the inspiration Haskell has given to other languages like Rust, Apple’s Swift and others. We even saw Java programmers singing the praises of lazy streams and anonymous functions. It had always seemed like just a matter of time before more programmers start wanting to use “The Real Thing”. While we saw some progress in the Haskell ecosystem, with the release of GHC 8.0, and efforts to eliminate “dependency hell” on the part of the Stack and Cabal projects, it didn’t feel like enough. Enough for what? If you wanted to use Haskell in production at a company, you still had to be brave and determined. The evidence was anecdotal but hard to ignore. It ranged from conference attendees talking about how complicated Haskell has become, to concerns such as getting Haddock working with GHC’s new type system features, to the trouble enabling Stack to keep working with Hackage. Pull requests to some of the core libraries were languishing and many felt there were problems in those libraries simply going unaddressed. Worst of all, some of us were aware of companies that had adopted and then later abandoned Haskell. Those would-be Haskellers faced a confusing collection of projects and committees, none of whom themselves felt they had a broad mandate to advance Haskell. To be sure, there were some people thinking about remedies with proposals like Simple Haskell and Boring Haskell. But we thought more was needed. We started with some goals: • An easier on-ramp. Starting out with Haskell is harder than it should be, with a wealth of ways of setting up a Haskell dev machine (some of them out-of-date!). Even after this first step, newcomers often land on unmaintained wiki pages or other seldom helpful destinations. • More inclusivity. To help the community grow, we wanted the occasional alienating post or dismissive comment to be reliably addressed. • More progress, faster. We wanted to encourage more innovation across the many different projects that comprise Haskell. We also wanted to explore ways to help the committees and volunteers that make up the Haskell community to channel resources and volunteers to where they are needed most, and to ensure that each tool works well with others, forming a cohesive whole. • Funding. We knew first-hand that many companies were already investing in Haskell to ease their own pain points, but their efforts weren’t very connected. We wanted to make all of the above more feasible with the help of sponsors. ## Gaining momentum Having envisioned the outlines of the Haskell Foundation, what next? We wrote those ideas down in a live shared document we called “the whitepaper” and started gathering feedback from an ever widening group. We really wanted to iron out the wrinkles to avoid announcing something that could fall flat on its face. As Simon Peyton Jones is the central architect and developer behind the Haskell ecosystem, the rest of us thought that his visible involvement and leadership would be essential. Happily, Simon agreed to stay involved and eventually spent way more time than he had planned. We still had some worries. Would others consider affiliating? Were we stepping on anyone’s toes? We decided to take a slow and deliberate concentric rings approach to next reach out to core committees, then companies and large projects, then influential community members, to socialize the idea further. We started with the Haskell.Org Committee. Expecting pushback, we were blown away by how eager the chairperson (Jasper van der Jeugt) was, and then in turn the rest of the haskell.org members: Ryan Trinkle, Emily Pillmore, Tikhon Jelvis, Rebecca Skinner, and Alex Garcia de Oliveira. They all became early and essential participants in the launch of the Foundation. It seems that the idea resonated strongly with some of their own ongoing discussions. We were still a small team at this point — fewer than ten people, with Tim Sears as our chief day-to-day organizer. We kept adding anyone who wanted to help to a bi-weekly video call and kept iterating the whitepaper. After a short time it looked nothing like the first draft. Feeling a bit braver, we then decided to also reach out to companies and key stakeholders in the community. We met skeptics along the way, like Michael Snoyman (now on the Foundation Board), both on the vision and the feasibility. These skeptics’ input turned out to be extremely helpful. Among other things, we used their feedback to sharpen the Foundation’s commitment to transparency. Ryan Trinkle (Obsidian Systems) soon started playing the role of shadow CFO. Duncan Coutts and Ben Gamari (Well-Typed) provided valuable input, as did Neil Mitchell, John Wiegley, Ed Kmett, Simon Marlow and others too numerous to mention. It was important to connect with the Core Libraries Committee, the Hackage Trustees and the GHC Steering Committee. Emissaries were dispatched. Those groups ranged from amenable to enthusiastic, but they also asked some thorny questions. Did the community really need another committee? How would the Foundation differentiate itself? Could we actually raise money? In private, Simon Peyton Jones asked Tim why he wrote down a 7-figure sponsorship goal in an early draft of the whitepaper. Was it realistic? Tim had to admit he wasn’t sure, but without funding the Foundation would never have big impact. Only one way to find out… At some point we started calling our informal cabal the HF Working Group. Eventually the invite list numbered in the dozens, with about 12 or so turning up regularly to our video chats. Scheduling was rarely a problem, since nobody was traveling - the silver lining of a terrible pandemic. ## The Foundation escapes quarantine In August, the Working Group asked itself: “Haven’t we socialized this idea enough yet? Can’t we freeze and ratify the whitepaper? Why can’t we just launch?“. Just like that we turned a corner. We started a semi-public outreach effort on the following basis: • The Foundation would be non-profit. Our goal would never be to create a consultancy or training company. Our goal would be to promote Haskell and related technologies. We would not be selling any services. • The Foundation would be inclusive. It would seek input from a variety of sources and be community-driven. A goal of the Foundation would be to look like the community we want to become. • The Foundation would be funded. We would start with ambitious fund-raising goals. Specifically, we wanted to aim for a yearly budget of over$1,000,000. Donations would come from industry and from the general Haskelling public.
• The Foundation would have an executive director. Acknowledging that we are all busy people, well-meaning volunteers simply do not have the bandwidth to offer sustained attention to where it is needed. Instead, the Foundation would have at least one full-time employee, whose day job it is to manage the Haskell community and promote its interests.

To our eyes, now informed by the community, this seemed like a winning formula — a design that would be able to fix Haskell’s problems and promote the language, while strengthening our community.

The Foundation quickly became the world’s worst-kept secret as the Working Group set a public launch date for November.

## Announcement and bootstrapping

Right away there was a new chicken-and-egg problem: we wanted the Board of Foundation to be drawn from the wider community, and yet we needed to have someone in charge so we could launch quickly. The HF Working Group landed on a two-step process. The Foundation would start with an interim Board and launch. The Board would then both replace itself and hire an ED to run the organization. The Working Group identified eight prominent Haskellers pre-launch and invited them to serve as an interim board. Thanks to Simon Peyton Jones’s persuasive powers, and a promise that their main duty would be limited to the above, they were quickly recruited.

Tim Sears, Emily Pillmore, Ryan Trinkle, and Alex Garcia de Oliveira led the fundraising efforts, starting seriously in August with a launch date slated for November. Even before launch, the Foundation quickly landed over $200,000 in commitments. Soon we knew that we would have enough to fund an ED who could spend enough time to foster the sponsorship efforts. Ryan started the paperwork to incorporate the Haskell Foundation as a non-profit, while Emily took over the de facto running of the Working Group. Jasper led an Affiliation Track to coordinate transparency policies for the community. Richard Eisenberg worked with other volunteers (Ben Gamari of Well-Typed, Davean Scies of xkcd, Emily, Moritz Angermann of IOHK, Tikhon Jelvis from the Haskell.Org committee, and Tim Sears) to develop an initial technical agenda, a list of the kinds of projects the Foundation would seek to accelerate. This was to be used in our fundraising pitches and in forming a starting place for the real work to come: we knew that, once the Foundation was made public and we had selected a board, the agenda could be revised in the light of the freedom of being able to consult the public. Rebecca Skinner from the Haskell.Org committee (helped by Tim and Davean) volunteered to lead up the effort to create a website, in advance of the upcoming launch, which we decided to incorporate into the Haskell eXchange conference, hosted by Tweag partner Skills Matter. Cardano/IOHK provided much of the on-the-ground labor in putting the initial website draft together, handing it over to Rebecca to push it over the line for the launch. Finally, on 4 November, 2020 Simon Peyton Jones publicly announced the Haskell Foundation. You can watch the video of his announcement. Despite the feverish pace of work leading up to that announcement, everyone knew that the real work was only beginning, but we had a launch! The story we wanted to share now comes to an end. ## Epilogue In December and January the interim Board recruited an outstanding slate of 14 members from a large group of applicants. As a special bonus they decided that the initial funding was sufficient to hire not only an ED, but also a CTO! Andrew Boardman joined as ED and Emily Pillmore stayed on in a permanent role as CTO. In its first meeting, the Board elected Richard Eisenberg as its chair. The work of the Haskell Foundation proper is finally underway. This includes developing a process for community input, identifying projects to fund and otherwise support, and continuing outreach efforts. We expect that the creation of the Foundation will mark an inflection point in the history of Haskell. It is extremely gratifying to have played a role in helping a community that has given so much to us personally. We’re eager to see where it will go from here. Finally, the Haskell Foundation needs volunteers, people just like you. The best place to reach the Foundation is via the Haskell Foundation category at the Haskell Discourse instance, though you can reach out directly to the Board or its chair, Andrew the ED, or Emily the CTO. ## March 25, 2021 ### Cartesian Closed Comic # Sprawdź nasze filmy w kategoriach Sposobów na spędzanie wolnego czasu jest bez wątpienia wiele. Jednakże są wśród nich takie metody, które okazują się być nie tylko najpopularniejsze, ale również ponadczasowe. Mowa tu oczywiście o seansach domowych, a więc oglądaniu filmów online, w swoim własnym zaciszu domowym. Takie rozwiązanie jest korzystne szczególnie dla tych, którzy cenią sobie ogólny komfort i swobodę. Nie da się ukryć, że obejrzenie filmu na sali kinowej w towarzystwie wielu osób, a oglądanie go w swoich przysłowiowych czterech kątach w pojedynkę bądź wraz z naszymi najbliższymi to zdecydowanie duża różnica. Oprócz tego poszczególne platformy online, na których możliwe jest obejrzenie różnorodnych filmów mają naprawdę szeroki wybór. Jednym z takich stron internetowych jest właśnie zalukaj. ## Dlaczego warto korzystać z zalukaj? Przede wszystkim warto wiedzieć, że zalukaj to strona, na której masz możliwość nie tylko przejrzeć różne kategorie filmowe i propozycje filmów (których znajdziesz całe mnóstwo) ale również możesz obejrzeć taki film. Tak jak zostało już wspomniane, kategorii filmowych jest na zalukaj naprawdę wiele. W związku z tym możliwe jest obejrzenie zarówno takich propozycji jak Dramat, Kryminał, Thriller, Fantasy, Komedia, Sci-Fi jak i także film Przygodowy, Animacja, Familijny, Komedia rom., Sztuki walki, czy też Melodramat. ## Co jeszcze warto wiedzieć o zalukaj? Nie da się nie zauważyć, że w ostatnim czasie wzrosła popularność filmów online. Bez żadnych wątpliwości można stwierdzić, że do największych zalet oglądania filmów dostępnych w Internecie jest fakt, iż takie filmy możemy oglądać w dogodnym dla siebie miejscu i czasie. Dlatego też najciekawsze filmy które nas intrygują bądź też wielokrotnie już obejrzane klasyki, które pragniemy sobie odtworzyć, możemy oglądać zarówno z łóżka, ulubionego fotela, u znajomych czy nawet podczas podróży. Niezależnie więc od tego, czy preferujesz taki gatunek jak film Kostiumowy, Akcja, Historyczny, Romans, Krótkometrażowy, Horror, Western bądź też chociażby Biograficzny, Polityczny, Wojenny, Muzyczny, Psychologiczny, Dokumentalny lub Czarna komedia czy Biblijny - każdy z tych gatunków jest z pewnością warty obejrzenia. Warto także wiedzieć, że na stronie internetowej zalukaj istnieje możliwość założenia swojego konta lub konta vip, aby być ze wszystkim na bieżąco i korzystać z jeszcze większej ilości ofert. ### Tweag I/O # How to build hybrid Haskell and Java programs When working in a sufficiently large Haskell project, one eventually faces the question of whether to implement missing libraries in Haskell or borrow the implementation from another language, such as Java. For a long time now, it has been possible to combine Haskell and Java programs using inline-java. Calling Java functions from Haskell is easy enough. import Language.Java (J, JType(Class)) -- from jvm import Language.Java.Inline (imports, java) -- from inline-java imports "org.apache.commons.collections4.map.*" createOrderedMap :: IO (J ('Class "org.apache.commons.collections4.OrderedMap")) createOrderedMap = [java| new LinkedMap() |] The ghc and javac compilers can cooperate to build this program provided that both know where to find the program’s dependencies. Back in the day, when inline-java was being born, there was no build tool capable of pulling the dependencies of both Haskell and Java, or at least not without additional customization. Since then, however, Tweag has invested effort into enabling Bazel, a polyglot build system, to build Haskell programs. In this blog post we go over the problems of integrating build tools designed for different languages, and how they can be addressed with Bazel, as an example of a single tool that builds them all. More specifically, this post also serves as a tutorial for using inline-java with Bazel, which is a requirement for the latest inline-java release. ## Dependencies done the hard way Suppose we rely on cabal-install or stack to install the Haskell dependencies. This would make the inline-java and jvm packages available. But these tools are specialized to build Haskell packages. If our program also depends on the commons-collections4 Java package, we can’t rely on Cabal to build it. We need help from some other Java-specific package manager. We could rely on maven or gradle to install common-collections4. At that point we can build our project by invoking ghc and telling javac where to find the java dependencies in the system via environment variables (i.e. CLASSPATH). With some extra work, we could even coordinate the build systems so one calls to the other to collect all the necessary dependencies and invoke the compilers. This is, in fact, what inline-java did until recently. But there is a severe limitation to this approach: no build system can track changes to files in the jurisdiction of the other build system. The Cabal-based build system is not going to notice if we change the gradle configuration to build a different version of common-collections4, or if we change a source file on the Java side. Easy enough, we could run gradle every time we want to rebuild our program, just in case something changed in the Java side. But then, should the Haskell build system rebuild the Haskell side? Or should it reuse the artifacts produced in an earlier build? We could continue to extend the integration between build systems so one can detect if the artifacts produced by the other have changed. Unfortunately, this is a non-trivial task and leads to reimplementing features that build systems already implement for their respective languages. We would be responsible for detecting changes on every dependency crossing the language boundary. If that didn’t sound bad enough, incremental builds is not the only concern requiring coordination. Running tests, building deployable artifacts, remote builds and caches, also involve both build systems. ## Dependencies with a polyglot build system A straightforward answer is to use only one build system to build all languages in the project. We chose to turn to Bazel for our building needs. Bazel lets us express dependencies between artifacts written in various languages. In this respect, it is similar to make. However, Bazel comes equipped with sets of rules, such as rules_haskell, for many languages, which know how to invoke compilers and linkers; these rules are distributed as libraries and are reusable across projects. With make, the user must manually encode all of this knowledge in a Makefile herself. It’s not the subject of this blog post, but Bazel comes with a number of other perks, such as hermeticity of builds for reproducibility, distributed builds, and remote caching. We offer a working example in the inline-java repository. In order to specify how to build our Haskell program, we start by importing a rule to build Haskell binaries from rules_haskell. # file: BUILD.bazel load( "@rules_haskell//haskell:defs.bzl", "haskell_binary", ) haskell_binary( name = "example", srcs = ['Main.hs'], extra_srcs = ["@openjdk//:rpath"], compiler_flags = [ "-optl-Wl,@$(location @openjdk//:libjvm.so)",
],
deps = [
"//jvm",
"//jni",
"//:inline-java",
"@stackage//:base",
"@stackage//:text",
] + java_deps,
data = [ ":jar_deploy.jar" ],
plugins = ["//:inline-java-plugin"],
)

java_deps = [
"@maven//:org_apache_commons_commons_collections4_4_1",
]

The load instruction is all we need to do to invoke the rules_haskell library and access its various rules. Here we use the haskell_binary rule to build our hybrid program.

Besides the fact that Main.hs is written in both Haskell and Java, one can tell the hybrid nature of the artifact by observing the dependencies in the deps attribute, which refers to both Haskell and Java libraries.

• //jvm, //jni and //:inline-java refer to Haskell libraries implemented in the current repository
• @rules_haskell//tools/runfiles refers to a special Haskell library defined as part of the rules_haskell rule set. More on this below.
• @stackage//:base and @stackage//:text refer to Haskell packages coming from stackage
• @maven//:org_apache_commons_commons_collections4_4_1 is a Java library coming from a maven repository

Additional configuration in the WORKSPACE file makes precise the location of dependencies coming from rules_haskell, stackage and maven, where the stackage snapshot and the list of maven repositories is specified. More information on the integration with stackage can be found in an earlier post and with maven in the documentation of rules_jvm_external.

While the dependencies in deps are made available at build time, the attribute data = [":jar_deploy.jar"] declares a runtime dependency. Specifically, it makes the Java artifacts available to the Haskell program. The ":jar_deploy.jar" target is a jar file produced with the following rule from the Java rule set. A future version of rules_haskell may generate this runtime dependency automatically, but for the time being we need to add it manually.

java_binary(
name = "jar",
main_class = "dummy",
runtime_deps = java_deps,
)

Now, when there are changes in the Java dependencies, Bazel will know to rebuild the Haskell artifacts, and only if there were changes.

Bazel makes sure that jar_deploy.jar is built, and stores it in an appropriate location. But nothing, so far, tells the Haskell program where to find this file. This is where the runfiles library comes into play. Bazel lays out runtime dependencies, such as jar_deploy.jar following known rules; the runfiles library abstracts over these rules. To complete our example, we need the main function to call to the runfiles library and discover the jar file’s location.

import qualified Bazel.Runfiles as Runfiles
import Data.String (fromString)
import Language.Java (J, JType(Class), withJVM)
...

main = do
r <- Runfiles.create
let jarPath = Runfiles.rlocation r "my_workspace/example/jar_deploy.jar"
(LK _) -> (strokeLoop $glueLine$ fromOffsets $tileEdges c) # fc kcol _ -> mempty To show half tiles separately, we will use drawJComp which adds the join edge of each half tile to make loops with 3 edges. drawJComp:: Component -> Diagram B drawJComp = strokeLoop . closeLine . fromOffsets . compEdges Apart from drawing, we can also scale a component scaleComp:: Double -> Component -> Component scaleComp = compMap . scale  where compMap applies a function to the vector of a component compMap f (LD v) = LD (f v) compMap f (RD v) = RD (f v) compMap f (LK v) = LK (f v) compMap f (RK v) = RK (f v) and rotate a component by an angle rotateComp :: Angle Double -> Component -> Component rotateComp = compMap . rotate  (Positive rotations are in the anticlockwise direction.) ## Patches A patch is a list of components each with an offset represented by a vector type Patch = [(V2 Double, Component)] To turn a whole patch into a diagram using some function cd for drawing the components, we use patchWith cd patch = position$ fmap offset patch
where offset (v,c) = (origin .+^ v, cd c)

Here the offset vector is used to calculate a point from the origin for each component to be positioned and drawn (with cd).

The common special case drawPatch uses drawComp on each component

drawPatch = patchWith drawComp

Other operations on patches include scaling a patch

scalePatch :: Double -> Patch -> Patch
scalePatch r = fmap (\(v,c) -> (scale r v, scaleComp r c))

and rotating a patch by an angle

rotatePatch :: Angle Double -> Patch -> Patch
rotatePatch a = fmap (\(v,c) -> (rotate a v, rotateComp a c))

To move a patch by a vector, we simply add the vector to each offset vector

translatePatch:: V2 Double -> Patch -> Patch
translatePatch v0 = fmap (\(v,c) -> (v0 ^+^ v, c))

As an aid to creating patches with 5-fold rotational symmetry, we combine 5 copies of a basic patch (rotated by multiples of ttangle 2 successively).

penta:: Patch -> Patch
penta p = concatMap copy [0..4]
where copy n = rotatePatch (ttangle (2*n)) p

This must be used with care to avoid nonsense patches. But two special cases are

sun =  penta [(zero, rkite), (zero, lkite)]
star = penta [(zero, rdart), (zero, ldart)]

This figure shows some example patches, drawn with drawPatch The first is a star and the second is a sun.

The tools so far for creating patches may seem limited (and do not help with ensuring legal tilings), but there is an even bigger problem.

## Correct Tilings

Unfortunately, correct tilings – that is, tilings which can be extended to infinity – are not as simple as just legal tilings. It is not enough to have a legal tiling, because an apparent (legal) choice of placing one tile can have non-local consequences, causing a conflict with a choice made far away in a patch of tiles, resulting in a patch which cannot be extended. This suggests that constructing correct patches is far from trivial.

The infinite number of possible infinite tilings do have some remarkable properties. Any finite patch from one of them, will occur in all the others (infinitely many times) and within a relatively small radius of any point in an infinite tiling. (For details of this see links at the end)

This is why we need a different approach to constructing larger patches. There are two significant processes used for creating patches, namely inflate (also called compose) and decompose.

To understand these processes, take a look at the following figure.

Here the small components have been drawn in an unusual way. The edges have been drawn with dashed lines, but long edges of kites have been emphasised with a solid line and the join edges of darts marked with a red line. From this you may be able to make out a patch of larger scale kites and darts. This is an inflated patch arising from the smaller scale patch. Conversely, the larger kites and darts decompose to the smaller scale ones.

## Decomposition

Since the rule for decomposition is uniquely determined, we can express it as a simple function on patches.

decompose :: Patch -> Patch
decompose = concatMap decompC

where the function decompC acts on components and produces a list of the smaller components contained in the component. For example, a larger right dart will produce both a smaller right dart and a smaller left kite. Decomposing a component with offset v also takes care of the positioning, scale and rotation of the new components.

decompC (v, RD vd) = [(v,        LK vd  )
,(v ^+^ v', RD vd' )
]  where v'  = phi*^rotate (ttangle 1) vd
vd' = (2-phi) *^ (zero ^-^ v')
-- (2-phi) = 1/phi^2

decompC (v, LD vd) = [(v,        RK vd  )
,(v ^+^ v', LD vd' )
]  where v'  = phi*^rotate (ttangle 9) vd
vd' = (2-phi) *^ (zero ^-^ v')
-- (2-phi) = 1/phi^2

decompC (v, RK vk) = [(v,        RD vd' )
,(v ^+^ v', LK vk' )
,(v ^+^ v', RK vk' )
] where v'  = rotate (ttangle 9) vk
vd' = (2-phi) *^ v'
-- v'/phi^2
vk' = ((phi-1) *^ vk) ^-^ v'
-- (phi-1) = 1/phi

decompC (v, LK vk) = [(v,        LD vd' )
,(v ^+^ v', RK vk' )
,(v ^+^ v', LK vk' )
] where v'  = rotate (ttangle 1) vk
vd' = (2-phi) *^ v'
-- v'/phi^2
vk' = ((phi-1) *^ vk) ^-^ v'
-- (phi-1) = 1/phi

This is illustrated in the following figure for the cases of a right dart and a right kite.

The symmetric diagrams for left components are easy to work out from these, so they are not illustrated.

With the decompose operation we can start with a simple correct patch, and decompose repeatedly to get more and more detailed patches. (Each decomposition scales the tiles down by a factor of $1/phi$ but we can rescale at any time.)

This figure illustrates how each component decomposes with 4 decomposition steps below each one.

components =  [ldart, rdart, lkite, rkite]
fourDecomps = hsep 1 $fmap decomps components # lw thin where decomps cpt = vsep 1$
fmap drawPatch $take 5$ decompositions [(zero,cpt)] 

We have made use of the fact that we can create an infinite list of finer and finer decompositions of any patch, using:

decompositions:: Patch -> [Patch]
decompositions p = inf where inf = p:fmap decompose inf

We could get the n-fold decomposition of a patch as just the nth item in such a list

multiDecomp :: Int -> Patch -> Patch
multiDecomp n p = decompositions p !! n

For example, here is an infinite list of decomposed versions of sun.

suns = decompositions sun

The coloured tiling shown at the beginning is simply 6 decompositions of sun displayed using fillDK

sun6 = suns!!6
filledSun6 = patchWith (fillDK red blue) sun6 # lw ultraThin

The earlier figure illustrating larger kites and darts emphasised from the smaller ones is also sun6 but this time drawn with

experimentFig = patchWith experiment sun6 # lw thin

where components are drawn with

experiment:: Component -> Diagram B
experiment c = emph c <> (drawComp c # dashingN [0.002,0.002] 0 # lw ultraThin)
where emph c = case c of
(LD v) -> (strokeLine . fromOffsets) [v] # lc red
-- emphasise join edge of darts in red
(RD v) -> (strokeLine . fromOffsets) [v] # lc red
(LK v) -> (strokeLine . fromOffsets) [rotate (ttangle 1) v]
-- emphasise long edge for kites
(RK v) -> (strokeLine . fromOffsets) [rotate (ttangle 9) v]

## Inflation

You might expect inflation (also called composition) to be a kind of inverse to decomposition, but it is a bit more complicated than that. With our current representation of components, we can only inflate single components. This amounts to embedding the component into a larger component that matches how the larger component decomposes. There is thus a choice at each inflation step as to which of several possibilities we select as the larger half-tile. We represent this choice as a list of alternatives. This list should not be confused with a patch. It only makes sense to select one of the alternatives giving a new single component.

The earlier diagram illustrating how decompositions are calculated also shows the two choices for inflating a right dart into either a right kite or a larger right dart. There will be two symmetric choices for a left dart, and three choices for left and right kites.

Once again we work with offsets so we can calculate the offset for any resulting component (to ensure the larger component contains the original in its original position in a decomposition).

inflate :: (V2 Double, Component) -> [(V2 Double, Component)]
inflate (v, RD vd) = [(v ^+^ v', RD vd')
,(v,        RK vk )
] where v'  = (phi+1) *^ vd
-- vd*phi^2
vd' = rotate (ttangle 9) (vd ^-^ v')
vk  = rotate (ttangle 1) v'

inflate (v, LD vd) = [(v ^+^ v', LD vd')
,(v,        LK vk )
] where v'  = (phi+1) *^ vd
-- vd*phi^2
vd' = rotate (ttangle 1) (vd ^-^ v')
vk  = rotate (ttangle 9) v'

inflate (v, RK vk) = [(v,         LD vk  )
,(v ^+^ lv', LK lvk')
,(v ^+^ rv', RK rvk')
] where lv'  = phi*^rotate (ttangle 9) vk
rv'  = phi*^rotate (ttangle 1) vk
rvk' = phi*^rotate (ttangle 7) vk
lvk' = phi*^rotate (ttangle 3) vk

inflate (v, LK vk) = [(v,         RD vk  )
,(v ^+^ rv', RK rvk')
,(v ^+^ lv', LK lvk')
] where v0 = rotate (ttangle 1) vk
lv'  = phi*^rotate (ttangle 9) vk
rv'  = phi*^rotate (ttangle 1) vk
rvk' = phi*^rotate (ttangle 7) vk
lvk' = phi*^rotate (ttangle 3) vk

As the result is a list of alternatives, we need to select one to do further inflations. We can express all the alternatives after n steps as inflations n where

inflations :: Int -> (V2 Double, Component) -> [(V2 Double, Component)]
inflations 0 vc = [vc]
inflations n vc = [vc'' | vc' <- inflate vc, vc'' <- inflations (n-1) vc']

(That last line could be written more succinctly with do notation)

This figure illustrates 5 consecutive choices for inflating a left dart to produce a left kite. On the left, the finishing component is shown with the starting component embedded, and on the right the 5-fold decomposition of the result is shown.

fiveInflate = hsep 1 $fmap drawPatch [[ld,lk'], multiDecomp 5 [lk']] -- two seperate patches where ld = (zero,ldart) lk = inflate ld !!1 rk = inflate lk !!1 rk' = inflate rk !!2 ld' = inflate rk' !!0 lk' = inflate ld' !!1 Finally, at the end of this literate haskell program we choose which figure to draw as output. fig::Diagram B fig = filledSun6 main = mainWith fig That’s it. But, What about inflating whole patches?, I hear you ask. Unfortunately we need to answer questions like what components are adjacent to a component in a patch and whether there is a corresponding other half for a component. These cannot be done with our simple vector representations. We would need some form of planar graph representation, which is much more involved. That is another story. Many thanks to Stephen Huggett for his inspirations concerning the tilings. A library version of the above code is available on GitHub ## Further reading on Penrose Tilings As well as the Wikipedia entry Penrose Tilings I recommend two articles in Scientific American from 2005 by David Austin Penrose Tiles Talk Across Miles and Penrose Tilings Tied up in Ribbons. There is also a very interesting article by Roger Penrose himself: Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989. More information about the diagrams package can be found from the home page Haskell diagrams ### Well-Typed.Com # The fundamental problem solved by blockchain This post originally appeared on Edsko’s personal blog. There are many blog posts and descriptions of blockchain technology on the web, but I feel that very few of them get to the heart of the matter: what is the fundamental problem that blockchains solve? In this blog post I want to take a look at this question; this blog post is intended for a general audience and I will assume no technical knowledge. That does not mean however that we will need to dumb things down; we will see what the fundamental problem solved by blockchain is, and we will even see two alternative ways in which this problem is solved, known as proof-of-work (used by for example Bitcoin) and proof-of-stake (used by for example the Cardano blockchain). Full disclosure: I was one of the lead engineers on the Cardano blockchain, where I worked on the wallet specification, researched coin selection, designed the architecture of the consensus layer and wrote the hard fork combinator. This blog post is however just my personal perspective. ## Motivation Suppose a group of people want to share information with each other. The nature of that information does not really matter; for the sake of having an example, let’s say it’s a group of companies who want to share supply availability, purchase orders, etc. How might we go about setting up a system to make this possible? ### Bulletin board As a starting point, perhaps we could just set up a server where everybody can post messages and everybody can see everybody else’s messages. For reasons that will become clear later, we will refer to the messages that users post as blocks: users can post arbitrary blocks of data.1 Suppose we have some suppliers A, B, C, ..; they might be posting blocks (messages) such as this:  Block contents Number 0 A has 1000 sprockets available 1 B has 500 springs available 2 A looking for a supplier of 300 springs 3 C agrees to buy 100 sprockets from A at €2/sprocket (Block numbers added just for clarity, they are not part of the block itself.) Systems like this are basically just some kind of bulletin board and have been around in digital form since before Internet times, and obviously in non-digital form basically forever. ### Signatures Naturally, the problem with this system is that nothing is preventing people from posting messages that aren’t true. For the system to be useful as a supply chain system, suppliers must be able to depend on the messages that get posted. An important and obvious first step towards addressing this problem is to require messages to be signed. Digital signatures are a well-understood problem; the exact details of how it’s done don’t really matter. With the addition of signatures, the server might now look something like this: Block contents Signed by A has 1000 sprockets available A B has 500 springs available B A looking for a supplier of 300 springs A C agrees to buy 100 sprockets from A at €2/sprocket A, C ### Replay attacks The addition of signatures is a big improvement, but it’s not yet sufficient. For example, perhaps A agreed to sell 100 sprockets to C at €2/sprocket only as a once-off; maybe it was some kind of special introductory price. The fact that A agreed to do this once does not necessarily mean that A will agree to do it again, but in our system so far nothing is stopping C from reposting the message again: Block contents Signed by A has 1000 sprockets available A B has 500 springs available B A looking for a supplier of 300 springs A C agrees to buy 100 sprockets from A at €2/sprocket A, C other messages C agrees to buy 100 sprockets from A at €2/sprocket A, C After all, the message is signed by both A and C and so it looks legit. This is known as a replay attack, and there are many ways to solve it. Here we will consider just one way: we will introduce an explicit ordering; as will see, this will be important later as well. To introduce an ordering, each block will record the hash of its predecessor block. You can think of a hash of a block as a mathematical summary of the block: whenever the block changes, so does the summary. Hash Pred Block contents Signed by 5195 none A has 1000 sprockets available A 5843 5195 B has 500 springs available B 2874 5843 A looking for a supplier of 300 springs A 9325 2874 C agrees to buy 100 sprockets from A at €2/sprocket A, C Crucially, the signature also covers the block’s predecessor; this is what is preventing the replay attack: C cannot post the same message again, because if they did, that new block must have a new predecessor, and that would then require a new signature: Hash Pred Block contents Signed by 5195 none A has 1000 sprockets available A 5843 5195 B has 500 springs available B 2874 5843 A looking for a supplier of 300 springs A 9325 2874 C agrees to buy 100 sprockets from A at €2/sprocket A, C 9325 other messages 6839 7527 6839 C agrees to buy 100 sprockets from A at €2/sprocket ??? Notice how this is creating a chain of blocks: every block linked to its predecessor. In other words, a block chain; although what we have described so far isn’t really a blockchain in the normal sense of the word just yet. ### Eliminating the central server So far all messages have been stored on some kind of central server. One worry we might have here is what happens when that server goes down, suffers a hard disk failure, etc. However, server replication, data backup, etc. are again well-understood problems with known solutions; that won’t concern us here. A more important question is: what happens if we cannot trust the central server? First, notice what the server can not do: since messages are signed by users, and the server cannot recreate those signatures, it is not possible for the server to forge new messages. The server can however delete messages, or simply not show all messages to all users. Let’s first consider why the server might even want to try. Suppose the server currently has the following blocks: Hash Pred Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 8861 2547 A agrees to sell 800 sprockets to B A In this example, A declares to have a sprocket supply, and both B and C have ratified the message. Now when A agrees to sell sprockets to B, B knows that A actually has these sprockets available, and so might transfer the money for the sprockets to A. If A tries to sell the sprockets to both B and C, C will know something is amiss: Hash Pred Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 8861 2547 A agrees to sell 800 sprockets to B A 1010 8861 A agrees to sell 800 sprockets to C A If A only has 1000 sprockets, they can’t be selling 800 sprockets to both B and C. But suppose the server is in cahoots with A, and simply doesn’t show the transaction with B to C; in other words, the chain that the server might present to C could be Hash Pred Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 6079 2547 A agrees to sell 800 sprockets to C A Now C doesn’t know that anything is wrong. This way A can basically sell their stock to as many people as they want, collect the money, and disappear. This is known as a double spending attack, and we cannot fix it if we stick with a central server: after all, the chains that B and C see are both perfectly reasonable and plausible; the attack consists of showing a different chain to different users, even though the chain that each user sees is itself valid. As we will see, this is the problem that the blockchain solves. ### Revising history Before we get to the real blockchain, however, we should discuss one more thing. Consider again the example from the previous section, where the server is untrustworthy, and is cooperating with A to dupe both B and C. Unlike in the previous example, however, suppose that the transaction with A was not the last transaction on the chain; perhaps the server looks like this instead: Hash Pred Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 8861 2547 A agrees to sell 800 sprockets to B A 3598 8861 Some other message D If the server now wants to present an alternative chain to C, in which A is selling the sprockets to C instead of to B, it cannot include the message signed by D. The message signed by D records the hash of its predecessor; if the server changes that predecessor (A selling to C instead of A selling to B), it would have a different hash, and would therefore require a new signature: Hash Pred Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 6079 2547 A agrees to sell 800 sprockets to C A 4592 6079 Some other message ??? However, this unfortunately does not help us. First of all, notice that the server could simply truncate the chain and not include the message signed by D when presenting the chain to C. Moreover, even when the chain does include additional messages, messages earlier on the chain are still not trustworthy. The problem is that we don’t know who D is. Perhaps D is the server itself; if so, then the existence of a message signed by D tells us nothing. Indeed, in the final example above, the server could just construct a new signature in the place of the “???”. ## Proof of work: solving puzzles If we cannot trust the central server, perhaps we can simply allow everyone to produce blocks. Everyone keeps a copy of the block chain, and when someone wants to add a message to the chain they just create it themselves and send it everybody else. If two people happen to create a block at the same time, no problem: we just pick one of the two blocks, and whoever “lost” will just create a new block on top of the block that got adopted, no big deal. This by itself does not solve anything just yet. A could just take the block that contains the transaction with B and send it to B but not to C, and vice versa. Just like before, even if B and C would wait for additional blocks to be added on top of the block they just received, they have no guarantee that those blocks are not being produced by nodes that are cooperating with A. To even have a chance of having a way out of this impasse we will need to impose an assumption: although some participants may be untrustworthy, we will assume that the majority is honest. In the previous section we saw that once the chain includes an honest block, untrustworthy participants can no longer revise the blockchain’s history. Therefore, if the blockchain is operating normally, every new block that comes in increases the likelihood that the chain contains an honest block. After a while the probability that the most recent k blocks on the chain (for some number k) does not contain any honest block is so small that any information that is older than k can be considered trustworthy.2 Unfortunately, while the assumption that the majority of the participants is honest may seem reasonable, that does not mean that we can assume that the majority of the blocks we receive are produced by honest participants. After all, nothing is stopping untrustworthy participants from producing lots and lots of blocks. Unless… we introduce something that does stop that. ### Puzzles This was the genius insight behind Bitcoin: we can translate the assumption “the majority of participants is honest” to “the majority of blocks is honest” if blocks are costly to produce. In order to produce a block, the block must solve a particular mathematical puzzle. It does not really matter what the puzzle is exactly, except that (1) it takes a computer a long time to solve, and (2) we can verify that the solution is correct very quickly. The answer to this puzzle is called proof of work (proof that you did the work that makes a block computationally expensive to produce), and must be included in every block: Hash Pred Proof of work Block contents Signed by 2547 none A has 1000 sprockets available A, B, C 8861 2547 A agrees to sell 800 sprockets to B A 3598 8861 Some other message D Notice how this prevents dishonest nodes from producing lots of different blocks: they would have to solve lots of different puzzles, which would simply be impossible in the time available. We do have to make our “honest majority” assumption a bit more precise for this to be viable: we must assume that the honest participants have more puzzle solving power than the dishonest participants; roughly speaking, the honest participants together must have more powerful computers than the dishonest participants. This means that provided the honest participants are producing blocks, the blocks we receive over the network have at most a 50% probability of being produced by dishonest participants, and therefore the probability that no block is produced by an honest particant diminishes quickly; after 100 blocks that probability is roughly .000000000000000000000000000001. ### Chain selection We mentioned above that it’s possible that two participants produce a block at the same time; when this happens, we just pick one arbitrarily, and continue; the participant who “lost” just produces a new block on top of the block that got picked, and we continue. It is possible that we discover this clash a bit later. For example, suppose B and C produce a block at the same time, D only sees the block produced by B and adds their block on top of Bs block:  5285 none .. A 7115 5285 .. A 1413 7115 .. B 5980 7115 .. C 6679 1413 .. D This is known as a fork in the blockchain. If all nodes are honest, how we resolve the fork does not actually matter terribly: as long as everybody applies the same rule, we end up with the same chain, and the blocks that “lost” are just reconstructed on the new fork. However, we want to throw away as little work as possible, and so we will opt to select the longest chain. More importantly, if the fork was intentionally produced by a dishonest participant, then picking the longer fork also gives us a higher probability of picking the fork with more blocks produced by honest participants: after all, in order to produce the longer fork, more puzzles had to be solved. We say that the difficulty of producing a chain increases with its length. ## Proof of stake Although the introduction of the puzzles was a genius insight, it does have one very important drawback: it is extremely wasteful. A study in 2014 found that the global production of blocks on the Bitcoin block chain consumed as much energy as the entire country of Ireland.3 In a world where energy consumption is one of the greatest challenges to face mankind, this is obviously not a sustainable solution. The key insight is that the important feature of proof-of-work is not necessarily that it is costly to produce blocks, but rather that we tie block production to some kind of resource of which participants have a limited amount. In proof of work that resource happens to puzzle solving (compute) power, but other choices are possible as well. In particular, most blockchains themselves are recording some kind of resource that is tied to a real-world resource: if the blockchain is supporting a cryptocurrency, then the amount of crypto someone owns can be considered their resource; for our running example, we could consider the amount of supplies that someone offers on the blockchain to be their resource.4 Let’s say someone owns 10% of all resources on the chain (i.e., they own 10% of the cryptocurrency, or in our example, 10% of the available supplies is theirs); we then say that they have a 10% stake. In a proof-of-stake blockchain, the probability that someone is allowed to produce a block is proportional to their stake: users with little stake can only produce a few blocks, users with a lot of stake can produce many. Our “honest majority” assumption now becomes: provided that the honest participants have more than 50% stake in the system, the probability that any block is produced by a dishonest participant will be less than 50%, just as for proof-of-work. ### Leadership schedule There are many ways to construct a proof-of-stake blockchain; what I will describe here is based on the Ouroboros consensus protocol5, the consensus protocol underlying the Cardano blockchain. Here’s how a proof of stake blockchain works. Time is divided into slots; for simplicity, we can think of slots as just being equal to seconds (indeed, on the Cardano chain one slot takes one second), and slots are grouped into epochs; on the Cardano chain, one epoch contains a little under half a million slots, for a total of 5 days.6 At the start of each epoch, we run a lottery for each slot in the next epoch. Most of those slots will be empty7, but for some slots we will assign a slot leader: the participant that is allowed to produce a block for that slot. The details of how this lottery works are not very important for our current purposes, except to say that there are clear rules on how to run the lottery that every participant on the chain will follow and cannot really tamper with. The only way for an dishonest participant to increase how many blocks they can produce is to increase their stake in the system. However, this has real life consequences: depending on how “stake” is defined exactly, it might mean they have to buy cryptocurrency, make more supply stock available, etc. Provided that the real world consequences of increasing stake are more than the potential benefit from attempting an attack on the system, the system is secure. Put another way, participants with the most stake in the system are also most invested in the system: they need the system to succeed and continue to operate. Therefore, they are more trustworthy than nodes with less stake. ### Chain selection Since block production is no longer costly (the whole point was to avoid wasteful energy consumption after all), it is not difficult for dishonest participants to construct long chains. Therefore if we have to choose between two chains, choosing the longer chain may not necessarily be the right choice. Instead, we will look at chain density. Suppose we have to choose between two chains: We will count the number of blocks on each chains in a window of s slots; the exact value of s does not matter so much; let’s say s is 100,000 blocks.8 Recall the shape that the “honest majority” assumption takes in a proof-of-stake blockchain: the honest majority has more than 50% stake in the system. That means that within the window, the chain produced by the honest participants in the system will contain more blocks than the chain produced by dishonest participants. Therefore, whenever there is a fork in the chain, we must pick the chain that is denser (contains more blocks) in a window of slots at the intersection point. This is known as the Ouroboros Genesis rule.9 There is a useful special case for this rule. The mathematical analysis of the Ouroboros protocol10 tells us that when all participants in the network are following the protocol, their chains might be a little different near the tip (because not all forks are resolved yet), but the intersection point between all those chains will be no more than k blocks back. Moreover, blocks for “future” slots (slots that we haven’t reached yet11) are considered invalid. In other words, the situation looks a bit like this: where none of the chains can exceed past “now”. In this case, the densest chain in the window will also be the longest; this means that for nodes that are up to date, the longest chain rule (just like in proof of work) can be used. Indeed, this is the rule in Ouroboros Praos, albeit with a side condition; for the details, please see my chapter “Genesis” in the Cardano Consensus Layer Report. ## Conclusions Allowing a group of people to share information is not a difficult problem to solve; techniques such as digital signatures as well as ways to address replay attacks are well understood problems. Things get a lot trickier however if we drop the assumption that we can trust the server on which the data is stored. Although the server cannot manufacture data, it can selectively omit data. Detecting and guarding against this is very difficult: after all, when the server omits data, it is simply going back to an earlier state of the chain, in which that data was not yet present. Therefore almost by definition the version of the chain with some parts omitted must also be valid. The solution to this is to allow all participants in the network to produce blocks. When we do that, we must have a way to resolving conflicts: when two or more nodes produce a block at the same time, how do we choose? If all participants are honest, we could just choose arbitrarily, but if dishonest participants are present, we have to be careful: if the dishonest participant can produce lots of blocks, then choosing randomly may favour the dishonest participant over honest participants. This then is the fundamental problem solved by blockchain technology: how do we limit the number of blocks that dishonest participants can generate? One option, known as proof-of-work and first introduced by Bitcoin, is to make block production simply costly (require a lot of time for a computer to generate). This works, but is extremely wasteful of energy; Bitcoin is consuming as much energy as a small country. The better alternative is proof-of-stake: the blockchain itself records some finite resource, and participants are allowed to produce blocks in proportion to the amount of stake they have. #### Footnotes 1. For the purposes of this blog post I will ignore the differences between blocks and transactions. Put another way, I will assume every block contains exactly one transaction.↩︎ 2. This number k is known as the security parameter, and is one of the most important parameters in a blockchain.↩︎ 3. Bitcoin Mining and its Energy Footprint, Karl J. O’Dwyer and David Malone. Available online See also Bitcoin’s energy consumption is underestimated: A market dynamics approach by Alex de Vries.↩︎ 4. This only works if the supplies someone records on the blockchain are either validated or listing supplies on the blockchain comes with a legal commitment.↩︎ 5. Ouroboros: A Provably Secure Proof-of-Stake Blockchain Protocol. Aggelos Kiayias, Alexander Russell, Bernardo David, and Roman Oliynykov. Available online↩︎ 6. 432.000 of them, to be precise.↩︎ 7. The proportion of slots we expect to be empty on average is known as the active slot coefficient, and is usually denoted by the variable f.↩︎ 8. There are constraints on the value of s. When s is too small, doing a density comparison is not meaningful, because the probability that adversarial nodes have more than 50% stake within the window by pure luck is too large. When s is too large, the leadership schedule within the window is no longer determined by the ledger state at the intersection point, and so the adversary can start to influence the leadership schedule. In Cardano, s is set to 3k/f slots.↩︎ 9. Ouroboros Genesis: Composable Proof-of-Stake Blockchains with Dynamic Availability. Christian Badertscher, Peter Gaži, Aggelos Kiayias, Alexander Russel and Vassilis Zikas. Available online↩︎ 10. Ouroboros Praos: An adaptively-secure, semi-synchronous proof-of-stake protocol. Bernardo David, Peter Gaži, Aggelos Kiayias and Alexander Russell. Available online↩︎ 11. Although this means that the protocol relies on a shared notion of “time” between all participants (a wallclock), the resolution of this clock does not need to be particularly precise, and standard solutions such as NTP suffice.↩︎ ## March 21, 2021 ### Magnus Therning # Todo items in org-roam, an update I got an email from Mr Z with a nice modification to the code in my post on keeping todo items in org-roam. He already had a bunch of agenda files that he wanted to keep using (I had so few of them that I'd simply converted them to roam files). Here's the solution he shared with me: (defvar roam-extra-original-org-agenda-files nil "Original value of org-agenda-files'.") (defun roam-extra:update-todo-files (&rest _) "Update the value of org-agenda-files'." (unless roam-extra-original-org-agenda-files (setq roam-extra-original-org-agenda-files org-agenda-files)) (setq org-agenda-files (append roam-extra-original-org-agenda-files (roam-extra:todo-files))))  It's a rather nice modification I think. Thanks to Mr Z for agreeing to let me share it here. ## March 20, 2021 ### Magnus Therning # Barbie and KenJSON After higher-kinded data (HKD) and barbies were mentioned in episode 35 of Haskell Weekly I've been wondering if it could be used in combination with aeson to do validation when implementing web services. TLDR; I think it'd work, but I have a feeling I'd have to spend some more time on it to get an API with nice ergonomics. ## Defining a type to play with I opted to use barbies-th to save on the typing a bit. Defining a simple type holding a name and an age can then look like this declareBareB [d| data Person = Person {name :: Text, age :: Int} |] deriving instance Show (Person Covered Identity) deriving instance Show (Person Covered Maybe) deriving instance Show (Person Covered (Either Text))  The two functions from the Barbies module documentation, addDefaults and check, can then be written like this addDefaults :: Person Covered Maybe -> Person Covered Identity -> Person Covered Identity addDefaults = bzipWith trans where trans m d = maybe d pure m check :: Person Covered (Either Text) -> Either [Text] (Person Covered Identity) check pe = case btraverse (either (const Nothing) (Just . Identity)) pe of Just pin -> Right pin Nothing -> Left$ bfoldMap (either (: []) (const [])) pe


I found it straight forward to define some instances and play with those functions a bit.

The bit that wasn't immediately obvious to me was how to use aeson to parse into a type like Person Covered (Either Text).

First off I needed some data to test things out with.

bs0, bs1 :: BSL.ByteString
bs0 = "{\"name\": \"the name\", \"age\": 17}"
bs1 = "{\"name\": \"the name\", \"age\": true}"


To keep things simple I took baby steps, first I tried parsing into Person Covered Identity. It turns out that the FromJSON instance from that doesn't need much thought at all. (It's a bit of a pain to have to specify types in GHCi all the time, so I'm throwing in a specialised decoding function for each type too.)

instance FromJSON (Person Covered Identity) where
parseJSON = withObject "Person" $\o -> Person <$> o .: "name"
<*> o .: "age"

decodePI :: BSL.ByteString -> Maybe (Person Covered Identity)
decodePI = decode


Trying it out on the test data gives the expected results

λ> let i0 = decodePI bs0
λ> i0
Just (Person {name = Identity "the name", age = Identity 17})
λ> let i1 = decodePI bs1
λ> i1
Nothing


So far so good! Moving onto Person Covered Maybe. I spent some time trying to use the combinators in Data.Aeson for dealing with parser failures, but in the end I had to resort to using <|> from Alternative.

instance FromJSON (Person Covered Maybe) where
parseJSON = withObject "Person" $\o -> Person <$> (o .: "name" <|> pure Nothing)
<*> (o .: "age" <|> pure Nothing)

decodePM :: BSL.ByteString -> Maybe (Person Covered Maybe)
decodePM = decode


Trying that out I saw exactly the behaviour I expected, i.e. that parsing won't fail. (Well, at least not as long as it's a valid JSON object to being with.)

λ> let m0 = decodePM bs0
λ> m0
Just (Person {name = Just "the name", age = Just 17})
λ> let m1 = decodePM bs1
λ> m1
Just (Person {name = Just "the name", age = Nothing})


With that done I found that the instance for Person Covered (Either Text) followed quite naturally. I had to spend a little time on getting the types right to parse the fields properly. Somewhat disappointingly I didn't get type errors when the behaviour of the code turned out to be wrong. I'm gussing aeson's Parser was a little too willing to give me parser failures. Anyway, I ended up with this instance

instance FromJSON (Person Covered (Either Text)) where
parseJSON = withObject "Person" $\o -> Person <$> ((Right <$> o .: "name") <|> pure (Left "A name is most needed")) <*> ((Right <$> o .: "age") <|> pure (Left "An integer age is needed"))

decodePE :: BSL.ByteString -> Maybe (Person Covered (Either Text))
decodePE = decode


That does exhibit the behaviour I want

λ> let e0 = decodePE bs0
λ> e0
Just (Person {name = Right "the name", age = Right 17})
λ> let e1 = decodePE bs1
λ> e1
Just (Person {name = Right "the name", age = Left "An integer age is needed"})


## In closing

I think everyone will agree that the FromJSON instances are increasingly messy. I think that can be fixed by putting some thought into what a more pleasing API should look like.

I'd also like to mix in validation beyond what aeson offers out-of-the-box, which really only is "is the field present?" and "does the value have the correct type?". For instance, Once we know there is a field called age, and that it's an Int, then we might want to make sure it's non-negitive, or that the person is at least 18. I'm guessing that wouldn't be too difficult.

Finally, I'd love to see examples of using HKDs for parsing/validation in the wild. It's probably easiest to reach me at @magthe@mastodon.technology.

# [crexbapy] default lambda variable

first, introduce a special keyword LAMBDAVARIABLE so that an expression containing LAMBDAVARIABLE gets rewritten by the compiler to

\x -> expression

where x is a fresh symbol, and every instance of LAMBDAVARIABLE in the expression is replaced by x.  (we have used Haskell syntax for lambda functions.)

for example, (LAMBDAVARIABLE + 10) rewrites to (\x -> x + 10).

the benefits are not having to type "x" that one extra time at the beginning ("Don't Repeat Yourself"), and not having to think of a fresh name x.  the latter may be useful for generated code.

problems:

1. the construct cannot be nested.
2. you do not know that an expression is a lambda function until you've parsed into the expression deeply, encountering the special keyword LAMBDAVARIABLE.
3. does (10 * (5 + LAMBDAVARIABLE)) mean (\x -> 10 * (5+x)) or (10 * (\x -> 5+x))?

solution: augment with another construct LAMBDA(...) which delimits a lambda expression and allows nesting.  for example,

LAMBDA(LAMBDAVARIABLE + LAMBDA(LAMBDAVARIABLE^2)(LAMBDAVARIABLE+1))

rewrites to (\x -> x + (\x -> x^2)(x+1)).  note that the inner x (and inner LAMBDAVARIABLE) masks the outer one via static scoping.  if you need access to an outer variable in nested lambdas, don't use this construct.

Mathematica calls lambda functions pure anonymous functions.  LAMBDAVARIABLE is # and LAMBDA is postfix & .

previously, lambda with more than one variable.  consider keywords BINARYLAMBDA, LAMBDAVARIABLE1, LAMBDAVARIABLE2.  beyond two arguments, it's probably best to force the programmer to explicitly name them.

there's a parallel here with lambda expressions themselves being shortcuts:

(\x -> expression) rewrites to
(let { f x = expression } in f)

with the benefits of not having to think of a fresh identifier f and not having to Repeat Yourself in f.

# Custom monad with servant and throwing errors

In the past I've always used scotty when writing web services. This was mostly due to laziness, I found working out how to use scotty a lot easier than servant, so basically I was being lazy. Fairly quickly I bumped into some limitations in scotty, but at first the workarounds didn't add too much complexity and were acceptable. A few weeks ago they started weighing on me though and I decided to look into servant and since I really liked what I found I've started moving all projects to use servant.

In several of the projects I've used tagless final style and defined a type based on ReaderT holding configuration over IO, that is something like

newtype AppM a = AppM {unAppM : ReaderT Config IO a}
deriving
( Functor,
Applicative,
)

runAppM :: AppM a -> Config -> IO a
runAppM app = runReaderT (unAppM app)


I found that servant is very well suited to this style through hoistServer and there are several examples on how to use it with a ReaderT-based type like above. The first one I found is in the servant cookbook. However, as I realised a bit later, using a simple type like this doesn't make it easy to trigger responses with status other than 200 OK. When I looked at the definition of the type for writing handlers that ships with servant, Handler, I decided to try to use the following type in my service

newtype AppM a = AppM {unAppM : ReaderT Config (ExceptT ServerError IO) a}
deriving
( Functor,
Applicative,
)

runAppM :: AppM a -> Config -> IO (Either ServerError a)
runAppM app = runExceptT . runReaderT (unAppM app)


The natural transformation required by hoistServer can then be written like

nt :: AppM a -> Handler a
nt x =
liftIO (runAppM x cfg) >>= \case
Right v -> pure v
Left err -> throwError err


I particularly like how clearly this suggests a way to add custom errors if I want that.

1. Swap out ServerError for my custom error type in AppM.
2. Write a function to transform my custom error type into a ServerError, transformCustomError :: CustomError -> ServerError.
3. use throwError $transformCustomError err in the Left branch of nt. ## A slight complication with MonadUnliftIO I was using unliftio in my service, and as long as I based my monad stack only on ReaderT that worked fine. I even got the MonadUnliftIO instance for free through automatic deriving. ExceptT isn't a stateless monad though, so using unliftio is out of the question, instead I had to switch to MonadBaseControl and the packages that work with it. Defining and instance of MonadBaseControl looked a bit daunting, but luckily Handler has an instance of it that I used as inspiration. First off MonadBaseControl requires the type to also be an instance of MonadBase. There's an explicit implementation for Handler, but I found that it can be derived automatically, so I took the lazy route. The instance of MonadBaseControl for AppM ended up looking like this instance MonadBaseControl IO AppM where type StM AppM a = Either ServerError a liftBaseWith f = AppM (liftBaseWith (\g -> f (g . unAppM))) restoreM = AppM . restoreM  I can't claim to really understand what's going on in that definition, but I have Alexis King's article on Demystifying MonadBaseControl on my list of things to read. ## March 18, 2021 ### Tweag I/O # Types à la carte in Nickel This post is the third one of a series on Nickel, a new configuration language that I’ve been working on. In this post, I explore the design of Nickel’s type system, which mixes static and dynamic typing, and the reasons behind this choice. 1. Presenting Nickel: better configuration for less 2. Programming with contracts in Nickel 3. Types à la carte in Nickel When other constraints allow it (the right tool for the job and all that), I personally go for a statically typed language whenever I can. But the Nickel language is a tad different, for it is a configuration language. You usually run a terminating program once on fixed inputs to generate a static text file. In this context, any type error will most likely either be triggered at evaluation anyway, typechecker or not, or be irrelevant (dead code). Even more if you add data validation, which typing can seldom totally replace: statically enforcing that a string is a valid URL, for example, would require a powerful type system. If you have to validate anyway, checking that a value is a number at run-time on the other hand is trivial. Nickel also aims at being as interoperable with JSON as possible, and dealing with JSON values in a typed manner may be painful. For all these reasons, being untyped1 in configuration code is appealing. But this is not true of all code. Library code is written to be reused many times in many different settings. Although specialised in configuration, Nickel is a proper programming language, and one of its value propositions is precisely to provide abstractions to avoid repeating yourself. For reusable code, static typing sounds like the natural choice, bringing in all the usual benefits. How to get out of this dilemma? ## Gradual typing Gradual typing reconciles the two belligerents by allowing both typed code and untyped code to coexist. Not only to coexist, but most importantly, to interact. One common use-case of gradual typing is to retrofit static typing on top of an existing dynamically typed language, allowing to gradually — hence the name — type an existing codebase. In the case of Nickel, gradual typing is used on its own, because optional typing makes sense. In both situations, gradual typing provides a formal and principled framework to have both typed and untyped code living in a relative harmony. ## Promises, promises! Since configuration code is to be untyped, and make for the majority of Nickel code, untyped is the default. A basic configuration looks like JSON, up to minor syntactic differences: { host = "google.com", port = 80, protocol = "http", } Typechecking is triggered by a type annotation, introduced by :. Annotations can either be apposed to a variable name or to an expression: let makePort : Str -> Num = fun protocol => if protocol == "http" then 80 else if protocol == "ftp" then 21 else null in let unusedBad = 10 ++ "a" in { port = makePort protocol, protocol = ("ht" ++ "tp" : Str), } In this example, makePort is a function taking a string and returning a number. It is annotated, causing the typechecker to kick in. It makes sure that each sub-expression is well-typed. Notice that subterms don’t need any other annotation: Nickel is able to guess most of the types using unification-based type inference. Such a static type annotation is also called a promise, as you make a firm promise to the typechecker about the type of an expression. Static typechecking ends with makePort, and although unusedBad is clearly ill-typed (concatenating a number and a string), it won’t cause any typechecking error. Can you guess the result of trying to run this program? error: Incompatible types ┌─ repl-input-1:7:5 │ 7 │ null in │ ^^^^ this expression │ = The type of the expression was expected to be Num = The type of the expression was inferred to be Dyn = These types are not compatible The typechecker rightly complains than null is not a number. If we fix this (for now, substituting it with -1), the programs runs correctly: $nickel export <<< ...
{
"port": 80,
"protocol": "http"
}

unusedBad doesn’t cause any error at run-time. Due to laziness, it is never evaluated. If we were to add a type annotation for it though, the typechecker would reject our program.

To recap, the typechecker is a lurker by default, letting us do pretty much what we want. It is triggered by a type annotation exp : Type, in which case it switches on and statically typechecks the expression.

## Who’s to be blamed

So far, so good. Now, consider the following example:

let add : Num -> Num -> Num = fun x y => x + y in
add "a" 0

As far as typechecking goes, only the body of add is to be checked, and it is well-typed. However, add is called with a parameter of the wrong type by an untyped chunk. Without an additional safety mechanism, one would get this runtime type error:

error: Type error
┌─ repl-input-0:1:26
│
1 │ let add : Num -> Num -> Num = fun x y => x + y
│                                              ^ This expression has type Str, but Num was expected
│
[..]

The error first points to a location inside the body of add. It doesn’t feel right, and kinda defeats the purpose of typing: while our function should be guaranteed to be well-behaved, any untyped code calling to it can sneak in ill-typed terms via the parameters. In turn, this raises errors that are located in well behaving code. In this case, which is deliberately trivial, the end of the error message elided as [..] turns out to give us enough information to diagnose the actual issue. This is not necessarily the case for more complex real life functions.

There’s not much we can do statically. The whole point of gradual typing being to accommodate for untyped code, we can’t require the call site to be statically typed. Otherwise, types would contaminate everything and we might as well make our language fully statically typed.

We can do something at run-time, though. Assuming type soundness, no type error should arise in the body of a well-typed function at evaluation. The only sources of type errors are the parameters provided by the caller.

Thus, we just have to control the boundary between typed and untyped blocks by checking the validity of the parameters provided by the caller. If we actually input the previous example in the Nickel REPL, we don’t get the above error message, but this one instead:

nickel> let add : Num -> Num = fun x y => x + y in

error: Blame error: contract broken by the caller.
┌─ :1:8
│
1 │ Num -> Num -> Num
│        --- expected type of the argument provided by the caller
│
┌─ repl-input-6:1:31
│
1 │ let add : Num -> Num -> Num = fun x y => x + y
│                               ^^^^^^^^^^^^^^^^ applied to this expression
│
[..]
note:
┌─ repl-input-7:1:1
│
│ --------- (1) calling <func>
[..]

This error happens before the body of add is even entered. The Nickel interpreter wraps add in a function that first checks the parameters to be of the required type before actually calling add. Sounds familiar? This is exactly what we described in the post on contracts. That is, typed functions are automatically guarded by a higher-order contract. This ensures that type errors are caught before entering well-typed land, which greatly improves error locality.

In summary, to avoid sneaking ill-typed value in well-typed blocks, the Nickel interpreter automatically protects typed functions by inserting appropriate contracts.

## A contract with the devil

We have dealt with the issue of calling typed code from untyped code. A natural follow-up is to examine the dual case: how can one use definitions living in untyped code inside a statically typed context? Consider the following example:

// this example does NOT typecheck
let f = fun x => if x then 10 else "a" in
let doStuffToNum: Num -> Num = fun arg =>
arg + (f true) in

doStuffToNum 1

The typed function doStuffToNum calls to an untyped function f. f true turns out to be a number indeed, but f itself is not well-typed, because the types of the if and the else branch don’t match. No amount of additional type annotations can make this program accepted.

See what happens in practice:

error: Incompatible types
┌─ repl-input-1:3:10
│
3 │   arg + (f true) in
│          ^ this expression
│
= The type of the expression was expected to be _a -> Num
= The type of the expression was inferred to be Dyn
= These types are not compatible

f not being annotated, the typechecker can’t do much better than to give f the dynamic type Dyn (although in some trivial cases, it can infer a better type). Since it was expecting a function returning Num, it complains. It seems we are doomed to restrict our usage of untyped variables to trivial expressions, or to type them all.

Or are we? One more time, contracts come to the rescue. Going back to the post on contracts again, contracts are enforced similarly to types, but using | instead of :. Let us fix our example:

let doStuffToNum: Num -> Num = fun arg =>
arg + (f true | Num) in

We just applied a Num contract to f true, and surprise, this code typechecks! Our typechecker didn’t get magically smarter. By adding this contract check, we ensure the fundamental property that f true will either evaluate to a number or fail at run-time with an appropriate contract error. In particular, no value of type Str, for example, can ever enter our well-typed paradise add. When writing exp | Type in a typed block, two things happen:

1. The typechecker switches back to the default mode inside exp, where it doesn’t enforce anything until the next promise (annotation).
2. The typechecker blindly assumes that the type of exp | Type is Type. Hence, contract checks are also called assume.

Put differently, a contract check is considered a type cast by the static type system, whose correctness check is delayed to run-time.

Behold: this implies that something like (5 | Bool) : Bool typechecks. How outrageous, for a typed functional language proponent. But even languages like Haskell have some side conditions to type soundness: b :: Bool doesn’t guarantee that b evaluate to a boolean, for it can loop, or raise an exception. Minimizing the amount of such possibilities is surely for the better, but the important point remains that b never silently evaluates to a string.

To conclude, we can use contracts as delayed type casts, to make the typechecker accept untyped terms in a typed context. This is useful to import values from untyped code, or to write expressions that we know are correct but that the typechecker wouldn’t accept.

## Conclusion

There is more to say about Nickel’s type system, that features parametric polymorphism or structural typing with row polymorphism for records, to cite some interesting aspects. The purpose of this post is rather to explain the essence of gradual typing, why it makes sense in the context of Nickel, and how it is implemented. We’ve seen that contracts are a fundamental ingredient for the interaction between typed and untyped code in both ways.

1. I will use typed to mean statically typed, and untyped to mean dynamically typed.

# Async Control Flow

This post is an investigation of persistent issue #1199 where an asynchronous exception caused a database connnection to be improperly returned to the pool. The linked issue contains some debugging notes, along with the PR that fixes the problem. While I was able to identify the problem and provide a fix, I don’t really understand what happened - it’s a complex bit of work. So I’m going to write this up as an exploration into the exact code paths that are happening.

# Data.Pool

resource-pool is a how persistent manages concurrent pooling and sharing of database connections. When you create a Pool, you specify how to create resources, destroy them, and then some information around resource management: how long to keep an unused resource open, how many sub-pools to maintain, and how many resources per sub-pool (aka stripe). persistent calls createPool here. The database client libraries provide a LogFunc -> IO SqlBackend that is used to create new database connections, and the close' delegates to the connClose field on the SqlBackend record.

While resource-pool isn’t seeing much maintenance activity, it’s relatively well tested and reliable. Once you’ve got a Pool a from createPool, the recommended way to use it is withResource:

withResource
=> Pool a
-> (a -> m b)
-> m b
withResource pool act = control $\runInIO -> mask$ \restore -> do
(resource, local) <- takeResource pool
ret <- restore (runInIO (act resource)) onException
destroyResource pool local resource
putResource local resource
return ret


# Data.Acquire

Now, in persistent-2.10.5, a new API based on the resourcet package’s Data.Acquire was introduced, and this API became the underlying implementation for the runSqlPool family of functions. The underlying functionality is in the new function unsafeAcquireSqlConnFromPool, which was later factored out into resourcet-pool. This change was introduced because resource-pool operates in MonadBaseControl, which is incompatible with many other monad transformers - specifically, ConduitT. Acquire is based on MonadUnliftIO, which is compatible.

In hindsight, we could have just changed the code to use MonadUnliftIO - it’s relatively straightforward to do. A term with a single constrant like MonadBaseControl IO m => m a can be specialized to IO a, and we can then run that using withRunInIO from unliftio.

toUnliftIO
=> (forall m. MonadBaseControl IO m => m a)
-> n a
toUnliftIO mbc =
withRunInIO $\runInIO -> do mbc toPlainIO :: (forall m. MonadBaseControl IO m => m a) -> IO a toPlainIO mbc = mbc toMonadIO :: MonadIO n => (forall m. MonadBaseControl IO m => m a) -> n a toMonadIO mbc = liftIO (toPlainIO mbc)  # Acquire vs Pool I didn’t realize this at the time, but Data.Acquire is inherently a weaker tool than Data.Pool. Data.Acquire provides a means of creating a new resource, and also freeing it automatically when a scope is exited. Data.Pool keeps track of resources, resource counts, and occasionally destroys them if they’re unsued. So let’s look at our conversion function: unsafeAcquireSqlConnFromPool = do pool <- MonadReader.ask let freeConn :: (backend, LocalPool backend) -> ReleaseType -> IO () freeConn (res, localPool) relType = case relType of ReleaseException -> P.destroyResource pool localPool res _ -> P.putResource localPool res return$ fst <$> mkAcquireType (P.takeResource pool) freeConn  mkAcquireType is analogous to createPool - it creates a handle Acquire a that can be used with a function named with: with :: MonadUnliftIO m => Acquire a -> (a -> m b) -> m b with (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do Allocated x free <- f restore res <- restore (run (g x)) E.onException free ReleaseException free ReleaseNormal return res  with is aliased to withAcquire, which I’ll use from here on out to disambiguate. You may notice that withAcquire and withResource are implemented nearly identically. withResource uses MonadBaseControl and withAcquire uses MonadUnliftIO, and that’s the whole of the difference. They have the same async exception handling with mask and use the same onException functions. All the exception handling stuff is from Control.Exception, so we’re not using UnliftIO.Exception or Control.Monad.Catch or Control.Exception.Safe here. These are really similar. When we look at how the unsafeSqlConnFromPool works, it should provide identical behavior. For free, we case on ReleaseType and do destroyResource on exception and putResource on any other exit. We’re not handling ReleaseEarly specially - this constructor is only used when we use ResourceT’s release function on a value. Using withAcquire, we’ll only ever pass ReleaseNormal and ReleaseException. So this is locally safe. Weirdly enough, resourcet doesn’t really depend on the Acquire type at all, at least not directly - the ReleaseMap type contains a function ReleaseType -> IO () for freeing resources, but doesn’t mention anything else about it. Anyway, let’s get back on track. Since withAcquire and withResource are nearly identical, it may be our translating code that is the problem. We can use algebraic substitution to check this out. Let’s look at mkAcquireType: mkAcquireType :: IO a -- ^ acquire the resource -> (a -> ReleaseType -> IO ()) -- ^ free the resource -> Acquire a mkAcquireType create free = Acquire$ \_ -> do
x <- create
return $! Allocated x (free x)  The ignored parameter in the lambda there is a function that looks like restore - and we’re ignoring it. So, this action gets run when we unpack the Acquire in withAcquire. Let’s plug in our create and free parameters: mkAcquireType :: IO a -- ^ acquire the resource -> (a -> ReleaseType -> IO ()) -- ^ free the resource -> Acquire a mkAcquireType (create = P.takeResource pool) (free = freeConn) = Acquire$ \_ -> do
x <- (P.takeResource pool)
return $! Allocated x (freeConn x) where freeConn (res, localPool) relType = case relType of ReleaseException -> P.destroyResource pool localPool res _ -> P.putResource localPool res  The pool variable is captured in the closure. Now we can look at withAcquire, and plug in our behavior: withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do -- f ignores the restore argument: possible bug? Allocated x free <- f restore -- so x here comes from P.takeResource pool -- free = freeConn ret <- restore (run (g x)) E.onException free ReleaseException free ReleaseNormal return ret  Let’s plug in the specific case for free: withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do -- f ignores the restore argument: possible bug? Allocated x free <- f restore -- so x here comes from P.takeResource pool -- free = freeConn ret <- restore (run (g x)) E.onException P.destroyResource pool localPool res P.putResource localPool res return ret  Closer, closer… Let’s unpack the Allocated stuff: withAcquire (Acquire _) g = withRunInIO$ \run -> E.mask $\restore -> do -- f ignores the restore argument: possible bug? Allocated x free <- f restore -- so x here comes from P.takeResource pool -- free = freeConn ret <- restore (run (g x)) E.onException P.destroyResource pool localPool res P.putResource localPool res return ret where f _ = do x <- (P.takeResource pool) return$! Allocated x (freeConn x)

-- OK, let's splice in the definition of f:

withAcquire (Acquire _) g = withRunInIO $\run -> E.mask$ \restore -> do
-- f ignores the restore argument: possible bug?
Allocated x free <- do
x <- P.takeResource pool
return $! Allocated x (freeConn x) -- so x here comes from P.takeResource pool -- free = freeConn ret <- restore (run (g x)) E.onException P.destroyResource pool localPool res P.putResource localPool res return ret -- Now let's remove the Allocated constructor: withAcquire (Acquire _) g = withRunInIO$ \run -> E.mask $\restore -> do x@(res, localPool) <- P.takeResource pool ret <- restore (run (g x)) E.onException P.destroyResource pool localPool res P.putResource localPool res return ret  With this, we’re now nearly identical with withResource (copied again): withResource :: (MonadBaseControl IO m) => Pool a -> (a -> m b) -> m b withResource pool act = control$ \runInIO -> mask $\restore -> do (resource, local) <- takeResource pool ret <- restore (runInIO (act resource)) onException destroyResource pool local resource putResource local resource return ret  The only difference here is that Acquire also passes the LocalPool to the given action. In the persistent code, we use fmap fst so that it only passes the resource to the callback. So, I’m not sure this function is at fault. Let’s see how we call this function. # What’s that >>= doing there?? acquireSqlConnFromPool is what’s actually called by runSqlPool in this version of the code. acquireSqlConnFromPool :: (MonadReader (Pool backend) m, BackendCompatible SqlBackend backend) => m (Acquire backend) acquireSqlConnFromPool = do connFromPool <- unsafeAcquireSqlConnFromPool return$ connFromPool >>= acquireSqlConn


That >>= is weird. What’s going on here? We have return :: a -> m a, and then f >>= g. f :: Acquire backend - so then g must have the type g :: backend -> Acquire backend, meaning that we’re using the >>= of Acquire a -> (a -> Acquire b) -> Acquire b.

acquireSqlConn cashes out to rawAcquireSqlConn:

rawAcquireSqlConn
:: forall backend m
=> Maybe IsolationLevel -> m (Acquire backend)
rawAcquireSqlConn isolation = do
let rawConn :: SqlBackend
rawConn = projectBackend conn

getter :: T.Text -> IO Statement
getter = getStmtConn rawConn

beginTransaction :: IO backend
beginTransaction = conn <$connBegin rawConn getter isolation finishTransaction :: backend -> ReleaseType -> IO () finishTransaction _ relType = case relType of ReleaseException -> connRollback rawConn getter _ -> connCommit rawConn getter return$ mkAcquireType beginTransaction finishTransaction


So, in the investigation, the exception (libpq: failed (another command is already in progress)) would happen (as best as I can tell) when we try to call connRollback. The problem is somewhere around here.

Um excuse me what? This is also operating in m (Acquire backend), not Acquire backend. How is it possibly being used on the RHS of a >>=?

… Oh, right. Just like MonadBaseControl IO m => m a can be concretized to IO a, we can concretize MonadReader r m => m a to r -> a. So what’s happening here is we’re picking the spcialized type:

rawAcquireSqlConn
:: Maybe IsolationLevel -> backend -> Acquire backend


Wild.

Well, let’s look at >>= for Acquire:

instance Monad Acquire where
return = pure
Acquire f >>= g' = Acquire $\restore -> do Allocated x free1 <- f restore let Acquire g = g' x Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)


Hmm! This smells funny. The problem occurs when we try to roll back the transaction. So let’s apply some more substitution here.

Acquire f contains:

\_ -> do
x <- P.takeResource pool
pure $Allocated x (freeConn x)  And g' contains (simplifying a tiny bit): \sqlBackend -> do Acquire$ \_ -> do
_ <- beginTransaction sqlBackend getter isolation
pure $Allocated sqlBackend$ \case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend


So, we can start inlining.

Acquire $\restore -> do Allocated x free1 <- (\_ -> do x <- P.takeResource pool pure$ Allocated x (freeConn x)) restore

let Acquire g = g' x

Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- (\_ -> x) restore = x Acquire$ \restore -> do
Allocated x free1 <- do
x <- P.takeResource pool
pure $Allocated x (freeConn x) let Acquire g = g' x Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- float c and freeConn up
Acquire $\restore -> do x <- P.takeResource pool let free1 = freeConn x let Acquire g = g' x Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- inline g'
Acquire $\restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x let Acquire g = Acquire$ \_ -> do
_ <- beginTransaction sqlBackend getter isolation
pure $Allocated sqlBackend$ \case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend

Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- Remove Acquire constructor: Acquire$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x

let sqlBackend = x
let g _ = do
_ <- beginTransaction sqlBackend getter isolation
pure $Allocated sqlBackend$ \case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend

Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- Inline g, ignore restore parameter Acquire$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x
let sqlBackend = x

Allocated y free2 <-
(do
_ <- beginTransaction sqlBackend getter isolation
pure $Allocated sqlBackend$ \case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend
) E.onException free1 ReleaseException

return $! Allocated y (\rt -> free2 rt E.finally free1 rt)  Now, this next transformation feels a bit tricky. I’m going to float beginTransaction up and put the E.onException only on it. Note that we’re not actually running the free2 action here - just preparing it. Then I’ll assign it with a let. Acquire$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException free1 ReleaseException
let free2 = \case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend

return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- Inline free1 and free2 Acquire$ \restore -> do
x <- P.takeResource pool
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException freeConn x ReleaseException

return $! Allocated y$ \rt ->
(\case
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend) rt
E.finally
(freeConn x rt)

-- Inline freeConn
Acquire $\restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException P.destroyResource pool localPool sqlBackend return$! Allocated y $\rt -> (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) E.finally do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend  I think it’s important to note that, again we don’t ever actually call restore. So the masking state is inherited and not ever changed. It feels important but I’m not sure if it actually is. Let’s plug this into withAcquire now. withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do Allocated x free <- f restore res <- restore (run (g x)) E.onException free ReleaseException free ReleaseNormal return res -- Inline f. Since restore is never called, we can omit passing it as -- a parameter. withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do Allocated x free <- do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException free1 ReleaseException return$! Allocated x $\rt -> (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) E.finally do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend res <- restore (run (g x)) E.onException free ReleaseException free ReleaseNormal return res -- float x <- P.takeResource pool to the top, and define free using let withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do x <- P.takeResource pool let free1 = freeConn x let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException free1 ReleaseException let free rt = (case rt of ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) E.finally do case rt of ReleaseException -> P.destroyResource pool localPool sqlBackend _ -> P.putResource localPool sqlBackend res <- restore (run (g x)) E.onException free ReleaseException free ReleaseNormal return res -- inline free for each case: withAcquire (Acquire f) g = withRunInIO$ \run -> E.mask $\restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException P.destroyResource pool localPool sqlBackend res <- restore (run (g x)) E.onException do connRollback sqlBackend E.finally P.destroyResource pool localPool sqlBackend do -- ReleaseNormal connCommit sqlBackend E.finally do P.putResource localPool sqlBackend return res  Let’s consider our masking state. We’re masked for everything except for the restoure (run (g x)) call. Including beginning the transaction and committing the transaction. But we can still receive asynchronous exceptions during interruptible operations. Interruptible operations include “anything that can block or perform IO,” which seems very likely to include the Postgres code here. # The Original Let’s compare this with the original code. The original code delegated to runSqlConn after acquiring a SqlBackend from the Pool in MonadUnliftIO. runSqlConn :: (MonadUnliftIO m, BackendCompatible SqlBackend backend) => ReaderT backend m a -> backend -> m a runSqlConn r conn = withRunInIO$ \runInIO -> mask $\restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore$ connBegin conn' getter Nothing
x <- onException
(restore $runInIO$ runReaderT r conn)
(restore $connRollback conn' getter) restore$ connCommit conn' getter
return x


We’ll inline this into runSqlPool, so we’ll now see:

runSqlPool r pconn =
withRunInIO $\run -> withResource pconn$ run . runSqlConn r

-- expand lambda
runSqlPool r pconn =
withRunInIO $\run -> withResource pconn$ \conn ->
run $runSqlConn r conn -- inline runSqlConn runSqlPool r pconn = withRunInIO$ \run ->
withResource pconn $\conn -> run$ withRunInIO $\runInIO -> mask$ \restore -> do
let conn' = projectBackend conn
getter = getStmtConn conn'
restore $connBegin conn' getter Nothing x <- onException (restore$ runInIO $runReaderT r conn) (restore$ connRollback conn' getter)
restore $connCommit conn' getter return x  Kind of a lot of withStuff going on, including two withRunInIOs lol. Let’s make it even worse by inlining withResource: -- abstract action to a variable runSqlPool r pconn = withRunInIO$ \run ->
withResource pconn $\conn -> let act = run$ withRunInIO $\runInIO -> mask$ \restore -> do
let conn' = projectBackend conn
getter = getStmtConn conn'
restore $connBegin conn' getter Nothing x <- onException (restore$ runInIO $runReaderT r conn) (restore$ connRollback conn' getter)
restore $connCommit conn' getter return x in act -- inline withResource runSqlPool r pconn = withRunInIO$ \run ->
-- withResource pconn $\conn -> control$ \runInIO0 ->
mask $\restore0 -> do let act conn = run$ withRunInIO $\runInIO1 -> mask$ \restore1 -> do
let conn' = projectBackend conn
getter = getStmtConn conn'
restore1 $connBegin conn' getter Nothing x <- onException (restore1$ runInIO1 $runReaderT r conn) (restore1$ connRollback conn' getter)
restore1 $connCommit conn' getter return x (resource, local) <- takeResource pool ret <- restore0 (runInIO0 (act resource)) onException destroyResource pool local resource putResource local resource return ret -- inline act runSqlPool r pconn = withRunInIO$ \run ->
-- withResource pconn $\conn -> control$ \runInIO0 ->
mask $\restore0 -> do (resource, local) <- takeResource pool ret <- restore0 (runInIO0 ( run$ withRunInIO $\runInIO1 -> mask$ \restore1 -> do
let conn = resource
let conn' = projectBackend conn
getter = getStmtConn conn'
restore1 $connBegin conn' getter Nothing x <- onException (restore1$ runInIO1 $runReaderT r conn) (restore1$ connRollback conn' getter)
restore1 $connCommit conn' getter return x)) onException destroyResource pool local resource putResource local resource return ret  The restore paratmer in mask doesn’t unmask it completely - it restores the existing masking state before the mask was entered. So mask$ \restore -> mask $\restore -> restore (print 10) doesn’t have print 10 in an unmasked state, but the same mask as before. However, here, we have this pattern: mask$ \restore -> do
restore $do mask$ \restore' -> do
...


Which is interesting!

runSqlPool r pconn =
withRunInIO $\run -> control$ \runInIO0 ->
mask $\restore0 -> do -- Masked (resource, local) <- takeResource pool ret <- restore0 -- Unmasked (runInIO0$ run $withRunInIO$ \runInIO1 ->
mask $\restore1 -> do let conn = resource let conn' = projectBackend conn getter = getStmtConn conn' -- Unmasked restore1$ connBegin conn' getter Nothing
x <- onException
(restore1 $runInIO1$ runReaderT r conn)
(restore1 $connRollback conn' getter) restore1$ do --unmasked
connCommit conn' getter
return x)
onException
destroyResource pool local resource
putResource local resource
return ret


1. takeResource pool
2. onException
3. onException and then destroyResource
4. putResource

1. connBegin
2. r (the action passed to runSqlConn)
3. connRollback
4. connCommit

Let’s compare with withAcquire which was all inlined above:

1. takeResource
2. beginTransaction
3. destroyResource
4. connRollback
5. destroyResource again
6. connCommit
7. putResource
1. run (g x) – the action passed to withAcquire and runSqlConn.

So withAcquire actually has quite a bit more masking going on! Interesting. Remembering, the problem occurs when the thread killed exception happens and the connRollback function is called, causing libpq to die with the “command in progress” error.

So, we throw a killThread at our withAcquire function. It’ll land as soon as we’re unmasked, or an interruptible action occurs. Since almost all of it is masked, we need to determine what the interruptible operations are.

takeResource might be interruptible - it has an STM transaction, which does call retry. I don’t know if any code with retry triggers an interrupt, or if only actually calling retry can trigger an interruptible state. Based on a quick and bewildering look at the GHC source, I think it’s just that retry itself can be interrupted. retry occurs when there are no available entries in the local pool and we’re at max resources for the pool. This is exactly the scenario this test is exercising: a single stripe with a single resource that’s constantly in use.

beginTransaction kicks off an IO action to postgres, so it is almost definitely interruptible. Same for connRollback and connCommit. So the masked-state for these items in withAcquire is probably not a big deal - but we could check by using uninterruptibleMask on them.

# To be continued?

I wish I had a more satisfying conclusion here, but I’m all out of time to write on this for now. Please comment on the relevant GitHub issues if you’re interested or have some insight!

# What is a type?

Posted on March 17, 2021

My fascination for types rivals my inability to define the concept.

Even though I don’t know what a type is, I can still recognize when a paper “is about types”: the paper usually contains many occurrences of formulas of the form “t : T”, where “t” is some piece of code, some program, and “T” is one of these mysterious types. The colon in the middle is of no technical significance, but it signals some cultural awareness on the part of authors.

Hypothesis: Types are a meme.

My experience is also that things are very very bad when “t : T” does not hold. Types are a way to tell right from wrong, for some domain-specific definitions of correctness.

Hypothesis: Types are specifications.

One idea that really narrows it down for me is that programming languages have types. You can assign types to everything in a language: static types to source code, dynamic types to run-time values.

Another way to look at this is to compare with other forms of specifications. How do you prove a specification? A priori, you could use any method, and all you care about is that it is somehow “sound”, but otherwise the proof method is a black box.

• Automated theorem proving, with an SMT solver as a black box.
• Interactive theorem proving, with PhD students as a black box.
• Testing, with the compiler/interpreter as a black box.1

Approaches using “types” seem different. To prove a specification, a typing judgement “t : T”, the way forward is to prove more typing judgements by following the rules of some “type system”. Types tell me both “how to specify things” and “how to verify things”—by specifying and verifying sub-things.

Hypothesis: Types are compositional specifications.

I originally posted this question on the recently created TYPES Zulip server, a spin-off of the TYPES mailing list. Those are nice hangouts for people who like types, whatever you believe they are (and also for getting spammed with Calls-For-Papers).

1. Whereas most formal methods are sound for proving the absence of certain bugs, testing is a sound method of finding bugs.↩︎

# Wealth Shown to Scale and Incarceration in Real Numbers

Wealth Shown to Scale and Incarceration in Real Numbers, two intriguing infographics from Matt Korostoff.

# Using GHC low-latency garbage collection in production

This is a guest post by Domen Kožar.

In this post I’ll dive into how low-latency garbage collection (GC) has improved developer experience for Cachix users.

### The need for low latency

Cachix serves the binary cache protocol for the Nix package manager.

Before Nix builds a package, it will ask the binary cache if it contains the binary for a given package it wants to build. For a typical invocation of Nix there can be hundreds or even thousand of packages that need to be checked. However, if a binary is present in the cache, building the package is no longer required, potentially saving a lot of time and CPU.

It is crucial for the backend of such a binary cache service to respond in a timely manner so that the optimisation of skipping builds actually pays off.

### Monitoring GC pauses

The easiest way to monitor and graph how long GC pauses last is via ekg-statsd, by exposing the rts.gc.gc_wall_ms metric. For Cachix, a typical plot of this metric used to look like this:

From the picture, we can see that under load, we experience GC pause times of up to nearly 800 ms. Having 800 ms pauses stopping the world is far from ideal (I’ve even observed some pauses that last over a second under really heavy load), since the endpoint for checking if a certain binary exists normally takes only about 2–4 ms.

### Switching to the low-latency GC

If you want to try the low-latency GC in your own code, please make sure to use GHC 8.10.3 or later since it fixes a few crashing bugs that you don’t want to encounter. Then, to enable the low-latency GC, append the following flags when invoking an executable built by GHC:

myexecutable +RTS --nonmoving-gc

For Cachix, a typical picture of GC pauses plotted over time then looks as follows:

While this is comparing apples to oranges since the load is not exactly the same between the two pictures, you can see that the distribution is now significantly different. By far the most pauses are now actually in the range of just a few milliseconds.

Unfortunately, non-moving GC can occasionally still cause relatively long pauses in the worst case (measured at 150–200 ms). We believe that this is due to the workload spawning many threads. There is still work to be done to further reduce pause times of the low-latency collector under such circumstances.

The throughput impact of the low-latency GC hasn’t been measured for this case. The response time of a “does this binary exist?” request is still within the 2–4 ms range most of the time.

### Conclusion

• Monitor GC pauses to understand how they impact your application response times.

• Non-moving GC has been running in production for over a month, reducing worst-case response time for a performance-sensitive endpoint without any issues.

• Being able to monitor the total number of threads in the RTS would improve production insights, but that is yet to be implemented.

# The bottom of the Haskell Pyramid

Inspired by Kowainik's Knowledge Map that tries to cover many Haskell concepts a Haskeller will probably run into through their journey, as well as libraries and tools, I've decided to make a more modest list of things one should be familiar with to be productive with Haskell, the bottom of The Haskell Pyramid if you will, hoping it will help to focus learners of the language that might get loss in the vast amount of information.

## Installing and running a Haskell toolchain

This could be a bit overwhelming to beginners as there are many choices, should they install the toolchain from their package manager? Download ghc binaries? Install stack? ghcup? maybe even nix?

In my opinion installing Haskell using stack or ghcup probably lead to the best results.

A new user should be able to run ghc and ghci when starting, the rest can come later.

## Expressions, operators and applying functions

What goes in the right-hand side of the equals sign in a definition? Things like "Hello world", 1 + 1, min (max 43212 12345) 43251.

## Indentation rules

How does Haskell know where a definition begins and where it end? Which arguments are applied to which functions? Haskell uses indentation using spaces to figure that out - best to learn how it works quickly to avoid subtle bugs and cryptic errors.

## Compiler errors

The compiler will notify us when our program doesn't make sense. The fact that Haskell provides a lot of feedback about our program helps us a lot from trying to figure our everything ourselves, gotta get used to Haskell errors because they are here to help. But it also means we have to learn what they mean to be effective.

First, try to read them carefully so you get used to them. At some point you won't even need to read them, just know they are there.

## If expressions & Recursion

Recursion is very foundational to Haskell. It is used for iteration, managing state, and more. It is best to get good control over it as it will appear many times.

## The evaluation model, substitution, equational reasoning and referential transparency

A lot of big words that describe how to "understand Haskell code".

Understanding how Haskell evaluates our program is probably the most important tool available when we need to debug our programs (and when writing them, of course)

## Debug tracing, stack traces

... but often programs can be quite big and we need debug traces and stack traces to help guide us in the right direction of the bug.

## Total vs partial functions

How to make sure all input cases are taken into account? What does it mean when they aren't? How can Haskell help us make sure our program won't crash during runtime because we didn't think of a certain input? How can it guide our API design to make it more robust?

## Parametric Polymorphism

Also known as "generics" in other circles (but not in Haskell as Generics mean something else). Helps us write fuctions that are more reusable, composable and precise.

## Higher order functions

Higher order functions are also foundational, they help us avoid repeating ourselves and help us extract the essence of algorithms such as traversing structures or sorting a structure without having to go into the details of "what to do with the elements when traversing" or "how should I compare two elements when sorting".

## Lists and tuples

We use them all the time, maybe too much to be honest.

## Annotating types

In Haskell annotating types isn't mandatory in most cases, but type annotation can sometimes serve as a tiny summary of code, and also will provide some safety net that checks that our summary match our code, so we are more likely to implement the thing we wanted to implement.

## Using and chaining IO operations

To create a useful program that interacts with the world we often need to use IO. What does that mean for referential transparency? (spoiler: nothing)

## Functional core imperative shell

This is a pattern for designing programs that interact with the world. In Haskell we like to keep our functions uneffectful because they are more composable, reusable and are easier to test, but we still need some effects. One way to keep that to a minimum is to design a program in layers: one thin imperative and effectful layer does that interacting, and then the core layer, the logic of the program, that doesn't do effects.

An example of mixing effects and logic: when making a fizzbuzz program, write a loop that increments an integer and checks which bucket it falls, prints that number, increments the number, and repeats the process until we reach the final number to check.

To write the same program with "functional core imperative shell" in mind, We could write the core logic that generates a list of results, and then write a function that writes the numbers on screen

(and that function looks like this):

writeFizzBuzzResults :: [String] -> IO ()
writeFizzBuzzResults results = mapM_ putStrLn results

## Modules

Learn about organizing code and how to import functions and other things from other modules.

## Using hoogle

A wonderful tool for finding functions and types that many other languages lack.

Let's make our programs safer, easier to understand and more modular.

## Pattern matching

Let's write our own control flow over ADTs.

## Installing editor tools & Ghcid

In my opinion having fast feedback dramatically increase the programmer's productivity.

Some people will say Ghcid is all you need, but I think editor features like error-at-point provide faster feedback than ghcid when working on a single module, so those are preferrable. Though when making a big change across many modules it's hard to beat Ghcid at the moment.

## newtypes

They are like a special case of data but have some benefits over it so it's worth to know as they are wildly used.

## Common data types: Maybe, Either

We use them (and often over use them) a lot.

Either especially can be used for handling errors in uneffectful code instead of IO exceptions.

## A package manager

How can I generate random numbers? How do I parse a json string? Send a file over the network? Write a graphical snake game?

The easiest way to accomplish these goals is to use an external library (packages in Haskell), and the easiest way to import a library, build it, and use it with ghc is through a package manager.

There are many options of package managers, stack and cabal-install are the most common ones. Make sure to read the docs. If you go the cabal-install path, make sure you use the v2 version of the commands. Note that you are not tied to one package manager whichever you choose, you can always switch or even use both if you'd like.

## Finding packages

How to find a proper Haskell package for your usecase can be tricky. Most packages can be found on Hackage, some can be found on Stackage.

It's possible to google the usecase + Haskell for example "Haskell json", click on a few links, see if the documentation makes sense and if the package is popular, But sometimes it's easier to ask in Haskell circles.

## Using the right data structures: text, containers, bytestring, vectors

A lot of Haskell programs are slower than one would expect because of choosing the wrong algorithm or the wrong data structures such as using String instead of Text, using lists when one needs fast lookup or indexing instead of using HashMap, Map or Vector, or using Set when one needs fast insertion (and order doesn't matter), or using nub to remove duplicate elements in a list instead of Set.toList . Set.fromList.

Remember that using the right data structure is also useful for understanding the code, as the choice of data structure reveals which algorithms are important for the program.

## Parse, don't validate

Parse, don't validate is an important pattern that helps write code that is safer and more predictable.

## Typeclasses and deriving typeclasses instances

One of the defining features of Haskell. Adds ad-hoc polymorphism and gives us the ability to create interfaces and abstractions over groups of types.

With typeclasses with can write functions like sort :: Ord a => [a] -> [a], that can sort a list of any type as long as its internal order is defined.

## Numbers typeclasses

Working with numbers can be a bit complicated because number groups (such as integers, floating points, etc) are both similar and different. The numbers typeclass hierarchy tries to capture their similarities and differences.

## Common Typeclasses: Show, Read, Eq, Ord, Semigroup, Monoid

How do we display a value for debugging purposes as a String? How do we convert a string to a value? How do we compare values? How do we append two values?

## Kinds

Kinds are to types what types are to values. It creates some sort of a separation between different types and defines where we can use different kinds of types.

## Common Typeclasses: Foldable, Traversable, Functor, Applicative, Monad

These typeclasses are the most common ones that define abstraction around certain types and their selected operations. You will find them at the core API of many data structures, concurrency libraries, parsing, and more.

These modules help reduce boilerplate by emulating imperative style programming using their various type classes interfaces (Such as Functor and Monad).

They are frequently used in Haskell code in the wild so they're worth knowing.

## forkIO, async, STM, TVar

Want to do some concurrency? Haskell got you covered.

## Language extensions

The GHC compiler provides extra (often experimental) language features beyond what standard Haskell provides. Some are nice to have such as LambdaCase and NumericUnderscores, Some are very useful but can make valid Haskell code invalid such as OverloadedStrings, Some provide really important functionality such as FlexibleInstances and BangPatterns, Some are sometimes incredibly useful such as TypeFamilies but often aren't, and some should probably not be used such as NPlusKPatterns.

It's probably better to run into language extensions organically and learn when you need to rather than try to learn them all.

The GHC2021 proposal classifies Haskell extensions, might be useful to go over it at some point.

## Tests: tasty, hspec, hunit, doctest, etc

Ever heard that you don't need to write tests for Haskell? Forget that, choose a unit testing framework, write tests and enjoy the gurantees that come from both strong static typing and tests!

Also learn about property-based testing and QuickCheck it's beautiful.

While not strictly necessary, many programs and libraries in the wild use monad transformers (Types that provide a monadic interface that can be combined with one another), and mtl, which is a library of interfaces of capabilities.

## Extras

### Simple lenses

It took me maybe 3-4 years to finally think about learning lenses, and even then I've only ever scratched the surface with them. While very powerful and expressive, they aren't really necessary to build programs, but many programs and libraries use basic lenses, so if (or when) you run into them in the wild come back and learn them.

• In my experience view, get, over, ., lens, and makeLenses are all you need to know for most cases.

### Profiling

At some point the average Haskell will write a program that won't perform well enough. It's a good idea to learn how to figure out what makes the program slow. But this can be posponed to later.

Template Haskell (a Haskell metaprogramming framework) might creep at you sometimes as a user, but learning to write template haskell isn't usually necessary.

### Lambda Calculus

You might have heard about the lambda calculus, it is a minimalistic functional language that is at the heart of Haskell. Learning the lambda calculus means learning about the power of functions, how functions can be used to represent data, and how to evaluate expressions using pen and paper.

I found it to be very interesting and eye opening, seeing the core of Haskell naked and exposed without syntactic sugar, but it's not required to use or understand Haskell.

## Summary

Haskell has a really high ceiling of concepts, but one doesn't need master Haskell to be productive with it. Yes, there might be additional concepts to learn in order to use certain libraries, but many applications and libraries can (and have) been built on these concepts alone.

Don't believe me? Here are a few applications that (as far as I could tell - corrections are welcomed!) don't use anything more sophisticated than what I mentioned in this article:

• Aura - A secure, multilingual package manager for Arch Linux and the AUR.
• Elm-compiler - Compiler for Elm, a functional language for reliable webapps.
• Patat - Terminal-based presentations using Pandoc.

## March 15, 2021

We spent a few months last year building the groundwork for Haskellings in this YouTube series. Now after some more hard work, we're happy to announce that Haskellings is now available in the beta stage. This program is meant to be an interactive tutorial for learning the Haskell language. If you've never written a line of Haskell in your life, this program is designed to help you take those first steps! You can take a look at the Github repository to learn all the details of using it, but here's a quick overview.

## Overview

Haskellings gives you the chance to write Haskell code starting from the very basics with a quick evaluation loop. It currently has 50 different exercises for you to work with. In each exercise, you can read about a Haskell concept, make appropriate modifications, and then run the code with a single command to check your work!

You can do exercises individually, but the easiest way to do everything in order is to run the program in Watcher mode. In this mode, it will automatically tell you which exercise is next. It will also re-run each exercise every time you save your work.

Haskellings covers a decent amount of ground on basic language concepts. It starts with the very basics of expressions, types and functions, and goes through the basic usage of monads.

Haskellings is an open source project! If you want to report issues or contribute, learn how by reading this document! So go ahead, give it a try!

## Don't Forget: Haskell From Scratch

Haskellings is a quick and easy way to learn the language basics, but it only touches on the surface of a lot of elements. To get a more in-depth look at the language, you should consider our Haskell From Scratch video course. This course includes:

1. Hours of video lectures explaining core language concepts and syntax

This course will help you build a rock-solid foundation for your future Haskell learnings. And even better, we've now cut the price in half! So don't miss out!

# Hyperfunctions

Posted on March 14, 2021

Check out this type:

newtype a -&> b = Hyp { invoke :: (b -&> a) -> b } 

This a hyperfunction (Launchbury, Krstic, and Sauerwein 2013; 2000; 2000), and I think itâ€™s one of the weirdest and most interesting newtypes you can write in Haskell.

The first thing to notice is that the recursion pattern is weird. For a type to refer to itself recursively on the left of a function arrow is pretty unusual, but on top of that the recursion is non-regular. That means that the recursive reference has different type parameters to its parent: a -&> b is on the left-hand-side of the equals sign, but on the right we refer to b -&> a.

Being weird is reason enough to write about them, but whatâ€™s really shocking about hyperfunctions is that theyâ€™re useful. Once I saw the definition I realised that a bunch of optimisation code I had written (to fuse away zips in particular) was actually using hyperfunctions (Ghani et al. 2005). After that, I saw them all over the place: in coroutine implementations, queues, breadth-first traversals, etc.

Anyways, since coming across hyperfunctions a few months ago I thought Iâ€™d do a writeup on them. Iâ€™m kind of surprised theyâ€™re not more well-known, to be honest: theyâ€™re like a slightly more enigmatic Cont monad, with a far cooler name. Letâ€™s get into it!

# What Are Hyperfunctions?

The newtype noise kind of hides whatâ€™s going on with hyperfunctions: expanding the definition out might make things slightly clearer.

type a -&> b = (b -&> a) -> b
= ((a -&> b) -> a) -> b
= (((b -&> a) -> b) -> a) -> b
= ((((... -> b) -> a) -> b) -> a) -> b

So a value of type a -&> b is kind of an infinitely left-nested function type. One thing worth noticing is that all the as are in negative positions and all the bs in positive. This negative and positive business basically refers to the position of arguments in relation to a function arrow: to the left are negatives, and to the right are positives, but two negatives cancel out.

((((... -> b) -> a) -> b) -> a) -> b
+     -     +     -     +

All the things in negative positions are kind of like the things a function â€œconsumesâ€�, and positive positions are the things â€œproducedâ€�. Itâ€™s worth fiddling around with very nested function types to get a feel for this notion. For hyperfunctions, though, itâ€™s enough to know that a -&> b does indeed (kind of) take in a bunch of as, and it kind of produces bs.

By the way, one of the ways to get to grips with polarity in this sense is to play around with the Cont monad, codensity monad, or selection monad (Hedges 2015). If you do, you may notice one of the interesting parallels about hyperfunctions: the type a -&> a is in fact the fixpoint of the continuation monad (Fix (Cont a)). Suspicious!

# Hyperfunctions Are Everywhere

Before diving further into the properties of the type itself, Iâ€™d like to give some examples of how it can show up in pretty standard optimisation code.

### Zips

Letâ€™s say you wanted to write zip with foldr (I have already described this particular algorithm in a previous post). Not foldr on the left argument, mind you, but foldr on both. If you proceed mechanically, replacing every recursive function with foldr, you can actually arrive at a definition:

zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (foldr yf yb ys)
where
xf x xk yk = yk x xk
xb _ = []

yf y yk x xk = (x,y) : xk yk
yb _ _ = []

In an untyped language, or a language with recursive types, such a definition would be totally fine. In Haskell, though, the compiler will complain with the following:

â€¢ Occurs check: cannot construct the infinite type:
t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]

Seasoned Haskellers will know, though, that this is not a type error: no, this is a type recipe. The compiler is telling you what parameters it wants you to stick in the newtype:

newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> [(a,b)]) -> [(a,b)] }

zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = xz yz
where
xz :: Zip a b -> [(a,b)]
xz = foldr f b xs
where
f x xk yk = runZip yk x xk
b _ = []

yz :: Zip a b
yz = foldr f b ys
where
f y yk = Zip (\x xk -> (x,y) : xk yk)
b = Zip (\_ _ -> [])

And here we see the elusive hyperfunction: hidden behind a slight change of parameter order, Zip a b is in fact the same as [(a,b)] -&> (a -> [(a,b)]).

zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = invoke xz yz
where
xz :: (a -> [(a,b)]) -&> [(a,b)]
xz = foldr f b xs
where
f x xk = Hyp (\yk -> invoke yk xk x)
b = Hyp (\_ -> [])

yz :: [(a,b)] -&> (a -> [(a,b)])
yz = foldr f b ys
where
f y yk = Hyp (\xk x -> (x,y) : invoke xk yk)
b = Hyp (\_ _ -> [])

### BFTs

In another previous post I derived the following function to do a breadth-first traversal of a tree:

data Tree a = a :& [Tree a]

newtype Q a = Q { q :: (Q a -> [a]) -> [a] }

bfe :: Tree a -> [a]
bfe t = q (f t b) e
where
f :: Tree a -> Q a -> Q a
f (x :& xs) fw = Q (\bw -> x : q fw (bw . flip (foldr f) xs))

b :: Q a
b = Q (\k -> k b)

e :: Q a -> [a]
e (Q q) = q e

That Q type there is another hyperfunction.

bfe :: Tree a -> [a]
bfe t = invoke (f t e) e
where
f :: Tree a -> ([a] -&> [a]) -> ([a] -&> [a])
f (x :& xs) fw = Hyp (\bw -> x : invoke fw (Hyp (invoke bw . flip (foldr f) xs)))

e :: [a] -&> [a]
e = Hyp (\k -> invoke k e)

One of the problems I had with the above function was that it didnâ€™t terminate: it could enumerate all the elements of the tree but it didnâ€™t know when to stop. A similar program (Allison 2006; described and translated to Haskell in Smith 2009) manages to solve the problem with a counter. Will it shock you to find out this solution can also be encoded with a hyperfunction?

bfe t = invoke (f t e) e 1
where
f :: Tree a -> (Int -> [a]) -&> (Int -> [a])
-> (Int -> [a]) -&> (Int -> [a])
f (x :& xs) fw =
Hyp (\bw n -> x :
invoke fw
(Hyp (invoke bw . flip (foldr f) xs)) (n+1))

e :: (Int -> [a]) -&> (Int -> [a])
e = Hyp b

b x 0 = []
b x n = invoke x (Hyp b) (n-1)

(my version here is actually a good bit different from the one in Smith 2009, but the basic idea is the same)

### Coroutines

Hyperfunctions seem to me to be quite deeply related to coroutines. At the very least several of the types involved in coroutine implementations are actual hyperfunctions. The ProdPar and ConsPar types from Pieters and Schrijvers (2019) are good examples:

newtype ProdPar a b = ProdPar (ConsPar a b -> b)
newtype ConsPar a b = ConsPar (a -> ProdPar a b -> b)

ProdPar a b is isomorphic to (a -> b) -&> b, and ConsPar a b to b -&> (a -> b), as witnessed by the following functions:

Conversion functions between ProdPar, ConsPar and hyperfunctions

fromP :: ProdPar a b -> (a -> b) -&> b
fromP (ProdPar x) = Hyp (x . toC)

toC ::  b -&> (a -> b) -> ConsPar a b
toC (Hyp h) = ConsPar (\x p -> h (fromP p) x)

toP :: (a -> b) -&> b -> ProdPar a b
toP (Hyp x) = ProdPar (x . fromC)

fromC :: ConsPar a b -> b -&> (a -> b)
fromC (ConsPar p) = Hyp (\h x -> p x (toP h))

In fact this reveals a little about what was happening in the zip function: we convert the left-hand list to a ProdPar (producer), and the right-hand to a consumer, and apply them to each other.

# Hyperfunctions Are Weird

Aside from just being kind of weird intuitively, hyperfunctions are weird in theory. Set-theoretically, for instance, you cannot form the set of a -&> b: if you tried, youâ€™d run into those pesky size restrictions which stop us from making things like â€œthe set of all setsâ€�. Haskell types, however, are not sets, precisely because we can define things like a -&> b.

For slightly different reasons to the set theory restrictions, we canâ€™t define the type of hyperfunctions in Agda. The following will get an error:

record _â†¬_ (A : Type a) (B : Type b) : Type (a â„“âŠ” b) where
inductive; constructor hyp
field invoke : (B â†¬ A) â†’ B

And for good reason! Agda doesnâ€™t allow recursive types where the recursive call is in a negative position. If we turn off the positivity checker, we can write Curryâ€™s paradox (example proof taken from here):

yes? : âŠ¥ â†¬ âŠ¥
yes? .invoke h = h .invoke h

no! : (âŠ¥ â†¬ âŠ¥) â†’ âŠ¥
no! h = h .invoke h

boom : âŠ¥
boom = no! yes?

Note that this isnâ€™t an issue with the termination checker: the above example passes all the normal termination conditions without issue (yes, even if â†¬ is marked as coinductive). Itâ€™s directly because the type itself is not positive.

Interestingly, there is a slightly different, and nearly equivalent, definition of hyperfunctions which doesnâ€™t allow us to write the above proof:

record _â†¬_ (A : Type a) (B : Type b) : Type (a â„“âŠ” b) where
inductive; constructor hyp
field invoke : ((A â†¬ B) â†’ A) â†’ B

This is basically a slightly expanded out version of the hyperfunction type, and importantly itâ€™s positive. Not strictly positive however, since the recursive call does occur to the left of a function arrow: itâ€™s just positive, in that itâ€™s to the left of an even number of function arrows.

I found in a blog post by SjÃ¶berg (2015) some interesting discussion regarding the question of this extra strictness: in Coq, allowing certain positive but not strictly positive types does indeed introduce an inconsistency (Coquand and Paulin 1990). However this inconsistency relies on an impredicative universe, which Agda doesnâ€™t have. As far as I understand it, it would likely be safe to allow types like â†¬ above in Agda (Coquand 2013), although Iâ€™m not certain that with all of Agdaâ€™s newer features thatâ€™s still the case.

The connection between non-strictly-positive types and breadth-first traversals has been noticed before: Berger, Matthes, and Setzer (2019) make the argument for their inclusion in Agda and Coq using a breadth-first traversal algorithm by Hofmann (1993), which uses the following type:

data Rou
= Over
| Next ((Rou -> [Int]) -> [Int])

Now this type isnâ€™t a hyperfunction (but itâ€™s close); weâ€™ll see soon what kind of thing it is.

# Hyperfunctions Are a Category

So weâ€™ve seen that hyperfunctions show up kind of incidentally through certain optimisations, and weâ€™ve seen that they occupy a strange space in terms of their theoretical interpretation: we havenâ€™t yet seen much about the type itself in isolation. Luckily Ed Kmett has already written the hyperfunctions package (2015), where a laundry list of instances are provided, which can tell us a little more about what hyperfunctions can actually do on their own.

The Category instance gives us the following:

instance Category (-&>) where
id = Hyp (\k -> invoke k id)
f . g = Hyp (\k -> invoke f (g . k))

Weâ€™ve actually seen the identity function a few times: we used it as the base case for recursion in the breadth-first traversal algorithms.

Composition we actually have used as well but itâ€™s more obscured. An analogy to help clear things up is to think of hyperfunctions as a kind of stack. id is the empty stack, and we can use the following function to push items onto the stack:

push :: (a -> b) -> a -&> b -> a -&> b
push f q = Hyp (\k -> f (invoke k q))

Understood in this sense, composition acts like a zipping operation on stacks, since we have the following law:

push f p . push g q â‰¡ push (f . g) (p . q)

While we canâ€™t really pop elements off the top of the stack directly, we can get close with invoke, since it satisfies the following law:

invoke (push f p) q â‰¡ f (invoke q p)

Along with the id implementation we have, this will let us run a hyperfunction, basically folding over the contents of the stack:

run :: a -&> a -> a
run f = invoke f id

This analogy helps us understand how the breadth-first traversals worked: the hyperfunctions are kind of like stacks with <semantics>ğ�’ª(1)<annotation encoding="application/x-tex">\mathcal{O}(1)</annotation></semantics> push and zip, which is precisely what you need for an efficient breadth-first traversal.

bfe :: Tree a -> [a]
bfe = run . f
where
f (x :& xs) = push (x:) (zips (map f xs))

zips = foldr (.) id

Finally, hyperfunctions are of course monads:

instance Monad ((-&>) a) where
m >>= f = Hyp (\k -> invoke (f (invoke m (Hyp (invoke k . (>>=f))))) k)

I wonâ€™t pretend to understand whatâ€™s going on here, but it looks a little like a nested reader monad. Perhaps thereâ€™s some intuition to be gained from noticing that a -&> a ~ Fix (Cont a).

# Hyper Arrows Areâ€¦?

As I said in the introduction Iâ€™m kind of surprised thereâ€™s not more research out there on hyperfunctions. Aside from the excellent papers by Launchbury, Krstic, and Sauerwein (2013) thereâ€™s just not much out there. Maybe itâ€™s that thereâ€™s not that much theoretical depth to them, but all the same there are some clear questions worth looking into.

For example: is there a hyperfunction monad transformer? Or, failing that, can you thread a monad through the type at any point, and do you get anything interesting out?

I have made a little headway on this question, while fiddling with one of the bfe definitions above. Basically I wanted to remove the Int counter for the terminating bfe, and I wanted to use a Maybe somewhere instead. I ended up generalising from Maybe to any m, yielding the following type:

newtype HypM m a b = HypM { invokeM :: m ((HypM m a b -> a) -> b) }

This does the job for the breadth-first traversal:

bfe t = r (f t e)
where
f :: Tree a -> HypM Maybe [a] [a] -> HypM Maybe [a] [a]
f (x :& xs) fw = HypM (Just (\bw -> x : fromMaybe (\k -> k e) (invokeM fw) (bw . flip (foldr f) xs)))

e :: HypM Maybe [a] [a]
e = HypM Nothing

r :: HypM Maybe [a] [a] -> [a]
r = maybe [] (\k -> k r) . invokeM

(In fact, when m is specialised to Maybe we have the same type as Rou)

This type has a very practical use, as it happens, which is related to the church-encoded list monad transformer:

newtype ListT m a = ListT { runListT :: forall b. (a -> m b -> m b) -> m b -> m b }

Just like -&> allowed us to write zip on folds (i.e.Â using foldr), HypM will allow us to write zipM on ListT:

zipM :: Monad m => ListT m a -> ListT m b -> ListT m (a,b)
zipM xs ys = ListT (\c n ->
let
xf x xk = pure (\yk -> yk (HypM xk) x)
xb = pure (\_ -> n)

yf y yk = pure (\xk x -> c (x, y) (join (invokeM xk <*> yk)))
yb = pure (\_ _ -> n)
in join (runListT xs xf xb <*> runListT ys yf yb))

I actually think this function could be used to seriously improve the running time of several of the functions on LogicT: my reading of them suggests that interleave is <semantics>ğ�’ª(n2)<annotation encoding="application/x-tex">\mathcal{O}(n^2)</annotation></semantics> (or worse), but the zip above could be trivially repurposed to give a <semantics>ğ�’ª(n)<annotation encoding="application/x-tex">\mathcal{O}(n)</annotation></semantics> interleave. This would also have knock-on effects on, for instance, >>- and so on.

Another question is regarding the arrows of the hyperfunction. Weâ€™ve seen that a hyperfunction kind of adds â€œstackingâ€� to functions, can it do the same for other arrows? Basically, does the following type do anything useful?

newtype HypP p a b = HypP { invokeP :: p (HypP p b a) b }

Along a similar vein, many of the breadth-first enumeration algorithms seem to use â€œhyperfunctions over the endomorphism monoidâ€�. Basically, they all produce hyperfunctions of the type [a] -&> [a], and use them quite similarly to how we would use difference lists. But we know that there are Cayley transforms in other monoidal categories, for instance in the applicative monoidal category: can we construct the â€œhyperfunctionâ€� version of those?

# References

Allison, Lloyd. 2006. â€œCircular Programs and Self-Referential Structures.â€� Software: Practice and Experience 19 (2) (October): 99â€“109. doi:10.1002/spe.4380190202. http://users.monash.edu/~lloyd/tildeFP/1989SPE/.

Berger, Ulrich, Ralph Matthes, and Anton Setzer. 2019. â€œMartin Hofmannâ€™s Case for Non-Strictly Positive Data Types.â€� In 24th International Conference on Types for Proofs and Programs (TYPES 2018), ed by. Peter Dybjer, JosÃ© EspÃ­rito Santo, and LuÃ­s Pinto, 130:22. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss DagstuhlLeibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.TYPES.2018.1. http://drops.dagstuhl.de/opus/volltexte/2019/11405.

Coquand, Thierry. 2013. â€œ[Agda] Defining Coinductive Types.â€� https://lists.chalmers.se/pipermail/agda/2013/006189.html.

Coquand, Thierry, and Christine Paulin. 1990. â€œInductively Defined Types.â€� In COLOG-88, ed by. Per Martin-LÃ¶f and Grigori Mints, 50â€“66. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. doi:10.1007/3-540-52335-9_47.

Ghani, Neil, Patricia Johann, Tarmo Uustalu, and Varmo Vene. 2005. â€œMonadic Augment and Generalised Short Cut Fusion.â€� In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming, 294â€“305. ICFP â€™05. New York, NY, USA: Association for Computing Machinery. doi:10.1145/1086365.1086403. https://doi.org/10.1145/1086365.1086403.

Hedges, Jules. 2015. â€œThe Selection Monad as a CPS Transformation.â€� arXiv:1503.06061 [cs] (March). http://arxiv.org/abs/1503.06061.

Hofmann, Martin. 1993. â€œNon Strictly Positive Datatypes in System F.â€� https://www.seas.upenn.edu/~sweirich/types/archive/1993/msg00027.html.

Kmett, Edward. 2015. â€œHyperfunctions: Hyperfunctions.â€� https://hackage.haskell.org/package/hyperfunctions.

Krstic, Sava, and John Launchbury. 2000. â€œA Category of Hyperfunctions.â€� http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.2421.

Launchbury, J., S. Krstic, and T. E. Sauerwein. 2013. â€œCoroutining Folds with Hyperfunctions.â€� Electron. Proc. Theor. Comput. Sci. 129 (September): 121â€“135. doi:10.4204/EPTCS.129.9. http://arxiv.org/abs/1309.5135.

Launchbury, John, Sava Krstic, and Timothy E. Sauerwein. 2000. Zip Fusion with Hyperfunctions. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.4961.

Pieters, Ruben P., and Tom Schrijvers. 2019. â€œFaster Coroutine Pipelines: A Reconstruction.â€� In Practical Aspects of Declarative Languages, ed by. JosÃ© JÃºlio Alferes and Moa Johansson, 133â€“149. Lecture Notes in Computer Science. Cham: Springer International Publishing. doi:10.1007/978-3-030-05998-9_9. https://people.cs.kuleuven.be/~tom.schrijvers/portfolio/padl2019.html.

SjÃ¶berg, Vilhelm. 2015. â€œWhy Must Inductive Types Be Strictly Positive?â€� Code and stuff. https://vilhelms.github.io/posts/why-must-inductive-types-be-strictly-positive/.

Smith, Leon P. 2009. â€œLloyd Allisonâ€™s Corecursive Queues: Why Continuations Matter.â€� The Monad.Reader 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

# Strema is now Giml

For the past few weeks I've been working on Strema, a hobby programming language I developed live on stream. It's been a great experience and I'm proud of what I was able to accomplish in the short time I've worked on it.

I've decided to change Strema a bit and rename it to Giml (pronounced like the hebrew letter Gimel).

I've been wanting to make Giml for a long time, and I had Giml in mind when I initially designed Strema. Building Strema was also kind of a prototype for Giml. I feel the prototype was a success and Strema has grown sufficiently close to what I had in mind for Giml, so I've decided to shift gears and focus my interest on building Giml instead.

Giml resembles Haskell in terms of syntax and semantics more closely than Strema (for example, Giml is purely functional where Strema is not), though Giml is also strict where Haskell in non-strict, and Giml is much leaner in terms of features than Haskell.

I will continue to stream work on Giml, at least for the next few weeks. I hope to make it a hobby programming language I can be proud of, and I hope people who've been interested in Strema will find Giml just as interesting, if not more :)

If you'd like to follow my adventures with Giml, you can view the VOD playlist on Youtube, Or follow my Twitch.tv channel and join me when I stream. I'll be happy to have you!

# The Logarithm of a Sum

This is the story of an interesting flight of fancy with mathematics. I found it intriguing, and hope you do, as well.

#### The question

Here’s a fact that comes up in high school mathematics: you can demote multiplication into addition by using logarithms. That is:

That is, you can compute the log of a product, given only the logs of the factors.

To students today, this might seem like just another algebraic identity. But in the age before calculators, it was actually the main reason for a typical high school student to be interested in logarithms at all! Multiplication is more difficult than addition, so if you have a way to represent numbers that makes multiplication into addition, that helps. This is whole principle behind doing multiplication with a slide rule, for example: one just converts to logarithms, adds the resulting distances, and then converts back.

Similarly, one can use logarithms to demote powers into multiplication:

But if we’re imagining a world where we work entirely with logarithms, it’s not entirely fair to just multiply by y, so I’m going to rewrite this (let’s agree that all logarithms are natural) as:

There’s an additional exponential function there, but if we take that as given, we can now compute the log of a power using only multiplication, the exponential function, and the logs of the inputs.

An interesting question to ask is: what about addition? The following does not work, although math teachers will recognize it as a very common mistake!

So, can we complete this equation?

At first glance, thinking of the logarithm as translating operations down one order (multiplication into addition, and exponents into multiplication), this seems to call for an operation an order lower than addition. What could fit in such a place?

We can start to answer this question using simple algebra and our existing identities. Let’s assume x is not zero (since then it would have no logarithm anyway!), and then we can factor:

So by applying the log rule for multiplication, we get this nifty little formula:

Notice that although the presentation here doesn’t look symmetric, it actually is. Swapping the x and y values doesn’t change the result.

Again, imagining that we have only the logarithms and not the actual values, that fraction at the end is sort of cheating. Just as I did with the multiplication formula, I’ll introduce an explicit exponential, and it simplifies nicely.

In order to write this more clearly, I’ll name a new function, h, and define in terms of that:

It’s true that we haven’t succeeded in getting rid of addition, but this is leading somewhere interesting. But what is this mysterious function h?

#### h: The soft rectified linear function

We can start to explore h by looking at a graph.

At first glance, it looks like h(x) is approximately zero for any input less than -2, and approximately x for any input greater than 2. This sounds like the so-called “rectified linear” function:

Indeed, we can graph the two functions on the same axes, and see that they agree except near zero. (You can also verify this by reasoning about the formula. For inputs far less than zero, the exponential term becomes insignificant, while for inputs far greater than zero, the constant term becomes insignificant. This is the basis of a not-too-hard proof that these are asymptotes.)

We can, therefore, think of h as a soft rectified linear function; what you get by just rounding out the rectified linear function around its sharp corner.

(This rectified linear function, incidentally, has been popularized in machine learning, where for reasons that depend on who you ask, it has turned out to be wildly successful as an activation function for artificial neural networks. Part of the reason for that success is that it is so simple it can be computed quickly. But that’s not enough to explain all of its success! I suspect another part of the reason is that it’s closely related to sums exactly in the sense of the very investigation we’re doing now.)

#### Back to the sum

So if h is so similar to the rectified linear function, what happens when you (inaccurately) use the rectified linear function itself in the sum formula above. Remarkably, you get this:

In other words, in terms of logarithms, adding numbers is approximately the same as just taking the maximum! At least, it is when the difference between the numbers is large. That sort of makes sense, actually. If you add a very large number to a very small number, the result will indeed be approximately the same as the large number. (Remember that since we’re only thinking about numbers with logarithms, both inputs must be positive. We need not worry about the case where both numbers are large but with opposite signs.)

We can pull out of this pattern a sort of “soft” maximum function, which is almost like just giving the greater of its two arguments, but if the arguments are close then it rounds off the curve. Unfortunately, the phrase softmax already means something different and somewhat more complex to the machine learning community mentioned above, so perhaps we ought to call this something like smoothmax instead.

It’s not easily computable, really, in the sense that products and powers were, but this still gives some intuition for the function that does compute the log of a sum, given the logs of the summands. Anyway, I’m satisfied enough with that answer.

This tells us that this smoothmax function can play the role of addition in mathematical expressions. That implies that all of the algebraic properties of addition ought to hold for smoothmax, as well. That’s interesting!

For example, smoothmax ought to be commutative. That is:

Indeed, this is true. I made that observation above when first introducing the formula. One can also expect that smoothmax is associative. That is:

And, indeed, although the algebra is a little more complex, this turns out to be true, as well. In fact, we need not really show each of these with complicated algebra. We’ve already shown that smoothmax is addition, just using the logarithms to represent the numbers.

I think things get even more interesting when we consider the distributive property. Remember that when we work with logs, multiplication gets replaced with addition, so we have this:

Thinking of this as a softened maximum, this works out to be some kind of translation invariance property of the maximum: if you take the maximum of two numbers and then add x, that’s the same as adding x to each one and then taking the maximum! That intuitively checks out.

There are some things that don’t work, though.

You might also hope for something like an identity property, since for addition we have x + 0 = x. This one doesn’t turn out so well, because we cannot take the logarithm of zero! We end up wanting to write something like:

This would make sense given the asymptotic behavior of the smoothmax function, but we’re playing sort of fast and loose with infinities there, so I wouldn’t call it a true identity. To say that correctly, you need limits.

You also need to be careful with expecting smoothmax to act like a maximum! For example:

That’s weird… but not if you remember that smoothmax is at its least accurate when its two inputs are close together, so both inputs being the same is a worst case scenario. Indeed, that’s where the true max function has a non-differentiable sharp corner that needed to be smoothed out. And, indeed, the exact behavior is given by addition, rather than maximums, and addition is not idempotent (i.e., adding a number to itself doesn’t give the same number back).

In fact, speaking of smooth maxing a number with itself:

which resembles a sort of definition of addition of log-naturals as “repeated smoothmax of a number with itself”, in very much the same sense that multiplication by naturals can be defined as repeated addition of a number with itself, strengthening the notion that this operation is sort-of one order lower than addition.

So there you have it. That’s as far as my flight of fancy goes. I found it interesting enough to share.

# Captures in closures and async blocks

This blog post is the second in the Rust quickies series. In my training sessions, we often come up with quick examples to demonstrate some point. Instead of forgetting about them, I want to put short blog posts together focusing on these examples. Hopefully these will be helpful, enjoy!

FP Complete is looking for Rust and DevOps engineers. Interested in working with us? Check out our jobs page.

## Hello Hyper!

For those not familiar, Hyper is an HTTP implementation for Rust, built on top of Tokio. It's a low level library powering frameworks like Warp and Rocket, as well as the reqwest client library. For most people, most of the time, using a higher level wrapper like these is the right thing to do.

But sometimes we like to get our hands dirty, and sometimes working directly with Hyper is the right choice. And definitely from a learning perspective, it's worth doing so at least once. And what could be easier than following the example from Hyper's homepage? To do so, cargo new a new project, add the following dependencies:

hyper = { version = "0.14", features = ["full"] }
tokio = { version = "1", features = ["full"] }


And add the following to main.rs:

use std::convert::Infallible;
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};

async fn hello_world(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
Ok(Response::new("Hello, World".into()))
}

#[tokio::main]
async fn main() {
// We'll bind to 127.0.0.1:3000

// A Service is needed for every connection, so this
// creates one from our hello_world function.
let make_svc = make_service_fn(|_conn| async {
// service_fn converts our function into a Service
Ok::<_, Infallible>(service_fn(hello_world))
});

// Run this server for... forever!
if let Err(e) = server.await {
eprintln!("server error: {}", e);
}
}


If you're interested, there's a quick explanation of this code available on Hyper's website. But our focus will be on making an ever-so-minor modification to this code. Let's go!

## Counter

Remember the good old days of Geocities websites, where every page had to have a visitor counter? I want that. Let's modify our hello_world function to do just that:

use std::sync::{Arc, Mutex};

type Counter = Arc<Mutex<usize>>; // Bonus points: use an AtomicUsize instead

async fn hello_world(counter: Counter, _req: Request<Body>) -> Result<Response<Body>, Infallible> {
let mut guard = counter.lock().unwrap(); // unwrap poisoned Mutexes
*guard += 1;
let message = format!("You are visitor number {}", guard);
Ok(Response::new(message.into()))
}


That's easy enough, and now we're done with hello_world. The only problem is rewriting main to pass in a Counter value to it. Let's take a first, naive stab at the problem:

let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let counter: Counter = Arc::new(Mutex::new(0));

let make_svc = make_service_fn(|_conn| async {
Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
});

if let Err(e) = server.await {
eprintln!("server error: {}", e);
}


Unfortunately, this fails due to moving out of captured variables. (That's a topic we cover in detail in our closure training module.)

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:21:58
|
18 |     let counter: Counter = Arc::new(Mutex::new(0));
|         ------- captured outer variable
...
21 |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
|                                                          ^^^^^^^ move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:50
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(|_conn| async {
|  __________________________________________________^
21 | |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
| |                                        -------------------------------
| |                                        |
| |                                        move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                                        move occurs due to use in generator
22 | |     });
| |_____^ move out of counter occurs here


## Clone

That error isn't terribly surprising. We put our Mutex inside an Arc for a reason: we'll need to make multiple clones of it and pass those around to each new request handler. But we haven't called clone once yet! Again, let's do the most naive thing possible, and change:

Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))


into

Ok::<_, Infallible>(service_fn(|req| hello_world(counter.clone(), req)))


This is where the error messages begin to get more interesting:

error[E0597]: counter does not live long enough
--> src\main.rs:21:58
|
20 |       let make_svc = make_service_fn(|_conn| async {
|  ____________________________________-------_-
| |                                    |
| |                                    value captured here
21 | |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter.clone(), req)))
| |                                                          ^^^^^^^ borrowed value does not live long enough
22 | |     });
| |_____- returning this value requires that counter is borrowed for 'static
...
29 |   }
|   - counter dropped here while still borrowed


Both async blocks and closures will, by default, capture variables from their environment by reference, instead of taking ownership. Our closure needs to have a 'static lifetime, and therefore can't hold onto a reference to data in our main function.

## move all the things!

The standard solution to this is to simply sprinkle moves on each async block and closure. This will force each closure to own the Arc itself, not a reference to it. Doing so looks simple:

let make_svc = make_service_fn(move |_conn| async move {
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
});


And this does in fact fix the error above. But it gives us a new error instead:

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:60
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
|  ____________________________________________________________^
21 | |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
| |                                        --------------------------------------------
| |                                        |
| |                                        move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                                        move occurs due to use in generator
22 | |     });
| |_____^ move out of counter occurs here


## Double the closure, double the clone!

Well, even this error makes a lot of sense. Let's understand better what our code is doing:

• Creates a closure to pass to make_service_fn, which will be called for each new incoming connection
• Within that closure, creates a new closure to pass to service_fn, which will be called for each new incoming request on an existing connection

This is where the trickiness of working directly with Hyper comes into play. Each of those layers of closure need to own their own clone of the Arc. And in our code above, we're trying to move the Arc from the outer closure's captured variable into the inner closure's captured variable. If you squint hard enough, that's what the error message above is saying. Our outer closure is an FnMut, which must be callable multiple times. Therefore, we cannot move out of its captured variable.

It seems like this should be an easy fix: just clone again!

let make_svc = make_service_fn(move |_conn| async move {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
});


And this is the point at which we hit a real head scratcher: we get almost exactly the same error message:

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:60
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
|  ____________________________________________________________^
21 | |         let counter_clone = counter.clone();
| |                             -------
| |                             |
| |                             move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                             move occurs due to use in generator
22 | |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
23 | |     });
| |_____^ move out of counter occurs here


What we need to do is to rewrite our code ever so slightly so reveal what the problem is. Let's add a bunch of unnecessary braces. We'll convert the code above:

let make_svc = make_service_fn(move |_conn| async move {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
});


into this semantically identical code:

let make_svc = make_service_fn(move |_conn| { // outer closure
async move { // async block
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| { // inner closure
hello_world(counter_clone.clone(), req)
}))
}
});


The error message is basically identical, just slightly different source locations. But now I can walk through the ownership of counter more correctly. I've added comments to highlight three different entities in the code above that can take ownership of values via some kind of environment:

• The outer closure, which handles each connection
• An async block, which forms the body of the outer closure
• The inner closure, which handles each request

In the original structuring of the code, we put move |_conn| async move next to each other on one line, which—at least for me—obfuscated the fact that the closure and async block were two completely separate entities. With that change in place, let's track the ownership of counter:

1. We create the Arc in the main function; it's owned by the counter variable.
2. We move the Arc from the main function's counter variable into the outer closure's captured variables.
3. We move the counter variable out of the outer closure and into the async block's captured variables.
4. Within the body of the async block, we create a clone of counter, called counter_clone. This does not move out of the async block, since the clone method only requires a reference to the Arc.
5. We move the Arc out of the counter_clone variable and into the inner closure.
6. Within the body of the inner closure, we clone the Arc (which, as explained in (4), doesn't move) and pass it into the hello_world function.

Based on this breakdown, can you see where the problem is? It's at step (3). We don't want to move out of the outer closure's captured variables. We try to avoid that move by cloning counter. But we clone too late! By using counter from inside an async move block, we're forcing the compiler to move. Hurray, we've identified the problem!

## Non-solution: non-move async

It seems like we were simply over-ambitious with our "sprinkling move" attempt above. The problem is that the async block is taking ownership of counter. Let's try simply removing the move keyword there:

let make_svc = make_service_fn(move |_conn| {
async {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| {
hello_world(counter_clone.clone(), req)
}))
}
});


Unfortunately, this isn't a solution:

error: captured variable cannot escape FnMut closure body
--> src\main.rs:21:9
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- variable defined here
19 |
20 |       let make_svc = make_service_fn(move |_conn| {
|                                                 - inferred to be a FnMut closure
21 | /         async {
22 | |             let counter_clone = counter.clone();
| |                                 ------- variable captured here
23 | |             Ok::<_, Infallible>(service_fn(move |req| {
24 | |                 hello_world(counter_clone.clone(), req)
25 | |             }))
26 | |         }
| |_________^ returns an async block that contains a reference to a captured variable, which then escapes the closure body
|
= note: FnMut closures only have access to their captured variables while they are executing...
= note: ...therefore, they cannot allow references to captured variables to escape


The problem here is that the outer closure will return the Future generated by the async block. And if the async block doesn't move the counter, it will be holding a reference to the outer closure's captured variables. And that's not allowed.

## Real solution: clone early, clone often

OK, undo the async move to async transformation, it's a dead end. It turns out that all we've got to do is clone the counter before we start the async move block, like so:

let make_svc = make_service_fn(move |_conn| {
let counter_clone = counter.clone(); // this moved one line earlier
async move {
Ok::<_, Infallible>(service_fn(move |req| {
hello_world(counter_clone.clone(), req)
}))
}
});


Now, we create a temporary counter_clone within the outer closure. This works by reference, and therefore doesn't move anything. We then move the new, temporary counter_clone into the async move block via a capture, and from there move it into the inner closure. With this, all of our closure captured variables remain unmoved, and therefore the requirements of FnMut are satisfied.

And with that, we can finally enjoy the glory days of Geocities visitor counters!

## Async closures

The formatting recommended by rustfmt hides away the fact that there are two different environments at play between the outer closure and the async block, by moving the two onto a single line with move |_conn| async move. That makes it feel like the two entities are somehow one and the same. But as we've demonstrated, they aren't.

Theoretically this could be solved by having an async closure. I tested with #![feature(async_closure)] on nightly-2021-03-02, but couldn't figure out a way to use an async closure to solve this problem differently than I solved it above. But that may be my own lack of familiarity with async_closure.

For now, the main takeaway is that closures and async blocks are two different entities, each with their own environment.

If you liked this post you may also be interested in:

# The First Winter: Clearing Weeds and Planting Trees

In October 2020, we got our new house. Built 2013, it’s not only new to us, and this is a major upside. While I definitely can appreciate many aspects of old houses, the comfort of a modern house fits us first-time parents very well.

The site is pretty large for a regular house, at about 10 500 m². Eight years ago when the house was built, all the existing old trees were cut down. The weeds took over, as they often seem to do after deforestation if nothing else is planted. The two dominant species were blackberries (I’m not sure which kind) and “Cytisus scoparius.” Birch trees also managed to cover some ground.

It didn’t take long before we made other plans for this place. Specifically, I decided to try restoring the forest, in a way that makes sense in this area. That means a majority of oak and beech, some sections of birch, and a sprinkle of cherry, pine, and spruce.

Of course, I’ll be long gone before the grand result of this project. Future generations can hopefully bask in the glory of its success. In the meantime, I’ll enjoy watching these trees grow up.

Please note that I’m a clumsy hobbyist. I do get solid advice from my sister who is a trained gardener, and from various other sources, but you shouldn’t trust anything I say or do.

## Autumn Colors, All Around

When we arrived, nature was showing its autumn palette of yellow, red, and brown. The following photos are from walks in the area surrounding our house.

## Time To Work

Alright, enough about the surrounding nature. Around mid November I started executing on our plan. At first, I had no power tools or protective gear. I think that lasted about a day, after which I came in covered in scars.

The blackberries were everywhere. I mean it. They often grew up to a height of two meters by growing inside and across the Cytisus scoparius branches, creating what I can only describe as a jungle of barbed wire.

These two have become my mortal enemies, the main targets of this operation. Blackberries, while tasty, is a weed that is notoriusly hard to get rid of. Cytisus scoparius is classified as invasive in some countries, though not in Sweden.

I cleared out a small portion, including some promising oaklings. While scarred, my motivation was now on top. I ordered a clearing saw and a full set of protective gear.

While waiting for the delivery, I collected acorns down by the road and planted them in old planting boxes. Finally, I covered them in thick layer of oak leaves, which sadly didn’t stop birds from the enjoying the smörgåsbord. We’ll soon see how many of the 200-300 acorns that make it. My hopes are low on this one.

When the saw arrived, I did a first pass on the area closest to the house. It took a while until I found a good technique and which blade to use.

I soon realized that a more efficient workflow would be to do it in two separate phases:

1. Get rid of as much blackberry as possible
2. Cut down all Cytisus scoparius and other excess trees

And so it went.

Finally, the north side of the road was cleared. There remains a bunch of birch trees to be cut, but that can wait a bit.

At this point, I had some space to work with. There’s a road construction planned at the corner of our lot, right through a little pond of self-seeded oaklings. Instead of them getting covered by gravel, I decided to move them to the newly cleared grounds. The first batch was around 25-30 trees, I think.

## The Other Side

Awkwardly, the south side of the road remained uncleared. There, the Cytisus scoparius grew even taller and more dense.

Speaking of the multi-sport court – it’s perhaps not the most attractive part of this place, in my opinion. But I’m guessing our toddler might have different opinions on this matter, and so it could come in handy.

An audiobook and some sweat later, the south side was cleared.

We got help from relatives to pile up the remaining branches. There’s a little bit left to do but the main chunk of work is done.

## Planting Oak and Beech

In a handful spots there grew small oak and beech plants that needed moving. Either they grew too densely, below larger trees where they’d only get shade, or in areas that needed to be kept clear (sides of the road, for example.) I started moving them into newly cleared areas to get a nicer spread of trees across the entire area.

I’ve replanted these trees with marker sticks. This makes it easy to spot them. Also, I can support the plants by tying them to the stick, and use them to fix the net cages.

## Pruning

Even with the fierce competition going on after the deforestation, a bunch of oak trees managed to break through. There are some growing 4-5 m tall, already. Unfortunately they’ve grown very close to other plants and got intertwined, so I’ve spent some time pruning.

Pruning trees like these is usually done between July and September, but I wanted to give these a full season of unhindered growth. Hopefully I won’t regret that decision.

## Springtime!

Last week, spring was declared in southern Sweden. Finally! I spent the weekend caging in small trees to (hopefully) protect them from wildlife, and to give them a more focused upwards growth. I’ve used cable ties to straighten and support the smaller trees, but I’m not sure if that’s a bad idea. Maybe I need to swap them for some softer rope.

Some of the larger trees had grown a bit sideways due to competition, so I bought heavy piles and used them for support.

And that’s where I am now, eagearly awaiting the warmth of spring and summer, and the blossoming of trees and flowers. We really don’t know what might show up!

If you’ve read this far, I hope you have enjoyed the first part of our journey. If so, let me know, and I might post more pictures and words on this subject. You can also follow me on Twitter, where I nowadays torture my followers with both tech and gardening tweets.

# [nzcimnev] infix dollar with RankNTypes

here is some surprising behavior seen with ghc 8.6.5 (Haskell compiler).

in the function "answer" at the end:

1. infix dollar sign works
2. using ($$) which is equal to () fails 3. defining ($$$) exactly as ($) is defined in Prelude fails
4. prefix dollar sign fails

in failures, always same two error messages, abbreviated here:

    * Couldn't match type m' with m0'
* No instance for (KnownNat m0) arising from a use of five'

i suspect infix $is not a normal function; it seems to be getting special treatment by ghc. {-# LANGUAGE RankNTypes #-} -- next two needed for type signature of ($$) {-# LANGUAGE KindSignatures #-} {-# LANGUAGE PolyKinds #-} module Main where { import qualified Numeric.Modular; -- https://hackage.haskell.org/package/modular import GHC.TypeNats(KnownNat); import GHC.Exts(TYPE);main :: IO(); main = undefined;five :: forall m . KnownNat m => Numeric.Modular.Mod m; five = Numeric.Modular.mkMod 5;mod3 :: (forall m . KnownNat m => Numeric.Modular.Mod m) -> Integer; mod3 = Numeric.Modular.withMod 3;($$) = ($);-- copy the definition of ($) from Prelude infixr 0 $$; ($$$) :: forall r a (b :: TYPE r). (a -> b) -> a -> b; f $$x = f x; {-# INLINE ($$$) #-}; -- the semicolon is necessary!-- compute 5 mod 3 = 2 answer :: Integer; -- answer = mod3 five; -- works answer = mod3$ five; -- works -- answer = mod3 $$five; -- fails to compile -- answer = mod3$$$five; -- fails to compile -- answer = ($) mod3 five; -- fails to compile} --end

# Competitive programming in Haskell: folding challenge

As of this writing, I am the only person who has solved Origami on Open Kattis (see the problem statistics here). This is a shame since it is a very nice problem! Of course, I solved it in Haskell.

I challenge you to solve itâ€”bonus points for using a fold in your solution!

## February 17, 2021

### Yesod Web Framework

Three years ago, I met fellow Haskeller Brian Hurt while working with Obsidian Systems on a Reflex project. Not long after, he started telling me his ideas about how to fix social media. These ideas intrigued me, one thing led to another, and we began building ChatWisely. We thought other Haskellers might like to hear about what we’re doing and how we use Haskell to do it.

ChatWisely is a member-supported mini-blogging social network currently in open beta. We envision a place where people connect in a spirit of comradery. We see an elevated discourse and a way to show bullies the door by providing a platform to debate safely. Here’s how we’re doing it.

## Safety First

Brian and I built several mechanisms to filter out people looking for a fight or to harm others. First and foremost will be the monthly subscription fee of one dollar. We think that will discourage a large portion of toxic people. Another is a sharable mute list that we believe will help mitigate the rest. And finally, in the event of a serious Terms of Service violation, we ban payment methods rather than just accounts.

## Mini-Blogging

The big idea here is that sometimes, a short post isn’t enough. Brian and I made a way to link that short post to a longer one. So the timeline looks something like Twitter’s, but with some posts that can expand to something more detailed. These can be connected to other people’s posts to create a continuity in conversation hard to come by on other platforms. So when another member’s post inspires you to write a longer one about your experience with the ghcjs ffi (for example), you can link your post to theirs.

Members can organize their timeline and choose to what extent they follow other people’s posts. The typical mainstream social network requires that when you follow someone you must follow everything they post, or nothing. Sure, there are filtered word lists in some cases. But none of it seems to work quite right. Instead, we have groups called caboodles that members can use to decide where other people’s posts fit, and how to share their own. So say someone likes their uncle’s cookie recipes but not his political posts. They can follow one but not the other.

## Geolocated Messaging

One day this pandemic will be over and we’ll be there to meet that day. At that time, when a member’s movie caboodle wants to organize local screenings of the latest blockbuster from the House of Mouse they can make posts visible to people in their proximity. Perhaps you want to target local Haskellers to organize a meetup, or leave them a message that pops up when they’ve found the meeting place. Also, I think running a scavenger hunt with geolocated clues sounds like a hoot.

## How It’s Built

Brian and I rely heavily on the Haskell ecosystem to build ChatWisely. Haskell’s type system reduces errors and cognitive load. GHC is our pair programming partner that tightens the delivery cycle and lets us learn how to build ChatWisely while we’re building it. Refactoring is a breeze, and unit testing is constrained to the I/O edges of the system, which means we spend less time on that and more time building the product. Here’s the principal tools we rely on to get the job done.

#### Ghcjs

The fact is we all hate javascript. The problem is, we can’t build web apps without it. Ghcjs lets us deal with the realities of building a product that uses a web browser for a client. The FFI lets us wrap our hand-written javascript in Haskell which helps to keep that part of the codebase pretty small.. We especially love what is built on top of that, Reflex-Dom.

#### Reflex-Dom

Reflex helps us build a system that needs to adapt to changes in data, requirements and platform. We’re learning how to build ChatWisely as we build it, and reflex keeps up with our changing ideas on how to do that. Our first app store product will be a PWA, delivered with reflex.

#### Servant

Servant delivers the API, and requires us to separate definition from implementation. This helps us keep the backend from turning into a big ball of mud. We can auto-generate clients, which we currently use in unit testing. They even have a way to generate a reflex client, and we’ll be adding that in the near future.

#### Yesod

We use Yesod for marketing tasks, which largely require static pages. Currently it handles our landing page and unsubscribe mechanism for our emails. The Yesod Widget is a monoid, and therefore composable, which makes structuring html simple.

## Three Reasons Why People Should Use ChatWisely.

#### We are member supported

We won’t have ads, which means we have no need to manipulate people’s timelines in order to serve those ads. Their timeline should be about conversations of interest.

#### We solve the tweet thread problem

Brian and I find tweet threads hard to follow. Our mini-blog looks like twitter in the sense that you get a timeline of short posts. However if a post is the beginning of something more developed, that message can open up to access it.

#### Keep the RealWorld conversation going

We have delayed the development of these features for obvious reasons. But one day we’ll be together again. By then we’ll have useful geo-location tools for conference attendees and speakers to continue the conversation.

What makes a weekend conference fun for me are the conversations in-between formal talks. I get all caught up in compelling conversations, and want to keep that going. We’ll have a way to do that, without having to know or remember anyone’s email address or phone number.

Conference speakers will often want to build on the momentum gathered after a successful talk. Brian and I think Twitter hashtags are the terrible but often only way to do this. We’ll have a way to use proximity and common interests to help build that momentum and keep everyone engaged.

We built ChatWisely as a response to the unpleasantness all too common on mainstream social networks. Depending on our membership for support creates the place we want to meet because ad revenue and data-mining motivates engagement, not conversations and connection. No one is fooled by what mainstream social networks call engagement because that looks to us like derailed conversations, confusing timelines we only have a shallow control over, and unsafe situations.

Brian and I love the daily experience of building ChatWisely, the Haskell ecosystem brings joy to the experience of running our startup. You can support us on patreon and should come by and test the beta. We look forward to hearing from you about any ideas or questions you may have.

## February 15, 2021

We're back again with some more site improvements, this time to our Advanced Content. All of these series now have improved syntax highlighting and code blocks for better readability. In addition, we've revised three of them with updated companion code! Here's a summary.

Once you've mastered the foundations of the language, this series should be your first stop! It will walk you through several different libraries demonstrating how you can perform some real-world tasks with Haskell, like connecting to a database and running a web server. You can follow along with all the code here on GitHub.

## Parsing Series

As a functional language, Haskell thrives on being able to seemlessly compose smaller pieces of code together into a large, coherent whole. Parsing libraries are one area where this paradigm fits very well. In this series, we go through a few different parsing libraries and compare them. The code is available in this repository if you want to try it out yourself!

## API Integrations Series

A lot of coding projects involved connected with outside services and APIs. Luckily, Haskell has a few libraries for interacting with these services! In this series, we'll explore integrations with Twilio and Mailgun so that we can send text messages and emails to your users! You can get a detailed breakdown of the code on GitHub. You can even fork the repository and run the code for yourself!

## What's Coming Up?

Our next area of focus will be working on a first release of Haskellings, an interactive beginner tutorial for the language. We built this over the course of the last few months of 2020 in an extended video series that you can catch here on YouTube. The project is Open Source and currently available for contributions! Stay tuned for more updates on it!

# Short Circuit Sum in Rust

This blog post is the first in a planned series I'm calling "Rust quickies." In my training sessions, we often come up with quick examples to demonstrate some point. Instead of forgetting about them, I want to put short blog posts together focusing on these examples. Hopefully these will be helpful, enjoy!

FP Complete is looking for Rust and DevOps engineers. Interested in working with us? Check out our jobs page.

## Short circuiting a for loop

Let's say I've got an Iterator of u32s. I want to double each value and print it. Easy enough:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
for x in iter.into_iter().map(|x| x * 2) {
println!("{}", x);
}
}

fn main() {
weird_function(1..10);
}


And now let's say we hate the number 8, and want to stop when we hit it. That's a simple one-line change:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
for x in iter.into_iter().map(|x| x * 2) {
if x == 8 { return } // added this line
println!("{}", x);
}
}


Easy, done, end of story. And for this reason, I recommend using for loops when possible. Even though, from a functional programming background, it feels overly imperative. However, some people out there want to be more functional, so let's explore that.

## for_each vs map

Let's forget about the short-circuiting for a moment. And now we want to go back to the original version of the program, but without using a for loop. Easy enough with the method for_each. It takes a closure, which it runs for each value in the Iterator. Let's check it out:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
iter.into_iter().map(|x| x * 2).for_each(|x| {
println!("{}", x);
})
}


But why, exactly do we need for_each? That seems awfully similar to map, which also applies a function over every value in an Iterator. Trying to make that change, however, demonstrates the problem. With this code:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
iter.into_iter().map(|x| x * 2).map(|x| {
println!("{}", x);
})
}


we get an error message:

error[E0308]: mismatched types
--> src\main.rs:2:5
|
2 | /     iter.into_iter().map(|x| x * 2).map(|x| {
3 | |         println!("{}", x);
4 | |     })
| |______^ expected (), found struct Map


Undaunted, I fix this error by sticking a semicolon at the end of that expression. That generates a warning of unused Map that must be used. And sure enough, running this program produces no output.

The problem is that map doesn't drain the Iterator. Said another way, map is lazy. It adapts one Iterator into a new Iterator. But unless something comes along and drains or forces the Iterator, no actions will occur. By contrast, for_each will always drain an Iterator.

One easy trick to force draining of an Iterator is with the count() method. This will perform some unnecessary work of counting how many values are in the Iterator, but it's not that expensive. Another approach would be to use collect. This one is a little trickier, since collect typically needs some type annotations. But thanks to a fun trick of how FromIterator is implemented for the unit type, we can collect a stream of ()s into a single () value. Meaning, this code works:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
iter.into_iter().map(|x| x * 2).map(|x| {
println!("{}", x);
}).collect()
}


Note the lack of a semicolon at the end there. What do you think will happen if we add in the semicolon?

## Short circuiting

EDIT Enough people have asked "why not use take_while?" that I thought I'd address it. Yes, below, take_while will work for "short circuiting." It's probably even a good idea. But the main goal in this post is to explore some funny implementation approaches, not recommend a best practice. And overall, despite some good arguments for take_while being a good choice here, I still stand by the overall recommendation to prefer for loops for simplicity.

With the for loop approach, stopping at the first 8 was a trivial, 1 line addition. Let's do the same thing here:

fn weird_function(iter: impl IntoIterator<Item=u32>) {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return }
println!("{}", x);
}).collect()
}


Take a guess at what the output will be. Ready? OK, here's the real thing:

2
4
6
10
12
14
16
18


We skipped 8, but we didn't stop. It's the difference between a continue and a break inside the for loop. Why did this happen?

It's important to think about the scope of a return. It will exit the current function. And in this case, the current function isn't weird_function, but the closure inside the map call. This is what makes short-circuiting inside map so difficult.

The same exact comment will apply to for_each. The only way to stop a for_each from continuing is to panic (or abort the program, if you want to get really aggressive).

But with map, we have some ingenious ways of working around this and short-circuiting. Let's see it in action.

## collect an Option

map needs some draining method to drive it. We've been using collect. I've previously discussed the intricacies of this method. One cool feature of collect is that, for Option and Result, it provides short-circuit capabilities. We can modify our program to take advantage of that:

fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<()> {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return None } // short circuit!
println!("{}", x);
Some(()) // keep going!
}).collect()
}


I put a return type of weird_function, though we could also use turbofish on collect and throw away the result. We just need some type annotation to say what we're trying to collect. Since collecting the underlying () values doesn't take up extra memory, this is even pretty efficient! The only cost is the extra Option. But that extra Option is (arguably) useful; it lets us know if we short-circuited or not.

But the story isn't so rosy with other types. Let's say our closure within map returns the x value. In other words, replace the last line with Some(x) instead of Some(()). Now we need to somehow collect up those u32s. Something like this would work:

fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<Vec<u32>> {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return None } // short circuit!
println!("{}", x);
Some(x) // keep going!
}).collect()
}


But that incurs a heap allocation that we don't want! And using count() from before is useless too, since it won't even short circuit.

But we do have one other trick.

## sum

It turns out there's another draining method on Iterator that performs short circuiting: sum. This program works perfectly well:

fn weird_function(iter: impl IntoIterator<Item=u32>) -> Option<u32> {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return None } // short circuit!
println!("{}", x);
Some(x) // keep going!
}).sum()
}


The downside is that it's unnecessarily summing up the values. And maybe that could be a real problem if some kind of overflow occurs. But this mostly works. But is there some way we can stay functional, short circuit, and get no performance overhead? Sure!

## Short

The final trick here is to create a new helper type for summing up an Iterator. But this thing won't really sum. Instead, it will throw away all of the values, and stop as soon as it sees an Option. Let's see it in practice:

#[derive(Debug)]
enum Short {
Stopped,
Completed,
}

impl<T> std::iter::Sum<Option<T>> for Short {
fn sum<I: Iterator<Item = Option<T>>>(iter: I) -> Self {
for x in iter {
if let None = x { return Short::Stopped }
}
Short::Completed
}
}
fn weird_function(iter: impl IntoIterator<Item=u32>) -> Short {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return None } // short circuit!
println!("{}", x);
Some(x) // keep going!
}).sum()
}

fn main() {
println!("{:?}", weird_function(1..10));
}


And voila! We're done!

Exercise It's pretty cheeky to use sum here. collect makes more sense. Replace sum with collect, and then change the Sum implementation into something else. Solution at the end.

## Conclusion

That's a lot of work to be functional. Rust has a great story around short circuiting. And it's not just with return, break, and continue. It's with the ? try operator, which forms the basis of error handling in Rust. There are times when you'll want to use Iterator adapters, async streaming adapters, and functional-style code. But unless you have a pressing need, my recommendation is to stick to for loops.

If you liked this post, and would like to see more Rust quickies, let me know. You may also like these other pages:

## Solution

use std::iter::FromIterator;

#[derive(Debug)]
enum Short {
Stopped,
Completed,
}

impl<T> FromIterator<Option<T>> for Short {
fn from_iter<I: IntoIterator<Item = Option<T>>>(iter: I) -> Self {
for x in iter {
if let None = x { return Short::Stopped }
}
Short::Completed
}
}
fn weird_function(iter: impl IntoIterator<Item=u32>) -> Short {
iter.into_iter().map(|x| x * 2).map(|x| {
if x == 8 { return None } // short circuit!
println!("{}", x);
Some(x) // keep going!
}).collect()
}

fn main() {
println!("{:?}", weird_function(1..10));
}
`