# Working with Hedis

I'm now writing the second Haskell service using Redis to store data. There are a few packages on Hackage related to Redis but I only found 2 client libraries, redis-io and hedis. I must say I like the API of redis-io better, but it breaks a rule I hold very dear:

> Libraries should never log, that's the sole responsibility of the application.

So, hedis it is. I tried using the API as is, but found it really cumbersome so looked around and after some inspiration from hedis-simple I came up with the following functions.

First a wrapper around a Redis function that put everything into ExceptionT with a function that transforms a reply into an Exception.

lpush :: Exception e => (Reply -> e) -> ByteString -> [ByteString] -> ExceptionT Redis Integer
lpush mapper key element = ExceptionT $replyToExc <$> R.lpush key element
where
replyToExc = first (toException . mapper)


I found wrapping up functions like this is simple, but repetitive.

Finally I need a way to run the whole thing and unwrap it all back to IO:

runRedis :: Connection -> ExceptionT Redis a -> IO (Either SomeException a)
runRedis conn = R.runRedis conn . runExceptionT


## May 06, 2021

### Michael Snoyman

<quick-intro> In my previous post, I mentioned that we were opening up the Haskell Foundation Slack instance for more people. I'd like to emphasize that a second time, and mention that we are trying to use Slack as a tech track communication hub. We're actively looking for contributors to join up on some of these initiatives and bring them to reality. If the previous topics I've mentioned (like the Vector Types Proposal) or topics like improving Haskell educational material sound interesting, please join the Slack, introduce yourself in #tech-track, and tell us what you'd be interested in helping out with. You'll get to be part of some hopefully significant improvements to the Haskell ecosystem, and have access to a lot of other collaborators and mentors to make this happen. </quick-intro>

Also, we're working to get a unified listing of the active projects and what kind of contributors we're looking for, so stay tuned for that too. Anyway, on to the newest proposal!

## Stack

There are currently multiple ways to install Haskell tooling, and not all of them work for everyone. Personally, I've worked on two such methods: first MinGHC (a Windows installer), and later Stack, which subsumed MinGHC. Stack combines both a build tool and a toolchain installer, and manages GHC installations as well as—on Windows—install msys. And if it seems like Windows is coming up a lot here... that's a theme.

Side note What's msys? Why do we need it? msys is essentially a bunch of POSIX utilities ported to run on Windows. The most basic problem is solves is allowing packages with configure scripts, such as network, to be installed on Windows. It can also be used to provide a package manager for getting shared libraries on Windows. For this post, take as a given that having msys on Windows is basically a necessity.

Anyway, with Stack, you have a set of different installers for the Stack binary itself: a GUI installer for Windows or a curl-able script for POSIX-like systems. Stack then can install GHC and msys2 into user-specific directories, and provides commands like stack ghc or stack --resolver ghc-8.10.4 exec to run commands with the appropriate toolchain configured in environment variables.

There are basically three problems with this setup:

1. Some people really don't like user-specific installs, and want tooling installed system-globally. Or they really want package managers to maintain toolchains. Or a few other such goals. There are inherent problems with some of those goals, such as distributions having out-of-date installers, or permissions management, or others. The per-user toolchain installer has become pretty standards across different languages (Python, Ruby, Rust, etc), and is unlikely to change any time soon.
2. Some people really don't like typing in stack ghc instead of ghc. I'm overall a fan of needing the wrapper tooling, since it allows you to trivially switch which toolchain you're using, and makes it more blatant that certain environment variables (like STACK_YAML) and files (like stack.yaml) may affect behavior.
3. Stack isn't the only tool in town. For example, many people want to use cabal-install instead. Stack doesn't have support for downloading cabal-install binaries right now. Even if it did, it's not clear that the cabal-install team will consent to an installer that leverages Stack like this. But there are even more binaries we may wish to install and manage going forward, such as linters, stylers, IDE integration tools, etc.

But if you're using Stack to build your projects, "how do I install GHC and associated tools" is basically a solved problem. Perhaps (2) irritates some people, but we'll get back to that. Perhaps you don't like curl-able scripts, or Windows installers, or user-specific installs. You can go ahead and manually download the Stack executable and use things like { system-ghc: true, install-ghc: false }. But from my experience, most people just use the defaults and are happy enough.

## Everything else

But as mentioned, not everyone is on Stack. And there's other tooling we'd like to be able to install. And we don't want to bifurcate our install methods too much. And it's kind of weird that Stack is both a toolchain manager and a build tool. And frankly, it would be nice to finally get a downloads page that everyone can agree with.

There are other approaches to installing toolchains available today. ghcup, Haskell Platform (to some extent), manually ./configure && make installing GHC bindists, distro specific methods, etc. Some of these solves some of the three problems I mentioned with Stack. Some solve other problems as well. Some create new problems. But a recurring theme in particular that the Haskell Foundation members have heard is that the story for non-Stack users on Windows is particularly painful.

Let me clarify that I'm wearing two hats with what I'm about to say:

1. A maintainer of Stack and the original author of much of Stack's toolchain install logic, who would always rather see the Stack codebase size go down instead of up.
2. A member of the Haskell Foundation board, who is looking to improve the Haskell adoption story.

So the problem statement: can't we standardize an install method that meets most people's goal, most of the time?

## The goal

I'll start by laying out the ultimate goal that's in my head. This idea's been bouncing around for a few months now, but only came up in May 4, 2021 tech track meeting for the Haskell Foundation. I don't think it's that ambitious of a goal. But in the next section, I'm also going to lay out some shorter term milestones.

What I'd like to see is a well maintained Haskell library for installing Haskell toolchains, generally acknowledged as the de facto standard way to install GHC and other tools. It should be Haskell, so that it's easy for Haskellers to maintain. It should support as many operating systems as possible. It should provide an easy-to-use Haskell API that other tooling can leverage.

In addition, we should have some executables built against that library that provide a nice command line interface for managing toolchains. Some examples of these executables would be:

• The primary installer executable, that let's you say things like "install GHC 9.0.1" and "run the following command with env vars set so that it will use GHC 9.0.1"
• A ghc/ghc.exe shim executable that does the same thing as above, but doesn't require a longer command line like haskell-install --ghc 9.0.1 --install-ghc exec -- ghc -O2 Main.hs. Instead, with this ghc/ghc.exe shim, all of that would be implied. How does it know what GHC version to use? How do you override it? All fair questions I'm going to gloriously punt on for now.
• We can build out additional shim executables for other common tools, like runghc, or ghci, or... and this is the cool part... shim executables for Stack and cabal-install.

Next, we're going to build installers for all of these executables. On Windows, it will be a GUI installer. On Linux, OS X, and others, we can take our pick, though starting with curl-able scripts makes the most sense in my opinion.

Stack no longer will have logic in its codebase for installing GHC. Instead, it will rely on this new library. Or, perhaps, it won't even do that. Perhaps it will be the new tool's job to ensure the correct GHC is available for Stack invocations. I'm not quite sure.

cabal-install doesn't currently maintain a GHC toolchain, so I think it would fit into this model perfectly without any need for modification.

## The milestones

The biggest pain point we want to immediately solve is to get a Windows installer out the door. The short term goal for this is to provide a brand new executable, build against Stack.Setup, and build an installer around it. Maybe that installer will include recent cabal-install and Stack executables. The details on what exactly it will do are still slightly fuzzy. But the point is: we're just writing an executable that reuses existing installer logic within Stack's library.

The next step is creating the new installer library. My recommendation is to simply cleave off the aforementioned Stack.Setup module into a new package and then modify as needed. Modifications may include adding support for installing other tools, updating the location of config files for specifying toolchain binaries, or more. But Stack.Setup is a module that's been around for a long time now, and has worked out a lot of these details. I'd advise against recreating the wheel here.

With that in place, we can begin to work on the other shim executables mentioned above, and begin to look at how this new installer library and the shim executables interact with the rest of the installer ecosystem. That's an open discussion yet to happen.

And in parallel to that, the Stack codebase itself can begin its move over to the new library.

## Inspiration

I'm trying to avoid sounding like a broken record, but a lot of the inspiration for my thoughts here come from rustup. Honestly though, most of the ideas of how this installer should work predate Rust, and come from a combination of experience with MinGHC and seeing how tools like virtualenv and RVM work.

## Get involved!

I'm going to put this blog post on Discourse, as I've been doing with previous posts. But I'm going to strongly encourage anyone interested in this to join up on Slack too. We've created a new channel for this topic, #tech-track-installer. If you're interested in the topic, please join up!

I'm not just asking for people to provide some feedback, though that's welcome. I'm looking for people to contribute, work on code, and more broadly, take ownership of this work. I think this is a relatively small, but massively vital, piece of Haskell infrastructure we're trying to solidify. Someone looking to get deeper into open source maintenance would be great. And you'll have plenty of people available to provide guidance when desired.

## Stack/cabal-install/GHC mapping

One final idea I was playing with, and thought I'd leave as more of a footnote. One of the painpoints in maintaining build tooling (Stack or cabal-install) is maintaining backwards compatibility with old versions of GHC. However, other languages don't always do that. For example, with Rust, each new compiler version gets a new Cargo (the build tool) version:

> cargo --version
cargo 1.51.0 (43b129a20 2021-03-16)
> rustc --version
rustc 1.51.0 (2fd73fabe 2021-03-23)


With a toolchain installer in place like I'm describing, we could end up in that reality for Haskell too. Perhaps we could have a config file that essentially says, for each GHC version, what the recommended version is of cabal-install, Stack, msys2, stylish-haskell, and a slew of other tools. You could override it if desired, but the default would be the right thing for most people. Talking about a unified "version 9.2.1" Haskell toolchain that represents the compiler, standard libraries, and add-on tooling sounds like a huge step forward for easy adoption, at least to me.

# PyTorch Developer Podcast

I'm launching a new podcast, the PyTorch Developer Podcast. The idea is to be a place for the PyTorch dev team to do bite sized (10-20 min) topics about all sorts of internal development topics in PyTorch. For now, it's just me monologuing for fifteen minutes about whatever topic I decide. The plan is to release an episode daily, five days a week, until I run out of things to say (probably not for a while, I have SO MANY THINGS TO SAY). I don't edit the podcasts and do minimal planning, so they're a bit easier to do than blog posts. Check it out! There's two episodes out already, one about how we do Python bindings for our C++ objects and another about history and constraints of the dispatcher. If there are any topics you'd like me to cover, give a shout.

# The trick to avoid deeply-nested error-handling code

either-trick

This post documents a small trick that I use to avoid deeply-nested error-handling code. This trick is a common piece of Haskell folklore that many people either learn from others or figure out on their own, but I’m not sure what the official name of this trick is (so I had difficulty searching for prior art explaining this trick). However, I’ve taught this trick to others enough times that I think it merits a blog post of its own.

This post assumes some familiarity with Haskell’s Either type and do notation, but the Appendix at the end of the post will walk through all of the details using equational reasoning if you’re having trouble following along with how things work.

#### The motivating example

Let’s begin from the following contrived Either-based example that uses deeply nested error-handling:

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)example :: String -> String -> Either String Personexample ageString aliveString = do    case readMaybe ageString of        Nothing -> do            Left "Invalid age string"        Just age -> do            if age < 0                then do                    Left "Negative age"                else do                    case readMaybe aliveString of                        Nothing -> do                            Left "Invalid alive string"                        Just alive -> do                            return Person{ age, alive }

… which we can use like this:

>>> example "24" "True"Right (Person {age = 24, alive = True})>>> example "24" "true"Left "Invalid alive string">>> example "" "True"Left "Invalid age string">>> example "-5" "True"Left "Negative age"

Notice how the above coding style increases the nesting / indentation level every time we parse or validate the input in some way. We could conserve indentation by using 2-space indents or indenting only once for each level of nesting:

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)example :: String -> String -> Either String Personexample ageString aliveString = case readMaybe ageString of  Nothing  -> Left "Invalid age string"  Just age -> if age < 0    then Left "Negative age"    else case readMaybe aliveString of      Nothing    -> Left "Invalid alive string"      Just alive -> return Person{ age, alive }

However, I think most people writing code like that would prefer to keep the indentation level stable, no matter how many validations the code requires.

#### The trick

Fortunately, we can avoid nesting the code with the following change:

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)example :: String -> String -> Either String Personexample ageString aliveString = do    age <- case readMaybe ageString of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe aliveString of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }

Here we make use of several useful properties:

• return in Haskell does not short-circuit or exit from the surrounding code

In fact, some Haskell programmers prefer to use pure (a synonym for return) to better convey the absence of short-circuiting behavior:

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)example :: String -> String -> Either String Personexample ageString aliveString = do    age <- case readMaybe ageString of        Nothing  -> Left "Invalid age string"        Just age -> pure age    if age < 0        then Left "Negative age"        else pure ()    alive <- case readMaybe aliveString of        Nothing    -> Left "Invalid alive string"        Just alive -> pure alive    pure Person{ age, alive }
• Left does short-circuit from the surrounding code

In fact, we can define a synonym for Left named throw if we want to better convey the presence of short-circuiting behavior::

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)example :: String -> String -> Either String Personexample ageString aliveString = do    age <- case readMaybe ageString of        Nothing  -> throw "Invalid age string"        Just age -> pure age    if age < 0        then throw "Negative age"        else pure ()    alive <- case readMaybe aliveString of        Nothing    -> throw "Invalid alive string"        Just alive -> pure alive    pure Person{ age, alive }throw :: String -> Either String athrow = Left
• Left / throw “return” any type of value

If you look at the type of throw, the “return” type is a polymorphic type a because throw short-circuits (and therefore doesn’t need to return a real value of that type).

This is why the type checker doesn’t complain when we do this:

age <- case readMaybe ageString of    Nothing  -> throw "Invalid age string"    Just age -> pure age

… because both branches of the case expression type-check as an expression that “return”s an Int. The left branch type-checks as a branch that returns an Int because it cheats and never needs to return anything and the right branch returns a bona-fide Int (the age).

We can make this explicit by giving both branches of the case expression an explicit type annotation:

age <- case readMaybe ageString of    Nothing  -> throw "Invalid age string" :: Either String Int    Just age -> pure age                   :: Either String Int

This means that both branches of a case expression will always share the same return type if at least one branch is a Left / throw. Similarly, both branches of an if expression will share the same return type if at least one branch is a Left / throw:

if age < 0    then throw "Negative age" :: Either String ()    else pure ()              :: Either String ()
• We can return a value from a case expression

New Haskell programmers might not realize that case expressions can return a value (just like any other expression), which means that their result can be stored as a new value within the surrounding scope:

age <- case readMaybe ageString of    Nothing  -> throw "Invalid age string" :: Either String Int    Just age -> pure age                   :: Either String Int

The type-checker doesn’t mind that only the second branch returns a valid age because it knows that the outer age is unreachable if the first branch short-circuits. This understanding is not built into the compiler, but is rather an emergent property of how do notation works for Either. See the Appendix for a fully-worked equational reasoning example showing why this works.

#### Combinators

You can build upon this trick by defining helpful utility functions to simplify things further. For example, I sometimes like to define the following helper function:

orDie :: Maybe a -> String -> Either String aJust a  orDie _      = return aNothing orDie string = Left string{- Equivalent, more explicit, implementation:maybe orDie string =    case maybe of        Nothing -> Left string        Just a  -> return a-}

… which you can use like this:

{-# LANGUAGE NamedFieldPuns #-}import Text.Read (readMaybe)data Person = Person { age :: Int, alive :: Bool } deriving (Show)orDie :: Maybe a -> String -> Either String aJust a  orDie _      = Right aNothing orDie string = Left stringexample :: String -> String -> Either String Personexample ageString aliveString = do    age <- readMaybe ageString orDie "Invalid age string"    if age < 0        then Left "Negative age"        else return ()    alive <- readMaybe aliveString orDie "Invalid alive string"    return Person{ age, alive }

… which is even more clear (in my opinion).

#### Conclusion

That is the entirety of the trick. You can return values from case expressions to avoid deeply-nesting your Either code, or you can define utility functions (such as orDie) which do essentially the same thing.

This trick applies equally well to any other Monad that supports some notion of short-circuiting on failure, such as ExceptT (from transformers / mtl). The only essential ingredient is some throw-like function that short-circuits and therefore has a polymorphic return type.

#### Appendix

You can build a better intuition for why this works using equational reasoning. We’ll begin from an example usage of our function and at each step of the process we will either desugar the code or substitute one subexpression with another equal subexpression.

In all of the examples, we’ll begin from this definition for the example function:

example :: String -> String -> Either String Personexample ageString aliveString = do    age <- case readMaybe ageString of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe aliveString of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }

Let’s first begin with the example that fails the most quickly:

example "" "True"-- Substitute example with its definition:---- … replacing ageString with ""---- … replacing aliveString with "True"do  age <- case readMaybe "" of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Definition of readMaybe (not shown):---- (readMaybe "" :: Maybe Int) = Nothingdo  age <- case Nothing of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Simplify the first case expressiondo  age <- Left "Invalid age string"    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Desugar do notation one step---- (do age <- m; n) = (m >>= \age -> n)Left "Invalid age string" >>= \age ->    do  if age < 0            then Left "Negative age"            else return ()        alive <- case readMaybe "True" of            Nothing    -> Left "Invalid alive string"            Just alive -> return alive        return Person{ age, alive }-- Definition of >>= for Either---- (Left x >>= _) = (Left x)Left "Invalid age string"

… and we’re done! The key step is the last bit where we simplify (Left … >>= _) to (Left …), which is how Either short-circuits on failure. Because this simplification does not use the downstream code everything type-checks just fine even though we never supply a valid age.

For completeness, let’s also walk through the example where everything succeeds:

example "24" "True"-- Substitute example with its definition:---- … replacing ageString with "24"---- … replacing aliveString with "True"do  age <- case readMaybe "24" of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Definition of readMaybe (not shown):---- (readMaybe "24" :: Maybe Int) = (Just 24)do  age <- case Just 24 of        Nothing  -> Left "Invalid age string"        Just age -> return age    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Simplify the first case expressiondo  age <- return 24    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- return = Rightdo  age <- Right 24    if age < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age, alive }-- Desugar do notation one step---- (do age <- m; n) = (m >>= \age -> n)Right 24 >>= \age ->    do  if age < 0            then Left "Negative age"            else return ()        alive <- case readMaybe "True" of            Nothing    -> Left "Invalid alive string"            Just alive -> return alive        return Person{ age, alive }-- Definition of >>= for Either---- (Right x >>= f) = (f x)---- This means that we substitute age with 24do  if 24 < 0        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- (24 < 0) = Falsedo  if False        then Left "Negative age"        else return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- (if False then l else r) = rdo  return ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- return = Rightdo  Right ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- (do m; n) = (do _ <- m; n)do  _ <- Right ()    alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- Desugar do notation one step---- (do age <- m; n) = (m >>= \age -> n)Right () >>= _ ->     do  alive <- case readMaybe "True" of            Nothing    -> Left "Invalid alive string"            Just alive -> return alive        return Person{ age = 24, alive }-- Definition of >>= for Either---- (Right x >>= f) = (f x)---- This means that we substitute _ (unused) with ()do  alive <- case readMaybe "True" of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- Definition of readMaybe (not shown):---- (readMaybe "True" :: Maybe Bool) = (Just True)do  alive <- case Just True of        Nothing    -> Left "Invalid alive string"        Just alive -> return alive    return Person{ age = 24, alive }-- Simplify the case expressiondo  alive <- return True    return Person{ age = 24, alive }-- return = Rightdo  alive <- Right True    return Person{ age = 24, alive }-- Desugar do notation one step---- (do age <- m; n) = (m >>= \age -> n)Right True >>= \alive ->    do  return Person{ age = 24, alive }-- Definition of >>= for Either---- (Right x >>= f) = (f x)---- This means that we substitute alive with Truedo  return Person{ age = 24, alive = True }-- Desugar do notation---- do m = mreturn Person{ age = 24, alive = True }-- return = RightRight Person{ age = 24, alive = True }

As an exercise, you can try to extrapolate between those two examples to reason through what happens when we evaluate the remaining two examples which fail somewhere in between:

>>> example "24" "true">>> example "-5" "True"

# Specifying State Machines with Temporal Logic

Quickstrom uses linear temporal logic (LTL) for specifying web applications. When explaining how it works, I’ve found that the basics of LTL are intuitive to newcomers. On the other hand, it’s not obvious how to specify real-world systems using LTL. That’s why I’m sharing some of my learnings and ideas from the past year in the form of blog posts.

This post focuses on how to use LTL to specify systems in terms of state machines. It’s a brief overview that avoids going into too much detail. For more information on how to test web applications using such specifications, see the Quickstrom documentation.

To avoid possible confusion, I want to start by pointing out that a state machine specification in this context is not the same as a model in TLA+ (or similar modeling languages.) We’re not building a model to prove or check properties against. Rather, we’re defining properties in terms of state machine transitions, and the end goal is to test actual system behavior (e.g. web applications, desktop applications, APIs) by checking that recorded traces match our specifications.

## Linear Temporal Logic

In this post, we’ll be using an LTL language. It’s a sketch of a future specification language for Quickstrom.

A formula (plural formulae) is a logical expression that evaluates to true or false. We have the constants:

• true
• false

We combine formulae using the logical connectives, e.g:

• &&
• ||
• not
• ==>

The ==> operator is implication. So far we have propositional logic, but we need a few more things.

## Temporal Operators

At the core of our language we have the notion of state. Systems change state over time, and we’d like to express that in our specifications. But the formulae we’ve seen so far do not deal with time. For that, we use temporal operators.

To illustrate how the temporal operators work, I’ll use diagrams to visualize traces (sequences of states). A black circle denotes a state in which the formula is true, and a white circle denotes a state where the formula is false.

For example, let’s say we have two formulae, P and Q, where:

• P is true in the first and second state
• Q is true in the second state

Both formulae are false in all other states. The formulae and trace would be visualized as follows:

P   ●───●───○
Q   ○───●───○

Note that in these diagrams, we assume that the last state repeats forever. This might seem a bit weird, but drawing an infinite number of states is problematic.

### Next

The next operator takes a formula as an argument and evaluates it in the next state.

next P   ●───○───○
P        ○───●───○

The next operator is relative to the current state, not the first state in the trace. This means that we can nest nexts to reach further into the future.

next (next P)   ●───●───○───○───○
next P          ○───●───●───○───○
P               ○───○───●───●───○

### Next for State Transitions

All right, time for a more concrete example, something we’ll evolve throughout this post. Let’s say we have a formula gdprConsentIsVisible which is true when the GDPR consent screen is visible. We specify that the screen should be visible in the current and next state like so:

gdprConsentIsVisible && next gdprConsentIsVisible

A pair of consecutive states is called a step. When specifying state machines, we use the next operator to describe state transitions. A state transition formula is a logical predicate on a step.

In the GDPR example above, we said that the consent screen should stay visible in both states of the step. If we want to describe a state change in the consent screen’s visibility, we can say:

gdprConsentIsVisible && next (not gdprConsentIsVisible)

The formula describes a state transition from a visible to a hidden consent screen.

### Always

But interesting state machines usually have more than one possible transition, and interesting behaviors likely contain multiple steps.

While we could nest formulae containing the next operator, we’d be stuck with specifications only describing a finite number of transitions.

Consider the following, where we like to state that the GDPR consent screen should always be visible:

gdprConsentIsVisible && next (gdprConsentIsVisible && next ...)

This doesn’t work for state machines with cycles, i.e. with possibly infinite traces, because we can only nest a finite number of next operators. We want state machine specifications that describe any number of transitions.

This is where we pick up the always operator. It takes a formula as an argument, and it’s true if the given formula is true in the current state and in all future states.

always P   ●───●───●───●───●
P          ●───●───●───●───●
always Q   ○───○───●───●───●
Q          ●───○───●───●───●

Note how always Q is true in the third state and onwards, because that’s when Q becomes true in the current and all future states.

Let’s revisit the always-visible consent screen specification. Instead of trying to nest an infinite amount of next formulae, we instead say:

always gdprConsentIsVisible

Neat! This is called an invariant property. Invariants are assertions on individual states, and an invariant property says that it must hold for every state in the trace.

### Always for State Machines

Now, let’s up our game. To specify the system as a state machine, we can combine transitions with disjunction (||) and the always operator. First, we define the individual transition formulae open and close:

let open =
not gdprConsentIsVisible && next gdprConsentIsVisible;

let close =
gdprConsentIsVisible && next (not gdprConsentIsVisible);

Our state machine formula says that it always transitions as described by open or close:

always (open || close)

We have a state machine specification! Note that this specification only allows for transitions where the visibility of the consent screen changes back and forth.

So far we’ve only seen examples of safety properties. Those are properties that specify that “nothing bad happens.” But we also want to specify that systems somehow make progress. The following two temporal operators let us specify liveness properties, i.e. “good things eventually happen.”

Quickstrom does not support liveness properties yet.1

### Eventually

We’ve used next to specify transitions, and always to specify invariants and state machines. But we might also want to use liveness properties in our specifications. In this case, we are not talking about specific steps, but rather goals.

The temporal operator eventually takes a formula as an argument, and it’s true if the given formula is true in the current or any future state.

eventually P   ○───○───○───○───○
P              ○───○───○───○───○
eventually Q   ●───●───●───●───○
Q              ○───○───○───●───○

For instance, we could say that the consent screen should initially be visible and eventually be hidden:

gdprConsentIsVisible && eventually (not gdprConsentIsVisible)

This doesn’t say that it stays hidden. It may become visible again, and our specification would allow that. To specify that it should stay hidden, we use a combination of eventually and always:

gdprConsentIsVisible && eventually (always (not gdprConsentIsVisible))

Let’s look at a diagram to understand this combination of temporal operators better:

eventually (always P)   ○───○───○───○───○
P                       ○───○───●───●───○
eventually (always Q)   ●───●───●───●───●
Q                       ○───○───●───●───●

The formula eventually (always P) is not true in any state, because P never starts being true forever. The other formula, eventually (always Q), is true in all states because Q becomes true forever in the third state.

### Until

The last temporal operator I want to discuss is until. For P until Q to be true, P must be true until Q becomes true.

P until Q   ●───●───●───●───○
P           ●───●───○───○───○
Q           ○───○───●───●───○

Just as with the eventually operator, the stop condition (Q) doesn’t have to stay true forever, but it has to be true at least once.

The until operator is more expressive than always and eventually, and they can both be defined using until.2

Anyway, let’s get back to our running example. Suppose we have another formula supportChatVisible that is true when the support chat button is shown. We want to make sure it doesn’t show up until after the GDPR consent screen is closed:

not supportChatVisible until not gdprConsentIsVisible

The negations make it a bit harder to read, but it’s equivalent to the informal statement: “the support chat button is hidden at least until the GDPR consent screen is hidden.” It doesn’t demand that the support chat button is ever visible, though. For that, we instead say:

gdprConsentIsVisible
until (supportChatVisible && not gdprConsentIsVisible)

In this formula, supportChatVisible has to become true eventually, and at that point the consent screen must be hidden.

### Until for State Machines

We can use the until operator to define a state machine formula where the final transition is more explicit.

Let’s say we want to specify the GDPR consent screen more rigorously. Suppose we already have the possible state transition formulae defined:

• allowCollectedData
• disallowCollectedData
• submit

We can then put together the state machine formula:

let gdprConsentStateMachine =
gdprConsentIsVisible
&& (allowCollectedData || disallowCollectedData)
until (submit && next (not gdprConsentIsVisible));

In this formula we allow any number of allowCollectedData or disallowCollectedData transitions, until the final submit resulting in a closed consent screen.

## What’s next?

We’ve looked at some temporal operators in LTL, and how to use them to specify state machines. I’m hoping this post has given you some ideas and inspiration!

Another blog post worth checking out is TLA+ Action Properties by Hillel Wayne. It’s written specifically for TLA+, but most of the concepts are applicable to LTL and Quickstrom-style specifications.

I intend to write follow-ups, covering atomic propositions, queries, actions, and events. If you want to comment, there are threads on GitHub, Twitter, and on Lobsters. You may also want to sponsor my work.

Thank you Vitor Enes, Andrey Mokhov, Pascal Poizat, and Liam O’Connor for reviewing drafts of this post.

## Footnotes

1. A future version of Quickstrom will use a different flavor of LTL tailored for testing, and that way support liveness properties.↩︎

2. We can define eventually P = true until P, and perhaps a bit harder to grasp, always P = not (true until not P). Or we could say always P = not (eventually not P).↩︎

# Introduction to the Cardano blockchain ledger and smart contracts

Recently, I wanted to better understand how the Cardano blockchain works on a technical level, in particular how the ledger and smart contracts work. Cardano is a rapidly moving target, and I had some trouble finding an accessible, yet detailed introduction. So, I decided to write up an introduction myself, after browsing through the documentation for Cardano, the (separate?) documentation for Plutus, the research papers, and the formal specifications. For me, the most accessible summary was M. Chakravarty et. al.’s paper “Functional Blockchain Contracts”, though for some reason, this paper still has draft status.

Cardano is a decentralized blockchain platform built on a proof-of-stake consensus mechanism. The main cryptocurrency on the Cardano blockchain is called ADA, named after lady Ada Lovelace.

Cardano is developed by the Cardano Foundation. You can find more information about their goals and ecosystem on their website. In this post, I would like to focus on the technology and give a brief, yet accessible introduction to how the Cardano ledger works, i.e. how transactions and smart contracts work. For more insight into the consensus protocol that is used to add transactions to the blockchain in real time, I refer to a blogpost by E. de Vries.

# The Cardano Ledger

## Basic transactions

This section presents the information you need to make sense of basic transactions on the blockchain, which you can view with e.g. the Cardano explorer.

A blockchain is a decentralized ledger. This ledger records transactions between different parties. A simple example of such a transaction is one where Alice (A) sends some quantity of digital currency to Bob (B). In the Cardano blockchain, this digital currency is called ADA. The following figure illustrates a transfer of four ADA from Alice to Bob:

The illustration shows the state before the transaction above the horizontal line, and the state after the transaction below the line: Before, Alice owns 4 ADA, while afterwards, Bob owns 4 ADA.

Unlike in the traditional banking system, where the bank authorizes who can submit transactions, anybody can submit a transaction to the blockchain. However, we have to make sure that only Alice is authorized to add this transaction to the ledger; otherwise, the system becomes useless, as e.g. Bob could attempt to add this transactions for his own benefit without confirmation from Alice. In a blockchain, this authorization is done using public key cryptography: Alice has a private key, which she uses to digitally sign the contents of the transaction. Using Alice’s public key, anyone can verify that it was indeed Alice who signed the transaction, but only someone who knows the private key is able to produce this signature. Typically, Alice uses a wallet software to manage her private key and sign transactions. The following figure illustrates the signed transaction:

If we know Alice’s public key, we can check that the signature is valid and that it was indeed Alice who authorized the transaction. In fact, the blockchain does not keep track of which key belongs to whom; Alice’s public key is simply included in the transaction data. Nothing prevents Alice from using multiple public and private key pairs, corresponding to different “accounts” or “wallets”.

The flip side of not relying on a bank to authorize transactions, and possibly revert them in the case of abuse, is that the security of your money is tied to the security of your private key. Specifically, a) you need to keep your private key secret, because other people can spend your money if they know it, and b) if you lose / forget / hide too well your private key, nobody, including you, can spend your money anymore. It cannot be stressed often enough: Your private key = your money!

In the real Cardano blockchain, parties are not identified by their public keys; instead, money is transferred between addresses. A party can use as many addresses as they like. While the public key of Alice is included in the transaction, the public key of Bob is not included, only one of his addresses. We will explain the relation between keys and addresses in more detail below.

In addition, the Cardano blockchain allows transactions between multiple addresses at once. For instance, a transaction may transfer money from one address to, say, two different addresses. The following figure illustrates a transfer of four ADA from address A, three of which go to address B, and one of which goes to address C:

The Cardano blockchain is public. Anyone can view the transactions on the Cardano explorer. For example, the transaction with ID 825c63… takes 42.42… ADA from an address and transfers 5 ADA to one address and 37.25… to another. In addition, 0.16… ADA is taken as a fee for including the transaction in the blockchain.

It is worth noting that the Cardano ledger does not record total balances of money owned by different addresses. It turns out that this simplifies the ledger model; we will discuss this below. Here, we merely note that storing total balances is not necessary, because the blockchain is public information, and one can compute the address balance by following all transactions from the beginning. The Cardano explorer does that automatically; for example, at the time of this writing, the address addr1q8kzwv… was involved in two transactions and has a total balance of 0 ADA.

On the Cardano blockchain, transactions are grouped into blocks, and blocks are grouped into epochs. This is relevant for determining which of the submitted transactions are to be included in the ledger, via the proof-of-stake consensus protocol, but as far as the currency is concerned, the grouping is irrelevant.

## UTXO

While the previous section explained how to read basic transactions on the Cardano blockchain explorer, this section will present more details on how transactions are constructed. For example, many transaction which represent a simple transfer of currency between two parties are built using three addresses, one input and two output addresses, and sometimes even many more addresses. The reason is that Cardano uses an unspent transaction output (UTXO) model, similar to Bitcoin, which we will explain here.

On the surface, transactions in the unspent transaction output (UTXO) model may look like transfers of funds (money) between addresses, as described previously. However, on a deeper level, the UTXO model is better understood if we take a radical departure from the idea that addresses act as containers for funds. Instead, in the UTXO model, the containers for funds are the transaction outputs. Each transaction consists of a unique transaction ID, a set of inputs, and a list of outputs. Conversely, each output is uniquely identified by its position in the list of outputs and the transaction ID of the associated transaction. The inputs of a new transaction that is added to the blockchain are always references to previous outputs; moreover, once an output has been used as an input to a new transaction, it cannot be used a second time. Put differently, the inputs of new transaction may only refer to unspent transactions, which have not been used as input to another transaction yet. The fact that outputs are associated with addresses and funds is not relevant for being able to connect the inputs and outputs of transactions. The following figure illustrates the blockchain as a set of transactions with connected inputs and outputs

Here, the grey boxes within each figure correspond to inputs (above the line) and outputs (below the line). Each output can be connected to exactly one input; this is visualized by a dangling black line (output) which connects to a dangling red line (input).

The Cardano explorer is not a direct visualization of the UTXO model, because it does not directly show which transaction IDs the inputs reference. However, one can often infer this information from the addresses and funds associated with the input. For example, the address addr1q8kzwv… is involved in two transactions. The last output of the first transaction 34d8… is the input to the second transaction 825c…, because both the address and the amount of ADA are identical.

Funds and addresses do become important when validating transactions. For example, a transaction must conserve funds: The sum of funds of the inputs must be equal to the sum of funds of the outputs (minus a small transaction fee). We will discuss this in the next section.

Transferring funds between parties within the UTXO model often involves multiple inputs and outputs. For example, consider an unspent transaction output which contains 5 ADA. Any transaction that wants to use this output must use the entire sum. If we want to transfer 4 ADA to another party, we cannot do this with a transaction that expects 4 ADA as an input — we would not be able to connect it to the output, because an input and an output must match identically. However, we can create a transaction which takes an input with 5 ADA and transfers 4 ADA to one output with an address belonging to the other party, and 1 ADA to a second output with an address that is under our control. Many transactions on the Cardano blockchain are of this kind, for example the transaction 825c63…. However, it may also happen that we want to transfer 4 ADA, but only three outputs with 1.5 Ada each are available that are owned by us. In this case, we create a transactions with these three inputs and with two outputs, one that sends 3 Ada to the other party, and one with the remaining 0.5 Ada and an address owned by us. For example, the transaction 1bab… uses this scheme, as all the inputs have the same address.

This UTXO approach to transferring funds may seem a little roundabout at first, but the advantage is that all validity checks are confined to transactions — we can connect transactions in any way that adheres to the UTXO principle (each output is spent fully and at most once), and be guaranteed that the ledger conserves money as long as each individual transaction is valid. This simplifies the ledger model, and is an example of “compositionality”.

## Transaction Validation

Having explained how inputs and outputs are connected via the UTXO model, we next turn to the validation of transactions, which is arguably the most important task of a blockchain. We will look at addresses and public keys for authorizing transactions in this sections, and discuss smart contracts in the next section.

The validation of transactions is what enables blockchain ledgers to store monetary value. To be considered valid, a transaction must pass several checks; only then can it be included in the blockchain ledger.

First, the transaction must conform to the UTXO model, i.e. every input refers to a unique unspent output. This effectively prevents double spending of funds.

Second, money must be conserved. The funds associated to the inputs may be redistributed and be assigned differently to the outputs, but the sum of funds of the inputs must be equal to the sum of funds of the outputs, minus a small transaction fee.

Third, the transaction must be authorized by the parties that own the addresses associated to the inputs. As mentioned in a previous section, this is done using public key cryptography. However, an address is not a public key, instead, it is the hash of a public key. This is a neat trick to reduce the size of the addresses. That said, it is impossible to check a signature by using just the hash of the public key. Thus, the transaction explicitly lists all the public keys that hash to the input addresses, thereby revealing the public keys. The transaction must be signed by all corresponding private keys (which must be kept secret by the parties). However, the public keys of the output addresses are not revealed; this saves space, and adds an extra layer of cryptographic protection by not exposing the public keys to a possible (though likely hopeless) offline cryptanalysis.

With these validation checks, the ledger is suitable for storing monetary value.

However, by allowing more complicated validation logic, we can support smart contracts, which are automated contracts for transferring funds and data. To this end, the Cardano ledger associates inputs and outputs not just with funds and addresses, but with additional, custom data which encodes the state of the smart contract. This leads to the Extended UTXO (EUTXO) model, which we discuss in the next section.

At the time of this writing, the data supported by the Cardano ledger, and also its data format, are not set in stone: From time to time, the Cardano blockchain will undergo a hard fork where the ledger format changes. However, thanks to a special programming technique known as a hard fork combinator, these forks are almost seamless: existing transactions on the ledger will be unchanged, and are compatible with all future transactions that are added to the chain. The Cardano ledger has already undergone a number of hard forks; the data formats are documented in formal specifications. The Cardano blockchain started with the Byron format, which supported UTXO model transactions of Ada. Then came the Shelly format, which enabled stake pools and staking rewards. The Mary hard fork enabled multi-assets (Shelly-MA), i.e. the transfer of tokens or currencies besides Ada. The Alonzo hard fork will enable smart contracts.

## Smart Contracts and EUTXO

Blockchains can automate not just simple transfers of money, but also more complex exchanges involving multiple parties and conditions. Smart contracts are pieces of software that describe such exchanges.

On the Cardano blockchain, smart contracts are identified by their addresses. Instead of sending funds to an address controlled by a human, you can send funds to an address representing a smart contract, which will then perform some automated transactions involving your funds. By looking at the source code for this contract beforehand, you can understand exactly what the contract is going to do. The validity checks on the blockchain ensure that the contract is executed exactly as specified.

A simple example of an exchange suitable for a smart contract would be donation matching: Whenever Bob receives a donation, Alice wants to match that donation with an equal amount of money. One possible implementation in terms of a smart contract would would work as follows: First, Alice sends the funds she wants to use for donation matching to the contract. Then, any willing donor, say Dorothee, sends money to the contract, which will then automatically forward this money plus an equal amount of matching funds to the address owned by Bob.

This example shows that smart contracts often involve multiple transactions: Alice makes a transaction that transfers funds to the contract, and Dorothee makes a transaction that donates to Bob. These transactions need to adhere to the contract rules, e.g. that Alice is the first to make the transaction, and Dorothee must come second. Any smart contract consists of two components: an on-chain component, which ensures that transactions adhere to the contract rules, and an off-chain component, which is a program that Alice or Dorothee use to submit rule-abiding transactions. The on-chain component never submits any transactions to the blockchain, it only checks existing transactions for validity. The on-chain component is run on the nodes that validate the blockchain ledger, while the off-chain component is run by the user, e.g. as part of his wallet. For Cardano, the Plutus platform allows us to write both components using the same programming language, Haskell.

The on-chain component of a smart contract operates like a state machine. When an output of a transaction is a smart contract address, this address has to be accompanied by additional, custom state data. This data represents a new state of the contract. Here, the same hashing trick as for addresses is used: Only a hash of the data, not the data itself is included in the transaction. But when the input of transaction references this output, via the UTXO model, the data is included in full. In addition, this input has to come with additional, custom data called a redeemer, which has the role of user input, and corresponds to a state transition label for the state machine. The UTXO-like ledger model which includes state data and redeemer data is known as the Extended UTXO (EUTXO) model.

For example, the donation matching example above can be implemented with the following state machine:

The machine has a single state Funded addrA addrB, which indicates that the contract is funded and ready to match donations for Bob. This state stores both Alice’s address addrA , and Bob’s address addrB. The possible transitions are Donate for making a matched donation to Bob, and Refund to return Alice’s funds when she wants to end the fundraiser. The state machine may seem somewhat unusual in that the contract state does not contain information about Alice’s total funds, and that the donation amount is not recorded in the Donate transition. However, there is no need for this, because transaction inputs and outputs always store an amount of funds.

The on-chain component can implement this state machine by only accepting the following transaction (schemas) as valid:

Here, the first transaction schema corresponds to a Donate transition. It takes an amount y from Dorothee and transfers twice that amount to Bob, using the funds held by the smart contract input. The second transaction schema corresponds to the Refund transition. It returns all the funds in the contract to Alice by spending the input contract. Here, we have also added an input from Alice (with zero funds) in order to make sure that not only does all the money go to back to Alice, but also that only Alice can initiate a refund: As her address appears in the input, this transaction is only valid with her signature. Similarly, as the smart contract address appears as an input, the validator code of the smart contract is run on this transaction. By refusing all transactions that do not fit into these two schema, only valid transitions of the smart contract state are possible. (The description here is a bit oversimplified in that we have ignored transaction fees and the fact unspent outputs with zero funds are not allowed. Transactions on the blockchain always involve a small fee.)

Finally, to start a fundraiser for Bob, Alice can submit the following transaction to the chain.

This transaction creates a smart contract output with funds, ready to be spent as an input.

# Conclusion

We have taken a look at the ledger for the Cardano blockchain. We have looked at basic transactions, the unspent transaction output (UTXO) model popularized by bitcoin, validation with addresses and public keys, and finally introduced the basics of smart contracts and how their on-chain components can be understood as state machines. That said, we have also skipped over some features of the ledger, such as multiple assets and staking, because I believe that they do not affect the substance of the EUTXO model and can be understood separately if desired. I hope this helps!

If this were a YouTube channel, I would ask you to smash the like button and click subscribe. Oh well. I guess I can ask you to leave comments below, though.

# [accudfui] Pell equation notes

regular (positive) Pell equation is x^2 - D*y^2 = +1.

negative Pell equation is x^2 - D*y^2 = -1.  if you have a solution (x,y) to the negative Pell equation, run Newton's method computing sqrt(D) for one iteration, starting with the guess old=x/y.  that is, new = (old + D/old)/2 (the Babylonian method is equivalent to Newton's method).  the components of the resulting fraction newx/newy will be a solution to the regular Pell equation: newx^2 - D*newy^2 = +1.

solving the Pell equation with continued fractions is very straightforward.  compute the continued fraction of sqrt(D), try all truncations in order of length.  for each truncation, convert to a regular fraction x/y and check if x and y solve the regular Pell equation.  there is no need to systematically modify the last term before the truncation point as you do when using continued fractions to find the sequence of best fractions that approximate a given number.

as an optimization, also check whether each (x,y) solves the negative Pell equation, then use the Newton's method trick above to convert the negative equation solution to a regular equation solution.  the smallest negative solution (if one exists) will convert to the smallest regular solution.

the following sequence of D require increasing numbers of continued fraction terms to solve.  the Newton's method optimization is included, so we are counting the number of continued fraction terms needed to reach the first regular or negative solution.  these D represent challenges of increasing difficulty.  the famous value 61 is not on the list.  61 requires only 11 continued fraction terms to find a negative equation solution (x=29718, y=3805).  it is eclipsed by the smaller 46 which requires 12 terms for its smallest solution (x=24335, y=3588), which happens to solve the regular equation.

2 3 7 13 19 31 43 46 94 139 151 166 211 331 421 526 571 604 631 751 886 919 1291 1324 1366 1516 1621 1726 2011 2311 2566 2671 3004 3019 3334 3691 3931 4174 4846 5119 6211 6451 6679 6694 7606 8254 8779 8941 9739 9949 10399 10651 10774 12541 12919 13126

the sequence below is a subsequence of the sequence above.  for these record-setting D values, the solution is to the negative equation, so you need to do one Newton's method iteration afterward, so the smallest regular solution unusually large.

2 13 421 1621 8941 9949 12541 17341 39901 40429 43261

some disorganized Haskell source code is here.  it includes numerical testing of the conjectures stated above.

we used the quadratic-irrational package to compute the continued fraction representation of square roots.  we used the continued-fraction package to convert continued fractions to Rational.  below is code to convert a quadratic irrational into an infinite list of continued fraction terms, using the cycle function.  Haskell handles infinite data structures elegantly.

qitocf = Quadraticirrational.qiToContinuedFraction >>> \case { (integerpart, Quadraticirrational.CycList start repeatend) -> integerpart:(start ++ cycle repeatend);};

having constructed an infinite list, use the inits function to construct all truncations, then for each truncation, test whether it solves the Pell equation.  (but our code does not use inits because we were also testing whether the final term needs to be modified.)

# The Pains of Path Parsing

I've spent a considerable amount of coding time getting into the weeds of path parsing and generation in web applications. First with Yesod in Haskell, and more recently with a side project for routetypes in Rust. (Side note: I'll likely do some blogging and/or videos about that project in the future, stay tuned.) My recent work reminded me of a bunch of the pain points involved here. And as so often happens, I was complaining to my wife about these pain points, and decided to write a blog post about it.

First off, there are plenty of pain points I'm not going to address. For example, the insane world of percent encoding, and the different rules for what part of the URL you're in, is a constant source of misery and mistakes. Little things like required leading forward slashes, or whether query string parameters should differentiate between "no value provided" (e.g. ?foo) versus "empty value provided" (e.g. ?foo=). But I'll restrict myself to just one aspect: roundtripping path segments and rendered paths.

## What's a path?

Let's take this blog post's URL: https://www.fpcomplete.com/blog/pains-path-parsing/. We can break it up into four logical pieces:

• https is the scheme
• :// is a required part of the URL syntax
• www.fpcomplete.com is the authority. You may be wondering: isn't it just the domain name? Well, yes. But the authority may contain additional information too, like port number, username, password
• /blog/pains-path-parsing/ is the path, including the leading and trailing forward slashes

This URL doesn't include them, but URLs may also include query strings, like ?source=rss, and fragments, like #what-s-a-path. But we just care about that path component.

The first way to think of a path is as a string. And by string, I mean a sequence of characters. And by sequence of characters, I really mean Unicode code points. (See how ridiculously pedantic I'm getting? Yeah, that's important.) But that's not true at all. To demonstrate, here's some Rust code that uses Hebrew letters in the path:

fn main() {
let uri = http::Uri::builder().path_and_query("/hello/מיכאל/").build();
println!("{:?}", uri);
}


And while that looks nice and simple, it fails spectacularly with the error message:

Err(http::Error(InvalidUri(InvalidUriChar)))


In reality, according to the RFC, paths are made up of a limited set of ASCII characters, represented as octets (raw bytes). And we somehow have to use percent encoding to represent other characters.

But before we can really talk about encoding and representing, we have to ask another orthogonal question.

## What do paths represent?

While a path is technically a sequence of a reserved number of ASCII octets, that's not how our applications treat them. Instead, we want to be able to talk about the full range of Unicode code points. But it's more than just that. We want to be able to talk about groupings of sequences. We call these segments typically. The raw path /hello/world can be thought of as the segments ["hello", "world"]. I would call this parsing the path. And, in reverse, we can render those segments back into the original raw path.

With these kinds of parse/render pairs, it's always nice to have complete roundtripping abilities. In other words, parse(render(x)) == x and render(parse(x)) == x. Generally these rules fail for a variety of reasons, such as:

1. Multiple valid representations. For example, with the percent encoding we'll mention below, %2a and %2A mean the same thing.
2. Often unimportant whitespace details get lost during parsing. This applies to formats like JSON, where [true, false] and [ true, false ] have the same meaning.
3. Parsing can fail, so that it's invalid to call render on parse(x).

Because of this, we often end up reducing our goals to something like: for all x, parse(render(x)) is successful, and produces output identical to x.

In path parsing, we definitely have problem (1) above (multiple valid representations). But by using this simplified goal, we no longer worry about that problem. Paths in URLs also don't have unimportant whitespace details (every octet has meaning), so (2) isn't a problem to be concerned with. Even if it was, our parse(render(x)) step would end up "fixing" it.

The final point is interesting, and is going to be crucial to our complete solution. What exactly does it mean for path parsing to fail? I can think of two ideas in basic path parsing:

• It includes an octet outside of the allowed range
• It includes a percent encoding which is invalid, e.g. %@@

Let's assume for the rest of this post, however, that those have been dealt with at a previous step, and we know for a fact that those error conditions will not occur. Are there any other ways for parsing to fail? In a basic sense: no. In a more sophisticated parsing: absolutely.

## Basic rendering

The basic rendering steps are fairly straightforward:

• Perform percent encoding on each segment
• Interpolate the segments with a slash separator
• Prepend a slash to the entire string

To allow roundtripping, we need to ensure that each input to the render function generates a unique output. Unfortunately, with these basic rendering steps, we immediately run into an error:

render segs = "/" ++ interpolate '/' (map percentEncode segs)

render []
= "/" ++ interpolate '/' (map percentEncode [])
= "/" ++ interpolate '/' []
= "/" ++ ""
= "/"

render [""]
= "/" ++ interpolate '/' (map percentEncode [""])
= "/" ++ interpolate '/' [""]
= "/" ++ ""
= "/"


In other words, both [] and [""] encode to the same raw path, /. This may seem like a trivial corner case not worth addressing. In fact, even more generally, empty path segments seem like a corner case. One possibility would be to say "segments must be non-zero length". Then there's no potential [""] input to worry about.

When this topic came up in Yesod, we decided to approach this differently. We actually did have some people who had use cases for empty path segments. We'll get back to this in normalized rendering.

## Percent encoding

I mentioned originally the annoyances of percent encoding character sets. I'm still not going to go deeply into details of it. But we do need to discuss it at a surface level. In the steps above, let's ask two related questions:

• Why did we percent encode before interpolating?
• Do we percent encode forward slashes?

Let's try percent encoding after interpolating. And let's say we decide not to percent encode forward slashes. Then render(["foo/bar"]) would turn into /foo/bar, which is identical to render(["foo", "bar"]). That's not what we want. And if we decide we're going to percent encode after interpolating and that we will percent encode forward slashes, both inputs result in /foo%2Fbar as output. Neither of those is any good.

OK, going back to percent encoding before interpolating, let's say that we don't percent encode forward slashes. Then both ["foo/bar"] and ["foo", "bar"] will turn into /foo/bar, again bad. So by process of elimination, we're left with percent encoding before interpolating, and escaping the forward slashes in segments. With this configuration, we're left with render(["foo/bar"]) == "/foo%2Fbar" and render(["foo", "bar"]) == "/foo/bar". Not only is this unique output (our goal here), but it also intuitively feels right, at least to me.

## Unicode codepoint handling

One detail we've glossed over here is Unicode, and the difference between codepoints and octets. It's time to rectify that. Percent encoding is a process that works on bytes, not characters. I can percent encode / into %2F, but only because I'm assuming an ASCII representation of that character. By contrast, let's go back to my favorite non-Latin alphabet example, Hebrew. How do you represent the Hebrew letter Alef א with percent encoding? The answer is that you can't, at least not directly. Instead, we need to represent that Unicode codepoint (U+05D0) as bytes. And the most universally accepted way to do that is to use UTF-8. So our process is something like this:

let segment: &[char] = "א";
let segment_bytes: &[u8] = encode_utf8(segment); // b"\xD7\x90"
let encoded: &[u8] = percent_encode(segment_bytes); // b"%D7%90"


OK, awesome, we now have a way to take a sequence of non-empty Unicode strings and generate a unique path representation of that. What's next?

## Basic parsing

How do we go backwards? Easy: we reverse each of the steps above. Let's see the render steps again:

• Percent encode each segment, consisting of:
• UTF-8 encode the codepoints into bytes
• Percent encode all relevant octets, including the forward slash
• Interpolate all of the segments together, separated by a forward slash
• Technically, the "forward slash" here is the forward slash octet \x2F. But because everyone basically assumes ASCII/UTF-8 encoding, we can typically be a little loose in our terminology.
• Prepend a forward slash (octet).

Basic parsing is exactly the same steps in reverse:

• Strip off the forward slash.
• Arguably, if a forward slash is missing, you could consider this a parse error. Most parsers simply ignore it instead.
• Split the raw path on each occurrence of a forward slash. We'll discuss some subtleties about this next.
• Percent decode each segment, consisting of:
• Look for any % signs, and grab the next two hexadecimal digits. In theory, you could treat an incorrect or missing digit as a parse error. In practice, many people end up using some kind of fallback.
• Take the percent decoded octets and UTF-8 decode them. Again, in theory, you could treat invalid UTF-8 data as a parse error, but many people simply use the Unicode replacement character.

If implemented correctly, this should result in the goal we mentioned above: encoding and decoding a specific input will always give back the original value (ignoring the empty segment case, which we still haven't addressed). The one really tricky thing is making sure that our split and interpolate operations mirror each other correctly. There are actually many different ways of splitting lists and strings. Fortunately for my Rust interpolation, the standard split method on str happens to implement exactly the behavior we want. You can check out the method's documentation for details (helpful even for non-Rustaceans!). Pay particular attention to the comments about contiguous separators, and think about how ["foo", "", "", "bar"] would end up being interpolated and then parsed.

OK, we're all done, right? Wrong!

## Normalization

I bet you thought I forgot about the empty segments. (Actually, given how many times I called them out, I bet you didn't think that.) Before, we saw exactly one problem with empty segments: the weird case of [""]. I want to first establish that empty segments are a much bigger problem than that.

I gave a link above to a GitHub repository: https://github.com/snoyberg/routetype-rs. Let's change that URL ever so slightly, and add an extra forward slash in between snoyberg and routetype-rs: https://github.com/snoyberg//routetype-rs. Amazingly, you get the same page for both URLs. Isn't that weird?

No, not really. Extra forward slashes are often times ignored by web servers. "I know what you meant, and you didn't mean an empty path segment." This isn't just a "feature" of webservers. The same concept applies on my Linux command line:

$cat /etc/debian_version bullseye/sid$ cat /etc///debian_version
bullseye/sid


I've got two problems with the behavior GitHub is demonstrating above:

• What if I'm writing some web application and I really, truly want to be able to embed a meaningful empty segment in the path?
• Doesn't it feel wrong, and maybe even hurt SEO, to have two different URLs that resolve to the same content?

In Yesod, we addressed the second issue with a class method called cleanPath, that analyzes the segments of an incoming path and sees if there's a more canonical representation of them. For the case above, https://github.com/snoyberg//routetype-rs would produce the segments ["snoyberg", "", "routetype-rs"], and cleanPath would decide that a more canonical representation would be ["snoyberg", "routetype-rs"]. Then, Yesod would take the canonical representation and generate a redirect. In other words, if GitHub was written in Yesod, my request to https://github.com/snoyberg//routetype-rs would result in a redirect to https://github.com/snoyberg/routetype-rs.

Way back in 2012, this led to a problem, however. Someone actually had empty path segments, and Yesod was automatically redirecting away from the generated URLs. We came up with a solution back then that I'm still very fond of: dash prefixing. See the linked issue for the details, but the way it works is:

• When encoding, if a segment consists entirely of dashes, add one more dash to it.
• By our definition of "consists entirely of dashes," the empty string counts too. So dashPrefix "" == "-", and dashPrefix "---" == "----".
• When decoding:
• Perform the split operation above.
• Next, perform the clean path check, and generate a redirect if there are any empty path segments.
• Once we know that there are no empty path segments, then undo dash prefixing. If a segment consists of only dashes, remove one of the dashes.

If you work this through enough, you can see that with this addition, every possible sequence of segments—even empty segments—results in a unique raw path after rendering. And every incoming raw path can either be parsed to a necessary redirect (if there are empty segments) or to a sequence of segments. And finally, each sequence of segments will successfully roundtrip back to the original sequence when parsing and rendering.

I call this normalized parsing and rendering, since it is normalizing each incoming path to a single, canonical representation, at least as far as empty path segments are concerned. I suppose if someone wanted to be truly pedantic, they could also try to address variations in percent encoding behavior or invalid UTF-8 sequences. But I'd consider the former a non-semantic difference, and the latter garbage-in-garbage-out.

## Trailing slashes

There's one final point to bring up. What exactly causes an empty path segment to occur when parsing? One example is contiguous slashes, like our snoyberg//routetype-rs example above. But there's a far more interesting and prevalent case: the trailing slash. Many web servers use trailing slashes, likely originating from the common pattern of having index.html files and accessing a page based on the containing directory name. In fact, this blog post is hosted on a statically generating site which uses that technique, which is why the URL has a trailing slash. And if you perform basic parsing on our path here, you'd get:

basic_parse("/blog/pains-path-parsing/") == ["blog", "pains-path-parsing", ""]


Whether to include trailing slashes in URLs has been an old argument on the internet. Personally, because I consider the parsing-into-segments concept to be central to path parsing, I prefer excluding the trailing slash. And in fact, Yesod's default (and, at least for now, routetype-rs's default) is to treat such a URL as non-canonical and redirect away from it. I felt even more strongly about that when I realized lots of frameworks have special handling for "final segments with filename extensions." For example, /blog/bananas/ is good with a trailing slash, but /images/bananas.png should not have a trailing slash.

However, since so many people like having trailing slashes, Yesod is configurable on this point, which is why cleanPath is a typeclass method that can be overridden. To each their own I suppose.

## Conclusion

I hope this blog post gave a little more insight into the wild world of the web and how something as seemingly innocuous as paths actually hides some depth. If you're interested in learning more about the routetype-rs project, please let me know, and I'll try to prioritize some follow ups on it.

You may be interested in more Rust or Haskell from FP Complete. Also, check out our blog for a wide range of technical content.

# Rage bug reporting

At Facebook, we have an internal convention for tooling called "rage". When something goes wrong and you want to report a bug, the tool developer will typically ask you to give them a rage. For a command line tool, this can be done by running a rage subcommand, which will ask about which previous CLI invocation you'd like to report, and then giving you a bundle of logs to send to the developer.

A rage has an important property, compared to a conventional log level flag like -v: rage recording is always on. In other words, it is like traditional server application logs, but applied to client software. Logging is always turned on, and the rage subcommand makes it easy for a user to send only the relevant portion of logs (e.g., the logs associated with the command line invocation that is on).

For some reason, rage functionality is not that common in open source tools. I can imagine any number of reasons why this might be the case:

• Adding proper logging is like flossing--annoying to do at the time even when it can save you a lot of pain later.
• Even if you have logging, you still need to add infrastructure to save the logs somewhere and let users retrieve them afterwards.
• It's something of an art to write logs that are useful enough so that developer can diagnose the problem simply by "reading the tea leaves", but not so detailed that they slow down normal execution of the program. And don't forget, you better not expose private information!
• Most programs are simple, and you can just fall back on the old standby of asking the user to submit reproduction instructions in their bug report.

Still, in the same way most sysadmins view logging as an invaluable tool for debugging server issues, I think rage reporting is an invaluable tool for debugging client issues. In ghstack, it didn't take very many lines of code to implement rage reporting: ghstack.logs (for writing the logs to the rage directory) and ghstack.rage (for reading it out). But it has greatly reduced my support load for the project; given a rage, I can typically figure out the root cause of a bug without setting up a reproducer first.

## April 23, 2021

### Douglas M. Auclair (geophf)

• 2021-04-20, Tuesday:

I have pairs :: [(a, IO [b])]

but I want pairs' :: IO [(a, b)]

sequence a gives me something like I don't know what: distributing the list monad, not the IO monad. Implement:

sequence' :: [(a, IO [b])] -> IO [(a, b)]

• p h z @phaazon_:
fmap join . traverse (\(a, io) -> fmap (map (a,)) io)
• lucas卞dicioccio, PhD @lucasdicioccio:

Just to annoy you I'll use the list-comprehension syntax you dislike.

solution xs = sequence [fmap (a,) io | (a, io) <- xs]

• Benkio @benkio89:
fmap concat . traverse (\(a,iobs) -> fmap (a,) <$> iobs) • Social Justice Cleric @noaheasterly fmap concat . traverse (getCompose . traverse Compse) • Social Justice Cleric @noaheasterly fmap (concatMap getCompose) . getCompose . traverse Compose. Compose • Basile Henry @basile_henry: Slightly less polymorphic: sequence' = traverse @[] (sequence @((,) _) @IO) • Basile Henry @basile_henry: I think it's just traverse sequence ;) • Jérôme Avoustin @JeromeAvoustin: there surely is a shorter version, but I could come up with... fmap join . sequence . (fmap . fmap) sequence . fmap sequence • 2021-04-16, Friday: You have a monad, or applicative, and you wish to execute the action of the latter but return the result of the former. The simplest representation for me is: pass :: IO a -> b -> IO b so: return 5 >>= pass (putStrLn "Hi, there!")  would return IO 5 GO! • D Oisín Kidney @oisdk flip (<$)
• ⓘ_jack @Iceland_jack ($>) • 2021-04-12, Monday: A function that takes the result of another function then uses that result and the original pair of arguments to compute the result: f :: a -> a -> bg :: b -> a -> a -> c so: (\x y -> g (f x y) x y) curry away the x and y arguments. • 2021-04-07, Wednesday: you have (Maybe a, Maybe b) you want Maybe (a, b) If either (Maybe a) or (Maybe b) is Nothing then the answer is Nothing. If both (Maybe a) and (Maybe b) are (Just ...) then the answer is Just (a, b) WHAT SAY YOU? • Jérôme Avoustin @JeromeAvoustin: bisequence • p h z @phaazon_ with base: uncurry$ liftA2 (,)
• greg nwosu @buddhistfist: (,) <$> ma <*> mb ### Chris Smith 2 # Continued Fractions: Haskell, Equational Reasoning, Property Testing, and Rewrite Rules in Action ### Overview In this article, we’ll develop a Haskell library for continued fractions. Continued fractions are a different representation for real numbers, besides the fractions and decimals we all learned about in grade school. In the process, we’ll build correct and performant software using ideas that are central to the Haskell programming language community: equational reasoning, property testing, and term rewriting. I posted an early version of this article to Reddit, and I’m grateful to those who responded with insight and questions that helped me complete this journey, especially iggybibi and lpsmith. ### Introduction to continued fractions Let’s start with a challenge. Assume I know how to write down an integer. Now, how can I write down a real number? In school, we’ve all learned a few answers to this question. The big two are fractions and decimals. Fractions have simple forms for rational numbers. Just write two integers, for the numerator and denominator. That’s great! But alas, they are no good at all for irrational numbers like π, or the square root of 2. I could, of course, write a fraction that’s close to the irrational number, such as 22/7 as an approximation of π. But this is only good up to a certain limited precision. To get a closer approximation (say, 355/113), I have to throw it away and start over. Sure, I could write an infinite sequence of fractions which converge to the rational number, but every finite prefix of such a sequence is redundant and wasted. Irrationals are just the limiting case of rational numbers, and very precise rational numbers suffer from this, too. It’s not easy to glance at a fraction like 103993/33102 and get a sense of its value. (Answer: this is another approximation of π.) Or, can you quickly tell whether 5/29 is larger or smaller than 23/128? Instead, we can turn to decimals. If you’re like me, you don’t think about the mechanics of decimal numbers very often, and conventional notation like 3.14159… really hides what’s going on. I’d like to offer a different perspective by writing it like this: Here, d₀ is the integer part, and d₁, d₂, d₃, d₄, etc. are called the decimal digits. Notice the symmetry here: after the integer part, a decimal has inside of it another decimal representing its tenths. This recursive structure leads to the main property that we care about with decimals: we can truncate them. If I have a hundred-digit decimal number, and don’t need that much accuracy, I can simply chop off what I don’t need to get a simpler approximation. This is the same property that lets me write a decimal representation of an irrational number. As long as we accept that infinite sequences exist — and as Haskell programmers, infinite data structures don’t scare us…. much — there is a unique infinite sequence of digits that represents any irrational number. The sequence must be infinite is to be expected, since there are uncountably many irrationals, and any finite representation has only countable values. (Unfortunately, this uniqueness isn’t quite true of rational numbers, since 0.4999… and 0.5 are the same number, for example!) As nice as that is, though, there are also some significant disadvantages to decimals as a representation: • Although a few rational numbers have simple decimal representations, almost all rational numbers actually need infinite repeating decimals. Specifically, a rational number whose denominator has any prime factor other than 2 or 5 repeats indefinitely. • Why 10? This is a question for historians, anthropologists, perhaps biologists, but has no answer within mathematics. It is a choice that matters, too! In a different base, a different subset of the rational numbers will have finite decimals. It’s the choice of base 10 that makes it harder to compute with 1/3. Having to make an arbitrary choice about something that matters so much is unfortunate. We can make up some ground on these disadvantages with a third representation: continued fractions. A continued fraction is an expression of this form: Just like with decimals, this gives a representation of a real number as a (possibly finite, or possibly infinite) sequence [n₀, n₁, n₂, n₃, n₄, …]. Now we call the integer parts terms, instead of digits. The first term, n₀, is the integer part. The difference is that when we have a remainder to write, we take its reciprocal and continue the sequence with that. Note: • Continued fractions represent all rational numbers as finite sequences of terms, while still accounting for all irrationals using infinite sequences. • Continued fractions do not depend on an arbitrary choice of base. The reciprocal is the same regardless of how we choose to write numbers. • Continued fractions, like decimals, can be truncated to produce a rational approximation. Unlike decimals, though, the approximations are optimal in the sense that they are as close as possible to the original number without needing a larger denominator. With this motivation in mind, let’s see what we can do about using continued fractions to represent real numbers. ### Part 1: Types and Values Time to write some code. As a declarative language that promotes equational reasoning, Haskell is a joy to use for problems like this one. If you want to see it all together, the code and tests for this part can be found at https://code.world/haskell#PVxpksYe_YAcGw3CNdvEFzw. To start, we want a new type for continued fractions. A continued fraction is a (finite or infinite) sequence of terms, so we could simply represent it with a Haskell list. However, I find it more readable to define a new type so that we can choose constructors with suggestive names. (It helps with this decision that standard list combinators like map or ++ don’t have obvious interpretations for continued fractions.) I’ve defined the type like this. data CFrac where (:+/) :: Integer -> CFrac -> CFrac Inf :: CFrac deriving instance Eq CFracderiving instance Show CFrac infixr 5 :+/ Here, x :+/ y means x + 1/y, where x is an integer, and y is another continued fraction (the reciprocal of the remainder). This is the basic building block of continued fractions, as we saw above. The other constructor may be surprising, though. I’ve defined a special continued fraction representing infinity! Why? We’ll use it for terminating fractions. When there is no remainder, we have x = x + 0 = x + 1/∞, so Inf is the reciprocal of the remainder for a continued fraction that represents an integer. That we can represent ∞ itself as a top-level continued fraction wasn’t the goal, but it doesn’t seem worth dodging. Noticing that an empty list of terms acts like infinity is actually a great intuition for things to come. There is a simple one-to-one correspondence between continued fractions and their lists of terms, as follows: • The :+/ operator corresponds to the list cons operator, :. • The Inf value corresponds to the empty list, []. We can define conversions to witness this correspondence between CFrac and term lists. terms :: CFrac -> [Integer]terms Inf = []terms (n :+/ x) = n : terms x fromTerms :: [Integer] -> CFracfromTerms = foldr (:+/) Inf Rational numbers can be converted into continued fractions by following the process described above, and the code is very short. cfFromFrac :: Integer -> Integer -> CFraccfFromFrac _ 0 = InfcfFromFrac n d = n div d :+/ cfFromFrac d (n mod d) cfFromRational :: Rational -> CFraccfFromRational r = cfFromFrac (numerator r) (denominator r) The real fun of continued fractions, though, is that we can represent irrational numbers precisely, as well! Rational numbers are precisely the solutions to linear equations. The simplest irrational numbers to write as continued fractions are the quadratic irrationals: numbers that are not rational, but are solutions to quadratic equations, and these are precisely the continued fractions with infinitely repeating terms. We can write a function to build these: cycleTerms :: [Integer] -> CFraccycleTerms ns = fix (go ns) where go [] x = x go (t : ts) x = t :+/ go ts x And then we can write a quick catalog of easy continued fractions, including some small square roots and the golden ratio. sqrt2 :: CFracsqrt2 = 1 :+/ cycleTerms [2] sqrt3 :: CFracsqrt3 = 1 :+/ cycleTerms [1, 2] sqrt5 :: CFracsqrt5 = 2 :+/ cycleTerms [4] phi :: CFracphi = cycleTerms [1] It’s really worth pausing to think how remarkable it is that these fundamental quantities, whose decimal representations have no obvious patterns at all, are so simple as continued fractions! And it doesn’t end there. Euler’s constant e, even though it’s a transcendental number, also has a simple pattern as a continued fraction. exp1 :: CFracexp1 = 2 :+/ fromTerms (concatMap (\n -> [1, 2 * n, 1]) [1 ..]) This really strengthens the claim I made earlier, that being based on the reciprocal instead of multiplying by an arbitrary base makes continued fractions somehow less arbitrary than decimals. I wish I could tell you that π has a similar nice pattern, but alas, it does not. π has a continued fraction just as random as its representation in other notations, and just as hard to compute to arbitrary precision. It will be interesting later, though, to look at its first few terms. For that, it’s good enough to use the Double approximation to π from Haskell’s base library. It’s not really π; in fact, it’s a rational number! But it‘s close enough for our purposes. Computing an exact value of π needs some more machinery, which we will develop in the next section. approxPi :: CFracapproxPi = cfFromRational (realToFrac (pi :: Double)) #### Canonical forms We’ll now return to a more theoretical concern. We want each real number to have a unique continued fraction representation. In fact, I snuck something by you earlier when I derived an Eq instance for CFrac, because that instance is only valid when the type has unique representations. However, this isn’t quite true in general. Here are some examples of continued fractions that are written differently, but have the same value: Case (1) deals with negative numbers. These will actually cause quite a few problems — not just now, but in convergence properties of algorithms later, too. I have been quietly assuming up to this point that all the numbers we’re dealing with are non-negative. Let’s make that assumption explicit, and require that all continued fraction terms are positive. That takes care of case (1). This might seem like a terrible restriction. In the end, though, we can recover negative numbers in the same way humans have done so for centuries: by keeping track of a sign, separate from the absolute value. A signed continued fraction type, then, would wrap CFrac and include a Bool field for whether it’s negative. This is entirely straight-forward, and I leave it as an exercise for the interested reader. Case (2) involves a term with a value of zero. It’s normal for the integer part of a continued fraction to be zero. After that, though, the remaining terms should never be zero, because they are the reciprocals of numbers less than one. So we will impose a second rule that only the first term of a continued fraction may be zero. Case (3) involves a rational continued fraction with 1 as its final term. It turns out that even after solving cases (1) and (2), every rational number has two distinct continued fractions: one that has a 1 that is not the integer part as its last term, and one that doesn’t. This is analogous to the fact that terminating decimals have two decimal representations, one ending in an infinite sequence of 0s, and the other in an infinite sequence of 9s. In this case, the trailing 1 is longer than it needs to be, so we’ll disallow it, making a third rule that a term sequence can never end in 1, unless that 1 is the integer part. Subject to these three rules, we get the canonical forms we wanted. It’s unfortunate that non-canonical forms are representable in the CFrac type (i.e., we have failed to make illegal data unrepresentable), but we can do the next best thing: check that our computations all produce canonical values. To do so, let’s write some code to check that a CFrac obeys these rules. isCanonical :: CFrac -> BoolisCanonical Inf = TrueisCanonical (n :+/ cont) = n >= 0 && isCanonicalCont cont isCanonicalCont :: CFrac -> BoolisCanonicalCont Inf = TrueisCanonicalCont (1 :+/ Inf) = FalseisCanonicalCont (n :+/ cont) = n > 0 && isCanonicalCont cont #### Property testing One very powerful idea that originated in the Haskell community is property testing. We state a property that we want to verify about our code, and allow a testing framework like to QuickCheck to make up examples and test them. Now is a great time to try this out. Here I’ve defined a few properties around canonical forms. The first is a trivial property that just asserts that we make the right decision about the specific examples above. The second, less trivial, guarantees that our cfFromRational function, the only real semantic function we’ve written, produces canonical forms. prop_isCanonical_examples :: Boolprop_isCanonical_examples = not (isCanonical (2 :+/ (-2) :+/ Inf)) && isCanonical (1 :+/ 2 :+/ Inf) && not (isCanonical (1 :+/ 0 :+/ 2 :+/ Inf)) && isCanonical (3 :+/ Inf) && not (isCanonical (1 :+/ 1 :+/ Inf)) && isCanonical (2 :+/ Inf) prop_cfFromRational_isCanonical :: NonNegative Rational -> Boolprop_cfFromRational_isCanonical (NonNegative x) = isCanonical (cfFromRational x) (We could try to check that the irrational values are canonical, as well, but alas, we run into non-termination since they are infinite. This will be a theme: we’ll have to be satisfied with checking rational test cases, and relying on the fact that any bugs with irrational values will manifest in some nearby rational value.) We can now run our code for this section. Try it yourself at https://code.world/haskell#PVxpksYe_YAcGw3CNdvEFzw. In addition to running the tests, we’ll print out our example continued fractions so we can see them. Testing prop_isCanonical_examples+++ OK, passed 1 test. Testing prop_cfFromRational_isCanonical+++ OK, passed 100 tests. 3/7 = 0 :+/ (2 :+/ (3 :+/ Inf))sqrt2 = 1 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2...sqrt3 = 1 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1...sqrt5 = 2 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4...phi = 1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1...e = 2 :+/ (1 :+/ (2 :+/ (1 :+/ (1 :+/ (4 :+/ (1 :+/ (1...approxPi = 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/ (1 :+/ (2 :+/ (1 :+/ (3 :+/ (1 :+/ (14 :+/ (3 :+/ (3 :+/ (2 :+/ (1 :+/ (3 :+/ (3 :+/ (7 :+/ (2 :+/ (1 :+/ (1 :+/ (3 :+/ (2 :+/ (42 :+/ (2 :+/ Inf)))))))))))))))))))))))))) ### Part 2: Conversions In the previous section, we converted from rational numbers into continued fractions. We now turn to the inverse conversion: from a continued fraction to a rational number. The code and tests for this part can be found at https://code.world/haskell#PemataxO6pmYz99dOpaeT9g. #### Computing convergents Not all continued fractions correspond to rational numbers. However, one of the key benefits of the continued fraction representation is that its truncated term sequences represent particularly efficient rational approximations. These rational approximations are called the convergents. It’s not hard to write a simple recursive function to compute the convergents. naiveConvergents :: CFrac -> [Rational]naiveConvergents Inf = []naiveConvergents (n :+/ r) = fromInteger n : map (\x -> fromInteger n + 1 / x) (naiveConvergents r) We can also use QuickCheck to write a property test, verifying that a rational number is its own final convergent. This gives a sanity check on our implementation. prop_naiveConvergents_end_at_Rational :: NonNegative Rational -> Propertyprop_naiveConvergents_end_at_Rational (NonNegative r) = last (naiveConvergents (cfFromRational r)) === r You may guess from my choice of name that I am not happy with this implementation. The problem with naiveConvergents is that it recursively maps a lambda over the entire tail of the list. As it recurses deeper into the list, we build up a bunch of nested mapped lambdas, and end up evaluating O(n²) of these lambdas to produce only n convergents. #### Mobius transformations Let’s fix that quadratic slowdown. A more efficient implementation can be obtained by defunctionalizing. The lambdas have the form \x -> fromInteger n + 1 / x. To avoid quadratic work, we need a representation that lets us compose functions of this form into something that can be applied in constant time. What works is mobius transformations. A mobius transformation is a function of the form: For our purposes, the coefficients a, b, c, and d are always integers. Instead of an opaque lambda, we represent a mobius transformation by its four coefficients. data Mobius where Mobius :: Integer -> Integer -> Integer -> Integer -> Mobius deriving instance Eq Mobiusderiving instance Show Mobius We seek to rewrite our computation of convergents using mobius transformations. First of all, we will restructure it as a left fold, which exposes the mobius transformation itself as an accumulator. We’ll want these building blocks: • An identity mobius transformation, to initialize the accumulator. • A composition to combine two mobius transformations into a single one. In other words, mobius transformations should form a monoid! instance Semigroup Mobius where Mobius a1 b1 c1 d1 <> Mobius a2 b2 c2 d2 = Mobius (a1 * a2 + b1 * c2) (a1 * b2 + b1 * d2) (c1 * a2 + d1 * c2) (c1 * b2 + d1 * d2) instance Monoid Mobius where mempty = Mobius 1 0 0 1 There are axioms for monoids: mempty should act like an identity, and <> should be associative. We can test these with property tests. This is as good a time as any to set up a QuickCheck generator for mobius transformations. To keep things sane, we want to always want to choose mobius transformations that have non-zero denominators, since otherwise the function is undefined on all inputs. We will also want non-negative coefficients in the denominator, since this ensures the transformation is defined (and monotonic, which will matter later on) for all non-negative input values. instance Arbitrary Mobius where arbitrary = suchThat ( Mobius <$> arbitrary          <*> arbitrary          <*> (getNonNegative <$> arbitrary) <*> (getNonNegative <$> arbitrary)      )      (\(Mobius _ _ c d) -> max c d > 0)
  shrink (Mobius a b c d) =    [ Mobius a' b' c' d'      | (a', b', NonNegative c', NonNegative d') <-          shrink (a, b, NonNegative c, NonNegative d),        max c' d' > 0    ]

With the generator in place, the tests are trivial to write:

prop_Mobius_id :: Mobius -> Propertyprop_Mobius_id m = mempty <> m === m .&&. m <> mempty === m
prop_Mobius_assoc :: Mobius -> Mobius -> Mobius -> Propertyprop_Mobius_assoc m1 m2 m3 = (m1 <> m2) <> m3 === m1 <> (m2 <> m3)

#### Faster convergents

convergents :: CFrac -> [Rational]convergents = go mempty  where    go m Inf = []    go m (n :+/ x) = mobiusLimit m' : go m' x      where        m' = m <> Mobius n 1 1 0
    mobiusLimit (Mobius a _ c _) = a % c

In the expression, go m x, m is a mobius transformation that turns a convergent of x into a convergent for the entire continued fraction. When x is the entire continued fraction, m is the identity function. As we recurse into the continued fraction, we compose transformations onto m so that this remains true. The lambda \x -> fromInteger n + 1 / x from the naive implementation is now defunctionalized into Mobius n 1 1 0.

The remaining task is to compute a truncated value of the continued fraction at each step. Recall that a terminating continued fraction is one which has an infinite term. To determine this truncated value, then, we want to consider the limit of the accumulated mobius transformation as its input tends to infinity. This is:

and that completes the implementation. A quick test helps us to be confident in the result.

prop_convergents_matches_naive :: NonNegative Rational -> Propertyprop_convergents_matches_naive (NonNegative r) =  convergents (cfFromRational r)    === naiveConvergents (cfFromRational r)

#### Converting to decimals

Let’s also consider the conversion from continued fractions to decimals. This time, both representations can be approximated by just truncating, so instead of producing a sequence of approximations as we did with rational numbers, we will just a single decimal that lazily adds to the representation with more precision.

Mobius transformations are still a good tool for this job, but we need to refine our earlier observations. Consider the general form of a mobius transformation again:

It’s worth taking the time now to play around with different choices for a, b, c, and d, and what they do to the function. You can do so with the Desmos link below:

We previously noted one of these bounds, and now add the other:

As long as c and d are positive (remember when we agreed to keep them so?), f is monotonic on the entire interval [0, ∞), so f is bounded by these two fractions. In particular, if a/c and b/d have the same integer part, then we know this is the integer part of the result of f, regardless of the
(always non-negative) value of x. We can express this insight as a function:

mobiusIntPart :: Mobius -> Maybe IntegermobiusIntPart (Mobius a b c d)  | c /= 0 && d /= 0 && n1 == n2 = Just n1  | otherwise = Nothing  where    n1 = a div c    n2 = b div d

We can now proceed in a manner similar to the computation of convergents. We maintain a mobius transformation which maps the remaining continued fraction to the remaining decimal. Initially, that is the identity. But this time, instead of blindly emitting an approximation at each step, we make a choice:

• If the mobius transformation is bounded to guarantee the integer part of its result, then we can produce that decimal digit. We then update the transformation m to yield the remaining decimal places after that. This will tend to widen its bounds.
• Otherwise, we will pop off the integer part of the continued fraction, and update the transformation to expect the remaining continued fraction terms after that. This will tend to narrow its bounds, so we’ll be closer to producing a new decimal digit.
• If we ever end up with the zero transformation (that is, a and b are both zero), then all remaining decimal places will be zero, so we can stop. If we encounter an infinite term of input, though, we still need to continue, but we can update b and d to match a and c, narrowing those bounds to a single point to indicate that we now know the exact value of the input.

Here is the implementation:

cfToBase :: Integer -> CFrac -> [Integer]cfToBase base = go mempty  where    go (Mobius 0 0 _ _) _ = []    go m x      | Just n <- mobiusIntPart m,        let m' = Mobius base (-base * n) 0 1 <> m       = n : go m' x    go m (n :+/ x) = go (m <> Mobius n 1 1 0) x    go (Mobius a _ c _) Inf = go (Mobius a a c c) Inf
cfToDecimal :: CFrac -> StringcfToDecimal Inf = "Inf"cfToDecimal x = case cfToBase 10 x of  [] -> "0.0"  [z] -> show z ++ ".0"  (z : digits) -> show z ++ "." ++ concatMap show digits

To test this, we will compare the result to standard Haskell output. However, Haskell’s typical Show instances for fractional types are too complex, producing output like 1.2e6 that we don’t want to match. The Numeric module has the simpler functionality we want. There are some discrepancies due to rounding error, as well, so the test will ignore differences in the last digit of output from the built-in types.

prop_cfToDecimal_matchesRational :: NonNegative Rational -> Propertyprop_cfToDecimal_matchesRational (NonNegative r) =  take n decFromRat === take n decFromCF  where    decFromRat = showFFloat Nothing (realToFrac r :: Double) ""    decFromCF = cfToDecimal (cfFromRational r)    n = max 10 (length decFromRat - 1)

#### Generalized Continued Fractions

Earlier, we settled for only an approximation of π, and I mentioned that π doesn’t have a nice pattern as a continued fraction. That’s true, but it’s not the whole story. If we relax the definition of a continued fraction just a bit, there are several known expressions for π as a generalized continued fraction. The key is to allow numerators other than 1.

Here’s one that’s fairly nice:

Generalized continued fractions don’t have many of the same nice properties that standard continued fractions do. They are not unique, and sometimes converge very slowly (or not at all!), yielding poor rational approximations when truncated. But we can compute with them using mobius transformations, using the same algorithms from above. In particular, we can use the same tools as above to convert from a generalized continued fraction to a standard one.

First, we’ll define a type for generalized continued fractions.

data GCFrac where  (:+#/) :: (Integer, Integer) -> GCFrac -> GCFrac  GInf :: GCFrac
deriving instance Show GCFrac

This time I haven’t defined an Eq instance, because representations are not unique. Converting from a standard to a generalized continued fraction is trivial.

cfToGCFrac :: CFrac -> GCFraccfToGCFrac Inf = GInfcfToGCFrac (n :+/ x) = (n, 1) :+#/ cfToGCFrac x

The conversion in the other direction, though, requires a mobius-like algorithm.

gcfToCFrac :: GCFrac -> CFracgcfToCFrac = go mempty  where    go (Mobius a _ c _) GInf = cfFromFrac a c    go m gcf@((int, numer) :+#/ denom)      | Just n <- mobiusIntPart m =        n :+/ go (Mobius 0 1 1 (- n) <> m) gcf      | otherwise = go (m <> Mobius int numer 1 0) denom

We can write a property test to verify that, at least, this gives a correct round trip from continued fractions to generalized, and back to standard again.

prop_GCFrac_roundtrip :: NonNegative Rational -> Propertyprop_GCFrac_roundtrip (NonNegative r) =  gcfToCFrac (cfToGCFrac x) === x  where    x = cfFromRational r

Here’s the definition of π above, expressed in code, and converted to a continued fraction. I’ve written a test to verify that it mostly agrees with the approximate value we defined earlier, just as a sanity check on the implementation.

gcfPi = (0, 4) :+#/ go 1  where    go i = (2 * i - 1, i ^ 2) :+#/ go (i + 1)
exactPi = gcfToCFrac gcfPi
prop_correct_pi :: Propertyprop_correct_pi =  take 17 (cfToDecimal approxPi)    === take 17 (cfToDecimal exactPi)

#### Results

We now have built enough to get some interesting results. For example, we have exact representations several irrational numbers, and we can use those to obtain both rational approximations and long decimal representations of each.

In addition to running our tests, we will print the continued fraction terms, convergents, and decimal representation of each of our test values. Check out the full code at https://code.world/haskell#PemataxO6pmYz99dOpaeT9g.

Testing prop_naiveConvergents_end_at_Rational+++ OK, passed 100 tests.
Testing prop_Mobius_id+++ OK, passed 100 tests.
Testing prop_Mobius_assoc+++ OK, passed 100 tests.
Testing prop_convergents_matches_naive+++ OK, passed 100 tests.
Testing prop_cfToDecimal_matchesRational+++ OK, passed 100 tests.
Testing prop_GCFrac_roundtrip+++ OK, passed 100 tests.
Testing prop_correct_pi+++ OK, passed 1 test.
3/7:- terms: 0 :+/ (2 :+/ (3 :+/ Inf))- frac : [0 % 1,1 % 2,3 % 7]- dec  : 0.428571428571428571428571428571428571428571428571
sqrt2:- terms: 1 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2 :+/ (2- frac : [1 % 1,3 % 2,7 % 5,17 % 12,41 % 29,99 % 70,239 % 1- dec  : 1.414213562373095048801688724209698078569671875376
sqrt3:- terms: 1 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1 :+/ (2 :+/ (1- frac : [1 % 1,2 % 1,5 % 3,7 % 4,19 % 11,26 % 15,71 % 41,9- dec  : 1.732050807568877293527446341505872366942805253810
sqrt5:- terms: 2 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4 :+/ (4- frac : [2 % 1,9 % 4,38 % 17,161 % 72,682 % 305,2889 % 129- dec  : 2.236067977499789696409173668731276235440618359611
phi:- terms: 1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1 :+/ (1- frac : [1 % 1,2 % 1,3 % 2,5 % 3,8 % 5,13 % 8,21 % 13,34 %- dec  : 1.618033988749894848204586834365638117720309179805
e:- terms: 2 :+/ (1 :+/ (2 :+/ (1 :+/ (1 :+/ (4 :+/ (1 :+/ (1- frac : [2 % 1,3 % 1,8 % 3,11 % 4,19 % 7,87 % 32,106 % 39,- dec  : 2.718281828459045235360287471352662497757247093699
approxPi:- terms: 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/- frac : [3 % 1,22 % 7,333 % 106,355 % 113,103993 % 33102,1- dec  : 3.141592653589793115997963468544185161590576171875
exactPi:- terms: 3 :+/ (7 :+/ (15 :+/ (1 :+/ (292 :+/ (1 :+/ (1 :+/- frac : [3 % 1,22 % 7,333 % 106,355 % 113,103993 % 33102,1- dec  : 3.141592653589793238462643383279502884197169399375

Most of these are exact irrational values, so we can only look at a finite prefix of the convergents, which go on forever getting more and more accurate (at the expense of larger denominators). I’ve chopped them off at 50 characters, so that they comfortably fit on the screen, but you can change that to see as much of the value as you like.

With this data in front of us, we can learn more about continued fractions and how they behave.

Did you notice that the convergents of π jump from 355/113 all the way to 103993/33102? That’s because 355/113 is a remarkably good approximation of π. You have to jump to a much larger denominator to get a better approximation. You can observe this same fact directly from the continued fraction terms. The fifth term is 292, which is unusually large. 292 is the next term after the truncation that yields the convergent 355/113. Large terms like that in a continued fraction indicate unusually good rational approximations.

Conversely, what if there are no large continued fraction terms? More concretely, what can we say about the number that has all 1s as its continued fraction? That’s the golden ratio! What this means is that there are no particularly good rational approximations to the golden ratio. And you can see that, and hear it!

• Flowers tend to grow new petals or seeds at an angle determined by the golden ratio, specifically because there are no small intervals (i.e., small denominators) after which they will overlap.
• Musical notes sound harmonic when their frequencies are close to simple rational numbers. The least harmonic frequencies are those related by the golden ratio, and you can hear it!

(By the way, did you notice the fibonacci numbers in the convergents of the golden ratio? That’s a whole other topic I won’t get into here.)

### Part 3: Arithmetic

Now that we’ve got a nice type and some conversions, let’s turn our attention to computing with continued fractions.

The code and tests for this section can be found at https://code.world/haskell#P_T4CkfWGu3bgSwFV0R04Sg.

#### Simple computations

There are a few computations that are easy, so let’s warm up with those.

Computing the reciprocal of a continued fraction turns out to be almost trivial, because continued fractions are built out of reciprocals! All we need to do for this one is shift all the terms by one… in either direction, really. Since our normal form only allows zero as the first term, we’ll make the decision based on whether the whole part is already zero. We also need a special case for 1 to prevent building a non-normal representation.

cfRecip :: CFrac -> CFraccfRecip (0 :+/ x) = xcfRecip (1 :+/ Inf) = 1 :+/ InfcfRecip x = 0 :+/ x

We will test a few expected properties:

• The reciprocal is self-inverse.
• The result matches taking a reciprocal in the rationals.
• The reciprocal always gives an answer in normal form.
prop_recipRecip_is_id :: NonNegative Rational -> Propertyprop_recipRecip_is_id (NonNegative r) =  r /= 0 ==> cfRecip (cfRecip (cfFromRational r))    === cfFromRational r
prop_cfRecip_matches_Rational :: NonNegative Rational -> Propertyprop_cfRecip_matches_Rational (NonNegative r) =  r /= 0 ==> cfFromRational (recip r) === cfRecip (cfFromRational r)
prop_cfRecip_isCanonical :: NonNegative Rational -> Propertyprop_cfRecip_isCanonical (NonNegative r) =  r /= 0 ==> isCanonical (cfRecip (cfFromRational r))

Comparing two continued fractions is also not too difficult. It’s similar to comparing decimals, in that you look down the sequence for the first position where they differ, and that determines the result. However, since we take reciprocals at each term of a continued fraction, we’ll need to flip the result at each term.

cfCompare :: CFrac -> CFrac -> OrderingcfCompare Inf Inf = EQcfCompare _ Inf = LTcfCompare Inf _ = GTcfCompare (a :+/ a') (b :+/ b') = case compare a b of  EQ -> cfCompare b' a'  r -> r

Let’s write a QuickCheck property to verify that the result matches the Ord instance for Rational, and then define an Ord instance of our own.

prop_cfCompare_matches_Rational ::  NonNegative Rational -> NonNegative Rational -> Propertyprop_cfCompare_matches_Rational (NonNegative x) (NonNegative y) =  compare x y === cfCompare (cfFromRational x) (cfFromRational y)
instance Ord CFrac where  compare = cfCompare

#### Arithmetic with rational numbers

To make any progress on the rest of basic arithmetic, we must return to our trusty old hammer: the mobius transformation. Recall that we’ve used the mobius transformation in two ways so far:

• To produce convergents, we produced mobius transformations from terms, composed them together, and also produced a new approximation at each step.
• To produce decimals, we looked at the bounds on the result of a mobius transformation, and proceeded in one of two ways: produce a bit of output, and then add a new mobius transformation to the output side; or consume a term of input, and then add a new mobius transformation to the input side.

We will now implement mobius transformations that act on continued fractions and produce new continued fractions. The strategy is the same as it was when producing decimals, but modified to produce continued fraction terms as output instead of decimal digits.

cfMobius :: Mobius -> CFrac -> CFraccfMobius (Mobius a _ c _) Inf = cfFromFrac a ccfMobius (Mobius _ _ 0 0) _ = InfcfMobius m x  | Just n <- mobiusIntPart m =    n :+/ cfMobius (Mobius 0 1 1 (- n) <> m) xcfMobius m (n :+/ x) = cfMobius (m <> Mobius n 1 1 0) x

There are two properties to test here. The first is that computations on continued fractions match the same computations on rational numbers. To implement this test, we’ll need an implementation of mobius transformations on rational numbers. Then we’ll test that cfMobius gives results in their canonical form. For both tests, we don’t care about transformations whose rational results are negative, as we will just agree not to evaluate them.

mobius :: (Eq a, Fractional a) => Mobius -> a -> Maybe amobius (Mobius a b c d) x  | q == 0 = Nothing  | otherwise = Just (p / q)  where    p = fromInteger a * x + fromInteger b    q = fromInteger c * x + fromInteger d
prop_cfMobius_matches_Rational ::  Mobius -> NonNegative Rational -> Propertyprop_cfMobius_matches_Rational m (NonNegative r) =  case mobius m r of    Just x      | x >= 0 ->        cfMobius m (cfFromRational r) === cfFromRational x    _ -> discard
prop_cfMobius_isCanonical ::  Mobius -> NonNegative Rational -> Propertyprop_cfMobius_isCanonical m (NonNegative r) =  case mobius m r of    Just rat      | rat >= 0 ->        let x = cfMobius m (cfFromRational r)         in counterexample (show x) (isCanonical x)    _ -> discard

The punchline here is that, using an appropriate mobius transformation, we can perform any of the four arithmetic functions — addition, subtraction, multiplication, or division — involving one continuous fraction and one rational number.

#### Binary operations on continued fractions

This was all sort of a warm-up, though, for the big guns: binary operations on two continued fractions, yielding a new continued fraction. Here, for the first time, our trusted mobius transformations fail us. Indeed, this remained an unsolved problem until 1972, when Bill Gosper devised a solution. That solution uses functions of the form:

Again, for our purposes, all coefficients will be integers, and the coefficients in the denominator will always be positive. Since these are the generalization of mobius transformations to binary operations, we can call them the bimobius transformations.

data Bimobius where  BM ::    Integer ->    Integer ->    Integer ->    Integer ->    Integer ->    Integer ->    Integer ->    Integer ->    Bimobius
deriving instance Eq Bimobiusderiving instance Show Bimobius

The rest of the implementation nearly writes itself! We simply want to follow the same old strategy. We keep a bimobius transformation that computes the rest of the output, given the rest of the inputs. At each step, we emit more output if possible, and otherwise consume a term from one of two inputs. As soon as either input is exhausted (if, indeed that happens), the remaining calculation reduces to a simple mobius transformation of the remaining argument.

First, we need an analogue of mobiusIntPart, which was how we determined whether an output was available.

bimobiusIntPart :: Bimobius -> Maybe IntegerbimobiusIntPart (BM a b c d e f g h)  | e /= 0 && f /= 0 && g /= 0 && h /= 0      && n2 == n1      && n3 == n1      && n4 == n1 =    Just n1  | otherwise = Nothing  where    n1 = a div e    n2 = b div f    n3 = c div g    n4 = d div h

The other building block we used to implement mobius transformations was composition with the Monoid instance. Unfortunately, that’s not so simple, since bimobius transformations have two inputs. There are now three composition forms. First, we can compose the mobius transformation on the left, consuming the output of the bimobius:

(<>||) :: Mobius -> Bimobius -> Bimobius-- (mob <>|| bimob) x y = mob (bimob x y)Mobius a1 b1 c1 d1 <>|| BM a2 b2 c2 d2 e2 f2 g2 h2 =  BM a b c d e f g h  where    a = a1 * a2 + b1 * e2    b = a1 * b2 + b1 * f2    c = a1 * c2 + b1 * g2    d = a1 * d2 + b1 * h2    e = c1 * a2 + d1 * e2    f = c1 * b2 + d1 * f2    g = c1 * c2 + d1 * g2    h = c1 * d2 + d1 * h2

We can also compose it on the right side, transforming either the first or second argument of the bimobius.

(||<) :: Bimobius -> Mobius -> Bimobius-- (bimob ||< mob) x y = bimob (mob x) yBM a1 b1 c1 d1 e1 f1 g1 h1 ||< Mobius a2 b2 c2 d2 =  BM a b c d e f g h  where    a = a1 * a2 + c1 * c2    b = b1 * a2 + d1 * c2    c = a1 * b2 + c1 * d2    d = b1 * b2 + d1 * d2    e = e1 * a2 + g1 * c2    f = f1 * a2 + h1 * c2    g = e1 * b2 + g1 * d2    h = f1 * b2 + h1 * d2
(||>) :: Bimobius -> Mobius -> Bimobius-- (bimob ||> mob) x y = bimob x (mob y)BM a1 b1 c1 d1 e1 f1 g1 h1 ||> Mobius a2 b2 c2 d2 =  BM a b c d e f g h  where    a = a1 * a2 + b1 * c2    b = a1 * b2 + b1 * d2    c = c1 * a2 + d1 * c2    d = c1 * b2 + d1 * d2    e = e1 * a2 + f1 * c2    f = e1 * b2 + f1 * d2    g = g1 * a2 + h1 * c2    h = g1 * b2 + h1 * d2

That’s a truly dizzying bunch of coefficients! I’m definitely not satisfied without property tests to ensure they match the intended meanings. To do that, we need two pieces of test code.

First, we want to generate random bimobius transformations with an Arbitrary instance. As with mobius transformations, the instance will guarantee that the denominators are non-zero and have non-negative coefficients.

instance Arbitrary Bimobius where  arbitrary =    suchThat      ( BM          <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary <*> (getNonNegative <$> arbitrary)          <*> (getNonNegative <$> arbitrary) <*> (getNonNegative <$> arbitrary)          <*> (getNonNegative <$> arbitrary) ) (\(BM _ _ _ _ e f g h) -> maximum [e, f, g, h] > 0)  shrink (BM a b c d e f g h) = [ BM a' b' c' d' e' f' g' h' | ( a', b', c', d', NonNegative e', NonNegative f', NonNegative g', NonNegative h' ) <- shrink ( a, b, c, d, NonNegative e, NonNegative f, NonNegative g, NonNegative h ), maximum [e', f', g', h'] > 0 ] And second, we want a simple and obviously correct base implementation of the bimobius transformation to compare with. bimobius :: (Eq a, Fractional a) => Bimobius -> a -> a -> Maybe abimobius (BM a b c d e f g h) x y | q == 0 = Nothing | otherwise = Just (p / q) where p = fromInteger a * x * y + fromInteger b * x + fromInteger c * y + fromInteger d q = fromInteger e * x * y + fromInteger f * x + fromInteger g * y + fromInteger h With this in mind, we’ll check each of the three composition operators performs as expected on rational number, at least. There’s one slight wrinkle: when composing the two functions manually leads to an undefined result, the composition operator sometimes cancels out the singularity and produces an answer. I’m willing to live with that, so I discard cases where the original result is undefined. prop_mob_o_bimob :: Mobius -> Bimobius -> Rational -> Rational -> Propertyprop_mob_o_bimob mob bimob r1 r2 = case mobius mob =<< bimobius bimob r1 r2 of Just ans -> bimobius (mob <>|| bimob) r1 r2 === Just ans Nothing -> discard prop_bimob_o_leftMob :: Bimobius -> Mobius -> Rational -> Rational -> Propertyprop_bimob_o_leftMob bimob mob r1 r2 = case (\x -> bimobius bimob x r2) =<< mobius mob r1 of Just ans -> bimobius (bimob ||< mob) r1 r2 === Just ans Nothing -> discard prop_bimob_o_rightMob :: Bimobius -> Mobius -> Rational -> Rational -> Propertyprop_bimob_o_rightMob bimob mob r1 r2 = case (\y -> bimobius bimob r1 y) =<< mobius mob r2 of Just ans -> bimobius (bimob ||> mob) r1 r2 === Just ans Nothing -> discard Now to the task at hand: implementing bimobius transformations on continued fractions. cfBimobius :: Bimobius -> CFrac -> CFrac -> CFraccfBimobius (BM a b _ _ e f _ _) Inf y = cfMobius (Mobius a b e f) ycfBimobius (BM a _ c _ e _ g _) x Inf = cfMobius (Mobius a c e g) xcfBimobius (BM _ _ _ _ 0 0 0 0) _ _ = InfcfBimobius bm x y | Just n <- bimobiusIntPart bm = let bm' = Mobius 0 1 1 (- n) <>|| bm in n :+/ cfBimobius bm' x ycfBimobius bm@(BM a b c d e f g h) x@(x0 :+/ x') y@(y0 :+/ y') | g == 0 && h == 0 = consumeX | h == 0 || h == 0 = consumeY | abs (g * (h * b - f * d)) > abs (f * (h * c - g * d)) = consumeX | otherwise = consumeY where consumeX = cfBimobius (bm ||< Mobius x0 1 1 0) x' y consumeY = cfBimobius (bm ||> Mobius y0 1 1 0) x y' There are a few easy cases at the start. If either argument is infinite, then the infinite terms dominate, and the bimobius reduces to a unary mobius transformation of the remaining argument. On the other hand, if the denominator is zero, the result is infinite. Next is the output case. This follows the same logic as the mobius transformations above: But since m and m’ are now bimobius transformations, we just use one of the special composition forms defined earlier. The remaining case is where we don’t have a new term to output, and must expand a term of the input, using the now-familiar equation, but composing with the ||< or ||> operators, instead. There’s a new wrinkle here, though! Which of the two inputs should we expand? It seems best to make the choice that narrows the bounds the most, but I’ll be honest: I don’t actually understand the full logic behind this choice. I’ve simply copied it from Rosetta Code, where it’s not explained in any detail. We will rely on testing for confidence that it works. Speaking of testing, we definitely need a few tests to ensure this works as intended. We’ll test the output against the Rational version of the function, and that the result is always in canonical form. prop_cfBimobius_matches_Rational :: NonNegative Rational -> NonNegative Rational -> Bimobius -> Propertyprop_cfBimobius_matches_Rational (NonNegative r1) (NonNegative r2) bm = case bimobius bm r1 r2 of Just x | x >= 0 -> cfFromRational x === cfBimobius bm (cfFromRational r1) (cfFromRational r2) _ -> discard prop_cfBimobius_isCanonical :: NonNegative Rational -> NonNegative Rational -> Bimobius -> Boolprop_cfBimobius_isCanonical (NonNegative r1) (NonNegative r2) bm = case bimobius bm r1 r2 of Just x | x >= 0 -> isCanonical (cfBimobius bm (cfFromRational r1) (cfFromRational r2)) _ -> discard If you like, it’s interesting to compare the implementations of continued fraction arithmetic here with some other sources, such as: These sources are great, but I think the code above is nicer. The reason is that with lazy evaluation instead of mutation, you can read the code more mathematically as a set of equations, and see what’s going on more clearly. This is quite different from the imperative implementations, which modify state in hidden places and require a lot of mental bookkeeping to keep track of it all. As I said, Haskell is a joy for this kind of programming. #### Tying it all together We now have all the tools we need to write Num and Fractional instances for CFrac. That’s a great way to wrap this all together, so that clients of this code don’t need to worry about mobius and bimobius transformations at all. checkNonNegative :: CFrac -> CFraccheckNonNegative Inf = InfcheckNonNegative x@(x0 :+/ x') | x < 0 = error "checkNonNegative: CFrac is negative" | x > 0 = x | otherwise = x0 :+/ checkNonNegative x' instance Num CFrac where fromInteger n | n >= 0 = n :+/ Inf | otherwise = error "fromInteger: CFrac cannot be negative"  (+) = cfBimobius (BM 0 1 1 0 0 0 0 1)  x - y = checkNonNegative (cfBimobius (BM 0 1 (-1) 0 0 0 0 1) x y)  (*) = cfBimobius (BM 1 0 0 0 0 0 0 1)  signum (0 :+/ Inf) = 0 signum _ = 1  abs = id  negate (0 :+/ Inf) = 0 :+/ Inf negate _ = error "negate: CFrac cannot be negative" instance Fractional CFrac where fromRational x | x < 0 = error "fromRational: CFrac cannot be negative" | otherwise = cfFromRational x  recip = cfRecip  (/) = cfBimobius (BM 0 1 0 0 0 0 1 0) You can find the complete code and tests for this section at https://code.world/haskell#P_T4CkfWGu3bgSwFV0R04Sg. In addition to running the tests, I’ve included a few computations, demonstrating for example that: ### Part 4: Going faster Everything we’ve done so far is correct, but there’s one way that it’s not terribly satisfying. It can do a lot of unnecessary work. In this part, we’ll try to fix this performance bug. You can find the code for this part in https://code.world/haskell#PP8uMTTchtH2F36nJYWT3VQ. Consider, for example, a computation like x + 0.7. We know (and, indeed, Haskell also knows!) that 0.7 is a rational number. We should, then, be able to work this out by applying cfMobius to a simple mobius transformation. But that’s not what GHC does. Instead, GHC reasons like this: • We assume that x is a CFrac. But + requires that both arguments have the same type, so 0.7 must also be a CFrac. • Since 0.7 is a literal, GHC will implicitly add a fromRational, applied to the rational form 7/10, which (using the Fractional instance defined in the last part) uses cfFromRational to convert 0.7 into the continued fraction 0 :+/ 1 :+/ 2 :+/ 3 :+/ Inf. • Now, since we have two CFrac values, we must use the heavyweight cfBimobius to compute the answer, choosing between the two streams at every step. In the process, cfBimobius must effectively undo the conversion of 7/10 into continued fraction form, doubling the unnecessary work. #### Rewriting expressions We can fix this with GHC’s rewrite rules. They can avoid this unnecessary conversion, and reduce constant factors at the same time, by rewriting expressions to use cfMobius instead of cfBimobius when one argument need not be a continued fraction. Similarly, when cfMobius is applied to a rational argument, it can be replaced by a simpler call to cfFromFrac after just evaluating the mobius transformation on the Rational directly. Integers are another special case: less expensive because their continued fraction representations are trivial, but but we might as well rewrite them, too. {-# RULES"cfrac/cfMobiusInt" forall a b c d n. cfMobius (Mobius a b c d) (n :+/ Inf) = cfFromFrac (a * n + b) (c * n + d)"cfrac/cfMobiusRat" forall a b c d p q. cfMobius (Mobius a b c d) (cfFromFrac p q) = cfFromFrac (a * p + b * q) (c * p + d * q)"cfrac/cfBimobiusInt1" forall a b c d e f g h n y. cfBimobius (BM a b c d e f g h) (n :+/ Inf) y = cfMobius (Mobius (a * n + c) (b * n + d) (e * n + g) (f * n + h)) y"cfrac/cfBimobiusRat1" forall a b c d e f g h p q y. cfBimobius (BM a b c d e f g h) (cfFromFrac p q) y = cfMobius ( Mobius (a * p + c * q) (b * p + d * q) (e * p + g * q) (f * p + h * q) ) y"cfrac/cfBimobiusInt2" forall a b c d e f g h n x. cfBimobius (BM a b c d e f g h) x (n :+/ Inf) = cfMobius (Mobius (a * n + b) (c * n + d) (e * n + f) (g * n + h)) x"cfrac/cfBimobiusRat2" forall a b c d e f g h p q x. cfBimobius (BM a b c d e f g h) x (cfFromFrac p q) = cfMobius ( Mobius (a * p + b * q) (c * p + d * q) (e * p + f * q) (g * p + h * q) ) x #-} There’s another opportunity for rewriting, and we’ve actually already used it! When mobius or bimobius transformations are composed together, we can do the composition in advance to produce a single transformation. Especially since these transformations usually have simple integer arguments that GHC knows how to optimize, this can unlock a lot of other arithmetic optimizations. Fortunately, we’ve already defined and tested all of the composition operators we need here. {-# RULES"cfrac/mobius_o_mobius" forall m1 m2 x. cfMobius m1 (cfMobius m2 x) = cfMobius (m1 <> m2) x"cfrac/mobius_o_bimobius" forall m bm x y. cfMobius m (cfBimobius bm x y) = cfBimobius (m <>|| bm) x y"cfrac/bimobius_o_mobiusLeft" forall bm m x y. cfBimobius bm (cfMobius m x) y = cfBimobius (bm ||< m) x y"cfrac/bimobius_o_mobiusRight" forall bm m x y. cfBimobius bm x (cfMobius m y) = cfBimobius (bm ||> m) x y #-} #### Inlining, and not This all looks great… but it doesn’t work. The problem is that GHC isn’t willing to inline the right things at the right times to get to exactly the form we need for these rewrite rules to fire. To fix this, I had to go back through all of the previous code, being careful to annotate many of the key functions with {-# INLINE ... #-} pragmas. There are two forms of this pragma. When a trivial function might just get in the way of a rewrite rule, it’s annotated with a simple INLINE pragma so that GHC can inline and hopefully eliminate the function. But when a function actually appears on the left hand side of a rewrite rule, though, we need to step more carefully. Inlining that function can prevent the rewrite rule from firing. We could use NOINLINE for this, but I’ve instead picked the form of inline that includes a phase number. The phase number prevents inlining from occurring prior to that phase. It looks something like: {-# INLINE [2] cfRecip #-} Choosing a phase number seems to be something of a black art. I picked 2, and it worked, so… maybe try that one? All these edits mean we have completely new versions of all of the code, and it’s in these files: Done… but how do we know it worked? #### Verifying the rewrites Rewrite rules are a bit of a black art, and it’s nice to at least get some confirmation that they are working. GHC can tell us what’s going on if we pass some flags. Here are some useful ones: • -ddump-rule-rewrites will print out a description of every rewrite rule that fires, including the before-and-after code. • -ddump-simpl will print out the final Core, which is GHC’s Haskell-like intermediate language, after the simplifier (which applies these rules) is done. Adding -dsuppress-all makes the result more readable by hiding metadata we don’t care about. In CodeWorld, we can pass options to GHC by adding an OPTIONS_GHC pragma to the source code. If you looked at the Part 4 link above, you saw that I’ve done so. The output is long! But there are a few things we can notice. First, we named our rules, so we can search for the rules by name to see if they have fired. They do! We see this: ... Rule fired Rule: cfrac/cfBimobiusInt1 Module: (CFracPart4) ... Rule fired Rule: timesInteger Module: (BUILTIN) Before: GHC.Integer.Type.timesInteger ValArg 0 ValArg n_aocc After: 0 Cont: Stop[RuleArgCtxt] GHC.Integer.Type.Integer Rule fired Rule: plusInteger Module: (BUILTIN) Before: GHC.Integer.Type.plusInteger ValArg src<GHC/In an imported module> 0 ValArg 1 After: 1 ... Rule fired Rule: cfrac/bimobius_o_mobiusLeft Module: (CFracPart4) ... Rule fired Rule: cfrac/cfBimobiusInt2 Module: (CFracPart4) Good news: our rules are firing! I highlighted the timesInteger and plusInteger rules, as well. This is GHC taking the result of our rules, which involve a bunch of expressions like a1 * a2 + b1 * c2, and working out the math at compile time because it already knows the arguments. This is great! We want that to happen. But what about the result? For this, we turn to the -ddump-simpl output. It’s a bit harder to read, but after enough digging, we can find this (after some cleanup): main4 = 1main3 = 0main2 = 2main1 =$wunsafeTake      50# (cfToDecimal ($wcfMobius main4 main4 main3 main2 sqrt5)) This is essentially the expression take 50 (cfDecimal (cfMobius (Mobius 1 1 0 2) sqrt5)). Recall that what we originally wrote was (1 + sqrt5) / 2. We have succeeded in turning an arithmetic expression of CFrac values into a simple mobius transformation at compile time! ### Conclusion We finally reach the end of our journey. There are still some loose ends, as there always are. For example: • There’s still the thing with negative numbers, which I left as an exercise. • Arithmetic hangs when computing with irrational numbers that produce rational results. For example, try computing sqrt2 * sqrt2, and you’ll be staring at a blank screen while your CPU fan gets some exercise. This might seems like a simple bug, but it’s actually a fundamental limitation of our approach. An incorrect term arbitrarily far into the expression for sqrt2 could change the integer part of the result, so any correct algorithm must read the entire continued fraction before producing an answer. But the entire fraction is infinite! You could try to fix this by special-casing the cyclic continued fractions, but you can never catch all the cases. • There’s probably plenty more low-hanging fruit for performance. I haven’t verified the rewrite rules in earnest, really. Something like Joachim Breitner’s inspection testing idea could bring more rigor to this. (Sadly, libraries for inspection testing require Template Haskell, so you cannot try it within CodeWorld.) • Other representations for exact real arithmetic offer more advantages. David Lester, for example, has written a very concise exact real arithmetic library using Cauchy sequences with exponential convergence. In the end, though, I hope you’ve found this part of the journey rewarding in its own right. We developed a non-trivial library in Haskell, derived the code using equational reasoning (even if it was a bit fuzzy), tested it with property testing, and optimized it with rewrite rules. All of these play very well with Haskell’s declarative programming style. ### Tweag I/O # Ormolu internship Two years ago I started working on a new formatter for Haskell source code. My idea was to take advantage of the parser of GHC itself, which finally became available as a library around that time. Soon Tweag supported the initiative and the project became Ormolu. It was announced in May, right before ZuriHac 2019. Many people kindly helped me with it during the Hackathon. We went on to release the first version in October 2019. The use of the GHC parser and a solid approach to testing won Ormolu the reputation of a dependable tool. More and more industrial users choose it as their Haskell formatter. In summer 2020 a rather big company had decided that they wanted to format their Haskell code with Ormolu, which resulted in a three-month full-time contract. A new level of quality was reached. I think it is cool to be paid to work full-time on a project like this. If you concur, we have good news for you! Tweag currently has an opening for an intern who would like to work on the formatter. This is a project for someone who would enjoy iteratively improving a popular Haskell tool and learn about GHC AST, its parser, and perhaps a bit of Nix. There are various issues affecting Ormolu, and fixing these would have a positive impact on the user experience. The internship would address them in severity order: 1. Upgrading the GHC parser. Presently, Ormolu uses ghc-lib-parser. GHC 9.0 fixes some long-standing issues, and we can take advantage of that by switching to ghc-lib-parser-9.0.1.xxx. 2. Some more bugs of varying difficulty. 3. Stylistic changes. Please let us know if you are interested! Include a cover letter describing your Haskell experience; familiarity with GHC AST is a plus. We will collect applications till Tuesday, June 1, 2021. The internship can start any time after the offer is made, subject to mutual availability. Internships typically last 12 weeks, although the duration may be adjusted if necessary. If you have any questions, feel free to email me. ## April 21, 2021 ### Magnus Therning # First contribution to nixpkgs.haskellPackages Nothing much to be proud of, but yesterday I found out that servant-docs was marked broken in nixpkgs even though it builds just fine and this morning I decided to do something about it. So, with the help of a post on the NixOS discourse I put together my first PR. ### Gabriel Gonzalez # The end of history for programming history I spend quite a bit of time thinking about what the end of history for programming might look like. By the “end of history” I mean the point beyond which programming paradigms would not evolve significantly. I care about programming’s “destiny” because I prefer to work on open source projects that bring us closer to that final programming paradigm. In my experience, this sort of work has greater longevity, higher impact, and helps move the whole field of programming forward. So what would the end of history look like for programming? Is it: • … already here? Some people treat programming as a solved problem and view all new languages and paradigms as reskins of old languages or paradigms. From their point of view all that remains is to refine our craft by slowly optimizing things, weeding out bugs or addressing non-technical issues like people management or funding. I personally do not subscribe to this philosophy because I believe, at a very minimum, that functional programming paradigms will slowly displace object-oriented and imperative programming paradigms (although functional programming might not necessarily be the final programming paradigm). • … artificial intelligence? Maybe machines will translate our natural language instructions into code for us, relieving us of the burden of precisely communicating our intent. Or maybe some sufficiently smart AI-powered IDE will auto-complete most of our program for us. I don’t believe this either, and I think Dijkstra did an excellent job of dismantling this line of reasoning in his essay: On the foolishness of “natural language programming”. Truthfully, I’m not certain what is the correct answer, but I’ll present my own guess for what the end of history for programming looks like. #### Mathematical DSLs My view is that the next logical step for programming is to split into two non-overlapping programming domains: • runtime building for … • … mathematical programming languages Specifically, I expect programming languages to evolve to become more mathematical in nature where the programs users author to communicate their intent resemble pure mathematical expressions. For example, consider the following mathematical specifications of the boolean logical “and” operator and the function composition operator: True && x = xx && True = xFalse && False = False(f . g)(x) = f(g(x)) These mathematical specifications are also executable Haskell code (albeit with extra parentheses to resemble mainstream function syntax). Haskell is one example of a programming language where the code aspires to resemble pure mathematical expressions and definitions. Any language presenting such an idealized mathematical interface requires introducing a significant amount of complexity “under the hood” since the real world is messy. This is where runtime building comes into play to gloss over such ugly details. In other words, I’m predicting that the end of history for programming is to become an interface bridging pure mathematical expressions to the real world. #### The past and present Let me give a few examples of where this trend towards mathematical userland code is already playing out: • Memory management Manual memory management used to be a “userland” concern for most programming languages, but the general trend for new languages is automatic memory management (with the exception of Rust). Memory management used to be an explicit side effect that programmers had to care about and pushing memory management down into the runtime (via garbage collection or other methods) has made programming languages more pure, allowing them to get one step closer to idealized mathematical expressions. Indeed, Rust is the exception that proves the rule, as Rust is widely viewed as better suited for building runtimes rather than being used for high-level specification of intent. • Functional programming (especially purely functional programming) Functional programming is a large step towards programming in a more mathematical style that prefers: • expressions over statements • pure functions over side effects • algebraic datatypes over objects • recursion over loops I cover this in more detail in Why I prefer functional programming. However, functional programming is not a free win. Supporting higher-order functions and closures efficiently isn’t easy (especially for compiled languages), which is why less sophisticated language implementations tend to be more imperative and less functional. • Evaluation order (especially laziness) I’ve always felt that “lazy evaluation” doesn’t do a good job of selling the benefits of Haskell’s evaluation model. I prefer to think of Haskell as having “automatic evaluation management”1. In other words, the programmer specifies the expressions as a graph of dependent computations and the runtime figures out the most efficient order in which to reduce the graph. This is yet another example of where we push something that used to be a userland concern (order of evaluation) into the runtime. Glossing over evaluation order frees us to specify things in a more mathematical style, because in mathematics the evaluation order is similarly irrelevant. #### The present and future A common pattern emerges when we study the above trends: • Push a userland concern into a runtime concern, which: • … makes programs more closely resemble pure mathematical expressions, and: • … significantly increases the complexity of the runtime. You might wonder: what are some other userland concerns that might eventually get pushed into runtime concerns in the near future? Some examples I can think of are: • Package management This one is so near in the future that it’s already happening (see: Nix and Dhall). Both languages provide built-in support for fetching code, rather than handling packages out-of-band using a standalone package management tool. This language-level support allows programs to embed external code as if it were a pure sub-expression, more closely approximating the mathematical ideal. • Error handling This one requires more explanation: I view type systems as the logical conclusion of pushing error handling into the “runtime” (actually, into the type-checker, not the runtime, to be pedantic). Dhall is an example of a language that takes this idea to the extreme: Dhall has no userland support for raising or catching errors because all errors are type errors2. Advances in dependently-typed and total functional programming languages get us closer to this ideal of pushing error handling into a runtime concern. • Logging I’m actually surprised that language support for pervasively logging everything hasn’t been done already (or maybe it has happened and I missed it). It seems like a pretty mundane thing languages could implement, especially for application domains that are not performance-sensitive. Many languages already support profiling, and it seems like it wouldn’t be a big leap to turn profiling support into logging support if the user is willing to spend their performance budget on logging. Logging is one of those classic side effects that is an ugly detail that “ruins” code that would have otherwise been pure and mathematical. • Services Service-oriented architectures are another thing that tends to get in the way of writing pure side-effect-free code. I’m not exactly sure what a service-oriented language runtime would look like, but I don’t think current “serverless” solutions are what I have in mind. Something like AWS Lambda is still too low-level to promote code that is mathematical in nature. Like, if any part of the programming process involves using a separate tool to deploy or manage the serverless code then that is a significant departure from authoring pure mathematical expressions. There needs to be something like a “Nix or Dhall for serverless code”. #### Conclusion You might criticize my prediction by claiming that it’s not falsifiable. It seems like I could explain away any new development in programming as falling into the runtime building or mathematical expression category. That’s why I’d like to stress a key pillar of the prediction: runtimes and mathematical expressions will become more sharply divided over time. This is the actual substance of the prediction and we can infer a few corrolaries from that prediction. Currently, many mainstream programming paradigms and engineering organizations conflate the two responsibilities, so you end up with people authoring software projects that mix operational logic (runtime concerns) and “business logic” (mathematical intent). What I predict will happen is that the field of engineering will begin to generate a sharp demand for people with experience in programming language theory or programming language engineering. These people will be responsible for building special-purpose languages and runtimes that abstract away as many operational concerns as possible to support pure mathematical domain-specific languages for their respective enterprise. These languages will in turn be used by a separate group of people whose aim is to translate human intent into mathematical expressions. One consequence of this prediction is that you’ll begin to see a Cambrian explosion of programming languages in the near future. Naturally as language engineers push more operational concerns into the runtime they will need to more closely tailor the runtime to specific purposes or organizational needs rather than trying to author a general-purpose language. In other words, there will be a marked fragmentation of language runtimes (and type-checkers) as each new language adapts to their respective niche. Despite runtime fragmentation, you will see the opposite trend in userland code: programs authored in these disparate languages will begin to more closely resemble one another as they become more mathematical in nature. In a sense, mathematical expressions will become the portable “lingua franca” of userland code, especially as non-mathematical concerns get pushed into each respective language’s runtime. That is a prediction that is much easier to falsify. Also, if you like this post then you will probably also enjoy the seminal paper: The Next 700 Programming Languages. 1. Also, this interpretation is not that far from the truth as the Haskell standard only specifies a non-strict evaluation strategy. GHC is lazy, but Haskell the language standard does not require implementations to be lazy. For more details, see: Lazy vs. non-strict↩︎ 2. Okay, this is a bit of an over-simplification because Dhall has support for Optional values and you can model errors using unions in Dhall, but they are not commonly used in this way and idiomatically most Dhall code in the wild uses the type-checker to catch errors. Dhall is a total functional programming language and the language goes to significant lengths to discourage runtime errors, like forbidding comparing Text values for equality. Also, the language is technically dependently typed and supports testing arbitrary code at type-checking time to catch errors statically.↩︎ ### Matt Parsons # Global IORef in Template Haskell I’m investigating a way to speed up persistent as well as make it more powerful, and one of the potential solutions involves persisting some global state across module boundaries. I decided to investigate whether the “Global IORef Trick” would work for this. Unfortunately, it doesn’t. On reflection, it seems obvious: the interpreter for Template Haskell is a GHCi-like process that is loaded for each module. Loading an interpreter for each module is part of why Template Haskell imposes a compile-time penalty - in my measurements, it’s something like ~100ms. Not huge, but noticeable on large projects. (I still generally find that DeriveGeneric and the related Generic code to be slower, but it’s a complex issue). Anyway, let’s review the trick and obseve the behavior. # Global IORef Trick This trick allows you to have an IORef (or MVar) that serves as a global reference. You almost certainly do not need to do this, but it can be a convenient way to hide state and make your program deeply mysterious. Here’s the trick: module Lib where import Data.IORef import System.IO.Unsafe globalRef :: IORef [String] globalRef = unsafePerformIO$ newIORef []
{-# NOINLINE globalRef #-}


There are two important things to note:

1. You must give a concrete type to this.
2. You must write the {-# NOINLINE globalRef #-} pragma.

Let’s say we give globalRef a more general type:

globalRef :: IORef [a]


This means that we woudl be allowed to write and read whatever we want from this reference. That’s bad! We could do something like writeIORef globalRef [1,2,3], and then readIORef globalRef :: IO [String]. Boom, your program explodes.

Unless you want a dynamically typed reference for some reason - and even then, you’d better use Dynamic.

If you omit the NOINLINE pragma, then you’ll just get a fresh reference each time you use it. GHC will see that any reference to globalRef can be inlined to unsafePerformIO (newIORef []), and it’ll happily perform that optimization. But that means you won’t be sharing state through the reference.

This is a bad idea, don’t use it. I hesitate to even explain it.

# Testing the Trick

But, well, sometimes you try things out to see if they work. In this case, they don’t, so it’s useful to document that.

We’re going to write a function trackString that remembers the strings that are passed previously, and defines a value that returns those.

trackString "hello"
-- hello = []

trackString "goodbye"
-- goodbye = ["hello"]

trackString "what"
-- what = ["goodbye", "hello"]


Here’s our full module:

{-# language QuasiQuotes #-}

module Lib where

import Data.IORef
import System.IO.Unsafe

globalRef :: IORef [String]
globalRef = unsafePerformIO $newIORef [] {-# NOINLINE globalRef #-} trackStrings :: String -> Q [Dec] trackStrings input = do strs <- runIO$ readIORef globalRef
_ <- runIO $atomicModifyIORef globalRef (\i -> (input : i, ())) ty <- [t| [String] |] pure [ SigD (mkName input) ty , ValD (VarP (mkName input)) (NormalB (ListE$ map (LitE . stringL) $strs)) [] ]  This works in a single module just fine. {-# language TemplateHaskell #-} module Test where import Lib trackStrings "what" trackStrings "good" trackStrings "nothx" test :: IO () test = do print what print good print nothx  If we evaluate test, we get the following output: [] ["what"] ["good","what"]  This is exactly what we want. Unfortunately, this is only module-local state. Given this Main module, we get some disappointing output:  {-# language TemplateHaskell #-} module Main where import Lib import Test trackStrings "hello" trackStrings "world" trackStrings "goodbye" main :: IO () main = do test print hello print world print goodbye  [] ["what"] ["good","what"] [] ["hello"] ["world","hello"]  To solve my problem, main would have needed to output: [] ["what"] ["good","what"] ["nothx","good","what"] ["hello","nothx","good","what"] ["world","hello","nothx","good","what"]  # Module-local state in Template Haskell Fortunately, we don’t even need to do anything awful like this. The Q monad offers two methods, getQ and putQ that allow module-local state. -- | Get state from the Q monad. Note that the state is -- local to the Haskell module in which the Template -- Haskell expression is executed. getQ :: Typeable a => Q (Maybe a) -- | Replace the state in the Q monad. Note that the -- state is local to the Haskell module in which the -- Template Haskell expression is executed. putQ :: Typeable a => a -> Q ()  These use a Typeable dictionary, so you can store many kinds of state - one for each type! This is a neat way to avoid the “polymorphic reference” problem I described above. # How to actually solve the problem? If y’all dare me enough I might write a follow-up where I investigate using a compact region to persist state across modules, but I’m terrified of the potential complexity at play there. I imagine it’d work fine for a single threaded compile, but there’d probably be contention on the file with parallel builds. Hey, maybe I just need to spin up a redis server to manage the file locks… Perhaps I can install nix at compile-time and call out to a nix-shell that installs Redis and runs the server. ## April 19, 2021 ### Mark Jason Dominus # Odd translation choices Recently I've been complaining about unforced translation errors. ([1] [2]) Here's one I saw today: The translation was given as: “honk honk, your Uber has arrived” “Oleg, what the fuck” Now, the Russian text clearly says “beep-beep” (“бип-бип”), not “honk honk”. I could understand translating this as "honk honk" if "beep beep" were not a standard car sound in English. But English-speaking cars do say “beep beep”, so why change the original? (Also, a much smaller point: I have no objection to translating “Что за херня” as “what the fuck”. But why translate “Что за херня, Олег?” as “Oleg, what the fuck” instead of “What the fuck, Oleg”?) [ Addendum 20210420: Katara suggested that perhaps the original translator was simply unaware that Anglophone cars also “beep beep”. ] ### Manuel M T Chakravarty # This is the video of the keynote talk “Blockchains are functional” that I delivered at... This is the video of the keynote talk “Blockchains are functional” that I delivered at the ACM SIGPLAN International Conference on Functional Programming 2019. Here is the abstract: Functional programming and blockchains are a match made in heaven! The immutable and reproducible nature of distributed ledgers is mirrored in the semantic foundation of functional programming. Moreover, the concurrent and distributed operation calls for a programming model that carefully controls shared mutable state and side effects. Finally, the high financial stakes often associated with blockchains suggest the need for high assurance software and formal methods. Nevertheless, most existing blockchains favour an object-oriented, imperative approach in both their implementation as well as in the contract programming layer that provides user-defined custom functionality on top of the basic ledger. On the one hand, this might appear surprising, given that it is widely understood that this style of programming is particularly risky in concurrent and distributed systems. On the other hand, blockchains are still in their infancy and little research has been conducted into associated programming language technology. In this talk, I explain the connection between blockchains and functional programming as well as highlight several areas where functional programming, type systems, and formal methods have the potential to advance the state of the art. Overall, I argue that blockchains are not just a well-suited application area for functional programming techniques, but that they also provide fertile ground for future research. I illustrate this with evidence from the research-driven development of the Cardano blockchain and its contract programming platform, Plutus. Cardano and Plutus are implemented in Haskell and Rust, and the development process includes semi-formal specifications together with the use of Agda, Coq, and Isabelle to formalise key components. ## April 15, 2021 ### Tweag I/O # Arrows, through a different lens Our previous posts on computational pipelines, such as those introducing Funflow and Porcupine, show that Arrows are very useful for data science workflows. They allow the construction of effectful and composable pipelines whose structure is known at compile time, which is not possible when using Monads. However, Arrows may seem awkward to work with at first. For instance, it’s not obvious how to use lenses to access record fields in Arrows. My goal in this post is to show how lenses and other optics can be used in Arrow-based workflows. Doing so is greatly simplified thanks to Profunctor optics and some utilities that I helped add to the latest version of the lens library. ## Optics on functions We’re used to think of lenses in terms of getters and setters, but I’m more interested today in the functions over and traverseOf. -- We will use this prefix for the remaining of the post. -- VL stands for Van Laarhoven lenses. import qualified Control.Lens as VL -- Transform a pure function. over :: VL.Lens s t a b -> (a -> b) -> (s -> t) -- Transform an effectful function. traverseOf :: VL.Traversal s t a b -> (a -> m b) -> (s -> m t) We would like to use similar functions on Arrow-based workflows, something like overArrow :: VL.Lens s t a b -> Task a b -> Task s t However, the type of lenses: type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t doesn’t make it very obvious how to define overArrow. On the other hand, Arrows come equipped with functions first and second: first :: Task b c -> Task (b, d) (c, d) second :: Task b c -> Task (d, b) (d, c) which very much feel like specialised versions of overArrow for the lenses _1 :: VL.Lens (b, d) (c, d) b c _2 :: VL.Lens (d, b) (d, c) b c so maybe there is a common framework that can take both of these into account? The answer is yes, and the solution is lenses — but lenses of a different type. ## Profunctor optics There is an alternative and equivalent formulation of optics, called Profunctor optics, that works very well with Arrows. Optics in the Profunctor framework have the following shape: type Optic p s t a b = p a b -> p s t with more precise optics such as Lens being obtained by imposing constraints to p coming from the different Profunctor classes. In other words, an Optic is precisely a higher-order function acting on some profunctor. Because every Arrow is also a Profunctor1, the shape of an Optic is precisely what is needed to act on Arrows! Moreover, like the optics of the lens library, profunctor optics can be composed like regular functions, with (.). The lens library now includes a module containing functions that convert between standard and profunctor optics, which makes using them very convenient. In the following sections, we will go through the use and the intuition of the most common optics: Lens, Prism and Traversal. But first, let’s import the compatibility module for profunctor optics: -- PL for Profunctor Lenses import Control.Lens.Profunctor as PL ## Lenses Standard lenses are all about products — view, for example, is used to deconstruct records: view _fst :: (a, b) -> a Therefore, it makes sense for Profunctor lenses to also talk about products. Indeed, that is exactly what happens, through the Strong type class: class Profunctor p => Strong p where first' :: p a b -> p (a, c) (b, c) second' :: p a b -> p (c, a) (c, b) With profunctor optics, a Lens is defined as follows: type Lens s t a b = forall p. Strong p => p a b -> p s t Every Arrow satisfies the Strong class. If we squint, we can rewrite the type of these functions as: first' :: Lens' (a,c) (b,c) a b second' :: Lens' (c,a) (c,b) a b That is, a Strong profunctor is equipped with lenses to reach inside products. One can always convert a record into nested pairs and act on them using Strong — the Lens just makes this much more convenient. But how do we build a Lens? Besides writing them manually, we can also use all Lenses from lens: PL.fromLens :: VL.Lens s t a b -> Lens s t a b which means we can still use all the lenses we know and love. For example, one can apply a task to a tuple of arbitrary size: PL.fromLens _1 :: Task a b -> Task (a,x,y) (b,x,y) Summarizing, a Strong profunctor is one we can apply lenses to. Since every Arrow is also a Strong profunctor, one can use Lenses with them. ## Prisms Standard prisms are all about sums — preview, for example, is used to deconstruct sum-types: view _Left :: Either a b -> Maybe a Therefore, it makes sense for Profunctor prisms to also talk about sums. Indeed, that is exactly what happens, through the Choice type class: class Profunctor p => Choice p where left' :: p a b -> p (Either a c) (Either b c) right' :: p a b -> p (Either c a) (Either c b) With profunctor optics, a Prism is defined as follows: type Prism s t a b = forall p. Choice p => p a b -> p s t Every ArrowChoice satisfies the Choice class. Once more, we can rewrite the type of these functions as: left' :: Prism (Either a c) (Either b c) a b right' :: Prism (Either c a) (Either c b) a b That is, a Choice profunctor is equipped with prisms to discriminate sums. One can always convert a sum into nested Eithers and act on them using Choice — the Prism just makes this much more convenient. But how do we build a Prism? We can also use any prisms from lens with a simple conversion: PL.fromPrism :: VL.Prism s t a b -> Prism s t a b For example, one can execute a task conditionally, depending on the existence of the input: PL.fromPrism _Just :: Action a b -> Action (Maybe a) (Maybe b) Summarizing, a Choice profunctor is one we can apply prisms to. Since every ArrowChoice can be a Choice profunctor, one can uses prisms with them. ## Traversals Standard traversals are all about Traversable structures — mapMOf, for example, is used to execute effectful functions: mapMOf traverse readFile :: [FilePath] -> IO [String] Therefore, it makes sense for Profunctor traversals to also talk about these traversable structures. Indeed, that is exactly what happens, through the Traversing type class: class (Choice p, Strong p) => Traversing p where traverse' :: Traversable f => p a b -> p (f a) (f b) With profunctor optics, a Traversal is defined as follows: type Traversal s t a b = forall p. Traversing p => p a b -> p s t There is no associated Arrow class that corresponds to this class, but many Arrows, such as Kleisli, satisfy it. We can rewrite the type of this functions as: traverse' :: Traversable f => Traversal (f a) (f b) a b That is, a Traversing profunctor can be lifted through Traversable functors. But how do we build a Traversal? We can also use any Traversal from lens with a simple conversion: PL.fromTraversal :: VL.Traversal s t a b -> Traversal s t a b For example, one can have a task and apply it to a list of inputs: PL.fromTraversal traverse :: Action a b -> Action [a] [b] ## Conclusion Using Arrows does not stop us from taking advantage of the Haskell ecosystem. In particular, optics interact very naturally with Arrows, both in their classical and profunctor formulations. For the moment, the ecosystem is still lacking a standard library for Profunctor optics, but this is not a show stopper — the lens library itself has most of the tools we need. So the next time you are trying out Funflow or Porcupine, don’t shy away from using lens! 1. The fact that these hierarchies are separated is due to historical reasons. ## April 14, 2021 ### Mark Jason Dominus # More soup-guzzling A couple of days ago I discussed the epithet “soup-guzzling pie-muncher”, which in the original Medieval Italian was brodaiuolo manicator di torte. I had compained that where most translations rendered the delightful word brodaiuolo as something like “soup-guzzler” or “broth-swiller”, Richard Aldington used the much less vivid “glutton”. A form of the word brodaiuolo appears in one other place in the Decameron, in the sixth story on the first day, also told by Emilia, who as you remember has nothing good to say about the clergy: … lo 'nquisitore sentendo trafiggere la lor brodaiuola ipocrisia tutto si turbò… J. M. Rigg (1903), who had elsewhere translated brodaiuolo as “broth-guzzling”, this time went with “gluttony”: …the inquisitor, feeling that their gluttony and hypocrisy had received a home-thrust… G. H. McWilliam (1972) does at least imply the broth: …the inquisitor himself, on hearing their guzzling hypocrisy exposed… John Payne (1886): the latter, feeling the hit at the broth-swilling hypocrisy of himself and his brethren… Cormac Ó Cuilleanáin's revision of Payne (2004): …the inquisitor himself, feeling that the broth-swilling hypocrisy of himself and his brethren had been punctured… And what about Aldington (1930), who dropped the ball the other time and rendered brodaiuolo merely as “glutton”? Here he says: … he felt it was a stab at their thick-soup hypocrisy… Oh, Richard. I think you should have tried harder. ### Well-Typed.Com # GHC activities report: February-March 2021 This is the fifth edition of our GHC activities report, which is intended to provide regular updates on the work on GHC and related projects that we are doing at Well-Typed. This edition covers roughly the months of February and and March 2021. The previous editions are here: A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK and Facebook, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide. If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch. Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)! ## Release management • Ben Gamari finalized releases of GHC 8.10.4 and 9.0.1, and started work on 9.0.2. Additionally, he worked to finish and merge the outstanding patches pending for GHC 9.2, culminating in an alpha release candidate (9.2.1-alpha1). • Ben also prepared a blog post updating the community on the state of GHC on Apple M1 hardware. ## Frontend • Doug Wilson proposed adding interprocess semaphore support to allow multiple concurrent compiler processes to make better use of available parallelism (!5176). • Matthew Pickering performed some simulations and experiments of the proposed changes to increase the parallelism in --make mode (#14095) and the effect of the Doug’s -jsem flag (!5176). The results show both changes have potential to improve compile times when many cores are available. • Matt opened a GHC proposal to modify the import syntax to distinguish between modules which are used at compile time and runtime. This will mean that enabling TemplateHaskell in your project will cause less recompilation when using ghci and haskell-language-server. • Ben fixed a long-standing bug rendering the GHCi linker unable to locate dynamic libraries when not explicitly added to LD_LIBRARY_PATH (#19350) ## Profiling and Debugging • Matt landed the last of the ghc-debug patches into GHC, so it will be ready for use when GHC 9.2 is released. • Matt has finished off a patch which Ben and David Eichmann started which fixes races in the eventlog implementation. This makes it possible to reliably restart the eventlog and opens up the possibility of remote monitoring tooling which connects to the eventlog over a socket. Matt has made a start on implementing libraries which allow monitoring in this way. • Matt enhanced the restart support for the eventlog by ensuring certain initialisation events are re-posted every time the eventlog is restarted (!5186). • Matt added new events to the eventlog which track the current number of allocated blocks and some statistics about memory fragmentation (!5126). • To aid in recent compiler-performance investigations, Ben has been working on merging corediff, a utility for performing structural comparisons on Core, into his ghc-dump tool. ## RTS • Matt investigated some discrepancies between OS reported memory usage and live bytes used by a program. This resulted in a new RTS flag (-Fd) in !5036 and a fix which reduces fragmentation due to nursery blocks (!5175). This work is described in more detail in a recent blog post. ## Compiler Performance • Adam Gundry has been investigating the extremely poor performance of type families that do significant computation (#8095), building on previous work by Ben and Simon Peyton Jones on “coercion zapping”. He has an experimental patch that can significantly improve performance in some cases (!5286), although more work is needed. • Andreas Klebinger refactored the way the simplifier creates uniques (!4804). This reduces compiler allocations by between 0.1% to 0.5% when compiling with -O. • Andreas also stomped out some sources of thunks in the simplifier (!4808). The impact varies by program but allocations for GHC were reduced by ~1% in common cases and by up to 4% for particular tests. • Andreas also finished a similar patch for the code generator (!5236). This mostly benefits users frequently compiling with -O0 where we save around 0.2% of allocations. • Andreas and Ben also applied the one-shot trick (#18202) to a few more monads inside of GHC. In aggregate this should reduce GHC’s allocations by another half percent in the common case. • Matt started working on compile time performance and fixed some long standing leaks in the demand analyser. He used ghc-debug and -hi profiling in order to quickly find where the leaks were coming from. • Matt and Ben started work on a public Grafana dashboard to display the long-term trends in compiler benchmarks in an easy to understand way, taking advantage of the performance measurement infrastructure developed over the last 12 months. • Ben is investigating a significant regression in compiler performance of the aeson library (#19478) in GHC 9.0. • Ben reworked the derivation logic for Enum instances, significantly reducing the quantity of code required for such instances. ## Runtime performance • Andreas has fixed a few more obstacles which will allow GHC to turn on -fstrict-dicts by default (!2575) when compiling with -O2. • Andreas investigated a runtime regression with GHC 9.0.1 (#19474) which ultimately was caused by vector regressing under certain circumstances. This was promptly fixed by Ben upstream. • After a review by Simon Peyton Jones, Andreas is evaluating a possible simplification in the implementation of his tag inference analysis. • Ben investigated and fixed a runtime regression in bytestring caused by GHC 9.0’s less-aggressive simplification of unsafeCoerce (#19539). ## Compiler correctness • Andreas investigated and fixed (!4926) an issue where the behaviour of unsafeDupablePerformIO had changed in GHC 9.0 as a result of more aggressive optimisation of runRW#. In particular when a user wrote code like:  unsafePerformIO$ do
let x = f x
writeIORef ref x
return x

Before the fix it was possible for x to be evaluated before the write to the IORef occurred.

• Ben fixed a bug resulting in the linker failing to load libraries with long paths on Windows (#19541)

• Ben fixed a rather serious garbage-collector bug affecting code unloading in GHC 8.10.4 and 9.0.1 (#19417).

## Compiler functionality and language extensions

• Adam assisted with landing the NoFieldSelectors extension (!4743) and the record dot syntax extensions (!4532). Both of these features will be in GHC 9.2, although record dot syntax will be properly supported only for selection, not for record update. Adam is working on a GHC proposal to address outstanding design questions regarding record update.
• Ben finished and merged the BoxedRep implementation started by Andrew Martin, allowing true levity polymorphism in GHC 9.2 (#17526)
• Ben has been working with a contributor to finish stack snapshotting functionality, which will allow backtraces on all platforms supported by GHC.

## CI and infrastructure

• Andreas fixed a number of issues to allow windows builds to be validated locally by developers. In particular !4935, !5162, !5200 and !5040.
• Ben has been working with the Haskell Foundation to diversify the resource pool supporting GHC’s CI infrastructure.

# Nix shell, direnv and XDG_DATA_DIRS

A few weeks ago I noticed that I no longer could use haskell-hoogle-lookup-from-website in Emacs. After a bit of experimentation I found that the reason was that I couldn't use xdg-open in a Nix shell. Yesterday I finally got around to look into further.

It's caused by direnv overwriting XDG_DATA_DIRS rather than appending to it. Of course someone already reported a bug already.

The workaround is to use

use nix --keep XDG_DATA_DIRS


# Scrooge

A few months ago I was pondering what it might be like to be Donald Trump. Pretty fucking terrible, I imagine. What's it like, I wondered, to wake up every morning and know that every person in your life is only interested in what they can get from you, that your kids are eagerly waiting for you to die and get out of their way, and that there is nobody in the world who loves you? How do you get out of bed and face that bitter world? I don't know if I could do it. It doesn't get him off the hook for his terrible behavior, of course, but I do feel real pity for the man.

It got me to thinking about another pitiable rich guy, Ebeneezer Scrooge. Scrooge in the end is redeemed when he is brought face to face with the fact that his situation is similar to Trump's. Who cares that Scrooge has died? Certainly not his former business associates, who discuss whether they will attend his funeral:

“It's likely to be a very cheap funeral,” said the same speaker; “for, upon my life, I don't know of anybody to go to it. Suppose we make up a party, and volunteer.”

“I don't mind going if a lunch is provided," observed the gentleman with the excresence on his nose.

Later, the Spirit shows Scrooge the people who are selling the curtains stolen from his bed and the shirt stolen from his corpse, and Scrooge begs:

“If there is any person in the town who feels emotion caused by this man's death," said Scrooge, quite agonized, “show that person to me, Spirit, I beseech you!”

The Spirit complies, by finding a couple who had owed Scrooge money, and who will now, because he has died, have time to pay.

I can easily replace Scrooge with Trump in any of these scenes, right up to the end of chapter 4. But Scrooge in the end is redeemed. He did once love a woman, although she left him. Scrooge did have friends, long ago. He did have a sister who loved him, and though she is gone her son Fred still wants to welcome him back into the family. Did Donald Trump ever have any of those things?

# Haskell Foundation Board - Meeting Minutes - April 8, 2021

This blog post is a summary of the meeting minutes for the Haskell Foundation board meeting that took place on April 8, 2021. This is the first time I'm writing these up, and potentially the only time I'm putting them on this blog. So this post is going to be a bit weird; we'll start with some questions.

## Questions

Why are you writing these meeting minutes? As you'll see below, one of the decisions at the meeting was selection of additional officers for the board. I was selected as Secretary, which seems to put meeting minutes into my camp.

OK, but why are you publishing them on your personal blog? As you'll also see below, the new Haskell Foundation website is nearing completion, and in the future I hope these posts go there. But I wanted to kick things off with what I hope is close to the methodology going forward.

Isn't a blog post for each meeting excessive? Yes. As I mentioned in my transparency blog post, transparency includes a balance between too much and too little information. I intend to leverage announcement blog posts for things that deserve an announcement.

With those out of the way, let's get into the noteworthy information from this meeting.

## Opening up Slack

A lot of the work in the board so far has been in the "ways of working" direction. Basically: how does the foundation operate, what are the responsibilities of the board, what do officers do, how elections work, etc. Included in that, and in my opinion of significant interest to the community, is how we communicate. All of this information, and the decision making process around it, can be followed in the hf/meta repo on gitlab.haskell.org. I'm not going to try and summarize everything in those documents. Instead, the announcement here is around merge request !12. Specifically:

• HF is standardizing on Slack as its avenue of text chatting.
• We're going to start opening this up to everyone in the community interested in participating in discussions.
• On a personal note, I'm very excited about this. I think a central place for ecosystem discussions is vital.
• There's some hesitation/concern about moderation, off-topic discussions, and other points, which we'll need to work out over time.

I'd encourage anyone interested in joining in the conversation and staying up to date with topics to request an invite. We're already starting separate topic-specific channels to iterate on various technical topics.

Note that Slack is not a replacement for Discourse or other existing platforms. We'll still use Discourse for more official discussions and announcements. (This blog post is an example, the official discussion for it lives on Discourse.) Like many things, we'll likely be figuring out the details over time.

## Board officers

There are a total of six board officers, who have all now been selected:

• Chair: Richard Eisenberg
• Vice Chair: Tom Ellis
• Treasurer: Ryan Trinkle
• Vice Treasurer: José Pedro Magalhães
• Secretary: Michael Snoyman
• Vice Secretary: Théophile Hécate Choutri

There is a new Haskell Foundation website in the works, which should be ready to go live in the next week. It is fully open source and viewable now:

I'm hoping that, in the future, I'll be putting posts like this one on that site instead!

## What's an announcement? Where do minutes go? Where's transparency?!?

I'm going to try and reserve these kinds of announcement posts to topics that I think will have widespread interest. I may make mistakes in that judgement, I apologize in advance. My goal is that anyone who wants to stay up to speed on large decisions coming from the board will be able to without using up a lot of their bandwidth.

That said, every board meeting (and, for that matter, most or all working group meetings) take and keep meeting notes. We've been sharing these on Discourse, but in the future may simply publish them in the hf/minutes repo.

Until now, we've been creating a new Discourse thread and posting the meeting minutes for each meeting. I proposed reducing how often we do that, and instead leave the minutes in Google Drive for those interested. My hope is that a combination of "information is all available in Drive" with "important things get an announcement" will cover most people. But if people would like to see a new Discourse thread for each meeting, I'd like to hear about it. Please comment on this (and other topics) in the Discourse thread linked.

## Get involved!

We've been in planning and discussion mode for the past few months on the board. The various working groups are beginning to get some clarity around how they want to function, and are ready for more people to get involved. There's technical work to do, documentation, outreach, and much more. If you're excited to be a part of the efforts of the Haskell Foundation to improve the overall state of Haskell, now's a great time to get in and influence the direction. I strongly encourage everyone to check out the Slack, ask questions, interact on Discourse, and overall dive in!

# Typing polymorphic variants in Giml

In the last blog post we covered extensible records and how we infer their types in Giml.

In this blog post we'll take a closer look at another interesting feature and the dual of extensible records - polymorphic variants.

There are several approaches for typing polymorphic variants and they have different tradeoffs as well. Giml's system is very similar to OCaml's polymorphic variants system and also shares its limitations. If you are looking for different approaches you might want to look at this/this or this.

I'm also not sure polymorphic variants is the best we can do to have extensible sum types, as having recursive polymorphic variants is quite problematic. But there are still plenty of usecases where they could be used so it's worth including them in a language.

[Word of warning, this is a bit more complicated than previous posts and it might be best to look at this section as a cookbook. If you see something weird or not explained well, it might be a testement to that I'm a little fuzzy on the details myself]

## Polymorphic Variants

Polymorphic variants can be seen as the dual of extensible records.

Records are product types, they let us hold multiple values together in one structure at the same time, and by giving a label for each value we can refer to each part.

Variants are sum types, they let us define multiple values as alternatives to one another, and we label each alternative as well.

There are two operations we can do with variants, one is creating a tagged value, for example #Ok 1, the other is pattern matching on variants. For example:

withDefault def x =
case x of
| #Ok v -> v
| #Err _ -> def
end

The types for polymorphic variants are a bit more complicated. There are actually three different kinds of polymorphic variants: closed, lower-bounded and upper-bounded.

In a closed variant we list the different labels and the type of their payload, for example: [ Err : String, Ok : Int ] is a type that have two possible alternative , #Err <some-string> and #Ok <some-integer>. This type is something that can be generated by applying a polymorphic variant to a function that take polymorphic variants as input.

An upper-bounded polymorphic variant is open, it describes the maximum set of variants that could be represented by this type, and is represented syntactically with a little < right after the opening bracket (for example: [< Err : String, Ok : Int ]).

Why does this exist? This type can be used when we want to use a variant in a pattern matching expression, and we have a set of variants that we can handle. The pattern matching example above is one like that. In that pattern matching expression we could match any subset of [ Err : a, Ok : b ]. So we use an upper-bounded polymorphic variant to express that in types.

Lower-bounded polymorphic variant is open as well, it describes the minimum set of variants that could be represented by this type, and is represented will a little > right after the opening bracket (for example: [> Err : String, Ok : Int ]). It can unify with any superset of the specified alternatives.

Why does this exist? There are a couple of cases:

One, this type is used when we want to pass a variant to a function that does pattern matching on that variant, and has the ability to match on unspecified amount of variants, for example with a wildcard (_) or variable pattern. For example:

withDefault def x =
case x of
| #Ok 0 -> def
| #Ok v -> v
| _ -> def
end

In this case, even if we passed #Foo 182 this pattern matching should work, it will just match the wildcard pattern and return def.

But note that expressions cannot have multiple types (at least not in Giml), therefore we must specify in our case that the #Ok must have a specific type (Int) and we cannot pass #Ok "hello" instead. In the pattern matching above we could match on [ Ok : Int ] or any other type, and we write this like this: [> Ok : Int ].

Another case is the type of variant "literals" themselves! This could be considered the dual of record selection: when we specify a tagged value (say, #Ok 1), what we really want for it is to fit anywhere that expect at least [ Ok : Int ].

### Implementation

#### Type definition

So we add 3 new constructors to represent the three variant types:

1. TypeVariant [(Label, Type)] for closed variant
2. TypePolyVariantLB [(Constr, Type)] TypeVar for lower-bounded polymorphic variant
3. TypePolyVariantUB TypeVar [(Constr, Type)] for upper-bounded polymorphic variant

One thing that wasn't visible before when we talked about the syntactic representation of variants is the row type variable. Here, just like with records, we use the row type variable to encoded hidden knowledge about our types.

For lower-bounded polymorphic variants (from now on, LB), we use the row type variable to represent other variants (similar to how with records the row type variable was used to represent other fields).

For upper-bound polymorphic variants (from now on, UB) the row type variable has a different role. We use the row type variable to keep track of which UB we want to merge together (those with the same row type variable), how to convert a UB to LB (constrain the UB row type variable with the merge of the UB and LB fields + the LB row type variable), and when to treat two UBs as normal variants (those with a different row type variable).

We'll explore each scenario soon.

#### Elaboration and constraint generation

##### Creating variants

When we create a variant, we:

1. Elaborate the type of the payload
2. Generate a type variable
3. Annotate the type of the AST node as TypePolyVariantLB [(<label>, <payload-type>)] <typevar>

We will see how this unify with other variants.

##### Pattern matching

For matching an expression with one or more patterns of polymorphic variants, we elaborate the expression and constrain the type we got with a newly generated type variable which is going to be our hidden row type variable and representative of the type of the expression (and let's call it tv for now).

For each variant pattern matching we run into, we add the constraint:

Equality (TypeVar tv) (TypePolyVariantUB tv [(<label>, <payload-type>)])

By making all variant patterns equal to tv, we will eventually need to unify all the types that arise from the patterns, and by keeping the hidden row type variable tv the same for all of them, we annotate that these TypePolyVariantUBs should unify by merging the unique fields on each side and unifying the shared fields.

Another special case we have is a regular type variable pattern (capture pattern) and wildcard. In our system, a capture pattern should capture any type, including any variant. This case produces the opposite constraint:

Equality (TypeVar tv) (TypePolyVariantLB [] tv2)

Note that this special type, TypePolyVariantLB [] tv2, unifies with none variant types as if it a type variable, but with variant types it behave like other TypePolyVariantLB.

#### Constraint solving

We have 3 new types, TypeVariant, TypePolyVariantLB and TypePolyVariantUB. And in total we have 7 unique new cases to handle:

• Equality TypeVariant TypeVariant
• Equality TypeVariant TypePolyVariantLB
• Equality TypeVariant TypePolyVariantUB
• Equality TypePolyVariantLB TypePolyVariantLB
• Equality TypePolyVariantLB TypePolyVariantUB
• Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match)
• Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match)

The cases are symmetrical, so we'll look at one side.

##### Equality TypeVariant TypeVariant

For two TypeVariant, like with two TypeRec, we unify all the fields. If there are any that are only on one side, we fail.

##### Equality TypeVariant TypePolyVariantLB

This case is similar to the Equality TypeRec TypeRecExt case, all Fields in the LB side must appear and unify in the TypeVariant side, and the fields that are unique to the TypeVariant should unify with the row type variable of LB.

We don't want the LB have any extra fields that are not found in the regular variant side!

##### Equality TypeVariant TypePolyVariantUB

When an upper-bounded polymorphic variant meets a regular variant, we treat the UB as a regular variant as well. This is something that can makes seemingly well typed programs to be rejected by the typechecker (in a manner that is similar to not having let-polymorphism) but we do not represent the constraint of a subset/subtype in Giml, only equality, and the two types just aren't equal.

##### Equality TypePolyVariantLB TypePolyVariantLB

Two lower-bounded polymorphic variant should unify all the shared fields, and their row type variants should unify with a new LB that contains the non-shared fields (and a new row type variable).

##### Equality TypePolyVariantLB TypePolyVariantUB

LB means any superset of the mentioned fields, and UB means any subset of the mentioned fields. Meaning, we don't want to have a field in the LB said that doesn't exist in the UB side. This is basically the same case as Equality TypeVariant TypePolyVariantLB.

##### Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables match)

When the two row type variables match, this means that the two types represent alternative pattern matches, so we make sure their fields unify and merge them: we generate a constraint that unifies the merged UB (with the same row type variable) with the row type variable. This is so we can establish the most up-to-date type of the row type variable in the substitution.

Equality (TypePolyVariantUB tv <merged-fields>) (TypeVar tv)

This way we add to the list of alternative variants, add keep track of the row type variable is the substitution and subsequent constraints.

##### Equality TypePolyVariantUB TypePolyVariantUB (where the row type variables don't match)

We treat these two as unrelated variants that represent the same subest of fields, so the fields should match just like regular variants (Equality TypeVariant TypeVariant).

#### Instantiation

Instantiation is straightforward, just like all other cases: we instantiate the row type variables here as well.

#### Substitution

For substitution we need to handle a few cases:

• For UB, if the row type variable maps to another type variable, we just switch the type variable. Otherwise we merge the variant fields we have with the type.
• For LB with no variant fields (TypePolyVariantLB [] tv) we return the value mapped in the substitution (remember that this case is also used for wildcards and captures in pattern matching and not just variants).
• For LB with variant fields, we try to merge the variant fields with the type.

Merging variants happens the same as with records, but we don't have to worry about duplicate variants as the constraint solving algorithm makes sure they have the same type.

Note that this is where a UB can flip and become LB (if it's merged with an LB).

### Example

As always, let's finish of with an example and infer the types of the following program:

withDefault def x =
case x of
| #Some v -> v
| #Nil _ -> def
end

one = withDefault 0 (#Some 1)

#### Elaboration and constraints generation

    +---- targ3 -> t4 -> t5
|
|        +---- targ3
|        |
|        |  +----- t4
|        |  |
withDefault def x =
+------------ t4
|
case x of        +---- t7
|
| #Some v -> v

| #Nil _ -> def
|
|
+---- targ3
end

-- Constraints:

[ Equality t4 t6
, Equality t5 t7
, Equality t5 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t4 -> t5)
, Equality top0 (targ3 -> t4 -> t5)
]
 +-- t16
|         +--- top0_i11
|         |
|         |      +---- Int
|         |      |
|         |      |    +---- t14 -> [> Some : t14 | (t13)]
|         |      |    |
|         |      |    |   +--- Int
|         |      |    |   |
one = withDefault 0 (#Some 1)
-------
|
+--- t15

-- Constraints:

[ Equality t12 (t15 -> t16)
, Equality top0_i11 (Int -> t12)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 top0
]

#### Constraint solving

##### First group
[ Equality t4 t6
, Equality t5 t7
, Equality t5 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t4 -> t5)
, Equality top0 (targ3 -> t4 -> t5)
]

-- Constraints:
[ Equality t4 t6
, Equality t5 t7
, Equality t5 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t4 -> t5)
, Equality top0 (targ3 -> t4 -> t5)
]
-- Substitution:
[]

1. Equality t4 t6
-- => t4 := t6

-- Constraints:
[ Equality t5 t7
, Equality t5 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t6 -> t5)
, Equality top0 (targ3 -> t6 -> t5)
]
-- Substitution:
[ t4 := t6
]

2. Equality t5 t7
-- => t5 := t7

-- Constraints:
[ Equality t7 t7
, Equality t7 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t6 -> t7)
, Equality top0 (targ3 -> t6 -> t7)
]
-- Substitution:
[ t4 := t6
, t5 := t7
]

3. Equality t7 t7
-- => Nothing to do

-- Constraints:
[ Equality t7 targ3
, Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : t7 | (t6)]
, Equality t7 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t6 -> t7)
, Equality top0 (targ3 -> t6 -> t7)
]
-- Substitution:
unchanged

4. Equality t7 targ3
-- => t7 := targ3

-- Constraints:
[ Equality t6 [< Nil : t9 | (t6)]
, Equality t6 [< Some : targ3 | (t6)]
, Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> t6 -> targ3)
, Equality top0 (targ3 -> t6 -> targ3)
]
-- Substitution:
[ t4 := t6
, t5 := targ3
, t7 := targ3
]

5. Equality t6 [< Nil : t9 | (t6)]
-- => t6 := [< Nil : t9 | (t6)] -- note that we don't do occurs check here

-- Constraints:
[ Equality [< Nil : t9 | (t6)] [< Nil : t9 | (t6)]
, Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)]
, Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
, Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
]
-- Substitution:
[ t4 := [< Nil : t9 | (t6)]
, t5 := targ3
, t7 := targ3
, t6 := [< Nil : t9 | (t6)]
]

6. Equality [< Nil : t9 | (t6)] [< Nil : t9 | (t6)]
-- => The two sides are the same, nothing to do.

-- Constraints:
[ Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)]
, Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
, Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
]
-- Substitution:
unchanged

7. Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)]
-- => [ Equality t6 [< Nil : t9 | Some : targ3 | (t6)], Equality t9 t9 ]

-- Constraints:
[ Equality t6 [< Nil : t9 | Some : targ3 | (t6)]
, Equality t9 t9
[ Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)]
, Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
, Equality top0 (targ3 -> [< Nil : t9 | (t6)] -> targ3)
]
-- Substitution:
unchanged

8. Equality [< Nil : t9 | (t6)] [< Nil : t9 | Some : targ3 | (t6)]
-- => t6 := [< Nil : t9 | Some : targ3 | (t6)]

-- Constraints:
[ Equality t9 t9
, Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3)
, Equality top0 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3)
]
-- Substitution:
[ t4 := [< Nil : t9 | Some : targ3 | (t6)]
, t5 := targ3
, t7 := targ3
, t6 := [< Nil : t9 | Some : targ3 | (t6)]
]

9. Equality t9 t9
-- => Nothing to do

-- Constraints:
[ Equality targ3 t8
, Equality t9 t10
, Equality tfun2 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3)
, Equality top0 (targ3 -> [< Nil : t9 | Some : targ3 | (t6)] -> targ3)
]
-- Substitution:
[ t4 := [< Nil : t9 | Some : targ3 | (t6)]
, t5 := targ3
, t7 := targ3
, t6 := [< Nil : t9 | Some : targ3 | (t6)]
]

10. Equality targ3 t8
-- => targ3 := t8

-- Constraints:
[ Equality t9 t10
, Equality tfun2 (t8 -> [< Nil : t9 | Some : t8 | (t6)] -> t8)
, Equality top0 (t8 -> [< Nil : t9 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t9 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t9 | Some : t8 | (t6)]
, targ3 := t8
]

11. Equality t9 t10
-- => t9 := t10

-- Constraints:
[ Equality tfun2 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
, Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
]

11. Equality tfun2 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
-- => tfun2 := (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)

-- Constraints:
[ Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
]

12. Equality top0 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
-- => top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8

-- Constraints: []
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
]

#### Second group

[ Equality t12 (t15 -> t16)
, Equality top0_i11 (Int -> t12)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 top0
]

and after applying the substitution:

[ Equality t12 (t15 -> t16)
, Equality top0_i11 (Int -> t12)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Constraints:
[ Equality t12 (t15 -> t16)
, Equality top0_i11 (Int -> t12)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
]

1. Equality t12 (t15 -> t16)
-- => t12 := t15 -> t16

-- Constraints:
[ Equality top0_i11 (Int -> t15 -> t16)
, Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf top0_i11 (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
]

2. Equality top0_i11 (Int -> t15 -> t16)
-- => top0_i11 := Int -> t15 -> t16

-- Constraints:
[ Equality top1 t16
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
]

3. Equality top1 t16
-- => top1 := t16

-- Constraints:
[ Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
, top1 := t16
]

4. Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
-- => [ Equality t14 Int, Equality [> Some : t14 | (t13)] t15 ]

-- Constraints:
[ Equality t14 Int
, Equality [> Some : t14 | (t13)] t15
, Equality (t14 -> [> Some : t14 | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
unchanged

5. Equality t14 Int
-- => t14 := Int

-- Constraints:
[ Equality [> Some : Int | (t13)] t15
, Equality (Int -> [> Some : Int | (t13)]) (Int -> t15)
, InstanceOf (Int -> t15 -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := t15 -> t16
, top0_i11 := Int -> t15 -> t16
, top1 := t16
, t14 := Int
]

6. Equality [> Some : Int | (t13)] t15
-- => t15 := [> Some : Int | (t13)]

-- Constraints:
[ Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)])
, InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [> Some : Int | (t13)] -> t16
, top0_i11 := Int -> [> Some : Int | (t13)] -> t16
, top1 := t16
, t14 := Int
, t15 := [> Some : Int | (t13)]
]

7. Equality (Int -> [> Some : Int | (t13)]) (Int -> [> Some : Int | (t13)])
-- => [ Equality Int Int, Equality [> Some : Int | (t13)] [> Some : Int | (t13)] ]

-- Constraints:
[ Equality Int Int
, Equality [> Some : Int | (t13)] [> Some : Int | (t13)]
, InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
]
-- Substitution:
unchanged

8. The two sides are equals so we skip them
9. The two sides are equals so we skip them

10. InstanceOf (Int -> [> Some : Int | (t13)] -> t16) (t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8)
-- => [ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17) ] -- (instantiation)

-- Constraints:
[ Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17)
]
-- Substitution:
unchanged

11. Equality (Int -> [> Some : Int | (t13)] -> t16) (t17 -> [< Nil : t18 | Some : t17 | (t19)] -> t17)
-- => [ Equality Int t17, Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)], Equality t16 t17 ]

-- Constraints:
[ Equality Int t17
, Equality [> Some : Int | (t13)] [< Nil : t18 | Some : t17 | (t19)]
, Equality t16 t17
]
-- Substitution:
unchanged

11. Equality Int t17
-- => t17 := Int

-- Constraints:
[ Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)]
, Equality t16 Int
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [> Some : Int | (t13)] -> t16
, top0_i11 := Int -> [> Some : Int | (t13)] -> t16
, top1 := t16
, t14 := Int
, t15 := [> Some : Int | (t13)]
, t17 := Int
]

12. Equality [> Some : Int | (t13)] [< Nil : t18 | Some : Int | (t19)]
-- => Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ] -- LB ~ UB are solved like LB and normal variant

-- Constraints:
[ Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ]
, Equality t16 Int
]
-- Substitution:
unchanged

13. Equality [> Some : Int | (t13)] [ Nil : t18 | Some : Int ]
-- => [ Equality Int Int, Equality [ Nil : t18 ] t13 ]

-- Constraints:
[ Equality Int Int
, Equality [ Nil : t18 ] t13
, Equality t16 Int
]
-- Substitution:
unchanged

14. Two sides are equal, skip

15. Equality [ Nil : t18 ] t13
-- => t13 := [ Nil : t18 ]

-- Constraints:
[ Equality t16 Int
]
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [ Nil : t18 | Some : Int ] -> t16             -- * substitution of LB with normal variant
, top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> t16 -- * here too
, top1 := t16
, t14 := Int
, t15 := [ Nil : t18 | Some : Int ]                    -- * and here
, t17 := Int
]

16. Equality t16 Int
-- => t16 := Int

-- Constraints: []
-- Substitution:
[ t4 := [< Nil : t10 | Some : t8 | (t6)]
, t5 := t8
, t7 := t8
, t6 := [< Nil : t10 | Some : t8 | (t6)]
, targ3 := t8
, t9 := t10
, tfun2 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, top0 := t8 -> [< Nil : t10 | Some : t8 | (t6)] -> t8
, t12 := [ Nil : t18 | Some : Int ] -> Int             -- * substitution of LB with normal variant
, top0_i11 := Int -> [ Nil : t18 | Some : Int ] -> Int -- * here too
, top1 := Int
, t14 := Int
, t15 := [ Nil : t18 | Some : Int ]                    -- * and here
, t17 := Int
, t16 := Int
]

... and done.

### Result

Let's substitute the type variables in our program and hide the row type variables:

    +---- t8 -> [< Nil : t10 | Some : t8 ] -> t8
|
|        +---- t8
|        |
|        |  +----- [< Nil : t10 | Some : t8 ]
|        |  |
withDefault def x =
+------------ [< Nil : t10 | Some : t8 ]
|
case x of        +---- t8
|
| #Some v -> v

| #Nil _ -> def
|
|
+---- t8
end
 +-- Int
|         +--- Int -> [ Nil : t18 | Some : Int ] -> Int
|         |
|         |      +---- Int
|         |      |
|         |      |    +---- Int -> [ Nil : t18 | Some : Int ]
|         |      |    |
|         |      |    |   +--- Int
|         |      |    |   |
one = withDefault 0 (#Some 1)
-------
|
+--- [ Nil : t18 | Some : Int ]

Phew, looks correct!

### Alternative approach: simulation in userland

Another cool thing I'd like to note is that polymorphic variants could potentially be simulated in userland if you have extensible records in the language (and this is what purescript-variant) does for example).

One approach I tweeted about that also helped me reach the polymorphic variant implementation for Giml is to:

1. Make variants functions - functions that take a record of labels to function that take the variant's payload
2. Make patterns are these records from labels to functions on the payload
3. Make pattern matching just applying variants with patterns

Though in this approach we can't add a "default handler" label.

## Summary

Polymorphic variants are a bit complicated and implementing them is a bit tricky, and they also have several limitations. But I believe that for some use cases they can really make a difference and it's worth having them in the language.

If you're interesting to see a video of me live coding polymorphic variants in Giml, check out Part 17 of my live streaming Giml development series.

# Soup-guzzling pie-munchers

The ten storytellers in The Decameron aren't all well-drawn or easy to tell apart. In the introduction of my favorite edition, the editor, Cormac Ó Cuilleanáin, says:

Early in the book we are given hints that we are going to get to know these ten frame characters…. Among the Decameron storytellers, for instance, Pampinea emerges as being bossy, while Dioneo has a filthy mind. But little further character development takes place.

I agree, mostly. I can see Dioneo more clearly than Ó Cuilleanáin suggests. Dioneo reminds me of Roberto Benigni's Roman filthy-minded Roman taxi driver in Night on Earth. I also get a picture of Bocaccio's character Filostrato, who is a whiny emo poet boy who complains that he woman he was simping for got tired of him and dumped him for someone else:

To be humble and obedient to her and to follow all her whims as closely as I could, was all of no avail to me, and I was soon abandoned for another. Thus I go from bad to worse, and believe I shall until I die.… The person who gave me the nickname of Filostrato [ “victim of love” ] knew what she was doing.

When it's Filostrato's turn to choose the theme for the day's stories, he makes the others tell stories of ill-starred love with unhappy endings. They comply, but are relieved when it is over. (Dioneo, who is excused from the required themes, tells instead a farcical story of a woman who hides her secret lover in a chest after he unwittingly drinks powerful sedative.)

Ah, but Emilia. None of the characters in the Decameron is impressed with the manners or morals of priests. But Emilia positively despises them. Her story on the third day is a good example. The protagonist, Tedaldo, is meeting his long-lost mistress Ermellina; she broke off the affair with him seven years ago on the advice of a friar who advised that she ought to remain faithful to her husband. Tedaldo is disguised as a friar himself, and argues that she should resume the affair. He begins by observing that modern friars can not always be trusted:

Time was when the friars were most holy and worthy men, but those who today take the name and claim the reputation of friars have nothing of the friar but the costume. No, not even that,…

Modern friars, narrates Emilia, "strut about like peacocks" showing off their fine clothes. She goes on from there, complaining about friars' vanity, and greed, and lust, and hypocrisy, getting more and more worked up until you can imagine her frothing at the mouth. This goes on for about fifteen hundred words before she gets back to Tedaldo and Ermellina, just at the same time that I get around to what I actually meant to write about in this article: Emilia has Tedaldo belittle the specific friar who was the original cause of his troubles,

who must without a doubt have been some soup-guzzling pie-muncher…

This was so delightful that I had to write a whole blog post just to show it to you. I look forward to calling other people soup-guzzling pie-munchers in the coming months.

But, as with the earlier article about the two-bit huckster I had to look up the original Italian to see what it really said. And, as with the huckster, the answer was, this was pretty much what Bocaccio had originally written, which was:

il qual per certo doveva esser alcun brodaiuolo manicator di torte

• Brodaiuolo is akin to “broth”, and it has that disparaging diminutive “-uolo” suffix that we saw before in mercantuolo.

• A manicator is a gobbler; it's akin to “munch”, “manger”, and “mandible”, to modern Italian mangia and related French manger. A manicator di torte is literally a gobbler of pies.

Delightful! I love Bocaccio.

While I was researching this article I ran into some other English translations of the phrase. The translation at Brown University's Decameron Web is by J.M. Rigg:

some broth-guzzling, pastry-gorging knave without a doubt

which I award full marks. The translation of John Payne has

must for certain have been some broth-swilling, pastry-gorger

and two revised versions of Payne, by Singleton and Ó Cuilleanáin, translate it similarly.

But the translation of Richard Aldington only says:

who must certainly have been some fat-witted glutton.

which I find disappointing.

I often wonder why translators opt to water down their translations like this. Why discard the vivid and specific soup and pie in favor of the abstract "fat-witted glutton"? What could possibly be the justification?

Translators have a tough job. A mediocre translator will capture only the surface meaning and miss the subtle allusions, the wordplay, the connotations. But here, Aldington hasn't even captured the surface meaning! How hard is it to see torte and include pie in your translation somewhere? I can't believe that his omitting it was pure carelessness, only that Aldington thought that he was somehow improving on the original. But how, I can't imagine.

Well, I can imagine a little. Translations can also be too literal. Let's consider the offensive Spanish epithet pendejo. Literally, this is a pubic hair. But to translate it in English as "pubic hair" would be a mistake, since English doesn't use that term in the same way. A better English translation is "asshole". This is anatomically illogical, but linguistically correct, because the metaphor in both languages has worn thin. When an anglophone hears someone called an “asshole” they don't normally imagine a literal anus, and I think similarly Spanish-speakers don't picture a literal pubic hair for pendejo. Brodaiuolo could be similar. Would a 14th-century Florentine, hearing brodaiuolo, picture a generic glutton, or would they imagine someone literally holding a soup bowl up to their face? We probably don't know. But I'm inclined to think that “soup-guzzler” is not too rich, because by this point in Emilia's rant we can almost see the little flecks of spittle flying out of here mouth.

I'm offended by Aldington's omission of pie-munching.

[ Addendum 20210414: More translations of brodaiuolo. ]

# Vote!

The UK holds elections on 6 May. From the gov.uk site:

Register by 11:59pm on 19 April to vote in the following elections on 6 May:

• local government elections and referendums in England
• Police and Crime Commissioner elections in England and Wales
• Scottish Parliament elections
• Senedd (Welsh Parliament) elections
• Mayor of London and London Assembly elections

Register Online. It usually takes about 5 minutes. Start Now.

Registration is easy: a rare example of a well-designed web site.

You can also support your party with a donation. Mine is the Edinburgh Green Party; feel free to add yours via the comments. Current polling shows Green on track to win 10 seats in the regional lists, and Alba on track to get no seats.

# Typing extensible records in Giml

In the last blog post we covered the general structure and algorithms of Giml's type inference engine.

In this blog post we'll take a closer look at extensible records in Giml and how to infer their types.

## Records

A record is a collection of values, each associated with a name (also called a label). For example, { name = "Giml", age = 0 }. The type of records is very similar to their structure for example the type of the record above is: { age : Int, name : String }.

Records are used to build compound data aggregating multiple values into one value that contains all of the information. The different values (also called fields) inside the record can be accessed (or selected) by their labels, like this: <record>.<label>.

Records can also be extended with additional fields:

let
lang = { name = "Giml", age = 0 }
in
{ website = "https://giml-lang.org" | lang }

The result of the expression is a new record that has all of the fields lang has and has one extra field website as well. Note that in Giml, we can't have multiple values with the same label. When we add a new field with an existing label, that previous one is replaced with the new value.

There are other reasonable behaviours here such as disallowing such operation or having a scope of labels, but this is the behaviour I chose for Giml :)

Records are first class and can be used anywhere an Int can be expected. They can also be pattern matched:

case { name = "Giml" } of
| { name = "Giml" } ->
"Yeah I heard about it, funny name."

| { name = "Haskell" } ->

| other ->
concat (concat "I'm sure " other.name) " is pretty good too!"
end

So far we've seen record literals and three operations on records: field selection, extension and pattern matching.

When we approach to type inference of features we need to take into consideration a few things:

1. What the types of the feature look like
2. What to do for each operation (elaboration and constraints generation)
3. How to unify the types (constraint solving)
4. How to instantiate
5. How to substitute

We are going to first try a naive approach: Just represent the types of record as TypeRec [(Label, Type)] and see how it's just not enough to make this feature useful without type annotations.

Let see how we elaborate each operation of Record literals, field selection, record extension, pattern matching:

Record literals: simple, we have an expression that looks like this: { label1 = expr1, label2 = expr2, ... }, so we elaborate each expression and annotate the node with the type TypeRec [(label1, t1), ...].

Record selection: Also simple, we have something that looks like expr.label, so we elaborate the type of the expression and constrain it as equals to TypeRec [(label, t)].

But hold up, if we try to work with this scheme we'll see that it is not good enough. Trying to typecheck this expression { a = 1, b = 1 }.a will create this constraint: Equality { a : Int, b : Int } { a : t }, and while they both have a field a, one of them is missing a field! So they are definitely not equal.

So we need to go at this differently and there are several approaches, one is to create a new type of constraint that describe the relationship of "subtyping", another is to describe the types differently.

For Giml I chose the second approach, we add extra information to the types to encode that "this record type may have more fields". This approach is called row polymorphism.

### Row Polymorphism

Row polymorphism provides us with the ability to represent extra information in an additional type variable (called a row type variable) in the type.

So in addition to the variant we created TypeRec [(Label, Type)], we add another one: TypeRecExt [(Label, Type)] TypeVar. Syntactically the type will look like this:

{ <field1> : <type1>, ... | <extension-type-variable> }

What this extra type variable does is represent additional fields that may be part of the type that we don't know yet.

Let's use this for record field selection, instead of saying "this type is equals to a record with one field of type t", we can now say "this type is equals to a record that has this field with this type, and other fields represented by this type variable".

With this scheme we can move forward. Let's try again:

### Elaboration and constraint generation

same as before

#### Record selection

We have something that looks like expr.label, so we:

1. Generate a type variable representing the return type (which is the field type), let's call this t
2. Elaborate the type of the expression
3. Generate a type variable representing the rest of the fields, let's call this ext
4. Constrain the type of the expression as equals to { label : t | ext }.

Now when we try to typecheck the expression { a = 1, b = 1 }.a, which generates this constraint: Equality { a : Int, b : Int } { a : t | ext }, we can match the fields and types that we know (a : Int with a : t), and also match the fields that we don't ({ b : Int } with ext) - more on that later.

And when we find out what the real type of ext is, we'll substitute it (we'll talk about that later as well).

#### Record extension

We have something that looks like { <field1> = <expr1>, ... | <expression> }, so we:

1. Elaborate the type of each field
2. Elaborate the type of the expression
3. Generate a type variable for the expression which we will call ext
4. Constrain the type of the expression to be equal to { | ext }
5. Annotate the node with the type { <field1> : <type1>, ... | ext }

#### Pattern matching

Pattern matching is similar to record literals case, but instead of matching the expression in the case with a rigid TypeRec, we generate a type variable and match with TypeRecExt. That way we can select less fields in the pattern than might be available in the expression type.

### Constraint solving

We need to handle 3 new cases - equality between:

1. Two TypeRec
2. A TypeRec and a TypeRecExt
3. Two TypeRecExt

#### Two TypeRec

The two records should have the same labels and for each label we generate a new constraint of Equality between the types for the label on each side.

So for example { a : Int, b : String } and { a : t1, b : t2 }, we generate the two constraint: Equality Int t1 and Equality String t2.

If one TypeRec has a field that the other do not have, we throw an error of a missing field.

#### A TypeRec and a TypeRecExt

Each field in the TypeRecExt should match with the matching field in TypeRec. If there's a field in TypeRecExt that does not exist in TypeRec we throw an error.

The other side is different, all the missing field in TypeRecExt that exist in TypeRec will be matched with the row type variable of TypeRecExt. Remember - we said that with row polymorphism we use a type variable as a representative of fields we don't know of yet! So we treat the type variable in TypeRecExt as if it is a TypeVar that matches the fields that exist in the TypeRec but not in the TypeRecExt.

So for example in Equality { a : Int, b : String } { a : t1 | ext } we generate 2 new constraints: Equality Int t1 and Equality { b : String } ext.

#### Two TypeRecExt

This scenario is slightly trickier, of course - the specified fields in both sides should be matched just like in previous cases, but what do we do with the missing fields from each side?

Let's check a concrete example, we'll make it simpler by not including matching fields.

Equality { a : Int | t1 } { b : Int | t2 }

What this constraint mean, is that t2 is equals to { a : Int | t1 } without { b : Int }, and t1 is equals to { b : Int | t2 } without { a : Int }.

Since we don't have any notion of subtracting a type from a type (some other type system do support this), we can try and represent this differently, we could represent this subset as a new type variable, and translate the Equality above to these two constraints:

Equality { a : Int | t' } t2
Equality { b : Int | t' } t1

Now t' represents t2 without { a : Int } and t' also represents t1 without { b : Int }.

So more generally, when we try to unify two TypeRecExt, we match the matching fields and also create a new row type variable to represents all of the unspecified fields in both types, and match each row type variable with the fields specified on the other side.

Let's describe this one more time in psuedo code, adding new constraints:

Equality { <fields-only-found-on-the-left-side>  | new-type-var } extRight
Equality { <fields-only-found-on-the-right-side> | new-type-var } extLeft
<matched-fields>

### Instantiation

Instantiation occurs as usual, the row type variable in TypeRecExt should be instantiated as well.

### Substitution

What do we do if the row type variable in TypeRecExt appears in the substitution?

One, if the row type variable is mapped to a different type variable, we just replace it.

Two, if it's mapped to a TypeRec, we merge it with the existing fields and return a TypeRec with all fields, but it's important to note that some fields in the TypeRec from the substitution may be the same as ones from our TypeRecExt.

There are a few approaches one could take here, one is to keep a scope of labels, another is to try and unify the types of the field.

In Giml we take the type of the left-most instance, discarding the right one. So semantically this expression is legal: { a = "Hi!" | { a = 1} } and its type is { a : String }.

Three, if the row type variable is mapped to a TypeRecExt, we do the same thing as in Two, but return a TypeRecExt instad of a TypeRec with the row type variable from the mapped value as our new row type variable.

This is basically what we need to do to infer the type of extensible records.

## Example

Let's see how to infer the type of a simple record field access example. This process is a bit much, but I've included it for those who want a live demonstration of how this works. Feel free to skip it!

giml = { name = "Giml", age = 0 }

getName record = record.name

gimlName = getName giml

#### Elaboration

Through elaboration, we end up with the following AST and constraints:

-- Ast:

+-- { age : Int, name : String }
|               |
|               |
|     __________|_______________
giml = { name = "Giml", age = 0 }

-- Constraints:

[ Equality top0 {age : Int, name : String}
]
-- Ast:

+--- targ4 -> t5
|
|               +--- targ4
|               |
|             __|___
getName record = record.name
-----------
|
+------- t5

-- Constraints:

[ Equality targ4 {name : t5 | t6}
, Equality tfun3 (targ4 -> t5)
, Equality top1 (targ4 -> t5)
]
-- Ast:

+--- t9
|
|
gimlName = getName giml
|     |
|     |
|     +---- top0_i8
|
+---- top1_i7

-- Constraints:

[ Equality top1_i7 (top0_i8 -> t9)
, Equality top2 t9
, InstanceOf top0_i8 top0
, InstanceOf top1_i7 top1
]

The first stage of inference is to group dependencies and order them topologically. Since we don't have any mutual dependencies, each definition is standalone and the order of dependencies is kept (the last definition uses the previous two).

Also note that the names of type variables do not matter to the algorithm. They are a bit different so I have better time knowing where they are introduced.

#### Constraint solving

Now we go over group by group and solve the constraints:

##### First group
-- Constraints: [Equality top0 {age : Int, name : String}]
-- Substitution: []

1. Equality top0 {age : Int, name : String}
-- => add top0 := {age : Int, name : String} to the substitution

-- Constraints: []
-- Substitution: [top0 := {age : Int, name : String}]

We carry the substitution to the next group.

##### Second group
[ Equality targ4 {name : t5 | t6}
, Equality tfun3 (targ4 -> t5)
, Equality top1 (targ4 -> t5)
]

We need to apply the substitution to the constraints we are about to handle, but this doesn't change anything in this instance.

-- Constraints:
[Equality targ4 {name : t5 | t6}, Equality tfun3 (targ4 -> t5), Equality top1 (targ4 -> t5)]
-- Substitution:
[top0 := {age : Int, name : String}]

1. Equality targ4 {name : t5 | t6}
-- => add targ4 := {name : t5 | t6} to the substitution and apply

-- Constraints:
[Equality tfun3 ({name : t5 | t6} -> t5), Equality top1 ({name : t5 | t6} -> t5)]
-- Substitution:
[top0 := {age : Int, name : String}, targ4 := {name : t5 | t6}]

2. Equality tfun3 ({name : t5 | t6} -> t5)
-- => add tfun3 := ({name : t5 | t6} -> t5) to the substitution and apply

-- Constraints:
[Equality top1 ({name : t5 | t6} -> t5)]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
]

3. Equality top1 ({name : t5 | t6} -> t5)
-- => add top1 := ({name : t5 | t6} -> t5) and apply

-- Constraints:
[]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
]
##### Third group
[ Equality top1_i7 (top0_i8 -> t9)
, Equality top2 t9
, InstanceOf top0_i8 top0
, InstanceOf top1_i7 top1
]

And after applying the substitution:

[ Equality top1_i7 (top0_i8 -> t9)
, Equality top2 t9
, InstanceOf top0_i8 {age : Int, name : String}
, InstanceOf top1_i7 ({name : t5 | t6} -> t5)
]
-- Constraints:
[ Equality top1_i7 (top0_i8 -> t9)
, Equality top2 t9
, InstanceOf top0_i8 {age : Int, name : String}
, InstanceOf top1_i7 ({name : t5 | t6} -> t5)
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
]

1. Equality top1_i7 (top0_i8 -> t9)
-- => add top1_i7 := (top0_i8 -> t9)

-- Constraints:
[ Equality top2 t9
, InstanceOf top0_i8 {age : Int, name : String}
, InstanceOf (top0_i8 -> t9) ({name : t5 | t6} -> t5)
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := (top0_i8 -> t9)
]

2. Equality top2 t9
-- add top2 := t9 and apply

-- Constraints:
[ InstanceOf top0_i8 {age : Int, name : String}
, InstanceOf (top0_i8 -> t9) ({name : t5 | t6} -> t5)
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := (top0_i8 -> t9)
, top2 := t9
]

3. InstanceOf top0_i8 {age : Int, name : String}
-- => for InstanceOf constraint we instantiate the second type
--    and add a new Equality constraint
--    in this case instantiation does nothing because
--    there are no type variables on the second type
--    so we add Equality top0_i8 {age : Int, name : String}
--    which immediately turns into
--    so we add top0_i8 := {age : Int, name : String}

-- Constraints:
[ InstanceOf ({age : Int, name : String} -> t9) ({name : t5 | t6} -> t5)
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := ({age : Int, name : String} -> t9)
, top2 := t9
, top0_i8 := {age : Int, name : String}
]

4. InstanceOf ({age : Int, name : String} -> t9) ({name : t5 | t6} -> t5)
-- => new constraint:
-- Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10)

-- Constraints:
[ Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10)
]
-- Substitution:
unchanged

5. Equality ({age : Int, name : String} -> t9) ({name : ti10 | ti11} -> ti10)
-- => add constraints Equality {age : Int, name : String} {name : ti10 | ti11}
--    and Equality t9 ti10

-- Constraints:
[ Equality {age : Int, name : String} {name : ti10 | ti11}
, Equality t9 ti10
]
-- Substitution:
unchanged

6. Equality {age : Int, name : String} {name : ti10 | ti11}
-- => This is a TypeRec & TypeRecExt scenario. we match the shared fields
--    and match the row type variable with the missing fields from the typeRec
--    adding Equality String ti10 and Equality ti11 { age : Int }

-- Constraints:
[ Equality String ti10
, Equality ti11 { age : Int }
, Equality t9 ti10
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := ({age : Int, name : String} -> t9)
, top2 := t9
, top0_i8 := {age : Int, name : String}
]

7. Equality String ti10
-- => ti10 := String

-- Constraints:
[ Equality ti11 { age : Int }
, Equality t9 String
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := ({age : Int, name : String} -> t9)
, top2 := t9
, top0_i8 := {age : Int, name : String}
, ti10 := String
]

7. Equality ti11 { age : Int }
-- => ti11 := { age : Int }

-- Constraints:
[ Equality t9 String
]
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := ({age : Int, name : String} -> t9)
, top2 := t9
, top0_i8 := {age : Int, name : String}
, ti10 := String
, ti11 := { age : Int }
]

8. Equality t9 String
-- =>  t9 := String

-- Constraints: []
-- Substitution:
[ top0 := {age : Int, name : String}
, targ4 := {name : t5 | t6}
, tfun3 := ({name : t5 | t6} -> t5)
, top1 := ({name : t5 | t6} -> t5)
, top1_i7 := ({age : Int, name : String} -> String)
, top2 := String
, top0_i8 := {age : Int, name : String}
, ti10 := String
, ti11 := { age : Int }
, t9 := String
]

Phew, we made it! Let's apply the substitution back to our AST and see the result!

#### Result

 +-- { age : Int, name : String }
|               |
|               |
|     __________|_______________
giml = { name = "Giml", age = 0 }

   +--- {name : t5 | t6} -> t5
|
|               +--- {name : t5 | t6}
|               |
|             __|___
getName record = record.name
-----------
|
+------- t5
   +--- String
|
|
gimlName = getName giml
|     |
|     |
|     +---- {age : Int, name : String}
|
+---- {age : Int, name : String} -> String

This look correct! (and yey me who did the constraint solving by hand)

### Summary

I found implementing extensible records using row polymorphism to be relatively straightforward. I hope this article makes it more approachable as well.

In the next blog post we'll discuss polymorphic variants and how to infer their types.

# [hupfginj] Nim

Nim, the game of heaps of coins, is a solved game.  Here is how to compute the nim sum (Grundy value) of a nim position in Haskell.

nimsum :: [Integer] -> Integer;
nimsum = Data.List.foldl' Data.Bits.xor 0;

(It's might be surprising that Integer is an instance of Bits, but we will just run with it.)

If the nim sum of a position is zero, then the position is lost: any move loses (and makes the nim sum nonzero).  If the nim sum is nonzero, the position can be won: choose a move that reduces the nim sum to zero.  (What is the computational complexity of finding such a move?  Typically, how many winning moves are there?  Below, we will give some positions with more than one winning move.)

Important details omitted: if the nim sum is nonzero, there always exists a move that makes it zero.  If the nim sum is zero, there are no moves that keep it zero.

Call a nim position obviously lost if the following mirroring strategy is applicable.  The heaps can be organized into pairs of equal-sized heaps.  Whatever the first player does to one heap of a pair, the second player can do to the corresponding other heap of the pair and thereby eventually win.

All 2-heap nim positions that are lost are obviously lost.  In other words, the lost 2-heap nim positions are exactly those which have 2 equal sized heaps.

Call a position non-obviously lost if it is lost but not obviously lost.

The simplest non-obviously lost position is [1,2,3].

The positions [1,3,3] [2,2,3] [2,3,3] are the three simplest winning positions that have more than one winning move.  All can be reduced to [1,2,3] or to [a,a].

Below are the non-obviously lost 3-heap positions whose smallest heap has size 1.  All entries are [1, 2n, 2n+1].

[1,2,3],[1,4,5],[1,6,7],[1,8,9],[1,10,11],[1,12,13],[1,14,15],[1,16,17],[1,18,19],...

Below are the non-obviously lost 3-heap positions whose smallest heap has size 2.  All entries are of the form [2, 4n, 4n+2] or [2, 4n+1, 4n+3].

[2,4,6],[2,5,7],[2,8,10],[2,9,11],[2,12,14],[2,13,15],[2,16,18],[2,17,19],...

Below are the non-obviously lost 3-heap positions whose smallest heap has size 3.  All entries are of the form [3, 4n, 4n+3] or [3, 4n+1, 4n+2].

[3,4,7],[3,5,6],[3,8,11],[3,9,10],[3,12,15],[3,13,14],[3,16,19],[3,17,18],...

With more coins and more heaps, there continue to be patterns, but they are more complicated, so we will not attempt to describe them.

Below are the non-obviously lost 3-heap positions whose smallest heap has size 4.  Pattern involves 8n+k.

[4,8,12],[4,9,13],[4,10,14],[4,11,15],[4,16,20],[4,17,21],[4,18,22],[4,19,23],[4,24,28],[4,25,29],[4,26,30],[4,27,31],[4,32,36],[4,33,37],[4,34,38],[4,35,39],[4,40,44],[4,41,45],[4,42,46],[4,43,47]...

Below are the non-obviously lost 4-heap positions whose largest heap has size at most 15.

[2,3,4,5],[1,3,4,6],[1,2,5,6],[1,2,4,7],[1,3,5,7],[2,3,6,7],[4,5,6,7],[2,3,8,9],[4,5,8,9],[6,7,8,9],[1,3,8,10],[4,6,8,10],[5,7,8,10],[1,2,9,10],[5,6,9,10],[4,7,9,10],[1,2,8,11],[5,6,8,11],[4,7,8,11],[1,3,9,11],[4,6,9,11],[5,7,9,11],[2,3,10,11],[4,5,10,11],[6,7,10,11],[8,9,10,11],[1,5,8,12],[2,6,8,12],[3,7,8,12],[1,4,9,12],[3,6,9,12],[2,7,9,12],[2,4,10,12],[3,5,10,12],[1,7,10,12],[3,4,11,12],[2,5,11,12],[1,6,11,12],[1,4,8,13],[3,6,8,13],[2,7,8,13],[1,5,9,13],[2,6,9,13],[3,7,9,13],[3,4,10,13],[2,5,10,13],[1,6,10,13],[2,4,11,13],[3,5,11,13],[1,7,11,13],[2,3,12,13],[4,5,12,13],[6,7,12,13],[8,9,12,13],[10,11,12,13],[2,4,8,14],[3,5,8,14],[1,7,8,14],[3,4,9,14],[2,5,9,14],[1,6,9,14],[1,5,10,14],[2,6,10,14],[3,7,10,14],[1,4,11,14],[3,6,11,14],[2,7,11,14],[1,3,12,14],[4,6,12,14],[5,7,12,14],[8,10,12,14],[9,11,12,14],[1,2,13,14],[5,6,13,14],[4,7,13,14],[9,10,13,14],[8,11,13,14],[3,4,8,15],[2,5,8,15],[1,6,8,15],[2,4,9,15],[3,5,9,15],[1,7,9,15],[1,4,10,15],[3,6,10,15],[2,7,10,15],[1,5,11,15],[2,6,11,15],[3,7,11,15],[1,2,12,15],[5,6,12,15],[4,7,12,15],[9,10,12,15],[8,11,12,15],[1,3,13,15],[4,6,13,15],[5,7,13,15],[8,10,13,15],[9,11,13,15],[2,3,14,15],[4,5,14,15],[6,7,14,15],[8,9,14,15],[10,11,14,15],[12,13,14,15]

For casual play, let the initial position be a non-obviously lost position.  No move is objectively better than another from a lost position, so this will tend to produce varied games from as early as the first move.  (The first player explores different ways to try to swindle a win.)  [3,4,7] mentioned above is a nice small 3-heap initial position for casual play.  Some nice 4-heap (lost) initial positions that are easy to remember are [2,3,4,5] (14 coins total), [4,5,6,7] (22 coins), and [6,7,8,9] (30 coins).

We propose non-obviously lost initial positions with more heaps that satisfy the following aesthetic constraints: distinct heap sizes, minimize total number of coins, minimize largest heap, maximize smallest heap.  We avoid heaps of equal size in the initial position to make the mirroring strategy somewhat less likely to be usable toward the beginning of the game.  However, equal-sized heaps can easily appear after the first move.

The unique 3-heap satisfying these constraints is [1,2,3] (6 coins).

4-heap: [2,3,4,5] (14 coins).

5-heap: [3,4,6,8,9] (30 coins).

6-heap: [1,2,4,6,8,9] (30 coins).

The perfect triangle 7-heap [1,2,3,4,5,6,7] satisfies the constraints. It has size 28, smaller than the 5 or 6 heaps above.

8-heap: [2,3,4,5,6,7,8,9] (44 coins).

9-heap: [2,3,4,5,7,8,9,10,12] or [2,3,4,5,6,8,9,11,12] (60 coins).

10-heap: [1,2,3,4,5,6,8,9,10,12] (60 coins).

Best 11-heap is another perfect triangle: [1,2,3,4,5,6,7,8,9,10,11] (66 coins).

12-heap: [2,3,4,5,6,7,8,9,10,11,12,13] (90 coins).

13-heap: [1,2,3,4,5,6,7,9,10,12,14,16,17], [1,2,3,4,5,6,7,8,11,12,14,16,17], [1,2,3,4,5,6,7,8,10,13,14,16,17], or [1,2,3,4,5,6,7,8,10,12,15,16,17] (106 coins)

The values of n for which a perfect triangle [1,2,3,...,n] is a lost position: 0 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91 95 99 ... The values are 4k-1.

There are no non-obviously lost positions with all heaps having the same number of coins.  An even number of identical heaps is obviously lost.  An odd number of identical heaps is won: just take an entire heap to leave your opponent with an even number of identical heaps, a position which is lost.

The following trapezoids (heap sizes of consecutive integers, at least 2 and at most 20 coins in a heap) are non-obviously lost.  It appears the number of heaps must be a multiple of 4, and the smallest heap must have an even number of coins.

[2,3,4,5],[2,3,4,5,6,7,8,9],[2,3,4,5,6,7,8,9,10,11,12,13],[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17],
[4,5,6,7],[4,5,6,7,8,9,10,11],[4,5,6,7,8,9,10,11,12,13,14,15],[4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19],
[6,7,8,9],[6,7,8,9,10,11,12,13],[6,7,8,9,10,11,12,13,14,15,16,17],
[8,9,10,11],[8,9,10,11,12,13,14,15],[8,9,10,11,12,13,14,15,16,17,18,19],
[10,11,12,13],[10,11,12,13,14,15,16,17],
[12,13,14,15],[12,13,14,15,16,17,18,19],
[14,15,16,17],
[16,17,18,19]

Future work: organize positions by their distance from being obviously lost.  (Previously, we analyzed Chomp in this way.)  What other classes of positions can be quickly recognized as lost?  (We noted [1, 2n, 2n+1] above.)

Future work: repeat this analysis for the "subtraction game" in which the number of coins that can be removed in one move is limited (up to k).  Surprisingly, optimal play can be calculated quickly: compute nim sum mod (k+1).

# How to replace Proxy with AllowAmbiguousTypes

no-proxy

This post is essentially a longer version of this Reddit post by Ashley Yakeley that provides more explanation and historical context.

Sometimes in Haskell you need to write a function that “dispatches” only on a type and not on a value of that type. Using the example from the above post, we might want to write a function that, given an input type, prints the name of that type.

#### Approach 1 - undefined

One naive approach would be to do this:

class TypeName a where    typeName :: a -> Stringinstance TypeName Bool where    typeName _ = "Bool"instance TypeName Int where    typeName _ = "Int"

… which we could use like this:

>>> typeName False"Bool">>> typeName (0 :: Int)"Int"

However, this approach does not work well because we must provide the typeName method with a concrete value of type a. Not only is this superfluous (we don’t care which value we supply) but in some cases we might not even be able to supply such a value.

For example, consider this instance:

import Data.Void (Void)instance TypeName Void where    typeName _ = "Void"

There is a perfectly valid name associated with this type, but we cannot retrieve the name without cheating because we cannot produce a (total) value of type Void. Instead, we have to use something like undefined:

>>> typeName (undefined :: Void)"Void"

The base package uses this undefined-based approach. For example, Foreign.Storable provides a sizeOf function that works just like this:

class Storable a where    -- | Computes the storage requirements (in bytes) of the argument. The value    -- of the argument is not used.    sizeOf :: a -> Int    …

… and to this day the idiomatic way to use sizeOf is to provide a fake value using undefined:

>>> sizeOf (undefined :: Bool)4

This works because sizeOf never evaluates its argument. It’s technically safe, albeit not very appealing to depend on undefined.

#### Approach 2A - Proxy

The next evolution of this approach was to use the Proxy type (now part of base in the Data.Proxy module). As the documentation notes:

Historically, Proxy :: Proxy a is a safer alternative to the undefined :: a idiom.

I’m not exactly sure what the name Proxy was originally meant to convey, but I believe the intention was that a term (the Proxy constructor) stands in as a “proxy” for a type (specifically, the type argument to the Proxy type constructor).

We can amend our original example to use the Proxy type like this:

import Data.Proxy (Proxy(..))import Data.Void (Void)class TypeName a where    typeName :: Proxy a -> Stringinstance TypeName Bool where    typeName _ = "Bool"instance TypeName Int where    typeName _ = "Int"instance TypeName Void where    typeName _ = "Void"

… and now we can safely get the name of a type without providing a specific value of that type. Instead we always provide a Proxy constructor and give it a type annotation which “stores” the type we wish to use:

>>> typeName (Proxy :: Proxy Bool)"Bool">>> typeName (Proxy :: Proxy Int)"Int">>> typeName (Proxy :: Proxy Void)"Void"

We can simplify that a little bit by enabling the TypeApplications language extension, which permits us to write this:

>>> :set -XTypeApplications>>> typeName (Proxy @Bool)"Bool">>> typeName (Proxy @Int)"Int">>> typeName (Proxy @Void)"Void"

… or this if we prefer:

>>> typeName @Bool Proxy"Bool">>> typeName @Int Proxy"Int">>> typeName @Void Proxy"Void"

#### Approach 2B - proxy

A minor variation on the previous approach is to use proxy (with a lowercase “p”) in the typeclass definition:

import Data.Void (Void)class TypeName a where    typeName :: proxy a -> String    --          ↑instance TypeName Bool where    typeName _ = "Bool"instance TypeName Int where    typeName _ = "Int"instance TypeName Void where    typeName _ = "Void"

Everything else works the same, but now neither the author nor the consumer of the typeclass needs to depend on the Data.Proxy module specifically. For example, the consumer could use any other type constructor just fine:

>>> typeName ([] :: [Int])  -- Technically legal, but weird"Int"

… or (more likely) the consumer could define their own Proxy type to use instead of the one from Data.Proxy, which would also work fine:

>>> data Proxy a = Proxy>>> typeName (Proxy :: Proxy Int)"Int"

This trick helped back when Proxy was not a part of the base package. Even now that Proxy is in base you still see this trick when people author typeclass instances because it’s easier and there’s no downside.

Both of these Proxy-based solutions are definitely better than using undefined, but they are both still a bit unsatisfying because we have to supply a Proxy argument to specify the desired type. The ideal user experience should only require the type and the type alone as an input to our function.

#### Approach 3 - AllowAmbiguousTypes + TypeApplications

We previously noted that we could shorten the Proxy-based solution by using TypeApplications:

>>> typeName @Bool Proxy"Bool"

Well, what if we could shorten things even further and just drop the Proxy, like this:

>>> typeName @Bool

Actually, we can! This brings us to a more recent approach (the one summarized in the linked Reddit post), which is to use AllowAmbiguousTypes + TypeApplications, like this:

{-# LANGUAGE AllowAmbiguousTypes #-}import Data.Void (Void)class TypeName a where    typeName :: Stringinstance TypeName Bool where    typeName = "Bool"instance TypeName Int where    typeName = "Int"instance TypeName Void where    typeName = "Void"

… which we can invoke like this:

>>> :set -XTypeApplications>>> typeName @Bool"Bool">>> typeName @Int"Int">>> typeName @Void"Void"

The use of TypeApplications is essential, since otherwise GHC would have no way to infer which typeclass instance we meant. Even a type annotation would not work:

>>> typeName :: String  -- Clearly, this type annotation is not very helpful<interactive>:1:1: error:    • Ambiguous type variable ‘a0’ arising from a use of ‘typeName’      prevents the constraint ‘(TypeName a0)’ from being solved.      Probable fix: use a type annotation to specify what ‘a0’ should be.      These potential instances exist:        instance [safe] TypeName Void -- Defined at Example.hs:14:10        instance [safe] TypeName Bool -- Defined at Example.hs:8:10        instance [safe] TypeName Int -- Defined at Example.hs:11:10    • In the expression: typeName :: String      In an equation for ‘it’: it = typeName :: String

Type applications work here because you can think of a polymorphic function as really having one extra function argument: the polymorphic type. I elaborate on this a bit in my post on Polymorphism for dummies, but the basic idea is that TypeApplications makes this extra function argument for the type explicit. This means that you can directly tell the compiler which type to use by “applying” the function to the right type instead of trying to indirectly persuade the compiler into using the the right type with a type annotation.

#### Conclusion

My personal preference is to use the last approach with AllowAmbiguousTypes and TypeApplications. Not only is it more concise, but it also appeals to my own coding aesthetic. Specifically, guiding compiler behavior using type-annotations feels more like logic programming to me and using explicit type abstractions and TypeApplications feels more like functional programming to me (and I tend to prefer functional programming over logic programming).

However, the Proxy-based approach requires no language extensions, so that approach might appeal to you if you prefer to use the simplest subset of the language possible.

### Tweag I/O

The capability library is an alternative to the venerable mtl (see our earlier blog posts on the subject). It features a set of “mtl-style” type classes, representing effects, along with deriving combinators to define interpreters as type class instances. It relies on the -XDerivingVia extension to discharge effects declaratively, close to the definition of the application’s monad. Business logic can be written in a familiar, idiomatic way.

As an example, consider the following computation:

testParity :: (HasReader "foo" Int m, HasState "bar" Bool m) => m ()
testParity = do
put @"bar" (even num)

This function assumes a Reader effect "foo" of type Int, and a State effect "bar" of type Bool. It computes whether or not "foo" is an even number and stores the result in "bar".

Save for the tags "foo" and "bar", used to enable multiple Reader or State effects within the same monad (an impossible thing with mtl), this is fairly standard Haskell: Type classes are used to constrain what kind of effects the function can perform, while decoupling the computation from any concrete implementation. At use-site, it relies on GHC’s built-in resolution mechanism to “inject” required dependencies. Any seasoned Haskeller should feel right at home !

## Providing instances

To actually call this admittedly silly function, we need to provide interpreters for the "foo" and "bar" effects. Following the ReaderT design pattern, we’ll pack everything we need into a single context record, then interpret our effects over this context in the IO monad, using the deriving combinators provided by the library:

data Ctx = Ctx { foo :: Int, bar :: IORef Bool }
deriving Generic

newtype M a = M { runM :: Ctx -> IO a }
-- Use DerivingVia to derive a HasReader instance.
deriving (HasReader "foo" Int, HasSource "foo" Int) via
-- Pick the field foo from the Ctx record in the ReaderT environment.
-- Use DerivingVia to derive a HasState instance.
deriving (HasState "bar" Bool, HasSource "bar" Bool, HasSink "bar" Bool) via
-- Convert a reader of IORef to a state capability.
ReaderIORef (Field "bar" "ctx" (MonadReader (ReaderT Ctx IO)))

Thus equipped, we can now make use of our testParity function in an actual program:

example :: IO ()
example = do
rEven <- newIORef False
runM testParity (Ctx 2 rEven)
readIORef rEven >>= print

How do we test a function such as testParity in isolation? In our contrived example, this is quite easy: the example function could be easily converted into a test-case. In the Real World™, though, our context Ctx could be much bigger, providing a pool of database connections, logging handles, etc. Surely, we don’t want to spawn a database instance to test such a simple function!

In previous iterations of capability, the solution to this problem would have been to create a new monad for testing purposes, leaving out the capabilities we don’t want. While it works, it is not always the best tool for the job:

• You need to define a new monad for each combination of effects you want to test.
• Test cases are no longer self-contained; their implementation is spread across multiple places. It makes things less readable and harder to maintain.

A solution, supported by fancier effect system libraries such as polysemy or fused-effects, is to define ad-hoc interpreters in the executable code itself. At first glance, it might seem like this is not possible in capability. Indeed, since interpreters are provided as type class instances, and type classes are an inherently static mechanism, surely there is no way of specifying those dynamically. Or is there?

As of version 0.4.0.0, the capability library features an experimental Capability.Reflection module, addressing this very limitation. It is inspired by, and uses, Edward Kmett’s reflection library, and uses similar type class wrangling magic to let you define interpreters as explicit dictionaries.

### Interpreters as reified dictionaries

Making use of those new features, the example function can be rewritten as:

import qualified Control.Monad.Reader as MTLReader

example :: IO ()
example = do
let
runTestParity :: (Int, IORef Bool) -> IO ()
runTestParity (foo, bar) =
flip MTLReader.runReaderT foo $-- Interpret the effects into 'ReaderT Int IO'. -- -- Write the 'HasReader "foo" Int' dictionary -- in terms of mtl functions. -- -- Forward the 'MonadIO' capability. interpret @"foo" @'[MonadIO] ReifiedReader { _reader = MTLReader.reader , _local = MTLReader.local , _readerSource = ReifiedSource { _await = MTLReader.ask } }$
-- Use 'MonadIO' to write the 'HasState "bar" Bool' dictionary.
-- Forward the 'HasReader "foo" Int' capability.
--
-- The 'MonadIO' capability is not forwarded, and hence forgotten.
interpret @"bar" @'[HasReader "foo" Int] ReifiedState
{ _state = \f -> do
b <- liftIO $readIORef bar let (a, b') = f b liftIO$ writeIORef bar b'
pure a
, _stateSource = ReifiedSource
{ _await = liftIO $readIORef bar } , _stateSink = ReifiedSink { _yield = liftIO . writeIORef bar } } testParity rEven <- newIORef False runTestParity (2, rEven) readIORef rEven >>= print Defining a test monad is no longer required: the effects are interpreted directly in terms of the underlying ReaderT Int IO monad. Type-class dictionaries are passed to the interpret function as mere records of functions and superclass dictionaries — just like GHC does under the hood as hidden parameters when we use statically defined instances. Let’s dissect the ReifiedReader dictionary: ReifiedReader { _reader = MTLReader.reader , _local = MTLReader.local , _readerSource = ReifiedSource { _await = MTLReader.ask } } Omitting the extra Proxy# arguments, which are here for technical reasons, the first two attributes, _reader and _local, correspond directly to the methods of the HasReader t type class: class (Monad m, HasSource tag r m) => HasReader (tag :: k) (r :: *) (m :: * -> *) | tag m -> r where local_ :: Proxy# tag -> (r -> r) -> m a -> m a reader_ :: Proxy# tag -> (r -> a) -> m a The _readerSource argument, on the other hand, represents the dictionary of the HasSource superclass: class Monad m => HasSource (tag :: k) (a :: *) (m :: * -> *) | tag m -> a where await_ :: Proxy# tag -> m a ### Abstracting interpreters This is quite boilerplatey, though. If we’re writing a lot of test cases, we are bound to redefine those interpreters several times. This is tedious, error-prone, and clutters our beautiful test logic. Maybe this is could all be factored out? Sure thing! interpretFoo :: forall cs m a. (MTLReader.MonadReader Int m, All cs m) => (forall m'. All (HasReader "foo" Int : cs) m' => m' a) -> m a interpretFoo = interpret @"foo" @cs ReifiedReader { _reader = MTLReader.reader , _local = MTLReader.local , _readerSource = ReifiedSource { _await = MTLReader.ask } } interpretBar :: forall cs m a. (MonadIO m, All cs m) => IORef Bool -> (forall m'. All (HasState "bar" Bool : cs) m' => m' a) -> m a interpretBar bar = interpret @"bar" @cs ReifiedState { _state = \f -> do b <- liftIO$ readIORef bar
let (a, b') = f b
liftIO $writeIORef bar b' pure a , _stateSource = ReifiedSource { _await = liftIO$ readIORef bar }
, _stateSink = ReifiedSink
{ _yield = liftIO . writeIORef bar }
}

These two functions follow a similar pattern. Let’s have a closer look at the type of interpretBar to understand what is going on:

interpretBar
:: forall cs m a. (MonadIO m, All cs m)
=> IORef Bool
-> (forall m'. All (HasState "bar" Bool : cs) m' => m' a)
-> m a
• The (typelevel) cs :: [(* -> *) -> Constraint] argument is a list of capabilities that we wish to retain in the underlying action.
• Since we interpret the State effect with a mutable IORef reference, we require that the underlying monad be an instance of MonadIO. Moreover, we ask that our target monad also implement all the required capabilities by adding the All cs m constraint to the context (All is a type family that applies a list of capabilities to a monad to generate a single constraint; for example, All '[MonadIO, HasSource "baz" Baz] m is equivalent to (MonadIO m, HasSource "baz" Baz m)).
• The IORef used to store our state is passed as a standard function argument. This was not possible without ad-hoc interpreters: we needed to add the IORef to the Ctx type. With ad-hoc interpreters, on the other hand, we can write instances which capture references in their closures.
• The last argument is a monadic action that makes use of HasState "bar" Bool along with the forwarded cs capabilities. It is required to be polymorphic in the monad type, which guarantees that the action cannot use other effects.

Now that we have factored out the interpretation of the "foo" and "bar" effects into dedicated functions, they can be neatly composed to provide just the effects we need to run testParity:

example :: IO ()
example = do
let
runTestParity :: (Int, IORef Bool) -> IO ()
runTestParity (foo, bar) = flip MTLReader.runReaderT foo $interpretFoo @'[MonadIO]$
interpretBar @'[HasReader "foo" Int] bar $testParity rEven <- newIORef False runTestParity (2, rEven) readIORef rEven >>= print ## Deriving capabilities Truth be told, in this example, the dictionaries we’ve been writing aren’t so different from a custom type class with capabilities provided by deriving-via. While the extra power that comes with dynamic dictionaries can be very useful, it isn’t always warranted. There is a middle ground, however: we can provide capabilities locally, but with deriving-via combinators using a function that we call derive. You would typically use derive to derive high-level capabilities from lower-level capabilities. In our case, we can replace: runTestParity :: (Int, IORef Bool) -> IO () runTestParity (foo, bar) = flip MTLReader.runReaderT foo$
interpretFoo @'[MonadIO] $interpretBar @'[HasReader "foo" Int] bar$
testParity

with:

runTestParity :: (Int, IORef Bool) -> IO ()
runTestParity ctx = flip MTLReader.runReaderT ctx $derive -- Strategy @(ReaderIORef :.: Rename 2 :.: Pos 2 _ :.: MonadReader) -- New capability @'[HasState "bar" Bool] -- Forwarded capability @'[MTLReader.MonadReader (Int, IORef Bool)]$

derive
@'[HasState "bar" Bool]

testParity

thus getting rid of the interpret{Foo,Bar} helpers entirely. For instance, the HasState "bar" Bool capability is derived from the IORef Bool in the second position of the tuple provided by the ambient MonadReader (Int, IORef Bool) instance. Think DerivingVia, but dynamically!

## Conclusion

Wrapping things up:

• At its core, the capability library is just mtl on steroids, modeling effects with type classes.
• The standard way of using capability is to define interpreters declaratively, using the provided combinators; this programming-style does not allow defining ad-hoc interpreters, at runtime.
• The new version of capability provides a way of overcoming this limitation with reified dictionaries.
• Standard deriving strategies can be used to provide dynamic instances with less boilerplate, using the underlying deriving mechanism.

Writing tests is just one example. Another application might be to dynamically select the interpretation of an effect based on a configuration parameter. All this is still experimental: the API and ergonomics are likely to change a bit over the next few releases, but we hope this post motivates you to give it a try.

# Giml's type inference engine

Giml's type inference engine uses unification-based constraint solving.

It has 5 important parts:

1. Topologically order definitions and group those that depend on one another
2. Elaboration and constraint generation
3. Constraint solving
4. Instantiation
5. Substitution

In summary, we sort and group definitions by their dependencies, we elaborate the program with type variables and types that we know and collect constraints on those variables, we then solve the constraints for each group using unification and create substitutions which are mapping from type variables to types and merge them together, and we then substitute the type variables we generated in the elaboration stage with their mapped types from the substitution in the program.

Instantiation occurs during the elaboration stage and the constraint solving stage, when we want to use a polymorphic type somewhere.

## Group and order definitions

Before we start actually doing type inference, we figure out the dependencies between the various definitions. This is because we want to figure out the type of a definition before it's use, and because we want definitions that depend on each other (mutual recursion) to be solved together.

So for example if we have this file as input:

main = ffi("console.log", odd 17)

even x =
if int_equals x 0
then True
else if int_equals x 1
then False
else odd (sub x 1)

odd x =
if int_equals x 0
then False
else if int_equals x 1
then True
else even (sub x 1)

The output of the reordering and grouping algorithm will be:

[["even", "odd"], ["main"]]

This is because even and odd are mutually recursive, so we want to typecheck them together, and because main depends on odd we want to typecheck it after we finished typechecking odd.

The module implementing this algorithm is Language.Giml.Rewrites.PreInfer.GroupDefsByDeps.

## Elaboration

Now the real work begins. We start by going over the program's AST and annotate it with the types we know. When we don't know what the type of something is we "invent" a new fresh type variable as a placeholder and mark that type with a constraint according to its usage.

### Types

Language.Giml.Types.Types contains a datatype that can be used to represent Giml types. For example:

• TypeCon "Int" represents the type Int
• TypeVar "t1" represents a polymorphic type variable
• TypeApp (TypeCon "Option") (TypeCon "Int") represents the type Option Int
• TypeRec [("x", TypeCon "Int")] represents a record with the type { x : Int }

### Constraints

A Constraint between two types describes their relationship.

The most common constraint is Equality. Which means that the two types are interchangeable and should unify to the same type. For example from the constraint Equality (TypeVar t1) (TypeCon "Int") we learn that t1 is the same as Int.

The other, less common constraint, is InstanceOf. Which means that the first type has an Equality relationship with the instantiation of the second type. an instantiation is a monomorphic instance of a polymorphic type, for example (a -> b) -> List a -> List b is polymorphic, and can be instantiated, among other examples, to (Int -> String) -> List Int -> List String by replacing the polymorphic type variables a and b with monomorphic types such as Int and String.

We will talk about solving these constraints a bit later.

### Implementation

We also use State, Reader and Except in our algorithm:

• The State contains:
• A monotonically increasing integer for generating type variables
• A list of constraints we generated through the elaboration phase, which we will return as output at the end
• An environment of data definitions, some are built-in and others are user defined and added as part of the elaboration stage (maybe I should've put this in the Reader part...)
• The currently scoped variables in the environment and their type (which might be polymorphic!)
• The type of builtin functions and expressions (such as int_equals and pure, which might also be polymorphic)
• The Except part can throw a few errors such as "unbound variable" error.

Most of the action happens in elaborateExpr and I'm not going to cover every usecase, but lets look at a few examples:

#### Elaborating literals

Elaborating literals is simple - we already know what the types are - if we see a string literal we annotate the AST node with the type String, if we see an integer literal we annotate it with Int, and we don't need to generate new constraints because we know exactly what the type is.

#### Elaborating lambda functions

Elaborating lambda expressions is interesting, because we don't know what the type for the arguments is going to be: it might be inferred by the usage like in this case: \n -> add 1 n or not like in this case \x y -> x. What we do here is:

1. Generate a type variable for each function argument
2. Elaborate the body of the lambda with the extended environment containing the mapping between the function arguments and their newly generated types
3. Annotate the lambda AST node with the type of a function from the argument types to the type of the body, which we get from the elaborated node after (2)

And we will trust the constraint solving algorithm to unify all of the types.

#### Elaborating function application

To elaborate a function application we:

1. Elaborate the arguments
2. Elaborate the applied expression
3. Generate a fresh type variable to represent the return type
4. Constrain the type of the applied expression as a function from the arguments to the return type
5. Annotate the function application AST node with the return type (which we just generated)

Remember that the type of the applied expression may just be a generated type variable, so we can't examine its type to see what the return type is. For example if we had this expression \f x -> f x, as we saw before we generated type variables for f and x! So we have to make sure the type inference engine knows what their type should be (by generating a constraint) according to the usage and trust that it'll do its job.

#### Elaborating variables

For variables, we expect an earlier stage (such as elaborating lambdas, lets or definitions) to have already elaborated their type and added them to the environment, so we look them up. We first try in the scoped environment and if that fails we try the built-ins environment. If that fails as well, we throw an error that the variable is unbound.

If we do find a variable in the scoped environment, there are two scenarios we need to take care of:

1. The variable was bound in a place that expect it to be monomorphic, such as a lambda function argument
2. The variable was bound in a place that expect it to be polymorphic, such as a term definition

Monomorphic types can be used as is, but polymorphic types represent an instantiation of a type. When we find that the variable we are elaborating is polymorphic, we will generate a new type variable for it and constrain it as an InstanceOf the type we found in the environment.

Later in the constraints solving stage we will instantiate the polymorphic type we found in the environment and say that the type variable we generate is equals to the instantiation. We delay this because we want the polymorphic type to be solved first before we instantiate it.

If we do not find the variable in the scoped environment, we check the builtins environment. If we find the type there, we instantiate it immediately and annotate the AST node with this type. Here we instantiate the type immediately because it is already solved, so we don't need to wait.

## Constraints solving

The heart of the type inference engine. Here we visit each the constraints generated from each group of substitutions separately and generate a substitution (mapping from type variables to types) as output.

the solveConstraints function takes an accumulated substitution and a list of constraints as input and tries to solve the first constraint on the list. Solving a constraint might generate more constraints to solve, and may return a substitution as well. We then apply this substitution (substitute all occurences of the keys with the values) to the rest of the constraints and the accumulated substitution itself, merge the new substitution with the accumulated substitution, and keep going until there are no more constraints left to solve.

(solveConstraint) uses unification to solve a constraint. Here are a few important cases that it handles:

#### Equality t1 t2 (where t1 == t2)

When the two types are the same (for example Int and Int, or t1 and t1), the unification succeeds without extra work needed, we don't need to create new constraints or create a substitution.

#### Equality (TypeVar tv) t2

When one of the types is a type variable, we substitute (replace) it with t2 in all places we know of (accumulated substitution and rest of the constraints). So the rest of the constraints that contain it will use t2 instead, and when substituting it with another type it'll be t2.

#### Equality (TypeApp f1 a1) (TypeApp f2 a2)

We generate two new constraints: Equality f1 a2 and Equality f2 a2, reducing a constraint of complex types into something we already know how to handle: a constraint on simple types.

#### InstanceOf t1 t2

When we run into an InstanceOf constraint, we instantiate the second type t2 and produce an Equality constraint between t1 and the instantiation of t2.

Instantiating is basically "copying" the type signature, and producing the Equality constraint with t1 makes the type inference engine find the right type for the instantiated type variables.

We use InstanceOf instead of Equality when the type is polymorphic, for example when we want to use id that we defined as id x = x. If we were to use an Equality constraint instead, every use of id would constrain it to have a particular type (so id 1 would make the type of id Int -> Int instead of only where we use it).

There are a few more cases, but this is the gist of it.

When we fail to unify we throw an error of type mismatch.

### Example

Let's see an example for this (very) short program:

one = (\x -> x) 1

After the elaboration stage we generate the following AST and constraints:

          +-- t1 -> t1
|
---------
one = (\x -> x) 1
|           |  |
+-- t2      |  +------ Int
|
+---- t1
[ Equality top0 t2
, Equality (t1 -> t1) (Int -> t2)
]

#### How did we get these constraints?

1. The lambda expression case generated the type t1 for x, so it's type is t1 -> t1
2. The literal case annotated 1 as Int
3. The function application generated the return type t2 and generated the constraint Equality (t1 -> t1) (Int -> t2) which is "the type of the applied expression is equals to a function from the argument to the return type I just generated"
4. The definition of one generated the type variable top0 for it so it can be used in other definitions polymorphically, but for this function it should be used monomorphically, so it's constrainted as equal to the return type and generated the constraint Equality top0 t2.

#### How do we unify them?

Let's go over them in order. We begin with an empty substitution:

- Constraints: [Equality top0 t2, Equality (t1 -> t1) (Int -> t2)]
- Substitution: []

1. Equality top0 t2
-- ==> The first is a type variable, so add top0 := t2 to the substitution

- Constraints: [Equality (t1 -> t1) (Int -> t2)]
- Substitution: [top0 := t2]

2. Equality (t1 -> t1) (Int -> t2)
-- ==> replace with two new constraints:
--  1. Equality t1 Int
--  2. Equality t1 t2

- Constraints: [Equality (-> t1) (-> Int), Equality t1 t2]
- Substitution: [top0 := t2]

3. Equality (-> t1) (-> Int)
-- ==> replace again with two new constraints:
--  1. Equality (->) (->)
--  2. Equality t1 Int

- Constraints: [Equality (->) (->), Equality t1 Int, Equality t1 t2]
- Substitution: [top0 := t2]

4. Equality (->) (->)
-- ==> these two types are identical, so we can continue

- Constraints: [Equality t1 Int, Equality t1 t2]
- Substitution: [top0 := t2]

5. Equality t1 Int
-- ==> The first is a type variable, so add t1 := Int to the substitution

- Constraints: [Equality Int t2]
- Substitution: [top0 := t2, t1 := Int]

6. Equality Int t2
-- ==> The second is a type variable, so add t2 := Int to the substitution

- Constraints: []
- Substitution: [top0 := Int, t1 := Int, t2 := Int]

And we're done! The output is the substitution [top0 := Int, t1 := Int, t2 := Int], we can now apply it to our elaborated program and get the fully type checked program:

          +-- Int -> Int
|
---------
one = (\x -> x) 1
|           |  |
+-- Int     |  +------ Int
|
+---- Int

Two more things we talk to talk about before we wrap this up:

## Instantiation

We instantiate a type by looking up all of the type variables in the type and generate a fresh new type variable for each one (though all instances of a type variable should be mapped to the same new type variable).

So for example in the type (a -> b) -> List a -> List b, if t1 and t2 are two new type variables we haven't seen before, we can replace a with t1 and b with t2 everywhere in the type and get (t1 -> t2) -> List t1 -> List t2

After that, we let the Equality constraints and the constraint solving algorithm find the right monomorphic type for t1 and t2.

## Substitution

When we apply a substitution to a constraint or another substitution, we replace all occurences of a type variable tv with t if t := tv is in the substitution. But, we also need to check that tv does not contain t inside of it! (this is called an occurs check)

if tv does contain t, it means that the we ran into an infinite type and we don't support those here, so we will throw an error.

A simple way to generate such case is this:

x y = x

The constraint generated from this definition is: Equality top0 (t1 -> top0), which will fail the occurs check.

If you try to desugar this definition into a lambda expression, you'll quickly discover that you've been had: anytime you'll go to replace x with \y -> x you'll just add another level of lambda.

# Summary

This is the general overview of Giml's type inference engine. If you are interested in learning more about it, check out the source code.

In the next blog posts I will cover the type inference of extensible records and polymorphic variants.

If this helped you, there's something you want me to clarify, you think there's something I missed, there's something you want to know more about, or there's something I got totally wrong, do contact me on Twitter or via Email.

# Sockchain

As we all know, blockchain is a disruptive technology that has helped us find new efficiencies in our lives. For example, prior to Bitcoin, creation of new currency required wasteful processes like digging gold out of the ground or printing paper currency. Bitcoin replaces that with cheap electricity. The world is better off for the reduced resource usage. Efficiency!

More industries are beginning to adopt various blockchain solutions to solve common problems. These are creating synergies in enterprises and forming new markets. But so far, blockchain has been used almost exclusively to solve industrial and financial systems issues. Today, I'm happy to introduce a new blockchain technology aimed at solving problems faced by everyday users.

## The problem

It's a perennial issue, with a huge economic impact (detailed below). We estimate that in addition to the massive economic impact, the emotional and psychological damage may even be the larger factor.

I'm talking, of course, about mismatched and lost socks. While "the dryer ate my socks" is a common term, we have in fact identified multiple related but separable subproblems:

• The standard missing-sock
• Similar, yet slightly different, versions of the same socks, leading to excess effort in identifying a match. This problem breaks down further into:
• Socks which have been worn a different number of times and thereby stretched differently
• Different manufacturing lines leading to microvariations within construction
• Despite all claims to the contrary, there are in fact left vs right variations of socks, either due to manufacture or usage. Identifying a correct pairing becomes an NP-complete problem

This is in fact just the tip of the iceberg. We fully believe there may be at least 13 further classifications of sock-related inefficiencies. Based just upon our current knowledge, we have calculated the following impacts:

• 7 billion sock-wearing individuals worldwide (some populations are, of course, part of the rabid anti-socker cabal).
• Individuals tend to change their socks approximately every 1.2 days.
• Individuals tend to launder socks approximately every 8 days.
• On average, 3.4 individuals run laundry concurrently.
• This leads to an average socks pair/laundry of ~23, or ~46 socks per laundry cycle.
• As this is an NP-hard problem to solve, with a comparison taking on average 2.3 seconds.
• This leads to approximately 75,000 minutes spent sorting socks per laundry cycle. While there may be some ad-hoc optimizations, we expect no greater than a 10x speedup from such optimizations, and believe each laundry cycle requires at least 7,500 minutes, just on sock sorting.
• Combining this with our estimates for laundry cycles per individual, we estimate that the global economy receives an impact of 1.3 billion person years wasted per year just on sock sorting!

As you can see, sock sorting accounts for easily the greatest modern crisis known to humankind.

But further than this is the emotional and psychological harm caused by these situations. We estimate 1 in 4 divorces are caused by mismatched socks. 3 in 10 children will wear a mismatched pair of socks to school at least once per month, with a 27% chance of cyberbullying as a result.

## Enter the sockchain

The sockchain is the natural, obvious, and modern solution to this problem. We leverage our novel Proof-of-Work/Proof-of-Stake hybrid technology to create a distributed application that uses Near Field Communication (NFC) to tag each pair of socks. Dryers throughout the world can easily and cheaply be outfitted with NFC detectors. Miners will be able to detect the presence of such socks in distributed drying systems, and provide a searchable, distributed ledger of sock-pair-locations (SPLs).

Our fee incentive structure will guarantee payment of at least 2 socks-pairs/block, leading to economies of scale. The fee structure will appropriately respond to the evolving and dynamic socketplace.

We envision further development as well. Already, some of our partners are embedded fungus detection systems within our NFCs for athlete's foot elimination, dealing with yet another worldwide pandemic.

## Cost

It's right to question the costs of such endeavors. As mentioned above, the average labor cost per year currently is 1.3 billion person years. Even taking a modest estimate of $150/hour of labor, we end up with an average cost per year of nearly two quadrillion dollars, which is almost the marketcap of Bitcoin itself. We estimate 80 billion active pairs of socks worldwide. At an NFC cost of$20 per pair, we end up with a modest hardware cost of $1.6 trillion, an easily affordable expense given the current human impact. Outfitting the dryers will cost an additional$50 trillion. And we estimate that electricity costs for our novel Proof of Work system to be no greater than $100 trillion. Including training, marketing, and our very slim administrative expenses, the project will cost approximately$300 trillion. Compared to the current cost of $2 quadrillion, we're looking at an 85% cost savings. It would be inhumane not to undertake such a project. ## March 31, 2021 ### Douglas M. Auclair (geophf) # March 2021 1HaskellADay Problems and Solutions ### Michael Snoyman # Haskell base proposal, part 2: unifying vector-like types Discourse thread for discussion Two weeks back, I wrote a blog post with a proposal for unification of vector-like types in bytestring, text, and vector. This continued with a discussion on Discourse, and has been part of some brainstorming sessions at the Haskell Foundation tech track to lock down some of the concrete details. (Meeting minutes have been posted to Discourse regularly.) I've discussed with a few other interested parties, and invited feedback from people who have been working on related projects. (And I even received some such feedback!) At this point, I wanted to summarize what's come up, and propose some concrete next steps. Also, since this is one of the first such proposals we're trying to get to happen through the Haskell Foundation, I'll take a step back with some meta comments about the process itself. ## Refined goals I threw quite a bit into the original proposal. The brainstorming since has helped make it clear what can be achieved, what will have major breaking impacts, and what will deliver the most value. Here are some outcomes: • As much as I still think a single, unified Vector type that does boxing or unboxing depending on the contained type would be a Very Good Thing, there are technical hurdles, and not everyone is bought into it. I doubt it's going to happen any time soon. • The temporary pinning of memory is likely not going to happen, since it would be too big a GC change. However, larger ByteArray#s (3kb IIRC) are already pinned, and using the copy-when-unpinned technique will not be too expensive for things under 3kb. So we have a path forward that requires no (or few) changes to GHC. • It seems like one of the biggest advantages we may get out of this proposal is to move ByteString to unpinned memory. This would be good, since it would reduce memory fragmentation. The downside is the need to copy smaller ByteStrings before performing FFI calls. But overall, as mentioned previously, we don't think that will have a major performance impact. • There are already efforts underway, and have been for a while, to rewrite text to use UTF-8. Combined with this proposal, we could be seeing some serious improvements to how textual data is consumed and worked with in Haskell, but that's a bit too far off right now. ## Refined design With that in place, a semi-coherent design is beginning to form around this proposal: • We're not ready to move types into base yet, but fortunately we already have another package that's a good candidate for shared-vector-types: primitive. It's already used by vector, and can be used by bytestring and text. • We have a PrimArray type present in primitive, but it doesn't support slicing. Let's add a PrimVector type with the minimal machinery necessary to get slicing in place. • The big change: let's rewrite ByteString (the strict variant) to be newtype ByteString = ByteString (PrimVector Word8). • We can recover backwards compatibility in most of the package, including in the .Unsafe module. • People directly using the .Internal module will likely be broken, though there may be some clever tricks to recover backwards compat there too. • We'll get the non-memory-fragmentation benefit immediately. • Not yet discussed, but putting down for future brainstorming: what should we do with ShortByteString? If we move it over to PrimVector, it will end up with a little more overhead for slicing. Do we leave it alone instead? Move it to PrimArray? There are additional steps I could write around text and vector, but honestly: let's stop it right there. If we get a working bytestring package on top of primitive's PrimVector, I think that's a great time to take a break, analyze the performance impact, and see ecosystem impact, likely by compiling a Stackage snapshot with the tweaked bytestring and primitive packages. ## Action items Next steps: find out who's interested in doing the work and dive in! This is still proof of concept, so no real buy-in is needed. We're exploring a possibility. There's a bunch of code that needs to be rewritten in bytestring to see if this is possible. And while brainstorming in calls has been good, I don't think it's great. I'd like to move to discussions in a chat room to make it easier for others to engage. I'll comment a bit more on this below. ## Other packages I've reached out to some authors of other packages to get their input on this proposal. I've received some, and incorporated some of that here. For example, both Alexey Kuleshevich and Alexey Khudyakov proposed using primitive as the new home for the shared data types. Others have expressed that they'd rather first see a concrete proposal. We'll see if there is further collaboration possible in the future. ## Process level comments Let me summarize, at a high level, what the process was that was involved in this proposal: 1. Free-form discussions on the Haskell Foundation tech track in a live meeting. People discussing different ideas, different concerns, which ultimately triggered the idea of this proposal. 2. Blog post to the world and Discourse thread laying out some initial ideas and looking for feedback. 3. Public discussion. 4. A few private emails reaching out to specific interested parties to get their input. 5. In concert with (3) and (4): further tech track live meetings to discuss details further. 6. This second blog post (and associated Discourse thread) to update everyone and call to action. 7. Hopefully: some actual coding. Overall, I think this is a good procedure. If I could make one change, it would be towards leveraging asynchronous communication more and live meetings less. I absolutely see huge value in live meetings of people to brainstorm, as happened in (1). But I think one of the best things we can do as the Haskell Foundation is encourage more people to easily get involved in specific topics they care about. On the bright side, the significant usage of Discourse for collaboration and reporting on meeting minutes has been a Good Thing. I think blog posts like this one and my previous one are a Good Thing for collecting thoughts coherently. That said, I realize I'm in the driver's seat on this proposal, and have a skewed view of how the outside world sees things. If people have concerns with how this was handled, or ideas for improvement, bring them up. I think figuring out how to foster serious discussion of complex technical issues in the Haskell ecosystem is vital to its continued success. ## March 29, 2021 ### Philip Wadler # Conferences after COVID: An Early Career Perspective One silver-lining to the cloud of COVID has been the development of virtual forms of participation. A SIGPLAN blog post by five early-career researchers offers their perspective on what we should do next. We propose that SIGPLAN form a Committee on Conference Data. The committee would be made up of: one organizing-committee representative from each of the flagship SIGPLAN conferences, one early career representative, and, crucially, a professional data collection specialist hired by SIGPLAN. The group would identify and collect key data that is pertinent to conference organization, especially with respect to physical versus virtual conference formats. The committee would make data-driven recommendations to SIGPLAN organizers based on the collected data and guided by core tenets such as community building, inclusivity, research dissemination, and climate responsibility. We realize that this is not a small request, but we are confident that it is both necessary and achievable. If the committee were to form by May 1, 2021, it would be able to start collecting data at PLDI 2021 and continue through the next two years, providing enormous clarity for SIGPLAN organizers at a time when so much is unclear. ### Mark Jason Dominus # Skin worms? The King James Version of Job 19:26 says: And though after my skin worms destroy this body, yet in my flesh shall I see God: I find this mysterious for two reasons. First, I cannot understand the grammar. How is this supposed to be parsed? I can't come up with any plausible way to parse this so that it is grammatically correct. Second, how did the worms get in there? No other English translation mentions worms and they appear to be absent from the original Hebrew. Did the KJV writers mistranslate something? (Probably not, there is nothing in the original to mistranslate.) Or is it just an interpolation? Pretty ballsy, to decide that God left something out the first time around, but that you can correct His omission. ### Michael Snoyman # Shutting down Haskellers.com? I've been playing with this idea off-and-on for a few years now, and decided to finally write a blog post to float it. I'm considering shutting down the Haskellers, and turning it into a redirect to some other community resource. The reason for this is pretty straightforward: I created Haskellers site originally as a some kind of a community/discussion/professional connection hub. But essentially all of its functionality seems better served by other resources at this point, and I'd rather promote those other services. The simplest idea I had was to change it to a redirect to the haskell.org Discourse, though perhaps hosting a single-page site with links to common Haskell resources, together with a very prominent "edit" button for additional contributions, would be worthwhile. I'm curious if anyone has feedback, either for or against, or ideas on what can or should replace a haskellers.com page. If so, please click the button below to the GitHub issue, to at least click the â€‹â€‹â€‹â€‹â€‹â€‹â€‹ğŸ‘� or ğŸ‘� button. ## March 26, 2021 ### Well-Typed.Com # Improvements to memory usage in GHC 9.2 Sometimes OS-reported memory usage might be quite different to the live data in your Haskell program. In a couple of previous posts we’ve explored some reasons for this and shown how ghc-debug can help pinpoint fragmentation problems. In GHC 9.2 I have made two improvements which should make the memory usage reported by long-running applications more closely line up with the amount of memory they need. 1. By default a lot of extra memory was retained (up to 4 times the amount of live data). Now idle collections will start to return some of that memory configured by the -Fd option. (!5036) 2. Allocation of pinned objects was causing fragmentation of the nursery which could lead to the nursery retaining a lot of memory. (!5175) In the rest of this post I will explain how to estimate the memory usage of your program based on its amount of live data and provide some more detail about the new -Fd option. ## Inspecting memory usage There are three metrics which are important when understanding memory usage at a high-level. The GHC.Stats module provides means of inspecting these statistics from inside your program, and they are also included in the eventlog. Name Description GHC.Stats Eventlog Live Bytes The amount of live heap allocated data in your program gcdetails_live_bytes HeapLive Heap Size The amount of memory the RTS is currently retaining for the Haskell heap gcdetails_mem_in_use_bytes HeapSize OS VmRSS The amount of memory the OS thinks your program is using Sample from /proc/pid/status OS VmRSS and heap size should correspond closely to each other. If they do not then there are two likely reasons. 1. It could mean that off-heap allocations (such as calls to mallocBytes or allocations from C libraries) are contributing significantly to the resident memory of your program. You can analyse off-heap allocations using tools such as heapcheck. 2. By default memory is returned lazily to the OS by marking returned regions using MADV_FREE, the memory is then free to be returned to the OS but the OS only takes it when it needs more memory. The RTS flag --disable-delayed-os-memory-return means that we use MADV_DONTNEED instead, which promptly returns memory. In the rest of this post I will assume that OS VmRSS and heap size are close to each other. ## Understanding how Heap Size corresponds to Live Bytes Heap size is the amount of memory that the RTS has currently allocated for the Haskell heap. There are three main factors which determine heap size. ### Collection Strategy Firstly, the garbage collection strategy used by the oldest generation requires some overhead. By default a copying strategy is used which requires at least 2 times the amount of currently live data in order to perform a major collection. For example, if your program’s live data is 1GB then the heap size needs to be at least 2GB as each major collection will copy all data into a diferent block. If instead you are using the compacting (-c) or nonmoving (-xn) strategies for the oldest generation then less overhead is required as the strategy immediately reuses already allocated memory by overwriting. For a program with live bytes 1GB then you might expect the heap size to be at minimum a small percentage above 1GB. ### Nursery Size Secondly, a certain amount of memory is reserved for the nursery. The size of the nursery (per core) can be specified using the -A flag. By default each nursery is 4MB so if there are 8 cores then 8 * 4 = 32MB will be allocated and reserved for the nursery. As you increase the -A flag the baseline memory usage will correspondingly increase. ### Memory Retention Behaviour Thirdly, after doing some allocation GHC is quite reluctant to decrease its heap size and return the memory to the OS. This is because after performing a major collection the program might still be allocating a lot and it costs to have to request more memory. Therefore the RTS keeps an extra amount to reuse which depends on the -F ⟨factor⟩ option. By default the RTS will keep up to (2 + F) * live_bytes after performing a major collection due to exhausting the available heap. The default value is F = 2 so you can see the heap size to be as high as 4 times the amount used by your program. Without further intervention, once your program has topped out at this high threshold, no more memory would be returned to the OS so heap size would always remain at 4 times the live data. If you had a server with 1.5G live data, then if there was a memory spike up to 6G for a short period, then heap size would never dip below 6G. This is what happened before GHC 9.2. In GHC 9.2 memory is gradually returned to the OS so OS memory usage returns closer to the theoretical minimums. ## New memory return strategy in GHC 9.2 The -Fd ⟨factor⟩ option controls the rate at which the heap size is decreased and hence memory returned to the OS. On consecutive major collections which are not triggered by heap overflows, a counter (t) is increased and the F factor is inversely scaled according to the value of t and Fd. The factor is scaled by the equation: \texttt{F}&39; = \texttt{F} \times {2 ^ \frac{- \texttt{t}}{\texttt{Fd}}} By default Fd = 4, increasing Fd decreases the rate memory is returned. Major collections which are not triggered by heap overflows arise mainly in two ways. 1. Idle collections (controlled by -I ⟨seconds⟩) 2. Explicit trigger using performMajorGC. For example, idle collections happen by default after 0.3 seconds of inactivity. If you are running your application and have also set -Iw30, so that the minimum period between idle GCs is 30 seconds, then say you do a small amount of work every 5 seconds, there will be about 10 idle collections every 5 minutes. This number of consecutive idle collections will scale the F factor as follows: \texttt{F}&39; = 2 \times {2^{\frac{-10}{4}}} \approx 0.35 and hence we will only retain (0.35 + 2) * live_bytes rather than the original 4 times. If you have less frequent idle collections (e.g. due to increasing -I or -Iw) then you should also decrease Fd so that more memory is returned each time a collection takes place. Enabling idle collections is important if you want your program to return memory to the operating system and promptly run finalisers. In the past it has sometimes been recommended that long running applications disable idle collections in order to avoid unecessary work but now it is better to keep idle collections enabled but configure the -Iw option to avoid them happening too frequently. If you set -Fd0 then GHC will not attempt to return memory, which corresponds with the behaviour from releases prior to 9.2. You probably don’t want to do this as unless you have idle periods in your program the behaviour will be similar anyway. ### Analysis and Further Tweaking These two graphs show the difference between -Fd0 and -Fd4, with -Fd4 the memory usage returns to a baseline of around 4GB after spiking at 8GB. With -Fd0, the memory usage never retreats back below 7GB. If you want to retain a specific amount of memory then it’s better to set -H1G in order to communicate that you are happy with a heap size of 1G. If you do this then heap size will never decrease below this amount if it ever reaches this threshold. The collecting strategy also affects the fragmentation of the heap and hence how easy it is to return memory to a theoretical baseline. David in a previous post gave a much more in-depth explanation of fragmentation. In theory the compacting strategy has a lower memory baseline but practically it can be hard to reach the baseline due to how compacting never defragments. On the other hand, the copying collecting has a higher theoretical baseline but we can often get very close to it because the act of copying leads to lower fragmentation. ## Conclusion In this post I explained the new GHC option -Fd and how it can be configured to return additional memory to the OS during idle periods. Under the default settings this should result in a lower reported memory usage for most long-running applications in GHC 9.2. A big thank you to Hasura, an open-source GraphQL engine built in Haskell, for partnering with us and making the work presented here possible. ### Tweag I/O # Incubating the Haskell Foundation For the three of us, the launch of the Haskell Foundation was one of 2020’s few bright spots. In November 2020, Simon Peyton Jones announced the formation of the Haskell Foundation. Conceived during the first lockdowns of early 2020, the Haskell Foundation is a non-profit organization aimed at promoting adoption of the Haskell programming language and bolstering its core infrastructure. The Foundation has now raised approximately$500K in cash and financial commitments from sponsors, recruited a volunteer board of 14 directors and a full-time staff of two: Andrew Boardman, Executive Director (ED); and Emily Pillmore, Chief Technology Officer (CTO). Today, the Foundation has its own point of view, plans, sponsors, and community of volunteers. It has reached escape velocity. Our personal opinions now or back then don’t matter much, but we thought it might be nice to share the story of what sparked its inception.

In the end, the Haskell Foundation was launched with input from dozens of friends of Haskell. What we thought might be like rolling a boulder uphill turned out to be more like kicking a stone and starting a rockslide. It seems the Haskell Foundation is an idea whose time had come.

## Inception

In early 2020, Simon Peyton Jones got a call from Frank Rodorigo, a US-based entrepreneur, who was in the process of reviving Skills Matter, a community of tech creators, users and adopters. Skills Matter had run into financial difficulties in 2019, and Frank, together with his CTO Scott Conley, wanted to make sure Haskell was at the center of their reboot plans. He hoped to explore some ideas with Simon about the best ways to help Haskell. One thing led to the next: under the impetus of the rebooted Skills Matter, we brainstormed about what extra glue the community might need to bolster the lofty goals that so many seemed to have.

Over the years, we have been encouraged by the inspiration Haskell has given to other languages like Rust, Apple’s Swift and others. We even saw Java programmers singing the praises of lazy streams and anonymous functions. It had always seemed like just a matter of time before more programmers start wanting to use “The Real Thing”. While we saw some progress in the Haskell ecosystem, with the release of GHC 8.0, and efforts to eliminate “dependency hell” on the part of the Stack and Cabal projects, it didn’t feel like enough.

Enough for what? If you wanted to use Haskell in production at a company, you still had to be brave and determined. The evidence was anecdotal but hard to ignore. It ranged from conference attendees talking about how complicated Haskell has become, to concerns such as getting Haddock working with GHC’s new type system features, to the trouble enabling Stack to keep working with Hackage. Pull requests to some of the core libraries were languishing and many felt there were problems in those libraries simply going unaddressed.

To be sure, there were some people thinking about remedies with proposals like Simple Haskell and Boring Haskell. But we thought more was needed. We started with some goals:

• An easier on-ramp. Starting out with Haskell is harder than it should be, with a wealth of ways of setting up a Haskell dev machine (some of them out-of-date!). Even after this first step, newcomers often land on unmaintained wiki pages or other seldom helpful destinations.
• More inclusivity. To help the community grow, we wanted the occasional alienating post or dismissive comment to be reliably addressed.
• More progress, faster. We wanted to encourage more innovation across the many different projects that comprise Haskell. We also wanted to explore ways to help the committees and volunteers that make up the Haskell community to channel resources and volunteers to where they are needed most, and to ensure that each tool works well with others, forming a cohesive whole.
• Funding. We knew first-hand that many companies were already investing in Haskell to ease their own pain points, but their efforts weren’t very connected. We wanted to make all of the above more feasible with the help of sponsors.

## Gaining momentum

Having envisioned the outlines of the Haskell Foundation, what next? We wrote those ideas down in a live shared document we called “the whitepaper” and started gathering feedback from an ever widening group. We really wanted to iron out the wrinkles to avoid announcing something that could fall flat on its face.

As Simon Peyton Jones is the central architect and developer behind the Haskell ecosystem, the rest of us thought that his visible involvement and leadership would be essential. Happily, Simon agreed to stay involved and eventually spent way more time than he had planned.

We still had some worries. Would others consider affiliating? Were we stepping on anyone’s toes? We decided to take a slow and deliberate concentric rings approach to next reach out to core committees, then companies and large projects, then influential community members, to socialize the idea further.

We started with the Haskell.Org Committee. Expecting pushback, we were blown away by how eager the chairperson (Jasper van der Jeugt) was, and then in turn the rest of the haskell.org members: Ryan Trinkle, Emily Pillmore, Tikhon Jelvis, Rebecca Skinner, and Alex Garcia de Oliveira. They all became early and essential participants in the launch of the Foundation. It seems that the idea resonated strongly with some of their own ongoing discussions.

We were still a small team at this point — fewer than ten people, with Tim Sears as our chief day-to-day organizer. We kept adding anyone who wanted to help to a bi-weekly video call and kept iterating the whitepaper. After a short time it looked nothing like the first draft.

Feeling a bit braver, we then decided to also reach out to companies and key stakeholders in the community. We met skeptics along the way, like Michael Snoyman (now on the Foundation Board), both on the vision and the feasibility. These skeptics’ input turned out to be extremely helpful. Among other things, we used their feedback to sharpen the Foundation’s commitment to transparency. Ryan Trinkle (Obsidian Systems) soon started playing the role of shadow CFO. Duncan Coutts and Ben Gamari (Well-Typed) provided valuable input, as did Neil Mitchell, John Wiegley, Ed Kmett, Simon Marlow and others too numerous to mention.

It was important to connect with the Core Libraries Committee, the Hackage Trustees and the GHC Steering Committee. Emissaries were dispatched. Those groups ranged from amenable to enthusiastic, but they also asked some thorny questions. Did the community really need another committee? How would the Foundation differentiate itself? Could we actually raise money? In private, Simon Peyton Jones asked Tim why he wrote down a 7-figure sponsorship goal in an early draft of the whitepaper. Was it realistic? Tim had to admit he wasn’t sure, but without funding the Foundation would never have big impact. Only one way to find out…

At some point we started calling our informal cabal the HF Working Group. Eventually the invite list numbered in the dozens, with about 12 or so turning up regularly to our video chats. Scheduling was rarely a problem, since nobody was traveling - the silver lining of a terrible pandemic.

## The Foundation escapes quarantine

In August, the Working Group asked itself: “Haven’t we socialized this idea enough yet? Can’t we freeze and ratify the whitepaper? Why can’t we just launch?“. Just like that we turned a corner. We started a semi-public outreach effort on the following basis:

• The Foundation would be non-profit. Our goal would never be to create a consultancy or training company. Our goal would be to promote Haskell and related technologies. We would not be selling any services.
• The Foundation would be inclusive. It would seek input from a variety of sources and be community-driven. A goal of the Foundation would be to look like the community we want to become.
• The Foundation would be funded. We would start with ambitious fund-raising goals. Specifically, we wanted to aim for a yearly budget of over $1,000,000. Donations would come from industry and from the general Haskelling public. • The Foundation would have an executive director. Acknowledging that we are all busy people, well-meaning volunteers simply do not have the bandwidth to offer sustained attention to where it is needed. Instead, the Foundation would have at least one full-time employee, whose day job it is to manage the Haskell community and promote its interests. To our eyes, now informed by the community, this seemed like a winning formula — a design that would be able to fix Haskell’s problems and promote the language, while strengthening our community. The Foundation quickly became the world’s worst-kept secret as the Working Group set a public launch date for November. ## Announcement and bootstrapping Right away there was a new chicken-and-egg problem: we wanted the Board of Foundation to be drawn from the wider community, and yet we needed to have someone in charge so we could launch quickly. The HF Working Group landed on a two-step process. The Foundation would start with an interim Board and launch. The Board would then both replace itself and hire an ED to run the organization. The Working Group identified eight prominent Haskellers pre-launch and invited them to serve as an interim board. Thanks to Simon Peyton Jones’s persuasive powers, and a promise that their main duty would be limited to the above, they were quickly recruited. Tim Sears, Emily Pillmore, Ryan Trinkle, and Alex Garcia de Oliveira led the fundraising efforts, starting seriously in August with a launch date slated for November. Even before launch, the Foundation quickly landed over$200,000 in commitments. Soon we knew that we would have enough to fund an ED who could spend enough time to foster the sponsorship efforts. Ryan started the paperwork to incorporate the Haskell Foundation as a non-profit, while Emily took over the de facto running of the Working Group. Jasper led an Affiliation Track to coordinate transparency policies for the community.

Richard Eisenberg worked with other volunteers (Ben Gamari of Well-Typed, Davean Scies of xkcd, Emily, Moritz Angermann of IOHK, Tikhon Jelvis from the Haskell.Org committee, and Tim Sears) to develop an initial technical agenda, a list of the kinds of projects the Foundation would seek to accelerate. This was to be used in our fundraising pitches and in forming a starting place for the real work to come: we knew that, once the Foundation was made public and we had selected a board, the agenda could be revised in the light of the freedom of being able to consult the public.

Rebecca Skinner from the Haskell.Org committee (helped by Tim and Davean) volunteered to lead up the effort to create a website, in advance of the upcoming launch, which we decided to incorporate into the Haskell eXchange conference, hosted by Tweag partner Skills Matter. Cardano/IOHK provided much of the on-the-ground labor in putting the initial website draft together, handing it over to Rebecca to push it over the line for the launch.

Finally, on 4 November, 2020 Simon Peyton Jones publicly announced the Haskell Foundation. You can watch the video of his announcement. Despite the feverish pace of work leading up to that announcement, everyone knew that the real work was only beginning, but we had a launch!

The story we wanted to share now comes to an end.

## Epilogue

In December and January the interim Board recruited an outstanding slate of 14 members from a large group of applicants. As a special bonus they decided that the initial funding was sufficient to hire not only an ED, but also a CTO! Andrew Boardman joined as ED and Emily Pillmore stayed on in a permanent role as CTO. In its first meeting, the Board elected Richard Eisenberg as its chair.

The work of the Haskell Foundation proper is finally underway. This includes developing a process for community input, identifying projects to fund and otherwise support, and continuing outreach efforts. We expect that the creation of the Foundation will mark an inflection point in the history of Haskell. It is extremely gratifying to have played a role in helping a community that has given so much to us personally. We’re eager to see where it will go from here.

Finally, the Haskell Foundation needs volunteers, people just like you. The best place to reach the Foundation is via the Haskell Foundation category at the Haskell Discourse instance, though you can reach out directly to the Board or its chair, Andrew the ED, or Emily the CTO.

# Sprawdź nasze filmy w kategoriach

Sposobów na spędzanie wolnego czasu jest bez wątpienia wiele. Jednakże są wśród nich takie metody, które okazują się być nie tylko najpopularniejsze, ale również ponadczasowe. Mowa tu oczywiście o seansach domowych, a więc oglądaniu filmów online, w swoim własnym zaciszu domowym. Takie rozwiązanie jest korzystne szczególnie dla tych, którzy cenią sobie ogólny komfort i swobodę. Nie da się ukryć, że obejrzenie filmu na sali kinowej w towarzystwie wielu osób, a oglądanie go w swoich przysłowiowych czterech kątach w pojedynkę bądź wraz z naszymi najbliższymi to zdecydowanie duża różnica. Oprócz tego poszczególne platformy online, na których możliwe jest obejrzenie różnorodnych filmów mają naprawdę szeroki wybór. Jednym z takich stron internetowych jest właśnie zalukaj.

## Dlaczego warto korzystać z zalukaj?

Przede wszystkim warto wiedzieć, że zalukaj to strona, na której masz możliwość nie tylko przejrzeć różne kategorie filmowe i propozycje filmów (których znajdziesz całe mnóstwo) ale również możesz obejrzeć taki film. Tak jak zostało już wspomniane, kategorii filmowych jest na zalukaj naprawdę wiele. W związku z tym możliwe jest obejrzenie zarówno takich propozycji jak Dramat, Kryminał, Thriller, Fantasy, Komedia, Sci-Fi jak i także film Przygodowy, Animacja, Familijny, Komedia rom., Sztuki walki, czy też Melodramat.

## Co jeszcze warto wiedzieć o zalukaj?

Nie da się nie zauważyć, że w ostatnim czasie wzrosła popularność filmów online. Bez żadnych wątpliwości można stwierdzić, że do największych zalet oglądania filmów dostępnych w Internecie jest fakt, iż takie filmy możemy oglądać w dogodnym dla siebie miejscu i czasie. Dlatego też najciekawsze filmy które nas intrygują bądź też wielokrotnie już obejrzane klasyki, które pragniemy sobie odtworzyć, możemy oglądać zarówno z łóżka, ulubionego fotela, u znajomych czy nawet podczas podróży. Niezależnie więc od tego, czy preferujesz taki gatunek jak film Kostiumowy, Akcja, Historyczny, Romans, Krótkometrażowy, Horror, Western bądź też chociażby Biograficzny, Polityczny, Wojenny, Muzyczny, Psychologiczny, Dokumentalny lub Czarna komedia czy Biblijny - każdy z tych gatunków jest z pewnością warty obejrzenia. Warto także wiedzieć, że na stronie internetowej zalukaj istnieje możliwość założenia swojego konta lub konta vip, aby być ze wszystkim na bieżąco i korzystać z jeszcze większej ilości ofert.

# How to build hybrid Haskell and Java programs

When working in a sufficiently large Haskell project, one eventually faces the question of whether to implement missing libraries in Haskell or borrow the implementation from another language, such as Java.

For a long time now, it has been possible to combine Haskell and Java programs using inline-java. Calling Java functions from Haskell is easy enough.

import Language.Java (J, JType(Class)) -- from jvm
import Language.Java.Inline (imports, java) -- from inline-java

imports "org.apache.commons.collections4.map.*"

createOrderedMap :: IO (J ('Class "org.apache.commons.collections4.OrderedMap"))
createOrderedMap = [java| new LinkedMap() |]

The ghc and javac compilers can cooperate to build this program provided that both know where to find the program’s dependencies. Back in the day, when inline-java was being born, there was no build tool capable of pulling the dependencies of both Haskell and Java, or at least not without additional customization. Since then, however, Tweag has invested effort into enabling Bazel, a polyglot build system, to build Haskell programs. In this blog post we go over the problems of integrating build tools designed for different languages, and how they can be addressed with Bazel, as an example of a single tool that builds them all. More specifically, this post also serves as a tutorial for using inline-java with Bazel, which is a requirement for the latest inline-java release.

## Dependencies done the hard way

Suppose we rely on cabal-install or stack to install the Haskell dependencies. This would make the inline-java and jvm packages available. But these tools are specialized to build Haskell packages. If our program also depends on the commons-collections4 Java package, we can’t rely on Cabal to build it. We need help from some other Java-specific package manager.

We could rely on maven or gradle to install common-collections4. At that point we can build our project by invoking ghc and telling javac where to find the java dependencies in the system via environment variables (i.e. CLASSPATH).

With some extra work, we could even coordinate the build systems so one calls to the other to collect all the necessary dependencies and invoke the compilers. This is, in fact, what inline-java did until recently.

But there is a severe limitation to this approach: no build system can track changes to files in the jurisdiction of the other build system. The Cabal-based build system is not going to notice if we change the gradle configuration to build a different version of common-collections4, or if we change a source file on the Java side. Easy enough, we could run gradle every time we want to rebuild our program, just in case something changed in the Java side. But then, should the Haskell build system rebuild the Haskell side? Or should it reuse the artifacts produced in an earlier build?

We could continue to extend the integration between build systems so one can detect if the artifacts produced by the other have changed. Unfortunately, this is a non-trivial task and leads to reimplementing features that build systems already implement for their respective languages. We would be responsible for detecting changes on every dependency crossing the language boundary.

If that didn’t sound bad enough, incremental builds is not the only concern requiring coordination. Running tests, building deployable artifacts, remote builds and caches, also involve both build systems.

## Dependencies with a polyglot build system

A straightforward answer is to use only one build system to build all languages in the project. We chose to turn to Bazel for our building needs.

Bazel lets us express dependencies between artifacts written in various languages. In this respect, it is similar to make. However, Bazel comes equipped with sets of rules, such as rules_haskell, for many languages, which know how to invoke compilers and linkers; these rules are distributed as libraries and are reusable across projects. With make, the user must manually encode all of this knowledge in a Makefile herself. It’s not the subject of this blog post, but Bazel comes with a number of other perks, such as hermeticity of builds for reproducibility, distributed builds, and remote caching.

We offer a working example in the inline-java repository. In order to specify how to build our Haskell program, we start by importing a rule to build Haskell binaries from rules_haskell.

# file: BUILD.bazel
)

name = "example",
srcs = ['Main.hs'],
extra_srcs = ["@openjdk//:rpath"],
compiler_flags = [
"-optl-Wl,@$(location @openjdk//:libjvm.so)", "-threaded", ], deps = [ "//jvm", "//jni", "//:inline-java", "@rules_haskell//tools/runfiles", "@stackage//:base", "@stackage//:text", ] + java_deps, data = [ ":jar_deploy.jar" ], plugins = ["//:inline-java-plugin"], ) java_deps = [ "@maven//:org_apache_commons_commons_collections4_4_1", ] The load instruction is all we need to do to invoke the rules_haskell library and access its various rules. Here we use the haskell_binary rule to build our hybrid program. Besides the fact that Main.hs is written in both Haskell and Java, one can tell the hybrid nature of the artifact by observing the dependencies in the deps attribute, which refers to both Haskell and Java libraries. • //jvm, //jni and //:inline-java refer to Haskell libraries implemented in the current repository • @rules_haskell//tools/runfiles refers to a special Haskell library defined as part of the rules_haskell rule set. More on this below. • @stackage//:base and @stackage//:text refer to Haskell packages coming from stackage • @maven//:org_apache_commons_commons_collections4_4_1 is a Java library coming from a maven repository Additional configuration in the WORKSPACE file makes precise the location of dependencies coming from rules_haskell, stackage and maven, where the stackage snapshot and the list of maven repositories is specified. More information on the integration with stackage can be found in an earlier post and with maven in the documentation of rules_jvm_external. While the dependencies in deps are made available at build time, the attribute data = [":jar_deploy.jar"] declares a runtime dependency. Specifically, it makes the Java artifacts available to the Haskell program. The ":jar_deploy.jar" target is a jar file produced with the following rule from the Java rule set. A future version of rules_haskell may generate this runtime dependency automatically, but for the time being we need to add it manually. java_binary( name = "jar", main_class = "dummy", runtime_deps = java_deps, ) Now, when there are changes in the Java dependencies, Bazel will know to rebuild the Haskell artifacts, and only if there were changes. Bazel makes sure that jar_deploy.jar is built, and stores it in an appropriate location. But nothing, so far, tells the Haskell program where to find this file. This is where the runfiles library comes into play. Bazel lays out runtime dependencies, such as jar_deploy.jar following known rules; the runfiles library abstracts over these rules. To complete our example, we need the main function to call to the runfiles library and discover the jar file’s location. import qualified Bazel.Runfiles as Runfiles import Data.String (fromString) import Language.Java (J, JType(Class), withJVM) ... main = do r <- Runfiles.create let jarPath = Runfiles.rlocation r "my_workspace/example/jar_deploy.jar" withJVM [ "-Djava.class.path=" <> fromString jarPath ]$
void createOrderedMap

## Closing thoughts

Mixing languages is challenging both from the perspective of the compiler and of the build system. In this blog post we provided an overview of the challenges of integrating build systems, and how a unifying build system can offer a more practical framework for reusing language-specific knowledge.

Bazel is a materialization of this unifying build system, where rule sets are the units of reuse. We recently moved inline-java builds to rely on Bazel, by depending strongly on the rule sets for Haskell and Java. This implies that end users will also have to use Bazel. Though this means departing from stack and cabal-install, the build tools that Haskellers are used to, we hope that it will offer a better path for adopters to build their multi-language projects. And since rules_haskell can still build Cabal packages and stackage snapshots via Cabal-the-library, going with Bazel doesn’t forego the community effort invested in curating the many packages in the Haskell ecosystem.

# Mandelbrot Maps --- an invitation

[An open request from three of my students. Please try it out! I'm impressed with what they have achieved.]

We'd like to invite you to take a look at Mandelbrot Maps - an interactive fractal explorer!

https://jmaio.github.io/mandelbrot-maps/

This has been my honours project for the past two years, and this year it's also been updated with contributions from Fraser Scott and Georgina Medd. (cheers!)

There's a [Help] menu with information about the various available options: hopefully you'll learn something new!

If you could take a few minutes to look around, we'd really appreciate it!
Feel free to leave feedback either through the button on the website ([Settings] > [Info] > [Feedback]) or directly:
https://forms.office.com/r/uRQwkQvVpL

(This study was certified according to the Informatics Research Ethics Process, RT number 2019/22202)

Thank you!
Joao
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

## March 23, 2021

### Douglas M. Auclair (geophf)

• 2021-03-23:

You have [a] and (a -> IO b).

You want IO [(a, b)]

That is, you want to pair your inputs to their outputs for further processing in the IO-domain.

• Chris Martin @chris__martin:

\as f -> traverse @ [] @ IO (\a -> f a >>= \b ->

return (a, b)) as

• cλementd Children crossing @clementd:

traverse (sequenceA . id &&& f)

(actually, tranverse (sequence . (id &&& f)))

Or p traverse (traverse f . dup)

• 2021-03-23: You have

[([a], [b])]

You want ([a], [b])

so: [([1,2], ["hi"]), ([3,4], ["bye"])]

becomes ([1,2,3,4],["hi","bye"])

• karakfa @karakfa:

conc xs = (concatMap fst xs, concatMap snd xs)

# Why Should Anyone use Colours?

Marco Patrignani explains why and how to use colour in your technical papers. The guidelines for making highlighting useful even for colour-blind folk are particularly helpful.

# Penrose Kite and Dart Tilings with Haskell Diagrams

Infinite non-periodic tessellations of Roger Penrose’s kite and dart tiles.

As part of a collaboration with Stephen Huggett, working on some mathematical properties of Penrose tilings, I recognised the need for quick renderings of tilings. I thought Haskell diagrams would be helpful here, and that turned out to be an excellent choice. Two dimensional vectors were well-suited to describing tiling operations and these are included as part of the diagrams package.

This literate Haskell uses the Haskell diagrams package to draw tilings with kites and darts. It also implements the main operations of inflate and decompose which are essential for constructing tilings (explained below).

Firstly, these 5 lines are needed in Haskell to use the diagrams package:

{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts          #-}
{-# LANGUAGE TypeFamilies              #-}
import Diagrams.Prelude
import Diagrams.Backend.SVG.CmdLine

These are the kite and dart tiles.

The red line marking here on the right hand copies, is purely to illustrate rules about how tiles can be put together for legal (non-periodic) tilings. Obviously edges can only be put together when they have the same length. If all the tiles are marked with red lines as illustrated on the right, the vertices where tiles meet must all have a red line or none must have a red line at that vertex. This prevents us from forming a simple rhombus by placing a kite top at the base of a dart and thus enabling periodic tilings.

All edges are powers of the golden section $\phi$ which we write as phi.

phi::Double
phi = (1.0 + sqrt 5.0) / 2.0

So if the shorter edges are unit length, then the longer edges have length phi. We also have the interesting property of the golden section that $phi^2 = phi + 1$ and so $1/phi = phi-1$, $phi^3 = 2phi +1$ and $1/phi^2 = 2-phi$.

All angles in the figures are multiples of tt which is 36 deg or 1/10 turn. We use ttangle to express such angles (e.g 180 degrees is ttangle 5).

ttangle:: Int -> Angle Double
ttangle n = (fromIntegral (n mod 10))*^tt
where tt = 1/10 @@ turn

## Components

In order to implement inflate and decompose, we need to work with half tiles. These can be represented with a simple 2D vector to provide orientation and scale. We will call these components : Left Dart, Right Dart, Left Kite, Right Kite and use the type

data Component = LD (V2 Double)
| RD (V2 Double)
| LK (V2 Double)
| RK (V2 Double)

The vector represents the join edge of each half tile where halves come together. The origin for a dart is the tip, and the origin for a kite is the acute angle tip (marked in the figure with a red dot).

These are the only 4 components we use (oriented along the x axis)

ldart,rdart,lkite,rkite:: Component
ldart = LD unitX
rdart = RD unitX
lkite = LK (phi*^unitX)
rkite = RK (phi*^unitX)

Perhaps confusingly we regard left and right of a dart differently from left and right of a kite when viewed from the origin. The diagram shows the left dart before the right dart and the left kite before the right kite. Thus in a complete tile, going clockwise round the origin the right dart comes before left dart, but the left kite comes before right kite.

When it comes to drawing components, for the simplest case, we just want to show the two tile edges of each component (and not the join edge). These edges are calculated as a list of 2 new vectors, using the join edge vector v. They are ordered clockwise from the origin of each component

compEdges:: Component -> [V2 Double]
compEdges (LD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 9) v
compEdges (RD v) = [v',v ^-^ v'] where v' = phi*^rotate (ttangle 1) v
compEdges (RK v) = [v',v ^-^ v'] where v' = rotate (ttangle 9) v
compEdges (LK v) = [v',v ^-^ v'] where v' = rotate (ttangle 1) v

Now drawing lines for the 2 outer edges of a component is simply

drawComp:: Component -> Diagram B
drawComp = strokeLine . fromOffsets . compEdges

It is also useful to calculate a list of the 4 tile edges of a completed half-tile component clockwise from the origin of the tile. (This is useful for colour filling a tile)

tileEdges:: Component -> [V2 Double]
tileEdges (LD v) =
compEdges (RD v) ++ map (zero ^-^) (reverse (compEdges (LD v)))
tileEdges (RD v) = tileEdges (LD v)
tileEdges (LK v) =
compEdges (LK v) ++ map (zero ^-^) (reverse (compEdges (RK v)))
tileEdges (RK v) = tileEdges (LK v)

To fill whole tiles with colours, darts with dcol and kites with kcol we use fillDK. This uses only the left components to identify the whole tile and ignores right components so that a tile is not filled twice.

fillDK:: Colour Double -> Colour Double -> Component -> Diagram B
fillDK dcol kcol c = case c of
(LD _) -> (strokeLoop $glueLine$ fromOffsets $tileEdges c) # fc dcol (LK _) -> (strokeLoop$ glueLine $fromOffsets$ tileEdges c)  # fc kcol
_      -> mempty

To show half tiles separately, we will use drawJComp which adds the join edge of each half tile to make loops with 3 edges.

drawJComp:: Component -> Diagram B
drawJComp = strokeLoop . closeLine . fromOffsets . compEdges

Apart from drawing, we can also scale a component

scaleComp:: Double -> Component -> Component
scaleComp = compMap . scale 

where compMap applies a function to the vector of a component

compMap f (LD v) = LD (f v)
compMap f (RD v) = RD (f v)
compMap f (LK v) = LK (f v)
compMap f (RK v) = RK (f v)

and rotate a component by an angle

rotateComp :: Angle Double -> Component -> Component
rotateComp = compMap . rotate 

(Positive rotations are in the anticlockwise direction.)

## Patches

A patch is a list of components each with an offset represented by a vector

type Patch = [(V2 Double, Component)]

To turn a whole patch into a diagram using some function cd for drawing the components, we use

decomps cpt = vsep 1 $fmap drawPatch$ take 5 $decompositions [(zero,cpt)]  We have made use of the fact that we can create an infinite list of finer and finer decompositions of any patch, using: decompositions:: Patch -> [Patch] decompositions p = inf where inf = p:fmap decompose inf We could get the n-fold decomposition of a patch as just the nth item in such a list multiDecomp :: Int -> Patch -> Patch multiDecomp n p = decompositions p !! n For example, here is an infinite list of decomposed versions of sun. suns = decompositions sun The coloured tiling shown at the beginning is simply 6 decompositions of sun displayed using fillDK sun6 = suns!!6 filledSun6 = patchWith (fillDK red blue) sun6 # lw ultraThin The earlier figure illustrating larger kites and darts emphasised from the smaller ones is also sun6 but this time drawn with experimentFig = patchWith experiment sun6 # lw thin where components are drawn with experiment:: Component -> Diagram B experiment c = emph c <> (drawComp c # dashingN [0.002,0.002] 0 # lw ultraThin) where emph c = case c of (LD v) -> (strokeLine . fromOffsets) [v] # lc red -- emphasise join edge of darts in red (RD v) -> (strokeLine . fromOffsets) [v] # lc red (LK v) -> (strokeLine . fromOffsets) [rotate (ttangle 1) v] -- emphasise long edge for kites (RK v) -> (strokeLine . fromOffsets) [rotate (ttangle 9) v] ## Inflation You might expect inflation (also called composition) to be a kind of inverse to decomposition, but it is a bit more complicated than that. With our current representation of components, we can only inflate single components. This amounts to embedding the component into a larger component that matches how the larger component decomposes. There is thus a choice at each inflation step as to which of several possibilities we select as the larger half-tile. We represent this choice as a list of alternatives. This list should not be confused with a patch. It only makes sense to select one of the alternatives giving a new single component. The earlier diagram illustrating how decompositions are calculated also shows the two choices for inflating a right dart into either a right kite or a larger right dart. There will be two symmetric choices for a left dart, and three choices for left and right kites. Once again we work with offsets so we can calculate the offset for any resulting component (to ensure the larger component contains the original in its original position in a decomposition). inflate :: (V2 Double, Component) -> [(V2 Double, Component)] inflate (v, RD vd) = [(v ^+^ v', RD vd') ,(v, RK vk ) ] where v' = (phi+1) *^ vd -- vd*phi^2 vd' = rotate (ttangle 9) (vd ^-^ v') vk = rotate (ttangle 1) v' inflate (v, LD vd) = [(v ^+^ v', LD vd') ,(v, LK vk ) ] where v' = (phi+1) *^ vd -- vd*phi^2 vd' = rotate (ttangle 1) (vd ^-^ v') vk = rotate (ttangle 9) v' inflate (v, RK vk) = [(v, LD vk ) ,(v ^+^ lv', LK lvk') ,(v ^+^ rv', RK rvk') ] where lv' = phi*^rotate (ttangle 9) vk rv' = phi*^rotate (ttangle 1) vk rvk' = phi*^rotate (ttangle 7) vk lvk' = phi*^rotate (ttangle 3) vk inflate (v, LK vk) = [(v, RD vk ) ,(v ^+^ rv', RK rvk') ,(v ^+^ lv', LK lvk') ] where v0 = rotate (ttangle 1) vk lv' = phi*^rotate (ttangle 9) vk rv' = phi*^rotate (ttangle 1) vk rvk' = phi*^rotate (ttangle 7) vk lvk' = phi*^rotate (ttangle 3) vk As the result is a list of alternatives, we need to select one to do further inflations. We can express all the alternatives after n steps as inflations n where inflations :: Int -> (V2 Double, Component) -> [(V2 Double, Component)] inflations 0 vc = [vc] inflations n vc = [vc'' | vc' <- inflate vc, vc'' <- inflations (n-1) vc'] (That last line could be written more succinctly with do notation) This figure illustrates 5 consecutive choices for inflating a left dart to produce a left kite. On the left, the finishing component is shown with the starting component embedded, and on the right the 5-fold decomposition of the result is shown. fiveInflate = hsep 1$ fmap drawPatch [[ld,lk'], multiDecomp 5 [lk']]
-- two seperate patches
where
ld  = (zero,ldart)
lk  = inflate ld  !!1
rk  = inflate lk  !!1
rk' = inflate rk  !!2
ld' = inflate rk' !!0
lk' = inflate ld' !!1

Finally, at the end of this literate haskell program we choose which figure to draw as output.

fig::Diagram B
fig = filledSun6
main = mainWith fig

That’s it. But, What about inflating whole patches?, I hear you ask. Unfortunately we need to answer questions like what components are adjacent to a component in a patch and whether there is a corresponding other half for a component. These cannot be done with our simple vector representations. We would need some form of planar graph representation, which is much more involved. That is another story.

Many thanks to Stephen Huggett for his inspirations concerning the tilings. A library version of the above code is available on GitHub

## Further reading on Penrose Tilings

As well as the Wikipedia entry Penrose Tilings I recommend two articles in Scientific American from 2005 by David Austin Penrose Tiles Talk Across Miles and Penrose Tilings Tied up in Ribbons.

There is also a very interesting article by Roger Penrose himself: Penrose R Tilings and quasi-crystals; a non-local growth problem? in Aperiodicity and Order 2, edited by Jarich M, Academic Press, 1989.

# The fundamental problem solved by blockchain

This post originally appeared on Edsko’s personal blog.

There are many blog posts and descriptions of blockchain technology on the web, but I feel that very few of them get to the heart of the matter: what is the fundamental problem that blockchains solve? In this blog post I want to take a look at this question; this blog post is intended for a general audience and I will assume no technical knowledge. That does not mean however that we will need to dumb things down; we will see what the fundamental problem solved by blockchain is, and we will even see two alternative ways in which this problem is solved, known as proof-of-work (used by for example Bitcoin) and proof-of-stake (used by for example the Cardano blockchain).

Full disclosure: I was one of the lead engineers on the Cardano blockchain, where I worked on the wallet specification, researched coin selection, designed the architecture of the consensus layer and wrote the hard fork combinator. This blog post is however just my personal perspective.

## Motivation

Suppose a group of people want to share information with each other. The nature of that information does not really matter; for the sake of having an example, let’s say it’s a group of companies who want to share supply availability, purchase orders, etc. How might we go about setting up a system to make this possible?

### Bulletin board

As a starting point, perhaps we could just set up a server where everybody can post messages and everybody can see everybody else’s messages. For reasons that will become clear later, we will refer to the messages that users post as blocks: users can post arbitrary blocks of data.1 Suppose we have some suppliers A, B, C, ..; they might be posting blocks (messages) such as this:

 Block contents Number 0 A has 1000 sprockets available 1 B has 500 springs available 2 A looking for a supplier of 300 springs 3 C agrees to buy 100 sprockets from A at €2/sprocket

(Block numbers added just for clarity, they are not part of the block itself.)

Systems like this are basically just some kind of bulletin board and have been around in digital form since before Internet times, and obviously in non-digital form basically forever.

### Signatures

Naturally, the problem with this system is that nothing is preventing people from posting messages that aren’t true. For the system to be useful as a supply chain system, suppliers must be able to depend on the messages that get posted. An important and obvious first step towards addressing this problem is to require messages to be signed. Digital signatures are a well-understood problem; the exact details of how it’s done don’t really matter.

With the addition of signatures, the server might now look something like this:

Block contents Signed by
A has 1000 sprockets available A
B has 500 springs available B
A looking for a supplier of 300 springs A
C agrees to buy 100 sprockets from A at €2/sprocket A, C

### Replay attacks

The addition of signatures is a big improvement, but it’s not yet sufficient. For example, perhaps A agreed to sell 100 sprockets to C at €2/sprocket only as a once-off; maybe it was some kind of special introductory price. The fact that A agreed to do this once does not necessarily mean that A will agree to do it again, but in our system so far nothing is stopping C from reposting the message again:

Block contents Signed by
A has 1000 sprockets available A
B has 500 springs available B
A looking for a supplier of 300 springs A
C agrees to buy 100 sprockets from A at €2/sprocket A, C
other messages
C agrees to buy 100 sprockets from A at €2/sprocket A, C

After all, the message is signed by both A and C and so it looks legit. This is known as a replay attack, and there are many ways to solve it. Here we will consider just one way: we will introduce an explicit ordering; as will see, this will be important later as well.

To introduce an ordering, each block will record the hash of its predecessor block. You can think of a hash of a block as a mathematical summary of the block: whenever the block changes, so does the summary.

Hash Pred Block contents Signed by
5195 none A has 1000 sprockets available A
5843 5195 B has 500 springs available B
2874 5843 A looking for a supplier of 300 springs A
9325 2874 C agrees to buy 100 sprockets from A at €2/sprocket A, C

Crucially, the signature also covers the block’s predecessor; this is what is preventing the replay attack: C cannot post the same message again, because if they did, that new block must have a new predecessor, and that would then require a new signature:

Hash Pred Block contents Signed by
5195 none A has 1000 sprockets available A
5843 5195 B has 500 springs available B
2874 5843 A looking for a supplier of 300 springs A
9325 2874 C agrees to buy 100 sprockets from A at €2/sprocket A, C
9325 other messages
6839
7527 6839 C agrees to buy 100 sprockets from A at €2/sprocket ???

Notice how this is creating a chain of blocks: every block linked to its predecessor. In other words, a block chain; although what we have described so far isn’t really a blockchain in the normal sense of the word just yet.

### Eliminating the central server

So far all messages have been stored on some kind of central server. One worry we might have here is what happens when that server goes down, suffers a hard disk failure, etc. However, server replication, data backup, etc. are again well-understood problems with known solutions; that won’t concern us here.

A more important question is: what happens if we cannot trust the central server? First, notice what the server can not do: since messages are signed by users, and the server cannot recreate those signatures, it is not possible for the server to forge new messages. The server can however delete messages, or simply not show all messages to all users.

Let’s first consider why the server might even want to try. Suppose the server currently has the following blocks:

Hash Pred Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
8861 2547 A agrees to sell 800 sprockets to B A

In this example, A declares to have a sprocket supply, and both B and C have ratified the message. Now when A agrees to sell sprockets to B, B knows that A actually has these sprockets available, and so might transfer the money for the sprockets to A. If A tries to sell the sprockets to both B and C, C will know something is amiss:

Hash Pred Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
8861 2547 A agrees to sell 800 sprockets to B A
1010 8861 A agrees to sell 800 sprockets to C A

If A only has 1000 sprockets, they can’t be selling 800 sprockets to both B and C. But suppose the server is in cahoots with A, and simply doesn’t show the transaction with B to C; in other words, the chain that the server might present to C could be

Hash Pred Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
6079 2547 A agrees to sell 800 sprockets to C A

Now C doesn’t know that anything is wrong. This way A can basically sell their stock to as many people as they want, collect the money, and disappear. This is known as a double spending attack, and we cannot fix it if we stick with a central server: after all, the chains that B and C see are both perfectly reasonable and plausible; the attack consists of showing a different chain to different users, even though the chain that each user sees is itself valid. As we will see, this is the problem that the blockchain solves.

### Revising history

Before we get to the real blockchain, however, we should discuss one more thing. Consider again the example from the previous section, where the server is untrustworthy, and is cooperating with A to dupe both B and C. Unlike in the previous example, however, suppose that the transaction with A was not the last transaction on the chain; perhaps the server looks like this instead:

Hash Pred Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
8861 2547 A agrees to sell 800 sprockets to B A
3598 8861 Some other message D

If the server now wants to present an alternative chain to C, in which A is selling the sprockets to C instead of to B, it cannot include the message signed by D. The message signed by D records the hash of its predecessor; if the server changes that predecessor (A selling to C instead of A selling to B), it would have a different hash, and would therefore require a new signature:

Hash Pred Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
6079 2547 A agrees to sell 800 sprockets to C A
4592 6079 Some other message ???

However, this unfortunately does not help us. First of all, notice that the server could simply truncate the chain and not include the message signed by D when presenting the chain to C. Moreover, even when the chain does include additional messages, messages earlier on the chain are still not trustworthy. The problem is that we don’t know who D is. Perhaps D is the server itself; if so, then the existence of a message signed by D tells us nothing. Indeed, in the final example above, the server could just construct a new signature in the place of the “???”.

## Proof of work: solving puzzles

If we cannot trust the central server, perhaps we can simply allow everyone to produce blocks. Everyone keeps a copy of the block chain, and when someone wants to add a message to the chain they just create it themselves and send it everybody else. If two people happen to create a block at the same time, no problem: we just pick one of the two blocks, and whoever “lost” will just create a new block on top of the block that got adopted, no big deal.

This by itself does not solve anything just yet. A could just take the block that contains the transaction with B and send it to B but not to C, and vice versa. Just like before, even if B and C would wait for additional blocks to be added on top of the block they just received, they have no guarantee that those blocks are not being produced by nodes that are cooperating with A.

To even have a chance of having a way out of this impasse we will need to impose an assumption: although some participants may be untrustworthy, we will assume that the majority is honest. In the previous section we saw that once the chain includes an honest block, untrustworthy participants can no longer revise the blockchain’s history. Therefore, if the blockchain is operating normally, every new block that comes in increases the likelihood that the chain contains an honest block. After a while the probability that the most recent k blocks on the chain (for some number k) does not contain any honest block is so small that any information that is older than k can be considered trustworthy.2

Unfortunately, while the assumption that the majority of the participants is honest may seem reasonable, that does not mean that we can assume that the majority of the blocks we receive are produced by honest participants. After all, nothing is stopping untrustworthy participants from producing lots and lots of blocks. Unless… we introduce something that does stop that.

### Puzzles

This was the genius insight behind Bitcoin: we can translate the assumption “the majority of participants is honest” to “the majority of blocks is honest” if blocks are costly to produce. In order to produce a block, the block must solve a particular mathematical puzzle. It does not really matter what the puzzle is exactly, except that (1) it takes a computer a long time to solve, and (2) we can verify that the solution is correct very quickly. The answer to this puzzle is called proof of work (proof that you did the work that makes a block computationally expensive to produce), and must be included in every block:

Hash Pred Proof of work Block contents Signed by
2547 none A has 1000 sprockets available A, B, C
8861 2547 A agrees to sell 800 sprockets to B A
3598 8861 Some other message D

Notice how this prevents dishonest nodes from producing lots of different blocks: they would have to solve lots of different puzzles, which would simply be impossible in the time available. We do have to make our “honest majority” assumption a bit more precise for this to be viable: we must assume that the honest participants have more puzzle solving power than the dishonest participants; roughly speaking, the honest participants together must have more powerful computers than the dishonest participants.

This means that provided the honest participants are producing blocks, the blocks we receive over the network have at most a 50% probability of being produced by dishonest participants, and therefore the probability that no block is produced by an honest particant diminishes quickly; after 100 blocks that probability is roughly .000000000000000000000000000001.

### Chain selection

We mentioned above that it’s possible that two participants produce a block at the same time; when this happens, we just pick one arbitrarily, and continue; the participant who “lost” just produces a new block on top of the block that got picked, and we continue. It is possible that we discover this clash a bit later. For example, suppose B and C produce a block at the same time, D only sees the block produced by B and adds their block on top of Bs block:

 5285 none .. A 7115 5285 .. A 1413 7115 .. B 5980 7115 .. C 6679 1413 .. D

This is known as a fork in the blockchain. If all nodes are honest, how we resolve the fork does not actually matter terribly: as long as everybody applies the same rule, we end up with the same chain, and the blocks that “lost” are just reconstructed on the new fork. However, we want to throw away as little work as possible, and so we will opt to select the longest chain. More importantly, if the fork was intentionally produced by a dishonest participant, then picking the longer fork also gives us a higher probability of picking the fork with more blocks produced by honest participants: after all, in order to produce the longer fork, more puzzles had to be solved. We say that the difficulty of producing a chain increases with its length.

## Proof of stake

Although the introduction of the puzzles was a genius insight, it does have one very important drawback: it is extremely wasteful. A study in 2014 found that the global production of blocks on the Bitcoin block chain consumed as much energy as the entire country of Ireland.3 In a world where energy consumption is one of the greatest challenges to face mankind, this is obviously not a sustainable solution.

The key insight is that the important feature of proof-of-work is not necessarily that it is costly to produce blocks, but rather that we tie block production to some kind of resource of which participants have a limited amount. In proof of work that resource happens to puzzle solving (compute) power, but other choices are possible as well. In particular, most blockchains themselves are recording some kind of resource that is tied to a real-world resource: if the blockchain is supporting a cryptocurrency, then the amount of crypto someone owns can be considered their resource; for our running example, we could consider the amount of supplies that someone offers on the blockchain to be their resource.4

Let’s say someone owns 10% of all resources on the chain (i.e., they own 10% of the cryptocurrency, or in our example, 10% of the available supplies is theirs); we then say that they have a 10% stake. In a proof-of-stake blockchain, the probability that someone is allowed to produce a block is proportional to their stake: users with little stake can only produce a few blocks, users with a lot of stake can produce many. Our “honest majority” assumption now becomes: provided that the honest participants have more than 50% stake in the system, the probability that any block is produced by a dishonest participant will be less than 50%, just as for proof-of-work.

There are many ways to construct a proof-of-stake blockchain; what I will describe here is based on the Ouroboros consensus protocol5, the consensus protocol underlying the Cardano blockchain.

Here’s how a proof of stake blockchain works. Time is divided into slots; for simplicity, we can think of slots as just being equal to seconds (indeed, on the Cardano chain one slot takes one second), and slots are grouped into epochs; on the Cardano chain, one epoch contains a little under half a million slots, for a total of 5 days.6 At the start of each epoch, we run a lottery for each slot in the next epoch. Most of those slots will be empty7, but for some slots we will assign a slot leader: the participant that is allowed to produce a block for that slot. The details of how this lottery works are not very important for our current purposes, except to say that there are clear rules on how to run the lottery that every participant on the chain will follow and cannot really tamper with.

The only way for an dishonest participant to increase how many blocks they can produce is to increase their stake in the system. However, this has real life consequences: depending on how “stake” is defined exactly, it might mean they have to buy cryptocurrency, make more supply stock available, etc. Provided that the real world consequences of increasing stake are more than the potential benefit from attempting an attack on the system, the system is secure.

Put another way, participants with the most stake in the system are also most invested in the system: they need the system to succeed and continue to operate. Therefore, they are more trustworthy than nodes with less stake.

### Chain selection

Since block production is no longer costly (the whole point was to avoid wasteful energy consumption after all), it is not difficult for dishonest participants to construct long chains. Therefore if we have to choose between two chains, choosing the longer chain may not necessarily be the right choice. Instead, we will look at chain density. Suppose we have to choose between two chains:

We will count the number of blocks on each chains in a window of s slots; the exact value of s does not matter so much; let’s say s is 100,000 blocks.8 Recall the shape that the “honest majority” assumption takes in a proof-of-stake blockchain: the honest majority has more than 50% stake in the system. That means that within the window, the chain produced by the honest participants in the system will contain more blocks than the chain produced by dishonest participants. Therefore, whenever there is a fork in the chain, we must pick the chain that is denser (contains more blocks) in a window of slots at the intersection point. This is known as the Ouroboros Genesis rule.9

There is a useful special case for this rule. The mathematical analysis of the Ouroboros protocol10 tells us that when all participants in the network are following the protocol, their chains might be a little different near the tip (because not all forks are resolved yet), but the intersection point between all those chains will be no more than k blocks back. Moreover, blocks for “future” slots (slots that we haven’t reached yet11) are considered invalid. In other words, the situation looks a bit like this:

where none of the chains can exceed past “now”. In this case, the densest chain in the window will also be the longest; this means that for nodes that are up to date, the longest chain rule (just like in proof of work) can be used. Indeed, this is the rule in Ouroboros Praos, albeit with a side condition; for the details, please see my chapter “Genesis” in the Cardano Consensus Layer Report.

## Conclusions

Allowing a group of people to share information is not a difficult problem to solve; techniques such as digital signatures as well as ways to address replay attacks are well understood problems. Things get a lot trickier however if we drop the assumption that we can trust the server on which the data is stored. Although the server cannot manufacture data, it can selectively omit data. Detecting and guarding against this is very difficult: after all, when the server omits data, it is simply going back to an earlier state of the chain, in which that data was not yet present. Therefore almost by definition the version of the chain with some parts omitted must also be valid.

The solution to this is to allow all participants in the network to produce blocks. When we do that, we must have a way to resolving conflicts: when two or more nodes produce a block at the same time, how do we choose? If all participants are honest, we could just choose arbitrarily, but if dishonest participants are present, we have to be careful: if the dishonest participant can produce lots of blocks, then choosing randomly may favour the dishonest participant over honest participants.

This then is the fundamental problem solved by blockchain technology: how do we limit the number of blocks that dishonest participants can generate? One option, known as proof-of-work and first introduced by Bitcoin, is to make block production simply costly (require a lot of time for a computer to generate). This works, but is extremely wasteful of energy; Bitcoin is consuming as much energy as a small country. The better alternative is proof-of-stake: the blockchain itself records some finite resource, and participants are allowed to produce blocks in proportion to the amount of stake they have.

#### Footnotes

1. For the purposes of this blog post I will ignore the differences between blocks and transactions. Put another way, I will assume every block contains exactly one transaction.↩︎

2. This number k is known as the security parameter, and is one of the most important parameters in a blockchain.↩︎

3. Bitcoin Mining and its Energy Footprint, Karl J. O’Dwyer and David Malone. Available online See also Bitcoin’s energy consumption is underestimated: A market dynamics approach by Alex de Vries.↩︎

4. This only works if the supplies someone records on the blockchain are either validated or listing supplies on the blockchain comes with a legal commitment.↩︎

5. Ouroboros: A Provably Secure Proof-of-Stake Blockchain Protocol. Aggelos Kiayias, Alexander Russell, Bernardo David, and Roman Oliynykov. Available online↩︎

6. 432.000 of them, to be precise.↩︎

7. The proportion of slots we expect to be empty on average is known as the active slot coefficient, and is usually denoted by the variable f.↩︎

8. There are constraints on the value of s. When s is too small, doing a density comparison is not meaningful, because the probability that adversarial nodes have more than 50% stake within the window by pure luck is too large. When s is too large, the leadership schedule within the window is no longer determined by the ledger state at the intersection point, and so the adversary can start to influence the leadership schedule. In Cardano, s is set to 3k/f slots.↩︎

9. Ouroboros Genesis: Composable Proof-of-Stake Blockchains with Dynamic Availability. Christian Badertscher, Peter Gaži, Aggelos Kiayias, Alexander Russel and Vassilis Zikas. Available online↩︎

10. Ouroboros Praos: An adaptively-secure, semi-synchronous proof-of-stake protocol. Bernardo David, Peter Gaži, Aggelos Kiayias and Alexander Russell. Available online↩︎

11. Although this means that the protocol relies on a shared notion of “time” between all participants (a wallclock), the resolution of this clock does not need to be particularly precise, and standard solutions such as NTP suffice.↩︎

# Todo items in org-roam, an update

I got an email from Mr Z with a nice modification to the code in my post on keeping todo items in org-roam.

He already had a bunch of agenda files that he wanted to keep using (I had so few of them that I'd simply converted them to roam files). Here's the solution he shared with me:

(defvar roam-extra-original-org-agenda-files nil
"Original value of  org-agenda-files'.")

(defun roam-extra:update-todo-files (&rest _)
"Update the value of org-agenda-files'."
(unless roam-extra-original-org-agenda-files
(setq roam-extra-original-org-agenda-files org-agenda-files))
(setq org-agenda-files
(append roam-extra-original-org-agenda-files
(roam-extra:todo-files))))


It's a rather nice modification I think. Thanks to Mr Z for agreeing to let me share it here.

# Barbie and KenJSON

After higher-kinded data (HKD) and barbies were mentioned in episode 35 of Haskell Weekly I've been wondering if it could be used in combination with aeson to do validation when implementing web services.

TLDR; I think it'd work, but I have a feeling I'd have to spend some more time on it to get an API with nice ergonomics.

## Defining a type to play with

I opted to use barbies-th to save on the typing a bit. Defining a simple type holding a name and an age can then look like this

declareBareB
[d|
data Person = Person {name :: Text, age :: Int}
|]

deriving instance Show (Person Covered Identity)
deriving instance Show (Person Covered Maybe)
deriving instance Show (Person Covered (Either Text))


The two functions from the Barbies module documentation, addDefaults and check, can then be written like this

addDefaults :: Person Covered Maybe -> Person Covered Identity -> Person Covered Identity
where
trans m d = maybe d pure m

check :: Person Covered (Either Text) -> Either [Text] (Person Covered Identity)
check pe = case btraverse (either (const Nothing) (Just . Identity)) pe of
Just pin -> Right pin
Nothing -> Left $bfoldMap (either (: []) (const [])) pe  I found it straight forward to define some instances and play with those functions a bit. ## Adding in JSON The bit that wasn't immediately obvious to me was how to use aeson to parse into a type like Person Covered (Either Text). First off I needed some data to test things out with. bs0, bs1 :: BSL.ByteString bs0 = "{\"name\": \"the name\", \"age\": 17}" bs1 = "{\"name\": \"the name\", \"age\": true}"  To keep things simple I took baby steps, first I tried parsing into Person Covered Identity. It turns out that the FromJSON instance from that doesn't need much thought at all. (It's a bit of a pain to have to specify types in GHCi all the time, so I'm throwing in a specialised decoding function for each type too.) instance FromJSON (Person Covered Identity) where parseJSON = withObject "Person"$
\o -> Person <$> o .: "name" <*> o .: "age" decodePI :: BSL.ByteString -> Maybe (Person Covered Identity) decodePI = decode  Trying it out on the test data gives the expected results λ> let i0 = decodePI bs0 λ> i0 Just (Person {name = Identity "the name", age = Identity 17}) λ> let i1 = decodePI bs1 λ> i1 Nothing  So far so good! Moving onto Person Covered Maybe. I spent some time trying to use the combinators in Data.Aeson for dealing with parser failures, but in the end I had to resort to using <|> from Alternative. instance FromJSON (Person Covered Maybe) where parseJSON = withObject "Person"$
\o -> Person <$> (o .: "name" <|> pure Nothing) <*> (o .: "age" <|> pure Nothing) decodePM :: BSL.ByteString -> Maybe (Person Covered Maybe) decodePM = decode  Trying that out I saw exactly the behaviour I expected, i.e. that parsing won't fail. (Well, at least not as long as it's a valid JSON object to being with.) λ> let m0 = decodePM bs0 λ> m0 Just (Person {name = Just "the name", age = Just 17}) λ> let m1 = decodePM bs1 λ> m1 Just (Person {name = Just "the name", age = Nothing})  With that done I found that the instance for Person Covered (Either Text) followed quite naturally. I had to spend a little time on getting the types right to parse the fields properly. Somewhat disappointingly I didn't get type errors when the behaviour of the code turned out to be wrong. I'm gussing aeson's Parser was a little too willing to give me parser failures. Anyway, I ended up with this instance instance FromJSON (Person Covered (Either Text)) where parseJSON = withObject "Person"$
\o -> Person <$> ((Right <$> o .: "name") <|> pure (Left "A name is most needed"))
<*> ((Right <$> o .: "age") <|> pure (Left "An integer age is needed")) decodePE :: BSL.ByteString -> Maybe (Person Covered (Either Text)) decodePE = decode  That does exhibit the behaviour I want λ> let e0 = decodePE bs0 λ> e0 Just (Person {name = Right "the name", age = Right 17}) λ> let e1 = decodePE bs1 λ> e1 Just (Person {name = Right "the name", age = Left "An integer age is needed"})  ## In closing I think everyone will agree that the FromJSON instances are increasingly messy. I think that can be fixed by putting some thought into what a more pleasing API should look like. I'd also like to mix in validation beyond what aeson offers out-of-the-box, which really only is "is the field present?" and "does the value have the correct type?". For instance, Once we know there is a field called age, and that it's an Int, then we might want to make sure it's non-negitive, or that the person is at least 18. I'm guessing that wouldn't be too difficult. Finally, I'd love to see examples of using HKDs for parsing/validation in the wild. It's probably easiest to reach me at @magthe@mastodon.technology. ### Ken T Takusagawa # [crexbapy] default lambda variable first, introduce a special keyword LAMBDAVARIABLE so that an expression containing LAMBDAVARIABLE gets rewritten by the compiler to \x -> expression where x is a fresh symbol, and every instance of LAMBDAVARIABLE in the expression is replaced by x. (we have used Haskell syntax for lambda functions.) for example, (LAMBDAVARIABLE + 10) rewrites to (\x -> x + 10). the benefits are not having to type "x" that one extra time at the beginning ("Don't Repeat Yourself"), and not having to think of a fresh name x. the latter may be useful for generated code. problems: 1. the construct cannot be nested. 2. you do not know that an expression is a lambda function until you've parsed into the expression deeply, encountering the special keyword LAMBDAVARIABLE. 3. does (10 * (5 + LAMBDAVARIABLE)) mean (\x -> 10 * (5+x)) or (10 * (\x -> 5+x))? solution: augment with another construct LAMBDA(...) which delimits a lambda expression and allows nesting. for example, LAMBDA(LAMBDAVARIABLE + LAMBDA(LAMBDAVARIABLE^2)(LAMBDAVARIABLE+1)) rewrites to (\x -> x + (\x -> x^2)(x+1)). note that the inner x (and inner LAMBDAVARIABLE) masks the outer one via static scoping. if you need access to an outer variable in nested lambdas, don't use this construct. Mathematica calls lambda functions pure anonymous functions. LAMBDAVARIABLE is # and LAMBDA is postfix & . previously, lambda with more than one variable. consider keywords BINARYLAMBDA, LAMBDAVARIABLE1, LAMBDAVARIABLE2. beyond two arguments, it's probably best to force the programmer to explicitly name them. there's a parallel here with lambda expressions themselves being shortcuts: (\x -> expression) rewrites to (let { f x = expression } in f) with the benefits of not having to think of a fresh identifier f and not having to Repeat Yourself in f. ## March 17, 2021 ### Matt Parsons # Async Control Flow This post is an investigation of persistent issue #1199 where an asynchronous exception caused a database connnection to be improperly returned to the pool. The linked issue contains some debugging notes, along with the PR that fixes the problem. While I was able to identify the problem and provide a fix, I don’t really understand what happened - it’s a complex bit of work. So I’m going to write this up as an exploration into the exact code paths that are happening. # Data.Pool resource-pool is a how persistent manages concurrent pooling and sharing of database connections. When you create a Pool, you specify how to create resources, destroy them, and then some information around resource management: how long to keep an unused resource open, how many sub-pools to maintain, and how many resources per sub-pool (aka stripe). persistent calls createPool here. The database client libraries provide a LogFunc -> IO SqlBackend that is used to create new database connections, and the close' delegates to the connClose field on the SqlBackend record. While resource-pool isn’t seeing much maintenance activity, it’s relatively well tested and reliable. Once you’ve got a Pool a from createPool, the recommended way to use it is withResource: withResource :: (MonadBaseControl IO m) => Pool a -> (a -> m b) -> m b withResource pool act = control$ \runInIO -> mask $\restore -> do (resource, local) <- takeResource pool ret <- restore (runInIO (act resource)) onException destroyResource pool local resource putResource local resource return ret  # Data.Acquire Now, in persistent-2.10.5, a new API based on the resourcet package’s Data.Acquire was introduced, and this API became the underlying implementation for the runSqlPool family of functions. The underlying functionality is in the new function unsafeAcquireSqlConnFromPool, which was later factored out into resourcet-pool. This change was introduced because resource-pool operates in MonadBaseControl, which is incompatible with many other monad transformers - specifically, ConduitT. Acquire is based on MonadUnliftIO, which is compatible. In hindsight, we could have just changed the code to use MonadUnliftIO - it’s relatively straightforward to do. A term with a single constrant like MonadBaseControl IO m => m a can be specialized to IO a, and we can then run that using withRunInIO from unliftio. toUnliftIO :: MonadUnliftIO n => (forall m. MonadBaseControl IO m => m a) -> n a toUnliftIO mbc = withRunInIO$ \runInIO -> do
mbc

toPlainIO
:: (forall m. MonadBaseControl IO m => m a)
-> IO a
toPlainIO mbc = mbc

=> (forall m. MonadBaseControl IO m => m a)
-> n a
toMonadIO mbc = liftIO (toPlainIO mbc)


# Acquire vs Pool

I didn’t realize this at the time, but Data.Acquire is inherently a weaker tool than Data.Pool. Data.Acquire provides a means of creating a new resource, and also freeing it automatically when a scope is exited. Data.Pool keeps track of resources, resource counts, and occasionally destroys them if they’re unsued.

So let’s look at our conversion function:

unsafeAcquireSqlConnFromPool = do

let freeConn :: (backend, LocalPool backend) -> ReleaseType -> IO ()
freeConn (res, localPool) relType = case relType of
ReleaseException -> P.destroyResource pool localPool res
_ -> P.putResource localPool res

return $fst <$> mkAcquireType (P.takeResource pool) freeConn


mkAcquireType is analogous to createPool - it creates a handle Acquire a that can be used with a function named with:

with :: MonadUnliftIO m
=> Acquire a
-> (a -> m b)
-> m b
with (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
Allocated x free <- f restore
res <- restore (run (g x)) E.onException free ReleaseException
free ReleaseNormal
return res


with is aliased to withAcquire, which I’ll use from here on out to disambiguate.

You may notice that withAcquire and withResource are implemented nearly identically. withResource uses MonadBaseControl and withAcquire uses MonadUnliftIO, and that’s the whole of the difference. They have the same async exception handling with mask and use the same onException functions. All the exception handling stuff is from Control.Exception, so we’re not using UnliftIO.Exception or Control.Monad.Catch or Control.Exception.Safe here.

These are really similar. When we look at how the unsafeSqlConnFromPool works, it should provide identical behavior. For free, we case on ReleaseType and do destroyResource on exception and putResource on any other exit.

We’re not handling ReleaseEarly specially - this constructor is only used when we use ResourceT’s release function on a value. Using withAcquire, we’ll only ever pass ReleaseNormal and ReleaseException. So this is locally safe. Weirdly enough, resourcet doesn’t really depend on the Acquire type at all, at least not directly - the ReleaseMap type contains a function ReleaseType -> IO () for freeing resources, but doesn’t mention anything else about it.

Anyway, let’s get back on track. Since withAcquire and withResource are nearly identical, it may be our translating code that is the problem. We can use algebraic substitution to check this out. Let’s look at mkAcquireType:

mkAcquireType
:: IO a -- ^ acquire the resource
-> (a -> ReleaseType -> IO ()) -- ^ free the resource
-> Acquire a
mkAcquireType create free = Acquire $\_ -> do x <- create return$! Allocated x (free x)


The ignored parameter in the lambda there is a function that looks like restore - and we’re ignoring it. So, this action gets run when we unpack the Acquire in withAcquire. Let’s plug in our create and free parameters:

mkAcquireType
:: IO a -- ^ acquire the resource
-> (a -> ReleaseType -> IO ()) -- ^ free the resource
-> Acquire a
mkAcquireType (create = P.takeResource pool) (free = freeConn) = Acquire $\_ -> do x <- (P.takeResource pool) return$! Allocated x (freeConn x)
where
freeConn (res, localPool) relType = case relType of
ReleaseException -> P.destroyResource pool localPool res
_ -> P.putResource localPool res


The pool variable is captured in the closure.

Now we can look at withAcquire, and plug in our behavior:

withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
-- f ignores the restore argument: possible bug?
Allocated x free <- f restore
-- so x here comes from P.takeResource pool
-- free = freeConn
ret <- restore (run (g x))
E.onException free ReleaseException
free ReleaseNormal
return ret


Let’s plug in the specific case for free:

withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
-- f ignores the restore argument: possible bug?
Allocated x free <- f restore
-- so x here comes from P.takeResource pool
-- free = freeConn
ret <- restore (run (g x))
E.onException P.destroyResource pool localPool res
P.putResource localPool res
return ret


Closer, closer… Let’s unpack the Allocated stuff:

withAcquire (Acquire _) g = withRunInIO $\run -> E.mask$ \restore -> do
-- f ignores the restore argument: possible bug?
Allocated x free <- f restore
-- so x here comes from P.takeResource pool
-- free = freeConn
ret <- restore (run (g x))
E.onException P.destroyResource pool localPool res
P.putResource localPool res
return ret
where
f _ = do
x <- (P.takeResource pool)
return $! Allocated x (freeConn x) -- OK, let's splice in the definition of f: withAcquire (Acquire _) g = withRunInIO$ \run -> E.mask $\restore -> do -- f ignores the restore argument: possible bug? Allocated x free <- do x <- P.takeResource pool return$! Allocated x (freeConn x)
-- so x here comes from P.takeResource pool
-- free = freeConn
ret <- restore (run (g x))
E.onException P.destroyResource pool localPool res
P.putResource localPool res
return ret

-- Now let's remove the Allocated constructor:

withAcquire (Acquire _) g = withRunInIO $\run -> E.mask$ \restore -> do
x@(res, localPool) <- P.takeResource pool
ret <- restore (run (g x))
E.onException P.destroyResource pool localPool res
P.putResource localPool res
return ret


With this, we’re now nearly identical with withResource (copied again):

withResource
=> Pool a
-> (a -> m b)
-> m b
withResource pool act =
control $\runInIO -> mask$ \restore -> do
(resource, local) <- takeResource pool
ret <- restore (runInIO (act resource)) onException
destroyResource pool local resource
putResource local resource
return ret


The only difference here is that Acquire also passes the LocalPool to the given action. In the persistent code, we use fmap fst so that it only passes the resource to the callback.

So, I’m not sure this function is at fault. Let’s see how we call this function.

# What’s that >>= doing there??

acquireSqlConnFromPool is what’s actually called by runSqlPool in this version of the code.

acquireSqlConnFromPool
=> m (Acquire backend)
acquireSqlConnFromPool = do
connFromPool <- unsafeAcquireSqlConnFromPool
return $connFromPool >>= acquireSqlConn  That >>= is weird. What’s going on here? We have return :: a -> m a, and then f >>= g. f :: Acquire backend - so then g must have the type g :: backend -> Acquire backend, meaning that we’re using the >>= of Acquire a -> (a -> Acquire b) -> Acquire b. acquireSqlConn cashes out to rawAcquireSqlConn: rawAcquireSqlConn :: forall backend m . (MonadReader backend m, BackendCompatible SqlBackend backend) => Maybe IsolationLevel -> m (Acquire backend) rawAcquireSqlConn isolation = do conn <- MonadReader.ask let rawConn :: SqlBackend rawConn = projectBackend conn getter :: T.Text -> IO Statement getter = getStmtConn rawConn beginTransaction :: IO backend beginTransaction = conn <$ connBegin rawConn getter isolation

finishTransaction :: backend -> ReleaseType -> IO ()
finishTransaction _ relType = case relType of
ReleaseException -> connRollback rawConn getter
_ -> connCommit rawConn getter

return $mkAcquireType beginTransaction finishTransaction  So, in the investigation, the exception (libpq: failed (another command is already in progress)) would happen (as best as I can tell) when we try to call connRollback. The problem is somewhere around here. Um excuse me what? This is also operating in m (Acquire backend), not Acquire backend. How is it possibly being used on the RHS of a >>=? … Oh, right. Just like MonadBaseControl IO m => m a can be concretized to IO a, we can concretize MonadReader r m => m a to r -> a. So what’s happening here is we’re picking the spcialized type: rawAcquireSqlConn :: Maybe IsolationLevel -> backend -> Acquire backend  Wild. Well, let’s look at >>= for Acquire: instance Monad Acquire where return = pure Acquire f >>= g' = Acquire$ \restore -> do
Allocated x free1 <- f restore
let Acquire g = g' x
Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt)  Hmm! This smells funny. The problem occurs when we try to roll back the transaction. So let’s apply some more substitution here. Acquire f contains: \_ -> do x <- P.takeResource pool pure$ Allocated x (freeConn x)


And g' contains (simplifying a tiny bit):

\sqlBackend -> do
Acquire $\_ -> do _ <- beginTransaction sqlBackend getter isolation pure$ Allocated sqlBackend $\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend  So, we can start inlining. Acquire$ \restore -> do
Allocated x free1 <- (\_ -> do
x <- P.takeResource pool
pure $Allocated x (freeConn x)) restore let Acquire g = g' x Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- (\_ -> x) restore = x
Acquire $\restore -> do Allocated x free1 <- do x <- P.takeResource pool pure$ Allocated x (freeConn x)

let Acquire g = g' x

Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- float c and freeConn up Acquire$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x

let Acquire g = g' x

Allocated y free2 <- g restore E.onException free1 ReleaseException
return $! Allocated y (\rt -> free2 rt E.finally free1 rt) -- inline g' Acquire$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x

let sqlBackend = x
let Acquire g =
Acquire $\_ -> do _ <- beginTransaction sqlBackend getter isolation pure$ Allocated sqlBackend $\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- Remove Acquire constructor:
Acquire $\restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x let g _ = do _ <- beginTransaction sqlBackend getter isolation pure$ Allocated sqlBackend $\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend Allocated y free2 <- g restore E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- Inline g, ignore restore parameter
Acquire $\restore -> do x <- P.takeResource pool let free1 = freeConn x let sqlBackend = x Allocated y free2 <- (do _ <- beginTransaction sqlBackend getter isolation pure$ Allocated sqlBackend $\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend ) E.onException free1 ReleaseException return$! Allocated y (\rt -> free2 rt E.finally free1 rt)


Now, this next transformation feels a bit tricky. I’m going to float beginTransaction up and put the E.onException only on it. Note that we’re not actually running the free2 action here - just preparing it. Then I’ll assign it with a let.

Acquire $\restore -> do x <- P.takeResource pool let free1 = freeConn x let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException free1 ReleaseException let free2 = \case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend return$! Allocated y (\rt -> free2 rt E.finally free1 rt)

-- Inline free1 and free2
Acquire $\restore -> do x <- P.takeResource pool let (sqlBackend, localPool) = x _ <- beginTransaction sqlBackend getter isolation E.onException freeConn x ReleaseException return$! Allocated y $\rt -> (\case ReleaseException -> connRollback sqlBackend _ -> connCommit sqlBackend) rt E.finally (freeConn x rt) -- Inline freeConn Acquire$ \restore -> do
x <- P.takeResource pool
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException
P.destroyResource pool localPool sqlBackend

return $! Allocated y$ \rt ->
(case rt of
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend)
E.finally do
case rt of
ReleaseException ->
P.destroyResource pool localPool sqlBackend
_ ->
P.putResource localPool sqlBackend


I think it’s important to note that, again we don’t ever actually call restore. So the masking state is inherited and not ever changed. It feels important but I’m not sure if it actually is.

Let’s plug this into withAcquire now.

withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
Allocated x free <- f restore
res <- restore (run (g x)) E.onException free ReleaseException
free ReleaseNormal
return res

-- Inline f. Since restore is never called, we can omit passing it as
-- a parameter.
withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
Allocated x free <- do
x <- P.takeResource pool
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException free1 ReleaseException

return $! Allocated x$ \rt ->
(case rt of
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend)
E.finally do
case rt of
ReleaseException ->
P.destroyResource pool localPool sqlBackend
_ ->
P.putResource localPool sqlBackend

res <- restore (run (g x)) E.onException free ReleaseException
free ReleaseNormal
return res

-- float x <- P.takeResource pool to the top, and define free using let
withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
x <- P.takeResource pool
let free1 = freeConn x
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException free1 ReleaseException

let free rt =
(case rt of
ReleaseException ->
connRollback sqlBackend
_ ->
connCommit sqlBackend)
E.finally do
case rt of
ReleaseException ->
P.destroyResource pool localPool sqlBackend
_ ->
P.putResource localPool sqlBackend

res <- restore (run (g x)) E.onException free ReleaseException
free ReleaseNormal
return res

-- inline free for each case:
withAcquire (Acquire f) g = withRunInIO $\run -> E.mask$ \restore -> do
x <- P.takeResource pool
let (sqlBackend, localPool) = x

_ <- beginTransaction sqlBackend getter isolation
E.onException
P.destroyResource pool localPool sqlBackend

res <- restore (run (g x)) E.onException do
connRollback sqlBackend
E.finally
P.destroyResource pool localPool sqlBackend

do -- ReleaseNormal
connCommit sqlBackend
E.finally do
P.putResource localPool sqlBackend
return res


Let’s consider our masking state. We’re masked for everything except for the restoure (run (g x)) call. Including beginning the transaction and committing the transaction.

But we can still receive asynchronous exceptions during interruptible operations. Interruptible operations include “anything that can block or perform IO,” which seems very likely to include the Postgres code here.

# The Original

Let’s compare this with the original code. The original code delegated to runSqlConn after acquiring a SqlBackend from the Pool in MonadUnliftIO.

runSqlConn :: (MonadUnliftIO m, BackendCompatible SqlBackend backend) => ReaderT backend m a -> backend -> m a
runSqlConn r conn = withRunInIO $\runInIO -> mask$ \restore -> do
let conn' = projectBackend conn
getter = getStmtConn conn'
restore $connBegin conn' getter Nothing x <- onException (restore$ runInIO $runReaderT r conn) (restore$ connRollback conn' getter)
restore $connCommit conn' getter return x  We’ll inline this into runSqlPool, so we’ll now see: runSqlPool r pconn = withRunInIO$ \run ->
withResource pconn $run . runSqlConn r -- expand lambda runSqlPool r pconn = withRunInIO$ \run ->
withResource pconn $\conn -> run$ runSqlConn r conn

-- inline runSqlConn
runSqlPool r pconn =
withRunInIO $\run -> withResource pconn$ \conn ->
run $withRunInIO$ \runInIO ->
mask $\restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore$ connBegin conn' getter Nothing
x <- onException
(restore $runInIO$ runReaderT r conn)
(restore $connRollback conn' getter) restore$ connCommit conn' getter
return x


Kind of a lot of withStuff going on, including two withRunInIOs lol. Let’s make it even worse by inlining withResource:

-- abstract action to a variable
runSqlPool r pconn =
withRunInIO $\run -> withResource pconn$ \conn ->
let act =
run $withRunInIO$ \runInIO ->
mask $\restore -> do let conn' = projectBackend conn getter = getStmtConn conn' restore$ connBegin conn' getter Nothing
x <- onException
(restore $runInIO$ runReaderT r conn)
(restore $connRollback conn' getter) restore$ connCommit conn' getter
return x
in act

-- inline withResource

runSqlPool r pconn =
withRunInIO $\run -> -- withResource pconn$ \conn ->
control $\runInIO0 -> mask$ \restore0 -> do
let act conn =
run $withRunInIO$ \runInIO1 ->
mask $\restore1 -> do let conn' = projectBackend conn getter = getStmtConn conn' restore1$ connBegin conn' getter Nothing
x <- onException
(restore1 $runInIO1$ runReaderT r conn)
(restore1 $connRollback conn' getter) restore1$ connCommit conn' getter
return x
(resource, local) <- takeResource pool
ret <- restore0 (runInIO0 (act resource)) onException
destroyResource pool local resource
putResource local resource
return ret

-- inline act
runSqlPool r pconn =
withRunInIO $\run -> -- withResource pconn$ \conn ->
control $\runInIO0 -> mask$ \restore0 -> do
(resource, local) <- takeResource pool
ret <- restore0 (runInIO0 (
run $withRunInIO$ \runInIO1 ->
mask $\restore1 -> do let conn = resource let conn' = projectBackend conn getter = getStmtConn conn' restore1$ connBegin conn' getter Nothing
x <- onException
(restore1 $runInIO1$ runReaderT r conn)
(restore1 $connRollback conn' getter) restore1$ connCommit conn' getter
return x)) onException
destroyResource pool local resource
putResource local resource
return ret


The restore paratmer in mask doesn’t unmask it completely - it restores the existing masking state before the mask was entered. So mask $\restore -> mask$ \restore -> restore (print 10) doesn’t have print 10 in an unmasked state, but the same mask as before. However, here, we have this pattern:

mask $\restore -> do restore$ do
mask $\restore' -> do ...  Which is interesting! runSqlPool r pconn = -- Unmasked withRunInIO$ \run ->
control $\runInIO0 -> mask$ \restore0 -> do
(resource, local) <- takeResource pool
ret <- restore0
(runInIO0 $run$ withRunInIO $\runInIO1 -> -- Masked mask$ \restore1 -> do
let conn = resource
let conn' = projectBackend conn
getter = getStmtConn conn'
restore1 $connBegin conn' getter Nothing x <- onException -- Unmasked (restore1$ runInIO1 $runReaderT r conn) -- Unmasked (restore1$ connRollback conn' getter)
connCommit conn' getter
return x)
onException
destroyResource pool local resource
putResource local resource
return ret


1. takeResource pool
2. onException
3. onException and then destroyResource
4. putResource

1. connBegin
2. r (the action passed to runSqlConn)
3. connRollback
4. connCommit

Let’s compare with withAcquire which was all inlined above:

1. takeResource
2. beginTransaction
3. destroyResource
4. connRollback
5. destroyResource again
6. connCommit
7. putResource
1. run (g x) – the action passed to withAcquire and runSqlConn.

So withAcquire actually has quite a bit more masking going on! Interesting. Remembering, the problem occurs when the thread killed exception happens and the connRollback function is called, causing libpq to die with the “command in progress” error.

So, we throw a killThread at our withAcquire function. It’ll land as soon as we’re unmasked, or an interruptible action occurs. Since almost all of it is masked, we need to determine what the interruptible operations are.

takeResource might be interruptible - it has an STM transaction, which does call retry. I don’t know if any code with retry triggers an interrupt, or if only actually calling retry can trigger an interruptible state. Based on a quick and bewildering look at the GHC source, I think it’s just that retry itself can be interrupted. retry occurs when there are no available entries in the local pool and we’re at max resources for the pool. This is exactly the scenario this test is exercising: a single stripe with a single resource that’s constantly in use.

beginTransaction kicks off an IO action to postgres, so it is almost definitely interruptible. Same for connRollback and connCommit. So the masked-state for these items in withAcquire is probably not a big deal - but we could check by using uninterruptibleMask on them.

# To be continued?

I wish I had a more satisfying conclusion here, but I’m all out of time to write on this for now. Please comment on the relevant GitHub issues if you’re interested or have some insight!

# What is a type?

Posted on March 17, 2021

My fascination for types rivals my inability to define the concept.

Even though I don’t know what a type is, I can still recognize when a paper “is about types”: the paper usually contains many occurrences of formulas of the form “t : T”, where “t” is some piece of code, some program, and “T” is one of these mysterious types. The colon in the middle is of no technical significance, but it signals some cultural awareness on the part of authors.

Hypothesis: Types are a meme.

My experience is also that things are very very bad when “t : T” does not hold. Types are a way to tell right from wrong, for some domain-specific definitions of correctness.

Hypothesis: Types are specifications.

One idea that really narrows it down for me is that programming languages have types. You can assign types to everything in a language: static types to source code, dynamic types to run-time values.

Another way to look at this is to compare with other forms of specifications. How do you prove a specification? A priori, you could use any method, and all you care about is that it is somehow “sound”, but otherwise the proof method is a black box.

• Automated theorem proving, with an SMT solver as a black box.
• Interactive theorem proving, with PhD students as a black box.
• Testing, with the compiler/interpreter as a black box.1

Approaches using “types” seem different. To prove a specification, a typing judgement “t : T”, the way forward is to prove more typing judgements by following the rules of some “type system”. Types tell me both “how to specify things” and “how to verify things”—by specifying and verifying sub-things.

Hypothesis: Types are compositional specifications.

I originally posted this question on the recently created TYPES Zulip server, a spin-off of the TYPES mailing list. Those are nice hangouts for people who like types, whatever you believe they are (and also for getting spammed with Calls-For-Papers).

1. Whereas most formal methods are sound for proving the absence of certain bugs, testing is a sound method of finding bugs.↩︎

# Wealth Shown to Scale and Incarceration in Real Numbers

Wealth Shown to Scale and Incarceration in Real Numbers, two intriguing infographics from Matt Korostoff.

# Using GHC low-latency garbage collection in production

This is a guest post by Domen Kožar.

In this post I’ll dive into how low-latency garbage collection (GC) has improved developer experience for Cachix users.

### The need for low latency

Cachix serves the binary cache protocol for the Nix package manager.

Before Nix builds a package, it will ask the binary cache if it contains the binary for a given package it wants to build. For a typical invocation of Nix there can be hundreds or even thousand of packages that need to be checked. However, if a binary is present in the cache, building the package is no longer required, potentially saving a lot of time and CPU.

It is crucial for the backend of such a binary cache service to respond in a timely manner so that the optimisation of skipping builds actually pays off.

### Monitoring GC pauses

The easiest way to monitor and graph how long GC pauses last is via ekg-statsd, by exposing the rts.gc.gc_wall_ms metric. For Cachix, a typical plot of this metric used to look like this:

From the picture, we can see that under load, we experience GC pause times of up to nearly 800 ms. Having 800 ms pauses stopping the world is far from ideal (I’ve even observed some pauses that last over a second under really heavy load), since the endpoint for checking if a certain binary exists normally takes only about 2–4 ms.

### Switching to the low-latency GC

If you want to try the low-latency GC in your own code, please make sure to use GHC 8.10.3 or later since it fixes a few crashing bugs that you don’t want to encounter. Then, to enable the low-latency GC, append the following flags when invoking an executable built by GHC:

myexecutable +RTS --nonmoving-gc

For Cachix, a typical picture of GC pauses plotted over time then looks as follows:

While this is comparing apples to oranges since the load is not exactly the same between the two pictures, you can see that the distribution is now significantly different. By far the most pauses are now actually in the range of just a few milliseconds.

Unfortunately, non-moving GC can occasionally still cause relatively long pauses in the worst case (measured at 150–200 ms). We believe that this is due to the workload spawning many threads. There is still work to be done to further reduce pause times of the low-latency collector under such circumstances.

The throughput impact of the low-latency GC hasn’t been measured for this case. The response time of a “does this binary exist?” request is still within the 2–4 ms range most of the time.

### Conclusion

• Monitor GC pauses to understand how they impact your application response times.

• Non-moving GC has been running in production for over a month, reducing worst-case response time for a performance-sensitive endpoint without any issues.

• Being able to monitor the total number of threads in the RTS would improve production insights, but that is yet to be implemented.

# The bottom of the Haskell Pyramid

Inspired by Kowainik's Knowledge Map that tries to cover many Haskell concepts a Haskeller will probably run into through their journey, as well as libraries and tools, I've decided to make a more modest list of things one should be familiar with to be productive with Haskell, the bottom of The Haskell Pyramid if you will, hoping it will help to focus learners of the language that might get loss in the vast amount of information.

## Installing and running a Haskell toolchain

This could be a bit overwhelming to beginners as there are many choices, should they install the toolchain from their package manager? Download ghc binaries? Install stack? ghcup? maybe even nix?

In my opinion installing Haskell using stack or ghcup probably lead to the best results.

A new user should be able to run ghc and ghci when starting, the rest can come later.

## Expressions, operators and applying functions

What goes in the right-hand side of the equals sign in a definition? Things like "Hello world", 1 + 1, min (max 43212 12345) 43251.

## Indentation rules

How does Haskell know where a definition begins and where it end? Which arguments are applied to which functions? Haskell uses indentation using spaces to figure that out - best to learn how it works quickly to avoid subtle bugs and cryptic errors.

## Compiler errors

The compiler will notify us when our program doesn't make sense. The fact that Haskell provides a lot of feedback about our program helps us a lot from trying to figure our everything ourselves, gotta get used to Haskell errors because they are here to help. But it also means we have to learn what they mean to be effective.

First, try to read them carefully so you get used to them. At some point you won't even need to read them, just know they are there.

## If expressions & Recursion

Recursion is very foundational to Haskell. It is used for iteration, managing state, and more. It is best to get good control over it as it will appear many times.

## The evaluation model, substitution, equational reasoning and referential transparency

A lot of big words that describe how to "understand Haskell code".

Understanding how Haskell evaluates our program is probably the most important tool available when we need to debug our programs (and when writing them, of course)

## Debug tracing, stack traces

... but often programs can be quite big and we need debug traces and stack traces to help guide us in the right direction of the bug.

## Total vs partial functions

How to make sure all input cases are taken into account? What does it mean when they aren't? How can Haskell help us make sure our program won't crash during runtime because we didn't think of a certain input? How can it guide our API design to make it more robust?

## Parametric Polymorphism

Also known as "generics" in other circles (but not in Haskell as Generics mean something else). Helps us write fuctions that are more reusable, composable and precise.

## Higher order functions

Higher order functions are also foundational, they help us avoid repeating ourselves and help us extract the essence of algorithms such as traversing structures or sorting a structure without having to go into the details of "what to do with the elements when traversing" or "how should I compare two elements when sorting".

## Lists and tuples

We use them all the time, maybe too much to be honest.

## Annotating types

In Haskell annotating types isn't mandatory in most cases, but type annotation can sometimes serve as a tiny summary of code, and also will provide some safety net that checks that our summary match our code, so we are more likely to implement the thing we wanted to implement.

## Using and chaining IO operations

To create a useful program that interacts with the world we often need to use IO. What does that mean for referential transparency? (spoiler: nothing)

## Functional core imperative shell

This is a pattern for designing programs that interact with the world. In Haskell we like to keep our functions uneffectful because they are more composable, reusable and are easier to test, but we still need some effects. One way to keep that to a minimum is to design a program in layers: one thin imperative and effectful layer does that interacting, and then the core layer, the logic of the program, that doesn't do effects.

An example of mixing effects and logic: when making a fizzbuzz program, write a loop that increments an integer and checks which bucket it falls, prints that number, increments the number, and repeats the process until we reach the final number to check.

To write the same program with "functional core imperative shell" in mind, We could write the core logic that generates a list of results, and then write a function that writes the numbers on screen

(and that function looks like this):

writeFizzBuzzResults :: [String] -> IO ()
writeFizzBuzzResults results = mapM_ putStrLn results

## Modules

Learn about organizing code and how to import functions and other things from other modules.

## Using hoogle

A wonderful tool for finding functions and types that many other languages lack.

Let's make our programs safer, easier to understand and more modular.

## Pattern matching

Let's write our own control flow over ADTs.

## Installing editor tools & Ghcid

In my opinion having fast feedback dramatically increase the programmer's productivity.

Some people will say Ghcid is all you need, but I think editor features like error-at-point provide faster feedback than ghcid when working on a single module, so those are preferrable. Though when making a big change across many modules it's hard to beat Ghcid at the moment.

## newtypes

They are like a special case of data but have some benefits over it so it's worth to know as they are wildly used.

## Common data types: Maybe, Either

We use them (and often over use them) a lot.

Either especially can be used for handling errors in uneffectful code instead of IO exceptions.

## A package manager

How can I generate random numbers? How do I parse a json string? Send a file over the network? Write a graphical snake game?

The easiest way to accomplish these goals is to use an external library (packages in Haskell), and the easiest way to import a library, build it, and use it with ghc is through a package manager.

There are many options of package managers, stack and cabal-install are the most common ones. Make sure to read the docs. If you go the cabal-install path, make sure you use the v2 version of the commands. Note that you are not tied to one package manager whichever you choose, you can always switch or even use both if you'd like.

## Finding packages

How to find a proper Haskell package for your usecase can be tricky. Most packages can be found on Hackage, some can be found on Stackage.

It's possible to google the usecase + Haskell for example "Haskell json", click on a few links, see if the documentation makes sense and if the package is popular, But sometimes it's easier to ask in Haskell circles.

## Using the right data structures: text, containers, bytestring, vectors

A lot of Haskell programs are slower than one would expect because of choosing the wrong algorithm or the wrong data structures such as using String instead of Text, using lists when one needs fast lookup or indexing instead of using HashMap, Map or Vector, or using Set when one needs fast insertion (and order doesn't matter), or using nub to remove duplicate elements in a list instead of Set.toList . Set.fromList.

Remember that using the right data structure is also useful for understanding the code, as the choice of data structure reveals which algorithms are important for the program.

## Parse, don't validate

Parse, don't validate is an important pattern that helps write code that is safer and more predictable.

## Typeclasses and deriving typeclasses instances

One of the defining features of Haskell. Adds ad-hoc polymorphism and gives us the ability to create interfaces and abstractions over groups of types.

With typeclasses with can write functions like sort :: Ord a => [a] -> [a], that can sort a list of any type as long as its internal order is defined.

## Numbers typeclasses

Working with numbers can be a bit complicated because number groups (such as integers, floating points, etc) are both similar and different. The numbers typeclass hierarchy tries to capture their similarities and differences.

## Common Typeclasses: Show, Read, Eq, Ord, Semigroup, Monoid

How do we display a value for debugging purposes as a String? How do we convert a string to a value? How do we compare values? How do we append two values?

## Kinds

Kinds are to types what types are to values. It creates some sort of a separation between different types and defines where we can use different kinds of types.

## Common Typeclasses: Foldable, Traversable, Functor, Applicative, Monad

These typeclasses are the most common ones that define abstraction around certain types and their selected operations. You will find them at the core API of many data structures, concurrency libraries, parsing, and more.

These modules help reduce boilerplate by emulating imperative style programming using their various type classes interfaces (Such as Functor and Monad).

They are frequently used in Haskell code in the wild so they're worth knowing.

## forkIO, async, STM, TVar

Want to do some concurrency? Haskell got you covered.

## Language extensions

The GHC compiler provides extra (often experimental) language features beyond what standard Haskell provides. Some are nice to have such as LambdaCase and NumericUnderscores, Some are very useful but can make valid Haskell code invalid such as OverloadedStrings, Some provide really important functionality such as FlexibleInstances and BangPatterns, Some are sometimes incredibly useful such as TypeFamilies but often aren't, and some should probably not be used such as NPlusKPatterns.

It's probably better to run into language extensions organically and learn when you need to rather than try to learn them all.

The GHC2021 proposal classifies Haskell extensions, might be useful to go over it at some point.

## Tests: tasty, hspec, hunit, doctest, etc

Ever heard that you don't need to write tests for Haskell? Forget that, choose a unit testing framework, write tests and enjoy the gurantees that come from both strong static typing and tests!

Also learn about property-based testing and QuickCheck it's beautiful.

While not strictly necessary, many programs and libraries in the wild use monad transformers (Types that provide a monadic interface that can be combined with one another), and mtl, which is a library of interfaces of capabilities.

## Extras

### Simple lenses

It took me maybe 3-4 years to finally think about learning lenses, and even then I've only ever scratched the surface with them. While very powerful and expressive, they aren't really necessary to build programs, but many programs and libraries use basic lenses, so if (or when) you run into them in the wild come back and learn them.

• In my experience view, get, over, ., lens, and makeLenses are all you need to know for most cases.

### Profiling

At some point the average Haskell will write a program that won't perform well enough. It's a good idea to learn how to figure out what makes the program slow. But this can be posponed to later.

Template Haskell (a Haskell metaprogramming framework) might creep at you sometimes as a user, but learning to write template haskell isn't usually necessary.

### Lambda Calculus

You might have heard about the lambda calculus, it is a minimalistic functional language that is at the heart of Haskell. Learning the lambda calculus means learning about the power of functions, how functions can be used to represent data, and how to evaluate expressions using pen and paper.

I found it to be very interesting and eye opening, seeing the core of Haskell naked and exposed without syntactic sugar, but it's not required to use or understand Haskell.

## Summary

Haskell has a really high ceiling of concepts, but one doesn't need master Haskell to be productive with it. Yes, there might be additional concepts to learn in order to use certain libraries, but many applications and libraries can (and have) been built on these concepts alone.

Don't believe me? Here are a few applications that (as far as I could tell - corrections are welcomed!) don't use anything more sophisticated than what I mentioned in this article:

• Aura - A secure, multilingual package manager for Arch Linux and the AUR.
• Elm-compiler - Compiler for Elm, a functional language for reliable webapps.
• Patat - Terminal-based presentations using Pandoc.

## March 15, 2021

We spent a few months last year building the groundwork for Haskellings in this YouTube series. Now after some more hard work, we're happy to announce that Haskellings is now available in the beta stage. This program is meant to be an interactive tutorial for learning the Haskell language. If you've never written a line of Haskell in your life, this program is designed to help you take those first steps! You can take a look at the Github repository to learn all the details of using it, but here's a quick overview.

## Overview

Haskellings gives you the chance to write Haskell code starting from the very basics with a quick evaluation loop. It currently has 50 different exercises for you to work with. In each exercise, you can read about a Haskell concept, make appropriate modifications, and then run the code with a single command to check your work!

You can do exercises individually, but the easiest way to do everything in order is to run the program in Watcher mode. In this mode, it will automatically tell you which exercise is next. It will also re-run each exercise every time you save your work.

Haskellings covers a decent amount of ground on basic language concepts. It starts with the very basics of expressions, types and functions, and goes through the basic usage of monads.

Haskellings is an open source project! If you want to report issues or contribute, learn how by reading this document! So go ahead, give it a try!

## Don't Forget: Haskell From Scratch

Haskellings is a quick and easy way to learn the language basics, but it only touches on the surface of a lot of elements. To get a more in-depth look at the language, you should consider our Haskell From Scratch video course. This course includes:

1. Hours of video lectures explaining core language concepts and syntax

This course will help you build a rock-solid foundation for your future Haskell learnings. And even better, we've now cut the price in half! So don't miss out!

# Hyperfunctions

Posted on March 14, 2021

Check out this type:

newtype a -&> b = Hyp { invoke :: (b -&> a) -> b } 

This a hyperfunction (Launchbury, Krstic, and Sauerwein 2013; 2000; 2000), and I think itâ€™s one of the weirdest and most interesting newtypes you can write in Haskell.

The first thing to notice is that the recursion pattern is weird. For a type to refer to itself recursively on the left of a function arrow is pretty unusual, but on top of that the recursion is non-regular. That means that the recursive reference has different type parameters to its parent: a -&> b is on the left-hand-side of the equals sign, but on the right we refer to b -&> a.

Being weird is reason enough to write about them, but whatâ€™s really shocking about hyperfunctions is that theyâ€™re useful. Once I saw the definition I realised that a bunch of optimisation code I had written (to fuse away zips in particular) was actually using hyperfunctions (Ghani et al. 2005). After that, I saw them all over the place: in coroutine implementations, queues, breadth-first traversals, etc.

Anyways, since coming across hyperfunctions a few months ago I thought Iâ€™d do a writeup on them. Iâ€™m kind of surprised theyâ€™re not more well-known, to be honest: theyâ€™re like a slightly more enigmatic Cont monad, with a far cooler name. Letâ€™s get into it!

# What Are Hyperfunctions?

The newtype noise kind of hides whatâ€™s going on with hyperfunctions: expanding the definition out might make things slightly clearer.

type a -&> b = (b -&> a) -> b
= ((a -&> b) -> a) -> b
= (((b -&> a) -> b) -> a) -> b
= ((((... -> b) -> a) -> b) -> a) -> b

So a value of type a -&> b is kind of an infinitely left-nested function type. One thing worth noticing is that all the as are in negative positions and all the bs in positive. This negative and positive business basically refers to the position of arguments in relation to a function arrow: to the left are negatives, and to the right are positives, but two negatives cancel out.

((((... -> b) -> a) -> b) -> a) -> b
+     -     +     -     +

All the things in negative positions are kind of like the things a function â€œconsumesâ€�, and positive positions are the things â€œproducedâ€�. Itâ€™s worth fiddling around with very nested function types to get a feel for this notion. For hyperfunctions, though, itâ€™s enough to know that a -&> b does indeed (kind of) take in a bunch of as, and it kind of produces bs.

By the way, one of the ways to get to grips with polarity in this sense is to play around with the Cont monad, codensity monad, or selection monad (Hedges 2015). If you do, you may notice one of the interesting parallels about hyperfunctions: the type a -&> a is in fact the fixpoint of the continuation monad (Fix (Cont a)). Suspicious!

# Hyperfunctions Are Everywhere

Before diving further into the properties of the type itself, Iâ€™d like to give some examples of how it can show up in pretty standard optimisation code.

### Zips

Letâ€™s say you wanted to write zip with foldr (I have already described this particular algorithm in a previous post). Not foldr on the left argument, mind you, but foldr on both. If you proceed mechanically, replacing every recursive function with foldr, you can actually arrive at a definition:

zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr xf xb xs (foldr yf yb ys)
where
xf x xk yk = yk x xk
xb _ = []

yf y yk x xk = (x,y) : xk yk
yb _ _ = []

In an untyped language, or a language with recursive types, such a definition would be totally fine. In Haskell, though, the compiler will complain with the following:

â€¢ Occurs check: cannot construct the infinite type:
t0 ~ a -> (t0 -> [(a, b)]) -> [(a, b)]

Seasoned Haskellers will know, though, that this is not a type error: no, this is a type recipe. The compiler is telling you what parameters it wants you to stick in the newtype:

newtype Zip a b =
Zip { runZip :: a -> (Zip a b -> [(a,b)]) -> [(a,b)] }

zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = xz yz
where
xz :: Zip a b -> [(a,b)]
xz = foldr f b xs
where
f x xk yk = runZip yk x xk
b _ = []

yz :: Zip a b
yz = foldr f b ys
where
f y yk = Zip (\x xk -> (x,y) : xk yk)
b = Zip (\_ _ -> [])

And here we see the elusive hyperfunction: hidden behind a slight change of parameter order, Zip a b is in fact the same as [(a,b)] -&> (a -> [(a,b)]).

zip :: forall a b. [a] -> [b] -> [(a,b)]
zip xs ys = invoke xz yz
where
xz :: (a -> [(a,b)]) -&> [(a,b)]
xz = foldr f b xs
where
f x xk = Hyp (\yk -> invoke yk xk x)
b = Hyp (\_ -> [])

yz :: [(a,b)] -&> (a -> [(a,b)])
yz = foldr f b ys
where
f y yk = Hyp (\xk x -> (x,y) : invoke xk yk)
b = Hyp (\_ _ -> [])

### BFTs

In another previous post I derived the following function to do a breadth-first traversal of a tree:

data Tree a = a :& [Tree a]

newtype Q a = Q { q :: (Q a -> [a]) -> [a] }

bfe :: Tree a -> [a]
bfe t = q (f t b) e
where
f :: Tree a -> Q a -> Q a
f (x :& xs) fw = Q (\bw -> x : q fw (bw . flip (foldr f) xs))

b :: Q a
b = Q (\k -> k b)

e :: Q a -> [a]
e (Q q) = q e

That Q type there is another hyperfunction.

bfe :: Tree a -> [a]
bfe t = invoke (f t e) e
where
f :: Tree a -> ([a] -&> [a]) -> ([a] -&> [a])
f (x :& xs) fw = Hyp (\bw -> x : invoke fw (Hyp (invoke bw . flip (foldr f) xs)))

e :: [a] -&> [a]
e = Hyp (\k -> invoke k e)

One of the problems I had with the above function was that it didnâ€™t terminate: it could enumerate all the elements of the tree but it didnâ€™t know when to stop. A similar program (Allison 2006; described and translated to Haskell in Smith 2009) manages to solve the problem with a counter. Will it shock you to find out this solution can also be encoded with a hyperfunction?

bfe t = invoke (f t e) e 1
where
f :: Tree a -> (Int -> [a]) -&> (Int -> [a])
-> (Int -> [a]) -&> (Int -> [a])
f (x :& xs) fw =
Hyp (\bw n -> x :
invoke fw
(Hyp (invoke bw . flip (foldr f) xs)) (n+1))

e :: (Int -> [a]) -&> (Int -> [a])
e = Hyp b

b x 0 = []
b x n = invoke x (Hyp b) (n-1)

(my version here is actually a good bit different from the one in Smith 2009, but the basic idea is the same)

### Coroutines

Hyperfunctions seem to me to be quite deeply related to coroutines. At the very least several of the types involved in coroutine implementations are actual hyperfunctions. The ProdPar and ConsPar types from Pieters and Schrijvers (2019) are good examples:

newtype ProdPar a b = ProdPar (ConsPar a b -> b)
newtype ConsPar a b = ConsPar (a -> ProdPar a b -> b)

ProdPar a b is isomorphic to (a -> b) -&> b, and ConsPar a b to b -&> (a -> b), as witnessed by the following functions:

Conversion functions between ProdPar, ConsPar and hyperfunctions

fromP :: ProdPar a b -> (a -> b) -&> b
fromP (ProdPar x) = Hyp (x . toC)

toC ::  b -&> (a -> b) -> ConsPar a b
toC (Hyp h) = ConsPar (\x p -> h (fromP p) x)

toP :: (a -> b) -&> b -> ProdPar a b
toP (Hyp x) = ProdPar (x . fromC)

fromC :: ConsPar a b -> b -&> (a -> b)
fromC (ConsPar p) = Hyp (\h x -> p x (toP h))

In fact this reveals a little about what was happening in the zip function: we convert the left-hand list to a ProdPar (producer), and the right-hand to a consumer, and apply them to each other.

# Hyperfunctions Are Weird

Aside from just being kind of weird intuitively, hyperfunctions are weird in theory. Set-theoretically, for instance, you cannot form the set of a -&> b: if you tried, youâ€™d run into those pesky size restrictions which stop us from making things like â€œthe set of all setsâ€�. Haskell types, however, are not sets, precisely because we can define things like a -&> b.

For slightly different reasons to the set theory restrictions, we canâ€™t define the type of hyperfunctions in Agda. The following will get an error:

record _â†¬_ (A : Type a) (B : Type b) : Type (a â„“âŠ” b) where
inductive; constructor hyp
field invoke : (B â†¬ A) â†’ B

And for good reason! Agda doesnâ€™t allow recursive types where the recursive call is in a negative position. If we turn off the positivity checker, we can write Curryâ€™s paradox (example proof taken from here):

yes? : âŠ¥ â†¬ âŠ¥
yes? .invoke h = h .invoke h

no! : (âŠ¥ â†¬ âŠ¥) â†’ âŠ¥
no! h = h .invoke h

boom : âŠ¥
boom = no! yes?

Note that this isnâ€™t an issue with the termination checker: the above example passes all the normal termination conditions without issue (yes, even if â†¬ is marked as coinductive). Itâ€™s directly because the type itself is not positive.

Interestingly, there is a slightly different, and nearly equivalent, definition of hyperfunctions which doesnâ€™t allow us to write the above proof:

record _â†¬_ (A : Type a) (B : Type b) : Type (a â„“âŠ” b) where
inductive; constructor hyp
field invoke : ((A â†¬ B) â†’ A) â†’ B

This is basically a slightly expanded out version of the hyperfunction type, and importantly itâ€™s positive. Not strictly positive however, since the recursive call does occur to the left of a function arrow: itâ€™s just positive, in that itâ€™s to the left of an even number of function arrows.

I found in a blog post by SjÃ¶berg (2015) some interesting discussion regarding the question of this extra strictness: in Coq, allowing certain positive but not strictly positive types does indeed introduce an inconsistency (Coquand and Paulin 1990). However this inconsistency relies on an impredicative universe, which Agda doesnâ€™t have. As far as I understand it, it would likely be safe to allow types like â†¬ above in Agda (Coquand 2013), although Iâ€™m not certain that with all of Agdaâ€™s newer features thatâ€™s still the case.

The connection between non-strictly-positive types and breadth-first traversals has been noticed before: Berger, Matthes, and Setzer (2019) make the argument for their inclusion in Agda and Coq using a breadth-first traversal algorithm by Hofmann (1993), which uses the following type:

data Rou
= Over
| Next ((Rou -> [Int]) -> [Int])

Now this type isnâ€™t a hyperfunction (but itâ€™s close); weâ€™ll see soon what kind of thing it is.

# Hyperfunctions Are a Category

So weâ€™ve seen that hyperfunctions show up kind of incidentally through certain optimisations, and weâ€™ve seen that they occupy a strange space in terms of their theoretical interpretation: we havenâ€™t yet seen much about the type itself in isolation. Luckily Ed Kmett has already written the hyperfunctions package (2015), where a laundry list of instances are provided, which can tell us a little more about what hyperfunctions can actually do on their own.

The Category instance gives us the following:

instance Category (-&>) where
id = Hyp (\k -> invoke k id)
f . g = Hyp (\k -> invoke f (g . k))

Weâ€™ve actually seen the identity function a few times: we used it as the base case for recursion in the breadth-first traversal algorithms.

Composition we actually have used as well but itâ€™s more obscured. An analogy to help clear things up is to think of hyperfunctions as a kind of stack. id is the empty stack, and we can use the following function to push items onto the stack:

push :: (a -> b) -> a -&> b -> a -&> b
push f q = Hyp (\k -> f (invoke k q))

Understood in this sense, composition acts like a zipping operation on stacks, since we have the following law:

push f p . push g q â‰¡ push (f . g) (p . q)

While we canâ€™t really pop elements off the top of the stack directly, we can get close with invoke, since it satisfies the following law:

invoke (push f p) q â‰¡ f (invoke q p)

Along with the id implementation we have, this will let us run a hyperfunction, basically folding over the contents of the stack:

run :: a -&> a -> a
run f = invoke f id

This analogy helps us understand how the breadth-first traversals worked: the hyperfunctions are kind of like stacks with <semantics>ğ�’ª(1)<annotation encoding="application/x-tex">\mathcal{O}(1)</annotation></semantics> push and zip, which is precisely what you need for an efficient breadth-first traversal.

bfe :: Tree a -> [a]
bfe = run . f
where
f (x :& xs) = push (x:) (zips (map f xs))

zips = foldr (.) id

Finally, hyperfunctions are of course monads:

instance Monad ((-&>) a) where
m >>= f = Hyp (\k -> invoke (f (invoke m (Hyp (invoke k . (>>=f))))) k)

I wonâ€™t pretend to understand whatâ€™s going on here, but it looks a little like a nested reader monad. Perhaps thereâ€™s some intuition to be gained from noticing that a -&> a ~ Fix (Cont a).

# Hyper Arrows Areâ€¦?

As I said in the introduction Iâ€™m kind of surprised thereâ€™s not more research out there on hyperfunctions. Aside from the excellent papers by Launchbury, Krstic, and Sauerwein (2013) thereâ€™s just not much out there. Maybe itâ€™s that thereâ€™s not that much theoretical depth to them, but all the same there are some clear questions worth looking into.

For example: is there a hyperfunction monad transformer? Or, failing that, can you thread a monad through the type at any point, and do you get anything interesting out?

I have made a little headway on this question, while fiddling with one of the bfe definitions above. Basically I wanted to remove the Int counter for the terminating bfe, and I wanted to use a Maybe somewhere instead. I ended up generalising from Maybe to any m, yielding the following type:

newtype HypM m a b = HypM { invokeM :: m ((HypM m a b -> a) -> b) }

This does the job for the breadth-first traversal:

bfe t = r (f t e)
where
f :: Tree a -> HypM Maybe [a] [a] -> HypM Maybe [a] [a]
f (x :& xs) fw = HypM (Just (\bw -> x : fromMaybe (\k -> k e) (invokeM fw) (bw . flip (foldr f) xs)))

e :: HypM Maybe [a] [a]
e = HypM Nothing

r :: HypM Maybe [a] [a] -> [a]
r = maybe [] (\k -> k r) . invokeM

(In fact, when m is specialised to Maybe we have the same type as Rou)

This type has a very practical use, as it happens, which is related to the church-encoded list monad transformer:

newtype ListT m a = ListT { runListT :: forall b. (a -> m b -> m b) -> m b -> m b }

Just like -&> allowed us to write zip on folds (i.e.Â using foldr), HypM will allow us to write zipM on ListT:

zipM :: Monad m => ListT m a -> ListT m b -> ListT m (a,b)
zipM xs ys = ListT (\c n ->
let
xf x xk = pure (\yk -> yk (HypM xk) x)
xb = pure (\_ -> n)

yf y yk = pure (\xk x -> c (x, y) (join (invokeM xk <*> yk)))
yb = pure (\_ _ -> n)
in join (runListT xs xf xb <*> runListT ys yf yb))

I actually think this function could be used to seriously improve the running time of several of the functions on LogicT: my reading of them suggests that interleave is <semantics>ğ�’ª(n2)<annotation encoding="application/x-tex">\mathcal{O}(n^2)</annotation></semantics> (or worse), but the zip above could be trivially repurposed to give a <semantics>ğ�’ª(n)<annotation encoding="application/x-tex">\mathcal{O}(n)</annotation></semantics> interleave. This would also have knock-on effects on, for instance, >>- and so on.

Another question is regarding the arrows of the hyperfunction. Weâ€™ve seen that a hyperfunction kind of adds â€œstackingâ€� to functions, can it do the same for other arrows? Basically, does the following type do anything useful?

newtype HypP p a b = HypP { invokeP :: p (HypP p b a) b }

Along a similar vein, many of the breadth-first enumeration algorithms seem to use â€œhyperfunctions over the endomorphism monoidâ€�. Basically, they all produce hyperfunctions of the type [a] -&> [a], and use them quite similarly to how we would use difference lists. But we know that there are Cayley transforms in other monoidal categories, for instance in the applicative monoidal category: can we construct the â€œhyperfunctionâ€� version of those?

# References

Allison, Lloyd. 2006. â€œCircular Programs and Self-Referential Structures.â€� Software: Practice and Experience 19 (2) (October): 99â€“109. doi:10.1002/spe.4380190202. http://users.monash.edu/~lloyd/tildeFP/1989SPE/.

Berger, Ulrich, Ralph Matthes, and Anton Setzer. 2019. â€œMartin Hofmannâ€™s Case for Non-Strictly Positive Data Types.â€� In 24th International Conference on Types for Proofs and Programs (TYPES 2018), ed by. Peter Dybjer, JosÃ© EspÃ­rito Santo, and LuÃ­s Pinto, 130:22. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss DagstuhlLeibniz-Zentrum fuer Informatik. doi:10.4230/LIPIcs.TYPES.2018.1. http://drops.dagstuhl.de/opus/volltexte/2019/11405.

Coquand, Thierry. 2013. â€œ[Agda] Defining Coinductive Types.â€� https://lists.chalmers.se/pipermail/agda/2013/006189.html.

Coquand, Thierry, and Christine Paulin. 1990. â€œInductively Defined Types.â€� In COLOG-88, ed by. Per Martin-LÃ¶f and Grigori Mints, 50â€“66. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer. doi:10.1007/3-540-52335-9_47.

Ghani, Neil, Patricia Johann, Tarmo Uustalu, and Varmo Vene. 2005. â€œMonadic Augment and Generalised Short Cut Fusion.â€� In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming, 294â€“305. ICFP â€™05. New York, NY, USA: Association for Computing Machinery. doi:10.1145/1086365.1086403. https://doi.org/10.1145/1086365.1086403.

Hedges, Jules. 2015. â€œThe Selection Monad as a CPS Transformation.â€� arXiv:1503.06061 [cs] (March). http://arxiv.org/abs/1503.06061.

Hofmann, Martin. 1993. â€œNon Strictly Positive Datatypes in System F.â€� https://www.seas.upenn.edu/~sweirich/types/archive/1993/msg00027.html.

Kmett, Edward. 2015. â€œHyperfunctions: Hyperfunctions.â€� https://hackage.haskell.org/package/hyperfunctions.

Krstic, Sava, and John Launchbury. 2000. â€œA Category of Hyperfunctions.â€� http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.2421.

Launchbury, J., S. Krstic, and T. E. Sauerwein. 2013. â€œCoroutining Folds with Hyperfunctions.â€� Electron. Proc. Theor. Comput. Sci. 129 (September): 121â€“135. doi:10.4204/EPTCS.129.9. http://arxiv.org/abs/1309.5135.

Launchbury, John, Sava Krstic, and Timothy E. Sauerwein. 2000. Zip Fusion with Hyperfunctions. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.4961.

Pieters, Ruben P., and Tom Schrijvers. 2019. â€œFaster Coroutine Pipelines: A Reconstruction.â€� In Practical Aspects of Declarative Languages, ed by. JosÃ© JÃºlio Alferes and Moa Johansson, 133â€“149. Lecture Notes in Computer Science. Cham: Springer International Publishing. doi:10.1007/978-3-030-05998-9_9. https://people.cs.kuleuven.be/~tom.schrijvers/portfolio/padl2019.html.

SjÃ¶berg, Vilhelm. 2015. â€œWhy Must Inductive Types Be Strictly Positive?â€� Code and stuff. https://vilhelms.github.io/posts/why-must-inductive-types-be-strictly-positive/.

Smith, Leon P. 2009. â€œLloyd Allisonâ€™s Corecursive Queues: Why Continuations Matter.â€� The Monad.Reader 14 (14) (July): 28. https://meldingmonads.files.wordpress.com/2009/06/corecqueues.pdf.

# Strema is now Giml

For the past few weeks I've been working on Strema, a hobby programming language I developed live on stream. It's been a great experience and I'm proud of what I was able to accomplish in the short time I've worked on it.

I've decided to change Strema a bit and rename it to Giml (pronounced like the hebrew letter Gimel).

I've been wanting to make Giml for a long time, and I had Giml in mind when I initially designed Strema. Building Strema was also kind of a prototype for Giml. I feel the prototype was a success and Strema has grown sufficiently close to what I had in mind for Giml, so I've decided to shift gears and focus my interest on building Giml instead.

Giml resembles Haskell in terms of syntax and semantics more closely than Strema (for example, Giml is purely functional where Strema is not), though Giml is also strict where Haskell in non-strict, and Giml is much leaner in terms of features than Haskell.

I will continue to stream work on Giml, at least for the next few weeks. I hope to make it a hobby programming language I can be proud of, and I hope people who've been interested in Strema will find Giml just as interesting, if not more :)

If you'd like to follow my adventures with Giml, you can view the VOD playlist on Youtube, Or follow my Twitch.tv channel and join me when I stream. I'll be happy to have you!

# The Logarithm of a Sum

This is the story of an interesting flight of fancy with mathematics. I found it intriguing, and hope you do, as well.

#### The question

Here’s a fact that comes up in high school mathematics: you can demote multiplication into addition by using logarithms. That is:

That is, you can compute the log of a product, given only the logs of the factors.

To students today, this might seem like just another algebraic identity. But in the age before calculators, it was actually the main reason for a typical high school student to be interested in logarithms at all! Multiplication is more difficult than addition, so if you have a way to represent numbers that makes multiplication into addition, that helps. This is whole principle behind doing multiplication with a slide rule, for example: one just converts to logarithms, adds the resulting distances, and then converts back.

Similarly, one can use logarithms to demote powers into multiplication:

But if we’re imagining a world where we work entirely with logarithms, it’s not entirely fair to just multiply by y, so I’m going to rewrite this (let’s agree that all logarithms are natural) as:

There’s an additional exponential function there, but if we take that as given, we can now compute the log of a power using only multiplication, the exponential function, and the logs of the inputs.

An interesting question to ask is: what about addition? The following does not work, although math teachers will recognize it as a very common mistake!

So, can we complete this equation?

At first glance, thinking of the logarithm as translating operations down one order (multiplication into addition, and exponents into multiplication), this seems to call for an operation an order lower than addition. What could fit in such a place?

We can start to answer this question using simple algebra and our existing identities. Let’s assume x is not zero (since then it would have no logarithm anyway!), and then we can factor:

So by applying the log rule for multiplication, we get this nifty little formula:

Notice that although the presentation here doesn’t look symmetric, it actually is. Swapping the x and y values doesn’t change the result.

Again, imagining that we have only the logarithms and not the actual values, that fraction at the end is sort of cheating. Just as I did with the multiplication formula, I’ll introduce an explicit exponential, and it simplifies nicely.

In order to write this more clearly, I’ll name a new function, h, and define in terms of that:

It’s true that we haven’t succeeded in getting rid of addition, but this is leading somewhere interesting. But what is this mysterious function h?

#### h: The soft rectified linear function

We can start to explore h by looking at a graph.

At first glance, it looks like h(x) is approximately zero for any input less than -2, and approximately x for any input greater than 2. This sounds like the so-called “rectified linear” function:

Indeed, we can graph the two functions on the same axes, and see that they agree except near zero. (You can also verify this by reasoning about the formula. For inputs far less than zero, the exponential term becomes insignificant, while for inputs far greater than zero, the constant term becomes insignificant. This is the basis of a not-too-hard proof that these are asymptotes.)

We can, therefore, think of h as a soft rectified linear function; what you get by just rounding out the rectified linear function around its sharp corner.

(This rectified linear function, incidentally, has been popularized in machine learning, where for reasons that depend on who you ask, it has turned out to be wildly successful as an activation function for artificial neural networks. Part of the reason for that success is that it is so simple it can be computed quickly. But that’s not enough to explain all of its success! I suspect another part of the reason is that it’s closely related to sums exactly in the sense of the very investigation we’re doing now.)

#### Back to the sum

So if h is so similar to the rectified linear function, what happens when you (inaccurately) use the rectified linear function itself in the sum formula above. Remarkably, you get this:

In other words, in terms of logarithms, adding numbers is approximately the same as just taking the maximum! At least, it is when the difference between the numbers is large. That sort of makes sense, actually. If you add a very large number to a very small number, the result will indeed be approximately the same as the large number. (Remember that since we’re only thinking about numbers with logarithms, both inputs must be positive. We need not worry about the case where both numbers are large but with opposite signs.)

We can pull out of this pattern a sort of “soft” maximum function, which is almost like just giving the greater of its two arguments, but if the arguments are close then it rounds off the curve. Unfortunately, the phrase softmax already means something different and somewhat more complex to the machine learning community mentioned above, so perhaps we ought to call this something like smoothmax instead.

It’s not easily computable, really, in the sense that products and powers were, but this still gives some intuition for the function that does compute the log of a sum, given the logs of the summands. Anyway, I’m satisfied enough with that answer.

This tells us that this smoothmax function can play the role of addition in mathematical expressions. That implies that all of the algebraic properties of addition ought to hold for smoothmax, as well. That’s interesting!

For example, smoothmax ought to be commutative. That is:

Indeed, this is true. I made that observation above when first introducing the formula. One can also expect that smoothmax is associative. That is:

And, indeed, although the algebra is a little more complex, this turns out to be true, as well. In fact, we need not really show each of these with complicated algebra. We’ve already shown that smoothmax is addition, just using the logarithms to represent the numbers.

I think things get even more interesting when we consider the distributive property. Remember that when we work with logs, multiplication gets replaced with addition, so we have this:

Thinking of this as a softened maximum, this works out to be some kind of translation invariance property of the maximum: if you take the maximum of two numbers and then add x, that’s the same as adding x to each one and then taking the maximum! That intuitively checks out.

There are some things that don’t work, though.

You might also hope for something like an identity property, since for addition we have x + 0 = x. This one doesn’t turn out so well, because we cannot take the logarithm of zero! We end up wanting to write something like:

This would make sense given the asymptotic behavior of the smoothmax function, but we’re playing sort of fast and loose with infinities there, so I wouldn’t call it a true identity. To say that correctly, you need limits.

You also need to be careful with expecting smoothmax to act like a maximum! For example:

That’s weird… but not if you remember that smoothmax is at its least accurate when its two inputs are close together, so both inputs being the same is a worst case scenario. Indeed, that’s where the true max function has a non-differentiable sharp corner that needed to be smoothed out. And, indeed, the exact behavior is given by addition, rather than maximums, and addition is not idempotent (i.e., adding a number to itself doesn’t give the same number back).

In fact, speaking of smooth maxing a number with itself:

which resembles a sort of definition of addition of log-naturals as “repeated smoothmax of a number with itself”, in very much the same sense that multiplication by naturals can be defined as repeated addition of a number with itself, strengthening the notion that this operation is sort-of one order lower than addition.

So there you have it. That’s as far as my flight of fancy goes. I found it interesting enough to share.

# Captures in closures and async blocks

This blog post is the second in the Rust quickies series. In my training sessions, we often come up with quick examples to demonstrate some point. Instead of forgetting about them, I want to put short blog posts together focusing on these examples. Hopefully these will be helpful, enjoy!

FP Complete is looking for Rust and DevOps engineers. Interested in working with us? Check out our jobs page.

## Hello Hyper!

For those not familiar, Hyper is an HTTP implementation for Rust, built on top of Tokio. It's a low level library powering frameworks like Warp and Rocket, as well as the reqwest client library. For most people, most of the time, using a higher level wrapper like these is the right thing to do.

But sometimes we like to get our hands dirty, and sometimes working directly with Hyper is the right choice. And definitely from a learning perspective, it's worth doing so at least once. And what could be easier than following the example from Hyper's homepage? To do so, cargo new a new project, add the following dependencies:

hyper = { version = "0.14", features = ["full"] }
tokio = { version = "1", features = ["full"] }


And add the following to main.rs:

use std::convert::Infallible;
use hyper::{Body, Request, Response, Server};
use hyper::service::{make_service_fn, service_fn};

async fn hello_world(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
Ok(Response::new("Hello, World".into()))
}

#[tokio::main]
async fn main() {
// We'll bind to 127.0.0.1:3000

// A Service is needed for every connection, so this
// creates one from our hello_world function.
let make_svc = make_service_fn(|_conn| async {
// service_fn converts our function into a Service
Ok::<_, Infallible>(service_fn(hello_world))
});

// Run this server for... forever!
if let Err(e) = server.await {
eprintln!("server error: {}", e);
}
}


If you're interested, there's a quick explanation of this code available on Hyper's website. But our focus will be on making an ever-so-minor modification to this code. Let's go!

## Counter

Remember the good old days of Geocities websites, where every page had to have a visitor counter? I want that. Let's modify our hello_world function to do just that:

use std::sync::{Arc, Mutex};

type Counter = Arc<Mutex<usize>>; // Bonus points: use an AtomicUsize instead

async fn hello_world(counter: Counter, _req: Request<Body>) -> Result<Response<Body>, Infallible> {
let mut guard = counter.lock().unwrap(); // unwrap poisoned Mutexes
*guard += 1;
let message = format!("You are visitor number {}", guard);
Ok(Response::new(message.into()))
}


That's easy enough, and now we're done with hello_world. The only problem is rewriting main to pass in a Counter value to it. Let's take a first, naive stab at the problem:

let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let counter: Counter = Arc::new(Mutex::new(0));

let make_svc = make_service_fn(|_conn| async {
Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
});

if let Err(e) = server.await {
eprintln!("server error: {}", e);
}


Unfortunately, this fails due to moving out of captured variables. (That's a topic we cover in detail in our closure training module.)

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:21:58
|
18 |     let counter: Counter = Arc::new(Mutex::new(0));
|         ------- captured outer variable
...
21 |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
|                                                          ^^^^^^^ move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:50
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(|_conn| async {
|  __________________________________________________^
21 | |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))
| |                                        -------------------------------
| |                                        |
| |                                        move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                                        move occurs due to use in generator
22 | |     });
| |_____^ move out of counter occurs here


## Clone

That error isn't terribly surprising. We put our Mutex inside an Arc for a reason: we'll need to make multiple clones of it and pass those around to each new request handler. But we haven't called clone once yet! Again, let's do the most naive thing possible, and change:

Ok::<_, Infallible>(service_fn(|req| hello_world(counter, req)))


into

Ok::<_, Infallible>(service_fn(|req| hello_world(counter.clone(), req)))


This is where the error messages begin to get more interesting:

error[E0597]: counter does not live long enough
--> src\main.rs:21:58
|
20 |       let make_svc = make_service_fn(|_conn| async {
|  ____________________________________-------_-
| |                                    |
| |                                    value captured here
21 | |         Ok::<_, Infallible>(service_fn(|req| hello_world(counter.clone(), req)))
| |                                                          ^^^^^^^ borrowed value does not live long enough
22 | |     });
| |_____- returning this value requires that counter is borrowed for 'static
...
29 |   }
|   - counter dropped here while still borrowed


Both async blocks and closures will, by default, capture variables from their environment by reference, instead of taking ownership. Our closure needs to have a 'static lifetime, and therefore can't hold onto a reference to data in our main function.

## move all the things!

The standard solution to this is to simply sprinkle moves on each async block and closure. This will force each closure to own the Arc itself, not a reference to it. Doing so looks simple:

let make_svc = make_service_fn(move |_conn| async move {
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
});


And this does in fact fix the error above. But it gives us a new error instead:

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:60
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
|  ____________________________________________________________^
21 | |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter.clone(), req)))
| |                                        --------------------------------------------
| |                                        |
| |                                        move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                                        move occurs due to use in generator
22 | |     });
| |_____^ move out of counter occurs here


## Double the closure, double the clone!

Well, even this error makes a lot of sense. Let's understand better what our code is doing:

• Creates a closure to pass to make_service_fn, which will be called for each new incoming connection
• Within that closure, creates a new closure to pass to service_fn, which will be called for each new incoming request on an existing connection

This is where the trickiness of working directly with Hyper comes into play. Each of those layers of closure need to own their own clone of the Arc. And in our code above, we're trying to move the Arc from the outer closure's captured variable into the inner closure's captured variable. If you squint hard enough, that's what the error message above is saying. Our outer closure is an FnMut, which must be callable multiple times. Therefore, we cannot move out of its captured variable.

It seems like this should be an easy fix: just clone again!

let make_svc = make_service_fn(move |_conn| async move {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
});


And this is the point at which we hit a real head scratcher: we get almost exactly the same error message:

error[E0507]: cannot move out of counter, a captured variable in an FnMut closure
--> src\main.rs:20:60
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- captured outer variable
19 |
20 |       let make_svc = make_service_fn(move |_conn| async move {
|  ____________________________________________________________^
21 | |         let counter_clone = counter.clone();
| |                             -------
| |                             |
| |                             move occurs because counter has type Arc<std::sync::Mutex<usize>>, which does not implement the Copy trait
| |                             move occurs due to use in generator
22 | |         Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
23 | |     });
| |_____^ move out of counter occurs here


What we need to do is to rewrite our code ever so slightly so reveal what the problem is. Let's add a bunch of unnecessary braces. We'll convert the code above:

let make_svc = make_service_fn(move |_conn| async move {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| hello_world(counter_clone.clone(), req)))
});


into this semantically identical code:

let make_svc = make_service_fn(move |_conn| { // outer closure
async move { // async block
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| { // inner closure
hello_world(counter_clone.clone(), req)
}))
}
});


The error message is basically identical, just slightly different source locations. But now I can walk through the ownership of counter more correctly. I've added comments to highlight three different entities in the code above that can take ownership of values via some kind of environment:

• The outer closure, which handles each connection
• An async block, which forms the body of the outer closure
• The inner closure, which handles each request

In the original structuring of the code, we put move |_conn| async move next to each other on one line, which—at least for me—obfuscated the fact that the closure and async block were two completely separate entities. With that change in place, let's track the ownership of counter:

1. We create the Arc in the main function; it's owned by the counter variable.
2. We move the Arc from the main function's counter variable into the outer closure's captured variables.
3. We move the counter variable out of the outer closure and into the async block's captured variables.
4. Within the body of the async block, we create a clone of counter, called counter_clone. This does not move out of the async block, since the clone method only requires a reference to the Arc.
5. We move the Arc out of the counter_clone variable and into the inner closure.
6. Within the body of the inner closure, we clone the Arc (which, as explained in (4), doesn't move) and pass it into the hello_world function.

Based on this breakdown, can you see where the problem is? It's at step (3). We don't want to move out of the outer closure's captured variables. We try to avoid that move by cloning counter. But we clone too late! By using counter from inside an async move block, we're forcing the compiler to move. Hurray, we've identified the problem!

## Non-solution: non-move async

It seems like we were simply over-ambitious with our "sprinkling move" attempt above. The problem is that the async block is taking ownership of counter. Let's try simply removing the move keyword there:

let make_svc = make_service_fn(move |_conn| {
async {
let counter_clone = counter.clone();
Ok::<_, Infallible>(service_fn(move |req| {
hello_world(counter_clone.clone(), req)
}))
}
});


Unfortunately, this isn't a solution:

error: captured variable cannot escape FnMut closure body
--> src\main.rs:21:9
|
18 |       let counter: Counter = Arc::new(Mutex::new(0));
|           ------- variable defined here
19 |
20 |       let make_svc = make_service_fn(move |_conn| {
|                                                 - inferred to be a FnMut closure
21 | /         async {
22 | |             let counter_clone = counter.clone();
| |                                 ------- variable captured here
23 | |             Ok::<_, Infallible>(service_fn(move |req| {
24 | |                 hello_world(counter_clone.clone(), req)
25 | |             }))
26 | |         }
| |_________^ returns an async block that contains a reference to a captured variable, which then escapes the closure body
|
= note: FnMut closures only have access to their captured variables while they are executing...
= note: ...therefore, they cannot allow references to captured variables to escape


The problem here is that the outer closure will return the Future generated by the async block. And if the async block doesn't move the counter, it will be holding a reference to the outer closure's captured variables. And that's not allowed.

## Real solution: clone early, clone often

OK, undo the async move to async transformation, it's a dead end. It turns out that all we've got to do is clone the counter before we start the async move block, like so:

let make_svc = make_service_fn(move |_conn| {
let counter_clone = counter.clone(); // this moved one line earlier
async move {
Ok::<_, Infallible>(service_fn(move |req| {
hello_world(counter_clone.clone(), req)
}))
}
});


Now, we create a temporary counter_clone within the outer closure. This works by reference, and therefore doesn't move anything. We then move the new, temporary counter_clone into the async move block via a capture, and from there move it into the inner closure. With this, all of our closure captured variables remain unmoved, and therefore the requirements of FnMut are satisfied.

And with that, we can finally enjoy the glory days of Geocities visitor counters!

## Async closures

The formatting recommended by rustfmt hides away the fact that there are two different environments at play between the outer closure and the async block, by moving the two onto a single line with move |_conn| async move. That makes it feel like the two entities are somehow one and the same. But as we've demonstrated, they aren't.

Theoretically this could be solved by having an async closure. I tested with #![feature(async_closure)] on nightly-2021-03-02, but couldn't figure out a way to use an async closure to solve this problem differently than I solved it above. But that may be my own lack of familiarity with async_closure.

For now, the main takeaway is that closures and async blocks are two different entities, each with their own environment.

If you liked this post you may also be interested in: