Planet Haskell

September 25, 2017

Functional Jobs

Haskell or Scala engineer at Courex (Full-time)

What we do

Courex, a subsidiary of Keppel T&T, is an 8 year old ecommerce logistics company driven by technology. We help our customers manage their supply chain so they can focus on selling. We do the following

  • last mile delivery
  • warehousing
  • omnichannel integration

Our operations is driven by technology. Some interesting stuff

  • We run a hybrid crowd-sourced(uber style) + fixed fleet model.
  • We built an automated parcel dimension measurement machine using Kinect
  • We have autonomous robots coming in late 2017 to pick and sort parcels

Experience a different sort of scale. Not bits and bytes, but parcels, machines and people. Your work affects the real world in a huge traditional industry.

As part of the Keppel group, your work will reach the supply chain across Southeast Asia and China. Help us digitise the supply chain.

What are we looking for

We have openings in 2 teams. The inventory management product, which is written in Haskell, and the inventory synchronisation product which is written in Scala. Getting the inventory right is crucial in the supply chain, and functional programming gives us the confidence to do that.

The inventory management product manages how inventory flows in and out of the various warehouses in the region. Whereas the inventory synchronisation team synchronises the state of the inventory to various places such as Amazon, Lazada and Shopee.

We are looking for people interested in functional programming. It doesn't matter if you don't have working experience in them. Qualifications are not necessary. We like people who are practical and prolific. We are expanding to Southeast Asia. Our HQ is in Singapore, but you can work from one of Malaysia, Indonesia or Vietnam.

Get information on how to apply for this position.

September 25, 2017 07:34 AM

September 23, 2017

Christopher Allen

Alternatives to Typed Holes for talking to your compiler

Rejected title: Type Praxis

I frequently see people recommend that others use typed holes. I think people are more apt to recommend typed holes than the alternatives because it’s a bespoke feature intended to enable discovering the type of a sub-expression more easily. Which is fair enough, except it doesn’t really have a good use-case! I will demonstrate in this post why.

I frequently find myself relying on GHC Haskell’s features in order to off-load brain effort. The idea behind typed holes is that if you have an incomplete expression and aren’t sure what type the remaining part should be, you can ask the compiler! Lets reuse the example from the Haskell Wiki:

pleaseShow :: Show a => Bool -> a -> Maybe String
pleaseShow False _ = Nothing
pleaseShow True a = Just (show _a)

The idea here is that we aren’t sure what should go at the end of the final line and we’re using _a to ask GHC what the type of _a should be. You get a type error that tries to describe the typed hole as follows:

    • Found hole: _a :: a0
      Where: ‘a0’ is an ambiguous type variable
      Or perhaps ‘_a’ is mis-spelled, or not in scope
    • In the first argument of ‘show’, namely ‘_a’
      In the first argument of ‘Just’, namely ‘(show _a)’
      In the expression: Just (show _a)
    • Relevant bindings include
        a :: a
        pleaseShow :: Bool -> a -> Maybe String

Okay so here’s the problem. There’s a Show constraint on a but the typed hole message doesn’t bother saying so:

    • Found hole: _a :: a0

This represents sort of a problem. Typeclass constraints aren’t always as syntactically obvious as they are from the declaration of pleaseShow here:

pleaseShow :: Show a => Bool -> a -> Maybe String

Sometimes they arise from other sub-expressions in your code and aren’t manifest in the type of your declaration!

You can’t productively point new people to typed holes because they’ll get extremely confused about type variables that have no constraints. If they’re reading good learning material, they’ll know that means they can’t actually do anything with something that is parametrically polymorphic. Even that framing aside, they just won’t know what terms are available to them for anything polymorphic.

Then we come to the expert. The expert is more likely to be working with code leveraging typeclasses and polymorphism and therefore…typed holes is of less help to them. If they’re aware of what typeclass constraints are attached to a type variable, fine, but the compiler is still forcing the programmer to juggle more context in their head than is really necessary.

In which I offer a better alternative

pleaseShow :: Show a => Bool -> a -> Maybe String
pleaseShow False _ = Nothing
pleaseShow True a =
  let x :: z
      x = a
  in Just (show undefined)

This time we get an error that mentions where the original type came from along with the relevant typeclass constraints:

    • Couldn't match expected type ‘z’ with actual type ‘a’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          pleaseShow :: forall a. Show a => Bool -> a -> Maybe String
      ‘z’ is a rigid type variable bound by
        the type signature for:
          x :: forall z. z

Keep in mind, this isn’t perfect! It’s not strictly the same as typed holes either as it’s more about contradicting the compiler about what type a had in order to discover what its type is. However, at least this way, we get a more complete picture of what the type of a is. Also note how I used undefined in order to ignore the parts of my code I wasn’t interested in getting errors about. This isn’t a perfect fit here as it results in GHC wanting to know which type it’s meant to expect from undefined, but in more typical circumstances, it works great for positing hypotheticals without bothering to write the actual code.

We’re about to do something more gnarly looking in the next section, the tl;dr is this:


Use let expressions, undefined, impossible types and the like instead of typed holes. And don’t recommend Typed Holes to new people, they’re more confusing than helpful and the facilities of typed holes don’t scale well to more complicated contexts anyway.

Tackling slightly more complicated situations

Warning: If you haven’t worked through about 2/3s of the Haskell Book or possess the equivalent practice and knowledge, you are unlikely to grok this section.

Sometimes you want to be able to posit something or lay down types for sub-expressions in a situation where you have a polymorphic type arising from a typeclass instance or function declaration. In those situations, knowing how to combine ScopedTypeVariables, InstanceSigs, and let expressions can be very valuable!

What if we’re stumped on something like this?

doubleBubble :: ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =

So we try to start by assigning a type to a sub-expression:

doubleBubble :: ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =
  let x :: z
      x = f
  in undefined

And get the following type error:

    • Couldn't match expected type ‘z’
                  with actual type ‘f1 (f2 (a -> b))’

Fair enough, what if we try to make the types agree?

doubleBubble :: ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =
  let x :: f1 (f2 (a -> b))
      x = f
  in undefined

We get a type error?!

    • Couldn't match type ‘f1’ with ‘f4’
      ‘f1’ is a rigid type variable bound by
        the type signature for:
          doubleBubble :: forall (f1 :: * -> *) (f2 :: * -> *) a b.
                          (Applicative f1, Applicative f2) =>
                          f1 (f2 (a -> b)) -> f1 (f2 a) -> f1 (f2 b)
      ‘f4’ is a rigid type variable bound by
        the type signature for:
          x :: forall (f4 :: * -> *) (f5 :: * -> *) a1 b1. f4 (f5 (a1 -> b1))

The issue is that types usually only last the scope of a single type signature denoted by ::, so the variables f1, a, b, and the like can only be referenced in our declaration. That kinda sucks, how do we keep referring to the same type variables under our declaration? ScopedTypeVariables!

{-# LANGUAGE ScopedTypeVariables #-}

doubleBubble :: forall f1 f2 a b
              . ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =
  let x :: f1 (f2 (a -> b))
      x = f
  in undefined

This now type-checks because we used forall to tell GHC that we wanted those variables to be lexically scoped! Now we’re really cooking with gas. Lets follow a chain of experiments and how they change our type errors:

doubleBubble :: forall f1 f2 a b
              . ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =
  let x :: z
      x = fmap (<*>) f
  in undefined
    • Couldn't match expected type ‘z’
                  with actual type ‘f1 (f2 a -> f2 b)’
doubleBubble :: forall f1 f2 a b
              . ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b)
doubleBubble f ffa =
  let x :: z
      x = (fmap (<*>) f) <*> ffa
  in undefined
    • Couldn't match expected type ‘z’ with actual type ‘f1 (f2 b)’
-- this typechecks.
doubleBubble :: forall f1 f2 a b
              . ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b) -- <---
doubleBubble f ffa =      ------ hmm
  let x :: f1 (f2 b) -- <--------
      x = (fmap (<*>) f) <*> ffa
  in undefined

And now we’re done:

doubleBubble :: forall f1 f2 a b
              . ( Applicative f1
                , Applicative f2 )
             => f1 (f2 (a -> b))
             -> f1 (f2 a)
             -> f1 (f2 b) -- <---
doubleBubble f ffa =      ------ hmm
  let x :: f1 (f2 b) -- <--------
      x = (fmap (<*>) f) <*> ffa
  in x

The intuition here is that we have to applicatively (monoidal functor, remember?) combine the * -> * kinded structure twice,

   f1 (f2 (a -> b))
-- <>  <>
-> f1 (f2     a)

Once for f1 of the function and f1 of the value, once for f2 of the function and f2 of the value.

Prelude> :t (\f a -> f <*> a)
(\f a -> f <*> a) :: Applicative f => f (a -> b) -> f a -> f b
Prelude> :t (\f a -> (fmap (<*>) f) <*> a)
(\f a -> (fmap (<*>) f) <*> a)
  :: (Applicative f, Applicative f1) =>
     f1 (f (a -> b)) -> f1 (f a) -> f1 (f b)

The following doesn’t fit because we end up triggering the Reader (function type) Applicative:

Prelude> :t (\f a -> ((<*>) f) <*> a)
(\f a -> ((<*>) f) <*> a)
  :: (a1 -> a -> b) -> ((a1 -> a) -> a1) -> (a1 -> a) -> b

Rewriting the working solution a little:

apply = (<*>)
doubleAp f a = apply (fmap apply f) a
Prelude> let apply = (<*>)
Prelude> let doubleAp f a = apply (fmap apply f) a
Prelude> :t doubleAp
  :: (Applicative f1, Applicative f) =>
     f1 (f (a -> b)) -> f1 (f a) -> f1 (f b)

Then breaking down:

doubleAp f a = apply (fmap apply f) a
--             [1]    [2]  [3]
  1. This apply grafts in the pre-lifted apply, cf.
Prelude> import Data.Void
Prelude> let v :: Void; v = undefined
Prelude> let doubleAp f a = v (fmap apply f) a

<interactive>:104:20: error:
    • Couldn't match expected type ‘f1 (f a -> f b) -> t1 -> t’
                  with actual type ‘Void’
  1. This fmap lifts a regular apply into a type that can graft together two values embedded in f such that the type is: f a -> f b, cf.
Prelude> let doubleAp f a = apply (v apply f) a

<interactive>:105:27: error:
    • Couldn't match expected type ‘(f0 (a0 -> b0) -> f0 a0 -> f0 b0)
                                    -> t -> f (a -> b)’
                  with actual type ‘Void’
  1. This is the apply lifted by fmap, transformed from:

(f0 (a0 -> b0) into f0 a0 -> f0 b0

(The void error here is less useful)

Kicking in the contradiction we get for a if we replace it with the Void typed v variable:

Prelude> let doubleAp f a = apply (fmap apply f) v

<interactive>:107:41: error:
    • Couldn't match expected type ‘f1 (f a)’ with actual type ‘Void’
    • In the second argument of ‘apply’, namely ‘v’
      In the expression: apply (fmap apply f) v

Not bad eh? I find it’s better to teach people these techniques than to point them to typed holes, but reasonable minds disagree. Even when a learner is relatively early in the learning process, these techniques can be made approachable/digestible.

That’s all folks. Below is just a demonstration of the missing-constraint problem with an example from the Haskell Wiki.

Re-demonstration of the missing constraint problem using the Haskell Wiki’s example

module FreeMonad where
data Free f a
  = Pure a
  | Free (f (Free f a))

-- These are just to shut the compiler up, we
-- are not concerned with these right now.
instance Functor f => Functor (Free f) where
  fmap = undefined

instance Functor f => Applicative (Free f) where
  pure = undefined
  (<*>) = undefined

-- Okay, we do care about the Monad though.
instance Functor f => Monad (Free f) where
  return a     = Pure a
  Pure a >>= f = f a
  Free f >>= g = Free _a
code/FreeMonad.hs:20:23: error:
    • Found hole: _a :: f (Free f b)
      Where: ‘f’ is a rigid type variable bound by
               the instance declaration at code/FreeMonad.hs:17:10
             ‘b’ is a rigid type variable bound by
               the type signature for:
                 (>>=) :: forall a b. Free f a -> (a -> Free f b) -> Free f b
               at code/FreeMonad.hs:19:10
      Or perhaps ‘_a’ is mis-spelled, or not in scope
    • In the first argument of ‘Free’, namely ‘_a’
      In the expression: Free _a
      In an equation for ‘>>=’: (Free f) >>= g = Free _a
    • Relevant bindings include
        g :: a -> Free f b (bound at code/FreeMonad.hs:20:14)
        f :: f (Free f a) (bound at code/FreeMonad.hs:20:8)
        (>>=) :: Free f a -> (a -> Free f b) -> Free f b
          (bound at code/FreeMonad.hs:19:3)
Failed, modules loaded: none.

^^ Look ma, no Functor.

_a :: f (Free f b)
instance Functor f => Monad (Free f) where

Not consistently, but I’m more likely to get better type errors when I create contradictions manually via let expressions than I am using typed holes.

I know this site is a bit of a disaster zone, but if you like my writing or think you could learn something useful from me, please take a look at the Haskell book I've been writing. There's a free sample available too!

September 23, 2017 12:00 AM

September 22, 2017

Tweag I/O

GHC compiler plugins in the wild:<br/> typing Java

Facundo Domínguez, Mathieu Boespflug

Previously, we discussed how to use inline-java to call any Java function from Haskell. The reverse is also possible, though that will be a topic for a future post. In this post, we'll peek underneath the hood to talk a little about how inline-java does its deed.

You might find it an interesting read for at least the following reason: since the latest v0.7 release of inline-java, it's an example use of a recent feature of GHC called compiler plugins. These allow you to introspect and transform types and the abstract syntax tree before handing them off to later stages of the compiler pipeline. We use this to good effect in order to check that argument and return types on the Java side line up with those on the Haskell side (and vice versa).

Calling Java

inline-java makes it possible to invoke code written in Java using a Haskell language feature known as quasiquotes.

{-# LANGUAGE QuasiQuotes #-}
import Language.Java (withJVM)
import Language.Java.Inline

main :: IO ()
main = withJVM [] $ do
    let x = 1.5 :: Double
    y <- [java| { System.out.println($x);
                  return $x + 1;
                } |]
    print (y :: Double)

The function withJVM starts an instance of the Java Virtual Machine (JVM), and the java quasiquotation executes the Java code passed to it as a block of statements.

In this example, the Haskell value x of type Double is coerced into a Java value of primitive type double, which is then used whenever the antiquoted variable $x appears inside the quasiquotation. When the quasiquotation finishes executing, the Java value resulting from evaluating $x + 1 is coerced back to a Haskell value of type Double.

GHC doesn't parse or generate any Java. Neither does inline-java. So how can this program possibly work? The answer is that inline-java feeds the quasiquotation to the javac compiler, which generates some bytecode that is stored in the object file of the module. At runtime, inline-java arranges for the bytecode to be handed to the JVM using the jni package. Finally, inline-java makes use of the jvm package to have the bytecode executed.

Type safety

A notable characteristic of this approach is that we know at compile time if types are correct. We know that Java won't return an object if on the Haskell side we expect a double, because the Java side knows it's on the hook for handing us a double. javac will raise a compile time error if the Java code doesn't do that. Even if the Haskell side expected an object, say of type java.util.List, the Java quasiquotation can't return an object of type java.lang.String either. And conversely for arguments, Java and Haskell need to agree on the type of arguments, or a compile-time error ensues.

Given that no one compiler analyses both languages, how can type-checking work across language boundaries? Fortunately, both compilers can be put to cooperate on the task. First, GHC infers the types of the antiquoted variables and the return type which is expected of the quasiquotation. Then, these types are translated to Java types. The translation is conducted by a machinery of type classes living in the jvm package. The details of this process are not important at this point. What matters is that it enables us to translate types across languages. For instance,

Haskell type Java type
Double double
[Double] double[]
ByteString byte[]
Text java.lang.String

The translated types are passed to javac together with the rest of the quasiquoted Java code. In our running example this would be

double fresh_name(double $x) {
    return $x + 1;

Finally, the javac compiler type checks the quasiquotation. Type mismatches will be discovered and reported at this stage.

It turns out that the first step is by far the most intricate. Specifically, for inline-java to query the types that GHC inferred for the antiquoted variables, and also query the type of the entire quasiquotation.

Looking for the types

At first, it appears as if determining these types is trivial. There is a Template Haskell primitive called reify.

reify :: Name -> Q Info

data Info =
    | VarI Name Type (Maybe Dec)	

Given an antiquoted variable $x, we ought to be able to use reify 'x to determine its Haskell type. Well, except that this doesn't quite work, because type checking is not finished when reify gets evaluated. From there, we went down a rabbit hole of trying to propose patches to Template Haskell to reliably get our hands on the inferred types. If you want to follow the intricacies of our journey, here are the related GHC issues for your amusement: initial discussion, 12777, 12778, 13608.

After many discussions with Simon Peyton Jones, and some deal of creative hacking, we could kind of get the inferred types for antiquoted variables, but only for as long as the java quasiquotation didn't appear inside Template Haskell brackets ([| ... |]). Moreover, we made no progress getting the expected type of the quasiquotation. Every idea we came up with required difficult compromises in the design. In the meantime, we had to choose between checking the type of the values returned by quasiquotations at runtime or using unsafe coercions, neither of which is an attractive option.

Eventually, we learnt that Template Haskell was not the only way to query the output of the type checker.

Enter GHC Core plugins

The GHC compiler uses an explicitly typed intermediate language known as Core. All type applications of terms in Core are explicit, making it possible to learn the types inferred at the type checking phase by inspecting Core terms. In order to get our hands on Core terms, we can use Core plugins. We could think of a Core plugin as a set of Core-to-Core passes that we can ask GHC to add to the compilation pipeline. The passes can be inserted anywhere in the Core pipeline, and in particular, they can be inserted right after desugaring, the phase which generates Core from the abstract syntax tree of a Haskell program.

Quasiquotations disappear from the abstract syntax tree when Template Haskell is executed. This happens well before the plugin passes. In order to enable the plugin to find the location of the quasiquotations, the quasiquoter can insert some artificial function call as a beacon or marker. In inline-java, our example program looks something as follows after Template Haskell runs.

main :: IO ()
main = withJVM [] $ do
    let x = 1.5 :: Double
    y <- qqMarker
	   "{ System.out.println($x); return $x + 1; }"
    print (y :: Double)

qqMarker :: forall args r. String -> args -> IO r
qqMarker = error "inline-java: The Plugin is not enabled."

The GHC Plugin is supposed to replace the call to qqMarker with an appropriate call to the generated Java method. The all-important point, however, is that the calls to qqMarker are annotated with the types we want to determine in Core.

main :: IO ()
main = ...
         @ Double
         @ Double
         "{ System.out.println($x); return $x + 1; }"

The type parameters provide us with the type of the antiquoted variable and the expected type of the quasiquotation. From here, the plugin has all the information it needs to generate the Java code to feed to javac. In addition, the plugin can inject the generated bytecode in the object file of the module, and it arranges for this bytecode to be located at runtime so it can be loaded in the JVM.

Now the user needs to remember to tell GHC to use the plugin by passing it the option -fplugin=Language.Java.Inline.Plugin. But this is only until Template Haskell learns the ability to tell GHC which plugins to use.


By using a GHC plugin, we have simplified inline-java from a complicated spaghetti which sprung from attempting to use Template Haskell's reify and didn't fully addressed the type lookup problem in a robust way. Now we have a straight forward story which starts by introducing the qqMarker beacons, attaches the Java bytecode in the plugin phase and ends by loading it at runtime into the JVM.

Writing a compiler plugin is similar to writing Template Haskell code. Both approaches need to manipulate abstract syntax trees. The plugin approach can be regarded as more coupled with a particular version of the compiler, since it relies on the internal Core language. However, Core changes relatively little over the years, and anyway, a pass that looks for some markers is hardly going to change a lot even if Core did change.

Many thanks to Simon Peyton Jones for his patience to walk with us over our attempts to fix Template Haskell. Without this dialog with the compiler implementors, it would have been difficult for us to explore as much of the design space as we needed to.

September 22, 2017 12:00 AM

September 21, 2017

Brent Yorgey

New baby, and Haskell Alphabet

My wife and I just had a baby!

If you missed seeing me at ICFP, this is why.

In honor of my son’s birth (he will need to learn the alphabet and Haskell soon)—and at the instigation of Kenny Foner—I revived the Haskell Alphabet by converting it to modern Hakyll and updating some of the broken or outdated links. Some of it is a bit outdated (I wrote it seven years ago), but it’s still a fun little piece of Haskell history. Enjoy!

by Brent at September 21, 2017 06:46 PM

September 20, 2017

Neil Mitchell

Shake 0.16 - revised rule definitions

Summary: I've just released shake v0.16. A lot has changed, but it's probably only visible if you have defined your own rules or oracles.

Shake-0.16 is now out, 8 months since the last release, and with a lot of improvements. For full details read the changelog, but in this post I'm going to go through a few of the things that might have the biggest impact on users.

Rule types redefined

Since the first version of Shake there has been a Rule key value type class defining all rule types - for instance the file rule type has key of filename and value of modification time. With version 0.16 the type class is gone, rules are harder to write, but offer higher performance and more customisation. For people using the builtin rule types, you'll see those advantages, and in the future see additional features that weren't previously possible. For people defining custom rule types, those will require rewriting - read the docs and if things get tough, ask on StackOverflow.

The one place many users might encounter the changes are that oracle rules now require a type instance defining between the key and value types. For example, if defining an oracle for the CompilerVersion given the CompilerName, you would have to add:

type instance RuleResult CompilerName = CompilerVersion

As a result of this type instance the previously problematic askOracle can now infer the result type, removing possible sources of error and simplifying callers.

The redefining of rule types represents most of the work in this release.

Add cmd_

The cmd_ function is not much code, but I suspect will turn out to be remarkably useful. The cmd function in Shake is variadic (can take multiple arguments) and polymorphic in the return type (you can run it in multiple monads with multiple results). However, because of the overloading, if you didn't use the result of cmd it couldn't be resolved, leading to ugly code such as () <- cmd args. With cmd_ the result is constrained to be m (), so cmd_ args can be used.

Rework Skip/Rebuild

Since the beginning Shake has tried to mirror the make command line flags. In terms of flags to selectively control rebuilding, make is based entirely on ordered comparison of timestamps, and flags such as --assume-new don't make a lot of sense for Shake. In this release Shake stops trying to pretend to be make, removing the old flags (that never worked properly) and adding --skip (don't build something even if it is otherwise required) and --build (build something regardless). Both these flags can take file patterns, e.g, --build=**/*.o to rebuild all object files. I don't think these flags are finished with, but it's certainly less of a mess than before.

by Neil Mitchell ( at September 20, 2017 07:53 PM

Mark Jason Dominus

Gompertz' law for wooden utility poles

Gompertz' law says that the human death rate increases exponentially with age. That is, if your chance of dying during this year is , then your chance of dying during next year is for some constant . The death rate doubles every 8 years, so the constant is empirically around . This is of course mathematically incoherent, since it predicts that sufficiently old people will have a mortality rate greater than 100%. But a number of things are both true and mathematically incoherent, and this is one of them. (Zipf's law is another.)

The Gravity and Levity blog has a superb article about this from 2009 that reasons backwards from Gompertz' law to rule out certain theories of mortality, such as the theory that death is due to the random whims of a fickle god. (If death were entirely random, and if you had a 50% chance of making it to age 70, then you would have a 25% chance of living to 140, and a 12.5% chance of living to 210, which we know is not the case.)

Gravity and Levity says:

Surprisingly enough, the Gompertz law holds across a large number of countries, time periods, and even different species.

To this list I will add wooden utility poles.

A couple of weeks ago Toph asked me why there were so many old rusty staples embedded in the utility poles near our house, and this is easy to explain: people staple up their yard sale posters and lost-cat flyers, and then the posters and flyers go away and leave behind the staples. (I once went out with a pliers and extracted a few dozen staples from one pole; it was very satisfying but ultimately ineffective.) If new flyer is stapled up each week, that is 52 staples per year, and 1040 in twenty years. If we agree that 20 years is the absolute minimum plausible lifetime of a pole, we should not be surprised if typical poles have hundreds or thousands of staples each.

But this morning I got to wondering what is the expected lifetime of a wooden utility pole? I guessed it was probably in the range of 40 to 70 years. And happily, because of the Wonders of the Internet, I could look it up right then and there, on the way to the trolley stop, and spend my commute time reading about it.

It was not hard to find an authoritative sounding and widely-cited 2012 study by electric utility consultants Quanta Technology.

Summary: Most poles die because of fungal rot, so pole lifetime varies widely depending on the local climate. An unmaintained pole will last 50–60 years in a cold or dry climate and 30-40 years in a hot wet climate. Well-maintained poles will last around twice as long.

Anyway, Gompertz' law holds for wooden utility poles also. According to the study:

Failure and breakdown rates for wood poles are thought to increase exponentially with deterioration and advancing time in service.

The Quanta study presents this chart, taken from the (then forthcoming) 2012 book Aging Power Delivery Infrastructures:

The solid line is the pole failure rate for a particular unnamed utility company in a median climate. The failure rate with increasing age clearly increases exponentially, as Gompertz' law dictates, doubling every 12½ years or so: Around 1 in 200 poles fails at age 50, around 1 in 100 of the remaining poles fails at age 62.5, and around 1 in 50 of the remaining poles fails at age 75.

(The dashed and dotted lines represent poles that are removed from service for other reasons.)

From Gompertz' law itself and a minimum of data, we can extrapolate the maximum human lifespan. The death rate for 65-year-old women is around 1%, and since it doubles every 8 years or so, we find that 50% of women are dead by age 88, and all but the most outlying outliers are dead by age 120. And indeed, the human longevity record is currently attributed to Jeanne Calment, who died in 1997 at the age of 122½.

Similarly we can extrapolate the maximum service time for a wooden utility pole. Half of them make it to 90 years, but if you have a large installed base of 110-year-old poles you will be replacing about one-seventh of them every year and it might make more sense to rip them all out at once and start over. At a rate of one yard sale per week, a 110-year-old pole will have accumulated 5,720 staples.

The Quanta study does not address deterioration of utility poles due to the accumulation of rusty staples.

by Mark Dominus ( at September 20, 2017 06:41 PM

Dominic Orchard

Scrap Your Reprinter

Back in 2013, Andrew Rice and I were doing some initial groundwork on how to build tools to help scientists write better code (e.g., with the help of refactoring tools and verification tools). We talked to a lot of scientists who wrote Fortran almost exclusively, so we started creating infrastructure for building tools to work on Fortran. This was the kernel of the CamFort project for which we got an EPSRC grant in 2015 (which is ongoing). The CamFort tool now has a couple of fairly well developed specification/verification features, and a few refactoring features. Early on, I started building everything in Haskell using the brilliant uniplate library, based on the Scrap Your Boilerplate [1] work. This helped us to get the tool off the ground quickly by utilising the power of datatype generic programming. Fairly quickly we hit upon an interesting problem with building refactoring tools: how do you output source code for a refactored AST whilst preserving all the original comments and white space? It is not enough just to pretty print the AST, unless your AST contains all the comments and layout information. Building a parser to capture all this information is extremely hard, and we use a parser generator which limits flexibility (but is really useful for a large grammar). Another approach is to output patch/edit information for the original source code, calculated from the AST.

In the end, I came up with a datatype generic algorithm which I call the reprinter. The reprinter takes the original source code and an updated AST (which contains location information) and maps them into a new piece of source code. Here is an illustration which I’ll briefly explain:


Some source text (arithmetic code in prefix notation here) is parsed into an AST. The AST contains the “spans” of each syntactic fragment: the start position and end position in the original source code (for simplicity in this illustration, just the column number is represented). Some transformation/refactoring is applied next. In this case, the transformation rewrites redundant additions of 0, which happens in the node coming from source locations 10 to 16. The refactored node is marked in red. The reprinting then runs, stitching together the original source code with the updated source tree. A pretty printer is used to generate code for any new nodes, but all the original source text for the other nodes is preserved. The cool thing about this algorithm is that it is datatype generic: it works for any datatype, with some modest side conditions about storing source spans. The implementation uses the Scrap Your Zipper [2] library to do a context-dependent generic traversal of a datatype. In essence, the algorithm is similar to what one might do if you were to spit out edit information from an AST, then apply this to a piece of source text. But, the algorithm does this generically, and in a single simultaneous pass of the AST and the input source text.

I’ve always thought it was a cute and useful algorithm, which combined some cool techniques from functional programming.  As with all the “Scrap Your X” libraries it saves huge amounts of time and messing around, especially when your AST representation keeps changing (which it did/does for us). The algorithm is really useful anywhere you need to update human-written source code in a layout-preserving way; for example, IDEs and refactoring tools but also in interactive theorem provers and program synthesis tools, where you need to synthesise source text into some existing human-written code. This is one of the ways it is used in CamFort, where specifications are synthesised from code analysis data and then inserted as comments into user code.

This summer, I was fortunate enough to have the resources to hire several interns. One of the interns, Harry Clarke, (amongst other things) worked with me to tidy up the code for the reprinter, add some better interfaces, make it usable as a library for others, and write it all up. He presented the work at IFL 2017 and the pre-proceedings version of the paper is available. We are working on a post-proceedings version for December, so any comments gratefully appreciated.

[1] Lämmel, Ralf, and Simon Peyton Jones. Scrap your boilerplate: a practical design pattern for generic programming. Vol. 38. No. 3. ACM, 2003.

[2] Adams, Michael D. “Scrap your zippers: a generic zipper for heterogeneous types.” Proceedings of the 6th ACM SIGPLAN workshop on Generic programming. ACM, 2010.

by dorchard at September 20, 2017 03:20 PM

Wolfgang Jeltsch

Registration for Haskell in Leipzig 2017 is open

Haskell in Leipzig 2017 opened its gates for everyone interested in Haskell and generally functional programming. Expect a great day of talks, tutorials, and a performance with a focus on FRP, followed by a Hackathon. Register early and get your ticket at a reduced rate. Looking forward to meeting you in Leipzig.


Haskell is a modern functional programming language that allows rapid development of robust and correct software. It is renowned for its expressive type system, its unique approaches to concurrency and parallelism, and its excellent refactoring capabilities. Haskell is both the playing field of cutting-edge programming language research and a reliable base for commercial software development.

The workshop series Haskell in Leipzig (HaL), now in its 12th year, brings together Haskell developers, Haskell researchers, Haskell enthusiasts, and Haskell beginners to listen to talks, take part in tutorials, join in interesting conversations, and hack together. To support the latter, HaL will include a one-day hackathon this year. The workshop will have a focus on functional reactive programming (FRP) this time, while continuing to be open to all aspects of Haskell. As in the previous year, the workshop will be in English.

Invited Speaker

Invited Performer


Registration information is available on the web page of the local organizers.

Program Committee

  • Edward Amsden, Plow Technologies, USA
  • Heinrich Apfelmus, Germany
  • Jurriaan Hage, Utrecht University, The Netherlands
  • Petra Hofstedt, BTU Cottbus-Senftenberg, Germany
  • Wolfgang Jeltsch, Tallinn University of Technology, Estonia (chair)
  • Andres Löh, Well-Typed LLP, Germany
  • Keiko Nakata, SAP SE, Germany
  • Henrik Nilsson, University of Nottingham, UK
  • Ertuğrul Söylemez, Intelego GmbH, Germany
  • Henning Thielemann, Germany
  • Niki Vazou, University of Maryland, USA
  • Johannes Waldmann, HTWK Leipzig, Germany

Tagged: conference, FRP, functional programming, Haskell

by Wolfgang Jeltsch at September 20, 2017 02:56 PM

September 19, 2017

Douglas M. Auclair (geophf)

August 2017 1HaskellADay 1Liners Problems and Solutions

  • August 1st, 2017:
  • f :: (Maybe a, b) -> Maybe (a, b) Define points-free.
  • August 1st, 2017:
    Given f above and f a and f b are mutually exclusive in Maybe monad, define
    g :: Maybe (a, b) -> Maybe (a, b) -> (Maybe a, b)
    points free
  • August 1st, 2017:
    Now, let's define the dual of f
    f' :: Maybe (a, b) -> (Maybe a, b)
    points free

by geophf ( at September 19, 2017 03:47 PM

September 18, 2017


Visualizing lazy evaluation

<script> function addScript(filename) { var scriptBlock=document.createElement('script') scriptBlock.setAttribute("type","text/javascript") scriptBlock.setAttribute("src", filename) document.getElementsByTagName("head")[0].appendChild(scriptBlock) } addScript("../../../aux/files/foldrl-animated/length-naive.js"); addScript("../../../aux/files/foldrl-animated/length-acc.js"); addScript("../../../aux/files/foldrl-animated/length-acc-strict.js"); addScript("../../../aux/files/foldrl-animated/mapM_Maybe.js"); </script>

Haskell and other call-by-need languages use lazy evaluation as their default evaluation strategy. For beginners and advanced programmers alike this can sometimes be confusing. At Well-Typed a core part of our business is teaching Haskell, which we do through public courses (such as the upcoming Skills Matter courses Fast Track to Haskell, Haskell Performance and Optimization and The Haskell Type System), private in-house courses targeted at specific client needs, and of course through writing blog posts.

In order to help us design these courses, we developed a tool called visualize-cbn. It is a simple interpreter for a mini Haskell-like language which outputs the state of the program at every step in a human readable format. It can also generate a HTML/JavaScript version with “Previous” and “Next” buttons to step through a program. We released the tool as open source to github and Hackage, in the hope that it will be useful to others.

The file in the repo explains how to run the tool. In this blog post we will illustrate how one might take advantage of it. We will revisit the infamous triple of functions foldr, foldl, foldl', and show how they behave. As a slightly more advanced example, we will also study the memory behaviour of mapM in the Maybe monad. Hopefully, this show-rather-than-tell blog post might help some people understand these functions better.

Throughout this section we will use this definition of enumFromTo:

enumFromTo n m =
  if n <= m then n : enumFromTo (n + 1) m
            else []

so that, say, [1..3] corresponds to (enumFromTo 1 3).


In this section we will examine the difference between these three functions. We will not study these functions directly, however, but study a slightly simpler variant in the form of three definitions of length on lists. For a more in-depth discussion of this triple of functions, see our earlier blog post on this topic.


Consider the naive definition of length:

length xs =
  case xs of
    []      -> 0
    (x:xs') -> 1 + length xs'

This corresponds to defining length = foldr (\x n -> 1 + n) 0.

Let’s consider what happens when we compute length [1..3]; you can click on Prev and Next to step through the execution:

Prev Next (step Step, Status)


When you step through this, notice what is going on:

  • We first apply the definition of length.
  • Then in step 1, length needs to do a case analysis, which forces us apply enumFromTo, and evaluate it until we have a top-level Cons constructor (in step 4)
  • At that point we can execute the pattern match and the process continues.
  • When we evaluate enumFromTo (add 1 1) 3 in step 6, note that the expression add 1 1 is only evaluated once, although it is used twice; this sharing of computation is what makes Haskell a true lazy (call-by-need) language (as opposed to a call-by-name language).
  • In these animations these shared expressions are shown separately below the expression; you can think of this as the “heap”, and accordingly the animation also shows when these expressions are garbage collected (e.g. step 14).

Note as you step through that we have a build up of calls to add which are only resolved until the very end. This is the source of the memory leak in this definition of length (corresponding to foldr).


In a foldl-style definition of length, we introduce an accumulator:

length acc xs =
  case xs of
    []    -> acc
    x:xs' -> length (1 + acc) xs'

This corresponds to defining length = foldl (\n x -> 1 + n) 0.

Unlike the previous definition, this is tail-recursive; however, it still suffers from a memory leak due to Haskell’s extremely lazy nature. You will see why when you step through the execution:

Prev Next (step Step, Status)


Notice how we are still building up a chain of additions, except they are now in the accumulator instead. This chain is only resolved (and garbage collected) at the very end (step 26).


In the final definition, we make sure to evaluate the accumulator as we go:

length acc xs =
  case xs of
    []    -> acc
    x:xs' -> let acc' = add 1 acc in seq acc' (length acc' xs')

This corresponds to defining length = foldl' (\n x -> 1 + n) 0.

If you step through this note that we evaluate the accumulator immediately at each step, and moreover that garbage collection can now happen as we compute as well:

Prev Next (step Step, Status)


mapM over the Maybe monad

As a final example of a slightly more advanced nature, try predicting what will happen when we run this in ghci:

case mapM return [1..] of Just (x:_) -> x

If you try it out and the result is not what you expected, perhaps stepping through the following evaluation of mapM return [1..3] to weak-head normal form (whnf: when there is a constructor at the top-level) will help you understand:

Prev Next (step Step, Status)


Note that this expression reduces to whnf only after the entire list has been evaluated, and moreover that this requires an O(n) nested pattern matches. The take-away point from this example is that mapM should not be applied to long lists in most monads, as this will result in a memory leak.


Laziness can be tricky to understand sometimes, and being able to go through the evaluation of a program step by step can be very helpful. The visualize-cbn tool can be used to generate HTML/JavaScript files that can be used to visualize this evaluation as shown on this blog post; alternatively, it can also write the evaluation trace to the console. The source files (the various definitions of length) can be found in the repo. Feedback and pull requests are of course always welcome :)

by edsko at September 18, 2017 03:45 PM

FP Complete

Cryptographic Hashing in Haskell

The cryptonite library is the de facto standard in Haskell today for cryptography. It provides support for secure random number generic, shared key and public key encryption, message authentification codes (MACs), and—for our purposes today—cryptographic hashes.

For those unfamiliar: a hash function is a function that maps data from arbitrary size to a fixed size. A cryptographic hash function is a hash function with properties suitable for cryptography (see Wikipedia article for more details). A common example of cryptographic hash usage is providing a checksum on a file download to ensure it has not be tampered with. Common algorithms in use today include SHA256, Skein512, and (the slightly outdated) MD5.

The cryptonite library is built on top of the memory library, which provides type classes and convenience functions for dealing with reading and creating byte arrays. You may initially think "shouldn't it all be ByteStrings." We'll get to why the type classes are so helpful later.

Once familiar with these two libraries, they are straightforward to use. However, seeing how all the pieces fit together is difficult from just the API docs, especially understanding where an explicit type signature will be necessary. This post will give a quick overview of the pieces you'll want to be interacting with with simple, runnable examples. By the end, the goal is that you'll be able to trivially understand the API docs themselves.

The runnable examples below will all use the Stack script interpreter support. Make sure you have Stack installed and then, for each example:

  • Copy the contents into a file called Main.hs
  • Run stack Main.hs

Basic typeclasses

You're used to dealing with a a number of different string-like things, between strict and lazy bytestrings and text, plus Strings. However, if I asked you to tell me how to represent a strict sequence of bytes, you'd likely refer to Data.ByteString.ByteString. However, as you'll see through this tutorial, there are multiple data types we'll want to treat as a sequence of bytes.

The memory library defines two typeclasses to help out with this:

  • ByteArrayAccess gives you read-only access to the bytes within a data type.
  • ByteArray gives you read/write access, and is a child class of ByteArrayAccess.

To demonstrate, let's do a pointless conversion between ByteString and Bytes (which we'll explain in a second).

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteArray as BA
import Data.ByteString (ByteString)
import qualified Data.ByteString as B

main :: IO ()
main = do
  B.writeFile "test.txt" "This is a test"
  byteString <- B.readFile "test.txt"
  let bytes :: BA.Bytes
      bytes = BA.convert byteString
  print bytes

We're starting off with some file I/O using the bytestring library (because you should really do I/O with bytestring). Then the convert function can turn that into a Bytes value.

EXERCISE What do you think the type signature of convert is, given the description of the two typeclasses above? You can check if you're right.

Did you notice that explicit type signature I put on the bytes value? Well, that's your next lesson with memory and cryptonite: since so many functions work on type classes instead of concrete types, you'll often end up needing to give GHC some assistance on type inference.

I could show you an example of a data type which is a ByteArrayAccess but not a ByteArray, but it will ring hollow right now. When we get to actual hashing, the distinction in type classes will make a lot more sense. So let's just wait.

Why the different data types?

You may be legitimately wondering why there's a Bytes datatype in memory, when it seems identical to ByteString. In fact: it's not. Bytes has less memory overhead, which it gets by not tracking the offset and length of its slice. In exchange for that: a Bytes value doesn't allow for any slicing. In other words, the drop function on a Bytes would have to create a new copy of the buffer.

In other words: this is all performance stuff. And a library dealing with cryptography generally needs to be more concerned with performance.

Another interesting data type in memory is ScrubbedBytes. This type has three special properties (as called out in its Haddocks):

  • Being scrubbed after its goes out of scope.
  • A Show instance that doesn't actually show any content
  • A Eq instance that is constant time

In other words: it automatically prevents a number of common security holes when dealing with sensitive data.

OK, not much code to look at here, let's get to more fun stuff!

Base 16 encoding/decoding

Let's convert some user input into base 16 (aka hexadecimal):

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import qualified Data.ByteArray          as BA
import           Data.ByteArray.Encoding (convertToBase, Base (Base16))
import           Data.ByteString         (ByteString)
import           Data.Text.Encoding      (encodeUtf8)
import qualified Data.Text.IO            as TIO
import           System.IO               (hFlush, stdout)

main :: IO ()
main = do
  putStr "Enter some text: "
  hFlush stdout
  text <- TIO.getLine
  let bs = encodeUtf8 text
  putStrLn $ "You entered: " ++ show bs
  let encoded = convertToBase Base16 bs :: ByteString
  putStrLn $ "Converted to base 16: " ++ show encoded

The convertToBase will convert the contents of any ByteArrayAccess into a ByteArray using the given base. Other options here besides Base16 include Base64 and others (just check out the docs).

As you can see, I had to put in an explicit ByteString type signature, since otherwise GHC wouldn't know which instance of ByteArrayAccess to use.

As you may guess, there is also a convertFromBase to do the opposite conversion. It returns an Either String byteArray value in case the input is not in the correct format.

EXERCISE Write a program to base 16 decode its input. (Solution follows.)

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import qualified Data.ByteArray          as BA
import           Data.ByteArray.Encoding (convertFromBase, Base (Base16))
import           Data.ByteString         (ByteString)
import           Data.Text.Encoding      (encodeUtf8)
import qualified Data.Text.IO            as TIO
import           System.IO               (hFlush, stdout)

main :: IO ()
main = do
  putStr "Enter some hexadecimal text: "
  hFlush stdout
  text <- TIO.getLine
  let bs = encodeUtf8 text
  putStrLn $ "You entered: " ++ show bs
  case convertFromBase Base16 bs of
    Left e -> error $ "Invalid input: " ++ e
    Right decoded ->
      putStrLn $ "Converted from base 16: " ++ show (decoded :: ByteString)

EXERCISE Write a program to convert input from base 16 to base 64 encoding.

Hashing a strict bytestring

Alright, that's enough of the memory library. Time to do some real crypto stuff. We're going to get the SHA256 hash (aka digest) of some user input:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import           Crypto.Hash             (hash, SHA256 (..), Digest)
import           Data.ByteString         (ByteString)
import           Data.Text.Encoding      (encodeUtf8)
import qualified Data.Text.IO            as TIO
import           System.IO               (hFlush, stdout)

main :: IO ()
main = do
  putStr "Enter some text: "
  hFlush stdout
  text <- TIO.getLine
  let bs = encodeUtf8 text
  putStrLn $ "You entered: " ++ show bs
  let digest :: Digest SHA256
      digest = hash bs
  putStrLn $ "SHA256 hash: " ++ show digest

We've used the hash function to convert a ByteString (or any instance of ByteArrayAccess) into a Digest SHA256. If you're already wondering: yes, you could replace SHA256 with one of the other hash algorithms available.

As before: it's important to use a type signature of Digest SHA256 to let GHC know what kind of hash you want to perform. However, in this case, there's an alternative function you could choose instead:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import           Crypto.Hash             (hashWith, SHA256 (..))
import           Data.ByteString         (ByteString)
import           Data.Text.Encoding      (encodeUtf8)
import qualified Data.Text.IO            as TIO
import           System.IO               (hFlush, stdout)

main :: IO ()
main = do
  putStr "Enter some text: "
  hFlush stdout
  text <- TIO.getLine
  let bs = encodeUtf8 text
  putStrLn $ "You entered: " ++ show bs
  let digest = hashWith SHA256 bs
  putStrLn $ "SHA256 hash: " ++ show digest

The Show instance of Digest will display the digest in hexadecimal/base 16. That's pretty nice. But let's suppose we want to display it in base 64 instead. Get ready for this: Digest is an instance of ByteArrayAccess, so you can use convertToBase. (And it's not an instance of ByteArray, consider why such an instance would be problematic. If you're stumped: read the docs for this function for the answer.)

EXERCISE Display the digest as a base 64 encoded string (solution follows).

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import           Crypto.Hash             (hashWith, SHA256 (..))
import           Data.ByteString         (ByteString)
import           Data.ByteArray.Encoding (convertToBase, Base (Base64))
import           Data.Text.Encoding      (encodeUtf8)
import qualified Data.Text.IO            as TIO
import           System.IO               (hFlush, stdout)

main :: IO ()
main = do
  putStr "Enter some text: "
  hFlush stdout
  text <- TIO.getLine
  let bs = encodeUtf8 text
  putStrLn $ "You entered: " ++ show bs
  let digest = convertToBase Base64 (hashWith SHA256 bs)
  putStrLn $ "SHA256 hash: " ++ show (digest :: ByteString)

Notice how we needed the type signature on digest to make it clear that it's a ByteString.

Check if any files match

Here's a neat little program. The user will provide a number of files as command line arguments. Then we'll print out lists of all the files with identical content (or, at least, matching SHA256s). (Try to notice something memory-inefficient in this implementation; we'll address it later.)

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import           Crypto.Hash             (Digest, SHA256, hash)
import qualified Data.ByteString         as B
import           Data.Foldable           (forM_)
import           Data.Map.Strict         (Map)
import qualified Data.Map.Strict         as Map
import           System.Environment      (getArgs)

readFile' :: FilePath -> IO (Map (Digest SHA256) [FilePath])
readFile' fp = do
  bs <- B.readFile fp
  let digest = hash bs -- notice lack of type signature :)
  return $ Map.singleton digest [fp]

main :: IO ()
main = do
  args <- getArgs
  m <- Map.unionsWith (++) <$> mapM readFile' args
  forM_ (Map.toList m) $ \(digest, files) ->
    case files of
      [] -> error "can never happen"
      [_] -> return () -- only one file
      _ -> putStrLn $ show digest ++ ": " ++ unwords (map show files)

EXERCISE Write a program that will print out the SHA256 for every file name passed in on the command line.

QUESTION What's the inefficiency in the code above? You'll see in the next section.

More efficient file hashing

If we tried implementing our program from above without hashing, we'd either have to hold the entire file contents of each file in memory at once, or do some weird O(n^2) pairwise comparisons. Our hash-based implementation is better. But it's still a problem: it uses Data.ByteString.readFile, causing possibly unbounded memory usage. There's a more efficient way to hash entire files, using cryptonite-conduit:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import           Crypto.Hash         (Digest, SHA256, hash)
import           Crypto.Hash.Conduit (hashFile)
import           Data.Foldable       (forM_)
import           Data.Map.Strict     (Map)
import qualified Data.Map.Strict     as Map
import           System.Environment  (getArgs)

readFile' :: FilePath -> IO (Map (Digest SHA256) [FilePath])
readFile' fp = do
  digest <- hashFile fp
  return $ Map.singleton digest [fp]

main :: IO ()
main = do
  args <- getArgs
  m <- Map.unionsWith (++) <$> mapM readFile' args
  forM_ (Map.toList m) $ \(digest, files) ->
    case files of
      [] -> error "can never happen"
      [_] -> return () -- only one file
      _ -> putStrLn $ show digest ++ ": " ++ unwords (map show files)

Pretty simple change (in fact, I'd argue this code is just slightly easier to read), and we get far better memory performance (linear in the number of files being compared, constant in the size of those files).

Streaming hashing

Perhaps your ears (or eyes? you're probably reading this) perked up at the mention of conduit. To answer the question I'm going to pretend you're asking: yes, you can do streaming computation of a hash. Here's a program that will take a URL and file path, write the contents of the URL's response body to a file path, and print out the SHA256 digest. And the cool part: it will only look at each chunk of data once.

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import Conduit
import Crypto.Hash         (Digest, SHA256, hash)
import Crypto.Hash.Conduit (sinkHash)
import Network.HTTP.Simple
import System.Environment  (getArgs)

main :: IO ()
main = do
  args <- getArgs
  (url, fp) <-
    case args of
      [x, y] -> return (x, y)
      _ -> error $ "Expected: URL FILEPATH"
  req <- parseRequest url
  digest <- runResourceT $ httpSink req $ \_res -> getZipSink $
    ZipSink (sinkFile fp) *>
    ZipSink sinkHash
  print (digest :: Digest SHA256)

Of course, if conduit can do it, you can do it too. Let's write a hashFile implementation ourselves without conduit to get some exposure to the raw hashing API:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import Crypto.Hash
import System.Environment  (getArgs)
import System.IO (withBinaryFile, IOMode (ReadMode))
import Data.Foldable (forM_)
import qualified Data.ByteString as B

hashFile :: HashAlgorithm ha => FilePath -> IO (Digest ha)
hashFile fp = withBinaryFile fp ReadMode $ \h ->
  let loop context = do
        chunk <- B.hGetSome h 4096
        if B.null chunk
          then return $ hashFinalize context
          else loop $! hashUpdate context chunk
   in loop hashInit

main :: IO ()
main = do
  args <- getArgs
  forM_ args $ \fp -> do
    digest <- hashFile fp
    putStrLn $ show (digest :: Digest SHA256) ++ "  " ++ fp

This uses the pure update API provided by Crypto.Hash. We can also use a mutating API in this case, which is slightly more efficient by bypassing some buffer copies:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import Crypto.Hash
import Crypto.Hash.IO
import System.Environment  (getArgs)
import System.IO (withBinaryFile, IOMode (ReadMode))
import Data.Foldable (forM_)
import qualified Data.ByteString as B

hashFile :: HashAlgorithm ha => FilePath -> IO (Digest ha)
hashFile fp = withBinaryFile fp ReadMode $ \h -> do
  context <- hashMutableInit
  let loop = do
        chunk <- B.hGetSome h 4096
        if B.null chunk
          then hashMutableFinalize context
          else do
            hashMutableUpdate context chunk

main :: IO ()
main = do
  args <- getArgs
  forM_ args $ \fp -> do
    digest <- hashFile fp
    putStrLn $ show (digest :: Digest SHA256) ++ "  " ++ fp

EXERCISE Use lazy I/O and the hashlazy function to implement hashFile. (NOTE: I am not condoning lazy I/O here.)

September 18, 2017 02:00 AM

September 17, 2017

Neil Mitchell

Existential Serialisation

Summary: Using static pointers you can perform binary serialisation of existentials.

Many moons ago I asked how to write a Binary instance for a type including an existential, such as:

data Foo = forall a . (Typeable a, Binary a) => Foo a

Here we have a constructor Foo which contains a value. We don't statically know the type of the contained value, but we do know it has the type classes Typeable (so we can at runtime switch on its type) and Binary (so we can serialise it). But how can we deserialise it? We can store the relevant TypeRep when serialising, but when deserialising there is no mechanism to map from TypeRep to a Binary instance.

In Shake, I needed to serialise existentials, as described in the S4.1 of the original paper. My solution was to build a global mapping table, storing pairs of TypeRep and Binary instances for the types I knew were relevant. This solution works, but cannot deserialise anything that has not already been added to the global table, which required certain functions to live in weird places to ensure that they were called before deserialisation. Effective, but ugly.

Recently Abhiroop Sarkar suggested using the relatively new static pointers extension. This extension lets you turn top-level bindings with no arguments into a StaticPtr which can then be serialised/deserialsed, even between different instances of a process. To take advantage of this feature, we can redefine Foo as:

data Foo = forall a . (StaticFoo a, Binary a) => Foo a

class StaticFoo a where
staticFoo :: a -> StaticPtr (Get Foo)

The approach is to switch from serialising the TypeRep (from which we try to look up Get Foo), to serialising the Get Foo directly. We can write a Binary Foo instance by defining put:

put :: Foo -> Put
put (Foo x) = do
put $ staticKey $ staticFoo x
put x

Here we simply grab a StaticPtr (Get Foo) which can deserialise the object, then use staticKey to turn it into something that can be serialised itself. Next, we write out the payload. To reverse this process we define get:

get :: Get Foo
get = do
ptr <- get
case unsafePerformIO (unsafeLookupStaticPtr ptr) of
Just value -> deRefStaticPtr value :: Get Foo
Nothing -> error "Binary Foo: unknown static pointer"

We first get the staticKey, use unsafeLookupStaticPtr to turn it into a StaticPtr (Get Foo) followed by deRefStaticPtr to turn it into a Get Foo. The unsafe prefix on these functions is justified - bugs while developing this code resulted in segfaults.

The final piece of the puzzle is defining StaticFoo instances for the various types we might want to serialise. As an example for String:

instance StaticFoo String where
staticFoo _ = static (Foo <$> (get :: Get String))

We perform the get, wrap a Foo around it, and then turn it into a StaticPtr. All other types follow the same pattern, replacing String with Int (for example). The expression passed to static must have no free variables, including type variables, so we cannot define an instance for a, or even an instance for [a] - it must be [Char] and [Int] separately.

A complete code sample and test case is available here.

This approach works, and importantly allows extra constraints on the existential. The two disadvantages are: 1) that static isn't very flexible or easy to abstract over, resulting in a lot of StaticFoo boilerplate; 2) the static pointer is not guaranteed to be valid if the program changes in any way.

Will Shake be moving over to this approach? No. The next version of Shake has undergone an extensive rewrite, and in the process, moved away from needing this feature. A problem I had for 8 years has been solved, just as I no longer need the solution!

by Neil Mitchell ( at September 17, 2017 11:46 AM

September 15, 2017

Tweag I/O

Java from Haskell:<br/> a tutorial

Facundo Domínguez

Our introductory post for inline-java showed that we could call methods written in Java (or indeed any JVM function) from Haskell. Much in the style of other packages, it is moreover possible to do using java syntax, so examples from Java API documentation can be reused as-is.

In celebration of the recently released inline-java-0.7.0, this post is a tutorial on how to use it all. We cover the marshalling of values between Haskell and Java and how we leverage the type checker to ensure that neither sides disagree on what types arguments and return values should have. This git repository contains the minimal configuration necessary to try the examples that follow.

Invoking java methods

Let's start with a simple program.

-- hello-java.hs
{-# LANGUAGE QuasiQuotes #-}
{-# OPTIONS_GHC -fplugin=Language.Java.Inline.Plugin #-}
import Language.Java (withJVM)
import Language.Java.Inline

main :: IO ()
main = withJVM [] [java| { System.out.println("Hello Java!"); } |]

The function withJVM starts an instance of the Java Virtual Machine (JVM), and the java quasiquotation executes the java code passed to it as a block of statements. The program can be built and executed from inside the above mentioned folder with

$ stack --nix build
$ stack --nix exec hello-java
Hello Java!

Because part of inline-java is implemented in a GHC plugin, we tell GHC to use this plugin with the pragma OPTIONS_GHC. Every module using inline-java needs to ask for the plugin in the same way (this requirement might be lifted in a future version of GHC).

GHC doesn't parse any Java and neither does inline-java. So how can this program possibly work? The answer is that inline-java feeds the quasiquotation to the javac compiler, which generates some bytecode that is stored in the object file of the module. At runtime, inline-java arranges the bytecode to be handed to the JVM using the jni package. Finally, inline-java makes use of the package jvm to have the bytecode executed.

Marshalling values

Now, suppose we have some value in Haskell that we want to provide as an argument to a Java method.

main :: IO ()
main = withJVM [] $ do
    let d = 1 :: Double
    [java| { System.out.println($d); } |]

We are coercing a Haskell value of type Double into a Java value of the primitive type double, which is then used in the quasiquotation in the form of an antiquoted variable. When inline-java passes this quasiquotation to javac, it feeds it a static method of the form:

static void fresh_name(double $d) { System.out.println($d); }

At runtime, inline-java passes the result of the coercion as the argument $d. Any instance of Language.Java.Coercible a ty can be used in the same way, where a stands for the Haskell type and ty stands for an encoding of the Java type (JType). The package jvm defines a few instances, and users can define their own.

class Coercible a (ty :: JType) | a -> ty
instance Coercible () 'Void
instance Coercible Bool ('Prim "boolean")
instance Coercible CChar ('Prim "byte")
instance Coercible Char ('Prim "char")
instance Coercible Word16 ('Prim "char")
instance Coercible Int16 ('Prim "short")
instance Coercible Int32 ('Prim "int")
instance Coercible Int64 ('Prim "long")
instance Coercible Float ('Prim "float")
instance Coercible Double ('Prim "double")

In the following program we get an integer value from Java.

import Data.Int (Int32)

main :: IO ()
main = withJVM [] $ do
    x <- [java| new Object[5].length |]
    print (x :: Int32)

Here we have dropped the braces surrounding the Java code in order to hint to inline-java that we are giving an expression rather than a block of statements. Expressions, unlike statements, have values. We are coercing a Java value of type int into a Haskell value of type Int32. The quasiquoter arranges for the coercion to happen after the JVM finishes evaluating the Java expression. As it was the case for antiquoted variables, the return type of the quasiquotation needs to be an instance of Language.Java.Coercible a ty.

Marshalling Java objects

Coercing values is useful enough until we consider how to marshal values which do not have an obvious counterpart in Java. For instance, what do we coerce a Haskell list or a vector to? As these types require a more elaborate representation in Java, we use the classes Reflect and Reify from the package jvm.

type family Interp a :: JType

class Reify a where
  reify :: J (Interp a) -> IO a

class Reflect a where
  reflect :: a -> IO (J (Interp a))

The type family Interp a stands for the Java type that corresponds to the Haskell type a. A value of type J (Interp a) is a reference to a Java object of type Interp a. With reify we can convert a Java object to a Haskell value. With reflect we can convert a Haskell value back into a Java object. As with the type class Coercible, the package jvm already provides a few instances of Reify and Reflect. For example,

type instance Interp ByteString = 'Array ('Prim "byte")
instance Reify ByteString
instance Reflect ByteString

type instance Interp Text = 'Class "java.lang.String"
instance Reify Text
instance Reflect Text

type instance Interp Double = 'Class "java.lang.Double"
instance Reify Double
instance Reflect Double

type instance Interp [a] = 'Array (Interp a)
instance Reify a => Reify [a]
instance Reflect a => Reflect [a]

There is an instance of Coercible (J ty) ty. So we can use references produced with reflect in java quasiquotations.

import qualified Data.Text
import Language.Java (reflect)

main :: IO ()
main = withJVM [] $ do
    text <- reflect (Data.Text.pack "Hello Java!")
    [java| { System.out.println($text); } |]

In this example, text has type J ('Class "java.lang.String") and the antiquoted variable $text is expected to have type java.lang.String. Conversely, we can use reify, to create a Haskell value from the reference produced by a quasiquotation.

main :: IO ()
main = withJVM [] $ do
    jarray <- [java| new String[] {"a", "b"} |]
    xs <- reify jarray
    print (xs :: [Text])

Type checking

One of the strengths of inline-java is that it makes it difficult to get an ill-typed interaction between Haskell and Java. What if a quasiquotation returned a value of a type that the Haskell side does not expect? What if any of the methods used in the quasiquotation are used with arguments of the wrong type?

The short answer to both questions is that, most of the time, GHC and javac will catch the type mismatches.

main :: IO ()
main = withJVM [] $ do
    jarray <- [java| new String[] {"a", "b"} |]
    xs <- reify jarray
    print (xs :: [Double])

Based on the instances of Reify and Coercible that are in scope, Haskell is able to determine with precision what Java type the quasiquotation should return. In this program, the Haskell side expects Java to return an array of doubles (java.lang.Double[]) when the Java side is returning an array of strings (java.lang.String[]). The javac compiler complains.

$ stack --nix build
[1 of 1] Compiling Main             ( Main.hs, Main.o )
.../ error: incompatible types: String[] cannot be converted to Double[]
{ return  new String[] {"a", "b"} ; } // .hs:10

When making calls with the wrong argument or return types to Java methods, the functions in the package jvm produce a runtime error. Usually the programmer would get an exception called NoSuchMethodError and the name of the offending method. The error produced by inline-java improves this in two aspects. Firstly, we get the error at build time. Secondly, the error message points precisely at either the return type or the argument with the mismatched type.

Are there any type errors that cannot be caught at build time? There is, indeed. For instance, a quasiquotation can yield or use objects of type java.lang.Object. The Haskell or the Java side may then need to downcast these objects. Downcasts, as is always the case in vanilla Java, involve a dynamic type check, which can fail if the objects are downcasted to the wrong type. In this sense, Haskell + Java is no safer than Java alone.


In this blogpost we introduced quasiquotation, Coercible and the Reify/Reflect type classes. We saw that with inline-java, it's possible to call JVM methods from Haskell with little fuss. An important aspect of the design of inline-java is that conversions are always explicit. That's because they can be expensive, so the programmer should be well aware when they are happening. Coercible captures the class of types that can be passed to the JVM without paying any marshalling/unmarshalling costs.

The package inline-java not only makes Java code convenient to embed in Haskell programs, it also prevents coding mistakes which could otherwise occur when relying on the lower-level packages jni and jvm. In a future post we'll take a peak under the hood. Since v0.7, inline-java makes use of a GHC plugin to make the type safety happen.

September 15, 2017 12:00 AM

September 14, 2017

Dimitri Sabadie

State of luminance

I’ve been a bit surprised for a few days now. Some rustaceans posted two links about my spectra and luminance crates on reddit – links here and here. I didn’t really expect that: the code is public, on Github, but I don’t communicate about them that much.

However, I saw interested people, and I think it’s the right time to write a blog post about some design choices. I’ll start off with luminance and I’ll speak about spectra in another blog post because I truly think spectra starts to become very interesting (I have my own shading language bundled up within, which is basically GLSL on steroids).

luminance and how it copes with fast and as-stateless-as-possible graphics


luminance is a library that I wrote, historically, in Haskell. You can find the package here if you’re interested – careful though, it’s dying and the Rust version has completely drifted path with it. Nevertheless, when I ported the library to Rust, I imported the same “way of programming” that I had·have in Haskell – besides the allocation scheme; remember, I’m a demoscener, I do care a lot about performance, CPU usage, cache friendliness and runtime size. So the Rust luminance crate was made to be hybrid: it has the cool functional side that I imported from my Haskell codebase, and the runtime performance I wanted that I had when I wrote my two first 64k in C++11. I had to remove and work around some features that only Haskell could provide, such as higher-kinded types, type families, functional dependencies, GADTs and a few other things such as existential quantification (trait objects saved me here, even though I don’t use them that much in luminance now).

I have to admit, I dithered a lot about the scope of luminance — both in Haskell and Rust. At first, I thought that it’d be great to have a “base” crate, hosting common and abstracted code, and “backend” crates, implementing the abstract interface. That would enable me to have several backends – OpenGL, Vulkan, Metal, a software implementation, something for Amiigaaaaaaa, etc. Though, time has passed, and now, I think it’s:

  • Overkill.
  • A waste of framework.

The idea is that if you need to be very low-level on the graphics stack of your application, you’re likely to know what you are doing. And then, your needs will be very precise and well-defined. You might want very specific piece of code to be available, related to a very specific technology. That’s the reason why abstracting over very low-level code is not a good path to me: you need to expose as most as posssible the low-level interface. That’s the goal of luminance: exposing OpenGL’s interface in a stateless, bindless and typesafe way, with no or as minimal as possible runtime overhead.

More reading here.


Today, luminance is almost stable – it still receives massive architecture redesign from time to time, but it’ll hit the 1.0.0 release soon. As discussed with kvark lately, luminance is not about the same scope as gfx’s one. The goals of luminance are:

  • To be a typesafe, stateless and bindless OpenGL framework.
  • To provide a friendly experience and expose as much as possible all of the OpenGL features.
  • To be very lightweight (the target is to be able to use it without std nor core).

To achieve that, luminance is written with several aspects in mind:

  • Allocation must be explicitely stated by the user: we must avoid as much as possible to allocate things in luminance since it might become both a bottleneck and an issue to the lightweight aspect.
  • Performance is a first priority; safety comes second. If you have a feature that can be either exclusively performant or safe, it must then be performant. Most of the current code is, for our joy, both performant and safe. However, some invariants are left around the place and you might shoot your own feet. This is an issue and some reworking must be done (along with tagging some functions and traits unsafe).
  • No concept of backends will ever end up in luminance. If it’s decided to switch to Vulkan, the whole luminance API will and must be impacted, so that people can use Vulkan the best possible way.
  • A bit like the first point, the code must be written in a way that the generated binary is as small as possible. Generics are not forbidden – they’re actually recommended – but things like crate dependencies are likely to be forbidden (exception for the gl dependency, of course).
  • Windowing must not be addressed by luminance. This is crucial. As a demoscener, if I want to write a 64k with luminance, I must be able to use a library over X11 or the Windows API to setup the OpenGL context myself, set the OpenGL pointers myself, etc. This is not the typical usecase – who cares besides demosceners?! – but it’s still a good advantage since you end up with loose coupling for free.

The new luminance

luminance has received more attention lately, and I think it’s a good thing to talk about how to use it. I’ll add examples on github and its online documentation.

I’m going to do that like a tutorial. It’s easier to read and you can test the code in the same time. Let’s render a triangle!

Note: keep in mind that you need a nightly compiler to compile luminance.

Getting your feet wet

I’ll do everything from scratch with you. I’ll work in /tmp:

$ cd /tmp

First thing first, let’s setup a lumitest Rust binary project:

$ cargo init --bin lumitest
Created binary (application) project
$ cd lumitest

Let’s edit our Cargo.toml to use luminance. We’ll need two crates:

At the time of writing, corresponding versions are luminance-0.23.0 and luminance-glfw-0.3.2.

Have the following [dependencies] section

luminance = "0.23.0"
luminance-glfw = "0.3.2"
$ cargo check

Everything should be fine at this point. Now, let’s step in in writing some code.

extern crate luminance;
extern crate luminance_glfw;

use luminance_glfw::{Device, WindowDim, WindowOpt};

const SCREEN_WIDTH: u32 = 960;
const SCREEN_HEIGHT: u32 = 540;

fn main() {
let rdev = Device::new(WindowDim::Windowed(SCREEN_WIDTH, SCREEN_HEIGHT), "lumitest", WindowOpt::default());

The main function creates a Device that is responsible in holding the windowing stuff for us.

Let’s go on:

match rdev {
Err(e) => {
eprintln!("{:#?}", e);

Ok(mut dev) => {
println!("let’s go!");

This block will catch any Device errors and will print them to stderr if there’s any.

Let’s write the main loop:

'app: loop {
for (_, ev) in { // the pair is an interface mistake; it’ll be removed
match ev {
WindowEvent::Close | WindowEvent::Key(Key::Escape, _, Action::Release, _) => break 'app,
_ => ()

This loop runs forever and will exit if you hit the escape key or quit the application.

Setting up the resources

Now, the most interesting thing: rendering the actual triangle! You will need a few things:

type Position = [f32; 2];
type RGB = [f32; 3];
type Vertex = (Position, RGB);

const TRIANGLE_VERTS: [Vertex; 3] = [
([-0.5, -0.5], [0.8, 0.5, 0.5]), // red bottom leftmost
([-0., 0.5], [0.5, 0.8, 0.5]), // green top
([0.5, -0.5], [0.5, 0.5, 0.8]) // blue bottom rightmost

Position, Color and Vertex define what a vertex is. In our case, we use a 2D position and a RGB color.

You have a lot of choices here to define the type of your vertices. In theory, you can choose any type you want. However, it must implement the Vertex trait. Have a look at the implementors that already exist for a faster start off!

Important: do not confuse between [f32; 2] and (f32, f32). The former is a single 2D vertex component. The latter is two 1D components. It’ll make a huge difference when writing shaders.

The TRIANGLE_VERTS is a constant array with three vertices defined in it: the three vertices of our triangle. Let’s pass those vertices to the GPU with the Tess type:

// at the top location
use luminance::tess::{Mode, Tess, TessVertices};

// just above the main loop
let triangle = Tess::new(Mode::Triangle, TessVertices::Fill(&TRIANGLE_VERTS), None);

This will pass the TRIANGLE_VERTS vertices to the GPU. You’re given back a triangle object. The Mode is a hint object that states how vertices must be connected to each other. TessVertices lets you slice your vertices – this is typically enjoyable when you use a mapped buffer that contains a dynamic number of vertices.

We’ll need a shader to render that triangle. First, we’ll place its source code in data:

$ mkdir data

Paste this in data/vs.glsl:

layout (location = 0) in vec2 co;
layout (location = 1) in vec3 color;

out vec3 v_color;

void main() {
gl_Position = vec4(co, 0., 1.);
v_color = color;

Paste this in data/fs.glsl:

in vec3 v_color;

out vec4 frag;

void main() {
frag = vec4(v_color, 1.);

And add this to your

const SHADER_VS: &str = include_str!("../data/vs.glsl");
const SHADER_FS: &str = include_str!("../data/fs.glsl");

Note: this is not a typical workflow. If you’re interested in shaders, have a look at how I do it in spectra. That is, hot reloading it via SPSL (Spectra Shading Language), which enables to write GLSL modules and compose them in a single file but just writing functions. The functional programming style!

Same thing as for the tessellation, we need to pass the source to the GPU’s compiler to end up with a shader object:

// add this at the top of your
use luminance::shader::program::Program;

// below declaring triangle
let (shader, warnings) = Program::<Vertex, (), ()>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

for warning in &warnings {
eprintln!("{:#?}", warning);

Finally, we need to tell luminance in which framebuffer we want to make the render. It’s simple: to the default framebuffer, which ends up to be… your screen’s back buffer! This is done this way with luminance:

use luminance::framebuffer::Framebuffer;

let screen = Framebuffer::default([SCREEN_WIDTH, SCREEN_HEIGHT]);

And we’re done for the resources. Let’s step in the actual render now.

The actual render: the pipeline

luminance’s approach to render is somewhat not very intuitive yet very simple and very efficient: the render pipeline is explicitly defined by the programmer in Rust, on the fly. That means that you must express the actual state the GPU must have for the whole pipeline. Because of the nature of the pipeline, which is an AST (Abstract Syntax Tree), you can batch sub-parts of the pipeline (we call such parts nodes) and you end up with minimal GPU state switches. The theory is as following:

  • At top most, you have the pipeline function that introduces the concept of shading things to a framebuffer.
  • Nested, you find the concept of a shader gate. That is, an object linked to its parent (pipeline) and that gives you the concept of shading things with a shader.
    • Nested, you find the concept of rendering things. That is, can set on such nodes GPU states, such as whether you want a depth test, blending, etc.
    • Nested, you find the concept of a tessellation gate, enabling you to render actual Tess objects.

That deep nesting enables you to batch your objects on a very fine granularity. Also, notice that the functions are not about slices of Tess or hashmaps of Program. The allocation scheme is completely ignorant about how the data is traversed, which is good: you decide. If you need to borrow things on the fly in a shading gate, you can.

Let’s get things started:

use luminance::pipeline::{entry, pipeline};

entry(|_| {
pipeline(&screen, [0., 0., 0., 1.], |shd_gate| {
shd_gate.shade(&shader, |rdr_gate, _| {
rdr_gate.render(None, true, |tess_gate| {
let t = &triangle;

We just need a final thing now: since we render to the back buffer of the screen, if we want to see anything appear, we need to swap the buffer chain so that the back buffer become the front buffer and the front buffer become the back buffer. This is done by wrapping our render code in the Device::draw function:

dev.draw(|| {
entry(|_| {
pipeline(&screen, [0., 0., 0., 1.], |shd_gate| {
shd_gate.shade(&shader, |rdr_gate, _| {
rdr_gate.render(None, true, |tess_gate| {
let t = &triangle;

You should see this:

As you can see, the code is pretty straightforward. Let’s get deeper, and let’s kick some time in!

use std::time::Instant;

// before the main loop
let t_start = Instant::now();
// in your main loop
let t_dur = t_start.elapsed();
let t = (t_dur.as_secs() as f64 + t_dur.subsec_nanos() as f64 * 1e-9) as f32;

We have the time. Now, we need to pass it down to the GPU (i.e. the shader). luminance handles that kind of things with two concepts:

  • Uniforms.
  • Buffers.

Uniforms are a good match when you want to send data to a specific shader, like a value that customizes the behavior of a shading algorithm.

Because buffers are shared, you can use buffers to share data between shader, leveraging the need to pass the data to all shader by hand – you only pass the index to the buffer that contains the data.

We won’t conver the buffers for this time.

Because of type safety, luminance requires you to state which types the uniforms the shader contains are. We only need the time, so let’s get this done:

// you need to alter this import
use luminance::shader::program::{Program, ProgramError, Uniform, UniformBuilder, UniformInterface, UniformWarning};

struct TimeUniform(Uniform<f32>);

impl UniformInterface for TimeUniform {
fn uniform_interface(builder: UniformBuilder) -> Result<(Self, Vec<UniformWarning>), ProgramError> {
// this will fail if the "t" variable is not used in the shader
//let t = builder.ask("t").map_err(ProgramError::UniformWarning)?;

// I rather like this one: we just forward up the warning and use the special unbound uniform
match builder.ask("t") {
Ok(t) => Ok((TimeUniform(t), Vec::new())),
Err(e) => Ok((TimeUniform(builder.unbound()), vec![e]))

The UniformBuilder::unbound is a simple function that gives you any uniform you want: the resulting uniform object will just do nothing when you pass values in. It’s a way to say “— Okay, I don’t use that in the shader yet, but don’t fail, it’s not really an error”. Handy.

And now, all the magic: how do we access that uniform value? It’s simple: via types! Have you noticed the type of our Program? For the record:

let (shader, warnings) = Program::<Vertex, (), ()>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

See the type is parametered with three type variables:

  • The first one – here Vertex, our own type – is for the input of the shader program.
  • The second one is for the output of the shader program. It’s currently not used at all by luminance by is reserved, as it will be used later for enforcing even further type safety.
  • The third and latter is for the uniform interface.

You guessed it: we need to change the third parameter from () to TimeUniform:

let (shader, warnings) = Program::<Vertex, (), TimeUniform>::from_strings(None, SHADER_VS, None, SHADER_FS).unwrap();

And that’s all. Whenever you shade with a ShaderGate, the type of the shader object is being inspected, and you’re handed with the uniform interface:

shd_gate.shade(&shader, |rdr_gate, uniforms| {

rdr_gate.render(None, true, |tess_gate| {
let t = &triangle;

Now, change your fragment shader to this:

in vec3 v_color;

out vec4 frag;

uniform float t;

void main() {
frag = vec4(v_color * vec3(cos(t * .25), sin(t + 1.), cos(1.25 * t)), 1.);

And enjoy the result! Here’s the gist that contains the whole

by Dimitri Sabadie ( at September 14, 2017 03:58 PM

September 12, 2017

Tom Schrijvers

PPDP & LOPSTR 2017: Call for Participation



PPDP 2017

  19th International Symposium on
  Principles and Practice of Declarative Programming
  Namur, Belgium, October 9-11

co-located with


  27th International Symposium on
  Logic-Based Program Synthesis and Transformation
  Namur, Belgium, October 10-12


Registration is now open:

** Early registration deadline: September 15, 2017 **


Marieke Huisman (Universiteit Twente)
A Verification Technique for Deterministic Parallel Programs
(joint PPDP/LOPSTR speaker)

Sumit Gulwani (Microsoft)
Programming by Examples: Applications, Algorithms, and Ambiguity Resolution
(joint PPDP/LOPSTR speaker)

Serge Abiteboul (INRIA)
Ethical issues in data management

Grigore Rosu (University of Illinois at Urbana-Champaign)
K: A Logic-Based Framework for Program Transformation and Analysis

Please consult the conferences' webpages for a list of accepted papers.

Hope to see you in Namur !

by Tom Schrijvers ( at September 12, 2017 07:27 AM

FP Complete

All About Strictness

Haskell is—perhaps infamously—a lazy language. The basic idea of laziness is pretty easy to sum up in one sentence: values are only computed when they're needed. But the implications of this are more subtle. In particular, it's important to understand some crucial topics if you want to write memory- and time-efficient code:

  • Weak head normal form (WHNF) versus normal form (NF)
  • How to use the seq and deepseq functions (and related concepts)
  • Strictness annotations on data types
  • Bang patterns
  • Strictness of data structures: lazy, spine-strict, and value-strict
  • Choosing the appropriate helper functions, especially with folds

This blog post was inspired by some questions around writing efficient conduit code, so I'll try to address some of that directly at the end. The concepts, though, are general, and will transfer to not only other streaming libraries, but non-streaming data libraries too.

NOTE This blog post will mostly treat laziness as a problem to be solved, as opposed to the reality: laziness is sometimes an asset, and sometimes a liability. I'm focusing on the negative exclusively, because our goal here is to understand the rough edges and how to avoid them. There are many great things about laziness that I'm not even hinting at. I trust my readers to add some great links to articles speaking on the virtues of laziness in the comments :)

Basics of laziness

Let's elaborate on my one liner above:

Values are only computed when they're needed

Let's explore this by comparison with a strict language: C.

#include <stdio.h>

int add(int x, int y) {
  return x + y;

int main() {
  int five = add(1 + 1, 1 + 2);
  int seven = add(1 + 2, 1 + 3);

  printf("Five: %d\n", five);
  return 0;

Our function add is strict in both of its arguments. And its result is also strict. This means that:

  • Before add is called the first time, we will compute the result of both 1 + 1 and 1 + 2.
  • We will call the add function on 2 and 3, get a result of 5, and place that value in memory pointed at by the variable five.
  • Then we'll do the same thing with 1 + 2, 1 + 3, and placing 7 in seven.
  • Then we'll call printf with our five value, which is already fully computed.

Let's compare that to the equivalent Haskell code:

add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  let five = add (1 + 1) (1 + 2)
      seven = add (1 + 2) (1 + 3)

  putStrLn $ "Five: " ++ show five

There's something called strictness analysis which will result in something more efficient than what I'll describe here in practice, but semantically, we'll end up with the following:

  • Instead of immediately computing 1 + 1 and 1 + 2, the compiler will create a thunk (which you can think of as a promise) for those computations, and pass those thunks to the add function.
  • Except: we won't call the add function right away either: five will be a thunk representing the application of the add function to the thunk for 1 + 1 and 1 + 2.
  • We'll end up doing the same thing with seven: it will be a thunk for applying add to two other thunks.
  • When we finally try to print out the value five, we need to know the actual number. This is called forcing evaluation. We'll get into more detail on when and how this happens below, but for now, suffice it to say that when putStrLn is executed, it forces evaluation of five, which forces evaluation of 1 + 1 and 1 + 2, converting the thunks into real values (2, 3, and ultimately 5).
  • Because seven is never used, it remains a thunk, and we don't spend time evaluating it.

Compared to the C (strict) evaluation, there is one clear benefit: we don't bother wasting time evaluating the seven value at all. That's three addition operations bypassed, woohoo! And in a real world scenario, instead of being three additions, that could be a seriously expensive operation.

However, it's not all rosey. Creating a thunk does not come for free: we need to allocate space for the thunk, which costs both allocation, and causes GC pressure for freeing them afterwards. Perhaps most importantly: the thunked version of an expression can be far more costly than the evaluated version. Ignoring some confusing overhead from data constructors (which only make the problem worse), let's compare our two representations of five. In C, five takes up exactly one machine word*. In Haskell, our five thunk will take up roughly:

* Or perhaps less, as int is probably only 32 bits, and you're probably on a 64 bit machine. But then you get into alignment issues, and registers... so let's just say one machine word.

  • One machine word to say "I'm a thunk"
  • Within that thunk, pointers to the add function, and the 1 + 1 and 1 + 2 thunks (one machine word each). So three machine words total.
  • Within the 1 + 1 thunk, one machine word for the thunk, and then again a pointer to the + operator, and the 1 values. (GHC has an optimization where it will keep small int values in dedicated parts of memory, avoiding extra overhead for the ints themselves. But you could theoretically add in an extra machine word for each.) Again, conservatively: three machine words.
  • Same logic for the 1 + 2 thunk, so three more machine words.
  • For a whopping total of 10 machine words, or 10 times the memory usage as C!

Now in practice, it's not going to work out that way. I mentioned the strictness analysis step, which will say "hey, wait a second, it's totally better to just add two numbers than allocate a thunk, I'mma do that now, kthxbye." But it's vital when writing Haskell to understand all of these places where laziness and thunks can creep in.


Let's look at how we can force Haskell to be more strict in its evaluation. Likely the easiest way to do this is with bang patterns. Let's look at the code first:

{-# LANGUAGE BangPatterns #-}
add :: Int -> Int -> Int
add !x !y = x + y

main :: IO ()
main = do
  let !five = add (1 + 1) (1 + 2)
      !seven = add (1 + 2) (1 + 3)

  putStrLn $ "Five: " ++ show five

This code now behaves exactly like the strict C code. Because we've put a bang (!) in front of the x and y in the add function, GHC knows that it must force evaluation of those values before evaluating it. Similarly, by placing bangs on five and seven, GHC must evaluate these immediately, before getting to putStrLn.

As with many things in Haskell, however, bang patterns are just syntactic sugar for something else. And in this case, that something else is the seq function. This function looks like:

seq :: a -> b -> b

You could implement this type signature yourself, of course, by just ignoring the a value:

badseq :: a -> b -> b
badseq a b = b

However, seq uses primitive operations from GHC itself to ensure that, when b is evaluated, a is evaluated too. Let's rewrite our add function to use seq instead of bang patterns:

add :: Int -> Int -> Int
add x y =
  let part1 = seq x part2
      part2 = seq y answer
      answer = x + y
   in part1
-- Or more idiomatically
add x y = x `seq` y `seq` x + y

What this is saying is this:

  • part1 is an expression which will tell you the value of part2, after it evaluates x
  • part2 is an expression which will tell you the value of answer, after it evaluates y
  • answer is just x + y

Of course, that's a long way to write this out, and the pattern is common enough that people will usually just use seq infix as demonstrated above.

EXERCISE What would happen if, instead of in part1, the code said in part2? How about in answer?

There is always a straightforward translation from bang patterns to usage of let. We can do the same with the main function:

main :: IO ()
main = do
  let five = add (1 + 1) (1 + 2)
      seven = add (1 + 2) (1 + 3)

  five `seq` seven `seq` putStrLn ("Five: " ++ show five)

It's vital to understand how seq is working, but there's no advantage to using it over bang patterns where the latter are clear and easy to read. Choose whichever option makes the code easiest to read, which will often be bang patterns.

Tracing evaluation

So far, you've just had to trust me about the evaluation of thunks occurring. Let's see a method to more directly observe evaluation. The trace function from Debug.Trace will print a message when it is evaluated. Take a guess at the output of these programs:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
import Debug.Trace

add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  let five = trace "five" (add (1 + 1) (1 + 2))
      seven = trace "seven" (add (1 + 2) (1 + 3))

  putStrLn $ "Five: " ++ show five


#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE BangPatterns #-}
import Debug.Trace

add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  let !five = trace "five" (add (1 + 1) (1 + 2))
      !seven = trace "seven" (add (1 + 2) (1 + 3))

  putStrLn $ "Five: " ++ show five

Think about this before looking at the answer...

OK, hope you had a good think. Here's the answer:

  • The first program will print both five and Five: 5. It will not bother printing seven, since that expression is never forced. (Due to strangeness around output buffering, you may see interleaving of these two output values.)
  • The second will print both five and seven, because the bang patterns force their evaluation. However, the order of their output may be different than you expect. On my system, for example, seven prints before five. That's because GHC retains the right to rearrange order of evaluation in these cases.
  • By contrast, if you use five `seq` seven `seq` putStrLn ("Five: " ++ show five), it comes out in the order you would intuitively expect: first five, then seven, and then "Five: 5". This gives a bit of a lie to my claim that bang patterns are always a simple translation to seqs. However, the fact is that with an expression x `seq` y, GHC is free to choose whether it evaluates x or y first, as long as it ensure that when that expression finishes evaluating, both x and y are evaluated.

All that said: as long as your expressions are truly pure, you will be unable to observe the difference between x and y evaluating first. Only the fact that we used trace, which is an impure function, allowed us to observe the order of evaluation.

QUESTION Does the result change at all if you put bangs on the add function? Why do bangs there affect (or not affect) the output?

The value of bottom

This is all well and good, but the more standard way to demonstrate evaluation order is to use bottom values, aka undefined. undefined is special in that, when it is evaluated, it throws a runtime exception. (The error function does the same thing, as do a few other special functions and values.) To demonstrate the same thing about seven not being evaluated without the bangs, compare these two programs:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE BangPatterns #-}

add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  let five = add (1 + 1) (1 + 2)
      seven = add (1 + 2) undefined -- (1 + 3)

  putStrLn $ "Five: " ++ show five


#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE BangPatterns #-}

add :: Int -> Int -> Int
add x y = x + y

main :: IO ()
main = do
  let five = add (1 + 1) (1 + 2)
      !seven = add (1 + 2) undefined -- (1 + 3)

  putStrLn $ "Five: " ++ show five

The former completes without issue, since seven is never evaluated. However, in the latter, we have a bang pattern on seven. What GHC does here is:

  • Evaluate the expression add (1 + 2) undefined
  • This reduces to (1 + 2) + undefined
  • But this is still an expression, not a value, so more evaluation is necessary
  • In order to evaluate the + operator, it needs actual values for the two arguments, not just thunks. This can be seen as if + has bang patterns on its arguments. The correct way to say this is "+ is strict in both of its arguments."
  • GHC is free to choose to either evaluate 1 + 2 or undefined first. Let's assume it does 1 + 2 first. It will come up with two evaluated values (1 and 2), pass them to +, and get back 3. All good.
  • However, it then tries to evaluate undefined, which triggers a runtime exception to be thrown.

QUESTION Returning to the question above: does it look like bang patterns inside the add function actually accomplish anything? Think about what the output of this program will be:

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE BangPatterns #-}

add :: Int -> Int -> Int
add !x !y = x + y

main :: IO ()
main = do
  let five = add (1 + 1) (1 + 2)
      seven = add (1 + 2) undefined -- (1 + 3)

  putStrLn $ "Five: " ++ show five

To compare this behavior to a strict language, we need a language with something like runtime exceptions. I'll use Rust's panics:

fn add(x: isize, y: isize) -> isize {
    println!("adding: {} and {}", x, y);
    x + y

fn main() {
    let five = add(1 + 1, 1 + 2);
    let seven = add(1 + 2, panic!());

    println!("Five: {}", five);

Firstly, to Rust's credit: it gives me a bunch of warnings about how this program is dumb. Fair enough, but I'm going to ignore those warnings and charge ahead with it. This program will first evaluate the add(1 + 1, 1 + 2) expression (which we can see in the output of adding: 2 and 3). Then, before it ever enters the add function the second time, it needs to evaluate both 1 + 2 and panic!(). The former works just fine, but the latter results in a panic being generated and short-circuiting the rest of our function.

If we want to regain Haskell's laziness properties, there's a straightforward way to do it: use a closure. A closure is, essentially, a thunk. The Rust syntax for creating a closure is |args| body. We can create closures with no arguments to act like thunks, which gives us:

fn add<X, Y>(x: X, y: Y) -> isize
    where X: FnOnce() -> isize,
          Y: FnOnce() -> isize {
    let x = x();
    let y = y();
    println!("adding: {} and {}", x, y);
    x + y

fn main() {
    let five = || add(|| 1 + 1, || 1 + 2);
    let seven = || add(|| 1 + 2, || panic!());

    println!("Five: {}", five());

Again, the Rust compiler complains about the unused seven, but this program succeeds in running, since we never run the seven closure.

Still not up to speed with Rust? Let's use everyone's favorite language: Javascript:

function add(x, y) {
    return x() + y();

function panic() {
    throw "Panic!"

var five = ignored => add(ignored => 1 + 1, ignored => 1 + 2);
var seven = ignored => add(ignored => 1 + 2, panic);
console.log("Five: " + five());

Alright, to summarize until now:

  • Haskell is lazy by default
  • You can use bang patterns and seq to make things strict
  • By contrast, in strict languages, you can use closures to make things lazy
  • You can see if a function is strict in its arguments by passing in bottom (undefined) and seeing if it explodes in your face
  • The trace function can help you see this as well

This is all good, and make sure you have a solid grasp of these concepts before continuing. Consider rereading the sections above.


Here's something we didn't address: what, exactly, does it mean to evaluate or force a value? To demonstrate the problem, let's implement an average function. We'll use a helper datatype, called RunningTotal, to capture both the cumulative sum and the number of elements we've seen so far.

data RunningTotal = RunningTotal
  { sum :: Int
  , count :: Int

printAverage :: RunningTotal -> IO ()
printAverage (RunningTotal sum count)
  | count == 0 = error "Need at least one value!"
  | otherwise = print (fromIntegral sum / fromIntegral count :: Double)

-- | A fold would be nicer... we'll see that later
printListAverage :: [Int] -> IO ()
printListAverage =
  go (RunningTotal 0 0)
    go rt [] = printAverage rt
    go (RunningTotal sum count) (x:xs) =
      let rt = RunningTotal (sum + x) (count + 1)
       in go rt xs

main :: IO ()
main = printListAverage [1..1000000]

We're going to run this with run time statistics turned on so we can look at memory usage:

$ stack ghc average.hs && ./average +RTS -s

Lo and behold, our memory usage is through the roof!

[1 of 1] Compiling Main             ( average.hs, average.o )
Linking average ...
     258,654,528 bytes allocated in the heap
     339,889,944 bytes copied during GC
      95,096,512 bytes maximum residency (9 sample(s))
       1,148,312 bytes maximum slop
             164 MB total memory in use (0 MB lost due to fragmentation)

We're allocating a total of 258MB, and keeping 95MB in memory at once. For something that should just be a tight inner loop, that's ridiculously large.


You're probably thinking right now "shouldn't we use that seq stuff or those bang patterns?" Certainly that makes sense. And in fact, it looks really trivial to solve this problem with a single bang to force evaluation of the newly constructed rt before recursing back into go. For example, we can add {-# LANGUAGE BangPatterns #-} to the top of our file and then define go as:

go !rt [] = printAverage rt
go (RunningTotal sum count) (x:xs) =
  let rt = RunningTotal (sum + x) (count + 1)
   in go rt xs

Unfortunately, this results in exactly the same memory usage as we had before. In order to understand why this is happening, we need to look at something called weak head normal form.

Weak Head Normal Form

Note in advance that there's a great Stack Overflow answer on this topic for further reading.

We've been talking about forcing values and evaluating expressions, but what exactly that means hasn't been totally clear. To start simple, what will the output of this program be?

main = putStrLn $ undefined `seq` "Hello World"

You'd probably guess that it will print an error about undefined, since it will try to evaluate undefined before it will evaluate "Hello World", and because putStrLn is strict in its argument. And you'd be correct. But let's try something a little bit different:

main = putStrLn $ Just undefined `seq` "Hello World"

If you assume that "evaluate" means "fully evaluate into something with no thunks left," you'll say that this, too, prints an undefined error. But in fact, it happily prints out "Hello World" with no exceptions. What gives?

It turns out that when we talk about forcing evaluation with seq, we're only talking about evaluating to weak head normal form (WHNF). For most data types, this means unwrapping one layer of constructor. In the case of Just undefined, it means that we unwrap the Just data constructor, but don't touch the undefined within it. (We'll see a few ways to deal with this differently below.)

It turns out that, with a standard data constructor*, the impact of using seq is the same as pattern matching the outermost constructor. If you want to monomorphise, for example, you can implement a function of type seqMaybe :: Maybe a -> b -> b and use it in the main example above. Go ahead and give it a shot... answer below.

* Hold your horses, we'll talk about newtypes later and then you'll understand this weird phrasing.

seqMaybe :: Maybe a -> b -> b
seqMaybe Nothing b = b
seqMaybe (Just _) b = b

main :: IO ()
main = do
  putStrLn $ Just undefined `seqMaybe` "Hello World"
  putStrLn $ undefined `seqMaybe` "Goodbye!"

Let's up the ante again. What do you think this program will print?

main = do
  putStrLn $ error `seq` "Hello"
  putStrLn $ (\x -> undefined) `seq` "World"
  putStrLn $ error "foo" `seq` "Goodbye!"

You might think that error `seq` ... would be a problem. After all, isn't error going to throw an exception? However, error is a function. There's no exception getting thrown, or no bottom value being provided, until error is given its String argument. As a result, evaluating does not, in fact, generate an error. The rule is: any function applied to too few values is automatically in WHNF.

A similar logic applies to (\x -> undefined). Although it's a lambda expression, its type is a function which has not been applied to all arguments. And therefore, it will not throw an exception when evaluated. In other words, it's already in WHNF.

However, error "foo" is a function fully applied to its arguments. It's no longer a function, it's a value. And when we try to evaluate it to WHNF, its exception blows up in our face.

EXERCISE Will the following throw exceptions when evaluated?

  • (+) undefined
  • Just undefined
  • undefined 5
  • (error "foo" :: Int -> Double)

Fixing average

Having understood WHNF, let's return to our example and see why our first bang pattern did nothing to help us:

go !rt [] = printAverage rt
go (RunningTotal sum count) (x:xs) =
  let rt = RunningTotal (sum + x) (count + 1)
   in go rt xs

In WHNF, forcing evaluation is the same as unwrapping the constructor, which we are already doing in the second clause! The problem is that the values contained inside the RunningTotal data constructor are not being evaluated, and therefore are accumulating thunks. Let's see two ways to solve this:

go rt [] = printAverage rt
go (RunningTotal !sum !count) (x:xs) =
  let rt = RunningTotal (sum + x) (count + 1)
   in go rt xs

Instead of putting the bangs on the RunningTotal value, I'm putting them on the values within the constructor, forcing them to be evaluated at each loop. We're no longer accumulating a huge chain of thunks, and our maximum residency drops to 44kb. (Total allocations, though, are still up around 192mb. We need to play around with other optimizations outside the scope of this post to deal with the total allocations, so we're going to ignore this value for the rest of the examples.) Another approach is:

go rt [] = printAverage rt
go (RunningTotal sum count) (x:xs) =
  let !sum' = sum + x
      !count' = count + 1
      rt = RunningTotal sum' count'
   in go rt xs

This one instead forces evaluation of the new sum and count before constructing the new RunningTotal value. I like this version a bit more, as it's forcing evaluation at the correct point: when creating the value, instead of on the next iteration of the loop when destructing it.

Moral of the story: make sure you're evaluating the thing you actually need to evaluate, not just its container!


The fact that seq only evaluates to weak head normal form is annoying. There are lots of times when we would like to fully evaluate down to normal form (NF), meaning all thunks have been evaluated inside our values. While there is nothing built into the language to handle this, there is a semi-standard (meaning it ships with GHC) library to handle this: deepseq. It works by providing an NFData type class the defines how to reduce a value to normal form (via the rnf method).

{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq

data RunningTotal = RunningTotal
  { sum :: Int
  , count :: Int
instance NFData RunningTotal where
  rnf (RunningTotal sum count) = sum `deepseq` count `deepseq` ()

printAverage :: RunningTotal -> IO ()
printAverage (RunningTotal sum count)
  | count == 0 = error "Need at least one value!"
  | otherwise = print (fromIntegral sum / fromIntegral count :: Double)

-- | A fold would be nicer... we'll see that later
printListAverage :: [Int] -> IO ()
printListAverage =
  go (RunningTotal 0 0)
    go rt [] = printAverage rt
    go (RunningTotal sum count) (x:xs) =
      let rt = RunningTotal (sum + x) (count + 1)
       in rt `deepseq` go rt xs

main :: IO ()
main = printListAverage [1..1000000]

This has a maximum residency, once again, of 44kb. We define our NFData instance, which includes an rnf method. The approach of simply deepseqing all of the values within a data constructor is almost always the approach to take for NFData instances. In fact, it's so common, that you can get away with just using Generic deriving and have GHC do the work for you:

{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
import Control.DeepSeq

data RunningTotal = RunningTotal
  { sum :: Int
  , count :: Int
  deriving Generic
instance NFData RunningTotal

The true beauty of having NFData instances is the ability to abstract over many different data types. We can use this not only to avoid space leaks (as we're doing here), but also to avoid accidentally including exceptions inside thunks within a value. For an example of that, check out the tryAnyDeep function from the safe-exceptions library.

EXERCISE Define the deepseq function yourself in terms of rnf and seq.

Strict data

These approaches work, but they are not ideal. The problem lies in our definition of RunningTotal. What we want to say is that, whenever you have a value of type RunningTotal, you in fact have two Ints. But because of laziness, what we're actually saying is that a RunningTotal value could contain two Ints, or it could contain thunks that will evaluate to Ints, or thunks that will throw exceptions.

Instead, we'd like to make it impossible to construct a RunningTotal value that has any laziness room left over. And to do that, we can use strictness annotations in our definition of the data type:

data RunningTotal = RunningTotal
  { sum :: !Int
  , count :: !Int
  deriving Generic

printAverage :: RunningTotal -> IO ()
printAverage (RunningTotal sum count)
  | count == 0 = error "Need at least one value!"
  | otherwise = print (fromIntegral sum / fromIntegral count :: Double)

-- | A fold would be nicer... we'll see that later
printListAverage :: [Int] -> IO ()
printListAverage =
  go (RunningTotal 0 0)
    go rt [] = printAverage rt
    go (RunningTotal sum count) (x:xs) =
      let rt = RunningTotal (sum + x) (count + 1)
       in go rt xs

main :: IO ()
main = printListAverage [1..1000000]

All we've done is put bangs in front of the Ints in the definition of RunningTotal. We have no other references to strictness or evaluation in our program. However, by placing the strictness annotations on those fields, we're saying something simple and yet profound:

Whenever you evaluate a value of type RunningTotal, you must also evaluate the two Ints it contains

As we mentioned above, our second go clause forces evaluation of the RunningTotal value by taking apart its constructor. This act now automatically forces evaluation of sum and count, which we previously needed to achieve via a bang pattern.

There's one other advantage to this, which is slightly out of scope but worth mentioning. When dealing with small values like an Int, GHC will automatically unbox strict fields. This means that, instead of keeping a pointer to an Int inside RunningTotal, it will keep the Int itself. This can further reduce memory usage.

You're probably asking a pretty good question right now: "how do I know if I should use a strictness annotation on my data fields?" This answer is slightly controversial, but my advice and recommended best practice: unless you know that you want laziness for a field, make it strict. Making your fields strict helps in a few ways:

  • Avoids accidental space leaks, like we're doing here
  • Avoids accidentally including bottom values
  • When constructing a value with record syntax, GHC will give you an error if you forget a strict field. It will only give you a warning for non-strict fields.

The curious case of newtype

Let's define three very similar data types:

data Foo = Foo Int
data Bar = Bar !Int
newtype Baz = Baz Int

Let's play a game, and guess the output of the following potential bodies for main. Try to work through each case in your head before reading the explanation below.

  1. case undefined of { Foo _ -> putStrLn "Still alive!" }
  2. case Foo undefined of { Foo _ -> putStrLn "Still alive!" }
  3. case undefined of { Bar _ -> putStrLn "Still alive!" }
  4. case Bar undefined of { Bar _ -> putStrLn "Still alive!" }
  5. case undefined of { Baz _ -> putStrLn "Still alive!" }
  6. case Baz undefined of { Baz _ -> putStrLn "Still alive!" }

Case (1) is relatively straightforward: we try to unwrap one layer of data constructor (the Foo) and find a bottom value. So this thing throws an exception. The same thing applies to (3).

(2) does not throw an exception. We have a Foo data constructor in our expression, and it contains a bottom value. However, since there is no strictness annotation on the Int in Foo, uwnrapping the Foo does not force evaluation of the Int, and therefore no exception is thrown. By contrast, in (4), we do have a strictness annotation, and therefore caseing on Bar throws an exception.

What about newtypes? What we know about newtypes is that they have no runtime representation. Therefore, it's impossible for the Baz data constructor to be hiding an extra layer of bottomness. In other words, Baz undefined and undefined are indistinguishable. That may sound like Bar at first, but interestingly it's not.

You see, unwrapping a Baz constructor can have no effect on runtime behavior, since it was never there in the first place. The pattern match inside (5), therefore, does nothing. It is equivalent to case undefined of { _ -> putStrLn "Still alive!" }. And since we're not inspecting the undefined at all (because we're using a wildcard pattern and not a data constructor), no exception is thrown.

Similarly, in case (6), we've applied a Baz constructor to undefined, but since it has no runtime representation, it may as well not be there. So once again, no exception is thrown.

EXERCISE What is the output of the program main = Baz undefined `seq` putStrLn "Still alive!"? Why?

Convenience operators and functions

It can be inconvenient, as you may have noticed already, to use seq and deepseq all over the place. Bang patterns help, but there are other ways to force evaluation. Perhaps the most common is the $! operator, e.g.:

mysum :: [Int] -> Int
mysum list0 =
  go list0 0
    go [] total = total
    go (x:xs) total = go xs $! total + x

main = print $ mysum [1..1000000]

This forces evaluation of total + x before recursing back into the go function, avoiding a space leak. (EXERCISE: do the same thing with a bang pattern, and with the seq function.)

The $!! operator is the same, except instead of working with seq, it uses deepseq and therefore evaluates to normal form.

import Control.DeepSeq

average :: [Int] -> Double
average list0 =
  go list0 (0, 0)
    go [] (total, count) = fromIntegral total / count
    go (x:xs) (total, count) = go xs $!! (total + x, count + 1)

main = print $ average [1..1000000]

Another nice helper function is force. What this does is makes it that, when the expression you're looking at is evaluated to WHNF, it's actually evaluated to NF. For example, we can rewrite the go function above as:

go [] (total, count) = fromIntegral total / count
go (x:xs) (total, count) = go xs $! force (total + x, count + 1)

EXERCISE Define these convenience functions and operators yourself in terms of seq and deepseq.

Data structures

Alright, I swear that's all of the really complicated stuff. If you've absorbed all of those details, the rest of this just follows naturally and introduces a little bit more terminology to help us understand things.

Let's start off slowly: what's the output of this program:

data List a = Cons a (List a) | Nil

main = Cons undefined undefined `seq` putStrLn "Hello World"

Well, using our principles from above: Cons undefined undefined is already in WHNF, since we've got the outermost constructor available. So this program prints "Hello World", without any exceptions. Cool. Now let's realize that Cons is the same as the : data constructor for lists, and see that the above is identical to:

main = (undefined:undefined) `seq` putStrLn "Hello World"

This tells me that lists are a lazy data structure: I have a bottom value for the first element, a bottom value for the rest of the list, and yet this first cell is not bottom. Let's try something a little bit different:

data List a = Cons a !(List a) | Nil

main = Cons undefined undefined `seq` putStrLn "Hello World"

This is going to explode in our faces! We are now strict in the tail of the list. However, the following is fine:

data List a = Cons a !(List a) | Nil

main = Cons undefined (Cons undefined Nil) `seq` putStrLn "Hello World"

With this definition of a list, we need to know all the details about the list itself, but the values can remain undefined. This is called spine strict. By contrast, we can also be strict in the values and be value strict:

data List a = Cons !a !(List a) | Nil

main = Cons undefined (Cons undefined Nil) `seq` putStrLn "Hello World"

This will explode in our faces, as we'd expect.

There's one final definition of list you may be expecting, one strict in values but not in the tail:

data List a = Cons !a (List a) | Nil

In practice, I'm aware of no data structures in Haskell that follow this pattern, and therefore it doesn't have a name. (If there are such data structures, and this does have a name, please let me know, I'd be curious about the use cases for it.)

So standard lists are lazy. Let's look at a few other data types:


The vectors in Data.Vector (also known as boxed vectors) are spine strict. Assuming an import of import qualified Data.Vector as V, what would be the results of the following programs?

  1. main = V.fromList [undefined] `seq` putStrLn "Hello World"
  2. main = V.fromList (undefined:undefined) `seq` putStrLn "Hello World"
  3. main = V.fromList undefined `seq` putStrLn "Hello World"

The first succeeds: we have the full spine of the vector defined. The fact that it contains a bottom value is irrelevant. The second fails, since the spine of the tail of the list is undefined, making the spine undefined. And finally the third (of course) fails, since the entire list is undefined.

Now let's look at unboxed vectors. Because of inference issues, we need to help out GHC a little bit more. So starting with this head of a program:

import qualified Data.Vector.Unboxed as V

fromList :: [Int] -> V.Vector Int
fromList = V.fromList

What happens with the three cases above?

  1. main = fromList [undefined] `seq` putStrLn "Hello World"
  2. main = fromList (undefined:undefined) `seq` putStrLn "Hello World"
  3. main = fromList undefined `seq` putStrLn "Hello World"

As you'd expect, (2) and (3) have the same behavior as with boxed vectors. However, (1) also throws an exception, since unboxed vectors are value strict, not just spine strict. The same applies to storable and primitive vectors.

Unfortunately, to my knowledge, there is no definition of a strict, boxed vector in a public library. Such a data type would be useful to help avoid space leaks (such as the original question that triggered this blog post).

Sets and Maps

If you look at the containers and unordered-containers packages, you may have noticed that the Map-like modules come in Strict and Lazy variants (e.g., Data.HashMap.Strict and Data.HashMap.Lazy) while the Set-like modules do not (e.g., Data.IntSet). This is because all of these containers are spine strict, and therefore must be strict in the keys. Since a set only has keys, no separate values, it must also be value strict.

A map, by contrast, has both keys and values. The lazy variants of the map-like modules are spine-strict, value-lazy, whereas the strict variants are both spine and value strict.

EXERCISE Analyze the Data.Sequence.Seq data type and classify it as either lazy, spine strict, or value strict.

Function arguments

A function is considered strict in one of its arguments if, when the function is applied to a bottom value for that argument, the result is bottom. As we saw way above, + for Int is strict in both of its arguments, since: undefined + x is bottom, and x + undefined is bottom.

By contrast, the const function, defined as const a b = a, is strict in its first argument and lazy in its second argument.

The : data constructor for lists is lazy in both its first and second argument. But if you have data List = Cons !a !(List a) | Nil, Cons is strict in both its first and second argument.


A common place to end up getting tripped up by laziness is dealing with folds. The most infamous example is the foldl function, which lulls you into a false sense of safety only to dash your hopes and destroy your dreams:

mysum :: [Int] -> Int
mysum = foldl (+) 0

main :: IO ()
main = print $ mysum [1..1000000]

This is so close to correct, and yet uses 53mb of resident memory! The solution is but a tick away, using the strict left fold foldl' function:

import Data.List (foldl')

mysum :: [Int] -> Int
mysum = foldl' (+) 0

main :: IO ()
main = print $ mysum [1..1000000]

Why does the Prelude expose a function (foldl) which is almost always the wrong one to use?

Hysterical Raisins

But the important thing to note about almost all functions that claim to be strict is that they are only strict to weak head normal form. Pulling up our average example from before, this still has a space leak:

import Data.List (foldl')

average :: [Int] -> Double
average =
  divide . foldl' add (0, 0)
    divide (total, count) = fromIntegral total / count
    add (total, count) x = (total + x, count + 1)

main :: IO ()
main = print $ average [1..1000000]

My advice is to use a helper data type with strict fields. But perhaps you don't want to do that, and you're frustrated that there is no foldl' that evaluates to normal form. Fortunately for you, by just throwing in a call to force, you can easily upgrade a WHNF fold into a NF fold:

import Data.List (foldl')
import Control.DeepSeq (force)

average :: [Int] -> Double
average =
  divide . foldl' add (0, 0)
    divide (total, count) = fromIntegral total / count
    add (total, count) x = force (total + x, count + 1)

main :: IO ()
main = print $ average [1..1000000]

Like a good plumber, force patches that leak right up!

Streaming data

One of the claims of streaming data libraries (like conduit) is that they promote constant memory usage. This may make you think that you can get away without worrying about space leaks. However, all of the comments about WHNF vs NF mentioned above apply. To prove the point, let's do average badly with conduit:

import Conduit

average :: Monad m => ConduitM Int o m Double
average =
  divide <$> foldlC add (0, 0)
    divide (total, count) = fromIntegral total / count
    add (total, count) x = (total + x, count + 1)

main :: IO ()
main = print $ runConduitPure $ enumFromToC 1 1000000 .| average

You can test the memory usage of this with:

$ stack --resolver lts-9.3 ghc --package conduit-combinators -- Main.hs -O2
$ ./Main +RTS -s

EXERCISE Make this program run in constant resident memory, by using:

  1. The force function
  2. Bang patterns
  3. A custom data type with strict fields

Chain reaction

Look at this super strict program. It's got a special value-strict list data type. I've liberally sprinkled bang patterns and calls to seq throughout. I've used $!. How much memory do you think it uses?

#!/usr/bin/env stack
-- stack --resolver lts-9.3 script
{-# LANGUAGE BangPatterns #-}

data StrictList a = Cons !a !(StrictList a) | Nil

strictMap :: (a -> b) -> StrictList a -> StrictList b
strictMap _ Nil = Nil
strictMap f (Cons a list) =
  let !b = f a
      !list' = strictMap f list
   in b `seq` list' `seq` Cons b list'

strictEnum :: Int -> Int -> StrictList Int
strictEnum low high =
  go low
    go !x
      | x == high = Cons x Nil
      | otherwise = Cons x (go $! x + 1)

double :: Int -> Int
double !x = x * 2

evens :: StrictList Int
evens = strictMap double $! strictEnum 1 1000000

main :: IO ()
main = do
  let string = "Hello World"
      string' = evens `seq` string
  putStrLn string

Look carefully, read the code well, and make a guess. Ready? Good.

It uses 44kb of memory. "What?!" you may exclaim. "But this thing has to hold onto a million Ints in a strict linked list!" Ehh... almost. It's true, our program is going to do a hell of a lot of evaluation as soon as we force the evens value. And as soon as we force the string' value in main, we'll force evens.

However, our program never actually forces evaluation of either of these! If you look carefully, the last line in the program uses the string value. It never looks at string' or evens. When executing our program, GHC is only interested in performing the IO actions it is told to perform by the main function. And main only says something about putStrLn string.

This is vital to understand. You can build up as many chains of evaluation using seq and deepseq as you want in your program. But ultimately, unless you force evaluation via some IO action of the value at the top of the chain, it will all remain an unevaluated thunk.


  1. Change putStrLn string to putStrLn string' and see what happens to memory usage. (Then undo that change for the other exercises.)
  2. Use a bang pattern in main somewhere to get the great memory usage.
  3. Add a seq somewhere in the putStrLn string line to force the greater memory usage.

September 12, 2017 05:00 AM

September 10, 2017

Joachim Breitner

Less parentheses

Yesterday, at the Haskell Implementers Workshop 2017 in Oxford, I gave a lightning talk titled ”syntactic musings”, where I presented three possibly useful syntactic features that one might want to add to a language like Haskell.

The talked caused quite some heated discussions, and since the Internet likes heated discussion, I will happily share these ideas with you

Context aka. Sections

This is probably the most relevant of the three proposals. Consider a bunch of related functions, say analyseExpr and analyseAlt, like these:

analyseExpr :: Expr -> Expr
analyseExpr (Var v) = change v
analyseExpr (App e1 e2) =
  App (analyseExpr e1) (analyseExpr e2)
analyseExpr (Lam v e) = Lam v (analyseExpr flag e)
analyseExpr (Case scrut alts) =
  Case (analyseExpr scrut) (analyseAlt <$> alts)

analyseAlt :: Alt -> Alt
analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

You have written them, but now you notice that you need to make them configurable, e.g. to do different things in the Var case. You thus add a parameter to all these functions, and hence an argument to every call:

type Flag = Bool

analyseExpr :: Flag -> Expr -> Expr
analyseExpr flag (Var v) = if flag then change1 v else change2 v
analyseExpr flag (App e1 e2) =
  App (analyseExpr flag e1) (analyseExpr flag e2)
analyseExpr flag (Lam v e) = Lam v (analyseExpr (not flag) e)
analyseExpr flag (Case scrut alts) =
  Case (analyseExpr flag scrut) (analyseAlt flag <$> alts)

analyseAlt :: Flag -> Alt -> Alt
analyseAlt flag (dc, pats, e) = (dc, pats, analyseExpr flag e)

I find this code problematic. The intention was: “flag is a parameter that an external caller can use to change the behaviour of this code, but when reading and reasoning about this code, flag should be considered constant.”

But this intention is neither easily visible nor enforced. And in fact, in the above code, flag does “change”, as analyseExpr passes something else in the Lam case. The idiom is indistinguishable from the environment idiom, where a locally changing environment (such as “variables in scope”) is passed around.

So we are facing exactly the same problem as when reasoning about a loop in an imperative program with mutable variables. And we (pure functional programmers) should know better: We cherish immutability! We want to bind our variables once and have them scope over everything we need to scope over!

The solution I’d like to see in Haskell is common in other languages (Gallina, Idris, Agda, Isar), and this is what it would look like here:

type Flag = Bool
section (flag :: Flag) where
  analyseExpr :: Expr -> Expr
  analyseExpr (Var v) = if flag then change1 v else change2v 
  analyseExpr (App e1 e2) =
    App (analyseExpr e1) (analyseExpr e2)
  analyseExpr (Lam v e) = Lam v (analyseExpr e)
  analyseExpr (Case scrut alts) =
    Case (analyseExpr scrut) (analyseAlt <$> alts)

  analyseAlt :: Alt -> Alt
  analyseAlt (dc, pats, e) = (dc, pats, analyseExpr e)

Now the intention is clear: Within a clearly marked block, flag is fixed and when reasoning about this code I do not have to worry that it might change. Either all variables will be passed to change1, or all to change2. An important distinction!

Therefore, inside the section, the type of analyseExpr does not mention Flag, whereas outside its type is Flag -> Expr -> Expr. This is a bit unusual, but not completely: You see precisely the same effect in a class declaration, where the type signature of the methods do not mention the class constraint, but outside the declaration they do.

Note that idioms like implicit parameters or the Reader monad do not give the guarantee that the parameter is (locally) constant.

More details can be found in the GHC proposal that I prepared, and I invite you to raise concern or voice support there.

Curiously, this problem must have bothered me for longer than I remember: I discovered that seven years ago, I wrote a Template Haskell based implementation of this idea in the seal-module package!

Less parentheses 1: Bulleted argument lists

The next two proposals are all about removing parentheses. I believe that Haskell’s tendency to express complex code with no or few parentheses is one of its big strengths, as it makes it easier to visualy parse programs. A common idiom is to use the $ operator to separate a function from a complex argument without parentheses, but it does not help when there are multiple complex arguments.

For that case I propose to steal an idea from the surprisingly successful markup language markdown, and use bulleted lists to indicate multiple arguments:

foo :: Baz
foo = bracket
        • some complicated code
          that is evaluated first
        • other complicated code for later
        • even more complicated code

I find this very easy to visually parse and navigate.

It is actually possible to do this now, if one defines (•) = id with infixl 0 •. A dedicated syntax extension (-XArgumentBullets) is preferable:

  • It only really adds readability if the bullets are nicely vertically aligned, which the compiler should enforce.
  • I would like to use $ inside these complex arguments, and multiple operators of precedence 0 do not mix. (infixl -1 • would help).
  • It should be possible to nest these, and distinguish different nesting levers based on their indentation.

Less parentheses 1: Whitespace precedence

The final proposal is the most daring. I am convinced that it improves readability and should be considered when creating a new language. As for Haskell, I am at the moment not proposing this as a language extension (but could be convinced to do so if there is enough positive feedback).

Consider this definition of append:

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs++ys)

Imagine you were explaining the last line to someone orally. How would you speak it? One common way to do so is to not read the parentheses out aloud, but rather speak parenthesised expression more quickly and add pauses otherwise.

We can do the same in syntax!

(++) :: [a] -> [a] -> [a]
[]   ++ ys = ys
x:xs ++ ys = x : xs++ys

The rule is simple: A sequence of tokens without any space is implicitly parenthesised.

The reaction I got in Oxford was horror and disgust. And that is understandable – we are very used to ignore spacing when parsing expressions (unless it is indentation, of course. Then we are no longer horrified, as our non-Haskell colleagues are when they see our code).

But I am convinced that once you let the rule sink in, you will have no problem parsing such code with ease, and soon even with greater ease than the parenthesised version. It is a very natural thing to look at the general structure, identify “compact chunks of characters”, mentally group them, and then go and separately parse the internals of the chunks and how the chunks relate to each other. More natural than first scanning everything for ( and ), matching them up, building a mental tree, and then digging deeper.

Incidentally, there was a non-programmer present during my presentation, and while she did not openly contradict the dismissive groan of the audience, I later learned that she found this variant quite obvious to understand and more easily to read than the parenthesised code.

Some FAQs about this:

  • What about an operator with space on one side but not on the other? I’d simply forbid that, and hence enforce readable code.
  • Do operator sections still require parenthesis? Yes, I’d say so.
  • Does this overrule operator precedence? Yes! a * b+c == a * (b+c).
  • What is a token? Good question, and I am not not decided. In particular: Is a parenthesised expression a single token? If so, then (Succ a)+b * c parses as ((Succ a)+b) * c, otherwise it should probably simply be illegal.
  • Can we extend this so that one space binds tighter than two spaces, and so on? Yes we can, but really, we should not.
  • This is incompatible with Agda’s syntax! Indeed it is, and I really like Agda’s mixfix syntax. Can’t have everything.
  • Has this been done before? I have not seen it in any language, but Lewis Wall has blogged this idea before.

Well, let me know what you think!

by Joachim Breitner ( at September 10, 2017 10:10 AM

September 09, 2017

Mikhail Glushenkov

What's new in Cabal/cabal-install 2.0 — improved new-build, Backpack, foreign libraries and more!

A couple of weeks ago we’ve quietly released versions 2.0 of both Cabal and cabal-install after approximately a year of development. The 2.0 release incorporates more than 1500 commits by 64 different contributors. This post serves as a formal release announcement and describes what’s new and improved in version 2.0.

There is a number of backwards-incompatible Cabal library API changes in this release that affect packages with Custom setup scripts. Therefore cabal-install will by default use a previous version of Cabal to build setup scripts that don’t explicitly declare compatibility with Cabal 2.0. The 2.0 migration guide gives advice for package authors on how to adapt Custom setup scripts to backwards-incompatible changes in this release.

Major new features

  • Much improved new-build feature (also known as nix-style local builds), that solves many long-standing problems and is going to become the default mode of operation of cabal-install in version 3.0 (tentative release date: Autumn 2018). Killer features of new-build are reproducible isolated builds with global dependency caching and multi-package projects. For a more extensive introduction to new-build, see this blog post by Edward Z. Yang.

  • Support for Backpack, a new system for mix-in packages. See this article by Edward Z. Yang for an introduction to Backpack and its features.

  • Native suport for foreign libraries: Haskell libraries that are intended to be used by non-Haskell code. See this section of the user guide for an introduction to this feature.

  • Convenience/internal libraries are now supported (#269). An internal library is declared using the stanza library 'libname' and can be only used by other components inside a package.

  • Package components can be now built and installed in parallel. This is especially handy when compiling packages with large numbers of independent components (usually those are executables). As a consequence, the Setup.hs command-line interface now allows to specify the component to be configured.

  • Nix package manager integration (#3651).

  • New cabal-install command: outdated, for listing outdated version bounds in a .cabal file or a freeze file (#4201). Work on this feature was sponsored by Scrive AB.

  • New cabal-install command reconfigure, which re-runs configure with the most recently used flags (#2214).

  • Package repos are now assumed to be hackage-security-enabled by default. If a remote-repo section in ~/.cabal/config doesn’t have an explicit secure field, it now defaults to secure: True, unlike in cabal-install 1.24. See this post on the Well-Typed blog for an introduction to hackage-security and what benefits it brings.

  • New caret-style version range operator ^>= (#3705) that is equivalent to >= intersected with an automatically inferred major upper bound. For example, foo ^>= 1.3.1 is equivalent to foo >= 1.3.1 && < 1.4. Besides being a convenient syntax sugar, ^>= allows to distinguish “strong” and “weak” upper bounds: foo >= 1.3.1 && < 1.4 means “I know for sure that my package doesn’t work with foo-1.4”, while foo ^>= 1.3.1 means “I don’t know whether foo-1.4, which is not out yet, will break my package, but I want to be cautious and follow PVP”. In the future, this feature will allow to implement automatic version bounds relaxation in a formally sound way (work on this front is progressing on See this section of the manual for more information.

  • Changed cabal upload to upload a package candidate by default (#3419). Same applies to uploading documentation. Also added a new cabal upload flag --publish for publishing a package on Hackage instead of uploading a candidate (#3419).

  • Support for --allow-older (dual to --allow-newer) (#3466).

  • New build-tool-depends field that replaces build-tools and has a better defined semantics (#3708, #1541). cabal-install will now install required build tools and add them to PATH automatically.

  • New autogen-modules field for automatically generated modules (like Paths_PACKAGENAME) that are not distributed inside the package tarball (#3656).

  • Added a new scope field to the executable stanza (#3461). Executable scope can be either public or private; private executables are those that are expected to be run by other programs rather than users and get installed into $libexecdir/$libexecsubdir. Additionally, $libexecdir now has a subdir structure similar to $lib(sub)dir to allow installing private executables of different packages and package versions alongside one another.

  • New --index-state flag for requesting a specific version of the package index (#3893, #4115).

  • Added CURRENT_PACKAGE_VERSION CPP constant to cabal_macros.h (#4319).

Minor improvements and bug fixes

  • Dropped support for versions of GHC earlier than 6.12 (#3111). Also, GHC compatibility window for the Cabal library has been extended to five years (#3838).

  • Added a technical preview version of the ‘cabal doctest’ command (#4480).

  • Cabal now invokes GHC with -Wmissing-home-modules, if that flag is supported (added in version 8.2). This means that you’ll get a warning if you forget to list a module in other-modules or exposed-modules (#4254).

  • Verbosity -v now takes an extended format which allows specifying exactly what you want to be logged. The format is [silent|normal|verbose|debug] flags, where flags is a space separated list of flags. At the moment, only the flags +callsite and +callstack are supported; these report the call site/stack of a logging output respectively (these are only supported if Cabal is built with GHC 8.0/7.10.2 or greater, respectively).

  • The -v/--verbosity option no longer affects GHC verbosity (except in the case of -v0). Use --ghc-options=-v to enable verbose GHC output (#3540, #3671).

  • Packages which use internal libraries can result in multiple registrations; thus --gen-pkg-config can now output a directory of registration scripts rather than a single file.

  • Changed the default logfile template from .../$pkgid.log to .../$compiler/$libname.log (#3807).

  • Macros in ‘cabal_macros.h’ are now #ifndef’d, so that they don’t cause an error if the macro is already defined (#3041).

  • Added qualified constraints for setup dependencies. For example, --constraint=" == 1.0" constrains all setup dependencies on bar, and --constraint=" == 1.0" constrains foo’s setup dependency on bar (part of #3502).

  • Non-qualified constraints, such as –constraint=“bar == 1.0”, now only apply to top-level dependencies. They don’t constrain setup or build-tool dependencies. The new syntax --constraint=" ==1.0" constrains all uses of bar.

  • Added a new solver flag, --allow-boot-library-installs, that allows normally non-upgradeable packages like base to be installed or upgraded (#4209). Made the ‘template-haskell’ package non-upgradable again (#4185).

  • Fixed password echoing on MinTTY (#4128).

  • Added optional solver output visualisation support via the tracetree package (#3410). Mainly intended for debugging.

  • New ./Setup configure flag --cabal-file, allowing multiple .cabal files in a single directory (#3553). Primarily intended for internal use.

  • Removed the --check option from cabal upload (#1823). It was replaced by Hackage package candidates.

  • Removed the --root-cmd parameter of the ‘install’ command and deprecated cabal install --global (#3356).

  • Removed the top-down solver (#3598).

  • Cabal no longer supports using a version bound to disambiguate between an internal and external package (#4020). This should not affect many people, as this mode of use already did not work with the dependency solver.

  • Miscellaneous minor and/or internal bug fixes and improvements.

See the full Cabal 2.0 and cabal-install 2.0 changelogs for the complete list of changes in the 2.0 release.


Thanks to everyone who contributed code and bug reports. Full list of people who contributed patches to Cabal/cabal-install 2.0 is available here.

Looking forward

We plan to make a new release of Cabal/cabal-install before the end of the year – that is, around December 2017. We want to decouple the Cabal release cycle from the GHC one; that’ll allow us to release a new version of Cabal/cabal-install approximately every six months in the future. Main features that are currently targeted at the 2.2 milestone are:

We would like to encourage people considering contributing to take a look at the bug tracker on GitHub, take part in discussions on tickets and pull requests, or submit their own. The bug tracker is reasonably well maintained and it should be relatively clear to new contributors what is in need of attention and which tasks are considered relatively easy. Additionally, the list of potential projects from the latest hackathon and the tickets marked “easy” and “newcomer” can be used as a source of ideas for what to work on.

For more in-depth discussion there is also the cabal-devel mailing list and the #hackage IRC channel on FreeNode.

September 09, 2017 12:00 AM

September 08, 2017

Functional Jobs

Backend Ruby and Haskell engineer at Health eFilings (Full-time)

Our backend engineering team manages the ingestion and normalization of data sets, from data extraction through to product delivery. We want to work smarter instead of harder, and create domain specific languages, meta-programming etc. where possible.

Our current code base is written in Ruby and Coffee Script, but some new modules are being written in Haskell. You will be on the front lines of creating a Haskell-based infrastructure that is maintainable and can scale to support our needs as we grow.

We currently expect that about 80% of your work will be in Ruby/CoffeeScript, and 20% in Haskell, but that ratio will decrease over time as we move more of our functionality to Haskell. (The faster you can work to migrate functionality to Haskell, the more Haskell you will be doing.)


You will have ownership of an entire module, including responsibility for:

  • Creating new features in a clean and maintainable way
  • Re-factoring existing code to ensure that we stay agile
  • Reviewing teammates’ code and providing feedback
  • Keeping yourself focused and your projects on track
  • An “I can run through walls” mentality to ensure that goals are met
  • Answering questions from our implementation team and squashing bugs on a monthly support rotation

We are a small team (four engineers), and so it is critical that you be a team player, willing to pitch in and help out your colleagues.


  • Autonomy to solve problems in the way you best see fit
  • A manager who is accountable for ensuring you meet your professional goals
  • A team who helps each other and always strives to improve
  • The time to focus on creating the right solution, instead of the easiest one


  • Professional experience as a software engineer
  • Experience with Haskell and Ruby
  • A desire for continual self-improvement
  • An understanding of best practices regarding maintainability and scalability
  • Must have US work authorization and be located in the US (we cannot sponsor visas at this time)
  • There are no formal education requirements for this position


  • Experience with data scraping and parsing


This is expected to be a remote position, although our Madison, Wisconsin office is also available as a work location.

Get information on how to apply for this position.

September 08, 2017 09:43 PM

Dominic Orchard

ICFP / FSCD day 1 – rough notes

(Blog posts for Day 1, Day 2, Day 3, Day 4 (half day))

I decided to take electronic notes at ICFP and FSCD (colocated) this year, and following the example of various people who put their conference notes online (which I’ve found useful), I thought I would attempt the same. However, there is a big caveat: my notes are going to be partial and may be incorrect; my apologies to the speakers for any mistakes.

(ICFP keynote #1) Computational Creativity, Chris Martens (slides)

In the early days, automated theorem proving by computers was seen as an AI activity. Theorem proving, program synthesis, and AI planning are search procedures but can be viewed as creative procedures.

Linear logic is useful for describing plots (story telling).

There is a difference between what an AI opponent and AI cooperator should do: an opponent need not make intelligible moves, but a cooperator needs to act in a way that is intelligible/understandable to the human player.

Applied Grice’s maxims of conversation to build an “intentional” cooperative player for the Hanabi game. Grice’s maxims are: quantity (not too much or little), quality (tell the truth), relation (be relevant), manner (avoid ambiguity).

Theory of mind: forming mental models of other people’s mental models.

Dynamic epistemic logic, has two indexed modalities:
\Box_a A   (agent ‘a’ believes A)
[\alpha] A   (A holds true after action \alpha).

Actions are defined inductively:
\begin{array}{rll} \alpha, \beta & = \\ & \textit{flip} \; p & \text{(change truth value)} \\ \mid & ?A & \text{(precondition)} \\ \mid & \alpha + \beta & \text{(non deterministic choice)} \\ \mid & \alpha ; \beta  & \text{(sequence)} \\ \mid & \alpha^a & \text{(appearance to agent 'a')} \\ \mid & \alpha* & \text{(public action)} \end{array}

Semantics is based on a possible worlds formulation (current actions change future possible worlds). Ostari (Eger & Martens) is a full epistemic logic language that can capture lots of different games with different kinds of knowledge between players.

Key point: use compositionally (of various kinds) as a design principle

Three research challenges:

  • Dynamic logics with functional action languages
  • Constructive/lambda-calculable DEL
  • Epistemic session types

(FSCD keynote #1) Brzozowski Goes Concurrent – Alexandra Silva

Ongoing project CoNeCo (Concurrency, Networks, and Coinduction)

SDN – Software Defined Network architectures let you write a program for the network controller. Requires languages for the controller to interact with the underlying switches / routers. Goals of new network PL: raise the level of abstraction beyond the underlying OpenFlow API– make it easier to reason about. Based on Kleene algebra.

NetKat – based on regular expression (Kleene algebra) with tests (KAT) + additional specialized constructions relating to networking. This is compiled to OpenFlow.

Kleene algebra, e.g., (0 + 1(01^*0)^*1)^* – multiples of 3 in binary.

For reasoning about imperative programs, this can be extended with the idea of ‘tests’ (Kozen). Need to capture more of the control flow graph, but there are guards (tests). The solution is to split the underlying alphabet of the Kleene algebra into two: one for actions and one for tests: combine a Kleene algebra and a Boolean algebra (where there is negation), e.g. $\bar{b}$ is the negation of b. Then the three standard control flow constructs of an imperative language are defined:

\begin{array}{rl} p ; q & = pq \\ \textbf{if} \; b \; \textbf{then} \; p \; \textbf{else} \; q & = bp + \bar{b}q \\ \textbf{while} \; b \; \textbf{do} \; p & = (bp)^*\bar{b} \end{array}

Subsumes Hoare logic, i.e.:
b \{p\} c \Leftrightarrow b\,p\,\bar{c} = 0

Decidable in PSPACE.

What is the minimum number of constructs needed to be added to KAT to deal with packet flow?

A packet is an assignment of constant values n to fields x a packet history is a nonempty sequence of packets. Include assignments (field <- value) and tests on fields (filed = value). e.g.,

sw = 6; pt = 88; dest <-; pt <- 50

For all packets incoming on port 88 of switch 6 set the destination IP to and send the packet to port 50.

Can then reason about reachability (can two hosts communicate)? Security (does all untrusted traffic pass through an intrusion detection)? loop detection (packet to be stuck in a forwarding cycle).

Since networks are non-deterministic, this has been adapted to probabilistic NetKAT.

Can we add concurrency to NetKAT (deal with interaction between different components): within this paradigm, can we add concurrency to Kleene algebra? Concurrent Kleene algebra adds a parallel composition || (cf. Hoare 2011). Would like to reason about weak memory model like behaviour (cf. Sewell’s litmus tests). There were developments in concurrent Kleene algebra, but a solid foundations was not yet available.

Have proved a Kleene theorems: equivalence between Kleene algebra and finite automata. In this case, the corresponding automata to a concurrent Kleene algebra is a pomset automata.

Can make Concurrent Kleene algebras into regular subsets of pomsets (useful model of concurrency, cf Gischer) i.e., [-] : 2^{\textsf{Pom}_{\Sigma}}.

Pomset automata- has the idea of splitting into two threads (two paths) that once they reach their accept states, then go onto the next state of the originating state (can be expressed as multiple automata). This reminds me of Milner’s interleaving semantics (for atomic actions) in CCS. Alphabet of the automata is a pomset over some other alphabet of actions.

Showed the (Brzozowski) construction to build an automata from a Concurrent Kleene algebra term, and then how to go back from automata to expressions.

The exchange law captures the interaction of sequential and parallel interaction. Currently working on extending the Kleene theorem in this work to include the exchange law.

(ICFP) Faster Coroutine Pipelines, Mike Spivey

(I missed the start and had some trouble catching up).

Coroutine pipelines are a way to structure stream-processing programs. There are libraries for doing this that are based on a term encoding that is then interpreted which suffers serious slow-down when there is process nesting. This paper proposes an alternate approaching using continuation which doesn’t suffer these slow down problems.

The direct style can be mapped to the CPS style which is a monad morphism between two Pipe monads (capturing the coroutine pipelines). To show that the two models are equivalent could be done with logical relations (Mike recommended Relating models of backtracking by Wand and Vaillancourt (2004)).

Termination. Consider a situation:
p1 \mid\mid ((p2 \mid\mid p3 \mid\mid p4) ; q) \mid\mid p5
where p3 then terminates.

Need to end up with p1 \mid\mid q \mid\mid p5 not p1 \mid\mid ((p2 \mid\mid p4) ; q) \mid\mid p5. (I wasn’t sure if this was an open question or whether there was a solution in this paper).

Ohad Kammar asked on the ICFP slack whether this relates to
Ploeg and Kiselyov’s paper at Haskell 2014 but this didn’t get asked.

(ICFP) A pretty but not greedy printer (Functional pearl), Jean-Philippe Bernardy


Hughes proposed a pretty printing approach, which was improved upon by Wadler. For example, if you want to pretty print some SExpr but with a maximum column width: when do you spill onto the next line?

Proposes three laws of pretty-printing:
1). Do not print beyond the right margin
2). Do not reveal the structure of the input
3). Use as few lines as possible.

(Hughes breaks 3 and Wadler breaks 2 (I didn’t spot how in the talk)).
These three rules are not compatible with a greedy approach.
This work trades performance for being law-abiding instead.

A small API is provided to write pretty printers (four combinators:
text, flush, (horizontal composition), ).

Phil Wadler asked how you represent hang in the library. JP quickly typed up the solution:
hang x y = (x text " " y) (x $$ (text " " y))

Phil countered that he could do the same, but JP answered that it didn’t have the expected behaviour. I’m not entirely sure what hang does, but in the Hughes-based library (, hang :: Doc -> Int -> Doc -> Docwith behaviour like:

Prelude Text.PrettyPrint> hang (text "Foo\nBar") 0 (text "Bar")
Bar Bar

(not sure what the extra Int argument is for).

(ICFP) Generic Functional Parallel Algorithms: Scan and FFT – Conal Elliott

Arrays are the dominant type for parallel programming, but FP uses a variety of data types. Generic programming decomposes data types into fundamental building blocks (sums, products, composition).

Perfect trees can be represented as a n-wise composition of a functor $h$.

Prefix sum (left scan)

b_k = \Sigma_{1 \leq i \leq k} a_i
for k = 1, ..., n+1, e.g., scan [1,2,3,4] = [1,3,6,10]
Define more generally as:

class Functor f => LScan f where
  lscan :: Monoid a => f a -> f a x a

Instances were then given for the components of a datatype generic representation, (constant types, type variable, sum, product, composition) which gives data type generic scan. (Insight: Do left scans on left vectors, not left scans on right vectors! The latter adds a lot of overhead).

Data types were describe numerically, e.g., 2 is a pair, \stackrel{\leftarrow}{4} is a 4-vector.

Can do the esame with Discrete Fourier Transform. A nice property is that a
DFT can be factored into separate parts (Johnson 2010, e.g., a 1D DFT of size N = N1*N2 is equivalent to a 2D DFT of size N1xN2– I couldn’t see this so well on the slide as I was in the overflow room).

(ICFP) A Unified Approach to Solving Seven Programming Problems (Functional Pearl) – William E. Byrd, Michael Ballantyne, Gregory Rosenblatt, Matthew Might

Started off referencing – 99 ways to say “I love you” in Racket (all based on different list operations). But what if you wanted more, for some other language? How could you generate this, to say get 1000, or 10000 such examples? The solution put forward here is to using miniKanren, the constraint logic language. The eval function is turned into a relation then can be “run” in reverse, (i.e., run 99 (q) (evalo q '(I love you)) gives 99 S-expressions that evaluate to ​(I love you)). What about for generating quines (programs that evaluate to themselves)? (run 1 (e) (evalo e e)) produced just the 0 expression and getting three results was just 0#t and #f but with 4 it produced a more interesting quine.

Then, William showed (run n (p q) (eval p q) (evalo q p) (/= q p) generating “twines” (pair of mutually producing programs). This was then applied to a proof checker to turn it into an automated theorem prover! The same technology was used then for program synthesis (although some additional work was needed under the hood to make it fast enough).

(FSCD) Relating System F and λ2: A Case Study in Coq, Abella and Beluga – Jonas KaiserBrigitte PientkaGert Smolka

System F (Girard ’72): Two-sorted, types and terms, based on the presentation of System F in Harper’13. Meanwhile, study of CC lead to Pure Type Systems- System F appeared in the Lambda Cube at the corner \lambda2. This work shows the formal relation (that indeed these two systems are equivalent). A proof is partially given in Geuvers ’93, but a full Coq formalisation was given by Kaiser, Tebbi and Smolka at POPL 2017. In this work, the proof is replayed across three tools to provide a useful comparison of the three systems: Coq, Abella and Beluga.

A complication is that the syntax is non-uniform on the System F side: \Pi x : a . b in PTS can correspond to A \rightarrow B and \forall X . B. Have to also keep track of context assumptions. One (of many) “topics of interest” in this comparison is how to manage contexts (which is interesting to me as I am quite interested in CMTT in Beluga).

The approaches were:

  • Coq – first-order de Bruijn indices, parallel substitutions (Autosubst library), and invariants (e.g, a context can be extended with a new term variable). Inductive predicate for typing judgments. Traversal of binders requires context adjustments.
  • Abella – HOAS, nominal quantification (fresh names handled internally), relation proof search [Abella has two layers: specification and logic]. Contexts represented by lists (as in Coq). A compound inductive predicate is define to relate the different context representations (and to keep them all consistent).
  • Beluga – HOAS, 1st class contexts (via Contextual Modal Type Theory) and context schemas. Object K (e.g., terms, types, deriviations) are alway paired with a 1-st class context \Gamma that gives it meaning [\Gamma \vdash K]. There is no concept of free variable., e.g., in Coq 0 \vdash 0_{ty} \rightarrow 0_{ty} \textsf{ty} \Rightarrow \bot is provable but in Belluga \bullet \vdash 0 is not even well formed (accessing the first variable 0 (think de Bruijns) in an empty context). Context schemas provide a way to give the structure of the typed contexts (as a dependent record).

(ICFP) A Framework for Adaptive Differential Privacy – Daniel Winograd-CortAndreas HaeberlenAaron RothBenjamin C. Pierce

Associated GitHub:

They created a new language, Adaptive Fuzz for adaptive differential privacy (DP), uses Fuzz type checking to verify individual pieces in a piecewise static way.

DFuzz is a dependently typed version of Fuzz, can we use this? Privacy cost dependent on feature selection: which could depend on values in the databases: but this is practically useless as we don’t know what is in the database. We can still get static guarantees if it is done piece-wise in two modes: data mode (access to sensitive data, uses the Fuzz type checker to provide DP proofs) and adaptive mode (computation between pieces). End up with “light” dependent types.

These two modes have the same syntax though (code it one works in the other). Here is an example type for a function that scales one number by the other (multiplication by abs):

\textsf{scale} \; (c : \mathbb{R}) \; (n : [\textsf{abs} \; c] \mathbb{R}) : \mathbb{R}

(scale the second number by the first argument). In adaptive mode the annotations are ignored, in data mode the first argument is forced to be a constant (partial evaluation happens under the hood so we don’t need dependent types really).

General scheme is: analyst (program) writes data mode code, it is partially evaluated and the Fuzz type checker works out a cost, which is the compared against a privacy filter (where/how is this defined?) Can write a gradient descent algorithm with different levels of randomness provided by different privacy budgets (how much information do we want to reveal at the end, e.g., if budget is infinite then you get not noise in the result). (another example in the paper is feature selection).

This seems like a really interesting mix of a static and semi-static approach (via partial evaluation). I wonder how this relates to fine-grained cost-tracking type systems?

(ICFP) – Symbolic Conditioning of Arrays in Probabilistic Programs –
Praveen Narayanan, Chung-chieh Shan

(tag line: Bayesian inference via program transformation).
Say you are benchmarking some iterative code, which you model by a n + b for some startup cost b and n iterations and a a cost for the loop. You might then collect some results and do linear regression (best fist line) to see if this model is confirmed. Bayesian linear regression gives you many lines for different kinds of fit.

  1. Write a generative model, e.g., some normally distributed n to generate a set of possible lines (normally distributed gradients), to which we maybe then add some additional normally distributed noise for each point.  (k measurements)
  2. Observe running times
  3. Infer distribution of lines. Delete those candidates from the generative model that weren’t observed (from step 2), keep the ones that were.

Bayesian inference is compositional: you can build it from different components, by treating distributions as programs (e.g., a program normal (a * n_1 + b) is gives us one of the lines in the generative model) and then transforming the program. A program transformation implements the model-theoretic notion of “disintegration” but the existing approach for this is problematic (I wasn’t quite clear on how). A new transformation approach is proposed which provides a smaller output program via some additional combinators which replace an unrolled set of k observations into a replication like combination (plate).

Definitional interpreters: programs that take ASTs and run them, giving an executable semantics. Key challenge: a parameterised interpreter that recovers both concrete and abstract semantics.

A concrete interpreter for the lambda calculus: exp -> m (val) where m is a monad which (at least) captures the environment and the store (which is used for representing the variable environment). The definition is written non-recursive “unfixed” (unrolled), which can be fixed (e.g., apply to Y combinator) later to run the interpreter. By leaving it unfixed, we can interpose different behaviour in the recursive cases.

To make the interpreter abstract, abstract the primitive operations and types, e.g., integers are extended with an ‘abstract’ value, i.e., \mathbb{Z} \cup \{\texttt{'N}\} where 'N represents the abstract integer. The semantics is then main non-deterministic so that an iszero? predicate on 'N splits into two execution traces. But its partial (could run forever).

How to make it total (always terminating). Look for states that it has already seen before, sufficient for termination but unsound for abstraction. If you observe fact 'N then one branch fails (non-terminating as visit previous state) and the 1 branch succeeds, so the result is just 1 (not ideal). Instead, look in a “cache” of precomputed values to return something like [[fact 'N]] = {1} U {'N x $[fact 'N]}. Intercept recursion points to evaluate through a cache and to stop when a previously computed value is hit (sounds space intensive?).

(ICFP) – On the Expressive Power of User-Defined Effects: Effect Handlers, Monadic Reflection, Delimited Control – Yannick Forster, Ohad Kammar, Sam Lindley, Matija Pretnar

Say you want to write a program like:

toggle = { x <- get!
           y <- not! x
           put! y
           x }

If we don’t have effects, then we can do explicit state passing, but you have to do a full rewrite. Ideally want only local transformations. (Ohad showed this in three different styles, see below).

Relative expressiveness in language design, compare/contrast: (1 – Eff) algebraic effects and handlers (2 – Mon) monads (3 – Del) delimited control. Did this by extending CBPV in these three directions (and formalised this in Abella) and defining macro translations between every pair of these extended languages. Expressivity is stated as formal translations between calculi. Then this was considering in the typed setting. Interestingly there is no typed macro translation from Eff to Del nor from Eff to Mon.

[There is a large design space which yields lots of different languages, this study is a start. Inexpressivity is brittle: adding other language features can change the result.]

In the Eff version, we give standard state handlers to implement get and put. Its type contains the effect-system like information State = {get : 1 -> bit, put : bit -> 1}where toggle : U_State F bit.

In the Mon version, monadic reflection is used with a state monad.
(Q: put was of type U (bit -> F) but I thought the monad would be T = U F?)

(Q: In the translation from Mon -> Eff, how do generate the effect type if the source program only uses put or get ? I guess it just maps to the set of all effect operations).

(Q: the talked showed just simple state effects; are the results shown for any other kind of effect).

(ICFP) – Imperative Functional Programs That Explain Their WorkWilmer Ricciotti, Jan Stolarek, Roly Perera, James Cheney

Slicing can be used to explain particular (parts of) a program output: which parts of the source code contributed to a value. Extend TML (Transparent ML) to Imperative TML.

Construct backwards and forward slicing as a Galois connection (between two lattices). The lattices are based on the idea of definedness: where ‘holes’ make programs more undefined, e.g., 1 + 2 is above 1 + \Box and \box + 2 (partial expressions). Forward slicing preserves meets in the lattice; backward slicing should be consistent and minimal with respect to forward slicing (I missed what this meant exactly, but I talked to Janek later to clarify: consistency is x \leq \textsf{fwd}(\textsf{bwd}(x))). The idea then is that we have \textsf{bwd} : values_\Box \rightarrow expr_\Box and \textsf{fwd} : expr_\Box \rightarrow values_\Box which form a Galois connection, i.e., \textsf{bwd}(x) \leq y  \Leftrightarrow x \leq \textsf{fwd}(y). [Given one of the slicings we don’t necessarily get the other for free].

Given a small program l1 := 0; (!l1, !l2) a partial version of it \Box; (!l1, !l) and an initial store [l1 -> 1, l2 -> 2] should forward slice to (\Box, 2) by propagating the “hole” through the store.

(ICFP) Effect-Driven QuickChecking of Compilers – Jan Midtgaard, Mathias Nygaard Justesen, Patrick Kasting, Flemming Nielson, Hanne Riis Nielson

They made a generator for writing OCaml programs: useful for automated testing, e.g., comparing two compilers against each other (compile -> run -> diff, any difference will be suspicious).

Generated a program let k = (let i = print_newline () in fun q -> fun i -> "") () in 0 which found a bug: different behaviour between the native code generator and the bytecode generator due to effects being delayed (indefinitely). Another example was found (related), due to effects being removed.

Program generation, how to do it:

  • Try: generate arbitrary strings?! Most won’t lex or parse).
  • Instead, follow the grammar (see Celento et al ’80). Most won’t type check!
  • Instead, follow the simply-typed lambda calculus (Palka et al. ’11) by bottom-up reading of a typing relation. This will make it through the type checker (I guess you can generalise the type rules).

Observed behaviour in OCaml depends on evaluation order (when effects are involved). OCaml bytecode uses right-to-left but sometimes native code backend uses left-to-right (hm!) But the specification is itself a bit loose. Can we avoid generating such programs since they are not well specified for anyway?

  • Instead, following a type-and-effect systems which has one bit for marking pure/effectful (boolean lattice effect system). Extend to a pair of bits, where the second bit is order dependence (extend the effect algebra a bit more). Our goal is then a program which may have effects but may not have order-dependent effects.

Have an effect-dependent soundness theorem (that the effect bits correctly anticipate the effects).

Next, there is a type preserving shrinker to reduce huge ugly generated programs into a minimal (much nicer) example. e.g., replace a pure integer expression by a literal 0 and large literals by small literals, replace beta-redexes by let bindings, etc. Shrunk tests are checked whether they still show a difference.

A question was asked whether the effect-based mechanism could be used to rule-out other allowed difference, e.g., in floating point. Jan answered that floating point isn’t generated anyway, but an interesting idea.

Another question: how do you avoid generating non-terminating programs? Jan explained that since it is STLC (+ extension) only total terms are generated.

by dorchard at September 08, 2017 01:05 PM

September 07, 2017

Dominic Orchard

FSCD day 4 – rough notes

(Blog posts for Day 1, Day 2, Day 3, Day 4 (half day))

I decided to take electronic notes at ICFP and FSCD (colocated) this year, and following the example of various people who put their conference notes online (which I’ve found useful), I thought I would attempt the same. However, there is a big caveat: my notes are going to be partial and may be incorrect; my apologies to the speakers for any mistakes.

(FSCD Keynote #3) Type systems for the relational verification of higher order programs, Marco Gaboardi

Relational properties R(X_1, X_2) \Rightarrow S(P(X_1), P(X_2)). For example, take R and S to be notions of program equivalence (equivalent inputs produce equivalent outputs). Another relation might be in information-flow security where relations R, S mean “these two programs are low-level equivalent” (low-security).
Another is differential privacy where R means two programs differ in one individual data point and  S(Y_!, Y_@) = Pr[Y_1] \leq e^\epsilon Pr[Y_2].

In relational cost analysis, we want to compute the difference in cost (the relative cost) between the two programs (may depend on the input and the underlying relation), e.g., cost(P)-cost(Q) \leq f(X_1, X_2, R, S) (giving an upper bound computed in terms of the inputs and underlying relations). This is useful for guaranteeing compile optimisations (not worse, or definitely better), ruling out side-channel attacks (i.e., the \textsf{cost}(e[v_1/x]) - \textsf{cost}(e[v_2/x]) = 0, such that different inputs do not yield different costs thus information about the inputs is not leaked).

Motivating example 1: find in a list of lists (2D data structure). We want to prove that one implementation of this algorithm is faster than the other (I’ve omitted the code that Marco showed us, which was expressed via a common higher-order function with two different parameter functions to give the two implementations).
Motivating example 2: prove a precise upper bound on the relative cost of insertion sort between two lists (different by n in length).

[Clarkson, Schneider ’08] – formalises the idea of ‘hyperproperties’. Properties are sets of traces, hyperproperties are sets of sets of traces. Relation verification is a set of pair of traces (a 2-property). They show how to reduce verification of hyperproperties into the verification of properties. Using these kind of results requires encodings, they do not reflect the relational nature of the property, and they do not reflect the connection of the relational reasoning to the programs syntax (see the talk from ICFP day 2, on A Relational Logic for Higher-order Programs). Lots of previous work (going back to Abadi’s System R).

Relational typing: \Gamma \vdash t_1 \approx t_2 : \tau. Talk about relational properties of the input and relational properties of the output. Usually, if we interpret to relations then we want to prove soundness as: (\bar{v_1}, \bar{v_2}) \in [\Gamma] \Rightarrow ([t_1[\bar{v_1}/\Gamma]], t_2[\bar{v_2}/Gamma]]) \in [\tau].\
How can we reason about the cost? Two approaches from here: (1) lightweight typing  (bottom up) extending type system with the necessary constructors, (2) heavyweight typing (top down) use a powerful type system and encode the needed constructions. (In the future, we want to find a balance between the two approaches).

Approach (1) (lightweight) called Relcost a relational refinement type-and-effect system. The idea is to take advantage of structural similarities between programs and inputs as much as possible. There will be two sorts of type (unary and relational). Typing judgments \Omega \vdash^U_L t : A where U is the upper bound and L is the lower bound execution cost, which are thought of as a unary effect. A relational judgement \Gamma \vdash t_1 \ominus t_2 \preceq D : \tau where D is the upper bound on the relative cost of the two programs. The types have an annotation function type (like in normal effect systems) \sigma \xrightarrow{U, L} \tau and data types for integers and lists have indices to express dependencies.

e.g. n : \textsf{int}, f : \textsf{int} \xrightarrow{k, k} \textsf{int} \vdash^{k+1}_1 \textsf{if} \, n \leq 0 \textsf{then} f \; n \textsf{else} \; 1 : \textsf{int}.
But if we know that n is known to be 5 this can be refined (with the type n : \textsf{int}[5] in the indices, which will change the bound to $\vdash^{k+1}_{k+1}$.

The relational types are a little different. There is still a latent effect on function types, but is now a single relative cost \sigma \xrightarrow{D} \tau.
Judgements are like the following, which says equal integers have zero cost: \Gamma \vdash n \ominus n \preceq 0 : \textsf{int}_r.

The unary cost and relational cost are connected as follows:
\dfrac{|\Gamma| \vdash^U_{-} t_1 : A \qquad |\Gamma| \vdash^{-}_L t_2 : A}{\Gamma \vdash t_1 \ominus t_2 \preceq U - L : \textsf{U} A}
Thus, we can drop into the unary cost to compute a worst and best run time individual, and then combine these into a relative cost.
There is a box modality which captures terms which have the same cost and allow a reseting therefore of the differential cost, e.g., if we have identical terms then:
\dfrac{\Gamma \vdash t \ominus t \preceq D : \tau \qquad \forall x . \Gamma(x) \sqsubseteq \Box \Gamma(x)}{\Gamma \vdash t \ominus t \preceq 0 : \Box \tau}.
The type \textsf{List}^\alpha{I} \tau captures lists which differ in less than \alpha positions. This interacts with the \Box type in order to capture the notion of differing numbers of elements in the list.
In the semantics, the relative cost is pushed inside the semantics (some step-indexed logical relations are used, details not shown).
Going back to the earlier example of searching nested lists, we see a typing of 1 \leq |\texttt{find1}| \leq 3n and 3n \leq |\texttt{find2}| \leq 4n meaning \texttt{find1} has a relatively smaller cost. Plugging this into the higher-order function and doing the type derivation again gives a looser bound.

Approach (2) is the heavyweight approach was presented earlier at ICFP (A Relational Logic for Higher-order Programs), the language RHOL (Relational Higher Order Logic). Has relational and unary judgments, where the relational rule is \Gamma \mid \Psi \vdash t_1 : \tau_1 \approx t_2 : \tau_2 \mid \theta which contains term variable assertions \Psi and relational assertion \theta. In the type system, some rules are synchronous (building up the related two terms with the same syntactic construct) or asynchronous (building up one of the terms).

HOL terms can be encoded into RHOL, and vice versa (a bit like the hyperproperty result of Clarkson/Schneider), but RHOL is much easier to work with for relational reasoning but also all the power of HOL can be embedded into RHOL.
The system R^C is a monadic meta language with a specific cost effect (with \textsf{cstep}_n(m) introducing a cost. Intuition: if a term m is closed and $m \Downarrow_n v$ then $m \cong \textsf{cstep}_n(v)$.We can define formulae in HOL which all us to reason about these explicit cost terms and their quantities: we can define what we need in the logic.

For computing a relative upper bound on the insertion sort, we want to prove \textsf{UnsortedDiff}(l_1, l_2, n) \Rightarrow \textsf{cost}(\textit{isort} \, l_1) - \textsf{cost}(\textit{isort} \, l_2) \leq n. Using the system , we can prove this property, using a suitable invariant/predicate over the input and output list(s).

Take home: type-based relational verification is a large research space that needs more work.

Q: what about recursion? In the first system, letrec is included straightforwardly, in the second system the assumption is terminating (general) recursion.
Q: are the costs tied to a specific evaluation strategy? Yes, but one can encode different ones in the system. (Both systems are parametric in the way you count cost). In the first system, this is part of the typing, in the second, this comes where you put monadic cost operations \textsf{cstep}.

(FSCD) Arrays and References in Resource Aware ML, Benjamin Lichtman, Jan Hoffmann

Resource-aware ML (RAML) – model resource usages of programs with cost semantics, e.g., implement Quicksort and it produce the bound 14n^2 + 19^n + c (I forgot what the constant c was). This work introduces references and arrays to enable analysis of programs where resource consumption depends on the data stored in mutable structures (references and arrays).

How does it work? Automatic amortized resource analysis. Assign “potential” functions to a data structures, which will pay the resource consumption cost, then hopefully there is some left over for later. This propagates backwards to give the overall upper bound cost.

Each variable in scope may contribute (carry) potential, e.g., recursive data types (such as x : \textsf{list}^2(\textsf{int}) that is it can contribute 2n units of potential where n is the length of the list.

Based on an affine type system (all variable used at most once). Subexpressions are bound to variables whenever possible (share-let normal form) in order to explicitly track when things are evaluated (via variables). When a variable carrying potential needs to be shared, the potential can be explicitly split share x as (x1, x2) in e in order to use a variable more than once (gives you contraction). This splits the potential of x across the two variables. Benjamin walked us through a nice example with appending to a list.

Now let’s add references.

g l = 
  let t = ref l in 
  share r as (r1, r2) in  -- r2 : L^1 (int)
  let _ = h r1 in   -- h : (L^q int) ref -> int  
                    -- (need to make the type of this stronger)
  append (!r2, []) -- append : L^1(int) * L^0(int) -> L^0(int)

One approach is to carry around extra information about possible effects, but they want to keep effects out of the type system because they seem them as cumbersome and difficult to use. Instead, strengthen contracts by require that potential annotations of data inside references are fixed.

g l = 
  let t = ref l in 
  share r as (r1, r2) in  -- the restricted h means r1 and r2 have 
                          -- to have the same type
  let _ = h r1 in   -- h : (L^1 int) ref -> int  
  append (!r2, [])

Introduce a swap command as the only way to update a cell, that behaves like:

swap (r, l) = let x = !r in (let _ = r := l in x)

Requires that data placed into a cell has the same type as data being taken out, ensures that potential annotations of mutable cells never change (the only way to update a cell). Thus, swap is the well-typed version of dereference in situations where you need to work with potential. Thus, require the previous program to use swap on r2 with an empty list instead of a direct dereference in the last line.

The paper has a detailed example of depth-first search (stateful) which shows more of the details related to how to work with state and potentials. This also utilises an idea of “wrapper” types.

Introduce a notion of “memory typing”, which tracks all memory locations pointed to by mutual references, this is require to prove the soundness of the cost analysis (based on the free potential of the system, the potential of the context, and the potential of each mutable memory reference).

Mutable arrays were then a natural extension from this basis, with an array-swap operation for updating an array cell.

In the future, they’d like to create some syntactic sugar so that the programming style is not affected as much by having to instead swap, aswap, and wrapper types. Would like to investigate how fixed potential annotations restrict the inhabitants of a type (Yes, yes, this is very interesting!)

I asked about what happens if you write down Landin’s Knot to get recursion via higher-order state: Benjamin answered that functions don’t get given potential so storing and updating functions in a reference would type correctly. I still wonder if this would let me get the correct potential bounds for a recursive algorithm (like DFS) if the recursion was implemented via higher-order state.

(FSCD) The Complexity of Principal Inhabitation, Andrej Dudenhefner, Jakob Rehof

In the simply-typed lambda calculus (STLC), with principle types. We say \tau is the principle type of M if \vdash M : \tau and for all types \sigma such that M : \sigma there exists a substitution S such that S(\tau) = \sigma.

Definition: (Normal Principal Inhabitant), We say that a term M in beta-normal form is a normal principal inhabitant of a type t, if t is the principal type of M.

This work shows that principal inhabitation for STLC is PSPACE-complete (in 1979 it was shown that inhabitation (not considering) principality is PSPACE-complete).  Thus, this work seeks to answer the following: since principality is a global property of derivations, does it increase the complexity of inhabitation? This is also practically interesting for type-based program synthesis, since a normal principal inhabitant of a type is a natural implementation of the type.

For example, a \rightarrow a \rightarrow a is inhabited by the K combinator K = \lambda x . \lambda y .  x but is not principally inhabited (I haven’t understood why yet).

The proof is based on the subformula calculus. The subformula calculus provides paths to index types (as trees) which are strings of either 1 or 2.  This is useful for reasonable about subformulae, where the usual STLC rules are re-expressed in terms of type paths, e.g., \dfrac{\Gamma, x : 1\pi \vdash e : 2\pi}{\Gamma \vdash \lambda x . e : \pi}. Define relations on paths, R_M as the minimal equivalence relation for a beta-normal term M (capture constraints on subformulae of terms) and R_\tau  which defines an equivalence relation on paths of paths with common subpaths (capture constraints on subformulae on types).

(I had to step out, so missed the end).

(FSCD) Types as Resources for Classical Natural Deduction, Delia Kesner, Pierre Vial

Quantitative types seen as resources (can’t be duplicated); they provide simple arithmetical arguments to prove operational equivalences. These ideas are extended to classical logic in this talk.

In simple types, typability implies normalisation. With intersection types this is an equivalence, i.e., normalising also implies typability. In an intersection type system, a variable can be assigned several types, e.g. x : A \wedge B \wedge C \wedge B (note the two occurrences of B) where intersection is associative, commutative and (possibly) idempotent. In 1994, Gardner introduces a system which is not idempotent, where types are now multisets (rather than sets in the earlier formulations). This has the flavour of linear logic resources (my interpretation: the types can then capture how many times a variable is used (quantitative) if we contraction takes the intersection of the two types and intersection is non-idempotent, e.g., x_1 : A \wedge B, x_2 : B \wedge C \leadsto x : A \wedge B \wedge B \wedge C).

In the literature, we see how to build a computational interpretation of classical natural deduction: Intuitionistic Logic + Peirce’s law gives classical logic and Felleisen’s call-cc operator gives the computational interpretation. Relatedly, the lambda_mu calculus (Parigot 92) give a direct interpretation of classical natural deduction.

Types are strict where intersection can only appear on the left-hand side of a function arrow.

Interestingly, you can have a situation where the type of a lambda abstraction has an empty intersection of types in its source type. In order to get normalisation in this context, the system was extended a bit to deal with this (I didn’t capture this part, see paper). The full system had rules to ensure this didn’t happen.

A notion of “weighted” subject reduction is defined, where the size of a derivation tree is strictly decreasing during reduction.

by dorchard at September 07, 2017 10:58 AM

ICFP / FSCD day 2 – rough notes

(Blog posts for Day 1, Day 2, Day 3, Day 4 (half day))

I decided to take electronic notes at ICFP and FSCD (colocated) this year, and following the example of various people who put their conference notes online (which I’ve found useful), I thought I would attempt the same. However, there is a big caveat: my notes are going to be partial and may be incorrect; my apologies to the speakers for any mistakes.

(ICFP Keynote #2) – Challenges in Assuring AI, John Launchbury

The moment we think we’ve got our (verification) hands round the world’s systems (e.g., compilers), a whole new set of systems appear: AI.

The dual of “have I tested enough?” is “how good is my model?” (in a verification-based approach). What is the model we have for AI systems?

What is intelligence? Information processing. But there are different kinds of information processing ability; asking if a system is ‘intelligent’ is too coarse/binary. John breaks this down into ‘perceiving’ P, ‘learning’ L, ‘abstracting’ A (create new meanings), and ‘reasoning’ R (plan and decide).

AI wave 1 : rule-based, hand-crafted knowledge: P 1, L 0, A 0, R 3 (poor handling of uncertainty, no learning). Type systems are a little like this (perceiving, didn’t have to tell it everything). Can do some pretty amazing stuff: see smart security systems that analysis code and patch bugs.

AI wave 2: statistical learning. Specific problem domains, train them on big data. Still requires a lot of engineering work (to get the right models and statistical techniques for the task). P 3, L 3, A 1, R 1 (less good at reasoning).

What does it mean to prove a vision system correct?

Manifold hypothesisdata coming in is high-dimensional but the features you are interested in form lower-dimensional structures. E.g., with cars, lots of data comes in but understanding the data means separating the manifolds (of low dimensionality).
Another example, a 28×28 image is a 784 dimensional space (as in 784 numbers, 784-length vector). Variation in handwritten digits form 10 distinct manifolds within the 784 dimensional space. High-degree of coherence between all data samples for a particular digit. (“Manifold” here may be slightly more informally used than the full mathematical definition).

Imagine two interlocking spiral arms which are really 1-dimensional manifolds in a 2-dimensional space. By continuous transforms on the 2-dimensional space these can morphed into linearly separable components. Sometimes stretching into a new dimension enables enclosed manifolds to be separated.

Neural nets separate manifolds; each layer of a neural network stretches and squashes the data space until the manifolds are cleanly separated. Non-linear functions at each step (otherwise multiple linear transformations would collapse into just one layer). Use calculus to adjust the weights (error propagation backwards through the layers).

Structured neural nets have different kinds of layers, e.g., feature maps (which perform a local analysis over the whole input); generated potentially via convolutions (i.e., turn a 28×28 image into 20 24×24 images), do some subsampling. Programming these requires some trial-and-error.

Models to explain decisions: “why did you classify this image as a cat?”; not just “I ran it through  my calculations and ‘cat’ came up as highest”. A deep neural net could produce a set of words (fairly abstract) and then a language-generating recurrent neural net (RNN) translates these into captions (sentences). Currently, statistically impressive but individual unreliable.

Adversarial perturbations: Panda + <1% computed distortion, becomes classified as a Gibbon. Whilst the two images are indistinguishable to a human, the noise interferes with the manifold separation. There are even universal adversarial perturbations that throw-off classification of all images.

Assured control code (e.g., in a helicopter): a physical machine is modelled (system dynamics, power, inertia), a control system design is created, from which controller code is generated. But this does not produce a very nice to fly system. Need some kind of refinement back to the control code. Given the correct structure for the control code framework then the fine tuning from feedback on the actual system is very effective. This can be overlayed with different goals (different fitness functions), e.g., safety over learning.

AI wave 3: combining wave 1 and 2 techniques. Perceive into a contextual model on which abstractions are computed and which can be tuned (learning) to produce reasoning. Aim to get a better intelligence profile: P 3, L 3, A 2, R 3. Hand assurance arguments on the contextual model part instead.
e.g., build models that are specialised based on our knowledge, e.g., we know the probably number of strokes in each digit 0-9 and its trajectory. The generative model generates explanation of how a test character was generated (go from the model of whats its looking for to what it sees; how likely is the thing seen generatable from the model of a particular digit).

Need to be able to specify intent in a variety of ways; all real world specifications are partial and humans can’t always communicate the full intent of what they want (cf., trying to explain how to drive a car).

What are we trying to assure? Mistakes in sensing/reasoning leading to undesirable actions, undesirable emergent behaviours, hackable systems being subverted, misalignment between human/machine values.

(AlphaGo is a wave3-like system as it combined the neural net-based classification of moves with traditional planning/reasoning/tree-based pruning/search).

(FSCD Keynote #2) – Uniform Resource Analysis by Rewriting: Strengths and Weaknesses, George Moser

Can we use complexity measures on term rewriting systems to make complexity arguments about higher-order functions?

The main scheme is to convert a program via a complexity reflecting transformation into a Term Rewriting System (TRS), then do automated resource analysis on the TRSS to get an asymptotic bound. The first half of the talk is some background on doing complexity analysis (which was a bit fast for me so my notes are a bit incoherent below). There are some really nice complexity analysis tools for programs coming out of this work (links near the end).

Some preliminaries: Example, set up list reverse as a finite set of rewrite rules (a Term Rewriting System, TRS):

rev(xs)            -> rev'(xs, nil)
rev'(nil, acc)     -> acc
rev'(x :: xs, acc) -> rev'(xs, x :: acc)

A computation of this set of rules is the application of the rules from left to right. The rules are non-overlapping.

Can see rewriting as an abstraction of functional programming, or can see it as part of equational reasoning (universal algebra); but this really generalises functional programming (function symbols can appear nested on the left-hand side).

Definition: A TRS is terminating if the rewriting relation is “well-founded”.
Definition: A function symbol f is “defined” if it is the root symbol on the left of a rule (other f is a constructor).
Definition: Runtime complexity wrt. a terminating TRS is defined:

dh(t) = max \{ n \mid exists u . t \rightarrow^n u\}
rc(n) = max \{ dh(t) \mid \mathsf{size}(t) \leq n \wedge \textit{n and t are "basic"} \}.

RC is the runtime complexity; Q: is this a “natural notion” for rewriting? (I don’t know how one defined natural here).
Derivational complexity has no restriction on the terms,
dc(n) = max \{ dh(t) \mid \textsf{size}(t) \leq n\} This has been used mainly to make termination arguments.
See “Termination proofs and the length of derivations” (Hofbauer, Lautemann, 1989).

Definition: Multiset Path Order. A precedence order > induces a multiset path order >_{mpo}, (whose definition I’m not going to transcribe, Wikipedia has a definition) It basically says that s = f(s_1, ..., s_n) > t = g(t_1,...,t_m) if s is > all of t_i, or s is \geq some t_i.

The Hydra Battle- is this terminating? (Dershowitz and Jouannaud designed the TRS as a termination problem, later rectified by Dershowitz): The beast is a finite tree, each leaf corresponds to a head. Hercules chops off heads of the Hydra, but the Hydra regrows:

  • If the cut head has a pre-predecessor (grandmother) then the remaining subtree is multiplied by the stage of the game (heads regrowing and multiplying).
  • Otherwise, nothing happens (the head doesn’t regrow)

Can show it is terminating by an inductive argument over transfinite numbers. (but complexity is quite bad). See “The Hydra battle and Cicho’s principle” (Moser, 2009). 

The RTA list of open problem was mentioned:

TRSes can be represented via a matrix interpretation (details were too fast for me).
These techniques were developed into a fully automated complexity analysis tool for TRSes called tct-trs (modular complexity analysis framework).

Going back to the original goal: can we use this for higher-order programs? Consider the textbook example of reverse defined by a left-fold. Need to do a complexity preserving conversion into a TRS (complexity preserving transformation ensures lower bounds, complexity reflexing ensures upper bounds). First, defunctionalise via a rewrite system into a first-order system. Built a tool tct-hoca which works well for various small higher-order functional programs (with complexity proofs taking up to 1 minute to compute).  See the tools at:

What about “real” programs or “imperative” programs? Use ‘integer transition systems’ for modelling imperative programs, which is integrated into the tools now.

Some related work: Multivariate amortized source analysis (Hoffman, Aehlig, Hofmann, 2012) and a type-based approach “Towards automatic resource bound analysis for OCaml” (Hofmann, Das, Weng, POPL 2017) (note to self: read this!).

A previous challenge from Tobias Nipkow: how do you do this with data structures like splay trees (cf Tarjan)? There is some new work in progress. Build on having sized types so that we can account for tree size. An annotated signature then decorates function types with the size polynomial, e.g. \texttt{splay} : A \times T_n \xrightarrow{n^3 + 2} T_n (recursive function to create a splay tree). A type system for this was shown, with cost annotation judgment (no time to copy down any details).

The strength of the uniform resource analysis approach is its modularity and extensibility (different intermediate languages and complexity problems).
Weaknesses: the extensibility/modularity required some abstraction which weakens the proving power, so adding new kinds of analysis (e.g., constant amortised / logarithmic amortised) requires a lot of work.
Audience question: can I annotate my program with the expected complexity and get a counter-example if its false? (counter-examples, not yet).

(FSCD) Continuation Passing Style for Effect Handlers, Daniel Hillerström, Sam Lindley, Bob Atkey, KC Sivaramakrishnan

Consider two effects: nondeterminism and exceptions, in the drunk toss example (a drunk person tosses a coin and either catches it and gets a bool for head or tails, or drops the coin [failure/exception]).

drunkToss : Toss ! {Choose:Bool; Fail:Zero}
drunkToss = if do Choose
            then if do Choose then Heads else Tails
            else absurd do Fail

Induces a simple computation tree.

A modular handler, using row polymorphism (with variable r below)

allChoices : a ! {Choose : Bool; r} => List a ! {r}
allChoices = return x |-> [x]
             Choose k |-> k true ++ k false

fail : a ! {Fail: Zero; r} => List a ! {r}
fail = return x |-> [x]
       Fail k   |-> []

The ! is an annotation explaining the effect operations, k are continuations, r are row variables (universally quantified).  Two possible interpretations:

(1) handle the computation with fail first, then handle that with allChoices
returns the result [[Heads, Tails], []].

(2) handle with allChoices first then fail gives the result [].

So the order of the handling matters. The operational semantics for handlers is based on a substitution in the computation tree for the relevant operations.

How do handlers get compiled? Use CPS! (restricted subset of the lambda calculus, good for implementing control flow). Based on \lambda^{\rho}_{\textit{eff}} (Hillerstrom, Lindley 2016) – lambda calculus with effect handlers (separate syntactic category but contains lambda terms), row polymorphism, and monadic metalanguage style constructs for effectful let binding and ‘return’. The CPS translation is a homomorphism on all the lambda calculus syntax, but continuation passing on the effectful fragment (effectful let and return).  One part of the interpretation for the effectful operation term: [[\textbf{do} \, l \, V]] = \lambda k . \lambda h . \; h (l \; \langle[[V]], \lambda x . k \; x \; h\rangle) (see the paper for the full translation!)

A key part of the translation is to preserve the stack of handlers so that the order is preserved and (the interpretation) of operations can access up and down the stack.
A problem with the translation is that it yields administrative redexes (redundant beta-redexes) and is not tail-recursive. Use an uncurried CRPS where the stack of continuations is explicit (see Materzok and Biernacki 2012), this leads to a new translation which can manipulate the continuation stack directly (see paper). However, this still yields some administrative redexes. Solution: adopt a two-level calculus (Danvy, Nielsen 2033) which has two kinds of continuations: static (translation time) and dynamic (run time), where you can reify/reflect between the two. This let’s you squash out administrative redexes from the runtime to the static part, which can then be beta-reduced immediately at compile time (end up with no static redexes).
The paper proves that this translation preserves the operational semantics.
Implementation in the ‘links’ language:

I was going to ask about doing this typed: this is shown in the paper.

(ICFP) How to Prove Your Calculus Is Decidable: Practical Applications of Second-Order Algebraic Theories and Computation, Makoto Hamana

SOL (Second-Order Laboratory) – tool for analysing confluence and termination of second-order rewrite rules within Haskell. Provides an embedded Haskell DSL (via Template Haskell) for describing rewrite rules. SOL can then automatically check termination and confluence.

Confluence + termination => decidability. Can use this to prove a calculi is decidable.

Based on a higher-order version of Knuth-Bendix critical pair checking using extended HO pattern unification (Libal,Miller 2016).

For example, the theory of monads is decidable: given two well-typed terms s, t consistent of return, bind, and variables, the question  is s = t derivable from the three laws is decidable. How? First, orient the monad laws as rewrite laws. Check confluence (Church-Rosser, CR) and Strong Normalisation (SN)– together these imply their exists unique normal forms, which are then used to prove equivalence (reduce terms to their normal forms). So, how to prove CR and SN?

One way to prove Strong Normalisation is to assign weights and show that reductions reduce weights. Unfortunately, this does not work in the higher-order context of the monad operations. Another standard technique is to, use the “reducibility method” (see Tait,Girard and Lindley,Stark’05). SOL uses the ‘general schema criterion’ which works for general rewrite rules (Blanqui’00,’16) using various syntactic conditions (positivity, accessibility, safe use of recursive calls, metavariables). [sound but obviously not complete due to halting problem]. (Uses also Newman’s Lemma– a terminating TRS is confluent when it is locally confluent).

This tool can be applied to various calculi (see paper, 8 different systems). Sounds really useful when building new calculi. The tool was applied to a database of well know termination/confluence problems and could check 93/189 termination and 82/96 confluence problems.

“Let’s prove your calculus is decidable using SOL” (Makoto is happy to assist!)

(ICFP) A Relational Logic for Higher-Order Programs
Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, Pierre-Yves Strub

Relational properties are things like, if X and Y are related by R then F(X) and F(Y) are related by R, or over two properties, e.g., the results are related instead by S.

Relational refinement types: \Gamma \vdash t_1 \sim t_2 : \{n \mid n_1 = n_2\} These can be used to express monotonicity properties, are syntax directed, and exploit structural similarities of code. However, say we want to prove the naturality of take, but formulating the right relational refinement type is difficult because the computations are structurally different.

Start with a basic logic with lambda-terms over simple inductive types, and a separate layer of predicates

RHOL has judgements \Gamma \mid \Psi \vdash t_1 : \tau_1 \sim t_2 : \tau_2 \mid \phi(r_1, r_2) where \phi is a binary predicate on properties of the two terms and \Psi gives assertions about the free variables. A key idea here is that the types and assertions are separate.

An example rule two-sided rule (where we inductively construct terms with the same syntactic constructor on both sides of a relation):

\dfrac{\Gamma, x_1 : \sigma_1, x_2 : \sigma_2 \mid \Psi, \phi' \vdash t_1 : \tau_1 \sim t_2 : \tau_2 \mid \phi} {\Gamma \mid \Psi \vdash \lambda x_1 : t_1 : \sigma_1 \rightarrow \tau_1 \sim \lambda x_2 . t_2 : \sigma_2 \rightarrow \tau_2  \mid \forall x_1 . \phi' \Rightarrow \phi[r_1 x_1 / r_1, \, r_2 x_2 / r_2]}.
This seems quite natural to me: abstraction introduces a universal quantification over the constraints attached to the free variables.

Rules can also be one side (where you construct a term on just one side of a relation).
The paper shows that RHOL is as expressive as HOL (by a mutual encoding).

Other relational typing system can be embedded into RHOL: Relational Refinement Types, DCC (Dependency Core Calculus, and RelCost (relational cost).

Q: if HOL and RHOL are equivalent (rather than their just being an embedding of HOL into RHOL say) then why have RHOL? Relational reasoning was gained which was not built into HO. Update: I discussed this further later with one of the authors, and they made the point that people have done relational-style reasoning in HOL but it requires taking a pairing of programs and building all the relational machinery by hand which is very cumbersome. RHOL has all this built in and the one-side and two-side rules let you reason about differently-structured terms much more easily. I can see this now.

Foundations of Strong Call by Need

Thibaut Balabonski, Pablo Barenbaum, Eduardo Bonelli, Delia Kesner

What does this lambda term reduce to:  (\lambda x \, y. x \, x) \, (id \, id)? It depends on call-by-value, call-by-name, call-by-need, or full beta then you get different answers. The call-by-value, call-by-name, and call-by-need strategies are called weak because they do not reduce underneath lambda abstraction. As a consequence, they do not compute normal forms on their own.

Weak call-by-need strategy (the usual strategy outlined in the literature): procrastinate and remember. Beta reduction into a let binding (don’t evaluate the right-hand side of an application). (There was some quite technical reduction work here, but I noted this interesting rule on “bubbling” reductions out by splitting a nested reduction into two: t[v[u/y]/x] \rightarrow t[u/y][v/x]).

Strong call-by-need provide the call-by-need flavour of only evaluating needed values once, but crucially computes normal forms (by reducing under a lambda). This strong approach has two properties (conservative) the strong strategy first does whatever the weak strategy will do; and (complete) if there is a beta-normal form it will reach (a representative of) it. This comes by reducing under the lambda at some points. Consequently, this approach only ever duplicates values, not computations.

Q: If you implemented this in Haskell, would it make it faster? I didn’t quite hear the answer, but this would completely change the semantics as Haskell is impure with respect to exceptions and non-termination. It sounded like it would be more useful for something pure like Coq/Agda (where it could provide some performance improvement).

(ICFP) – No-Brainer CPS Conversion, Milo Davis, William Meehan, Olin Shivers

The Fischer/Reynolds algorithm for CPS transformation introduces administrative redexes. Since then, other algorithms have improved on this considerably (Danvy/Filinsky). In this approach, some additional reductions are applied to the term to get a smaller CPS term out. The moral of the story (or the key idea) is to treat the CPS converter like a compiler (use abstract representations, symbol tables).

Rules of the game: make the terms strictly smaller (greedy reductions), should be strongly normalisation, and confluent, keep the original algorithmic complexity of the CPS transforms (linear time).

– Variables are the same size, so do beta redexes on variables
– Beta reduce application of a lambda to a lambda (\lambda x . e) (\lambda y . e') only when the number of reference to x in e is less than or equal to 1 (so things don’t explode).
– Eta reduce (??only when there are no references to the variable in the body??) (I think this is what was said, but I think I might have made a mistake here).

Relies on some machinery: doing reference counts (variable use count), abstract representation of continuations (halt, variables, function continuations, and application continuations) rather than going straight to syntax and syntax constructor functions (which can do reductions) to dovetail with this, and explicit environments to be used with function closures.

(ICFP) Compiling to Categories, Conal Elliott

Why overload? Get a common vocabulary, laws for modular reasoning.

But generally doesn’t apply to lambda, variables, and application. So what to do? Eliminate them! Rewrite lambda terms into combinators: const, id, apply, pairing of functions (fork), curry, and keep going…. and you can automate this via a compiler plugin (so you don’t have to write code like this yourself).

This is implemented by the interface (algebra) of a category (for function composition / identities) + Cartesian (product) structure + coproducts + exponents: giving you (bi)Cartesian-closed categories. Implement this as class/interface so you can give different realisations/compilations, e.g., a graph drawing implementation (Conal showed some pretty pictures generated from this).

Another implementation is to generate Verilog (hardware descriptions), or to graphics shader code.

What about compiling to the derivative of a program? Represent this via the type:
newtype D a b = D (a -> b x (a -o b)) i.e., a differentiable function is a function which produces its result and its derivative as a linear map (an implementation was then shown, see paper). You can then compose interpretations, e.g., graph interpretation with derivative interpretation.

Another interpretation: interval analysis (e.g., Interval Double = (Double, Double)). There are more examples in the paper, including constraint solving via SMT by writing functions to Bool which are then compiled directly into SMT constraints (rather than using existing embeddings, e.g., SBV).

Q: what about recursion? Can be done (compiling from a letrec into a fix-point combinator class).

 (ICFP) Visitors Unchained, François Pottier

(Humorous tagline from Francois: Objects at ICFP?! Binders again?!)

Manipulating abstract syntax with binding can be a chore: nameplate (boilerplate on names, term coined by James Cheney). Large and easy to get wrong (especially in weakly typed languages, non-dependent languages). This paper is a way of getting rid of nameplate in OCaml via a library and an syntax extension. Supports multiple representations, complex binding, is modular, and relies on just a small amount of code generation.

Several Haskell libraries support this via its support for datatype generic programming. This will use OCaml’s object technology. Annotating a data type with [@@deriving visitors { variety = 'map' }] automatically generates a visitor pattern (a class) for doing a map-operations on the datatype. To use this, create a local object inheriting the visitor class and override one or more methods, e.g., for a syntax tree type, override the visit method for the “add” node in order to implement the rewriting of e+0 -> e.
Visitors are quite versatile: easy to customise behaviour via inheritance which is something you can’t do in FP without a lot of redesign to your algorithms.

Want to traverse syntax with binders, considering three things (from three perspectives or “users” in the talk here):
(1) “End user” describes the structure of an AST, parameterised by types of bound names and free names and using a parametric data type abs whenever a binding is needed (e.g., in a lambda). This should also deriving the map visitor(see above) with option [""].
(2) “Binding library” (provided as part of this work) defines the abs type which capture the binding construct, but they leave the scope extrusion function extend as a parameter (extend gets used when you traverse a binder to go inside a scope, e.g., going into the body of a lambda and extending the context by the bound variable).
(3) The last part defines how to represent environments (and lookups) via an overriding a visitor, nominal terms, and scope extrusion. (The work provides some implementations of these part in a library too).

As seen above, the structure of this library is fairly modular. The “end user” has to glue all these things together.
One limitation is that nonlinear patterns (e.g., as in Erlang) can’t be represented (I didn’t immediately see why though).

(ICFP) – Staged Generic Programming, Jeremy Yallop

Advert at the start: if you have an abstract library that is not very high-performance, then the techniques here could be useful to you.

Generic programming a la Scrap Your Boilerplate, for any data type get automatic traversals e.g., listify (matches c) for some constant c, gives you a list of all nodes in a data type matching c; as a programming you don’t need to write listify or matches: they are generated. Requires internally a way to represent types as data and to map data types into a generic form, a collection of traversals over the general form, and generic schemes which plug together the traversals with parameter generic queries. (In Haskell see the syb). However, the overhead of this roughly 20x writing the code manually! This overhead was quite portable: reimplementing Scrap Your Boilerplate into OCaml had about the same amount of slow down compared with manual traversals.

Why this slow? Type comparisons are slow, lots of indirection, and lots of unnecessary applications of applications. (Dominic: to me it seems like it ends up a bit like the slowness involved in dynamically typed languages).

Solution: use staging to optimise the datatype generic implementation. Staging lets you explain which parts of the program can be evaluated early, and which bits need to be evaluated later (due to waiting for inputs)– these are the quoted bits (inside  with the option to jump back into the early part by backquoting ~).

Naively staging SYB keeps the type representation unchanged, but the shallow traversals and recursion schemes can be quoted judiciously such that type comparison, indirect calls, and polymorphism are all eliminated. (Jeremy showed us the code output.. but it has lots of noise and opportunities for improvement).

Instead, use a special fixed-point operator that inlines non-recursive functions and perform (semantics preserving) code transformations as well: e.g., decomposing the generic map operation.

One technique is to allow equations on “partially-static data” e.g., if you have [] @ x @ [] (i.e., static empty list concat dynamic list x concat static empty list []) this still can be statically rewritten into x. Thus apply laws on the static parts as far as possible.
Another technique was to eta reduces matches that have identical branches.
(I think there were a few more thing here). Ends up with much tighter code which performs very close to or even better than the hand-written code.

Q: How does this compare to the Template-your-Boilerplate library for Haskell?
I completely missed the answer as I was looking up this library which I really should be using…!

by dorchard at September 07, 2017 10:48 AM

ICFP / FSCD day 3 – rough notes

(Blog posts for Day 1, Day 2, Day 3, Day 4 (half day))

I decided to take electronic notes at ICFP and FSCD (colocated) this year, and following the example of various people who put their conference notes online (which I’ve found useful), I thought I would attempt the same. However, there is a big caveat: my notes are going to be partial and may be incorrect; my apologies to the speakers for any mistakes.

(ICFP) – A Specification for Dependent Types in Haskell, Stephanie Weirich, Antoine Voizard, Pedro Henrique Avezedo de Amorim, Richard A. Eisenberg

Haskell already has dependent types! ish! (ICFP’14) But singletons are “gross”! (mock discussion between Stephanie and Richard). This work describes a semantics for dependent types in Haskell, along with a replacement for GHC’s internal (Core) language, along with the high-level type theory and fully mechanized metatheory.

But isn’t Haskell already dependently typed? Cf Haskell indexed type for vectors. The following cannot be written in GHC at the moment:

vreplicate :: Pi(n :: Nat) -> Int -> Vec n
vreplicate Zero _ = Nil
vreplicate (Succ n) a = Cons a (vreplicate n a)

The reason this can’t be typed is due to n being erased (after compilation, i.e., it doesn’t appear in the run time) but it is needed dynamically in the first argument. In Dependent Haskell this can be done.

But… it’s not your usual system. For example, what about infinite data?

inf :: Nat
inf = S inf
vinf :: Vec inf
vinf = Cons 2 vinf  -- yields equality demand Vec (S inf) = Vec inf

Idealized Haskell (IH) = System F_omega + GADTs (has uncediable type checking!)

System FC (GHC Core) is a reified version of Idealized Haskell with enough annotations so that type checking is decidable, trivial, and syntax directed. Terms in Core can be see as type derivations of IH terms: contains type annotations and type equalities. (These can all be erased to get back into IH.) The first contribution of this work is a dependent System FC (DC). But its very complicated (needs also type soundness (progress/preservation), erasable coercions, ).
Instead, they introduce System D, the dependently typed analog of Idealized Haskell: its a stripped down dependent Haskell in Curry style (implicit types) and similarly to IH its type checking is undecidable. It has dependent types, non-termination, and coercion abstraction.

Coercion abstraction: for example

g : Int -> Int |- \v -> \c -> \v -> v : Pi n . (g n ~ 7) -> Vec (g n) -> Vec 7

Here the variable v (which has type Vec (g n) when passed in) is implicitly cast using the coercion c (which is passed in as a parameter) to Vec 7. This is translated into DC using an explicit coercion operator, applying c to coerce v.

System D and System DC were mechanized (in Coq) during the design using Ott (to input the rules of the language) & LNgen (for generating lemmas). Currently working on dependent pattern matching, a full implementation in GHC, and roles. (Hence, this is the specification of the type system so far).

Q: type inference for Dependent Haskell?
A: Not sure yet (but see Richard Eisenberg’s thesis)
Q: How do you get decidable type checking in the presence of infinite terms?
A: I didn’t understand the answer (will followup later).
Q: Why is coercion abstraction explicit in D while coercion application is implicit?
A: “Coercion application *is* explicit in D, it’s the use of coercion to cast a type that is implicit, but if you were to apply it to a value it would be explicit.”

(ICFP) – Parametric Quantifiers for Dependent Type Theory Andreas Nuyts, Andrea Vezzosi, Dominique Devriese

A type variable is parameteric if is only used for type-checking (free well-behavedness theorems): i.e., can’t be inspected by pattern matching so you have the same algorithm on all types, e.g. flatten : forall X . Tree X -> List X. Parametericity is well studied in the System F world. This work looks at parametericity in dependent type theory.; some results carry over, some can be proved internally, but some are lost. This work formulates a sound dependent type system ParamDTT which .

Parametricity gives representation independence (in System F):
A \rightarrow B \cong (\forall X . (X \rightarrow A) \rightarrow (X \rightarrow B)
(This result follows by parametricity, using the result that g : \forall X . (X \rightarrow A) \rightarrow (X \rightarrow B) implies g \, X_0 \, r_0 \, x\, = g \, A \, \textit{id} \, (r_0 \; x_0) which can be provided by relational parametricity, see Reynolds “related things map to related things”, applies identity extension lemma).

Can we do the same thing for DTT and can we prove the result internally?
\Pi is however not parametric. For example if we convert to \Pi (X : U) . (X \rightarrow A) \rightarrow (X \rightarrow B). Suppose $B = U$ then we can leak details of the implementation (rerepresentation type is returned as data) \textit{leak} \; X \; r \; x = X (has this type). This violates the identity extension lemma used in the above proof for System F.

So instead, add a parametric quantifier to DTT to regain representation independence, making the above \textit{leak} ill typed. The proof of parametricity for g : \forall X . (X \rightarrow A) \rightarrow (X \rightarrow B) can  now be proved internally. This uses the relational interval type 0 -- 1 : \mathbb{I} (cf Bernardy, Coquand, and Moulin 2015, on bridge/path-based proofs) which gives a basis for proving the core idea of “related things map to related things” where 0 is connected with the type X (in the above type) and 1 is connected to the type A via a switching term / r/ : \mathbb{I} \rightarrow U (see paper for the proof) (I think this is analogous to setting up a relation between X and A in the usual relational parametricity proof).

They extended Agda with support for this.

(ICFP) – Normalization by Evaluation for Sized Dependent Types
Andreas Abel, Andrea Vezzosi, Theo Winterhalter

(context, DTT a la Martin-Lof, and you can add subtpying to this, where definitional equality implies subtyping). Definitional equality is decideable (for the purposes of type checking), but the bigger the better (want to be able to know as many things equal as possible).

Termination is needed for consistency, general fix gives you inconsistency. Instead, you can used data types indexed by tree height, where \textsf{Nat}^i = \{ n \mid n < i\}, you can define \textsf{fix} : (\forall i \rightarrow (\textsf{Nat}^{i} \rightarrow C) \rightarrow (\textsf{Nat}^{i+1} \rightarrow C)) \rightarrow \forall j \rightarrow \textsf{Nat}^j \rightarrow C.

However, size expressions are not unique, which is problematic for proofs. e.g., suc i : \textsf{Nat}^i \rightarrow \textsf{Nat}^\infty but also suc \infty : \textsf{Nat}^i \rightarrow \textsf{Nat}^\infty. Intuition: sizes are irrelevant in terms, but relevant in types only.

Andreas set up some standard definitions of inductive Nat then tried to define a Euclidean division on Nat (where monus is minus cutting off at 0).

div : Nat → Nat → Nat
div zero y = zero
div (suc x) y = {! suc (div (monus x y) y) !}

However, Agda fails to termination check this. The solution is to “size type”-up everything, i.e., redefine the usual inductive Nat to have a size index:

data Nat : Size → Set where
 zero : ∀ i → Nat (↑ i)
 suc : ∀ i → Nat i → Nat (↑ i)

All the other definitions were given sized type parameters (propogated through) and then div was redefined (type checking to):

div : ∀ i → Nat i → Nat ∞ → Nat i
div .(↑ i) (zero i) y = zero i
div .(↑ i) (suc i x) y = suc _ (div i (monus i _ x y) y)

So far this is the usual technique. However, now we have difficulty proving a lemma (which was straightforward before sizing that minus of x with itself gives 0. Now this looks like:

monus-diag : ∀ i (x : Nat i) → Eq ∞ (monus i i x x) (zero ∞)

In the case: monus-diag .(↑ i) (suc i x) = monus-diag i x Agda gets the error i != ↑ i of type Size.

In the Church-style, dependent type functions = polymorphic functions, so you can’t have irrelevant arguments.  In Curry-style (with forall) this is okay. (see the previous talk, this could be a possible alternate solution?)

Instead, we want “Churry”-style where size arguments are used for type checking but can be ignored during equality checking: thus we want typing rules for irrelevant sizes. This was constructed via a special “irrelevant” modality \bullet, where sizes uses as indexes can be marked as irrelevant in the context. Here is one of the rules for type formation when abstracting over an irrelevantly typed argument:

\dfrac{\Gamma \vdash A : Type \quad \Gamma, \bullet \bullet x : A \vdash B : \textsf{type}}{\Gamma \vdash (x : \bullet A) \rightarrow B : \textsf{type}}

This seemed to work really nicely (I couldn’t get all the details down, but the typing rules were nice and clean and made a lot of sense as a way of adding irrelevancy).

(ICFP) – A Metaprogramming Framework for Formal Verification
Gabriel Ebner, Sebastian Ullrich, Jared Roesch, Jeremy Avigad, Leonardo De Moura

(Context: Lean is a dependently-typed theorem prover that aims to bridge the gap between interactive and automatic theorem proving. It has a very small trusted kernel (no pattern matching and termination checking), based on Calculus of Inductive Constructions + proof irrelevance + quotient types.)

They want to extend Lean with tactics, but its written in C++ (most tactic approaches are defined internally, cf Coq). Goal: to extend Lean using Lean by making Lean a reflective metaprogramming language (a bit like Idris) does, in order to build and run tactics within Lean. Do this by exposing Lean internals to Lean (Unification, type inference, type class resolution, etc.). Also needed an efficient evaluator to run meta programs.

lemma simple (p q : Prop) (hp : p) (hq : q) : q := 
by assumption

Assumption is a tactic, a meta program in the tactic monad:

meta def assumption : tactic unit := do
  ctx <- local_context,
  t   <- target
  h   <- find t ctx,
  exact h

(seems like a really nice DSL for building the tactics). Terms are reflected so there is a meta-level inductive definition of Lean terms, as well as quote and unquote primitives built in, but with a shallow (constant-time) reflection and reification mechanism.

The paper shows an example that defines a robust simplifier that is only about 40 locs (applying a congruence closure). They built a bunch of other things, including a bigger prover (~3000 loc) as well as command-line VM debugger as a meta program (~360 loc).

Next need a DSL for actually working with tactics (not just defining them). Initially quite ugly (with lots of quoting/unquoting). So to make things cleaner they let the tactics define their own parsers (which can then hook into the reflection mechanism). They then reused this to allow user-defined notations.

(FSCD) A Fibrational Framework for Substructural and Modal Logics, Dan Licata, Michael Shulman, Mitchell Riley

Background/context: model logic core \dfrac{\emptyset \vdash A}{\Gamma \vdash \Box A} for necessity (true without any hypotheses) \dfrac{\Gamma \vdash \Diamond A}{\Diamond \Gamma \vdash \Diamond B} (if something A is possible true then possibility can propagate to the context– doesn’t change the “possibleness”). Various intuitionistic substructural and modal logics/type systems: linear/affine, relevant, ordered, bunched (separation logic), coeffects, etc.  Cohesive HoTT: take dependent type theory and add modalities (int, sharp, flat). S-Cohesion (Finster, Licata, Morehouse, Riley: a comonad and monad that themselves are adjoint, ends up with two kind of products which the modality maps between). Motivation: what are the common patterns in substructural and modal logics? How do we construct these things more systematically from a a small basic calculus.

With S4 modality (in sequent calculus), its common to make two contexts, one of modality formulae and one of normal, e.g., \Gamma; \Delta \vdash A which is modelled by \Box \Gamma \times \Delta \rightarrow A. In the sequent calculus style,  use the notion of context to form the type. Linear logic ! is similar (but with no weakening). The rules for \otimes follow the same kind of pattern: e.g., \dfrac{\Gamma, A, B \vdash C}{\Gamma, A\otimes B \vdash C} where the context “decays” into the logical operator (rather than being literally equivalent): the types inherent the properties of the context (cf. also exchange).

The general pattern: operations on contexts, with explicit or admissible structure properties, then have a type constructor (an operator) that internalises the implementation and inherits the properties. This paper provides a framework for doing this, and abstracting the common aspects of many intuitionistic substructural and modal logicas based on a sequent calculus (has cut elimination for all its connectives, equational theory, and categorical semantics).

Modal operators are decomposed in the same way as adjoint linear logic (Benton&Wadler’94), see also Atkey’s lambda-calculus for resource logic (2004) and Reed (2009).

So.. how? Sequents are completely Cartesian but are annotated with a description of the context and its properties/substructurality by a first order term (a context descriptor) \Gamma \vdash_a A where a is this context descriptor. Context descriptors are built on “modes”. e.g., a: A, b : B, c : C, d : D \vdash_{(a \otimes b) \otimes (c \otimes d)} X the meaning of this sequent depends on the equations/inequations on the context descriptor:, e.g.,if you have associativity of \otimes you get ordered logic, if you have contraction a \Rightarrow a \otimes a then you get a relevant logic, etc. Bunched implication will have two kinds of nodes in the context descriptor. For a modality, unary function symbols, e.g. \textbf{r}(a) \otimes b means that the left part of the context is wrapped by a modality.

A subtlety of this is “weakening over weakening”, e.g., \dfrac{\Gamma \vdash_a B}{\Gamma, x : A \vdash_a B} where we have weakend in the sequent calculus but the context descriptor is unchanged (i.e., we can’t use x : A to prove B).

The F types give you a product structured according to a context descriptor, e.g. A \otimes B := F_{x \otimes y} (x : A, y : B). All of the left rules become instances of one rule on F: \dfrac{\Gamma, \Delta \vdash_{\beta[\alpha / x]} B}{\Gamma, x : F_\alpha(\Delta) \vdash B} which gives you all your left-rules (a related dual rule gives the right sequent rules).

Categorical semantics is based on a cartesian 2-multicategory (see paper!)
Q: is this like display logics? Yes in the sense of “punctuating” the shape of the context, but without the “display” (I don’t know what this means though).

This is really beautiful stuff. It looks like it could extend okay with graded modalities for coeffects, but I’ll have to go away and try it myself.

(FSCD) – Dinaturality between syntax and semantics, Paolo Pistone

Multivariate functions F X X and G Y Y where the first parameter is contravariant and second is covariant may have dinatural transformations between them. (Dinaturality is weaker than naturality). Yields a family of maps \theta_X \in Hom_C(F X X, G X X) which separate the contravariant and covariant actions in the coherence condition (see Wiki for the hexagonal diagram which is the generalisation of the usual naturality property into the dinatural setting).

A standard approach in categorical semantics is to interpret open formula by multivariant functor F X Y = X \rightarrow Y and proofs/terms by dinautral transformations.  Due to Girard, Scedrov, Scott 1992 there is the theorem that: if M is closed and \vdash M : \sigma then M is dinatural. The aim of the paper is to prove the converse result: if M is closed and \beta\eta-normal and (syntactically) dinatural then \vdash M : \sigma (i.e., implies typability).

Syntactic dinaturality examples where shown (lambda terms which exhibit the dinaturality property). Imagine a transformation on a Church integer (X \rightarrow X) \xrightarrow{h} (X \rightarrow X)– this is a dinatural transformation and its dinaturality hexagon is representable by lambda terms. (I didn’t get a chance to grok all the details here and serialise them).

Typability of a lambda term was characterised by its tree structure and properties of its shape and size (e.g., \lambda x_1 . \ldots \lambda x_n . M : \sigma_1 \rightarrow \ldots \sigma_h \rightarrow \tau then $n \leq h$, there are various others; see paper).

The theorem then extends from typability to parametricity (implies that the syntactically dinatural terms are parametric) when the type is “interpretable” (see paper for definition).

(sorry my notes don’t go further as I got a bit lost, entirely my fault, I was flagging after lunch).

(FSCD) – Models of Type Theory Based on Moore Paths, Andrew Pitts, Ian Orton

In MLTT (Martin-Lof Type Theory), univalence is an extensional property (e.g., (\forall x . f x = g x)  \rightarrow (f = g)) of types in a universe U: given X, Y : U every identification p : Id_U X Y induces an (Id)-isomorphism X \cong Y. U is univalent “if all Id-isomorphisms X \cong Y is U are induced by some identification P : Id_U X Y” (want this expressed in type theory) (this is a simpler restating of Voevodsky’s original definition due to Licata, Shulman, et al.). This paper: towards answering, are there models for usual type theory which contain such univalent universes?

Univalence is inconsistent with extensional type theory where p : Id_A X Y implies x = y \wedge p = \textsf{refl}. Univalence is telling you that the notion of identification is richer subsuming ETT where there is an identification between two types then they are already equal, which is much smaller/thinner.

Need a source of models of “intensional” identification types (i.e., not the extensional ones ^^^): homotopy type theory provides a source of such models via paths: p : Id_A x y are paths from point x to point y  in a space A where \textsf{refl}_A is a “constant path” (transporting along such a path does nothing to the element).

So we need to find some structure Path_A x y for these identifications, but we need to restrict to families of type (B \; x \mid x : A) carrying structure to guarantee substitution operations (in this DTT context).  “Holes” is diagrams are filled using path reversals and path composition. Solution is “Moore” paths.

A Moore path from x to y in a topological space X is specified by a shape (its length) |p| in \mathbb{R}_+ and an extent function p@ : \mathbb{R}_+ \rightarrow X satisfying p@ 0 = x and (\forall i \geq |p|) . p@i = y. These paths have a reversal operator by truncated subtraction \textit{rev}(p@i) = p@(|p| - i). (note this is idempotent on constant paths as required, allowing holes in square paths to be filled (I had to omit the details, too many diagrams)). But topological spaces don’t give us a model of MLTT, so they replace it by an arbitrary topos (modelling extensional type theory) and instead of \mathbb{R} for path length/shape use any totally ordered commutative ring. All the definition/properties of Moore paths are the expressed in the internal language of the topos.  This successfully gives you a category with families (see Dybjer) modelling intensional Martin-Lof type theory (with Pi). This also satisfies dependent functional extensionality thus its probably a good model as univalence should imply extensionality. But haven’t yet got a proof that this does contain a univalence universe (todo).

Q: can it just be done with semirings? Yes.

(ICFP) – Theorems for Free for Free: Parametricity, With and Without Types, Amal Ahmed, Dustin Jamner, Jeremy G. Siek, Philip Wadler

Want gradual typing AND parametricity. There has been various work integrating polymorphic types and gradual typing (with blame) but so far there have been no (published) proofs of parametricity. This paper proposes a variant of the polymorphic blame calculus \lambda B [Ahmed, Findler, Siek, Wadler POPL’11] (fixed a few problems) and proves parametricity for it.

Even dynamically typed code cast to universal type should behave parametrically., e.g. consider the increment function which is typed dynamically and cast to a parametric type \lambda x . x + 1 : \ast \Rightarrow^p \forall X . X \rightarrow X) [int] 4 returns 5 but we want to raise a “blame” for the coercion p here (coercing a monomorphic to polymorphic type is bad!).

The language has coercions and blame, and values can be tagged with a coercion (same for function values). Some rules:
(v : G \Rightarrow^p \ast \Rightarrow^q G) \mapsto v  (that is, casts to the dynamic type and back again cancel out).
But, if you cast to a different type (tag) then blame pops out: (v : G \Rightarrow^p \ast \Rightarrow^q G') \mapsto \textsf{blame} p (or maybe that was blame q?)

add = \Lambda X . \lambda (x : X) . 3 + (x : X \Rightarrow^p \ast \Rightarrow^q \textsf{int}) this has a type that makes it look like a constant function \lambda X . X \rightarrow \textsf{int}. When executing this we want to raise blame rather than actually produce a value (e.g. for add 3 [\textsf{int}] 2). Solution: use runtime type generation rather than substitution for type variables.

A type-name store \Sigma provides a dictionary mapping type variables to types.
\Sigma \rhd (\Lambda X . v) [B] \mapsto \Sigma, \alpha := B \rhd v[\alpha/X] i.e., instead of substituting the B in the application, create a fresh type name \alpha that is then mapped to B in the store. Then we extend this with types and coercion: \Sigma \rhd (\Lambda X . v) [B] \mapsto \Sigma, \alpha := B \rhd v[\alpha/X] : A[\alpha/X] \Rightarrow^{+\alpha} A[B/X] where +\alpha (I think!) means replace occurrences of B with \alpha.

Casts out a polymorphic type are instantiated with the dynamic type. Casts into polymorphic types are delayed like function casts.

Parametricity is proved via step-indexed Kripke logical relations. The set of worlds W are tuples of the type-name stores, and relations about when concealed values should be related.  See paper: also shows soundness of the logical relations wrt. contextual equivalence. There was some cool stuff with “concealing” that I missed really.

(ICFP) – On Polymorphic Gradual Typing, Yuu Igarashi, Taro Sekiyama, Atsushi Igarashi

Want to smoothly integrate polymorphism/gradual. e.g., apply dynamically-typed arguments into polmorphically typed parameters, and vice versa. Created two systems: System Fg (polymorphic gradually typed lambda cal) which translates to System Fc.

Want conservativity: A non-dynamically typed term in System F should be convertible into System Fg, and vice versa.  The following rule shows the gradual typing “consistency” rule for application (in System Fg and gradual lambda calculus):
\dfrac{\Gamma \vdash s : T_{11} \rightarrow T_{12} \quad \Gamma \vdash t : T_1 \quad T_1 \sim T_{11}}{\Gamma \vdash s \; t : T_2} where \sim defines type consistency. Idea: extend this to include notions of consistency between polymorphic and dynamic types. An initial attempt breaks conservativity since there are ill-typed System F terms which are well-typed in System Fg then.

The gradual guarantee conjecture: coercions without changes to the operational semantics.

They introduce “type precision” (which I think replaces consistency) which is not symmetric where A \sqsubseteq \astI didn’t yet get the importance of this asymmetry). But there are some issues again… a solution is to do fresh name generation to prevent non-parametric behaviour (Ahmed et al. ’11, ’17 the last talk) But making types less precise still changes behaviour (create blame exceptions). Their approach to fix this is to restrict the term precision relation \sqsubseteq so that it can not be used inside of \Lambda (term level) and \forall (type level).
Still need to prove the gradual guarantee for this new approach (and adapt the parametricity results from Ahmed et al. ’17).

(ICFP) – Constrained Type Families, J. Garrett Morris, Richard A. Eisenberg

(Recall type families = type-level functions)
Contribution 1: Discovery: GHC assumes that all type families are total
Contribution 2: First type-safety proof with non-termination and non-linear patterns
Contribution 2.5 (on the way): simplified metatheory
The results are applicable to any language with partiality and type-level functions (e.g. Scala)
(Haskellers leave no feature unabused!)

-- Example type family:

type family Pred n
type instance Pred (S n) = n
type instance Pred Z = Z

Can define a closed version:

type family Pred n where
            Pred (S n) = n
            Pred Z = Z

Definition: A ground type has no type families.
Definition: A total type family, when applied to ground types, always produces a ground type.

In the new proposal: all partial type families must be associated type families.

type family F a                 -- with no instances
x = fst (5, undefined :: F Int) -- is typeable, but there are no instance of F!

Instead, put it into a class:

class CF a where
    type F a
x = fst (5, undefined :: F Int) -- now get error "No instance for (CF Int)"

Can use closed type families for type equality.

type family EqT a b where
       EqT a a = Char
       EqT a b = Bool
f :: a -> EqT a (Maybe a) -- This currently fails to type check but it should work
f _ = False

Another example:

type family Maybes a 
type instance Maybes a = Maybe (Maybes a)
justs = Just justs -- GHC says we can't have an infinite type a ~ Maybe a
-- But we clearly do with this type family!

Why not just ban ​Maybes? Sometimes we need recursive type families.

By putting (partial) type classes inside a class means we are giving a constraint that explains there should be a type at the bottom of (once we’ve evaluated) a type family. Therefore we escape the totality trap and avoid the use of bogus types.

Total type families need not be associated. Currently GHC has a termination checker, but it is quite weak, so often people turn it off with UndecideableInstances.
But, wrinkle: backwards compatibility (see paper) (lots of partial type families that are not associated).

Forwards compatibility: dependent types, terminating checking, is Girard’s paradox encodable?

This work also lets us simplify injective type family constraints (this is good, hard to understand at the moment). It makes closed type families more powerful (also great).

(ICFP) – Automating Sized-Type Inference for Complexity Analysis, Martin Avanzini, Ugo Dal Lago

(for higher-order functional programs automatically). Instrument a program with a step counter by turning them into state monad which gives a dynamic count.
Sized typed (e.g., sized vectors), let you type ‘reverse’ of a list as \forall i . L_i \, a \rightarrow L_i \, a. But inference is highly non-trivial.

Novel sized-type system with polymorphic recursion and arbitrary rank polymorphism on sizes with inference procedure for the size types. Constructor definitions define size measure, e.g., for lists of nats:  \textsf{nil} : L_1 and \textsf{cons} : \forall i j . N_i \rightarrow L_j \rightarrow L_{i + j + 1} (here the natural number has a size too).
Can generalise size types in the application rules, (abs, app, and var rules deal with abstraction and instantiation of size parameters).

Theorem (soundness of sizes): (informally) the sizes attached to the a well-typed program gives you a bound on the number of execution steps that actually happen dynamically (via the state encoding). [also implies termination].

Inference step 1: add template annotations, e.g., \textit{reverse} : \forall k . L_k \rightarrow L_{g(k)}. Then we need to find a concrete interpretation of g.
Step 2: constraint generation, which arise from subtyping which approximate sizes, plus the constructor sizes. These collect together to constraint the meta variables introduced in step 1.
Step 3: constraint solving: replace meta variable by Skolem terms. Then use an SMT solver (the constraints are non-linear integer constraints). (from a question) The solution may not be affine (but if it is then it is easier to find).

The system does well at inferring the types of lots of standard higher-order functions with lists. But cross product on lists produces a type which is too polymorphic: with a little more annotation you can give it a useful type.
Q: What happens with quicksort? Unfortunately it can’t be done in the current system. Insertion sort is fine though.
Q: What about its relation to RAML? The method is quite different (RAML uses an amortized analysis). The examples that each can handle are quite different.

(ICFP) – Inferring Scope through Syntactic Sugar, Justin Pombrio, Shriram Krishnamurthi, Mitchell Wand


Lots of languages use syntactic sugar, but then its hard to understand what the scoping rules are for these syntactic sweets.

E.g. list comprehension [e | p <- L] desugars to map (\p . e) L. The approach in this work is a procedure of scope inference which takes these rewrite patterns (where e and p and L are variables/parameters). We know the scoping rules for the core language (i.e., the binding on the right) [which parameterises the scope inference algorithm]. Desugaring shouldn’t change scope (i.e., if there is path between two variables then it should be preserved). So the scope structure is homomorphic to the desugared version, but smaller.  (A scope preorder was defined, I didn’t get to write down the details).

Theorem: if resugaring succeeds, then desugaring will preserve scope.

Evaluation: the techniques was tested on Pyret (educational language), Haskell (list comprehensions), and Scheme (all of the binding constructs in R5RS). It worked for everything apart from ‘do’ in R5RS.

by dorchard at September 07, 2017 10:46 AM

September 06, 2017

Tweag I/O

Tracking performance over the entire software lifecyle

Nicolas Mattia

In this post I'll show you how to see the performance of your software project improve over the entire lifetime of the project. That is, how to get something like the below picture. For that, you'll need to systematically track and store benchmark results over time. We'll use Elasticsearch for that.

Evolution of our client's benchmarks over a month

Here's a standard workflow:

  1. Your manager tells you clients have been complaining about the performance of frobnicating aadvarks. Your manager wants you to solve that.
  2. You quickly put together a few benchmarks that characterize the cost of frobnication.
  3. You then spend the next few days or weeks playing the game: look at benchmark numbers, make a change, rerun benchmarks, compare, commit if the numbers look better. Rinse and repeat.

What's missing is the notion of time. How do you know that as you work on the product for the next few months, you aren't introducing any performance regressions? Better still, how do you visualize what progress has been made over the last few months (or years!), when clients were already complaining about performance and you had already made some improvements?

Tracking performance over time is hard. For one, results need to be stored in a shared data store, that can be reused across many benchmark runs. Further, there needs to be an easy way to visualize these results as trend lines, and to perform further analyses on an ad hoc basis. I'll show you how to do that with Elasticsearch for safe and accessible storage, Kibana for visualization and a Criterion-like framework for writing benchmarks.

Benchmarking is hard

Measuring the performance of isolated components through micro-benchmarks is pretty much a solved problem — at least in Haskell and Ocaml — thanks to the amazing Criterion library and subsequent imitations. However, isolated benchmarks are rarely useful. You rarely care about exactly how long it takes for a function to run without some context. You want to be able to compare benchmark results: before this commit, over time, etc. This leads to other problems like ensuring that all the benchmarks are run on the same hardware and how to store those benchmarks. Some solutions have emerged for particular use cases (see rust's solution and GHC's solution, to name a few) but it's impractical to roll out a new solution for every single project.

The tools mentioned above allow the benchmarks to be compared and visualized. But there are off-the-shelf tools that enable just that: Elasticsearch, which allows you to store, search and analyze your data, and Kibana, an Elasticsearch plugin, that makes it possible to visualize just about anything.

We'll use Hyperion to perform the benchmarks and generate the data. We can imagine that, in the future, it may be possible to do this using Criterion instead.

Setting up Elasticsearch

We'll set up local instances of Elasticsearch and Kibana. The easiest way is to spin up ready-made containers, for instance using the docker-elk project. It uses docker-compose to start some containers: Elasticsearch on port 9200, and Kibana on port 5601 (it also starts a Logstash container but we won't be using it). Here we go:

$ git clone
$ cd docker-elk
$ docker-compose up -d
$ curl localhost:9200
  "name" : "7pOsdaA",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "e_CQlDgCQ1qIHob-n9h3fA",
  "version" : {
    "number" : "5.5.2",
    "build_hash" : "b2f0c09",
    "build_date" : "2017-08-14T12:33:14.154Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.0"
  "tagline" : "You Know, for Search"

We'll need to configure Elasticsearch. Time to get your JSON goggles, because we'll use JSON as the lingua franca, and there's going to be plenty of it from now on.

We'll create an Elasticsearch index by crafting a JSON index specification and uploading it to the Elasticsearch instance. An index is similar to a database. The index requires a "type" (here all-benchmarks), which is similar to a table (we won't go into the details but go have a look at the documentation if you're interested):

  "settings" : {
      "number_of_shards" : 1
  "mappings" : {
    "all-benchmarks" : {
      "dynamic" : true,
      "properties" : {
        "bench_name": {
          "type": "keyword"
        "time_in_nanos": {
          "type": "float"
        "timestamp": {
          "type": "date",
          "format": "epoch_second"

When using Elasticsearch it is much easier to work with flat JSON documents without nested fields. Here's a JSON document that fits with the index above, which we find works well for benchmark data:

    "bench_name": "fibonacci",
    "timestamp": "1504081132",
    "time_in_nanos": 240

Let's break down the index fields. The key properties specifies which fields we want to be present when a new ES document is uploaded. In this case we'll have

  • bench_name: the benchmark name, as a keyword, which is a string field type that ES can index,
  • time_in_nanos: the measured execution time for the given benchmark,
  • timestamp: an ES date which will be used as the canonical timestamp (to clarify we specify the format, the number of seconds since the epoch).

dynamic: true: this tells ES that other fields might be present and that it should try to index them as well. Let's save the index definition to a file index.json and upload it as index hyperion-test:

$ curl -X PUT 'http://localhost:9200/hyperion-test' --data @index.json

Elasticsearch is set up!

Generating the data

To upload benchmark timings, you'll need them in JSON format. How you generate the data depends greatly on what kinds of systems you are going to benchmark and what programming language you are using. One possibility is to use the Criterion benchmark framework. But in this post, for simplicity, I'll use Hyperion, an experimental lab to explore future Criterion features, because the JSON output produced by that tool is easier on Kibana.

hyperion is not a replacement for Criterion, rather it is a place for trying out new ideas. We wanted a lab where we could develop and experiment with new features, which might in turn be contributed to Criterion itself. Hyperion includes features like recording the benchmark input in the report and the ability to export raw data as a flat JSON structure (we'll use this in a minute).

We'll use Hyperion's micro-benchmark example to generate the data:

benchmarks :: [Benchmark]
benchmarks =
    [ bench "id" (nf id ())
    , series [0,5..20] $ \n ->
        bgroup "pure-functions"
          [ bench "fact" (nf fact n)
          , bench "fib" (nf fib n)
    , series [1..4] $ \n ->
        series [1..n] $ \k ->
          bench "n choose k" $ nf (uncurry choose) (n, k)

main :: IO ()
main = defaultMain "hyperion-example-micro-benchmarks" benchmarks

If you've used Criterion before, the functions bench and benchGroup will look familiar. series is a Hyperion function that will run the benchmarks with different input while allowing the input to be recorded in the report. Let's see what this looks like:

$ stack exec hyperion-micro-benchmark-example -- --json -
  "metadata": {
    "location": null,
    "timestamp": "2017-08-30T08:36:14.282423Z"
  "results": [
      "alloc": null,
      "bench_name": "pure-functions:15/fib",
      "bench_params": [
      "cycles": null,
      "garbage_collections": null,
      "measurements": null,
      "time_in_nanos": 4871.05956043956

The --json - argument tells Hyperion to write a JSON report to stdout. Hyperion generates a metadata section with general information, and a results section containing the individual benchmark results. You'll see that some benchmarks results contain a field bench_params: this is the input data given by series.

This format however is not ideal for working with Elasticsearch for several reasons:

  • All the benchmark results are bundled within the same JSON object which means that they can't be indexed independently by Elasticsearch.
  • The metadata section is barely useful: the location field does not provide any useful information and the timestamp field was sourced from system date which means it depends on when the benchmark was run, not when the code was committed.

We'll annotate the metadata with information gathered from git and we'll make the format ES-friendly:

$ alias git_unix_timestamp="git show -s --format=%ct HEAD"
$ alias git_commit_hash="git rev-parse HEAD"
$ stack exec hyperion-micro-benchmark-example -- \
    --arg timestamp:"$(git_unix_timestamp)" \
    --arg commit:"$(git_commit_hash)" \
    --flat -

We created two helpers for getting a unix timestamp and the latest commit's hash from git and used that to annotate the metadata with Hyperion's --arg key:val. Then with --flat - we tell Hyperion to write a flat version of the JSON benchmark report to stdout, which is effectively a JSON array where each element contains the information about a particular benchmark as well as a copy of the metadata:

    "alloc": null,
    "bench_name": "pure-functions:15/fib",
    "commit": "91df3b9edc0945a1660e875e37f494e54b1419f5",
    "cycles": null,
    "garbage_collections": null,
    "location": null,
    "measurements": null,
    "time_in_nanos": 4998.988241758242,
    "timestamp": "1504081805",
    "x_1": 15

Very good, let's reuse that command and (with a little help from jq and xargs) feed each element to Elasticsearch (in practice you'll want to use Elasticsearch's bulk API but let's keep it simple for now):

$ stack exec hyperion-micro-benchmark-example -- \
    --arg timestamp:"$(git_unix_timestamp)" \
    --arg commit:"$(git_commit_hash)" \
    --flat - \
    | jq -cM '.[]' \
    | xargs -I '{}' -d '\n' \
    curl -X POST 'http://localhost:9200/hyperion-test/all-benchmarks' \
        -H 'Content-Type: application/json' \
        -d '{}'

Kibana's Way

As mentioned earlier, we also span up a Kibana container. However if try to access it now you'll most likely get an error, since we haven't configured Kibana yet. We need to tell it what ES indices it should gather the data from. Let's tell Kibana to use any index that's prefixed with hyperion. We'll first create an index pattern, then tell Kibana to use that index pattern as the default:

$ curl -X PUT 'http://localhost:9200/.kibana/index-pattern/hyperion*' \
    -H 'Content-Type: application/json' \
    -d '{"title": "all hyperion", "timeFieldName": "timestamp"}'
$ curl -X PUT 'http://localhost:9200/.kibana/config/5.5.2' \
    -H 'Content-Type: application/json' -d '{"defaultIndex": "all hyperion"}'

You should now be able to access Kibana by pointing your browser to http://localhost:5601. In the Discover landing page you should see a list of the benchmark data that we uploaded to Elasticsearch earlier (if you don't see anything try to expand the time range by clicking the time picker on the top right corner). We'll create our first visualization using that data by clicking the Visualize tab on the left menu, and then "Create a new visualization", "Line". Then click the index pattern that we created earlier ("hyperion*"). This should take you to a dashboard where you can describe the visualization. This is where the magic happens.

We'll create a benchmark visualization for the Fibonacci function as implemented in Hyperion's micro-benchmark example at the commit we uploaded earlier (in my case 91df3b9...). First let's set the Kibana's filter to only include the benchmark results we're interested in (for more information about the query language see the Lucene query syntax documentation):

commit:"91df3b9edc0945a1660e875e37f494e54b1419f5" AND bench_name:pure-functions*fib

Then let's make sure that our benchmark data is in scope by clicking the time picker at the top right corner and set the time range to "Last 6 months" (anything will work as long as the timestamp of the specific commit you tagged your data with falls into that range). We can fill the metrics and buckets settings in the panel on the left hand side (see picture below).

Panel settings for the Fibonacci plot

Let's break this down:

  • X-Axis Aggregation: By using Elasticsearch's term aggregation we're able to use one of the document's field as values for the X axis. Here we use x_1 which is the parameter generated by series and used as an argument to fib. We set the Order to "Ascending" and Order By to "Term". If you nested several series you could pick x_2, x_3, ... as the X axis, allowing you to easily compare the effect of various parameters. Or you could use the timestamp parameter to see the evolution of your system's performance over time. Let's stick to x_1 for now.

  • Y-Axis Aggregation: here we picked "Max" as the aggregation value. What this means is that when several values are available for a specific X value Kibana will only display the maximum value. If you've set the filter to use only the values of pure-functions*fib you should only get a single value anyway, but this can come in handy if you have more than one value and you want to perform custom regressions (see Hyperion's --raw argument).

You should get a plot similar to the following image:

Fibonacci plot result

And there it is!


This was a very simple example of what's possible to achieve with Elasticsearch and Kibana. We've setup a bridge between our benchmarking code and Elasticsearch, to which we offload all the heavy lifting of the data analysis. We got great plots for free, thanks to Kibana. But we've only scratched the surface of what's now possible. Here are a few things you could try out:

  • Use Kibana's split series to display several benchmarks alongside each other for better analysis,
  • Use Hyperion's --raw mode (or similar from Criterion) with Elasticsearch's various aggregators to get more insights into your benchmark data (like visualizing variance),
  • Display your per-commit benchmark data linearly by using Hyperion's extensible metadata mechanism with --arg commit_number:"$(git rev-list --count HEAD)",
  • Setup a direct link to your commits on GitHub from Kibana by using Hyperion's extensible metadata mechanism with --arg commit_link:"$(git rev-parse HEAD)",
  • Ensure your benchmarks are run (and uploaded to Elasticsearch) on your CI.


September 06, 2017 12:00 AM

Sandy Maguire

Modeling Music

<article> <header>

Modeling Music


<time>September 6, 2017</time>

One of my long-term goals since forever has been to get good at music. I can sightread music, and I can play music by ear – arguably I can play music well. But this isn’t to say that I am good at music; I’m lacking any theory which might take me from “following the path” of music to “navigating” music.

Recently I took another stab at learning this stuff. Every two years or so I make an honest-to-goodness attempt at learning music theory, but inevitably run into the same problems over and over again. The problem is that I have yet to find any music education resources that communicate on my wavelength.

Music education usually comes in the form of “here are a bunch of facts about music; memorize them and you will now know music.” As someone who got good at math because it was the only subject he could find that didn’t require a lot of memorization, this is a frustrating situation to be in for me. Math education, in other words, presents too many theorems and too few axioms.

My learning style prefers to know the governing fundamentals, and derive results when they’re needed. It goes without saying that this is not the way most music theory is taught.

Inspired by my recent forays into learning more mathematics, I’ve had an (obvious) insight into how to learn things, and that’s to model them in systems I already understand. I’m pretty good at functional programming, so it seemed like a pretty reasonable approach.

I’ve still got a long way to go, but this post describes my first attempt at modeling music, and, vindicating my intuitions, shows how we can derive value out of this model.

Music from First Principles

I wanted to model music, but it wasn’t immediately obviously how to actually go about doing that. I decided to write down the few facts about music theory I did know: there are notes.

data Note = C | C' | D | D' | E | F | F' | G | G'| A | A' | B
  deriving (Eq, Ord, Show, Enum, Bounded, Read)

Because Haskell doesn’t let you use # willy-nilly, I decided to mark sharps with apostrophes.

I knew another fact, which is that the sharp keys can also be described as flat keys – they are enharmonic. I decided to describe these as pattern synonyms, which may or may not have been a good idea. Sometimes the name of the note matters, but sometimes it doesn’t, and I don’t have a great sense of when that is. I resolved to reconsider this decision if it caused issues down the road.

{-# LANGUAGE PatternSynonyms #-}

pattern Ab = G'
pattern Bb = A'
pattern Db = C'
pattern Eb = D'
pattern Gb = F'

The next thing I knew was that notes have some notion of distance between them. This distance is measured in semitones, which correspond to the pitch difference you can play on a piano. This distance is called an interval, and the literature has standard names for intervals of different sizes:

data Interval
  = Uni    -- 0 semitones
  | Min2   -- 1 semitone
  | Maj2   -- 2 semitones
  | Min3   -- etc.
  | Maj3
  | Perf4
  | Tri
  | Perf5
  | Min6
  | Maj6
  | Min7
  | Maj7
  deriving (Eq, Ord, Show, Enum, Bounded, Read)

It’s pretty obvious that intervals add in the usual way, since they’re really just names for different numbers of semitones. We can define addition over them, with the caveat that if we run out of interval names, we’ll loop back to the beginning. For example, this will mean we’ll call an octave a Unison, and a 13th a Perf4. Since this is “correct” if you shift down an octave every time you wrap around, we decide not to worry about it:

iAdd :: Interval -> Interval -> Interval
iAdd a b = toEnum $ (fromEnum a + fromEnum b) `mod` 12

We can similarly define subtraction.

This “wrapping around” structure while adding should remind us of our group theory classes; in fact intervals are exactly the group \(\mathbb{Z}/12\mathbb{Z}\) – a property shared by the hours on a clock where \(11 + 3 = 2\). That’s certainly interesting, no?

If intervals represent distances between notes, we should be able to subtract two notes to get an interval, and add an interval to a note to get another note.

iBetween :: Note -> Note -> Interval
iBetween a b = toEnum $ (fromEnum a - fromEnum b) `mod` 12

iAbove :: Note -> Interval -> Note
iAbove a b = toEnum $ (fromEnum a + fromEnum b) `mod` 12

Let’s give this all a try, shall we?

> iAdd Maj3 Min3

> iBetween E C

> iAbove D Maj3

Looks good so far! Encouraged by our success, we can move on to trying to model a scale.


This was my first stumbling block – what exactly is a scale? I can think of a few: C major, E major, Bb harmonic minor, A melodic minor, and plenty others! My first attempt was to model a scale as a list of notes.

Unfortunately, this doesn’t play nicely with our mantra of “axioms over theorems”. Represented as a list of notes, it’s hard to find the common structure between C major and D major.

Instead, we can model a scale as a list of intervals. Under this lens, all major scales will be represented identically, which is a promising sign. I didn’t know what those intervals happened to be, but I did know what C major looked like:

cMajor :: [Note]
cMajor = [C, D, E, F, G, A, B]

We can now write a simple helper function to extract the intervals from this:

intsFromNotes :: [Note] -> [Interval]
intsFromNotes notes = fmap (\x -> x `iBetween` head notes) notes

major :: [Interval]
major = intsFromNotes cMajor

To convince ourselves it works:

> major

Seems reasonable; the presence of all those major intervals is probably why they call it a major scale. But while memorizing the intervals in a scale is likely a fruitful exercise, it’s no good to me if I want to actually play a scale. We can write a function to add the intervals in a scale to a tonic in order to get the actual notes of a scale:

transpose :: Note -> [Interval] -> [Note]
transpose n = fmap (iAbove n)
> transpose A major

Looking good!


The music theory I’m actually trying to learn with all of this is jazz theory, and my jazz theory book talks a lot about modes. A mode of a scale, apparently, is playing the same notes, but starting on a different one. For example, G mixolydian is actually just the notes in C major, but starting on G (meaning it has an F♮, rather than F#).

By rote, we can scribe down the names of the modes:

data Mode
  = Ionian
  | Dorian
  | Phrygian
  | Lydian
  | Mixolydian
  | Aeolian
  | Locrian
  deriving (Eq, Ord, Show, Enum, Bounded, Read)

If you think about it, playing the same notes as a different scale but starting on a different note sounds a lot like rotating the order of the notes you’re playing. I got an algorithm for rotating a list off stack overflow:

rotate :: Int -> [a] -> [a]
rotate _ [] = []
rotate n xs = zipWith const (drop n (cycle xs)) xs

which we can then use in our dastardly efforts:

modeOf :: Mode -> [a] -> [a]
modeOf mode = rotate (fromEnum mode)
> modeOf Mixolydian $ transpose C major

That has a F♮, all right. Everything seems to be proceeding according to our plan!

Something that annoys me about modes is that “G mixolydian” has the notes of C, not of G. This means the algorithm I need to carry out in my head to jam with my buddies goes something as follows:

  • G mixolydian?
  • Ok, mixolydian is the fifth mode.
  • So what’s a major fifth below G?
  • It’s C!
  • What’s the C major scale?
  • OK, got it.
  • So I want to play the C major scale but starting on a different note.
  • What was I doing again?

That’s a huge amount of thinking to do on a key change. Instead, what I’d prefer is to think of “mixolydian” as a transformation on G, rather than having to backtrack to C. I bet there’s an easier mapping from modes to the notes they play. Let’s see if we can’t tease it out!

So to figure out what are the “mode rules”, I want to compare the intervals of C major (ionian) to C whatever, and report back any which are different. As a sanity check, we know from thinking about G mixolydian that the mixolydian rules should be Maj7 => Min7 in order to lower the F# to an F♮.

modeRules :: Mode -> [(Interval, Interval)]
modeRules m = filter (uncurry (/=))
            . zip (intsFromNotes $ transpose C major)
            . intsFromNotes
            . transpose C
            $ modeOf m major

What this does is construct the notes in C ionian, and then in C whatever, turns both sets into intervals, and then removes any groups which are the same. What we’re left with is pairs of intervals that have changed while moving modes.

> modeRules Mixolydian

> modeRules Dorian
[(Maj3,Min3), (Maj7,Min7)]

Very cool. Now I’ve got something actionable to memorize, and it’s saved me a bunch of mental effort to compute on my own. My new strategy for determining D dorian is “it’s D major but with a minor 3rd and 7th”.


My jazz book suggests that practicing every exercise along the circle of fifths would be formative. The circle of fifths is a sequence of notes you get by successively going up or down a perfect 5th starting from C. In jazz allegedly it is more valuable to go down, so we will build that:

circleOfFifths :: [Note]
circleOfFifths = iterate (`iMinus` Perf5) C

This is an infinite list, so we’d better be careful when we look at it:

> take 5 circleOfFifths

Side note, we get to every note via the circle of fifths because there are 12 distinct notes (one for each semitone on C). A major fifth, being 7 semitones, is semi-prime with 12, meaning, meaning it will never get into a smaller cycle. Math!

Ok, great! So now I know which notes to start my scales on. An unfortunate property of the jazz circle of fifths is that going down by fifths means you quickly get into the freaky scales they don’t teach 7 year olds. You get into the weeds where the scales start on black notes and don’t adhere to your puny human intuitions about fingerings.

A quick google search suggested that there is no comprehensive reference for “what’s the fingering for scale X”. However, that same search did provide me with a heuristic – “don’t use your thumb on a black note.”

That’s enough for me to go on! Let’s see if we can’t write a program to solve this problem for us. It wasn’t immediately obvious to me how to generate potential fingerings, but it seems like we’ll need to know which notes are black:

isBlack :: Note -> Bool
isBlack A' = True
isBlack C' = True
isBlack D' = True
isBlack F' = True
isBlack G' = True
isBlack _ = False

For the next step, I thought I’d play it safe and hard code the list of fingering patterns for the right hand that I already know.

fingerings :: [[Int]]
fingerings = [ [1, 2, 3, 1, 2, 3, 4, 5]  -- C major
             , [1, 2, 3, 4, 1, 2, 3, 4]  -- F major

That’s it. That’s all the fingerings I know. Don’t judge me. It’s obvious that none of my patterns as written will avoid putting a thumb on a black key in the case of, for example, Bb major, so we’ll make a concession and say that you can start anywhere in the finger pattern you want.

allFingerings :: [[Int]]
allFingerings = do
  amountToRotate <- [0..7]
  fingering      <- fingerings
  pure $ rotate amountToRotate fingering

With this list of mighty potential fingerings, we’re ready to find one that fits a given scale!

fingeringOf :: [Note] -> [Int]
fingeringOf notes = head $ do
  fingers <- allFingerings
  guard . not
        . or
        . fmap (\(n, f) -> isBlack n && f == 1)
        . zip notes
        $ fingers
  pure fingers

We can test it:

> fingeringOf $ transpose C major

> fingeringOf $ transpose F major

> fingeringOf $ transpose A' major

So it doesn’t work amazingly, but it does in fact find fingerings that avoid putting a thumb on a black key. We could tweak how successful this function is by putting more desirable fingerings earlier in allFingerings, but as a proof of concept this is good enough.

That’s about as far as I’ve taken this work so far, but it’s already taught me more about music theory than I’d learned in 10 years of lessons (in which, admittedly, I skipped the theory sections). More to come on this topic, probably.

As usual, you can find the associated code on Github.


September 06, 2017 12:00 AM

Alp Mestanogullari

Quick and minimal Haskell hacking with Nix

This post will not explain in detail how Nix can build haskell projects or set up development environments, but simply show one workflow that I use a lot. Please refer to this manual, Gabriel Gonzalez’s guide and various blog posts out there to find out more about Nix and Haskell.

Alright, so I often end up wanting to quickly experiment with some ideas in a ghci session or a standalone Haskell module. Sometimes I can get away by simply prototyping my idea with data types and functions from base, but sometimes (most of the time, really) I want (or need) to use other libraries. I’d like to have a magic command to which I would give a ghc version and a list of haskell packages, and I would end up with an environment with ghc/ghci and all the aforementionned packages pre-installed in that ghc’s package database, so that just running ghci would be enough for being able to use those packages. Similarly, standalone modules could be built just with ghc --make for example. No need for a cabal file, cabal-install or stack.

Well, this is possible with Nix. The nix-shell program allows us to enter some environment that possibly requires downloading, building and running several things. The precise provisioning process depends a lot on what the environment consists in. It turns out that the Haskell infrastructure in Nix provides some functions out of the box for describing environments that consist in a ghc install along with some Haskell packages from Hackage. Making use of them is as simple as:

$ nix-shell -p "haskell.packages.ghc821.ghcWithPackages (pkgs: [pkgs.aeson pkgs.dlist])"

This enters a shell with GHC 8.2.1 with aeson and dlist (and their transitive dependencies) installed in its package database. You might even fetch all those things from a binary cache with a little bit of luck!

The text between double quotes is a function call, in the Nix programming language. It calls a function, haskell.packages.ghc821.ghcWithPackages (. here is simply field/attribute access, like in many OO languages), provided by the standard Haskell infrastructure in nixpkgs, with an argument that’s itself a function: pkgs: [pkgs.aeson pkgs.dlist] is equivalent to (pseudo Haskell code) \pkgs -> [aeson pkgs, dlist pkgs] where aeson and dlist would be fields of a big record containing all Haskell packages. So this ghcWithPackages function lets us provision a ghc package database using a package set that it is providing us with, pkgs.

Note: a haskell package set in nix is basically any record that provides the packages you need, where packages must be declared under a very precise shape. The one we’re using here (haskell.packages.ghc821) is derived from the latest stackage LTS (9.x) but we could very well be calling thebestpackageset.ghcWithPackages instead, provided that we define thebestpackageset somewhere and that its definition is valid. For example, you could simply extend the latest LTS with a few packages of yours that nix should get from github, by simply using the record extension syntax of Nix. Or you could override just a few package versions from the LTS. Or you could even put together an entire package set yourself.

In that new shell, you can verify that you indeed get the GHC you asked for and that you can use the said packages:

[nix-shell:~]$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.2.1

[nix-shell:~]$ ghc-pkg list

[nix-shell:~]$ ghci
GHCi, version 8.2.1:  :? for help
Prelude> import Data.Aeson
Prelude Data.Aeson> import Data.DList
Prelude Data.Aeson Data.DList>

Finally, you can define a very handy shell function for making it easier to spin up a new shell without having to remember the precise syntax for the nix-shell call above. Add this to your ~/.bash_profile, ~/.bashrc or what have you:

function nix-haskell() {
	if [[ $# -lt 2 ]];
		echo "Must provide a ghc version (e.g ghc821) and at least one package"
		return 1;
		echo "Starting haskell shell, ghc = $ghcver, pkgs = $pkgs"
		nix-shell -p "haskell.packages.$ghcver.ghcWithPackages (pkgs: with pkgs; [$pkgs])"

The with pkgs; bit simply allows us to say [aeson dlist] instead of [pkgs.aeson pkgs.dlist], it just “opens” the pkgs “namespace”, like RecordWildCards in Haskell, when you write SomeRecord{..} to bring all the fields of a record in scope, with the field selector name as variable name for each field.

Now entering our previous shell is even simpler!

$ nix-haskell ghc821 aeson dlist

Of course, if you need private dependencies or fancier environments, you do need to resort to writing a shell.nix that carefully puts everything together (not that it’s that complicated, but definitely not as simple as this nix-haskell command). However, for a quick hacking session or exploration of an idea, I’m not aware of a nicer solution.

Posted on September 6, 2017

by Alp Mestanogullari at September 06, 2017 12:00 AM

September 01, 2017

Douglas M. Auclair (geophf)

August 2017 1HaskellADay problems and solutions

by geophf ( at September 01, 2017 12:31 PM

August 31, 2017

Tweag I/O

Enter the matrix, Haskell style

Manuel M T Chakravarty

This is the second post in a series about array programming in Haskell — you might be interested in the previous post, too.

Matrices are the bread and butter of most scientific and numeric computing. It is not surprising then that there is a range of standard libraries and interfaces, first and foremost BLAS (Basic Linear Algebra Subprograms) and LAPACK — Linear Algebra PACKage, which have been around since the FORTRAN days, and more recently also the GSL (GNU Scientific Library).

In Haskell, hmatrix provides a uniform interface to much of the functionality of these three libraries. The functionality is split over four packages, hmatrix (linear algebra), hmatrix-gsl (common numeric computations), hmatrix-gsl-stats(GSL statistics), and hmatrix-special (the ”special” functions of GSL). Due to the popularity of hmatrix, there exists a whole ecosystem of packages on Hackage that either build on hmatrix , provide bindings to other standard C libraries by extending the hmatrix interface, and implement adaptors to interoperate with other array libraries (such as hmatrix-repa).

Vectors and matrices

At the core of hmatrix are the Vector and Matrix data types,

data Vector e
data Matrix e

The Vector type is based on the Storable instances of the widely used vector package and Matrix adopts a matrix representation that enables the efficient use of BLAS routines. Values of both types can be easily created from lists and come with the expected basic vector and matrix functions. For example, we have

hmatrix> vector [1,2,3] * vector [3,0,-2]

Although, the result is displayed in list syntax, it is indeed a Vector.

hmatrix> :t vector [1,2,3] * vector [3,0,-2]
vector [1,2,3] * vector [3,0,-2] :: Vector R

The element type R is one of several type synonyms used in some of the hmatrix interface:

type R = Double
type C = Complex Double
type I = CInt
type Z = Int64

Similarly, for matrices, we have

hmatrix> matrix 3 [1..9] * ident 3
 [ 1.0, 0.0, 0.0
 , 0.0, 5.0, 0.0
 , 0.0, 0.0, 9.0 ]
hmatrix> :t matrix 3 [1..9] * ident 3
matrix 3 [1..9] * ident 3 :: Matrix R

where matrix 3 turns a list into a matrix with 3 columns and ident produces the identity matrix of a given size.

Generalised matrices

In addition to the dense Vector and Matrix types, hmatrix also provides a ”general” matrix type GMatrix that provides optimised representations for dense, sparse, diagonal, banded, and constant matrices of Doubles (aka R). For example, the following sparse 2x2000 matrix with two non-zero elements is represented as follows in the compressed sparse row (CSR) format:

hmatrix> mkSparse [((0,999),1.0),((1,1999),2.0)]
{ gmCSR = CSR 
          { csrVals = [1.0,2.0]
          , csrCols = [1000,2000]
          , csrRows = [1,2,3]
          , csrNRows = 2
          , csrNCols = 2000
, nRows = 2
, nCols = 2000

The package includes an implementation of the conjugate gradient method for sparse linear systems.


In addition to the basic operations on vectors and matrices, the core hmatrix package provides solvers for linear systems, computes inverses, determinants, singular value decomposition, eigenvalues & eigenvectors, QR, Cholesky & LU factorisation, and some other common matrix operations.

On top of that hmatrix-gsl covers integration, differentiation, FFT, solving general polynomial equations, minimization of a multidimensional functions, multidimensional root finding, ordinary differential equations, nonlinear least-squares fitting, and interpolation routines. Moreover, hmatrix-gsl-stats includes random distribution functions, linear regression functions, histograms, permutations, and common statistics functions (mean, variance, standard deviation, and so on).

Finally, hmatrix-special provides Airy, Bessel, Clausen, Coulomb wave, coupling coefficient, Dawson, Debye, dilogarithm, elliptic integral, Jacobian elliptic, Fermi-Dirac integral, Gegenbauer, hypergeometric, Laguerre, Lambert W, synchrotron, digamma, trigamma, and transport, Riemann zeta functions as well as Gamma distributions, Legendre polynomials, and common trigonometric and exponential functions.

An example: minima of arbitrary multidimensional functions

For example, to find the minimum of a function

f [x,y] = 10*(x-1)^2 + 20*(y-2)^2 + 30

without providing a gradient, we can define

minimizeS :: ([Double] -> Double) -> [Double] -> ([Double], Matrix Double)
minimizeS f xi 
  = minimize NMSimplex2 1E-2 100 (replicate (length xi) 1) f xi

using the minimize function of hmatrix-gsl’s Numeric.GSL.Minimization. It provides us with the minimum as well as the path taken by the algorithm to reach that solution. Using gnuplot by way of hmatrix’ undocumented Graphics.Plot interface, to visualise the path, we get

Minimise Plot

For the full example code, see minimize.hs.

Type safety

A common mistake in array programming is to apply an operation to one or more matrices whose dimensions violate a precondition of the operation. For example, for matrix multiplication to be well-defined for two matrices a and b, we require a to be ixj if b is jxk; i.e., the number of a’s columns needs to coincide with the number of b’s rows. Like most array libraries in Haskell, we cannot express such a constraint in hmatrix’ standard interface, where the type of a matrix is independent of the size of its dimensions.

The static interface of matrix provides some of hmatrix’ functionality with an alternative API, based on GHC’s type literals extension, which allows matrix size constraints to be expressed. We will discuss this interface in more detail in a future blog post.


If you are coming from Python, then hmatrix will be the closest to numpy and scipy that you will find in the Haskell ecosystem. The packages in both languages are realised by wrapping highly optimised standard routines written in low-level languages (such as BLAS and LAPACK) as libraries. However, the functionality they provide is not directly comparable as, despite a strong overlap, both provide functionality that is absent in the other. While numpy and scipy shine with maturity and very widespread use, Haskell offers increased safety and the potential to provide additional high-performance functionality and fusion of computational kernels in Haskell itself (i.e., without the need to drop down to C or C++) by combining hmatrix with some of the array libraries that we will discuss in the forthcoming posts in this series.

August 31, 2017 12:00 AM

August 28, 2017

FP Complete

Manage Secrets on AWS with credstash and terraform


During automatic infrastructure deployment on AWS, a common question is: what is the best way to deliver sensitive information over to EC2 instances or, more precisely applications running on them. There are numerous solutions, such as placing the information into user-data initialization script or simply SFTPing them onto the instance. Although these are perfectly viable solutions, there are well-known drawbacks with those approaches, such as size limitations on the former and the necessity to open the SSH port for the latter. There are also comprehensive solutions, such as HashiCorp Vault with Consul, that can do a lot more than just deliver credentials, but those can be an overkill for common and simple scenarios.


There is a way to solve secret management through utilization of resources provided only by AWS and a cool tool called credstash. You will find a nice guide on how to use the tool and a description on how it works if you follow the link, but the basic idea behind credstash is that it stores key/value data in DynamoDB while encrypting values with KMS (Key Management Service). As a result, only a user or resource that has read access to the DynamoDB table and permission to use that KMS Master key can access that data. The encrypted data can then be accessed through a very simple command line interface. In the most simple case the process would look like this:

On your laptop:

$ credstash put my-secret high-entropy-password
my-secret has been stored

On the EC2 instance:

$ credstash get my-secret

Boom, you transferred the password across the internet in a totally secure fashion using nothing but AWS services. At a high level, here is what happened in the above example. During the put operation:

  • a new random data encryption key was generated using KMS.
  • the value high-entropy-password was encrypted with that key.
  • data encryption key is itself encrypted with the KMS Master key, while its plaintext form is discarded.
  • the encrypted data, together with the encrypted data encryption key, are stored in a DynamoDB table under the name my-secret.

The KMS Master key that is used by default by credstash is the one with the name alias/credstash, while the default DynamoDB table name is credstash-store. This whole key wrapping technique is necessary, because the KMS Master key can only encrypt up to 4KiB of data at a time.

During the get operation the process is inverted:

  • pull out the blob from DynamoDB table with the name my-secret
  • decrypt the data key by using the KMS API and the KMS Master key
  • decrypt the actual secret using the decrypted key.

As you probably suspect, access to the secret data can be controlled on two levels, namely through access to DynamoDB table and to KMS Master key. More on that later.


There are a number of tools that are used for automatic deployment on AWS. Terraform, being one of them, stands out as an amazing tool that allows you to describe your infrastructure as code. It works not just with AWS, but with many other providers. In this post we'll use nothing but terraform, so if you are already familiar with it, go on reading forward; otherwise a Getting Started tutorial could be beneficial if you want to try things out while moving along.

Initial Setup

Installing terraform is pretty straightforward, since it is written in Go you can just download a binary for your operating system from terraform downloads page.

Credstash, on the other hand, is written in Python and as such can be installed with pip. It does have a few non-python dependencies that need to be installed beforehand. Here is how you'd get it on Ubuntu:

$ sudo apt-get update
$ sudo apt-get install -y libssl-dev libffi-dev python-dev python-pip
$ sudo -H pip install --upgrade pip
$ sudo -H pip install credstash

If you'd rather not install anything globally you can use Python environments, or even download another implementation of credstash ported to a different language, for instance gcredstash, written in Go, which, just like terraform, can be downloaded as a static executable and is fully compatible with credstash. Implementations in other languages are listed in the README.


Naturally, the example from Introduction will not work just out of the box, so prior to using credstash, a database table and an encryption key must be created first. Going through the credstash documentation will reveal that a DynamoDB table with a default name credstash-store can be created by running credstash setup, while the KMS Master key has to be created manually:

$ credstash -t test-table setup
Creating table...
Waiting for table to be created...
Table has been created. Go read the README about how to create your KMS key

Well that's no fun, we ought to be able to automate the whole process. The credstash-setup terraform module will do just that, thereby taking care of the initial setup for us. Remember, we need to do this only once and make sure not to run terraform destroy, unless you really want your secret data to be permanently deleted.

Create a file:

module "credstash" {
  source = ""

Then execute it in the same folder with the above file:

$ terraform get
$ terraform apply

Once applied, terraform will create a DynamoDB table with the name credstash-store and a KMS key with the name alias/credstash. After deployment is complete, you can go ahead and start using credstash on your local machine.

Remote state

Although, it is not strictly required, I would highly recommend using terraform's remote state feature in order to later simplify getting the values created by this setup. We even have a terraform module that can help you with getting the s3-remote-state bucket that was created.

terraform {
  backend "s3" {
    encrypt = "true"
    region  = "us-west-1"
    bucket  = "remote-tfstate"
    key     = "credstash/terraform.tfstate"
module "credstash" {
  source = ""

The main benefit of using remote terraform state is that credstash-related resources can be created just once and their reuse can be automated during our infrastructure deployment by all of the team members. Another more involved way would be to manually copy and paste outputs of this module into others as input variables, and that just sounds like too much work.

Roles and Grants

So far, usage of credstash was limited only to users that have implicit access to all KMS keys and DynamoDB tables, i.e. admins, power users, what have you. Basically, running credstash on an EC2 instance will result in a permission error, but that is where it is most useful. The best way to allow an EC2 instance access the secrets is to:

  • create an IAM profile with an IAM role, while attaching that profile to an EC2 instance that we are deploying
  • create an IAM policy(s) that allow reading and writing from/to the database table credsatsh-store, and attach those policies to the above mentioned role.
  • create a KMS grant(s) for the Master key, that gives permission for encryption and/or decryption with that key to a grantee, which will also be the above mentioned IAM role

The first two steps we can easily automate with terraform, but the last step has to be done with aws-cli or directly through an API with some SDK. But wait, I said that we won't be using anything besides terraform, and it is so indeed, aws-cli is an implicit dependency, which has to be installed despite the fact that we will not be interacting with directly.

Let's start with creating IAM policies first, as they can be reused as many times as we'd like.

... # also remote state, just as above

module "credstash" {
  source = ""
  enable_key_rotation = true
  create_reader_policy = true
  create_writer_policy = true
output "kms_key_arn" {
  value = "${module.credstash.kms_key_arn}"
output "reader_policy_arn" {
  value = "${module.credstash.reader_policy_arn}"
output "writer_policy_arn" {
  value = "${module.credstash.writer_policy_arn}"
output "install_snippet" {
  value = "${module.credstash.install_snippet}"
output "get_cmd" {
  value = "${module.credstash.get_cmd}"
output "put_cmd" {
  value = "${module.credstash.put_cmd}"

One of the greatest features of terraform, in my opinion, is that it knows exactly what needs to be done in order to reach the desired state, so if you already called terraform apply in the previous example, it will figure out everything that needs to be changed and apply only those changes without touching resources that need no modification.

When you run terraform apply you should see something along the lines:


get_cmd = /usr/local/bin/credstash -r us-east-1 -t credential-store get
install_snippet = { apt-get update;
  apt-get install -y build-essential libssl-dev libffi-dev python-dev python-pip;
  pip install --upgrade pip;
  pip install credstash; }

kms_key_arn = arn:aws:kms:us-east-1:123456789012:key/87b3526c-8100-11e7-9de5-4bff2f10d02a
put_cmd = /usr/local/bin/credstash -r us-east-1 -t credential-store put -k alias/credstash
reader_policy_arn = arn:aws:iam::123456789012:policy/credential-store-reader
writer_policy_arn = arn:aws:iam::123456789012:policy/credential-store-writer

At this point credstash is set up and we can verify that it works. Helper snippets are targeted at Ubuntu based systems, but can be easily adapted to other operating systems.

Let's install credstash on a local machine, store a test value and pull it out from credstash-store afterwards:

$ sudo -H bash -c "$(terraform output install_snippet)"
$ $(terraform output put_cmd) test-key test-value
test-key has been stored
$ $(terraform output get_cmd) test-key

We can also set a new value for the key, while auto incrementing its version, by setting -a flag:

$ $(terraform output put_cmd) -a test-key new-test-value2
test-key has been stored
$ $(terraform output get_cmd) test-key

There are a few other useful features that credstash has, which don't have helper snippets like get_cmd and put_cmd do, since they are less likely to be used in automated scripts. But they can still be easily constructed using terraform outputs. It's worth noting that all previously stored values are always available, unless deleted manually:

$ credstash -r us-east-1 -t credential-store get test-key -v 0000000000000000000
$ credstash -r us-east-1 -t credential-store list
test-key -- version 0000000000000000000
test-key -- version 0000000000000000001
$ credstash -r us-east-1 -t credential-store delete test-key
Deleting test-key -- version 0000000000000000000
Deleting test-key -- version 0000000000000000001

Deploy EC2

Using credstash directly is extremely simple, but setting everything up for it to work on EC2 instances can be a bit daunting, so this is what this section and the credstash-grant terraform module are about.

The simplest example that comes to mind—which is actually pretty common in practice—is deploying an EC2 instance with an nginx webserver serving a web page (or working as a reverse proxy), while protecting it with BasicAuthentication. We will use credstash to automatically retrieve credentials that we will store prior to EC2 instance deployment:

$ $(terraform output put_cmd) nginx-username admin
nginx-username has been stored
$ $(terraform output put_cmd) nginx-password foobar
nginx-password has been stored

The full example can be found in this gist, but here are the parts that are of most interest to us.

Using the credstash-grant module will effectively allow read access to the DynamoDB table by attaching that policy to an IAM role and creating a KMS grant, thus allowing that IAM role to use the KMS Master key for decryption. This grant will automatically be revoked upon destruction, so there is no need to worry about some dangling settings that should be cleaned up.

# lookup credstash remote state
data "terraform_remote_state" "credstash" {
  backend = "s3"
  config {
    region  = "us-west-1"
    bucket  = "remote-tfstate"
    key     = "credstash/terraform.tfstate"
module "credstash-grant" {
  source            = ""
  kms_key_arn       = "${data.terraform_remote_state.credstash.kms_key_arn}"
  reader_policy_arn = "${data.terraform_remote_state.credstash.reader_policy_arn}"
  roles_count       = 1
  roles_arns        = ["${aws_iam_role.credstash-role.arn}"]
  roles_names       = ["${}"]

You might notice that we created a writer policy during credstash-setup stage, but didn't supply its ARN to the module. This ensures that we give EC2 instance read only access to the secret store. If ability to store secrets from within EC2 is desired, supplying writer_policy_arn to the module is all that is necessary for that to work.

This is the part where credstash is called on the EC2 side:

resource "aws_instance" "webserver" {
  associate_public_ip_address = true
  iam_instance_profile        = "${}"
  user_data                   = <<USER_DATA
apt-get install -y nginx
BASIC_AUTH_USERNAME="$(${data.terraform_remote_state.credstash.get_cmd} nginx-username)"
BASIC_AUTH_PASSWORD="$(${data.terraform_remote_state.credstash.get_cmd} nginx-password)"
echo -n "$BASIC_AUTH_USERNAME:" > /etc/nginx/.htpasswd
openssl passwd -apr1 "$BASIC_AUTH_PASSWORD" >> /etc/nginx/.htpasswd

You are not required to use the helper snippets if you don't want to, but those can be very helpful in the long run, especially if some time later you choose to customize the KMS key name, DynamoDB table or simply try to use credstash in another AWS region. The get_cmd and put_cmd snippets encapsulate this information, so we won't have to chase all the places were we used credstash in order to update these values.

Applying terraform will deploy our webserver. After it gets fully initialized we can verify that it worked as expected:

$ curl -s http://$(terraform output instance_ip) | grep title
<head><title>401 Authorization Required</title></head>
$ curl -s http://admin:foobar@$(terraform output instance_ip) | grep title
<title>Welcome to nginx!</title>


In a setup where we are using credstash only on one EC2 instance we have nothing else to worry about. Nowadays though, that is not such a common scenario. You might have a database cluster running on a few instances, a webserver that is managed by an auto scaling group, a message broker running on other instances, so on and so forth. Each one of those services requires its own set of credentials, TLS certificates or what have you. In these kinds of scenarios we need to make sure that instances running a webserver will not have access to secrets that are meant only for instances running the database. To complicate it even more, we often have stage, test, dev, and prod environments, and we would ideally like to isolate those from each other as much as possible. KMS Encryption Context is a straightforward solution to this problem.

By itself, context doesn't give any extra level of protection, but when combined with constraints, which are specified during KMS grant creation, it turns into a powerful protection and isolation concept that significantly increases overall security.

Here is how encryption contexts work in a nutshell:

Whenever you run credstash put name secret foo=bar, the key value pair {"foo": "bar"}, called context, becomes cryptographically bound to the cyphertext that is stored in the database, and therefore the same key value pair will be required in order to decrypt the secret. Keep in mind that this pair does not have to be something complicated, and in fact, it must not contain any sensitive information, as it will be visible in CloudTrail logs.

During grant creation, which is performed for us by the credstash-grant module, we can supply reader_context and writer_context, which will prevent credstash from running get and put commands respectively without passing exactly the same context as extra arguments. If I create a grant with reader context env=test service=database, there is no way for an instance with that IAM role to read secrets that were encrypted with env=prod service=database or env=test service=webserver contexts, or no context at all for that matter. It has to match exactly.

When analyzing security it is important to look at such cases when the system does get compromised. In a case when an attacker acquires root access, then all bets are off and secrets can be easily extracted from memory or file storage, but if privilege escalation did not occur then one of the possible concerns could be the fact that access even to those credentials stored with specific context could be accessed long after the deployment was complete. In a simple single machine setup this situation can be alleviated by revoking the KMS grant after initialization was complete, thus preventing long term access to KMS Master key. But if an EC2 instance is deployed as part of an Auto Scaling Group, that approach will not work, as access to secrets is needed at all times since EC2 instances can come and go at anytime.

As a side note. There is a way to control access to KMS keys through an IAM policy, just as it is done with DynamoDB table, but because encryption context constraints are not available at the policy level, we resort to explicit grant creation, precisely for the reason of isolation described in this section.

Other use cases

Besides the obvious use case of passing credentials to EC2 instances, credstash can be a potential solution in other areas. The practical size limit for the data being encrypted and stored with credstash is on the order of ~100KiB, so you can store things much larger than a short passphrases, making it perfect for storing things like TLS certificates and SSH keys. For example we were able to successfully supply all necessary TLS certificates used for mutual authentication in Elastic's Beats protocol during deployment of Logstash, while automatically generating certificates using certstrap.

There is no reason to think that information stored with credstash has to be sensitive in the first place. Usual configuration files can be stored and retrieved just as well, therefore taking this heavy burden away from the initialization script. In fact, this idea could be taken up a notch, and a credstash pull command could be setup in crontab. This way an application that is running on a server can be configured to periodically reload its configuration, thus giving an ability to administrator to update the configuration dynamically without use of any other provisioning tools.

AWS Lambda also uses IAM roles, and credstash being an open source python tool could be in theory used there as well.

It is so easy to use credstash that I actually started to manage my own credentials with it.

A couple of examples for some of the use cases, plus more documentation on credstash related terraform modules, can be found in the fpco/fpco-terraform-aws repository.


The overall benefits of using credstash should be pretty obvious at this point. Sensitive information is encrypted, stored securely in DynamoDB, and is available at all times. Furthermore, we have the ability to fine tune which part of it can be accessed and by which parties through usage of IAM roles and KMS grants. There are no more worries about manually encrypting your secrets and finding the best way to move them across the wire. You are no longer limited by a tiny 14KiB size limit of user-data. There is no more need for setting up SSH connections just to pass over those couple TLS certificates. By keeping credentials in a central remote location you are less likely to forget to remove them before pushing the code to your repository, or leaving them in unencrypted form in terraform state. More importantly, it gives you a unified, programatic way to manage credentials, bringing more structure and order to your DevOps.

August 28, 2017 03:00 PM

Mark Jason Dominus

Miscellanea about 24 puzzles

This is a collection of leftover miscellanea about twenty-four puzzles. In case you forgot what that is:

The puzzle «4 6 7 9 ⇒ 24» means that one should take the numbers 4, 6, 7, and 9, and combine them with the usual arithmetic operations of addition, subtraction, multiplication, and division, to make the number 24. In this case the unique solution is $$6\times\frac{7 + 9}{4}.$$ When the target number is 24, as it often is, we omit it and just write «4 6 7 9».

Prior articles on this topic:

How many puzzles have solutions?

For each value of , there are 715 puzzles «a b c d ⇒ T». (I discussed this digression in two more earlier articles: [1] [2].) When the target , 466 of the 715 puzzles have solutions. Is this typical? Many solutions of «a b c d» puzzles end with a multiplication of 6 and 4, or of 8 and 3, or sometimes of 12 and 2—so many that one quickly learns to look for these types of solutions right away. When , there won't be any solutions of this type, and we might expect that relatively few puzzles with prime targets have solutions.

This turns out to be the case:

ALT attribute suggestions welcome!

The x-axis is the target number , with 0 at the left, 300 at right, and vertical guide lines every 25. The y axis is the number of solvable puzzles out of the maximum possible of 715, with 0 at the bottom, 715 at the top, and horizontal guide lines every 100.

Dots representing prime number targets are colored black. Dots for numbers with two prime factors (4, 6, 9, 10, 14, 15, 21, 22, etc.) are red; dots with three, four, five, six, and seven prime factors are orange, yellow, green, blue, and purple respectively.

Two countervailing trends are obvious: Puzzles with smaller targets have more solutions, and puzzles with highly-composite targets have more solutions. No target number larger than 24 has as many as 466 solvable puzzles.

These are only trends, not hard rules. For example, there are 156 solvable puzzles with the target 126 (4 prime factors) but only 93 with target 128 (7 prime factors). Why? (I don't know. Maybe because there is some correlation with the number of different prime factors? But 72, 144, and 216 have many solutions, and only two different prime factors.)

The smallest target you can't hit is 417. The following numbers 418 and 419 are also impossible. But there are 8 sets of four digits that can be used to make 416 and 23 sets that can be used to make 420. The largest target that can be hit is obviously ; the largest target with two solutions is .

(The raw data are available here).

There is a lot more to discover here. For example, from looking at the chart, it seems that the locally-best target numbers often have the form . What would we see if we colored the dots according to their largest prime factor instead of according to their number of prime factors? (I tried doing this, and it didn't look like much, but maybe it could have been done better.)

Making zero

As the chart shows, 705 of the 715 puzzles of the type «a b c d ⇒ 0», are solvable. This suggests an interesting inverse puzzle that Toph and I enjoyed: find four digits that cannot be used to make zero. (The answers).

Identifying interesting or difficult problems

(Caution: this section contains spoilers for many of the most interesting puzzles.)

I spent quite a while trying to get the computer to rank puzzles by difficulty, with indifferent success.


Seven puzzles require the use of fractions. One of these is the notorious «3 3 8 8» that I mentioned before. This is probably the single hardest of this type. The other six are:

    «1 3 4 6»
    «1 4 5 6»
    «1 5 5 5»
    «1 6 6 8»
    «3 3 7 7»
    «4 4 7 7»

(Solutions to these (formatted image); solutions to these (plain text))

«1 5 5 5» is somewhat easier than the others, but they all follow pretty much the same pattern. The last two are pleasantly symmetrical.

Negative numbers

No puzzles require the use of negative intermediate values. This surprised me at first, but it is not hard to see why. Subexpressions with negative intermediate values can always be rewritten to have positive intermediate values instead.

For instance, can be rewritten as and can be rewritten as .

A digression about tree shapes

In one of the earlier articles I asserted that there are only two possible shapes for the expression trees of a puzzle solution:

(((a # b) # c) # d) ((a # b) # (c # d))
Form A Form B

(Pink square nodes contain operators and green round nodes contain numbers.)

Lindsey Kuper pointed out that there are five possible shapes, not two. Of course, I was aware of this (it is a Catalan number), so what did I mean when I said there were only two? It's because I had the idea that any tree that wasn't already in one of those two forms could be put into form A by using transformations like the ones in the previous section.

For example, the expression isn't in either form, but we can commute the × to get the equivalent , which has form A. Sometimes one uses the associative laws, for example to turn into .

But I was mistaken; not every expression can be put into either of these forms. The expression is an example.

Unusual intermediate values

The most interesting thing I tried was to look for puzzles whose solutions require unusual intermediate numbers.

For example, the puzzle «3 4 4 4» looks easy (the other puzzles with just 3s and 4s are all pretty easy) but it is rather tricky because its only solution goes through the unusual intermediate number 28: .

I ranked puzzles as follows: each possible intermediate number appears in a certain number of puzzle solutions; this is the score for that intermediate number. (Lower scores are better, because they represent rarer intermediate numbers.) The score for a single expression is the score of its rarest intermediate value. So for example has the intermediate values 7 and 28. 7 is extremely common, and 28 is quite unusual, appearing in only 151 solution expressions, so receives a fairly low score of 151 because of the intermediate 28.

Then each puzzle received a difficulty score which was the score of its easiest solution expression. For example, «2 2 3 8» has two solutions, one () involving the quite unusual intermediate value 22, which has a very good score of only 79. But this puzzle doesn't count as difficult because it also admits the obvious solution and this is the solution that gives it its extremely bad score of 1768.

Under this ranking, the best-scoring twenty-four puzzles, and their scores, were:

      «1 2 7 7» 3
    * «4 4 7 7» 12
    * «1 4 5 6» 13
    * «3 3 7 7» 14
    * «1 5 5 5» 15
      «5 6 6 9» 23
      «2 5 7 9» 24
      «2 2 5 8» 25
      «2 5 8 8» 45
      «5 8 8 8» 45
      «2 2 2 9» 47
    * «1 3 4 6» 59
    * «1 6 6 8» 59
      «2 4 4 9» 151
      «3 4 4 4» 151
    * «3 3 8 8» 152
      «6 8 8 9» 152
      «2 2 2 7» 155
      «2 2 5 7» 155
      «2 3 7 7» 155
      «2 4 7 7» 155
      «2 5 5 7» 155
      «2 5 7 7» 156
      «4 4 8 9» 162

(Something is not quite right here. I think «2 5 7 7» and «2 5 5 7» should have the same score, and I don't know why they don't. But I don't care enough to do it over.)

Most of these are at least a little bit interesting. The seven puzzles that require the use of fractions appear; I have marked them with stars. The top item is «1 2 7 7», whose only solution goes through the extremely rare intermediate number 49. The next items require fractions, and the one after that is «5 6 6 9», which I found difficult. So I think there's some value in this procedure.

But is there enough value? I'm not sure. The last item on the list, «4 4 8 9», goes through the unusual number 36. Nevertheless I don't think it is a hard puzzle.

(I can also imagine that someone might see the answer to «5 6 6 9» right off, but find «4 4 8 9» difficult. The whole exercise is subjective.)

Solutions with unusual tree shapes

I thought about looking for solutions that involved unusual sequences of operations. Division is much less common than the other three operations.

To get it right, one needs to normalize the form of expressions, so that the shapes and aren't counted separately. The Ezpr library can help here. But I didn't go that far because the preliminary results weren't encouraging.

There are very few expressions totaling 24 that have the form . But if someone gives you a puzzle with a solution in that form, then and are also solutions, and one or another is usually very easy to see. For example, the puzzle «1 3 8 9» has the solution , which has an unusual form. But this is an easy puzzle; someone with even a little experience will find the solution immediately.

Similarly there are relatively few solutions of the form , but they can all be transformed into which is not usually hard to find. Consider $$\frac 8{\left(\frac{6 - 4}6\right)}.$$ This is pretty weird-looking, but when you're trying to solve it one of the first things you might notice is the 8, and then you would try to turn the rest of the digits into a 3 by solving «4 6 6 ⇒ 3», at which point it wouldn't take long to think of . Or, coming at it from the other direction, you might see the sixes and start looking for a way to make «4 6 8 ⇒ 4», and it wouldn't take long to think of .

Ezpr shape

Ezprs (see previous article) correspond more closely than abstract syntax trees do with our intuitive notion of how expressions ought to work, so looking at the shape of the Ezpr version of a solution might give better results than looking at the shape of the expression tree. For example, one might look at the number of nodes in the Ezpr or the depth of the Ezpr.


When trying to solve one of these puzzles, there are a few things I always try first. After adding up the four numbers, I then look for ways to make or ; if that doesn't work I start branching out looking for something of the type .

Suppose we take a list of all solvable puzzles, and remove all the very easy ones: the puzzles where one of the inputs is zero, or where one of the inputs is 1 and there is a solution of the form .

Then take the remainder and mark them as “easy” if they have solutions of the form or . Also eliminate puzzles with solutions of the type or .

How many are eliminated in this way? Perhaps most? The remaining puzzles ought to have at least intermediate difficulty, and perhaps examining just those will suggest a way to separate them further into two or three ranks of difficulty.

I give up

But by this time I have solved so many twenty-four puzzles that I am no longer sure which ones are hard and which ones are easy. I suspect that I have seen and tried to solve most of the 466 solvable puzzles; certainly more than half. So my brain is no longer a reliable gauge of which puzzles are hard and which are easy.

Perhaps looking at puzzles with five inputs would work better for me now. These tend to be easy, because you have more to work with. But there are 2002 puzzles and probably some of them are hard.

Close, but no cigar

What's the closest you can get to 24 without hitting it exactly? The best I could do was . Then I asked the computer, which confirmed that this is optimal, although I felt foolish when I saw the simpler solutions that are equally good: .

The paired solutions $$5 × \left(4 + \frac79\right) < 24 < 7 × \left(4 - \frac59\right)$$ are very handsome.

Phone app

The search program that tells us when a puzzle has solutions is only useful if we can take it with us in the car and ask it about license plates. A phone app is wanted. I built one with Code Studio.

Code Studio is great. It has a nice web interface, and beginners can write programs by dragging blocks around. It looks very much like MIT's scratch project, which is much better-known. But Code Studio is a much better tool than Scratch. In Scratch, once you reach the limits of what it can do, you are stuck, and there is no escape. In Code Studio when you drag around those blocks you are actually writing JavaScript underneath, and you can click a button and see and edit the underlying JavaScript code you have written.

Suppose you need to convert A to 1 and B to 2 and so on. Scratch does not provide an ord function, so with Scratch you are pretty much out of luck; your only choice is to write a 26-way if-else tree, which means dragging around something like 104 stupid blocks. In Code Studio, you can drop down the the JavaScript level and type in ord to use the standard ord function. Then if you go back to blocks, the ord will look like any other built-in function block.

In Scratch, if you want to use a data structure other than an array, you are out of luck, because that is all there is. In Code Studio, you can drop down to the JavaScript level and use or build any data structure available in JavaScript.

In Scratch, if you want to initialize the program with bulk data, say a precomputed table of the solutions of the 466 twenty-four puzzles, you are out of luck. In Code Studio, you can upload a CSV file with up to 1,000 records, which then becomes available to your program as a data structure.

In summary, you spend a lot of your time in Scratch working around the limitations of Scratch, and what you learn doing that is of very limited applicability. Code Studio is real programming and if it doesn't do exactly what you want out of the box, you can get what you want by learning a little more JavaScript, which is likely to be useful in other contexts for a long time to come.

Once you finish your Code Studio app, you can click a button to send the URL to someone via SMS. They can follow the link in their phone's web browser and then use the app.

Code Studio is what Scratch should have been. Check it out.


Thanks to everyone who contributed to this article, including:

  • my daughters Toph and Katara
  • Shreevatsa R.
  • Dr. Lindsey Kuper
  • Darius Bacon
  • everyone else who emailed me

by Mark Dominus ( at August 28, 2017 03:36 AM

August 26, 2017

Marcos Pividori

3 Libraries for communicating with Mobile devices through Push Notifications.

I announce the result of 3 months working on the GSoC project: "Communicating with mobile devices". After reading many documentation, thinking about a good abstraction, learning about mobile apps and networks connections, and really valuable recommendations from my mentor, (I really want to thank Michael Snoyman for his availability and kindness) I finished developing 3 libraries to make it easy to send push notifications to mobile devices. They are available on Hackage:

* push-notify- : This library offers a simple API for sending notifications through APNS, GCM and MPNS. [1]

* push-notify-ccs- : This library offers an API for sending/receiving notifications through CCS (XMPP - Google Cloud Messaging). [2]

* push-notify-general- : This library offers a general API for sending/receiving notifications, and handling the registration of devices on the server. It provides a general abstraction which can be used to communicate through different services as APNS, GCM, MPNS, hiding as much as possible, the differences between them. [3]

Now, it is easy to send/receive information with mobile devices, so you can easily develop server applications which interact with them.
So, to clarify this, as part of this project, I developed many test examples, including Android/WPhone and Yesod apps. The code is available on GitHub: [4] and they are running online on: [5] , so can be downloaded and tested on mobile devices. (BackAndForth Messaging and Connect4 apps)
Every feedback is welcome!


by Marcos ( at August 26, 2017 10:13 PM

August 24, 2017

FP Complete

Exiting a Haskell process

This blog post was inspired by a recent Stack Overflow question. It also uses the Stack script interpreter for inline snippets if you want to play along at home. Don't forget to get Stack first.

The non trick case

Here's a non trick question: what do you think the output of this series of shell commands is going to be?

$ cat Main.hs
#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
import System.Exit

main = exitWith (ExitFailure 42)
$ stack Main.hs
$ echo $?

If you guessed 42, you're right. Our Haskell process uses exitWith to exit the process with exit code 42. Then echo $? prints the last exit code. All relatively straightforward (if you're familiar with the shell).

Race condition

Alright, let's make it more fun with some concurrency (concurrency makes everything more fun):

#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
import System.Exit
import Control.Concurrent.Async

main = concurrently
  (exitWith (ExitFailure 41))
  (exitWith (ExitFailure 42))

The output this time is nondeterministic. We don't know if the first thread (which exits with 41) or the second thread (which exits with 42) will exit first. I tested this about 5 times on my machine, and got both 41 and 42 as outputs. So this isn't just theoretically nondetministic, it's practically nondetministic.

Surprise! Warp

Alright, that's fine, probably nothing too terribly surprising. Now let's throw the curve balls in. I'm going to write a web server with Warp, and when someone requests /die, I want the server to go down. Here's the code. If you're not familiar with WAI and Warp, just ignore the web bits and focus on the exitWith part:

#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
{-# LANGUAGE OverloadedStrings #-}
import Network.Wai
import Network.Wai.Handler.Warp
import Network.HTTP.Types
import System.Exit

main = run 3000 $ \req send ->
  if pathInfo req == ["die"]
    then exitWith (ExitFailure 43)
    else send (responseLBS status200 [] "Still alive!\n")

Let's see what happens when we run it:

$ stack Main.hs&
[2] 19117
$ curl http://localhost:3000
Still alive!
$ curl http://localhost:3000
Still alive!
$ curl http://localhost:3000
Still alive!
$ curl http://localhost:3000/die
ExitFailure 43
Something went wrong
$ curl http://localhost:3000
Still alive!
$ fg
stack Main.hs

A few different weird things just happened:

  • When we made a request to /die, the server apparently didn't die! We can see that from both the fact that the next request succeeded, and the fg call.
  • For some reason, ExitFailure 43 is printed to the console. We can't tell here, but it's coming from the server process.
  • And our HTTP response body contains the content Something went wrong, even though we didn't write that.

I would have expected the process to just die and get an empty response. Why this surprising behavior instead?

Implementation of exitWith

To understand what's happening, let's look at a simplified version of the implementation of the exitWith function. Feel free to read the original as well.

exitWith :: ExitCode -> IO a
exitWith code = throwIO code

I would have anticipated that this would, you know, actually exit the process. Such a function does exist in Haskell. It's called exitImmediately, it lives in the unix package, and it calls out to the exit C library function. But not exitWith: it throws a runtime exception.

There's a good reason for this exception-based behavior. It allows cleanup code to run before the process just up and dies, which would allow things like flushing file handles and gracefully closing network connections. However, this can certainly result in surprising behavior. We'll get back to the Warp case in a bit; let's see something simpler first:

#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
import Control.Exception.Safe
import System.Exit

main = tryAny foo >>= print

foo :: IO String
foo = exitWith (ExitFailure 44)

And the output is:

$ stack Main.hs
Left (ExitFailure 44)
$ echo $?

We've exited with code 0, a success! And our program continued running after the call to exitWith. That's because our tryAny call intercepted the exception, converted it into a Left value, and then our program succeeded in printing out that value.

What's up with Warp?

Warp employs a pretty simple model for handling requests:

  • Grabs a listening port
  • Loops accepting connections on that port
  • For each connection, fork a new worker thread

Within each worker thread, Warp accepts a request, passes it to the user-supplied application, takes the response, and sends it. Additionally, Warp installs an exception handler in case the application throws an exception. In that case, it will print the exception to stderr and send a 500 Internal Server Error response with the response body (wait for it) Something went wrong.

So of course our initial attempt at killing our Warp application failed: the exception was intercepted!

As an aside, if you really want to be able to exit from a Warp application, you can see my answer on Stack Overflow, which I'm not going to detail here as it will be a tangent to the main point.

Child threads in general

Alright, let's make another mistake (certainly my specialty):

#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
import Control.Concurrent
import System.Exit

main = do
  forkIO (exitWith (ExitFailure 45))
  threadDelay 1000000
  putStrLn "Normal exit :("

We're not intercepting the exception via a handler at all, and thanks to our threadDelay (which delays the parent thread by one second), we have plenty of time for the child thread to act before the parent exits on its own. Surely this will exit with exit code 45, right?

$ stack Main.hs
Main.hs: ExitFailure 45
Normal exit :(
$ echo $?

Foiled again! We're running into something different now. In GHC's runtime system, a process exits when the main thread exits. If a child thread exits for any reason, the process keeps running. If the main thread exits, even if there are still child threads running, the process exits.

When we call forkIO, a default exception handler is installed on this new child thread. And that default exception handler will simply print out the exception to stderr. That's the Main.hs: ExitFailure 45 output we see.

As usual: async to the rescue

Where did we go wrong? By using the forkIO function, of course! As I'm wont to say:

<script async="" charset="utf-8" src=""></script>

The problem is that forkIO installs a default exception handler, instead of properly propagating exceptions through our application. Fortunately, there's a great solution to this, which we've already seen in this post: use the concurrently function from async (or, in some cases, race).

#!/usr/bin/env stack
-- stack --resolver lts-9.0 script
import Control.Concurrent
import Control.Concurrent.Async
import System.Exit

main = concurrently (exitWith (ExitFailure 45)) $ do
  threadDelay 1000000
  putStrLn "Normal exit :("

Any luck?

$ stack Main.hs
$ echo $?

Woohoo! I've never been so happy to see a process exit with a failure code before.

In contrast to forkIO, the concurrently and race functions track the exceptions occurring in their child threads and rethrow those exceptions in the parent thread should anything go wrong. So instead of exceptions disappearing into the aether, they tear down our process with dignity.

If you're not familiar with the async library, check out the tutorial I wrote on it, which focuses on using concurrently and race wherever possible.


Takeaways to remember:

  • exitWith works by throwing exceptions, not directly killing the process
  • A Haskell process dies when the main thread dies
  • Warp worker threads install an exception handler that generates 500 Internal Server Error responses
  • Use concurrently and race in place of forkIO, and generally try to use the async library

August 24, 2017 02:10 PM

Tweag I/O

Compact normal forms + linear types <br>= efficient network communication

Arnaud Spiwack

We saw last time that with linear types, we could precisely capture the state of sockets in their types. In this post, I want to use the same idea of tracking states in types, but applied to a more unusual example from our paper: sending rich structured data types across the network and back with as little copying as possible.

Our approach to this problem works for values of any datatype, but for the sake of simplicity, we'll consider here simple tree values of the following datatype:

data Tree = Branch Tree Tree | Leaf Int

Network communication and serialization

Say I have one such Tree. And suppose that I want to use a service, on a different machine across the network, that adds 1 to all the leaves of the tree.

The process would look like this:

  • I serialize my tree into a serialized form and send it across the network.
  • The service deserializes the tree.
  • The service adds 1 to the leaves.
  • The service serializes the updated tree and sends it across the network.
  • I deserialize this tree to retrieve the result.

This process involves copying the tree structure 5 times, converting back and forth between a pointer representation, which Haskell can use, and a serialized representation, which can be sent over the network, for a single remote procedure call.

This goes to show that it should be no surprise when overhead of serialization and deserialization in a distributed application is significant. It can even become the main bottleneck.

Compact normal forms

To overcome this, compact normal forms were introduced in GHC 8.2. The idea is to dispense with the specialised serialized representation and to send the pointer representation through the network.

Of course, this only works if the service is implemented in Haskell too. Also, you can still only send bytestrings across the network.

To bridge the gap, data is copied into a contiguous region of memory, and the region itself can be seen as a bytestring. The interface is (roughly) as follows:

compact   :: Tree -> Compact Tree
unCompact :: Compact Tree -> Tree

The difference with serialization and deserialization is that while compact t copies t into a form amenable to network communication, unCompact is free because, in the compact region, the Tree is still a bunch of pointers. So our remote call would look like this:

  • I compact my tree and send it across the network.
  • The service retrieves the tree and adds 1 to the leaves.
  • The service compacts the updated tree and sends it accross the network.

When I recieve the tree, it is already in a pointer representation (remember, unCompact is free), so we are down to 3 copies! We also get two additional benefits from compact normal forms:

  • A Tree in compact normal form is close to its subtrees, so tree traversals will be cache friendly: it is likely that following a pointer will land into already prefetched cache.
  • Compact regions have no outgoing pointers. It means that the garbage collector never needs to traverse a compact region: a compact region does not keep other heap object alive. Less work for the garbage collector means both more predictable latencies and better throughput.

Programming with serialized representations

Compact normal forms save a lot of copies. But they still impose one copy as compact is the only way to introduce a compact value. To address that we could make an even more radical departure from the traditional model. While compact normal forms do away with the serialized representation and send the pointer representation instead, let's go in the opposite direction, and compute directly with the serialized representation! That is, Branch (Branch (Leaf 1) (Leaf 2)) (Leaf 3)) would be represented as a single contiguous buffer in memory:

| Branch | Branch |  Leaf  |   1    |  Leaf  |   2    |  Leaf  |   3    |

If you want to know more, Ryan Newton, one of my coauthors on the linear-type paper, and also a co-author on the compact-normal-form paper has also been involved in an entire article on such representations.

In the meantime, let's return to our example. The remote call now has a single copy, which is due to our immutable programming model, rather than due to networking:

  • I send my tree, the service adds 1 to the leaves of the tree, and sends the result back.

Notice that when we are looking at the first cell of the array-representation of the tree above, we know that we are looking at a Branch node. The left subtree comes immediately after, so that one is readily available. But there is a problem: we have no way to know where the right subtree starts in the array, so we can't just jump to it. The only way we will be able to access subtrees is by doing left-to-right traversals, the pinacle of cache-friendliness.

If this all starts to sound a little crazy. It's probably because it kind of is. It is very tricky to program with such a data representation. It is so error prone that, despite the performance edge, our empirical observation is that even very dedicated C programmers don't do it.

The hard part isn't so much directly consuming a representation of the above form. It's constructing them in the first place. We'll get to that in a minute, but first let's look at what consuming trees looks like:

data Packed (l :: [*])

  :: Packed (Tree ': r)
  -> Either (Packed (Tree ': Tree ': r)) (Int, Packed r)

We have a datatype Packed of trees represented as above. We define a one-step unfolding function, caseTree, which you can as well think of as an elaborate case-of (pattern matching) construct. There is a twist: Packed is indexed by a list of types. This is because of the right-subtree issue that I mentionned above: once I've read a Branch tag, I have a pointer to the left subtree, but not to the right subtree. So all I can say is that I have a pointer which is followed by the representation of two consecutive trees (in the example above, this means a pointer to the second Branch tag).

The construction of trees is a much trickier business. Operationally we want to write into an array, but we can't just use a mutable array for this: it is too easy to get wrong. We need at least:

  • To write each cell only once, otherwise we can get inconsistent, nonsensical trees such as
    | Branch | Branch |   0    |   1    |  Leaf  |   2    |  Leaf  |   3    |
  • To write complete trees otherwise we may get things like
    | Branch | Branch |  Leaf  |   1    |  Leaf  |   2    |        |        |
    where the blank cells contain garbage

You will have noticed that these are precisely the invariants that linear types afford. An added bonus is that linear types make our arrays observationally pure, so no need for, say, the ST monad.

The type of write buffers is

data Need (l :: [*]) (t :: *)

where Need '[Tree, Tree, Int] Tree should be understood as "write two Tree-s and an Int and you will get a (packed) Tree", as illustrated by the finish function:

-- `Unrestricted` means that the returned value is not linear: you get
-- as many uses as you wish
finish :: Need '[] t ⊸ Unrestricted (Packed [t])

To construct a tree, we use constructor-like functions (though, because we are constructing the tree top-down, the arrows are in the opposite direction of regular constructors):

leaf :: Int -> Need (Tree':r) t ⊸ Need r t
branch :: Need (Tree':r) t ⊸ Need (Tree':Tree':r) t

Finally (or initially!) we need to allocate an array. The following idiom expresses that the array must be used linearly:

alloc :: (Need '[Tree] Tree ⊸ Unrestricted r) ⊸ Unrestricted r

-- When using `Unrestricted a` linearly, you have no restriction on the inner `a`!
data Unrestricted a where Unrestricted :: a -> Unrestricted a

Because the Need array is used linearly, both leaf and branch make their argument unavailable. This ensures that we can only write, at any time, in the left-most empty cell, saving us from inconsistent trees. The type of finish, makes sure that we never construct a partial tree. Mission accomplished!

The main routine of our service can now be implemented as follows:

add1 :: Packed '[Tree] -> Packed '[Tree]
add1 tree = getUnrestricted finished
    -- Allocates the destination array and run the main loop
    finished :: Unrestricted (Packed '[Tree])
    finished = alloc (\need -> finishNeed (mapNeed tree need))

    -- Main loop: given an packed array and a need array with
    -- corresponding types (starting with a tree), adds 1 to the
    -- leaves of the first tree and returns the rest of the
    -- arrays. Notice how `mapNeed` is chained twice in the `Branch`
    -- case.
      :: Packed (Tree ': r) -> Need (Tree ': r) Tree
      ⊸ (Unrestricted (Packed r), Need r Tree)
    mapNeed trees need = case (caseTree trees) of
      Left subtrees -> mapNeed' (mapNeed subtrees (branch need))
      Right (n, otherTrees) -> (Unrestricted otherTrees, leaf (n+1) need)

    -- Uncurried variant of the main loop
      :: (Unrestricted (Packed (Tree ': r)), Need (Tree ': r) Tree)
      ⊸  (Unrestricted (Packed r), Need r Tree)
    mapNeed' (Unrestricted trees, need) = mapNeed h n

    -- The `finish` primitive with an extra unrestricted argument
      :: (Unrestricted (Packed '[]), Need '[] Tree) ⊸ Unrestricted (Has '[Tree])
    finishNeed (Unrestricted _, need) = finish need

In the end, what we have is a method to communicate and compute over trees without having to perform any extra copies in Haskell because of network communication.

What I like about this API is that it turns a highly error-prone endeavour, programming directly with serialized representation of data types, into a rather comfortable situation. The key ingredient is linear types. I'll leave you with an exercise, if you like a challenge: implement pack and unpack analogs of compact region primitives:

unpack :: Packed [a] -> a
pack :: a -> Packed [a]

You may want to use a type checker.

August 24, 2017 12:00 AM

August 23, 2017

Twan van Laarhoven

Traversing syntax trees

When working with syntax trees (such as in a type theory interpreter) you often want to apply some operation to all subtrees of a node, or to all nodes of a certain type. Of course you can do this easily by writing a recursive function. But then you would need to have a case for every constructor, and there can be many constructors.

Instead of writing a big recursive function for each operation, it is often easier to use a traversal function. Which is what this post is about. In particular, I will describe my favorite way to handle such traversal, in the hope that it is useful to others as well.

As a running example we will use the following data type, which represents expressions in a simple lambda calculus

-- Lambda calculus with de Bruijn indices
data Exp
  = Var !Int
  | Lam Exp
  | App Exp Exp
  | Global String
  deriving Show
example1 :: Exp example1 = Lam $ Var 0 -- The identity function
example2 :: Exp example2 = Lam $ Lam $ Var 1 -- The const function
example3 :: Exp example3 = Lam $ Lam $ Lam $ App (Var 2) (App (Var 1) (Var 0)) -- Function composition

Now, what do I mean by a traversal function? The base library comes with the Traversable class, but that doesn't quite fit our purposes, because that class is designed for containers that can contain any type a. But expressions can only contain other sub-expressions. Instead we need a monomorphic variant of traverse for our expression type:

traverseExp :: Applicative f => (Exp -> f Exp) -> (Exp -> f Exp)

The idea is that traverseExp applies a given function to all direct children of an expression.

The uniplate package defines a similar function, descendM. But it has two problems: 1) descendM has a Monad constraint instead of Applicative, and 2) the class actually requires you to implement a uniplate method, which is more annoying to do.

The ever intimidating lens package has a closer match in plate. But aside from the terrible name, that function also lacks a way to keep track of bound variables.

For a language with binders, like the lambda calculus, many operations need to know which variables are bound. In particular, when working with de Bruijn indices, it is necessary to keep track of the number of bound variables. To do that we define

type Depth = Int
-- Traverse over immediate children, with depth
traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> Exp -> f Exp)
traverseExpD _ _ (Var i)    = pure (Var i)
traverseExpD f d (Lam x)    = Lam <$> f (d+1) x
traverseExpD f d (App x y)  = App <$> f d x <*> f d y
traverseExpD _ _ (Global x) = pure (Global x)

Once we have written this function, other traversals can be defined in terms of traverseExpD

-- Traverse over immediate children
traverseExp :: Applicative f => (Exp -> f Exp) -> (Exp -> f Exp)
traverseExp f = traverseExpD (const f) 0

And map and fold are just traversals with a specific applicative functor, Identity and Const a respectively. Recent versions of GHC are smart enough to know that it is safe to coerce from a traversal function to a mapping or folding one.

-- Map over immediate children, with depth
mapExpD :: (Depth -> Exp -> Exp) -> (Depth -> Exp -> Exp)
mapExpD = coerce (traverseExpD :: (Depth -> Exp -> Identity Exp) -> (Depth -> Exp -> Identity Exp))
-- Map over immediate children mapExp :: (Exp -> Exp) -> (Exp -> Exp) mapExp = coerce (traverseExp :: (Exp -> Identity Exp) -> (Exp -> Identity Exp))
-- Fold over immediate children, with depth foldExpD :: forall a. Monoid a => (Depth -> Exp -> a) -> (Depth -> Exp -> a) foldExpD = coerce (traverseExpD :: (Depth -> Exp -> Const a Exp) -> (Depth -> Exp -> Const a Exp))
-- Fold over immediate children foldExp :: forall a. Monoid a => (Exp -> a) -> (Exp -> a) foldExp = coerce (traverseExp :: (Exp -> Const a Exp) -> (Exp -> Const a Exp))

After doing all this work, it is easy to answer questions like "how often is a variable used?"

varCount :: Depth -> Exp -> Sum Int
varCount i (Var j)
  | i == j   = Sum 1
varCount i x = foldExpD varCount i x

or "what is the set of all free variables?"

freeVars :: Depth -> Exp -> Set Int
freeVars d (Var i)
  | i < d     = Set.empty             -- bound variable
  | otherwise = Set.singleton (i - d) -- free variable
freeVars d x = foldExpD freeVars d x

Or to perform (silly) operations like changing all globals to lower case

lowerCase :: Exp -> Exp
lowerCase (Global x) = Global (map toLower x)
lowerCase x = mapExp lowerCase x

These functions follows a common pattern of specifying how a particular constructor, in this case Var or Global, is handled, while for all other constructors traversing over the child expressions.

As another example, consider substitution, a very important operation on syntax trees. In its most general form, we can combine substitution with raising expressions to a larger context (also called weakening). And we should also consider leaving the innermost, bound, variables alone. This means that there are three possibilities for what to do with a variable.

substRaiseByAt :: [Exp] -> Int -> Depth -> Exp -> Exp
substRaiseByAt ss r d (Var i)
  | i < d           = Var i -- A bound variable, leave it alone
  | i-d < length ss = raiseBy d (ss !! (i-d)) -- substitution
  | otherwise       = Var (i - length ss + r) -- free variable, raising
substRaiseByAt ss r d x = mapExpD (substRaiseByAt ss r) d x

Similarly to varCount, we use mapExpD to handle all constructors besides variables. Plain substitution and raising are just special cases.

-- Substitute the first few free variables, weaken the rest
substRaiseBy :: [Exp] -> Int -> Exp -> Exp
substRaiseBy ss r = substRaiseByAt ss r 0
raiseBy :: Int -> Exp -> Exp raiseBy r = substRaiseBy [] r
subst :: [Exp] -> Exp -> Exp subst ss = substRaiseBy ss 0
λ> raiseBy 2 (App (Var 1) (Var 2))
App (Var 3) (Var 4)
λ> subst [Global "x"] (App (Var 0) (Lam (Var 0))) App (Global "x") (Lam (Var 0))
λ> substRaiseBy [App (Global "x") (Var 0)] 2 $ App (Lam (App (Var 1) (Var 0))) (Var 2) App (Lam (App (App (Global "x") (Var 1)) (Var 0))) (Var 3)

As a slight generalization, it can also make sense to put traverseExpD into a type class. That way we can traverse over the subexpressions inside other data types. For instance, if the language uses a separate data type for case alternatives, we might write

data Exp
  = ...
  | Case [Alt]
data Alt = Alt Pat Exp
class TraverseExp a where traverseExpD :: Applicative f => (Depth -> Exp -> f Exp) -> (Depth -> a -> f a)
instance TraverseExp a => TraverseExp [a] where traverseExpD f d = traverse (traverseExpD f d)
instance TraverseExp Exp where traverseExpD f d ... traverseExpD f d (Case xs) = Case <$> traverseExpD f d xs
instance TraverseExp Alt where traverseExpD f d (Alt x y) = Alt x <$> traverseExpD f (d + varsBoundByPat x) y

Another variation is to track other things besides the number of bound variables. For example we might track the names and types of bound variables for better error messages. And with a type class it is possible to track different aspects of bindings as needed,

class Env env where
  extend :: VarBinding -> env -> env
instance Env Depth where extend _ = (+1)
instance Env [VarBinding] where extend = (:)
instance Env () where extend _ _ = ()
traverseExpEnv :: Applicative f => (env -> Exp -> f Exp) -> (env -> Exp -> f Exp) traverseExpEnv f env (Lam name x) = Lam <$> f (extend name env) x traverseExpEnv f env ...

Overall, I have found that after writing traverseExpD once, I rarely have to look at all constructors again. I can just handle the default cases by traversing the children.

A nice thing about this pattern is that it is very efficient. The traverseExpD function is not recursive, which means that the compiler can inline it. So after optimization, a function like lowerCase or varCount is exactly what you would have written by hand.

August 23, 2017 12:26 AM

Oliver Charles

Providing an API for extensible-effects and monad transformers

I was recently working on a small little project - a client API for the ListenBrainz project. Most of the details aren’t particularly interesting - it’s just a HTTP client library to a REST-like API with JSON. For the implementation, I let Servant and aeson do most of the heavy lifting, but I got stuck when considering what final API to give to my users.

Obviously, interacting with ListenBrainz requires some sort of IO so whatever API I will be offering has to live within some sort of monad. Currently, there are three major options:

  1. Supply an API targetting a concrete monad stack.

    Under this option, our API would have types such as

    submitListens :: ... -> M ()
    getListens :: ... -> M Listens

    where M is some particular monad (or monad transformer).

  2. Supply an API using type classes

    This is the mtl approach. Rather than choosing which monad my users have to work in, my API can be polymorphic over monads that support accessing the ListenBrainz API. This means my API is more like:

    submitListens :: MonadListenBrainz m => ... -> m ()
    getListens :: MonadListenBrainz m => ... -> m Listens
  3. Use an extensible effects framework.

    Extensible effects are a fairly new entry, that are something of a mix of the above options. We target a family of concrete monads - Eff - but the extensible effects framework lets our effect (querying ListenBrainz) seamlessly compose with other effects. Using freer-effects, our API would be:

    submitListens :: Member ListenBrainzAPICall effects => ... -> Eff effects ()
    getListens :: Member ListenBrainzAPICall effects => ... -> Eff effects Listens

So, which do we choose? Evaluating the options, I have some concerns.

For option one, we impose pain on all our users who want to use a different monad stack. It’s unlikely that you’re application is going to be written soley to query ListenBrainz, which means client code becomes littered with lift. You may write that off as syntatic, but there is another problem - we have committed to an interpretation strategy. Rather than describing API calls, my library now skips directly to prescribing how to run API calls. However, it’s entirely possible that you want to intercept these calls - maybe introducing a caching layer or additional logging. Your only option is to duplicate my API into your own project and wrap each function call and then change your program to use your API rather than mine. Essentially, the program itself is no longer a first class value that you can transform.

Extensible effects gives us a solution to both of the above. The use of the Member type class automatically reshuffles effects so that multiple effects can be combined without syntatic overhead, and we only commit to an interpretation strategy when we actually run the program. Eff is essentially a free monad, which captures the syntax tree of effects, rather than the result of their execution.

Sounds good, but extensible effects come with (at least) two problems that make me hesistant: they are experimental and esoteric, and it’s unclear that they are performant. By using only extensible effects, I am forcing an extensible effects framework on my users, and I’d rather not dictate that. Of course, extensible effects can be composed with traditional monad transformers, but I’ve still imposed an unnecessary burden on my users.

So, what do we do? Well, as Old El Paso has taught us: why don’t we have both?

It’s trivial to actually support both a monad transformer stack and extensible effects by using an mtl type class. As I argue in Monad transformers, free monads, mtl, laws and a new approach, I think the best pattern for an mtl class is to be a monad homomorphism from a program description, and often a free monad is a fine choice to lift:

class Monad m => MonadListenBrainz m where
  liftListenBrainz :: Free f a -> m a

But what about f? As observed earlier, extensible effects are basically free monads, so we can actually share the same implementation. For freer-effects, we might describe the ListenBrainz API with a GADT such as:

data ListenBrainzAPICall returns where
  GetListens :: ... -> ListenBrainzAPICall Listens
  SubmitListens :: ... -> ListenBrainzAPICall ()

However, this isn’t a functor - it’s just a normal data type. In order for Free f a to actually be a monad, we need f to be a functor. We could rewrite ListenBrainzAPICall into a functor, but it’s even easier to just fabricate a functor for free - and that’s exactly what Coyoneda will do. Thus our mtl type class becomes:

class Monad m => MonadListenBrainz m where
  liftListenBrainz :: Free (Coyoneda ListenBrainzAPICall) a -> m a 

We can now provide an implementation in terms of a monad transformer:

instance Monad m => MonadListenBrainz (ListenBrainzT m)
  liftListenBrainz f =
    iterM (join . lowerCoyoneda . hoistCoyoneda go)

      go :: ListenBrainzAPICall a -> ListenBrainzT m a

or extensible effects:

instance Member ListenBrainzAPICall effs => MonadListenBrainz (Eff effs) where
  liftListenBrainz f = iterM (join . lowerCoyoneda . hoistCoyoneda send) f 

or maybe directly to a free monad for later inspection:

instance MonadListenBrainz (Free (Coyoneda ListenBrainzAPICall)) where
  liftListenBrainz = id

For the actual implementation of performing the API call, I work with a concrete monad transformer stack:

performAPICall :: Manager -> ListenBrainzAPICall a -> IO (Either ServantError a)

which both my extensible effects “run” function calls, or the go function in the iterM call for ListenBrainzT’s MonadListenBrainz instance.

In conclusion, I’m able to offer my users a choice of either:

  • a traditional monad transformer approach, which doesn’t commit to a particular intepretation strategy by using an mtl type class
  • extensible effects

All without extra syntatic burden, a complicated type class, or duplicating the implementation.

You can see the final implemantion of listenbrainz-client here.

Bonus - what about the ReaderT pattern?

The ReaderT design pattern has been mentioned recently, so where does this fit in? There are two options if we wanted to follow this pattern:

  • We require a HTTP Manager in our environment, and commit to using this. This has all the problems of providing a concrete monad transformer stack - we are committing to an interpretation.
  • We require a family of functions that explain how to perform each API call. This kind of like a van Laarhoven free monad, or really just explicit dictionary passing. I don’t see this really gaining much on abstracting with type classes.

I don’t feel like the ReaderT design pattern offers anything that isn’t already dealt with above.

by Oliver Charles at August 23, 2017 12:00 AM

August 21, 2017

Manuel M T Chakravarty

Two months back, I gave my talk “Do-It-Yourself Functional...

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="225" mozallowfullscreen="mozallowfullscreen" src=";byline=0&amp;portrait=0" title="Do-It-Yourself Functional Reactive Programming" webkitallowfullscreen="webkitallowfullscreen" width="400"></iframe>

Two months back, I gave my talk “Do-It-Yourself Functional Reactive Programming” at the Sydney CocoaHeads meetup. I am explaining what FRP is all about, how to easily implement an FRP library in Swift, and how to use it in an iPhone app.

August 21, 2017 11:36 AM

Mark Jason Dominus

Recognizing when two arithmetic expressions are essentially the same

[ Warning: The math formatting in the RSS / Atom feed for this article is badly mutilated. I suggest you read the article on my blog. ]

In this article, I discuss “twenty-four puzzles”. The puzzle «4 6 7 9 ⇒ 24» means that one should take the numbers 4, 6, 7, and 9, and combine them with the usual arithmetic operations of addition, subtraction, multiplication, and division, to make the number 24. In this case the unique solution is .

When the target number after the is 24, as it often is, we omit it and just write «4 6 7 9». Every example in this article has target number 24.

This is a continuation of my previous articles on this topic:

My first cut at writing a solver for twenty-four puzzles was a straightforward search program. It had a couple of hacks in it to cut down the search space by recognizing that and are the same, but other than that there was nothing special about it and I've discussed it before.

It would quickly and accurately report whether any particular twenty-four puzzle was solvable, but as it turned out that wasn't quite good enough. The original motivation for the program was this: Toph and I play this game in the car. Pennsylvania license plates have three letters and four digits, and if we see a license plate FBV 2259 we try to solve «2 2 5 9». Sometimes we can't find a solution and then we wonder: it is because there isn't one, or is it because we just didn't get it yet? So the searcher turned into a phone app, which would tell us whether there was solution, so we'd know whether to give up or keep searching.

But this wasn't quite good enough either, because after we would find that first solution, say , we would wonder: are there any more? And here the program was useless: it would cheerfully report that there were three, so we would rack our brains to find another, fail, ask the program to tell us the answer, and discover to our disgust that the three solutions it had in mind were:

$$ 2 \cdot (5 + (9 - 2)) \\ 2 \cdot (9 + (5 - 2)) \\ 2 \cdot ((5 + 9) - 2) $$

The computer thinks these are different, because it uses different data structures to represent them. It represents them with an abstract syntax tree, which means that each expression is either a single constant, or is a structure comprising an operator and its two operand expressions—always exactly two. The computer understands the three expressions above as having these structures:

It's not hard to imagine that the computer could be taught to understand that the first two trees are equivalent. Getting it to recognize that the third one is also equivalent seems somewhat more difficult.

Commutativity and associativity

I would like the computer to understand that these three expressions should be considered “the same”. But what does “the same” mean? This problem is of a kind I particularly like: we want the computer to do something, but we're not exactly sure what that something is. Some questions are easy to ask but hard to answer, but this is the opposite: the real problem is to decide what question we want to ask. Fun!

Certainly some of the question should involve commutativity and associativity of addition and multiplication. If the only difference between two expressions is that one has where the other has , they should be considered the same; similarly is the same expression as and as and and so forth.

The «2 2 5 9» example above shows that commutativity and associativity are not limited to addition and multiplication. There are commutative and associative properties of subtraction also! For example, $$a+(b-c) = (a+b)-c$$ and $$(a+b)-c = (a-c)+b.$$ There ought to be names for these laws but as far as I know there aren't. (Sure, it's just commutativity and associativity of addition in disguise, but nobody explaining these laws to school kids ever seems to point out that subtraction can enter into it. They just observe that , say “subtraction isn't associative”, and leave it at that.)

Closely related to these identities are operator inversion identities like , , and their multiplicative analogues. I don't know names for these algebraic laws either.

One way to deal with all of this would to build a complicated comparison function for abstract syntax trees that tried to transform one tree into another by applying these identities. A better approach is to recognize that the data structure is over-specified. If we want the computer to understand that and are the same expression, we are swimming upstream by using a data structure that was specifically designed to capture the difference between these expressions.

Instead, I invented a data structure, called an Ezpr (“Ez-pur”), that can represent expressions, but in a somewhat more natural way than abstract syntax trees do, and in a way that makes commutativity and associativity transparent.

An Ezpr has a simplest form, called its “canonical” or “normal” form. Two Ezprs represent essentially the same mathematical expression if they have the same canonical form. To decide if two abstract syntax trees are the same, the computer converts them to Ezprs, simplifies them, and checks to see if resulting canonical forms are identical.

The Ezpr

Since associativity doesn't matter, we don't want to represent it. When we (humans) think about adding up a long column of numbers, we don't think about associativity because we don't add them pairwise. Instead we use an addition algorithm that adds them all at once in a big pile. We don't treat addition as a binary operation; we normally treat it as an operator that adds up the numbers in a list. The Ezpr makes this explicit: its addition operator is applied to a list of subexpressions, not to a pair. Both and are represented as the Ezpr

    SUM [ a b c - ]

which just says that we are adding up , , and . (The - sign is just punctuation; ignore it for now.)

Similarly the Ezpr MUL [ a b c ÷ ] represents the product of , , and . (Please ignore the ÷ sign for the time being.)

To handle commutativity, we want those [ a b c ] lists to be bags. Perl doesn't have a built-in bag object, so instead I used arrays and required that the array elements be in sorted order. (Exactly which sorted order doesn't really matter.)

Subtraction and division

This doesn't yet handle subtraction and division, and the way I chose to handle them is the only part of this that I think is at all clever. A SUM object has not one but two bags, one for the positive and one for the negative part of the expression. An expression like is represented by the Ezpr:

SUM [ a c - b d ]

and this is also the representation of , of , of , and of any other expression of the idea that we are adding up and and then deducting and . The - sign separates the terms that are added from those that are subtracted.

Either of the two bags may be empty, so for example is just SUM [ a b - ].

Division is handled similarly. Here conventional mathematical notation does a little bit better than in the sum case: MUL [ a c ÷ b d ] is usually written as .

Ezprs handle the associativity and commutativity of subtraction and division quite well. I pointed out earlier that subtraction has an associative law even though it's not usually called that. No code is required to understand that those two expressions are equal if they are represented as Ezprs, because they are represented by completely identical structures:

        SUM [ a b - c ]

Similarly there is a commutative law for subtraction: and once again that same Ezpr does for both.

Ezpr laws

Ezprs are more flexible than binary trees. A binary tree can represent the expressions and but not the expression . Ezprs can represent all three and it's easy to transform between them. Just as there are rules for building expressions out of simpler expressions, there are a few rules for combining and manipulating Ezprs.

Lifting and flattening

The most important transformation is lifting, which is the Ezpr version of the associative law. In the canonical form of an Ezpr, a SUM node may not have subexpressions that are also SUM nodes. If you have

  SUM [ a SUM [ b c - ] - … ]

you should lift the terms from the inner sum into the outer one:

  SUM [ a b c - … ]

effectively transforming into . More generally, in

   SUM [ a SUM [ b - c ]
       - d SUM [ e - f ] ]

we lift the terms from the inner Ezprs into the outer one:

   SUM [ a b f - c d e ]

This effectively transforms to .

Similarly, when a MUL node contains another MUL, we can flatten the structure.

Say we are converting the expression to an Ezpr. The conversion function is recursive and the naïve version computes this Ezpr:

      MUL [ 7 ÷ MUL [ 3 ÷ MUL [ 6 4 ÷ ] ] ]

But then at the bottom level we have a MUL inside a MUL, so the 4 and 6 in the innermost MUL are lifted upward:

      MUL [ 7 ÷ MUL [ 3 ÷ 6 4 ] ]

which represents . Then again we have a MUL inside a MUL, and again the subexpressions of the innermost MUL can be lifted:

      MUL [ 7 6 4 ÷ 3 ]

which we can imagine as .

The lifting only occurs when the sub-node has the same type as its parent; we may not lift terms out of a MUL into a SUM or vice versa.

Trivial nodes

The Ezpr SUM [ a - ] says we are adding up just one thing, , and so it can be eliminated and replaced with just . Similarly SUM [ - a ] can be replaced with the constant , if is a constant. MUL can be handled similarly.

An even simpler case is SUM [ - ] which can be replaced by the constant 0; MUL [ ÷ ] can be replaced with 1. These sometimes arise as a result of cancellation.


Consider the puzzle «3 3 4 6». My first solver found 49 solutions to this puzzle. One is . Another is . A third is .

I think these are all the same: the solution is to multiply the 4 by the 6, and to get rid of the threes by subtracting them to make a zero term. The zero term can be added onto the rest of expression or to any of its subexpressions—there are ten ways to do this—and it doesn't really matter where.

This is easily explained in terms of Ezprs: If the same subexpression appears in both of a node's bags, we can drop it. For example, the expression starts out as

    MUL [ 6 SUM [ 3 4 - 3 ] ÷ ]

but the duplicate threes in SUM [ 3 4 - 3 ] can be canceled, to leave

    MUL [ 6 SUM [ 4 - ] ÷ ]

The sum is now trivial, as described in the previous section, so can be eliminated and replaced with just 4:

    MUL [ 6 4 ÷ ]

This Ezpr records the essential feature of each of the three solutions to «3 3 4 6» that I mentioned: they all are multiplying the 6 by the 4, and then doing something else unimportant to get rid of the threes.

Another solution to the same puzzle is . Mathematically we would write this as and we can see this is just again, with the threes gotten rid of by multiplication and division, instead of by addition and subtraction. When converted to an Ezpr, this expression becomes:

    MUL [ 6 4 3 ÷ 3 ]

and the matching threes in the two bags are cancelled, again leaving

    MUL [ 6 4 ÷ ]

In fact there aren't 49 solutions to this puzzle. There is only one, with 49 trivial variations.

Identity elements

In the preceding example, many of the trivial variations on the solution involved multiplying some subexpression by . When one of the input numbers in the puzzle is a 1, one can similarly obtain a lot of useless variations by choosing where to multiply the 1.

Consider «1 3 3 5»: We can make 24 from . We then have to get rid of the 1, but we can do that by multiplying it onto any of the five subexpressions of :

$$ 1 × (3 × (3 + 5)) \\ (1 × 3) × (3 + 5) \\ 3 × (1 × (3 + 5)) \\ 3 × ((1 × 3) + 5) \\ 3 × (3 + (1×5)) $$

These should not be considered different solutions. Whenever we see any 1's in either of the bags of a MUL node, we should eliminate them. The first expression above, , is converted to the Ezpr

 MUL [ 1 3 SUM [ 3 5 - ] ÷ ]

but then the 1 is eliminated from the MUL node leaving

 MUL [ 3 SUM [ 3 5 - ] :- ]

The fourth expression, , is initially converted to the Ezpr

 MUL [ 3 SUM [ 5 MUL [ 1 3 ÷ ] - ] ÷ ]

When the 1 is eliminated from the inner MUL, this leaves a trivial MUL [ 3 ÷ ] which is then replaced with just 3, leaving:

 MUL [ 3 SUM [ 5 3 - ] ÷ ]

which is the same Ezpr as before.

Zero terms in the bags of a SUM node can similarly be dropped.

Multiplication by zero

One final case is that MUL [ 0 … ÷ … ] can just be simplified to 0.

The question about what to do when there is a zero in the denominator is a bit of a puzzle. In the presence of division by zero, some of our simplification rules are questionable. For example, when we have MUL [ a ÷ MUL [ b ÷ c ] ], the lifting rule says we can simplify this to MUL [ a c ÷ b ]—that is, that . This is correct, except that when or it may be nonsense, depending on what else is going on. But since zero denominators never arise in the solution of these puzzles, there is no issue in this application.


The Ezpr module is around 200 lines of Perl code, including everything: the function that converts abstract syntax trees to Ezprs, functions to convert Ezprs to various notations (both MUL [ 4 ÷ SUM [ 3 - 2 ] ] and 4 ÷ (3 - 2)), and the two versions of the normalization process described in the previous section. The normalizer itself is about 35 lines.

Associativity is taken care of by the Ezpr structure itself, and commutativity is not too difficult; as I mentioned, it would have been trivial if Perl had a built-in bag structure. I find it much easier to reason about transformations of Ezprs than abstract syntax trees. Many operations are much simpler; for example the negation of SUM [ A - B ] is simply SUM [ B - A ]. Pretty-printing is also easier because the Ezpr better captures the way we write and think about expressions.

It took me a while to get the normalization tuned properly, but the results have been quite successful, at least for this problem domain. The current puzzle-solving program reports the number of distinct solutions to each puzzle. When it reports two different solutions, they are really different; when it fails to support the exact solution that Toph or I found, it reports one essentially the same. (There are some small exceptions, which I will discuss below.)

Since there is no specification for “essentially the same” there is no hope of automated testing. But we have been using the app for several months looking for mistakes, and we have not found any. If the normalizer failed to recognize that two expressions were essentially similar, we would be very likely to notice: we would be solving some puzzle, be unable to find the last of the solutions that the program claimed to exist, and then when we gave up and saw what it was we would realize that it was essentially the same as one of the solutions we had found. I am pretty confident that there are no errors of this type, but see “Arguable points” below.

A harder error to detect is whether the computer has erroneously conflated two essentially dissimilar expressions. To detect this we would have to notice that an expression was missing from the computer's solution list. I am less confident that nothing like this has occurred, but as the months have gone by I feel better and better about it.

I consider the problem of “how many solutions does this puzzle really have to have?” been satisfactorily solved. There are some edge cases, but I think we have identified them.

Code for my solver is on Github. The Ezpr code is in the Ezpr package in the file. This code is all in the public domain.

Some examples

The original program claims to find 35 different solutions to «4 6 6 6». The revised program recognizes that these are of only two types:

MUL [ 4 6 - ]
MUL [ SUM [ 6 - 4 ] SUM [ 6 6 - ] ÷ ]

Some of the variant forms of the first of those include:

$$ 6 × (4 + (6 - 6)) \\ 6 + ((4 × 6) - 6) \\ (6 - 6) + (4 × 6) \\ (6 ÷ 6) × (4 × 6) \\ 6 ÷ ((6 ÷ 4) ÷ 6) \\ 6 ÷ (6 ÷ (4 × 6)) \\ 6 × (6 × (4 ÷ 6)) \\ (6 × 6) ÷ (6 ÷ 4) \\ 6 ÷ ((6 ÷ 6) ÷ 4) \\ 6 × (6 - (6 - 4)) \\ 6 × (6 ÷ (6 ÷ 4)) \\ \ldots

In an even more extreme case, the original program finds 80 distinct expressions that solve «1 1 4 6», all of which are trivial variations on .

Of the 715 puzzles, 466 (65%) have solutions; for 175 of these the solution is unique. There are 3 puzzles with 8 solutions each («2 2 4 8», «2 3 6 9», and «2 4 6 8»), one with 9 solutions («2 3 4 6»), and one with 10 solutions («2 4 4 8»).

The 10 solutions for «2 4 4 8» are as follows:

SUM [ MUL [ 4 8 ÷ ] - MUL [ 2 4 ÷ ] ]
MUL [ 4 SUM [ 2 8 - 4 ] ÷ ]
MUL [ SUM [ 8 - 4 ] SUM [ 2 4 - ] ÷ ]
MUL [ 4 SUM [ 4 8 - ] ÷ 2 ]
MUL [ SUM [ 4 - 2 ] SUM [ 4 8 - ] ÷ ]
MUL [ 8 SUM [ 1 2 - ] ÷ ]
SUM [ MUL [ 2 4 4 ÷ ] - 8 ]
SUM [ 8 MUL [ 2 SUM [ 4 4 - ] ÷ ] - ]
SUM [ 4 4 MUL [ 2 8 ÷ ] - ]
MUL [ 4 SUM [ 8 - MUL [ 4 ÷ 2 ] ] ÷ ]

A complete listing of every essentially different solution to every «a b c d» puzzle is available here. There are 1,063 solutions in all.

Arguable points

There are a few places where we have not completely pinned down what it means for two solutions to be essentially the same; I think there is room for genuine disagreement.

  1. Any solution involving can be changed into a slightly different solution involving instead. These expressions are arithmetically different but numerically equal. For example, I mentioned earlier that «2 2 4 8» has 8 solutions. But two of these are and . I am willing to accept these as essentially different. Toph, however, disagrees.

  2. A similar but more complex situation arises in connection with «1 2 3 7». Consider , which equals 24. To get a solution to «1 2 3 7», we can replace either of the threes in with , obtaining or . My program considers these to be different solutions. Toph is unsure.

It would be pretty easy to adjust the normalization process to handle these the other way if the user wanted that.

Some interesting puzzles

«1 2 7 7» has only one solution, quite unusual. (Spoiler) «2 2 6 7» has two solutions, both somewhat unusual. (Spoiler)

Somewhat similar to «1 2 7 7» is «3 9 9 9» which also has an unusual solution. But it has two other solutions that are less surprising. (Spoiler)

«1 3 8 9» has an easy solution but also a quite tricky solution. (Spoiler)

One of my neighbors has the license plate JJZ 4631. «4 6 3 1» is one of the more difficult puzzles.

What took so long?

Back in March, I wrote:

I have enough material for at least three or four more articles about this that I hope to publish here in the coming weeks.

But the previous article on this subject ended similarly, saying

I hope to write a longer article about solvers in the next week or so.

and that was in July 2016, so don't hold your breath.

And here we are, five months later!

This article was a huge pain to write. Sometimes I sit down to write something and all that comes out is dreck. I sat down to write this one at least three or four times and it never worked. The tortured Git history bears witness. In the end I had to abandon all my earlier drafts and start over from scratch, writing a fresh outline in an empty file.

But perseverance paid off! WOOOOO.

[ Addendum 20170825: I completely forgot that Shreevatsa R. wrote a very interesting article on the same topic as this one, in July of last year soon after I published my first article in this series. ]

[ Addendum 20170829: A previous version of this article used the notations SUM [ … # … ] and MUL [ … # … ], which I said I didn't like. Zellyn Hunter has persuaded me to replace these with SUM [ … - … ] and MUL [ … ÷ … ]. Thank you M. Hunter! ]

[ Yet more on this topic! ]

by Mark Dominus ( at August 21, 2017 01:53 AM

August 20, 2017

Neil Mitchell

Ghcid and VS Code

Summary: There's now a Ghcid VS Code addin that gives you red squiggles.

I've been using Ghcid for about 3 years, and VS Code for about 6 months. Ghcid alone is able to display the errors and warnings, and update them whenever you save. In this post I'll show how VS Code users can also click on errors to jump to them, and how to get red squiggles in the text buffer for errors.

Clicking on errors

Using a recent VS Code, if you run ghcid from the terminal window, hold Ctrl and click the filename and it jumps to the right location in the erroneous file.

Red squiggles

Red squiggles are now possible using the haskell-ghcid addin. To get it working:

  • Run ghcid -o ghcid.txt which will produce a file ghcid.txt which updates every time ghcid updates. Running the underlying ghci with -ferror-spans will significantly improve the errors reported.
  • Open ghcid.txt in VS Code as the active editor. Run the VS Code command (Ctrl+Shift+P) named "Watch Ghcid output".

These steps cause the ghcid errors to appear in the VS Code Problems pane, and have red squiggles in the editor. Even though the errors are in the proper problems pane, I still prefer the output provided by the ghcid terminal, so still look at that.

The VS Code addin is not very well polished - but I'm using it on a daily basis.

by Neil Mitchell ( at August 20, 2017 08:18 PM

Roman Cheplyaka

Understanding Asymmetric Numeral Systems

Apparently, Google is trying to patent (an application of) Asymmetric Numeral Systems, so I spent some time today learning what it is.

In its essense lies a simple and beautiful idea.

<figure> </figure>

ANS is a lossless compression algorithm. Its input is a list of symbols from some finite set. Its output is a positive integer. Each symbol \(s\) has a fixed known probability \(p_s\) of occurring in the list. The algorithm tries to assign each list a unique integer so that the more probable lists get smaller integers.

If we ignore the compression part (assigning smaller integers to more probable inputs), the encoding could be done as follows: convert each symbol to a number from \(0\) to \(B-1\) (where \(B\) is the number of symbols), add a leading 1 to avoid ambiguities caused by leading zeros, and interpret the list as an integer written in a base-\(B\) positional system.

This encoding process is an iterative/recursive algorithm:

  1. Start with the number 1;
  2. If the current number is \(n\), and the incoming symbol correspond to a number \(s\), update the number to be \(s + n\cdot B\).

The decoding process is a corecursive algorithm:

  1. Start with the number that we are decoding;
  2. Split the current number \(n\) into the quotient and remainder modulo \(B\);
  3. Emit the remainder and continue decoding the quotient;
  4. Stop when the current number reaches 1.

(The decoding is LIFO: the first decoded element will be the last encoded element.)

This encoding scheme relies on the standard isomorphism between the sets \(\{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1}\) and \(\mathbb{Z}_{\geq B}\), established by the functions

\[f(s,n) = s + n\cdot B;\] \[g(n) = (n \bmod B, [n/B]).\]

(The peculiar domain and codomain of this isomorphism are chosen so that we have \(\forall n,s.\;f(s,n) > n\); this ensures that the decoding process doesn’t get stuck.)

We can represent this in Haskell as

{-# LANGUAGE ScopedTypeVariables, TypeApplications,
             NamedFieldPuns, AllowAmbiguousTypes #-}

import Data.Ord
import Data.List
import Numeric.Natural

data Iso a b = Iso
  { to :: a -> b
  , from :: b -> a

encode :: Iso (s, Natural) Natural -> [s] -> Natural
encode Iso{to} = foldl' (\acc s -> to (s, acc)) 1

decode :: Iso (s, Natural) Natural -> Natural -> [s]
decode Iso{from} = reverse . unfoldr
  (\n ->
    if n == 1
      then Nothing
      else Just $ from n

And the standard isomorphism which we used in the simple encoding process is

std_iso :: forall s . (Bounded s, Enum s) => Iso (s, Natural) Natural
std_iso = Iso (\(s,n) -> s2n s + base @s * n) (\n -> (n2s $ n `mod` base @s, n `div` base @s))

s2n :: forall s . (Bounded s, Enum s) => s -> Natural
s2n s = fromIntegral $
  ((fromIntegral . fromEnum) s             :: Integer) -
  ((fromIntegral . fromEnum) (minBound @s) :: Integer)

n2s :: forall s . (Bounded s, Enum s) => Natural -> s
n2s n = toEnum . fromIntegral $
  (fromIntegral n + (fromIntegral . fromEnum) (minBound @s) :: Integer)

base :: forall s . (Bounded s, Enum s) => Natural
base = s2n (maxBound @s) + 1

(The functions are more complicated than they have to be to support symbol types like Int. Int does not start at 0 and is prone to overflow.)

Let’s now turn to the general form of the isomorphism

\[f \colon \{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1} \to \mathbb{Z}_{\geq \beta};\] \[g \colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\times \mathbb{Z}_{\geq 1}.\]

(In general, \(\beta\), the smallest value of \(f\), does not have to equal \(B\), the number of symbols.)

If we know (or postulate) that the second component of \(g\), \(g_2\colon \mathbb{Z}_{\geq \beta} \to \mathbb{Z}_{\geq 1}\), is increasing, then we can recover it from the first component, \(g_1\colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\).

Indeed, for a given \(s=g_1(n)\), \(g_2\) must be the unique increasing isomorphism from \[A_s = \{f(s,m)\mid m\in\mathbb{Z}_{\geq 1}\} = \{n\mid n\in\mathbb{Z}_{\geq \beta}, g_1(n) = s\}\] to \(\mathbb{Z}_{\geq 1}\). To find \(g_2(n)\), count the number of elements in \(A_s\) that are \(\leq n\).

Similarly, we can recover \(f\) from \(g_1\). To compute \(f(s,n)\), take \(n\)th smallest number in \(A_s\).

In Haskell:

ans_iso :: forall s . Eq s => (Natural, Natural -> s) -> Iso (s, Natural) Natural
ans_iso (b, classify) = Iso{to, from} where
  to :: (s, Natural) -> Natural
  to (s, n) = [ k | k <- [b..], classify k == s ] `genericIndex` (n-1)

  from :: Natural -> (s, Natural)
  from n =
    let s = classify n
        n' = genericLength [ () | k <- [b..n], classify k == s ]
    in (s, n')

For every function \(g_1\colon \mathbb{Z}_{\geq \beta} \to \{0,\ldots,B-1\}\) (named classify in Haskell), we have a pair of encode/decode functions, provided that each of the sets \(A_s\) is infinite. In particular, we can get the standard encode/decode functions (originally defined by std_iso) by setting classify to

classify_mod_base :: forall s . (Bounded s, Enum s) => (Natural, Natural -> s)
classify_mod_base = (base @s, \n -> n2s (n `mod` base @s))

By varying \(g_1\) (and therefore the sets \(A_s\)), we can control which inputs get mapped to smaller integers.

If \(A_s\) is more dense, \(f(s,n)\), defined as \(n\)th smallest number in \(A_s\), will be smaller.

If \(A_s\) is more sparse, \(f(s,n)\) will be larger.

The standard isomorphism makes the sets \[A_s = \{ s+n\cdot B \mid n\in \mathbb Z_{\geq 1} \} \] equally dense for all values of \(s\). This makes sense when all \(s\) are equally probable.

But in general, we should make \(A_s\) denser for those \(s\) that are more frequent. Specifically, we want

\[ \frac{|\{k\in A_s \mid k \leq x\}|}{x} \approx p_s. \]

Substituting \(x=f(s,n)\) then gives \(\log_2 f(s,n) \approx \log_2 n + \log_2 (1/p_s)\). This means that adding a symbol \(s\) costs \(\log_2 (1/p_s)\) bits, which is what we should strive for.

Here’s a simple example of a suitable \(g_1\):

classify_prob :: Show s => (Bounded s, Enum s) => [Double] -> (Natural, Natural -> s)
classify_prob probs =
  let beta = 2 -- arbitrary number > 1
      t = genericLength l
      l = concatMap (\(s, t) -> replicate t s)
        . sortBy (comparing (Down . snd))
        . zip [minBound..maxBound]
        $ map (round . (/ minimum probs)) probs
      g1 n = l `genericIndex` ((n-beta) `mod` t)
  in (beta, g1)

This is a periodic function. It computes the number of times each symbol \(s\) will appear within a single period as \(k_s=\mathrm{round}(p_s/\min \{p_s\})\). The number \(p_s/\min \{p_s\}\) is chosen for its following two properties:

  1. it is proportional to the probability of the symbol, \(p_s\);
  2. it is \(\geq 1\), so that even the least likely symbol occurs among the values of the function.

The function then works by mapping the first \(k_0\) numbers to symbol \(0\), the next \(k_1\) numbers to symbol \(1\), and so on, until it maps \(k_{B-1}\) numbers to symbol \(B-1\) and repeats itself. The period of the function is \(\sum_s k_s\approx 1/\min \{p_s\}\).

classify_prob rearranges the symbols in the order of decreasing probability, which gives further advantage to the more probable symbols. This is probably the best strategy if we want to allocate integers in blocks; a better way would be to interleave the blocks in a fair or random way in order to keep the densities more uniform.

Another downside of this function is that its period may be too small to distinguish between similar probabilities, such as 0.4 and 0.6. The function used in rANS is better in this regard; it uses progressively larger intervals, which provide progressively better approximations.

But classify_prob is enough to demonstate the idea. Let’s encode a list of booleans where True is expected 90% of time.

> iso = ans_iso $ classify_prob [0.1,0.9]
> encode iso (replicate 4 True)
> encode iso (replicate 4 False)

Four Trues compress much better than four Falses. Let’s also compare the number of bits in 11111 with the number of bits that the information theory predicts are needed to encode four events with probability 0.1:

> logBase 2 11111
> 4 * logBase 2 (1/0.1)

Not bad.

The implementation of ANS in this article is terribly inefficient, especially its decoding part, mostly because the isomorphism uses brute force search instead of computation. The intention is to elucidate what the encoding scheme looks like and where it comes from. An efficient implementation of ANS and its different variants is an interesting topic in itself, but I’ll leave it for another day.

The full code (including tests) is available here.

Thanks to /u/sgraf812 for pointing out a mistake in a previous version of classify_prob.

August 20, 2017 08:00 PM

Joachim Breitner

Compose Conference talk video online

Three months ago, I gave a talk at the Compose::Conference in New York about how Chris Smith and I added the ability to create networked multi-user programs to the educational Haskell programming environment CodeWorld, and finally the recording of the talk is available on YouTube (and is being discussed on reddit):

<iframe frameborder="0" height="360" src="" style="display: block; margin-left: auto; margin-right: auto" width="640"> </iframe>

It was the talk where I got the most positive feedback afterwards, and I think this is partly due to how I created the presentation: Instead of showing static slides, I programmed the complete visual display from scratch as an “interaction” within the CodeWorld environment, including all transitions, an working embedded game of Pong and a simulated multi-player environments with adjustable message delays. I have put the code for the presentation online.

Chris and I have written about this for ICFP'17, and thanks to open access I can actually share the paper freely with you and under a CC license. If you come to Oxford you can see me perform a shorter version of this talk again.

by Joachim Breitner ( at August 20, 2017 06:50 PM

Sandy Maguire

Review: Information Effects

<article> <header>

Review: Information Effects


<time>August 20, 2017</time> papers, review, james, sabry, haskell, reversible computing

One of the most exciting papers I’ve read in a long time is James and Sabry’s Information Effects. It starts with the hook “computation is a physical process which, like all other physical processes, is fundamentally reversible,” and it goes from there. If that doesn’t immediately pull you in, perhaps some of the subsequent PL jargon will – it promises a “typed, universal, and reversible computation model in which information is treated as a linear resource”.

I don’t know about you, but I was positively shaking with anticipation at this point. That’s one heck of an abstract.

After some philosophy and overview of the paper, James and Sabry dive into the appetizer in a section titled “Thermodynamics of Computation and Information”. They give the following definition:

DEFINITION 2.2 (Entropy of a variable). Let \(b\) be a (not necessarily finite) type whose values are labeled \(b_1\), \(b_2\), \(\ldots\). Let \(\xi\) be a random variable of type \(b\) that is equal to \(b_i\) with probability \(p_i\). The entropy of \(\xi\) is defined as \(- \sum p_i \log{p_i}\).

and the following, arguably less inspired definition:

DEFINITION 2.3 (Output entropy of a function). Consider a function f : a -> b where b is a (not necessarily finite) type whose values are labeled \(b_1\), \(b_2\), \(\ldots\). The output entropy of the function is given by \(- \sum q_j \log{q_j}\) where \(q_j\) indicates the probability of the function to have value \(b_j\).

We can say now that a function is reversible if and only if the entropy of its arguments is equal to the entropy of its output. Which is to say that the gain in entropy across the function is 0.

Of course, as astute students of mathematics we know that reversibility of a function is equivalent to whether that function is an isomorphism. While this is how we will prefer to think of reversibility, the definition in terms of entropy brings up interesting questions of pragmatics that we will get to later.

James et al. present the following language, which we have reproduced here translated into Haskell. The language is first order, and so we will ignore function types, giving us the types:

{-# LANGUAGE TypeOperators     #-}
{-# LANGUAGE NoImplicitPrelude #-}

-- Corresponds to Haskell '()' type
data U = U

-- Corresponds to Haskell 'Either' type
data a + b
  = InL a
  | InR a

-- Corresponds to Haskell '(,)' type
data a * b = Pair a b

The language presented is based around the notion of type isomorphisms, and so in order to model this language in Haskell, we’ll need the following types:

data a <=> b = Iso
  { run :: a -> b
  , rev :: b -> a

This type a <=> b represents an isomorphism between type a and type b, as witnessed by a pair of functions to and from. This probably isn’t the best encoding of an isomorphism, but for our purposes it will be sufficient.

James and Sabry present the following axioms of their language:

swapP   ::       a + b <=> b + a
assocP  :: a + (b + c) <=> (a + b) + c
unite   ::       U * a <=> a
swapT   ::       a * b <=> b * a
assocT  :: a * (b * c) <=> (a * b) * c
distrib :: (a + b) * c <=> (a * c) + (b * c)
id      ::           a <=> a

The implementations of these terms are all trivial, being that they are purely syntactic isomorphisms. They will not be reproduced here, but can be found in the code accompanying this post. The motivated reader is encouraged to implement these for themself.

With the terms of our algebra out of the way, we’re now ready for the operators. We are presented with the following:

-- Isomorphisms are symmetric.
sym :: (a <=> b) -> (b <=> a)

-- Isomorphisms are transitive.
(>>) :: (a <=> b) -> (b <=> c) -> (a <=> c)

-- Products and coproducts are bifunctors.
(.+) :: (a <=> c) -> (b <=> d) -> (a + b <=> c + d)
(.*) :: (a <=> c) -> (b <=> d) -> (a * b <=> c * d)

It turns out that the resulting language is already surprisingly expressive. We can encode booleans:

type Bool = U + U

true, false :: Bool
true  = InL U
false = InR U

With these out of the way, James et al. show us a “one-armed if-expression”: if the incoming Bool is true, transform the a by the provided combinator:

ifthen :: (a <=> a) -> (Bool * a <=> Bool * a)
ifthen c = distrib >> (id .* c) .+ id >> sym distrib

For syntactic convenience, we will enable rebindable syntax, allowing us to represent these chains of isomorphism transitivities with do notation. We can thus express the above more clearly as:

{-# LANGUAGE RebindableSyntax #-}

ifthen :: (a <=> a) -> (Bool * a <=> Bool * a)
ifthen c = do
  (id .* c) .+ id
  sym distrib

The mystery behind naming our transitivity operator (>>) is thus explained.

But how does our ifthen combinator actually work? Recall that Bool = U + U, meaning that we can distribute the Us across the pair, giving us the type (U * a) + (U * a). The left branch (of type U * a) of this coproduct has an inhabitant if the incoming boolean was true.

We can thus bimap over the coproduct. Since the left case corresponds to an incoming true, we can apply an isomorphism over only that branch. Because we want to transform the incoming a by the combinator c, we then bimap over our U * a with id .* c – not touching the U but using our combinator.

Finally, we need to repackage our (U * a) + (U * a) into the correct return type Bool * a, which we can do by factoring out the a. Factoring is the inverse of distributing, and so we can use the sym operator to “undo” the distrib.

It’s crazy, but it actually works! We can run these things to convince ourselves. Given:

not :: Bool <=> Bool
not = swapP  -- move a left ('true') to a right ('false'), and vice versa.

We get:

> run (ifthen not) $ Pair true false
Pair true true

> run (ifthen not) $ Pair false false
Pair false false

Neat, no? For fun, we can also run these things backwards:

> rev (ifthen not) $ Pair true true
Pair true false

> rev (ifthen not) $ Pair false false
Pair false false

James et al. are eager to point out that ifthen (ifthen not) :: Bool * (Bool * Bool) <=> Bool * (Bool * Bool) is the Toffoli gate – a known universal reversible gate. Because we can implement Toffoli (and due to its universalness), we can thus implement any boolean expression.

Recursion and Natural Numbers

Given two more primitives, James and Sabry show us how we can extend this language to be “expressive enough to write arbitrary looping programs, including non-terminating ones.”

We’ll need to define a term-level recursion axiom:

trace :: (a + b <=> a + c) -> (b <=> c)

The semantics of trace are as follows: given an incoming b (or, symmetrically, a c), lift it into InR b :: a + b, and then run the given iso over it looping until the result is an InR c, which can then be returned.

Notice here that we have introduced potentially non-terminating looping. Combined with our universal boolean expressiveness, this language is now Turing-complete, meaning it is capable of computing anything computable. Furthermore, by construction, we also have the capability to compute backwards – given an output, we can see what the original input was.

You might be concerned that the potential for partiality given by the trace operator breaks the bijection upon which all of our reversibility has been based. This, we are assured is not a problem, because a divergence is never actually observed, and as such, does not technically violate the bijectiveness. It’s fine, you guys. Don’t worry.

There is one final addition we need, which is the ability to represent inductive types:

data Fix f = Fix { unFix :: f (Fix f) }

fold :: f (Fix f) <=> Fix f
fold = Iso Fix unFix

Given these things, we can define the natural numbers a little circuitously. We can define their type as follows:

type Nat = Fix ((+) U)

Constructing such things is a little tricky, however. First we’ll need a way to introduce a coproduct. The type and name of this isomorphism should be suggestive:

just :: a <=> U + a
just = trace $ do
  sym assocP
  (sym fold >> swapP) .+ id

just is a tricky little beast; it works by using trace to eliminate the Nat + U of a (Nat + U) + (U + a). We can follow the derivation a little more closely:

body :: (Nat + U) + a <=> (Nat + U) + (U + a)
body = do
  sym assocP      -- Nat + (U + a)
  sym fold .+ id  -- (U + Nat) + (U + a)
  swapP    .+ id  -- (Nat + U) + (U + a)

trace body :: a <=> U + a

I wish I had come up with this, because it’s quite clever. Notice however that this is a partial isomorphism; when run backwards, it will diverge in the case of InR U :: U + a.

Given just, we can now define succ:

succ :: Nat <=> Nat
succ = do
  just  -- U + Nat
  fold  -- Nat

James et al. provide a little more machinery in order to get to the introduction of a 0:

injectR :: a <=> a + a
injectR = do
  sym unite       -- U * a
  just .* id      -- (U + U) * a
  distrib         -- (U * a) + (U * a)
  unite .+ unite  -- a + a

and finally:

zero :: U <=> Nat
zero = trace $ do
  id       -- Nat + U
  swapP    -- U + Nat
  fold     -- Nat
  injectR  -- Nat + Nat

What’s interesting here is that the introduction of 0 is an isomorphism between U and Nat, as we should expect since 0 is a constant.

Induction on Nats

The paper teases an implementation of isEven for natural numbers – from the text:

For example, it is possible to write a function even? :: Nat * Bool <=> Nat * Bool which, given inputs (n, b), reveals whether n is even or odd by iterating not n-times starting with b. The iteration is realized using trace as shown in the diagram below (where we have omitted the boxes for fold and unfold).

Emphasis mine. The omitted fold and unfold bits of the diagram are the actual workhorses of the isomorphism, and their omission caused me a few days of work to rediscover. I have presented the working example here to save you, gentle reader, from the same frustration.

The insight is this – our desired isomorphism has type Nat * a <=> Nat * a. Due to its universally qualified nature, we are unable to pack any information into the a, and thus to be reversible, the Nat must be the same on both sides. Since we are unable to clone arbitrary values given our axioms (seriously! try it!), our only solution is to build a resulting Nat up from 0 as we tear apart the one we were given.

We can view the a in trace :: (a + b <=> a + c) -> (b <=> c) as “scratch space” or “intermediate state”. It is clear that in order to execute upon our earlier insight, we will need three separate pieces of state: the Nat we’re tearing down, the Nat we’re building up, and the a along for the ride.

For reasons I don’t deeply understand, other than it happened to make the derivation work, we also need to introduce a unit to the input of our traced combinator.

With this line of reasoning, we have the following:

iterNat :: (a <=> a) -> (Nat * a <=> Nat * a)
iterNat step = do
  sym unite
  trace $ do
    id  -- (Nat' * (Nat * a)) + (U * (Nat * a))

For clarity, we’ll annotate the natural number under construction as Nat'.

When the iteration begins, our combinator receives an InR whose contents are of type U * (Nat * a) corresponding to the fact that there is not yet any internal state. From there we can factor our the Nat * a:

  trace $ do
    id           -- (Nat' * (Nat * a)) + (U * (Nat * a))
    sym distrib  -- (Nat' + U) * (Nat * a)

All of a sudden this looks like a more tenable problem. We now have a product of (conceptually) a Maybe Nat', the Nat being torn down, and our a. We can fold :: U + Nat <=> Nat our Nat', which will give us 0 in the case that the state hasn’t yet been created, or \(n+1\) in the case it has.

  trace $ do
    id                     -- (Nat' * (Nat * a)) + (U * (Nat * a))
    sym distrib            -- (Nat' + U) * (Nat * a)
    (swapP >> fold) .* id  -- Nat' * (Nat * a)

The only thing left is to destruct the incoming Nat and apply our step isomorphism. We introduce a lemma to help:

swapBacT :: a * (b * c) <=> b * (a * c)
swapBacT = do
  swapT .* id
  sym assocT

which we can then use to move the pieces of our state and destruct the correct number:

  trace $ do
    id                         -- (Nat' * (Nat * a)) + (U * (Nat * a))
    sym distrib                -- (Nat' + U) * (Nat * a)
    (swapP >> fold) .* id      -- Nat' * (Nat * a)
    swapBacT                   -- Nat * (Nat' * a)
    (sym fold >> swapP) .* id  -- (Nat + U) * (Nat' * a)

We can then distribute out the Nat + U again:

  trace $ do
    id                         -- (Nat' * (Nat * a)) + (U * (Nat * a))
    sym distrib                -- (Nat' + U) * (Nat * a)
    (swapP >> fold) .* id      -- Nat' * (Nat * a)
    swapBacT                   -- Nat * (Nat' * a)
    (sym fold >> swapP) .* id  -- (Nat + U) * (Nat' * a)
    distrib                    -- (Nat * (Nat' * a)) + (U * (Nat' * a))

And finally, we apply our step iso to the internal state (we do this after the distrib so that we don’t apply the combinator if the incoming number was 0). The fruits of our labor are presented in entirety:

iterNat :: (a <=> a) -> (Nat * a <=> Nat * a)
iterNat step = do
  sym unite
  trace $ do
    id                          -- (Nat' * (Nat * a)) + (U * (Nat * a))
    sym distrib                 -- (Nat' + U) * (Nat * a)
    (swapP >> fold) .* id       -- Nat' * (Nat * a)
    swapBacT                    -- Nat * (Nat' * a)
    (sym fold >> swapP) .* id   -- (Nat + U) * (Nat' * a)
    distrib                     -- (Nat * (Nat' * a)) + (U * (Nat' * a))
    (id .* (id .* step)) .+ id  -- (Nat * (Nat' * a)) + (U * (Nat' * a))
    swapBacT .+ id              -- (Nat' * (Nat * a)) + (U * (Nat' * a))

Lo and behold, the types now line up, and thus quod erat demonstrandum. The implementation of isEven is now trivial:

isEven :: Nat * Bool <=> Nat * Bool
isEven = iterNat not

which computes if a Nat is even in the case the incoming Bool is false, and whether it is odd otherwise.


James and Sabry provide a sketch of how to define lists, but I wanted to flesh out the implementation to test my understanding.

For reasons I don’t pretend to understand, Haskell won’t let us partially apply a type synonym, so we’re forced to write a higher-kinded data definition in order to describe the shape of a list.

-- To be read as @type ListF a b = U + (a * b)@.
data ListF a b
  = Nil
  | Cons a b

We can then get the fixpoint of this in order to derive a real list:

type List a = Fix (ListF a)

And to get around the fact that we had to introduce a wrapper datatype in order to embed this into Haskell, we then provide an eliminator to perform “pattern matching” on a List a. In a perfect world, this function would just be sym fold, but alas, we must work with what we have.

liste :: List a <=> U + (a * List a)
liste = Iso to from
    to (Fix Nil)          = InL U
    to (Fix (Cons a b))   = InR (Pair a b)
    from (InL U)          = Fix Nil
    from (InR (Pair a b)) = Fix (Cons a b)

From here, it is trivial to write cons:

cons :: a * List a <=> List a
cons = do
  just       -- U + (a * List a)
  sym liste  -- List

However, introducing a list via nil is actually quite complicated. Note the parallels with the natural numbers, where it was trivial to define succ but required a clever trick to introduce a zero.

We begin with a lemma that moves a coproduct:

swapCbaP :: (a + b) + c <=> (c + b) + a
swapCbaP = do
  sym assocP   -- a + (b + c)
  swapP        -- (b + c) + a
  swapP .+ id  -- (c + b) + a

And given that, we can write an isomorphism between any a and any b. The catch, of course, is that you can never witness such a thing since it obviously doesn’t exist. Nevertheless, we can use it to convince the type checker that we’re doing the right thing in cases that would diverge in any case.

diverge :: a <=> b
diverge = trace $ do
  id                 -- (a + b) + a
  swapP .+ id        -- (b + a) + a
  swapCbaP           -- (a + a) + b
  sym injectR .+ id  -- a + b
  swapP              -- b + a
  right .+ id        -- (b + b) + a
  swapCbaP           -- (a + b) + b

Finally we can implement nil using the same trick we did for zero – use trace to vacuously introduce exactly the type we need, rip out the result, and then divergently reconstruct the type that trace expects.

nil :: U <=> List a
nil = trace $ do
  id                        -- (a * List a) + U
  swapP                     -- U + (a * List a)
  sym liste                 -- List a
  sym unite                 -- U * List a
  just .* id                -- (U + U) * List a
  distrib                   -- (U * List a) + (U * List a)
  (diverge .* id) .+ unite  -- (a * List a) + List a

Induction on Lists

In a manner spiritually similar to iterNat, we can define iterList :: (a * z <=> b * z) -> (List a * z <=> List b * z). The semantics are mostly what you’d expect from its type, except that the resulting List b is in reverse order due to having to be constructed as the List a was being destructed. We present the implementation here for completeness but without further commentary.

iterList :: (a * z <=> b * z) -> (List a * z <=> List b * z)
iterList f = do
  sym unite
  trace $ do
                                -- ((b * List b) * (List a * z)) + (U * (List a * z))
    sym distrib                 -- ((b * List b) + U) * (List a * z)
    (swapP >> sym liste) .* id  -- List b * (List a * z)
    swapBacT                    -- List a * (List b * z)
    liste .* id                 -- (U + (a * List a)) * (List b * z)
    distrib                     -- (U * (List b * z)) + ((a * List a) * (List b * z))
    (.+) id $                   -- (U * (List b * z)) + ...
        swapT .* id             --    ((List a * a) * (List b * z))
        swapAcbdT               --    ((List a * List b) * (a * z))
        id .* f                 --    ((List a * List b) * (b * z))
        swapAcbdT               --    ((List a * b) * (List b * z))

    swapP                       -- ((List a * b) * (List b * z)) + (U * (List b * z))
    (swapT .* id) .+ id         -- ((b * List a) * (List b * z)) + (U * (List b * z))
    swapAcbdT .+ id             -- ((b * List b) * (List a * z)) + (U * (List b * z))

swapAcbdT :: (a * b) * (c * d) <=> (a * c) * (b * d)
swapAcbdT = do
  sym assocT
  id .* sw

From here, the functional programming favorite map is easily defined:

map :: (a <=> b) -> (List a <=> List b)
map f = do
  sym unite
  iterList $ f .* id  -- map
  iterList id         -- reverse to original order


The bulk of the remainder of the paper is an extension to the reversible semantics above, introducing create :: U ~> a and erase :: a ~> U where (~>) is a non-reversible arrow. We are shown how traditional non-reversible languages can be transformed into the (~>)-language.

Of more interest is James and Sabry’s construction which in general transforms (~>) (a non-reversible language) into (<=>) (a reversible one). But how can such a thing be possible? Obviously there is a trick!

The trick is this: given a ~> b, we can build h * a <=> g * b where h is “heap” space, and g is “garbage”. Our non-reversible functions create and erase thus become reversible functions which move data from the heap and to the garbage respectively.

Unfortunately, this is a difficult thing to model in Haskell, since the construction requires h and g to vary based on the axioms used. Such a thing requires dependent types, which, while possible, is quite an unpleasant undertaking. Trust me, I actually tried it.

However, just because it’s hard to model entirely in Haskell doesn’t mean we can’t discuss it. We can start with the construction of (~>):


data a ~> b where
  Arr     :: (a <=> b) -> (a ~> b)
  Compose :: (a ~> b) -> (b ~> c) -> (a ~> c)
  First   :: (a ~> b) -> (a * c ~> b * c)
  Left    :: (a ~> b) -> (a + c ~> b + c)
  Create  :: U ~> a
  Erase   :: a ~> U

The axioms here are quite explanatory and will not be discussed further. A point of note, however, is that Arr allows arbitrary embeddings of our iso (<=>) language in this arrow language.

The semantics of Create is given by induction:

\[ \newcommand{\u}{\text{U}} \begin{align*} \text{create U} & \mapsto \u \\ \text{create}(a + b) & \mapsto \text{InL } (\text{create } a) \\ \text{create}(a \times b) & \mapsto (\text{create } a, \text{create } b) \end{align*} \]

With the ability to create and erase information, we’re (thankfully) now able to write some everyday functions that you never knew you missed until trying to program in the iso language without them. James et al. give us what we want:

fstA :: a * b ~> a
fstA = do
  arr swapT    -- b * a
  first erase  -- U * a
  arr unite    -- a

In addition to projections, we also get injections:

leftA :: a ~> a + b
leftA = do
  arr $ sym unite  -- U * a
  first create     -- (a + b) * a
  arr leftSwap     -- (a + b) * a
  fstA             -- a + b

leftSwap :: (a + b) * a <=> (a + b) * a
leftSwap = do
  distrib      -- (a * a') + (b * a')
  swapT .+ id  -- (a' * a) + (b * a')
  sym distrib  -- (a' + b) * a

And the ability to extract from a coproduct:

join :: a + a ~> a
join = do
  arr $ do
    sym unite .+ sym unite  -- (U * a) + (U * a)
    sym distrib             -- (U + U) * a
    swapT                   -- a * (U + U)
  fstA                      -- a

We are also provided with the ability to clone a piece of information, given by structural induction. Cloning U is trivial, and cloning a pair is just cloning its projections and then shuffling them into place. The construction of cloning a coproduct, however, is more involved:

clone :: a + b ~> (a + b) * (a + b)
clone = do
  left $ do
    clone                -- (a * a) + b
    first leftA          -- ((a + b) * a) + b
    arr swapT            -- (a * (a + b)) + b
  arr swapP              -- b + (a * (a + b))
  left $ do
    clone                -- (b * b) + (a * (a + b))
    first leftA          -- ((b + a) * b) + (a * (a + b))
    arr swapT            -- (b * (b + a)) + (a * (a + b))
  arr $ do
    swapP                -- (a * (a + b)) + (b * (b + a))
    id .+ (id .* swapP)  -- (a * (a + b)) + (b * (a + b))
    sym distrib          -- (a + b) * (a + b)

It should be quite clear that this arrow language of ours is now more-or-less equivalent to some hypothetical first-order version of Haskell (like Elm?). As witnessed above, information is no longer a linear commodity. A motivated programmer could likely get work done in a 9 to 5 with what we’ve built so far. It probably wouldn’t be a lot of fun, but it’s higher level than C at the very least.

The coup de grace of Information Effects is its construction lifting our arrow language back into the isomorphism language. The trick is to carefully construct heap and garbage types to correspond exactly with what our program needs to create and erase. We can investigate this further by case analysis on the constructors of our arrow type:

Arr :: (a <=> b) -> (a ~> b)

As we’d expect, an embedding of an isomorphism in the arrow language is already reversible. However, because we need to introduce a heap and garbage anyway, we’ll use unit.

Since we can’t express the typing judgment in Haskell, we’ll use a sequent instead:

\[ \newcommand{\lifted}[3]{\text{lift } #1 : #2 \leftrightarrow #3} \newcommand{\arr}{\rightsquigarrow} \frac{\text{arr } f : a \arr b}{\lifted{(\text{arr } f)}{\u \times a}{\u \times b}} \]

Assuming we have a way of describing this type in Haskell, all that’s left is to implement the lifting of our iso into the enriched iso language:

lift (Arr f) = id .* f

Compose :: (a ~> b) -> (b ~> c) -> (a ~> c)

Composition of arrows proceeds likewise in a rather uninteresting manner. Here, we have two pairs of heaps and garbages, results from lifting each of the arrows we’d like to compose. Because composition will run both of our arrows, we’ll need both heaps and garbages in order to implement the result. By this logic, the resulting heap and garbage types are pairs of the incoming ones.

\[ \frac{\lifted{f}{h_1\times a}{g_1\times b},\; \lifted{g}{h_2\times b}{g_2\times c}}{\lifted{(g \circ f)}{(h_1\times h_2)\times a}{(g_1\times g_2)\times c}} \]

We can express the resulting combinator in Haskell:

lift (Compose f g) = do
  id            -- (H1 * H2) * a
  swapT .* id   -- (H2 * H1) * a
  sym assocT    -- H2 * (H1 * a)
  id .* lift f  -- H2 * (G1 * b)
  assocT        -- (H2 * G1) * b
  swapT .* id   -- (G1 * H2) * b
  sym assocT    -- G1 * (H2 * b)
  id .* lift g  -- G1 * (G2 * c)
  assocT        -- (G1 * G2) * c

First :: (a ~> b) -> (a * c ~> b * c)

Lifting arrows over products again is uninteresting – since we’re doing nothing with the second projection, the only heap and garbage we have to work with are those resulting from the lifting of our arrow over the first projection.

\[ \frac{\lifted{f}{h\times a}{g\times b}} {\lifted{(\text{First } f)}{h\times (a\times c)}{g\times (b\times c)}} \]

In Haskell, our resulting combinator looks like this:

lift (First f) = do
  id          -- H * (a * c)
  assocT      -- (H * a) * c
  f .* id     -- (G * b) * c
  sym assocT  -- G * (b * c)

Left :: (a ~> b) -> (a + c ~> b + c)

Finally, we get to an interesting case. In the execution of Left, we may or may not use the incoming heap. We also need a means of creating a b + c given a b or given a c. Recall that in our iso language, we do not have create (nor relatedly, leftA) at our disposal, and so this is a harrier problem than it sounds at first.

We can solve this problem by requiring both a b + c and a c + b from the heap. Remember that the Toffoli construction (what we’re implementing here) will create a reversible gate with additional inputs and outputs that gives the same result when all of its inputs have their default values (ie. the same as those provided by create’s semantics). This means that our incoming b + c and c + b will both be constructed with InL.

Given this, we can thus perform case analysis on the incoming a + c, and then use leftSwap from earlier to move the resulting values into their coproduct.

What does the garbage situation look like? In the case we had an incoming InL, we will have used up our function’s heap, as well as our b + c, releasing the g, b (the default value swapped out of our incoming b + c), and the unused c + b.

If an InR was input to our isomorphism, we instead emit the function’s heap h, the unused b + c, and the default c originally in the heap’s coproduct.

Our final typing judgment thus looks like this:

\[ \frac{\lifted{f}{h\times a}{g\times b}}{\lifted{(\text{Left f})}{h'\times (a + c)}{g' \times (b + c)}} \]

\[ h' = h\times ((b + c) \times(c + b)) \\ g' = (g\times (b\times(c + b))) + (h\times ((b+c)\times c)) \]

and is rather horrifyingly implemented:

lift (Left f) = do
  leftSide f .+ rightSide
  sym distrib

    :: (h * a <=> g * b)
    -> (a * (h * ((b + c) * (c + b))) <=> (g * (b * (c + b))) * (b + c))
leftSide f = do
  swapT                -- (H * ((b + c) * (c + b))) * a
  swapT .* id          -- (((b + c) * (c + b)) * H) * a
  sym assocT           -- ((b + c) * (c + b)) * (H * a)
  id .* f              -- ((b + c) * (c + b)) * (G * b')
  swapT .* id          -- ((c + b) * (b + c)) * (G * b')
  sw2                  -- ((c + b) * G) * ((b + c) * b')
  id .* leftSwap       -- ((c + b) * G) * ((b' + c) * b)
  swapT .* swapT       -- (G * (c + b)) * (b * (b' + c))
  assocT               -- ((G * (c + b)) * b) * (b' + c)
  sym assocT .* id     -- (G * ((c + b) * b)) * (b' + c)
  (id .* swapT) .* id  -- (G * (b * (c + b))) * (b' + c)

rightSide :: c * (h * ((b + c) * (c + b))) <=> (h * ((b + c) * c)) * (b + c)
rightSide = do
  swapT                -- c' * (H * ((b + c) * (c + b)))
  assocT .* id         -- ((H * (b + c)) * (c + b)) * c'
  sym assocT           -- (H * (b + c)) * ((c + b) * c')
  id .* leftSwap       -- (H * (b + c)) * ((c' + b) * c)
  id .* swapT          -- (H * (b + c)) * (c * (c' + b))
  assocT               -- ((H * (b + c)) * c) * (c' + b)
  sym assocT .* swapP  -- (H * ((b + c) * c)) * (b + c')

The home stretch is within sight. We have only two constructors of our arrow language left. We look first at Create:

Create  :: U ~> a

Because we’ve done all of this work to thread through a heap in order to give us the ability to create values, the typing judgment should come as no surprise:

\[ \frac{}{\lifted{\text{create}}{a\times\u}{\u\times a}} \]

Our heap contains the a we want, and we drop our incoming U as garbage. The implementation of this is obvious:

lift Create = swapT

We’re left with Erase, whose type looks suspiciously like running Create in reverse:

Erase  :: a ~> U

This is no coincidence; the two operations are duals of one another.

\[ \frac{}{\lifted{\text{erase}}{\u\times a}{a\times\u}} \]

As expected, the implementation is the same as Create:

lift Erase = swapT

And we’re done! We’ve now constructed a means of transforming any non-reversible program into a reversible one. Success!


Still here? We’ve come a long way, which we’ll briefly summarize. In this paper, James and Sabry have taken us through the construction of a reversible language, given a proof that it’s Turing-complete, and given us some simple constructions on it. We set out on our own to implement lists and derived map for them.

We then constructed a non-reversible language (due to its capability to create and erase information), and then gave a transformation from this language to our earlier reversible language – showing that non-reversible computing is a special case of its reversible counterpart.

Information Effects ends with a short discussion of potential applications, which won’t be replicated here.

Commentary (on the physics)

Assuming I understand the physics correctly (which I probably don’t), the fact that these reversible functions do not increase entropy implies that they should be capable of shunting information for near-zero energy. Landauer’s Principle and [Szilard’s engine] suggests that information entropy and thermodynamic entropy are one and the same; if we don’t increase entropy in our computation of a function, there is nowhere for us to have created any heat.

That’s pretty remarkable, if you ask me. Together with our construction from any non-reversible program to a reversible one, it suggests we should be able to cut down on our CPU power usage by a significant order of magnitudes.

Commentary (on where to go from here)

An obvious limitation of what we’ve built here today is that it is first-order, which is to say that functions are not a first class citizen. I can think of no immediate problem with representing reversible functions in this manner. We’d need to move our (<=>) directly into the language.

id would provide introduction of this type, and (>>) (transitivity) would allow us to create useful values of the type. We’d also need a new axiom:

apply :: a * (a <=> b) <=> b * (b <=> a)

which would allow us to use our functions. We should also expect the following theorems (which may or may not be axioms) due to our iso language forming a cartesian closed category:

product   :: (a <=> (b * c)) <=> (a <=> b) * (a <=> c)
coproduct :: (a <=> (b + c)) <=> (a <=> b) + (a <=> c)

Things that we’d expect to be theorems but are not are:

terminal :: U <=> (a <=> U)
select   :: a <=> (U <=> a)

due to the symmetry of (<=>), both of these are equivalent to create and erase. I think the fact that these are not theorems despite U being the terminal object is that (<=>) requires arrows in both directions, but U only has incoming arrows.


August 20, 2017 12:00 AM

August 18, 2017

Edward Z. Yang

Backpack for deep learning

This is a guest post by Kaixi Ruan.

Backpack is a module system for Haskell, released recently in GHC 8.2.1. As this is a new feature, I wanted to know how people use it. So I searched Twitter every day, and the other day I saw this tweet:

Are there other examples than String/Bytestring/Text? So far I haven’t seen any; it seems like backpack is just for glorified string holes.

There were a number of good responses, but I want to give another use case from deep learning.

In deep learning, people are interested in doing computations on tensors.  Tensors can have different value types: int, float, double etc. Additionally, ensor computations can be done on the CPU or GPU. Although there many different types of tensor,  the computations for each type of tensor are the same, i.e, they share the same interface. Since Backpack lets you program against one interface which can have multiple implementations, it is the perfect tool for implementing a tensor library.

Torch is a widely used library, implemented in C, for deep learning. Adam Paszke has a nice article about Torch. We can write some Haskell bindings for Torch, and then use Backpack to switch between implementations of float and int tensors. Here is a program that uses tensors via a Backpack signature:

unit torch-indef where
  signature Tensor where
    import Data.Int
    data Tensor
    data AccReal
    instance Show AccReal
    instance Num AccReal
    read1dFile :: FilePath -> Int64 -> IO Tensor
    dot :: Tensor -> Tensor -> IO AccReal
    sumall :: Tensor -> IO AccReal
  module App where
    import Tensor
    app = do
        x <- read1dFile "x" 10
        y <- read1dFile "y" 10
        d <- dot x y
        s <- sumall x
        print (d + s)
        return ()

We have a simple main function which reads two 1D tensors from files, does dot product of the two, sums all entries of the first tensor, and then finally prints out the sum of these two values. (This program is transcribed from Adam’s article, the difference is that Adam’s program uses Float Tensor, and we keep the Tensor type abstract so with Backpack we can do both float and int). The program uses functions like dot, which are defined in the signature.

Here is an implementation of dot and types for float tensors. The C functions are called using Haskell’s FFI:

import Foreign
import Foreign.C.Types
import Foreign.C.String
import Foreign.ForeignPtr

foreign import ccall "THTensorMath.h THFloatTensor_dot"
    c_THFloatTensor_dot :: (Ptr CTHFloatTensor) -> (Ptr CTHFloatTensor) -> IO CDouble

type Tensor = FloatTensor
type AccReal = Double

dot :: Tensor -> Tensor -> IO AccReal
dot (FT f) (FT g) = withForeignPtr f $ \x ->
                    withForeignPtr g $ \y -> do
                    d <- c_THFloatTensor_dot x y
                    return (realToFrac d)

As you can see, Backpack can be used to structure a deep learning library which has multiple implementations of operations for different types. If you wrote bindings for all of the functions in Torch, you would have a deep learning library for Haskell; with Backpack, you could easily write models that were agnostic to the types of tensors they operate on and the processing unit (CPU or GPU) they run on.

You can find the full sample code on GitHub.

by Edward Z. Yang at August 18, 2017 02:05 AM

August 17, 2017

Manuel M T Chakravarty

Moving On

I am excited about my new role at Tweag I/O! Curious why? Read why I am a functional programming evangelist.

August 17, 2017 12:14 AM

Alp Mestanogullari

Coyoneda and fmap fusion

Let’s quickly see how the (dual variant of the) Yoneda lemma can speed up some Haskell programs – more specifically ones that are repeatedly calling fmap to transform some data within a Functor.

We will be focusing on the following Functor:

data Tree a
  = Bin a (Tree a) (Tree a)
  | Nil
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Nil         = Nil
  fmap f (Bin a l r) = Bin (f a) (fmap f l) (fmap f r)

-- we'll also use this instance to perform a silly computation
-- on our tree after some transformations have occured
instance Foldable Tree where
  foldMap _ Nil         = mempty
  foldMap f (Bin a l r) = f a <> foldMap f l <> foldMap f r

A simple binary tree with data stored in the nodes, whose Functor instance lets us map a function over each a stored in our tree and whose Foldable instance lets us combine computations performed over our as.

<section class="level1" id="coyoneda">


The Coyoneda lemma, when interpreted on haskell Functors, tells us that Coyoneda f a is equivalent (isomorphic) to f a, where Coyoneda is:

data Coyoneda f a where
  Coyoneda :: (b -> a) -> f b -> Coyoneda f a

We see that it holds an f b and a way to go from bs to as, effectively making it equivalent to f a if you fmap the first field on the second one. That’s also the only sensible thing we can do with such a value, as the b is hidden.

If it’s equivalent to f a, it must be a Functor too? Sure enough it is.

instance Functor (Coyoneda f) where
  fmap f (Coyoneda b2a fb) = Coyoneda (f . b2a) fb

We see that calling fmap f amounts to “accumulating” more work in the b -> a field, possibly even changing from a given a to some other type, as allowed by fmap. This is exactly the piece of code that powers “fmap fusion”. Instead of going from f a to f b with fmap f and then to f c with fmap g, the Coyoneda representation keeps hold of the original f a, which is left untouched by the Functor instance from above, and instead simply composes f and g in that first field.

Now, we said that f a and Coyoneda f a are isomorphic but did not provide functions to prove our claim, let’s fix that right away.

coyo :: f a -> Coyoneda f a
coyo = Coyoneda id

uncoyo :: Functor f => Coyoneda f a -> f a
uncoyo (Coyoneda f fa) = fmap f fa

Note that we do not need f to be a Functor to build a Coyoneda f a, as there’s no need to call fmap until the very end, when we have composed all our transformations and finally want to get the final result as some f a, not Coyoneda f a.

Maybe it’s still not clear to you that successive fmap calls are fused, so let’s prove it. We want to show that for two functions f :: b -> c and g :: a -> b, uncoyo . fmap f . fmap g . Coyoneda id = fmap (f . g).

uncoyo . fmap f . fmap g . coyo
  = uncoyo . fmap f . fmap g . Coyoneda id -- definition of coyo
  = uncoyo . fmap f . Coyoneda (g . id)    -- Functor instance for Coyoneda
  = uncoyo . fmap f . Coyoneda g           -- g . id = g
  = uncoyo . Coyoneda (f . g)              -- Functor instance for Coyoneda
  = fmap (f . g)                           -- definition of uncoyo

Nice! And you could of course chain any number of fmap calls and they would all get fused into a single fmap call that applies the composition of all the functions you wanted to fmap.

For instance, back to our tree, let’s define some silly computations:

-- sum all the values in a tree
sumTree :: Num a => Tree a -> a
sumTree = getSum . foldMap Sum

-- an infinite tree with integer values
t :: Tree Integer
t = go 1

  where go r = Bin r (go (2*r)) (go (2*r + 1))

-- only keep the given number of depth levels of
-- the given tree
takeDepth :: Int -> Tree a -> Tree a
takeDepth _ Nil = Nil
takeDepth 0 _   = Nil
takeDepth d (Bin r t1 t2) = Bin r (takeDepth (d-1) t1) (takeDepth (d-1) t2)

-- a chain of transformations to apply to our tree
transform :: (Functor f, Num a) => f a -> f a
transform = fmap (^2) . fmap (+1) . fmap (*2)

Now with a simple main we can compare how efficient it is to compute sumTree $ takeDepth n (transform t) by using Tree vs Coyoneda Tree as the functor on which the transformations are applied. You can find an executable module in this gist.

If we compare with and without Coyoneda for n = 23, there’s already a noticeable (and reproducible) difference:

$ time ./yo 23


$ time ./yo 23 --coyo

Posted on August 17, 2017

by Alp Mestanogullari at August 17, 2017 12:00 AM

August 16, 2017

Michael Snoyman


About five years ago, I decided to start working out at home since I wanted to get in better shape. About three years ago, I got more serious about it as I realized my health was slipping (specifically, recurrence of asthmatic symptoms after 20 years of being clear). But I only started weight lifting 1.5 years ago, and the reason was simple: back pain.

Like many people in our industry—our industry being the "sit in front of a computer all day" industry—I suffered from chronic lower back pain. I'd been having problems with it on or off since I was a teenager (yeah, I was sitting in front of a computer then too). But over the preceeding few years, it got significantly worse. I had many episodes of waking up unable to get out of bed without significant pain. I had a few cases of my spine turning S-shaped for days on end, unable to stand up straight at all.

I have a strong family history of back pain. Like going bald, I'd taken it as a given for many years that this would happen. I went to an orthopedist, who prescribed painkillers. And that could have been the rest of my life: regular pain, popping pills, waiting to see if I'd kill my liver with the pills before something else got me. And most likely, inactivity due to back pain could have led to plenty of other health problems.

Today is a different story. I won't claim that I'm totally back pain free—problems still crop up from time to time. But the debilitating level I had previously is gone. And when some negative event occurs (like getting knocked down and back slammed by a wave this Sunday), I'm far more resilient to the damage. I'm writing this blog post since I strongly believe many of my friends, family, colleagues, and general fellow programmers suffer terribly from back pain, when they could greatly improve the situation. I'll tell you what I've done, and what I continue to do.

If you suffer from back pain, I strongly recommend you consider being proactive about it. Feel free to take my experiences into account, but also do your own research and determine what you think is your best course of action. There is unfortunately—like most things in the health world—quite a bit of contradictory advice out there.

Two pronged approach

From my research, I decided that there were likely two things I could do (outside of pill popping) that I could do to improve the situation with my back:

  • Improve the muscles in my posterior chain (lower back, glutes, hamstrings) to better support my spine
  • Change the way I was moving my back, though I didn't really understand yet how

The first bit is easy to explain. I'd been doing bodyweight workouts at home until then, which—according to the program I was following, don't really offer a good alternative to the deadlift for posterior chain work. That's why I switched to Stronglifts 5x5 and put a large emphasis on the deadlift, also focusing on stabilizing my core a lot during the squat.

I'll be honest: I threw my back out badly a few times on the squat. I almost gave up. I'm glad I didn't. I (finally) figured out how I was misusing my back on the exercises, and now can squat and deadlift almost twice the weight that had previously thrown my back out. I consider it a huge success.

In addition to the muscle improvements, the other takeaway is: lifting weights taught me how to use my back in a safer way.


But now on to the (for me) more complicated bit. I watched tons of YouTube videos, read articles, browsed forums, and spoke with doctors and chiropractors about proper posture. The problem is that there are different schools of thought on what it means to stand or sit correctly. From my reading, the most contentious point comes down to pelvic tilt. To demonstrate visually:

Pelvic tilt

There's a basic question: should your pelvis tip slightly forward, slightly backwards, or be neutral (perfectly vertical). As far as I can tell, the most mainstream opinion is a neutral pelvis. I'm always nervous to give anything close to health advice, especially contrary to mainstream opinion, so instead I'll say: I found a lot of success with the Gokhale Method, and specifically Esther's book "8 Steps to a Pain Free Back."

The reasoning Esther uses to arrive at her conclusions is solid to me. Analyzing the shape of the vertebrae, and specifically the L5-S1 joint, does make a good case for the pelvis needing to be slightly anteverted. In addition, I buy her argument of the source of back pain being the predominance of slouching introduced in the western world in the earlier 20th century. The evidence of more uniform posture among cultures unexposed to this slouching epidemic, and their relative lack of back problems, is compelling.

I won't try to describe the method here; her book and YouTube videos do a better job than I ever could. I will, however, comment on some of the takeaways that I try to keep in mind throughout the day:

  • Keep the spine in a stretched position as much as possible
  • Stack the bones: try to ensure that your weight is being distributed down your spinal column, through your pelvis, and down your legs, instead of relying on your muscles or (worse) joints to keep you stable

Keep in mind that this is not an overnight change. You'll need to practice better posture and get it to the point of muscle memory. I think it's worth every second of investment you can give it. It's not worth living your life in pain, afraid to move, and constantly doped up.

Why now?

Two things happened this week that made me want to write this blog post. I took my kids to the beach on Sunday, and as I mentioned above, got knocked down hard by a wave, which twisted my back in a bad angle. For the next few seconds that I was under water, absolute fear went through my mind. "Oh no, did my back just go out? How am I going to drive the kids home? How will I work this week? What if one of the kids gets pulled under the water and I can't save him/her?"

The wave subsided, my feet touched the floor, I stood up... and everything was fine. I know in my bones (hah!) that that kind of impact would have put me out for a week just a few years ago. I'm sitting at my desk typing this now, after having done a deadlift session in the gym, and everything is fine.

Yesterday I took a trip to the doctor (not the topic of today's post). I sat in the patient's chair in his office, and noticed that—contrary to prior visits—I was sitting perfectly upright. I hadn't thought about it. The chair wasn't really well designed either: using the back support would have required leaning back and no longer remaining straight. It was a minor victory, but I'll take it.

August 16, 2017 09:43 AM

August 14, 2017

Wolfgang Jeltsch

Haskell in Leipzig 2017 submission deadline ahead

Let me remind you that the submission deadline of Haskell in Leipzig 2017 is this Friday. We seek abstracts of about 2 pages length on anything related to Haskell. Looking forward to your contributions. 😉


Haskell is a modern functional programming language that allows rapid development of robust and correct software. It is renowned for its expressive type system, its unique approaches to concurrency and parallelism, and its excellent refactoring capabilities. Haskell is both the playing field of cutting-edge programming language research and a reliable base for commercial software development.

The workshop series Haskell in Leipzig (HaL), now in its 12th year, brings together Haskell developers, Haskell researchers, Haskell enthusiasts, and Haskell beginners to listen to talks, take part in tutorials, join in interesting conversations, and hack together. To support the latter, HaL will include a one-day hackathon this year. The workshop will have a focus on functional reactive programming (FRP) this time, while continuing to be open to all aspects of Haskell. As in the previous year, the workshop will be in English.


Everything related to Haskell is on topic, whether it is about current research, practical applications, interesting ideas off the beaten track, education, or art, and topics may extend to functional programming in general and its connections to other programming paradigms.

Contributions can take the form of

  • talks (about 30 minutes),
  • tutorials (about 90 minutes),
  • demonstrations, artistic performances, or other extraordinary things.

Please submit an abstract that describes the content and form of your presentation, the intended audience, and required previous knowledge. We recommend a length of 2 pages, so that the program committee and the audience get a good idea of your contribution, but this is not a hard requirement.

Please submit your abstract as a PDF document via EasyChair until Friday, August 18, 2017. You will be notified by Friday, September 8, 2017.

Hacking Projects

Projects for the hackathon can be presented during the workshop. A prior submission is not needed for this.

Invited Speaker

  • Ivan Perez, University of Nottingham, UK

Invited Performer

  • Lennart Melzer, Robert-Schumann-Hochschule Düsseldorf, Germany

Program Committee

  • Edward Amsden, Plow Technologies, USA
  • Heinrich Apfelmus, Germany
  • Jurriaan Hage, Utrecht University, The Netherlands
  • Petra Hofstedt, BTU Cottbus-Senftenberg, Germany
  • Wolfgang Jeltsch, Tallinn University of Technology, Estonia (chair)
  • Andres Löh, Well-Typed LLP, Germany
  • Keiko Nakata, SAP SE, Germany
  • Henrik Nilsson, University of Nottingham, UK
  • Ertuğrul Söylemez, Intelego GmbH, Germany
  • Henning Thielemann, Germany
  • Niki Vazou, University of Maryland, USA
  • Johannes Waldmann, HTWK Leipzig, Germany

Tagged: conference, FRP, functional programming, Haskell

by Wolfgang Jeltsch at August 14, 2017 10:19 PM

August 12, 2017

Dan Piponi (sigfpe)

What is a photon?


Popular science writing about quantum mechanics leaves many people full of questions about the status of photons. I want to answer some of these without using any tricky mathematics.

One of the challenges is that photons are very different to ordinary everyday objects like billiard balls. This is partly because photons are described by quantum mechanics whereas billiard balls are better modelled with classical Newtonian mechanics. Quantum mechanics defies many of our intuitions. But it's also because the word photon plays by different linguistic rules to billiard ball. I hope to explain why.

One of my goals is to avoid saying anything original. I'm largely going remove the mathematics from material I first learnt from three or so courses I took at Cambridge University many years ago: Quantum Mechanics, Solid State Physics and Quantum Field Theory. I also learnt about some of this from David Miller at Stanford University who talked a little about what properties it is meaningful to apply to a photon. (I hope I haven't misrepresented him too badly.)

The simple harmonic oscillator

Here's a mass hanging on a spring:

Suppose it's initially sitting in equilibrium so that the net force acting on it is zero. Now we lift the mass a small distance and let it go. Because we lifted it, we shortened the spring, reducing its tension. This means the force due to gravity is now more than the spring tension and the mass falls. Eventually it falls below the equilibrium point, increasing the tension in the spring so there is a net force pulling it back up again. To a good approximation, the force restoring the mass to its equilibrium point is proportional to how far it has been displaced. When this happens we end up with oscillating motion where the mass bounces up and down. Here's what a graph of its displacement looks like over time:

It's actually a sine wave but that detail doesn't matter for us right now.

An oscillator where the restoring force is proportional to the displacement from the equilibrium point is called a simple harmonic oscillator and its oscillation is always described by a sine wave.

Note that I'm ignoring friction here. This is a reasonable approximation for many physical systems.

Masses on springs aren't all that important in themselves. But simple harmonic oscillators are very common. Another standard example is the pendulum swinging under the influence of gravity:

At a more fundamental level, an example might be an atom in a crystal being held in place by electrostatic forces from its neighbouring atoms.

If you have one of these systems, then in principle you can set it in motion with as little energy as you like. Pull a mass on a spring down a little bit and it will bounce back up, oscillating a certain amount. Pull the mass down half the amount and it'll bounce with oscillations half the size. In principle we could keep repeating this experiment, each time starting with the mass displaced half the amount we tried previously. In other words, a simple harmonic oscillator can have any energy we like. The spectrum of possible energies of one of these oscillators is continuous. (Note that the word spectrum here is merely physicist-speak for a set of possible values.) If we can set one in motion with 1 unit of energy then we can also set it oscillating with 0.5 units, or 0.01 units, or 0.000123 units of energy.

Quantum mechanics

Everything I've said above is assuming that classical Newtonian mechanics is valid. But we know that for very small systems, around the size of a few atoms or smaller, we need to use quantum mechanics. This is an enormous topic but I'm only going to extract one basic fact. According to quantum mechanics, a simple harmonic oscillator isn't free to oscillate with any energy you like. The possible energy levels, the spectrum of the system, is discrete. There is a lowest energy level, and then all of the energy levels above that are equally spaced like so, going up forever:

We usually call the lowest energy level the ground state or vacuum state and call the higher levels excited states.

The spacing of the energy levels depends on the stiffness of the system, which is just a measure of how much the restoring force increases with displacement from equilibrium. Stiffer systems will have a higher frequency of oscillation and a bigger spacing between the energy levels.

(I'm deliberately not saying anything about why we get discrete energy levels in quantum mechanics. I just want to use this one fact so I can get on and talk about photons eventually.)

In practice the difference in energy between one level and the next is tiny. This means that if you're literally fiddling about with a mass on a spring you won't ever feel the discreteness. The amount your hand trembles is many orders of magnitude greater than the effect of this discreteness. Nonetheless, it is extremely important when modeling microscopic systems.

Quantum linguistics

Here are some English sentences we could say about the kinds of systems I've described so far:

  1. This system is in the ground state.
  2. That system is in its first excited state
  3. This system is at an energy level higher than that system
  4. After allowing these two similar oscillators to interact, the energy level of this oscillator went down and the energy level of that one went up by the same amount.

Now I want to introduce the (count) noun quantum, with plural quanta. The idea here is not that I'm telling you about a new entity. I want to present this as a new way to talk about things I've already introduced. So rather than give a definition of quantum I will instead show how you can rewrite the above sentences using the language of quanta:

  1. There are no quanta in this system
  2. That system has one quantum of energy
  3. This system has more quanta than that one
  4. Some quanta were transferred from this system to that system.

Those sentences make it seem like I'm talking about a new kind of object - the quantum. But I'm not. They're just a manner of speaking about energy levels. I hope I've given you enough examples to get the idea.

Just in case you think it's weird to talk about energy levels in terms of quanta, I'd like to remind you that you already do this all the time with money. Dollar bills are actual objects that exist in the world. But money in your bank account isn't. Somewhere in some database is a representation of how much money you have. You might say "I have one hundred dollars in my savings account" But those dollars certainly don't exist as distinct entities. It doesn't really make sense to talk about the thirty-seventh dollar in your bank account. You can transfer dollars from one account to another, and yet what's really happening is that two totals are being adjusted. We treat these accounts a lot like they're containers holding individual objects called dollars. Certainly our language is set up like that. But we know that it's really just the totals that have any kind of representation. The same goes for quanta. It's just a manner of speaking about systems that can have different amounts of energy and where the spectrum of energy levels forms a ladder with equally spaced rungs. Because of your experience with money I probably don't need to give you any more examples.

One more bit of terminology: when the spectrum of energies is discrete it's said to be quantised.

Coupled systems

Let's return to classical physics with a slightly more complex system consisting of two masses connected to springs. We ignore gravity now:

We restrict ourselves to just considering back and forth motion constrained along a horizontal line. This is a coupled system. If the left mass moves to the right, not only does it experience a restoring force pushing it left, but the mass on the right will experience more of a force pushing it to the left. We can't treat the masses as independent and so we don't get the simple solution of each mass always oscillating with a sine wave.

For this particular problem though there's a trick to turn it into a pair of harmonic oscillators. The idea is to consider the pair of masses as a single entity. We can think of the motion centre of mass of the pair, the midpoint between them, as being one variable that describes this entity. Let's call its motion the external motion. We can also think of the distance between the two masses in the pair as being the system's internal motion. (I'm just using internal and external as convenient names. Don't read too much into them.) It turns out that when you analyse this using classical dynamics the internal motion and the external motion act like independent quantities. What's more, each one behaves exactly like it's simple harmonic. So we get one sine wave describing the overall motion of the pair, and another one that describes how the elements of the pair oscillate with respect to each other.

The frequencies of the internal and external motions are typically different. So you can end up with some quite complicated motions with two different frequencies beating against each other.

When we're able to find ways to split up the motion into independent quantities, each of which is simple harmonic, each kind of motion is said to be a normal mode.

When you have independent normal modes, you can treat them independently in quantum mechanics too. So what we get is that the spectrum of possible energy levels for this system is, in some sense, two-dimensional. We can put quanta into the internal oscillation and we can also put quanta into the external oscillation. Because these modes have different frequencies the quanta for each mode correspond to different amounts of energy.

(And a reminder: when talking about quantum mechanics I'm not talking literally about masses on springs. I'm talking about physical systems that have equations of motion that mean they behave like masses on springs. In this case it might be a pair of particles trapped in a microscopic well with a repulsive force between them.)

Solid state physics

Now I'm going to jump from just two masses to a large number of them. For example, the behavior of trillions of atoms in a solid crystal can be approximately modelled by a grid of masses and springs, of which the following diagram is just a tiny piece:

A real crystal would be arranged in a 3D lattice but I've drawn 2D here for convenience.

Think of the springs as both pushing apart atoms that get close, and pulling together atoms that move apart.

This is a highly coupled system. Ultimately every atom in our lattice is connected to every other one, either directly, or indirectly. Nonetheless, it is still possible to find normal modes. The normal modes all have the same basic form: they are all sinusoidal waves of displacement traveling in some direction with some speed and oscillation frequency. Each of these modes consists of waves that extend through the entire crystal, with fixed spacing between parallel planar wavefronts. This type of waves is known as a plane wave. If the system is perfectly harmonic, so the restoring force is precisely proportional to the displacement, then each direction and frequency of wave oscillates its way through the crystal completely independently of any other. Just as how in the example with two masses any possible oscillation is a combination of internal and external motion, for a crystal lattice any motion is a combination of these plane waves. (Decomposing any oscillation as a combination of plane waves is known as computing its Fourier transform.

Now we're ready to consider this situation quantum mechanically. Because each plane wave is a normal mode, we can treat each one as an independent simple harmonic oscillator. This means that the energy in each plane wave is quantised. So when we consider a crystal lattice quantum mechanically we find that its states consist of plane waves propagating through it, but where the amount of energy in each wave is given by a discrete spectrum. So again we can talk about how many quanta there are in each mode.

Linguistically it gets a bit more interesting now. Each plane wave is associated with a particular direction and speed so it makes sense to talk of these quanta as having a direction and speed. But note that statements involving quanta are still really just sentences about energy levels. So, for example, the statement "the mode of this system with this velocity and frequency is in its first excited state" is, by definition, exactly the same as "this system has precisely one quantum with this velocity and frequency". In particular, when we write sentences like these we aren't implying that there is some new kind of object, the quantum, that has suddenly attached itself to our crystal. The quanta are properties of the lattice. By the way, in the particular case of vibrating atoms in a lattice, the quanta are known by a special name: phonons.

Quantum field theory and photons

And now we're ready to move onto photons.

In classical physics, electromagnetism is described by Maxwell's equations. Maxwell's equations say that a varying magnetic field generates an electric field and a varying electric field generates a magnetic field. The result is that it is possible for an oscillating electric field to create an oscillating electric field so that an electric field can propagate through space on its own without the help of electric charges or electric currents or any other kind of `generator'. As these electric fields also produce magnetic fields that propagate with them, the whole thing is called an electromagnetic wave.

Just like displacements in a crystal lattice, an electromagnetic wave also has normal modes. The normal modes are plane waves traveling at the speed of light in a particular directions with a given frequency. You have personal experience of this. Visible light is electromagnetic radiation with a frequency of around 500 THz. Wifi uses signals at around 5 GHz. The radio might use signals at around 100 MHz. When you surf the web wirelessly while listening to the radio, the wifi signals don't interfere with your vision or the radio signal. (Actually, wifi might interfere with the radio signals, but not because of the 5 GHz signals. It might happen if badly manufactured hardware emits stray signals around the 100 MHz band.) That's because these waves pass through each other without being coupled to each other in any way. And at this point you might already be guessing what a photon is. For each choice of frequency and direction (and also polarisation, but that's just a detail) the amount of energy that can be in the corresponding mode is quantised. For the electromagnetic field the quanta are called photons.

And that's it!

Electromagnetic waves can be thought of as being made up of different oscillation modes. Because of quantum mechanics, each mode contains an amount of energy that is quantised to be a whole number multiple of some base amount. Although the thing that really matters is the total amount of energy in the modes, it can still be useful to talk about this total as if it's a collection of entities called photons.

One thing to notice is that the normal modes for an electromagnetic wave are plane waves that are extended in space. In principle all the way across the universe but for practical problems physicists often consider electromagnetic waves in a large but finite box. This means that adding a quantum to a system has an effect that extends across the entire system. That makes it problematic to talk about the location of a photon.


Physicists sometimes use the word photon in slightly different but related ways. I've described what I think of as the core definition as presented in many courses on quantum field theory.


Thanks to @dmoore2718 for encouraging me to edit this document down to a better size.

by Dan Piponi ( at August 12, 2017 03:22 AM

August 11, 2017

Noam Lewis

Safer C programming

TL;DR – check out elfs-clang-plugins, cool plugins for clang made at elastifile.

Have you ever made the mistake of returning a bool instead of an enum?

enum Result do_something(void) {
    return true;

In C that’s valid (in C++ you can use ‘class enum’ to avoid it, but if you didn’t you’d have the same problem).

No compiler that I know of warns about this C code. One of our newly-open-sourced clang plugins, flags this (and many other) enum-related mistakes:

clang -Xclang -load -Xclang ./ \
      -Xclang -add-plugin -Xclang enums_conversion \
      -c /tmp/test.c
/tmp/test.c:7:12: error: enum conversion to or from enum Result
    return true;
1 error generated.

The package includes:

  • enums_conversion: Finds implicit casts to/from enums and integral types
  • include_cleaner: Finds unused #includes
  • large_assignment: Finds large copies in assignments and initializations (size is configurable)
  • private: Prevents access to fields of structs that are defined in private.h files

More information at

Because C is… not so safe.


by sinelaw at August 11, 2017 06:21 AM

August 08, 2017

Mark Jason Dominus

That time I met Erdős

I should have written about this sooner, by now it has been so long that I have forgotten most of the details.

I first encountered Paul Erdős in the middle 1980s at a talk by János Pach about almost-universal graphs. Consider graphs with a countably infinite set of vertices. Is there a "universal" graph such that, for any finite or countable graph , there is a copy of inside of ? (Formally, this means that there is an injection from the vertices of to the vertices of that preserves adjacency.) The answer is yes; it is quite easy to construct such a and in fact nearly all random graphs have this property.

But then the questions become more interesting. Let be the complete graph on a countably infinite set of vertices. Say that is “almost universal” if it includes a copy of for every finite or countable graph except those that contain a copy of . Is there an almost universal graph? Perhaps surprisingly, no! (Sketch of proof.)

I enjoyed the talk, and afterward in the lobby I got to meet Ron Graham and Joel Spencer and talk to them about their Ramsey theory book, which I had been reading, and about a problem I was working on. Graham encouraged me to write up my results on the problem and submit them to Mathematics Magazine, but I unfortunately never got around to this. Graham was there babysitting Erdős, who was one of Pách's collaborators, but I did not actually talk to Erdős at that time. I think I didn't recognize him. I don't know why I was able to recognize Graham.

I find the almost-universal graph thing very interesting. It is still an open research area. But none of this was what I was planning to talk about. I will return to the point. A couple of years later Erdős was to speak at the University of Pennsylvania. He had a stock speech for general audiences that I saw him give more than once. Most of the talk would be a description of a lot of interesting problems, the bounties he offered for their solutions, and the progress that had been made on them so far. He would intersperse the discussions with the sort of Erdősism that he was noted for: referring to the U.S. and the U.S.S.R. as “Sam” and “Joe” respectively; his ever-growing series of styles (Paul Erdős, P.G.O.M., A.D., etc.) and so on.

One remark I remember in particular concerned the $3000 bounty he offered for proving what is sometimes known as the Erdős-Túran conjecture: if is a subset of the natural numbers, and if diverges, then contains arbitrarily long arithmetic progressions. (A special case of this is that the primes contain arbitrarily long arithmetic progressions, which was proved in 2004 by Green and Tao, but which at the time was a long-standing conjecture.) Although the $3000 was at the time the largest bounty ever offered by Erdős, he said it was really a bad joke, because to solve the problem would require so much effort that the per-hour payment would be minuscule.

I made a special trip down to Philadelphia to attend the talk, with the intention of visiting my girlfriend at Bryn Mawr afterward. I arrived at the Penn math building early and wandered around the halls to kill time before the talk. And as I passed by an office with an open door, I saw Erdős sitting in the antechamber on a small sofa. So I sat down beside him and started telling him about my favorite graph theory problem.

Many people, preparing to give a talk to a large roomful of strangers, would have found this annoying and intrusive. Some people might not want to talk about graph theory with a passing stranger. But most people are not Paul Erdős, and I think what I did was probably just the right thing; what you don't do is sit next to Erdős and then ask how his flight was and what he thinks of recent politics. We talked about my problem, and to my great regret I don't remember any of the mathematical details of what he said. But he did not know the answer offhand, he was not able solve it instantly, and he did say it was interesting. So! I had a conversation with Erdős about graph theory that was not a waste of his time, and I think I can count that as one of my lifetime accomplishments.

After a little while it was time to go down to the auditorium for the the talk, and afterward one of the organizers saw me, perhaps recognized me from the sofa, and invited me to the guest dinner, which I eagerly accepted. At the dinner, I was thrilled because I secured a seat next to Erdős! But this was a beginner mistake: he fell asleep almost immediately and slept through dinner, which, I learned later, was completely typical.

by Mark Dominus ( at August 08, 2017 09:06 PM

August 07, 2017

FP Complete

Stack Issue Triagers

First, the boring, not over-the-top version: the Stack team is starting a new initiative, the Stack Issue Triagers. We're asking for volunteers to go through the Stack issue tracker on Github and help users with support questions get moving more quickly, and alert the development team of bugs that require their attention. A more advanced version of this would be providing a dedicated IRC channel or similar forum to allow people to ask questions, though that's up for debate. By becoming an issue triager, you'll be helping the Haskell community onboard new users, get a chance to interact directly with the Stack developers, and very likely find an easier way to get onboard with being a Stack developer yourself.

But that's boring.

The time has come to seize your destiny. You've always known there was something greater waiting for you. Deep in your bones you've felt it. Your chance in nigh! You dare not miss this opportunity to jump into the fray and impact the cosmos.

Shape the future of Stack! Grow the Haskell user base! Destroy imperative programming! Stop all software bugs, ushering a new era of peace and prosperity (until Haskell starts the singularity of course).

Anyway, since this is a new, experimental initiative, we'll start this off small, but if successful hopefully we'll grow it significantly. If you're interested in participating, please fill out this form.

August 07, 2017 04:20 AM

August 06, 2017

Roman Cheplyaka

5 ways to manage allocated memory in Haskell

Let’s say we have a foreign function that takes a C data structure. Our task is to allocate the structure in memory, fill in the fields, call the function, and deallocate the memory.

The data structure is not flat but contains pointers to other data structures that also need to be allocated. An example of such a data structure is a linked list:

typedef struct list {
  int car;
  struct list *cdr;
} list;

And an example of a C function is one that computes the sum of all elements in the list:

int sum(list *l) {
  int s = 0;
  while (l) {
    s += l->car;
    l = l->cdr;
  return s;

In this article, I will explore different ways to track all the allocated pointers and free them reliably.

The complete code can be downloaded as a git repo:

git clone

The modules below use the hsc2hs preprocessor; it replaces things like #{peek ...}, #{poke ...}, and #{size ...} with Haskell code.

Way 1: traverse the structure

Since all pointers are stored somewhere in the data structure, we could traverse it to recover and free those pointers.

mkList :: [Int] -> IO (Ptr List)
mkList l =
  case l of
    [] -> return nullPtr
    x:xs -> do
      struct <- mallocBytes #{size list}
      #{poke list, car} struct x
      xs_c <- mkList xs
      #{poke list, cdr} struct xs_c
      return struct

freeList :: Ptr List -> IO ()
freeList l
  | l == nullPtr = return ()
  | otherwise = do
      cdr <- #{peek list, cdr} l
      free l
      freeList cdr

way1 :: Int -> IO Int
way1 n = bracket (mkList [1..n]) freeList csum

This is how we’d probably do it in C, but in Haskell it has several disadvantages compared to the other options we have:

  1. The code to traverse the structure has to be written manually and is prone to errors.
  2. Having to do the extra work of traversing the structure makes it slower than some of the alternatives.
  3. It is not exception-safe; if something happens inside mkList, the already allocated pointers will be lost. Note that this code is async-exception-safe (bracket masks async exceptions for mkList), so the only exceptions we need to worry about must come from mkList or freeList (e.g. from mallocBytes).

Way 2: allocaBytes and ContT

The Foreign.Marshal.Alloc module provides the bracket-style allocaBytes function, which allocates the memory and then automatically releases it.

-- |@'allocaBytes' n f@ executes the computation @f@, passing as argument
-- a pointer to a temporarily allocated block of memory of @n@ bytes.
-- The block of memory is sufficiently aligned for any of the basic
-- foreign types that fits into a memory block of the allocated size.
-- The memory is freed when @f@ terminates (either normally or via an
-- exception), so the pointer passed to @f@ must /not/ be used after this.
allocaBytes :: Int -> (Ptr a -> IO b) -> IO b

It works great if we need to allocate a small fixed number of structures:

allocaBytes size1 $ \ptr1 ->
  allocaBytes size2 $ \ptr2 -> do
    -- do something with ptr1 and ptr2

But what if the number of allocations is large or even unknown, as in our case?

For that, we have the continuation monad!

mallocBytesC :: Int -> ContT r IO (Ptr a)
mallocBytesC n = ContT $ \k -> allocaBytes n k

mallocBytesC is a monadic function that returns a pointer to the allocated memory, so it is as convenient and flexible as the simple mallocBytes used in Way 1. But, unlike mallocBytes, all allocated memory will be safely and automatically released at the point where we run the ContT layer.

mkList :: [Int] -> ContT r IO (Ptr List)
mkList l =
  case l of
    [] -> return nullPtr
    x:xs -> do
      struct <- mallocBytesC #{size list}
      liftIO $ #{poke list, car} struct x
      xs_c <- mkList xs
      liftIO $ #{poke list, cdr} struct xs_c
      return struct

way2 :: Int -> IO Int
way2 n = runContT (liftIO . csum =<< mkList [1..n]) return

This is my favorite way: it is simple, safe, fast, and elegant.

Way 3: ResourceT

If we replace ContT r with ResourceT, we get a similar function with essentially the same semantics: the memory released when the ResourceT layer is run.

mallocBytesR :: Int -> ResourceT IO (Ptr a)
mallocBytesR n = snd <$> allocate (mallocBytes n) free

The rest of the code barely changes:

mkList :: [Int] -> ResourceT IO (Ptr List)
mkList l =
  case l of
    [] -> return nullPtr
    x:xs -> do
      struct <- mallocBytesR #{size list}
      liftIO $ #{poke list, car} struct x
      xs_c <- mkList xs
      liftIO $ #{poke list, cdr} struct xs_c
      return struct

way3 :: Int -> IO Int
way3 n = runResourceT (liftIO . csum =<< mkList [1..n])

However, as we shall see, this version is an order of magnitude slower than the ContT version.

Way 4: single-block allocation via mallocBytes

Even though our structure contains lots of pointers, we don’t have to allocate each one of them separately. Instead, we could allocate a single chunk of memory and calculate the pointers ourselves.

This method may be more involved for complex data structures, but it is the fastest one, because we only need to keep track of and deallocate a single pointer.

writeList :: Ptr List -> [Int] -> IO ()
writeList ptr l =
  case l of
    [] -> error "writeList: empty list"
    [x] -> do
      #{poke list, car} ptr x
      #{poke list, cdr} ptr nullPtr
    x:xs -> do
      #{poke list, car} ptr x
      let ptr' = plusPtr ptr #{size list}
      #{poke list, cdr} ptr ptr'
      writeList ptr' xs

mkList :: [Int] -> IO (Ptr List)
mkList l
  | null l = return nullPtr
  | otherwise = do
      ptr <- mallocBytes (length l * #{size list})
      writeList ptr l
      return ptr

way4 :: Int -> IO Int
way4 n = bracket (mkList [1..n]) free csum

Way 5: single-block allocation via allocaBytes + ContT

This is the same as Way 4, except using allocaBytes instead of mallocBytes. The two functions allocate the memory differently, so I thought I’d add this version to the benchmark.

mkList :: [Int] -> ContT r IO (Ptr List)
mkList l
  | null l = return nullPtr
  | otherwise = do
      ptr <- mallocBytesC (length l * #{size list})
      liftIO $ writeList ptr l
      return ptr

way5 :: Int -> IO Int
way5 n = runContT (liftIO . csum =<< mkList [1..n]) return

Is sum pure?

We expose the sum function from C as an IO function csum:

foreign import ccall "sum" csum :: Ptr List -> IO Int

The alternative is to expose it as a pure function:

foreign import ccall "sum" csum :: Ptr List -> Int

The Haskell FFI explicitly allows both declarations, and it may seem that a function summing up numbers deserves to be pure.

However, declaring csum as pure would break every single example above (after making the trivial changes to make it type check). Can you see why?

Benchmark results

The benchmark consists of allocating, summing, and deallocating a list of numbers from 1 to 100.

<figure> </figure>
benchmarking way1
time                 3.420 μs   (3.385 μs .. 3.461 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 3.439 μs   (3.414 μs .. 3.508 μs)
std dev              127.8 ns   (72.34 ns .. 244.5 ns)
variance introduced by outliers: 48% (moderately inflated)

benchmarking way2
time                 2.150 μs   (2.142 μs .. 2.158 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 2.142 μs   (2.135 μs .. 2.149 μs)
std dev              24.06 ns   (21.18 ns .. 28.03 ns)

benchmarking way3
time                 12.14 μs   (12.10 μs .. 12.21 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.22 μs   (12.17 μs .. 12.30 μs)
std dev              203.6 ns   (156.8 ns .. 277.2 ns)
variance introduced by outliers: 14% (moderately inflated)

benchmarking way4
time                 1.499 μs   (1.489 μs .. 1.509 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.488 μs   (1.483 μs .. 1.495 μs)
std dev              19.83 ns   (16.17 ns .. 24.72 ns)
variance introduced by outliers: 12% (moderately inflated)

benchmarking way5
time                 1.423 μs   (1.405 μs .. 1.447 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 1.431 μs   (1.418 μs .. 1.448 μs)
std dev              51.46 ns   (41.00 ns .. 72.46 ns)
variance introduced by outliers: 49% (moderately inflated)

August 06, 2017 08:00 PM

Mark Jason Dominus

How Shazam works

Yesterday I discussed an interesting failure on the part of Shazam, a phone app that can recognize music by listening to it. I said I had no idea how it worked, but I did not let that stop me from pulling the following vague speculation out of my butt:

I imagine that it does some signal processing to remove background noise, accumulates digests of short sections of the audio data, and then matches these digests against a database of similar digests, compiled in advance from a corpus of recordings.

Julia Evans provided me with the following reference: “An Industrial-Strength Audio Search Algorithm” by Avery Li-Chun Wang of Shazam Entertainment, Ltd. Unfortunately the paper has no date, but on internal evidence it seems to be from around 2002–2006.

M. Evans summarizes the algorithm as follows:

  1. find the strongest frequencies in the music and times at which those frequencies happen
  2. look at pairs and turn those into pairs into hashes (by subtracting from )
  3. look up those hashes in your database

She continues:

so basically Shazam will only recognize identical recordings of the same piece of music—if it's a different performance the timestamps the frequencies happen at will likely be different and so the hashes won't match

Thanks Julia!

Moving upwards from the link Julia gave me, I found a folder of papers maintained by Dan Ellis, formerly of the Columbia University Electrical Engineering department, founder of Columbia's LabROSA, the Laboratory for the Recognition and Organization of Speech and Audio, and now a Google research scientist.

In the previous article, I asked about research on machine identification of composers or musical genre. Some of M. Ellis’s LabROSA research is closely related to this. See for example:

There is a lot of interesting-looking material available there for free. Check it out.

(Is there a word for when someone gives you a URL like http://host/a/b/c/d.html and you start prying into http://host/a/b/c/ and http://host/a/b/ hoping for more goodies? If not, does anyone have a suggestion?)

by Mark Dominus ( at August 06, 2017 03:31 PM

Joachim Breitner

Communication Failure

I am still far from being a professor, but I recently got a glimpse of what awaits you in that role…

From: Sebastian R. <…>
Subject: re: Errors

I've spotted a basic error in your course on Haskell ( Before I proceed, it's cool if you're not receptive to errors being indicated; I've come across a number of professors who would rather take offense than admit we're all human and thus capable of making mistakes... My goal is to find a resource that might be useful well into the future, and a good indicator of that is how responsive the author is to change.

In your introduction note you have written:

n contrast to a classical intro into Haskell, we do not start with numbers, booleans, tuples, lists and strings, but we start with pictures. These are of course library-defined (hence the input CodeWorld) and not part of “the language”. But that does not make them less interesting, and in fact, even the basic boolean type is library defined – it just happens to be the standard library.

Howeverm there is no input CodeWorld in the code above. Have you been made aware of this error earlier?

Regards, ...

Nice. I like when people learn from my lectures. The introduction is a bit werid, but ok, maybe this guy had some bad experiences.

Strangley, I don’t see a mistake in the material, so I respond:

From: Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript>
To: Sebastian R. <…>
Subject: Re: Errors

Dear Sebastian,

thanks for pointing out errors. But the first piece of code under “Basic Haskell” starts with

{-# LANGUAGE OverloadedStrings #-}
import CodeWorld

so I am not sure what you are referring to.

Note that these are lecture notes, so you have to imagine a lecturer editing code live on stage along with it. If you only have the notes, you might have to infer a few things.

Regards, Joachim

A while later, I receive this response:

From: Sebastian R. <…>
To: Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript>
Subject: Re: Errors

Greetings, Joachim.

Kindly open the lecture slides and search for "input CodeWorld" to find the error; it is not in the code, but in the paragraph that implicitly refers back to the code.

You might note that I quoted this precisely from the lectures... and so I repeat myself... this came from your lectures; they're not my words!

In contrast to a classical intro into Haskell, we do not start with numbers, booleans, tuples, lists and strings, but we start with pictures. These are of course library-defined (hence the input CodeWorld) and not part of “the language”. But that does not make them less interesting, and in fact, even the basic boolean type is library defined – it just happens to be the standard library.

This time around, I've highlighted the issue. I hope that made it easier for you to spot...

Nonetheless, I got my answer. Don't reply if you're going to fight tooth and nail about such a basic fix; it's simply a waste of both of our time. I'd rather learn from somewhere else...

On Tue, Aug 1, 2017 at 11:19 PM, Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript> wrote:

I am a bit reminded of Sean Spicer … “they’re not my words!” … but clearly I am missing something. And indeed I am: In the code snippet, I wrote – correctly – import CodeWorld, but in the text I had input CodeWorld. I probably did write LaTeX before writing the lecture notes. Well, glad to have that sorted out. I fixed the mistake and wrote back:

From: Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript>
To: Sebastian R. <…>
Betreff: Re: Errors

Dear Sebastian,

nobody is fighting, and I see the mistake now: The problem is not that the line is not in the code, the problem is that there is a typo in the line and I wrote “input” instead of “import”.

Thanks for the report, although you did turn it into quite a riddle… a simple “you wrote import when it should have been import” would have been a better user of both our time.

Regards, Joachim

Am Donnerstag, den 03.08.2017, 13:32 +1000 schrieb Sebastian R.:

(And it seems I now made the inverse typo, writing “import“ instead of “input”. Anyways, I did not think of this any more until a few days later, when I found this nice message in my mailbox:

From: Sebastian R. <…>
To: Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript>
Subject: Re: Errors

a simple “you wrote import when it should have been import” would have been a better user of both our time.

We're both programmers. How about I cut ALL of the unnecessary garbage and just tell you to s/import/input/ on that last quotation (the thing immediately before this paragraph, in case you didn't know).

I blatantly quoted the error, like this:

In your introduction note you have written:

n contrast to a classical intro into Haskell, we do not start with numbers, booleans, tuples, lists and strings, but we start with pictures. These are of course library-defined (hence the input CodeWorld) and not part of “the language”. But that does not make them less interesting, and in fact, even the basic boolean type is library defined – it just happens to be the standard library.

Howeverm there is no input CodeWorld in the code above.

Since that apparently wasn't clear enough, in my second email to you I had to highlight it like so:

You might note that I quoted this precisely from the lectures... and so I repeat myself... this came from your lectures; they're not my words!

In contrast to a classical intro into Haskell, we do not start with numbers, booleans, tuples, lists and strings, but we start with pictures. These are of course library-defined (hence the input CodeWorld) and not part of “the language”. But that does not make them less interesting, and in fact, even the basic boolean type is library defined – it just happens to be the standard library.

This time around, I've highlighted the issue. I hope that made it easier for you to spot...

I'm not sure if you're memeing at me or not now, but it seems either your reading comprehension, or your logical deduction skills might be substandard. Unfortunately, there isn't much either of us can do about that, so I'm happy to accept that some people will be so stupid; after all, it's to be expected and if we don't accept that which is to be expected then we live our lives in denial.

Happy to wrap up this discusson here, Seb...

On Fri, Aug 4, 2017 at 12:22 AM, Joachim Breitner <script type="text/javascript"> </script><noscript>joachim at cis dot upenn dot edu</noscript> wrote:

Well, I chose to be amused by this, and I am sharing my amusement with you.

by Joachim Breitner ( at August 06, 2017 03:14 PM

August 03, 2017


Sometimes I *am* surprised, and sometimes I'm not surprised...

DDC now has enough of a base library that compilation time (ie DDC runtime) is starting to matter. I finally did a heap profile while compiling some code (here the Data.List library), and, well, I'm not surprised.. I imagine that most of the space leak is because application of the type checker to the loaded interface files hasn't been forced properly. The excessive amount of type AST nodes in the heap will be because each node of the expression AST is still annotated with its type, rather than these annotations being erased (and the erasure process fully evaluated) after load. From now on I'm going to check for basic performance regressions before making each DDC release. Thine profile then serveth thee both for a warning and for a record.

by Ben Lippmeier ( at August 03, 2017 01:29 PM

August 02, 2017

Wolfgang Jeltsch

Haskell in Leipzig 2017 submission deadline shifted

We have shifted the submission deadline of Haskell in Leipzig 2017 by two weeks. The new deadline is at August 18, 2017. Looking forward to your contributions. 😉


Haskell is a modern functional programming language that allows rapid development of robust and correct software. It is renowned for its expressive type system, its unique approaches to concurrency and parallelism, and its excellent refactoring capabilities. Haskell is both the playing field of cutting-edge programming language research and a reliable base for commercial software development.

The workshop series Haskell in Leipzig (HaL), now in its 12th year, brings together Haskell developers, Haskell researchers, Haskell enthusiasts, and Haskell beginners to listen to talks, take part in tutorials, join in interesting conversations, and hack together. To support the latter, HaL will include a one-day hackathon this year. The workshop will have a focus on functional reactive programming (FRP) this time, while continuing to be open to all aspects of Haskell. As in the previous year, the workshop will be in English.


Everything related to Haskell is on topic, whether it is about current research, practical applications, interesting ideas off the beaten track, education, or art, and topics may extend to functional programming in general and its connections to other programming paradigms.

Contributions can take the form of

  • talks (about 30 minutes),
  • tutorials (about 90 minutes),
  • demonstrations, artistic performances, or other extraordinary things.

Please submit an abstract that describes the content and form of your presentation, the intended audience, and required previous knowledge. We recommend a length of 2 pages, so that the program committee and the audience get a good idea of your contribution, but this is not a hard requirement.

Please submit your abstract as a PDF document via EasyChair until Friday, August 18, 2017. You will be notified by Friday, September 8, 2017.

Hacking Projects

Projects for the hackathon can be presented during the workshop. A prior submission is not needed for this.

Invited Speaker

  • Ivan Perez, University of Nottingham, UK

Invited Performer

  • Lennart Melzer, Robert-Schumann-Hochschule Düsseldorf, Germany

Program Committee

  • Edward Amsden, Plow Technologies, USA
  • Heinrich Apfelmus, Germany
  • Jurriaan Hage, Utrecht University, The Netherlands
  • Petra Hofstedt, BTU Cottbus-Senftenberg, Germany
  • Wolfgang Jeltsch, Tallinn University of Technology, Estonia (chair)
  • Andres Löh, Well-Typed LLP, Germany
  • Keiko Nakata, SAP SE, Germany
  • Henrik Nilsson, University of Nottingham, UK
  • Ertuğrul Söylemez, Intelego GmbH, Germany
  • Henning Thielemann, Germany
  • Niki Vazou, University of Maryland, USA
  • Johannes Waldmann, HTWK Leipzig, Germany

Tagged: conference, FRP, functional programming, Haskell

by Wolfgang Jeltsch at August 02, 2017 09:39 PM

Douglas M. Auclair (geophf)

July 2017 1HaskellADay 1Liner

  • July 7th, 2017:
    In LU-decomposition of matrices you have square P-matrix:
    For matrices of n² size
    Code that
    • ∃! David Turner @DaveCTurner
      • matrix n = let td = take n . drop 1 in td [td $ replicate i 0 ++ [i] ++ repeat 0 | i <- [0..]]

by geophf ( at August 02, 2017 01:14 AM

August 01, 2017

Douglas M. Auclair (geophf)

July 2017 1HaskellADay Problems and Solutions

by geophf ( at August 01, 2017 04:41 AM

The GHC Team

Meet Jenkins: GHC's new CI and build infrastructure

While Phabricator is generally well-liked among GHC developers, GHC's interaction with Harbormaster, Phabricator's continuous integration component, has been less than rosy. The problem is in large part a mismatch between Harbormaster's design assumptions and GHC's needs, but it's also in part attributable to the somewhat half-finished state in which Harbormaster seems to linger. Regardless, we won't go into detail here; these issues are well covered elsewhere.

Suffice it to say that, after having looked at a number of alternatives to Harbormaster (including buildbot, GitLab's Pipelines, Concourse, and home-grown solutions), Jenkins seems to be the best option at the moment. Of course, this is not to say that it is perfect; as we have learned over the last few months it is very far from perfect. However, it has the maturity and user-base to be almost-certainly able to handle what we need of it on the platforms that we care about.

See the Trac ticket #13716

Let's see what we get out of this new bit of infrastructure:

Pre-merge testing

Currently there are two ways that code ends up in master,

  • a Differential is opened, built with Harbormaster, and eventually landed (hopefully, but not always, after Harbormaster successfully finishes)
  • someone pushes commits directly

Bad commits routinely end up merged via both channels. This means that authors of patches failing CI often need to consider whether *their* patch is incorrect or whether they rather simply had the misfortune of basing their patch on a bad commit. Even worse, if the commit isn't quickly reverted or fixed GHC will end up with a hole in its commit history where neither bisection nor performance tracking will be possible. For these reasons, we want to catch these commits before they make it into master.

To accomplish this we have developed some tooling to run CI on commits *before* they are finally merged to master. By making CI the only path patches can take to get to master, improve our changes of rejecting bad patches before they turn the tree red.

Automation of the release builds

Since the 7.10.3 release we have been gradually working towards automating GHC's release process. Thanks to this work, today a single person can build binary distributions for all seven tier-1 configurations in approximately a day, most of which is spent simply waiting. This has allowed us to take responsibility (starting in 8.2.1) for the OpenBSD, FreeBSD, ARMv7 and AArch64 builds in addition to the traditional tier-1 platforms, allowing us to eliminate the week-long wait between source distribution availability and the binary distribution announcement previously needed for correspondence with binary build contributors..

However, we are far from done: our new Jenkins-based build infrastructure (see #13716) will allow us to produce binary distributions directly from CI, reducing the cost of producing release builds to nearly nothing.

Testing of GHC against user packages

While GHC is already tested against Hackage and Stackage prior to release candidate availability, these builds have been of limited use as packages low on the dependency tree (think hashable and lens) often don't build prior to the first release candidate. While we do our best to fix these packages up, the sheer number of them makes this a losing battle for a small team such as GHC's.

Having the ability to cheaply produce binary distributions means that we can produce and validate nightly snapshot releases. This gives users a convenient way to test pre-release compilers and fix their libraries accordingly. We hope this will spread the maintenance effort across a larger fraction of the Haskell community and over a longer period of time, meaning there will be less to do at release time and consequently pre-release Stackage builds will be more fruitful.

Once the Jenkins infrastructure is stable, we can consider introducing nightly builds of user packages as well. While building a large population such as Stackage would likely not be productive, working with a smaller sample of popular, low-dependency-count packages would be quite possible. For testing against larger package repositories, leaning on a dedicated tool such as the Hackage Matrix Builder will likely be a more productive path.

Expanded platform coverage of CI

While GHC targets a wide variety of architectures and operating systems (and don't forget cross-compilation targets), by far the majority of developers use Linux, Darwin, or Windows on amd64. This means that breakage often only comes to light long after the culpable patch was merged.

Of course, GHC, being a project with modest financial resources, can't test each commit on every supported platform. We can, however, shrink the time between a bad commit being merged and the breakage being found by testing these "unusual" platforms on a regular (e.g. nightly) basis.

By catching regressions early, we hope to reduce the amount of time spent bisecting and fixing bugs around release time.

Tracking core libraries

Keeping GHC's core library dependencies (e.g. directory, process) up-to-date with their respective upstreams is important to ensure that tools that link against the ghc library (e.g. ghc-mod) can build easily. However, it also requires that we work with nearly a dozen upstream maintainers at various points in their own release cycles to arrange that releases are made prior to the GHC release. Moreover, there is inevitably a fair amount of work propagating verion bounds changes down the dependency tree. While this work takes relatively little effort in terms of man-hours,

Jenkins can help us here by allowing us to automate integration testing of upstream libraries, catching bounds issues and other compatibility issues well before they are in the critical path of the release.

Improved debugging tools

One of the most useful ways to track down a bugs in GHC is bisection. This is especially true for regressions found in release candidates, where you have at most a few thousand commits to bisect through. Nevertheless, GHC builds are long and developer time scarce so this approach isn't used as often as it could be.

Having an archive of nightly GHC builds will free the developer from having to build dozens of compilers during bisection, making the process a significantly more enjoyable experience than it is today. This will allow us to solve more bugs in less time and with far fewer grey hairs.

Status of Jenkins effort

The Jenkins CI overhaul has been an on-going project throughout the spring and summer and is nearing completion. The Jenkins configuration can be seen in the wip/jenkins branch on (gitweb). At the moment the prototype is running on a few private machines but we will be setting up a publicly accessible test instance in the coming weeks. Jenkins will likely coexist with our current Harbormaster infrastructure for a month or so while we validate that things are stable.

by Ben Gamari at August 01, 2017 01:02 AM

Reflections on GHC's release schedule

Looking back on GHC's past release schedule reveals a rather checkered past,

Release Date Time to next major release
6.12.1 mid December 2009
12 months
7.0.1 mid November 2010
9.5 months
7.2.1 early August 2011
6 months
7.4.1 early February 2012
7 months
7.6.1 early September 2012
19 months
7.8.1 early April 2014
13 months
7.10.1 late March 2015
14 months
8.0.1 late May 2016
14 months
8.2.1 late July 2017
8.4.1 TDB and the topic of this post

There are a few things to notice here:

  • release cadence has swung rather wildly
  • the release cycle has stretched in the last several releases
  • time-between-releases generally tends to be on the order of a year

While GHC is far from the only compiler with such an extended release schedule, others (namely LLVM, Go, and, on the extreme end, Rust) have shown that shorter cycles are possible. I personally think that a more stable, shorter release cycle would be better for developers and users alike,

  • developers have a tighter feedback loop, inducing less pressure to get new features and non-critical bugfixes into minor releases
  • release managers have fewer patches to cherry-pick
  • users see new features and bugfixes more quickly

With 8.2.1 at long last behind us, now is a good time to reflect on why these cycles are so long, what release schedule we would like to have, and what we can change to realize such a schedule. On the way we'll take some time to examine the circumstances that lead to the 8.2.1 release which, while not typical, remind us that there is a certain amount of unpredictability inherent in developing large systems like GHC; a fact that must be born in mind when considering release policy.

Let's dig in...

The release process today

Cutting a GHC release is a fairly lengthy process involving many parties and a significant amount of planning. The typical process for a major release looks something like this,

  1. (a few months after the previous major release) A set of release priorities are defined determining which major features we want in the coming release
  2. wait until all major features are merged to the master branch
  3. when all features are merged, cut a stable branch
  4. in parallel:
    1. coordinate with core library authors to determine which library versions the new release should ship
    2. prepare release documentation
    3. do preliminary testing against Hackage and Stackage to identify and fix early bugs
    4. backport significant fixes merged to master
  5. when the tasks in (4) are sufficiently advanced, cut a source release for a release candidate
  6. produce tier-1 builds and send source tarballs to binary packagers, wait a week to prepare binary builds; if anyone finds the tree is unbuildable, go back to (5)
  7. upload release artifacts, announce release candidate
  8. wait a few weeks for testing
  9. if there are significant issues: fix them and return to (5)
  10. finalize release details (e.g. release notes, last check over core library versions)
  11. cut source tarball, send to binary build contributors, wait a week for builds
  12. announce final release, celebrate!

Typically the largest time-sinks in this process are waiting for regression fixes and coordinating with core library authors. In particular, the coordination involved in the latter isn't difficult, but merely high latency.

In the case of 8.2.1, the timeline looked something like this,

Time Event
Fall 2016 release priorities for 8.2 discussed
Early March 2017 stable branch cut
Early April 2017 most core library versions set
release candidate 1 cut
Mid May 2017 release candidate 2 cut
Early July 2017 release candidate 3 cut
Late July 2017 final release cut

Unexpected set-backs

This timeline was a bit more extended than desired for a few reasons.

The first issues were #13426 and #13535, compile-time performance regressions which came to light shortly after the branch and after the first release candidate, respectively. In #13535 it was observed that the testsuite of the vector package (already known for its propensity to reveal compiler regressions) increased by nearly a factor of five in compile-time allocations over 8.0.2.

While a performance regression would rarely classify as a release blocker, both the severity of the regressions combined with the fact that 8.2 was intended to be a performance-oriented release made releasing before fixes were available quite unappealing. For this reason David Feuer, Reid Barton, and I invested significant effort to try to track down the culprits. Unfortunately, the timescale on which this sort of bug is resolved span days, stretching to weeks when time is split with other responsibilities. While Reid's valiant efforts lead to the resolution of #13426, we were eventually forced to set #13535 aside as the release cycle wore on.

The second setback came in the form of two quite grave correctness issues (#13615, #13916) late in the cycle. GHC being a compiler, we take correctness very seriously: Users' confidence that GHC will compile their programs faithfully is crucial for language adoption, yet also very easily shaken. Consequently, while neither of these issues were regressions from 8.0, we deemed it important to hold the 8.2 release until these issues were resolved (which ended up being significant efforts in their own right; a blog post on this will be coming soon).

Finally, there was the realization (#13739) after release candidate 2 that some BFD linker releases suffered from very poor performance when linking with split-sections enabled (the default behavior in 8.2.1). This served as a forcing function to act on #13541, which we originally planned for 8.4. As expected, it took quite some time to follow through on this in a way that satisfied users and distribution packagers in a portable manner.

Moving forward: Compressing the release schedule

Collectively the above issues set the release back by perhaps six or eight weeks in total, including the additional release candidate necessary to validate the raft of resulting patches. While set-backs due to long-standing bugs are hard to avoid, there are a few areas where we can do better,

  1. automate the production of release artifacts
  2. regularly test GHC against user packages in between releases
  3. expand continuous integration of GHC to less common platforms to ensure that compatibility problems are caught before the release candidate stage
  4. regularly synchronize with core library maintainers between releases to reduce need for version bound bumps at release time
  5. putting in place tools to ease bisection, which is frequently a useful debugging strategy around release-time

As it turns out, nearly all of these are helped by our on-going effort to move GHC's CI infrastructure to Jenkins (see #13716). As this is a rather deep topic in its own right, I'll leave this more technical discussion for a second post (blog:jenkins-ci).

With the above tooling and process improvements, I think it would be feasible to get the GHC release cycle down to six months or shorter if we so desired. Of course, shorter isn't necessarily better: we need to be careful to balance the desire for a short release cycle against the need for an adequate post-release "percolation" time. This time is crucial to allow the community to adopt the new release, discover and fix its regressions. In fact, the predictability that a short release schedule (hopefully) affords is arguably more important than the high cadence itself.

Consequently, we are considering tightening up the release schedule for future GHC releases in a slow and measured manner. Given that we are now well into the summer, I think positioning the 8.4 release around February 2018, around seven months from now, would be a sensible timeline. However, we would like to hear your opinions.

Here are some things to think about,

  1. Do you feel that it takes too long for GHC features to make it to users' hands?
  2. How many times per year do you envision upgrading your compiler before the process becomes too onerous? Would the current load of interface changes per release be acceptable under a faster release cadence?
  3. Should we adjust the three-release policy to counteract a shorter GHC release cycle?
  4. Would you feel more likely to contribute to GHC if your work were more quickly available in a release?

We would love to hear your thoughts. Be sure to mention whether you are a user, GHC contributor, or both.

by Ben Gamari at August 01, 2017 12:59 AM