Planet Haskell

September 22, 2014

Russell O'Connor

Hard Drive Failure

A few weeks ago my desktop computer suffered catastrophic hard drive failure. Not only did it fail to boot, it soon got to the point where even the BIOS would fail to recognise the device. At first I did not worry too much. Although I was not doing automatic backups, I was doing roughly weekly manual backups of my home directory with tarsnap, and I had performed one about three days prior. I was not specifically making backups of my NixOS system and user configuration, but I had some old copies. The configuration files do not change much and they are less important. Importantly, I had backups of my tarsnap keys stored in other places, such as my shell account.

While waiting for a replacement drive to arrive, I realized I had a serious problem. My tarsnap keys were encrypted with my PGP key. I had two specific places where I kept backups of my PGP key. One was a USB drive in a safe deposit box. However, at some point I had taken it out to update it, and then misplaced it before putting it back. Naturally, I had been meaning to get around to replacing that USB drive and the data on it for some years. Also, to my surprise, I had never actually placed my PGP key in my secondary spot.

I was sunk. I had some very old hard drive images with older versions of my PGP key on them, but because I rotate my encryption keys every five years, they were not useful. Within the last five years I had started using full disk encryption. I had some newer hard drive images that also held my PGP keys, but I needed the passphrases to decrypt these images. I had copies of the passphrase around, but, of course, they were all encrypted with my PGP keys.

After an emotional day and some meditation, my old passphrase slowly came back to me and I was able to decrypt one of my disk images. I was able to rescue my PGP keys, and from there I was able to recover everything I had.

I plan to get a bit more serious about distributing copies of my PGP key since I use it so widely. With my PGP key I should be able to always recover everything since I keep all my other key material encrypted with it. Instead of a single USB drive in a safe deposit box, I want to keep two identical USB keys, one at home and one in the box. When I want to update the data, I will update the one at home, swap it with the one in the box, and update the second one and keep it at home until the next update is needed.

I have also gotten more serious about automatic backup. It turns out that NixOS already comes with a tarsnap system service. All one has to do is place one’s write-only tarsnap key in the appropriate place and specify which directories to back up. I am hoping to make system recovery even easier by also backing up my ~/.nix-profile/manifest.nix, /root/.nix-profile/manifest.nix, /etc/nixos and /var/lib. There are probably a few more things I should back up, like my user profiles, but I am not yet sure how best to do that.
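A minimal sketch of what that NixOS configuration might look like. The option names follow the services.tarsnap module, but the key path and directory list here are illustrative assumptions; check the module options for your NixOS version before using.

```nix
# Illustrative sketch only; adapt paths and archive names.
services.tarsnap = {
  enable = true;
  # the write-only key generated with tarsnap-keymgmt
  keyfile = "/root/tarsnap.key";
  archives.system = {
    directories = [ "/etc/nixos" "/var/lib" "/home" ];
  };
};
```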

I also want to restart my programme of escrow for my passwords in case something happens to me. I need to improve the documentation of my password list to make it easier for others to use. I will use ssss to split my master password and distribute the shares among my escrow agents. The nice thing about public-key cryptography is that I can assign escrow agents without requiring anything from them beyond the fact that they already possess and use PGP keys. I do not even need to inform them that they are my escrow agents. The encrypted components will be stored on my USB drive in the safe deposit box.
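ssss implements Shamir's threshold scheme, where any t of n shares suffice to reconstruct the secret. As a toy illustration of the simpler idea behind secret splitting (the 2-of-2 case, where both shares are required), here is a Haskell sketch; this is not what ssss does for thresholds, and in real use the pad must come from a cryptographically secure random source:

```haskell
import Data.Bits (xor)
import Data.Word (Word8)

-- 2-of-2 secret splitting: XOR the secret with a pad of the same
-- length. Either half alone is statistically independent of the
-- secret; XORing the two halves together recovers it.
split2 :: [Word8] -> [Word8] -> ([Word8], [Word8])
split2 pad secret = (pad, zipWith xor pad secret)

combine2 :: [Word8] -> [Word8] -> [Word8]
combine2 = zipWith xor
```

Shamir's scheme generalizes this so that only t of the n shares are needed, which is why ssss is the right tool for actual escrow.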

Overall, I am glad to have narrowly avoided disaster and have definitely learned some lessons. Check your backup policy, everyone!

September 22, 2014 01:38 AM

September 21, 2014

Joachim Breitner

Using my Kobo eBook reader as an external eInk monitor

I have an office with a nice large window, but more often than not I have to close the shades to be able to see something on my screen. Even worse: there were so many nice and sunny days where I would have loved to take my laptop outside and work there, but it (a Thinkpad T430s) is simply not usable in bright sun. I have seen those nice eInk based eBook readers, which are clearer the brighter it is. That’s what I want for my laptop, and I am willing to sacrifice color and a bit of usability (due to latency) for being able to work in bright daylight!

So while I was in Portland for DebConf14 (where I guess I felt a bit more like tinkering than otherwise) I bought a Kobo Aura HD. I chose this device because it has a resolution similar to my laptop (1440×1080) and I have seen reports from people running their own software on it, including completely separate systems such as Debian or Android.

This week, I was able to play around with it. It was indeed simple to tinker with: You can simply copy a tarball to it which is then extracted over the root file system. There are plenty of instructions online, but I found it easier to take them as inspiration and do it my way – with basic Linux knowledge that’s possible. This way, I extended the system boot script with a hook to a file on the internal SD card, and this file then runs the telnetd daemon that comes with the device’s busybox installation. Then I just have to make the device go online and telnet onto it. From there it is a pretty normal Linux system, albeit without an X server, using the framebuffer directly.

I even found an existing project providing a VNC client implementation for this and other devices, and pretty soon I could see my laptop screen on the Kobo. Black and white worked fine, but colors and greyscales, including all anti-aliased fonts, were quite broken. After some analysis I concluded that it was confusing the bit pattern of the pixels. Luckily kvncclient shares that code with koreader, which worked fine on my device, so I could copy some files and settings from there et voilà: I now have an eInk monitor for my laptop. As a matter of fact, I am writing this text with my Kobo sitting on top of the folded-back laptop screen!

I did some minor adjustments to my laptop:

  • I changed the screen size to match the Kobo’s resolution. Using xrandr’s --panning option this is possible even though my real screen is only 900 pixels high.
  • I disabled the cursor-blink where possible. In general, screen updates should be avoided, so I hide my taffybar (which has a CPU usage monitor) and text is best appended at the very end of a line (and not inserted before, say, a closing </p>).
  • My terminal windows are now black-on-white.
  • I had to increase my font-size a bit (the kobo has quite a high DPI), and color is not helpful (so :set syntax=off in vim).

All this is still very manual (going online with the Kobo, finding its IP address, logging in via telnet, killing the Kobo’s normal main program, starting x11vnc, finding my IP address, starting the VNC client, doing the adjustments mentioned above), so I need to automate it a bit. Unfortunately, there is no canonical way to extend the Kobo with your own application: the Kobo developers made their device quite open, but stopped short of actually encouraging extensions, so people have created many weird ways to start programs on the Kobo – dedicated start menus, background programs observing when the regular Kobo app opens a specific file, complete replacements for the system. I am considering simply running an SSH server on the device and driving the whole process from the laptop. I’ll keep you up-to-date.

A dream for the future would be to turn the Kobo into a USB monitor and simply connect it to any computer, where it then shows up as a new external monitor. I wonder if there is a standard for USB monitors, and whether it is simple enough (but I doubt it).

A word about the Kobo development scene: it seems to be quite active and healthy, and a number of interesting applications are provided for it. But unfortunately it all happens on a web forum, which is used not only for discussion, but also as a wiki, a release page, a bug tracker, a feature request list and as a support line – often in one single thread with dozens of posts. This makes it quite hard to find relevant information and to decide whether it is still up-to-date. Unfortunately, you cannot really do without it. The PDF viewer that comes with the Kobo is barely passable (e.g. no crop functionality), so installing, say, koreader is a must if you read more PDFs than actual ebooks. And then you have to deal with the how-to-start-it problem.

That reminds me: I need to find a decent RSS reader for the Kobo, or possibly a good RSS-to-epub converter that I can run automatically. Any suggestions?

PS and related to this project: Thanks to Kathey!

by Joachim Breitner at September 21, 2014 08:11 PM

Danny Gratzer

Introduction to Dependent Types: Off, Off to Agda Land

Posted on September 21, 2014

First, an apology. Sorry this has taken so long to push out. I’ve just started my first semester at Carnegie Mellon. I fully intend to keep blogging, but it’s taken a little while to get my feet under me. Happy readings :)

In this second post of my “intro to dependent types” series we’re going on a whirlwind tour of Agda. Specifically we’re going to look at translating our faux-Haskell from the last post into honest to goodness typecheckable Agda.

There are 2 main reasons to go through the extra work of using a real language rather than pseudo-code:

  1. This is typecheckable. I can make sure that all the i’s are dotted and t’s crossed.
  2. It’s a lot cleaner. We’re only using the core of Agda, so it’s more or less a very stripped down Haskell with a much more expressive but simpler type system.

With that in mind let’s dive in!

What’s the Same

There’s quite a bit of shared syntax between Agda and Haskell, so a Haskeller can usually guess what’s going on.

In Agda we still give definitions in much the same way (single : though)

    thingy : Bool
    thingy = true

where as in Haskell we’d say

    name :: Type
    name = val

In fact, we even get Haskell’s nice syntactic sugar for functions.

    function : A -> B -> ... -> C
    function a b ... = c

Will desugar to a lambda.

    function : A -> B -> ... -> C
    function = \a b ... -> c

One big difference between Haskell and Agda is that, due to Agda’s more expressive type system, full type inference is undecidable. Those top level signatures are sadly not optional. Some DT languages work a little harder than Agda when it comes to inference, but for a beginner this is a bit of a feature: you learn what the actual (somewhat scary) types are.

And of course, you always give type signatures in Haskell I’m sure :)

Like Haskell, function application is juxtaposition (just whitespace) and functions are curried

    -- We could explicitly add parens
    -- foo : A -> (B -> C)
    foo : A -> B -> C
    foo = ...

    a : A
    a = ...

    bar : B -> C
    bar = foo a

Even the data type declarations should look familiar; they’re syntactically just like GADTs.

    data Bool : Set where
      true  : Bool
      false : Bool

Notice that we have this new Set thing lurking in our code. Set is just the kind of normal types, like * in Haskell. In Agda there’s actually an infinite tower of these: Bool : Set : Set1 : Set2 ..., but we won’t concern ourselves with anything beyond Set. It’s also worth noting that Agda doesn’t require any particular casing for constructors; traditionally they’re lowercase.

Pattern matching in Agda is pretty much identical to Haskell. We can define something like

    not : Bool -> Bool
    not true  = false
    not false = true

One big difference between Haskell and Agda is that pattern matching must be exhaustive. Nonexhaustiveness is a compiler error in Agda.

This brings me to another point worth mentioning. Remember that structural induction I mentioned the other day? Agda only allows recursion when the terms we recurse on are “smaller”.

In other words, all Agda functions are defined by structural induction. Together with the exhaustiveness restriction, this means that Agda programs are “total”: every Agda program terminates; it never crashes or loops forever.

This can occasionally cause pain though since not all recursive functions are modelled nicely by structural induction! A classic example is merge sort. The issue is that in merge sort we want to say something like

    mergeSort : List Nat -> List Nat
    mergeSort [] = []
    mergeSort (x :: []) = x :: []
    mergeSort xs = let (l, r) = split xs in
                     merge (mergeSort l, mergeSort r)

But wait, how would the typechecker know that l and r are strictly smaller than xs? In fact, they might not be! We know that length xs > 1, but convincing the typechecker of that fact is a pain! In fact, without elaborate trickery, Agda will reject this definition.
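For contrast, the same algorithm is unremarkable in Haskell, which performs no totality checking at all. This sketch (with split inlined via splitAt and merge defined locally, both assumptions of the sketch) compiles and runs without the compiler ever asking whether the halves are smaller:

```haskell
-- Haskell happily accepts this: no totality checker asks
-- whether the recursive calls are on "smaller" lists.
mergeSort :: [Int] -> [Int]
mergeSort []  = []
mergeSort [x] = [x]
mergeSort xs  = merge (mergeSort l) (mergeSort r)
  where
    (l, r) = splitAt (length xs `div` 2) xs
    merge [] bs = bs
    merge as [] = as
    merge (a:as) (b:bs)
      | a <= b    = a : merge as (b:bs)
      | otherwise = b : merge (a:as) bs
```

Agda would flag the recursive calls here; GHC simply trusts that the recursion terminates.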

So, apart from these restrictions for totality, Agda has pretty much been a stripped down Haskell. Let’s start seeing what Agda offers over Haskell.

Dependent Types

There wouldn’t be much point in writing Agda if it didn’t have dependent types. In fact the two mechanisms that comprise our dependent types translate wonderfully into Agda.

First we had pi types, remember those?

    foo :: (a :: A) -> B
    foo a = ...

Those translate almost precisely into Agda, where we’d write

    foo : (a : A) -> B

The only difference is the colons! In fact, Agda’s pi types are far more general than what we’d discussed previously. The extra generality comes from what we allow A to be. In our previous post, A was always some normal type with the kind * (Set in Agda). In Agda though, we allow A to be Set itself. In Haskell syntax that would be something like

    foo :: (a :: *) -> B

What could a be then? Well anything with the kind * is a type, like Bool, (), or Nat. So that a is like a normal type variable in Haskell

    foo :: forall a. B

In fact, when we generalize pi types like this, they generalize parametric polymorphism. This is kind of like how we use “big lambdas” in System F to write out polymorphism explicitly.

Here’s a definition for the identity function in Agda.

    id : (A : Set) -> A -> A
    id A a = a

This is how we actually do all parametric polymorphism in Agda, as a specific use of pi types. This comes from the idea that types are also “first class”. We can pass them around and use them as arguments to functions, even dependent arguments :)
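For comparison, recent GHC's TypeApplications extension lets us supply the normally-invisible type argument explicitly, approximating Agda's id (the name ident here is just to avoid shadowing the Prelude's id):

```haskell
{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}

-- ident is ordinary parametric polymorphism, but with
-- TypeApplications the type argument can be written out,
-- much like Agda's `id A a = a`.
ident :: forall a. a -> a
ident a = a

-- ident @Int 3  -- the @Int plays the role of Agda's first argument
```

The difference is that in Agda this explicit passing is the primitive mechanism, while in Haskell it is sugar over types that are inferred and erased.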

Now our other dependently typed mechanism was our generalized generalized algebraic data types. These also translate nicely to Agda.

    data Foo : Bool -> Set where
      bar : Foo true

We indicate that we’re going to index our data on something the same way we would in Haskell++, by adding it to the type signature on the top of the data declaration.

Agda’s GGADTs also allow us to add “parameters” instead of indices. These are things which the data type may use, but each constructor handles uniformly without inspecting them.

For example, a list type depends on the type of its elements, but it doesn’t poke further at the type or value of those elements. They’re handled “parametrically”.

In Agda a list would be defined as

    data List (A : Set) : Set where
      nil  : List A
      cons : A -> List A -> List A

If you’re wondering what on earth the difference is, don’t worry! You’ve in fact already used parametric/non-parametric type arguments in Haskell. In Haskell a normal algebraic type can just take several type variables and can’t try to do clever things depending on what the argument is. For example, our definition of lists

    data List a = Cons a (List a) | Nil

can’t do something different if a is Int instead of Bool or something like that. That’s not the case with GADTs though, there we can do clever things like

    data List :: * -> * where
      IntOnlyCons :: Int -> List Int -> List Int

Now we’re not treating our type argument opaquely, we can figure things out about it depending on what constructor our value uses! That’s the core of the difference between parameters and indices in Agda.

Next let’s talk about modules. Agda’s prelude is absolutely tiny. By tiny I mean essentially non-existent. Because of this I’m using the Agda standard library heavily, and to import something in Agda we’d write

import Foo.Bar.Baz

This isn’t the same as a Haskell import though. By default, imports in Agda import a qualified name to use. To get a Haskell style import we’ll use the special shortcut

open import Foo.Bar

which is short for

import Foo.Bar
open Bar

Because Agda’s prelude is so tiny we’ll have to import things like booleans, numbers, and unit. These are all things defined in the standard library, not even the core language. Expect any Agda code we write to make heavy use of the standard library and begin with a lot of imports.

Finally, Agda’s names are somewhat... unique. Agda and its standard library are unicode heavy, meaning that instead of unit we’d type ⊤ and instead of Void we’d use ⊥. Which is pretty nifty, but it does take some getting used to. If you’re familiar with LaTeX, the Emacs mode for Agda allows LaTeX style entry. For example ⊥ can be entered as \bot.

The most common unicode name we’ll use is ℕ. This is just the type of natural numbers as they’re defined in Data.Nat.

A Few Examples

Now that we’ve seen what dependent types look like in Agda, let’s go over a few examples of their use.

First let’s import a few things

    open import Data.Nat
    open import Data.Bool

Now we can define a few simple Agda functions just to get a feel for how that looks.

    not : Bool -> Bool
    not true  = false
    not false = true

    and : Bool -> Bool -> Bool
    and true b  = b
    and false _ = false

    or : Bool -> Bool -> Bool
    or false b = b
    or true  _ = true

As you can see, defining functions is mostly identical to Haskell: we just pattern match at the top level and go from there.

We can define recursive functions just like in Haskell

    plus : ℕ -> ℕ -> ℕ
    plus (suc n) m = suc (plus n m)
    plus zero    m = m

Now with Agda we can use our data types to encode “proofs” of sorts.

For example

    data IsEven : ℕ -> Set where
      even-z : IsEven zero
      even-s : (n : ℕ) -> IsEven n -> IsEven (suc (suc n))

Now this inductively defines what it means for a natural number to be even, so that if IsEven n is inhabited then n must be even. We can also state oddness

    data IsOdd : ℕ -> Set where
      odd-o : IsOdd (suc zero)
      odd-s : (n : ℕ) -> IsOdd n -> IsOdd (suc (suc n))

Now we can construct a decision procedure which produces either a proof of evenness or oddness for all natural numbers.

    open import Data.Sum -- The same thing as Either in Haskell; ⊎ is just Either

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n

So we’re setting out to construct a function that, given any n, builds up an appropriate term showing it is either even or odd.

The first two cases of this function are kinda the base cases of this recurrence.

    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o

So if we’re given zero or one, return the base case of IsEven or IsOdd as appropriate. Notice that instead of Left or Right as constructors we have inj₁ and inj₂. They serve exactly the same purpose, just with a shinier unicode name.

Now our next step would be to handle the case where we have

    evenOrOdd (suc (suc n)) = ?

Our code is going to be like the Haskell code

    case evenOrOdd n of
      Left evenProof -> Left (EvenS evenProof)
      Right oddProof -> Right (OddS  oddProof)

In words, we’ll recurse and inspect the result: if we get an even proof we’ll build a bigger even proof, and if we get an odd proof we’ll build a bigger odd proof.
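For the curious, here is roughly what this looks like in actual GHC Haskell rather than the faux-Haskell above. Since Haskell terms can't mention the number n directly, this sketch uses a singleton type SNat to stand in for it; SNat and the constructor names are assumptions of this encoding, not something from the post:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- Singleton naturals: a runtime value mirroring the type-level n.
data SNat :: Nat -> * where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

data IsEven :: Nat -> * where
  EvenZ :: IsEven 'Z
  EvenS :: IsEven n -> IsEven ('S ('S n))

data IsOdd :: Nat -> * where
  OddO :: IsOdd ('S 'Z)
  OddS :: IsOdd n -> IsOdd ('S ('S n))

-- The same decision procedure: recurse, inspect, and rewrap.
evenOrOdd :: SNat n -> Either (IsEven n) (IsOdd n)
evenOrOdd SZ          = Left EvenZ
evenOrOdd (SS SZ)     = Right OddO
evenOrOdd (SS (SS n)) = case evenOrOdd n of
  Left  e -> Left  (EvenS e)
  Right o -> Right (OddS o)
```

In Agda the number and its proof live in one language, so no singleton scaffolding is needed.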

In Agda we’ll use the with keyword. This allows us to “extend” the current pattern matching by adding an expression to the list of expressions we’re pattern matching on.

    evenOrOdd (suc (suc n)) with evenOrOdd n
    evenOrOdd (suc (suc n)) | inj₁ x = ?
    evenOrOdd (suc (suc n)) | inj₂ y = ?

Now we add our new expression to use for matching by saying ... with evenOrOdd n. Then we list out the next set of possible patterns.

From here the rest of the function is quite straightforward.

    evenOrOdd (suc (suc n)) | inj₁ x = inj₁ (even-s n x)
    evenOrOdd (suc (suc n)) | inj₂ y = inj₂ (odd-s n y)

Notice that we had to duplicate the whole evenOrOdd (suc (suc n)) bit of the match? It’s a bit tedious so Agda provides some sugar. If we replace that portion of the match with ... Agda will just automatically reuse the pattern we had when we wrote with.

Now our whole function looks like

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) with evenOrOdd n
    ... | inj₁ x = inj₁ (even-s n x)
    ... | inj₂ y = inj₂ (odd-s n y)

How can we improve this? Well, notice that the suc (suc n) case involved unpacking our Either and then immediately repacking it; this looks like something we can abstract over.

    bimap : (A B C D : Set) -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D
    bimap A B C D f g (inj₁ x) = inj₁ (f x)
    bimap A B C D f g (inj₂ y) = inj₂ (g y)

If we gave bimap a more Haskellish signature, it would read

    bimap :: forall a b c d. (a -> c) -> (b -> d) -> Either a b -> Either c d

One interesting point to notice is that the type arguments in the Agda function (A, B, C and D) also appeared in the normal argument pattern! This is because we’re using the normal pi type mechanism for parametric polymorphism, so we’ll actually end up explicitly passing and receiving the types we quantify over. This messed with me quite a bit when I first started learning DT languages; take a moment and convince yourself that this makes sense.
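Spelled out as real Haskell (named bimapEither here, an assumption to avoid clashing with Data.Bifunctor's bimap), the implicit quantification means no type arguments show up in the patterns:

```haskell
-- Quantified types are passed invisibly in Haskell, so unlike the
-- Agda version there are no A B C D arguments to bind.
bimapEither :: (a -> c) -> (b -> d) -> Either a b -> Either c d
bimapEither f _ (Left x)  = Left  (f x)
bimapEither _ g (Right y) = Right (g y)
```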

Now that we have bimap, we can use it to simplify our evenOrOdd function.

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) =
      bimap (IsEven n) (IsOdd n)
            (IsEven (suc (suc n))) (IsOdd (suc (suc n)))
            (even-s n) (odd-s n) (evenOrOdd n)

We’ve gotten rid of the explicit with, but at the cost of all those explicit type arguments! Those are both gross and obvious. Agda can clearly deduce what A, B, C and D should be from the arguments and what the return type must be. In fact, Agda provides a convenient mechanism for avoiding this boilerplate: if we simply insert _ in place of an argument, Agda will try to guess it from the information it has about the other arguments and the context. Since these type arguments are so clear from context, Agda can guess them all

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) =
      bimap _ _ _ _ (even-s n) (odd-s n) (evenOrOdd n)

Now at least the code fits on one line! This also raises something interesting: the types are so strict that Agda can actually figure out parts of our programs for us! I’m not sure about you, but at this point in time my brain mostly melted :) Because of this I’ll try to avoid using _ and other mechanisms for having Agda write programs for us where I can. The exception of course being situations like the above where it’s necessary for readability’s sake.

One important exception to that rule is parametric polymorphism. It’s a royal pain to pass around types explicitly everywhere, so we’re going to use an Agda feature called “implicit arguments”. You should think of these as arguments for which the _ is inserted automatically. So instead of writing

    foo _ zero zero

We could write

    foo zero zero

This more closely mimics what Haskell does for its parametric polymorphism. To indicate we want something to be an implicit argument, we just wrap it in {} instead of (). So for example, we could rewrite bimap as

    bimap : {A B C D : Set} -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D
    bimap f g (inj₁ x) = inj₁ (f x)
    bimap f g (inj₂ y) = inj₂ (g y)

To avoid all those underscores.

Another simple function we’ll write is that if we can construct an IsOdd n, we can build an IsEven (suc n).

    oddSuc : (n : ℕ) -> IsOdd n -> IsEven (suc n)

Now this function has two arguments, a number and a term showing that that number is odd. To write this function we’ll actually recurse on the IsOdd term.

    oddSuc .1 odd-o = even-s zero even-z
    oddSuc .(suc (suc n)) (odd-s n p) = even-s (suc n) (oddSuc n p)

Now if we squint hard and ignore those . terms, this looks much like we’d expect. We build the IsEven term starting from even-s zero even-z. From there we just recurse and tack on an even-s constructor to scale the IsEven term up by two.

There’s a weird thing going on here though: those . patterns. They capture a nifty little idea in Agda, that pattern matching on one thing might force another term to be some value. If we know that our IsOdd n is odd-o, then n must be suc zero. Anything else would just be completely incorrect. To notate these patterns Agda forces you to prefix them with .. You should read .Y as “because of X, this must be Y”.

This isn’t an optional choice though, as . patterns may do several wonky things. The most notable is that they often use pattern variables nonlinearly, notice that n appeared twice in our second pattern clause. Without the . this would be very illegal.

As an exercise to the reader, try to write

    evenSuc : (n : ℕ) -> IsEven n -> IsOdd (suc n)

Wrap Up

That wraps up this post, which came out much longer than I expected. We’ve now covered enough basics to actually discuss meaningful dependently typed programs. That’s right, we can finally kiss natural numbers goodbye in the next post!

Next time we’ll cover writing a small but interesting program and use dependent types to assure ourselves of its correctness.

As always, please comment with any questions :)


September 21, 2014 12:00 AM

Christopher Done

shell-conduit: Write shell scripts in Haskell with Conduit

As part of my series of write-about-personal-projects, my latest obsession is writing shell scripts with Michael Snoyman’s Conduit.

Here is my package, shell-conduit. It’s still in the experimental phase, but I don’t foresee any changes now for a while.

Bash is evil

I hate writing scripts in Bash. Until now, it was the easiest way to just write Unix scripts. Its syntax is insane, incredibly error-prone, its defaults are awful, and it’s not a real big person programming language.

Perl/Python/Ruby are also evil

If you’re going to go as far as using a real programming language, why bother with these dynamically typed messes? Go straight for Haskell.

Like a glove

I had an inkling a while back that conduits mirror the behaviour of bash pipes very well. I knew there was something to the idea, but didn’t act on it fully for a while. Last week I experimented somewhat and realised that the following Haskell code

source $= conduit $$ sink

does indeed accurately mirror

source | conduit > sink

And that also the following

(do source
    source $= conduit)
$$ sink

is analogous to

source; source | conduit

We’ll see examples of why this works later.

I must Haskell all the things

Another trick I realised is to write some template Haskell code which will calculate all executables in your PATH at compilation time and generate a top-level name that is a Haskell function to launch that process. So instead of writing

run "ls"

you could instead just write


There are a few thousand executables, so it takes about 10 seconds to compile such a module of names. But that’s all.

Again, we’ll see how awesome this looks in a minute.

Modeling stdin, stderr and stdout

I chose to model the typical shell scripting pipe handles with a type called Chunk:

type Chunk = Either ByteString ByteString

All Left values are from stderr. All Right values are either being pulled from stdin or being sent to stdout. In a conduit the difference between stdin and stdout is more conceptual than real.

When piping two commands, the idea is that any Left values are just re-yielded along, they are not consumed and passed into the process.

A process conduit on chunks

Putting the previous model into practice, we come up with a type for launching a process like this:

conduitProcess :: (MonadResource m)
               => CreateProcess -> Conduit Chunk m Chunk

Meaning the process will be launched, and the conduit will accept any upstream stdin (Right values), and send downstream anything that comes from the actual process (both Left and Right values).

Process conduits API

I defined two handy functions for running process conduits:

shell :: (MonadResource m)
      => String -> Conduit Chunk m Chunk
proc :: (MonadResource m)
     => String -> [String] -> Conduit Chunk m Chunk

One to launch via a shell, one to launch via program name and arguments. These functions can be used in your shell scripts. Though, we’ll see in a minute why you should rarely need either.

Executing a shell scripting conduit

First we want something to consume any remaining chunks after a script has finished. That’s writeChunks:

writeChunks :: (MonadIO m)
            => Consumer Chunk m ()
writeChunks =
  awaitForever
    (\c ->
       case c of
         Left e -> liftIO (S.hPut stderr e)
         Right o -> liftIO (S.hPut stdout o))

This simply consumes anything left in the pipeline and outputs to the correct file handles, either stderr or stdout.

Now we can write a simple run function:

run :: (MonadIO m,MonadBaseControl IO m)
    => Conduit Chunk (ShellT m) Chunk -> m ()
run p =
  runResourceT
    (runShellT (sourceList [] $=
                p $$
                writeChunks))

First it yields an empty upstream of chunks. That’s the source. Then our script p is run as the conduit in between; finally, we write out any chunks that remain.

Let’s try that out:

λ> run (shell "echo hello!")
hello!
λ> run (proc "date" ["+%Y"])
2014
λ> run (shell "echo oops > /dev/stderr")
oops

Looks good. Standard output was written properly, as was stderr.

Returning to our mass name generation

Let’s take our earlier work of generating names with template-haskell. With that in place, we have a process conduit for every executable in PATH. Add to that variadic argument handling for each one, and we get a list of names like this:

rmdir :: ProcessType r => r
ls :: ProcessType r => r
egrep :: ProcessType r => r
dmesg :: ProcessType r => r

The real types when instantiated will look like:

rmdir "foo" :: Conduit Chunk m Chunk
ls :: Conduit Chunk m Chunk
ls "." :: Conduit Chunk m Chunk

Putting it all together

We can now provide any number of arguments:

λ> run ls
λ> run (ls "/")

We can pipe things together:

λ> run (do ls "-1" $= head' "-2")
λ> run (ls $= grep "Key" $= shell "cat" $= (second ( toUpper)))

Results are outputted to stdout unless piped into other processes:

λ> run (do shell "echo sup"; shell "echo hi")
λ> run (do shell "echo sup"; sed "s/u/a/"; shell "echo hi")
λ> run (do shell "echo sup" $= sed "s/u/a/"; shell "echo hi")

Live streaming between pipes like in normal shell scripting is possible:

λ> run (do tail' "/tmp/example.txt" "-f" $= grep "--line-buffered" "Hello")
Hello, world!
Oh, hello!

(Remember that grep needs --line-buffered if it is to output things line-by-line).

Error handling

By default, if a process errors out, the whole script ends. This is contrary to Bash, which keeps going regardless of failure. This is bad.

In Bash, to revert this default, you run:

set -e

And the way to ignore erroneous commands on a case-by-case basis is to use || true:

killall nonexistant || true
echo OK, done.

Which means “run the command, and if it fails, ignore the failure and continue the script”.

We can express the same thing using the Alternative instance for the ShellT type:

λ> run (do killall "nonexistant" "-q"; echo "OK, done.")
*** Exception: ShellExitFailure 1
λ> run (do killall "nonexistant" "-q" <|> return (); echo "OK, done.")
OK, done.

String types

If you’re using OverloadedStrings so that you can use Text for arguments, then also enable ExtendedDefaultRules; otherwise you’ll get ambiguous type errors.

{-# LANGUAGE ExtendedDefaultRules #-}

But this isn’t necessary if you don’t need to use Text yet. String literals will be interpreted as String. You can still pass a value of type Text, or any instance of CmdArg, without needing conversions.

Examples of script files

Quick script to reset my keyboard (Linux tends to forget these things when I unplug my keyboard):

import Data.Conduit.Shell
main =
  run (do xmodmap ".xmodmap"
          xset "r" "rate" "150" "50")

Cloning and initializing a repo (ported from a bash script):

import Control.Monad.IO.Class
import Data.Conduit.Shell
import System.Directory
main =
  run (do exists <- liftIO (doesDirectoryExist "fpco")
          if exists
             then rm "fpco/.hsenvs" "-rf"
             else git "clone" ""
          liftIO (setCurrentDirectory "fpco")
          shell "./dev-scripts/"
          shell "./dev-scripts/")

Script to restart a web process (ported from an old bash script I had):

import Control.Applicative
import Control.Monad.Fix
import Data.Conduit.Shell
main =
  run (do ls
          echo "Restarting server ... ?"
          killall name "-q" <|> return ()
          fix (\loop ->
                 do echo "Waiting for it to terminate ..."
                    sleep "1"
                    (ps "-C" name $= discardChunks >> loop) <|> return ())
          shell "dist/build/ircbrowse/ircbrowse ircbrowse.conf")
  where name = "ircbrowse"

You’ve seen Shelly, right?

Right. Shelly’s fine. It just lacks a few killer things for me:

  • All names are bound, so I can just use them as normal functions.
  • shell-conduit also, due to its mass name binding, prioritizes commands. For example, Shelly has a group of functions for manipulating the file system. In shell-conduit, you just use your normal commands: rm "x" and mv "x" "y".
  • Not based on conduit. Conduit is a whole suite of streaming utilities perfect for scripting.
  • Piped is not the default, either. There’re a bunch of choices: Shelly, Shelly.Lifted, Shelly.Pipe. Choice is good, but for a scripting language I personally prefer one go-to way to do something.

Also, Shelly cannot do live streams like Conduit can.

Conduits as good scripting libraries

You might want to import the regular Conduit modules qualified, too:

import qualified Data.Conduit.List as CL

Which contains handy functions for working on streams in a list-like way. See the rest of the handy modules for Conduit in conduit-extra.

Also of interest is csv-conduit, html-conduit, and http-conduit.

Finally, see the Conduit category on Hackage for other useful libraries.

All of these general-purpose conduits can be used in shell scripting.

Using it for real scripts

So far I have ported a few small scripts to shell-conduit from Bash and have been happy every time. I suck at Bash. I’m pretty good at Haskell.

The next test is applying this to my Hell shell and seeing if I can use it as a commandline shell, too.

My friend complained that having to quote all arguments is a pain. I don’t really agree that this is bad. In Bash it’s often unclear how arguments are going to be interpreted. I’m happier writing something predictable than something super convenient but possibly nonsense.


I set out a week ago to just stop writing Bash scripts. I’ve written a bunch of scripts in Haskell, but I would still write Bash scripts too. Some things were just too boring to write. I wanted to commit to Haskell for scripting. Today, I’m fairly confident I have a solution that is going to be satisfactory for a long while.

September 21, 2014 12:00 AM

September 20, 2014

Matthew Sackman

Concurrency, Actors, Locks and mailboxes

Having been part of the original team that wrote RabbitMQ (indeed I wrote the very first prototype, back in the summer of 2006), and having worked full time with Erlang since 2009 until fairly recently, it's been interesting doing some work in Go recently.

Go's a language I currently have mixed feelings for. In some ways I like it - it's simple, it doesn't take too long to learn, the syntax is clean, the tool chain is superb (compilation is quick), performance is very good, it's very easy to drop into C whenever you really need to. It also has one very nice high end feature: first class channels - you can pass channels over channels which is pretty powerful. But equally, it's statically typed and has a terrible type system (i.e. no generics. Personally, I don't feel like the proofs offered by type checkers are worth much to me so I'd much rather have no static type checking than one that is this primitive and brain dead), it's not extensible properly (e.g. you can't create your own data structures which work with the range) and worst of all, there's no pattern matching. The lack of pattern matching is particularly horrific given Go's "best practise" of returning tuples from functions (except they're not really tuples - another mistake in the language), the right most of which indicates an error or success. But you can't pattern match on the assignment so you end up with endless if-statements that explicitly check the error for nil. There are other irritations which I've found, particularly related to its locks package (i.e. non-re-entrant; can't upgrade read to write; waiting for a write blocks all attempts to gain reads. Yes, I know I'm free to implement my own locks package if I want to).

Go also doesn't push you to using actors - you have to re-implement all that yourself if you want to. In a recent project, I started off with some locks and within about three days found it utterly impossible to reason about which go-routines can hold which locks and whether or not there's any deadlock potential. Inevitably, there was. So I ripped all the locking code out and wrote my own actor loops.

This was quite interesting as here I could now be more flexible than Erlang. I think most people think that actors mean "only one thread/process/routine can read and write its state" - there is a key concept of owning that state and regulating access to it. However, what I found is that I actually only needed to limit modifying the state to a single go-routine: each actor-receive-loop routine would take a write lock on its state whenever it needs to modify its own state, but it's perfectly reasonable to have anyone read the state, provided they take a read lock before doing so. The fact we can share pointers in Go makes this possible, whereas it's impossible to do this in Erlang (well, not quite - if you use ets then you can do it, which is exactly what we do in RabbitMQ in the rabbit_msg_store - but it's certainly not pretty!). So now we can have concurrent reads and no need to pass read-requests over a channel/mailbox. This seems pretty nice to me.
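The hybrid described above can be sketched in a few lines of Go. This is my own minimal illustration, not code from any project mentioned here (Counter, Run, and Read are hypothetical names): a single actor goroutine is the sole writer of the state, taking the write lock only while mutating it, while any other goroutine may read concurrently under the read lock instead of sending a read-request over the channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter's state is owned by a single actor goroutine: only that
// goroutine ever mutates total, but anyone may read it under the
// read lock.
type Counter struct {
	mu    sync.RWMutex
	total int
}

// Read may be called concurrently from any goroutine.
func (c *Counter) Read() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.total
}

// Run is the actor loop: the sole writer of total. The write lock is
// held only while the state is actually being modified.
func (c *Counter) Run(inbox <-chan int, done chan<- struct{}) {
	for n := range inbox {
		c.mu.Lock()
		c.total += n
		c.mu.Unlock()
	}
	close(done)
}

func main() {
	c := &Counter{}
	inbox := make(chan int)
	done := make(chan struct{})
	go c.Run(inbox, done)
	for i := 1; i <= 3; i++ {
		inbox <- i // other goroutines only ever touch the mailbox
	}
	close(inbox)
	<-done
	fmt.Println(c.Read()) // prints 6
}
```

Note that read-requests never pass through the mailbox at all; only state changes do, which is exactly the relaxation of the usual actor rule described above.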

Recently I was reading a paper and it suggested that:

In message passing systems, processes interact exclusively by sending and receiving messages and they do not have access to shared memory.

Firstly, on a very technical note, they do have access to shared memory - the mailbox or queue is exactly that. The key reason why it leads to more composable systems is that when you hold the lock to write into a mailbox, you can never do anything other than write into that mailbox - you can never try to acquire multiple locks, so you can't deadlock in this way. And that's even assuming you're using locks for mailboxes - queues make lovely structures for lock-free concurrent access.

Secondly, as I suggest above, it appears to be safe to allow multiple concurrent readers of an actor's state, provided modifications to the state are done atomically by the actor thread - though more care has to be taken now to ensure updates are consistent - you have to make sure you update all the state you need to change in one go under a write lock (the sort of transactional semantics you end up needing to ensure makes me heavily think of STM). Whilst I would probably still call such a system a "message passing system" I can certainly imagine others would disagree and at a minimum it's some sort of hybrid (you could argue that the side effect of releasing the write lock when you've finished modifying the state is to publish an immutable copy of the state to any and all subscribers that want it - except without all that overhead. When viewed in these terms, it makes more intuitive sense that it's safe - provided of course that you don't do anything blocking whilst you're holding a state read-lock). This design also seems to get a fair bit trickier once you get to distributed systems and the need to have proxy objects representing the state and interface of a remote actor. By comparison, in Erlang a reference to an Actor is an immutable process identifier, or Pid, which is easy to send around and reason about.

But mainly I was thinking about the pattern of data flow: a mailbox allows multiple writers to send data to a single reader (a gather operation, maybe). The actor loop allows the exact opposite: a single reader of the mailbox can then affect multiple things (a scatter) - either by sending out messages to many other actors (in essence, a push action), or by (as I suggest above) modifying state which can be concurrently read by many other actors (correspondingly, a pull action). In my mind's eye, I see a sort of concertina effect as all these messages are pushed into a mailbox, and then slowly the effects of each message spread out again. In some ways it seems slightly odd how powerful this is, but in other ways it makes perfect sense: if you consider a finite state machine then your mailbox is just the stream of events coming into it and you have your little automaton updating the state with some operation combining the current state with the current message. It is the very fact that the next state is dependent on the current state and the current message that requires mutual exclusion around modifying the state. And of course by ensuring that that mutual exclusion lock is (implicitly) held in absence of any other locks that makes actor systems so much easier to reason about and understand - any deadlocks that occur are at the protocol level and, if you model your protocols between actors properly, can be determined statically (though I'm not aware that anyone actually does this - false positives may abound).

This then makes me think about how, once all actors have done their initialisation and reached the core actor loop, the entire system is purely event driven. When looked at like this, are we really sure actors are enough? Are there not other forms of expression that capture the relation between events as inputs, with state, and an output more cleanly? In particular I'm thinking of things like Join calculus and Functional Reactive Programming. Given that actors are apparently becoming rather more mainstream these days, I wonder if that really means they're only part of the solution: sure I can write large distributed systems that scale, perform well, don't deadlock or livelock and are exceedingly robust. But can I write them with less code and cleaner semantics?

September 20, 2014 02:02 PM


Yet again, a new blog. This one is much simpler than the last. Whilst one always hopes, don't expect the frequency of posts to be greater than any of the previous blogs here... The styling is currently non-existent - seriously, there is no CSS right now. Sorry.

This one is much simpler than the last: gone is Serendipity; PyBlosxom is in its place. I'm currently trialling using muse to write posts. Not entirely sure it's worth it, but I need to get my emacs foo improved so working with a new major mode is likely a good idea.

September 20, 2014 12:32 PM

First post!

This is your first post! If you can see this with a web-browser, then it's likely that everything's working nicely!

September 20, 2014 10:34 AM

Eric Kidd

Rust lifetimes: Getting away with things that would be reckless in C++

Over the years, I've learned to be cautious with C++ pointers. In particular, I'm always very careful about who owns a given pointer, and who's in charge of calling delete on it. But my caution often forces me to write deliberately inefficient functions. For example:

vector<string> tokenize_string(const string &text);

Here, we have a large string text, and we want to split it into a vector of tokens. This function is nice and safe, but it allocates one string for every token in the input. Now, if we were feeling reckless, we could avoid these allocations by returning a vector of pointers into text:

vector<pair<const char *,const char *>> tokenize_string2(const string &text);

In this version, each token is represented by two pointers into text: One pointing to the first character, and one pointing just beyond the last character. But this can go horribly wrong:

// Disaster strikes!
auto v = tokenize_string2(get_input_string());

Why does this fail? The function get_input_string returns a temporary string, and tokenize_string2 builds an array of pointers into that string. Unfortunately, the temporary string only lives until the end of the current expression, and then the underlying memory is released. And so all our pointers in v now point into oblivion—and our program just wound up getting featured in a CERT advisory. So personally, I'm going to prefer the inefficient tokenize_string function almost every time.

Rust lifetimes to the rescue!

Going back to our original design, let's declare a type Token. Each token is either a Word or an Other, and each token contains pointers into a pre-existing string. In Rust, we can declare this as follows:

#[deriving(Show, PartialEq)]
enum Token<'a> {
    Word(&'a str),
    Other(&'a str)
}
Read more…

September 20, 2014 03:07 AM

Christopher Done

hindent: A Haskell indenter

A question of style

In this post I’m going to use the word “style” to refer to the way code is printed in concrete terms. No changes in the code that would yield a different syntax tree are considered “style” here.

What’s the deal with code style?

Code style is important! If you’re a professional Haskell programmer, you’re working with Haskell code all day. The following things are affected by the style used:

  • How easily it can be manipulated with regular editors: the more code is laid out in a way that prevents you from readily using your normal editor functions, the less efficient you are.
  • How well general tooling works with it: do diff and things like that work well?
  • How easily you can absorb the structure: do you have to spend time hunting around for the start and end of syntactical nodes? Can’t see the tree for the forest?
  • How quickly you can write it: can you just write code or do you have to spend time thinking about how it’s going to be laid out before writing, or editing the layout afterwards?
  • How aesthetically offended you are1: does the code you’re looking at assault your sense of beauty?

Code style is important! Let’s have a code style discussion. I propose to solve it with tooling.

Is this really an issue, though?

Okay, so I’m one guy with a bee in his bonnet. Let’s do a quick Google and see what others are saying in this StackOverflow question:

Could someone provide a link to a good coding standard for Haskell? I’ve found this and this, but they are far from comprehensive.

The following points refer to style:

  • Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
  • Put spaces around infix operators. Put a space following each comma in a tuple literal.
  • Prefer a space between a function and its argument, even if the argument is parenthesized.
  • Use line breaks carefully. Line breaks can increase readability, but there is a trade-off: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn’t overuse line breaks.
  • When possible, align -- lines, = signs, and even parentheses and commas that occur in adjacent lines.

Even the Haskell community is not immune to long, protracted debates about tabs vs spaces. That reddit submission has zero points. That means it’s very controversial. The submission also has 117 comments. That means people are very vocal about this topic. That’s because bike-shedding is inversely proportional to the importance of the thing being debated. We know that.

Nevertheless, style is important enough to be discussed.

So let’s formalise a standard style

That’s a simplification. There are many style guides, and those are just the public ones. In-house styles, particular to a company, are also common. It’s not practical to force everyone into one single style. It’s a well-worn topic in the C world.

Okay, but is this a problem tooling even needs to solve?

There is precedent for other tooling based on style.

Everyone has their own style

So then let’s make a tool with a select number of styles, you might say. The problem is that people don’t even use the standards that exist out there. They use slightly varying versions. Ask any Haskeller what style they use, and they will say “mostly like X, but with some differences.”

For example, What are some good style guides for writing Haskell code?2

I use a very similar style, with a few modifications. […] Perhaps I should write out my own style guide at some point, especially as I’ve become pretty set in my style lately.

And Good Haskell Coding Style3:

My Personal Pet Peeves

For the most part the style guides that I have added above (and the tools provided) mirror my own style guide (or perhaps my guide mirrors them). However, there is one item of style that particularly annoys me on a regular basis. […]

Can’t we just use structured editors?

Some more optimistic folk out there might suggest, perhaps, we should just throw away text files and go straight for structured code databases. Put all this formatting nonsense behind us. Layout is just a stylesheet! It’s not data to be stored in a file!

Maybe so. But nobody is using structured editors yet.

A practical way forward

Taking all of the above into consideration, here is my approach to the problem: the hindent library and program. Styled after GNU indent, the intention is that you simply run the program on some source code and it reformats it.

$ hindent
hindent: arguments: --style [fundamental|chris-done|johan-tibell]

hindent has the concept of styles built in. There is a type for it:

data Style =
  forall s. Style {styleName :: !Text
                  ,styleAuthor :: !Text
                  ,styleDescription :: !Text
                  ,styleInitialState :: !s
                  ,styleExtenders :: ![Extender s]
                  ,styleDefConfig :: !Config}

It contains authorship metadata. It holds an initial state which can be used during printing. Most importantly, it has a list of extenders: means to extend the printer and change its behaviour on a node-type-specific basis.

Take a normal pretty printing approach. It’s usually something like:

class Pretty a where
  pretty :: a -> String

Then for all the types in the AST we implement an instance of Pretty.

The limitation here is that we can’t, as a user of the library, decide to print one particular node type differently.

Instead, here’s a new class:

class (Annotated ast,Typeable ast) => Pretty ast where
  prettyInternal :: ast NodeInfo -> Printer ()

(It runs in a Printer type to store some state about the current column and indentation level, things like that.)

Now, we implement the prettyInternal method for all our types. But when implementing instances, we never use the prettyInternal method directly. Instead, we use another function, pretty, which can be considered the main “API” for printing, like before. Here’s the type:

pretty :: (Pretty ast) => ast NodeInfo -> Printer ()

We’ll look at its implementation in a moment. Here is an example instance of Pretty:

instance Pretty ClassDecl where
  prettyInternal x =
    case x of
      ClsTyDef _ this that ->
        do write "type "
           pretty this
           write " = "
           pretty that

See how pretty this and pretty that are used for recursing instead of prettyInternal? This approach is used for all instances.

Now let’s see what that affords us:

pretty :: (Pretty ast) => ast NodeInfo -> Printer ()
pretty a =
  do st <- get
     case st of
       PrintState{psExtenders = es,psUserState = s} ->
         do case listToMaybe (mapMaybe (makePrinter s) es) of
              Just m -> m
              Nothing -> prettyNoExt a
            printComments a
  where makePrinter s (Extender f) =
          case cast a of
            Just v -> Just (f s v)
            Nothing -> Nothing
        makePrinter s (CatchAll f) = (f s a)

In this function, we’re grabbing our (mentioned earlier) list of [Extender s] values from psExtenders and then looking up to see if any of the types match. To clarify, here is the Extender type:

data Extender s where
  Extender :: forall s a. (Typeable a) => (s -> a -> Printer ())
           -> Extender s
  CatchAll :: forall s. (forall a. Typeable a => s -> a -> Maybe (Printer ()))
           -> Extender s

Both constructors are rank-n. Both accept the current state as an argument and the current node. The Extender constructor is Prismesque. It’s existential, and lets you say “I want things of this type”. The CatchAll will just accept anything.

All that adds up to me being able to do something like this. Here’s a demo style:

demo =
  Style {styleName = "demo"
        ,styleAuthor = "Chris Done"
        ,styleDescription = "Demo for hindent."
        ,styleInitialState = ()
        ,styleExtenders =
           [Extender (\_ x ->
                        case x of
                          InfixApp _ a op b ->
                            do pretty a
                               pretty op
                               pretty b
                          _ -> prettyNoExt x)]
        ,styleDefConfig = def}

(The prettyNoExt is a publicly exposed version of the (private) method prettyInternal.)

Now let’s test the fundamental style versus our demo style:

λ> test fundamental "x = foo (a * b) + what"
x =
    (a * b) + what
λ> test demo "x = foo (a * b) + what"
x =

Voilà! We’ve configured how to pretty print infix operators in a few lines of code.

In practice, there are three styles. Here’s a larger example:

foo = do print "OK, go"; foo (foo bar) -- Yep.
          (if bar then bob else pif) (case mu {- cool -} zot of
            Just x -> return ();
            Nothing -> do putStrLn "yay"; return (1,2)) bill -- Etc
  where potato Cakes {} = 2 * x foo * bar / 5

Fundamental style:

foo =
  do print
       "OK, go"
       (if bar
           then bob
           else pif)
       (case mu {- cool -}
               zot of
          Just x ->
          Nothing ->
            do putStrLn
       bill -- Etc
  where potato Cakes{} =
          2 * x
                foo * bar / 5

Johan Tibell’s style (which I’ve only started implementing; there are some things I need to clarify):

foo = do
    print "OK, go"
        (foo bar)
        (if bar
             then bob
             else pif)
        (case mu {- cool -} zot of
             Just x ->
                 return ()
             Nothing -> do
                 putStrLn "yay"
                 return (1, 2))
        bill -- Etc
  where
    potato Cakes{} =
        2 * x foo * bar / 5

My style (pretty much complete):

foo =
  do print "OK, go"
     foo (foo bar)
         (if bar
             then bob
             else pif)
         (case mu {- cool -} zot of
            Just x -> return ()
            Nothing ->
              do putStrLn "yay"
                 return (1,2))
         bill -- Etc
  where potato Cakes{} = 2 * x foo * bar / 5

The styles are part of the module hierarchy of the package.

Write your own style!

I welcome anyone to write their own style. All styles extend the fundamental style, which should never change. You can base yours off of another style, or start from scratch. While you can keep your style locally, like your XMonad configuration, you’re encouraged to contribute your style as a module.

My style and Johan’s are quite different. But yours may be similar with small tweaks. Another distinctly different style is Simon Peyton Jones’s with explicit braces. This is a style you can implement if you want it.

See the contributing section of the README for more info.

Preserving meaning

A recommendation is to preserve the meaning of the code. Don’t make AST changes. Like removing parentheses, changing $ into parens, moving lets into wheres, etc. You can do it, but the results might surprise you.

Editing advantages

Having implemented Emacs support, I have been using this for a few weeks on my own code. It’s amazing. I’ve all but stopped manually making style changes. I just write code and then hit C-c i and it almost always does exactly what I want.

It can’t always do what I want. It has simple, predictable heuristics. But even when it doesn’t do what I want, I’ve so far been okay with the results. The advantages vastly outweigh that.

Remaining problems

I need to write a test suite for the fundamental style (and maybe the others). This isn’t hard, it’s just a bit laborious so I haven’t gotten round to it yet.

There’re some parts of the AST I haven’t finished filling out. You can see them marked by FIXMEs. This just means that if you try to format a node type it doesn’t know how to print, you’ll get a message saying so. Your code won’t be touched.

Comment re-insertion is a little bit of a challenge. I have a decent implementation that generally preserves comments well. There’re some corner cases that I’ve noticed, but I’m confident I have the right change to make to clear that up.

The fundamental printer is fast. My personal ChrisDone style is slower, due to its heuristics of deciding when to layout a certain way. It took 6s to layout a complex and large declaration. I updated my style and brought it down to 2s. That’s okay for now. There are always speed tricks that can be done.

There are of course issues like whether HSE can parse your code, whether you have #ifdef CPP pragmas in your code, and things like that. That’s just part of the general problem space of tools like this.

Remaining ideas

Currently you can only reformat a declaration. I don’t yet trust (any) Haskell printer with my whole file. I invoke it on a per-declaration basis and I see the result. That’s good for me at the moment. But in the future I might extend it to support whole modules.

Implement some operator-specific layouts. There are some operators that can really only be laid out in a certain way. Control.Applicative operators spring to mind:

Foo <$> foo
    <*> bar
    <*> mu

This can easily be handled as part of a style. Other considerations might be strings of lens operators. I’ve noticed that people tend not to put spaces around them, like:4


There’s also alignment, which is another possibility and easily implemented. The challenge will be deciding when alignment will look good versus making the code too wide and whitespacey. In my own style I personally haven’t implemented any alignment as it’s not that important to me, but I might one day.


Hopefully I’ve motivated the case that style is important, that formalizing style is important, and that automating it is practical: something we should solve and then move on from, redirecting the energy we wasted on manually laying things out and debating.

  1. Granted, this isn’t a technical consideration. But it’s a real one.

  2. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  3. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  4. I don’t know lens’s operators so that might not be real code, but the point is that could be a style to implement.

September 20, 2014 12:00 AM

Formatting in Haskell

This post is about the formatting package.

What’s wrong with printf?

The Text.Printf module is problematic simply because it’s not type-safe:

λ> import Text.Printf
λ> printf "" 2
*** Exception: printf: formatting string ended prematurely
λ> printf "%s" 2
*** Exception: printf: bad formatting char 's'

And it’s not extensible in the argument type. The PrintfType class does not export its methods.

And it’s not extensible in the formatting. You can’t add a “%K” syntax to it to print a value in Kelvins, for example.

And it’s implicit. You can’t just use your normal API searching facilities to search how to print a Day.

Holy Moly!

A while ago I was inspired by the HoleyMonoid package to use that mechanism to make a general replacement for printf.

It’s a continuation-based way of building up monoidal functions by composition with the ability to insert constants in-between. Example:

let holey = now "x = "
          . later show
          . now ", y = "
          . later show

> run holey 3 5
"x = 3, y = 5"

The now function inserts a monoidal value directly into the composition. So

run (now x . now y)

is equivalent to

x <> y

And
run (later show . now x . later show . now y)

is equivalent to

\a b -> show a <> x <> show b <> y
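The whole mechanism fits in a handful of lines. Here is a condensed sketch of it (my own simplification, not the HoleyMonoid package’s actual code; it uses a standalone % operator in place of the package’s Category instance):

```haskell
import Data.Monoid ((<>))

-- A continuation-based builder: runHM takes "what to do with the
-- final monoidal value" and yields a function expecting the holes.
newtype Holey m r a = Holey { runHM :: (m -> r) -> a }

-- Insert a constant piece now.
now :: m -> Holey m r r
now x = Holey ($ x)

-- Leave a hole to be filled by an argument later.
later :: (a -> m) -> Holey m r (a -> r)
later f = Holey (\k a -> k (f a))

-- Compose two pieces, appending their monoidal outputs.
(%) :: Monoid m => Holey m b c -> Holey m a b -> Holey m a c
f % g = Holey (\k -> runHM f (\m1 -> runHM g (\m2 -> k (m1 <> m2))))

-- Collapse a fully composed builder into its function.
run :: Holey m m a -> a
run h = runHM h id

main :: IO ()
main =
  putStrLn (run (now "x = " % later show % now ", y = " % later show) 3 5)
-- prints: x = 3, y = 5
```

Each later adds one argument to the type of the whole composition, which is how the final run ends up variadic.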

The Formatting package

The package is available on Hackage as formatting.

Comparison with other formatters

Example:
format ("Person's name is " % text %  ", age is " % hex) "Dave" 54

or with short-names:

format ("Person's name is " % t % ", age is " % x) "Dave" 54

Similar to C’s printf:

printf("Person's name is %s, age is %x","Dave",54);

and Common Lisp’s FORMAT:

(format nil "Person's name is ~a, age is ~x" "Dave" 54)

The Holey type

newtype HoleyT r a m = Holey { runHM :: (m -> r) -> a }

type Holey m r a = HoleyT r a m

This is my version of the HoleyMonoid. To make this into a useful package I changed a few things.

The Category instance implied a name conflict burden with (.), so I changed that to (%):

(%) :: Monoid n => Holey n b c -> Holey n b1 b -> Holey n b1 c

Rather than have the name-conflicting map function, I flipped the type arguments of the type and made it an instance of Functor.


There is an array of top-level printing functions for various output types:

-- | Run the formatter and return a lazy 'Text' value.
format :: Holey Builder Text a -> a

-- | Run the formatter and return a strict 'S.Text' value.
sformat :: Holey Builder S.Text a -> a

-- | Run the formatter and return a 'Builder' value.
bprint :: Holey Builder Builder a -> a

-- | Run the formatter and print out the text to stdout.
fprint :: Holey Builder (IO ()) a -> a

-- | Run the formatter and put the output onto the given 'Handle'.
hprint :: Handle -> Holey Builder (IO ()) a -> a

All the combinators work on a lazy text Builder which has good appending complexity and can output to a handle in chunks.

The Format type

There is a short-hand type for any formatter:

type Format a = forall r. Holey Builder r (a -> r)

All formatters are written in terms of now or later.


There is a standard set of formatters in Formatting.Formatters, for example:

text :: Format Text
int :: Integral a => Format a
sci :: Format Scientific
hex :: Integral a => Format a

Finally, there is a general build function that will build anything that is an instance of the Buildable class from the text-format package:

build :: Buildable a => Format a

For which there are a bunch of instances. See the README for a full set of examples.

Composing formatters

%. is like % but feeds one formatter into another:

λ> format (left 2 '0' %. hex) 10
"0a"


You can include things verbatim in the formatter:

> format (now "This is printed now.")
"This is printed now."

Although with OverloadedStrings you can just use string literals:

> format "This is printed now."
"This is printed now."

You can handle things later which makes the formatter accept arguments:

> format (later (const "This is printed later.")) ()
"This is printed later."

The type of the function passed to later should return an instance of Monoid.

later :: (a -> m) -> Holey m r (a -> r)

The function you format with (format, bprint, etc.) will determine the monoid of choice. In the case of this library, the top-level formatting functions expect you to build a text Builder:

format :: Holey Builder Text a -> a

Because builders are efficient generators.

So in this case we will be expected to produce Builders from arguments:

format . later :: (a -> Builder) -> a -> Text

To do that for common types you can just re-use the formatting library and use bprint:

> :t bprint
bprint :: Holey Builder Builder a -> a
> :t bprint int 23
bprint int 23 :: Builder

Coming back to later, we can now use it to build our own printer combinators:

> let mint = later (maybe "" (bprint int))
> :t mint
mint :: Holey Builder r (Maybe Integer -> r)

Now mint is a formatter to show Maybe Integer:

> format mint (readMaybe "23")
"23"
> format mint (readMaybe "foo")
""

Although a better, more general combinator might be:

> let mfmt x f = later (maybe x (bprint f))

Now you can use it to maybe format things:

> format (mfmt "Nope!" int) (readMaybe "foo")
"Nope!"


I’ve been using formatting in a bunch of projects since writing it. Happily, its API has remained stable since release, with some additions.

It has the same advantages as Parsec: it’s a combinator-based mini-language with all the same benefits.

September 20, 2014 12:00 AM

September 19, 2014

JP Moresmau

A new Haskell IDE!

Well, that title is click-bait. It's only a proof of concept, so far (-:, sorry!

I wanted to play with Threepenny-GUI since it allows doing UI the FRP way, without the trouble of getting a proper UI library to work. And I was going through a period of frustration with EclipseFP, so I thought about something else for a while... It's got the romantic name of HWIDE!

So in fact, this is a very simple integration of CodeMirror and Threepenny-GUI, to be able to edit Haskell sources inside your browser. When you save your buffer, the data is written to disk, and if a cabal file could be found for that source file, a cabal build (in a sandbox) is attempted (with a configure if needed). The output is then parsed by code ripped off BuildWrapper, and the errors/warnings are displayed, so you can click on them and see the offending line in the code.

That's all, really, and even that is not 100% perfect, but it's a start. I could get to play a bit with Events and Behaviors, develop some little widgets. If some FRP experts want to have a look at the code and offer some advice, I'd be all ears!

I probably won't have much time to turn this into the next generation Haskell IDE, but please fork and hack to your heart's content! The repository is at

Happy Haskell Hacking!

by JP Moresmau ( at September 19, 2014 01:15 PM

September 18, 2014

Chung-chieh Shan


A type checker that derails:
show :: a -> String
x :: Int
show x
“but you can’t show functions”

A proof checker that derails:
“Birds fly. Tweety the Canary is a bird. So Tweety flies.”
“But Fido the Ostrich doesn’t fly.”

Another type checker that derails:
main = putStrLn s
“OK, how about s = show id”

Another proof checker that derails:
“Fido doesn’t fly.”
“But Fido is a bird, and birds fly.”

A helpful type checker:
show id
“but you can’t show functions”

A helpful proof checker:
“Birds fly. Fido is a bird. So Fido flies.”
“But Fido is an ostrich, and ostriches don’t fly.”

Another helpful type checker:
main = putStrLn s
“OK, how about s = show 42”

Another helpful proof checker:
“Tweety doesn’t fly.”
“But Tweety is a bird, and birds fly.”

September 18, 2014 05:33 PM

JP Moresmau

Learning Scala via unit tests

Last month I followed a big data training session, and we used Scala to write algorithms for Spark. I had looked at Scala some years ago but I think at the time the Eclipse support wasn't very good (the pot calling the kettle black, eh?), and it piqued my curiosity again. So I started looking at tutorials, and found Scala Labs. The idea is interesting: you get a project with sources and tests, and you need to make all the tests pass. The source code for both the code to change and the unit tests is heavily documented, and guides you through the language nicely. And seeing the green ticks appear always triggers the proper reward areas of my brain (-:

I had a little issue getting the code to compile using Eclipse: there's an sbt-launch-0.13.0.jar library at the root of the project, Eclipse used it as the top library, and it contained a weird version of the List class that wasn't genericized! I removed that jar from the classpath and everything worked fine.

I'm not aware of a similar tutorial for Haskell, but that would be a good idea!

by JP Moresmau ( at September 18, 2014 05:07 PM

Ian Ross

Non-diffusive atmospheric flow #6: principal components analysis

Non-diffusive atmospheric flow #6: principal components analysis

The pre-processing that we’ve done hasn’t really got us anywhere in terms of the main analysis we want to do – it’s just organised the data a little and removed the main source of variability (the seasonal cycle) that we’re not interested in. Although we’ve subsetted the original geopotential height data both spatially and temporally, there is still a lot of data: 66 years of 181-day winters, each day of which has 72×15 Z500 values. This is a very common situation to find yourself in if you’re dealing with climate, meteorological, oceanographic or remote sensing data. One approach to this glut of data is something called dimensionality reduction, a term that refers to a range of techniques for extracting “interesting” or “important” patterns from data so that we can then talk about the data in terms of how strong these patterns are instead of what data values we have at each point in space and time.

I’ve put the words “interesting” and “important” in quotes here because what’s interesting or important is up to us to define, and determines the dimensionality reduction method we use. Here, we’re going to side-step the question of determining what’s interesting or important by using the de facto default dimensionality reduction method, principal components analysis (PCA). We’ll take a look in detail at what kind of “interesting” and “important” PCA gives us a little later.

PCA is, in principle, quite a simple method, but it causes many people endless problems. There are some very good reasons for this:

  • PCA is in some sense nothing more than a generic change of basis operation (with the basis we change to chosen in a special way). The result of this is that a lot of the terminology used about PCA is also very generic, and hence very confusing (words like “basis”, “component”, “eigenvector”, “projection” and so on could mean more or less anything in this context!).

  • PCA is used in nearly every field where multivariate data is analysed, and is the archetypical “unsupervised learning” method. This means that it has been invented, reinvented, discovered and rediscovered many times, under many different names. Some other names for it are: empirical orthogonal function (EOF) analysis, the Karhunen-Loève decomposition, proper orthogonal decomposition (POD), and there are many others. Each of these different fields also uses different terms for the different outputs from PCA. This is very confusing: some people talk about principal components, some about empirical orthogonal functions and principal component time series, some about basis functions, and so on. Here, we’re going to try to be very clear and careful about the names that we use for things to try to alleviate some of the confusion.

  • There is a bit of a conceptual leap that’s necessary to go from very basic examples of using PCA to using PCA to analyse the kind of spatio-temporal data we have here. I used to say something like: “Well, there’s a nice two-dimensional example, and it works just the same in 100 dimensions, so let’s just apply it to our atmospheric data!” A perfectly reasonable response to that is: “WHAT?! Are you an idiot?”. Here, we’re going to take that conceptual leap slowly, and describe exactly how the “change of basis” view of PCA works for spatio-temporal data.

  • There are some aspects of the scaling of the different outputs from PCA that are really confusing. In simple terms, PCA breaks your data down into two parts, and you could choose to put the units of your data on either one of those parts, normalising the other part. Which one you put the units on isn’t always an obvious choice and it’s really easy to screw things up if you do it wrong. We’ll look at this carefully here.

So, there’s quite a bit to cover in the next couple of articles. In this article, we will: explain the basic idea of PCA with a very simple (two-dimensional!) example; give a recipe for how to perform PCA on a data set; talk about why PCA works from an algebraic standpoint; talk about how to do these calculations in Haskell. Then in the next article, we will: describe exactly how we do PCA on spatio-temporal data; demonstrate how to perform PCA on the Z500 anomaly data; show how to visualise the Z500 PCA results and save them for later use. What we will end up with from this stage of our analysis is a set of “important” spatial patterns (we’ll see what “important” means for PCA) and time series of how strong each of those spatial patterns is at a particular point in time. The clever thing about this decomposition is that we can restrict our attention to the few most “important” patterns and discard all the rest of the variability in the data. That makes the subsequent exploration of the data much simpler.

The basic idea of PCA

We’re going to take our first look at PCA using a very simple example. It might not be immediately obvious how the technique we’re going to develop here will be applicable to the spatio-temporal Z500 data we really want to analyse, but we’ll get to that a little later, after we’ve seen how PCA works in this simple example and we’ve done a little algebra to get a clearer understanding of just why the “recipe” we’re going to use works the way that it does.

Suppose we go to the seaside and measure the shells of mussels1. We’ll measure the length and width of each shell and record the data for each mussel as a two-dimensional (length, width) vector. There will be variation in the sizes and shapes of the mussels, some longer, some shorter, some fatter, some skinnier. We might end up with data that looks something like what’s shown below, where there’s a spread of length in the shells around a mean of about 5 cm, a spread in the width of shells around a mean of about 3 cm, and there’s a clear correlation between shell length and width (see Figure 1 below). Just from eyeballing this picture, it seems apparent that maybe measuring shell length and width might not be the best way to represent this data – it looks as though it could be better to think of some combination of length and width as measuring the overall “size” of a mussel, and some other combination of length and width as measuring the “fatness” or “skinniness” of a mussel. We’ll see how a principal components analysis of this data extracts these two combinations in a clear way.

The code for this post is available in a Gist. The Gist contains a Cabal file as well as the Haskell source, to make it easy to build. Just do something like this to build and run the code in a sandbox:

git clone pca-2d
cd pca-2d
cabal sandbox init
cabal install

Just for a slight change, I’m going to produce all the plots in this section using Haskell, specifically using the Chart library. We’ll use the hmatrix library for linear algebra, so the imports we end up needing are:

import Control.Monad
import Numeric.LinearAlgebra.HMatrix
import Graphics.Rendering.Chart.Easy hiding (Matrix, Vector, (|>), scale)
import Graphics.Rendering.Chart.Backend.Cairo

There are some name overlaps between the monadic plot interface provided by the Graphics.Rendering.Chart.Easy module and hmatrix, so we just hide the overlapping ones.

We generate 500 synthetic data points:

-- Number of test data points.
n :: Int
n = 500

-- Mean, standard deviation and correlation for two dimensions of test
-- data.
meanx, meany, sdx, sdy, rho :: Double
meanx = 5.0 ; meany = 3.0 ; sdx = 1.2 ; sdy = 0.6 ; rho = 0.75

-- Generate test data.
generateTestData :: Matrix Double
generateTestData =
  let seed = 1023
      mean = 2 |> [ meanx, meany ]
      cov = matrix 2 [ sdx^2       , rho*sdx*sdy
                     , rho*sdx*sdy , sdy^2       ]
      samples = gaussianSample seed n mean cov
  in fromRows $ filter ((> 0) . minElement) $ toRows samples

The mussel shell length and width values are generated from a two-dimensional Gaussian distribution, where we specify mean and standard deviation for both shell length and width, and the correlation between the length and width (as the usual Pearson correlation coefficient). Given this information, we can generate samples from the Gaussian distribution using hmatrix’s gaussianSample function. (If we didn’t have this function, we would calculate the Cholesky decomposition of the covariance matrix we wanted, generate samples from a pair of standard one-dimensional Gaussian distributions and multiply two-dimensional vectors of these samples by one of the Cholesky factors of the covariance matrix – this is just what the gaussianSample function does for us.) We do a little filtering in generateTestData to make sure that we don’t generate any negative values2.
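
As an aside, the Cholesky trick described in the parenthetical above is easy to see in the 2×2 case. Here is a plain-Haskell sketch (hypothetical helper names, no hmatrix involved):

```haskell
-- Cholesky factor L of a 2x2 covariance matrix [[a, b], [b, c]],
-- returned as (l11, l21, l22) with L = [[l11, 0], [l21, l22]],
-- so that L * transpose L recovers the covariance matrix.
chol2 :: (Double, Double, Double) -> (Double, Double, Double)
chol2 (a, b, c) = (l11, l21, l22)
  where
    l11 = sqrt a
    l21 = b / l11
    l22 = sqrt (c - l21 * l21)

-- Turn a pair of standard-normal samples (z1, z2) into a sample from
-- the 2-D Gaussian with the given covariance and mean: x = mean + L z.
gaussian2 :: (Double, Double, Double) -> (Double, Double)
          -> (Double, Double) -> (Double, Double)
gaussian2 cov (mx, my) (z1, z2) =
  let (l11, l21, l22) = chol2 cov
  in (mx + l11 * z1, my + l21 * z1 + l22 * z2)
```

For instance, the covariance [[4, 2], [2, 5]] factors as l11 = 2, l21 = 1, l22 = 2, and feeding standard-normal pairs through gaussian2 reproduces exactly what gaussianSample does internally.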

The main program that drives the generation of the plots we’ll look at below is:

main :: IO ()
main = do
  let dat = generateTestData
      (varexp, evecs, projs) = pca dat
      (mean, cov) = meanCov dat
      cdat = fromRows $ map (subtract mean) $ toRows dat
  forM_ [(PNG, "png"), (PDF, "pdf"), (SVG, "svg")] $ \(ptype, suffix) -> do
    doPlot ptype suffix dat evecs projs 0
    doPlot ptype suffix cdat evecs projs 1
    doPlot ptype suffix cdat evecs projs 2
    doPlot ptype suffix cdat evecs projs 3
  putStrLn $ "FRACTIONAL VARIANCE EXPLAINED: " ++ show varexp

and you can see the doPlot function that generates the individual plots in the Gist. I won’t say a great deal about the plotting code, except to observe that the new monadic API to the Chart library makes generating this kind of simple plot in Haskell no harder than it would be using Gnuplot or something similar. The plot code produces one of four plots depending on an integer parameter, which ranges from zero (the first plot above) to three. Because we’re using the Cairo backend to the Chart library, we can generate image output in any of the formats that Cairo supports – here we generate PDF (to insert into LaTeX documents), SVG (to insert into web pages) and PNG (for a quick look while we’re playing with the code).

The main program above is pretty simple: generate test data, do the PCA calculation (by calling the pca function, which we’ll look at in detail in a minute), do a little bit of data transformation to help with plotting, then call the doPlot function for each of the plots we want. Here are the plots we produce, which we’ll refer to below as we work through the PCA calculation:

Synthetic mussel shell test data for two-dimensional PCA example.

Centred synthetic mussel shell test data for two-dimensional PCA example.

PCA eigenvectors for two-dimensional PCA example.

Data projection onto PCA eigenvectors for two-dimensional PCA example.

Let’s now run through the “recipe” for performing PCA, looking at the figures above in parallel with the code for the pca function:

pca :: Matrix Double -> (Vector Double, Matrix Double, Matrix Double)
pca xs = (varexp, evecs, projs)
  where (mean, cov) = meanCov xs
        (_, evals, evecCols) = svd cov
        evecs = fromRows $ map evecSigns $ toColumns evecCols
        evecSigns ev = let maxelidx = maxIndex $ cmap abs ev
                           sign = signum (ev ! maxelidx)
                       in cmap (sign *) ev
        varexp = scale (1.0 / sumElements evals) evals
        projs = fromRows $ map project $ toRows xs
        project x = evecs #> (x - mean)

We’ll look at just why this recipe works in the next section, but for the moment, let’s just see what happens:

  1. We start with our original mussel shell data (Figure 1 above).

  2. We calculate the mean and covariance of our data (line 3 of the pca function listing). PCA analyses the deviations of our data from the mean, so we effectively look at “centred” data, as shown in Figure 2, where we’ve just removed the mean from each coordinate in our data. The mean and covariance calculation is conveniently done using hmatrix’s meanCov function.

  3. Then we calculate the eigendecomposition of the covariance matrix. Because the covariance matrix is a real symmetric matrix, by construction, we know that the eigenvectors will form a complete set that we can use as a basis to represent our data. (We’re going to blithely ignore all questions of possible degeneracy here – for real data, “almost surely” means always!) Here, we do the eigendecomposition using a singular value decomposition (line 4 in the listing of the pca function). The singular values give us the eigenvalues and the right singular vectors give us the eigenvectors. The choice here to use SVD (via hmatrix’s svd function) rather than some other means of calculating an eigendecomposition is based primarily on the perhaps slightly prejudiced idea that SVD has the best and most stable implementations – here, hmatrix calls out to LAPACK to do this sort of thing, so there’s probably not much to choose, since the other eigendecomposition implementations in LAPACK are also good, but my prejudice in favour of SVD remains! If you want some better justification for why SVD is “the” matrix eigendecomposition, take a look at this very interesting historical review of the development of SVD: G. W. Stewart (1993). On the early history of the singular-value decomposition. SIAM Rev. 35(4), 551-566.

  4. We do a little manipulation of the directions of the eigenvectors (lines 5-8 in the listing), flipping the signs of them to make the largest components point in the positive direction – this is mostly just to make the eigenvectors look good for plotting. The eigenvectors are shown in the Figure 3: we’ll call them e1 (the one pointing to the upper right) and e2 (the one pointing to the upper left). Note that these are unit vectors. We’ll talk about this again when we look at using PCA for spatio-temporal data.

  5. Once we have unit eigenvectors, we can project our (centred) data points onto these eigenvectors (lines 10 and 11 of the listing: the project function centres a data point by taking off the mean, then projects onto each eigenvector using hmatrix’s matrix-vector product operator #>). Figure 4 shows in schematic form how this works – we pick out one data point in green and draw lines parallel and orthogonal to the eigenvectors showing how we project the data point onto the eigenvectors. Doing this for each data point is effectively just a change of basis: instead of representing our centred data value by measurements along the x- and y-axes, we represent it by measurements in the directions of e1 and e2. We’ll talk more about this below as well.

  6. Finally, the eigenvalues from the eigendecomposition of the covariance matrix tell us something about how much of the total variance in our input data is “explained” by the projections onto each of the eigenvectors. I’ve put the word “explained” in quotes because I don’t think it’s a very good word to use, but it’s what everyone says. Really, we’re just saying how much of the data variance lies in the direction of each eigenvector. Just as you can calculate the variance of the mussel length and width individually, you can calculate the variance of the projections onto the eigenvectors. The eigenvalues from the PCA eigendecomposition tell you how much variance there is in each direction, and we calculate the “fraction of variance explained” for each eigenvector and return it from the pca function.
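
The “fraction of variance explained” in step 6 is just each eigenvalue divided by the sum of all the eigenvalues, mirroring the varexp line in the pca listing; as a tiny self-contained sketch (hypothetical name):

```haskell
-- Fraction of total variance "explained" by each eigenvector:
-- each eigenvalue divided by the sum of all eigenvalues.
varianceExplained :: [Double] -> [Double]
varianceExplained evals = map (/ sum evals) evals
```

So eigenvalues [3, 1] give explained-variance fractions [0.75, 0.25]: three quarters of the data variance lies along the first eigenvector.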

So, the pca function returns three things: eigenvalues (actually fractional explained variance calculated from the eigenvalues) and eigenvectors from the PCA eigendecomposition, plus projections of each of the (centred) data points onto each of the eigenvectors. The terminology for all these different things is very variable between different fields. We’re going to sidestep the question of what these things are called by always explicitly referring to PCA eigenvectors (or, later on when we’re dealing with spatio-temporal data, PCA eigenpatterns), PCA explained variance fractions and PCA projected components. These terms are a bit awkward, but there’s no chance of getting confused this way. We could choose terminology from one of the fields where PCA is commonly used, but that could be confusing for people working in other fields, since the terminology in a lot of cases is not very well chosen.

Together, the PCA eigenvectors and PCA projected components constitute nothing more than a change of orthonormal basis for representing our input data – the PCA output contains exactly the same information as the input data. (Remember that the PCA eigenvectors are returned as unit vectors from the pca function, so we really are just looking at a simple change of basis.) So it may seem as though we haven’t really done anything much interesting with our data. The interesting thing comes from the fact that we can order the PCA eigenvectors in decreasing order of the explained variance fraction. If we find that data projected onto the first three (say) eigenvectors explains 80% of the total variance in our data, then we may be justified in considering only those three components. In this way, PCA can be used as a dimensionality reduction method, allowing us to use low-dimensional data analysis and visualisation techniques to deal with input data that has high dimensionality.

This is exactly what we’re going to do with the Z500 data: we’re going to perform PCA, and take only the leading PCA eigenvectors and components, throwing some information away. The way that PCA works guarantees that the set of orthogonal patterns we keep are the “best” patterns in terms of explaining the variance in our data. We’ll have more to say about this in the next section when we look at why our centre/calculate covariance/eigendecomposition recipe works.

The algebra of PCA

In the last section, we presented a “recipe” for PCA (at least for two-dimensional data): centre the data; calculate the covariance matrix; calculate the eigendecomposition of the covariance matrix; project your centred data points onto the eigenvectors. The eigenvalues give you a measure of the proportion of the variance in your data in the direction of the corresponding eigenvector. And the projection of the data points onto the PCA eigenvectors is just a change of basis, from whatever original basis your data was measured in (mussel shell length and width as the two components of each data point in the example) to a basis with the PCA eigenvectors as basis vectors.

So why does this work? Obviously, you can use whatever basis you like to describe your data, but why is the PCA eigenbasis useful and interesting? I’ll explain this quite quickly, since it’s mostly fairly basic linear algebra, and you can read about it in more detail in more or less any linear algebra textbook3.

To start with, let’s review some facts about eigenvectors and eigenvalues. For a matrix A, an eigenvector u and its associated eigenvalue λ satisfy

$$Au = \lambda u.$$

The first thing to note is that any scalar multiple of u is also an eigenvector, so an eigenvector really refers to a “direction”, not to a specific vector with a fixed magnitude. If we multiply both sides of this by uT and rearrange a little, we get

$$\lambda = \frac{u^T A u}{u^T u}.$$

The denominator of the fraction on the right hand side is just the squared length of the vector u. Now, we can find the largest eigenvalue λ1 and corresponding eigenvector u1 by solving the optimisation problem

$$u_1 = \arg\max_{u^T u = 1} u^T A u,$$

where for convenience, we’ve restricted the optimisation to find a unit eigenvector, and we find λ1 directly from the fact that $Au_1 = \lambda_1 u_1$.

We can find next largest (in magnitude) eigenvalue and corresponding eigenvector of the matrix A by projecting the rows of A into the subspace orthogonal to u1 to give a new matrix A1 and solving the optimisation problem

$$u_2 = \arg\max_{u^T u = 1} u^T A_1 u,$$

finding the second largest eigenvalue λ2 from $A_1 u_2 = \lambda_2 u_2$. Further eigenvectors and eigenvalues can be found in order of decreasing eigenvalue magnitude by projecting into subspaces orthogonal to all the eigenvectors found so far and solving further optimisation problems.
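
One concrete way to see these optimisation problems at work is power iteration: repeatedly applying A to a vector and renormalising converges on the leading eigenvector, and the Rayleigh quotient then recovers λ1. A small sketch for symmetric 2×2 matrices (hypothetical names, not part of the hmatrix-based code above):

```haskell
-- A symmetric 2x2 matrix [[a, b], [b, c]] applied to a vector.
mulSym :: (Double, Double, Double) -> (Double, Double) -> (Double, Double)
mulSym (a, b, c) (x, y) = (a * x + b * y, b * x + c * y)

normalise :: (Double, Double) -> (Double, Double)
normalise (x, y) = let n = sqrt (x * x + y * y) in (x / n, y / n)

-- k steps of power iteration from a starting vector; returns the
-- approximate leading (unit) eigenvector and its Rayleigh quotient
-- u^T A u, which approximates the largest eigenvalue.
powerIter :: (Double, Double, Double) -> (Double, Double) -> Int
          -> ((Double, Double), Double)
powerIter m v0 k = (u, ux * wx + uy * wy)
  where
    u@(ux, uy) = iterate (normalise . mulSym m) (normalise v0) !! k
    (wx, wy) = mulSym m u
```

For the matrix [[2, 1], [1, 2]], whose eigenvalues are 3 and 1, powerIter converges on the eigenvalue 3 and the direction (1, 1)/√2, exactly the solution of the constrained maximisation above.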

This link between this type of optimisation problem and the eigenvectors and eigenvalues of a matrix is the key to understanding why PCA works the way that it does. Suppose that we have centred our (K-dimensional) data, and that we call the N centred data vectors xi, i = 1, 2, …, N. If we now construct an N×K matrix X whose rows are the xi, then the sample covariance of the data is

$$C = \frac{1}{N-1} X^T X.$$

Now, given a direction represented as a unit vector u, we can calculate the data variance in that direction as $\|Xu\|^2$ (up to a constant factor), so that if we want to know the direction in which the data has the greatest variance, we solve an optimisation problem of the form

$$u_1 = \arg\max_{u^T u = 1} (Xu)^T X u = \arg\max_{u^T u = 1} u^T C u.$$

But this optimisation problem is just the eigendecomposition optimisation problem for the covariance matrix C. This demonstrates that we can find the directions of maximum variance in our data by looking at the eigendecomposition of the covariance matrix C in decreasing order of eigenvalue magnitude.

There are a couple of things to add to this. First, the covariance matrix is, by construction, a real symmetric matrix, so its eigenvectors form a complete basis – this means that we really can perform a change of basis from our original data to the PCA basis with no loss of information. Second, because the eigenvectors of the covariance matrix are orthogonal, the projections of our data items onto the eigenvector directions (what we’re going to call the PCA projected components) are uncorrelated. We’ll see some consequences of this when we look at performing PCA on the Z500 data. Finally, and related to this point, it’s worth noting that PCA is a linear operation – the projected components are linearly uncorrelated, but that doesn’t mean that there can’t be some nonlinear relationship between them. There are generalisations of PCA to deal with this case, but we won’t be talking about them for the purposes of this analysis.

Everything we’ve done here is pretty straightforward, but you might be wondering why we would want to change to this PCA basis at all? What’s the point? As I noted above, but is worth reiterating, the most common use for PCA, and the way that we’re going to use it with the Z500 data, is as a dimensionality reduction method. For the Z500 data, we have, for each day we’re looking at, 72×15=1080 spatial points, which is a lot of data to look at and analyse. What we usually do is to perform PCA, then ignore all but the first few leading PCA eigenvectors and projected components. Because of the way the optimisation problems described above are set up, we can guarantee that the leading m PCA eigenvectors span the m-dimensional subspace of the original data space containing the most data variance, and we can thus convince ourselves that we aren’t missing interesting features of our data by taking only those leading components. We’ll see how this works in some detail when we do the PCA analysis of the Z500 data, but in the mussel measurement case, this would correspond to thinking just of the projection of the mussel length and width data along the leading e1 eigendirection, so reducing the measurements to a single “size” parameter that neglects the variation in fatness or skinniness of the mussels. (This two-dimensional case is a bit artificial. Things will make more sense when we look at the 1080-dimensional case for the Z500 data.)
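
In the mussel case, keeping only the leading eigendirection amounts to taking the dot product of each centred data point with the unit eigenvector e1; as a one-line sketch (hypothetical name):

```haskell
-- Project centred 2-D data points onto a unit direction (u1, u2),
-- reducing each point to a single coordinate along that direction.
projectOnto :: (Double, Double) -> [(Double, Double)] -> [Double]
projectOnto (u1, u2) = map (\(x, y) -> u1 * x + u2 * y)
```

Each mussel's (length, width) pair collapses to a single "size" coordinate along e1, which is the whole point of the dimensionality reduction.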

  1. Well, not really, since I live in the mountains of Austria and there aren’t too many mussels around here, so I’ll generate some synthetic data!

  2. Obviously, a Gaussian distribution is not right for quantities like lengths that are known to be positive, but here we’re just generating some data for illustrative purposes, so we don’t care all that much. If we were trying to model this kind of data though, we’d have to be more careful.

  3. I like Gilbert Strang’s Linear Algebra and its Applications, although I’ve heard from some people that they think it’s a bit hard for a first textbook on the subject – if you’ve had any exposure to this stuff before though, it’s good.


September 18, 2014 01:39 PM

Keegan McAllister

Raw system calls for Rust

I wrote a small library for making raw system calls from Rust programs. It provides a macro that expands into in-line system call instructions, with no run-time dependencies at all. Here's an example:


#[phase(plugin, link)]
extern crate syscall;

fn write(fd: uint, buf: &[u8]) {
    unsafe {
        syscall!(WRITE, fd, buf.as_ptr(), buf.len());
    }
}

fn main() {
    write(1, "Hello, world!\n".as_bytes());
}
Right now it only supports x86-64 Linux, but I'd love to add other platforms. Pull requests are much appreciated. :)

by keegan ( at September 18, 2014 12:33 AM

September 17, 2014

Philip Wadler

The military and the referendum.

Many readers will have heard about Lord Dannatt arguing in the Telegraph that a vote for independence will dishonour Scotland's war dead.

Perhaps not as many will have heard that Jimmy Sinclair (the oldest surviving Desert Rat, aged 102), Colin May (Lieutenant Commander, Faslane), and sixteen others have written a letter slamming Dannatt; at least, I didn't hear until this morning. "How dare he take their sacrifice in vain and try to turn it to political advantage?"

Both sides are reported by the BBC, though the headline mentions only one. (More #bbcbias?)

by Philip Wadler ( at September 17, 2014 08:42 PM

Eric Kidd

Deploying Rust applications to Heroku, with example code for Iron

Now with support for Iron, Cargo and Cargo.lock!

You can deploy an example Rust application to Heroku using this button:


If you'd prefer to use the command line, you'll need git and the Heroku toolbelt. Once these are installed, run:

git clone
cd heroku-rust-cargo-hello
heroku create --buildpack
git push heroku master

This will download the example application, create a new Heroku project, and deploy the code to Heroku. That's it!

How it works

Our server is based on the Iron middleware library. We parse URL parameters and dispatch HTTP requests to the appropriate handler routine using Iron's router module:

fn main() {
    let mut router = Router::new();
    router.get("/", hello);
    router.get("/:name", hello_name);
    Iron::new(router).listen(Ipv4Addr(0, 0, 0, 0), get_server_port());
}

The hello function returns an HTTP status code and the content to send to the user:

fn hello(_: &mut Request) -> IronResult<Response> {
    Ok(Response::with(status::Ok, "Hello world!"))
}

The hello_name function is similar, but we look up the value of the :name parameter that we declared to the router, above.

fn hello_name(req: &mut Request) -> IronResult<Response> {
    let params = req.extensions.find::<Router,Params>().unwrap();
    let name = params.find("name").unwrap();
    Ok(Response::with(status::Ok, format!("Hello, {}!", name)))
}

The final piece needed to run on Heroku is a function to look up our port number:

fn get_server_port() -> Port {
    // Heroku supplies the port to bind in the PORT environment variable;
    // fall back to 8080 for local runs.
    getenv("PORT")
        .and_then(|s| from_str::<Port>(s.as_slice()))
        .unwrap_or(8080)
}

The full source code is available on GitHub. To learn more about Rust, see the excellent Rust guide.

Keep reading for notes on building the program locally and on configuring your own programs to run on Heroku.

Read more…

September 17, 2014 08:35 PM

Philip Wadler

The case for Europe

<iframe frameborder="0" height="500" src=";title=Today%3A%2017%2F09%2F2014%3A%20Salmond%20promises%20%22common-sense%20agreement%20on%20a%20common%20currency%E2%80%9D&amp;product=iplayer" width="400"></iframe>

Readers of this list will know that I don't always see eye-to-eye with Alex Salmond. Nonetheless, I think he put the case for Europe well this morning on the Today Program.
In a BBC interview just the other night, a Spanish minister interviewed
by Kirsty Wark, despite being invited three or four times, refused to
say that Spain would attempt to veto Scottish membership. And the
reason for that, of course, is that the Spanish government have already
said that in the circumstance of a consented democratic referendum, as
they put it, Spain would have nothing to say about it.

We can go through legal opinion and expert opinion as much as we like.
I think the answer is in four figures: 1, 20, 25, and 60.

1% is the percentage of Scotland's population compared to the European Union.

20% is the percentage of the fish stocks of the entire European Union.

25% is the percentage of the offshore renewable energy of the entire European Union.

And 60% is the oil reserves that Scotland has.

Anyone who believes that a country [with these resources] is not
going to be welcome in the wider Europe doesn't understand the process
by which Europe accepts democratic results and that Scotland has a
huge amount of attractiveness to the rest of the European continent.

You can hear the original here, the relevant segment starts at around 8:00.

by Philip Wadler at September 17, 2014 08:10 PM

Mike Izbicki

Beginner error messages in C++ vs Haskell


posted on 2014-09-17 by Paul Starkey

Learning Haskell was excruciating. The error messages from the Haskell compiler ghc were way more difficult to understand than the error messages I was used to from g++. I admit I’m still a novice programmer: My only experience is a year of classes in C++ programming. But the Haskell compiler should do a better job generating error messages for beginners like me.

First we’ll see four concrete examples of ghc doing worse than g++, then Mike will talk about some ways to fix ghc’s error messages.

Example 1

Below are two equivalent C++ and Haskell programs. I’ve intentionally added some syntax errors:

    /* C++ Code */
    #include <iostream>

    using namespace std;

    int main () {
        int in = -1;
        cout << "Please choose 1 for a message" << endl;
        cin >> in;
err->   if in == 1 
            cout << "Hello, World!" << endl;
        else
            cout << "Error, wrong choice" << endl;
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter 1 for a message"
        num <- getLine
        if num == "1"
            then do
                putStrLn "Hello, World" 
                putStrLn "Error, wrong choice"

Alright, so the first notable difference is that the Haskell code is much shorter. It takes up roughly half the space that the C++ code does, yet they both output hello world when the correct number is entered.


Haskell already seems better, right?


Notice how I messed up the if statement in both programs. In the C++ version, I forgot the parentheses, and in the Haskell version I forgot the else. Both omissions are simple mistakes that I’ve made while learning these languages.

Now let’s see the error messages:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:15:5: error: expected '(' before 'in'
    main.cpp:19:2: error: 'else' without a previous 'if'
    Compilation failed.
    -- Haskell Error --
        parse error (possibly incorrect indentation or mismatched brackets)
    Failed, modules loaded: none.

Both error messages let the programmer know where the mistake happened, but the g++ message is far more helpful. It tells us how to fix the syntax error by adding some missing parentheses. Bam! Problem solved.

Now let us turn to ghc’s output. Okay, something about a parse error… might have indentation errors… and no modules loaded. Cool. Now I’ve never taken a compiler course, so I don’t know what parse error means, and I have no idea how to fix it. The error message is simply not helpful.
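For what it's worth, the fix here is just supplying the else branch that ghc's parse error is obliquely complaining about: in Haskell an if is an expression, so it must always have both branches. Below is a repaired sketch of the Example 1 program (the pure `message` helper is my own refactoring, not part of the original):

```haskell
-- Repaired Example 1: a Haskell if-expression needs both a then
-- and an else branch, because it must produce a value either way.
message :: String -> String
message num =
    if num == "1"
        then "Hello, World"
        else "Error, wrong choice"

main :: IO ()
main = do
    putStrLn "Please enter 1 for a message"
    num <- getLine
    putStrLn (message num)
```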

Example 2

Here’s another example of parsing errors.

        /* C++ Code */ 
        #include <iostream>
        using namespace std;
        int main() {
err->       string in = ""
            cout << "Please enter a single word and get the string size back" << endl;
            cin >> in;
            cout << "The size of your word, \"" << in << "\", is "
                 << in.length() << "!" << endl;
            return 0;
        }
        {- Haskell Code -}
err->   main = {-do-}
            putStrLn "Please enter a single word and get the string size back"
            num <- getLine
            let size = length num
            putStrLn $ "The size of your word, \"" ++ num ++ "\", is " ++ show size ++ "!"

As you can see, in the C++ I forgot to include a semicolon and in the Haskell I forgot the do in main. As a beginner, I’ve personally made both of these mistakes.

Here are the error messages:

    -- C++ Error --
    main.cpp:8:2: error: expected ',' or ';' before 'cout'
    Compilation failed.
    -- Haskell Error --
        parse error on input '<-'
    Failed, modules loaded: none.

C++ delivers a clear message explaining how to fix the error. Haskell, however, is not so helpful. It says there’s a parse error on the input operator. How should I know this is related to a missing do?
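Restoring the do makes the same snippet parse: the <- binding is statement syntax that only exists inside a do block. Here is a repaired sketch (the `report` helper is my own factoring):

```haskell
-- Repaired Example 2: `<-` bindings are only legal inside a do block.
report :: String -> String
report num =
    "The size of your word, \"" ++ num ++ "\", is "
        ++ show (length num) ++ "!"

main :: IO ()
main = do
    putStrLn "Please enter a single word and get the string size back"
    num <- getLine
    putStrLn (report num)
```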

Example 3

Next let’s see what happens when you call the built-in strlen and length functions with no arguments at all.

    /* C++ Code */
    #include <iostream>
    #include <cstring>

    using namespace std;
    int main (){
        char input[256];
        cout << "Please enter a word" << endl;
        cin >> input;
err->   cout << "The size of your string is: " << (unsigned)strlen();
        cout << "!" << endl;
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a word"
        num <- getLine
err->   let size = length 
        putStrLn $ "The size of your string is: " ++ show size ++ "!"

Now let us see the different error messages that are produced:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:11:61: error: too few arguments to function 'size_t strlen(const char*)'
    Compilation failed.
    -- Haskell Error --
    No instance for (Show ([a0] -> Int)) arising from a use of 'show'
    Possible fix: add an instance declaration for (Show ([a0] -> Int))
    In the first argument of '(++)', namely 'show size'
    In the second argument of '(++)', namely 'show size ++ "!"'
    In the second argument of '(++)', namely
      '"\", is " ++ show size ++ "!"'
    Failed, modules loaded: none.

Once again, it appears that the C++ compiler g++ knew exactly what was wrong with the code and how to fix the error. It tells me that there are not enough arguments in my function call.

Wow, Haskell’s error message is quite the mouthful this time. I suppose this is better than just a parse error message, but I’m not sure what exactly ghc wants me to correct. The error is simply too technical to help me.
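Deciphering the message: because length was never applied to anything, size was bound to the function length itself, and a function has no Show instance, hence the complaint about Show ([a0] -> Int). A small sketch of the fix (the `sizeOf` name is mine):

```haskell
-- The mistake was `let size = length`, which binds size to the
-- function itself, so `show size` would need an impossible Show
-- instance for a function type. The fix: apply length to the input.
sizeOf :: String -> Int
sizeOf num = length num

main :: IO ()
main = do
    putStrLn "Please enter a word"
    num <- getLine
    putStrLn ("The size of your string is: " ++ show (sizeOf num) ++ "!")
```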

Example 4

Next, we will look at what happens when you pass too many arguments to functions in both languages:

    /* C++ Code */
    #include <iostream>
    using namespace std;

    int main () {
        string in[256];
        cout << "Please enter a single word to get the string size back" << endl;
        cin >> in;
err->   cout << "The size of your string, \"" << in << "\", is " << (unsigned)strlen(in, in);
        cout << "!" << endl;
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a single word to get the string size back"
        num <- getLine
err->   let size = length num num
        putStrLn $ "The size of your string, \"" ++ num ++ "\", is " ++ show size ++ "!"

And the errors:

    -- C++ Error --
    main.cpp:16:78: error: too many arguments to function 'int newLength(std::string)'
    main.cpp:6:5: note: declared here
    Compilation failed.
    -- Haskell Error --
    Couldn't match expected type 'String -> t0' with actual type 'Int'  
    The function 'length' is applied to two arguments,
    but its type '[Char] -> Int' has only one
    In the expression: length num num
    In an equation for 'size': size = length num num
    Failed, modules loaded: none.

The C++ error clearly explains how to fix the code, and I even understand the Haskell error this time. Both languages tell me that there are too many arguments. Yet the C++ error message tells me this without a bunch of technical jargon. So even when Haskell is actually helpful with its error messages, it still manages to hide what it wants the user to do.


To me, Haskell seems like a language only for experienced programmers because the errors are not user-friendly. How can I write code if a few simple mistakes cripple my progress? Haskell’s compiler ghc simply lags behind g++ in terms of useful error messages for beginners.

Mike’s Epilogue

I’ve created a patch for ghc that clarifies the specific error messages that Paul had trouble with (and a few related ones). In particular:

  1. Anytime there is a parse error caused by a malformed if, case, lambda, or (non-monadic) let, ghc will now remind the programmer of the correct syntax. In the first example Paul gives above, we would get the much clearer:

    parse error in if statement: missing required else clause
  2. To help with the second example, anytime ghc encounters a parse error caused by a <- token, it now outputs the hint:

    Perhaps this statement should be within a 'do' block?
  3. The third example Paul points out comes from the type checker, rather than the parser. It’s a little less obvious how to provide good hints here. My idea is based on the fact that it is fairly rare for functions to be instances of type classes. The only example I know of is the Monad instance for ((->) a).

    Therefore, if the type checker can’t find an instance for a function, the more likely scenario is that the programmer simply did not pass enough parameters to the function. My proposed change is that in this situation, ghc would output the hint:

    maybe you haven't applied enough arguments to a function?

This patch doesn’t completely fix ghc’s problem with poor error messages. For example, it doesn’t address Paul’s last point about type errors being verbose. But hopefully it will make it a little easier for aspiring haskellers who still aren’t familiar with all the syntax rules.

September 17, 2014 12:00 AM

September 16, 2014

Yesod Web Framework

Persistent 2.1 Release Candidate

We are happy to announce a stable persistent 2 release candidate.

We previously announced an unstable release of persistent 2. It was a good idea to call it unstable, because some commenters pointed out that we were not exposing the full power of the new flexible Key type. This led to a couple of breaking releases that reorganized the internal types of persistent. All of these are on the unstable 2.0.x series.

persistent-2.0.3.* is on hackage now. We consider this a release candidate that will be promoted to persistent-2.1. We may wait until the haddocks build on hackage before releasing.

Ongoing persistent maintainership

Persistent is of huge importance to the Haskell community, playing the role of default ORM. The project has benefited immensely from community involvement. We get a lot of prompt bug reports that often include fixes. And there have been many great new features added by contributors.

However, persistent is lacking in dedicated maintainers for each backend. We would like for Michael to keep doing a great job stewarding the project, but for others to step up and help own a SQL backend.

An extreme example of a lack of backend maintenance is when we received a pull request for a CouchDB backend. It was great to share the code as a starting point, but it was already using an older version of persistent and is now sitting in the experimental folder in a bit-rotted state.

In general, a persistent backend can only be first class and in our tree with a dedicated maintainer. Michael and I maintain persistent and persistent-template. I maintain the persistent-mongoDB backend. The issue now is more with the SQL backends, where the maintenance and development for them is being pushed on Michael. For example, I implemented custom Key types in persistent, persistent-template, and persistent-mongoDB. Michael and I implemented them for persistent-sqlite, but they still need to be implemented for persistent-postgresql and persistent-mysql.

Maintaining persistent and persistent-template has had a questionable cost/benefit ratio for me. But I have personally found that maintaining the persistent-mongoDB backend has paid off well for me. I need to have a good understanding of what is happening with my code that deals with the database. Rather than treating it as a black box, I make continuous incremental improvements to the library that I rely on, and I can smoothly get code onto hackage rather than having local modifications.

Let us know if you are interested in helping to maintain a backend.

September 16, 2014 10:15 PM

Neil Mitchell

Towards Shake 1.0

Summary: I've just released a new version of Shake, with a --demo feature and an underlying continuation monad. I want to release v1.0 in the near future.

I've just released a new version of the Shake build system, version 0.13.3. While the version number is only 0.0.1 higher, the changelog lists a large number of improvements. In particular, two changes are:

  • The Action monad is now based on continuations, which allows Shake to suspend threads without requiring a GHC RTS thread. The result is significantly less memory used on thread stacks. I still find it quite amazing that Haskell has such powerful and robust abstraction mechanisms that a huge change doesn't even break the API.
  • The shake binary now features a --demo mode, invoked by running shake --demo. This mode generates a Shake project, compiles it, and shows off some of the features of Shake. You can see the output of --demo here.

Version 1.0

With the two features above, I'm now looking towards Shake version 1.0. I'm not looking to make any significant backwards-incompatible change, or necessarily any code/API changes at all. However, if you have anything you think should be addressed before reaching such a milestone, please comment on the issue tracker or email the mailing list.

Shake website

The one thing I still want to finish before releasing version 1.0 is to have a proper website for Shake. I've registered which will host the content, and have set up GitHub pages to serve it up. I have some rough content in the docs directory and a prototype generator in the website directory - as an example it currently generates something a bit like this for the user manual, but with a table of contents when run through the latest generator. I'd appreciate any help with the content, the generator, or the styling - just email the mailing list.

by Neil Mitchell at September 16, 2014 08:54 PM

Chung-chieh Shan

A challenge for a better community

Donation button
Donate to the Ada Initiative

Did you know that all ACM-sponsored conferences have an anti-harassment policy? I didn’t, until I chaired the Haskell Symposium last year. The concise policy says, among other things, that people shouldn’t use my family constitution to interfere with my professional participation. And the policy has teeth. That’s great.

My not knowing the policy and not seeing it publicized didn’t make me go out of my way to harass anyone. But it did make me less sure and less comfortable that I belonged at ICFP. Briefly, it’s because I didn’t know if it would be common ground at the conference that my actual self was fully human. That’s not something I can take for granted in general society. Also, it’s because I didn’t know whether my fellow conference attendees were aware of the policy. We could all use a reminder, and a public signal that we mean it.

For these reasons, I’m very happy that ICFP will start to publicize ACM’s existing anti-harassment policy and make sure everyone registered knows it. All ACM conferences should do it. That’s why Tim Chevalier, Clement Delafargue, Adam Foltzer, Eric Merritt, and I are doing two things. We ask you to join us:

  1. Donate to the Ada Initiative. Our goal is for the functional programming community to raise $8192 by the end of Friday (Sept 19) UTC. To count toward this goal, please use this link:
  2. Call on the ACM and tell your friends. For example, I tweeted this:
    I donate to @AdaInitiative because I want @TheOfficialACM events to announce their anti-harassment policy #lambda4ada

Thanks for improving our professional homes!

(UPDATE: Wow! We reached our initial goal of $4096 in just 5 hours! We increased the goal to $8192, thanks to your generosity. And if we raise $16,384, we will sing “There’s no type class like Show type class” and put a recording on the Internet.)

September 16, 2014 04:09 PM

September 15, 2014

Robert Harper

Scotland: Vote No

So far I’ve ignored the back and forth on the Scottish referendum on secession from the United Kingdom, but this weekend I decided that it was past time for me to sort it out.  For those of you who don’t know me, I’ll mention that I lived for 3.5 years in Scotland quite some time ago, so I am not completely ignorant of the cultural and political issues that underlie the debate.  As a rule my political views are very much in line with those of the average Scot, solidly Labour Party back in the day when people like Derek Hatton and Ken Livingstone and Roy Hattersley and Tony Benn defined what that meant.  Despite Tony Blair’s slimy “third way” nonsense, and his toadying up to Dick “Dick” Cheney’s sock puppet to help lie us into the Iraq war, Scotland in national politics remains solidly Labour; practically every Scottish seat is a Labour seat.

Although I used to be so up on British politics that I could read and enjoy Private Eye, it’s been a long while since I’ve paid more than scant attention to what’s been going on there, apart from noting that The Scotsman was one of the few sources of truth about the Iraq War back when it really mattered.  The Scots have spines.

I’m no historian, but I do have basic understanding of Scottish history, particularly as regards the English, and am very familiar with the Scottish concept of valor in glorious defeat.  I understand full well that practically every Scotsman harbors some resentment towards the English for centuries of injustices, including the highland clearances, and, more recently, the appropriation of the oil in Scottish territory for the scant benefit of the Scots themselves.  And I am well aware of the bravery and sacrifice that so many Scots made fighting against the Axis during World War II.

My home institution, Carnegie Mellon University, was founded by a Scotsman from Kirkcaldy, just across the spectacular Forth Bridge from Edinburgh.  Carnegie was born into penury and died as the wealthiest man on earth, far wealthier relative to GDP than Gates by a wide margin.  Carnegie was extraordinary, but the Scots in general punch far above their weight class in all things, especially industrious self-reliance.

In short, I love Scotland, and consider it to be a second home.  (OK, the weather is appalling, but we’ll set that aside for the time being.)

Emotionally, I am deeply sympathetic to the Scottish independence movement.  I know full well how poorly the U.K. treats Scotland and its interests.  Politics in the UK revolves around the “home counties” in the south of England; the terminology tells you all you need to know.  One time while watching the weather report on the BBC, the national broadcasting network, the announcer said that there was some horrendous weather coming our way, but that “it’ll mostly be up in Scotland, though”.  Though.  Though.

But I urge all my Scottish friends to vote NO on the independence proposal.  It makes no sense whatsoever in its present form, and represents to me a huge scam being perpetrated by the SNP to seize power and impose policies that nearly every Scot, judging from their voting record over decades and decades, would oppose.  The whole movement seems driven by the powerful urge to finally stick it to the English and get their country back, and Salmond is exploiting that to the hilt.  Back when I lived in Scotland I looked into the SNP, because even then I had separatist sympathies, but when I did, it was obvious why they had so few backers.  They’re just Tories without the class structure, more akin to our Tea Party lunatics than to the British Conservatives, and steadfastly opposed to policies, such as well-funded public education, that nearly all Scots support, and determined to follow the post-cold war Slovakian model of slashing taxes on the wealthy in the hope of attracting business to the country.  Having not followed Scottish politics for so long, it is astonishing to me that the SNP has managed to gain a majority in the Scottish Parliament, while the voting pattern at the national level has not changed at all.  How did this happen?  From my position of ignorance of the last decade or so of politics in Scotland, it looks as though Salmond is a slick operator who has pulled off a colossal con by exploiting the nationalist tendencies that lie within every Scot.

But never mind Salmond, the main reason that Scots must vote NO on the referendum is that it proposes to keep the English pound as Scotland’s national currency!  This is such a preposterous idea that I can only suspect dishonesty and deceit, because no sane political leader of honest intent could ever voluntarily place his or her country’s economic future in the hands of another.  The Bank of England will, particularly after separation, have no interest whatsoever in the economic conditions in Scotland when determining its policies on the pound.  And the Bank of Scotland will have no ability to control its own currency, the prime means of maintaining economic balance between labor and capital.  The Scots will, in effect, be putting themselves on a gold standard, the stupidest possible monetary system, so that, in a crisis, they will have to buy or borrow pounds, at interest, in emergency conditions, to deal with, say, the failure of the Royal Bank of Scotland (but don’t worry, that sort of thing can never happen again).  And the Bank of Scotland will have no means of stimulating the economy in a demand slump other than borrowing pounds from somewhere outside the country, rendering themselves in debt beyond their means.  And this will become an excuse for dismantling the social system that has been so important to elevating the Scots from poverty to a decent standard of living within one or two generations.  Just look at the poor PIGS in the Euro-zone being pushed around by Germany, especially, to satisfy the conveniences of the German bankers, and to hell with the living, breathing souls in Greece or Spain or Ireland or Portugal, to name the canonical victims.

A country that does not control its own currency is not independent and cannot be independent.  It’s an illusion.  Just what Salmond’s true intentions are is not entirely clear to me, but on the basis of his monetary policies alone, I implore my Scottish friends to suppress the natural wish to make a statement of pride, and instead do the sensible thing.  The proposal to be voted on this week is not a spittle on the Heart of Midlothian, it is an irrevocable decision to place Scotland in an even worse position with respect to England than it already is in.

Listen to reason.  Vote NO on independence.

Filed under: Research Tagged: Scottish referendum

by Robert Harper at September 15, 2014 04:05 PM

The GHC Team

GHC Weekly News - 2014/09/15

Hi *,

Here's a new thing: Blog posts! That's right. A while back, we started a new set of emails on the developers list containing weekly updates, from GHC HQ. But we eventually decided it should be more broad and encompass the work GHC sees as a project - including all the things our contributors do.

So now it's the weekly GHC news - and we (or, well, I) have decided to blogify the weekly emails!

So without further ado, here's the current recap. The original mailing list copy is available here.

  • As Gabor mentioned on the list earlier today, I (Austin) accidentally broke the Windows build. Sorry. :( We really need to get Phab building Windows too ASAP... I'm working on a fix this morning.
  • I sent out the HCAR draft this morning. Please edit it! I think we have a few weeks of lead time however, so we're not in a rush like last time. But I will send reminders. :)
  • The server migration for seems to have gone pretty smoothly in the past week. It's had plenty of traffic so far. The full migration is still ongoing and I want to complete it this week.
  • I've finished reorganizing some of the Git and Repository pages after some discussion. We now have the Repositories[1] page, linked to on the left side, which details some notes on the repositories we use, and links to other various pages. I'm thinking of replacing this side-bar "root" with a link to the main Git[2] page, perhaps.
  • Miscellaneous: and now sets the Strict-Transport-Security header. This just means you always use SSL now when you visit those pages (so you can't be connection-hijacked via a 301 redirect).
  • GHC works on Wine apparently for all you Linux users - thanks Mikolaj![3]
  • Jan had some questions about infrastructure which I just followed up on this morning. In particular: does anyone feel strongly about his first question?[4]
  • Herbert committed the first part of the Traversable/Foldable changes, by moving the typeclasses to Prelude. This is part of an ongoing series of patches. Things like adding Bifunctor will finally come after this.[5]

Also, added bonus: we'll start including some of the tickets we closed this week.

Closed tickets for the past week include: #9585, #9545, #9581, #6086, #9558, and #3658.

Please let me know if you have any questions.


by thoughtpolice at September 15, 2014 01:47 PM

Mike Izbicki

Comparing AVL Trees in C++ and Haskell


posted on 2014-09-15 by Dat Do

This post compares the runtimes of AVL tree operations in C++ vs Haskell. In particular, we insert 713,000 strings from a file into an AVL Tree. This is a \(O(n \log n)\) operation. But we want to investigate what the constant factor looks like in different situations.

Experimental setup: All the code for these tests is available in the github repository. The C++ AVL tree was created in a data structures course that I took recently and the Haskell AVL tree is from the Haskell library Data.Tree.AVL. Additionally, the Haskell code stores the strings as ByteStrings because they are much more efficient than the notoriously slow String. To see how the runtime is affected by files of different sizes, the file was first partitioned into 10 segments. The first segment has 71,300 words, the second 71,300 * 2 words, and so on. Both the C++ and Haskell programs were compiled with the -O2 flag for optimization. The test on each segment is the average runtime of three separate runs.

Here’s the results:



C++ is a bit faster than Haskell on the last partition for this test.

I guess this is because Haskell operates on immutable data. Every time a new element is to be inserted into the Haskell AVL tree, new parent nodes must be created because the old parent nodes cannot be changed. This creates quite a bit of overhead. C++ on the other hand, does have mutable data and can simply change the node that a parent node is pointing to. This is faster than making a whole new copy like the Haskell code does.
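This path copying is easy to see in a toy (unbalanced) search tree; the sketch below illustrates the idea only, and is not the Data.Tree.AVL implementation. Each insert rebuilds the nodes on the root-to-leaf path and shares everything else with the old tree:

```haskell
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Rebuild every node on the search path; subtrees off the path are
-- shared between the old and new versions of the tree.
insert :: Ord a => a -> Tree a -> Tree a
insert x Leaf = Node Leaf x Leaf
insert x t@(Node l v r)
    | x < v     = Node (insert x l) v r
    | x > v     = Node l v (insert x r)
    | otherwise = t  -- already present: share the whole subtree

-- In-order traversal, for checking the contents.
toList :: Tree a -> [a]
toList Leaf         = []
toList (Node l v r) = toList l ++ [v] ++ toList r
```

The C++ tree avoids those allocations by mutating child pointers in place, which accounts for much of its constant-factor edge.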

Is there an easy way to speed up our Haskell code?

There is a Haskell library called parallel that makes parallel computations really convenient. We’ll try to speed up our program with this library.

You might think that it is unfair to compare multithreaded Haskell against C++ that is not multithreaded. And you’re absolutely right! But let’s be honest, manually working with pthreads in C++ is quite the headache, while parallelism in Haskell is super easy.

Before we look at the results, let’s look at the parallelized code. What we do is create four trees each with a portion of the set of strings. Then, we call par on the trees so that the code is parallelized. Afterwards, we union the trees to make them a single tree. Finally, we call deepseq so that the code is evaluated.

{-# LANGUAGE TemplateHaskell #-}
import Control.DeepSeq.TH
import Control.Concurrent
import Data.Tree.AVL as A
import Data.COrdering
import System.CPUTime
import qualified Data.ByteString.Char8 as B
import Control.DeepSeq
import Data.List.Split
import System.Environment
import Control.Parallel

$(deriveNFData ''AVL)

-- Inserts elements from list into AVL tree
load :: AVL B.ByteString -> [B.ByteString] -> AVL B.ByteString
load t [] = t
load t (x:xs) = A.push (fstCC x) x (load t xs)

main = do
    args <- getArgs
    contents <- fmap B.lines $ B.readFile $ args !! 0
    let l = splitEvery (length contents `div` 4) contents
    deepseq contents $ deepseq l $ return ()
    start <- getCPUTime

    -- Loading the tree with the subpartitions
    let t1 = load empty $ l !! 0
    let t2 = load empty $ l !! 1
    let t3 = load empty $ l !! 2
    let t4 = load empty $ l !! 3
    let p = par t1 $ par t2 $ par t3 t4

    -- Calling union to combine the trees
    let b = union fstCC t1 t2
    let t = union fstCC t3 t4
    let bt = union fstCC b t
    let bt' = par b $ par t bt
    deepseq p $ deepseq bt' $ return ()

    end <- getCPUTime
    n <- getNumCapabilities
    let diff = ((fromIntegral (end-start)) / (10^12) / fromIntegral n)
    putStrLn $ show diff

Great, so now that the Haskell code has been parallelized, we can compile and run the program again to see the difference. To compile for parallelism, we must use some special flags.

ghc -O2 filename -rtsopts -threaded

And to run the program (-N4 refers to the number of cores).

./filename +RTS -N4


Haskell now gets better runtimes than C++.

Now that we know Haskell is capable of increasing its speeds through parallelism, it would be interesting to see how the runtime is affected by the degree of parallelism.

According to Amdahl’s law, a program that is 100% parallelized will see a proportional speed up based on the number of threads of execution. For example, if a program that is 100% parallelized takes 2 seconds to run on 1 thread, then it should take 1 second to run using 2 threads. The code used for our test, however, is not 100% parallelized since there is a union operation performed at the end to combine the trees created by the separate threads. The union of the trees is a \(O(n)\) operation while the insertion of the strings into the AVL tree is a \(O\left(\frac{n \log n }{p}\right)\) operation, where \(p\) is the number of threads. Therefore, the runtime for our test should be

\[O\left(\frac{n\log{n}}{p} + n\right)\]
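To make the model concrete, here is a small illustrative sketch (my own, not from the original post) that evaluates the speedup this cost model predicts for the 713,000-string data set:

```haskell
-- Sketch of the cost model above: parallel insertion costs n * log n / p,
-- the final union costs n (constant factors ignored).
modelTime :: Double -> Double -> Double
modelTime n p = n * logBase 2 n / p + n

-- Predicted speedup of p threads over 1 thread under this model.
speedup :: Double -> Double -> Double
speedup n p = modelTime n 1 / modelTime n p

main :: IO ()
main = mapM_ report [1, 2, 4, 8]
  where
    n = 713000
    report p = putStrLn (show (round p :: Int) ++ " threads: "
                         ++ show (speedup n p) ++ "x predicted")
```

Note that even this idealised model predicts less than a 2x speedup on 2 cores, because the \(O(n)\) union term is sequential.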

Here is a graph showing the runtime of the operation on the largest set (713,000 strings) across increasing levels of parallelism.



Taking a look at the results, we can see that the improvement in runtime does not fit the 100%-parallelized theoretical model, though it follows it to some extent. Rather than the 2-core runtime being 50% of the 1-core runtime, it is 56%, with efficiency decreasing as the number of cores increases. Still, it is clear that there are significant improvements in speed from using more processor cores, and that parallelism is an easy way to get better runtimes with little effort.

September 15, 2014 12:00 AM

September 14, 2014

Philip Wadler

The British Biased Corporation

Scandalous! Nick Robinson asks Alex Salmond a question, and Salmond takes seven minutes to answer in detail.


On the evening news, Nick Robinson summarises Salmond's answer in a few seconds as 'He didn't answer'.


(Above spotted via Arc of Prosperity.)

And today, this.
I used to be a supporter of the BBC, but it's getting harder and harder to justify.

by Philip Wadler at September 14, 2014 10:01 PM

Krugman vs. Stiglitz, now with added Stiglitz

My last post quoted Joe Stiglitz, indirectly, to refute Paul Krugman's fear mongering. Now the man himself has spoken in the Sunday Herald.
As Scotland contemplates independence, some, such as Paul Krugman, have questioned the "economics".

Would Scotland, going it alone, risk a decline in standards of living or a fall in GDP? There are, to be sure, risks in any course of action: should Scotland stay in the UK, and the UK leave the EU, the downside risks are, by almost any account, significantly greater. If Scotland stays in the UK, and the UK continues in its policies which have resulted in growing inequality, even if GDP were slightly larger, the standards of living of most Scots could fall.

Cutbacks in UK public support to education and health could force Scotland to face a set of unpalatable choices - even with Scotland having considerable discretion over what it spends its money on.

But there is, in fact, little basis for any of the forms of fear-mongering that have been advanced. Krugman, for instance, suggests that there are significant economies of scale: a small economy is likely, he seems to suggest, not to do well. But an independent Scotland will still be part of Europe, and the great success of the EU is the creation of a large economic zone.

Besides, small political entities, like Sweden, Singapore, and Hong Kong have prospered, while much larger entities have not. By an order of magnitude, far more important is pursuit of the right policies.

Another example of a non-issue is the currency. There are many currency arrangements that would work. Scotland could continue using sterling - with or without England's consent.

Because the economies of England and Scotland are so similar, a common currency is likely to work far better than the euro - even without shared fiscal policy. But many small countries have managed to have a currency of their own - floating, pegged, or "managed."

by Philip Wadler at September 14, 2014 06:13 PM

Dinna fash yersel — Scotland will dae juist fine!

One relentless lie behind 'No' is that Scotland is too wee to make it on its own, counterexamples such as Denmark, Sweden, Singapore, and Hong Kong being conveniently ignored. May this post from Thomas Widmann, a Dane residing in Scotland, help to dispel the disinformation. 
Pick a random person from somewhere on this planet. Ask them to name an alcoholic drink from Scotland, and it’s very likely they’ll reply “Whisky”. Ask them to name one from Denmark, and they’ll probably be tongue-tied. (They could answer “Gammel Dansk” or “Akvavit”, but they’re just not nearly as famous as whisky.)

Now repeat the exercise, but ask about a food item. Again, it’s likely they’ll have heard of haggis but that they’ll be struggling to name anything from Denmark.

Now try a musical instrument. Bagpipes and … sorry, cannot think of a Danish one.

A sport? Scotland has golf, of course. Denmark can perhaps claim ownership of handball, but it’s not associated with Denmark in the way that golf makes everybody think of Scotland.

A piece of clothing? Everybody knows the kilt, but I’d be very surprised if anybody can name one from Denmark.

A monster? Everybody knows what’s lurking in Loch Ness, but is there anything scary in Denmark?

The only category where Denmark perhaps wins is toys, where Lego surely is more famous than anything from Scotland (but many people don’t know Lego is from Denmark).

Denmark is also well-known for butter and bacon, of course, but these aren’t Danish in origin or strongly associated with Denmark in people’s minds.

Several famous writers and philosophers were Danish (e.g., Hans Christian Andersen and Søren Kierkegaard), but Scotland can arguably list more names of the same calibre, and the Scottish ones wrote in English, which makes them much more accessible to the outside world.

Scottish universities are also ranked better than the Danish ones in recent World rankings.

Finally, Scotland has lots of oil and wind, water and waves. Denmark has some, but not nearly as much, and most other countries have less than Denmark.

Because of all of this, I don’t worry about the details when it comes to Scottish independence. If Denmark can be one of the richest countries on the planet, of course Scotland can be one too.

Yes, there might be a few tough years while the rUK are in a huff and before everything has been sorted out. And of course there will be occasional crises in the future, like in any other country.

However, unless you subscribe to the school that Denmark and other small countries like Norway and Switzerland are complete failures because they don’t have nuclear weapons and a permanent seat on the UN’s Security Council, there’s simply no reason to assume Scotland won’t do exceptionally well as an independent country in the longer term.

So I’m not worried. Of course there are many details to sort out, but at the end of the day everything will be fine. Scotland will be a hugely successful independent country. Dinna fash yersel!

by Philip Wadler at September 14, 2014 06:13 PM

September 13, 2014

The Gentoo Haskell Team

ghc 7.8.3 and rare architectures

After some initially positive experience with ghc-7.8-rc1 I’ve decided to upstream most of the Gentoo fixes.

On rare arches ghc-7.8.3 misbehaves a bit:

  • the ia64 build stopped being able to link itself after ghc-7.4 (gprel overflow)
  • on sparc, ia64 and ppc, ghc was not able to create working shared libraries
  • the integer-gmp library crashed on ia64, and we had to use integer-simple

I have written a small story of those fixes here if you are curious.


To get ghc-7.8.3 working more nicely on exotic arches you will need to backport at least the following patches:

Thank you!

by Sergei Trofimovich at September 13, 2014 12:35 PM

unsafePerformIO and missing NOINLINE

Two months ago Ivan asked me if we had a working darcs-2.8 for ghc-7.8 in Gentoo. At the time we had a workaround to compile darcs, but darcs did not work reliably. Sometimes it needed 2-3 attempts to pull a repository.

A bit later I decided to actually look at the failure case (filed on the darcs bugtracker) and do something about it. My plan to debug the mystery was simple: reproduce the difference on the same source with ghc-7.6 and ghc-7.8, and keep adding debug output until a difference I could understand popped up.

Darcs has a great debug-verbose option for most commands. I used the debugMessage function to litter the code with extra debugging statements until a complete (and horrible) picture emerged.

As you can see in the bugtracker issue, I posted various intermediate theories of what I thought went wrong there (don’t expect those comments to make much sense).

The immediate consequence of the breakage was the overwrite of a partially downloaded file. The event timeline looked simple:

  • darcs scheduled the same file for download twice (two jobs in the download queue)
  • the first download job finished
  • the notified waiter started processing the downloaded temp file
  • the second download job started by truncating the previously completed download
  • the notified waiter kept processing the now partially overwritten file and detected the breakage

Thus I first decided to fix the consequence. That did not fix the problem completely: sometimes darcs pull still complained about remote repositories being broken (missing files), but it made the errors saner (only the remote side was allegedly at fault).

Ideally, that file overwrite should not happen in the first place. Part of the problem was the predictability of temp file names.

But, OK. Then I started digging into why the 7.6/7.8 download request patterns were so severely different. At first I suspected the new IO manager as the cause. The paper says it fixed a Haskell thread scheduling issue (the paper is nice even for leisure reading!):

GHC’s RTS had a bug in which yield placed the thread back on the front of the run queue. This bug was uncovered by our use of yield, which requires that the thread be placed at the end of the run queue.

Thus I expected the bug to come from that side.

Then, determined to dig A Lot in the darcs source code, I decided to disable optimizations (-O0) to speed up rebuilds. And the bug vanished.

That made it click: unsafePerformIO might be the real problem. I grepped for all instances of unsafePerformIO and examined every definition site.

Two were especially interesting:

-- src/Darcs/Util/Global.hs
-- ...
_crcWarningList :: IORef CRCWarningList
_crcWarningList = unsafePerformIO $ newIORef []
{-# NOINLINE _crcWarningList #-}
-- ...
_badSourcesList :: IORef [String]
_badSourcesList = unsafePerformIO $ newIORef []
{- NOINLINE _badSourcesList -}
-- ...

Did you spot the bug?

Thus The Proper Fix was pushed upstream a month ago. The broken pragma meant ghc was free to inline things aggressively: _badSourcesList was inlined at every use site, throwing away all the update sites.

I don’t know whether those newIORef [] calls could be merged by CSE if the types had the same representation. Ideally the module also needs -fno-cse, or should get rid of unsafePerformIO completely :].

(Side thought: top-level global variables in C style are surprisingly non-trivial in "pure" haskell. They are easy to use via peek / poke (in a racy way), but are hard to declare / initialize.)
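For reference, here is a minimal sketch of the idiom the darcs code was aiming for (my own example, with a hypothetical variable name); note the `#` on both pragma brackets, which is exactly what was missing:

```haskell
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- A C-style top-level mutable variable. The NOINLINE pragma is essential:
-- without it, GHC may inline the unsafePerformIO call and give every use
-- site its own fresh IORef, silently discarding all updates.
badSourcesList :: IORef [String]
badSourcesList = unsafePerformIO (newIORef [])
{-# NOINLINE badSourcesList #-}

main :: IO ()
main = do
  modifyIORef badSourcesList ("http://example.invalid/repo" :)
  xs <- readIORef badSourcesList
  print (length xs)
```

With `{- NOINLINE badSourcesList -}` instead (a plain comment), optimized builds may duplicate the IORef, which is precisely the class of bug described above.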

I wondered how many Haskell packages manage to misspell GHC pragma declarations the way darcs did. There still _are_ a few such offenders:

$ fgrep -R NOINLINE . | grep -v '{-# NOINLINE' | grep '{-'
ajhc-{- NOINLINE filterFB #-}
ajhc-{- NOINLINE iterateFB #-}
ajhc-{- NOINLINE mapFB #-}
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _badSourcesList -}
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _reachableSourcesList -}
dph-lifted-copy-{- NOINLINE emptyP #-}
dph-par-{- NOINLINE emptyP #-}
dph-seq-{- NOINLINE emptyP #-}
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE showSSI #-}
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE FreeSectAnnotated.showSSI #-}
freesect-0.8/FreeSect.hs:{- # NOINLINE fs_warn_flaw #-}
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE readInt64MH #-}
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE mhDigitToInt #-}
lhc-0.10/lib/base/src/GHC/PArr.hs:{- NOINLINE emptyP #-}
property-list-{- NOINLINE doubleToWord64 -}
property-list-{- NOINLINE word64ToDouble -}
property-list-{- NOINLINE floatToWord32 -}
property-list-{- NOINLINE word32ToFloat -}
warp-{- NOINLINE readInt64MH #-}
warp-{- NOINLINE mhDigitToInt #-}

Looks like there is yet something to fix :]

It would be great if hlint could detect pragma-like comments and warn when the comment contents form a valid pragma but the comment brackets prevent it from firing:

{- NOINLINE foo -} -- bad
{- NOINLINE foo #-} -- bad
{-# NOINLINE foo -} -- bad
{-# NOINLINE foo #-} -- ok

Thanks for reading!

by Sergei Trofimovich at September 13, 2014 09:18 AM

September 12, 2014

Bjorn Buckwalter

Haskell tools for satellite operations

At last week's CUFP I did a talk called “Haskell tools for satellite operations”. The abstract is:

Since 2013-04 the presenter has been supporting SSC (the Swedish Space Corporation) in operating the telecommunications satellite “Sirius 3” from its Mission Control Center in Kiruna. Functions in the satellite vendor's operations software are breaking down as the orbit of the ageing satellite degrades. To fill in the gaps in software capabilities the presenter has developed several operational tools using Haskell.
The talk will give an overview of the satellite operations environment, the tools developed in Haskell, how they benefitted (and occasionally suffered) from the choice of implementation language, which (public) libraries were critical to their success, and how they were deployed in the satellite operations environment.
A video recording of the talk is available on the CUFP page for the talk and on youtube.

If this interests you be sure to check out the other talk from the “Functional programming in space!” track; Michael Oswald's Haskell in the Mission Control Domain.

by Björn Buckwalter at September 12, 2014 12:46 PM

Gabriel Gonzalez

Morte: an intermediate language for super-optimizing functional programs

The Haskell language provides the following guarantee (with caveats): if two programs are equal according to equational reasoning then they will behave the same. On the other hand, Haskell does not guarantee that equal programs will generate identical performance. Consequently, Haskell library writers must employ rewrite rules to ensure that their abstractions do not interfere with performance.

Now suppose there were a hypothetical language with a stronger guarantee: if two programs are equal then they generate identical executables. Such a language would be immune to abstraction: no matter how many layers of indirection you might add the binary size and runtime performance would be unaffected.

Here I will introduce such an intermediate language named Morte that obeys this stronger guarantee. I have not yet implemented a back-end code generator for Morte, but I wanted to pause to share what I have completed so far because Morte uses several tricks from computer science that I believe deserve more attention.

Morte is nothing more than a bare-bones implementation of the calculus of constructions, which is a specific type of lambda calculus. The only novelty is how I intend to use this lambda calculus: as a super-optimizer.


The typed lambda calculus possesses a useful property: every term in the lambda calculus has a unique normal form if you beta-reduce everything. If you're new to lambda calculus, normalizing an expression equates to indiscriminately inlining every function call.

What if we built a programming language whose intermediate language was lambda calculus? What if optimization was just normalization of lambda terms (i.e. indiscriminate inlining)? If so, then we could abstract freely, knowing that while compile times might increase, our final executable would never change.


Normally you would not want to inline everything because infinitely recursive functions would become infinitely large expressions. Fortunately, we can often translate recursive code to non-recursive code!

I'll demonstrate this trick first in Haskell and then in Morte. Let's begin from the following recursive List type along with a recursive map function over lists:

import Prelude hiding (map, foldr)

data List a = Cons a (List a) | Nil

example :: List Int
example = Cons 1 (Cons 2 (Cons 3 Nil))

map :: (a -> b) -> List a -> List b
map f Nil = Nil
map f (Cons a l) = Cons (f a) (map f l)

-- Argument order intentionally switched
foldr :: List a -> (a -> x -> x) -> x -> x
foldr Nil c n = n
foldr (Cons a l) c n = c a (foldr l c n)

result :: Int
result = foldr (map (+1) example) (+) 0

-- result = 9

Now imagine that we disable all recursion in Haskell: no more recursive types and no more recursive functions. Now we must reject the above program because:

  • the List data type definition recursively refers to itself

  • the map and foldr functions recursively refer to themselves

Can we still encode lists in a non-recursive dialect of Haskell?

Yes, we can!

-- This is a valid Haskell program

{-# LANGUAGE RankNTypes #-}

import Prelude hiding (map, foldr)

type List a = forall x . (a -> x -> x) -> x -> x

example :: List Int
example = \cons nil -> cons 1 (cons 2 (cons 3 nil))

map :: (a -> b) -> List a -> List b
map f l = \cons nil -> l (\a x -> cons (f a) x) nil

foldr :: List a -> (a -> x -> x) -> x -> x
foldr l = l

result :: Int
result = foldr (map (+ 1) example) (+) 0

-- result = 9

Carefully note that:

  • List is no longer defined recursively in terms of itself

  • map and foldr are no longer defined recursively in terms of themselves

Yet, we somehow managed to build a list, map a function over the list, and fold the list, all without ever using recursion! We do this by encoding the list as a fold, which is why foldr became the identity function.

This trick works for more than just lists. You can take any recursive data type and mechanically transform the type into a fold and transform functions on the type into functions on folds. If you want to learn more about this trick, the specific name for it is "Boehm-Berarducci encoding". If you are curious, this in turn is equivalent to an even more general concept from category theory known as "F-algebras", which let you encode inductive things in a non-inductive way.
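To make the generality concrete, here is the same encoding applied to Peano naturals (my own example, not from the post): a number becomes its own fold, and the recursive `Nat` type disappears.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Boehm-Berarducci encoding of natural numbers: a Nat *is* its fold,
-- taking a "successor" function and a "zero" value.
type Nat = forall x . (x -> x) -> x -> x

zero :: Nat
zero = \_s z -> z

suc :: Nat -> Nat
suc n = \s z -> s (n s z)

-- Addition without recursion: apply the successor m times on top of n.
plus :: Nat -> Nat -> Nat
plus m n = \s z -> m s (n s z)

-- Interpreting the fold at Int recovers an ordinary number.
toInt :: Nat -> Int
toInt n = n (+ 1) 0

main :: IO ()
main = print (toInt (plus (suc zero) (suc (suc zero))))  -- 3
```

Just as with lists, the eliminator is trivial: folding an encoded `Nat` is simply applying it.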

Non-recursive code greatly simplifies equational reasoning. For example, we can easily prove that we can optimize map id l to l:

map id l

-- Inline: map f l = \cons nil -> l (\a x -> cons (f a) x) nil
= \cons nil -> l (\a x -> cons (id a) x) nil

-- Inline: id x = x
= \cons nil -> l (\a x -> cons a x) nil

-- Eta-reduce
= \cons nil -> l cons nil

-- Eta-reduce
= l

Note that we did not need to use induction to prove this optimization because map is no longer recursive. The optimization became downright trivial, so trivial that we can automate it!

Morte optimizes programs using this same simple scheme:

  • Beta-reduce everything (equivalent to inlining)
  • Eta-reduce everything

To illustrate this, I will desugar our high-level Haskell code to the calculus of constructions. This desugaring process is currently manual (and tedious), but I plan to automate this, too, by providing a front-end high-level language similar to Haskell that compiles to Morte:


(   \(List : * -> *)
->  \(  map
    :   forall (a : *)
    ->  forall (b : *)
    ->  (a -> b) -> List a -> List b
    )
->  \(id : forall (a : *) -> a -> a)
->  \(a : *) -> map a a (id a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- map
(   \(a : *)
->  \(b : *)
->  \(f : a -> b)
->  \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
->  \(x : *)
->  \(Cons : b -> x -> x)
->  \(Nil : x)
->  l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- id
(\(a : *) -> \(va : a) -> va)

This line of code is the "business end" of the program:

\(a : *) -> map a a (id a)

The extra 'a' business is because in any polymorphic lambda calculus you explicitly accept polymorphic types as arguments and specialize functions by applying them to types. Higher-level functional languages like Haskell or ML use type inference to automatically infer and supply type arguments when possible.

We can compile this program using the morte executable, which accepts a Morte program on stdin, outputs the program's type to stderr, and outputs the optimized program on stdout:

$ morte <
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(l : ∀(x : *) → (a → x → x) → x → x) → l

The first line is the type, which is a desugared form of:

forall a . List a -> List a

The second line is the program, which is the identity function on lists. Morte optimized away the map completely, the same way we did by hand.

Morte also optimized away the rest of the code, too. Dead-code elimination is just an emergent property of Morte's simple optimization scheme.


We could double-check our answer by asking Morte to optimize the identity function on lists:


(   \(List : * -> *)
->  \(id : forall (a : *) -> a -> a)
->  \(a : *) -> id (List a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- id
(\(a : *) -> \(va : a) -> va)

Sure enough, Morte outputs an alpha-equivalent result (meaning the same up to variable renaming):

$ ~/.cabal/bin/morte <
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(va : ∀(x : *) → (a → x → x) → x → x) → va

We can even use the morte library to mechanically check whether two Morte expressions are alpha-, beta-, and eta-equivalent. We can parse our two Morte files into Morte's Expr type and then use the Eq instance for Expr to test for equivalence:

$ ghci
Prelude> import qualified Data.Text.Lazy.IO as Text
Prelude Text> txt1 <- Text.readFile ""
Prelude Text> txt2 <- Text.readFile ""
Prelude Text> import Morte.Parser (exprFromText)
Prelude Text Morte.Parser> let e1 = exprFromText txt1
Prelude Text Morte.Parser> let e2 = exprFromText txt2
Prelude Text Morte.Parser> import Control.Applicative (liftA2)
Prelude Text Morte.Parser Control.Applicative> liftA2 (==) e1 e2
Right True
$ -- `Right` means both expressions parsed successfully
$ -- `True` means they are alpha-, beta-, and eta-equivalent

We can use this to mechanically verify that two Morte programs optimize to the same result.

Compile-time computation

Morte can compute as much (or as little) at compile time as you want. The more information you encode directly within lambda calculus, the more compile-time computation Morte will perform for you. For example, if we translate our Haskell List code entirely to lambda calculus, then Morte will statically compute the result at compile time.


( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *) -> forall (b : *)
-> (a -> b) -> List a -> List b
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
-> ( \(two : Nat)
-> \(three : Nat)
-> ( \(example : List Nat)

-> foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

-- example
(Cons Nat one (Cons Nat two (Cons Nat three (Nil Nat))))

-- two
((+) one one)

-- three
((+) one ((+) one one))

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas

The relevant line is:

foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

If you remove the type-applications to Nat, this parallels our original Haskell example. We can then evaluate this expression at compile time:

$ morte <
∀(a : *) → (a → a) → a → a

λ(a : *) → λ(Succ : a → a) → λ(Zero : a) → Succ (Succ (Succ
(Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))

Morte reduces our program to a church-encoded nine.

Run-time computation

Morte does not force you to compute everything using lambda calculus at compile time. Suppose that we wanted to use machine arithmetic at run-time instead. We can do this by parametrizing our program on:

  • the Int type,
  • operations on Ints, and
  • any integer literals we use

We accept these "foreign imports" as ordinary arguments to our program:


-- Foreign imports
\(Int : *) -- Foreign type
-> \((+) : Int -> Int -> Int) -- Foreign function
-> \((*) : Int -> Int -> Int) -- Foreign function
-> \(lit@0 : Int) -- Literal "1" -- Foreign data
-> \(lit@1 : Int) -- Literal "2" -- Foreign data
-> \(lit@2 : Int) -- Literal "3" -- Foreign data
-> \(lit@3 : Int) -- Literal "1" -- Foreign data
-> \(lit@4 : Int) -- Literal "0" -- Foreign data

-- The rest is compile-time lambda calculus
-> ( \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b) -> List a -> List b
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
-> ( \(example : List Int)

-> foldr Int (map Int Int ((+) lit@3) example) Int (+) lit@4

-- example
(Cons Int lit@0 (Cons Int lit@1 (Cons Int lit@2 (Nil Int))))

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas

We can use Morte to optimize the above program and Morte will reduce the program to nothing but foreign types, operations, and values:

$ morte <
∀(Int : *) → (Int → Int → Int) → (Int → Int → Int) → Int →
Int → Int → Int → Int → Int

λ(Int : *) → λ((+) : Int → Int → Int) → λ((*) : Int → Int →
Int) → λ(lit : Int) → λ(lit@1 : Int) → λ(lit@2 : Int) →
λ(lit@3 : Int) → λ(lit@4 : Int) → (+) ((+) lit@3 lit) ((+)
((+) lit@3 lit@1) ((+) ((+) lit@3 lit@2) lit@4))

If you study that closely, you can see that Morte adds lit@3 (the "1" literal) to each element of the list and then sums the results. We can then pass this foreign syntax tree to our machine arithmetic backend to transform those foreign operations into efficient ones.

Morte lets you choose how much information you want to encode within lambda calculus. The more information you encode in lambda calculus the more Morte can optimize your program, but the slower your compile times will get, so it's a tradeoff.


Corecursion is the dual of recursion. Where recursion works on finite data types, corecursion works on potentially infinite data types. An example would be the following infinite Stream in Haskell:

data Stream a = Cons a (Stream a)

numbers :: Stream Int
numbers = go 0
  where go n = Cons n (go (n + 1))

-- numbers = Cons 0 (Cons 1 (Cons 2 (...

map :: (a -> b) -> Stream a -> Stream b
map f (Cons a l) = Cons (f a) (map f l)

example :: Stream Int
example = map (+ 1) numbers

-- example = Cons 1 (Cons 2 (Cons 3 (...

Again, pretend that we disable any function from referencing itself, so that the above code becomes invalid. This time we cannot reuse the trick from the previous sections, because we cannot encode numbers as a fold without it referencing itself. Try it if you don't believe me.

However, we can still encode corecursive things in a non-corecursive way. This time, we encode our Stream type as an unfold instead of a fold:

-- This is also valid Haskell code

{-# LANGUAGE ExistentialQuantification #-}

data Stream a = forall s . MkStream
    { seed :: s
    , step :: s -> (a, s)
    }

numbers :: Stream Int
numbers = MkStream 0 (\n -> (n, n + 1))

map :: (a -> b) -> Stream a -> Stream b
map f (MkStream s0 k) = MkStream s0 k'
  where
    k' s = (f a, s')
      where (a, s') = k s

In other words, we store an initial seed of some type s and a step function of type s -> (a, s) that emits one element of our Stream. The type of our seed s can be anything and in our numbers example, the type of the internal state is Int. Another stream could use a completely different internal state of type (), like this:

-- ones = Cons 1 ones

ones :: Stream Int
ones = MkStream () (\_ -> (1, ()))

The general name for this trick is an "F-coalgebra" encoding of a corecursive type.
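One way to convince yourself that this encoding still behaves like an infinite stream is to unfold it a bounded number of times. Here is a small sketch (the `takeS` helper is mine, not from the post; the record syntax is dropped for brevity):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- The post's Stream type: a hidden seed plus a step function.
data Stream a = forall s . MkStream s (s -> (a, s))

numbers :: Stream Int
numbers = MkStream 0 (\n -> (n, n + 1))

-- Observe a finite prefix of the infinite stream by applying the step
-- function a bounded number of times to the evolving seed.
takeS :: Int -> Stream a -> [a]
takeS n (MkStream s0 k) = go n s0
  where
    go 0 _ = []
    go m s = let (a, s') = k s in a : go (m - 1) s'

main :: IO ()
main = print (takeS 5 numbers)  -- [0,1,2,3,4]
```

Note that `takeS` itself is recursive; the point is that the *stream* no longer is: all the corecursion lives in the seed/step pair.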

Once we encode our infinite stream non-recursively, we can safely optimize the stream by inlining and eta reduction:

map id l

-- l = MkStream s0 k
= map id (MkStream s0 k)

-- Inline definition of `map`
= MkStream s0 k'
  where
    k' s = (id a, s')
      where (a, s') = k s

-- Inline definition of `id`
= MkStream s0 k'
  where
    k' s = (a, s')
      where (a, s') = k s

-- Inline: (a, s') = k s
= MkStream s0 k'
  where
    k' s = k s

-- Eta reduce
= MkStream s0 k'
  where
    k' = k

-- Inline: k' = k
= MkStream s0 k

-- l = MkStream s0 k
= l

Now let's encode Stream and map in Morte and compile the following four expressions:

map id

id

map f . map g

map (f . g)

Save the following Morte file to and then uncomment the expression you want to test:

(   \(id : forall (a : *) -> a -> a)
-> \( (.)
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> (a -> c)
-> \(Pair : * -> * -> *)
-> \(P : forall (a : *) -> forall (b : *) -> a -> b -> Pair a b)
-> \( first
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (a -> b)
-> Pair a c
-> Pair b c

-> ( \(Stream : * -> *)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b)
-> Stream a
-> Stream b

-- example@1 = example@2
-> ( \(example@1 : forall (a : *) -> Stream a -> Stream a)
-> \(example@2 : forall (a : *) -> Stream a -> Stream a)

-- example@3 = example@4
-> \( example@3
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c

-> \( example@4
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c

-- Uncomment the example you want to test
-> example@1
-- -> example@2
-- -> example@3
-- -> example@4

-- example@1
(\(a : *) -> map a a (id a))

-- example@2
(\(a : *) -> id (Stream a))

-- example@3
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> map a c ((.) a b c f g)

-- example@4
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> (.) (Stream a) (Stream b) (Stream c) (map b c f) (map a b g)

-- Stream
( \(a : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \( st
: forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x
-> \(x : *)
-> \(S : forall (s : *) -> s -> (s -> Pair b s) -> x)
-> st
( \(s : *)
-> \(seed : s)
-> \(step : s -> Pair a s)
-> S
(\(seed@1 : s) -> first a b s f (step seed@1))

-- id
(\(a : *) -> \(va : a) -> va)

-- (.)
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> \(va : a)
-> f (g va)

-- Pair
(\(a : *) -> \(b : *) -> forall (x : *) -> (a -> b -> x) -> x)

-- P
( \(a : *)
-> \(b : *)
-> \(va : a)
-> \(vb : b)
-> \(x : *)
-> \(P : a -> b -> x)
-> P va vb

-- first
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : a -> b)
-> \(p : forall (x : *) -> (a -> c -> x) -> x)
-> \(x : *)
-> \(Pair : b -> c -> x)
-> p x (\(va : a) -> \(vc : c) -> Pair (f va) vc)

Both example@1 and example@2 will generate alpha-equivalent code:

$ morte <
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → st

$ morte <
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → va

Similarly, example@3 and example@4 will generate alpha-equivalent code:

$ morte <
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → st x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va)))))

$ morte <
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → va x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va))))

We inadvertently proved stream fusion for free, but we're still not done yet! Everything we learn about recursive and corecursive sequences can be applied to model recursive and corecursive effects!
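The fusion law these examples verify can also be sketched in ordinary Haskell. The following is an illustrative existential seed/step encoding rather than the Church encoding used above, and all names are made up for the sketch:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A stream is a hidden seed plus a step function producing one
-- element and the next seed.
data Stream a = forall s. Stream s (s -> (a, s))

-- Mapping rewrites only the step function; no intermediate stream
-- is ever built, which is why two maps fuse into one.
smap :: (a -> b) -> Stream a -> Stream b
smap f (Stream seed step) = Stream seed step'
  where step' s = let (a, s') = step s in (f a, s')

-- Observe a finite prefix so the two pipelines can be compared.
takeS :: Int -> Stream a -> [a]
takeS n (Stream seed step) = go n seed
  where go 0 _ = []
        go k s = let (a, s') = step s in a : go (k - 1) s'

nats :: Stream Int
nats = Stream 0 (\n -> (n, n + 1))
```

Here `takeS 3 (smap (*2) (smap (+1) nats))` and `takeS 3 (smap ((*2) . (+1)) nats)` both produce `[2,4,6]`, which is the `map f . map g = map (f . g)` law that Morte proves by normalisation.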


I will conclude this post by showing how to model both recursive and corecursive programs that have side effects. The recursive program will echo ninety-nine lines from stdin to stdout. The equivalent Haskell program is in the comment header:


-- The Haskell code we will translate to Morte:
-- import Prelude hiding (
-- (+), (*), IO, putStrLn, getLine, (>>=), (>>), return )
-- -- Simple prelude
-- data Nat = Succ Nat | Zero
-- zero :: Nat
-- zero = Zero
-- one :: Nat
-- one = Succ Zero
-- (+) :: Nat -> Nat -> Nat
-- Zero + n = n
-- Succ m + n = m + Succ n
-- (*) :: Nat -> Nat -> Nat
-- Zero * n = Zero
-- Succ m * n = n + (m * n)
-- foldNat :: Nat -> (a -> a) -> a -> a
-- foldNat Zero f x = x
-- foldNat (Succ m) f x = f (foldNat m f x)
-- data IO r
-- = PutStrLn String (IO r)
-- | GetLine (String -> IO r)
-- | Return r
-- putStrLn :: String -> IO U
-- putStrLn str = PutStrLn str (Return Unit)
-- getLine :: IO String
-- getLine = GetLine Return
-- return :: a -> IO a
-- return = Return
-- (>>=) :: IO a -> (a -> IO b) -> IO b
-- PutStrLn str io >>= f = PutStrLn str (io >>= f)
-- GetLine k >>= f = GetLine (\str -> k str >>= f)
-- Return r >>= f = f r
-- -- Derived functions
-- (>>) :: IO U -> IO U -> IO U
-- m >> n = m >>= \_ -> n
-- two :: Nat
-- two = one + one
-- three :: Nat
-- three = one + one + one
-- four :: Nat
-- four = one + one + one + one
-- five :: Nat
-- five = one + one + one + one + one
-- six :: Nat
-- six = one + one + one + one + one + one
-- seven :: Nat
-- seven = one + one + one + one + one + one + one
-- eight :: Nat
-- eight = one + one + one + one + one + one + one + one
-- nine :: Nat
-- nine = one + one + one + one + one + one + one + one + one
-- ten :: Nat
-- ten = one + one + one + one + one + one + one + one + one + one
-- replicateM_ :: Nat -> IO U -> IO U
-- replicateM_ n io = foldNat n (io >>) (return Unit)
-- ninetynine :: Nat
-- ninetynine = nine * ten + nine
-- main_ :: IO U
-- main_ = replicateM_ ninetynine (getLine >>= putStrLn)

-- "Free" variables
( \(String : * )
-> \(U : *)
-> \(Unit : U)

-- Simple prelude
-> ( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(foldNat : Nat -> forall (a : *) -> (a -> a) -> a -> a)
-> \(IO : * -> *)
-> \(return : forall (a : *) -> a -> IO a)
-> \((>>=)
: forall (a : *)
-> forall (b : *)
-> IO a
-> (a -> IO b)
-> IO b
-> \(putStrLn : String -> IO U)
-> \(getLine : IO String)

-- Derived functions
-> ( \((>>) : IO U -> IO U -> IO U)
-> \(two : Nat)
-> \(three : Nat)
-> \(four : Nat)
-> \(five : Nat)
-> \(six : Nat)
-> \(seven : Nat)
-> \(eight : Nat)
-> \(nine : Nat)
-> \(ten : Nat)
-> ( \(replicateM_ : Nat -> IO U -> IO U)
-> \(ninetynine : Nat)

-> replicateM_ ninetynine ((>>=) String U getLine putStrLn)

-- replicateM_
( \(n : Nat)
-> \(io : IO U)
-> foldNat n (IO U) ((>>) io) (return U Unit)

-- ninetynine
((+) ((*) nine ten) nine)

-- (>>)
( \(m : IO U)
-> \(n : IO U)
-> (>>=) U U m (\(_ : U) -> n)

-- two
((+) one one)

-- three
((+) one ((+) one one))

-- four
((+) one ((+) one ((+) one one)))

-- five
((+) one ((+) one ((+) one ((+) one one))))

-- six
((+) one ((+) one ((+) one ((+) one ((+) one one)))))

-- seven
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))

-- eight
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))

-- nine
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))))

-- ten
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))))

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero

-- foldNat
( \(n : forall (a : *) -> (a -> a) -> a -> a)
-> n

-- IO
( \(r : *)
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (r -> x)
-> x

-- return
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : a -> x)
-> Return va

-- (>>=)
( \(a : *)
-> \(b : *)
-> \(m : forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (a -> x)
-> x
-> \(f : a
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (b -> x)
-> x
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : b -> x)
-> m x PutStrLn GetLine (\(va : a) -> f va x PutStrLn GetLine Return)

-- putStrLn
( \(str : String)
-> \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : U -> x)
-> PutStrLn str (Return Unit)

-- getLine
( \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : String -> x)
-> GetLine Return

This program will compile to a completely unrolled read-write loop, as most recursive programs will:

$ morte <
∀(String : *) → ∀(U : *) → U → ∀(x : *) → (String → x → x) →
((String → x) → x) → (U → x) → x

λ(String : *) → λ(U : *) → λ(Unit : U) → λ(x : *) → λ(PutStr
Ln : String → x → x) → λ(GetLine : (String → x) → x) → λ(Ret
urn : U → x) → GetLine (λ(va : String) → PutStrLn va (GetLin
e (λ(va@1 : String) → PutStrLn va@1 (GetLine (λ(va@2 : Strin
g) → PutStrLn va@2 (GetLine (λ(va@3 : String) → PutStrLn ...
... GetLine (λ(va@92 : String) → PutStrLn va@92 (GetLine (λ(
va@93 : String) → PutStrLn va@93 (GetLine (λ(va@94 : String)
→ PutStrLn va@94 (GetLine (λ(va@95 : String) → PutStrLn va@
95 (GetLine (λ(va@96 : String) → PutStrLn va@96 (GetLine (λ(
va@97 : String) → PutStrLn va@97 (GetLine (λ(va@98 : String)
→ PutStrLn va@98 (Return Unit))))))))))))))))))))))))))))))

In contrast, if we encode the effects corecursively we can express a program that echoes indefinitely from stdin to stdout:


-- data IOF r s
-- = PutStrLn String s
-- | GetLine (String -> s)
-- | Return r
-- data IO r = forall s . MkIO s (s -> IOF r s)
-- main = MkIO
-- Nothing
-- (maybe (\str -> PutStrLn str Nothing) (GetLine Just))

( \(String : *)
-> ( \(Maybe : * -> *)
-> \(Just : forall (a : *) -> a -> Maybe a)
-> \(Nothing : forall (a : *) -> Maybe a)
-> \( maybe
: forall (a : *)
-> Maybe a
-> forall (x : *)
-> (a -> x)
-> x
-> x
-> \(IOF : * -> * -> *)
-> \( PutStrLn
: forall (r : *)
-> forall (s : *)
-> String
-> s
-> IOF r s
-> \( GetLine
: forall (r : *)
-> forall (s : *)
-> (String -> s)
-> IOF r s
-> \( Return
: forall (r : *)
-> forall (s : *)
-> r
-> IOF r s
-> ( \(IO : * -> *)
-> \( MkIO
: forall (r : *)
-> forall (s : *)
-> s
-> (s -> IOF r s)
-> IO r
-> ( \(main : forall (r : *) -> IO r)
-> main

-- main
( \(r : *)
-> MkIO
r
(Maybe String)
(Nothing String)
( \(m : Maybe String)
-> maybe
String
m
(IOF r (Maybe String))
(\(str : String) -> PutStrLn r (Maybe String) str (Nothing String))
(GetLine r (Maybe String) (Just String))

-- IO
( \(r : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> IOF r s) -> x)
-> x

-- MkIO
( \(r : *)
-> \(s : *)
-> \(seed : s)
-> \(step : s -> IOF r s)
-> \(x : *)
-> \(k : forall (s : *) -> s -> (s -> IOF r s) -> x)
-> k s seed step

-- Maybe
(\(a : *) -> forall (x : *) -> (a -> x) -> x -> x)

-- Just
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Just va

-- Nothing
( \(a : *)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Nothing

-- maybe
( \(a : *)
-> \(m : forall (x : *) -> (a -> x) -> x -> x)
-> m

-- IOF
( \(r : *)
-> \(s : *)
-> forall (x : *)
-> (String -> s -> x)
-> ((String -> s) -> x)
-> (r -> x)
-> x

-- PutStrLn
( \(r : *)
-> \(s : *)
-> \(str : String)
-> \(vs : s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> PutStrLn str vs

-- GetLine
( \(r : *)
-> \(s : *)
-> \(k : String -> s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> GetLine k

-- Return
( \(r : *)
-> \(s : *)
-> \(vr : r)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> Return vr


This compiles to a state machine that we can unfold one step at a time:

$ morte <
∀(String : *) → ∀(r : *) → ∀(x : *) → (∀(s : *) → s → (s → ∀
(x : *) → (String → s → x) → ((String → s) → x) → (r → x) →
x) → x) → x

λ(String : *) → λ(r : *) → λ(x : *) → λ(k : ∀(s : *) → s → (
s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r →
x) → x) → x) → k (∀(x : *) → (String → x) → x → x) (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing) (λ(m : ∀
(x : *) → (String → x) → x → x) → m (∀(x : *) → (String → (∀
(x : *) → (String → x) → x → x) → x) → ((String → ∀(x : *) →
(String → x) → x → x) → x) → (r → x) → x) (λ(str : String)
→ λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x)
→ x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x
) → x → x) → x) → λ(Return : r → x) → PutStrLn str (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing)) (λ(x :
*) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x)
→ x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x →
x) → x) → λ(Return : r → x) → GetLine (λ(va : String) → λ(x
: *) → λ(Just : String → x) → λ(Nothing : x) → Just va))

I don't expect you to understand that output, other than to know that we can translate it to any backend that provides functions and primitive read/write operations.
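For intuition, the corecursive encoding can be sketched in ordinary Haskell: an action is an existential seed plus a step function, and a driver unfolds it one step at a time. The runner below feeds a list of lines instead of real stdin; all names are illustrative:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- One step of effect: emit a line, request a line, or stop.
data IOF r s = PutStrLn String s | GetLine (String -> s) | Return r

-- A corecursive action: a hidden state plus a step function.
data IOC r = forall s. MkIO s (s -> IOF r s)

-- The echo state machine: Nothing means "waiting for input",
-- Just str means "about to print str".
echo :: IOC r
echo = MkIO Nothing step
  where
    step Nothing    = GetLine Just
    step (Just str) = PutStrLn str Nothing

-- Drive at most n steps, consuming inputs and collecting outputs.
run :: Int -> [String] -> IOC r -> [String]
run n inputs (MkIO seed step) = go n inputs seed
  where
    go 0 _   _ = []
    go k ins s = case step s of
      PutStrLn str s' -> str : go (k - 1) ins s'
      GetLine f       -> case ins of
                           (i:is) -> go (k - 1) is (f i)
                           []     -> []
      Return _        -> []
```

Because the machine is unfolded lazily one step at a time, `echo` can run forever, yet the driver decides how much of it to evaluate.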


If you would like to use Morte, you can find the library on both Github and Hackage. I also provide a Morte tutorial that you can use to learn more about the library.

Morte is dependently typed in theory, but in practice I have not exercised this feature, so I don't understand its implications. If this turns out to be a mistake then I will downgrade Morte to System Fw, which has higher kinds and polymorphism, but no dependent types.

Additionally, Morte might be usable to transmit code in a secure and typed way in a distributed environment, or to share code between diverse functional languages by providing a common intermediate language. However, both of those scenarios require additional work, such as establishing a shared set of foreign primitives and creating Morte encoders/decoders for each target language.

Also, there are additional optimizations which Morte might implement in the future. For example, Morte could use free theorems (equalities you deduce from the types) to simplify some code fragments even further, but Morte currently does not do this.

My next goals are:

  • Add a back-end to compile Morte to LLVM
  • Add a front-end to desugar a medium-level Haskell-like language to Morte

Once those steps are complete then Morte will be a usable intermediate language for writing super-optimizable programs.

Also, if you're wondering, the name Morte is a tribute to a talking skull from the game Planescape: Torment, since the Morte library is a "bare-bones" calculus of constructions.


If this topic interests you more, you may find the following links helpful, in roughly increasing order of difficulty:

by Gabriel Gonzalez at September 12, 2014 11:38 AM

Ken T Takusagawa

[prbwmqwj] Functions to modify a record

Haskell could use some new syntax LAMBDA_RECORD_MODIFY which could be used as follows:

import qualified Control.Monad.State as State;
data Record = Record { field :: Int };
... State.modify $ LAMBDA_RECORD_MODIFY { field = ... };

which is equivalent to

State.modify $ \x -> x { field = ... }

but not having to name the lambda parameter "x" (twice).

I suspect this is one of the things lenses are trying to do.
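For comparison, here is the hand-rolled per-field modifier that the proposed syntax would generate, and that lens libraries automate (`Record` and `modifyField` are illustrative names, not from the post):

```haskell
-- An illustrative record type.
data Record = Record { field :: Int } deriving (Eq, Show)

-- The per-field modifier: apply a function to one field, leaving
-- the rest of the record unchanged. This is what the lambda
-- \x -> x { field = ... } writes out by hand each time.
modifyField :: (Int -> Int) -> Record -> Record
modifyField f r = r { field = f (field r) }
```

With this in scope, `State.modify (modifyField (+ 1))` needs no named lambda parameter; the cost is one boilerplate definition per field, which is essentially what lens's `over` with a generated field lens saves you from writing.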

by Ken at September 12, 2014 03:43 AM

September 11, 2014

The GHC Team

Static pointers and serialisation

This longish post gives Simon's reflections on the implementation of Cloud-Haskell-style static pointers and serialisation. See also StaticPointers.

Much of what is suggested here is implemented, in some form, in two existing projects.

My goal here is to identify the smallest possible extension to GHC, with the smallest possible trusted code base, that would enable these libraries to be written in an entirely type-safe way.


Background: the trusted code base

The implementation of the Typeable class, and its associated functions, in GHC offers a type-safe abstraction, in the classic sense that "well typed programs won't go wrong". For example, in Data.Typeable we have

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b

We expect cast to be type-safe: if cast returns a value Just x then we really do know that x :: b. Let's remind ourselves of class Typeable:

class Typeable a where
  typeRep :: proxy a -> TypeRep

(It's not quite this, but close.) The proxy a argument is just a proxy for the type argument; its value is never inspected and you can always pass bottom.

Under the hood, cast uses typeRep to get the runtime TypeRep for a and b, and compares them, thus:

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
cast x = if typeRep (Proxy :: Proxy a) == typeRep (Proxy :: Proxy b)
           then Just (unsafeCoerce x)
           else Nothing

Although cast is written in Haskell, it uses unsafeCoerce. For it to truly be type-safe, it must trust the Typeable instances. If the user could write a Typeable instance, they could write a bogus one and defeat type safety. So only GHC is allowed to write Typeable instances.
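As a small usage sketch (`asInt` is a made-up name), cast at a fixed result type succeeds exactly when the hidden TypeReps match, so a well-typed client can never conjure a value of the wrong type out of it:

```haskell
import Data.Typeable (Typeable, cast)

-- Succeeds only if the argument's runtime TypeRep is Int's.
asInt :: Typeable a => a -> Maybe Int
asInt = cast
```

So `asInt (3 :: Int)` is `Just 3`, while `asInt True` is `Nothing` — the dynamic check fails safely instead of producing an ill-typed value.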

In short, cast and the Typeable instances are part of the trusted code base, or TCB:

  • The TCB should be as small as possible
  • The TCB should have a small, well-defined, statically-typed API used by client code
  • Client code is un-trusted; if the client code is well-typed, and the TCB is implemented correctly, nothing can go wrong

Background: Typeable a and TypeRep

I'll use the Typeable a type class and values of type TypeRep more or less interchangeably. As you can see from the definition of class Typeable above, its payload is simply a constant function returning a TypeRep. So you can think of a Typeable a as simply a type-tagged version of TypeRep.

Of course, a Typeable a is a type-class thing, which is hard to pass around explicitly like a value, but that is easily fixed using the "Dict Trick", well known in Haskell folklore:

data Dict (c :: Constraint) where
  Dict :: forall c. c => Dict c

Now a value of type Dict (Typeable a) is an ordinary value that embodies a Typeable a dictionary. For example:

f :: Dict (Typeable a) -> Dict (Typeable b) -> a -> Maybe b
f Dict Dict val = cast val

The pattern-matches against the Dict constructor bring the Typeable dictionaries into scope, so they can be used to discharge the constraint arising from the call to cast.
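The Dict trick above compiles as written once the required extensions are enabled; here it is made self-contained (using the GHC 7.8-era ConstraintKinds, GADTs, and KindSignatures extensions):

```haskell
{-# LANGUAGE ConstraintKinds, GADTs, KindSignatures #-}

import Data.Typeable (Typeable, cast)
import GHC.Exts (Constraint)

-- A value that packages up a dictionary for constraint c.
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- Matching on Dict brings each Typeable dictionary into scope,
-- discharging the constraints that cast requires.
f :: Dict (Typeable a) -> Dict (Typeable b) -> a -> Maybe b
f Dict Dict val = cast val
```

For example, `f Dict Dict (1 :: Int) :: Maybe Int` is `Just 1`, while asking for a `Maybe Int` from a `Bool` argument yields `Nothing`.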

Background: serialisation

I'm going to assume a type class Serialisable, something like this:

class Serialisable a where
  encode :: a -> ByteString
  decode :: ByteString -> Maybe (a, ByteString)

I'll use "encode" and "decode" as synonyms for "serialise" and "deserialise", because the former are easier to pronounce.

Here's an interesting question: are instances of Serialisable part of the TCB? No, they are not. Here is a tricky case:

  decode (encode [True,False]) :: Maybe (Int, ByteString)

Here I have encoded a [Bool] into a ByteString, and then decoded an Int from that ByteString. This may be naughty or undesirable, but it cannot seg-fault: it is type-safe in the sense above. You can think of it like this: a decoder is simply a parser for the bits in the ByteString, so a decoder for (say) Int can fail to parse a full Int (returning Nothing), but it can't return a non-Int.
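To make that concrete, here is a toy model of the class with String standing in for ByteString (the instances and their wire format are purely illustrative). Decoding at the "wrong" type fails with Nothing rather than going wrong:

```haskell
-- Toy version of the class from the post, with String as the wire.
class Serialisable a where
  encode :: a -> String
  decode :: String -> Maybe (a, String)

instance Serialisable Bool where
  encode b = if b then "T" else "F"
  decode ('T':rest) = Just (True, rest)
  decode ('F':rest) = Just (False, rest)
  decode _          = Nothing

instance Serialisable Int where
  encode = show
  decode s = case span (`elem` "0123456789") s of
               ("", _)    -> Nothing
               (ds, rest) -> Just (read ds, rest)

-- Lists: 'C' prefixes each cons cell, 'N' terminates.
instance Serialisable a => Serialisable [a] where
  encode xs = concatMap (\x -> 'C' : encode x) xs ++ "N"
  decode ('N':rest) = Just ([], rest)
  decode ('C':rest) = do (x,  r1) <- decode rest
                         (xs, r2) <- decode r1
                         Just (x : xs, r2)
  decode _          = Nothing
```

With these instances, `decode (encode [True, False]) :: Maybe (Int, String)` evaluates to `Nothing`: the Int parser simply fails on the list's wire format, exactly the benign failure described above.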

For the naughtiness, one could imagine that a Cloud Haskell library might send fingerprints or TypeReps or whatnot to eliminate potential naughtiness. But even then it is very valuable if the type-safety of the system does not rely on the CH library. Type safety depends only on the correctness of the (small) TCB; naughtiness-safety might additionally depend on the correctness of the CH library.

Background: static pointers

I'm taking for granted the basic design of the Cloud Haskell paper. That is,

  • A type constructor StaticPtr :: * -> *. Intuitively, a value of type StaticPtr t is represented by a static code pointer to a value of type t. Note "code pointer" not "heap pointer". That's the point!
  • A language construct static <expr>, whose type is StaticPtr t if <expr> has type t.
  • In static <expr>, the free variables of <expr> must all be bound at top level. The implementation almost certainly works by giving <expr> a top-level definition with a new name, static34 = <expr>.
  • A function unStatic :: StaticPtr a -> a, to unwrap a static pointer.
  • Static values are serialisable. Something like instance Serialisable (StaticPtr a). (This will turn out to be not quite right.) Operationally this works by serialising the code pointer, or top-level name (e.g. "Foo.static34").

All of this is built-in. It is OK for the implementation of StaticPtr to be part of the TCB. But our goal is that no other code need be in the TCB.

A red herring. I'm not going to address the question of how to serialise a static pointer. One method would be to serialise a machine address, but that only works if the encoding and decoding ends are running identical binaries. But that's easily fixed: encode a static as the name of the static value e.g. "function foo from module M in package p". Indeed, I'll informally assume an implementation of this latter kind.

In general, I will say that what we ultimately serialise is a StaticName. You can think of a StaticName as package/module/function triple, or something like that. The implementation of StaticName is certainly not part of the client-visible API for StaticPtr; indeed, the type StaticName is not part of the API either. But it gives us useful vocabulary.

Serialising static pointers

We can see immediately that we cannot expect to have instance Serialisable (Static a), which is what the Cloud Haskell paper proposed. If we had such an instance we would have

encodeStatic :: forall a. StaticPtr a -> ByteString
decodeStatic :: forall a. ByteString -> Maybe (StaticPtr a, ByteString)

And it's immediately apparent that decodeStatic cannot be right. I could get a ByteString from anywhere, apply decodeStatic to it, and thereby get a StaticPtr a. Then use unStatic and you have a value of type a, for any type a!

Plainly, what we need (just as in the case of cast) is to do a dynamic typecheck, thus:

decodeStatic :: forall a. Typeable a
                       => ByteString -> Maybe (StaticPtr a, ByteString)

Let's think operationally for a moment:

  • GHC collects all the StaticPtr values in a table, the static pointer table or SPT. Each row contains
    • The StaticName of the value
    • A pointer to closure for the value itself
    • A pointer to its TypeRep
  • decodeStatic now proceeds like this:
    • Parse a StaticName from the ByteString (failure => Nothing)
    • Look it up in table (not found => Nothing)
    • Compare the TypeRep passed to decodeStatic (via the Typeable a dictionary) with the one in the table (not equal => Nothing)
    • Return the value
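The dynamic typecheck can be sketched with Data.Dynamic standing in for the SPT: toDyn stores a value together with its TypeRep, and fromDynamic performs exactly the compare-or-Nothing step described above. The table contents and simplified StaticName representation here are illustrative:

```haskell
import Data.Dynamic (Dynamic, toDyn, fromDynamic)
import Data.Typeable (Typeable)
import qualified Data.Map as Map

-- Simplified: really a package/module/function triple.
type StaticName = String

-- A toy static pointer table. Each Dynamic securely pairs the
-- value with its TypeRep at table-construction time.
spt :: Map.Map StaticName Dynamic
spt = Map.fromList [("Prelude.not", toDyn not)]

-- The lookup half of decodeStatic: find the row, then let
-- fromDynamic compare the stored TypeRep with the requested one.
lookupStatic :: Typeable a => StaticName -> Maybe a
lookupStatic name = Map.lookup name spt >>= fromDynamic
```

Looking up "Prelude.not" at type `Bool -> Bool` succeeds, while looking it up at type `Int` returns Nothing: a fabricated or mismatched StaticName can only fail, never break type safety.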

Side note. Another possibility is for decodeStatic not to take a Typeable a context but instead for unStatic to do so: unStatic :: Typeable a => StaticPtr a -> Maybe a. But that seems a mess. Apart from anything else, it would mean that a value of type StaticPtr a might or might not point to a value of type a, so there's no point in having the type parameter in the first place. End of side note.

This design has some useful consequences that are worth calling out:

  • A StaticPtr is serialised simply to the StaticName; the serialised form does not need to contain a TypeRep. Indeed it would not even be type-safe to serialise a StaticPtr to a pair of a StaticName and a TypeRep, trusting that the TypeRep described the type of the named function. Why not? Think back to "Background: serialisation" above, and imagine we said
    decode (encode ["wibble", "wobble"])
      :: Typeable a => Maybe (StaticPtr a, ByteString)
    Here we create an essentially-garbage ByteString by encoding a [String], and try to decode it. If, by chance, we successfully parse a valid StaticName and TypeRep, there is absolutely no reason to suppose that the TypeRep will describe the type of the function.

    Instead, the TypeRep of the static pointer lives in the SPT, securely put there when the SPT was created. Not only is this type-safe, but it also saves bandwidth by not transmitting TypeReps.
  • Since clients can effectively fabricate a StaticName (by supplying decodeStatic with a bogus ByteString), a StaticName is untrusted. That gives the implementation a good deal of wiggle room for how it chooses to implement static names. Even a simple index in the range 0..N would be type-safe!

    The motivation for choosing a richer representation for StaticName (eg package/module/name) is not type-safety but rather resilience to change. For example, the Haskell programs at the two ends could be quite different, provided only that they agreed about what to call the static pointers that they want to exchange.

Statics and existentials

Here is something very reasonable:

data StaticApp b where
  SA :: StaticPtr (a->b) -> StaticPtr a -> StaticApp b
unStaticApp :: StaticApp a -> a
unStaticApp (SA f a) = unStatic f (unStatic a)

(We might want to add more constructors, but I'm going to focus only on SA.) A SA is just a pair of StaticPtrs, one for a function and one for an argument. We can securely unwrap it with unStaticApp.

Now, here is the question: can we serialise StaticApps? Operationally, of course yes: to serialise a SA, just serialise the two StaticPtrs it contains, and dually for deserialisation. But, as before, deserialisation is the hard bit. We seek:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)

But how can we write decodeSA? Here is the beginning of an attempt:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = case decodeStatic bs :: Maybe (StaticPtr (a->b), ByteString) of
      Nothing -> Nothing
      Just (fun, bs1) -> ...

and you can immediately see that we are stuck. Type variable b is not in scope. More concretely, we need a Typeable (a->b) to pass in to decodeStatic, but we only have a Typeable b to hand.

What can we do? Tantalisingly, we know that if decodeStatic succeeds in parsing a StaticName from bs then, when we look up that StaticName in the Static Pointer Table, we'll find a TypeRep for the value. So rather than passing a Typeable dictionary into decodeStatic, we'd like to get one out!

With that in mind, here is a new type signature for decodeStatic that returns both pieces:

data DynStaticPtr where
  DSP :: Typeable a => StaticPtr a -> DynStaticPtr
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)

(The name DynStaticPtr comes from the fact that this data type is extremely similar to the library definition of Dynamic.)

Operationally, decodeStatic bs works like this:

  • Parse a StaticName from bs (failure => return Nothing)
  • Look it up in the SPT (not found => return Nothing)
  • Return the TypeRep and the value found in the SPT, paired up with DSP. (Indeed the SPT could contain the DynStaticPtr values directly.)

For the construction of DynStaticPtr to be type-safe, we need to know that the TypeRep passed really is a TypeRep for the value; so the construction of the SPT is (unsurprisingly) part of the TCB.

Now we can write decodeSA (the monad is just the Maybe monad, nothing fancy):

decodeSA :: forall b. Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (DSP (arg :: StaticPtr targ), bs2) <- decodeStatic bs1
            -- At this point we have
            --     Typeable b      (from caller)
            --     Typeable tfun   (from first DSP)
            --     Typeable targ   (from second DSP)
       ; fun' :: StaticPtr (targ->b) <- cast fun
       ; return (SA fun' arg, bs2) }

The call to cast needs Typeable tfun, and Typeable (targ->b). The former is bound by the first DSP pattern match. The latter is constructed automatically from Typeable targ and Typeable b, both of which we have. Bingo!

Notice that decodeSA is not part of the TCB. Clients can freely write code like decodeSA and be sure that it is type-safe.

From static pointers to closures

The original Cloud Haskell paper defines closures like this:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> ByteString -> Closure a

It is easy to define

unClo :: Closure a -> a
unClo (Clo s e) = unStatic s e

Side note on HdpH

HdpH refines the Cloud Haskell Closure in (at least) two ways. I think (but I am not certain) that this declaration captures the essence:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> Put () -> a -> Closure a

The refinements are:

  • The extra argument of type 'a' to avoid costs when we build a closure and then unwrap it with unClo locally, or repeatedly.
  • The use of Put () rather than a ByteString for the serialised environment, to avoid repeated copying when doing nested serialisation.

Both are important, but they are orthogonal to the discussion about static types, so I'll use the CH definition from here on.

Serialising closures

Just as in the case of StaticPtr, it is immediately clear that we cannot expect to have

decodeClo :: ByteString -> Maybe (Closure a, ByteString)

Instead we must play the same trick, and attempt to define

data DynClosure where
  DC :: Typeable a => Closure a -> DynClosure
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)

But there's an immediate problem in writing decodeClo:

decodeClo bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (env, bs2)                         <- decodeByteString bs1
       ; return (DC (Clo fun env), bs2) }  -- WRONG

This won't typecheck because DC needs Typeable a, but we only have Typeable (ByteString -> a).

This is Jolly Annoying. I can see three ways to make progress:

  • Plan A: Provide some (type-safe) way to decompose TypeReps, to get from Typeable (a->b) to Typeable b (and presumably Typeable a as well).
  • Plan B: Serialise a TypeRep a with every Closure a.
  • Plan C: Generalise StaticPtr

I like Plan C best. They are each discussed next.

Plan A: Decomposing TypeRep

At the moment, GHC provides statically-typed ways to construct and compare a TypeRep (via cast), but no way to decompose one, at least not in a type-safe way. It is tempting to seek this function as part of the TCB:

class Typeable a where
  typeRep :: proxy a -> TypeRep
  decomposeTypeRep :: DecompTR a
data DecompTR a where
  TRApp :: (Typeable p, Typeable q) => DecompTR (p q)
  TRCon :: TyCon -> DecompTR a

This isn't a bad idea, but it does mean that Typeable a must be implemented (and presumably serialised) using a tree, whereas the current API would allow an implementation consisting only of a fingerprint.

(Thought experiment: maybe a Typeable a, and Dict (Typeable a) can be represented as a tree, but a TypeRep could be just a fingerprint?)

Plan B: serialise TypeRep with Closure

Since we need a Typeable a at the far end, we could just serialise it directly with the Closure, like this:

encodeClo :: forall a. Typeable a => Closure a -> ByteString
encodeClo (Clo fun env)
  =  encodeTypeable (Proxy :: Proxy a)
  ++ encodeStatic fun
  ++ encodeByteString env

Here I am assuming (as part of the TCB)

encodeTypeable :: Typeable a => proxy a -> ByteString
decodeTypeable :: ByteString -> Maybe (DynTypeable, ByteString)
data DynTypeable where
  DT :: Typeable a => proxy a -> DynTypeable

which serialises a TypeRep. (Or, operationally, perhaps just its fingerprint.) Now I think we can write decodeClo:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DT (_ :: Proxy a),           bs1)  <- decodeTypeable bs
       ; (DSP (fun :: StaticPtr tfun), bs2)  <- decodeStatic bs1
       ; (env, bs3)                          <- decodeByteString bs2
       ; fun' :: StaticPtr (ByteString -> a) <- cast fun
       ; return (DC (Clo fun' env), bs3) }

But this too is annoying: we have to send these extra TypeReps when morally they are already sitting there in the SPT.

Plan C: Generalising StaticPtr

Our difficulty is that we are deserialising StaticPtr (ByteString -> a) but we want to be given Typeable a not Typeable (ByteString -> a). So perhaps we can decompose the type into a type constructor and type argument, like this:

data StaticPtr (f :: *->*) (a :: *)
unStatic :: StaticPtr f a -> f a
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)
data DynStaticPtr where
  DS :: (Typeable f, Typeable a) => StaticPtr f a -> DynStaticPtr

Each row of the SPT contains:

  • The StaticName
  • The value of type f a
  • The Typeable f dictionary
  • The Typeable a dictionary

Now we can define closures thus:

data Closure a where
  Clo :: StaticPtr (ByteString ->) a -> ByteString -> Closure a

and these are easy to deserialise:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DS (fun :: StaticPtr f a), bs1) <- decodeStatic bs
       ; (env, bs2)                       <- decodeByteString bs1
           -- Here we have Typeable f, Typeable a
       ; fun' :: StaticPtr (ByteString ->) a <- cast fun
           -- This cast checks that f ~ (ByteString ->)
           -- Needs Typeable f, Typeable (ByteString ->)
       ; return (DC (Clo fun' env), bs2) }
           -- DC needs Typeable a

I like this a lot better, but it has knock on effects.

  • The old StaticPtr a is now StaticPtr Id a.
  • What becomes of our data type StaticApp? Perhaps
    data StaticApp f b where
      SA :: StaticPtr f (a->b) -> StaticPtr f a -> StaticApp f b
    unStaticApp :: Applicative f => StaticApp f b -> f b

ToDo: ...I have not yet followed through all the details

Applying closures

Can we write closureApply? I'm hoping for a structure like this:

closureApply :: Closure (a->b) -> Closure a -> Closure b
closureApply fun arg = Clo (static caStatic) (fun, arg)
caStatic :: ByteString -> b  -- WRONG
caStatic bs = do { ((fun,arg), bs1) <- decode bs
                 ; return (unClo fun (unClo arg), bs1) }

This is obviously wrong. caStatic clearly cannot have that type. It would at least need to be

caStatic :: Typeable b => ByteString -> b

and now there is the thorny question of where the Typeable b dictionary comes from.

ToDo: ...I have stopped here for now

Polymorphism and serialisation

For this section I'll revert to the un-generalised single-parameter StaticPtr.

Parametric polymorphism

Consider these definitions:

rs1 :: Static ([Int] -> [Int])
rs1 = static reverse
rs2 :: Static ([Bool] -> [Bool])
rs2 = static reverse
rs3 :: forall a. Typeable a => Static ([a] -> [a])
rs3 = static reverse

The first two are clearly fine. The SPT will get one row for each of the two monomorphic calls to reverse, one with a TypeRep of [Int] -> [Int] and one with a TypeRep of [Bool] -> [Bool].
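The distinct TypeReps can be seen directly with typeOf, even though the underlying code is shared. A small sketch (the names are mine):

```haskell
import Data.Typeable

-- two monomorphic instantiations of the same polymorphic function
intRep, boolRep :: TypeRep
intRep  = typeOf (reverse :: [Int]  -> [Int])
boolRep = typeOf (reverse :: [Bool] -> [Bool])
-- intRep renders as "[Int] -> [Int]" and differs from boolRep
```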

But both will have the same code pointer, namely the code for the polymorphic reverse function. Could we share just one StaticName for all instantiations of reverse, perhaps including rs3 as well?

I think we can. The story would be this:

  • The SPT has a row for reverse, containing
    • The StaticName for reverse
    • A pointer to the code for reverse (or, more precisely, its static closure).
    • A function of type TypeRep -> TypeRep that, given the TypeRep for a, returns the TypeRep for [a] -> [a].
  • When we serialise a StaticPtr we send
    • The StaticName of the (polymorphic) function
    • A list of the TypeReps of the type arguments of the function
  • The rule for static <expr> becomes this: the free term variables of <expr> must all be top-level, but it may have free type variables, provided they are all Typeable.

All of this is part of the TCB, of course.

Type-class polymorphism

Consider static sort where sort :: Ord a => [a] -> [a]. Can we make such a StaticPtr? After all, sort gets an implicit value argument, namely an Ord a dictionary. If that dictionary can be defined at top level, well and good, so this should be OK:

ss1 :: StaticPtr ([Int] -> [Int])
ss1 = static sort

But things go wrong as soon as you have polymorphism:

ss2 :: forall a. Ord a => StaticPtr ([a] -> [a])
ss2 = static sort  -- WRONG

Now, clearly, the dictionary is a non-top-level free variable of the call to sort.

We might consider letting you write this:

ss3 :: forall a. StaticPtr (Ord a => [a] -> [a])
ss3 = static sort   -- ???

so now the static wraps a function expecting a dictionary. But that edges us uncomfortably close to impredicative types, which are known to contain many dragons.

A simpler alternative is to use the Dict Trick (see Background above):

ss4 :: forall a. StaticPtr (Dict (Ord a) -> [a] -> [a])
ss4 = static sortD
sortD :: forall a. Dict (Ord a) -> [a] -> [a]
sortD Dict xs = sort xs

Now, at the call site, when we unwrap the StaticPtr, we need to supply an explicit Ord dictionary, like this:

...(unStatic ss4 Dict)....
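The Dict Trick itself needs no StaticPtr machinery and can be tried in ordinary GHC today. A minimal self-contained sketch, using the usual reified-constraint GADT from the Background section:

```haskell
{-# LANGUAGE GADTs, ConstraintKinds, KindSignatures #-}
import Data.Kind (Constraint)
import Data.List (sort)

-- the usual reified-dictionary GADT
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- sortD takes its Ord dictionary as an explicit value argument
sortD :: Dict (Ord a) -> [a] -> [a]
sortD Dict xs = sort xs

sorted :: [Int]
sorted = sortD Dict [3, 1, 2]  -- supply the dictionary at the call site
```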

For now, I propose to deal with type classes via the Dict Trick, which is entirely end-user programmable, leaving only parametric polymorphism for built-in support.

by simonpj at September 11, 2014 01:34 PM

Yesod Web Framework

Clarification of previous blog post

I've heard that my previous blog post has caused a bit of confusion, as sarcasm doesn't really come across in text very well. So let me elaborate (and of course, in the process, kill the joke):

Some years back, Erik found a case that was quite difficult to implement using enumerator. After we cracked our heads on it for long enough, some of us (I don't actually remember who was involved) decided to work on a new streaming library. That library ended up being called conduit (thanks to Yitz for the naming idea). It turns out that most people are unaware of that history, so when at ICFP, I casually mentioned that Erik was the cause of conduit coming into existence, some people were surprised. Erik jokingly chastised me for not giving him enough credit. In response, I decided to write an over-the-top post giving Erik all credit for conduit. I say over the top, since I made it seem like there was some large amount of blame being heaped on as well.

So to be completely clear:

  • Erik and I are good friends, and this was just a bit of an inside joke turned public.
  • No one has said anything offensive to me at all about conduit. There are obviously differing opinions out there about the best library for a job, but there's nothing offensive about it, just healthy discussion around a complicated topic. My purpose in making a big deal about it was not to express frustration at anyone attacking me, but rather to just play up the joke a bit more.

My apologies to anyone who was confused, upset, or worried by the previous post; it was completely unintentional.

September 11, 2014 12:00 AM

FP Complete

We're hiring: Haskell web UI developer

FP Complete is looking to expand its Haskell development team. We’re looking for a Haskeller with a strong background in web UI development. This position will encompass both work on our core products, such as FP Haskell Center and School of Haskell, as well as helping customers develop frontends to their Haskell applications.

We will want you to start right away. This will be a contractor position, full time for at least 3 months, with the intention to continue long-term on a more or less full-time basis. Additionally, while the main focus of the position will be UI development, there will be many opportunities to expand into other areas of focus.

This is a telecommute position: you can work from home or wherever you choose, with little or no travel. Location in North America is ideal; you will work with colleagues who are on North American and European hours.

Skills required:

  • Strong Haskell coding skills.
  • Experience with creating HTML/CSS/Javascript web applications (fat clients a plus).
  • Ideally: experience with both Yesod and Fay for server and client side coding, respectively. (Perk: you’ll get a chance to work with the authors of both tools.)
  • Experience deploying applications into production, especially at large scale, is a plus.
  • Ability to interact with a distributed development team, and to manage your time without an in-person supervisor
  • Ability to work with clients on gathering requirements
  • General source control/project skills: Git, issue tracking
  • Ability to communicate clearly in issues, bug reports and emails
  • Proficient on a Linux system
  • Plus: experience with deployment, Docker, and/or CoreOS

Please send a resume or CV. Any existing work that you can include links to, whether a running website or an open source codebase, will be greatly appreciated as well.

September 11, 2014 12:00 AM

September 10, 2014

Mike Izbicki

Polymorphism in Haskell vs C++

Polymorphism in Haskell vs C++

posted on 2014-09-10 by Jonathan Dugan

Parametric polymorphism is when you write one function that works on many data types. In C++, this is pretty confusing, but it’s really easy in Haskell. Let’s take a look at an example.

Let’s say we want a function that calculates the volume of a box. In C++, we’d use templates so that our function works with any numeric type:

template<typename T>
T boxVolume(T length, T width, T height) {
    return length * width * height;
}

Templates have an awkward syntax, but that isn’t too much of a hassle. C++ has much bigger problems. What if in the course of writing your program, you accidentally pass in some strings to this function?

int main() {
    cout << boxVolume("oops","no","strings") << endl;
}

We get this error when we compile with g++:

test.cpp: In instantiation of 'T boxVolume(T, T, T) [with T = const char*]':
test.cpp:22:47:   required from here
test.cpp:8:19: error: invalid operands of types 'const char*' and 'const char*' to binary 'operator*'
     return length * width * height;

This error message is a little hard to understand because of the templates. If we had written our function to use doubles instead of templates:

double boxVolume(double length, double width, double height) {
    return length * width * height;
}

We would get this simpler error message:

test.cpp: In function 'int main()':
test.cpp:22:47: error: cannot convert 'const char*' to 'double' for argument '1' to 'double
boxVolume(double, double, double)'
     cout << boxVolume("oops","no","strings") << endl;

We see that this error is shorter and easier to understand, as it clearly tells us we cannot pass string literals to our function. Plus there is no superfluous comment about our “instantiation” of boxVolume.

Now let’s try to write a polymorphic boxVolume in Haskell:

boxVolume :: a -> a -> a -> a
boxVolume length width height = length * width * height

When we try to compile, we get the error:

    No instance for (Num a) arising from a use of `*'
    Possible fix:
      add (Num a) to the context of
        the type signature for boxVolume :: a -> a -> a -> a
    In the expression: length * width * height
    In an equation for `boxVolume':
        boxVolume length width height = length * width * height

Uh-oh! An error message! What went wrong? It says that we tried to use the * operator without declaring our parameters as an instance of the Num type class.

But what is a type class? This leads us to ad hoc polymorphism, also known as function overloading. Ad hoc polymorphism is when a function can be applied to different argument types, each with a different implementation. For example, the STL classes stack and queue each have their own push and pop functions, which, although they have the same names, do different things:

stack<int> s;
queue<int> q;

s.push(1); q.push(1);
s.push(2); q.push(2);
s.push(3); q.push(3);

s.pop(); q.pop();

After the above code is executed, the stack s will be left with the numbers 1,2 while the queue q will be left with the numbers 2,3. The function pop behaves differently on stacks and queues: calling pop on a stack removes the item added last, while calling pop on a queue removes the item added first.

Haskell does not support function overloading, except through type classes. For example, if we were to specifically declare our own Stack and Queue classes with push and pop functions:

data Stack = Stack [Int] deriving Show
data Queue = Queue [Int] deriving Show

push :: Stack -> Int -> Stack
push (Stack xs) x = Stack (x:xs)

pop :: Stack -> Stack
pop (Stack []) = Stack []
pop (Stack xs) = Stack (tail xs)

push :: Queue -> Int -> Queue
push (Queue xs) x = Queue (x:xs)

pop :: Queue -> Queue
pop (Queue []) = Queue []
pop (Queue xs) = Queue (init xs)

It results in a compiler error:

    Duplicate type signatures for `push'
    at stack.hs:4:1-4

    Multiple declarations of `push'
    Declared at: stack.hs:5:1

    Duplicate type signatures for `pop'
    at stack.hs:7:1-3

    Multiple declarations of `pop'
    Declared at: stack.hs:8:1

Changing the names of our push and pop functions to, say, stackPush, stackPop, queuePush, and queuePop would let the program compile.

A more generic way, however, is to create a type class. Let’s make a Sequence type class that implements our push and pop functions.

class Sequence s where
    push :: s -> Int -> s
    pop :: s -> s

This type class declaration says that any data type that is an instance of this Sequence type class can use the push and pop operations, or, in other words, can add and remove an Int. By making our Stack and Queue instances of the Sequence type class, both data types can have their own implementations of the push and pop functions!

instance Sequence Stack where
    push (Stack xs) x = Stack (x:xs)
    pop (Stack []) = Stack []
    pop (Stack xs) = Stack (tail xs)

instance Sequence Queue where
    push (Queue xs) x = Queue (x:xs)
    pop (Queue []) = Queue []
    pop (Queue xs) = Queue (init xs)

Replacing our function definitions with these instantiations of the Sequence type class lets our program compile.
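Putting the pieces together, the whole program now compiles, and it reproduces the behaviour of the earlier C++ stack/queue example:

```haskell
data Stack = Stack [Int] deriving (Show, Eq)
data Queue = Queue [Int] deriving (Show, Eq)

class Sequence s where
    push :: s -> Int -> s
    pop  :: s -> s

instance Sequence Stack where
    push (Stack xs) x = Stack (x:xs)
    pop (Stack []) = Stack []
    pop (Stack xs) = Stack (tail xs)

instance Sequence Queue where
    push (Queue xs) x = Queue (x:xs)
    pop (Queue []) = Queue []
    pop (Queue xs) = Queue (init xs)

-- push 1, 2, 3 and pop once, as in the C++ example
afterStack :: Stack
afterStack = pop (push (push (push (Stack []) 1) 2) 3)  -- Stack [2,1]

afterQueue :: Queue
afterQueue = pop (push (push (push (Queue []) 1) 2) 3)  -- Queue [3,2]
```

As with the C++ stack and queue, pop removes the last-added element from a Stack and the first-added element from a Queue.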

Type classes are also an important part of using templates in function definitions. In our function boxVolume, we got an error because we tried to use the * operation without declaring the type variable a as an instance of the Num type class. The Num type class is basically for anything that acts like a number, such as Int, Float, and Double, and it lets you use the common operations of +, -, and *.

Let’s change our function to declare that a is a Num:

boxVolume :: (Num a) => a -> a -> a -> a
boxVolume length width height = length * width * height

This is called adding a class constraint. Whenever we want to declare a template function that relies on other functions, we have to add a class constraint that tells both the user and the compiler which types of data can be put into the function.
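With the constraint in place, the same definition works at any numeric type; a self-contained check (repeating the definition so the snippet stands alone):

```haskell
boxVolume :: (Num a) => a -> a -> a -> a
boxVolume length width height = length * width * height

intVol :: Int
intVol = boxVolume 2 3 4        -- works at Int

dblVol :: Double
dblVol = boxVolume 2.5 2.0 1.0  -- and at Double
```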

If we were to call boxVolume on strings, we would get this simple error message:

ghci> boxVolume "a" "b" "c"

    No instance for (Num [Char]) arising from a use of `boxVolume'
    Possible fix: add an instance declaration for (Num [Char])
    In the expression: boxVolume "a" "b" "c"
    In an equation for `it': it = boxVolume "a" "b" "c"

The compiler tells us it can’t evaluate this function because strings aren’t numbers! If we really wanted to, we could make String an instance of the Num type class, and then this function would work! (Of course, why you would want to do that is beyond me.) That’s the power of parametric polymorphism combined with type classes.
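Just to show that such an instance really does make the call type-check, here is a deliberately frivolous sketch; the operator choices (concatenation for * and +) are mine, not anything standard:

```haskell
{-# LANGUAGE FlexibleInstances #-}
-- a toy Num instance for String: "multiplication" is concatenation
instance Num [Char] where
  (*)         = (++)
  (+)         = (++)
  (-)         = const
  abs         = id
  signum      = id
  fromInteger = show

boxVolume :: (Num a) => a -> a -> a -> a
boxVolume l w h = l * w * h

strVol :: String
strVol = boxVolume "a" "b" "c"  -- "abc"
```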

So there you have it. In C++, although we can easily implement ad hoc polymorphism through function overloading, parametric polymorphism is a tricky beast. This is made easier in Haskell, especially with the use of type classes. Type classes guarantee that data passed in to functions will work, and guide the user into what they can pass into a function. Use type classes to your advantage when you next write a Haskell program!

September 10, 2014 12:00 AM

September 09, 2014

Dominic Steinitz

Fun with (Extended Kalman) Filters


An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness.

Population Growth

Suppose we wish to model population growth of bees via the logistic equation

\displaystyle  \begin{aligned}  \dot{p} & = rp\Big(1 - \frac{p}{k}\Big)  \end{aligned}

We assume the growth rate r is unknown and drawn from a normal distribution {\cal{N}}(\mu_r, \sigma_r^2) but the carrying capacity k is known and we wish to estimate the growth rate by observing noisy values y_i of the population at discrete times t_0 = 0, t_1 = \Delta T, t_2 = 2\Delta T, \ldots. Note that p_t is entirely deterministic and its stochasticity is only as a result of the fact that the unknown parameter of the logistic equation is sampled from a normal distribution (we could for example be observing different colonies of bees and we know from the literature that bee populations obey the logistic equation and each colony will have different growth rates).

Haskell Preamble

> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing  #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
> {-# LANGUAGE DataKinds                    #-}
> {-# LANGUAGE ScopedTypeVariables          #-}
> {-# LANGUAGE RankNTypes                   #-}
> {-# LANGUAGE BangPatterns                 #-}
> {-# LANGUAGE TypeOperators                #-}
> {-# LANGUAGE TypeFamilies                 #-}
> module FunWithKalman3 where
> import GHC.TypeLits
> import Numeric.LinearAlgebra.Static
> import Data.Maybe ( fromJust )
> import Numeric.AD
> import Data.Random.Source.PureMT
> import Data.Random
> import Control.Monad.State
> import qualified Control.Monad.Writer as W
> import Control.Monad.Loops

Logistic Equation

The logistic equation is a well known example of a dynamical system which has an analytic solution

\displaystyle  p = \frac{kp_0\exp rt}{k + p_0(\exp rt - 1)}

Here it is in Haskell

> logit :: Floating a => a -> a -> a -> a
> logit p0 k x = k * p0 * (exp x) / (k + p0 * (exp x - 1))

We observe a noisy value of population at regular time intervals (where \Delta T is the time interval)

\displaystyle  \begin{aligned}  p_i &= \frac{kp_0\exp r\Delta T i}{k + p_0(\exp r\Delta T i - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

Using the semi-group property of our dynamical system, we can re-write this as

\displaystyle  \begin{aligned}  p_i &= \frac{kp_{i-1}\exp r\Delta T}{k + p_{i-1}(\exp r\Delta T - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

To convince yourself that this re-formulation is correct, think of the population as starting at p_0; after 1 time step it has reached p_1 and after two time steps it has reached p_2, and this ought to be the same as the point reached after 1 time step starting at p_1, for example

> oneStepFrom0, twoStepsFrom0, oneStepFrom1 :: Double
> oneStepFrom0  = logit 0.1 1.0 (1 * 0.1)
> twoStepsFrom0 = logit 0.1 1.0 (1 * 0.2)
> oneStepFrom1  = logit oneStepFrom0 1.0 (1 * 0.1)
ghci> twoStepsFrom0

ghci> oneStepFrom1

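The agreement can also be checked outside ghci; a standalone sketch reusing the logit definition from above:

```haskell
logit :: Floating a => a -> a -> a -> a
logit p0 k x = k * p0 * exp x / (k + p0 * (exp x - 1))

twoSteps, stepThenStep :: Double
twoSteps     = logit 0.1 1.0 0.2                  -- two steps at once
stepThenStep = logit (logit 0.1 1.0 0.1) 1.0 0.1  -- one step, then another
-- the two agree up to floating-point rounding
```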
We would like to infer the growth rate, not just predict the population, so we need to add another variable to our model.

\displaystyle  \begin{aligned}  r_i &= r_{i-1} \\  p_i &= \frac{kp_{i-1}\exp r_{i-1}\Delta T}{k + p_{i-1}(\exp r_{i-1}\Delta T - 1)} \\  y_i &= \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}r_i \\ p_i\end{bmatrix} + \begin{bmatrix}0 \\ \epsilon_i\end{bmatrix}  \end{aligned}

Extended Kalman

This is almost in the form suitable for estimation using a Kalman filter but the dependency of the state on the previous state is non-linear. We can modify the Kalman filter to create the extended Kalman filter (EKF) by making a linear approximation.

Since the measurement update is trivially linear (even in this more general form), the measurement update step remains unchanged.

\displaystyle  \begin{aligned}  \boldsymbol{v}_i & \triangleq  \boldsymbol{y}_i - \boldsymbol{g}(\hat{\boldsymbol{x}}^\flat_i) \\  \boldsymbol{S}_i & \triangleq  \boldsymbol{G}_i \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top + \boldsymbol{\Sigma}^{(y)}_i \\  \boldsymbol{K}_i & \triangleq \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top\boldsymbol{S}^{-1}_i \\  \hat{\boldsymbol{x}}^i &\triangleq \hat{\boldsymbol{x}}^\flat_i + \boldsymbol{K}_i\boldsymbol{v}_i \\  \hat{\boldsymbol{\Sigma}}_i &\triangleq \hat{\boldsymbol{\Sigma}}^\flat_i - \boldsymbol{K}_i\boldsymbol{S}_i\boldsymbol{K}^\top_i  \end{aligned}

By Taylor we have

\displaystyle  \boldsymbol{a}(\boldsymbol{x}) \approx \boldsymbol{a}(\boldsymbol{m}) + \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m})\delta\boldsymbol{x}

where \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m}) is the Jacobian of \boldsymbol{a} evaluated at \boldsymbol{m} (for the exposition of the extended filter we take \boldsymbol{a} to be vector valued hence the use of a bold font). We take \delta\boldsymbol{x} to be normally distributed with mean of 0 and ignore any difficulties there may be with using Taylor with stochastic variables.

Applying this at \boldsymbol{m} = \hat{\boldsymbol{x}}_{i-1} we have

\displaystyle  \boldsymbol{x}_i = \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1})(\boldsymbol{x}_{i-1} - \hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{\epsilon}_i

Using the same reasoning as we did as for Kalman filters and writing \boldsymbol{A}_{i-1} for \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1}) we obtain

\displaystyle  \begin{aligned}  \hat{\boldsymbol{x}}^\flat_i &=  \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) \\  \hat{\boldsymbol{\Sigma}}^\flat_i &= \boldsymbol{A}_{i-1}  \hat{\boldsymbol{\Sigma}}_{i-1}  \boldsymbol{A}_{i-1}^\top  + \boldsymbol{\Sigma}^{(x)}_{i-1}  \end{aligned}

Haskell Implementation

Note that we pass in the Jacobian of the update function as a function itself in the case of the extended Kalman filter rather than the matrix representing the linear function as we do in the case of the classical Kalman filter.

> k, p0 :: Floating a => a
> k = 1.0
> p0 = 0.1 * k
> r, deltaT :: Floating a => a
> r = 10.0
> deltaT = 0.0005

Relating ad and hmatrix is somewhat unpleasant but this can probably be ameliorated by defining a suitable datatype.

> a :: R 2 -> R 2
> a rpPrev = rNew # pNew
>   where
>     (r, pPrev) = headTail rpPrev
>     rNew :: R 1
>     rNew = konst r
>     (p,  _) = headTail pPrev
>     pNew :: R 1
>     pNew = fromList $ [logit p k (r * deltaT)]
> bigA :: R 2 -> Sq 2
> bigA rp = fromList $ concat $ j [r, p]
>   where
>     (r, ps) = headTail rp
>     (p,  _) = headTail ps
>     j = jacobian (\[r, p] -> [r, logit p k (r * deltaT)])

For some reason, hmatrix with static guarantees does not yet provide an inverse function for matrices.

> inv :: (KnownNat n, (1 <=? n) ~ 'True) => Sq n -> Sq n
> inv m = fromJust $ linSolve m eye

Here is the extended Kalman filter itself. The type signatures on the expressions inside the function are not necessary but did help the implementor discover a bug in the mathematical derivation and will hopefully help the reader.

> outer ::  forall m n . (KnownNat m, KnownNat n,
>                         (1 <=? n) ~ 'True, (1 <=? m) ~ 'True) =>
>           R n -> Sq n ->
>           L m n -> Sq m ->
>           (R n -> R n) -> (R n -> Sq n) -> Sq n ->
>           [R m] ->
>           [(R n, Sq n)]
> outer muPrior sigmaPrior bigH bigSigmaY
>       littleA bigABuilder bigSigmaX ys = result
>   where
>     result = scanl update (muPrior, sigmaPrior) ys
>     update :: (R n, Sq n) -> R m -> (R n, Sq n)
>     update (xHatFlat, bigSigmaHatFlat) y =
>       (xHatFlatNew, bigSigmaHatFlatNew)
>       where
>         v :: R m
>         v = y - (bigH #> xHatFlat)
>         bigS :: Sq m
>         bigS = bigH <> bigSigmaHatFlat <> (tr bigH) + bigSigmaY
>         bigK :: L n m
>         bigK = bigSigmaHatFlat <> (tr bigH) <> (inv bigS)
>         xHat :: R n
>         xHat = xHatFlat + bigK #> v
>         bigSigmaHat :: Sq n
>         bigSigmaHat = bigSigmaHatFlat - bigK <> bigS <> (tr bigK)
>         bigA :: Sq n
>         bigA = bigABuilder xHat
>         xHatFlatNew :: R n
>         xHatFlatNew = littleA xHat
>         bigSigmaHatFlatNew :: Sq n
>         bigSigmaHatFlatNew = bigA <> bigSigmaHat <> (tr bigA) + bigSigmaX

Now let us create some sample data.

> obsVariance :: Double
> obsVariance = 1e-2
> bigSigmaY :: Sq 1
> bigSigmaY = fromList [obsVariance]
> nObs :: Int
> nObs = 300
> singleSample :: Double -> RVarT (W.Writer [Double]) Double
> singleSample p0 = do
>   epsilon <- rvarT (Normal 0.0 obsVariance)
>   let p1 = logit p0 k (r * deltaT)
>   lift $ W.tell [p1 + epsilon]
>   return p1
> streamSample :: RVarT (W.Writer [Double]) Double
> streamSample = iterateM_ singleSample p0
> samples :: [Double]
> samples = take nObs $ snd $
>           W.runWriter (evalStateT (sample streamSample) (pureMT 3))

We created our data with a growth rate of

ghci> r
10.0
but let us pretend that we have read the literature on growth rates of bee colonies and we have some big doubts about growth rates but we are almost certain about the size of the colony at t=0.

> muPrior :: R 2
> muPrior = fromList [5.0, 0.1]
> sigmaPrior :: Sq 2
> sigmaPrior = fromList [ 1e2, 0.0
>                       , 0.0, 1e-10
>                       ]

We only observe the population and not the rate itself.

> bigH :: L 1 2
> bigH = fromList [0.0, 1.0]

Strictly speaking this should be 0 but this is close enough.

> bigSigmaX :: Sq 2
> bigSigmaX = fromList [ 1e-10, 0.0
>                      , 0.0, 1e-10
>                      ]

Now we can run our filter and watch it switch away from our prior belief as it accumulates more and more evidence.

> test :: [(R 2, Sq 2)]
> test = outer muPrior sigmaPrior bigH bigSigmaY
>        a bigA bigSigmaX (map (fromList . return) samples)

by Dominic Steinitz at September 09, 2014 08:28 AM

Yesod Web Framework

Misassigned credit for conduit

When I was at ICFP last week, it became clear that I had made a huge mistake in the past three years. A few of us were talking, including Erik de Castro Lopo, and when I mentioned that he was the original inspiration for creating the conduit package, everyone else was surprised. So firstly: Erik, I apologize for not making it clear that you initially kicked off development by finding some fun corner cases in enumerator that were difficult to debug.

So to rectify that, I think it's only fair that I write the following:

  • conduit is entirely Erik's fault.
  • If you love conduit, write Erik a thank you email.
  • More importantly, if you hate conduit, there's no need to complain to me anymore. Erik presumably will be quite happy to receive all such further communications.
  • In other words, it's not my company, I just work here.

Thanks Erik :)

UPDATE Please also read my follow-up blog post clarifying this one, just in case you're confused.

September 09, 2014 12:00 AM

September 08, 2014

The GHC Team

Jan Stolarek

Promoting functions to type families in Haskell

It’s been very quiet on the blog these past few months, not because I’m spending less time on functional programming but precisely for the opposite reason. Since January I’ve been working together with Richard Eisenberg to extend his singletons library. This work was finished in June, and last Friday I gave a talk about our research at Haskell Symposium 2014. This was the first time I’d been to ICFP and the Haskell Symposium. It was pretty cool to finally meet all these people I know only from IRC. I also admit that the atmosphere of the conference quite surprised me, as it often felt like some sort of fan convention rather than the biggest event in the field of functional programming.

The paper Richard and I published is titled “Promoting Functions to Type Families in Haskell”. This work is based on Richard’s earlier paper “Dependently typed programming with singletons”, presented two years ago at Haskell Symposium. Back then Richard presented the singletons library, which uses Template Haskell to generate singleton types and functions that operate on them. Singleton types are types that have only one value (aside from bottom), which allows one to reason about runtime values during compilation (an introduction to singletons can be found in this post on Richard’s blog). This smart encoding makes it possible to simulate some of the features of dependent types in Haskell. In our current work we extended the promotion capabilities of the library. Promotion is only concerned with generating type-level definitions from term-level ones. The type-level language in GHC has become quite expressive during the last couple of years, but it is still missing many features available in the term-level language. Richard and I have found ways to encode almost all of these missing features using the already existing type-level language features. What this means is that you can write a normal term-level definition and our library will automatically generate an equivalent type family. You’re only forbidden from using infinite terms, the do-notation, and decomposing String literals to Chars. Numeric literals are also very problematic and the support is very limited, but some of the issues can be worked around. What is really cool is that our library allows you to have partial application at the type level, which GHC normally prohibits.
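To give a feel for what promotion means, here is a hand-written sketch of a term-level function and the equivalent closed type family (the library generates such families automatically; this particular example is mine, not the library's actual output):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies #-}

-- term-level definition
and' :: Bool -> Bool -> Bool
and' True  b = b
and' False _ = False

-- its promoted mirror image: a closed type family on the kind Bool
type family And (x :: Bool) (y :: Bool) :: Bool where
  And 'True  b = b
  And 'False b = 'False

-- the type checker reduces And just as the evaluator reduces and'
andCheck :: (And 'True 'False ~ 'False) => ()
andCheck = ()
```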

You can learn more by watching my talk on YouTube, reading the paper or the singletons documentation. Here I’d like to add a few details that are not present in the paper. First of all, the paper was concerned only with promotion and didn’t say anything about singletonization. But as we enabled more and more language constructs to be promoted, we also made them singletonizable. So almost everything that can be promoted can also be singletonized. The most notable exception to this rule is type classes, which are not implemented at the moment.

An interesting issue was raised by Adam Gundry in a question after the talk: what about the difference between lazy term-level semantics and strict type-level semantics? You can listen to my answer in the video but I’ll elaborate some more here. At one point during our work we were wondering about this issue and decided to find an example of an algorithm that crucially relies on laziness to work, i.e. fails to work with strict semantics. I think it’s not straightforward to come up with such an algorithm, but luckily I recalled the backwards state monad from Philip Wadler’s paper “The essence of functional programming”1. The bind operator of that monad looks like this (definition copied from the paper):

m `bindS` k = \s2 -> let (a,s0) = m s1
                         (b,s1) = k a s2
                     in  (b,s0)

The tricky part here is that the value produced by the call to m becomes an input to the call to k, while the state produced by the call to k becomes the input state of m. Implementing this in a strict language does not at all look straightforward. So I promoted that definition expecting it to fail spectacularly, but to my surprise it worked perfectly fine. After some investigation I understood what’s going on. Type-level computations performed by GHC are about constraint solving. It turns out that GHC is able to figure out in which order to solve these constraints and get the result. It’s exactly analogous to what happens with the term-level version at runtime: we have an order of dependencies between the closures and there is a way in which we can run these closures to get the final result.
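A self-contained version of that monad makes the knot-tying visible; the names below (BwdState, getS, putS, and so on) are mine. In the demo, get reads a state that is only written later in the computation, something that terminates only because of laziness:

```haskell
newtype BwdState s a = BwdState { runBwdState :: s -> (a, s) }

retS :: a -> BwdState s a
retS a = BwdState $ \s -> (a, s)

-- state flows backwards: m receives the state that k produces
bindS :: BwdState s a -> (a -> BwdState s b) -> BwdState s b
bindS m k = BwdState $ \s2 ->
    let (a, s0) = runBwdState m s1
        (b, s1) = runBwdState (k a) s2
    in  (b, s0)

getS :: BwdState s s
getS = BwdState $ \s -> (s, s)

putS :: s -> BwdState s ()
putS s' = BwdState $ \_ -> ((), s')

-- getS reads the 7 that putS writes *later* in the computation
demo :: Int
demo = fst $ runBwdState (getS `bindS` \x -> putS 7 `bindS` \_ -> retS x) 0
```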

All of this work is a small part of a larger endeavour to push Haskell’s type system towards dependent types. With singletons you can write type-level functions easily by writing their definitions using the term-level language and then promoting these definitions. And then you can singletonize your functions to work on singleton types. There were two other talks about dependent types during the conference: Stephanie Weirich’s “Depending on Types” keynote lecture during ICFP and Richard’s “Dependent Haskell” talk during the Haskell Implementors Workshop. I encourage everyone interested in Haskell’s type system to watch both of these talks.

  1. The awful truth is that this monad does not really work with the released version of singletons. I only realized that when I was writing this post. See issue #94 on singletons bug tracker.

by Jan Stolarek at September 08, 2014 11:02 AM

September 07, 2014

Neil Mitchell

Shake in the wild

Summary: I spotted a few things using Shake, which I had nothing to do with.

In the past few days I have come across several things using the Shake build system. I wasn't involved in any of them, and haven't (yet) tried any of them out, but they certainly look cool.


Tibor Bremer from Utrecht University gave a talk at the Haskell Implementors Workshop 2014 about his ToolCabal project. This project replaces the "build a package" part of Cabal with something more flexible, supporting multiple simultaneous targets and more flexible preprocessors - all built on top of Shake. It doesn't attempt to tackle dependency resolution yet. There is a video of the talk online.


The folks at Samplecount have written several Shake based things. None are yet on Hackage, so I suspect they are somewhat prototypes, but they look like they're already used quite seriously.

  • shake-cabal-build to make it easier to build your Shake build systems with Cabal. Shake build systems need to be compiled with GHC, for which I usually use ghc --make, but this project explains how to get things building with Cabal - important if your build system pulls in other libraries.
  • shake-language-c is a project to simplify building C/C++ projects with Shake. From the docs:

shake-language-c is a cross-platform build system based on the Shake Haskell library. The focus is on cross-compilation of C, C++ and Objective C source code to various target platforms. Currently supported target platforms are iOS, Android NDK, Google Portable Native Client, MacOS X, Linux and Windows (MinGW). Supported host platforms are MacOS X, Linux and Windows.

  • methcla is their mobile sound engine, which is built using this Shake script, which (unsurprisingly) uses shake-language-c and shake-cabal-build.

by Neil Mitchell at September 07, 2014 09:02 PM

Edward Z. Yang

Haskell Implementor’s Workshop ’14

This year at ICFP, we had some blockbuster attendance to the Haskell Implementor's Workshop (at times, it was standing room only). I had the pleasure of presenting the work I had done over the summer on Backpack.


You can grab the slides or view the presentation itself (thank you ICFP organizers for being incredibly on-the-ball with videos this year!) The talk intersects a little bit with my blog post A taste of Cabalized Backpack, but there are more pictures, and I also emphasize (perhaps a little too much) the long term direction we are headed in.


There were a lot of really nice talks at HiW. Here are some of my personal highlights:

by Edward Z. Yang at September 07, 2014 01:05 PM

Mike Izbicki

Getting started with GitHub, vim, and bash

Getting started with GitHub, vim, and bash

posted on 2014-09-07 by Rashid Goshtasbi and Kyler Rynear

Learning to use git, vim, and bash was hard for us. These tools are so different from the tools we used when we first learned to program. And they’re confusing! But our professor made us use them… and eventually… after we learned the tools… we discovered that we really like them! So we’ve put together a simple video guide to help you learn and enjoy these tools too. We did this as part of the CS100 open source software development class at UC Riverside.

Click here to watch the full playlist on YouTube.

Getting Started with GitHub

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

This video shows you step by step how to create an account on GitHub. Then we see how to create our first repository called test, and transfer it from GitHub onto our local machine using the git clone command.

Creating a file, pushing to GitHub, and pulling from GitHub

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

How do we create files and upload them to GitHub? The touch <filename> command will create an empty file for you. The vim <filename> command will open a file in an advanced text editor that we talk about further down the page. The git push command sends these files from your local machine up to GitHub, and the git pull command downloads files from GitHub and saves them to your local computer.
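The cycle above can be sketched end to end. This is a minimal, self-contained session that uses a local bare repository to stand in for GitHub; all paths, names, and the commit message are made up for illustration:

```shell
# Set up a stand-in "GitHub" remote and clone it
rm -rf /tmp/git-demo-remote.git /tmp/git-demo
git init --bare /tmp/git-demo-remote.git
git clone /tmp/git-demo-remote.git /tmp/git-demo
cd /tmp/git-demo
git config user.email "you@example.com"   # tell git who you are
git config user.name  "Your Name"

# Create a file, record it, and push it to the remote
touch hello.txt                           # create an empty file
git add hello.txt                         # stage it
git commit -m "Add hello.txt"             # record the change locally
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on git version
git push origin "$branch"                 # upload to the remote
git pull origin "$branch"                 # download any new changes
```

With a real GitHub repository the only differences are the clone URL and that pushes and pulls go over the network.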


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Branches let you work on files without messing up your original code. When you finish your changes, you can merge them into the master branch. This is the best part of version control.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Most programs have different versions, for example: 1.0, 1.1, 1.2, 2.1 and 2.2.1. The git tag command lets you create these versions. They’re just like a checkpoint in a Mario game!
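A tagging session might look like this (the repository path and tag name are made up; the empty commit just gives the tag something to point at):

```shell
rm -rf /tmp/tag-demo
git init /tmp/tag-demo
cd /tmp/tag-demo
git config user.email "you@example.com"
git config user.name  "Your Name"
git commit --allow-empty -m "initial commit"
git tag -a v1.0 -m "Version 1.0"   # create an annotated tag: a named checkpoint
git tag                            # list existing tags; prints "v1.0"
```

With a remote configured, git push origin v1.0 would publish the tag.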

Forking & Pull Requests

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Let’s say you want to contribute to an open source project, but you don’t have permission. In order to contribute to someone else’s repository, you must first “fork” it to create a repo that you do have push permission on. Then you issue a pull request through the GitHub website. This tells the owner of the original repo that you’ve made some changes they can incorporate.

The README.md file

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

README.md files are how you document your projects. The README.md should explain your program, give installation instructions, and list known bugs. Basically, it explains your program to someone who has absolutely no idea what it does, giving them enough understanding and basic directions to run it. The .md extension at the end of the filename indicates that the file uses markdown formatting. This is a simple way to create nice looking documentation.

Learning vim

vim is an advanced text editor for Unix operating systems. It’s powerful, but all the commands are intimidating for first time users. Even though it’s hard to get used to at first, these videos will help you learn some of the basic commands and get comfortable with vim.

Getting Started

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

It was difficult at first trying to traverse my code while using vim. I was so used to being able to use my mouse and simply click where I wanted to go. There are many ways to maneuver inside of vim. You can use the h, j, k, l keys or the arrow keys to move one character at a time, or the w, e, and b keys to move by words. You can also press gg to go to the top of the file, G to go to the bottom, and (any number)G to go to the line number typed before the capital G.

Cutting, copying, and pasting took a while to get used to in vim. Sometimes there was something I wanted in my code that was in the instructions for the assignment. Inside vim the p command pastes, but it cannot paste text copied outside of vim. If I had something copied outside of vim, I would paste it in by right-clicking and choosing paste, which inserts the text wherever the cursor currently is. Copying with right-click does not affect what is stored by the y (copy) command or the d and x (cut) commands; text stored with those commands is pasted with p. There are other ways to store more than one thing while copying or cutting, but these two were the most helpful as I learned how to use vim.

Other personal favorite features of vim are the shift-a command (takes you to the end of the line and into insert mode) and the shift-i command (takes you to the beginning of the line and into insert mode). You can also press a to append after the cursor position, or i to insert before the current cursor position.

vim also allows you to use the v or shift-v keys to highlight certain text or lines of code. You can then apply other vim commands, such as copy, paste, and delete, to the highlighted text.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

At first it felt very time consuming to indent multiple lines. I felt this way until I found out about the V command. V highlights a whole line, and pressing up or down extends the highlight over as many lines as you desire. All that was left to do was to type > after everything I wanted to indent was highlighted, and it would all indent once to the right. Typing < would instead indent it to the left.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src="" width="697.6"></iframe>

There are two commands for deleting a single character: x deletes the character under the cursor, and X deletes the character to the left of the cursor.

The d command is a more powerful way to delete. d can be used with many different things after it. dd will delete the entire line. d$ will delete the rest of the current line. de will delete from where the cursor is up until the end of the word.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src="" width="697.6"></iframe>

Lower case r replaces a single character, while upper case R enters replace mode, overwriting one character after another as you type.

There are three c commands that I regularly use for replacement: ce , which deletes up until the end of the word that the cursor is currently on, then allows you to insert immediately; c$ , which deletes from where the cursor is up until the end of the line, then allows you to insert immediately; and cc , which deletes the whole line that the cursor is on and allows you to insert immediately at the beginning of the line.

Customizing your vim editor with the .vimrc file

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Ever wondered how we get our vim editor to work the way we have it, versus the default editor? vim has a file where you can set up its defaults, such as auto-parentheses, auto-indent, and much more. By watching our video above, you can easily create new defaults for your vim editor, cutting the time spent formatting your text so you can spend more on coding.
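As a sketch, here is one way to seed a .vimrc from the shell. These particular settings are common choices, not necessarily the ones shown in the video:

```shell
# Append some common defaults to ~/.vimrc (creates the file if missing)
cat >> ~/.vimrc <<'EOF'
syntax on         " enable syntax highlighting
set number        " show line numbers
set autoindent    " copy indentation from the previous line
set tabstop=4     " display tab characters as 4 columns wide
set shiftwidth=4  " indent by 4 columns with > and <
EOF
```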

Learning The Terminal

One of the best features of Unix operating systems is the powerful terminal they provide.

The ls command

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

The ls command is one of the most used terminal commands.

The basic ls command, when run, displays the contents of the current working directory. Passing in a directory name as an argument will display the contents of that directory. It is also possible to pass in a path, to display the contents of any directory regardless of where the user currently is.

If the -a flag is passed in with ls, all items in the current working directory whose names begin with a . (hidden items) are also displayed, along with the rest of the items.

Passing in the -l flag prints a line of information for each item in the directory. The first column displays the read, write, and execute permissions for the owner, the owner's group, and all other users, in that order. The next column shows the owner of the item, and the column after that shows its group. The fourth column displays the size of the item in bytes. The fifth column displays the time the item was last modified, and the last column displays the name of the item.

If the -R flag is passed in, the command displays the contents of the current directory, then recursively enters every subdirectory and displays its contents, continuing until no directories remain unvisited.

All these options can be combined for different uses. For example, I could use the -l and -a flags together to show the long-format information for hidden items as well, or use -R and -l together.
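A few sample invocations (the /tmp path is only an example):

```shell
ls          # contents of the current working directory
ls /tmp     # contents of another directory, given by path
ls -a       # also show hidden items (names starting with .)
ls -l       # long listing: permissions, owner, group, size, date
ls -la      # flags combined: long listing including hidden items
ls -R       # recurse into every subdirectory
```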

The cd and mv commands

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

The cd and mv commands are crucial for actually using the terminal. Without cd, we would be stuck forever in the home directory. The mv command is necessary for moving files from one part of the file system to another. The cd command by itself changes the current working directory to the home directory. If passed the name of a directory within the current working directory, it changes into that directory. cd will also take a path as an argument; when a path is passed in, the current working directory is changed to the directory the path specifies. When cd is passed .., it moves up one level, into the parent of the current directory.

The mv command moves an item to the directory passed in as the destination.

If the destination argument is not a path, the command looks for the destination in the current working directory. The destination argument can also be a path, so I can move the item to any directory in the file system.
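A small self-contained session tying cd and mv together (all paths are made up):

```shell
rm -rf /tmp/cd-demo
mkdir -p /tmp/cd-demo/sub   # a directory with one subdirectory
cd /tmp/cd-demo
touch notes.txt
mv notes.txt sub/           # move the file into the subdirectory
cd sub                      # descend into it
ls                          # prints: notes.txt
cd ..                       # .. moves back up to the parent directory
cd                          # no argument: back to the home directory
```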

Recording terminal sessions via scripts

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

With the script command you can record the commands you run in your terminal into a file. By just typing script file_name_here, you start a recording. You don’t need to create the file beforehand: when you specify the filename, script will make one for you with that name. When you’re done, type exit, and your terminal will say the script session has ended and restate the filename in which it recorded your commands.
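On Linux, script also accepts a -c flag that records a single command instead of an interactive session, which makes the behavior easy to demonstrate (the filename is made up):

```shell
# Record one command's output into /tmp/session.txt
script -c "echo hello from the recording" /tmp/session.txt
# The transcript contains everything that appeared on the terminal
grep "hello from the recording" /tmp/session.txt
```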

How To SSH (into the well server)

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Computer Science students can log into the school’s server using the ssh command. To access it, type the following text into your terminal:


If it is your first time connecting, you will be asked to trust the host key that the server uses, then prompted to enter the password associated with your NetID. Once you have done these steps, you will be placed in your home directory on the server. To exit the server, type exit into the command prompt and press enter.

A useful command that copies files to and from the remote server is the scp command. To copy items from your home computer to the school’s server, type into the command prompt:

scp filename username@server:absolute_path

To copy items from the remote server onto your home computer, type into the command prompt:

scp username@server:absolute_path local_path

Spectacle App: Using the terminal and vim in one screen

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

One of the first things I noticed about vim that I initially disliked was that it took over the terminal when I used it. To get around this, I started using two terminals instead of just one while programming: I would run vim in the first terminal and run the executable in the second. It was as simple as using :w to save in vim instead of :wq to save and quit; I could now test my code without ever having to close vim. To put the two windows side by side, users on Windows 7 and above can simply drag a window to the left or right border of the screen. Unfortunately, OS X doesn’t have this built in, but OS X users can install the Spectacle app, which lets you organize multiple windows on your screen with the touch of a button.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

When programming for Unix-based operating systems (a primary component of CS100), system calls are a prominent part of the code. When a system call fails, it sets the global errno variable; the perror function reads errno and prints to stderr an error message describing the failure. It takes one C-string argument, a message of the user's choosing, which is printed before the error description.

September 07, 2014 12:00 AM

September 06, 2014

Joachim Breitner

ICFP 2014

Another on-the-journey-back blog post; this time from the Frankfurt Airport train station – my flight was delayed (had I known that, I could have watched the remaining Lightning Talks), and so was my train, but despite 5 minutes of running through the airport, it was just not enough. And now that the free 30 minutes of railway station internet are used up, I have nothing else to do but blog...

Last week I was attending ICFP 2014 in Gothenburg, followed by the Haskell Symposium and the Haskell Implementors Workshop. The justification for attending was the paper on Safe Coercions (joint work with Richard Eisenberg, Simon Peyton Jones and Stephanie Weirich), although Richard got to give the talk, and did so quite well. So I got to leisurely attend the talks, while fighting the jet-lag that I brought from Portland.

There were – as expected – quite a few interesting talks. Among them the first keynote, Kathleen Fisher on the need for formal methods in cars and toy-quadcopters and unmanned battle helicopters, which made me conclude that my Isabelle skills might eventually become relevant in practical applications. And did you know that if someone gains access to your car’s electronics, they can make the seat belt pull you back hard?

Stephanie Weirich’s keynote (and the subsequent related talks by Jan Stolarek and Richard Eisenberg) on what a dependently typed Haskell would look like and what we could use it for was mouth-watering. I am a bit worried that Haskell will become a bit obscure for newcomers and for people who simply don’t want to think about types too much; on the other hand, it seems that Haskell as we know it will always remain available, just as a subset of the language.

Similarly interesting were refinement types for Haskell (talks by Niki Vazou and by Eric Seidel), in the form of Liquid Types, something I had not paid attention to yet. It seems to be a good way to get higher assurance in Haskell code.

Finally, the Haskell Implementors Workshop had a truckload of exciting developments in and around Haskell: More on GHCJS, Partial type signatures, interactive type-driven development like we know it from Agda, the new Haskell module system and amazing user-defined error messages – the latter unfortunately only in Helium, at least for now.

But it’s not the case that I only sat and listened. During the Haskell Implementors Workshop I gave a talk, “Contributing to GHC”, with a live demo of me fixing a (tiny) bug in GHC, with the aim of getting more people to hack on GHC (slides, video). The main message is that it is not that big of a deal. And despite my not actually saying much of interest in the talk, I got good feedback afterwards. So if it now actually motivates someone to contribute to GHC, I’m even happier.

And then there is of course the Hallway Track. I discussed the issues with fusing a left fold (unfortunately, without finding a great solution). In order to tackle this problem more systematically, John Wiegley and I created the beginnings of a “List Fusion Lab”, i.e. a bunch of list benchmarks and the ability to compare various implementations (e.g. with different RULES) and various compilers. With that we can hopefully better assess the effect of a change to the list functions.

PS: The next train is now also delayed, so I’ll likely miss my tram and arrive home even later...

PPS: I really have to update my 10 year old picture on my homepage (or redesign it completely). Quite a few people knew my name, but expected someone with shoulder-long hair...

PPPS: Haskell is really becoming mainstream: I just talked to a randomly chosen person (the boy sitting next to me in the train), and he is a Haskell enthusiast, building a structured editor for Haskell together with his brother. And all that as a 12th-grader...

by Joachim Breitner at September 06, 2014 10:46 PM

Tim Docker

A New Charting API

One of the challenges with building a library like Chart is the tension between ease of use and flexibility. Users want to produce charts with a minimum of code up front, but later want to refine the details. The chart library addresses this through the use of "defaulted records" using Data.Default.Class. Because such records are often nested, we rely on the somewhat intimidating lens library to modify the default values. We end up with code to create chart elements like this:

sinusoid2 = plot_points_title .~ "fn(x)"
          $ plot_points_values .~ mydata
          $ plot_points_style . point_color .~ opaque red
          $ def

This is much simpler and cleaner than the corresponding code using native record accessors, but it still has a certain amount of syntactic overhead.

I’ve added a simple state monad to the library to further clean up the syntax. The state of the monad is the value being constructed, allowing the use of the monadic lens operators. The above code sample becomes:

sinusoid2 = execEC $ do
    plot_points_title .= "fn(x)" 
    plot_points_values .= mydata
    plot_points_style . point_color .= opaque red

This may seem only a minor syntactic improvement, but it adds up over a typical chart definition.

A few other changes further reduce the clutter in charting code:

  • A new Easy module that includes helper functions and key dependencies
  • Simpler "toFile" functions in the rendering backends
  • Automatic sequencing of colours for successive plots

All this means that a simple plot can now be a one liner:

import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo

mydata :: [(Double, Double)]
mydata = ...

main = toFile def "test.png" $ plot $ points "lines" mydata

But this extends naturally to more complex charts. The code differences between the new stateful API and the existing API can be seen in this example.

The stateful API is available in Chart v1.3. It is a thin layer over the existing API – both will continue to be available in the future.

by Tim Docker at September 06, 2014 05:34 AM

September 04, 2014

Edward Z. Yang

Open type families are not modular

One of the major open problems for building a module system in Haskell is the treatment of type classes, which I have discussed previously on this blog. I've noted how the current mode of use of type classes in Haskell assumes “global uniqueness”, which is inherently anti-modular; breaking this assumption risks violating the encapsulation of many existing data types.

As if we have a choice.

In fact, our hand is forced by the presence of open type families in Haskell, which share many properties with type classes, but with the added property that global uniqueness is required for type safety. We don't have a choice (unless we want type classes with associated types to behave differently from type classes): we have to figure out how to reconcile the inherent non-modularity of type families with the Backpack module system.

In this blog post, I want to carefully lay out why open type families are inherently unmodular and propose some solutions for managing this unmodularity. If you know what the problem is, you can skip the first two sections and go straight to the proposed solutions section.

Before we talk about open type family instances, it's first worth emphasizing the (intuitive) fact that a signature of a module is supposed to be able to hide information about its implementation. Here's a simple example:

module A where
    x :: Int

module B where
    import A
    y = 0
    z = x + y

Here, A is a signature, while B is a module which imports the signature. One of the points of a module system is that we should be able to type check B with respect to A, without knowing anything about what module we actually use as the implementation. Furthermore, if this type checking succeeds, then for any implementation which provides the interface of A, the combined program should also type check. This should hold even if the implementation of A defines other identifiers not mentioned in the signature:

module A where
    x = 1
    y = 2

If B had directly imported this implementation, the identifier y would be ambiguous; but the signature filtered out the declarations so that B only sees the identifiers in the signature.

With this in mind, let's now consider the analogous situation with open type families. Assuming that we have some type family F defined in the prelude, we have the same example:

module A where
    type instance F Int
    f :: F Bool

module B where
    import A
    type instance F Bool = Int -> Bool
    x = f 2

Now, should the following module A be a permissible implementation of the signature?

module A where
    type instance F Int = Int
    type instance F Bool = Int
    f = 42

If we view this example with the glasses off, we might conclude that it is a permissible implementation. After all, the implementation of A provides an extra type instance, yes, but when this happened previously with a (value-level) declaration, it was hidden by the signature.

But if we put our glasses on and look at the example as a whole, something bad has happened: we're attempting to use the integer 42 as a function from integers to booleans. The trouble is that F Bool has been given different types in the module A and module B, and this is unsound... like, segfault unsound. And if we think about it some more, this should not be surprising: we already knew it was unsound to have overlapping type families (and eagerly check for this), and signature-style hiding is an easy way to allow overlap to sneak in.

The distressing conclusion: open type families are not modular.

So, what does this mean? Should we throw our hands up and give up giving Haskell a new module system? Obviously, we’re not going to go without a fight. Here are some ways to counter the problem.

The basic proposal: require all instances in the signature

The simplest and most straightforward way to solve the unsoundness is to require that a signature mention all of the family instances that are transitively exported by the module. So, in our previous example, the implementation of A does not satisfy the signature because it has an instance which is not mentioned in the signature, but would satisfy this signature:

module A where
    type instance F Int
    type instance F Bool

While at first glance this might not seem too onerous, it's important to note that this requirement is transitive. If A happens to import another module Internal, which itself has its own type family instances, those must be represented in the signature as well. (It's easy to imagine this spinning out of control for type classes, where any of the forty imports at the top of your file may be bringing in any manner of type classes into scope.) There are two major user-visible consequences:

  1. Module imports are not an implementation detail—you need to replicate this structure in the signature file, and
  2. Adding instances is always a backwards-incompatible change (there is no weakening).

Of course, as Richard pointed out to me, this is already the case for Haskell programs (and you just hoped that adding that one extra instance was "OK").

Despite its unfriendliness, this proposal serves as the basis for the rest of the proposals, which you can conceptualize as trying to characterize, “When can I avoid having to write all of the instances in my signature?”

Extension 1: The orphan restriction

Suppose that I write the following two modules:

module A where
    data T = T
    type instance F T = Bool

module B where
    import A
    type instance F T = Int -> Int

While it is true that these two type instances are overlapping and rightly rejected, they are not equally at fault: in particular, the instance in module B is an orphan. An orphan instance is an instance for type class/family F and data type T (it just needs to occur anywhere on the left-hand side) which lives in a module that defines neither. (A is not an orphan since the instance lives in the same module as the definition of data type T).

What we might wonder is, “If we disallowed all orphan instances, could this rule out the possibility of overlap?” The answer is, “Yes! (...with some technicalities).” Here are the rules:

  1. The signature must mention all of what we will call ragamuffin instances transitively exported by the implementations being considered. An instance of a family F is a ragamuffin if it is not defined with the family definition, or with the type constructor at the head in the first parameter. (Or some specific parameter, decided on a per-family basis.) All orphan instances are ragamuffins, but not all ragamuffins are orphans.
  2. A signature exporting a type family must mention all instances which are defined in the same module as the definition of the type family.
  3. It is strictly optional to mention non-ragamuffin instances in a signature.

(Aside: I don't think this is the most flexible version of the rule that is safe, but I do believe it is the most straightforward.) The whole point of these rules is to make it impossible to write an overlapping instance, while only requiring local checking when an instance is being written. Why did we need to strengthen the orphan condition into a ragamuffin condition to get this non-overlap? The answer is that absence of orphans does not imply absence of overlap, as this simple example shows:

module A where
    data A = A
    type instance F A y = Int

module B where
    data B = B
    type instance F x B = Bool -> Bool

Here, the two instances of F are overlapping, but neither are orphans (since their left-hand sides mention a data type which was defined in the module.) However, the B instance is a ragamuffin instance, because B is not mentioned in the first argument of F. (Of course, it doesn't really matter if you check the first argument or the second argument, as long as you're consistent.)

Another way to think about this rule is that open type family instances are not standalone instances but rather metadata that is associated with a type constructor when it is constructed. In this way, non-ragamuffin type family instances are modular!

A major downside of this technique, however, is that it doesn't really do anything for the legitimate uses of orphan instances in the Haskell ecosystem: when third-parties defined both the type family (or type class) and the data type, and you need the instance for your own purposes.

Extension 2: Orphan resolution

This proposal is based off of one that Edward Kmett has been floating around, but which I've refined. The motivation is to give a better story for offering the functionality of orphan instances without gunking up the module system. The gist of the proposal is to allow the package manager to selectively enable/disable orphan definitions; however, to properly explain it, what I'd like to do first is describe a few situations involving orphan type class instances. (The examples use type classes rather than type families because the use-cases are clearer. If you imagine that the type classes in question have associated types, then the situation is the same as that for open type families.)

The story begins with a third-party library which defined a data type T but did not provide an instance that you needed:

module Data.Foo where
    data Foo = Foo

module MyApp where
    import Data.Foo
    fooString = show Foo -- XXX no instance for Show

If you really need the instance, you might be tempted to just go ahead and define it:

module MyApp where
    import Data.Foo
    instance Show Foo where -- orphan
        show Foo = "Foo"
    fooString = show Foo

Later, you upgrade Data.Foo to version 1.0.0, which does define a Show instance, and now you have an overlapping instance error! Uh oh.

How do we get ourselves out of the mess? A clue is how many package authors currently “get out of jail” by using preprocessor macros:

module MyApp where
    import Data.Foo
#if !MIN_VERSION_foo(1,0,0)
    instance Show Foo where -- orphan
        show Foo = "Foo"
#endif
    fooString = show Foo

Morally, we'd like to hide the orphan instance when the real instance is available: there are two variations of MyApp which we want to transparently switch between: one which defines the orphan instance, and one which does not and uses the non-orphan instance defined in Data.Foo. The choice depends on which version of foo was chosen, a decision made by the package manager.

Let's mix things up a little. There is no reason the instance has to be a non-orphan coming from Data.Foo. Another library might have defined its own orphan instance:

module MyOtherApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    otherFooString = show Foo

module MyApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    fooString = show Foo

module Main where
    import MyOtherApp
    import MyApp
    main = print (fooString ++ otherFooString ++ show Foo)

It's a bit awful to get this to work with preprocessor macros, but there are two ways we can manually resolve the overlap: we can erase the orphan instance from MyOtherApp, or we can erase the orphan instance from MyApp. A priori, there is no reason to prefer one over the other. However, depending on which one is erased, Main may have to be compiled differently (if the code in the instances is different). Furthermore, we need to set up a new (instance-only) import from the module that defines the instance to the module whose instance was erased.

There are a few takeaways from these examples. First, the most natural way of resolving overlapping orphan instances is to simply “delete” the overlapping instances; however, which instance to delete is a global decision. Second, which overlapping orphan instances are enabled affects compilation: you may need to add module dependencies to be able to compile your modules. Thus, we might imagine that a solution allows us to do both of these, without modifying source code.

Here is the game plan: as before, packages can define orphan instances. However, the list of orphan instances a package defines is part of the metadata of the package, and the instance itself may or may not be used when we actually compile the package (or its dependencies). When we do dependency resolution on a set of packages, we have to consider the set of orphan instances being provided and only enable a set which is non-overlapping, the so-called orphan resolution. Furthermore, we need to add an extra dependency from packages whose instances were disabled to the package that is the sole definer of an instance (this might constrain which orphan instance we can actually pick as the canonical instance).
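To make the game plan a little more concrete, here is a purely hypothetical sketch of what such package metadata might look like, in a Cabal-style syntax. No orphan-instances field actually exists; the field name and layout are invented for illustration only:

    name:          my-app
    build-depends: foo >= 0.9
    -- Hypothetical field: orphan instances this package defines.
    -- During orphan resolution, the package manager may disable any
    -- of these, adding an instance-only dependency on whichever
    -- package ends up as the canonical definer.
    orphan-instances:
        instance Show Data.Foo.Foo

The point of recording this in metadata is that the orphan instance becomes visible to the dependency solver, which can then pick a globally consistent, non-overlapping set of instances without touching source code.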

The nice thing about this proposal is that it solves an already existing pain point for type class users, namely defining an orphan type class instance without breaking when upstream adds a proper instance. But you might also think of it as a big hack, and it requires cooperation from the package manager (or some other tool which manages the orphan resolution).

The extensions to the basic proposal are not mutually exclusive, but it's an open question whether the complexity they incur is worth the benefits they bring to existing uses of orphan instances. And of course, there may be other ways of solving the problem which I have not described here, but this smorgasbord seems the most plausible at the moment.

At ICFP, I had an interesting conversation with Derek Dreyer, where he mentioned that when open type families were originally going into GHC, he had warned Simon that they were not going to be modular. With the recent addition of closed type families, many of the major use-cases for open type families stated in the original paper have been superseded. However, even if open type families had never been added to Haskell, we still might have needed to adopt these solutions: the global uniqueness of instances is deeply ingrained in the Haskell community, and even if in some cases we are lax about enforcing this constraint, it doesn't mean we should actively encourage people to break it.

I have a parting remark for the ML community, as type classes make their way in from Haskell: when you do get type classes in your language, don’t make the same mistake as the Haskell community and start using them to enforce invariants in APIs. This way leads to the global uniqueness of instances, and the loss of modularity may be too steep a price to pay.

Postscript. One natural thing to wonder is whether overlapping type family instances are OK if one of the instances “is not externally visible.” Of course, the devil is in the details: what do we mean by external visibility of the instances of a type family F?

For some definitions of visibility, we can find an equivalent, local transformation which has the same effect. For example, if we never use the instance at all, it is certainly OK to have overlap; in that case, it would also have been fine to delete the instance altogether. As another example, we could require that there are no (transitive) mentions of the type family F in the signature of the module. However, eliminating the mention of the type family requires knowing enough parameters and equations to reduce it: in which case the type family could have been replaced with a local, closed type family.
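A minimal sketch of that last replacement, with an invented family F and invented uses (not from the original post): when all the relevant equations are known locally, a closed type family gives the same reductions with no possibility of overlap elsewhere.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A local closed type family: all equations are given in one place,
-- so no question of overlapping instances defined elsewhere arises.
type family F a where
  F Int  = Bool
  F Bool = Int

-- Uses of F reduce locally, without consulting any global instance set.
g :: F Int   -- reduces to Bool
g = True
```

Since the family is closed, the compiler can check the equations for overlap once, at the definition site, rather than relying on global uniqueness.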

One definition that definitely does not work is if F can be mentioned with some unspecified type variables. Here is a function which coerces an Int into a function:

module A where
  type instance F Int = Int
  f :: Typeable a => a -> F a
  f x = case eqT :: Maybe (a :~: Int) of
    Just Refl -> x
    Nothing -> undefined

module ASig where
  f :: Typeable a => a -> F a

module B where
  import ASig
  type instance F Int = Bool -> Bool
  g :: Bool
  g = f 0 True -- oops

...the point being that, even if a signature doesn't directly mention the overlapping instance F Int, type refinement (usually by some GADT-like structure) can mean that an offending instance can be used internally.

by Edward Z. Yang at September 04, 2014 10:12 PM

Roman Cheplyaka

Dependent Haskell

Emulating dependent types (and, more generally, advanced type-level programming) has been a hot topic in the Haskell community recently. Some incredible work has been done in this direction: GADTs, open and closed type families, singletons, etc. The plan is to copy even more features to the type level, like type classes and GADTs, and simplify the promotion of value-level functions.

On the other hand, there’s a good deal of scepticism around this idea. «If you want to program like in Agda, why don’t you program in Agda?»

First, libraries. It took us many years to arrive at a state of Hackage that is suitable for industrial usage, and we still don't have satisfactory answers to many problems. My guess is that it will take the dependently typed community at least as long to arrive at this point: not only because of the smaller community, but also because they will look for even more type-safe solutions, which is naturally a harder problem.

Making your code extremely type-safe is quite hard (or else it would not be worth an ICFP talk). In a real-world application, you’d probably have just a few places where it’s worth the effort. But if you write the whole application in a dependently-typed language, you pay the price for your whole codebase. The price includes, for instance, the absence of type inference.

The compilation problem is not solved either. In particular, there are concerns about type erasure in both Agda and Idris. This is not unsolvable; it just hasn't been done yet.

So maybe you could write some parts in Agda/Idris and the rest in Haskell? Neither Agda nor Idris has a good story for that. For instance, Agda can generate Haskell code, but interfacing with Haskell won’t be very smooth. And the differences in languages mean that it probably won’t ever be effortless.

Don’t get me wrong, I am very enthusiastic about these (and future) languages. However, it doesn’t seem like they will be ready for production usage anytime soon. At the same time, you can satisfy the majority of your dependently typed needs in Haskell right now, no hidden charges. Isn’t that cool?


I am no expert in Agda/Idris. The above is my impression from talking to different people at ICFP and participating in tutorials on these two languages given by their corresponding authors. I’ll gladly accept rebuttals.


There’s a discussion happening on /r/haskell, where people dispute many of my points. I encourage you to read that discussion and make your own conclusions.

In particular, it has been pointed out that erasure is not as big of a problem as I thought initially. Still, it’ll take quite some time (and non-trivial systems built with dependent types) for us to be able to trust Idris/Agda compilers as much as we now trust GHC.

September 04, 2014 09:00 PM

Ian Ross

Non-diffusive atmospheric flow #5: pre-processing


Note: there are a couple of earlier articles that I didn’t tag as “haskell” so they didn’t appear in Planet Haskell. They don’t contain any Haskell code, but they cover some background material that’s useful to know (#3 talks about reanalysis data and what Z500 is, and #4 displays some of the characteristics of the data we’re going to be using). If you find terms here that are unfamiliar, they might be explained in one of these earlier articles.

The code for this post is available in a Gist.

Before we can get into the “main analysis”, we need to do some pre-processing of the Z500 data. In particular, we are interested in large-scale spatial structures, so we want to subsample the data spatially. We are also going to look only at the Northern Hemisphere winter, so we need to extract temporal subsets for each winter season. (The reason for this is that winter is the season where we see the most interesting changes between persistent flow regimes. And we look at the Northern Hemisphere because it’s where more people live, so it’s more familiar to more people.) Finally, we want to look at variability about the seasonal cycle, so we are going to calculate “anomalies” around the seasonal cycle.

We’ll do the spatial and temporal subsetting as one pre-processing step and then do the anomaly calculation separately, just for simplicity.

Spatial and temporal subsetting

The title of the paper we’re trying to follow is “Observed Nondiffusive Dynamics in Large-Scale Atmospheric Flow”, so we need to decide what we mean by “large-scale” and to subset our data accordingly. The Z500 data from the NCEP reanalysis dataset is at 2.5° × 2.5° resolution, which turns out to be a little finer than we need, so we’re going to extract data on a 5° × 5° grid instead. We’ll also extract only the Northern Hemisphere data, since that’s what we’re going to work with.

For the temporal subsetting, we need to take 181 days of data for each year starting on 1 November each year. Since the data starts at the beginning of 1948 and goes on to August 2014 (which is when I’m writing this), we’ll have 66 years of data, from November 1948 until April 2014. As usual when handling dates, there’s some messing around because of leap years, but here it basically just comes down to which day of the year 1 November is in a given year, so it’s not complicated.

The daily NCEP reanalysis geopotential height data comes in one file per year, with all the pressure levels used in the dataset bundled up in each file. That means that the geopotential height variable in each file has coordinates: time, level, latitude, longitude, so we need to slice out the 500 mb level as we do the other subsetting.

All this is starting to sound kind of complicated, and this brings us to a regrettable point about dealing with this kind of data – it’s messy and there’s a lot of boilerplate code to read and manipulate coordinate metadata. This is true pretty much whatever language you use for processing these multi-dimensional datasets and it’s kind of unavoidable. The trick is to try to restrict this inconvenient stuff to the pre-processing phase by using a consistent organisation of your data for the later analyses. We’re going to do that here by storing all of our Z500 anomaly data in a single NetCDF file, with 181-day long winter seasons back-to-back for each year. This will make time and date processing trivial.

The code for the data subsetting is in the subset.hs program in the Gist. We’ll deal with it in a few bites.

Skipping the imports that we need, as well as a few “helper” type synonym definitions, the first thing we need to do is open one of the input NCEP NetCDF files to extract the coordinate metadata information. This listing shows how we do this:

  Right refnc <- openFile $ indir </> ""
  let Just nlat = ncDimLength <$> ncDim refnc "lat"
      Just nlon = ncDimLength <$> ncDim refnc "lon"
      Just nlev = ncDimLength <$> ncDim refnc "level"
  let (Just lonvar) = ncVar refnc "lon"
      (Just latvar) = ncVar refnc "lat"
      (Just levvar) = ncVar refnc "level"
      (Just timevar) = ncVar refnc "time"
      (Just zvar) = ncVar refnc "hgt"
  Right lon <- get refnc lonvar :: SVRet CFloat
  Right lat <- get refnc latvar :: SVRet CFloat
  Right lev <- get refnc levvar :: SVRet CFloat

We open the first of the NetCDF files (I’ve called the directory where I’ve stored these things indir) and use the hnetcdf ncDim and ncVar functions to get the dimension and variable metadata for the latitude, longitude, level and time dimensions; we then read the complete contents of the “coordinate variables” (for level, latitude and longitude) as Haskell values (here, SVRet is a type synonym for a storable vector wrapped up in the way that’s returned from the hnetcdf get functions).

Once we have the coordinate variable values, we need to find the index ranges to use for subsetting. For the spatial subsetting, we find the start and end ranges for the latitudes that we want (17.5°N-87.5°N) and for the level, we find the index of the 500 mb level:

  let late = vectorIndex LT FromEnd lat 17.5
      lats = vectorIndex GT FromStart lat 87.5
      levi = vectorIndex GT FromStart lev 500.0

using a helper function to find the correspondence between coordinate values and indexes:

data IndexStart = FromStart | FromEnd

vectorIndex :: (SV.Storable a, Ord a)
            => Ordering -> IndexStart -> SV.Vector a -> a -> Int
vectorIndex o s v val = case (go o, s) of
  (Nothing, _) -> (-1)
  (Just i, FromStart) -> i
  (Just i, FromEnd) -> SV.length v - 1 - i
  where go LT = SV.findIndex (>= val) vord
        go GT = SV.findIndex (<= val) vord
        vord = case s of
          FromStart -> v
          FromEnd -> SV.reverse v

For the temporal subsetting, we just work out what day of the year 1 November is for leap and non-leap years – since November and December together are 61 days, for each winter season we need those months plus the first 120 days of the following year:

  let inov1non = 305 - 1
      -- ^ Index of 1 November in non-leap years.
      wintertsnon = [0..119] ++ [inov1non..365-1]
      -- ^ Indexes of all winter days for non-leap years.
      inov1leap = 305 + 1 - 1
      -- ^ Index of 1 November in leap years.
      wintertsleap = [0..119] ++ [inov1leap..366-1]
      -- ^ Indexes of all winter days for leap years.
      winterts1948 = [inov1leap..366-1]
      winterts2014 = [0..119]
      -- ^ Indexes for winters in start and end years.

This is kind of hokey, and in some cases you do need to do more sophisticated date processing, but this does the job here.

Once we have all this stuff set up, the critical part of the subsetting is easy – for each input data file, we figure out what range of days we need to read, then use a single call to getS from hnetcdf (“get slice”):

        forM_ winterts $ \it -> do
          -- Read a slice of a single time-step of data: Northern
          -- Hemisphere (17.5-87.5 degrees), 5 degree resolution, 500
          -- mb level only.
          let start = [it, levi, lats, 0]
              count = [1, 1, (late - lats) `div` 2 + 1, nlon `div` 2]
              stride = [1, 1, 2, 2]
          Right slice <- getS nc zvar start count stride :: RepaRet2 CShort

Here, we have a set of start indexes, a set of counts and a set of strides, one for each dimension in our input variable. Since the input geopotential height files have dimensions of time, level, latitude and longitude, we have four entries in each of our start, count and stride lists. The start values are the current day from the list of days we need (called it in the code), the level we’re interested in (levi), the start latitude index (lats) and zero, since we’re going to get the whole range of longitude. The count list gets a single time step, a single level, and a number of latitude and longitude values based on taking every other entry in each direction (since we’re subsetting from a spatial resolution of 2.5° × 2.5° to a resolution of 5° × 5°). Finally, for the stride list, we use a stride of one for the time and level directions (which doesn’t really matter anyway, since we’re only reading a single entry in each of those directions) and a stride of two in the latitude and longitude directions (which gives us the “every other one” subsetting in those directions).
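The effect of a (start, count, stride) triple along one dimension is easy to state in pure Haskell. This tiny helper is invented here for illustration (it is not part of hnetcdf); it enumerates the indices a slice selects along a single axis:

```haskell
-- Indices selected along one dimension by a (start, count, stride)
-- triple: 'count' values beginning at 'start', taking every
-- 'stride'-th index.
sliceIndices :: Int -> Int -> Int -> [Int]
sliceIndices start count stride = take count [start, start + stride ..]
```

For the latitude axis above, sliceIndices lats ((late - lats) `div` 2 + 1) 2 picks out every other latitude between the two bounds, which is exactly the 2.5° to 5° subsampling described in the text; for example, sliceIndices 1 4 2 selects indices 1, 3, 5 and 7.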

All of the other code in the subsetting program is involved in setting up the output file and writing the Z500 slices out. Setting up the output file is slightly tedious (this is very common when dealing with NetCDF files – there’s always lots of metadata to be managed), but it’s made a little simpler by copying attributes from one of the input files, something that doing this in Haskell makes quite a bit easier than in Fortran or C. The next listing shows how the NcInfo for the output file is created, which can then be passed to the hnetcdf withCreateFile function to actually create the output file:

  let outlat = SV.fromList $ map (lat SV.!) [lats,lats+2..late]
      outlon = SV.fromList $ map (lon SV.!) [0,2..nlon-1]
      noutlat = SV.length outlat
      noutlon = SV.length outlon
      outlatdim = NcDim "lat" noutlat False
      outlatvar = NcVar "lat" NcFloat [outlatdim] (ncVarAttrs latvar)
      outlondim = NcDim "lon" noutlon False
      outlonvar = NcVar "lon" NcFloat [outlondim] (ncVarAttrs lonvar)
      outtimedim = NcDim "time" 0 True
      outtimeattrs = foldl' (flip M.delete) (ncVarAttrs timevar)
                     ["actual_range"]
      outtimevar = NcVar "time" NcDouble [outtimedim] outtimeattrs
      outz500attrs = foldl' (flip M.delete) (ncVarAttrs zvar)
                     ["actual_range", "level_desc", "valid_range"]
      outz500var = NcVar "z500" NcShort
                   [outtimedim, outlatdim, outlondim] outz500attrs
      outncinfo =
        emptyNcInfo (outdir </> "") #
        addNcDim outlatdim # addNcDim outlondim # addNcDim outtimedim #
        addNcVar outlatvar # addNcVar outlonvar # addNcVar outtimevar #
        addNcVar outz500var

Although we can mostly just copy the coordinate variable attributes from one of the input files, we do need to do a little bit of editing of the attributes to remove some things that aren’t appropriate for the output file. Some of these things are just conventions, but there are some tools that may look at these attributes (actual_range, for example) and may complain if the data doesn’t match the attribute. It’s easier just to remove the suspect ones here.

This isn’t pretty Haskell by any means, and the hnetcdf library could definitely do with having some higher-level capabilities to help with this kind of file processing. I may add some based on the experimentation I’m doing here – I’m developing hnetcdf in parallel with writing this!

Anyway, the result of running the subsetting program is a single NetCDF file containing 11946 days (66 × 181) of Z500 data at a spatial resolution of 5° × 5°. We can then pass this on to the next step of our processing.

Seasonal cycle removal

In almost all investigations in climate science, the annual seasonal cycle stands out as the largest form of variability (not always true in the tropics, but in the mid-latitudes and polar regions that we’re looking at here, it’s more or less always true). The problem, of course, is that the seasonal cycle just isn’t all that interesting. We learn about the difference between summer and winter when we’re children, and although there is much more to say about seasonal variations and how they interact with other phenomena in the climate system, much of the time they just get in the way of seeing what’s going on with those other phenomena.

So what do we do? We “get rid” of the seasonal cycle by looking at what climate scientists call “anomalies”, which are basically just differences between the values of whatever variable we’re looking at and values from a “typical” year. For example, if we’re interested in daily temperatures in Innsbruck over the course of the twentieth century, we construct a “typical” year of daily temperatures, then for each day of our 20th century time series, we subtract the “typical” temperature value for that day of the year from the measured temperature value to give a daily anomaly. Then we do whatever analysis we want based on those anomalies, perhaps looking for inter-annual variability on longer time scales, or whatever.
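As a toy illustration of the anomaly idea (the function name and numbers are invented, not taken from the article's code): given a “typical” cycle of per-day values, the anomaly on each day is just the observation minus the typical value for that day of the year.

```haskell
-- Daily anomalies: observation minus the "typical" value for the
-- corresponding day of the (here very short) year.  'cycle' repeats
-- the typical year so that a multi-year series lines up day by day.
anomalies :: [Double] -> [Double] -> [Double]
anomalies typical obs = zipWith (-) obs (cycle typical)
```

For a two-day “year” with typical values [10, 20], the two-year series [11, 19, 12, 22] has anomalies [1, -1, 2, 2]: the seasonal swing between 10 and 20 has been removed, leaving only the departures from it.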

This approach is very common, and it’s what we’re going to do for our Northern Hemisphere winter-time Z500 data here. To do this, we need to do two things: we need to calculate a “typical” annual cycle, and we need to subtract that typical annual cycle from each year of our data.

OK, so what’s a “typical” annual cycle? First, let’s say a word about what we mean by “annual cycle” in this case. We’re going to treat each spatial point in our data independently, trusting to the natural spatial correlation in the geopotential height to smooth out any shorter-term spatial inhomogeneities in the typical patterns. We then do some sort of averaging in time to generate a “typical” annual cycle at each spatial point. It’s quite common to use the mean annual cycle over a fixed period for this purpose (a period of 30 years is common: 1960-1990, say). In our case, we’re going to use the mean annual cycle across all 66 years of data that we have. Here’s how we calculate this mean annual cycle (this is from the seasonal-cycle.hs program in the Gist):

  -- Each year has 181 days, so 72 x 15 x 181 = 195480 data values.
  let ndays = 181
      nyears = ntime `div` ndays

  -- Use an Int array to accumulate values for calculating mean annual
  -- cycle.  Since the data is stored as short integer values, this is
  -- enough to prevent overflow as we accumulate the 2014 - 1948 = 66
  -- years of data.
  let sh = Repa.Z Repa.:. ndays Repa.:. nlat Repa.:. nlon
      slicecount = [ndays, nlat, nlon]
      zs = take (product slicecount) (repeat 0)
      init = Repa.fromList sh zs :: FArray3 Int

  -- Computation to accumulate a single year's data.
  let doone current y = do
        -- Read one year's data.
        let start = [(y - 1948) * ndays, 0, 0]
        Right slice <- getA innc z500var start slicecount :: RepaRet3 CShort
        return $!
          Repa.computeS $ current Repa.+^ (Repa.map fromIntegral slice)

  -- Accumulate all data.
  total <- foldM doone init [1948..2013]

  -- Calculate the final mean annual cycle.
  let mean = Repa.computeS $
             Repa.map (fromIntegral . (`div` nyears)) total :: FArray3 CShort

Since each year of data has 181 days, a full year’s data has 72 × 15 × 181 data values (72 longitude points, 15 latitude points, 181 days), i.e. 195,480 values. In the case here, since we have 66 years of data, there are 12,901,680 data values altogether. That’s a small enough number that we could probably slurp the whole data set into memory in one go for processing. However, we’re not going to do that, because there are plenty of cases in climate data analysis where the data sets are significantly larger than this, and you need to do “off-line” processing, i.e. to read data from disk piecemeal for processing.

We do a monadic fold (using the standard foldM function) over the list of years of data, and for each year we read a single three-dimensional slice of data representing the whole year and add it to the current accumulated sum of all the data. (This is what the doone function does: the only slight wrinkle here is that we need to deal with conversion from the short integer values stored in the data file to the Haskell Int values that we use in the accumulator. Otherwise, it’s just a question of applying Repa’s element-wise addition operator to the accumulator array and the year’s data.) Once we’ve accumulated the total values across all years, we divide by the number of years and convert back to short integer values, giving a short integer valued array containing the mean annual cycle – a three-dimensional array with one entry for each day in our 181-day winter season and for each latitude and longitude in the grid we’re using.
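The accumulate-then-divide pattern can be shown in miniature (a toy sketch with invented names, using plain lists in place of the Repa arrays but the same monadic fold):

```haskell
import Control.Monad (foldM)

-- Toy version of the mean-annual-cycle computation: element-wise sum
-- over "years" of data, followed by integer division by the number of
-- years.  Each inner list stands in for one year's 3-D data slice.
meanCycle :: Int -> [[Int]] -> IO [Int]
meanCycle nyears yearly =
  map (`div` nyears) <$> foldM step (replicate n 0) yearly
  where
    -- Accumulate one year's values into the running total.
    step acc y = pure (zipWith (+) acc y)
    n = case yearly of
          (y:_) -> length y
          []    -> 0
```

For instance, meanCycle 2 [[2,4],[4,8]] accumulates to [6,12] and divides down to the mean cycle [3,6].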

Once we have the mean annual cycle with which we want to calculate anomalies, determining the anomalies is simply a matter of subtracting the mean annual cycle from each year’s data, matching up longitude, latitude and day of year for each data point. The next listing shows the main loop that does this, reading a single day of data at a time, then subtracting the appropriate slice of the mean annual cycle to produce anomaly values (Repa’s slice function is handy here) and writing these to an output NetCDF file:

      let count = [1, nlat, nlon]
      forM_ [0..ntime-1] $ \it -> do
        -- Read time slice.
        Right slice <- getA innc z500var [it, 0, 0] count :: RepaRet2 CShort

        -- Calculate anomalies and write out.
        let sl = Repa.Z Repa.:. (it `mod` 181) Repa.:. Repa.All Repa.:. Repa.All
            anom = Repa.computeS $
                   slice Repa.-^ (Repa.slice mean sl) :: FArray2 CShort
        putA outnc outz500var [it, 0, 0] count anom

The only thing we have to be a little bit careful about when we create the final anomaly output file is that we need to remove some of the attributes from the Z500 variable: because we’re now working with differences between actual values and our “typical” annual cycle, we no longer need the add_offset and scale_factor attributes that are used to convert from the stored short integer values to floating point geopotential height values. Instead, the values that we store in the anomaly file are the actual geopotential height anomaly values in metres.

After doing all this pre-processing, what we end up with is a single NetCDF file containing 66 winter seasons of daily Z500 anomalies for the region we’re interested in. The kind of rather boring data processing we’ve had to do to get to this point is pretty typical for climate data processing – you almost always need to do some sort of subsetting of your data, you often want to remove signals that aren’t interesting (like the annual cycle here, although things can get much more complicated than that). This kind of thing is unavoidable, and the best that you can really do is to try to organise things so that you do the pre-processing once and end up with data in a format that’s then easy to deal with for further processing. That’s definitely the case here, where we have fixed-length time series (181 days per winter) for each year, so we don’t need to do any messing around with dates.

In other applications, pre-processing can be a much bigger job. For functional brain imaging applications, for example, as well as extracting a three-dimensional image from the output of whatever (MRI or CT) scanner you’re using, you often need to do something about the low signal-to-noise ratios that you get, you need to compensate for subject motion in the scanner during the measurements, you need to compensate for time-varying physiological nuisance signals (breathing and heart beat), you need to spatially warp images to match an atlas image to enable inter-subject comparison, and so on. And all that is before you get to doing whatever statistical analysis you’re really interested in.

We will look at some of these “warts and all” pre-processing cases for other projects later on, but for now the small amount of pre-processing we’ve had to do here is enough. Now we can start with the “main analysis”.

Before we do that though, let’s take a quick look at what these anomaly values look like. The two figures below show anomaly plots for the same time periods for which the original Z500 data is shown in the plots in this earlier article. The “normal” anomaly plots are a bit more variable than the original Z500 plots, but the persistent pattern over the North Atlantic during the blocking episode in the second set of plots is quite clear. This gives us some hope that we’ll be able to pick out these persistent patterns in the data relatively easily.

First, the “normal” anomalies:

Normal Z500 anomaly snapshots

then the “blocking” anomalies:

Blocking Z500 anomaly snapshots


September 04, 2014 12:54 PM

September 01, 2014

Douglas M. Auclair (geophf)

August Haskell Daily puzzles and solutions

August, 2014

  • August 1st, 2014: "But programming isn't math, so ..." Today's #haskell problem? 'Fix' yesterday's solution to be less ... 'icky.' π, fast (So you're not eating π? ... okay: 'quickly.') A solution set to today's exercises with Wow-o-wow results.
  • Bonus: August 1st, 2014: This is Embarrassing(ly parallel) to have π in da face!
  • A-to-the-ST for da D-down-low-on-the-SL. Today's #haskell exercise is write your own language, then some rules. Yeah. LAWLZ! NAND on the A-to-the-ST! A solution to the first 2 exercises to today's problem. YMMV: not pretty nor generic.
  • Baby needs a new pair of shoes! And you need a new daily #haskell problem to solve. Done! ;) Love is the Universal Language. No, wait. We were talking about Money, yes? Me confused. Solution to today's exercise I have a Yen for Thinking Japanese, or so I think so ... a solution to the bonus #haskell exercise
  • Bayes was a man of letters. A Bayesian classifier for letter-recognition? Sure, let's give it a go for today's #haskell exercise. So now we know 'M' looks like 'W' to some peeps but 'B', 'I', 'O' check out fine, so WE ARE READING WITH HASKELL! YAY ... added definitions to do test runs over the entire data set and then did various runs, tweaking the system. Results noted. Informative.
  • Today's #haskell exercise. An expert system for blood donation. I use the term 'expert' loosely, eh? ;) Have at it! "I vant to drinq yur bloot! Bwa-hahaha!" Okay. That quote is just ... creepy. A solution to today's exercise is at
  • Veni, Vidi, ... Duci? Today's #haskell exercise. It is a relief to know that no tactical nukes were launched by testing the solution to today's #haskell exercise. or How do you count Mississippi's in Roman numerals?
  • August 11th, 2014, Monday: You've heard of 'Rock Management'? How about ROCK LOBSTERS! GIVE ME A ROCK, NAOW!!!1!!!!11! Today's #haskell exercise 
  • August 12th, 2014, Tuesday: To Fract or Not to Fract: why are we even considering this question? Today's #haskell exercise 

  • Ooh! Bonus problem? On a Tuesday? Wowsers! Today's #haskell #bonus problem: "Continuum spectrum maximum... uh, 'thing'um"
  • August 13th, 2014, Wednesday. No problem! (literally), but: "Fract! The Cylons!" No... wait: Starbuck used a different word. A solution to yesterday's fracting problem is at
Flatliners. No fracting, no 'peak'ing, just a baseline data set of 
initializeSpectrum defined with flatline function.

Twin Peaks. Still not fracted, but data set with two spikes 
rolled into the base data set with smoothSpike fn 

Fracted. Data set with peaks, fracted using the frackedSpike function

  • August 14th, 2014, Thursday: Pining for that next fibo date-time before 2584. Today's #haskell problem inspired by @AlgebraFact Updated problem. Tightened up return value, removing unnecessary semideterminism. So, it gives, leik, a septatuple answer, leik. ... Yeah. Time keeps flowing like a (Fibonacci) river, to the sea. A solution to  the 'next Fibonacci' problem posted today.
  • Hail, Eris! or the Law of Fives or today's #haskell problem (implementing a Ripple-down rule set). Do it to it! One of three-parter solution to today's problem: Forall Existential(ism) A solution allowing forall-quantified values. Two of the three-part solution: All you need is fnord (la-di-da-di-dah)! Rippling down (and accumulating fired) rules Third of three-part solution: RippleDownRuleto ergo sum, adding rules to the RDR system  Updated the 3rd solution to the RDR (Writer monad definition) to ripple down to left and right, fixing inconsistency in rule findings.
  • August 18th, 2014: What's FP good for? Crypto and technical (financial) indicators. Why? Um, let's do today's #haskell exercise! Full on, all the way! Double Rain-... wait: line-graph. Solution to today's #haskell exercise

  • August 19th, 2014: What does it all mean? All the way! No, wait: this is just today's #haskell exercise (backtesting SMA indicator ). Take the monoid and run(State-WriterT), or WAAH! WAAH! I lost 100 corn chips on my investment strategy or, solution: But, as solace, it does come with a pretty picture, illustrating today's solution. Ooh! 

  • August 20th, 2014: Next up for today's #haskell exercise is the Exponential Moving Average. A solution to the E(xponential)M(oving)A(verage) #haskell problem: Stately Conformance 

  • August 21st, 2014: My Jeans. No ... I meant: 'GENES'! That's it. Today's #haskell exercise. A solution to (not even CLOSE to) 'yesterday's' #haskell exercise: Nature or Nurture? We present a genetic algorithm
  • August 22nd, 2014: Today's (much delayed) #haskell problem: the Silver Ratio (from @AlgebraFact Every cloud has a silver lining .. Every Haskell problem has a solution. (sometimes) (except noncomputable problems)

  • August 25th, 2014: LYADCFGG! Today's #haskell exercise? (automated/learned) Document classification.
  • August 26th, 2014: "Join me, Luke, on the Dark Side of the Force, and help me go #FORTH to solve today's #haskell exercise! MWA-HAHA!" The World's smallest DSL: Forth. A solution to today's #haskell exercise
  • August 28th, 2014: "As I compose this #haskell problem ... p.s. I love you ... You! You! You!" #1960s #song A little card logic problem. Oh, NOES! LOOK OUT! IT'S A CARD-#SHARKNADO! Nah, it's just today's #haskell solution, is all
  • August 29th, 2014: Groovitudes! WordNet Themes! or: today's #haskell exercise A 3x3 matrix has 36 possible solutions. A 5x5 matrix? Over 200M poss solutions? YIKES! A solution to the themed-words
  • The Forth language problem solution given on August 26th gives a very snazzy RPN ('reverse Polish notation') calculator, but that's all it does: the ':'-defining word needs access and look-ahead to the parsed program stream, and that's a bit more to ask than pushing and popping stack operators.
  • For the August 29th problem (WordNet themes) the raw generated solution set is over 209M possibilities. My little Haskell program had not finished scanning them after four hours, when I killed the process. However, my dear wife solved the problem in under five minutes. Setting aside the fact that she's a genius, the program needs to be better: it needs to use the ontology of English-language words to eliminate fruitless endeavors during their generation, not afterwards during the test phase.

by geophf ( at September 01, 2014 10:11 AM

Yesod Web Framework

Planning Yesod 1.4

Now that persistent 2.0 is out the door, it's time to start talking about Yesod version 1.4. First question you might ask: what happened to Yesod 1.3? Answer: a few of the Yesod libraries (e.g., yesod-auth) are already on version 1.3, so to avoid confusion, we're jumping straight to 1.4.

Second question: what are we planning on breaking this time? Answer: hopefully nothing! The main purpose of this release is actually to just remove some backwards-compatibility hacks in the Yesod codebase for older versions of dependencies, like shakespeare pre-2.0, conduit pre-1.2, WAI pre-3.0, and persistent pre-2.0.

There are a few exceptions to this, which should hopefully have minimal impact on users. You can see these in the detailed change list. One change I'd like to call out is the updated routing system. This is a fundamental change to how yesod-routes works, and the generated code is drastically simpler as a result. Instead of constructing a data structure that allows for efficient pattern matching of the request path and then attempting to parse the resulting pieces, the new code simply generates a series of clauses, one for each route, and ensures proper parsing using view patterns. In my initial benchmarking, this made routing twice as fast as Yesod 1.2. I would have released this as part of 1.2, but it introduces a new requirement on the ViewPatterns language extension, so instead I held it back for the 1.4 release.
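To illustrate the clause-per-route idea, here is a minimal sketch (the `dispatch` function and routes are hypothetical illustrations, not Yesod's actual generated code): each route becomes one pattern-match clause, and view patterns parse the path pieces inside the pattern itself, so matching and parsing happen in one step.

```haskell
{-# LANGUAGE ViewPatterns #-}
import Text.Read (readMaybe)

-- Parse a path piece as an Int; used as a view pattern below.
readInt :: String -> Maybe Int
readInt = readMaybe

-- One clause per route. If the view pattern fails to parse a piece,
-- the clause simply doesn't match and the next one is tried.
dispatch :: [String] -> Maybe String
dispatch []                          = Just "HomeR"
dispatch ["blog"]                    = Just "BlogR"
dispatch ["blog", readInt -> Just n] = Just ("PostR " ++ show n)
dispatch _                           = Nothing
```

Because parsing failures fall through to the next clause, there is no separate "match, then parse, then maybe backtrack" phase, which is where the speedup comes from.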

If there are other breaking changes that people would like to propose, now's the time to do it. But be aware that I'll likely push back hard on any breakage. If there's a very good reason for it, we can do it. But I'd rather keep stability wherever possible.

There's one exception to that rule, which is the purpose of the rest of this blog post: the scaffolded site. Making changes to the scaffolded site never breaks existing applications, and therefore we can be much more liberal about changing things there. There is a downside in terms of education: all existing tutorials on the scaffolding would need to be updated. But one of my points below addresses that.

So here are my proposed scaffolding changes:

  • Let's move away from config files towards environment variables for configuration. A config file is still a convenient way to record configuration, but injecting it through environment variables means configuration can also be stored in a database or elsewhere and supplied the same way.
  • Along the same lines, we would no longer need a command line argument to indicate which environment we're in (devel vs production, etc). All such settings would be controlled via environment variables.
  • To allow for easy development, we would have a single YESOD_DEVEL environment variable which would indicate whether we're currently in development. If so, it would apply a number of default environment variable values to avoid the need to set these in your shell manually.
  • Finally, and I expect this to be controversial: let's use classy-prelude-yesod in the Import module, instead of just taking Prelude with a few objectionable functions filtered out.
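The environment-variable scheme from the first three points can be sketched in a few lines. This is a hypothetical helper, not the actual scaffolding code: settings come from the environment, and when YESOD_DEVEL is set, a development default is used instead of failing.

```haskell
import System.Environment (lookupEnv)
import Data.Maybe (isJust)

-- Hypothetical sketch: fetch a setting from the environment.
-- In development (YESOD_DEVEL set), fall back to a sensible default
-- so nothing needs to be configured in the shell manually;
-- in production, a missing setting is an error.
getSetting :: String -> String -> IO String
getSetting name devDefault = do
  devel <- lookupEnv "YESOD_DEVEL"
  mval  <- lookupEnv name
  case mval of
    Just v -> return v
    Nothing
      | isJust devel -> return devDefault
      | otherwise    -> fail (name ++ " must be set in production")
```

A config file can still feed this: a wrapper script reads the file and exports each entry as an environment variable before starting the app, so databases or other stores can inject configuration the same way.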

This is just a first pass at a scaffolding cleanup; I'm sure there are other improvements that can be made as well.

I don't have a specific date on a Yesod 1.4 release, but I'm not expecting it to be a long development process. The vast majority of the work is already done (on the yesod-1.4 branch), and that codebase is already being used extensively in a rather large Yesod application, so I'm not too worried about regressions having slipped in.

September 01, 2014 12:00 AM

August 30, 2014

Joachim Breitner

DebConf 14

I’m writing this blog post on the plane from Portland towards Europe (which I now can!), using the remaining battery life after having watched one of the DebConf talks that I missed. (It was the systemd talk, which was good and interesting, but maybe I should have watched one of the power management talks, as my battery is running down faster than it should be, I believe.)

I mostly enjoyed this year’s DebConf. I must admit that I did not come very prepared: I had neither something urgent to hack on, nor important things to discuss with the other attendees, so in a way I had a slow start. I also felt a bit out of touch with the project, both personally and technically: in previous DebConfs, I had more interest in many different corners of the project, and also came with more naive enthusiasm. After more than 10 years in the project, I see a few things more realistically and am also more relaxed, and don’t react to “Wouldn’t it be cool to have crazy idea” very easily any more. And these days I mostly focus on Haskell packaging (and related tooling, which sometimes is also relevant and useful to others), which is not very interesting to most others.

But in the end I did get to do some useful hacking, heard a few interesting talks and even got a bit excited: I created a new tool to schedule binNMUs for Haskell packages which is quite generic (configured by just a regular expression), so that it can and will be used by the OCaml team as well, and who knows who else will start using hash-based virtual ABI packages in the future... It runs via a cron job on to produce output for Haskell and for OCaml, based on data pulled via HTTP. If you are a Debian developer and want up-to-date results, log into and run ~nomeata/binNMUs --sql; it then uses the projectb and wanna-build databases directly. Thanks to the ftp team for opening up, by the way!

Unsurprisingly, I also held a talk on Haskell and Debian (slides available). I talked a bit too long and we had too little time for discussion, but in any case not all discussion would have fitted in 45 minutes. The question of which packages from Hackage should be added to Debian and which not is still undecided (which means we carry on packaging what we happen to want in Debian for whatever reason). I guess the better our tooling gets (see the next section), the more easily we can support more and more packages.

I am quite excited by and supportive of Enrico’s agenda to remove boilerplate data from the debian/ directories and rely on autodebianization tools. We have such a tool for Haskell packages, cabal-debian, but it is unofficial, i.e. neither created by us nor fully endorsed. I want to change that, so I got in touch with the upstream maintainer and we want to get it into shape for producing perfect Debian packages, provided the upstream-supplied metadata is perfect. I’d like to see the Debian Haskell Group follow Enrico’s plan to its extreme conclusion, and this way drive innovation in Debian in general. We’ll see how that goes.

Besides all the technical program I enjoyed the obligatory games of Mao and Werewolves. I also got to dance! On Saturday night, I found a small but welcoming Swing-In-The-Park event where I could dance a few steps of Lindy Hop. And on Tuesday night, Vagrant Cascadian took us (well, three of us) to a blues dancing night, which I greatly enjoyed: the style was so improvisation-friendly that despite having missed the introduction and never having danced Blues before I could jump right in. And in contrast to social dances in Germany, where it is often announced that the girls are also invited to ask the boys, but then it is still mostly the boys who have to ask, here it took only half a minute of standing at the side until I got asked to dance. In retrospect I should have skipped the HP reception and gone there directly...

I’m not heading home at the moment, but will travel directly to Göteborg to attend ICFP 2014. I hope the (usually worse) west-to-east jet lag will not prevent me from enjoying that as much as I could.

by Joachim Breitner ( at August 30, 2014 03:10 PM

Chris Wong

Hackage update, part 4

A lot has happened with Hackage since my last update. Now that the Summer of Code is over, I’ll summarize the work I’ve done since then, and outline where this project will go next.

What’s “build reporting”?

Since my project covered a few obscure parts of Hackage and Cabal, I think it’s worthwhile to clarify some terminology first.

If you’ve uploaded a library to Hackage before, you may have noticed that the Haddock documentation does not appear straight away. Since building a package can be quite resource intensive, the job is handled by a dedicated build bot. This bot continually polls for new packages, invokes cabal install on them (with some special flags, which I’ll go into later), and uploads the result.

Of course, this process does not always succeed. If a package fails to compile, then it will not have any documentation either. This is clearly very inconvenient.

Fortunately, recent versions of cabal include a feature called build reporting. When invoked with the --build-summary option, cabal creates a file containing useful information about the build. Here’s an example using the robot package:

$ cabal install robot --build-summary='$'
$ cat
package: robot-
os: linux
arch: x86_64
compiler: ghc-7.6.3
client: cabal-install-
dependencies: xhb-0.5.2014.4.10 transformers-
              exceptions-0.6.1 containers- base-
install-outcome: InstallOk
docs-outcome: NotTried
tests-outcome: NotTried

Since the build bot uses cabal, it has access to these reports as well. So whenever the bot completes a build — successful or not — it posts the corresponding report to Hackage. You can read these reports yourself via a special URL; for our robot example it’s

In summary: if the docs for a package are missing, then the reports will tell us why. If there are no reports, then it must mean the build bot hasn’t attempted the package yet. All is fine and dandy, at least in theory.


In practice, however, not all builds were reported. The gaps were in two places: planning failures and package candidates. My latest patch to cabal fixed both of these issues.

Reporting planning failures

A planning failure is when cabal-install cannot find a consistent set of dependencies to use. You can trigger a planning failure yourself:

$ cabal install robot --constraint='robot < 1.1' --constraint='robot > 1.1'
cabal: Could not resolve dependencies:

Since we can’t have a robot which is both older and newer than 1.1, the resolver fails.

Formerly, as dependency resolution ran early in the build process, any failures at this stage did not generate a corresponding report. So if the build bot encountered a planning failure, all the user saw was missing documentation, with no hints as to what went wrong.

The fix was mostly straightforward, save for one issue: since users can report their own builds, a naïve implementation would have led to Hackage being swamped with frivolous reports. So this feature is guarded behind a flag (--report-planning-failure), and disabled by default.

Reporting candidate builds

Hackage has a feature called build candidates. This lets package maintainers upload and test packages without publishing them to the main site.

Again, the problem was the lack of reporting: when a candidate was uploaded, the build bot would compile the package but not submit a report. This was a major issue, since this reporting was what motivated the feature in the first place.

After some digging, I traced this to a bug in cabal. A candidate is not published in the main package index (by definition), so it is impossible to refer to one by name (e.g. hello-1.0). So the build bot invokes cabal using the bare URL instead (e.g.

The problem was that if only a URL was given, cabal considered it a “local” package and did not generate a report. The reason for this behavior is outside the scope of this post, but the fix was clear: change cabal to generate reports for all packages, no matter how they are specified on the command line.

Where to next?

Though the Summer of Code has ended, my work with Hackage has not. There are still many issues that need clearing up, especially with the candidates feature; I’ll continue hacking away at them in my spare time.

And lest I forget — many thanks to my mentor Duncan Coutts for his guidance throughout this project! I had plenty of fun this summer, and learned just as much.

August 30, 2014 12:00 AM

August 29, 2014

Yesod Web Framework

Announcing Persistent 2

We are happy to announce the release of persistent 2.0.

persistent 2.0 adds a flexible key type and makes some breaking changes. 2.0 is an unstable release that we want your feedback on for the soon-to-follow stable 2.1 release.

New Features

  • type-safe composite primary and foreign keys
  • added an upsert operation (update or insert)
  • added an insertMany_ operation


Breaking changes

  • Use a simple ReaderT for the underlying connection
  • An Id suffix is no longer automatically assumed to be a Persistent type
  • JSON serialization: MongoDB ids no longer have a prefix 'o' character
  • fix postgreSQL timezone storage
  • remove the type parameter from EntityDef and FieldDef

In depth

Composite keys

The biggest limitation of data modeling with persistent has been its assumption of a simple (for current SQL backends, an auto-increment) primary key. We learned from Groundhog that a more flexible primary key type is possible. Persistent now adds a similar flexible key type while maintaining its existing invariant that a Key is tied to a particular table.

To understand the changes to the Key data type, let's look at a change in the test suite for persistent 2.

       i <- liftIO $ randomRIO (0, 10000)
-      let k = Key $ PersistInt64 $ abs i
+      let k = PersonKey $ SqlBackendKey $ abs i

Previously Key contained a PersistValue. This was not type safe. PersistValue is meant to serialize any basic Haskell type to the database, but a given table only allows specific values as the key. Now we generate the PersonKey data constructor which specifies the Haskell key types. SqlBackendKey is the default key type for SQL backends.

Now let's look at code from CompositeTest.hs:

mkPersist sqlSettings [persistLowerCase|
  Parent
      name  String maxlen=20
      name2 String maxlen=20
      age Int
      Primary name name2 age
      deriving Show Eq
  Child
      name  String maxlen=20
      name2 String maxlen=20
      age Int
      Foreign Parent fkparent name name2 age
      deriving Show Eq
|]

Here Parent has a composite primary key made up of 3 fields. Child uses that as a foreign key. The primary key of Child is the default key for the backend.

let parent = Parent "a1" "b1" 11
let child = Child "a1" "b1" 11
kp <- insert parent
_ <- insert child
testChildFkparent child @== parent

Future changes

Short-term improvements

Before the 2.1 release I would like to look at doing some simple things to speed up model compilation a little bit.

  • Speed up some of the compile-time persistent code (there is a lot of obviously naive code).
  • Reduce the size of Template Haskell generation (create a reference for each EntityDef and some other things rather than potentially repeatedly inlining it)

Medium-term improvement: better support for Haskell data types

We want to add better support for modeling ADTs, particularly for MongoDB, where this is actually very easy to do in the database itself. Persistent already supports a top-level entity Sum Type and a simple field ADT that is just an enumeration.

Another pain point is serializing types not declared in the schema. The declaration syntax in groundhog is very verbose but allows for this. So one possibility would be to allow the current DRY persistent declaration style and also a groundhog declaration style.

Long-term improvements: Projections

It would be possible to add projections now as groundhog or esqueleto have done. However, the result is not as end-user friendly as we would like. When the record namespace issue is dealt with in the GHC 7.10 release we plan on adding projections to persistent.

Ongoing: Database specific functionality

We always look forward to seeing more database adapters for persistent. In the last year, a Redis and an ODBC adapter were added.

Every database is different though, and you also want to take advantage of your database-specific features. esqueleto and persistent-mongoDB have shown how to build database specific features in a type-safe way on top of persistent.


Although the persistent code has no dependency on Yesod, I would like to make the infrastructure a little more independent of Yesod. The first steps would be:

  • putting it under a different organization on github.
  • having a separate mailing list (should Stack Overflow be prioritized over e-mail?)

August 29, 2014 09:46 PM

Antti-Juhani Kaijanaho (ibid)

Licentiate Thesis is now publicly available

My recently accepted Licentiate Thesis, which I posted about a couple of days ago, is now available in JyX.

Here is the abstract again for reference:

Kaijanaho, Antti-Juhani
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Finnish summary

Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.

Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis

by Antti-Juhani Kaijanaho at August 29, 2014 08:45 AM

August 28, 2014

Functional Jobs

Senior Software Engineer (Functional) at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District. We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you do have some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

August 28, 2014 09:18 PM

Douglas M. Auclair (geophf)

Dylan: the harsh realities of the market

So, this is a little case study.

I did everything for Dylan. And when I say everything, I mean everything.  Here's my resumé:

  • I got excited about Dylan as a user, and I used it. I bought an old Mac that I don't ever remember the designation for, it's so '90's old, and got the floppies for the Dylan IDE from Apple research.
I'm not joking.

  • I integrated Dylan into my work at work, building an XML parser then open-sourcing it to the community under the (then) non-restrictive license. I think mine was the only XML parser that was industrial-strength for Dylan. Can't claim originality: I ported over the Common-LISP one, but it was a lot of (fun) work.
  • I made improvements to the gwydion-dylan compiler, including some library documentation (you can see my name right there, right in the compiler code) and some library functionality. Did I work on the compiler itself? The Dylan syntax extensions or type system? I don't recall; if not in those places, I know I've looked at those guts: I had my fingers all over parts of the compiler.
I was in the Dylan compiler code. For you ll-types ('little language') that's no big deal.

But ask a software developer in industry if they've ever been in their compiler code. I have, too: I've found bugs in Java Sun-compiler that I fixed locally and reported up the chain.
  • I taught a course at our community college on Dylan. I had five students from our company that made satellite mission software.
  • I effing had commercial licenses bought when the boss asked me: what do we have to do to get this (my system) done/integrated into the build. I put my job on the line, for Dylan. ... The boss bought the licenses: he'd rather spend the $x than spending six weeks to back-port down to Java or C++.
  • I built a rule-based man-power scheduling system; generating that schedule had previously taken three administrative assistants three days each quarter. My system did it, and printed out a PDF, in less than one second. I sold it, so that means I started a commercial company and sold my software.
I sold commercial Dylan software. That I wrote. Myself. And sold. Because people bought it. Because it was that good.

Hells yeah.

Question: what more could I have done?

I kept Dylan alive for awhile. In industry. For real.

So why is Dylan dead?

That's not the question.

Or, that question is answered over and over and over again.

Good languages, beautiful languages, right-thing languages languish and die all the time.

Dylan was the right-thing, and they (Apple) killed it in the lab, and for a reason.

Who is Dylan for?

That's not the question either. Because you get vague, general, useless answers.

The question is to ask it like Paul Graham answered it for LISP.

Lisp is a pointless, useless, weird language that nobody uses.

But Paul and his partner didn't care. They didn't give a ...


... what anybody else thought. They knew that this language, the language they loved, was built and designed and made for them. Just them and only them, because the only other people who were using it were college kids on comp.lang.lisp asking for the answers for problem-set 3 on last night's homework.

That's what Lisp was good for: nothing.
That's who Lisp was good for: nobody.

Same exact scenario for Erlang. Exactly the same. Erlang was only good for Joe Armstrong and a couple of buddies/weirdos like him, you know: kooks, who believed that Erlang was the right-thing for what they were doing, because they were on a mission, see, and nothing nobody could say could stop them nor stand against them, and all who would rise up against them would fall.


What made Lisp and Haskell and Erlang and Scala and Prolog (yes, Prolog, although you'll never hear that success story publicly, but $26M and three lives saved? Because of a Prolog system I wrote? And that's just one day in one month's report for data? I call that a success) work when nobody sane would say that these things would work?

Well, it took a few crazy ones to say, no, not: 'say' that it would work, but would make it work with their beloved programming language come hell or high water or, worse: indifferent silence, or ridicule, or pity from the rest of the world.

That is the lesson of perl and python and all these other languages. They're not good for anything. They suck. And they suck in libraries and syntax and semantics and weirdness-factor and everything.

But two, not one, but at least two people loved that language enough to risk everything, and ...

They lost.

Wait. What?

Did you think I was going to paint the rosy picture and lie to you and say 'they won'?

Because they didn't.

Who uses Lisp commercially? Or Haskell, except some fringers, or Scala or Clojure or Erlang or Smalltalk or Prolog

... or Dylan.

These languages are defined, right there in the dictionary.

Erlang: see 'career wrecker.'

Nobody uses those languages nor admits to even touching them with a 10-foot (3-meter) pole. I had an intern from college. 'Yeah, we studied this weird language called ML in Comp.sci. Nobody uses it.'

She was blown away when I started singing ML's praises and what it can do.

A meta-language, and she called it useless? Seriously?

Because that's what the mainstream sees.

Newsflash. I'm sorry. Dylan, Haskell, Idris: these aren't main-stream, and they never will be.

Algebraic types? Dependent types? You'll never see them. They're too ... research-y. They stink of academe, which is: they stink of uselessness-to-industry. You'll be dead and buried to see them in this form, even after they discover the eternity elixir. Sorry.

Or you'll see them in Visual Basic as a new Type-class form that only a few Microserfs use because they happened to have written those extensions. Everybody else?


Here's how Dylan will succeed, right now.

Bruce and I will put our heads together, start a company, and we'll code something. Not for anybody else to use and to love and to cherish, just for us, only for us, and it will blow out the effing doors, and we'll be bought out for $40M because our real worth is $127M.

And the first thing that Apple will do, after they bought us, is to show us the door, then convert the code into Java. Or Swift. Or Objective-C, or whatever.

And that's how we'll win.

Not the $40M. Not the lecture series on 'How to Make Functional Programming Work in Industry for Real' afterwards at FLoC and ICFP conferences with fan-bois and -girls wanting to talk to us afterwards and ask us how they can get a job doing functional programming.

Not that.

We'll win because we made something in Dylan, and it was real, and it worked, and it actually did something for enough people that we can now go to our graves knowing that we did something once with our lives (and we can do it again and again, too: there's no upper limit on the successes you're allowed to have, people) that meant something to some bodies. And we did that. With Dylan.


I've done that several times already, by my counting: the Prolog project, the Dylan project, the Mercury project, and my writing.

I'm ready to do that, again.

Because, actually, fundamentally, doing something in this world and for it ... there's nothing like it.

You write that research paper, and I come up to you, waving it in your face, demanding you implement your research because I need it to do my job in Industry?

I've done that to three professors so far. Effing changed their world-view in that moment. "What?" they said, to a person, "somebody actually wants to use this?" The look of bemused surprise on their faces?

It was sad, actually, because they did write something that somebody out there (moiself) needed, but they never knew that what they were doing meant something.

And it did.

Effing change your world-view. Your job? Your research? Your programming language?

That's status quo, and that's good and necessary and dulce and de leche (or decorum, I forget which).

But get up out of the level you're at, and do something with it so that that other person, slouched in their chair, sits up and takes notice, and a light comes over their face and they say, 'Ooh! That does that? Wow!' and watch their world change, because of you and what you've done.

Dylan is for nothing and for nobody.

So is everything under the Sun, my friend.

Put your hand to the plow, and with the sweat of your brow, make it yours for this specific thing.

Regardless of the long hours, long months of unrewarded work, and regardless of the hecklers, naysayers, and concerned friends and parents, and regardless of the mountain of unpaid bills.

You make it work, and you don't stop until it does.

That's how I've won.

Every time.

by geophf at August 28, 2014 04:06 PM


Dealing with Asynchronous Exceptions during Resource Acquisition


Consider the following code: we open a socket, compute with it, and finally close the socket again. The computation happens inside an exception handler (try), so even when an exception happens we still close the socket:

example1 :: (Socket -> IO a) -> IO a 
example1 compute = do -- WRONG
  s <- openSocket 
  r <- try $ compute s
  closeSocket s
  case r of
    Left ex -> throwIO (ex :: SomeException)
    Right a -> return a

Although this code correctly deals with synchronous exceptions–exceptions that are the direct result of the execution of the program–it does not deal correctly with asynchronous exceptions–exceptions that are raised as the result of an external event, such as a signal from another thread. For example, in

example2 :: (Socket -> IO a) -> IO (Maybe a)
example2 compute = timeout someTimeout $ example1 compute

it is possible that the timeout signal arrives after we have opened the socket but before we have installed the exception handler (or indeed, after we leave the scope of the exception handler but before we close the socket). In order to address this we have to control precisely where asynchronous exceptions can and cannot be delivered:

example3 :: (Socket -> IO a) -> IO a
example3 compute =
  mask $ \restore -> do
    s <- openSocket 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

We mask asynchronous exceptions, and then restore them only inside the scope of the exception handler. This very common pattern is captured by the higher level combinator bracket, and we might rewrite the example as

example4 :: (Socket -> IO a) -> IO a
example4 = bracket openSocket closeSocket

Allowing asynchronous exceptions during resource acquisition

Suppose that we wanted to define a derived operation that opens a socket and performs some kind of handshake with the server on the other end:

openHandshake :: IO Socket
openHandshake = do
  mask $ \restore -> do
    s <- openSocket
    r <- try $ restore $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

(These and the other examples can be defined in terms of bracket and similar, but we use mask directly so that it’s easier to see what is happening.) We might use openHandshake as follows:

example5 :: (Socket -> IO a) -> IO a
example5 compute = do
  mask $ \restore -> do
    s <- openHandshake 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

There are no resource leaks in this code, but there is a different problem: we call openHandshake with asynchronous exceptions masked. Although openHandshake calls restore before doing the handshake, restore restores the masking state to that of the enclosing context. Hence the handshake with the server cannot be timed out. This may not be what we want–we may want to be able to interrupt example5 with a timeout either during the handshake or during the argument computation.

Note that this is not a solution:

example6 :: (Socket -> IO a) -> IO a 
example6 compute = do
  mask $ \restore -> do
    s <- restore openHandshake -- WRONG
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

Consider what might happen: if an asynchronous exception is raised after openHandshake returns the socket, but before we leave the scope of restore, the socket will be leaked. Installing an exception handler does not help: since we don’t have a handle on the socket, we cannot release it.

Interruptible operations

Consider this definition from the standard libraries:

withMVar :: MVar a -> (a -> IO b) -> IO b
withMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    b <- restore (io a) `onException` putMVar m a
    putMVar m a
    return b

This follows almost exactly the same pattern as the examples we have seen so far; we mask asynchronous exceptions, take the contents of the MVar, and then execute some operation io with the contents of the MVar, finally putting the contents of the MVar back when the computation completes or when an exception is raised.

An MVar acts as a lock, with takeMVar taking the role of acquiring the lock. This may, of course, take a long time if the lock is currently held by another thread. But we call takeMVar with asynchronous exceptions masked. Does this mean that the takeMVar cannot be timed out? No, it does not: takeMVar is a so-called interruptible operation. From the Asynchronous Exceptions in Haskell paper:

Any operation which may need to wait indefinitely for a resource (e.g., takeMVar) may receive asynchronous exceptions even within an enclosing block, but only while the resource is unavailable. Such operations are termed interruptible operations. (..) takeMVar behaves atomically when enclosed in block. The takeMVar may receive asynchronous exceptions right up until the point when it acquires the MVar, but not after.

(block has been replaced by mask since the publication of the paper, but the principle is the same.) Although the existence of interruptible operations makes understanding the semantics of mask harder, they are necessary: like in the previous section, wrapping takeMVar in restore is not safe. If we really want to mask asynchronous exceptions, even across interruptible operations, Control.Exception offers uninterruptibleMask.
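The interruptibility of takeMVar is easy to observe directly. Here is a small self-contained demo (my own, not from the original post): even under mask_, a timeout can cancel a takeMVar on an empty MVar, because the takeMVar blocks and is therefore interruptible.

```haskell
import Control.Concurrent (newEmptyMVar, takeMVar)
import Control.Exception (mask_)
import System.Timeout (timeout)

-- takeMVar on an empty MVar would block forever, but because takeMVar is
-- an interruptible operation, the timeout can still deliver its
-- asynchronous exception even inside mask_.
demo :: IO (Maybe ())
demo = do
  mv <- newEmptyMVar
  timeout 100000 $ mask_ $ takeMVar mv  -- returns Nothing: interrupted

main :: IO ()
main = demo >>= print
```

Replacing mask_ with uninterruptibleMask_ here would prevent the timeout from being delivered, which is exactly the distinction the quoted paper draws.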

Custom interruptible operations

So an interruptible operation is one that can be interrupted by an asynchronous exception even when asynchronous exceptions are masked. Can we define our own interruptible operations? Yes, we can:

-- | Open a socket and perform handshake with the server
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ unsafeUnmask $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

unsafeUnmask is defined in GHC.IO, and unmasks asynchronous exceptions, no matter what the enclosing context is. This is of course somewhat dangerous, because now calling openHandshake' inside a mask suddenly opens up the possibility of an asynchronous exception being raised; and the only way to know is to look at the implementation of openHandshake', or its Haddock documentation. This is somewhat unsatisfactory, but exactly the same goes for takeMVar and any other interruptible operation, or any combinator that uses an interruptible operation under the hood. A sad state of affairs, perhaps, but one that we don’t currently have a better solution for.

Actually, using unsafeUnmask is a bit too crude. Control.Exception does not export it, but does export

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

with documentation

When invoked inside mask, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking.

When called outside mask, or inside uninterruptibleMask, this function has no effect.

(emphasis mine.) Sadly, this documentation does not reflect the actual semantics: unsafeUnmask, and as a consequence allowInterrupt, unmasks asynchronous exceptions no matter what the enclosing context is: even inside uninterruptibleMask. We can however define our own operator to do this:

interruptible :: IO a -> IO a
interruptible act = do
  st <- getMaskingState
  case st of
    Unmasked              -> act
    MaskedInterruptible   -> unsafeUnmask act
    MaskedUninterruptible -> act 

where we call unsafeUnmask only if the enclosing context is mask, but not if it is uninterruptibleMask (TODO: What is the semantics when we nest these two?). We can use it as follows to define a better version of openHandshake:

-- | Open a socket and perform handshake with the server
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ interruptible $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

Resource allocation timeout

If we wanted to timeout the allocation of the resource only, we might do

example7 :: (Socket -> IO a) -> IO a
example7 compute = do
  mask $ \restore -> do
    ms <- timeout someTimeout $ openHandshake'
    case ms of
      Nothing -> throwIO (userError "Server busy")
      Just s  -> do r <- try $ restore $ compute s
                    closeSocket s
                    case r of
                      Left ex -> throwIO (ex :: SomeException)
                      Right a -> return a

Exceptions are masked when we enter the scope of the timeout, and are unmasked only once we are inside the exception handler in openHandshake'–in other words, if a timeout happens, we are guaranteed to clean up the socket. The surrounding mask is however necessary. For example, suppose we are writing some unit tests and we are testing openHandshake'. This is wrong:

example8 :: IO ()
example8 = do 
  ms <- timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

Even if we are sure that example8 will not be interrupted by asynchronous exceptions, there is still a potential resource leak here: the timeout exception might be raised just after we leave the mask_ scope from openHandshake' but just before we leave the timeout scope. If we are sure we don’t need to worry about other asynchronous exceptions we can write

example8 :: IO ()
example8 = do
  ms <- mask_ $ timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

although of course it might be better to simply write

example8 :: IO ()
example8 = 
  bracket (timeout someTimeout $ openHandshake')
          (\ms -> case ms of Just s  -> closeSocket s
                             Nothing -> return ())
          (\_ -> return ())


Making sure that resources are properly deallocated in the presence of asynchronous exceptions is difficult. It is very important to make sure that asynchronous exceptions are masked at crucial points; unmasking them at the point of calling a resource allocation function is not safe. If you nevertheless want to be able to timeout resource allocation, you need to make your resource allocation function interruptible.

For completeness’ sake, there are some other solutions that avoid the use of unsafeUnmask. One option is to thread the restore argument through (and compose multiple restore arguments if there are multiple nested calls to mask). This requires resource allocations to have a different signature, however, and it is very error prone: a single mask somewhere along the call chain where we forget to thread through the restore argument will mean the code is no longer interruptible. The other option is to run the code that we want to be interruptible in a separate thread, and wait for the thread to finish with, for example, a takeMVar. Getting this right is however no easy task, and it doesn’t change anything fundamentally anyway: rather than using unsafeUnmask we are now using a primitive interruptible operation; either way we introduce the possibility of exceptions even in the scope of mask_.

Finally, when your application does not fit the bracket pattern we have been using (implicitly or explicitly), you may want to have a look at resourcet and pipes or conduit, or my talk Lazy I/O and Alternatives in Haskell.

by edsko, duncan at August 28, 2014 03:48 PM

Keegan McAllister

Calling a Rust library from C (or anything else!)

One reason I'm excited about Rust is that I can compile Rust code to a simple native-code library, without heavy runtime dependencies, and then call it from any language. Imagine writing performance-critical extensions for Python, Ruby, or Node in a safe, pleasant language that has static lifetime checking, pattern matching, a real macro system, and other goodies like that. For this reason, when I started html5ever some six months ago, I wanted it to be more than another "Foo for BarLang" project. I want it to be the HTML parser of choice, for a wide variety of applications in any language.

Today I started work in earnest on the C API for html5ever. In only a few hours I had a working demo. And this is a fairly complicated library, with 5,000+ lines of code incorporating

It's pretty cool that we can use all this machinery from C, or any language that can call C. I'll describe first how to build and use the library, and then I'll talk about the implementation of the C API.

html5ever (for C or for Rust) is not finished yet, but if you're feeling adventurous, you are welcome to try it out! And I'd love to have more contributors. Let me know on GitHub about any issues you run into.

Using html5ever from C

Like most Rust libraries, html5ever builds with Cargo.

$ git clone
$ cd html5ever
$ git checkout dev
$ cargo build
Updating git repository ``
Compiling phf_mac v0.0.0 (
Compiling html5ever-macros v0.0.0 (file:///tmp/html5ever)
Compiling phf v0.0.0 (
Compiling html5ever v0.0.0 (file:///tmp/html5ever)

The C API isn't Cargo-ified yet, so we'll build it using the older Makefile-based system.

$ mkdir build
$ cd build
$ ../configure
$ make libhtml5ever_for_c.a
rustc -D warnings -C rpath -L /tmp/html5ever/target -L /tmp/html5ever/target/deps \
-o libhtml5ever_for_c.a --cfg for_c --crate-type staticlib /tmp/html5ever/src/
warning: link against the following native artifacts when linking against this static library
note: the order and any duplication can be significant on some platforms, and so may need to be preserved
note: library: rt
note: library: dl
note: library: pthread
note: library: gcc_s
note: library: pthread
note: library: c
note: library: m

Now we can build an example C program using that library, and following the link instructions produced by rustc.

$ H5E_PATH=/tmp/html5ever
$ gcc -Wall -o tokenize tokenize.c -I $H5E_PATH/capi -L $H5E_PATH/build \
-lhtml5ever_for_c -lrt -ldl -lpthread -lgcc_s -lpthread -lc -lm

$ ./tokenize 'Hello&comma; <i class=excellent>world!</i>'
CHARS : Hello
TAG : <i>
ATTR: class="excellent"
CHARS : world!
TAG : </i>

The build process is pretty standard for C; we just link a .a file and its dependencies. The biggest obstacle right now is that you won't find the Rust compiler in your distro's package manager, because the language is still changing so rapidly. But there's a ton of effort going into stabilizing the language for a Rust 1.0 release this year. It won't be too long before rustc is a reasonable build dependency.

Let's look at the C client code.

#include <stdio.h>

#include "html5ever.h"

void put_str(const char *x) {
    fputs(x, stdout);
}

void put_buf(struct h5e_buf text) {
    fwrite(text.data, text.len, 1, stdout);
}

void do_start_tag(void *user, struct h5e_buf name, int self_closing, size_t num_attrs) {
    put_str("TAG : <");
    put_buf(name);
    if (self_closing) {
        putchar('/');
    }
    put_str(">\n");
}
// ...

struct h5e_token_ops ops = {
    .do_chars = do_chars,
    .do_start_tag = do_start_tag,
    .do_tag_attr = do_tag_attr,
    .do_end_tag = do_end_tag,
};

struct h5e_token_sink sink = {
    .ops = &ops,
    .user = NULL,
};

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("Usage: %s 'HTML fragment'\n", argv[0]);
        return 1;
    }

    struct h5e_tokenizer *tok = h5e_tokenizer_new(&sink);
    h5e_tokenizer_feed(tok, h5e_buf_from_cstr(argv[1]));
    h5e_tokenizer_end(tok);
    h5e_tokenizer_free(tok);
    return 0;
}

The struct h5e_token_ops contains pointers to callbacks. Any events we don't care to handle are left as NULL function pointers. Inside main, we create a tokenizer and feed it a string. html5ever for C uses a simple pointer+length representation of buffers, which is this struct h5e_buf you see being passed by value.

This demo only does tokenization, not tree construction. html5ever can perform both phases of parsing, but the API surface for tree construction is much larger and I didn't get around to writing C bindings yet.

Implementing the C API

Some parts of Rust's libstd depend on runtime services, such as task-local data, that a C program may not have initialized. So the first step in building a C API was to eliminate all std:: imports. This isn't nearly as bad as it sounds, because large parts of libstd are just re-exports from other libraries like libcore that we can use with no trouble. To be fair, I did write html5ever with the goal of a C API in mind, and I avoided features like threading that would be difficult to integrate. So your library might give you more trouble, depending on which Rust features you use.

The next step was to add the #![no_std] crate attribute. This means we no longer import the standard prelude into every module. To compensate, I added use core::prelude::*; to most of my modules. This brings in the parts of the prelude that can be used without runtime system support. I also added many imports for ubiquitous types like String and Vec, which come from libcollections.

After that I had to get rid of the last references to libstd. The biggest obstacle here involved macros and deriving, which would produce references to names under std::. To work around this, I create a fake little mod std which re-exports the necessary parts of core and collections. This is similar to libstd's "curious inner-module".

I also had to remove all uses of format!(), println!(), etc., or move them inside #[cfg(not(for_c))]. I needed to copy in the vec!() macro which is only provided by libstd, even though the Vec type is provided by libcollections. And I had to omit debug log messages when building for C; I did this with conditionally-defined macros.

With all this preliminary work done, it was time to write the C bindings. Here's how the struct of function pointers looks on the Rust side:

pub struct h5e_token_ops {
    do_start_tag: extern "C" fn(user: *mut c_void, name: h5e_buf,
                                self_closing: c_int, num_attrs: size_t),

    do_tag_attr: extern "C" fn(user: *mut c_void, name: h5e_buf,
                               value: h5e_buf),

    do_end_tag: extern "C" fn(user: *mut c_void, name: h5e_buf),

    // ...
}

The processing of tokens is straightforward. We pattern-match and then call the appropriate function pointer, unless that pointer is NULL. (Edit: eddyb points out that storing NULL as an extern "C" fn is undefined behavior. Better to use Option<extern "C" fn ...>, which will optimize to the same one-word representation.)
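The Option<extern "C" fn ...> representation mentioned in the edit is easy to check. A small sketch (mine, not html5ever code; Callback, my_cb and invoke are made-up names): None maps to the NULL pointer and Some(f) to the function address, so the Option adds no space overhead.

```rust
use std::mem::size_of;
use std::os::raw::c_void;

// A nullable C callback, represented without overhead: the compiler uses
// the never-NULL guarantee of fn pointers as a "niche" for None.
type Callback = extern "C" fn(user: *mut c_void);

extern "C" fn my_cb(_user: *mut c_void) {}

// Pattern-match and call only when the pointer is non-NULL, mirroring
// the token-dispatch code described above.
fn invoke(cb: Option<Callback>, user: *mut c_void) {
    if let Some(f) = cb {
        f(user);
    }
}

fn main() {
    // Option<Callback> is exactly one word, same as a bare fn pointer.
    assert_eq!(size_of::<Option<Callback>>(), size_of::<Callback>());
    invoke(Some(my_cb), std::ptr::null_mut());
    invoke(None, std::ptr::null_mut());
}
```

This is why Option<extern "C" fn ...> is ABI-compatible with a plain, possibly-NULL C function pointer.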

To create a tokenizer, we heap-allocate the Rust data structure in a Box, and then transmute that to a raw C pointer. When the C client calls h5e_tokenizer_free, we transmute this pointer back to a box and drop it, which will invoke destructors and finally free the memory.
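In today's Rust the transmute in each direction is usually spelled Box::into_raw and Box::from_raw. A minimal sketch of the same ownership handoff (the Tokenizer type and function names here are made up, not the real h5e_* API):

```rust
// Stand-in for the real tokenizer state.
struct Tokenizer {
    fed: Vec<String>,
}

// What a constructor like h5e_tokenizer_new does, in miniature:
// heap-allocate in Rust, then leak the Box as a raw pointer for C to hold.
fn tokenizer_new() -> *mut Tokenizer {
    Box::into_raw(Box::new(Tokenizer { fed: Vec::new() }))
}

// What a destructor like h5e_tokenizer_free does: take ownership back,
// so dropping the Box runs destructors and frees the memory.
unsafe fn tokenizer_free(tok: *mut Tokenizer) {
    drop(unsafe { Box::from_raw(tok) });
}

fn main() {
    let tok = tokenizer_new();
    unsafe {
        (*tok).fed.push("Hello".to_string());
        assert_eq!((*tok).fed.len(), 1);
        tokenizer_free(tok); // Drop runs here; the pointer is now dangling
    }
}
```

The invariant is that the pointer crosses the FFI boundary exactly once in each direction: C must not free it with its own malloc/free, and Rust must not use it after the free call.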

You'll note that the functions exported to C have several special annotations:

  • #[no_mangle]: skip name mangling, so we end up with a linker symbol named h5e_tokenizer_free instead of _ZN5for_c9tokenizer18h5e_tokenizer_free.
  • unsafe: don't let Rust code call these functions unless it promises to be careful.
  • extern "C": make sure the exported function has a C-compatible ABI. The data structures similarly get a #[repr(C)] attribute.

Then I wrote a C header file matching this ABI:

struct h5e_buf {
    unsigned char *data;
    size_t len;
};

struct h5e_buf h5e_buf_from_cstr(const char *str);

struct h5e_token_ops {
    void (*do_start_tag)(void *user, struct h5e_buf name,
                         int self_closing, size_t num_attrs);

    void (*do_tag_attr)(void *user, struct h5e_buf name,
                        struct h5e_buf value);

    void (*do_end_tag)(void *user, struct h5e_buf name);

    // ...
};

struct h5e_tokenizer;

struct h5e_tokenizer *h5e_tokenizer_new(struct h5e_token_sink *sink);
void h5e_tokenizer_free(struct h5e_tokenizer *tok);
void h5e_tokenizer_feed(struct h5e_tokenizer *tok, struct h5e_buf buf);
void h5e_tokenizer_end(struct h5e_tokenizer *tok);

One remaining issue is that Rust is hard-wired to use jemalloc, so linking html5ever will bring that in alongside the system's libc malloc. Having two separate malloc heaps will likely increase memory consumption, and it prevents us from doing fun things like allocating Boxes in Rust that can be used and freed in C. Before Rust can really be a great choice for writing C libraries, we need a better solution for integrating the allocators.

If you'd like to talk about calling Rust from C, you can find me as kmc in #rust and #rust-internals on And if you run into any issues with html5ever, do let me know, preferably by opening an issue on GitHub. Happy hacking!

by keegan at August 28, 2014 06:18 AM

August 27, 2014

Bill Atkins

NSNotificationCenter, Swift and blocks

The conventional way to register observers with NSNotificationCenter is to use the target-action pattern. While this gets the job done, it's inherently not type-safe.

For example, the following Swift snippet will compile perfectly:

    NSNotificationCenter.defaultCenter().addObserver(self, selector: Selector("itemAdded:"),
      name: MyNotificationItemAdded, object: nil)

even though at runtime it will fail unless self has a method named itemAdded that takes exactly one parameter (leaving off that last colon in the selector will turn this line into a no-op). Plus, this method gives you no way to take advantage of Swift's closures, which would allow the observer to access local variables in the method that adds the observer and would eliminate the need to create a dedicated method to handle the event.

A better way to do this is to use blocks. And NSNotificationCenter does include a block-based API:

    NSNotificationCenter.defaultCenter().addObserverForName(MyNotificationItemAdded, object: nil, queue: nil) { note in
      // ...
    }

This is much nicer, especially with Swift's trailing closure syntax. There are no method names to be looked up at runtime, we can refer to local variables in the method that registered the observer and we can perform small bits of logic in reaction to events without having to create and name dedicated methods.

The catch comes in resource management. It's very important that an object remove its event observers when it's deallocated, or else NSNotificationCenter will try to invoke methods on invalid pointers.

The traditional target-action method has the one advantage that we can easily handle this requirement with a single call in deinit:

  deinit {
    NSNotificationCenter.defaultCenter().removeObserver(self)
  }
With the block API, however, since there is no explicit target object, each call to addObserverForName returns "an opaque object to act as observer." So your observer class would need to track all of these objects and then remove them all from the notification center in deinit, which is a pain.

In fact, the hassle of having to do bookkeeping on the observer objects almost cancels out the convenience of using the block API. Frustrated by this situation, I sat down and created a simple helper class, NotificationManager:

class NotificationManager {
  private var observerTokens: [AnyObject] = []

  deinit {
    deregisterAll()
  }

  func deregisterAll() {
    for token in observerTokens {
      NSNotificationCenter.defaultCenter().removeObserver(token)
    }

    observerTokens = []
  }

  func registerObserver(name: String!, block: (NSNotification! -> ()?)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: nil, queue: nil) { note in
      block(note)
    }

    observerTokens.append(newToken)
  }

  func registerObserver(name: String!, forObject object: AnyObject!, block: (NSNotification! -> ()?)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: object, queue: nil) { note in
      block(note)
    }

    observerTokens.append(newToken)
  }
}

First, this simple class provides a Swift-specialized API around NSNotificationCenter.  It provides an additional convenience method without an object parameter (rarely used, in my experience) to make it easier to use trailing-closure syntax. But most importantly, it keeps track of the observer objects generated when observers are registered, and removes them when the object is deinit'd.

A client of this class can simply keep a member variable of type NotificationManager and use it to register its observers. When the parent class is deallocated, the deinit method will automatically be called on its NotificationManager member variable, and its observers will be properly disposed of:

class MyController: UIViewController {
  private let notificationManager = NotificationManager()

  override init() {
    super.init(nibName: nil, bundle: nil)

    notificationManager.registerObserver(MyNotificationItemAdded) { note in
      println("item added!")
    }
  }

  required init(coder: NSCoder) {
    fatalError("decoding not implemented")
  }
}

When the MyController instance is deallocated, its NotificationManager member variable will be automatically deallocated, triggering the call to deregisterAll that will remove the dead objects from NSNotificationCenter.

In my apps, I add a notificationManager instance to my common UIViewController base class so I don't have to explicitly declare the member variable in all of my controller subclasses.

Another benefit of using my own wrapper around NSNotificationCenter is that I can add useful functionality, like group observers: an observer that's triggered when any one of a group of notifications are posted:

struct NotificationGroup {
  let entries: [String]

  init(_ newEntries: String...) {
    entries = newEntries
  }
}

extension NotificationManager {
  func registerGroupObserver(group: NotificationGroup, block: (NSNotification! -> ()?)) {
    for name in group.entries {
      registerObserver(name, block: block)
    }
  }
}
This can be a great way to easily set up an event handler to run when, for example, an item is changed in any way at all:

    let MyNotificationItemsChanged = NotificationGroup(
      MyNotificationItemAdded,
      MyNotificationItemRemoved
    )

    notificationManager.registerGroupObserver(MyNotificationItemsChanged) { note in
      // ...
    }

by More Indirection at August 27, 2014 11:21 AM