Planet Haskell

April 23, 2014

wren gayle romano

Out of the hospital

I woke up feeling terrible last monday, and by midnight I was on a bed in the ER. Spent the next few days in the hospital: had surgery on wednesday, got released on thursday. Since then I've been resting about the house, and have been recovering quickly. It was my first "real" surgery. (I've had other surgical procedures, but they aren't the sort that doctors mean when they ask if you've "had any surgeries".) Before saying what I had done, I'd like to make a point about social awareness.

Do you recognize these symptoms?

  • sharp shooting pain in the chest, possibly extending to shoulders and neck/throat, lightheadedness/dizziness, shortness of breath.
  • dark urine, yellowing skin/eyes, nausea/vomiting, difficulty concentrating, sleepiness.
  • urinating often, increased thirst/hunger, blurry vision, abrasions heal slowly, tingling/pain/numbness in hands/feet.
  • dull pain in the pit of your stomach, possibly extending to back or right shoulder, possibly increasing after eating fatty foots, doesn't abate in different positions, fever and chills.

This day and age I'd expect any moderately educated person to recognize the first three immediately: heart disease, liver disease, and diabetes (type 2). Both heart disease and diabetes have had powerful ad campaigns to increase awareness. Liver disease hasn't had that advantage, but the symptoms of jaundice are mentioned in the side-effect reports of most medications, and they're pretty memorable to boot. The last one I never would have recognized until it happened to me. And, frankly, the ER doctors had a hell of a time figuring out what it might be based only on my symptoms. I felt feverish at the time, though my temperature was normal. This was the first out-and-out attack, so I couldn't talk about how often it happened nor say whether it got worse after eating fatty foods. Knowing all the symptoms now, I can look back and see that this has been a long time coming; but at the moment all I could tell the docs was: intense dull pain in the pit of my stomach, doesn't get better or worse when changing position.

These are the symptoms of gallbladder disease. Women are more than twice as likely as men to get it. Women on hormone replacement therapy are more likely to get it. Many women are hit with it during the end of pregnancy— so many so that nurses remark on the number of cholecystectomy patients with one-week old babies. There's something about estrogen (or fluctuations thereof) that doesn't play nicely with the gallbladder. So I mention this for all the women, especially trans women, in the audience. Of course, none of this is to say that there aren't plenty of men who suffer from the same. Prior to going to the ER I'd heard almost nothing about gallbladder disease, other than knowing that gallstones were a thing. But in the scarce week since then I've lost track of how many women have told me about their cholecystectomies. With how common it is, I think this is one of those diseases that we as a society should be able to recognize— like diabetes, heart attacks, and jaundice.

So yeah, I'm down an organ now. There aren't too many side effects of having your gallbladder removed (compared to removing other organs), though it does mean I'll have to watch my diet. I've been doing that anyways, now it's just different things to look for. I'll have to put the high-protein low-carb diet on hold for a couple months, since I need to reduce fat intake until my body gets used to the new me. Also worth noting: apparently losing weight quickly (as with the 30-pounds I dropped last fall) can increase the risk of gallstones. So if you're dropping weight, you should be sure to monitor things and try to flush/cleanse your gallbladder.

That's it for now. Goodnight and good health.

comment count unavailable comments

April 23, 2014 11:19 PM

Roman Cheplyaka

Lens is unidiomatic Haskell

Edward Kmett writes:

Ironically if I had to say anything from the traffic in my inbox and on #haskell about it, it is mostly the old guard who gets disgruntled by lens.

So let me try and explain why that is. I’ll go ahead and say this: lens is unidiomatic Haskell.

By which I mean that lens isn’t like any other library that we normally use. It doesn’t follow the conventions of Haskell APIs, on which I elaborate below.

Now let me clarify that this doesn’t necessarily mean that lens is a bad library. It’s an unusual library. It’s almost a separate language, with its own idioms, embedded in Haskell.

It is as unusual as, say, Yesod is. But unlike Yesod, which follows Haskell’s spirit, not letter (syntax), lens, I argue, follows Haskell’s letter, but not its spirit.

So here’s why I think lens violates the spirit of Haskell:

  1. In Haskell, types provide a pretty good explanation of what a function does. Good luck deciphering lens types.

    Here’s a random function signature I picked from lens:

    below :: Traversable f => APrism' s a -> Prism' (f s) (f a)

    Despite having some (fairly basic) understanding of what prisms are, this signature tells me nothing at all, even despite the usage of type synonyms.

    So you have to rely on documentation much more that on types. Yeah, just like in Ruby.

  2. Usually, types in Haskell are rigid. This leads to a distinctive style of composing programs: look at the types and see what fits where. This is impossible with lens, which takes overloading to the level mainstream Haskell progably hasn’t seen before.

    We have to learn the new language of the lens combinators and how to compose them, instead of enjoying our knowledge of how to compose Haskell functions. Formally, lens types are Haskell function types, but while with ordinary Haskell functions you immediately see from types whether they can be composed, with lens functions this is very hard in practice.

  3. The size of the lens API is comparable to the size of what I’d call «core Haskell» (i.e. more or less the base library). It is also similar in spirit to base: it has a big number of trivial combinations of basic functions, in order to create a base vocabulary in which bigger programs are expressed.

    Ordinary libraries, instead, give only basic functions/combinators, and rely on the vocabulary provided by base (or lens) to compose them together.

    This is why I regard lens as a language in its own. And this demonstrates why learning lens is hard: surely learning a new language is harder than learning a new library.

  4. Dependencies. A library as basic in its purpose as lens ideally should have almost no dependencies at all. Instead, other libraries should depend on it and implement its interface. (Or even do it without depending on it, as is possible with lens.)

    A package implementing lenses that depends on a JSON parsing library and a gzip compression library sounds almost like a joke to me.

    OTOH, it kind of makes sense if you think about lens as a language. It just ships with a “rich standard library”. Nice!

  5. Backward composition of lenses. It’s a minor issue, and I wouldn’t mention it if it wasn’t a great demonstration of how lens goes against the conventions of Haskell.

Note that I’m not trying to make a judgment here (although my tone probably does give away my attitude towards lens). I’m simply explaining why people may dislike and resist it.

Nor am I trying to argue against any particular design decision of lens. I’m sure they all have valid rationale behind them.

I just hope that someone will write an idiomatic Haskell library as powerful as (or close to) lens, with perhaps a different set of compromises made. Otherwise, I’m afraid we all will have to learn this new language sooner or later.

April 23, 2014 09:00 PM

Douglas M. Auclair (geophf)

'T' is for Theorem-proving

'T' is for Theorem-proving

So, you've been theorem-proving all your life.

Let's just get that one out there.

When you're given a math problem that says:

"Solve x + 1 = 2 (for x) (of course)"

And you say, "Why, x is 1, of course," quick as thinking, you've just proved a theorem.

Here, let me show you:

x + 1 = 2 (basis)
     -1 = -1 (reflexive)
x + 0 = 1 (addition)
x       = 1 (identity)

Q.E.D. ('cause you rock)

So, yeah. Theorem proving is this, correction: theorem proving is simply this: going from step 1, to step 2 to step 3 until you've got to where you want to be.

How do you go from step 1 to step 2, etc? The same way you do everything! You follow the rules you're given.

Let's prove a theorem, right now.

So, in classical logic, we have a theorem that says

p → p

That is, if you've got a p, that implies (that you've got a) p.


But proving that? How do we go about doing that?

Well, in classical logic, you're given three basic axioms (thanks to sigfpe for his article "Adventures in Classic Land"):

1. p → (q → p)
2. (p → (q → r)) → ((p → q) → (p → r))
3. ¬¬p → p

So, p → p. How do we get there? Let's start with axiom 1 above.

1. p → ((p → p) → p)
axiom 1 (where q = (p → p))
2. (p → ((p → p) → p) → ((p → (p → p)) → (p → p))
axiom 2 (where q = (p → p) and r = p)
3. (p → (p → p)) → (p → p)
modus ponens
4. p → (p → p)
axiom 1 (where q = p)
5. p → p
modus ponens

Q.E.D. (a.k.a. whew!)

I used something called modus ponens in the above proof. It says, basically, that if you've proved something that you're depending on, you can drop it. Or, logically:

p → q, p

Or, we have "p implies q" we've proved p, so now we can just use q.

Now, there's been a lot of thought put into theory, coming up with theorems and proving them, and this has been over a long time, from Aristotle up to now. The big question for a long time was that ...

Well, theorem proving is a lot of hard work. Is there any way to automate it?

So there's been this quest to make theorem proving mechanical.

But to make theorem-proving mechanical, you have to mechanize everything, the axioms, the rules, and the process. Up until recent history, theory has been a lot of great minds spouting truths, and 'oh, here's a new way to look at the world!'

And that has been great (it has), but it hasn't helped mechanize things.

Then along came Frege. What Frege did was to give us a set of symbols that represented logical connectives and then symbols that represented things.

And there you had it: when you have objects and their relations, you have an ontology, a knowledge-base. Frege provided the tools to represent knowledge as discreet things that could be manipulated by following the rules of the (symbolized) relations.

He gave abstraction to knowledge and an uniform way of manipulating those abstractions, so, regardless of the underlying knowledge be it about money or chickens or knowledge, itself, it could be represented and then manipulated.

That way you could go from step 1 to step 2 to step 3, etc, until you arrived at your conclusion, or, just as importantly, arrived at a conclusion (including one that might possibly say what you were trying to prove was inadmissible).

Since Frege, there has been (a lot of) refinement to his system, but we've been using his system since because it works so well. He was the one who came up with the concept of types ('T' is for Types) and from that we've improved logic to be able to deliver this blog post to you on a laptop that is, at base, a pile of sand that constantly proving theorems in a descendent of Frege's logic.

Let's take a look at one such mechanized system. It's a Logic Framework, and is called tweLF. The first example proof from the Princeton team that uses this system is 'implies true,' and it goes like this:

imp_true: pf B -> pf (A imp B) = ...

That's the declaration. It's saying: the proof of B, or pf B, implies the proof of A implies B, or pf (A imp B).

How can you claim such a thing?

Well, you prove it.

imp_true: pf B -> pf (A imp B) =
[p1: pf B]            % you have the proof of B
% ... but now what? for we don't have the proof that A implies B

So the 'now what' is our theorem-proving adventure. Mechanized.

We don't have the proof that A implies B, but we have a logical loop-hole (called a hole) that a proof some something that's true is its proof:

hole: {C} pf C.

Which says, mechanized what I just said in words, so with that:

imp_true: pf B -> pf (A imp B) =
[p1: pf B]
(hole (A imp B)).

And there you go.

... that is, if you're fine with a big, gaping loophole in your proof of such a thing.

If you are satisfied, then, well, here, sit on this rusty nail until you get unsatisfied.

So there.

Okay, so we have an implication, but we need to prove that.

So we introduce the implication into the proof, which is defined as:

imp_i: (pf A -> pf B) -> pf (A imp B)

SO! we have that, and can use it:

imp_true: pf B -> pf (A imp B) =
[p1 : pf B]
(imp_i ([p2: pf A] (hole B))).

So, we have the proof of A and that leads to B. But wait! We already have B, don't we? It's p1 (that is [p1 : pf B])


imp_true: pf B -> pf (A imp B) =
[p1 : pf B]
(imp_i ([p2 : pf A] p1)).


This is what theorem-proving is: you start with what you want, e.g.:

imp_true: pf B -> pf (A imp B)

which is "implies-true is that if you have B proved, then you have the proof that (A implies B) is true, too."

Then you take what you've got:

pf B

And give it a value:

[p1 : pf B]

Then you apply your rules (in this case, implication introduction, or 'imp_i') to fill the holes in your proof, to get:

[p1: pf B] (imp_i [p2: pf A] p1)

Which is the demonstration of your proof, and that's why we say 'Q.E.D.' or 'quod est demonstratum.' 'Thus it is demonstrated.'

Ta-dah! Now you have all the tools you need to prove theorems.

Now go prove Fermat's Last Theorem.

(I am so mean!)

Okay, okay, so maybe Fermat's Last Theorem is a bit much, but if you want to try your hand at proving some theorems, there's a list of some simple theorems to prove.

by geophf ( at April 23, 2014 04:40 PM

April 22, 2014

Douglas M. Auclair (geophf)

'S' is for Simply Summing

'S' is for ... simply summing.

Okay, this post was going to be about symmetries and super-symmetries, then I was going to transition to sieves, like the Sieve of Eratosthenes, and how that relates to fizz-buzz, ...

But then I thought: 'eh.'

Yeah, that's what I thought: 'eh.'

I'm a moody brute, filled with deep, indiscernible thoughts, 'cause that's how I roll.

So, yeah: eh. This one will be a simple post today.


Sum the numbers 1.

Yeah, I just said that; humor me.

Okay, the sum of 1 is 1. Toughie.

Next, sum 1 + 2.

3, right? Still with me?

Okay, 1 + 2 + 3.

Got it? 6. No probs.

Now: sum 1 to 10. No laptop help, just sum it yourself.

A little more fun, right? So, that's 55.

Okay, no computers: sum 1 to 100.

Hm, hm, hm. Done yet?

La-di-dah. You're not even trying, are you? You're just reading this post.

Okay, the sum from 1 to 100, inclusive, no computers is ... 5,050.

1 to 1000: 500,500.

Pattern? You bet.

1 to ... oh, I don't know: 123:

That's a might harder, but, no computers:

62 00 + 12 4 0 + 18 6 = 7,626.

Okay, let's verify, by asking my Haskell interpreter: 

Prelude> sum [1 .. 123]

Yeppers, I was right. I was also as fast as a person typing a program into their computer to solve it, if they were using a functional programming language, and much, much faster than a person using a computer if they weren't and called up Excel, for example (loading ... loading ... loading ... blue screen of death) (You really need to virus-clean your computer, because I see you don't have a Mac, do you, tsk, tsk!) or if they pulled out their HP calculator from their high school days because they are that old, boys and girls!

You see this thing in my hand, children? This is not a smart phone, it's a calculator.

Children: Ooh! Papa, can I borrow it for a sec ...

Children, borrowing it, the puzzling over it: Um, Dad, ... where's the Plants vs. Zombies 2 app?

Yeah, a calculator, it calculates. And that's it. No google, no nothing, just numbers and numeric operations.

Children, looking uncomprehendingly at the thing in their hands, then, tossing the cootied thing from their hands and run to their rooms, screaming, remembering with horror the time Dear Old Dad got out an LP and told them that music came out of it, but ... it didn't have playlists!

Yeah. Calculator. But would your calculator get you that sum that fast? I mean, after you go to the CVS to buy a new battery. Those things are long-lasting, but thirty years later? And you expect your calculator to still just work?

Pfft! Heh.

Pick a number (natural number), any number, I'll sum it for you, from the origin.

Sum 1 to 31415.


31415 * 15708 = 314,15 0,000 + 157,075, 000 + 21,990,5 00 + 0 + 251,320 = 493,466,820

Okay, that took a bit longer, let's see what Haskell says:

Prelude> sum [1..31415]

Okay. Damn! ... Sometimes I impress even myself.

How do I do this? How do I add all the numbers from 1 to 31,415 and in under a minute? And perfectly, perfectly, correctly?

Well, I'll tell you.

But first, I'll tell you a story.

Once upon a time, there was a little boy named Gauss, and he was a naughty, little, precocious lad, always getting into trouble for dunking Susie Derkins' pigtails in the inkwell at his desk, and that would be fine if she had jet black hair, but you know the Viking types, right? All blond-haired, blue-eyed things, and getting your honey-blond long-flowing locks dunked into ink was not on the agenda for poor, little Susie, who cried, and had to be sent home to console herself with an appointment at the beautician's who thought the Goth-look actually fit Susie well, so she came back the next day as Suze-in-black-leather-and-studs which was disruptive for the class in other ways, so Gauss found himself at detention again.

But this time the math teacher had had enough of little Gauss' pronouncements and recitations of π to sixty-seven places, and decided to teach the little scamp but good.

Not that I'm channeling, or anything. It's not 'Mary Sue'-writing; it's 'walking in your character's moccasins a mile' ... even though Gauss probably didn't wear moccasins all that often, anyway.

Still with me?

So, mean-old Herr Schopenhauer told little Gauss. "Okay, twerp, this is gonna be a light one on you. You can go home after you sum the number from one to one hundred."

Huh? One to one hundred? But that will take all night!

"That's right," said mean old Herr Bergermeister Meisterburger, "so you'd better get crackin'!" And he cackled evilly with Schadenfreude.

The Head-Master had several names, don't you know, ... and a peridiction for Schadenfreude with his afternoon tea and muffin.

So, what could little Gauss do? He got cracking, the poor lamb.

1 = 1
1 + 2 = 3
1 + 2 + 3 = 6
1 + 2 + 3 + 4 = 10 ... obviously, you just add the last number to the previous sum
1 + 2 + 3 + 4 + 5 = 15 ... now, wait a minute ...
1 + 2 + 3 + 4 + 5 + 6 = 21 ... there seems to be another pattern here ...
1 + 2 + 3 + 4 + 5 + 6 + 7 = 28 ... Gauss looked at the chalkboard, then ...

Got it! Little Gauss thought.

Then he wrote:

1 + 2 + 3 + ... + 98 + 99 + 100 = 5,050.

"G'nite, Teach!" Gauss crowed, and off he skipped home with Suze, where they shared a chocolate milk and a strawberry poptart ... before all that sugar glaze had been added to the crust (shudder).

And Herr Herren HerringSchönheit was left stuttering a "Vat? Vat? Git back here, you rapscallion!"

Okay, what's a rap-scallion? Is it a thug onion with 'tude, a glock, and a triple-platinum pedigree?

But all Herr whatz-his-name could do was sputter, because little Gauss got it.

The sum from one (zero, actually) to any number reduces to a simple formula:

sum [1..n] = n * (n + 1) / 2

Either n or n + 1 is even, so one of them is divisible by two, so it works.

Try it out:

1 = 1 * (1 + 1) / 2 = 1 * 2 / 2 = 1
1 + 2 = 2 * (2 + 1) / 2 = 2 * 3 / 2 = 3
1 + 2 + 3 = 3 * (3 + 1) / 2 = 3 * 4 / 2 = 3 * 2 = 6
... so obviously:
1 + 2 + 3 + ... + 31,415 = 31,415 * (31,415 + 1) / 2 = 31,415 * 15,708 = that big number I computed earlier. Yeah, almost five-hundred million!

Put that into your "ceci-n'est-pas-unu-pipe" and smoke it. I'd buy that for a dollar.

Now, there's something you can impress your friends at Happy Hour with.


Okay, so you can sum 1 to ... whatevs, but what if a friend asks you, very kindly, and very politely, 'Okay, Ms. Smartypants, how about summing 1,000 to 2,000? What's that, huh? Betcha can't do that! Huh? Amirite? Amirite? Lookacha sweating bullets now, arencha? So, well, what is it? Huh?"

Yeah, really politely. Like that.

Now you could say, "But-but-but, Gauss' summer[-function, not -time] doesn't work like that."

But then you'd have feet of clay and egg on your face.

And with this hot summer coming, you don't want egg on your face. Trust me on that one.

So. What to do?

Think about it, that's what.

You simply make your 1,000 your zero, by scaling each number back by 1,000, then get your sum, then scale back each number by 1,000. For each time you applied the scale, you apply the scale to the result.

So, the sum of 1,000 to 2,000 inclusive is:

1,000 + 1,001 + 1,002 + ... 1,998 + 1,999 + 2,000 =

scaled back is

0 + 1 + 2 + ... + 998 + 999 + 1000 = 1000 * (1000 + 1) / 2 = 500,500

Now scale each number forward again, by adding 1,000 back to each number. You did that 1,001 times, right? (Don't forget to count the zero.) That's a rescalation (that's a word now) of 1,001,000. Add that to your sum to get 1,501,500.

There's your answer.

Let's check:

Prelude> sum [1000..2000]

Bingo! Tickets! This must be the Front Row!

And you, very politely, just told your friend where they could put their superior 'tude, too.

Win-win! Good times; good times!

See y'all tomorrow. By the way, what's the number of moves needed to solve the Towers of Hanoi? Not that 'Towers of Hanoi' is going to be my topic for 'T.'

Not at all.

by geophf ( at April 22, 2014 07:06 PM

Russell O'Connor

How to Fake Dynamic Binding in Nix

The Nix language is a purely functional, lazy, statically scoped configuration language that is commonly used to specify software package configurations, OS system configurations, distributed system configurations, and cloud system configurations.

Static scoping means that variables are statically bound; all variable references are resolved based on their scope at declaration time. For example, if we declare a set with recursive bindings,

> let a = rec { x = "abc"; x2 = x + "123"; }; in a

{ x = "abc"; x2 = "abc123"; }
then the use of x in the definition of x2 is resolved to "abc". Even if we later update the definition of x, the definition of x2 will not change.
> let a = rec { x = "abc"; x2 = x + "123"; }; in a // { x = "def"; }

{ x = "def"; x2 = "abc123"; }

Generally speaking, static binding is a good thing. I find that languages with static scoping are easy to understand because variable references can be followed in the source code. Static scoping lets Nix be referentially transparent, so, modulo α-renaming, you can always substitute expressions by their (partially) normalized values. In the previous example, we can replace the expression rec { x = "abc"; x2 = x + "123"; } with { x = "abc"; x2 = "abc123"; } and the result of the program does not change.

> let a = { x = "abc"; x2 = "abc123"; }; in a // { x = "def"; }

{ x = "def"; x2 = "abc123"; }

That said, on some occasions it would be nice to have some dynamic binding. Dynamic binding is used in Nixpkgs to allow users to override libraries with alternate versions in such a way that all other software packages that depend on it pick up the replacement version. For example, we might have the following in our nixpkgs declaration

  boost149 = callPackage ../development/libraries/boost/1.49.nix { };
  boost155 = callPackage ../development/libraries/boost/1.55.nix { };
  boost = boost155;
and perhaps we want to make boost149 the default boost version to work around some regression. If we write nixpkgs // { boost = boost149; } then we only update the boost field of the nix package collection and none of the packages depending on boost will change. Instead we need to use the config.packageOverrides to change boost in such a way that all expressions depending on boost are also updated. Our goal is to understand the technique that packageOverrides and other similar overrides employ to achieve this sort of dynamic binding in a statically scoped language such as Nix. This same technique is also used to give semantics to object-oriented languages.

First we need to review normal recursive bindings. The rec operator is can almost be defined as a function in Nix itself by taking a fixed point. Recall that in Nix lambda expressions are written as x: expr.

> let fix = f: let fixpoint = f fixpoint; in fixpoint; in
   let a = fix (self: { x = "abc"; x2 = self.x + "123"; }); in

{ x = "abc"; x2 = "abc123"; }

The function self: { x = "abc"; x2 = self.x + "123"; } is an object written in the open recursive style. By taking the fixpoint of this function, the recursion is closed yielding the desired value. Written this way, we had to prefix the recursive references to x with self.. However using Nix’s with operator, we can bring the values of self into scope allowing us to drop the prefix.

> let fix = f: let fixpoint = f fixpoint; in fixpoint; in
   let a = fix (self: with self; { x = "abc"; x2 = x + "123"; }); in

{ x = "abc"; x2 = "abc123"; }

This is pretty close to a definition of rec. We can almost think of rec { bindings } as syntactic sugar for fix (self: with self; { bindings }).

In order to override the definition of x instead up updating it, we need to intercept the definition of x before the open recursion is closed. To this end, we write a fixWithOverride function that takes a set of overriding bindings and an open recursive object and applies the override bindings before closing the recursion.

> let fix = f: let fixpoint = f fixpoint; in fixpoint;
       withOverride = overrides: f: self: f self // overrides;
       fixWithOverride = overrides: f: fix (withOverride overrides f); in
   let open_a = self: with self; { x = "abc"; x2 = x + "123"; }; in
   fixWithOverride { x = "def"; } open_a   

{ x = "def"; x2 = "def123"; }

Success! We have manage to override the definition of x and get the definition of x2 updated automatically to reflect the new value of x. Let us step through this code to see how it works. First we defined open_a to be the same as our previous definition of a, but written as an open recursive object. The expression fixWithOverride { x = "def"; } open_a reduces to fix (withOverride { x = "def"; } open_a). What the withOverride function does is takes an open recursive object and returns a new open recursive object with updated bindings. In particular, withOverride { x = "def"; } open_a reduces to

self: (with self; { x = "abc"; x2 = x + "123"; }) // { x = "def"; }
which in turn reduces to self: { x = "def"; x2 = self.x + "123"; }. Finally, fix takes the fixpoint of this updated open recursive object to get the closed value { x = "def"; x2 = "def123"; }.

This is great; however, we do not want to have to work with open recursive objects everywhere. Instead, what we can do is build a closed recursive value, but tack on an extra field named _override that provides access to withOverride applied to the open recursive object. This will allow us to perform both updates and overrides at our discretion.

> let fix = f: let fixpoint = f fixpoint; in fixpoint;
       withOverride = overrides: f: self: f self // overrides;
       virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
   let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec
   { example1 = a;
     example2 = a // { x = "def"; };
     example3 = a._override { x = "def"; };
     example4 = example3._override { y = true; };
     example5 = example4._override { x = "ghi"; };

{ example1 = { _override = <LAMBDA>; x = "abc"; x2 = "abc123"; };
  example2 = { _override = <LAMBDA>; x = "def"; x2 = "abc123"; };
  example3 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; };
  example4 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; y = true; };
  example5 = { _override = <LAMBDA>; x = "ghi"; x2 = "ghi123"; y = true; };

One remaining problem with the above definition of virtual is that we cannot override the method x2 and still get access to x.

> let fix = f: let fixpoint = f fixpoint; in fixpoint;
       withOverride = overrides: f: self: f self // overrides;
       virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
   let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in
   a._override { x2 = x + "456"; }

error: undefined variable `x' at `(string):5:23'

Remembering that Nix is statically scoped, we see that the variable x in a._override { x2 = x + "456"; } is a dangling reference and does not refer to anything in lexical scope. To remedy this, we allow the overrides parameter to withOverride to optionally take a open recursive object rather than necessarily a set of bindings.

> let fix = f: let fixpoint = f fixpoint; in fixpoint;
       withOverride = overrides: f: self: f self //
           (if builtins.isFunction overrides then overrides self else overrides);
       virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in
   let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec
   { example6 = a._override (self: with self; { x2 = x + "456"; });
     example7 = example6._override { x = "def"; };

{ example6 = { _override = <LAMBDA>; x = "abc"; x2 = "abc456"; };
  example7 = { _override = <LAMBDA>; x = "def"; x2 = "def456"; };

This illustrates is the basic technique that packageOverrides and other similar overrides use. The packageOverrides code is a bit more complex because there are multiple points of entry into the package override system, but the above is the essential idea behind it. The makeOverridable function from customisation.nix creates an override field similar to my _override field above, but overrides function arguments rather than overriding recursive bindings.

The syntax virtual (self: with self; { bindings }) is a little heavy. One could imagine adding a virtual keyword to Nix, similar to rec, so that virtual { bindings } would denote this expression.

After writing all this I am not certain my title is correct. I called this faking dynamic binding, but I think one could argue that this is actually real dynamic binding.

April 22, 2014 02:29 PM

Douglas M. Auclair (geophf)

'R' is fer Realz, yo!

'R' is for Let's Get Real (the Real Number line)

... because this reality you're living in?

It isn't. It's entirely manufactured for you and by you, and you have no clue that you're living in this unreality that you think isn't.

It's as simple, and as pervasive, as this:

The 'Real Numbers'?

They aren't.

Let's take a step back.

First, as you know, there was counting, and that started, naturally, from the number one.

But even that statement is so obviously flawed and ridiculous on the face of it to a modern-day mathematician. Why are you starting with the number one? Why aren't you starting with the number zero?

This isn't an argument over semantics, by the way, this has real (heh!), fundamental impact, for the number one, in counting, is not the identity. You cannot add one to a number and get that number back, and if you can't do that, you don't have a category (a 'Ring'), and if you don't have that, you have nothing, because your number system has no basis, no foundation, and anything goes because nothing is sure.

But getting to the number zero, admitting that it exists, even though it represents the zero quantity, so technically, it refers to things that don't exist (and therefore a fundamental problem with the number zero in ancient societies) ... well, there was a philosopher who posited that the number zero existed.

He was summarily executed by Plato and his 'platonic' buddies because he had spouted heresy.

If number zero exists, then you had to be able to divide by it, and when you did, you got infinity. And, obviously, infinity cannot be allowed to exist, so no number zero for you.

We went on for a long, long time without the number zero. Even unto today. You study the German rule, then you learn your multiplication tables starting from which number? Not zero: one.

"One times one is one. One times two is two. One times three is three."

Where is the recitation for the "Zero times ..."?

And I mean 'we' as Western society, as shaped by Western philosophy. The Easterners, lead by the Indians, had no problem admitting the number zero, they even had a symbol for it, 0, and then playing in the infinite playground it opened up, and therefore Eastern philosophy thrived, flourished, while Western philosophy, and society, stagnated, ...

... for one thousand years, ...

Just because ... now get this, ... just because people refused to open their eyes to a new way of seeing the world, ...

... through the number zero.


That's what happened when finally West met East, through exchange of ideas through trade (spices) and the Crusades (coffee served with croissants), and philosophers started talking, and the number zero was raised as a possibility.


Mathematics, mathematical ideas, and ideas, themselves, exploded onto the world and into though. Now that there was zero, there was infinity, now that there was infinity, and it was tenable, people now had the freedom to explore spaces that didn't exist anymore. People could go to the New World now, both figuratively and literally.

For growth in Mathematics comes from opening up your mind the possibilities you wouldn't ('couldn't') consider before, and growth in Mathematics leads to opening the mind further.

Take, for example, the expansion of the counting numbers, from not admitting zero to, now, admitting it, yes, but then the fractional numbers. You could count fractionally now.

From zero to infinity there were an infinite number of numbers, but, with fractions, we now found that from zero to one, there were an equal number of infinite number of fractions.

The really neat discovery was that if you put all the fractions in one set, and you put all the counting numbers into another, there was a one-to-one correspondence between the two.

An infinity of counting numbers was the same size as an infinity of counting-by-fraction numbers.


So, infinity was the biggest number, fer realz, then, eh?

No, not really.

Because then came what we call the 'Real Numbers' (which aren't, not by a long shot), and then we found an infinite number of numbers between one-half and one-third.

But the thing with these numbers?

The were rationals (fractional) in there, to be sure, but they were also irrationals.

There were numbers like π and τ and e and other weird and wonderful numbers, and the problem with these numbers was that there was no correspondence between them and the rational numbers. There was no way you could combine rational numbers in any form and point directly to τ, for example. These numbers were transcendent.

What's more: they were more. There were infinitely more transcendent numbers, irrational numbers, than there were rationals.

And not even countably infinite more, they were uncountably more infinite.

There was an infinite that was bigger than infinity, and this we call the Continuum.

Why? Because between zero and one and then between one and two there's this measured, discrete, gap, and this we use for counting. There's a measured, even, step between the counting numbers, and even between the fractional numbers: you can count by them, because between them there is this discreteness.

Between the Reals there's no measurable gap. You can't count by them, and you can't add 'just this much' (every time) to go from τ to π ...

(Heh, actually, you can: π = τ + τ, but then what? What's the 'next' irrational number? There's no such thing as the 'next' irrational number, because no matter what 'next' number you choose, there will always be an uncountably infinite number of numbers between that 'next' number and the number you started from, so your 'next' number isn't the next number at all, and never will be.)

So, wow, the Reals. Lots of them. They cover everything, then, right?

Not even close.

There are numbers that are not numbers.

For example, what is the number of all the functions that yield the number zero?

There are, in fact, an infinite number of those.

How about all the functions that give you ('eventually') π?

... Ooh! There are several different ones to find π, aren't they?

Yes. Yes, there are. In fact, there are an uncountably infinite number of functions that compute π.

Now, wait. You're saying, geophf, that there are uncountably infinite number of functions to find each and every Real number and that the Real numbers are uncountable as well, so that means...

Yeah, the Continuum reigned supreme for just a while as the biggest-big number, but it was soon toppled by this Powerset infinity (my term for it, it's actually called something else).

Now, I don't know the relation between the functions that yield numbers, and the functions that construct functions that do that.

But do you see where we're going with this?

As big as you can stretch yourself, there's new vistas to see in mathematics (and meta-mathematics, let's not neglect that, now, shall we?).

But we still haven't scratched the surface.

Is the world quantized, like the rational numbers? Or is it a continuum like the Reals?

Or is something else, even something more?


Direct current comes to you in a straight, steady line.

The thing about DC ('direct current')? It sucks.

(Just ask Marvel.)

If you want clean, pure, ... powerful, ... well: power, over any kind of distance, you have to ask ... not our friend Tommy (Edison), but our wild-child Tesla.

He proposed to Edison that we should use AC ('alternating current') to provide electricity, and Edison threw him out of his lab, that idiot, telling him never to show his face there again.

Guess how your electricity is delivered to your home today?

The thing about alternating current? It's a wave-form, and not only that, it's a triple wave-form. How do real numbers model electricity? Well, with DC, you've got one number: "That there is 5 volts or 5 amps or 1.21 gigiwatts."

Boy, that last one came out of the blue: like a bolt of lightning!


But if it's alternating current, then you need the sine and cosine functions to describe your power. Functions? Wouldn't it be nice if it were just a number?

Yes, it would be nice, and there is a number to describe wave-form functions, like your alternating current.

They are called 'imaginary numbers,' because, if you look hard enough on the number line, with good enough eyes, eventually you'll see the number π or τ or e or 1, or 2, or 3, or even zero.

But no matter how hard you strain your eyes, you will never see a number with an imaginary component, because why? Because most imaginary numbers, being on the curve of the wave-form are either above or below the number line. They 'aren't' numbers, then, because they're not on the numberline.

They're imaginary.

I mean, come on! The square-root of negative one? Why would anybody do this? Unless they were daft or a bit batty.

The thing is, without imaginary numbers, we wouldn't have the forms to get our heads around alternating current.

Most of the world, except those very lucky few who lived within a mile or two of a power plant, would be in darkness.

And the computer? Pfft! Don't get me started. Hie thee to the nunnery, because we are now back in the Dark Ages.

Or at most in the Age of 'Enlightenment' where you had to run for cover when the landlady bellowed "'ware slops!" ... unless you wanted raw sewage mixed with rotted fruit on your head.

But now, here we are, because we have both Real and imaginary numbers, together giving us the complex number set (which, it turns out, is not bigger than the Reals, as there is a one-to-one correspondence between each real number and each complex number. Fancy that! An infinity 'more' number of complex number above and below the Real number line gives the same number of complex numbers as Reals).

We're good?

Not even close.

Last I checked, I don't live in Flatland, and, last I checked, nor do you.

Complex go 'above' and 'below' the Real number line, but ... what about the third dimension? Is there numbers to model us in three dimension?

What would such numbers look like?

And here's a stunner. If I were on Mars, or the Moon, and you were here, reading this blog post, how would I know where to look to see you?

The Moon, and Mars, too, has their own three-dimensional frames of reference, and the Earth has its own, too (it's called geocentric). So, to draw a line from a satellite (such as the Moon, known as 'Earth's satellite') so that it can look down at a spot on the Earth, you actually have to use a four-dimensional number to connect the two three-dimensional frames of reference so that one can look at the other. This four-dimensional number is called the Quaternion.

It's simple, really, it's just rocket science.

And it's really ... mind-bending, at first, wrapping your head around the math and drawing pictures, or using both your hands, three fingers out indicating both sets of axes, and you use your nose to draw the line connecting the two, and then you scream, 'but how do I measure the angles?'

Not that I've worked on satellite projects or anything. cough-EarthWatch-cough.

But nowadays, you can't get into making a realistic game without having quaternions under your belt. Monster sees you, monster charges you, monster bonks you on the head. Game over, thank you for playing, please insert $1.50 to die ... that is to say, to try again.

The thing is: how does the monster 'see' you? The monster has it's own frame of reference, just as you do. The monster exists in its own three-dimensional coordinate system, just as you do. If you were standing on a little hillock, would you expect the monster not to see you because you're slightly elevated?

Of course not! The monster sees you, the monster bonks you. All of this happens through transformation of disparate coordinate systems via quaternions.

Now that's something to impress people with at cocktail parties.

'Yeah, I spent all day translating coordinate systems using quaternions. Busy day, busy day."

Just don't say that to a mathematician, because he'll (in general, 'he') will pause, scratch his head then ask: "So you were checking out babes checking you out?"

Then you'll have to admit that, no, not that, you were instead avoiding your boss trying to catch your eye so he could hand you a whole stack of TPS reports to work on over the weekend.

Like I ... didn't. Ooh. Ouch! Guess who was working through Easter?

Fun, fun!

Okay, though. Four dimensions. We've got it all, now, right?

Not if you're a bee.

Okay, where did that come from?

Bees see the world differently from you and me.

Please reflect on the syntax of that sentence, writers.

Bees see the world differently (not: 'different') from you and me (not: 'from you and I').

Gosh! Where is the (American) English language going?

(But I digress.)

(As always.)

If you look at how they communicate through their dance, you see an orientation, but you also see some other activity, they 'waggle' (so it's called the 'waggle-dance') and the vigor at which they do their waggle communicates a more interesting cache of nectar. There are other factors, too.

The fact of the matter: three dimensions are not enough for the bee's dance to communicate what it needs to say to the other drones about the location, distance, and quantity of nectar to be found.

So, it has its waggle-dance to communicate this information. Everybody knows this.

Until, one little girl, working on her Ph.D. in mathematics, stopped by her daddy's apiary, and, at his invitation, watched what he was doing.


"Daddy," she said, "those bees are dancing in six dimensions."

Guess who changed the topic of her Ph.D., right then and there?

Combining distance, angle from the sun, quantity of interest, ... all the factors, the bees came up with a dance.

They had only three dimensions to communicate six things.

The thing is, nobody told the bees they had only three dimensions to work with. So they do their dance in six dimension.

If you map what they are doing up to the sixth dimension, it gives a simple (six-dimensional) vector, which conveys all the information in one number.

Bees live in six dimensions, and they live pretty darn well in it, too.

Or, put this way: 80% of the world's food supply would disappear if bees didn't do what they did.

You are living in six dimensions, or, more correctly, you are alive now, thanks to a little six-dimensional dance.

Six dimensions.

Okay, but what if you're Buckaroo Banzai?

Pfft! Eight dimensions? Light weight.

In String Theory, we need at least ten dimensions for super strings, and twenty-six dimensions for some types of strings.

So, 'R' is for real numbers.

The neat thing about numbers, is ... they can get as big as you can think them.

And they're good for a Ph.D. thesis ... or two.

by geophf ( at April 22, 2014 02:16 PM

'F' is for function

'F' is for function.

Okay, you just knew we were going there today, didn't you?

Today, and ... well, every day in the last 2,600 years (I'm not joking) has had this dynamic tension between objects and functions. Which is greater?

Read this tell-all post to find out!

Sorry, I channeled Carl Sagan in Cosmos for a second. 

Not really.

But I digress.

WAY back when, back in the olden times, there were just things, and then when you had a full belly and your one thing, your neighbor, who happened to be named 'Jones,' had two things.

And then keeping up with the Jones' came to be a real problem, because you had one thing, and they had two. So you got one more.

And counting was invented.

For things.

In fact, in some cultures, counting doesn't exist as an abstract thing. You don't count, in general, you have different counting systems for different things. Counting, for example, pens, is a very different thing than counting eggs.

Ha-ha, so quaint! Who would ever do that?

Well, you do. What time is it? You have a different way for counting time than you do for counting distance or for counting groceries that you need to buy, don't you?

But, at base, they are all just things: time, distance, and gallons of milk.

Sorry: litres of milk. How very British Imperial of me!

Question: why are Americans just about the only ones using the British Imperial system any more? That is, the real question is: why are the British not using the British Imperial system? They onto something we're not?

Anybody want a satellite coded wrongly because some engineers intermingled British Imperial units with metrics?

Okay, so, units, things, metrics, whatevs.

And counting.

Counting is doing something to something, whereas, at first, it's important you have that chicken in your stomach (cooked is nice, but optional), suddenly it's become important that you have not one chicken, a hen, but a (very rarely) occasional rooster with all your hens would be nice, too.

(Roosters are big jerks, by the way.)

Counting was all we had, after the chickens and their yummy eggs, that is, and that for a long while.

Question: which came first, the chicken or the egg.

Answer: that's obvious, the egg.

Think about it. You have eggs for breakfast, then you have a chicken-salad sandwich.

Like I said: obvious! First the egg, then the chicken.

Unless you're one of those weirdos that eat chicken sausage with your breakfast, then it's a toss-up, but in that case I have no help for you.

Okay, chickens, eggs (egg came first) (You read that first, right here on this blog) (and you thought there was no point to math!), and then counting, for a long, long time.

Then, ... counting became too much, because you ran out of fingers, then toes, to count your mutton on. It happens. That's a good thing, because then people started to have to think (that's a good thing, too), and they thought, 'how do I count all my mutton when I run out of fingers and toes? I need to know that I have more sheep than the Jones', and I've paired each of my sheep to his, and I have more than twenty-two than he does. How many more do I have!"

Then people started getting smart. They did more than one-to-one, or pairwise, groupings, they started grouping numbers, themselves.

All this is very nice background, geophf, but to what end?

First counting, then grouping, those are operations, and on or using numbers.

Then addition and subtraction, which are abstractions, or generalizations on counting, then multiplication, and let's not even go into division.


Counting, adding, multiplying. These things worked for you chickens (and eggs) (first), but also on your mutton, and when you settled down, it worked on how many feet, yards, the hectares of land you had, too. The same principles applied: numbers worked on things, no matter what the things were, and the operations on numbers worked on ... well, no matter what the numbers counted.

Now. Today.

Or, sometime back in the 40s anyway, way back before the Birkenstock-wearers were mellowing out with lattes, anyway. A couple of dudes (Samuel Eilenberg and Saunders Mac Lane, specifically) said, 'Hey, wait!' they exclaimed, 'it doesn't matter what the thing is, at all!' they said, 'as long as the things are all the same thing, we don't care all all about the thing itself at all!'

And they invented a little something that eventually became known as Category Theory, which is the basis of ... well, a lot of things now: quantum thermodynamics, for one, and computer science, for another.

And they nailed it. They created a one-to-one correspondence between the things in categories and the things in Set theory.

Why is that important? Set theory is this:


Got me?

A set is a thing that contains a thing. The fundamental set is the set that contains nothing (the 'empty set'), which is this:


And other sets are sets that contain sets:

{{}, {{}, {{}, {}}}}

With sets you can model numbers if you'd like:

0: {},
1 {{}} (the set that contains '0'), 
2: {{}, {{}}} (the set that contains '0' and '1')
3: {{}, {{}}, {{}, {{}}}} (the set that contains '0', '1', and '2')


And as Gödel (eventually) showed, once you model numbers, you can model anything.

(We'll get to that ... for 'G' is for Gödel-numbering, I'm thinking)

How? Well, once you have numbers, you can count with them (as just demonstrated), you can add one to them (the 'successor' function), you can take away one from the (the 'predecessor' function), you can add two of them together (successively add one to the first number until the second number is zero. Try it!), subtract, by doing the opposite (or by just removing duplicates from both until one of them is gone ... 'pairwise-subtraction'!), multiply two together (successive addition), divide numbers ... (um, yeah)... and, well, do anything with them.

Once you have Number, you have counting, and from that, you have everything else.

So, our dudes mapped category theory things (morphisms) down to Set mapping functions, and they had it.

Because why?

Well, it's rather cumbersome to carry a whole bunch of sets for your Pert formulation, especially when you want to see the money you (don't) save by paying off your mortgage's principal early.

Banker: oh, geophf, pay off your mortgage principal early, you'll save a ton in the long run.
me: uh, that was true when mortgage interest was 10-15-21%...

(you remember those Jimmy Carter days?)

me, continuing: ... but now that my interest rate is 2.5% I save zip after 30 years for early payment.

Try it. Run the numbers. Bankers haven't. They're just repeating what was good advice, ... twenty-five years ago.

But when you have objects, whose representations do not matter, be they 0, 1, a googol, or {{}, {{}}, {{}, {{}}}}, then you can concentrate on what's really important.

The functions.

Whoopsie! I slipped and gave the answer, didn't I?

Me, and Alan Kay, both.

Alan Kay, inventor of Smalltalk, and inspirer of the movie Tron: The objects aren't important, it's the messages that go between them that is.

So there.

Well, okay, functions win. It used to be the chicken (or the egg), but now we've ... grown a bit, worrying less about what fills our stomach, and where we're going to get our next meal (and not be the next meal), you know: survival.

We've transcended our stomachs.


Well, okay, functions rule, so back off.

The dudes didn't stop there, because there's been some really neat things going on with functions, since discovering functions are things in and of themselves, because if functions are things, in and of themselves, then, well, ...

Well, you can number them (tomorrow) (promise), you can count them, you can add them, you can multiply them, you can exponate them, you can ...

You can, in fact, operate on them, applying functions to functions.

So, our dudes did just that. They said their functions, their morphisms have two rules to be a function.

You can ask what it is, it's identity:

id f = f

And you can string them together, composing them:

g . f = ... well, g . f

So, if you have 

> f x = succ x


> g x = x * 2

then (g . f) gives you some function, we can call it, h, here if you'd like, and h is the composition of applying f to your number first, then g, or

> h x = 2 (x + 1)

Try it!

Pull up your functional programming language listener (I use Haskell for now) (until I find or invent something better)

and enter the above.

Now, it's an interesting proposition to show that

(g . f) == h

But, for now, in Haskell, we cannot do that.

Now, in some other programming languages (I'm thinking of my experience with Twelf), you can, but first you have to provide numbers, on you own, like: 0, 1, 2, ...

... and prove they are what they are, you know: they follow in order, and stuff like that ...

Then you provide the functions for identity and composition, and then you can prove the above statement, because now you have theorems and a theorem prover.


No, ... no 'yay'!

I mean, 'yay,' if that's your thing, proving theorems (I happen to have fun doing that at times), but 'ick,' otherwise, because it's a lot of work to get to things like just plain old addition.

See, theorem-prover-folk are these hard-core fighters, right on the edge of discovering new mathematics and science. Not plain-old-little me, I just use the stuff (to build new theories from preexisting, proved, ones), so I'm a lover of the stuff.

To quote an eminent mathematical philosopher: I'm a lover, not a fighter.

And, no, Billie-Jean is not my lover. She's just a girl that claims that I am the one.


So, but anyway, using the above two functions, identity and composition, you're now able to reason about functions generally, making new functions by stringing together (composing) other, preexisting functions.

This is so important that Eilenberg and Mac Lane gave this process it's own name 'natural transformation.'

But okay, Categories have units and you don't particularly care what those units are; we saw one set of units, numbers, themselves, as an example,

But units can be anything.

Even functions, so you have a category of functions (several categories, in fact, as there are different kinds of functions, just so you know) ... You can even have categories of categories of things (things being some arbitary unitary types, like numbers, or functions, or categories, again ... it can get as wild and woolly as you want it to get).

And types. Categories can be used to model types, so you can have a category that represents the natural numbers ... as types:

Nat : Category

zero : 1 -> Nat
succ : Nat -> Nat

so zero = zero ()
one = succ zero ()
two = succ one
three = succ two
etc ...

And those formula, those functions works, regardless of the underlying type of things like 'number' or even 'function.'

Category theory was originally called 'Abstract nonsense,' by the way, because Eilenberg and Mac Lane saw it as so abstract that they were actually talking about nothing.

Sort of like how Set Theory does, starting from nothing, the empty set, and building everything from that object, but even more abstractly than that, because category theory doesn't even have the 'nothing'-object to start from, it just starts identifying and composing functions against objects about which it doesn't care what they are.

Now, there was, at one point, a working definition of abstraction, and that was: the more useful a theorem is, the less abstract it is.

The contrapositive implied: the more abstract a formula is, the less useful.

But I find I play in these domains very happily, and in playing, I'm able to invent stuff, useful stuff that's able to do things, in software, that other software engineers struggle with and even give up, stating: "This can't be done."

I love it when I'm told I "can't" do something.

You know what I hear when I hear the word "can't"?

I hear opportunity. 

Everyone else may have given up on this particular task, because it 'can't' be done. But me? I go right in with my little abstract nonsense, reinventing the language that was unable to frame the problem before with a language that can, and I find, hey, I not only can do it, but I just did.

Language frames our thoughts, telling us what we can and cannot do. English is a very complex language, taking in a lot of the world and describing it just so.

The language of mathematics is very, very simple, and describes a very tiny world, a world that is so tiny, in fact, doesn't exist in reality. It's a quantum-sized world: mathematics and has the possibility to be even smaller than that.

But so what? I get to invent a world that doesn't exist, each time I apply a theorem from mathematics that others are unfamiliar with.

And when I do so, I get to create a world where the impossible is possible.

It's very liberating, being able to do things others 'can't.' And when I pull it off, not only do I win, but my customers win, and maybe even those developers who 'can't' win, too, if they're willing to open their eyes to see possibilities they didn't see before.

'F' is for functions, a totally different way of seeing the world, not as a world of objects, but as a world of possibilities to explore.

by geophf ( at April 22, 2014 10:43 AM

Christopher Done

Typeable and Data in Haskell

Data.Typeable and Data.Data are rather mysterious. Starting out as a Haskell newbie you see them once in a while and wonder what use they are. Their Haddock pages are pretty opaque and scary in places. Here’s a quick rundown I thought I’d write to get people up to speed nice and quick so that they can start using it.1

It’s really rather beautiful as a way to do generic programming in Haskell. The general approach is that you don’t know what data types are being given to you, but you want to work upon them almost as if you did. The technique is simple when broken down.


First, there is a class exported by each module. The class Typeable and the class Data. Your data types have to be instances of these if you want to use the generic programming methods on them.

Happily, we don’t have to write these instances ourselves (and in GHC 7.8 it is actually not possible to do so): GHC provides the extension DeriveDataTypeable, which you can enable by adding {-# LANGUAGE DeriveDataTypeable #-} to the top of your file, or providing -XDeriveDataTypeable to ghc.

Now you can derive instances of both:

data X = X
  deriving (Data,Typeable)

Now we can start doing generic operations upon X.

The Typeable class

As a simple starter, we can trivially print the type of any instance of Typeable. What are some existing instances of Typeable? Let’s ask GHCi:

λ> :i Typeable
class Typeable a where typeOf :: a -> TypeRep
instance [overlap ok] (Typeable1 s, Typeable a) => Typeable (s a)
instance [overlap ok] Typeable TypeRep
instance [overlap ok] Typeable TyCon
instance [overlap ok] Typeable Ordering
instance [overlap ok] Typeable Integer
instance [overlap ok] Typeable Int
instance [overlap ok] Typeable Float
instance [overlap ok] Typeable Double
instance [overlap ok] Typeable Char
instance [overlap ok] Typeable Bool
instance [overlap ok] Typeable ()

That’s the basic Prelude types and the Typeable library’s own types.

There’s only one method in the Typeable class:

typeOf :: a -> TypeRep

The TypeRep value has some useful normal instances:

λ> :i TypeRep
instance [overlap ok] Eq TypeRep
instance [overlap ok] Ord TypeRep
instance [overlap ok] Show TypeRep
instance [overlap ok] Typeable TypeRep

Use-case 1: Print the type of something

So we can use this function on a Char value, for example, and GHCi can print it:

λ> :t typeOf 'a'
typeOf 'a' :: TypeRep
λ> typeOf 'a'

This is mostly useful for debugging, but can also be useful when writing generic encoders or any tool which needs an identifier to be associated with some generic value.

Use-case 2: Compare the types of two things

We can also compare two type representations:

λ> typeOf 'a' == typeOf 'b'
λ> typeOf 'a' == typeOf ()

Any code which needs to allow any old type to be passed into it, but which has some interest in sometimes enforcing or triggering on a specific type can use this to compare them.

Use-case 3: Reifying from generic to concrete

A common thing to need to do is when given a generic value, is to sometimes, if the type is right, actually work with the value as the concrete type, not a polymorphic type. For example, a printing function:

char :: Typeable a => a -> String

The specification for this function is: if given an Char, return its string representation, otherwise, return "unknown". To do this, we need a function that will convert from a polymorphic value to a concrete one:

cast :: (Typeable a, Typeable b) => a -> Maybe b

This function from Data.Typeable will do just that. Now we can implement char:

λ> let char x = case cast x of
                  Just (x :: Char) -> show x
                  Nothing -> "unknown"
λ> char 'a'
λ> char 5
λ> char ()

The Data class

That’s more or less where the interesting practical applications of the Typeable class ends. But it becomes more interesting once you have that, the Data class can take advantage of it. The Data class is much more interesting. The point is to be able to look into a data type’s constructors, its fields and traverse across or fold over them. Let’s take a look at the class.

Again, there are some basic instances provided:

instance Data a => Data [a]
instance Data Ordering
instance Data a => Data (Maybe a)
instance Data Integer
instance Data Int
instance Data Float
instance (Data a, Data b) => Data (Either a b)
instance Data Double
instance Data Char
instance Data Bool

It’s a rather big class, so I’ll just cover some methods that demonstrate the key use-cases.

Use-case 1: Get the data type

Similar to the TypeRep, you can use dataTypeOf to get a unique representation of a data type:

dataTypeOf :: Data a => a -> DataType

For example:

λ> dataTypeOf (Just 'a')
DataType {tycon = "Prelude.Maybe", datarep = AlgRep [Nothing,Just]}

There aren’t any other interesting instances for this type, but we’ll look at uses for this type later. Representations (so-called FooRep) tend to be references from which you can reify into more concrete values.

Use-case 2: Inspecting a data type

The most common thing to want to do is to get a list of constructors that a type contains. So, the Maybe type contains two.

λ> :t dataTypeConstrs
dataTypeConstrs :: DataType -> [Constr]
λ> dataTypeConstrs (dataTypeOf (Nothing :: Maybe ()))

We’ll look at what we can do with constructors later.

It’s also surprisingly common to want to see what the constructor is at a particular index. We could write this function ourself, but there’s already one provided:

λ> indexConstr (dataTypeOf (Nothing :: Maybe ())) 2

Sometimes you want to know whether a data type is algebraic (in other words, does it have constructors and is it not one of the built-in types like Int/Float/etc)?

λ> isAlgType (dataTypeOf (Just 'a'))
λ> isAlgType (dataTypeOf 'a')

Use-case 3: Get the constructor of a value

We have the method

toConstr :: a -> Constr

Which given any instance of Data will yield a constructor.

λ> :i Constr
data Constr
instance Eq Constr
instance Show Constr

You can’t do much with a constructor as-is, but compare and print it:

λ> toConstr (Just 'a')
λ> toConstr (Just 'a') == toConstr (Nothing :: Maybe Char)

However, those operations by themselves can be useful.

By the way, we can also get back the DataRep of a constructor:

λ> constrType (toConstr (Just 'a'))
DataType {tycon = "Prelude.Maybe", datarep = AlgRep [Nothing,Just]}

Use-case 4: Get fields of a constructor

Another typical thing to want to do is to use the field names of a constructor. So for example:

λ> data X = X { foo :: Int, bar :: Char } deriving (Typeable,Data)
λ> toConstr (X 0 'a')
λ> constrFields (toConstr (X 0 'a'))

It’s a good use-case for serializing and debugging.

Use-case 5: Make a real value from its constructor

It’s actually possible to produce a value from its constructor. We have this function

fromConstr :: Data a => Constr -> a


λ> fromConstr (toConstr (Nothing :: Maybe ())) :: Maybe ()

But what do you do when the constructor has fields? No sweat. We have this function:

fromConstrB :: forall a. Data a
            => (forall d. Data d => d) -> Constr -> a

Haskell beginners: Don’t fear the rank-N type. What it’s saying is merely that the fromConstrB function determines what the type of d will be by itself, by looking at Constr. It’s not provided externally by the caller, as it would be if the forall d. were at the same level as the a. Think of it like scope. let a = d in let d = … doesn’t make sense: the d is in a lower scope. That means we can’t just write:

fromConstrB (5 :: Int) (toConstr (Just 1 :: Maybe Int)) :: Maybe Int

The Int cannot unify with the d because the quantification is one level lower. It basically doesn’t exist outside of the (forall d. Data d => d) (nor can it escape). That’s okay, though. There is a type-class constraint which lets us be generic. We already have a function producing a value of that type:

λ> :t fromConstr (toConstr (1 :: Int))
fromConstr (toConstr (1 :: Int)) :: Data a => a

So we can just use that:

λ> fromConstrB (fromConstr (toConstr (1 :: Int)))
               (toConstr (Just 1 :: Maybe Int)) :: Maybe Int
Just 5

Tada! But wait… What if there’re more fields? How do we provide more than one, and of different types?

Enter fromConstrM:

fromConstrM :: forall m a. (Monad m, Data a)
            => (forall d. Data d => m d) -> Constr -> m a

Because it’s monadic we can use a state monad to keep an index! Observe:

λ> :t execState
execState :: State s a -> s -> s
λ> :t execState (modify (+1))
execState (modify (+1)) :: Num s => s -> s
λ> :t execState (forM_ [1..5] (const (modify (+1))))
execState (forM_ [1..5] (const (modify (+1)))) :: Num s => s-> s
λ> execState (forM_ [1..5] (const (modify (+1)))) 5

Let’s put this to use with fromConstrM:

λ> evalState
       (do i <- get
           modify (+1)
             (case i of
               0 -> fromConstr (toConstr (5::Int))
               1 -> fromConstr (toConstr 'b')))
       (toConstr (Foo 4 'a')))
     0 :: Foo
Foo 5 'b'

In other words, keep an index starting at 0. Increase it each iteration that fromConstrM does. When we’re at index 0, return an Int, when we’re at index 1, return a Char. Easy! Right?

Use-case 6: mapping over data structures generically

A common thing to want is to map over a value in a structure-preserving way, but changing its values. For that we have gmapT:

gmapT :: forall a. Data a
      => (forall b. Data b => b -> b) -> a -> a

Similar to fromConstr*, there is a rank-n type b that refers to each type in the constructor of type a. It’s easy enough to use:

λ> gmapT
     (\d ->
        case cast d of
          Nothing -> d
          Just x ->
            fromJust (cast (if isUpper x then '!' else x)))
     (Foo 4 'a')
Foo 4 'a'
λ> gmapT
     (\d ->
        case cast d of
          Nothing -> d
          Just x ->
            fromJust (cast (if isUpper x then '!' else x)))
     (Foo 4 'A')
Foo 4 '!'

Here I’m doing a little check on any field in the constructor of type Char and if it’s upper case, replacing it with !, otherwise leaving it as-is. The first trick is to use the cast function we used earlier to reify the generic d into something real (Char). The second trick is to cast our concrete Char back into a generic d type.

Just like fromConstrM earlier, if you want to operate on exact indices of the constructor rather than going by type, you can use gmapM and use a state monad to do the same thing as we did before.

Use-case 7: generating from data structures generically

Another slightly different use-case is to walk over the values of a data structure, collecting the result. You can do this with gmapM and a state monad or a writer, but there’s a handy function already to do this:

gmapQ :: forall a. Data a => (forall d. Data d => d -> u) -> a -> [u]

Trivial example:

λ> gmapQ (\d -> toConstr d) (Foo 5 'a')

A more useful example can be found in structured-haskell-mode which walks over the Haskell syntax tree and collects source spans into a flat list.

Printer example

Here’s a trivial (not very good, but something I wrote once) generic printer:

gshows :: Data a => a -> ShowS
gshows = render `extQ` (shows :: String -> ShowS) where
  render t
    | isTuple = showChar '('
              . drop 1
              . commaSlots
              . showChar ')'
    | isNull = showString "[]"
    | isList = showChar '['
             . drop 1
             . listSlots
             . showChar ']'
    | otherwise = showChar '('
                . constructor
                . slots
                . showChar ')'

    where constructor = showString . showConstr . toConstr $ t
          slots = foldr (.) id . gmapQ ((showChar ' ' .) . gshows) $ t
          commaSlots = foldr (.) id . gmapQ ((showChar ',' .) . gshows) $ t
          listSlots = foldr (.) id . init . gmapQ ((showChar ',' .) . gshows) $ t
          isTuple = all (==',') (filter (not . flip elem "()") (constructor ""))
          isNull = null (filter (not . flip elem "[]") (constructor ""))
          isList = constructor "" == "(:)"

I wrote it because the GHC API doesn’t have Show instances for most of its data types, so it’s rather hard to actually inspect any data types that you’re working with in the REPL. It has instances for pretty printing, but pretty printing confuses presentation with data.


λ> data Foo = Foo Char Int deriving (Data,Typeable)
λ> gshow ([Just 2],'c',Foo 'a' 5)
"([(Just (2))],('c'),(Foo ('a') (5)))"

Note: no Show instance for Foo.


We’ve briefly covered how to query types, how to cast them, how to walk over them or generate from them. There’re other things one can do, but those are the main things. The real trick is understanding how to make the types work and that comes with a bit of experience. Fiddle around with the concepts above and you should gain an intution for what is possible with this library. See also: Data.Generics.Aliases.

Hope it helps!

  1. I’ll migrate this to the HaskellWiki when it doesn’t look so, uh, shall we say, unattractive.

April 22, 2014 12:00 AM

April 21, 2014

Robert Harper

Bellman Confirms A Suspicion

As is by now well-known, I regard the supposed opposition between static and dynamic languages as a fallacy: the latter, being a special case of the former, can scarcely be an alternative to it.  I cannot tell you how many times I’ve been handed arguments along the lines of “oh, static languages are just fine, but I want something more dynamic,” the speaker not quite realizing the absurdity of what they are saying.  Yet somehow this sort of argument has an appeal, and I’ve often wondered why.  I think it’s mostly just semantic innocence, but I’ve long had the suspicion that part of it is that it sounds good to be dynamic (active, outgoing, nimble) rather than static (passive, boring, staid).  As we all know, much of the popularity of programming languages comes down to such superficialities and misunderstandings, so what else is new?

Well, nothing, really, except that I recently learned (from Guy Blelloch) the origin of the notably inapt term dynamic programming for a highly useful method of memoization invented by Richard Bellman that is consonant with my suspicion.  Bellman, it turns out, had much the same thought as mine about the appeal of the word “dynamic”, and used it consciously to further his own ends:

“I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes.

“An interesting question is, ‘Where did the name, dynamic programming, come from?’ The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word, research. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term, research, in his presence. You can imagine how he felt, then, about the term, mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various rea- sons. I decided therefore to use the word, ‘programming.’ I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying—I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities” (p. 159).

Hilarious, or what?  It explains a lot, I must say, and confirms a long-standing suspicion of mine about the persistent belief in a non-existent opposition.

Filed under: Research, Teaching Tagged: dynamic and static typing

by Robert Harper at April 21, 2014 05:43 PM

April 20, 2014

Roman Cheplyaka

Setting up Samsung Wireless Printer on Linux

Here’s a complete guide for setting up a wireless Samsung printer on Linux, where by “setting up” I mean making it connect to your wireless network.

It worked for me with Samsung ML-2165W on Debian GNU/Linux «jessie», but should work for other models and distributions, too.

Connecting Samsung printer to a wireless network

  1. Create a new, temporary user. We’ll use it to launch Samsung’s wireless setup utility. This is optional, but it provides an additional layer of security (who knows what those utilities from Samsung do behind the scenes) and ensures that nothing will mess with your system.

    We add the new user to the lp group, so that it can talk to the printer.

    user$ sudo useradd --create-home --shell /bin/bash --groups lp samsung
  2. Allow the new user to use our display. (Samsung’s utility is graphical.)

    user$ xhost +local:samsung
  3. Now, time to switch to our temporary user.

    user$ sudo su - samsung
  4. Download Samsung’s PSU (“Printer Settings Utility”) archive from their website. Unpack it and go to the wirelesssetup directory.

    samsung$ wget
    samsung$ tar xzf PSU_1.01.tar.gz
    samsung$ cd cdroot/Linux/wirelesssetup
  5. Check if there are any missing dynamic libraries:

    samsung$ ldd bin/wirelesssetup  | grep 'not found'

    (Note: this is for a 32-bit system. On a 64-bit system, replace bin with bin64.)

    In my case, the output was => not found

    This particular library is included in the PSU archive, so we load it by

    samsung$ export LD_PRELOAD=$PWD/../psu/share/lib/

    (Likewise, replace lib with lib64 on a 64-bit system.)

    If there are more missing libraries, first see if your distribution ships them. The major versions must match! E.g. Debian jessie ships, which has the major version number 30, so that won’t do.

    If your distribution doesn’t have the right version, use a resource like to find a package that has one. Unpack it (do not install!) and set LD_PRELOAD and/or LD_LIBRARY_PATH so that they are found.

  6. Now connect the printer via a USB cable to the Linux machine and run

    samsung$ bin/wirelesssetup /dev/usb/lp0

    A graphical window should appear, where you’ll be able to choose your wireless network and enter the password to it.

  7. After you made the printer connect to the wireless network, you can logout and remove the temporary user. Note that the command below will remove that user’s home directory.

    user$ sudo userdel --remove samsung


Please do not ask me about the problems you may have with your printer. Either try to solve them yourself, or use the usual venues (forums, mailing lists, Samsung support etc.) to ask for help.

However, if you solved your problems, and they were related to the instructions above, please do contact me so that I can fix/update the instructions.

If this article helped you and you want to say thanks, that’s fine, too :-)

April 20, 2014 09:00 PM

Edward Kmett

CUFP 2014 Call For Presentations

Workshop for
Commercial Users of Functional Programming 2014
Sponsored by SIGPLAN
[CUFP 2014](
Co-located with ICFP 2014
Gothenburg, Sweden
Sep 4-6
Talk Proposal Submission Deadline: 27 June 2014

CUFP 2014 Presentation Submission Form

The annual CUFP workshop is a place where people can see how others are using functional programming to solve real world problems; where practitioners meet and collaborate; where language designers and users can share ideas about the future of their favorite language; and where one can learn practical techniques and approaches for putting functional programming to work.

Giving a CUFP Talk

If you have experience using functional languages in a practical setting, we invite you to submit a proposal to give a talk at the workshop. We're looking for two kinds of talks:

Experience reports are typically 25 minutes long, and aim to inform participants about how functional programming plays out in real-world applications, focusing especially on lessons learned and insights gained. Experience reports don't need to be highly technical; reflections on the commercial, management, or software engineering aspects are, if anything, more important.

Technical talks are also 25 minutes long, and should focus on teaching the audience something about a particular technique or methodology, from the point of view of someone who has seen it play out in practice. These talks could cover anything from techniques for building functional concurrent applications, to managing dynamic reconfigurations, to design recipes for using types effectively in large-scale applications. While these talks will often be based on a particular language, they should be accessible to a broad range of programmers.

We strongly encourage submissions from people in communities that are underrepresented in functional programming, including but not limited to women; people of color; people in gender, sexual and romantic minorities; people with disabilities; people residing in Asia, Africa, or Latin America; and people who have never presented at a conference before. We recognize that inclusion is an important part of our mission to promote functional programming. So that CUFP can be a safe environment in which participants openly exchange ideas, we abide by the SIGPLAN Conference Anti-Harassment Policy.

If you are interested in offering a talk, or nominating someone to do
so, please submit your presentation before 27 June 2014 via the

CUFP 2014 Presentation Submission Form

You do not need to submit a paper, just a short proposal for your talk! There will be a short scribe's report of the presentations and discussions but not of the details of individual talks, as the meeting is intended to be more a discussion forum than a technical interchange.

Nevertheless, presentations will be video taped and presenters will be expected to sign an ACM copyright release form.

Note that we will need all presenters to register for the CUFP workshop and travel to Gothenburg at their own expense.

Program Committee

More information

For more information on CUFP, including videos of presentations from
previous years, take a look at the CUFP website at Note that presenters, like other attendees, will need to register for the event. Presentations will be video taped and presenters will be expected to sign an ACM copyright release form. Acceptance and rejection letters will be sent out by July 16th.

Guidance on giving a great CUFP talk

Focus on the interesting bits: Think about what will distinguish your talk, and what will engage the audience, and focus there. There are a number of places to look for those interesting bits.

  • Setting: FP is pretty well established in some areas, including formal verification, financial processing and server-sid web-services. An unusual setting can be a source of interest. If you're deploying FP-based mobile UIs or building servers on oil rigs, then the challenges of that scenario are worth focusing on. Did FP help or hinder in adapting to the setting?
  • Technology: The CUFP audience is hungry to learn about how FP techniques work in practice. What design patterns have you applied, and to what areas? Did you use functional reactive programming for user interfaces, or DSLs for playing chess, or fault-tolerant actors for large scale geological data processing? Teach us something about the techniques you used, and why we should consider using them ourselves.
  • Getting things done: How did you deal with large software development in the absence of a myriad of pre-existing support that are often expected in larger commercial environments (IDEs, coverage tools, debuggers, profilers) and without larger, proven bodies of libraries? Did you hit any brick walls that required support from the community?
  • Don't just be a cheerleader: It's easy to write a rah-rah talk about how well FP worked for you, but CUFP is more interesting when the talks also spend time on what _doesn't_ work. Even when the results were all great, you should spend more time on the challenges along the way than on the parts that went smoothly.

by Edward Kmett at April 20, 2014 07:13 PM

Johan Tibell

Announcing cabal 1.20

On behalf of all cabal contributors, I'm proud to announce cabal 1.20. This is quite a big release, with 404 commits since 1.18. To install:

cabal update
cabal install Cabal- cabal-install-

New features

Since there are 404 commits since cabal 1.18, there are too many changes to give all of them a just treatment here. I've cherry-picked some that I thought you would find interesting.

To see the full list of commits, run:

git log Cabal-v1.18.1.3..Cabal-v1.20.0.0

in the cabal repo.

Dependency freezing

If you're building an application that you're delivering to customers, either as binary or as something that runs on your servers, you want to make sure unwanted changes don't sneak in between releases.

For example, say you've found a bug in the just released version and you want to release, which contains a fix. If you build the binary on a different machine than you built the release on, you risk building it against a slightly different set of dependencies, which means that your introducing untested code into your application, possible causing new bugs.

cabal freeze lets developers of applications freeze the set of dependencies used to make builds reproducible. cabal freeze computes a valid set of package versions, just like cabal install does, and stores the result in cabal.config. You can then check cabal.config into your source control system to make sure that all developers that work on the application build against the exact same set of dependencies.

Here's the contents of the cabal.config file created by running cabal freeze in the cabal-install repo:

constraints: ...
             HTTP ==4000.2.8,
             array ==,
             base ==,
             bytestring ==,

If you later want to update the set of dependencies either:

  • manually edit cabal.config or
  • delete (parts of) cabal.config and re-run cabal freeze.

Note that you should only use cabal freeze on applications you develop in-house, not on packages you intend to upload to Hackage.

Parallel cabal build

Starting with 7.8, GHC now accepts a -j flag to allow using multiple CPU cores to build a single package. This build performance enhancement is now exposed as a -j flag on cabal build (and also cabal test/bench/run). Build time improvements are modest, but free.

Flag to temporary ignore upper bounds

When new major versions of a package P is released, it usually takes a little while for packages that depend on P to update their version bounds to allow for the new version. This can be frustrating if you just want to test if your own package, which depends on both some of these other packages and on P, builds with the new P.

The --allow-newer=P flag tells the dependency solver to ignore all upper version bounds on P, allowing you to try to compile all packages against this newer version.

Unnecessary re-linking avoidance

Before 1.20, cabal would sometimes re-link build targets that hadn't changed. For example, if you had several test suites that tested different parts of your library, every test suite would be re-linked when you ran cabal test, even if no source file that the test suite depends on was changed. The same thing would happen for executables and benchmarks.

Now cabal doesn't re-link executables (of any kind) unless something changed.

Streaming cabal test output

cabal test can now stream its output to stdout, making it easier to see progress for long-running tests. Streaming output is enabled by passing --show-details=streaming to cabal test and is off by default (for now.)

New cabal exec command

Cabal sandboxes have almost completely replaced previous sandbox implementations. There was one use case that wasn't completely captured by the integrated sandbox implementation, namely starting new shells where the environment was set up to automatically use the sandbox for all GHC commands.

cabal exec allows you to launch any binary in an environment where the sandbox package DB is used by default. In particular you can launch a new shell using cabal exec [ba]sh.

Haddock configuration options

Haddock options can now be set in ~/.cabal/config. Here are the options and their default values:

     -- keep-temp-files: False
     -- hoogle: False
     -- html: False
     -- html-location:
     -- executables: False
     -- tests: False
     -- benchmarks: False
     -- all:
     -- internal: False
     -- css:
     -- hyperlink-source: False
     -- hscolour-css:
     -- contents-location:

How to use cabal

While not strictly related to this release, I thought I'd share how we expect users to use cabal. Using cabal this way makes sure that the features work well for you, now and in the future.

The main message is this: to build a package, use build, not install.

Building packages using cabal install comes from a time when

  • installing dependencies was more difficult,
  • depending on non-published packages was difficult (i.e. no sandbox add-source), and
  • using the other commands required manual configure-ing.

My intention is to remove the need for install from the development workflow altogether. Today the recommended way to build a package is to run this once:

cabal sandbox init
cabal install --only-dep  # optionally: --enable-tests

The development workflow is then just

cabal build/test/bench/run

No configuring (or manual rebuilding) needed. build implies configure and test/bench/run imply build.

Soon build will also imply the above dependency installation, when running in a sandbox.


Here are the contributors for this release, ordered by number of commits:

  • Mikhail Glushenkov
  • Johan Tibell
  • Duncan Coutts
  • Thomas Tuegel
  • Ian D. Bollinger
  • Ben Armston
  • Niklas Hambüchen
  • Daniel Trstenjak
  • Tuncer Ayaz
  • Herbert Valerio Riedel
  • Tillmann Rendel
  • Liyang HU
  • Dominic Steinitz
  • Brent Yorgey
  • Ricky Elrod
  • Geoff Nixon
  • Florian Hartwig
  • Bob Ippolito
  • Björn Peemöller
  • Wojciech Danilo
  • Sergei Trofimovich
  • Ryan Trinkle
  • Ryan Newton
  • Roman Cheplyaka
  • Peter Selinger
  • Niklas Haas
  • Nikita Karetnikov
  • Nick Smallbone
  • Mike Craig
  • Markus Pfeiffer
  • Luke Iannini
  • Luite Stegeman
  • John Lato
  • Jens Petersen
  • Jason Dagit
  • Gabor Pali
  • Daniel Velkov
  • Ben Millwood

Apologies if someone was left out. Once in a while we have to make commits on behalf of an author. Those authors are not captured above.

by Johan Tibell ( at April 20, 2014 11:36 AM

Douglas M. Auclair (geophf)

'M' is for burrito. No, really! (Monads, a burrito-escque introduction)

'M' is for Burrito. No, really.
(A little post about monads)

This one's easy.

A monad is a burrito: a nice, soft-shell wrapper hiding all that yummy, but complicated stuff inside (green peppers and onions? who was the first one to think of that?)

A monad is a burrito, wrapping all that complicated stuff like an integer:

Just 3

under its soft-shell wrapper:


I just showed you two monads: the first one is the 'Maybe' monad, representing semideterminism (we're certain, in this case, that the underlying unit is 'Just' 3, but we could, in a different situation, be equally certain that the underlying unit was 'Nothing' Those two cases (two = semi) give us our certainty (determinism) one way or the other), the second is the 'List' monad, representing nondeterminism (you could have the empty list [] (no answers, no = non), or, in the above case, a list with a set of different answers (one or several)).

So, monad = burrito. Simple, right?

Class dismissed.

"Excuse me, Professor geophf," the wormy student in the back of the class wearing b-c glasses and who always has to ask annoying questions in his whiny voice at the very end of class, making everybody late for their next class and me late for my latte, and you don't want to make me late for my latte! He continued, oblivious: "What about monads that don't contain anything at all? What is that? An empty burrito?"

He giggled at his own joke, thinking himself rather smart and drôle.

"What?" I roared, "An empty burrito? Such sacrilege cannot be! Besides, what's the point of an empty monad? That's just ridiculous!"

I was rather peeved that this young upstart would trespass on the beauty of my burrito-analogue. It was simple and easy to understand. Why mess with it?

"But," he continued in his whiny voice, "what about the empty list and 'Nothing' in the Maybe monad. They carry no value but are essential to each of those monads, how do you explain these monadic values as burritos? Are they churros?"

I sputtered in fury. He had me there.

So I did what every <strikeout>tyrant</strikeout>teacher does, I relied on my position of authority.

"No, monads are burritos! Get that through your head!" And then: "Class dismissed!"

Everybody fled. I do cut quite the imposing figure when I have that angry glint in my eye and my flinty face set in stone.

Okay, so monads aren't burritos. I admit it. (But not to Mr. Wormwood. Never, never!) There is nothing that says a monad has to carry a value, it's just a convenience to think of monads like that: a wrapper around an 'ordinary' object. But monads are objects, themselves, and, in fact, more 'ordinary,' that is: regular, than plain-old objects like integers.

The problem of plain-old objects, the problem that monads address, each in their own way (and there are many species of monads, not just maybes and lists, but eithers, and states and ... stuff! Lots of wild and weird different stuff!), is that plain-old objects admit the uninhabited instance, , in their type. So the type of natural numbers is 0, 1, 2, 3, ... but it's also . For programming languages that realize this (haskell, for example), they deal with this gracefully. For programming languages that realize this, but then do nothing about it, there is a systemic problem of the program going into an unstable state when encountering an uninhabited instance.

Something that causes Algol-child language programmers to shake in their booties: the 'null-pointer exception.'

Question: how can Java throw a 'NullPointerException,' when it doesn't have the pointer construct in its language, at all? Am I the first to call this one out?

So, the function in these languages that have but do not handle it properly must always admit that you're not working with functions at all, for:

x + y

is x plus y, but if either in uninhabited, the answer is KABOOM!

A program crash. You just destroyed your entire system. And you didn't even do it: the language designers did.

Monads were an answer to this problem. A little dude with a long, bushy beard by the name of Kleisli developed the work around what came to be known as the Kleisli categories, that had monads as their objects and (what came to be known as) Kleisli arrows as morphism, or functions.

The guarantee of the Kleisli category was that, indeed, every object as a monad, so there's no uninhabited type, this means Kleisli functions always get you back into a Kleisli category (yes: once you enter the monadic domain, you never leave it. That's the guarantee of a Kleisli category, you can only be a monad (or a monadic function or monad category ...) to get in, and no monads exist outside the Kleisli categories. The monadic domain is the Hotel California of categories).

Okay, you're stuck. And the guarantee of the monad is that you don't care what's under the soft-shell, there's no 'getting a value out' of a monad, because, as my annoying student, Mr. Wormwood, pointed out, there are monads with no underlying values. So what do you do with them?

Monads have a special function associated with them, called the 'join' operator.

Monads, you see, are a special type in that a join of a monad of a monad is just a monad, so, for example,

join (Just (Just 3))

gives you 

Just 3

What does that do for us?

Well, the structure of a monadic function is this:

Monad m ↠ f m :: a → b m

That is, the function takes an ordinary type a and 'lifts' it into the monadic category giving the (monadic) answer of type b (correctly m b, as m is the monadic type and b is the type carried by the monad).

Okay, but monadic functions can't dig out the underlying type in a monad of its category, so how do even the functions work at all?

Just as you can lift an unit type into a monad:

return 3 = Just 3

You can also lift an ordinary function into the monadic domain:

liftM succ = m Int → m Int (for some monadic type m)

Okay, so, but where does that even get us? We're still stuck, right?

Well, the thing is, you can even lift monadic functions into the monadic-monadic domain:

liftM f (of type Monad m ↠ a → m b) = f' :: Monad m ↠ m a → m (m b)

What does that get us?

Well, combining join with a lifted monadic function gets us a monad from monadic input:

join (liftM f (Just 3)) = some monadic answer dependent on the resultant type of f.

This whole lifting-into-the-monadic-domain-and-then-joining-the-result is so common when working in monad categories that a special operator was invented, à la Haskell, named '>>=' (pronounced: 'bind').

Now, with bind the true power of monads come out:

m >>= f >>= g >>= h >>= ... etc.

What you see here a monad being bound, in turn, to the monadic functions f, g, h, etc.

So, okay, you've got a short-hand for monadically binding functions together. Great. Good on you, and have a latte.

The thing here is this.

In 'plain-old' programming, you have this ever-present fear that a NullPointerException is going to ruin your life, so you can't write:


getting an A from obj, then getting a B from that A, then getting a C from that B, then you do something with that C.

I mean, you can write that code, but if any of those objects are null: obj, A, B, C, your whole statement explodes?

No, worst than that: your entire system blows up, and you have to deal with irate customers wondering why they got a 505-Service down error when they submit their credit card information but forgot to enter their zip code, or some such.

So you have to wrap that statement in a try-block, and try it and see if it works, and if it doesn't you have this whole envelope, this whole burrito shell of code you have to write to handle the uninhabited case.

Now let's look at the monadic function binding again:

m >>= f >>= g >>= h >>= ... etc.

What happens if m is null, or the result of f m is ...

No. Wait. Monads don't have nulls. Monads are just monads, so the above binding ...

... okay, wait for it, ...



In fact, in the Kleisli category, it is guaranteed to work.

It. just. works.

For you, not a Java(script) programmer, you're like: "Well, duh, yeah, that's how it's supposed to be! 1 + 1 = 2, not 'error' or whatever."

You can say that, and expect that, in a pure mathematical domain (with monads implicit in the mathematics for you. You're welcome.) But a Java programmer, with any experience, any hard-won experience should be (rightly) awed.

"Wait. You mean, it just works? How is that possible?"

Yeah. How is that possible? That Java code just works?

It doesn't. But, when I translate the Kleisli category down into a Java implementation, and then lift every Java object into that cateogy, so it's not a Plain-old Java Object (POJO) any more, but now it's a monad in the Kleisli category, and operate on it as such from then on.

Well, then, it just works.

Some people look at my monadic harness and say I'm unnecessarily complicating things.

I beg to differ.

"But, geophf, I just want the start date; all this stuff just complicates things!"

But what do you want the start date ... for? If you just want the start date, then


will get you there, and you're done.

But if you want to base any logic off it? Is your start date the start of a work-flow? So if the start date is null, your system, that you unit tested with start date values, now crashes in production and you have no idea why, because you unit tested it, and the code looks good.

But that one null wrecks everything.

I don't have any nulls. So I lift my start date up into a monad, and start my work flow, and my work flow works, and if the start date is null, then the start date isn't lifted, the functions fall through, because they're not working with a monad, and I just go onto the next thing.

And I coded zero lines of code around that.

Your try-catch blocks looking for nulls everywhere ...

Well, what's more complicated now? What code works in production, regardless of dirty or bad data?

Monads are a guarantee, and, as burritos go, they are quite tasty. Give them a try!


Now there's the whole conversation around monads that carry no value intentionally throughout the entire monad. So, for example, if the monad type is 'Quark' the the monadic inhabited types are up, down, strange, charmed, top and bottom, and their values are ... nothing at all: their instance is the value, it has no underlying value it carries (unless you want to get into the guts of spin and charge ...), its the particular monad instance, or the particular quark, we care about, not what's inside at all, something like monads as enumerated values, but ... eh. Monads-as-burritos is a limited view, it will trip you up, but for the most part it works. When you do get to the point of treating monads-as-monads, and not caring, anymore, about the underlying value, you can tap yourself on the shoulder with monadic-Nirvana. It's a really good feeling, and gives an entirely new understanding of monads and their utility. "Good to know," as it were, but monad-as-burrito works for the most part and is tasty, too, to boot.

by geophf ( at April 20, 2014 08:33 AM

April 19, 2014

Roman Cheplyaka

JSON validation combinators

Why write a custom JSON validator?

At Signal Vine we have a JSON-based language for describing text messaging campaigns. We may design a better surface syntax for this language in the future, but the current one gets the job done and there are certain existing systems that depend on it.

Anyway, the problem with this language is that it is too easy to make mistakes — including errors in JSON syntax, structural errors (plugging an array where an object is expected), or name errors (making a typo in a field name).

So the first thing I did was write a validator for our JSON format.

There are several projects of «JSON schemas» around, but there were many reasons against using them.

  1. I don’t know about the quality of the tools that support such schemas (i.e. the quality of error messages they generate), and the expressivity of the schemas themselves (whether they’d let us to express the structure of our JSON structure and the constraints we’d like to enforce). So, though it may seem that using an existing solution is «free», it is not — I’d have to spend time learning and evaluating these existing solutions.

  2. I remember that we went through this in our team at Barclays, and eventually decided to create a custom JSON schema language, although I was not involved in the evaluation process, so can’t share the details.

  3. I was almost certain that no existing «generic JSON schema» solution can provide the power of a custom one. For instance, some of the JSON strings contain expressions in another in-house mini-language. Ideally, I’d like to parse those expressions while I am parsing the enclosing JSON structure, and give locations of possible errors as they appear in the JSON file.

  4. I’d need a parser for the language anyway. Maintaining a schema separately from the parser would mean one more thing to keep in sync and worry about.

I couldn’t use an existing JSON parsing library either. Of course, aeson was out of question, being notorious for its poor error messages (since it’s based on attoparsec and optimized for speed). json, though, is based on parsec, so its error messages are better.

But there is a deeper reason why a JSON parsing library is inadequate for validation. All of the existing JSON libraries first parse into a generic JSON structure, and only then do they try to recognize the specific format and convert to a value of the target Haskell type.

Which means that during parsing, only JSON syntax errors will be detected, but not the other kinds of errors described above. Granted, they all can be detected sooner or later. But what differentiates sooner from later is that once we’re out of the parsing monad, we no longer have access to the position information (unless, of course, our JSON parsing library does extra work to store locations in the parsed JSON structure — which it typically doesn’t). And not having such position information severely impacts our ability to produce good error messages.

To summarize, in order to provide good diagnostics, it is important to parse exactly the language we expect (and not some superset thereof), and to perform all the checks in the parsing monad, where we have access to location information.

JSON parsing combinators

Even though I couldn’t re-use an existing JSON parser or schema, I still wanted my parser to be high-level, and ideally to resemble a JSON schema, just embedded in Haskell.

The rest of this article describes the JSON schema combinators I wrote for this purpose.


As I mentioned before, the json package uses parsec underneath, so I was able to reuse some basic definitions from there — most notably, p_string, which parses a JSON string. This is fortunate, because handling escape sequences is not straightforward, and I’d rather use a well-tested existing implementation.

string :: Parser String
string = {- copied from Text.JSON.Parsec -}

I introduced one other combinator, theString, which parses a given string:

theString :: String -> Parser ()
theString str = (<?> "\"" ++ str ++ "\"") $ try $ do
  str' <- string
  if str == str'
    then return ()
    else empty


Objects are an interesting case because we know what set of fields to expect, but not the order in which they come (it may be arbitrary). Such syntax is known as a «permutation phrase», and can be parsed as described in the classical paper Parsing Permutation Phrases by Arthur Baars, Andres Löh and Doaitse Swierstra.

There are surprisingly many implementations of permutation parsing on hackage, including one in parsec itself. Most of them suffer from one or both of the following issues:

  1. they use custom combinators, which, despite being similar to Applicative and Alternative operators, have their quirks and require learning

  2. they don’t support permutation phrases with separators, which is obviously required to parse JSON objects. (The technique to parse permutation phrases with separators was descibed in the original paper, too.)

On the other hand, the action-permutations library by Ross Paterson addresses both of these issues. It provides the familiar Applicative interface to combine permutation elements (or atoms, as it calls them), and includes the function runPermsSep to parse phrases with separators. The interface is also very generic, requiring the underlying functor to be just Alternative.

Below are the combinators for parsing JSON objects. field parses a single object field (or member, as it’s called in the JSON spec), using the supplied parser to parse the field’s value. optField is similar, except it returns Nothing if the field is absent (in which case field would produce an error message). Finally, theField is a shortcut to parse a field with the fixed contents. It is useful when there’s a tag-like field identifying the type/class of the object, for instance

data Item
  = Book
      String -- writer
  | Song
      String -- composer
      String -- singer

item =
  (try . object $
    Book <$
    theField "type" "book" <*>
    field "writer" string)
  (try . object $
    Song <$
    theField "type" "song" <*>
    field "composer" string <*>
    field "singer" string)

(Note: examples in this article have nothing to do with the JSON schema we actually use at Signal Vine.)

One thing to pay attention to is how field parsers (field, theField and optField) have a different type from the ordinary parsers. This makes it much easier to reason about what actually gets permuted.

object :: Perms Parser a -> Parser a
object fields = (<?> "JSON object") $
  between (tok (char '{')) (tok (char '}')) $
    runPermsSep (tok (char ',')) fields

-- common function used by field and optField
  :: String -- key name
  -> Parser a -- value parser
  -> Parser a
field' key value = theString key *> tok (char ':') *> value

  :: String -- key name
  -> Parser a -- value parser
  -> Perms Parser a
field key value = atom $ field' key value

  :: String -- key name
  -> String -- expected value
  -> Perms Parser ()
theField key value = () <$ field key (theString value)

  :: String -- key name
  -> Parser a -- value parser
  -> Perms Parser (Maybe a)
optField key value = maybeAtom $ field' key value

Aside: correct separator parsing

There was only one issue I ran into with action-permutations, and it is interesting enough that I decided to describe it here in more detail.

Consider, for example, the expression runPermsSep sep (f <$> atom a <*> atom b <*> atom c)

It would expand to

(flip ($) <$> a <*> (
  (sep *> (flip ($) <$> b <*>
  (sep *> (flip ($) <$> c <*>
    pure (\xc xb xa -> f xc xb xa)))))
  (sep *> (flip ($) <$> c <*>
  (sep *> (flip ($) <$> b <*>
    pure (\xc xb xa -> f xb xc xa)))))
(flip ($) <$> b <*> (
(flip ($) <$> c <*> (

See the problem? Suppose the actual order of the atoms in the input stream is a, c, b. At the beginning the parser is lucky to enter the right branch (the one starting from flip ($) <$> a <*> ...) on the first guess. After that, it has two alternatives: b-then-c, or c-then-b. First it enters the b-then-c branch (i.e. the wrong one) and fails. However, it fails after having consumed some input (namely, the separator) — which in libraries like parsec and trifecta means that the other branch (the right one) won’t be considered.

We cannot even work around this outside of the library by using try, because we can’t insert it in the right place. E.g. wrapping the separator in try won’t work. The right place to insert try would be around the whole alternative

  (sep *> (flip ($) <$> b <*>
  (sep *> (flip ($) <$> c <*>
    pure (\xc xb xa -> f xc xb xa)))))

but this piece is generated by the lirbary and, as a library user, we have no control over it.

The usage of try inside the library itself is unsatisfactory, too. Remember, the interface only assumes the Alternative instance, which has no notion of try. If we had to make it less generic by imposing a Parsing constraint, that would be really unfortunate.

Fortunately, once identified, this problem is not hard to fix properly — and no usage of try is required! All we need is to change runPermsSep so that it expands to the tree where separator parsing is factored out:

(flip ($) <$> a <*> sep *> (
  (flip ($) <$> b <*> sep *>
  (flip ($) <$> c <*>
    pure (\xc xb xa -> f xc xb xa)))))
  (flip ($) <$> c <*> sep *>
  (flip ($) <$> b <*>
    pure (\xc xb xa -> f xb xc xa)))
(flip ($) <$> b <*> sep *> (
(flip ($) <$> c <*> sep *> (

Now, all alternatives start with atoms, so we have full control over whether they consume any input.

Mathematically, this demonstrates that <*> does not distribute over <|> for some backtracking parsers. Note that such distributive property is not required by the Alternative class.

Even for parser monads that allow backtracking by default (attoparsec, polyparse) and for which there’s no semantic difference between the two versions, this change improves efficiency by sharing separator parsing across branches.

My patch fixing the issue has been incorporated into the version of action-permutations.


Arrays should be easier to parse than objects in that the order of the elements is fixed. Still, we need to handle separators (commas) between array elements.

If we interpreted arrays as lists, then the schema combinator for arrays might look like

  :: Parser a -- parser for a signle element
  -> Parser [a] -- parser for the array

Implementation would be straightforward, too.

However, in our JSON schema we use arrays as tuples rather than lists. That is, we typically expect an array of a fixed number of heterogeneous elements. Thus we’d like to combine these tuple elements into a single parser using the applicative interface.

Let’s say we expect a 2-tuple of a string (a person’s name) and an object (that person’s address).

data Address -- = ...
data Person =
    String -- name

Written by hand, the parser may look like

address :: Parser Address
address = _

personAddress = 
  between (tok (char '[')) (tok (char ']')) $
    Person <$> string <* sep <*> address
  where sep = tok $ char ','

It makes sense to move brackets parsing to the array combinator:

array :: Parser a -> Parser a
array p = (<?> "JSON array") $
  between (tok (char '[')) (tok (char ']')) p

But what should we do with the commas? Manually interspersing elements with separators is error-prone and doesn’t correspond to my view of a high-level JSON schema description.

Inserting comma parsers automatically isn’t impossible — after all, it is done in the action-permutations package and we used it to parse object fields, which are comma-separated, too. But it cannot be as easy as adding a separator to every element since there’s one less separators than elements. We have somehow to detect the last element and not to expect a separator after it.

A nice and simple way to achieve this is with a free applicative functor. A free applicative functor will allow us to capture the whole applicative expression and postpone the decision on where to insert separator parsers until we can tell which element is the last one. In this case we’ll use Twan van Laarhoven’s free applicative, as implemented in the free package.

element :: Parser a -> Ap Parser a
element = liftAp

theArray :: Ap Parser a -> Parser a
theArray p = between (tok (char '[')) (tok (char ']')) $ go p
    go :: Ap Parser a -> Parser a
    go (Ap p (Pure f)) = f <$> p
    go (Ap p1 pn) = flip id <$> p1 <* tok (char ',') <*> go pn

Like object fields, array elements have a special type which makes it clear which pieces exactly are comma-separated.

In fact, the applicative functor Perms is essentially the free applicative functor Ap plus branching.

Optional array elements

Now comes the twist. Some of the array elements may be optional — in the same way as positional function arguments in some languages may be optional. Since the elements are positional, if one of them is omitted, all subsequent ones have to be omitted, too — otherwise we won’t be able to tell which one was omitted, exactly.

For that reason, all optional elements should come after all the non-optional ones; if not, then we’ve made a mistake while designing (or describing) our schema. Ideally, our solution should catch such mistakes, too.

So, how can the above solution be adapted to handle optional arguments?

Attempt #1:
optElement :: Parser a -> Ap Parser (Maybe a)
optElement p = element $ optional p

Here optional is a combinator defined in Control.Applicative as

optional v = Just <$> v <|> pure Nothing

This won’t work at all, as it doesn’t give us any information about whether it’s an optional element or just a parser that happens to return a Maybe type.

Attempt #2:

Well, let’s just add a flag indicating whether the element was created using optElement, shall we?

data El a =
    Bool -- is optional?
    (Parser a)

element :: Parser a -> Ap El a
element = liftAp . El False

optElement :: Parser a -> Ap El (Maybe a)
optElement = liftAp . El True . optional

Now we can check that optional arguments come after non-optional ones. If an element’s parse result is Nothing, we also know whether that element is an optional one, and whether we should stop trying to parse the subsequent elements.

Still, there are two related issues preventing this version from working:

  • How do we actually know when a parser returns Nothing? Once we lift a parser into the free applicative, its return type becomes existentially quantified, i.e. we should treat it as polymorphic and cannot assume it has form Maybe a (by pattern-matching on it), even if we can convince ourselves by looking at the Bool flag that it indeed must be of that form.

  • Similarly, once we’ve detected an absent optional element (assuming for a second that it is possible), we have to force all the remaining optional parsers to return Nothing without parsing anything. But again, we cannot convince the compiler that Nothing is an acceptable return value of those parsers.

Attempt #3:

So, we need certain run-time values (the optionality flag) to introduce type-level information (namely, that the parser’s return type has form Maybe a). That’s exactly what GADTs do!

data El a where
  El :: Parser a -> El a
  OptEl :: Parser a -> El (Maybe a)

El’s two constructors are a more powerful version of our old Bool flag. They let us see whether the element is optional, and if so, guarantee that its parser’s return type is Maybeish.

And here’s the code for the parsing functions:

element :: Parser a -> Ap El a
element = liftAp . El

optElement :: Parser a -> Ap El (Maybe a)
optElement = liftAp . OptEl

theArray :: Ap El a -> Parser a
theArray p = between (tok (char '[')) (tok (char ']')) $
  go True False False p
    go :: Bool -> Bool -> Bool -> Ap El a -> Parser a
    go _ _ _ (Pure x) = pure x
    go isFirst optionalOccurred optionalOmitted (Ap el1 eln) =
        eltSequenceError :: a
        eltSequenceError =
          error "theArray: a non-optional element after an optional one"

        !_check =
          case el1 of
            El {} | optionalOccurred -> eltSequenceError
            _ -> ()
        if optionalOmitted
            case el1 of
              El {} -> eltSequenceError
              OptEl {} -> go False True True eln <*> pure Nothing
          else do
              sep = if isFirst then pure () else () <$ tok (char ',')
            case el1 of
              El p1 ->
                flip id <$ sep <*> p1 <*> go False False False eln
              OptEl p1 -> do
                r1 <- optional $ sep *> p1
                go False True (isNothing r1) eln <*> pure r1

theArray is a state machine with three pieces of state: isFirst, optionalOccurred and optionalOmitted. isFirst and optionalOccurred are used to guide actual parsing, while optionalOccurred is needed to check the proper arrangement of optional vs non-optional arguments.


Although the standard approach to JSON parsing is to parse into a generic JSON representation first, the article shows that an alternative approach — parsing the expected structure directly — is also viable and can be employed to improve error reporting.

Of course, the tricks and ideas described here are not specific to JSON. Understanding how they work and how to use them may become handy in a variety of parsing situations.

April 19, 2014 09:00 PM

Philip Wadler

Silicon Valley could force NSA reform, tomorrow. What's taking so long?

<figure class="element element-image" data-media-id="gu-fc-3e1d8f3a-dd53-4f60-9376-ccc2021cc1eb" style="background-repeat: no-repeat no-repeat; border-collapse: collapse; margin: 0px 0px 10px; padding: 0px;"><figcaption style="background-repeat: no-repeat no-repeat; border-collapse: collapse; color: #666666; font-size: 0.858em; line-height: 1.25; margin: 0px 0px 10px; padding: 0px;">CEOs from Yahoo to Dropbox and Microsoft to Zynga met at the White House, but are they just playing for the cameras? Photograph: Kevin Lamarque / Reuters</figcaption></figure>
Trevor Timm asks a key question in The Guardian:
The CEOs of the major tech companies came out of the gate swinging 10 months ago, complaining loudly about how NSA surveillance has been destroying privacy and ruining their business. They still are. Facebook founder Mark Zuckerberg recently called the US a "threat" to the Internet, and Eric Schmidt, chairman of Google, called some of the NSA tactics "outrageous" and potentially "illegal". They and their fellow Silicon Valley powerhouses – from Yahoo to Dropbox and Microsoft to Apple and more – formed a coalition calling for surveillance reform and had conversations with the White House.
But for all their talk, the public has come away empty handed. The USA Freedom Act, the only major new bill promising real reform, has been stalled in the Judiciary Committee. The House Intelligence bill may be worse than the status quo. Politico reported on Thursday that companies like Facebook and are now "holding fire" on the hill when it comes to pushing for legislative reform.
We know it's worked before. Three years ago, when thousands of websites participated in an unprecedented response to internet censorship legislation, the Stop Online Piracy Act (Sopa), the public stopped a once-invincible bill in its tracks. If they really, truly wanted to do something about it, the online giants of Silicon Valley and beyond could design their systems so that even the companies themselves could not access their users' messages by making their texting and instant messaging clients end-to-end encrypted.
But the major internet outfits were noticeably absent from this year's similar grassroots protest – dubbed The Day We Fight Back – and refused to alter their websites à la Sopa. If they really believed the NSA was the threat so many of them have claimed, they'd have blacked out their websites in protest already.

by Philip Wadler ( at April 19, 2014 08:59 AM

I Shall Vote No

Spotted on Bella Caledonia.
[After Christopher Logue, I Shall Vote Labour (1966)
By A.R. Frith
I shall vote No because, without Westminster, We’d never have got rid of the Poll Tax
I shall vote No because eight hundred thousand Scots live in England, and there are no jobs here to match their talents and meet their aspirations
I shall vote No, because my grandmother was a MacDougall
I shall vote No in case Shell and BP leave and take their oil with them
I shall vote No because otherwise we would have to give back the pandas
I shall vote No because I am feart
I shall vote No because the people who promised us a better deal if we voted No in 79, and warned us of the dire consequences of devolution in 97, tell us we should
I shall vote No so as not to let down my fellow socialists in Billericay and Basildon
I shall vote No, because if we got rid of Trident and stopped taking part in illegal wars we would be a target for terrorism
I shall vote No because if I lived under a government that listened to me and had policies I agreed with, I wouldn’t feel British
I shall vote No because the RAF will bomb our airports if we are a separate country
I shall vote No because to vote Yes dishonours the Dead of the Great War, who laid down their lives for the rights of small nations
I shall vote No, lest being cut off from England turns Red Leicester cheese and Lincolnshire sausages into unobtainable foreign delicacies, like croissants, or bananas
I shall vote No, because, as a progressive, I have more in common with Billy Bragg or Tariq Ali, who aren’t Scottish, than some toff like Lord Forsyth, who is.
I shall vote No, because the certainty of billions of pounds worth of spending cuts to come is preferable to the uncertainty of wealth
I shall vote No, because it is blindingly obvious that Scotlands voice at the UN, and other international bodies, will be much diminished if we are a member-state
I shall vote No because having a parliament with no real power, and another which is run by people we didnt vote for, is the best of both worlds
I shall vote No because I trust and admire Nick Clegg, who is promising us Federalism when the Liberals return to office
I shall vote No, because Emma Thompson would vote No, and her Dad did The Magic Roundabout
I shall vote No, because A.C. Grayling would vote No,and his Mum was born on Burns Night
I shall vote No because David Bowie asked Kate Moss to tell us to, and he lives in New York and used to be famous
I shall vote No, because nobody ever asks me what I think
I shall vote No, because a triple-A credit rating is vital in the modern world
I shall vote No because things are just fine as they are
I shall vote No because the English say they love us,
and that if we vote Yes, they will wreck our economy.

by Philip Wadler ( at April 19, 2014 08:35 AM

April 17, 2014


Haskell gets static typing right

The following blog post originally appeared as a guest post on the Skills Matter blog. Well-Typed are regularly teaching both introductory or advanced Haskell courses at Skills Matter, and we will also be running a special course on Haskell’s type system.

Statically typed languages are often seen as a relic of the past – old and clunky. Looking at languages such as C and Java, we’re used to writing down a lot of information in a program that just declares certain variables to be of certain types. And what do we get in return? Not all that much. Yes, granted, some errors are caught at compile time. But the price is high: we’ve cluttered up the code with noisy declarations. Often, code has to be duplicated or written in a more complicated way, just to satisfy the type checker. And then, we still have a significant risk of run-time type errors, because type casting is common-place and can fail at unexpected moments.

So it isn’t a big surprise that dynamically typed languages are now very fashionable. They promise to achieve much more in less time, simply by getting rid of static type checking.

However, I want to argue that we shouldn’t be too keen on giving up the advantages of static types, and instead start using programming languages that get static typing right. Many functional languages such as Scala, F#, OCaml and in particular Haskell are examples of programming languages with strong static type systems that try not to get in the way, but instead guide and help the programmer.

In the rest of this post, I want to look at a few of the reasons why Haskell’s type system is so great. Note that some of the features I’m going to discuss are exclusive to Haskell, but most are not. I’m mainly using Haskell as a vehicle to advertise the virtues of good static type systems.

1. Type inference

Type inference makes the compiler apply common sense to your programs. You no longer have to declare the types of your variables, the compiler looks at how you use them and tries to determine what type they have. If any of the uses are inconsistent, then a type error is reported. This removes a lot of noise from your programs, and lets you focus on what’s important.

Of course, you are still allowed to provide explicit type signatures, and encouraged to do so in places where it makes sense, for example, when specifying the interface of your code.

2. Code reuse

Nothing is more annoying than having to duplicate code. In the ancient days of statically typed programming, you had to write the same function several times if you wanted it to work for several types. These days, most languages have “generics” that allow you to abstract over type parameters. In Haskell, you just write a piece of code that works for several types, and type inference will tell you that it does, by inferring a type that is “polymorphic”. For example, write code that reverses all the elements of a data structure, and type inference will tell you that your code is independent of the type of elements of the data structure, so it’ll just work regardless of what element type you use. If you write code that sorts a data structure, type inference will figure out that all you require to know about the elements is that they admit an ordering.

3. No run-time type information by default

Haskell erases all type information after type checking. You may think that this is mainly a performance issue, but it’s much more than that. The absence of run-time type information means that code that’s polymorphic (i.e., type-agnostic, see above) cannot access certain values. This can be a powerful safety net. For example, just the type signature of a function can tell you that the function could reorder, delete or duplicate elements in a data structure, but not otherwise touch them, modify them or operate on them in any other way. Whereas in the beginning of this post I complained that bad static type systems don’t allow you to do what you want because they’re not powerful enough, here we can deliberately introduce restrictions to save us (as well as colleagues) from accidental mistakes. So polymorphism turns out to be much more than just a way to reduce code duplication.

By the way, Haskell gives you various degrees of selective run-time typing. If you really need it in places, you can explicitly attach run-time type information to values and then make type-based decisions. But you say where and when, making a conscious choice that you gain flexibility at the cost of safety.

4. Introducing new datatypes made easy

In Haskell, it’s a one-liner to define a new datatype with a new name that has the same run-time representation as an existing type, yet is treated as distinct by the type system. (This may sound trivial, but surprisingly many statically typed languages get it wrong.) So for example it’s easy to define lots of different types that are all integers internally: counters, years, quantities, … In Haskell, this simple feature is often used to define safe boundaries: a specific type for URLs, a specific type for SQL queries, a specific type for HTML documents, and so on. Each of these types then comes with specific functions to operate on them. All such operations guarantee that whenever you have a value of this type, it’s well-formed, and whenever you render a value of this type, it’s syntactically correct and properly escaped.

5. Explicit effects

In virtually all programming languages, a function that performs some calculations on a few numbers and another function that performs the same calculations, but additionally sends a million spam emails to addresses all over the world, have exactly the same type, and therefore the same interface. Not so in Haskell. If a function writes to the screen, reads from the disk, sends messages over the network, accesses the system time, or makes use of any other so-called side effect, this is visible in its type. This has two advantages: first, it makes it much easier to rely on other people’s code. If you look at the interface and a function is effect-free, then you for example automatically know that it is also thread-safe. Second, the language facilitates a design where side effects are isolated into relatively small parts of the code base. This may seem difficult to achieve for highly stateful systems, but surprisingly, it usually is not: even interactive systems can usually be described as pure functions reacting to a series of requests with appropriate responses, and a separate driver that does the actual communication. Such separation makes it not only easier to test the program, but also facilitates the evolution of the program such, for example, to adapt it to run in a different environment. Haskell’s type system therefore encourages good design.

6. Types as a guide in program development

If you only ever see types as a source of errors, and therefore as enemies on your path of getting your program accepted, you’re doing them injustice. Types as provided by Haskell are an element of program design. If you give your program precise types and follow systematic design principles, your program almost writes itself. Programming with a strong type system is comparable to playing a puzzle game, where the type system removes many of the wrong choices and helpfully guides you to take the right path. This style of programming is supported by a new language extension called “Typed Holes” where you leave parts of your program unspecified during development, and obtain feedback from the development environment about what type has to go into the hole, and what program fragments you have available locally to construct a value of the desired type. Playing this kind of puzzle game is actually quite fun!

7. Programming on the type level

Haskell’s type system provides many advanced features that you don’t usually have to know about, but that can help you if you want to ensure that some complicated invariants hold in your program. Scarily named concepts such as “higher-ranked polymorphism”, “generalized algebraic datatypes” and “type families” essentially provide you with a way to write programs that compute with types. The possibilities are nearly endless. From playful things such as writing a C-printf-style function where the first argument determines the number of arguments that are expected afterwards as well as their types, you can go on to code that provides useful guarantees such as that mutable references that are available within one thread of control are guaranteed not to be accessed in a completely different context, arrays that can adapt to different internal representations depending on what type of values they contain, working with lists that are guaranteed to be of a specific length, or with trees that are guaranteed to be balanced, or with heterogeneous lists (where each element can be of a different type) in a type-safe way. The goal is always to make illegal inputs impossible to construct. If they’re impossible to construct by the type system, you can isolate sanity tests at the boundary of your code, rather than having to do them over and over again. The good thing is that these features are mostly optional, and often hardly affect the interface of libraries. So as a user, you can benefit from libraries employing such features and having extra safety guarantees internally. As a library writer, you can choose whether you’re happy with the normal level of Haskell type safety (which is already rather a lot), or if you want to spend some extra effort and get even more.

If my overview has tempted you and you now want to learn more about Haskell, you’re welcome follow one of my introductory or advanced Haskell courses that I (together with my colleagues at Well-Typed) regularly teach at Skills Matter. These courses do not just focus on the type system of Haskell (although that’s a significant part). They introduce the entire language in a hands-on way with lots of examples and exercises, as well as providing guidelines on how to write idiomatic Haskell and how to follow good development practices.

If you already know some Haskell and are particularly interested in the advanced type system features mentioned in point 7, we also offer a new one-day course on Haskell’s type system that specifically focuses on how far you can push it.

by andres at April 17, 2014 04:04 PM

April 16, 2014

Jan Stolarek

Autocomplete command-line flags with GHC 7.8.2

GHC 7.8.2 has been released just a few days ago1. This is the first official release of GHC that contains my contributions. Most of them are improvements in the code generator and are thus practically invisible to most users. But I also implemented one very nice feature that will be useful to an average Haskeller. GHC now has --show-options flag that lists all command-line flags. This feature can be used to auto-complete command-line flags in shells that support this feature. To enable auto-completion in Bash add this code snippet to your ~/.bashrc file:

# Autocomplete GHC commands
    local envs=`ghc --show-options`
    # get the word currently being completed
    local cur=${COMP_WORDS[$COMP_CWORD]}
    # the resulting completions should be put into this array
    COMPREPLY=( $( compgen -W "$envs" -- $cur ) )
complete -F _ghc -o default ghc

From my experience the first completion is a bit slow but once the flags are cached things work fast.

  1. Please ignore 7.8.1 release. It shipped with a bug that caused rejection of some valid programs.

by Jan Stolarek at April 16, 2014 02:48 PM

Philip Wadler

Team Scotland

What happens after Yes? It's not the SNP, it's the people. A great post on Bella Caledonia.
The UK chattering classes have been wondering what a real, mass grass-roots campaign might look like in modern, professionalised politics. Impotent is their usual conclusion. Well come on up and we’ll show you. The old feudal dance where councilor doths cap to MP, MP to Minister, Minister to Prime Minister and Prime Minister to corporate CEO may well continue apace even here in Scotland. But it’s not winning Scotland.

by Philip Wadler ( at April 16, 2014 12:19 PM

Help ORG restart the debate about internet filters

The Open Rights Group is starting a campaign opposed to the default filtering now imposed by all providers in the UK---de facto censorship. You can fund it via IndieGogo.

by Philip Wadler ( at April 16, 2014 11:59 AM

April 15, 2014


Writing admin interfaces with Fay using fay-builder

We are in the process of open sourcing a lot of the general purpose libraries we’ve built at Silk under the BSD3 license. We’re planning to write a series of blog posts introducing them so please stay tuned!

First off is one of the smaller packages, fay-builder (github).

Fay is a proper subset of Haskell that compiles to JavaScript, as well as a compiler. I’ve been helping out with the project since its first release in July 2012.

We have several Haskell backend services with admin interfaces written in HTML and JavaScript that communicate with the running server using REST APIs. It seemed like a good idea to write their admin interfaces in Haskell and we have been experimenting with using Fay for this. Since Fay supports Cabal packages we can use our usual tools and upload these packages to the public hackage or our private instance.

fay-builder lets you store options needed to compile Fay code along with an executable. This is done by using custom cabal flags that can be read either by Setup.hs (but see caveats below) or from within the application at startup, on request, or at some other point.

If you want to try it out you can install it by running cabal install fay-builder.

This post demonstrates how we build Haskell code for our admin interfaces.
Join our team if you enjoy this stuff too, we’re hiring!

The Cabal file

Here’s how you can define a Cabal file that uses fay-builder for a project:

name:                my-program
version:             0.1

Use a custom build type

build-type:          Custom

extra-source-files specifies files what should be included with a source distribution. The fay files are added here so they don’t get lost when running cabal sdist.

extra-source-files: fay/*.hs

data-files are also included, with the addition that Cabal will generate a paths module that you can use to fetch the resources during runtime. We probably want to precompile the fay files so the program can just serve them.

data-files: my-program.cabal, js/*.js

x-foo specifies custom flags that Cabal itself ignores, but we can use it for extra configuration options.

x-fay-packages:      fay-jquery, fay-text
x-fay-root-modules:  Index, Misc
x-fay-include-paths: src
x-fay-output-dir:    data/js
x-fay-source-dir:    fay

executable my-program
  main-is:           Main.hs
  hs-source-dirs:    src
  default-language:  Haskell2010
  ghc-options:       -Wall

This is the haskell module Cabal generates to let us access data-files.

other-modules:     Paths_my_program

To make sure we can build the Fay code we add all the dependencies it has here along with the GHC libraries dependencies.

One thing to note is that we can’t depend on base and fay-base at the same time since they have clashing module names. But if we depend on some other fay package we get a transitive dependency to fay-base and we know it will be available.

build-depends:     base, Cabal, fay, fay-builder, fay-jquery, fay-text

We have a few options for when to build the Fay code.

Building as a Cabal hook

With a custom Setup.hs we can add a postBuildHook that will compile the Fay code after the executable has been compiled, but before an sdist is created. This is nice because it will generate the needed .js files before the tarball is created. The disadvantage is that Setup.hs cannot have explicit dependencies so you won’t be able to sdist your program unless you already have all dependencies installed.

It is possible to do this, and fay-builder includes a defaultFayHook that you can use in Setup.hs like this:

import Fay.Builder (defaultFayHook)
main = defaultFayHook

It’s pretty cool, but maybe not very useful because of the dependency issue. Just make sure to have fay-builder installed before running cabal configure.

Building inside the program

For one application I started out with the above Setup.hs approach, but soon realized that this would not work out since sdisting wasn’t possible in a clean package DB. I instead chose to add a --compile-fay flag that would compile all Fay sources when the process starts when developing. The live setup won’t do this and instead read the data-files.

The trick here is that we added the .cabal file as a data-file to let us access it from other modules.

import qualified Paths_my_program as Paths
import qualified Fay.Builder as Builder

compileFay :: IO ()
compileFay = do
  pkgDb       <- getPackageDb
  packageDesc <- Paths.getDataFileName "my-program.cabal"
                 >>= Builder.readPackageDescription packageDesc pkgDb
      -- Some clever way to fetch the current package DB
      -- if you are using a sandbox.
      getPackageDb :: IO (Maybe FilePath)

If we had added the settings elsewhere, such as in a custom configurator file like snaplet-fay does, we wouldn’t be able to read it from within Setup.hs so the cabal file might be the only place to put this configuration if we want both features at once.

One of the biggest advantages in using fay-builder is that all dependencies can be listed in one cabal file instead of having to split the client and server code into multiple packages (which would turn into three different packages if there’s shared code).

April 15, 2014 09:38 AM

April 14, 2014

FP Complete

The heartbleed bug and FP Haskell Center

The heartbleed bug and FP Haskell Center

If you haven't heard about the heartbleed bug and related security issues, this article provides a good explanation.

Applications developed in FPHC and deployed using the FP Application servers are not vulnerable to this bug.

Services we use from Amazon and GitHub were affected. SSL connections to our servers go through Amazon software, and we use SSL to connect to GitHub repositories. Both Amazon and GitHub have already updated their services to remove the vulnerability. FP Complete has followed GitHub's suggestions by changing our passwords and OAuth tokens. You can read those guidelines at github.

While we have no evidence that any sensitive information was leaked from FPHC, we recommend changing your password for FPHC as soon as possible, just in case.

Other measures to increase security never hurt. Things like using two-factor authentication on sites for which it is available, and using password locker software that will generate strong passwords unique to each site, will help prevent people breaking into your accounts. This event provides a good time to consider adding these extra security measures if you aren't already using them.

April 14, 2014 02:25 PM

Danny Gratzer

Continuations and Exceptions

Posted on April 14, 2014

Continuations are useful things. They provide a nice way to manually handle control flow. This fact makes them very useful for compilers, an often used alternative to SSA is an intermediate language in which every function call is a tail-call and every expression has been converted to continuation passing style.

Often however, this isn’t enough. In a language which exceptions, we don’t just have a single continuation. Since every expression can either do one of two things.

  1. Continue the rest of the program normally
  2. Throw an exception and run an alternative program, the exception handler

To represent this, we can imagine having two continuations. Instead of

    newtype Cont r a = Cont {runCont :: (a -> r) -> r}

We have

    {-# LANGUAGE DeriveFunctor #-}
    import Control.Monad
    newtype Throws r e a = Throws {runThrows :: (e -> r) -> (a -> r) -> r}
    deriving (Functor)

Now we have two continuations, where e -> r represents the composition of exception handlers.

We can write a trivial monad instance similar to Cont

    instance Monad (Throws r e) where
      return a = Throws $ \ex cont -> cont a
      (Throws c) >>= f = Throws $ \ ex cont ->
        c ex $ \a -> runThrows (f a) e cont

So >>= maintains the exception handler between computations and otherwise acts exactly like Cont.

To actually take advantage of our exception handlers, we need two things, a throw and catch like pair of function. Let’s start with throw since it’s easiest.

    throw :: e -> Throws r e a
    throw e = Throws $ \ex cont -> ex e

This is pretty straightforward, when we’re given an exception an to throw, we simply feed it to our exception handler continuation. Since care what value cont needs, we can universally quantify over a.

Next up is handle, we’ll represent an exception handler as a function from e -> Maybe a. If we return Nothing, we can’t handle the exception at this level and we’ll just pass it to the existing exception handler.

So our handle is

    handle :: Throws r e a -> (e -> Maybe a) -> Throws r e a
    handle (Throws rest) handler = Throws $ \ex cont ->
      rest (\e -> maybe (ex e) cont (handler e)) cont

Notice the clever bit here, each handler actually contains both the success and failure continuations! If we can’t handle the exception we fail otherwise we can resume exactly where we were before.

No post would be complete without a demonstration!

    data Ex = Boom | Kaboom | Splat String
            deriving Show
    throwEx1 = throw Boom
    throwEx2 = throw Kaboom
    throwEx3 = throw (Splat "splat")
    test = do
      result <- handle throwEx1 $ \e -> case e of
        Boom -> Just "foo"
        _    -> Nothing
      result2 <- handle throwEx2 $ \e -> case e of
        Boom   -> Just "what"
        Kaboom -> Just "huh?"
        _      -> Nothing
      result3 <- handle throwEx3 $ \e -> case e of
        Splat s -> Just s
        _       -> Nothing
      return (unwords [result, result2, result3])

We can run this with

    runThrows (error . ("Toplevel fail "++)) test

which returns

    "foo huh? splat"

A Note on Either

Now we already have a perfectly good system of monadic exception like thing in the form of Either.

It might be interesting to note that what we’ve written is in fact isomorphic to Either. (e -> r) -> (a -> r) -> r is just the church representation of Either e a.

We can even go all the way and change Throws to

    newtype Throws e a = Throws {runThrows :: forall r. (e -> r) -> (a -> r) -> r}

So there you are, an interesting realization that one of the classic representations of a language like SML is in fact a really complicated version of Either :)

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

April 14, 2014 12:00 AM

April 10, 2014

Manuel M T Chakravarty

As a spin off from teaching programming to my 10 year old son...

As a spin off from teaching programming to my 10 year old son and his friends, we have published a sprite and pixel art editor for the iPad, called BigPixel, which you can get from the App Store. (It has a similar feature set as the earlier Haskell version, but is much prettier!)

April 10, 2014 11:47 AM

Alejandro Cabrera

Migration Complete: Welcome to!

The migration is now complete. Currently, I have:

* Atom feed support
* Hosted on https:// (with heartbleed patched)
* Sources hosted on github
* Main blog over at
* All content from here ported over
* All posts tagged appropriately

I've documented the process as my first new post at the new location: Learning Hakyll and Setting Up

At this point, the one feature I'd like to add to my static blog soon is the ability to have a Haskell-only feed. I'll be working on that over the coming week.

Thanks again for reading, and I hope you'll enjoy visiting the new site!

by Alejandro Cabrera ( at April 10, 2014 07:27 AM

Robert Harper

Parallelism and Concurrency, Revisited

To my delight, I still get compliments on and criticisms of my post from three years ago (can it possibly be that long?) on parallelism and concurrency.  In that post I offered a “top down” argument to the effect that these are different abstractions with different goals: parallelism is about exploiting computational resources to maximize efficiency, concurrency is about non-deterministic composition of components in a system.  Parallelism never introduces bugs (the semantics is identical to the sequential execution), but concurrency could be said to be the mother lode of all bugs (the semantics of a component changes drastically, without careful provision, when composed concurrently with other components).  The two concepts just aren’t comparable, yet somehow the confusion between them persists.  (Not everyone agrees with me on this distinction, but neither have I seen a comparable analysis that shows them to be the same concept.  Most complaints seem to be about my use of the words “parallelism” and “concurrency” , which is an unavoidable problem, or about my temerity in trying to define two somewhat ill-defined concepts, a criticism that I’ll just have to accept.)

I’ve recently gotten an inkling of why it might be that many people equate the two concepts (or see no point in distinguishing them).  This post is an attempt to clear up what I perceive to be a common misunderstanding that seems to explain it.  It’s hard for me to say whether it really is all that common of a misunderstanding, but it’s the impression I’ve gotten, so forgive me if I’m over-stressing an obvious point.  In any case I’m going to try for a “bottom up” explanation that might make more sense to some people.

The issue is scheduling.

The naive view of parallelism is that it’s just talk for concurrency, because all you do when you’re programming in parallel is fork off some threads, and then do something with their results when they’re done.  I’ve previously argued that this is the wrong way to think about parallelism (it’s really about cost), but let’s just let that pass.  It’s unarguably true that a parallel computation does consist of a bunch of, well, parallel computations.  So, the argument goes, it’s nothing but concurrency, because concurrency is, supposedly, all about forking off some threads and waiting to see what they do, and then doing something with it.  I’ve argued that that’s not a good way to think about concurrency either, but we’ll let that pass too.  So, the story goes, concurrency and parallelism are synonymous, and bullshitters like me are just trying to confuse people and make trouble.

Being the troublemaker that I am, my response is, predictably, nojust no.  Sure, it’s kinda sorta right, as I’ve already acknowledged, but not really, and here’s why: scheduling as you learned about it in OS class (for example) is an altogether different thing than scheduling for parallelism.  And this is the heart of the matter, from a “bottom-up” perspective.

There are two aspects of OS-like scheduling that I think are relevant here.  First, it is non-deterministic, and second, it is competitive.  Non-deterministic, because you have little or no control over what runs when or for how long.  A beast like the Linux scheduler is controlled by a zillion “voodoo parameters” (a turn of phrase borrowed from my queueing theory colleague, Mor Harchol-Balter), and who the hell knows what is going to happen to your poor threads once they’re in its clutches.  Second, and more importantly, an OS-like scheduler is allocating resources competitively.  You’ve got your threads, I’ve got my threads, and we both want ours to get run as soon as possible.  We’ll even pay for the privilege (priorities) if necessary.  The scheduler, and the queueing theory behind it (he says optimistically) is designed to optimize resource usage on a competitive basis, taking account of quality of service guarantees purchased by the participants.  It does not matter whether there is one processor or one thousand processors, the schedule is unpredictable.  That’s what makes concurrent programming hard: you have to program against all possible schedules.  And that’s why you can’t prove much about the time or space complexity of your program when it’s implemented concurrently.

Parallel scheduling is a whole ‘nother ball of wax.  It is (usually, but not necessarily) deterministic, so that you can prove bounds on its efficiency (Brent-type theorems, as I discussed in my previous post and in PFPL).  And, more importantly, it is cooperative in the sense that all threads are working together for the same computation towards the same ends.  The threads are scheduled so as to get the job (there’s only one) done as quickly and as efficiently as possible.  Deterministic schedulers for parallelism are the most common, because they are the easiest to analyze with respect to their time and space bounds.  Greedy schedulers, which guarantee to maximize use of available processors, never leaving any idle when there is work to be done, form an important class for which the simple form of Brent’s Theorem is obvious.

Many deterministic greedy scheduling algorithms are known, of which I will mention p-DFS and p-BFS, which do p-at-a-time depth- and breadth-first search of the dependency graph, and various forms of work-stealing schedulers, pioneered by Charles Leiserson at MIT.  (Incidentally, if you don’t already know what p-DFS or p-BFS are, I’ll warn you that they are a little trickier than they sound.  In particular p-DFS uses a data structure that is sort of like a stack but is not a stack.)  These differ significantly in their time bounds (for example, work stealing usually involves expectation over a random variable, whereas the depth- and breadth traversals do not), and differ dramatically in their space complexity.  For example, p-BFS is absolutely dreadful in its space complexity.  For a full discussion of these issues in parallel scheduling, I recommend Dan Spoonhower’s PhD Dissertation.  (His semantic profiling diagrams are amazingly beautiful and informative!)

So here’s the thing: when you’re programming in parallel, you don’t just throw some threads at some non-deterministic competitive scheduler.  Rather, you generate an implicit dependency graph that a cooperative scheduler uses to maximize efficiency, end-to-end.  At the high level you do an asymptotic cost analysis without considering platform parameters such as the number of processors or the nature of the interconnect.  At the low level the implementation has to validate that cost analysis by using clever techniques to ensure that, once the platform parameters are known, maximum use is made of the computational resources to get your job done for you as fast as possible.  Not only are there no bugs introduced by the mere fact of being scheduled in parallel, but even better, you can prove a theorem that tells you how fast your program is going to run on a real platform.  Now how cool is that?

Filed under: Programming, Research Tagged: concurrency, parallelism

by Robert Harper at April 10, 2014 03:18 AM

April 09, 2014

Dominic Steinitz

Gibbs Sampling in R, Haskell, Jags and Stan


It’s possible to Gibbs sampling in most languages and since I am doing some work in R and some work in Haskell, I thought I’d present a simple example in both languages: estimating the mean from a normal distribution with unknown mean and variance. Although one can do Gibbs sampling directly in R, it is more common to use a specialised language such as JAGS or STAN to do the actual sampling and do pre-processing and post-processing in R. This blog post presents implementations in native R, JAGS and STAN as well as Haskell.


> {-# OPTIONS_GHC -Wall                      #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing   #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults    #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind   #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods  #-}
> {-# OPTIONS_GHC -fno-warn-orphans          #-}
> {-# LANGUAGE NoMonomorphismRestriction     #-}
> module Gibbs (
>     main
>   , m
>   , Moments(..)
>   ) where
> import qualified Data.Vector.Unboxed as V
> import qualified Control.Monad.Loops as ML
> import Data.Random.Source.PureMT
> import Data.Random
> import Control.Monad.State
> import Data.Histogram ( asList )
> import Data.Histogram.Fill
> import Data.Histogram.Generic ( Histogram )
> import Data.List
> import qualified Control.Foldl as L
> import Diagrams.Backend.Cairo.CmdLine
> import LinRegAux
> import Diagrams.Backend.CmdLine
> import Diagrams.Prelude hiding ( sample, render )

The length of our chain and the burn-in.

> nrep, nb :: Int
> nb   = 5000
> nrep = 105000

Data generated from {\cal{N}}(10.0, 5.0).

> xs :: [Double]
> xs = [
>     11.0765808082301
>   , 10.918739177542
>   , 15.4302462747137
>   , 10.1435649220266
>   , 15.2112705014697
>   , 10.441327659703
>   , 2.95784054883142
>   , 10.2761068139607
>   , 9.64347295100318
>   , 11.8043359297675
>   , 10.9419989262713
>   , 7.21905367667346
>   , 10.4339807638017
>   , 6.79485294803006
>   , 11.817248658832
>   , 6.6126710570584
>   , 12.6640920214508
>   , 8.36604701073303
>   , 12.6048485320333
>   , 8.43143879537592
>   ]

A Bit of Theory

Gibbs Sampling

For a multi-parameter situation, Gibbs sampling is a special case of Metropolis-Hastings in which the proposal distributions are the posterior conditional distributions.

Referring back to the explanation of the metropolis algorithm, let us describe the state by its parameters i \triangleq \boldsymbol{\theta}^{(i)} \triangleq (\theta^{(i)}_1,\ldots, \theta^{(i)}_n) and the conditional posteriors by \pi\big({\theta}_{k}^{(j)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(i)}\big) where {\boldsymbol{\theta}}^{(i)}_{-k} = \big(\theta_1^{(i)},\ldots,\theta_{k-1}^{(i)},\theta_{k+1}^{(i)}\ldots\theta_n^{(i)}\big) then

\displaystyle   \begin{aligned}  \frac{\pi\big(\boldsymbol{\theta}^{(j)}\big)q\big(\boldsymbol{\theta}^{(j)}, \boldsymbol{\theta}^{(i)}\big)}  {\pi(\boldsymbol{\theta}^{(i)})q(\boldsymbol{\theta}^{(i)}, \boldsymbol{\theta}^{(j)})}  &=  \frac{  \pi\big({\theta}_{k}^{(j)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\theta}_{k}^{(i)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)  }  {  \pi\big({\theta}_{k}^{(i)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(i)}\big)\pi\big({\boldsymbol{\theta}}_{-k}^{(i)}\big)\pi\big({\theta}_{k}^{(j)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(i)}\big)  } \\  &=  \frac{  \pi\big({\theta}_{k}^{(j)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\theta}_{k}^{(i)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)  }  {  \pi\big({\theta}_{k}^{(i)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\boldsymbol{\theta}}_{-k}^{(j)}\big)\pi\big({\theta}_{k}^{(j)} \,\big|\, {\boldsymbol{\theta}}_{-k}^{(j)}\big)  } \\  &= 1  \end{aligned}

where we have used the rules of conditional probability and the fact that \boldsymbol{\theta}_i^{(-k)} = \boldsymbol{\theta}_j^{(-k)}

Thus we always accept the proposed jump. Note that the chain is not in general reversible as the order in which the updates are done matters.

Normal Distribution with Unknown Mean and Variance

It is fairly standard to use an improper prior

\displaystyle   \begin{aligned}  \pi(\mu, \tau) \propto \frac{1}{\tau} & & -\infty < \mu < \infty\, \textrm{and}\, 0 < \tau < \infty  \end{aligned}

The likelihood is

\displaystyle   p(\boldsymbol{x}\,|\,\mu, \sigma) = \prod_{i=1}^n \bigg(\frac{1}{\sigma\sqrt{2\pi}}\bigg)\exp{\bigg( -\frac{(x_i - \mu)^2}{2\sigma^2}\bigg)}

re-writing in terms of precision

\displaystyle   p(\boldsymbol{x}\,|\,\mu, \tau) \propto \prod_{i=1}^n \sqrt{\tau}\exp{\bigg( -\frac{\tau}{2}{(x_i - \mu)^2}\bigg)} = \tau^{n/2}\exp{\bigg( -\frac{\tau}{2}\sum_{i=1}^n{(x_i - \mu)^2}\bigg)}

Thus the posterior is

\displaystyle   p(\mu, \tau \,|\, \boldsymbol{x}) \propto \tau^{n/2 - 1}\exp{\bigg( -\frac{\tau}{2}\sum_{i=1}^n{(x_i - \mu)^2}\bigg)}

We can re-write the sum in terms of the sample mean \bar{x} = \frac{1}{n}\sum_{i=1}^n x_i and variance s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2 using

\displaystyle   \begin{aligned}  \sum_{i=1}^n (x_i - \mu)^2 &= \sum_{i=1}^n (x_i - \bar{x} + \bar{x} - \mu)^2 \\  &= \sum_{i=1}^n (x_i - \bar{x})^2 - 2\sum_{i=1}^n (x_i - \bar{x})(\bar{x} - \mu) + \sum_{i=1}^n (\bar{x} - \mu)^2 \\  &= \sum_{i=1}^n (x_i - \bar{x})^2 - 2(\bar{x} - \mu)\sum_{i=1}^n (x_i - \bar{x}) + \sum_{i=1}^n (\bar{x} - \mu)^2 \\  &= (n - 1)s^2 + n(\bar{x} - \mu)^2  \end{aligned}

Thus the conditional posterior for \mu is

\displaystyle   \begin{aligned}  p(\mu \,|\, \tau, \boldsymbol{x}) &\propto \exp{\bigg( -\frac{\tau}{2}\bigg(\nu s^2 + \sum_{i=1}^n{(\mu - \bar{x})^2}\bigg)\bigg)} \\  &\propto \exp{\bigg( -\frac{n\tau}{2}{(\mu - \bar{x})^2}\bigg)} \\  \end{aligned}

which we recognise as a normal distribution with mean of \bar{x} and a variance of (n\tau)^{-1}.

The conditional posterior for \tau is

\displaystyle   \begin{aligned}  p(\tau \,|\, , \mu, \boldsymbol{x}) &\propto \tau^{n/2 -1}\exp\bigg(-\tau\frac{1}{2}\sum_{i=1}^n{(x_i - \mu)^2}\bigg)  \end{aligned}

which we recognise as a gamma distribution with a shape of n/2 and a scale of \frac{1}{2}\sum_{i=1}^n{(x_i - \mu)^2}

In this particular case, we can calculate the marginal posterior of \mu analytically. Writing z = \frac{\tau}{2}\sum_{i=1}^n{(x_i - \mu)^2} we have

\displaystyle   \begin{aligned}  p(\mu \,|\, \boldsymbol{x}) &= \int_0^\infty p(\mu, \tau \,|\, \boldsymbol{x}) \textrm{d}\tau \\  &\propto \int_0^\infty \tau^{n/2 - 1}\exp{\bigg( -\frac{\tau}{2}\sum_{i=1}^n{(x_i - \mu)^2}\bigg)} \textrm{d}\tau \\  &\propto \bigg( \sum_{i=1}^n{(x_i - \mu)^2} \bigg)^{-n/2} \int_0^\infty z^{n/2 - 1}\exp{-z}\textrm{d}\tau \\  &\propto \bigg( \sum_{i=1}^n{(x_i - \mu)^2} \bigg)^{-n/2} \\  \end{aligned}

Finally we can calculate

\displaystyle   \begin{aligned}  p(\mu \,|\, \boldsymbol{x}) &\propto \bigg( (n - 1)s^2 + n(\bar{x} - \mu)^2 \bigg)^{-n/2} \\  &\propto \bigg( 1 + \frac{n(\mu - \bar{x})^2}{(n - 1)s^2} \bigg)^{-n/2} \\  \end{aligned}

This is the non-standardized Student’s t-distribution t_{n-1}(\bar{x}, s^2/n).

Alternatively the marginal posterior of \mu is

\displaystyle   \frac{\mu - \bar{x}}{s/\sqrt{n}}\bigg|\, x \sim t_{n-1}

where t_{n-1} is the standard t distribution with n - 1 degrees of freedom.

The Model in Haskell

Following up on a comment from a previous blog post, let us try using the foldl package to calculate the length, the sum and the sum of squares traversing the list only once. An improvement on creating your own strict record and using foldl’ but maybe it is not suitable for some methods e.g. calculating the skewness and kurtosis incrementally, see below.

> x2Sum, xSum, n :: Double
> (x2Sum, xSum, n) = L.fold stats xs
>   where
>     stats = (,,) <$>
>             (L.premap (\x -> x * x) L.sum) <*>
>             L.sum <*>
>             L.genericLength

And re-writing the sample variance s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2 using

\displaystyle   \begin{aligned}  \sum_{i=1}^n (x_i - \bar{x})^2 &= \sum_{i=1}^n (x_i^2 - 2x_i\bar{x} + \bar{x}^2) \\  &= \sum_{i=1}^n x_i^2 - 2\bar{x}\sum_{i=1}^n x_i + \sum_{i=1}^n \bar{x}^2 \\  &= \sum_{i=1}^n x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 \\  &= \sum_{i=1}^n x_i^2 - n\bar{x}^2 \\  \end{aligned}

we can then calculate the sample mean and variance using the sums we have just calculated.

> xBar, varX :: Double
> xBar = xSum / n
> varX = n * (m2Xs - xBar * xBar) / (n - 1)
>   where m2Xs = x2Sum / n

In random-fu, the Gamma distribution is specified by the rate paratmeter, \beta.

> beta, initTau :: Double
> beta = 0.5 * n * varX
> initTau = evalState (sample (Gamma (n / 2) beta)) (pureMT 1)

Our sampler takes an old value of \tau and creates new values of \mu and \tau.

> gibbsSampler :: MonadRandom m => Double -> m (Maybe ((Double, Double), Double))
> gibbsSampler oldTau = do
>   newMu <- sample (Normal xBar (recip (sqrt (n * oldTau))))
>   let shape = 0.5 * n
>       scale = 0.5 * (x2Sum + n * newMu^2 - 2 * n * newMu * xBar)
>   newTau <- sample (Gamma shape (recip scale))
>   return $ Just ((newMu, newTau), newTau)

From which we can create an infinite stream of samples.

> gibbsSamples :: [(Double, Double)]
> gibbsSamples = evalState (ML.unfoldrM gibbsSampler initTau) (pureMT 1)

As our chains might be very long, we calculate the mean, variance, skewness and kurtosis using an incremental method.

> data Moments = Moments { mN :: !Double
>                        , m1 :: !Double
>                        , m2 :: !Double
>                        , m3 :: !Double
>                        , m4 :: !Double
>                        }
>   deriving Show
> moments :: [Double] -> Moments
> moments xs = foldl' f (Moments 0.0 0.0 0.0 0.0 0.0) xs
>   where
>     f :: Moments -> Double -> Moments
>     f m x = Moments n' m1' m2' m3' m4'
>       where
>         n = mN m
>         n'  = n + 1
>         delta = x - (m1 m)
>         delta_n = delta / n'
>         delta_n2 = delta_n * delta_n
>         term1 = delta * delta_n * n
>         m1' = m1 m + delta_n
>         m4' = m4 m +
>               term1 * delta_n2 * (n'*n' - 3*n' + 3) +
>               6 * delta_n2 * m2 m - 4 * delta_n * m3 m
>         m3' = m3 m + term1 * delta_n * (n' - 2) - 3 * delta_n * m2 m
>         m2' = m2 m + term1

In order to examine the posterior, we create a histogram.

> numBins :: Int
> numBins = 400
> hb :: HBuilder Double (Data.Histogram.Generic.Histogram V.Vector BinD Double)
> hb = forceDouble -<< mkSimple (binD lower numBins upper)
>   where
>     lower = xBar - 2.0 * sqrt varX
>     upper = xBar + 2.0 * sqrt varX

And fill it with the specified number of samples preceeded by a burn-in.

> hist :: Histogram V.Vector BinD Double
> hist = fillBuilder hb (take (nrep - nb) $ drop nb $ map fst gibbsSamples)

Now we can plot this.

And calculate the skewness and kurtosis.

> m :: Moments
> m = moments (take (nrep - nb) $ drop nb $ map fst gibbsSamples)
ghci> import Gibbs
ghci> putStrLn $ show $ (sqrt (mN m)) * (m3 m) / (m2 m)**1.5

ghci> putStrLn $ show $ (mN m) * (m4 m) / (m2 m)**2

We expect a skewness of 0 and a kurtosis of 3 + 6 / \nu - 4 = 3.4 for \nu = 19. Not too bad.

The Model in JAGS

JAGS is a mature, declarative, domain specific language for building Bayesian statistical models using Gibbs sampling.

Here is our model as expressed in JAGS. Somewhat terse.

model {
	for (i in 1:N) {
		x[i] ~ dnorm(mu, tau)
	mu ~ dnorm(0, 1.0E-6)
	tau <- pow(sigma, -2)
	sigma ~ dunif(0, 1000)

To run it and examine its results, we wrap it up in some R

## Import the library that allows R to inter-work with jags.

## Read the simulated data into a data frame.
fn <- read.table("", header=FALSE)

jags <- jags.model('example1.bug',
                   data = list('x' = fn[,1], 'N' = 20),
                   n.chains = 4,
                   n.adapt = 100)

## Burnin for 10000 samples
update(jags, 10000);

mcmc_samples <- coda.samples(jags, variable.names=c("mu", "sigma"), n.iter=20000)


And now we can look at the posterior for \mu.

The Model in STAN

STAN is a domain specific language for building Bayesian statistical models similar to JAGS but newer and which allows variables to be re-assigned and so cannot really be described as declarative.

Here is our model as expressed in STAN. Again, somewhat terse.

data {
  int<lower=0> N;
  real x[N];

parameters {
  real mu;
  real<lower=0,upper=1000> sigma;
  x     ~ normal(mu, sigma);
  mu    ~ normal(0, 1000);

Just as with JAGS, to run it and examine its results, we wrap it up in some R.


## Read the simulated data into a data frame.
fn <- read.table("", header=FALSE)

## Running the model
fit1 <- stan(file = 'Stan.stan',
             data = list('x' = fn[,1], 'N' = 20),
             pars=c("mu", "sigma"),


Again we can look at the posterior although we only seem to get medians and 80% intervals.


Write the histogram produced by the Haskell code to a file.

> displayHeader :: FilePath -> Diagram B R2 -> IO ()
> displayHeader fn =
>   mainRender ( DiagramOpts (Just 900) (Just 700) fn
>              , DiagramLoopOpts False Nothing 0
>              )
> main :: IO ()
> main = do
>   displayHeader "diagrams/DataScienceHaskPost.png"
>     (barDiag
>      (zip (map fst $ asList hist) (map snd $ asList hist)))

The code can be downloaded from github.

by Dominic Steinitz at April 09, 2014 11:43 AM

April 08, 2014

Ketil Malde

Some not-so-grand challenges in bioinformatics

Bioinformatics is a field in the intersection of biology, statistics, and computer science. Broadly speaking, biologists will plan an experiment, bioinformaticians will run the analysis, and the biologists will interpret the results. Sometimes it can be rather frustrating to be the bioinformatician squeezed into the middle, realizing that results are far from as robust as they should be.

Here, I try to list a number of areas in bioinformatics that I think are problematic. They are not in any way “grand challenges”, like computing protein structure or identifying non-trivial factors from GWAS analysis. These are more your everyday challenges that bioinformaticians need to be aware of, but are too polite to mention.

Importance of bioinformatics

High throughput sequencing (HTS) has rapidly become an indispensable tool in medicine, biology, and ecology. As a technology, it is crucial in addressing many of our most important challenges: health and disease, food supply and food production, and ecologoy. Scientifically, it has allowed us to reveal the mechanisms of living organisms in unprecedented detail and scale.

New technology invariably poses new challenges to analysis, and in addition to a thorough understanding of relevant biology, robust study design and analysis based on HTS requires a solid understanding of statistics and bioinformatics. A lack of this background often results in study design that is overly optimistic and ambitious, and in analysis that fails to take the many sources of noise and error into account.

Specific examples

The typical bioinformatics pipeline is (as implied by the name) structured as a chain of operations. The tools used will of course not be perfect, and they will introduce errors, artifacts, and biases. Unfortunately, the next stage in the pipeline usually works from the assumption that the results from the previous stage are correct, and errors thus propagate and accumulate - and in the end, it is difficult to say whether results represent real, biological phenomena, or are simply artifacts of the analysis. What makes matters worse, is that the pipeline is often executed by scientists with a superficial knowledge of the errors that can occur. (A computer said it, so it must be true.)

Gene prediction and transcript assembly

One important task in genomics is identifying the genes of new organisms. This can be performed with a variety of tools, and in one project, we used both ab initio gene prediction (MAKER and Augustus) as well as RNAseq assembly (CLC, abyss). The results varied substantially, with predicted gene counts ranging from 16000 to 45000. To make things worse, clustering revealed that there was not a simple many-to-one relationship between the predicted genes. Clearly, results from analysis of e.g. GO-enrichment or expansion and contraction of gene families will be heavily dependent on which set of transcripts one uses.

Expression analysis

Quantifying the expression of specific genes is an important tool in many studies. In addition to the challenges of acquiring a good set of transcripts, expression analysis based on RNAseq introduces several other problems.

The most commonly used measure, reads (or fragments) per kilobase per million (RPKM), has some statistical problems.

In addition, mapping is challenging, and in particular, mapping reads from transcripts to a genome will struggle with intron/exon boundaries. Of course, you can map to the transcriptome instead, but as we have seen, the reliability of reference transcriptomes are not always as we would like.

Reference genomes

The main goal of a de novo genome project is to produce a correct reference sequence for the genome. Although in many ways a simpler task than the corresponding transcriptome assembly, there are many challenges to overcome. In reality, most genome sequences are fragmented and contain errors. But even in the optimal case, a reference genome may not be representative.

For instance, a reference genome from a single individual is clearly unlikely to accurately model sex chromosomes of the other sex (but on the other hand, similarities between different sex chromosomes will likely lead to a lot of incorrectly mapped reads). But a single individual is often not representative even for other individuals of the same sex. E.g., an obvious method for identifying a sex chromosome is to compare sequence data from one sex to the draft reference (which in our case happened to be from an individual of the other sex) Unfortunately, it turned out that the marker we found to be absent in the reference - and thus inferred to be unique to the other sex – also occurred in 20% of the other sex. Just not in the sequenced individual.

So while reference genomes may work fairly well for some species, it is less useful for species with high genomic variation (or with B-chromosomes), and simply applying methods developed for mammals to different species will likely give misleading and/or incorrect results.

Genome annotation

I recently attended an interesting talk about pseudogenes in various species. One thing that stuck in my mind, was that the number of pseudogenes in a species is highly correlated with the institution responsible for the gene prediction. Although this is just an observation and not a scientific result, it seems very likely that different institutes use in-house pipelines with varying degrees of sensitivity/specificity. Thus what one pipeline would identify as a real gene, a more conservative pipeline would ignore as a pseudogene. So here too, our precious curated resources may be less valuable than you thought.

Population genomics

In population genomics, there are many different measures of diversity, but the most commonly used is FST. Unfortunately, the accuracy of estimating FST for a SNP is highly dependent on coverage and error rate, and a simulation study I did showed that with the coverage commonly used in projects, the estimated FST has a large variance.

In addition, the usual problems apply: the genome may not be complete, correct or representative. And even if it were, reads could still be mapped incorrectly. And it is more likely to fail in variable regions, meaning you get less accurate results exactly in the places it matters most.

Variant analysis

Variant analysis suffers from many of the same challenges as population genomics, genome assembly and mapping being what they are. Again, estimated allele frequencies tend to be wildly inaccurate.

Structural variants (non-trivial indels, transpositions, copy number variation, etc) are very common, but difficult to detect accurately from mapped reads. In many cases, such features will just result in lower mapping coverage, further reducing your chances of identifying them.

Curation is expensive

Analysis is challenging even under the best of conditions – with high quality data and using a good reference genome that is assembled and annotated and curated by a reputable institution or heavyweight consortium. But in many cases, data quality is poor and we don’t even have the luxury of good references.

A de novo genome project takes years and costs millions. This is acceptable for species crucial to medicine, like human or mouse, and for food production, like cow or wheat. These are sequenced by large multi-national consortia at tremendous expense. But while we will eventually get there for the more important species, the large majority of organisms – the less commercially or scientifically interesting ones – will never have anything close to this.

Similarly, the databases with unannotated sequences (e.g. UniProt) grow at a much quicker pace than the curated databases (e.g. SwissProt), simply because identifying the function of a protein in the lab takes approximately one post doc., and nobody has the incentives or manpower to do this.


So, what is the take-home message from all of this? Let’s take a moment to sum up:

Curation is less valuable than you think

Everybody loves and anticipates finished genomes, complete with annotated transcripts and proteins. But there’s a world of difference between the finished genomes of widely studied species and the draft genomes of the more obscure ones. Often, the reference resources are incomplete and partially incorrect, and sometimes the best thing that can be said for them, is that as long as we all use it our results will be comparable because we all make the same errors.

Our methods aren’t very robust

Genome assembly, gene annotation, transcriptome assembly, and functional annotation is difficult. Anybody can run some assembler or gene predictor and get a result, but if you run two of them, you will get two different results.

Sequence mapping is not very complicated (at a conference, somebody claimed to have counted over ninety programs for short read mapping), and works well if you have a good reference genome, if you don’t have too many repeats, and if you don’t have too much variation in your genome. In other words, everywhere you don’t really need it.

Dealing with the limitations

Being everyday challenges, I don’t think any of this is insurmountable, but I’ll save that for a later post. Or grant application. In the meantime, do let me know what you think.

April 08, 2014 05:00 PM

Yesod Web Framework

Proposal: Changes to the PVP

As I mentioned two posts ago, there was a serious discussion on the libraries mailing list about the Package Versioning Policy (PVP).

This blog post presents some concrete changes I'd like to see to the PVP to make it better for both general consumers of Hackage, and for library authors as well. I'll start off with a summary of the changes, and then give the explanations:

  1. The goal of the PVP needs to be clarified. Its purpose is not to ensure reproducible builds of non-published software, but rather to provide for more reliable builds of libraries on Hackage. Reproducible builds should be handled exclusively through version freezing, the only known technique to actually give the necessary guarantees.

  2. Upper bounds should not be included on non-upgradeable packages, such as base and template-haskell (are there others?). Alternatively, we should establish some accepted upper bound on these packages, e.g. many people place base < 5 on their code.

  3. We should be distinguishing between mostly-stable packages and unstable packages. For a package like text, if you simply import Data.Text (Text, pack, reverse), or some other sane subset, there's no need for upper bounds.

    Note that this doesn't provide a hard-and-fast rule like the current PVP, but is rather a matter of discretion. Communication between library authors and users (via documentation or other means) would be vital to making this work well.

  4. For a package version A.B.C, a bump in A or B indicates some level of breaking change. As an opt-in approach, package authors are free to associated meaning to A and B beyond what the PVP requires. Libraries which use these packages are free to rely on the guarantees provided by package authors when placing upper bounds.

    Note that this is very related to point (3).

While I (Michael Snoyman) am the author of this proposal, the following people have reviewed the proposal and support it:

  • Bryan O'Sullivan
  • Felipe Lessa
  • Roman Cheplyaka
  • Vincent Hanquez

Reproducible builds

There are a number of simple cases that can result in PVP-compliant code not being buildable. These aren't just hypothetical cases; in my experience as both a package author and Stackage maintainer, I've seen these come up.

  • Package foo version 1.0 provides an instance for MonadFoo for IO and Identity. Version 1.1 removes the IO instance for some reason. Package bar provides a function:

    bar :: MonadFoo m => Int -> m Double

    Package bar compiles with both version 1.0 and 1.1 of foo, and therefore (following the PVP) adds a constraint to its cabal file foo >= 1.0 && < 1.2.

    Now a user decides to use the bar package. The user never imports anything from foo, and therefore has no listing for foo in the cabal file. The user code depends on the IO instance for MonadFoo. When compiled with foo 1.0, everything works fine. However, when compiled with foo 1.1, the code no longer compiles.

  • Similarly, instead of typeclass instances, the same situation can occur with module export lists. Consider version 1.0 of foo which provides:

    module Foo (foo1, foo2) where

    Version 1.1 removes the foo2 export. The bar package reexports the entire Foo module, and then a user package imports the module from bar. If the user package uses the foo2 function, it will compile when foo-1.0 is used, but not when foo-1.1 is used.

In both of these cases, the issue is the same: transitive dependencies are not being clamped down. The PVP makes an assumption that the entire interface for a package can be expressed in its version number, which is not true. I see three possible solutions to this:

  1. Try to push even more of a burden onto package authors, and somehow make them guarantee that their interface is completely immune to changes elsewhere in the stack. This kind of change was proposed on the libraries list. I'm strongly opposed to some kind of change like this: it makes authors' lives harder, and makes it very difficult to provide backwards compatibility in libraries. Imagine if transformers 0.4 adds a new MonadIO instance; the logical extreme of this position would be to disallow a library from working with both transformers 0.3 and 0.4, which will split Hackage in two.

  2. Modify the PVP so that instead of listing just direct dependencies, authors are required to list all transitive dependencies as well. So it would be a violation to depend on bar without explicitly listing foo in the dependency list. This will work, and be incredibly difficult to maintain. It will also greatly increase the time it takes for a new version of a deep dependency to be usable due to the number of authors who will have to bump version bounds.

  3. Transfer responsibility for this to package users: if you first built your code against foo 1.0, you should freeze that information and continue building against foo 1.0, regardless of the presence of new versions of foo. Not only does this increase reproducibility, it's just common sense: it's entirely possible that new versions of a library will introduce a runtime bug, performance regression, or even fix a bug that your code depends on. Why should the reliability of my code base be dependent on the actions of some third party that I have no control over?

Non-upgradeable packages

There are some packages which ship with GHC and cannot be upgraded. I'm aware of at least base and template-haskell, though perhaps there are others (haskell98 and haskell2010?). In the past, there was good reason to place upper bounds on base, specifically with the base 3/4 split. However, we haven't had that experience in a while, and don't seem to be moving towards doing that again. In today's world, we end up with the following options:

  • Place upper bounds on base to indicate "I haven't tested this with newer versions of GHC." This then makes it difficult for users to test out that package with newer versions of GHC.
  • Leave off upper bounds on base. Users may then try to install a package onto a version of GHC on which the package hasn't been tested, which will result in either (1) everything working (definitely the common case based on my Stackage maintenance), or (2) getting a compilation error.

I've heard two arguments to push us in the direction of keeping the upper bounds in this case, so I'd like to address them both:

  • cabal error messages are easier to understand than GHC error messages. I have two problems with that:
    • I disagree: cabal error messages are terrible. (I'm told this will be fixed in the next version of cabal.) Take the following output as a sample:

      cabal: Could not resolve dependencies:
      trying: 4Blocks-0.2
      rejecting: base- (conflict: 4Blocks => base>=2 && <=4)
      rejecting: base-,,,,,,,,,,,,,, (global
      constraint requires installed instance)

      I've seen a number of users file bug reports not understanding that this message means "you have the wrong version of GHC."

    • Even if the error messages were more user-friendly, they make it more difficult to fix the actual problem: the code doesn't compile with the new version of GHC. Often times, I've been able to report an error message to a library author and, without necessarily even downloading the new version of GHC, he/she has been able to fix the problem.

  • Using upper bounds in theory means that cabal will be able to revert to an older version of the library that is compatible with the new version of GHC. However, I find it highly unlikely that there's often- if ever- a case where an older version of a library is compatible with a later version of GHC.

Mostly-stable, and finer-grained versioning

I'll combine the discussion of the last two points. I think the heart of the PVP debates really comes from mostly-stable packages. Let's contrast with the extremes. Consider a library which is completely stable, never has a breaking change, and has stated with absolute certainty that it never will again. Does anyone care about upper bounds on this library? They're irrelevant! I'd have no problem with including an upper bound, and I doubt even the staunchest PVP advocates would really claim it's a problem to leave it off.

On the other hand, consider an extremely unstable library, which is releasing massively breaking changes on a weekly basis. I would certainly agree in that case that an upper bound on that library is highly prudent.

The sticking point is the middle ground. Consider the following code snippet:

import Data.Text (Text, pack)

foo :: Text
foo = pack "foo"

According to the PVP as it stands today, this snippet requires an upper bound of < 1.2 on the text package. But let's just play the odds here: does anyone actually believe there's a real chance that the next iteration of text will break this code snippet? I highly doubt it; this is a stable subset of the text API, and I doubt it will ever be changing. The same can be said of large subsets of many other packages.

By putting in upper bounds in these cases, we run a very real risk of bifurcating Hackage into "those demanding the new text version for some new feature" vs "those who haven't yet updated their upper bounds to allow the new version of text."

The PVP currently takes an extremely conservative viewpoint on this, with the goal of solving just one problem: making sure code that compiles now continues to compile. As I demonstrated above, it doesn't actually solve that problem completely. And in addition, in this process, it has created other problems, such as this bifurcation.

So my proposal is that, instead of creating rigid rules like "always put an upper bound no matter what," we allow some common sense into the process, and also let package authors explicitly say "you can rely on this API not changing."

April 08, 2014 03:55 PM

Noam Lewis

Generate Javascript classes for your .NET types

We open-sourced another library: ClosureExterns.NET (on github and nuget). It generates Javascript classes from .NET-based backend types, to preserve type “safety” (as safe as Javascript gets) across front- and backend. As a bonus you get Google closure annotations. The type annotations are understood by WebStorm (and other editors) and improve your development experience. Also, if you use Google Closure to compile or verify your code, it will take these types into account. We use it extensively with C#. We haven’t tried it with F#, but it’s supposed to work with any .NET type.

ClosureExterns.NET makes it easier to keep your frontend models in sync with your backend. The output is customizable – you can change several aspects of the generated code. For example you can change the constructor function definition, to support inheritance from some other Javascript function. For more details see ClosureExternOptions.

Getting Started

First, install it. Using nuget, install the package ClosureExterns.

Then, expose a method that generates your externs. For example, a console application:

public static class Program
    public static void Main()
        var types = ClosureExternsGenerator.GetTypesInNamespace(typeof(MyNamespace.MyType));
        var output = ClosureExternsGenerator.Generate(types);

You can also customize the generation using a ClosureExternsOptions object.

Example input/output


class B
    public int[] IntArray { get; set; }


var Types = {};

// ClosureExterns.Tests.ClosureExternsGeneratorTest+B
/** @constructor
Types.B = function() {};
/** @type {Array.<number>} */
Types.B.prototype.intArray = null;

For a full example see the tests.

Tagged: .NET, c#, Javascript

by sinelaw at April 08, 2014 03:36 PM

Danny Gratzer

Bargain Priced Coroutines

Posted on April 8, 2014

The other day I was reading the 19th issue of the Monad.Reader and there was a fascinating post on coroutines.

While reading some of the code I noticed that it, like most things in Haskell, can be reduced to 5 lines with a library that Edward Kmett has written.

Consider the type of a trampoline as described in this article

    newtype Trampoline m a = Tramp {runTramp :: m (Either (Tramp m a) a)}

So a trampoline is a monadic computation of some sort returning either a result, a, or another computation to run to get the rest.

Now this looks strikingly familiar. A computation returning Trampoline m a is really a computation returning a tree of Tramp m a’s terminating in a pure value.

This sounds like a free monad!

    import Control.Monad.Trans.Free
    import Control.Monad.Identity
    type Trampoline = FreeT Identity

Recall that FreeT is defined as

    data FreeF f a b = Pure a | Free (f b)
    data FreeT f m a = FreeT (m (FreeF f a (FreeT f m a)))

This is isomorphic to what we where looking at before. As an added bonus, we’ve saved the tedium of defining our own monad and applicative instance for Trampoline.

We can now implement bounce and pause to define our trampolines. bounce must take a computation and unwrap it by one level, leaving either a value or another computation.

This is just a matter of rejiggering the FreeF into an Either

    bounce :: Functor m => Trampoline m a -> m (Either (Trampoline m a) a)
    bounce = fmap toEither . runFreeT
      where toEither (Pure a) = Right a
            toEither (Free m) = Left $ runIdentity m

pause requires some thought, the trick is to realize that if we wrap a computation in one layer of Free when unwrapped by bounce we’ll get the rest of the computation.


    pause :: Monad m => Trampoline m ()
    pause = FreeT $ return (Free . Identity $ return ())

So that’s 6 lines of code for trampolines. Let’s move on to generators.

A generator doesn’t yield just another computation, it yields a pair of a computation and a freshly generated value. We can account for this by changing that Identity functor.

    type Generator c = FreeT ((,) c)

Again we get free functor, applicative and monad instances. We two functions, yield and runGen. Yield is going to take one value and stick it into the first element of the pair.

    yield :: Monad m => g -> Generator g m ()
    yield g = FreeT . return $ Free (g, return ())

This just sticks a good old boring m () in the second element of the pair.

Now runGen should take a generator and produce a m (Maybe c, Generator c m a). This can be done again by pattern matching on the underlying FreeF.

    runGen :: (Monad m, Functor m) => Generator g m a -> m (Maybe g, Generator g m a)
    runGen = fmap toTuple . runFreeT
      where toTuple (Pure a)         = (Nothing, return a)
            toTuple (Free (g, rest)) = (Just g, rest)

Now, last but not least, let’s build consumers. These wait for a value rather than generating one, so -> looks like the right functor.

    type Consumer c = FreeT ((->) c)

Now we want await and runCon. await to wait for a value and runCon to supply one. These are both fairly mechanical.

    runConsumer :: Monad m => c -> Consumer c m a -> m a
    runConsumer c = (>>= go) . runFreeT
      where go (Pure a) = return a
            go (Free f) = runConsumer c $ f c

    runCon :: (Monad m, Functor m)
        => Maybe c
        -> Consumer c m a
        -> m (Either a (Consumer c m a))
    runCon food c = runFreeT c >>= go
      where go (Pure a) = return . Left $ a
            go (Free f) = do
              result <- runFreeT $ f food
              return $ case result of
                Pure a -> Left                   $ a
                free   -> Right . FreeT . return $ free

runCon is a bit more complex than I’d like. This is to essentially ensure that if we had some code like

    Just a <- await
    lift $ do
    Just b <- await

We want foo, bar, and baz to run with just one await. You’d expect that we’d run as much as possible with each call to runCon. Thus we unwrap not one, but two layers of our FreeT and run them, then rewrap the lower layer. The trick is that we make sure never to duplicate side effects by using good old return.

We can sleep easy that this is sound since return a >>= f is f a by the monad laws. Thus, our call to return can’t do anything detectable or too interesting.

While this is arguably more intuitive, I don’t particularly like it so we can instead write

    runCon :: (Monad m, Functor m)
        => Maybe c
        -> Consumer c m a
        -> m (Either a (Consumer c m a))
    runCon food = fmap go . runFreeT
      where go (Pure a) = Left a
            go (Free f) = Right (f food)

Much simpler, but now our above example wouldn’t run foo and friends until the second call of runCon.

Now we can join generators to consumers in a pretty naive way,

    (>~>) :: (Functor m, Monad m) => Generator c m () -> Consumer c m a -> m a
    gen >~> con = do
      (cMay, rest) <- runGen gen
      case cMay of
        Nothing -> starve con
        Just c  -> runCon c con >>= use rest
      where use _    (Left a)  = return a
            use rest (Right c) = rest >~> c

And now we can use it!

    addGen :: Generator Int IO ()
    addGen = do
      lift $ putStrLn "Yielding 1"
      yield 1
      lift $ putStrLn "Yielding 2"
      yield 2
    addCon :: Consumer Int IO ()
    addCon = do
      lift $ putStrLn "Waiting for a..."
      Just a <- await
      lift $ putStrLn "Waiting for b..."
      Just b <- await
      lift . print $ a + b

    main = addGen >~> addCon

When run this prints

    Yielding 1
    Waiting for a...
    Yielding 2
    Waiting for b...

Now, this all falls out of playing with what functor we give to FreeT. So far, we’ve gotten trampolines out of Identity, generators out of (,) a, and consumers out of (->) a.

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

April 08, 2014 12:00 AM

April 05, 2014

Lennart Augustsson

Haskell error reporting with locations, update

Since some people (I'm among them) dislike impure features in Haskell I thought I'd present a slight variation on the error location feature that is "pure".

First, the __LOCATION__ variable gets an abstract type. So

  data Location
  __LOCATION__ :: Location
It's defined in the Prelude and always in scope. The type cannot be compared, shown, or anything. There's just one thing that can be done, namely:
  extractLocation :: Location -> IO String
The error function needs a new exception to throw
  data ErrorCallLoc = ErrorCallLoc Location String

  error :: String -> a
  error s = throw (ErrorCallLoc __LOCATION__ s)
This means that the location string cannot be used when we throw the error. But it can be used where the error is caught, since this can only be done in the IO monad.

Under the hood the everything is just as before, Location is just a string. It just can't be manipulated except in the IO monad, so we can pretend it's pure.

  newtype Location = Location String
  extractLocation (Location s) = return s
It now looks a lot like Michael Snoyman's proposal.

by augustss ( at April 05, 2014 07:53 PM

Philip Wadler

Dan Piponi (sigfpe)

This is a test


I like to see connections between different branches of mathematics. Here's an application of the same technique in two completely different areas that I haven't seen in the literature. It might take a bit of work to convince you that they are in fact the same thing, so here goes.

The dual of an optimization problem

Consider the optimization problem
$$ \begin{aligned} \mbox{min} & f(x) \\ \mbox{subject to} & x\le a\\ \end{aligned} $$
illustrated below and where $f$ is convex:

The essential features are there in this problem despite it being the most basic problem with an inequality.

Now consider the optimum given as a function of $a$. So $f_+(a)=\inf_{x\le a}f(a)$. You can think of $f_+$ as representing the "best value of $f$ so far". Its graph looks like this:

It's basically like the original $f$ except that we only have the descending parts and none of the ascending parts.

Now we're going to construct $f_+$ by a slightly roundabout way, not by considering the function itself, but its epigraph, the set of points above the graph. Here's the epigraph for the original $f$:

Notice I've drawn a bunch of lines just touching the epigraph, ie. its subtangents. In general, any convex shape can be expressed as the intersection of half-planes. Those subtangents indicate a handful of the boundaries of those half planes. We can construct those subtangents by considering the lines $y=\lambda x$ and then pushing those lines up or down so each is high as possible without crossing $f$. We can find out how much pushing we need by solving the optimisation:
$$ \begin{aligned} \mbox{maximise} & \lambda x-f(x) \end{aligned} $$
For any $\lambda$ we call the optimal value $f^\ast(\lambda)$. $f^\ast$ is also known as the Legendre transform or Fenchel dual. It's basically telling us which half-planes define our epigraph.

Interestingly, reconstructing the function $f$ from $f^\ast$ requires exactly the same expression with $f^\ast$ substituted for $f$. So the Fenchel dual also takes a collection of half-planes and gives you back the original function.

Now we construct the epigraph of $f_+$. We just want the descending parts of $f$. It's easy to construct the epigraph, we take the intersection of just those half-planes whose boundaries are decreasing. Here's a picture:

So here's a process for constructing $f_+$: we take the Fenchel dual of $f$, we throw away the "half" where $\lambda\le0$, and now take the inverse Fenchel dual (which is in fact just the Fenchel dual).

Now let's go back to the original problem and construct its dual as described in numerous textbooks. We form the Lagrangian
$$L(a,x,\lambda)=f(x)+\lambda (x-a)$$
We then form the function
$$g(a,\lambda)=\min_x L(a,x,\lambda)$$
This is basically (modulo a sign) the Fenchel dual of $f$ with an extra $-\lambda a$ term. The dual problem, as in the textbooks, is
$$ \begin{aligned} \mbox{maximise} & g(a,\lambda)\\ \mbox{such that} & \lambda\ge0\\ \end{aligned} $$
Remembering that $-\lambda a$ term, that's almost the inverse Fenchel dual except we've maximised over just "half" of the $\lambda$s, those for which $\lambda\ge0$. When we compute the optimum of the dual problem we get $h(x)=\max_x g(a,x)$. If the problem is convex, we've just computed $f_+$ by another path. So this gives what I think is a clear picture of what the dual problem is. Forming the dual problem is (modulo a vertical shift) converting the original problem into finding the subtangents. Solving the dual problem is converting that back to a function again, but using only the downward sloping lines. So this illustrates the primal-dual theorem. We get the same result in both the primal and dual problems as the dual problem is simply solving our original problem via epigraphs.

The Riemann-Hilbert problem and the Hilbert Transform

Here's my latest toy:
It's a software defined radio I bought for $13 on Amazon. Here's what it does.

You have an electromagnetic field in the vicinity of the antenna. Let's pretend such fields, $\phi$ are scalars. Just about any kind of radio uses a carrier modulated in some way. For example AM radio encodes the audio (represented as a function of time, A) as something like
$$\phi=A(t)cos(\omega t)$$

by Dan Piponi ( at April 05, 2014 01:05 AM

April 04, 2014

FP Complete

Calculating the Minimum Variance Portfolio in R, Pandas and IAP


As part of producing a demo for FP Complete's new IAP product, I wound up implementing the Minimum Variance Portfolio calculation for a stock portfolio in R, then in Haskell for the IAP, and finally in Python using the NumPy and SciPy extension libraries. I want to look at the process of writing each of these, and the resulting code in this article. I am not an expert in these things, so if you know a better way to do them, please let me know. In particular, the R example was borrowed from someone else.

The function

First, we find a version of the formula for MVP that can be converted into those systems. I like the one used by Yue Kuen KWOK:

ω = Ω⁻¹ 1/ 1 ⋅ Ω⁻¹ 1

where Ω is the covariant matrix of the returns for the stocks in question.

R version

The R version is fairly straightforward - we just need to put the pieces together:

minvar <- function (px){
    Rt <- px[-1,]/px[-nrow(px),]-1
    cov.Rt <- cov(Rt)
    one.vec <- rep(1,nrow(cov.Rt))
    num <- solve(cov.Rt) %*% one.vec
    den <- as.numeric(t(one.vec) %*% num)

Calculating returns can be done with standard matrix operations and slicing. The covariant function is built in, as is inverting it (solve). Since the numerator Ω⁻¹ 1 appears in the denominator, I reuse it's value there.

All the array operations were documented in the same place. That I only needed one unit vector was a bit of a surprise, but R sized it dynamically to work. That I had to transpose the unit vector and use the cross product operator (%*%) to get a dot product was a a less pleasant surprise, but is apparently a standard R idiom.

Python version

The Python version is almost a direct copy of the R version.

def minvar(prices):
    cov = np.cov((prices[1:] / prices[:-1] - 1).transpose())
    vu = np.array(cov.shape[1] * [1], float)
    num =, vu)
    den =, num)
    return num / den

In this case, I passed the returns matrix to the covariant function directly. The NumPy dot function performs both cross products and dot products, and again the unit vector adopts it's length dynamically.

Documentation was a bit more scattered. Being a mix of traditional imperative and object-oriented, some functions are functions in the module, and others are object methods. The biggest surprise was that the returns matrix needed to be transposed before the covariant was calculated.

Haskell version

Haskell is not quite as obvious a translation of the R and Python versions, but is a more straightforward translation of the original formula - once you notice that Ω⁻¹ 1 has been factored into tv. It reads from top to bottom like the original, with the main formula at the top and the various terms defined below.

Returns again use the standard Haskell idiom for slicing the array. This is a bit more verbose than either R or Python, as they are functions rather than special syntax.

minVariance prices = tv / scalar (tvu <.> tv)
    where rets = dropRows 1 prices / takeRows (rows prices - 1) prices - 
          (_, cov) = meanCov $ rets
          vu = constant 1.0 (cols cov)
          tvu = constant 1.0 (rows cov)
          tv = inv cov <> vu

The documentation was again straightforward, with everything being a function in the hmatrix package. In this case, both unit vectors were needed, as Haskell does not scale them automatically. It was the least surprising in at least one way - it used a distinct dot product operator for the two vectors rather than transposing - whether implicitly like Python or explicitly like R - the unit vector in a cross product.


While performance comparisons with IAP aren't very useful, as it runs in the cloud so doing comparisons on identical systems may be impossible, Haskell does have one interesting advantage.

All three of these systems have the same basic architecture - a high-level language running in an interactive environment, with bindings to low-level, fast implementations of the matrix manipulations. Haskell adds the ability to compile your program into native x86_64 code. Doing so reduced the wall clock time of this short demo by roughly two orders of magnitude.


I found the IAP version a little easier to deal with. Having custom operators and functions for everything - and this will only get better with time - along with being able to use the mathematical layout of the where statement just made things a little easier to deal with. While not having unit vectors that automatically size themselves - or transpose into matrices - is a little inconvenient, this also exposed a problem in the original R version, in that the unit vector's length was initially set wrong. I'm not sure that will make any real difference, but the thought that the language can catch such errors for me is comforting.

April 04, 2014 08:07 PM

Lennart Augustsson

Haskell error reporting with locations

Error reporting in GHC is not always the nicest. For example, I often develop code by using undefined as a placeholder for code I've not written yet. Here's a simple example:
import System.Environment
  main = do
    args <- getargs
    if null args then
Running this looks like this:
$ ./H
  H: Prelude.undefined
Which undefined caused that? Looking at the error message we have no idea. Wouldn't it be nice with some location information?

We can actually get location information by using Control.Exception.assert:

import Control.Exception(assert)
  import System.Environment

  main = do
    args <- getargs
    if null args then
      assert False undefined
      assert False undefined
Now running it is much more informative:
$ ./H
  H: H.hs:7:9-14: Assertion failed
How is assert able to report the location? If we dig deep enough we discover that it's because the ghc compiler contains a special hack to recognize this function and give it location information.

A generalized hack

In a Haskell compiler that I've implemented I've taken this compiler hack and extended it so it can be used for any function.  It comes in two parts, location information and location transparent definitions.


The __LOCATION__ identifier is always defined and utterly magical. Its value is a string that describes the location of that very identifier. This is the very opposite of a referentially transparent name. In fact it's value varies with where it is placed in the code! So it's definitely not for purists. But I'm a practical man, so I sometimes have resort of the ugliness of reality. And in reality we want to report locations in errors.

Enough philosophy, here's an example:

main = do
    print __LOCATION__
    print   __LOCATION__
And running it prints:
And to illustrate the impurity:
main = do
    let loc = __LOCATION__
    print loc
    print loc
And running this:

Location transparency

The __LOCATION__ identifier gives the location of itself. This is of little use on its own. Imagine the definition we could give for undefined. Somewhere in the Prelude module it could say something like
undefined = error ("undefined: " ++ __LOCATION__)
But if we use this all that it will tell us is where the definition of undefined is, not where it was used.

To get the point of use instead of the definition I've introduced location transparent definitions. In a location transparent definition the __LOCATION__ identifier will not refer to its own position, but to the position of the reference to the definition. Location transparency is introduced with a pragma.

  undefined = error ("undefined: " ++ __LOCATION__)
With this definition our initial example looks like this when we run it:
undefined: test/H.hs:6:9
In fact, the real definition of undefined doesn't look like that. The __LOCATION__ identifier is only used in the definition of error, so it looks something like this:
  error :: String -> a
  error s = throw (ErrorCall (__LOCATION__ ++ ": " ++ s))

  {-# LOCATIONTRANSPARENT undefined #-}
  undefined = error "undefined"
Since both error and undefined are transparent any use of undefined will be reported with the location of the use.

Furthermore, we can make a few more functions location transparent, e.g.,

  head :: [a] -> a
  head [] = error "Empty list"
  head (x:xs) = x
A simple example:
main = putStr (head [])
Which will print:
test/Head.hs:1:16: Empty list
which is the location where head was called.


There are different ways to implement this feature, and I'm going to sketch two of them.

First: Every function that has the LOCATIONTRANSPARENT pragma will be inlined at the point of use, and the __LOCATION__ identifier in the inlined code will be updated to reflect the call site. The definitions must be processed in a bottom-up fashion for this to work. It's fairly simple to implement, but will cause some code bloat due to inlining.

Second: Every function that has LOCATIONTRANSPARENT pragma will be rewritten (by the compiler) to have an extra location argument, and each use of this function will be rewritten to pass in the current location. For example (using $$f for the location version of f):

main = putStr ($$head __LOCATION__ [])

  $$head __LOCATION__ [] = $$error __LOCATION__ "Empty list"
  $$head __LOCATION__ (x:xs) = x
  $$error __LOCATION__ s = throw (ErrorCall (__LOCATION__ ++ ": " ++ s))
This should be fairly straightforward to implement, but I've not tried it. (It's somewhat like dynamic binding, so maybe ghc could reuse that mechanism for locations.)

And, of course, the global __LOCATION__ identifier has to be recognized by the compiler and replaced by a string that is its location.


I implemented the __LOCATION__ hack quite a while ago, and I like the much improved reporting of error locations. I hope someone will add it to ghc as well.

by augustss ( at April 04, 2014 07:28 AM

April 03, 2014

Thiago Negri

Premature optimization: it's not just about code

When we talk about performance, we shouldn't look only to performance of code execution. There are more types that we should care about, like time to market, development time and processes. Actually, we shouldn't look into any of the performance measures at a single perspective. As I see, all these measures sums up to constitute a single value of business performance.

There are lots of things that can slow down your business, and if you have a slow business, you'll have a hard time to keep pace with your rivals. As you discover that some areas of your company may be slowing you down, you may become tempted to add key performance indicators to each step trying to squeeze all possible value out of each part of your processes. This can bring up a whole new set of problems, your business can be seen as a complex system and it will adapt itself to render good results even in chaotic scenarios. [1]

So, to avoid going nuts with indicators all over the place, you decide to avoid bottlenecks from the start. To each and every new thing you need to deliver, you'll start a precise planning routine, choosing among a plethora of providers, technologies and unicorns to avoid at all costs to have a new bottleneck in the future. This is what I like to call premature optimization in business performance.

Simply put: you can't have a business if you don't have a business. How can you possibly know all the events that will happen to have an impact in the decision you take today? You can't.I'm not saying that planning is wrong or a Bad Thing™. What I really want to avoid is losing energy on stuff that won't make a difference. You may spend a whole year choosing between apples and oranges, and in the end be crushed by some weirdo playing around with random technology. Why? Because he was not worried about future bottlenecks. He was really trying to do something cool, and he did it while you were arguing in endless and purposeless meetings.

You're always trading performance measurements. For example, everyone is talking about service oriented architecture (SOA), but what does it really means in terms of business? It means a tradeoff between code execution performance and others benefits to the business as a whole, like decentralized grow and continuous delivery. Or, you may look at the difference between Apple's App Store and Google's Play Store model. There is a huge tradeoff of quality assurance and time to market between these two players: one offers their customers (smartphone owners), quality assured apps; the other offers their customers (operating system users), fast time to market for their apps.

The big question really is: what are you trying to deliver to your customer? Are you trying to deliver software over time or value over time? I bet most of you are trying to deliver value over time, so why bother with premature optimization of technologies or processes? Let the bad stuff to die on its own: failing fast is orders of magnitude better than not doing anything at all. And let the good stuff to accompany you to where you want to go.

Don't bother guessing the future, embrace the uncertainty of life: you can't predict how a complex system will behave, you can't know how your systems, services or whatever you are doing will be used in the real world. Put it into the wild, take the complaints and tune it (or throw it away) afterwards, this is a constant process. [2]

Start measuring how much you are delivering over time and stop over-planning. Your end goal is to deliver stuff. The world couldn't care less about what you can do, it only cares about what you do.


  1. Complex Systems and Key Performance Indicators - Márcio Geovani Jasinski
  2. Complaint-Driven Development - Jeff Atwood

by Thiago Negri ( at April 03, 2014 04:46 PM

Daniil Frumin

Heads up: scotty-hastache 0.2.1

To accommodate for the release of hastache-0.6, I’ve updated the scotty-hastache package to the version 0.2.1.

Tagged: haskell, hastache, scotty, web

by Dan at April 03, 2014 01:59 PM

Lennart Augustsson

A small Haskell extension

The extension

In Haskell you can give a type to an expression by writing expr ::  type.  To an untrained eye the :: looks just like an infix operator, even if it is described as a special syntactical construct in the Haskell report.  But let's think about it as an infix operator for a moment.

For an infix operator you you can for a section, i.e., a use of the operator with one operand left out.  For instance (* 2) leaves out the first operand, and Haskell defines this to be the same as (\ x -> x * 2).  Regarding :: as an operator we should be able to write (:: type) and it should have the obvious meaning (\ x -> x :: type).

I suggest, and I plan sending the haskell-prime mailing list, Haskell should adopt this small extension.


First, the extension is very light weight and has almost no extra intellectual weight for anyone learning Haskell.  I'd argue it makes the language simpler because it allows :: to be treated more like an infix operator.  But without use cases this would probably not be enough of an argument.

Example 1

We want to make a function, canonDouble, that takes a string representing a Double and changes it to the standard Haskell string representing this Double.  E.g. canonDouble "0.1e1" == "1.0".  A first attempt might look like this:

  canonDouble :: String -> String
  canonDouble = show . read         -- WRONG!

This is, of course, wrong since the compiler cannot guess that the type between read and show should be a Double.  We can convey this type information in different ways, e.g.:

  canonDouble :: String -> String
  canonDouble = show . asDouble . read
    where asDouble :: Double -> Double
          asDouble x = x

This is somewhat clumsy.  Using my proposed extension we can instead write:

  canonDouble :: String -> String
  canonDouble = show . (:: Double) . read

This has the obvious meaning, and succinctly describes what we want.

Example 2

In ghc 7.8 there is a new, better implementation of Data.Typeable.  It used to be (before ghc 7.8) that to get a TypeRep for some type you would have to have a value of that type.  E.g., typeOf True gives the TypeRep for the Bool type.  If we don't have a value handy of the type, then we will have to make one, e.g., by using undefined.  So we could write typeOf (undefined :: Bool).

This way of using undefined is rather ugly, and relies on non-strictness to work.  Ghc 7.8 add a new, cleaner way of doing it.

  typeRep :: proxy a -> TypeRep

The typeRep function does not need an actual value, but just a proxy for the value.  A common proxy is the Proxy type from Data.Proxy:

  data Proxy a = Proxy

Using this type we can now get the TypeRep of a Bool by writing typeRep (Proxy :: Proxy Bool).  Note that in the type signature of typeRep the proxy is a type variable.  This means we can use other values as proxies, e.g., typeRep ([] :: [Bool]).

We can in fact use anything as a proxy that has a structure that unifies with proxy a.  For instance, if we want a proxy for the type T we could use T -> T, which is the same as (->) T T.  The (->) T part makes of it is the proxy and the last T makes up the a.

The extension I propose provides an easy way to write a function of type T -> T, just write (:: T).  So to get a TypeRep for Bool we can simply write typeRep (:: Bool).  Doesn't that look (deceptively) simple?

In fact, my driving force for coming up with this language extension was to get an easy and natural way to write type proxies, and I think using (:: T) for a type proxy is a as easy and natural as it gets (even if the reason it works is rather peculiar).


I've implemented the extension in one Haskell compiler and it was very easy to add and it works as expected.  Since it was so easy, I'll implement it for ghc as well, and the ghc maintainers can decide if the want to merge it.  I suggest this new feature is available using the language extension name SignatureSections.


Does it make sense to do a left section of ::?  I.e., does (expr ::) make sense?  In current Haskell that does not make sense, since it would be an expression that lacks an argument that is a type.  Haskell doesn't currently allow explicit type arguments, but if it ever will this could be considered.

With the definition that (:: T) is the same as (\ x -> x :: T) any use of quantified or qualified types as T will give a type error.  E.g., (:: [a]), which is (\ x -> x :: [a],  is a type error.  You could imagine a different desugaring of (:: T), namely (id :: T -> T).  Now (:: [a]) desugars to (id :: [a] -> [a]) which is type correct.  In general, we have to keep quantifiers and qualifiers at the top, i.e., (:: forall a . a) turns into (id :: forall a . a -> a).

Personally, I'm not convinced this more complex desugaring is worth the extra effort.

by augustss ( at April 03, 2014 07:40 AM

Yesod Web Framework

Consolidation progress: shakespeare, conduit, http-client

My last blog post detailed a number of changes I was going to be making for package consolidation. A number of those have gone through already, this blog post is just a quick summary of the changes.


shakespeare is now a single package. hamlet, shakespeare-css, shakespeare-js, shakespeare-i18n, shakespeare-text, and servius have all been merged in and marked as deprecated. I've also uploaded new, empty versions of those deprecated packages. This means that, in order to support both the old and new versions of shakespeare, you just need to ensure that you have both the shakespeare and deprecated packages listed in your cabal file. In other words, if previously you depended on hamlet, now you should depend on hamlet and shakespeare. When you're ready to drop backwards compatibility, simply put a lower bound of >= 2.0 on shakespeare and remove the deprecated packages.

(Note: this method for dealing with deprecated packages is identical for all future deprecations, I won't detail the steps in the rest of this blog post.)


conduit-extra now subsumes attoparsec-conduit, blaze-builder-conduit, network-conduit, and zlib-conduit. It also includes three modules that used to be in conduit itself: .Text, .Binary, and .Lazy. To deal with this change, simply adding conduit-extra to your dependencies should be sufficient.

The other changes have to do with resourcet. In particular:

  • Data.Conduit no longer reexports identifiers from resourcet and monad-control. These should be imported directly from their sources.
  • Instead of defining its own MonadThrow typeclass, resourcet now uses the MonadThrow typeclass from the exceptions package. For backwards compatibility, Control.Monad.Trans.Resource provides monadThrow as an alias for the new throwM function.
  • The Resource monad had a confusing name, in that it wasn't directly related to the ResourceT transformer. I've renamed it to Acquire, and put it in its own module (Data.Acquire).
    • I'm actually very happy with Acquire, and think it's a great alternative to hard-coding either the bracket pattern or resourcet into libraries. I'm hoping to add better support to WAI for Acquire, and blog a bit more about the usage of Acquire.
  • MonadUnsafeIO has been removed entirely. All of its functionality can be replaced with MonadPrim and MonadBase (for example, see the changes to blaze-builder-conduit).
  • MonadActive, which is only needed for Data.Conduit.Lazy, has been moved to that module.


http-client-multipart has been merged into http-client. In addition, instead of using the failure package, http-client now uses the exceptions package.

http-client-conduit has been merged into http-conduit. I've also greatly expanded the Network.HTTP.Client.Conduit module to contain what I consider its next-gen API. In particular:

  • No usage of ResumableSource.
  • Instead of explicit ResourceT usage, it uses the Acquire monad and bracket pattern (acquireResponse, withResponse).
  • Instead of explicitly passing around a Manager, it uses MonadReader and the HasHttpManager typeclass.

I'm curious how people like the new API. I have no plans on removing or changing the current Network.HTTP.Conduit module, this is merely an alternative approach.

Updated yesod-platform

I've also released a new version of yesod-platform that uses the new versions of the packages above. A number of packages on Hackage still depend on conduit 1.0, but I've sent quite a few pull requests in the past few days to get things up-to-date. Thankfully, maintaining compatibility with both 1.0 and 1.1 is pretty trivial.

April 03, 2014 07:07 AM

April 02, 2014

Functional Jobs

Server Game Developer at Quark Games (Full-time)

Quark Games was established in 2008 with the mission to create hardcore games for the mobile and tablet platforms. By focusing on making high quality, innovative, and engaging games, we aim to redefine mobile and tablet gaming as it exists today.

We seek to gather a group of individuals who are ambitious but humble professionals who are relentless in their pursuit of learning and sharing knowledge. We're looking for people who share our passion for games, aren’t afraid to try new and different things, and inspire and push each other to personal and professional success.

As a Server Game Developer, you’ll be responsible for implementing server related game features. You’ll be working closely with the server team to create scalable infrastructure as well as the client team for feature integration. You’ll have to break out of your toolset to push boundaries on technology to deliver the most robust back end to our users.

What you’ll do every day

  • Develop and maintain features and systems necessary for the game
  • Collaborate with team members to create and manage scalable architecture
  • Work closely with Client developers
    on feature integration
  • Solve real time problems at a large
  • Evaluate new technologies and products

What you can bring to the role

  • Ability to get stuff done
  • Desire to learn new technologies and design patterns
  • Care about creating readable, reusable, well documented, and clean code
  • Passion for designing and building systems to scale
  • Excitement for building and playing games

Bonus points for

  • Experience with a functional language (Erlang, Elixir, Haskell, Scala, Julia, Rust, etc..)
  • Experience with a concurrent language (Erlang, Elixir, Clojure, Go, Scala, etc..)
  • Being a polyglot programmer and having experience with a wide range of languages (Ruby, C#, and Objective-C)
  • Experience with database integration and management for NoSQL systems (Riak, Couchbase, Redis, etc...)
  • Experience with server operations, deployment, and with tools such as Chef or Puppet
  • Experience with system administration

Get information on how to apply for this position.

April 02, 2014 11:36 PM

Mateusz Kowalczyk

Haddock 2.14.2

Posted on April 2, 2014 by Fūzetsu

This is just a quick follow-up to my previous post. We have now released Haddock 2.14.2 which contains few minor changes. The reason for this release is to get a few quick patches in. No fancy overview today, just quick mentions. Here is the relevant part of the changelog:

Changes in version 2.14.2

  • Always drop –split-objs GHC flag for performance reasons (#292)

  • Print kind signatures GADTs (#85)

  • Drop single leading whitespace when reasonable from @-style blocks (#201)

  • Fix crashes associated with exporting data family record selectors (#294)

#201 was the the annoying aesthetics bug I mentioned last time and that is now fixed.

#294 was a bug we’re glad to have gotten rid of now: it was only reported recently but I imagine more and more projects would have start to hit it.

#292 should improve performance considerably in some special cases, such as when Template Haskell is being used.

#85 was just a quick resolution of years old ticket, I think you’ll find it useful.

I predict that this is the version that will ship with GHC 7.8.1 and I don’t think we’ll have any more 2.14.x releases.

Ideally I’d like to get well under 100 open tickets for the next release (there are currently 117 open).

Some things I will be concentrating on next is splitting up Haddock into a few packages and working on the Hoogle back-end. The Hoogle back-end is incredibly broken which is a shame considering Hoogle is a very useful service. We want to make the maintainers life easier.

Splitting up Haddock into a few packages will be of great advantage to people wishing to use (parts of) Haddock as a library without adding a dependency on a specific version of GHC to their program. It should also become much easier to implement and maintain your own back-ends.

If you are interested in helping out with Haddock, we’d love to have you. Pop into #haddock on Freenode, make some noise and wait for someone to respond. Alternatively, contact me through other means.

PS: While I realise that some of my posts make it on reddit, I myself do not use it. You’re welcome to discuss these but if you leave questions or messages to me on reddit, I will almost certainly not see them. If you want my attention, please either use e-mail or IRC. Thanks!

April 02, 2014 11:03 PM

Dominic Steinitz

Student’s T and Space Leaks


The other speaker at the Machine Learning Meetup at which I gave my talk on automatic differentiation gave a very interesting talk on A/B testing. Apparently this is big business these days as attested by the fact I got 3 ads above the wikipedia entry when I googled for it.

It seems that people tend to test with small sample sizes and to do so very often, resulting in spurious results. Of course readers of XKCD will be well aware of some of the pitfalls.

I thought a Bayesian approach might circumvent some of the problems and set out to write a blog article only to discover that there was no Haskell library for sampling from Student’s t. Actually there was one but is currently an unreleased part of random-fu. So I set about fixing this shortfall.

I thought I had better run a few tests so I calculated the sampled mean, variance, skewness and kurtosis.

I wasn’t really giving this my full attention and as a result ran into a few problems with space. I thought these were worth sharing and that is what this blog post is about. Hopefully, I will have time soon to actually blog about the Bayesian equivalent of A/B testing.


> {-# OPTIONS_GHC -Wall                      #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing   #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults    #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind   #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods  #-}
> {-# OPTIONS_GHC -fno-warn-orphans          #-}
> {-# LANGUAGE NoMonomorphismRestriction     #-}
> module StudentTest (
>     main
>   ) where
> import qualified Data.Vector.Unboxed as V
> import Data.Random.Source.PureMT
> import Data.Random
> import Data.Random.Distribution.T
> import Control.Monad.State
> import Data.Histogram.Fill
> import Data.Histogram.Generic ( Histogram )
> import Data.List

Space Analysis

Let’s create a reasonable number of samples as the higher moments converge quite slowly.

> nSamples :: Int
> nSamples = 1000000

An arbitrary seed for creating the samples.

> arbSeed :: Int
> arbSeed = 8

Student’s t only has one parameter, the number of degrees of freedom.

> nu :: Integer
> nu = 6

Now we can do our tests by calculating the sampled values.

> ts :: [Double]
> ts =
>   evalState (replicateM nSamples (sample (T nu)))
>             (pureMT $ fromIntegral arbSeed)
> mean, variance, skewness, kurtosis :: Double
> mean = (sum ts) / fromIntegral nSamples
> variance = (sum (map (**2) ts)) / fromIntegral nSamples
> skewness = (sum (map (**3) ts) / fromIntegral nSamples) / variance**1.5
> kurtosis = (sum (map (**4) ts) / fromIntegral nSamples) / variance**2

This works fine for small sample sizes but not for the number we have chosen.

./StudentTest +RTS -hc
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

It seems a shame that the function in the Prelude has this behaviour but never mind let us ensure that we consume values strictly (they are being produced lazily).

> mean' = (foldl' (+) 0 ts) / fromIntegral nSamples
> variance' = (foldl' (+) 0 (map (**2) ts)) / fromIntegral nSamples
> skewness' = (foldl' (+) 0 (map (**3) ts) / fromIntegral nSamples) / variance'**1.5
> kurtosis' = (foldl' (+) 0 (map (**4) ts) / fromIntegral nSamples) / variance'**2

We now have a space leak on the heap as using the ghc profiler below shows. What went wrong?

If we only calculate the mean using foldl then all is well. Instead of 35M we only use 45K.

Well that gives us a clue. The garbage collector cannot reclaim the samples as they are needed for other calculations. What we need to do is calculate the moments strictly altogether.

Let’s create a strict record to do this.

> data Moments = Moments { m1 :: !Double
>                        , m2 :: !Double
>                        , m3 :: !Double
>                        , m4 :: !Double
>                        }
>   deriving Show

And calculate the results strictly.

> m  = foldl' (\m x -> Moments { m1 = m1 m + x
>                              , m2 = m2 m + x**2
>                              , m3 = m3 m + x**3
>                              , m4 = m4 m + x**4
>                              }) (Moments 0.0 0.0 0.0 0.0) ts
> mean''       = m1 m / fromIntegral nSamples
> variance''   = m2 m / fromIntegral nSamples
> skewness''   = (m3 m / fromIntegral nSamples) / variance''**1.5
> kurtosis'' = (m4 m / fromIntegral nSamples) / variance''**2

Now we have what we want; the program runs in small constant space.

> main :: IO ()
> main = do
>   putStrLn $ show mean''
>   putStrLn $ show variance''
>   putStrLn $ show skewness''
>   putStrLn $ show kurtosis''

Oh and the moments give the expected answers.

ghci> mean''

ghci> variance''

ghci> skewness''

ghci> kurtosis''

Running the Code

To run this you will need my version of random-fu. The code for this article is here. You will need to compile everything with profiling, something like

ghc -O2 -main-is StudentTest StudentTest.lhs -prof

Since you need all the packages to be built with profiling, you will probably want to build using a sandbox as above. The only slightly tricky aspect is building random-fu so it is in your sandbox.

runghc Setup.lhs configure --enable-library-profiling

by Dominic Steinitz at April 02, 2014 10:17 AM

Lee Pike

Embedded Security in the Mainstream

Science Friday had an interesting podcast with Bruce Schneier worth a listen.  It discussed security in embedded systems (or as is hip to say now, the Internet of Things).  The idea of security in embedded systems seemed pretty obscure just a few years ago, so it’s nice to see it being featured on a general-audience radio show.

So who’s going to listen to it from their phone, connected to their car’s audio over Bluetooth, on the way to the store to buy a Nest smoke alarm?

by Lee Pike at April 02, 2014 04:21 AM

April 01, 2014

Neil Mitchell

Exceptional Testing

Summary: Failing properties should throw exceptions, not return False.

When testing properties in Haskell with QuickCheck we usually write a predicate that takes some arguments, and returns a boolean. For example:

import Test.QuickCheck
main = quickCheck $ \x -> exp (log (abs x)) == abs (x :: Double)

Here we are checking that for all positive Double values, applying log then exp is the identity. This statement is incorrect for Double due to floating point errors. Running main we get:

*** Failed! Falsifiable (after 6 tests and 2 shrinks):

QuickCheck is an incredibly powerful tool - it first found a failing example, and then automatically simplified it. However, it's left out some important information - what is the value of exp (log 3)? For my tests I usually define === and use it instead of ==:

a === b | a == b = True
| otherwise = error $ show a ++ " /= " ++ show b

Now when running main we get:

*** Failed! Exception: '3.0000000000000004 /= 3.0'
(after 2 tests and 2 shrinks):

We can immediately see the magnitude of the error introduced, giving a kick-start to debugging.

Do we still need Bool?

When writing functions returning Bool there are three interesting cases:

  • The function returns True, the test passes, continue on.
  • The function returns False, the test fails, with no information beyond the input arguments.
  • The function throws an exception, the test fails, but with a human readable error message about the point of failure.

Of these, returning False is the least useful, and entirely subsumed by exceptions. It is likely there was additional information available before reducing to False, which has now been lost, and must first be recovered before debugging can start.

Given that the only interesting values are True and exception, we can switch to using the () type, where passing tests return () and failing tests throw an exception. However, working with exceptions in pure code is a bit ugly, so I typically define:

import Control.Monad
(===) :: (Show a, Eq a) => a -> a -> IO ()
a === b = when (a /= b) $ error $ show a ++ " /= " ++ show b

Now === is an action, and passing tests do nothing, while failing tests raise an error. This definition forces all tests to end up in IO, which is terrible for "real" Haskell code, where pure and IO computations should be segregated. However, for tests, as long as the test is repeatable (doesn't store some temporary state) then I don't worry.

To test the IO () returning property with QuickCheck we can define:

instance Testable () where
property () = property True
instance Testable a => Testable (IO a) where
property = property . unsafePerformIO

Now QuickCheck can work with this === definition, but also any IO property. I have found that testing IO properties with QuickCheck is very valuable, even without using ===.

Non-QuickCheck tests

Whilst I have argued for exceptions in the context of QuickCheck tests, I use the same exception pattern for non-parameterised/non-QuickCheck assertions. As an example, taken from Shake:

test = do
dropDirectory1 "aaa/bbb" === "bbb"
dropDirectory1 "aaa/" === ""
dropDirectory1 "aaa" === ""
dropDirectory1 "" === ""

Using IO () properties allows for trivial sequencing. As a result, the tests are concise, and the effort to add additional tests is low.

(===) in QuickCheck

In QuickCheck-2.7 the === operator was introduced (which caused name clashes in both Shake and Hoogle - new versions of both have been released). The === operator uses the Property data type in QuickCheck to pass both the Bool value and additional information together. I prefer my definition of === because it's simpler, doesn't rely on QuickCheck and makes it clear how to pass additional information beyond just the arguments to ===. As an example, we could also return the log value:

\x -> when (exp (log (abs x)) /= abs x) $
error $ show (x,log $ abs x, exp $ log $ abs x)

However, the QuickCheck === is a great improvement over ==, and should be a very popular addition.

by Neil Mitchell ( at April 01, 2014 08:57 PM


Fixing foldl

The foldl function is broken. Everyone knows it’s broken. It’s been broken for nearly a quarter of a century. We should finally fix it!

Today I am proposing that Prelude.foldl be redefined using the implementation currently known as Data.List.foldl'.

foldl is broken!

I’m sure you knew that already, but just in case…

Have you ever noticed that Haskellers usually recommend using either foldr or foldl' but not foldl? For example Real World Haskell has this to say:

Due to the thunking behaviour of foldl, it is wise to avoid this function in real programs: even if it doesn’t fail outright, it will be unnecessarily inefficient. Instead, import Data.List and use foldl'.

In the online version of the book the first user comments on that paragraph are

Why isn’t Data.List foldl implementation placed in Prelude?

I second the question: Why isn’t foldl' the default?

Good question.

Ok, so obviously we’re talking about the difference between foldl and foldl':

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f a []     = a
foldl f a (x:xs) = foldl f (f a x) xs

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a []     = a
foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs

The dry technical difference is that foldl' evaluates the call to f before making the next recursive call. The consequences of that are perhaps not immediately obvious, so we’ll take a step back and look at a slightly bigger picture.

Folding left and right

When we first learn Haskell we learn that there are two ways to fold a list, from the left or right.

foldl f z [x1, x2, ..., xn] = (...((z `f` x1) `f` x2) `f`...) `f` xn

foldr f z [x1, x2, ..., xn] = x1 `f` (x2 `f` ... (xn `f` z)...)

Saying “from the left” or “from the right” is a description of what foldl and foldr calculate, with the parenthesis nesting to the left or to the right. At runtime of course we always have to start from the left (front) of the list.

We later learn other ways of thinking about left and right folds, that a left fold can be used like a classic loop where we go through the whole list eagerly, while a right fold can be used like a demand-driven iterator. For the left fold that means using a function that is strict in its first argument (like (+)) while for the right fold that means using a function that is not strict in its second argument (like (:)).

Indeed when looking at whether we want foldl or foldr in any particular case our choice is usually governed by whether we want “all at once” behaviour (foldl) or if we want incremental or short-cut behaviour (foldr).

Accumulating thunks

Again, as we are learning Haskell, we get told that foldl has this crazy behaviour

foldl (+) 0 (1:2:3:[])
          =  foldl (+) (0 + 1)             (2:3:[])
          =  foldl (+) ((0 + 1) + 2)       (3:[])
          =  foldl (+) (((0 + 1) + 2) + 3) []
          =            (((0 + 1) + 2) + 3)

when what we had in mind when we thought of an accumulating loop was

foldl' (+) 0 (1:2:3:[])
          =  foldl' (+) 1 (2:3:[])
          =  foldl' (+) 3 (3:[])
          =  foldl' (+) 6 []
          =             6

Of course that’s just what foldl' does, it evaluates the call to + before making the next recursive call.

When is foldl (rather than foldl') useful?

The short answer is “almost never”.

As beginners we often assume that foldl must still make sense in some cases (or why have both?) but it turns out that’s not really the case.

When the f argument of foldl is a strict function then delaying the evaluation does not gain us anything as it all has to be evaluated at the end anyway. The only time when delaying the evaluation could save us anything is when the f function is not strict in its first argument – in which case you either don’t care or probably should have been using foldr in the first place.

In fact even if our f function is non-strict in its first argument, we probably do not gain anything from delaying evaluation and it usually does no harm to evaluate earlier. Remember that we still have to traverse the whole list, we don’t get any short-cutting behaviour like with foldr.

We can, if we think about it, construct examples where foldl' would be too strict. We could define last and last' like this:

last  = foldl  (\_ y -> y) (error "empty list")

last' = foldl' (\_ y -> y) (error "empty list")

Now if we try

> last [1,undefined,3]
> last' [1,undefined,3]
*** Exception: Prelude.undefined

This is because our accumulator is always the latest element that we’ve seen but we don’t actually want to evaluate the elements (except the last one).

So it’s true that foldl' fails in this case, but it’s also a silly definition, the usual definition is a lot clearer

last [x]    = x
last (_:xs) = last xs
last []     = error "empty list"

That goes for pretty much all the other examples you might be able to think of where foldl would work but foldl' would not: the examples are either artificial or are clearer written in other ways.

People sometimes point out that sum is defined using foldl and not foldl' and claim that this is something to do with Haskell’s designers wanting to allow Num instances where (+) might be lazy. This is pretty much nonsense. If that were the case then sum would have been defined using foldr rather than foldl to properly take advantage of short-cutting behaviour. A much simpler explanation is that foldl' was not available in early versions of Haskell when sum was defined.

In nearly 15 years as a Haskell programmer I think I’ve specifically needed foldl rather than foldl' about three times. I say “about” because I can only actually remember one. That one case was in a slightly cunning bit of code for doing cache updates in a web server. It would almost certainly have been clearer as a local recursion but I was amused to find a real use case for foldl and couldn’t help myself from using it just for fun. Of course it needed a comment to say that I was using it on purpose rather than by mistake!

So why do we have foldl and foldl'?

If foldl is almost always a mistake (or merely benign) then why do we have it in the first place?

I don’t know for sure, but here’s my guess…

When Haskell 1.0 was published on this day 24 years ago there was no seq function at all, so there was no choice but to define foldl in the “classic” way.

Eventually, six years later after much discussion, we got the seq function in Haskell 1.3. Though actually in Haskell 1.3 seq was part of an Eval class, so you couldn’t use it just anywhere, such as in foldl. In Haskell 1.3 you would have had to define foldl' with the type:

foldl' :: Eval b => (b -> a -> b) -> b -> [a] -> b

Haskell 1.4 and Haskell 98 got rid of the Eval class constraint for seq but foldl was not changed. Hugs and GHC and other implementations added the non-standard foldl'.

I suspect that people then considered it a compatibility and inertia issue. It was easy enough to add a non-standard foldl' but you can’t so easily change the standard.

I suspect that if we had had seq from the beginning then we would have defined foldl using it.

Miranda, one of Haskell’s predecessor languages, already had seq 5 years before Haskell 1.0.

A strict foldl in Orwell!

Orwell is an interesting case. Orwell was another Haskell predecessor, very similar to Miranda and early Haskell. An informed source told me that Orwell had defined its foldl in the way that we now define foldl', ie with strict evaluation. Information on Orwell is a little hard to get ahold of online these days so I asked Philip Wadler. Phil very kindly fished out the manuals and looked up the definitions for me.

In the original version:

An Introduction to Orwell
Philip Wadler
1 April 1985

In the standard prelude:

lred f a []  =  a
lred f a (x:xs) = lred f (f a x) xs

But five years later, by the time Haskell 1.0 is being published…

An Introduction to Orwell 6.00
by Philip Wadler
revised by Quentin Miller
Copyright 1990 Oxford University Computing Lab

In the standard prelude:

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f a []  =  a
foldl f a (x:xs)  =  strict (foldl f) (f a x) xs

Note the use of strict. Presumably Orwell’s strict function was defined as (or equivalent to)

strict :: (a -> b) -> a -> b
strict f x = x `seq` f x

(These days in Haskell we call this function ($!).)

So my source was right, Orwell did change foldl to be the strict version!

I contend that this was and is the right decision, and that it was just a consequence of the late arrival of seq in Haskell and inertia and fears about backwards compatibility that have kept us from fixing foldl.

Just do it!

It’d help all of us who are sometimes tempted to use foldl because we can’t be bothered to import Data.List. It’d help confused beginners. It’d save teachers from the embarrassment of having to first explain foldl and then why you should never use it.

Orwell fixed this mistake at least 24 years ago, probably before Haskell 1.0 was released. Just because it’s an old mistake doesn’t mean we shouldn’t fix it now!

A postscript: which foldl'?

I hate to complicate a simple story but I should admit that there are two plausible definitions of foldl' and I’ve never seen any serious discussion of why we use one rather than the other (I suspect it’s another historical accident).

So the version above is the “standard” version, perhaps more clearly written using bang patterns as

foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f a []     = a
foldl' f a (x:xs) = let !a' = f a x
                     in foldl' f a' xs

But an equally plausible version is

foldl'' :: (b -> a -> b) -> b -> [a] -> b
foldl'' f !a []     = a
foldl'' f !a (x:xs) = foldl'' f (f a x) xs

The difference is this: in the first version we evaluate the new accumulating parameter before making the recursive call, while in the second version we ensure that the accumulating parameter is always evaluated (to WHNF) on entry into each call.

These two ways of forcing the evaluation have almost the same effect. It takes a moment or two to find a case where it makes a difference, but here’s one:

foldl'  (\_ y -> y) undefined [1] = 1
foldl'' (\_ y -> y) undefined [1] = undefined

The standard foldl' ensures that all the new accumulating parameter values are evaluated, but still allows the initial value to be unevaluated. The alternative version simply says that the accumulating parameter is always evaluated.

The second version is attractive from a code generation point of view. One of the clever things that GHC can do with foldl' (and strict function arguments generally) is to unbox the values, so for example an Int can be unboxed to a Int# and passed in registers rather than on the heap. With the standard foldl' it needs a special case for the first iteration of the loop where the initial value of the accumulating parameter which might not be evaluated yet. With foldl'', that’s not a problem, we can unbox things right from the start. In practice, GHC can often see that the initial value is evaluated anyway, but not always.

(Don Stewart and I noticed this a few years back when we were working on stream fusion for lists. We had defined foldl' on streams in a way that corresponded to the second form above and then got a test failure when doing strictness testing comparing against the standard list functions.)

So if we’re going to fix foldl to be the strict version, then perhaps it should be the fully strict version, not just the “strict after the first iteration” version.

by duncan at April 01, 2014 10:38 AM

Edward Z. Yang

Calculating Shanten in Mahjong

Move aside, poker! While the probabilities of various poker hands are well understood and tabulated, the Chinese game of chance Mahjong [1] enjoys a far more intricate structure of expected values and probabilities. [2] This is largely due in part to the much larger variety of tiles available (136 tiles, as opposed to the standard playing card deck size of 52), as well as the turn-by-turn game play, which means there is quite a lot of strategy involved with what is ostensibly a game of chance. In fact, the subject is so intricate, I’ve decided to write my PhD thesis on it. This blog post is a condensed version of one chapter of my thesis, considering the calculation of shanten, which we will define below. I’ll be using Japanese terms, since my favorite variant of mahjong is Riichi Mahjong; you can consult the Wikipedia article on the subject if you need to translate.

Calculating Shanten

The basic gameplay of Mahjong involves drawing a tile into a hand of thirteen tiles, and then discarding another tile. The goal is to form a hand of fourteen tiles (that is, after drawing, but before discarding a tile) which is a winning configuration. There are a number of different winning configurations, but most winning configurations share a similar pattern: the fourteen tiles must be grouped into four triples and a single pair. Triples are either three of the same tile, or three tiles in a sequence (there are three “suits” which can be used to form sequences); the pair is two of the same tiles. Here is an example:

Represented numerically, this hand consists of the triples and pairs 123 55 234 789 456.

One interesting quantity that is useful to calculate given a mahjong hand is the shanten number, that is, the number of tiles away from winning you are. This can be used to give you the most crude heuristic of how to play: discard tiles that get you closer to tenpai. The most widely known shanten calculator is this one on Tenhou’s website [3]; unfortunately, the source code for this calculator is not available. There is another StackOverflow question on the subject, but the “best” answer offers only a heuristic approach with no proof of correctness! Can we do better?

Naïvely, the shanten number is a breadth first search on the permutations of a hand. When a winning hand is found, the algorithm terminates and indicates the depth the search had gotten to. Such an algorithm is obviously correct; unfortunately, with 136 tiles, one would have to traverse ((136-13)\times 14)^n hands (choices of new tiles times choices of discard) while searching for a winning hand that is n-shanten away. If you are four tiles away, you will have to traverse over six trillion hands. We can reduce this number by avoiding redundant work if we memoize the shanten associated with hands: however, the total number of possible hands is roughly 136 \choose 13, or 59 bits. Though we can fit (via a combinatorial number system) a hand into a 64-bit integer, the resulting table is still far too large to hope to fit in memory.

The trick is to observe that shanten calculation for each of the suits is symmetric; thus, we can dynamic program over a much smaller space of the tiles 1 through 9 for some generic suit, and then reuse these results when assembling the final calculation. 9 \times 4 \choose 13 is still rather large, so we can take advantage of the fact that because there are four copies of each tile, an equivalent representation is a 9-vector of the numbers zero to four, with the constraint that the sum of these numbers is 13. Even without the constraint, the count 5^9 is only two million, which is quite tractable. At a byte per entry, that’s 2MB of memory; less than your browser is using to view this webpage. (In fact, we want the constraint to actually be that the sum is less than or equal to 13, since not all hands are single-suited, so the number of tiles in a hand is less.

The breadth-first search for solving a single suit proceeds as follows:

  1. Initialize a table A indexed by tile configuration (a 9-vector of 0..4).
  2. Initialize a todo queue Q of tile configurations.
  3. Initialize all winning configurations in table A with shanten zero (this can be done by enumeration), recording these configurations in Q.
  4. While the todo queue Q is not empty, pop the front element, mark the shanten of all adjacent uninitialized nodes as one greater than that node, and push those nodes onto the todo queue.

With this information in hand, we can assemble the overall shanten of a hand. It suffices to try every distribution of triples and the pairs over the four types of tiles (also including null tiles), consulting the shanten of the requested shape, and return the minimum of all these configurations. There are 4 \times {4 + 4 - 1 \choose 4} (by stars and bars) combinations, for a total of 140 configurations. Computing the shanten of each configuration is a constant time operation into the lookup table generated by the per-suit calculation. A true shanten calculator must also accomodate the rare other hands which do not follow this configuration, but these winning configurations are usually highly constrained, and quite easily to (separately) compute the shanten of.

With a shanten calculator, there are a number of other quantities which can be calculated. Uke-ire refers to the number of possible draws which can reduce the shanten of your hand: one strives for high uke-ire because it means that probability that you will draw a tile which moves your hand closer to winning. Given a hand, it's very easy to calculate its uke-ire: just look at all adjacent hands and count the number of hands which have lower shanten.

Further extensions

Suppose that you are trying to design an AI which can play Mahjong. Would the above shanten calculator provide a good evaluation metric for your hand? Not really: it has a major drawback, in that it does not consider the fact that some tiles are simply unavailable (they were discarded). For example, if all four “nine stick” tiles are visible on the table, then no hand configuration containing a nine stick is actually reachable. Adjusting for this situation is actually quite difficult, for two reasons: first, we can no longer precompute a shanten table, since we need to adjust at runtime what the reachability metric is; second, the various suits are no longer symmetric, so we have to do three times as much work. (We can avoid an exponential blowup, however, since there is no inter-suit interaction.)

Another downside of the shanten and uke-ire metrics is that they are not direct measures of “tile efficiency”: that is, they do not directly dictate a strategy for discards which minimizes the expected time before you get a winning hand. Consider, for example, a situation where you have the tiles 233, and only need to make another triple in order to win. You have two possible discards: you can discard a 2 or a 3. In both cases, your shanten is zero, but discarding a 2, you can only win by drawing a 3, whereas discarding a 3, you can win by drawing a 1 or a 4. Maximizing efficiency requires considering the lifetime ure-kire of your hands.

Even then, perfect tile efficiency is not enough to see victory: every winning hand is associated with a point-score, and so in many cases it may make sense to go for a lower-probability hand that has higher expected value. Our decomposition method completely falls apart here, as while the space of winning configurations can be partitioned, scoring has nonlocal effects, so the entire hand has to be considered as a whole. In such cases, one might try for a Monte Carlo approach, since the probability space is too difficult to directly characterize. However, in the Japanese Mahjong scoring system, there is yet another difficulty with this approach: the scoring system is exponential. Thus, we are in a situation where the majority of samples will be low scoring, but an exponentially few number of samples have exponential payoff. In such cases, it’s difficult to say if random sampling will actually give a good result, since it is likely to miscalculate the payoff, unless exponentially many samples are taken. (On the other hand, because these hands are so rare, an AI might do considerably well simply ignoring them.)

To summarize, Mahjong is a fascinating game, whose large state space makes it difficult to accurately characterize the probabilities involved. In my thesis, I attempt to tackle some of these questions; please check it out if you are interested in more.

[1] No, I am not talking about the travesty that is mahjong solitaire.

[2] To be clear, I am not saying that poker strategy is simple—betting strategy is probably one of the most interesting parts of the game—I am simply saying that the basic game is rather simple, from a probability perspective.

[3] Tenhou is a popular Japanese online mahjong client. The input format for the Tenhou calculator is 123m123p123s123z, where numbers before m indicate man tiles, p pin tiles, s sou tiles, and z honors (in order, they are: east, south, west, north, white, green, red). Each entry indicates which tile you can discard to move closer to tenpai; the next list is of ure-kire (and the number of tiles which move the hand further).

by Edward Z. Yang at April 01, 2014 08:20 AM


Fun and Profit with Strongly-Typed Data Schemas

Over the past few months, Duncan and I have been working with Chris Dornan and Alfredo Di Napoli on api-tools, a DSL for specifying data schemas for REST-like APIs in Haskell. If you’re interested in the real-world use of Haskell, static types and DSLs, why not come along to hear Chris talk about it?

Wednesday 9th April, 6:30pm, London

Find out more and register for free over at Skills Matter:

Typical business apps store structured data, transform it and send it hither and thither. They are typically made of multiple components that have to agree on the schema of the data they exchange. So a large part of what it means to be “flexible” is for it to be easy to modify and extend the schema of the data that the system handles.

Strong typing can help with this, ensuring that the code that accesses the data is consistent with the schema. One idea that has been tried with databases is to generate Haskell types from the database schema, or generate the database schema from the Haskell types, or both from some standalone schema.

In this talk we will describe how we have applied this same idea but for REST APIs. We have a DSL that we use to specify the schema of the data available via a REST API. With a machine description of the schema we can generate the code and documentation for many parts of the system, and of course those parts are then easily adaptable when the schema changes. We generate Haskell types, JSON conversion functions, REST API documentation, disk persistence and more. We also have a story for managing schema changes and migrating data. This system is in use at Iris Connect and is now available on Hackage.

This talk will also discuss further applications we have found for these tools and some of the experience from the development project at Iris Connect, including problems we have had to solve building the Haskell components of the system and the wider challenge of integrating it into the overall development project.

And if that isn’t enough for you, Well-Typed’s Haskell courses are back at the end of April, with an all-new course on advanced features of the Haskell type system. Stay tuned for more events coming soon…

by adam at April 01, 2014 07:24 AM

Christopher Done

Haskell structured diffs

Project-request: someone please implement a program that will diff Haskell in a cleverer way than lines.

In an effort to reign in my incessant work on Haskell tooling1, I’m outlining a tool that I’d personally like and welcome people to implement it. Otherwise it serves as a motivating problem description for the next time I come around to it myself with free time.

Before anyone emails me saying “lines/words are simple, other things are hard, that’s why it’s not been done yet. People undervalue the simple solution …” with a long lecture, spare me!

The concrete diff

The concrete diff is the line-based, and sometimes character-based, diff that we all know and love. There’s no reason to throw this away. You will need to keep this as an optional backend for when you are unable to parse a Haskell file.

Pros: simple to implement. You produce the necessary lines to delete and insert to create the change from A to B.

Cons: doesn’t know about syntactic redundancy where some changes don’t mean anything, and where the actual important change occurs. For example:

main = do putStrLn "Write your name!"
          name <- getLine
          print name

Now you change this to:

main = do args <- getArgs
          case args of
            [] -> do putStrLn "Write your name!"
                     name <- getLine
                     print name
            _ -> runWithArgs args

The diff will look like this:

@@ -5,3 +5,6 @@ module Main where
-main = do putStrLn "Write your name!"
-          name <- getLine
-          print name
+main = do args <- getArgs
+          case args of
+            [] -> do putStrLn "Write your name!"
+                     name <- getLine
+                     print name
+            _ -> runWithArgs args

But it’s clear to observe that this is not the change we made in spirit, it’s just one line-based way to achieve it. In actual fact, our do putStrLn … was moved into a case, un-changed. At this size, it’s not a big deal. When the code is more interesting, it’s important to know what was really changed, and what remains the same.

The abstract syntax diff

Enter the syntactic diff. We show the difference between two syntactic trees. How this is to be achieved in a readable way is the rub, but here are some ideas.

Take our example above, one approach can be to label nodes.


¹{main = ²{do putStrLn "Write your name!"
              name <- getLine
              print name}}


¹{main = do args <- getArgs
            case args of
              [] -> ²{do putStrLn "Write your name!"
                         name <- getLine
                         print name}
              _ -> runWithArgs args}

Now, at least at a superficial glance, you don’t even need this explained to you. You can see exactly what has happened: The code before has changed to the code after, but we can see that node2 has just moved to inside the case.

Where the trickiness arises is taking this to its logical conclusion and applying it generally. What’s displayed if you also change the string in the putStrLn? Good question. Here’s an idea:

¹{main = ²{do putStrLn -{"Write your name!"}
              name <- getLine
              print name}}

Because the node "Write your name" has now been lost, we don’t need to reference it any longer. So one way to show that it has been removed could be to put -{…}. And then to show what replaced it, put in +{…}, a la classic diffs:

¹{main = +{do args <- getArgs
              case args of
                [] -> ²{do putStrLn +{"Write your name!"}
                           name <- getLine
                           print name}
                _ -> runWithArgs args}}

In reality this rule would insert more -{…} and +{…} than I’ve written here, but I’m writing these examples manually so take them with a grain of salt. Let’s take it further and say that the string has actually been moved. Then we should indeed give it a number to reference it later:


¹{main = ²{do putStrLn ³{"Write your name!"}
              name <- getLine
              print name}}


¹{main = +{do args <- getArgs
              case args of
                [] -> ²{do putStrLn +{greeting}
                           name <- getLine
                           print name}
                _ -> runWithArgs args}
    +{where greeting = ³{"Write your name!"}}}

Again, I don’t think anybody is going to find this confusing. The node3 has moved into a where clause, which has been named greeting and referenced in place of its original place.

Am I making obvious sense, here? It’s not a particularly novel display, it states what happened syntactically, precisely. With a UI, you could expand/collapse nodes in a nested fashion or “explode” all the pieces into a flat list of numbered or +’d or -’d nodes, or just narrow down to one specific interesting expression, like

²{do putStrLn +{greeting}
     name <- getLine
     print name}

If you’re sufficiently nerd-sniped to find this interesting and do-able, then I invite you to go ahead and give it a go. I’d love to see a prototype. I don’t plan on implementing this in the near or distant future, so we won’t be toe stepping.

The reduced semantic diff

If you’re still reading by this point, let me try to entice you with ambitious ideas. Take the above approach, everything we just laid out, but let’s put an additional step in there: instead of diffing Haskell’s abstract syntax tree, diff the Core.

If you compile the below with GHC,

main = case Just () of
         Just () -> print "Hey!"

The external core is:

%module main:Main
  main:main5 :: (ZMZN Char) = unpackCStringzh ("Hey!"::Addrzh);
  main:main4 :: (ZMZN Char) = ZC @ Char base:zdfShowChar1 (ZMZN @ Char);
  main:main3 :: (ZMZN Char) = base:showLitString main:main5 main:main4;
  main:main2 :: (ZMZN Char) = ZC @ Char base:zdfShowChar1 main:main3;
  main:main1 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) =
    \ (etaB1::(Statezh RealWorld)) ->
      base:hPutStr2 base:stdout main:main2 True etaB1;
  main:main :: (IO Z0T) = %cast (main:main1) (%sym ((NTCoZCIO Z0T)));
  main:main6 :: (Statezh RealWorld) -> (Z2H ((Statezh RealWorld)) Z0T) =
    \ (etaXb::(Statezh RealWorld)) ->
      base:runMainIO1 @ Z0T (%cast (main:main1) (%sym ((NTCoZCIO Z0T))))
  main:ZCmain :: (IO Z0T) = %cast (main:main6) (%sym ((NTCoZCIO Z0T)));

You can see that the pointless case has been removed. This is the bread and butter of Core simplification. But if I remove the case myself, the Core is exactly the same. This is redundant semantic content, which is why GHC removed it.

If someone made a change like this in a real codebase which removed some redundant semantic content, not just syntactical redundancy, your diff could show it like that. In other words, nothing important semantically actually happened here.

In fact, if I refactored a bunch of code, re-organized a bit, does my next colleague really want to read through all the syntax tree just to see the crux of what changed? Sometimes, but not always. Sometimes, they just want to see the precise thing that will change at runtime.

It might actually be insane, with big blow ups in code difference for minute high-level changes, or it might be great for teams caring about performance. Difficult to know until you try it. You can also do a source-mapping back to the original Haskell source, for a more interesting display.

If you want to implement this, I would love to see any results.

The typed diff

Okay, you’re still reading so you’re pretty easily nerd sniped. Let’s continue with the ideas. Another type of difference between two sources is the types of expressions in there. Consider:

main = let x = [1,2,3]
       in print (x <> x)

Now you change the code to:

main = let x = myFancyMonoid
       in print (x <> x)

Our structural diff laid out earlier will show this:

main = let x = -{[1,2,3]}
       in print (x <> x)


main = let x = +{myFancyMonoid}
       in print (x <> x)

But actually, more things have changed here. As a result of the different monoid instance, the print (x <> x) will do something different. Maybe it’s a * rather than +, maybe it’s a number, whatever. Maybe that expression is used in a more interesting way than merely printing it. What’s the real diff?

main = let {x::[Integer]} = -{[1,2,3]}
       in print {{(x <> x)}::[Integer]}


main = let {x::MyFancyMonoid} = +{myFancyMonoid}
       in print {(x <> x)}::MyFancyMonoid}

Or something like that. I’m being hand-wavey in the display, here. The real difference is that we’ve changed the type of x. It’s an important change, which has semantic meaning. My ideas are more vague here. I haven’t thought through many scenarios of how to represent this. But one thing is clear: a diff of types can actually be useful and interesting.

The editing diff

The diffs above are all examples of “cold” diffs. Calculating the difference between two files as-is. If you’re in a structured editor like Lamdu, then you don’t have to do cold diffs and figure out and guess at what happened. You know exactly what happened. This node was raised here, this variable was renamed there, etc. But if you want to work on that, you pretty much have to work on Lamdu.


In summary I’ve intentionally listed increasingly more wacky diff ideas, from the familiar to the fairly novel. My general approach to tooling is progressive: start with the simplest working implementation then step up. Structured-haskell-mode is an example of this. It’s no Lamdu, and it’s no vanilla text-based mode. It’s a stepping stone inbetween. The impedance to try SHM is lower.

In the same way, maybe we can start with the abstract syntax diff, let people become acclimatized to it, let it stabilize, get it integrated into things like Git, and then work our way up from there.

If nobody bothers trying out these ideas, I’ll probably end up doing them myself eventually, but I thought I’d put the suggestion out there first.

  1. In favour of writing programs that concern themselves with things other than Haskell for once!

April 01, 2014 12:00 AM

March 31, 2014

Functional Jobs

Infrastructure Engineer at Zeta Project Germany GmbH (Full-time)

About us

We are an early-stage product company. At Zeta everyone has a voice and we expect you to change our world. If you like coming up with creative solutions then this is definitely the place for you.

We’ve built a team of passionately experienced, curious engineers and designers who are dedicated, hardworking and wholeheartedly embrace technical and design challenges of all types.

We are located in Berlin Mitte and are looking for talented individuals who share our passion.

Job Description

This position is for someone to work in the Infrastructure team, to contribute to the evolution of our software platform.

The team is responsible for developing and supporting the company's core systems and infrastructure, which is predominately written in Haskell.

You will work directly with other teams within the company, including client, backend, and QA engineers, ensuring that the infrastructure runs correctly, reliably and at scale.

The ability to educate others on the best use of the software resources the team provides is crucial, as is the ability to use feedback to continually improve our tooling and processes.

Initially we would like you to own our Jenkins setup and drive best practices for the automation of various teams' Continuous Integration and Deployment workflows, with future work planned to improve problem areas such as development workflow, monitoring, alerting, provisioning, and performance.

Skills & Requirements

  • Practical experience using various Amazon Web Services
  • Comfortable with admin tasks, like partitioning and formatting filesystems, Debian packaging, debugging live processes, and general Linux best practices
  • Working knowledge of Cassandra, Redis, or similar databases
  • Proactive with the ability to own and oversee projects
  • Extensive familiarity with Jenkins and Continuous Integration, for technologies such as Objective-C, C++ and Haskell
  • Good programmer with experience programming Ruby, Python, Erlang or other languages
  • Experience with various Configuration Management technologies like Ansible, Chef, Puppet, etc.
  • Interested in learning Haskell
  • Experience working as a Linux System Administrator

Get information on how to apply for this position.

March 31, 2014 06:14 PM

Alejandro Cabrera

Migrating to Static, Self-Hosted Blog

Hello, readers.

Soon, I'll be moving away from the Blogger platform to a statically hosted blog. I've been playing with hakyll for some time now, and am pleased with it's support for syntax highlighting, markup formats, RSS/Atom generation, and packaged Warp.

It'll be great to develop my blog with emacs, leveraging git for backup and rsync for deployment.


* Less latency
* A home-grown design
* More code samples, with pretty highlighting

With possibly:

* More frequent posts

Stay tuned. I'll be releasing the new blog by the end of the week!

by Alejandro Cabrera ( at March 31, 2014 03:50 PM

Daniil Frumin

ANN: hastache 0.6

Announcing: hastache 0.6

Hastache is a Haskell implementation of the mustache templating system.

Quick start

cabal update
cabal install hastache

A simple example:

import Text.Hastache 
import Text.Hastache.Context 
import qualified Data.Text.Lazy.IO as TL

main = hastacheStr defaultConfig (encodeStr template) (mkStrContext context)
    >>= TL.putStrLn

template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages." 

context "name" = MuVariable "Haskell"
context "unread" = MuVariable (100 :: Int)

Read Mustache documentation for template syntax; consult README for more details.

Whats’s new in 0.6?

The interface of the library has been switched from ByteString to (lazy) Text. That means, for example, that the type of hastcheStr function is now the following:

:: MonadIO m => MuConfig m	-> Text	-> MuContext m -> m Text

The generic context generation (mkGenericContext) now supports functions of the types Text -> Text and Text -> m Text, as well as types with multiple constructors. That is, given a Haskell datastructure

data A = A { str :: String }
       | B { num :: Int }

it is possible to write a template like this:

A : {{str}}
B : {{num}}

Please take a look at the multiple constructors example if you are interested.

Additionally, a couple of new MuVar instances has been added.


Special thanks to Mark Lentczner, Alexander Voikov and others who reported issues, tested the library and provided feedback.

Tagged: haskell, hastache

by Dan at March 31, 2014 07:49 AM

March 30, 2014

Christopher Done

One step forward, two steps back

The issue is that programming languages don’t go forward, they move sideways or diagonally, or sometimes backwards.

A new car comes out, and it has some cool feature: Hey, it has road surface detection and changes the steering accordingly! But it should also come with all the old stuff that you come to expect. Comfy seats, seatbelts, airconditioner, heated windows, wipers, proximity detection, power steering, cruise control, etc.

With new programming languages, what you tend to get is a chassis, engine and steering wheel, and the road surface detection.

Here is a list of cool ideas that have been discovered and implemented in programming languages, but which do not in their whole make up any existing language:

The moral is, if you’re inventing a new general purpose programming language and you have some clue that it’s going to be adopted, I implore you to thoroughly explore all of the features above within the existing languages that do them well, and think about adding it to your language.

As a follow-up post I might make a matrix of the top, say, 30, general purpose programming languages and all the features that they tick off.

  1. Also known as quotation, quasiquotes, macros, templating, mixins, etc.

  2. So, numbers, strings, vectors, patterns, etc.

  3. Preferablly static. Also known as implicit parameters, contexts, etc.

  4. Also known as “hot-swapping”, “live update”, “plugins”, etc.

March 30, 2014 12:00 AM

March 29, 2014

ERDI Gergo

My first computer

tl;dr: I've built a computer on a Xilinx FPGA using Kansas Lava, based on a virtual machine design from the mid seventies.

I would be lying if I said I always wanted to build my own computer. For a very long time, hardware didn't tickle my curiosity at all; and even today, I prefer staying away from dangling wires and soldering irons. I like my computing platforms to Just Work™, and hardware problems are just a huge hassle. But then in 2010 some coworkers of mine started getting into electronics and I learned from them just enough to start hooking some ICs up on a breadbord, and it seemed like a very fun diversion from all the high-level, abstract, softwarey stuff. In some sense, it filled the same void in me that assembly programing would probably have had. But even back in the days on the Commodore 64, I was looking for more BASIC extensions instead of going downwards to assembly / machine code.

One thing led to another and I was creating a Brainfuck CPU on a Papilio FPGA. It was a very satisfying experience, plus, I got to learn about a completely new world (that of digital hardware design and hardware description languages). So ever since then, I had a voice in the back of my head saying I should go all in, and implement a proper full computer, with I/O and all. But jumping straight into replicating a real computer seemed like trying to run before I can walk.

I can't remember now how I bumped into the CHIP-8 platform, but when I read about it in detail, I realized this is something I, a self-taught HDL guy, could realistically implement. To recap, it's a virtual machine spec from the mid seventies indended to be implemented on home computers of the time. It is super low performance, meaning I could implement everything the most naïve way possible and still get something workable out of it: graphics is 64×32 black & white, RAM is short of 4KBytes, CPU speed is not really part of the spec, and the only timers provided run at 60 Hz.

I can do this!

Setting the scene

The FPGA board that I targeted is a Papilio One 500K, which has 500K... somethings called "system gates", and about 40KByte of SRAM. The Arcade MegaWing provides a D-sub 15 VGA connector and two PS/2 ports, among some other stuff that I am not using for this project.

The home computer targeted by the original CHIP-8 spec only had a four-by-four keypad, so I am not 100% happy about using a PS/2 keyboard for the input. Maybe a later version could use a bespoke keypad connected to the unused pins of the LogicStart MegaWing. However, by using a PS/2 keyboard, I could not worry about the hardware side of things and just implement a decoder for the PS/2 protocol.

In terms of software, there are several games available for the CHIP-8. Maybe even donzens! But just for fun, after finishing the machine itself, I ended up writing my own game as well: a crappy port of the currently trending 2048 game.

Understanding the CHIP-8

I would say the CPU itself is a RISC architecture, in that it has 16 registers and not a whole lot of operations. But I'm not sure it would be called RISC back in its day.

The aforementioned 64×32 one-bit graphics is manipulated via a sprite interface: there's a special CPU opcode for xor-blitting an 8-pixel-wide, variable-height sprite onto any part of the screen. I went with the obvious way of implementing it as a 2048-bit frame buffer.

To validate my understanding, I first created a software emulator in Haskell. That unearthed a ton of edge cases and situations where I was not reading the spec with enough attention. Once I had the emulator working well enough to play the sample games I found, I enumerated the modules I'll need to implement.

The emulator running a Tetris game

A TODO list

Since I've already done the Brainfuck CPU, I didn't foresee too much difficulty in implementing the CPU proper (oh boy was I wrong). However, the world of peripherals was a new one to me.

I've never looked into VGA signal timings in detail before, and for some reason I just assumed that it's going to be just as complicated as PAL, about which I knew just enough to know that you have to generate all kinds of elaborate sync patterns. So actually reading the VGA spec was a relief, and I quickly came up with a scheme where my CHIP-8 computer would be running at 50 MHz, so that it can easily implement the 25 MHz pixel clock needed for 640×480@60 Hz. I went with this mode because it has several advantages:

  • It refreshes the screen at 60 Hz, so I can synchronize the CHIP-8 timers to it
  • It has square pixel aspect ratio
  • It is a "default" mode, so I can be confident that if I generate the correct signal, modern LCDs will be able to display it. In fact, for testing, I later got a 7" CCTV LCD with a VGA input on the cheap, so that I didn't have to make room for the 21" TFT I had lying around. The CCTV monitor supports three VGA video modes, and 640×480@60 Hz is one of them.
  • By dividing both the horizontal and vertical resolution by 8, I can get 80×60 which doesn't leave too much unused border around 64×32. I could use all horizontal screen real estate if I divided by 10 instead, to get 64×48, but I have no idea how I could divide my horizontal/vertical position counter by 10; whereas dividing by 8 is a simple matter of shifting to the right by 3 bits.

The Papilio One itself has a clock of 32 MHz, and I first hoped that 32 Mhz and 25 Mhz is close enough that I can just generate a signal using the 32 MHz as the pixel clock. Turns out that's not quite how signal timings work.

Luckily, I've found out that the Papilio One also has something called a DCM which can divide and multiply the clock signal. I was able to go to 25 or 50 MHz easily with it. It's a bit of a black box component: I had to run a wizard in the Xilinx IDE to get a small binary file describing my DCM parameters, and then I integrated it into my build process by running another Xilinx program on it which spits out some magic bitstream.

The PS/2 protocol is a simple serial protocol with a slow (~10 KHz) clock and one parity bit per 8 data bits. Decoding it into a stream of bytes was a straightforward thing once I hunted down an actual PS/2 keyboard, since it turned out those USB to PS/2 dongles don't really work with regular USB keyboards; rather, the keyboard have to be made ready to "speak" PS/2 over the dongle. So I ended up getting a proper PS/2 keyboard from Mustafa Centre (thanks to Viki's sister Dori for picking it up for me); god only knows why they still had one on stock.


The Xilinx tool suite seems to be aimed at being used from the IDE. This has several drawbacks: version controlling is complicated because generated artifacts litter the source tree; the editor itself in the IDE is of course crap compared to Emacs; and most importantly, I didn't intend to manually write either VHDL or Verilog. As I've mentioned before in my post about the Brainfuck CPU, I've found both of these hardware description languages to be lacking in the same way mainstream imperative programming languages are: the tools you, the developer, is given to introduce your own abstractions are very poor. So I planned to use Kansas Lava, an HDL embedded in Haskell.

Now, the first thing to note about Kansas Lava is, as nice at is, the software package itself is bit-rotten. The latest version available on Hackage cannot even be compiled with GHC 7.6. While the change to fix that is trivial, after that, you can easily bump into bugs in the Lava compiler itself. But more on that later. Later versions available from GitHub are not even self-consisent between the various dependencies. I've put my patched-up version of Kansas Lava and some of the packages it dependens on on GitHub and I'm trying to get Andy Gill to allow me to upload the bundle as a update to Hackage. Don says I should maybe say fuck it and create my own Singapore Lava fork, just to stay with the Lava tradition of a twisty little maze of incompatible forks, all alike.

However, when it all works, it is amazing. I was able to extend the library of Papilio-specific Lava code that I created for the Brainfuck project with reusable modules for the VGA signal generator and the PS/2 decoder in such a way that they should be very easy to be re-used for any other future projects. And it's all type-safe, so for example, the Papilio Arcade MegaWing VGA port is statically enforced to be 4+4+4 bits whereas the LogicStart MegaWing is 3+3+2.

But there was one bug that left a bitter aftertaste in my mouth. Once I had both the CPU and the VGA parts working and I started running some test programs on the thing, I noticed that the framebuffer blitting is exhibiting "or" behaviour instead of "xor". Running the same code in the Lava simulator and dumping the contents of the framebuffer, it showed the correct, "xor" behaviour. After weeks of frustration and a complete rework of the communication system between the CPU, the VGA signal generator and the frame buffer to add memory read bussing, I finally took a look at the Lava compiler's source code to find that it simulates the xor2 primitive as xor but compiles it to or. How do you not notice this bug? Has the Kansas Lava suite been used by anyone for everything at all in the history of ever???

End result

The final result, after much trial and error, looking at the Lava simulator's output, and pouring over the code, is available here. I'm reasonably happy about the code base, except for the implementation of the CPU itself, which feels a bit spaghetti to me. Especially around the parts where it's waiting for results to come back from main RAM or the video framebuffer.

Below are some photographs and videos of it running two games: the memory puzzle game Hidden and the 2048 clone mentioned above. Unfortunately, the Tetris game from the screenshot above seems to have a bug in its input handling, in that it samples the keyboard at an unthrottled frequency; thus, even the shortest keypress sends the falling piece to the edge of the wall. I'll eventually need to disassemble the game and fix this.

The VGA signal generator is not as neat as it could be, because it doesn't do pre-fetching. This means by the time the current CHIP-8 pixel's value is read from the framebuffer, it is already too late, the first of the 8 real pixels for the given virtual CHIP-8 pixel should already have been drawn. This results in some ghosting. But since I know what's causing it, I will hopefully be able to fix this in some next version. But right now I'm too happy to have the whole thing just working.

The setup

Detail of the FPGA board itself

<object height="350" width="425"><param name="movie" value=""/><param name="wmode" value="transparent"/><embed height="350" src="" type="application/x-shockwave-flash" width="425" wmode="transparent"></embed></object>

<object height="350" width="425"><param name="movie" value=""/><param name="wmode" value="transparent"/><embed height="350" src="" type="application/x-shockwave-flash" width="425" wmode="transparent"></embed></object>

March 29, 2014 01:06 PM

Vincent Hanquez

Listing licenses with cabal-db

Following discussions with fellow haskellers, regarding the need to be careful with adding packages that could depends on GPL or proprietary licenses, it turns out it’s not easy to get your dependencies’s licenses listed.

It would be convenient to be able to ask the hackage database those things, and this is exactly what cabal-db usually works with.

cabal-db, for the ones that missed the previous annoucement, is a simple program that using the index.tar.gz file, is able to recursively display or search into packages and dependencies.

The license subcommand is mainly to get a summary of the licenses of a packages and its dependencies, but it can also display the tree of licenses. Once a package has been listed, it would not appears again in the tree even if another package depend on it.

A simple example is better than many words:

$ cabal-db license -s -t BNFC
  process: BSD3
    unix: BSD3
      time: BSD3
        old-locale: BSD3
          base: BSD3
        deepseq: BSD3
          array: BSD3
      bytestring: BSD3
    filepath: BSD3
    directory: BSD3
  pretty: BSD3
  mtl: BSD3
    transformers: BSD3
  containers: BSD3
== license summary ==
BSD3: 14
GPL: 1

cabal-db is only using the license listed in the license field in cabal files, so if the field is incorrectly set, cabal-db would have no idea.

March 29, 2014 12:00 AM

March 28, 2014

Bryan O'Sullivan

Where credit belongs for Hack

It’s been exciting to see the positive reception in the tech press and the open source community to Hack, the new programming language that my team at Facebook released last week.

At the same time, one part of that coverage has made me wince: a couple of articles give the impression that I created Hack, but I didn’t. What I do is manage the Hack team, and since I’m incredibly proud of the work they do, I’d like to shine a little light on the people who really deserve the credit.

A few years ago, Julien Verlaguet and Alok Menghrajani had a notion that Facebook could improve some aspects of how it ships software. They developed a proof-of-concept of a system that could detect certain kinds of logic error in a program, and showed it to a few colleagues. Buoyed by the positive feedback they received, they decided to invest more effort in the project, which would in due course become Hack.

This process underscores one of the aspects I find most appealing about engineering at Facebook: engineers are welcome to try out bold ideas, and to push on with them if they look like they’ve got a good prospect of paying off.

The work that Julien and Alok did to get the project off the ground was far from simple. They had to design not just a type system, but one where the type checking algorithms could scale to run efficiently and incrementally across hundreds of thousands of source files. Not just a static type system, but one that could interoperate seamlessly with PHP’s dynamic types. Even the notionally simple act of watching the filesystem to quickly find changed files is impressively difficult; just ask anyone who’s had to wrestle with race conditions involving Linux’s inotify subsystem.

When presented with the early prototype of Hack, Drew Paroski went beyond simply offering support: he immediately signed up to enrich Hack by building the Collections system, a set of clean, modern APIs for dealing with bulk data. While the face that Collections presents to the programmer is easy to understand, there’s once again a lot of subtle work going on behind the scenes (including a lot of runtime support in HHVM, to make them efficient). Collections interoperate seamlessly with PHP arrays, both when a program is being typechecked and while it’s running.

While designing a language and building a typechecker are substantial efforts in their own right, a big reason that we’re excited about Hack is that it is already a success within Facebook. This is in large part due to the work of two engineers, Gabe Levi and Josh Watzman, who converted a large part of our codebase to use Hack’s type annotations. Pulling this off is a surprisingly subtle task. In a large dynamically typed codebase, there can be both latent type errors and code that is more elegantly left dynamically typed. Introducing static types on a large scale involves a series of judgment calls: is what we’re looking at an error; a piece of code that we can refactor so that we can express its types statically; or something that’s just fine as it stands?

As the rest of the team proceeded with the language design, engineering, and conversion work, Joel Marcey built out the HHVM and Hack documentation, while Eugene Letuchy both added features to the language (e.g. richer support for traits) and made significant contributions to our PHP-to-Hack conversion work.

I’m proud to be associated with such skilled people who put so much passion and care into their work. When it comes to Hack, I’m simply the messenger, and these engineers have all along been the ones making it happen.

by Bryan O'Sullivan at March 28, 2014 06:03 PM

Leon P Smith

Announcing postgresql-simple-0.4.2

With 19 new releases, postgresql-simple has seen substantial development since my announcement of version 0.3.1 nearly a year ago. Compared to my last update, the changes are almost exclusively bugfixes and new features. Little code should break as a result of these changes, and most if not all of the code that does break should fail at compile time.

As it stands today, postgresql-simple certainly has the best support for postgres-specific functionality of anything on hackage. So, for a few highlights:

Parametrized Identifiers

Thanks to contributions from Tobias Florek, perhaps the most exciting change for many is that postgresql-simple now properly supports parametrized identifiers, including column, table, and type names. These are properly escaped via libpq. So for example, you can now write:

              (Only ("schema.table" :: QualifiedIdentifier))

Of course, this particular example is still potentially very insecure, but it shouldn’t be possible to create a SQL injection vulnerability this way.

The downside of this change is that postgresql-simple now requires libpq version 9.0 or later. However, community support for 8.4 is coming to and end this July. Also, it is possible to use newer versions of libpq to connect to older versions of postgres, so you don’t have to upgrade your server. (In fact, in one particular situation I’m still using postgresql-simple to connect to postgresql 8.1, although some features such as non-builtin types don’t work.)

Copy In and Copy Out support

While the postgresql-libpq binding has supported COPY FROM STDIN and COPY TO STDOUT for some time, postgresql-simple now supports these directly without having to muck around with postgresql-libpq calls via the Internal module.

If you are interested in streaming data to and from postgres, you may also be interested in higher-level COPY bindings for pipes and io-streams. This is available in Oliver Charles’s pipes-postgresql-simple, which also supports cursors, and my own unreleased postgresql-simple-streams, which also supports cursors and large objects.

Out-of-box support for JSON, UUID, and Scientific

Thanks to contributions from Bas van Dijk, postgresql-simple now provides FromField and ToField instances for aeson‘s value type, Antoine Latter’s uuid, as well as scientific.

Savepoints and nestable folds

Thanks to contributions from Joey Adams, postgresql-simple now has higher-level support for savepoints in the Transaction module, as well as the ability to nest the fold operator. Previously, postgresql-simple had assigned a static name to the cursor that underlies fold, and now every connection has a counter used to generate temporary names.

Parametrized VALUES expressions

While executeMany and returning already support common use cases of this new feature, the new Values type allows you to parameterize more than just a single VALUES table literal.

For example, these situations commonly arise when dealing with writable common table expressions. Let’s say we have a table of things with an associated table of attributes:

   name TEXT NOT NULL,

CREATE TABLE attributes (
   id    INT NOT NULL REFERENCES things,
   key   TEXT   NOT NULL,

Then we can populate both a thing and its attributes with a single query, returning the id generated by the database:

query conn [sql|
    WITH thing AS (
        INSERT INTO things (name) VALUES (?) RETURNING id
    ), newattrs AS (
        INSERT INTO attributes
            SELECT, a.*
            FROM thing JOIN ? a
      SELECT id FROM thing;
  |]  ("bob", Values   [ "text" ,"int8" ]
                     [ ( "foo"  , 42    )
                     , ( "bar"  , 60    ) ])

The empty case is also dealt with correctly; see the documentation for details.

Improved support for outer joins

A long standing challenge with postgresql-simple is dealing with outer joins in a sane way. For example, with the schema above, let’s say we want to fetch all the things along with their attributes, whether or not any given thing has any attributes at all. Previously, we could write:

getAllTheThings :: Connection -> IO [(Text, Maybe Text, Maybe Int64)]
getAllTheThings conn = do
    query conn [sql|
        SELECT name, key, value
          FROM things LEFT OUTER JOIN attributes
            ON =

Now, the columns from the attributes table are not nullable, so normally we could avoid the Maybe constructors, however the outer join changes that. Since both of these columns are not nullable, they are always both null or both not null, which is an invariant not captured by the type. And a separate Maybe for each column gets awkward to deal with, especially when more columns are involved.

What we would really like to do is change the type signature to:

getAllTheThings :: Connection -> IO [Only Text :. Maybe (Text, Int64)]

And now we can, and it will just work! Well, almost. The caveat is that there is a separate instance FromRow (Maybe ...) for most of the provided FromRow instances. This won’t work with your own FromRow instances unless you also declare a second instance. What’s really desired is a generic instance:

instance FromRow a => FromRow (Maybe a)

This instance would return Nothing if all the columns that would normally be consumed are null, and attempt a full conversion otherwise. This would reduce code bloat and repetition, and improve polymorphism and compositionality.

But alas, it’s not possible to define such an instance without changing the FromRow interface, and quite probably breaking everybody’s FromRow instances. Which I’m totally willing to do, once somebody comes up with a way to do it.

by lpsmith at March 28, 2014 05:09 PM

Robert Harper

Old neglected theorems are still theorems

I have very recently been thinking about the question of partiality vs totality in programming languages, a perennial topic in PL’s that every generation thinks it discovers for itself.  And this got me to remembering an old theorem that, it seems, hardly anyone knows ever existed in the first place.  What I like about the theorem is that it says something specific and technically accurate about the sizes of programs in total languages compared to those in partial languages.  The theorem provides some context for discussion that does not just amount to opinion or attitude (and attitude alway seems to abound when this topic arises).

The advantage of a total programming language such as Goedel’s T is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of  T that it precludes divergence.

Why, then, bother with a language such as PCF that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language M of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”)  The notion does seem appealing until one actually tries to write a program in a language such as T.

Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in PCF by solving the following equations using general recursion:

\begin{array}{rcl}    \textit{gcd}(m,0) & = & m \\    \textit{gcd}(0,m) & = & m \\    \textit{gcd}(m,n) & = & \textit{gcd}(m-n,n) \quad \text{if}\ m>n \\    \textit{gcd}(m,n) & = & \textit{gcd}(m,n-m) \quad \text{if}\ m<n    \end{array}

The type of \textit{gcd} defined in this manner has partial function type (\mathbb{N}\times \mathbb{N})\rightharpoonup \mathbb{N}, which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.

Now consider programming this function in T. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in T the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be.  (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)

But, one may argue, T is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?

We can, but at a cost.  One limitation of total programming languages is that they are not universal: you cannot write an interpreter for T within T (see Chapter 9 of PFPL for a proof).  More importantly, this limitation extends to any total language whatever.  If this limitation does not seem important, then consider the Blum Size Theorem (BST) (from 1967), which places a very different limitation on total languages.  Fix any total language, L, that permits writing functions on the natural numbers. Pick any blowup factor, say 2^{2^n}, or however expansive you wish to be.  The BST states that there is a total function on the natural numbers that is programmable in L, but whose shortest program in L is larger by the given blowup factor than its shortest program in PCF!

The underlying idea of the proof is that in a total language the proof of termination of a program must be baked into the code itself, whereas in a partial language the termination proof is an external verification condition left to the programmer. Roughly speaking, there are, and always will be, programs whose termination proof is rather complicated to express, if you fix in advance the means by which it may be proved total. (In T it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.)  But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.

There are ways around the BST, of course, and I am not saying otherwise.  For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice.  Could be, but I did mention the GCD in T problem for a reason: there are natural problems that are difficult to express in a language such as T.  By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it.  One could argue that such a function is of no practical use.  I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a T-like language (rendering it, once again, useless).  For another example, there is no inherent reason why termination need be assured by means similar to that used in T.  We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one.  The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality).  But it’s not baked into the code itself, affecting its run-time; it is “off to the side”, large though it may be).

Updates: word smithing, fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.

Filed under: Programming, Research, Teaching Tagged: functional programming, primitive recursion, programming languages, type theory, verification

by Robert Harper at March 28, 2014 01:26 AM

March 26, 2014

Ketil Malde

Parallel SNP identification


Lately, I’ve been trying to do SNP identification. Basically, we take samples from different populations (geographically separated, or based on some phenotype or property), sequence them in pools, and then we want to identify variants that are more specific to certain populations. The basic method is to map sequences to a reference genome, and examine the sites where the mappings disagree - either with the reference, or with each other.

There are many different implementation of this, most require indiviual genotypes (that is, each position is identified for each individual as homo- or heterozygotic), and most are geared towards finding variant positions, rather than finding diagnostic variants. Many methods - at least those I’ve tested are also quite slow, and they give output that is harder to use (i.e. pairwise scores, rather than overall).

In my case, we have pooled sequences (i.e. not individual data), we want diagnostic variants (i.e. those that differ most between populations), and we only need candidates that will be tested on individuals later, so some errors are okay. So I implemented my own tool for this, and ..uh.. invented some new measures for estimating diversity. (Yes, I know, I know. But on the other hand, it does look pretty good, in my brief testing I get less than a 0.06% FP positive rate - and plenty of true positives. But if things work out, you’ll hear more about that later.)

Parallel implementation

Anyway - my program is reasonably fast, but ‘samtools mpileup’ is a bit faster. So I thought I’d parallelize the execution. After all, we live in the age of multicore and functional programming, meaning plenty of CPUs, and an abundance of parallel programming paradigms at our disposal. Since the program basically is a simple function applied line by line, it should be easy to parallelize with ‘parMap’ or something similarly straightforward. The probelm with ‘parMap’ is that - although it gets a speedup for short inputs - it builds a list with one MVar per input (or IVars if you use the [Par monad]) strictly. So for my input - three billion records or so - this will be quite impractical.

After toying a bit with different modeles, I ended up used a rather quick and dirty hack. Avert your eyes, or suffer the consequences:

parMap :: (b -> a) -> [b] -> IO [a]
parMap f xs = do
  outs <- replicateM t newEmptyMVar
  ins  <- replicateM t newEmptyMVar
  sequence_ [forkIO (worker i o) | (i,o) <- zip ins outs]
  forkIO $ sequence_ [putMVar i (f x) | (i,x) <- zip (cycle ins) xs]
  sequence' [takeMVar o | (o,_) <- zip (cycle outs) xs]

worker :: MVar a -> MVar a -> IO b
worker imv omv = forever $ do 
  ex <- takeMVar imv 
  ex `seq` putMVar omv ex

Basically, I use worker threads with two MVars for input and output, arranged in circular queues. The input MVars are populated by a separate thread, and then the outputs are collected in sequence. I don’t bother about terminating threads or cleaning up, so it’s likely this would leak space or have other problems. But it seems to work okay for my purposes. And sequence_ is the normal sequence, but lazy, i.e. it uses unsafeInterleaveIO to delay execution.


To test the parallel performance, I first ran samtools mpileup on six BAM files, and then extracted the first ten million records (lines) from the output file. I then processed the result, calculating confidence intervals, a chi-squared test, and Fst, with the results in the table below.

Table 1: Benchmark results for different numbers of threads.
Threads wall time time (s) size (MB) cpu (s) sys (s)
1 8:00 480 32.0 472 7
2 4:40 280 39.7 525 19
4 2:33 153 40.1 550 32
8 1:43 103 40.7 689 57
12 1:50 110 36.9 978 106
16 1:33 93 40.3 1047 125
24 1:17 77 40.7 1231 152
Benchmark results for different numbers of threads.

Benchmark results for different numbers of threads.

I would guess that system time is Linux scheduling overhead.

I thought the time with 12 threads was a fluke, but then I reran it thrice: 1:52, 1:50, 1:52. Strange? The only explanation I can see, is that the machine has 12 real cores (hyperthreading giving the pretense of 24), and that running the same number of threads gives some undesirable effect. Running for t={10,11,13,14} gives

Table 2: A more detailed look at performance scaling.
Threads time
10 1:45
11 1:44
12 1:51
13 1:45
14 1:45

So: apparently a fluke in the middle there, and 11 seems like a local minimum. So let’s try also with 23 threads:

% /usr/bin/time varan 10m.mpile.out --conf --chi --f-st > /dev/null +RTS -N23
1165.14user 137.80system 1:15.10elapsed 1734%CPU (0avgtext+0avgdata 41364maxresident)k
0inputs+0outputs (0major+10518minor)pagefaults 0swaps

Sure enough, slightly faster than 24, and 5% less work overall. Of course, the GHC scheduler is constantly being improved, so this will likely change. These numbers were with 7.6.3, I also have a 32-core machine with 7.0.4, but currently, that refuses to compile some of the dependencies.


Memory stays roughly constant (and quite low), scalability is okay, but starts to suffer with increasing parallelism, and gains are small beyond about eight threads.

Genomes vary enourmously in size, from a handful of megabases to multiple gigabases (each position corresponding to a record here), but scaling this up to human-sized genomes at about three gigabases indicates a processing time of roughly two days for a single-threaded run, reduced to about ten hours if you can throw four CPUs at it. This particular difference is important, as you can start the job as you leave work, and have the results ready the next morning (as opposed to waiting to over the weekend). Anyway, the bottleneck is now samtools. I can live with that.

March 26, 2014 06:00 AM

March 25, 2014

Yesod Web Framework

Package consolidation

A few weeks ago, there was a pretty heated debate about the PVP on the libraries mailing list. I've seen a few different outcomes from that discussion. One is to reduce dependency footprints to try and avoid some of the dependency problems. (Another one is concrete proposals for changes to the PVP; I intend to post a proposal as well in the coming days, but wanted to get the easier stuff out of the way first.)

This blog post is about some plans I have for consolidating multiple packages together, hopefully to result in simpler dependency trees without causing users to have unneeded package dependencies- at least not too often. The reason I'm writing up this blog post is to let the community know about these changes in advance, and let me know if any of these changes will cause them problems. Also, if I list a package as deprecated, and you'd like to take over maintainership, please let me know.

One of the guiding principles I've used in setting this up is that I belive some dependencies should be considered incredibly cheap. Depending on process, for example, is a cheap dependency, since it comes with GHC. (Unless you place restrictive bounds on process, which can instead cause dependency problems.) For more information on these principles, please read my description.

I'll start at the bottom of the dependency tree and build up.

streaming-commons, zlib-bindings, text-stream-decode, conduit and pipes

Currently, there are about six core conduit packages: conduit, zlib-conduit, attoparsec-conduit, etc. In addition, for some of these packages, I've split off helper packages providing functionality to be used by other streaming data libraries as well, such as zlib-bindings and text-stream-decode.

I want to collapse that into just three packages. All of the streaming helpers will end up in a new package, streaming-commons. I've talked with Gabriel Gonzalez about this, and we'll be collaborating together on creating a high-quality, low-dependency library. This library will also include some features currently baked into conduit itself, like lock-free Windows file reading and directory traversals.

All of the conduit core packages on top of that would then be merged into a new package, conduit-extra. So we'd end up with conduit, conduit-extra, and streaming-commons. The only downside is that, if you only needed zlib support, you'll now get a few extra packages as well. However, following the principle I listed above, these extra dependencies should all be coming from the "basically free" dependency category.

Crazier ideas for streaming-commons

This may be taking the idea too far, but we could include some even more advanced tooling in streaming-commons. This could include not only the data types from xml-types and json-types- which provide both streaming and tree based data structures for those data formats- but also attoparsec parsers and blaze-builder renderers. This could allow quite a bit of the xml-conduit codebase to be shared by the pipes world, for example.

I'm curious if people think this is a cool idea, or too radical (or both!).

Deprecate failure, attempt, and safe-failure

This one's been on my list for a while, pending some details being worked out with Edward Kmett. The goal is to completely deprecate failure and attempt in favor of the exceptions package, and within exceptions split out MonadThrow from MonadCatch.

This will also mean removing some redundant functionality from resourcet. It will be nice to be rid of the custom MonadThrow and MonadUnsafeIO defined there.


A few simple moves: merge http-client-multipart into http-client, and merge http-client-conduit into http-conduit. The latter change will mean that it's a bit more difficult to use http-client in conduit without depending on tls, but that's a use case anyone has expressed interest to me in.

Another change I'm planning to do at the same time is to add a new module to http-conduit, with an alternate API. There are a few places where I'm dissatisfied with the current API, and this module will work as an experimental next-gen http-conduit. I'm planning on keeping both versions of the API around for quite a while for backwards compatibility, however. The changes I'm looking at are:

  • Avoiding ResumableSource
  • Using with-semantics (like http-client) to avoid accidentally keeping connections open.
  • Don't bake in the need for ResourceT
  • Possibly use lenses for field modifiers on the Request data type.


This change is pretty simple: collapse shakespeare, shakespeare-css, shakespeare-js, shakespeare-text, shakespeare-i18n, and hamlet into a single package. It made sense to keep these separate when APIs were changing rapidly, but things are basically stable now.


Since they don't add any extra dependencies, I'd like to merge wai-test and wai-eventsource into wai-extra. Once again, since we're dealing with stable APIs, this shouldn't cause too much trouble.

I'm also considering deprecating wai-handler-devel, since it's no longer used at all by yesod devel.

Deprecate pool-conduit

pool-conduit used to be a resource pool implementation based on code in conduit. Since then:

  • The actual resource pool implementation is provided by resource-pool.
  • The code I used from conduit has moved to resourcet.

At this point, pool-conduit doesn't really have much in it. If there's code that people are using from it, I'd like to get it merged into resource-pool itself.


The next iteration of Yesod will have a significantly simpler dispatch system. This new code doesn't really make sense as a general-purpose routing tool, so I'm planning on moving that code into yesod-core itself and deprecate yesod-routes. I know there are other users of yesod-routes; I think it makes sense to rename yesod-routes to do something like merging yesod-routes into wai-routes, as yesod-routes has no Yesod-specifics in it.

Another minor change: merge yesod-eventsource into yesod-core. No extra dep, and a stable API.

Finally, the big (and somewhat controversial) one: merge most of the yesod-* core packages into the yesod package itself. A number of year ago, we did precisely the opposite. However, given API stability, I think the time has come to simplify our dependency list again here. This will have the added benefit that when a user reports "I'm using Yesod version 1.2.3", it will give us more information.


I'm planning on deprecating dtd, uri-conduit, and xml-catalog, as I don't use them any more and have no time to maintain them.

Another idea I'm playing with is merging html-conduit into xml-conduit. This would add a tagstream-conduit dependency. Alternatively, perhaps tagstream-conduit could be merged into xml-conduit as well.


Merge classy-prelude-conduit in with classy-prelude. Downside: classy-prelude will depend on conduit-combinators, but that doesn't actually add more than 3 extra packages.

Next step: trimming dependencies in general

I'm not planning on rushing any of these changes. The goal is to take them slowly and methodically, and release changes incrementally. For example, after the conduit changes are done, I'd release new versions of wai, yesod, etc, that are compatible with both the new and old versions. Hopefully the user facing changes will simply be tweaking import lists and cabal files, but users will be able to take their time on this.

Ultimately, I'm planning on releasing new version of persistent (2.0) and Yesod (1.4). You can see the Persistent 2.0 goals. The Yesod 1.4 release doesn't actually have any breaking changes planned, aside from upgrading its dependencies.

One other thing I'm going to be doing in this process is a more general trimming down of dependencies. I'll be going through yesod-platform to find packages we don't need. I also want to avoid redundant packages if possible (e.g., we only need one cryptographic hash packages). In many cases, as a community, we have multiple package options available. I think a great move for the community would be to start consolidating those options as well where it makes sense. If I have concrete ideas on that, I'll certainly share them.

March 25, 2014 08:07 AM