# Abstract Nonsense, chapter 3: the Tuple

We consider the tuple.

Really, after Unit and Id, the tuple is one of the simplest data structures around. In its base for the tuple is of type

data Tuple a b = (a, b)

Tuple has two fundamental functions:

fst :: (a, b) -> a
snd :: (a, b) -> b

and that's all there is to it, and that's all there is to the tuple.

(really, the above declaration should be:

data (a, b) = (a, b)

but the typeful LHS ('left hand side') declaration being the same syntax (and I say 'the same syntax,' not 'the same semantics') as the instance definition on the RHS ('right hand side') may make some people's head hurt. That's also why I gave the list declaration as:

data list a = nil | (a, list a)

and not the standard syntax of:

data [a] = [] | (a:[a])

There is the question of programs are written to be read as well as to be executed by a machine).

So: simple.

Or.
Is.
It?

(I believe this has become my trademark.)

Of course if you have the Cartesian cross-product of two typed values, then why not three? why not one? why not ... 'n'?

So, of course, the tuple is any number of typed values associated through their cross products, but I like to keep things simple here (and in my code), so I'll just be working with the above given definition of tuple and where I need a 'triple' (heh: 'triple') ('triple' was the term used for monads up until recently) I'll define one, or I'll simply compose tuples, for, after all, any n-relation can be reduced to a set of 2-relations, e.g.:

data Triple a b c = (a, (b, c))
data Quad a b c d = (a, (b, (c, d)))
data Quinz a b c d e = ... etc

So from tuple you have triple, or quad, or quinz, or whatever.

So: tuple.

What can we do with it?

Well, everything, actually. You see most programming languages take the tuple for granted, and call it the parameter-set of functions. Haskell does not do this. Haskell be smart. Haskell be curry(ing) the arguments to functions.

Haskell B. Curry. Geddit? No? Look it up.

So, in programming language X, where X is most likely the language of your choice you invoke a function in the 'standard' way:

foo(bar, baz, ...)

Not so with Haskell ... I mean, not necessarily so (and not even ordinarily so).

Nor even with other Lisp-like languages. When everything's a list, you get a tremendous amount of conceptual freedom and from that freedom comes power.

But ... power? Later.

So, with Haskell, you can force the tupling of arguments to a function:

foo :: (bar, baz) -> quux

But why would you want to do that, requiring all the arguments to a function to be present when you can curry them:

foo :: bar -> baz -> quux

And then apply the arguments individually as needed, building a partial function (the partial function itself being a (functional) value) until you finally apply to get the solution (in this example, the quux-typed value).

So, currying: the 'anti-tuple.'

So we showed a simple, elegant use contra the tuple, but what are the uses of the tuple.

Well, like I said: plentiful, many-ful, many-fold. Manifold.

(I love writing words.) (Bear with me.)

(which means something totally different than 'bare with me.')

One of the primary uses that come to mind for the tuple, besides freely-relating two disconnected values (which, in and of itself is a big, huge win ... ever see a list of parameters to a function where there are obviously-paired values, except they aren't paired, so the code gets all mucked up treating something that isn't as if it were because it should be?), is to 'pretend' that two values are one and then to fool everybody else with this conceit.

Where does this work?

Have you ever needed to return two values from a function?

But of course you can't because a function returns 'a,' 'one,' 'single' value. A function doesn't return two values, and you don't want to have to write a class to package those values together (because why? That's something you may have to look into yourself to answer: if two values are related, why aren't you making that relation explicit in the type system? Types makes things easier, but only when you use them), so what do you do?

Return the tupled pair.

You now have functions that return two values, well-typed, at that, and if they can return two, then can return three, or four, or ... 'n.'

That's one place where tuples shine: 'Oh, I need the value from the list, but I also need the list-index where that value was.' You tuple the index with the value, and your function now returns the data you need to proceed with your work.

But then, once you start using tuples one place, you find yourself starting to see uses for tuples in many other places you never considered before, because you just coded around two disparate values.

'Coding around' things is the algorithmic approach, tuple provides a data-centric solution. You ever notice how greatly algorithm simplifies when the right data types are used?

Tuple.

Examples

Arrows

Of course, the fundamental data type of category theory, the map can be viewed as a list (which is a linked set of tuples (or 'pairs')) of tuples, where each tuple is composed of the key for the first value and the value for the second.

So that gives us another example: one is 'Map.Entry<K, V>' so obviously 'Tuple' I'm wondering why the Java folks just didn't name it 'Duck' since it was quacking like one.

But then ... Arrow.

Arrow captures the entirety of the function f :: a -> b, so if you look at it slightly differently, an arrow, or morphism, or function, is the mapping of all values applicable to the function f from Category A to Category B.

So, one way to look at an arrow f is simple the set of all tuples (a, b) where the 'a's are all possible inputs to f with the 'b's being the corresponding outcomes.

But there are more to arrows than that (and, fear not, this article will not be side-track to an arrow-exclusive): the primary arrow operation is 'first' (arrows are fundamentally, or minimally, defined on 'arr' (which is the 'arrow-maker' function) and 'first').

What does first do?

First supposes that the input has been split from some value a into two (tupled) values (a, a). And first takes an unit function and applies it to the the tuple's first value, leaving the second value unchanged.

So, if you have some value a, and a function f :: a -> b then

first f

does the following

first f (a, a) -> (b, a)

Or, actually, and more generally, this works for any tupled input value so:

first f (a, d) -> (b, d)

works as well.

Exercise: Given than f :: a -> b and first f :: (a, d) -> (b, d), write an implementation of first.

Ordering

We mentioned before that Map.Entry<K, V> is just another name for 'Tuple<K, V>', so then, looking at it this way, either List<Tuple<K, V>> or Set<Tuple<K, V>> is another name for Map<K, V> depending on which search strategy we wish to use.

... with a significant difference: as both the K and V in the tuple are used to compute the tupled instance's identity, then both those data structures permit 'duplicate' entries: a key may be associated with zero, one or more values.

And, here is where the search strategy becomes significant: List has an explicit positional ordering: first comes the first value, next come the second value, etc. This positioning may be exploited for the inherent positional index to a tuple ... or I exploited this inherent indexing, at any rate.

Problem statement: so, have a list of values, rows from a data table that we wish to display in order.

Simple enough: we make the data table entity a comparable type, sort the section, and we're done.

Or.
Are.
We?

The problem here is that ...

Well, there's always a 'the problem here is' when a model encounters the real world and when developed code hits production data.

This particular problem is that there are scenarios where an entity is represented by compound data, and so you need two rows to indicate one entity.  In these cases, you have the same entity id, so what do we do? We can complicate the sort-criterium for the comparison for the entire entity because of this special case, or we can sort simply and handle sorting these two-row entities on a case-by-case basis.

But then we sort the list twice. Once naturally and automatically, and we don't need to think about that sort.

But how do we juxtapose two rows with the same entity id in a list? We can't change the indices on a list pel-mel, can we?

Or.
Can.
We?

Using tuples, we can indeed do just that. If we have the mostly-well-sorted list

[Ent {13, ... Nov 20, asdf},
Ent {27, ... Aug 3, aega},
Ent {27, ... null, hjg},
Ent {34, ... null, asjkl},
Ent {34, ... Apr 26, aehe},
...]

And we wish to have the 'null' date higher up on the list, as per customer requirements. One way to do that is to add this to the search criteria. But what if the search criteria is already based off of three values of the primary key (and this date is not part of the primary key, because it is nullable)? Do we wish to add this new criterium for these specialized cases?

Yes, we can do that.

Or, as we did in the project, we did another sort, manually. You see, the question comes up, how do we recognized that two rows represent one value, and, when recognized, how do we swap the row order for that one case where we wish to?

You can't say, 'oh, this list index is now less one and the other list is now index plus one'

Not with a straight-up list, anyway, you can't1 change the indices there. But if we make the indices explicit by tupling them with their listed data:

int index = 0;
for(Ent entity : entities) {
}

And then, when we encounter our special case, we verify the order or correct it:

OrderedTuple<Integer, Ent> oldPair =
OrderedTuple.pair(-1, Ent.prototype());
for(OrderedTuple<Integer, Ent> pair: newList) {
Ent newEnt = pair.snd();
Ent oldEnt = oldPair.snd();
if(newEnt.id == oldEnt.id) {
if(newEnt.date == null) {
int pairKey = pair.fst();
pair.changeKey(pairKey - 1);
oldPair.changeKey(pairKey);
}
}
oldPair = pair;
}

// now we sort on the revised indices:
Collections.sort(newList);

// and return that sorted entity list (sans the explicit indicies):
List<Ent> revisedList = new ArrayList<Ent>();
for(OrderedTuple<Integer, Ent> pair : newList) {
}
return revisedList;

And that's one way to skin that cat.

So, what's OrderedTuple? It's simply a mutable tuple with an ordering imposed on the keyed value:

@Effectual
public class OrderedTuple<K extends Comparable<K>, V>
extends MutableTuple<K, V>
implements Comparable<OrderedTuple<K, V>> {
protected OrderedTuple(K key, V val) { super(key, val); }
public static <KEY extends Comparable<KEY> VALUE>
OrderedTuple<KEY, VALUE> pair(KEY k, VALUE v) {
return new OrderedTuple<KEY, VALUE>(k, v);
}

// Comparable implementation --------------
@Override public int compareTo(OrderedTuple<K, V> tup) {
return fst().compareTo(tup.fst());
}
}

Where mutable tuple simply decorates the tuple protocol with setters:

@Effectual public class MutableTuple<K, V> extends Tuple<K, V> {
protected MutableTuple(K k, V v) {
super(k, v);
key = new State<K>(k);
value = new State<V>(v);
}
@Override public K fst() { return key.arg(); }
@Override public V snd() { return value.arg(); }
@Effectual public void changeKey(k newKey) { key.put(newKey); }
@Effectual public void changeValue(v newValue) { value.put(newValue); }

private final State<K> key;
private final State<V> value;
}

Entanglement

Tuples are a way of pairing or associating things that are not explicitly already paired. As we saw above, normally the indices of a list are implied, but they are there and are associated with the elements of the list. A tuple makes that implied associate explicit. Once something is explicated then it can also be exploited in ways unimagined heretofore.

Another, novel, way I've used tupling is entanglement. Something over here, an event or an object, changes, and that change is supposed to reflect throughout the system. How do we do that?

One way is to hardcode it:

if(x changed) then change y

But this is at the algorithmic level, and things happening procedurally are notoriously hard to track.

We could, instead, make this a structure and realize entanglement as a data type.

This is what I do. I associate an observed object with an action, and that action is to change a value entangled (implicitly) with the observed object.

@Effectual public class Entanglement<T> extends Tuple<T, Effect<T>> {
private Entanglement(T source, Effect<T> setterF) {
super(source, setterF);
}

// the point of it all: we entangle two objects
public static <T1> Entanglement<T1> entangle(T1 src, Effect<T1> destF) {
return new Entanglement<T1>(src, destF);
}

// when we observe an object, that act of observation, itself,
// changes the destination object via entanglement:
public Maybe<T> observe() throws Failure {
return (Maybe<T>) Maybe.nullOr(fst()).liftThenBind(snd());
}
}

A very, very simple class for a very simple concept.

Where is such a concept used?

Well, as in the above scenario with Ent types, I was collecting a set of three of the most current dates from Ent rows where 'most current' is defined as the the last update date auditing column is greatest on a particular set of entity types.

The thing is that any of these three dates could be null, so if I wholesale replaced dates with the most recent Ent row, then I found I would be nulling out already established dates.

I had to update dates on the most recent non-null dates, and this occurred on a case-by-case basis.

A further complication: one of the dates has an associated duration type-code, so every time I replaced that date, I had to update the duration code, too.

Algorithmically this turned into a nightmarish mess. Three development teams were called in to code the solution, and three teams each, eventually, threw up their hands, declaring this to be an intractable problem.

I used entanglement to solve this problem.

For the associated date-duration-code, I had one entanglement:

Entanglement.entangle(incomingRow.getDateA(), new Effect<Date>() {
@Override protected void execute(Date newA) throws Failure {
stored.setDateA(newA);
int dur = incomingRow.getDurCd();
if(dur > 0) {
stored.setDurCd(dur);
}
}
}).observe();

For the stand-alone date, I used a simple entanglement:

Entanglement.entangle(incomingRow.getDateB(), new Effect<Date>() {
@Override protected void execute(Date newDateB) throws Failure {
stored.setDateB(newDateB);
}
}).observe();

Then I simply iterated up the result set of the set of entities of concern and I was able to get the correct dates for each entity row that were required.

Easy, peasy!

Using the entanglement type (which, itself, is simply a tuple of observed value and an effect) I overhauled a patchwork system, that actually wasn't working at all, of over one-thousand lines of code with a fifty line replacement that satisfied customer requirements.

My take-away here is that solving problems algorithmically leads to code bloat and complication, but when the problem is reconceptualized into data structures, the problem is simplified and modularized.

Tuple is a data type that does just that: often, in code, two values are implied to be associated, but there is no direct correlation. A tuple makes that association explicit.

κCalc: the Kappa calculus

One more point to cover on tuples.

The λ calculus can be considered to be the marriage of two dual calculi: the κ calculus, which is the calculus of composition,2 and allows functors of the first order only, and the ζ calculus, which is the calculus of application. So, in the κ calculus, values are lifted into tupled types, or dropped to the unit type. An entire calculus exists around the manipulation of tuples, and it has its own arithmetic and everything.

You can do anything with tuples! Tuples are like Tiger Balm that way.

-----

(1) Ever hear a British-English person say the word 'can't' ...? It is to die for! But I digress, and also, in digression, have embarrassed Mr. Sh, my boss.

(2) When I read this paper on the κ calculus, I was so pleased! I had defined my morphism type to be a type that contained only the identity and the composition functions, but not application! And I had wondered if I were doing the right thing. Isn't the whole point of it all to apply a function? Isn't that what a morphism ... 'is'? And, actually, no, that's not the case at all. There do exist morphisms free of the concept of function application, and the κ calculus has arrows where elimination occurs from (de)composition, not from function application.

There exist a whole class of arrows that do no application at all. And I got that correct!

But this topic can be discussed in the article where we consider the morphism.

I suppose I'll start that article with the words: "We consider the morphism."

Hm, I wonder if the map article and the morphism article are one and the same?

# Comprehensions

Prompted by some recent work I’ve been doing on reasoning about monadic computations, I’ve been looking back at the work from the 1990s by Phil Trinder, Limsoon Wong, Leonidas Fegaras, Torsten Grust, and others, on monad comprehensions as a framework for database queries.

The idea goes back to the adjunction between extension and intension in set theory—you can define a set by its extension, that is by listing its elements:

$\displaystyle \{ 1, 9, 25, 49, 81 \}$

or by its intension, that is by characterizing those elements:

$\displaystyle \{ n^2 \mid 0 < n < 10 \land n \equiv 1 (\mathop{mod} 2) \}$

Expressions in the latter form are called set comprehensions. They inspired a programming notation in the SETL language from NYU, and have become widely known through list comprehensions in languages like Haskell. The structure needed of sets or of lists to make this work is roughly that of a monad, and Phil Wadler showed how to generalize comprehensions to arbitrary monads, which led to the “do” notation in Haskell. Around the same time, Phil Trinder showed that comprehensions make a convenient database query language. The comprehension notation has been extended to cover other important aspects of database queries, particularly aggregation and grouping. Monads and aggregations have very nice algebraic structure, which leads to a useful body of laws to support database query optimization.

## List comprehensions

Just as a warm-up, here is a reminder about Haskell’s list comprehensions.

$\displaystyle [ 2 \times a + b \mid a \leftarrow [1,2,3] , b \leftarrow [4,5,6] , b \mathbin{\underline{\smash{\mathit{mod}}}} a == 0 ]$

This (rather concocted) example yields the list of all values of the expression ${2 \times a + b}$ as ${a}$ is drawn from ${[1,2,3]}$ and ${b}$ from ${[4,5,6]}$ and such that ${b}$ is divisible by ${a}$, namely ${[6,7,8,8,10,12]}$.

To the left of the vertical bar is the term (an expression). To the right is a comma-separated sequence of qualifiers, each of which is either a generator (of the form ${a \leftarrow x}$, with a variable ${a}$ and a list expression ${x}$) or a filter (a boolean expression). The scope of a variable introduced by a generator extends to all subsequent generators and to the term. Note that, in contrast to the mathematical inspiration, bound variables need to be generated from some existing list.

The semantics of list comprehensions is defined by translation; see for example Phil Wadler’s Chapter 7 of The Implementation of Functional Programming Languages. It can be expressed equationally as follows:

$\displaystyle \begin{array}{lcl} [ e \mid \epsilon ] &=& [e] \\ {} [ e \mid b ] &=& \mathbf{if}\;b\;\mathbf{then}\;[ e ]\;\mathbf{else}\;[\,] \\ {} [ e \mid a \leftarrow x ] &=& \mathit{map}\,(\lambda a \mathbin{.} e)\,x \\ {} [ e \mid q, q' ] &=& \mathit{concat}\,[ [ e \mid q' ] \mid q ] \end{array}$

(Here, ${\epsilon}$ denotes the empty sequence of qualifiers. It’s not allowed in Haskell, but it is helpful in simplifying the translation.)

Applying this translation to the example at the start of the section gives

$\displaystyle \begin{array}{ll} & [ 2 \times a + b \mid a \leftarrow [1,2,3] , b \leftarrow [4,5,6] , b \mathbin{\underline{\smash{\mathit{mod}}}} a == 0 ] \\ = & \mathit{concat}\,(\mathit{map}\,(\lambda a \mathbin{.} \mathit{concat}\,(\mathit{map}\,(\lambda b \mathbin{.} \mathbf{if}\;b \mathbin{\underline{\smash{\mathit{mod}}}} a == 0\;\mathbf{then}\;[2 \times a + b]\;\mathbf{else}\;[\,])\,[4,5,6]))\,[1,2,3]) \\ = & [6,7,8,8,10,12] \end{array}$

More generally, a generator may match against a pattern rather than just a variable. In that case, it may bind multiple (or indeed no) variables at once; moreover, the match may fail, in which case it is discarded. This is handled by modifying the translation for generators to use a function defined by pattern-matching, rather than a straight lambda-abstraction:

$\displaystyle [ e \mid p \leftarrow x ] = \mathit{concat}\,(\mathit{map}\,(\lambda a \mathbin{.} \mathbf{case}\;a\;\mathbf{of}\;p \rightarrow [ e ] \;;\; \_ \rightarrow [\,])\,x)$

or, more perspicuously,

$\displaystyle [ e \mid p \leftarrow x ] = \mathbf{let}\;h\,p = [ e ] ; h\,\_ = [\,]\;\mathbf{in}\; \mathit{concat}\,(\mathit{map}\,h\,x)$

It is clear from the above translation that the necessary ingredients for list comprehensions are ${\mathit{map}}$, singletons, ${\mathit{concat}}$, and the empty list. The first three are the operations arising from lists as a functor and a monad, which suggests that the same translation might be applicable to other monads too. But the fourth ingredient, the empty list, does not come from the functor and monad structures; that requires an extra assumption:

$\displaystyle \begin{array}{ll} \mathbf{class}\;\mathit{Monad}\,m \Rightarrow \mathit{MonadZero}\,m\;\mathbf{where} \\ \quad \mathit{mzero} :: m\,a \end{array}$

Then the translation for list comprehensions can be generalized to other monads:

$\displaystyle \begin{array}{lcl} [ e \mid \epsilon ] &=& \mathit{return}\,e \\ {} [ e \mid b ] &=& \mathbf{if}\;b\;\mathbf{then}\;\mathit{return}\,e\;\mathbf{else}\;\mathit{mzero} \\ {} [ e \mid p \leftarrow m ] &=& \mathbf{let}\;h\,p = \mathit{return}\,e ; h\,\_ = \mathit{mzero}\;\mathbf{in}\; \mathit{join}\,(\mathit{map}\,h\,m) \\ {} [ e \mid q, q' ] &=& \mathit{join}\,[ [ e \mid q' ] \mid q ] \end{array}$

(so ${[ e \mid \epsilon ] = [ e \mid \mathit{True} ]}$). The actual monad to be used is implicit; if we want to be explicit, we could use a subscript, as in “${[ e \mid q ]_\mathsf{List}}$“.

This translation is different from the one used in the Haskell language specification, which to my mind is a little awkward: the empty list crops up in two different ways in the translation of list comprehensions—for filters, and for generators with patterns—and these are generalized in two different ways to other monads (to the ${\mathit{mzero}}$ method of the ${\mathit{MonadPlus}}$ class in the first case, and the ${\mathit{fail}}$ method of the ${\mathit{Monad}}$ class in the second). I think it is neater to have a monad subclass ${\mathit{MonadZero}}$ with a single method subsuming both these operators. Of course, this does mean that the translation forces a monad comprehension with filters or possibly failing generators to be interpreted in a monad in the ${\mathit{MonadZero}}$ subclass rather than just ${\mathit{Monad}}$—the type class constraints that are generated depend on the features used in the comprehension. (Perhaps this translation was tried in earlier versions of the language specification, and found wanting?)

Taking this approach gives basically the monad comprehension notation from Wadler’s Comprehending Monads paper; it loosely corresponds to Haskell’s do notation, except that the term is to the left of a vertical bar rather than at the end, and that filters are just boolean expressions rather than introduced using ${\mathit{guard}}$.

We might impose the law that ${\mathit{mzero}}$ is a “left” zero of composition, in the sense

$\displaystyle \mathit{join}\,\mathit{mzero} = \mathit{mzero}$

or, in terms of comprehensions,

$\displaystyle [ e \mid a \leftarrow \mathit{mzero} ] = \mathit{mzero}$

Informally, this means that any failing steps of the computation cleanly cut off subsequent branches. Conversely, we do not require that ${\mathit{mzero}}$ is a “right” zero too:

$\displaystyle \mathit{join}\,(\mathit{map}\,(\lambda a \mathbin{.} \mathit{mzero})\,m) \ne \mathit{mzero} \quad\mbox{(in general)}$

This would have the consequence that a failing step also cleanly erases any effects from earlier parts of the computation, which is too strong a requirement for many monads—particularly those of the “launch missiles now” variety. (The names “left-” and “right zero” make more sense when the equations are expressed in terms of the usual Haskell bind operator ${(\gg\!=)}$, which is a kind of sequential composition.)

One more ingredient is needed in order to characterize monads that correspond to “collection classes” such as sets and lists, and that is an analogue of set union or list append. It’s not difficult to see that this is inexpressible in terms of the operations introduced so far: given only collections ${m}$ of at most one element, any comprehension using generators of the form ${a \leftarrow m}$ will only yield another such collection, whereas the union of two one-element collections will in general have two elements.

To allow any finite collection to be expressed, it suffices to introduce a binary union operator ${\uplus}$:

$\displaystyle \begin{array}{ll} \mathbf{class}\;\mathit{Monad}\,m \Rightarrow \mathit{MonadPlus}\,m\;\mathbf{where} \\ \quad (\uplus) :: m\,a \times m\,a \rightarrow m\,a \end{array}$

We require composition to distribute over union, in the following sense:

$\displaystyle \mathit{join}\,(m \uplus n) = \mathit{join}\,m \uplus \mathit{join}\,n$

or, in terms of comprehensions,

$\displaystyle [ e \mid a \leftarrow m \uplus n, q ] = [ e \mid a \leftarrow m, q ] \uplus [ e \mid a \leftarrow n, q ]$

For the remainder of this post, we will assume a monad in both ${\mathit{MonadZero}}$ and ${\mathit{MonadPlus}}$. Moreover, we will assume that ${\mathit{mzero}}$ is the unit of ${\uplus}$, and is both a left- and a right zero of composition. To stress the additional constraints, we will write “${\emptyset}$” for “${\mathit{mzero}}$” from now on. The intention is that such monads exactly capture collection classes; Phil Wadler has called these structures ringads. (He seems to have done so in an unpublished note Notes on Monads and Ringads from 1990, which is cited by some papers from the early 1990s. But Phil no longer has a copy of this note, and it’s not online anywhere… I’d love to see a copy, if anyone has one!)

$\displaystyle \begin{array}{ll} \mathbf{class}\;(\mathit{MonadZero}\,m, \mathit{MonadPlus}\,m) \Rightarrow \mathit{Ringad}\,m\;\mathbf{where} \end{array}$

(There are no additional methods; the class ${\mathit{Ringad}}$ is the intersection of the two parent classes ${\mathit{MonadZero}}$ and ${\mathit{MonadPlus}}$, with the union of the two interfaces, together with the laws above.) I used roughly the same construction already in the post on Horner’s Rule.

As well as (finite) sets and lists, ringad instances include (finite) bags and a funny kind of binary tree (externally labelled, possibly empty, in which the empty tree is a unit of the binary tree constructor). These are all members of the so-called Boom Hierarchy of types—a name coined by Richard Bird, for an idea due to Hendrik Boom, who by happy coincidence is named for one of these structures in his native language. All members of the Boom Hierarchy are generated from the empty, singleton, and union operators, the difference being whether union is associative, commutative, and idempotent. Another ringad instance, but not a member of the Boom Hierarchy, is the type of probability distributions—either normalized, with a weight-indexed family of union operators, or unnormalized, with an additional scaling operator.

## Aggregation

The well-behaved operations over monadic values are called the algebras for that monad—functions ${k}$ such that ${k \cdot \mathit{return} = \mathit{id}}$ and ${k \cdot \mathit{join} = k \cdot \mathit{map}\,k}$. In particular, ${\mathit{join}}$ is itself a monad algebra. When the monad is also a ringad, ${k}$ necessarily distributes also over ${\uplus}$—there is a binary operator ${\oplus}$ such that ${k\,(m \uplus n) = k\,m \oplus k\,n}$ (exercise!). Without loss of generality, we write ${\oplus/}$ for ${k}$; these are the “reductions” of the Bird–Meertens Formalism. In that case, ${\mathit{join} = \uplus/}$ is a ringad algebra.

The algebras for a ringad amount to aggregation functions for a collection: the sum of a bag of integers, the maximum of a set of naturals, and so on. We could extend the comprehension notation to encompass aggregations too, for example by adding an optional annotation, writing say “${[ e \mid q ]^\oplus}$“; although this doesn’t add much, because we could just have written “${\oplus/\,[e \mid q]}$” instead. We could generalize from reductions ${\oplus/}$ to collection homomorphisms ${\oplus/ \cdot \mathit{map}\,f}$; but this doesn’t add much either, because the map is easily combined with the comprehension—it’s easy to show the “map over comprehension” property

$\displaystyle \mathit{map}\,f\,[e \mid q] = [f\,e \mid q]$

Leonidas Fegaras and David Maier develop a monoid comprehension calculus around such aggregations; but I think their name is inappropriate, because nothing forces the binary aggregating operator to be associative.

Note that, for ${\oplus/}$ to be well-defined, ${\oplus}$ must satisfy all the laws that ${\uplus}$ does—${\oplus}$ must be associative if ${\uplus}$ is associative, and so on. It is not hard to show, for instance, that there is no ${\oplus}$ on sets of numbers for which ${\mathit{sum}\,(x \cup y) = \mathit{sum}\,x \oplus \mathit{sum}\,y}$; such an ${\oplus}$ would have to be idempotent, which is inconsistent with its relationship with ${\mathit{sum}}$. (So, although ${[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Bag}^{+}}$ denotes the sum of the squares of the odd elements of bag ${x}$, the expression ${[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Set}^{+}}$ (with ${x}$ now a set) is not defined, because ${+}$ is not idempotent.) In particular, ${\oplus/\emptyset}$ must be the unit of ${\oplus}$, which we write ${1_\oplus}$.

We can derive translation rules for aggregations from the definition

$\displaystyle [ e \mid q ]^\oplus = \oplus/\,[e \mid q]$

For empty aggregations, we have:

$\displaystyle \begin{array}{ll} & [ e \mid \epsilon ]^\oplus \\ = & \qquad \{ \mbox{aggregation} \} \\ & \oplus/\,[ e \mid \epsilon ] \\ = & \qquad \{ \mbox{comprehension} \} \\ & \oplus/\,(\mathit{return}\,e) \\ = & \qquad \{ \mbox{monad algebra} \} \\ & e \end{array}$

For filters, we have:

$\displaystyle \begin{array}{ll} & [ e \mid b ]^\oplus \\ = & \qquad \{ \mbox{aggregation} \} \\ & \oplus/\,[ e \mid b ] \\ = & \qquad \{ \mbox{comprehension} \} \\ & \oplus/\,(\mathbf{if}\;b\;\mathbf{then}\;\mathit{return}\,e\;\mathbf{else}\;\emptyset) \\ = & \qquad \{ \mbox{lift out the conditional} \} \\ & \mathbf{if}\;b\;\mathbf{then}\;{\oplus/}\,(\mathit{return}\,e)\;\mathbf{else}\;{\oplus/}\,\emptyset \\ = & \qquad \{ \mbox{ringad algebra} \} \\ & \mathbf{if}\;b\;\mathbf{then}\;e\;\mathbf{else}\;1_\oplus \end{array}$

For generators, we have:

$\displaystyle \begin{array}{ll} & [ e \mid p \leftarrow m ]^\oplus \\ = & \qquad \{ \mbox{aggregation} \} \\ & \oplus/\,[ e \mid p \leftarrow m ] \\ = & \qquad \{ \mbox{comprehension} \} \\ & \oplus/\,(\mathbf{let}\;h\,p = \mathit{return}\,e ; h\,\_ = \emptyset\;\mathbf{in}\;\mathit{join}\,(\mathit{map}\,h\,m)) \\ = & \qquad \{ \mbox{lift out the \textbf{let}} \} \\ & \mathbf{let}\;h\,p = \mathit{return},e ; h\,\_ = \emptyset\;\mathbf{in}\;{\oplus/}\,(\mathit{join}\,(\mathit{map}\,h\,m)) \\ = & \qquad \{ \mbox{monad algebra} \} \\ & \mathbf{let}\;h\,p = \mathit{return}\,e ; h\,\_ = \emptyset\;\mathbf{in}\;{\oplus/}\,(\mathit{map}\,(\oplus/)\,(\mathit{map}\,h\,m)) \\ = & \qquad \{ \mbox{functors} \} \\ & \mathbf{let}\;h\,p = \mathit{return}\,e ; h\,\_ = \emptyset\;\mathbf{in}\;{\oplus/}\,(\mathit{map}\,(\oplus/ \cdot h)\,m) \\ = & \qquad \{ \mbox{let~} h' = \oplus/ \cdot h \} \\ & \mathbf{let}\;h'\,p = \oplus/\,(\mathit{return}\,e) ; h'\,\_ = \oplus/\,\emptyset\;\mathbf{in}\;{\oplus/}\,(\mathit{map}\,h'\,m) \\ = & \qquad \{ \mbox{ringad algebra} \} \\ & \mathbf{let}\;h'\,p = e ; h'\,\_ = 1_\oplus\;\mathbf{in}\;{\oplus/}\,(\mathit{map}\,h'\,m) \end{array}$

And for sequences of qualifiers, we have:

$\displaystyle \begin{array}{ll} & [ e \mid q, q' ]^\oplus \\ = & \qquad \{ \mbox{aggregation} \} \\ & \oplus/\,[ e \mid q, q' ] \\ = & \qquad \{ \mbox{comprehension} \} \\ & \oplus/\,(\mathit{join}\,[ [ e \mid q'] \mid q ] \\ = & \qquad \{ \mbox{monad algebra} \} \\ & \oplus/\,(\mathit{map}\,(\oplus/)\,[ [ e \mid q'] \mid q ]) \\ = & \qquad \{ \mbox{map over comprehension} \} \\ & \oplus/\,[ \oplus/\,[ e \mid q'] \mid q ] \\ = & \qquad \{ \mbox{aggregation} \} \\ & [ [ e \mid q']^\oplus \mid q ]^\oplus \end{array}$

Putting all this together, we have:

$\displaystyle \begin{array}{lcl} [ e \mid \epsilon ]^\oplus &=& e \\ {} [ e \mid b ]^\oplus &=&\mathbf{if}\;b\;\mathbf{then}\;e\;\mathbf{else}\;1_\oplus \\ {} [ e \mid p \leftarrow m ]^\oplus &=& \mathbf{let}\;h\,p = e ; h\,\_ = 1_\oplus\;\mathbf{in}\;{\oplus/}\,(\mathit{map}\,h\,m) \\ {} [ e \mid q, q' ]^\oplus &=& [ [ e \mid q']^\oplus \mid q ]^\oplus \end{array}$

## Heterogeneous comprehensions

We have seen that comprehensions can be interpreted in an arbitrary ringad; for example, ${[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Set}}$ denotes (the set of) the squares of the odd elements of (the set) ${x}$, whereas ${[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Bag}}$ denotes the bag of such elements, with ${x}$ a bag. Can we make sense of “heterogeneous comprehensions”, involving several different ringads?

Let’s introduced the notion of a ringad morphism, extending the familiar analogue on monads. For monads ${\mathsf{M}}$ and ${\mathsf{N}}$, a monad morphism ${\phi : \mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{N}}$ is a natural transformation ${\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{N}}$—that is, a family ${\phi_\alpha :: \mathsf{M}\,\alpha \rightarrow \mathsf{N}\,\alpha}$ of arrows, coherent in the sense that ${\phi_\beta \cdot \mathsf{M}\,f = \mathsf{N}\,f \cdot \phi_\alpha}$ for ${f :: \alpha \rightarrow \beta}$—that also preserves the monad structure:

$\displaystyle \begin{array}{lclcl} \phi \cdot \mathit{return}_\mathsf{M} &=& \mathit{return}_\mathsf{N} \\ \phi \cdot \mathit{join}_\mathsf{M} &=& \mathit{join}_\mathsf{N} \cdot \phi \cdot \mathsf{M}\,\phi &=& \mathit{join}_\mathsf{N} \cdot \mathsf{N}\,\phi \cdot \phi \end{array}$

A ringad morphism ${\phi : \mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{N}}$ for ringads ${\mathsf{M},\mathsf{N}}$ is a monad morphism ${\phi : \mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{N}}$ that also respects the ringad structure:

$\displaystyle \begin{array}{lcl} \phi\,\emptyset_\mathsf{M} &=& \emptyset_\mathsf{N} \\ \phi\,(x \uplus_\mathsf{M} y) &=& \phi\,x \uplus_\mathsf{N} \phi\,y \end{array}$

Then a ringad morphism behaves nicely with respect to ringad comprehensions—a comprehension interpreted in ringad ${\mathsf{M}}$, using existing collections of type ${\mathsf{M}}$, with the result transformed via a ringad morphism ${\phi : \mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{N}}$ to ringad ${\mathsf{N}}$, is equivalent to the comprehension interpreted in ringad ${\mathsf{N}}$ in the first place, but with the initial collections transformed to type ${\mathsf{N}}$. Informally, there will be no surprises arising from when ringad coercions take place, because the results are the same whenever this happens. This property is straightforward to show by induction over the structure of the comprehension. For the empty comprehension, we have:

$\displaystyle \begin{array}{ll} & \phi\,[ e \mid \epsilon ]_\mathsf{M} \\ = & \qquad \{ \mbox{comprehension} \} \\ & \phi\,(\mathit{return}_\mathsf{M}\,e) \\ = & \qquad \{ \mbox{ringad morphism} \} \\ & \mathit{return}_\mathsf{N}\,e \\ = & \qquad \{ \mbox{comprehension} \} \\ & [e \mid \epsilon ]_\mathsf{N} \end{array}$

For filters, we have:

$\displaystyle \begin{array}{ll} & \phi\,[ e \mid b ]_\mathsf{M} \\ = & \qquad \{ \mbox{comprehension} \} \\ & \phi\,(\mathbf{if}\;b\;\mathbf{then}\;\mathit{return}_\mathsf{M}\,e\;\mathbf{else}\;\emptyset_\mathsf{M}) \\ = & \qquad \{ \mbox{lift out the conditional} \} \\ & \mathbf{if}\;b\;\mathbf{then}\;\phi\,(\mathit{return}_\mathsf{M}\,e)\;\mathbf{else}\;\phi\,\emptyset_\mathsf{M} \\ = & \qquad \{ \mbox{ringad morphism} \} \\ & \mathbf{if}\;b\;\mathbf{then}\;\mathit{return}_\mathsf{N}\,e\;\mathbf{else}\;\emptyset_\mathsf{N} \\ = & \qquad \{ \mbox{comprehension} \} \\ & [ e \mid b ]_\mathsf{N} \end{array}$

For generators:

$\displaystyle \begin{array}{ll} & \phi\,[ e \mid p \leftarrow m ]_\mathsf{M} \\ = & \qquad \{ \mbox{comprehension} \} \\ & \phi\,(\mathbf{let}\;h\,p = \mathit{return}_\mathsf{M}\,e ; h\,\_ = \emptyset_\mathsf{M}\;\mathbf{in}\;\mathit{join}_\mathsf{M}\,(\mathit{map}_\mathsf{M}\,h\,m)) \\ = & \qquad \{ \mbox{lift out the \textbf{let}} \} \\ & \mathbf{let}\;h\,p = \mathit{return}_\mathsf{M}\,e ; h\,\_ = \emptyset_\mathsf{M}\;\mathbf{in}\;\phi\,(\mathit{join}_\mathsf{M}\,(\mathit{map}_\mathsf{M}\,h\,m)) \\ = & \qquad \{ \mbox{ringad morphism, functors} \} \\ & \mathbf{let}\;h\,p = \mathit{return}_\mathsf{M}\,e ; h\,\_ = \emptyset_\mathsf{M}\;\mathbf{in}\;\mathit{join}_\mathsf{N}\,(\phi\,(\mathit{map}_\mathsf{M}\,(\phi \cdot h)\,m)) \\ = & \qquad \{ \mbox{let~} h' = \phi \cdot h \} \\ & \mathbf{let}\;h'\,p = \phi\,(\mathit{return}_\mathsf{M}\,e) ; h'\,\_ = \phi\,\emptyset_\mathsf{M}\;\mathbf{in}\;\mathit{join}_\mathsf{N}\,(\phi\,(\mathit{map}_\mathsf{M}\,h'\,m)) \\ = & \qquad \{ \mbox{ringad morphism, induction} \} \\ & \mathbf{let}\;h'\,p = \mathit{return}_\mathsf{N}\,e ; h'\,\_ = \emptyset_\mathsf{N}\;\mathbf{in}\;\mathit{join}_\mathsf{N}\,(\phi\,(\mathit{map}_\mathsf{M}\,h'\,m)) \\ = & \qquad \{ \mbox{naturality of~} \phi \} \\ & \mathbf{let}\;h'\,p = \mathit{return}_\mathsf{N}\,e ; h'\,\_ = \emptyset_\mathsf{N}\;\mathbf{in}\;\mathit{join}_\mathsf{N}\,(\mathit{map}_\mathsf{N}\,h'\,(\phi\,m)) \\ = & \qquad \{ \mbox{comprehension} \} \\ & [ e \mid p \leftarrow \phi\,m ]_\mathsf{N} \end{array}$

And for sequences of qualifiers:

$\displaystyle \begin{array}{ll} & \phi\,[ e \mid q, q' ]_\mathsf{M} \\ = & \qquad \{ \mbox{comprehension} \} \\ & \phi\,(\mathit{join}\,[ [ e \mid q' ]_\mathsf{M} \mid q ]_\mathsf{M}) \\ = & \qquad \{ \mbox{ringad morphism} \} \\ & \phi\,(\mathit{map}\,\phi\,[ [ e \mid q' ]_\mathsf{M} \mid q ]_\mathsf{M}) \\ = & \qquad \{ \mbox{map over comprehension} \} \\ & \phi\,[ \phi\,[ e \mid q' ]_\mathsf{M} \mid q ]_\mathsf{M} \\ = & \qquad \{ \mbox{induction} \} \\ & [ [ e \mid q' ]_\mathsf{N} \mid q ]_\mathsf{N} \\ = & \qquad \{ \mbox{comprehension} \} \\ & [ e \mid q, q' ]_\mathsf{N} \end{array}$

For example, if ${\mathit{bag2set} : \mathsf{Bag} \mathbin{\stackrel{.}{\to}} \mathsf{Set}}$ is the obvious ringad morphism from bags to sets, discarding information about the multiplicity of repeated elements, and ${x}$ a bag of numbers, then

$\displaystyle \mathit{bag2set}\,[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Bag} = [a^2 \mid a \leftarrow \mathit{bag2set}\,x, \mathit{odd}\,a]_\mathsf{Set}$

and both yield the set of squares of the odd members of ${x}$. As a notational convenience, we might elide use of the ringad morphism when it is “obvious from context”—we might write just ${[a^2 \mid a \leftarrow x, \mathit{odd}\,a]_\mathsf{Set}}$ even when ${x}$ is a bag, relying on the “obvious” morphism ${\mathit{bag2set}}$. This would allow us to write, for example,

$\displaystyle [ a+b \mid a \leftarrow [1,2,3], b \leftarrow \langle4,4,5\rangle ]_\mathsf{Set} = \{ 5,6,7,8 \}$

(writing ${\langle\ldots\rangle}$ for the extension of a bag), instead of the more pedantic

$\displaystyle [ a+b \mid a \leftarrow \mathit{list2set}\,[1,2,3], b \leftarrow \mathit{bag2set}\,\langle4,4,5\rangle ]_\mathsf{Set} = \{ 5,6,7,8 \}$

There is a forgetful function from any poorer member of the Boom hierarchy to a richer one, flattening some distinctions by imposing additional laws—for example, from bags to sets, flattening distinctions concerning multiplicity—and I would class these forgetful functions as “obvious” morphisms. On the other hand, any morphisms in the opposite direction—such as sorting, from bags to lists, and one-of-each, from sets to bags—are not “obvious”, and so should not be elided; and similarly, I’m not sure that I could justify as “obvious” any morphisms involving non-members of the Boom Hierarchy, such as probability distributions.

# Joining the Industrial Haskell Group

As of today, I’m a an associate member of the Industrial Haskell Group!

The IHG is an organisation to support the needs of commercial users of Haskell.

Do you do Haskell development professionally? If so, consider becoming a member, too! It’s an easy way for companies and academic departments/groups to help support general health of the Haskell development platform.

# Homotopy type theory lectures and notes on-line

My activity on this blog has been reduced to nil recently because I have been spending my time preparing a series of lectures on homotopy type theory, starting from basic principles and ending with the application of univalence and higher inductive types to algebraic topology.  The course web page contains links to the video-taped lectures and to the course notes prepared by the students this semester.  These will be available indefinitely and are accessible to anyone interested in the course.  My hope is that these will provide a useful introduction to the HoTT Book, which is available for free on the web and may be printed on demand.  My intention is to write an introduction to dependent type theory for a CS audience, which will serve as a reference for much ongoing work in the area and as a prolegomenon to the HoTT Book itself.

Filed under: Research, Teaching Tagged: homotopy type theory, hott, hott book

# Homotopy before type theory

Last Friday, the 18th of October, I talked about homotopy and the fundamental group. As I heard about homotopy type theory, I thought that reviewing what I know of the geometric part could be useful. So I prepared this talk, using Allen Hatcher’s free textbook.

Consider a continuous line on the whiteboard, from point $x_0$ to point $x_1$, and another continuous line from point $y_0$ to point $y_1$. Think of a bijection between the two lines: we may think of it as a family of trajectories from a point on the first line to a point of the second line, so that at time $t=0$ we are at the beginning of each trajectory, and at time $t=1$ we are at the end. But we may want to add another requirement: that, at each time $t \in [0,1]$, the collection of points that have been reached at time $t$ is itself a continuous line on the whiteboard! This is the idea at the base of the geometric concept of homotopy.

Recall that a path in a topological space $X$ is a continuous function $p$ from the unit interval $I = [0,1]$ to $X$. Call $X$ pathwise connected if for any two points $x_0, x_1 \in X$ there exists a path $p$ from $x_0$ to $x_1$, i.e., such that $p(0) = x_0$ and $p(1) = x_1$. If $p$ is a path in $X$ and $\phi : X \to Y$ is continuous, then $\phi \circ p$ is a path in $Y$: thus, a continuous image of a pathwise connected space is pathwise connected.The $n$-dimensional unit balls $D^n = \{ x \in \mathbb{R}^n \mid \|x\| \leq 1 \}$ and $n$-dimensional spheres $S^n = \{ x \in \mathbb{R}^{n+1} \mid \|x\| = 1 \}$ are pathwise connected.

If $f, g : I \to X$ are paths and $f(1) = g(0)$, their concatenation $f \,;\, g$ (read: “$f$ then $g$”) is defined by $(f \,;\, g)(t) = f(2t)$ for $t \leq 1/2$ and $(f \,;\, g)(t) = g(2t-1)$ for $t \geq 1/2$: in other words, we run across both paths in sequence, at twice the speed.

Definition 1. Let $X$ and $Y$ be topological spaces and let $f_0, f_1 : X \to Y$ be continuous functions. A homotopy from $f_0$ to $f_1$ is a continuous function $\Phi : X \times I \to Y$ such that $\Phi(x, i) = f_i(x)$ for every $x \in X$ and $i \in \{0, 1\}$. Two functions $f_0, f_1$ are homotopic, written $f_0 \simeq f_1$, if there exists a homotopy from $f_0$ to $f_1$.

It can be easily check that homotopy is an equivalence relation.

Definition 2. A homotopy equivalence between two pathwise connected spaces $X, Y$ is a pair of continuous functions $f : X \to Y$, $g : Y \to X$ such that $g \circ f \simeq \mathrm{id}_X$ and $f \circ g \simeq \mathrm{id}_Y$. $X$ and $Y$ are homotopy equivalent if there exists a homotopy equivalence between them.

For example, the zero function $f : \mathbb{R}^n \to \{0\}$ and the inclusion $g : \{0\} \to \mathbb{R}^n$ form a homotopy equivalence, with $f \circ g = \mathrm{id}_{\{0\}}$ and $\Phi(x,t) = tx$ being a homotopy from $g \circ f$ and $\mathrm{id}_{\mathbb{R}^n}$. Spaces that are homotopy equivalent to a point are called contractible.

The importance of homotopy equivalence, is that it preserves several algebraic invariants, i.e., algebraic objects associated to topological spaces so that homeomorphic spaces have same invariants. The first such invariant is the fundamental group, which we will define in the next paragraphs.

Let $X$ be a path connected space and $x_0, x_1 \in X$: consider the family of paths from $x_0$ to $x_1$. We want to study the homotopy classes of such paths, with an additional restriction on the homotopies to be endpoint-fixing: $\Phi(0,t) = x_0$ and $\Phi(1,t) = 1$ for every $t \in [0,1]$, i.e., at each time $t$ the function $\lambda (s : [0,1]) . \Phi(s,t)$ must be a path from $x_0$ to $x_1$, rather than any two arbitrary points of $X$. This restriction on the homotopies makes them congruential with respect to concatenation: if $f_0 \simeq f_1$, $g_0 \simeq g_1$, and $f_0(1) = f_1(1) = g_0(0) = g_1(0)$, then $f_0 \,;\, g_0 \simeq f_1 \,;\, g_1$. We may thus think about the 2-category whose objects are the points of a topological space, morphisms are the paths from point to point, and morphisms of morphisms are endpoint-fixing homotopies from path to path.

There is more! Although $(f \,;\, g) \,;\, h$ and $f \,;\, (g \,;\, h)$ are not the same, the only difference is in the velocities while running on the different tracks: and it is easy to see how such change of velocity can be done continuously, so that $(f \,;\, g) \,;\, h \simeq f \,;\, (g \,;\, h)$. For the same reason, if $c_{x_0}$ is the constant path at $x_0$, then $c_{x_0} \,;\, f$ is clearly homotopic to $f$ whatever $f$ is; similarly, $f \,; c_{x_1} \simeq f$. (Paths with initial point equal to final point are called loops.) Finally, if we define the reverse of the path $f$ by $f^R(t) = f(1-t)$, then $f \,;\, f^R \simeq c_{x_0}$ and $f^R \,;\, f \simeq c_{x_1}$. (Think about the original point having a lazy friend, who stops somewhere to rest and waits for the original point to come back to go back together to $x_0$.) This justifies

Definition 3. Let $X$ be a pathwise connected space and let $x_0 \in X$. The fundamental group of $X$ based at $x_0$ is the group $\pi_1(X, x_0)$ whose elements are the homotopy classes of loops at $x_0$, product is defined by $[f] \cdot [g] = [f \,;\, g]$, identity is $[c_{x_0}]$, and inverse is defined by $[f]^{-1} = [f^R]$.

Suppose $f : X \to Y$ and $g : Y \to X$ form a homotopy equivalence. Let $p$ be a loop in $X$ based on $x_0$: then $\phi([p]) = [f \circ p \circ g]$ is well defined, and with some computation it turns out to be a group isomorphism. This is the reason why we sometimes talk about $\pi_1(X)$, without specifying $x_0$.

As every loop $p$ in in $\mathbb{R}^n$ is homotopic to a constant loop via the homotopy $\Phi(s,t) = (1-t) \cdot p(s)$, $\pi_1(\mathbb{R}^n)$ is the trivial group: in this case, when $\pi_1(X) = 0$, we say that $X$ is simply connected. For $n \geq 3$, $\mathbb{R}^n \setminus \{0\}$ is also simply connected, as intuitively we may always move the path in a bounded region which contains the image of the entire homotopy, and does not contain the origin. On the other hand, the same intuition tells us that it should not be possible to continuously transform a loop in $\mathbb{R}^2$ around the origin into a loop in $\mathbb{R}^2$ which does not “surround” the origin, without crossing the origin sometimes: this is confirmed by

Theorem 1. $\pi_1(S^1) \cong \mathbb{Z}$.

The proof of Theorem 1 is based on the intuitive idea that we can homomorphically identify the number $k \in \mathbb{Z}$ with the homotopy class of $|k|$ windings based on $(1,0)$, counterclockwise if $k > 0$ and clockwise if $k < 0$. To prove that such homomorphism is bijective, however, would go beyond the scope of this post. On the other hand, $S^n$ is simply connected for $n \geq 2$: think to what happens when we tie an elastic rubber band around a tennis ball.

Definition 4. A retraction of a topological space $X$ onto a subspace $S \subseteq X$ is a continuous function $r : X \to S$ that fixes every point of $S$; $S$ is a retract of $X$ if there exists a retraction of $X$ onto $S$. A deformation retraction of a topological space $X$ onto a subspace $S \subseteq X$ is a homotopy $\Phi : X \times I \to X$ from the identity of $X$ to a retraction of $X$ onto $S$; $S$ is a deformation retract of $X$ if there exists a deformation retraction of $X$ onto $S$.

$S^n$ is a retract of $\mathbb{R}^{n+1} \setminus \{0\}$, a retraction being $r(x) = x / \|x\|$: it is also a deformation retract, as $\Phi(x,t) = (1 - t + t/\|x\|) \cdot x$ is a homotopy from the identity of $\mathbb{R}^{n+1} \setminus \{0\}$ to $r$.

Theorem 2. If $S$ is a retract of $X$ then $\pi_1(S)$ is isomorphic to a subgroup of $\pi_1(X)$. If $S$ is a deformation retract of $X$ then $\pi_1(S)$ is isomorphic to $\pi_1(X)$.

It follows from Theorems 1 and 2 that $S^1$ is not a retract of $D^2$: from which, in turn, follows

Brouwer’s fixed point theorem in dimension 2. Every continuous function from $D^2$ to itself has a fixed point.

Proof: If $f : D^2 \to D^2$ has no fixed points, then the half-line from $f(x)$ through $x$ is well defined for every $x \in D^2$. Let $g(x)$ be the intersection of such half line with $S^1$: then $g$ is a continuous function from $D^2$ to $S^1$ such that $g(x) = x$ for every $x \in S^1$, i.e., a retraction of $D^2$ onto $S^1$—which is impossible. $\Box$

There are some ways to compute the fundamental group of a space, given the fundamental groups of other spaces. For example, $\pi_1(X \times Y, (x_0, y_0))$ is isomorphic to the direct product of $\pi_1(X, x_0)$ and $\pi_1(Y, y_0)$, because the loops in $X \times Y$ based on $(x_0, y_0)$ are precisely the products of the loops in $X$ based on $x_0$ and the loops in $Y$ based on $y_0$. Another important tool is provided by

van Kampen’s theorem. Let $X_1$ and $X_2$ be pathwise connected spaces such that $X_1 \cap X_2$ is pathwise connected and nonempty. Let $x_0 \in X_1 \cap X_2$. Then $\pi_1(X_1 \cup X_2, x_0)$ is the pushout of the diagram $\pi_1(X_1, x_0) \longleftarrow \pi_1(X_1 \cap X_2, x_0) \longrightarrow \pi_1(X_2, x_0)$, where the arrows are induced by the inclusions.

As an application of van Kampen’s theorem, suppose $X_1$ and $X_2$ are copies of $S^1$, and that $X_1 \cap X_2 = \{x_0\}$ is a single point. Then $\pi_1(X_1 \cap X_2, x_0)$ is free on two generators, because $\mathbb{F}_2$ is the pushout of $\mathbb{Z} \longleftarrow 0 \longrightarrow \mathbb{Z}$, where $0$ is the trivial group that only contains the identity element. More in general, the pushout of $\mathbb{F}_m \longleftarrow 0 \longrightarrow \mathbb{F}_n$ is $\mathbb{F}_{m+n}$ for every two positive integers $m, n$: thus, the fundamental group of a bouquet of $n$ circlesi.e., $n$ copies of $S^1$, joined at a single point $x_0$—is free on $n$ generators.

As a final note, consider a continuous function $\phi : X \to Y$ between two pathwise connected spaces. If $p$ is a loop in $X$ with base point $x_0$, then $\phi \circ p$ is a loop in $Y$ with base point $\phi(x_0)$: it is also easy to check that $\phi (p \,;\, q) = (\phi \circ p) \,;\, (\phi \circ q)$ and that $\phi \circ c_{x_0} = c_{\phi(x_0)}$. From this follows that the function $\phi_\ast : \pi_1(X) \to \pi_1(Y)$ defined by $\phi_\ast([p]) = [\phi \circ p]$ is a group homomorphism: therefore, the application that maps every $\phi$ into the corresponding $\phi_\ast$, is the component on the arrows of a functor from the category $\mathbf{PCTop}$ of pointed pathwise connected topological spaces with basepoint-preserving continuous function to the category $\mathbf{Grp}$ of groups with group homomorphisms, whose component on objects maps every pathwise connected space to its fundamental group.

# Announcing: WAI 2.0 and http-client

I'm happy to announce the release of the Web Application Interface (WAI) 2.0, as well as the first non-experimental release of http-client (version 0.2). The lion's share of the work on the new WAI and Warp was done by Kazu, and he's planning on a separate write-up on some of the changes he's made.

On the http-client side, this release includes OpenSSL support both via the pure-Haskell tls package and the OpenSSL library, as well as a new version of http-conduit (2.0) built on top of http-client. The API of http-conduit remains mostly unchanged. A big thank you to Joseph Abrahamson for providing very helpful feedback on the API design.

My previous blog post discussed the motivations behind these releases, so I won't go into those details here. At a high level, here are some of the motivations for these changes:

• WAI 2.0
• Improving performance of Warp
• Scaling on multicore if complied with coming GHC 7.8
• Cleaning up API of Warp and enriching documentations
• http-client
• Shared infrastructure for multiple high-level HTTP client libraries.
• Choice of SSL backend.
• Lower dependency requirements for simple HTTP queries.
• Easier to test out new high-level APIs.

The only other thing I want to touch on is the affect on the rest of the package ecosystem triggered by these releases.

Together with this release, I'm releasing new versions of all of the WAI and Yesod package sets, as well as http-reverse-proxy and keter. Kazu is releasing new versions of fast-logger and wai-logger as well. I highly recommend either upgrading all of these packages or sticking to the old versions entirely; mixing-and-matching may work, but has not been highly tested. (As usual, Stackage, FP Haskell Center and yesod-platform will continue to provide sets of compatible packages.)

I've taken a lot of care in the new release of Yesod to keep backwards compatibility, so unless you were depending specifically on WAI libraries, your old code should continue building without issue. That said, if you run into any upgrade issues, please bring them up on the mailing list so that, minimally, others can learn from the experience, and possibly the issues can be addressed in the libraries themselves.

While I'm very excited with this release, and think it pushes the library ecosystem in the right direction, I'm even happier that Yesod is continuing to maintain a high level of API stability.

# A new version of the GPU language Accelerate

We released version 0.14 of Accelerate, the embedded high-level language for general-purpose GPU programming. In addition to new constructs for iterative algorithms and improved code generation, this version adds support for the latest CUDA release (5.5) and for OS X Mavericks.

Call for help: Accelerate currently works (out of the box) on OS X and Linux. It should also work on Windows, but we need some community help to fix the build process on Windows — for details, please see the recent issue on GitHub.

# Natural numbers with addition form a monoid

The other day I gave a very short introductory talk on formalising mathematics in Agda to a highly mathematically literate but otherwise quite general audience. I presented this example.

In Agda, proofs and programs are the same thing, types are sets are propositions. These sets contain values of that type, or equally they contain proofs of that proposition.
Agda is a very expressive language so very little is built in and most things can be defined in a library. We will define some of those things now to see how they work.

module talk where


Natural numbers (0,1,2,3,…) can be seen as being defined by either zero or the successor (+1) suc of another natural number.

data Nat : Set where
zero : Nat
suc  : Nat → Nat


Addition _+_ takes two natural numbers and returns another. It can be defined by recursion (induction) on the first argument. If it is zero we return n and if it is successor we can make a recursive call to (m + n) and apply the successor to the result. The recursive call (inductive hypothesis) is valid as m is structurally smaller than suc m.

_+_ : Nat → Nat → Nat
zero  + n = n
suc m + n = suc (m + n)


Having defined one function we now want to prove something about it. One can think of this as an exercise in formal verification. I want to prove that _+_ satisfies some equations. There is an equals sign in the definition above but this is not the one I should use to state my equations. The = is the definitional equality symbol and denotes equations that the computer can see are true. I want to prove some things that the computer isn’t able to see for itself.

The notion we need is propositional equality which is a relation we will define now. In another language one might expect a relation to have type like Nat → Nat → Bool which might return true if the natural numbers were equal. In Agda we would use Set instead of Bool and the set would be inhabited if the relation holds. We can define propositional equality _≅_ once and for all for any set A which we write as an implicit parameter {A : Set}. Remarkably this set has only one canonical inhabitant refl which is inhabits the type of equations between really identical (definitionally equal) values. This may seem strange but one can prove (derive) the other properties, such as symmetry and transitivity.


-- propositional equality
data _≅_ {A : Set} : A → A → Set where
refl : ∀{a} → a ≅ a


We could prove useful lemmas such as symmetry now but for this post I will only need cong – that functions preserve equality. Given any function f : A → B and two equal elements of A the function should return equal results.
The lemma is very easy to prove as when we pattern match on the proof that the elements are equal then the only possible pattern is refl which forces a and a' to be equal, replacing, say, a' with a which reduces our task to showing that f a ≅ f a. This is easily proved using refl.

-- lemma: functions preserve equality
cong : {A B : Set} → (f : A → B) → {a a' : A} → a ≅ a' → f a ≅ f a'
cong f refl = refl


The lemma is very useful for proving equations that have the same function (such as suc) on both sides. We can use it to reduce our task to proving that what is underneath the function on both sides is equal.

One can define an algebraic structure like a monoid as a record (named tuple/struct/etc.) in Agda. It contains fields for the data and also for the laws. A monoid has a carrier set M a distinguished unit element e and a binary operation op. The three laws state that op has e as its left and right unit and it is associative. The monoid record Mon lives in Set1 as it contains a Set(0).

-- Monoid
record Mon : Set1 where
-- data
field M     : Set
e     : M
op    : M → M → M
-- laws
lunit : ∀ m → op e m ≅ m
runit : ∀ m → op m e ≅ m
assoc : ∀ m n o → op (op m n) o ≅ op m (op n o)


The goal is to define a monoid where M = Nat, e = zero and op = _+_. Let’s go ahead and prove the laws. The first law which would be (zero + n) ≅ n doesn’t require proof as it holds definitionally (it is the first line of the definition of _+_). The second law does require proof as here the _+_ doesn’t compute as the first argument is the variable n.

+runit : ∀ n → (n + zero) ≅ n
+runit zero    = refl
+runit (suc n) = cong suc (+runit n)


We prove the second law +runit by induction on n. When proving a property of a program as we are doing here it’s a good idea to follow the same pattern in the proof as in the program. Here _+_ is defined by recursion on its first argument so it makes sense to carry out the proof by induction on the first argument too. This makes things compute nicely. The first case is zero + zero ≅ zero which Agda computes to zero ≅ zero by applying the definition of _+_. This is easily proved by refl. The second case computes to suc (n + zero) ≅ suc n. First we observe that there is a function suc on both sides so we type cong suc ?. This reduces our problem to proving that n + zero ≅ n which follows from the inductive hypothesis +runit n

+assoc : ∀ m n o → ((m + n) + o) ≅ (m + (n + o))
+assoc zero    n o = refl
+assoc (suc m) n o = cong suc (+assoc m n o)


The proof for associativity proceeds analogously. We pattern match on the first argument m which gives two cases. The first case computes to n + o ≅ n + o and the second computes to suc ((m + n) + o) ≅ suc (m + (n + 0)). As before the first case follows by reflexivity and the second case by congruence of suc and inductive hypothesis.

Having done all the hard work we can now define a monoid for Nat, _+_, and zero:

-- natural numbers with addition form a monoid
NatMon : Mon
NatMon = record {
M     = Nat;
e     = zero;
op    = _+_;
lunit = λ _ → refl; -- this one doesn't require proof
runit = +runit;
assoc = +assoc}


# Version 0.10.3 out - Hiera support, and a score of incremental updates

I did not announce the 0.10.2 version, which was mainly a bug fixing release and the first time I bothered to fix the dependencies of the language-puppet package. This version is full of new stuff however, you can get it on hackage or as binaries.

## Hiera support, with facts manipulation utilities

This is the main attraction to this release. I did not use Hiera beforehand, so the development only started because of a GitHub issue. It turns out that this is really a great feature ! Most Hiera features should be supported (actually, I think all of them are, but I might have missed something). Facts manipulation is also easier now.

You can now have that kind of sessions :

<figure class="code"><figcaption>session.sh </figcaption>
 1 2 3 4 5 6  $ssh remote.node 'facter --yaml' > /tmp/facts.yaml # Retrieve facts from a remote host.$ vi /tmp/facts.yaml # Edit the facts so as to test variations. $puppetresources -p ~/gits/puppet -o remote.node \ --yaml ~/gits/puppet/hiera.yaml \ # Load the given yaml file. --facts-override /tmp/facts.yaml # Override collected facts with # those from the remote host.  </figure> ## New stateWriter monad I previously blogged about a replacement for the standard RWS monad, wrote a package, and moved to it. Code speed increased by 10% on a single run, but I expect subsequent runs to go twice as fast. I will need to benchmark this of course, but this is currently not a priority. ## JSON output, for use with puppet apply This is a funny feature. There is support for this in puppetresources, with the --json flag. Once this is generated, you can copy the file onto a machine and apply it with puppet apply --catalog. This also means that it is now trivial to write a replacement for the puppetmaster, a feature that was experimental before the full rewrite, but that might be fully supported (with debian packages) soon. ## Full scope tracking for all resources Now all resources are stored with the full ‘scope’ (think of it as some kind of stack trace, including the originating host for exported resources), which makes debugging much easier. A typical use case is resource duplication in a define, such as the following case situation : <figure class="code"><figcaption>manifests/site.pp </figcaption>  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15  define filewrapper($filename,$owner) { file {$filename: ensure => present, owner => $owner; } } class class1 { filewrapper { 'fw1': filename => '/tmp/xxx', owner => 'class1' ; } } class class2 { filewrapper { 'fw2': filename => '/tmp/xxx', owner => 'class2' ; } } node 'a.b.c' { include class1,class2 }  </figure> The Puppet error messages is : <figure class="code">  1 2 3  Error: Duplicate declaration: File[/tmp/xxx] is already declared in file \ /tmp/x/manifests/site.pp:3; cannot redeclare at /tmp/x/manifests/site.pp:3 on node a.b.c </figure> The language-puppet error message is : This error message is way more informative : • The resources are fully displayed, including all attribute values, which might shed some light on the problem. • The originating host (a.b.c) is displayed. This is crucial for debugging exported resource collisions. The standard output also shows ## Various stuff There are also a few minor changes in this release : • Support for the $classes variable in templates : this one is just as limited as the official one, as its value is parse-order dependent.
• Support for @instance variables in templates.
• Support for scope['key'] syntax in templates.
• The haddocks were updated for some modules.
• New is_virtual fact, and stdlib functions (flatten, str2bool, validate_absolute_path).

# Media attention

I have had a lot of media attention in the past few days due to a press release from my undergrad university, featuring me as alumnus of the month, and highlighting my role in Chordify. For posterity, here are the places I could find (all in Portuguese): Público, Jornal de Notícias, Correio da Manhã, TSF, TMN, Jornal i, Sol, Minho University's Facebook page, canalsuperior.pt, maissuperior.com, boasnoticias.pt, xpressingmusic.com (interview), 4gnews.pt, pavablog.com.

I've also been interviewed on the Antena 1 radio; you can listen to the interview here (I show up around minute 34).

# Idris – first impressions

During last few weeks I got a pretty good grip of basics of dependent types and Agda. Programming in Agda is fun but nevertheless I decided to experiment with other dependently-typed programming languages. Back in March I attempted to learn Idris from one of Edwin Brady’s presentations, but having no knowledge of dependent types I had to give up after about 30 minutes of first video. Now that I know basics of Agda I decided to give Idris another try. This time it was much simpler. Reading official Idris tutorial and doing some experiments took me about 5 hours. Below are some of my first impressions (I’m underlining that phrase to make it clear that some of these opinions may change in the future).

• Standard library in Idris feels friendlier than in Agda. It is bundled with the compiler and doesn’t require additional installation (unlike Agda’s). Prelude is by default imported into every module so programmer can use Nat, Bool, lists and so on out of the box. There are also some similarities with Haskell prelude. All in all, standard library in Idris is much less daunting than in Agda.
• Idris is really a programming language, i.e. one can write programs that actually run. Agda feels more like a proof assistant. According to one of the tutorials I’ve read you can run programs written in Agda, but it is not as straightforward as in Idris. I personally I haven’t run a single Agda program – I’m perfectly happy that they typecheck.
• Compared to Agda Idris has limited Unicode support. I’ve never felt the need to use Unicode in my source code until I started programming in Agda – after just a few weeks it feels like an essential thing. I think Idris allows Unicode only in identifiers, but doesn’t allow it in operators, which means I have to use awkward operators like <!= instead of ≤. I recall seeing some discussions about Unicode at #idris channel, so I wouldn’t be surprised if that changed soon.
• One of the biggest differences between Agda and Idris is approach to proofs. In Agda a proof is part of function’s code. Programmer is assisted by agda-mode (in Emacs) which guides code writing according to types (a common feature in dependently typed languages). Over the past few weeks I’ve come to appreciate convenience offered by agda-mode: automatic generation of case analysis, refinement of holes, autocompletion of code based on types to name a few. Idris-mode for Emacs doesn’t support interactive development. One has to use interactive proof mode provided in Idris REPL – this means switching between terminal windows, which might be a bit inconvenient. Proofs in Idris can be separated from code they are proving. This allows to write code that is much clearer. In proof mode one can use tactics, which are methods used to convert proof terms in order to reach a certain goal. Generated proof can then be added to source file. It is hard for me to decide which method I prefer. The final result is more readable in Idris, but using tactics is not always straightforward. I also like interactive development offered by Agda. Tough choice.
• Both languages are poorly documented. That said, Idris has much less documentation (mostly papers and presentations by Edwin Brady). I expect this to change, as the Idris community seems to be growing (slowly, but still).
• One thing I didn’t like in Idris are visibility qualifiers used to define how functions and datatypes are exported from the module. There are three available: public (export name and implementation), private (don’t export anything) and abstract (export type signature, but don’t export implementation). This is slightly different than in Haskell – I think that difference comes from properties of dependent types. What I didn’t like are rules and syntax used to define export visibility. Visibility for a function or datatype can be defined by annotating it with one of three keywords: public, private, abstract. If all definitions in a module are not annotated then everything is public. But if there is at least one annotation everything without annotation is private. Unless you changed the default visibility, in which case everything without annotation can be abstract! In other words if you see a definition without annotation it means that: a) it can be public, but you have to check if all other definitions are without annotations; b) private, if at least one other definition is annotated – again, you have to check whole file; c) but it can be abstract as well – you need to check the file to see if the default export level was set. The only way to be sure – except for nuking the entire site from orbit – is annotating every function with an export modifier, but that feels very verbose. I prefer Haskell’s syntax for defining what is exported and what is not and I think it could be easily extended to support three possible levels of export visibility.
• Unlike Agda, Idris has case expressions. They have some limitations however. I’m not sure whether these limitations come from properties of dependently typed languages or are they just simplifications in Idris implementation that could theoretically be avoided.
• Idris has lots of other cool features. Idiom brackets are a syntactic sugar for applicative style: you can write [| f a b c |] instead of pure f <$> a <$> b <> c. Idris has syntax extensions designed to support development of EDSLs. Moreover tuples are available out of the box, there’s do-notation for monadic expressions, there are list comprehensions and Foreign Function Interface. • One feature that I’m a bit sceptical about are “implicit conversions” that allow to define implicit casts between arguments and write expressions like "Number " ++ x, where x is an Int. I can imagine this could be a misfeature. • Idris has “using” notation that allows to introduce definitions that are visible throughout a block of code. Most common use seems to be in definition of data types. Agda does it better IMO by introducing type parameters into scope of data constructors. • Idris seems to be developed more actively. The repos are stored on github so anyone can easily contribute. This is not the case with Agda, which has Darcs repos and the whole process feels closed (in a sense “not opened to community”). On the other hand mailing list for Idris is set up on Google lists, which is a blocker for me. All in all programming in Idris is also fun although it is slightly different kind of fun than in Agda. I must say that I miss two features from Agda: interactive development in Emacs and Unicode support. Given how actively Idris is developed I imagine it could soon become more popular than Agda. Perhaps these “missing” features will also be added one day? As an exercise I rewrote code from “Why dependent types matter” paper from Agda (see my previous post) to Idris. Code is available in on github. ## November 30, 2013 ### Philip Wadler # Fame, almost My sister spotted my likeness on a loaf of organic bread. ## November 29, 2013 ### Well-Typed.Com # Overloaded record fields for GHC TL;DR: GHC HEAD (but not GHC 7.8) will soon support OverloadedRecordFields, an extension to permit datatypes to reuse field labels and even turn them into lenses. ### Introduction The Haskell records system is a frequent source of frustration to Haskell programmers working on large projects. On the face of it, it is simple: a datatype such as data Person = Person { id :: Int, name :: String }  gives rise to top-level projection functions: id :: Person -> Int name :: Person -> String  However, this means that using multiple records with the same field label leads to name clashes. Workarounds are possible, such as declaring one datatype per module (so the module prefix can be used to disambiguate) or adding a per-datatype prefix to all the field labels, but they are somewhat clumsy. Over the summer before I started working for Well-Typed, I implemented a new GHC extension, OverloadedRecordFields, that allows the same name to be used for multiple fields. This is not a whole new records system for Haskell, and does not include everything one might want (such as anonymous records), but it is a small step forward in a notorious quagmire. Proposals for better systems are welcome, but while it is easy to propose a more powerful design in isolation, integrating it with other extensions and syntax in GHC is another matter! Unfortunately, the extension will not be ready for GHC 7.8, to allow time for the design to settle and the codebase changes to mature. However, it should land in HEAD soon after the 7.8 release is cut, so the adventurous are encouraged to build GHC and try it out. Feedback from users will let us polish the design before it is finally released in 7.10. ### Record projection The essential point of the Haskell records system is unchanged by this extension: record projections are still functions, except now they are polymorphic in the choice of datatype. Instead of name :: Person -> String  we have name :: (r { name :: t }) => r -> t  where r { name :: t } is a constraint meaning that the type r has a field name of type t. The typechecker will automatically solve such constraints, based on the datatypes in scope. This allows module abstraction boundaries to be maintained just as at present. If a module does not export a field, clients of that module cannot use the OverloadedRecordFields machinery to access it anyway. For example, the following code will be valid: data Company { name :: String, employees :: [Person] }  companyNames :: Company -> [String] companyNames c = name c : map name (employees c)  Notice that name can be used at two different record types, and the typechecker will figure out which type is meant, just like any other polymorphic function. Similarly, functions can be polymorphic in the record type: nameDefined :: (r { name :: [a] }) => r -> Bool nameDefined = not . null . name  ### Record update The traditional Haskell record update syntax is very powerful: an expression like e { id = 3, name = "Me" }  can update (and potentially change the types of) multiple fields. The OverloadedRecordFields extension does not attempt to generalise this syntax, so a record update expression will always refer to a single datatype, and if the field names do not determine the type uniquely, a type signature may be required. With the Person and Company types as defined above, the previous expression is accepted but the definition f x = x { name = "Me" }  is ambiguous, so a type signature must be given to f (or a local signature given on x or the update expression). On the other hand, type-changing update of single fields is possible in some circumstances using lenses, discussed below. ### Technical details The constraint r { name :: t } introduced above is in fact syntactic sugar for a pair of constraints (Has r "name", t ~ FldTy r "name")  where the Has r n class constraint means that the type r has a field named n, and the FldTy r n type family gives the type of the field. They use type-level strings (with the DataKinds extension) to name fields. The type family is used so that the field type is a function of the data type and field name, which improves type inference. Unlike normal classes and type families, the instances of Has and FldTy are generated automatically by the compiler, and cannot be given by the user. Moreover, these instances are only in scope if the corresponding record projection is in scope. For example, if the name field of Person is in scope, the extension works as if the following declarations had been given: instance Has Person "name" type instance FldTy Person "name" = String  ### Lenses An excellent way to deal with nested data structures, and to alleviate the shortcomings of the Haskell record system, is to use one of the many Haskell lens libraries, such as lens, data-lens, fclabels and so on. These pair together a "getter" and a "setter" for a field (or more generally, a substructure of a larger type), allowing them to be composed as a single unit. Many of these libraries use Template Haskell to generate lenses corresponding to record fields. A key design question for OverloadedRecordFields was how to fit neatly into the lenses ecosystem. The solution is to generalise the type of field functions still further. Instead of the desugared name :: (Has r "name") => r -> FldTy r "name"  we in fact generate name :: (Accessor p r "name") => p r (FldTy r "name")  where Accessor is a normal typeclass (no instances are generated automatically). In particular, it has an instance instance Has r n => Accessor (->) r n  so when p = (->), the new field type specialises to the previous type. Thus a field can still be used as a (record-polymorphic) function. Moreover, each lens library can give its own instance of Accessor, allowing fields to be used directly as lenses of that type. (Unfortunately this doesn't quite work for van Laarhoven lenses, as in the lens library, because they do not have the shape p r (FldTy r n). A wrapper datatype can be used to define a combinator that converts fields into van Laarhoven lenses.) ### Concluding remarks For more information, see the detailed discussion of the design and implementation on the GHC wiki. A prototype implementation, that works in GHC 7.6.3, shows how the classes are defined and gives examples of the instances that will be generated by GHC. Keep an eye out for the extension to land in GHC HEAD later this year, and please try it out and give your feedback! I'm very grateful to Simon Peyton Jones for acting as mentor for this project and as my guide to the GHC codebase. Edward Kmett threw several wrenches in the works, and the final design owes a great deal to his careful scrutiny and helpful suggestions. Anthony Clayden also gave many helpful comments on the design. My thanks go also to the many denizens of the ghc-devs and glasgow-haskell-users mailing lists who gave feedback at various stages in the development process. This work was supported by the Google Summer of Code programme. ## November 28, 2013 ### language-puppet # A faster RWS variant I recently saw comments stating that the Writer (and thus RWS) monad is slow and leaky. I have several programs and libraries built around the RWS monad, where I use the writer part only for logging (that is, snocing a single element at the end of the log), and already noticed in some cases that the profiler reported up to 80% of the time usage spent doing mappends. Here is a POC, with a pair of monads I called RSS (for Reader - State - State). They offer an API similar to that of RWS, the only difference being that the writer part is internally riding with the state. Tests with a (top secret) real program gave a two times speed increase. For language-puppet, it is a 10% speed increase for a single run, but most of the time is dominated by the initial parsing, so I expect the gain to be more substantial for other type of applications (complex tests come in mind). I wrote a benchmark, that compares the lazy and strict versions of the RSS and RWS monads, using several distinct monoids as the writer type. Here are the results on my machine : RSS.Lazy RSS.Strict RWS.Lazy RWS.Strict Seq 193 µs 194 µs 4.95 ms 457 µs [] 4.46 ms 4.47 ms 4.89 ms 538 µs Vector.Prim 764 µs 784 µs 47.72 ms 47.46 ms RevList 173 µs 175 µs 10.57 ms 5.60 ms IntSet 187 µs 184 µs 4.98 ms 472 µs DList 177 µs 176 µs 4.56 ms 302 µs • RevList is the best data structure for my use case (it is just a newtype around a regular list, with reverse applied when the list is extracted), which should not come as a surprise. What was surprising (to me) is how terrible it turns out to me under RWS … • The performance of Vector.Prim with the RWS monad is terrible. I suppose this is due to some optimization rule that could not be deduced in that case. I was however surprised at the good performance compared to lists in the RSS case. • The RSS version is much faster than the RWS one … except for lists ! • The strict version of the RSS monad doesn’t have a leaky writer (test here). I also added a tellElement helper function that would snoc an element to the writer (with suitable monoids) : tell tellElement Seq 193 µs 231 µs List 4.46 ms 4.45 ms RevList 173 µs 200 µs It seems that, to my surprise, there is a small price to pay for calling snoc instead of mappend. I will probably release this as a package so that I can build language-puppet against it. I will copy’n’paste all the haddocks from the mtl and transformers packages, and move the modules so as to mimick their hierarchy. If you have suggestions, especially concerning the package or module names, strictness properties of the new monads (a subject where I am at loss), please let me know. EDIT : Added benchmark results for DList ### JP Moresmau # Reactive-banana-SDL on Hackage! I've uploaded to Hackage the reactive-banana-sdl library. This library was developed by R. Kyle Murphy(orclev), and I've just added a couple of functions and some documentation in the code. Orclev didn't want to maintain it anymore so I've taken the maintainer role, but I'm certainly not a FRP expert (in fact I started to use the library to learn about FRP), so be gentle... I have a little typing game called TypeClass, which uses the library. You can have a look at the code, it may help you get started. There's not that much there, but hopefully this can be of use to some people, and let me know what goodies we could add! Happy Reactive SDL Hacking! ### Philip Wadler # Students support strike Will Hutton, writing in the Guardian, described the situation succinctly: The real wages of academics have fallen by 13% since 2008, one of the largest sustained wage cuts any profession has suffered since the Second World War. University administrators offer 1% and refuse to negotiate. Faculty struck on 31 October and a second strike is called for 3 December. As usual, our students put the case more eloquently than our union (the UCU, University and College Union). The leadership of EUSA (Edinburgh University Student Association) write: Tensions are high within the University, as the second day of strike action draws nearer and many continue to work to rule. At such times we may hear the oft-quoted: “For the sake of students, don’t go on strike!” We write on behalf of EUSA and our 32,000 members, to actively encourage you to take strike action. In the short term this will indeed affect our education, but the long term benefits are significantly vaster. It is critical that students and staff struggle collectively. Not only to ensure that the sector continues to attract the highest calibre of people, but also so staff are able to focus on the job – not worrying about the rocketing cost of living. Needless to say, colleagues at the start of their career are hardest hit, including the thousands of EUSA’s postgraduate members who help teach. The demands are reasonable, and the more effective the action now the sooner we can get back to the reason we’re all here – education. It is for this reason that EUSA’s academic reps, who aren’t exactly our most radical bunch, have voted overwhelmingly in favour of actively supporting the on-going strike action. This decision has been met with broad approval from across the student body. Dozens of our reps and countless other students were out on the picket lines at 7am the other week, and have vowed to be there next time.We are actively encouraging our members not to cross the picket lines and to study from home.Education continues to be progressively marketised, fees continue to rise, power continues to shift away from ordinary staff and into the hands of the overpaid in Old College. At times like these it is vital that the university acts as a community and reasserts its stake over the corporate body. We do this by working together, and recognising that our struggles are in common. So again, we implore you, on behalf of your students, not to undermine the strike. And hopefully we'll be seeing you on the picket lines! In solidarity,Hugh, Nadia, Alex and KirstyThe EUSA Sabbatical Team The photo above shows me picketing the Informatics Forum, surrounded by students during a previous strike in 2011. ## November 26, 2013 ### Philip Wadler # Amdahl's law for predicting the future of multicores considered harmful The following will be the subject of a talk at Edinburgh on Thursday. I won't attend, because it's Thanksgiving, but it looks interesting. #### Amdahl's law for predicting the future of multicores considered harmful B.H.H. Juurlink , C. H. Meenderinck, ACM SIGARCH Computer Architecture News, Volume 40 Issue 2, May 2012, Pages 1-9. Several recent works predict the future of multicore systems or identify scalability bottlenecks based on Amdahl's law. Amdahl's law implicitly assumes, however, that the problem size stays constant, but in most cases more cores are used to solve larger and more complex problems. There is a related law known as Gustafson's law which assumes that runtime, not the problem size, is constant. In other words, it is assumed that the runtime on p cores is the same as the runtime on 1 core and that the parallel part of an application scales linearly with the number of cores. We apply Gustafson's law to symmetric, asymmetric, and dynamic multicores and show that this leads to fundamentally different results than when Amdahl's law is applied. We also generalize Amdahl's and Gustafson's law and study how this quantitatively effects the dimensioning of future multicore systems. ### Holden Karau # I've written a book "Fast Data Processing with Spark" which covers Python, Scala, and Java I recently finished writting "Fast Data Processing with Spark". Apache Spark is a framework for writing fast, distributed programs. Fast Data Processing with Spark covers how to write distributed map reduce style programs with Spark. The book guides you through every step required to write effective distributed programs from setting up your cluster and interactively exploring the API, to deploying your job to the cluster. Personally, while the fast nature of Spark is not to be understated, I really enjoy its functional style APIs and find it a lovely environment to code in. ### wren ng thornton # The Announcement There are three parts to this post. Everyone should read the first two sections, especially the second section. Haskellers should also be sure to read the third section. ### The Announcement If you don't yet know: I'm transgender. My sense of gender and self have never aligned with my physical appearance, and I’ve spent most of my life dealing with this fact. This is not an acquired condition nor a recent change; it is an intrinsic and life-long part of who I am. I began the process of transitioning half a year ago and, over the next six months or so, I will complete the transition to living as a woman full-time. Many of my followers are already familiar with transgender issues, but since this is a public announcement I assume many of you are not. There are numerous resources online for learning more, but I find the PFLAG pamphlet to be a particularly good place to start. If you still have any questions after reading that, I can provide additional resources and am willing to answer questions. ### How to respond This is going to depend on how you know me. If we interact predominantly online This includes everyone in the Haskell community (both online and academically), as well as everyone from Reddit, Twitter, etc. Henceforth, please use feminine pronouns (she/her/hers) exclusively when referring to me. I understand this will take some getting used to, but it will soon become second nature. If we interact predominantly in person I'd prefer you use feminine pronouns (she/her/hers) when referring to me, especially when online and when mentioning me anonymously. But, for the time being, masculine pronouns (he/him/his) are still acceptable. Sometime in the spring I will send another announcement around letting you know when "T-day" is. After that date, I will be presenting as female full-time and will no longer tolerate masculine pronouns. ### PSA for Haskellers I shouldn't have to say this, but since there were some complaints about the "homosexual propaganda" in my recent posts, may I remind my readers of The Planet Haskell policies regarding political and religious content. I rarely post political content, but am well within the guidelines in doing so. The stated mission of Planet Haskell is to "show what is happening in the community, what people are thinking about or doing". I am an active and well-known member of the Haskell community, and the violence endured by trans people is something I've been thinking a lot about lately. When Chung-chieh Shan gave the 2013 Haskell Symposium program chair report, he made a specific point of highlighting the effects of sexism, racism, homophobia, and transphobia in driving people out of the Haskell community. Therefore, I think it is fair to say that these issues are pertinent, above and beyond my personal involvement with them. That said, I do not intend to discuss trans issues at length on this blog. Nevertheless, on occasion, these issues will come up because I refuse to live in silence and shame for who I am. comments ## November 25, 2013 ### Douglas M. Auclair (geophf) # Abstract Nonsense: chapter 2s, Stately Statefulness This is about continuations, really. This is the continuing series of articles of continuing on continuations. Really. And State. Oh, and State. Ugh. So, the State monad is a monad on a 'stateful-like' function. Still with me? So I have this beautiful implementation in Java — No, really. I do. — and, so ... how do I test that it works? Well, you know I have to use it. I have this State monad transformer example, but I want something really, really simple, so: > inc :: State Int Int > inc = do x <- get > put (x + 1) > return x But the thing is, we don't have the do-notation in Java, so we have to unwind this as a set of monadic binds, so: > inc1 = get >>= \x -> put (x + 1) >> return x Cool! But we don't have (>>) ('then') in Java, we have (>>=) ('bind'), so one more transformation: > inc2 = get >>= \x -> put (x + 1) >>= \_ -> return x AND we're done! Now, an example: > runState (inc2 >> inc2 >> inc2) 2 and we get: (4, 5) as expected. Shoot, do we have to write (>>) ('then') as a convenience function? We still have to take the argument and declare the type of the function, but the function itself simplifies. Let's see the Java implementation first as: > runState (inc2 >>= \_ -> inc2 >>= \_ -> inc2) 2 (still works...) (Holy Chajoles, Batmans, my Java-State-thingie just frikken worked!)1 So, State is a straight transliteration from Haskell to Java. It's a monad of a function: public class State<STATE, DATUM> implements Monad<State, DATUM> { protected State(Arrow<STATE, Tuple<DATUM, STATE>> f) { state = f; } // get and put are straight from the haskell standard library: // get = State \x -> (x, x)
// so is x <- get the same as get >>= State $\state -> (x, state1) public static <STATE1> State<STATE1, STATE1> get(STATE1 basis) { return new State<STATE1, STATE1>(Tuple.dupF(basis)); } // put :: a -> State a () public static <STATE1> State<STATE1, Unit> put(final STATE1 s) { Arrow<STATE1, Tuple<Unit, STATE1>> putter = new Function<STATE1, Tuple<Unit, STATE1>>() { public Tuple<Unit, STATE1> apply(STATE1 ignored) throws Failure { return Tuple.pair(Unit.instance, s); } }; return new State<STATE1, Unit>(putter); } // eval/exec/run State functions are yawn-worthy: public DATUM evalState(STATE s) throws Failure { return state.apply(s).fst(); } public STATE execState(STATE s) throws Failure { return state.apply(s).snd(); } public Tuple<DATUM, STATE> runState(STATE s) throws Failure { return state.apply(s); } // now here comes the fun stuff. Functor: // Functor override -------------------- /* instance Functor (State s) where fmap f m = State$ \s -> let
(a, s') = runState m s
in (f a, s')
*/
public <A, B> Arrow<Functor<A>, Functor<B>> fmap(final Arrow<A, B> fn) {
return new Function<Functor<A>, Functor<B>>() {
public Functor<B> apply(final Functor<A> functor) throws Failure {
// TODO: implement -- uh, fmap and >>= are the same for state
// ... almost
Arrow<STATE, Tuple<B, STATE>> stater =
new Function<STATE, Tuple<B, STATE>>() {
public Tuple<B, STATE> apply(STATE s) throws Failure {
State<STATE, A> m = (State<STATE, A>)functor;
final Tuple<A, STATE> tup = m.runState(s);
return Tuple.pair(fn.apply(tup.fst()), tup.snd());
}
};
return new State<STATE, B>(stater);
}
};
}

// and monad bind and return and then:

public <X> Monad<State, X> unit(final X d) {
Arrow<STATE, Tuple<X, STATE>> lifter =
new Function<STATE, Tuple<X, STATE>>() {
public Tuple<X, STATE> apply(STATE st) throws Failure {
return Tuple.pair(d, st);
}
};
return new State<STATE, X>(lifter);
}

// state >>= f = State $\st -> // let (x, st') = runState state st // in runState (f x) st' public <B> Monad<State, B> bind(final Arrow<DATUM, Monad<State, B>> fn) throws Failure { final State<STATE, DATUM> status = this; Arrow<STATE, Tuple<B, STATE>> binder = new Function<STATE, Tuple<B, STATE>>() { public Tuple<B, STATE> apply(STATE st) throws Failure { Tuple<DATUM, STATE> tup = status.runState(st); return ((State<STATE, B>)fn.apply(tup.fst())) .runState(tup.snd()); } }; return new State<STATE, B>(binder); } // join() ??? public Monad<State, ?> join() throws Failure { throw new Failure("Okay, why are you joining on state?"); } public <A, B> Arrow<A, Monad<State, B>> liftM(Arrow<A, B> fn) { return Monad.Binder.lifter(this, fn); } public <B> Monad<State, B> liftThenBind(Arrow<DATUM, B> fn) throws Failure { return bind(liftM(fn)); } public <T> Monad<State, T> then(final Monad<State, T> m) throws Failure { return bind(new Function<DATUM, Monad<State, T>>() { public Monad<State, T> apply(DATUM ignored) throws Failure { return m; } }); } // just capture types here ----- private final Arrow<STATE, Tuple<DATUM, STATE>> state; } ... and with that we have (functional) state! Now, to test it with our inc2 function(al): // So we need to test the hell out of this ... public static void main(String args[]) throws Failure { // TODO: test me! well a simple test is monadic succ, so ... // inc2 :: State Int Int // inc2 = get >>= \x -> put (x + 1) >> return x Monad<State, Integer> inc2 = State.get(0).bind(new Function<Integer, Monad<State, Integer>>() { public Monad<State, Integer> apply(final Integer x) throws Failure { return State.put(x + 1) .then(new State<Integer, Integer>( Tuple.mkWithFst(x, 0))); } }); System.out.println("runState (inc2 >> inc2 >> inc2) 2 is (4,5): " + ((State<Integer, Integer>) (inc2.then(inc2).then(inc2))).runState(2)); System.out.println("runState (inc2 >> inc2 >> inc2) 17 is (19, 20): " + ((State<Integer, Integer>) (inc2.then(inc2).then(inc2))).runState(17)); } And the results output are as expected. YAY! You see that I did implement the (>>) ('then') operator in Java. It was very easy to do: just a bind-ignore-return. Now. Now. Now. (Now, I'm going to stop saying 'now.' Really.) Now that we have state, we can use it along with continuations to create a threading or coroutining model of computation without having to use the (dreaded/revered) call/cc.2 Tune in next week, Ladies and Gentlemen, for another thrilling chapter on our continuing saga of continuations! ----- (1) 'Thingie' is a technical term ... 'frikken' is ... not so technical. (2) "Oh, you code with call/cc?" the awed neophyte whispers reverentially, then bows down and worships.3 (3) Thus spake the Zen Master: "Why are you in awe of those who use call/cc and similar parlor tricks? The real miracle is that when I wish to sum, I use (+), and when I wish to 'continue' I hit the carriage return." Then he concluded his lesson with the cryptic utterance: "One bowl of rice" and departed their presence for Nirvana (he likes Soundgarden, as well). Chapter Postlude The problem that this chapter addresses is what I've had in my java categories library for some time now. I've had this thing called 'MutableBoxedType<T>' with a very easy implementation: @Effectual public class MutableBoxedType<T> implements Functor<T>, Copointed<T> { public MutableBoxedType(T val) { this.value = val; } @Effectual public void put(T newval) { this.value = newval; } @Effectual public final Effect<T> putF = new Effect<T>() { @Override protected void execute(T obj) throws Failure { put(obj); } }; // Functor implementation ----------------------------------- public <X, Y> Arrow<Functor<X>, Functor<Y>> fmap(final Arrow<X, Y> fn) { return new Function<Functor<X>, Functor<Y>>() { public Functor<Y> apply(Functor<X> obj) throws Failure { MutableBoxedType<X> mbt = (MutableBoxedType<X>)obj; return new MutableBoxedType<Y>(fn.apply(mbt.extract())); } }; } // Copointed implementation -------------------------------- public T extract() { return value; } private T value; } Simple enough, we just put a box around a mutable type. The problem is all those @Effectual annotations and the very mutability-ness of the type itself. Once we make the guts of an instance mutable we lose all provability as 'equals' and '==' now have been reduced to meaningless, vis: MutableBoxedType<Integer> five = new MutableBoxedType<Integer>(5); oldFive= five; five.put(12); five.equals(oldFive); ... is true Yes, five and oldFive are equal, but their shapes have changed during the course of the program and we need deep copy semantics here to sniff that out. The (functional) state monad doesn't 'change' ... not via mutation, anyway. What happens instead is that when the state gets updated, we have a new state. It just makes sense (provably so) and gives us our beloved 'mutation' (by all appearances) even in the functional domain. ### FP Complete # Call For Entries ## FP Haskell Competition Call for Entries We just launched our new Free Community edition of FP Haskell Center™ making it easier than ever to participate in our FP Haskell Competition. Each month we are giving away up to$2,500 in cash for Haskell projects. This is an excellent opportunity for Haskell developers to make some serious money and get some recognition for their work.

Why Submit?

What’s in it for you…besides the cash ($1,000 for first prize and$500 for up to three runners up), this is a prefect platform to publish and get feedback on your work. Why not share your knowledge and get paid for the work you’re already doing? We want to give you money. Submit your personal projects, work with teams to develop other projects…there is no limit to how many submissions you can turn in.

Take a look at our recent winners and see if your projects compare.

# The shittiest project I ever worked on

Sometimes in job interviews I've been asked to describe a project I worked on that failed. This is the one I always think of first.

In 1995 I quit my regular job as senior web engineer for Time-Warner and became a consultant developing interactive content for the World-Wide Web, which was still a pretty new thing at the time. Time-Warner taught me many things. One was that many large companies are not single organizations; they are much more like a bunch of related medium-sized companies that all share a building and a steam plant. (Another was that I didn't like being part of a large company.)

One of my early clients was Prudential, which is a large life insurance, real estate, and financial services conglomerate based in Newark, New Jersey—another fine example of a large company that actually turned out to be a bunch of medium-sized companies sharing a single building. I did a number of projects for them, one of which was to produce an online directory of Prudential-affiliated real estate brokers. I'm sure everyone is familiar with this sort of thing by now. The idea was that you would visit a form on their web site, put in your zip code or town name, and it would extract the nearby brokers from a database and present them to you on a web page, ordered by distance.

Similarly, they tweaked the output format of the program repeatedly over weeks: first the affiliates should be listed in distance order, but no, they should be listed alphabetically if they are in the same town and then after that the ones from other towns, grouped by town; no, the Prudential Preferred affiliates must be listed first regardless of distance, which necessitated a redelivery of the data which up until then hadn't distinguished between ordinary and Preferred affiliates; no wait, that doesn't make sense, it puts a far-off Preferred affiliate ahead of a nearby regular affiliate... again, this is something that many clients do, but I wasn't expecting it and it took a lot of time I hadn't budgeted for. Also these people had, I now know, an unusually bad case of it.

Anyway, we finally got it just right, and it had been approved by multiple layers of management and given a gold star by the Compliance Department, and my clients took it to the Prudential Real Estate people for a demonstration.

You may recall that Prudential is actually a bunch of medium-sized companies that share a building in Newark. The people I was working with were part of one of these medium-sized companies. The real estate business people were in a different company. The report I got about the demo was that the real estate people loved it, it was just what they wanted.

“But,” they said, “how do we collect the referral fees?”

Prudential Real Estate is a franchise operation. Prudential does not actually broker any real estate. Instead, a local franchisee pays a fee for the use of the name and logo and other services. One of the other services is that Prudential runs a national toll-free number; you can call this up and they will refer you to a nearby affiliate who will help you buy or sell real estate. And for each such referral, the affiliate pays Prudential a referral fee.

We had put together a real estate affiliate locator application which let you locate a nearby Prudential-affiliated franchisee and contact them directly, bypassing the referral and eliminating Prudential's opportunity to collect a referral fee.

So I was told to make one final change to the affiliate locator. It now worked like this: The user would enter their town or zip code; the application would consult the database and find the contact information for the nearby affiliates, it would order them in the special order dictated by the Compliance Department, and then it would display a web page with the addresses and phone numbers of the affiliates carefully suppressed. Instead, the name of each affiliate would be followed by the Prudential national toll-free number AND NOTHING ELSE. Even the names were suspect. For a while Prudential considered replacing each affiliate's name with a canned string, something like "Prudential Real Estate Affiliate", because what if the web user decided to look up the affiliate in the Yellow Pages and call them directly? It was eventually decided that the presence of the toll-free number directly underneath rendered this risk acceptably small, so the names stayed. But everything else was gone.

Prudential didn't need an affiliate locator application. They needed a static HTML page that told people to call the number. All the work I had put into importing the data, into formatting the output, into displaying the realtors in precisely the right order, had been a complete waste of time.

# PIGS in SPACE!!!

Today I was at the Pittsburgh Perl Workshop and I gave a talk because that's my favorite part of a workshop, getting to give a talk.

The talk is about Moonpig, the billing system that Rik Signes and I wrote in Perl. Actually it's about Moonpig as little as possible because I didn't think the audience would be interested in the details of the billing system. (They are very interesting, and someone who is actually interested in a billing system, or in a case study of a medium-sized software system, would enjoy a three-hour talk about the financial architecture of Moonpig. But I wasn't sure that person would be at the workshop.) Instead the talk is mostly about the interesting technical underpinnings of Moonpig. Here's the description:

Moonpig is an innovative billing and accounting system that Rik Signes and I worked on between 2010 and 2012, totaling about twenty thousand lines of Perl. It was a success, and that is because Rik and I made a number of very smart decisions along the way, many of which weren't obviously smart at the time.

You don't want to hear about the billing and accounting end of Moonpig, so I will discuss that as little as possible, to establish a context for the clever technical designs we made. The rest of the talk is organized as a series of horrible problems and how we avoided, parried, or mitigated them:

• Times and time zones suck
• Floating-point arithmetic sucks
• It sucks to fix your mangled data after an automated process fails
• Testing a yearlong sequence of events sucks
• It sucks to have your automated test accidentally send a bunch of bogus invoices to the customers
• Rounding errors suck
• Relational databases usually suck
• Modeling objects in the RDB really really sucks
• Perl's garbage collection sucks
• OO inheritance sucks
Moonpig, however, does not suck.

Some of the things I'll talk about will include the design of our web API server and how it played an integral role in the system, our testing strategies, and our idiotically simple (but not simply idiotic) persistent storage solution. An extended digression on our pervasive use of Moose roles will be deferred to the lightning talks session on Sunday.

Much of the design is reusable, and is encapsulated in modules that have been released to CPAN or that are available on GitHub, or both.

I went to sleep too late the night before, slept badly, and woke up at 6:30 and couldn't go back to sleep. I spent an hour wandering around Oakdale looking for a place that served breakfast before 8 AM, and didn't find one. Then I was in a terrible mood. But for this talk, that was just right. I snarled and cursed at all the horrible problems and by the end of the talk I felt pretty good.

Slides are available here. Video may be forthcoming.

Share and enjoy.

# Cobblestones

This is a public service announcement.

This is not a picture of a cobbled street:

Rather, these stones are "Belgian block", also called setts.

Cobblestones look like this:

I took these pictures in front of the library of the American Philosophical Society on South 5th Street in Philadelphia. South 5th Street is paved with Belgian block, and the lane beside the APS is cobbled. You can just barely distinguish them in this satellite photograph.

# Overlapping intervals

Our database stores, among other things, "budgets", which have a lifetime with a start and end time. A business rule is that no two budgets may be in force at the same time. I wanted to build a method which, given a proposed start and end time for a new budget, decided whether there was already a budget in force during any part of the proposed period.

The method signature is:

   sub find_overlapping_budgets {
my ($self,$start, $end) = @_; ... }  and I want to search the contents of$self->budgets for any budgets that overlap the time interval from $start to$end. Budgets have a start_date and an end_date property.

My first thought was that for each existing budget, it's enough to check to see if its start_date or its end_date lies in the interval of interest, so I wrote it like this:

   sub find_overlapping_budgets {
my ($self,$start, $end) = @_; return$self->budgets->search({
[ { start_date => { ">=" , $start }, start_date => { "<=" ,$end },
},
{ end_date => { ">=" , $start }, end_date => { "<=" ,$end },
},
]
});
}

People ridicule Lisp for having too many parentheses, and code like this, a two-line function which ends with },},]});}, should demonstrate that that is nothing but xenophobia. I'm not gonna explain the ridiculous proliferation of braces and brackets here, except to say that this is expressing the following condition:

which we can abbreviate as:

And if this condition holds, then the intervals overlap. Anyway, this seemed reasonable at the time, but is totally wrong, and happily, the automated tests I wrote for the method caught the error. Say that we ask whether we can create a budget that runs from June 1 to June 10. Say there is a budget that already exists, running from June 6 to June 7. Then the query asks :

Both of the disjuncts are false, so the method reports that there is no overlap. My implementation was just completely wrong. it's not enough to check to see if either endpoint of the proposed interval lies within an existing interval; you also have to check to see if any of the endspoints of the existing intervals lie within the proposed interval. (Alert readers will have noticed that although the condition "Intervals A and B overlap" is symmetric in A and B, the condition as I wrote it is not symmetric, and this should raise your suspicions.)

This was yet another time when I felt slightly foolish as I wrote the automated tests, assuming that the time and effort I spent on testing this trivial function would would be time and effort thrown away on nothing—and then they detected a real fault. Someday perhaps I'll stop feeling foolish writing tests for functions like this one; until then, many cases just like this one will help me remember that I must write the tests even though I feel foolish doing it.

Okay, how to get this right? I tried a bunch of things, mostly involving writing out a conjunction of every required condition and then using boolean algebra to simplify the resulting expression:

This didn't work well, partly because I was doing it at two in the morning, partly because there are many conditions, all very similar, and I kept getting them mixed up, and partly because, for implementation reasons, the final expression must be a query on interval A, even though it is most naturally expressed symmetrically between the two intervals.

But then I had a happy idea: For some reason it seemed much simpler to express the opposite condition, that the two intervals do not conflict. If they don't conflict, then interval A must be entirely to the left of interval B, so that or vice-versa, so that Then the intervals do not overlap if either of these is true:

and the condition that we want, that the two intervals do overlap, is simply its negation:

This is correct, or at least all the tests now pass, and it is even simpler than the incorrect condition I wrote in the first place. The code looks like this:

   sub find_overlapping_budgets {
my ($self,$start, $end) = @_; return$self->budgets->search({
end_date   =>   { '>=', $start }, start_date => { '<=',$end   },
});
}

Usually I like to draw some larger lesson from this sort of thing. What comes to mind now (other than “Just write the tests, fool!”) is this: The end result is quite clever. Often I see the final version of the code and say "Oh, I wonder why I didn't see that right off?" Not this time. I want to say I couldn't have found it by myself, except that I did find it by myself, not by just pulling it magically out of my head, but by applying technique.

Instead of "not by magically pulling it out of my head" I was about to write "not by just thinking", but that is not quite right. I did solve it by "just thinking", but it was a different sort of thinking. Sometimes I consider a problem, and a solution leaps to mind, as it did in this case, except that it was wrong. That is what I call "just thinking". But applying carefully-learned and practiced technique is also thinking.

The techniques I applied in this problem included: noticing and analyzing symmetries of the original problem, and application of laws of boolean algebra, both in the unsuccessful and the successful attempt. Higher-level strategies included trying more than one approach, and working backwards. Learning and correctly applying technique made me effectively a better thinker, not just in general, but in this particular case.

[ Addendum 20130917: Dfan Schmidt remarks: "I'm astonished you didn't know the interval-overlap trick already." I was a little surprised, also, when I tried to pull the answer out of my head and didn't find one there already, either from having read it somewhere before, or from having solved the problem before. ]

# In which I revisit the pastimes of my misspent youth

Last weekend I was at a flea market and saw an HP-15C calculator for $10. The HP-15C was the last pocket calculator I owned, some time before pocket calculators became ridiculous. It was a really nice calculator when I got it in 1986, one of my most prized possessions. I lost my original one somewhere along the way, and also the spare I had bought from a friend against the day when I lost the original, and I was glad to get another one, even though I didn't have any idea what I was going to do with it. My phone has a perfectly serviceable scientific calculator in it, a very HP-ish one called RealCalc. (It's nice, you should check it out.) The 15C was sufficiently popular that someone actually brought it back a couple of years ago, in a new and improved version, with the same interface but 21st-century technology, and I thought hard about getting one, but decided I couldn't justify spending that much money on something so useless, even if it was charming. Finding a cheap replacement was a delightful surprise. Then on Friday night I was sitting around thinking about which numbers n are such that a perfect square, and I couldn't think of any examples except for 0, 2, and 4. Normally I would just run and ask the computer, which would take about two minutes to write the program and one second to run it. But I was out in the courtyard, it was a really nice evening, my favorite time of the year, the fading light was beautiful, and I wasn't going to squander it by going inside to brute-force some number problem. But I did have the HP-15C in my pocket, and the HP-15C is programmable, by mid-1980s programmable calculator standards. That is to say, it is just barely programmable, but just barely is all you need to implement linear search for solutions of . So I wrote the program and discovered, to my surprise, that I still remember many of the fussy details of how to program an HP-15C. For example, the SST button single-steps through the listing, in program mode, but single-steps the execution in run mode. And instead of using the special test 5 to see if the x and y registers are equal you might as well subtract them and use the x=0 test; it uses the same amount of program memory and you won't have to flip the calculator over to remember what test 5 is. And the x2 and INT() operations are on the blue shift key. Here's the program:  001 - 42,21,11 002 - 43 11 003 - 1 004 - 0 005 - 20 006 - 9 007 - 40 008 - 36 009 - 11 010 - 36 011 - 43 44 012 - 30 013 - 43 20 014 - 31 015 - 43 32 016 - 42,21,12 017 - 40 018 - 45 0 019 - 32 11 020 - 2 021 - 44,40, 0 022 - 22 12  I see now that when I tested for integrality, I did it the wrong way. My method used four steps:  010 - 36 -- push stack 011 - 43 44 -- x ← INT(x) 012 - 30 -- subtract 013 - 43 20 -- test x=0 ?  but it would have been better to just test the fractional part of the value for zeroness:  42 44 -- x ← FRAC(x) 43 20 -- test x=0 ?  Saving two instructions might not seem like a big deal, but it takes the calculator a significant amount of time to execute two instructions. The original program takes 55.2 seconds to find n=80; with the shorter code, it takes only 49.2 seconds, a 10% improvement. And when your debugging tool can only display a single line of numeric operation codes, you really want to keep the program as simple as you can. Besides, stuff should be done right. That's why it's called "right". But I kind of wish I had that part of my brain back. Who knows what useful thing I would be able to remember if I wasn't wasting my precious few brain cells remembering that the back-step key ("BST") is on the blue shift, and that "42,21,12" is the code for "subroutine B starts here". Anyway, the program worked, once I had debugged it, and in short order (by 1986 standards) produced the solutions n=18, 80, 154, which was enough to get my phone to search the OEIS and find the rest of the sequence. The OEIS entry mentioned that the solutions have the generating function and when I saw that in the denominator, I laughed, really loudly. My new neighbor was in her back yard, which adjoins the courtyard, and heard me, and said that if I was going to laugh like that I had to explain what was so funny. I said “Do you really want to know?” and she said yes, but I think she was mistaken. ### Douglas M. Auclair (geophf) # Abstract Nonsense: Chapter 2pc, Continuations with Choice So, okay, we've shown how to lift continuation functions up into the monadic domain, so we could control the input to a continuation, just as we've shown how we can (very easily) control the output from such a function. And the '(very easily)' is the rub there. If it's the output of the function that concerns us,1 then let's simply work with the output, good or bad. There's a pattern that addresses this, it's known as the Either data type: > data Either a b = Left a | Right b The Either type, itself, is not a functor, as it represents choice. By convention this falls very nicely into failure conditions where we wish to know what went wrong: > type Answer a = Either a String If we get back a Left a solution, we know our answer is good; but if we get back a Right String answer, then we know we had an error, and the error message is contained in the string. We can use the either function to dispatch on our called-function's output: > either :: (a -> c) -> (b -> c) -> Either a b -> c So, all that heavy lifting of CPS-functions we did in the last chapter to make them monadic ...? Let's return to the drawing board, restarting from the example code base where we made our divBy function in the CPS-style for unit values: public ANS appDivBy(final Arrow<Integer, ANS> tail, Integer by) throws Failure { return Continuity.divBy(basis).apply(by) .apply(new Function<Integer, ANS>() { public ANS apply(Integer x) throws Failure { return app(tail, basis, x); } }); } And with the Either type ... Whoopsie, we need to define our Either type in Java, first, before we can use it! Java is particular about these kinds of things. Funny, that. Well, the type is very easy to declare: interface Either<LEFT, RIGHT> { public <ANSWER> ANSWER either(Arrow<LEFT, ANSWER> afn, Arrow<RIGHT, ANSWER> bfn) throws Failure; } The Either type gives us the data structure of choice (materializing that choice, once it's made), and the function that dispatches on that choice. I'd give you the implementation for Left and Right, the two materialized options of the Either choice, but we have to do a bit of ground-work to make them happen. Or should I just try to develop the two subtypes of Either and let the basic types fall out, as what happened when I did the development work? Yes, let's do that. First, let's define our Right option of our Either choice-point: public class Right<LEFT, RIGHT> implements Functor<RIGHT>, Either<LEFT, RIGHT> { public Right(Right value) { this.arg = value; } // Functor implementation ----------------------- public <A, B> Arrow<Functor<A>, Functor<B>> fmap(final Arrow<A, B> fn) { return new Function<Functor<A>, Functor<B>>() { public Functor<B> apply(Functor<A> func) throws Failure { Right<?, B> ans = null; try { Right<?, A> righto = (Right<?, A>) func; ans = new Right<?, B>(fn.apply(righto.arg())); } catch(Exception ex) { throw new Failure("Unable to fmap(Right) for " + func); } return ans; } }; } public Right arg() { return arg; } // Either implementation ------------------- public <ANSWER> ANSWER either(Arrow<LEFT, ANSWER> a, Arrow<RIGHT, ANSWER> b) throws Failure { return b.apply(arg()); } // Object override ---------------- @Override public String toString() { return getClass().getSimpleName() + "<" + arg() + ">"; } private final RIGHT arg; } So, then we have the 'left' option of an Either choice point: public class Left<LEFT, RIGHT> implements Functor<LEFT>, Either<LEFT, RIGHT> { public Left(LEFT value) { this.arg = value; } // Functor implementation ----------------------- public <A, B> Arrow<Functor<A>, Functor<B>> fmap(final Arrow<A, B> fn) { return new Function<Functor<A>, Functor<B>>() { public Functor<B> apply(Functor<A> func) throws Failure { Left<B, ?> ans = null; try { Left<A, ?> louie = (Left<A, ?>) func; ans = new Left<B, ?>(fn.apply(louie.arg())); } catch(Exception ex) { throw new Failure("Unable to fmap(Left) for " + func); } return ans; } }; } public LEFT arg() { return arg; } // Either implementation ------------------- public <ANSWER> ANSWER either(Arrow<LEFT, ANSWER> a, Arrow<RIGHT, ANSWER> b) throws Failure { return a.apply(arg()); } // Object override ---------------- @Override public String toString() { return getClass().getSimpleName() + "<" + arg() + ">"; } private final LEFT arg; } There's a lot of red in the above code. Why? Well, because we already implemented all that code in the Right<?, RIGHT> type. All the red code is repeated, nearly verbatim, just replacing 'Left's for 'Right's. What's going on here? It seems like there's shared functionality here. There is. The thing is, are functors require to carry around some unit type? I argue: no. The intrinsic property of a functor is that it is mappable from one category to the other: fmap(). I further argue that it is not intrinsic to a functor-type to carry a value. So I said arg() is not part of the Functor protocol.2 And this, by definition, fits in very nicely with the whole concept of monads. Once you're lifted into the monadic domain, you stay there, and you are not allowed to extract() values from the monad. For example: try extracting a value from Maybe.NOTHING, or from [] ... you get an error. I argue even more strongly than this arg() against a(n unspecified) monad should not compile! And that's my heart-ache with the above code. Left, by declaration, is a Functor, but it shouldn't have to carry around all these marshalling and unmarshalling functions around a carried type. A generic type-carrying type should have all that functionality and Left and Right should shunt or delegate all this functionality to that type-carrying type. What could this type be? Well, the obvious candidate is Id. public class Id<T> implements Functor<T> { public Id(T val) { arg = val; } public T arg() { return arg; } // Functor override -------------------------- public <A, B> Arrow<Functor<A>, Functor<B>> fmap(final Arrow<A, B> fn) { return new Function<Functor<A>, Functor<B>>() { public Functor<B> apply(Functor<A> func) throws Failure { Id<B> ans = null; try { Id<A> id = (Id<A>) func; ans = constructor(fn.apply(id.arg())); } catch(Exception ex) { throw new Failure("Unable to fmap(id) for " + func); } return ans; } }; } protected <X> Id<X> constructor(X val) { return new Id<X>(val); } // Object overrides -------------------------- @Override public String toString() { return getClass().getSimpleName() + "<" + arg() + ">"; } @Override public int hashCode() { int hash = getClass().hashCode(); if(arg() != null) { hash += 7 * arg().hashCode(); } return hash; } @Override public boolean equals(Object obj) { if(obj instanceof Id) { Id id = (Id)obj; if(arg() == null) return id.arg() == null; return arg().equals(id.arg()); } return false; } private final T arg; } With this common class as a parent, both in functionality and in type-inheritance (both Left and Right are identifying types or type labels on the underlying values), both Left and Right reduce to much smaller implementations: public class Left<LEFT, RIGHT> extends Id<LEFT> implements Either<LEFT, RIGHT> { public Left(LEFT value) { super(value); } // Id override ----------------------- protected <X> Id<X> constructor(X val) { return new Left<X, Unit>(val); } // Either override ------------------- public <ANSWER> ANSWER either(Arrow<LEFT, ANSWER> a, Arrow<RIGHT, ANSWER> b) throws Failure { return a.apply(arg()); } } and for Right: public class Right<LEFT, RIGHT> extends Id<RIGHT> implements Either<LEFT, RIGHT> { public Right(RIGHT value) { super(value); } // Id override ----------------------- protected <X> Id<X> constructor(X val) { return new Right<Unit, X>(val); } // Either override ------------------- public <ANSWER> ANSWER either(Arrow<LEFT, ANSWER> a, Arrow<RIGHT, ANSWER> b) throws Failure { return b.apply(arg()); } } Much, much simpler! You see in these implementations the constructor() function supplies our (sub-)type, so that fmap() on Id works equally well for these types, as well. You also saw in the original implementation the following (questionable) type: ? That type has been replaced by the Unit type (familiar to you Haskell-functional programmers). Unit is what? One? Well, yes, there's only one Unit instance, because it represents the null or nothing value in the unit category: public class Unit { private Unit() { } @Override public String toString() { return "()"; } public static final Unit instance = new Unit(); } Easy, peasy! And, in fact, with just these elements, we have a system sufficient to handle normal and erroneous states (or any left or right paths) for our simple arithmetic system: class ResultE extends Result<Either<Integer, String>> { public ResultE(Either<Integer, String> res) { super(res); } public Either<Integer, String> apply(final Arrow<Integer, Either<Integer, String>> tail) throws Failure { return appE(tail, basis, 3); } private static Either<Integer, String> appE(final Arrow<Integer, Either<Integer, String>> tail, final Either<Integer, String> basis, Integer in) throws Failure { return Continuity.g(basis).apply(in) .apply(new Function<Integer, Either<Integer, String>>() { public Either<Integer, String> apply(Integer x) throws Failure { return Continuity.f(basis).apply(x).apply(tail); } }); } public Either<Integer, String> appEDivBy(final Arrow<Integer, Either<Integer, String>> tail, Integer by) throws Failure { Either<Integer, String> ans = null; try { Continuation<Integer, Either<Integer, String>> cont = Continuity.divBy(basis).apply(by); ans = cont.apply(new Function<Integer, Either<Integer, String>>() { public Either<Integer, String> apply(Integer x) throws Failure { return appE(tail, basis, x); } }); } catch(Exception ex) { ans = new Right<Integer, String>("Failed with error " + ex); } return ans; } public static void main(String args[]) throws Failure { eitherCPSdivBy0(); } private static void eitherCPSdivBy0() throws Failure { final Either<Integer, String> basis = new Left<Integer, String>(0); final ResultE inf = new ResultE(basis); looper(new Callee<Either<Integer, String>>() { public Either<Integer, String> call(Integer inni) throws Failure { return inf.appEDivBy(new Function<Integer, Either<Integer, String>>() { public Either<Integer, String> apply(Integer x) throws Failure { return new Left<Integer, String>(x); } }, inni); } }, "either-CPS"); } } Note that it is the caller that determines what is 'good' (left) and what is 'bad' (right) and what to do in either case. And notice that in the below output, how simple and obvious everything is, we either have a left/number or a right/error condition. javac -d .classes ResultE.java java -cp .classes ResultE Hey, for either-CPS (20 divBy 0 + 1) * 2 is Right<Failed with error java.lang.ArithmeticException: / by zero> Hey, for either-CPS (20 divBy 1 + 1) * 2 is Left<42> Hey, for either-CPS (20 divBy 2 + 1) * 2 is Left<22> Hey, for either-CPS (20 divBy 3 + 1) * 2 is Left<14> Hey, for either-CPS (20 divBy 4 + 1) * 2 is Left<12> Hey, for either-CPS (20 divBy 5 + 1) * 2 is Left<10> Hey, for either-CPS (20 divBy 6 + 1) * 2 is Left<8> Hey, for either-CPS (20 divBy 7 + 1) * 2 is Left<6> Hey, for either-CPS (20 divBy 8 + 1) * 2 is Left<6> Hey, for either-CPS (20 divBy 9 + 1) * 2 is Left<6> ... aaaaaand we're done! Apprennez le leçon, s'il vous plaît So, okay, what's the take-away from all this? Besides the fact that Either is easy to implement and to apply and that lifting continuations to monadic binding is hard? Well, just that. The previous two chapters are a reinvention of the ContT, the continuation monadic transformer, where I developed a need or justification for its use and then I wrote the thing out by hand from scratch. THEN I looked up ContT in the Haskell standard library. The journey of Haskell programmers is reimplementing functionality of the standard libraries. Enlightenment comes when they realize they are doing just that. But I'm actually feeling pretty good from all this exercise. I never saw the 'point' before of continuations, and I never 'got' ContT; I didn't even look at it as a thing, and now, it's ... 'Ah! A Kleisli arrow on the continuation function! Now I see!" and I didn't see the 'point' of the Either type. It was just choice. So what? Choice is a powerful thing. It's more than just error handling, as we have and show here, but even just that is powerful enough: it's simple to implement and easy to apprehend. Continuations. It's shown us ... well, continuations, but the control pattern also exposes opportunity to explore data types that are very simple to use, but why? Because the patterns that arise beg the simplicity that are filled by these data types. That's what I like about these simple concepts: rigorously applied, they open new vistas, yes, but they also facilitate the exploration with these patterns that allow us to try something, ContT, perhaps, and if that doesn't work well, then, well, how about Either, then? Or perhaps Maybe? Or a monadic or comonadic List? An applicative functor? An universe of space to explore, and, with these tool, a (type-)safe, fun and exciting set of ways to branch off in these explorations. I'll see you on the other side. Next chapter. Tuples? Coroutining with continuations? We shall see. ----- (1) In most cases, it is so that the output of functions is what concerns us. That's a reason monads are so pervasive: they allow us to control the output of functions safely in the monadic domain. (2) This was my personal little crisis. I just couldn't see the point of functor types not carrying a datum. I mean: what did that look like? I was stuck on the idea of functor-as-id, and id carried a value it identified, even if it were the empty/null/unit-type instance. Same for monads. Yes, monads are the triple {T, μ, η} but if they aren't carrying a datum, what's the point? Monads not carrying a value seemed to be a rather abstract (and pointless) exercise to me. After all, even Nothing has a value: Nothing. But that's where I missed it, right? But I'm in good company: the ancient Greeks looked upon the concept of zero as anathema and heretical. 'Something' exists, and 'nothing' doesn't exist. It can't. Conceptualizing it is error. Or so they thought, and so did I. And looked what that blocking ... block? Math was stuck until zero was admitted, and then from zero, came everything ('infinity'). Same for me. By forcing Functor and, consequently, Monad to have arg(), I was boxing Functor into having something that it mayn't, and forcing Monad into something that it isn't: you extract from comonads, to extract from a monad is an error. Let's look at Nothing again. Nothing :: Maybe a Nothing has the (Maybe'd) a-type, but it doesn't carry an a-instance. Same thing for any kind of list: [] :: [a] The empty list has the (listed) a-type, but it doesn't carry an a-instance. It's an error to ask to extract an a-instance from an empty list. Same thing for any mzero (dependent) (sub) type of a monoidally-typed (or -kinded) type. So, whereas before I would have arg() for monads and throw an error when I encountered a Nothing or [] or other mzero-value, now I'm thinking that arg() depends on the type of the value. There's no arg() for [] and none for Nothing, either, and it's now a compiler error to ask for an arg() from those values. You just can't ask for arg() from Maybe nor from list any more when they're monadic, you have to obey the math of monads now, and fmap() the monadic function against the monad itself, then join() to get the monadic result. Much cleaner, and now provably correct, because now the above doesn't yield a runtime error. When this new concept of monads is applied in programs, compiling the program successfully means it works now. And this reconceptualization means I can now explore monads that do not carry an underlying value. I can have some monad m that itself has values m1, m2, m3, ..., mn and those (monadic) values, themselves, have their own intrinsic meaning: I don't have to ask() for their arg()ed values for the context of them: I can now use comonads for that when I need that. Where will this new understanding lead me? I don't know. I'm excited to find that out for myself. ## November 24, 2013 ### Darcs # darcs news #105 ### News and discussions 1. This year's Google Summer of Code projects brought a lot of improvements to darcs and its ecosystem! • BSRK Aditya: Darcsden improvements: • José Neder: patience diff, file move detection, token replace detection: 2. Gian Piero Carrubba asked why adjacent hunks could not commute: 3. We listed the changes that occurred between version 2.8.4 and the current development branch into a 2.10 release page: ### Issues resolved (8) issue346 Jose Luis Neder issue1828 Guillaume Hoffmann issue2181 Guillaume Hoffmann issue2309 Owen Stephens issue2313 Jose Luis Neder issue2334 Guillaume Hoffmann issue2343 Jose Luis Neder issue2347 Guillaume Hoffmann ### Patches applied (39) 2013-10-28 Guillaume Hoffmann • record: no longer accept first commented line as patch name • resolve issue2347: fix amend-record --prompt-long-comment • refuse incorrect patch names (empty or starting with TAG) 2013-06-15 Jose Luis Neder • resolve issue2313: whatsnew -l: Stack space overflow • tests for issue2313: whatsnew -l: Stack space overflow 2013-02-17 Mark Stosberg • new failing test for issue2303 - illustrates a Patch Index diagnostic that can be improved. • failing test for issue2272: darcs rebase unsuspend should succeed despite unrecorded changes • issue2271: New (passing) test which confirms that darcs optimize --disable-patch-index will fail in the face of a permissions problem. 2013-08-20 BSRK Aditya • Export getChangesInfo from Changes command module 2013-09-22 Jose Luis Neder • Add better comments to the issue2343 test file • resolve issue2343: darcs amend-record does not record my change 2013-08-20 Ganesh Sittampalam • Introduce a local copy of Data.Map.Strict from containers 0.5, • Weaken hashable dependency 2013-08-13 Guillaume Hoffmann • put upper bound on hashable dependency 2013-07-30 Jose Luis Neder • make darcs-test run tests with both diff algorithms • resolve issue346: implement "patience diff" from bzr • name change Lcs.hs -> Diff/Myers.hs 2013-08-05 Guillaume Hoffmann • resolve issue2334 fix win32 build removing file permission functions 2013-08-27 BSRK Aditya • When using send command, post the patch if using http, • Update containers minimum version 2013-08-12 Guillaume Hoffmann • remove import warning • put newline after darcs diff changelog output • change format of patch name file when a text editor is invoked • remove redundant case in Darcs.UI.Command.Record.getLog • update unrecord and obliterate help, mention -O flag 2013-07-18 Owen Stephens • Fix a couple of warnings from annotate with line numbers patch • Resolve issue2309: annotate includes line numbers 2013-07-25 Guillaume Hoffmann • fix test for issue1290 for when tests are run in parallel • announce moved files • make move patch explicit • announce removed files • announce added files • resolve issue2181: put cache in$XDG_CACHE_HOME (~/.cache by default)
• merge Darcs.Util.IO into Darcs.Util.File
2013-07-17 Owen Stephens
• Move chompNewline from Commands to Util/Text and rename to chompTrailingNewline
2013-07-04 Guillaume Hoffmann
• update help of mark-conflicts
• resolve issue1828: file listing and working --dry-run for mark-conflicts
2013-03-18 Ganesh Sittampalam
• fix warnings introduced by issue1105 fix
2013-02-21 Florent Becker
• Fix issue1105 even in the presence of command-line optinos
See darcs wiki entry for details.

# PhD Position on the Reasoning about Coroutines project

## PhD Position on the Reasoning about Coroutines project

The Programming Languages Group of Ghent University (UGent) invites applicants for a PhD position on the project Reasoning about Coroutines under the direction of Tom Schrijvers. The project concerns reasoning techniques (like type systems and program analysis) for coroutines in its many forms (like delimited continuations and effect handlers). It is conducted in collaboration with the declarative languages group of KU Leuven university and our international partners.
We seek applicants at an international level of excellence. As a successful applicant, you have a master degree in Computer Science or equivalent. Ideally, you also have a strong, documented interest in doing research. Strong problem-solving and programming skills are essential. Prior knowledge of functional programming, logic programming, type systems or program analysis are an advantage.
The PhD position is for 4 years and starts between January and October 2014. The position is fully funded by the Fund for Scientific Research Flanders (FWO). The salary is compatible with other Belgian PhD rates and among the better ones in Europe and abroad.
You will become part of UGent's dynamic and growing Programming Languages Group. UGent is one of the major universities in the Dutch-speaking region of Europe. It distinguishes itself as a socially committed and pluralistic university in a broad international perspective. Our university is located in the beautiful city of Ghent, which is one of Europe’s greatest discoveries, according to Lonely Planet. For more information on life as a PhD researcher at UGent see here.
Please direct your inquiries and applications by e-mail to Tom Schrijvers, principal investigator on this project.
To apply, send as soon as possible:
1. a letter of interest (including motivation relevant to the research topic),
2. your detailed curriculum vitae (including study curriculum rankings, relevant research experience and publications),
3. your diploma and transcripts (including translation if possible).
We will determine the start date together with the selected candidate.

# k-mer counting in a high-level language?

One important ingredient in the bioinformatician’s toolchest is the k-mer indexer or counter. (A k-mer, sometimes called a k-word or q-gram, is simply a contiguous substring of length k, and the k-mers of a data set are all overlapping such substrings.) This forms the basis for many kinds of sequence statistics and analysis, including alignments and assembly, for instance.

There exist a variety of tools to count k-mers, the important attributes are speed and memory usage - the latter is often more important, since data sets are typically large. Stumbling over this link, one gets the impression that Python and Perl, while convenient, suffer performance-wise. Consequently, k-mer counting programs tend to be written in C, C++, or some other bare metal language. So I thought it would be interesting to compare a Haskell implementation. Titus Brown and his group (who author the khmer probabilistic k-mer counter) has done some work comparing different tools, I won’t go into details except note that jellyfish seems to currently be the one to beat.

I believe most k-mer indexing tools pack nucleotide sequences (consisting of A, C, G, T) into two bits per nucleotide, and ensure somehow that a k-mer hashes to the same value as its reverse complement.

Then, there are several alternatives for storing the counts. The most common is probably some form of hash table, but others use suffix arrays (which are much more general), probably including compressed, sparse, and enhanced variants. And some use probabilistic methods based on Bloom filters or similar structures, which are fast and compact at the expense of being slightly incorrect and not being able to give you back the k-mers.

Anyway, since I am toying with Judy arrays (which like a hashtable is an associative data structure, but which is implemented using a trie), I thought I’d generate k-mers, and store them in a Judy array. I wrote the following program:

main = do
[k',f] <- getArgs
kmer_seq = kmers_rc k . unSD . seqdata
freqs <- mk_judy (2*fromIntegral k)
mapM_ (add_count freqs) $concatMap kmer_seq s cts <- mk_judy 20 mapM_ (add_count cts . fromIntegral) =<< counts freqs writeHist cts (f++".hist") This generates all k-mers, using a small wrapper around kmers_rc, which hashes each k-mer with its reverse complement. It then inserts their counts in a judy array called freqs, and then constructs a histogram called cts containing the counts from freqs, which are then written to a file. How does this perform? I ran this on my laptop (Core i5 at 2.5GHz) on a file containing 250K 100bp reads, a 60MB file with about 25MB of actual data. I also ran jellyfish on the same file. It looks like this, first my program: % /usr/bin/time ./mkhist 31 250K-reads.fastq Init judy 62 Init judy 20 13.59user 0.19system 0:13.78elapsed 100%CPU (0avgtext+0avgdata 293544maxresident)k 0inputs+64outputs (0major+73470minor)pagefaults 0swaps Not too shabby, perhaps, this is close to 2MB/s. Then jellyfish: % /usr/bin/time jellyfish count -m 31 -C -s 10000000 250K-reads.fastq -o 250K 8.27user 0.15system 0:08.12elapsed 103%CPU (0avgtext+0avgdata 187392maxresident)k 0inputs+328864outputs (0major+47007minor)pagefaults 0swaps % /usr/bin/time jellyfish merge 250K_0 250K_1 -o 250K.jf 1.29user 0.10system 0:01.35elapsed 103%CPU (0avgtext+0avgdata 60060maxresident)k 0inputs+325048outputs (0major+15097minor)pagefaults 0swaps % /usr/bin/time jellyfish histo 250K.jf > 250K.hist 0.34user 0.01system 0:00.35elapsed 99%CPU (0avgtext+0avgdata 163984maxresident)k 0inputs+8outputs (0major+41061minor)pagefaults 0swaps Jellyfish is a bit less convenient, it builds the counts in several files (two here), which must then be merged later. But it is much faster, only eight seconds to do the count, and about one and a half for the merge and histogram - a total of less than 10 seconds is a good bit better. Now, these examples are pretty small, but you can also notice that my program uses a bit more memory, and if you suspect it will get worse as you scale up, you are right: jellyfish just generates more files to keep memory use down. My program barely manages to count 7M reads on my 8G laptop (in about ten minutes, if you have to know), so as it stands, it’s not very useful for real data. And to be fair, jellyfish also has good multiprocessor support - although I think that wouldn’t be too hard to add. And to be completely fair, I guess I might point out that the main reason we’re doing well, is that we rely on Judy arrays, written in C, lovingly handcrafted for compactness and efficiency. On the bright side, k-mer hashing is pure Haskell, and fast enough not be a noticeable bottleneck. Oh, yeah, and I diffed the output files and got the same. I did get slightly different answers for the 7M reads file, I’ll have to look into that, I guess. ### wren ng thornton # It is done. Things done today • Gave my advisors The Letter. The public announcement is scheduled for monday • Handed in the revised version of qual #2. Clocked in at 59 pages. I'm not entirely satisfied with it, but sometimes y'just gotta let it go. What remains before I'm ABD • P550 final paper. Target length 10 pages. • Qual #3, due by x-mas. This one is just a lit-review, and I've already read the lit, so it shouldn't take too long (I hope!) • Qual oral defense, the first week or two of Spring. Cakewalk expected • Dissertation proposal. Aiming to get it done by January comments ## November 21, 2013 ### Jeremy Gibbons # Lenses are the coalgebras for the costate comonad I took part in the Dagstuhl Seminar on Bidirectional Transformations “BX” earlier this month. It was a meeting of people from four communities—databases, graph transformations, programming languages, and software engineering—discussing their various perspectives—namely the view–update problem in databases, triple graph grammars, lenses, and model synchronization—on the common problem of “BX”. While there, I reported on a marvellous observation made by Russell O’Connor, that lenses are exactly the coalgebras for the costate comonad. That is, the independently identified notion of a “very well-behaved lens” in the work of Pierce and others coincides exactly with the categorical notion of a “coalgebra” for a particular comonad, the “costate” comonad. I’ll unpack that claim here. ## Lenses Pierce’s lenses are pairs of functions between “source” and “view” datatypes ${S}$ and ${V}$: a “get” function ${g : S \rightarrow V}$ and a “put” function ${p : S \times V \rightarrow S}$. The story is that the view is some projection of the data in the source—perhaps a subset of the data, or the data in a simpler format—and so in order to update the source given a modified view, one needs also a copy of the original source from which to reconstruct the missing information. For these two functions to capture a “well-behaved” lens, they should satisfy the so-called Get–Put and Put–Get laws: $\displaystyle \begin{array}{lcl} p\,(s,g\,s) &=& s \\ g\,(p\,(s,v)) &=& v \end{array}$ The Get–Put law says that if you “get” a view of the source, and then “put” it straight back without modifying it, the source remains unmodified: a no-op edit on the view translates into a no-op on the source. The Put–Get law says that if you “put” any view into a source and then “get” it back, you end up with the view you first thought of: nothing is lost from the view when it is put back. Additionally, for these two functions to capture a “very well-behaved” lens, they must satisfy a third law, the Put–Put law: $\displaystyle \begin{array}{lcl} p\,(p\,(s,v),u) &=& p\,(s,u) \end{array}$ In words, “put”ting back two views ${v}$ then ${u}$ is equivalent to “put”ting back just the second; any changes to the source from putting back ${v}$ are completely overwritten when putting back ${u}$. (This turns out to be rather a strong condition, requiring that the source basically factors into the view and a completely independent “complement”; few real applications of bidirectional transformation satisfy it. But that’s another story.) ## The costate comonad Intuitively, comonads capture “data in context”. A comonad ${(D,\mathit{extr},\mathit{dupl})}$ consists of a functor ${D}$ together with two natural transformations ${\mathit{extr} : D \rightarrow 1}$ and ${\mathit{dupl} : D \rightarrow DD}$ that extract the data from its context and duplicate the context, satisfying the three axioms: $\displaystyle \begin{array}{lcl} \mathit{extr} \cdot \mathit{dupl} &=& \mathit{id} \\ \mathit{fmap}\,\mathit{extr} \cdot \mathit{dupl} &=& \mathit{id} \\ \mathit{fmap}\,\mathit{dupl} \cdot \mathit{dupl} &=& \mathit{dupl} \cdot \mathit{dupl} \end{array}$ One example of a comonad is the “costate” construction: for fixed ${V}$, define functor ${D}$ by $\displaystyle \begin{array}{lcl} D\,A &=& V \times (V \rightarrow A) \end{array}$ so that the “map” function for ${D}$ satisfies ${\mathit{fmap}\,h\,(v,f) = (v, h \cdot f)}$. The operations are given by $\displaystyle \begin{array}{lcl} \mathit{extr}\,(v,f) &=& f\,v \\ \mathit{dupl}\,(v,f) &=& (v, \lambda u \rightarrow (u,f)) \end{array}$ Verifying that these definitions satisfy the comonad axioms is left as an exercise for the interested reader. (Incidentally, I think it’s called the “costate” comonad more because it is the dual ${(V\times)\cdot(V\rightarrow)}$ of the “state” monad ${(V\rightarrow)\cdot(V\times)}$, rather than because it has anything to do with stateful computations. However, it does model state in the sense of stored variables; and indeed, Russell O’Connor’s blog posting calls ${D}$ the “store” comonad.) ## Coalgebras of a comonad For a functor ${F}$, an ${F}$-coalgebra is a pair ${(A,f)}$ of a type ${A}$ and a function ${f : A \rightarrow F\,A}$. A “coalgebra for a comonad ${D}$” is a ${D}$-coalgebra that interacts well with the operations ${\mathit{extr}}$ and ${\mathit{dupl}}$ of the comonad; that is, the function ${f}$ should also satisfy the laws: $\displaystyle \begin{array}{lcl} \mathit{extr} \cdot f &=& \mathit{id} \\ \mathit{dupl} \cdot f &=& \mathit{fmap}\,f \cdot f \end{array}$ (Another incidentally: I don’t have a feeling for what these laws mean, in the way that I do for the laws of an algebra of a monad. At least for the free monads ${T}$ that represent terms with free variables, an algebra is a pair ${(A, f : T\,A \rightarrow A)}$ such that ${f}$ makes sense as an “expression evaluator”—it respects singleton variables and substitution. It’s clear to me that the laws of a coalgebra for a comonad are the obvious duals of those for the algebra of a monad; and that they describe the interesting ways of putting together the coalgebra operation with the comonad operations; but I still don’t have a direct intuition. Any comments gratefully received!) ## Lenses are coalgebras of the costate comonad Now it’s just a matter of putting the pieces together. Curry the “put” function of a lens to obtain ${p : S \rightarrow (V \rightarrow S)}$, and define a lens to be the fork of the “get” and “put” functions: $\displaystyle \begin{array}{lcl} \ell\,s &=& (g\,s, p\,s) \end{array}$ Note that now ${\ell : S \rightarrow D\,S}$ where ${D}$ is the costate comonad. The Get–Put law is equivalent to the counit axiom of the coalgebra: $\displaystyle \begin{array}{ll} & \mathit{extr} \cdot \ell = \mathit{id} \\ \Leftrightarrow & \qquad \{ \mbox{apply to an~} s \} \\ & \mathit{extr}\,(\ell\,s) = s \\ \Leftrightarrow & \qquad \{ \ell \} \\ & \mathit{extr}\,(g\,s, p\,s) = s \\ \Leftrightarrow & \qquad \{ \mathit{extr} \} \\ & p\,s\,(g\,s) = s \end{array}$ And the Put–Get and Put–Put laws together are equivalent to the coassociativity axiom: $\displaystyle \begin{array}{ll} & \mathit{dupl} \cdot \ell = \mathit{fmap}\,\ell \cdot \ell \\ \Leftrightarrow & \qquad \{ \mbox{apply to an~} s \} \\ & \mathit{dupl}\,(\ell\,s) = \mathit{fmap}\,\ell\,(\ell\,s) \\ \Leftrightarrow & \qquad \{ \ell \} \\ & \mathit{dupl}\,(g\,s, p\,s) = \mathit{fmap}\,\ell\,(g\,s, p\,s) \\ \Leftrightarrow & \qquad \{ \mathit{fmap} \mbox{~for~} D \} \\ & \mathit{dupl}\,(g\,s, p\,s) = (g\,s, \ell \cdot p\,s) \\ \Leftrightarrow & \qquad \{ \mathit{dupl} \} \\ & (g\,s, \lambda v \rightarrow (v, p\,s)) = (g\,s, \ell \cdot p\,s) \\ \Leftrightarrow & \qquad \{ \mbox{first components are clearly equal} \} \\ & \lambda v \rightarrow (v, p\,s) = \ell \cdot p\,s \\ \Leftrightarrow & \qquad \{ \mbox{apply to a~} v \} \\ & (v, p\,s) = \ell\,(p\,s\,v) \\ \Leftrightarrow & \qquad \{ \ell \} \\ & (v, p\,s) = (g\,(p\,s\,v), p\,(p\,s\,v)) \\ \Leftrightarrow & \qquad \{ \mbox{apply second components to a~} u \} \\ & (v, p\,s\,u) = (g\,(p\,s\,v), p\,(p\,s\,v)\,u) \\ \Leftrightarrow & \qquad \{ \mbox{pair equality is pointwise} \} \\ & v = g\,(p\,s\,v) \land p\,s\,u = p\,(p\,s\,v)\,u \end{array}$ # The stream monad I read quite a nice problem on Nick Wu’s blog, which will serve as a fine warm-up exercise. It’s about the fact that streams (infinite lists) form a monad, in a different way from lists. Nick shows the “right” and two “wrong” definitions of the join or bind operation, distinguishing them on the basis of the monad laws. But I think Nick’s proofs are more complicated than they need to be, because he hasn’t fully exploited the recursion patterns that underlie his definitions. This post will involve some language that we have not yet covered. Fear not! I hope it will be clear from context. But in case it isn’t, you might want to take a look at some of the background material (especially the paper Calculating Functional Programs). ## Streams Like Nick, for simplicity we will take the datatype of streams to be a synonym for lists; in all that follows, assume that lists are properly infinite (not finite, or partial). $\displaystyle \mathbf{type}\;\mathit{Stream}\,a = [a]$ Streams are naturally a codatatype rather than a datatype: in the category of sets and total functions, they would be represented as a final coalgebra rather than an initial algebra. In Haskell, which is roughly based on the category of CPOs and continuous functions, initial algebras and final coalgebras coincide, so we need not (indeed, we cannot) make the distinction formally. But we can make it informally, by stipulating that the basic pattern of computation for streams is the ${\mathit{unfold}}$: $\displaystyle \begin{array}{l} \mathit{unfold} :: (b \rightarrow (a,b)) \rightarrow b \rightarrow \mathit{Stream}\,a \\ \mathit{unfold}\,f\,b = a : \mathit{unfold}\,f\,b' \qquad\mathbf{where}\qquad (a,b') = f\,b \end{array}$ ${\mathit{unfold}\,f}$ generates a stream from a seed, using the body ${f}$ that transforms a seed ${b}$ into an element ${a}$ and a new seed ${b'}$. For example, the map function for streams uses the input stream as the seed, repeatedly splitting it into its head and tail: $\displaystyle \begin{array}{l} \mathit{mapS} :: (a \rightarrow b) \rightarrow \mathit{Stream}\,a \rightarrow \mathit{Stream}\,b \\ \mathit{mapS}\,f = \mathit{unfold}\,(\mathit{fork}\,(f \cdot \mathit{head}, \mathit{tail})) \end{array}$ where ${\mathit{fork}}$ applies two functions to the same argument: $\displaystyle \mathit{fork}\,(f,g)\,a = (f\,a, g\,a)$ The crucial property of ${\mathit{unfold}}$ is its universal property, which provides necessary and sufficient conditions for a computation to be expressible as an instance of ${\mathit{unfold}}$: $\displaystyle h = \mathit{unfold}\,f \Leftrightarrow \mathit{out} \cdot h = \mathit{prod}\,(\mathit{id},h) \cdot f$ where ${\mathit{out} = \mathit{fork}\,(\mathit{head},tail)}$ deconstructs a stream into its head and tail, and $\displaystyle \mathit{prod}\,(f,g)\,(a,b) = (f\,a, g\,b)$ From the universal property, one can easily (exercise!) prove three simple consequences (we’ll call them the “identity” and two “evaluation” rules): $\displaystyle \begin{array}{l} \mathit{unfold}\,\mathit{out} = \mathit{id} \\ \mathit{head} \cdot \mathit{unfold}\,(\mathit{fork}\,(f,g)) = f \\ \mathit{tail} \cdot \mathit{unfold}\,(\mathit{fork}\,(f,g)) = \mathit{unfold}\,(\mathit{fork}\,(f,g)) \cdot g \end{array}$ and the very important fusion law: $\displaystyle \mathit{unfold}\,f \cdot h = \mathit{unfold}\,g \Leftarrow f \cdot h = \mathit{prod}\,(\mathit{id},h) \cdot g$ allowing a preceding function ${h}$ to be absorbed into the unfold. ## Streams as a monad Making streams a monad amounts to defining functions $\displaystyle \begin{array}{lcl} \mathit{return} &::& a \rightarrow \mathit{Stream}\,a \\ \mathit{join} &::& \mathit{Stream}\,(\mathit{Stream}\,a) \rightarrow \mathit{Stream}\,a \end{array}$ satisfying the monad laws: $\displaystyle \begin{array}{lcl} \mathit{join} \cdot \mathit{return} &=& \mathit{id} \\ \mathit{join} \cdot \mathit{mapS}\,\mathit{return} &=& \mathit{id} \\ \mathit{join} \cdot \mathit{mapS}\,\mathit{join} &=& \mathit{join} \cdot \mathit{join} \end{array}$ Looking at the type, the obvious (indeed, I think the only possible) definition one can give for ${\mathit{return}}$ is ${\mathit{return} = \mathit{repeat}}$ where $\displaystyle \mathit{repeat} = \mathit{unfold}\,\mathit{double}$ and ${\mathit{double} = \mathit{fork}\,(\mathit{id},\mathit{id})}$ makes two copies of its argument. However, there are many type-correct definitions one could give for ${ \mathit{join}}$, including ${\mathit{head}}$, ${\mathit{mapS}\,\mathit{head}}$, and ${\mathit{diag}}$, where $\displaystyle \mathit{diag} = \mathit{unfold}\,\mathit{hhtt}$ and where (for brevity in what follows) we define $\displaystyle \begin{array}{lcl} \mathit{hhtt} &=& \mathit{fork}\,(\mathit{hh},\mathit{tt}) \\ \mathit{hh} &=& \mathit{head}\cdot\mathit{head} \\ \mathit{tt} &=& \mathit{tail}\cdot\mathit{mapS}\,\mathit{tail} \end{array}$ Obviously, ${\mathit{head}}$ yields the first “row” of a stream of streams (if one considers it in row-major order), and ${\mathit{mapS}\,\mathit{head}}$ yields the first column; as the name suggests, ${\mathit{diag}}$ yields the leading diagonal. Nick’s post demonstrates that the first two, although type-correct, do not satisfy the monad laws. He also provides a proof that the third does, which we turn to next. ## Checking the monad laws The proofs that ${\mathit{repeat}}$ and ${\mathit{diag}}$ satisfy the three monad laws are very straightforward, using the universal property of ${\mathit{unfold}}$ and its consequences. For the first monad law, fusion gives us the condition to check: $\displaystyle \begin{array}{ll} & \mathit{diag}\cdot\mathit{repeat} = \mathit{id} \\ \Leftarrow & \\ & \mathit{hhtt}\cdot\mathit{repeat} = \mathit{prod}\,(\mathit{id},\mathit{repeat})\cdot\mathit{fork}\,(\mathit{head},\mathit{tail}) \end{array}$ Working on the right-hand side, we have: $\displaystyle \begin{array}{ll} & \mathit{hhtt}\cdot\mathit{repeat} \\ = & \qquad \{ \mbox{definition} \} \\ & \mathit{fork}\,(\mathit{hh},\mathit{tt})\cdot\mathit{repeat} \\ = & \qquad \{ \mbox{composition distributes backwards over fork} \} \\ & \mathit{fork}\,(\mathit{hh}\cdot\mathit{repeat},\mathit{tt}\cdot\mathit{repeat}) \\ = & \qquad \{ \mbox{definitions} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{repeat},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{repeat}) \\ = & \qquad \{ \mbox{evaluation for~} \mathit{repeat} \} \\ & \mathit{fork}\,(\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{repeat}) \\ = & \qquad \{ \mbox{naturality; evaluation} \} \\ & \mathit{fork}\,(\mathit{head},\mathit{repeat}\cdot\mathit{tail}) \\ = & \qquad \{ \mbox{pairs} \} \\ & \mathit{prod}\,(\mathit{id},\mathit{repeat}) \cdot \mathit{fork}\,(\mathit{head},\mathit{tail}) \end{array}$ discharging the proof obligation. Similarly, for the second monad law, fusion gives us the condition: $\displaystyle \begin{array}{ll} & \mathit{diag}\cdot\mathit{mapS}\,\mathit{repeat} = \mathit{id} \\ \Leftarrow & \\ & \mathit{hhtt}\cdot\mathit{mapS}\,\mathit{repeat} = \mathit{prod}\,(\mathit{id},\mathit{mapS}\,\mathit{repeat})\cdot\mathit{fork}\,(\mathit{head},\mathit{tail}) \end{array}$ and working on the right-hand side, in almost exactly the same steps we get: $\displaystyle \begin{array}{ll} & \mathit{hhtt}\cdot\mathit{mapS}\,\mathit{repeat} \\ = & \qquad \{ \mbox{definition} \} \\ & \mathit{fork}\,(\mathit{hh},\mathit{tt})\cdot\mathit{mapS}\,\mathit{repeat} \\ = & \qquad \{ \mbox{composition distributes backwards over fork} \} \\ & \mathit{fork}\,(\mathit{hh}\cdot\mathit{mapS}\,\mathit{repeat},\mathit{tt}\cdot\mathit{mapS}\,\mathit{repeat}) \\ = & \qquad \{ \mbox{definitions} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{mapS}\,\mathit{repeat},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,\mathit{repeat}) \\ = & \qquad \{ \mbox{naturality; evaluation} \} \\ & \mathit{fork}\,(\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,\mathit{repeat}) \\ = & \qquad \{ \mbox{naturality; functors; evaluation} \} \\ & \mathit{fork}\,(\mathit{head},\mathit{mapS}\,\mathit{repeat}\cdot\mathit{tail}) \\ = & \qquad \{ \mbox{pairs} \} \\ & \mathit{prod}\,(\mathit{id},\mathit{mapS}\,\mathit{repeat}) \cdot \mathit{fork}\,(\mathit{head},\mathit{tail}) \end{array}$ discharging the obligation. What about the third monad law? To apply the universal property (or fusion), we need one side to be expressed as an unfold; but neither side of the equation ${\mathit{diag}\cdot\mathit{diag} = \mathit{diag}\cdot\mathit{mapS}\,\mathit{diag}}$ is in that form. No matter; let us hypothesize that one side—say, the left—can be expressed in the form ${\mathit{unfold}\,h}$ for some ${h}$, then calculate a suitable definition for ${h}$ (if one exists). Assuming we succeed, then we can use fusion to check that the other side equals ${\mathit{unfold}\,h}$. (This strategy doesn’t work if we can find no such ${h}$!) Again, fusion gives us $\displaystyle \mathit{diag}\cdot\mathit{diag} = \mathit{unfold}\,h \Leftarrow \mathit{hhtt}\cdot\mathit{diag} = \mathit{prod}\,(\mathit{id},\mathit{diag})\cdot h$ so we calculate: $\displaystyle \begin{array}{ll} & \mathit{hhtt}\cdot\mathit{diag} \\ = & \qquad \{ \mbox{definition; distribution} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{diag},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{diag}) \\ = & \qquad \{ \mbox{evaluation for~} \mathit{diag} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{diag}) \\ = & \qquad \{ \mbox{naturality; evaluation} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{diag}\cdot\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,(\mathit{mapS}\,\mathit{tail})) \\ = & \qquad \{ \mbox{pairs} \} \\ & \mathit{prod}\,(\mathit{id},\mathit{diag})\cdot\mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,(\mathit{mapS}\,\mathit{tail})) \end{array}$ Therefore, letting $\displaystyle \mathit{hhhttt} = \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,(\mathit{mapS}\,\mathit{tail}))$ we have concluded that $\displaystyle \mathit{diag}\cdot\mathit{diag} = \mathit{unfold}\,\mathit{hhhttt}$ Now all we have to do is to check that the right-hand side of the third monad law also equals this; fusion gives us the condition $\displaystyle \begin{array}{ll} & \mathit{diag}\cdot\mathit{mapS}\,\mathit{diag} = \mathit{unfold}\,\mathit{hhhttt} \\ \Leftarrow & \\ & \mathit{hhtt}\cdot\mathit{mapS}\,\mathit{diag} = \mathit{prod}\,(\mathit{id},\mathit{mapS}\,\mathit{diag})\cdot \mathit{hhhttt} \end{array}$ and we calculate on the right-hand side: $\displaystyle \begin{array}{ll} & \mathit{hhtt}\cdot\mathit{mapS}\,\mathit{diag} \\ = & \qquad \{ \mbox{definition; distribution} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{mapS}\,\mathit{diag},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,\mathit{diag}) \\ = & \qquad \{ \mbox{naturality; evaluation} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,\mathit{diag}) \\ = & \qquad \{ \mbox{functors; naturality; evaluation} \} \\ & \mathit{fork}\,(\mathit{head}\cdot\mathit{head}\cdot\mathit{head},\mathit{mapS}\,\mathit{diag}\cdot\mathit{tail}\cdot\mathit{mapS}\,\mathit{tail}\cdot\mathit{mapS}\,(\mathit{mapS}\,\mathit{tail})) \\ = & \qquad \{ \mbox{pairs; definition} \} \\ & \mathit{prod}\,(\mathit{id},\mathit{mapS}\,\mathit{diag})\cdot\mathit{hhhttt} \end{array}$ completing the proof. As you’ll see, the calculations are all quite short and simple, whereas in Nick’s formulation, they were rather hard work; I think that was (a) because he wasn’t exploiting the universal property, and (b) because he was working in terms of the “bind” rather than the “join” of the monad, which forced him into a more pointwise rather than pointfree style. Points are helpful when writing programs, but less so when reasoning about them. ## Deducing diag Here’s another way of looking at the problem. Nick’s blog presented three plausible (that is, type-correct) definitions for the ${\mathit{join}}$ operation. Two of these didn’t satisfy the necessary laws, so were evidently wrong. The third, ${\mathit{diag}}$, does satisfy the laws, but is it the only possible definition that does? I believe that it is the only solution in the form of an unfold; but I only have a hand-waving argument as to why. Let us suppose that indeed $\displaystyle join = \mathit{unfold}\,k$ for some ${k}$. Without loss of generality, let us suppose also that $\displaystyle k = \mathit{fork}\,(k_1,k_2)$ with $\displaystyle \begin{array}{lcl} k_1 &::& \mathit{Stream}\,(\mathit{Stream}\,a) \rightarrow a \\ k_2 &::& \mathit{Stream}\,(\mathit{Stream}\,a) \rightarrow \mathit{Stream}\,(\mathit{Stream}\,a) \end{array}$ I claimed above that ${\mathit{repeat}}$ is the only type-correct definition of the ${\mathit{return}}$ operation. (Ignoring bottoms, that is. Which is to say, in Haskell, all type-correct definitions are approximations in the definedness ordering to ${\mathit{repeat}}$.) Consideration of just the first two monad laws gives us some constraints on ${k}$, since we know that ${\mathit{return} = \mathit{repeat}}$: $\displaystyle \begin{array}{lcl} k\cdot\mathit{repeat} &=& \mathit{prod}\,(\mathit{id},\mathit{repeat})\cdot\mathit{fork}\,(\mathit{head},\mathit{tail}) \\ k\cdot\mathit{mapS}\,\mathit{repeat} &=& \mathit{prod}\,(\mathit{id},\mathit{mapS}\,\mathit{repeat})\cdot\mathit{fork}\,(\mathit{head},\mathit{tail}) \end{array}$ Or in terms of ${k}$‘s two components, $\displaystyle \begin{array}{lcll} k_1\cdot\mathit{repeat} &=& \mathit{head} &(1)\\ k_2\cdot\mathit{repeat} &=& \mathit{repeat}\cdot\mathit{tail} &(2)\\ k_1\cdot\mathit{mapS}\,\mathit{repeat} &=& \mathit{head} &(3)\\ k_2\cdot\mathit{mapS}\,\mathit{repeat} &=& \mathit{mapS}\,\mathit{repeat}\cdot\mathit{tail} &(4) \end{array}$ I claim that (1) entails that ${k_1}$ picks some element out of the first “column” of a stream of streams (thinking of the input as an infinite matrix in row-major order again)—for the equation says that when the input consists of infinitely many copies of the same stream, ${k_1}$ picks (one of the many copies of) the head of that stream. Symmetrically, (3) entails that, when given a infinite matrix whose columns are all equal, ${k_1}$ picks some element out of the first “row”. And because ${k_1}$ has to be polymorphic, it cannot behave differently on special matrices like these than it does in general. Putting those statements together, and waving my hands in the air, I conclude that ${k_1}$ picks the only element that is in both the first row and the first column: $\displaystyle k_1 = \mathit{head} \cdot \mathit{head}$ Similarly, Equation (2) says that, given an infinite input matrix all of whose rows are equal, ${k_2}$ drops the first column (and possibly some of the rows are duplicated or dropped, and the order of the rows may change; but the elements of the rows are untouched). Symmetrically, (4) says that, given an input whose columns are all equal, ${k_2}$ drops the first row (and may duplicate, drop, or rearrange the columns, but not change any of them). And again, the behaviour in general must be consistent with these special cases. Putting these observations together, ${k_2}$ must drop the first row and the first column, and cannot change any of the remainder of the matrix. $\displaystyle k_2 = \mathit{tail} \cdot \mathit{mapS}\,\mathit{tail}$ What is the right framework in which to present such arguments more formally? It feels rather like Paul Hoogendijk’s relational approach to generic programming, which has to talk about largest natural transformations of a given type: the relational setting provides the conjunction one needs in order to express the two separate constraints on ${k_1}$. ### Functional Jobs # Software Developer - Functional Programming at Genetec (Full-time) Software developers at Genetec use their technical aptitudes creatively in order to design and program new features, while working closely with the product management teams to meet customers’ expectations. They work in multidisciplinary teams driven by the desire to overcome the limits of the technology in order to deliver products of outstanding quality, beauty and creativity to the customers. A software development career at Genetec is much more than just an opportunity to create great products; it is also an opportunity to work in a world class, talented, high energy software development team with a solid track record of creating winning products. Roles and Responsibilities: The current position is a code intensive position specialized in distributed applications development using functional programming and .Net technologies. • Design and implement large scale distributed network centric applications using .NET 4.0 technologies in F#. • Elaborate functional and architectural specifications for different features. • Manage their time to respect milestones and delivery dates. • Work in conjunction with software testers to fix different bugs in the product. Requirements: • Bachelor or Master’s degree in Computer Engineering, Software Engineering, Computer Science, Mathematics or Physics. • Minimum of 1 year of experience in F# development or other functional languages such as Erlang, Haskell, OCaml, Scala or Scheme. •Must be fluent in French and English. Technical Requirements: • Strong knowledge in functional programming • Strong knowledge in object-oriented programming. • Strong knowledge of multi-thread application development. • Experience with Microsoft Visual Studio .NET 2008 or 2010. Assets: • Experience with the following: o TCP/IP and protocol development o Microsoft SQL Server programming o Transactional and n-tier network Architectures Get information on how to apply for this position. ### apfelmus # GUI - Release of the threepenny-gui library, version 0.4.0.0 I am pleased to announce release of threepenny-gui version 0.4, a cheap and simple library to satisfy your immediate GUI needs in Haskell. Want to write a small GUI thing but forgot to sacrifice to the giant rubber duck in the sky before trying to install wxHaskell or Gtk2Hs? Then this library is for you! Threepenny is easy to install because it uses the web browser as a display. The library also has functional reactive programming (FRP) built-in, which makes it a lot easier to write GUI application without getting caught in spaghetti code. For an introduction to FRP, see for example my slides from a tutorial I gave in 2012. (The API is slightly different in Reactive.Threepenny.) Version 0.4 is an incremental improvement over the previous version. A new UI monad has been introduced to simplify the JavaScript FFI and allow recursion for FRP. DOM elements are now subject to garbage collection. (Unfortunately, this also leads to a bug when using custom HTML files.) To see Threepenny in action, have a look at the following applications:  Daniel Austin’s FNIStash Editor for Torchlight 2 inventories. Chaddai’s CurveProject Plotting curves for math teachers. Get the library here: Note that the API is still in flux and is likely to change radically in the future. You’ll have to convert frequently or develop against a fixed version. Many thanks to Daniel Austin and Daniel Mlot for their help with this project and to Chris Done for implementing the Ji library which is the basis for this effort. ### Jasper Van der Jeugt # Picking a random element from a frequency list # Introduction A week or so ago, I wrote Lorem Markdownum: a small webapp to generate random text (like the many lorem ipsum generators out there), but in markdown format. This blogpost explains a mildly interesting algorithm I used to pick a random element from a frequency list. It is written in Literate Haskell so you should be able to drop it into a file and run it – the raw version can be found here. > import Data.List (sortBy) > import Data.Ord (comparing) > import qualified Data.Map.Strict as M > import System.Random (randomRIO) # The problem Lorem ipsum generators usually create random but realistic-looking text by using sample data, on which a model is trained. A very simple example of that is just to pick words according to their frequencies. Let us take some sample data from a song that fundamentally changed the music industry in the early 2000s: Badger badger badger Mushroom mushroom Badger badger badger Panic, a snake Badger badger badger Oh, it’s a snake! This gives us the following frequency list: > badgers :: [(String, Int)] > badgers = > [ ("a", 2) > , ("badger", 9) > , ("it's", 1) > , ("mushroom", 2) > , ("oh", 1) > , ("panic", 1) > , ("snake", 2) > ] The sum of all the frequencies in this list is 18. This means that we will e.g. pick “badger” with a chance of 9/18. We can naively implement this by expanding the list so it contains the items in the given frequencies and then picking one randomly. > decodeRle :: [(a, Int)] -> [a] > decodeRle [] = [] > decodeRle ((x, f) : xs) = replicate f x ++ decodeRle xs > sample1 :: [(a, Int)] -> IO a > sample1 freqs = do > let expanded = decodeRle freqs > idx <- randomRIO (0, length expanded - 1) > return$ expanded !! idx

This is obviously extremely inefficient, and it is not that hard to come up with a better definition: we do not expand the list, and instead use a specialised indexing function for frequency lists.

> indexFreqs :: Int -> [(a, Int)] -> a
> indexFreqs _   [] = error "please reboot computer"
> indexFreqs idx ((x, f) : xs)
>     | idx < f     = x
>     | otherwise   = indexFreqs (idx - f) xs
> sample2 :: [(a, Int)] -> IO a
> sample2 freqs = do
>     idx <- randomRIO (0, sum (map snd freqs) - 1)
>     return $indexFreqs idx freqs However, sample2 is still relatively slow when we our sample data consists of a large amount of text (imagine what happens if we have a few thousand different words). Can we come up with a better but still elegant solution? Note that lorem ipsum generators generally employ more complicated strategies than just picking a word according to the frequencies in the sample data. Usually, algorithms based on Markov Chains are used. But even when this is the case, picking a word with some given frequencies is still a subproblem that needs to be solved. # Frequency Trees It is easy to see why sample2 is relatively slow: indexing in a linked list is expensive. Purely functional programming languages usually solve this by using trees instead of lists where fast indexing is required. We can use a similar approach here. A leaf in the tree simply holds an item and its frequency. A branch also holds a frequency – namely, the sum of the frequencies of its children. By storing this computed value, we will be able to write a fast indexing this method. > data FreqTree a > = Leaf !Int !a > | Branch !Int (FreqTree a) (FreqTree a) > deriving (Show) A quick utility function to get the sum of the frequencies in such a tree: > sumFreqs :: FreqTree a -> Int > sumFreqs (Leaf f _) = f > sumFreqs (Branch f _ _) = f Let us look at the tree for badgers (we will discuss how this tree is computed later): Once we have this structure, it is not that hard to write a faster indexing function, which is basically a search in a binary tree: > indexFreqTree :: Int -> FreqTree a -> a > indexFreqTree idx tree = case tree of > (Leaf _ x) -> x > (Branch _ l r) > | idx < sumFreqs l -> indexFreqTree idx l > | otherwise -> indexFreqTree (idx - sumFreqs l) r > sample3 :: FreqTree a -> IO a > sample3 tree = do > idx <- randomRIO (0, sumFreqs tree - 1) > return$ indexFreqTree idx tree

There we go! We intuitively see this method is faster since we only have to walk through a few nodes – namely, those on the path from the root node to the specific leaf node.

But how fast is this, exactly? This depends on how we build the tree.

# Well-balanced trees

Given a list with frequencies, we can build a nicely balanced tree (i.e., in the sense in which binary tries are balanced). This minimizes the longest path from the root to any node.

We first have a simple utility function to clean up such a list of frequencies:

> uniqueFrequencies :: Ord a => [(a, Int)] -> [(a, Int)]
> uniqueFrequencies =
>     M.toList . M.fromListWith (+) . filter ((> 0) . snd)

And then we have the function that actually builds the tree. For a singleton list, we just return a leaf. Otherwise, we simply split the list in half, build trees out of those halves, and join them under a new parent node. Computing the total frequency of the parent node (freq) is done a bit inefficiently, but that is not the focus at this point.

> balancedTree :: Ord a => [(a, Int)] -> FreqTree a
> balancedTree = go . uniqueFrequencies
>   where
>     go []       = error "balancedTree: Empty list"
>     go [(x, f)] = Leaf f x
>     go xs       =
>         let half     = length xs div 2
>             (ys, zs) = splitAt half xs
>             freq     = sum $map snd xs > in Branch freq (go ys) (go zs) # Huffman-balanced trees However, well-balanced trees might not be the best solution for this problem. It is generally known that few words in most natural languages are extremely commonly used (e.g. “the”, “a”, or in or case, “badger”) while most words are rarely used. For our tree, it would make sense to have the more commonly used words closer to the root of the tree – in that case, it seems intuitive that the expected number of nodes visited to pick a random word will be lower. It turns out that this idea exactly corresponds to a Huffman tree. In a Huffman tree, we want to minimize the expected code length, which equals the expected path length. Here, we want to minimize the expected number of nodes visited during a lookup – which is precisely the expected path length! The algorithm to construct such a tree is surprisingly simple. We start out with a list of trees: namely, one singleton leaf tree for each element in our frequency list. Then, given this list, we take the two trees which have the lowest total sums of frequencies (sumFreqs), and join these using a branch node. This new tree is then inserted back into the list. This algorithm is repeated until we are left with only a single tree in the list: this is our final frequency tree. > huffmanTree :: Ord a => [(a, Int)] -> FreqTree a > huffmanTree = go . map (\(x, f) -> Leaf f x) . uniqueFrequencies > where > go trees = case sortBy (comparing sumFreqs) trees of > [] -> error "huffmanTree: Empty list" > [ft] -> ft > (t1 : t2 : ts) -> > go$ Branch (sumFreqs t1 + sumFreqs t2) t1 t2 : ts

This yields the following tree for our example:

# Is the second approach really better?

Although Huffman trees are well-studied, for our example, we only intuitively explained why the second approach is probably better. Let us see if we can justify this claim a bit more, and find out how much better it is.

The expected path length L of an item in a balanced tree can be very easily approached, since it is just a binary tree and we all know those (suppose N is the number of unique words):

However, if we have a tree we built using the huffmanTree, it is not that easy to calculate the expected path length. We know that for a Huffman tree, the path length should approximate the entropy, which, in our case, gives us an approximation for the path length for item with a specified frequency f:

Where F is the total sum of all frequencies. If we assume that we know the frequency for every item, the expected path length is simply a weighted mean:

This is where it gets interesting. It turns out that the frequency of words in a natural language is a well-researched topic, and predicted by something called Zipf’s law. This law tells us that the frequency of an item f can be estimated by:

Where s characterises the distribution and is typically very close to 1 for natural languages. H is the generalised harmonic number:

If we substitute in the definition for the frequencies into the formula for the expected path length, we get:

This is something we can work with! If we plot this for s = 1, we get:

It is now clear that the expected path length for a frequency tree built using huffmanTree is expected to be significantly shorter than a frequency tree built using balancedTree, even for relatively small N. Yay! Since the algorithm now works, the conclusion is straightforward.

# Conclusion

Lorem markdownum constitit foret tibi Phoebes propior poenam. Nostro sub flos auctor ventris illa choreas magnis at ille. Haec his et tuorum formae obstantes et viribus videret vertit, spoliavit iam quem neptem corpora calidis, in. Arcana ut puppis, ad agitur telum conveniant quae ardor? Adhuc arcu acies corpore amplexans equis non velamina buxi gemini est somni.

Thanks to Simon Meier, Francesco Mazzoli and some other people at Erudify for the interesting discussions about this topic!

# TDOR

On this day we remember our dead.

When right-wing bigots lie and fabricate stories about trans* people, you look at our dead and tell me with a straight face who should fear whom. While you worry about your kids feeling nervous about nothing happening, I'm too worried for the children who will one day soon be shot, strangled, suffocated, stabbed, tortured, beheaded, lit on fire, and thrown off bridges simply for existing.

And you on the left: I love all you queers, and I'm glad for your victories; but the next time you celebrate an "LGBT" victory you take a long hard look at your history of throwing that "T" under the bus and you look at our dead and tell me with a straight face how it's not yet time to fight for trans* rights.

# PhD Studentship on ABCD

Please circulate the following to those who might be interested.
We are recruiting for one PhD student to work on design and implementation of programming languages. The post is on the project "From Data Types to Session Types: A Basis for Concurrency and Distribution".

The project has particular emphasis on putting theory into practice, embedding session types in a range of programming languages and applying them to realistic case studies. The research programme is joint between the University of Edinburgh, University of Glasgow, and Imperial College London, and includes collaboration with Amazon, Cognizant, Red Hat, VMware, and the Ocean Observatories Initiative. We have a programme grant funded by EPSRC for five years from 20 May 2013.

The successful candidate will join a team responsible for extending the functional web programming language Links with session types to support concurrency and distribution. We will test our techniques by providing a library to access Amazon Web Services (AWS) cloud computing infrastructure, and perform empirical experiments to assess how our language design impacts the performance of programmers.

You should possess an undergraduate degree in a relevant area, or being nearing completion of same, or have comparable experience. You should have evidence of ability to undertake research and communicate well. You should have a background in programming languages, including type systems, and programming and software engineering skills.

It is desirable for candidates to also have one or more of the following: a combination of theoretical and practical skills; experience of web programming or cloud programming; knowledge of the theory or practice of concurrent and distributed systems; knowledge of linear logic; or training in empirical measurement of programming tasks. We especially welcome applications from women and minorities.

We seek applicants at an international level of excellence. The School of Informatics at Edinburgh is among the strongest in the world, and Edinburgh is known as a cultural centre providing a high quality of life.

The successful candidate will receive a studentship covering tuition and subsistence. Students from the UK or EU are preferred, but studentships may be available for overseas students with strong qualifications. Applications should be received by 13 December to be eligible for the full range of scholarships. Consult the University of Edinburgh website for details of how to apply.

# The oldest Shake bug - async exceptions are hard

Summary: Shake could corrupt the build database under rare circumstances. Last week I finally figured out why, and fixed it.

The Shake build system records build metadata as it runs, using several techniques to ensure the metadata is not corrupted, even if the build process is killed. Over the past few years I've received a small number of reports where Shake ended up with corrupted metadata, causing a complete rebuild. These reports usually involved a build error, and often a laptop, but I was never able to reproduce the bug. Last week I finally understood the cause, and the bug is now fixed in shake-0.10.9. In this post I explain how Shake records its metadata, what the bug was and how I fixed it.

Shake records build metadata in a file, as a sequence of entries. Each entry represents one firing of a build rule, and contains the length of the entry, followed by the key of that rule (e.g. a filename) and the values produced by that rule (e.g. a modification time and list of dependencies).

The metadata file is designed to be robust even if the build is forcibly killed, using the length of the entry to detect whether the entry was written out in full, or if the program aborted while writing the entry. The key property I rely on is that all entries are complete and valid, apart from the last which may be only partially written.

#### The Bug

I was writing to the file with:

withLock lock $LBS.hPut handle entry This code was called from multiple threads, so to ensure writes were not interleaved, I used withLock. In Shake, if any build rule fails, I abort the build by killing all worker threads. However, during the process of aborting, one thread could take the lock, write out some prefix of its entry, then be killed. Another thread could follow the same pattern, resulting in two half-entries and a corrupted metadata file. For this problem to arise you need to abort a thread that is in the middle of writing to the file, then wait sufficiently long to give a second thread the chance to start writing before it too is killed. In practice I believe it requires an error to be raised and two rules to produce entries almost simultaneously, and for the first thread to be killed to be the one that took the lock, and for the threads to be killed somewhat slowly. In days of random testing once every 20 seconds I never managed to reproduce such a delicate arrangement. #### The Fix The fix is straightforward, just make a single thread responsible for all writes to the file: chan <- newChanforkIO$ forever $LBS.hPut h =<< readChanlet write x = do evaluate$ LBS.length x    writeChan x

Here the chan keeps a list of things to write out, a single thread reads from chan and writes to the file, and any thread can call write. It is important to evaluate before writing to chan so that any exceptions caused by binary encoding are raised by the thread that caused them, and errors get the correct stack trace (Shake adds nice stack traces to all error messages).

In practice, the code is complicated by the possibility of exceptions, cleaning up on shutdown and a desire to flush the file periodically. The real code in Development.Shake.Storage is:

flushThread :: Maybe Double -> Handle -> ((LBS.ByteString -> IO ()) -> IO a) -> IO aflushThread flush h act = do    chan <- newChan -- operations to perform on the file    kick <- newEmptyMVar -- kicked whenever something is written    died <- newBarrier -- has the writing thread finished    flusher <- case flush of        Nothing -> return Nothing        Just flush -> fmap Just $forkIO$ forever $do takeMVar kick threadDelay$ ceiling $flush * 1000000 tryTakeMVar kick writeChan chan$ hFlush h >> return True    root <- myThreadId    writer <- forkIO $handle (\(e :: SomeException) -> signalBarrier died () >> throwTo root e)$        -- only one thread ever writes, ensuring only the final write can be torn        whileM $join$ readChan chan    (act $\s -> do evaluate$ LBS.length s -- ensure exceptions occur on this thread            writeChan chan $LBS.hPut h s >> tryPutMVar kick () >> return True) finally do maybe (return ()) killThread flusher writeChan chan$ signalBarrier died () >> return False            waitBarrier died

This function takes the flush interval (in seconds, or Nothing to never flush), and the file handle, and an action to run which requires the write function. It's pretty complex code, which is why it has such a high density of comments (compared to my usual code).

# Horner’s Rule

This post is about my all-time favourite calculation, of a linear-time algorithm for the maximum segment sum problem, based on Horner’s Rule. The problem was popularized in Jon Bentley’s Programming Pearls series in CACM (and in the subsequent book), but I learnt about it from Richard Bird’s lecture notes on The Theory of Lists and Constructive Functional Programming and his paper Algebraic Identities for Program Calculation, which he was working on around the time I started my DPhil. It seems like I’m not the only one for whom the problem is a favourite, because it has since become a bit of a cliché among program calculators; but that won’t stop me writing about it again.

## Maximum segment sum

The original problem is as follows. Given a list of numbers (say, a possibly empty list of integers), find the largest of the sums of the contiguous segments of that list. In Haskell, this specification could be written like so:

$\displaystyle \begin{array}{lcl} \mathit{mss} &=& \mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{segs} \end{array}$

where ${\mathit{segs}}$ computes the contiguous segments of a list:

$\displaystyle \begin{array}{lcl} \mathit{segs} &=& \mathit{concat} \cdot \mathit{map}\,\mathit{inits} \cdot \mathit{tails} \\ \mathit{tails} &=& \mathit{foldr}\,f\,[[\,]] \quad\mathbf{where}\; f\,\mathit{x}\,\mathit{xss} = (\mathit{x}:\mathit{head}\,\mathit{xss}):\mathit{xss} \\ \mathit{inits} &=& \mathit{foldr}\,g\,[[\,]] \quad\mathbf{where}\; g\,\mathit{x}\,\mathit{xss} = [\,] : \mathit{map}\,(\mathit{x}:)\,\mathit{xss} \end{array}$

and ${\mathit{sum}}$ computes the sum of a list, and ${\mathit{maximum}}$ the maximum of a nonempty list:

$\displaystyle \begin{array}{lcl} \mathit{sum} &=& \mathit{foldr}\,(+)\,0 \\ \mathit{maximum} &=& \mathit{foldr}_1\,\max \end{array}$

This specification is executable, but takes cubic time; the problem is to do better.

We can get quite a long way just with standard properties of ${\mathit{map}}$, ${\mathit{inits}}$, etc:

$\displaystyle \begin{array}{ll} & \mathit{mss} \\ = & \qquad \{ \mbox{definition of~} \mathit{mss} \} \\ & \mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{segs} \\ = & \qquad \{ \mbox{definition of~} \mathit{segs} \} \\ & \mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{concat} \cdot \mathit{map}\,\mathit{inits} \cdot \mathit{tails} \\ = & \qquad \{ \mbox{naturality of~} \mathit{concat} \} \\ & \mathit{maximum} \cdot \mathit{concat} \cdot \mathit{map}\,(\mathit{map}\,\mathit{sum}) \cdot \mathit{map}\,\mathit{inits} \cdot \mathit{tails} \\ = & \qquad \{ \mathit{maximum} \cdot \mathit{concat} = \mathit{maximum} \cdot \mathit{map}\,\mathit{maximum} \} \\ & \mathit{maximum} \cdot \mathit{map}\,\mathit{maximum} \cdot \mathit{map}\,(\mathit{map}\,\mathit{sum}) \cdot \mathit{map}\,\mathit{inits} \cdot \mathit{tails} \\ = & \qquad \{ \mbox{functors} \} \\ & \mathit{maximum} \cdot \mathit{map}\,(\mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{inits}) \cdot \mathit{tails} \end{array}$

For the final step, if we can write ${\mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{inits}}$ in the form ${\mathit{foldr}\,h\,e}$, then the ${\mathit{map}}$ of this can be fused with the ${\mathit{tails}}$ to yield ${\mathit{scanr}\,h\,e}$; this observation is known as the Scan Lemma. Moreover, if ${h}$ takes constant time, then this gives a linear-time algorithm for ${\mathit{mss}}$.

The crucial observation is based on Horner’s Rule for evaluation of polynomials, which is the first important thing you learn in numerical computing—I was literally taught it in secondary school, in my sixth-year classes in mathematics. Here is its familiar form:

$\displaystyle \sum_{i=0}^{n-1} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} = a_0 + x(a_1 + x(a_2 + \cdots + x\,a_{n-1}))$

but the essence of the rule is about sums of products:

$\displaystyle \sum_{i=0}^{n-1} \prod_{j=0}^{i-1} u_j = 1 + u_0 + u_0u_1 + \cdots + u_0u_1\ldots u_{n-1} = 1 + u_0(1 + u_1(1 + \cdots + u_{n-1}))$

Expressed in Haskell, this is captured by the equation

$\displaystyle \mathit{sum} \cdot \mathit{map}\,\mathit{product} \cdot \mathit{inits} = \mathit{foldr}\,(\oplus)\,e \quad \mathbf{where}\; e = 1 \mathbin{;} u \oplus z = e + u \times z$

(where ${\mathit{product} = \mathit{foldr}\,(\times)\,1}$ computes the product of a list of integers).

But Horner’s Rule is not restricted to sums and products; the essential properties are that addition and multiplication are associative, that multiplication has a unit, and that multiplication distributes over addition. This the algebraic structure of a semiring (but without needing commutativity and a unit of addition, or that that unit is a zero of multiplication). In particular, the so-called tropical semiring on the integers, in which “addition” is binary ${\max}$ and “multiplication” is integer addition, satisfies the requirements. So for the maximum segment sum problem, we get

$\displaystyle \mathit{maximum} \cdot \mathit{map}\,\mathit{sum} \cdot \mathit{inits} = \mathit{foldr}\,(\oplus)\,e \quad \mathbf{where}\; e = 0 \mathbin{;} u \oplus z = e \max (u + z)$

Moreover, ${\oplus}$ takes constant time, so this gives a linear-time algorithm for ${\mathit{mss}}$.

## Tail segments, datatype-generically

About a decade after the initial “theory of lists” work on the maximum segment sum problem, Richard Bird (with Oege de Moor and Paul Hoogendijk) came up with a datatype-generic version of the problem in the paper Generic functional programming with types and relations. It’s clear what “maximum” and “sum” mean generically, but not so clear what “segment” means for nonlinear datatypes; the point of their paper is basically to resolve that issue.

Recalling the definition of ${\mathit{segs}}$ in terms of ${\mathit{inits}}$ and ${\mathit{tails}}$, we see that it would suffice to develop datatype-generic notions of “initial segment” and “tail segment”. One fruitful perspective is given in Bird & co’s paper: a “tail segment” of a cons list is just a subterm of that list, and an “initial segment” is the list but with some tail (that is, some subterm) replaced with the empty structure.

So, representing a generic “tail” of a data structure is easy: it’s a data structure of the same type, and a subterm of the term denoting the original structure. A datatype-generic definition of ${\mathit{tails}}$ is a little trickier, though. For lists, you can see it as follows: every node of the original list is labelled with the subterm of the original list rooted at that node. I find this a helpful observation, because it explains why the ${\mathit{tails}}$ of a list is one element longer than the list itself: a list with ${n}$ elements has ${n+1}$ nodes (${n}$ conses and a nil), and each of those nodes gets labelled with one of the ${n+1}$ subterms of the list. Indeed, ${\mathit{tails}}$ ought morally to take a possibly empty list and return a non-empty list of possibly empty lists—there are two different datatypes involved. Similarly, if one wants the “tails” of a data structure of a type in which some nodes have no labels (such as leaf-labelled trees, or indeed such as the “nil” constructor of lists), one needs a variant of the datatype providing labels at those positions. Also, for a data structure in which some nodes have multiple labels, or in which there are different types of labels, one needs a variant for which every node has precisely one label.

Bird & co call this the labelled variant of the original datatype; if the original is a polymorphic datatype ${\mathsf{T}\,\alpha = \mu(\mathsf{F}\,\alpha)}$ for some binary shape functor ${\mathsf{F}}$, then the labelled variant is ${\mathsf{L}\,\alpha = \mu(\mathsf{G}\,\alpha)}$ where ${\mathsf{G}\,\alpha\,\beta = \alpha \times \mathsf{F}\,1\,\beta}$—whatever labels ${\mathsf{F}}$ may or may not have specified are ignored, and precisely one label per node is provided. Given this insight, it is straightforward to define a datatype-generic variant ${\mathit{subterms}}$ of the ${\mathit{tails}}$ function:

$\displaystyle \mathit{subterms}_{\mathsf{F}} = \mathit{fold}_{\mathsf{F}}(\mathit{in}_{\mathsf{G}} \cdot \mathit{fork}(\mathit{in}_{\mathsf{F}} \cdot \mathsf{F}\,\mathit{id}\,\mathit{root}, \mathsf{F}\,!\,\mathit{id})) :: \mathsf{T}\,\alpha \rightarrow \mathsf{L}(\mathsf{T}\,\alpha)$

where ${\mathit{root} = \mathit{fst} \cdot \mathit{in}_{\mathsf{G}}^{-1} = \mathit{fold}_{\mathsf{G}}\,\mathit{fst} :: \mathsf{L}\,\alpha \rightarrow \alpha}$ returns the root label of a labelled data structure, and ${!_{\alpha} :: \alpha \rightarrow 1}$ is the unique arrow to the unit type. (Informally, having computed the tree of subterms for each child of a node, we make the tree of subterms for this node by assembling all the child trees with the label for this node; the label should be the whole structure rooted at this node, which can be reconstructed from the roots of the child trees.) What’s more, there’s a datatype-generic scan lemma too:

$\displaystyle \begin{array}{lcl} \mathit{scan}_{\mathsf{F}} &::& (\mathsf{F}\,\alpha\,\beta \rightarrow \beta) \rightarrow \mathsf{T}\,\alpha \rightarrow \mathsf{L}\,\beta \\ \mathit{scan}_{\mathsf{F}}\,\phi &=& \mathsf{L}\,(\mathit{fold}_{\mathsf{F}}\,\phi) \cdot \mathit{subterms}_{\mathsf{F}} \\ &=& \mathit{fold}_{\mathsf{F}}(\mathit{in}_{\mathsf{G}} \cdot \mathit{fork}(\phi \cdot \mathsf{F}\,\mathit{id}\,\mathit{root}, \mathsf{F}\,!\,\mathit{id})) \end{array}$

(Again, the label for each node can be constructed from the root labels of each of the child trees.) In fact, ${\mathit{subterms}}$ and ${\mathit{scan}}$ are paramorphisms, and can also be nicely written coinductively as well as inductively. I’ll return to this in a future post.

## Initial segments, datatype-generically

What about a datatype-generic “initial segment”? As suggested above, that’s obtained from the original data structure by replacing some subterms with the empty structure. Here I think Bird & co sell themselves a little short, because they insist that the datatype ${\mathsf{T}}$ supports empty structures, which is to say, that ${\mathsf{F}}$ is of the form ${\mathsf{F}\,\alpha\,\beta = 1 + \mathsf{F}'\,\alpha\,\beta}$ for some ${\mathsf{F}'}$. This isn’t necessary: for an arbitrary ${\mathsf{F}}$, we can easily manufacture the appropriate datatype ${\mathsf{U}}$ of “data structures in which some subterms may be replaced by empty”, by defining ${\mathsf{H}\,\alpha\,\beta = 1 + \mathsf{F}\,\alpha\,\beta}$ and ${\mathsf{U}\,\alpha = \mu(\mathsf{H}\,\alpha)}$.

As with ${\mathit{subterms}}$, the datatype-generic version of ${\mathit{inits}}$ is a bit trickier—and this time, the special case of lists is misleading. You might think that because a list has just as many initial segments as it does tail segments, so the labelled variant ought to suffice just as well here too. But this doesn’t work for non-linear data structures such as trees—in general, there are many more “initial” segments than “tail” segments (because one can make independent choices about replacing subterms with the empty structure in each child), and they don’t align themselves conveniently with the nodes of the original structure.

The approach I prefer here is just to use an unstructured collection type to hold the “initial segments”; that is, a monad. This could be the monad of finite lists, or of finite sets, or of finite bags—we will defer until later the discussion about precisely which, and write simply ${\mathsf{M}}$. We require only that it provide a ${\mathit{MonadPlus}}$-like interface, in the sense of an operator ${\mathit{mplus} :: \mathsf{M}\,\alpha \times \mathsf{M}\,\alpha \rightarrow \mathsf{M}\,\alpha}$; however, for reasons that will become clear, we will expect that it does not provide a ${\mathit{mzero}}$ operator yielding empty collections.

Now we can think of the datatype-generic version of ${\mathit{inits}}$ as nondeterministically pruning a data structure by arbitrarily replacing some subterms with the empty structure; or equivalently, as generating the collection of all such prunings.

$\displaystyle \mathit{prune} = \mathit{fold}_{\mathsf{F}}(\mathsf{M}\,\mathit{in}_{\mathsf{H}} \cdot \mathit{opt}\,\mathit{Nothing} \cdot \mathsf{M}\,\mathit{Just} \cdot \delta_2) :: \mu(\mathsf{F}\,\alpha) \rightarrow \mathsf{M}(\mu(\mathsf{H}\,\alpha))$

Here, ${\mathit{opt}}$ supplies a new alternative for a nondeterministic computation:

$\displaystyle opt\,a\,\mathit{mx} = \mathit{return}\,a \mathbin{\underline{\smash{\mathit{mplus}}}} \mathit{mx}$

and ${\delta_2 :: (\mathsf{F}\,\alpha)\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}(\mathsf{F}\alpha)}$ distributes the shape functor ${\mathsf{F}}$ over the monad ${\mathsf{M}}$ (which can be defined for all ${\mathit{Traversable}}$ functors ${\mathsf{F}\,\alpha}$). Informally, once you have computed all possible ways of pruning each of the children of a node, a pruning of the node itself is formed either as ${\mathit{Just}}$ some node assembled from arbitrarily pruned children, or ${\mathit{Nothing}}$ for the empty structure.

## Horner’s Rule, datatype-generically

As we’ve seen, the essential property behind Horner’s Rule is one of distributivity. In the datatype-generic case, we will model this as follows. We are given an ${(\mathsf{F}\,\alpha)}$-algebra ${(\beta,f)}$, and a ${\mathsf{M}}$-algebra ${(\beta,k)}$; you might think of these as “datatype-generic product” and “collection sum”, respectively. Then there are two different methods of computing a ${\beta}$ result from an ${\mathsf{F}\,\alpha\,(\mathsf{M}\,\beta)}$ structure: we can either distribute the ${\mathsf{F}\,\alpha}$ structure over the collection(s) of ${\beta}$s, compute the “product” ${f}$ of each structure, and then compute the “sum” ${k}$ of the resulting products; or we can “sum” each collection, then compute the “product” of the resulting structure. Distributivity of “product” over “sum” is the property that these two different methods agree, as illustrated in the following diagram.

For example, with ${f :: \mathsf{F}\,{\mathbb Z}\,{\mathbb Z} \rightarrow {\mathbb Z}}$ adding all the integers in an ${\mathsf{F}}$-structure, and ${k :: \mathsf{M}\,{\mathbb Z} \rightarrow {\mathbb Z}}$ finding the maximum of a (non-empty) collection, the diagram commutes. (To match up with the rest of the story, we have presented distributivity in terms of a bifunctor ${\mathsf{F}}$, although the first parameter ${\alpha}$ plays no role. We could just have well have used a unary functor, dropping the ${\alpha}$, and changing the distributor to ${\delta :: \mathsf{F}\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}\mathsf{F}}$.)

Note that ${(\beta,k)}$ is required to be an algebra for the monad ${\mathsf{M}}$. This means that it is not only an algebra for ${\mathsf{M}}$ as a functor (namely, of type ${\mathsf{M}\,\beta \rightarrow \beta}$), but also it should respect the extra structure of the monad: ${k \cdot \mathit{return} = \mathit{id}}$ and ${k \cdot \mathit{join} = k \cdot \mathsf{M}\,k}$. For the special case of monads for associative collections (such as lists, bags, and sets), and in homage to the old Squiggol papers, we will stick to reductions${k}$s of the form ${\oplus/}$ for associative binary operator ${{\oplus} :: \beta \times \beta \rightarrow \beta}$; then we also have distribution over choice: ${\oplus / (x \mathbin{\underline{\smash{\mathit{mplus}}}} y) = (\oplus/x) \oplus (\oplus/y)}$. Note also that we prohibited empty collections in ${\mathsf{M}}$, so we do not need a unit for ${\oplus}$.

Recall that we modelled an “initial segment” of a structure of type ${\mu(\mathsf{F}\,\alpha)}$ as being of type ${\mu(\mathsf{H}\,\alpha)}$, where ${\mathsf{H}\,\alpha\,\beta = 1 + \mathsf{F}\,\alpha\,\beta}$. We need to generalize “product” to work on this extended structure, which is to say, we need to specify the value ${b}$ of the “product” of the empty structure too. Then we let ${g = \mathit{maybe}\,b\,f :: \mathsf{H}\,\alpha\,\beta \rightarrow \beta}$, so that ${\mathit{fold}_{\mathsf{H}}(g) :: \mu(\mathsf{H}\,\alpha) \rightarrow \beta}$.

The datatype-generic version of Horner’s Rule is then about computing the “sum” of the “products” of each of the “initial segments” of a data structure:

$\displaystyle {\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(g)) \cdot \mathit{prune}$

We will use fold fusion to show that this can be computed as a single fold, given the necessary distributivity property.

$\displaystyle \begin{array}{ll} & \mathord{\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(g)) \cdot \mathit{prune} \cdot \mathit{in}_{\mathsf{F}} \\ = & \qquad \{ \mbox{evaluation for~} \mathit{prune} \} \\ & \mathord{\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(g)) \cdot \mathsf{M}\,\mathit{in}_{\mathsf{H}} \cdot \mathit{opt}\,\mathit{Nothing} \cdot \mathsf{M}\,\mathit{Just} \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mbox{functors; evaluation for~} \mathit{fold}_{\mathsf{H}} \} \\ & \mathord{\oplus/} \cdot \mathsf{M}(g \cdot \mathsf{H}\,\mathit{id}\,(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathit{opt}\,\mathit{Nothing} \cdot \mathsf{M}\,\mathit{Just} \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mathsf{M}\,h \cdot \mathit{opt}\,a = \mathit{opt}\,(h\,a) \cdot \mathsf{M}\,h \} \\ & \mathord{\oplus/} \cdot \mathit{opt}\,(g\,\mathit{Nothing}) \cdot \mathsf{M}(g \cdot \mathsf{H}\,\mathit{id}\,(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{M}\,\mathit{Just} \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mbox{functors;~} \mathit{Just} :: \mathsf{F}\,\alpha \mathbin{\stackrel{.}{\to}} \mathsf{H}\,\alpha \} \\ & \mathord{\oplus/} \cdot \mathit{opt}\,(g\,\mathit{Nothing}) \cdot \mathsf{M}\,g \cdot \mathsf{M}(\mathit{Just} \cdot \mathsf{F}\,\mathit{id}\,(\mathit{fold}_{\mathsf{H}}(g))) \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mbox{functors;~} \delta_2 :: (\mathsf{F}\alpha)\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}(\mathsf{F}\alpha) \} \\ & \mathord{\oplus/} \cdot \mathit{opt}\,(g\,\mathit{Nothing}) \cdot \mathsf{M}(g \cdot \mathit{Just}) \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \end{array}$

(Sadly, I have to break this calculation in two to get it through WordPress’s somewhat fragile LaTeX processor… where were we? Ah, yes:)

$\displaystyle \begin{array}{ll} & \mathord{\oplus/} \cdot \mathit{opt}\,(g\,\mathit{Nothing}) \cdot \mathsf{M}(g \cdot \mathit{Just}) \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ g = \mathit{maybe}\,b\,f \} \\ & \mathord{\oplus/} \cdot \mathit{opt}\,b \cdot \mathsf{M}\,f \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ {\oplus/} \cdot \mathit{opt}\,b = (b\oplus) \cdot {\oplus/} \} \\ & (b\oplus) \cdot {\oplus/} \cdot \mathsf{M}\,f \cdot \delta_2 \cdot \mathsf{F}\,\mathit{id}\,(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mbox{distributivity:~} {\oplus/} \cdot \mathsf{M}\,f \cdot \delta_2 = f \cdot \mathsf{F}\,\mathit{id}\,(\oplus/) \} \\ & (b\oplus) \cdot f \cdot \mathsf{F}\,\mathit{id}\,(\oplus/) \cdot \mathsf{F}\,\mathit{id}\,(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(g))) \cdot \mathsf{F}\,\mathit{id}\,\mathit{prune} \\ = & \qquad \{ \mbox{functors} \} \\ & (b\oplus) \cdot f \cdot \mathsf{F}\,\mathit{id}\,({\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(g)) \cdot \mathit{prune}) \end{array}$

Therefore,

$\displaystyle {\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot \mathit{prune} = \mathit{fold}_{\mathsf{F}}((b\oplus) \cdot f)$

(Curiously, it doesn’t seem to matter what value is chosen for ${b}$.)

## Maximum segment sum, datatype-generically

We’re nearly there. We start with the traversable shape bifunctor ${\mathsf{F}}$, a collection monad ${\mathsf{M}}$, and a distributive law ${\delta_2 :: (\mathsf{F}\,\alpha)\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}(\mathsf{F}\alpha)}$. We are given an ${(\mathsf{F}\,\alpha)}$-algebra ${(\beta,f)}$, an additional element ${b :: \beta}$, and a ${\mathsf{M}}$-algebra ${(\beta,{\oplus/})}$, such that ${f}$ and ${\oplus}$ take constant time and ${f}$ distributes over ${\oplus/}$ in the sense above. Then

$\displaystyle {\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot \mathit{segs}$

can be computed in linear time, where

$\displaystyle \mathit{segs} = \mathit{join} \cdot \mathsf{M}\,\mathit{prune} \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathit{subterms} :: \mu(\mathsf{F}\,\alpha) \rightarrow \mathsf{M}(\mu(\mathsf{H}\,\alpha))$

and where

$\displaystyle \mathit{contents}_{\mathsf{L}} :: \mathsf{L} \mathbin{\stackrel{.}{\to}} \mathsf{M}$

computes the contents of an ${\mathsf{L}}$-structure (which, like ${\delta_2}$, can be defined using the traversability of ${\mathsf{F}}$). Here’s the calculation:

$\displaystyle \begin{array}{ll} & \mathord{\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot \mathit{segs} \\ = & \qquad \{ \mbox{definition of~} \mathit{segs} \} \\ & \mathord{\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot \mathit{join} \cdot \mathsf{M}\,\mathit{prune} \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathit{subterms} \\ = & \qquad \{ \mbox{naturality of~} \mathit{join} :: \mathsf{M}\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}\mbox{; functors} \} \\ & \mathord{\oplus/} \cdot \mathit{join} \cdot \mathsf{M}(\mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot\mathit{prune}) \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathit{subterms} \\ = & \qquad \{ \oplus/ \mbox{~is an~} \mathsf{M}\mbox{-algebra; functors} \} \\ & \mathord{\oplus/} \cdot \mathsf{M}({\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot\mathit{prune}) \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathit{subterms} \\ = & \qquad \{ \mbox{naturality of~} \mathit{contents} :: \mathsf{L} \mathbin{\stackrel{.}{\to}} \mathsf{M} \} \\ & \mathord{\oplus/} \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathsf{L}({\oplus/} \cdot \mathsf{M}(\mathit{fold}_{\mathsf{H}}(\mathit{maybe}\,b\,f)) \cdot\mathit{prune}) \cdot \mathit{subterms} \\ = & \qquad \{ \mbox{Horner's Rule} \} \\ & \mathord{\oplus/} \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathsf{L}(\mathit{fold}_{\mathsf{F}}((b\oplus)\cdot f)) \cdot \mathit{subterms} \\ = & \qquad \{ \mbox{Scan Lemma} \} \\ & \mathord{\oplus/} \cdot \mathit{contents}_{\mathsf{L}} \cdot \mathit{scan}_{\mathsf{F}}((b\oplus)\cdot f) \end{array}$

The scan can be computed in linear time, because its body takes constant time; moreover, the “sum” ${\oplus/}$ and ${\mathit{contents}}$ can also be computed in linear time (and what’s more, they can be fused into a single pass).

For example, with ${f :: \mathsf{F}\,{\mathbb Z}\,{\mathbb Z} \rightarrow {\mathbb Z}}$ adding all the integers in an ${\mathsf{F}}$-structure, ${b = 0 :: {\mathbb Z}}$, and ${{\oplus} :: {\mathbb Z}\times{\mathbb Z} \rightarrow {\mathbb Z}}$ returning the greater of two integers, we get a datatype-generic version of the linear-time maximum segment sum algorithm.

As the title of their paper suggests, Bird & co carried out their development using the relational approach set out in the Algebra of Programming book; for example, their version of ${\mathit{prune}}$ is a relation between data structures and their prunings, rather than being a function that takes a structure and returns the collection of all its prunings. There’s a well-known isomorphism between relations and set-valued functions, so their relational approach roughly looks equivalent to the monadic one I’ve taken.

I’ve known their paper well for over a decade (I made extensive use of the “labelled variant” construction in my own papers on generic downwards accumulations), but I’ve only just noticed that although they discuss the maximum segment sum problem, they don’t discuss problems based on other semirings, such as the obvious one of integers with addition and multiplication—which is, after all, the origin of Horner’s Rule. Why not? It turns out that the relational approach doesn’t work in that case!

There’s a hidden condition in the calculation, which relates back to our earlier comment about which collection monad—finite sets, finite bags, lists, etc—to use. When ${\mathsf{M}}$ is the set monad, distribution over choice (${\oplus / (x \mathbin{\underline{\smash{\mathit{mplus}}}} y) = (\oplus/x) \oplus (\oplus/y)}$)—and consequently the condition ${{\oplus/} \cdot \mathit{opt}\,b = (b\oplus) \cdot {\oplus/}}$ that we used in proving Horner’s Rule—require ${\oplus}$ to be idempotent, because ${\mathit{mplus}}$ itself is idempotent; but addition is not idempotent. For the same reason, the distributivity property does not hold for addition with the set monad. But everything does work out for the bag monad, for which ${\mathit{mplus}}$ is not idempotent. The bag monad models a flavour of nondeterminism in which multiplicity of results matters—as it does for the sum-of-products instance of the problem, when two copies of the same segment should be treated differently from just one copy. Similarly, if the order of results matters—if, for example, we were looking for the “first” solution—then we would have to use the list monad rather than bags or sets. Seen from a monadic perspective, the relational approach is programming with just one monad, namely the set monad; if that monad doesn’t capture your effects faithfully, you’re stuck.

(On the other hand, there are aspects of the problem that work much better relationally. We have carefully used ${\max}$ only for a linear order, namely the usual ordering of the integers. A partial order is more awkward monadically, because there need not be a unique maximal value. For example, it is not so easy to compute a segment with maximal sum, unless we refine the sum ordering on segments to make it once more a linear order; relationally, this works out perfectly straightforwardly. We can try the same trick of turning the relation “maximal under a partial order” into the collection-valued function “all maxima under a partial order”, but I fear that the equivalent trick on the ordering itself—turning the relation “${<}$” into the collection-valued function “all values less than this one”—runs into problems from taking us outside the world of finite nondeterminism.)

# Is Perl syntax better than randomly chosen syntax?

Programming language designers should perform more empirical studies of the sort published at Plateau. Last night I saw a paper which compared the author's language, Quorum, to Perl and a 'placebo' language with syntax chosen at random, dubbed Randomo.
Perl users in our study performed notably poorly, not only performing less well than Quorum, but no better than a language designed largely by chance.While Perl has never had a particular reputation for clarity, the fact that our data shows that there is only a 55.2 % (1 - p) chance that Perl affords more accurate performance amongst novices than Randomo, a language that even we, as the designers, find excruciatingly difficult to understand, was very surprising. This is especially true, we think, considering we chose to test only the syntax in Perl that is relatively common across a number of languages (e.g., if statements, loops, functions, parameters). Considering that Java syntax, which many would arguably consider to be easier to understand than Perl, uses similar syntax, we are curious how it would perform. Given this interesting first result, we plan to test a number of additional languages using the same procedures.
An Empirical Comparison of the Accuracy Rates of Novices Andreas Stefik, Susanna Siebert, Melissa Stefik, and Kim Slattery. Plateau 2011, Portland, 24 October 2011.

Spotted via Kevin Hammond.

# Writing a streaming twitter waterflow solution

In this post Philip Nilsson describes an inspiring, principled approach to solving a toy problem posed in a programming interview. I wanted to implement a solution to a variant of the problem where we'd like to process a stream. It was pretty easy to sketch a solution out on paper but Philip's solution was invaluable in testing and debugging my implementation. (See also Chris Done's mind-melting loeb approach)

My goal was to have a function:

waterStream :: [Int] -> [Int]


that would take a possibly-infinite list of columns and return a stream of known water quantities, where volumes of water were output as soon as possible. We can get a solution to the original problem, then, with

ourWaterFlow = sum . waterStream


Here is the solution I came up with, with inline explanation:

{-# LANGUAGE BangPatterns #-}

-- start processing str initializing the highest column to the left at 0, and
-- an empty stack.
waterStream :: [Int] -> [Int]
waterStream str = processWithMax 0 str []

processWithMax :: Int -> [Int] -> [(Int,Int)] -> [Int]
processWithMax prevMax = process
where
process []     = const []
-- output the quantity of water we know we can get, given the column at the
-- head of the stream, y:
process (y:ys) = eat 1
where
eat !n xxs@((offset,x):xs)
-- done with y, push it and its offset onto the stack
| y < x = process ys ((n,y):xxs)
-- at each "rise" we can output some known quantity of water;
-- storing the "offset" as we did above lets us calculate water
-- above a previously filled "valley"
| otherwise = let col = offset*(min y prevMax - x)
cols = eat (n+offset) xs
-- filter out zeros:
in if col == 0 then cols else col : cols
-- if we got to the end of the stack, then y is the new highest
-- column we've seen.
eat !n [] = processWithMax y ys [(n,y)]


The bit about "offsets" is the tricky part which I don't know how to explain without a pretty animation.

## Correctness

It took me much longer than I was expecting to code up the solution above that worked on a few hand-drawn test cases, and at that point I didn't have high confidence that the code was correct, so I turned to quickcheck and assert.

First I wanted to make sure the invariant that the "column" values in the stack were strictly increasing held:

 import Control.Exception (assert)

...
--process (y:ys) = eat 1
process (y:ys) stack = assert (stackSane stack) $eat 1 stack ...  Then I used Philip's solution (which I had confidence in): waterFlow :: [Int] -> Int waterFlow h = sum$
zipWith (-)
(zipWith min (scanl1 max h) (scanr1 max h))
h


to test my implementation:

*Waterflow> import Test.QuickCheck
*Waterflow Test.QuickCheck> quickCheck (\l -> waterFlow l == ourWaterFlow l)
*** Failed! Falsifiable (after 21 tests and 28 shrinks):
[1,0,0,0,1]


Oops! It turned out I had a bug in this line (fixed above):

                           --old buggy:
--cols = eat (n+1) xs
--new fixed:
cols = eat (n+offset) xs


## Performance

The solution seems to perform pretty well, processing 1,000,000 Ints in 30ms on my machine:

import Criterion.Main

main = do
gen <- create
rs <- replicateM 1000000 $uniformR (0,100) gen defaultMain [ bench "ourWaterFlow"$ whnf ourWaterFlow rs


I didn't get a good look at space usage over time, as I was testing with mwc-random which doesn't seem to support creating a lazy infinite list of randoms and didn't want to hunt down another library. Obviously on a stream that simply descends forever, our stack of (Int,Int) will grow to infinite size.

It seems as though there is a decent amount of parallelism that could be exploited in this problem, but I didn't have any luck on a quick attempt.

## Thoughts?

Have a parallel solution, or something just faster? Or an implementation that doesn't need a big stack of previous values?

# Abstract Nonsense: chapter 2c, Continuing on Continuations

So, continuing on Continuations (like you thought we're done with this? We're NEVER done with this!)

So. Yeah. That.

So, like printing out the number eight, like, to display the true power of continuations? Like, sigfpe can get away with that, and with style, too, but can I do that to you?

"Continuations are AWSM! Lookit me! I just printed the number 8! (and not even '8 factorial' but just '8')."

Weaksauce.

So, this is where we spice up the weaksauce and take things up a notch.

Continuously.

Okay, so first off, we do what we should have done in the first place: generalize all the continuations from (int -> int) -> int to (int -> a) -> a. We keep the input int, because in this example, we are working with integers and with functions that take integers as inputs, but the outputs are totally up to us, the function-caller, not to the function itself.

Did you get that? With continuations, the caller determines the output, and, if we so choose to do, determines even the input to the functions being continuationified.

That's a word now.

So, generalizing, Function's id function is now:

public static <T> Function<T, T> id(T basis) {
return new Function<T, T>() {
public T apply(T x) { return x; }
};
}

The same generalization applies for functions f and g in Continuity:

public static <ANS> Continuity<Integer, ANS> f(ANS basis) {
return new Continuity<Integer, ANS>() {
public Continuation<Integer, ANS> apply(final Integer x) {
return new Continuation<Integer, ANS>() {
public ANS apply(Arrow<Integer, ANS> arr) {
return arr.apply(2 * x);
}
};
}
};
};
public static <ANS> Continuity<Integer, ANS> g(ANS basis) {
return new Continuity<Integer, ANS>() {
public Continuation<Integer, ANS> apply(final Integer x) {
return new Continuation<Integer, ANS>() {
public ANS apply(Arrow<Integer, ANS> arr) {
return arr.apply(x + 1);
}
};
}
};
}

Then we finally generalize the Result type, calling f, g, and id with the specified (generalized) result type (instead of assuming it's an integer):

class Result<ANS> extends Continuation<Integer, ANS> {
public Result(ANS res) {
basis = res;
}
public ANS apply(final Arrow<Integer, ANS> tail) {
return Continuity.g(basis).apply(3)
.apply(new Function<Integer, ANS>() {
public ANS apply(Integer x) {
return Continuity.f(basis).apply(x).apply(tail);
}
});
}
private final ANS basis;

public static void main(String args[]) {
Result<Integer> just8 = new Result<Integer>(0);
System.out.println("Continue this! " + just8.apply(Function.id(0)));
}
}

Compiling and running this new set gets us the same result:

$javac -d .classes Result.java$ java -cp .classes Result
Continue this! 8

Okay, no biggie. We've proved that by a half-generalization of the continuation type from (int -> int) -> int to (int -> a) -> a gets us the same output from the same input.

Good.

Now, let's take that up a notch.

Aspects

So, I want to change what the program returns to me. I don't want an integer. I hate integers. Now, strings? Strings and I have this thing for each other. We're MFEO, actually.

Result<String> foo8 = new Result<String>("");
System.out.println("Foo that! "
+ foo8.apply(new Function<Integer, String>() {
public String apply(Integer x) {
return "foo " + x;
}
}));

Compiling and running this gets us:

Foo that! foo 8

Do you see what we just did? We didn't change one line of the called code, but we, as the caller, determined the type returned from two plain, ordinary math functions. Integers, right?

Nope, strings, because that's how we call the shots, and in CPS, we can call the shots.

Exception Handling

So, like, not in this example, but in other examples, you run into the problem that the data causes the system to crash. Like a divide-by-zero error, but there's no trying and catching all over the place here, and, with continuations, we don't need to add them.

What happens when we divide by zero? Well, the question quickly becomes a math philosophy one, and then devolves into 'well, what do you want to have happen?' really.

Like, for example: which zero are we talking about? Zero approaching from negative or positive infinity?

See?

Well, okay.

So, in some cases we want to abort the program (why?) gracefully (why?) in other cases we wish to discard this result, in still other cases, we wish to use a default value.

With continuations, we, the caller, get to choose the course of action.

Let's say I want to discard the computation if I have a divide-by-zero error, nice, and simple, and it doesn't cause the system to crash (which, the system crashing, is usually considered a Bad Thing (tm)).

So, we have some function:

> divBy x = 20 / x

or in Java:

public static Integer divBy(Integer x) {
return 20 / x;
}

Which blows up nicely in either language as we approach zero:

> map (divBy >>> g >>> f) [10,9..0]

[6.0,6.444444444444445,7.0,7.714285714285714,8.666666666666668,10.0,12.0,15.333333333333334,22.0,42.0,Infinity]

(oops, Haskell doesn't blow up. Bummer, and curse you, Haskell for being a pretty darn good programming language ... and smart, at that, too!)

(but in our beloved Java ...)

for(int x = 10; x > -1; x--) {
System.out.print("Hey, (20 divBy " + x + " + 1) * 2 is ");
System.out.flush();
System.out.println(2 * (1 + Continuity.divBy(x)));
}

Hey, (20 divBy 10 + 1) * 2 is 6
Hey, (20 divBy 9 + 1) * 2 is 6
Hey, (20 divBy 8 + 1) * 2 is 6
Hey, (20 divBy 7 + 1) * 2 is 6
Hey, (20 divBy 6 + 1) * 2 is 8
Hey, (20 divBy 5 + 1) * 2 is 10
Hey, (20 divBy 4 + 1) * 2 is 12
Hey, (20 divBy 3 + 1) * 2 is 14
Hey, (20 divBy 2 + 1) * 2 is 22
Hey, (20 divBy 1 + 1) * 2 is 42
Hey, (20 divBy 0 + 1) * 2 is Exception in thread "main" java.lang.ArithmeticException: / by zero
at Continuity.divBy(Continuity.java:36)
at Result.main(Result.java:35)

BOOM!

So how do we not go boom?

Well, we continuationify the function:

public static <ANS> Continuity<Integer, ANS> divBy(ANS basis) {
return new Continuity<Integer, ANS>() {
public Continuation<Integer, ANS> apply(final Integer x) {
return new Continuation<Integer, ANS>() {
public ANS apply(Arrow<Integer, ANS> arr) {
return arr.apply(20 / x);
}
};
}
};
}

and, if we pass in a continuation-handler function, that does business-as-usual, then we'll continue to get business-as-usual results:

public ANS appDivBy(final Arrow<Integer, ANS> tail, Integer by) {
return Continuity.divBy(basis).apply(by)
.apply(new Function<Integer, ANS>() {
public ANS apply(Integer x) {
return app(tail, basis, x);
}
});
}

final ANSWER basis, Integer in) {
return Continuity.g(basis).apply(in)
return Continuity.f(basis).apply(x).apply(tail);
}
});
}

// So we refactored apply() simply to call the generic method app()
public ANS apply(final Arrow<Integer, ANS> tail) {
return app(tail, basis, 3);
}

gives the same arithmetic exception:

// unit-CPS style fails with arithmetic error:
Result<Integer> inf = new Result<Integer>(0);
for(int x = 10; x > -1; x--) {
System.out.print("Hey, in unit-CPS, (20 divBy " + x
+ " + 1) * 2 is ");
System.out.flush();
System.out.println(inf.appDivBy(Function.id(0), x));
}

Hey, in unit-CPS, (20 divBy 10 + 1) * 2 is 6
Hey, in unit-CPS, (20 divBy 9 + 1) * 2 is 6
Hey, in unit-CPS, (20 divBy 8 + 1) * 2 is 6
Hey, in unit-CPS, (20 divBy 7 + 1) * 2 is 6
Hey, in unit-CPS, (20 divBy 6 + 1) * 2 is 8
Hey, in unit-CPS, (20 divBy 5 + 1) * 2 is 10
Hey, in unit-CPS, (20 divBy 4 + 1) * 2 is 12
Hey, in unit-CPS, (20 divBy 3 + 1) * 2 is 14
Hey, in unit-CPS, (20 divBy 2 + 1) * 2 is 22
Hey, in unit-CPS, (20 divBy 1 + 1) * 2 is 42
Hey, in unit-CPS, (20 divBy 0 + 1) * 2 is Exception in thread "main" java.lang.ArithmeticException: / by zero
at Continuity$3$1.apply(Continuity.java:29)
at Continuity$3$1.apply(Continuity.java:28)
at Result.appDivBy(Result.java:20)
at Result.main(Result.java:38)

Ick. ALL THAT WORK FOR NOTHING?!?!? NOOOOOOOOOOOOOES!

BUT!

Despair not, fair maiden!

Or, if you're not a fair maiden, because you're not fair, or a maid, then whatevs, and equal opportunity and fairness for all.

You get my drift.

The thing is, we're treating ANS as an integer or a string or some other unit type.

But we don't have to.

Read my next article to see what we can do.

# Abstract Nonsense, chapter 1: the List

Abstract Nonsense: Rambling thoughts of a Category Theorist

Synopsis: Looking at things through the lens of category theory, where 'things' may be anything from writing a good (a really, awesomely good) piece of code to ... well, what else is there in this life?

Chapter 1: the List

A monad is not intrinsically monoidal, but, then again, the monad and the comonad are not intrinsically functors, but thinking of monads and comonads and thinking of them not as functors is a blocker for me ... it gives me a bit of a headache, thinking of a monad (the triple (T, mu (or 'join'), eta (or 'unit')). I mean, it is simple to think of monads as monads only when considering the join function, but what is the point of eta if you are not thinking of the monad in terms of a functor?

So I do not consider the monad, or the comonad (W, epsilon (or extract), delta (or duplicate)) (definition from "Introduction to Higher-Order Categorical Logic), http://books.google.com/books?isbn=0521356539, as being independently or intrinsically defined from the functor, Fa, as, after all unit and extract are functorable.

Okay. So we consider lists. We can consider lists free of the monadic and comonadic implications, because, after all, a list is simply a structure, and can be viewed as tree-like structure, where we define

data list a = nil | (a, list a)

we see that, firstly list is monoidal on a where mzero = nil and mappend is easily defined as append or (++).

So, anywhere along the list, one can grab the pair, and the tree falls out of the structure:

(a, |->) -> (b, |->) -> nil

The pairwise operators, fst and snd, are head and tail on list a, and the fundamental constructor of the list is cons, where cons :: a -> list a -> list a

(which looks like a continuation, now that I look at it, OMG!)

(The continuation monad is the 'mother of all monads', and the continuation is the 'mother of all function calls,' perhaps?)

from cons, mappend is simply defined:

nil ++ b = b
a ++ b = cons (head a) (tail a ++ b)

Since list a is functor a (leaving aside nil) we have fmap for lists, too:

fmap f nil = nil

as you can see, list definitions lend themselves naturally to induction.

Okay. Lists qua lists (of course, as monoids and functors) are very easily defined in just a few lines of code in any programming language with any heft (I've demonstrated Haskell here, the definitions are similar in Java once monoid and functor are properly established as types. This requires that Arrows and functors be defined, but that is not part of the scope of this article ... it's hard, but doable).

Now let's consider the case of lists as monads.

return :: a -> m a
join :: m (m a) -> m a

That definition is unusual in the programming community, but not to a categorist. Programmers are used to the more applicative definition of monad:

return :: a -> m a
m >>= f :: m a -> (a -> m b) -> m b

The issue with defining monads in terms of bind (>>=) is that sometimes wrapping your head around what bind should be for a particular monad can be tricky (sometimes it isn't, but sometimes it is), but if one knows what join is for a monad, and, if that monad is a functor, then we have a very simple, and a very general, definition of bind:

bind m f = join . fmap f

proof

bind is defined as above

bind (m a) f = (join . fmap f) (m a) = m b (supposition)
= (join . fmap (a -> m b)) (m a)
= (join . (m a -> m (m b))) (m a)
= (join (m (m b)) = m b
Q.E.D.

So, given that we have join and fmap defined for a monad, bind falls out easily from their composition, and that's why I declare monads as per category theory, as being the triple of the monadic type, the unit function (in Haskell, it's called 'return') and the join function.

Now, the monadic aspect of list a is simply this:

return a = [a]
join [] = []

... or join for list [a,b,c] is a ++ b ++ c.

And thus we have have monadic lists defined and may therefor use lists monadically, e.g.:

[1,2,3] >>= return . succ = [2,3,4]

and, even better:

[1,null, 3] >>= return . succ = [2,4]

It. Just. Works.

(side note: it really works in Haskell, as, after all, 'null' is not a number so the list [1,null, 3] does not exist in Haskell.

In Java, null very much exists, but if we make monadic list construction dependent on the monadicity of the units then the lifting function (which we will cover later) from a 'plain' list to the monadic list gets us into the a good position already:

return [1,null, 3] = [1,3] (monadic)

And if we don't do that, then the binding operation will use the Maybe type to guarantee the safety of the operation:

[1,null,3] >>= return . succ == succ (Just 1) : succ (Nothing) : succ (Just 3) = [2, 4]

end side note.)

Why? Because whereas with monads we have bind, which lends itself nicely to each element of lists, with comonads we have extend which also extends itself very nicely to operations on lists as a whole, iteratively reductively.

extract :: w a -> a
duplicate :: w a -> w (w a)

Again, instead of making extend intrinsic to the definition of comonads, I chose, instead, the mathematical definition, again, using duplicate.

Again, for gainful reasons, because extend can be defined in terms of duplicate:

extend :: w a -> (w a -> b) -> w b

Or:

extend f (w a) = (fmap f . duplicate) (w a) = w b (supposition)

proof:

extend f (w a) = (fmap f . duplicate) (w a)
= (fmap (w a -> b) . duplicate) (w a)
= ((w (w a) -> w b) . duplicate) (w a)
= ((w (w a) -> w b) (w (w a))
= w b

Q.E.D.

So for lists:

extract [] = doesn't happen. Really
duplicate [] = []
duplicate list@(_:tail) = list : (duplicate tail)

... and we're done.

So, if we have the basic list type defined as (a, list a) or nil

Then we convert to mlist (monadic list) or wlist (comonadic list) by defining nil for each of the particular types and then just adding elements (with cons) to each of those container types:

and in the derived times

Piece of cake.

Or.
Is.
It?

Constructing Lists: Lists as Streams.

The 'problem' of these functional list types is that they look more like Stacks than Queue, so we can interact very easily with the most recent element and cdr down through the list in an orderly fashion, but if we have a preexisting list of one type and wish to view it from a different aspect, not only do we have to build it iteratively, but we have to make sure that iterative build is not penalized (linearly) for each insert at the end of the list (which turns out to be an exponential punishment).

How do we do this?

Lists are functions in a functional aspect.

So, the problem commonly faced by users of them in the functional domain is that when one has to construct a large one iteratively, one faces the problem of continuously appending the next element to the end of the list, and since naïve append is O(N) time (linear), then doing that, iteratively incurs a near O(N*N) time cost.

(side note: the cost is actually O(N*(N-1)/2) time)

What can we do about that?

Simple, actually, if you come from a Prolog programming background, just use difference lists.

A difference list is a list representation of a pair of lists, one list representing the whole and one representing a part of the this:

data dlist a = DL { realize :: list a -> list a }

In Prolog, logic variables are used to provide the (declarative) representation (and consequently, the implementation):

List = [1,2,3|L] - L

Since we have L and L is a part of the whole list, we have a reference into a latter part of the list that in standard list representations we do not. Further, as L can be anything, it can even be the 'end' of the list, and so 'prepending' onto L allows us, by consequence (actually: 'by accident') to 'postpend' onto the list we're working with.

We saw how to represent difference lists using unification and logic variables in Prolog. How do we represent difference lists functionally?

It comes down to the fundamental function of lists, cons:

cons :: a -> list a -> list a

that is, if we give cons an element and a list we get a cons'ed list in return.

What happens if we don't give cons the list to cons to, but instead, we defer it? We then get the partial or curried function:

(cons x) :: list a -> list a

Which is no different than before.

Or.
Is.
It?

Well, recall that a difference list is the difference of two lists. Look at the signature of (cons x) above and what do we have? We have a function that, when given a list, returns a list.

What is a difference list again? A think that, when given a list, gives the full list back.

(cons x), that is: the curried cons function, and the difference list type can be viewed as the same thing, and from that we have our difference list definition (of the realize function):

realize = cons x

So, given realize, we can define prepend and postpend operations:

prepend :: a -> dlist a -> dlist a
prepend x dl = DL { cons x . realize dl }
x |> dl = prepend x dl

postpend :: a -> dlist a -> dlist a
postpend x dl = DL { realize dl . cons x }
dl <| x = postpend x dl

So what? Well, for prepend, there's no big deal, the operation occurs in constant time, but so does cons for (regular) lists. But for postpend, the operation takes the same constant time, but for regular lists, if we were to define postpend in the regular fashion:

postpend :: a -> list a -> list a
postpend a list = list ++ [a]

we would be traversing to the tail of the list each time to append the new element, and that traversal incurs a linear-time cost. Upshot: postpend for regular lists has an O(n) cost, but for difference lists, it occurs in constant time.

And that's the payoff, if you are working with a large list from an external source and need to enlistify it in an order-preserving fashion, doing so with a regular list would incur an exponential cost (nearly O(N*N)) but using difference lists has a logarithmic flattening effect (O(2N)).

Using difference lists saved my bacon on a project where I did a scan-then-process operation on a framework that processes large documents.

Back to Java.

One way to convert from one representation of lists to another is to rebuild the list in the new type. Unfortunately, Java doesn't have the linear construct

mlist [] = m[]
mlist (h:t) = h m: mlist t

So what it does is iteratively build the new list with mappend (as lists are monoidal).

Well, using the O(N) mappend in an iterative fashion incurs an exponential-time cost for list-revisualization.

So, I've replaced my fromCollection() methods in Java from:

public static <L<t conslist="" extends="" l="">> L fromCollection(Collection<t> coll, L basis) {</t></t>
L ans = basis.mzero();
for(T elt : coll) {
ans = ans.mappend(basis.singleton(elt)); // exponential cost
}
return ans;
}

to be:

public static <L<t conslist="" extends="" l="">> L fromCollection(Collection<t> coll, L basis) {</t></t>
DifferenceList<t l=""> ans = DifferenceList.nil(basis);</t>
for(T elt : coll) {
ans = ans.postpend(elt); // constant time cost
}
return ans.realizeFrom(basis);
}

so list construction that was at a cost of exponential time now become linear time, as it should be.

This way, I can work with a list either monadically or comonadically and pay only a linear cost when I change my (aspect) perspective on the list.

# Quick facts editing with pdbquery

I was made aware that it was not easy to play with facts for newcomers. I added a quick command to pdbquery that works like this :

<figure class="code">
 1  pdbquery -t test -l /tmp/test.yaml addfacts test.site.com
</figure>

This should create the file /tmp/test.yaml containing the following facts :

<figure class="code">
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29  facts: test.site.com: memorysize: 31.31 GB ipaddress: 192.168.0.1 hostname: test osfamily: Debian lsbdistcodename: precise facterversion: 0.1 fqdn: test.site.com memoryfree: 21.16 GB environment: test operatingsystem: Ubuntu domain: site.com lsbdistrelease: 12.04 puppetversion: language-puppet virtual: xenu operatingsystemrelease: 12.04 clientcert: test.site.com is_virtual: 'true' lsbdistid: Ubuntu mountpoints: / /data architecture: x86_64 swapsize: 9.31 GB lsbmajdistrelease: 12 rootrsa: xxx swapfree: 9.31 GB lsbdistdescription: Ubuntu 12.04.3 LTS hardwaremodel: x86_64 resources: {}
</figure>

You can then edit this file to add/remove/alter facts as you want. You can then use them in the following way:

<figure class="code">
 1  puppetresources -p . -o test.site.com --pdbfile /tmp/test.yaml
</figure>

This only works properly in HEAD right now, and will be fixed in 0.10.3, because of this Yaml bug that just got worked around today.

# [xwwyxbsu] Program translation and linking

One of the main reasons for choosing an inferior programming language is the existence of libraries in that language.  The prototypical example is C.  Will this consideration continue to be an issue forever?  Or will tools for translating between languages, or tools to link to libraries written in different languages, become a lot better?

It is a difficult problem because of considerable "impedance mismatch" of execution models (and data models) between different languages: Haskell and its non-strict evaluation, or the many ways different languages and frameworks handle concurrency and multithreading.

# Flat design sucks.

Recently Microsoft updated the design of its Dynamics CRM to a “flat” look. Since then, we can’t figure out what is clickable or editable, and what is just content.

My plea to designers: make that clickable thingy look clickable. Make that editable input box look like an input box, and not just when I hover over it.

Buttons and interface widgets, when present, need to be easily distinguishable from content. They need to have good affordances that invite users to action. In the absence of strong signifiers, they can get ignored, and users may find themselves lost and disoriented.

Not all “flat” design has indistinguishable chrome from content, as pointed out in the above article.

Tagged: design

# How do professional Windows programmers stand Visual Studio?

I have a new assignment at work and now find myself at yet another Windows shop. They are making embedded systems, but have for some strange reason decided that Windows is the only development platform to use. After only a few weeks here I’m noting a growing irritation with the tools offered for my use. The amount of forced mouse usage is astounding and the main tool, Visual Studio, is turning out to be the main culprit. After a week or so of exposure to VS I’ve found what I consider to be a serious flaw with a tool for developers: it doesn’t scale.

1. No hierarchical structure It doesn’t scale very well with the size of a project. The concept of having a solution with a number of projects is not bad. But the implementation doesn’t scale. A project can’t have sub-projects, which means I can’t really layer the code in the IDE in the same way I do on disk. The only thing I can do is organise viewing of files through the concept of filters.
2. All configuration is manual, part 1 MS seems to have optimised for small projects. All configuration is kept in XML files, and the only interface to them is a set of property dialogues (some which can be resized, others not) requiring an amazing amount of pointing and clicking to get anything done.
3. All configuration is manual, part 2 MS have optimised for beginning new projects. Getting started is amazingly quick, but once you reach a size of about 10 projects it becomes daunting to fix anything that requires configuration in all projects. Making sure that all configurations are correct is a major undertaking, and requires an insane amount using the mouse. Some earlier versions of VS seem to even have made it impossible to edit the common settings of configurations properly; a simple mistake and the value of one configuration is lost.
4. Experience There are no shortcuts to discover. The configuration is all in XML, which means it’s not really possible to jump out of VS and use a text editor for fixing up semi-broken configurations (like dealing with someone’s decision to place all intermediate files and final results in the top-level solution directory).

So, how do Windows developers cope with this? Don’t they use VS? Do they police the configurations diligently to ensure no silliness creeps in? Or are they all using tools to address these points (like CMake)?

### New Free Community Edition of FP Haskell Center is Now Available

FP Complete™ is excited to announce we are now offering a free community edition of FP Haskell Center™. To get it, all you need to do is login. Instantly you can start learning, developing and sharing Haskell code. Within the IDE you can open projects directly from Git, or the web. The free subscription also includes continuous display of type and error information and type autocompletion, Hoogle and Haddock integration, an easy to use build system, easy to understand error messages as well as vetted and stable libraries. By providing this free edition of FP Haskell Center, FP Complete is making it easier than ever to explore and share the benefits of Haskell.

For Haskell developers seeking more features like the ability to push projects to Git and GitHub, full Git integration, or access to a private repository, you can upgrade to our Personal edition for $9.99 a month. If you are a commercial developer of Haskell we offer a Professional edition starting at$75 a month which includes shared team accounts, support for sub projects, FP Application Server™ access, multiple repository projects as well as a commercial use license. Feel free to try a paid edition for 30 days for free before you buy.

With our most recent round of upgrades we have made FP Haskell Center more accessible than ever and interaction within the IDE even more responsive.

To make interaction within the IDE more responsive we’ve added:

• Search/replace in editor
• Code search/grep
• Subexpression type information lookup
• Better error handling, Faster save/error message results

As always, we look forward to your feedback. Our goal is to make FP Haskell Center an invaluable resource for you. If there is a feature you’d like to see added, don’t hesitate to reach out to us. Leave a comment here or email us at sales@fpcomplete.com

## November 17, 2013

### mightybyte

In part 1 of this tutorial we talked about types and kinds. Knowledge of kinds will help to orient yourself in today's discussion of monads.

What is a monad? When you type "monad" into Hayoo the first result takes you to the documentation for the type class Monad. If you don't already have a basic familiarity with type classes, you can think of a type class as roughly equivalent to a Java interface. A type class defines a set of functions involving a certain data type. When a data type defines all the functions required by the type class, we say that it is an instance of that type class. When a type Foo is an instance of the Monad type class, you'll commonly hear people say "Foo is a monad". Here is a version of the Monad type class.

class Monad m where
return :: a -> m a
(=<<) :: (a -> m b) -> m a -> m b

(Note: If you're the untrusting type and looked up the real definition to verify that mine is accurate, you'll find that my version is slightly different. Don't worry about that right now. I did it intentionally, and there is a method to my madness.)

This basically says that in order for a data type to be an instance of the Monad type class, it has to define the two functions return and (=<<) (pronounced "bind") that have the above type signatures. What do these type signatures tell us? Let's look at return first. We see that it returns a value of type m a. This tells us that m has the kind signature m :: * -> *. So whenever we hear someone say "Foo is a monad" we immediately know that Foo :: * -> *.

In part 1, you probably got tired of me emphasizing that a type is a context. When we look at return and bind, this starts to make more sense. The type m a is just the type a in the context m. The type signature return :: a -> m a tells us that the return function takes a plain value a and puts that value into the context m. So when we say something is a monad, we immediately know that we have a function called return that lets us put arbitrary other values into that context.

Now, what about bind? It looks much more complicated and scary, but it's really pretty simple. To see this, let's get rid of all the m's in the type signature. Here's the before and after.

before :: (a -> m b) -> m a -> m b
after  :: (a -> b) -> a -> b

The type signature for after might look familiar. It's exactly the same as the type signature for the ($) function! If you're not familiar with it, Haskell's $ function is just syntax sugar for function application. (f a) is exactly the same as (f a). It applies the function f to its argument a. It is useful because it has very low precedence and is right associative, so it is a nice syntax sugar that allows us to eliminate parenthesis in certain situations. When you realize that (=<<) is roughly analogous to the concept of function application (modulo the addition of a context m), it suddenly makes a lot more sense. So now what happens when we look at bind's type signature with the m's back in? (f =<< k) applies the function f to the value k. However, the crucial point is that k is a value wrapped in the context m, but f's parameter is an unwrapped value a. From this we see that the bind function's main purpose is to pull a value out of the context m and apply the function f, which does some computation, and returns the result back in the context m again. The monad type class does not provide any mechanism for unconditionally pulling a value out of the context. The only way to get access to the unwrapped value is with the bind function, but bind does this in a controlled way and requires the function to wrap things up again before the result is returned. This behavior, enabled by Haskell's strong static type system, provides complete control over side effects and mutability. Some monads do provide a way to get a value out of the context, but the choice of whether to do so is completely up to the author of said monad. It is not something inherent in the concept of a monad. Monads wouldn't be very fun to use if all you had was return, bind, and derived functions. To make them more usable, Haskell has a special syntax called "do notation". The basic idea behind do notation is that there's a bind between every line, and you can do a <- func to unwrap the return value of func and make it available to later lines with the identifier 'a'. You can find a more detailed treatment of do notation elsewhere. I hear that Learn You a Haskell and Real World Haskell are good. In summary, a monad is a certain type of context that provides two things: a way to put things into the context, and function application within the context. There is no way to get things out. To get things out, you have to use bind to take yourself into the context. Once you have these two operations, there are lots of other more complicated operations built on the basic primitives that are provided by the API. Much of this is provided in Control.Monad. You probably won't learn all this stuff in a day. Just dive in and use these concepts in real code. Eventually you'll find that the patterns are sinking in and becoming clearer. ### Joachim Breitner # Breaking News: Debian TC leaked decision on init system According to the German news page “heisse news”, the Debian Technical Committee has made a decision on the question about the default init system in Debian: There will be none! Instead, according to Ian “Vorlon” Bart, Debian will start to distribute a suspend-to-disk image to their users, with all daemons already started, completely eradicating the need for an init system. If you are able to read German, read all about it at Debian Technical Committee entscheidet sich gegen ein Init-System ## November 16, 2013 ### Neil Mitchell # ACM Article - Leaking Space I wrote an article for the Communcations of the ACM which is now available on the ACM Queue, entitled Leaking Space, all about space leaks. It's got plenty of Haskell, but is intended as a general article on space leaks, and even has an example in Javascript. The introduction features both printed encyclopedia's and postage stamps - both things my son might only learn about on Wikipedia. The abstract: A space leak occurs when a computer program uses more memory than necessary. In contrast to memory leaks, where the leaked memory is never released, the memory consumed by a space leak is released, but later than expected. This article presents example space leaks and how to spot and eliminate them. Read Leaking Space for free, or if you have an ACM subscription, also here. Thanks to Andy Gill for requesting the article and for suggestions to improve it, and to the team at ACM for typesetting/proofing etc. ### language-puppet # Hiera support in HEAD Hiera support is now in the latest version of the source repository. This is the last important feature that I wanted to implement before hitting 1.0. As with all the other features, this was really quick to develop (an evening to write the Hiera server and integrate it with the interpreter, and two lunch breaks to implement hash, array search types, and to finalize variable interpolation), and it is mostly untested. I did not look much at Hiera before, but now that I have learned about it I can see how it will simplify my manifests, so I’ll test things a bit more during the following days. I’ll also revamp the test library, and, when I’ll be happy about it, will release 1.0. I believe several persons are trying to use the library, and I would be quite happy to have feedback about it. If you need a feature, or a bug fixed, do not hesitate to create an issue. ## November 14, 2013 ### Christopher Done # Twitter waterflow problem and loeb <style> .g b{background:#888;color:#eee} .g i{background:#ccc;color:#888} .g a{background:#1a6e8e;color:#1a6e8e;} .g u{background:#8f4e8b;color:#fff;text-decoration:none;} .g s{background:#397460;color:#fff;text-decoration:none;} .g b,.g i,.g a,.g u,.g s {padding:0.2em;display:block;border-radius:0.1em;width:1em;height:1em;float:left;margin:0.1em;line-height:1em;text-align:center;font-weight:normal;text-shadow:none;} .g em{clear:both;display:block} </style> ## The Waterflow Problem I recently saw I Failed a Twitter Interview which features the following cute problem. Consider the following picture: 7 7 6 5 4 3 2 2 1 Fig. 1 In Fig. 1, we have walls of different heights. Such pictures are represented by an array of integers, where the value at each index is the height of the wall. Fig. 1 is represented with an array as [2,5,1,2,3,4,7,7,6]. Now imagine it rains. How much water is going to be accumulated in puddles between walls? For example, if it rains in Fig 1, the following puddle will be formed: 7 7 6 5 4 3 2 2 1 Fig. 2 No puddles are formed at edges of the wall, water is considered to simply run off the edge. We count volume in square blocks of 1×1. Thus, we are left with a puddle between column 1 and column 6 and the volume is 10. Write a program to return the volume for any array. ## My Reaction I thought, this looks like a spreadsheet problem, and closed the page, to get on with my work. Last thing I need right now is nerd sniping. A week or so later I saw A functional solution to Twitter’s waterflow problem which presented a rather concise and beautiful approach to solving the problem. I present it here, in the style that I prefer: water :: [Int] -> Int water h = sum (zipWith (-) (zipWith min (scanl1 max h) (scanr1 max h)) h) This is the second fastest algorithm in this page, clocking in at a mean 3.511872 ms for a random list of 10000 elements. See Benchmarks for more details. An efficient algorithm can be achieved with a trivial rewrite of Michael Kozakov’s Java solution is: {-# LANGUAGE BangPatterns #-} {-# LANGUAGE ViewPatterns #-} import qualified Data.Vector as V import Data.Vector ((!),Vector) water :: Vector Int -> Int water land = go 0 0 (V.length land - 1) 0 0 where go !volume !left !right (extend left -> leftMax) (extend right -> rightMax) | left < right = if leftMax >= rightMax then go (volume + rightMax - land!right) left (right - 1) leftMax rightMax else go (volume + leftMax - land!left) (left + 1) right leftMax rightMax | otherwise = volume extend i d = if land!i > d then land!i else d This is expectedly the fastest algorithm in this page, clocking in at a mean of 128.2953 us for a random vector of 10000 elements. But I still thought my spreadsheet idea was feasible. ## My approach In a similar way to Philip Nilsson, I can define the problem as it comes intuitively to me. As I saw it in my head, the problem can be broken down into “what is the volume that a given column will hold?” That can be written like this: volume0 = 0 volume|S|-1 = 0 volumei = min(lefti-1,righti+1)−heighti Where left and right are the peak heights to the left or right: left0 = height0 lefti = max(heighti,lefti-1) right|S|-1 = height|S|-1 righti = max(heighti,righti+1) That’s all. ## A visual example An example of i is: 7 7 6 5 4 3 2 2 1 We spread out in both directions to find the “peak” of the columns: 7 7 6 5 4 3 2 2 1 How do we do that? We simply define the volume of a column to be in terms of our immediate neighbors to the left and to the right: 7 7 6 5 4 3 2 2 1 A X B X is defined in terms of A and B. A and B are, in turn, are defined in terms of their immediate neighbors. Until we reach the ends: 7 7 6 5 4 3 2 2 1 A X Y B The ends of the wall are the only ones who only have one side defined in terms of their single neighbor, which makes complete sense. Their volume is always 0. It’s impossible to have a puddle on the edge. A’s “right” will be defined in terms of X, and B’s “left” will be defined in terms of Y. But how does this approach avoid infinite cycles? Easy. Each column in the spreadsheet contains three values: 1. The peak to the left. 2. The peak to the right. 3. My volume. A and B below depend upon eachother, but for different slots. A depends on the value of B’s “right” peak value, and B depends on the value of A’s “left” value: 7 7 6 5 4 3 2 2 1 A B The height of the column’s peak will be the smallest of the two peaks on either side: 7 7 6 5 4 3 2 2 1 And then the volume of the column is simply the height of the peak minus the column’s height: 7 7 6 5 4 3 2 2 1 ## Enter loeb I first heard about loeb from Dan Piponi’s From Löb’s Theorem to Spreadsheet Evaluation some years back, and ever since I’ve been wanting to use it for a real problem. It lets you easily define a spreadsheet generator by mapping over a functor containing functions. To each function in the container, the container itself is passed to that function. Here’s loeb: loeb :: Functor f => f (f b -> b) -> f b loeb x = fmap (\f -> f (loeb x)) x (Note, there is a more efficient version here.) So as described in the elaboration of how I saw the problem in my head, the solution takes the vector of numbers, generates a spreadsheet of triples, defined in terms of their neighbors—exept edges—and then simply makes a sum total of the third value, the volumes. import Control.Lens import qualified Data.Vector as V import Data.Vector ((!),Vector) water :: Vector Int -> Int water = V.sum . V.map (view _3) . loeb . V.imap cell where cell i x xs | i == 0 = edge _2 | i == V.length xs - 1 = edge _1 | otherwise = col i x xs where edge ln = set l (view l (col i x xs)) (x,x,0) where l r = cloneLens ln r col i x xs = (l,r,min l r - x) where l = neighbor _1 (-) r = neighbor _2 (+) neighbor l o = max x (view l (xs ! (i o 1))) It’s not the most efficient algorithm—it relies on laziness in an almost perverse way, but I like that I was able to express exactly what occured to me. And loeb is suave. It clocks in at a mean of 3.512758 ms for a vector of 10000 random elements. That’s not too bad, compared to the scanr/scanl. This is was also my first use of lens, so that was fun. The cloneLens are required because you can’t pass in an arbitrary lens and then use it both as a setter and a getter, the type becomes fixed on one or the other, making it not really a lens anymore. I find that pretty disappointing. But otherwise the lenses made the code simpler. ## Update with comonads & pointed lists Michael Zuser pointed out another cool insight from Comonads and reading from the future (Dan Piponi’s blog is a treasure trove!) that while loeb lets you look at the whole container, giving you absolute references, the equivalent corecursive fix (below wfix) on a comonad gives you relative references. Michael demonstrates below using Jeff Wheeler’s pointed list library and Edward Kmett’s comonad library: import Control.Comonad import Control.Lens import Data.List.PointedList (PointedList) import qualified Data.List.PointedList as PE import Data.Maybe instance Comonad PointedList where extend = PE.contextMap extract = PE._focus water :: [Int] -> Int water = view _2 . wfix . fmap go . fromMaybe (PE.singleton 0) . PE.fromList where go height context = (lMax, total, rMax) where get f = maybe (height, 0, height) PE._focus f context
(prevLMax,         _,        _) = get PE.previous
(_       , prevTotal, prevRMax) = get PE.next
lMax = max height prevLMax
rMax = max height prevRMax
total = prevTotal + min lMax rMax - height

I think if I’d’ve heard of this before, this solution would’ve come to mind instead, it seems entirely natural!

Sadly, this is the slowest algorithm on the page. I’m not sure how to optimize it to be better.

## Update on lens

Russell O’Connor gave me some hints for reducing the lens verbiage. First, eta-reducing the locally defined lens l in my code removes the need for the NoMonomorphismRestriction extension, so I’ve removed that. Second, a rank-N type can also be used, but then the type signature is rather large and I’m unable to redouce it presently without reading more of the lens library.

## Update on loeb

On re-reading the comments for From Löb’s Theorem to Spreadsheet Evaluation I noticed Edward Kmett pointed out we can get better laziness sharing with the following definition of loeb:

loeb :: Functor f => f (f b -> b) -> f b
loeb x = xs
where xs = fmap ($xs) x For my particular water function, this doesn’t make much of a difference, but there is a difference. See the next section. ## Time travelling solution Michael Zuser demonstrated a time travelling solution based on the Tardis monad: import Control.Monad import Control.Monad.Tardis water :: [Int] -> Int water = flip evalTardis (minBound, minBound) . foldM go 0 where go total height = do modifyForwards$ max height
leftmax <- getPast
rightmax <- getFuture
modifyBackwards $max height return$ total + min leftmax rightmax - height

## Fastest

Sami Hangaslammi submitted this fast version (clocks in at 33.80864 us):

import Data.Array

waterVolume :: Array Int Int -> Int
waterVolume arr = go 0 minB maxB where
(minB,maxB) = bounds arr

go !acc lpos rpos
| lpos >= rpos             = acc
| leftHeight < rightHeight =
segment leftHeight    1  acc lpos contLeft
| otherwise                =
segment rightHeight (-1) acc rpos contRight
where
leftHeight          = arr ! lpos
rightHeight         = arr ! rpos
contLeft  acc' pos' = go acc' pos' rpos
contRight acc' pos' = go acc' lpos pos'

segment limit move !acc' !pos cont
| delta <= 0 = cont acc' pos'
| otherwise  = segment limit move (acc' + delta) pos' cont
where
delta = limit - arr ! pos'
pos'  = pos + move

Changing the data structure to a vector brings this down to 26.79492 us.

## Benchmarks

benchmarking water/10000
mean: 2.624748 ms, lb 2.619731 ms, ub 2.630364 ms, ci 0.950
std dev: 85.45235 us, lb 78.33856 us, ub 103.2333 us, ci 0.950
found 54 outliers among 1000 samples (5.4%)
51 (5.1%) high mild
3 (0.3%) high severe
variance introduced by outliers: 79.838%
variance is severely inflated by outliers

benchmarking water_loeb/10000
mean: 3.512758 ms, lb 3.508533 ms, ub 3.517332 ms, ci 0.950
std dev: 70.77688 us, lb 65.88087 us, ub 76.97691 us, ci 0.950
found 38 outliers among 1000 samples (3.8%)
24 (2.4%) high mild
14 (1.4%) high severe
variance introduced by outliers: 59.949%
variance is severely inflated by outliers

benchmarking water_loeb'/10000
mean: 3.511872 ms, lb 3.507492 ms, ub 3.516778 ms, ci 0.950
std dev: 74.34813 us, lb 67.99334 us, ub 84.56183 us, ci 0.950
found 43 outliers among 1000 samples (4.3%)
34 (3.4%) high mild
9 (0.9%) high severe
variance introduced by outliers: 62.285%
variance is severely inflated by outliers

benchmarking water_onepass/10000
mean: 128.2953 us, lb 128.1194 us, ub 128.4774 us, ci 0.950
std dev: 2.864153 us, lb 2.713756 us, ub 3.043611 us, ci 0.950
found 18 outliers among 1000 samples (1.8%)
17 (1.7%) high mild
1 (0.1%) high severe
variance introduced by outliers: 64.826%
variance is severely inflated by outliers

benchmarking water_array/10000
mean: 33.80864 us, lb 33.76067 us, ub 33.86597 us, ci 0.950
std dev: 844.9932 ns, lb 731.3158 ns, ub 1.218807 us, ci 0.950
found 29 outliers among 1000 samples (2.9%)
27 (2.7%) high mild
2 (0.2%) high severe
variance introduced by outliers: 69.836%
variance is severely inflated by outliers

benchmarking water_vector/10000
mean: 26.79492 us, lb 26.75906 us, ub 26.83274 us, ci 0.950
std dev: 595.1865 ns, lb 559.8929 ns, ub 652.5076 ns, ci 0.950
found 21 outliers among 1000 samples (2.1%)
18 (1.8%) high mild
3 (0.3%) high severe
variance introduced by outliers: 64.525%
variance is severely inflated by outliers

collecting 1000 samples, 1 iterations each, in estimated 28315.44 s

benchmarking water_tardis/10000
collecting 1000 samples, 1 iterations each, in estimated 28.98788 s

I never bothered waiting around for the comonad or the time traveller ones to complete. They’re slow, let’s say that.

Here’s the source code to the benchmarks, let me know if I made a mistake in the benchmark setup.

# Constructing a list in a Monad

In Haskell, lists are ubiquitous. In real-world Haskell, monads are ubiquitous. So it should be an easy task to fill a list with values obtained in a monad, i.e. a list of n random numbers (in the Rand monad), or getting some input from the user, and storing it in a list, until the user chooses to finish the input.

For now, assume a function getInput :: IO Int; we want to run it 10000 and then have the list of the values returned. Sounds simple and straight forward, but it is actually really hard to get the run-time behaviour that we naively and optimistically expect.

### The expectation

In an imperative setting, we would have a simple counting loop. In the loop, we run getInput, allocate a new cell for the linked list, and append it to the list. Clearly, this

• consumes no stack space and
• at the end of the loop, the list is already fully evaluated in memory, ready to be used.

Can we achieve that with Haskell? Let’s look at a few alternatives (complete code available)

### Attempt 1: The accumulator

I’ll start with the probably worst attempt. Unfortunately, from what I have seen from correcting Haskell exams, this is what beginners would write, especially if they have the above imperative algorithm in mind, and know about the “convert loops to recursion with accumulators” approach to solving Haskell problems:

accumAppend :: Int -> IO [Int]
accumAppend n = go n []
where
go 0 l = return l
go n l = do {x <- getInput; go (n-1) (l++[x])}


To check the space usage of this, I pass -Kn to the GHC runtime and see what limit is just enough to let this program calculate accumAppend 10000 and fully evaluate the resulting list. It turns out that we need 328336 bytes of stack space.

So what about the second requirement; that the list is fully evaluated? Experienced Haskellers see it immediatelly: The accumulator is not used in the loop, so each iteration adds a thunk around the previous value, that would add the next entry to the end. Only when the list is actually used, these are run, each traversing the whole list, causing quadratic cost. We can verify this using ghc-vis:

Clearly (and not surprisingly) this is not the best solution.

### Attempt 2: The accumulator (forced)

Lets try to fix the second issue, by making sure that the list is always fully evaluated. We import Control.DeepSeq and write

accumAppendSeq :: Int -> IO [Int]
accumAppendSeq n = go n []
where
go 0 l = return l


### Attempt 6: Difference lists

It is time to dig deeper into our box of tricks. What if we have lists with constant time Snoc (a.k.a. appending a new element)? Then Attempt 3 would not require reversing the list. Such lists are provided by difference lists; there are libraries for that, but we can simply implement them using lambdas:

accumDList :: Int -> IO [Int]
accumDList n = go n id
where
go 0 l = return (l [])
go n l = do {x <- getInput; go (n-1) (l . (x:))}


Indeed, no stack is required (8 bytes are enough). But we are cheating here: We are still not constructing the final list, but rather we combine functions that append elements to lists. So when accumDList returns, what we have in memory, is a thunk and sequence of partial function applications in the wrong order, and evaluating the thunk does basically the same thing as reverse in Attempt 3. It might be a bit faster, is still not satisfying enough:

### Attempt 7: Unsafe functions

Even deepter in the box of tricks, somewhere in the grit and dust where one usually should not reach, there is a function that can help us out here: unsafeInterleaveIO. This allow us to modify the naive recursion in a way that first creates the Cons-cell, and then continues the loop:

listUnsafeInterleave :: Int -> IO [Int]
listUnsafeInterleave n = do {l <- go n; return $!! l} where go 0 = return [] go n = do x <- getInput r <- unsafeInterleaveIO (go (n-1)) return (x:r)  The stack consumption is 8 bytes. It is worth noting that the all to deepseq (in$!!) will actually drive the evaluation of go. Also, it guarantees that the unsafe effects of unsafeInterleaveIO are confined to the execution of listUnsafeInterleave, i.e. are actually safe.

### Run time statistics

I ran criterion over the functions (generating lists of length 1000), and found these results:

• accumAppend: 6.5 ms
• accumAppendSeq: 8.2 ms
• accumReverse: 14.0 us
• listRecurse: 8.0 us (same as listReplicateM and listStreams)
• accumDList: 12.0 us
• listUnsafeInterleave: 174.0 us

### Conclusion

There are many ways to write a list-constructing list in Haskell, none are perfect – they either fill the heap, or do not construct the list directly (usually requiring a second pass through it), or rely on IO-specific unsafe functions. The naive recursion performs the best, but may cause a stack overflow (note, though that from GHC 7.8, the stack will be unlimited).  The next best option seems to be the difference lists

I’m still wondering: What would be required from Haskell, GHC or the monads in question to have a fully satisfactory solution here, i.e. one that is as fast as the naive recursion, as stack efficient as the difference list solution, and allocates no thunks, only list cells?

# WAI 2.0 and http-client

Kazu and I have been working on some changes to WAI and Warp for a 2.0 release. At this point, most of the API changes are done, and could benefit from some input from the rest of the community. At the same time, the work we've been doing inspired me to finally write the http-client library, which is intended to be the new underlying engine for http-conduit in its upcoming 2.0 release. This blog post is intended to give an overview of the coming changes.

## WAI

I originally sent an email to web-devel about the planned changes to WAI. For the most part, the changes Kazu and I have implemented coincide with the conclusions of that thread, namely:

• The Request constructor has been moved to an Internal module.
• serverName and serverPort have been removed.
• requestBody no longer lives in ResourceT IO.
• As an optimization to avoid extra lookups, requestHeaderHost is now part of Request. We may add other such optimization fields in the future.
• FilePart now contains the total file size, which is necessary for properly and efficiently supporting range requests.
• Instead of living in ResourceT, ResponseSource allows for bracket-like semantics to provide for exception safety. This allows for a lot of optimizations in Warp, and leads to some other simplifications which I'll get to later. The responseSourceBracket helper function is a convenient way to get exception safety.

In addition to interface changes, Kazu has made a number of optimizations to the internals of Warp. A big thank you to Gregory Collins who provided some recommendations on this front.

I'll hold off on making any comments about performance, as that's Kazu's domain and he's still adding some optimizations.

## http-client

There are a few questions which I've received multiple times about http-conduit:

• Is it possible to get a slimmed-down version, without SSL support, to be used for testing purposes?
• Is it possible to use OpenSSL instead of the tls package?
• If all I need is a way to get a lazy ByteString, can I have a lower-dependency package?

It turns out that the http-conduit code base is pretty well set up for this kind of change. The concept of a connection has already been abstracted in such a way that it's trivial to use different backends for SSL connections. And the internals of the package don't really use conduit very much, rather, the package just exports a conduit interface for convenience.

So after some hacking, I present to you the http-client megarepo. It's broken up into five packages currently:

• http-client provides the core functionality, without any SSL or conduit support.
• http-client-tls uses the tls and connection packages to provide SSL support.
• http-client-openssl uses OpenSSL for that purpose.
• http-client-multipart provides helper functions for generating multipart request bodies.
• http-client-conduit provides some helper functions for interacting with conduit.

On top of all this, I've set up a branch of http-conduit which provides mostly the same API, but uses http-client for all of its functionality. Now, some random thoughts:

• http-client does not use resourcet at all; it uses the bracket idiom instead. This is important for both (1) simplifying the library, and (2) making it easier to integrate with WAI 2.0, which also doesn't use resourcet. http-conduit continues to provide a resourcet-powered interface for ease-of-use.
• Most functionality provided by http-conduit has been ported to http-client, including:

• Connection pooling, which is an important optimization when you're regularly connecting to the same host (e.g., API access).
• Cookies (browser API will continue in http-conduit-browser)
• Multipart messages
• Proxies
• Socks proxies (when using http-client-tls)
• Basic authentication
• There's a distinct lack of extensive testing in http-client, since most testing is still being done in http-conduit. I'd certainly like to increase the test coverage in http-client, possibly by simply porting http-conduit test cases over. But for now, the http-conduit test suite is a sufficient means of testing http-client.

I think http-client is at a very nice level of abstraction for people building higher-level APIs. If you're in such a boat, I'd really like to hear feedback. I think we're headed towards too much fragmentation in the HTTP client space right now, and having a shared engine could be a good solution to the problem.

## Yesod/http-reverse-proxy/keter

Not only can Yesod be ported over to WAI 2.0 without any API changes, but it can do so with just a few CPP declarations, so that Yesod will compile both with the current version of WAI and WAI 2.0. The only real problematic change is the removal of ResourceT from the WAI protocol. This is handled in Yesod by:

• Starting a new ResourceT block for each Yesod request.
• When returning a non-streaming response, closing the ResourceT block when passing the response value to Warp.
• When returning a streaming response, using the bracket-like functionality of ResponseSource to make sure the ResourceT block is closed.

http-reverse-proxy and keter both end up a bit simpler as a result of this change. Since neither http-client nor Warp deal with resourcet any more, the reverse proxy layer can ignore it as well. Basically, we can just peel off a layer of monad transformers from ever being used.

## What's not changing: conduit

In this whole rewrite process, I actually wrote a conduit-free branch of WAI for testing purposes. It worked, but Kazu and I came to the conclusion that:

1. Removing conduit provided no discernible performance improvement.
2. Both the internal Warp code, and external user code, became harder to manage without conduit involved.

I also investigated some possible changes to conduit itself, based on last month's blog post. For the moment, I'm not pursuing that route further, but I don't want to relegate that discussion to a footnote in this blog post. I'll hopefully give that topic more attention later.

## Feedback requested!

None of the changes discussed here are set in stone yet, so now's the perfect time to clone the repos and start experimenting. In particular, I'm looking for feedback from the following groups:

• Users of WAI, either for writing WAI applications or WAI-based frameworks.
• Users of http-conduit, who are either happy with the current API, or would be interested in any of the changes mentioned above (e.g., switching tls for OpenSSL).
• Developers of other HTTP client libraries, who could consider collaboration on a single, shared HTTP client engine.

# NSIS plugins and LZMA solid

Summary: If you use NSIS plugins along with LZMA solid compression, make sure you reference the plugins first.

The NSIS Windows Installer supports plugins, which are dlls added to the installer which are called at install time. It also supports LZMA solid compression, which means that all files are compressed as a single continuous run for better compression. These two features can interact poorly. As an example:

SetCompressor /SOLID lzmaPage instfiles "" createInstFilesSection dummy    SetOutPath "\$INSTDIR"    File "*.exe"SectionEndFunction createInstFiles    w7tbp::StartFunctionEnd

This installer is LZMA solid compressed and installs all .exe files into the installation directory. It also uses the Taskbar progress plugin to provide progress information in the taskbar.

Unfortunately, if the .exe files are large (say 1Gb) the installer will visibly freeze for around 40 seconds when you start to install. When the createInstFiles function is run, it first extracts the 3Kb file w7tbp.dll and loads it. NSIS adds files to the archive in the order they are discovered, meaning that since w7tbp.dll is first mentioned at the end of the file it is placed at the end of the archive. Solid compression requires decompressing everything before a file to extract it, and thus accessing the last file requires a lot of disk access and CPU time.

The workaround is to move all addin calls before all sections, so they go at the front of the archive, and can be extracted quickly. However, sections define names which are only accessible later on in the file, so not all functions can be moved before sections. Therefore, as a general technique, you can write a dummy function that is never called but which references each plugin, and put that function before any sections.

My NSIS wrapper library automatically uses the workaround.

# Website go bye bye

Heads up, the host I've been using for my website llama.freegeek.org will be going away on Sunday, 17 November 2013. Since I'm strapped for time, I won't be able to migrate it to another host for a while. Once I get a chance to set up a new one, I'll let y'all know.

# Compiler Engineer at BAE Systems (Full-time)

Do you want to contribute to the development of new programming languages that will be used to implement a secure operating system on a novel hardware platform? Would you like to be part of a growing research group working with leading universities on cutting edge computer science research?

BAE System’s Technology Solutions group in Burlington, MA, is leading the SAFE project to do a clean slate design of a secure, resilient, general purpose computer and operating system. Under the auspices of the DARPA CRASH program, we are designing a novel tagged hardware architecture, an information flow-aware programming language, and a new “zero kernel” operating system based on the principles of least privilege and mutual suspicion. We are using formal methods to prove security and correctness properties of the programming language and of the runtime components of SAFE.

Critical to this effort are the language implementations – SAFE has two new languages under development: (1) Breeze, an information-flow aware, dynamically typed, functional programming language, and (2) Tempest, a low level systems programming language with user-defined calling conventions and inline assembly. Language implementation encompasses compilers, interpreters, analysis and debugging tools, language design, and language runtime design and implementation. In short, if you want to be involved in creating new programming languages that will be used to implement a secure operating system on a novel hardware platform, this project is made for you!

As a member of the Advanced Computing and Decision Systems group, you will participate on different projects over time, and you will also be able to help define new projects. Some of the other projects we are currently working on involve probabilistic programming (under the DARPA PPAML project), meta-reasoning over proof structures as part of the DARPA HACMS program, and melding formal reasoning tools such as ACL2 with system engineering tools under the OSD Software Producibility Initiative.

BAE Systems Technology Solutions Advanced Computing and Decision Systems is an entrepreneurial organization that focuses on research and development in computer science and artificial intelligence. We deliver state-of-the-art decision support and system analysis design and integration technologies for many customers in the defense and intelligence communities. We are based in Burlington, MA, and are part of a 1979 MIT spinoff company that was acquired by BAE Systems in 2004 and has since grown to more than 300 researchers, system engineers, and software developers, over 70 percent of whom hold advanced engineering and science degrees.

Required Skills and Education:

• Industry and/or academic experience in programming language implementation (compilers, interpreters, formal semantics, parsing, optimization, code generation)
• Bachelor’s degree and 2 years experience or Master’s degree in computer science or related field
• Strong oral and written communication skills
• US Citizenship is a requirement for this position

Preferred Skills:

• Experience with functional programming languages – using them, compiling them, and designing them – is highly desirable
• A Ph.D. in computer science is a strong plus.
• Experience in any of the following areas: Haskell, LLVM, formal methods (proof assistants, language semantics, proofs of correctness or safety), computer security, parallel computing, and computer architecture
• Experience with agile software development processes.
• Experience with a version control system such as git or subversion

Get information on how to apply for this position.

### Ketil Malde

One of the things I find myself using associative data structures (a.k.a. finite maps, hash tables, …) for again and again, is frequency counts. This is such a common task, and thus it should be easy and efficient, but due to the non-strictness of Haskell, it’s in fact a bit tricky to get right. So if you use a Data.Map (specifically, Data.Map v Int), instead of a simple expression like:

freqtable = foldl' (M.insertWith (+1) 1) M.empty

I often found myself writing something like:

freqtable = go M.empty
where
go m [] = m
go m (i:is) = let v = M.findWithDefault 0 i m
v' = v+1
m' = M.insert i v' m
in v' seq m' seq go m' is

The good news is that the contortions above are not really necessary anymore. We now have strict versions of these data structures which always evaluate the value elements, as well as insertWith' (with a tick at the end) which evaluates values (to WHNF) on insertion.

The bad news is that for various reasons, Data.Map isn’t always the optimal data structure for this, and so I decided to investigate a bit.

## Candidate data structures

The default data structure for associative maps in Haskell is perhaps still Data.Map, but in many cases, Data.IntMap.Strict is a better choice. This is strict, and by using Int as the key (which isn’t really a severe limitation for most purposes), is more efficient. This is still implemented as a tree structure, which enables sharing and immutability and purity - the usual goodies. (But note that this isn’t so important here - we’ll often just build the table once at the outset and then use it, we’re not updating things much later on.)

Another alternative is a direct table of counts. Data.Vector is Haskell’s new all-purpose arrays, providing a 0-indexed table with various options for content. The older Data.Array provided more flexible arrays with multiple dimensions etc, but these can fairly easily be built on top of vectors. Like Data.Array, vectors come in multiple flavors:

• regular - unboxed, can hold complex (and unevaluated, non-strict) values
• unboxed - stores primitive data structures directly, and is strict
• mutable - can do destructive updates, but requires operations in a monad

Then finally, there is Data.Judy - in fact, stumbling over its Haskell library by Don Stewart is the reason I started looking into this. The C version is advertised as an extremely performant, mutable, cache-friendly tree structure.

## Implementation

To benchmark these data structures, I’ve written a small driver program that basically inserts a given number of values from a given range into one of these data structures, and measured CPU time and memory usage with /usr/bin/time.

So, to make it clear, given a range r, and a number of values n, we insert n random values between 0 and r. Since Vector has a predetermined size, we allocate an r-sized array, for the others, the size grows dynamically.

Also, all counts are Ints (or the equivalent Word for Judy), especially Vector could probably save by using a smaller data type for storing counts.

Oh, and Don’s Judy library is a bit sparse, among other things, I added an insertWith. This makes frequency counting a bit faster (see below), as well as thread safe.

## Results

Anyway, this is how it looks so far:

Here we use a fixed range (up to 1M), and vary the number of inserts. We see that all implementations scale linearly - which is what we would expect. IntMap and (boxed) Vector are about seven times slower than the others, though, while unboxed and mutable vectors are a bit faster than Judy (j is Judy with lookup and insert, jw is judy using insertWith)

Memory consumption is lowest and constant for mutable and unboxed vectors. Judy has a comparable low memory footprint, but grows slightly with number of inserts. IntMap is way more expensive, something like 20x bigger. Interestingly, boxed vector seems to use a large amount of memory, I’m not exactly sure why this is.

Of course, vectors are fast when you can allocate a relatively small array, and update it directly. But the price you pay is limited range, here we look at how a constant number of insertions (again 1M) look when we increase the range. Again, IntMap and boxed vectors are slow, Judy and the other vectors are substantially faster.

Finally, we see that for memory use, vectors scale linearly with the range. Interestingly, the mutable vectors have half the memory footprint of immutable, but unboxed vectors. I would have thought an unsafeFreeze or equivalent would make them the same - but apparently not? And Judy really shines here, with a tiny footprint compared to the others.

## Conclusions

For small ranges and dense data (relatively few counts of zero), use Data.Vector, the mutable kind. For large ranges and/or sparse data, use Data.Judy.

A version of Judy that compiles on recent compilers, and adds some extra functions (e.g., insertWith and adjust) can be found here. Scripts etc used in benchmarking are here. I haven’t spent a lot of effort exploring implementation options, so it is likely that my implementations are suboptimal. As always, comments welcome.