Planet Haskell

March 31, 2015

Edward Z. Yang

Protected: Expression-level Backpack

This content is password protected. To view it please enter your password below:

by Edward Z. Yang at March 31, 2015 11:29 PM

Brent Yorgey

Blogging again, & some major life events

It’s been a long time since I’ve written anything here; the blog was on hold while I was finishing my PhD and on the academic job market. Now that things have settled down a bit I plan to get back to blogging.

For starters, here are a few of the major events that have happened in the meantime, that readers of this blog might care about:

  • I successfully defended my PhD dissertation in October, officially graduated in December, and got an actual diploma in the mail a few weeks ago. I’ll be back in Philadelphia for the official graduation ceremony in May.
  • I accepted a tenure-track position at Hendrix College in Conway, Arkansas, and will be moving there this summer.
  • Dan Piponi and I had a paper accepted to MPC 2015. Here’s the github repo, and I plan to post a PDF copy here soon (once I get around to incorporating feedback from the reviewers). I look forward to seeing a bunch of folks (Volk?) in Königswinter this summer; I already have my plane tickets (CIU -> DTW -> AMS -> CGN, it’s a long story).
  • Work on diagrams continues strong (no thanks to me!), and we are aiming for a big new release soon—I will certainly post about that here as well.

by Brent at March 31, 2015 06:03 PM

March 30, 2015

Robert Harper

The power of negative thinking

Exception tracking is a well-known tar baby of type system design.  After all, if expressions can have two sorts of result, why shouldn’t the type say something about them both?  Languages such as CLU, FX, and Java, to name three, provide “throws” or “raises” clauses to the types of procedures that specify an upper bound on the exceptions that can occur when they are called.  It all seems natural and easy, but somehow the idea never really works very well.  One culprit is any form of higher-order programming, which is inherent in object-oriented and functional languages alike.  To handle the indirection requires more complex concepts, such as effect polymorphism, to make things work reasonably well.  Or untracked exceptions are used to avoid the need to track them.  Somehow such an appealing idea seems rather stickier to realize than one might expect.  But why?

A piece of the puzzle was put into place by Xavier Leroy and François Pessaux in their paper on tracking uncaught exceptions. Their idea was to use type-based methods to track uncaught exceptions, but to move the clever typing techniques required out of the programming language itself and into a separate analysis tool.  They make effective use of the powerful concept of row polymorphism introduced by Didier Rémy for typing records and variants in various dialects of Caml.  Moving exception tracking out of the language and into a verification tool is the decisive move, because it liberates the analyzer from any constraints that may be essential at the language level.

But why track uncaught exceptions?  That is, why track the exceptions that might be raised, rather than those that cannot be?  From a purely methodological viewpoint it seems more important to know that a certain code fragment cannot raise certain exceptions (such as the match exception in ML, which arises when a value matches no pattern in a case analysis).  In a closed world in which all of the possible exceptions are known, tracking positive information about which exceptions might be raised amounts to the same as tracking which exceptions cannot be raised, by simply subtracting the raised set from the entire set.  As long as the raised set is an upper bound on the exceptions that might be raised, the difference is a lower bound on the set of exceptions that cannot be raised.  Such conservative approximations are necessary because a non-trivial behavioral property of a program is always undecidable, and hence requires proof.  In practice this means that stronger invariants must be maintained than just the exception information so that one may prove, for example, that the values passed to a pattern match are limited to those that actually do satisfy some clause of an inexhaustive match.

How realistic is the closed world assumption?  For it to hold seems to require a whole-program analysis, and is therefore non-modular, a risky premise in today’s world.  Even on a whole-program basis exceptions must be static in the sense that, even if they are scoped, they may in principle be declared globally, after suitable renaming to avoid collisions.  The global declarations collectively determine the whole “world” from which positive exception tracking information may be subtracted to obtain negative exception information.  But in languages that admit multiple instantiations of modules, such as ML functors, static exceptions are not sufficient (each instance should introduce a distinct exception).  Instead, static exceptions must be replaced by dynamic exceptions that are allocated at initialization time, or even run-time, to ensure that no collisions can occur among the instances.  At that point we have an open world of exceptions, one in which there are exceptions that may be raised, but which cannot be named in any form of type that seeks to provide an upper bound on the possible uncaught exceptions that may arise.

For example consider the ML expression

  let exception X in raise X end

If one were to use positive exception tracking, what would one say about the expression as a whole?  It can, in fact it does, raise the exception X, yet this fact is unspeakable outside of the scope of the declaration.   If a tracker does not account for this fact, it is unsound in the sense that the uncaught exceptions no longer provide an upper bound on what may be raised.  One maneuver, used in Java, for example, is to admit a class of untracked exceptions about which no static information is maintained.  This is useful, because it allows one to track those exceptions that can be tracked (by the Java type system) and to not track those that cannot.

In an open world (which includes Java, because exceptions are a form of object) positive exception tracking becomes infeasible because there is no way to name the exceptions that might be tracked.  In the above example the exception X is actually a bound variable bound to a reference to an exception constructor.  The name of the bound variable ought not matter, so it is not even clear what the exception raised should be called.  (It is amusing to see the messages generated by various ML compilers when reporting uncaught exceptions.  The information they provide is helpful, certainly, but is usually, strictly speaking, meaningless, involving identifiers that are not in scope.)

The methodological considerations mentioned earlier suggest a way around this difficulty.  Rather than attempt to track those exceptions that might be raised, instead track the exceptions that cannot be raised.  In the above example there is nothing to say about X not being raised, because it is being raised, so we’re off the hook there.  The “dual” example

  let exception X in 17 end

illustrates the power of negative thinking.  The body of the let does not raise the exception bound to X, and this may be recorded in a type that makes sense within the scope of X.  The crucial point is that when exiting its scope it is sound to drop mention of this information in a type for the entire expression.  Information is lost, but the analysis is sound.  In contrast there is no way to drop positive information without losing soundness, as the first example shows.

One way to think about the situation is in terms of type refinements, which express properties of the behavior of expressions of a type.  To see this most clearly it is useful to separate the exception mechanism into two parts, the control part and the data part.  The control aspect is essentially just a formulation of error-passing style, in which every expression has either a normal return of a specified type, or an exceptional return of the type associated to all exceptions.  (Nick Benton and Andrew Kennedy nicely formulated this view of exceptions as an extension of the concept of a monad.)
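In Haskell terms, this control part is just the Either monad: every computation returns either a normal value or an exceptional one, and bind propagates the exceptional case. A small sketch, with a made-up Exn type (not taken from the post):

```haskell
-- Error-passing style made explicit: Left is the exceptional return,
-- Right is the normal return, and (>>=) threads the exceptional case.
data Exn = DivByZero deriving Show

safeDiv :: Int -> Int -> Either Exn Int
safeDiv _ 0 = Left DivByZero
safeDiv a b = Right (a `div` b)

main :: IO ()
main = do
  print (safeDiv 10 2 >>= \x -> safeDiv x 0)    -- exceptional return propagates
  print (safeDiv 10 2 >>= \x -> safeDiv 100 x)  -- normal return
```

Note that this encoding is entirely pure, matching the observation that the control aspect of exceptions can be taken as pure.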

The data aspect is, for dynamic exceptions, the type of dynamically classified values, which is written clsfd in PFPL.  Think of it as an open-ended sum in which one can dynamically generate new classifiers (aka summands, injections, constructors, exceptions, channels, …) that carry a value of a specified type.  According to this view the exception X is bound to a dynamically-generated classifier carrying a value of unit type.  (Classifier allocation is a storage effect, so that the data aspect necessarily involves effects, whereas the control aspect may, and for reasons of parallelism ought to, be taken as pure.)  Exception constructors are used to make values of type clsfd, which are passed to handlers that can deconstruct those values by pattern matching.
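Haskell’s extensible exceptions are a concrete instance of such an open-ended sum (an analogy, not PFPL’s clsfd itself): SomeException plays the role of the type of classified values, and each new Exception instance dynamically introduces a new class:

```haskell
-- SomeException is an open-ended sum; declaring a new Exception
-- instance adds a new "classifier" to it without a closed enumeration.
import Control.Exception (Exception, SomeException, throwIO, try)

data X = X deriving Show
instance Exception X

main :: IO ()
main = do
  r <- try (throwIO X) :: IO (Either SomeException ())
  print r  -- the handler deconstructs the classified value
```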

Type refinements come into play as a means of tracking the class of a classified value.  For the purposes of exception tracking, the crucial refinements of the type clsfd are the positive refinement, \mathsf{a}\cdot, and the negative refinement, \overline{\mathsf{a}\cdot}, which specify that a classified value is, or is not, of class a.  Positive exception tracking reduces to maintaining invariants expressed by a disjunction of positive refinements; negative exception tracking reduces to maintaining invariants expressed by a conjunction of negative refinements.  Revisiting the logic of exception tracking, the key is that the entailment

\overline{\mathsf{a}_1\cdot}\wedge \cdots\wedge \overline{\mathsf{a}_n\cdot} \leq \overline{\mathsf{a}_1\cdot} \wedge \cdots \wedge \overline{\mathsf{a}_{n-1}\cdot}

is valid, whereas the “entailment”

\mathsf{a}_1\cdot \vee \cdots \vee \mathsf{a}_n\cdot \leq \mathsf{a}_1\cdot\vee \cdots \vee \mathsf{a}_{n-1}\cdot

is not.  Thus, in the negative setting we may get ourselves out of the scope of an exception by weakening the refinement, an illustration of the power of negative thinking.

Filed under: Programming, Research Tagged: dynamic classification, exceptions, open vs closed world, type refinements

by Robert Harper at March 30, 2015 03:50 PM


OverloadedRecordFields revived

Way back in the summer of 2013, with support from the Google Summer of Code programme, I implemented a GHC extension called OverloadedRecordFields to address the oft-stated desire to improve Haskell’s record system. This didn’t get merged into GHC HEAD at the time, because the implementation cost outweighed the benefits. Now, however, I’m happy to report that Well-Typed are sponsoring the work required to get an improved version into GHC. Moreover, the first part of the work is already up for review on Phabricator.

The crucial thing that has enabled OverloadedRecordFields to get going again is that we’ve found a way to factor the design differently, so that we get a much better power-to-weight ratio for the extension by splitting it into two parts.

Part 1: Records with duplicate field labels

The first step is to cut down the core OverloadedRecordFields extension as much as possible. The essential idea is the same as it ever was, namely that a single module should be able to use the same field name in multiple datatypes, as in this example:

data Person  = Person  { personId :: Int, name :: String }
data Address = Address { personId :: Int, address :: String }

These definitions are forbidden in normal Haskell, because personId is defined twice, but the OverloadedRecordFields extension will permit them and instead postpone name conflict checking to use sites. The basic extension will require that fields are used in such a way that the relevant datatype is always unambiguously determined, and the meanings of record construction, pattern matching, selection and update will not change. This means that the extension can always be enabled for an existing module and it will continue to compile unchanged, an important property that was not true of the previous design.

The Haskell syntax for record construction and pattern-matching is always unambiguous, because it mentions the data constructor, which means that code like this is perfectly fine:

p = Person { personId = 1, name = "Donald" }
getId (Person { personId = i }) = i

On the other hand, record selector functions are potentially ambiguous. The name and address selectors can be used without restrictions, and with their usual types, but it will not be possible to use personId as a record selector if both versions are in scope (although we will shortly see an alternative).

Record update is a slightly more interesting case, because it may or may not be ambiguous depending on the fields being updated. In addition, since updates are a special syntactic form, the ambiguity can be resolved using a type signature. For example, this update would be ambiguous and hence rejected by the compiler:

f x = x { personId = 0 } -- is x a Person or an Address?

On the other hand, all these updates are unambiguous:

g :: Person -> Person
g x = x { personId = 0 }                 -- top-level type signature

h x = x { personId = 0 } :: Person       -- type signature outside

k x = (x :: Person) { personId = 0 }     -- type signature inside

l x = x { personId = 0, name = "Daffy" } -- only Person has both fields

Overall, this extension requires quite a bit of rewiring inside GHC to distinguish between field labels, which may be overloaded, and record selector function names, which are always unambiguous. However, it requires nothing conceptually complicated. As mentioned above, the implementation of this part is available for review on Phabricator.

Part 2: Polymorphism over record fields

The OverloadedRecordFields extension described in part 1 is already useful, even though it is a relatively minor relaxation of the Haskell scoping rules. Another important piece of the jigsaw, however, is some way to refer to fields that may belong to multiple datatypes. For example, we would like to be able to write a function that selects the personId field from any type that has such a field, rather than being restricted to a single datatype. Much of the unavoidable complexity of the previous OverloadedRecordFields design came from treating all record selectors in an overloaded way.

But since this is new functionality, it can use a new syntax, tentatively a prefix # sign (meaning that use of # as an operator will require a space afterwards when the extension is enabled). This means that it will be possible to write #personId for the overloaded selector function. Since we have a syntactic cue, it is easy to identify such overloaded uses of selector functions, without looking at the field names that are in scope.

Typeclasses and type families will be used to implement polymorphism over fields belonging to record types, though the details are beyond the scope of this blog post. For example, the following definition is polymorphic over all types r that have a personId :: Int field:

getId :: r { personId :: Int } => r -> Int
getId x = #personId x

Moreover, we are not limited to using #personId as a selector function. The same syntax can also be given additional interpretations, allowing overloaded updates and making it possible to produce lenses for fields without needing Template Haskell. In fact, the syntax is potentially useful for things that have nothing to do with records, so it will be available as a separate extension (implied by, but distinct from, OverloadedRecordFields).
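As a rough illustration of what the constraint r { personId :: Int } amounts to, here is a hand-rolled, monomorphic approximation (the real extension generates this kind of machinery itself, using type families for the field’s type; the class and field names below are invented):

```haskell
-- A hypothetical stand-in for the constraint r { personId :: Int }:
-- any type with an instance can have its personId extracted.
class HasPersonId r where
  getPersonId :: r -> Int

data Person  = Person  { _personId :: Int, _name    :: String }
data Address = Address { _addrId   :: Int, _address :: String }

instance HasPersonId Person  where getPersonId = _personId
instance HasPersonId Address where getPersonId = _addrId

-- Polymorphic over every record type with a personId field
getId :: HasPersonId r => r -> Int
getId = getPersonId

main :: IO ()
main = print (getId (Person 1 "Donald"), getId (Address 2 "Elm St"))
```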

Further reading

More details of the redesigned extensions are available on the GHC wiki, along with implementation notes for GHC hackers. Last year, I gave a talk about the previous design which is still a good guide to how the types work under the hood, even though it predates the redesign.

by adam at March 30, 2015 01:39 PM

Ian Ross

C2HS 0.25.1 "Snowmelt"

C2HS 0.25.1 "Snowmelt"

I took over the day-to-day support for C2HS about 18 months ago and have now finally cleaned up all the issues on the GitHub issue tracker. It took a lot longer than I was expecting, mostly due to pesky “real work” getting in the way. Now seems like a good time to announce the 0.25.1 “Snowmelt” release of C2HS and to summarise some of the more interesting new C2HS features.

Regression suite and Travis testing

When I first started working on C2HS, I kept breaking things and getting emails letting me know that such-and-such a package no longer worked. That got boring pretty quickly, so I wrote a Shelly-driven regression suite to build a range of packages that use C2HS to check for breakages. This now runs on Travis CI so that whenever a C2HS change is pushed to GitHub, as well as the main C2HS test suite, a bunch of C2HS-dependent packages are built. This has been pretty handy for avoiding some stupid mistakes.

Enum handling

Thanks to work contributed by Philipp Balzarek, the treatment of the mapping between C enum values and Haskell Enum types is now much better than it was. The C enum/Haskell Enum association is kind of an awkward fit, since the C and Haskell worlds make really quite different assumptions about what an “enumerated” type is, and the coincidence of names is less meaningful than you might hope. We might have to do some more work on that in the future: I’ve been thinking about whether it would be good to have a CEnum class in Foreign.C.Types to capture just the features of C enums that can be mapped to Haskell types in a sensible way.
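To make the mismatch concrete, here is a hand-written sketch (not actual c2hs output) of the kind of Enum instance a C enum with non-contiguous values forces on you; the Status type and its values are invented for illustration:

```haskell
-- C enums may assign arbitrary, non-contiguous values, which fits
-- Haskell's Enum class (built around succ/pred and dense ranges) awkwardly.
data Status = Ok | Warning | Failure deriving Show

instance Enum Status where
  fromEnum Ok      = 0
  fromEnum Warning = 10    -- gaps are perfectly legal in C
  fromEnum Failure = 255
  toEnum 0   = Ok
  toEnum 10  = Warning
  toEnum 255 = Failure
  toEnum n   = error ("Status.toEnum: unmatched value " ++ show n)

main :: IO ()
main = print (map fromEnum [Ok, Warning, Failure])
```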

Finalizers for foreign pointers

You can now say things like:

#include <stdio.h>

{#pointer *FILE as File foreign finalizer fclose newtype#}

{#fun fopen as ^ {`String', `String'} -> `File'#}
{#fun fileno as ^ {`File'} -> `Int'#}

main :: IO ()
main = do
  f <- fopen "tst.txt" "w"

and the file handle f will be cleaned up by a call to fclose via the Haskell garbage collector. This encapsulates a very common use case for handling pointers to C structures allocated by library functions. Previously there was no direct way to associate finalizers with foreign pointers in C2HS, but now it’s easy.
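Under the hood this rests on the standard FFI ForeignPtr machinery. A self-contained sketch of the same idea, using finalizerFree in place of fclose so that no C header is needed:

```haskell
-- Attach a finalizer to a raw pointer: free runs when the ForeignPtr
-- is garbage collected, just as fclose does for the c2hs-generated File.
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke, sizeOf)

main :: IO ()
main = do
  p  <- mallocBytes (sizeOf (undefined :: CInt)) :: IO (Ptr CInt)
  fp <- newForeignPtr finalizerFree p   -- free is called by the GC
  withForeignPtr fp $ \q -> do
    poke q 42
    peek q >>= print
```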

Easy access to preprocessor constants

C2HS has a new const hook for directly accessing the value of C preprocessor constants – you can just say {#const FOO#} to use the value of a constant FOO defined in a C header in Haskell code.

Special case argument marshalling

I’ve implemented a couple of special mechanisms for argument marshalling that were requested. The first of these is a little esoteric, but an example should make it clear. A common pattern in some C libraries is to have code that looks like this:

typedef struct {
  int a;
  float b;
  char dummy;
} oid;

void func(oid *obj, int aval, float bval);
int oid_a(oid *obj);
float oid_b(oid *obj);

Here the function func takes a pointer to an oid structure and fills in the values in the structure, and the other functions take oid pointers and do various things with them. Dealing with functions like func through the Haskell FFI is tedious because you need to allocate space for an oid structure, marshall a pointer to the allocated space and so on. Now though, the C2HS code

{#pointer *oid as Oid foreign newtype#}

{#fun func as ^ {+, `Int', `Float'} -> `Oid'#}

generates Haskell code like this:

newtype Oid = Oid (ForeignPtr Oid)
withOid :: Oid -> (Ptr Oid -> IO b) -> IO b
withOid (Oid fptr) = withForeignPtr fptr

func :: Int -> Float -> IO Oid
func a2 a3 =
  mallocForeignPtrBytes 12 >>= \a1'' -> withForeignPtr a1'' $ \a1' ->
  let {a2' = fromIntegral a2} in
  let {a3' = realToFrac a3} in
  func'_ a1' a2' a3' >>
  return (Oid a1'')

This allocates the right amount of space using the fast mallocForeignPtrBytes function and deals with all the marshalling for you. The special + parameter in the C2HS function hook definition triggers this (admittedly rather specialised) case.

The second kind of “special” argument marshalling is more general. A lot of C libraries include functions where small structures are passed “bare”, i.e. not as pointers. The Haskell FFI doesn’t include a means to marshal arguments of this type, which makes using libraries of this kind painful, with a lot of boilerplate marshalling code needed (just the kind of thing C2HS is supposed to eliminate!). The solution I came up with for C2HS is to add an argument annotation for function hooks that says that a structure pointer should really be passed as a bare structure. In such cases, C2HS then generates an additional C wrapper function to marshal between structure pointer and bare structure arguments. An example will make this clear. Suppose you have some code in a C header:

typedef struct {
  int x;
  int y;
} coord_t;

coord_t *make_coord(int x, int y);
void free_coord(coord_t *coord);
int coord_x(coord_t c, int dummy);

Here, the coord_x function takes a bare coord_t structure as a parameter. To bind to these functions in C2HS code, we write this:

{#pointer *coord_t as CoordPtr foreign finalizer free_coord newtype#}

{#fun pure make_coord as makeCoord {`Int', `Int'} -> `CoordPtr'#}
{#fun pure coord_x as coordX {%`CoordPtr', `Int'} -> `Int'#}

Here, the % annotation on the CoordPtr argument to the coordX function hook tells C2HS that this argument needs to be marshalled as a bare structure. C2HS then generates Haskell code as usual, but also an extra .chs.c file containing wrapper functions. This C code needs to be compiled and linked to the Haskell code.

This is kind of new and isn’t yet really supported by released versions of Cabal. I’ve made some Cabal changes to support this, which have been merged and will hopefully go into the next or next but one Cabal release. When that’s done, the handling of the C wrapper code will be transparent – Cabal will know that C2HS has generated these extra C files and will add them to the “C sources” list for whatever it’s building.

Binding to variadic C functions

Previously, variadic C functions weren’t supported in C2HS at all. Now though, you can do fun things like this:

#include <stdio.h>

{#fun variadic printf[int] as printi {`String', `Int'} -> `()'#}
{#fun variadic printf[int, int] as printi2 {`String', `Int', `Int'} -> `()'#}
{#fun variadic printf[const char *] as prints {`String', `String'} -> `()'#}

You need to give distinct names for the Haskell functions to be bound to different calling sequences of the underlying C function, and because there’s no other way of finding them out, you need to specify explicit types for the arguments you want to pass in the place of C’s ... variadic argument container (that’s what the C types in the square brackets are). Once you do that, you can call printf and friends to your heart’s content. (The user who wanted this feature wanted to use it for calling Unix ioctl…)

User-defined default marshallers

A big benefit of C2HS is that it tries quite hard to manage the associations between C and Haskell types and the marshalling of arguments between C and Haskell. To that end, we have a lot of default marshallers that allow you very quickly to write FFI bindings. However, we can’t cover every case. There were a few long-standing issues (imported from the original Trac issue tracker when I moved the project to GitHub) asking for default marshalling for various C standard or “standardish” typedefs. I held off on trying to fix those problems for a long time, mostly because I thought that fixing them one at a time as special cases would be a little futile and would just devolve into endless additions of “just one more” case.

In the end, I implemented a general scheme to allow users to explicitly associate C typedef names with Haskell types and to define default marshallers between them. As an example, using this facility, you can write code to marshal Haskell String values to and from C wide character strings like this:

#include <wchar.h>

{#typedef wchar_t CWchar#}
{#default in `String' [wchar_t *] withCWString* #}
{#default out `String' [wchar_t *] peekCWString* #}
{#fun wcscmp {`String', `String'} -> `Int'#}
{#fun wcscat {`String', `String'} -> `String'#}

I think that’s kind of fun…


As well as the features described above, there’s a lot more that’s been done over the last 18 months: better handling of structure tags and typedefs; better cross-platform support (OS X, FreeBSD and Windows); lots more default marshallers; support for parameterised pointer types; some vague gestures in the direction of “backwards compatibility” (basically just a C2HS_MIN_VERSION macro); and just in the last couple of days, some changes to deal with marshalling of C bool values (really C99 _Bool) which aren’t supported directly by the Haskell FFI (so again require some wrapper code and some other tricks).


As well as myself and Manuel Chakravarty, the original author of C2HS, the following people have contributed to C2HS development over the last 18 months (real names where known, GitHub handles otherwise):

  • Anton Dessiatov
  • Boyun Tang
  • Cindy Wang
  • Dimitri Sabadie
  • Facundo Dominguez
  • Index Int
  • j-keck
  • Kai Harries
  • Merijn Verstraaten
  • Michael Steele
  • Philipp Balzarek
  • RyanGlScott
  • Sivert Berg
  • watashi
  • Zejun Wu

Many thanks to all of them, and many thanks also to Benedikt Huber, who maintains the language-c package on which C2HS is critically dependent!

What next?

All of the work I’ve done on C2HS has been driven purely by user demand, based on issues I imported from the original Trac issue tracker and then on things that people have asked for on GitHub. (Think of it as a sort of call-by-need exploration of the C2HS design space.) I’m now anticipating that since I’ve raised my head above the parapet by touting all these shiny new features, I can expect a new stream of bug reports to come in…

One potential remaining large task is to “sort out” the Haskell C language libraries, of which there are now at least three, all with different pros and cons. The language-c library used in C2HS has some analysis capabilities that aren’t present in the other libraries, but the other libraries (notably Geoffrey Mainland’s language-c-quote and Manuel’s language-c-inline) support more recent dialects of C. Many of the issues with C2HS on OS X stem from modern C features that occur in some of the OS X headers that the language-c package just doesn’t recognise. Using one of the other C language packages might alleviate some of those problems. To do that though, some unholy mushing-together of language-c and one of these other packages has to happen, in order to bring the analysis capabilities of language-c to the other package. That doesn’t look like much fun at all, so I might ignore the problem and hope it goes away.

I guess longer term the question is whether tools like C2HS really have a future. There are better approaches to FFI programming being developed by research groups (Manuel’s is one of them: this talk is pretty interesting) so maybe we should just wait until they’re ready for prime time. On the other hand, quite a lot of people seem to use C2HS, and it is pretty convenient.

One C2HS design decision I’ve recently had to modify a little is that C2HS tries to use only information available via the “official” Haskell FFI. Unfortunately, there are situations where that just isn’t enough. The recent changes to marshal C99 _Bool values are a case in point. In order to determine offsets into structures containing _Bool members, you need to know how big a _Bool is. Types that are marshalled by the Haskell FFI are all instances of Storable, so you can just use the sizeOf method from Storable for this. However, the Haskell FFI doesn’t know anything about _Bool, so you end up having to “query” the C compiler for the information by generating a little C test program that you compile and run. (You can find out which C compiler to use from the output of ghc --info, which C2HS thus needs to run first.) This is all pretty nasty, but there’s no obvious other way to do it.
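For FFI-marshalled types the sizes really are just a Storable method away, which is what makes _Bool the odd one out:

```haskell
-- Every type the Haskell FFI can marshal is Storable, so its size
-- (and alignment) is available directly, with no C compiler involved.
import Foreign.C.Types (CDouble, CInt)
import Foreign.Storable (sizeOf)

main :: IO ()
main = do
  putStrLn ("sizeOf CInt    = " ++ show (sizeOf (undefined :: CInt)))
  putStrLn ("sizeOf CDouble = " ++ show (sizeOf (undefined :: CDouble)))
```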

This makes me think, since I’m having to do this anyway, that it might be worth reorganising some of C2HS’s structure member offset calculation code to use the same sort of “query the C compiler” approach. There are some cases (e.g. structures within structures) where it’s just not possible to reliably calculate structure member offsets from the size and alignment information available through the Haskell FFI – the C compiler is free to insert padding between structure members, and you can’t work out just by looking when a particular compiler is going to do that. Generating little C test programs and compiling and running them allows you to get the relevant information “straight from the horse’s mouth”… (I don’t know whether this idea really has legs, but it’s one thing I’m thinking about.)
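A minimal sketch of the probing idea (the compiler name cc and the file names are assumptions; real code would use temporary files and the compiler reported by ghc --info):

```haskell
-- "Query the C compiler": emit a tiny C program that prints
-- sizeof(_Bool), compile it, run it, and read the answer back.
import System.Process (callProcess, readProcess)

main :: IO ()
main = do
  writeFile "probe.c" $ unlines
    [ "#include <stdio.h>"
    , "int main(void) { printf(\"%zu\\n\", sizeof(_Bool)); return 0; }"
    ]
  callProcess "cc" ["-o", "probe", "probe.c"]
  out <- readProcess "./probe" [] ""
  putStr ("sizeof(_Bool) = " ++ out)
```

The same trick generalizes to structure member offsets by printing offsetof(...) values instead.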


March 30, 2015 09:43 AM

FP Complete

Announcing: open sourcing of ide-backend

After many years of development, FP Complete is very happy and proud to announce the open sourcing of ide-backend. ide-backend has served as the basis for our development of School of Haskell and FP Haskell Center, by providing a high level, easy to use, and robust wrapper around the GHC API. We'd like to express our thanks to Duncan Coutts, Edsko de Vries, and Mikolaj Konarski for implementing this library on our behalf.

ide-backend provides a means to do a variety of tasks involving GHC, such as:

  • Compile code
  • Get compile error messages
  • Submit updated code for recompilation
  • Extract type information
  • Find usage locations
  • Run generated bytecode
  • Produce optimized executables

For much more information, you can see the Haddock documentation.

Members of the Commercial Haskell Special Interest Group have encouraged us to open source more of our work, to help them build more tools useful to real-world developers. We're happy to contribute.

ide-backend opens the possibility for many new and interesting tools. To give some ideas:

  • A basis for providing a fast development web server while working on a code base. The idea here is a generalized yesod devel, which compiles and runs your web application on each change.
  • Edward Kmett has a project idea of using ide-backend to extract and organize type information from a large number of packages
  • Editor plugins can be improved, simplified, and begin to share much more code than they do today
  • Lightweight tools for inspecting code
  • Refactoring tools

I've shared information about this repository with some maintainers of existing tools in the Haskell world already, and hopefully now, with the complete move to open source, we can get a much broader discussion going.

But today's release isn't just a code release; we also have demos! Edsko and Chris have been collaborating on some next-generation editor plugins, and have put together ide-backend-client with support for both Emacs and Atom. Chris has put together a screencast of his Emacs integration.

We also have an early prototype tool at FP Complete for inspecting a code base and getting type information, based on ide-backend, GHCJS, and React.


Where we go from here

Open sourcing this library is just the first step.

  • Duncan is planning on writing a blog post describing the architecture employed by this library.
  • Edsko's ide-backend-client project is a great place to continue making contributions and playing with new ideas.
  • FP Complete intends to release more of our ide-backend based tooling in the future as it matures.
  • I've asked for a GSoC proposal on a better development web server, and I'm sure other proposals would be great as well.
  • FP Complete and Well Typed are both currently maintaining this library, and we are happy to have new people join the team.

I'm excited to hear everyone's thoughts on this library, and look forward to seeing some awesome tools appear.

March 30, 2015 04:00 AM

March 29, 2015

Functional Jobs

Full-stack Software Engineer at Capital Match (Full-time)

Capital Match is a Singapore-based marketplace lending platform that enables SMEs to secure loans from private investors. Backed by a number of VCs and banking industry leaders, we started operations at the beginning of 2015 and are now embarking on the next stage of tech development.

We are looking for a software engineer interested in all aspects of the creation, growth and operations of a secure web-based platform: Front-end design and implementation, front-to-back features development, distributed deployment and automation in the cloud, build automation...

Our platform is primarily developed in Haskell with a ClojureScript frontend. A candidate with a good command of the functional programming paradigm and real web programming experience, even if not primarily in Haskell or ClojureScript, is welcome to apply. For the platform side of things, we want to keep things as simple as possible, and learning to use both languages is relatively easy.

Experience in web-based software development is essential, and a minimum of exposure to and understanding of XP practices (TDD, CI, emergent design, refactoring, peer review and pair programming, continuous improvement) is expected. These are standard practices within the team. We therefore favour candidates who value simplicity and respect, who accept and give feedback, and who are generally team players.

Proposed salary range is USD 3000-4000 / month, depending on profile. We offer up to 1% equity in stock-options, subject to usual vesting scheme.

Our team is distributed, so the ability to work comfortably in a distributed setting is essential. However, please note this position is to be filled in Singapore, so if you are not currently living there, relocation is necessary in the short term. Visa sponsorship will be provided. European citizens who relocate to Singapore do not have to pay their home country taxes, and the local tax rate in Singapore is around 2% (effective on the proposed salary range). Singapore is a great place to live, a vibrant city with diverse cultures, a very strong financial sector and a central position in Southeast Asia.

Get information on how to apply for this position.

March 29, 2015 11:42 AM

March 28, 2015

Paul Johnson

Google Maps on Android demands I let Google track me

I recently upgraded to Android 5.1 on my Nexus 10. One app I often use is Google Maps. This has a "show my location" button:
When I clicked on this I got the following dialog box:

Notice that I have two options: I either agree to let Google track me, or I cancel the request. There is no "just show my location" option.

As a matter of principle, I don't want Google to be tracking me. I'm aware that Google can offer me all sorts of useful services if I just let it know every little detail of my life, but I prefer to do without them. But now it seems that zooming in on my GPS-derived location has been added to the list of features I can't have. There is no technical reason for this; it didn't used to be the case. But Google has decided that as the price for looking at the map of where I am, I now have to tell them where I am all the time.

I'm aware that of course my cellphone company knows roughly where I am and who I talk to, and my ISP knows which websites I visit and can see my email (although unlike GMail I don't think they derive any information about me from the contents), and of course Google knows what I search for. But I can at least keep that information compartmentalised in different companies. I suspect that the power of personal data increases non-linearly with the volume and scope, so having one company know where I am and another company read my email means less loss of privacy than putting both location and email in the same pot.

 Hey, Google, stop being evil!

by Paul Johnson ([email protected]) at March 28, 2015 02:59 PM

Joachim Breitner

An academic birthday present

Yesterday, which happened to be my 30th birthday, a small package got delivered to my office: The printed proceedings of last year's “Trends in Functional Programming” conference, where I published a paper on Call Arity (preprint). Although I doubt the usefulness of printed proceedings, it was a nicely timed birthday present.

Looking at the rather short table of contents – only 8 papers, after 27 presented and 22 submitted – I thought that this might mean that, with some luck, I might have a chance of winning the “Best student paper award”, which I presumed would be announced at the next iteration of the conference.

For no particular reason I was leisurely browsing through the book, and started to read the preface. And what do I read there?

Among the papers selected for these proceedings, two papers stood out. The award for Best Student Paper went to Joachim Breitner for his paper entitled Call Arity, and the award for Best Paper Overall went to Edwin Brady for his paper entitled Resource-dependent Algebraic Effects. Congratulations!

Now, that is a really nice birthday present! Not sure if I even would have found out about it, had I not thrown a quick glance at page V...

I hope that it is a good omen for my related ICFP'15 submission.

by Joachim Breitner ([email protected]) at March 28, 2015 01:13 PM

Gabriel Gonzalez

Algebraic side effects

Haskell differentiates itself from most other functional languages by letting you reason mathematically about programs with side effects. This post begins with a pure Haskell example that obeys algebraic equations and then generalizes the example to impure code that still obeys the same equations.


In school, you probably learned algebraic rules like this one:

f * (xs + ys) = (f * xs) + (f * ys)

Now let's make the following substitutions:

  • Replace the mathematical multiplication with Haskell's map function
  • Replace the mathematical addition with Haskell's ++ operator

These two substitutions produce the following Haskell equation:

map f (xs ++ ys) = (map f xs) ++ (map f ys)

In other words, if you concatenate the list xs with the list ys and then map a function named f over the combined list, the result is indistinguishable from mapping f over each list individually and then concatenating them.

Let's test this equation out using the Haskell REPL:

>>> map (+ 1) ([2, 3] ++ [4, 5])
[3,4,5,6]
>>> (map (+ 1) [2, 3]) ++ (map (+ 1) [4, 5])
[3,4,5,6]
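The same law can also be checked with a tiny standalone program; `lawHolds` and the sample inputs are illustrative helpers of my own, not anything from the post:

```haskell
-- Check the distributive law for a given f, xs and ys. The inputs used
-- in main are arbitrary choices.
lawHolds :: Eq b => (a -> b) -> [a] -> [a] -> Bool
lawHolds f xs ys = map f (xs ++ ys) == (map f xs ++ map f ys)

main :: IO ()
main = print (lawHolds (+ 1) [2, 3] [4, 5] && lawHolds negate [] [7])
```

Running it prints True; a property-based test (e.g. with QuickCheck) would generalize the same check over random inputs.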

Evaluation order

However, the above equation does not hold in most other languages. These other languages use function evaluation to trigger side effects, and therefore if you change the order of evaluation you change the order of side effects.

Let's use Scala to illustrate this. Given the following definitions:

>>> def xs() = { print("!"); Seq(1, 2) }
>>> def ys() = { print("?"); Seq(3, 4) }
>>> def f(x : Int) = { print("*"); x + 1 }

.. the order of side effects differs depending on whether I concatenate or map first:

>>> (xs() ++ ys()).map(f)
!?****res0: Seq[Int] = List(2, 3, 4, 5)
>>> (xs().map(f)) ++ (ys().map(f))
!**?**res1: Seq[Int] = List(2, 3, 4, 5)

On line #1, the two lists are evaluated first, printing "!" and "?", followed by evaluating the function f on all four elements, printing "*" four times. On line #2, we call f on each element of xs before beginning to evaluate ys. Since evaluation order matters in Scala, we get two different programs which print the punctuation characters in different orders.

The solution

Haskell, on the other hand, strictly separates evaluation order from side effect order using a two-phase system. In the first phase you evaluate your program to build an abstract syntax tree of side effects. In the second phase the Haskell runtime executes the tree by interpreting it for its side effects. This phase distinction ensures that algebraic laws continue to hold even in the presence of side effects.
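As a minimal sketch of this phase distinction (the Effect type and the interpreters below are invented for illustration; GHC's real machinery is far richer):

```haskell
-- Phase 1: pure evaluation builds a description of the side effects.
data Effect = Print Char Effect | Halt

program :: Effect
program = Print '!' (Print '?' Halt)

-- Phase 2: an interpreter walks the tree, performing the effects.
run :: Effect -> IO ()
run (Print c rest) = putChar c >> run rest
run Halt           = return ()

-- Because phase 1 is pure, the same tree can also be interpreted
-- purely, e.g. by collecting the output as a string.
render :: Effect -> String
render (Print c rest) = c : render rest
render Halt           = ""

main :: IO ()
main = run program >> putStrLn ("\n" ++ render program)
```

Reordering pure evaluation can never reorder the '!' and '?', because the effects only happen when the finished tree is interpreted.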

To illustrate this, we'll generalize our original Haskell code to interleave side effects with list elements and show that it still obeys the same algebraic properties as before. The only difference from before is that we will:

  • Generalize pure lists to their impure analog, ListT
  • Generalize functions to impure functions that wrap side effects with lift
  • Generalize (++) (list concatenation) to (<|>) (ListT concatenation)
  • Generalize map to (=<<), which streams elements through an impure function

This means that our new equation becomes:

-- f  *  (xs  +  ys) = (f  *  xs)  +  (f  *  ys)
   f =<< (xs <|> ys) = (f =<< xs) <|> (f =<< ys)

You can read this as saying: if we concatenate xs and ys and then stream their values through the impure function f, the behavior is indistinguishable from streaming each individual list through f first and then concatenating them.

Let's test this equation out with some sample definitions for xs, ys, and f that mirror their Scala analogs:

>>> import Control.Applicative
>>> import Pipes
>>> let xs = do { lift (putChar '!'); return 1 <|> return 2 }
>>> let ys = do { lift (putChar '?'); return 3 <|> return 4 }
>>> let f x = do { lift (putChar '*'); return (x + 1) }
>>> runListT (f =<< (xs <|> ys))  -- Note: `runListT` discards the result
!**?**
>>> runListT ((f =<< xs) <|> (f =<< ys))
!**?**

The resulting punctuation order is identical. Many people mistakenly believe that Haskell's mathematical elegance breaks down when confronted with side effects, but nothing could be further from the truth.


Haskell preserves algebraic equations even in the presence of side effects, which simplifies reasoning about impure code. Haskell separates evaluation order from side effect order so that you spend less time reasoning about evaluation order and more time reasoning about your program logic.

by Gabriel Gonzalez ([email protected]) at March 28, 2015 12:21 AM

March 27, 2015

Jan Stolarek

The basics of coinduction

I don’t remember when I first heard the terms “coinduction” and “corecursion” but it must have been quite long ago. I had this impression that they are yet another of these difficult theoretical concepts and that I should learn about them one day. That “one day” happened recently while reading chapter 5 of “Certified Programming with Dependent Types”. It turns out that basics of coinduction are actually quite simple. In this post I’ll share with you what I already know on the subject.

Recursion in Haskell

Let’s begin by looking at Haskell because it is a good example of a language that does not formalize coinduction in any way. Two features of Haskell are of interest to us. The first is laziness. Thanks to Haskell being lazy we can write definitions like these (in GHCi):

ghci> let ones = 1 : ones
ghci> let fib = zipWith (+) (1:fib) (1:1:fib)

ones is – as the name implies – an infinite sequence (list) of ones. fib is a sequence of Fibonacci numbers. Both these definitions produce infinite lists but we can use these definitions safely because laziness allows us to force a finite number of elements in the sequence:

ghci> take 5 ones
ghci> take 10 fib

Now consider this definition:

ghci> let inf = 1 + inf

No matter how hard we try there is no way to use the definition of inf in a safe way. It always causes an infinite loop:

ghci> (0 /= inf)
*** Exception: <<loop>>

The difference between the definitions of ones and fib and the definition of inf is that the former use something called guarded recursion. The term guarded comes from the fact that the recursive reference to self is hidden under a datatype constructor (or: guarded by a constructor). The way lazy evaluation is implemented gives a guarantee that we can stop the recursion by not evaluating the recursive constructor argument. This kind of infinite recursion can also be called productive recursion, which means that although the recursion is infinite, each recursive call is guaranteed to produce something (in my examples either a 1 or the next Fibonacci number). By contrast, the recursion in the definition of inf is not guarded or productive in any way.
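The same contrast can be shown with two more small Haskell definitions (my own examples, in the same spirit as ones and inf):

```haskell
-- Guarded/productive: the recursive call sits under the (:) constructor,
-- so forcing any single cell does only a finite amount of work.
nats :: [Integer]
nats = 0 : map (+ 1) nats

-- Unguarded: no constructor shields the recursive call, so evaluating
-- 'loop' diverges, just like 'inf' above.
loop :: Integer
loop = 1 + loop

main :: IO ()
main = print (take 5 nats)
```

Here take 5 nats yields [0,1,2,3,4], while asking for loop instead would never return.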

Haskell happily accepts the definition of inf even though it is completely useless. When we write Haskell programs we of course don’t want them to fall into silly infinite loops, but the only tool we have to prevent us from writing such code is our intelligence. The situation changes when it comes to…

Dependently-typed programming languages

These languages deeply care about termination. By “termination” I mean ensuring that a program written by the user is guaranteed to terminate for any input. I am aware of two reasons why these languages care about termination. The first reason is theoretical: without termination the resulting language is inconsistent as a logic. This happens because a non-terminating term can prove any proposition. Consider this non-terminating Coq definition:

Fixpoint evil (A : Prop) : A := evil A.

If that definition were accepted we could use it to prove any proposition. Recall that when it comes to viewing types as propositions and programs as evidence, “proving a proposition” means constructing a term of a given type. evil would allow us to construct a term inhabiting any type A. (Prop is the kind of logical propositions, so A is a type.) Since dependently-typed languages aim to be consistent logics, they must reject non-terminating programs. The second reason for checking termination is practical: dependently typed languages admit functions in type signatures. If we allowed non-terminating functions then typechecking would also become non-terminating, and again this is something we don’t want. (Note that Haskell gives you UndecidableInstances, which can cause typechecking to fall into an infinite loop.)

Now, if you paid attention in your Theoretical Computer Science classes, all of this should ring a bell: the halting problem! The halting problem says that the problem of determining whether a given Turing machine (read: a given computer program) will ever terminate is undecidable. So how is it possible that languages like Agda, Coq or Idris can answer that question? That’s simple: they are not Turing-complete (or at least their terminating subsets are not Turing-complete). They prohibit the user from using some constructs, probably the most important one being general recursion. Think of general recursion as any kind of recursion imaginable. Dependently typed languages require structural recursion on subterms of the arguments. That means that if a function receives an argument of an inductive data type (think: algebraic data type/generalized algebraic data type) then you can only make recursive calls on terms that are syntactic subcomponents of the argument. Consider this definition of map in Idris:

map : (a -> b) -> List a -> List b
map f []      = []
map f (x::xs) = f x :: map f xs

In the second equation we use pattern matching to deconstruct the list argument. The recursive call is made on xs, which is structurally smaller than the original argument. This guarantees that any call to map will terminate. There is a silent assumption here that the List a argument passed to map is finite, but with the rules given so far it is not possible to construct an infinite list.

So we just eliminated non-termination by limiting what can be done with recursion. This means that our Haskell definitions of ones and fib would not be accepted in a dependently-typed language because they don’t recurse on an argument that gets smaller and as a result they construct an infinite data structure. Does that mean we are stuck with having only finite data structures? Luckily, no.

Coinduction to the rescue

Coinduction provides a way of defining and operating on infinite data structures as long as we can prove that our operations are safe, that is they are guarded and productive. In what follows I will use Coq because it seems that it has better support for coinduction than Agda or Idris (and if I’m wrong here please correct me).

Coq, Agda and Idris all require that a datatype that can contain infinite values has a special declaration. Coq uses CoInductive keyword instead of Inductive keyword used for standard inductive data types. In a similar fashion Idris uses codata instead of data, while Agda requires ∞ annotation on a coinductive constructor argument.

Let’s define a type of infinite nat streams in Coq:

CoInductive stream : Set :=
| Cons : nat -> stream -> stream.

I could have defined a polymorphic stream but for the purpose of this post stream of nats will do. I could have also defined a Nil constructor to allow finite coinductive streams – declaring data as coinductive means it can have infinite values, not that it must have infinite values.

Now that we have infinite streams let’s revisit our examples from Haskell: ones and fib. ones is simple:

CoFixpoint ones : stream := Cons 1 ones.

We just had to use CoFixpoint keyword to tell Coq that our definition will be corecursive and it is happily accepted even though a similar recursive definition (ie. using Fixpoint keyword) would be rejected. Allow me to quote directly from CPDT:

whereas recursive definitions were necessary to use values of recursive inductive types effectively, here we find that we need co-recursive definitions to build values of co-inductive types effectively.

That one sentence pins down an important difference between induction and coinduction.

Now let’s define zipWith and try our second example fib:

CoFixpoint zipWith (f : nat -> nat -> nat) (a : stream)
                   (b : stream) : stream :=
  match a, b with
    | Cons x xs, Cons y ys => Cons (f x y) (zipWith f xs ys)
  end.

CoFixpoint fib : stream :=
   zipWith plus (Cons 1 fib) (Cons 1 (Cons 1 fib)).

Unfortunately this definition is rejected by Coq due to “unguarded recursive call”. What exactly goes wrong? Coq requires that all recursive calls in a corecursive definition are:

  1. direct arguments to a data constructor
  2. not inside function arguments

Our definition of fib violates the second condition – both recursive calls to fib are hidden inside arguments to zipWith function. Why does Coq enforce such a restriction? Consider this simple example:

Definition tl (s : stream) : stream :=
  match s with
    | Cons _ tl' => tl'
  end.

CoFixpoint bad : stream := tl (Cons 1 bad).

tl is a standard tail function that discards the first element of a stream and returns its tail. Just like our definition of fib, the definition of bad places the corecursive call inside a function argument. I hope it is easy to see that accepting the definition of bad would lead to non-termination – inlining the definition of tl and simplifying leads us to:

CoFixpoint bad : stream := bad.

and that is bad. You might be thinking that the definition of bad really has no chance of working, whereas our definition of fib could in fact be run safely without the risk of non-termination. So how do we persuade Coq that our corecursive definition of fib is in fact valid? Unfortunately, there seems to be no simple answer. What was meant to be a simple exercise in coinduction turned out to be a real research problem. This past Monday I spent well over an hour with my friend staring at the code and trying to come up with a solution. We didn’t find one, but instead we found a really nice paper “Using Structural Recursion for Corecursion” by Yves Bertot and Ekaterina Komendantskaya. The paper presents a way of converting definitions like fib to a guarded and productive form accepted by Coq. Unfortunately the converted definition loses the linear computational complexity of the original definition, so the conversion method is far from perfect. I encourage you to read the paper. It is not long and is written in a very accessible way. Another set of possible solutions is given in chapter 7 of CPDT, but I am very far from labelling them as “accessible”.

I hope this post demonstrates that the basic ideas behind coinduction are actually quite simple. For me the whole subject of coinduction looks really fascinating and I plan to dive deeper into it. I already have my eyes set on several research papers about coinduction, so there’s a good chance that I’ll write more about it in future posts.

by Jan Stolarek at March 27, 2015 03:21 PM


Qualified Goals in the Cabal Solver

When you ask cabal-install to install one or more packages, it needs to solve a constraint satisfaction problem: select a version for each of the packages you want to install, plus all their dependencies, such that all version constraints are satisfied. Those version constraints come both from the user (“please install lens version 4”) and from the packages themselves (“this package relies on mtl between version 2.1 and 2.2”). Internally cabal-install uses a modular constraint solver, written by Andres Löh. It was first presented at the Haskell Implementors’ Workshop in 2011, and is based on the paper Modular Lazy Search for Constraint Satisfaction Problems.

For the Industrial Haskell Group, Well-Typed has recently extended the constraint solver to deal with qualified goals. In this blog post we will explain how this constraint solver works by means of a running example, and look at how the new qualified goals change things.

Overview of the Solver

The ability to decouple generating solutions from selecting the right one is one of the classical reasons why functional programming matters, and cabal-install’s constraint solver makes very heavy use of this. It first builds a tree with lots of “solutions”; solutions in quotes because many of these solutions will not be valid. It then validates these solutions, marking any invalid ones. This might still leave many possible solutions, so after this we apply some preferences (we prefer newer packages over older ones, for example) and heuristics (we want to pick the version of base that is already installed, no point considering others), and then it uses the first solution it finds (if any).
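A toy illustration of this generate-then-prune pipeline (all names here are made up for the sketch; cabal-install's real types are much richer):

```haskell
-- Lazily enumerate version assignments and return the first one the
-- validation predicate accepts. Thanks to laziness, branches that are
-- never needed are never built.
solve :: [(String, [Int])]          -- package -> candidate versions
      -> ([(String, Int)] -> Bool)  -- validate a complete assignment
      -> Maybe [(String, Int)]
solve pkgs valid = go pkgs []
  where
    go [] acc = if valid acc then Just acc else Nothing
    go ((p, vs) : rest) acc =
      foldr (\v r -> go rest ((p, v) : acc) `orElse` r) Nothing vs
    orElse l@(Just _) _ = l
    orElse Nothing    r = r

-- The running example below: foo requires mtl 2.0.
example :: Maybe [(String, Int)]
example = solve [("mtl", [2, 1]), ("base", [4])]
                (\a -> lookup "mtl" a == Just 2)
```

Here example evaluates to Just [("base",4),("mtl",2)]; the branch for mtl 1.0 is abandoned as soon as validation rejects it.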

It is very important to realize throughout all this that these trees are never built in their entirety. We depend on laziness to only evaluate as much as necessary. A key design principle throughout the solver is that we must have enough information at each node in the tree to be able to make local decisions. Any step that would require looking at the tree in its entirety would be a big no-no.

In the remainder of this section we will see what these different steps look like using a running example.

Building the tree

Suppose we have a package database with base version 4.0, mtl versions 1.0 and 2.0, both of which depend on base, and a package foo that depends on both base and mtl version 2.0.

When we ask cabal to install package foo, it constructs a search tree that looks something like this:

GoalChoice nodes, shown as G nodes in the diagram, represent points where we decide on which package to solve for next. Initially we have only one option: we need to solve for package foo. Similarly, PChoice nodes P represent points where we decide on a package version. Since there is only one version of foo available, we again only have one choice.

Once we have chosen foo version 1.0, we need to solve for foo's dependencies: base and mtl. We don’t know which we should solve for first; the order in which we consider packages may affect how quickly we find a solution, and which solution we return (since we will eventually report the first solution we find). When we build the tree we essentially make an arbitrary decision (depending on which order we happen to find the dependencies), and we record the decision using a GoalChoice node. Later we can traverse the tree and apply local heuristics to these GoalChoice nodes (for instance, we might want to consider base before mtl).

In the remainder of the tree we then pick a version for mtl (here we do have a choice in version), and then a version for base, or the other way around. Note that when we build the tree, package constraints are not yet applied: in the tree so far there is nothing that reflects the fact that foo wants version 2.0 of mtl, and every path ends with a Done node D, indicating success. Indeed, we would get precisely the same tree if we had a package DB where foo depends on either version of mtl.


Validation

Once we have built the tree, we then walk over the tree and verify package constraints. As soon as we detect a violation of a constraint on a given path, we replace that node in the tree with a failure node. For the above example this gives us the following tree:

Paths through the tree that lead to failure are not removed from the tree, but are replaced by explicit failure F. This helps with generating a good error message if we fail to find a solution. In this case, both failures are essentially the same problem: we cannot pick version 1.0 for mtl because foo needs version 2.0.

Heuristics and Preferences

After validation we apply a number of heuristics to the tree. For example, we prefer to pick a version of base early because there is generally only one version of base available in the system. In addition, we apply user preferences; for example, we try newer versions of packages before older versions. For our example this gives us

Finding a solution

After applying the heuristics we throw away all but the first choice in each GoalChoice node (but keeping all choices in the PChoice nodes)

and traverse the tree depth first to find a solution, returning the first solution we find. In our running example, this means that we will use version 1.0 of foo, 4.0 of base and 2.0 of mtl.

Whenever we encounter a Fail node we backtrack. This backtracking is aided by so-called conflict sets. I haven’t shown these conflict sets in the diagrams, but each Fail node in the tree is annotated with a conflict set which records why this path ended in failure. In our running example the conflict set for both Fail nodes is the set {foo, mtl}, recording that there is a conflict between the versions of mtl and the version of foo that we picked. The conflict set is used to guide the backtracking; any choice that we encounter while backtracking that does not involve any variables in the conflict set does not need to be reconsidered, as it would lead to the same failure.
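A rough sketch of conflict-set-guided backtracking (the Tree type and search function are invented for illustration and are much simpler than the solver's real ones):

```haskell
import qualified Data.Set as Set

type Var = String

-- Leaves either succeed or fail, with a failing leaf carrying the set
-- of variables responsible for the failure.
data Tree = Done | Fail (Set.Set Var) | Choice Var [Tree]

-- Returns () on success, or a conflict set on failure. If a subtree
-- fails without mentioning the variable chosen at this node, retrying
-- the remaining alternatives cannot help, so we backjump immediately.
search :: Tree -> Either (Set.Set Var) ()
search Done          = Right ()
search (Fail c)      = Left c
search (Choice v ts) = go Set.empty ts
  where
    go acc []         = Left (Set.insert v acc)
    go acc (t : rest) = case search t of
      Right ()             -> Right ()
      Left c
        | v `Set.member` c -> go (Set.union acc (Set.delete v c)) rest
        | otherwise        -> Left c  -- backjump: skip v's other options
```

For instance, if a branch under a choice for base fails with conflict set {foo, mtl}, the other versions of base are skipped outright.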

If we cannot find any solution, then we must report an error. Reporting a good error here however is difficult: after all, we have a potentially very large tree, with lots of different kinds of failures. Constructing an informative error from this is difficult, and this is one area where cabal-install might still be improved.

Qualified goals


Normally cabal-install can only pick a single version for each package. For example, if we have a situation

we cannot install package D because it would require installing both versions 1.0 and 2.0 of package A (this is known as the diamond problem).

Setup scripts

Cabal packages can however have their own custom Setup scripts, when necessary, which are Cabal’s equivalent of the traditional ./configure scripts. In principle there should be no problem building these Setup scripts using different (and possibly conflicting) dependencies than the library itself; after all, the Setup script is completely independent from the library.

From Cabal 1.23 and up these setup scripts can have their own list of dependencies. Let’s suppose that in our running example the Setup script of foo has a dependency on any version of mtl:

We want to allow the setup script to be compiled against a different version of mtl than foo itself, but of course we would prefer not to, in order to avoid unnecessary additional compilation time.


In order to allow picking a different version, we introduce qualified goals for each of the setup dependencies. In our running example, this means that cabal-install will now try to solve for the variables foo, mtl, and base, as well as foo.setup.mtl and foo.setup.base. This makes it possible to pick one version for mtl and another for foo.setup.mtl.


But how do we make sure that we pick the same version when possible? One (non-) option is to look at the entire search tree and find the solution that installs the smallest number of packages. While that might work in theory, it violates the earlier design principle we started with: we only ever want to make local decisions. The search tree can be quite large; indeed, the addition of the single setup dependency to foo already makes the tree much larger, as we shall see shortly. We certainly never want to inspect the entire search tree (or, worse yet, have the entire search tree in memory).

Instead, we introduce the concept of linking. This means that when we select a version for foo.setup.mtl (say), in addition to being able to pick either version 1.0 or 2.0, we can also say “link the version of foo.setup.mtl to the version of mtl” (provided that we already picked a version for mtl).

Then we can make local decisions: when we pick a version, we prefer to link whenever possible. Of course, this is subject to certain constraints. In the remainder of this section we shall see how qualified goals and linking works using our running example, and identify some of these linking constraints.
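In terms of the choices generated in the tree, the options for a qualified goal might be produced roughly like this (a hypothetical sketch, not the solver's actual representation):

```haskell
data Option = Version Int | LinkTo String
  deriving (Eq, Show)

-- Options for a qualified goal such as foo.setup.mtl: if its base goal
-- (mtl) already has an assignment, offer a link to it first, encoding
-- the "prefer to link whenever possible" heuristic as a local decision.
optionsFor :: String -> [Int] -> [(String, Int)] -> [Option]
optionsFor baseGoal versions assigned =
  [ LinkTo baseGoal | baseGoal `elem` map fst assigned ]
  ++ map Version versions
```

So optionsFor "mtl" [2,1] [("mtl",2)] yields [LinkTo "mtl", Version 2, Version 1], while with no assignment for mtl yet, the link option is simply absent.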

Building the tree

The code to build the initial tree is modified to introduce qualified constraints for setup dependencies, but does not itself deal with linking. Instead, it builds the tree as normal, and we then add additional linking options into the tree as a separate phase.

The search tree for our running example gets much bigger now due to combinatorial explosion: we have two additional variables to solve for, and linking means we have more choices for package versions. Here’s part of the initial search tree:

Let’s follow along the spine of this tree to see what’s going on. We first consider foo and pick version 1.0, just like before (there is no other choice). Then, on this path, we first consider foo.setup.mtl, and we have two options: we can either pick version 1.0 or version 2.0. We pick version 1.0, and continue with foo.setup.base and pick version 4.0 (only one option).

But when we now consider mtl things are more interesting: in addition to picking versions 1.0 and 2.0, we can also decide to link the version of mtl against the version of foo.setup.mtl (indicated by a red label in the tree). Similarly, when we pick a version for base, we can now choose to link base against the version of foo.setup.base in addition to picking version 4.0.


When we link mtl against foo.setup.mtl, we are really saying “please use the exact same package instance for both mtl and foo.setup.mtl”. This means that the dependencies of mtl must also be linked against the dependencies of foo.setup.mtl.

In addition, ghc enforces a so-called single instance restriction. This means that (in a single package database) we can only have one instance of a particular package version. So, for example, we can have both mtl version 1.0 and mtl version 2.0 installed in the same package database, but we cannot have two instances of mtl version 2.0 (for instance, one linked against version 3.0 of transformers and one linked against version 4.0 of transformers) installed at the same time. Lifting this restriction is an important step towards solving Cabal Hell, but for now we have to enforce it. In our terminology, this means that when we have to solve for both (say) mtl and foo.setup.mtl, we can either pick two different versions, or we can link one to the other, but we cannot independently pick the same version for both goals.

So, in addition to the regular validation phase which verifies package constraints, we introduce a second validation phase that verifies these kinds of “linking constraints”. We end up with a tree such as

In this part of the tree, the two failures for mtl are because we picked version 1.0 for foo.setup.mtl, but since foo itself wants mtl version 2.0, we cannot pick version 1.0 for goal mtl nor can we link mtl to foo.setup.mtl. The two failures for base are due to the single instance restriction: since we picked version 4.0 for foo.setup.base, we must link base to foo.setup.base.


If we picked the first solution we found in the tree above, we would select version 1.0 of mtl for foo’s setup script and version 2.0 of mtl for foo itself. While that is not wrong per se, it means we do more work than necessary. So, we add an additional heuristic that says that we should consider setup dependencies after regular (library) dependencies. After we apply this heuristic (as well as all the other heuristics) we end up with

In this part of the tree we see one failure for foo.setup.mtl and two failures for foo.setup.base. The failure for foo.setup.mtl comes from the single instance restriction again: since we picked version 2.0 for mtl, we cannot pick an independent instance for foo.setup.mtl. The failure for foo.setup.base on the right is due to the same reason, but there is an additional reason for the left failure: since we chose to link foo.setup.mtl to mtl, its dependencies (in this case, foo.setup.base) must also be linked.

Finding a solution

As before, after applying the heuristics we prune

and we report the first solution we find. In this case, this means that we will pick version 2.0 for mtl and link foo.setup.mtl to mtl, and foo.setup.base to base.


Although we skimmed over many details, we covered the most important design principles behind the solver and the new implementation of qualified goals. One thing we did not talk about in this blog post is flag assignments; when the solver needs to decide on the value for a flag, it introduces an FChoice node into the tree with two subtrees for true and false, and then proceeds as normal. When we link package P to package Q, we must then also verify that their flag assignments match.
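Schematically, the search tree with its version and flag choices can be sketched as a small Haskell datatype (a deliberate simplification; the real tree in the modular solver carries much more information per node):

```haskell
type PackageGoal = String
type FlagName    = String
type Version     = Int

-- Simplified shape of the solver's search tree.
data Tree a
  = PChoice PackageGoal [(Version, Tree a)]  -- choose a version (or linked instance) for a goal
  | FChoice FlagName (Tree a) (Tree a)       -- choose a flag value: true subtree, false subtree
  | Done a                                   -- a complete, validated solution
  | Fail String                              -- a dead end, with a reason

-- Depth-first enumeration of all solutions in the tree.
solutions :: Tree a -> [a]
solutions (PChoice _ cs)  = concatMap (solutions . snd) cs
solutions (FChoice _ t f) = solutions t ++ solutions f
solutions (Done a)        = [a]
solutions (Fail _)        = []
```

Because solutions is a lazy depth-first traversal, asking for the head of the list explores only as much of the tree as needed to find the first solution, which matches how the solver reports results.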

Qualified goals and linking are now used for setup dependencies, but could also be used to deal with private dependencies, to split library dependencies from executable dependencies, to deal with the base-3 shim package, and possibly other purposes. The modular design of the solver means that such features can be added (mostly) as independent units of code. That doesn’t of course necessarily mean the code is also easy; making sure that all decisions remain local can be a subtle problem. Hopefully this blog post will make the solver code easier to understand and to contribute to.

by edsko at March 27, 2015 10:17 AM

FP Complete

MinGHC for GHC 7.10

We're happy to announce the availability of MinGHC for GHC 7.10. MinGHC is a minimal GHC installer for Windows, consisting of:

  • GHC itself
  • cabal-install
  • MSYS, which is necessary for building some packages such as network

MinGHC came out of some conversation Neil Mitchell and I had at ICFP last year about pain on the Windows platform, and consists of a lot of good code by Neil, and some hacky attempts by me at understanding Windows installer processes. While this project predates the Commercial Haskell SIG by a few months, it's essentially the first project undertaken by the group.

While MinGHC is relatively young, it's also a minimalistic product, simply repackaging and serving upstream components unmodified. Additionally, it has become the installer recommended by quite a few sites, and therefore has already received significant testing. Also, due to its minimal nature, it's easy to pop out new releases of MinGHC. For example, this current release was made less than two hours after GHC itself was released!

While this project is stable, additional testers and contributors are certainly welcome. Please check out the Github project page for more information.

March 27, 2015 09:30 AM

Danny Gratzer

Value vs Monomorphism Restriction

Posted on March 27, 2015
Tags: sml, haskell

I’m taking the undergraduate course on programming languages at CMU. For the record, I still get really excited about the novelty of taking a class (at school!) on programming languages. I’m easily made happy.

We started talking about System F and before long we touched on the value restriction. Specifically, how most people think of the value restriction incorrectly. To understand why this is, let’s first define the value restriction since it’s probably new to you if you don’t use SML.

The Value Restriction

In SML there are value level declarations just like in Haskell. We can write things like

    val x = 1
    val y = x + 1

and we end up with x bound to 1 and y bound to 2. Note that SML is strict so these bindings are evaluated right as we reach them. Also like in Haskell, SML has polymorphism, so we can write map

   fun map f [] = []
     | map f (x :: xs) = f x :: map f xs

And it gets the type ('a -> 'b) -> ('a list -> 'b list). Aside from minor syntactic differences, this is pretty much identical to what we’d write in Haskell. The value restriction concerns the intersection of these two things. In SML, the following should not compile under the standard

    val x = rev []

This is because SML requires that all polymorphic val bindings be values! In practice all implementations will do something besides this, but we’ll just focus on what the standard says. Now the reason for this value restriction is widely misunderstood. Most people believe that the value restriction exists to rule out code like this:

    val r  = ref NONE
    val () = r := SOME 1
    val _  = case !r of
                 SOME s => s
               | NONE   => ""

This seems to illustrate a pretty big issue for SML! We’re filling in a polymorphic reference at one type and reading it out at a different one! Clearly this would segfault without the value restriction. However, there’s a catch.

SML is based on System F (did you really think I could get through a blog post without some theory?) which is sometimes called the “polymorphic lambda calculus”. It’s the minimal language with polymorphism and functions. In this language there’s a construct for making polymorphic things: Λ.

In this language we write polymorphism explicitly by saying Λ τ. e, which has the type ∀ τ. T. So for example we write the identity function as

    id ≡ Λ τ. λ x : τ. x
    () = id[unit] ()

Now SML (and vanilla Haskell) have a limited subset of the power of Λ. Specifically all these lambdas have to appear at the start of a toplevel term. Meaning that they have to be of the form

    val x = Λ α. Λ β. ... e

This is called “prenex” form and is what makes type inference for SML possible. Now since we don’t show anyone the hidden Λs, it doesn’t make sense to show them the type applications that come with them, so SML infers and adds those for us too. What’s particularly interesting is that SML is often formalized as having this property: values start with Λ and are implicitly applied to the appropriate types where used. Even more interestingly, how do you suppose we should evaluate a Λ? What, for example, should this code do

    val x  = Λ τ. raise[τ] Fail (* Meaning raise an exception and say
                                  we have type τ *)
    val () = print "I made it here"
    val () = x[unit]

It seems clear that Λ should be evaluated just like how we evaluate λ: when we apply it. So I’d (and the formalization of SML would) expect this to print "I made it here" before throwing that exception. This might not surprise you, just by parallel with code like this

    val x  = fn () => raise Fail
    val () = print "I made it here"
    val () = x ()

However, what about when those lambdas are implicit? In the actual source language of ML our code snippet would be

    val x  = raise Fail
    val () = print "I made it here"
    val () = x[unit]

Uh oh, this really looks like it ought to throw an exception, but it apparently doesn’t! More worryingly, what about when we have something like

    fun die ()  = raise Fail
    val x = die ()
    val () = print "Made it here"

Since x is never specialized, this doesn’t even throw an error! Yikes! Clearly this is a little confusing. It is, however, type safe. Consider our original motivation for the value restriction. With explicit type application

    val r  = Λ τ. ref[τ] NONE
    val () = r[int] := SOME 1
    val _  = case !(r[string]) of
                 SOME s => s
               | NONE   => ""

Since the body of this Λ is run every time we use r at some type, we’re just creating a whole bunch of new references in this code! There’s no type safety failure, since !(r[string]) returns a fresh ref cell, completely different from the one we modified on the line above! This code always runs the NONE case. If this does the wrong thing, it’s just a canary in the coal mine, a symptom of the fact that our system evaluates under (big) lambda binders.

So the value restriction is really not at all about type safety; it’s about comprehensibility. Mostly because the fact that a polymorphic expression is evaluated at its use sites, rather than where it is defined, is really strange. Most documentation seems to be wrong about this; everyone here seems to agree that this is unfortunate, but such is life.

The Monomorphism Restriction

Now let’s talk about the monomorphism restriction. This is better understood but still worth recapping. In Haskell we have type classes. They let us overload functions to behave differently on different types. Everyone’s favorite example is the type class for numbers, which lets us write

    fact :: (Eq a, Num a) => a -> a
    fact 0 = 1
    fact n = n * fact (n - 1)

And this works for all numbers, not just int or something. Under the hood, this works by passing a record of functions like *, fromInteger, and - to make the code work. That => is really just a sort of function arrow that happens to only take particular “implicit” records as an argument.

Now what do you suppose the most polymorphic type of this code is?

    x = fact 10000

It could potentially work on all numbers so it gets the type

    x :: (Num a, Eq a) => a

However, this is really like a function! This means that using x at Integer and using it at Int evaluates that computation twice. In fact, each time we use x we supply a new record and end up evaluating again. This is very costly and also very surprising to most folks. After all, why should something that looks like a normal number be re-evaluated every time we use it! The monomorphism restriction is essentially

  1. If you have a binding
  2. Whose type is (C1, C2 ...) => t
  3. And has no arguments to the left of the =
  4. Don’t generalize it

This is intended to keep us from the surprise of evaluating a seemingly fully reduced term twice.
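To see the hidden function explicitly, here is roughly what the elaborated, dictionary-passing code looks like (the record types and field names are invented for illustration; GHC’s real dictionaries differ):

```haskell
-- Invented dictionary records standing in for the Eq and Num constraints.
data EqDict a  = EqDict  { eq :: a -> a -> Bool }
data NumDict a = NumDict { times   :: a -> a -> a
                         , minus   :: a -> a -> a
                         , fromInt :: Integer -> a }

-- fact with its constraints turned into ordinary arguments.
factD :: EqDict a -> NumDict a -> a -> a
factD ed nd n
  | eq ed n (fromInt nd 0) = fromInt nd 1
  | otherwise              = times nd n (factD ed nd (minus nd n (fromInt nd 1)))

-- The "constant" x is secretly a function of the dictionaries,
-- so its body is re-run for every type it is used at.
xD :: EqDict a -> NumDict a -> a
xD ed nd = factD ed nd (fromInt nd 10)
```

Instantiating xD once at Integer and once at Int means calling it twice, recomputing the factorial both times; that silent recomputation is exactly the surprise the monomorphism restriction guards against.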

Sound familiar? Just like with the value restriction, the whole point of the monomorphism restriction is to prevent a hidden function, either a type abstraction or a type-class constraint, from causing us to silently and dangerously duplicate work. Neither of them is essential to type safety, but without them some really simple-looking pieces of code can become asymptotically slower (think of a lazily memoized Fibonacci list that is recomputed at every use).

Wrap Up

That about covers things. It turns out that both of these restrictions are just patches to cover some surprising areas of the semantics but both are easily understood when you look at the elaborated version. I deliberately went a bit faster through the monomorphism restriction since quite a lot of ink has already been spilled on the subject and unlike the value restriction, most of it is correct :)

As one final note, the way that Haskell handles the monomorphism restriction is precisely how OCaml handles the value restriction: weak polymorphism. Both of them mark the type variables they refuse to generalize as weak type variables. Whenever we first instantiate them to something we go back and retroactively modify the definition to pretend we had used this type all along. In this way, we only evaluate things once but can handle a lot of simple cases of binding a value and using it once.

The more you know.


March 27, 2015 12:00 AM

March 26, 2015

Manuel M T Chakravarty

Schrödinger’s Box

In Objective-C, NSValue provides a universal container that can hold scalars and value types in addition to references. Conversely, in Swift, we quickly learnt to appreciate Box<T> as a wrapper to embed (cyclic) references in enums and structs (especially due to limitations in the initial versions of the Swift compiler):

final public class Box<T> {
  public let unbox: T
  public init(_ value: T) { self.unbox = value }
}

NSValue is also useful to wrap references. For example, if we want to turn a strong reference into a weak reference (possibly to store it in a collection, such as NSArray), we can create an NSValue with +valueWithNonretainedObject.

How do we achieve that in Swift without losing type information, as using NSValue would? We use Schrödinger’s Box:

final public class WeakBox<T: AnyObject> {
  private weak var box: T?
  public var unbox: T? { get { return box } }
  public init(_ value: T) { self.box = value }
}

You can put something into Schrödinger’s Box, which will or will not be gone by the time you try to retrieve it. It enables us, for example, to build a typed collection of weakly referenced objects (to avoid retain cycles).

NB: The definition of WeakBox is fine with the latest Swift 1.2 compiler. However, the Swift 1.1 compiler can only handle it with -Onone; otherwise, it crashes.

March 26, 2015 11:43 PM

FP Complete

Our composable community infrastructure

TL;DR: we propose to factor Hackage into a separate, very simple service serving a stash of Haskell packages, with the rest of Hackage built on top of that, in the name of availability, reliability and extensibility.

One of the main strengths of Haskell is just how much it encourages composable code. As programmers, we are goaded along a good path for composability by the strictures of purity, which forces us to be honest about the side effects we might use, but first and foremost because first class functions and lazy evaluation afford us the freedom to decompose solutions into orthogonal components and recompose them elsewhere. In the words of John Hughes, Haskell provides the necessary glue to build composable programs, ultimately enabling robust code reuse. Perhaps we ought to build our shared community infrastructure along the same principles: freedom to build awesome new services by assembling together existing ones, made possible by the discipline to write these basic building blocks as stateless, scalable, essentially pure services. Let's think about how, taking packages hosting as an example, with a view towards solving three concrete problems:

  • Availability of package metadata and source code (currently these are no longer available when Hackage goes down).
  • Long cabal update download times.
  • The difficulty of third party services and other community services to interoperate with and extend it in any direction the community deems fit.

Haskell packages

Today Haskell packages are sets of files with a distinguished *.cabal file containing the package metadata. We host these files on a central package repository called Hackage, a community supported service. Hackage is a large service that has by and large served the community well, and has done so since 2007. The repository has grown tremendously, by now hosting no less than 5,600 packages. It implements many features, among them package management. In particular, Hackage allows registered users to:

  • Upload a new package: either from the browser or via cabal upload.

  • Download an index of all packages available: this index includes the full content of all *.cabal files for all packages and all versions.

  • Query the package database via a web interface: from listing all packages available by category, to searching packages by name. Hackage maintains additional metadata for each package not stored in the package itself, such as download counts, package availability in various popular Linux distributions. Perhaps in the future this metadata will also include social features such as number of "stars", à la Github.

Some of the above constitute the defining features of a central package repository. Of course, Hackage is much more than just that today - it is a portal for exploring what packages are out there through a full blown web interface, running nightly builds on all packages to make sure they compile and putting together build reports, generating package API documentation and providing access to the resulting HTML files, maintaining RSS feeds for new package uploads, generating activity graphs, integration with Hoogle and Hayoo, etc.

In the rest of this blog post, we'll explore why it's important to tweeze out the package repository from the rest, and build the Hackage portal on top of that. That is to say, talk separately about Hackage-the-repository and Hackage-the-portal.

A central hub is the cement of the community

A tell-tale sign of a thriving development community is that a number of services pop up independently to address the needs of niche segments of the community or indeed the community as a whole. Over time, these community resources together form an ecosystem, or perhaps even a market, in much the same way that the set of all Haskell packages forms an ecosystem. There is no central authority deciding which package ought to be the unique consecrated package for e.g. manipulating filesystem paths: on Hackage today there are at least 5, each exploring different parts of the design space.

However, we do need common infrastructure in place, because we do need consensus about what package names refer to what code and where to find it. People often refer to Hackage as the "wild west" of Haskell, due to its very permissive policies about what content makes it on Hackage. But that's not to say that it's an entirely chaotic free-for-all: package names are unique, only designated maintainers can upload new versions of some given package and version numbers are bound to a specific set of source files and content for all time.

The core value of Hackage-the-repository then, is to establish consensus about who maintains what package, what versions are available and the metadata associated with each version. If Alice has created a package called foolib, then Bob can't claim foolib for any of his own packages, he must instead choose another name. There is therefore agreement across the community about what foolib means. Agreement makes life much easier for users, tools and developers talking about these packages.

What doesn't need consensus is anything outside of package metadata and authorization: we may want multiple portals to Haskell code, or indeed have some portals dedicated to particular views (a particular subset of the full package set) of the central repository. For example, Stackage is one such portal today, dedicated to LTS Haskell and Stackage Nightly, two popular views of consistent package sets maintained by FP Complete. We fully anticipate that others will over time contribute other views, general-purpose or niche (e.g. specialized for a particular web framework), or indeed alternative portals, ranging from small, fast and stable to experimental and replete with social features aplenty. Think powerful new search functionality, querying reverse dependencies, pre-built package sets for Windows, OS X and Linux, package reviews, package voting ... you name it!

Finally, by carving out the central package repository into its own simple and reliable service, we limit the impact of bugs on both availability and reliability, and thus preserve one of our most valuable assets: the code that we together as a community have written. Complex solutions invariably affect reliability. Keeping the core infrastructure small and making it easy to build on top is how we manage that.

The next section details one way to carve up the central package repository, to illustrate my point. Alternative designs are possible of course; I merely wish to seek agreement that a modular architecture, with a set of very small and simple services at its core as our community commons, would be beneficial to the community.

Paring down the central hub to its bare essence

Before we proceed, let's first introduce a little bit of terminology:

  • A persistent data structure is a data structure that is never destructively updated: when modified, all previous versions of the data structure are still available. Data.Map from the containers package is persistent in this sense, as are lists and most other data structures in Haskell.
  • A service is stateless if its response is a function of the state of other services and the content of the request. Stateless services are trivially scaled horizontally - limited only by the scalability of the services they depend on.
  • A persistent service is a service that maintains its state as a persistent data structure. Most resources served by a persistent service are immutable. Persistent services share many of the same properties as stateless services: keeping complexity down and scaling them is easy because concurrent access and modification of a persistent data structure requires little to no coordination (think locks, critical sections, etc).

A central hub for all open source Haskell packages might look something like this:

  • A persistent read-only directory of metadata for all versions of all packages (i.e. the content of the .cabal file). Only the upload service may modify this directory, and even then, only in a persistent way.

  • A persistent read-only directory of the packages themselves, that is to say the content of the archives produced by cabal sdist.

  • An upload service, for uploading new revisions of the metadata directory. This service maintains no state of its own, therefore multiple upload services can be spawned if necessary.

  • An authentication service, granting access tokens to users for adding a new package or modifying the metadata for their own existing packages via the upload service(s).

The metadata and package directories together form a central repository of all open source Haskell packages. Just as is the case with Hackage today, anyone is allowed to upload any package they like via the upload service. We might call these directories collectively The Haskell Stash. End-user command-line tools, such as cabal-install, need only interact with the Stash to get the latest list of packages and versions. If the upload or authentication services go down, existing packages can still be downloaded without any issue.

Availability is a crucial property for such a core piece of infrastructure: users from around the world rely on it today to locate the dependencies necessary for building and deploying Haskell code. The strategy for maintaining high availability can be worked out independently for each service. A tried and tested approach is to avoid reinventing as much of the wheel as we can, reusing existing protocols and infrastructure where possible. I envision the following implementation:

  • Serve the metadata directory as a simple Git repository. A Git repository is persistent (objects are immutable and live forever), easy to add new content to, easy to backup, easy to mirror and easy to mine for insights on how packages change over time. Advanced features such as package candidates fall out nearly for free. Rather than serving in its entirety a whole new static tarball of all package metadata (totalling close to 9MB of compressed data) as we do today, we can leverage the existing Git wire protocol to transfer new versions to end users much more efficiently. In short, a faster cabal update!

    The point here is very much not to use Git as a favoured version control system (VCS), fine as it may be for that purpose, at the expense of any other such tool. Git is at its core an efficient persistent object store first and foremost, with a VCS layered on top. The idea is to not reinvent our own object store. It features a simple disk format that has remained incredibly stable over the years. Hosting all our metadata as a simple Git repository means we can leverage any number of existing Git hosting providers to serve our community content with high uptime guarantees.

  • Serve package source archives (produced by cabal sdist) via S3, a de facto standard API for file storage, supported by a large array of cloud providers. These archives can be large, but unlike package metadata, their content is fixed for all time. Uploading a new version of a package means uploading a new source archive with a different name. Serving our package content via a standard API means we can have that content hosted on a reliable cloud platform. In short, better uptime and higher chance that cabal install will not randomly fail.


The Haskell Stash is a repository in which to store our community's shared code assets in as simple, highly available and composable a manner as possible. Reduced to its bare essence, it is easily consumable by all manner of downstream services, most notably Hackage itself. It is by enabling people to extend core infrastructure in arbitrary directions that we can hope to build a thriving community that meets not just the needs of those that happened to seed it, but that furthermore embraces new uses, new needs, new people.

Provided community interest in this approach, the next steps would be:

  1. implement the Haskell Stash;
  2. implement support for the Haskell Stash in Hackage Server;
  3. in the interim, if needed, mirror Hackage content in the Haskell Stash.

In the next post in this series, we'll explore ways to apply the same principles of composability to our command-line tooling, in the interest of making our tools more hackable, more powerful and ship with fewer bugs.

March 26, 2015 05:50 AM

March 25, 2015

Ketil Malde

Can you trust science?

Hardly a week goes by without newspaper writing about new and exciting results from science. Perhaps scientists have discovered a new wonderful drug for cancer treatment, or maybe they have found a physiological cause for CFS. Or perhaps this time they finally proved that homeopathy works? And in spite of these bold announcements, we still don't seem to have cured cancer. Science is supposed to be the method which enables us to answer questions about how the world works, but one could be forgiven for wondering whether it, in fact, works at all.

As my latest contribution to my local journal club, I presented a paper by Ioannidis, titled Why most published research findings are false 1. This created something of a stir when it was published in 2005, because it points out some simple mathematical reasons why science isn't as accurate as we would like to believe.

The ubiquitous p-value

Science is about finding out what is true. For instance, is there a relationship between treatment with some drug and the progress of some disease - or is there not? There are several ways to go about finding out, but in essence, it boils down to making some measurements, and doing some statistical calculations. Usually, the result will be reported along with a p-value, which is a by-product of the statistical calculations saying something about how certain we are of the results.

Specifically, if we claim there is a relationship, the associated p-value is the probability we would make such a claim even if there is no relationship in reality.

We would like this probability to be low, of course, and since we usually are free to select the p-value threshold, it is usually chosen to be 0.05 (or 0.01), meaning that if the claim is false, we will only accept it 5% (or 1%) of the times.

The positive predictive value

Now, the p-value is often interpreted as the probability of our (positive) claim being wrong. This is incorrect! There is a subtle difference here, which it is important to be aware of. What you must realize is that the probability α relies on the assumption that the hypothesis is wrong - which may or may not be true, we don't know (which is precisely why we want to find out).

The probability of a claim being wrong after the fact is called the positive predictive value (PPV). In order to say something about this, we also need to take into account the probability of claiming there exists a relationship when the claim is true. Our methods aren't perfect, and even if a claim is true, we might not have sufficient evidence to say for sure.

So, let's take one step back and look at our options. Our hypothesis (e.g., drug X works against disease Y) can be true or false. In either case, our experiment and analysis can lead us to reject or accept it with some probability. This gives us the following 2-by-2 table:

           True    False
  Accept   1-β     α
  Reject   β       1-α

Here, α is the probability of accepting a false relationship by accident (i.e., the p-value), and β is the probability of missing a true relationship -- we reject a hypothesis, even when it is true.

To see why β matters, consider a hypothetical really really poor method, which has no chance of identifying a true relationship, in other words, β = 1. Then, every accepted hypothesis must come from the False column, as long as α is at all positive. Even if the p-value threshold only accepts 1 in 20 false relationships, that's all you will get, and as such, they constitute 100% of the accepted relationships.

But looking at β is not sufficient either. What if a team of researchers tests hundreds of hypotheses, all of which happen to be false? Then again, some of them will get accepted anyway (sneaking in under the p-value threshold α), and since there are no hypotheses in the True column, again every positive claim is false.

A β of 1 or a field of research with 100% false hypotheses are extreme cases2, and in reality, things are not quite so terrible. The Economist had a good article with a nice illustration showing how this might work in practice with more reasonable numbers. It should still be clear that the ratio of true to false hypotheses being tested, as well as the power of the analysis to identify true hypotheses are important factors. And if these numbers approach their limits, things can get quite bad enough.
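The arithmetic behind the positive predictive value is easy to carry out yourself. If π is the fraction of tested hypotheses that are actually true, then PPV = (1 - β)π / ((1 - β)π + α(1 - π)). A small sketch (the parameter values below are just an example):

```haskell
-- Positive predictive value: the probability that an accepted
-- (statistically significant) claim is actually true.
ppv :: Double  -- ^ alpha: probability of accepting a false hypothesis
    -> Double  -- ^ beta: probability of rejecting a true hypothesis
    -> Double  -- ^ prior: fraction of tested hypotheses that are true
    -> Double
ppv alpha beta prior = truePos / (truePos + falsePos)
  where
    truePos  = (1 - beta) * prior   -- true hypotheses we accept
    falsePos = alpha * (1 - prior)  -- false hypotheses that sneak in anyway
```

With α = 0.05, β = 0.2 and one in ten tested hypotheses true, ppv 0.05 0.2 0.1 is 0.64: roughly a third of accepted claims are wrong, despite the 5% significance threshold. And with β = 1 the PPV drops to 0, matching the hypothetical really really poor method above.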

More elaborate models

Other factors also influence the PPV. Try as we might to be objective, scientists often try hard to find a relationship -- that's what you can publish, after all3. Perhaps in combination with a less firm grasp of statistics than one could wish for (scientists who know enough statistics are few and far between - I'm certainly no exception there), this introduces bias towards acceptance.

Multiple teams pursuing the same challenges in a hot and rapidly developing field also decrease the chance of results being correct, and there's a whole cottage industry of scientist reporting spectacular and surprising results in high-ranking journals, followed by a trickle of failures to replicate.

Solving this

One option is to be stricter: this is the default when you do multiple hypothesis testing, where you require a lower p-value threshold in order to reduce α. The problem with this is that if you are stricter with what you accept as true, you will also reject more actually true hypotheses. In other words, you can reduce α, but only at the cost of increasing β.

On the other hand, you can reduce β by running a larger experiment. One obvious problem with this is cost: for many problems, a cohort of a hundred thousand or more is necessary, and not everybody can afford to run that kind of study. Perhaps even worse, a large cohort means that almost any systematic difference will be found significant. Biases that normally are negligible will show up as glowing bonfires in your data.

In practice?

Modern biology has changed a lot in recent years, and today we are routinely using high-throughput methods to test the expression of tens of thousands of genes, or the value of hundreds of thousands of genetic markers.

In other words, we simultaneously test an extreme number of hypotheses, where we expect a vast majority of them to be false, and in many cases, the effect size and the cohort are both small. It's often a new and exciting field, and we usually strive to use the latest version of the latest technology, always looking for new and improved analysis tools.

To put it bluntly, it is extremely unlikely that any result from this kind of study will be correct. Some people will claim these methods are still good for "hypothesis generation", but Ioannidis shows a hypothetical example where a positive result increases the likelihood that a hypothesis is correct by 50%. This doesn't sound so bad, perhaps, but in reality, the likelihood is only improved from 1 in 10000 to 1 in 7000 or so. I guess three thousand fewer trials to run in the lab is something, but you're still going to spend the rest of your life running the remaining ones.

You might expect scientists to be on guard for this kind of thing, and I think most scientists will claim they desire to publish correct results. But what counts for your career is publications and citations, and incorrect results are no less publishable than correct ones - and might even get cited more, as people fail to replicate them. And as you climb the academic ladder, publications in high-ranking journals are what counts, and for that you need spectacular results. And it is much easier to get spectacular incorrect results than spectacular correct ones. So the academic system rewards and encourages bad science.


The bottom line is to be skeptical of any reported scientific results. The ability of the experiment and analysis to discover true relationships is critical, and one should always ask what the effect size is, and what the statistical power -- the probability of detecting a real effect -- is.

In addition, the prior probability of the hypothesis being true is crucial. Apparently-solid, empirical evidence of people getting cancer from cell phone radiation, or working homeopathic treatment of disease can almost be dismissed out of hand - there simply is no probable explanation for how that would work.

A third thing to look out for, is how well studied a problem is, and how the results add up. For health effects of GMO foods, there is a large body of scientific publications, and an overwhelming majority of them find no ill effects. If this was really dangerous, wouldn't some of these investigations show it conclusively? For other things, like the decline of honey bees, or the cause of CFS, there is a large body of contradictory material. Again - if there was a simple explanation, wouldn't we know it by now?

  1. And since you ask: No, the irony of substantiating this claim with a scientific paper is not lost on me.

  2. Actually, I would suggest that research in paranormal phenomena is such a field. They still manage to publish rigorous scientific works, see this Less Wrong article for a really interesting take.

  3. I think the problem is not so much that you can't publish a result claiming no effect, but that you can rarely claim it with any confidence. Most likely, you just didn't design your study well enough to tell.

March 25, 2015 08:00 PM

FP Complete

FP Complete's Hackage mirror

We have been running a mirror of the Hackage package repository, which we use internally for the FP Complete Haskell Centre's IDE, building Stackage, and other purposes. This has been an open secret, but now we're making it official.

To use it, replace the remote-repo line in your ~/.cabal/config with the following:


Then run cabal update, and you're all set.

This mirror is updated every half-hour. It is statically hosted on Amazon S3 so downtime should be very rare (Amazon claims 99.99% availability).

The mirror does not include the HTML documentation. However, Stackage hosts documentation for a large set of packages.

We have also released our hackage-mirror tool. It takes care of efficiently updating a static mirror of Hackage on S3, should anyone wish to host their own. While the official hackage-server has its own support for mirroring, our tool differs in that it does not require running a hackage-server process to host the mirror.

HTTPS for Stackage

On a tangentially related note, we have enabled TLS for Stackage. Since cabal-install does not support TLS at this time, we have not set up an automatic redirect from insecure connections to the https:// URL.

March 25, 2015 02:00 PM

Ken T Takusagawa

[zyxbhqnd] Defining monads with do notation

If one happens to be most comfortable with "do" notation for monads ("What are monads? They are the things for which "do" notation works well."), with monads thus implicitly defined in terms of bind (>>=) and return, here are the definitions of map and join, the "other" way of defining monads:

join :: (Monad m) => m (m a) -> m a;
join xs = do { x <- xs ; x }

map :: (Monad m) => (a -> b) -> m a -> m b;
map f xs = do { x <- xs ; return $ f x }

map is identical to liftM and slightly narrower than fmap which requires only the Functor typeclass instead of Monad.  This redundancy is one of the motivations for the AMP proposal.  Incidentally, map (as defined above) would work as well as the famous Prelude map function which operates only on lists, because a list is a Monad (and a Functor).
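A quick sanity check (a standalone file, not from the post) that these definitions behave like their Prelude counterparts:

```haskell
import Prelude hiding (map)

join :: Monad m => m (m a) -> m a
join xs = do { x <- xs ; x }

map :: Monad m => (a -> b) -> m a -> m b
map f xs = do { x <- xs ; return $ f x }

main :: IO ()
main = do
  print (map (+1) [1,2,3])     -- like Prelude map: [2,3,4]
  print (map (+1) (Just 1))    -- like fmap on Maybe: Just 2
  print (join [[1,2],[3]])     -- concat for lists: [1,2,3]
  print (join (Just (Just 5))) -- collapses nesting: Just 5
```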

Just for completeness, here is bind defined in do notation:

(>>=) :: (Monad m) => m a -> (a -> m b) -> m b;
(>>=) xs f = do { x <- xs ; f x }

I sometimes like explicitly using the bind operator instead of do notation because the syntax, specifying xs then f, lines up well with the thought process "first prepare the input xs to a function, then call the function f on it".  It also works well for longer chains.  For example, the expression xs >>= f >>= g >>= h is equivalent to

do {
x <- xs;
y <- f x;
z <- g y;
h z;
}

but without having to name the intermediate results.
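For instance, in the Maybe monad (safeDiv is my own toy example, not from the post) such a chain threads results left to right and short-circuits on failure:

```haskell
-- A toy partial function for chaining in the Maybe monad.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

main :: IO ()
main = do
  -- xs >>= f >>= g >>= h: "prepare 2, then divide it into 100, ..."
  print (Just 2 >>= safeDiv 100 >>= safeDiv 1000 >>= safeDiv 40)
  -- prints Just 2
  print (Just 0 >>= safeDiv 100 >>= safeDiv 1000 >>= safeDiv 40)
  -- prints Nothing: the first division fails and the rest are skipped
```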

Inspired by the tutorial Monads as containers.

Update: added type signature for (>>=).

by Ken ([email protected]) at March 25, 2015 08:33 AM

The GHC Team

GHC Weekly News - 2015/03/24

Hi *,

It's time for the GHC weekly news. We missed the last one, mostly due to a lot of hustle to try and get 7.10 out the door (more on that throughout this post). But now we're back and everything seems to be taken care of.

This week, in the wake of the GHC 7.10 release (which is occurring EOD, hopefully), GHC HQ met up for a brief chat and caught up:

  • This week GHC HQ met for only a very short time to discuss the pending release - it looks like all the blocking bugs have been fixed, and we've got everything triaged appropriately. You'll hopefully see the 7.10 announcement shortly after reading this.

We've also had small amounts of list activity (the past 6-8 weeks have been very, very quiet it seems):

  • Yitzchak Gale revived a thread he started a while back, which petered out: bootstrapping GHC 7.8 with GHC 7.10. The long and short of it is that it should just about work - although we still haven't committed to this policy, it looks like Yitz and some others are quite adamant about it.

Some noteworthy commits that went into ghc.git in the past two weeks include:

Closed tickets this past week include: #9122, #10099, #10081, #9886, #9722, #9619, #9920, #9691, #8976, #9873, #9541, #9619, #9799, #9823, #10156, #1820, #6079, #9056, #9963, #10164, #10138, #10166, #10115, #9921, #9873, #9956, #9609, #7191, #10165, #10011, #8379, #10177, #9261, #10176, #10151, #9839, #8078, #8727, #9849, #10146, #9194, #10158, #7788, #9554, #8550, #10079, #10139, #10180, #10181, #10170, #10186, #10038, #10164, and #8976.

by thoughtpolice at March 25, 2015 02:20 AM

March 24, 2015

Jan Stolarek

First impressions of Coq and “Certified Programming with Dependent Types”

A simplistic view of the strongly typed functional programming ecosystem is that the two most influential languages are Haskell and ML. Haskell is my language of choice so when it came to learning dependently-typed programming I stayed on the Haskell side of the spectrum and went with Agda and Idris. I chose the two over the ML-inspired Coq, most often advertised as a proof assistant rather than a programming language, planning to learn it "when I find some free time". I probably wouldn't find that time if it wasn't for a friend of mine who recently picked up the book "Certified Programming with Dependent Types" by Adam Chlipala of MIT. Having a companion to discuss the ideas in the book was a perfect opportunity to pick it up - the truth is I found out about the book well over a year ago and since then it stayed on my always too long must-read-this-some-day list. So far I have read around 1/3rd of the book and would like to share some of my first impressions, both about the book, which I will refer to as CPDT, and Coq.

(Note: In what follows I will compare Coq to Agda and Idris but you have to be aware that despite similarity in features of these languages they don’t aim to be the same. Coq is a proof assistant with an extra feature of code extraction that allows you to turn your proofs into code – if you ever heard about “programs as proofs” this is it. Idris is a programming language with extra features that allow you to prove your code correct. I’m not really sure how to classify Agda. It is definitely on the programming-language-end of the spectrum – it allows you to prove your code correct but does not provide any extra built-in proof support. At the same time turning Agda code into working programs is non-trivial.)

Let me start off by saying that I don't have any real-life Coq project on the horizon, so my learning is not motivated by a need to solve any practical problem. My main driving force for learning Coq is purely an interest in programming languages and seeing how Coq compares to Agda and Idris. A common thing with dependently-typed languages is that the types can get too complicated for the programmer to comprehend, and thus a language requires an interactive mode to provide the programmer with compiler feedback about the types. This is true for Agda, Idris and Coq. Agda offers great support for holes: the programmer can insert question marks into the program code and once it is re-compiled in the editor (read: Emacs) the question marks become holes, i.e. places where the Agda compiler provides the user with feedback about expected types and the bindings available in the hole's context, as well as some nice inference features allowing it to automatically fill in the contents of holes. So in Agda one proves by constructing terms of appropriate types. Coq is different, as it relies on a mechanism called "tactics". Once the user writes down a type (theorem), he is presented with a set of goals to prove. Applying a tactic transforms the current goal into a different goal (or several goals). Conducting consecutive steps of the proof (i.e. applying several tactics) should lead to some trivial goal that follows from a definition and ends the proof. To work with Coq I decided to use Proof General, an Emacs extension for working with proofs (many other proof assistants are supported besides Coq)1. It launches a Coq process in the background and essentially integrates writing code with proving. With Proof General I can easily step through my proofs to see how the goals are transformed by the use of tactics. Idris falls somewhere between Agda and Coq. As stated earlier, it is mostly a programming language, but it also provides tactic-based proving. So for example, when I write a definition that requires an explicit proof to typecheck, idris-mode launches an interactive REPL in which I can conduct a proof in a fashion similar to Proof General, and once I'm finished the proof is inserted into the source code. The result looks something like this:

par : (n : Nat) -> Parity n
par Z = even {n=Z}
par (S Z) = odd {n=Z}
par (S (S k)) with (par k)
  par (S (S (j + j)))     | even ?= even {n = S j}
  par (S (S (S (j + j)))) | odd  ?= odd {n = S j}
---------- Proofs ----------
Basics.par_lemma_2 = proof
  rewrite sym (plusSuccRightSucc j j)
Basics.par_lemma_1 = proof
  rewrite sym (plusSuccRightSucc j j)

The last time I checked Idris once the proof was completed and added to the source code it was not possible to step through it back and forth to see how goals are transformed. (Things might have changed since I last checked.)

So far I've been doing rather basic stuff with Coq so I haven't seen much that wouldn't also be possible in Agda or Idris. The biggest difference is that Coq feels like a much more grown-up language than either of the two mentioned. One totally new thing I've learned so far is co-induction, but I'm only starting with it and won't go into details, leaving it instead for a separate post. (Agda also supports co-induction.)

As for the CPDT book I have to say it is a challenging read: it’s very dense and focuses on more advanced Coq techniques without going into details of basics. As such it is a great demonstration of what can be done in Coq but not a good explanation of how it can be done. Depending on what you are expecting this book might or might not be what you want. As stated earlier I don’t plan on applying Coq in any project but rather want to see a demo of Coq features and possibly pick up some interesting theoretical concepts. As such CPDT works quite well for me although I am not entirely happy with not being able to fully understand some of the demonstrated techniques. As such CPDT is definitely not a self-contained read, which I believe was a conscious decision on the author’s side. Discussing with people on #coq IRC channel and reading various posts on the internet leads me to a conclusion that CPDT is a great book for people that have been using Coq for some time and want to take their skills to a new level. The main theme of the book is proof automation, that is replacing tactic-based sequential proofs with automated decision procedures adjusted to the problem at hand that can construct proofs automatically. Indeed tactic-based proofs are difficult to understand and maintain. Consider this proof of a simple property that n + 0 = n:

Theorem n_plus_O : forall (n : nat), plus n O = n.
  induction n.
  reflexivity.
  simpl.
  rewrite IHn.
  reflexivity.
Qed.

To understand that proof one has to step through it to see how goals are transformed or have enough knowledge of Coq to know that without the need of stepping through. Throughout the book Adam Chlipala demonstrates the power of proof automation by using his tactic called crush, which feels like a magic wand since it usually ends the proof immediately (sometimes it requires some minimal guidance before it ends the proof immediately). I admit this is a bit frustrating as I don’t feel I learn anything by seeing crush applied to magically finish a proof. Like I said, a good demo of what can be done but without an explanation. The worst thing is that crush does not seem to be explained anywhere in the book so readers wanting to understand it are left on their own (well, almost on their own).

What about those of you who want to learn Coq starting from the basics? It seems like Software Foundations is the introductory book about Coq and – given that the main author is Benjamin Pierce – it looks like you can’t go wrong with this book. I am not yet sure whether I’ll dive into SF but most likely not as this would mean giving up on CPDT and for me it’s more important to get a general coverage of more advanced topics rather than in-depth treatment of basics.

  1. Other choices of interactive mode are available for Coq, for example CoqIDE shipped by default with Coq installation

by Jan Stolarek at March 24, 2015 01:53 PM

Noam Lewis

Danny Gratzer

A Tiny Compiler For A Typed Higher Order Language

Posted on March 24, 2015

Hi folks, for the last week or so I was a little tired of schoolwork so I decided to scratch out some fun code. The end result is an extremely small compiler for a typed, higher order functional language called PCF to C. In this post I'll attempt to explain the whole thing, from front to back :)

What’s PCF

First things first, it's important to define the language we're compiling. The language, PCF, short for "partial computable functions", is an extremely small language of the sort you generally find in a book on programming languages; it originates with Plotkin if I'm not mistaken.

PCF is based around 3 core elements: natural numbers, functions (closures), and general recursion. There are two constants for creating numbers, Zero and Suc. Zero is self explanatory and Suc e is the successor of the natural number e evaluates to. In most programming languages this just means Suc e = 1 + e but + isn’t a primitive in PCF (we can define it as a normal function).
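For intuition (plain Haskell, not the compiler's real AST, which comes later), Zero and Suc are just unary numerals:

```haskell
-- A throwaway unary-numeral type mirroring PCF's Zero and Suc.
data Nat = Zero | Suc Nat deriving Show

-- Reading Suc e as 1 + e:
toInt :: Nat -> Int
toInt Zero    = 0
toInt (Suc n) = 1 + toInt n

main :: IO ()
main = print (toInt (Suc (Suc (Suc Zero)))) -- prints 3
```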

For functions, we have lambdas like you'd find in any functional language. Since PCF includes no polymorphism it's necessary to annotate the function's argument with its type.

Finally, the weird bit: recursion. In PCF we write recursive things with fix x : τ in e. Here we get to use x in e and we should understand that x “stands for” the whole expression, fix .... As an example, here’s how we define +.

    plus =
          fix rec : nat -> nat -> nat in
            λ m : nat.
            λ n : nat.
              ifz m {
                  Zero  => n
                | Suc x => Suc (rec x n)
              }

Now Let’s Compile It

Now compilation is broken up into a bunch of phases and intermediate languages. Even in this small of a compiler there are 3 (count-em) languages so along with the source and target language there are 5 different languages running around inside of this compiler. Each phase with the exception of typechecking is just translating one intermediate language (IL) into another and in the process making one small modification to the program as a whole.


This compiler starts with an AST; I have no desire to write a parser for this because parsers make me itchy. Here's the AST:

    data Ty = Arr Ty Ty
            | Nat
            deriving Eq

    data Exp a = V a
               | App (Exp a) (Exp a)
               | Ifz (Exp a) (Exp a) (Scope () Exp a)
               | Lam Ty (Scope () Exp a)
               | Fix Ty (Scope () Exp a)
               | Suc (Exp a)
               | Zero
               deriving (Eq, Functor, Foldable, Traversable)

What’s interesting here is that our AST uses bound to manage variables. Unfortunately there really isn’t time to write both a bound tutorial and a PCF compiler one. I’ve written about using bound before here otherwise you can just check out the official docs. The important bits here are that Scope () ... binds one variable and that a stands for the free variables in an expression. 3 constructs bind variables here, Ifz for pattern matching, Fix for recursive bindings, and Lam for the argument. Note also that Fix and Lam both must be annotated with a type otherwise stuff like fix x in x and fn x => x are ambiguous.

Type Checking

First up is type checking. This should be familiar to most people who've written a type checker before, since PCF is simply typed. We simply have a Map of variables to types. Since we want to go under binders defined using Scope we'll have to use instantiate. However, this demands we be able to create fresh free variables so we don't accidentally cause clashes. To prevent this we use monad-gen to generate fresh free variables.

To warm up, here’s a helper function to check that an expression has a particular type. This uses the more general typeCheck function which actually produces the type of an expression.

    type TyM a = MaybeT (Gen a)

    assertTy :: (Enum a, Ord a) => M.Map a Ty -> Exp a -> Ty -> TyM a ()
    assertTy env e t = (== t) <$> typeCheck env e >>= guard

This type checks the expression in an environment (something that stores the types of all of the free variables). Once it receives the type, it compares it to the type we expected and chucks the resulting boolean into guard. This code is used in places like Ifz where we happen to know that the first expression has the type Nat.

Now on to the main code, typeCheck

    typeCheck :: (Enum a, Ord a) => M.Map a Ty -> Exp a -> TyM a Ty
    typeCheck _   Zero = return Nat
    typeCheck env (Suc e) = assertTy env e Nat >> return Nat

The first two cases for typeCheck are nice and straightforward. If we get a Zero then it has type Nat. If we get a Suc e we assert that e is an integer and then the whole thing has the type Nat.

    typeCheck env (V a) = MaybeT . return $ M.lookup a env

For variables we just look things up in the environment. Since this returns a Maybe it’s nice and easy to just jam it into our MaybeT.

    typeCheck env (App f a) = typeCheck env f >>= \case
      Arr fTy tTy -> assertTy env a fTy >> return tTy
      _ -> mzero

Application is a little more interesting. We recurse over the function and make sure it has an actual function type. If it does, we assert the argument has the argument type and return the result type. If it doesn't have a function type, we just fail.

    typeCheck env (Lam t bind) = do
      v <- gen
      Arr t <$> typeCheck (M.insert v t env) (instantiate1 (V v) bind)
    typeCheck env (Fix t bind) = do
      v <- gen
      assertTy (M.insert v t env) (instantiate1 (V v) bind) t
      return t

Type checking lambdas and fixpoints is quite similar. In both cases we generate a fresh variable to unravel the binder with. We know what type this variable is supposed to have because we required explicit annotations so we add that to the map constituting our environment. Here’s where they diverge.

For a fixpoint we want to make sure that the body has the type as we said it would so we use assertTy. For a lambda we infer the body type and return a function from the given argument type to the body type.

    typeCheck env (Ifz i t e) = do
      assertTy env i Nat
      ty <- typeCheck env t
      v <- gen
      assertTy (M.insert v Nat env) (instantiate1 (V v) e) ty
      return ty

For Ifz we want to ensure that we actually are casing on a Nat so we use assertTy. Next we figure out what type the zero branch returns and make sure that the else branch has the same type.

All in all this type checker is not particularly fascinating since all we have are simple types. Things get a bit more interesting with polymorphism. I’d suggest looking at that if you want to see a more interesting type checker.

Closure Conversion

Now for our first interesting compilation phase, closure conversion. In this phase we make closures explicit by annotating lambdas and fixpoints with the variables that they close over. Those variables are then explicitly bound in the scope of the lambda. With these changes, our new syntax tree looks like this

    -- Invariant, Clos only contains VCs, can't be enforced statically due
    -- to annoying monad instance
    type Clos a = [ExpC a]

    data ExpC a = VC a
                | AppC (ExpC a) (ExpC a)
                | LamC Ty (Clos a) (Scope Int ExpC a)
                | FixC Ty (Clos a) (Scope Int ExpC a)
                | IfzC (ExpC a) (ExpC a) (Scope () ExpC a)
                | SucC (ExpC a)
                | ZeroC
                deriving (Eq, Functor, Foldable, Traversable)

The interesting parts are the additions of Clos and the fact that the Scope for a lambda and a fixpoint now binds an arbitrary number of variables instead of just one. Here if a lambda or fixpoint binds n variables, the first n - 1 are stored in the Clos and the last one is the “argument”. Closure conversion is thus just the process of converting an Exp to an ExpC.

    closConv :: Ord a => Exp a -> Gen a (ExpC a)
    closConv (V a) = return (VC a)
    closConv Zero = return ZeroC
    closConv (Suc e) = SucC <$> closConv e
    closConv (App f a) = AppC <$> closConv f <*> closConv a
    closConv (Ifz i t e) = do
      v <- gen
      e' <- abstract1 v <$> closConv (instantiate1 (V v) e)
      IfzC <$> closConv i <*> closConv t <*> return e'

Most of the cases here are just recursing and building things back up applicatively. There’s the moderately interesting case where we instantiate the else branch of an Ifz with a fresh variable and then recurse, but the interesting cases are for fixpoints and lambdas. Since they’re completely identical we only present the case for Fix.

    closConv (Fix t bind) = do
      v <- gen
      body <- closConv (instantiate1 (V v) bind)
      let freeVars = S.toList . S.delete v $ foldMap S.singleton body
          rebind v' = elemIndex v' freeVars <|>
                      (guard (v' == v) *> (Just $ length freeVars))
      return $ FixC t (map VC freeVars) (abstract rebind body)

There’s a lot going on here but it boils down into three parts.

  1. Recurse under the binder
  2. Gather all the free variables in the body
  3. Rebind the body together so that all the free variables map to their position in the closure and the argument is n where n is the number of free variables.

The first is accomplished in much the same way as in the above cases. To gather the free variables, all we need to do is use the readily available notion of a monoid on sets. The whole process is just foldMap S.singleton! There's one small catch: we don't want to put the argument into the list of variables we close over so we carefully delete it from the closure. We then convert it to a list which gives us an actual Clos a. Now for the third step we have rebind.

rebind maps a free variable to Maybe Int: it maps a free variable to its binding occurrence, if it has one here. This boils down to using elemIndex to look up something's position in the Clos we just built up. We also have a special case for when the variable we're looking at is the "argument" of the function we're fixing. In this case we want to map it to the last thing we're binding, which is just length freeVars. To capture the "try this and then that" semantics we use the Alternative instance for Maybe, which works wonderfully.
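To make that lookup concrete, here's a standalone sketch of the same logic (the Char variables and the freestanding rebind are mine; the compiler's version closes over these values instead of taking them as arguments):

```haskell
import Control.Applicative ((<|>))
import Control.Monad (guard)
import Data.List (elemIndex)

-- Closed-over variables map to their index in the closure; the
-- argument maps to the slot one past the end; anything else is free.
rebind :: Eq a => [a] -> a -> a -> Maybe Int
rebind freeVars arg v =
  elemIndex v freeVars <|> (guard (v == arg) *> Just (length freeVars))

main :: IO ()
main = do
  print (rebind "xy" 'z' 'y') -- Just 1: second slot of the closure
  print (rebind "xy" 'z' 'z') -- Just 2: the argument comes last
  print (rebind "xy" 'z' 'q') -- Nothing: not bound here
```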

With this, we’ve removed implicit closures from our language: one of the passes on our way to C.

Lambda Lifting

Next up we remove both fixpoints and lambdas from being expressions. We want them to have an explicit binding occurrence because we plan to completely remove them from expressions soon. In order to do this, we define a language with lambdas and fixpoints explicitly declared in let expressions. The process of converting from ExpC to this new language is called “lambda lifting” because we’re lifting things into let bindings.

Here’s our new language.

    data BindL a = RecL Ty [ExpL a] (Scope Int ExpL a)
                 | NRecL Ty [ExpL a] (Scope Int ExpL a)
                 deriving (Eq, Functor, Foldable, Traversable)
    data ExpL a = VL a
                | AppL (ExpL a) (ExpL a)
                | LetL [BindL a] (Scope Int ExpL a)
                | IfzL (ExpL a) (ExpL a) (Scope () ExpL a)
                | SucL (ExpL a)
                | ZeroL
                deriving (Eq, Functor, Foldable, Traversable)

Much here is the same except we've removed both lambdas and fixpoints and replaced them with LetL. LetL works over bindings which are either recursive (Fix) or nonrecursive (Lam). Lambda lifting in this compiler is rather simplistic in how it lifts lambdas: we just boost everything one level up and turn

    λ (x : τ). ...

into

    let foo = λ (x : τ). ...
    in foo

Just like before, this procedure is captured by transforming an ExpC into an ExpL.

    llift :: Eq a => ExpC a -> Gen a (ExpL a)
    llift (VC a) = return (VL a)
    llift ZeroC = return ZeroL
    llift (SucC e) = SucL <$> llift e
    llift (AppC f a) = AppL <$> llift f <*> llift a
    llift (IfzC i t e) = do
      v <- gen
      e' <- abstract1 v <$> llift (instantiate1 (VC v) e)
      IfzL <$> llift i <*> llift t <*> return e'

Just like in closConv we start with a lot of very boring and trivial “recurse and build back up” cases. These handle everything but the cases where we actually convert constructs into a LetL.

Once again, the interesting cases are pretty much identical. Let’s look at the case for LamC for variety.

    llift (LamC t clos bind) = do
      vs <- replicateM (length clos + 1) gen
      body <- llift $ instantiate (VC . (!!) vs) bind
      clos' <- mapM llift clos
      let bind' = abstract (flip elemIndex vs) body
      return (LetL [NRecL t clos' bind'] trivLetBody)

Here we first generate a bunch of fresh variables and unbind the body of our lambda. We then recurse on it. We also have to recurse across all of our closed over arguments, but since those are variables we know that should be pretty trivial (why do we know this?). Once we've straightened out the body and the closure all we do is transform the lambda into a trivial let expression as shown above. Here's trivLetBody:

    trivLetBody :: Scope Int ExpL a
    trivLetBody = fromJust . closed . abstract (const $ Just 0) $ VL ()

Which is just a body that returns the first thing bound in the let. With this done, we’ve pretty much transformed our expression language to C. In order to get rid of the nesting, we want to make one more simplification before we actually generate C.


C-With-Expressions is our next intermediate language. It has no notion of nested functions or of fixpoints. I suppose now I should finally fess up to why I keep talking about fixpoints and functions as if they’re the same and why this compiler is handling them identically. The long and short of it is that fixpoints are really a combination of a “fixed point combinator” and a function. Really when we say

    fix x : τ in ...

It's as if we had said

    F (λ x : τ. ...)

Where F is a magical constant with the type

    F :: (a -> a) -> a

F calculates the fixpoint of a function. This means that f (F f) = F f. This formula underlies all recursive bindings (in Haskell too!). In the compiler we basically compile a Fix to a closure (the runtime representation of a function) and pass it to a C function fixedPoint which actually calculates the fixed point. Now it might seem dubious that a function has a fixed point. After all, it would seem that there's no x so that (λ (x : nat). suc x) x = x, right? Well the key is to think of these functions as not ranging over just the values in our language, but over a domain where infinite loops (bottom values) are also represented. In the above equation, the solution is that x should be bottom, an infinite loop. That's why

    fix x : nat in suc x

should loop! There’s actual some wonderful math going on here about how computable functions are continuous functions over a domain and that we can always calculate the least fixed point of them in this manner. The curious reader is encouraged to check out domain theory.

Anyways, so that’s why I keep handling fixpoints and lambdas in the same way, because to me a fixpoint is a lambda + some magic. This is going to become very clear in C-With-Expressions (FauxC from now on) because we’re going to promote both sorts of let bindings to the same thing, a FauxC toplevel function. Without further ado, here’s the next IL.

    -- Invariant: the Integer part of a FauxCTop is a globally unique
    -- identifier that will be used as a name for that binding.
    type NumArgs = Int
    data BindTy = Int | Clos deriving Eq

    data FauxCTop a = FauxCTop Integer NumArgs (Scope Int FauxC a)
                    deriving (Eq, Functor, Foldable, Traversable)
    data BindFC a = NRecFC Integer [FauxC a]
                  | RecFC BindTy Integer [FauxC a]
                  deriving (Eq, Functor, Foldable, Traversable)
    data FauxC a = VFC a
                 | AppFC (FauxC a) (FauxC a)
                 | IfzFC (FauxC a) (FauxC a) (Scope () FauxC a)
                 | LetFC [BindFC a] (Scope Int FauxC a)
                 | SucFC (FauxC a)
                 | ZeroFC
                 deriving (Eq, Functor, Foldable, Traversable)

The big difference is that we’ve lifted things out of let bindings. They now contain references to some global function instead of actually having the value right there. We also tag fixpoints as either fixing an Int or a Clos. The reasons for this will be apparent in a bit.

Now for the conversion. We don’t just have a function from ExpL to FauxC because we also want to make note of all the nested lets we’re lifting out of the program. Thus we use WriterT to gather a list of toplevel functions as we traverse the program. Other than that this is much like what we’ve seen before.

    type FauxCM a = WriterT [FauxCTop a] (Gen a)

    fauxc :: ExpL Integer -> FauxCM Integer (FauxC Integer)
    fauxc (VL a) = return (VFC a)
    fauxc (AppL f a) = AppFC <$> fauxc f <*> fauxc a
    fauxc ZeroL = return ZeroFC
    fauxc (SucL e) = SucFC <$> fauxc e
    fauxc (IfzL i t e) = do
      v <- gen
      e' <- abstract1 v <$> fauxc (instantiate1 (VL v) e)
      IfzFC <$> fauxc i <*> fauxc t <*> return e'

In the first couple of cases we just recurse, as we’ve seen before. Things only get interesting once we get to LetL.

    fauxc (LetL binds e) = do
      binds' <- mapM liftBinds binds
      vs <- replicateM (length binds) gen
      body <- fauxc $ instantiate (VL . (!!) vs) e
      let e' = abstract (flip elemIndex vs) body
      return (LetFC binds' e')

In this case we recurse with the function liftBinds across all the bindings, then do what we’ve done before and unwrap the body of the let and recurse in it. So the meat of this transformation is in liftBinds.

      where liftBinds (NRecL t clos bind) = lifter NRecFC clos bind
            liftBinds (RecL t clos bind) = lifter (RecFC $ bindTy t) clos bind
            lifter bindingConstr clos bind = do
              guid <- gen
              vs <- replicateM (length clos + 1) gen
              body <- fauxc $ instantiate (VL . (!!) vs) bind
              let bind' = abstract (flip elemIndex vs) body
              tell [FauxCTop guid (length clos + 1) bind']
              bindingConstr guid <$> mapM fauxc clos
            bindTy (Arr _ _) = Clos
            bindTy Nat = Int

To lift a binding, all we do is generate a globally unique identifier for the toplevel. Once we have that we can unwrap the particular binding we’re looking at. This is going to comprise the body of the FauxCTop function we’re building. Since we need it to be FauxC code as well, we recurse on it. Now we have a bunch of faux-C code for the body of the toplevel function. We then just repackage the body up into a binding (a FauxCTop needs one) and use tell to make a note of it. Once we’ve done that we return the stripped down let binding that just remembers the guid that we created for the toplevel function.

As an example, this code transforms

    let x = λ (x : τ). ... in
      ... x ...

into

    TOP = λ (x : τ). ...
    let x = TOP in
      ... x ...

With this done our language is now 80% of the way to C!

Converting To SSA-ish C

Converting our faux-C language to actual C has one complication: C doesn’t have let expressions. Given this, we have to flatten out a faux-C expression so we can turn a let expression into a normal C declaration. This conversion is almost a conversion to static single assignment form, SSA. I say almost because there’s precisely one place where we break the single assignment discipline. This is just because it seemed rather pointless to me to introduce an SSA IL with φ nodes just so I could compile it to C. YMMV.

SSA is what LLVM uses for its intermediate language, and because of this I strongly suspect regearing this compiler to target LLVM should be pretty trivial.

Now we’re using a library called c-dsl to make generating the C less painful, but there are still a couple of things we’d like to add. First of all, all our names are integers, so we have i2d and i2e for converting an integer into a C declaration or a C expression respectively.

    i2d :: Integer -> CDeclr
    i2d = fromString . ('_':) . show

    i2e :: Integer -> CExpr
    i2e = var . fromString . ('_':) . show

We also have a shorthand for the type of all expressions in our generated C code.

    taggedTy :: CDeclSpec
    taggedTy = CTypeSpec "tagged_ptr"

Finally, we have our writer monad and a helper function for implementing the SSA conversion. We emit C99 block items; tellDecl binds an expression to a fresh variable and then returns this variable.

    type RealCM = WriterT [CBlockItem] (Gen Integer)

    tellDecl :: CExpr -> RealCM CExpr
    tellDecl e = do
      i <- gen
      tell [CBlockDecl $ decl taggedTy (i2d i) $ Just e]
      return (i2e i)

Next we have the conversion procedure. Most of this is pretty straightforward because we shell out to calls into the runtime system for all the hard work. We have the following RTS functions:

  • mkZero, create a zero value
  • inc, increment an integer value
  • dec, decrement an integer value
  • apply, apply a closure to an argument
  • mkClos, make a closure closing over some values
  • EMPTY, an empty pointer, useful for default values
  • isZero, check if something is zero
  • fixedPoint, find the fixed point of a function
  • INT_SIZE, the size of the runtime representation of an integer
  • CLOS_SIZE, the size of the runtime representation of a closure

Most of this code is therefore just converting the expression to SSA form and using the RTS functions to do the appropriate computation at each step. Note that c-dsl provides a few overloaded string instances, so to generate the C code to apply a function we just use "foo"#[1, "these", "are", "arguments"].

The first few cases for conversion are nice and straightforward.

    realc :: FauxC CExpr -> RealCM CExpr
    realc (VFC e) = return e
    realc (AppFC f a) = ("apply" #) <$> mapM realc [f, a] >>= tellDecl
    realc ZeroFC = tellDecl $ "mkZero" # []
    realc (SucFC e) = realc e >>= tellDecl . ("inc"#) . (:[])

We take advantage of the fact that realc returns its result, so we can almost make this look like the applicative cases we had before. One particularly slick case is how Suc works: we compute the value of e and apply inc to the result. We then feed this expression into tellDecl, which binds it to a fresh variable and returns the variable. Haskell is pretty slick.

    realc (IfzFC i t e) = do
      outi <- realc i
      deci <- tellDecl ("dec" # [outi])
      let e' = instantiate1 (VFC deci) e
      (outt, blockt) <- lift . runWriterT $ (realc t)
      (oute, blocke) <- lift . runWriterT $ (realc e')
      out <- tellDecl "EMPTY"
      let branch b tempOut =
            CCompound [] (b ++ [CBlockStmt . liftE $ out <-- tempOut]) undefNode
          ifStat =
            cifElse ("isZero"#[outi]) (branch blockt outt) (branch blocke oute)
      tell [CBlockStmt ifStat]
      return out

In this next case we’re translating Ifz. For this we obviously need to compute the value of i. We do that by recursing and storing the result in outi. Now we want to be able to use 1 less than the value of i in case we go into the successor branch. This is done by calling dec on outi and storing it for later.

Next we do something a little odd. We recurse on the branches of Ifz but we definitely don’t want to compute both of them! So we can’t just use a normal recursive call. If we did they’d be added to the block we’re building up in the writer monad. So we use lift . runWriterT to give us back the blocks without adding them to the current one we’re building. Now it’s just a matter of generating the appropriate if statement.

To do this we add one instruction to the end of both branches, assigning to some output variable. This ensures that no matter which branch we go down, we’ll end up with the result in one place. This is also the one place where we are no longer doing SSA. Properly speaking we should write this with a φ node, but who has time for that? :)

Finally we add the if statement and the handful of declarations that precede it to our block. Now for the last case.

    realc (LetFC binds bind) = do
      bindings <- mapM goBind binds
      realc $ instantiate (VFC . (bindings !!)) bind
      where sizeOf Int = "INT_SIZE"
            sizeOf Clos = "CLOS_SIZE"
            goBind (NRecFC i cs) =
              ("mkClos" #) <$> (i2e i :) . (fromIntegral (length cs) :)
                           <$> mapM realc cs
                           >>= tellDecl
            goBind (RecFC t i cs) = do
              f <- ("mkClos" #) <$> (i2e i :) . (fromIntegral (length cs) :)
                                <$> mapM realc cs
                                >>= tellDecl
              tellDecl ("fixedPoint"#[f, sizeOf t])

For our last case we have to deal with lets. For this we simply traverse all the bindings, which are now flat, and then flatten the expression under the binder. When we mapM over the bindings we actually get back a list of the expressions each binding evaluated to. This is perfect for use with instantiate, making the conversion of the let body quite pleasant. goBind is slightly less so.

In the nonrecursive case all we have to do is create a closure, so goBind of a nonrecursive binding shells out to mkClos. mkClos is applied to the toplevel function’s identifier, the number of closed over expressions, and then the closed over expressions themselves; it needs the count because mkClos is variadic. Finally we shove the result into tellDecl as usual. For a recursive binding there’s a slight difference, namely after doing all of that we apply fixedPoint to the output and to the size of the type of the thing we’re fixing. This is why we kept types around for these bindings! With them we can avoid dragging the size along with every value since we know it statically.

Next, we have a function for converting a faux C function into an actual C function definition. This is the function in which we use realc.

    topc :: FauxCTop CExpr -> Gen Integer CFunDef
    topc (FauxCTop i numArgs body) = do
      binds <- gen
      let getArg = (!!) (args (i2e binds) numArgs)
      (out, block) <- runWriterT . realc $ instantiate getArg body
      return $
        fun [taggedTy] ('_' : show i) [decl taggedTy $ ptr (i2d binds)] $
          CCompound [] (block ++ [CBlockStmt . creturn $ out]) undefNode
      where indexArg binds i = binds ! fromIntegral i
            args binds na = map (VFC . indexArg binds) [0 .. na - 1]

This isn’t the most interesting function. We have one array of arguments to our C function, and we unbind the body of the FauxC function by indexing into this array. It’s not explicitly stated in the code, but the array contains the closed over expressions in the first n - 1 entries and the nth is the actual argument to the function. This is in line with how the variables are actually bound in the body of the function, which makes unwrapping the body to index into the argument array very simple. We then call realc, which transforms our faux-C expression into a block of actual C code. We add one last statement to the end of the block that returns the final outputted variable. All that’s left to do is bind it up into a C function and call it a day.

Putting It All Together

Finally, at the end of it all we have a function from expression to Maybe CTranslUnit, a C program.

    compile :: Exp Integer -> Maybe CTranslUnit
    compile e = runGen . runMaybeT $ do
      assertTy M.empty e Nat
      funs <- lift $ pipe e
      return . transUnit . map export $ funs
      where pipe e = do
              simplified <- closConv e >>= llift
              (main, funs) <- runWriterT $ fauxc simplified
              i <- gen
              let topMain = FauxCTop i 1 (abstract (const Nothing) main)
                  funs' = map (i2e <$>) (funs ++ [topMain])
              (++ [makeCMain i]) <$> mapM topc funs'
            makeCMain entry =
              fun [intTy] "main" [] $ hBlock ["call"#[i2e entry]]

This combines all the previous compilation passes. First we typecheck and ensure that the program is a Nat. Then we closure convert it and immediately lambda lift. This simplified program is then fed into fauxc, giving a faux-C expression for main and a bunch of functions called by main. We wrap up the main expression in a function that ignores all its arguments. We then map topc over all of these faux-C functions, which gives us actual C code. Finally, we tack on a trivial C main to call the generated code and return the whole thing.

And that’s our PCF compiler.

Wrap Up

Well if you’ve made it this far congratulations. We just went through a full compiler from a typed higher order language to C. Along the way we ended up implementing

  1. A Type Checker
  2. Closure Conversion
  3. Lambda Lifting
  4. Conversion to Faux-C
  5. SSA Conversion

If you’d like to fiddle a bit more, some fun projects might be

  1. Writing type checkers for all the intermediate languages. They’re all typeable except perhaps Faux-C
  2. Implement compilation to LLVM instead of C. As I said before, this shouldn’t be awful



March 24, 2015 12:00 AM

March 22, 2015

Neil Mitchell

Finding a GHC bug

Summary: I found a nasty bug in GHC 7.10 RC3. It's been fixed.

For Shake, I have an extensive test suite (2500+ lines of tests). I also test on 7 GHC versions, including GHC HEAD. After adding GHC 7.10 Release Candidate 3 to the mix one of the tests started failing. A week later the bug has been simplified, diagnosed and fixed as bug #10176. Below is a tale of how that happened, including a technical explanation of the bug in Step 8.

Step 1: Write a lot of tests

The Shake test that caught this particular bug checks that if the user makes a mistake then the error message must contain certain substrings correctly identifying the problem. With GHC 7.10 RC3 on Travis this test stopped throwing an exception entirely, continuing as though nothing were wrong. Weird.

Step 2: Reproduce locally

I tried to reproduce the failure locally, which ended up spotting a fatal bug in the GHC 7.10 RC3 32bit Windows version. After opting for the 64bit version, at first I couldn't reproduce the error. Eventually I realised that you needed to turn on optimisation (at -O1), and that running through ghci (how I usually develop Haskell) didn't cause the problem. Noticing that -O1 was required gave me a clue that it was related to an optimisation. The typical causes of programs that work without optimisation but fail with it are raising exceptions in pure code (since which exception you get can change under optimisation) and calling unsafePerformIO (it has unsafe in the name for a reason). I certainly do both those things in Shake, but I wasn't aware of anywhere I did them in a dubious manner.
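As a concrete (and entirely hypothetical, nothing to do with Shake) illustration of the unsafePerformIO hazard: two syntactically identical unsafePerformIO expressions may be merged by common subexpression elimination under -O, which is why such definitions conventionally carry NOINLINE pragmas:

```haskell
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Without optimisation these are two distinct references. With -O, GHC is
-- entitled to share the identical right-hand sides, merging them into one
-- reference. NOINLINE pragmas on ref1 and ref2 are the usual defence.
ref1, ref2 :: IORef Int
ref1 = unsafePerformIO (newIORef 0)
ref2 = unsafePerformIO (newIORef 0)

main :: IO ()
main = do
  writeIORef ref1 42
  readIORef ref2 >>= print  -- 0 if the refs are distinct, 42 if merged
```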

Step 3: Reduce the test case

I spent a lot of time trying to reduce the test case. By inserting print statements I narrowed the place the difference was happening to Development.Shake.Core.applyKeyValue, which is a pretty core bit of Shake. However, while I was able to chop out a lot of auxiliary features (lint tracking, command tracing) the actual code remained difficult to reduce to any great extent, for two reasons. Firstly, the bug was incredibly fragile - moving a monomorphic NOINLINE function from one module to another made the bug disappear. Secondly, the applyKeyValue function is right in the middle of Shake, and the test required a few successful Shake runs to set up things for the failing test, so I couldn't change its observable semantics too much.

What I did conclude was that Shake didn't seem to be doing anything dodgy in the small patch of code that seemed relevant, giving me the first hint that maybe GHC was at fault, not Shake.

Step 4: Differences at the Core level

At this point, I reached out to the GHC mailing list, asking if anyone had any ideas of a culprit. They didn't, but Simon Peyton Jones suggested finding the smallest breaking change and comparing the generated Core. You can do that by compiling with -ddump-simpl, and adding -dsuppress-all -dsuppress-uniques to get something a bit easier to diff. Fortunately, by this point I had a very small change to make the error appear/disappear (moving a function from one module to another), so the difference in Core was tiny. The change in the problematic version read:

case (\_ -> error "here") of {}

In GHC Core a case always evaluates its scrutinee until it has the outermost value available (aka WHNF). The empty alternatives mean that GHC has proven that the evaluation always results in an exception. However, a lambda already has a value available (namely the lambda) so evaluation never throws an exception. As a result, GHC has violated the rules of Core and bad things happen.
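The same invariant can be observed from ordinary Haskell with seq, which, like a Core case, evaluates its first argument to WHNF; a lambda is already a value, so forcing one never throws. This is a standalone illustration, not code from Shake or GHC:

```haskell
main :: IO ()
main = do
  -- Forcing a lambda to WHNF is a no-op: the error inside is never reached.
  putStrLn ((\_ -> error "here") `seq` "the lambda is already a value")
  -- Forcing the error itself, on the other hand, would throw:
  -- putStrLn (error "here" `seq` "never printed")
```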

Step 5: Reducing further

In order to reduce the bug further I now had a better test, namely:

ghc Core.hs -O -ddump-simpl | grep -F "case (\\"

With this test I didn't have to keep the internals of Shake working, and in fact didn't even have to provide a runnable entry point - all I had to do was look for the dodgy construction in the Core language. Note that I'm not actually looking for case of a lambda with empty alternatives, reasoning (seemingly correctly) that any case on a lambda with non-empty alternatives would be eliminated by the GHC simplifier, so any case followed by lambda is buggy.

I reduced by having a ghcid window open in one corner, using the warnings -fwarn-unused-binds and -fwarn-unused-imports. I hacked out some part of the program and then patched everything up so it no longer raised an error, using ghcid for rapid feedback. I then ran the grep test. If the bug had gone I put the program back to how it was and tried somewhere else. If the bug remained I cleaned up the now redundant declarations and imports and checked again, repeating until the code was minimal.

Several hours later I was left with something like:

buggy :: (() -> Bool) -> () -> Bool -> IO ()
buggy fun unit bool =
    runReaderT (
        (if bool then liftIO $ print () else p) >>
        (if fun unit then error2Args unit unit >> p else p)) ()

{-# NOINLINE error2Args #-}
error2Args :: () -> () -> a
error2Args _ _ = error "here"

Note that error2Args must be in a different module to buggy.

Step 6: Bisecting

At this point hvr stepped in and bisected all the changes between GHC 7.10 RC2 and RC3, determining that a large Typeable change introduced the bug in the original shake test case. However, using the minimal program, the bug was also present in GHC 7.10 RC2. That suggested the bug might have been around for a while.

Step 7: Augmenting GHC's Lint Checker

GHC already has a pass in the compiler, enabled with -dcore-lint, which checks for dodgy constructs in the Core language. Enabling it didn't pick up this example (hence I used grep instead), so Joachim Breitner added such a check. He also added the example as a test case, so that if it ever breaks in future it will be spotted immediately.

Step 8: Diagnose and Fix

Joachim then continued to diagnose and fix the issue, the details of which can be found in the patch. The problem (as I understand it) is that GHC looks at the code:

fun x = error "foo" x

And concludes two facts.

  1. If fun is called with one argument then the code will raise an error. That's true, and allows the compiler to replace fun () () with fun ().
  2. After analysing all calls of fun it spots that fun is always called with two arguments, so it is free to change fun to be fun x y = error "foo" x y.

By applying these two facts, we can make the transformation:

case fun () () of {}
-- apply the first rule
case fun () of {}
-- inline fun after applying the second rule
case (\x y -> error "foo" x y) () of {}
-- beta reduce:
case (\y -> error "foo" () y) of {}

Now we have caused invalid Core to be produced. While the two facts are each individually correct, applying the first fact causes the second fact to stop being true. Joachim fixed this by making the call argument count analysis stop at the first argument that guarantees an error.

Step 9: The Impact

The manifestation of the bug is quite interesting. Essentially GHC decides something is an error, but then fails to actually throw the error. As a result, any code the simplifier places after the error call will be eliminated, and that can remove a large chunk of the program. However, any code the simplifier doesn't manage to move next to the error call will still get run, even though it should have been skipped due to the error. In essence, given exactly the wrong conditions to trigger the bug, you can write:

main = do
    putStrLn "here1"
    ... error "foo" ...
    putStrLn "here2"
    putStrLn "here3"

And end up with the program printing here1 followed by here3, without throwing an exception. In the case of my original Shake test it started to compile, should have stopped with an error but instead just skipped compiling altogether and went on to do the bits after compiling. A very weird manifestation.

Disclaimer: I've eliminated many missteps of mine, which included pushing random patches to try and reduce on the Travis machine and installing a Linux VM.

by Neil Mitchell ([email protected]) at March 22, 2015 10:05 PM

Philip Wadler

Status Report 4

It seemed as if no time had passed: the anaesthetist injected my spine, and next thing I knew I was waking in recovery.  Keyhole surgery to remove my left kidney was completed on Tuesday 17 March, and I expect to leave the Western General on Saturday 21 March. Meanwhile, progress on diagnosing the amyloid spotted in my liver: I had a bone marrow biopsy on Thursday 19 March, and two days of testing at the National Amyloidosis Centre in London are to be scheduled. NHS has provided excellent care all around.

My room was well placed for watching the partial eclipse this morning. A nurse with a syringe helped me jury rig a crude pinhole camera (below), but it was too crude. Fortunately, there was exactly the right amount of cloud cover through which to view the crescent sun. My fellow patients and our nurses all gathered together, and for five minutes it was party time on the ward.

Update: I left the hospital as planned on Saturday 21 March. Thanks to Guido, Sam, Shabana, Stephen, and Jonathan for visits; to Marjorie for soup; to Sukkat Shalom council for a card and to Gillian for hand delivery; and to Maurice for taking me in while my family was away.

Related: Status report, Status report 2, A paean to the Western General, Status report 3.

by Philip Wadler ([email protected]) at March 22, 2015 07:44 PM

March 21, 2015

Thiago Negri

Using Facebook's Flux architectural style for game development in Unity 3D

I decided to work again in an old game project. The game is named "Shotem" and the mechanic is quite simple: coins start falling from the sky and you need to "shot them" (got it?). Tap on a coin to shoot it, you gain score if you hit and you lose score if you miss the shot or if a coin falls off the screen (in case you didn't tap it). You have seven bullets to use, each shot (finger tap) consumes a bullet and you need to shake the mobile device to reload your bullets.

The game is still a work in progress, but I recorded a video to show that it actually works in real life.

I'll write the history of the development process of this game. It started in a very naive form, then I converted it to Flux style and left it alone for three months. Now I'm coming back to it, and I had a pleasant surprise reading the code after forgetting about it. I think Flux style made it easy to continue the project where I left it. This post is 100% opinion based. Please, take a seat.

The first version I coded was very straightforward. Each coin was a GameObject with rendering and game behavior logic. All game state was inside a Singleton, and two other objects were responsible for creating coins and handling user taps. Not organized at all.

I was interested in how to code the "reload by shaking" feature. As soon as I got it working, I lost my interest in the game mechanic itself. This happened quite fast actually, Unity 3D has a very nice class to use for dealing with the mobile device gyroscope: Input.gyro.

At the same time, I got in touch with Facebook's Flux architectural style. I tried it in some web projects and had a nice feeling about it. It seemed easier to write, maintain and understand code written in Flux style than the code written in MVC fashion. So, I decided to rewrite the code of Shotem in Flux style.

At first, it didn't seem to fit into Unity's way of doing things. But I worked it out and coded it anyway. It worked fine, then I left the code alone.

Three months later I'm coming back to this project and I didn't remember that I had tested Flux style in it. At first I thought: "oh no, I'll need to delete this project and start it over, why did I do it?" Then I read the code and looked at the project structure: I was amazed at how easy it was to get things back on track, and I was able to get productive again in less than five minutes! What a pleasant surprise!

Now, I feel that I should share this with the rest of the world. I appreciate feedback, so if you feel that using Flux style in game development is madness or if you feel like giving it a try too, leave a comment.

From now on, I'll use some terms from Flux, e.g. Store, Dispatcher, Action. I won't explain what are their meanings, so feel free to take a little detour and read the official Facebook's Flux website to get a summary of it.

I created a Dispatcher class to register the Stores, dispatch Actions, track dependencies between Stores, and ensure that Actions do not cascade.

The core game mechanic is encoded in three Stores: CoinStore, handling all coins in the game; GunStore, handling player's ammunition; ScoreStore, tracking user score.

Player input may dispatch two Actions in the game: ShootAction and ReloadGunAction. To handle the physics of the game (falling coins), one Action is sent every FixedUpdate: PhysicsUpdateAction.

The Stores handle each action according to simple rules:

PhysicsUpdateAction is handled by CoinStore by spawning new coins, moving the coins in the screen and removing coins that are off the screen. ScoreStore waits for the CoinStore to finish and then updates the score by decrementing the amount of coins that were removed from CoinStore, because they have fallen off the screen. GunStore ignores this action.

ShootAction is handled by GunStore by decrementing player's ammunition. CoinStore waits for the GunStore and ignores the action if no ammunition was used, otherwise it checks for overlapping coins in the shot area and removes them. ScoreStore waits for GunStore and CoinStore then increases the score by each coin removed or decreases the score in case an ammunition was used and no coin was removed.

ReloadGunAction is handled by GunStore to reset the player's ammunition to seven. CoinStore and ScoreStore ignores this action.

That's the whole game logic. Each Store is observable; the Views register themselves to receive update notifications. When a Store handles an Action that changes its state, it triggers a notification. I've created an abstract class Store to handle this common behavior.

The code of ScoreStore and of the other Stores all follows the same style.

Setup code is done in a Singleton GameObject's Awake method. I called it StoreHub because I use it to get references to the Stores from other Views.

ShotemActionGenerator is a helper class to create Actions and send them to Dispatcher, nothing fancy.

The View classes are also simple. For example, the game may play three different sound effects when the GunStore is updated: when a bullet is shot, when the user tries to shoot without ammunition left, and when the gun is reloaded.

Another view shows the quantity of ammunition remaining, which is just a set of children GameObjects with Renderers that I enable or disable based on the GunStore state.

There is a View to show the score, one to show a tip reminding the player to reload the gun when it's empty, another to handle the renderers of each coin visible on the screen, etc. The entire game is built on the same solid base, or may I say "solid flux". Data always flows in a single direction without cascading; there is always an Action that enters the system and triggers something, and nothing happens if no Action is dispatched.

I don't know if this design can scale to a big game project, as the style as it stands depends on running on a single core. But none of my games are that intense, so I will try to finish this game with Flux. I guess that using Flux will also make it easier to create automated tests of the game mechanic, as it is self-contained. Maybe I'll post some follow-up thoughts on it.

As an overview of the current flux of the game: Actions are generated by a layer I decided to call "Inception" (Flux does not give this layer a name) and flow through the Dispatcher to the Stores, with notifications flowing on to the Views.

Thanks for your time,

by Thiago Negri ([email protected]) at March 21, 2015 03:23 PM

Mark Jason Dominus

Examples of contracts you should not sign

Shortly after I posted A public service announcement about contracts Steve Bogart asked me on on Twitter for examples of dealbreaker clauses. Some general types I thought of immediately were:

  • Any nonspecific non-disclosure agreement with a horizon more than three years off, because after three years you are not going to remember what it was that you were not supposed to disclose.

  • Any contract in which you give up your right to sue the other party if they were to cheat you.

  • Most contracts in which you permanently relinquish your right to disparage or publicly criticize the other party.

  • Any contract that leaves you on the hook for the other party's losses if the project is unsuccessful.

  • Any contract that would require you to do something immoral or unethical.

A couple of recent specific examples:

  • Comcast is negotiating a contract with our homeowner's association to bring cable Internet to our village; the proposed agreement included a clause in which we promised not to buy Internet service from any other company for the next ten years. I refused to sign. The guy on our side who was negotiating the agreement was annoyed with me. If too many people refuse to sign, maybe Comcast will back out. “Do you think you're going to get FIOS in here in the next ten years?” he asked sarcastically. “No,” I said. “But I might move.”

    Or, you know, I might get sick of Comcast and want to go back to whatever I was using before. Or my satellite TV provider might start delivering satellite Internet. Or the municipal wireless might suddenly improve. Or Google might park a crazy Internet Balloon over my house. Or some company that doesn't exist yet might do something we can't even imagine. Google itself is barely ten years old! The iPhone is only eight!

  • In 2013 I was on a job interview at company X and was asked to sign an NDA that enjoined me from disclosing anything I learned that day for the next ten years. I explained that I could not sign such an agreement because I would not be able to honor it. I insisted on changing it to three years, which is also too long, but I am not completely unwilling to compromise. It's now two years later and I have completely forgotten what we discussed that day; I might be violating the NDA right now for all I know. Had they insisted on ten years, would I have walked out? You bet I would. You don't let your mouth write checks that your ass can't cash.

by Mark Dominus ([email protected]) at March 21, 2015 12:00 AM

A public service announcement about contracts

Every so often, when I am called upon to sign some contract or other, I have a conversation that goes like this:

Me: I can't sign this contract; clause 14(a) gives you the right to chop off my hand.

Them: Oh, the lawyers made us put that in. Don't worry about it; of course we would never exercise that clause.

There is only one response you should make to this line of argument:

Well, my lawyer says I can't agree to that, and since you say that you would never exercise that clause, I'm sure you will have no problem removing it from the contract.

Because if the lawyers made them put it in there, that is for a reason. And there is only one possible reason, which is that the lawyers do, in fact, envision that they might one day exercise that clause and chop off your hand.

The other party may proceed further with the same argument: “Look, I have been in this business twenty years, and I swear to you that we have never chopped off anyone's hand.” You must remember the one response, and repeat it:

Great! Since you say that you have never chopped off anyone's hand, then you will have no problem removing that clause from the contract.

You must repeat this over and over until it works. The other party is lazy. They just want the contract signed. They don't want to deal with their lawyers. They may sincerely believe that they would never chop off anyone's hand. They are just looking for the easiest way forward. You must make them understand that there is no easier way forward than to remove the hand-chopping clause.

They will say “The deadline is looming! If we don't get this contract executed soon it will be TOO LATE!” They are trying to blame you for the blown deadline. You should put the blame back where it belongs:

As I've made quite clear, I can't sign this contract with the hand-chopping clause. If you want to get this executed soon, you must strike out that clause before it is TOO LATE.

And if the other party would prefer to walk away from the deal rather than abandon their hand-chopping rights, what does that tell you about the value they put on the hand-chopping clause? They claim that they don't care about it and they have never exercised it, but they would prefer to give up on the whole project, rather than abandon hand-chopping? That is a situation that is well worth walking away from, and you can congratulate yourself on your clean escape.

[ Addendum: Steve Bogart asked on Twitter for examples of unacceptable contract demands; I thought of so many that I put them in a separate article. ]

by Mark Dominus ([email protected]) at March 21, 2015 12:00 AM

March 20, 2015

Philip Wadler

Status Report 3

I have a date for keyhole surgery to remove my kidney: Tuesday 17 March. This fits what I was told previously, so all is to schedule. Recovery time is likely to be four weeks. I expect to be in hospital until Monday 23 March: visitors most welcome! Please call the Western General, or ask at the Level 4 office, which has my contact details.

My liver biopsy took place on Thursday 19 February; the hardest part was to lie still for six hours after. I had meetings with my infectious disease doctor on Wednesday 25 February and with my cardiologist on Monday 2 March. My endocarditis is clear, but will need to be monitored once a year. The biopsy shows a deposit of amyloid protein; experts are being consulted, but it has no effect on my surgery. My thanks to all my doctors and the staff at the Western General, who have been excellent.

Related: Status report, Status report 2, A paean to the Western General, Status report 4.

by Philip Wadler ([email protected]) at March 20, 2015 02:13 PM

Mark Jason Dominus

Rectangles with equal area and perimeter

Wednesday while my 10-year-old daughter Katara was doing her math homework, she observed with pleasure that a 3×6 rectangle has a perimeter of 18 units and also an area of 18 square units. I mentioned that there was an infinite family of such rectangles, and, after a small amount of tinkering, that the only other such rectangle with integer sides is a 4×4 square, so in a sense she had found the single interesting example. She was very interested in how I knew this, and I promised to show her how to figure it out once she finished her homework. She didn't finish before bedtime, so we came back to it the following evening.

This is just one of many examples of how she has way too much homework, and how it interferes with her education.

She had already remarked that she knew how to write an equation expressing the condition she wanted, so I asked her to do that; she wrote $$(L×W) = ([L+W]×2).$$ I remember being her age and using all different shapes of parentheses too. I suggested that she should solve the equation for W, getting W on one side and a bunch of stuff involving L on the other, but she wasn't sure how to do it, so I offered suggestions while she moved the symbols around, eventually obtaining $$W = 2L\div (L-2).$$ I would have written it as a fraction, but getting the right answer is important, and using the same notation I would use is much less so, so I didn't say anything.

I asked her to plug in L = 3 and observe that W = 6 popped right out, and then similarly that L = 6 yields W = 3, and then I asked her to try the other example she knew. Then I suggested that she see what L = 5 did: it gives W = 10/3. This was new, so she checked it by calculating the area and the perimeter, both 50/3. She was very excited by this time. As I have mentioned earlier, algebra is magical in its ability to mechanically yield answers to all sorts of questions. Even after thirty years I find it astonishing and delightful. You set up the equations, push the symbols around, and all sorts of stuff pops out like magic. Calculus is somehow much less astonishing; the machinery is all explicit. But how does algebra work? I've been thinking about this on and off for a long time and I'm still not sure.

At that point I took over because I didn't think I would be able to guide her through the next part of the problem without a demonstration; I wanted to graph the function and she does not have much experience with that. She put in the five points we already knew, which already lie on a nice little curve, and then she asked an incisive question: does it level off, or does it keep going down, or what? We discussed what happens when L gets close to 2; then W shoots up to infinity. And when L gets big, say a million, you can see from the algebra that W is a hair more than 2. So I drew in the asymptotes on the hyperbola.

Katara is not yet familiar with hyperbolas. (She has known about parabolas since she was tiny. I have a very fond memory of visiting Portland with her when she was almost two, and we entered Holladay park, which has fountains that squirt out of the ground. Seeing the water arching up before her, she cried delightedly “parabolas!”)

Once you know how the graph behaves, it is a simple matter to see that there are no integer solutions other than 3×6 and 4×4. We know that L = 5 does not work. For L ≥ 7 the value of W is always strictly between 2 and 3. For L = 2 there is no value of W that works at all. For L < 2 the formula says that W is negative, on the other branch of the hyperbola, which is a perfectly good numerical solution (for example, L = 1 gives W = -2) but makes no sense as the width of a rectangle. So it was a good lesson about how mathematical modeling sometimes introduces solutions that are wrong, and how you have to translate the solutions back to the original problem to see if they make sense.
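The conclusion is easy to double-check by brute force; here is a quick Python sketch (mine, not from the post) that enumerates integer-sided rectangles whose area equals their perimeter:

```python
# Enumerate integer-sided rectangles (with W <= L) whose area equals their
# perimeter. The bound of 100 is safe: for L >= 7, W = 2L/(L-2) lies strictly
# between 2 and 3, so no larger integer solutions can exist.
solutions = [(l, w) for l in range(1, 100)
                    for w in range(1, l + 1)
                    if l * w == 2 * (l + w)]
print(solutions)  # [(4, 4), (6, 3)]
```

As expected, only the 4×4 square and the 3×6 rectangle show up.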

[ Addendum 20150330: Thanks to Steve Hastings for his plot of the hyperbola, which is in the public domain. ]

by Mark Dominus ([email protected]) at March 20, 2015 12:00 AM

March 19, 2015

Noam Lewis

A simple bash script for monitoring the output of a running process

I got tired of my screen getting flooded with text when all I wanted was to see the last output line at any given time. What I really want is to just see one output line and have it replaced with some newer text when it's available. So I wrote a script (in, ahem, bash) that lets you do exactly this. Call it whatever you want:

(If the code below is screwed up, see it here on gist)

#!/bin/bash -eu
# Full output is also logged here; the default path is an assumption -- export LOG_FILE to override.
LOG_FILE="${LOG_FILE:-/tmp/monitor.log}"
SB="stdbuf -i0 -oL"
tput sc
"$@" 2>&1 \
   | $SB tee "$LOG_FILE" \
   | $SB cut -c-$(tput cols) \
   | $SB sed -u 's/\(.\)/\\\1/g' \
   | $SB xargs -0 -d'\n' -iyosi  -n1  bash -c 'tput rc;tput el; printf "\r%s" yosi'
tput rc;tput el;printf "\r" # Delete the last printed line

I use it in a test env bringup script to run some builds. Instead of having to lose all the context of the wrapping script, I now see just a single output line of the invoked builds at any given time.

by sinelaw at March 19, 2015 08:21 PM

Mark Jason Dominus

An ounce of theory is worth a pound of search

The computer is really awesome at doing quick searches for numbers with weird properties, and people with an amateur interest in recreational mathematics would do well to learn some simple programming. People appear on math.stackexchange quite often with questions about tic-tac-toe, but there are only 5,478 total positions, so any question you want to ask can be instantaneously answered by an exhaustive search. An amateur showed up last fall asking “Is it true that no prime larger than 241 can be made by either adding or subtracting 2 coprime numbers made up out of the prime factors 2, 3, and 5?” and, once you dig through the jargon, the question is easily answered by the computer, which quickly finds many counterexamples, such as 3^5 + 2^3 = 251 and 2^8 + 3^4 = 337.
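To show how little code such a search takes, here is a brute-force Python sketch (my reconstruction, not the original program) that finds counterexamples to the claim:

```python
from math import gcd

def is_prime(n):
    """Trial-division primality test; fine for the small numbers here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_5_smooth(n):
    """True if n's only prime factors are 2, 3, and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1

# Coprime 5-smooth pairs whose sum or difference is a prime larger than 241.
counterexamples = set()
for x in range(2, 1000):
    if not is_5_smooth(x):
        continue
    for y in range(x + 1, 1000):
        if is_5_smooth(y) and gcd(x, y) == 1:
            for s in (x + y, y - x):
                if s > 241 and is_prime(s):
                    counterexamples.add((x, y, s))

print(sorted(counterexamples)[:5])
```

Among the results is (8, 243, 251), i.e. 2^3 + 3^5 = 251, a prime larger than 241.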

But sometimes the search appears too large to be practical, and then you need to apply theory. Sometimes you can deploy a lot of theory and solve the problem completely, avoiding the search. But theory is expensive, and not always available. A hybrid approach often works, which uses a tiny amount of theory to restrict the search space to the point where the search is easy.

One of these I wrote up on this blog back in 2006:

A number n is excellent if it has an even number of digits, and if when you chop it into a front half a and a back half b, you have b^2 - a^2 = n. For example, 48 is excellent, because 8^2 - 4^2 = 48, and 3468 is excellent, because 68^2 - 34^2 = 4624 - 1156 = 3468.

The programmer who gave me this problem had tried a brute-force search over all numbers, but to find all 10-digit excellent numbers, this required an infeasible search of 9,000,000,000 candidates. With the application of a tiny amount of algebra, one finds that b(b - 1) = a(10^5 + a), and it's not hard to quickly test candidates for a to see if a(10^5 + a) has this form and if so to find the corresponding value of b. (Details are in the other post.) This reduces the search space for 10-digit excellent numbers from 9,000,000,000 candidates to 90,000, which could be done in under a minute even with last-century technology, and is pretty nearly instantaneous on modern equipment.
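The trick generalizes to 2k-digit numbers as b(b - 1) = a(10^k + a), and can be sketched in a few lines of Python (a reconstruction, not the original code):

```python
import math

def excellent(k):
    """Excellent numbers with 2k digits: n = a*10^k + b (front half a, back
    half b) with b^2 - a^2 = n, which rearranges to b*(b-1) = a*(10^k + a).
    So we scan the k-digit values of a and solve a quadratic for b."""
    found = []
    for a in range(10 ** (k - 1), 10 ** k):
        rhs = a * (10 ** k + a)
        # b*(b-1) = rhs  <=>  b = (1 + sqrt(1 + 4*rhs)) / 2 must be an integer
        s = math.isqrt(1 + 4 * rhs)
        if s * s == 1 + 4 * rhs:
            b = (1 + s) // 2
            if b < 10 ** k:
                found.append(a * 10 ** k + b)
    return found

print(excellent(1), excellent(2))  # [48] [3468]
```

For each candidate a there is only a square-root test to do, instead of a scan over every possible back half.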

But anyway, the real point of this note is to discuss a different problem entirely. A recreational mathematician on stackexchange wanted to find distinct integers a, b, c, d for which a^2 + b^2, b^2 + c^2, c^2 + d^2, and d^2 + a^2 were all perfect squares. You can search over all possible quadruples of numbers, but this takes a long time. The querent indicated later that he had tried such a search but lost patience before it yielded anything.

Instead, observe that if a^2 + b^2 is a perfect square then a and b are the legs of a right triangle with integer sides; they are terms in what is known as a Pythagorean triple. The prototypical example is 3^2 + 4^2 = 5^2, and (3, 4, 5) is a Pythagorean triple. (The querent was quite aware that he was asking for Pythagorean triples, and mentioned them specifically.)

Here's the key point: It has been known since ancient times that if (a, b, c) is a Pythagorean triple, then there exist integers m and n with m < n such that: $$\begin{align} a & = n^2-m^2 \\ b & = 2mn \\ c & = n^2 + m^2 \end{align}$$

So you don't have to search for Pythagorean triples; you can just generate them with no searching:

    for my $m (1 .. 200) {
      for my $n ($m+1 .. 200) {
        my $a = $n*$n-$m*$m;
        my $b = 2 * $n * $m;
        $trip{$a}{$b} = 1;
        $trip{$b}{$a} = 1;
      }
    }

This builds a hash table, %trip, with two important properties:

  1. $trip{$a} is a sub-table whose keys are all the numbers that can form a triple with $a. For example, $trip{20} is a hash with three keys: 21, 48, and 99, because 20^2 + 21^2 = 29^2, 20^2 + 48^2 = 52^2, and 20^2 + 99^2 = 101^2, but 20 is not a participant in any other triples.

  2. $trip{$a}{$b} is true if and only if a^2 + b^2 is a perfect square, and false otherwise.

The table has only around 40,000 entries. Having constructed it, we now search it:

    for my $a (keys %trip) {
      for my $b (keys %{$trip{$a}}) {
        for my $c (keys %{$trip{$b}}) {
          next if $c == $a;
          for my $d (keys %{$trip{$c}}) {
            next if $d == $b;
            print "$a $b $c $d\n" if $trip{$d}{$a};
          }
        }
      }
    }

The outer loop runs over each $a that is known to be a member of a Pythagorean triple. (Actually the formulas show that every number bigger than 2 is a member of some triple, but we may as well skip the ones that are only in triples we didn't tabulate.) Then the next loop runs over every $b that can possibly form a triple with $a; that is, every $b for which a^2 + b^2 is a perfect square. We don't have to search for them; we have them tabulated ahead of time. Then for each such $b (and there aren't very many) we run over every $c that forms a triple with $b, and again there is no searching and very few candidates. Then similarly $d, and if the $d we try forms a triple with $a, we have a winner.

The next if $c == $a and next if $d == $b tests are to rule out trivial solutions like (a, b, a, b), which the querent wasn't interested in anyway. We don't have to test for equality of any of the other pairs because no number can form a Pythagorean triple with itself.

This runs in less than a second on so-so hardware and produces 11 solutions:

    3472  7296  10400 2175
    4312  23520 12008 465
    6512  9984  800   6375
    12312 666   1288  8415
    14592 6944  4350  20800
    16830 2576  1332  24624
    19968 13024 12750 1600
    25500 26048 39936 3200
    30192 6175  2400  9856
    41600 29184 13888 8700
    47040 8624  930   24016

Only five of these are really different. For example, the last one is the same as the second, with every element multiplied by 2; the third, seventh, and eighth are similarly the same. In general if (a, b, c, d) is a solution, so is (ka, kb, kc, kd) for any k. A slightly improved version would require that the four numbers not have any common factor greater than 1; there are few enough solutions that the cost of this test would be completely negligible.

The only other thing wrong with the program is that it produces each solution 8 times; if (a, b, c, d) is a solution, then so are (b, c, d, a) and (d, c, b, a) and so on. This is easily fixed with a little post-filtering; pipe the output through

    perl -nle '$k = join " ",  sort { $a <=> $b } split; print unless $seen{$k}++ '

or something of that sort.
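For readers who'd rather not run Perl, here is an equivalent sketch in Python (same algorithm, with the sort-based de-duplication folded into a set of sorted quadruples):

```python
from collections import defaultdict

# trip[a] holds every b for which a^2 + b^2 is a perfect square,
# generated from the m, n parametrization rather than searched for.
trip = defaultdict(set)
for m in range(1, 201):
    for n in range(m + 1, 201):
        a, b = n * n - m * m, 2 * m * n
        trip[a].add(b)
        trip[b].add(a)

# Walk 4-cycles a-b-c-d-a; sorting each quadruple collapses the
# 8 rotated/reflected copies of every solution.
quads = set()
for a in list(trip):
    for b in trip[a]:
        for c in trip[b]:
            if c == a:
                continue
            for d in trip[c]:
                if d != b and a in trip[d]:
                    quads.add(tuple(sorted((a, b, c, d))))

print(len(quads))
```

With m and n up to 200, this reproduces the 11 solutions above, up to rotation and reflection.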

The corresponding run with m and n up to 2,000 instead of only 200 takes 5 minutes and finds 445 solutions, of which 101 are distinct. It would take a very long time to find these with a naïve search.

by Mark Dominus ([email protected]) at March 19, 2015 12:00 AM

March 18, 2015

Keegan McAllister

html5ever project update: one year!

I started the html5ever project just a bit over one year ago. We adopted the current name last July.

<kmc> maybe the whole project needs a better name, idk
<Ms2ger> htmlparser, perhaps
<jdm> tagsoup
<Ms2ger> UglySoup
<Ms2ger> Since BeautifulSoup is already taken
<jdm> html5ever
<Ms2ger> No
<jdm> you just hate good ideas
<pcwalton> kmc: if you don't call it html5ever that will be a massive missed opportunity

By that point we already had a few contributors. Now we have 469 commits from 18 people, which is just amazing. Thank you to everyone who helped with the project. Over the past year we've upgraded Rust almost 50 times; I'm extremely grateful to the community members who had a turn at this Sisyphean task.

Several people have also contributed major enhancements. For example:

  • Clark Gaebel implemented zero-copy parsing. I'm in the process of reviewing this code and will be landing pieces of it in the next few weeks.

  • Josh Matthews made it possible to suspend and resume parsing from the tree sink. Servo needs this to do async resource fetching for external <script>s of the old-school (non-async/defer) variety.

  • Chris Paris implemented fragment parsing and improved serialization. This means Servo can use html5ever not only for parsing whole documents, but also for the innerHTML/outerHTML getters and setters within the DOM.

  • Adam Roben brought us dramatically closer to spec conformance. Aside from foreign (XML) content and <template>, we pass 99.6% of the html5lib tokenizer and tree builder tests! Adam also improved the build and test infrastructure in a number of ways.

I'd also like to thank Simon Sapin for doing the initial review of my code, and finding a few bugs in the process.

html5ever makes heavy use of Rust's metaprogramming features. It's been something of a wild ride, and we've collaborated with the Rust team in a number of ways. Felix Klock came through in a big way when a Rust upgrade broke the entire tree builder. Lately, I've been working on improvements to Rust's macro system ahead of the 1.0 release, based in part on my experience with html5ever.

Even with the early-adopter pains, the use of metaprogramming was absolutely worth it. Most of the spec-conformance patches were only a few lines, because our encoding of parser rules is so close to what's written in the spec. This is especially valuable with a "living standard" like HTML.

The future

Two upcoming enhancements are a high priority for Web compatibility in Servo:

  • Character encoding detection and conversion. This will build on the zero-copy UTF-8 parsing mentioned above. Non-UTF-8 content (~15% of the Web) will have "one-copy parsing" after a conversion to UTF-8. This keeps the parser itself lean and mean.

  • document.write support. This API can insert arbitrary UTF-16 code units (which might not even be valid Unicode) in the middle of the UTF-8 stream. To handle this, we might switch to WTF-8. Along with document.write we'll start to do speculative parsing.

It's likely that I'll work on one or both of these in the next quarter.

Servo may get SVG support in the near future, thanks to canvg. SVG nodes can be embedded in HTML or loaded from an external XML file. To support the first case, html5ever needs to implement WHATWG's rules for parsing foreign content in HTML. To handle external SVG we could use a proper XML parser, or we could extend html5ever to support "XML5", an error-tolerant XML syntax similar to WHATWG HTML. Ygg01 made some progress towards implementing XML5. Servo would most likely use it for XHTML as well.

Improved performance is always a goal. html5ever describes itself as "high-performance" but does not have specific comparisons to other HTML parsers. I'd like to fix that in the near future. Zero-copy parsing will be a substantial improvement, once some performance issues in Rust get fixed. I'd like to revisit SSE-accelerated parsing as well.

I'd also like to support html5ever on some stable Rust 1.x version, although it probably won't happen for 1.0.0. The main obstacle here is procedural macros. Erick Tryzelaar has done some great work recently with syntex, aster, and quasi. Switching to this ecosystem will get us close to 1.x compatibility and will clean up the macro code quite a bit. I'll be working with Erick to use html5ever as an early validation of his approach.

Simon has extracted Servo's CSS selector matching engine as a stand-alone library. Combined with html5ever this provides most of the groundwork for a full-featured HTML manipulation library.

The C API for html5ever still builds, thanks to continuous integration. But it's not complete or well-tested. With the removal of Rust's runtime, maintaining the C API does not restrict the kind of code we can write in other parts of the parser. All we need now is to complete the C API and write tests. This would be a great thing for a community member to work on. Then we can write bindings for every language under the sun and bring fast, correct, memory-safe HTML parsing to the masses :)

by keegan ([email protected]) at March 18, 2015 10:29 PM

FP Complete

Stackage and GHC 7.10 update

As many of you saw, GHC 7.10 RC3 is now available. As has become a pretty long-standing tradition, the Stackage project has been tracking GHC 7.10 library support for a few months now. Given how close we are to a release, now seems like a good time to make a more public status report.

Firstly: restrictive upper bounds. I added a comment two hours ago listing all of the restrictive upper bounds, that is, upper bounds on libraries which prevent them from working with GHC 7.10. (Note: --allow-newer can override that.) In some cases, these are unnecessary upper bounds, in that simply relaxing the bounds in the cabal file will allow them to build. In others, they are in fact necessary, and the code needs to change. To help identify which packages fall into which categories, I then did a full build without bounds checking. I've uploaded three different sets of results:

The Haskell community excels in getting libraries to quickly be compatible with the newest GHC version. Let's strive to do it again with this release!

March 18, 2015 03:40 PM

March 17, 2015

Johan Tibell

Google Summer of Code 2015 Project Ideas

Every year I try to list some Google Summer of Code projects that I think are worthwhile. Here's this year's list. The focus is, as usual, on infrastructure projects. I think they are the most likely to succeed and give the community the highest long-term value.

Generalize cabal to work with collections of packages instead of having a single package focus

Today most cabal commands assume that there's a "current" package that's the focus of most operations. As code bases grow users will end up with multiple-package projects and thus a need to build/test subsets of those packages together. We should generalize most commands (e.g. build, test, bench, and repl) to take a list of targets. You can already today specify local targets (i.e. only run these N test suites in my .cabal file). We'd extend the syntax to allow for

cabal test [[PACKAGE DIR]:[SECTION]]...


cabal test my-pkg1-dir:test1 my-pkg1-dir:test2 my-pkg2-dir

Implementation-wise this means that the working dir (i.e. dist/ today) needs to be able to live outside the "current" package's dir. It would live in the project "root" (e.g. some directory that contains all the packages that are checked out in source form).

We should also make it more convenient to create these collections of packages. Today you can get half way there by creating a sandbox and manually add-source all your packages. That doesn't scale well to 100-1000s of packages. We should have more clever defaults (e.g. scan subdirs for .cabal files and consider them part of the package set.)

Cabal PVP compliance checker

A Cabal sub-command that, given two package versions, tells you which version number component needs to be bumped (according to the PVP) and why. See for example the Elm solution to this problem.

Have GHC track the different library "ways" it has installed

Today GHC doesn't know if it has installed e.g. both the vanilla and profiling versions of a library. This leads to annoying error messages and an unpleasant experience when users want to profile something. If GHC tracked which versions of a library it had installed, cabal could easily and automatically install the missing profiling versions.

There's some more info on the GHC Trac: 1, 2.

Strict Haskell language pragma

This is a project for a student with some background in compilers, perhaps even with some experience working on GHC.

As an experiment in language design I'd like to add two language pragmas to GHC, Strict and StrictData, the latter being a subset of the former. These pragmas would give better control over the strictness of code, on a per-module basis. The design is quite well fleshed out on the wiki. There's also the beginnings of an implementation of StrictData available.

by Johan Tibell ([email protected]) at March 17, 2015 05:45 AM

March 16, 2015

Don Stewart (dons)

Haskell development role in Strats at Standard Chartered

The Strats team at Standard Chartered has an open position for a typed functional programming developer to join the team in London.

This is a “front office” finance role – meaning you will work on the trading floor, directly with traders, building software to automate their work and improve their efficiency. The role is highly development focused, and you will use Haskell for almost all tasks: data analysis, market data publishing, database access, web services, desktop GUIs, large parallel tasks, quantitative models, solvers, everything. This is a fast paced role – code you write today will be deployed within hours to hundreds of users and has to work.

You will join an expert team in London, and demonstrable experience in typed FP (Haskell, OCaml, F# etc) is required. We have around 2.5 million lines of Haskell, and our own Haskell compiler. In this context we look for skill and taste in typed functional programming to capture and abstract over complex, messy systems.

Experience writing typed APIs to external systems such as databases, web services, pub/sub platforms is very desirable. We like working code, so if you have Hackage or github libraries, we want to see them. We also like StackOverflow answers, blog posts, academic papers, or other arenas where you can show broad FP ability. A PhD in computer science is a strong advantage.

The role requires physical presence on the trading floor in London. Remote work isn’t an option. Ideally you have some project and client management skills — you will talk to users, understand their problems, and then implement and deliver what they really need. No financial background is required.

More info about our development process is in the 2012 PADL keynote, and a 2013 HaskellCast interview.

If this sounds exciting to you, please send your resume to me – donald.stewart <at>

Role posted 2015-03-16

Tagged: jobs, london

by Don Stewart at March 16, 2015 03:44 PM

Dimitri Sabadie

Getting Into Netwire


Nowadays, there is a cool concept out there in the Functional Programming world which is called FRP. It stands for Functional Reactive Programming and is a pretty decent way to make event-driven programs.

The problem with FRP is that, besides Wikipedia and a few other resources, like Conal Elliott's papers, we lack learning materials. Getting into FRP is really not trivial and because of the lack of help, you'll need to be brave to get your feet wet.

Because I found it hard learning it from scratch and because I think it's a good thing to pass knowledge on, I decided to write a few articles about it so that people can learn via an easier path.

I’ll be talking about netwire, which is not the de-facto library to use in Haskell, because, eh… we don’t have any. However, netwire exposes a lot of very interesting concepts and helped me to understand more general abstractions. I hope it’ll help you as well. :)

The FRP Thing

Event-driven programming context

In a traditional event-driven programming codebase, you'd find constructions such as event polling (where you explicitly retrieve the last occurring events), callbacks and event handlers. GLFW is a very famous example of callback use for event-driven programming. Such functions as glfwSetWindowCloseCallback require you to pass a callback that will be called when the event occurs. While that way to go seems nice, it's actually an error-prone and ill-formed design:

  • you eventually end up with spaghetti code
  • debugging callbacks is a true nightmare as the codebase grows
  • because of the first point, the code doesn’t compose – or in very minimal ways – and is almost impossible to test against
  • you introduce side-effects that might introduce nasty bugs difficult to figure out
  • debugging is like hitting your head on a tree log

However, it’s not black or white. Callbacks are mandatory. They’re useful, and we’ll see that later on.

What FRP truly is?

FRP is a new way to look at event-driven programming. Instead of representing reaction through callbacks, we consume events over time. Instead of building a callback that will be passed as reaction to the setPushButtonCallback function, we consume and transform events over time. The main idea of FRP could be summed up with the following concepts:

  • behaviors: a behavior is a value that reacts to time
  • events: events are just values that have occurrences in time
  • switching: the act of changing of behavior


Behaviors

According to Wikipedia, a behavior is the range of actions and mannerisms made by individuals, organisms, systems, or artificial entities in conjunction with themselves or their environment, which includes the other systems or organisms around as well as the (inanimate) physical environment. If we try to apply that to a simple version of FRP that only takes the time into account as external stimulus, a behavior is any kind of value that consumes time. What's that? Well…

newtype Behavior a = Behavior { stepBehavior :: Double -> a }

A behavior is a simple function from time (Double) to a given value. Let’s take an example. Imagine you want to represent a cube rotating around the X axis. You can represent the actual rotation with a Behavior Rotation, because the angle of rotation is linked to the time:

rotationAngle :: Behavior Rotation
rotationAngle = Behavior $ \t -> rotate xAxis t

Pretty simple, see?! However, it would be more convenient if we could choose the type of time. We don't really know what the time will be in the final application. It could be the current UTC time, it could be an integral time (think of a stepped discrete simulation), it could be the monotonic time, the system time, something that we don't even know. So let's make our Behavior type more robust:

newtype Behavior t a = Behavior { stepBehavior :: t -> a }

Simple change, but nice improvement.

That is the typical way to picture a behavior. However, we'll see later that the implementation is way different from such a naive one. Keep on reading.


Events

An event is something happening at some time. Applied to FRP, an event is a pair of time – giving the time of occurrence – and a carried value:

newtype Event t a = Event { getEvent :: (t,a) }

For instance, we could create an event that yields a rotation of 90° around X at 35 seconds:

rotate90XAt35s :: Event Float Rotation
rotate90XAt35s = Event (35,rotate xAxis $ 90 * pi / 180)

Once again, that’s the naive way to look at events. Keep in mind that events have time occurrences and carry values.

Behavior switch

You switch your behavior all the time. Currently, you're reading this paper, but you may go grab some food, go to bed, go to school or whatever you like afterwards. You're already used to behavior switching because that's what we do every day in a lifetime.

However, applying that to FRP is another thing. The idea is to express this:

“Given a first behavior, I’ll switch to another behavior when a given event occurs.”

This is how we express that in FRP:

switch :: Behavior t a -> Event t (Behavior t a) -> Behavior t a

Let me decrypt switch for you.

The first parameter, a Behavior t a, is the initial behavior. For instance, currently, you’re reading. That could be the first behavior you’d pass to switch.

The second parameter, an Event t (Behavior t a), is an event that yields a new Behavior t a. Do you start to get it? No? Well then:

switch reading finished

reading is the initial behavior, and finished is an event that occurs when you’re done reading. switch reading finished is then a behavior equal to reading until finished happens. When it does, switch reading finished extracts the behavior from the event and uses it instead.
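With the naive types from above, switch can actually be implemented directly. Here is a minimal sketch, assuming the time type has an Ord instance (the real implementation will look nothing like this, as we’ll see later):

```haskell
newtype Behavior t a = Behavior { stepBehavior :: t -> a }
newtype Event t a    = Event { getEvent :: (t, a) }

-- Before the event's occurrence time, sample the initial behavior;
-- from the occurrence on, sample the behavior carried by the event.
switch :: (Ord t) => Behavior t a -> Event t (Behavior t a) -> Behavior t a
switch b (Event (t0, b')) = Behavior $ \t ->
  if t < t0 then stepBehavior b t else stepBehavior b' t
```

For instance, stepping switch (Behavior show) (Event (10, Behavior (const "done"))) at time 5 yields "5", and at time 12 yields "done".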

I tend to think switch is a bad name, and I like naming it until:

reading `until` finished

Nicer isn’t it?! :)


Stepping is the act of passing the input – i.e. the time t in our case – down to the Behavior t a and extracting the resulting a value. Behaviors are commonly connected to each other and form a reactive network.

That operation is also often referred to as reactimation in certain implementations, though it is more complex than just stepping the world. You don’t have to focus on that yet; just keep the reactimate function in mind. You might come across it at some point.

Before going on…

Everything you have read in this paper so far was just pop science to give you the main idea of what FRP is. The next part will cover a more decent, real-world implementation and use of FRP, with a focus on efficient implementations.

Let’s build a FRP library!

The first common mistake a lot of programmers make is writing algorithms or libraries to solve a problem they don’t even know yet. So let’s start with an example, so that we can figure out the problem.

Initial program

Setting up

Let’s say we want to be able to control a camera with a keyboard:

  • W would go forward
  • S would go backward
  • A would left strafe
  • D would right strafe
  • R would go up
  • F would go down

Let’s write the Input type:

data Input
  = W
  | S
  | A
  | D
  | R
  | F
  | Quit
    deriving (Eq,Read,Show)

Straight-forward. We also have a function that polls events from IO:

pollEvents :: IO [Input]
pollEvents = fmap treatLine getLine
  where
    treatLine = concatMap (map fst . reads) . words

We use [Input] because we could have several events at the same time (imagine two keys pressed at once). The function uses a dumb implementation in order to abstract over event polling. In your program you won’t use getLine, but a function from SDL or similar.
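As a quick sanity check, here is the parsing logic on its own – words that fail to parse as an Input are simply dropped:

```haskell
-- Standalone sketch of pollEvents' parsing logic.
data Input = W | S | A | D | R | F | Quit
  deriving (Eq, Read, Show)

-- 'reads' yields no parse for unknown words, so garbage is ignored.
treatLine :: String -> [Input]
treatLine = concatMap (map fst . reads) . words

-- treatLine "W S nonsense D" evaluates to [W,S,D]
```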

And the camera:

newtype Camera = Camera { _cameraPosition :: V3 Float } deriving (Eq,Read,Show)

makeLenses ''Camera

V3 is a type from linear, so you’ll need that library and an import of Linear.V3 to make Camera compile. You’ll also need lens, the GHC TemplateHaskell extension, and an import of Control.Lens.

Ok, let’s react to events!

First attempt: the regular and naive one

The idea is to use some kind of state we’ll change on an event. In our case the state is pretty simple:

data AppSt = AppSt {
    _appCamera :: Camera
  } deriving (Eq,Read,Show)

makeLenses ''AppSt

updateAppSt :: AppSt -> Input -> Maybe AppSt
updateAppSt appst input = case input of
  W    -> Just $ appst & appCamera . cameraPosition . _z -~ 0.1
  S    -> Just $ appst & appCamera . cameraPosition . _z +~ 0.1
  A    -> Just $ appst & appCamera . cameraPosition . _x -~ 0.1
  D    -> Just $ appst & appCamera . cameraPosition . _x +~ 0.1
  F    -> Just $ appst & appCamera . cameraPosition . _y -~ 0.1
  R    -> Just $ appst & appCamera . cameraPosition . _y +~ 0.1
  Quit -> Nothing

There’s a lot of boilerplate in updateAppSt, but that doesn’t matter much. The idea is to modify the application state and return it for all inputs but Quit, for which we return Nothing to express the wish to quit the application.
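Folding updateAppSt over a frame’s worth of inputs happens in the Maybe monad, so the first Quit aborts the whole frame. Here is a standalone sketch with a simplified, lens-free state – the Camera, cameraZ and stepFrame names here are stand-ins, not the article’s exact types:

```haskell
import Control.Monad (foldM)

data Input = W | S | Quit deriving (Eq, Read, Show)

newtype Camera = Camera { cameraZ :: Float } deriving (Eq, Show)

updateAppSt :: Camera -> Input -> Maybe Camera
updateAppSt cam input = case input of
  W    -> Just cam { cameraZ = cameraZ cam - 0.1 }
  S    -> Just cam { cameraZ = cameraZ cam + 0.1 }
  Quit -> Nothing

-- foldM in the Maybe monad short-circuits on the first Nothing,
-- so a Quit anywhere in the list aborts the frame.
stepFrame :: Camera -> [Input] -> Maybe Camera
stepFrame = foldM updateAppSt
```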

I’ve been using that idea for a while. It’s simple, nice and neat, because we don’t spread IO actions through our program logic, which then remains pure. That’s a very good way of doing it, and in most cases it’ll even be sufficient. However, the idea suffers from a serious issue: it doesn’t scale.

Who has only one camera? No one. You have a camera – maybe more than just one – lights, objects, terrains, AI, sounds, assets, maps and so on and so forth. Our little AppSt type would explode as we add objects. That doesn’t scale at all. You could, though, go on and add all your objects to your AppSt – I did that once; it was a pretty harsh experience.

Furthermore, imagine you want to add a new behavior to the camera, like handling mouse cursor moves – with Input augmented accordingly, of course. You’d need to change or add lines in our updateAppSt function. Imagine how messy updateAppSt would get… That would, basically, gather all reactions into a single function. Not neat.

Adding FRP

FRP enables us to use our reactive values as if they were regular values. You can add reactive values, subtract them, and combine them in any way you want. The semantics of your values should carry over to the reactive values.

Typically, with FRP, you don’t have event handlers anymore. The codebase can then grow sanely, without accumulating big states all over the place. FRP applications scale and compose pretty well.

Let’s start with a simple FRP implementation for our example.

Naive FRP implementation

Let’s start with the behavior:

newtype Behavior t a = Behavior { stepBehavior :: t -> a }

How could we implement our camera’s behavior with that? We actually can’t, since we don’t have any way to pass events in.

“I guess we could build a Behavior t Camera by passing our Input to the initial function?”

Something like this?

camera inputs = Behavior $ \_ -> -- implement the behavior with inputs

That could be a way to go, yeah. However, how would you implement switching with that? Remember the type of until:

until :: Behavior t a -> Event t (Behavior t a) -> Behavior t a

camera is not a behavior; it’s a function from events to a behavior. You have to apply the events to camera in order to get its behavior. Once you’ve done that, you cannot pass events to the next behavior. What a pity. That is more a configured behavior than a behavior consuming inputs / events.

With the current Behavior t a implementation, a behavior network reduces to a function t -> a, basically. Then the only stimulus you get from the outside is… time. We lack something.

Arrowized behaviors

“A way to forward events?”

Yes! But more importantly, a way to extend our Behavior t a with inputs! Don’t get me wrong, we are not talking about a reactive value here. We are talking about a reactive relationship:

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> b }

That’s way better! Our new behavior represents a relationship between two reactive objects. The b is our old a, and the new a is the input! Let’s see what we can do with that.

camera :: Behavior t [Input] Camera
camera = Behavior (const treatEvents)
treatEvents events
| W `elem` events = Camera $ V3 0 0 (-0.1)
| S `elem` events = Camera $ V3 0 0 0.1
| otherwise = Camera $ V3 0 0 0

That is not exactly what we intended to express. Here, if we push the W button, we just put the camera at V3 0 0 (-0.1), while we’d like to move it forward. That is due to the fact that we need switching.

The idea is the following: the initial behavior is idleing. We just don’t do anything. Then, we switch to a given behavior when a given event occurs. We’ll need recursion so that we can ping-pong between behaviors. Let’s write the idleing behavior:

idleing :: Behavior t ([Input],Camera) Camera
idleing = Behavior (const snd)

That behavior requires as input our Input events list and a Camera and simply returns the Camera. Pretty nice.

How do we switch then? We need Event. Consider this:

newtype Event t a = Event { getEvent :: (t,a) }

In order to switch, we need a to be a behavior. To start with, we’ll create several Event t [Input] values and pass them as input to the behavior network. How could we change the [Input] into something more interesting? Simple: Functor!

instance Functor (Event t) where
  fmap f (Event e) = Event (fmap f e)

Note: because Event t a is a newtype, you could instead enable the GHC DeriveFunctor extension (or GeneralizedNewtypeDeriving) and let GHC derive the instance for you:

newtype Event t a = Event { getEvent :: (t,a) } deriving (Functor)

Then, we can use the Functor instance to change the type of the carried value of the event. That’s great because we don’t change the occurrence time, only the carried value. Transforming events is really important in FRP. We can then transform the [Input] into a single behavior:

inputToBehavior i = case i of
  W -> Behavior $ \_ (_,cam) -> cam & cameraPosition . _z -~ 0.1
  S -> Behavior $ \_ (_,cam) -> cam & cameraPosition . _z +~ 0.1
  A -> -- and so on
  D -> -- and so forth
  F -> -- ;)
  R -> -- ...
  _ -> Behavior $ \_ (_,cam) -> cam

Pretty simple, see? When we push W, we go forward forever. We could have implemented the function above with another until call in order to go back to idleing, making some kind of behavior loop.

However, switching is fairly poorly implemented here. It’s not very efficient, and requires a ton of boilerplate.

Auto

There is a very cool type out there called Auto, used to implement automatons.

newtype Auto a b = Auto { runAuto :: a -> (b,Auto a b) }

An Auto a b is a function-like structure. It has an input and an output. The difference with a regular function is that it also has a secondary output: another Auto a b – that is, the next automaton to be used.

Auto a b wraps pretty cool concepts, such as locally defined states. It’s also a great ally when implementing switching in an FRP system, because we can easily state that a Behavior is an Auto: a function from the environment state to the next reactive value, which also has another output representing what to do “next”.
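To see those locally defined states in action, here is a tiny automaton that sums its inputs – the running total lives only in the closure (runList is a small hypothetical driver, not part of any library):

```haskell
newtype Auto a b = Auto { runAuto :: a -> (b, Auto a b) }

-- An automaton summing its inputs; each step returns the new total
-- and the automaton to use for the next step.
summer :: Int -> Auto Int Int
summer total = Auto $ \x ->
  let total' = total + x
  in (total', summer total')

-- Feed a list of inputs through an automaton, collecting the outputs.
runList :: Auto a b -> [a] -> [b]
runList _ []     = []
runList a (x:xs) = let (y, a') = runAuto a x in y : runList a' xs

-- runList (summer 0) [1,2,3] evaluates to [1,3,6]
```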

Let’s then change our Behavior type to make it look a bit more like Auto:

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> (b,Behavior t a b) }

Yeah, that’s it! That’s a pretty good start!

Useful abstractions

Before going on, I’d like to introduce those scary abstractions you may be afraid of. They’re actually not scary. They’re all simple – at least for our purposes here.

Note: I do know we could simply use the GeneralizedNewtypeDeriving extension, but I want to detail the whole implementation, so we’ll see how to implement all the nice abstractions.

Arrow

Arrows are a generalization of functions along the axis of computation. A computation has inputs and outputs. So does a behavior.

Although arrows are not that widely used in Haskell, they’re ultra simple – both the abstraction itself and the common combinators built on top of it – and useful in some cases.

In order to implement arrows, we need to provide code for both the arr function, whose type is arr :: (Arrow a) => (b -> c) -> a b c, and first, whose type is first :: (Arrow a) => a b c -> a (b,d) (c,d). arr lifts a regular function into the arrowized version, and first takes an arrow with a single value as input and exposes an arrow taking a pair as input, applying the underlying arrow to the first component of the pair. Let’s implement that:

instance Arrow (Behavior t) where
  arr f = fix $ \r -> Behavior $ \t a -> (f a,r) -- fix comes from Data.Function
  first f = Behavior $ \t (b,d) ->
    let (c,fn) = stepBehavior f t b
    in ((c,d),first fn)

Category

A category basically exposes two concepts: composition and identity. In our case, the identity is the behavior that passes its input through unchanged at every time, and composition chains two behaviors in time. Let’s implement Category by providing implementations for both id and (.):

instance Category (Behavior t) where
  id = arr id
  x . y = Behavior $ \t a ->
    let (yr,yn) = stepBehavior y t a
        (xr,xn) = stepBehavior x t yr
    in (xr,xn . yn)

Note: because Prelude exports specialized implementations of id and (.) – the function-only versions – you should hide them in order to implement Category:

import Prelude hiding ( (.), id )
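With Category and Arrow in place, behaviors compose just like functions via (>>>). A self-contained check, putting the two instances together (fix comes from Data.Function):

```haskell
import Prelude hiding ((.), id)
import Control.Arrow
import Control.Category
import Data.Function (fix)

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> (b, Behavior t a b) }

instance Category (Behavior t) where
  id = arr id
  x . y = Behavior $ \t a ->
    let (yr, yn) = stepBehavior y t a
        (xr, xn) = stepBehavior x t yr
    in (xr, xn . yn)

instance Arrow (Behavior t) where
  arr f = fix $ \r -> Behavior $ \_ a -> (f a, r)
  first f = Behavior $ \t (b, d) ->
    let (c, fn) = stepBehavior f t b
    in ((c, d), first fn)

-- Composition of pure behaviors behaves like function composition:
-- stepping pipeline with input 3 yields (3 + 1) * 2 = 8.
pipeline :: Behavior t Int Int
pipeline = arr (+ 1) >>> arr (* 2)
```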

Semigroup

A semigroup is a pretty cool algebraic structure used in Haskell to represent “anything that associates”. It exposes an associative binary function over a set. In the case of behaviors, if two behaviors output semigroup values, we can associate the behaviors to build a single one.

A Semigroup is implemented via a single typeclass method, (<>). Let’s do that for behaviors:

instance (Semigroup b) => Semigroup (Behavior t a b) where
  x <> y = Behavior $ \t a ->
    let (xr,xn) = stepBehavior x t a
        (yr,yn) = stepBehavior y t a
    in (xr <> yr,xn <> yn)

Simple and neat.
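A tiny check of the instance: two constant String behaviors merge pointwise. (constant is a hypothetical helper, not from the article.)

```haskell
import Data.Function (fix)

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> (b, Behavior t a b) }

instance (Semigroup b) => Semigroup (Behavior t a b) where
  x <> y = Behavior $ \t a ->
    let (xr, xn) = stepBehavior x t a
        (yr, yn) = stepBehavior y t a
    in (xr <> yr, xn <> yn)

-- A behavior that always outputs the same value.
constant :: b -> Behavior t a b
constant x = fix $ \r -> Behavior $ \_ _ -> (x, r)

-- stepping (constant "foo" <> constant "bar") outputs "foobar" at every step
```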

Functor

You might already know that one since I talked about it a few lines ago, but let’s write the instance for our Behavior:

instance Functor (Behavior t a) where
  fmap f b = Behavior $ \t a ->
    let (br,bn) = stepBehavior b t a
    in (f br,fmap f bn)

Pretty cool eh?

Applicative

A well-known one too. Let’s see how we could implement Applicative:

instance Applicative (Behavior t a) where
  pure = arr . const
  f <*> x = Behavior $ \t a ->
    let (fr,fn) = stepBehavior f t a
        (xr,xn) = stepBehavior x t a
    in (fr xr,fn <*> xn)
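Functor and Applicative together let us combine behaviors pointwise. A standalone sketch (pure is written directly instead of via arr to stay dependency-free, and input is a hypothetical helper that outputs its own input):

```haskell
import Data.Function (fix)

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> (b, Behavior t a b) }

instance Functor (Behavior t a) where
  fmap f b = Behavior $ \t a ->
    let (br, bn) = stepBehavior b t a
    in (f br, fmap f bn)

instance Applicative (Behavior t a) where
  pure x = fix $ \r -> Behavior $ \_ _ -> (x, r)
  f <*> x = Behavior $ \t a ->
    let (fr, fn) = stepBehavior f t a
        (xr, xn) = stepBehavior x t a
    in (fr xr, fn <*> xn)

-- The behavior that outputs its own input.
input :: Behavior t a a
input = fix $ \r -> Behavior $ \_ a -> (a, r)

-- Pair up two transformed views of the same input:
-- stepping with input 3 yields (4, 6).
both :: Behavior t Int (Int, Int)
both = (,) <$> fmap (+ 1) input <*> fmap (* 2) input
```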

Profunctor

This one is special. You don’t have to know what a profunctor is, but eh, you should, because profunctors are pretty simple to use in Haskell and very useful. I won’t explain what they are – have a look at this article for further details.

If you do know them, here’s the implementation for dimap:

instance Profunctor (Behavior t) where
  dimap l r x = Behavior $ \t a ->
    let (xr,xn) = stepBehavior x t (l a)
    in (r xr,dimap l r xn)


Bare concept

Behaviors consume the environment state and produce outputs. However, sometimes they just don’t – they don’t output anything. That could be the case for a behavior that only emits during a certain period of time. It could also be the case for a signal function that’s only defined on a given interval: what should we output for values that lie outside of it?

Such a scenario is called inhibition. There are several ways to implement inhibition. The simplest and most famous one is to use Maybe as a wrapper over the output, like the following:

Behavior t a (Maybe b)

If (Maybe b) is Nothing, the output is undefined, then the behavior inhibits.

However, using a bare Maybe exposes the user directly to inhibition. There’s another way to do that:

newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> Maybe (b,Behavior t a b) }

Here we are. We have behaviors that can inhibit. If a behavior doesn’t inhibit, it returns Just (output,nextBehavior), otherwise it outputs Nothing and inhibits forever.

Exercise: try to reimplement all the above abstractions with the new type of Behavior.

We can add a bunch of other interesting functions:

dead :: Behavior t a b
dead = Behavior $ \_ _ -> Nothing

one :: b -> Behavior t a b
one x = Behavior $ \_ _ -> Just (x,dead)

dead is a behavior that inhibits forever. That is, it doesn’t produce any value at any time.

one x produces x once, and then inhibits forever. That’s a nice combinator to use when you want to pulse values in time. We’ll see later that it’s very useful to represent discrete events, like key presses or mouse motion.

However, inhibiting can be useful. For instance, we can implement a new kind of behavior switching using inhibition. Let’s try to implement a function that takes two behaviors and switches to the latter when the former starts inhibiting:

revive :: Behavior t a b -> Behavior t a b -> Behavior t a b
revive x y = Behavior $ \t a -> case stepBehavior x t a of
  Just (xr,xn) -> Just (xr,revive xn y)
  Nothing      -> stepBehavior y t a

(~>) :: Behavior t a b -> Behavior t a b -> Behavior t a b
(~>) = revive

(~>) is a handy alias for revive. a ~> b is then a behavior that acts as a until it inhibits, and as b afterwards. Simple and useful.
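Combining one and (~>) shows how inhibition drives switching: each one-shot behavior produces once, then hands over. A standalone sketch (runSamples is a hypothetical driver, not part of the article’s API):

```haskell
newtype Behavior t a b = Behavior { stepBehavior :: t -> a -> Maybe (b, Behavior t a b) }

dead :: Behavior t a b
dead = Behavior $ \_ _ -> Nothing

one :: b -> Behavior t a b
one x = Behavior $ \_ _ -> Just (x, dead)

revive :: Behavior t a b -> Behavior t a b -> Behavior t a b
revive x y = Behavior $ \t a -> case stepBehavior x t a of
  Just (xr, xn) -> Just (xr, revive xn y)
  Nothing       -> stepBehavior y t a

(~>) :: Behavior t a b -> Behavior t a b -> Behavior t a b
(~>) = revive

-- Sample a behavior over (time, input) pairs, stopping at the first inhibition.
runSamples :: Behavior t a b -> [(t, a)] -> [b]
runSamples _ []            = []
runSamples b ((t, a):rest) = case stepBehavior b t a of
  Nothing      -> []
  Just (r, b') -> r : runSamples b' rest

-- runSamples (one 1 ~> one 2) [(0,()),(1,()),(2,())] evaluates to [1,2]
```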

In netwire, revive – or (~>) – is called (-->). There’s also an operator that does the opposite: (>--). a >-- b is a until b starts producing – i.e. until b no longer inhibits.

Exercise: write the implementation of (>~), our version of netwire’s (>--).

Behaviors revisited

Now that you have a better idea of how you could implement a behavior, let’s talk about netwire’s.

netwire’s behavior type is called Wire. It’s actually:

Wire s e m a b

s is the session time – basically a type used to extract the time. e is the inhibition value. m is an inner monad – yes, you can use monadic code within netwire, which is not really that useful actually, except for Reader, I guess. And a and b are, respectively, the inputs and outputs.

“What is that inhibition type?”

Yeah, netwire doesn’t use Maybe for inhibition. Picture Wire as:

newtype Wire s e m a b = Wire { stepWire :: s -> a -> m (Either e (b,Wire s e m a b)) }

Instead of using Maybe (b,Wire s e m a b), it uses Either. Some functions require e to be a Monoid; I guess netwire uses that to accumulate values during inhibition. I don’t see decent use cases for such a feature, but it’s there. I tend to use this kind of wire in all my uses of netwire:

Wire s () Identity a b -- I think this is the most common type of wire

Keep in mind that although you can set m to IO, it’s not what netwire – and FRP – was designed for.


What about events? Well, netwire exposes events as a home-made Maybe:

data Event a
  = Event a
  | NoEvent
    deriving (Eq,Show)

instance Functor Event where
  fmap f e = case e of
    Event a -> Event (f a)
    NoEvent -> NoEvent

That’s actually enough, because we can attach Event a to time occurrences with a Behavior t b (Event a). You’ll find functions using Wire s e m (Event a) b every now and then, for instance. You’ll get used to that as you write toy examples – and real-world ones, of course.

In the end

What a trek… As you can see, we got pretty close to an understanding of netwire’s implementation. There are a few concepts I haven’t covered – like intrinsic switches, continuable switches, deferred switches… – but I don’t pretend this is a comprehensive FRP article. You’ll have to dig in a bit more ;)

I’ll write another article about FRP and netwire to implement the camera example with netwire so that you can have a concrete example.

by Dimitri Sabadie ([email protected]) at March 16, 2015 09:34 AM

Tom Schrijvers

PPDP 2015: Extended Deadline


   Call for papers
   17th International Symposium on
  Principles and Practice of Declarative Programming
      PPDP 2015

         Special Issue of Science of Computer Programming (SCP)

    Siena, Italy, July 14-16, 2015
    (co-located with LOPSTR 2015)


   !!!!!!  EXTENDED DEADLINE: April 6, 2015   !!!!!!


PPDP  2015  is a  forum  that  brings  together researchers  from  the
declarative  programming communities, including  those working  in the
logic,  constraint  and  functional  programming paradigms,  but  also
embracing languages, database  languages, and knowledge representation
languages. The  goal is  to stimulate research  in the use  of logical
formalisms  and  methods  for  specifying, performing,  and  analyzing
computations,   including   mechanisms   for   mobility,   modularity,
concurrency,  object-orientation,  security,  verification and  static
analysis. Papers related to the use of declarative paradigms and tools
in industry and education are especially solicited. Topics of interest
include, but are not limited to

* Functional programming
* Logic programming
* Answer-set programming
* Functional-logic programming
* Declarative visual languages
* Constraint Handling Rules
* Parallel implementation and concurrency
* Monads, type classes and dependent type systems
* Declarative domain-specific languages
* Termination, resource analysis and the verification of declarative programs
* Transformation and partial evaluation of declarative languages
* Language extensions for security and tabulation
* Probabilistic modeling in a declarative language and modeling reactivity
* Memory management and the implementation of declarative systems
* Practical experiences and industrial application

This   year  the  conference   will  be   co-located  with   the  25th
International   Symposium  on   Logic-Based   Program  Synthesis   and
Transformation (LOPSTR 2015).

The  conference will  be held in Siena, Italy.  Previous symposia were
held  at Canterbury  (UK),  Madrid (Spain),  Leuven (Belgium),  Odense
(Denmark), Hagenberg (Austria),  Coimbra (Portugal), Valencia (Spain),
Wroclaw (Poland),  Venice (Italy), Lisboa  (Portugal), Verona (Italy),
Uppsala  (Sweden),   Pittsburgh  (USA),  Florence   (Italy),  Montreal
(Canada), and Paris (France). You might have a look at the contents of
past PPDP symposia.

Papers  must  describe original  work,  be  written  and presented  in
English, and must not substantially overlap with papers that have been
published  or   that  are  simultaneously  submitted   to  a  journal,
conference, or  workshop with refereed proceedings.  Work that already
appeared in  unpublished or informally  published workshop proceedings
may be submitted (please contact the PC chair in case of questions).

After the symposium, a selection of the best papers will be invited to
extend their submissions in the light of the feedback solicited at the
symposium.   The papers  are expected  to include  at least  30% extra
material over and above the PPDP version. Then, after another round of
reviewing, these revised  papers will be published in  a special issue
of SCP with a target publication date by Elsevier of 2016.

Important Dates

   Abstract Submission:   30 March, 2015
   Paper submission:       6 April, 2015
   Notification:          14 May, 2015
   Camera-ready:          To be announced
   Symposium:             14-16 July, 2015

Authors should submit an electronic copy of the full paper in
PDF. Papers  should be  submitted to the  submission website  for PPDP
2015. Each submission must include  on its first page the paper title;
authors  and   their  affiliations;   abstract;  and  three   to  four
keywords. The keywords will be used to assist the program committee in
selecting appropriate  reviewers for the paper.  Papers should consist
of   the   equivalent  of   12   pages   under   the  ACM   formatting
guidelines.  These   guidelines  are  available   online,  along  with
formatting templates  or style files. Submitted papers  will be judged
on the basis of significance, relevance, correctness, originality, and
clarity. They should  include a clear identification of  what has been
accomplished and  why it is  significant. Authors who wish  to provide
additional material to  the reviewers beyond the 12-page  limit can do
so in  clearly marked appendices:  reviewers are not required  to read
such appendices.

Invited speakers:

Patrick Cousot (NYU, Jointly with LOPSTR)
Martin Hofmann (LMU, Munich)
Dale Miller (INRIA and Ecole Polytechnique, Jointly with LOPSTR)

Program Committee

    Michael Adams, University of Utah, USA
    Puri Arenas, Complutense University of Madrid, Spain
    Amir Ben-Amram, Tel-Aviv Academic College, Israel
    Ines Castro, Universidade do Porto, Portugal
    Patrick Cousot, New York University, USA
    Gregory Duck, National University of Singapore, Singapore
    Fabio Fioravanti, University of Chieti-Pescara, Italy
    Thom Frühwirth, University of Ulm, Germany
    Roberto Giacobazzi, University of Verona, Italy
    Michael Hanus, CAU Kiel, Germany
    Andy King, University of Kent, UK
    F. López-Fraguas, Complutense University of Madrid, Spain
    Ian Mackie, University of Sussex, UK
    Dale Miller, INRIA and LIX/Ecole Polytechnique, France
    Torsten Schaub, University of Potsdam, Germany
    Tom Schrijvers, KU Leuven, Belgium
    Frank D. Valencia, CNRS and LIX, Ecole Polytechnique, France
    German Vidal, Universitat Politecnica de Valencia, Spain
    Marina Vos, University of Bath, UK
    Nobuko Yoshida, Imperial College London, UK 

Program Chair

    Elvira Albert
    Complutense University of Madrid
    C/ Profesor Garcia Santesmases
    E-28040 Madrid, Spain

Symposium Chair

    Moreno Falaschi
    Department of information engineering and mathematics
    University of Siena, Italy

by Tom Schrijvers ([email protected]) at March 16, 2015 09:04 AM

March 15, 2015

Brent Yorgey


This is part four in a series of posts on avoiding the axiom of choice (part one, part two, part three).

In my previous post, we considered the “Axiom of Protoequivalence”—that is, the statement that every fully faithful, essentially surjective functor (i.e. every protoequivalence) is an equivalence—and I claimed that in a traditional setting this is equivalent to the axiom of choice. However, intuitively it feels like AP “ought to” be true, whereas AC must be rejected in constructive logic.

One way around this is by generalizing functors to anafunctors, which were introduced by Makkai (1996). The original paper is difficult going, since it is full of tons of detail, poorly typeset, and can only be downloaded as seven separate postscript files. There is also quite a lot of legitimate depth to the paper, which requires significant categorical sophistication (more than I possess) to fully understand. However, the basic ideas are not too hard to grok, and that’s what I will present here.

It’s important to note at the outset that anafunctors are much more than just a technical device enabling the Axiom of Protoequivalence. More generally, if everything in category theory is supposed to be done “up to isomorphism”, it is a bit suspect that functors have to be defined for objects on the nose. Anafunctors can be seen as a generalization of functors, where each object in the source category is sent not just to a single object, but to an entire isomorphism class of objects, without privileging any particular object in the class. In other words, anafunctors are functors whose “values are specified only up to unique isomorphism”.

Such functors represent a many-to-many relationship between objects of \mathbb{C} and objects of \mathbb{D}. Normal functors, as with any function, may of course map multiple objects of \mathbb{C} to the same object in \mathbb{D}. The novel aspect is the ability to have a single object of \mathbb{C} correspond to multiple objects of \mathbb{D}. The key idea is to add a class of “specifications” which mediate the relationship between objects in the source and target categories, in exactly the same way that a “junction table” must be added to support a many-to-many relationship in a database schema, as illustrated below:

On the left is a many-to-many relation between a set of shapes and a set of numbers. On the right, this relation has been mediated by a “junction table” containing a set of “specifications”—in this case, each specification is simply a pair of a shape and a number—together with two mappings (one-to-many relations) from the specifications to both of the original sets, such that a specification maps to a shape s and number n if and only if s and n were originally related.

In particular, an anafunctor F : \mathbb{C} \to \mathbb{D} is defined as follows.

  • There is a class S of specifications.
  • There are two functions \mathrm{Ob}\ \mathbb{C}  \stackrel{\overleftarrow{F}}{\longleftarrow} S  \stackrel{\overrightarrow{F}}{\longrightarrow} \mathrm{Ob}\ \mathbb{D} mapping specifications to objects of \mathbb{C} and \mathbb{D}.

S, \overleftarrow{F}, and \overrightarrow{F} together define a many-to-many relationship between objects of \mathbb{C} and objects of \mathbb{D}. D \in \mathbb{D} is called a specified value of F at C if there is some specification s \in S such that \overleftarrow{F}(s) = C and \overrightarrow{F}(s) = D, in which case we write F_s(C) = D. Moreover, D is a value of F at C (not necessarily a specified one) if there is some s for which D \cong F_s(C).

The idea now is to impose additional conditions which ensure that F “acts like” a regular functor \mathbb{C} \to \mathbb{D}.

  • Functors are defined on all objects; so we require each object of \mathbb{C} to have at least one specification s which corresponds to it—that is, \overleftarrow{F} must be surjective.
  • Functors transport morphisms as well as objects. For each s,t \in   S (the middle of the below diagram) and each f :   \overleftarrow{F}(s) \to \overleftarrow{F}(t) in \mathbb{C} (the left-hand side below), there must be a morphism F_{s,t}(f) :   \overrightarrow{F}(s) \to \overrightarrow{F}(t) in \mathbb{D} (the right-hand side):

  • Functors preserve identities: for each s \in S we should have F_{s,s}(\mathit{id}_{\overleftarrow{F}(s)}) = \mathit{id}_{\overrightarrow{F}(s)}.
  • Finally, functors preserve composition: for all s,t,u \in S (in the middle below), f : \overleftarrow{F}(s) \to \overleftarrow{F}(t), and g : \overleftarrow{F}(t) \to \overleftarrow{F}(u) (the left side below), it must be the case that F_{s,u}(f ; g) = F_{s,t}(f) ; F_{t,u}(g):

Our initial intuition was that an anafunctor should map objects of \mathbb{C} to isomorphism classes of objects in \mathbb{D}. This may not be immediately apparent from the definition, but is in fact the case. In particular, the identity morphism \mathit{id}_C maps to isomorphisms between specified values of C; that is, under the action of an anafunctor, an object C together with its identity morphism “blow up” into an isomorphism class (aka a clique). To see this, let s,t \in S be two different specifications corresponding to C, that is, \overleftarrow{F}(s) = \overleftarrow{F}(t) = C. Then by preservation of composition and identities, we have F_{s,t}(\mathit{id}_C) ; F_{t,s}(\mathit{id}_C) = F_{s,s}(\mathit{id}_C ; \mathit{id}_C) = F_{s,s}(\mathit{id}_C) = \mathit{id}_{\overrightarrow{F}(s)}, so F_{s,t}(\mathit{id}_C) and F_{t,s}(\mathit{id}_C) constitute an isomorphism between F_s(C) and F_t(C).

There is an alternative, equivalent definition of anafunctors, which is somewhat less intuitive but usually more convenient to work with: an anafunctor F : \mathbb{C} \to \mathbb{D} is a category of specifications \mathbb{S} together with a span of functors \mathbb{C} \stackrel{\overleftarrow{F}}{\longleftarrow} \mathbb{S} \stackrel{\overrightarrow{F}}{\longrightarrow} \mathbb{D} where \overleftarrow{F} is fully faithful and (strictly) surjective on objects.

Note that in this definition, \overleftarrow{F} must be strictly (as opposed to essentially) surjective on objects, that is, for every C \in \mathbb{C} there is some S \in \mathbb{S} such that \overleftarrow{F}(S) = C, rather than only requiring \overleftarrow{F}(S) \cong C. Given this strict surjectivity on objects, it is equivalent to require \overleftarrow F to be full, as in the definition above, or to be (strictly) surjective on the class of all morphisms.

We are punning on notation a bit here: in the original definition of anafunctor, S is a set and \overleftarrow{F} and \overrightarrow{F} are functions on objects, whereas in this more abstract definition \mathbb{S} is a category and \overleftarrow{F} and \overrightarrow{F} are functors. Of course, the two are closely related: given a span of functors \mathbb{C} \stackrel{\overleftarrow{F}}{\longleftarrow} \mathbb{S} \stackrel{\overrightarrow{F}}{\longrightarrow} \mathbb{D}, we may simply take the objects of \mathbb{S} as the class of specifications S, and the actions of the functors \overleftarrow{F} and \overrightarrow{F} on objects as the functions from specifications to objects of \mathbb{C} and \mathbb{D}. Conversely, given a class of specifications S and functions \overleftarrow{F} and \overrightarrow{F}, we may construct the category \mathbb{S} with \mathrm{Ob}\ \mathbb{S} = S and with morphisms \overleftarrow{F}(s) \to \overleftarrow{F}(t) in \mathbb{C} acting as morphisms s \to t in \mathbb{S}. From \mathbb{S} to \mathbb{C}, we construct the functor given by \overleftarrow{F} on objects and the identity on morphisms, and the other functor maps f : s \to t in \mathbb{S} to F_{s,t}(f) : \overrightarrow{F}(s) \to \overrightarrow{F}(t) in \mathbb{D}.

Every functor F : \mathbb{C} \to \mathbb{D} can be trivially turned into an anafunctor \mathbb{C} \stackrel{\mathit{Id}}{\longleftarrow} \mathbb{C} \stackrel{F}{\longrightarrow} \mathbb{D}. Anafunctors also compose. Given compatible anafunctors F : \mathbb{C} \stackrel{\overleftarrow F}{\longleftarrow} S \stackrel{\overrightarrow F}{\longrightarrow} \mathbb{D} and G : \mathbb{D} \stackrel{\overleftarrow G}{\longleftarrow} T \stackrel{\overrightarrow G}{\longrightarrow} \mathbb{E}, consider the action of their composite on objects: each object of \mathbb{C} may map to multiple objects of \mathbb{E}, via objects of \mathbb{D}. Each such mapping corresponds to a zig-zag path C \longleftarrow s \longrightarrow D \longleftarrow t \longrightarrow E. In order to specify such a path it suffices to give the pair (s,t), which determines C, D, and E. Note, however, that not every pair in S \times T corresponds to a valid path, but only those which agree on the middle object D \in \mathbb{D}. Thus, we may take \{ (s,t) \mid s \in S, t \in T, \overrightarrow{F}(s) = \overleftarrow{G}(t) \} as the set of specifications for the composite F ; G, with \overleftarrow{F ; G}(s,t) = \overleftarrow{F}(s) and \overrightarrow{F ; G}(s,t) = \overrightarrow{G}(t). On morphisms, (F ; G)_{(s,t),(u,v)}(f) = G_{t,v}(F_{s,u}(f)). It is not hard to check that this satisfies the anafunctor laws.

If you know what a pullback is, note that the same thing can also be defined at a higher level in terms of spans. \mathbf{Cat}, the category of all (small) categories, is complete, and in particular has pullbacks, so we may construct a new anafunctor from \mathbb{C} to \mathbb{E} by taking a pullback of \overrightarrow F and \overleftarrow G and then composing appropriately.

One can go on to define ananatural transformations between anafunctors, and show that together these constitute a 2-category \mathbf{AnaCat} which is analogous to the usual 2-category of (small) categories, functors, and natural transformations; in particular, there is a fully faithful embedding of \mathbf{Cat} into \mathbf{AnaCat}, which moreover is an equivalence if AC holds.

To work in category theory based on set theory and classical logic, while avoiding AC, one is therefore justified in “mixing and matching” functors and anafunctors as convenient, but discussing them all as if they were regular functors (except when defining a particular anafunctor). Such usage can be formalized by turning everything into an anafunctor, and translating functor operations and properties into corresponding operations and properties of anafunctors.

However, as I will argue in some future posts, there is a better solution, which is to throw out set theory as a foundation of category theory and start over with homotopy type theory. In that case, thanks to a generalized notion of equality, regular functors act like anafunctors, and in particular AP holds.


Makkai, Michael. 1996. “Avoiding the Axiom of Choice in General Category Theory.” Journal of Pure and Applied Algebra 108 (2). Elsevier: 109–73.

by Brent at March 15, 2015 07:55 PM

Jeremy Gibbons

Breadth-First Traversal

Recently Eitan Chatav asked in the Programming Haskell group on Facebook

What is the correct way to write breadth first traversal of a {[\mathsf{Tree}]}?

He’s thinking of “traversal” in the sense of the {\mathit{Traversable}} class, and gave a concrete declaration of rose trees:

\displaystyle  \begin{array}{lcl} \mathbf{data}\;\mathsf{Tree}\;\alpha &=& \mathit{Tree}\; \{\; \mathit{root} :: \alpha, \mathit{children} :: [\mathsf{Tree}\;\alpha] \;\} \end{array}

It’s an excellent question.

Breadth-first enumeration

First, let’s think about breadth-first enumeration of the elements of a tree. This isn’t compositional (a fold); but the related “level-order enumeration”, which gives a list of lists of elements, one list per level, is compositional:

\displaystyle  \begin{array}{lcl} \mathit{levels} &::& \mathsf{Tree}\;\alpha \rightarrow [[\alpha]] \\ \mathit{levels}\;t &=& [\mathit{root}\;t] : \mathit{foldr}\;(\mathit{lzw}\;(\mathbin{{+}\!\!\!{+}}))\;[\,]\;(\mathit{map}\;\mathit{levels}\;(\mathit{children}\;t)) \end{array}

Here, {\mathit{lzw}} is “long zip with”; it’s similar to {\mathit{zipWith}}, but returns a list as long as its longer argument:

\displaystyle  \begin{array}{llllcl} \mathit{lzw} &&&&::& (\alpha\rightarrow\alpha\rightarrow\alpha) \rightarrow [\alpha]\rightarrow[\alpha]\rightarrow[\alpha] \\ \mathit{lzw}&f&(a:x)&(b:y) &=& f\;a\;b : \mathit{lzw}\;f\;x\;y \\ \mathit{lzw}&f&x&[\,] &=& x \\ \mathit{lzw}&f&[\,]&y &=& y \end{array}

(It’s a nice exercise to define a notion of folds for {\mathsf{Tree}}, and to write {\mathit{levels}} as a fold.)

Given {\mathit{levels}}, breadth-first enumeration is obtained by concatenation:

\displaystyle  \begin{array}{lcl} \mathit{bf} &::& \mathsf{Tree}\;\alpha \rightarrow [\alpha] \\ \mathit{bf} &=& \mathit{concat} \cdot \mathit{levels} \end{array}
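Rendered as plain runnable Haskell (a direct transcription of the definitions above, writing ++ for the concatenation operator), we have:

```haskell
data Tree a = Tree { root :: a, children :: [Tree a] }

-- "long zip with": like zipWith, but as long as the longer argument
lzw :: (a -> a -> a) -> [a] -> [a] -> [a]
lzw f (a:x) (b:y) = f a b : lzw f x y
lzw _ x     []    = x
lzw _ []    y     = y

-- one list of elements per level
levels :: Tree a -> [[a]]
levels t = [root t] : foldr (lzw (++)) [] (map levels (children t))

-- breadth-first enumeration
bf :: Tree a -> [a]
bf = concat . levels
```

For example, with t = Tree 1 [Tree 2 [Tree 4 []], Tree 3 []], levels t is [[1],[2,3],[4]] and bf t is [1,2,3,4].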

Incidentally, this allows trees to be foldable, breadth-first:

\displaystyle  \begin{array}{lcl} \mathbf{instance}\;\mathit{Foldable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{foldMap}\;f = \mathit{foldMap}\;f \cdot \mathit{bf} \end{array}


Level-order enumeration is invertible, in the sense that you can reconstruct the tree given its shape and its level-order enumeration.

One way to define this is to pass the level-order enumeration around the tree, snipping bits off it as you go. Here is a mutually recursive pair of functions to relabel a tree with a given list of lists, returning also the unused bits of the lists of lists.

\displaystyle  \begin{array}{lcl} \mathit{relabel} &::& (\mathsf{Tree}\;(),[[\alpha]]) \rightarrow (\mathsf{Tree}\;\alpha,[[\alpha]]) \\ \mathit{relabel}\;(t,(x:\mathit{xs}):\mathit{xss}) &=& \mathbf{let}\; (\mathit{us},\mathit{yss}) = \mathit{relabels}\; (\mathit{children}\;t,\mathit{xss}) \; \mathbf{in}\; (\mathit{Tree}\;x\;\mathit{us}, \mathit{xs}:\mathit{yss}) \medskip \\ \mathit{relabels} &::& ([\mathsf{Tree}\;()],[[\alpha]]) \rightarrow ([\mathsf{Tree}\;\alpha],[[\alpha]]) \\ \mathit{relabels}\;([],\mathit{xss}) &=& ([],\mathit{xss}) \\ \mathit{relabels}\;(t:\mathit{ts},\mathit{xss}) &=& \mathbf{let} \; (u,\mathit{yss}) = \mathit{relabel}\;(t,\mathit{xss}) \mathbin{;} (\mathit{us},\mathit{zss}) = \mathit{relabels}\;(\mathit{ts},\mathit{yss}) \; \mathbf{in} \; (u:\mathit{us},\mathit{zss}) \end{array}

Assuming that the given list of lists is “big enough”—ie each list has enough elements for that level of the tree—then the result is well-defined. Then {\mathit{relabel}} is determined by the equivalence

\displaystyle  \begin{array}{ll} & \mathit{relabel}\;(t,\mathit{xss}) = (u,\mathit{yss}) \\ \Leftrightarrow & \\ & \mathit{shape}\;u = \mathit{shape}\;t \land \mathit{lzw}\;(\mathbin{{+}\!\!\!{+}})\;(\mathit{levels}\;u)\;\mathit{yss} = \mathit{xss} \end{array}

Here, the {\mathit{shape}} of a tree is obtained by discarding its elements:

\displaystyle  \begin{array}{lcl} \mathit{shape} &::& \mathsf{Tree}\;\alpha \rightarrow \mathsf{Tree}\;() \\ \mathit{shape} &=& \mathit{fmap}\;(\mathit{const}\;()) \end{array}

In particular, if the given list of lists is the level-order of the tree, and so is exactly the right size, then {\mathit{yss}} will have no remaining elements, consisting entirely of empty levels:

\displaystyle \mathit{relabel}\;(\mathit{shape}\;t, \mathit{levels}\;t) = (t, \mathit{replicate}\;(\mathit{depth}\;t)\;[\,])

So we can take a tree apart into its shape and contents, and reconstruct the tree from such data.

\displaystyle  \begin{array}{lcl} \mathit{split} &::& \mathsf{Tree}\;\alpha \rightarrow (\mathsf{Tree}\;(), [[\alpha]]) \\ \mathit{split}\;t &=& (\mathit{shape}\;t, \mathit{levels}\;t) \medskip \\ \mathit{combine} &::& \mathsf{Tree}\;() \rightarrow [[\alpha]] \rightarrow \mathsf{Tree}\;\alpha \\ \mathit{combine}\;u\;\mathit{xss} &=& \mathit{fst}\;(\mathit{relabel}\;(u, \mathit{xss})) \end{array}
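Here is the relabelling machinery as runnable Haskell, repeating Tree, lzw, and levels so the snippet stands alone. The pattern match in relabel is deliberately partial: it assumes the list of lists is "big enough", as above.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Tree a = Tree { root :: a, children :: [Tree a] }
  deriving (Eq, Show, Functor)

shape :: Tree a -> Tree ()
shape = fmap (const ())

-- Partial by design: assumes each level's list has enough elements.
relabel :: (Tree (), [[a]]) -> (Tree a, [[a]])
relabel (t, (x:xs):xss) =
  let (us, yss) = relabels (children t, xss)
  in  (Tree x us, xs:yss)

relabels :: ([Tree ()], [[a]]) -> ([Tree a], [[a]])
relabels ([],   xss) = ([], xss)
relabels (t:ts, xss) =
  let (u,  yss) = relabel  (t,  xss)
      (us, zss) = relabels (ts, yss)
  in  (u:us, zss)

split :: Tree a -> (Tree (), [[a]])
split t = (shape t, levels t)

combine :: Tree () -> [[a]] -> Tree a
combine u xss = fst (relabel (u, xss))

-- supporting definitions from earlier in the post
lzw :: (a -> a -> a) -> [a] -> [a] -> [a]
lzw f (a:x) (b:y) = f a b : lzw f x y
lzw _ x     []    = x
lzw _ []    y     = y

levels :: Tree a -> [[a]]
levels t = [root t] : foldr (lzw (++)) [] (map levels (children t))
```

Relabelling a tree with its own level-order enumeration gives the tree back, leaving only empty levels.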

Breadth-first traversal

This lets us traverse a tree in breadth-first order, by performing the traversal just on the contents. We separate the tree into shape and contents, perform a list-based traversal, and reconstruct the tree.

\displaystyle  \begin{array}{l} \mathbf{instance}\;\mathit{Traversable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{traverse}\;f\;t = \mathit{pure}\;(\mathit{combine}\;(\mathit{shape}\;t)) \circledast \mathit{traverse}\;(\mathit{traverse}\;f)\;(\mathit{levels}\;t) \end{array}

This trick of traversal by factoring into shape and contents is explored in my paper Understanding Idiomatic Traversals Backwards and Forwards from Haskell 2013.

Inverting breadth-first enumeration

We’ve seen that level-order enumeration is invertible in a certain sense, and that this means that we perform traversal by factoring into shape and contents then traversing the contents independently of the shape. But the contents we’ve chosen is the level-order enumeration, which is a list of lists. Normally, one thinks of the contents of a data structure as simply being a list, ie obtained by breadth-first enumeration rather than by level-order enumeration. Can we do relabelling from the breadth-first enumeration too? Yes, we can!

There’s a very clever cyclic program for breadth-first relabelling of a tree given only a list, not a list of lists; in particular, breadth-first relabelling a tree with its own breadth-first enumeration gives back the tree you first thought of. In fact, the relabelling function is precisely the same as before! The trick comes in constructing the necessary list of lists:

\displaystyle  \begin{array}{lcl} \mathit{bflabel} &::& \mathsf{Tree}\;() \rightarrow [\alpha] \rightarrow \mathsf{Tree}\;\alpha \\ \mathit{bflabel}\;t\;\mathit{xs} &=& \mathbf{let}\;(u,\mathit{xss}) = \mathit{relabel}\;(t, \mathit{xs}:\mathit{xss})\;\mathbf{in}\;u \end{array}

Note that variable {\mathit{xss}} is defined cyclically; informally, the output leftovers {\mathit{xss}} on one level also form the input elements to be used for relabelling all the lower levels. Given this definition, we have

\displaystyle  \mathit{bflabel}\;(\mathit{shape}\;t)\;(\mathit{bf}\;t) = t

for any {t}. This program is essentially due to Geraint Jones, and is derived in an unpublished paper Linear-Time Breadth-First Tree Algorithms: An Exercise in the Arithmetic of Folds and Zips that we wrote together in 1993.
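As runnable Haskell (again self-contained, repeating the supporting definitions), the cyclic program looks like this; laziness does all the work, since xss appears on both sides of its own definition:

```haskell
data Tree a = Tree { root :: a, children :: [Tree a] }
  deriving (Eq, Show)

shape :: Tree a -> Tree ()
shape (Tree _ ts) = Tree () (map shape ts)

-- Partial by design: assumes each level's list has enough elements.
relabel :: (Tree (), [[a]]) -> (Tree a, [[a]])
relabel (t, (x:xs):xss) =
  let (us, yss) = relabels (children t, xss)
  in  (Tree x us, xs:yss)

relabels :: ([Tree ()], [[a]]) -> ([Tree a], [[a]])
relabels ([],   xss) = ([], xss)
relabels (t:ts, xss) =
  let (u,  yss) = relabel  (t,  xss)
      (us, zss) = relabels (ts, yss)
  in  (u:us, zss)

-- The knot-tied definition: the leftovers from each level are fed
-- back in as the input elements for the levels below.
bflabel :: Tree () -> [a] -> Tree a
bflabel t xs = let (u, xss) = relabel (t, xs:xss) in u
```

In particular, bflabel (shape t) [1..] numbers the nodes of t in breadth-first order.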

We can use this instead in the definition of breadth-first traversal:

\displaystyle  \begin{array}{l} \mathbf{instance}\;\mathit{Traversable}\;\mathsf{Tree}\;\mathbf{where} \\ \quad \mathit{traverse}\;f\;t = \mathit{pure}\;(\mathit{bflabel}\;(\mathit{shape}\;t)) \circledast \mathit{traverse}\;f\;(\mathit{bf}\;t) \end{array}

by jeremygibbons at March 15, 2015 05:00 PM

March 13, 2015

Matthew Sackman

Paxos notes

I've been doing a lot of reading on Paxos lately. There are many papers to read. Some of them are very good, some of them are less so. Paxos is not very complex - there are only four message types in total in the classic Paxos case (though most implementations will end up with at least six), but it is quite subtle. Very few of the papers speak at all about why things need to be written to disk, or when, for example. The following are some notes of mine, which might be useful to others.

I'm not going to attempt to describe Paxos (though I may do so by accident). The most succinct description I've come across is the first paragraph of the Paxos Made Practical paper by Robbert van Renesse:

Paxos is a simple protocol that a group of machines in a distributed system can use to agree on a value proposed by a member of the group. If it terminates, the protocol reaches consensus even if the network was unreliable and multiple machines simultaneously tried to propose different values. The basic idea is that each proposal has a unique number. Higher numbered proposals override lower-numbered ones. However, a "proposer" machine must notify the group of its proposal number before proposing a particular value. If, after hearing from a majority of the group, the proposer learns one or more values from previous proposals, it must re-use the same value as the highest-numbered previous proposal. Otherwise, the proposer can select any value to propose.

Before going further, you probably need to have read the Paxos Made Simple paper and probably the Wikipedia page on Paxos too.

Far too many papers choose different terminology for the different phases of the protocol. Thus in the following I'm just going to call them phase 1 and phase 2. Phase 1 is the one where: (1a) a proposer invents a ballot number and sends that to a majority of the acceptors; (1b) each acceptor replies iff the ballot number from (1a) is greater than any ballot number it's previously seen in a (1a) message, and that reply contains the ballot number and value of the most recent (2b) message the acceptor has sent, if any. Phase 2 is the one where: (2a) a proposer asks the majority of acceptors to accept a value with a ballot number; (2b) each acceptor replies (and accepts the value) iff the ballot number is greater or equal to the maximum ballot number it has seen in a (1a) message.
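For concreteness, the four classic message types might be rendered in Haskell like this (a hypothetical sketch of mine, not taken from any of the papers; a real ballot number would be a collision-free pair rather than an Int):

```haskell
type Ballot = Int     -- in practice a pair (round, proposer id), totally
                      -- ordered and collision-free; Int keeps the sketch small
type Value  = String  -- whatever it is you are agreeing on

data Msg
  = Prepare  Ballot                          -- 1a: proposer announces a ballot
  | Promise  Ballot (Maybe (Ballot, Value))  -- 1b: acceptor promises, reporting
                                             --     its most recently accepted
                                             --     (ballot, value), if any
  | Accept   Ballot Value                    -- 2a: proposer proposes a value
  | Accepted Ballot Value                    -- 2b: acceptor has accepted it
  deriving (Eq, Show)
```

Implementations that send nacks (discussed below) would add at least one more constructor carrying the highest ballot the acceptor has seen.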

These papers normally talk about being able to tolerate failure of up to F machines (actually, acceptors). For this to be possible, Paxos still requires the availability of a majority of the original set of acceptors. So that means a total of 2F + 1 acceptors (if you have 2F + 1 machines, then a majority of them is F+1 machines, hence F of them can still fail and you have access to a majority of the original set of machines). The importance of a majority is that if you randomly choose two sets of F+1 machines, there will be at least one machine in common (intersection is never empty). The whole point of ballot numbers (and the way they're constructed such that they can never collide) is so that at least one machine in any two majority sets of machines will be able to correctly order any two different ballot numbers. Thus the point of phase 1 is to figure out if your ballot is currently the maximum known ballot number to each acceptor, and if it is, what value you're allowed to present for phase 2.
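The "any two majorities intersect" property is easy to sanity-check mechanically. The following sketch (mine, for illustration) enumerates every pair of (F+1)-sized subsets of 2F+1 acceptors and confirms they always share a member:

```haskell
import Data.List (intersect, subsequences)

-- All subsets of a given size.
choose :: Int -> [a] -> [[a]]
choose k = filter ((== k) . length) . subsequences

-- With 2F+1 acceptors, every pair of (F+1)-sized majorities overlaps.
majoritiesIntersect :: Int -> Bool
majoritiesIntersect f =
  let acceptors  = [1 .. 2 * f + 1] :: [Int]
      majorities = choose (f + 1) acceptors
  in  and [ not (null (m1 `intersect` m2))
          | m1 <- majorities, m2 <- majorities ]
```

By contrast, two F-sized subsets can be disjoint, which is exactly why a bare half of the acceptors is not enough.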

In the original papers, acceptors only ever reply iff the ballot number from the proposer meets certain requirements. If it does not, the acceptors are silent and the proposer is meant to determine through some sort of timeout that their message has failed. However, there's no need to implement it like this - several systems have the acceptors actively send back nack messages to the proposer. Paxos will work perfectly well on unreliable communication channels, which means it's fine over UDP. However, UDP frequently doesn't work in the cloud due to cloud providers breaking PMTU discovery and not supporting fragmentation. In such an environment, Paxos will not violate its properties, but you might find nothing makes any progress. If none of that applies to you and so you use UDP then you may well need to implement the timeouts too, in case the nacks go missing (if you choose to use nacks) and you can't detect that loss. If you're using TCP then you might decide you can just rely on the nacks (application-layer nacks, not TCP), not bother with timeouts, and also watch for the TCP connection dropping. The argument against timeouts is that the machine with whom you're communicating might just be under heavy load. But then again, is that acceptable for the service you're trying to build?

Paxos ensures that once a majority of the acceptors have accepted a value (by accepted, I mean an acceptor has received a phase 2a message which meets the requirements such that it intends to reply with a 2b message), it is impossible to get a majority of the acceptors to accept a different value. This, in combination with the construction of ballot numbers means that if any two acceptors have accepted a value for the same ballot, it is the same value. An entire instance of Paxos creates consensus in perpetuity on one value only. Normally you want to create a stream of values, so you need to run lots of rounds. How you construct your stream is up to you, but simply name-spacing each instance by the event log ID works just fine.

If in phase 1 you established your ballot number is the greatest ballot number ever and then you were just allowed to pick any old value, then it should be clear that anyone else could come along later, pick an even greater ballot number, and change the accepted value. So this is why phase 1b includes the value and ballot number for the most recent 2b message the acceptor has sent. The proposer has to receive 1b messages from a majority of acceptors before it can make further progress. Now let's pretend that each acceptor actually sends back all the ballot numbers and values for all 2b messages it's ever sent as part of this instance. The proposer now has the complete history of all values accepted from a majority of acceptors. These histories can be different for each acceptor, though as said above, where any two acceptors accepted a value for the same ballot, the value will be the same. You can therefore create a list of all the accepted values, with the ballot numbers as indices; there may well be gaps in this list. What should the proposer now do? All the proposer can do is to cause the acceptors to append to this imaginary list - the acceptors will only act on a greater-than-ever-before ballot number, which means appending to the end of our imaginary list. We want future proposers to be forced to continue our work rather than diverge. If, for example, we were to force the proposer to send a 2a message with the earliest accepted value then that would not cause more acceptors to agree on what their earliest accepted value is. So the only other sane choice is to force the proposer to send a 2a message with the value of the greatest ballot number it knows of that's been accepted. This can't reduce the spread of this particular value: even if the proposer dies right here, the acceptors haven't lost anything. It can increase the spread of this value though by passing this value to acceptors who haven't previously accepted any value for this ballot number (and because we got a 1b reply from that acceptor, we know that our current ballot number is acceptable to that acceptor; from this point, we can realise that we're most likely to make progress if the majority we send our 2a message to is the same as the majority we sent our 1a message to).
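The rule just described - the proposer must adopt the value of the greatest accepted ballot reported in the 1b replies, and may propose its own value only when none was reported - is compact enough to sketch in Haskell (hypothetical names of mine; each 1b reply is modelled as Maybe (ballot, value)):

```haskell
import Data.List (maximumBy)
import Data.Maybe (catMaybes)
import Data.Ord (comparing)

type Ballot = Int

-- Each 1b reply optionally reports the (ballot, value) of the
-- acceptor's most recently accepted proposal.
type OneB v = Maybe (Ballot, v)

-- The value the proposer must use in its 2a message: the value of the
-- greatest reported ballot, or its own proposal if nothing was reported.
chooseValue :: v -> [OneB v] -> v
chooseValue own replies =
  case catMaybes replies of
    []       -> own
    accepted -> snd (maximumBy (comparing fst) accepted)
```

Note that chooseValue must only be applied once 1b replies from a full majority are in hand; applying it to fewer replies would break the intersection argument above.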

Once a value has been accepted by a majority of acceptors, any further valid (by valid, I mean it does not get ignored, or cause a nack) 1a message from a proposer will guarantee the accepted value is returned in the 1b message and must be chosen again by the proposer in its next 2a message (by the definition of a majority, it is not possible for a different value to have been accepted at the same time by a different majority (even with a higher ballot number)). Once a value is accepted by a majority of acceptors, at least one member of that majority is guaranteed to be in any other majority, and will thus present its accepted value in any 1b messages it sends. Acceptors are considered single-threaded in their dealings with Paxos. So when an acceptor is dealing with a valid 2a message and sending a 2b response, it is not simultaneously processing 1a messages. Thus if an acceptor receives a valid 2a message and accepts that value, some other proposer may be sending phase 1a messages or even different 2a messages to this same acceptor, but they have to wait their turn. In both cases: if some 1a or 2a message arrives afterwards, they are ignored (or a nack sent) if their ballot number is too low, otherwise the 1a will elicit a response (1b) containing the newly accepted value. However, the 2a cannot be valid. This is because a proposer would only send a 2a if it got a 1b back, which implies its ballot number is the greatest. But this acceptor has just accepted a value, implying the accepted value's ballot number must be greater still. Thus in any interleaving involving multiple proposers and an acceptor, the only value acceptable will be with the ballot number of the most recently issued 1b message (or put another way, a 2a will only be accepted from proposer ρ if the previous valid message the acceptor received was a 1a from proposer ρ (other than the special case of the first ballot number where phase 1 isn't necessary - covered below)).

All this talk of majorities is often confusing in combination with failures. The simplest formulation is that a proposer should send the 1a and 2a messages to all acceptors, and can make progress when it receives 1b and 2b message responses (respectively) from a majority of them. This is the simplest way of dealing with the possibility of failures. However, it's frequently a good idea to optimise for the common case, which is when failures don't occur. Thus you can pick your majority of acceptors at the start (perhaps randomly, perhaps not) and communicate with just them, sending your 1a and 2a messages just to them, and waiting for all of them to reply. But what then happens if any of them fail? You're now not talking to a majority. The simplest choice here is to abandon the current ballot, pick a fresh majority (this is a majority of the original 2F+1 acceptors - don't shrink your set of acceptors just because of a failure; you should only shrink it when you know a machine is not going to come back - covered below), and start from phase 1a with an increased ballot number.

When a proposer receives 2b messages back from a majority of acceptors, it knows the value chosen is never going to change for this particular instance of Paxos. The proposer can then disseminate this information as it chooses (this is often called phase 3/learning phase). If you plan it so, you can have the acceptors send their 2b message to not-just-the-proposer: that way, several parties can learn at the same time that a value has been chosen, without needing the extra hop of going through the proposer. You'll need to deal with some learners dying, whilst others don't, and the need to re-synchronise what's been learnt. The re-synchronising will however be easy because you won't have any conflicts - Paxos guarantees us that. So it should just be adding everything that's been learnt together from all available sources.

The proposer only gets to propose its own value in a 2a message when there is no information returned in the 1b messages from a majority of acceptors. Thus if a proposer is trying to get a particular value added to an event log, it may have to take part in several instances of Paxos before it finds one where it's able to get 2a messages to a majority of acceptors for the first ballot number of that instance. Note that in light of failures of acceptors it should not immediately skip to the next instance: it could be that it was able to get its value to some acceptors (albeit not a majority), and some other proposer just happened to pick some of those acceptors in its own majority, and succeeded in spreading that value to a majority. It should only move on to a new Paxos instance if it has learnt it has lost the current instance.

In this light, if the proposer fails after having sent some 2a messages, you have no knowledge as to what value is going to be agreed on by later ballots. If the proposer did manage to get 2a messages to the majority then yes, you have consensus. But if the 2a messages didn't reach a majority, then a different proposer can happen to choose a majority not including any of the previous acceptors, and get a totally different value accepted. Or it can choose a majority which has some acceptors in common with the original 2a messages, and thus complete the instance with the value as originally sent. So you can't assume anything about what will happen in an instance when a proposer dies.

If you happen to construct your system such that you know a particular instance will only ever be started by a particular proposer, then that proposer doesn't need to bother with phase 1 at all - it can start with a phase 2a message (you're guaranteeing there's no history of this instance of Paxos for the proposer to learn through phase 1). Of course, again in light of failures of acceptors it may need to revert to phase 1, but in the common case (no failures), this is a useful optimisation that can halve the number of messages.

The original Paxos papers talk about having the acceptors write their state to disk, though it's not really explained why. If you do have the acceptors write state to disk then it means they can be restarted and continue where they left off - they'll have to read their state off disk and send some more messages, perhaps redundantly, and so your proposers will have to make sure they can handle messages they receive (i.e. 1b and 2b messages) idempotently, but you've probably done that anyway. But for this to work also requires that the restart of the acceptors isn't detected as a failure by the proposers, which may mean you're using UDP rather than TCP, which means you've probably gone down the timeout route. All of this means performance is unlikely to be amazing: the extra fsyncs necessary are going to hurt, the timeouts may have to be fairly generous (and how on earth do you decide what amount of time is allowed for an acceptor to restart without considering that a failure?), and detecting actual failures is going to be more complex.

If you decide to use TCP instead, and you decided that a TCP connection dropping indicates a failure at the other end, then it means that an acceptor being restarted will be considered a failure of an acceptor. In which case, it doesn't matter if that acceptor loses its state. Consequently, the acceptors don't need to write anything to disk. To see this, consider the following: due to design, or some programming bug, you have always chosen the exact same F+1 as your majority of acceptors. They have never crashed, and so they contain all the information to date about every instance of Paxos you've run. The remaining F acceptors contain nothing. Suddenly there's a power failure, and F of those chosen F+1 acceptors die. But Paxos can handle the death of up to F acceptors, so you're still fine. But now you have just F+1 acceptors left, and those F+1 acceptors are your only choice, ongoing, for the majority of acceptors. Crucially, that includes the 1 surviving acceptor from the original majority that has all the state. So nothing has been lost, even without writing anything to disk.

Now yes, if you need to be able to power down the entire system and then resume where you left off then you're going to have to write to disk. But that may still be outside of Paxos rather than within. For example, if you're using Paxos to achieve consensus on some stream of commands then the client which is submitting the command doesn't return until: 1) Paxos has successfully added the command to the stream (i.e. the client, playing the role of proposer, has had 2b messages back from a majority of acceptors for an instance of Paxos in which the proposer was able to pick its own value (command) for the 2a message); 2) the command has been processed by some later step (command processor) and some result returned. Now if the acceptors are co-located with the command processor and you need to turn everything off then does it matter if the stream of commands is lost? The client that submitted the command is just going to get an error, even if its command is eventually processed some time later, so do we really care if that command is lost?

It depends. Presumably the state of the command processors is written to disk every time the state changes, and if you have several of these processors then they could run at different speeds. So it's then a matter of: how do you re-synchronise the state of these command processors? Some of the papers show that you could use Paxos for this, which is true, but then you may need Paxos to maintain quite a history and add other messages to the acceptors so that you can efficiently catch up. Or you could build this re-synchronisation through other means, outside of Paxos, and just keep Paxos for speedy fault-tolerant non-blocking consensus.

Update (13th March 2015): There's a case I missed here. Consider that Paxos isn't writing to disk. Acceptors send out their 2b messages to learners. Only 1 learner receives at least F+1 2b messages before all the acceptors die. That 1 learner acts on the 2bs, proceeding in some way (which does involve writing to disk) before it too dies. At this point, the system is blocked because >F acceptors have died, and indeed the entire system is now turned off. Everyone now comes back up, except for the one node that received all the 2bs. Consequently, knowledge of this Paxos instance is lost, but the system is operational as at most F acceptors are dead. If that one node ever comes back, you have divergence. Even worse, that one node that acted on the 2bs may have done enough work to mean the result of that instance became visible to the outside world.

So how do you actually deal with intentional removal (or addition) of acceptors? One simple idea is that the set of identities of acceptors amounts to a topology, and this is versioned (ver n). So if you want to add or remove an acceptor then you calculate what the new topology is, give it an increased version (ver n+1), and now run a Paxos instance proposing achieving consensus on what ver n+1 of the topology really is (I think you want to do this Paxos instance with the new topology rather than the old). All instances of Paxos reference the topology version. In a Paxos instance, once a value has been accepted by the majority, you cannot change the topology for that instance. If you increased the number of acceptors then you could render the old majority so small that none of them are chosen in a later ballot, thus allowing the accepted value to be changed. If you decreased the number of acceptors then you could remove enough of the old majority such that a new majority from the new topology need not include any of the original majority, and again the accepted value could change. Thus the easiest is simply that any 1a message would have to carry the topology version, and once an acceptor has received a valid 1a message, the topology version for that Paxos instance is fixed. So then if an acceptor receives a 1a or 2a message for that instance which has the wrong topology number, it will issue a nack, indicating the correct topology number, hopefully forcing the proposer to discover the correct new topology. That does mean that if >F acceptors fail, a Paxos instance can just get stuck. This, you'll just have to deal with explicitly, having everyone detect this case and completely aborting the instance.

There are further complications here too. Consider the earlier case where the same majority of F+1 kept getting chosen, and then F of them died, leaving just 1 acceptor with all the information. If a topology change now happens, removing all the failed acceptors then all the information held by this special acceptor is certainly not held by a majority any more, and is in danger of being lost. So historical instances of Paxos must be made read only, and before the topology change is agreed, it may be essential to duplicate or otherwise make safe such data.

Hopefully the above is useful to some people, not just myself. I've struggled to find much information regarding the various approaches and tradeoffs when implementing Paxos. There are various papers such as Paxos Made Live which are certainly worth reading. But they don't seem to cover much of what I've tried to cover above. Such papers tend to record their concrete implementations rather than considering various different uses of Paxos and what the consequences are. Comments and corrections are very welcome - please let me know.

March 13, 2015 09:30 PM

Functional Jobs

Software Engineer at Xledger (Full-time)

Xledger is a finance, project, and business information system that gives customers tighter control of their businesses. We are looking for a qualified software engineer to join our team.

The current technology stack is a standard ASP.NET Web Forms style stack, with some JavaScript heavy views (some in React.js) backed by ASP.NET MVC. For this position, we are looking for someone to help deliver improvements on the framework level in that stack, and help start a new project using the best technologies.

Must haves:

  • 2+ years of experience
  • A good fundamental understanding of how the web and web applications work (HTTP, databases, caching, API's, etc)
  • Ability to do solid work in multiple areas, such as (data access, caching, the front end, etc)
  • Ability to pick things up quickly
  • High standards
  • Ability to focus and get things done

Nice to haves

  • Experience with functional programming languages, such as Clojure, F#, or Haskell

Get information on how to apply for this position.

March 13, 2015 08:13 PM

Software Engineer (Frontend/backend, Sr/Jr candidates welcome) at Xledger (Full-time)

Xledger is a finance, project, and business information system that gives customers tighter control of their businesses. We are looking for a qualified software engineer to join our team.

The current technology stack is a standard ASP.NET Web Forms style stack, with some JavaScript heavy views (some in React.js) backed by ASP.NET MVC. For this position, we are looking for someone to help deliver improvements on the framework level in that stack, and help start a new project using the best technologies.

Must haves:

  • 2+ years of experience
  • A good fundamental understanding of how the web and web applications work (HTTP, databases, caching, APIs, etc.)
  • Ability to do solid work in multiple areas, such as (data access, caching, the front end, etc)
  • Ability to pick things up quickly
  • High standards
  • Ability to focus and get things done

Nice to haves

  • Experience with functional programming languages, such as Clojure, F#, or Haskell

Note: For junior front end candidates, the only hard requirement is some experience with JS and HTML/CSS, and an interest in functional programming.

Get information on how to apply for this position.

March 13, 2015 08:13 PM

Functional Jobs

Lead front end developer at Anchor Systems (Full-time)

Are you a Haskell hacker with plenty of front end development experience? Do you get excited whilst navigating the border between complex back end systems and simple, elegant UIs?

Anchor Systems is looking for a lead front end developer to help us build excellent UIs for our newest products.

You will be working as part of the Anchor R&D group, where we develop systems for cloud orchestration, data visualisation, monitoring and much more. Our R&D developers work exclusively in Haskell. They're great at providing well-documented APIs but we need someone to present this to our users. This is where you come in, bridging the gap between back end and front end.

As our sole (for now) front end developer you will make technical decisions on framework usage, scalable architecture, and abstractions for exposing our back end systems (things like OpenStack, AWS, OAuth) to the user. You will be responsible for the correctness, maintainability and usability of the interfaces you build.

  • You must be an expert in the current HTML5, JS and CSS happenings.

  • You must be comfortable with linux systems.

  • You must be able to architect scalable and highly available front ends.

  • You must have experience working with REST APIs.

  • You should know and love Haskell (or be really interested in learning it).

  • You should have a working knowledge of computer science fundamentals.

  • You should think that usability testing is a great idea.

  • Bonus points for experience with OpenStack, AWS, react.js.

Whilst working with us, you'll find yourself in a great office overlooking Hyde park, with free coffee from a local cafe, a transparent no-nonsense atmosphere and a collaborative academic environment. You'll be sure to learn a lot.

Get information on how to apply for this position.

March 13, 2015 12:23 AM

Jasper Van der Jeugt

Practical testing in Haskell


There has been a theme of “Practical Haskell” in the last few blogposts I published, and when I published the last one, on how to write an LRU Cache, someone asked me if I could elaborate on how I would test or benchmark such a module. For the sake of brevity, I will constrain myself to testing for now, although I think a lot of the ideas in the blogpost also apply to benchmarking.

This post is written in Literate Haskell. It depends on the LRU Cache we wrote last time, so you need both modules if you want to play around with the code. Both can be found in this repo.

Since I use a different format for blogpost filenames than GHC expects for module names, loading both modules is a bit tricky. The following works for me:

$ ghci posts/2015-02-24-lru-cache.lhs \
*Data.SimpleLruCache> :m +Data.SimpleLruCache.Tests
*Data.SimpleLruCache Data.SimpleLruCache.Tests>

Alternatively, you can of course rename the files.

Test frameworks in Haskell

There are roughly two kinds of test frameworks which are commonly used in the Haskell world:

  • Unit testing, for writing concrete test cases. We will be using HUnit.

  • Property testing, which allows you to test properties rather than specific cases. We will be using QuickCheck. Property testing is something that might be unfamiliar to people just starting out in Haskell. However, because there already are great tutorials on QuickCheck out there, I will not explain it in detail. smallcheck also falls in this category.

Finally, it’s nice to have something to tie it all together. We will be using Tasty, which lets us run HUnit and QuickCheck tests in the same test suite. It also gives us plenty of convenient options, e.g. running only a part of the test suite. We could also choose to use test-framework or Hspec instead of Tasty.

A module structure for tests

Many Haskell projects start out by just having a tests.hs file somewhere, but this obviously does not scale well to larger codebases.

The way I like to organize tests is based on how we organize code in general: through the module hierarchy. If I have the following modules in src/:

AcmeCompany.AwesomeProduct.Database
AcmeCompany.AwesomeProduct.Importer
AcmeCompany.AwesomeProduct.Importer.Csv
Data.SimpleLruCache

I aim to have the following modules in tests/:

AcmeCompany.AwesomeProduct.Database.Tests
AcmeCompany.AwesomeProduct.Importer.Tests
AcmeCompany.AwesomeProduct.Importer.Csv.Tests
Data.SimpleLruCache.Tests

If I want to add some higher-level tests which basically test the entire product, I can usually add these higher in the module tree. For example, if I wanted to test our entire awesome product, I would write the tests in AcmeCompany.AwesomeProduct.Tests.

Every .Tests module exports a tests :: TestTree value. A TestTree is a tasty concept – basically a structured group of tests. Let’s go to our motivating example: testing the LRU Cache I wrote in the previous blogpost.

Since I named the module Data.SimpleLruCache, we use Data.SimpleLruCache.Tests here.

> {-# OPTIONS_GHC -fno-warn-orphans #-}
> {-# LANGUAGE BangPatterns               #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}
> module Data.SimpleLruCache.Tests
>     ( tests
>     ) where
> import           Control.Applicative     ((<$>), (<*>))
> import           Control.DeepSeq         (NFData)
> import           Control.Monad           (foldM_)
> import           Data.Hashable           (Hashable (..))
> import qualified Data.HashPSQ            as HashPSQ
> import           Data.IORef              (newIORef, readIORef, writeIORef)
> import           Data.List               (foldl')
> import qualified Data.Set                as S
> import           Prelude                 hiding (lookup)
> import           Data.SimpleLruCache
> import qualified Test.QuickCheck         as QC
> import qualified Test.QuickCheck.Monadic as QC
> import           Test.Tasty              (TestTree, testGroup)
> import           Test.Tasty.HUnit        (testCase)
> import           Test.Tasty.QuickCheck   (testProperty)
> import           Test.HUnit              (Assertion, (@?=))

What to test

One of the hardest questions is, of course, which functions and modules should I test? If unlimited time and resources are available, the obvious answer is “everything”. Unfortunately, time and resources are often scarce.

My rule of thumb is based on my development style. I tend to use GHCi a lot during development, and play around with datastructures and functions until they seem to work. These “it seems to work” cases I execute in GHCi often make great candidates for simple HUnit tests, so I usually start with that.

Then I look at invariants of the code, and try to model these as QuickCheck properties. This sometimes requires writing tricky Arbitrary instances; I will give an example of this later in this blogpost.

I probably don’t have to say that the more critical the code is, the more tests should be added.

After doing this, it is still likely that we will hit bugs if the code is non-trivial. These bugs form good candidates for testing as well:

  1. First, add a test case to reproduce the bug. Sometimes a test case will be a better fit, sometimes we should go with a property – it depends on the bug.
  2. Fix the bug so the test case passes.
  3. Leave in the test case for regression testing.

Using this strategy, you should be able to convince yourself (and others) that the code works.

Simple HUnit tests

Testing simple cases using HUnit is trivial, so we won’t spend that much time here. @?= asserts that two values must be equal, so let’s use that to check that trimming the empty Cache doesn’t do anything evil:

> testCache01 :: Assertion
> testCache01 =
>     trim (empty 3 :: Cache String Int) @?= empty 3

If we need to do some I/O in our test, we can do so without much trouble in HUnit. After all,

Test.HUnit> :i Assertion
type Assertion = IO ()  -- Defined in 'Test.HUnit.Lang'

so Assertion is just IO!

> testCache02 :: Assertion
> testCache02 = do
>     h  <- newHandle 10 :: IO (Handle String Int)
>     v1 <- cached h "foo" (return 123)
>     v1 @?= 123
>     v2 <- cached h "foo" (fail "should be cached")
>     v2 @?= 123

That was fairly easy.

As you can see, I usually give simple test cases numeric names. Sometimes there is a meaningful name for a test (for example, if it is a regression test for a bug), but usually I don’t mind using just numbers.

Simple QuickCheck tests

Let’s do some property based testing. There are a few properties we can come up with.

Calling HashPSQ.size takes O(n) time, which is why we are keeping our own counter, cSize. We should check that it matches HashPSQ.size, though:

> sizeMatches :: (Hashable k, Ord k) => Cache k v -> Bool
> sizeMatches c =
>     cSize c == HashPSQ.size (cQueue c)

The cTick field contains the priority of our next element that we will insert. The priorities currently in the queue should all be smaller than that.

> prioritiesSmallerThanNext :: (Hashable k, Ord k) => Cache k v -> Bool
> prioritiesSmallerThanNext c =
>     all (< cTick c) priorities
>   where
>     priorities = [p | (_, p, _) <- HashPSQ.toList (cQueue c)]

Lastly, the size should always be smaller than or equal to the capacity:

> sizeSmallerThanCapacity :: (Hashable k, Ord k) => Cache k v -> Bool
> sizeSmallerThanCapacity c =
>     cSize c <= cCapacity c

Tricks for writing Arbitrary instances

The Action trick

Of course, if you are somewhat familiar with QuickCheck, you will know that the previous properties require an Arbitrary instance for Cache.

One way to write such instances is what I’ll call the “direct” method. For us this would mean generating a list of (key, priority, value) triples and converting that to a HashPSQ. Then we could compute the size of that and initialize the remaining fields.

However, writing an Arbitrary instance this way can get hard if our datastructure becomes more complicated, especially if there are complicated invariants. Additionally, if we take any shortcuts in the implementation of arbitrary, we might not test the edge cases well!

Another way to write the Arbitrary instance is by modeling use of the API. In our case, there are only two things we can do with a pure Cache: insert and lookup.

> data CacheAction k v
>     = InsertAction k v
>     | LookupAction k
>     deriving (Show)

This has a trivial Arbitrary instance:

> instance (QC.Arbitrary k, QC.Arbitrary v) =>
>         QC.Arbitrary (CacheAction k v) where
>     arbitrary = QC.oneof
>         [ InsertAction <$> QC.arbitrary <*> QC.arbitrary
>         , LookupAction <$> QC.arbitrary
>         ]

And we can apply these actions to our pure Cache to get a new Cache:

> applyCacheAction
>     :: (Hashable k, Ord k)
>     => CacheAction k v -> Cache k v -> Cache k v
> applyCacheAction (InsertAction k v) c = insert k v c
> applyCacheAction (LookupAction k)   c = case lookup k c of
>     Nothing      -> c
>     Just (_, c') -> c'

You probably guessed where this was going by now: we can generate an arbitrary Cache by generating a bunch of these actions and applying them one by one on top of the empty cache.

> instance (QC.Arbitrary k, QC.Arbitrary v, Hashable k, NFData v, Ord k) =>
>         QC.Arbitrary (Cache k v) where
>     arbitrary = do
>         capacity <- QC.choose (1, 50)
>         actions  <- QC.arbitrary
>         let !cache = empty capacity
>         return $! foldl' (\c a -> applyCacheAction a c) cache actions

Provided that we can model the complete user facing API using such an “action” datatype, I think this is a great way to write Arbitrary instances. After all, our Arbitrary instance should then be able to reach the same states as a user of our code.

An extension of this trick is using a separate datatype which holds the list of actions we used to generate the Cache as well as the Cache.

> data ArbitraryCache k v = ArbitraryCache [CacheAction k v] (Cache k v)
>     deriving (Show)

When a test fails, we can then log the list of actions which got us into the invalid state – very useful for debugging. Furthermore, we can implement the shrink method in order to try to reach a similar invalid state using fewer actions.

The SmallInt trick

Now, note that our Arbitrary instance is for Cache k v, i.e., we haven’t chosen yet what we want to have as k and v for our tests. In this case v is not so important, but the choice of k is important.

We want to cover all corner cases, and this includes ensuring that we cover collisions. If we use String or Int as key type k, collisions are very unlikely due to the high cardinality of both types. Since we are using a hash-based container underneath, hash collisions must also be covered.

We can solve both problems by introducing a newtype which restricts the cardinality of Int, and uses a “worse” (in the traditional sense) hashing method.

> newtype SmallInt = SmallInt Int
>     deriving (Eq, Ord, Show)
> instance QC.Arbitrary SmallInt where
>     arbitrary = SmallInt <$> QC.choose (1, 100)
> instance Hashable SmallInt where
>     hashWithSalt salt (SmallInt x) = (salt + x) `mod` 10

Monadic QuickCheck

Now let’s mix QuickCheck with monadic code. We will be testing the Handle interface to our cache. This interface consists of a single method:

cached
    :: (Hashable k, Ord k)
    => Handle k v -> k -> IO v -> IO v

We will write a property to ensure our cache retains and evicts the right key-value pairs. It takes two arguments: the capacity of the LRU Cache (we use a SmallInt in order to get more evictions), and a list of key-value pairs we will insert using cached (we use SmallInt so we will cover collisions).

> historic
>     :: SmallInt              -- ^ Capacity
>     -> [(SmallInt, String)]  -- ^ Key-value pairs
>     -> QC.Property           -- ^ Property
> historic (SmallInt capacity) pairs = QC.monadicIO $ do

QC.run is used to lift IO code into the QuickCheck property monad PropertyM – so it is a bit like a more concrete version of liftIO. I prefer it here over liftIO because it makes it a bit more clear what is going on.

>     h <- QC.run $ newHandle capacity

We will fold (foldM_) over the pairs we need to insert. The state we pass in this foldM_ is the history of pairs we previously inserted. By building this up again using :, we ensure history contains a recent-first list, which is very convenient.

Inside every step, we call cached. By using an IORef in the code where we would usually actually “load” the value v, we can communicate whether or not the value was already in the cache. If it was already in the cache, the write will not be executed, so the IORef will still be set to False. We store that result in wasInCache.

In order to verify this result, we reconstruct a set of the N most recent keys. We can easily do this using the list of recent-first key-value pairs we have in history.

>     foldM_ (step h) [] pairs
>   where
>     step h history (k, v) = do
>         wasInCacheRef <- QC.run $ newIORef True
>         _             <- QC.run $ cached h k $ do
>             writeIORef wasInCacheRef False
>             return v
>         wasInCache    <- QC.run $ readIORef wasInCacheRef
>         let recentKeys = nMostRecentKeys capacity S.empty history
>         QC.assert (S.member k recentKeys == wasInCache)
>         return ((k, v) : history)

This is our auxiliary function to calculate the N most recent keys, given a recent-first key-value pair list.

> nMostRecentKeys :: Ord k => Int -> S.Set k -> [(k, v)] -> S.Set k
> nMostRecentKeys _ keys [] = keys
> nMostRecentKeys n keys ((k, _) : history)
>     | S.size keys >= n    = keys
>     | otherwise           =
>         nMostRecentKeys n (S.insert k keys) history

This test did not cover checking that the values in the cache are correct, but only ensures it retains the correct key-value pairs. This is a conscious decision: I think the retaining/evicting part of the LRU Cache code was the most tricky, so we should prioritize testing that.

Tying everything up

Lastly, we have our tests :: TestTree. It is not much more than an index of tests in the module. We use testCase to pass HUnit tests to the framework, and testProperty for QuickCheck properties.

Note that I usually tend to put these at the top of the module, but here I put it at the bottom of the blogpost for easier reading.

> tests :: TestTree
> tests = testGroup "Data.SimpleLruCache"
>     [ testCase "testCache01" testCache01
>     , testCase "testCache02" testCache02
>     , testProperty "size == HashPSQ.size"
>         (sizeMatches :: Cache SmallInt String -> Bool)
>     , testProperty "priorities < next priority"
>         (prioritiesSmallerThanNext :: Cache SmallInt String -> Bool)
>     , testProperty "size < capacity"
>         (sizeSmallerThanCapacity :: Cache SmallInt String -> Bool)
>     , testProperty "historic" historic
>     ]

The last thing we need is a main function for cabal test to invoke. I usually put this in something like tests/Main.hs. If you use the scheme which I described above, this file should look very neat:

module Main where

import           Test.Tasty (defaultMain, testGroup)

import qualified AcmeCompany.AwesomeProduct.Database.Tests
import qualified AcmeCompany.AwesomeProduct.Importer.Csv.Tests
import qualified AcmeCompany.AwesomeProduct.Importer.Tests
import qualified Data.SimpleLruCache.Tests

main :: IO ()
main = defaultMain $ testGroup "Tests"
    [ AcmeCompany.AwesomeProduct.Database.Tests.tests
    , AcmeCompany.AwesomeProduct.Importer.Csv.Tests.tests
    , AcmeCompany.AwesomeProduct.Importer.Tests.tests
    , Data.SimpleLruCache.Tests.tests
    ]

If you are still hungry for more Haskell testing, I would recommend looking into Haskell program coverage for mission-critical modules.

Special thanks to Alex Sayers, who beat everyone’s expectations when he managed to stay sober for just long enough to proofread this blogpost.

March 13, 2015 12:00 AM

March 12, 2015

Theory Lunch (Institute of Cybernetics, Tallinn)

Limit languages of cellular automata

At today’s Theory Lunch I discussed limit languages of cellular automata, and Lyman Hurd’s example of a CA whose limit language is not regular. I wrote about this on my other blog.


by Silvio Capobianco at March 12, 2015 02:32 PM

March 11, 2015

Dominic Steinitz

Stochastic Volatility


Simple models for e.g. financial option pricing assume that the volatility of an index or a stock is constant, see here for example. However, simple observation of time series shows that this is not the case; if it were, then the log returns would be white noise.

One approach which addresses this, GARCH (Generalised AutoRegressive Conditional Heteroskedasticity), models the evolution of volatility deterministically.

Stochastic volatility models treat the volatility of a return on an asset, such as an option to buy a security, as a Hidden Markov Model (HMM). Typically, the observable data consist of noisy mean-corrected returns on an underlying asset at equally spaced time points.

There is evidence that Stochastic Volatility models (Kim, Shephard, and Chib (1998)) offer increased flexibility over the GARCH family, e.g. see Geweke (1994), Fridman and Harris (1998) and Jacquier, Polson, and Rossi (1994). Despite this and judging by the numbers of questions on the R Special Interest Group on Finance mailing list, the use of GARCH in practice far outweighs that of Stochastic Volatility. Reasons cited are the multiplicity of estimation methods for the latter and the lack of packages (but see here for a recent improvement to the paucity of packages).

In their tutorial on particle filtering, Doucet and Johansen (2011) give an example of stochastic volatility. We save this approach for future blog posts and follow Lopes and Polson and the excellent lecture notes by Hedibert Lopes.

Here’s the model.

\displaystyle   \begin{aligned}  H_0     &\sim {\mathcal{N}}\left( m_0, C_0\right) \\  H_t     &= \mu + \phi H_{t-1} + \tau \eta_t \\   Y_t     &= \beta \exp(H_t / 2) \epsilon_t \\  \end{aligned}

We wish to estimate \mu, \phi, \tau and \boldsymbol{h}. To do this via a Gibbs sampler we need to sample from

\displaystyle   {p \left( \mu, \phi, \tau \,\vert\, \boldsymbol{h}, \boldsymbol{y} \right)} \quad \text{and} \quad  {p \left( \boldsymbol{h} \,\vert\, \mu, \phi, \tau, \boldsymbol{y} \right)}

Haskell Preamble

> {-# OPTIONS_GHC -Wall                      #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing   #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults    #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind   #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods  #-}
> {-# OPTIONS_GHC -fno-warn-orphans          #-}
> {-# LANGUAGE RecursiveDo                   #-}
> {-# LANGUAGE ExplicitForAll                #-}
> {-# LANGUAGE TypeOperators                 #-}
> {-# LANGUAGE TypeFamilies                  #-}
> {-# LANGUAGE ScopedTypeVariables           #-}
> {-# LANGUAGE DataKinds                     #-}
> {-# LANGUAGE FlexibleContexts              #-}
> module StochVol (
>     bigM
>   , bigM0
>   , runMC
>   , ys
>   , vols
>   , expectationTau2
>   , varianceTau2
>   ) where
> import Numeric.LinearAlgebra.HMatrix hiding ( (===), (|||), Element,
>                                               (<>), (#>), inv )
> import qualified Numeric.LinearAlgebra.Static as S
> import Numeric.LinearAlgebra.Static ( (<>) )
> import GHC.TypeLits
> import Data.Proxy
> import Data.Maybe ( fromJust )
> import Data.Random
> import Data.Random.Source.PureMT
> import Control.Monad.Fix
> import Control.Monad.State.Lazy
> import Control.Monad.Writer hiding ( (<>) )
> import Control.Monad.Loops
> import Control.Applicative
> import qualified Data.Vector as V
> inv :: (KnownNat n, (1 <=? n) ~ 'True) => S.Sq n -> S.Sq n
> inv m = fromJust $ S.linSolve m S.eye
> infixr 8 #>
> (#>) :: (KnownNat m, KnownNat n) => S.L m n -> S.R n -> S.R m
> (#>) = (S.#>)
> type StatsM a = RVarT (Writer [((Double, Double), Double)]) a
> (|||) :: (KnownNat ((+) r1 r2), KnownNat r2, KnownNat c, KnownNat r1) =>
>          S.L c r1 -> S.L c r2 -> S.L c ((+) r1 r2)
> (|||) = (S.¦)

Marginal Distribution for Parameters

Let us take a prior that is standard for linear regression

\displaystyle   (\boldsymbol{\theta}, \tau^2) \sim {\mathcal{NIG}}(\boldsymbol{\theta}_0, V_0, \nu_0, s_0^2)

where \boldsymbol{\theta} = (\mu, \phi)^\top and use standard results for linear regression to obtain the required marginal distribution.

That the prior is Normal Inverse Gamma ({\cal{NIG}}) means

\displaystyle   \begin{aligned}  \boldsymbol{\theta} \, | \, \tau^2 & \sim {\cal{N}}(\boldsymbol{\theta}_0, \tau^2 V_0) \\  \tau^2                             & \sim {\cal{IG}}(\nu_0 / 2, \nu_0 s_0^2 / 2)  \end{aligned}

Standard Bayesian analysis for regression tells us that the (conditional) posterior distribution for

\displaystyle   y_i = \beta + \alpha x_i + \epsilon_i

where the \{\epsilon_i\} are IID normal with variance \sigma^2 is given by

\displaystyle   {p \left( \alpha, \beta, \eta \,\vert\, \boldsymbol{y}, \boldsymbol{x} \right)} =  {\cal{N}}((\alpha, \beta); \mu_n, \sigma^2\Lambda_n^{-1})\,{\cal{IG}}(a_n, b_n)


\displaystyle   \Lambda_n = X_n^\top X_n  + \Lambda_0

\displaystyle   \begin{matrix}  \mu_n = \Lambda_n^{-1}({X_n}^{\top}{X_n}\hat{\boldsymbol{\beta}}_n + \Lambda_0\mu_0) &  \textrm{where} &  \hat{\boldsymbol\beta}_n = ({X}_n^{\rm T}{X}_n)^{-1}{X}_n^{\rm T}\boldsymbol{y}_n  \end{matrix}

\displaystyle   \begin{matrix}  a_n = \frac{n}{2} + a_0 & \quad &  b_n = b_0 +        \frac{1}{2}(\boldsymbol{y}^\top\boldsymbol{y} +                    \boldsymbol{\mu}_0^\top\Lambda_0\boldsymbol{\mu}_0 -                    \boldsymbol{\mu}_n^\top\Lambda_n\boldsymbol{\mu}_n)  \end{matrix}

Recursive Form

We can re-write the above recursively. We do not need to for this blog article but it will be required in any future blog article which uses Sequential Monte Carlo techniques.

\displaystyle   \Lambda_n = \boldsymbol{x}_n^\top \boldsymbol{x}_n  + \Lambda_{n-1}


\displaystyle   \Lambda_{n}\mu_{n} =  {X}_{n}^{\rm T}\boldsymbol{y}_{n} + \Lambda_0\mu_0 =  {X}_{n-1}^{\rm T}\boldsymbol{y}_{n-1} + \boldsymbol{x}_n^\top y_n + \Lambda_0\mu_0 =  \Lambda_{n-1}\mu_{n-1} + \boldsymbol{x}_n^\top y_n

so we can write

\displaystyle   \boldsymbol{\mu}_n = \Lambda_n^{-1}(\Lambda_{n-1}\mu_{n-1} + \boldsymbol{x}_n^\top y_n)


\displaystyle   \begin{matrix}  a_n = a_{n-1} + \frac{1}{2} & \quad &  b_n = b_{n-1} + \frac{1}{2}\big[(y_n - \boldsymbol{\mu}_n^\top \boldsymbol{x}_n)y_n + (\boldsymbol{\mu}_{n-1} - \boldsymbol{\mu}_{n})^\top \Lambda_{n-1}\boldsymbol{\mu}_{n-1}\big]  \end{matrix}
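As a tiny sanity check of the recursive form, consider the one-covariate case, where \Lambda is a scalar and X^\top X is just a sum of squares; the recursive update then reproduces the batch computation exactly (base Haskell only):

```haskell
-- Batch: Lambda_n = X^T X + Lambda_0, here a scalar sum of squares.
batchLambda :: Double -> [Double] -> Double
batchLambda lambda0 xs = lambda0 + sum (map (^ 2) xs)

-- Recursive: Lambda_n = Lambda_{n-1} + x_n^2, folded over the data.
recursiveLambda :: Double -> [Double] -> Double
recursiveLambda = foldl (\lam x -> lam + x ^ 2)

main :: IO ()
main = print (batchLambda 2 [1, 2, 3] == recursiveLambda 2 [1, 2, 3])
```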


In the case of our model we can specialise the non-recursive equations as

\displaystyle   \Lambda_n = \begin{bmatrix} 1 & 1 & \ldots & 1 \\                              x_1 & x_2 & \ldots & x_n              \end{bmatrix}              \begin{bmatrix} 1 & x_1 \\                              1 & x_2 \\                              \ldots & \ldots \\                              1 & x_n              \end{bmatrix}  + \Lambda_0

\displaystyle   \begin{matrix}  \mu_n = (\Lambda_n)^{-1}({X_n}^{\top}{X_n}\hat{\boldsymbol{\beta}}_n + \Lambda_0\mu_0) &  \textrm{where} &  \hat{\boldsymbol\beta}_n = ({X}_n^{\rm T}{X}_n)^{-1}{X}_n^{\rm T}\boldsymbol{x}_{1:n}  \end{matrix}

\displaystyle   \begin{matrix}  a_n = \frac{n}{2} + a_0 & \quad &  b_n = b_0 +        \frac{1}{2}(\boldsymbol{x}_{1:n}^\top\boldsymbol{x}_{1:n} +                    \boldsymbol{\mu}_0^\top\Lambda_0\boldsymbol{\mu}_0 -                    \boldsymbol{\mu}_n^\top\Lambda_n\boldsymbol{\mu}_n)  \end{matrix}

Let’s re-write the notation to fit our model.

\displaystyle   \Lambda_n = \begin{bmatrix} 1 & 1 & \ldots & 1 \\                              h_1 & h_2 & \ldots & h_n              \end{bmatrix}              \begin{bmatrix} 1 & h_1 \\                              1 & h_2 \\                              \ldots & \ldots \\                              1 & h_n              \end{bmatrix}  + \Lambda_0

\displaystyle   \begin{matrix}  \mu_n = (\Lambda_n)^{-1}({H_n}^{\top}{H_n}\hat{\boldsymbol{\theta}}_n + \Lambda_0\mu_0) &  \textrm{where} &  \hat{\boldsymbol\theta}_n = ({H}_n^{\rm T}{H}_n)^{-1}{H}_n^{\rm T}\boldsymbol{h}_{1:n}  \end{matrix}

\displaystyle   \begin{matrix}  a_n = \frac{n}{2} + a_0 & \quad &  b_n = b_0 +        \frac{1}{2}(\boldsymbol{h}_{1:n}^\top\boldsymbol{h}_{1:n} +                    \boldsymbol{\mu}_0^\top\Lambda_0\boldsymbol{\mu}_0 -                    \boldsymbol{\mu}_n^\top\Lambda_n\boldsymbol{\mu}_n)  \end{matrix}

Sample from {p \left( \boldsymbol{\theta}, \tau^2 \,\vert\, \boldsymbol{h}, \boldsymbol{y} \right)} \sim {\mathcal{NIG}}(\boldsymbol{\theta}_1, V_1, \nu_1, s_1^2)

We can implement this in Haskell as

> sampleParms ::
>   forall n m .
>   (KnownNat n, (1 <=? n) ~ 'True) =>
>   S.R n -> S.L n 2 -> S.R 2 -> S.Sq 2 -> Double -> Double ->
>   RVarT m (S.R 2, Double)
> sampleParms y bigX theta_0 bigLambda_0 a_0 s_02 = do
>   let n = natVal (Proxy :: Proxy n)
>       a_n = 0.5 * (a_0 + fromIntegral n)
>       bigLambda_n = bigLambda_0 + (tr bigX) <> bigX
>       invBigLambda_n = inv bigLambda_n
>       theta_n = invBigLambda_n #> ((tr bigX) #> y + (tr bigLambda_0) #> theta_0)
>       b_0 = 0.5 * a_0 * s_02
>       b_n = b_0 +
>             0.5 * (S.extract (S.row y <> S.col y)!0!0) +
>             0.5 * (S.extract (S.row theta_0 <> bigLambda_0 <> S.col theta_0)!0!0) -
>             0.5 * (S.extract (S.row theta_n <> bigLambda_n <> S.col theta_n)!0!0)
>   g <- rvarT (Gamma a_n (recip b_n))
>   let s2 = recip g
>       invBigLambda_n' = m <> invBigLambda_n
>         where
>           m = S.diag $ S.vector (replicate 2 s2)
>   m1 <- rvarT StdNormal
>   m2 <- rvarT StdNormal
>   let theta_n' :: S.R 2
>       theta_n' = theta_n + S.chol (S.sym invBigLambda_n') #> (S.vector [m1, m2])
>   return (theta_n', s2)

Marginal Distribution for State

Marginal for H_0

Using a standard result about conjugate priors and since we have

\displaystyle   h_0 \sim {\cal{N}}(m_0,C_0) \quad h_1 \vert h_0 \sim {\cal{N}}(\mu + \phi h_0, \tau^2)

we can deduce

\displaystyle   h_0 \vert h_1 \sim {\cal{N}}(m_1,C_1)


\displaystyle   \begin{aligned}  \frac{1}{C_1} &= \frac{1}{C_0} + \frac{\phi^2}{\tau^2} \\  \frac{m_1}{C_1} &= \frac{m_0}{C_0} + \frac{\phi(h_1 - \mu)}{\tau^2}  \end{aligned}

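The precision and mean updates above follow from completing the square in h_0: writing out the log-posterior and dropping terms not involving h_0,

\displaystyle   \log p(h_0 \mid h_1) = -\frac{(h_0 - m_0)^2}{2C_0} - \frac{(h_1 - \mu - \phi h_0)^2}{2\tau^2} + \text{const} = -\frac{h_0^2}{2}\Big(\frac{1}{C_0} + \frac{\phi^2}{\tau^2}\Big) + h_0\Big(\frac{m_0}{C_0} + \frac{\phi(h_1 - \mu)}{\tau^2}\Big) + \text{const}

and matching the quadratic and linear coefficients with those of a {\cal{N}}(m_1, C_1) log-density gives the expressions for 1/C_1 and m_1/C_1.
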
> sampleH0 :: Double ->
>             Double ->
>             V.Vector Double ->
>             Double ->
>             Double ->
>             Double ->
>             RVarT m Double
> sampleH0 iC0 iC0m0 hs mu phi tau2 = do
>   let var = recip $ (iC0 + phi^2 / tau2)
>       mean = var * (iC0m0 + phi * ((hs V.! 0) - mu) / tau2)
>   rvarT (Normal mean (sqrt var))

Marginal for H_1 \ldots H_n

From the state equation, we have

\displaystyle   \begin{aligned}  H_{t+1} &=  \mu + \phi H_{t} + \tau \eta_t \\  \phi^2 H_{t} &=  -\phi\mu + \phi H_{t+1} - \phi \tau \eta_t \\  \end{aligned}

We also have

\displaystyle   \begin{aligned}  H_{t} &=  \mu + \phi H_{t-1} + \tau \eta_{t-1} \\  \end{aligned}

Adding the two expressions together gives

\displaystyle   \begin{aligned}  (1 + \phi^2)H_{t} &= \phi (H_{t-1} + H_{t+1}) + \mu (1 - \phi) + \tau(\eta_{t-1} - \phi\eta_t) \\  H_{t} &= \frac{\phi}{1 + \phi^2} (H_{t-1} + H_{t+1}) + \mu \frac{1 - \phi}{1 + \phi^2} + \frac{\tau}{1 + \phi^2}(\eta_{t-1} - \phi\eta_t) \\  \end{aligned}

Since \{\eta_t\} are standard normal, then H_t conditional on H_{t-1} and H_{t+1} is normally distributed, and

\displaystyle   \begin{aligned}  \mathbb{E}(H_t\mid H_{t-1}, H_{t+1}) &= \frac{1 - \phi}{1+\phi^2}\mu +                                          \frac{\phi}{1+\phi^2}(H_{t-1} + H_{t+1}) \\  \mathbb{V}(H_t\mid H_{t-1}, H_{t+1}) &= \frac{\tau^2}{1+\phi^2}  \end{aligned}

We also have

\displaystyle   h_{n+1} \vert h_n \sim {\cal{N}}(\mu + \phi h_n, \tau^2)


\displaystyle   \boldsymbol{h}_{-t} = \begin{bmatrix}                        h_0, &                        h_1, &                        \ldots, &                        h_{t-1}, &                        h_{t+1}, &                        \ldots, &                        h_{n-1}, &                        h_n                        \end{bmatrix}

by Bayes’ Theorem we have

\displaystyle   {p \left( h_t \,\vert\, \boldsymbol{h}_{-t}, \theta, \boldsymbol{y} \right)} \propto  {p \left( y_t \,\vert\, h_t \right)} {p \left( h_t \,\vert\, \boldsymbol{h}_{-t}, \theta, y_{-t} \right)} =  f_{\cal{N}}(y_t;0,e^{h_t}) f_{\cal{N}}(h_t;\mu_t,\nu_t^2)

where f_{\cal{N}}(x;\mu,\sigma^2) is the probability density function of a normal distribution.

We can sample from this using Metropolis

  1. For each t, sample h_t^\flat from {\cal{N}}(h_t, \gamma^2) where \gamma^2 is the tuning variance.

  2. For each t=1, \ldots, n, compute the acceptance probability

\displaystyle   p_t = \min{\Bigg(\frac{f_{\cal{N}}(h^\flat_t;\mu_t,\nu_t^2) f_{\cal{N}}(y_t;0,e^{h^\flat_t})}{f_{\cal{N}}(h_t;\mu_t,\nu_t^2) f_{\cal{N}}(y_t;0,e^{h_t})}, 1 \Bigg)}

  3. For each t, compute a new value of h_t

\displaystyle   h^\sharp_t =  \begin{cases}  h^\flat_t & \text{with probability } p_t \\  h_t & \text{with probability } 1 - p_t  \end{cases}

> metropolis :: V.Vector Double ->
>               Double ->
>               Double ->
>               Double ->
>               Double ->
>               V.Vector Double ->
>               Double ->
>               RVarT m (V.Vector Double)
> metropolis ys mu phi tau2 h0 hs vh = do
>   let eta2s = V.replicate (n-1) (tau2 / (1 + phi^2)) `V.snoc` tau2
>       etas  = V.map sqrt eta2s
>       coef1 = (1 - phi) / (1 + phi^2) * mu
>       coef2 = phi / (1 + phi^2)
>       mu_n  = mu + phi * (hs V.! (n-2))
>       mu_1  = coef1 + coef2 * ((hs V.! 1) + h0)
>       innerMus = V.zipWith (\hp1 hm1 -> coef1 + coef2 * (hp1 + hm1)) (V.tail (V.tail hs)) hs
>       mus = mu_1 `V.cons` innerMus `V.snoc` mu_n
>   hs' <- V.mapM (\mu -> rvarT (Normal mu vh)) hs
>   let num1s = V.zipWith3 (\mu eta h -> logPdf (Normal mu eta) h) mus etas hs'
>       num2s = V.zipWith (\y h -> logPdf (Normal 0.0 (exp (0.5 * h))) y) ys hs'
>       nums  = V.zipWith (+) num1s num2s
>       den1s = V.zipWith3 (\mu eta h -> logPdf (Normal mu eta) h) mus etas hs
>       den2s = V.zipWith (\y h -> logPdf (Normal 0.0 (exp (0.5 * h))) y) ys hs
>       dens = V.zipWith (+) den1s den2s
>   us <- V.replicateM n (rvarT StdUniform)
>   let ls   = V.zipWith (\n d -> min 0.0 (n - d)) nums dens
>   return $ V.zipWith4 (\u l h h' -> if log u < l then h' else h) us ls hs hs'

Markov Chain Monte Carlo

Now we can write down a single step for our Gibbs sampler, sampling from each marginal in turn.

> singleStep :: Double -> V.Vector Double ->
>               (Double, Double, Double, Double, V.Vector Double) ->
>               StatsM (Double, Double, Double, Double, V.Vector Double)
> singleStep vh y (mu, phi, tau2, h0, h) = do
>   lift $ tell [((mu, phi),tau2)]
>   hNew <- metropolis y mu phi tau2 h0 h vh
>   h0New <- sampleH0 iC0 iC0m0 hNew mu phi tau2
>   let bigX' = (S.col $ S.vector $ replicate n 1.0)
>               |||
>               (S.col $ S.vector $ V.toList $ h0New `V.cons` V.init hNew)
>       bigX =  bigX' `asTypeOf` (snd $ valAndType nT)
>   newParms <- sampleParms (S.vector $ V.toList hNew) bigX (S.vector [mu0, phi0]) invBigV0 nu0 s02
>   return ( (S.extract (fst newParms))!0
>          , (S.extract (fst newParms))!1
>          , snd newParms
>          , h0New
>          , hNew
>          )


Let’s create some test data.

> mu', phi', tau2', tau' :: Double
> mu'   = -0.00645
> phi'  =  0.99
> tau2' =  0.15^2
> tau'  = sqrt tau2'

We need to create a statically typed matrix with one dimension the same size as the data so we tie the data size value to the required type.

> nT :: Proxy 500
> nT = Proxy
> valAndType :: KnownNat n => Proxy n -> (Int, S.L n 2)
> valAndType x = (fromIntegral $ natVal x, undefined)
> n :: Int
> n = fst $ valAndType nT

Arbitrarily let us start the process at

> h0 :: Double
> h0 = 0.0

We define the process as a stream (aka co-recursively) using the Haskell recursive do construct. It is not necessary to do this, but streams are a natural way to think of stochastic processes.

> hs, vols, sds, ys :: V.Vector Double
> hs = V.fromList $ take n $ fst $ runState hsAux (pureMT 1)
>   where
>     hsAux :: (MonadFix m, MonadRandom m) => m [Double]
>     hsAux = mdo { x0 <- sample (Normal (mu' + phi' * h0) tau')
>                 ; xs <- mapM (\x -> sample (Normal (mu' + phi' * x) tau')) (x0:xs)
>                 ; return xs
>                 }
> vols = V.map exp hs

We can plot the volatility (which we cannot observe directly).

And we can plot the log returns.

> sds = V.map sqrt vols
> ys = fst $ runState ysAux (pureMT 2)
>   where
>     ysAux = V.mapM (\sd -> sample (Normal 0.0 sd)) sds

We start with a vague prior for H_0

> m0, c0 :: Double
> m0 = 0.0
> c0 = 100.0

For convenience

> iC0, iC0m0 :: Double
> iC0 = recip c0
> iC0m0  = iC0 * m0

Rather than really sample from priors for \mu, \phi and \tau^2 let us cheat and assume we sampled the simulated values!

> mu0, phi0, tau20 :: Double
> mu0   = -0.00645
> phi0  =  0.99
> tau20 =  0.15^2

But let us also assume that we are still very uncertain about them

> bigV0, invBigV0 :: S.Sq 2
> bigV0 = S.diag $ S.fromList [100.0, 100.0]
> invBigV0 = inv bigV0
> nu0, s02 :: Double
> nu0    = 10.0
> s02    = (nu0 - 2) / nu0 * tau20

Note that for the inverse gamma this gives

> expectationTau2, varianceTau2 :: Double
> expectationTau2 = (nu0 * s02 / 2) / ((nu0 / 2) - 1)
> varianceTau2 = (nu0 * s02 / 2)^2 / (((nu0 / 2) - 1)^2 * ((nu0 / 2) - 2))
ghci> expectationTau2
2.25e-2

ghci> varianceTau2
1.6875e-4

Running the Markov Chain

Tuning parameter

> vh :: Double
> vh = 0.1

The burn-in and sample sizes may be too low for actual estimation but will suffice for a demonstration.

> bigM, bigM0 :: Int
> bigM0 = 2000
> bigM  = 2000
> multiStep :: StatsM (Double, Double, Double, Double, V.Vector Double)
> multiStep = iterateM_ (singleStep vh ys) (mu0, phi0, tau20, h0, vols)
> runMC :: [((Double, Double), Double)]
> runMC = take bigM $ drop bigM0 $
>         execWriter (evalStateT (sample multiStep) (pureMT 42))

And now we can look at the distributions of our estimates


Doucet, Arnaud, and Adam M Johansen. 2011. “A Tutorial on Particle Filtering and Smoothing: Fifteen Years Later.” In Handbook of Nonlinear Filtering. Oxford, UK: Oxford University Press.

Fridman, Moshe, and Lawrence Harris. 1998. “A Maximum Likelihood Approach for Non-Gaussian Stochastic Volatility Models.” Journal of Business & Economic Statistics 16 (3): 284–91.

Geweke, John. 1994. “Bayesian Comparison of Econometric Models.”

Jacquier, Eric, Nicholas G. Polson, and Peter E. Rossi. 1994. “Bayesian Analysis of Stochastic Volatility Models.”

Kim, Sangjoon, Neil Shephard, and Siddhartha Chib. 1998. “Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models.” Review of Economic Studies 65 (3): 361–93.

by Dominic Steinitz at March 11, 2015 10:06 AM

March 10, 2015

ARJANEN Loïc Jean David

What is RFC 707 ?

Welcome to the first blog post in this series, devoted to my project of implementing RFC 707 in Haskell.

Definition and description

RFC 707, whose formal name is “A High-Level Framework for Network-Based Resource Sharing” and which was published on the 14th of January 1976, is a primitive system of Remote Procedure Call.
It describes how a networked procedure call is to be made, the format each message (call and return) must abide by, and the binary encoding of each message. In essence, it is the ancestor of ONC RPC, CORBA, SOAP… However, it doesn’t describe the transport protocol used, the ports used, or any form of authentication… It has the disadvantage of being underspecified enough that two independent implementations are unlikely to understand each other, and the advantage of being very lightweight.

Why will I implement it ?

First of all, I’ll implement it because it is a simple binary RPC protocol, a stepping stone to my project of an ONC RPC client implementation. Another reason is that there’s, to my knowledge, no implementation of this protocol, so it gives me more freedom to handle unspecified parts of the protocol, such as how I should handle floating-point numbers. It also serves as a test of my capacities as a Haskell developer and as a spec reader. Finally, RPC systems are often the basis of distributed file systems, such as NFS or 9P.

Why Haskell ?

Because I appreciate and want to learn this language, and such a library seems like a good idea to me: after all, monads (and I/O in particular) are often cited as THE stumbling block for learning and mastering this language.

And after ?

The next blog post in this series will be devoted to the API my library will have and to the implementation choices I’ll make. After all, as I said above, this RPC method is woefully underspecified by its RFC.

After this project, I’ll tackle either the ONC RPC client implementation or the AWT curses implementation.

by Azel at March 10, 2015 11:40 PM

The GHC Team

GHC Weekly News - 2015/03/10

Hi *,

It's that time again! Today GHC HQ met on a Tuesday to avoid some scheduling conflicts, and that means it's time to send some news to people.

Just a quick reminder from last week: we're hoping to make our third GHC 7.10.1 release candidate on Friday, March 13th, with the final release on Friday, March 27th.

  • Today, GHC HQ mostly talked about 7.10 bugs; HEAD is steaming along as usual with no impediments, but we've got several critical bugs we plan on landing fixes for this week; see milestone:7.10.1 for more.

But we've also had a little more list activity this week than we did before:

  • Karel Gardas wrote to the list about how to better parallelize the GHC build. For background, he's using an UltraSPARC T1-CPU/T2000 server with 32 hardware threads over 8 cores, where parallelism is a bigger win than raw single-threaded performance. But despite that, the performance of the parallel GHC build is relatively abysmal - and Karel wants help brainstorming ideas to fix it.

Some noteworthy commits that went into ghc.git in the past week include:

Closed tickets the past week include: #7854, #10118, #10119, #3321, #10132, #9987, #10126, #9707, #10142, #10147, #10113, #9524, #10058, #10100, #2991, #10140, and #9122.

by thoughtpolice at March 10, 2015 06:22 PM

Yesod Web Framework

Designing APIs for Extensibility

This is a snapshot of some content I wrote for the Commercial Haskell group's Haskell Documentation project. The official home for this content is in that repo. As I often do, I'm copying the content into the blog post itself for ease of viewing.

Every time you make a breaking change in your API, it means that, potentially, one or more of your users will need to change their code to adapt. Even if this update is trivial, it adds friction to the code maintenance process. On the other hand, we don't want to be constrained by bad design choices early on in a project, and sometimes a breaking API change is the best option.

The point of this document, however, is to give you a third option: design your APIs from the outset to be extensible. There are common techniques employed in the Haskell world to make APIs that are resilient to changing feature-sets, and by employing them early on in your design process, you can hopefully avoid the painful choices between a better API and happy users.

Almost all techniques start with implementation hiding. Guidelines here are simple: don't expose anything non-public. For example, if you write a number of helper functions, you may not want to start off by exposing them, since you're then telling users that these are good, stable functions to be relied upon. Instead, use explicit export lists on your modules and only include functions that are intended for public consumption.

More important- and more tricky- than functions are data constructors. In many cases, you want to avoid exposing the internals of your data types to users, to allow you to expand on them in the future. A common use case for this is some kind of a data type providing configuration information. Consider that you're going to communicate with some web services, so you write up the following API:

module MyAPI
    ( Settings (..)
    , makeAPICall
    ) where

data Settings = Settings
    { apiKey :: Text
    , hostName :: Text
    }

makeAPICall :: Settings -> Foo -> IO Bar

The way your users will access this will be something like:

makeAPICall Settings
    { apiKey = myAPIKey
    , hostName = ""
    } myFoo

Now suppose a user points out that, in some cases, the standard port 80 is not used for the API call. So you add a new field port :: Int to your Settings constructor. This will break your user's code, since the port field will not be set.

Instead, a more robust way of specifying your API will look like:

module MyAPI
    ( Settings
    , mkSettings
    , setHostName
    , makeAPICall
    ) where

data Settings = Settings
    { apiKey :: Text
    , hostName :: Text
    }

-- | Create a @Settings@ value. Uses default value for host name.
mkSettings :: Text -- ^ API Key
           -> Settings
mkSettings key = Settings
    { apiKey = key
    , hostName = ""
    }

setHostName :: Text -> Settings -> Settings
setHostName hn s = s { hostName = hn }

makeAPICall :: Settings -> Foo -> IO Bar

Now your user code will instead look like:

makeAPICall (mkSettings myAPIKey) myFoo

This has the following benefits:

  • The user does not have to fill in default values (in this case, the hostname).
  • Extending this API to allow for more fields in the future is trivial: add a new set* function. Internally, you'll add a field to Settings and set a default value in mkSettings.

One thing to note: please do not expose the field accessors directly. If you want to provide getter functions in addition to setters, write them explicitly, e.g.:

getHostName :: Settings -> Text
getHostName = hostName

The reason for this is that by exposing field accessors, users will be able to write code such as:

(mkSettings myAPIKey) { hostName = "" }

This ties your hand for future internal improvements, since you are now required to keep a field of name hostName with type Text. By just using set and get functions, you can change your internal representation significantly and still provide a compatibility layer.

For those of you familiar with other languages: this is in fact quite similar to the approach taken in Java or C#. Just because Java does it doesn't mean it's wrong.

Note that this advice is different to, and intended to supersede, the settings type approach. Projects like Warp which previously used that settings type approach are currently migrating to this more extensible approach.

Also, while settings have been used here as a motivating example, the same advice applies to other contexts.

Internal modules

One downside of implementation hiding is that it can make it difficult for users to do things you didn't intend for them to do with your API. You can always add more functionality on demand, but the delay can be a major nuisance for users. A compromise solution in the Haskell community is to provide a .Internal module for your project which exports not-quite-public components. For example, in wai, the Response constructors are exposed in a Network.Wai.Internal module. Normally, users are supposed to use smart constructors like responseFile, but occasionally they'll want more fine-grained control.

March 10, 2015 09:17 AM

March 09, 2015

Neil Mitchell

Implementing a Functor instance

Summary: Implementing a Functor instance is much easier than implementing a Monad instance, and can turn out to be quite useful.

Haskell forces all programmers to understand some details of the Monad typeclass to do basic IO, but currently nothing forces people to learn the Functor typeclass. However, Functor is much simpler than Monad, and all Monads must be Functors, so thinking more about Functor can be a nice route to understanding Monad better.

An intuitive description of a functor is:

A container whose contents can be replaced, without changing the shape of the container.

Some example functors include lists and Maybe. Both contain values, and you can replace the values inside them. In fact, most types with a single type parameter can be made functors. For example, in CmdArgs I define something similar to:

data Group a = Group {groupUnnamed :: [a], groupNamed :: [(String, [a])]}

This Group structure contains a values inside it. Sometimes it is useful to transform all the underlying a values, perhaps to a different type. The Functor instance has a single member:

fmap :: Functor f => (a -> b) -> f a -> f b

For the above type, we instantiate f to Group so we get:

fmap :: (a -> b) -> Group a -> Group b

We can implement fmap by applying f to every a value inside Group:

instance Functor Group where
  fmap f (Group a b) = Group (map f a) [(x, map f y) | (x,y) <- b]

Note in particular that Group is usually written Group a, but in the instance declaration we're omitting the a, to say Group itself (without any arguments) is a functor. Providing insufficient type arguments like that makes Functor a higher-kinded type class, in contrast to those like Eq or Ord which would have been on Group a.
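To make the instance concrete, here is a small self-contained usage sketch (the values are made up; the record syntax mirrors the CmdArgs-like definition above):

```haskell
data Group a = Group { groupUnnamed :: [a], groupNamed :: [(String, [a])] }
  deriving (Eq, Show)

instance Functor Group where
  fmap f (Group a b) = Group (map f a) [(x, map f y) | (x, y) <- b]

-- Replace every Int with its String rendering; the shape
-- (list lengths, group names) is untouched.
example :: Group String
example = fmap show (Group [1, 2 :: Int] [("flags", [3, 4])])
-- Group {groupUnnamed = ["1","2"], groupNamed = [("flags",["3","4"])]}
```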

When implementing fmap the type checker eliminates most bad implementations, so the only law you need to think about is that fmap id = id - given the identity function, the value shouldn't change. We can show this law for Group with:

Group a b = fmap id (Group a b)
-- inline fmap
Group a b = Group (map id a) [(x, map id y) | (x,y) <- b]
-- map id x ==> x
Group a b = Group a [(x, y) | (x,y) <- b]
-- simplify list comprehension
Group a b = Group a b
-- equal

In fact, the function map is just fmap specialised to [], so the rule map id x ==> x is just applying the fmap id = id law on lists. From this law, we can derive the additional law that:

fmap (f . g)  ==  fmap f . fmap g

Both these laws can serve as the basis for optimisation opportunities, reducing the number of times we traverse a value, and GHC exploits these laws for the list type.
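GHC's actual list optimisations go through build/foldr fusion, but the spirit of exploiting the composition law can be sketched with a rewrite rule (this simplified rule is illustrative, not the one base uses):

```haskell
{-# RULES
"map/map/sketch"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}

-- With the rule firing (under -O), this traverses the list once
-- instead of twice; the result is the same either way.
fused :: [Int]
fused = map (+ 1) (map (* 2) [1, 2, 3])  -- [3,5,7]
```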

In general, most data types that take a type parameter can be made functors, but there are a few common exceptions:

  • You have a value on the left of an arrow – for example data Foo a = Foo (a -> Int) cannot be made a functor, since we have no way to change the incoming b back to an a.
  • You have an invariant relating the structure and the elements. For example data OrdList a = Nil | Gt a (OrdList a), where all functions on OrdList have an Ord context, and OrdList is exported abstractly. Here the functor would break the abstraction.
  • You require an instance for the element type, e.g. Data.Vector.Storable requires a Storable instance to create a vector, which Functor does not allow.

The name functor may sound scary, or confusing to C++ programmers (who accidentally say functor to mean function) – but they are a nice simple abstraction.

by Neil Mitchell ([email protected]) at March 09, 2015 09:57 PM

March 08, 2015

Ian Ross

Constraint kinds and associated types

Constraint kinds and associated types

This is going to be the oldest of old hat for the cool Haskell kids who invent existential higher-kinded polymorphic whatsits before breakfast, but it amused me, and it’s the first time I’ve used some of these more interesting language extensions for something “real”.


I have a Haskell library called hnetcdf for reading and writing NetCDF files. NetCDF is a format for gridded data that’s very widely used in climate science, meteorology and oceanography. A NetCDF file contains a number of gridded data sets, along with associated information describing the coordinate axes for the data. For example, in a climate application, you might have air temperature or humidity on a latitude/longitude/height grid.

So far, so simple. There are C and Fortran libraries for reading and writing NetCDF files and the interfaces are pretty straightforward. Writing a basic Haskell binding for this stuff isn’t very complicated, but one thing is a little tricky, which is the choice of Haskell type to represent the gridded data.

In Haskell, we have a number of different array abstractions that are in common use – you can think of flattening your array data into a vector, using a Repa array, using a hmatrix matrix, or a number of other possibilities. I wanted to support a sort of “store polymorphism” over these different options, so you’d be able to use the same approach to read data directly into a Repa array or a hmatrix matrix.

NcStore: first try

To do this, I wrote an NcStore class, whose first version looked something like this:

class NcStore s where
  toForeignPtr :: Storable e => s e -> ForeignPtr e
  fromForeignPtr :: Storable e => ForeignPtr e -> [Int] -> s e
  smap :: (Storable a, Storable b) => (a -> b) -> s a -> s b

It’s basically just a way of getting data in and out of a “store”, in the form of a foreign pointer that can be used to pass data to the NetCDF C functions, plus a mapping method. This thing can’t be a functor because of the Storable constraints on the types to be stored (which we need so that we can pass these things to C functions).

That works fine for vectors from Data.Vector.Storable:

instance NcStore Vector where
  toForeignPtr = fst . unsafeToForeignPtr0
  fromForeignPtr p s = unsafeFromForeignPtr0 p (Prelude.product s)
  smap = map

and for Repa foreign arrays:

import Data.Array.Repa
import qualified Data.Array.Repa as R
import qualified Data.Array.Repa.Repr.ForeignPtr as RF
import Data.Array.Repa.Repr.ForeignPtr (F)

instance Shape sh => NcStore (Array F sh) where
  toForeignPtr = RF.toForeignPtr
  fromForeignPtr p s = RF.fromForeignPtr (shapeOfList $ reverse s) p
  smap f s = computeS $ R.map f s

Additional element type constraints

However, there’s a problem if we try to write an instance of NcStore for hmatrix matrices. Most hmatrix functions require that the values stored in a hmatrix matrix are instances of the hmatrix Element class. While it’s completely trivial to make types instances of this class (you just write instance Element Blah and you’re good), you still need to propagate the Element constraint through your code. In particular, I needed to use the hmatrix flatten function to turn a matrix into a vector of values in row-major order for passing to the NetCDF C API. The flatten function has type signature

flatten :: Element t => Matrix t -> Vector t

so that Element constraint somehow has to get into NcStore, but only for cases when the “store” is a hmatrix matrix.

Constraint kinds to the rescue

At this point, all the real Haskell programmers are asking what the big deal is. You just switch on the ConstraintKinds and TypeFamilies extensions and rewrite NcStore like this:

class NcStore s where
  type NcStoreExtraCon s a :: Constraint
  type NcStoreExtraCon s a = ()
  toForeignPtr :: (Storable e, NcStoreExtraCon s e) =>
                  s e -> ForeignPtr e
  fromForeignPtr :: (Storable e, NcStoreExtraCon s e) =>
                    ForeignPtr e -> [Int] -> s e
  smap :: (Storable a, Storable b, NcStoreExtraCon s a, NcStoreExtraCon s b) =>
          (a -> b) -> s a -> s b

Here, I’ve added an associated type called NcStoreExtraCon s a, which is a constraint, I’ve given a default for this (of (), which is a “do nothing” empty constraint), and I’ve added the relevant constraint to each of the methods of NcStore. The NcStore instances for storable Vectors and Repa arrays look the same as before, but the instance for hmatrix matrices now looks like this:

instance NcStore HMatrix where
  type NcStoreExtraCon HMatrix a = C.Element a
  toForeignPtr (HMatrix m) = fst3 $ unsafeToForeignPtr $ C.flatten m
  fromForeignPtr p s =
    let c = last s
        d = product s
    in HMatrix $ matrixFromVector RowMajor (d `div` c) (last s) $
       unsafeFromForeignPtr p 0 (Prelude.product s)
  smap f (HMatrix m) = HMatrix $ C.mapMatrix f m

I’ve just added the Element constraint on the type of values contained in the “store” to the instance, and I can then use any hmatrix function that requires this constraint without any trouble: you can see the use of flatten in the toForeignPtr method definition.


The problem I had here is really just an instance of what’s come to be called the “restricted monad” problem. This is where you have a type class, possibly with constraints, and you want to write instances of the class where you impose additional constraints. The classic case is making Set a monad: Set requires its elements to be members of Ord, but Monad is fully polymorphic, and so there’s no way to write an Ord-constrained Monad instance for Set.

There’s even a package on Hackage called rmonad that uses just this “constraint kinds + associated types” approach to allow you to write “restricted monads” of this kind. So this appears to be a well-known method, but it was fun to rediscover it. The ability to combine these two language extensions in this (to me) quite unexpected way is really rather satisfying!
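To see the same trick in a minimal, self-contained setting, here is a "restricted functor" class in the style of NcStore, with Data.Set as the instance needing an extra Ord constraint (the names RFunctor, ElemCon and rmap are mine, not from hnetcdf or rmonad):

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Constraint)
import qualified Data.Set as Set

class RFunctor f where
  type ElemCon f a :: Constraint
  type ElemCon f a = ()             -- default: no extra constraint
  rmap :: (ElemCon f a, ElemCon f b) => (a -> b) -> f a -> f b

-- Lists are happy with the empty default constraint.
instance RFunctor [] where
  rmap = map

-- Set.map needs Ord on the element types, which plain Functor
-- has no way to demand; the associated constraint supplies it.
instance RFunctor Set.Set where
  type ElemCon Set.Set a = Ord a
  rmap = Set.map
```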


March 08, 2015 06:32 PM

Philip Wadler

March 07, 2015

Philip Wadler

A paean to the Western General

I resided in Ward 42 of the Regional Infectious Disease Unit of Edinburgh's Western General Hospital from 17 Dec—2 Jan. For most of the time I received antibiotic by drip or injection every four hours, day and night, and I thought my stay would be six weeks. Fortunately, my infection reacted well to antibiotic and I could be released early for outpatient treatment.

The building is ugly, but the people inside it are beautiful. I cannot recall another time or place where everyone was so unfailingly helpful and friendly. On the first day, a nurse found a staff member willing to lend me a charger when my phone ran down. The doctors took as much time as needed to answer my questions. The nurses were cheerful despite constant interruptions. The men who brought the coffee remembered that I liked it with milk, and one often asked after my twins (he had twins as well). No one objected when my daughter brought me a poster of the Justice League, which I blue-tacked to my wall; several admired it.

Most often, I interact with institutions where the people who help you are, clearly, in it for the pay. They are nice only to the extent required by their job, and often less than that. Part of the difference in attitude here must be that the people know they are actively helping patients in need. I take my hat off to an institution that inculcates a caring attitude in everyone.

(Picture above courtesy of The Edinburgh Sketcher, whose pictures adorn a couple of corridors in the Western General. The RIDU is a different building, though somehow every building in the complex contrives to be equally unattractive.)

Related: Status report, Status report 2.

by Philip Wadler ([email protected]) at March 07, 2015 05:49 PM

Danny Gratzer

Worlds in Twelf

Posted on March 7, 2015
Tags: twelf, types

In this post I wanted to focus on one particular thing in Twelf: %worlds declarations. They seem to be the most mysterious. I’ve had a couple of people tell me that they just blindly stick %worlds () (x _ _ _) before every %total and pray, which is a little concerning.

In this post hopefully we’ll remove some of the “compile-n-pray” from using Twelf code.

What is %worlds

In Twelf we’re interested in proving theorems. These theorems are basically proven by some chunk of code that looks like this.

    my-cool-tyfam : with -> some -> cool -> args -> type.
    %mode my-cool-tyfam +A +B +C -D.

    some         : ... -> my-cool-tyfam A B C D.
    constructors : ... -> my-cool-tyfam A B C D.

    %worlds (...) (my-cool-tyfam _ _ _ _).
    %total (T) (my-cool-tyfam T _ _ _).

What’s interesting here is the 3 directives we needed

  • %mode to specify which arguments of the type family are universally quantified and which are existentially quantified in our theorem. This specifies the “direction” of the type family: + arguments are inputs and - arguments are outputs.
  • %total which actually goes and proves the theorem by induction on the canonical forms of the term in the parens.
  • %worlds which specifies the set of contexts to check the totality in. Note that a world is simply a set of contexts.

The one we’re interested in talking about here is %worlds. Everything we want to call %total on has to have one of these and, as mentioned above, it specifies the contexts to check the theorem in. Remember that %total is proven by induction over the canonical forms. One of the canonical forms for every type is of the form

For some x : ty ∈ Γ, then x is a canonical form of ty.

This is a little different than in other languages. We could usually just invert upon something in the context. That’s not the case in Twelf: we have to handle variables parametrically (this is critical to admitting HOAS and similar). This means we have to be extremely careful about what’s in Γ, lest we accidentally introduce a canonical form of ty without any additional information about it. The worlds specification tells us about the forms Γ can take. Twelf allows us to specify sets of contexts that are “regular”.

So for example remember how plus might be defined.

    plus : nat -> nat -> nat -> type.
    %mode plus +N +M -P.

    plus/z : plus z N N.
    plus/s : plus N M P -> plus (s N) M (s P).

This is total in the empty context. If we added some b : nat to our context then we have no way of showing it is either an s or a z! This means that there’s a missing case for variables of type nat in our code. In order to exclude this impossible case we just assert that we only care about plus’s totality in the empty context. This is what the %worlds specification for plus stipulates

    %worlds () (plus _ _ _).

should be read as “plus should only be considered in the empty context” so the only canonical forms of plus are those specified as constants in our signature. This sort of specification is what we want for most vanilla uses of Twelf.

For most cases we want to be proving theorems in the empty context because we do nothing to extend the context in our constructors. That’s not to say that we can’t specify some nonempty world. We can specify a world where there is a b : nat, but stipulate that wherever such a b appears it is accompanied by a derivation {a} plus b a z. This way, when Twelf goes to check the canonical-forms case for something in our context, b : nat, it knows that there’s a derivation that precisely matches what we need. I’ll circle back to this in a second, but first we have to talk about how to specify fancier worlds.

%block and Fancier Worlds

In Twelf there’s some special syntax for specifying worlds. Basically we can specify a template for some part of the world, called a block. A world declaration is just a conglomeration of blocks and Twelf will interpret this as a world of contexts in which each block may appear zero or more times.

In Twelf code we specify a block with the following syntax

    %block block_name : block {a : ty} ... {b : ty'}.

This specifies that if there is an a : ty in the context, it’s going to be accompanied by a bunch of other stuff including a b : ty'. Some blocks are pretty trivial. For example, if we wanted to allow plus to be defined in a context with some b : nat in it, we might say

    %block random_nat : block {b : nat}.
    %worlds (random_nat) (plus _ _ _).

This doesn’t work though. If we ask Twelf to check totality it’ll get angry and say

Coverage error --- missing cases:
{#random_nat:{b:nat}} {X1:nat} {X2:nat} |- plus #random_nat_b X1 X2.

In human,

You awful person Danny! You’re missing the case where you have two random natural numbers X1 and X2 and the random natural number b from the random_nat block, and we want to compute plus b X1 X2.

Now there are a few things to do here. The saner person would probably just say “Oh, I clearly don’t want to try to prove this theorem in a nonempty context”. Or we can wildly add things to our context in order to patch this hole. In this case, we need some proof about adding b to other stuff. Let’s supplement our block

    %block random_nat : block {b : nat}{_:{a} plus b a z}

Such a context is pretty idiotic though since there isn’t a natural number that can satisfy it. It is however enough to sate the totality checker.

    %total (T) (plus T _ _).

For a non-contrived example, let’s discuss where interesting worlds come into play: with higher-order abstract syntax. When we use HOAS we end up embedding the LF function space in our terms. This is important because it means that as we go to prove theorems about it we end up recursing on a term under an honest-to-goodness LF lambda. This means we extend the context at some points in our proof and we can’t just prove theorems in the empty context!

To see this in action here’s an embedding of the untyped lambda calculus in LF

    term : type.
    lam  : (term -> term) -> term.
    app  : term -> term -> term.

Now let’s say we want to determine how many binders are in a lambda term. We start by defining our relation

    nbinds : term -> nat -> type.
    %mode nbinds +T -N.

We set this type family up so that it has one input (the term) and one output (a nat representing the number of binders). We have two cases to deal with here

    nbinds/lam : nbinds (lam F) (s N)
                  <- ({x : term} nbinds (F x) N).
    nbinds/app : nbinds (app F A) O
                  <- nbinds F N1
                  <- nbinds A N2
                  <- plus N1 N2 O.

In the lam case we recurse under the binder. This is the interesting thing here: we stick the recursive call under a pi binder. This gives us access to some term x to which we apply the LF function. This code in effect says “if F x has N binders for every term x, then lam F has s N binders”. The app case just sums the binder counts of the two subterms with plus.

We can try to world check this in only the empty context, but this fails with

While checking constant nbinds/lam:
World violation for family nbinds: {x:term} </: 1

This says that even though we promised never to extend the LF context we did just that! To fix this we must have a fancier world. We create a block which just talks about adding a term to the context.

    %block nbinds_block : block {x : term}.
    %worlds (nbinds_block) (nbinds _ _).

This world checks but there’s another issue lurking about. Let’s try to ask Twelf to prove totality.

    %total (T) (nbinds T _).

This spits out the error message

Coverage error --- missing cases:
{#nbinds_block:{x:term}} {X1:nat} |- nbinds #nbinds_block_x X1.

This is the same error as before! Now that we’ve extended our context with a term we need to somehow be able to tell Twelf the binder count of that term. This smacks of the slightly fishy type of nbinds/lam: its meaning is that F x has N binders for any term x. This seems a little odd: why doesn’t the count for a function’s body depend on its argument? We really ought to be specifying that whatever this x is, we know its count is z. This makes our new code

    nbinds/lam : nbinds (lam F) (s N)
                 <- ({x : term}{_ : nbinds x z} nbinds (F x) N).

Now we specify that the height of x is zero. This means we have to change our block to

    %block nbinds_block : block {x : term}{_ : nbinds x z}.

With this modification everything else goes through unchanged. For fun, we can ask Twelf to actually compute some toy examples.

    %solve deriv : nbinds (lam ([x] (lam [y] x))) N.

This gives back that deriv : nbinds (lam ([x] lam ([y] x))) (s (s z)) as we’d hope. It’s always fun to run our proofs.
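
The same recipe can be sketched in Haskell using parametric HOAS. This is a hypothetical analogue for intuition, not part of the Twelf development: the `Var` case plays the role of the fresh x whose binder count the block pins to z.

```haskell
-- Parametric HOAS: the variable case carries whatever we instantiate v with.
data Term v = Var v | Lam (v -> Term v) | App (Term v) (Term v)

-- Count binders. A variable contributes zero, mirroring the
-- {_ : nbinds x z} assumption we added to the Twelf block.
nbinds :: Term () -> Int
nbinds (Var _)   = 0
nbinds (Lam f)   = 1 + nbinds (f ())
nbinds (App a b) = nbinds a + nbinds b
```

For example, `nbinds (Lam (\x -> Lam (\_ -> Var x)))` evaluates to 2, matching the `s (s z)` Twelf computes above.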


Hopefully that clears up some of the mystery of worlds in Twelf. Happily this doesn’t come up for a lot of simple uses of Twelf. As far as I know the entire constructive logic course at CMU sidesteps the issue with a quick “Stick %worlds () (...) before each totality check”.

It is completely invaluable if you’re doing anything under binders, which turns out to be necessary for most interesting proofs about languages with binders. If nothing else, the more you know…

Those who enjoyed this post might profit from Dan Licata and Bob Harper’s paper on mechanizing metatheory.



March 07, 2015 12:00 AM

March 06, 2015

Philip Wadler

An Open Letter to John Swinney

Dear John Swinney,

I sat in the gallery yesterday to watch the debate on Privacy and the State. As a member of the Open Rights Group, I have a keen interest in the subject.

The Identity Management and Privacy Principles, published in October 2014 with your name on the foreword, states in Section 4.6:
If a public service organisation needs to link personal information from different systems and databases (internally or between organisations), it should avoid sharing persistent identifiers; other mechanisms, such as matching, should be considered.
Willie Rennie and Patrick Harvie drew attention to the proposal that stakeholders should share the Unique Citizen Reference Number (UCRN). Using the UCRN as a key would link multiple databases, in effect forming a super-database, which is what concerns those who object to the plan. In your closing speech, Harvie intervened to ask you to acknowledge that the proposal breaches the Scottish Government's own guidelines. You responded:
I do not believe that the proposal breaches the data privacy principles that we set out.
Time was short, so you said no more, but I invite you now to elaborate on your bald denial. The proposal certainly appears to contravene the principle advocated by your own document---how can you claim it does not?

It is not only your own guidelines which question your plan. The British Medical Association and the Royal College of General Practitioners have raised concerns about sharing information collected by the NHS with HMRC, and the UK Information Commissioner has warned that the proposal may breach European and British data protection laws. Both these criticisms were raised during the debate, but ignored by you---I invite you to answer them as well.

There are ways forward that meet the needs of government and citizens that are far less problematic. I welcome your offer to review the proposals, and I hope that in light of these criticisms the Scottish Government will rethink its plans.


Philip Wadler
Professor of Theoretical Computer Science
School of Informatics
University of Edinburgh


Minutes of the debate on Privacy and the State

Identity Management and Privacy Principles

Report of the debate from BBC News

by Philip Wadler ([email protected]) at March 06, 2015 06:14 PM

March 05, 2015

Brandon Simmons

Benchmarking very fast things with criterion

There’s a pervasive myth that Bryan O’Sullivan’s excellent Haskell benchmarking library criterion is only useful for benchmarks that take some significant chunk of time (I’ve even heard some people claim it’s only trustworthy down to the millisecond scale). In fact criterion is useful for almost anything you’d want to benchmark.

At a high level, criterion makes your benchmark the inner loop of a function, runs that loop a bunch of times, measures the total, and then divides by the number of iterations performed. The approach is useful both for comparing alternative implementations and as probably the only meaningful way of answering “how long does this code take to run”, short of looking at the assembly, counting the instructions, and consulting your processor’s manual.
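
To make the loop-and-divide idea concrete, here is a crude sketch (my illustration, not criterion’s actual implementation — criterion additionally fits a regression over many different iteration counts and analyses outliers; the name nsPerIter is made up):

```haskell
import Control.Exception (evaluate)
import Control.Monad (replicateM_)
import System.CPUTime (getCPUTime)

-- Run an action n times, measure the total CPU time, divide by n.
nsPerIter :: Int -> IO a -> IO Double
nsPerIter n act = do
  start <- getCPUTime                -- CPU time in picoseconds
  replicateM_ n (act >>= evaluate)   -- force each result to WHNF
  end   <- getCPUTime
  return (fromIntegral (end - start) / 1e3 / fromIntegral n)
```

Dividing by the iteration count is what lets a measurement resolve far below the clock’s own granularity.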

If you’re skeptical, here’s a benchmark we’d expect to be very fast:

import Criterion.Main

main :: IO ()
main = do
    defaultMain [
        bench "sum2" $ nf sum [1::Int,2]
      , bench "sum4" $ nf sum [1::Int,2,3,4]
      , bench "sum5" $ nf sum [1::Int,2,3,4,5]
      ]

And indeed it’s on the order of nanoseconds:

benchmarking sum2
time                 27.20 ns   (27.10 ns .. 27.35 ns)
                     0.994 R²   (0.984 R² .. 1.000 R²)
mean                 28.72 ns   (27.29 ns .. 32.44 ns)
std dev              6.730 ns   (853.1 ps .. 11.71 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking sum4
time                 58.45 ns   (58.31 ns .. 58.59 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 58.47 ns   (58.26 ns .. 58.66 ns)
std dev              654.6 ps   (547.1 ps .. 787.8 ps)
variance introduced by outliers: 11% (moderately inflated)

benchmarking sum5
time                 67.08 ns   (66.84 ns .. 67.33 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 67.04 ns   (66.85 ns .. 67.26 ns)
std dev              705.5 ps   (596.3 ps .. 903.5 ps)

The results are consistent with each other; sum seems to be linear, taking roughly 13–15ns per list element, across our different input sizes.
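
The linearity claim is quick arithmetic on the reported times:

```haskell
-- nanoseconds per list element, from the three runs above
perElem :: [Double]
perElem = [27.20 / 2, 58.45 / 4, 67.08 / 5]
-- all three land between 13 and 15 ns per element
```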

Trying to measure even faster things

This is what I was doing today which motivated this post. I was experimenting with measuring the inner loop of a hash function:

fnvInnerLoopTest :: Word8 -> Word32
{-# INLINE fnvInnerLoopTest #-}
fnvInnerLoopTest b = (2166136261 `xor` fromIntegral b) * 16777619
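
For context, this is one round of 32-bit FNV-1a applied at the offset basis. A full version of the hash (my reconstruction for illustration, not code from the post) just chains that step over the input bytes:

```haskell
import Data.Bits (xor)
import Data.Word (Word8, Word32)

-- 32-bit FNV-1a: fold the xor-then-multiply step over the input,
-- starting from the offset basis 2166136261.
fnv1a32 :: [Word8] -> Word32
fnv1a32 = foldl step 2166136261
  where step h b = (h `xor` fromIntegral b) * 16777619
```

Note that `fnvInnerLoopTest b` computes exactly `fnv1a32 [b]`; the benchmark is timing a single round of this fold.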

These were the results criterion gave me:

benchmarking test
time                 9.791 ns   (9.754 ns .. 9.827 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 9.798 ns   (9.759 ns .. 9.862 ns)
std dev              167.3 ps   (117.0 ps .. 275.3 ps)
variance introduced by outliers: 24% (moderately inflated)

At these timescales we may well be measuring the overhead of function calls, boxing/unboxing, etc., which should make you skeptical of criterion’s result. So I unrolled 4- and 8-iteration versions of the loop and measured the results:

main :: IO ()
main = do
    defaultMain [
        bench "test"  $ nf fnvInnerLoopTest   7
      , bench "test4" $ nf fnvInnerLoopTest4 (7,8,9,10)
      , bench "test8" $ nf fnvInnerLoopTest8 (7,8,9,10,11,12,13,14)
      ]

benchmarking test
time                 9.380 ns   (9.346 ns .. 9.418 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 9.448 ns   (9.399 ns .. 9.567 ns)
std dev              240.4 ps   (137.9 ps .. 418.6 ps)
variance introduced by outliers: 42% (moderately inflated)

benchmarking test4
time                 12.66 ns   (12.62 ns .. 12.72 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 12.68 ns   (12.64 ns .. 12.73 ns)
std dev              158.8 ps   (126.9 ps .. 215.7 ps)
variance introduced by outliers: 15% (moderately inflated)

benchmarking test8
time                 17.88 ns   (17.82 ns .. 17.94 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 17.89 ns   (17.81 ns .. 17.97 ns)
std dev              262.7 ps   (210.3 ps .. 349.7 ps)
variance introduced by outliers: 19% (moderately inflated)

So this seems to give a clearer picture of how good our bit twiddling is in that inner loop. I was curious whether I could measure the overhead directly in criterion though. Somewhat surprisingly to me, it seems I could!

I added the following benchmark to my list:

  , bench "baseline32" $ nf (\x-> x) (777::Word32)

The idea being to isolate the overhead of applying the most trivial function and calling nf on an example value of our output type (Word32 in this case).

benchmarking baseline32
time                 9.485 ns   (9.434 ns .. 9.543 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 9.509 ns   (9.469 ns .. 9.559 ns)
std dev              155.8 ps   (122.6 ps .. 227.8 ps)
variance introduced by outliers: 23% (moderately inflated)

If we consider this value the baseline for the measurements initially reported, the new results are linear-ish, as we would expect, and the resulting absolute measurements fall about where we’d hope from the assembly (I still need to verify that this is actually the case); e.g. our initial test lands in the ~1ns range, about what we’d expect from an inner loop with a couple of instructions.
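
Plugging in the numbers (copied from the criterion runs above), the baseline-subtracted cost per hash step comes out as:

```haskell
-- baseline and totals in nanoseconds, from the criterion output above
baseline :: Double
baseline = 9.485

-- (total - baseline) / steps for the 1-, 4- and 8-step versions
perStep :: [Double]
perStep = [ (t - baseline) / n
          | (t, n) <- [(9.380, 1), (12.66, 4), (17.88, 8)] ]
-- roughly [-0.1, 0.8, 1.0]: a single step is lost in the noise,
-- while the unrolled versions settle around 1 ns per step
```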

I thought this was compelling enough to open an issue to see whether this technique might be incorporated into criterion directly. It’s at least a useful technique that I’ll keep playing with.

Anyway, benchmark your code.

March 05, 2015 11:53 PM

March 04, 2015

Functional Jobs

Hacker (Node.js, NOSQL, Data Science) at Mobitrans (Full-time)

We are looking for exceptional functional developers to join our product research and development team. It's a full stack job that requires more than anything a passion for learning and experimenting with new technologies, creating prototypes and dealing with a few million data points every day. You will be working on any technology between Linux shell scripts, high performance Web servers and beautiful UIs.

We tweak our advertising and targeting algorithms multiple times a day and you should be able to reason about, program, discover and propose new ways of improving our result.

Our backend is supported by Node.js and Mongodb and our front-end is powered by JavaScript, TypeScript, LiveScript, AngularJS and React. You should be able to improve our architecture, bring new ideas and create POCs and lead the development of new projects.

You will be building ad servers for serving interactive mobile ads as well as realtime data analysis of the traffic; you will be tinkering with the algorithms and assisting product designers and data analysts by developing data visualization tools for UX and performance reports.

Who are we looking for?

  • Familiarity with functional programming languages including but not limited to Haskell, Clojure, F#, OCaml.
  • Hands-on experience with JavaScript, Python, Node.js, Mongodb, WebSockets, Reactive programming.
  • We know LiveScript is not the most popular language in the world, but we like its functional style and we use it all the time, if you're applying for this job, go and check it out:
  • Experience with R, Mathematica, and similar software
  • Familiar with data analysis, machine learning software, MapReduce, Hadoop and related concepts and big data technologies.
  • Good UI skills with Underscore.js, d3 and React.
  • Some math
  • Practical knowledge in using Git.

About Us

Mobitrans was started in 2006 and is the global leader in mobile learning and entertainment services. We are active in 38 countries. Headquartered in Dubai Media City, we have an international and diverse team creating and marketing our products to millions of users every day.

Our offer

  • We offer a very competitive salary.
  • World-class healthcare package.
  • Complimentary gym membership.
  • Flexible working hours.
  • Opportunities for career progression within the company.
  • 365 days of sunshine!
  • Amazing office location with stunning views of the famous Palm Jumeirah.

This position is open for worldwide candidates and in case of relocation we will assist you ensuring a smooth transition to working and living in the beautiful city of Dubai.

Get information on how to apply for this position.

March 04, 2015 12:30 PM