Planet Haskell

October 31, 2014

The GHC Team

GHC Weekly News - 2014/10/31 (Halloween Edition)

Hello *,

Welcome to the GHC Weekly news. And it's just in time before you go out and eat lots of candy and food.

  • David Feuer and Joachim Brietner spent some time this past week talking about more optimizations for Haskell code for fusing code, and creating better consumers and producers. This work includes optimizations of "One shot lambdas" (lambdas used at most once) and Call Arity, which was implemented by Joachim at Microsoft Research. The thread is here -

The current situation is that Call Arity and One shot analysis tends to have good combined results for exploiting more fusion opportunities, but sometimes these backfire. As a result, Joachim and David have worked on improving the situation - particularly by letting programmers help with a new oneShot primitive (in Phab:D392 & Phab:D393).

  • Herbert Valerio Riedel opened up a discussion about the origins of code contributions. In short, we'd like to have some stronger ground to stand on in the face of contributions from contributors - the origin of a change and its legal status is a bit nebulous. The thread is here:

Overall, there's been a lot of separate points made, including CLAs (unlikely), "Developer Certificates of Origin" a la the Linux Kernel, and reworking how we mark up header files, and keep track of GHC's long history of authors. If you work on a big project where some of these concerns are real, we'd like to hear what you have to say!

  • Gintautas Milauskas has done some fantastic work for GHC on Windows lately, including fixing tests, improving the build, and making things a lot more reasonable to use. With his work, we hope GHC 7.10 will finally ship an updated MinGW compiler (a long requested feature), and have a lot of good fixes for windows. Thank you, Gintautas!
  • And on that note, the call for Windows developers rages on - it looks like Gintautaus, Austin, Simon, and others will be meeting to discuss the best way to tackle our current woes. Are you a Windows user? Please give us input - having input is a crucial part of the decision making process, so let us know.
  • Jan Stolarek had a question about core2core - a lot of questions, in fact. What's the difference between demand, strictness, and cardinality analylsis? Does the demand analyzer change things? And what's going on in some of the implementation? A good read if you're interested in deep GHC optimization magic:
  • Peter Wortmann has put up the new DWARF generation patches for review, in Phab:D396. This is one of the major components we still plan on landing in 7.10, and with a few weeks to spare, it looks like we can make sure it's merged for the STABLE freeze!
  • There have been a lot of good changes in the tree this past week:

Thanks to Michael Orlitzky, we plan on adding doctest examples to more modules in 7.10, and increase that coverage further. This is *really* important work, but very low fruit - thanks a ton Michael!

Data.Bifunctor is now inside base! (Phab:D336)

atomicModifyIORef' has been optimized with excellent speedups (as much as 1.7x to 1.4x, depending on the RTS used), thanks to some older work by Patrick Palka (Phab:D315). GHC's internals have been reworked to unwire Integer from GHC, leading not only to a code cleanup, but laying the foundation for further GMP (and non-GMP!) related Integer improvements (Phab:D351).

David Feuer and Joachim have been relentless in improving fusion opportunities, including the performance of take, isSuffixOf, and more prelude improvements, spread over nearly half a dozen patches. And this doesn't even include the work on improving oneShot or Call Arity!

In a slight change to base semantics, David Feuer also finally fixed #9236. This is a change that can expose latent bugs in your program (as it did for Haddock), so be sure to test thoroughly with 7.10 (Phab:D327).

GHC now has support for a new __GLASGOW_HASKELL_TH__ macro, primarily useful for testing bootstrap compilers, or compilers which don't support GHCi.

And there have been many closed tickets: #9549, #9593, #9720, #9031, #8345, #9439, #9435, #8825, #9006, #9417, #9727, #2104, #9676, #2628, #9510, #9740, #9734, #9367, #9726, #7984, #9230, #9681, #9747, and #9236.

by thoughtpolice at October 31, 2014 11:57 PM

Douglas M. Auclair (geophf)

Wolfgang Jeltsch

Grapefruit updated

Yesterday, I updated the Grapefruit FRP library once again, this time to make it compatible with GHC 7.8. The new version is To install or update, you can use the following commands:

cabal update
cabal install grapefruit-ui-gtk grapefruit-examples

Many thanks to Samuel Gélineau for providing several patches. (By the way, Samuel maintains an interesting page that compares different FRP libraries, the FRP Zoo.)

Grapefruit 0.1 is actually a phase-out model, which I only update to work with newer GHC versions. However, I am working on a new Grapefruit. This will be based on my research about FRP semantics and will be quite different from the old one. I expect that the sound theoretical foundation will lead to a more powerful library with a more sensible interface. One particular new feature will be integration of side effects into FRP, in a purely functional style.

Tagged: FRP, Grapefruit, Haskell, Samuel Gélineau

by Wolfgang Jeltsch at October 31, 2014 10:52 AM

Yesod Web Framework

Slides on conduit and GHCJS

I'm currently sitting in a hotel room in Poznan, Poland, getting ready to attend the second day of PolyConf. In the past two days, I've already had quite some fun. Wednesday- thanks to Matthias Fischmann and Adam Drake- I spoke to the Berlin Haskell users group about conduit. I had no idea that there were this many Haskellers in Berlin, and it was awesome to meet all of you. Yesterday I spoke about Haskell on the client side at PolyConf, mostly talking about how awesome GHCJS is (after Luite indoctrinated me at ICFP).

For those interested, the slides are now available online at:

I've also learnt quite a number of cool client side tricks at PolyConf, hopefully we'll get to take advantage of some of them in Yesod in the near future :).

October 31, 2014 05:00 AM

October 30, 2014

Functional Jobs

Senior Software Engineer at Soda Software Labs (Full-time)

Role: Senior Software Engineer Primary Location: Manchester, UK Employee Status: Permanent (Full Time) Salary: Competitive, with share options possible

Company Overview

Soda Software Labs are searching for a world class software engineer to help with the design and development of their flagship FinTech product PROFILE. At the heart of the product is a behavioural analysis platform which mines social data and performs analysis to provide insight for a number of industry verticals. The role offers the opportunity to join a small but fast growing company, to innovate in a number of industries and to get in at the ‘ground level’ within the fascinating space of BIG and social data.

The Role

We are looking for someone who not only has a proven track record of working with complex systems, but someone who has the passion, drive and motivation to continually develop our products to the highest standard.Typical duties of the role will involve:

Investigating current applications to improve their functionality Testing the product to ensure that it operates satisfactorily Confidently training and liaising with users Producing specifications and agreeing proposals Costing new or modified systems Writing new software and operating manuals

Knowledge and Skills

We are looking for an experienced senior-level software engineer, with experience across a range of programming languages of at least 2 or 3 different paradigms i.e. Java and C#/.NET being classified as a single language on the wider spectrum (as is C++).

The candidate needs to have functional programming knowledge of one or more of the following: Lisp/Scheme Erlang ML/OCaml Haskell Scala Python Ruby

A good understanding of social platform APIs such as Twitter API, Facebook API, LinkedIn API is also essential.

Additional Experience

3-4 years’ experience as a test/ software engineer Degree educated, preferably in a relevant subject such as computer science, software engineering, physics, mathematics or electronics Excellent critical thinking and analytical skills Ability to work to another’s design Commercial awareness Good communication skills

You'll be joining a well funded FinTech start up in the centre of Manchester with a work hard (but sometimes play hard too) culture. Moreover, it's the chance to get in at ground level with a business that has an extremely strong team and huge ambition. If this sounds like your kind of place, we'd love to hear from you.

Get information on how to apply for this position.

October 30, 2014 07:35 AM

October 29, 2014

Keegan McAllister

A taste of Rust (yum) for C/C++ programmers

If, like me, you've been frustrated with the status quo in systems languages, this article will give you a taste of why Rust is so exciting. In a tiny amount of code, it shows a lot of ways that Rust really kicks ass compared to C and C++. It's not just safe and fast, it's a lot more convenient.

Web browsers do string interningto condense the strings that make up the Web, such as tag and attribute names, into small values that can be compared quickly. I recently added event logging support to Servo's string interner. This will allow us to record traces from real websites, which we can use to guide further optimizations.

Here are the events we can log:

pub enum Event {
Insert(u64, String),

Interned strings have a 64-bit ID, which is recorded in every event. The Stringwe store for "insert" events is like C++'s std::string; it points to a buffer in the heap, and it owns that buffer.

This enum is a bit fancier than a C enum, but its representation in memory is no more complex than a C struct. There's a tag for the three alternatives, a 64-bit ID, and a few fields that make up the String. When we pass or return an Event by value, it's at worst a memcpyof a few dozen bytes. There's no implicit heap allocation, garbage collection, or anything like that. We didn't define a way to copy an event; this means the String buffer always has a unique owner who is responsible for freeing it.

The deriving(Show) attribute tells the compiler to auto-generatea text representation, so we can print an Eventjust as easily as a built-in type.

Next we declare a global vector of events, protected by a mutex:

lazy_static! {
pub static ref LOG: Mutex<Vec<Event>>
= Mutex::new(Vec::with_capacity(50_000));

lazy_static! will initialize both of them when LOG is first used. Like String, the Vec is a growable buffer. We won't turn on event logging in release builds, so it's fine to pre-allocate space for 50,000 events. (You can put underscores anywhere in a integer literal to improve readability.)

lazy_static!, Mutex, and Vec are all implemented in Rust using gnarly low-level code. But the amazing thing is that all three expose a safe interface. It's simply not possible to use the variable before it's initialized, or to read the value the Mutex protects without locking it, or to modify the vector while iterating over it.

The worst you can do is deadlock. And Rust considers that pretty bad, still, which is why it discourages global state. But it's clearly what we need here. Rust takes a pragmatic approach to safety. You can always write the unsafe keywordand then use the same pointer tricks you'd use in C. But you don't need to be quite so guarded when writing the other 95% of your code. I want a language that assumes I'm brilliant but distracted :)

Rust catches these mistakes at compile time, and produces the same code you'd see with equivalent constructs in C++. For a more in-depth comparison, see Ruud van Asseldonk's excellent series of articlesabout porting a spectral path tracer from C++ to Rust. The Rust code performs basically the same as Clang / GCC / MSVC on the same platform. Not surprising, because Rust uses LLVMand benefits from the same backend optimizations as Clang.

lazy_static! is not a built-in language feature; it's a macro provided by a third-party library. Since the library uses Cargo, I can include it in my project by adding

git = ""

to Cargo.toml and then adding

extern crate lazy_static;

to src/ Cargo will automatically fetch and build all dependencies. Code reuse becomes no harder than in your favorite scripting language.

Finally, we define a function that pushes a new event onto the vector:

pub fn log(e: Event) {

LOG.lock() produces an RAII handle that will automatically unlock the mutex when it falls out of scope. In C++ I always hesitate to use temporaries like this because if they're destroyed too soon, my program will segfault or worse. Rust has compile-time lifetime checking, so I can do things that would be reckless in C++.

If you scroll up you'll see a lot of prose and not a lot of code. That's because I got a huge amount of functionality for free. Here's the logging module again:

pub enum Event {
Insert(u64, String),

lazy_static! {
pub static ref LOG: Mutex<Vec<Event>>
= Mutex::new(Vec::with_capacity(50_000));

pub fn log(e: Event) {

This goes in src/event.rsand we include it from src/

#[cfg(feature = "log-events")]
pub mod event;

The cfg attributeis how Rust does conditional compilation. Another project can specify

git = ""
features = ["log-events"]

and add code to dump the log:

for e in string_cache::event::LOG.lock().iter() {
println!("{}", e);

Any project which doesn't opt in to log-eventswill see zero impact from any of this.

If you'd like to learn Rust, the Guide is a good place to start. We're getting close to 1.0and the important concepts have been stable for a while, but the details of syntax and libraries are still in flux. It's not too early to learn, but it might be too early to maintain a large library.

By the way, here are the events generated by interning the three strings foobarbaz foo blockquote:

Insert(0x7f1daa023090, foobarbaz)

There are three different kinds of IDs, indicated by the least significant bits. The first is a pointer into a standard interning table, which is protected by a mutex. The other two are created without synchronization, which improves parallelism between parser threads.

In UTF-8, the string foois smaller than a 64-bit pointer, so we store the characters directly. blockquote is too big for that, but it corresponds to a well-known HTML tag. 0xb is the index of blockquote in a static listof strings that are common on the Web. Static atoms can also be used in pattern matching, and LLVM's optimizations for C's switch statements will apply.

by keegan ([email protected]) at October 29, 2014 07:15 PM

Yesod Web Framework

Announcing: yesod-gitrepo

I'm happy to announce the first release of a new package, yesod-gitrepo. This package encapsulates a pattern I've used a number of times, namely: loading and refreshing content from a Git repository. Below is the current contents of the file.

This code is currently being used in production, and should be pretty stable. That said, it has not received a huge amount of battle testing yet. So please due test corner cases before shipping it in production for your site.

yesod-gitrepo provides a means of embedding content from a Git repository inside a Yesod application. The typical workflow is:

  • Use gitRepo to specify a repository and branch you want to work with.
  • Provide a function that will perform some processing on the cloned repository.
  • Use grContent in your Handler functions to access this parsed data.
  • Embed the GitRepo as a subsite that can be used to force a refresh of the data.
  • Set up a commit handler that pings that URL. On Github, this would be a webhook.

This is likely easiest to understand with a concrete example, so let's go meta: here's an application that will serve this very file. We'll start off with language extensions and imports:

{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes       #-}
{-# LANGUAGE TemplateHaskell   #-}
{-# LANGUAGE TypeFamilies      #-}
import           ClassyPrelude.Yesod
import           Text.Markdown
import           Yesod.GitRepo

Now we're going to create our foundation datatype. We need to give it one field: the GitRepo value containing our parsed content. Our content will simply be the text inside, wrapped up in a Markdown newtype for easy rendering. This gives us:

data App = App
    { getGitRepo :: GitRepo Markdown

instance Yesod App

And now let's set up our routes. We need just two: a homepage, and the subsite. Our subsite type is GitRepo Markdown (as given above). We replace the space with a hyphen as an escaping mechanism inside Yesod's route syntax:

mkYesod "App" [parseRoutes|
/ HomeR GET
/refresh RefreshR GitRepo-Markdown getGitRepo

Next up is our home handler. We start off by getting the current content parsed from the repository:

getHomeR :: Handler Html
getHomeR = do
    master <- getYesod
    content <- liftIO $ grContent $ getGitRepo master

Then it's just some normal Hamlet code:

    defaultLayout $ do
        setTitle "yesod-gitrepo sample"
                <a href=@{RefreshR GitRepoRoute}>
                    Force a refresh at
                    @{RefreshR GitRepoRoute}

And finally, our main function. We pass in the repo URL and branch name, plus a function that, given the filepath containing the cloned repo, processes it and generates a Markdown value. Finally, we provide the generated repo value to our App constructor and run it:

main :: IO ()
main = do
    repo <- gitRepo
        $ \fp -> fmap Markdown $ readFile $ fp </> ""
    warp 3000 $ App repo

Give it a shot. You should have a webapp hosting this very README file!

October 29, 2014 01:16 PM

October 28, 2014

Gabriel Gonzalez

How to desugar Haskell code

Haskell's core language is very small, and most Haskell code desugars to either:

  • lambdas / function application,
  • algebraic data types / case expressions,
  • recursive let bindings,
  • type classes and specialization, or:
  • Foreign function calls

Once you understand those concepts you have a foundation for understanding everything else within the language. As a result, the language feels very small and consistent.

I'll illustrate how many higher-level features desugar to the same set of lower-level primitives.


if is equivalent to a case statement:

if b then e1 else e2

-- ... is equivalent to:
case b of
True -> e1
False -> e2

This works because Bools are defined within the language:

data Bool = False | True

Multi-argument lambdas

Lambdas of multiple arguments are equivalent to nested lambdas of single arguments:

\x y z -> e

-- ... is equivalent to:
\x -> \y -> \z -> e


Functions are equivalent to lambdas:

f x y z = e

-- ... is equivalent to:
f = \x y z -> e

-- ... which in turn desugars to:
f = \x -> \y -> \z -> e

As a result, all functions of multiple arguments are really just nested functions of one argument each. This trick is known as "currying".

Infix functions

You can write functions of at least two arguments in infix form using backticks:

x `f` y

-- ... desugars to:
f x y


Operators are just infix functions of two arguments that don't need backticks. You can write them in prefix form by surrounding them with parentheses:

x + y

-- ... desugars to:
(+) x y

The compiler distinguishes operators from functions by reserving a special set of punctuation characters exclusively for operators.

Operator parameters

The parentheses trick for operators works in other contexts, too. You can bind parameters using operator-like names if you surround them with parentheses:

let f (%) x y = x % y
in f (*) 1 2

-- ... desugars to:
(\(%) x y -> x % y) (*) 1 2

-- ... reduces to:
1 * 2

Operator sections

You can partially apply operators to just one argument using a section:

(1 +)

-- desugars to:
\x -> 1 + x

This works the other way, too:

(+ 1)

-- desugars to:
\x -> x + 1

This also works with infix functions surrounded by backticks:

(`f` 1)

-- desugars to:
\x -> x `f` 1

-- desugars to:
\x -> f x 1

Pattern matching

Pattern matching on constructors desugars to case statements:

f (Left  l) = eL
f (Right r) = eR

-- ... desugars to:
f x = case x of
Left l -> eL
Right r -> eR

Pattern matching on numeric or string literals desugars to equality tests:

f 0 = e0
f _ = e1

-- ... desugars to:
f x = if x == 0 then e0 else e1

-- ... desugars to:
f x = case x == 0 of
True -> e0
False -> e1

Non-recursive let / where

Non-recursive lets are equivalent to lambdas:

let x = y in z

-- ... is equivalent to:
(\x -> z) y

Same thing for where, which is identical in purpose to let:

z where x = y

-- ... is equivalent to:
(\x -> z) y
Actually, that's not quite true, because of let generalization, but it's close to the truth.

Recursive let / where cannot be desugared like this and should be treated as a primitive.

Top-level functions

Multiple top-level functions can be thought of as one big recursive let binding:

f0 x0 = e0

f1 x1 = e1

main = e2

-- ... is equivalent to:
main = let f0 x0 = e0
f1 x1 = e1
in e2

In practice, Haskell does not desugar them like this, but it's a useful mental model.


Importing modules just adds more top-level functions. Importing modules has no side effects (unlike some languages), unless you use Template Haskell.


Type classes desugar to records of functions under the hood where the compiler implicitly threads the records throughout the code for you.

class Monoid m where
mappend :: m -> m -> m
mempty :: m

instance Monoid Int where
mappend = (+)
mempty = 0

f :: Monoid m => m -> m
f x = mappend x x

-- ... desugars to:
data Monoid m = Monoid
{ mappend :: m -> m -> m
, mempty :: m

intMonoid :: Monoid Int
intMonoid = Monoid
{ mappend = (+)
, mempty = 0

f :: Monoid m -> m -> m
f (Monoid p z) x = p x x

... and specializing a function to a particular type class just supplies the function with the appropriate record:

g :: Int -> Int
g = f

-- ... desugars to:
g = f intMonoid

Two-line do notation

A two-line do block desugars to the infix (>>=) operator:

do x <- m

-- ... desugars to:
m >>= (\x ->
e )

One-line do notation

For a one-line do block, you can just remove the do:

main = do putStrLn "Hello, world!"

-- ... desugars to:
main = putStrLn "Hello, world!"

Multi-line do notation

do notation of more than two lines is equivalent to multiple nested dos:

do x <- mx
y <- my

-- ... is equivalent to:
do x <- mx
do y <- my

-- ... desugars to:
mx >>= (\x ->
my >>= (\y ->
z ))

let in do notation

Non-recursive let in a do block desugars to a lambda:

do let x = y

-- ... desugars to:
(\x -> z) y


The ghci interactive REPL is analogous to one big do block (with lots and lots of caveats):

$ ghci
>>> str <- getLine
>>> let str' = str ++ "!"
>>> putStrLn str'

-- ... is equivalent to the following Haskell program:
main = do
str <- getLine
let str' = str ++ "!"
putStrLn str'

-- ... desugars to:
main = do
str <- getLine
do let str' = str ++ "!"
putStrLn str'

-- ... desugars to:
main =
getLine >>= (\str ->
do let str' = str ++ "!"
putStrLn str' )

-- ... desugars to:
main =
getLine >>= (\str ->
(\str' -> putStrLn str') (str ++ "!") )

-- ... reduces to:
main =
getLine >>= (\str ->
putStrLn (str ++ "!") )

List comprehensions

List comprehensions are equivalent to do notation:

[ (x, y) | x <- mx, y <- my ]

-- ... is equivalent to:

do x <- mx
y <- my
return (x, y)

-- ... desugars to:
mx >>= (\x -> my >>= \y -> return (x, y))

-- ... specialization to lists:
concatMap (\x -> concatMap (\y -> [(x, y)]) my) mx

The real desugared code is actually more efficient, but still equivalent.

The MonadComprehensions language extension generalizes list comprehension syntax to work with any Monad. For example, you can write an IO comprehension:

>>> :set -XMonadComprehensions
>>> [ (str1, str2) | str1 <- getLine, str2 <- getLine ]
("Line1", "Line2")

Numeric literals

Integer literals are polymorphic by default and desugar to a call to fromIntegral on a concrete Integer:

1 :: Num a => a

-- desugars to:
fromInteger (1 :: Integer)

Floating point literals behave the same way, except they desugar to fromRational:

1.2 :: Fractional a => a

-- desugars to:
fromRational (1.2 :: Rational)


You can think of IO and all foreign function calls as analogous to building up a syntax tree describing all planned side effects:

main = do
str <- getLine
putStrLn str
return 1

-- ... is analogous to:
data IO r
= PutStrLn String (IO r)
| GetLine (String -> IO r)
| Return r

instance Monad IO where
(PutStrLn str io) >>= f = PutStrLn str (io >>= f)
(GetLine k ) >>= f = GetLine (\str -> k str >>= f)
Return r >>= f = f r

main = do
str <- getLine
putStrLn str
return 1

-- ... desugars and reduces to:
main =
GetLine (\str ->
PutStrLn str (
Return 1 ))

This mental model is actually very different from how IO is implemented under the hood, but it works well for building an initial intuition for IO.

For example, one intuition you can immediately draw from this mental model is that order of evaluation in Haskell has no effect on order of IO effects, since evaluating the syntax tree does not actually interpret or run the tree. The only way you can actually animate the syntax tree is to define it to be equal to main.


I haven't covered everything, but hopefully that gives some idea of how Haskell translates higher level abstractions into lower-level abstractions. By keeping the core language small, Haskell can ensure that language features play nicely with each other.

Note that Haskell also has a rich ecosystem of experimental extensions, too. Many of these are experimental because each new extension must be vetted to understand how it interacts with existing language features.

by Gabriel Gonzalez ([email protected]) at October 28, 2014 03:24 AM

Danny Gratzer

The Guts of a Spineless Machine

Posted on October 28, 2014
Tags: haskell, c

It’s fairly well known that Haskell is a bit um.. different than how stock hardware sees the world. I’m not aware of too many processors that have decided that immutability and higher order functions are the right way to go.

Compiling Haskell and its ilk, however, does have one interesting wrinkle on top of the normal problem: laziness. Laziness stands completely at odds with how most everything else works. More over, whether or not you think it’s the right default it’s an interesting question of how to efficiently compile some evaluation strategy other than call by value or name.

To this end, people have built a lot of abstract machines that lazy languages could target. The idea being that these machines can be mapped easily to what we hardware want and transitively, we can get our compiler. Most of these work by “graph reduction” (that’s the G in STG) and the latest incarnation of these graph machines is the spineless tagless graph machine which lies at the heart of GHC and a few other compilers.

In this post, I’d like to go over how exactly the STG machine actually works. Turns out it’s pretty cool!

Core Concepts

The basic idea behind a compiler intent on going the STG route is something like

  1. .. front end stuff ..
  2. Translate IL to STG language
  3. Compile STG language to C/ASM/LLVM/Javascript

In GHC case I understand the pipeline is something like

  1. Parsing
  2. Type checking
  3. Desugaring + a few bobs and bits
  4. Translation to core
  5. Lion share of optimization
  6. Translation to STG language
  7. STG language to C–
  8. C– to assembly or llvm

We’re really concerned with parts 6 and 7 here. First things first, let’s lay out what’s exactly in the STG language. It’s a tiny functional language that looks a bit like Haskell or Core, with a few restrictions. A program is simply a series of bindings, much like Haskell. The top levels look something like

f = {x y z} flag {a b c} -> ...

You should read this for now as f = \a b c -> .... The first set of variables and the flag correspond to some stuff we’ll discuss later.

Inside the ... we can write most of what you would expect form Haskell. We have let[rec] bindings, case expressions, application, constructors, literals, and primitives. There is a caveat though, first off all constructor applications must be fully saturated. This isn’t unlike OCaml or something where you can’t just treat a constructor as a function with an arbitrary name. We would write

\a -> Just a

instead of just Just. Another bit of trickiness, our language has no lambdas! So we can’t even write the above. Instead if we had something like

 map Just [1, 2, 3]

We’d have to write

 let f   = \a -> Just a
     l'' = 3 : nil
     l'  = 2 : l''
     l   = 1 : l'
 in map f l

The reason for the awkward l'' series is that we’re only allowed to apply constructors and functions to atoms (literals and variables).

One other noteworthy feature of STG is that we have primitive operations. They need to be fully saturated, just like constructors, but they work across unboxed things. For example there would probably be something like +# which adds to unboxed integers. To work with these we also have unboxed literals, 1#, 2#, so on and so on.

No despite all these limitations placed on STG, it’s still a pretty stinking high level language. There’s letrec, higher order functions, a lot of the normal stuff we’d expect in a functional language. This means it’s not actually to hard to compile something like Haskell or Core to STG (I didn’t say “compile efficiently”).

As an example, let’s look at translating factorial into STG language. We start with

f :: Int -> Int
f i = case i of
  0 -> 1
  i -> i * (f (i - 1))

Now the first step is we change the binding form

f = {} n {i} -> ...

The case expressions clause can remain the same, we’re already casing on an atom

case i of
  (MkInt# i#) -> ...

Now comes the first big change, our boxed integers are going to get in the way here, so the case expression strips away the constructor leaving us with an unboxed integer. We can similarly refactor the body to make evaluation order explicit

 case i of
   MkInt i# ->
     case i# -# 1# of
       dec# ->
         let dec = \{dec#} u {} -> MkInt dec#
         in case fact dec of
              MkInt rest# ->
                case i# * rest# of
                  result# -> MkInt result#

Notice how the case expressions here are used to make the evaluation of various expressions explicit and let was used to create a new thing to evaluate.

Now we can see what those extra {}’s were for. They notate the free variables for a thunk. Remember how we can have all sorts of closures and it can make for some really nice code? Well the machine doesn’t exactly support those naively. What we need to do and note the variables that we close over explicitly and then generate code that will store these free variables with the value that closes over them. This pair is more or less what is called a “closure” for the rest of this post.

dec for example has a free variable dec# and it exists to box that result for the recursive call to factorial. We use case expressions to get evaluation. Most programs thus become chains of case’s and let alternating between creating thunks and actually doing work.

That u in between the {}’s in dec was also important. It’s the update flag. Remember how in Haskell we don’t want to force the same thunk twice. If I say

let i = 1 + 1 in i + i

We should only evaluate 1 + 1 once. That means that the thunk i will have to be mutated to not evaluate 1 + 1 twice. The update flag signifies the difference between thunks that we want to update and thunks that we don’t. For example, if we replaced the thunk for + with the first result it returned, we’d be mighty surprised. Suddenly 1 + 1 + 1 is just 2!

The u flag says “yes, I’m just a normal expression that should be updated” and the n flag says the opposite.

That about wraps up our discussion of the STG language, let’s talk about how to implement it now.


This language wouldn’t be much good if it didn’t lend itself to an easy implementation, indeed we find that the restrictions we placed upon the language prove to be invaluable for its compilation (almost like they were designed that way!).

In order to decide how best to implement it, we first let at the formal semantics for our language. We give these semantics a tuple of 6 things.

  1. The code - the instruction we’re currently executing
  2. The argument stack - A stack of integers or pointers to closures
  3. The return stack - A stack of continuations
  4. The update stack - A stack of update frames and sadness
  5. The heap - A map from addresses to closures
  6. The environment - A map from names to addresses of toplevel closures

A code is more or less the current thing we’re attempting to do. It’s either

  1. Eval e p - evaluate an expression in an environment (p)
  2. Enter a - Enter a closure
  3. ReturnCon c ws - Return a constructor applied to some arguments
  4. ReturnInt - Return an integer

Now the idea is we’re going to “unroll” our computations into pushing things onto the continuation stack and entering closures. We start with the code Eval main {}. That is to say, we start by running main. Then if we’re looking at a case we do something really clever

 EVAL(case expr of {pat1 -> expr1; ...}, p) as rs us h o


EVAL (expr, p) as ({pat1 -> expr1; ...} : rs) us h o

That is to say, we just push the pattern matching on to the continuation stack and evaluate the expression.

At some point we’ll get to a “leaf” in our expression. That is random literal (a number) or constructor. At this point we make use of our continuation stack

EVAL (C ws, p) as ((...; c vs -> expr; ...) : rs) us h o
ReturnCon (C ws) as ((...; c vs -> expr; ...) : rs) us h o
EVAL (expr, p[vs -> ws]) as rs us h o

So our pattern matching is rolled into ReturnCon. ReturnCon will just look on top of the return stack looking for a continuation which wants its constructor and evaluate its expression, mapping the constructor’s variables to the pattern’s variables.

The story is similar for literals

EVAL (Int i, p) as ((...; c vs -> expr; ...) : rs) us h o
ReturnInt i as ((...; i -> expr; ...) : rs) us h o
EVAL (expr, p) as rs us h o

Another phase is how we handle let’s and letrec’s. In this phase instead of dealing with continuations, we allocate more thunks onto the heap.

EVAL ((let x = {fs} f {xs} -> e; ... in expr), p) as rs us h o
EVAL e p' as us h' o

So as we’d expect, evaluating a let expression does indeed go and evaluate the body of the let expression, but changes up the environment in which we evaluate them. We have

p' = p[x -> Addr a, ...]
h' = h[a -> ({fs} f {xs} -> e) p fs, ...]

In words “the new environment contains a binding for x to some address a. The heap is extended with an address a with a closure {fs} f {xs} -> ... where the free variables come from p”. The definition for letrec is identical except the free variables come from p' allowing for recursion.

So the STG machine allocates things in lets, adds continuations with case, and jumps to continuation on values.

Now we also have to figure out applications.

EVAL (f xs, p) as rs us h o
ENTER a (values of xs ++ as) rs us h o

where the value of f is Addr a. So we push all the arguments (remember they’re atoms and therefore trivial to evaluate) on to the argument stack and enter the closure of the function.

How do we actually enter a closure? Well we know that our closures are of the form

({fs} f {vs} -> expr) frees

If we have enough arguments to run the closure (length vs > length of argument stack), then we can just EVAL expr [vs -> take (length vs) as, fs -> frees]. This might not be the case in something like Haskell though, we have partial application. So what do we do in this case?

What we want is to somehow get something that’s our closure but also knows about however many arguments we actually supplied it. Something like

({fs ++ supplied} f {notSupplied} -> expr) frees ++ as

where supplied ++ notSupplied = vs. This updating of a closure is half of what our update stack us is for. The other case is when we do actually enter the closure, but f = u so we’re going to want to update it. If this is the case we add an update from to the stack (as, rs, a) where as is the argument stack, rs is the return stack, and a is the closure which should be updated. Once we’ve pushed this frame, we promptly empty the argument stack and return stack.

We then add the following rules to the definition of ReturnCon

ReturnCon c ws {} {} (as, rs, a) : us h o
ReturnCon c ws as rs us h' o

where h' is the new heap that’s replaced our old closure at a with our new, spiffy, updated closure

h' = h[a -> ({vs} n {} -> c vs) ws]

So that’s what happens when we go to update a closure. But what about partial application?

Enter a as {} (asU, rs, aU) : us h o
Enter a (as ++ asU) rs us h' o


h a = ({vs} n {xs} -> expr) frees
h' = h [aU -> ((vs ++ bound) n xs -> e) (frees ++ as)]

This is a simplified rule from what’s actually used, but gives some intuition to what’s happening: we’re minting a new closure in which we use the arguments we’ve just bound and that’s what the result of our update is.

Compiling This

Now that we have some idea of how this is going to work, what does this actually become on the machine?

The original paper by SPJ suggests an “interpreter” approach to compilation. In other words, we actually almost directly map the semantics to C and call it compiled. There’s a catch though, we’d like to represent the body of closures as C functions since they’re well.. functions. However, since all we do is enter closures and jump around to things till the cows come home, it had damn well better be fast. C function calls aren’t built to be that fast. Instead the paper advocates a tiny trampolining-esque approach.

When something wants to enter a closure, it merely returns it and our main loop becomes

 while(1){cont = (*cont)();}

Which won’t stackoverflow. In reality, more underhanded tricks are applied to make the performance suck less, but for we’ll ignore such things.

In our compiled results there will be 2 stacks, not the 3 found in our abstract machine. In the first stack (A-stack) there are pointer things and the B-stack has non-pointers. This are monitored by two variables/registers SpA and SpBwhich keep track of the heights of the two stacks. Then compilation becomes reasonably straightforward.

An application pushes the arguments onto the appropriate stacks, adjusts Sp*, and enters the function A let block allocates each of the bound variables, then the body. Entering a closure simply jumps to the closures code pointer. This is actually quite nifty. We delegate all the work of figuring out exactly what Enter will do (updates, continuation jiggering) is left to the closure itself.

A case expression is a bit more complicated since a continuation’s representation involves boxing up the local environment for each branch. Once that’s bundled away, we represent a continuation as a simple code pointer. It is in charge of scrutinizing the argument stack and selecting an alternative and then running the appropriate code. This is a lot of work, and unless I’m crazy will need to types of bound variables for each branch (really just ptr/non-ptr). The selection of an alternative would be represented as a C switch, letting all sorts of trickery with jump tables be done by the C compiler.

In order to return a value, we do something clever. We take a constructor and point a global variable at its constructor closure, containing its values and jump to the continuation. The continuation can then peek and poke at this global variable to bind things as needed for the alternatives. There is potentially a massive speedup by returning through registers, but this is dangerously close to work.

From here, primitive operations can be compiled to statements/instructions in whatever environment we’re targeting. In C for example we’d just use the normal + to add our unboxed integers.

The last beast to slay is updates. We represent update frames by pointers to argument stacks and a pointer to a closure. That means that the act of updating is merely saving Sp* in an update from, clobbering them, and then jumping into the appropriate closure. We push the update from onto stack B and keep on going.

I realize that this is a glancing overview and I’m eliding a lot of the tricky details, but hopefully this is sufficient to understand a bit about what going on at an intuitive level.

Wrap Up

So now that you’ve put all the effort to get through this post, I get to tell you it’s all lies! In reality GHC has applied all manner of tricks and hacks to get fast performance out of an STG model. To be honest I’m not sure where I should point to that explains these tricks because well… I have no idea what they are.

I can point to

If you have any suggestions for other links I’d love to add them!

Thanks Chris Ganas for proof reading

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

October 28, 2014 12:00 AM

Christopher Done

Fast pagination on PostgreSQL

During the implementation of IRCBrowse I discovered that Postgres’s built-in offset is not very fast.

Here are the characteristics of my data:

ircbrowse=> \d event
                                    Table "public.event"
  Column   |           Type           |
 timestamp | timestamp with time zone |
 type      | text                     |
 nick      | text                     |
 text      | text                     |
 network   | integer                  |
 channel   | integer                  |
 id        | bigint                   |
    "event_unique" UNIQUE CONSTRAINT,
    btree (network, channel, "timestamp", nick, type, text)
    "event_unique_id" UNIQUE CONSTRAINT, btree (id)
    "event_channel_idx" btree (channel)
    "event_nick_idx" btree (nick)
    "event_timestamp_idx" btree ("timestamp")
    "event_type_idx" btree (type)

And the size:

ircbrowse=> select count(*) from event;

Channel 1 is the biggest:

ircbrowse=> select count(*) from event where channel = 1;

When you’re working with data on this scale (large, but not “big data”), PostgreSQL handles it beautifully. But the speed of OFFSET/LIMIT is not great:

ircbrowse=> explain analyze select * from event where channel = 1
                            order by id offset 500000 limit 30;
Limit  (cost=5648.81..5818.28 rows=30 width=85)
       (actual time=0.301..0.309 rows=30 loops=1)
   ->  Index Scan using event_unique_id on event
   (cost=0.00..81914674.39 rows=14501220 width=85)
   (actual time=0.020..0.288 rows=1030 loops=1)
         Filter: (channel = 1)

I think that this index scan is simply too expensive. Notice that I’m ordering by id which has a unique btree index on it. Check out the speed:

ircbrowse=> select * from event where channel = 1
            order by id offset 1000 limit 30;
Time: 0.721 ms
ircbrowse=> select * from event where channel = 1
            order by id offset 500000 limit 30;
Time: 191.926 ms

You might think less than a second to sift through 500,000 rows of a 28million row table is pretty good, but I think it sucks. It’s also deceptive. Let’s increase it to 1,000,000 rows (of 19,000,00):

ircbrowse=> select * from event where channel = 1
            order by id offset 1000000 limit 30;
Time: 35022.464 ms

This is getting worse and worse! It’s probably linear in its poor performance.

However, there’s a solution. Use an index table. A separate table which contains foreign keys pointing to this table:

ircbrowse=> \d event_order_index
Table "public.event_order_index"
 Column |  Type   | Modifiers
 id     | integer | not null
 origin | integer | not null
 idx    | integer | not null
    "event_order_id_origin" UNIQUE CONSTRAINT, btree (id, origin)
    "event_order_idx" btree (id)
    "event_order_idx_idx" btree (idx)
    "event_order_origin_dx" btree (origin)

Now you can have a pagination index for channel 1:

ircbrowse=> select * from event_order_index where idx = 1000 limit 1;
 id | origin | idx
  1 |      1 | 1000

(I used idx=1000 for channel 1, 2000 for channel 2, etc. so that I would have space for other numerical indexes for the same channel.)

Now you can make a very efficient query for the same data as above:

ircbrowse=> SELECT,e.timestamp,,,
ircbrowse=> e.type,e.nick,e.text FROM event e,
ircbrowse-> event_order_index idx
ircbrowse-> WHERE = idx.origin and idx.idx = 1000 and
ircbrowse=> > 1000000 and < 1000030
ircbrowse-> ORDER BY asc
ircbrowse-> LIMIT 30;
Time: 1.001 ms

This is more or less constant time.

And you can see this in action on the site. Takes about 30ms to load and render the page if I run this on the server:

$ time curl ''

real	0m0.031s
user	0m0.000s
sys     0m0.004s

Of course, sending a request in your browser will take longer due to the connection overhead and assets, but generally the goal was for it to be very snappy. The old (by another individual, who kindly let me have the name) was very slow indeed. You’d see the page loading the data incrementally from the database.

Anyhoo, thought that was a decent, practical PostgreSQL-specific optimization regarding pagination. Hope it was worth writing up.

October 28, 2014 12:00 AM

October 27, 2014

Danny Gratzer

Notes on Focusing

Posted on October 27, 2014
Tags: types

I’ve been spending a lot of time whacking my head on focusing literature. I’d like to jot down some intuition around what a focused system is and how it relates to the rest of the world. I’m going to steer clear of actually proving things but I will point out where a proof would be needed.

What Is Focusing

In a nutshell, focusing is a strategy to create proofs that minimizes the amount of choices available at each step. Focusing is thus amenable to mechanization since a computer is very good at applying a lot of deterministic procedures but incredibly bad at nondeterministic choice.

Now when we set out to define a focused system we usually do something like

  1. Formalize our logical framework with natural deduction
  2. Translate our framework into a sequent calculus
  3. Transform our sequent calculus into a focused variant

At each of these steps there’s a proof that says something like “System 2 is sound and complete with respect to System 1”. We can then chain these proofs together to get that we can transform any nonfocused proof into a focused one (focalization) and the reverse (de-focalization).

In order to actually carry out these proofs there’s a fair amount of work and pain. Usually we’ll need something like cut elimination and/or identity expansion.


Now before we go on to define an example logic, let’s notice a few things. First off, in sequent calculus there are left and right rules. Left rules decompose known facts into other known facts while right rules transform our goal. There’s also an identity sequent which more or less just states

 A is an atom
   Γ, A → A

This is a bit boring though.

Now certain rules are invertible: their conclusion implies their premise in addition to the reverse. For example if I said you must prove A ∧ B clearly we’ll have to prove both A and B in order to prove A ∧ B; there’s no alternative set of rule applications that let us circumvent proving A and B.

This means that if we were mechanically trying to prove something of the form A ∧ B we can immediately apply the right rule that decomposes into 2 goals.

We can these sort of rules invertible or asynchronous. Dually, there are rules that when applied transform our goal into something impossible to prove. Consider ⊥ ∨ ⊤, clearly apply the rule that transforms this into would be a bad idea!

Now if we begin classifying all the left and write rules we’ll notice that the tend to all into 2 categories

  • Things with invertible left rules and noninvertible right rules
  • Things with noninvertible left rules and invertible right rules

We dub the first group “positive” things and the second “negative” things. This is called polarization and isn’t strictly necessary but greatly simplifies a lot of our system.

Now there are a few things that could be considered both positive and negative. For example we can consider as positive with

  Γ → A⁺  Γ → B⁺
   Γ → A⁺ ∧ B⁺

   Γ, A⁺, B⁺ → C
   Γ, A⁺ ∧ B⁺ → C

In this case, the key determiner for the polarity of ∧ comes from its subcomponents. We can just treat ∧ as positive along with its subcomponents and with an appropriate dual ∧⁻, our proof system will still be complete.

As a quick example, implication is negative. the right rule

 Γ, A → B
Γ → A ⊃ B

While its left rule isn’t

 Γ, A ⊃ B → A  Γ, B, A ⊃ B → C
         Γ, A ⊃ B → C

Since we could easily have something like ⊥ ⊃ ⊤ but using this rule would entail (heh) proving ! Urk. If our system applied this rules remorselessly, we’d quickly end up in a divergent proof search.

An Actual Focused System

Do note that these typing rules are straight out of Rob Simmons’ paper, linked below

Now that we’ve actually seen some examples of invertible rules and polarized connectives, let’s see how this all fits into a coherent system. There is one critical change we must make to the structure of our judgments: an addition to the form _ → _. Instead of just an unordered multiset on the left, in order to properly do inversion we change this to Γ; Ω ⊢ A where Ω is an ordered list of propositions we intend to focus on.

Furthermore, since we’re dealing with a polarized calculus, we occasionally want to view positive things as negative and vice versa. For this we have shifts, ↓ and ↑. When we’re focusing on some proposition and we reach a shift, we pop out of the focused portion of our judgment.

Our system is broken up into 3 essentially separate judgments. In this judgment we basically apply as many invertible rules as many places as we can.

 Γ, A⁻; Q ⊢ U
Γ; ↓A⁻, Q ⊢ U

Γ; A⁺, Ω ⊢ U  Γ; B+; Ω ⊢ U
    Γ; A⁺ ∨ B⁺, Ω ⊢ U

  Γ; A⁺, B⁺, Ω ⊢ U
  Γ; A⁺ ∧ B⁺, Ω ⊢ U

 Γ; ⊥, Ω ⊢ U

We first look at how to break down Ω into simpler forms. The idea is that we’re going to keep going till there’s nothing left in Ω. Ω can only contain positive propositions so eventually we’ll decompose everything to shifts (which we move into Γ) ⊤+ (which we just drop on the floor) or ⊥ (which means we’re done). These are all invertible rules to we can safely apply them eagerly and we won’t change the provability of our goal.

Once we’ve moved everything out of Ω we can make a choice. If U is “stable” meaning that we can’t break it down further easily, we can pick a something negative out of our context and focus on it

   Γ; [A⁻] ⊢ U
  Γ, A⁻; • ⊢ U

This pops us into the next judgment in our calculus. However, if U is not stable, then we have to decompose it further as well.

  Γ; • ⊢ A⁺
  Γ; • ⊢ ↑ A⁺

 Γ; • ⊢ ⊤⁻

  Γ; A⁺ ⊢ B⁻
Γ; • ⊢ A⁺ ⊃ B⁻

Γ; • ⊢ A⁻   Γ; • ⊢ B⁻
   Γ; • ⊢ A⁻ ∧ B⁻

If we have a negative connective at the top level we can decompose that further, leaving us with a strictly smaller goal. Finally, we may reach a positive proposition with nothing in Ω. In this case we focus on the right.

  Γ ⊢ [A⁺]
 Γ; • ⊢ A⁺

Now we’re in a position to discuss these two focused judgments. If we focus on the right we decompose positive connectives

 Γ ⊢ [⊤⁺]

Γ; • ⊢ A⁻
Γ ⊢ ↓ A⁻

   Γ ⊢ [A⁺]
 Γ ⊢ [A⁺ ∨ B⁺]

   Γ ⊢ [B⁺]
 Γ ⊢ [A⁺ ∨ B⁺]

Γ ⊢ [A⁺]   Γ ⊢ [B⁺]
   Γ ⊢ [A⁺ ∧ B⁺]

These judgments follow the ones we’ve already seen. If we encounter a shift, we stop focusing. Otherwise we decompose the topmost positive connective. Now looking at these, you should see that sometimes these rules we’ll lead us to a “mistake”. Imagine if we applied the 4th rule to ⊤ ∨ ⊥! This is why these rules are segregated into a separate judgment.

In this judgment’s dual we essentially apply the exact same rules to the left of the turnstile and on negative connectives.

  Γ; A⁺ ⊢ U
Γ; [↑A⁺] ⊢ U

Γ ⊢ [A⁺]   Γ; [B⁻] ⊢ U
  Γ; [A⁺ ⊃ B⁻] ⊢ U

   Γ; [A⁻] ⊢ U
 Γ; [A⁻ ∧ B⁻] ⊢ U

   Γ; [B⁻] ⊢ U
 Γ; [A⁻ ∧ B⁻] ⊢ U

That wraps up our focused system. The idea is now we have this much more limited system which can express the same things our original, unfocused system could. A computer can be easily programmed to do a focused search since there’s much less backtracking everywhere leading to fewer rules being applicable at each step. I think Pfenning has referred to this as removing most of the “don’t-care” nondeterminism from our rules.

Wrap Up

I’m going to wrap up the post here. Proving focalization or even something like cut elimination is quite fiddly and I have no desire at all to try to transcribe it (painfully) into markdown and get it wrong in the process.

Instead, now that you have some idea of what focusing is about, go read Rob Simmons’ paper. It provides a clear account of proving everything necessary prove a focused system is complete and sound with respect to its unfocused counterpart.


<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

October 27, 2014 12:00 AM

October 24, 2014

The GHC Team

GHC Weekly News - 2014/10/24

Hi *,

Welcome to the weekly GHC news. This one will be short this week, as the preceding one occurred only on Monday - but we'll be going with Fridays from now on, so next week we'll hopefully see a longer list.

  • GHC 7.8.4 tickets have been in waiting, and the RC will be soon after Austin finishes some final merges and tests on his branch. We have not committed a time for the release after the RC, yet we would like people to please seriously test and immediately report any major showstoppers - or alert us of ones we missed.
  • For the GHC 7.10 release, one of the major features we planned to try and merge was DWARF debugging information. This is actually a small component of larger ongoing work, including adding stack traces to Haskell executables. While, unfortunately, not all the work can be merged, we talked with Peter, and made a plan: our hope is to get Phab:D169 merged, which lays all the groundwork, followed by DWARF debugging information in the code generators. This will allow tools like gdb or other extensible debuggers to analyze C-- IR accurately for compiled executables. Peter has written up a wiki page, available at SourceNotes, describing the design. We hope to land all the core infrastructure in Phab:D169 soon, followed by DWARF information for the Native Code Generator, all for 7.10.1
  • This past week, a discussion sort of organically started on the #ghc IRC channel about the future of the LLVM backend. GHC's backend is buggy, has no control over LLVM versions, and breaks frequently with new versions. This all significantly impacts users, and relegates the backend to a second class citizen. After some discussion, Austin wrote up a proposal for a improved backend, and wrangled several other people to help. The current plan is to try an execute this by GHC 7.12, with the goal of making the LLVM backend Tier 1 for major supported platforms.
  • You may notice is now responds slightly faster in some cases - we've activated a caching layer (CloudFlare) on the site, so hopefully things should be a little more smooth.

Closed tickets this week: #9684, #9692, #9038, #9679, #9537, #1473.

by thoughtpolice at October 24, 2014 11:43 PM

Danny Gratzer

Update on Old Projects

Posted on October 24, 2014
Tags: haskell, personal

All though most people I talk to know me for my blog, I do occasionally actually write software instead of just talking about it :)

Sadly, as a mercurial user most of my stuff has languished with on bitbucket. I’ve had a few people tell me that this is annoying for various reasons. Yesterday, I finally got around to fixing that!

As of yesterday, all of my interesting projects are mirrored on [github][my-github]. I’m still using mercurial but thanks to the lovely git-hg tool this is not an issue. You can fork, pull-request, or generally peek and poke as you please. From my end all of these actions look like nice mercurial changesets so I can continue to live under my rock where I don’t need to understand Git.

As a quick list of what haskell code is up there now

Which I think includes every project I’ve blogged about here as well as a few others. Sorry it took so long!

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

October 24, 2014 12:00 AM

October 21, 2014

Tom Schrijvers

Postdoc Position in Functional and Constraint Programming

Postdoctoral position in Functional and Constraint Programming at KU Leuven

The Declarative Languages and Artificial Intelligence (DTAI) group of KU Leuven (Belgium) invites applicants for a postdoctoral position in the area of functional and constraint programming. The position revolves around domain-specific languages (DSLs) embedded in Haskell for constraint programming. It is part of the EU project GRACeFUL whose overarching theme is tools for collective decision making. The KU Leuven part of the project is under the direction of prof. Tom Schrijvers.

To apply you must hold a recent PhD (or be about to graduate) related to either functional or constraint programming. Experience in both areas is an advantage.

You will work closely with prof. Schrijvers and his PhD students at KU Leuven, as well as with the GRACeFUL project partners across Europe.

The position is for 3 years. The salary is competitive and the starting date negotiable (but no later than February 1). Moreover, KU Leuven's policy of equal opportunities and diversity applies to this position.

Application procedure:

by Tom Schrijvers ([email protected]) at October 21, 2014 08:08 AM

wren gayle romano

Upcoming talk

For all you local folks, I'll be giving a talk about my dissertation on November 5th at 4:00–5:00 in Ballantine Hall 011. For those who've heard me give talks about it before, not much has changed since NLCS 2013. But the majority of current CL/NLP, PL, and logic folks haven't seen the talk, so do feel free to stop by.

Abstract: Many natural languages allow scrambling of constituents, or so-called "free word order". However, most syntactic formalisms are designed for English first and foremost. They assume that word order is rigidly fixed, and consequently these formalisms cannot handle languages like Latin, German, Russian, or Japanese. In this talk I introduce a new calculus —the chiastic lambda-calculus— which allows us to capture both the freedoms and the restrictions of constituent scrambling in Japanese. In addition to capturing these syntactic facts about free word order, the chiastic lambda-calculus also captures semantic issues that arise in Japanese verbal morphology. Moreover, chiastic lambda-calculus can be used to capture numerous non-linguistic phenomena, such as: justifying notational shorthands in category theory, providing a strong type theory for programming languages with keyword-arguments, and exploring metatheoretical issues around the duality between procedures and values.

comment count unavailable comments

October 21, 2014 02:39 AM

October 20, 2014

The GHC Team

GHC Weekly News - 2014/10/20

Hi *,

It's been a few weeks since the last message - and I apologize! We actually are changing the posting time to be Friday now, so hopefully this situation will be corrected preeeeetty quickly from this point forward, and hopefully will give better context over the course of a weekly discussion.

That said, let's begin!

  • We've seen plenty of changes to GHC itself in the past few weeks. Some of the highlights include:
    • Some changes to help make Prelude combinators fuse better. David Feuer has been leading a lot of this work, and it's been quite fruitful, with several new things now fusing (like takeWhile, scanl, scanr, and mapAccumL.
    • Relatedly, Data.List.Inits should be far faster thanks to David Feuer (ref: Phab:D329).
    • The testsuite driver now has preliminary support for Python 3 - which should be useful for platforms out there that sport it, and ones that will use it as the default eventually (such as Fedora 22, possibly).
    • Some of the initial work by Edward Yang to remove HEAP_ALLOCED from the GHC runtime system has landed. Briefly, HEAP_ALLOCED is a check the RTS uses to determine if some address is part of the dynamic heap - but the check is a bit costly. Edward's full patchset hopes to remove this with an 8% speedup or so on average.
    • GHC now has a new macro, __GLASGOW_HASKELL_PATCHLEVEL__, which will allow you to determine the point-level release of the GHC you're using. This has been a requested feature in the past we were a little hesitant of adding, but Herbert went through and did it for us. (Ref: Phab:D66)
    • Template Haskell now supports LINE pragmas, thanks to Eric Mertens (ref: Phab:D299).
    • Sergei Trofimovich revived libbfd debugging support for the runtime system linker, which should be of use to several daring souls out there (ref: Phab:D193).
    • Several patches from Gintautas Miliauskas has improved the usability of msys and the testsuite on Windows - and he's not done yet!
    • A few improvements to the x86 code generator were committed by Reid Barton and Herbert Valerio Riedel, improving size/space for certain cases (ref: Phab:D320, Phab:D163).

and more besides that, including some linker improvements, and general cleanups as usual.

  • The mailing list has been busy (as usual), with some discussions including:
    • Austin posted some discussion about the tentative 7.10.1 plans - we're still hoping these are accurate, so take note! We hope to freeze mid-November, and release Feburary 2015! [1]
    • Austin also called for some feedback: GHC HQ has become convinced a 7.8.4 release is needed to fix some showstoppers - so please let him know soon if you're totally incapable of using 7.8 for something! [2]
    • Alan Zimmerman has asked for some feedback on his proposed "AST Annotations", which will hopefully allow GHC API clients to add richer annotations to GHC's syntactic representations. The motivation is for refactoring tools like HaRe - and I'm sure all input would be appreciated. [3]
    • Chris done sparked off a discussion about making GHCi awesomer, and I'm sure everyone can appreciate that! In particular, Chris wanted to discuss programmatic means of controlling GHCi itself, and naturally we need to ask - is the current API not enough, and why? [4]
    • Yuras Shumovich has implemented a proposal for allowing the Haskell FFI to support C structures natively as return values - this involves interfacing with C ABI rules to properly support structure layout. While Yuras has an initial implementation in Phab:D252, some questions about the feature - including its implementation complexity - remain before it gets merged. [5]
    • Richard Eisenberg made a modest request: can Phabricator patches have a 'roadmap', so people can instruct reviewers how to read a diff? The answer: yes, and it should be relatively easy to implement, and Austin can do so Real Soon Now™. [6]
    • Ben Gamari started a big discussion about one-shot event semantics in the I/O manager, with a lot of replies concerning not only the bug, but machines to test the actual change on. With any luck, Ben's fix for the I/O manager and a test machine should come quickly enough. [7]
    • Herbert Valerio Riedel opened an RFC: Should we look into using AsciiDoc for GHC? Historically, GHC's documentation has been written using DocBook, a verbose but very precise and unambiguous documentation format. However, AsciiDoc offers a much nicer markup language, while retaining DocBook support. In short, it looks like GHC may get a much more clean user manual soon. [8]
    • Yuras opened another discussion: Should we namespace proposals we create on our wiki? What seems uncontroversial can lead to surprising discussion, and the results were mixed this time it seems. [9]
    • Geoff Mainland stepped up and fixed Data Parallel Haskell to work with a new version of vector and GHC. Austin had disabled DPH a few weeks prior due to its difficulty to upgrade, and divergent source trees. With 7.10, GHC will hopefully ship a more modern vector and dph to boot.
    • Austin asks: can we warn on tabs by default for GHC 7.10? It's an easy change and a minor one - but we should at least ask first. Vote now! [10]
    • Philip Hölzenspies opens up a discussion about Uniques in GHC, and their impact on the compilers current design. Philip has a hopeful design to redo Unique values in GHC, and a patch to support it: Phab:D323. [11]
    • Richard Eisenberg asks: can we somehow integrate GitHub into our development process? While GitHub doesn't have as many nice tools as something like Phabricator, it has a very high inertia factor, and Richard is interested in making the 'first step' as easy as possible for newbies. Discussions about Phab<->GitHub integrations were afoot, as well as general discussion about contributor needs. There were a lot of points brought up, but the conversation has slightly dried up now - but will surely be revived again. [12]

And now, look at all these tickets we closed! Including: #9658, #9094, #9356, #9604, #9680, #9689, #9670, #9345, #9695, #9639, #9296, #9377, #9184, #9684.


by thoughtpolice at October 20, 2014 02:14 PM

Magnus Therning

Dijkstra quotes from EWD1284

I recently read through this long post entitles Object Oriented Programming is an expensive disaster which must end. I have to agree I largely agree with what he writes, but I don’t think I ever could have written such a well-researched article, and absolutely never one of equal length ;)

It does include some nice quotes and references and so far I’ve only read one of the many that I bookmarked, Computing Science: Achievements and Challenges (EWD1284). It does include a few quips that, based on other articles I’ve read, seem fairly typical to Dijkstra. I simply love the way he expressed his opinions at times.

This one really ought to have been in the lengthy post on the OOP disaster:

After more than 45 years in the field, I am still convinced that in computing, elegance is not a dispensable luxury but a quality that decides between success and failure; in this connection I gratefully quote from The Concise Oxford Dictionary a definition of “elegant”, viz. “ingeniously simple and effective”. Amen. (For those who have wondered: I don’t think object-oriented programming is a structuring paradigm that meets my standards of elegance.)

And his thoughts on formal methods are well-known of course, as are his thoughts on iterative design. However, I rather like how he expresses a certain level of disgust of the Anglo-Saxon world when writing about those topics:

The idea of a formal design discipline is often rejected on account of vague cultural/philosophical condemnations such as “stifling creativity”; this is more pronounced in the Anglo-Saxon world where a romantic vision of “the humanities” in fact idealizes technical incompetence. Another aspect of that same trait is the cult of iterative design.

It amazes me every time I read something by someone like Dijkstra, just how much things stay the same, even in a field like computing science, which is said to be moving so fast.

October 20, 2014 12:00 AM

FP Complete

New Stackage features

We have two new updates to Stackage: providing cabal.config files and including Haddock documentation.

Haddock documentation on snapshots

Now all new exclusive snapshots will have haddock links, which you can access via the following steps:

That link will be to an index page like this from which you can view documentation of all packages included in the snapshot. This means you can generally view everything in one place, on one high availability service.

Using Stackage without changing your repo

The recommended way to use Stackage is to simply change your remote-repo field in your .cabal/config file and run cabal update. Henceforth your dependencies will be resolved from Stackage, which is backed by high availability Amazon S3 storage servers, and you will have successful build plans, compilation and passing tests. Hurrah!

Try Haskell and the upcoming homepage were both developed with Stackage. This meant I could just specify the Stackage snapshot to use in the README and then the next time I want to upgrade I can just update the snapshot version to get a fresh working build plan.

The issue some people are facing is that they cannot change this remote-repo field, either because they're using a cabal sandbox, which doesn't support this yet, or because they just don't want to.

The solution to this, in my experience, has been for me to manually go and run cabal freeze in a project I've already built to get the cabal.config file and then give these people that file.

Now, it's automated via a cabal.config link on snapshot pages:

For new developers working on an application who want to use Stackage, they can do something like this:

$ cabal sandbox init
$ curl<stackage version>/cabal.config > cabal.config
$ cabal install --only-dep

Which will install their dependencies from Hackage. We can't guarantee this will always work -- as Stackage snapshots sometimes will have a manual patch in the package to make it properly build with other packages, but otherwise it's as good as using Stackage as a repo.

A cabal freeze output in cabal.config will contain base and similar packages which are tied to the minor GHC version (e.g. GHC 7.8.2 vs GHC 7.8.3 have different base numbers), so if you get a cabal.config and you get a dependencies error about base, you probably need to open up the cabal.config and remove the line with the base constraint. Stackage snapshots as used as repos do not have this constraint (it will use whichever base your GHC major version uses).

Another difference is that cabal.config is more like an “inclusive” Stackage snapshot -- it includes packages not known to build together, unlike “exclusive” snapshots which only contain packages known to build and pass tests together. Ideally every package you need to use (directly or indirectly) will come from an exclusive snapshot. If not, it's recommended that you submit the package name to Stackage, and otherwise inclusive snapshots or cabal.config are the fallbacks you have at your disposal.

October 20, 2014 12:00 AM

October 19, 2014

Neil Mitchell

HLint now spots bad unsafePerformIO

Summary: I've just released a new version of HLint that can spot an unsafePerformIO which should have NOINLINE but doesn't.

I've just released HLint v1.9.10, a tool to suggest improvements to Haskell code. I don't usually bother with release announcements of HLint, as each version just fixes a few things and adds a few hints, it's all in the changelog (plus there have now been 102 releases). The latest release attempts to make using unsafePerformIO a little bit safer. A common idiom for top-level mutable state in Haskell is:

myGlobalVar :: IORef Int
myGlobalVar = unsafePerformIO (newIORef 17)

That is, define a top-level CAF (function with no variables) and bind it to unsafePerformIO of some created mutable state. But the definition above is unsafe. GHC might decide myGlobalVar is cheap and inline it into several uses, duplicating the variable so that some functions update different versions. Running this code through the latest version of HLint we see:

Sample.hs:2:1: Error: Missing NOINLINE pragma
myGlobalVar = unsafePerformIO (newIORef 17)
Why not:
{-# NOINLINE myGlobalVar #-}
myGlobalVar = unsafePerformIO (newIORef 17)

HLint has spotted the problem, and suggested the correct fix.

Trying it for real

Let's take the package slave-thread-0.1.0 and run HLint on it. Slave thread is a new package that helps you ensure you don't end up with ghost threads or exceptions being thrown but missed - a useful package. Running HLint we see:

Sample.hs:19:1: Error: Missing NOINLINE pragma
slaves = unsafePerformIO $ Multimap.newIO
Why not:
{-# NOINLINE slaves #-}
slaves = unsafePerformIO $ Multimap.newIO

Sample.hs:20:3: Warning: Redundant $
unsafePerformIO $ Multimap.newIO
Why not:
unsafePerformIO Multimap.newIO

Sample.hs:25:1: Error: Eta reduce
fork main = forkFinally (return ()) main
Why not:
fork = forkFinally (return ())

Sample.hs:55:28: Warning: Redundant $
PartialHandler.totalizeRethrowingTo_ masterThread $ mempty
Why not:
PartialHandler.totalizeRethrowingTo_ masterThread mempty

Sample.hs:72:5: Error: Use unless
if null then return () else retry
Why not:
Control.Monad.unless null retry

HLint has managed to spot the missing NOINLINE pragma, and also a handful of tiny cleanups that might make the code a little more readable. Fortunately (and independent of HLint), the NOINLINE pragma was added in slave-thread-0.1.1, so the library no longer suffers from that bug.

by Neil Mitchell ([email protected]) at October 19, 2014 09:23 PM

Matthew Sackman

Anonymity in public life

In an article published in the Guardian yesterday, author Kathleen Hale recounts how her first book got some negative reviews by reviewers on a book review website. One reviewer in particular upset her and Kathleen ends up figuring out the reviewer is using a false identity, finds out who the reviewer really is and confronts her. The piece doesn't read to me like some sort of valedictory "I outed a fraud" type piece (though there are some passages in there which are questionable in that direction) and equally there are several passages where Kathleen expresses deep embarrassment and regret for the course of action she took. This episode, and that article in particular has caused substantial reaction: currently 600 comments on the Guardian article plus several other blog posts. There's no shortage of opinion to be found on Twitter either, as you'd expect.

The course of action that Kathleen took seems to be fairly undisputed as far as I can find. There is some dispute from some of the other blog posts as to exactly what was tweeted and said by whom, and there is dispute over Kathleen's claim that there are factual inaccuracies made in a review of her book. It is not disputed that the reviewer was using a false identity and that the reviewer had at least public Twitter, Facebook, and Instagram accounts under the false identity. The false identity was also a real name (Blythe Harris), by which I mean a name which if you introduced yourself by that name, no one would think you're using a false identity. This is distinct from claiming to be Peter Rabbit, or Buzz Lightyear.

Many people have equated Kathleen's actions with stalking. My dictionary defines the verb to stalk as:

  1. to follow or approach (game, prey, etc.) stealthily and quietly
  2. to pursue persistently and, sometimes, attack (a person with whom one is obsessed, often a celebrity)
  3. , 4,... [not relevant]

The second item there certainly fits. The British legal approach, whilst it gives no strict definition gives examples and guidance:

....following a person, watching or spying on them or forcing contact with the victim through any means, including social media.

The effect of such behaviour is to curtail a victim's freedom, leaving them feeling that they constantly have to be careful. In many cases, the conduct might appear innocent (if it were to be taken in isolation), but when carried out repeatedly so as to amount to a course of conduct, it may then cause significant alarm, harassment or distress to the victim.

I'm glad it includes "social media" there. Some comments have suggested that stalking "in real life" is worse than online. This seems bizarre to me: as if through a computer you are not interacting with other human beings but merely with shiny pixels who have no emotional capacity. "In real life" is everything we know. Whilst we're alive we have no personal experience of anything other than "in real life".

So I'm fairly sold on the whole argument that Kathleen's behaviour towards this reviewer can be considered stalking and as such is reprehensible.

To me, the far more interesting issue is the use of anonymity, false identities and any realistic expectation we have of privacy on the internet. A number of people who claim to write book reviews on such sites have suggested that the behaviour of Kathleen is exactly why they write their reviews under false names. I think there's something of a contradiction going on here.

But let's work backwards. Firstly, Kathleen, through some social engineering (she requested from the book review site the address of the reviewer so that she could post her a copy of the book) got the address of the book reviewer. She then used a telephone directory and census results to identify who really lived there (or likely owned the land). Now the use of the telephone directory seems a bit odd to me: telephony directories map names to numbers (and maybe addresses). Yes, you could use it to map an address to a name but it's very inefficient: you're essentially searching through the whole directory looking for the address whilst the directory is sorted by name, not address. So unless it was a very small telephone directory, I don't really buy that. Using census results is far more creditable: they're public documents and when they're online, they do allow you to search by address. In the UK you can only get access to the raw census details 100 years after the census has been published which, to a high probability, rules it out as a means to tie an address to a person who's still alive. You can get statistics and aggregates from more recent census results but you can't get the raw data. I'm assuming that in the US there's no such restriction on access to raw census data. If there is then I don't understand how Kathleen really managed to get a name for the owner of the property.

Instead, in the UK, if you want to find out who owns some land, you can pay the land registry £3 and they'll tell you. Presumably there are means by which you can legally hide this; I'm sure the rich have figured this out - probably some method by which some fake company in a tax haven technically "owns" the land and as they're registered abroad, they don't have to divulge any further details about that company. So yes, you could argue the Land Registry is profiting from facilitating stalkers, but equally there are a bunch of legitimate reasons to need access to such data and I can't think of any sane way to ensure the use of such a service isn't abused. So from that I conclude that unless the owner is a millionaire, the owner of any land is public knowledge.

The use of social engineering to get the address in the first place is more interesting but very obvious. This sort of thing happens a lot and sometimes to horrifying consequences (e.g. the Australian DJs who phoned up a hospital, pretending to be the Queen and Prince of Wales, enquiring as to the health of the Duchess of Cambridge. The nurse fell for the hoax and put the call through. Three days later, the nurse committed suicide). As a species we are not good at taking the time to verify who we're talking to or why. Whilst (hopefully) most of us would hang up if our bank apparently rang us and then asked for our credit card details "for security" this is largely only because it's in the bank's interest (in terms of cost of insurance) to reduce fraud, so they've trained us as such. But in all sorts of other scenarios we implicitly trust people we've no real reason to. A simple example: ticket inspectors on public transport. They may be wearing the uniform, but it could be faked. With their travel-card readers they could be seeing who has the expensive yearly travel cards, scanning the unique numbers from them and then using them to program up fraudulent cards. The crypto on those things is notoriously weak. Has anyone ever requested some means to verify the identity of a ticket inspector? And even if you could, how do you know they're not crooked regardless?

So phoning someone up, impersonating someone else, or pretending to have valid reasons to request the information you're requesting is always likely to work. It might be illegal in some cases, but it's certainly human nature to try to be helpful and if you're given a plausible justification, on what basis could you refuse the request unless it's contrary to some sort of company policy? In this case, if you're concerned about anonymity, wouldn't you be concerned about this possibility, and make use of an anonymous mail box?

Article 8 of the European Convention on Human Rights guarantees an individual's right to respect for privacy and family life, including correspondence. Is privacy the same as anonymity? No, definitely not:

In conflating anonymity and privacy, we have failed to see an important factual difference between them: under the condition of privacy, we have knowledge of a person’s identity, but not of an associated personal fact; whereas under the condition of anonymity, we have knowledge of a personal fact, but not of the associated person’s identity

The vast violations of our lives by state surveillance as revealed by Snowdon over the last year demonstrates the whole-scale collation of everything we do online and off by our governments. This is both being able to observe an action and identify the individual who caused it (thus we have no hope of performing any action anonymously), and being able to observe an individual and know the actions they take (thus no privacy). I can't work out whether the ECHR has anything to say on a right to anonymity; I get the sense that it doesn't try to protect that. So that's basically saying: "the state shouldn't record your every move (as that's an invasion of privacy), but moves that we're interested in, we can know who did them". Of course, we now know they're recording everything anyway.

We also know that computer systems can always be hacked into - there is no real security anywhere. Given a skilled and sufficiently funded adversary, any computer system connected in any way to the internet can be hacked into. Why? Because humans wrote the software that runs on those computers and humans are incapable of writing bug-free software. Look at all the large scale data breaches in recent history. Nothing is secure.

So we have laws that seem to try and protect privacy, but they're violated by our own governments, and in any case, we have countless examples of our inability to store any information securely. So is there really any hope to be able to exist with anonymity on the internet?

As ever, it depends who your adversary is. If your adversary is a government (either your own or some foreign government) then no, you have no hope. If it's a previous partner of yours who has no particular computer training, then yes, you're probably going to have a reasonable chance of being anonymous for a while. But you need to read up on this and think hard: it's not a trivial undertaking. There are some good guides as to how to do this, but:

All writers - whether writing under their own names or not - should be aware of the risks they may incur by hitting 'publish'.

What is the effect of hitting "publish"? It's to put more data points out there which may lead people to be able to identify you. The fewer data points out there, the better. So coming back to our book reviewer, if you want to review books anonymously, and if your justification for acting anonymously is to avoid being stalked by authors who don't like your reviews, then why put so many data points out there? Why have the Facebook page, the Instagram profile with the faked photos, the Twitter account? Why give your real postal address to the book review club knowing they're going to post books to it and might conceivably give your address out to other people?

The social media accounts in particular I find most odd. If you want to review books then review books. Build your following, your reputation and cachet on the quality of your reviews. If I'm looking at a book review I really don't care where you went on holiday, what your tweets are, or how many pets you have. Putting that information out there undermines your entire justification for being anonymous: if you want to be anonymous (i.e. you don't want people to find out who you are) then why are you putting so much unnecessary information out there that may allow people to figure out who you are?

Equally, use a name that clearly communicates to me you're trying to be anonymous: call yourself TheBookReviewer53, DostoyevskyLover or OrwellWasRight. Doing so doesn't lessen the validity of your opinions on your chosen subject and is more honest with people reading your reviews: it's overtly saying "I have reasons to want to exist anonymously on the internet". It reveals nothing more about your real identity either: regardless of the obvious fictitious-ness of your online persona, if you can be found, you can be found.

Researchers show that four data points about a person’s location can identify that person with 95% accuracy. FOUR. You think you can tweet anonymously from your phone? You think apps like Whisper allow you to act anonymously? As with pretty much everything related to the internet and computing, unless you've spent the last 20 years of your life working with computers, studying computers and thinking very hard about threat models and what data you're putting out there, and are utterly paranoid, you basically haven't got a chance. Do you turn off wifi on your phone when you leave the house? You should. You trust that USB pen drive you're transferring documents on? You shouldn't.

Finally and most obviously, any attempt at anonymity clearly doesn't insulate you from the law. As members of various hacking groups such as lulzsec found out, you always can be found out by law enforcement agencies. Yes, you might be able to make it difficult for a poorly funded person to come after you for libel (which is really just an indictment of the disgusting relationship between justice and money) but it's quite a risk to take. If you wouldn't put it in print with your real name attached, you're placing an awful lot of trust on your ability to maintain your anonymity against an adversary you probably don't know as well as you need to.

October 19, 2014 03:00 PM


Quasi-quoting DSLs for free

Suppose you are writing a compiler for some programming language or DSL. If you are doing source to source transformations in your compiler, perhaps as part of an optimization pass, you will need to construct and deconstruct bits of abstract syntax. It would be very convenient if we could write that abstract syntax using the syntax of your language. In this blog post we show how you can reuse your existing compiler infrastructure to make this possible by writing a quasi-quoter with support for metavariables. As we will see, a key insight is that we can reuse object variables as meta variables.

Toy Language “Imp

For the sake of this blog post we will be working with a toy language called Imp. The abstract syntax for Imp is defined by

type VarName = String

data Expr =
    Var VarName
  | Add Expr Expr
  | Sub Expr Expr
  | Int Integer
  | Read
  deriving (Data, Typeable, Show, Eq)

data Cmd =
    Write Expr
  | Assign VarName Expr
  | Decl VarName
  deriving (Data, Typeable, Show)

data Prog = Prog [Cmd]
  deriving (Data, Typeable, Show)

and we will assume that we have some parsec parsers

parseExpr :: Parser Expr
parseProg :: Parser Prog

We will also make use of

topLevel :: Parser a -> Parser a
topLevel p = whiteSpace *> p <* eof

and the following useful combinator for running a parser:

parseIO :: Parser a -> String -> IO a

The details of these parsers are beyond the scope of this post. There are plenty of parsec tutorials online; for instance, you could start with the parsec chapter in Real World Haskell. Moreover, the full code for this blog post, including a simple interpreter for the language, is available on github if you want to play with it. Here is a simple example of an Imp program:

var x ;
x := read ;
write (x + x + 1)

A simple quasi-quoter

We want to be able to write something like

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + 1)

where the intention is that the [prog| ... |] quasi-quote will expand to something like

prog1 = Prog [ 
      Decl "x"
    , Assign "x" Read
    , Write (Add (Add (Var "x") (Var "x")) (Int 1))

To achieve this, we have to write a quasi-quoter. A quasi-quoter is an instance of the following data type:

data QuasiQuoter = QuasiQuoter { 
    quoteExp  :: String -> Q Exp
  , quotePat  :: String -> Q Pat
  , quoteType :: String -> Q Type
  , quoteDec  :: String -> Q [Dec] 

The different fields are used when using the quasi-quoter in different places in your Haskell program: at a position where we expect a (Haskell) expression, a pattern (we will see an example of that later), a type or a declaration; we will not consider the latter two at all in this blog post.

In order to make the above example (prog1) work, we need to implement quoteExp but we can leave the other fields undefined:

prog :: QuasiQuoter
prog = QuasiQuoter {
      quoteExp = \str -> do
        l <- location'
        c <- runIO $ parseIO (setPosition l *> topLevel parseProg) str
        dataToExpQ (const Nothing) c
    , quotePat  = undefined
    , quoteType = undefined
    , quoteDec  = undefined

Let’s see what’s going on here. The quasi-quoter gets as argument the string in the quasi-quote brackets, and must return a Haskell expression in the Template-Haskell Q monad. This monad supports, amongst other things, getting the current location in the Haskell file. It also supports IO.


The first thing that we do is find the current location in the Haskell source file and convert it to parsec format:

location' :: Q SourcePos
location' = aux <$> location
    aux :: Loc -> SourcePos
    aux loc = uncurry (newPos (loc_filename loc)) (loc_start loc)

Running the parser

Once we have the location we then parse the input string to a term in our abstract syntax (something of type Prog). We use parsec’s setPosition to tell parsec where we are in the Haskell source file, so that if we make a mistake such as

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + )

we get an error that points to the correct location in our Haskell file:

    Exception when trying to run compile-time code:
      user error ("TestQQAST.hs" (line 9, column 20):
unexpected ")"
expecting "(", "read", identifier or integer)

Converting to Haskell abstract syntax

The parser returns something of type Prog, but we want something of type Exp; Exp is defined in Template Haskell and reifies the abstract syntax of Haskell. For example, we would have to translate the Imp abstract syntax term

Var "x" :: Prog

to its reflection as a piece of abstract Haskell syntax as

AppE (ConE 'Var) (LitE (StringL "x")) :: TH.Exp

which, when spliced into the Haskell source, yields the original Prog value. Fortunately, we don’t have to write this translation by hand, but we can make use of the following Template Haskell function:

dataToExpQ :: Data a 
           => (forall b. Data b => b -> Maybe (Q Exp)) 
           -> a -> Q Exp

This function can translate any term to a reified Haskell expression, as long as the type of the term derives Data (Data instances can be auto-derived by ghc if you enable the DeriveDataTypeable language extension). The first argument allows you to override the behaviour of the function for specific cases; we will see an example use case in the next section. In our quasi-quoter so far we don’t want to override anything, so we pass a function that always returns Nothing.

Once we have defined this quasi-quoter we can write

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + 1)

and ghc will run our quasi-quoter and splice in the Haskell expresion corresponding to the abstract syntax tree of this program (provided that we enable the QuasiQuotes language extension).


Consider this function:

prog2 :: VarName -> Integer -> Prog
prog2 y n = [prog|
    var x ;
    x := read ;
    write (x + y + n)

As mentioned, in the source code for this blog post we also have an interpreter for the language. What happens if we try to run (prog2 "x" 1)?

*Main> intIO $ intProg (prog2 "x" 2)
*** Exception: user error (Unbound variable "y")

Indeed, when we look at the syntax tree that got spliced in for prog2 we see

Prog [ Decl "x"
     , Assign "x" Read
     , Write (Add (Add (Var "x") (Var "y")) (Var "n"))

What happened? Didn’t we pass in "x" as the argument y? Actually, on second thought, this makes perfect sense: this is after all what our string parses to. The fact that y and n also happen to Haskell variables, and happen to be in scope at the point of the quasi-quote, is really irrelevant. But we would still like prog2 to do what we expected it to do.

Meta-variables in Template Haskell

To do that, we have to support meta variables: variables from the “meta” language (Haskell) instead of the object language (Imp). Template Haskell supports this out of the box. For example, we can define

ex :: Lift a => a -> Q Exp
ex x = [| id x |]

Given any argument that supports Lift, ex constructs a piece of abstract Haskell syntax which corresponds to the application of the identity function to x. (Don’t confuse this with anti-quotation; see Brief Intro to Quasi-Quotation.) Lift is a type class with a single method

class Lift t where
  lift :: t -> Q Exp

For example, here is the instance for Integer:

instance Lift Integer where
  lift x = return (LitE (IntegerL x))

Meta-variables in quasi-quotes

Quasi-quotes don’t have automatic support for meta-variables. This makes sense: Template Haskell is for quoting Haskell so it has a specific concrete syntax to work with, where as quasi-quotes are for arbitrary custom syntaxes and so we have to decide what the syntax and behaviour of meta-variables is going to be.

For Imp we want to translate any unbound Imp (object-level) variable in the quasi-quote to a reference to a Haskell (meta-level) variable. To do that, we will introduce a similar type class to Lift:

class ToExpr a where
  toExpr :: a -> Expr

and provide instances for variables and integers:

instance ToExpr VarName where
  toExpr = Var

instance ToExpr Integer where
  toExpr = Int

We will also need to know which variables in an Imp program are bound and unbound; in the source code you will find a function which returns the set of free variables in an Imp program:

fvProg :: Prog -> Set VarName

Overriding the behaviour of dataToExpQ

In the previous section we mentioned that rather than doing the Prog -> Q Exp transformation by hand we use the generic function dataToExpQ to do it for us. However, now we want to override the behaviour of this function for the specific case of unbound Imp variables, which we want to translate to Haskell variables.

Recall that dataToExpQ has type

dataToExpQ :: Data a 
           => (forall b. Data b => b -> Maybe (Q Exp)) 
           -> a -> Q Exp

This is a rank-2 type: the first argument to dataToExpQ must itself be polymorphic in b: it must work on any type b that derives Data. So far we have been passing in

const Nothing

which is obviously polymorphic in b since it completely ignores its argument. But how do we do something more interesting? Data and its associated combinators come from a generic programming library called Scrap Your Boilerplate (Data.Generics). A full discussion of of SYB is beyond the scope of this blog post; the SYB papers are a good starting point if you would like to know more (I would recommend reading them in chronological order, the first published paper first). For the sake of what we are trying to do it suffices to know about the existence of the following combinator:

extQ :: (Typeable a, Typeable b) => (a -> q) -> (b -> q) -> a -> q

Given a polymorphic query (forall a)—in our case this is const NothingextQ allows to extend the query with a type specific case (for a specific type b). We will use this to give a specific case for Expr: when we see a free variable in an expression we translate it to an application of toExpr to a Haskell variable with the same name:

metaExp :: Set VarName -> Expr -> Maybe ExpQ
metaExp fvs (Var x) | x `Set.member` fvs = 
  Just [| toExpr $(varE (mkName x)) |]
metaExp _ _ = 

The improved quasi-quoter

With this in hand we can define our improved quasi-quoter:

prog :: QuasiQuoter
prog = QuasiQuoter {
      quoteExp = \str -> do
        l <- location'
        c <- runIO $ parseIO (setPosition l *> topLevel parseProg) str
        dataToExpQ (const Nothing `extQ` metaExp (fvProg c)) c
    , quotePat  = undefined
    , quoteType = undefined
    , quoteDec  = undefined

Note that we are extending the query for Expr, not for Prog; dataToExpQ (or, more accurately, SYB) makes sure that this extension is applied at all the right places. Running (prog2 "x" 2) now has the expected behaviour:

*Main> intIO $ intProg (prog2 "x" 2)

Indeed, when we have a variable in our code that is unbound both in Imp and in Haskell, we now get a Haskell type error:

prog2 :: VarName -> Integer -> Prog
prog2 y n = [prog|
    var x ;
    x := read ;
    write (x + z + n)


TestQQAST.hs:15:19: Not in scope: ‘z’

Parenthetical remark: it is a design decision whether or not we want to allow local binding sites in a splice to “capture” meta-variables. Put another way, when we pass in "x" to prog2, do we mean the x that is bound locally in prog2, or do we mean a different x? Certainly a case can be made that we should not be able to refer to the locally bound x at all—after all, it’s not bound outside of the snippet! This is an orthogonal concern however and we will not discuss it any further in this blog post.

Quasi-quoting patterns

We can also use quasi-quoting to support patterns. This enables us to write something like

optimize :: Expr -> Expr
optimize [expr| a + n - m |] | n == m = optimize a
optimize other = other

As before, the occurrence of a in this pattern is free, and we intend it to correspond to a Haskell variable, not an Imp variable; the above code should correspond to

optimize (Sub (Add a n) m) | n == m = optimize a

(note that this is comparing Exprs for equality, hence the need for Expr to derive Eq). We did not mean the pattern

optimize (Sub (Add (Var "a") (Var "n")) (Var "m"))

To achieve this, we can define a quasi-quoter for Expr that supports patterns (as well as expressions):

expr :: QuasiQuoter
expr = QuasiQuoter {
      quoteExp  = \str -> do
        l <- location'
        e <- runIO $ parseIO (setPosition l *> topLevel parseExpr) str
        dataToExpQ (const Nothing `extQ` metaExp (fvExpr e)) e
    , quotePat  = \str -> do
        l <- location'
        e <- runIO $ parseIO (setPosition l *> topLevel parseExpr) str
        dataToPatQ (const Nothing `extQ` metaPat (fvExpr e)) e
    , quoteType = undefined
    , quoteDec  = undefined

The implementation of quotePat is very similar to the definition of quoteExp. The only difference is that we use dataToPatQ instead of dataToExpQ to generate a Haskell pattern rather than a Haskell expression, and we use metaPat to give a type specific case which translates free Imp variables to Haskell pattern variables:

metaPat :: Set VarName -> Expr -> Maybe PatQ
metaPat fvs (Var x) | x `Set.member` fvs = Just (varP (mkName x))
metaPat _ _ = Nothing

Note that there is no need to lift now; the Haskell variable will be bound to whatever matches in the expression.


We might be tempted to also add support for Prog patterns. While that is certainly possible, it’s of limited use if we follow the same strategy that we followed for expressions. For instance, we would not be able to write something like

opt [prog| var x ; c |] | x `Set.notMember` fvProg c = opt c

The intention here is that we can remove unused variables. Unfortunately, this will not work because this will cause a parse error: the syntax for Imp does not allow for variables for commands, and hence we also don’t allow for meta-variables at this point. This is important to remember:

By using object-level variables as stand-ins for meta-level variables, we only allow for meta-level variables where the syntax for the object-level language allows variables.

If this is too restrictive, we need to add special support in the ADT and in the corresponding parsers for meta-variables. This is a trade-off in increased expressiveness of the quasi-quotes against additional complexity in their implementation (new parsers, new ADTs).


By reusing object-level variables as stand-ins for meta variables you can reuse existing parsers and ADTs to define quasi-quoters. Using the approach described in this blog we were able to add support for quasi-quoting to a real compiler for a domain specific programming language with a minimum of effort. The implementation is very similar to what we have shown above, except that we also dealt with renaming (so that meta variables cannot be captured by binding sites in the quasi quotes) and type checking (reusing the existing renamer and type checker, of course).

by edsko at October 19, 2014 02:01 PM

October 17, 2014

Erik de Castro Lopo

Haskell : A neat trick for GHCi

Just found a really nice little hack that makes working in the GHC interactive REPL a little easier and more convenient. First of all, I added the following line to my ~/.ghci file.


All that line does is define a GHC_INTERACTIVE pre-processor symbol.

Then in a file that I want to load into the REPL, I need to add this to the top of the file:

  {-# LANGUAGE CPP #-}

and then in the file I can do things like:

  import Data.Aeson.Encode.Pretty

  prettyPrint :: Value -> IO ()
  prettyPrint = LBS.putStrLn . encodePretty

In this particular case, I'm working with some relatively large chunks of JSON and its useful to be able to pretty print them when I'm the REPL, but I have no need for that function when I compile that module into my project.

October 17, 2014 10:16 PM

Neil Mitchell

Fixing Haddock docs on Hackage

Summary: A few weeks ago Hackage stopped generating docs. I have a script that generates the docs, and also fixes some Haddock bugs.

Update: The Haddock documentation generators are running once again, but may be somewhat behind for a little while. A few weeks ago Hackage stopped generating documentation, so if you look at recently uploaded pages they tend to either lack docs, or have very alert maintainers who did a manual upload. I've packaged up my solution, which also fixes some pretty annoying Haddock bugs. Given that I can now get docs faster and better with my script, I'll probably keep using it even after Haddock on Hackage gets restarted.

How to use it

  • You are likely to get better results if you always install the packages you use with documentation.
  • Ensure you have tar, curl, cp and cabal on your $PATH.
  • cabal update && cabal install neil
  • Make a release, don't touch any other code, then make sure you are in the project directory.
  • neil docs --username=YourHackageUsername
  • Type in your Hackage password at the prompt.

And like that, your docs are online. To see an example of something that was generated with this process, look at Shake.

What I fixed

I automated the process using scripts originally taken from the lens library, supplemented with suggestions from Reddit. I then do a number of manual fixups.

  • Haddock now makes cross-module links where it doesn't know what the target is default to types. Concretely, if I write 'Development.Shake.need' in Haddock it generates a link to #t:need, which doesn't exist, when it should be #v:need - I fix that.
  • Update: fixed in Haddock 1.14 or above, as per this ticket.
  • On Windows, if you use CPP and multiline bird-tick (>) Haddock blocks you get a blank line between each line. I fix that.
  • If you follow some of the simpler scripts links outside your package won't work (or at least, didn't for me). I fix that.

The neil tool

The neil tool is my personal set of handy Haskell scripts. I make all my releases with it (neil sdist), and do lots of checks that my packages conform to my rules (neil check). I also use it for driving my Travis instances. It's in fairly regular flux. Until now, I've always kept it in Darcs/Git and never released it - it's personal stuff tuned to how I work.

You might also notice that neil provides a library. Don't use that, I intend to delete it in a few weeks. Update: library removed.

by Neil Mitchell ([email protected]) at October 17, 2014 08:49 PM

Danny Gratzer

Notes on Quotients Types

Posted on October 17, 2014
Tags: types

Lately I’ve been reading a lot of type theory literature. In effort to help my future self, I’m going to jot down a few thoughts on quotient types, the subject of some recent google-fu.

But Why!

The problem quotient types are aimed at solving is actually a very common one. I’m sure at some point or another you’ve used a piece of data you’ve wanted to compare for equality. Additionally, that data properly needed some work to determine whether it was equal to another piece.

A simple example might would be representing rational numbers. A rational number is a fraction of two integers, so let’s just say

    type Rational = (Integer, Integer)

Now all is well, we can define a Num instance and what not. But what about equality? Clearly we want equivalent fractions to be equal. That should mean that (2, 4) = (1, 2) since they both represent the same number.

Now our implementation has a sticky point, clearly this isn’t the case on its own! What we really want to say is “(2, 4) = (1, 2) up to trivial rejiggering”.

Haskell’s own Rational type solves this by not exposing a raw tuple. It still exists under the hood, but we only expose smart constructors that will reduce our fractions as far as possible.

This is displeasing from a dependently typed setting however, we want to be able to formally prove the equality of some things. This “equality modulo normalization” leaves us with a choice. Either we can really provide a function which is essentially

    foo : (a b : Rational)
        -> Either (reduce a = reduce b) (reduce a /= reduce b)

This doesn’t really help us though, there’s no way to express that a should be observationally equivalent to b. This is a problem seemingly as old as dependent types: How can we have a simple representation of equality that captures all the structure we want and none that we don’t.

Hiding away the representation of rationals certainly buys us something, we can use a smart constructor to ensure things are normalized. From there we could potentially prove a (difficult) theorem which essentially states that

    =-with-norm : (a b c d : Integer)
                -> a * d = b * c -> mkRat a b = mkRat c d

This still leaves us with some woes however, now a lot of computations become difficult to talk about since we’ve lost the helpful notion that denominator o mkRat a = id and similar. The lack of transparency shifts a lot of the burden of proof onto the code privy to the internal representation of the type, the only place where we know enough to prove such things.

Really what we want to say is “Hey, just forget about a bit of the structure of this type and just consider things to be identical up to R”. Where R is some equivalence relation, eg

  1. a R a
  2. a R b implies b R a
  3. a R b and b R c implies a R c

If you’re a mathematician, this should sound similar. It’s a lot like how we can take a set and partition it into equivalence classes. This operation is sometimes called “quotienting a set”.

For our example above, we really mean that our rational is a type quotiented by the relation (a, b) R (c, d) iff a * c = b * d.

Some other things that could potentially use quotienting

  • Sets
  • Maps
  • Integers
  • Lots of Abstract Types

Basically anything where we want to hide some of the implementation details that are irrelevant for their behavior.

More than Handwaving

Now that I’ve spent some time essentially waving my hand about quotient types what are they? Clearly we need a rule that goes something like

 Γ ⊢ A type, E is an equivalence relation on A
        Γ ⊢ A // E type

Along with the typing rule

    Γ ⊢ a : A
  Γ ⊢ a : A // E

So all members of the original type belong to the quotiented type, and finally

  Γ ⊢ a : A, Γ ⊢ b : A, Γ ⊢ a E b
         Γ ⊢ a ≡ b : A // E

Notice something important here, that is the fancy shmancy judgmental equality baked right into the language. This calls into question decidability. It seems that a E b could involve some non-trivial proof terms.

More than that, in a constructive, proof relevant setting things can be a bit trickier than they seem. We can’t just define a quotient to be the same type with a different equivalence relation, since that would imply some icky things.

To illustrate this problem, imagine we have a predicate P on a type A where a E b implies P a ⇔ P b. If we just redefine the equivalence relation on quotes, P would not be a wellformed predicate on A // E, since a ≡ b : A // E doesn’t mean that P a ≡ P b. This would be unfortunate.

Clearly some subtler treatment of this is needed. To that end I found this paper discussing some of the handling of NuRPL’s quotients enlightening.

How NuPRL Does It

The paper I linked to is a discussion on how to think about quotients in terms of other type theory constructs. In order to do this we need a few things first.

The first thing to realize is that NuPRL’s type theory is different than what you are probably used to. We don’t have this single magical global equality. Instead, we define equality inductively across the type. This notion means that our equality judgment doesn’t have to be natural in the type it works across. It can do specific things at each case. Perhaps the most frequent is that we can have functional extensionality.

f = g ⇔ ∀ a. f a = g a

Okay, so now that we’ve tossed aside the notion of a single global equality, what else is new? Well something new is the lens through which many people look at NuRPL’s type theory: PER semantics. Remember that PER is a relationship satisfying

  1. a R b → then b R a
  2. a R b ∧ b R c → a R c

In other words, a PER is an equivalence relationship that isn’t necessarily reflexive at all points.

The idea is to view types not as some opaque “thingy” but instead to be partial equivalence relations across the set of untyped lambda calculus terms. Inductively defined equality falls right out of this idea since we can just define a ≡ b : A to be equivalent to (a, b) ∈ A.

Now another problem rears it head, what does a : A mean? Well even though we’re dealing with PERs, but it’s quite reasonable to say something is a member of a type if it’s reflexive. That is to say each relation is a full equivalence relation for the things we call members of that type. So we can therefore define a : A to be (a, a) ∈ A.

Another important constraint, in order for a type family to be well formed, it needs to respect the equality of the type it maps across. In other words, for all B : A → Type, we have (a, a') ∈ A' ⇒ (B a = B a') ∈ U. This should seem on par with how we defined function equality and we call this “type functionality”.

Let’s all touch on another concept: squashed types. The idea is to take a type and throw away all information other than whether or not it’s occupied. There are two basic types of squashing, extensional or intensional. In the intensional we consider two squashed things equal if and only if the types they’re squashing are equal

     A = B
   [A] = [B]

Now we can also consider only the behavior of the squashed type, the extensional view. Since the only behavior of a squashed type is simply existing, our extensional squash type has the equivalence

   ∥A∥ ⇔ ∥B∥
    ∥A∥ = ∥B∥

Now aside from this, the introduction of these types are basically the same: if we can prove that a type is occupied, we can grab a squashed type. Similarly, when we eliminate a type all we get is the trivial occupant of the squashed type, called •.

    Γ ⊢ A
   Γ ⊢ [A]

    Γ, x : |A|, Δ[̱•] ⊢ C[̱•]
    Γ, x : |A|, Δ[x] ⊢ C[x]

What’s interesting is that when proving an equality judgment, we can unsquash obth of these types. This is only because NuRPL’s equality proofs computationally trivial.

Now with all of that out of the way, I’d like to present two typing rules. First

  Γ ⊢ A ≡ A';  Γ, x : A, y : A ⊢ E[x; y] = E'[x; y]; E and E' are PERS
                      Γ ⊢ A ‌// E ≡ A' // E'

In English, two quotients are equal when the types and their quotienting relations are equal.

 Γ, u : x ≡ y ∈ (A // E), v : 
∥x E y∥, Δ[u] ⊢ C [u]
       Γ, u : x ≡ y ∈ (A // E), Δ[u] ⊢ C [u]

There are a few new things here. The first is that we have a new Δ [u] thing. This is a result of dependent types, can have things in our context that depend on u and so to indicate that we “split” the context, with Γ, u, Δ and apply the depend part of the context Δ to the variable it depends on u.

Now the long and short of this is that when we’re of this is that when we’re trying to use an equivalence between two terms in a quotient, we only get the squashed term. This done mean that we only need to provide a squash to get equality in the first place though

Γ ⊢ ∥ x E y 
∥; Γ ⊢ x : A; Γ ⊢ y : A
      Γ ⊢ x ≡ y : A // E

Remember that we can trivially form an ∥ A ∥ from A’.

Now there’s just one thing left to talk about, using our quotiented types. To do this the paper outlines one primitive elimination rule and defines several others.

Γ, x : A, y : A, e : x E y, a : ND, Δ[ndₐ{x;y}] ⊢ |C[ndₐ{x;y}]|
               Γ, x : A // E, Δ[x] ⊢ |C[x]|

ND is a admittedly odd type that’s supposed to represent nondeterministic choice. It has two terms, tt and ff and they’re considered “equal” under ND. However, nd returns its first argument if it’s fed tt and the second if it is fed ff. Hence, nondeterminism.

Now in our rule we use this to indicate that if we’re eliminating some quotiented type we can get any value that’s considered equal under E. We can only be assured that when we eliminate a quotiented type, it will be related by the equivalence relation to x. This rule captures this notion by allowing us to randomly choose some y : A so that x E y.

Overall, this rule simply states that if C is occupied for any term related to x, then it is occupied for C[x].

Wrap up

As with my last post, here’s some questions for the curious reader to pursue

  • What elimination rules can we derive from the above?
  • If we’re of proving equality can we get more expressive rules?
  • What would an extensional quotient type look like?
  • Why would we want intensional or extensional?
  • How can we express quotient types with higher inductive types from HoTT

The last one is particularly interesting.

Thanks to Jon Sterling for proof reading

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

October 17, 2014 12:00 AM

Jasper Van der Jeugt

Generalizing function composition


In this blogpost I present a proof-of-concept operator $.$, which allows you to replace:

foo x0 x1 x2 ... xN = bar $ qux x0 x1 x2 ... xN


foo = bar $.$ qux


This is a literate Haskell file, which means you should be able to just drop it into GHCi and play around with it. You can find the raw .lhs file here. Do note that this requires GHC 7.8 (it was tested on GHC 7.8.2).

> {-# LANGUAGE FlexibleContexts     #-}
> {-# LANGUAGE FlexibleInstances    #-}
> {-# LANGUAGE OverlappingInstances #-}
> {-# LANGUAGE TypeFamilies         #-}
> {-# LANGUAGE TypeOperators        #-}
> import Data.Char (toLower)

If you have been writing Haskell code for a while, you have undoubtedly used the $ operator to “wrap” some expression with another function, mapping over the result type. For example, we can “wrap” the expression toLower 'A' with print to output the result.

print $ toLower 'A'

It is not unlikely either to have functions that just wrap other functions, e.g.:

> printLower :: Char -> IO ()
> printLower x = print $ toLower x

If the function that is being wrapped (toLower in the previous example) has only one argument, the . operator allows writing a very concise definition of functions which just wrap those single-argument functions.

> printLower' :: Char -> IO ()
> printLower' = print . toLower

However, this gets tedious when the number arguments increases. Say that we have the following function which takes three arguments (don’t worry about the horrible implementation, but rather focus on the type):

> -- | Formats a double using a simple spec. Doesn't do proper rounding.
> formatDouble
>     :: Bool    -- ^ Drop trailing '0'?
>     -> Int     -- ^ #digits after decimal point
>     -> Double  -- ^ Argument
>     -> String  -- ^ Result
> formatDouble dropTrailingZero numDigits double =
>     let (pre, post) = case break (== '.') (show double) of
>             (x, '.' : y) -> (x, y)
>             (x, y)       -> (x, y)
>         post'       = take numDigits (post ++ repeat '0')
>         pre'        = case pre of
>             '0' : x -> if dropTrailingZero then x else pre
>             _       -> pre
>     in pre' ++ "." ++ post'

We can wrap formatDouble using print by successively using the . operator, but the result does not look pretty, nor very readable:

> printDouble :: Bool -> Int -> Double -> IO ()
> printDouble = (.) ((.) ((.) print)) formatDouble

The $.$ operator

This makes one wonder if we can’t define an additional operator $.$ (pronounced holla-holla-get-dolla) which can be used like this:

> printDouble' :: Bool -> Int -> Double -> IO ()
> printDouble' = print $.$ formatDouble

Additionally, it should be generic, as in, it should work for an arbitrary number of arguments, so that we can also have:

> printMax' :: Int -> Int -> IO ()
> printMax' = print $.$ max
> printLower'' :: Char -> IO ()
> printLower'' = print $.$ toLower

From this, it becomes clear that the type of $.$ should be something like:

    :: (a -> b)
    -> (x0 -> x1 -> ... -> xn -> a)
    -> (x0 -> x1 -> ... -> xn -> b)

The first question is obviously, can we write such an operator? And if we can, how generic is it?

When my colleague Alex asked me this question, I spent some time figuring it out. I previously thought it was not possible to write such an operator in a reasonably nice way, but after some experiments with the closed type families in GHC 7.8 I managed to get something working. However, the solution is far from trivial (and I now suspect more elegant solutions might exist).

A possible solution

Thanks to my colleague Alex for proofreading!

October 17, 2014 12:00 AM

October 15, 2014

Douglas M. Auclair (geophf)

September 2014 1HaskellADay problems and solutions

September, 2014

  • September 1st, 2014: They tried to kill the Metal...I don't know where I'm going with that. But rock-n-roll with today's #haskell exercise
  • September 2nd, 2014: Good morning! Triangle Sums is our #haskell problem for today: No triangles were harmed in the solution of their sum (nor in the summation of their solution)

  • September 3rd, 2014: Pay It Forward. What? You didn't think I'd just say: today's #haskell problem is hard and leave it at that, did you? Paid. Or: a constructivist approach reduces the generated sets from 200M+ down to 8 possible solutions That's doable. ... and here is the 'Mr. Clean' version of the solution: fast, and neat. Groovy!
  • September 4th, 2014: Today's #haskell problem: Abacus words because MRFE says "I don't like your math problems; I want more word problems"
  • September 5th, 2014: These 'edgy' relationships these days!  Remember when today's #haskell problem didn't involve graph theory? Data.Graph FTW! A solution to today's 4sum #haskell problem, and it didn't require generating 1625702400 solutions!
  • September 8th, 2014: We have puzzles 1 and 5 from the "Montley Stew" problem set for today's #haskell problem The solution-style to Montley Stew isswimming in list-compression stew
  • September 9th, 2014: King Tut! Our #haskell problem for today is NOT a Pyramid Scheme. Maybe.
  • September 10th, 2014: 'Sed' is 'but(t)' just another word ... in "'laddin" Today's #haskell problem is mix-n-match words. "But(t) I sed ..." ARG! Enough with the 3rd-grade humor! On with the solution to the mix-n-match words!
  • September 11th, 2014: A-plus for you when you solve today's #haskell exercise But an F- (NOT an F# ... geddit?) for /usr/share/dict/words :/ A solution to today's injectInto #haskell problem
  • September 12th, 2014: Today's #Haskell problem comes from She. She who commands you solve it before coding it. So, you know: there it is. Okay, 'thurt' is a word in WHICH Universe? A solution to today's #haskell 'ditto' problem
  • September 15th, 2014: "The name of the game is Connect Four!" and today's #haskell problem as suggested by a tweet from @DrEugeniaCheng. I played Connect 4 against myself and lost! :/ A semi-solution to today's #haskell problem at

  • September 16th, 2014: There's more than one way to slice and dice a matrix for today's #haskell problem (follow-up to yesterday's Connect4) A Hack-n-slash solution to today's diagonal view of matrices. Thebonus solution is provided back at the Connect Four answer to make that game complete:
  • September 17th, 2014: Do you Yahoo!? Today's #haskell problem: connecting to Yahoo!'s financial webservice I like my java GREEN! (java means 'coffee') A solution to stock webservice #haskell problem.

  • September 18th, 2014: Star (Tuna?) Fish? A radial word-finding/matching game is today's #haskell puzzle. Wait. Quantum...WHAT? A solution to today's #haskell problem using quantum superpositions to solve it. I'm not joking. STAR POWER! A solution for pretty-printing the star-puzzle
  • September 19th, 2014: Continued fractions and dual inversionals are today's #haskell problem It even comes with (thoughts about) flowers. #Ult Today's problem was inspired by a comment, then the main article, from @aperiodical That was some LOOOOOOOONG Division! A solution to today's #haskell problem.
  • September 22nd, 2014: Oh, noes! The 'M'-word! for today's #haskell exercise. Project Eulerproblem 11'M' is for 'Monoid' A solution to today's #haskell problem.
  • September 23rd, 2014: "Oh, the snark bites, with his teeth, dear." MACD Knife ... geddit? Today's #haskell problem is a technical indicator.
  • September 24th, 2014: Jones, Smith, and Brown work at the Bank... but not Mr. Banks.A logic puzzle from 1957 for today's #haskell puzzle A pair of PhDs ( helped to solve today's #haskell problem. Neatly, too, I might add.
  • September 25th, 2014: Corned-beef hashi? No, that's not right, and now I'm hungry! :( Shoot! Well, anyway: today's #haskell problem.
  • September 26th, 2014: HA! I TOLD you we'd be getting to cipher-text! From the 1700's, still: it IS cipher text for today's #haskell problem. Ooh! The Plot Thickens (like pea soup)! "ALLMENHAVING" and "be mis T r U st ed " is a solution to today's problem.
  • September 29th, 2014: Big (Crypto) Generator! Today's #haskell problem is a follow-on to Friday's. Human Error ... WHAT human error? A solution to today's make-your-own-cypto-table #haskell problem
  • September 30th, 2014: "I wanna be a blue-collar man!" Yes, but who, and which occupation? Today's #haskell problem addresses this. Plumber, ... or painter? IDK! TWO-solutions to today's #haskell problem (one of them MAY be correct... :/ )

by geophf ([email protected]) at October 15, 2014 05:36 PM

October 14, 2014

Joachim Breitner

Switching to sytemd-networkd

Ever since I read about sytemd-networkd being in the making I was looking forward to try it out. I kept watching for the package to appear in Debian, or at least ITP bugs. A few days ago, by accident, I noticed that I already have systemd-networkd on my machine: It is simply shipped with the systemd package!

My previous setup was a combination of ifplugd to detect when I plug or unplug the ethernet cable with a plain DHCP entry in /etc/network/interface. A while ago I was using guessnet to do a static setup depending on where I am, but I don’t need this flexibility any more, so the very simple approach with systemd-networkd is just fine with me. So after stopping ifplugd and

$ cat > /etc/systemd/network/ <<__END__
$ systemctl enable systemd-networkd
$ systemctl start systemd-networkd

I was ready to go. Indeed, systemd-networkd, probably due to the integrated dhcp client, felt quite a bit faster than the old setup. And what’s more important (and my main motivation for the switch): It did the right thing when I put it to sleep in my office, unplug it there, go home, plug it in and wake it up. ifplugd failed to detect this change and I often had to manually run ifdown eth0 && ifup eth0; this now works.

But then I was bitten by what I guess some people call the viral nature of systemd: sytemd-networkd would not update /etc/resolve.conf, but rather relies on systemd-resolved. And that requires me to change /etc/resolve.conf to be a symlink to /run/systemd/resolve/resolv.conf. But of course I also use my wireless adapter, which, at that point, was still managed using ifupdown, which would use dhclient which updates /etc/resolve.conf directly.

So I investigated if I can use systemd-networkd also for my wireless account. I am not using NetworkManager or the like, but rather keep wpa_supplicant running in roaming mode, controlled from ifupdown (not sure how that exactly works and what controls what, but it worked). I found out that this setup works just fine with systemd-networkd: I start wpa_supplicant with this service file (which I found in the wpasupplicant repo, but not yet in the Debian package):

Description=WPA supplicant daemon (interface-specific version)

ExecStart=/sbin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I


Then wpa_supplicant will get the interface up and down as it goes, while systemd-networkd, equipped with


does the rest.

So suddenly I have a system without /etc/init.d/networking and without ifup. Feels a bit strange, but also makes sense. I still need to migrate how I manage my UMTS modem device to that model.

The only thing that I’m missing so far is a way to trigger actions when the network configuration has changes, like I could with /etc/network/if-up.d/ etc. I want to run things like killall -ALRM tincd and exim -qf. If you know how to do that, please tell me, or answer over at Stack Exchange.

by Joachim Breitner ([email protected]) at October 14, 2014 08:26 PM

Switching to systemd-networkd

Ever since I read about systemd-networkd being in the making I was looking forward to try it out. I kept watching for the package to appear in Debian, or at least ITP bugs. A few days ago, by accident, I noticed that I already have systemd-networkd on my machine: It is simply shipped with the systemd package!

My previous setup was a combination of ifplugd to detect when I plug or unplug the ethernet cable with a plain DHCP entry in /etc/network/interface. A while ago I was using guessnet to do a static setup depending on where I am, but I don’t need this flexibility any more, so the very simple approach with systemd-networkd is just fine with me. So after stopping ifplugd and

$ cat > /etc/systemd/network/ <<__END__
$ systemctl enable systemd-networkd
$ systemctl start systemd-networkd

I was ready to go. Indeed, systemd-networkd, probably due to the integrated dhcp client, felt quite a bit faster than the old setup. And what’s more important (and my main motivation for the switch): It did the right thing when I put it to sleep in my office, unplug it there, go home, plug it in and wake it up. ifplugd failed to detect this change and I often had to manually run ifdown eth0 && ifup eth0; this now works.

But then I was bitten by what I guess some people call the viral nature of systemd: systemd-networkd would not update /etc/resolve.conf, but rather relies on systemd-resolved. And that requires me to change /etc/resolve.conf to be a symlink to /run/systemd/resolve/resolv.conf. But of course I also use my wireless adapter, which, at that point, was still managed using ifupdown, which would use dhclient which updates /etc/resolve.conf directly.

So I investigated if I can use systemd-networkd also for my wireless account. I am not using NetworkManager or the like, but rather keep wpa_supplicant running in roaming mode, controlled from ifupdown (not sure how that exactly works and what controls what, but it worked). I found out that this setup works just fine with systemd-networkd: I start wpa_supplicant with this service file (which I found in the wpasupplicant repo, but not yet in the Debian package):

Description=WPA supplicant daemon (interface-specific version)

ExecStart=/sbin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I


Then wpa_supplicant will get the interface up and down as it goes, while systemd-networkd, equipped with


does the rest.

So suddenly I have a system without /etc/init.d/networking and without ifup. Feels a bit strange, but also makes sense. I still need to migrate how I manage my UMTS modem device to that model.

The only thing that I’m missing so far is a way to trigger actions when the network configuration has changes, like I could with /etc/network/if-up.d/ etc. I want to run things like killall -ALRM tincd and exim -qf. If you know how to do that, please tell me, or answer over at Stack Exchange.

by Joachim Breitner ([email protected]) at October 14, 2014 08:26 PM

October 13, 2014

Neil Mitchell

Shake's Internal State

Summary: Shake is not like Make, it has different internal state, which leads to different behaviour. I also store the state in an optimised way.

Update: I'm keeping an up to date version of this post in the Shake repo, which includes a number of questions/answers at the bottom, and is likely to evolve over time to incorporate that information into the main text.

In order to understand the behaviour of Shake, it is useful to have a mental model of Shake's internal state. To be a little more concrete, let's talk about Files which are stored on disk, which have ModTime value's associated with them, where modtime gives the ModTime of a FilePath (Shake is actually generalised over all those things). Let's also imagine we have the rule:

file *> \out -> do
need [dependency]

So file depends on dependency and rebuilds by executing the action run.

The Make Model

In Make there is no additional state, only the file-system. A file is considered dirty if it has a dependency such that:

modtime dependency > modtime file

As a consequence, run must update modtime file, or the file will remain dirty and rebuild in subsequent runs.

The Shake Model

For Shake, the state is:

database :: File -> (ModTime, [(File, ModTime)])

Each File is associated with a pair containing the ModTime of that file, plus a list of each dependency and their modtimes, all from when the rule was last run. As part of executing the rule above, Shake records the association:

file -> (modtime file, [(dependency, modtime dependency)])

The file is considered dirty if any of the information is no longer current. In this example, if either modtime file changes, or modtime dependency changes.

There are a few consequences of the Shake model:

  • There is no requirement for modtime file to change as a result of run. The file is dirty because something changed, after we run the rule and record new information it becomes clean.
  • Since a file is not required to change its modtime, things that depend on file may not require rebuilding even if file rebuilds.
  • If you update an output file, it will rebuild that file, as the ModTime of a result is tracked.
  • Shake only ever performs equality tests on ModTime, never ordering, which means it generalises to other types of value and works even if your file-system sometimes has incorrect times.

These consequences allow two workflows that aren't pleasant in Make:

  • Generated files, where the generator changes often, but the output of the generator for a given input changes rarely. In Shake, you can rerun the generator regularly, and using a function that writes only on change (writeFileChanged in Shake) you don't rebuild further. This technique can reduce some rebuilds from hours to seconds.
  • Configuration file splitting, where you have a configuration file with lots of key/value pairs, and want certain rules to only depend on a subset of the keys. In Shake, you can generate a file for each key/value and depend only on that key. If the configuration file updates, but only a subset of keys change, then only a subset of rules will rebuild. Alternatively, using Development.Shake.Config you can avoid the file for each key, but the dependency model is the same.

Optimising the Model

The above model expresses the semantics of Shake, but the implementation uses an optimised model. Note that the original Shake paper gives the optimised model, not the easy to understand model - that's because I only figured out the difference a few days ago (thanks to Simon Marlow, Simon Peyton Jones and Andrey Mokhov). To recap, we started with:

database :: File -> (ModTime, [(File, ModTime)])

We said that File is dirty if any of the ModTime values change. That's true, but what we are really doing is comparing the first ModTime with the ModTime on disk, and the list of second ModTime's with those in database. Assuming we are passed the current ModTime on disk, then a file is valid if:

valid :: File -> ModTime -> Bool
valid file mNow =
mNow == mOld &&
and [fst (database d) == m | (d,m) <- deps]
where (mOld, deps) = database file

The problem with this model is that we store each File/ModTime pair once for the file itself, plus once for every dependency. That's a fairly large amount of information, and in Shake both File and ModTime can be arbitrarily large for user rules.

Let's introduce two assumptions:

Assumption 1: A File only has at most one ModTime per Shake run, since a file will only rebuild at most once per run. We use Step for the number of times Shake has run on this project.

Consequence 1: The ModTime for a file and the ModTime for its dependencies are all recorded in the same run, so they share the same Step.

Assumption 2: We assume that if the ModTime of a File changes, and then changes back to a previous value, we can still treat that as dirty. In the specific case of ModTime that would require time travel, but even for other values it is very rare.

Consequence 2: We only use historical ModTime values to compare them for equality with current ModTime values. We can instead record the Step at which the ModTime last changed, assuming all older Step values are unequal.

The result is:

database :: File -> (ModTime, Step, Step, [File])

valid :: File -> ModTime -> Bool
valid file mNow =
mNow == mOld &&
and [sBuild >= changed (database d) | d <- deps]
where (mOld, sBuilt, sChanged, deps) = database file
changed (_, _, sChanged, _) = sChanged

For each File we store its most recently recorded ModTime, the Step at which it was built, the Step when the ModTime last changed, and the list of dependencies. We now check if the Step for this file is greater than the Step at which dependency last changed. Using the assumptions above, the original formulation is equivalent.

Note that instead of storing one ModTime per dependency+1, we now store exactly one ModTime plus two small Step values.

We still store each file many times, but we reduce that by creating a bijection between File (arbitrarily large) and Id (small index) and only storing Id.

Implementing the Model

For those who like concrete details, which might change at any point in the future, the relevant definition is in Development.Shake.Database:

data Result = Result
{result :: Value -- the result when last built
,built :: Step -- when it was actually run
,changed :: Step -- when the result last changed
,depends :: [[Id]] -- dependencies
,execution :: Float -- duration of last run
,traces :: [Trace] -- a trace of the expensive operations
} deriving Show

The differences from the model are:

  • ModTime became Value, because Shake deals with lots of types of rules.
  • The dependencies are stored as a list of lists, so we still have access to the parallelism provided by need, and if we start rebuilding some dependencies we can do so in parallel.
  • We store execution and traces so we can produce profiling reports.
  • I haven't shown the File/Id mapping here - that lives elsewhere.
  • I removed all strictness/UNPACK annotations from the definition above, and edited a few comments.

As we can see, the code follows the optimised model quite closely.

by Neil Mitchell ([email protected]) at October 13, 2014 08:53 PM

Austin Seipp

The New

Hello there!

What you're reading is a blog post. Where is it from? It's from What's it doing? It's cataloging the thoughts of the people who run

That's right. This is our new adventure in communicating with you. We wanted some place to put more long-form posts, invite guests, and generally keep people up to date about general improvements (and faults) to our infrastructure, now and in the future. Twitter and a short-form status site aren't so great, and a non-collective mind of people posting scattered things on various sites/lists isn't super helpful for cataloging things.

So for an introduction post, we've really got a lot to talk about...

A quick recap on recent events has had some rough times lately.

About a month and a half ago, we had an extended period of outage, roughly around the weekend of ICFP 2014. This was due to a really large amount of I/O getting backed up on our host machine, rock. rock is a single-tenant, bare-metal machine from Hetzner that we used to host several VMs that comprise the old server set; including the main website, the GHC Trac and git repositories, and Hackage. We alleviated a lot of the load by turning off the hackage server, and migrating one of the VMs to a new hosting provider.

Then, about a week and a half ago, we had another hackage outage that was a result of more meager concerns: disk space usage. Much to my chagrin, this was due in part to an absence of log rotation over the past year, which resulted in a hefty 15GB of text sitting around (in a single file, no less). Oops.

This caused a small bump on the road, which was that the hackage server had a slight error while committing some transactions in the database when it ran out of disk. We recovered from this (thanks to @duncan for the analysis), and restarted it. (We also had point-in-time backups, but in this case it was easier to fix than rollback the whole database).

But we've had several other availability issues beforehand too, including faulty RAM and inconsistent performance. So we're setting out to fix it. And in the process we figured, hey, they'd probably like to hear us babble about a lot of other stuff, too, because why not?

New things

OK, so enough sad news about what happened. Now you're wondering what's going to happen. Most of these happening-things will be good, I hope.

There are a bunch of new things we've done over the past year or so for, so it's best to summarize them a bit. These aren't in any particular order; most of the things written here are pretty new and some are a bit older since the servers have started churning a bit. But I imagine many things will be new to y'all.

A new blog, right here.

And it's a fancy one at that (powered by Phabricator). Like I said, we'll be posting news updates here that we think are applicable for the community at large - but most of the content will focus on the administrative side.

A new hosting provider: Rackspace

As I mentioned earlier this year pending the GHC 7.8 release, Rackspace has graciously donated resources towards for GHC, particularly for buildbots. We had at that time begun using Rackspace resources for hosting resources. Over the past year, we've done so more and more, to the point where we've decided to move all of It became clear we could offer a lot higher reliability and greatly improved services for users, using these resources.

Jesse Noller was my contact point at Rackspace, and has set up for its 2nd year running with free Rackspace-powered machines, storage, and services. That's right: free (to a point, the USD value of which I won't disclose here). With this, we can provide more redundant services both technically and geographically, we can offer better performance, better features and management, etc. And we have their awesome Fanatical Support.

So far, things have been going pretty well. We've migrated several machines to Rackspace, including:

We're still moving more servers, including:

Many thanks to Rackspace. We owe them greatly.

Technology upgrades, increased security, etc etc

We've done several overhauls of the way is managed, including security, our underlying service organization, and more.

  • A better webserver: All of our web instances are now served with nginx where we used Apache before. A large motivation for this was administrative headache, since most of us are much more familiar with nginx as opposed to our old Apache setup. On top of that we get increased speed and a more flexible configuration language (IMO). It also means we have to now run separate proxy servers for nginx, but systems like php-fpm or gunicorn tend to have much better performance and flexibility than things like mod_php anyway.
  • Ubuntu LTS: Almost all of our new servers are running Ubuntu 14.04 LTS. Previously we were running Debian stable, and before Debian announced their LTS project for Squeeze, the biggest motivation was Ubuntu LTSes typically had a much longer lifespan.
  • IPv6 all the way: All of our new servers have IPv6, natively.
  • HTTPS: We've rolled out HTTPS for the large majority of Our servers sport TLS 1.2, ECDHE key exchange, and SPDY v3 with strong cipher suites. We've also enabled HSTS on several of our services (including Phabricator), and will continue enabling it for likely every site we have.
  • Reorganized: We've done a massive reorganization of the server architecture, and we've generally split up services to be more modular, with servers separated in both geographic locations and responsibilities where possible.
  • Consolidation: We've consolidated several of our services too. The biggest change is that we now have a single, consolidated MariaDB 10.0 server powering our database infrastructure. All communications to this server are encrypted with spiped for high security. Phabricator, the wiki, some other minor things (like a blog), and likely future applications will use it for storage where possible too.
  • Improved hardware: Every server now has dedicated network, and servers that are linked together (like buildbots, or databases) are privately networked. All networking operations are secured with spiped where possible.

Interlude: A new Hackage server

While we're on the subject, here's an example of what the new Hackage Server will be sporting:

Old server:

  • 8GB RAM, probably 60%+ of all RAM taken by disk cache.
  • Combined with the hackage-builder process.
  • 1 core.
  • Shared ethernet link amongst multiple VMs (no dedicated QOS per VM, AFAIK). No IPv6.
  • 1x100GB virtual KVM block device backed by RAID1 2x2TB SATA setup on the host.

New server:

  • 4x cores.
  • 4GB RAM, as this should fit comfortably with nginx as a forward proxy.
  • Hackage builder has its own server (removing much of the RAM needs).
  • Dedicated 800Mb/s uplink, IPv6 enabled.
  • Dedicated dual 500GB block devices (backed by dedicated RAID10 shared storage) in RAID1 configuration.

So, Hackage should hopefully be OK for a long time. And, the doc builder is now working again, and should hopefully stay that way too.

Automation: it's a thing

Like many other sites, is big, complicated, intimidating, and there are occasionally points where you find a Grue, and it eats you mercilessly.

As a result, automation is an important part of our setup, since it means if one of us is hit by a bus, people can conceivably still understand, maintain and continue to improve in the future. We don't want knowledge of the servers locked up in anyone's head.

In The Past, Long ago in a Galaxy unsurprisingly similar to this one at this very moment, did not really have any automation. At all, not even to create users. Some of still does not have automation. And even still, in fact, some parts of it are still a mystery to all, waiting to be discovered. That's obviously not a good thing.

Today, has two projects dedicated to automation purposes. These are:

  • Ansible, available in rA, which is a set of Ansible playbooks for automating various aspects of the existing servers.
  • Auron, available in rAUR, is a new, Next Gen™ automation framework, based on NixOS.

We eventually hope to phase out Ansible in favor of Auron. While Auron is still very preliminary, several services have been ported over, and the setup does work on existing providers. Auron also is much more philosophically aligned with our desires for automation, including reproducibility, binary determinism, security features, and more.

More bugs code in the open

In our quest to automate our tiny part of the internet, we've begun naturally writing a bit of code. What's the best thing to do with code? Open source it!

The new haskell-infra organization on GitHub hosts our code, including:

Most of our repositories are hosted on GitHub, and we use our Phabricator for code review and changes between ourselves. (We still accept GitHub pull requests though!) So it's pretty easy to contribute in whatever way you want.

Better DNS and content delivery: CloudFlare & Fastly

We're very recently begun using CloudFlare for for DNS management, DDoS mitigation, and analytics. After a bit of deliberation, we decided that after moving off Hetzner we'd think about a 3rd party provider, as opposed to running our own servers.

We chose CloudFlare mostly because aside from a nice DNS management interface, and great features like AnyCast, we also get analytics and security features, including immediate SSL delivery. And, of course, we get a nice CDN on top for all HTTP content. The primary benefits from CloudFlare are the security and caching features (in that order, IMO). The DNS interface is still particularly useful however; the nameservers should be redundant, and CloudFlare acts more like a reverse proxy as changes are quick and instant.

But unfortunately while CloudFlare is great, it's only a web content proxy. That means certain endpoints which need things like SSH access can not (yet) be reliably proxied, which is one of the major downfalls. As a result, not all of will be magically DDoS/spam resistant, but a much bigger amount of it will be. But the bigger problem is: we have a lot of non-web content!

In particular, none of our Hackage server downloads for example can proxied: Hackage, like most package repositories, merely uses HTTP as a transport layer for packages. In theory you could use a binary protocol, but HTTP has a number of advantages (like firewalls being nice to it). Using a service like CloudFlare for such content is - at the least - a complete violation of the spirit of their service, and just a step beyond that a total violation of their ToS (Section 10). But hackage pushes a few TB a month in traffic - so we have to pay line-rate for that, by the bits. And also, Hackage can't have data usefully mirrored to CDN edges - all traffic has to hop through to the Rackspace DCs, meaning users suffer at the hands of latency and slower downloads.

But that's where Fastly came to the rescue. Fastly also recently stepped up to provide with an Open Source Software discount - meaning we get their awesome CDN for free, for custom services! Hooray!

Since Fastly is a dedicated CDN service, you can realistically proxy whatever you want with it, including our package downloads. With the help of a new friend of ours (@davean), we'll be moving Fastly in front of Hackage soon. Hopefully this just means your downloads and responsiveness will get faster, and we'll use less bandwidth. Everyone wins.

Finally, we're rolling out CloudFlare gradually to new servers to test them and make sure they're ready. In particular, we hope to not disturb any automation as a result of the switch (particularly to new SSL certificates), and also, we want to make sure we don't unfairly impact other people, such as Tor users (Tor/CloudFlare have a contentious relationship - lots of nasty traffic comes from Tor endpoints, but so does a ton of legitimate traffic). Let us know if anything goes wrong.

Better server monitoring: DataDog & Nagios

Server monitoring is a crucial part of managing a set of servers, and unfortunately was quite bad at it before. But not anymore! We've done a lot to try and increase things. Before my time, as far as I know, we pretty much only had some lame mrtg graphs of server metrics. But we really needed something more than that, because it's impossible to have modern infrastructure on that alone.

Enter DataDog. I played with their product last year, and I casually approached them and asked if they would provide an account for - and they did!

DD provides real-time analytics for servers, while providing a lot of custom integrations with services like MySQL, nginx, etc. We can monitor load, networking, and correlate this with things like database or webserver connection count. Events occur from all over On top of that, DD serves as a real-time dashboard for us to organize and comment on events as they happen.

But metrics aren't all we need. There are two real things we need: metrics (point-in-time data), and resource monitoring (logging, daemon watchdogs, resource checks, etc etc).

This is where Nagios comes in - we have it running and monitoring all our servers for daemons, heatlh checks, endpoint checks for connectivity, and more. Datadog helpfully plugins into Nagios, and reports events (including errors), as well as sending us weekly summaries of Nagios reports. This means we can helpfully use the Datadog dashboard as a consolidated piece of infrastructure for metrics and events.

As a result: is being monitored much more closely here on out we hope.

Better log analysis: ElasticSearch

We've (very recently) also begun rolling out another part of the equation: log management. Log management is essential to tracking down big issues over time, and in the past several years, ElasticSearch has become incredibly popular. We have a new ElasticSearch instance, running along with Logstash, which several of our servers now report to (via the logstash-forwarder service, which is lightweight even on smaller servers). Kibana sits in front on a separate server for query management so we can watch the systems live.

Furthermore, our ElasticSearch deployment is, like the rest of our infrastructure, 100% encrypted - Kibana proxies backend ElasticSearch queries through HTTPS and over spiped. Servers dump messages into LogStash over SSL. I would have liked to use spiped for the LogStash connection as well, but SSL is unfortunately mandatory at this time (perhaps for the best).

We're slowly rolling out logstash-forwarder over our new machines, and tweaking our LogStash filters so they can get juicy information. Hopefully our log index will become a core tool in the future.

A new status site

As I'm sure some of you might be aware, we now have a fancy new site,, that we'll be using to post updates about the infrastructure, maintenance windows, and expected (or unexpected!) downtimes. And again, someone came to help us - gave us this for free!

Better server backups

Rackspace also fully supports their backup agents which provide compressed, deduplicated backups for your servers. Our previous situation on Hetzner was a lot more limited in terms of storage and cost. Our backups are stored privately on Cloud Files - the same infrastructure that hosts our static content.

Of course, backup on Rackspace is only one level of redundancy. That's why we're thinking about trying to roll out Tarsnap soon too. But either way, our setup is far more reliable and robust and a lot of us are sleeping easier (our previous backups were space hungry and becoming difficult to maintain by hand.)

GHC: Better build bots, better code review

GHC has for a long time had an open infrastructure request: the ability to build patches users submit, and even patches we write, in order to ensure they do not cause machines to regress. Developers don't necessarily have access to every platform (cross compilers, Windows, some obscurely old Linux machine), so having infrastructure here is crucial.

We also needed more stringent code review. I (Austin) review most of the patches, but ideally we want more people reviewing lots of patches, submitting patches, and testing patches. And we really need ways to test all that - I can't be the bottleneck to test a foreign patch on every machine.

At the same time, we've also had a nightly build infrastructure, but our build infrastructure as often hobbled along with custom code running it (bad for maintenance), and the bots are not directed and off to the side - so it's easy to miss build reports from them.

Enter Harbormaster, our Phabricator-powered buildbot for continuous integration and patch submissions!

Harbormaster is a part of Phabricator, and it runs builds on all incoming patches and commits to GHC. How?

  • First, when a patch or commit for GHC comes in, this triggers an event through a Herald rule. Herald is a Phab application to get notified or perform actions when events arrive. When a GHC commit or patch comes in, a rule is triggered, which begins a build.
  • Our Herald rule runs a build plan, which is a dependency based sequence of actions to run.
  • The first thing our plan does is allocate a machine resource, or a buildbot. It does this by taking a lease on the resource to acquire (non-exclusive) access to it, and it moves forward. Machine management is done by a separate application, Drydock.
  • After leasing a machine, we SSH into it.
  • We then run a build, using our phab-ghc-builder code.
  • Harbormaster tracks all the stdout output, and test results.
  • It then reports back on the Code review, or the commit in question, and emails the author.

This has already lead to a rather large change in development for most GHC developers, and Phabricator is building our patches regularly now - yes, even committers use it!

Harbormaster will get more powerful in the future: our build plans will lease more resources, including Windows, Mac, and different varieties of Linux machines, and it will run more general build plans for cross compilers and other things. It's solved a real problem for us, and the latest infrastructure has been relatively reliable. In fact I just get lazy and submit diffs to GHC without testing them - I let the machines do it. Viva la code review!

(See the GHC wiki for more on our Phabricator process there's a lot written there for GHC developers.)

Phabricator: Documentation, an official wiki, and a better issue tracker

That's right, there's now documentation about the infrastructure, hosted on our new official wiki. And now you can report bugs through Maniphest to us. Both of these applications are powered by Phabricator, just like our blog.

In a previous life, used Request Tracker (RT) to do support management. Our old RT instance is still running, but it's filled with garbage old tickets, some spam, it has its own PostgreSQL instance alone for it (quite wasteful) and generally has not seen active use in years. We've decided to phase it out soon, and instead use our Phabricator instance to manage problems, tickets, and discussions. We've already started importing and rewriting new content into our wiki and modernizing things.

Hopefully these docs will help keep people up to date about the happenings here.

But also, our Phabricator installation has become an umbrella installation for several projects (even the Committee may try to use it for book-keeping). In addition, we've been taking the time to extend and contribute to Phab where possible to improve the experience for users.

In addition to that, we've also authored several Phab extensions:

  • libphutil-haskell in rPHUH, which extends Phabricator with custom support for GHC and other things.
  • libphutil-rackspace in rPHUR, which extends Phabricator with support for Rackspace, including Cloud Files for storage needs, and build-machine allocation for Harbormaster.
  • libphutil-scrypt (deprecated; soon to be upstream) in rPHUSC, which extends Phabricator with password hashing support for the scrypt algorithm.

Future work

Of course, we're not done. That would be silly. Maintaining and providing better services to the community is a real necessity for anything to work at all (and remember: computers are the worst).

We've got a lot further to go. Some sneak peaks...

  • We'll probably attempt to roll out HHVM for our Mediawiki instance to improve performance and reduce load.
  • We'll be creating more GHC buildbots, including a fabled Windows build bot, and on-demand servers for distributed build load.
  • We'll be looking at ways of making it easier to donate to (on the homepage, with a nice embedded donation link).
  • Moar security. I (Austin) in particular am looking into deploying a setup like grsecurity for new servers to harden them automatically.
  • We'll roll out a new server,, that will serve as a powerful, scalable file hosting solution for things like the Haskell Platform or GHC. This will hopefully alleviate administration overhead, reduce bandwidth, and make things quicker (thanks again, Fastly!)

And, of course, we'd appreciate all the help we can get!

el fin

This post was long. This is the ending. You probably won't read it. But we're done now! And I think that's all the time we have for today.

by austin (Austin Seipp) at October 13, 2014 06:10 PM

Kevin Reid (kpreid)

Game idea: “Be Consistent”

Here’s another idea for a video game.

The theme of the game is “be consistent”. It's a minimalist-styled 2D platformer. The core mechanic is that whatever you do the first time, the game makes it so that that was the right action. Examples of how this could work:

  • At the start, you're standing at the center of a 2×2 checkerboard of background colors (plus appropriate greebles, not perfect squares). Say the top left and bottom right is darkish and the other quadrants are lightish. If you move left, then the darkish stuff is sky, the lightish stuff is ground, and the level extends to the left. If you move right, the darkish stuff is ground, and the level extends to the right.

  • The first time you need to jump, if you press W or up then that's the jump key, or if you press the space bar then that's the jump key. The other key does something else. (This might interact poorly with an initial “push all the keys to see what they do”, though.)

  • <o>You meet a floaty pointy thing. If you walk into it, it turns out to be a pickup. If you shoot it or jump on it, it turns out to be an enemy.
  • If you jump in the little pool of water, the game has underwater sections or secrets. If you jump over the little pool, water is deadly.

by Kevin Reid (kpreid) ([email protected]) at October 13, 2014 05:46 PM

Tom Schrijvers

Mathematics of Program Construction (MPC 2015): first call for papers


12th International Conference on Mathematics of Program Construction, MPC 2015
Königswinter, Germany, 29 June - 1 July 2015


The MPC conferences aim to promote the development of mathematical principles
and techniques that are demonstrably practical and effective in the process
of constructing computer programs, broadly interpreted.

The 2015 MPC conference will be held in Königswinter, Germany, from 29th June
to 1st July 2015. The previous conferences were held in Twente, The
Netherlands (1989), Oxford, UK (1992), Kloster Irsee, Germany (1995),
Marstrand, Sweden (1998), Ponte de Lima, Portugal (2000), Dagstuhl, Germany
(2002), Stirling, UK (2004, colocated with AMAST), Kuressaare, Estonia (2006,
colocated with AMAST), Marseille, France (2008), Québec City, Canada (2010,
colocated with AMAST), and Madrid, Spain (2012).


Papers are solicited on mathematical methods and tools put to use in program
construction. Topics of interest range from algorithmics to support for
program construction in programming languages and systems. The notion of
"program" is broad, from algorithms to hardware. Some typical areas are type
systems, program analysis and transformation, programming-language semantics,
security, and program logics. Theoretical contributions are welcome, provided
that their relevance to program construction is clear. Reports on
applications are welcome, provided that their mathematical basis is evident.

We also encourage the submission of "pearls": elegant, instructive, and fun
essays on the mathematics of program construction.


   * Submission of abstracts:      26 January 2015
   * Submission of full papers:     2 February 2015
   * Notification to authors:      16 March 2015
   * Final version:                13 April 2015


Submission is in two stages. Abstracts (plain text, 10 to 20 lines) must be
submitted by 26 January 2015. Full papers (pdf) adhering to the LaTeX llncs
style must be submitted by 2 February 2015. There is no official page limit,
but authors should strive for brevity. The web-based system EasyChair will be
used for submission (

Papers must report previously unpublished work, and must not be submitted
concurrently to a journal or to another conference with refereed proceedings.
Accepted papers must be presented at the conference by one of the authors.
Please feel free to write to [email protected] with any questions about
academic matters.

The proceedings of MPC 2015 will be published in Springer-Verlag's Lecture
Notes in Computer Science series, as have all the previous editions. Authors
of accepted papers will be expected to transfer copyright to Springer for
this purpose. After the conference, authors of the best papers will be
invited to submit revised versions to a special issue of the Elsevier journal
Science of Computer Programming.


Ralf Hinze                University of Oxford, UK (chair)

Eerke Boiten              University of Kent, UK
Jules Desharnais          Université Laval, Canada
Lindsay Groves            Victoria University of Wellington, New Zealand
Zhenjiang Hu              National Institute of Informatics, Japan
Graham Hutton             University of Nottingham, UK
Johan Jeuring             Utrecht University and Open University, The Netherlands
Jay McCarthy              Vassar College, US
Bernhard Möller           Universität Augsburg, Germany
Shin-Cheng Mu             Academia Sinica, Taiwan
Dave Naumann              Stevens Institute of Technology, US
Pablo Nogueira            Universidad Politécnica de Madrid, Spain
Ulf Norell                University of Gothenburg, Sweden
Bruno C. d. S. Oliveira   The University of Hong Kong, Hong Kong
José Nuno Oliveira        Universidade do Minho, Portugal
Alberto Pardo             Universidad de la República, Uruguay
Christine Paulin-Mohring  INRIA-Université Paris-Sud, France
Tom Schrijvers            KU Leuven, Belgium
Emil Sekerinski           McMaster University, Canada
Tim Sheard                Portland State University, US
Anya Tafliovich           University of Toronto Scarborough, Canada
Tarmo Uustalu             Institute of Cybernetics, Estonia
Janis Voigtländer         Universität Bonn, Germany


The conference will take place in Königswinter, Maritim Hotel, where
accommodation has been reserved. Königswinter is situated on the right bank
of the river Rhine, opposite Germany's former capital Bonn, at the foot of
the Siebengebirge.


Ralf Hinze                      University of Oxford, UK (co-chair)
Janis Voigtländer               Universität Bonn, Germany (co-chair)
José Pedro Magalhães            University of Oxford, UK
Nicolas Wu                      University of Oxford, UK

For queries about local matters, please write to [email protected].

by Tom Schrijvers ([email protected]) at October 13, 2014 08:30 AM

Magnus Therning

Optparse-applicative and custom argument parsers

The latest update of optparse-applicative triggered me to look over the functions in cblrepo for parsing a few custom command line options. I used to do the parsing in a rather ad-hoc way with lots of use of list functions to split on specific characters. For instance, some option values are pairs of package name and version separated by a comma: PKG-NAME,VERSION. This worked fine and was easy to plug into version 0.10 of optparse-applicative. It was also easily extended to triples, PKG-NAME,VERSION,RELEASE, but it started feeling a bit brittle when some tuples got extended with an optional list of flag assignments, PKG-NAME,VERSION[:FLAG,FLAG,FLAG,...]. The recent release of version 0.11 of optparse-applicative changed the API for custom option value parsers radically; instead of passing a string to the parser, the parser has to use readerAsk to get the string. In short, ReaderM turned into a state monad.

In adjusting to the new API I noticed that the code was organised in such a way that some low-level parsing functions were used directly from command line option definitions, while also being used as building blocks for the more complex parsers. This of course meant that the structuring of the functions needed to be changed completely to deal with the API change.

It turns out there already was a parser that was written in a different style (here already adjusted to the 0.11 API):

readerGhcVersion :: ReadM Version
readerGhcVersion =
    arg <- readerAsk
    case lastMay $ readP_to_S parseVersion arg of
        Just (v, "") -> return v
        _ -> fail $ "cannot parse value `" ++ arg ++ "`"

So I rewrote the rest of the parsers in a similar style. The arguably most complicated is this one:

readPkgNVersion :: ReadP (String, Version)
readPkgNVersion = do
    n <- many (satisfy (/= ','))
    char ','
    v <- parseVersion
    return (n, v)

readFlag :: ReadP (FlagName, Bool)
readFlag = readNegFlag <++ readPosFlag
        readNegFlag = do
            char '-'
            n <- many (satisfy (/= ','))
            return (FlagName n, False)

        readPosFlag = do
            n0 <- get
            n <- many (satisfy (/= ','))
            return (FlagName (n0 : n), True)

strCblPkgArgReader :: ReadM (String, Version, FlagAssignment)
strCblPkgArgReader = let
        readWithFlags = do
            (n, v) <- readPkgNVersion
            char ':'
            fas <- sepBy readFlag (char ',')
            return (n, v, fas)

        readWithoutFlags = do
            (n, v) <- readPkgNVersion
            return (n, v, [])

    in do
        s <- readerAsk
        case lastMay (readP_to_S (readWithFlags <++ readWithoutFlags) s) of
            Just (r, "") -> return r
            _ -> fail $ "Cannot parse: " ++ s

It is slightly longer, but it’s rather a lot easier to read what’s happening after this rewrite. ReadP feels like a lighter option than pulling in parsec as a dependency, but I’d love to hear any comments or suggestions, as well as pointers to how other people deal with parsing of non-trivial types of arguments in combination with optparse-applicative.

October 13, 2014 12:00 AM

October 12, 2014

Ian Ross

Non-diffusive atmospheric flow #7: PCA for spatio-temporal data

Non-diffusive atmospheric flow #7: PCA for spatio-temporal data

Although the basics of the “project onto eigenvectors of the covariance matrix” prescription do hold just the same in the case of spatio-temporal data as in the simple two-dimensional example we looked at in the earlier article, there are a number of things we need to think about when we come to look at PCA for spatio-temporal data. Specifically, we need to think bout data organisation, the interpretation of the output of the PCA calculation, and the interpretation of PCA as a change of basis in a spatio-temporal setting. Let’s start by looking at data organisation.

The Z500 anomaly data we want to analyse has 66 × 151 = 9966 days of data, each of which has 72 × 15 = 1080 spatial points. In our earlier two-dimensional PCA example, we performed PCA on a collection of two-dimensional data points. For the Z500 data, it’s pretty clear that the “collection of points” covers the time steps, and each “data point” is a 72 × 15 grid of Z500 values. We can think of each of those grids as a 1080-dimensional vector, just by flattening all the grid values into a single row, giving us a sequence of “data points” as vectors in 1080 that we can treat in the same kind of way as we did the two-dimensional data points in the earlier example. Our input data thus ends up being a set of 9966 1080-dimensional vectors, instead of 500 two-dimensional vectors (as for the mussel data). If we do PCA on this collection of 1080-dimensional vectors, the PCA eigenvectors will have the same shape as the input data vectors, so we can interpret them as spatial patterns, just by inverting the flattening we did to get from spatial maps of Z500 to vectors – as long as we interpret each entry in the eigenvectors as the same spatial point as the corresponding entry in the input data vectors, everything works seamlessly. The transformation goes like this:

pattern → vector → PCA → eigenvector → eigenpattern

So we have an interpretation of the PCA eigenvectors (which we’ll henceforth call “PCA eigenpatterns” to emphasise that they’re spatial patterns of variability) in this spatio-temporal data case. What about the PCA eigenvalues? These have exactly the same interpretation as in the two-dimensional case: they measure the variance “explained” by each of the PCA eigenpatterns. And finally, the PCA projected components tell us how much of each PCA eigenpattern is present in each of the input data vectors. Since our input data has one spatial grid per time step, the projections give us one time series for each of the PCA eigenvectors, i.e. one time series of PCA projected components per spatial point in the input. (In one way, it’s kind of obvious that we need this number of values to reproduce the input data perfectly – I’ll say a little more about this when we think about what “basis” means in this setting.)

The PCA calculation works just the same as it did for the two-dimensional case: starting with our 1080-dimensional data, we centre the data, calculate the covariance matrix (which in this case is a 1080 × 1080 matrix, the diagonal entries of which measure the variances at each spatial point and the off-diagonal entries of which measure the covariances between each pair of spatial points), perform an eigendecomposition of the covariance matrix, then project each of the input data points onto each of the eigenvectors of the covariance matrix.

We’ve talked about PCA as being nothing more than a change of basis, in the two-dimensional case from the “normal” Euclidean basis (with unit basis vectors pointing along the x- and y-coordinate axes) to another orthnormal basis whose basis vectors are the PCA eigenvectors. How does this work in the spatio-temporal setting? This is probably the point that confuses most people in going from the simple two-dimensional example to the N-dimensional spatio-temporal case, so I’m going to labour the point a bit to make things as clear as possible.

First, what’s the “normal” basis here? Each time step of our input data specifies a Z500 value at each point in space – we have one number in our data vector for each point in our grid. In the two-dimensional case, we had one number for each of the mussel shell measurements we took (length and width). For the Z500 data, the 1080 data values are the Z500 values measured at each of the spatial points. In the mussel shell case, the basis vectors pointed in the x-axis direction (for shell length) and the y-axis direction (for the shell width). For the Z500 case, we somehow need basis vectors that point in each of the “grid point directions”, one for each of the 1080 grid points. What do these look like? Imagine a spatial grid of the same shape (i.e. 72 × 15) as the Z500 data, where all the grid values are zero, except for one point, which has a grid value of one. That is a basis vector pointing in the “direction” of the grid point with the non-zero data value. We’re going to call this the “grid” basis for brevity. We can represent the Z500 value at any spatial point (i,j) as


where ek(i,j) is zero unless k=15(i1)+j, in which case it’s one (i.e. it’s exactly the basis element we just described, where we’re numbering the basis elements in row-major order) and ϕk is a “component” in the expansion of the Z500 field using this grid basis. Now obviously here, because of the basis we’re using, we can see immediately that ϕ15(i1)+j=Z500(i,j), but this expansion holds for any orthnormal basis, so we can transform to a basis where the basis vectors are the PCA eigenvectors, just as for the two-dimensional case. If we call these eigenvectors e~k(i,j), then


where the ϕ~k are the components in the PCA eigenvector basis. Now though, the e~k(i,j) aren’t just the “zero everywhere except at one point” grid basis vectors, but they can have non-zero values anywhere.

Compare this to the case for the two-dimensional example, where we started with data in a basis that had seperate measurements for shell length and shell width, then transformed to the PCA basis where the length and width measurements were “mixed up” into a sort of “size” measurement and a sort of “aspect ratio” measurement. The same thing is happening here: instead of looking at the Z500 data in terms of the variations at individual grid points (which is what we see in the grid basis), we’re going to be able to look at variations in terms of coherent spatial patterns that span many grid points. And because of the way that PCA works, those patterns are the “most important”, in the sense that they are the orthogonal (which in this case means uncorrelated) patterns that explain the most of the total variance in the Z500 data.

As I’ve already mentioned, I’m going to try to be consistent in terms of the terminology I use: I’m only ever going to talk about “PCA eigenvalues”, “PCA eigenpatterns”, and “PCA projected components” (or “PCA projected component time series”). Given the number of discussions I’ve been involved in in the past where people have been talking past each other just because one person means one thing by “principal component” and the other means something else, I’d much rather pay the price of a little verbosity to avoid that kind of confusion.

Z500 PCA calculation

The PCA calculation for the Z500 data can be done quite easily in Haskell. We’ll show in this section how it’s done, and we’ll use the code to address a couple of remaining issues with how spatio-temporal PCA works (specifically, area scaling for data in latitude/longitude coordinates and the relative scaling of PCA eigenpatterns and projected components).

There are three main steps to the PCA calculation: first we need to centre our data and calculate the covariance matrix, then we need to do the eigendecomposition of the covariance matrix, and finally we need to project our original data onto the PCA eigenvectors. We need to think a little about the data volumes involved in these steps. Our Z500 data has 1080 spatial points, so the covariance matrix will be a 1080 × 1080 matrix, i.e. it will have 1,166,400 entries. This isn’t really a problem, and performing an eigendecomposition of a matrix of this size is pretty quick. What can be more of a problem is the size of the input data itself – although we only have 1080 spatial points, we could in principle have a large number of time samples, enough that we might not want to read the whole of the data set into memory at once for the covariance matrix calculation. We’re going to demonstrate two approaches here: in the first “online” calculation, we’ll just read all the data at once and assume that we have enough memory; in the second “offline” approach, we’ll only ever read a single time step of Z500 data at a time into memory. Note that in both cases, we’re going to calculate the full covariance matrix in memory and do a direct eigendecomposition using SVD. There are offline approaches for calculating the covariance and there are iterative methods that allow you to calculate some eigenvectors of a matrix without doing a full eigendecomposition, but we’re not going to worry about that here.

Online PCA calculation

As usual, the code is in a Gist.

For the online calculation, the PCA calculation itself is identical to our two-dimensional test case and we reuse the pca function from the earlier post. The only thing we need to do is to read the data in as a matrix to pass to the pca function. In fact, there is one extra thing we need to do before passing the Z500 anomaly data to the pca function. Because the Z500 data is sampled on a regular latitude/longitude grid, grid points near the North pole correspond to much smaller areas of the earth than grid points closer to the equator. In order to compensate for this, we scale the Z500 anomaly data values by the square root of the cosine of the latitude – this leads to covariance matrix values that scale as the cosine of the latitude, which gives the correct area weighting. The listing below shows how we do this. First we read the NetCDF data then we use the hmatrix build function to construct a suitably scaled data matrix:

  Right z500short <- get innc z500var :: RepaRet3 CShort

  -- Convert anomaly data to a matrix of floating point values,
  -- scaling by square root of cos of latitude.
  let latscale = (\lt -> realToFrac $ sqrt $ cos (lt / 180.0 * pi)) lat
      z500 = build (ntime, nspace)
             (\t s -> let it = truncate t :: Int
                          is = truncate s :: Int
                          (ilat, ilon) = divMod is nlon
                          i = Repa.Z Repa.:. it Repa.:. ilat Repa.:. ilon
                        in (latscale ! ilat) *
                           (fromIntegral $ z500short Repa.! i)) :: Matrix Double

Once we have the scaled Z500 anomaly data in a matrix, we call the pca function, which does both the covariance matrix calculation and the PCA eigendecomposition and projection, then write the results to a NetCDF file. We end up with a NetCDF file containing 1080 PCA eigenpatterns, each with 72 × 15 data points on our latitude/longitude grid and PCA projected component time series each with 9966 time steps.

One very important thing to note here is the relative scaling of the PCA eigenpatterns and the PCA projected component time series. In the two-dimensional mussel shell example, there was no confusion about the fact that the PCA eigenvectors as we presented them were unit vectors, and the PCA projected components had the units of length measured along those unit vectors. Here, in the spatio-temporal case, there is much potential for confusion (and the range of conventions in the climate science literature doesn’t do anything to help alleviate that confusion). To make things very clear: here, the PCA eigenvectors are still unit vectors and the PCA projected component time series have the units of Z500!

The reason for the potential confusion is that people quite reasonably like to draw maps of the PCA eigenpatterns, but they also like to think of these maps as being spatial patterns of Z500 variation, not just as basis vectors. This opens the door to all sorts of more or less reputable approaches to scaling the PCA eigenpatterns and projected components. One well-known book on statistical analysis in climate research suggests that people should scale their PCA eigenpatterns by the standard deviation of the corresponding PCA projected component time series and the values of the PCA projected component time series should be divided by their standard deviation. The result of this is that the maps of the PCA eigenpatterns look like Z500 maps and all of the PCA projected component time series have standard deviation of one. People then talk about the PCA eigenpatterns as showing a “typical ± 1 SD” event.

Here, we’re going to deal with this issue by continuing to be very explicit about what we’re doing. In all cases, our PCA eigenpatterns will be unit vectors, i.e. the things we get back from the pca function, without any scaling. That means that the units in our data live on the PCA projected component time series, not on the PCA eigenpatterns. When we want to look at a map of a PCA eigenpattern in a way that makes it look like a “typical” Z500 deviation from the mean (which is a useful thing to do), we will say something like “This plot shows the first PCA eigenpattern scaled by the standard deviation of the first PCA projected component time series.” Just to be extra explicit!

Offline PCA calculation

The “online” PCA calculation didn’t require any extra work, apart from some type conversions and the area scaling we had to do. But what if we have too much data to read everything into memory in one go? Here, I’ll show you how to do a sort of “offline” PCA calculation. By “offline”, I mean an approach that only ever reads a single time step of data from the input at a time, and only ever writes a single time step of the PCA projected component time series to the output at a time.

Because we’re going to be interleaving calculation and I/O, we’re going to need to make our PCA function monadic. Here’s the main offline PCA function:

pcaM :: Int -> (Int -> IO V) -> (Int -> V -> IO ()) -> IO (V, M)
pcaM nrow readrow writeproj = do
  (mean, cov) <- meanCovM nrow readrow
  let (_, evals, evecCols) = svd cov
      evecs = fromRows $ map evecSigns $ toColumns evecCols
      evecSigns ev = let maxelidx = maxIndex $ cmap abs ev
                         sign = signum (ev ! maxelidx)
                     in cmap (sign *) ev
      varexp = scale (1.0 / sumElements evals) evals
      project x = evecs #> (x - mean)
  forM_ [0..nrow-1] $ \r -> readrow r >>= writeproj r . project
  return (varexp, evecs)

It makes use of a couple of convenience type synonyms:

type V = Vector Double
type M = Matrix Double

The pcaM function takes as arguments the number of data rows to process (in our case, the number of time steps), an IO action to read a single row of data (given the zero-based row index), and an IO action to write a single row of PCA projected component time series data. As with the “normal” pca function, the pcaM function returns the PCA eigenvalues and PCA eigenvectors as its result.

Most of the pcaM function is the same as the pca function. There are only two real differences. First, the calculation of the mean and covariance of the data uses the meanCovM function that we’ll look at in a moment. Second, the writing of the PCA projected component time series output is done by a monadic loop that uses the IO actions passed to pcaM to alternately read, project and write out rows of data (the pca function just returned the PCA projected component time series to its caller in one go).

Most of the real differences to the pca function lie in the calculation of the mean and covariance of the input data:

meanCovM :: Int -> (Int -> IO V) -> IO (V, M)
meanCovM nrow readrow = do
  -- Accumulate values for mean calculation.
  refrow <- readrow 0
  let maddone acc i = do
        row <- readrow i
        return $! acc + row
  mtotal <- foldM maddone refrow [1..nrow-1]

  -- Calculate sample mean.
  let mean = scale (1.0 / fromIntegral nrow) mtotal

  -- Accumulate differences from mean for covariance calculation.
  let refdiff = refrow - mean
      caddone acc i = do
        row <- readrow i
        let diff = row - mean
        return $! acc + (diff `outer` diff)
  ctotal <- foldM caddone (refdiff `outer` refdiff) [1..nrow-1]

  -- Calculate sample covariance.
  let cov = scale (1.0 / fromIntegral (nrow - 1)) ctotal

  return (mean, cov)

Since we don’t want to read more than a single row of input data at a time, we need to explicitly accumulate data for the mean and covariance calculations. That means making two passes over the input data file, reading a row at a time – the maddone and caddone helper functions accumulate a single row of data for the mean and covariance calculations. The accumulator for the mean is pretty obvious, but that for the covariance probably deserves a bit of comment. It uses the hmatrix outer function to calculate (xix)(xix)T (where xi is the ith data row (as a column vector) and x is the data mean), which is the appropriate contribution to the covariance matrix for each individual data row.

Overall, the offline PCA calculation makes three passes over the input data file (one for the mean, one for the covariance, one to project the input data onto the PCA eigenvectors), reading a single data row at a time. That makes it pretty slow, certainly far slower than the online calculation, which reads all of the data into memory in one go, then does all the mean, covariance and projection calculations in memory, and finally writes out the PCA projected components in one go. However, if you have enough data that you can’t do an online calculation, this is the way to go. You can obviously imagine ways to make this more efficient, probably by reading batches of data rows at a time. You’d still need to do three passes over the data, but batching the reads would make things a bit quicker.

Visualising the PCA results

There are three things we can look at that come out of the PCA analysis of the Z500 anomaly data: the PCA eigenvalues (best expressed as “fraction of variance explained”), the PCA eigenpatterns and the PCA projected component time series.

First, let’s look at the eigenvalues. This plot shows the fraction of variance explained for the first 100 PCA eigenvalues of the Z500 anomaly data, both individually (blue) and cumulatively (orange):

The eigenvalues are ordered in decreasing order of magnitude in what’s usually called a “scree plot”. The reason for the name is pretty obvious, since the eigenvalues fall off quickly in magnitude giving the graph the look of cliff face with a talus slope at its foot. We often look for a “knee” in a plot like this to get some idea of how many PCA eigenpatterns we need to consider to capture a good fraction of the total variance in the data we’re looking at. Here we can see that just ten of the PCA eigenpatterns explain about half of the total variance in the Z500 anomaly data (which is a set of 1080-dimensional vectors, remember). However, there’s not all that much of a “knee” in the scree plot here, which is pretty typical for climate and meteorological data – we often see a gradual fall-off in PCA eigenvalue magnitude rather than a discrete set of larger magnitude eigenvalues that we can identify as “the important ones”.

We can get some idea of what’s going on with this gradual fall-off by looking at the PCA eigenpatterns. As mentioned in the previous section, there is a question about how we scale these for display. To be completely explicit about things, here we’re going to plot PCA eigenpatterns scaled by the standard deviation of the corresponding PCA projected component time series. This gives us “typical one standard deviation” patterns that we can plot with units of geopotential height. These are usually easier to interpret than the “unit vector” PCA eigenpatterns than come out of the PCA calculation.

Here are the first six PCA eigenpatterns for the Z500 anomaly data (you can click on these images to see larger versions; the numbers in parentheses show the fraction of total Z500 anomaly variance explained by each PCA eigenpattern.):

Z500 PCA eigenpatterns 1-6

For comparison here are the eigenpatterns for eigenpatterns 10,20,,60:

Z500 PCA eigenpatterns 10-60

The first thing to note about these figures is that the spatial scales of variation for the PCA eigenpatterns corresponding to smaller eigenvalues (i.e. smaller explained variance fractions) are also smaller – for the most extreme case, compare the dipolar circumpolar spatial pattern for the first eigenpattern (first plot in the first group of plots) to the fine-scale spatial features for the 60th eigenpattern (last plot in the second group). This is what we often seen when we do PCA on atmospheric data. The larger spatial scales capture most of the variability in the data so are represented by the first few eigenpatterns, while smaller scale spatial variability is represented by later eigenpatterns. Intuitively, this is probably related to the power-law scaling in the turbulent cascade of energy from large (planetary) scales to small scales (where dissipation by thermal diffusion occurs) in the atmosphere1.

The next thing we can look at, at least in the first few patterns in the first group of plots, are some of the actual patterns of variability these things represent. The first PCA eigenpattern, for example, represents a dipole in Z500 anomaly variability with poles in the North Atlantic just south of Greenland and over mainland western Europe. If you look back at the blocking Z500 anomaly plots in an earlier post, you can kind of convince yourself that this first PCA eigenpattern looks a little like some instances of a blocking pattern over the North Atlantic. Similarly, the second PCA eigenpattern is mostly a dipole between the North Pacific and North America (with some weaker associated variability over the Atlantic, so we might expect this somehow to be related to blocking episodes in the Pacific sector.

This is all necessarily a bit vague, because these patterns represent only part of the variability in the data, with each individual pattern representing only a quite small fraction of the variability (8.86% for the first eigenpattern, 7.46% for the second, 6.27% for the third). At any particular point in time, the pattern of Z500 anomalies in the atmosphere will be made up of contributions from these patterns plus many others. What we hope though is that we can tease out some interesting characteristics of the atmospheric flow by considering just a subset of these PCA eigenpatterns. Sometimes this is really easy and obvious – if you perform PCA and find that there are two leading eigenpatterns that explain 80% of the variance in your data, then you can quite straightforwardly press ahead with analysing only those two patterns of variability, safe in the knowledge that you’re capturing most of what’s going on in your data. In our case, we’re going to try to get some sense of what’s going on by looking at only the first three PCA eigenpatterns (we’ll see why three in the next article). The first three eigenpatterns explain only 22.59% of the total variance in our Z500 anomaly data, so this isn’t obviously a smart thing to do. It does turn out to work and to be quite educational though!

The last component of the output from the PCA procedure is the time series of PCA projected component values. Here we have one time series (of 9966 days) for each of the 1080 PCA eigenpatterns that we produced. At each time step, the actual Z500 anomaly field can be recovered by adding up all the PCA eigenpatterns, each weighted by the corresponding projected component. You can look at plots of these time series, but they’re not in themselves all that enlightening. I’ll say some more about them in the next article, where we need to think about the autocorrelation properties of these time series.

(As a side note, I’d comment that the PCA eigenpatterns shown above match up pretty well with those in Crommelin’s paper, which is reassuring. The approach we’re taking here, of duplicating the analysis done in an existing paper, is actually a very good way to go about developing new data analyis code – you can see quite quickly if you screw things up as you’re going along by comparing your results with what’s in the paper. Since I’m just making up all the Haskell stuff here as I go along, this is pretty handy!)

  1. But don’t make too much of that, not in any kind of quantitative sense anyway – there’s certainly no obvious power law scaling in the explained variance of the PCA eigenpatterns as a function of eigenvalue index, unless you look at the data with very power law tinted spectacles! I’m planning to look at another paper at some point in the future that will serve as a good vehicle for exploring this question of when and where we can see power law behaviour in observational data.}

<script src="" type="text/javascript"></script> <script type="text/javascript"> (function () { var articleId = fyre.conv.load.makeArticleId(null); fyre.conv.load({}, [{ el: 'livefyre-comments', network: "", siteId: "290329", articleId: articleId, signed: false, collectionMeta: { articleId: articleId, url: fyre.conv.load.makeCollectionUrl(), } }], function() {}); }()); </script>

October 12, 2014 06:46 PM

October 11, 2014

Lee Pike


I wanted to try out a recent feature in GHC this week, so I was using HEAD. When I say using, I mean it: I wasn’t developing with it, but using it to build Ivory, our safe C EDSL, as well as some libraries written in Ivory. I want to point out a few dragons that lay ahead for others using HEAD ( today.

  • The Functor, Applicative, Monad hierarchy (as well as the Alternative and MonadPlus hierarchy) is no longer a warning but an error. You’ll be adding a lot of instances to library code.
  • You’ll need to update Cabal, which comes as a submodule with the GHC sources. Unfortunately, building Cabal was a pain because of the bootstrapping problem. The script in cabal-install is broken (e.g., outdated dependency versions). Getting it to work involved using runhaskell directly, modifying a bunch of Cabal’s dependencies, and getting a little help from Iavor Diatchki.
  • Some of the datatypes in Template Haskell have changed; notably, the datatypes for creating class constraints have been folded into the datatype for constructing types (constraint kinds!). Many libraries that depend on Template Haskell breaks as a result.
  • I won’t mention the issues with package dependency upper bounds. Oops, I just did.

Fortunately, once Cabal is updated, Cabal sandboxes work just fine. I wrote a simple shell script to swap out my sandboxes to switch between GHC versions. (It would be a nice feature if Cabal recognized you are using a different version of GHC than the one the cabal sandbox was originally made and created a new sandbox automatically.)

Because I’m on OSX and use Homebrew, I tried using it to manage my GHC versions, including those installed with Brew and those built and installed manually. It works great. When building GHC, configure it to install into your Cellar, or wherever you have Brew install packages. So I ran

> ./configure --prefix=/usr/local/Cellar/ghc/7.9

which makes Brew aware of the new version of GHC, despite being manually installed. After it’s installed,

> brew switch ghc 7.9

takes care of updating all your symbolic links. No more making “my-ghc-7.9″ symbolic links or writing custom shell scripts.

by Lee Pike at October 11, 2014 04:23 AM

October 10, 2014

Functional Jobs

Functional Software Engineer at Cake Solutions Ltd (Full-time)

At Cake Solutions we work with our customers to build high-quality, scalable and resilient software systems using the latest technology. As a software engineer, you'll not only be able to write good, maintainable software, but also stay at the forefront of technology and advocate a principled approach to software engineering. You'll get the opportunity to work on a wide range of interesting projects for our clients, using Java, Scala, Play, and Akka.

What to expect:

To begin with, you will take part in a 2 week Typesafe certified workshop style training course where you will get introduced to Scala, Akka and Play.

You can expect a lively and challenging environment with very interesting problems to solve. We are happy to train and mentor the right people; the important thing is to have a bright mind and the motivation to question, explore and learn. Having published work, on Github or elsewhere, is extremely helpful. CVs should focus on what you really know, not what you've seen once.

If you are a graduate, we are not expecting commercial experience, but we want to see evidence of hobby / university engineering. We do, however, expect you to be independent and to have good understanding of the principles of software engineering.

You will also get flexible working hours and gym membership.

Skills & Requirements As a software engineer at Cake Solutions, you should:

-- Have a good understanding of Java and the JVM. Experience with Scala is a plus.

-- Know how to use UNIX or Linux.

-- Know how to apply object-oriented and functional programming styles to real-world software engineering.

-- Have experience with at least one database system and be aware of the wider database landscape (relational, document, key/value, graph, ...).

-Understand modern software development practices, such as testing, continuous integration and producing maintainable code.

Advantages include:

-- Open-source contributions.

-- Modern web development experience.

-- An understanding of asynchronous and non-blocking principles.

-- Experience in writing multi-threaded software.

-- More detailed knowledge of strongly-typed functional programming (e.g. Scala, Haskell, OCaml).

-- Even more specifically, experience with Akka or Play.

About Cake Solutions Ltd Cake Solutions architects, implements and maintains modern and scalable software, which includes server-side, rich browser applications and mobile development. Alongside the software engineering and delivery, Cake Solutions provides mentoring and training services. Whatever scale of system you ask us to develop, we will deliver the entire solution, not just lines of code. We appreciate the importance of good testing,Continuous Integration and delivery, and DevOps. We motivate, mentor and guide entire teams through modern software engineering. This enables us to deliver not just software, but to transform the way organisations think about and execute software delivery.

The core members of our team are published authors and experienced speakers. Our teams have extensive experience in designing and implementing event-driven, resilient, responsive and scalable systems. We use modern programming languages such as Scala, Java, C++, Objective-C, and JavaScript to implement the systems' components. We make the most of messaging infrastructures, modern DBMSs, and all other services that make up today's enterprise systems. Automated provisioning, testing, integration and delivery allow us to release high quality systems safely and predictably. The mentoring through continuous improvement at all levels of the project work gives our clients the insight and flexibility they expect.

We rely on open source software in our day-to-day development; it gives us access to very high quality code, allows us to make improvements if we need to, and provides access to excellent source of inspiration and talent. We give back to the open source community by contributing to the open source projects we use, and by publishing our own open source projects. The team have contributed various Typesafe Activator templates, and shared their expertise with Akka and Scala in Akka Patterns and Akka Extras. Outside of the Typesafe stack, we have contributed to Tru-strap, and OpenCV. The team members have also created open source projects that scratch our own itch: we have Reactive Monitor, Specs2 Spring and Scalad.

Get information on how to apply for this position.

October 10, 2014 12:02 PM

Tom Schrijvers

ICFP 2015: Call for Workshop and Co-Located Event Proposals

                            ICFP 2015
 20th ACM SIGPLAN International Conference on Functional Programming
                   August 30 – September 5, 2015
                        Vancouver, Canada

The 120th ACM SIGPLAN International Conference on Functional
Programming will be held in Vancouver, British Columbia, Canada on
August 30-September 5, 2015.  ICFP provides a forum for researchers
and developers to hear about the latest work on the design,
implementations, principles, and uses of functional programming.

Proposals are invited for workshops (and other co-located events, such
as tutorials) to be affiliated with ICFP 2015 and sponsored by
SIGPLAN. These events should be less formal and more focused than ICFP
itself, include sessions that enable interaction among the attendees,
and foster the exchange of new ideas. The preference is for one-day
events, but other schedules can also be considered.

The workshops are scheduled to occur on August 30 (the day
before ICFP) and September 3-5 (the three days after ICFP).


Submission details
 Deadline for submission:     November 16, 2014
 Notification of acceptance:  December 15, 2014

Prospective organizers of workshops or other co-located events are
invited to submit a completed workshop proposal form in plain text
format to the ICFP 2015 workshop co-chairs (Tom Schrijvers and Nicolas
Wu), via email to [email protected] by November 16,
2014. (For proposals of co-located events other than workshops, please
fill in the workshop proposal form and just leave blank any sections
that do not apply.) Please note that this is a firm deadline.

Organizers will be notified if their event proposal is accepted by
December 15, 2014, and if successful, depending on the event, they
will be asked to produce a final report after the event has taken
place that is suitable for publication in SIGPLAN Notices.

The proposal form is available at:

Further information about SIGPLAN sponsorship is available at:


Selection committee

The proposals will be evaluated by a committee comprising the
following members of the ICFP 2015 organising committee, together with
the members of the SIGPLAN executive committee.

 Workshop Co-Chair: Tom Schrijvers                          (KU Leuven)
 Workshop Co-Chair: Nicolas Wu                   (University of Oxford)
 General Chair :    Kathleen Fisher                  (Tufts University)
 Program Chair:     John Reppy                (University of Chicago)


Further information

Any queries should be addressed to the workshop co-chairs (Tom
Schrijvers and Nicolas Wu), via email to
[email protected].

by Tom Schrijvers ([email protected]) at October 10, 2014 06:58 AM

October 09, 2014

Philip Wadler


SNAPL will take place 3-6 May 2015 at Asilomar.
SNAPL provides a biennial venue to focus on big-picture questions, long-running research programs, and experience reports in programming languages. The ideal SNAPL paper takes one of two forms:
  • A promising idea that would benefit from discussion and feedback.
  • A paper about an ongoing research project that might not be accepted at a traditional conference.
Examples include papers that
  • lay out a research roadmap
  • summarize experiences
  • present negative results
  • discuss design alternatives
SNAPL is interested in all aspects of programming languages. The PC is broad and open-minded and receptive to interesting ideas that will provoke thought and discussion.
Interesting papers would combine the specific with the general. Submissions are limited to five pages (excluding bibliography), and must be formatted using ACM SIG style. The final papers can be up to 10 pages in length. Accepted papers will be published on an open-access site, probably arXiv CoRR.
To encourage authors to submit only their best work, each person can be an author or co-author of a single paper only. SNAPL will prefer experienced presenters and each submission must indicate on the submission site which co-author will present the paper at the conference.
SNAPL also accepts one-page abstracts. Abstracts will be reviewed lightly and all submitted abstracts will be published on the SNAPL 2015 web page. Authors of selected abstracts will be invited to give a 5-minute presentation at the gong show at the conference.
SNAPL is unaffiliated with any organization. It is a conference for the PL community organized by the PL community.

Important Dates

Submission: January 6, 2015
Decisions announced: February 20, 2015
Final versions due: March 20, 2015
Conference: May 3-6, 2015

by Philip Wadler ([email protected]) at October 09, 2014 09:39 AM

October 08, 2014

Functional Jobs

Software Engineer / Developer at Clutch Analytics/ Windhaven Insurance (Full-time)

Position Description:

Windhaven Insurance is seeking an experienced Software Engineer/ Developer to join a small elite team who are disrupting the auto insurance industry with innovative technology. You will work in a startup atmosphere as part of a subsidiary of a rapidly growing larger company. This means that along with programming, you’ll have a hand in the larger issues, such as product architecture and direction.


  • Someone who knows at least one functional language such as: Elixir, Erlang, Lisp, Scheme, Haskell, ML, Clojure, Racket, Ocaml or F#
  • Someone who ENGAGES a.k.a “gives a damn” about what we do and how technology can help make us more competitive in the marketplace.
  • Someone who COLLABORATES. We have the flattest organization in the industry designed with one main goal – the TEAM. Are you hungry to make a significant impact in the tech world?
  • Someone who RESPECTS Teammates, customers and the community.

Special Requirements:

You need to have made an achievement, in any field, of significance worth talking about, that required grit, determination and skill. This can be a personal achievement that few know about, or it could have gotten coverage by the media. It doesn’t matter, what does matter is some demonstration of grit, determination and skill. If it can be described succinctly, please describe the achievement in your cover letter, if not, be prepared to tell us all about it during the interview.

Professional & Technical Qualifications:

  • Experience with languages such as Elixir or Erlang
  • Experience with Ember.js (or Angular.js)
  • Experience with NoSQL data stores, such as Couchbase, Riak, etc.
  • DevOps experience a plus
  • Ability to explain technical issues and recommend solutions
  • Strong team player with a high degree of flexibility
  • Excellent verbal and written communication skills


Competitive salary based on experience. Benefits package includes: medical, dental, vision insurance, life insurance, short term and long term disability insurance, 401K, paid time off. EOE.

Get information on how to apply for this position.

October 08, 2014 01:26 PM

October 07, 2014

Neil Mitchell

Bake - Continuous Integration System

Summary: I've written a continuous integration system, in Haskell, designed for large projects. It works, but probably won't scale yet.

I've just released bake, a continuous integration system - an alternative to Jenkins, Travis, Buildbot etc. Bake eliminates the problem of "broken builds", a patch is never merged into the repo before it has passed all the tests.

Bake is designed for large, productive, semi-trusted teams:

  • Large teams where there are at least several contributors working full-time on a single code base.
  • Productive teams which are regularly pushing code, many times a day.
  • Semi-trusted teams where code does not go through manual code review, but code does need to pass a test suite and perhaps some static analysis. People are assumed to be fallible.

Current state: At the moment I have a rudimentary test suite, and it seems to mostly work, but Bake has never been deployed for real. Some optional functionality doesn't work, some of the web UI is a bit crude, the algorithms probably don't scale and all console output from all tests is kept in memory forever. I consider the design and API to be complete, and the scaling issues to be easily fixable - but it's easier to fix after it becomes clear where the bottleneck is. If you are interested, take a look, and then email me.

To give a flavour, the web GUI looks of a running Bake system looks like:

The Design

Bake is a Haskell library that can be used to put together a continuous integration server. To run Bake you start a single server for your project, which coordinates tasks, provides an HTTP API for submitting new patches, and a web-based GUI for viewing the progress of your patches. You also run some Bake clients which run the tests on behalf of the server. While Bake is written in Haskell, most of the tests are expected to just call some system command.

There are a few aspects that make Bake unique:

  • Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.
  • Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.
  • Bake allows multiple clients to run tests, even if some tests are only able to be run on some clients, allowing both parallelisation and specialisation (testing both Windows and Linux, for example).
  • Bake can detect that tests are no longer valid, for example because they access a server that is no longer running, and report the issue without blaming the submitted patches.

An Example

The test suite provides both an example configuration and commands to drive it. Here we annotate a slightly simplified version of the example, for lists of imports see the original code.

First we define an enumeration for where we want tests to run. Our server is going to require tests on both Windows and Linux before a patch is accepted.

data Platform = Linux | Windows deriving (Show,Read)
platforms = [Linux,Windows]

Next we define the test type. A test is something that must pass before a patch is accepted.

data Action = Compile | Run Int deriving (Show,Read)

Our type is named Action. We have two distinct types of tests, compiling the code, and running the result with a particular argument. Now we need to supply some information about the tests:

allTests = [(p,t) | p <- platforms, t <- Compile : map Run [1,10,0]]

execute :: (Platform,Action) -> TestInfo (Platform,Action)
execute (p,Compile) = matchOS p $ run $ do
cmd "ghc --make Main.hs"
execute (p,Run i) = require [(p,Compile)] $ matchOS p $ run $ do
cmd ("." </> "Main") (show i)

We have to declare allTests, then list of all tests that must pass, and execute, which gives information about a test. Note that the test type is (Platform,Action), so a test is a platform (where to run the test) and an Action (what to run). The run function gives an IO action to run, and require specifies dependencies. We use an auxiliary matchOS to detect whether a test is running on the right platform:

myPlatform = Windows
myPlatform = Linux

matchOS :: Platform -> TestInfo t -> TestInfo t
matchOS p = suitable (return . (==) myPlatform)

We use the suitable function to declare whether a test can run on a particular client. Finally, we define the main function:

main :: IO ()
main = bake $
ovenGit "" "master" $
ovenTest readShowStringy (return allTests) execute

We define main = bake, then fill in some configuration. We first declare we are working with Git, and give a repo name and branch name. Next we declare what the tests are, passing the information about the tests. Finally we give a host/port for the server, which we can visit in a web browser or access via the HTTP API.

Using the Example

Now we have defined the example, we need to start up some servers and clients using the command line for our tool. Assuming we compiled as bake, we can write bake server and bake client (we'll need to launch at least one client per OS). We can view the state by visiting in a web browser.

To add a patch we can run bake addpatch --name=cb3c2a71, using the SHA1 of the commit, which will try and integrate that patch into the master branch, after all the tests have passed.

by Neil Mitchell ([email protected]) at October 07, 2014 09:24 PM

Brandon Simmons

Announcing unagi-chan

Today I released version 0.2 of unagi-chan, a haskell library implementing fast and scalable FIFO queues with a nice and familiar API. It is available on hackage and you can install it with:

$ cabal install unagi-chan

This version provides a bounded queue variant (and closes issue #1!) that has performance on par with the other variants in the library. This is something I’m somewhat proud of, considering that the standard TBQueue is not only significantly slower than e.g. TQueue, but also was seen to livelock at a fairly low level of concurrency (and so is not included in the benchmark suite).

Here are some example benchmarks. Please do try the new bounded version and see how it works for you.


What follows are a few random thoughts more or less generally-applicable to the design of bounded FIFO queues, especially in a high-level garbage-collected language. These might be obvious, uninteresting, or unintelligible.

What is Bounding For?

I hadn’t really thought much about this before: a bounded queue limits memory consumption because the queue is restricted from growing beyond some size.

But this isn’t quite right. If for instance we implement a bounded queue by pre-allocating an array of size bounds then a write operation need not consume any additional memory; indeed the value to be written has already been allocated on the heap before the write even begins, and will persist whether the write blocks or returns immediately.

Instead constraining memory usage is a knock-on effect of what we really care about: backpressure; when the ratio of “producers” to their writes is high (the usual scenario), blocking a write may limit memory usage by delaying heap allocations associated with elements for future writes.

So bounded queues with blocking writes let us:

  • when threads are “oversubscribed”, transparently indicate to the runtime which work has priority
  • limit future resource usage (CPU time and memory) by producer threads

We might also like our bounded queue to support a non-blocking write which returns immediately with success or failure. This might be thought of (depending on the capabilities of your language’s runtime) as more general than a blocking write, but it also supports a distinctly different notion of bounding, that is bounding message latency: a producer may choose to drop messages when a consumer falls behind, in exchange for lower latency for future writes.

Unagi.Bounded Implementation Ideas

Trying to unpack the ideas above helped in a few ways when designing Unagi.Bounded. Here are a few observations I made.

We need not block before “writing”

When implementing blocking writes, my intuition was to (when the queue is “full”) have writers block before “making the message available” (whatever that means for your implementation). For Unagi that means blocking on an MVar, and then writing a message to an assigned array index.

But this ordering presents a couple of problems: first, we need to be able to handle async exceptions raised during writer blocking; if its message isn’t yet “in place” then we need to somehow coordinate with the reader that would have received this message, telling it to retry.

By unpacking the purpose of bounding it became clear that we’re free to block at any point during the write (because the write per se does not have the memory-usage implications we originally naively assumed it had), so in Unagi.Bounded writes proceed exactly like in our other variants, until the end of the writeChan, at which point we decide when to block.

This is certainly also better for performance: if a wave of readers comes along, they need not wait (themselves blocking) for previously blocked writers to make their messages available.

One hairy detail from this approach: an async exception raised in a blocked writer does not cause that write to be aborted; i.e. once entered, writeChan always succeeds. Reasoning in terms of linearizability this only affects situations in which a writer thread is known-blocked and we would like to abort that write.

Fine-grained writer unblocking in probably unnecessary and harmful

In Unagi.Bounded I relax the bounds constraint to “somewhere between bounds and bounds*2”. This allows me to eliminate a lot of coordination between readers and writers by using a single reader to unblock up to bounds number of writers. This constraint (along with the constraint that bounds be a power of two, for fast modulo) seemed like something everyone could live with.

I also guess that this “cohort unblocking” behavior could result in some nicer stride behavior, with more consecutive non-blocking reads and writes, rather than having a situation where the queue is almost always either completely full or empty.

One-shot MVars and Semaphores

This has nothing to do with queues, but just a place to put this observation: garbage-collected languages permit some interesting non-traditional concurrency patterns. For instance I use MVars and IORefs that only ever go from empty to full, or follow a single linear progression of three or four states in their lifetime. Often it’s easier to design algorithms this way, rather than by using long-lived mutable variables (for instance I struggled to come up with a blocking bounded queue design that used a circular buffer which could be made async-exception-safe).

Similarly the CAS operation (which I get exported from atomic-primops) turns out to be surprisingly versatile far beyond the traditional read/CAS/retry loop, and to have very useful semantics when used on short-lived variables. For instance throughout unagi-chan I do both of the following:

  • CAS without inspecting the return value, content that we or any other competing thread succeeded.

  • CAS using a known initial state, avoiding an initial read

October 07, 2014 05:41 PM

Joachim Breitner

New website layout

After 10 years I finally got around to re-decorating my website. One reason was ICFP, where just too many people told me that I don’t look like on my old website any more (which is very true). Another reason was that I was visting my brother, who is very good at web design (check out his portfolio), who could help me a bit.

I wanted something practical and maybe a bit staid, so I drew inspiration from typical Latex typography, and also from Edward Z. Yang’s blog: A serif font (Utopia) for the main body, justified and hyphenated text. Large section headers in a knobbly bold sans-serif font (Latin Modern Sans, which reasonably resembles Computer Modern). To intensify that impression, I put the main text on a white box that lies – like a paper – on the background. As a special gimmic the per-page navigation (or, in the case of the blog, the list of categories) is marked up like a figure in a paper.

Of course this would be very dire without a suitable background. I really like the procedural art by Jared Tarbell, espcially substrate and interAggregate. Both have been turned into screensavers shipped with xscreensaver, so I hacked the substrate code to generate a seamless tile and took a screenshot of the result. I could not make up my mind yet how dense it has to be to look good, so I for every page I randomly pick one of six variants randomly for now.

I simplified the navigation a bit. The old News section has been removed recently already. The Links section is gone – I guess link lists on homepages are so 90s. The section Contact and About me are merged and awaiting some cleanup. The link to the satire news Heisse News is demoted to a mention on the Contents section.

This hopefully helps to make the site navigatable on mobile devices (the old homepage was unusable). CSS media queries adjust the layout slightly on narrow screens, and separately for print devices.

Being the nostaltic I am, I still keep the old design, as well as the two designs before that, around and commented their history.

by Joachim Breitner ([email protected]) at October 07, 2014 03:40 PM

ghc-heap-view for GHC 7.8

Since the last release of ghc-heap-view, which was compatible with GHC-7.6, I got 8 requests for a GHC-7.8 compatible version. I started working on it in January, but got stuck and then kept putting it off.

Today, I got the ninths request, and I did not want to wait for the tenth, so I finally finished the work and you can use the new ghc-heap-view-0.5.2 with GHC-7.8.

I used this chance to migrate its source repository from Darcs to git (mirrored on GitHub), so maybe this means that when 7.10 comes out, the requests to update it come with working patches :-). I also added a small test script so that travis can check it: ghc-heap-view

I did not test it very thoroughly yet. In particular, I did not test whether ghc-vis works as expected.

I still think that the low-level interface that ghc-heap-view creates using custom Cmm code should move into GHC itself, so that it does not break that easily, but I still did not get around to propose a patch for that.

by Joachim Breitner ([email protected]) at October 07, 2014 01:55 PM

Jan Stolarek

Weight-biased leftist heaps verified in Haskell using dependent types

In January I announced my implementation of weight-biased leftist heaps verified with dependent types in Agda. This was part of my work on a paper submitted to CAV’14 conference. The paper got rejected and I decided not to resubmit it anywhere else. At this year’s ICFP listening to Stephanie Weirich’s keynote speech motivated me to finally port that implementation to Haskell, something that I had planned for a couple of months now. You can take a look at the result on github. Here I want to share some of my experiences and insights.

My overall impression is that porting from Agda to Haskell turned out to be fairly straightforward. It was definitely not a complete rewrite. More like syntax adjustments here and there. There were of course some surprises and bumps along the way but nothing too problematic. More precise details are given in the code comments.

Agda beats Haskell

When it comes to programming with dependent types Agda, being a fully-fledged dependently-typed language, beats Haskell in many aspects:

  • Agda has the same language for terms and types. Haskell separates these languages, which means that if I want to have addition for natural numbers then I need to have two separate definitions for terms and types. Moreover, to tie types and terms together I need singleton types. And once I have singleton types then I need to write third definition of addition that works on singletons. All of this is troublesome to write and use. (This tedious process can be automated by  using singletons package.)
  • interactive agda-mode for Emacs makes writing code much simpler in Agda. Here I was porting code that was already written so having an interactive Emacs mode for Haskell was not at all important. But if I were to write all that dependently-typed code from scratch in Haskell this would be painful. We definitely need better tools for dependently-typed programming in Haskell.
  • Agda admits Unicode identifiers. This allows me to have type constructors like or variables like p≥b. In Haskell I have GEq and pgeb, respectively. I find that less readable. (This is very subjective.)
  • Agda has implicit arguments that can be deduced from types. Haskell does not, which makes some function calls more difficult. Surprisingly that was not as huge problem as I initially thought it will be.
  • Agda is total, while Haskell is not. Since there are bottoms in Haskell it is not sound as a logic. In other words we can prove false eg. by using undefined.

Haskell beats Agda

The list is noticeably shorter:

  • Haskell has much better term-level syntax. In many places this resulted in significantly shorter code than in Agda.
  • Haskell is not total. As stated earlier this has its drawbacks but it also has a good side: we don’t need to struggle with convincing the termination checker that our code does actually terminate. This was painful in Agda since it required using sized types.
  • Haskell’s gcastWith function is much better than Agda’s subst. Both these functions allow type-safe casts given the proof that the cast is safe. The difference is that Agda’s subst requires more explicit arguments (as I noted earlier the opposite is usually the case) and restricts the cast to the last type parameter (Haskell allows cast for any type parameter).


While the list of wins is longer for Agda than it is for Haskell I’m actually very happy with Haskell’s performance in this task. The verification in Haskell is as powerful as it is in Agda. No compromises required.

It’s worth remarking that my implementation works with GHC 7.6, so you don’t need the latest fancy type-level features like closed type families. The really essential part are the promoted data types.

by Jan Stolarek at October 07, 2014 01:35 PM

October 06, 2014

Yesod Web Framework

Haskell's Prelude is adding Foldable/Traversable

Haskell's Prelude is changing to favor using Foldable/Traversable instead of just lists. Many Haskellers are concerned that upcoming changes to the Prelude could

  • break existing code
  • make maintaining code more difficult
  • decrease beginner friendliness

Lets discuss these concerns

Stability and the Prelude design space

Neil Mitchell writes:

Step 3: Incorporate feedback

I expect that will result in a significant number of revisions, and perhaps significant departures from the original design. Note that the classy-prelude package is following the steps above, and has had 38 revisions on Hackage so far.

As a contributor to and user of classy-prelude, I wanted to point out something about this statement. Most of these revisions are minor and backwards compatible consisting of bug-fixes or something like adding a non-default implementation of a typeclass method or an additional typeclass instance. A better data point is the number of major revision releases. classy-prelude is at release 0.10 now, so that would be 10.

Neil mentions classy-prelude a second time:

The classy-prelude work has gone in that direction, and I wish them luck, but the significant changes they've already iterated through suggest the design space is quite large.

Of these 10 changes, I would only consider one to be major and represent any kind of change in the design space: the 0.6 release basing classy-prelude on the mono-traversable package. Some of the version bumps represent a change in code organization between mono-traversable and classy-prelude. One change that required a bump is a change to the type signature of mapM_ that should be made in the Prelude but probably never will because it will break existing code. The major change in the 0.6 release is very similar to the upcoming changes in the Prelude, except that classy-prelude (via mono-traversable) works on monomorphic structures such as Text in addition to polymorphic structures. So I would not consider the design space to be large for classy-prelude or for other prelude replacements. classy-prelude before 0.6 and most other Prelude replacements or type-class conveniences have not worked out very well. There is only 1 design that has worked well (incorporating Foldable and Traversable), and that is the design being incorporated into the new Prelude.

Neil also writes about other Haskell abstractions:

We already have that problem with the Control.Arrow module, which (as far as most people use it), is just a pile of tuple combinators. But unlike other tuple combinators, these are ones whose type signature can't be understood. When I want to use &&& or ***

I want to point out that classy-prelude solved some of this issue by exporting Data.BiFunctor instead of functions from Data.Arrow, which required a version bump from 0.9.x to 0.10. I also want to point out that these kinds of arguments are straw man arguments. Every proposed abstraction to Haskell has its own advantages and dis-advantages. Because of the need for backwards compatibility, we are going to be able to point to abstractions in the Prelude that are not being used in the right way for a long time. However, Foldable/Traversable is the only serious contender for abstracting over data structures in Haskell. It has stood the test of time so far, but it has not seen a lot of use yet because everyone is initially directed to just use lists for everything, and the next instinct when using other data structures in the current environment is to use qualified names.

Iterating the new design

One proposed remedy for dealing with change is trying to release the new Prelude in an iterative way. This could be a good idea, but in practice it is very difficult to implement: most users are still going to import Prelude rather than trying out something different and giving their feedback. A better approach than holding it back is to use a design that makes it easier for new releases to make backwards incompatible changes. One approach to this could be at the package level the way that base-compat operates. Another approach that could be useful to library authors is incorporate versioning at the module level.

Something to keep in mind though is that because the new Prelude needs to try to work with the old Prelude, there are not that many options in the design space. classy-prelude has had the luxury of being able to re-think every Haskell wart. So it was able to remove all partial functions and use Text instead of String in many places. But that process is very difficult for the actual Prelude, which is severly constrained.

Why change? Module qualified names and generic code.

The motivation for classy-prelude was to confront one of Haskell's most glaring warts: name-spacing and the need to qualify functions. We could certainly have our IDE automatically write import statements, but we still end up with needing to use module qualified names. This isn't really an acceptable way to program. I have not seen another language where this extra line noise is considered good style. For Haskell to move forward and be as convenient to use as other programming languages, there are 2 solutions I know of.

  1. change the language
  2. make it convenient to write generic code

Changing the language so that module qualification is not needed is arguably a much better approach. This is the case in Object-Oriented languages, and possible in languages very similar to Haskell such as Frege that figure out how to disambiguate a function based on the data type being used. I think this would be a great change to Haskell, but the idea was rejected by Simon Peyton Jones himself during the discussion on fixing Haskell records because it is not compatible with how Haskell's type system operates today. Simon did propose Type directed name resolution which I always though was a great idea, but that proposal was not able to get off the ground in part because changing Haskell's dot operator proved too controversial.

So the only practical option I know of is to focus on #2. Being able to write generic code is an important issue in of itself. Programmers in most other mainstream languages write code that operates on multiple data structures of a particular shape, but Haskell programmers are still specializing a lot of their interfaces.

Lists are holding Haskell back

It is taken by many to be a truism that programming everything with lists makes things simpler or at least easier for new Haskell programmers. I have found this statement to be no different than 99% of things given the glorious "simple" label: the "simplicity" is not extensible, does not even live up to its original use case, and ends up creating its own incidental complexity.

I used to frequently warp the functions I wrote to fit the mold of Haskell's list. Now that I use classy-prelude I think about the data structure that is needed. Or often I start with a list, eventually discover that something such as appending is needed, and I am able to quickly change the function to operate on a different data structure.

Using an associative list is an extreme example of using the wrong data structure where lookup is O(n) instead of constant or O(log(n)). But by warping a function I am really talking about writing a function in a way to reduce list appends or doing a double reverse instead of using a more natural DList or a Seq. This warping process probably involves performing recursion by hand instead of re-using higher-order functions. As a library developer, I would like to start exposing interfaces that allow my users to use different data structures, but I know that it is also going to cause some inconvenience because of the current state of the Prelude.

Neil writes that he had an opposite experience:

I have taken over a project which made extensive use of the generalised traverse and sequence functions. Yes, the code was concise, but it was read-only, and even then, required me to "trust" that the compiler and libraries snapped together properly.

This kind of report is very worrying and it is something we should take very seriously. Any you certainly cannot tell someone that their actual experience was wrong. However, it is human nature to over-generalize our experiences just as it was the nature of the code author in question to over-generalize functions. In order to have a productive discussion about this, we need to see (at least a sample or example of) the code in question. Otherwise we are only left guessing at what mistakes the author made.

In general I would suggest specializing your application code to lists or other specific structures (this can always be done with type signatures) until there is a need for more abstraction, and that could be a big part of the problem in this case.

It would be really great to start having these discussions now based off of actual code examples and to have a community guide that explains the missing common sense for how to use abstractions appropriately.

The uncertainty discussed here is the heart of the matter, and talking about what is good for beginners is largely a distraction.

Haskell is for programmers first, students in their first class in functional programming second

This might sound hostile to beginners. In fact, the opposite is the case! The programming languages that are taught to beginners are the very same ones that are used in industry. The first programming language taught to me at school was C, and it was not because it was optimized out of the box for beginners.

So the way to attract more beginners is simply to become more popular with industry. Haskell has already reached the growth rate limit of a language that is popular for the sake of learning about programming.

That being said, we do need to do a lot of things to make the experience as nice as possible for beginners, and an alternative prelude for beginners could be a great idea. But making this the default that holds back progress hurts everyone in the long-run.

It isn't Beginner vs. Expert anyways

The most up-voted comment on Reddit states:

What other languages have different standard libraries for people learning and people not learning? What could be a greater way to confuse learners, waste their time and make them think this language is a joke than presenting them with n different standard libraries?

I will add my own assertion here: Haskell is confusing today because the Prelude is in a backward state that no longer reflects several important best practices (for example, Neil had to create the Safe package!) and it does not hold up once you write more than a trivial amount of code in your module.

We also need to keep in mind that using Haskell can be difficult for beginners precisely for some of the same reasons that it is painful for experts. And the same reason these new changes will be more difficult for beginners (mental overhead of using the Foldable/Traversable abstraction instead of just lists) will also create difficulties for non-beginners.

So the changes to the Prelude are going to make some aspects a better for beginners or existing users and others harder.

If we really want to improve Haskell for beginners we need to stop creating a false dichotomy between beginner and expert. We also need to empower committees to make forward progress rather than letting minority objections stall all forward progress.

Improving the library process

Some have expressed being surprised to learn about what is going on in the Haskell libraries committee at a late stage. On the other hand, I doubt that hearing more objections earlier would actually be helpful, because the libraries process has not learned from GHC.

Take a look at the extensive documentation around proposed changes to improve Haskell's record system. Creating a solution to Haskell's problem with records was a very difficult process. There were several designs that looked good in a rough sketch form, but that had issues when explored in thorough detail on the wiki. More importantly, the wiki helped summarize and explain a discussion that was extremely laborious to read and impossible to understand by looking through a mail list.

Before creating a non-minor change to GHC, there is a convention of creating a wiki page (certainly it isn't always done). At a minimum there is a Trac ticket that can serve a somewhat similar purpose.

My suggestion is that the libraries process use the existing GHC or Haskell wiki to create a page for every non-minor change. The page for Foldable/Traversable would explain

  • what is being changed
  • which changes create a breakage
  • how type errors have changed
  • how library code is affected
  • how user code is affected
  • best practices for using Foldable/Traversable

Right now we are stuck in a loop of repeating the same points that were already made in the original discussion of the proposal. Given a wiki page, Neil and others could point out the down-sides of the proposal with actual code and have their voice heard in a productive way that builds up our body of knowledge.

October 06, 2014 02:50 AM

October 05, 2014

Kevin Reid (kpreid)

Help me name a game keybinding library.

(I could say some meta-commentary about how I haven't been blogging much and I've made a resolution to get back to it and it'll be good for me and so on, but I think I've done that too many times already, so let's get right to the actual thing...)

When I wrote Cubes (a browser-based “Minecraft-like”), one of the components I built was a facility for key-bindings — that is, allowing the user to choose which keys (or mouse buttons, or gamepad buttons) to assign to which functions (move left, fly up, place block, etc.) and then generically handling calling the right functions when the event occurs.

Now, I want to use that in some other programs. But in order for it to exist as a separate library, it needs a name. I have failed to think of any good ones for months. Suggestions wanted.

Preferably, the name should hint at that it supports the gamepad API as well as keyboard and mouse. It should not end in “.js” because cliche. Also for reference, the other library that arose out of Cubes development I named Measviz (which I chose as a portmanteau and for having almost zero existing usage according to web searches).

(The working draft name is web-input-mapper, which is fairly descriptive but also thoroughly clunky.)

by Kevin Reid (kpreid) ([email protected]) at October 05, 2014 03:22 PM

Philip Wadler

Errata, please

This is a space where you can leave comments describing errata in any of my published papers. Please include bibliographic details of the paper, and a link to where the paper appears on my web page if you can. Thank you to Dave Della Costa for volunteering the first entry and inspiring the creation of this post, and to all who utilise this space to record and correct my errors for posterity.

by Philip Wadler ([email protected]) at October 05, 2014 02:15 PM

Matthew Sackman

Programming in the real world

One of the things that annoys me about Object Oriented Programming is how it's often suggested that it models the "real world". Frequently tutorials will start with creating an object modelling a chair, and through inheritance you'll be able to build up composable aspects of chairs: different numbers of legs, different colours, different designs. Sometimes they use tables rather than chairs. This is lovely, but it actually has everything to do with data modelling through inheritance, decomposition, abstraction and encapsulation, and almost nothing to do with Object Orientation: the key is that these chairs have no modifying methods on them. If they have any methods at all then they'll be for things like getting the number of legs or the colour, or volume or something - something that is basically fixed once the object is instantiated. At this point in such tutorials I'd probably claim this is not actually programming yet: all that's been achieved so far is that we've assigned some semantics to some numbers held in memory and we can write some numbers in memory. Programming is when we manipulate numbers: that involves reading and writing numbers.

The problem then is that Object Orientation immediately stops being about modelling the "real world" as soon as we can modify memory. If we think about how we actually would go about getting a chair made for us, it could go a bit like this:

  1. Go see your local carpenter,
  2. Have a discussion with them about the style and type of chair you'd like,
  3. They make the chair and give it to you in return for payment,
  4. You take the chair home,
  5. You decide you don't like the colour so you take it to your garage and repaint it yourself.

It should be clear that the inanimate object (the chair) is the odd one out here. Everything else is done by actors that have their own state, mainly act asynchronously, and can communicate with other actors through protocols - protocols that do not involve sharing mutable state (e.g. if I say something to you, that speech is immutable; you can't change what I've said (though you could choose to mishear me!)). At no point is any state of any actor actually exposed to another actor: I may share with you what I'm currently thinking, and you can try to influence me, but we don't exactly need a mutex around memory in my brain because YOU'RE NOT GETTING IN THERE!

If you tried modelling this sort of thing through Object Orientation without actors then you'd end up with your own thread doing all the work: it'd be you, it'd be the carpenter and it'd be the chair, maybe all at once. If your carpenter is in fact a growing business with a receptionist, a design team and a billing department your thread would be playing those roles too and would probably have to use locks to avoid unexpected interactions with other threads doing the same commissioning-receptioning-designing-constructing-delivery-repainting dance. And all the time, whilst you're doing the carpentry yourself, you'd could easily have your own thoughts, feelings, aspirations and regrets all on the same stack for your carpenter-alias to mess with.

Thus Object Orientation causes multiple personality disorder.

So in my view, the way Object Orientation gets introduced tends to be more like "useful tools for modelling data". But the OO approach to manipulating that data goes wrong as soon as you try to model the animated real world. Firstly it has nothing to say about separating out threads to self-contained actors (but try this in a language or on a platform without green-threads, or without the ability to preempt threads and you can quickly hit pain), and secondly even if you do have actors, OO encourages the sharing of mutable data rather than passing around either immutable data or copies of data. Yes, good programming discipline can result in sane designs and a successful result, but it's not a core aspect of the OOP mantra.

So, OOP has nothing good to say on manipulating data at all - it either says nothing or it encourages silly ideas like using locks. The data modelling bits are fine, but I think they're a broader concept beyond the confines of OOP. What else does OOP get you? An arbitrary restriction on the receiver of any method. That's about it. It's thanks to this restriction that writing combinators like cons on a list library in an OO language is really painful.

This week Erik Meijer wrote an article called The Curse of the Excluded Middle: "Mostly functional" programming does not work. After an introduction, we get onto The Problem, which (paraphrasing) is that languages that are mainly imperative but offer some features from pure functional languages are not as safe as pure functional languages.

The first three examples, from C# are certainly surprising to me (I barely know any C# at all though). The first two problems come from trying to compose side-effecting stuff with laziness. In the first case it's not clear that the problem is with the IO operation (printing things out) or actually with the laziness, but more the odd behaviour of the Where operator (presumably the implementation of Where doesn't know that a Cartesian product isn't necessary, but surely any normal monadic/list-comprehension implementation wouldn't have this problem?). The second case is certainly the terrifying composition of laziness with throwing exceptions and thus the exception having the potential to pop out anywhere where the lazy expression gets forced. However, if you know the Select operator is lazy, it's not really that surprising. It's arguably piss-poor language design that there's nothing there to help you, but C# doesn't have checked exceptions; apparently programmers don't like having to deal with errors so you reap what you sow. The third case is how C# has a nice using feature which binds a resource to a lexical scope. But if you construct a closure capturing the resource and then send it out of that lexical scope then using goes wrong (it will still discard the resource even though there's a reference to it within the closure which remains in-scope). This is certainly piss-poor language design: if the closure captures stuff from your lexical scope and you're not reference counting (or equivalent) your lexical scope then YOU'VE DONE IT WRONG. This is as bad as in C allocating stuff on your stack and then returning pointers to it.

Next he moves on somewhat tangentially to the point that if object creation is an observable action then you can't optimise it out. I'm not sure anyone outside a pure functional language specialist would ever want object creation to be optimised out, but the point is that if your constructor has side effects or can in any other way be observed then you can't have your language runtime do memoization of object creation. Doing side effects in object constructors has long been discouraged: I first read that back in the Effective Java book about a decade ago and I'm sure it wasn't exactly a ground-breaking piece of advice then.

So far then we have that side effects which are untracked have the potential to be bad: whether it's printing things out, or throwing exceptions, or discarding resources early, or preventing compiler optimisations. But next I feel the article goes a bit wrong. He first moves onto how channels in C⍵ can store state so they're not pure either, thus bad. And then goes onto how in Erlang you have the same problem as you're just modelling mutable state in actors:

Note how this Erlang actor basically encodes an object with dynamic method dispatch using the pattern-matching, message-sending, and recursion primitives of the language, which you may happily leverage to implement mutable references, sabotaging the fact that the Erlang language does not natively expose mutable state.

This is wrong: you cannot implement mutable references in Erlang. Data is immutable in Erlang so if you send some value out of an actor, you are sending that value. Not a reference to a value or variable. Even if you create a closure and send that out of the actor, the closure is capturing those values as they exist at that point in time. If you have received a value sent to you from an actor, you may use it to create other values, but doing so does not affect the "original", and similarly, the actor itself can continue to modify its own state, but it does not affect the values it sent to you. Yes, you can use Erlang actors to model objects. But an actor's own modifications of its state cannot be observed as side effects on values you've previously retrieved from that actor, and vice versa.

The reference you have to an actor is a process identifier (also immutable) which does not present any information itself about the state of the actor. Through that, you can send messages to an actor and test whether or not the actor is still alive, but that is all. And in any case, where has the sudden objection to mutable state come from? State is just a catamorphism on prior inputs. State is not the problem: unconstrained side effects are the problem. Certainly sharing mutable state is a problem (and you could argue that mutating shared state is a side effect and that it should be tracked statically), but Erlang does not allow for that.

He may have been better off going for an example of opening an file, sending the file handle to another process and then closing the file handle before it's been used (i.e. the same as the third C# example). Except:

  1. All file operations can return an error anyway so handling errors in such code is completely normal;
  2. In Erlang a file handle is normally an actor itself, so what you're doing is passing around a process identifier. Sending messages to a dead process (once the file is closed) is a normal activity and you can detect if the process has died in normal ways;
  3. If you bypass such normal file handling for performance reasons and open the file in "raw" mode then Erlang has a light form of object capabilities in which only the process that opened the file is allowed to use the file handle, so again the use of the file handle would error predictably;
  4. The language doesn't have the same C# feature for discarding resources once you return out of a lexical scope. Consequently closing a file is an explicit operation and given the asynchronous concurrent mindset one develops when working in Erlang, it's very likely you'll realise how odd it is to be closing a file handle whilst there's some closure out there which may not have been run yet.

Beyond this, he introduces the Haskell type system and explains that it captures side effects statically. As a result, by bowing to the demands of the type checker, it offers you a proof that if such effects occur, your program will handle them: exceptions will not go uncaught, IO operations are only permitted where the semantics lead to expected outcomes, resources are not used after they're discarded and the compiler can use all these proofs to do all manner of optimisations to your program.

These proofs can certainly be very valuable (though they are no substitute for disciplined, high quality design and careful implementation). Obviously, they don't capture everything though. Particularly relevant for concurrent and distributed programs, they don't capture sufficient side effects to allow for a proof of the absence of deadlocks. Haskell standard libraries contain channels and semaphores which can easily be used to sporadically end up with a deadlock between processes. A deadlock is definitely a side effect: the effect is the program probably stops working. The cause is an insufficient use of locks to control the scheduler (be it scheduling of OS threads or language runtime scheduling of green threads).

More broadly, the proof a type checker offers is that the specification you've provided (type signatures) is not violated by its inferences about your code. Until the type checker allows "and makes progress" as part of a specification, Haskell itself is no safer than any other language that claims to be "mostly functional".

October 05, 2014 09:22 AM

Neil Mitchell

How to Rewrite the Prelude

Summary: Rewriting the Prelude should be done in a separate package and take a few years to complete, hopefully building a consensus and improving the result.

My recent suggestion that the Prelude shouldn't be rewritten to incorporate Foldable/Traversable generated a lot of feedback. It's clear some people strongly agree, and some strongly disagree. So instead of continuing the argument, in this post I'm going to describe how I think the people who want to change the Prelude should go about it.

There were a lot of comments that we should optimise Haskell for practitioners, not beginners - in particular that there should be Prelude (for advanced users) and Prelude.Simple (for beginners). My personal opinion is that if we have two Prelude's, why not make the beginner one the default one? Switching to the non-default one probably takes one line of moderately advanced syntax (currently import Prelude(); import Prelude.Advanced). For a beginner, writing a simple one line program, that more than doubles the complexity. For a moderate user, that's almost certainly outweighed by lines of LANGUAGE pragmas. (I'm also not sure that I want the "Advanced" Prelude, but that's not relevant to this post.)

Step 1: Write the new Prelude in a package

The first step to proposing a new Prelude should be to write Prelude.Advanced in a separate package on Hackage. Let's assume we have a package base-advanced with a module Prelude.Advanced. The Prelude is not trivial, and translating high-level design goals (e.g. move Foldable into the Prelude) requires some design choices to translate into concrete code.

Step 2: Gain users and feedback

Rather than roll out the Prelude to everyone at once, slowly try and persuade people that your Prelude is superior to the existing one. In doing so, advanced users can start trying out the Prelude, and describing their experiences with it. Does it require Foldable to gain a size method? Does it show up a GHC error message that could be better? Do we need some new type of defaulting? Does profiling give poor results without some manually inserted cost centre? Is a direct list isomorphism a better type class to start from (as F# does with IEnumerable)? Given the number of enthusiastic supporters, this step should be easy.

Step 3: Incorporate feedback

The feedback from step 2 needs incorporating into the package, and potentially into GHC itself. I expect that will result in a significant number of revisions, and perhaps significant departures from the original design. Note that the classy-prelude package is following the steps above, and has had 38 revisions on Hackage so far.

Step 4: Gain acceptance

Before going any further, some critical mass of people need to agree the new way is preferable. There may turn out to be competing proposals, and hopefully as a community we can find consensus and identify the best ideas (as we are currently doing with IO streaming libraries).

Step 5: Move to base

Once we have agreement on what the new Prelude should look like, it should probably be moved into base. At this point, based on all the experience we've got, we can decide whether it becomes Prelude or Prelude.Advanced. At the same time as moving into base, we should decide on the simpler story, but what that decision looks like, will depend on what the proposed Prelude looks like.

The current approach

There are a number of areas that current approach worry me. The change was made directly in the base Prelude, based on some agreement of high-level design decisions, and has already resulted in unexpected additions to the Foldable class. The Prelude will first be exposed to lots of users once GHC 7.10 ships, by which point iterating on feedback will be slow and difficult. The plan for the "beginner" version is to encourage beginners to use the haskell2010 package, which I don't think has been thought out (as someone who tried to use the haskell98 package for a short while, the exception change meant you couldn't use haskell98 and base in the same package, which is basically fatal).

The Haskell community changed Applicative to be a superclass of Monad. That change was simple and thoroughly thought out over a period of years. Warnings were added to GHC, packages were analysed, people experimented with what it would mean, and once decided the change took a number of GHC releases to fully roll out. I consider this change to have been done the right way. In contrast, the complete rewrite of the Prelude is happening far more quickly, and I would argue too quickly. By taking a bit more time hopefully we can come up with something even better.

by Neil Mitchell ([email protected]) at October 05, 2014 07:05 AM

October 04, 2014


GUI - Release of the threepenny-gui library, version

I am pleased to announce release of threepenny-gui version 0.5, a cheap and simple library to satisfy your immediate GUI needs in Haskell.

Want to write a small GUI thing but forgot to sacrifice to the giant rubber duck in the sky before trying to install wxHaskell or Gtk2Hs? Then this library is for you! Threepenny is easy to install because it uses the web browser as a display.

The library also has functional reactive programming (FRP) built-in, which makes it a lot easier to write GUI application without getting caught in spaghetti code. For an introduction to FRP, see for example my slides from a tutorial I gave in 2012 (the API is slightly different in Reactive.Threepenny) and the preliminary widget design guide.

Version 0.5 is essentially a maintenance release, allowing for newer versions of the libraries that it depends on. It also incorporates various contributions by other people, including a small Canvas API by Ken Friis Larsen and Carsten König, and a complete set of SVG elements and attributes by Steve Bigham. Many thanks also to Yuval Langer and JP Moresmau.

However, while it’s great that the library begins to grow in breadth, incorporating larger and larger parts of the DOM API, I also feel that the current backend code is unable to cope with this growth. In the next version, I intend to overhaul the server code and put the JavaScript FFI on a more solid footing.

To see Threepenny in action, have a look at the following applications:

Daniel Austin’s FNIStash
Editor for Torchlight 2 inventories.
Daniel Mlot’s Stunts Cartography Track Viewer
Map viewer for the Stunts racing game.

Get the library here:

Note that the API is still in flux and is likely to change radically in the future. You’ll have to convert frequently or develop against a fixed version.

October 04, 2014 08:59 AM

Magnus Therning

Script for migrating from WordPress to Hakyll

As I wrote about in a previous post on converting post from WordPress to Hakyll I couldn’t quite find a tool that met my needs out of the box. I had a look at the source of hakyll-convert but it uses a library for RSS parsing, which sounds good. However, the export of posts from WordPress is in an extended RSS format, among other things it contains the comments of posts. Unfortunately it doesn’t look like the RSS library supports the WordPress extensions, so modifying hakyll-convert to also extract comments seems like a bit more work than I’d like to put into it. Especially since I had a feeling that hacking up something using tagsoup would be quite a bit faster.

I put the resulting script in gist on github. I call it bbwp, which is short for Bye, bye, WordPress.

October 04, 2014 12:00 AM

October 03, 2014

Jeremy Gibbons

Distributivity in Horner’s Rule

This is a continuation of my previous post on Horner’s Rule, and in particular, of the discussion there about distributivity in the datatype-generic version of the Maximum Segment Sum problem:

the essential property behind Horner’s Rule is one of distributivity. In the datatype-generic case, we will model this as follows. We are given an {(\mathsf{F}\,\alpha)}-algebra {(\beta,f)} [for a binary shape functor {\mathsf{F}}], and a {\mathsf{M}}-algebra {(\beta,k)} [for a collection monad {\mathsf{M}}]; you might think of these as “datatype-generic product” and “collection sum”, respectively. Then there are two different methods of computing a {\beta} result from an {\mathsf{F}\,\alpha\,(\mathsf{M}\,\beta)} structure: we can either distribute the {\mathsf{F}\,\alpha} structure over the collection(s) of {\beta}s, compute the “product” {f} of each structure, and then compute the “sum” {k} of the resulting products; or we can “sum” each collection, then compute the “product” of the resulting structure. Distributivity of “product” over “sum” is the property that these two different methods agree, as illustrated in the following diagram.

For example, with {f :: \mathsf{F}\,{\mathbb Z}\,{\mathbb Z} \rightarrow {\mathbb Z}} adding all the integers in an {\mathsf{F}}-structure, and {k :: \mathsf{M}\,{\mathbb Z} \rightarrow {\mathbb Z}} finding the maximum of a (non-empty) collection, the diagram commutes.

There’s a bit of hand-waving above to justify the claim that this is really a kind of distributivity. What does it have to do with the common-or-garden equation

\displaystyle  a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)

stating distributivity of one binary operator over another? That question is the subject of this post.

Distributing over effects

Recall that {\delta_2 :: (\mathsf{F}\,\alpha)\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}(\mathsf{F}\,\alpha)} distributes the shape functor {\mathsf{F}} over the monad {\mathsf{M}} in its second argument; this is the form of distribution over effects that crops up in the datatype-generic Maximum Segment Sum problem. More generally, this works for any idiom {\mathsf{M}}; this will be important below.

Generalizing in another direction, one might think of distributing over an idiom in both arguments of the bifunctor, via an operator {\delta : \mathsf{F} \cdot (\mathsf{M} \times \mathsf{M}) \mathbin{\stackrel{.}{\to}} \mathsf{M} \cdot \mathsf{F}}, which is to say, {\delta_\beta :: \mathsf{F}\,(\mathsf{M}\beta)\,(\mathsf{M}\beta) \rightarrow \mathsf{M}(\mathsf{F}\beta)}, natural in the {\beta}. This is the {\mathit{bidist}} method of the {\mathit{Bitraversable}} subclass of {\mathit{Bifunctor}} that Bruno Oliveira and I used in our Essence of the Iterator Pattern paper; informally, it requires just that {\mathsf{F}} has a finite ordered sequence of “element positions”. Given {\delta}, one can define {\delta_2 = \delta \cdot \mathsf{F}\,\mathit{pure}\,\mathit{id}}.

That traversability (or equivalently, distributivity over effects) for a bifunctor {\mathsf{F}} is definable for any idiom, not just any monad, means that one can also conveniently define an operator {\mathit{contents}_{\mathsf{H}} : \mathsf{H} \mathbin{\stackrel{.}{\to}} \mathsf{List}} for any traversable unary functor {\mathsf{H}}. This is because the constant functor {\mathsf{K}_{[\beta]}} (which takes any {\alpha} to {[\beta]}) is an idiom: the {\mathit{pure}} method returns the empty list, and idiomatic application appends two lists. Then one can define

\displaystyle  \mathit{contents}_{\mathsf{H}} = \delta \cdot \mathsf{H}\,\mathit{wrap}

where {\mathit{wrap}} makes a singleton list. For a traversable bifunctor {\mathsf{F}}, we define {\mathit{contents}_{\mathsf{F}} = \mathit{contents}_{\mathsf{F}\cdot\triangle}} where {\triangle} is the diagonal functor; that is, {\mathit{contents}_{\mathsf{F}} :: \mathsf{F}\,\beta\,\beta \rightarrow [\beta]}, natural in the {\beta}. (No constant functor is a monad, except in trivial categories, so this convenient definition of contents doesn’t work monadically. Of course, one can use a writer monad, but this isn’t quite so convenient, because an additional step is needed to extract the output.)

One important axiom of {\delta} that I made recent use of in a paper with Richard Bird on Effective Reasoning about Effectful Traversals is that it should be “natural in the contents”: it should leave shape unchanged, and depend on contents only up to the extent of their ordering. Say that a natural transformation {\phi : \mathsf{F} \mathbin{\stackrel{.}{\to}} \mathsf{G}} between traversable functors {\mathsf{F}} and {\mathsf{G}} “preserves contents” if {\mathit{contents}_{\mathsf{G}} \cdot \phi = \mathit{contents}_{\mathsf{F}}}. Then, in the case of unary functors, the formalization of “naturality in the contents” requires {\delta} to respect content-preserving {\phi}:

\displaystyle  \delta_{\mathsf{G}} \cdot \phi = \mathsf{M}\phi \cdot \delta_{\mathsf{F}} : \mathsf{T}\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}\mathsf{G}

In particular, {\mathit{contents}_{\mathsf{F}} : \mathsf{F} \mathbin{\stackrel{.}{\to}} \mathsf{List}} itself preserves contents, and so we expect

\displaystyle  \delta_{\mathsf{List}} \cdot \mathit{contents}_{\mathsf{F}} = \mathsf{M}(\mathit{contents}_{\mathsf{F}}) \cdot \delta_{\mathsf{F}}

to hold.

Folding a structure

Happily, the same generic operation {\mathit{contents}_{\mathsf{F}}} provides a datatype-generic means to “fold” over the elements of an {\mathsf{F}}-structure. Given a binary operator {\otimes :: \beta\times\beta \rightarrow \beta} and an initial value {b :: \beta}, we can define an {(\mathsf{F}\,\beta)}-algebra {(\beta,f)}—that is, a function {f :: \mathsf{F}\,\beta\,\beta\rightarrow\beta}—by

\displaystyle  f = \mathit{foldr}\,(\otimes)\,b \cdot \mathit{contents}_{\mathsf{F}}

(This is a slight specialization of the presentation of the datatype-generic MSS problem from last time; there we had {f :: \mathsf{F}\,\alpha\,\beta \rightarrow \beta}. The specialization arises because we are hoping to define such an {f} given a homogeneous binary operator {\otimes}. On the other hand, the introduction of the initial value {b} is no specialization, as we needed such a value for the “product” of an empty “segment” anyway.)

Incidentally, I believe that this “generic folding” construction is exactly what is intended in Ross Paterson’s Data.Foldable library.

Summing a collection

The other ingredient we need is an {\mathsf{M}}-algebra {(\beta,k)}. We already decided last time to

stick to reductions{k}s of the form {\oplus/} for associative binary operator {{\oplus} :: \beta \times \beta \rightarrow \beta}; then we also have distribution over choice: {\oplus / (x \mathbin{\underline{\smash{\mathit{mplus}}}} y) = (\oplus/x) \oplus (\oplus/y)}. Note also that we prohibited empty collections in {\mathsf{M}}, so we do not need a unit for {\oplus}.

On account of {\oplus/} being an algebra for the collection monad {\mathsf{M}}, we also get a singleton rule {\oplus/ \cdot \mathit{return} = \mathit{id}}.

Reduction to distributivity for lists

One of the take-home messages in the Effective Reasoning about Effectful Traversals paper is that it helps to reduce a traversal problem for datatypes in general to a more specific one about lists, exploiting the “naturality in contents” property of traversability. We’ll use that tactic for the distributivity property in the datatype-generic version Horner’s Rule.

In this diagram, the perimeter is the commuting diagram given at the start of this post—the diagram we have to justify. Face (1) is the definition of {\delta_2} in terms of {\delta}. Faces (2) and (3) are the expansion of {f} as generic folding of an {\mathsf{F}}-structure. Face (4) follows from {\oplus/} being an {\mathsf{M}}-algebra, and hence being a left-inverse of {\mathit{return}}. Face (5) is an instance of the naturality property of {\mathit{contents}_{\mathsf{F}} : \mathsf{F}\triangle \mathbin{\stackrel{.}{\to}} \mathsf{List}}. Face (6) is the property that {\delta} respects the contents-preserving transformation {\mathit{contents}_{\mathsf{F}}}. Therefore, the whole diagram commutes if Face (7) does—so let’s look at Face (7)!

Distributivity for lists

Here’s Face (7) again:

Demonstrating that this diagram commutes is not too difficult, because both sides turn out to be list folds.

Around the left and bottom edges, we have a fold {\mathit{foldr}\,(\otimes)\,b} after a map {\mathsf{List}\,(\oplus)}, which automatically fuses to {\mathit{foldr}\,(\odot)\,b}, where {\odot} is defined by

\displaystyle  x \odot a = (\oplus/x) \otimes a

or, pointlessly,

\displaystyle  (\odot) = (\otimes) \cdot (\oplus/) \times \mathit{id}

Around the top and right edges we have the composition {\oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \cdot \delta_{\mathsf{List}}}. If we can write {\delta_{\mathsf{List}}} as an instance of {\mathit{foldr}}, we can then use the fusion law for {\mathit{foldr}}

\displaystyle  h \cdot \mathit{foldr}\,f\,e = \mathit{foldr}\,f'\,e' \;\Leftarrow\; h\,e=e' \land h \cdot f = f' \cdot \mathit{id}\times h

to prove that this composition equals {\mathit{foldr}\,(\odot)\,b}.

In fact, there are various equivalent ways of writing {\delta_{\mathsf{List}}} as an instance of {\mathit{foldr}}. The definition given by Conor McBride and Ross Paterson in their original paper on idioms looked like the identity function, but with added idiomness:

\displaystyle  \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{pure}\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathit{pure}\,(:) \circledast \mathit{mb} \circledast \delta_{\mathsf{List}}\,\mathit{mbs} \end{array}

In the special case that the idiom is a monad, it can be written in terms of {\mathit{liftM}_0} (aka {\mathit{return}}) and {\mathit{liftM}_2}:

\displaystyle  \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{liftM}_0\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathit{liftM}_2\,(:)\,\mathit{mb}\,(\delta_{\mathsf{List}}\,\mathit{mbs}) \end{array}

But we’ll use a third definition:

\displaystyle  \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{return}\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathsf{M}(:)\,(\mathit{cp}\,(\mathit{mb}, \delta_{\mathsf{List}}\,\mathit{mbs})) \end{array}


\displaystyle  \begin{array}{lcl} \mathit{cp} &::& \mathsf{M}\,\alpha \times \mathsf{M}\,\beta \rightarrow \mathsf{M}(\alpha\times\beta) \\ \mathit{cp}\,(x,y) &=& \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a,b) \} \end{array}

That is,

\displaystyle  \delta_{\mathsf{List}} = \mathit{foldr}\,(\mathsf{M}(:)\cdot\mathit{cp})\,(\mathit{return}\,[\,])

Now, for the base case we have

\displaystyle  \oplus/\,(\mathsf{M}(\mathit{foldr}\,(\otimes)\,b)\,(\mathit{return}\,[\,])) = \oplus/\,(\mathit{return}\,(\mathit{foldr}\,(\otimes)\,b\,[\,])) = \oplus/\,(\mathit{return}\,b) = b

as required. For the inductive step, we have:

\displaystyle  \begin{array}{ll} & \oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \cdot \mathsf{M}(:) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{functors} \} \\ & \oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b \cdot (:)) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{evaluation for~} \mathit{foldr} \} \\ & \oplus/ \cdot \mathsf{M}((\otimes) \cdot \mathit{id}\times\mathit{foldr}\,(\otimes)\,b) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{functors; naturality of~} \mathit{cp} \} \\ & \oplus/ \cdot \mathsf{M}(\otimes) \cdot \mathit{cp} \cdot \mathsf{M}\mathit{id}\times\mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \\ = & \qquad \{ \mbox{distributivity for~} \mathit{cp} \mbox{: see below} \} \\ & (\otimes) \cdot (\oplus/)\times(\oplus/) \cdot \mathsf{M}\mathit{id}\times\mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \\ = & \qquad \{ \mbox{functors} \} \\ & (\otimes) \cdot (\oplus/)\times\mathit{id} \cdot \mathit{id}\times\mathsf{M}(\oplus/\cdot\mathit{foldr}\,(\otimes)\,b) \end{array}

which completes the fusion proof, modulo the wish about distributivity for {\mathit{cp}}:

\displaystyle  \oplus/ \cdot \mathsf{M}(\otimes) \cdot \mathit{cp} = (\otimes) \cdot (\oplus/)\times(\oplus/)

Distributivity for cartesian product

As for that wish about distributivity for {\mathit{cp}}:

\displaystyle  \begin{array}{ll} & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}(\otimes) \mathbin{\hbox{\footnotesize\$}} \mathit{cp}\,(x,y) \\ = & \qquad \{ \mbox{definition of~} \mathit{cp} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}(\otimes) \mathbin{\hbox{\footnotesize\$}} \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a,b) \,\} \\ = & \qquad \{ \mbox{map over~} \mathbf{do} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a \otimes b) \,\} \\ = & \qquad \{ \mbox{expanding~} \mathbf{do} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathit{join} \mathbin{\hbox{\footnotesize\$}} \mathsf{M}\,(\lambda a \mathbin{.} \mathsf{M}\,(a\otimes)\,y)\,x \\ = & \qquad \{ \oplus/ \mbox{~is an~} \mathsf{M} \mbox{-algebra} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}(\oplus/) \mathbin{\hbox{\footnotesize\$}} \mathsf{M}\,(\lambda a \mathbin{.} \mathsf{M}\,(a\otimes)\,y)\,x \\ = & \qquad \{ \mbox{functors} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}\,(\lambda a \mathbin{.} \oplus/(\mathsf{M}\,(a\otimes)\,y))\,x \\ = & \qquad \{ \mbox{distributivity for collections: see below} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}\,(\lambda a \mathbin{.} a \otimes (\oplus/\,y))\,x \\ = & \qquad \{ \mbox{sectioning} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\$}} \mathsf{M}\,(\otimes (\oplus/\,y))\,x \\ = & \qquad \{ \mbox{distributivity for collections again} \} \\ & (\otimes (\oplus/\,y))\,(\oplus/\,x) \\ = & \qquad \{ \mbox{sectioning} \} \\ & (\oplus/\,x) \otimes (\oplus/\,y) \\ = & \qquad \{ \mbox{eta-expansion} \} \\ & (\otimes) \mathbin{\hbox{\footnotesize\$}} (\oplus/ \times \oplus/) \mathbin{\hbox{\footnotesize\$}} (x,y) \\ \end{array}

which discharges the proof obligation about distributivity for cartesian product, but again modulo two symmetric wishes about distributivity for collections:

\displaystyle  \begin{array}{lcl} \oplus/ \cdot \mathsf{M}(a\otimes) &=& (a\otimes) \cdot \oplus/ \\ \oplus/ \cdot \mathsf{M}(\otimes b) &=& (\otimes b) \cdot \oplus/ \\ \end{array}

Distributivity for collections

Finally, the proof obligations about distributivity for collections are easily discharged, by induction over the size of the (finite!) collection, provided that the binary operator {\otimes} distributes over {\oplus} in the familiar sense. The base case is for a singleton collection, ie in the image of {\mathit{return}} (because we disallowed empty collections); this case follows from the fact that {\oplus/} is an {\mathsf{M}}-algebra. The inductive step is for a collection of the form {u \mathbin{\underline{\smash{\mathit{mplus}}}} v} with {u,v} both strictly smaller than the whole (so, if the monad is idempotent, disjoint, or at least not nested); this requires the distribution of the algebra over choice {\oplus / (u \mathbin{\underline{\smash{\mathit{mplus}}}} v) = (\oplus/u) \oplus (\oplus/v)}, together with the familiar distribution of {\otimes} over {\oplus}.


So, the datatype-generic distributivity for {\mathsf{F}}-structures of collections that we used for the Maximum Segment Sum problem reduced to distributivity for lists of collections, which reduced to the cartesian product of collections, which reduced to that for pairs. That’s a much deeper hierarchy than I was expecting; can it be streamlined?

by jeremygibbons at October 03, 2014 03:25 PM

Morality and temptation

Inspired by Bob Harper’s recent postings, I too have a confession to make. I know what is morally right; but sometimes the temptation is too great, and my resolve is weak, and I lapse. Fast and loose reasoning may excuse me, but my conscience would be clearer if I could remain pure in the first place.

Initial algebras, final coalgebras

We know and love initial algebras, because of the ease of reasoning with their universal properties. We can tell a simple story about recursive programs, solely in terms of sets and total functions. As we discussed in the previous post, given a functor {\mathsf{F} : \mathbb{S}\mathrm{et} \rightarrow \mathbb{S}\mathrm{et}}, an {\mathsf{F}}-algebra is a pair {(X,f)} consisting of an object {X} and an arrow {f : \mathsf{F}(X) \rightarrow X}. A homomorphism between {\mathsf{F}}-algebras {(X,f)} and {(Y,g)} is an arrow {h : X \rightarrow Y} such that {h \cdot f = g \cdot \mathsf{F}(h)}:

The {\mathsf{F}}-algebra {(X,f)} is initial iff there is a unique such {h} for each {(Y,g)}; for well-behaved functors {\mathsf{F}}, such as the polynomial functors on {\mathbb{S}\mathrm{et}}, an initial algebra always exists. We conventionally write “{(\mu\mathsf{F},\mathit{in})}” for the initial algebra, and “{\mathit{fold}_{\mathsf{F}}(g)}” for the unique homomorphism {h} to another {\mathsf{F}}-algebra {(Y,g)}. (In {\mathbb{S}\mathrm{et}}, initial algebras correspond to datatypes of finite recursive data structures.)

The uniqueness of the solution is captured in the universal property:

\displaystyle  h = \mathit{fold}(g) \Leftrightarrow h \cdot \mathit{in} = g \cdot \mathsf{F}(h)

In words, {h} is this fold iff {h} satisfies the defining equation for the fold.

The universal property is crucial. For one thing, the homomorphism equation is a very convenient style in which to define a function; it’s the datatype-generic abstraction of the familiar pattern for defining functions on lists:

\displaystyle  \begin{array}{lcl} h\,[] &=& e \\ h\,(x:\mathit{xs}) &=& f\,x\,(h\,\mathit{xs}) \end{array}

These two equations implicitly characterizing {h} are much more comprehensible and manipulable than a single equation

\displaystyle  h = \lambda \mathit{xs}\;.\; \textbf{if}\;\mathit{null}\,\mathit{xs}\;\textbf{then}\;e\;\textbf{else}\;f\,(\mathit{head}\,\mathit{xs})\,(h\,(\mathit{tail}\,\mathit{xs}))

explicitly giving a value for {h}. But how do we know that this assortment of two facts about {h} is enough to form a definition? Of course! A system of equations in this form has a unique solution.

Moreover, the very expression of the uniqueness of the solution as an equivalence {h = \ldots \Leftrightarrow \ldots} provides many footholds for reasoning:

  • Read as an implication from left to right, instantiating {h} to {\mathit{fold}(g)} to make the left-hand side trivially true, we get an evaluation rule for folds:

    \displaystyle  \mathit{fold}(g) \cdot \mathit{in} = g \cdot \mathsf{F}(\mathit{fold}(g))

  • Read as an implication from right to left, we get a proof rule for demonstrating that some complicated expression {h} is a fold:

    \displaystyle  h = \mathit{fold}(g) \Leftarrow \ldots

  • In particular, we can quickly see that the identity function is a fold:

    \displaystyle  \begin{array}{ll} & \mathit{id} = \mathit{fold}(g) \\ \Leftarrow & \qquad \{ \mbox{universal property} \} \\ & \mathit{id} \cdot \mathit{in} = g \cdot \mathsf{F}(\mathit{id}) \\ \Leftrightarrow & \qquad \{ \mbox{identities} \} \\ & \mathit{in} = g \end{array}

    so {\mathit{id} = \mathit{fold}(\mathit{in})}. (In fact, this one’s an equivalence.)

  • We get a very simple proof of a fusion rule, for combining a following function with a fold to make another fold:

    \displaystyle  \begin{array}{ll} & h \cdot \mathit{fold}(f) = \mathit{fold}(g) \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & h \cdot \mathit{fold}(f) \cdot \mathit{in} = g \cdot \mathsf{F}(h \cdot \mathit{fold}(f)) \\ \Leftrightarrow & \qquad \{ \mbox{evaluation rule, functors} \} \\ & h \cdot f \cdot \mathsf{F}(\mathit{fold}(f)) = g \cdot \mathsf{F}(h) \cdot \mathsf{F}(\mathit{fold}(f)) \\ \Leftarrow & \qquad \{ \mbox{Leibniz} \} \\ & h \cdot f = g \cdot \mathsf{F}(h) \end{array}

  • Using this, we can deduce Lambek’s Lemma, that the constructors {\mathit{in}} form an isomorphism. Supposing that there is a right inverse, and it is a fold, what must it look like?

    \displaystyle  \begin{array}{ll} & \mathit{in} \cdot \mathit{fold}(f) = \mathit{id} \\ \Leftrightarrow & \qquad \{ \mathit{id} \mbox{~as a fold} \} \\ & \mathit{in} \cdot \mathit{fold}(f) = \mathit{fold}(\mathit{in}) \\ \Leftarrow & \qquad \{ \mbox{fusion} \} \\ & \mathit{in} \cdot f = \mathit{in} \cdot \mathsf{F}(\mathit{in}) \\ \Leftarrow & \qquad \{ \mbox{Leibniz} \} \\ & f = \mathsf{F}(\mathit{in}) \end{array}

    So if we define {\mathit{in}^{-1} = \mathit{fold}(\mathsf{F}(\mathit{in}))}, we get {\mathit{in} \cdot \mathit{in}^{-1} = \mathit{id}}. We should also check the left inverse property:

    \displaystyle  \begin{array}{ll} & \mathit{in}^{-1} \cdot \mathit{in} \\ = & \qquad \{ \mathit{in}^{-1} \mbox{~as a fold} \} \\ & \mathit{fold}(\mathsf{F}(\mathit{in})) \cdot \mathit{in} \\ = & \qquad \{ \mbox{evaluation rule} \} \\ & \mathsf{F}(\mathit{in}) \cdot \mathsf{F}(\mathit{fold}(\mathsf{F}(\mathit{in}))) \\ = & \qquad \{ \mathit{in}^{-1} \mbox{~as a fold again} \} \\ & \mathsf{F}(\mathit{in}) \cdot \mathsf{F}(\mathit{in}^{-1}) \\ = & \qquad \{ \mbox{functors} \} \\ & \mathsf{F}(\mathit{in} \cdot \mathit{in}^{-1}) \\ = & \qquad \{ \mbox{right inverse} \} \\ & \mathsf{F}(\mathit{id}) \\ = & \qquad \{ \mbox{functors} \} \\ & \mathit{id} \end{array}

And so on, and so on. Many useful functions can be written as instances of {\mathit{fold}}, and the universal property gives us a very powerful reasoning tool—the universal property of {\mathit{fold}} is a marvel to behold.

And of course, it all dualizes beautifully. An {\mathsf{F}}-coalgebra is a pair {(X,f)} with {f : X \rightarrow \mathsf{F}(X)}. A homomorphism between {\mathsf{F}}-coalgebras {(X,f)} and {(Y,g)} is a function {h : X \rightarrow Y} such that {g \cdot h = \mathsf{F}(h) \cdot f}:

The {\mathsf{F}}-coalgebra {(Y,g)} is final iff there is a unique homomorphism to it from each {(X,f)}; again, for well-behaved {\mathsf{F}}, final coalgebras always exist. We write “{(\nu\mathsf{F},\mathit{out})}” for the final coalgebra, and {\mathit{unfold}_{\mathsf{F}}(f)} for the unique homomorphism to it. (In {\mathbb{S}\mathrm{et}}, final coalgebras correspond to datatypes of finite-or-infinite recursive data structures.)

Uniqueness is captured by the universal property

\displaystyle  h = \mathit{unfold}(f) \Leftrightarrow \mathit{out} \cdot h = \mathsf{F}(h) \cdot f

which has just as many marvellous consequences. Many other useful functions are definable as instances of {\mathit{unfold}}, and again the universal property gives a very powerful tool for reasoning with them.


There are also many interesting functions that are best described as a combination of a fold and an unfold. The hylomorphism pattern, with an unfold followed by a fold, is the best known: the unfold produces a recursive structure, which the fold consumes.

The factorial function is a simple example. The datatype of lists of natural numbers is determined by the shape functor

\displaystyle  \mathsf{L}(X) = 1 + \mathbb{N} \times X

Then we might hope to write

\displaystyle  \mathit{fact} = \mathit{product} \cdot \mathit{downFrom}

where {\mathit{downFrom} = \mathit{unfold}_{\mathsf{L}}(d)} and {\mathit{product} = \mathit{fold}_{\mathsf{L}}(m)} with

\displaystyle  \begin{array}{lcl} d &::& \mathbb{N} \rightarrow \mathsf{L}(\mathbb{N}) \\ d\,0 &=& \mathit{inl}\,() \\ d\,(n+1) &=& \mathit{inr}\,(n+1,n) \bigskip\\ m &::& \mathsf{L}(\mathbb{N}) \rightarrow \mathbb{N} \\ m\,(\mathit{inl}\,()) &=& 1 \\ m\,(\mathit{inr}\,(n,n')) &=& n \times n' \end{array}

More elaborately, we might hope to write {\mathit{quicksort} : \mathsf{List}({\mathbb Z}) \rightarrow \mathsf{List}({\mathbb Z})} as the composition of {\mathit{unfold}_\mathsf{B}(s)} (to generate a binary search tree) and {\mathit{fold}_\mathsf{B}(g)} (to flatten that tree to a list), where {\mathsf{B}} is the shape functor for internally-labelled binary trees,

\displaystyle  p : \mathsf{List}({\mathbb Z}) \rightarrow \mathsf{B}(\mathsf{List}({\mathbb Z}))

partitions a list of integers into the unit or a pivot and two sublists, and

\displaystyle  g : \mathsf{B}(\mathsf{List}({\mathbb Z})) \rightarrow \mathsf{List}({\mathbb Z})

glues together the unit or a pivot and two sorted lists into one list. In fact, any divide-and-conquer algorithm can be expressed in terms of an unfold computing a tree of subproblems top-down, followed by a fold that solves the subproblems bottom-up.

But sadly, this doesn’t work in {\mathbb{S}\mathrm{et}}, because the types don’t meet in the middle. The source type of the fold is (the carrier of) an initial algebra, but the target type of the unfold is a final coalgebra, and these are different constructions.

This is entirely reasonable, when you think about it. Our definitions in {\mathbb{S}\mathrm{et}}—the category of sets and total functions—necessarily gave us folds and unfolds as total functions; the composition of two total functions is a total function, and so a fold after an unfold ought to be a total function too. But it is easy to define total instances of {\mathit{unfold}} that generate infinite data structures (such as a function {\mathit{upFrom}}, which generates an infinite ascending list of naturals), on which a following fold is undefined (such as “the product” of an infinite ascending list of naturals). The composition then should not be a total function.

One might try interposing a conversion function of type {\nu\mathsf{F} \rightarrow \mu\mathsf{F}}, coercing the final data structure produced by the unfold into an initial data structure for consumption by the fold. But there is no canonical way of doing this, because final data structures may be “bigger” (perhaps infinitely so) than initial ones. (In contrast, there is a canonical function of type {\mu\mathsf{F} \rightarrow \nu\mathsf{F}}. In fact, there are two obvious definitions of it, and they agree—a nice exercise!)

One might try parametrizing that conversion function with a natural number, bounding the depth to which the final data structure is traversed. Then the coercion is nicely structural (in fact, it’s a fold over the depth), and everything works out type-wise. But having to thread such “resource bounds” through the code does terrible violence to the elegant structure; it’s not very satisfactory.

Continuous algebras

The usual solution to this conundrum is to give up on {\mathbb{S}\mathrm{et}}, and to admit that richer domain structures than sets and total functions are required. Specifically, in order to support recursive definitions in general, and the hylomorphism in particular, one should move to the category {\mathbb{C}\mathrm{po}} of continuous functions between complete partial orders (CPOs). Now is not the place to give all the definitions; see any textbook on denotational semantics. The bottom line, so to speak, is that one has to accept a definedness ordering {\sqsubseteq} on values—both on “data” and on functions—and allow some values to be less than fully defined.

Actually, in order to give meaning to all recursive definitions, one has to further restrict the setting to pointed CPOs—in which there is a least-defined “bottom” element {\bot_X} for each type {X}, which can be given as the “meaning” (solution) of the degenerate recursive definition {x=x} at type {X}. Then there is no “empty” CPO; the smallest CPO {0} has just a single element, namely {\bot}. As with colimits in general, this smallest object is used as the start of a chain of approximations to a limiting solution. But in order for {0} really to be an initial object, one also has to constrain the arrows to be strict, that is, to preserve {\bot}; only then is there a unique arrow {0 \rightarrow A} for each {A}. The category of strict continuous functions between pointed CPOs is called {\mathbb{C}\mathrm{po}_\bot}.

It so happens that in {\mathbb{C}\mathrm{po}_\bot}, initial algebras and final coalgebras coincide: the objects (pointed CPOs) {\mu\mathsf{F}} and {\nu\mathsf{F}} are identical. This is very convenient, because it means that the hylomorphism pattern works fine: the structure generated by the unfold is exactly what is expected by the fold.

Of course, it still happen that the composition yields a “partial” (less than fully defined) function; but at least it now type-checks. Categories with this initial algebra/final coalgebra coincidence are called algebraically compact; they were studied by Freyd, but there’s a very good survey by Adámek, Milius and Moss.

However, the story gets murkier than that. For one thing, {\mathbb{C}\mathrm{po}_\bot} does not have proper products. (Indeed, an algebraically compact category with products collapses.) But beyond that, {\mathbb{C}\mathrm{po}_\bot}—with its restriction to strict arrows—is not a good model of lazy functional programming; {\mathbb{C}\mathrm{po}}, with non-strict arrows too, is better. So one needs a careful balance of the two categories. The consequences for initial algebras and final coalgebras are spelled out in one of my favourite papers, Program Calculation Properties of Continuous Algebras by Fokkinga and Meijer. In a nutshell, one can only say that the defining equation {h \cdot \mathit{in} = g \cdot \mathsf{F}(h)} for folds has a unique strict solution in {h}; without the strictness side-condition, {h\,\bot} is unconstrained (because {\mathit{in}\,x \ne \bot} for any {x}). But the situation for coalgebras remains unchanged—the defining equation {\mathit{out} \cdot h = \mathsf{F}(h) \cdot f} for unfolds has a unique solution (and moreover, it is strict when {f} is strict).

This works, but it means various strictness side-conditions have to be borne in mind when reasoning about folds. Done rigorously, it’s rather painful.

Recursive coalgebras

So, back to my confession. I want to write divide-and-conquer programs, which produce intermediate data structures and then consume them. Folds and unfolds in {\mathbb{S}\mathrm{et}} do not satisfy me; I want more—hylos. Morally, I realise that I should pay careful attention to those strictness side-conditions. But they’re so fiddly and boring, and my resolve is weak, so I usually just brush them aside. Is there away that I can satisfy my appetite for divide-and-conquer programs while still remaining in the pure {\mathbb{S}\mathrm{et}} world?

Tarmo Uustalu and colleagues have a suggestion. Final coalgebras and algebraic compactness are sufficient but not necessary for the hylo diagram above to have a unique solution; they propose to focus on recursive coalgebras instead. The {\mathsf{F}}-coalgebra {(X,f)} is “recursive” iff, for each {g : \mathsf{F}(Y) \rightarrow Y}, there is a unique {h} such that {h = g \cdot \mathsf{F}(h) \cdot f}:

This is a generalization of initial algebras: if {\mathsf{F}} has an initial algebra {(\mu\mathsf{F},\mathit{in})}, then by Lambek’s Lemma {\mathit{in}} has an inverse {\mathit{in}^{-1}}, and {(\mu\mathsf{F},\mathit{in}^{-1})} is a recursive coalgebra. And it is a strict generalization: it also covers patterns such as paramorphisms (primitive recursion)—since {(\mu\mathsf{F}, \mathsf{F}(\mathit{fork}(\mathit{id},\mathit{id}))\cdot\mathit{in}_\mathsf{F}^{-1})} is a recursive {\mathsf{G}}-coalgebra where {\mathsf{G}} is the functor taking {X} to {\mathsf{F}(X \times \mathsf{F}(X))}—and the “back one or two steps” pattern used in the Fibonacci function.

Crucially for us, almost by definition it covers all of the “reasonable” hylomorphisms too. For example, {(\mathbb{N},d)} is a recursive {\mathsf{L}}-coalgebra, where {\mathsf{L}} is the shape functor for lists of naturals and {d} the {\mathsf{L}}-coalgebra introduced above that analyzes a natural into nothing (for zero) or itself and its predecessor (for non-zero inputs). Which is to say, for each {m : \mathsf{L}(X) \rightarrow X}, there is a unique {h} such that {h = m \cdot \mathsf{L}(h) \cdot d}; in particular, for the {m} given above that returns 1 or multiplies, the unique {h} is the factorial function. (In fact, this example is also an instance of a paramorphism.) And {(\mathsf{List}({\mathbb Z}),p)} is a recursive {\mathsf{B}}-coalgebra, where {p} is the partition function of quicksort—for any {\mathsf{B}}-algebra {(Y,g)}, there is a unique {h} such that {h = g \cdot \mathsf{B}(h) \cdot p}, and in particular when {g} is the glue function for quicksort, that unique solution is quicksort itself.

This works perfectly nicely in {\mathbb{S}\mathrm{et}}; there is no need to move to more complicated settings such as {\mathbb{C}\mathrm{po}} or {\mathbb{C}\mathrm{po}_\bot}, or to consider partiality, or strictness, or definedness orderings. The only snag is the need to prove that a particular coalgebra of interest is indeed recursive. Capretta et al. study a handful of “basic” recursive coalgebras and of constructions on coalgebras that preserve recursivity.

More conveniently, Taylor and Adámek et al. relate recursivity of coalgebras to the more familiar notion of variant function, ie well-founded ordering on arguments of recursive calls. They restrict attention to finitary shape functors; technically, preserving directed colimits, but informally, I think that’s equivalent to requiring that each element of {\mathsf{F}(X)} has a finite number of {X} elements—so polynomial functors are ok, as is the finite powerset functor, but not powerset in general. If I understand those sources right, for a finitary functor {\mathsf{F}} and an {\mathsf{F}}-coalgebra {(X,f)}, the following conditions are equivalent: (i) {(X,f)} is corecursive; (ii) {f} is well-founded, in the sense that there is a well-founded ordering {\prec} such that {y \prec x} for each “element” {y} of {f(x)}; (iii) every element of {\mathit{unfold}_\mathsf{F}(f)} has finite depth; and (iv) there is a coalgebra homomorphism from {(X,f)} to {(\mu\mathsf{F},\mathit{in})}.

This means that I can resort to simple and familiar arguments in terms of variant functions to justify hylo-style programs. The factorial function is fine, because ({\mathsf{L}} is a finitary functor, being polynomial, and) the chain of recursive calls to which {d} leads is well-founded; quicksort is fine, because the partitioning step is well-founded; and so on. Which takes a great weight of guilt off my shoulders: I can give in to the temptation to write interesting programs, and still remain morally as pure as the driven snow.

by jeremygibbons at October 03, 2014 03:20 PM


Universal properties are a generalization of the notion of a Galois connection between two orderings. Or perhaps I should say: universal properties arise from adjunctions, and it is adjunctions that are a generalization of Galois connections. Adjunctions capture in an abstract categorical setting the idea of “optimal solutions to a problem”; and this idea is itself very general, capturing many of the structures underlying common patterns in programming (not to mention the rest of mathematics). Solutions to equations, products, limits of sequences of approximations, and minimality and maximality are just some of the instances of this powerful abstraction that we will make use of. In the preface to Categories for the Working Mathematician, Mac Lane wrote that “adjoint functors arise everywhere”.

Adjoint functors

Two functors {\mathsf{F} : \mathbb{D} \leadsto \mathbb{C}} and {\mathsf{G} : \mathbb{C} \leadsto \mathbb{D}} form an adjunction, written {\mathsf{F} \dashv \mathsf{G}}, if there is an isomorphism between the sets of arrows {\mathsf{F}(B) \rightarrow A} in {\mathbb{C}} and {B \rightarrow \mathsf{G}(A)} in {\mathbb{D}}. We say that {\mathsf{F}} is the left adjoint and {\mathsf{G}} the right adjoint. The essence of the isomorphism is captured by two natural transformations {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \mathsf{G} \mathbin{\cdot} \mathsf{F}} in {\mathbb{D}} and {\epsilon : \mathsf{F} \mathbin{\cdot} \mathsf{G} \mathbin{\stackrel{.}{\to}} \mathsf{Id}} in {\mathbb{C}}, called the unit and counit of the adjunction; {\eta} is the image in {\mathbb{D}} of {\mathit{id}_{\mathsf{F}(B)} : \mathsf{F}(B) \rightarrow \mathsf{F}(B)} in {\mathbb{C}}, and conversely, {\epsilon} is the image in {\mathbb{C}} of {\mathit{id}_{\mathsf{G}(A)}} in {\mathbb{D}}. The unit and counit satisfy the laws

\displaystyle  \begin{array}{lcl} \epsilon_{\mathsf{F}(B)} \cdot \mathsf{F}(\eta_B) &=& \mathit{id}_{\mathsf{F}(B)} \\ \mathsf{G}(\epsilon_A) \cdot \eta_{\mathsf{G}(A)} &=& \mathit{id}_{\mathsf{G}(A)} \end{array}

From them one can construct the witnesses to the isomorphism for arbitrary arrows: for each arrow {f : \mathsf{F}(B) \rightarrow A} in {\mathbb{C}}, there is a unique arrow {g : B \rightarrow \mathsf{G}(A)} in {\mathbb{D}} such that {\epsilon_A \cdot \mathsf{F}(g) = f}, given by {g = \mathsf{G}(f) \cdot \eta_B}; and conversely, for each arrow {g : B \rightarrow \mathsf{G}(A)} in {\mathbb{D}}, there is a unique arrow {f : \mathsf{F}(B) \rightarrow A} in {\mathbb{C}} such that {\mathsf{G}(f) \cdot \eta_ B = g}, given by {f = \epsilon_B \cdot \mathsf{F}(g)}; and moreover, these two constructions are each other’s inverses.

Adjunctions from Galois connections

A preorder {(X,{\le})} forms a category: the objects of the category are the elements of the set~{X}, and between any two elements {x,y \in X}, there is a unique arrow if {x \le y}, and no arrow otherwise. That adjunctions are a generalization of Galois connections follows straightforwardly from the fact that there is at most one arrow between any two objects in a preorder category. Then monotonic functions {f : (X,{\le_X}) \rightarrow (Y,{\le_Y})} and {g : (Y,{\le_Y}) \rightarrow (X,{\le_X})} between preorders {(X,{\le_X})} and {(Y,{\le_Y})} form a Galois connection precisely if the sets of arrows {f(y) \rightarrow x} and {y \rightarrow g(x)} are isomorphic—that is, if both {f(y) \le_X x} and {y \le_Y g(x)} hold, or neither do, or in other words,

\displaystyle  f(y) \le_X x \Leftrightarrow y \le_Y g(x)

Adjoints of the diagonal functor

A very useful example of adjunctions arises in the definition of products—in the category {\mathbb{S}\mathrm{et}} of sets and total functions, for given types {A,B,C}, there is an isomorphism between the set of pair-generating functions, of type {A \rightarrow B \times C}, and their two projections, pairs of functions of types {A \rightarrow B} and {A \rightarrow C}. (Indeed, given functions {f:A \rightarrow B} and {g:A \rightarrow C}, one can construct the pair-generating function {\mathit{fork}(f,g) : A \rightarrow B \times C}; and conversely, given a pair-generating function {h : A \rightarrow B \times C}, one can construct its two projections {fst \cdot h : A \rightarrow B} and {snd \cdot h : A \rightarrow C}; and moreover, these two constructions are inverses.)

The “isomorphism between sets of arrows” can be elegantly expressed as an adjunction; since it concerns pairs of arrows, one side of the adjunction involves the product category {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}. The right adjoint is the product functor {(\times) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et}}, mapping an object in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}—that is, a pair of sets—to their cartesian product as an object in {\mathbb{S}\mathrm{et}}, and an arrow in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}—that is, a parallel pair of functions—to a function in {\mathbb{S}\mathrm{et}} acting pointwise on pairs. In the other direction, the left adjoint is the diagonal functor {\triangle : \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, mapping an object {A} in {\mathbb{S}\mathrm{et}} to the object {(A,A)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, and a function {f} to the pair of functions {(f,f)} as an arrow in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}. The adjunction {{\triangle} \dashv (\times)} amounts to the isomorphism

\displaystyle  \triangle A \rightarrow (B,C) \approx A \rightarrow {\times} (B,C)

or equivalently,

\displaystyle  (A \rightarrow B)\times(A \rightarrow C) \approx A \rightarrow (B\times C)

The unit and counit of the adjunction are {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} (\times) \mathbin{\cdot} \triangle} and {\epsilon : \triangle \mathbin{\cdot} (\times) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}. In more familiar terms, the unit is a natural transformation in {\mathbb{S}\mathrm{et}}, so a polymorphic function; in fact, it’s the function of type {A \rightarrow A \times A} that we might call {\mathit{double}}. However, the counit is a natural transformation {(A \times B,A \times B) \rightarrow (A,B)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, so not simply a (polymorphic) function; but arrows in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}} are pairs of functions, so we might write this {(\mathit{fst},\mathit{snd}) :: (A \times B \rightarrow A, A \times B \rightarrow B)}.

Then the “fork” operation is in fact one of the two witnesses to the isomorphism between the sets of arrows: given an arrow {\triangle A \rightarrow (B,C)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, that is, a pair {(f,g)} of functions of types {(A \rightarrow B,A \rightarrow C)}, then {\mathit{fork}(f,g)} is an arrow {A \rightarrow {\times} (B,C)} in {\mathbb{S}\mathrm{et}}, that is, a function of type {A \rightarrow B \times C}, given by the construction above:

\displaystyle  \mathit{fork}(f,g) = (\times) (f,g) \cdot \mathit{double}

or, with more points,

\displaystyle  \mathit{fork} (f,g)\,a = (f\,a, g\,a)

The laws that the unit and counit satisfy are

\displaystyle  \begin{array}{lcl} (\mathit{fst},\mathit{snd}) \cdot \triangle \mathit{double} &=& \mathit{id} \\ (\times) (\mathit{fst},\mathit{snd}) \cdot \mathit{double} &=& \mathit{id} \end{array}

or, in more familiar terms,

\displaystyle  \begin{array}{lcl} \mathit{fst} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{snd} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{fork} (\mathit{fst},\mathit{snd}) &=& \mathit{id} \end{array}

The universal property of {\mathit{fork}} follows from the isomorphism between sets of arrows:

\displaystyle  \begin{array}{ll} & h = \mathit{fork}(f,g) \\ \Leftrightarrow & \qquad \{ \mathit{fork} \} \\ & h = (\times) (f,g) \cdot \mathit{double} \\ \Leftrightarrow & \qquad \{ \mbox{isomorphism between arrow sets} \} \\ & (\mathit{fst},\mathit{snd}) \cdot \triangle h = (f,g) \\ \Leftrightarrow & \qquad \{ \triangle \} \\ & (\mathit{fst},\mathit{snd}) \cdot (h,h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{composition in~} \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \mbox{~is pointwise} \} \\ & (\mathit{fst} \cdot h,\mathit{snd} \cdot h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{equality of pairs is pointwise} \} \\ & \mathit{fst} \cdot h=f \land \mathit{snd} \cdot h=g \end{array}

The universal property of {\mathit{fork}} underlies all the useful laws of that operator.

Of course, the situation nicely dualizes too. Coproducts in {\mathbb{S}\mathrm{et}} arise from the isomorphism between the set of arrows {A+B \rightarrow C} and the pairs of arrows in {A \rightarrow C} and {B \rightarrow C}. Again, “pairs of arrows” suggest the product category; but this time, the diagonal functor is the right adjoint, with the coproduct functor {(+) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \rightarrow \mathbb{S}\mathrm{et}} (which takes a pair of sets {(A,B)} to their disjoint union) as the left adjoint. That is, the adjunction is {(+) \dashv \triangle}, and the isomorphism is

\displaystyle  (+) (A,B) \rightarrow C \approx (A,B) \rightarrow \triangle C

The unit {\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \triangle \mathbin{\cdot} (+)} is a natural transformation {(A,B) \rightarrow (A+B,A+B)} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, that is, a pair of functions {\mathit{inl} : A \rightarrow A+B} and {\mathit{inr} : B \rightarrow A+B}. The counit {\epsilon : (+) \mathbin{\cdot} \triangle \mathbin{\stackrel{.}{\to}} \mathsf{Id}} is a natural transformation {A+A \rightarrow A} in {\mathbb{S}\mathrm{et}}, which we might call {\mathit{merge}}. The “join” of two functions with a common range is a witness to one half of the isomorphism—given an arrow {(f,g) : (A,B) \rightarrow \triangle C} in {\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}, then {\mathit{join} (f,g)} is an arrow {(+) (A,B) \rightarrow C} in {\mathbb{S}\mathrm{et}}, defined by

\displaystyle  \mathit{join} (f,g) = \mathit{merge} \cdot (+) (f,g)

The two laws that the unit and counit satisfy are:

\displaystyle  \begin{array}{lcl} \mathit{merge} \cdot (+) (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \triangle \mathit{merge} \cdot (\mathit{inl},\mathit{inr}) &=& \mathit{id} \end{array}

or, perhaps more perspicuously,

\displaystyle  \begin{array}{lcl} \mathit{join} (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inl} &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inr} &=& \mathit{id} \end{array}

Another familiar example from functional programming is the notion of currying, which arises when one can construct the function space {A \Rightarrow B} (the type of functions from {A} to {B}, for each type {A} and {B}), such that there is an isomorphism between the sets of arrows {A \rightarrow (B \Rightarrow C)} and {A \times B \rightarrow C}. Here, the adjunction is {( \times B) \dashv (B \Rightarrow )}—in this case, both functors are endofunctors on {\mathbb{S}\mathrm{et}}. The unit and counit are natural transformations {\mathsf{Id} \mathbin{\stackrel{.}{\to}} (B \Rightarrow )\mathbin{\cdot}( \times B)} and {( \times B)\mathbin{\cdot}(B \Rightarrow ) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}. We might call these {\mathit{pair}} and {\mathit{apply}}, since the first is a curried pair-forming operator, and the second applies a function to an argument:

\displaystyle  \begin{array}{lcl} \mathit{pair} &:& A \rightarrow (B \Rightarrow (A \times B)) \\ \mathit{apply} &:& (B \Rightarrow A) \times B \rightarrow A \end{array}

The laws they satisfy are as follows:

\displaystyle  \begin{array}{lcl} \mathit{apply} \cdot ( \times B) \mathit{pair} &=& \mathit{id} \\ (B \Rightarrow )\mathit{apply} \cdot \mathit{pair} &=& \mathit{id} \end{array}

or, in points,

\displaystyle  \begin{array}{lcl} \mathit{apply} (\mathit{pair}\,a,b) &=& (a,b) \\ \mathit{apply} \cdot \mathit{pair}\,f &=& f \end{array}

The isomorphism itself is witnessed by the two inverse functions

\displaystyle  \begin{array}{lcl} \mathit{curry}\,f &=& (B \Rightarrow ) f \cdot \mathit{pair} \\ \mathit{uncurry}\,g &=& \mathit{apply} \cdot ( \times B) g \end{array}

where {f : A \times B \rightarrow C} and {g : A \rightarrow (B \Rightarrow C)}.

by jeremygibbons at October 03, 2014 03:16 PM

Universal properties and Galois connections

One recurring theme throughout this series will be that of a universal property—an identity that captures an indirect means of solving a problem, by transforming that problem into a different (and hopefully simpler) domain, while still preserving all its essential properties. In particular, the original problem has a solution if and only if the transformed problem does, and moreover, the solution to the transformed problem can easily be translated back into a solution to the original problem. One can see universal properties as a generalization of the notion of a Galois connection between two orderings, which are a similarly powerful technique of relating problems in two different settings. (In fact, the proper generalization of Galois connections is to adjunctions, but that’s a story for next time.)

Universal properties

The universal property of the {\mathit{fork}} operation for products is a representative example. Recall that {\mathit{fork}\,(f,g) :: a \rightarrow (b,c)} when {f :: a \rightarrow b} and {g :: a \rightarrow c}; and that {\mathit{fst} :: (b,c) \rightarrow b} and {\mathit{snd} :: (b,c) \rightarrow c}. Then {\mathit{fork}} is completely defined by its universal property:

\displaystyle  h = \mathit{fork}\,(f,g) \quad\Leftrightarrow\quad \mathit{fst} \cdot h = f \land \mathit{snd} \cdot h = g

This identity repays careful study.

  • It translates a problem in the more complex domain of products (namely, the problem of showing how some complicated expression {h} can be written in terms of {\mathit{fork}}) into simpler problems (here, equations about the two projections of {h}).
  • It’s an equivalence. So not only do you have an implication from left to right (any {h} expressible as a {\mathit{fork}} satisfies the two properties on the right), you also have one from right to left (any pair of functions {f,g} satisfying the two properties on the right induces a {\mathit{fork}}). In other words, {h} is a solution to the equation on the left iff it is a solution on the right; not only does a solution on the right yield a construction on the left, but also the absence of solutions on the right implies the absence on the left. Or again: the equations on the right have a unique solution in {h}—since any two solutions {h,h'} must both be equal to the same expression on the left.
  • It has many useful simple consequences. You can make the left-hand side trivially true by letting {h = \mathit{fork}\,(f,g)}; then the right-hand side must also be true:

    \displaystyle  \begin{array}{lcl} \mathit{fst} \cdot \mathit{fork}\,(f,g) &=& f \\ \mathit{snd} \cdot \mathit{fork}\,(f,g) &=& g \end{array}

    Symmetrically, you can make the right-hand side trivially true by letting {f = \mathit{fst} \cdot h} and {g = \mathit{snd} \cdot h}; then the left-hand side must also be true:

    \displaystyle  h = \mathit{fork}\,(\mathit{fst} \cdot h, \mathit{snd} \cdot h)

    If you further let {h = \mathit{id}}, you conclude that every pair consists solely of its two projections, nothing more:

    \displaystyle  \mathit{id} = \mathit{fork}\,(\mathit{fst}, \mathit{snd})

    In fact, the universal property of {\mathit{fork}} tells you everything you need to know about {\mathit{fork}}; you might take that as one justification for the term “universal”.

  • It also has many useful less obvious consequences. For example, if you’re searching for an {h} that acts independently on the two components of a pair—{\mathit{fst} \cdot h = h_1 \cdot \mathit{fst}} and {\mathit{snd} \cdot h = h_2 \cdot \mathit{snd}}—just let {f = h_1 \cdot \mathit{fst}} and {g = h_2 \cdot \mathit{snd}} in the universal property, and conclude

    \displaystyle  h = \mathit{fork}\,(h_1\cdot\mathit{fst}, h_2\cdot\mathit{snd})

    (which we’ve written “{\mathit{prod}\,(h_1,h_2)}” elsewhere). For another example, we can deduce a fusion law for {\mathit{fork}}: for what {f',g'} does the equation

    \displaystyle  \mathit{fork}\,(f,g) \cdot k = \mathit{fork}\,(f',g')

    hold? This matches the left-hand side of the universal property; expanding the right-hand side yields

    \displaystyle  \begin{array}{lclcl} f' &=& \mathit{fst}\cdot\mathit{fork}\,(f,g)\cdot k &=& f \cdot k \\ g' &=& \mathit{snd}\cdot\mathit{fork}\,(f,g)\cdot k &=& g \cdot k \end{array}

Such a rich harvest from so small a seed! (In fact, we will see later that an even smaller seed suffices.)

Galois connections

We can see the same structures that occur in universal properties like that of {\mathit{fork}} above also in relationships between orderings. As a very simple example, consider the problem of dividing a natural number {n} by two, exactly; the universal property of a solution {m} to this problem is the equivalence

\displaystyle  n / 2 = m \Leftrightarrow n = m \times 2

That is, {m} is a solution to the problem “compute {n / 2}” precisely when {n = m \times 2}; both the existence and the identification of a solution to a problem expressed in terms of division has been translated to one in terms of multiplication—which is arguably a simpler setting. Note that the universal property amounts to an equivalence

\displaystyle  f(n) = m \Leftrightarrow n = g(m)

involving the two functions {f = (/2)} and {g = (\times 2)}, which are in some sense inverses. This pattern will crop up over and over again.

The division example involved an equivalence between the two identities {f(n)=m} and {n=g(m)}. More generally, another relation than “{=}” might be involved. Extending the previous example to integer division, rounding down, we have for {k>0}:

\displaystyle  n \div k \ge m \Leftrightarrow n \ge m \times k

Again, this relates the two (in some sense inverse) functions {(\div k)} and {(\times k)}; but this time equality is inadequate for stating the problem, and it perhaps more convincing to claim that a more complicated problem {(\div k)} has been translated into a simpler one {(\times k)}. What is more, translating the problem via this universal property pays dividends when it comes to reasoning about the problem, because the simpler problem space is much more amenable to calculation. For example, properties of repeated division {(n \div k) \div l} (for {k,l>0}) do not trip off the tongue; but we can reason straightforwardly as follows:

\displaystyle  \begin{array}{ll} & (n \div k) \div l \ge m \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \div k \ge m \times l \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \ge (m \times l) \times k \\ \Leftrightarrow & \qquad \{ \mbox{multiplication is associative} \} \\ & n \ge m \times (l \times k) \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \div (l \times k) \ge m \end{array}

Thus, {(n \div k) \div l \ge m} precisely when {n \div (l \times k) \ge m}, or in other words, {(n \div k) \div l = n \div (l \times k)}.

In this case, the two problem spaces have both involved the same relation {\ge} on the same domain, namely the natural numbers; that is not essential. For example, the universal property of the floor function {\lfloor\cdot\rfloor} from reals to integers is given by:

\displaystyle  \mathit{inj}(n) \le_R x \Leftrightarrow n \le_I \lfloor x \rfloor

where, to be completely explicit, we have written {\le_R} for the usual ordering on reals and {\le_I} for the corresponding ordering on integers, and {\mathit{inj}} for the injection from the integers into the reals. This time the two problem spaces involve two different orderings on different domains; we say that the pair of functions {\mathit{inj}} and {\lfloor\cdot\rfloor} form a Galois connection between the orderings {\le_R} and {\le_I}. (We also see that the relationship between the two functions {\mathit{inj}} and {\lfloor\cdot\rfloor} is becoming less like a pure inverse relationship, and more of an embedding–projection pair.)

As a simple non-arithmetical example of a Galois connection on a single domain, consider some set {U} and a fixed subset {X \subseteq U}; then

\displaystyle  A \cap X \subseteq B \Leftrightarrow A \subseteq B \cup \overline{X}

That is, {(\cap X)} and {(\cup \overline{X})} form a Galois connection between {\subseteq} and itself.

A non-arithmetical example between two different domains is afforded by the field of formal concept analysis, which relates “objects” and their “properties”. Given are sets {O} of objects and {P} of properties, and a relation {(\vdash) \subseteq O \times P}; we write {o \mathrel{\vdash} p} to denote that object {o} has property {p}. This induces “concept-forming operators” {\mathit{intent} : 2^O \rightarrow 2^P} and {\mathit{extent} : 2^P \rightarrow 2^O} defined by:

\displaystyle  \begin{array}{lcl} \mathit{intent}(E) &=& \{ p \in P \mid \forall o \in E .\; o \mathrel{\vdash} p \} \\ \mathit{extent}(I) &=& \{ o \in O \mid \forall p \in I .\; o \mathrel{\vdash} p \} \end{array}

That is, {\mathit{intent}(E)} is the set of properties enjoyed by all objects in {E}, and {\mathit{extent}(I)} is the set of objects enjoying all the properties in {I}; a concept is a pair {(E,I)} with {\mathit{intent}(E) = I} and {\mathit{extent}(I) = E}. The concept-forming operators form a Galois connection between {\subseteq} and {\supseteq}:

\displaystyle  \begin{array}{ll} & \mathit{extent}(I) \supseteq E \\ \Leftrightarrow& \qquad \{ \mbox{characteristic of~} \mathit{extent} \} \\ & \forall o \in E .\; (\forall p \in I .\; o \mathrel{\vdash} p) \\ \Leftrightarrow& \qquad \{ \mbox{commuting quantifiers} \} \\ & \forall p \in I .\; (\forall o \in E .\; o \mathrel{\vdash} p) \\ \Leftrightarrow& \qquad \{ \mbox{characteristic of~} \mathit{intent} \} \\ & I \subseteq \mathit{intent}(E) \end{array}

This construction can be used to translate a problem about the extension of a concept (that is, an enumeration of its instances) into one about the intension (that is, the characteristic properties of its instances). It is related to the observation that “syntax and semantics are adjoint“—under the analogy that “objects” are sets of mathematical structures, “properties” are axioms, and the relation is “satisfaction”, the models of an axiomatic theory {T} are included in a set of structures {S} if and only if the theory {T} logically entails the minimal axiomatization of {S}.

by jeremygibbons at October 03, 2014 03:13 PM

October 02, 2014

Joachim Breitner

11 ways to write your last Haskell program

At my university, we recently held an exam that covered a bit of Haskell, and a simple warm-up question at the beginning asked the students to implement last :: [a] -> a. We did not demand a specific behaviour for last [].

This is a survey of various solutions, only covering those that are actually correct. I elided some variation in syntax (e.g. guards vs. if-then-else).

Most wrote the naive and straightforward code:

last [x] = x
last (x:xs) = last xs

Then quite a few seemed to be uncomfortable with pattern-matching and used conditional expressions. There was some variety in finding out whether a list is empty:

last (x:xs)
  | null xs == True = x
  | otherwise       = last xs

last (x:xs)
  | length (x:xs) == 1 = x
  | otherwise          = last xs

last (x:xs)
  | length xs == 0 = x
  | otherwise      = last xs

last xs
  | lenght xs > 1 = last (tail xs)
  | otherwise     = head xs

last xs
  | lenght xs == 1 = head xs
  | otherwise      = last (tail xs)

last (x:xs)
  | xs == []  = x
  | otherwise = last xs

The last one is not really correct, as it has the stricter type Eq a => [a] -> a. Also we did not expect our students to avoid the quadratic runtime caused by using length in every step.

The next class of answers used length to pick out the right elemet, either using (!!) directly, or simulating it with head and drop:

last xs = xs !! (length xs - 1)

last xs = head (drop (length xs - 1) xs)

There were two submissions that spelled out an explicit left folding recursion:

last (x:xs) = lastHelper x xs
    lastHelper z [] = z
    lastHelper z (y:ys) = lastHelper y ys

And finally there are a few code-golfers that just plugged together some other functions:

last x = head (reverse x)

Quite a lot of ways to write last!

by Joachim Breitner ([email protected]) at October 02, 2014 01:47 PM

Yesod Web Framework

Updating auto-update

A few weeks back, Kazu and I received an email from Martin Bailey, who was interested in working with us to further optimize Warp. The subject at the time was reduced buffer copying and allocations. That subject is very interesting in itself, and once finalized, will get its own blog post as well. But that's not the subject of this blog post.

After working on some ideas, Kazu benchmarked Warp, and found a massive performance degradation which had already slipped onto Hackage. The problem only appears under highly concurrent requests (which is exactly what Kazu's benchmark was testing). That bug has now been resolved, and users are recommended to upgrade to the newest versions of auto-update and warp. (We also included some other performance boosts here, care of Ryan Newton's atomic-primops package.) This blog post covers the problem, and its solution.

It's about time

One of the required response headers according to the HTTP spec is the Date header. As you might guess, this consists of the date, as well as the current time on the server. This has a very specific format specified in the spec. A naive approach to filling in this header would be, for each request, to run:

now <- getCurrentTime
let value = formatTimeHTTP now
    headers' = ("Date", value) : headers

However, when you're writing a server that handles hundreds of thousands of requests per second, having to get and format the current time on each request is incredibly expensive. Instead, quite some time ago, Kazu introduced a date formatting worker thread, which essentially looks like this:

let getNowFormatted = formatTimeHTTP <$> getCurrentTime
nowFormatted <- getNowFormatted >>= newIORef
forkIO $ forever $ do
    threadDelay 1000000
    getNowFormatted >>= writeIORef nowFormatted
return $ readIORef nowFormatted

We fork a single thread that recalculates the formatted time every second, and updates a mutable reference. This means that, regardless of how many clients connect, we will always run this computation at most once per second. And reading the current time requires no system calls, formatting, or allocation: it's just a memory read.

Watt's the matter

The problem with this approach is that, even if there are zero requests, we're still going to run this computation once a second. This doesn't hurt our performance too much (it's a relatively cheap operation). However, it does mean that our process never stops working, which is bad for power consumption. So we needed a way to let that worker thread turn off when it wasn't needed anymore, and start up again on demand.

If this sounds familiar, it's because we blogged about it just two months ago. But there's an implementation detail I want to point out about auto-update. When I started writing it, I aimed to have the worker thread completely shut down when not needed, and then for a new worker thread to be spawned on demand. This resulted in some strange corner cases. In particular, it was possible for two different callers to spawn two different worker threads at the same time. We used atomicModifyIORef to ensure that only one of those threads became the "official" worker, and the other one would shut down right away. There was also a lot of tricky business around getting async exceptions right.

The code wasn't too difficult to follow, and was still pretty short. You can view it on Github.

The core of the problem

Unfortunately, before release, we didn't do enough performance testing of auto-update in highly concurrent settings. When we threw enough cores at the problem, we quickly ran into a situation where multiple Haskell threads would end up making concurrent requests for the auto-update value, and each of those threads would fork a separate worker thread. Only one of those would survive, but the forking itself was killing our performance! (I had one benchmark demonstrating that, for one million requests across one thousand threads on four cores, over ten thousands worker threads were forked.)

There was another problem as well. We made heavy use of atomicModifyIORef to get this working correctly. Compare this to our original dedicated thread solution: each data access was a simple, cheap readIORef. By introducing atomicModifyIORef at each request site, we introduced memory barrier issues immediately.

The solution

After iterating a few times, Kazu, Martin and I came up with a fairly elegant solution to the problem. As noted, the original dedicated thread was in many ways ideal. A lot of our problems were coming from trying to fork threads on demand. So we came up with a compromise: why not have a single, dedicated worker thread, but allow it to go to sleep/block when it's no longer needed?

Blocking semantics meant we needed to move beyond IORefs over to either MVars or STM (we elected for the former, but it would still be interesting to compare performance with the latter). But we still want to be able to have incredibly cheap lookup in the common case of a value being precomputed. So we ended up with three mutable variables:

  1. An IORef containing a Maybe a. If a precomputed value is available, it's present in there as Just a. Otherwise, there's a Nothing value. This mean that, in the normal case, a requester only needs to (a) do a memory read, and (b) case analyze the value.

  2. An MVar baton to tell the worker thread that someone is waiting for a value. In the case that the IORef contains Nothing, the requester fills this MVar. Before it starts working, the worker thread always tries to takeMVar this baton.

  3. An MVar for passing back result values. This may seem redundant with the IORef, but is necessary for the case of Nothing. A requester putMVars into the baton, and then blocks in readMVar on this reference until the value is available. The worker thread will update both the IORef and this value at the same time, and then go to sleep until it needs to compute again, at which point the process will loop.

Since the requesters never try to fork threads, there's no possibility of running into the high contention problem of multiple forks. The worst case scenario is that many requesters see a Nothing value at the same time, and all end up blocking on the same MVar. While less efficient than just reading an IORef, this isn't nearly as expensive as what we had previously.

There's another big advantage: the normal case no longer uses any atomicModifyIORef (or any other synchronized memory access), meaning we've avoided memory barriers during periods of high concurrency.

And finally: the code is drastically simpler.

And now that we've fixed our regression, and gotten back high performance together with better power consumption, we can finally return to Martin's original idea of better buffer management. More on that soon hopefully :).

October 02, 2014 11:24 AM

October 01, 2014

Functional Jobs

Full Stack Functional Web Engineer at Front Row Education (Full-time)

Front Row Education ( - San Francisco, Fulltime

== Position ==

Early Full-stack web engineer to join fast-growing education startup that changes how hundreds of thousands of kids learn math.

== TL;DR - Reasons to care about working with Front Row ==

  • A mission you can be proud of: kids using Front Row improve twice as much at math

  • Tremendous impact on a small team

  • Flexibility on what you work on, and autonomy over your schedule

  • Effective & exciting tools: Clojure, Haskell, PostgreSQL, Backbone.js, Ansible

  • Be part of a growing success: in just over a year after launch, we are in more than 6% of all US schools

== The Business ==

Millions of teachers around the USA are struggling to help 30+ students in their class learn math because every student is in their own place. In a typical fourth grade classroom, there may be students learning to count, students learning to add, students learning to multiply, and students learning how exponents work - and one teacher somehow needs to address all these needs.

Thousands of teachers use Front Row every day to save hours of time and make sure their students are growing quickly. Front Row active users have been growing 100% a month for the past 8 school months.

== The Role ==

As one of our very first engineers, you will be part of a team of developers who are passionate about their vocation, take pride in their craft and who push each other to grow as professionals. You will strive for pragmatism and 80/20 in your work. You will be using tools that provide you with the most leverage and make you most effective. By working really smart, you will produce more than the average developer ever will, but without the crazy hours.

You will work in an effective team that plans, executes and reflects together. Because we’re a small team, everything you create will go into production and be used by students. You will never do unimportant work: every contribution will make a clear and tangible impact on the company’s trajectory. Your personal success will be directly aligned with that of the company.

Most importantly, your work will have purpose: Front Row is a mission-driven company that takes pride in making a significant impact in the lives of hundreds of thousands of students.

== Benefits ==

  • Competitive salary

  • Generous equity option grants

  • Medical, Dental, and Vision

  • Food budget

  • Work from home up to 1 day a week, extremely flexible work schedule

  • Team meal outing every week, and half-day event every month (trip to Sonoma, BBQ, etc)

  • Equipment budget

  • Flexible, untracked vacation day policy

  • Working from Runway - one of the best startup coworking locations right in the heart of San Francisco

Get information on how to apply for this position.

October 01, 2014 07:17 PM

Senior Software Engineer at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District.

We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you have do some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

October 01, 2014 04:07 PM

September 30, 2014

The GHC Team

GHC Weekly News - 2014/09/30

Hi *,

Here's some news for y'all! Apologizes about the unavailability last week; the internet wasn't exactly a very fun place for a system administrator...

So without delay, here's the current recap of the past two weeks:

  • Lots of merged code and code reviews have gone in, and a lot of commits: in the past two weeks since the last update, ghc.git has seen just over 100 commits, from about a dozen different developers.
  • As part of those patches, a significant amount of them have gone towards implementing the "Burning Bridges Proposal" or BBP for the base library. This is a set of changes to base that have generalized many parts of the API, by putting Traversable and Foldable in the Prelude. This required a bit of shoveling by Herbert, but now this is all in GHC HEAD, and will be part of 7.10:
    • Prelude combinators, like mapM, have been generalized to the Traversable and Foldable classes.
    • Several other modules, like Control.Monad and Data.List, have been generalized to Traversable and Foldable where applicable.
    • Control.Monad combinators generalized to Applicative where possible.
    • Similarly, MonadPlus combinators like guard are generalized to Alternative.
    • Foldable has been extended with new methods, like length and null.
  • But also, GHC's compiler is now tab-free! That's right, after what seemed like a million years, a very large portion of the code has been detabbed, and -fwarn-tabs is now on by default in the GHC build.
  • There are an assortment of other changes: GHC's linker is not as loud[1], and various documentation improvements.
  • The windows build is broken *again* unfortunately, this time due to what seems to be a Cabal update, I think. Austin is once again on the case.
  • The HCAR draft for October has seen some nice improvements. If you're a developer, please amend things. If you're a user, read with eager anticipation of all the new features![2]
  • It turns out the new Applicative/Monad changes have unfortunately broken the haskell98 and haskell2010 packages, with an unclear migration strategy for the future: see #9590. For GHC 7.10, it seems the haskell2010 packages will need to change to accomodate these standard deviations. If any users of the haskell98 or haskell2010 packages would speak up to help, that would be fantastic. The discussion will surely continue for a little bit - 7.10 is still a ways off.

In miscellaneous news:

  • may have been temporarily unavailable during this weekend due to an emergency downtime with our provider for a security update, but the window was quite small.
  • Relatedly (but not the exact same scenario), the internet also caught fire in several other places, requiring quite a lot of careful watching and remediation after the Bash "ShellShock" bug hit last week.

And I think that sums it up quite nicely, folks!

Closed tickets for the past two weeks include (there are a lot of them!): #9650, #7068, #5190, #5880, #8374, #9603, #3303, #3470, #3509, #781, #8115, #9641, #9191, #9515, #9326, #7987, #9634, #9576, #9612, #9635, #8593, #7544, #8529, #9338, #5794, #9535, #3646, #617, #8026, #8480, #8881, #9366, #8251, #7673, #8983, #8369, #8897, #8070, #9550, #9057, #9629, #8836, #8960, #8941, #9565, #9589, #5248, #8918, #9568, #9620, #1042, #9557, #7863, #5647, #9610, #5395, #9580, #9529, #4426, #9607, #9592, #8097, #9559, #9560, #4218, #9602, #9528, #9530, #9423, #9400, #1407, #9598, #9597.

I'd like to mention that for the above tickets, a *huge* amount of them were closed by one of our newest contributors, Thomas Miedema, who went through the bug tracker and confirmed or closed a large majority of them. I lost track of how many. Thanks Thomas!


by thoughtpolice at September 30, 2014 08:07 PM