# Postdoc Position in Functional and Constraint Programming

## Postdoctoral position in Functional and Constraint Programming at KU Leuven

The Declarative Languages and Artificial Intelligence (DTAI) group of KU Leuven (Belgium) invites applicants for a postdoctoral position in the area of functional and constraint programming. The position revolves around domain-specific languages (DSLs) embedded in Haskell for constraint programming. It is part of the EU project GRACeFUL whose overarching theme is tools for collective decision making. The KU Leuven part of the project is under the direction of prof. Tom Schrijvers.

To apply you must hold a recent PhD (or be about to graduate) related to either functional or constraint programming. Experience in both areas is an advantage.

You will work closely with prof. Schrijvers and his PhD students at KU Leuven, as well as with the GRACeFUL project partners across Europe.

The position is for 3 years. The salary is competitive and the starting date negotiable (but no later than February 1). Moreover, KU Leuven's policy of equal opportunities and diversity applies to this position.

Application procedure: http://people.cs.kuleuven.be/~tom.schrijvers/postdocposition2.html

# Upcoming talk

For all you local folks, I'll be giving a talk about my dissertation on November 5th at 4:00–5:00 in Ballantine Hall 011. For those who've heard me give talks about it before, not much has changed since NLCS 2013. But the majority of current CL/NLP, PL, and logic folks haven't seen the talk, so do feel free to stop by.

Abstract: Many natural languages allow scrambling of constituents, or so-called "free word order". However, most syntactic formalisms are designed for English first and foremost. They assume that word order is rigidly fixed, and consequently these formalisms cannot handle languages like Latin, German, Russian, or Japanese. In this talk I introduce a new calculus —the chiastic lambda-calculus— which allows us to capture both the freedoms and the restrictions of constituent scrambling in Japanese. In addition to capturing these syntactic facts about free word order, the chiastic lambda-calculus also captures semantic issues that arise in Japanese verbal morphology. Moreover, chiastic lambda-calculus can be used to capture numerous non-linguistic phenomena, such as: justifying notational shorthands in category theory, providing a strong type theory for programming languages with keyword-arguments, and exploring metatheoretical issues around the duality between procedures and values.

# GHC Weekly News - 2014/10/20

Hi *,

It's been a few weeks since the last message - and I apologize! We actually are changing the posting time to be Friday now, so hopefully this situation will be corrected preeeeetty quickly from this point forward, and hopefully will give better context over the course of a weekly discussion.

That said, let's begin!

• We've seen plenty of changes to GHC itself in the past few weeks. Some of the highlights include:
• Some changes to help make Prelude combinators fuse better. David Feuer has been leading a lot of this work, and it's been quite fruitful, with several new things now fusing (like takeWhile, scanl, scanr, and mapAccumL.
• Relatedly, Data.List.Inits should be far faster thanks to David Feuer (ref: Phab:D329).
• The testsuite driver now has preliminary support for Python 3 - which should be useful for platforms out there that sport it, and ones that will use it as the default eventually (such as Fedora 22, possibly).
• Some of the initial work by Edward Yang to remove HEAP_ALLOCED from the GHC runtime system has landed. Briefly, HEAP_ALLOCED is a check the RTS uses to determine if some address is part of the dynamic heap - but the check is a bit costly. Edward's full patchset hopes to remove this with an 8% speedup or so on average.
• GHC now has a new macro, __GLASGOW_HASKELL_PATCHLEVEL__, which will allow you to determine the point-level release of the GHC you're using. This has been a requested feature in the past we were a little hesitant of adding, but Herbert went through and did it for us. (Ref: Phab:D66)
• Template Haskell now supports LINE pragmas, thanks to Eric Mertens (ref: Phab:D299).
• Sergei Trofimovich revived libbfd debugging support for the runtime system linker, which should be of use to several daring souls out there (ref: Phab:D193).
• Several patches from Gintautas Miliauskas has improved the usability of msys and the testsuite on Windows - and he's not done yet!
• A few improvements to the x86 code generator were committed by Reid Barton and Herbert Valerio Riedel, improving size/space for certain cases (ref: Phab:D320, Phab:D163).

and more besides that, including some linker improvements, and general cleanups as usual.

• The mailing list has been busy (as usual), with some discussions including:
• Austin posted some discussion about the tentative 7.10.1 plans - we're still hoping these are accurate, so take note! We hope to freeze mid-November, and release Feburary 2015! [1]
• Austin also called for some feedback: GHC HQ has become convinced a 7.8.4 release is needed to fix some showstoppers - so please let him know soon if you're totally incapable of using 7.8 for something! [2]
• Alan Zimmerman has asked for some feedback on his proposed "AST Annotations", which will hopefully allow GHC API clients to add richer annotations to GHC's syntactic representations. The motivation is for refactoring tools like HaRe - and I'm sure all input would be appreciated. [3]
• Chris done sparked off a discussion about making GHCi awesomer, and I'm sure everyone can appreciate that! In particular, Chris wanted to discuss programmatic means of controlling GHCi itself, and naturally we need to ask - is the current API not enough, and why? [4]
• Yuras Shumovich has implemented a proposal for allowing the Haskell FFI to support C structures natively as return values - this involves interfacing with C ABI rules to properly support structure layout. While Yuras has an initial implementation in Phab:D252, some questions about the feature - including its implementation complexity - remain before it gets merged. [5]
• Richard Eisenberg made a modest request: can Phabricator patches have a 'roadmap', so people can instruct reviewers how to read a diff? The answer: yes, and it should be relatively easy to implement, and Austin can do so Real Soon Now™. [6]
• Ben Gamari started a big discussion about one-shot event semantics in the I/O manager, with a lot of replies concerning not only the bug, but machines to test the actual change on. With any luck, Ben's fix for the I/O manager and a test machine should come quickly enough. [7]
• Herbert Valerio Riedel opened an RFC: Should we look into using AsciiDoc for GHC? Historically, GHC's documentation has been written using DocBook, a verbose but very precise and unambiguous documentation format. However, AsciiDoc offers a much nicer markup language, while retaining DocBook support. In short, it looks like GHC may get a much more clean user manual soon. [8]
• Yuras opened another discussion: Should we namespace proposals we create on our wiki? What seems uncontroversial can lead to surprising discussion, and the results were mixed this time it seems. [9]
• Geoff Mainland stepped up and fixed Data Parallel Haskell to work with a new version of vector and GHC. Austin had disabled DPH a few weeks prior due to its difficulty to upgrade, and divergent source trees. With 7.10, GHC will hopefully ship a more modern vector and dph to boot.
• Austin asks: can we warn on tabs by default for GHC 7.10? It's an easy change and a minor one - but we should at least ask first. Vote now! [10]
• Philip Hölzenspies opens up a discussion about Uniques in GHC, and their impact on the compilers current design. Philip has a hopeful design to redo Unique values in GHC, and a patch to support it: Phab:D323. [11]
• Richard Eisenberg asks: can we somehow integrate GitHub into our development process? While GitHub doesn't have as many nice tools as something like Phabricator, it has a very high inertia factor, and Richard is interested in making the 'first step' as easy as possible for newbies. Discussions about Phab<->GitHub integrations were afoot, as well as general discussion about contributor needs. There were a lot of points brought up, but the conversation has slightly dried up now - but will surely be revived again. [12]

And now, look at all these tickets we closed! Including: #9658, #9094, #9356, #9604, #9680, #9689, #9670, #9345, #9695, #9639, #9296, #9377, #9184, #9684.

# Dijkstra quotes from EWD1284

I recently read through this long post entitles Object Oriented Programming is an expensive disaster which must end. I have to agree I largely agree with what he writes, but I don’t think I ever could have written such a well-researched article, and absolutely never one of equal length ;)

It does include some nice quotes and references and so far I’ve only read one of the many that I bookmarked, Computing Science: Achievements and Challenges (EWD1284). It does include a few quips that, based on other articles I’ve read, seem fairly typical to Dijkstra. I simply love the way he expressed his opinions at times.

This one really ought to have been in the lengthy post on the OOP disaster:

After more than 45 years in the field, I am still convinced that in computing, elegance is not a dispensable luxury but a quality that decides between success and failure; in this connection I gratefully quote from The Concise Oxford Dictionary a definition of “elegant”, viz. “ingeniously simple and effective”. Amen. (For those who have wondered: I don’t think object-oriented programming is a structuring paradigm that meets my standards of elegance.)

And his thoughts on formal methods are well-known of course, as are his thoughts on iterative design. However, I rather like how he expresses a certain level of disgust of the Anglo-Saxon world when writing about those topics:

The idea of a formal design discipline is often rejected on account of vague cultural/philosophical condemnations such as “stifling creativity”; this is more pronounced in the Anglo-Saxon world where a romantic vision of “the humanities” in fact idealizes technical incompetence. Another aspect of that same trait is the cult of iterative design.

It amazes me every time I read something by someone like Dijkstra, just how much things stay the same, even in a field like computing science, which is said to be moving so fast.

# New Stackage features

We have two new updates to Stackage: providing cabal.config files and including Haddock documentation.

Now all new exclusive snapshots will have haddock links, which you can access via the following steps:

That link will be to an index page like this from which you can view documentation of all packages included in the snapshot. This means you can generally view everything in one place, on one high availability service.

## Using Stackage without changing your repo

The recommended way to use Stackage is to simply change your remote-repo field in your .cabal/config file and run cabal update. Henceforth your dependencies will be resolved from Stackage, which is backed by high availability Amazon S3 storage servers, and you will have successful build plans, compilation and passing tests. Hurrah!

Try Haskell and the upcoming Haskell.org homepage were both developed with Stackage. This meant I could just specify the Stackage snapshot to use in the README and then the next time I want to upgrade I can just update the snapshot version to get a fresh working build plan.

The issue some people are facing is that they cannot change this remote-repo field, either because they're using a cabal sandbox, which doesn't support this yet, or because they just don't want to.

The solution to this, in my experience, has been for me to manually go and run cabal freeze in a project I've already built to get the cabal.config file and then give these people that file.

Now, it's automated via a cabal.config link on snapshot pages:

For new developers working on an application who want to use Stackage, they can do something like this:

$cabal sandbox init$ curl http://www.stackage.org/<stackage version>/cabal.config > cabal.config
$cabal install --only-dep Which will install their dependencies from Hackage. We can't guarantee this will always work -- as Stackage snapshots sometimes will have a manual patch in the package to make it properly build with other packages, but otherwise it's as good as using Stackage as a repo. A cabal freeze output in cabal.config will contain base and similar packages which are tied to the minor GHC version (e.g. GHC 7.8.2 vs GHC 7.8.3 have different base numbers), so if you get a cabal.config and you get a dependencies error about base, you probably need to open up the cabal.config and remove the line with the base constraint. Stackage snapshots as used as repos do not have this constraint (it will use whichever base your GHC major version uses). Another difference is that cabal.config is more like an “inclusive” Stackage snapshot -- it includes packages not known to build together, unlike “exclusive” snapshots which only contain packages known to build and pass tests together. Ideally every package you need to use (directly or indirectly) will come from an exclusive snapshot. If not, it's recommended that you submit the package name to Stackage, and otherwise inclusive snapshots or cabal.config are the fallbacks you have at your disposal. ## October 19, 2014 ### Neil Mitchell # HLint now spots bad unsafePerformIO Summary: I've just released a new version of HLint that can spot an unsafePerformIO which should have NOINLINE but doesn't. I've just released HLint v1.9.10, a tool to suggest improvements to Haskell code. I don't usually bother with release announcements of HLint, as each version just fixes a few things and adds a few hints, it's all in the changelog (plus there have now been 102 releases). The latest release attempts to make using unsafePerformIO a little bit safer. A common idiom for top-level mutable state in Haskell is: myGlobalVar :: IORef IntmyGlobalVar = unsafePerformIO (newIORef 17) That is, define a top-level CAF (function with no variables) and bind it to unsafePerformIO of some created mutable state. But the definition above is unsafe. GHC might decide myGlobalVar is cheap and inline it into several uses, duplicating the variable so that some functions update different versions. Running this code through the latest version of HLint we see: Sample.hs:2:1: Error: Missing NOINLINE pragmaFound: myGlobalVar = unsafePerformIO (newIORef 17)Why not: {-# NOINLINE myGlobalVar #-} myGlobalVar = unsafePerformIO (newIORef 17) HLint has spotted the problem, and suggested the correct fix. Trying it for real Let's take the package slave-thread-0.1.0 and run HLint on it. Slave thread is a new package that helps you ensure you don't end up with ghost threads or exceptions being thrown but missed - a useful package. Running HLint we see: Sample.hs:19:1: Error: Missing NOINLINE pragmaFound: slaves = unsafePerformIO$ Multimap.newIOWhy not:  {-# NOINLINE slaves #-}  slaves = unsafePerformIO $Multimap.newIOSample.hs:20:3: Warning: Redundant$Found:  unsafePerformIO $Multimap.newIOWhy not: unsafePerformIO Multimap.newIOSample.hs:25:1: Error: Eta reduceFound: fork main = forkFinally (return ()) mainWhy not: fork = forkFinally (return ())Sample.hs:55:28: Warning: Redundant$Found:  PartialHandler.totalizeRethrowingTo_ masterThread $memptyWhy not: PartialHandler.totalizeRethrowingTo_ masterThread memptySample.hs:72:5: Error: Use unlessFound: if null then return () else retryWhy not: Control.Monad.unless null retry HLint has managed to spot the missing NOINLINE pragma, and also a handful of tiny cleanups that might make the code a little more readable. Fortunately (and independent of HLint), the NOINLINE pragma was added in slave-thread-0.1.1, so the library no longer suffers from that bug. ### Matthew Sackman # Anonymity in public life In an article published in the Guardian yesterday, author Kathleen Hale recounts how her first book got some negative reviews by reviewers on a book review website. One reviewer in particular upset her and Kathleen ends up figuring out the reviewer is using a false identity, finds out who the reviewer really is and confronts her. The piece doesn't read to me like some sort of valedictory "I outed a fraud" type piece (though there are some passages in there which are questionable in that direction) and equally there are several passages where Kathleen expresses deep embarrassment and regret for the course of action she took. This episode, and that article in particular has caused substantial reaction: currently 600 comments on the Guardian article plus several other blog posts. There's no shortage of opinion to be found on Twitter either, as you'd expect. The course of action that Kathleen took seems to be fairly undisputed as far as I can find. There is some dispute from some of the other blog posts as to exactly what was tweeted and said by whom, and there is dispute over Kathleen's claim that there are factual inaccuracies made in a review of her book. It is not disputed that the reviewer was using a false identity and that the reviewer had at least public Twitter, Facebook, and Instagram accounts under the false identity. The false identity was also a real name (Blythe Harris), by which I mean a name which if you introduced yourself by that name, no one would think you're using a false identity. This is distinct from claiming to be Peter Rabbit, or Buzz Lightyear. Many people have equated Kathleen's actions with stalking. My dictionary defines the verb to stalk as: 1. to follow or approach (game, prey, etc.) stealthily and quietly 2. to pursue persistently and, sometimes, attack (a person with whom one is obsessed, often a celebrity) 3. , 4,... [not relevant] The second item there certainly fits. The British legal approach, whilst it gives no strict definition gives examples and guidance: ....following a person, watching or spying on them or forcing contact with the victim through any means, including social media. The effect of such behaviour is to curtail a victim's freedom, leaving them feeling that they constantly have to be careful. In many cases, the conduct might appear innocent (if it were to be taken in isolation), but when carried out repeatedly so as to amount to a course of conduct, it may then cause significant alarm, harassment or distress to the victim. I'm glad it includes "social media" there. Some comments have suggested that stalking "in real life" is worse than online. This seems bizarre to me: as if through a computer you are not interacting with other human beings but merely with shiny pixels who have no emotional capacity. "In real life" is everything we know. Whilst we're alive we have no personal experience of anything other than "in real life". So I'm fairly sold on the whole argument that Kathleen's behaviour towards this reviewer can be considered stalking and as such is reprehensible. To me, the far more interesting issue is the use of anonymity, false identities and any realistic expectation we have of privacy on the internet. A number of people who claim to write book reviews on such sites have suggested that the behaviour of Kathleen is exactly why they write their reviews under false names. I think there's something of a contradiction going on here. But let's work backwards. Firstly, Kathleen, through some social engineering (she requested from the book review site the address of the reviewer so that she could post her a copy of the book) got the address of the book reviewer. She then used a telephone directory and census results to identify who really lived there (or likely owned the land). Now the use of the telephone directory seems a bit odd to me: telephony directories map names to numbers (and maybe addresses). Yes, you could use it to map an address to a name but it's very inefficient: you're essentially searching through the whole directory looking for the address whilst the directory is sorted by name, not address. So unless it was a very small telephone directory, I don't really buy that. Using census results is far more creditable: they're public documents and when they're online, they do allow you to search by address. In the UK you can only get access to the raw census details 100 years after the census has been published which, to a high probability, rules it out as a means to tie an address to a person who's still alive. You can get statistics and aggregates from more recent census results but you can't get the raw data. I'm assuming that in the US there's no such restriction on access to raw census data. If there is then I don't understand how Kathleen really managed to get a name for the owner of the property. Instead, in the UK, if you want to find out who owns some land, you can pay the land registry £3 and they'll tell you. Presumably there are means by which you can legally hide this; I'm sure the rich have figured this out - probably some method by which some fake company in a tax haven technically "owns" the land and as they're registered abroad, they don't have to divulge any further details about that company. So yes, you could argue the Land Registry is profiting from facilitating stalkers, but equally there are a bunch of legitimate reasons to need access to such data and I can't think of any sane way to ensure the use of such a service isn't abused. So from that I conclude that unless the owner is a millionaire, the owner of any land is public knowledge. The use of social engineering to get the address in the first place is more interesting but very obvious. This sort of thing happens a lot and sometimes to horrifying consequences (e.g. the Australian DJs who phoned up a hospital, pretending to be the Queen and Prince of Wales, enquiring as to the health of the Duchess of Cambridge. The nurse fell for the hoax and put the call through. Three days later, the nurse committed suicide). As a species we are not good at taking the time to verify who we're talking to or why. Whilst (hopefully) most of us would hang up if our bank apparently rang us and then asked for our credit card details "for security" this is largely only because it's in the bank's interest (in terms of cost of insurance) to reduce fraud, so they've trained us as such. But in all sorts of other scenarios we implicitly trust people we've no real reason to. A simple example: ticket inspectors on public transport. They may be wearing the uniform, but it could be faked. With their travel-card readers they could be seeing who has the expensive yearly travel cards, scanning the unique numbers from them and then using them to program up fraudulent cards. The crypto on those things is notoriously weak. Has anyone ever requested some means to verify the identity of a ticket inspector? And even if you could, how do you know they're not crooked regardless? So phoning someone up, impersonating someone else, or pretending to have valid reasons to request the information you're requesting is always likely to work. It might be illegal in some cases, but it's certainly human nature to try to be helpful and if you're given a plausible justification, on what basis could you refuse the request unless it's contrary to some sort of company policy? In this case, if you're concerned about anonymity, wouldn't you be concerned about this possibility, and make use of an anonymous mail box? Article 8 of the European Convention on Human Rights guarantees an individual's right to respect for privacy and family life, including correspondence. Is privacy the same as anonymity? No, definitely not: In conflating anonymity and privacy, we have failed to see an important factual difference between them: under the condition of privacy, we have knowledge of a person’s identity, but not of an associated personal fact; whereas under the condition of anonymity, we have knowledge of a personal fact, but not of the associated person’s identity The vast violations of our lives by state surveillance as revealed by Snowdon over the last year demonstrates the whole-scale collation of everything we do online and off by our governments. This is both being able to observe an action and identify the individual who caused it (thus we have no hope of performing any action anonymously), and being able to observe an individual and know the actions they take (thus no privacy). I can't work out whether the ECHR has anything to say on a right to anonymity; I get the sense that it doesn't try to protect that. So that's basically saying: "the state shouldn't record your every move (as that's an invasion of privacy), but moves that we're interested in, we can know who did them". Of course, we now know they're recording everything anyway. We also know that computer systems can always be hacked into - there is no real security anywhere. Given a skilled and sufficiently funded adversary, any computer system connected in any way to the internet can be hacked into. Why? Because humans wrote the software that runs on those computers and humans are incapable of writing bug-free software. Look at all the large scale data breaches in recent history. Nothing is secure. So we have laws that seem to try and protect privacy, but they're violated by our own governments, and in any case, we have countless examples of our inability to store any information securely. So is there really any hope to be able to exist with anonymity on the internet? As ever, it depends who your adversary is. If your adversary is a government (either your own or some foreign government) then no, you have no hope. If it's a previous partner of yours who has no particular computer training, then yes, you're probably going to have a reasonable chance of being anonymous for a while. But you need to read up on this and think hard: it's not a trivial undertaking. There are some good guides as to how to do this, but: All writers - whether writing under their own names or not - should be aware of the risks they may incur by hitting 'publish'. What is the effect of hitting "publish"? It's to put more data points out there which may lead people to be able to identify you. The fewer data points out there, the better. So coming back to our book reviewer, if you want to review books anonymously, and if your justification for acting anonymously is to avoid being stalked by authors who don't like your reviews, then why put so many data points out there? Why have the Facebook page, the Instagram profile with the faked photos, the Twitter account? Why give your real postal address to the book review club knowing they're going to post books to it and might conceivably give your address out to other people? The social media accounts in particular I find most odd. If you want to review books then review books. Build your following, your reputation and cachet on the quality of your reviews. If I'm looking at a book review I really don't care where you went on holiday, what your tweets are, or how many pets you have. Putting that information out there undermines your entire justification for being anonymous: if you want to be anonymous (i.e. you don't want people to find out who you are) then why are you putting so much unnecessary information out there that may allow people to figure out who you are? Equally, use a name that clearly communicates to me you're trying to be anonymous: call yourself TheBookReviewer53, DostoyevskyLover or OrwellWasRight. Doing so doesn't lessen the validity of your opinions on your chosen subject and is more honest with people reading your reviews: it's overtly saying "I have reasons to want to exist anonymously on the internet". It reveals nothing more about your real identity either: regardless of the obvious fictitious-ness of your online persona, if you can be found, you can be found. Researchers show that four data points about a person’s location can identify that person with 95% accuracy. FOUR. You think you can tweet anonymously from your phone? You think apps like Whisper allow you to act anonymously? As with pretty much everything related to the internet and computing, unless you've spent the last 20 years of your life working with computers, studying computers and thinking very hard about threat models and what data you're putting out there, and are utterly paranoid, you basically haven't got a chance. Do you turn off wifi on your phone when you leave the house? You should. You trust that USB pen drive you're transferring documents on? You shouldn't. Finally and most obviously, any attempt at anonymity clearly doesn't insulate you from the law. As members of various hacking groups such as lulzsec found out, you always can be found out by law enforcement agencies. Yes, you might be able to make it difficult for a poorly funded person to come after you for libel (which is really just an indictment of the disgusting relationship between justice and money) but it's quite a risk to take. If you wouldn't put it in print with your real name attached, you're placing an awful lot of trust on your ability to maintain your anonymity against an adversary you probably don't know as well as you need to. ### Well-Typed.Com # Quasi-quoting DSLs for free Suppose you are writing a compiler for some programming language or DSL. If you are doing source to source transformations in your compiler, perhaps as part of an optimization pass, you will need to construct and deconstruct bits of abstract syntax. It would be very convenient if we could write that abstract syntax using the syntax of your language. In this blog post we show how you can reuse your existing compiler infrastructure to make this possible by writing a quasi-quoter with support for metavariables. As we will see, a key insight is that we can reuse object variables as meta variables. ## Toy Language “Imp” For the sake of this blog post we will be working with a toy language called Imp. The abstract syntax for Imp is defined by type VarName = String data Expr = Var VarName | Add Expr Expr | Sub Expr Expr | Int Integer | Read deriving (Data, Typeable, Show, Eq) data Cmd = Write Expr | Assign VarName Expr | Decl VarName deriving (Data, Typeable, Show) data Prog = Prog [Cmd] deriving (Data, Typeable, Show) and we will assume that we have some parsec parsers parseExpr :: Parser Expr parseProg :: Parser Prog We will also make use of topLevel :: Parser a -> Parser a topLevel p = whiteSpace *> p <* eof and the following useful combinator for running a parser: parseIO :: Parser a -> String -> IO a The details of these parsers are beyond the scope of this post. There there are plenty of parsec tutorials online; for instance, you could start with the parsec chapter in Real World Haskell. Moreover, the full code for this blog post, including a simple interpreter for the language, is available on github if you want to play with it. Here is a simple example of an Imp program: var x ; x := read ; write (x + x + 1) ## A simple quasi-quoter We want to be able to write something like prog1 :: Prog prog1 = [prog| var x ; x := read ; write (x + x + 1) |] where the intention is that the [prog| ... |] quasi-quote will expand to something like prog1 = Prog [ Decl "x" , Assign "x" Read , Write (Add (Add (Var "x") (Var "x")) (Int 1)) ] To achieve this, we have to write a quasi-quoter. A quasi-quoter is an instance of the following data type: data QuasiQuoter = QuasiQuoter { quoteExp :: String -> Q Exp , quotePat :: String -> Q Pat , quoteType :: String -> Q Type , quoteDec :: String -> Q [Dec] } The different fields are used when using the quasi-quoter in different places in your Haskell program: at a position where we expect a (Haskell) expression, a pattern (we will see an example of that later), a type or a declaration; we will not consider the latter two at all in this blog post. In order to make the above example (prog1) work, we need to implement quoteExp but we can leave the other fields undefined: prog :: QuasiQuoter prog = QuasiQuoter { quoteExp = \str -> do l <- location' c <- runIO$ parseIO (setPosition l *> topLevel parseProg) str
dataToExpQ (const Nothing) c
, quotePat  = undefined
, quoteType = undefined
, quoteDec  = undefined
}

Let’s see what’s going on here. The quasi-quoter gets as argument the string in the quasi-quote brackets, and must return a Haskell expression in the Template-Haskell Q monad. This monad supports, amongst other things, getting the current location in the Haskell file. It also supports IO.

### Location

The first thing that we do is find the current location in the Haskell source file and convert it to parsec format:

location' :: Q SourcePos
location' = aux <$> location where aux :: Loc -> SourcePos aux loc = uncurry (newPos (loc_filename loc)) (loc_start loc) ### Running the parser Once we have the location we then parse the input string to a term in our abstract syntax (something of type Prog). We use parsec’s setPosition to tell parsec where we are in the Haskell source file, so that if we make a mistake such as prog1 :: Prog prog1 = [prog| var x ; x := read ; write (x + x + ) |] we get an error that points to the correct location in our Haskell file: TestQQAST.hs:6:9: Exception when trying to run compile-time code: user error ("TestQQAST.hs" (line 9, column 20): unexpected ")" expecting "(", "read", identifier or integer) ### Converting to Haskell abstract syntax The parser returns something of type Prog, but we want something of type Exp; Exp is defined in Template Haskell and reifies the abstract syntax of Haskell. For example, we would have to translate the Imp abstract syntax term Var "x" :: Prog to its reflection as a piece of abstract Haskell syntax as AppE (ConE 'Var) (LitE (StringL "x")) :: TH.Exp which, when spliced into the Haskell source, yields the original Prog value. Fortunately, we don’t have to write this translation by hand, but we can make use of the following Template Haskell function: dataToExpQ :: Data a => (forall b. Data b => b -> Maybe (Q Exp)) -> a -> Q Exp This function can translate any term to a reified Haskell expression, as long as the type of the term derives Data (Data instances can be auto-derived by ghc if you enable the DeriveDataTypeable language extension). The first argument allows you to override the behaviour of the function for specific cases; we will see an example use case in the next section. In our quasi-quoter so far we don’t want to override anything, so we pass a function that always returns Nothing. Once we have defined this quasi-quoter we can write prog1 :: Prog prog1 = [prog| var x ; x := read ; write (x + x + 1) |] and ghc will run our quasi-quoter and splice in the Haskell expresion corresponding to the abstract syntax tree of this program (provided that we enable the QuasiQuotes language extension). ## Meta-variables Consider this function: prog2 :: VarName -> Integer -> Prog prog2 y n = [prog| var x ; x := read ; write (x + y + n) |] As mentioned, in the source code for this blog post we also have an interpreter for the language. What happens if we try to run (prog2 "x" 1)? *Main> intIO$ intProg (prog2 "x" 2)
5
*** Exception: user error (Unbound variable "y")

Indeed, when we look at the syntax tree that got spliced in for prog2 we see

Prog [ Decl "x"
]

What happened? Didn’t we pass in "x" as the argument y? Actually, on second thought, this makes perfect sense: this is after all what our string parses to. The fact that y and n also happen to Haskell variables, and happen to be in scope at the point of the quasi-quote, is really irrelevant. But we would still like prog2 to do what we expected it to do.

To do that, we have to support meta variables: variables from the “meta” language (Haskell) instead of the object language (Imp). Template Haskell supports this out of the box. For example, we can define

ex :: Lift a => a -> Q Exp
ex x = [| id x |]

Given any argument that supports Lift, ex constructs a piece of abstract Haskell syntax which corresponds to the application of the identity function to x. (Don’t confuse this with anti-quotation; see Brief Intro to Quasi-Quotation.) Lift is a type class with a single method

class Lift t where
lift :: t -> Q Exp

For example, here is the instance for Integer:

instance Lift Integer where
lift x = return (LitE (IntegerL x))

### Meta-variables in quasi-quotes

Quasi-quotes don’t have automatic support for meta-variables. This makes sense: Template Haskell is for quoting Haskell so it has a specific concrete syntax to work with, where as quasi-quotes are for arbitrary custom syntaxes and so we have to decide what the syntax and behaviour of meta-variables is going to be.

For Imp we want to translate any unbound Imp (object-level) variable in the quasi-quote to a reference to a Haskell (meta-level) variable. To do that, we will introduce a similar type class to Lift:

class ToExpr a where
toExpr :: a -> Expr

and provide instances for variables and integers:

instance ToExpr VarName where
toExpr = Var

instance ToExpr Integer where
toExpr = Int

We will also need to know which variables in an Imp program are bound and unbound; in the source code you will find a function which returns the set of free variables in an Imp program:

fvProg :: Prog -> Set VarName

### Overriding the behaviour of dataToExpQ

In the previous section we mentioned that rather than doing the Prog -> Q Exp transformation by hand we use the generic function dataToExpQ to do it for us. However, now we want to override the behaviour of this function for the specific case of unbound Imp variables, which we want to translate to Haskell variables.

Recall that dataToExpQ has type

dataToExpQ :: Data a
=> (forall b. Data b => b -> Maybe (Q Exp))
-> a -> Q Exp

This is a rank-2 type: the first argument to dataToExpQ must itself be polymorphic in b: it must work on any type b that derives Data. So far we have been passing in

const Nothing

which is obviously polymorphic in b since it completely ignores its argument. But how do we do something more interesting? Data and its associated combinators come from a generic programming library called Scrap Your Boilerplate (Data.Generics). A full discussion of of SYB is beyond the scope of this blog post; the SYB papers are a good starting point if you would like to know more (I would recommend reading them in chronological order, the first published paper first). For the sake of what we are trying to do it suffices to know about the existence of the following combinator:

extQ :: (Typeable a, Typeable b) => (a -> q) -> (b -> q) -> a -> q

Given a polymorphic query (forall a)—in our case this is const NothingextQ allows to extend the query with a type specific case (for a specific type b). We will use this to give a specific case for Expr: when we see a free variable in an expression we translate it to an application of toExpr to a Haskell variable with the same name:

metaExp :: Set VarName -> Expr -> Maybe ExpQ
metaExp fvs (Var x) | x Set.member fvs =
Just [| toExpr $(varE (mkName x)) |] metaExp _ _ = Nothing ### The improved quasi-quoter With this in hand we can define our improved quasi-quoter: prog :: QuasiQuoter prog = QuasiQuoter { quoteExp = \str -> do l <- location' c <- runIO$ parseIO (setPosition l *> topLevel parseProg) str
dataToExpQ (const Nothing extQ metaExp (fvProg c)) c
, quotePat  = undefined
, quoteType = undefined
, quoteDec  = undefined
}

Note that we are extending the query for Expr, not for Prog; dataToExpQ (or, more accurately, SYB) makes sure that this extension is applied at all the right places. Running (prog2 "x" 2) now has the expected behaviour:

*Main> intIO $intProg (prog2 "x" 2) 6 14 Indeed, when we have a variable in our code that is unbound both in Imp and in Haskell, we now get a Haskell type error: prog2 :: VarName -> Integer -> Prog prog2 y n = [prog| var x ; x := read ; write (x + z + n) |] gives TestQQAST.hs:15:19: Not in scope: ‘z’ Parenthetical remark: it is a design decision whether or not we want to allow local binding sites in a splice to “capture” meta-variables. Put another way, when we pass in "x" to prog2, do we mean the x that is bound locally in prog2, or do we mean a different x? Certainly a case can be made that we should not be able to refer to the locally bound x at all—after all, it’s not bound outside of the snippet! This is an orthogonal concern however and we will not discuss it any further in this blog post. ## Quasi-quoting patterns We can also use quasi-quoting to support patterns. This enables us to write something like optimize :: Expr -> Expr optimize [expr| a + n - m |] | n == m = optimize a optimize other = other As before, the occurrence of a in this pattern is free, and we intend it to correspond to a Haskell variable, not an Imp variable; the above code should correspond to optimize (Sub (Add a n) m) | n == m = optimize a (note that this is comparing Exprs for equality, hence the need for Expr to derive Eq). We did not mean the pattern optimize (Sub (Add (Var "a") (Var "n")) (Var "m")) To achieve this, we can define a quasi-quoter for Expr that supports patterns (as well as expressions): expr :: QuasiQuoter expr = QuasiQuoter { quoteExp = \str -> do l <- location' e <- runIO$ parseIO (setPosition l *> topLevel parseExpr) str
dataToExpQ (const Nothing extQ metaExp (fvExpr e)) e
, quotePat  = \str -> do
l <- location'
e <- runIO $parseIO (setPosition l *> topLevel parseExpr) str dataToPatQ (const Nothing extQ metaPat (fvExpr e)) e , quoteType = undefined , quoteDec = undefined } The implementation of quotePat is very similar to the definition of quoteExp. The only difference is that we use dataToPatQ instead of dataToExpQ to generate a Haskell pattern rather than a Haskell expression, and we use metaPat to give a type specific case which translates free Imp variables to Haskell pattern variables: metaPat :: Set VarName -> Expr -> Maybe PatQ metaPat fvs (Var x) | x Set.member fvs = Just (varP (mkName x)) metaPat _ _ = Nothing Note that there is no need to lift now; the Haskell variable will be bound to whatever matches in the expression. ## Limitations We might be tempted to also add support for Prog patterns. While that is certainly possible, it’s of limited use if we follow the same strategy that we followed for expressions. For instance, we would not be able to write something like opt [prog| var x ; c |] | x Set.notMember fvProg c = opt c The intention here is that we can remove unused variables. Unfortunately, this will not work because this will cause a parse error: the syntax for Imp does not allow for variables for commands, and hence we also don’t allow for meta-variables at this point. This is important to remember: By using object-level variables as stand-ins for meta-level variables, we only allow for meta-level variables where the syntax for the object-level language allows variables. If this is too restrictive, we need to add special support in the ADT and in the corresponding parsers for meta-variables. This is a trade-off in increased expressiveness of the quasi-quotes against additional complexity in their implementation (new parsers, new ADTs). ## Conclusions By reusing object-level variables as stand-ins for meta variables you can reuse existing parsers and ADTs to define quasi-quoters. Using the approach described in this blog we were able to add support for quasi-quoting to a real compiler for a domain specific programming language with a minimum of effort. The implementation is very similar to what we have shown above, except that we also dealt with renaming (so that meta variables cannot be captured by binding sites in the quasi quotes) and type checking (reusing the existing renamer and type checker, of course). ## October 17, 2014 ### Erik de Castro Lopo # Haskell : A neat trick for GHCi Just found a really nice little hack that makes working in the GHC interactive REPL a little easier and more convenient. First of all, I added the following line to my ~/.ghci file.  :set -DGHC_INTERACTIVE  All that line does is define a GHC_INTERACTIVE pre-processor symbol. Then in a file that I want to load into the REPL, I need to add this to the top of the file:  {-# LANGUAGE CPP #-}  and then in the file I can do things like:  #ifdef GHC_INTERACTIVE import Data.Aeson.Encode.Pretty prettyPrint :: Value -> IO () prettyPrint = LBS.putStrLn . encodePretty #endif  In this particular case, I'm working with some relatively large chunks of JSON and its useful to be able to pretty print them when I'm the REPL, but I have no need for that function when I compile that module into my project. ### Neil Mitchell # Fixing Haddock docs on Hackage Summary: A few weeks ago Hackage stopped generating docs. I have a script that generates the docs, and also fixes some Haddock bugs. Update: The Haddock documentation generators are running once again, but may be somewhat behind for a little while. A few weeks ago Hackage stopped generating documentation, so if you look at recently uploaded pages they tend to either lack docs, or have very alert maintainers who did a manual upload. I've packaged up my solution, which also fixes some pretty annoying Haddock bugs. Given that I can now get docs faster and better with my script, I'll probably keep using it even after Haddock on Hackage gets restarted. How to use it • You are likely to get better results if you always install the packages you use with documentation. • Ensure you have tar, curl, cp and cabal on your $PATH.
• cabal update && cabal install neil
• Make a release, don't touch any other code, then make sure you are in the project directory.
• neil docs --username=YourHackageUsername

And like that, your docs are online. To see an example of something that was generated with this process, look at Shake.

What I fixed

I automated the process using scripts originally taken from the lens library, supplemented with suggestions from Reddit. I then do a number of manual fixups.

• Haddock now makes cross-module links where it doesn't know what the target is default to types. Concretely, if I write 'Development.Shake.need' in Haddock it generates a link to #t:need, which doesn't exist, when it should be #v:need - I fix that.
• Update: fixed in Haddock 1.14 or above, as per this ticket.
• On Windows, if you use CPP and multiline bird-tick (>) Haddock blocks you get a blank line between each line. I fix that.
• If you follow some of the simpler scripts links outside your package won't work (or at least, didn't for me). I fix that.

The neil tool

The neil tool is my personal set of handy Haskell scripts. I make all my releases with it (neil sdist), and do lots of checks that my packages conform to my rules (neil check). I also use it for driving my Travis instances. It's in fairly regular flux. Until now, I've always kept it in Darcs/Git and never released it - it's personal stuff tuned to how I work.

You might also notice that neil provides a library. Don't use that, I intend to delete it in a few weeks. Update: library removed.

# Notes on Quotients Types

Posted on October 17, 2014
Tags: types

Lately I’ve been reading a lot of type theory literature. In effort to help my future self, I’m going to jot down a few thoughts on quotient types, the subject of some recent google-fu.

## But Why!

The problem quotient types are aimed at solving is actually a very common one. I’m sure at some point or another you’ve used a piece of data you’ve wanted to compare for equality. Additionally, that data properly needed some work to determine whether it was equal to another piece.

A simple example might would be representing rational numbers. A rational number is a fraction of two integers, so let’s just say

    type Rational = (Integer, Integer)

Now all is well, we can define a Num instance and what not. But what about equality? Clearly we want equivalent fractions to be equal. That should mean that (2, 4) = (1, 2) since they both represent the same number.

Now our implementation has a sticky point, clearly this isn’t the case on its own! What we really want to say is “(2, 4) = (1, 2) up to trivial rejiggering”.

Haskell’s own Rational type solves this by not exposing a raw tuple. It still exists under the hood, but we only expose smart constructors that will reduce our fractions as far as possible.

This is displeasing from a dependently typed setting however, we want to be able to formally prove the equality of some things. This “equality modulo normalization” leaves us with a choice. Either we can really provide a function which is essentially

    foo : (a b : Rational)
-> Either (reduce a = reduce b) (reduce a /= reduce b)

This doesn’t really help us though, there’s no way to express that a should be observationally equivalent to b. This is a problem seemingly as old as dependent types: How can we have a simple representation of equality that captures all the structure we want and none that we don’t.

Hiding away the representation of rationals certainly buys us something, we can use a smart constructor to ensure things are normalized. From there we could potentially prove a (difficult) theorem which essentially states that

    =-with-norm : (a b c d : Integer)
-> a * d = b * c -> mkRat a b = mkRat c d

This still leaves us with some woes however, now a lot of computations become difficult to talk about since we’ve lost the helpful notion that denominator o mkRat a = id and similar. The lack of transparency shifts a lot of the burden of proof onto the code privy to the internal representation of the type, the only place where we know enough to prove such things.

Really what we want to say is “Hey, just forget about a bit of the structure of this type and just consider things to be identical up to R”. Where R is some equivalence relation, eg

1. a R a
2. a R b implies b R a
3. a R b and b R c implies a R c

If you’re a mathematician, this should sound similar. It’s a lot like how we can take a set and partition it into equivalence classes. This operation is sometimes called “quotienting a set”.

For our example above, we really mean that our rational is a type quotiented by the relation (a, b) R (c, d) iff a * c = b * d.

Some other things that could potentially use quotienting

• Sets
• Maps
• Integers
• Lots of Abstract Types

Basically anything where we want to hide some of the implementation details that are irrelevant for their behavior.

## More than Handwaving

Now that I’ve spent some time essentially waving my hand about quotient types what are they? Clearly we need a rule that goes something like

 Γ ⊢ A type, E is an equivalence relation on A
———————————————–———————————————————————————————
Γ ⊢ A // E type

Along with the typing rule

    Γ ⊢ a : A
——————————————————
Γ ⊢ a : A // E

So all members of the original type belong to the quotiented type, and finally

  Γ ⊢ a : A, Γ ⊢ b : A, Γ ⊢ a E b
–——————————————–——————————————————
Γ ⊢ a ≡ b : A // E

Notice something important here, that ≡ is the fancy shmancy judgmental equality baked right into the language. This calls into question decidability. It seems that a E b could involve some non-trivial proof terms.

More than that, in a constructive, proof relevant setting things can be a bit trickier than they seem. We can’t just define a quotient to be the same type with a different equivalence relation, since that would imply some icky things.

To illustrate this problem, imagine we have a predicate P on a type A where a E b implies P a ⇔ P b. If we just redefine the equivalence relation on quotes, P would not be a wellformed predicate on A // E, since a ≡ b : A // E doesn’t mean that P a ≡ P b. This would be unfortunate.

Clearly some subtler treatment of this is needed. To that end I found this paper discussing some of the handling of NuRPL’s quotients enlightening.

## How NuPRL Does It

The paper I linked to is a discussion on how to think about quotients in terms of other type theory constructs. In order to do this we need a few things first.

The first thing to realize is that NuPRL’s type theory is different than what you are probably used to. We don’t have this single magical global equality. Instead, we define equality inductively across the type. This notion means that our equality judgment doesn’t have to be natural in the type it works across. It can do specific things at each case. Perhaps the most frequent is that we can have functional extensionality.

f = g ⇔ ∀ a. f a = g a

Okay, so now that we’ve tossed aside the notion of a single global equality, what else is new? Well something new is the lens through which many people look at NuRPL’s type theory: PER semantics. Remember that PER is a relationship satisfying

1. a R b → then b R a
2. a R b ∧ b R c → a R c

In other words, a PER is an equivalence relationship that isn’t necessarily reflexive at all points.

The idea is to view types not as some opaque “thingy” but instead to be partial equivalence relations across the set of untyped lambda calculus terms. Inductively defined equality falls right out of this idea since we can just define a ≡ b : A to be equivalent to (a, b) ∈ A.

Now another problem rears it head, what does a : A mean? Well even though we’re dealing with PERs, but it’s quite reasonable to say something is a member of a type if it’s reflexive. That is to say each relation is a full equivalence relation for the things we call members of that type. So we can therefore define a : A to be (a, a) ∈ A.

Another important constraint, in order for a type family to be well formed, it needs to respect the equality of the type it maps across. In other words, for all B : A → Type, we have (a, a') ∈ A' ⇒ (B a = B a') ∈ U. This should seem on par with how we defined function equality and we call this “type functionality”.

Let’s all touch on another concept: squashed types. The idea is to take a type and throw away all information other than whether or not it’s occupied. There are two basic types of squashing, extensional or intensional. In the intensional we consider two squashed things equal if and only if the types they’re squashing are equal

     A = B
————————————
[A] = [B]

Now we can also consider only the behavior of the squashed type, the extensional view. Since the only behavior of a squashed type is simply existing, our extensional squash type has the equivalence

   ∥A∥ ⇔ ∥B∥
————————–
∥A∥ = ∥B∥

Now aside from this, the introduction of these types are basically the same: if we can prove that a type is occupied, we can grab a squashed type. Similarly, when we eliminate a type all we get is the trivial occupant of the squashed type, called •.

    Γ ⊢ A
———————
Γ ⊢ [A]

Γ, x : |A|, Δ[̱•] ⊢ C[̱•]
——————————————————————————
Γ, x : |A|, Δ[x] ⊢ C[x]

What’s interesting is that when proving an equality judgment, we can unsquash obth of these types. This is only because NuRPL’s equality proofs computationally trivial.

Now with all of that out of the way, I’d like to present two typing rules. First

  Γ ⊢ A ≡ A';  Γ, x : A, y : A ⊢ E[x; y] = E'[x; y]; E and E' are PERS
————————————————————————————————————————————————————————————————————
Γ ⊢ A ‌// E ≡ A' // E'

In English, two quotients are equal when the types and their quotienting relations are equal.

 Γ, u : x ≡ y ∈ (A // E), v :  ∥x E y∥, Δ[u] ⊢ C [u]
———————————————————————————————————————————————————–
Γ, u : x ≡ y ∈ (A // E), Δ[u] ⊢ C [u]

There are a few new things here. The first is that we have a new Δ [u] thing. This is a result of dependent types, can have things in our context that depend on u and so to indicate that we “split” the context, with Γ, u, Δ and apply the depend part of the context Δ to the variable it depends on u.

Now the long and short of this is that when we’re of this is that when we’re trying to use an equivalence between two terms in a quotient, we only get the squashed term. This done mean that we only need to provide a squash to get equality in the first place though

Γ ⊢ ∥ x E y  ∥; Γ ⊢ x : A; Γ ⊢ y : A
——————————————————————————————————–
Γ ⊢ x ≡ y : A // E

Remember that we can trivially form an ∥ A ∥ from A’.

Now there’s just one thing left to talk about, using our quotiented types. To do this the paper outlines one primitive elimination rule and defines several others.

Γ, x : A, y : A, e : x E y, a : ND, Δ[ndₐ{x;y}] ⊢ |C[ndₐ{x;y}]|
——————————————————————————————————————————————————————————————–
Γ, x : A // E, Δ[x] ⊢ |C[x]|

ND is a admittedly odd type that’s supposed to represent nondeterministic choice. It has two terms, tt and ff and they’re considered “equal” under ND. However, nd returns its first argument if it’s fed tt and the second if it is fed ff. Hence, nondeterminism.

Now in our rule we use this to indicate that if we’re eliminating some quotiented type we can get any value that’s considered equal under E. We can only be assured that when we eliminate a quotiented type, it will be related by the equivalence relation to x. This rule captures this notion by allowing us to randomly choose some y : A so that x E y.

Overall, this rule simply states that if C is occupied for any term related to x, then it is occupied for C[x].

## Wrap up

As with my last post, here’s some questions for the curious reader to pursue

• What elimination rules can we derive from the above?
• If we’re of proving equality can we get more expressive rules?
• What would an extensional quotient type look like?
• Why would we want intensional or extensional?
• How can we express quotient types with higher inductive types from HoTT

The last one is particularly interesting.

Thanks to Jon Sterling for proof reading

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

# TL;DR

In this blogpost I present a proof-of-concept operator $.$, which allows you to replace:

foo x0 x1 x2 ... xN = bar $qux x0 x1 x2 ... xN by: foo = bar$.$qux # Introduction This is a literate Haskell file, which means you should be able to just drop it into GHCi and play around with it. You can find the raw .lhs file here. Do note that this requires GHC 7.8 (it was tested on GHC 7.8.2). > {-# LANGUAGE FlexibleContexts #-} > {-# LANGUAGE FlexibleInstances #-} > {-# LANGUAGE OverlappingInstances #-} > {-# LANGUAGE TypeFamilies #-} > {-# LANGUAGE TypeOperators #-} > import Data.Char (toLower) If you have been writing Haskell code for a while, you have undoubtedly used the $ operator to “wrap” some expression with another function, mapping over the result type. For example, we can “wrap” the expression toLower 'A' with print to output the result.

print $toLower 'A' It is not unlikely either to have functions that just wrap other functions, e.g.: > printLower :: Char -> IO () > printLower x = print$ toLower x

If the function that is being wrapped (toLower in the previous example) has only one argument, the . operator allows writing a very concise definition of functions which just wrap those single-argument functions.

> printLower' :: Char -> IO ()
> printLower' = print . toLower

However, this gets tedious when the number arguments increases. Say that we have the following function which takes three arguments (don’t worry about the horrible implementation, but rather focus on the type):

> -- | Formats a double using a simple spec. Doesn't do proper rounding.
> formatDouble
>     :: Bool    -- ^ Drop trailing '0'?
>     -> Int     -- ^ #digits after decimal point
>     -> Double  -- ^ Argument
>     -> String  -- ^ Result
> formatDouble dropTrailingZero numDigits double =
>     let (pre, post) = case break (== '.') (show double) of
>             (x, '.' : y) -> (x, y)
>             (x, y)       -> (x, y)
>         post'       = take numDigits (post ++ repeat '0')
>         pre'        = case pre of
>             '0' : x -> if dropTrailingZero then x else pre
>             _       -> pre
>     in pre' ++ "." ++ post'

We can wrap formatDouble using print by successively using the . operator, but the result does not look pretty, nor very readable:

> printDouble :: Bool -> Int -> Double -> IO ()
> printDouble = (.) ((.) ((.) print)) formatDouble

# The $.$ operator

This makes one wonder if we can’t define an additional operator $.$ (pronounced holla-holla-get-dolla) which can be used like this:

> printDouble' :: Bool -> Int -> Double -> IO ()
> printDouble' = print $.$ formatDouble

Additionally, it should be generic, as in, it should work for an arbitrary number of arguments, so that we can also have:

> printMax' :: Int -> Int -> IO ()
> printMax' = print $.$ max
> printLower'' :: Char -> IO ()
> printLower'' = print $.$ toLower

From this, it becomes clear that the type of $.$ should be something like:

($.$)
:: (a -> b)
-> (x0 -> x1 -> ... -> xn -> a)
-> (x0 -> x1 -> ... -> xn -> b)

The first question is obviously, can we write such an operator? And if we can, how generic is it?

When my colleague Alex asked me this question, I spent some time figuring it out. I previously thought it was not possible to write such an operator in a reasonably nice way, but after some experiments with the closed type families in GHC 7.8 I managed to get something working. However, the solution is far from trivial (and I now suspect more elegant solutions might exist).

# A possible solution

Thanks to my colleague Alex for proofreading!

## October 15, 2014

### Douglas M. Auclair (geophf)

September, 2014

• September 1st, 2014: They tried to kill the Metal...I don't know where I'm going with that. But rock-n-roll with today's #haskell exercise http://lpaste.net/110331
• September 2nd, 2014: Good morning! Triangle Sums is our #haskell problem for today: http://lpaste.net/110404 No triangles were harmed in the solution of their sum (nor in the summation of their solution) http://lpaste.net/110432

• September 3rd, 2014: Pay It Forward. What? You didn't think I'd just say: today's #haskell problem is hard and leave it at that, did you? http://lpaste.net/110444 Paid. Or: a constructivist approach reduces the generated sets from 200M+ down to 8 possible solutions http://lpaste.net/110684 That's doable. ... and here is the 'Mr. Clean' version of the solution: fast, and neat. Groovy! http://lpaste.net/110685
• September 4th, 2014: Today's #haskell problem: Abacus words http://lpaste.net/110494 because MRFE says "I don't like your math problems; I want more word problems"
• September 5th, 2014: These 'edgy' relationships these days!  Remember when today's #haskell problem didn't involve graph theory? http://lpaste.net/110543 Data.Graph FTW! http://lpaste.net/110571 A solution to today's 4sum #haskell problem, and it didn't require generating 1625702400 solutions!
• September 8th, 2014: We have puzzles 1 and 5 from the "Montley Stew" problem set for today's #haskell problem http://lpaste.net/110699 The solution-style to Montley Stew isswimming in list-compression stew http://lpaste.net/110750
• September 9th, 2014: King Tut! http://lpaste.net/110789 Our #haskell problem for today is NOT a Pyramid Scheme. Maybe.
• September 10th, 2014: 'Sed' is 'but(t)' just another word ... in "'laddin" http://lpaste.net/110833 Today's #haskell problem is mix-n-match words. "But(t) I sed ..." ARG! Enough with the 3rd-grade humor! On with the solution to the mix-n-match words! http://lpaste.net/110859
• September 11th, 2014: A-plus for you when you solve today's #haskell exercise http://lpaste.net/110862 But an F- (NOT an F# ... geddit?) for /usr/share/dict/words :/ A solution to today's injectInto #haskell problem http://lpaste.net/110891
• September 12th, 2014: Today's #Haskell problem comes from She. She who commands you solve it before coding it. http://lpaste.net/110925 So, you know: there it is. Okay, 'thurt' is a word in WHICH Universe? A solution to today's #haskell 'ditto' problem http://lpaste.net/110955
• September 15th, 2014: "The name of the game is Connect Four!" and today's #haskell problem http://lpaste.net/111065 as suggested by a tweet from @DrEugeniaCheng. I played Connect 4 against myself and lost! :/ A semi-solution to today's #haskell problem at http://lpaste.net/111105

• September 16th, 2014: There's more than one way to slice and dice a matrix for today's #haskell problem http://lpaste.net/111109 (follow-up to yesterday's Connect4) A Hack-n-slash solution to today's diagonal view of matrices. http://lpaste.net/111130 Thebonus solution is provided back at the Connect Four answer to make that game complete: http://lpaste.net/111105
• September 17th, 2014: Do you Yahoo!? Today's #haskell problem: connecting to Yahoo!'s financial webservice http://lpaste.net/111071 I like my java GREEN! http://lpaste.net/111238 (java means 'coffee') A solution to stock webservice #haskell problem.

• September 18th, 2014: Star (Tuna?) Fish? http://lpaste.net/111216 A radial word-finding/matching game is today's #haskell puzzle. Wait. Quantum...WHAT? http://lpaste.net/111259 A solution to today's #haskell problem using quantum superpositions to solve it. I'm not joking. STAR POWER! http://lpaste.net/edit/111259 A solution for pretty-printing the star-puzzle
• September 19th, 2014: Continued fractions and dual inversionals are today's #haskell problem http://lpaste.net/111067 It even comes with (thoughts about) flowers. #Ult Today's problem was inspired by a comment, then the main article, from @aperiodicalhttp://aperiodical.com/2013/11/from-the-mailbag-dual-inversal-numbers/#comment-611141 That was some LOOOOOOOONG Division! http://lpaste.net/111314 A solution to today's #haskell problem.
• September 22nd, 2014: Oh, noes! The 'M'-word! http://lpaste.net/111419 for today's #haskell exercise. Project Eulerproblem 11'M' is for 'Monoid' http://lpaste.net/111435 A solution to today's #haskell problem.
• September 23rd, 2014: "Oh, the snark bites, with his teeth, dear." MACD Knife ... geddit? http://lpaste.net/111468 Today's #haskell problem is a technical indicator.
• September 24th, 2014: Jones, Smith, and Brown work at the Bank... but not Mr. Banks.A logic puzzle from 1957 for today's #haskell puzzle http://lpaste.net/111461. A pair of PhDs (http://lpaste.net/111580) helped to solve today's #haskell problem. Neatly, too, I might add.
• September 25th, 2014: Corned-beef hashi? http://lpaste.net/111452 No, that's not right, and now I'm hungry! :( Shoot! Well, anyway: today's #haskell problem.
• September 26th, 2014: HA! I TOLD you we'd be getting to cipher-text! http://lpaste.net/111459 From the 1700's, still: it IS cipher text for today's #haskell problem. Ooh! The Plot Thickens (like pea soup)! http://lpaste.net/111679 "ALLMENHAVING" and "be mis T r U st ed " is a solution to today's problem.
• September 29th, 2014: Big (Crypto) Generator! http://lpaste.net/111795 Today's #haskell problem is a follow-on to Friday's. Human Error ... WHAT human error?http://lpaste.net/111829 A solution to today's make-your-own-cypto-table #haskell problem
• September 30th, 2014: "I wanna be a blue-collar man!" Yes, but who, and which occupation? http://lpaste.net/111755 Today's #haskell problem addresses this. Plumber, ... or painter? IDK! http://lpaste.net/111876 TWO-solutions to today's #haskell problem (one of them MAY be correct... :/ )

# Switching to sytemd-networkd

Ever since I read about sytemd-networkd being in the making I was looking forward to try it out. I kept watching for the package to appear in Debian, or at least ITP bugs. A few days ago, by accident, I noticed that I already have systemd-networkd on my machine: It is simply shipped with the systemd package!

My previous setup was a combination of ifplugd to detect when I plug or unplug the ethernet cable with a plain DHCP entry in /etc/network/interface. A while ago I was using guessnet to do a static setup depending on where I am, but I don’t need this flexibility any more, so the very simple approach with systemd-networkd is just fine with me. So after stopping ifplugd and

$cat > /etc/systemd/network/eth.network <<__END__ [Match] Name=eth0 [Network] DHCP=yes __END__$ systemctl enable systemd-networkd
$systemctl start systemd-networkd I was ready to go. Indeed, systemd-networkd, probably due to the integrated dhcp client, felt quite a bit faster than the old setup. And what’s more important (and my main motivation for the switch): It did the right thing when I put it to sleep in my office, unplug it there, go home, plug it in and wake it up. ifplugd failed to detect this change and I often had to manually run ifdown eth0 && ifup eth0; this now works. But then I was bitten by what I guess some people call the viral nature of systemd: sytemd-networkd would not update /etc/resolve.conf, but rather relies on systemd-resolved. And that requires me to change /etc/resolve.conf to be a symlink to /run/systemd/resolve/resolv.conf. But of course I also use my wireless adapter, which, at that point, was still managed using ifupdown, which would use dhclient which updates /etc/resolve.conf directly. So I investigated if I can use systemd-networkd also for my wireless account. I am not using NetworkManager or the like, but rather keep wpa_supplicant running in roaming mode, controlled from ifupdown (not sure how that exactly works and what controls what, but it worked). I found out that this setup works just fine with systemd-networkd: I start wpa_supplicant with this service file (which I found in the wpasupplicant repo, but not yet in the Debian package): [Unit] Description=WPA supplicant daemon (interface-specific version) Requires=sys-subsystem-net-devices-%i.device After=sys-subsystem-net-devices-%i.device [Service] Type=simple ExecStart=/sbin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I [Install] Alias=multi-user.target.wants/wpa_supplicant@%i.service Then wpa_supplicant will get the interface up and down as it goes, while systemd-networkd, equipped with [Match] Name=wlan0 [Network] DHCP=yes does the rest. So suddenly I have a system without /etc/init.d/networking and without ifup. Feels a bit strange, but also makes sense. I still need to migrate how I manage my UMTS modem device to that model. The only thing that I’m missing so far is a way to trigger actions when the network configuration has changes, like I could with /etc/network/if-up.d/ etc. I want to run things like killall -ALRM tincd and exim -qf. If you know how to do that, please tell me, or answer over at Stack Exchange. # Switching to systemd-networkd Ever since I read about systemd-networkd being in the making I was looking forward to try it out. I kept watching for the package to appear in Debian, or at least ITP bugs. A few days ago, by accident, I noticed that I already have systemd-networkd on my machine: It is simply shipped with the systemd package! My previous setup was a combination of ifplugd to detect when I plug or unplug the ethernet cable with a plain DHCP entry in /etc/network/interface. A while ago I was using guessnet to do a static setup depending on where I am, but I don’t need this flexibility any more, so the very simple approach with systemd-networkd is just fine with me. So after stopping ifplugd and $ cat > /etc/systemd/network/eth.network <<__END__
[Match]
Name=eth0
[Network]
DHCP=yes
__END__
$systemctl enable systemd-networkd$ systemctl start systemd-networkd

I was ready to go. Indeed, systemd-networkd, probably due to the integrated dhcp client, felt quite a bit faster than the old setup. And what’s more important (and my main motivation for the switch): It did the right thing when I put it to sleep in my office, unplug it there, go home, plug it in and wake it up. ifplugd failed to detect this change and I often had to manually run ifdown eth0 && ifup eth0; this now works.

But then I was bitten by what I guess some people call the viral nature of systemd: systemd-networkd would not update /etc/resolve.conf, but rather relies on systemd-resolved. And that requires me to change /etc/resolve.conf to be a symlink to /run/systemd/resolve/resolv.conf. But of course I also use my wireless adapter, which, at that point, was still managed using ifupdown, which would use dhclient which updates /etc/resolve.conf directly.

So I investigated if I can use systemd-networkd also for my wireless account. I am not using NetworkManager or the like, but rather keep wpa_supplicant running in roaming mode, controlled from ifupdown (not sure how that exactly works and what controls what, but it worked). I found out that this setup works just fine with systemd-networkd: I start wpa_supplicant with this service file (which I found in the wpasupplicant repo, but not yet in the Debian package):

[Unit]
Description=WPA supplicant daemon (interface-specific version)
Requires=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device

[Service]
Type=simple
ExecStart=/sbin/wpa_supplicant -c/etc/wpa_supplicant/wpa_supplicant-%I.conf -i%I

[Install]
Alias=multi-user.target.wants/wpa_supplicant@%i.service

Then wpa_supplicant will get the interface up and down as it goes, while systemd-networkd, equipped with

[Match]
Name=wlan0
[Network]
DHCP=yes

does the rest.

So suddenly I have a system without /etc/init.d/networking and without ifup. Feels a bit strange, but also makes sense. I still need to migrate how I manage my UMTS modem device to that model.

The only thing that I’m missing so far is a way to trigger actions when the network configuration has changes, like I could with /etc/network/if-up.d/ etc. I want to run things like killall -ALRM tincd and exim -qf. If you know how to do that, please tell me, or answer over at Stack Exchange.

# Shake's Internal State

Summary: Shake is not like Make, it has different internal state, which leads to different behaviour. I also store the state in an optimised way.

Update: I'm keeping an up to date version of this post in the Shake repo, which includes a number of questions/answers at the bottom, and is likely to evolve over time to incorporate that information into the main text.

In order to understand the behaviour of Shake, it is useful to have a mental model of Shake's internal state. To be a little more concrete, let's talk about Files which are stored on disk, which have ModTime value's associated with them, where modtime gives the ModTime of a FilePath (Shake is actually generalised over all those things). Let's also imagine we have the rule:

file *> \out -> do    need [dependency]    run

So file depends on dependency and rebuilds by executing the action run.

The Make Model

In Make there is no additional state, only the file-system. A file is considered dirty if it has a dependency such that:

modtime dependency > modtime file

As a consequence, run must update modtime file, or the file will remain dirty and rebuild in subsequent runs.

The Shake Model

For Shake, the state is:

database :: File -> (ModTime, [(File, ModTime)])

Each File is associated with a pair containing the ModTime of that file, plus a list of each dependency and their modtimes, all from when the rule was last run. As part of executing the rule above, Shake records the association:

file -> (modtime file, [(dependency, modtime dependency)])

The file is considered dirty if any of the information is no longer current. In this example, if either modtime file changes, or modtime dependency changes.

There are a few consequences of the Shake model:

• There is no requirement for modtime file to change as a result of run. The file is dirty because something changed, after we run the rule and record new information it becomes clean.
• Since a file is not required to change its modtime, things that depend on file may not require rebuilding even if file rebuilds.
• If you update an output file, it will rebuild that file, as the ModTime of a result is tracked.
• Shake only ever performs equality tests on ModTime, never ordering, which means it generalises to other types of value and works even if your file-system sometimes has incorrect times.

These consequences allow two workflows that aren't pleasant in Make:

• Generated files, where the generator changes often, but the output of the generator for a given input changes rarely. In Shake, you can rerun the generator regularly, and using a function that writes only on change (writeFileChanged in Shake) you don't rebuild further. This technique can reduce some rebuilds from hours to seconds.
• Configuration file splitting, where you have a configuration file with lots of key/value pairs, and want certain rules to only depend on a subset of the keys. In Shake, you can generate a file for each key/value and depend only on that key. If the configuration file updates, but only a subset of keys change, then only a subset of rules will rebuild. Alternatively, using Development.Shake.Config you can avoid the file for each key, but the dependency model is the same.

Optimising the Model

The above model expresses the semantics of Shake, but the implementation uses an optimised model. Note that the original Shake paper gives the optimised model, not the easy to understand model - that's because I only figured out the difference a few days ago (thanks to Simon Marlow, Simon Peyton Jones and Andrey Mokhov). To recap, we started with:

database :: File -> (ModTime, [(File, ModTime)])

We said that File is dirty if any of the ModTime values change. That's true, but what we are really doing is comparing the first ModTime with the ModTime on disk, and the list of second ModTime's with those in database. Assuming we are passed the current ModTime on disk, then a file is valid if:

valid :: File -> ModTime -> Boolvalid file mNow =    mNow == mOld &&    and [fst (database d) == m | (d,m) <- deps]    where (mOld, deps) = database file

The problem with this model is that we store each File/ModTime pair once for the file itself, plus once for every dependency. That's a fairly large amount of information, and in Shake both File and ModTime can be arbitrarily large for user rules.

Let's introduce two assumptions:

Assumption 1: A File only has at most one ModTime per Shake run, since a file will only rebuild at most once per run. We use Step for the number of times Shake has run on this project.

Consequence 1: The ModTime for a file and the ModTime for its dependencies are all recorded in the same run, so they share the same Step.

Assumption 2: We assume that if the ModTime of a File changes, and then changes back to a previous value, we can still treat that as dirty. In the specific case of ModTime that would require time travel, but even for other values it is very rare.

Consequence 2: We only use historical ModTime values to compare them for equality with current ModTime values. We can instead record the Step at which the ModTime last changed, assuming all older Step values are unequal.

The result is:

database :: File -> (ModTime, Step, Step, [File])valid :: File -> ModTime -> Boolvalid file mNow =    mNow == mOld &&    and [sBuild >= changed (database d) | d <- deps]    where (mOld, sBuilt, sChanged, deps) = database file          changed (_, _, sChanged, _) = sChanged

For each File we store its most recently recorded ModTime, the Step at which it was built, the Step when the ModTime last changed, and the list of dependencies. We now check if the Step for this file is greater than the Step at which dependency last changed. Using the assumptions above, the original formulation is equivalent.

Note that instead of storing one ModTime per dependency+1, we now store exactly one ModTime plus two small Step values.

We still store each file many times, but we reduce that by creating a bijection between File (arbitrarily large) and Id (small index) and only storing Id.

Implementing the Model

For those who like concrete details, which might change at any point in the future, the relevant definition is in Development.Shake.Database:

data Result = Result    {result    :: Value   -- the result when last built    ,built     :: Step    -- when it was actually run    ,changed   :: Step    -- when the result last changed    ,depends   :: [[Id]]  -- dependencies    ,execution :: Float   -- duration of last run    ,traces    :: [Trace] -- a trace of the expensive operations    } deriving Show

The differences from the model are:

• ModTime became Value, because Shake deals with lots of types of rules.
• The dependencies are stored as a list of lists, so we still have access to the parallelism provided by need, and if we start rebuilding some dependencies we can do so in parallel.
• We store execution and traces so we can produce profiling reports.
• I haven't shown the File/Id mapping here - that lives elsewhere.
• I removed all strictness/UNPACK annotations from the definition above, and edited a few comments.

As we can see, the code follows the optimised model quite closely.

# Game idea: “Be Consistent”

Here’s another idea for a video game.

The theme of the game is “be consistent”. It's a minimalist-styled 2D platformer. The core mechanic is that whatever you do the first time, the game makes it so that that was the right action. Examples of how this could work:

• At the start, you're standing at the center of a 2×2 checkerboard of background colors (plus appropriate greebles, not perfect squares). Say the top left and bottom right is darkish and the other quadrants are lightish. If you move left, then the darkish stuff is sky, the lightish stuff is ground, and the level extends to the left. If you move right, the darkish stuff is ground, and the level extends to the right.

• The first time you need to jump, if you press W or up then that's the jump key, or if you press the space bar then that's the jump key. The other key does something else. (This might interact poorly with an initial “push all the keys to see what they do”, though.)

• <o>You meet a floaty pointy thing. If you walk into it, it turns out to be a pickup. If you shoot it or jump on it, it turns out to be an enemy.
• If you jump in the little pool of water, the game has underwater sections or secrets. If you jump over the little pool, water is deadly.

# Mathematics of Program Construction (MPC 2015): first call for papers

FIRST CALL FOR PAPERS

12th International Conference on Mathematics of Program Construction, MPC 2015
Königswinter, Germany, 29 June - 1 July 2015
http://www.cs.ox.ac.uk/conferences/MPC2015/

BACKGROUND

The MPC conferences aim to promote the development of mathematical principles
and techniques that are demonstrably practical and effective in the process
of constructing computer programs, broadly interpreted.

The 2015 MPC conference will be held in Königswinter, Germany, from 29th June
to 1st July 2015. The previous conferences were held in Twente, The
Netherlands (1989), Oxford, UK (1992), Kloster Irsee, Germany (1995),
Marstrand, Sweden (1998), Ponte de Lima, Portugal (2000), Dagstuhl, Germany
(2002), Stirling, UK (2004, colocated with AMAST), Kuressaare, Estonia (2006,
colocated with AMAST), Marseille, France (2008), Québec City, Canada (2010,
colocated with AMAST), and Madrid, Spain (2012).

TOPICS

Papers are solicited on mathematical methods and tools put to use in program
construction. Topics of interest range from algorithmics to support for
program construction in programming languages and systems. The notion of
"program" is broad, from algorithms to hardware. Some typical areas are type
systems, program analysis and transformation, programming-language semantics,
security, and program logics. Theoretical contributions are welcome, provided
that their relevance to program construction is clear. Reports on
applications are welcome, provided that their mathematical basis is evident.

We also encourage the submission of "pearls": elegant, instructive, and fun
essays on the mathematics of program construction.

IMPORTANT DATES

* Submission of abstracts:      26 January 2015
* Submission of full papers:     2 February 2015
* Notification to authors:      16 March 2015
* Final version:                13 April 2015

SUBMISSION

Submission is in two stages. Abstracts (plain text, 10 to 20 lines) must be
submitted by 26 January 2015. Full papers (pdf) adhering to the LaTeX llncs
style must be submitted by 2 February 2015. There is no official page limit,
but authors should strive for brevity. The web-based system EasyChair will be
used for submission (https://easychair.org/conferences/?conf=mpc2015).

Papers must report previously unpublished work, and must not be submitted
concurrently to a journal or to another conference with refereed proceedings.
Accepted papers must be presented at the conference by one of the authors.

The proceedings of MPC 2015 will be published in Springer-Verlag's Lecture
Notes in Computer Science series, as have all the previous editions. Authors
of accepted papers will be expected to transfer copyright to Springer for
this purpose. After the conference, authors of the best papers will be
invited to submit revised versions to a special issue of the Elsevier journal
Science of Computer Programming.

PROGRAMME COMMITTEE

Ralf Hinze                University of Oxford, UK (chair)

Eerke Boiten              University of Kent, UK
Lindsay Groves            Victoria University of Wellington, New Zealand
Zhenjiang Hu              National Institute of Informatics, Japan
Graham Hutton             University of Nottingham, UK
Johan Jeuring             Utrecht University and Open University, The Netherlands
Jay McCarthy              Vassar College, US
Bernhard Möller           Universität Augsburg, Germany
Dave Naumann              Stevens Institute of Technology, US
Ulf Norell                University of Gothenburg, Sweden
Bruno C. d. S. Oliveira   The University of Hong Kong, Hong Kong
José Nuno Oliveira        Universidade do Minho, Portugal
Alberto Pardo             Universidad de la República, Uruguay
Christine Paulin-Mohring  INRIA-Université Paris-Sud, France
Tom Schrijvers            KU Leuven, Belgium
Tim Sheard                Portland State University, US
Anya Tafliovich           University of Toronto Scarborough, Canada
Tarmo Uustalu             Institute of Cybernetics, Estonia
Janis Voigtländer         Universität Bonn, Germany

VENUE

The conference will take place in Königswinter, Maritim Hotel, where
accommodation has been reserved. Königswinter is situated on the right bank
of the river Rhine, opposite Germany's former capital Bonn, at the foot of
the Siebengebirge.

LOCAL ORGANIZERS

Ralf Hinze                      University of Oxford, UK (co-chair)
Janis Voigtländer               Universität Bonn, Germany (co-chair)
José Pedro Magalhães            University of Oxford, UK
Nicolas Wu                      University of Oxford, UK

# Optparse-applicative and custom argument parsers

The latest update of optparse-applicative triggered me to look over the functions in cblrepo for parsing a few custom command line options. I used to do the parsing in a rather ad-hoc way with lots of use of list functions to split on specific characters. For instance, some option values are pairs of package name and version separated by a comma: PKG-NAME,VERSION. This worked fine and was easy to plug into version 0.10 of optparse-applicative. It was also easily extended to triples, PKG-NAME,VERSION,RELEASE, but it started feeling a bit brittle when some tuples got extended with an optional list of flag assignments, PKG-NAME,VERSION[:FLAG,FLAG,FLAG,...]. The recent release of version 0.11 of optparse-applicative changed the API for custom option value parsers radically; instead of passing a string to the parser, the parser has to use readerAsk to get the string. In short, ReaderM turned into a state monad.

In adjusting to the new API I noticed that the code was organised in such a way that some low-level parsing functions were used directly from command line option definitions, while also being used as building blocks for the more complex parsers. This of course meant that the structuring of the functions needed to be changed completely to deal with the API change.

It turns out there already was a parser that was written in a different style (here already adjusted to the 0.11 API):

readerGhcVersion :: ReadM Version
case lastMay $readP_to_S parseVersion arg of Just (v, "") -> return v _ -> fail$ "cannot parse value " ++ arg ++ ""

So I rewrote the rest of the parsers in a similar style. The arguably most complicated is this one:

readPkgNVersion :: ReadP (String, Version)
n <- many (satisfy (/= ','))
char ','
v <- parseVersion
return (n, v)

where
char '-'
n <- many (satisfy (/= ','))
return (FlagName n, False)

n0 <- get
n <- many (satisfy (/= ','))
return (FlagName (n0 : n), True)

char ':'
fas <- sepBy readFlag (char ',')
return (n, v, fas)

return (n, v, [])

in do
Just (r, "") -> return r
_ -> fail $"Cannot parse: " ++ s It is slightly longer, but it’s rather a lot easier to read what’s happening after this rewrite. ReadP feels like a lighter option than pulling in parsec as a dependency, but I’d love to hear any comments or suggestions, as well as pointers to how other people deal with parsing of non-trivial types of arguments in combination with optparse-applicative. ## October 12, 2014 ### Ian Ross # Non-diffusive atmospheric flow #7: PCA for spatio-temporal data # Non-diffusive atmospheric flow #7: PCA for spatio-temporal data Although the basics of the “project onto eigenvectors of the covariance matrix” prescription do hold just the same in the case of spatio-temporal data as in the simple two-dimensional example we looked at in the earlier article, there are a number of things we need to think about when we come to look at PCA for spatio-temporal data. Specifically, we need to think bout data organisation, the interpretation of the output of the PCA calculation, and the interpretation of PCA as a change of basis in a spatio-temporal setting. Let’s start by looking at data organisation. The ${Z}_{500}$ anomaly data we want to analyse has 66 × 151 = 9966 days of data, each of which has 72 × 15 = 1080 spatial points. In our earlier two-dimensional PCA example, we performed PCA on a collection of two-dimensional data points. For the ${Z}_{500}$ data, it’s pretty clear that the “collection of points” covers the time steps, and each “data point” is a 72 × 15 grid of ${Z}_{500}$ values. We can think of each of those grids as a 1080-dimensional vector, just by flattening all the grid values into a single row, giving us a sequence of “data points” as vectors in ${\mathbb{\text{ℝ}}}^{1080}$ that we can treat in the same kind of way as we did the two-dimensional data points in the earlier example. Our input data thus ends up being a set of 9966 1080-dimensional vectors, instead of 500 two-dimensional vectors (as for the mussel data). If we do PCA on this collection of 1080-dimensional vectors, the PCA eigenvectors will have the same shape as the input data vectors, so we can interpret them as spatial patterns, just by inverting the flattening we did to get from spatial maps of ${Z}_{500}$ to vectors – as long as we interpret each entry in the eigenvectors as the same spatial point as the corresponding entry in the input data vectors, everything works seamlessly. The transformation goes like this: pattern → vector → PCA → eigenvector → eigenpattern So we have an interpretation of the PCA eigenvectors (which we’ll henceforth call “PCA eigenpatterns” to emphasise that they’re spatial patterns of variability) in this spatio-temporal data case. What about the PCA eigenvalues? These have exactly the same interpretation as in the two-dimensional case: they measure the variance “explained” by each of the PCA eigenpatterns. And finally, the PCA projected components tell us how much of each PCA eigenpattern is present in each of the input data vectors. Since our input data has one spatial grid per time step, the projections give us one time series for each of the PCA eigenvectors, i.e. one time series of PCA projected components per spatial point in the input. (In one way, it’s kind of obvious that we need this number of values to reproduce the input data perfectly – I’ll say a little more about this when we think about what “basis” means in this setting.) The PCA calculation works just the same as it did for the two-dimensional case: starting with our 1080-dimensional data, we centre the data, calculate the covariance matrix (which in this case is a 1080 × 1080 matrix, the diagonal entries of which measure the variances at each spatial point and the off-diagonal entries of which measure the covariances between each pair of spatial points), perform an eigendecomposition of the covariance matrix, then project each of the input data points onto each of the eigenvectors of the covariance matrix. We’ve talked about PCA as being nothing more than a change of basis, in the two-dimensional case from the “normal” Euclidean basis (with unit basis vectors pointing along the $x$- and $y$-coordinate axes) to another orthnormal basis whose basis vectors are the PCA eigenvectors. How does this work in the spatio-temporal setting? This is probably the point that confuses most people in going from the simple two-dimensional example to the $N$-dimensional spatio-temporal case, so I’m going to labour the point a bit to make things as clear as possible. First, what’s the “normal” basis here? Each time step of our input data specifies a ${Z}_{500}$ value at each point in space – we have one number in our data vector for each point in our grid. In the two-dimensional case, we had one number for each of the mussel shell measurements we took (length and width). For the ${Z}_{500}$ data, the 1080 data values are the ${Z}_{500}$ values measured at each of the spatial points. In the mussel shell case, the basis vectors pointed in the $x$-axis direction (for shell length) and the $y$-axis direction (for the shell width). For the ${Z}_{500}$ case, we somehow need basis vectors that point in each of the “grid point directions”, one for each of the 1080 grid points. What do these look like? Imagine a spatial grid of the same shape (i.e. 72 × 15) as the ${Z}_{500}$ data, where all the grid values are zero, except for one point, which has a grid value of one. That is a basis vector pointing in the “direction” of the grid point with the non-zero data value. We’re going to call this the “grid” basis for brevity. We can represent the ${Z}_{500}$ value at any spatial point $\left(i,j\right)$ as ${Z}_{500}\left(i,j\right)=\sum _{k=1}^{1080}{\varphi }_{k}{e}_{k}\left(i,j\right)$ where ${e}_{k}\left(i,j\right)$ is zero unless $k=15\left(i-1\right)+j$, in which case it’s one (i.e. it’s exactly the basis element we just described, where we’re numbering the basis elements in row-major order) and ${\varphi }_{k}$ is a “component” in the expansion of the ${Z}_{500}$ field using this grid basis. Now obviously here, because of the basis we’re using, we can see immediately that ${\varphi }_{15\left(i-1\right)+j}={Z}_{500}\left(i,j\right)$, but this expansion holds for any orthnormal basis, so we can transform to a basis where the basis vectors are the PCA eigenvectors, just as for the two-dimensional case. If we call these eigenvectors ${\stackrel{~}{e}}_{k}\left(i,j\right)$, then ${Z}_{500}\left(i,j\right)=\sum _{k=1}^{1080}{\stackrel{~}{\varphi }}_{k}{\stackrel{~}{e}}_{k}\left(i,j\right),$ where the ${\stackrel{~}{\varphi }}_{k}$ are the components in the PCA eigenvector basis. Now though, the ${\stackrel{~}{e}}_{k}\left(i,j\right)$ aren’t just the “zero everywhere except at one point” grid basis vectors, but they can have non-zero values anywhere. Compare this to the case for the two-dimensional example, where we started with data in a basis that had seperate measurements for shell length and shell width, then transformed to the PCA basis where the length and width measurements were “mixed up” into a sort of “size” measurement and a sort of “aspect ratio” measurement. The same thing is happening here: instead of looking at the ${Z}_{500}$ data in terms of the variations at individual grid points (which is what we see in the grid basis), we’re going to be able to look at variations in terms of coherent spatial patterns that span many grid points. And because of the way that PCA works, those patterns are the “most important”, in the sense that they are the orthogonal (which in this case means uncorrelated) patterns that explain the most of the total variance in the ${Z}_{500}$ data. As I’ve already mentioned, I’m going to try to be consistent in terms of the terminology I use: I’m only ever going to talk about “PCA eigenvalues”, “PCA eigenpatterns”, and “PCA projected components” (or “PCA projected component time series”). Given the number of discussions I’ve been involved in in the past where people have been talking past each other just because one person means one thing by “principal component” and the other means something else, I’d much rather pay the price of a little verbosity to avoid that kind of confusion. ### ${Z}_{500}$ PCA calculation The PCA calculation for the ${Z}_{500}$ data can be done quite easily in Haskell. We’ll show in this section how it’s done, and we’ll use the code to address a couple of remaining issues with how spatio-temporal PCA works (specifically, area scaling for data in latitude/longitude coordinates and the relative scaling of PCA eigenpatterns and projected components). There are three main steps to the PCA calculation: first we need to centre our data and calculate the covariance matrix, then we need to do the eigendecomposition of the covariance matrix, and finally we need to project our original data onto the PCA eigenvectors. We need to think a little about the data volumes involved in these steps. Our ${Z}_{500}$ data has 1080 spatial points, so the covariance matrix will be a 1080 × 1080 matrix, i.e. it will have 1,166,400 entries. This isn’t really a problem, and performing an eigendecomposition of a matrix of this size is pretty quick. What can be more of a problem is the size of the input data itself – although we only have 1080 spatial points, we could in principle have a large number of time samples, enough that we might not want to read the whole of the data set into memory at once for the covariance matrix calculation. We’re going to demonstrate two approaches here: in the first “online” calculation, we’ll just read all the data at once and assume that we have enough memory; in the second “offline” approach, we’ll only ever read a single time step of ${Z}_{500}$ data at a time into memory. Note that in both cases, we’re going to calculate the full covariance matrix in memory and do a direct eigendecomposition using SVD. There are offline approaches for calculating the covariance and there are iterative methods that allow you to calculate some eigenvectors of a matrix without doing a full eigendecomposition, but we’re not going to worry about that here. #### Online PCA calculation As usual, the code is in a Gist. For the online calculation, the PCA calculation itself is identical to our two-dimensional test case and we reuse the pca function from the earlier post. The only thing we need to do is to read the data in as a matrix to pass to the pca function. In fact, there is one extra thing we need to do before passing the ${Z}_{500}$ anomaly data to the pca function. Because the ${Z}_{500}$ data is sampled on a regular latitude/longitude grid, grid points near the North pole correspond to much smaller areas of the earth than grid points closer to the equator. In order to compensate for this, we scale the ${Z}_{500}$ anomaly data values by the square root of the cosine of the latitude – this leads to covariance matrix values that scale as the cosine of the latitude, which gives the correct area weighting. The listing below shows how we do this. First we read the NetCDF data then we use the hmatrix build function to construct a suitably scaled data matrix:  Right z500short <- get innc z500var :: RepaRet3 CShort -- Convert anomaly data to a matrix of floating point values, -- scaling by square root of cos of latitude. let latscale = SV.map (\lt -> realToFrac$ sqrt $cos (lt / 180.0 * pi)) lat z500 = build (ntime, nspace) (\t s -> let it = truncate t :: Int is = truncate s :: Int (ilat, ilon) = divMod is nlon i = Repa.Z Repa.:. it Repa.:. ilat Repa.:. ilon in (latscale ! ilat) * (fromIntegral$ z500short Repa.! i)) :: Matrix Double

Once we have the scaled ${Z}_{500}$ anomaly data in a matrix, we call the pca function, which does both the covariance matrix calculation and the PCA eigendecomposition and projection, then write the results to a NetCDF file. We end up with a NetCDF file containing 1080 PCA eigenpatterns, each with 72 × 15 data points on our latitude/longitude grid and PCA projected component time series each with 9966 time steps.

One very important thing to note here is the relative scaling of the PCA eigenpatterns and the PCA projected component time series. In the two-dimensional mussel shell example, there was no confusion about the fact that the PCA eigenvectors as we presented them were unit vectors, and the PCA projected components had the units of length measured along those unit vectors. Here, in the spatio-temporal case, there is much potential for confusion (and the range of conventions in the climate science literature doesn’t do anything to help alleviate that confusion). To make things very clear: here, the PCA eigenvectors are still unit vectors and the PCA projected component time series have the units of ${Z}_{500}$!

The reason for the potential confusion is that people quite reasonably like to draw maps of the PCA eigenpatterns, but they also like to think of these maps as being spatial patterns of ${Z}_{500}$ variation, not just as basis vectors. This opens the door to all sorts of more or less reputable approaches to scaling the PCA eigenpatterns and projected components. One well-known book on statistical analysis in climate research suggests that people should scale their PCA eigenpatterns by the standard deviation of the corresponding PCA projected component time series and the values of the PCA projected component time series should be divided by their standard deviation. The result of this is that the maps of the PCA eigenpatterns look like ${Z}_{500}$ maps and all of the PCA projected component time series have standard deviation of one. People then talk about the PCA eigenpatterns as showing a “typical ± 1 SD” event.

Here, we’re going to deal with this issue by continuing to be very explicit about what we’re doing. In all cases, our PCA eigenpatterns will be unit vectors, i.e. the things we get back from the pca function, without any scaling. That means that the units in our data live on the PCA projected component time series, not on the PCA eigenpatterns. When we want to look at a map of a PCA eigenpattern in a way that makes it look like a “typical” ${Z}_{500}$ deviation from the mean (which is a useful thing to do), we will say something like “This plot shows the first PCA eigenpattern scaled by the standard deviation of the first PCA projected component time series.” Just to be extra explicit!

#### Offline PCA calculation

The “online” PCA calculation didn’t require any extra work, apart from some type conversions and the area scaling we had to do. But what if we have too much data to read everything into memory in one go? Here, I’ll show you how to do a sort of “offline” PCA calculation. By “offline”, I mean an approach that only ever reads a single time step of data from the input at a time, and only ever writes a single time step of the PCA projected component time series to the output at a time.

Because we’re going to be interleaving calculation and I/O, we’re going to need to make our PCA function monadic. Here’s the main offline PCA function:

pcaM :: Int -> (Int -> IO V) -> (Int -> V -> IO ()) -> IO (V, M)
pcaM nrow readrow writeproj = do
(mean, cov) <- meanCovM nrow readrow
let (_, evals, evecCols) = svd cov
evecs = fromRows $map evecSigns$ toColumns evecCols
evecSigns ev = let maxelidx = maxIndex $cmap abs ev sign = signum (ev ! maxelidx) in cmap (sign *) ev varexp = scale (1.0 / sumElements evals) evals project x = evecs #> (x - mean) forM_ [0..nrow-1]$ \r -> readrow r >>= writeproj r . project
return (varexp, evecs)

It makes use of a couple of convenience type synonyms:

type V = Vector Double
type M = Matrix Double

The pcaM function takes as arguments the number of data rows to process (in our case, the number of time steps), an IO action to read a single row of data (given the zero-based row index), and an IO action to write a single row of PCA projected component time series data. As with the “normal” pca function, the pcaM function returns the PCA eigenvalues and PCA eigenvectors as its result.

Most of the pcaM function is the same as the pca function. There are only two real differences. First, the calculation of the mean and covariance of the data uses the meanCovM function that we’ll look at in a moment. Second, the writing of the PCA projected component time series output is done by a monadic loop that uses the IO actions passed to pcaM to alternately read, project and write out rows of data (the pca function just returned the PCA projected component time series to its caller in one go).

Most of the real differences to the pca function lie in the calculation of the mean and covariance of the input data:

meanCovM :: Int -> (Int -> IO V) -> IO (V, M)
-- Accumulate values for mean calculation.
let maddone acc i = do
return $! acc + row mtotal <- foldM maddone refrow [1..nrow-1] -- Calculate sample mean. let mean = scale (1.0 / fromIntegral nrow) mtotal -- Accumulate differences from mean for covariance calculation. let refdiff = refrow - mean caddone acc i = do row <- readrow i let diff = row - mean return$! acc + (diff outer diff)
ctotal <- foldM caddone (refdiff outer refdiff) [1..nrow-1]

-- Calculate sample covariance.
let cov = scale (1.0 / fromIntegral (nrow - 1)) ctotal

return (mean, cov)

Since we don’t want to read more than a single row of input data at a time, we need to explicitly accumulate data for the mean and covariance calculations. That means making two passes over the input data file, reading a row at a time – the maddone and caddone helper functions accumulate a single row of data for the mean and covariance calculations. The accumulator for the mean is pretty obvious, but that for the covariance probably deserves a bit of comment. It uses the hmatrix outer function to calculate $\left({x}_{i}-\stackrel{‾}{x}\right)\left({x}_{i}-\stackrel{‾}{x}{\right)}^{T}$ (where ${x}_{i}$ is the $i$th data row (as a column vector) and $\stackrel{‾}{x}$ is the data mean), which is the appropriate contribution to the covariance matrix for each individual data row.

Overall, the offline PCA calculation makes three passes over the input data file (one for the mean, one for the covariance, one to project the input data onto the PCA eigenvectors), reading a single data row at a time. That makes it pretty slow, certainly far slower than the online calculation, which reads all of the data into memory in one go, then does all the mean, covariance and projection calculations in memory, and finally writes out the PCA projected components in one go. However, if you have enough data that you can’t do an online calculation, this is the way to go. You can obviously imagine ways to make this more efficient, probably by reading batches of data rows at a time. You’d still need to do three passes over the data, but batching the reads would make things a bit quicker.

### Visualising the PCA results

There are three things we can look at that come out of the PCA analysis of the ${Z}_{500}$ anomaly data: the PCA eigenvalues (best expressed as “fraction of variance explained”), the PCA eigenpatterns and the PCA projected component time series.

First, let’s look at the eigenvalues. This plot shows the fraction of variance explained for the first 100 PCA eigenvalues of the ${Z}_{500}$ anomaly data, both individually (blue) and cumulatively (orange):

The eigenvalues are ordered in decreasing order of magnitude in what’s usually called a “scree plot”. The reason for the name is pretty obvious, since the eigenvalues fall off quickly in magnitude giving the graph the look of cliff face with a talus slope at its foot. We often look for a “knee” in a plot like this to get some idea of how many PCA eigenpatterns we need to consider to capture a good fraction of the total variance in the data we’re looking at. Here we can see that just ten of the PCA eigenpatterns explain about half of the total variance in the ${Z}_{500}$ anomaly data (which is a set of 1080-dimensional vectors, remember). However, there’s not all that much of a “knee” in the scree plot here, which is pretty typical for climate and meteorological data – we often see a gradual fall-off in PCA eigenvalue magnitude rather than a discrete set of larger magnitude eigenvalues that we can identify as “the important ones”.

We can get some idea of what’s going on with this gradual fall-off by looking at the PCA eigenpatterns. As mentioned in the previous section, there is a question about how we scale these for display. To be completely explicit about things, here we’re going to plot PCA eigenpatterns scaled by the standard deviation of the corresponding PCA projected component time series. This gives us “typical one standard deviation” patterns that we can plot with units of geopotential height. These are usually easier to interpret than the “unit vector” PCA eigenpatterns than come out of the PCA calculation.

Here are the first six PCA eigenpatterns for the ${Z}_{500}$ anomaly data (you can click on these images to see larger versions; the numbers in parentheses show the fraction of total ${Z}_{500}$ anomaly variance explained by each PCA eigenpattern.):

For comparison here are the eigenpatterns for eigenpatterns $10,20,\dots ,60$:

The first thing to note about these figures is that the spatial scales of variation for the PCA eigenpatterns corresponding to smaller eigenvalues (i.e. smaller explained variance fractions) are also smaller – for the most extreme case, compare the dipolar circumpolar spatial pattern for the first eigenpattern (first plot in the first group of plots) to the fine-scale spatial features for the 60th eigenpattern (last plot in the second group). This is what we often seen when we do PCA on atmospheric data. The larger spatial scales capture most of the variability in the data so are represented by the first few eigenpatterns, while smaller scale spatial variability is represented by later eigenpatterns. Intuitively, this is probably related to the power-law scaling in the turbulent cascade of energy from large (planetary) scales to small scales (where dissipation by thermal diffusion occurs) in the atmosphere1.

The next thing we can look at, at least in the first few patterns in the first group of plots, are some of the actual patterns of variability these things represent. The first PCA eigenpattern, for example, represents a dipole in ${Z}_{500}$ anomaly variability with poles in the North Atlantic just south of Greenland and over mainland western Europe. If you look back at the blocking ${Z}_{500}$ anomaly plots in an earlier post, you can kind of convince yourself that this first PCA eigenpattern looks a little like some instances of a blocking pattern over the North Atlantic. Similarly, the second PCA eigenpattern is mostly a dipole between the North Pacific and North America (with some weaker associated variability over the Atlantic, so we might expect this somehow to be related to blocking episodes in the Pacific sector.

This is all necessarily a bit vague, because these patterns represent only part of the variability in the data, with each individual pattern representing only a quite small fraction of the variability (8.86% for the first eigenpattern, 7.46% for the second, 6.27% for the third). At any particular point in time, the pattern of ${Z}_{500}$ anomalies in the atmosphere will be made up of contributions from these patterns plus many others. What we hope though is that we can tease out some interesting characteristics of the atmospheric flow by considering just a subset of these PCA eigenpatterns. Sometimes this is really easy and obvious – if you perform PCA and find that there are two leading eigenpatterns that explain 80% of the variance in your data, then you can quite straightforwardly press ahead with analysing only those two patterns of variability, safe in the knowledge that you’re capturing most of what’s going on in your data. In our case, we’re going to try to get some sense of what’s going on by looking at only the first three PCA eigenpatterns (we’ll see why three in the next article). The first three eigenpatterns explain only 22.59% of the total variance in our ${Z}_{500}$ anomaly data, so this isn’t obviously a smart thing to do. It does turn out to work and to be quite educational though!

The last component of the output from the PCA procedure is the time series of PCA projected component values. Here we have one time series (of 9966 days) for each of the 1080 PCA eigenpatterns that we produced. At each time step, the actual ${Z}_{500}$ anomaly field can be recovered by adding up all the PCA eigenpatterns, each weighted by the corresponding projected component. You can look at plots of these time series, but they’re not in themselves all that enlightening. I’ll say some more about them in the next article, where we need to think about the autocorrelation properties of these time series.

(As a side note, I’d comment that the PCA eigenpatterns shown above match up pretty well with those in Crommelin’s paper, which is reassuring. The approach we’re taking here, of duplicating the analysis done in an existing paper, is actually a very good way to go about developing new data analyis code – you can see quite quickly if you screw things up as you’re going along by comparing your results with what’s in the paper. Since I’m just making up all the Haskell stuff here as I go along, this is pretty handy!)

1. But don’t make too much of that, not in any kind of quantitative sense anyway – there’s certainly no obvious power law scaling in the explained variance of the PCA eigenpatterns as a function of eigenvalue index, unless you look at the data with very power law tinted spectacles! I’m planning to look at another paper at some point in the future that will serve as a good vehicle for exploring this question of when and where we can see power law behaviour in observational data.}

<script src="http://zor.livefyre.com/wjs/v3.0/javascripts/livefyre.js" type="text/javascript"></script> <script type="text/javascript"> (function () { var articleId = fyre.conv.load.makeArticleId(null); fyre.conv.load({}, [{ el: 'livefyre-comments', network: "livefyre.com", siteId: "290329", articleId: articleId, signed: false, collectionMeta: { articleId: articleId, url: fyre.conv.load.makeCollectionUrl(), } }], function() {}); }()); </script>

## October 11, 2014

### Lee Pike

I wanted to try out a recent feature in GHC this week, so I was using HEAD. When I say using, I mean it: I wasn’t developing with it, but using it to build Ivory, our safe C EDSL, as well as some libraries written in Ivory. I want to point out a few dragons that lay ahead for others using HEAD (7.9.xxx) today.

• The Functor, Applicative, Monad hierarchy (as well as the Alternative and MonadPlus hierarchy) is no longer a warning but an error. You’ll be adding a lot of instances to library code.
• You’ll need to update Cabal, which comes as a submodule with the GHC sources. Unfortunately, building Cabal was a pain because of the bootstrapping problem. The bootstrap.sh script in cabal-install is broken (e.g., outdated dependency versions). Getting it to work involved using runhaskell directly, modifying a bunch of Cabal’s dependencies, and getting a little help from Iavor Diatchki.
• Some of the datatypes in Template Haskell have changed; notably, the datatypes for creating class constraints have been folded into the datatype for constructing types (constraint kinds!). Many libraries that depend on Template Haskell breaks as a result.
• I won’t mention the issues with package dependency upper bounds. Oops, I just did.

Fortunately, once Cabal is updated, Cabal sandboxes work just fine. I wrote a simple shell script to swap out my sandboxes to switch between GHC versions. (It would be a nice feature if Cabal recognized you are using a different version of GHC than the one the cabal sandbox was originally made and created a new sandbox automatically.)

Because I’m on OSX and use Homebrew, I tried using it to manage my GHC versions, including those installed with Brew and those built and installed manually. It works great. When building GHC, configure it to install into your Cellar, or wherever you have Brew install packages. So I ran

> ./configure --prefix=/usr/local/Cellar/ghc/7.9

which makes Brew aware of the new version of GHC, despite being manually installed. After it’s installed,

> brew switch ghc 7.9

takes care of updating all your symbolic links. No more making “my-ghc-7.9″ symbolic links or writing custom shell scripts.

# Functional Software Engineer at Cake Solutions Ltd (Full-time)

At Cake Solutions we work with our customers to build high-quality, scalable and resilient software systems using the latest technology. As a software engineer, you'll not only be able to write good, maintainable software, but also stay at the forefront of technology and advocate a principled approach to software engineering. You'll get the opportunity to work on a wide range of interesting projects for our clients, using Java, Scala, Play, and Akka.

What to expect:

To begin with, you will take part in a 2 week Typesafe certified workshop style training course where you will get introduced to Scala, Akka and Play.

You can expect a lively and challenging environment with very interesting problems to solve. We are happy to train and mentor the right people; the important thing is to have a bright mind and the motivation to question, explore and learn. Having published work, on Github or elsewhere, is extremely helpful. CVs should focus on what you really know, not what you've seen once.

If you are a graduate, we are not expecting commercial experience, but we want to see evidence of hobby / university engineering. We do, however, expect you to be independent and to have good understanding of the principles of software engineering.

You will also get flexible working hours and gym membership.

Skills & Requirements As a software engineer at Cake Solutions, you should:

-- Have a good understanding of Java and the JVM. Experience with Scala is a plus.

-- Know how to use UNIX or Linux.

-- Know how to apply object-oriented and functional programming styles to real-world software engineering.

-- Have experience with at least one database system and be aware of the wider database landscape (relational, document, key/value, graph, ...).

-Understand modern software development practices, such as testing, continuous integration and producing maintainable code.

-- Open-source contributions.

-- Modern web development experience.

-- An understanding of asynchronous and non-blocking principles.

-- Experience in writing multi-threaded software.

-- More detailed knowledge of strongly-typed functional programming (e.g. Scala, Haskell, OCaml).

-- Even more specifically, experience with Akka or Play.

About Cake Solutions Ltd Cake Solutions architects, implements and maintains modern and scalable software, which includes server-side, rich browser applications and mobile development. Alongside the software engineering and delivery, Cake Solutions provides mentoring and training services. Whatever scale of system you ask us to develop, we will deliver the entire solution, not just lines of code. We appreciate the importance of good testing,Continuous Integration and delivery, and DevOps. We motivate, mentor and guide entire teams through modern software engineering. This enables us to deliver not just software, but to transform the way organisations think about and execute software delivery.

The core members of our team are published authors and experienced speakers. Our teams have extensive experience in designing and implementing event-driven, resilient, responsive and scalable systems. We use modern programming languages such as Scala, Java, C++, Objective-C, and JavaScript to implement the systems' components. We make the most of messaging infrastructures, modern DBMSs, and all other services that make up today's enterprise systems. Automated provisioning, testing, integration and delivery allow us to release high quality systems safely and predictably. The mentoring through continuous improvement at all levels of the project work gives our clients the insight and flexibility they expect.

We rely on open source software in our day-to-day development; it gives us access to very high quality code, allows us to make improvements if we need to, and provides access to excellent source of inspiration and talent. We give back to the open source community by contributing to the open source projects we use, and by publishing our own open source projects. The team have contributed various Typesafe Activator templates, and shared their expertise with Akka and Scala in Akka Patterns and Akka Extras. Outside of the Typesafe stack, we have contributed to Tru-strap, and OpenCV. The team members have also created open source projects that scratch our own itch: we have Reactive Monitor, Specs2 Spring and Scalad.

Get information on how to apply for this position.

# ICFP 2015: Call for Workshop and Co-Located Event Proposals

CALL FOR WORKSHOP AND CO-LOCATED EVENT PROPOSALS
ICFP 2015
20th ACM SIGPLAN International Conference on Functional Programming
August 30 – September 5, 2015

The 120th ACM SIGPLAN International Conference on Functional
Programming will be held in Vancouver, British Columbia, Canada on
August 30-September 5, 2015.  ICFP provides a forum for researchers
and developers to hear about the latest work on the design,
implementations, principles, and uses of functional programming.

Proposals are invited for workshops (and other co-located events, such
as tutorials) to be affiliated with ICFP 2015 and sponsored by
SIGPLAN. These events should be less formal and more focused than ICFP
itself, include sessions that enable interaction among the attendees,
and foster the exchange of new ideas. The preference is for one-day
events, but other schedules can also be considered.

The workshops are scheduled to occur on August 30 (the day
before ICFP) and September 3-5 (the three days after ICFP).

----------------------------------------------------------------------

Submission details
Deadline for submission:     November 16, 2014
Notification of acceptance:  December 15, 2014

Prospective organizers of workshops or other co-located events are
invited to submit a completed workshop proposal form in plain text
format to the ICFP 2015 workshop co-chairs (Tom Schrijvers and Nicolas
Wu), via email to icfp2015-workshops@cs.kuleuven.be by November 16,
2014. (For proposals of co-located events other than workshops, please
fill in the workshop proposal form and just leave blank any sections
that do not apply.) Please note that this is a firm deadline.

Organizers will be notified if their event proposal is accepted by
December 15, 2014, and if successful, depending on the event, they
will be asked to produce a final report after the event has taken
place that is suitable for publication in SIGPLAN Notices.

The proposal form is available at:

http://www.icfpconference.org/icfp2015/icfp15-workshops-form.txt

http://acm.org/sigplan/sigplan_workshop_proposal.htm

----------------------------------------------------------------------

Selection committee

The proposals will be evaluated by a committee comprising the
following members of the ICFP 2015 organising committee, together with
the members of the SIGPLAN executive committee.

Workshop Co-Chair: Tom Schrijvers                          (KU Leuven)
Workshop Co-Chair: Nicolas Wu                   (University of Oxford)
General Chair :    Kathleen Fisher                  (Tufts University)
Program Chair:     John Reppy                (University of Chicago)

----------------------------------------------------------------------

Further information

Any queries should be addressed to the workshop co-chairs (Tom
Schrijvers and Nicolas Wu), via email to

# SNAPL

SNAPL will take place 3-6 May 2015 at Asilomar.
SNAPL provides a biennial venue to focus on big-picture questions, long-running research programs, and experience reports in programming languages. The ideal SNAPL paper takes one of two forms:
• A promising idea that would benefit from discussion and feedback.
• A paper about an ongoing research project that might not be accepted at a traditional conference.
Examples include papers that
• lay out a research roadmap
• summarize experiences
• present negative results
• discuss design alternatives
SNAPL is interested in all aspects of programming languages. The PC is broad and open-minded and receptive to interesting ideas that will provoke thought and discussion.
Interesting papers would combine the specific with the general. Submissions are limited to five pages (excluding bibliography), and must be formatted using ACM SIG style. The final papers can be up to 10 pages in length. Accepted papers will be published on an open-access site, probably arXiv CoRR.
To encourage authors to submit only their best work, each person can be an author or co-author of a single paper only. SNAPL will prefer experienced presenters and each submission must indicate on the submission site which co-author will present the paper at the conference.
SNAPL also accepts one-page abstracts. Abstracts will be reviewed lightly and all submitted abstracts will be published on the SNAPL 2015 web page. Authors of selected abstracts will be invited to give a 5-minute presentation at the gong show at the conference.
SNAPL is unaffiliated with any organization. It is a conference for the PL community organized by the PL community.

### Important Dates

Submission: January 6, 2015
Decisions announced: February 20, 2015
Final versions due: March 20, 2015
Conference: May 3-6, 2015

# Software Engineer / Developer at Clutch Analytics/ Windhaven Insurance (Full-time)

Position Description:

Windhaven Insurance is seeking an experienced Software Engineer/ Developer to join a small elite team who are disrupting the auto insurance industry with innovative technology. You will work in a startup atmosphere as part of a subsidiary of a rapidly growing larger company. This means that along with programming, you’ll have a hand in the larger issues, such as product architecture and direction.

Required:

• Someone who knows at least one functional language such as: Elixir, Erlang, Lisp, Scheme, Haskell, ML, Clojure, Racket, Ocaml or F#
• Someone who ENGAGES a.k.a “gives a damn” about what we do and how technology can help make us more competitive in the marketplace.
• Someone who COLLABORATES. We have the flattest organization in the industry designed with one main goal – the TEAM. Are you hungry to make a significant impact in the tech world?
• Someone who RESPECTS Teammates, customers and the community.

Special Requirements:

You need to have made an achievement, in any field, of significance worth talking about, that required grit, determination and skill. This can be a personal achievement that few know about, or it could have gotten coverage by the media. It doesn’t matter, what does matter is some demonstration of grit, determination and skill. If it can be described succinctly, please describe the achievement in your cover letter, if not, be prepared to tell us all about it during the interview.

Professional & Technical Qualifications:

• Experience with languages such as Elixir or Erlang
• Experience with Ember.js (or Angular.js)
• Experience with NoSQL data stores, such as Couchbase, Riak, etc.
• DevOps experience a plus
• Ability to explain technical issues and recommend solutions
• Strong team player with a high degree of flexibility
• Excellent verbal and written communication skills

Compensation:

Competitive salary based on experience. Benefits package includes: medical, dental, vision insurance, life insurance, short term and long term disability insurance, 401K, paid time off. EOE.

Get information on how to apply for this position.

# Bake - Continuous Integration System

Summary: I've written a continuous integration system, in Haskell, designed for large projects. It works, but probably won't scale yet.

I've just released bake, a continuous integration system - an alternative to Jenkins, Travis, Buildbot etc. Bake eliminates the problem of "broken builds", a patch is never merged into the repo before it has passed all the tests.

Bake is designed for large, productive, semi-trusted teams:

• Large teams where there are at least several contributors working full-time on a single code base.
• Productive teams which are regularly pushing code, many times a day.
• Semi-trusted teams where code does not go through manual code review, but code does need to pass a test suite and perhaps some static analysis. People are assumed to be fallible.

Current state: At the moment I have a rudimentary test suite, and it seems to mostly work, but Bake has never been deployed for real. Some optional functionality doesn't work, some of the web UI is a bit crude, the algorithms probably don't scale and all console output from all tests is kept in memory forever. I consider the design and API to be complete, and the scaling issues to be easily fixable - but it's easier to fix after it becomes clear where the bottleneck is. If you are interested, take a look, and then email me.

To give a flavour, the web GUI looks of a running Bake system looks like:

## The Design

Bake is a Haskell library that can be used to put together a continuous integration server. To run Bake you start a single server for your project, which coordinates tasks, provides an HTTP API for submitting new patches, and a web-based GUI for viewing the progress of your patches. You also run some Bake clients which run the tests on behalf of the server. While Bake is written in Haskell, most of the tests are expected to just call some system command.

There are a few aspects that make Bake unique:

• Patches are submitted to Bake, but are not applied to the main repo until they have passed all their tests. There is no way for someone to "break the build" - at all points the repo will build on all platforms and all tests will pass.
• Bake scales up so that even if you have 5 hours of testing and 50 commits a day it will not require 250 hours of computation per day. In order for Bake to prove that a set of patches pass a test, it does not have to test each patch individually.
• Bake allows multiple clients to run tests, even if some tests are only able to be run on some clients, allowing both parallelisation and specialisation (testing both Windows and Linux, for example).
• Bake can detect that tests are no longer valid, for example because they access a server that is no longer running, and report the issue without blaming the submitted patches.

## An Example

The test suite provides both an example configuration and commands to drive it. Here we annotate a slightly simplified version of the example, for lists of imports see the original code.

First we define an enumeration for where we want tests to run. Our server is going to require tests on both Windows and Linux before a patch is accepted.

data Platform = Linux | Windows deriving (Show,Read)platforms = [Linux,Windows]

Next we define the test type. A test is something that must pass before a patch is accepted.

data Action = Compile | Run Int deriving (Show,Read)

Our type is named Action. We have two distinct types of tests, compiling the code, and running the result with a particular argument. Now we need to supply some information about the tests:

allTests = [(p,t) | p <- platforms, t <- Compile : map Run [1,10,0]]execute :: (Platform,Action) -> TestInfo (Platform,Action)execute (p,Compile) = matchOS p $run$ do    cmd "ghc --make Main.hs"execute (p,Run i) = require [(p,Compile)] $matchOS p$ run $do cmd ("." </> "Main") (show i) We have to declare allTests, then list of all tests that must pass, and execute, which gives information about a test. Note that the test type is (Platform,Action), so a test is a platform (where to run the test) and an Action (what to run). The run function gives an IO action to run, and require specifies dependencies. We use an auxiliary matchOS to detect whether a test is running on the right platform: #if WINDOWSmyPlatform = Windows#elsemyPlatform = Linux#endifmatchOS :: Platform -> TestInfo t -> TestInfo tmatchOS p = suitable (return . (==) myPlatform) We use the suitable function to declare whether a test can run on a particular client. Finally, we define the main function: main :: IO ()main = bake$    ovenGit "http://example.com/myrepo.git" "master" $ovenTest readShowStringy (return allTests) execute defaultOven{ovenServer=("127.0.0.1",5000)} We define main = bake, then fill in some configuration. We first declare we are working with Git, and give a repo name and branch name. Next we declare what the tests are, passing the information about the tests. Finally we give a host/port for the server, which we can visit in a web browser or access via the HTTP API. ## Using the Example Now we have defined the example, we need to start up some servers and clients using the command line for our tool. Assuming we compiled as bake, we can write bake server and bake client (we'll need to launch at least one client per OS). We can view the state by visiting http://127.0.0.1:5000 in a web browser. To add a patch we can run bake addpatch --name=cb3c2a71, using the SHA1 of the commit, which will try and integrate that patch into the master branch, after all the tests have passed. ### Brandon Simmons # Announcing unagi-chan Today I released version 0.2 of unagi-chan, a haskell library implementing fast and scalable FIFO queues with a nice and familiar API. It is available on hackage and you can install it with: $ cabal install unagi-chan


This version provides a bounded queue variant (and closes issue #1!) that has performance on par with the other variants in the library. This is something I’m somewhat proud of, considering that the standard TBQueue is not only significantly slower than e.g. TQueue, but also was seen to livelock at a fairly low level of concurrency (and so is not included in the benchmark suite).

Here are some example benchmarks. Please do try the new bounded version and see how it works for you.

What follows are a few random thoughts more or less generally-applicable to the design of bounded FIFO queues, especially in a high-level garbage-collected language. These might be obvious, uninteresting, or unintelligible.

# What is Bounding For?

I hadn’t really thought much about this before: a bounded queue limits memory consumption because the queue is restricted from growing beyond some size.

But this isn’t quite right. If for instance we implement a bounded queue by pre-allocating an array of size bounds then a write operation need not consume any additional memory; indeed the value to be written has already been allocated on the heap before the write even begins, and will persist whether the write blocks or returns immediately.

Instead constraining memory usage is a knock-on effect of what we really care about: backpressure; when the ratio of “producers” to their writes is high (the usual scenario), blocking a write may limit memory usage by delaying heap allocations associated with elements for future writes.

So bounded queues with blocking writes let us:

• when threads are “oversubscribed”, transparently indicate to the runtime which work has priority
• limit future resource usage (CPU time and memory) by producer threads

We might also like our bounded queue to support a non-blocking write which returns immediately with success or failure. This might be thought of (depending on the capabilities of your language’s runtime) as more general than a blocking write, but it also supports a distinctly different notion of bounding, that is bounding message latency: a producer may choose to drop messages when a consumer falls behind, in exchange for lower latency for future writes.

# Unagi.Bounded Implementation Ideas

Trying to unpack the ideas above helped in a few ways when designing Unagi.Bounded. Here are a few observations I made.

## We need not block before “writing”

When implementing blocking writes, my intuition was to (when the queue is “full”) have writers block before “making the message available” (whatever that means for your implementation). For Unagi that means blocking on an MVar, and then writing a message to an assigned array index.

But this ordering presents a couple of problems: first, we need to be able to handle async exceptions raised during writer blocking; if its message isn’t yet “in place” then we need to somehow coordinate with the reader that would have received this message, telling it to retry.

By unpacking the purpose of bounding it became clear that we’re free to block at any point during the write (because the write per se does not have the memory-usage implications we originally naively assumed it had), so in Unagi.Bounded writes proceed exactly like in our other variants, until the end of the writeChan, at which point we decide when to block.

This is certainly also better for performance: if a wave of readers comes along, they need not wait (themselves blocking) for previously blocked writers to make their messages available.

One hairy detail from this approach: an async exception raised in a blocked writer does not cause that write to be aborted; i.e. once entered, writeChan always succeeds. Reasoning in terms of linearizability this only affects situations in which a writer thread is known-blocked and we would like to abort that write.

## Fine-grained writer unblocking in probably unnecessary and harmful

In Unagi.Bounded I relax the bounds constraint to “somewhere between bounds and bounds*2”. This allows me to eliminate a lot of coordination between readers and writers by using a single reader to unblock up to bounds number of writers. This constraint (along with the constraint that bounds be a power of two, for fast modulo) seemed like something everyone could live with.

I also guess that this “cohort unblocking” behavior could result in some nicer stride behavior, with more consecutive non-blocking reads and writes, rather than having a situation where the queue is almost always either completely full or empty.

# One-shot MVars and Semaphores

This has nothing to do with queues, but just a place to put this observation: garbage-collected languages permit some interesting non-traditional concurrency patterns. For instance I use MVars and IORefs that only ever go from empty to full, or follow a single linear progression of three or four states in their lifetime. Often it’s easier to design algorithms this way, rather than by using long-lived mutable variables (for instance I struggled to come up with a blocking bounded queue design that used a circular buffer which could be made async-exception-safe).

Similarly the CAS operation (which I get exported from atomic-primops) turns out to be surprisingly versatile far beyond the traditional read/CAS/retry loop, and to have very useful semantics when used on short-lived variables. For instance throughout unagi-chan I do both of the following:

• CAS without inspecting the return value, content that we or any other competing thread succeeded.

• CAS using a known initial state, avoiding an initial read

# New website layout

After 10 years I finally got around to re-decorating my website. One reason was ICFP, where just too many people told me that I don’t look like on my old website any more (which is very true). Another reason was that I was visting my brother, who is very good at web design (check out his portfolio), who could help me a bit.

I wanted something practical and maybe a bit staid, so I drew inspiration from typical Latex typography, and also from Edward Z. Yang’s blog: A serif font (Utopia) for the main body, justified and hyphenated text. Large section headers in a knobbly bold sans-serif font (Latin Modern Sans, which reasonably resembles Computer Modern). To intensify that impression, I put the main text on a white box that lies – like a paper – on the background. As a special gimmic the per-page navigation (or, in the case of the blog, the list of categories) is marked up like a figure in a paper.

Of course this would be very dire without a suitable background. I really like the procedural art by Jared Tarbell, espcially substrate and interAggregate. Both have been turned into screensavers shipped with xscreensaver, so I hacked the substrate code to generate a seamless tile and took a screenshot of the result. I could not make up my mind yet how dense it has to be to look good, so I for every page I randomly pick one of six variants randomly for now.

I simplified the navigation a bit. The old News section has been removed recently already. The Links section is gone – I guess link lists on homepages are so 90s. The section Contact and About me are merged and awaiting some cleanup. The link to the satire news Heisse News is demoted to a mention on the Contents section.

This hopefully helps to make the site navigatable on mobile devices (the old homepage was unusable). CSS media queries adjust the layout slightly on narrow screens, and separately for print devices.

Being the nostaltic I am, I still keep the old design, as well as the two designs before that, around and commented their history.

# ghc-heap-view for GHC 7.8

Since the last release of ghc-heap-view, which was compatible with GHC-7.6, I got 8 requests for a GHC-7.8 compatible version. I started working on it in January, but got stuck and then kept putting it off.

Today, I got the ninths request, and I did not want to wait for the tenth, so I finally finished the work and you can use the new ghc-heap-view-0.5.2 with GHC-7.8.

I used this chance to migrate its source repository from Darcs to git (mirrored on GitHub), so maybe this means that when 7.10 comes out, the requests to update it come with working patches :-). I also added a small test script so that travis can check it:

I did not test it very thoroughly yet. In particular, I did not test whether ghc-vis works as expected.

I still think that the low-level interface that ghc-heap-view creates using custom Cmm code should move into GHC itself, so that it does not break that easily, but I still did not get around to propose a patch for that.

# Weight-biased leftist heaps verified in Haskell using dependent types

In January I announced my implementation of weight-biased leftist heaps verified with dependent types in Agda. This was part of my work on a paper submitted to CAV’14 conference. The paper got rejected and I decided not to resubmit it anywhere else. At this year’s ICFP listening to Stephanie Weirich’s keynote speech motivated me to finally port that implementation to Haskell, something that I had planned for a couple of months now. You can take a look at the result on github. Here I want to share some of my experiences and insights.

My overall impression is that porting from Agda to Haskell turned out to be fairly straightforward. It was definitely not a complete rewrite. More like syntax adjustments here and there. There were of course some surprises and bumps along the way but nothing too problematic. More precise details are given in the code comments.

When it comes to programming with dependent types Agda, being a fully-fledged dependently-typed language, beats Haskell in many aspects:

• Agda has the same language for terms and types. Haskell separates these languages, which means that if I want to have addition for natural numbers then I need to have two separate definitions for terms and types. Moreover, to tie types and terms together I need singleton types. And once I have singleton types then I need to write third definition of addition that works on singletons. All of this is troublesome to write and use. (This tedious process can be automated by  using singletons package.)
• interactive agda-mode for Emacs makes writing code much simpler in Agda. Here I was porting code that was already written so having an interactive Emacs mode for Haskell was not at all important. But if I were to write all that dependently-typed code from scratch in Haskell this would be painful. We definitely need better tools for dependently-typed programming in Haskell.
• Agda admits Unicode identifiers. This allows me to have type constructors like ≥ or variables like p≥b. In Haskell I have GEq and pgeb, respectively. I find that less readable. (This is very subjective.)
• Agda has implicit arguments that can be deduced from types. Haskell does not, which makes some function calls more difficult. Surprisingly that was not as huge problem as I initially thought it will be.
• Agda is total, while Haskell is not. Since there are bottoms in Haskell it is not sound as a logic. In other words we can prove false eg. by using undefined.

The list is noticeably shorter:

• Haskell has much better term-level syntax. In many places this resulted in significantly shorter code than in Agda.
• Haskell is not total. As stated earlier this has its drawbacks but it also has a good side: we don’t need to struggle with convincing the termination checker that our code does actually terminate. This was painful in Agda since it required using sized types.
• Haskell’s gcastWith function is much better than Agda’s subst. Both these functions allow type-safe casts given the proof that the cast is safe. The difference is that Agda’s subst requires more explicit arguments (as I noted earlier the opposite is usually the case) and restricts the cast to the last type parameter (Haskell allows cast for any type parameter).

# Summary

While the list of wins is longer for Agda than it is for Haskell I’m actually very happy with Haskell’s performance in this task. The verification in Haskell is as powerful as it is in Agda. No compromises required.

It’s worth remarking that my implementation works with GHC 7.6, so you don’t need the latest fancy type-level features like closed type families. The really essential part are the promoted data types.

## October 06, 2014

### Yesod Web Framework

Haskell's Prelude is changing to favor using Foldable/Traversable instead of just lists. Many Haskellers are concerned that upcoming changes to the Prelude could

• break existing code
• make maintaining code more difficult
• decrease beginner friendliness

Lets discuss these concerns

## Stability and the Prelude design space

Step 3: Incorporate feedback

I expect that will result in a significant number of revisions, and perhaps significant departures from the original design. Note that the classy-prelude package is following the steps above, and has had 38 revisions on Hackage so far.

As a contributor to and user of classy-prelude, I wanted to point out something about this statement. Most of these revisions are minor and backwards compatible consisting of bug-fixes or something like adding a non-default implementation of a typeclass method or an additional typeclass instance. A better data point is the number of major revision releases. classy-prelude is at release 0.10 now, so that would be 10.

The classy-prelude work has gone in that direction, and I wish them luck, but the significant changes they've already iterated through suggest the design space is quite large.

Of these 10 changes, I would only consider one to be major and represent any kind of change in the design space: the 0.6 release basing classy-prelude on the mono-traversable package. Some of the version bumps represent a change in code organization between mono-traversable and classy-prelude. One change that required a bump is a change to the type signature of mapM_ that should be made in the Prelude but probably never will because it will break existing code. The major change in the 0.6 release is very similar to the upcoming changes in the Prelude, except that classy-prelude (via mono-traversable) works on monomorphic structures such as Text in addition to polymorphic structures. So I would not consider the design space to be large for classy-prelude or for other prelude replacements. classy-prelude before 0.6 and most other Prelude replacements or type-class conveniences have not worked out very well. There is only 1 design that has worked well (incorporating Foldable and Traversable), and that is the design being incorporated into the new Prelude.

We already have that problem with the Control.Arrow module, which (as far as most people use it), is just a pile of tuple combinators. But unlike other tuple combinators, these are ones whose type signature can't be understood. When I want to use &&& or ***

I want to point out that classy-prelude solved some of this issue by exporting Data.BiFunctor instead of functions from Data.Arrow, which required a version bump from 0.9.x to 0.10. I also want to point out that these kinds of arguments are straw man arguments. Every proposed abstraction to Haskell has its own advantages and dis-advantages. Because of the need for backwards compatibility, we are going to be able to point to abstractions in the Prelude that are not being used in the right way for a long time. However, Foldable/Traversable is the only serious contender for abstracting over data structures in Haskell. It has stood the test of time so far, but it has not seen a lot of use yet because everyone is initially directed to just use lists for everything, and the next instinct when using other data structures in the current environment is to use qualified names.

## Iterating the new design

One proposed remedy for dealing with change is trying to release the new Prelude in an iterative way. This could be a good idea, but in practice it is very difficult to implement: most users are still going to import Prelude rather than trying out something different and giving their feedback. A better approach than holding it back is to use a design that makes it easier for new releases to make backwards incompatible changes. One approach to this could be at the package level the way that base-compat operates. Another approach that could be useful to library authors is incorporate versioning at the module level.

Something to keep in mind though is that because the new Prelude needs to try to work with the old Prelude, there are not that many options in the design space. classy-prelude has had the luxury of being able to re-think every Haskell wart. So it was able to remove all partial functions and use Text instead of String in many places. But that process is very difficult for the actual Prelude, which is severly constrained.

## Why change? Module qualified names and generic code.

The motivation for classy-prelude was to confront one of Haskell's most glaring warts: name-spacing and the need to qualify functions. We could certainly have our IDE automatically write import statements, but we still end up with needing to use module qualified names. This isn't really an acceptable way to program. I have not seen another language where this extra line noise is considered good style. For Haskell to move forward and be as convenient to use as other programming languages, there are 2 solutions I know of.

1. change the language
2. make it convenient to write generic code

Changing the language so that module qualification is not needed is arguably a much better approach. This is the case in Object-Oriented languages, and possible in languages very similar to Haskell such as Frege that figure out how to disambiguate a function based on the data type being used. I think this would be a great change to Haskell, but the idea was rejected by Simon Peyton Jones himself during the discussion on fixing Haskell records because it is not compatible with how Haskell's type system operates today. Simon did propose Type directed name resolution which I always though was a great idea, but that proposal was not able to get off the ground in part because changing Haskell's dot operator proved too controversial.

So the only practical option I know of is to focus on #2. Being able to write generic code is an important issue in of itself. Programmers in most other mainstream languages write code that operates on multiple data structures of a particular shape, but Haskell programmers are still specializing a lot of their interfaces.

## Lists are holding Haskell back

It is taken by many to be a truism that programming everything with lists makes things simpler or at least easier for new Haskell programmers. I have found this statement to be no different than 99% of things given the glorious "simple" label: the "simplicity" is not extensible, does not even live up to its original use case, and ends up creating its own incidental complexity.

I used to frequently warp the functions I wrote to fit the mold of Haskell's list. Now that I use classy-prelude I think about the data structure that is needed. Or often I start with a list, eventually discover that something such as appending is needed, and I am able to quickly change the function to operate on a different data structure.

Using an associative list is an extreme example of using the wrong data structure where lookup is O(n) instead of constant or O(log(n)). But by warping a function I am really talking about writing a function in a way to reduce list appends or doing a double reverse instead of using a more natural DList or a Seq. This warping process probably involves performing recursion by hand instead of re-using higher-order functions. As a library developer, I would like to start exposing interfaces that allow my users to use different data structures, but I know that it is also going to cause some inconvenience because of the current state of the Prelude.

Neil writes that he had an opposite experience:

I have taken over a project which made extensive use of the generalised traverse and sequence functions. Yes, the code was concise, but it was read-only, and even then, required me to "trust" that the compiler and libraries snapped together properly.

This kind of report is very worrying and it is something we should take very seriously. Any you certainly cannot tell someone that their actual experience was wrong. However, it is human nature to over-generalize our experiences just as it was the nature of the code author in question to over-generalize functions. In order to have a productive discussion about this, we need to see (at least a sample or example of) the code in question. Otherwise we are only left guessing at what mistakes the author made.

In general I would suggest specializing your application code to lists or other specific structures (this can always be done with type signatures) until there is a need for more abstraction, and that could be a big part of the problem in this case.

It would be really great to start having these discussions now based off of actual code examples and to have a community guide that explains the missing common sense for how to use abstractions appropriately.

The uncertainty discussed here is the heart of the matter, and talking about what is good for beginners is largely a distraction.

## Haskell is for programmers first, students in their first class in functional programming second

This might sound hostile to beginners. In fact, the opposite is the case! The programming languages that are taught to beginners are the very same ones that are used in industry. The first programming language taught to me at school was C, and it was not because it was optimized out of the box for beginners.

So the way to attract more beginners is simply to become more popular with industry. Haskell has already reached the growth rate limit of a language that is popular for the sake of learning about programming.

That being said, we do need to do a lot of things to make the experience as nice as possible for beginners, and an alternative prelude for beginners could be a great idea. But making this the default that holds back progress hurts everyone in the long-run.

## It isn't Beginner vs. Expert anyways

The most up-voted comment on Reddit states:

What other languages have different standard libraries for people learning and people not learning? What could be a greater way to confuse learners, waste their time and make them think this language is a joke than presenting them with n different standard libraries?

I will add my own assertion here: Haskell is confusing today because the Prelude is in a backward state that no longer reflects several important best practices (for example, Neil had to create the Safe package!) and it does not hold up once you write more than a trivial amount of code in your module.

We also need to keep in mind that using Haskell can be difficult for beginners precisely for some of the same reasons that it is painful for experts. And the same reason these new changes will be more difficult for beginners (mental overhead of using the Foldable/Traversable abstraction instead of just lists) will also create difficulties for non-beginners.

So the changes to the Prelude are going to make some aspects a better for beginners or existing users and others harder.

If we really want to improve Haskell for beginners we need to stop creating a false dichotomy between beginner and expert. We also need to empower committees to make forward progress rather than letting minority objections stall all forward progress.

## Improving the library process

Some have expressed being surprised to learn about what is going on in the Haskell libraries committee at a late stage. On the other hand, I doubt that hearing more objections earlier would actually be helpful, because the libraries process has not learned from GHC.

Take a look at the extensive documentation around proposed changes to improve Haskell's record system. Creating a solution to Haskell's problem with records was a very difficult process. There were several designs that looked good in a rough sketch form, but that had issues when explored in thorough detail on the wiki. More importantly, the wiki helped summarize and explain a discussion that was extremely laborious to read and impossible to understand by looking through a mail list.

Before creating a non-minor change to GHC, there is a convention of creating a wiki page (certainly it isn't always done). At a minimum there is a Trac ticket that can serve a somewhat similar purpose.

My suggestion is that the libraries process use the existing GHC or Haskell wiki to create a page for every non-minor change. The page for Foldable/Traversable would explain

• what is being changed
• which changes create a breakage
• how type errors have changed
• how library code is affected
• how user code is affected
• best practices for using Foldable/Traversable

Right now we are stuck in a loop of repeating the same points that were already made in the original discussion of the proposal. Given a wiki page, Neil and others could point out the down-sides of the proposal with actual code and have their voice heard in a productive way that builds up our body of knowledge.

# Help me name a game keybinding library.

(I could say some meta-commentary about how I haven't been blogging much and I've made a resolution to get back to it and it'll be good for me and so on, but I think I've done that too many times already, so let's get right to the actual thing...)

When I wrote Cubes (a browser-based “Minecraft-like”), one of the components I built was a facility for key-bindings — that is, allowing the user to choose which keys (or mouse buttons, or gamepad buttons) to assign to which functions (move left, fly up, place block, etc.) and then generically handling calling the right functions when the event occurs.

Now, I want to use that in some other programs. But in order for it to exist as a separate library, it needs a name. I have failed to think of any good ones for months. Suggestions wanted.

Preferably, the name should hint at that it supports the gamepad API as well as keyboard and mouse. It should not end in “.js” because cliche. Also for reference, the other library that arose out of Cubes development I named Measviz (which I chose as a portmanteau and for having almost zero existing usage according to web searches).

(The working draft name is web-input-mapper, which is fairly descriptive but also thoroughly clunky.)

This is a space where you can leave comments describing errata in any of my published papers. Please include bibliographic details of the paper, and a link to where the paper appears on my web page if you can. Thank you to Dave Della Costa for volunteering the first entry and inspiring the creation of this post, and to all who utilise this space to record and correct my errors for posterity.

# Programming in the real world

One of the things that annoys me about Object Oriented Programming is how it's often suggested that it models the "real world". Frequently tutorials will start with creating an object modelling a chair, and through inheritance you'll be able to build up composable aspects of chairs: different numbers of legs, different colours, different designs. Sometimes they use tables rather than chairs. This is lovely, but it actually has everything to do with data modelling through inheritance, decomposition, abstraction and encapsulation, and almost nothing to do with Object Orientation: the key is that these chairs have no modifying methods on them. If they have any methods at all then they'll be for things like getting the number of legs or the colour, or volume or something - something that is basically fixed once the object is instantiated. At this point in such tutorials I'd probably claim this is not actually programming yet: all that's been achieved so far is that we've assigned some semantics to some numbers held in memory and we can write some numbers in memory. Programming is when we manipulate numbers: that involves reading and writing numbers.

The problem then is that Object Orientation immediately stops being about modelling the "real world" as soon as we can modify memory. If we think about how we actually would go about getting a chair made for us, it could go a bit like this:

1. Go see your local carpenter,
2. Have a discussion with them about the style and type of chair you'd like,
3. They make the chair and give it to you in return for payment,
4. You take the chair home,
5. You decide you don't like the colour so you take it to your garage and repaint it yourself.

It should be clear that the inanimate object (the chair) is the odd one out here. Everything else is done by actors that have their own state, mainly act asynchronously, and can communicate with other actors through protocols - protocols that do not involve sharing mutable state (e.g. if I say something to you, that speech is immutable; you can't change what I've said (though you could choose to mishear me!)). At no point is any state of any actor actually exposed to another actor: I may share with you what I'm currently thinking, and you can try to influence me, but we don't exactly need a mutex around memory in my brain because YOU'RE NOT GETTING IN THERE!

If you tried modelling this sort of thing through Object Orientation without actors then you'd end up with your own thread doing all the work: it'd be you, it'd be the carpenter and it'd be the chair, maybe all at once. If your carpenter is in fact a growing business with a receptionist, a design team and a billing department your thread would be playing those roles too and would probably have to use locks to avoid unexpected interactions with other threads doing the same commissioning-receptioning-designing-constructing-delivery-repainting dance. And all the time, whilst you're doing the carpentry yourself, you'd could easily have your own thoughts, feelings, aspirations and regrets all on the same stack for your carpenter-alias to mess with.

Thus Object Orientation causes multiple personality disorder.

So in my view, the way Object Orientation gets introduced tends to be more like "useful tools for modelling data". But the OO approach to manipulating that data goes wrong as soon as you try to model the animated real world. Firstly it has nothing to say about separating out threads to self-contained actors (but try this in a language or on a platform without green-threads, or without the ability to preempt threads and you can quickly hit pain), and secondly even if you do have actors, OO encourages the sharing of mutable data rather than passing around either immutable data or copies of data. Yes, good programming discipline can result in sane designs and a successful result, but it's not a core aspect of the OOP mantra.

So, OOP has nothing good to say on manipulating data at all - it either says nothing or it encourages silly ideas like using locks. The data modelling bits are fine, but I think they're a broader concept beyond the confines of OOP. What else does OOP get you? An arbitrary restriction on the receiver of any method. That's about it. It's thanks to this restriction that writing combinators like cons on a list library in an OO language is really painful.

This week Erik Meijer wrote an article called The Curse of the Excluded Middle: "Mostly functional" programming does not work. After an introduction, we get onto The Problem, which (paraphrasing) is that languages that are mainly imperative but offer some features from pure functional languages are not as safe as pure functional languages.

The first three examples, from C# are certainly surprising to me (I barely know any C# at all though). The first two problems come from trying to compose side-effecting stuff with laziness. In the first case it's not clear that the problem is with the IO operation (printing things out) or actually with the laziness, but more the odd behaviour of the Where operator (presumably the implementation of Where doesn't know that a Cartesian product isn't necessary, but surely any normal monadic/list-comprehension implementation wouldn't have this problem?). The second case is certainly the terrifying composition of laziness with throwing exceptions and thus the exception having the potential to pop out anywhere where the lazy expression gets forced. However, if you know the Select operator is lazy, it's not really that surprising. It's arguably piss-poor language design that there's nothing there to help you, but C# doesn't have checked exceptions; apparently programmers don't like having to deal with errors so you reap what you sow. The third case is how C# has a nice using feature which binds a resource to a lexical scope. But if you construct a closure capturing the resource and then send it out of that lexical scope then using goes wrong (it will still discard the resource even though there's a reference to it within the closure which remains in-scope). This is certainly piss-poor language design: if the closure captures stuff from your lexical scope and you're not reference counting (or equivalent) your lexical scope then YOU'VE DONE IT WRONG. This is as bad as in C allocating stuff on your stack and then returning pointers to it.

Next he moves on somewhat tangentially to the point that if object creation is an observable action then you can't optimise it out. I'm not sure anyone outside a pure functional language specialist would ever want object creation to be optimised out, but the point is that if your constructor has side effects or can in any other way be observed then you can't have your language runtime do memoization of object creation. Doing side effects in object constructors has long been discouraged: I first read that back in the Effective Java book about a decade ago and I'm sure it wasn't exactly a ground-breaking piece of advice then.

So far then we have that side effects which are untracked have the potential to be bad: whether it's printing things out, or throwing exceptions, or discarding resources early, or preventing compiler optimisations. But next I feel the article goes a bit wrong. He first moves onto how channels in C⍵ can store state so they're not pure either, thus bad. And then goes onto how in Erlang you have the same problem as you're just modelling mutable state in actors:

Note how this Erlang actor basically encodes an object with dynamic method dispatch using the pattern-matching, message-sending, and recursion primitives of the language, which you may happily leverage to implement mutable references, sabotaging the fact that the Erlang language does not natively expose mutable state.

This is wrong: you cannot implement mutable references in Erlang. Data is immutable in Erlang so if you send some value out of an actor, you are sending that value. Not a reference to a value or variable. Even if you create a closure and send that out of the actor, the closure is capturing those values as they exist at that point in time. If you have received a value sent to you from an actor, you may use it to create other values, but doing so does not affect the "original", and similarly, the actor itself can continue to modify its own state, but it does not affect the values it sent to you. Yes, you can use Erlang actors to model objects. But an actor's own modifications of its state cannot be observed as side effects on values you've previously retrieved from that actor, and vice versa.

The reference you have to an actor is a process identifier (also immutable) which does not present any information itself about the state of the actor. Through that, you can send messages to an actor and test whether or not the actor is still alive, but that is all. And in any case, where has the sudden objection to mutable state come from? State is just a catamorphism on prior inputs. State is not the problem: unconstrained side effects are the problem. Certainly sharing mutable state is a problem (and you could argue that mutating shared state is a side effect and that it should be tracked statically), but Erlang does not allow for that.

He may have been better off going for an example of opening an file, sending the file handle to another process and then closing the file handle before it's been used (i.e. the same as the third C# example). Except:

1. All file operations can return an error anyway so handling errors in such code is completely normal;
2. In Erlang a file handle is normally an actor itself, so what you're doing is passing around a process identifier. Sending messages to a dead process (once the file is closed) is a normal activity and you can detect if the process has died in normal ways;
3. If you bypass such normal file handling for performance reasons and open the file in "raw" mode then Erlang has a light form of object capabilities in which only the process that opened the file is allowed to use the file handle, so again the use of the file handle would error predictably;
4. The language doesn't have the same C# feature for discarding resources once you return out of a lexical scope. Consequently closing a file is an explicit operation and given the asynchronous concurrent mindset one develops when working in Erlang, it's very likely you'll realise how odd it is to be closing a file handle whilst there's some closure out there which may not have been run yet.

Beyond this, he introduces the Haskell type system and explains that it captures side effects statically. As a result, by bowing to the demands of the type checker, it offers you a proof that if such effects occur, your program will handle them: exceptions will not go uncaught, IO operations are only permitted where the semantics lead to expected outcomes, resources are not used after they're discarded and the compiler can use all these proofs to do all manner of optimisations to your program.

These proofs can certainly be very valuable (though they are no substitute for disciplined, high quality design and careful implementation). Obviously, they don't capture everything though. Particularly relevant for concurrent and distributed programs, they don't capture sufficient side effects to allow for a proof of the absence of deadlocks. Haskell standard libraries contain channels and semaphores which can easily be used to sporadically end up with a deadlock between processes. A deadlock is definitely a side effect: the effect is the program probably stops working. The cause is an insufficient use of locks to control the scheduler (be it scheduling of OS threads or language runtime scheduling of green threads).

More broadly, the proof a type checker offers is that the specification you've provided (type signatures) is not violated by its inferences about your code. Until the type checker allows "and makes progress" as part of a specification, Haskell itself is no safer than any other language that claims to be "mostly functional".

# How to Rewrite the Prelude

Summary: Rewriting the Prelude should be done in a separate package and take a few years to complete, hopefully building a consensus and improving the result.

My recent suggestion that the Prelude shouldn't be rewritten to incorporate Foldable/Traversable generated a lot of feedback. It's clear some people strongly agree, and some strongly disagree. So instead of continuing the argument, in this post I'm going to describe how I think the people who want to change the Prelude should go about it.

There were a lot of comments that we should optimise Haskell for practitioners, not beginners - in particular that there should be Prelude (for advanced users) and Prelude.Simple (for beginners). My personal opinion is that if we have two Prelude's, why not make the beginner one the default one? Switching to the non-default one probably takes one line of moderately advanced syntax (currently import Prelude(); import Prelude.Advanced). For a beginner, writing a simple one line program, that more than doubles the complexity. For a moderate user, that's almost certainly outweighed by lines of LANGUAGE pragmas. (I'm also not sure that I want the "Advanced" Prelude, but that's not relevant to this post.)

Step 1: Write the new Prelude in a package

The first step to proposing a new Prelude should be to write Prelude.Advanced in a separate package on Hackage. Let's assume we have a package base-advanced with a module Prelude.Advanced. The Prelude is not trivial, and translating high-level design goals (e.g. move Foldable into the Prelude) requires some design choices to translate into concrete code.

Step 2: Gain users and feedback

Rather than roll out the Prelude to everyone at once, slowly try and persuade people that your Prelude is superior to the existing one. In doing so, advanced users can start trying out the Prelude, and describing their experiences with it. Does it require Foldable to gain a size method? Does it show up a GHC error message that could be better? Do we need some new type of defaulting? Does profiling give poor results without some manually inserted cost centre? Is a direct list isomorphism a better type class to start from (as F# does with IEnumerable)? Given the number of enthusiastic supporters, this step should be easy.

Step 3: Incorporate feedback

The feedback from step 2 needs incorporating into the package, and potentially into GHC itself. I expect that will result in a significant number of revisions, and perhaps significant departures from the original design. Note that the classy-prelude package is following the steps above, and has had 38 revisions on Hackage so far.

Step 4: Gain acceptance

Before going any further, some critical mass of people need to agree the new way is preferable. There may turn out to be competing proposals, and hopefully as a community we can find consensus and identify the best ideas (as we are currently doing with IO streaming libraries).

Step 5: Move to base

Once we have agreement on what the new Prelude should look like, it should probably be moved into base. At this point, based on all the experience we've got, we can decide whether it becomes Prelude or Prelude.Advanced. At the same time as moving into base, we should decide on the simpler story, but what that decision looks like, will depend on what the proposed Prelude looks like.

The current approach

There are a number of areas that current approach worry me. The change was made directly in the base Prelude, based on some agreement of high-level design decisions, and has already resulted in unexpected additions to the Foldable class. The Prelude will first be exposed to lots of users once GHC 7.10 ships, by which point iterating on feedback will be slow and difficult. The plan for the "beginner" version is to encourage beginners to use the haskell2010 package, which I don't think has been thought out (as someone who tried to use the haskell98 package for a short while, the exception change meant you couldn't use haskell98 and base in the same package, which is basically fatal).

The Haskell community changed Applicative to be a superclass of Monad. That change was simple and thoroughly thought out over a period of years. Warnings were added to GHC, packages were analysed, people experimented with what it would mean, and once decided the change took a number of GHC releases to fully roll out. I consider this change to have been done the right way. In contrast, the complete rewrite of the Prelude is happening far more quickly, and I would argue too quickly. By taking a bit more time hopefully we can come up with something even better.

# GUI - Release of the threepenny-gui library, version 0.5.0.0

I am pleased to announce release of threepenny-gui version 0.5, a cheap and simple library to satisfy your immediate GUI needs in Haskell.

Want to write a small GUI thing but forgot to sacrifice to the giant rubber duck in the sky before trying to install wxHaskell or Gtk2Hs? Then this library is for you! Threepenny is easy to install because it uses the web browser as a display.

The library also has functional reactive programming (FRP) built-in, which makes it a lot easier to write GUI application without getting caught in spaghetti code. For an introduction to FRP, see for example my slides from a tutorial I gave in 2012 (the API is slightly different in Reactive.Threepenny) and the preliminary widget design guide.

Version 0.5 is essentially a maintenance release, allowing for newer versions of the libraries that it depends on. It also incorporates various contributions by other people, including a small Canvas API by Ken Friis Larsen and Carsten König, and a complete set of SVG elements and attributes by Steve Bigham. Many thanks also to Yuval Langer and JP Moresmau.

However, while it’s great that the library begins to grow in breadth, incorporating larger and larger parts of the DOM API, I also feel that the current backend code is unable to cope with this growth. In the next version, I intend to overhaul the server code and put the JavaScript FFI on a more solid footing.

To see Threepenny in action, have a look at the following applications:

 Daniel Austin’s FNIStash Editor for Torchlight 2 inventories. Daniel Mlot’s Stunts Cartography Track Viewer Map viewer for the Stunts racing game.

Get the library here:

Note that the API is still in flux and is likely to change radically in the future. You’ll have to convert frequently or develop against a fixed version.

# Script for migrating from WordPress to Hakyll

As I wrote about in a previous post on converting post from WordPress to Hakyll I couldn’t quite find a tool that met my needs out of the box. I had a look at the source of hakyll-convert but it uses a library for RSS parsing, which sounds good. However, the export of posts from WordPress is in an extended RSS format, among other things it contains the comments of posts. Unfortunately it doesn’t look like the RSS library supports the WordPress extensions, so modifying hakyll-convert to also extract comments seems like a bit more work than I’d like to put into it. Especially since I had a feeling that hacking up something using tagsoup would be quite a bit faster.

I put the resulting script in gist on github. I call it bbwp, which is short for Bye, bye, WordPress.

# Distributivity in Horner’s Rule

This is a continuation of my previous post on Horner’s Rule, and in particular, of the discussion there about distributivity in the datatype-generic version of the Maximum Segment Sum problem:

the essential property behind Horner’s Rule is one of distributivity. In the datatype-generic case, we will model this as follows. We are given an ${(\mathsf{F}\,\alpha)}$-algebra ${(\beta,f)}$ [for a binary shape functor ${\mathsf{F}}$], and a ${\mathsf{M}}$-algebra ${(\beta,k)}$ [for a collection monad ${\mathsf{M}}$]; you might think of these as “datatype-generic product” and “collection sum”, respectively. Then there are two different methods of computing a ${\beta}$ result from an ${\mathsf{F}\,\alpha\,(\mathsf{M}\,\beta)}$ structure: we can either distribute the ${\mathsf{F}\,\alpha}$ structure over the collection(s) of ${\beta}$s, compute the “product” ${f}$ of each structure, and then compute the “sum” ${k}$ of the resulting products; or we can “sum” each collection, then compute the “product” of the resulting structure. Distributivity of “product” over “sum” is the property that these two different methods agree, as illustrated in the following diagram.

For example, with ${f :: \mathsf{F}\,{\mathbb Z}\,{\mathbb Z} \rightarrow {\mathbb Z}}$ adding all the integers in an ${\mathsf{F}}$-structure, and ${k :: \mathsf{M}\,{\mathbb Z} \rightarrow {\mathbb Z}}$ finding the maximum of a (non-empty) collection, the diagram commutes.

There’s a bit of hand-waving above to justify the claim that this is really a kind of distributivity. What does it have to do with the common-or-garden equation

$\displaystyle a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$

stating distributivity of one binary operator over another? That question is the subject of this post.

## Distributing over effects

Recall that ${\delta_2 :: (\mathsf{F}\,\alpha)\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}(\mathsf{F}\,\alpha)}$ distributes the shape functor ${\mathsf{F}}$ over the monad ${\mathsf{M}}$ in its second argument; this is the form of distribution over effects that crops up in the datatype-generic Maximum Segment Sum problem. More generally, this works for any idiom ${\mathsf{M}}$; this will be important below.

Generalizing in another direction, one might think of distributing over an idiom in both arguments of the bifunctor, via an operator ${\delta : \mathsf{F} \cdot (\mathsf{M} \times \mathsf{M}) \mathbin{\stackrel{.}{\to}} \mathsf{M} \cdot \mathsf{F}}$, which is to say, ${\delta_\beta :: \mathsf{F}\,(\mathsf{M}\beta)\,(\mathsf{M}\beta) \rightarrow \mathsf{M}(\mathsf{F}\beta)}$, natural in the ${\beta}$. This is the ${\mathit{bidist}}$ method of the ${\mathit{Bitraversable}}$ subclass of ${\mathit{Bifunctor}}$ that Bruno Oliveira and I used in our Essence of the Iterator Pattern paper; informally, it requires just that ${\mathsf{F}}$ has a finite ordered sequence of “element positions”. Given ${\delta}$, one can define ${\delta_2 = \delta \cdot \mathsf{F}\,\mathit{pure}\,\mathit{id}}$.

That traversability (or equivalently, distributivity over effects) for a bifunctor ${\mathsf{F}}$ is definable for any idiom, not just any monad, means that one can also conveniently define an operator ${\mathit{contents}_{\mathsf{H}} : \mathsf{H} \mathbin{\stackrel{.}{\to}} \mathsf{List}}$ for any traversable unary functor ${\mathsf{H}}$. This is because the constant functor ${\mathsf{K}_{[\beta]}}$ (which takes any ${\alpha}$ to ${[\beta]}$) is an idiom: the ${\mathit{pure}}$ method returns the empty list, and idiomatic application appends two lists. Then one can define

$\displaystyle \mathit{contents}_{\mathsf{H}} = \delta \cdot \mathsf{H}\,\mathit{wrap}$

where ${\mathit{wrap}}$ makes a singleton list. For a traversable bifunctor ${\mathsf{F}}$, we define ${\mathit{contents}_{\mathsf{F}} = \mathit{contents}_{\mathsf{F}\cdot\triangle}}$ where ${\triangle}$ is the diagonal functor; that is, ${\mathit{contents}_{\mathsf{F}} :: \mathsf{F}\,\beta\,\beta \rightarrow [\beta]}$, natural in the ${\beta}$. (No constant functor is a monad, except in trivial categories, so this convenient definition of contents doesn’t work monadically. Of course, one can use a writer monad, but this isn’t quite so convenient, because an additional step is needed to extract the output.)

One important axiom of ${\delta}$ that I made recent use of in a paper with Richard Bird on Effective Reasoning about Effectful Traversals is that it should be “natural in the contents”: it should leave shape unchanged, and depend on contents only up to the extent of their ordering. Say that a natural transformation ${\phi : \mathsf{F} \mathbin{\stackrel{.}{\to}} \mathsf{G}}$ between traversable functors ${\mathsf{F}}$ and ${\mathsf{G}}$ “preserves contents” if ${\mathit{contents}_{\mathsf{G}} \cdot \phi = \mathit{contents}_{\mathsf{F}}}$. Then, in the case of unary functors, the formalization of “naturality in the contents” requires ${\delta}$ to respect content-preserving ${\phi}$:

$\displaystyle \delta_{\mathsf{G}} \cdot \phi = \mathsf{M}\phi \cdot \delta_{\mathsf{F}} : \mathsf{T}\mathsf{M} \mathbin{\stackrel{.}{\to}} \mathsf{M}\mathsf{G}$

In particular, ${\mathit{contents}_{\mathsf{F}} : \mathsf{F} \mathbin{\stackrel{.}{\to}} \mathsf{List}}$ itself preserves contents, and so we expect

$\displaystyle \delta_{\mathsf{List}} \cdot \mathit{contents}_{\mathsf{F}} = \mathsf{M}(\mathit{contents}_{\mathsf{F}}) \cdot \delta_{\mathsf{F}}$

to hold.

## Folding a structure

Happily, the same generic operation ${\mathit{contents}_{\mathsf{F}}}$ provides a datatype-generic means to “fold” over the elements of an ${\mathsf{F}}$-structure. Given a binary operator ${\otimes :: \beta\times\beta \rightarrow \beta}$ and an initial value ${b :: \beta}$, we can define an ${(\mathsf{F}\,\beta)}$-algebra ${(\beta,f)}$—that is, a function ${f :: \mathsf{F}\,\beta\,\beta\rightarrow\beta}$—by

$\displaystyle f = \mathit{foldr}\,(\otimes)\,b \cdot \mathit{contents}_{\mathsf{F}}$

(This is a slight specialization of the presentation of the datatype-generic MSS problem from last time; there we had ${f :: \mathsf{F}\,\alpha\,\beta \rightarrow \beta}$. The specialization arises because we are hoping to define such an ${f}$ given a homogeneous binary operator ${\otimes}$. On the other hand, the introduction of the initial value ${b}$ is no specialization, as we needed such a value for the “product” of an empty “segment” anyway.)

Incidentally, I believe that this “generic folding” construction is exactly what is intended in Ross Paterson’s Data.Foldable library.

## Summing a collection

The other ingredient we need is an ${\mathsf{M}}$-algebra ${(\beta,k)}$. We already decided last time to

stick to reductions${k}$s of the form ${\oplus/}$ for associative binary operator ${{\oplus} :: \beta \times \beta \rightarrow \beta}$; then we also have distribution over choice: ${\oplus / (x \mathbin{\underline{\smash{\mathit{mplus}}}} y) = (\oplus/x) \oplus (\oplus/y)}$. Note also that we prohibited empty collections in ${\mathsf{M}}$, so we do not need a unit for ${\oplus}$.

On account of ${\oplus/}$ being an algebra for the collection monad ${\mathsf{M}}$, we also get a singleton rule ${\oplus/ \cdot \mathit{return} = \mathit{id}}$.

## Reduction to distributivity for lists

One of the take-home messages in the Effective Reasoning about Effectful Traversals paper is that it helps to reduce a traversal problem for datatypes in general to a more specific one about lists, exploiting the “naturality in contents” property of traversability. We’ll use that tactic for the distributivity property in the datatype-generic version Horner’s Rule.

In this diagram, the perimeter is the commuting diagram given at the start of this post—the diagram we have to justify. Face (1) is the definition of ${\delta_2}$ in terms of ${\delta}$. Faces (2) and (3) are the expansion of ${f}$ as generic folding of an ${\mathsf{F}}$-structure. Face (4) follows from ${\oplus/}$ being an ${\mathsf{M}}$-algebra, and hence being a left-inverse of ${\mathit{return}}$. Face (5) is an instance of the naturality property of ${\mathit{contents}_{\mathsf{F}} : \mathsf{F}\triangle \mathbin{\stackrel{.}{\to}} \mathsf{List}}$. Face (6) is the property that ${\delta}$ respects the contents-preserving transformation ${\mathit{contents}_{\mathsf{F}}}$. Therefore, the whole diagram commutes if Face (7) does—so let’s look at Face (7)!

## Distributivity for lists

Here’s Face (7) again:

Demonstrating that this diagram commutes is not too difficult, because both sides turn out to be list folds.

Around the left and bottom edges, we have a fold ${\mathit{foldr}\,(\otimes)\,b}$ after a map ${\mathsf{List}\,(\oplus)}$, which automatically fuses to ${\mathit{foldr}\,(\odot)\,b}$, where ${\odot}$ is defined by

$\displaystyle x \odot a = (\oplus/x) \otimes a$

or, pointlessly,

$\displaystyle (\odot) = (\otimes) \cdot (\oplus/) \times \mathit{id}$

Around the top and right edges we have the composition ${\oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \cdot \delta_{\mathsf{List}}}$. If we can write ${\delta_{\mathsf{List}}}$ as an instance of ${\mathit{foldr}}$, we can then use the fusion law for ${\mathit{foldr}}$

$\displaystyle h \cdot \mathit{foldr}\,f\,e = \mathit{foldr}\,f'\,e' \;\Leftarrow\; h\,e=e' \land h \cdot f = f' \cdot \mathit{id}\times h$

to prove that this composition equals ${\mathit{foldr}\,(\odot)\,b}$.

In fact, there are various equivalent ways of writing ${\delta_{\mathsf{List}}}$ as an instance of ${\mathit{foldr}}$. The definition given by Conor McBride and Ross Paterson in their original paper on idioms looked like the identity function, but with added idiomness:

$\displaystyle \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{pure}\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathit{pure}\,(:) \circledast \mathit{mb} \circledast \delta_{\mathsf{List}}\,\mathit{mbs} \end{array}$

In the special case that the idiom is a monad, it can be written in terms of ${\mathit{liftM}_0}$ (aka ${\mathit{return}}$) and ${\mathit{liftM}_2}$:

$\displaystyle \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{liftM}_0\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathit{liftM}_2\,(:)\,\mathit{mb}\,(\delta_{\mathsf{List}}\,\mathit{mbs}) \end{array}$

But we’ll use a third definition:

$\displaystyle \begin{array}{lcl} \delta_{\mathsf{List}}\,[\,] &=& \mathit{return}\,[\,] \\ \delta_{\mathsf{List}}\,(\mathit{mb} : \mathit{mbs}) &=& \mathsf{M}(:)\,(\mathit{cp}\,(\mathit{mb}, \delta_{\mathsf{List}}\,\mathit{mbs})) \end{array}$

where

$\displaystyle \begin{array}{lcl} \mathit{cp} &::& \mathsf{M}\,\alpha \times \mathsf{M}\,\beta \rightarrow \mathsf{M}(\alpha\times\beta) \\ \mathit{cp}\,(x,y) &=& \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a,b) \} \end{array}$

That is,

$\displaystyle \delta_{\mathsf{List}} = \mathit{foldr}\,(\mathsf{M}(:)\cdot\mathit{cp})\,(\mathit{return}\,[\,])$

Now, for the base case we have

$\displaystyle \oplus/\,(\mathsf{M}(\mathit{foldr}\,(\otimes)\,b)\,(\mathit{return}\,[\,])) = \oplus/\,(\mathit{return}\,(\mathit{foldr}\,(\otimes)\,b\,[\,])) = \oplus/\,(\mathit{return}\,b) = b$

as required. For the inductive step, we have:

$\displaystyle \begin{array}{ll} & \oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \cdot \mathsf{M}(:) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{functors} \} \\ & \oplus/ \cdot \mathsf{M}(\mathit{foldr}\,(\otimes)\,b \cdot (:)) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{evaluation for~} \mathit{foldr} \} \\ & \oplus/ \cdot \mathsf{M}((\otimes) \cdot \mathit{id}\times\mathit{foldr}\,(\otimes)\,b) \cdot \mathit{cp} \\ = & \qquad \{ \mbox{functors; naturality of~} \mathit{cp} \} \\ & \oplus/ \cdot \mathsf{M}(\otimes) \cdot \mathit{cp} \cdot \mathsf{M}\mathit{id}\times\mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \\ = & \qquad \{ \mbox{distributivity for~} \mathit{cp} \mbox{: see below} \} \\ & (\otimes) \cdot (\oplus/)\times(\oplus/) \cdot \mathsf{M}\mathit{id}\times\mathsf{M}(\mathit{foldr}\,(\otimes)\,b) \\ = & \qquad \{ \mbox{functors} \} \\ & (\otimes) \cdot (\oplus/)\times\mathit{id} \cdot \mathit{id}\times\mathsf{M}(\oplus/\cdot\mathit{foldr}\,(\otimes)\,b) \end{array}$

which completes the fusion proof, modulo the wish about distributivity for ${\mathit{cp}}$:

$\displaystyle \oplus/ \cdot \mathsf{M}(\otimes) \cdot \mathit{cp} = (\otimes) \cdot (\oplus/)\times(\oplus/)$

## Distributivity for cartesian product

As for that wish about distributivity for ${\mathit{cp}}$:

$\displaystyle \begin{array}{ll} & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}(\otimes) \mathbin{\hbox{\footnotesize\}} \mathit{cp}\,(x,y) \\ = & \qquad \{ \mbox{definition of~} \mathit{cp} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}(\otimes) \mathbin{\hbox{\footnotesize\}} \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a,b) \,\} \\ = & \qquad \{ \mbox{map over~} \mathbf{do} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathbf{do}\,\{\,a \leftarrow x \mathbin{;} b \leftarrow y \mathbin{;} \mathit{return}\,(a \otimes b) \,\} \\ = & \qquad \{ \mbox{expanding~} \mathbf{do} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathit{join} \mathbin{\hbox{\footnotesize\}} \mathsf{M}\,(\lambda a \mathbin{.} \mathsf{M}\,(a\otimes)\,y)\,x \\ = & \qquad \{ \oplus/ \mbox{~is an~} \mathsf{M} \mbox{-algebra} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}(\oplus/) \mathbin{\hbox{\footnotesize\}} \mathsf{M}\,(\lambda a \mathbin{.} \mathsf{M}\,(a\otimes)\,y)\,x \\ = & \qquad \{ \mbox{functors} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}\,(\lambda a \mathbin{.} \oplus/(\mathsf{M}\,(a\otimes)\,y))\,x \\ = & \qquad \{ \mbox{distributivity for collections: see below} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}\,(\lambda a \mathbin{.} a \otimes (\oplus/\,y))\,x \\ = & \qquad \{ \mbox{sectioning} \} \\ & \oplus/ \mathbin{\hbox{\footnotesize\}} \mathsf{M}\,(\otimes (\oplus/\,y))\,x \\ = & \qquad \{ \mbox{distributivity for collections again} \} \\ & (\otimes (\oplus/\,y))\,(\oplus/\,x) \\ = & \qquad \{ \mbox{sectioning} \} \\ & (\oplus/\,x) \otimes (\oplus/\,y) \\ = & \qquad \{ \mbox{eta-expansion} \} \\ & (\otimes) \mathbin{\hbox{\footnotesize\}} (\oplus/ \times \oplus/) \mathbin{\hbox{\footnotesize\}} (x,y) \\ \end{array}$

which discharges the proof obligation about distributivity for cartesian product, but again modulo two symmetric wishes about distributivity for collections:

$\displaystyle \begin{array}{lcl} \oplus/ \cdot \mathsf{M}(a\otimes) &=& (a\otimes) \cdot \oplus/ \\ \oplus/ \cdot \mathsf{M}(\otimes b) &=& (\otimes b) \cdot \oplus/ \\ \end{array}$

## Distributivity for collections

Finally, the proof obligations about distributivity for collections are easily discharged, by induction over the size of the (finite!) collection, provided that the binary operator ${\otimes}$ distributes over ${\oplus}$ in the familiar sense. The base case is for a singleton collection, ie in the image of ${\mathit{return}}$ (because we disallowed empty collections); this case follows from the fact that ${\oplus/}$ is an ${\mathsf{M}}$-algebra. The inductive step is for a collection of the form ${u \mathbin{\underline{\smash{\mathit{mplus}}}} v}$ with ${u,v}$ both strictly smaller than the whole (so, if the monad is idempotent, disjoint, or at least not nested); this requires the distribution of the algebra over choice ${\oplus / (u \mathbin{\underline{\smash{\mathit{mplus}}}} v) = (\oplus/u) \oplus (\oplus/v)}$, together with the familiar distribution of ${\otimes}$ over ${\oplus}$.

## Summary

So, the datatype-generic distributivity for ${\mathsf{F}}$-structures of collections that we used for the Maximum Segment Sum problem reduced to distributivity for lists of collections, which reduced to the cartesian product of collections, which reduced to that for pairs. That’s a much deeper hierarchy than I was expecting; can it be streamlined?

# Morality and temptation

Inspired by Bob Harper’s recent postings, I too have a confession to make. I know what is morally right; but sometimes the temptation is too great, and my resolve is weak, and I lapse. Fast and loose reasoning may excuse me, but my conscience would be clearer if I could remain pure in the first place.

## Initial algebras, final coalgebras

We know and love initial algebras, because of the ease of reasoning with their universal properties. We can tell a simple story about recursive programs, solely in terms of sets and total functions. As we discussed in the previous post, given a functor ${\mathsf{F} : \mathbb{S}\mathrm{et} \rightarrow \mathbb{S}\mathrm{et}}$, an ${\mathsf{F}}$-algebra is a pair ${(X,f)}$ consisting of an object ${X}$ and an arrow ${f : \mathsf{F}(X) \rightarrow X}$. A homomorphism between ${\mathsf{F}}$-algebras ${(X,f)}$ and ${(Y,g)}$ is an arrow ${h : X \rightarrow Y}$ such that ${h \cdot f = g \cdot \mathsf{F}(h)}$:

The ${\mathsf{F}}$-algebra ${(X,f)}$ is initial iff there is a unique such ${h}$ for each ${(Y,g)}$; for well-behaved functors ${\mathsf{F}}$, such as the polynomial functors on ${\mathbb{S}\mathrm{et}}$, an initial algebra always exists. We conventionally write “${(\mu\mathsf{F},\mathit{in})}$” for the initial algebra, and “${\mathit{fold}_{\mathsf{F}}(g)}$” for the unique homomorphism ${h}$ to another ${\mathsf{F}}$-algebra ${(Y,g)}$. (In ${\mathbb{S}\mathrm{et}}$, initial algebras correspond to datatypes of finite recursive data structures.)

The uniqueness of the solution is captured in the universal property:

$\displaystyle h = \mathit{fold}(g) \Leftrightarrow h \cdot \mathit{in} = g \cdot \mathsf{F}(h)$

In words, ${h}$ is this fold iff ${h}$ satisfies the defining equation for the fold.

The universal property is crucial. For one thing, the homomorphism equation is a very convenient style in which to define a function; it’s the datatype-generic abstraction of the familiar pattern for defining functions on lists:

$\displaystyle \begin{array}{lcl} h\,[] &=& e \\ h\,(x:\mathit{xs}) &=& f\,x\,(h\,\mathit{xs}) \end{array}$

These two equations implicitly characterizing ${h}$ are much more comprehensible and manipulable than a single equation

$\displaystyle h = \lambda \mathit{xs}\;.\; \textbf{if}\;\mathit{null}\,\mathit{xs}\;\textbf{then}\;e\;\textbf{else}\;f\,(\mathit{head}\,\mathit{xs})\,(h\,(\mathit{tail}\,\mathit{xs}))$

explicitly giving a value for ${h}$. But how do we know that this assortment of two facts about ${h}$ is enough to form a definition? Of course! A system of equations in this form has a unique solution.

Moreover, the very expression of the uniqueness of the solution as an equivalence ${h = \ldots \Leftrightarrow \ldots}$ provides many footholds for reasoning:

• Read as an implication from left to right, instantiating ${h}$ to ${\mathit{fold}(g)}$ to make the left-hand side trivially true, we get an evaluation rule for folds:

$\displaystyle \mathit{fold}(g) \cdot \mathit{in} = g \cdot \mathsf{F}(\mathit{fold}(g))$

• Read as an implication from right to left, we get a proof rule for demonstrating that some complicated expression ${h}$ is a fold:

$\displaystyle h = \mathit{fold}(g) \Leftarrow \ldots$

• In particular, we can quickly see that the identity function is a fold:

$\displaystyle \begin{array}{ll} & \mathit{id} = \mathit{fold}(g) \\ \Leftarrow & \qquad \{ \mbox{universal property} \} \\ & \mathit{id} \cdot \mathit{in} = g \cdot \mathsf{F}(\mathit{id}) \\ \Leftrightarrow & \qquad \{ \mbox{identities} \} \\ & \mathit{in} = g \end{array}$

so ${\mathit{id} = \mathit{fold}(\mathit{in})}$. (In fact, this one’s an equivalence.)

• We get a very simple proof of a fusion rule, for combining a following function with a fold to make another fold:

$\displaystyle \begin{array}{ll} & h \cdot \mathit{fold}(f) = \mathit{fold}(g) \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & h \cdot \mathit{fold}(f) \cdot \mathit{in} = g \cdot \mathsf{F}(h \cdot \mathit{fold}(f)) \\ \Leftrightarrow & \qquad \{ \mbox{evaluation rule, functors} \} \\ & h \cdot f \cdot \mathsf{F}(\mathit{fold}(f)) = g \cdot \mathsf{F}(h) \cdot \mathsf{F}(\mathit{fold}(f)) \\ \Leftarrow & \qquad \{ \mbox{Leibniz} \} \\ & h \cdot f = g \cdot \mathsf{F}(h) \end{array}$

• Using this, we can deduce Lambek’s Lemma, that the constructors ${\mathit{in}}$ form an isomorphism. Supposing that there is a right inverse, and it is a fold, what must it look like?

$\displaystyle \begin{array}{ll} & \mathit{in} \cdot \mathit{fold}(f) = \mathit{id} \\ \Leftrightarrow & \qquad \{ \mathit{id} \mbox{~as a fold} \} \\ & \mathit{in} \cdot \mathit{fold}(f) = \mathit{fold}(\mathit{in}) \\ \Leftarrow & \qquad \{ \mbox{fusion} \} \\ & \mathit{in} \cdot f = \mathit{in} \cdot \mathsf{F}(\mathit{in}) \\ \Leftarrow & \qquad \{ \mbox{Leibniz} \} \\ & f = \mathsf{F}(\mathit{in}) \end{array}$

So if we define ${\mathit{in}^{-1} = \mathit{fold}(\mathsf{F}(\mathit{in}))}$, we get ${\mathit{in} \cdot \mathit{in}^{-1} = \mathit{id}}$. We should also check the left inverse property:

$\displaystyle \begin{array}{ll} & \mathit{in}^{-1} \cdot \mathit{in} \\ = & \qquad \{ \mathit{in}^{-1} \mbox{~as a fold} \} \\ & \mathit{fold}(\mathsf{F}(\mathit{in})) \cdot \mathit{in} \\ = & \qquad \{ \mbox{evaluation rule} \} \\ & \mathsf{F}(\mathit{in}) \cdot \mathsf{F}(\mathit{fold}(\mathsf{F}(\mathit{in}))) \\ = & \qquad \{ \mathit{in}^{-1} \mbox{~as a fold again} \} \\ & \mathsf{F}(\mathit{in}) \cdot \mathsf{F}(\mathit{in}^{-1}) \\ = & \qquad \{ \mbox{functors} \} \\ & \mathsf{F}(\mathit{in} \cdot \mathit{in}^{-1}) \\ = & \qquad \{ \mbox{right inverse} \} \\ & \mathsf{F}(\mathit{id}) \\ = & \qquad \{ \mbox{functors} \} \\ & \mathit{id} \end{array}$

And so on, and so on. Many useful functions can be written as instances of ${\mathit{fold}}$, and the universal property gives us a very powerful reasoning tool—the universal property of ${\mathit{fold}}$ is a marvel to behold.

And of course, it all dualizes beautifully. An ${\mathsf{F}}$-coalgebra is a pair ${(X,f)}$ with ${f : X \rightarrow \mathsf{F}(X)}$. A homomorphism between ${\mathsf{F}}$-coalgebras ${(X,f)}$ and ${(Y,g)}$ is a function ${h : X \rightarrow Y}$ such that ${g \cdot h = \mathsf{F}(h) \cdot f}$:

The ${\mathsf{F}}$-coalgebra ${(Y,g)}$ is final iff there is a unique homomorphism to it from each ${(X,f)}$; again, for well-behaved ${\mathsf{F}}$, final coalgebras always exist. We write “${(\nu\mathsf{F},\mathit{out})}$” for the final coalgebra, and ${\mathit{unfold}_{\mathsf{F}}(f)}$ for the unique homomorphism to it. (In ${\mathbb{S}\mathrm{et}}$, final coalgebras correspond to datatypes of finite-or-infinite recursive data structures.)

Uniqueness is captured by the universal property

$\displaystyle h = \mathit{unfold}(f) \Leftrightarrow \mathit{out} \cdot h = \mathsf{F}(h) \cdot f$

which has just as many marvellous consequences. Many other useful functions are definable as instances of ${\mathit{unfold}}$, and again the universal property gives a very powerful tool for reasoning with them.

## Hylomorphisms

There are also many interesting functions that are best described as a combination of a fold and an unfold. The hylomorphism pattern, with an unfold followed by a fold, is the best known: the unfold produces a recursive structure, which the fold consumes.

The factorial function is a simple example. The datatype of lists of natural numbers is determined by the shape functor

$\displaystyle \mathsf{L}(X) = 1 + \mathbb{N} \times X$

Then we might hope to write

$\displaystyle \mathit{fact} = \mathit{product} \cdot \mathit{downFrom}$

where ${\mathit{downFrom} = \mathit{unfold}_{\mathsf{L}}(d)}$ and ${\mathit{product} = \mathit{fold}_{\mathsf{L}}(m)}$ with

$\displaystyle \begin{array}{lcl} d &::& \mathbb{N} \rightarrow \mathsf{L}(\mathbb{N}) \\ d\,0 &=& \mathit{inl}\,() \\ d\,(n+1) &=& \mathit{inr}\,(n+1,n) \bigskip\\ m &::& \mathsf{L}(\mathbb{N}) \rightarrow \mathbb{N} \\ m\,(\mathit{inl}\,()) &=& 1 \\ m\,(\mathit{inr}\,(n,n')) &=& n \times n' \end{array}$

More elaborately, we might hope to write ${\mathit{quicksort} : \mathsf{List}({\mathbb Z}) \rightarrow \mathsf{List}({\mathbb Z})}$ as the composition of ${\mathit{unfold}_\mathsf{B}(s)}$ (to generate a binary search tree) and ${\mathit{fold}_\mathsf{B}(g)}$ (to flatten that tree to a list), where ${\mathsf{B}}$ is the shape functor for internally-labelled binary trees,

$\displaystyle p : \mathsf{List}({\mathbb Z}) \rightarrow \mathsf{B}(\mathsf{List}({\mathbb Z}))$

partitions a list of integers into the unit or a pivot and two sublists, and

$\displaystyle g : \mathsf{B}(\mathsf{List}({\mathbb Z})) \rightarrow \mathsf{List}({\mathbb Z})$

glues together the unit or a pivot and two sorted lists into one list. In fact, any divide-and-conquer algorithm can be expressed in terms of an unfold computing a tree of subproblems top-down, followed by a fold that solves the subproblems bottom-up.

But sadly, this doesn’t work in ${\mathbb{S}\mathrm{et}}$, because the types don’t meet in the middle. The source type of the fold is (the carrier of) an initial algebra, but the target type of the unfold is a final coalgebra, and these are different constructions.

This is entirely reasonable, when you think about it. Our definitions in ${\mathbb{S}\mathrm{et}}$—the category of sets and total functions—necessarily gave us folds and unfolds as total functions; the composition of two total functions is a total function, and so a fold after an unfold ought to be a total function too. But it is easy to define total instances of ${\mathit{unfold}}$ that generate infinite data structures (such as a function ${\mathit{upFrom}}$, which generates an infinite ascending list of naturals), on which a following fold is undefined (such as “the product” of an infinite ascending list of naturals). The composition then should not be a total function.

One might try interposing a conversion function of type ${\nu\mathsf{F} \rightarrow \mu\mathsf{F}}$, coercing the final data structure produced by the unfold into an initial data structure for consumption by the fold. But there is no canonical way of doing this, because final data structures may be “bigger” (perhaps infinitely so) than initial ones. (In contrast, there is a canonical function of type ${\mu\mathsf{F} \rightarrow \nu\mathsf{F}}$. In fact, there are two obvious definitions of it, and they agree—a nice exercise!)

One might try parametrizing that conversion function with a natural number, bounding the depth to which the final data structure is traversed. Then the coercion is nicely structural (in fact, it’s a fold over the depth), and everything works out type-wise. But having to thread such “resource bounds” through the code does terrible violence to the elegant structure; it’s not very satisfactory.

## Continuous algebras

The usual solution to this conundrum is to give up on ${\mathbb{S}\mathrm{et}}$, and to admit that richer domain structures than sets and total functions are required. Specifically, in order to support recursive definitions in general, and the hylomorphism in particular, one should move to the category ${\mathbb{C}\mathrm{po}}$ of continuous functions between complete partial orders (CPOs). Now is not the place to give all the definitions; see any textbook on denotational semantics. The bottom line, so to speak, is that one has to accept a definedness ordering ${\sqsubseteq}$ on values—both on “data” and on functions—and allow some values to be less than fully defined.

Actually, in order to give meaning to all recursive definitions, one has to further restrict the setting to pointed CPOs—in which there is a least-defined “bottom” element ${\bot_X}$ for each type ${X}$, which can be given as the “meaning” (solution) of the degenerate recursive definition ${x=x}$ at type ${X}$. Then there is no “empty” CPO; the smallest CPO ${0}$ has just a single element, namely ${\bot}$. As with colimits in general, this smallest object is used as the start of a chain of approximations to a limiting solution. But in order for ${0}$ really to be an initial object, one also has to constrain the arrows to be strict, that is, to preserve ${\bot}$; only then is there a unique arrow ${0 \rightarrow A}$ for each ${A}$. The category of strict continuous functions between pointed CPOs is called ${\mathbb{C}\mathrm{po}_\bot}$.

It so happens that in ${\mathbb{C}\mathrm{po}_\bot}$, initial algebras and final coalgebras coincide: the objects (pointed CPOs) ${\mu\mathsf{F}}$ and ${\nu\mathsf{F}}$ are identical. This is very convenient, because it means that the hylomorphism pattern works fine: the structure generated by the unfold is exactly what is expected by the fold.

Of course, it still happen that the composition yields a “partial” (less than fully defined) function; but at least it now type-checks. Categories with this initial algebra/final coalgebra coincidence are called algebraically compact; they were studied by Freyd, but there’s a very good survey by Adámek, Milius and Moss.

However, the story gets murkier than that. For one thing, ${\mathbb{C}\mathrm{po}_\bot}$ does not have proper products. (Indeed, an algebraically compact category with products collapses.) But beyond that, ${\mathbb{C}\mathrm{po}_\bot}$—with its restriction to strict arrows—is not a good model of lazy functional programming; ${\mathbb{C}\mathrm{po}}$, with non-strict arrows too, is better. So one needs a careful balance of the two categories. The consequences for initial algebras and final coalgebras are spelled out in one of my favourite papers, Program Calculation Properties of Continuous Algebras by Fokkinga and Meijer. In a nutshell, one can only say that the defining equation ${h \cdot \mathit{in} = g \cdot \mathsf{F}(h)}$ for folds has a unique strict solution in ${h}$; without the strictness side-condition, ${h\,\bot}$ is unconstrained (because ${\mathit{in}\,x \ne \bot}$ for any ${x}$). But the situation for coalgebras remains unchanged—the defining equation ${\mathit{out} \cdot h = \mathsf{F}(h) \cdot f}$ for unfolds has a unique solution (and moreover, it is strict when ${f}$ is strict).

This works, but it means various strictness side-conditions have to be borne in mind when reasoning about folds. Done rigorously, it’s rather painful.

## Recursive coalgebras

So, back to my confession. I want to write divide-and-conquer programs, which produce intermediate data structures and then consume them. Folds and unfolds in ${\mathbb{S}\mathrm{et}}$ do not satisfy me; I want more—hylos. Morally, I realise that I should pay careful attention to those strictness side-conditions. But they’re so fiddly and boring, and my resolve is weak, so I usually just brush them aside. Is there away that I can satisfy my appetite for divide-and-conquer programs while still remaining in the pure ${\mathbb{S}\mathrm{et}}$ world?

Tarmo Uustalu and colleagues have a suggestion. Final coalgebras and algebraic compactness are sufficient but not necessary for the hylo diagram above to have a unique solution; they propose to focus on recursive coalgebras instead. The ${\mathsf{F}}$-coalgebra ${(X,f)}$ is “recursive” iff, for each ${g : \mathsf{F}(Y) \rightarrow Y}$, there is a unique ${h}$ such that ${h = g \cdot \mathsf{F}(h) \cdot f}$:

This is a generalization of initial algebras: if ${\mathsf{F}}$ has an initial algebra ${(\mu\mathsf{F},\mathit{in})}$, then by Lambek’s Lemma ${\mathit{in}}$ has an inverse ${\mathit{in}^{-1}}$, and ${(\mu\mathsf{F},\mathit{in}^{-1})}$ is a recursive coalgebra. And it is a strict generalization: it also covers patterns such as paramorphisms (primitive recursion)—since ${(\mu\mathsf{F}, \mathsf{F}(\mathit{fork}(\mathit{id},\mathit{id}))\cdot\mathit{in}_\mathsf{F}^{-1})}$ is a recursive ${\mathsf{G}}$-coalgebra where ${\mathsf{G}}$ is the functor taking ${X}$ to ${\mathsf{F}(X \times \mathsf{F}(X))}$—and the “back one or two steps” pattern used in the Fibonacci function.

Crucially for us, almost by definition it covers all of the “reasonable” hylomorphisms too. For example, ${(\mathbb{N},d)}$ is a recursive ${\mathsf{L}}$-coalgebra, where ${\mathsf{L}}$ is the shape functor for lists of naturals and ${d}$ the ${\mathsf{L}}$-coalgebra introduced above that analyzes a natural into nothing (for zero) or itself and its predecessor (for non-zero inputs). Which is to say, for each ${m : \mathsf{L}(X) \rightarrow X}$, there is a unique ${h}$ such that ${h = m \cdot \mathsf{L}(h) \cdot d}$; in particular, for the ${m}$ given above that returns 1 or multiplies, the unique ${h}$ is the factorial function. (In fact, this example is also an instance of a paramorphism.) And ${(\mathsf{List}({\mathbb Z}),p)}$ is a recursive ${\mathsf{B}}$-coalgebra, where ${p}$ is the partition function of quicksort—for any ${\mathsf{B}}$-algebra ${(Y,g)}$, there is a unique ${h}$ such that ${h = g \cdot \mathsf{B}(h) \cdot p}$, and in particular when ${g}$ is the glue function for quicksort, that unique solution is quicksort itself.

This works perfectly nicely in ${\mathbb{S}\mathrm{et}}$; there is no need to move to more complicated settings such as ${\mathbb{C}\mathrm{po}}$ or ${\mathbb{C}\mathrm{po}_\bot}$, or to consider partiality, or strictness, or definedness orderings. The only snag is the need to prove that a particular coalgebra of interest is indeed recursive. Capretta et al. study a handful of “basic” recursive coalgebras and of constructions on coalgebras that preserve recursivity.

More conveniently, Taylor and Adámek et al. relate recursivity of coalgebras to the more familiar notion of variant function, ie well-founded ordering on arguments of recursive calls. They restrict attention to finitary shape functors; technically, preserving directed colimits, but informally, I think that’s equivalent to requiring that each element of ${\mathsf{F}(X)}$ has a finite number of ${X}$ elements—so polynomial functors are ok, as is the finite powerset functor, but not powerset in general. If I understand those sources right, for a finitary functor ${\mathsf{F}}$ and an ${\mathsf{F}}$-coalgebra ${(X,f)}$, the following conditions are equivalent: (i) ${(X,f)}$ is corecursive; (ii) ${f}$ is well-founded, in the sense that there is a well-founded ordering ${\prec}$ such that ${y \prec x}$ for each “element” ${y}$ of ${f(x)}$; (iii) every element of ${\mathit{unfold}_\mathsf{F}(f)}$ has finite depth; and (iv) there is a coalgebra homomorphism from ${(X,f)}$ to ${(\mu\mathsf{F},\mathit{in})}$.

This means that I can resort to simple and familiar arguments in terms of variant functions to justify hylo-style programs. The factorial function is fine, because (${\mathsf{L}}$ is a finitary functor, being polynomial, and) the chain of recursive calls to which ${d}$ leads is well-founded; quicksort is fine, because the partitioning step is well-founded; and so on. Which takes a great weight of guilt off my shoulders: I can give in to the temptation to write interesting programs, and still remain morally as pure as the driven snow.

Universal properties are a generalization of the notion of a Galois connection between two orderings. Or perhaps I should say: universal properties arise from adjunctions, and it is adjunctions that are a generalization of Galois connections. Adjunctions capture in an abstract categorical setting the idea of “optimal solutions to a problem”; and this idea is itself very general, capturing many of the structures underlying common patterns in programming (not to mention the rest of mathematics). Solutions to equations, products, limits of sequences of approximations, and minimality and maximality are just some of the instances of this powerful abstraction that we will make use of. In the preface to Categories for the Working Mathematician, Mac Lane wrote that “adjoint functors arise everywhere”.

Two functors ${\mathsf{F} : \mathbb{D} \leadsto \mathbb{C}}$ and ${\mathsf{G} : \mathbb{C} \leadsto \mathbb{D}}$ form an adjunction, written ${\mathsf{F} \dashv \mathsf{G}}$, if there is an isomorphism between the sets of arrows ${\mathsf{F}(B) \rightarrow A}$ in ${\mathbb{C}}$ and ${B \rightarrow \mathsf{G}(A)}$ in ${\mathbb{D}}$. We say that ${\mathsf{F}}$ is the left adjoint and ${\mathsf{G}}$ the right adjoint. The essence of the isomorphism is captured by two natural transformations ${\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \mathsf{G} \mathbin{\cdot} \mathsf{F}}$ in ${\mathbb{D}}$ and ${\epsilon : \mathsf{F} \mathbin{\cdot} \mathsf{G} \mathbin{\stackrel{.}{\to}} \mathsf{Id}}$ in ${\mathbb{C}}$, called the unit and counit of the adjunction; ${\eta}$ is the image in ${\mathbb{D}}$ of ${\mathit{id}_{\mathsf{F}(B)} : \mathsf{F}(B) \rightarrow \mathsf{F}(B)}$ in ${\mathbb{C}}$, and conversely, ${\epsilon}$ is the image in ${\mathbb{C}}$ of ${\mathit{id}_{\mathsf{G}(A)}}$ in ${\mathbb{D}}$. The unit and counit satisfy the laws

$\displaystyle \begin{array}{lcl} \epsilon_{\mathsf{F}(B)} \cdot \mathsf{F}(\eta_B) &=& \mathit{id}_{\mathsf{F}(B)} \\ \mathsf{G}(\epsilon_A) \cdot \eta_{\mathsf{G}(A)} &=& \mathit{id}_{\mathsf{G}(A)} \end{array}$

From them one can construct the witnesses to the isomorphism for arbitrary arrows: for each arrow ${f : \mathsf{F}(B) \rightarrow A}$ in ${\mathbb{C}}$, there is a unique arrow ${g : B \rightarrow \mathsf{G}(A)}$ in ${\mathbb{D}}$ such that ${\epsilon_A \cdot \mathsf{F}(g) = f}$, given by ${g = \mathsf{G}(f) \cdot \eta_B}$; and conversely, for each arrow ${g : B \rightarrow \mathsf{G}(A)}$ in ${\mathbb{D}}$, there is a unique arrow ${f : \mathsf{F}(B) \rightarrow A}$ in ${\mathbb{C}}$ such that ${\mathsf{G}(f) \cdot \eta_ B = g}$, given by ${f = \epsilon_B \cdot \mathsf{F}(g)}$; and moreover, these two constructions are each other’s inverses.

A preorder ${(X,{\le})}$ forms a category: the objects of the category are the elements of the set~${X}$, and between any two elements ${x,y \in X}$, there is a unique arrow if ${x \le y}$, and no arrow otherwise. That adjunctions are a generalization of Galois connections follows straightforwardly from the fact that there is at most one arrow between any two objects in a preorder category. Then monotonic functions ${f : (X,{\le_X}) \rightarrow (Y,{\le_Y})}$ and ${g : (Y,{\le_Y}) \rightarrow (X,{\le_X})}$ between preorders ${(X,{\le_X})}$ and ${(Y,{\le_Y})}$ form a Galois connection precisely if the sets of arrows ${f(y) \rightarrow x}$ and ${y \rightarrow g(x)}$ are isomorphic—that is, if both ${f(y) \le_X x}$ and ${y \le_Y g(x)}$ hold, or neither do, or in other words,

$\displaystyle f(y) \le_X x \Leftrightarrow y \le_Y g(x)$

## Adjoints of the diagonal functor

A very useful example of adjunctions arises in the definition of products—in the category ${\mathbb{S}\mathrm{et}}$ of sets and total functions, for given types ${A,B,C}$, there is an isomorphism between the set of pair-generating functions, of type ${A \rightarrow B \times C}$, and their two projections, pairs of functions of types ${A \rightarrow B}$ and ${A \rightarrow C}$. (Indeed, given functions ${f:A \rightarrow B}$ and ${g:A \rightarrow C}$, one can construct the pair-generating function ${\mathit{fork}(f,g) : A \rightarrow B \times C}$; and conversely, given a pair-generating function ${h : A \rightarrow B \times C}$, one can construct its two projections ${fst \cdot h : A \rightarrow B}$ and ${snd \cdot h : A \rightarrow C}$; and moreover, these two constructions are inverses.)

The “isomorphism between sets of arrows” can be elegantly expressed as an adjunction; since it concerns pairs of arrows, one side of the adjunction involves the product category ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$. The right adjoint is the product functor ${(\times) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et}}$, mapping an object in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$—that is, a pair of sets—to their cartesian product as an object in ${\mathbb{S}\mathrm{et}}$, and an arrow in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$—that is, a parallel pair of functions—to a function in ${\mathbb{S}\mathrm{et}}$ acting pointwise on pairs. In the other direction, the left adjoint is the diagonal functor ${\triangle : \mathbb{S}\mathrm{et} \leadsto \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, mapping an object ${A}$ in ${\mathbb{S}\mathrm{et}}$ to the object ${(A,A)}$ in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, and a function ${f}$ to the pair of functions ${(f,f)}$ as an arrow in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$. The adjunction ${{\triangle} \dashv (\times)}$ amounts to the isomorphism

$\displaystyle \triangle A \rightarrow (B,C) \approx A \rightarrow {\times} (B,C)$

or equivalently,

$\displaystyle (A \rightarrow B)\times(A \rightarrow C) \approx A \rightarrow (B\times C)$

The unit and counit of the adjunction are ${\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} (\times) \mathbin{\cdot} \triangle}$ and ${\epsilon : \triangle \mathbin{\cdot} (\times) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}$. In more familiar terms, the unit is a natural transformation in ${\mathbb{S}\mathrm{et}}$, so a polymorphic function; in fact, it’s the function of type ${A \rightarrow A \times A}$ that we might call ${\mathit{double}}$. However, the counit is a natural transformation ${(A \times B,A \times B) \rightarrow (A,B)}$ in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, so not simply a (polymorphic) function; but arrows in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$ are pairs of functions, so we might write this ${(\mathit{fst},\mathit{snd}) :: (A \times B \rightarrow A, A \times B \rightarrow B)}$.

Then the “fork” operation is in fact one of the two witnesses to the isomorphism between the sets of arrows: given an arrow ${\triangle A \rightarrow (B,C)}$ in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, that is, a pair ${(f,g)}$ of functions of types ${(A \rightarrow B,A \rightarrow C)}$, then ${\mathit{fork}(f,g)}$ is an arrow ${A \rightarrow {\times} (B,C)}$ in ${\mathbb{S}\mathrm{et}}$, that is, a function of type ${A \rightarrow B \times C}$, given by the construction above:

$\displaystyle \mathit{fork}(f,g) = (\times) (f,g) \cdot \mathit{double}$

or, with more points,

$\displaystyle \mathit{fork} (f,g)\,a = (f\,a, g\,a)$

The laws that the unit and counit satisfy are

$\displaystyle \begin{array}{lcl} (\mathit{fst},\mathit{snd}) \cdot \triangle \mathit{double} &=& \mathit{id} \\ (\times) (\mathit{fst},\mathit{snd}) \cdot \mathit{double} &=& \mathit{id} \end{array}$

or, in more familiar terms,

$\displaystyle \begin{array}{lcl} \mathit{fst} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{snd} \cdot \mathit{double} &=& \mathit{id} \\ \mathit{fork} (\mathit{fst},\mathit{snd}) &=& \mathit{id} \end{array}$

The universal property of ${\mathit{fork}}$ follows from the isomorphism between sets of arrows:

$\displaystyle \begin{array}{ll} & h = \mathit{fork}(f,g) \\ \Leftrightarrow & \qquad \{ \mathit{fork} \} \\ & h = (\times) (f,g) \cdot \mathit{double} \\ \Leftrightarrow & \qquad \{ \mbox{isomorphism between arrow sets} \} \\ & (\mathit{fst},\mathit{snd}) \cdot \triangle h = (f,g) \\ \Leftrightarrow & \qquad \{ \triangle \} \\ & (\mathit{fst},\mathit{snd}) \cdot (h,h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{composition in~} \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \mbox{~is pointwise} \} \\ & (\mathit{fst} \cdot h,\mathit{snd} \cdot h) = (f,g) \\ \Leftrightarrow & \qquad \{ \mbox{equality of pairs is pointwise} \} \\ & \mathit{fst} \cdot h=f \land \mathit{snd} \cdot h=g \end{array}$

The universal property of ${\mathit{fork}}$ underlies all the useful laws of that operator.

Of course, the situation nicely dualizes too. Coproducts in ${\mathbb{S}\mathrm{et}}$ arise from the isomorphism between the set of arrows ${A+B \rightarrow C}$ and the pairs of arrows in ${A \rightarrow C}$ and ${B \rightarrow C}$. Again, “pairs of arrows” suggest the product category; but this time, the diagonal functor is the right adjoint, with the coproduct functor ${(+) : \mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et} \rightarrow \mathbb{S}\mathrm{et}}$ (which takes a pair of sets ${(A,B)}$ to their disjoint union) as the left adjoint. That is, the adjunction is ${(+) \dashv \triangle}$, and the isomorphism is

$\displaystyle (+) (A,B) \rightarrow C \approx (A,B) \rightarrow \triangle C$

The unit ${\eta : \mathsf{Id} \mathbin{\stackrel{.}{\to}} \triangle \mathbin{\cdot} (+)}$ is a natural transformation ${(A,B) \rightarrow (A+B,A+B)}$ in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, that is, a pair of functions ${\mathit{inl} : A \rightarrow A+B}$ and ${\mathit{inr} : B \rightarrow A+B}$. The counit ${\epsilon : (+) \mathbin{\cdot} \triangle \mathbin{\stackrel{.}{\to}} \mathsf{Id}}$ is a natural transformation ${A+A \rightarrow A}$ in ${\mathbb{S}\mathrm{et}}$, which we might call ${\mathit{merge}}$. The “join” of two functions with a common range is a witness to one half of the isomorphism—given an arrow ${(f,g) : (A,B) \rightarrow \triangle C}$ in ${\mathbb{S}\mathrm{et} \times \mathbb{S}\mathrm{et}}$, then ${\mathit{join} (f,g)}$ is an arrow ${(+) (A,B) \rightarrow C}$ in ${\mathbb{S}\mathrm{et}}$, defined by

$\displaystyle \mathit{join} (f,g) = \mathit{merge} \cdot (+) (f,g)$

The two laws that the unit and counit satisfy are:

$\displaystyle \begin{array}{lcl} \mathit{merge} \cdot (+) (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \triangle \mathit{merge} \cdot (\mathit{inl},\mathit{inr}) &=& \mathit{id} \end{array}$

or, perhaps more perspicuously,

$\displaystyle \begin{array}{lcl} \mathit{join} (\mathit{inl},\mathit{inr}) &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inl} &=& \mathit{id} \\ \mathit{merge} \cdot \mathit{inr} &=& \mathit{id} \end{array}$

Another familiar example from functional programming is the notion of currying, which arises when one can construct the function space ${A \Rightarrow B}$ (the type of functions from ${A}$ to ${B}$, for each type ${A}$ and ${B}$), such that there is an isomorphism between the sets of arrows ${A \rightarrow (B \Rightarrow C)}$ and ${A \times B \rightarrow C}$. Here, the adjunction is ${( \times B) \dashv (B \Rightarrow )}$—in this case, both functors are endofunctors on ${\mathbb{S}\mathrm{et}}$. The unit and counit are natural transformations ${\mathsf{Id} \mathbin{\stackrel{.}{\to}} (B \Rightarrow )\mathbin{\cdot}( \times B)}$ and ${( \times B)\mathbin{\cdot}(B \Rightarrow ) \mathbin{\stackrel{.}{\to}} \mathsf{Id}}$. We might call these ${\mathit{pair}}$ and ${\mathit{apply}}$, since the first is a curried pair-forming operator, and the second applies a function to an argument:

$\displaystyle \begin{array}{lcl} \mathit{pair} &:& A \rightarrow (B \Rightarrow (A \times B)) \\ \mathit{apply} &:& (B \Rightarrow A) \times B \rightarrow A \end{array}$

The laws they satisfy are as follows:

$\displaystyle \begin{array}{lcl} \mathit{apply} \cdot ( \times B) \mathit{pair} &=& \mathit{id} \\ (B \Rightarrow )\mathit{apply} \cdot \mathit{pair} &=& \mathit{id} \end{array}$

or, in points,

$\displaystyle \begin{array}{lcl} \mathit{apply} (\mathit{pair}\,a,b) &=& (a,b) \\ \mathit{apply} \cdot \mathit{pair}\,f &=& f \end{array}$

The isomorphism itself is witnessed by the two inverse functions

$\displaystyle \begin{array}{lcl} \mathit{curry}\,f &=& (B \Rightarrow ) f \cdot \mathit{pair} \\ \mathit{uncurry}\,g &=& \mathit{apply} \cdot ( \times B) g \end{array}$

where ${f : A \times B \rightarrow C}$ and ${g : A \rightarrow (B \Rightarrow C)}$.

# Universal properties and Galois connections

One recurring theme throughout this series will be that of a universal property—an identity that captures an indirect means of solving a problem, by transforming that problem into a different (and hopefully simpler) domain, while still preserving all its essential properties. In particular, the original problem has a solution if and only if the transformed problem does, and moreover, the solution to the transformed problem can easily be translated back into a solution to the original problem. One can see universal properties as a generalization of the notion of a Galois connection between two orderings, which are a similarly powerful technique of relating problems in two different settings. (In fact, the proper generalization of Galois connections is to adjunctions, but that’s a story for next time.)

## Universal properties

The universal property of the ${\mathit{fork}}$ operation for products is a representative example. Recall that ${\mathit{fork}\,(f,g) :: a \rightarrow (b,c)}$ when ${f :: a \rightarrow b}$ and ${g :: a \rightarrow c}$; and that ${\mathit{fst} :: (b,c) \rightarrow b}$ and ${\mathit{snd} :: (b,c) \rightarrow c}$. Then ${\mathit{fork}}$ is completely defined by its universal property:

$\displaystyle h = \mathit{fork}\,(f,g) \quad\Leftrightarrow\quad \mathit{fst} \cdot h = f \land \mathit{snd} \cdot h = g$

This identity repays careful study.

• It translates a problem in the more complex domain of products (namely, the problem of showing how some complicated expression ${h}$ can be written in terms of ${\mathit{fork}}$) into simpler problems (here, equations about the two projections of ${h}$).
• It’s an equivalence. So not only do you have an implication from left to right (any ${h}$ expressible as a ${\mathit{fork}}$ satisfies the two properties on the right), you also have one from right to left (any pair of functions ${f,g}$ satisfying the two properties on the right induces a ${\mathit{fork}}$). In other words, ${h}$ is a solution to the equation on the left iff it is a solution on the right; not only does a solution on the right yield a construction on the left, but also the absence of solutions on the right implies the absence on the left. Or again: the equations on the right have a unique solution in ${h}$—since any two solutions ${h,h'}$ must both be equal to the same expression on the left.
• It has many useful simple consequences. You can make the left-hand side trivially true by letting ${h = \mathit{fork}\,(f,g)}$; then the right-hand side must also be true:

$\displaystyle \begin{array}{lcl} \mathit{fst} \cdot \mathit{fork}\,(f,g) &=& f \\ \mathit{snd} \cdot \mathit{fork}\,(f,g) &=& g \end{array}$

Symmetrically, you can make the right-hand side trivially true by letting ${f = \mathit{fst} \cdot h}$ and ${g = \mathit{snd} \cdot h}$; then the left-hand side must also be true:

$\displaystyle h = \mathit{fork}\,(\mathit{fst} \cdot h, \mathit{snd} \cdot h)$

If you further let ${h = \mathit{id}}$, you conclude that every pair consists solely of its two projections, nothing more:

$\displaystyle \mathit{id} = \mathit{fork}\,(\mathit{fst}, \mathit{snd})$

In fact, the universal property of ${\mathit{fork}}$ tells you everything you need to know about ${\mathit{fork}}$; you might take that as one justification for the term “universal”.

• It also has many useful less obvious consequences. For example, if you’re searching for an ${h}$ that acts independently on the two components of a pair—${\mathit{fst} \cdot h = h_1 \cdot \mathit{fst}}$ and ${\mathit{snd} \cdot h = h_2 \cdot \mathit{snd}}$—just let ${f = h_1 \cdot \mathit{fst}}$ and ${g = h_2 \cdot \mathit{snd}}$ in the universal property, and conclude

$\displaystyle h = \mathit{fork}\,(h_1\cdot\mathit{fst}, h_2\cdot\mathit{snd})$

(which we’ve written “${\mathit{prod}\,(h_1,h_2)}$” elsewhere). For another example, we can deduce a fusion law for ${\mathit{fork}}$: for what ${f',g'}$ does the equation

$\displaystyle \mathit{fork}\,(f,g) \cdot k = \mathit{fork}\,(f',g')$

hold? This matches the left-hand side of the universal property; expanding the right-hand side yields

$\displaystyle \begin{array}{lclcl} f' &=& \mathit{fst}\cdot\mathit{fork}\,(f,g)\cdot k &=& f \cdot k \\ g' &=& \mathit{snd}\cdot\mathit{fork}\,(f,g)\cdot k &=& g \cdot k \end{array}$

Such a rich harvest from so small a seed! (In fact, we will see later that an even smaller seed suffices.)

## Galois connections

We can see the same structures that occur in universal properties like that of ${\mathit{fork}}$ above also in relationships between orderings. As a very simple example, consider the problem of dividing a natural number ${n}$ by two, exactly; the universal property of a solution ${m}$ to this problem is the equivalence

$\displaystyle n / 2 = m \Leftrightarrow n = m \times 2$

That is, ${m}$ is a solution to the problem “compute ${n / 2}$” precisely when ${n = m \times 2}$; both the existence and the identification of a solution to a problem expressed in terms of division has been translated to one in terms of multiplication—which is arguably a simpler setting. Note that the universal property amounts to an equivalence

$\displaystyle f(n) = m \Leftrightarrow n = g(m)$

involving the two functions ${f = (/2)}$ and ${g = (\times 2)}$, which are in some sense inverses. This pattern will crop up over and over again.

The division example involved an equivalence between the two identities ${f(n)=m}$ and ${n=g(m)}$. More generally, another relation than “${=}$” might be involved. Extending the previous example to integer division, rounding down, we have for ${k>0}$:

$\displaystyle n \div k \ge m \Leftrightarrow n \ge m \times k$

Again, this relates the two (in some sense inverse) functions ${(\div k)}$ and ${(\times k)}$; but this time equality is inadequate for stating the problem, and it perhaps more convincing to claim that a more complicated problem ${(\div k)}$ has been translated into a simpler one ${(\times k)}$. What is more, translating the problem via this universal property pays dividends when it comes to reasoning about the problem, because the simpler problem space is much more amenable to calculation. For example, properties of repeated division ${(n \div k) \div l}$ (for ${k,l>0}$) do not trip off the tongue; but we can reason straightforwardly as follows:

$\displaystyle \begin{array}{ll} & (n \div k) \div l \ge m \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \div k \ge m \times l \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \ge (m \times l) \times k \\ \Leftrightarrow & \qquad \{ \mbox{multiplication is associative} \} \\ & n \ge m \times (l \times k) \\ \Leftrightarrow & \qquad \{ \mbox{universal property} \} \\ & n \div (l \times k) \ge m \end{array}$

Thus, ${(n \div k) \div l \ge m}$ precisely when ${n \div (l \times k) \ge m}$, or in other words, ${(n \div k) \div l = n \div (l \times k)}$.

In this case, the two problem spaces have both involved the same relation ${\ge}$ on the same domain, namely the natural numbers; that is not essential. For example, the universal property of the floor function ${\lfloor\cdot\rfloor}$ from reals to integers is given by:

$\displaystyle \mathit{inj}(n) \le_R x \Leftrightarrow n \le_I \lfloor x \rfloor$

where, to be completely explicit, we have written ${\le_R}$ for the usual ordering on reals and ${\le_I}$ for the corresponding ordering on integers, and ${\mathit{inj}}$ for the injection from the integers into the reals. This time the two problem spaces involve two different orderings on different domains; we say that the pair of functions ${\mathit{inj}}$ and ${\lfloor\cdot\rfloor}$ form a Galois connection between the orderings ${\le_R}$ and ${\le_I}$. (We also see that the relationship between the two functions ${\mathit{inj}}$ and ${\lfloor\cdot\rfloor}$ is becoming less like a pure inverse relationship, and more of an embedding–projection pair.)

As a simple non-arithmetical example of a Galois connection on a single domain, consider some set ${U}$ and a fixed subset ${X \subseteq U}$; then

$\displaystyle A \cap X \subseteq B \Leftrightarrow A \subseteq B \cup \overline{X}$

That is, ${(\cap X)}$ and ${(\cup \overline{X})}$ form a Galois connection between ${\subseteq}$ and itself.

A non-arithmetical example between two different domains is afforded by the field of formal concept analysis, which relates “objects” and their “properties”. Given are sets ${O}$ of objects and ${P}$ of properties, and a relation ${(\vdash) \subseteq O \times P}$; we write ${o \mathrel{\vdash} p}$ to denote that object ${o}$ has property ${p}$. This induces “concept-forming operators” ${\mathit{intent} : 2^O \rightarrow 2^P}$ and ${\mathit{extent} : 2^P \rightarrow 2^O}$ defined by:

$\displaystyle \begin{array}{lcl} \mathit{intent}(E) &=& \{ p \in P \mid \forall o \in E .\; o \mathrel{\vdash} p \} \\ \mathit{extent}(I) &=& \{ o \in O \mid \forall p \in I .\; o \mathrel{\vdash} p \} \end{array}$

That is, ${\mathit{intent}(E)}$ is the set of properties enjoyed by all objects in ${E}$, and ${\mathit{extent}(I)}$ is the set of objects enjoying all the properties in ${I}$; a concept is a pair ${(E,I)}$ with ${\mathit{intent}(E) = I}$ and ${\mathit{extent}(I) = E}$. The concept-forming operators form a Galois connection between ${\subseteq}$ and ${\supseteq}$:

$\displaystyle \begin{array}{ll} & \mathit{extent}(I) \supseteq E \\ \Leftrightarrow& \qquad \{ \mbox{characteristic of~} \mathit{extent} \} \\ & \forall o \in E .\; (\forall p \in I .\; o \mathrel{\vdash} p) \\ \Leftrightarrow& \qquad \{ \mbox{commuting quantifiers} \} \\ & \forall p \in I .\; (\forall o \in E .\; o \mathrel{\vdash} p) \\ \Leftrightarrow& \qquad \{ \mbox{characteristic of~} \mathit{intent} \} \\ & I \subseteq \mathit{intent}(E) \end{array}$

This construction can be used to translate a problem about the extension of a concept (that is, an enumeration of its instances) into one about the intension (that is, the characteristic properties of its instances). It is related to the observation that “syntax and semantics are adjoint“—under the analogy that “objects” are sets of mathematical structures, “properties” are axioms, and the relation is “satisfaction”, the models of an axiomatic theory ${T}$ are included in a set of structures ${S}$ if and only if the theory ${T}$ logically entails the minimal axiomatization of ${S}$.

## October 02, 2014

### Joachim Breitner

At my university, we recently held an exam that covered a bit of Haskell, and a simple warm-up question at the beginning asked the students to implement last :: [a] -> a. We did not demand a specific behaviour for last [].

This is a survey of various solutions, only covering those that are actually correct. I elided some variation in syntax (e.g. guards vs. if-then-else).

Most wrote the naive and straightforward code:

last [x] = x
last (x:xs) = last xs

Then quite a few seemed to be uncomfortable with pattern-matching and used conditional expressions. There was some variety in finding out whether a list is empty:

last (x:xs)
| null xs == True = x
| otherwise       = last xs

last (x:xs)
| length (x:xs) == 1 = x
| otherwise          = last xs

last (x:xs)
| length xs == 0 = x
| otherwise      = last xs

last xs
| lenght xs > 1 = last (tail xs)

last xs
| lenght xs == 1 = head xs
| otherwise      = last (tail xs)

last (x:xs)
| xs == []  = x
| otherwise = last xs

The last one is not really correct, as it has the stricter type Eq a => [a] -> a. Also we did not expect our students to avoid the quadratic runtime caused by using length in every step.

The next class of answers used length to pick out the right elemet, either using (!!) directly, or simulating it with head and drop:

last xs = xs !! (length xs - 1)

last xs = head (drop (length xs - 1) xs)

There were two submissions that spelled out an explicit left folding recursion:

last (x:xs) = lastHelper x xs
where
lastHelper z [] = z
lastHelper z (y:ys) = lastHelper y ys

And finally there are a few code-golfers that just plugged together some other functions:

last x = head (reverse x)

Quite a lot of ways to write last!

# Updating auto-update

A few weeks back, Kazu and I received an email from Martin Bailey, who was interested in working with us to further optimize Warp. The subject at the time was reduced buffer copying and allocations. That subject is very interesting in itself, and once finalized, will get its own blog post as well. But that's not the subject of this blog post.

After working on some ideas, Kazu benchmarked Warp, and found a massive performance degradation which had already slipped onto Hackage. The problem only appears under highly concurrent requests (which is exactly what Kazu's benchmark was testing). That bug has now been resolved, and users are recommended to upgrade to the newest versions of auto-update and warp. (We also included some other performance boosts here, care of Ryan Newton's atomic-primops package.) This blog post covers the problem, and its solution.

One of the required response headers according to the HTTP spec is the Date header. As you might guess, this consists of the date, as well as the current time on the server. This has a very specific format specified in the spec. A naive approach to filling in this header would be, for each request, to run:

now <- getCurrentTime
let value = formatTimeHTTP now
headers' = ("Date", value) : headers

However, when you're writing a server that handles hundreds of thousands of requests per second, having to get and format the current time on each request is incredibly expensive. Instead, quite some time ago, Kazu introduced a date formatting worker thread, which essentially looks like this:

let getNowFormatted = formatTimeHTTP <$> getCurrentTime nowFormatted <- getNowFormatted >>= newIORef forkIO$ forever $do threadDelay 1000000 getNowFormatted >>= writeIORef nowFormatted return$ readIORef nowFormatted

We fork a single thread that recalculates the formatted time every second, and updates a mutable reference. This means that, regardless of how many clients connect, we will always run this computation at most once per second. And reading the current time requires no system calls, formatting, or allocation: it's just a memory read.

## Watt's the matter

The problem with this approach is that, even if there are zero requests, we're still going to run this computation once a second. This doesn't hurt our performance too much (it's a relatively cheap operation). However, it does mean that our process never stops working, which is bad for power consumption. So we needed a way to let that worker thread turn off when it wasn't needed anymore, and start up again on demand.

If this sounds familiar, it's because we blogged about it just two months ago. But there's an implementation detail I want to point out about auto-update. When I started writing it, I aimed to have the worker thread completely shut down when not needed, and then for a new worker thread to be spawned on demand. This resulted in some strange corner cases. In particular, it was possible for two different callers to spawn two different worker threads at the same time. We used atomicModifyIORef to ensure that only one of those threads became the "official" worker, and the other one would shut down right away. There was also a lot of tricky business around getting async exceptions right.

The code wasn't too difficult to follow, and was still pretty short. You can view it on Github.

## The core of the problem

Unfortunately, before release, we didn't do enough performance testing of auto-update in highly concurrent settings. When we threw enough cores at the problem, we quickly ran into a situation where multiple Haskell threads would end up making concurrent requests for the auto-update value, and each of those threads would fork a separate worker thread. Only one of those would survive, but the forking itself was killing our performance! (I had one benchmark demonstrating that, for one million requests across one thousand threads on four cores, over ten thousands worker threads were forked.)

There was another problem as well. We made heavy use of atomicModifyIORef to get this working correctly. Compare this to our original dedicated thread solution: each data access was a simple, cheap readIORef. By introducing atomicModifyIORef at each request site, we introduced memory barrier issues immediately.

## The solution

After iterating a few times, Kazu, Martin and I came up with a fairly elegant solution to the problem. As noted, the original dedicated thread was in many ways ideal. A lot of our problems were coming from trying to fork threads on demand. So we came up with a compromise: why not have a single, dedicated worker thread, but allow it to go to sleep/block when it's no longer needed?

Blocking semantics meant we needed to move beyond IORefs over to either MVars or STM (we elected for the former, but it would still be interesting to compare performance with the latter). But we still want to be able to have incredibly cheap lookup in the common case of a value being precomputed. So we ended up with three mutable variables:

1. An IORef containing a Maybe a. If a precomputed value is available, it's present in there as Just a. Otherwise, there's a Nothing value. This mean that, in the normal case, a requester only needs to (a) do a memory read, and (b) case analyze the value.

2. An MVar baton to tell the worker thread that someone is waiting for a value. In the case that the IORef contains Nothing, the requester fills this MVar. Before it starts working, the worker thread always tries to takeMVar this baton.

3. An MVar for passing back result values. This may seem redundant with the IORef, but is necessary for the case of Nothing. A requester putMVars into the baton, and then blocks in readMVar on this reference until the value is available. The worker thread will update both the IORef and this value at the same time, and then go to sleep until it needs to compute again, at which point the process will loop.

Since the requesters never try to fork threads, there's no possibility of running into the high contention problem of multiple forks. The worst case scenario is that many requesters see a Nothing value at the same time, and all end up blocking on the same MVar. While less efficient than just reading an IORef, this isn't nearly as expensive as what we had previously.

There's another big advantage: the normal case no longer uses any atomicModifyIORef (or any other synchronized memory access), meaning we've avoided memory barriers during periods of high concurrency.

And finally: the code is drastically simpler.

And now that we've fixed our regression, and gotten back high performance together with better power consumption, we can finally return to Martin's original idea of better buffer management. More on that soon hopefully :).

# Full Stack Functional Web Engineer at Front Row Education (Full-time)

Front Row Education (https://www.frontrowed.com/) - San Francisco, Fulltime

== Position ==

Early Full-stack web engineer to join fast-growing education startup that changes how hundreds of thousands of kids learn math.

== TL;DR - Reasons to care about working with Front Row ==

• A mission you can be proud of: kids using Front Row improve twice as much at math

• Tremendous impact on a small team

• Flexibility on what you work on, and autonomy over your schedule

• Effective & exciting tools: Clojure, Haskell, PostgreSQL, Backbone.js, Ansible

• Be part of a growing success: in just over a year after launch, we are in more than 6% of all US schools

Millions of teachers around the USA are struggling to help 30+ students in their class learn math because every student is in their own place. In a typical fourth grade classroom, there may be students learning to count, students learning to add, students learning to multiply, and students learning how exponents work - and one teacher somehow needs to address all these needs.

Thousands of teachers use Front Row every day to save hours of time and make sure their students are growing quickly. Front Row active users have been growing 100% a month for the past 8 school months.

== The Role ==

As one of our very first engineers, you will be part of a team of developers who are passionate about their vocation, take pride in their craft and who push each other to grow as professionals. You will strive for pragmatism and 80/20 in your work. You will be using tools that provide you with the most leverage and make you most effective. By working really smart, you will produce more than the average developer ever will, but without the crazy hours.

You will work in an effective team that plans, executes and reflects together. Because we’re a small team, everything you create will go into production and be used by students. You will never do unimportant work: every contribution will make a clear and tangible impact on the company’s trajectory. Your personal success will be directly aligned with that of the company.

Most importantly, your work will have purpose: Front Row is a mission-driven company that takes pride in making a significant impact in the lives of hundreds of thousands of students.

== Benefits ==

• Competitive salary

• Generous equity option grants

• Medical, Dental, and Vision

• Food budget

• Work from home up to 1 day a week, extremely flexible work schedule

• Team meal outing every week, and half-day event every month (trip to Sonoma, BBQ, etc)

• Equipment budget

• Flexible, untracked vacation day policy

• Working from Runway - one of the best startup coworking locations right in the heart of San Francisco

Get information on how to apply for this position.

# Senior Software Engineer at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District.

We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

• Move textbooks and learning into the digital era
• Create software used by millions of students
• Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you have do some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

• Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
• Be comfortable moving between multiple programming languages
• Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

# GHC Weekly News - 2014/09/30

Hi *,

Here's some news for y'all! Apologizes about the unavailability last week; the internet wasn't exactly a very fun place for a system administrator...

So without delay, here's the current recap of the past two weeks:

• Lots of merged code and code reviews have gone in, and a lot of commits: in the past two weeks since the last update, ghc.git has seen just over 100 commits, from about a dozen different developers.
• As part of those patches, a significant amount of them have gone towards implementing the "Burning Bridges Proposal" or BBP for the base library. This is a set of changes to base that have generalized many parts of the API, by putting Traversable and Foldable in the Prelude. This required a bit of shoveling by Herbert, but now this is all in GHC HEAD, and will be part of 7.10:
• Prelude combinators, like mapM, have been generalized to the Traversable and Foldable classes.
• Several other modules, like Control.Monad and Data.List, have been generalized to Traversable and Foldable where applicable.
• Control.Monad combinators generalized to Applicative where possible.
• Similarly, MonadPlus combinators like guard are generalized to Alternative.
• Foldable has been extended with new methods, like length and null.
• But also, GHC's compiler is now tab-free! That's right, after what seemed like a million years, a very large portion of the code has been detabbed, and -fwarn-tabs is now on by default in the GHC build.
• There are an assortment of other changes: GHC's linker is not as loud[1], and various documentation improvements.
• The windows build is broken *again* unfortunately, this time due to what seems to be a Cabal update, I think. Austin is once again on the case.
• The HCAR draft for October has seen some nice improvements. If you're a developer, please amend things. If you're a user, read with eager anticipation of all the new features![2]
• It turns out the new Applicative/Monad changes have unfortunately broken the haskell98 and haskell2010 packages, with an unclear migration strategy for the future: see #9590. For GHC 7.10, it seems the haskell2010 packages will need to change to accomodate these standard deviations. If any users of the haskell98 or haskell2010 packages would speak up to help, that would be fantastic. The discussion will surely continue for a little bit - 7.10 is still a ways off.

In miscellaneous news:

• ghc.haskell.org may have been temporarily unavailable during this weekend due to an emergency downtime with our provider for a security update, but the window was quite small.
• Relatedly (but not the exact same scenario), the internet also caught fire in several other places, requiring quite a lot of careful watching and remediation after the Bash "ShellShock" bug hit last week.

And I think that sums it up quite nicely, folks!

Closed tickets for the past two weeks include (there are a lot of them!): #9650, #7068, #5190, #5880, #8374, #9603, #3303, #3470, #3509, #781, #8115, #9641, #9191, #9515, #9326, #7987, #9634, #9576, #9612, #9635, #8593, #7544, #8529, #9338, #5794, #9535, #3646, #617, #8026, #8480, #8881, #9366, #8251, #7673, #8983, #8369, #8897, #8070, #9550, #9057, #9629, #8836, #8960, #8941, #9565, #9589, #5248, #8918, #9568, #9620, #1042, #9557, #7863, #5647, #9610, #5395, #9580, #9529, #4426, #9607, #9592, #8097, #9559, #9560, #4218, #9602, #9528, #9530, #9423, #9400, #1407, #9598, #9597.

I'd like to mention that for the above tickets, a *huge* amount of them were closed by one of our newest contributors, Thomas Miedema, who went through the bug tracker and confirmed or closed a large majority of them. I lost track of how many. Thanks Thomas!

# List of publications and talks updated

The list of publications and talks on my website has been pretty much out of date. Now it is updated, for you to enjoy. :-)

Tagged: publication, talk

# Announcing Yesod 1.4

We are happy to announce the release of Yesod 1.4. This includes:

• Releases of all Yesod packages to support version 1.4.
• The book content on yesodweb.com is completely updated for Yesod 1.4, with all snippets confirmed to compile and most of the text proofread from scratch for accuracy (in the next week the rest will be finished).
• A new Stackage snapshot available for GHC 7.8.3.

Its worth mentioning that there have been a ton of improvements to Yesod since version 1.2, they just didn't need any breaking changes.

Thanks to everyone who provided code, feedback, and testing for this release, it should be a very solid one!

## Changelog

What is most exciting to report is that this was a very minor change to Yesod, and therefore most code should be upgradeable with minor changes. First, the changelog of breaking changes:

### New routing system with more overlap checking control

This requires OverloadedStrings and ViewPatterns. The generated code is faster and much more readable.

Yesod routes are not just type-safe, they also check for overlapping that could cause ambiguity. This is a great feature, but sometimes it gets in your way. Overlap checking can be turned off for multipieces, entire routes, and parent routes in a hierarchy. For more information, see the commit comment.

### Dropped backwards compatibility with older versions of dependencies

In particular, persistent-1 and wai-2. We will talk more about persistent 2. wai-3 uses a CPS style that will require some middleware to have an additional CPS parameter. Looking at the wai-extra source code can help with upgrading, but it should just be adding an extra parameter.

There is better support for non-persistent backends in yesod-auth. See pull request 821 for details. For most users, you can fix this by adding instance YesodAuthPersist App to your Foundation.hs.

yesod-auth already released a breaking change to be able to accept JSON everywhere. That bumped the version to 1.3 We like to keep the yesod-* packages in sync, so now everything is getting bumped to 1.4 together.

In the 1.4 release, we also fixed requireAuth and and requireAuthId to return a 401 response when a JSON response is requested. See pull request 783.

### yesod-test sends HTTP/1.1 as the version

This may require updating tests to expect 303 instead of 302 redirects.

### Type-based caching with keys.

The Type-based caching code was moved into a separate module without Yesod dependencies and documented. If there is interest in seeing this as a separate package let us know, but it is also pretty easy to just copy the module.

To me, TypeCache is a beautiful demonstration of Haskell's advanced type system that shows how you can get the best of both worlds in a strongly typed language.

type TypeMap      = HashMap TypeRep Dynamic

Above we have the wonderful juxtaposition of Haskell's strong typing in the Key, and dynamic typing in the value. This HashMap is used to cache the result of a monadic action.

cached :: (Monad m, Typeable a)
=> TypeMap
-> m a                       -- ^ cache the result of this action
-> m (Either (TypeMap, a) a) -- ^ Left is a cache miss, Right is a hit

Dynamic is used to have a HashMap of arbitrary value types. TypeRep is used to create a unique key for the cache. Yesod uses this to cache the authentication lookup of the database for the duration of the request.

newtype CachedMaybeAuth val = CachedMaybeAuth { unCachedMaybeAuth :: Maybe val }
deriving Typeable

cachedAuth
= fmap unCachedMaybeAuth
. cached
. fmap CachedMaybeAuth
. getAuthEntity

CachedMaybeAuth is a newtype that isn't exported. TypeRep is specific to a module, so this pattern guarantees that your cache key will not conflict outside of your module.

This functionality was in yesod-1.2 even though the code was not separated into a new module. The 1.4 release adds the ability to cache multiple values per type

type KeyedTypeMap = HashMap (TypeRep, ByteString) Dynamic

cachedBy :: (Monad m, Typeable a)
=> KeyedTypeMap
-> ByteString                     -- ^ a cache key
-> m a                            -- ^ cache the result of this action
-> m (Either (KeyedTypeMap, a) a) -- ^ Left is a cache miss, Right is a hit

This is useful if your monadic action has inputs: if you serialize them to a ByteString you can use thm as a key.

The most significant set of changes in the Yesod ecosystem actually landed in Persistent 2. However, these were mostly internal changes with new features that maintain backwards compatibility, so many users will be unaffected.

To kickoff the upgrade process, you need to change update your cabal file to allow yesod version 1.4. If you had constraints on persistent, update them to > 2.1 If you are using cabal freeze to peg your versions in the cabal.config file, cabal will provide you no assistance in making a smooth upgrae. You are probably going to want to delete a whole lot of things in cabal.config (or possibley the entire file), and upgrade a lot of dependencies at once. When you are done and things compile again, you will want to do a cabal freeze

As has become the custom for each major release, the upgrade process is documented by the diff of the Haskellers code base upgrading to Yesod 1.4. For Haskellers it was pretty simple.

In sum:

• Replace type YesodPersistBackend App = SqlPersist with type YesodPersistBackend App = SqlBackend.
• Add instance YesodAuthPersist App to Foundation.hs.
• Add the ViewPatterns language extension.

If you have more complex persistent code you may have more to do. Look at the previous post on persistent-2.1

# How we might abolish Cabal Hell, part 1

At ICFP a few weeks ago a hot topic in the corridors and in a couple talks was the issues surrounding packaging and “Cabal Hell”.

Fortunately we were not just discussing problems but solutions. Indeed I think we have a pretty good understanding now of where we want to be, and several solutions are in development or have reasonably clear designs in peoples’ heads.

I want to explain what’s going on for those not already deeply involved in the conversation. So this is the first of a series of blog posts on the problems and solutions around Cabal hell.

There are multiple problems and multiple solutions. The solutions overlap in slightly complicated ways. Since it is a bit complicated, I’m going to start with the big picture of the problems and solutions and how they relate to each other. In subsequent posts I’ll go into more detail on particular problems and solutions.

## “Cabal hell”: the problems

So what is “Cabal hell”? Let’s consult the dictionary…

Cabal Hell

The feeling of powerlessness one has when Cabal does not do what one wanted and one does not know how to fix it.

I’m joking obviously, but my point is that Cabal hell is not a precise technical term. There are a few different technical problems (and misunderstandings and UI problems) that can cause Cabal hell.

A useful concept when talking about this topic is that of the packaging “wild wild west”. What we mean is whether we are in a context where we reasonably expect packages to work together (because there has been some deliberate effort to make them work together), or if we are in the “wild wild west”. In the “wild wild west” we have to do things like deal with packages that were uploaded yesterday by multiple different authors. The point is that nobody has yet had time to try and make things consistent. It is a useful concept because we have developers who need to deal with the “wild wild west” and those who would really rather not, and the solutions tend to look a bit different.

Another term we often use when talking about packages is “consistency”. What we mean is that in a collection of packages there is at most one version of each package. For example when you ask cabal-install to install package A and B, we say that it will try to find a “consistent” set of dependencies – meaning a set including A, B and their dependencies that has only one version of any package.

### “Cabal hell”: the symptoms

So lets consider a breakdown of the technical problems. To start with lets look at a breakdown based on the symptoms that a developer in Cabal Hell experiences

We can first break things down by whether there is a solution or not. That is, whether a perfect dependency resolver could find a plan to install the package(s) and their dependencies consistently. We want such a solution because it’s a prerequisite for installing working packages. (We’re ignoring the possibility that there is a solution but the solver fails to find one. That is possible but it’s a relatively rare problem.)

Given the situation where the solver tells us that there is no solution, there are a few different cases to distinguish:

No solution expected

The failure was actually expected. For example a developer updating their package to work with the latest version of GHC is not going to be surprised if their initial install attempt fails. Then based on what the solver reports they can work out what changes they need to make to get things working.

The more common case is that the developer was not expecting to be working in the wild west. The developer had an expectation that the package or packages they were asking for could just be installed. In this case the answer “no that’s impossible” from the solver is very unhelpful, even though it’s perfectly correct.

Unnecessary solver failure

The symptoms here are exactly the same, namely the solver cannot find a solution, but the reason is different. More on reasons in a moment.

Even when there is a solution we can hit a few problems:

Compile error

Compilation can fail because some interface does not match. Typically this will manifest as a naming error or type error.

Breaking re-installations

Cabal’s chosen solution would involve reinstalling an existing version of a package but built with different dependencies. This re-installation would break any packages that depend on the pre-existing instance of the installed package. By default cabal-install will not go ahead with such re-installation, but you can ask it to do so.

Type errors when using packages together

It is possible to install two package and then load them both in GHCi and find that you cannot use them together because you get type errors when composing things defined in the two different packages.

### “Cabal hell”: the reasons

So those are the major problems. Lets look at some reasons for those problems.

Inconsistent versions of dependencies required

There are two sub-cases worth distinguishing here. One is where the developer is asking for two or more packages that could be installed individually, but cannot be installed and used together simultaneously because they have clashing requirements on their common dependencies. The other is that a package straightforwardly has no solution (at least with the given compiler & core library versions), because of conflicting constraints of its dependencies.

Constraints wrong

With under-constrained dependencies we get build failures, and with over-constrained dependencies we get unnecessary solver failures. That is, a build failure is (almost always) due to dependency constraints saying some package version combination should work, when actually it does not. And the dual problem: an unnecessary solver failure is the case where there would have been a solution that would actually compile, if only the constraints had been more relaxed.

Single instance restriction

Existing versions of GHC and Cabal let you install multiple versions of a package, but not multiple instances of the same version of a package. This is the reason why Cabal has to reinstall packages, rather than just add packages.

Inconsistent environment

These errors occur because cabal-install does not enforce consistency in the developer’s environment, just within any set of packages it installs simultaneously.

We’ll go into more detail on all of these issues in subsequent posts, so don’t worry if these things don’t fully make sense yet.

## “Cabal hell”: the solutions

There are several problems and there isn’t one solution that covers them all. Rather there are several solutions. Some of those solutions overlap with each other, meaning that for some cases either solution will work. The way the solutions overlap with the problems and each other is unfortunately a bit complicated.

Here’s the overview:

So what does it all mean?

We’ll look at the details of the solutions in subsequent posts. At this stage the thing to understand is which solutions cover which problems, and where those solutions overlap.

We’ll start with the two most important solutions. They’re the most important in the sense that they cover the most cases.

Nix-style persistent store with multiple consistent environments

This solves all the cases of breaking re-installations, and all cases of inconsistent environments. It doesn’t help with wrong constraints.

You’ll note that it covers some cases where there is no solution and you might wonder what this can mean. Some cases where there is no solution are due to two (or more) sets of packages that could be installed independently but cannot be installed together consistently. In a nix-style setting it would be possible to offer developers the option to install the packages into separate environments when the solver determines that this is possible.

Curated consistent package collections

These are things like the Debian Haskell packages or Stackage. This solves some cases of each of the different problems: breaking re-installations, inconsistent environments, wrong constraints and lack of consistent solutions. It solves those cases to the extent that the package collection covers all the packages that the developer is interested in. For many developers this will be enough. Almost by definition however it cannot help with the “wild west” of packages because the curation takes time and effort. Unless used in combination with a isolated environment solution (e.g. nix-style, but also less sophisticated systems like hsevn or cabal sandboxes) it does not allow using multiple versions of the collection (e.g. different projects using different Stackage versions).

It is worth noting that these two solutions should work well together. Neither one subsumes the other. We don’t need to pick between the two. We should pick both. The combination would get us a long way to abolishing Cabal hell.

There are also a number of smaller solutions:

Automatic build reporting

This helps with detecting compile errors arising from constraints that are too lax. It doesn’t help with constraints that are too tight. This solution requires a combination of automation and manual oversight to fix package constraints and to push those fixes upstream.

Upper-bound build bots

This is similar to gathering build reports from users, but instead of looking at cases of compile failure (constraints too lax), it explicitly tries relaxing upper bounds and checks if things still compile and testsuites work. Again, this requires automation to act on the information gleaned to minimise manual effort.

Package interface compatibility tools

This is to help package authors get their dependency constraints right in the first place. It can help them follow a version policy correctly, and tell them what minimum and maximum version bounds of their dependencies to use. It does not completely eliminate the need to test, because type compatibility does not guarantee semantic compatibility. Solutions in this area could eliminate a large number of cases of wrong constraints, both too lax and too tight.

Private dependencies

This allows solutions to exist where they do not currently exist, by relaxing the consistency requirement in a safe way. It means global consistency of dependencies is not always required, which allows many more solutions. This solution would cover a lot of cases in the “wild wild west” of packaging, and generally in the large set of packages that are not so popular or well maintained as to be included in a curated collection.

### Next time…

So that’s the big picture of the problems and solutions and how they relate to each other. In subsequent posts we’ll look in more detail at the problems and solutions, particularly the solutions people are thinking about or actively working on.

### Magnus Therning

Adding tags to a Hakyll site brought some surprises. In retrospect it all makes sense, but it took some thinking on my part to work out the why of it all. The resulting code was heavily influenced by Erik Kronberg’s site.

First I thought I’d just add tags to each rendered post, by building the tags

tags <- buildTags "posts/*" (fromCapture "tags/*.html")

then adding them to the post context

let postCtx =
field "previousPostUrl" (previousPostUrl "posts/*") <>
field "previousPostTitle" (previousPostTitle "posts/*") <>
field "nextPostUrl" (nextPostUrl "posts/*") <>
field "nextPostTitle" (nextPostTitle "posts/*") <>
field "postId" getPostId <>
tagsField "tags" tags <>
baseCtx

and last modify the template

<p>Tags: $tags$</p>

Easy! Except it doesn’t work that way. The $tags$ is always empty. To actually get the tagsField to work as intended it’s necessary to build the tag pages, which can be accomplished using tagsRules

tagsRules tags $\ tagStr pattern -> do route idRoute compile$ do
posts <- loadAll pattern >>= recentFirst
let tagsCtx =
constField "thetag" tagStr <>
listField "posts" baseCtx (return posts) <>
baseCtx
makeItem ""
>>= relativizeUrls

The template for the tags pages is very simple at the moment

<h1>Posts tagged $thetag$</h1>
<ul>
$for(posts)$
<li>
<a href="$url$">$title$</a> - $date$
</li>
$endfor$
</ul>

That’s it. With that in place the $tags$ field renders properly in the post pages as well.

# Persistent 2.1 released

Persistent 2.1, a stable release of the next generation of persistent is released to Hackage.

Persistent is an ORM for Haskell that keeps everything type-safe.

Persistent 2.1 features

• a flexible, yet more type-safe Key type

I already announced persistent 2 and the 2.1 release candidate.

Everyone should set their persistent dependencies to > 2.1 && < 3. 2.0.x was the unstable release and is now deprecated.

I want to thank all the early persistent 2 adopters for putting up with a fast-moving, buggy code base. This was an experiment in shipping an unstable version, and what I learned from it is that it was a great process, but we need to make sure Travis CI is running properly, which it is now!

### Persistent 2.1 library support

The persistent and persistent-template libraries should support any kind of primary key type that you need. The backends are still catching up to the new features

• persistent-sqlite backend has fully implemented these features.
• persistent-postgres and persitent-mysql don't yet support changing the type of the id field
• persistent-mongoDB does not yet support composite primary keys

All of the above packages except persistent-mysql are being well maintained, but just developing new features at their own pace. persistent-mysql is in the need of a dedicated maintainer. There are some major defects in the migration code that have gone unresolved for a long time now.

• persistent-redis is in the process of being upgraded to 2.1
• persistent-zookeeper was just released, but it is on persistent 1.3.*
• There are other persistent packages out there that I have not had the chance to check on yet, most noteably persistent-odbc. Feel free to ask for help with upgrading.

Simple persistent usage may not need any changes to upgrade.

The fact that the Key type is now flexible means it may need to be constrained. So if you have functions that have Key in the type signature that are not specific to one PersistEntity, you may need to constrain them to the BackendKey type. An easy way to do this is using ConstraintKinds.

type DBEntity record =
( PersistEntityBackend record ~ MongoContext
, PersistEntity record
, ToBackendKey MongoContext record
)

A SQL user would use SqlBackend instead of MongoContext. So you can now change the type signature of your functions:

- PersistEntity record => Key record
+ DBEntity record => Key record

Depending on how you setup your monad stacks, you may need some changes. Here is one possible approach to creating small but flexible Monad stack type signatures. It requires Rank2Types, and the code show is specialized to MongoDB.

type ControlIO m = ( MonadIO m , MonadBaseControl IO m)
type LogIO m = ( MonadLogger m , ControlIO m)

-- these are actually types, not constraints
-- with persistent-2 things work out a lot easier this way
type DB    a =  LogIO m => ReaderT MongoContext m a
type DBM m a =  LogIO m => ReaderT MongoContext m a

The basic type signature is just DB () (no constraints required). For working with different monad stacks, you can use DBM. If you are using conduits, you will have MonadResource m => DBM m (). Here is another example:

class Monad m => HasApp m where
getApp :: m App
instance HasApp Handler where
getApp = getYesod
instance HasApp hasApp => HasApp (ReaderT MongoContext hasApp) where
getApp = lift $getApp instance MonadIO m => HasApp (ReaderT App m) where getApp = ask -- | synonym for DB plus HasApp operations type DBApp a = HasApp m => DBM m a type DBAppM m a = HasApp m => DBM m a  With this pattern our return type signature is always ReaderT MongoContext m, and we are changing m as needed. A different approach is to have a return type signature of m and to place a MonadReader constraint on it. type Mongo m = (LogIO m, MonadReader MongoContext m) Right now this approach requires using a call to Database.MongoDB.liftDB around each database call, but I am sure there are approaches to dealing with that. One approach would be to wrap every persistent "primitive" with liftDB. ### Robert Harper # Structure and Efficiency of Computer Programs For decades my colleague, Guy Blelloch, and I have been promoting a grand synthesis of the two “theories” of computer science, combinatorial theory and logical theory. It is only a small exaggeration to say that these two schools of thought operate in isolation. The combinatorial theorists concern themselves with efficiency, based on hypothetical translations of high-level algorithms to low-level machines, and have no useful theory of composition, the most important tool for developing large software systems. Logical theorists concern themselves with composition, emphasizing the analysis of the properties of components of systems and how those components may be combined; the entire subject of logic is a theory of composition (entailment). But relatively scant attention is paid to efficiency, and, to a distressingly large extent, the situation is worsening, rather than improving. Guy and I have been arguing, through our separate and joint work, for the applicability of PL ideas to algorithms design, leading, for example, to the concept of adaptive programming that has been pursued aggressively by Umut Acar over the last dozen years. And we have argued for the importance of cost analysis, for various measures of cost, at the level of the code that one actually writes, rather than in terms of how it is supposedly compiled. Last spring, after initial discussions with Anindya Banerjee at NSF last winter, I decided to write a position paper on the topic, outlining the scientific opportunities and challenges that would arise from an attempt to unify the two, disparate theories of computing. The first draft was circulated privately in May, and was revised, very lightly, in July in preparation for a conference call sponsored by NSF among a group of leading researchers in both PL’s and Algorithms with the goal to find common ground and isolate key technical challenges. I am delighted (and relieved) to see that Swarat Chaudhuri, in a post on his PL Enthusiast blog, has heartily endorsed our proposal for a grand synthesis of the “two theories” of Computer Science. During his visit, Swarat had lengthy discussions with Umut, Guy, and me on our work in both research and education, but were surprised and disheartened by his opposition to our approach. His concern was based on the common misapprehension that it is impossible to give a useful cost model for the lambda calculus, which would thereby undermine the entire body of work on which Guy and I, among others, had been pursuing for decades. Coming from such a distinguished researcher as Chaudhuri, his opposition created for us a period of anxiety, could we be wrong? But no, it is simply not true. Guy and John Greiner provided an adequate cost model for the lambda calculus (under both a sequential and a parallel interpretation) in 1993, and that paper has withstood decades of scrutiny. But it did take quite some time to sort this out to be absolutely sure. For some mysterious reason, when it comes to the lambda calculus nonsense gets twice around the world before the truth can get its boots on, to borrow a turn of phrase from Mark Twain. After some back-and-forth, the matter is settled, and I am delighted that we can now count Swarat among our supporters. It would have been a heavy burden for us to have to bear the opposition of a distinguished researcher such as himself to the very foundations of our proposed program. Which is not to say that there are not serious obstacles to be overcome if such a grand synthesis is to be accomplished. The first step is to get the right people together to discuss the issues and to formulate a unified vision of what are the core problems, and what are promising directions for the short- and long-term. To this end there is likely to be a workshop held during the next academic year to start addressing these problems at a scientific level. Contrary to what is implied in the PL Enthusiast post, my position paper is not a proposal for funding, but is rather a proposal for a scientific meeting designed to bring together two largely (but not entirely) disparate communities. This summer NSF hosted a three-hour long conference call among a number of researchers in both areas with a view towards formulating a workshop proposal in the near future. Please keep an eye out for future announcements. I think there are many good problems to be considered, and many opportunities for new results. I would like to mention that I am particularly grateful to Anindya Banerjee at NSF for initiating the discussion last winter that led to my position paper, and for helping to move forward the process of formulating a workshop proposal. And I am very happy to learn of Swarat’s endorsement; it will go a long way towards helping attract interest from both sides of the great divide. Filed under: Research Tagged: algorithms, programming languages, research ### Magnus Therning # Adding support for comments It seems most people using Hakyll or other static site generators rely on services like Disqus, but I really don’t like the idea of putting a bunch of JavaScript on each page and dynamically loading all comments off some cloud storage. It sorts of fly in the face of the idea of having a static site to begin with. Searching online resulted in a few posts related to a plugin for static comments in Jekyll. This post only covers dealing with the comments, and not how the reader actually submits a comment. I’ll save that for the next post. ## Code changes I settled on the following naming scheme for comments. The comments for a post P, which is found at posts/<P>.mkd will be put into files named comments/<P>-c000.mkd, comments/<P>-c001.mkd, and so on. The crucial bits are that, first the post’s name is a prefix of all its comments’ names, and two the identifiers (basically the filenames) of the comments are, just like identifiers for posts, easy to sort in date order. Adding a rule for the comments is easy: match "comments/*"$ compile pandocCompiler

Then it got a little more tricky. The comments for each post needs to be put into the context used to build the posts. Previously I’ve used field, which takes a function turning an Item String into String. I’ve also used listField which is used to tie a key to a list of Item a. What I needed here though doesn’t seem to exist, i.e. a context function that takes an Item a and returns a list of Item s. So after a bit of studying the source of field and listField I came up with listFieldFunc:

listFieldFunc :: Show a => String -> Context a -> (Item a -> Compiler [Item a]) -> Context a
listFieldFunc key ctx func = Context $\ k i -> if k == key then value i else empty where value i = do is <- func i return$ ListField ctx is

The function for extracting a post’s comments can then be written as

getComments :: (Binary a, Typeable a) => Pattern -> Item a -> Compiler [Item a]
idents <- getMatches pattern >>= sortChronological
let iId = itemIdentifier item
comments = filter (isCommentForPost iId) idents

isCommentForPost :: Identifier -> Identifier -> Bool
isCommentForPost post comment = let
postBase = takeBaseName $toFilePath post cmtBase = takeBaseName$ toFilePath comment
in isPrefixOf postBase cmtBase

Adding the key to the context used for the posts results in

let postCtx =
field "previousPostUrl" (previousPostUrl "posts/*") <>
field "previousPostTitle" (previousPostTitle "posts/*") <>
field "nextPostUrl" (nextPostUrl "posts/*") <>
field "nextPostTitle" (nextPostTitle "posts/*") <>
field "postId" getPostId <>
baseCtx

## Template changes

The template changes are trivial of course

$for(comments)$
<div>
<p>$author$</p>
$body$
</div>
$endfor$

# Notes on Abstract and Existential Types

Posted on September 29, 2014

I’m part of a paper reading club at CMU. Last week we talked about a classic paper, Abstract Types have Existential Type. The concept described in this paper is interesting and straightforward. Sadly some of the notions and comparisons made in the paper are starting to show their age. I thought it might be fun to give a tldr using Haskell.

The basic idea is that when we have an type with an abstract implementation some functions upon it, it’s really an existential type.

To exemplify this let’s define an abstract type (in Haskell)

    module Stack (Stack, empty, push, pop) where
newtype Stack a = Stack [a]

empty :: Stack a
empty = Stack []

push :: a -> Stack a -> Stack a
push a (Stack xs) = Stack (a : xs)

pop :: Stack a -> Maybe a
pop (Stack []) = Nothing
pop (Stack (x : xs)) = Just x

shift :: Stack a -> Maybe (Stack a)
shift (Stack []) = Nothing
shift (Stack (x : xs)) = Just (Stack xs)

Now we could import this module and use its operations:

    import Stack

main = do
let s = push 1 . push 2 . push 3 $empty print (pop s) What we couldn’t do however, is pattern match on stacks to take advantage of its internal structure. We can only build new operations out of combinations of the exposed API. The classy terminology would be to say that Stack is abstract. This is all well and good, but what does it mean type theoretically? If we want to represent Haskell as a typed calculus it’d be a shame to have to include Haskell’s (under powered) module system to talk about abstract types. After all, we’re not really thinking about modules as so much as hiding some details. That sounds like something our type system should be able to handle without having to rope in modules. By isolating the concept of abstraction in our type system, we might be able to more deeply understand and reason about code that uses abstract types. This is in fact quite possible, let’s rephrase our definition of Stack  module Stack (Stack, StackOps(..), ops) where newtype Stack a = Stack [a] data StackOps a = StackOps { empty :: Stack a , push :: a -> Stack a -> Stack a , pop :: Stack a -> Maybe a , shift :: Stack a -> Maybe (Stack a) } ops :: StackOps ops = ... Now that we’ve lumped all of our operations into one record, our module is really only exports a type name, and a record of data. We could take a step further still,  module Stack (Stack, StackOps(..), ops) where newtype Stack a = Stack [a] data StackOps s a = StackOps { empty :: s a , push :: a -> s a -> s a , pop :: s a -> Maybe a , shift :: s a -> Maybe (s a) } ops :: StackOps Stack ops = ... Now the only thing that needs to know the internals of Stack. It seems like we could really just smush the definition into ops, why should the rest of the file see our private definition.  module Stack (StackOps(..), ops) where data StackOps s a = StackOps { empty :: s a , push :: a -> s a -> s a , pop :: s a -> Maybe a , shift :: s a -> Maybe (s a) } ops :: StackOps ??? ops = ... Now what should we fill in ??? with? It’s some type, but it’s meant to be chosen by the callee, not the caller. Does that sound familiar? Existential types to the rescue!  {-# LANGUAGE PolyKinds, KindSignatures, ExistentialQuantification #-} module Stack where data Packed (f :: k -> k' -> *) a = forall s. Pack (f s a) data StackOps s a = StackOps { empty :: s a , push :: a -> s a -> s a , pop :: s a -> Maybe a , shift :: s a -> Maybe (s a) } ops :: Packed StackOps ops = Pack ... The key difference here is Packed. It lets us take a type function and instantiate it with some type variable and hide our choice from the user. This means that we can even drop the whole newtype from the implementation of ops  ops :: Packed StackOps ops = Pack$ StackOps { empty = []
, push  = (:)
, pop   = fmap fst . uncons
, shift = fmap snd . uncons }
where uncons [] = Nothing
uncons (x : xs) = Just (x, xs)

Now that we’ve eliminated the Stack definition from the top level, we can actually just drop the notion that this is in a separate module.

One thing that strikes me as unpleasant is how Packed is defined, we must jump through some hoops to support StackOps being polymorphic in two arguments, not just one.

We could get around this with higher rank polymorphism and making the fields more polymorphic while making the type less so. We could also just wish for type level lambdas or something. Even some of the recent type level lens stuff could be aimed at making a general case definition of Packed.

From the client side this definition isn’t actually so unpleasant to use either.

    {-# LANGUAGE RecordWildCards #-}

someAdds :: Packed Stack Int -> Maybe Int
someAdds (Pack Stack{..}) = pop (push 1 empty)

With record wild cards, there’s very little boilerplate to introduce our record into scope. Now we might wonder about using a specific instance rather than abstracting over all possible instantiations.

    someAdds :: Packed Stack Int -> Maybe Int
let (Pack Stack{..}) = ops in
pop (push 1 empty)

The resulting error message is amusing :)

Now we might wonder if we gain anything concrete from this. Did all those language extensions actually do something useful?

Well one mechanical transformation we can make is that we can change our existential type into a CPS-ed higher rank type.

    unpackPacked :: (forall s. f s a -> r) -> Packed f a -> r
unpackPacked cont (Pack f) = cont f

someAdds' :: Stack s Int -> Maybe Int
someAdds' Stack{..} = pop (push 1 empty)

someAdds :: Packed Stack Int -> Maybe Int
someAdds = unpackPacked someAdds'

Now we’ve factored out the unpacking of existentials into a function called unpack. This takes a continuation which is parametric in the existential variable, s.

Now our body of someAdds becomes someAdds, but notice something very interesting here, now s is a normal universally quantified type variable. This means we can apply some nice properties we already have used, eg parametricity.

This is a nice effect of translating things to core constructs, all the tools we already have figured out can suddenly be applied.

## Wrap Up

Now that we’ve gone through transforming our abstract types in existential ones you can final appreciate at least one more thing: the subtitle on Bob Harper’s blog. You can’t say you didn’t learn something useful :)

I wanted to keep this post short and sweet. In doing this I’m going to some of the more interesting questions we could ask. For the curious reader, I leave you with these

• How can we use type classes to prettify our examples?
• What can we do to generalize Packed?
• How does this pertain to modules? Higher order modules?
• How would you implement “sharing constraints” in this model?
• What happens when we translate existentials to dependent products?

Cheers.

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js'; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

# sml-family.org up and running!

After far too long, and far too many obstacles to be overcome, Dave MacQueen, Lars Bergstrom, and I have finally prepared an open-source site for the entire family of languages derived from Standard ML.  The defining characteristic of Standard ML has always been that it has a mathematically precise definition, so that it is always clear what is a valid program and how it should behave.  And indeed we have seven different working compilers, all of which are compatible with each other, with the exception of some corner cases arising from known mistakes in the definition.  Moreover, there are several active projects developing new variations on the language, and it would be good to maintain the principle that such extensions be precisely defined.

To this end the sources of the 1990 and 1997 versions of the definition are on the web site, with the permission of MIT Press, as is the type-theoretic definition formulated by Chris Stone and me, which was subsequently used as the basis for a complete machine-checked proof of type safety for the entire language done by Karl Crary, Daniel K. Lee.  It is be hoped that the errors in the definition (many are known, we provide links to the extensive lists provided by Kahrs and Rossberg in separate investigations) may now be corrected by a community process.  Anyone is free to propose an alteration to be merged into the main branch, which is called “SML, The Living Language” and also known as “Successor ML”.  One may think of this as the third edition of the definition, but one that is in continual revision by the community.  Computer languages, like natural languages and like mathematics, belong to us all collectively, and we all contribute to their evolution.

Everyone is encouraged to create forks for experimental designs or new languages that enrich, extend, or significantly alter the semantics of the language.  The main branch will be for generally accepted corrections, modifications, and extensions, but it is to be expected that completely separate lines of development will also emerge.

The web site, sml-family.org is up and running, and will be announced in various likely places very soon.

Update: We have heard that some people get a “parked page” error from GoDaddy when accessing sml-family.org.  It appears to be a DNS propagation problem.

Update: The DNS problems have been resolved, and I believe that the web site is stably available now as linked above.

Filed under: Research Tagged: functional programming, sml

### Matthew Sackman

I remember when I first went to University, coming across people who were both clearly extremely expert in their fields, from whom I wanted to learn, but were also religious, and how this utterly baffled me. At that point I would cheerfully describe myself as an avid atheist. My ignorance and naivety was somewhat extensive.

Over a decade later I like to think I have a more nuanced view. The most recent war in Gaza led, obviously, to vast amounts of suffering but some excellent articles on the subject (this one by Hadley Freeman in particular) helped me see perspectives more clearly and articulated how crucial it is to be precise with criticism: are you criticising a religion, a people, a government, a policy or something else? Nothing is ever black-and-white and it seems increasingly important to anticipate the consequences of an ill-thought-through comment or reaction. A good example of that is George Galloway's comments this week in the debate about this country once again getting involved in Iraq. On the face of it, and certainly without being remotely well-enough informed to evaluate the accuracy of his claims, if his claims on the size and makeup of ISIS/ISIL are true then there seems little likelihood that the bombing campaigns being discussed will be effective, and quite likely counter-productive. But all of that got lost due his description of Iraqis as quiescent. The way in which that description was seized upon by other MPs and the resultant media storm resulted in the over-shadowing not just of the rest of his contribution to the debate, but also of other important aspects of the debate, such as the resignation of Rushanara Ali (Labour's Shadow Minister for Education), citing once again the lack of a credible long-term plan for the region and our involvement.

Addressing the broader and somewhat more abstract issue is this enlightening article by Karen Armstrong. Again, I'm not claiming to be expert in the area, merely I found the article very educative. It had barely occurred to me that the western world's separation of the secular from the sacred was firstly such a recent occurrence, and secondly that it arose from a specific set of circumstances. There is no implicit reason why separation of state from church is an inevitable or even likely happenstance (to me, this reminds me of the question "if humans evolved from monkeys, then why can't we find monkeys still evolving into humans today?", to which the answer is "the circumstances are not right for that to occur"). The fact that the English word "religion" can't really be translated accurately into other languages (especially not languages that predate English such as Greek or Latin; as historically faith is all encompassing of life, not merely a private affair as we treat it today in the west) starts to show quite how odd the separation of secular from sacred in the modern west really is.

More interesting still is the observation that in the west, belonging to a Nation has in some ways subsumed the role of belonging to a Religion, only apparently with more positive overtones: we consider it almost reprehensible to die for your religion, but honourable to die for your nation. It would seem the concept of even belonging to a nation and having any sense of greater community outside your immediate surroundings only came about with the increased ability of governments to engage with (or intrude upon) their citizens. Before that point, presumably with church attendance widespread and frequent, one's interaction with "the wider world" was through the representative of the church. This would seem to explain a lot about why governments of the past sought the blessing of their nation's church for particular courses of action: maybe the church was seen as the bridge between the government (or monarchy) and the people. The whole article is worth a read.

# Announcing: Heist v0.14

The Snap team is happy to announce the release of version 0.14 of Heist.

## Major Changes

### Namespace Support

Heist now has support for namespaces. This allows you to configure Heist so that all of your splices require a namespace. Requiring a namespace allows Heist to be more aggressive with errors for unbound splices. For instance, imagine you set the hcNamespace field in your HeistConfig to “h”, and you bind two splices.

mySplices = do
"foo" #! fooSplice
"bar" #! barSplice

With this setup, you would put the “h” namespace on all of the splice tags in your templates. Instead of calling those splices with “<foo>” and “<bar>”, you would use “<h:foo>” and “<h:bar>”. So why go to all this trouble so you have to type more? Because it allows Heist to do more error checking. Without namespaces there is no way for Heist to know whether a tag is supposed to be a splice or not. We could use the list of valid HTML tags to infer it, but that significantly constrains the scope of what we intended for Heist. This approach allows the user to be explicit about which tags should be splices. If you do not want to use namespaces, set the namespace to the empty string.

Along with the namespace field, we introduced the hcErrorNotBound for controlling the error checking. When hcErrorNotBound is True, Heist will generate an error message if it encounters any namespaced tags that do not have splices bound for them. This eliminates a large class of bugs where users were using a splice in their templates, but forgot to bind it in their code. The intent is that users will usually want to have this error checking turned on. But we felt it was also important to be able to use namespacing without the strict enforcement, so we included this flag to give users full control.

### Generalized Error Reporting

Since this release is doing more error checking, we decided to expose error facilities to the user. This release exposes a new function tellSpliceError that you can use when error conditions are detected in your splices. If you are using compiled Heist, then all your templates will be processed up front at load time. If any of your load time or compiled splices detect an error condition that the user needs to fix, you can call tellSpliceError with an error message. If there are any errors, Heist initialization will fail and all the errors will be returned.

### Restructured HeistConfig

The addition of hcNamespace and hcErrorNotBound to HeistConfig required some restructuring. Previously HeistConfig had a Monoid instance, but we removed that since the new fields make it unclear which instance should be used. But we also did not want to completely get rid of the monoid functionality either. So in order to get the best of both worlds, we refactored all of HeistConfig’s previous fields into another data structure called SpliceConfig. This way we can keep the Monoid instance for SpliceConfig and still avoid imposing a weird set of semantics on the user.

Unfortunately, given the use of field accessors it was impossible to make this change without breaking backwards compatibility. What seems like it should have been a simple addition of a couple parameters ended up being a more significant refactoring. To make these kinds of changes easier in the future Heist now exports lenses to all of the HeistConfig fields as well as an emptyHeistConfig value to use as a starting point. These lenses work with both the lens and lens-family-core packages and we export them without incurring a dependency on either lens package.

The HeistConfig constructor and field accessors have been moved to the Heist.Internal.Types module, so if you really need them, they are still available. However, be aware that Internal modules are subject to change without the deprecation cycle that we use for other modules.

## Minor improvements

• Factored out SpliceAPI module into separate map-syntax package and deprecated the old module which will be removed in the next major release.

• Snap has been updated to support this Heist 0.14.

## September 23, 2014

### Well-Typed.Com

The Haskell eXchange will take place in London for the third time now, and I’m happy to report that there’s going to be a really fantastic program again:

As is almost traditional by now, Simon Peyton Jones (Microsoft Research) himself will open the conference, this time with a talk on “Safe, zero-cost coercions in Haskell”.

This is followed by Jeremy Gibbons (University of Oxford) with “Categories for the Working Haskeller”, explaining to Haskell developers and people who are interested in Haskell what role category theory plays in Haskell and if/how categories can be helpful.

I’m very pleased that Bryan O’Sullivan (Facebook) has agreed to give a presentation at the Haskell eXchange this year. As the author of countless Haskell libraries (many of which are among the most used in the entire Haskell ecosystem), and being a co-author of the widely known O’Reilly book “Real World Haskell”, he’s certainly learned how to squeeze a bit of extra performance out of Haskell code when needed. He’s going to share his experiences and provide valuable advice in his talk.

After lunch, we’ll continue with a pair of talks looking at using Haskell for the development of RESTful web services from slightly different angles. Erik Hesselink (Silk) is going to present the rest framework, which makes it easy to develop and maintain REST APIs, independent of the underlying web framework you use. After that, Chris Dornan (Iris Connect) and Adam Gundry (Well-Typed) will talk about api-tools and in particular address the question of how you can solve the problem of schema migrations nicely.

Tim Williams and Peter Marks (both Barclays) will give a joint talk on Lucid, their in-house non-embedded DSL that is written in Haskell and has a mostly structural type system with interesting features such as row polymorphism and extensible records as well as extensible sum-types.

The talks will be concluded by Oliver Charles (Fynder), well-known for his tireless efforts in the “24 days of Hackage” series, who is going to show us how the use of GHC’s most advanced type system extensions helps him write better real-world code at his company.

After the talks, there’s going to be pizza and beer and an informal “Park Bench Discussion” where we can all discuss the questions that have been raised throughout the day in more detail.

I hope you’re as excited about this program as I am: I think there’s a fantastic range of topics covered, from language features and theoretical aspects, practical advice for programmers to interesting case studies of real-world code. Also, it’s an excellent opportunity to meet fellow developers interested in Haskell. If you’re working for a company using Haskell and are looking for new developers, this may be an excellent opportunity to recruit. On the other hand, if you’d like nothing more than a Haskell job, this is an opportunity to meet people who are actually using it in practice, and may have a job for you or at least be able to give you advice on how to find one.

If you haven’t registered yet, please consider doing so! We’re looking forward to meeting you there.

These two successful courses have been offered on a regular basis since October 2012. They’re continuously updated to reflect the latest changes in the Haskell world, such as updates to the infrastructure, new features of the main Haskell compiler GHC, or exciting new libraries.

Both courses are hands-on, comprising a combination of lectures, interactive coding and exercises that the participants are supposed to work on alone or in teams, with the help of the teacher(s).

The Fast Track course is for people who know how to program, but have little or no experience in Functional Programming or Haskell. It teaches Haskell in from scratch in just two days, covering important concepts such as datatypes, polymorphism, higher-order functions, type classes, how IO works in Haskell, and ending with an introduction to monads. It’s also interesting for people who are interested in learning about functional programming in general, because Haskell is a prime example of a functional programming language, and the course focuses on the important programming concepts more than on language peculiarities.

Being in the same week as the Haskell eXchange makes it possible to combine one of the courses (or even both) with the eXchange, where you can hear several other viewpoints and get to know more Haskellers!

### Other courses

We’re offering additional training courses on-demand and on-site for companies in Europe, the US, or anywhere in the world on request. See our training page for more information.

### HacBerlin

By the way, I’m also going to be at the Haskell Hackathon in Berlin this upcoming weekend. On Sunday, I’m going to give a talk on parser combinators. You can still register for the Hackathon, too. It’s free.

# The woes of multiple package versions

When I've answered the same question more than three times, it's usually time to write up a blog post explaining the situation in more detail, and just link people to that in the future. This is such a blog post.

Many people working with Haskell end up with one of these two classes of inexplicable GHC errors:

1. Cannot match ByteString with ByteString, with a whole bunch of package name and version junk as well.

2. SomeTransformerT is not an instance of MonadTrans, when the documentation clearly indicates that SomeTransformerT does define such an instance.

How can a ByteString not be a ByteString? Well, there are two ways I can think of. The first is that you're accidentally trying to mix up a strict ByteString with a lazy ByteString, whose types clearly don't unify. While that problem does pop up, in my experience most people figure that one out pretty quickly. By the time someone's asking a question on Stack Overflow/Reddit/haskell-cafe, it's usually much more insidious: there are two copies of bytestring on their system.

Imagine this situation: you install GHC 7.6, which ships with bytestring-0.10.0.2. You then install text via cabal install text. A few days later, someone mentions that bytestring-0.10.4.0 is out, and it's all new and shiny, so you go ahead and install it with cabal install bytestring. Everything works wonderfully, and life is good. Then you decide to experiment with text a bit, so you write the following program:

{-# LANGUAGE OverloadedStrings #-}
import Data.Text.Encoding (encodeUtf8)
import qualified Data.ByteString.Char8 as S8

main :: IO ()
main = S8.putStrLn $encodeUtf8 "Hello World!" Woe unto you! GHC rejects your program with: foo.hs:6:22: Couldn't match expected type S8.ByteString' with actual type bytestring-0.10.0.2:Data.ByteString.Internal.ByteString' In the return type of a call of encodeUtf8' In the second argument of ($)', namely encodeUtf8 "Hello World!"'
In the expression: S8.putStrLn $encodeUtf8 "Hello World!" When is a ByteString not a ByteString? Here, apprently. Now it turns out the GHC is actually giving you quite of bit of useful information, you just need to know what to look for. It's expecting the type S8.ByteString, which expands to Data.ByteString.Char8.ByteString, which in reality is just a type synonym for Data.ByteString.Internal.ByteString. So what GHC really means is that it can't unify the following two types: expected: Data.ByteString.Internal.ByteString actual: bytestring-0.10.0.2:Data.ByteString.Internal.ByteString Now the difference just jumps out at you: the actual type comes from the bytestring-0.10.0.2 package, whereas the first comes from... well, somewhere else. As I'm sure you're guessing right now, that "somewhere else" is bytestring-0.10.4.0, but GHC doesn't bother telling us that, since including that level of information in every error message would be overwhelming. To step through why this came up exactly: • text is installed against bytestring-0.10.0.2 (it was the only version of bytestring available at the time you installed text). • Therefore, encodeUtf8 will generate a ByteString value from version 0.10.0.2. • Your program imports Data.ByteString.Char8, which is provided by both bytestring-0.10.0.2 and bytestring-0.10.4.0. • GHC's default dependency resolution is: take the latest version of each package, in this case 0.10.4.0. • Now we have a S8.putStrLn function expecting a 0.10.4.0 ByteString, but an encodeUtf8 function returning a 0.10.0.2 ByteString. So how do we work around this problem? I can think of three ways: 1. Explicitly tell GHC which version of the bytestring package you want to use to force consistency, e.g. runghc -package=bytestring-0.10.0.2 foo.hs. 2. Never use GHCi, runghc, or ghc directly from the command line. Instead, always create a cabal file first. cabal's default dependency resolution will force consistent package loading. 3. Don't wind up in the situation in the first place, by ensuring you only have one version of each package installed. That last point is what I strongly recommend to all users. And this is exactly the design goal around Stackage, so it will hopefully not come as a surprise that that's exactly what I recommend most Haskell users use to get their packages installed. Let's demonstrate that second case of MonadTrans. This time, let's try it with GHC 7.8.3. GHC ships with transformers-0.3.0.0. Next, we'll install the either package with cabal install either. Once again, someone comes along and tells us about a shiny new package, transformers-0.4.1.0. Dutifully, we upgrade with cabal install transformers-0.4.1.0. And then we try to run the following simple program: import Control.Monad.Trans.Class import Control.Monad.Trans.Either main :: IO () main = do x <- runEitherT$ lift $putStrLn "hey there!" print (x :: Either () ()) GHC mocks you with: foo.hs:6:23: No instance for (MonadTrans (EitherT ())) arising from a use of ‘lift’ In the expression: lift In the second argument of ‘($)’, namely
‘lift $putStrLn "hey there!"’ In a stmt of a 'do' block: x <- runEitherT$ lift $putStrLn "hey there!" "But EitherT is an instance of MonadTrans!" you insist. That may be true, but it's an instance of the wrong MonadTrans. The either package is built against transformers-0.3.0.0, whereas you've imported lift from transformers-0.4.1.0. This can be worked around as above, with runghc -package=transformers-0.3.0.0 foo.hs. And yet again, my strong recommendation is: use Stackage. There's one more particularly painful thing I need to point out. Some packages are bundled with GHC, and are depended on by the ghc package. The special thing about the ghc package is that it cannot be reinstalled without installing a new version of GHC itself. Any packages depended on by the ghc package cannot be unregistered without breaking ghc, which would in turn break libraries like doctest and hint. If you follow these points to conclusion, this means that you should never upgrade GHC-bundled libraries. I wrote a blog post on this topic, and the takeaway is: please, always support older versions of packages like bytestring, transformers, and- of course- base. There's one final case I want to mention. Try running cabal install data-default-0.5.1 http-client, and then run the following program: import Data.Default import Network.HTTP.Client main :: IO () main = withManager defaultManagerSettings$ \man -> do
res <- httpLbs def man
print res

You'll get the irritating error message:

foo.hs:6:20:
No instance for (Default Request) arising from a use of ‘def’
In the first argument of ‘httpLbs’, namely ‘def’
In a stmt of a 'do' block: res <- httpLbs def man
In the expression:
do { res <- httpLbs def man;
print res }

But if you look at http-client, Request is in fact an instance of Default. "Alright, I know what's going on here" you say. Certainly there are two versions of data-default installed, right? Actually, no, that's not the case. Have a look at the following:

\$ ghc-pkg list | grep data-default
data-default-0.5.1
data-default-class-0.0.1

There's just a single version of each of these packages available. So why are we getting our mysterious error message? Once again, it's because we have two versions of the Default class. After data-default version 0.5.1, data-default split into a number of packages, and the Default class migrated into data-default-class. http-client defines an instance for Default from data-default-class. And if you use data-default version 0.5.2 or higher, it will simply re-export that same class, and everything will work.

However, our cabal install command forced the installation of the older data-default (0.5.1) which defines its own Default typeclass. Therefore, we end up with two separate Default` classes that don't unify. This is a problem that exists whenever packages are split or joined, which is why you should embark on such refactorings with great care.

As it happens, this is yet another problem that is solved by using Stackage, since it forces a consistent set of versions for data-default and data-default-class.