Planet Haskell

September 18, 2014

Keegan McAllister

Raw system calls for Rust

I wrote a small library for making raw system calls from Rust programs. It provides a macro that expands into in-line system call instructions, with no run-time dependencies at all. Here's an example:


#[phase(plugin, link)]
extern crate syscall;

fn write(fd: uint, buf: &[u8]) {
unsafe {
syscall!(WRITE, fd, buf.as_ptr(), buf.len());

fn main() {
write(1, "Hello, world!\n".as_bytes());

Right now it only supports x86-64 Linux, but I'd love to add other platforms. Pull requests are much appreciated. :)

by keegan ( at September 18, 2014 12:33 AM

September 17, 2014

Philip Wadler

The military and the referendum.

Many readers will have heard about Lord Dannat in the Telegraph arguing a vote for independence will dishonour Scotland's war dead.

Perhaps not as many will have heard that Jimmy Sinclair (the oldest surviving Desert Rat, aged 102), Colin May (Lieutenant Commander, Faslane), and sixteen others have written a letter slamming Dannat; at least, I didn't hear until this morning. "How dare he take their sacrifice in vain and try to turn it to political advantage?"

Both sides are reported by the BBC, though the headline mentions only one. (More #bbcbias?)

by Philip Wadler ( at September 17, 2014 08:42 PM

Eric Kidd

Deploying Rust applications to Heroku, with example code for Iron

Now with support for Iron, Cargo and Cargo.lock!

You can deploy an example Rust application to Heroku using this button:


If you'd prefer to use the command line, you'll need git and the Heroku toolbelt. Once these are installed, run:

git clone
cd heroku-rust-cargo-hello
heroku create --buildpack
git push heroku master

This will download the example application, create a new Heroku project, and deploy the code to Heroku. That's it!

How it works

Our server is based on the Iron middleware library. We parse URL parameters and dispatch HTTP requests to the appropriate handler routine using Iron's router module:

fn main() {
    let mut router = Router::new();
    router.get("/", hello);
    router.get("/:name", hello_name);
    Iron::new(router).listen(Ipv4Addr(0, 0, 0, 0), get_server_port());

The hello function returns an HTTP status code and the content to send to the user:

fn hello(_: &mut Request) -> IronResult<Response> {
    Ok(Response::with(status::Ok, "Hello world!"))

The hello_name function is similar, but we look up the value of the :name parameter that we declared to the router, above.

fn hello_name(req: &mut Request) -> IronResult<Response> {
    let params = req.extensions.find::<Router,Params>().unwrap();
    let name = params.find("name").unwrap();
    Ok(Response::with(status::Ok, format!("Hello, {}!", name)))

The final piece needed to run on Heroku is a function to look up up our port number:

fn get_server_port() -> Port {
        .and_then(|s| from_str::<Port>(s.as_slice()))

The full source code is available on GitHub. To learn more about Rust, see the excellent Rust guide.

Keep reading for notes on building the program locally and on configuring your own programs to run on Heroku.

Read more…

September 17, 2014 08:35 PM

Philip Wadler

The case for Europe

<iframe frameborder="0" height="500" src=";title=Today%3A%2017%2F09%2F2014%3A%20Salmond%20promises%20%22common-sense%20agreement%20on%20a%20common%20currency%E2%80%9D&amp;product=iplayer" width="400"></iframe>

Readers of this list will know that I don't always see eye-to-eye with Alex Salmond. Nonetheless, I think he put the case for Europe well this morning on the Today Program.
In a BBC interview just the other night, a spanish minister being
interviewed by Kirsty Wark despite being invited three or four times
refused to say that Spain would attempt to veto Scottish
membership. And the reason for that of course is that the Spanish
government have already said that in the circumstance of a consented
democratic referendum, as they put it, Spain would have nothing to say
about it.

We can go through legal opinion and expert opinion as much as we like.
I think the answer is in four figures: 1, 20, 25, and 60.

1% is the percentage of Scotland's population compared to the European Union.

20% is the percentage of the fish stocks of the entire European Union.

25% is the percentage of the renewable energy of the entire European Union offshore.

And 60% is the oil reserves that Scotland has.

Anyone who believes that a country [with these resources] is not
going to be welcome in the wider Europe doesn't understand the process
by which Europe accepts democratic results and that Scotland has a
huge amount of attractiveness to the rest of the European continent.

You can hear the original here, the relevant segment starts at around 8:00.

by Philip Wadler ( at September 17, 2014 08:10 PM

Mike Izbicki

Beginner error messages in C++ vs Haskell

Beginner error messages in C++ vs Haskell

posted on 2014-09-17 by Paul Starkey

Learning Haskell was excruciating. The error messages from the Haskell compiler ghc were way more difficult to understand than the error messages I was used to from g++. I admit I’m still a novice programmer: My only experience is a year of classes in C++ programming. But the Haskell compiler should do a better job generating error messages for beginners like me.

First we’ll see four concrete examples of ghc doing worse than g++, then Mike will talk about some ways to fix ghc’s error messages.

Example 1

Below are two equivalent C++ and Haskell programs. I’ve intentionally added some syntax errors:

    /* C++ Code */
    #include <iostream>

    using namespace std;

    int main () 
        int in = -1;
        cout << "Please choose 1 for a message" << endl;
        cin >> in;
err->   if in == 1 
            cout << "Hello, World!" << endl;
            cout << "Error, wrong choice" << endl;
        return 0;
    {- Haskell Code -}
    main = do
        putStrLn "Please enter 1 for a message"
        num <- getLine
        if num == "1"
            then do
                putStrLn "Hello, World" 
                putStrLn "Error, wrong choice"

Alright, so the first notable difference is that the Haskell code is much shorter. It takes up roughly half the space that the C++ code does, yet they both output hello world when the correct number is entered.


Haskell already seems better, right?


Notice how I messed up the if statement in both programs. In the C++ version, I forgot the parentheses, and in the Haskell version I forgot the else. Both omissions are simple mistakes that I’ve made while learning these languages.

Now let’s see the error messages:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:15:5: error: expected '(' before 'in'
    main.cpp:19:2: error: 'else' without a previous 'if'
    Compilation failed.
    -- Haskell Error --
        parse error (possibly incorrect indentation or mismatched brackets)
    Failed, modules loaded: none.

Both error messages let the programmer know where the mistake happened, but the g++ message is far more helpful. It tells us how to fix the syntax error by adding some missing parentheses. Bam! Problem solved.

Now let us turn to ghc’s output. Okay, something about a parse error… might have indentation errors… and no modules loaded. Cool. Now I’ve never taken a compiler course, so I don’t know what parse error means, and I have no idea how to fix it. The error message is simply not helpful.

Example 2

Here’s another example of parsing errors.

        /* C++ Code */ 
        #include <iostream>
        using namespace std;
        int main() 
err->       string in = ""
            cout << "Please enter a single word and get the string size back" << endl;
            cin >> in;
            cout << "The size of your word, \"" << in << "\", is "
                 << in.length() << "!" << endl;
            return 0;
        {- Haskell Code -}
err->   main = {-do-}
            putStrLn "Please enter a single word and get the string size back"
            num <- getLine
            let size = length num
            putStrLn $ "The size of your word, \"" ++ num ++ "\", is " ++ show size ++ "!"

As you can see, in the C++ I forgot to include a semicolon and in the Haskell I forgot the do in main. As a beginner, I’ve personally made both of these mistakes.

Here are the error messages:

    -- C++ Error --
    main.cpp:8:2: error: expected ',' or ';' before 'cout'
    Compilation failed.
    -- Haskell Error --
        parse error on input '<-'
    Failed, modules loaded: none.

C++ delivers a clear message explaining how to fix the error. Haskell, however, is not so helpful. It says there’s a parse error on the input operator. How should I know this is related to a missing do?

Example 3

Next let’s see what happens when you call the built-in strlen and length functions with no arguments at all.

    /* C++ Code */
    #include <iostream>
    #include <cstring>

    using namespace std;
    int main (){
        char input[256];
        cout << "Please enter a word" << endl;
        cin >> input;
err->   cout << "The size of your string is: " << (unsigned)strlen();
        cout << "!" << endl;
        return 0;
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a word"
        num <- getLine
err->   let size = length 
        putStrLn $ "The size of your string is: " ++ show size ++ "!"

Now let us see the different error messages that are produced:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:11:61: error: too few arguments to function 'size_t_strlen(const char*)'
    Compilation failed.
    -- Haskell Error --
    No instance for (Show ([a0]->Int)) arising from a use of 'show'
    Possile fix: add an instance declaration for (Show ([a0]->Int))
    In the first argument of '(++)', namely 'show size'
    In the second argument of '(++)', namely 'show size ++ "!"'
    In the second argument of '(++)', namely
      '"\", is " ++ show size ++ "!"'
    Failed, modules loaded:none.

Once again, it appears that the C++ compiler g++ knew exactly what was wrong with the code and how to fix the error. It tells me that there are not enough arguments in my function call.

Wow, Hakell’s error message is quite the mouthful this time. I suppose this is better than just a parse error message, but I’m not sure what exactly ghc is even wanting me to correct. The error is simply too technical to help me.

Example 4

Next, we will look at what happens when you pass too many arguments to functions in both languages:

    /* C++ Code */
    #include <iostream>
    using namespace std;

    int main () {
        string in[256];
        cout << "Please enter a single word to get the string size back" << endl;
        cin >> in;
err->   cout << "The size of your string, \"" << in << "\", is " << (unsigned)strlen(in, in);
        cout << "!" << endl;
        return 0;
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a single word to get the string size back"
        num <- getLine
err->   let size = length num num
        putStrLn $ "The size of your string, \"" ++ num ++ "\", is " ++ show size ++ "!"

And the errors:

    -- C++ Error --
    main.cpp:16:78: error: too many arguments to function 'int newLength(std::string)'
    main.cpp:6:5: note: declared here
    Compilation failed.
    -- Haskell Error --
    Couldn't match expected type 'String -> t0' with actual type 'Int'  
    The function 'length' is applied to two arguments,
    but its type '[Char] -> Int' has only one
    In the expression: length num num
    In an equation for 'size':size = length num num
    Failed, modules loaded: none.

The C++ error clearly explains how to fix the code, and I even understand the Haskell error this time. Both languages tell me that there are too many arguments. Yet the C++ error message tells me this without a bunch of technical jargon. So even when Haskell is actually helpful with its error messages, it still manages to hide what it wants the user to do.


To me, Haskell seems like a language only for experienced programmers because the errors are not user-friendly. How can I write code if a few simple mistakes cripple my progress? Haskell’s compiler ghc simply lags behind g++ in terms of useful error messages for beginners.

Mike’s Epilogue

I’ve created a patch for ghc that clarifies the specific error messages that Paul had trouble with (and a few related ones). In particular:

  1. Anytime there is a parse error caused by a malformed if, case, lambda, or (non-monadic) let, ghc will now remind the programmer of the correct syntax. In the first example Paul gives above, we would get the much clearer:

    parse error in if statement: missing required else clause
  2. To help with the second example, anytime ghc encounters a parse error caused by a <- token, it now outputs the hint:

    Perhaps this statement should be within a 'do' block?
  3. The third example Paul points out comes from the type checker, rather than the parser. It’s a little less obvious how to provide good hints here. My idea is based on the fact that it is fairly rare for functions to be instances of type classes. The only example I know of is the Monad instance for (a->).

    Therefore, if the type checker can’t find an instance for a function, the more likely scenario is that the programmer simply did not pass enough parameters to the function. My proposed change is that in this situation, ghc would output the hint:

    maybe you haven't applied enough arguments to a function?

This patch doesn’t completely fix ghc’s problem with poor error messages. For example, it doesn’t address Paul’s last point about type errors being verbose. But hopefully it will make it a little easier for aspiring haskellers who still aren’t familiar with all the syntax rules.

September 17, 2014 12:00 AM

September 16, 2014

Yesod Web Framework

Persistent 2.1 Release Candidate

We are happy to announce a stable persistent 2 release candidate.

We previously announced an unstable release of persistent 2. It was a good idea to call it unstable because some commenters pointed out that we were not exposing the full power of the new flexible Key type. This lead to a couple of breaking releases that organizied the internal types of persistent. All of these are on the unstable 2.0.x series.

persistent-2.0.3.* is on hackage now. We consider this a release candidate that will be promoted to persistent-2.1. We may wait until the haddocks build on hackage before releasing.

Ongoing persistent maintainership

Persistent is of huge importance to the Hasekll community, playing the role of default ORM. The project has benefited immensely from the community involvement. We get a lot of prompt bug reports that often include fixes. And there have been many great new features added by contributors.

However, persistent it is lacking in dedicated maintainers for each backend. We would like for Michael to keep doing a great job stewarding the project, but for others to step up and help own a SQL backend.

An extreme example of a lack of backend maitenance is when we received a pull request for a CouchDB backend. It was great to share the code as a starting point, but it was already using an older version of persistent and is now sitting in the experimental folder in a bit-rotted state.

In general a persistent backend can only be first class and in our tree with a dedicated maintainer. Michael and I maintain persistent and persistent-template. I maintain the persitent-mongoDB backend. The issue now is more with the SQL backends, where the maitenance and development for them is being pushed on Michael. For example, I implemented custom Key types in persistent, persistent-template, and persistent-mongoDB. Michael and myself implemented them for persistent-sqlite, but it still needs to be implement for persistent-postgressql and persistent-mysql.

Maintaining persitent and persitent-template has had a questionable cost/benefit ratio for me. But I have personally found that maintaing the persistent-mongoDB backend has paid off well for me. I need to have a good understanding of what is happening with my code that deals with the database. Rather than treating it as a black box I make continous incremental improvements to the library that I rely on, and I can smoothly get code onto hackage rather than having local modifications.

Let us know if you are interested in helping to maintain a backend.

September 16, 2014 10:15 PM

Neil Mitchell

Towards Shake 1.0

Summary: I've just released a new version of Shake, with a --demo feature and an underlying continuation monad. I want to release v1.0 in the near future.

I've just released a new version of the Shake build system, version 0.13.3. While the version number is only 0.0.1 higher, the changelog lists a large number of improvements. In particular, two changes are:

  • The Action monad is now based on continuations, which allows Shake to suspend threads without requiring a GHC RTS thread. The result is significantly less memory used on thread stacks. I still find it quite amazing that Haskell has such powerful and robust abstraction mechanisms that a huge change doesn't even break the API.
  • The shake binary now features a --demo mode, invoked by running shake --demo. This mode generates a Shake project, compiles it, and shows off some of the features of Shake. You can the output of --demo here.

Version 1.0

With the two features above, I'm now looking towards Shake version 1.0. I'm not looking to make any significant backwards-incompatible change, or necessarily any code/API changes at all. However, if you have anything you think should be addressed before reaching such a milestone, please comment on the issue tracker or email the mailing list.

Shake website

The one thing I still want to finish before releasing version 1.0 is to have a proper website for Shake. I've registered which will host the content, and have set up GitHub pages to serve it up. I have some rough content in the docs directory and a prototype generator in the website directory - as an example it currently generates something a bit like this for the user manual, but with a table of contents when run through the latest generator. I'd appreciate any help with the content, the generator, or the styling - just email the mailing list.

by Neil Mitchell ( at September 16, 2014 08:54 PM

Chung-chieh Shan

A challenge for a better community

Donation button
Donate to the Ada Initiative

Did you know that all ACM-sponsored conferences have an anti-harassment policy? I didn’t, until I chaired the Haskell Symposium last year. The concise policy says, among other things, that people shouldn’t use my family constitution to interfere with my professional participation. And the policy has teeth. That’s great.

My not knowing the policy and not seeing it publicized didn’t make me go out of my way to harass anyone. But it did make me less sure and less comfortable that I belonged at ICFP. Briefly, it’s because I didn’t know if it would be common ground at the conference that my actual self was fully human. That’s not something I can take for granted in general society. Also, it’s because I didn’t know whether my fellow conference attendees were aware of the policy. We could all use a reminder, and a public signal that we mean it.

For these reasons, I’m very happy that ICFP will start to publicize ACM’s existing anti-harassment policy and make sure everyone registered knows it. All ACM conferences should do it. That’s why Tim Chevalier, Clement Delafargue, Adam Foltzer, Eric Merritt, and I are doing two things. We ask you to join us:

  1. Donate to the Ada Initiative. Our goal is for the functional programming community to raise $8192 by the end of Friday (Sept 19) UTC. To count toward this goal, please use this link:
  2. Call on the ACM and tell your friends. For example, I tweeted this:
    I donate to @AdaInitiative because I want @TheOfficialACM events to announce their anti-harassment policy #lambda4ada

Thanks for improving our professional homes!

(UPDATE: Wow! We reached our initial goal $4096 in just 5 hours! We increased the goal to $8192, thanks to your generosity. And if we raise $16,384, we will sing “There’s no type class like Show type class” and put a recording on the Internet.)

September 16, 2014 04:09 PM

September 15, 2014

Robert Harper

Scotland: Vote No

So far I’ve ignored the back and forth on the Scottish referendum on secession from the United Kingdom, but this weekend I decided that it was past time for me to sort it out.  For those of you who don’t know me, I’ll mention that I lived for 3.5 years in Scotland quite some time ago, so I am not completely ignorant of the cultural and political issues that underly the debate.  As a rule my political views are very much in line with those of the average Scot, solidly Labour Party back in the day when people like Derek Hatton and Ken Livingston and Roy Hattersley and Tony Benn defined what that meant.  Despite Tony Blair’s slimy “third way” nonsense, and his toadying up to Dick “Dick” Cheney’s sock puppet to help lie us into the Iraq war, Scotland in national politics remains solidly Labour; practically every Scottish seat is a Labour seat.

Although I used to be a so up on British politics that I could read and enjoy Private Eye, it’s been a long while since I’ve paid more than scant attention to what’s been going on there, apart from noting that The Scotsman was one of the few sources of truth about the Iraq War back when it really mattered.  The Scots have spines.

I’m no historian, but I do have basic understanding of Scottish history, particularly as regards the English, and am very familiar with the Scottish concept of valor in glorious defeat.  I understand full well that practically every Scotsman harbors some resentment towards the English for centuries of injustices, including the highland clearances, and, more recently, the appropriation of the oil in Scottish territory for the scant benefit of the Scots themselves.  And I am well aware of the bravery and sacrifice that so many Scots made fighting against the Axis during World War II.

My home institution, Carnegie Mellon University, was founded by a Scotsman from Kirkaldy, just across the spectacular Forth Bridge from Edinburgh.  Carnegie was born into penury and died as the wealthiest man on earth, far wealthier relative to GDP than Gates by a wide margin.  Carnegie was extraordinary, but the Scots in general punch far above their weight class in all things, especially industrious self-reliance.

In short, I love Scotland, and consider it to be a second home.  (OK, the weather is appalling, but we’ll set that aside for the time being.)

Emotionally, I am deeply sympathetic to the Scottish independence movement.  I know full well how poorly the U.K. treats Scotland and its interests.  Politics in the UK revolves around the “home counties” in the south of England; the terminology tells you all you need to know.  One time while watching the weather report on the BBC, the national broadcasting network, the announcer said that there was some horrendous weather coming our way, but that “it’ll mostly be up in Scotland, though”.  Though.  Though.

But I urge all my Scottish friends to vote NO on the independence proposal.  It makes no sense whatsoever in its present form, and represents to me a huge scam being perpetrated by the SNP to seize power and impose policies that nearly every Scot, judging from their voting record over decades and decades, would oppose.  The whole movement seems driven by the powerful urge to finally stick it to the English and get their country back, and Salmond is exploiting that to the hilt.  Back when I lived in Scotland I looked into the SNP, because even then I had separatist sympathies, but when I did, it was obvious why they had so few backers.  They’re just Tories without the class structure, more akin to our Tea Party lunatics than to the British Conservatives, and steadfastly opposed to policies, such as well-funded public education, that nearly all Scots support, and determined to follow the post-cold war Slovakian model of slashing taxes on the wealthy in the hope of attracting business to the country.  Having not followed Scottish politics for so long, it is astonishing to me that the SNP has managed to gain a majority in the Scottish Parliament, while the voting pattern at the national level has not changed at all.  How did this happen?  From my position of ignorance of the last decade or so of politics in Scotland, it looks as though Salmond is a slick operator who has pulled off a colossal con by exploiting the nationalist tendencies that lie within every Scot.

But never mind Salmond, the main reason that Scots must vote NO on the referendum is that it proposes to keep the English pound as Scotland’s national currency!  This is such a preposterous idea that I can only suspect dishonesty and deceit, because no sane political leader of honest intent could ever voluntarily place his or her country’s economic future in the hands of another.  The Bank of England will, particularly after separation, have no interest whatsoever in the economic conditions in Scotland when determining its policies on the pound.  And the Bank of Scotland will have no ability to control its own currency, the prime means of maintaining economic balance between labor and capital.  The Scots will, in effect, be putting themselves on a gold standard, the stupidest possible monetary system, so that, in a crisis, they will have to buy or borrow pounds, at interest, in emergency conditions, to deal with, say, the failure of the Royal Bank of Scotland (but don’t worry, that sort of thing can never happen again).  And the Bank of Scotland will have no means of stimulating the economy in a demand slump other than borrowing pounds from somewhere outside the country, rendering themselves in debt beyond their means.  And this will become an excuse for dismantling the social system that has been so important to elevating the Scots from poverty to a decent standard of living within one or two generations.  Just look at the poor PIGS in the Euro-zone being pushed around by Germany, especially, to satisfy the conveniences of the German bankers, and to hell with the living, breathing souls in Greece or Spain or Ireland or Portugal, to name the canonical victims.

A country that does not control its own currency is not independent and cannot be independent.  It’s an illusion.  Just what are Salmond’s true intentions are not entirely clear to me, but on the basis of his monetary policies alone, I implore my Scottish friends to suppress the natural wish to make a statement of pride, and instead do the sensible thing.  The proposal to be voted on this week is not a spittle on the  Heart of Midlothian, it is an irrevocable decision to place Scotland in an even worse position with respect to England than it already is in.

Listen to reason.  Vote NO on independence.

Filed under: Research Tagged: Scottish referendum

by Robert Harper at September 15, 2014 04:05 PM

The GHC Team

GHC Weekly News - 2014/09/15

Hi *,

Here's a new thing: Blog posts! That's right. A while back, we started a new set of emails on the developers list containing weekly updates, from GHC HQ. But we eventually decided it should be more broad and encompass the work GHC sees as a project - including all the things our contributors do.

So now it's the weekly GHC news - and we (or, well, I) have decided to blogify the weekly emails!

So without further adieu, here's the current recap. The original mailing list copy is available here.

  • As Gabor mentioned on the list earlier today, I (Austin) accidentally broke the Windows build. Sorry. :( We really need to get Phab building Windows too ASAP... I'm working on a fix this morning.
  • I sent out the HCAR draft this morning. Please edit it! I think we have a few weeks of lead time however, so we're not in a rush like last time. But I will send reminders. :)
  • The server migration for seems to have gone pretty smoothly in the past week. It's had plenty of traffic so far. The full migration is still ongoing and I want to complete it this week.
  • I've finished reorganizing some of the Git and Repository pages after some discussion. We now have the Repositories[1] page, linked to on the left side, which details some notes on the repositories we use, and links to other various pages. I'm thinking of replacing this side-bar "root" with a link to the main Git[2] page, perhaps.
  • Miscellaneous: and now sets the Strict-Transport-Security header. This just means you always use SSL now when you visit those pages (so you can't be connection-hijacked via a 503 redirect).
  • GHC works on Wine apparently for all you Linux users - thanks Mikolaj![3]
  • Jan had some questions about infrastructure which I just followed up on this morning. In particular: does anyone feel strongly about his first question?[4]
  • Herbert committed the first part of the Traversable/Foldable changes, by moving the typeclasses to Prelude. This is part of an ongoing series of patches. Things like adding Bifunctor will finally come after this.[5]

Also, added bonus: we'll start including some of the tickets we closed this week.

Closed tickets for the past week include: #9585, #9545, #9581, #6086, #9558, and #3658.

Please let me know if you have any questions.


by thoughtpolice at September 15, 2014 01:47 PM

Mike Izbicki

Comparing AVL Trees in C++ and Haskell

Comparing AVL Trees in C++ and Haskell

posted on 2014-09-15 by Dat Do

This post compares the runtimes of AVL tree operations in C++ vs Haskell. In particular, we insert 713,000 strings from a file into an AVL Tree. This is a \(O(n \log n)\) operation. But we want to investigate what the constant factor looks like in different situations.

Experimental setup: All the code for these tests is available in the github repository. The C++ AVL tree was created in a data structures course that I took recently and the Haskell AVL tree is from the Haskell library Data.Tree.AVL. Additionally, the Haskell code stores the strings as ByteStrings because they are much more efficient than the notoriously slow String. To see how the runtime is affected by files of different sizes, the file was first partitioned into 10 segments. The first segment has 71,300 words, the second 71,300 * 2 words, and so on. Both the C++ and Haskell programs were compiled with the -O2 flag for optimization. The test on each segment is the average runtime of three separate runs.

Here’s the results:



C++ is a bit faster than Haskell on the last partition for this test.

I guess this is because Haskell operates on immutable data. Every time a new element is to be inserted into the Haskell AVL tree, new parent nodes must be created because the old parent nodes cannot be changed. This creates quite a bit of overhead. C++ on the other hand, does have mutable data and can simply change the node that a parent node is pointing to. This is faster than making a whole new copy like the Haskell code does.

Is there an easy way to speed up our Haskell code?

There is a Haskell library called parallel that makes parallel computations really convenient. We’ll try to speed up our program with this library.

You might think that it is unfair to compare multithreaded Haskell against C++ that is not multithreaded. And you’re absolutely right! But let’s be honest, manually working with pthreads in C++ is quite the headache, but parallism in Haskell is super easy.

Before we look at the results, let’s look at the parallelized code. What we do is create four trees each with a portion of the set of strings. Then, we call par on the trees so that the code is parallelized. Afterwards, we union the trees to make them a single tree. Finally, we call deepseq so that the code is evaluated.

{-# LANGUAGE TemplateHaskell #-}
import Control.DeepSeq.TH
import Control.Concurrent
import Data.Tree.AVL as A
import Data.COrdering
import System.CPUTime
import qualified Data.ByteString.Char8 as B
import Control.DeepSeq
import Data.List.Split
import System.Environment
import Control.Parallel

$(deriveNFData ''AVL)

-- Inserts elements from list into AVL tree
load :: AVL B.ByteString -> [B.ByteString] -> AVL B.ByteString
load t [] = t
load t (x:xs) = A.push (fstCC x) x (load t xs)

main = do
    args <- getArgs
    contents <- fmap B.lines $ B.readFile $ args !! 0
    let l = splitEvery (length contents `div` 4) contents
    deepseq contents $ deepseq l $ return ()
    start <- getCPUTime

    -- Loading the tree with the subpartitions
    let t1 = load empty $ l !! 0
    let t2 = load empty $ l !! 1
    let t3 = load empty $ l !! 2
    let t4 = load empty $ l !! 3
    let p = par t1 $ par t2 $ par t3 t4

    -- Calling union to combine the trees
    let b = union fstCC t1 t2
    let t = union fstCC t3 t4
    let bt = union fstCC b t
    let bt' = par b $ par t bt
    deepseq p $ deepseq bt' $ return ()

    end <- getCPUTime
    n <- getNumCapabilities
    let diff = ((fromIntegral (end-start)) / (10^12) / fromIntegral n)
    putStrLn $ show diff

Great, so now that the Haskell code has been parallelized, we can compile and run the program again to see the difference. To compile for parallelism, we must use some special flags.

ghc –O2 filename -rtsopts –threaded

And to run the program (-N4 refers to the number of cores).

./filename +RTS –N4


Haskell now gets better runtimes than C++.

Now that we know Haskell is capable of increasing its speeds through parallelism, it would be interesting to see how the runtime is affected by the degree of parallelism.

According to Amdahl’s law, a program that is 100% parallelized will see a proportional speed up based on the number of threads of execution. For example, if a program that is 100% parallelized takes 2 seconds to run on 1 thread, then it should take 1 second to run using 2 threads. The code used for our test, however, is not 100% parallelized since there a union operation performed at the end to combine the trees created by the separate threads. The union of the trees is a \(O(n)\) operation while the insertion of the strings into the AVL tree is a \(O\left(\frac{n \log n }{p}\right)\) operation, where \(p\) is the number of threads. Therefore, the runtime for our test should be

\[O\left(\frac{n\log{n}}{p} + n\right)\]

Here is a graph showing the runtime of the operation on the largest set (713,000 strings) across increasing levels of parallelism.



Taking a look at the results, we can see that the improvement in runtime does not fit the 100% parallelized theoretical model, but does follow it to some extent. Rather than the 2 core runtime being 50% of the 1 core runtime, the 2 core runtime is 56% of the 1 core runtime, with decreasing efficiency as the number of cores increases. Though, it is clear that there are significant improvements in speed through the use of more processor cores and that parallelism is an easy way to get better runtime speeds with little effort.

September 15, 2014 12:00 AM

September 14, 2014

Philip Wadler

The British Biased Corporation

Scandalous! Nick Robinson asks Alex Salmond a question, and Salmond takes seven minutes to answer in detail.

<iframe allowfullscreen="" frameborder="0" height="315" src="" width="560"></iframe>

On the evening news, Nick Robinson summarises Salmond's answer in a few seconds as 'He didn't answer'.

<iframe allowfullscreen="" frameborder="0" height="259" src="" width="460"></iframe>

(Above spotted via Arc of Prosperity.)

And today, this.
I used to be a supporter of the BBC, but it's getting harder and harder to justify.

by Philip Wadler ( at September 14, 2014 10:01 PM

Krugman vs. Stiglitz, now with added Stiglitz

My last post quoted Joe Stiglitz, indirectly, to refute Paul Krugman's fear mongering. Now the man himself has spoken in the Sunday Herald.
As Scotland contemplates independence, some, such as Paul Krugman, have questioned the "economics".

Would Scotland, going it alone, risk a decline in standards of living or a fall in GDP? There are, to be sure, risks in any course of action: should Scotland stay in the UK, and the UK leave the EU, the downside risks are, by almost any account, significantly greater. If Scotland stays in the UK, and the UK continues in its policies which have resulted in growing inequality, even if GDP were slightly larger, the standards of living of most Scots could fall.

Cutbacks in UK public support to education and health could force Scotland to face a set of unpalatable choices - even with Scotland having considerable discretion over what it spends its money on.

But there is, in fact, little basis for any of the forms of fear-mongering that have been advanced. Krugman, for instance, suggests that there are significant economies of scale: a small economy is likely, he seems to suggest, not to do well. But an independent Scotland will still be part of Europe, and the great success of the EU is the creation of a large economic zone.

Besides, small political entities, like Sweden, Singapore, and Hong Kong have prospered, while much larger entities have not. By an order of magnitude, far more important is pursuit of the right policies.

Another example of a non-issue is the currency. There are many currency arrangements that would work. Scotland could continue using sterling - with or without England's consent.

Because the economies of England and Scotland are so similar, a common currency is likely to work far better than the euro - even without shared fiscal policy. But many small countries have managed to have a currency of their own - floating, pegged, or "managed."

by Philip Wadler ( at September 14, 2014 06:13 PM

Dinna fash yersel — Scotland will dae juist fine!

One relentless lie behind 'No' is that Scotland is too wee to make it on its own, counterexamples such as Denmark, Sweden, Singapore, and Hong Kong being conveniently ignored. May this post from Thomas Widmann, a Dane residing in Scotland, help to dispel the disinformation. 
Pick a random person from somewhere on this planet. Ask them to name an alcoholic drink from Scotland, and it’s very likely they’ll reply “Whisky”. Ask them to name one from Denmark, and they’ll probably be tongue-tied. (They could answer “Gammel Dansk” or “Akvavit”, but they’re just not nearly as famous as whisky.)

Now repeat the exercise, but ask about a food item. Again, it’s likely they’ll have heard of haggis but that they’ll be struggling to name anything from Denmark.

Now try a musical instrument. Bagpipes and … sorry, cannot think of a Danish one.

A sport? Scotland has golf, of course. Denmark can perhaps claim ownership of handball, but it’s not associated with Denmark in the way that golf makes everybody think of Scotland.

A piece of clothing? Everybody knows the kilt, but I’d be very surprised if anybody can name one from Denmark.

A monster? Everybody knows what’s lurking in Loch Ness, but is there anything scary in Denmark?

The only category where Denmark perhaps wins is toys, where Lego surely is more famous than anything from Scotland (but many people don’t know Lego is from Denmark).

Denmark is also well-known for butter and bacon, of course, but these aren’t Danish in origin or strongly associated with Denmark in people’s minds.

Several famous writers and philosophers were Danish (e.g., Hans Christian Andersen and Søren Kierkegaard), but Scotland can arguably list more names of the same calibre, and the Scottish ones wrote in English, which makes them much more accessible to the outside world.

Scottish universities are also ranked better than the Danish ones in recent World rankings.

Finally, Scotland has lots of oil and wind, water and waves. Denmark has some, but not nearly as much, and most other countries have less than Denmark.

Because of all of this, I don’t worry about the details when it comes to Scottish independence. If Denmark can be one of the richest countries on the planet, of course Scotland can be one too.

Yes, there might be a few tough years while the rUK are in a huff and before everything has been sorted out. And of course there will be occasional crises in the future, like in any other country.

However, unless you subscribe to the school that Denmark and other small countries like Norway and Switzerland are complete failures because they don’t have nuclear weapons and a permanent seat on the UN’s Security Council, there’s simply no reason to assume Scotland won’t do exceptionally well as an independent country in the longer term.

So I’m not worried. Of course there are many details to sort out, but at the end of the day everything will be fine. Scotland will be a hugely successful independent country. Dinna fash yersel!

by Philip Wadler ( at September 14, 2014 06:13 PM

September 13, 2014

The Gentoo Haskell Team

ghc 7.8.3 and rare architectures

After some initially positive experience with ghc-7.8-rc1 I’ve decided to upstream most of gentoo fixes.

On rare arches ghc-7.8.3 behaves a bit bad:

  • ia64 build stopped being able to link itself after ghc-7.4 (gprel overflow)
  • on sparc, ia64 and ppc ghc was not able to create working shared libraries
  • integer-gmp library on ia64 crashed, and we had to use integer-simple

I have written a small story of those fixes here if you are curious.


To get ghc-7.8.3 working nicer for exotic arches you will need to backport at least the following patches:

Thank you!

by Sergei Trofimovich at September 13, 2014 12:35 PM

unsafePerformIO and missing NOINLINE

Two months ago Ivan asked me if we had working darcs-2.8 for ghc-7.8 in gentoo. We had a workaround to compile darcs to that day, but darcs did not work reliably. Sometimes it needed 2-3 attempts to pull a repository.

A bit later I’ve decided to actually look at failure case (Issued on darcs bugtracker) and do something about it. My idea to debug the mystery was simple: to reproduce the difference on the same source for ghc-7.6/7.8 and start plugging debug info unless difference I can understand will pop up.

Darcs has great debug-verbose option for most of commands. I used debugMessage function to litter code with more debugging statements unless complete horrible image would emerge.

As you can see in bugtracker issue I posted there various intermediate points of what I thought went wrong (don’t expect those comments to have much sense).

The immediate consequence of a breakage was file overwrite of partially downloaded file. The event timeline looked simple:

  • darcs scheduled for download the same file twice (two jobs in download queue)
  • first download job did finish
  • notified waiter started processing of that downloaded temp file
  • second download started truncating previous complete download
  • notified waiter continued processing partially downloadeed file and detected breakage

Thus first I’ve decided to fix the consequence. It did not fix problems completely, sometimes darcs pull complained about remote repositories still being broken (missing files), but it made errors saner (only remote side was allegedly at fault).

Ideally, that file overwrite should not happen in the first place. Partially, it was temp file predictability.

But, OK. Then i’ve started digging why 7.6/7.8 request download patterns were so severely different. At first I thought of new IO manager being a cause of difference. The paper says it fixed haskell thread scheduling issue (paper is nice even for leisure reading!):

GHC’s RTS had a bug in which yield
placed the thread back on the front of the run queue. This bug
was uncovered by our use of yield
which requires that the thread
be placed at the end of the run queue

Thus I was expecting the bug from this side.

Then being determined to dig A Lot in darcs source code I’ve decided to disable optimizations (-O0) to speedup rebuilds. And, the bug has vanished.

That made the click: unsafePerformIO might be the real problem. I’ve grepped for all unsafePerformIO instances and examined all definition sites.

Two were especially interesting:

-- src/Darcs/Util/Global.hs
-- ...
_crcWarningList :: IORef CRCWarningList
_crcWarningList = unsafePerformIO $ newIORef []
{-# NOINLINE _crcWarningList #-}
-- ...
_badSourcesList :: IORef [String]
_badSourcesList = unsafePerformIO $ newIORef []
{- NOINLINE _badSourcesList -}
-- ...

Did you spot the bug?

Thus The Proper Fix was pushed upstream a month ago. Which means ghc is now able to inline things more aggressively (and _badSourcesList were inlined in all user sites, throwing out all update sites).

I don’t know if those newIORef [] can be de-CSEd if types would have the same representation. Ideally the module also needs -fno-cse, or get rid of unsafePerformIO completely :].

(Side thought: top-level global variables in C style are surprisingly non-trivial in "pure" haskell. They are easy to use via peek / poke (in a racy way), but are hard to declare / initialize.)

I had a question wondered how many haskell packages manage to misspell ghc pragma decparations in a way darcs did it. And there still _is_ a few of such offenders:

$ fgrep -R NOINLINE . | grep -v '{-# NOINLINE' | grep '{-'
ajhc-{- NOINLINE filterFB #-}
ajhc-{- NOINLINE iterateFB #-}
ajhc-{- NOINLINE mapFB #-}
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _badSourcesList -}
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _reachableSourcesList -}
dph-lifted-copy-{- NOINLINE emptyP #-}
dph-par-{- NOINLINE emptyP #-}
dph-seq-{- NOINLINE emptyP #-}
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE showSSI #-}
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE FreeSectAnnotated.showSSI #-}
freesect-0.8/FreeSect.hs:{- # NOINLINE fs_warn_flaw #-}
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE readInt64MH #-}
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE mhDigitToInt #-}
lhc-0.10/lib/base/src/GHC/PArr.hs:{- NOINLINE emptyP #-}
property-list-{- NOINLINE doubleToWord64 -}
property-list-{- NOINLINE word64ToDouble -}
property-list-{- NOINLINE floatToWord32 -}
property-list-{- NOINLINE word32ToFloat -}
warp-{- NOINLINE readInt64MH #-}
warp-{- NOINLINE mhDigitToInt #-}

Looks like there is yet something to fix :]

Would be great if hlint would be able to detect pragma-like comments and warn when comment contents is a valid pragma, but comment brackets don’t allow it to fire.

{- NOINLINE foo -} -- bad
{- NOINLINE foo #-} -- bad
{-# NOINLINE foo -} -- bad
{-# NOINLINE foo #-} -- ok

Thanks for reading!

by Sergei Trofimovich at September 13, 2014 09:18 AM

September 12, 2014

Bjorn Buckwalter

Haskell tools for satellite operations

At last week's CUFP I did a talk called “Haskell tools for satellite operations”. The abstract is:

Since 2013-04 the presenter has been supporting SSC (the Swedish Space Corporation) in operating the telecommunications satellite “Sirius 3” from its Mission Control Center in Kiruna. Functions in the satellite vendor's operations software are breaking down as the orbit of the ageing satellite degrades. To fill in the gaps in software capabilities the presenter has developed several operational tools using Haskell.
The talk will give an overview of the satellite operations environment, the tools developed in Haskell, how they benefitted (and occasionally suffered) from the choice of implementation language, which (public) libraries were critical to their success, and how they were deployed in the satellite operations environment.
A video recording of the talk is available on the CUFP page for the talk and on youtube.

If this interests you be sure to check out the other talk from the “Functional programming in space!” track; Michael Oswald's Haskell in the Misson Control Domain.

by Björn Buckwalter ( at September 12, 2014 12:46 PM

Gabriel Gonzalez

Morte: an intermediate language for super-optimizing functional programs

The Haskell language provides the following guarantee (with caveats): if two programs are equal according to equational reasoning then they will behave the same. On the other hand, Haskell does not guarantee that equal programs will generate identical performance. Consequently, Haskell library writers must employ rewrite rules to ensure that their abstractions do not interfere with performance.

Now suppose there were a hypothetical language with a stronger guarantee: if two programs are equal then they generate identical executables. Such a language would be immune to abstraction: no matter how many layers of indirection you might add the binary size and runtime performance would be unaffected.

Here I will introduce such an intermediate language named Morte that obeys this stronger guarantee. I have not yet implemented a back-end code generator for Morte, but I wanted to pause to share what I have completed so far because Morte uses several tricks from computer science that I believe deserve more attention.

Morte is nothing more than a bare-bones implementation of the calculus of constructions, which is a specific type of lambda calculus. The only novelty is how I intend to use this lambda calculus: as a super-optimizer.


The typed lambda calculus possesses a useful property: every term in the lambda calculus has a unique normal form if you beta-reduce everything. If you're new to lambda calculus, normalizing an expression equates to indiscriminately inlining every function call.

What if we built a programming language whose intermediate language was lambda calculus? What if optimization was just normalization of lambda terms (i.e. indiscriminate inlining)? If so, then we would could abstract freely, knowing that while compile times might increase, our final executable would never change.


Normally you would not want to inline everything because infinitely recursive functions would become infinitely large expressions. Fortunately, we can often translate recursive code to non-recursive code!

I'll demonstrate this trick first in Haskell and then in Morte. Let's begin from the following recursive List type along with a recursive map function over lists:

import Prelude hiding (map, foldr)

data List a = Cons a (List a) | Nil

example :: List Int
example = Cons 1 (Cons 2 (Cons 3 Nil))

map :: (a -> b) -> List a -> List b
map f Nil = Nil
map f (Cons a l) = Cons (f a) (map f l)

-- Argument order intentionally switched
foldr :: List a -> (a -> x -> x) -> x -> x
foldr Nil c n = n
foldr (Cons a l) c n = c a (foldr l c n)

result :: Int
result = foldr (map (+1) example) (+) 0

-- result = 9

Now imagine that we disable all recursion in Haskell: no more recursive types and no more recursive functions. Now we must reject the above program because:

  • the List data type definition recursively refers to itself

  • the map and foldr functions recursively refer to themselves

Can we still encode lists in a non-recursive dialect of Haskell?

Yes, we can!

-- This is a valid Haskell program

{-# LANGUAGE RankNTypes #-}

import Prelude hiding (map, foldr)

type List a = forall x . (a -> x -> x) -> x -> x

example :: List Int
example = \cons nil -> cons 1 (cons 2 (cons 3 nil))

map :: (a -> b) -> List a -> List b
map f l = \cons nil -> l (\a x -> cons (f a) x) nil

foldr :: List a -> (a -> x -> x) -> x -> x
foldr l = l

result :: Int
result = foldr (map (+ 1) example) (+) 0

-- result = 9

Carefully note that:

  • List is no longer defined recursively in terms of itself

  • map and foldr are no longer defined recursively in terms of themselves

Yet, we somehow managed to build a list, map a function over the list, and fold the list, all without ever using recursion! We do this by encoding the list as a fold, which is why foldr became the identity function.

This trick works for more than just lists. You can take any recursive data type and mechanically transform the type into a fold and transform functions on the type into functions on folds. If you want to learn more about this trick, the specific name for it is "Boehm-Berarducci encoding". If you are curious, this in turn is equivalent to an even more general concept from category theory known as "F-algebras", which let you encode inductive things in a non-inductive way.

Non-recursive code greatly simplifies equational reasoning. For example, we can easily prove that we can optimize map id l to l:

map id l

-- Inline: map f l = \cons nil -> l (\a x -> cons (f a) x) nil
= \cons nil -> l (\a x -> cons (id a) x) nil

-- Inline: id x = x
= \cons nil -> l (\a x -> cons a x) nil

-- Eta-reduce
= \cons nil -> l cons nil

-- Eta-reduce
= l

Note that we did not need to use induction to prove this optimization because map is no longer recursive. The optimization became downright trivial, so trivial that we can automate it!

Morte optimizes programs using this same simple scheme:

  • Beta-reduce everything (equivalent to inlining)
  • Eta-reduce everything

To illustrate this, I will desugar our high-level Haskell code to the calculus of constructions. This desugaring process is currently manual (and tedious), but I plan to automate this, too, by providing a front-end high-level language similar to Haskell that compiles to Morte:


( \(List : * -> *)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b) -> List a -> List b
-> \(id : forall (a : *) -> a -> a)

-> \(a : *) -> map a a (id a)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil

-- id
(\(a : *) -> \(va : a) -> va)

This line of code is the "business end" of the program:

\(a : *) -> map a a (id a)

The extra 'a' business is because in any polymorphic lambda calculus you explicitly accept polymorphic types as arguments and specialize functions by applying them to types. Higher-level functional languages like Haskell or ML use type inference to automatically infer and supply type arguments when possible.

We can compile this program using the morte executable, which accepts a Morte program on stdin, outputs the program's type stderr, and outputs the optimized program on stdout:

$ morte <
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(l : ∀(x : *) → (a → x → x) → x → x) → l

The first line is the type, which is a desugared form of:

forall a . List a -> List a

The second line is the program, which is the identity function on lists. Morte optimized away the map completely, the same way we did by hand.

Morte also optimized away the rest of the code, too. Dead-code elimination is just an emergent property of Morte's simple optimization scheme.


We could double-check our answer by asking Morte to optimize the identity function on lists:


( \(List : * -> *)
-> \(id : forall (a : *) -> a -> a)

-> \(a : *) -> id (List a)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- id
(\(a : *) -> \(va : a) -> va)

Sure enough, Morte outputs an alpha-equivalent result (meaning the same up to variable renaming):

$ ~/.cabal/bin/morte <
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(va : ∀(x : *) → (a → x → x) → x → x) → va

We can even use the morte library to mechanically check if two Morte expressions are alpha-, beta-, and eta- equivalent. We can parse our two Morte files into Morte's Expr type and then use the Eq instance for Expr to test for equivalence:

$ ghci
Prelude> import qualified Data.Text.Lazy.IO as Text
Prelude Text> txt1 <- Text.readFile ""
Prelude Text> txt2 <- Text.readFile ""
Prelude Text> import Morte.Parser (exprFromText)
Prelude Text Morte.Parser> let e1 = exprFromText txt1
Prelude Text Morte.Parser> let e2 = exprFromText txt2
Prelude Text Morte.Parser> import Control.Applicative (liftA2)
Prelude Text Morte.Parser Control.Applicative> liftA2 (==) e1 e2
Right True
$ -- `Right` means both expressions parsed successfully
$ -- `True` means they are alpha-, beta-, and eta-equivalent

We can use this to mechanically verify that two Morte programs optimize to the same result.

Compile-time computation

Morte can compute as much (or as little) at compile as you want. The more information you encode directly within lambda calculus, the more compile-time computation Morte will perform for you. For example, if we translate our Haskell List code entirely to lambda calculus, then Morte will statically compute the result at compile time.


( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *) -> forall (b : *)
-> (a -> b) -> List a -> List b
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
-> ( \(two : Nat)
-> \(three : Nat)
-> ( \(example : List Nat)

-> foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

-- example
(Cons Nat one (Cons Nat two (Cons Nat three (Nil Nat))))

-- two
((+) one one)

-- three
((+) one ((+) one one))

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas

The relevant line is:

foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

If you remove the type-applications to Nat, this parallels our original Haskell example. We can then evaluate this expression at compile time:

$ morte <
∀(a : *) → (a → a) → a → a

λ(a : *) → λ(Succ : a → a) → λ(Zero : a) → Succ (Succ (Succ
(Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))

Morte reduces our program to a church-encoded nine.

Run-time computation

Morte does not force you to compute everything using lambda calculus at compile time. Suppose that we wanted to use machine arithmetic at run-time instead. We can do this by parametrizing our program on:

  • the Int type,
  • operations on Ints, and
  • any integer literals we use

We accept these "foreign imports" as ordinary arguments to our program:


-- Foreign imports
\(Int : *) -- Foreign type
-> \((+) : Int -> Int -> Int) -- Foreign function
-> \((*) : Int -> Int -> Int) -- Foreign function
-> \(lit@0 : Int) -- Literal "1" -- Foreign data
-> \(lit@1 : Int) -- Literal "2" -- Foreign data
-> \(lit@2 : Int) -- Literal "3" -- Foreign data
-> \(lit@3 : Int) -- Literal "1" -- Foreign data
-> \(lit@4 : Int) -- Literal "0" -- Foreign data

-- The rest is compile-time lambda calculus
-> ( \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b) -> List a -> List b
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
-> ( \(example : List Int)

-> foldr Int (map Int Int ((+) lit@3) example) Int (+) lit@4

-- example
(Cons Int lit@0 (Cons Int lit@1 (Cons Int lit@2 (Nil Int))))

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas

We can use Morte to optimize the above program and Morte will reduce the program to nothing but foreign types, operations, and values:

$ morte <
∀(Int : *) → (Int → Int → Int) → (Int → Int → Int) → Int →
Int → Int → Int → Int → Int

λ(Int : *) → λ((+) : Int → Int → Int) → λ((*) : Int → Int →
Int) → λ(lit : Int) → λ(lit@1 : Int) → λ(lit@2 : Int) →
λ(lit@3 : Int) → λ(lit@4 : Int) → (+) ((+) lit@3 lit) ((+)
((+) lit@3 lit@1) ((+) ((+) lit@3 lit@2) lit@4))

If you study that closely, Morte adds lit@3 (the "1" literal) to each literal of the list and then adds them up. We can then pass this foreign syntax tree to our machine arithmetic backend to transform those foreign operations to efficient operations.

Morte lets you choose how much information you want to encode within lambda calculus. The more information you encode in lambda calculus the more Morte can optimize your program, but the slower your compile times will get, so it's a tradeoff.


Corecursion is the dual of recursion. Where recursion works on finite data types, corecursion works on potentially infinite data types. An example would be the following infinite Stream in Haskell:

data Stream a = Cons a (Stream a)

numbers :: Stream Int
numbers = go 0
go n = Cons n (go (n + 1))

-- numbers = Cons 0 (Cons 1 (Cons 2 (...

map :: (a -> b) -> Stream a -> Stream b
map f (Cons a l) = Cons (f a) (map f l)

example :: Stream Int
example = map (+ 1) numbers

-- example = Cons 1 (Cons 2 (Cons 3 (...

Again, pretend that we disable any function from referencing itself so that the above code becomes invalid. This time we cannot reuse the same trick from previous sections because we cannot encode numbers as a fold without referencing itself. Try this if you don't believe me.

However, we can still encode corecursive things in a non-corecursive way. This time, we encode our Stream type as an unfold instead of a fold:

-- This is also valid Haskell code

{-# LANGUAGE ExistentialQuantification #-}

data Stream a = forall s . MkStream
{ seed :: s
, step :: s -> (a, s)

numbers :: Stream Int
numbers = MkStream 0 (\n -> (n, n + 1))

map :: (a -> b) -> Stream a -> Stream b
map f (MkStream s0 k) = MkStream s0 k'
k' s = (f a, s')
where (a, s') = k s

In other words, we store an initial seed of some type s and a step function of type s -> (a, s) that emits one element of our Stream. The type of our seed s can be anything and in our numbers example, the type of the internal state is Int. Another stream could use a completely different internal state of type (), like this:

-- ones = Cons 1 ones

ones :: Stream Int
ones = MkStream () (\_ -> (1, ()))

The general name for this trick is an "F-coalgebra" encoding of a corecursive type.

Once we encode our infinite stream non-recursively, we can safely optimize the stream by inlining and eta reduction:

map id l

-- l = MkStream s0 k
= map id (MkStream s0 k)

-- Inline definition of `map`
= MkStream s0 k'
k' = \s -> (id a, s')
(a, s') = k s

-- Inline definition of `id`
= MkStream s0 k'
k' = \s -> (a, s')
(a, s') = k s

-- Inline: (a, s') = k s
= MkStream s0 k'
k' = \s -> k s

-- Eta reduce
= MkStream s0 k'
k' = k

-- Inline: k' = k
= MkStream s0 k

-- l = MkStream s0 k
= l

Now let's encode Stream and map in Morte and compile the following four expressions:

map id


map f . map g

map (f . g)

Save the following Morte file to and then uncomment the expression you want to test:

(   \(id : forall (a : *) -> a -> a)
-> \( (.)
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> (a -> c)
-> \(Pair : * -> * -> *)
-> \(P : forall (a : *) -> forall (b : *) -> a -> b -> Pair a b)
-> \( first
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (a -> b)
-> Pair a c
-> Pair b c

-> ( \(Stream : * -> *)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b)
-> Stream a
-> Stream b

-- example@1 = example@2
-> ( \(example@1 : forall (a : *) -> Stream a -> Stream a)
-> \(example@2 : forall (a : *) -> Stream a -> Stream a)

-- example@3 = example@4
-> \( example@3
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c

-> \( example@4
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c

-- Uncomment the example you want to test
-> example@1
-- -> example@2
-- -> example@3
-- -> example@4

-- example@1
(\(a : *) -> map a a (id a))

-- example@2
(\(a : *) -> id (Stream a))

-- example@3
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> map a c ((.) a b c f g)

-- example@4
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> (.) (Stream a) (Stream b) (Stream c) (map b c f) (map a b g)

-- Stream
( \(a : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \( st
: forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x
-> \(x : *)
-> \(S : forall (s : *) -> s -> (s -> Pair b s) -> x)
-> st
( \(s : *)
-> \(seed : s)
-> \(step : s -> Pair a s)
-> S
(\(seed@1 : s) -> first a b s f (step seed@1))

-- id
(\(a : *) -> \(va : a) -> va)

-- (.)
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> \(va : a)
-> f (g va)

-- Pair
(\(a : *) -> \(b : *) -> forall (x : *) -> (a -> b -> x) -> x)

-- P
( \(a : *)
-> \(b : *)
-> \(va : a)
-> \(vb : b)
-> \(x : *)
-> \(P : a -> b -> x)
-> P va vb

-- first
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : a -> b)
-> \(p : forall (x : *) -> (a -> c -> x) -> x)
-> \(x : *)
-> \(Pair : b -> c -> x)
-> p x (\(va : a) -> \(vc : c) -> Pair (f va) vc)

Both example@1 and example@2 will generate alpha-equivalent code:

$ morte <
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → st

$ morte <
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → va

Similarly, example@3 and example@4 will generate alpha-equivalent code:

$ morte <
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → st x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va)))))

$ morte <
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → va x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va))))

We inadvertently proved stream fusion for free, but we're still not done, yet! Everything we learn about recursive and corecursive sequences can be applied to model recursive and corecursive effects!


I will conclude this post by showing how to model both recursive and corecursive programs that have side effects. The recursive program will echo ninety-nine lines from stdin to stdout. The equivalent Haskell program is in the comment header:


-- The Haskell code we will translate to Morte:
-- import Prelude hiding (
-- (+), (*), IO, putStrLn, getLine, (>>=), (>>), return )
-- -- Simple prelude
-- data Nat = Succ Nat | Zero
-- zero :: Nat
-- zero = Zero
-- one :: Nat
-- one = Succ Zero
-- (+) :: Nat -> Nat -> Nat
-- Zero + n = n
-- Succ m + n = m + Succ n
-- (*) :: Nat -> Nat -> Nat
-- Zero * n = Zero
-- Succ m * n = n + (m * n)
-- foldNat :: Nat -> (a -> a) -> a -> a
-- foldNat Zero f x = x
-- foldNat (Succ m) f x = f (foldNat m f x)
-- data IO r
-- = PutStrLn String (IO r)
-- | GetLine (String -> IO r)
-- | Return r
-- putStrLn :: String -> IO U
-- putStrLn str = PutStrLn str (Return Unit)
-- getLine :: IO String
-- getLine = GetLine Return
-- return :: a -> IO a
-- return = Return
-- (>>=) :: IO a -> (a -> IO b) -> IO b
-- PutStrLn str io >>= f = PutStrLn str (io >>= f)
-- GetLine k >>= f = GetLine (\str -> k str >>= f)
-- Return r >>= f = f r
-- -- Derived functions
-- (>>) :: IO U -> IO U -> IO U
-- m >> n = m >>= \_ -> n
-- two :: Nat
-- two = one + one
-- three :: Nat
-- three = one + one + one
-- four :: Nat
-- four = one + one + one + one
-- five :: Nat
-- five = one + one + one + one + one
-- six :: Nat
-- six = one + one + one + one + one + one
-- seven :: Nat
-- seven = one + one + one + one + one + one + one
-- eight :: Nat
-- eight = one + one + one + one + one + one + one + one
-- nine :: Nat
-- nine = one + one + one + one + one + one + one + one + one
-- ten :: Nat
-- ten = one + one + one + one + one + one + one + one + one + one
-- replicateM_ :: Nat -> IO U -> IO U
-- replicateM_ n io = foldNat n (io >>) (return Unit)
-- ninetynine :: Nat
-- ninetynine = nine * ten + nine
-- main_ :: IO U
-- main_ = getLine >>= putStrLn

-- "Free" variables
( \(String : * )
-> \(U : *)
-> \(Unit : U)

-- Simple prelude
-> ( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(foldNat : Nat -> forall (a : *) -> (a -> a) -> a -> a)
-> \(IO : * -> *)
-> \(return : forall (a : *) -> a -> IO a)
-> \((>>=)
: forall (a : *)
-> forall (b : *)
-> IO a
-> (a -> IO b)
-> IO b
-> \(putStrLn : String -> IO U)
-> \(getLine : IO String)

-- Derived functions
-> ( \((>>) : IO U -> IO U -> IO U)
-> \(two : Nat)
-> \(three : Nat)
-> \(four : Nat)
-> \(five : Nat)
-> \(six : Nat)
-> \(seven : Nat)
-> \(eight : Nat)
-> \(nine : Nat)
-> \(ten : Nat)
-> ( \(replicateM_ : Nat -> IO U -> IO U)
-> \(ninetynine : Nat)

-> replicateM_ ninetynine ((>>=) String U getLine putStrLn)

-- replicateM_
( \(n : Nat)
-> \(io : IO U)
-> foldNat n (IO U) ((>>) io) (return U Unit)

-- ninetynine
((+) ((*) nine ten) nine)

-- (>>)
( \(m : IO U)
-> \(n : IO U)
-> (>>=) U U m (\(_ : U) -> n)

-- two
((+) one one)

-- three
((+) one ((+) one one))

-- four
((+) one ((+) one ((+) one one)))

-- five
((+) one ((+) one ((+) one ((+) one one))))

-- six
((+) one ((+) one ((+) one ((+) one ((+) one one)))))

-- seven
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))

-- eight
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))
-- nine
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))))

-- ten
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))))

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero

-- foldNat
( \(n : forall (a : *) -> (a -> a) -> a -> a)
-> n

-- IO
( \(r : *)
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (r -> x)
-> x

-- return
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : a -> x)
-> Return va

-- (>>=)
( \(a : *)
-> \(b : *)
-> \(m : forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (a -> x)
-> x
-> \(f : a
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (b -> x)
-> x
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : b -> x)
-> m x PutStrLn GetLine (\(va : a) -> f va x PutStrLn GetLine Return)

-- putStrLn
( \(str : String)
-> \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : U -> x)
-> PutStrLn str (Return Unit)

-- getLine
( \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : String -> x)
-> GetLine Return

This program will compile to a completely unrolled read-write loop, as most recursive programs will:

$ morte <
∀(String : *) → ∀(U : *) → U → ∀(x : *) → (String → x → x) →
((String → x) → x) → (U → x) → x

λ(String : *) → λ(U : *) → λ(Unit : U) → λ(x : *) → λ(PutStr
Ln : String → x → x) → λ(GetLine : (String → x) → x) → λ(Ret
urn : U → x) → GetLine (λ(va : String) → PutStrLn va (GetLin
e (λ(va@1 : String) → PutStrLn va@1 (GetLine (λ(va@2 : Strin
g) → PutStrLn va@2 (GetLine (λ(va@3 : String) → PutStrLn ...
... GetLine (λ(va@92 : String) → PutStrLn va@92 (GetLine (λ(
va@93 : String) → PutStrLn va@93 (GetLine (λ(va@94 : String)
→ PutStrLn va@94 (GetLine (λ(va@95 : String) → PutStrLn va@
95 (GetLine (λ(va@96 : String) → PutStrLn va@96 (GetLine (λ(
va@97 : String) → PutStrLn va@97 (GetLine (λ(va@98 : String)
→ PutStrLn va@98 (Return Unit))))))))))))))))))))))))))))))

In contrast, if we encode the effects corecursively we can express a program that echoes indefinitely from stdin to stdout:


-- data IOF r s
-- = PutStrLn String s
-- | GetLine (String -> s)
-- | Return r
-- data IO r = forall s . MkIO s (s -> IOF r s)
-- main = MkIO
-- Nothing
-- (maybe (\str -> PutStrLn str Nothing) (GetLine Just))

( \(String : *)
-> ( \(Maybe : * -> *)
-> \(Just : forall (a : *) -> a -> Maybe a)
-> \(Nothing : forall (a : *) -> Maybe a)
-> \( maybe
: forall (a : *)
-> Maybe a
-> forall (x : *)
-> (a -> x)
-> x
-> x
-> \(IOF : * -> * -> *)
-> \( PutStrLn
: forall (r : *)
-> forall (s : *)
-> String
-> s
-> IOF r s
-> \( GetLine
: forall (r : *)
-> forall (s : *)
-> (String -> s)
-> IOF r s
-> \( Return
: forall (r : *)
-> forall (s : *)
-> r
-> IOF r s
-> ( \(IO : * -> *)
-> \( MkIO
: forall (r : *)
-> forall (s : *)
-> s
-> (s -> IOF r s)
-> IO r
-> ( \(main : forall (r : *) -> IO r)
-> main

-- main
( \(r : *)
-> MkIO
(Maybe String)
(Nothing String)
( \(m : Maybe String)
-> maybe
(IOF r (Maybe String))
(\(str : String) ->
(Maybe String)
(Nothing String)
(GetLine r (Maybe String) (Just String))

-- IO
( \(r : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> IOF r s) -> x)
-> x

-- MkIO
( \(r : *)
-> \(s : *)
-> \(seed : s)
-> \(step : s -> IOF r s)
-> \(x : *)
-> \(k : forall (s : *) -> s -> (s -> IOF r s) -> x)
-> k s seed step

-- Maybe
(\(a : *) -> forall (x : *) -> (a -> x) -> x -> x)

-- Just
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Just va

-- Nothing
( \(a : *)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Nothing

-- maybe
( \(a : *)
-> \(m : forall (x : *) -> (a -> x) -> x-> x)
-> m

-- IOF
( \(r : *)
-> \(s : *)
-> forall (x : *)
-> (String -> s -> x)
-> ((String -> s) -> x)
-> (r -> x)
-> x

-- PutStrLn
( \(r : *)
-> \(s : *)
-> \(str : String)
-> \(vs : s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> PutStrLn str vs

-- GetLine
( \(r : *)
-> \(s : *)
-> \(k : String -> s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> GetLine k

-- Return
( \(r : *)
-> \(s : *)
-> \(vr : r)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> Return vr


This compiles to a state machine that we can unfold one step at a time:

$ morte <
∀(String : *) → ∀(r : *) → ∀(x : *) → (∀(s : *) → s → (s → ∀
(x : *) → (String → s → x) → ((String → s) → x) → (r → x) →
x) → x) → x

λ(String : *) → λ(r : *) → λ(x : *) → λ(k : ∀(s : *) → s → (
s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r →
x) → x) → x) → k (∀(x : *) → (String → x) → x → x) (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing) (λ(m : ∀
(x : *) → (String → x) → x → x) → m (∀(x : *) → (String → (∀
(x : *) → (String → x) → x → x) → x) → ((String → ∀(x : *) →
(String → x) → x → x) → x) → (r → x) → x) (λ(str : String)
→ λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x)
→ x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x
) → x → x) → x) → λ(Return : r → x) → PutStrLn str (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing)) (λ(x :
*) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x)
→ x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x →
x) → x) → λ(Return : r → x) → GetLine (λ(va : String) → λ(x
: *) → λ(Just : String → x) → λ(Nothing : x) → Just va))

I don't expect you to understand that output other than to know that we can translate the output to any backend that provides functions, and primitive read/write operations.


If you would like to use Morte, you can find the library on both Github and Hackage. I also provide a Morte tutorial that you can use to learn more about the library.

Morte is dependently typed in theory, but in practice I have not exercised this feature so I don't understand the implications of this. If this turns out to be a mistake then I will downgrade Morte to System Fw, which has higher-kinds and polymorphism, but no dependent types.

Additionally, Morte might be usable to transmit code in a secure and typed way in distributed environment or to share code between diverse functional language by providing a common intermediate language. However, both of those scenarios require additional work, such as establishing a shared set of foreign primitives and creating Morte encoders/decoders for each target language.

Also, there are additional optimizations which Morte might implement in the future. For example, Morte could use free theorems (equalities you deduce from the types) to simplify some code fragments even further, but Morte currently does not do this.

My next goals are:

  • Add a back-end to compile Morte to LLVM
  • Add a front-end to desugar a medium-level Haskell-like language to Morte

Once those steps are complete then Morte will be a usable intermediate language for writing super-optimizable programs.

Also, if you're wondering, the name Morte is a tribute to a talking skull from the game Planescape: Torment, since the Morte library is a "bare-bones" calculus of constructions.


If this topic interests you more, you may find the following links helpful, in roughly increasing order of difficulty:

by Gabriel Gonzalez ( at September 12, 2014 11:38 AM

Ken T Takusagawa

[prbwmqwj] Functions to modify a record

Haskell could use some new syntax LAMBDA_RECORD_MODIFY which could be used as follows:

import qualified Control.Monad.State as State;
data Record { field :: Int };
... State.modify $ LAMBDA_RECORD_MODIFY { field = ... };

which is equivalent to

State.modify $ \x -> x { field = ... }

but not having to name the lambda parameter "x" (twice).

I suspect this is one of the things lenses are trying to do.

by Ken ( at September 12, 2014 03:43 AM

September 11, 2014

The GHC Team

Static pointers and serialisation

This longish post gives Simon's reflections on the implementation of Cloud-Haskell-style static pointers and serialiation. See also StaticPointers.

Much of what is suggested here is implemented, in some form, in two existing projects

My goal here is to identify the smallest possible extension to GHC, with the smallest possible trusted code base, that would enable these libraries to be written in an entirely type-safe way.


Background: the trusted code base

The implementation Typeable class, and its associated functions, in GHC offers a type-safe abstraction, in the classic sense that "well typed programs won't go wrong". For example, we in Data.Typeable we have

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b

We expect cast to be type-safe: if cast returns a value Just x then we really do know that x :: b. Let's remind ourselves of class Typeable:

class Typeable a where
  typeRep :: proxy a -> TypeRep

(It's not quite this, but close.) The proxy a argument is just a proxy for type argument; its value is never inspected and you can always pass bottom.

Under the hood, cast uses typeRep to get the runtime TypeRep for a and b, and compares them, thus:

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
cast x = if typeRep (Proxy :: Proxy a) == typeRep (Proxy :: Proxy b)
           then Just (unsafeCoerce x)
           else Nothing

Although cast is written in Haskell, it uses unsafeCoerce. For it to truly be type-safe, it must trust the Typeable instances. If the user could write a Typeable instance, they could write a bogus one, and defeat type safety. So only GHC is allowed write Typeable instances.

In short, cast and the Typeable instances are part of the trusted code base, or TCB:

  • The TCB should be as small as possible
  • The TCB should have a small, well-defined, statically-typed API used by client code
  • Client code is un-trusted; if the client code is well-typed, and the TCB is implemented correctly, nothing can go wrong

Background Typeable a and TypeRep

I'll use the Typeable a type class and values of type TypeRep more or less interchangeably. As you can see from the definition of class Typeable above, its payload is simply a constant function returning a TypeRep. So you can think of a Typeable a as simply a type-tagged version of TypeRep.

Of course, a Typeable a is a type class thing, which is hard to pass around explicitly like a value, but that is easily fixed using the "Dict Trick", well known in Haskell folk lore:

data Dict (c :: Constraint) where
  Dict :: forall c. c => Dict c

Now a value of type Dict (Typeable a) is an ordinary value that embodies a Typeable a dictionary. For example:

f :: Dict (Typeable a) -> Dict (Typeable b) -> a -> Maybe b
f Dict Dict val = cast val

The pattern-matches against the Dict constructor brings the Typeable dictionaries into scope, so they can be used to discharge the constraint arising from the call to cast.

Background: serialisation

I'm going to assume a a type class Serialisable, something like this:

class Serialisable a where
  encode :: a -> ByteString
  decode :: ByteString -> Maybe (a, ByteString)

'll use "encode" and "decode" as synonyms for "serialise" and "deserialise", because the former are easier to pronounce.

Here's an interesting question: are instances of Serialisable part of the TCB? No, they are not. Here is a tricky case:

  decode (encode [True,False]) :: Maybe (Int, ByteString)

Here I have encode a [Bool] into a ByteString, and then decoded an Int from that ByteString. This may be naughty or undesirable, but it cannot seg-fault: it is type-safe in the sense above. You can think of it like this: a decoder is simply a parser for the bits in the ByteString, so a decoder for (say) Int can fail to parse a full Int (returning Nothing), but it can't return a non-Int.

For the naughtiness, one could imagine that a Cloud Haskell library might send fingerprints or TypeReps or whatnot to eliminate potential naughtiness. But even then it is very valuable if the type-safety of the system does not rely on the CH library. Type safety depends only on the correctness of the (small) TCB; naughtiness-safety might additionally depend on the correctness of the CH library.

Background: static pointers

I'm taking for granted the basic design of the Cloud Haskell paper. That is,

  • A type constructor StaticPtr :: * -> *. Intuitively, a value of type StaticPtr t is represented by a static code pointer to a value of type t. Note "code pointer" not "heap pointer". That's the point!
  • A language construct static <expr>, whose type is StaticPtr t if <expr> has type t.
  • In static <expr>, the free variables of <expr> must all be bound at top level. The implementation almost certainly works by giving <expr> a top-level definition with a new name, static34 = <expr>.
  • A function unStatic :: StaticPtr a -> a, to unwrap a static pointer.
  • Static values are serialisable. Something like instance Serialisable (StaticPtr a). (This will turn out to be not quite right.) Operationally this works by serialising the code pointer, or top-level name (e.g "Foo.static34").

All of this is built-in. It is OK for the implementation of StaticPtr to be part of the TCB. But our goal is that no other code need be in the TCB.

A red herring. I'm not going to address the question of how to serialise a static pointer. One method would be to serialise a machine address, but that only works if the encoding and decoding ends are running identical binaries. But that's easily fixed: encode a static as the name of the static value e.g. "function foo from module M in package p". Indeed, I'll informally assume an implementation of this latter kind.

In general, I will say that what we ultimately serialise is a StaticName. You can think of a StaticName as package/module/function triple, or something like that. The implementation of StaticName is certainly not part of the client-visible API for StaticPtr; indeed, the type StaticName is not part of the API either. But it gives us useful vocabulary.

Serialising static pointers

We can see immediately that we cannot expect to have instance Serialisable (Static a), which is what the Cloud Haskell paper proposed. If we had such an instance we would have

encodeStatic :: forall a. StaticPtr a -> ByteString
decodeStatic :: forall a. ByteString -> Maybe (StaticPtr a, ByteString)

And it's immediately apparent that decodeStatic cannot be right. I could get a ByteString from anywhere, apply decodeStatic to it, and thereby get a StaticPtr a. Then use unStatic and you have a value of type a, for, for any type a!!

Plainly, what we need is (just in the case of cast) to do a dynamic typecheck, thus:

decodeStatic :: forall a. Typeable a
                       => ByteString -> Maybe (StaticPtr a, ByteString)

Let's think operationally for a moment:

  • GHC collects all the StaticPtr values in a table, the static pointer table or SPT. Each row contains
    • The StaticName of the value
    • A pointer to closure for the value itself
    • A pointer to its TypeRep
  • decodeStatic now proceeds like this:
    • Parse a StaticName from the ByteString (failure => Nothing)
    • Look it up in table (not found => Nothing)
    • Compare the TypeRep passed to decodeStatic (via the Typeable a dictionary) with the one ine the table (not equal => Nothing)
    • Return the value

Side note. Another possibility is for decodeStatic not to take a Typeable a context but instead for unStatic to do so:: unStatic :: Typeable a => StaticPtr a -> Maybe a. But that seems a mess. Apart from anything else, it would mean that a value of type StaticPtr a might or might not point to a value of type a, so there's no point in having the type parameter in the first place. End of side note.

This design has some useful consequences that are worth calling out:

  • A StaticPtr is serialised simply to the StaticName; the serialised form does not need to contain a TypeRep. Indeed it would not even be type-safe to serialise a StaticPtr to a pair of a StaticName and a TypeRep, trusting that the TypeRep described the type of the named function. Why not? Think back to "Background: serialisation" above, and imagine we said
    decode (encode ["wibble", "wobble"])
      :: Typeable a => Maybe (StaticPtr a, ByteString)
    Here we create an essentially-garbage ByteString by encoding a [String], and try to decode it. If, by chance, we successfully parse a valid StaticName and TypeRep, there is absolutely no reason to suppose that the TypeRep will describe the type of the function.

    Instead, the TypeRep of the static pointer lives in the SPT, securely put there when the SPT was created. Not only is this type-safe, but it also saves bandwidth by not transmittingTypeReps.
  • Since clients can effectively fabricate a StaticName (by supplying decodeStatic with a bogus ByteString, a StaticName is untrusted. That gives the implementation a good deal of wiggle room for how it chooses to implement static names. Even a simple index in the range 0..N would be type-safe!

    The motivation for choosing a richer representation for StaticName (eg package/module/name) is not type-safety but rather resilience to change. For example, the Haskell programs at the two ends could be quite different, provided only that they agreed about what to call the static pointers that they want to exchange.

Statics and existentials

Here is something very reasonable:

data StaticApp b where
  SA :: StaticPtr (a->b) -> StaticPtr a -> StaticApp b
unStaticApp :: StaticApp a -> a
unStaticApp (SA f a) = unStatic f (unStatic a)

(We might want to add more constructors, but I'm going to focus only on SA.) A SA is just a pair of StaticPtrs, one for a function and one for an argument. We can securely unwrap it with unStaticApp.

Now, here is the question: can we serialise StaticApps? Operationally, of course yes: to serialise a SA, just serialise the two StaticPtrs it contains, and dually for deserialisation. But, as before, deserialisation is the hard bit. We seek:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)

But how can we write decodeSA? Here is the beginning of an attempt:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = case decodeStatic bs :: Maybe (StaticPtr (a->b)) of
      Nothing -> Nothing
      Just (fun, bs1) -> ...

and you can immediately see that we are stuck. Type variable b is not in scope. More concretely, we need a Typeable (a->b) to pass in to decodeStatic, but we only have a Typeable b to hand.

What can we do? Tantalisingly, we know that if decodeStatic succeeds in parsing a static StaticName from bs then, when we look up that StaticName in the Static Pointer Table, we'll find a TypeRep for the value. So rather than passing a Typeable dictionary into decodeStatic, we'd like to get one out!

With that in mind, here is a new type signature for decodeStatic that returns both pieces:

data DynStaticPtr where
  DSP :: Typeable a => StaticPtr a -> DynStaticPtr
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)

(The name DynStaticPtr comes from the fact that this data type is extremely similar to the library definition of Dynamic.)

Operationally, decodeStaticK bs fail cont works like this;

  • Parse a StaticName from bs (failure => return Nothing)
  • Look it up in the SPT (not found => return Nothing)
  • Return the TypeRep and the value found in the SPT, paired up with DSP. (Indeed the SPT could contain the DynStaticPtr values directly.)

For the construction of DynStaticPtr to be type-safe, we need to know that the TypeRep passed really is a TypeRep for the value; so the construction of the SPT is (unsurprisingly) part of the TCB.

Now we can write decodeSA (the monad is just the Maybe monad, nothing fancy):

decodeSA :: forall b. Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (DSP (arg :: StaticPtr targ), bs2) <- decodeStatic bs1
            -- At this point we have
            --     Typeable b      (from caller)
            --     Typeable tfun   (from first DSP)
            --     Typeable targ   (from second DSP)
       ; fun' :: StaticPtr (targ->b) <- cast fun
       ; return (SA fun' arg, bs2) }

The call to cast needs Typeable tfun, and Typeable (targ->b). The former is bound by the first DSP pattern match. The latter is constructed automatically from Typeable targ and Typeable b, both of which we have. Bingo!

Notice that decodeSA is not part of the TCB. Clients can freely write code like decodeSA and be sure that it is type-safe.

From static pointers to closures

The original Cloud Haskell paper defines closures like this:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> ByteString -> Closure a

It is easy to define

unClo :: Closure a -> a
unClo (Clo s e) = unStatic s e

Side note on HdpH

HdpH refines the Cloud Haskell Closure in (at least) two ways. I think (but I am not certain) that this declaration captures the essence:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> Put () -> a -> Closure a

The refinements are:

  • The extra argument of type 'a' to avoid costs when we build a closure and then unwrap it with unClo locally, or repeatedly.
  • The use of Put () rather than a ByteString for the serialised environment, to avoid repeated copying when doing nested serialisation.

Both are importnat, but they are orthogonal to the discussion about static types, so I'll use the CH definition from here on.

Serialising closures

Just as in the case of StaticPtr, it is immediately clear that we cannot expect to have

decodeClo :: ByteString -> Maybe (Closure a, ByteString)

Instead we must play the same trick, and attempt to define

data DynClosure where
  DC :: Typeable a => Closure a -> DynClosure
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)

But there's an immediate problem in writing decodeClo:

decodeClo bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (env, bs2)                         <- decodeByteString bs1
       ; return (DC (Clo fun env), bs2) }  -- WRONG

This won't typecheck because DC needs Typeable a, but we only have Typeable (ByteString -> a)`.

This is Jolly Annoying. I can see three ways to make progress:

  • Plan A: Provide some (type-safe) way to decompose TypeReps, to get from Typeable (a->b) to Typeable b (and presumably Typeable a as well).
  • Plan C: Serialise a TypeRep a with every Closure a.
  • Plan C: Generalise StaticPtr

I like Plan C best. They are each discussed next.

Plan A: Decomposing TypeRep

At the moment, GHC provides statically-typed ways to construct and compare a TypeRep (via cast), but no way to decompose one, at least not in a type-safe way. It is tempting to seek this function as part of the TCB:

class Typeable a where
  typeRep :: proxy a -> TypeRep
  decomposeTypeRep :: DecompTR a
data DecompTR a where
  TRApp :: (Typeable p, Typeable q) => DecompTR (p q)
  TRCon :: TyCon -> DecompTR a

This isn't a bad idea, but it does mean that Typeable a must be implemented (and presumably serialised) using a tree, whereas the current API would allow an implementation consisting only of a fingerprint.

(Thought experiment: maybe a Typeable a, and Dict (Typeable a) can be represented as a tree, but a TypeRep could be just a fingerprint?)

Plan B: serialise TypeRep with Closure

Since we need a Typeable a at the far end, we could just serialise it directly with the Closure, like this:

encodeClo :: forall a. Typeable a => Closure a -> ByteString
encodeClo (Clo fun env)
  =  encodeTypeable (proxy :: a)
  ++ encodeStatic fun
  ++ encodeByteString env

Here I am assuming (as part of the TBC)

encodeTypeable :: Typeable a => proxy a -> ByteString
decodeTypeable :: ByteString -> Maybe (DynTypeable, ByteString)
data DynTypeable where
  DT :: Typeable a => proxy a -> DynTypeable

which serialises a TypeRep. (Or, operationally, perhaps just its fingerprint.) Now I think we can write decodeClo:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DT (_ :: Proxy a),           bs1)  <- decodeTypeable
       ; (DSP (fun :: StaticPtr tfun), bs2)  <- decodeStatic bs1
       ; (env, bs3)                          <- decodeByteString bs2
       ; fun' :: StaticPtr (ByteString -> a) <- cast fun
       ; return (DC (Clo fun' env), bs2) }  -- WRONG

But this too is annoying: we have to send these extra TypeReps when morally they are already sitting there in the SPT.

Plan C: Generalising StaticPtr

Our difficulty is that we are deserialising StaticPtr (ByteString -> a) but we want to be given Typeable a not Typeable (ByteString -> a). So perhaps we can decompose the type into a type constructor and type argument, like this:

data StaticPtr (f :: *->*) (a :: *)
unStatic :: StaticPtr f a -> f a
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)
data DynStaticPtr where
  DS :: (Typeable f, Typeable a) => StaticPtr (f a) -> DynStaticPtr

Each row of the SPT contains:

  • The StaticName
  • The value of type f a
  • The Typeable f dictionary
  • The Typeable a dictionary

Now we can define closures thus:

data Closure a where
  Clo :: StaticPtr (ByteString ->) a -> ByteString -> Closure a

and these are easy to deserialise:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DSP (fun :: StaticPtr f a), bs1) <- decodeStatic bs
       ; (env, bs2)                        <- decodeByteString bs1
           -- Here we have Typeable f, Typeable a
       ; fun' :: StaticPtr (ByteString ->) a <- cast fun
           -- This cast checks that f ~ (ByteString ->)
           -- Needs Typeable f, Typealbe (ByteString ->)
       ; return (DC (Clo fun env), bs2) }
           -- DC needs Typeable a

I like this a lot better, but it has knock on effects.

  • The old StaticPtr a is now StaticPtr Id a.
  • What becomes of our data type for StaticApply? Perhpas
    data StaticApp f b where
      SA :: StaticPtr f (a->b) -> StaticPtr f b -> StaticApp f b
    unStaticApp :: Applicative => StaticApp f b -> f b

ToDo: ...I have not yet followed through all the details

Applying closures

Can we write closureApply? I'm hoping for a structure like this:

closureApply :: Closure (a->b) -> Closure a -> Closure b
closureApply fun arg = Clo (static caStatic) (fun, arg)
caStatic :: ByteString -> b  -- WRONG
caStatic bs = do { ((fun,arg), bs1) <- decode bs
                 ; return (unClo fun (unClo arg), bs1) }

This is obviously wrong. caStatic clearly cannot have that type. It would at least need to be

caStatic :: Typeable b => ByteString -> b

and now there is the thorny question of where the Typeable b dictionary comes from.

ToDo: ...I have stopped here for now

Polymorphism and serialisation

For this section I'll revert to the un-generalised single-parameter StaticPtr.

Parametric polymorphism

Consider these definitions:

rs1 :: Static ([Int] -> [Int])
rs1 = static reverse
rs2 :: Static ([Bool] -> [Bool])
rs2 = static reverse
rs3 :: forall a. Typeable a => Static ([a] -> [a])
rs3 = static reverse

The first two are clearly fine. The SPT will get one row for each of the two monomorphic calls to reverse, one with a TypeRep of [Int] -> [Int] and one with a TypeRep of [Bool] -> [Bool].

But both will have the same code pointer, namely the code for the polymorpic reverse function. Could we share just one StaticName for all instantiations of reverse, perhaps including rs3 as well?

I think we can. The story would be this:

  • The SPT has a row for reverse, containing
    • The StaticName for reverse
    • A pointer to the code for reverse (or, more precisely, its static closure).
    • A function of type TypeRep -> TypeRep that, given the TypeRep for a returns a TypeRep for [a] -> [a].
  • When we serialise a StaticPtr we send
    • The StaticName of the (polymorphic) function
    • A list of the TypeReps of the type arguments of the function
  • The rule for static <expr> becomes this: the free term variables <expr> must all be top level, but it may have free type variables, provided they are all Typeable.

All of this is part of the TCB, of course.

Type-class polymorphism

Consider static sort where sort :: Ord a => [a] -> [a]. Can we make such a StaticPtr. After all, sort gets an implicit value argument, namely an Ord a dictionary. If that dictionary can be defined at top level, well and good, so this should be OK:

ss1 :: StaticPtr ([Int] -> [Int])
ss1 = static sort

But things go wrong as soon as you have polymorphism:

ss2 :: forall a. Ord a => StaticPtr ([a] -> [a])
ss2 = static sort  -- WRONG

Now, clearly, the dictionary is a non-top-level free variable of the call to sort.

We might consider letting you write this:

ss3 :: forall a. StaticPtr (Ord a => [a] -> [a])
ss3 = static sort   -- ???

so now the static wraps a function expeting a dictionary. But that edges us uncomforatbly close to impredicative types, which is known to contain many dragons.

A simpler alternative is to use the Dict Trick (see Background above):

ss4 :: forall a. StaticPtr (Dict (Ord a) -> [a] -> [a])
ss4 = static sortD
sortD :: forall a. Dict (Ord a) -> [a] -> [a]
sortD Dict xs = sort xs

Now, at the call side, when we unwrap the StaticPtr, we need to supply an explicit Ord dictionary, like this:

...(unStatic ss4 Dict)....

For now, I propose to deal with type classes via the Dict Trick, which is entirely end-user programmable, leaving only parametric polymorphism for built-in support.

by simonpj at September 11, 2014 01:34 PM

Yesod Web Framework

Clarification of previous blog post

I've heard that my previous blog post has caused a bit of confusion, as sarcasm doesn't really come across in text very well. So let me elaborate (and of course, in the process, kill the joke):

Some years back, Erik found a case that was quite difficult to implement using enumerator. After we cracked our heads on it for long enough, some of us (I don't actually remember who was involved) decided to work on a new streaming library. That library ended up being called conduit (thanks to Yitz for the naming idea). It turns out that most people are unaware of that history, so when at ICFP, I casually mentioned that Erik was the cause of conduit coming into existence, some people were surprised. Erik jokingly chastised me for not giving him enough credit. In response, I decided to write an over-the-top post giving Erik all credit for conduit. I say over the top, since I made it seem like there was some large amount of blame being heaped on as well.

So to be completely clear:

  • Erik and I are good friends, and this was just a bit of an inside joke turned public.
  • No one has said anything offensive to me at all about conduit. There are obviously differing opinions out there about the best library for a job, but there's nothing offensive about it, just healthy discussion around a complicated topic. My purpose in making a big deal about it was not to express frustration at anyone attacking me, but rather to just play up the joke a bit more.

My apologies to anyone who was confused, upset, or worried by the previous post, it was completely unintentional.

September 11, 2014 12:00 AM

FP Complete

We're hiring: Haskell web UI developer

FP Complete is looking to expand its Haskell development team. We’re looking for a Haskeller with a strong background in web UI development. This position will encompass both work on our core products- such as FP Haskell Center and School of Haskell- as well as helping customers develop frontends to their Haskell applications.

We will want you to start right away. The will be a contractor position, full time for at least 3 months, with the intention to continue long-term on a more or less full-time basis. Additionally, while the main focus of the position will be UI development, there will be many opportunities to expand into other areas of focus.

This is a telecommute position: you can work from home or wherever you choose, with little or no travel. Location in North America is ideal; you will work with colleagues who are on North American and European hours.

Skills required:

  • Strong Haskell coding skills.
  • Experience with creating HTML/CSS/Javascript web applications (fat clients a plus).
  • Ideally: experience with both Yesod and Fay for server and client side coding, respectively. (Perk: you’ll get a chance to work with the authors of both tools.)
  • Experience deploying applications into production, especially at large scale, is a plus.
  • Ability to interact with a distributed development team, and to manage your time without an in-person supervisor
  • Ability to work with clients on gathering requirements
  • General source control/project skills: Git, issue tracking
  • Ability to communicate clearly in issues, bug reports and emails
  • Proficient on a Linux system
  • Plus: experience with deployment, Docker, and/or CoreOS

Please send resume or CV to Any existing work- either a running website or an open source codebase- which you can include links to will be greatly appreciated as well.

September 11, 2014 12:00 AM

September 10, 2014

Mike Izbicki

Polymorphism in Haskell vs C++

Polymorphism in Haskell vs C++

posted on 2014-09-10 by Jonathan Dugan

Parametric polymorphism is when you write one function that works on many data types. In C++, this is pretty confusing, but it’s really easy in Haskell. Let’s take a look at an example.

Let’s say we want a function that calculates the volume of a box. In C++, we’d use templates so that our function works with any numeric type:

template<typename T>
T boxVolume(T length, T width, T height)
    return length * width * height;

Templates have an awkward syntax, but that isn’t too much of a hassle. C++ has much bigger problems. What if in the course of writing your program, you accidentally pass in some strings to this function?

int main()
    cout << boxVolume("oops","no","strings") << endl;

We get this error when we compile with g++:

test.cpp: In instantiation of _T boxVolume(T, T, T) [with T = const    char*]_:
test.cpp:22:47:   required from here
test.cpp:8:19: error: invalid operands of types _const char*_ and _const char*_ to binary
    return length * width * height;

This error message is a little hard to understand because of the templates. If we had written our function to use doubles instead of templates:

double boxVolume(double length, double width, double height)
    return length * width * height;

We would get this simpler error message:

test.cpp: In function _int main()_:
test.cpp:22:47: error: cannot convert _const char*_ to _double_ for argument _1_ to _double
boxVolume(double, double, double)_
    cout << boxVolume("oops","nope","bad!") << endl;

We see that this error is shorter and easier to use, as it clearly tells us we cannot pass string literals to our function. Plus there is no superfluous comment about our “instantiation” of boxVolume.

Now let’s try to write a polymorphic boxVolume in Haskell:

boxVolume :: a -> a -> a -> a
boxVolume length width height = length * width * height

When we try to compile, we get the error:

    No instance for (Num a) arising from a use of `*'
    Possible fix:
      add (Num a) to the context of
        the type signature for boxVolume :: a -> a -> a -> a
    In the expression: length * width * height
    In an equation for `boxVolume':
        boxVolume length width height = length * width * height

Uh-oh! An error message! What went wrong? It says that we tried to use the * operator without declaring our parameters as an instance of the Num type class.

But what is a type class? This leads us to ad hoc polymorphism, also known as function overloading. Ad hoc polymorphism is when a function can be applied to different argument types, each with a different implementation. For example, the STL classes stack and queue each have their own push and pop functions, which, although they have the same names, do different things:

stack<int> s;
queue<int> q;

s.push(1); q.push(1);
s.push(2); q.push(2);
s.push(3); q.push(3);

s.pop(); q.pop();

After the above code is executed, the stack s will be left with the numbers 1,2 while the queue q will be left with the numbers 2,3. The function pop behaves differently on stacks and queues: calling pop on a stack removes the item added last, while calling pop on a queue removes the item added first.

Haskell does not support function overloading, except through type classes. For example, if we were to specifically declare our own Stack and Queue classes with push and pop functions:

data Stack = Stack  [Int] deriving Show
data Queue = Queue [Int] deriving Show

push :: Stack -> Int -> Stack
push (Stack xs) x = Stack (x:xs)

pop :: Stack -> Stack
pop (Stack []) = Stack []
pop (Stack xs) = Stack (tail xs)

push :: Queue -> Int -> Queue
push (Queue xs) x = Queue (x:xs)

pop :: Queue -> Queue
pop (Queue []) = Queue []
pop (Queue xs) = Queue (init xs)

It results in a compiler error:

    Duplicate type signatures for `push'
    at stack.hs:4:1-4

    Multiple declarations of `push'
    Declared at: stack.hs:5:1

    Duplicate type signatures for `pop'
    at stack.hs:7:1-3

    Multiple declarations of `pop'
    Declared at: stack.hs:8:1

Changing the names of our push and pop functions to, say, stackPush, stackPop, queuePush, and queuePop would let the program compile.

A more generic way, however, is to create a type class. Let’s make a Sequence type class that implements our push and pop functions.

class Sequence s where
    push :: s -> Int -> s
    pop :: s -> s

This type class declaration says that any data type that is an instance of this Sequence type class can use the push and pop operations, or, in other words, can add and remove an Int. By making our Stack and Queue instances of the Sequence type class, both data types can have their own implementations of the push and pop functions!

instance Sequence Stack where
    push (Stack xs) x = Stack (x:xs)
    pop (Stack []) = Stack []
    pop (Stack xs) = Stack (tail xs)

instance Sequence Queue where
    push (Queue xs) x = Queue (x:xs)
    pop (Queue []) = Queue []
    pop (Queue xs) = Queue (init xs)

Replacing our function definitions with these instantiations of the Sequence type class lets our program compile.

Type classes are also an important part of using templates in function definitions. In our function boxVolume, we got an error because we tried to use the * operation without declaring the type variable a as an instance of the Num type class. The Num type class is basically for anything that acts like a number, such as Int, Float, and Double, and it lets you use the common operations of +, -, and *.

Let’s change our function to declare that a is a Num:

boxVolume :: (Num a) => a -> a -> a -> a
boxVolume length width height = length * width * height

This is called adding a class constraint. Whenever we want to declare a template function that relies on other functions, we have to add a class constraint that tells both the user and the compiler which types of data can be put into the function.

If we were to call boxVolume on strings, we would get this simple error message:

ghci> boxVolume "a" "b" "c"

    No instance for (Num [Char]) arising from a use of `boxVolume'
    Possible fix: add an instance declaration for (Num [Char])
    In the expression: boxVolume "a" "b" "c"
    In an equation for `it': it = boxVolume "a" "b" "c"

The compiler tells us it can’t evaluate this function because strings aren’t numbers! If we really wanted to, we could make String an instance of the Num type class, and then this function would work! (Of course, why you would want to do that is beyond me.) That’s the power of parametric polymorphism combined with type classes.

So there you have it. In C++, although we can easily implement ad hoc polymorphism through function overloading, parametric polymorphism is a tricky beast. This is made easier in Haskell, especially with the use of type classes. Type classes guarantee that data passed in to functions will work, and guide the user into what they can pass into a function. Use type classes to your advantage when you next write a Haskell program!

September 10, 2014 12:00 AM

September 09, 2014

Dominic Steinitz

Fun with (Extended Kalman) Filters


An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness.

Population Growth

Suppose we wish to model population growth of bees via the logistic equation

\displaystyle  \begin{aligned}  \dot{p} & = rp\Big(1 - \frac{p}{k}\Big)  \end{aligned}

We assume the growth rate r is unknown and drawn from a normal distribution {\cal{N}}(\mu_r, \sigma_r^2) but the carrying capacity k is known and we wish to estimate the growth rate by observing noisy values y_i of the population at discrete times t_0 = 0, t_1 = \Delta T, t_2 = 2\Delta T, \ldots. Note that p_t is entirely deterministic and its stochasticity is only as a result of the fact that the unknown parameter of the logistic equation is sampled from a normal distribution (we could for example be observing different colonies of bees and we know from the literature that bee populations obey the logistic equation and each colony will have different growth rates).

Haskell Preamble

> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing  #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
> {-# LANGUAGE DataKinds                    #-}
> {-# LANGUAGE ScopedTypeVariables          #-}
> {-# LANGUAGE RankNTypes                   #-}
> {-# LANGUAGE BangPatterns                 #-}
> {-# LANGUAGE TypeOperators                #-}
> {-# LANGUAGE TypeFamilies                 #-}
> module FunWithKalman3 where
> import GHC.TypeLits
> import Numeric.LinearAlgebra.Static
> import Data.Maybe ( fromJust )
> import Numeric.AD
> import Data.Random.Source.PureMT
> import Data.Random
> import Control.Monad.State
> import qualified Control.Monad.Writer as W
> import Control.Monad.Loops

Logistic Equation

The logistic equation is a well known example of a dynamical system which has an analytic solution

\displaystyle  p = \frac{kp_0\exp rt}{k + p_0(\exp rt - 1)}

Here it is in Haskell

> logit :: Floating a => a -> a -> a -> a
> logit p0 k x = k * p0 * (exp x) / (k + p0 * (exp x - 1))

We observe a noisy value of population at regular time intervals (where \Delta T is the time interval)

\displaystyle  \begin{aligned}  p_i &= \frac{kp_0\exp r\Delta T i}{k + p_0(\exp r\Delta T i - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

Using the semi-group property of our dynamical system, we can re-write this as

\displaystyle  \begin{aligned}  p_i &= \frac{kp_{i-1}\exp r\Delta T}{k + p_{i-1}(\exp r\Delta T - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

To convince yourself that this re-formulation is correct, think of the population as starting at p_0; after 1 time step it has reached p_1 and and after two time steps it has reached p_2 and this ought to be the same as the point reached after 1 time step starting at p_1, for example

> oneStepFrom0, twoStepsFrom0, oneStepFrom1 :: Double
> oneStepFrom0  = logit 0.1 1.0 (1 * 0.1)
> twoStepsFrom0 = logit 0.1 1.0 (1 * 0.2)
> oneStepFrom1  = logit oneStepFrom0 1.0 (1 * 0.1)
ghci> twoStepsFrom0

ghci> oneStepFrom1

We would like to infer the growth rate not just be able to predict the population so we need to add another variable to our model.

\displaystyle  \begin{aligned}  r_i &= r_{i-1} \\  p_i &= \frac{kp_{i-1}\exp r_{i-1}\Delta T}{k + p_{i-1}(\exp r_{i-1}\Delta T - 1)} \\  y_i &= \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}r_i \\ p_i\end{bmatrix} + \begin{bmatrix}0 \\ \epsilon_i\end{bmatrix}  \end{aligned}

Extended Kalman

This is almost in the form suitable for estimation using a Kalman filter but the dependency of the state on the previous state is non-linear. We can modify the Kalman filter to create the extended Kalman filter (EKF) by making a linear approximation.

Since the measurement update is trivially linear (even in this more general form), the measurement update step remains unchanged.

\displaystyle  \begin{aligned}  \boldsymbol{v}_i & \triangleq  \boldsymbol{y}_i - \boldsymbol{g}(\hat{\boldsymbol{x}}^\flat_i) \\  \boldsymbol{S}_i & \triangleq  \boldsymbol{G}_i \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top + \boldsymbol{\Sigma}^{(y)}_i \\  \boldsymbol{K}_i & \triangleq \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top\boldsymbol{S}^{-1}_i \\  \hat{\boldsymbol{x}}^i &\triangleq \hat{\boldsymbol{x}}^\flat_i + \boldsymbol{K}_i\boldsymbol{v}_i \\  \hat{\boldsymbol{\Sigma}}_i &\triangleq \hat{\boldsymbol{\Sigma}}^\flat_i - \boldsymbol{K}_i\boldsymbol{S}_i\boldsymbol{K}^\top_i  \end{aligned}

By Taylor we have

\displaystyle  \boldsymbol{a}(\boldsymbol{x}) \approx \boldsymbol{a}(\boldsymbol{m}) + \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m})\delta\boldsymbol{x}

where \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m}) is the Jacobian of \boldsymbol{a} evaluated at \boldsymbol{m} (for the exposition of the extended filter we take \boldsymbol{a} to be vector valued hence the use of a bold font). We take \delta\boldsymbol{x} to be normally distributed with mean of 0 and ignore any difficulties there may be with using Taylor with stochastic variables.

Applying this at \boldsymbol{m} = \hat{\boldsymbol{x}}_{i-1} we have

\displaystyle  \boldsymbol{x}_i = \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1})(\boldsymbol{x}_{i-1} - \hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{\epsilon}_i

Using the same reasoning as we did as for Kalman filters and writing \boldsymbol{A}_{i-1} for \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1}) we obtain

\displaystyle  \begin{aligned}  \hat{\boldsymbol{x}}^\flat_i &=  \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) \\  \hat{\boldsymbol{\Sigma}}^\flat_i &= \boldsymbol{A}_{i-1}  \hat{\boldsymbol{\Sigma}}_{i-1}  \boldsymbol{A}_{i-1}^\top  + \boldsymbol{\Sigma}^{(x)}_{i-1}  \end{aligned}

Haskell Implementation

Note that we pass in the Jacobian of the update function as a function itself in the case of the extended Kalman filter rather than the matrix representing the linear function as we do in the case of the classical Kalman filter.

> k, p0 :: Floating a => a
> k = 1.0
> p0 = 0.1 * k
> r, deltaT :: Floating a => a
> r = 10.0
> deltaT = 0.0005

Relating ad and hmatrix is somewhat unpleasant but this can probably be ameliorated by defining a suitable datatype.

> a :: R 2 -> R 2
> a rpPrev = rNew # pNew
>   where
>     (r, pPrev) = headTail rpPrev
>     rNew :: R 1
>     rNew = konst r
>     (p,  _) = headTail pPrev
>     pNew :: R 1
>     pNew = fromList $ [logit p k (r * deltaT)]
> bigA :: R 2 -> Sq 2
> bigA rp = fromList $ concat $ j [r, p]
>   where
>     (r, ps) = headTail rp
>     (p,  _) = headTail ps
>     j = jacobian (\[r, p] -> [r, logit p k (r * deltaT)])

For some reason, hmatrix with static guarantees does not yet provide an inverse function for matrices.

> inv :: (KnownNat n, (1 <=? n) ~ 'True) => Sq n -> Sq n
> inv m = fromJust $ linSolve m eye

Here is the extended Kalman filter itself. The type signatures on the expressions inside the function are not necessary but did help the implementor discover a bug in the mathematical derivation and will hopefully help the reader.

> outer ::  forall m n . (KnownNat m, KnownNat n,
>                         (1 <=? n) ~ 'True, (1 <=? m) ~ 'True) =>
>           R n -> Sq n ->
>           L m n -> Sq m ->
>           (R n -> R n) -> (R n -> Sq n) -> Sq n ->
>           [R m] ->
>           [(R n, Sq n)]
> outer muPrior sigmaPrior bigH bigSigmaY
>       littleA bigABuilder bigSigmaX ys = result
>   where
>     result = scanl update (muPrior, sigmaPrior) ys
>     update :: (R n, Sq n) -> R m -> (R n, Sq n)
>     update (xHatFlat, bigSigmaHatFlat) y =
>       (xHatFlatNew, bigSigmaHatFlatNew)
>       where
>         v :: R m
>         v = y - (bigH #> xHatFlat)
>         bigS :: Sq m
>         bigS = bigH <> bigSigmaHatFlat <> (tr bigH) + bigSigmaY
>         bigK :: L n m
>         bigK = bigSigmaHatFlat <> (tr bigH) <> (inv bigS)
>         xHat :: R n
>         xHat = xHatFlat + bigK #> v
>         bigSigmaHat :: Sq n
>         bigSigmaHat = bigSigmaHatFlat - bigK <> bigS <> (tr bigK)
>         bigA :: Sq n
>         bigA = bigABuilder xHat
>         xHatFlatNew :: R n
>         xHatFlatNew = littleA xHat
>         bigSigmaHatFlatNew :: Sq n
>         bigSigmaHatFlatNew = bigA <> bigSigmaHat <> (tr bigA) + bigSigmaX

Now let us create some sample data.

> obsVariance :: Double
> obsVariance = 1e-2
> bigSigmaY :: Sq 1
> bigSigmaY = fromList [obsVariance]
> nObs :: Int
> nObs = 300
> singleSample :: Double -> RVarT (W.Writer [Double]) Double
> singleSample p0 = do
>   epsilon <- rvarT (Normal 0.0 obsVariance)
>   let p1 = logit p0 k (r * deltaT)
>   lift $ W.tell [p1 + epsilon]
>   return p1
> streamSample :: RVarT (W.Writer [Double]) Double
> streamSample = iterateM_ singleSample p0
> samples :: [Double]
> samples = take nObs $ snd $
>           W.runWriter (evalStateT (sample streamSample) (pureMT 3))

We created our data with a growth rate of

ghci> r

but let us pretend that we have read the literature on growth rates of bee colonies and we have some big doubts about growth rates but we are almost certain about the size of the colony at t=0.

> muPrior :: R 2
> muPrior = fromList [5.0, 0.1]
> sigmaPrior :: Sq 2
> sigmaPrior = fromList [ 1e2, 0.0
>                       , 0.0, 1e-10
>                       ]

We only observe the population and not the rate itself.

> bigH :: L 1 2
> bigH = fromList [0.0, 1.0]

Strictly speaking this should be 0 but this is close enough.

> bigSigmaX :: Sq 2
> bigSigmaX = fromList [ 1e-10, 0.0
>                      , 0.0, 1e-10
>                      ]

Now we can run our filter and watch it switch away from our prior belief as it accumulates more and more evidence.

> test :: [(R 2, Sq 2)]
> test = outer muPrior sigmaPrior bigH bigSigmaY
>        a bigA bigSigmaX (map (fromList . return) samples)

by Dominic Steinitz at September 09, 2014 08:28 AM

Yesod Web Framework

Misassigned credit for conduit

When I was at ICFP last week, it became clear that I had made a huge mistake in the past three years. A few of us were talking, including Erik de Castro Lopo, and when I mentioned that he was the original inspiration for creating the conduit package, everyone else was surprised. So firstly: Erik, I apologize for not making it clear that you initially kicked off development by finding some fun corner cases in enumerator that were difficult to debug.

So to rectify that, I think it's only fair that I write the following:

  • conduit is entirely Erik's fault.
  • If you love conduit, write Erik a thank you email.
  • More importantly, if you hate conduit, there's no need to complain to me anymore. Erik presumably will be quite happy to receive all such further communications.
  • In other words, it's not my company, I just work here.

Thanks Erik :)

UPDATE Please also read my follow-up blog post clarifying this one, just in case you're confused.

September 09, 2014 12:00 AM

September 08, 2014

The GHC Team

Jan Stolarek

Promoting functions to type families in Haskell

It’s been very quiet on the blog these past few months not because I’m spending less time on functional programming but precisely for the opposite reason. Since January I’ve been working together with Richard Eisenberg to extend his singletons library. This work was finished in June and last Friday I gave a talk about our research on Haskell Symposium 2014. This was the first time I’ve been to the ICFP and Haskell Symposium. It was pretty cool to finally meet all these people I know only from IRC. I also admit that the atmosphere of the conference quite surprised me as it often felt like some sort of fan convention rather than the biggest event in the field of functional programming.

The paper Richard and I published is titled “Promoting Functions to Type Families in Haskell”. This work is based on Richard’s earlier paper “Dependently typed programming with singletons” presented two years ago on Haskell Symposium. Back then Richard presented the singletons library that uses Template Haskell to generate singleton types and functions that operate on them. Singleton types are types that have only one value (aside from bottom) which allows to reason about runtime values during compilation (some introduction to singletons can be found in this post on Richard’s blog). This smart encoding allows to simulate some of the features of dependent types in Haskell. In our current work we extended promotion capabilities of the library. Promotion is only concerned with generating type-level definitions from term-level ones. Type-level language in GHC has become quite expressive during the last couple of years but it is still missing many features available in the term-level language. Richard and I have found ways to encode almost all of these missing features using the already existing type-level language features. What this means is that you can write normal term-level definition and then our library will automatically generate an equivalent type family. You’re only forbidden from using infinite terms, the do-notation, and decomposing String literals to Chars. Numeric literals are also very problematic and the support is very limited but some of the issues can be worked around. What is really cool is that our library allows you to have partial application at the type level, which GHC normally prohibits.

You can learn more by watching my talk on YouTube, reading the paper or the singletons documentation. Here I’d like to add a few more information that are not present in the paper. So first of all the paper was concerned only with promotion and didn’t say anything about singletonization. But as we enabled more and more language constructs to be promoted we also made them singletonizable. So almost everything that can be promoted can also be singletonized. The most notable exception to this rule are type classes, which are not yet implemented at the moment.

An interesting issue was raised by Adam Gundry in a question after the talk: what about difference between lazy term-level semantics and strict type-level semantics? You can listen to my answer in the video but I’ll elaborate some more on this here. At one point during our work we were wondering about this issue and decided to demonstrate an example of an algorithm that crucially relies on laziness to work, ie. fails to work with strict semantics. I think it’s not straightforward to come up with such an algorithm but luckily I recalled the backwards state monad from Philip Wadler’s paper “The essence of functional programming”1. Bind operator of that monad looks like this (definition copied from the paper):

m `bindS` k = \s2 -> let (a,s0) = m s1
                         (b,s1) = k a s2
                     in  (b,s0)

The tricky part here is that the output of call to m becomes input to call to k, while the output of call to k becomes the input of m. Implementing this in a strict language does not at all look straightforward. So I promoted that definition expecting it to fail spectacularly but to my surprised it worked perfectly fine. After some investigation I understood what’s going on. Type-level computations performed by GHC are about constraint solving. It turns out that GHC is able to figure out in which order to solve these constraints and get the result. It’s exactly analogous to what happens with the term-level version at runtime: we have an order of dependencies between the closures and there is a way in which we can run these closures to get the final result.

All of this work is a small part of a larger endeavour to push Haskell’s type system towards dependent types. With singletons you can write type-level functions easily by writing their definitions using the term-level language and then promoting these definitions. And then you can singletonize your functions to work on singleton types. There were two other talks about dependent types during the conference: Stephanie Weirich’s “Depending on Types” keynote lecture during ICPF and Richard’s “Dependent Haskell” talk during Haskell Implementators Workshop. I encourage everyone interested in Haskell’s type system to watch both of these talks.

  1. The awful truth is that this monad does not really work with the released version of singletons. I only realized that when I was writing this post. See issue #94 on singletons bug tracker.

by Jan Stolarek at September 08, 2014 11:02 AM

September 07, 2014

Neil Mitchell

Shake in the wild

Summary: I spotted a few things using Shake, which I had nothing to do with.

In the past few days I have come across several things using the Shake build system. I wasn't involved in any of them, and haven't (yet) tried any of them out, but they certainly look cool.


Tibor Bremer from Utrecht University gave a talk at the Haskell Implementors Workshop 2014 about his ToolCabal project. This project replaces the "build a package" part of Cabal with something more flexible, supporting multiple simultaneous targets and more flexible preprocessors - all built on top of Shake. It doesn't attempt to tackle dependency resolution yet. There is a video of the talk:

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="315" src="" width="420"></iframe>


The folks at Samplecount have written several Shake based things. None are yet on Hackage, so I suspect they are somewhat prototypes, but they look like they're already used quite seriously.

  • shake-cabal-build to make it easier to build your Shake build systems with Cabal. Shake build systems need to be compiled with GHC, for which I usually use ghc --make, but this project explains how to get things building with Cabal - important if your build system pulls in other libraries.
  • shake-language-c is a project to simplify building C/C++ projects with Shake. From the docs:

shake-language-c is a cross-platform build system based on the Shake Haskell library. The focus is on cross-compilation of C, C++ and Objective C source code to various target platforms. Currently supported target platforms are iOS, Android NDK, Google Portable Native Client, MacOS X, Linux and Windows (MinGW). Supported host platforms are MacOS X, Linux and Windows.

  • methcla is their mobile sound engine, which is built using this Shake script, which (unsurprisingly) uses shake-language-c and shake-cabal-build.

by Neil Mitchell ( at September 07, 2014 09:02 PM

Edward Z. Yang

Haskell Implementor’s Workshop ’14

This year at ICFP, we had some blockbuster attendance to the Haskell Implementor's Workshop (at times, it was standing room only). I had the pleasure of presenting the work I had done over the summer on Backpack.


You can grab the slides or view the presentation itself (thank you ICFP organizers for being incredibly on-the-ball with videos this year!) The talk intersects a little bit with my blog post A taste of Cabalized Backpack, but there are more pictures, and I also emphasize (perhaps a little too much) the long term direction we are headed in.


There were a lot of really nice talks at HiW. Here are some of my personal highlights:

by Edward Z. Yang at September 07, 2014 01:05 PM

Mike Izbicki

Getting started with GitHub, vim, and bash

Getting started with GitHub, vim, and bash

posted on 2014-09-07 by Rashid Goshtasbi and Kyler Rynear

Learning to use git, vim, and bash was hard for us. These tools are so different than the tools we used when we first learned to program. And they’re confusing! But our professor made us use them… and eventually… after we learned the tools… we discovered that we really like them! So we’ve put together a simple video guide to help you learn and enjoy these tools too. We did this as part of the CS100 open source software development class at UC Riverside.

Click here to watch the full playlist on YouTube.

Getting Started with GitHub

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

This video shows you step by step how to create an account on GitHub. Then we see how to create our first repository called test, and transfer it from GitHub onto our local machine using the git clone command.

Creating a file, pushing to GitHub, and pulling from GitHub

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

How do we create files and upload them to GitHub? The touch <filename> command will create an empty file for you. The vim <filename> command will open a file in an advanced text editor that we talk about farther down the page. The git push command sends these files from your local machine up to GitHub, and the git pull command downloads files from GitHub and saves them to your local computer.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Branches let you work on files without messing up your original code. When you finish your changes, you can merge them into the master branch. This is the best part of version control.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Most programs have different versions, for example: 1.0, 1.1, 1.2, 2.1 and 2.2.1. The git tag command let’s you create these versions. They’re just like a checkpoint in a Mario game!

Forking & Pull Requests

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Let’s say you want to contribute to an open source project, but you don’t have permission. In order to contribute to someone else’s repository, you must first “fork” it to create a repo that you do have push permission on. Then you issue a pull request through the GitHub website. This tells the owner of the original repo that you’ve made some changes they can incorporate.

The file

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe> files are how you document your projects. The should explain your program, give installation instructions, and list known bugs. Basically, it explains to someone else who has absolutely no idea what your program does or how to code, but it enables the user to understand the concepts and basic directions to execute your program. The .md extension at the end of the filename indicates that the file uses markdown formatting. This is a simple way to create nice looking documentation.

Learning vim

vim is an advanced text editor for Unix operating systems. It’s powerful, but all the commands are intimidating for first time users. Even though it’s hard to get used to at first, these videos will help you learn some of the basic commands and get comfortable with vim.

Getting Started

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

It was difficult at first trying to transverse my code while using vim. I was so used to being able to use my mouse and simply click where I wanted to go. There are many ways to maneuver inside of vim. Some may just use the h,j,k,l, up, down, left, right arrow keys, or the w, e, b keys to move. You can also press gg to go to the top of the code, G to go to the bottom of it, and (any number)G to go to the line number typed before the capital G.)

Cutting, copying, and pasting took a while to get used to when using vim. Sometimes there was something I wanted in my code that was in the instructions for the assignment. In order to paste I would use the p command, but I could not paste things from outside of vim into it. If I had something copied outside of vim, then to paste it into vim I would right click and just click paste. This would paste it wherever the cursor currently is. If you right click to copy, then it will not affect what is copied by using the commands y to copy or the commands d or x to cut. If those commands are used, the just clicking p will paste them. There are other ways to store more than one thing while copying or cutting, but these two ways were the most helpful as I learned how to use vim.

Another personal favorite features of vim, are the shift-a (takes you to the end of the line and into insert mode) and the shift-i (takes you to the beginning of the line and into insert mode) command. You can also press a to append after the cursor position, as well as i to insert before the current cursor position

vim also allows you to use the v or shift-v keys to highlight certain text or lines of code. You can then use other vim commands such as the copy, paste and delete keys to perform your needed actions.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

At first it felt very time consuming to indent multiple lines. I felt this way until I found about the V command. V lets users highlight a line and pressing up or down can highlight as many lines as they desire. All that was left to do was to type > after everything I wanted to indent was highlighted and it all would indented once to the right. Typing < would instead indent it to the left if I ever wanted to do that.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src="" width="697.6"></iframe>

There are two commands for deleting single character. x deletes the character that the cursor is on and moves the cursor to the right; and X deletes the character that the cursor is on and moves the cursor to the left.

The d command is a more powerful way to delete. d can be used with many different things after it. dd will delete the entire line. d$ will delete the rest of the current line. de will delete from where the cursor is up until the end of the word.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src="" width="697.6"></iframe>

Lower case r can replace one letter while upper case R can replace one letter with many.

There are three c commands that I regularly use for replacement: ce , which deletes up until the end of the word that the cursor is currently on, then allows you to insert immediately; c$ , which deletes from where the cursor is up until the end of the line, then allows you to insert immediately; and cc , which deletes the whole line that the cursor is on and allows you to insert immediately at the beginning of the line.

Customizing your vim editor with the .vimrc file

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Ever wondered how’ve we get our vim editor to work in the way we have it versus the default editor? vim has a file where you can setup it’s defaults such as auto parentheses, auto-indent, and much more. By watching our video above, you can easily create new defaults for your vim editor that can cut time spent formating your text to spend more on coding.

Learning The Terminal

One of the best features of Unix operating systems is the powerful terminal they provide.

The ls command

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

The ls command is one of the most used terminal commands.

The basic ls command, when run, displays the contents within the current working directory. Passing in a directory name as an argument will display the contents of that directory. It is also possible to pass in a path for a directory to display any directory, regardless of the directory the user is currently in.

If the -a flag is passed in with ls, all items in the current working directory prepended with a . are also displayed, along with the rest of the items.

Passing in the -l flag prints information for each item in the directory in a series of columns on a single line. The first column displays the read, write, and executable permissions for the main user, the group the current user is in, and any user in that order. The next column shows the owner of the item and the next column shows the group owner. The fourth column displays the size, in bytes, of the item. The fifth column displays the moment the item was created, and the last column displays the name of the item.

If the -R flag is passed in, the command will display the contents of the current directory, and then recursively enter every directory within the current directory and display the contents of that directory, then keep going into every directory until there are no more directories in the current directory it is in.

All these options are combinable for different uses. For example, I could use the -l and -a flags to display the information for the items prepended with a . , or use -R and -l together.

The cd and mv commands

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

The cd and mv commands are crucial commands in order to actually use the terminal. Without cd, I would forever be stuck in their home directory. The mv command is necessary for moving files from one section of the hard drive. The cd command by itself will change the current working directory to the home directory. If passed a directory name that is within the current working directory, the current working directory will be changed to the name of the passed in directory. cd will also take a path as an argument. When a path is passed in, the current working directory will be changed to the directory specified by the path. When cd is passed with .., the directory will go backwards, the directory that the current directory is in.

The mv command will move an item within a certain directory to the directory passed in.

If the destination argument is not a path, the command will look for the destination in the current working directory. The destination argument can be a path, so I can move the item to any directory in the hard drive.

Recording terminal sessions via scripts

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

With the script command you can record the commands you run in your terminal into a file. By just typing script file_name_here, you can start a script. Also, you don’t need to worry about making a file beforehand, because when you specify the filename, it will make once for you in that name. Then when you’re done, type exit and your terminal will say your script session has ended and re-state the filename in which it recorded all your commands in.

How To SSH (into well server)

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

Computer Science students have the ability to log into the school’s server using the ssh command. The way to do access the terminal is to type into the command terminal the following text:


If it is your first time entering the terminal, you will be asked to trust the encryption that the server uses, then prompted to enter the password associated with your NetID. Once doing all those steps, you will be brought to your home directory on the server. To exit the server, type exit into the command prompt and press enter.

A useful command that moves files to and from the remote server onto your home computer is the scp command. To put items from your home computer to the school’s server, type into the command prompt:

scp filename/absolute_path

To move items from the remote server onto your home computer, type into the command prompt:

scp absolute_path

Spectacle App: Using the terminal and vim in one screen

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

One of the first things I noticed about vim that I initially disliked was that it took over the terminal when I used it. Users with Windows 7 & above automatically have this ability by dragging your screen to the left or right border of your screen. Unfortunately, OS X users don’t have this built in ability. To get around this, OS X users can install the Spectacle App which will enable you to organize multiple windows on your screen with a touch of a buttom. To get around this issue, I started using two terminals instead of just one while I was programming. I would run vim using the first terminal and would run the executable in the second. It was as simple as using :w to save on vim instead of using :wq to save and quit. I could now test my code without ever having to close vim.


<iframe allowfullscreen="allowfullscreen" frameborder="0" height="419.65" src=";vq=hd1080" width="697.6"></iframe>

When programming for unix based operating systems (which is a primary component of CS100), system calls are a prominent component for code. The perror function captures the error value (if returned) from the system call and prints to stdout an error message based on the system call and the type of error. It takes in one c-string argument, which is a message the user can pass in.

September 07, 2014 12:00 AM

September 06, 2014

Joachim Breitner

ICFP 2014

Another on-my-the-journey-back blog post; this time from the Frankfurt Airport Train Station – my flight was delayed (if I knew that I could have watched the remaining Lightning Talks), and so was my train, but despite 5min of running through the Airport just not enough. And now that the free 30 Minutes of Railway Station Internet are used up, I have nothing else to do but blog...

Last week I was attending ICFP 2014 in Gothenburg, followed by the Haskell Symposium and the Haskell Implementors Workshop. The justification to attend was the paper on Safe Coercions (joint work with Richard Eisenberg, Simon Peyton Jones and Stephanie Weirich), although Richard got to hold the talk, and did so quite well. So I got to leisurely attend the talks, while fighting the jet-lag that I brought from Portland.

There were – as expected – quite a few interesting talks. Among them the first keynote, Kathleen Fisher on the need for formal methods in cars and toy-quadcopters and unmanned battle helicopters, which made me conclude that my Isabelle skills might eventually become relevant in practical applications. And did you know that if someone gains access to your car’s electronics, they can make the seat belt pull you back hard?

Stefanie Weirich’s keynote (and the subsequent related talks by Jan Stolarek and Richard Eisenberg) on what a dependently typed Haskell would look like and what we could use it for was mouth-watering. I am a bit worried that Haskell will be become a bit obscure for newcomers and people that simply don’t want to think about types too much, on the other hand it seems that Haskell as we know it will always stay there, just as a subset of the language.

Similarly interesting were refinement types for Haskell (talks by Niki Vazou and by Eric Seidel), in the form of LiquidTypes, something that I have not paid attention to yet. It seems to be a good way for more high assurance in Haskell code.

Finally, the Haskell Implementors Workshop had a truckload of exciting developments in and around Haskell: More on GHCJS, Partial type signatures, interactive type-driven development like we know it from Agda, the new Haskell module system and amazing user-defined error messages – the latter unfortunately only in Helium, at least for now.

But it’s not the case that I only sat and listened. During the Haskell Implementors Workshop I held a talk “Contributing to GHC” with a live demo of me fixing a (tiny) bug in GHC, with the aim of getting more people to hack on GHC (slides, video). The main message here is that it is not that big of deal. And despite me not actually saying much interesting in the talk, I got good feedback afterwards. So if it now actually motivates someone to contribute to GHC, I’m even more happier.

And then there is of course the Hallway Track. I discussed the issues with fusing a left fold (unfortunately, without a great solution). In order to tackle this problem more systematically, John Wiegley and I created the beginning of a “List Fusion Lab”, i.e. a bunch of list benchmark and the possibility to compare various implementations (e.g. with different RULES) and various compilers. With that we can hopefully better assess the effect of a change to the list functions.

PS: The next train is now also delayed, so I’ll likely miss my tram and arrive home even later...

PPS: I really have to update my 10 year old picture on my homepage (or redesign it completely). Quite a few people knew my name, but expected someone with shoulder-long hair...

PPPS: Haskell is really becoming mainstream: I just talked to a randomly chosen person (the boy sitting next to me in the train), and he is a Haskell enthusiast, building a structured editor for Haskell together with his brother. And all that as a 12th-grader...

by Joachim Breitner ( at September 06, 2014 10:46 PM

Tim Docker

A New Charting API

One of the challenges with building a library like Chart is the tension between ease of use and flexibility. Users want to produce charts with a minimum of code up front, but later want to refine the details. The chart library addresses this through the use of "defaulted records" using Data.Default.Class. Because such records are often nested, we rely on the somewhat intimidating lens library to modify the default values. We end up with code to create chart elements like this:

sinusoid2 = plot_points_title .~ "fn(x)"
          $ plot_points_values .~ mydata
          $ plot_points_style . point_color .~ opaque red
          $ def

This is much simpler and cleaner that the corresponding code using native record accessors, but it still has a certain amount of syntactic overhead.

I’ve added a simple state monad to the library to further clean up the syntax. The state of the monad is the value being constructed, allowing the use of the monadic lens operators. The above code sample becomes:

sinusoid2 = execEC $ do
    plot_points_title .= "fn(x)" 
    plot_points_values .= mydata
    plot_points_style . point_color .= opaque red

This may seem only a minor syntactic improvement, but it adds up over an typical chart definition.

A few other changes further reduce the clutter in charting code:

  • A new Easy module that includes helper functions and key dependencies
  • Simpler "toFile" functions in the rendering backends
  • Automatic sequencing of colours for successive plots

All this means that a simple plot can now be a one liner:

import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo

mydata :: [Double,Double]
mydata = ...

main = toFile def "test.png" $ plot $ points "lines" mydata

But this extends naturally to more complex charts. The code differences between the new stateful API versus the existing API can been seen in this example.

The stateful API is available in chart v1.3 It is a thin layer over the existing API – both will be continue to be available in the future.

by Tim Docker at September 06, 2014 05:34 AM

September 04, 2014

Edward Z. Yang

Open type families are not modular

One of the major open problems for building a module system in Haskell is the treatment of type classes, which I have discussed previously on this blog. I've noted how the current mode of use in type classes in Haskell assume “global uniqueness”, which is inherently anti-modular; breaking this assumption risks violating the encapsulation of many existing data types.

As if we have a choice.

In fact, our hand is forced by the presence of open type families in Haskell, which are feature many similar properties to type classes, but with the added property that global uniqueness is required for type safety. We don't have a choice (unless we want type classes with associated types to behave differently from type classes): we have to figure out how to reconcile the inherent non-modularity of type families with the Backpack module system.

In this blog post, I want to carefully lay out why open type families are inherently unmodular and propose some solutions for managing this unmodularity. If you know what the problem is, you can skip the first two sections and go straight to the proposed solutions section.

Before we talk about open type family instances, it's first worth emphasizing the (intuitive) fact that a signature of a module is supposed to be able to hide information about its implementation. Here's a simple example:

module A where
    x :: Int

module B where
    import A
    y = 0
    z = x + y

Here, A is a signature, while B is a module which imports the signature. One of the points of a module system is that we should be able to type check B with respect to A, without knowing anything about what module we actually use as the implementation. Furthermore, if this type checking succeeds, then for any implementation which provides the interface of A, the combined program should also type check. This should hold even if the implementation of A defines other identifiers not mentioned in the signature:

module A where
    x = 1
    y = 2

If B had directly imported this implementation, the identifier y would be ambiguous; but the signature filtered out the declarations so that B only sees the identifiers in the signature.

With this in mind, let's now consider the analogous situation with open type families. Assuming that we have some type family F defined in the prelude, we have the same example:

module A where
    type instance F Int
    f :: F Bool

module B where
    import A
    type instance F Bool = Int -> Bool
    x = f 2

Now, should the following module A be a permissible implementation of the signature?

module A where
    type instance F Int = Int
    type instance F Bool = Int
    f = 42

If we view this example with the glasses off, we might conclude that it is a permissible implementation. After all, the implementation of A provides an extra type instance, yes, but when this happened previously with a (value-level) declaration, it was hidden by the signature.

But if put our glasses on and look at the example as a whole, something bad has happened: we're attempting to use the integer 42 as a function from integers to booleans. The trouble is that F Bool has been given different types in the module A and module B, and this is unsound... like, segfault unsound. And if we think about it some more, this should not be surprising: we already knew it was unsound to have overlapping type families (and eagerly check for this), and signature-style hiding is an easy way to allow overlap to sneak in.

The distressing conclusion: open type families are not modular.

So, what does this mean? Should we throw our hands up and give up giving Haskell a new module system? Obviously, we’re not going to go without a fight. Here are some ways to counter the problem.

The basic proposal: require all instances in the signature

The simplest and most straightforward way to solve the unsoundness is to require that a signature mention all of the family instances that are transitively exported by the module. So, in our previous example, the implementation of A does not satisfy the signature because it has an instance which is not mentioned in the signature, but would satisfy this signature:

module A where
    type instance F Int
    type instance F Bool

While at first glance this might not seem too onerous, it's important to note that this requirement is transitive. If A happens to import another module Internal, which itself has its own type family instances, those must be represented in the signature as well. (It's easy to imagine this spinning out of control for type classes, where any of the forty imports at the top of your file may be bringing in any manner of type classes into scope.) There are two major user-visible consequences:

  1. Module imports are not an implementation detail—you need to replicate this structure in the signature file, and
  2. Adding instances is always a backwards-incompatible change (there is no weakening).

Of course, as Richard pointed out to me, this is already the case for Haskell programs (and you just hoped that adding that one extra instance was "OK").

Despite its unfriendliness, this proposal serves as the basis for the rest of the proposals, which you can conceptualize as trying to characterize, “When can I avoid having to write all of the instances in my signature?”

Extension 1: The orphan restriction

Suppose that I write the following two modules:

module A where
    data T = T
    type instance F T = Bool

module B where
    import A
    type instance F T = Int -> Int

While it is true that these two type instances are overlapping and rightly rejected, they are not equally at fault: in particular, the instance in module B is an orphan. An orphan instance is an instance for type class/family F and data type T (it just needs to occur anywhere on the left-hand side) which lives in a module that defines neither. (A is not an orphan since the instance lives in the same module as the definition of data type T).

What we might wonder is, “If we disallowed all orphan instances, could this rule out the possibility of overlap?” The answer is, “Yes! (...with some technicalities).” Here are the rules:

  1. The signature must mention all what we will call ragamuffin instances transitively exported by implementations being considered. An instance of a family F is a ragamuffin if it is not defined with the family definition, or with the type constructor at the head in the first parameter. (Or some specific parameter, decided on a per-family basis.) All orphan instances are ragamuffins, but not all ragamuffins are orphans.
  2. A signature exporting a type family must mention all instances which are defined in the same module as the definition of the type family.
  3. It is strictly optional to mention non-ragamuffin instances in a signature.

(Aside: I don't think this is the most flexible version of the rule that is safe, but I do believe it is the most straightforward.) The whole point of these rules is to make it impossible to write an overlapping instance, while only requiring local checking when an instance is being written. Why did we need to strengthen the orphan condition into a ragamuffin condition to get this non-overlap? The answer is that absence of orphans does not imply absence of overlap, as this simple example shows:

module A where
    data A = A
    type instance F A y = Int

module B where
    data B = B
    type instance F x B = Bool -> Bool

Here, the two instances of F are overlapping, but neither are orphans (since their left-hand sides mention a data type which was defined in the module.) However, the B instance is a ragamuffin instance, because B is not mentioned in the first argument of F. (Of course, it doesn't really matter if you check the first argument or the second argument, as long as you're consistent.)

Another way to think about this rule is that open type family instances are not standalone instances but rather metadata that is associated with a type constructor when it is constructed. In this way, non-ragamuffin type family instances are modular!

A major downside of this technique, however, is that it doesn't really do anything for the legitimate uses of orphan instances in the Haskell ecosystem: when third-parties defined both the type family (or type class) and the data type, and you need the instance for your own purposes.

Extension 2: Orphan resolution

This proposal is based off of one that Edward Kmett has been floating around, but which I've refined. The motivation is to give a better story for offering the functionality of orphan instances without gunking up the module system. The gist of the proposal is to allow the package manager to selectively enable/disable orphan definitions; however, to properly explain it, I'd like to do first is describe a few situations involving orphan type class instances. (The examples use type classes rather than type families because the use-cases are more clear. If you imagine that the type classes in question have associated types, then the situation is the same as that for open type families.)

The story begins with a third-party library which defined a data type T but did not provide an instance that you needed:

module Data.Foo where
    data Foo = Foo

module MyApp where
    import Data.Foo
    fooString = show Foo -- XXX no instance for Show

If you really need the instance, you might be tempted to just go ahead and define it:

module MyApp where
    import Data.Foo
    instance Show Foo where -- orphan
        show Foo = "Foo"
    fooString = show Foo

Later, you upgrade Data.Foo to version 1.0.0, which does define a Show instance, and now your overlapping instance error! Uh oh.

How do we get ourselves out of the mess? A clue is how many package authors currently “get out of jail” by using preprocessor macros:

module MyApp where
    import Data.Foo
#if MIN_VERSION_foo(1,0,0)
    instance Show Foo where -- orphan
        show Foo = "Foo"
    fooString = show Foo

Morally, we'd like to hide the orphan instance when the real instance is available: there are two variations of MyApp which we want to transparently switch between: one which defines the orphan instance, and one which does not and uses the non-orphan instance defined in the Data.Foo. The choice depends on which foo was chosen, a decision made by the package manager.

Let's mix things up a little. There is no reason the instance has to be a non-orphan coming from Data.Foo. Another library might have defined its own orphan instance:

module MyOtherApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    otherFooString = show Foo

module MyApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    fooString = show Foo

module Main where
    import MyOtherApp
    import MyApp
    main = print (fooString ++ otherFooString ++ show Foo)

It's a bit awful to get this to work with preprocessor macros, but there are two ways we can manually resolve the overlap: we can erase the orphan instance from MyOtherApp, or we can erase the orphan instance from MyApp. A priori, there is no reason to prefer one or the other. However, depending on which one is erased, Main may have to be compiled differently (if the code in the instances is different). Furthermore, we need to setup a new (instance-only) import between the module who defines the instance to the module whose instance was erased.

There are a few takeaways from these examples. First, the most natural way of resolving overlapping orphan instances is to simply “delete” the overlapping instances; however, which instance to delete is a global decision. Second, which overlapping orphan instances are enabled affects compilation: you may need to add module dependencies to be able to compile your modules. Thus, we might imagine that a solution allows us to do both of these, without modifying source code.

Here is the game plan: as before, packages can define orphan instances. However, the list of orphan instances a package defines is part of the metadata of the package, and the instance itself may or may not be used when we actually compile the package (or its dependencies). When we do dependency resolution on a set of packages, we have to consider the set of orphan instances being provided and only enable a set which is non-overlapping, the so called orphan resolution. Furthermore, we need to add an extra dependency from packages whose instances were disabled to the package who is the sole definer of an instance (this might constrain which orphan instance we can actually pick as the canonical instance).

The nice thing about this proposal is that it solves an already existing pain point for type class users, namely defining an orphan type class instance without breaking when upstream adds a proper instance. But you might also think of it as a big hack, and it requires cooperation from the package manager (or some other tool which manages the orphan resolution).

The extensions to the basic proposal are not mutually exclusive, but it's an open question whether or not the complexity they incur are worth the benefits they bring to existing uses of orphan instances. And of course, there may other ways of solving the problem which I have not described here, but this smorgasbord seems to be the most plausible at the moment.

At ICFP, I had an interesting conversation with Derek Dreyer, where he mentioned that when open type families were originally going into GHC, he had warned Simon that they were not going to be modular. With the recent addition of closed type families, many of the major use-cases for open type families stated in the original paper have been superseded. However, even if open type families had never been added to Haskell, we still might have needed to adopt these solutions: the global uniqueness of instances is deeply ingrained in the Haskell community, and even if in some cases we are lax about enforcing this constraint, it doesn't mean we should actively encourage people to break it.

I have a parting remark for the ML community, as type classes make their way in from Haskell: when you do get type classes in your language, don’t make the same mistake as the Haskell community and start using them to enforce invariants in APIs. This way leads to the global uniqueness of instances, and the loss of modularity may be too steep a price to pay.

Postscript. One natural thing to wonder, is if overlapping type family instances are OK if one of the instances “is not externally visible.” Of course, the devil is in the details; what do we mean by external visibility of type family instances of F?

For some definitions of visibility, we can find an equivalent, local transformation which has the same effect. For example, if we never use the instance at all, it certainly OK to have overlap. In that case, it would also have been fine to delete the instance altogether. As another example, we could require that there are no (transitive) mentions of the type family F in the signature of the module. However, eliminating the mention of the type family requires knowing enough parameters and equations to reduce: in which case the type family could have been replaced with a local, closed type family.

One definition that definitely does not work is if F can be mentioned with some unspecified type variables. Here is a function which coerces an Int into a function:

module A where
  type instance F Int = Int
  f :: Typeable a => a -> F a
  f x = case eqT of
    Just Refl -> x :: Int
    Nothing -> undefined

module ASig where
  f :: Typeable a => a -> F a

module B where
  import ASig
  type instance F Int = Bool -> Bool
  g :: Bool
  g = f 0 True -- oops

...the point being that, even if a signature doesn't directly mention the overlapping instance F Int, type refinement (usually by some GADT-like structure) can mean that an offending instance can be used internally.

by Edward Z. Yang at September 04, 2014 10:12 PM

Roman Cheplyaka

Dependent Haskell

Emulating dependent types (and, more generally, advanced type-level programming) has been a hot topic in the Haskell community recently. Some incredible work has been done in this direction: GADTs, open and closed type families, singletons, etc. The plan is to copy even more features to the type level, like type classes and GADTs, and simplify the promotion of value-level functions.

On the other hand, there’s a good deal of scepticism around this idea. «If you want to program like in Agda, why don’t you program in Agda?»

First, libraries. It took us many years to arrive at the state of hackage that is suitable for industrial usage — and we still don’t have satisfactory answers to many problems. My guess is that it will take at least as long as that for the dependently typed community to arrive at this point — not only because of the smaller community, but also because they will look for even more type-safe solutions, which is naturally a harder problem.

Making your code extremely type-safe is quite hard (or else it would not be worth an ICFP talk). In a real-world application, you’d probably have just a few places where it’s worth the effort. But if you write the whole application in a dependently-typed language, you pay the price for your whole codebase. The price includes, for instance, the absence of type inference.

The compilation problem is not solved either. In particular, there are concerns about type erasure in both Agda and Idris. This is not unsolvable, just hasn’t been done yet.

So maybe you could write some parts in Agda/Idris and the rest in Haskell? Neither Agda nor Idris has a good story for that. For instance, Agda can generate Haskell code, but interfacing with Haskell won’t be very smooth. And the differences in languages mean that it probably won’t ever be effortless.

Don’t get me wrong, I am very enthusiastic about these (and future) languages. However, it doesn’t seem like they will be ready for production usage anytime soon. At the same time, you can satisfy the majority of your dependently typed needs in Haskell right now, no hidden charges. Isn’t that cool?


I am no expert in Agda/Idris. The above is my impression from talking to different people at ICFP and participating in tutorials on these two languages given by their corresponding authors. I’ll gladly accept rebuttals.


There’s a discussion happening on /r/haskell, where people dispute many of my points. I encourage you to read that discussion and make your own conclusions.

In particular, it has been pointed out that erasure is not as big of a problem as I thought initially. Still, it’ll take quite some time (and non-trivial systems built with dependent types) for us to be able to trust Idris/Agda compilers as much as we now trust GHC.

September 04, 2014 09:00 PM

Ian Ross

Non-diffusive atmospheric flow #5: pre-processing

Non-diffusive atmospheric flow #5: pre-processing

Note: there are a couple of earlier articles that I didn’t tag as “haskell” so they didn’t appear in Planet Haskell. They don’t contain any Haskell code, but they cover some background material that’s useful to know (#3 talks about reanalysis data and what Z500 is, and #4 displays some of the characteristics of the data we’re going to be using). If you find terms here that are unfamiliar, they might be explained in one of these earlier articles.

The code for this post is available in a Gist.

Before we can get into the “main analysis”, we need to do some pre-processing of the Z500 data. In particular, we are interested in large-scale spatial structures, so we want to subsample the data spatially. We are also going to look only at the Northern Hemisphere winter, so we need to extract temporal subsets for each winter season. (The reason for this is that winter is the season where we see the most interesting changes between persistent flow regimes. And we look at the Northern Hemisphere because it’s where more people live, so it’s more familiar to more people.) Finally, we want to look at variability about the seasonal cycle, so we are going to calculate “anomalies” around the seasonal cycle.

We’ll do the spatial and temporal subsetting as one pre-processing step and then do the anomaly calculation seperately, just for simplicity.

Spatial and temporal subsetting

The title of the paper we’re trying to follow is “Observed Nondiffusive Dynamics in Large-Scale Atmospheric Flow”, so we need to decide what we mean by “large-scale” and to subset our data accordingly. The Z500 data from the NCEP reanalysis dataset is at 2.5° × 2.5° resolution, which turns out to be a little finer than we need, so we’re going to extract data on a 5° × 5° grid instead. We’ll also extract only the Northern Hemisphere data, since that’s what we’re going to work with.

For the temporal subsetting, we need to take 181 days of data for each year starting on 1 November each year. Since the data starts at the beginning of 1948 and goes on to August 2014 (which is when I’m writing this), we’ll have 66 years of data, from November 1948 until April 2014. As usual when handling dates, there’s some messing around because of leap years, but here it basically just comes down to which day of the year 1 November is in a given year, so it’s not complicated.

The daily NCEP reanalysis geopotential height data comes in one file per year, with all the pressure levels used in the dataset bundled up in each file. That means that the geopotential height variable in each file has coordinates: time, level, latitude, longitude, so we need to slice out the 500 mb level as we do the other subsetting.

All this is starting to sound kind of complicated, and this brings us to a regrettable point about dealing with this kind of data – it’s messy and there’s a lot of boilerplate code to read and manipulate coordinate metadata. This is true pretty much whatever language you use for processing these multi-dimensional datasets and it’s kind of unavoidable. The trick is to try to restrict this inconvenient stuff to the pre-processing phase by using a consistent organisation of your data for the later analyses. We’re going to do that here by storing all of our Z500 anomaly data in a single NetCDF file, with 181-day long winter seasons back-to-back for each year. This will make time and date processing trivial.

The code for the data subsetting is in the subset.hs program in the Gist. We’ll deal with it in a few bites.

Skipping the imports that we need, as well as a few “helper” type synonym definitions, the first thing that we need to do it open one of the input NCEP NetCDF files to extract the coordinate metadata information. This listing shows how we do this:

  Right refnc <- openFile $ indir </> ""
  let Just nlat = ncDimLength <$> ncDim refnc "lat"
      Just nlon = ncDimLength <$> ncDim refnc "lon"
      Just nlev = ncDimLength <$> ncDim refnc "level"
  let (Just lonvar) = ncVar refnc "lon"
      (Just latvar) = ncVar refnc "lat"
      (Just levvar) = ncVar refnc "level"
      (Just timevar) = ncVar refnc "time"
      (Just zvar) = ncVar refnc "hgt"
  Right lon <- get refnc lonvar :: SVRet CFloat
  Right lat <- get refnc latvar :: SVRet CFloat
  Right lev <- get refnc levvar :: SVRet CFloat

We open the first of the NetCDF files (I’ve called the directory where I’ve stored these things indir) and use the hnetcdf ncDim and ncVar functions to get the dimension and variable metadata for the latitude, longitude, level and time dimensions; we then read the complete contents of the “coordinate variables” (for level, latitude and longitude) as Haskell values (here, SVRet is a type synonym for a storable vector wrapped up in the way that’s returned from the hnetcdf get functions).

Once we have the coordinate variable values, we need to find the index ranges to use for subsetting. For the spatial subsetting, we find the start and end ranges for the latitudes that we want (17.5°N-87.5°N) and for the level, we find the index of the 500 mb level:

  let late = vectorIndex LT FromEnd lat 17.5
      lats = vectorIndex GT FromStart lat 87.5
      levi = vectorIndex GT FromStart lev 500.0

using a helper function to find the correspondence between coordinate values and indexes:

data IndexStart = FromStart | FromEnd

vectorIndex :: (SV.Storable a, Ord a)
            => Ordering -> IndexStart -> SV.Vector a -> a -> Int
vectorIndex o s v val = case (go o, s) of
  (Nothing, _) -> (-1)
  (Just i, FromStart) -> i
  (Just i, FromEnd) -> SV.length v - 1 - i
  where go LT = SV.findIndex (>= val) vord
        go GT = SV.findIndex (<= val) vord
        vord = case s of
          FromStart -> v
          FromEnd -> SV.reverse v

For the temporal subsetting, we just work out what day of the year 1 November is for leap and non-leap years – since November and December together are 61 days, for each winter season we need those months plus the first 120 days of the following year:

  let inov1non = 305 - 1
      -- ^ Index of 1 November in non-leap years.
      wintertsnon = [0..119] ++ [inov1non..365-1]
      -- ^ Indexes of all winter days for non-leap years.
      inov1leap = 305 + 1 - 1
      -- ^ Index of 1 November in leap years.
      wintertsleap = [0..119] ++ [inov1leap..366-1]
      -- ^ Indexes of all winter days for leap years.
      winterts1948 = [inov1leap..366-1]
      winterts2014 = [0..119]
      -- ^ Indexes for winters in start and end years.

This is kind of hokey, and in some cases you do need to do more sophisticated date processing, but this does the job here.

Once we have all this stuff set up, the critical part of the subsetting is easy – for each input data file, we figure out what range of days we need to read, then use a single call the getS from hnetcdf (“get slice”):

        forM_ winterts $ \it -> do
          -- Read a slice of a single time-step of data: Northern
          -- Hemisphere (17.5-87.5 degrees), 5 degree resolution, 500
          -- mb level only.
          let start = [it, levi, lats, 0]
              count = [1, 1, (late - lats) `div` 2 + 1, nlon `div` 2]
              stride = [1, 1, 2, 2]
          Right slice <- getS nc zvar start count stride :: RepaRet2 CShort

Here, we have a set of start indexes, a set of counts and a set of strides, one for each dimension in our input variable. Since the input geopotential height files have dimensions of time, level, latitude and longitude, we have four entries in each of our start, count and stride lists. The start values are the current day from the list of days we need (called it in the code), the level we’re interested in (levi), the start latitude index (lats) and zero, since we’re going to get the whole range of longitude. The count list gets a single time step, a single level, and a number of latitude and longitude values based on taking every other entry in each direction (since we’re subsetting from a spatial resolution of 2.5° × 2.5° to a resolution of 5° × 5°). Finally, for the stride list, we use a stride of one for the time and level directions (which doesn’t really matter anyway, since we’re only reading a single entry in each of those directions) and a stride of two in the latitude and longitude directions (which gives us the “every other one” subsetting in those directions).

All of the other code in the subsetting program is involved in setting up the output file and writing the Z500 slices out. Setting up the output file is slightly tedious (this is very common when dealing with NetCDF files – there’s always lots of metadata to be managed), but it’s made a little simpler by copying attributes from one of the input files, something that doing this in Haskell makes quite a bit easier than in Fortran or C. The next listing shows how the NcInfo for the output file is created, which can then be passed to the hnetcdf withCreateFile function to actually create the output file:

  let outlat = SV.fromList $ map (lat SV.!) [lats,lats+2..late]
      outlon = SV.fromList $ map (lon SV.!) [0,2..nlon-1]
      noutlat = SV.length outlat
      noutlon = SV.length outlon
      outlatdim = NcDim "lat" noutlat False
      outlatvar = NcVar "lat" NcFloat [outlatdim] (ncVarAttrs latvar)
      outlondim = NcDim "lon" noutlon False
      outlonvar = NcVar "lon" NcFloat [outlondim] (ncVarAttrs lonvar)
      outtimedim = NcDim "time" 0 True
      outtimeattrs = foldl' (flip M.delete) (ncVarAttrs timevar)
      outtimevar = NcVar "time" NcDouble [outtimedim] outtimeattrs
      outz500attrs = foldl' (flip M.delete) (ncVarAttrs zvar)
                     ["actual_range", "level_desc", "valid_range"]
      outz500var = NcVar "z500" NcShort
                   [outtimedim, outlatdim, outlondim] outz500attrs
      outncinfo =
        emptyNcInfo (outdir </> "") #
        addNcDim outlatdim # addNcDim outlondim # addNcDim outtimedim #
        addNcVar outlatvar # addNcVar outlonvar # addNcVar outtimevar #
        addNcVar outz500var

Although we can mostly just copy the coordinate variable attributes from one of the input files, we do need to do a little bit of editing of the attributes to remove some things that aren’t appropriate for the output file. Some of these things are just conventions, but there are some tools that may look at these attributes (actual_range, for example) and may complain if the data doesn’t match the attribute. It’s easier just to remove the suspect ones here.

This isn’t pretty Haskell by any means, and the hnetcdf library could definitely do with having some higher-level capabilities to help with this kind of file processing. I may add some based on the experimentation I’m doing here – I’m developing hnetcdf in parallel with writing this!

Anyway, the result of running the subsetting program is a single NetCDF file containing 11946 days (66 × 181) of Z500 data at a spatial resolution of 5° × 5°. We can then pass this on to the next step of our processing.

Seasonal cycle removal

In almost all investigations in climate science, the annual seasonal cycle stands out as the largest form of variability (not always true in the tropics, but in the mid-latitudes and polar regions that we’re looking at here, it’s more or less always true). The problem, of course, is that the seasonal cycle just isn’t all that interesting. We learn about the difference between summer and winter when we’re children, and although there is much more to say about seasonal variations and how they interact with other phenomena in the climate system, much of the time they just get in the way of seeing what’s going on with those other phenomena.

So what do we do? We “get rid” of the seasonal cycle by looking at what climate scientists call “anomalies”, which are basically just differences between the values of whatever variable we’re looking at and values from a “typical” year. For example, if we’re interested in daily temperatures in Innsbruck over the course of the twentieth century, we construct a “typical” year of daily temperatures, then for each day of our 20th century time series, we subtract the “typical” temperature value for that day of the year from the measured temperature value to give a daily anomaly. Then we do whatever analysis we want based on those anomalies, perhaps looking for inter-annual variability on longer time scales, or whatever.

This approach is very common, and it’s what we’re going to do for our Northern Hemisphere winter-time Z500 data here. To do this, we need to do two things: we need to calculate a “typical” annual cycle, and we need to subtract that typical annual cycle from each year of our data.

OK, so what’s a “typical” annual cycle? First, let’s say a word about what we mean by “annual cycle” in this case. We’re going to treat each spatial point in our data independently, trusting to the natural spatial correlation in the geopotential height to smooth out any shorter-term spatial inhomogeneities in the typical patterns. We then do some sort of averaging in time to generate a “typical” annual cycle at each spatial point. It’s quite common to use the mean annual cycle over a fixed period for this purpose (a period of 30 years is common: 1960-1990, say). In our case, we’re going to use the mean annual cycle across all 66 years of data that we have. Here’s how we calculate this mean annual cycle (this is from the seasonal-cycle.hs program in the Gist):

  -- Each year has 181 days, so 72 x 15 x 181 = 195480 data values.
  let ndays = 181
      nyears = ntime `div` ndays

  -- Use an Int array to accumulate values for calculating mean annual
  -- cycle.  Since the data is stored as short integer values, this is
  -- enough to prevent overflow as we accumulate the 2014 - 1948 = 66
  -- years of data.
  let sh = Repa.Z Repa.:. ndays Repa.:. nlat Repa.:. nlon
      slicecount = [ndays, nlat, nlon]
      zs = take (product slicecount) (repeat 0)
      init = Repa.fromList sh zs :: FArray3 Int

  -- Computation to accumulate a single year's data.
  let doone current y = do
        -- Read one year's data.
        let start = [(y - 1948) * ndays, 0, 0]
        Right slice <- getA innc z500var start slicecount :: RepaRet3 CShort
        return $!
          Repa.computeS $ current Repa.+^ ( fromIntegral slice)

  -- Accumulate all data.
  total <- foldM doone init [1948..2013]

  -- Calculate the final mean annual cycle.
  let mean = Repa.computeS $
    (fromIntegral . (`div` nyears)) total :: FArray3 CShort

Since each year of data has 181 days, a full year’s data has 72 × 15 × 181 data values (72 longitude points, 15 latitude points, 181 days), i.e. 195,480 values. In the case here, since we have 66 years of data, there are 12,901,680 data values altogether. That’s a small enough number that we could probably slurp the whole data set into memory in one go for processing. However, we’re not going to do that, because there are plenty of cases in climate data analysis where the data sets are significantly larger than this, and you need to do “off-line” processing, i.e. to read data from disk piecemeal for processing.

We do a monadic fold (using the standard foldM function) over the list of years of data, and for each year we read a single three-dimensional slice of data representing the whole year and add it to the current accumulated sum of all the data. (This is what the doone function does: the only slight wrinkle here is that we need to deal with conversion from the short integer values stored in the data file to the Haskell Int values that we use in the accumulator. Otherwise, it’s just a question of applying Repa’s element-wise addition operator to the accumulator array and the year’s data.) Once we’ve accumulated the total values across all years, we divide by the number of years and convert back to short integer values, giving a short integer valued array containing the mean annual cycle – a three-dimensional array with one entry for each day in our 181-day winter season and for each latitude and longitude in the grid we’re using.

Once we have the mean annual cycle with which we want to calculate anomalies, determining the anomalies is simply a matter of subtracting the mean annual cycle from each year’s data, matching up longitude, latitude and day of year for each data point. The next listing shows the main loop that does this, reading a single day of data at a time, then subtracting the appropriate slice of the mean annual cycle to produce anomaly values (Repa’s slice function is handy here) and writing these to an output NetCDF file:

      let count = [1, nlat, nlon]
      forM_ [0..ntime-1] $ \it -> do
        -- Read time slice.
        Right slice <- getA innc z500var [it, 0, 0] count :: RepaRet2 CShort

        -- Calculate anomalies and write out.
        let sl = Repa.Z Repa.:. (it `mod` 181) Repa.:. Repa.All Repa.:. Repa.All
            anom = Repa.computeS $
                   slice Repa.-^ (Repa.slice mean sl) :: FArray2 CShort
        putA outnc outz500var [it, 0, 0] count anom

The only thing we have to be a little bit careful about when we create the final anomaly output file is that we need to remove some of the attributes from the Z500 variable: because we’re now working with differences between actual values and our “typical” annual cycle, we no longer need the add_offset and scale_factor attributes that are used to convert from the stored short integer values to floating point geopotential height values. Instead, the values that we store in the anomaly file are the actual geopotential height anomaly values in metres.

After doing all this pre-processing, what we end up with is a single NetCDF file containing 66 winter seasons of daily Z500 anomalies for the region we’re interested in. The kind of rather boring data processing we’ve had to do to get to this point is pretty typical for climate data processing – you almost always need to do some sort of subsetting of your data, you often want to remove signals that aren’t interesting (like the annual cycle here, although things can get much more complicated than that). This kind of thing is unavoidable, and the best that you can really do is to try to organise things so that you do the pre-processing once and end up with data in a format that’s then easy to deal with for further processing. That’s definitely the case here, where we have fixed-length time series (181 days per winter) for each year, so we don’t need to do any messing around with dates.

In other applications, pre-processing can be a much bigger job. For functional brain imaging applications, for example, as well as extracting a three-dimensional image from the output of whatever (MRI or CT) scanner you’re using, you often need to do something about the low signal-to-noise ratios that you get, you need to compensate for subject motion in the scanner during the measurements, you need to compensate for time-varying physiological nuisance signals (breathing and heart beat), you need to spatially warp images to match an atlas image to enable inter-subject comparison, and so on. And all that is before you get to doing whatever statistical analysis you’re really interested in.

We will look at some of these “warts and all” pre-processing cases for other projects later on, but for now the small amount of pre-processing we’ve had to do here is enough. Now we can start with the “main analysis”.

Before we do that though, let’s take a quick look at what these anomaly values look like. The two figures below show anomaly plots for the same time periods for which the original Z500 data is shown in the plots in this earlier article. The “normal” anomaly plots are a bit more variable than the original Z500 plots, but the persistent pattern over the North Atlantic during the blocking episode in the second set of plots is quite clear. This gives us some hope that we’ll be able to pick out these persistent patterns in the data relatively easily.

First, the “normal” anomalies:

Normal Z500 anomaly snapshots

then the “blocking” anomalies:

Blocking Z500 anomaly snapshots

<script src="" type="text/javascript"></script> <script type="text/javascript"> (function () { var articleId = fyre.conv.load.makeArticleId(null); fyre.conv.load({}, [{ el: 'livefyre-comments', network: "", siteId: "290329", articleId: articleId, signed: false, collectionMeta: { articleId: articleId, url: fyre.conv.load.makeCollectionUrl(), } }], function() {}); }()); </script>

September 04, 2014 12:54 PM

September 01, 2014

Douglas M. Auclair (geophf)

August Haskell Daily puzzles and solutions

August, 2014

  • August 1st, 2014: "But programming isn't math, so ..." Today's #haskell problem? 'Fix' yesterday's solution to be less ... 'icky.' π, fast (So you're not eating π? ... okay: 'quickly.') A solution set to today's exercises with Wow-o-wow results.
  • Bonus: August 1st, 2014: This is Embarrassing(ly parallel) to have π in da face!
  • A-to-the-ST for da D-down-low-on-the-SL. Today's #haskell exercise is write your own language, then some rules. Yeah. LAWLZ! NAND on the A-to-the-ST! A solution to the first 2 exercises to today's problem. YMMV: not pretty nor generic.
  • Baby needs a new pair of shoes! And you need a new daily #haskell problem to solve. Done! ;) Love is the Universal Language. No, wait. We were talking about Money, yes? Me confused. Solution to today's exercise I have a Yen for Thinking Japanese, or so I think so ... a solution to the bonus #haskell exercise
  • Bayes was a man of letters. A Bayesian classifier for letter-recognition? Sure, let's give it a go for today's #haskell exercise. So now we know 'M' looks like 'W' to some peeps but 'B', 'I', 'O' check out fine, so WE ARE READING WITH HASKELL! YAY ... added definitions to do test runs over the entire data set and then did various runs, tweaking the system. Results noted. Informative.
  • Today's #haskell exercise. An expert system for blood donation. I use the term 'expert' loosely, eh? ;) Have at it! "I vant to drinq yur bloot! Bwa-hahaha!" Okay. That quote is just ... creepy. A solution to today's exercise is at
  • Veni, Vidi, ... Duci? Today's #haskell exercise. It is a relief to know that no tactical nukes were launched by testing the solution to today's #haskell exercise. or How do you count Mississippi's in Roman numerals?
  • August 11th, 2014, Monday: You've heard of 'Rock Management'? How about ROCK LOBSTERS! GIVE ME A ROCK, NAOW!!!1!!!!11! Today's #haskell exercise 
  • August 12th, 2014, Tuesday: To Fract or Not to Fract: why are we even considering this question? Today's #haskell exercise 

  • Ooh! Bonus problem? On a Tuesday? Wowsers! Today's #haskell #bonus problem: "Continuum spectrum maximum... uh, 'thing'um"
  • August 13th, 2014, Wednesday. No problem! (literally), but: "Fract! The Cylons!" No... wait: Starbuck used a different word. A solution to yesterday's fracting problem is at
Flatliners. No fracting, no 'peak'ing, just a baseline data set of 
initializeSpectrum defined with flatline function.

Twin Peaks. Still not fracted, but data set with two spikes 
rolled into the base data set with smoothSpike fn 

Fracted. Data set with peaks, fracted using the frackedSpike function

  • August 14th, 2014, Thursday: Pining for that next fibo date-time before 2584. Today's #haskell problem inspired by @AlgebraFact Updated problem. Tightened up return value, removing unnecessary semideterminism. So, it gives, leik, a septatuple answer, leik. ... Yeah. Time keeps flowing like a (Fibonacci) river, to the sea. A solution to  the 'next Fibonacci' problem posted today.
  • Hail, Eris! or the Law of Fives or today's #haskell problem (implementing a Ripple-down rule set). Do it to it! One of three-parter solution to today's problem: Forall Existential(ism) A solution allowing forall-quantified values. Two of the three-part solution: All you need is fnord (la-di-da-di-dah)! Rippling down (and accumulating fired) rules Third of three-part solution: RippleDownRuleto ergo sum, adding rules to the RDR system  Updated the 3rd solution to the RDR (Writer monad definition) to ripple down to left and right, fixing inconsistency in rule findings.
  • August 18th, 2014: What's FP good for? Crypto and technical (financial) indicators. Why? Um, let's do today's #haskell exercise! Full on, all the way! Double Rain-... wait: line-graph. Solution to today's #haskell exercise

  • August 19th, 2014: What does it all mean? All the way! No, wait: this is just today's #haskell exercise (backtesting SMA indicator ). Take the monoid and run(State-WriterT), or WAAH! WAAH! I lost 100 corn chips on my investment strategy or, solution: But, as solace, it does come with a pretty picture, illustrating today's solution. Ooh! 

  • August 20th, 2014: Next up for today's #haskell exercise is the Exponential Moving Average. A solution to the E(xponential)M(oving)A(verage) #haskell problem: Stately Conformance 

  • August 21st, 2014: My Jeans. No ... I meant: 'GENES'! That's it. Today's #haskell exercise. A solution to (not even CLOSE to) 'yesterday's' #haskell exercise: Nature or Nurture? We present a genetic algorithm
  • August 22nd, 2014: Today's (much delayed) #haskell problem: the Silver Ratio (from @AlgebraFact Every cloud has a silver lining .. Every Haskell problem has a solution. (sometimes) (except noncomputable problems)

  • August 25th, 2014: LYADCFGG! Today's #haskell exercise? (automated/learned) Document classification.
  • August 26th, 2014: "Join me, Luke, on the Dark Side of the Force, and help me go #FORTH to solve today's #haskell exercise! MWA-HAHA!" The World's smallest DSL: Forth. A solution to today's #haskell exercise
  • August 28th, 2014: "As I compose this #haskell problem ... p.s. I love you ... You! You! You!" #1960s #song A little card logic problem. Oh, NOES! LOOK OUT! IT'S A CARD-#SHARKNADO! Nah, it's just 'today's #haskell solution, is all
  • August 29th, 2014: Groovitudes! WordNet Themes! or: today's #haskell exercise A 3x3 matrix has 36 possible solutions. A 5x5 matrix? Over 200M poss solutions? YIKES! A solution to the themed-words
  • The Forth language problem solution given on August 26th gives a very snazzy RPN ('reverse Polish notation') calculator, but that's all it does the ':'-defining word needs access and look-ahead to the program parsed stream, and that's a bit more to ask than pushing and popping stack operators.
  • For the August 29th problem(WordNet themes) the raw generated solution set is over 209M possibilities. My little Haskell program was not finished scanning them over four hours when I killed the process. However, my dear wife solved the problem in under five minutes. Setting aside the fact that she's a genius, the program needs to be better. It needs to use the ontology of English-language words to eliminate fruitless endeavors during their generation, not afterwards during the test phase.

by geophf ( at September 01, 2014 10:11 AM

Yesod Web Framework

Planning Yesod 1.4

Now that persistent 2.0 is out the door, it's time to start talking about Yesod version 1.4. First question you might ask: what happened to Yesod 1.3? Answer: a few of the Yesod libraries (e.g., yesod-auth) are already on version 1.3, so to avoid confusion, we're jumping straight to 1.4.

Second question: what are we planning on breaking this time? Answer: hopefully nothing! The main purpose of this release is actually to just remove some backwards-compatibility hacks in the Yesod codebase for older versions of dependencies, like shakespeare pre-2.0, conduit pre-1.2, WAI pre-3.0, and persistent pre-2.0.

There are few exceptions to this, which should hopefully have minimal impact on users. You can see these in the detailed change list. One change I'd like to call out is the updated routing system. This is a fundamental change to how yesod-routes works. The generated code is drastically simpler as a result. Instead of constructing a data structure that allows for efficient pattern matching of the request path and then attempting to parse the resulting pieces, the new code simply generates a series of clauses, one for each route, and ensures proper parsing using view patterns. In my initial benchmarking, this made routing twice as fast as Yesod 1.2. I would release this as part of 1.2, but it introduces a new requirement on the ViewPatterns language extension. So instead, I held it off for the 1.4 release.

If there are other breaking changes that people would like to propose, now's the time to do it. But be aware that I'll likely push back hard on any breakage. If there's a very good reason for it, we can do it. But I'd rather keep stability wherever possible.

There's one exception to that rule, which is the purpose of the rest of this blog post: the scaffolded site. Making changes to the scaffolded site never breaks existing application, and therefore we can be much more liberal about changing things there. There is a downside in terms of education: all existing tutorials on the scaffolding would need to be updated. But one of my points below addresses that.

So here are my proposed scaffolding changes:

  • Let's move away from config files towards environment variables for configuration. A config file is still a convenient way to record configuration, but injecting that configuration through environment variables means configuration can also be stored in a database or elsewhere and injected through environment variables the same way.
  • Along the same lines, we would no longer need a command line argument to indicate which environment we're in (devel vs production, etc). All such settings would be controlled via environment variables.
  • To allow for easy development, we would have a single YESOD_DEVEL environment variables which would indicate if we're currently in development. If so, it would apply a number of default environment variable values to avoid the need to set these in your shell manually.
  • Finally, and I expect this to be controversial: let's use classy-prelude-yesod in the Import module, instead of just taking Prelude with a few objectionable functions filtered out.

This is just a first pass at a scaffolding cleanup, I'm sure there are other improvements that can be made as well.

I don't have a specific date on a Yesod 1.4 release, but I'm not expecting it to be a long development process. The vast majority of the work is already done (on the yesod-1.4 branch), and that codebase is already being used extensively in a rather large Yesod application, so I'm not too worried about regressions having slipped in.

September 01, 2014 12:00 AM

August 30, 2014

Joachim Breitner

DebConf 14

I’m writing this blog post on the plane from Portland towards Europe (which I now can!), using the remaining battery live after having watched one of the DebConf talks that I missed. (It was the systemd talk, which was good and interesting, but maybe I should have watched one of the power management talks, as my battery is running down faster than it should be, I believe.)

I mostly enjoyed this year’s DebConf. I must admit that I did not come very prepared: I had neither something urgent to hack on, nor important things to discuss with the other attendees, so in a way I had a slow start. I also felt a bit out of touch with the project, both personally and technically: In previous DebConfs, I had more interest in many different corners of the project, and also came with more naive enthusiasm. After more than 10 years in the project, I see a few things more realistic and also more relaxed, and don’t react on “Wouldn’t it be cool to have <emph>crazy idea</emph>” very easily any more. And then I mostly focus on Haskell packaging (and related tooling, which sometimes is also relevant and useful to others) these days, which is not very interesting to most others.

But in the end I did get to do some useful hacking, heard a few interesting talks and even got a bit excited: I created a new tool to schedule binNMUs for Haskell packages which is quite generic (configured by just a regular expression), so that it can and will be used by the OCaml team as well, and who knows who else will start using hash-based virtual ABI packages in the future... It runs via a cron job on to produce output for Haskell and for OCaml, based on data pulled via HTTP. If you are a Debian developer and want up-to-date results, log into and run ~nomeata/binNMUs --sql; it then uses the projectb and wanna-build databases directly. Thanks to the ftp team for opening up, by the way!

Unsurprisingly, I also held a talk on Haskell and Debian (slides available). I talked a bit too long and we had too little time for discussion, but in any case not all discussion would have fitted in 45 minutes. The question of which packages from Hackage should be added to Debian and which not is still undecided (which means we carry on packaging what we happen to want in Debian for whatever reason). I guess the better our tooling gets (see the next section), the more easily we can support more and more packages.

I am quite excited by and supportive of Enrico’s agenda to remove boilerplate data from the debian/ directories and relying on autodebianization tools. We have such a tool for Haskell package, cabal-debian, but it is unofficial, i.e. neither created by us nor fully endorsed. I want to change that, so I got in touch with the upstream maintainer and we want to get it into shape for producing perfect Debian packages, if the upstream provided meta data is perfect. I’d like to see the Debian Haskell Group to follows Enrico’s plan to its extreme conclusion, and this way drive innovation in Debian in general. We’ll see how that goes.

Besides all the technical program I enjoyed the obligatory games of Mao and Werewolves. I also got to dance! On Saturday night, I found a small but welcoming Swing-In-The-Park event where I could dance a few steps of Lindy Hop. And on Tuesday night, Vagrant Cascadian took us (well, three of us) to a blues dancing night, which I greatly enjoyed: The style was so improvisation-friendly that despite having missed the introduction and never having danced Blues before I could jump right in. And in contrast to social dances in Germany, where it is often announced that the girls are also invited to ask the boys, but then it is still mostly the boys who have to ask, here I took only half a minute of standing at the side until I got asked to dance. In retrospect I should have skipped the HP reception and went there directly...

I’m not heading home at the moment, but will travel directly to Göteborg to attend ICFP 2014. I hope the (usually worse) west-to-east jet lag will not prevent me from enjoying that as much as I could.

by Joachim Breitner ( at August 30, 2014 03:10 PM

Chris Wong

Hackage update, part 4

A lot has happened with Hackage since my last update. Now that the Summer of Code is over, I’ll summarize the work I’ve done since then, and outline where this project will go next.

What’s “build reporting”?

Since my project covered a few obscure parts of Hackage and Cabal, I think it’s worthwhile to clear some terminology first.

If you’ve uploaded a library to Hackage before, you may have noticed that the Haddock documentation does not appear straight away. Since building a package can be quite resource intensive, the job is handled by a dedicated build bot. This bot continually polls for new packages, invokes cabal install on them (with some special flags, which I’ll go into later), and uploads the result.

Of course, this process does not always succeed. If a package fails to compile, then it will not have any documentation either. This is clearly very inconvenient.

Fortunately, recent versions of cabal include a feature called build reporting. When invoked with the --build-summary option, cabal creates a file containing useful information about the build. Here’s an example using the robot package:

$ cabal install robot --build-summary='$'
$ cat
package: robot-
os: linux
arch: x86_64
compiler: ghc-7.6.3
client: cabal-install-
dependencies: xhb-0.5.2014.4.10 transformers-
              exceptions-0.6.1 containers- base-
install-outcome: InstallOk
docs-outcome: NotTried
tests-outcome: NotTried

Since the build bot uses cabal, it has access to these reports as well. So whenever the bot completes a build — successful or not — it posts the corresponding report to Hackage. You can read these reports yourself via a special URL; for our robot example it’s

In summary: if the docs for a package are missing, then the reports will tell us why. If there are no reports, then it must mean the build bot hasn’t attempted the package yet. All is fine and dandy, at least in theory.


… not all builds were reported. The gaps were in two places: planning failures and package candidates. My latest patch to cabal fixed both these issues.

Reporting planning failures

A planning failure is when cabal-install cannot find a consistent set of dependencies to use. You can trigger a planning failure yourself:

$ cabal install robot --constraint='robot < 1.1' --constraint='robot > 1.1'
cabal: Could not resolve dependencies:

Since we can’t have a robot which is both older and newer than 1.1, the resolver fails.

Formerly, as dependency resolution ran early in the build process, any failures at this stage did not generate a corresponding report. So if the build bot encountered a planning failure, all the user saw was missing documentation, with no hints as to what went wrong.

The fix was mostly straightforward, save for one issue: since users can report their own builds, a naïve implementation would have lead to Hackage being swamped with frivolous reports. So this feature is guarded behind a flag (--report-planning-failure), and disabled by default.

Reporting candidate builds

Hackage has a feature called build candidates. This lets package maintainers upload and test packages without publishing them to the main site.

Again, the problem was the lack of reporting: when a candidate was uploaded, the build bot would compile the package but not submit a report. This was a major issue, since this reporting was what motivated the feature in the first place.

After some digging, I traced this to a bug in cabal. A candidate is not published in the main package index (by definition), so it is impossible to refer to one by name (e.g. hello-1.0). So the build bot invokes cabal using the bare URL instead (e.g.

The problem was if only a URL was given, cabal considered it a “local” package and did not generate a report. The reason for this behavior is outside the scope of this post, but the fix was clear: change cabal to generate reports for all packages, no matter how they are specified on the command line.

Where to next?

Though the Summer of Code has ended, my work with Hackage has not. There are still many issues that need clearing up, especially with the candidates feature; I’ll continue hacking away at them in my spare time.

And lest I forget — many thanks to my mentor Duncan Coutts for his guidance throughout this project! I had plenty of fun this summer, and learned just as much.

August 30, 2014 12:00 AM

August 29, 2014

Yesod Web Framework

Announcing Persistent 2

We are happy to announce the release of persistent 2.0

persistent 2.0 adds a flexible key type and makes some breaking changes. 2.0 is an unstable release that we want your feedback on for the soon to follow stable 2.1 release.

New Features

  • type-safe composite primary and foreign keys
  • added an upsert operation (update or insert)
  • added an insertMany_ operation


  • An Id suffix is no longer automatically assumed to be a Persistent type
  • JSON serialization * MongoDB ids no longer have a prefix 'o' character.

Breaking changes

  • Use a simple ReaderT for the underlying connection
  • fix postgreSQL timezone storage
  • remove the type parameter from EntityDef and FieldDef

In depth

Composite keys

The biggest limitation of data modeling with persistent is an assumption of a simple (for current SQL backends an auto-increment) primary key. We learned from Groundhog that a more flexible primary key type is possible. Persistent adds a similar flexible key type while maintaining its existing invariant that a Key is tied to a particular table.

To understand the changes to the Key data type, lets look at a change in the test suite for persistent 2.

       i <- liftIO $ randomRIO (0, 10000)
-      let k = Key $ PersistInt64 $ abs i
+      let k = PersonKey $ SqlBackendKey $ abs i

Previously Key contained a PersistValue. This was not type safe. PersistValue is meant to serialize any basic Haskell type to the database, but a given table only allows specific values as the key. Now we generate the PersonKey data constructor which specifies the Haskell key types. SqlBackendKey is the default key type for SQL backends.

Now lets look at code from CompositeTest.hs

mkPersist sqlSettings [persistLowerCase|
      name  String maxlen=20
      name2 String maxlen=20
      age Int
      Primary name name2 age
      deriving Show Eq
      name  String maxlen=20
      name2 String maxlen=20
      age Int
      Foreign Parent fkparent name name2 age
      deriving Show Eq

Here Parent has a composite primary key made up of 3 fields. Child uses that as a foreign key. The primary key of Child is the default key for the backend.

let parent = Parent "a1" "b1" 11
let child = Child "a1" "b1" 11
kp <- insert parent
_ <- insert child
testChildFkparent child @== parent

Future changes

Short-term improvements

Before the 2.1 release I would like to look at doing some simple things to speed up model compilation a little bit.

  • Speed up some of the compile-time persistent code (there is a lot of obviously naive code).
  • Reduce the size of Template Haskell generation (create a reference for each EntityDef and some other things rather than potentially repeatedly inlining it)

Medium-term improvement: better support for Haskell data types

We want to add better support for modeling ADTs, particularly for MongoDB where this is actually very easy to do in the database itself. Persistent already support a top-level entity Sum Type and a simple field ADT that is just an enumeration.

Another pain point is serializing types not declared in the schema. The declaration syntax in groundhog is very verbose but allows for this. So one possibility would be to allow the current DRY persistent declaration style and also a groundhog declaration style.

Long-term improvements: Projections

It would be possible to add projections now as groundhog or esqueleto have done. However, the result is not as end-user friendly as we would like. When the record namespace issue is dealt with in the GHC 7.10 release we plan on adding projections to persistent.

Ongoing: Database specific functionality

We always look forward to see more databases adapters for persistent. In the last year, a Redis and ODBC adapter were added.

Every database is different though, and you also want to take advantage of your database-specific features. esqueleto and persistent-mongoDB have shown how to build database specific features in a type-safe way on top of persistent.


Although the persistent code has no dependency on Yesod, I would like to make the infrastructure a little more independent of yesod. The first steps would be

  • putting it under a different organization on github.
  • having a separate mail list (should stackoverflow be prioritized over e-mail?)

August 29, 2014 09:46 PM

Antti-Juhani Kaijanaho (ibid)

Licentiate Thesis is now publicly available

My recently accepted Licentiate Thesis, which I posted about a couple of days ago, is now available in JyX.

Here is the abstract again for reference:

Kaijanaho, Antti-Juhani
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Finnish summary

Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.

Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis

by Antti-Juhani Kaijanaho at August 29, 2014 08:45 AM

August 28, 2014

Functional Jobs

Senior Software Engineer (Functional) at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District. We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you do have some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

August 28, 2014 09:18 PM

Douglas M. Auclair (geophf)

Dylan: the harsh realities of the market

So, this is a little case study.

I did everything for Dylan. And when I say everything, I mean everything.  Here's my resumé:

  • I got excited about Dylan as a user, and I used it. I bought an old Mac that I don't ever remember the designation for, it's so '90's old, and got the floppies for the Dylan IDE from Apple research.
I'm not joking.

  • I integrated Dylan into my work at work, building an XML parser then open-sourcing it to the community under the (then) non-restrictive license. I think mine was the only XML parser that was industrial-strength for Dylan. Can't claim originality: I ported over the Common-LISP one, but it was a lot of (fun) work.
  • I made improvements to the gwydion-dylan compiler, including some library documentation (you can see my name right there, right in the compiler code), including some library functionality, did I work on the compiler itself? The Dylan syntax extensions or type system? I don't recall; if not in those places, I know I've looked at those guts: I had my fingers all over parts of the compiler.
I was in the Dylan compiler code. For you ll-types ('little language') that's no big deal.

But ask a software developer in industry if they've ever been in their compiler code. I have, too: I've found bugs in Java Sun-compiler that I fixed locally and reported up the chain.
  • I taught a course at our community college on Dylan. I had five students from our company that made satellite mission software.
  • I effing had commercial licenses bought when the boss asked me: what do we have to do to get this (my system) done/integrated into the build. I put my job on the line, for Dylan. ... The boss bought the licenses: he'd rather spend the $x than spending six weeks to back-port down to Java or C++.
  • I built a rule-based man-power scheduling system that had previously took three administrative assistants three days each quarter to generate. My system did it, and printed out a PDF in less than one second. I sold it, so that means I started a commercial company and sold my software.
I sold commercial Dylan software. That I wrote. Myself. And sold. Because people bought it. Because it was that good.

Hells yeah.

Question: what more could I have done?

I kept Dylan alive for awhile. In industry. For real.

So why is Dylan dead?

That's not the question.

Or, that question is answered over and over and over again.

Good languages, beautiful languages, right-thing languages languish and die all the time.

Dylan was the right-thing, and they (Apple) killed it in the lab, and for a reason.

Who is Dylan for?

That's not the question either. Because you get vague, general, useless answers.

The question is to ask it like Paul Graham answered it for LISP.

Lisp is a pointless, useless, weird language that nobody uses.

But Paul and his partner didn't care. They didn't give a ...


... what anybody else thought. They knew that this language, the language they loved, was built and designed and made for them. Just them and only them, because the only other people who were using it were college kids on comp.lang.lisp asking for the answers for problem-set 3 on last night's homework.

That's what Lisp was good for: nothing.
That's who Lisp was good for: nobody.

Same exact scenario for Erlang. Exactly the same. Erlang was only good for Joe Armstrong and a couple of buddies/weirdos like him, you know: kooks, who believed that Erlang was the right-thing for what they were doing, because they were on a mission, see, and nothing nobody could say could stop them nor stand against them, and all who would rise up against them would fall.


What made Lisp and Haskell and Erlang and Scala and Prolog (yes, Prolog, although you'll never hear that success story publicly, but $26M and three lives saved? Because of a Prolog system I wrote? And that's just one day in one month's report for data? I call that a success) work when nobody sane would say that these things would work?

Well, it took a few crazy ones to say, no, not: 'say' that it would work, but would make it work with their beloved programming language come hell or high water or, worse: indifferent silence, or ridicule, or pity from the rest of the world.

That is the lesson of perl and python and all these other languages. They're not good for anything. They suck. And they suck in libraries and syntax and semantics and weirdness-factor and everything.

But two, not one, but at least two people loved that language enough to risk everything, and ...

They lost.

Wait. What?

Did you think I was going to paint the rosy picture and lie to you and say 'they won'?

Because they didn't.

Who uses Lisp commercially? Or Haskell, except some fringers, or Scala or Clojure or Erlang or Smalltalk or Prolog

... or Dylan.

These languages are defined, right there in the dictionary.

Erlang: see 'career wrecker.'

Nobody uses those languages nor admits to even touching them with a 10-foot (3-meter) pole. I had an intern from college. 'Yeah, we studied this weird language called ML in Comp.sci. Nobody uses it.'

She was blown away when I started singing ML's praises and what it can do.

A meta-language, and she called it useless? Seriously?

Because that's what the mainstream sees.

Newsflash. I'm sorry. Dylan, Haskell, Idris: these aren't main-stream, and they never will be.

Algebraic types? Dependent types? You'll never see them. They're too ... research-y. They stink of academe, which is: they stink of uselessness-to-industry. You'll be dead and buried to see them in this form, even after they discover the eternity elixir. Sorry.

Or you'll see them in Visual Basic as a new Type-class form that only a few Microserfs use because they happened to have written those extensions. Everybody else?


Here's how Dylan will succeed, right now.

Bruce and I will put our heads together, start a company, and we'll code something. Not for anybody else to use and to love and to cherish, just for us, only for us, and it will blow out the effing doors, and we'll be bought out for $40M because our real worth is $127M.

And the first thing that Apple will do, after they bought us, is to show us the door, then convert the code into Java. Or Swift. Or Objective-C, or whatever.

And that's how we'll win.

Not the $40M. Not the lecture series on 'How to Make Functional Programming Work in Industry for Real' afterwards at FLoC and ICFP conferences with fan-bois and -girls wanting to talk to us afterwards and ask us how they can get a job doing functional programming.

Not that.

We'll win because we made something in Dylan, and it was real, and it worked, and it actually did something for enough people that we can now go to our graves knowing that we did something once with our lives (and we can do it again and again, too: there's no upper limit on the successes you're allowed to have, people) that meant something to some bodies. And we did that. With Dylan.


I've done that several times already, by my counting: the Prolog project, the Dylan project, the Mercury project, and my writing.

I'm ready to do that, again.

Because, actually, fundamentally, doing something in this world and for it ... there's nothing like it.

You write that research paper, and I come up to you, waving it in your face, demanding you implement your research because I need it to do my job in Industry?

I've done that to three professors so far. Effing changed their world-view in that moment. "What?" they said, to a person, "somebody actually wants to use this?" The look of bemused surprise on their faces?

It was sad, actually, because they did write something that somebody out there (moiself) needed, but they never knew that what they were doing meant something.

And it did.

Effing change your world-view. Your job? Your research? Your programming language?

That's status quo, and that's good and necessary and dulce and de leche (or decorum, I forget which).

But get up out of the level you're at, and do something with it so that that other person, slouched in their chair, sits up and takes notice, and a light comes over their face and they say, 'Ooh! That does that? Wow!' and watch their world change, because of you and what you've done.

Dylan is for nothing and for nobody.

So is everything under the Sun, my friend.

Put your hand to the plow, and with the sweat of your brow, make it yours for this specific thing.

Regardless of the long hours, long months of unrewarded work, and regardless of the hecklers, naysayers, and concerned friends and parents, and regardless of the mountain of unpaid bills.

You make it work, and you don't stop until it does.

That's how I've won.

Every time.

by geophf ( at August 28, 2014 04:06 PM


Dealing with Asynchronous Exceptions during Resource Acquisition


Consider the following code: we open a socket, compute with it, and finally close the socket again. The computation happens inside an exception handler (try), so even when an exception happens we still close the socket:

example1 :: (Socket -> IO a) -> IO a 
example1 compute = do -- WRONG
  s <- openSocket 
  r <- try $ compute s
  closeSocket s
  case r of
    Left ex -> throwIO (ex :: SomeException)
    Right a -> return a

Although this code correctly deals with synchronous exceptions–exceptions that are the direct result of the execution of the program–it does not deal correctly with asynchronous exceptions–exceptions that are raised as the result of an external event, such as a signal from another thread. For example, in

example2 :: (Socket -> IO a) -> IO (Maybe a)
example2 compute = timeout someTimeout $ example1 compute

it is possible that the timeout signal arrives after we have opened the socket but before we have installed the exception handler (or indeed, after we leave the scope of the exception handler but before we close the socket). In order to address this we have to control precisely where asynchronous exceptions can and cannot be delivered:

example3 :: (Socket -> IO a) -> IO a
example3 compute =
  mask $ \restore -> do
    s <- openSocket 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

We mask asynchronous exceptions, and then restore them only inside the scope of the exception handler. This very common pattern is captured by the higher level combinator bracket, and we might rewrite the example as

example4 :: (Socket -> IO a) -> IO a
example4 = bracket openSocket closeSocket

Allowing asynchronous exceptions during resource acquisition

Suppose that we wanted to define a derived operation that opens a socket and performs some kind of handshake with the server on the other end:

openHandshake :: IO Socket
openHandshake = do
  mask $ \restore -> do
    s <- openSocket
    r <- try $ restore $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

(These and the other examples can be defined in terms of bracket and similar, but we use mask directly so that it’s easier to see what is happening.) We might use openHandshake as follows:

example5 :: (Socket -> IO a) -> IO a
example5 compute = do
  mask $ \restore -> do
    s <- openHandshake 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

There are no resource leaks in this code, but there is a different problem: we call openHandshake with asynchronous exceptions masked. Although openHandshake calls restore before doing the handshake, restore restores the masking state to that of the enclosing context. Hence the handshake with the server cannot be timed out. This may not be what we want–we may want to be able to interrupt example5 with a timeout either during the handshake or during the argument computation.

Note that this is not a solution:

example6 :: (Socket -> IO a) -> IO a 
example6 compute = do
  mask $ \restore -> do
    s <- restore openHandshake -- WRONG
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

Consider what might happen: if an asynchronous exception is raised after openHandshake returns the socket, but before we leave the scope of restore, the asynchronous exception will be raised and the socket will be leaked. Installing an exception handler does not help: since we don’t have a handle on the socket, we cannot release it.

Interruptible operations

Consider this definition from the standard libraries:

withMVar :: MVar a -> (a -> IO b) -> IO b
withMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    b <- restore (io a) `onException` putMVar m a
    putMVar m a
    return b

This follows almost exactly the same pattern as the examples we have seen so far; we mask asynchronous exceptions, take the contents of the MVar, and then execute some operation io with the contents of the MVar, finally putting the contents of the MVar back when the computation completes or when an exception is raised.

An MVar acts as a lock, with takeMVar taking the role of acquiring the lock. This may, of course, take a long time if the lock is currently held by another thread. But we call takeMVar with asynchronous exceptions masked. Does this mean that the takeMVar cannot be timed out? No, it does not: takeMVar is a so-called interruptible operation. From the Asynchronous Exceptions in Haskell paper:

Any operation which may need to wait indefinitely for a resource (e.g., takeMVar) may receive asynchronous exceptions even within an enclosing block, but only while the resource is unavailable. Such operations are termed interruptible operations. (..) takeMVar behaves atomatically when enclosed in block. The takeMVar may receive asynchronous exceptions right up until the point when it acquires the MVar, but not after.

(block has been replaced by mask since the publication of the paper, but the principle is the same.) Although the existence of interruptible operations makes understanding the semantics of mask harder, they are necessary: like in the previous section, wrapping takeMVar in restore is not safe. If we really want to mask asynchronous exceptions, even across interruptible operations, Control.Exception offers uninterruptibleMask.

Custom interruptible operations

So an interruptible operation is one that can be interrupted by an asynchronous exception even when asynchronous exceptions are masked. Can we define our own interruptible operations? Yes, we can:

-- | Open a socket and perform handshake with the server
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ unsafeUnmask $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

unsafeUnmask is defined in GHC.IO, and unmasks asynchronous exceptions, no matter what the enclosing context is. This is of course somewhat dangerous, because now calling openHandshake' inside a mask suddenly opens up the possibility of an asynchronous exception being raised; and the only way to know is to look at the implementation of openHandshake', or its Haddock documentation. This is somewhat unsatisfactory, but exactly the same goes for takeMVar and any other interruptible operation, or any combinator that uses an interruptible operation under the hood. A sad state of affairs, perhaps, but one that we don’t currently have a better solution for.

Actually, using unsafeUnmask is a bit too crude. Control.Exception does not export it, but does export

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

with documentation

When invoked inside mask, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking.

When called outside mask, or inside uninterruptibleMask, this function has no effect.

(emphasis mine.) Sadly, this documentation does not reflect the actual semantics: unsafeUnmask, and as a consequence allowInterrupt, unmasks asynchronous exceptions no matter what the enclosing context is: even inside uninterruptibleMask. We can however define our own operator to do this:

interruptible :: IO a -> IO a
interruptible act = do
  st <- getMaskingState
  case st of
    Unmasked              -> act
    MaskedInterruptible   -> unsafeUnmask act
    MaskedUninterruptible -> act 

where we call unsafeUnmask only if the enclosing context is mask, but not if it is uninterruptibleMask (TODO: What is the semantics when we nest these two?). We can use it as follows to define a better version of openHandshake:

-- | Open a socket and perform handshake with the server
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ interruptible $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

Resource allocation timeout

If we wanted to timeout the allocation of the resource only, we might do

example7 :: (Socket -> IO a) -> IO a
example7 compute = do
  mask $ \restore -> do
    ms <- timeout someTimeout $ openHandshake'
    case ms of
      Nothing -> throwIO (userError "Server busy")
      Just s  -> do r <- try $ restore $ compute s
                    closeSocket s
                    case r of
                      Left ex -> throwIO (ex :: SomeException)
                      Right a -> return a

Exceptions are masked when we enter the scope of the timeout, and are unmasked only once we are inside the exception handler in openHandshake'–in other words, if a timeout happens, we are guaranteed to clean up the socket. The surrounding mask is however necessary. For example, suppose we are writing some unit tests and we are testing openHandshake'. This is wrong:

example8 :: IO ()
example8 = do 
  ms <- timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

Even if we are sure that the example8 will not be interrupted by asynchronous exceptions, there is still a potential resource leak here: the timeout exception might be raised just after we leave the mask_ scope from openHandshake but just before we leave the timeout scope. If we are sure we don’t need to worry about other asynchronous exceptions we can write

example8 :: IO ()
example8 = do
  s <- mask_ $ timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

although of course it might be better to simply write

example8 :: IO ()
example8 = 
  bracket (timeout someTimeout $ openHandshake')
          (\ms -> case ms of Just s  -> closeSocket s
                             Nothing -> return ())
          (\_ -> return ())


Making sure that resources are properly deallocated in the presence of asynchronous exceptions is difficult. It is very important to make sure that asynchronous exceptions are masked at crucial points; unmasking them at the point of calling a resource allocation function is not safe. If you nevertheless want to be able to timeout resource allocation, you need to make your resource allocation function interruptible.

For completeness’ sake, there are some other solutions that avoid the use of unsafeUnmask. One option is to thread the restore argument through (and compose multiple restore arguments if there are multiple nested calls to mask). This requires resource allocations to have a different signature, however, and it is very error prone: a single mask somewhere along the call chain where we forget to thread through the restore argument will mean the code is no longer interruptible. The other option is to run the code that we want to be interruptible in a separate thread, and wait for the thread to finish with, for example, a takeMVar. Getting this right is however no easy task, and it doesn’t change anything fundamentally anyway: rather than using unsafeUnmask we are now using a primitive interruptible operation; either way we introduce the possibility of exceptions even in the scope of mask_.

Finally, when your application does not fit the bracket pattern we have been using (implicitly or explicitly), you may want to have a look at resourcet and pipes or conduit, or my talk Lazy I/O and Alternatives in Haskell.

by edsko, duncan at August 28, 2014 03:48 PM

Keegan McAllister

Calling a Rust library from C (or anything else!)

One reason I'm excited about Rust is that I can compile Rust code to a simple native-code library, without heavy runtime dependencies, and then call it from any language. Imagine writing performance-critical extensions for Python, Ruby, or Node in a safe, pleasant language that has static lifetime checking, pattern matching, a real macro system, and other goodies like that. For this reason, when I started html5ever some six months ago, I wanted it to be more than another "Foo for BarLang" project. I want it to be the HTML parser of choice, for a wide variety of applications in any language.

Today I started work in earnest on the C API for html5ever. In only a few hours I had a working demo. And this is a fairly complicated library, with 5,000+ lines of code incorporating

It's pretty cool that we can use all this machinery from C, or any language that can call C. I'll describe first how to build and use the library, and then I'll talk about the implementation of the C API.

html5ever (for C or for Rust) is not finished yet, but if you're feeling adventurous, you are welcome to try it out! And I'd love to have more contributors. Let me know on GitHub about any issues you run into.

Using html5ever from C

Like most Rust libraries, html5ever builds with Cargo.

$ git clone
$ cd html5ever
$ git checkout dev
$ cargo build
Updating git repository ``
Compiling phf_mac v0.0.0 (
Compiling html5ever-macros v0.0.0 (file:///tmp/html5ever)
Compiling phf v0.0.0 (
Compiling html5ever v0.0.0 (file:///tmp/html5ever)

The C API isn't Cargo-ified yet, so we'll build it using the older Makefile-based system.

$ mkdir build
$ cd build
$ ../configure
$ make libhtml5ever_for_c.a
rustc -D warnings -C rpath -L /tmp/html5ever/target -L /tmp/html5ever/target/deps \
-o libhtml5ever_for_c.a --cfg for_c --crate-type staticlib /tmp/html5ever/src/
warning: link against the following native artifacts when linking against this static library
note: the order and any duplication can be significant on some platforms, and so may need to be preserved
note: library: rt
note: library: dl
note: library: pthread
note: library: gcc_s
note: library: pthread
note: library: c
note: library: m

Now we can build an example C program using that library, and following the link instructions produced by rustc.

$ H5E_PATH=/tmp/html5ever
$ gcc -Wall -o tokenize tokenize.c -I $H5E_PATH/capi -L $H5E_PATH/build \
-lhtml5ever_for_c -lrt -ldl -lpthread -lgcc_s -lpthread -lc -lm

$ ./tokenize 'Hello&comma; <i class=excellent>world!</i>'
CHARS : Hello
TAG : <i>
ATTR: class="excellent"
CHARS : world!
TAG : </i>

The build process is pretty standard for C; we just link a .a file and its dependencies. The biggest obstacle right now is that you won't find the Rust compiler in your distro's package manager, because the language is still changing so rapidly. But there's a ton of effort going into stabilizing the language for a Rust 1.0 release this year. It won't be too long before rustc is a reasonable build dependency.

Let's look at the C client code.

#include <stdio.h>

#include "html5ever.h"

void put_str(const char *x) {
fputs(x, stdout);

void put_buf(struct h5e_buf text) {
fwrite(, text.len, 1, stdout);

void do_start_tag(void *user, struct h5e_buf name, int self_closing, size_t num_attrs) {
put_str("TAG : <");
if (self_closing) {

// ...

struct h5e_token_ops ops = {
.do_chars = do_chars,
.do_start_tag = do_start_tag,
.do_tag_attr = do_tag_attr,
.do_end_tag = do_end_tag,

struct h5e_token_sink sink = {
.ops = &ops,
.user = NULL,

int main(int argc, char *argv[]) {
if (argc < 2) {
printf("Usage: %s 'HTML fragment'\n", argv[0]);
return 1;

struct h5e_tokenizer *tok = h5e_tokenizer_new(&sink);
h5e_tokenizer_feed(tok, h5e_buf_from_cstr(argv[1]));
return 0;

The struct h5e_token_ops contains pointers to callbacks. Any events we don't care to handle are left as NULL function pointers. Inside main, we create a tokenizer and feed it a string. html5ever for C uses a simple pointer+length representation of buffers, which is this struct h5e_buf you see being passed by value.

This demo only does tokenization, not tree construction. html5ever can perform both phases of parsing, but the API surface for tree construction is much larger and I didn't get around to writing C bindings yet.

Implementing the C API

Some parts of Rust's libstd depend on runtime services, such as task-local data, that a C program may not have initialized. So the first step in building a C API was to eliminate all std:: imports. This isn't nearly as bad as it sounds, because large parts of libstd are just re-exports from other libraries like libcore that we can use with no trouble. To be fair, I did write html5ever with the goal of a C API in mind, and I avoided features like threading that would be difficult to integrate. So your library might give you more trouble, depending on which Rust features you use.

The next step was to add the #![no_std] crate attribute. This means we no longer import the standard prelude into every module. To compensate, I added use core::prelude::*; to most of my modules. This brings in the parts of the prelude that can be used without runtime system support. I also added many imports for ubiquitous types like String and Vec, which come from libcollections.

After that I had to get rid of the last references to libstd. The biggest obstacle here involved macros and deriving, which would produce references to names under std::. To work around this, I create a fake little mod std which re-exports the necessary parts of core and collections. This is similar to libstd's "curious inner-module".

I also had to remove all uses of format!(), println!(), etc., or move them inside #[cfg(not(for_c))]. I needed to copy in the vec!() macro which is only provided by libstd, even though the Vec type is provided by libcollections. And I had to omit debug log messages when building for C; I did this with conditionally-defined macros.

With all this preliminary work done, it was time to write the C bindings. Here's how the struct of function pointers looks on the Rust side:

pub struct h5e_token_ops {
do_start_tag: extern "C" fn(user: *mut c_void, name: h5e_buf,
self_closing: c_int, num_attrs: size_t),

do_tag_attr: extern "C" fn(user: *mut c_void, name: h5e_buf,
value: h5e_buf),

do_end_tag: extern "C" fn(user: *mut c_void, name: h5e_buf),

// ...

The processing of tokens is straightforward. We pattern-match and then call the appropriate function pointer, unless that pointer is NULL. (Edit: eddyb points out that storing NULL as an extern "C" fn is undefined behavior. Better to use Option<extern "C" fn ...>, which will optimize to the same one-word representation.)

To create a tokenizer, we heap-allocate the Rust data structure in a Box, and then transmute that to a raw C pointer. When the C client calls h5e_tokenizer_free, we transmute this pointer back to a box and drop it, which will invoke destructors and finally free the memory.

You'll note that the functions exported to C have several special annotations:

  • #[no_mangle]: skip name mangling, so we end up with a linker symbol named h5e_tokenizer_free instead of _ZN5for_c9tokenizer18h5e_tokenizer_free.
  • unsafe: don't let Rust code call these functions unless it promises to be careful.
  • extern "C": make sure the exported function has a C-compatible ABI. The data structures similarly get a #[repr(C)] attribute.

Then I wrote a C header file matching this ABI:

struct h5e_buf {
unsigned char *data;
size_t len;

struct h5e_buf h5e_buf_from_cstr(const char *str);

struct h5e_token_ops {
void (*do_start_tag)(void *user, struct h5e_buf name,
int self_closing, size_t num_attrs);

void (*do_tag_attr)(void *user, struct h5e_buf name,
struct h5e_buf value);

void (*do_end_tag)(void *user, struct h5e_buf name);

/// ...

struct h5e_tokenizer;

struct h5e_tokenizer *h5e_tokenizer_new(struct h5e_token_sink *sink);
void h5e_tokenizer_free(struct h5e_tokenizer *tok);
void h5e_tokenizer_feed(struct h5e_tokenizer *tok, struct h5e_buf buf);
void h5e_tokenizer_end(struct h5e_tokenizer *tok);

One remaining issue is that Rust is hard-wired to use jemalloc, so linking html5ever will bring that in alongside the system's libc malloc. Having two separate malloc heaps will likely increase memory consumption, and it prevents us from doing fun things like allocating Boxes in Rust that can be used and freed in C. Before Rust can really be a great choice for writing C libraries, we need a better solution for integrating the allocators.

If you'd like to talk about calling Rust from C, you can find me as kmc in #rust and #rust-internals on And if you run into any issues with html5ever, do let me know, preferably by opening an issue on GitHub. Happy hacking!

by keegan ( at August 28, 2014 06:18 AM

August 27, 2014

Bill Atkins

NSNotificationCenter, Swift and blocks

The conventional way to register observers with NSNotificationCenter is to use the target-action pattern. While this gets the job done, it's inherently not type-safe.

For example, the following Swift snippet will compile perfectly:

    NSNotificationCenter.defaultCenter().addObserver(self, selector: Selector("itemAdded:"),
      name: MyNotificationItemAdded, object: nil)

even though at runtime it will fail unless self has a method named itemAdded that takes exactly one parameter (leaving off that last colon in the selector will turn this line into a no-op). Plus, this method gives you no way to take advantages of Swift's closures, which would allow the observer to access local variables in the method that adds the observer and would eliminate the need to create a dedicated method to handle the event.

A better way to do this is to use blocks. And NSNotificationCenter does include a block-based API:

    NSNotificationCenter.defaultCenter().addObserverForName(MyNotificationItemAdded, object: nil, queue: nil) { note in
      // ...

This is much nicer, especially with Swift's trailing closure syntax. There are no method names to be looked up at runtime, we can refer to local variables in the method that registered the observer and we can perform small bits of logic in reaction to events without having to create and name dedicated methods.

The catch comes in resource management. It's very important that an object remove its event observers when it's deallocated, or else NSNotificationCenter will try to invoke methods on invalid pointers.

The traditional target-action method has the one advantage that we can easily handle this requirement with a single call in deinit:

  deinit {

With the block API, however, since there is no explicit target object, each call to addObserverForName returns "an opaque object to act as observer." So your observer class would need to track all of these objects and then remove them all from the notification center in deinit, which is a pain.

In fact, the hassle of having to do bookkeeping on the observer objects almost cancels out the convenience of using the block API. Frustrated by this situation, I sat down and created a simple helper class, NotificationManager:

class NotificationManager {
  private var observerTokens: [AnyObject] = []

  deinit {

  func deregisterAll() {
    for token in observerTokens {

    observerTokens = []

  func registerObserver(name: String!, block: (NSNotification! -> ()?)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: nil, queue: nil) {note in

  func registerObserver(name: String!, forObject object: AnyObject!, block: (NSNotification! -> ()?)) {
    let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: object, queue: nil) {note in

First, this simple class provides a Swift-specialized API around NSNotificationCenter.  It provides an additional convenience method without an object parameter (rarely used, in my experience) to make it easier to use trailing-closure syntax. But most importantly, it keeps track of the observer objects generated when observers are registered, and removes them when the object is deinit'd.

A client of this class can simply keep a member variable of type NotificationManager and use it to register its observers. When the parent class is deallocated, the deinit method will automatically be called on its NotificationManager member variable, and its observers will be properly disposed of:

class MyController: UIViewController {
  private let notificationManager = NotificationManager()
  override init() {
    notificationManager.registerObserver(MyNotificationItemAdded) { note in
      println("item added!")
  required init(coder: NSCoder) {
    fatalError("decoding not implemented")

When the MyController instance is deallocated, its NotificationManager member variable will be automatically deallocated, triggering the call to deregisterAll that will remove the dead objects from NSNotificationCenter.

In my apps, I add a notificationManager instance to my common UIViewController base class so I don't have to explicitly declare the member variable in all of my controller subclasses.

Another benefit of using my own wrapper around NSNotificationCenter is that I can add useful functionality, like group observers: an observer that's triggered when any one of a group of notifications are posted:

struct NotificationGroup {
  let entries: [String]
  init(_ newEntries: String...) {
    entries = newEntries


extension NotificationManager {
  func registerGroupObserver(group: NotificationGroup, block: (NSNotification! -> ()?)) {
    for name in group.entries {
      registerObserver(name, block: block)

This can be a great way to easily set up an event handler to run when, for example, an item is changed in any way at all:

   let MyNotificationItemsChanged = NotificationGroup(

    notificationManager.registerGroupObserver(MyNotificationItemsChanged) { note in
      // ...

by More Indirection ( at August 27, 2014 11:21 AM

Simon Michael

Creating well-behaved Hakyll blog posts

Posts in a Hakyll-powered blog need to be created with care, if you want your feed to work well with clients and aggregators. There are many things to remember:

  • If you have clones of your site, decide which one to work in and make sure it’s up to date
  • Create the file in the right place
  • Name it consistently (I use
  • In my setup, prefix it with _ if it’s a draft (I render but don’t publish those)
  • Set title, tags, and author with a metadata block
  • Set published time with metadata to get a more precise timestamp than Hakyll can guess from the filename. Include a time zone. Use the right format.
  • When moving a post from draft to published:
    • Update the published time
    • Update the file name if the title or publish date has changed
  • If changing a post after it has been published: set updated time in the metadata
  • At some point, commit it to version control and sync it to other clones

I find this makes blogging feel tedious, especially after an absence when I’ve forgotten the details. Case in point, I managed to share an ugly template post with Planet Haskell readers while working on this one.

So I’m trying out this bash shell script, maybe it will help. Adjust to suit your setup.
(updated 2014/8/27)

# add to ~/.bashrc


# List recent blog posts.
alias blog-ls="ls $BLOGDIR | tail -10"

# Create a new hakyll-compatible draft blog post.
# blog-new ["The Title" ["tag1, tag2" ["Author Name"]]]
function blog-new() {
    TITLE=${1:-Default Title}
    TAGS=${2:-defaulttag1, defaulttag2}
    AUTHOR=${3:-Default Author Name}
    SLUG=${TITLE// /-}
    DATE=`date +"%Y-%m-%d"`
    TIME=`date +"%Y-%m-%d %H:%M:%S%Z"`
    echo creating $BLOGDIR/$FILE
    cat <<EOF >>$BLOGDIR/$FILE && emacsclient -n $BLOGDIR/$FILE
title:     $TITLE
tags:      $TAGS
author:    $AUTHOR
published: $TIME


An example:

$ blog-new 'Scripted Hakyll blog post creation' 'hakyll, haskell'
(file opens in emacs, edit & save)
$ make
./site build
  Creating store...
  Creating provider...
  Running rules...
Checking for out-of-date items
  updated blog/

See also part 2.

August 27, 2014 02:15 AM

Well-behaved Hakyll blog posts, continued

More hakyll blog fixes:

Ugly things showing on planets

My posts were showing unwanted things on planet haskell - double heading, redundant date, tag links, and ugly disqus html. By comparing with Jasper Van der Jeugt’s blog, I found the problem: I was snapshotting content for the feed at the wrong time, after applying the post template:

>>= return . fmap demoteHeaders
>>= loadAndApplyTemplate "templates/post.html"    (postCtx tags)
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/default.html" defaultContext


>>= saveSnapshot "content"  --
>>= return . fmap demoteHeaders
>>= loadAndApplyTemplate "templates/post.html"    (postCtx tags)
>>= loadAndApplyTemplate "templates/default.html" defaultContext

Manual feed publishing

The main blog feed is now generated with a _ prefix, and I must manually rename it (with make feed) to make it live it on Planet Haskell. This will hopefully reduce snafus (and not create new ones).

./site.hs 95
-    create ["blog.xml"] $ do
+    create ["_blog.xml"] $ do

./Makefile 14
+feed: _site/blog.xml
+_site/blog.xml: _site/_blog.xml
+	cp _site/_blog.xml _site/blog.xml

Better HTML titles

Changed the “Joyful Systems” prefix to a suffix in the HTML page titles, making search results and browser tab names more useful.

August 27, 2014 02:00 AM

FP Complete

IAP: conduit stream fusion

Both the changes described in this blog post, and in the previous blog post, are now merged to the master branch of conduit, and have been released to Hackage as conduit 1.2.0. That doesn't indicate stream fusion is complete (far from it!). Rather, the optimizations we have so far are valuable enough that I want them to be available immediately, and future stream fusion work is highly unlikely to introduce further breaking changes. Having the code on Hackage will hopefully also make it easier for others to participate in the discussion around this code.

Stream fusion

Last time, I talked about applying the codensity transform to speed up conduit. This greatly increases performance when performing many monadic binds. However, this does nothing to help us with speeding up the "categorical composition" of conduit, where we connect two components of a pipeline together so the output from the first flows into the second. conduit usually refers to this as fusion, but given the topic at hand (stream fusion), I think that nomenclature will become confusing. So let's stick to categorical composition, even though conduit isn't actually a category.

Duncan Coutts, Roman Leshchinskiy and Don Stewart wrote the stream fusion paper, and that technique has become integral to getting high performance in the vector and text packages. The paper is well worth the read, but for those unfamiliar with the technique, let me give a very brief summary:

  • GHC is very good at optimising non-recursive functions.
  • We express all of our streaming functions has a combination of some internal state, and a function to step over that state.
  • Stepping either indicates that the stream is complete, there's a new value and a new state, or there's a new state without a new value (this last case helps avoid recursion for a number of functions like filter).
  • A stream transformers (like map) takes a Stream as input and produces a new Stream as output.
  • The final consuming functions, like fold, are the only place where recursion happens. This allows all other components of the pipeline to be inlined, rewritten to more efficient formats, and optimized by GHC.

Let's see how this looks compared to conduit.

Data types

I'm going to slightly rename data types from stream fusion to avoid conflicts with existing conduit names. I'm also going to add an extra type parameter to represent the final return value of a stream; this is a concept that exists in conduit, but not common stream fusion.

data Step s o r
    = Emit s o
    | Skip s
    | Stop r
data Stream m o r = forall s. Stream
    (s -> m (Step s o r))
    (m s)

The Step datatype takes three parameters. s is the internal state used by the stream, o is the type of the stream of values it generates, and r is the final result value. The Stream datatype uses an existential to hide away that internal state. It then consists of a step function that takes a state and gives us a new Step, as well as an initial state value (which is a monadic action, for cases where we want to do some initialization when starting a stream).

Let's look at some functions to get a feel for what this programming style looks like:

enumFromToS_int :: (Integral a, Monad m) => a -> a -> Stream m a ()
enumFromToS_int !x0 !y =
    Stream step (return x0)
    step x | x <= y    = return $ Emit (x + 1) x
           | otherwise = return $ Stop ()

This function generates a stream of integral values from x0 to y. The internal state is the current value to be emitted. If the current value is less than or equal to y, we emit our current value, and update our state to be the next value. Otherwise, we stop.

We can also write a function that transforms an existing stream. mapS is likely the simplest example of this:

mapS :: Monad m => (a -> b) -> Stream m a r -> Stream m b r
mapS f (Stream step ms0) =
    Stream step' ms0
    step' s = do
        res <- step s
        return $ case res of
            Stop r -> Stop r
            Emit s' a -> Emit s' (f a)
            Skip s' -> Skip s'

The trick here is to make a function from one Stream to another. We unpack the input Stream constructor to get the input step and state functions. Since mapS has no state of its own, we simply keep the input state unmodified. We then provide our modified step' function. This calls the input step function, and any time it sees an Emit, applies the user-provided f function to the emitted value.

Finally, let's consider the consumption of a stream with a strict left fold:

foldS :: Monad m => (b -> a -> b) -> b -> Stream m a () -> m b
foldS f b0 (Stream step ms0) =
    ms0 >>= loop b0
    loop !b s = do
        res <- step s
        case res of
            Stop () -> return b
            Skip s' -> loop b s'
            Emit s' a -> loop (f b a) s'

We unpack the input Stream constructor again, get the initial state, and then loop. Each loop, we run the input step function.

Match and mismatch with conduit

There's a simple, straightforward conversion from a Stream to a Source:

toSource :: Monad m => Stream m a () -> Producer m a
toSource (Stream step ms0) =
    lift ms0 >>= loop
    loop s = do
        res <- lift $ step s
        case res of
            Stop () -> return ()
            Skip s' -> loop s'
            Emit s' a -> yield a >> loop s'

We extract the state, and then loop over it, calling yield for each emitted value. And ignoring finalizers for the moment, there's even a way to convert a Source into a Stream:

fromSource :: Monad m => Source m a -> Stream m a ()
fromSource (ConduitM src0) =
    Stream step (return $ src0 Done)
    step (Done ()) = return $ Stop ()
    step (Leftover p ()) = return $ Skip p
    step (NeedInput _ p) = return $ Skip $ p ()
    step (PipeM mp) = liftM Skip mp
    step (HaveOutput p _finalizer o) = return $ Emit p o

Unfortunately, there's no straightforward conversion for Conduits (transformers) and Sinks (consumers). There's simply a mismatch in the conduit world- which is fully continuation based- to the stream world- where the upstream is provided in an encapsulated value. I did find a few representations that mostly work, but the performance characteristics are terrible.

If anyone has insights into this that I missed, please contact me, as this could have an important impact on the future of stream fusion in conduit. But for the remainder of this blog post, I will continue under the assumption that only Source and Stream can be efficiently converted.


Once I accepted that I wouldn't be able to convert a stream transformation into a conduit transformation, I was left with a simple approach to start working on fusion: have two representations of each function we want to be able to fuse. The first representation would use normal conduit code, and the second would be streaming. This looks like:

data StreamConduit i o m r = StreamConduit
    (ConduitM i o m r)
    (Stream m i () -> Stream m o r)

Notice that the second field uses the stream fusion concept of a Stream-transforming function. At first, this may seem like it doesn't properly address Sources and Sinks, since the former doesn't have an input Stream, and the latter results in a single output value, not a Stream. However, those are really just special cases of the more general form used here. For Sources, we provide an empty input stream, and for Sinks, we continue executing the Stream until we get a Stop constructor with the final result. You can see both of these in the implementation of the connectStream function (whose purpose I'll explain in a moment):

connectStream :: Monad m
              => StreamConduit () i    m ()
              -> StreamConduit i  Void m r
              -> m r
connectStream (StreamConduit _ stream) (StreamConduit _ f) =
    run $ f $ stream $ Stream emptyStep (return ())
    emptyStep _ = return $ Stop ()
    run (Stream step ms0) =
        ms0 >>= loop
        loop s = do
            res <- step s
            case res of
                Stop r -> return r
                Skip s' -> loop s'
                Emit _ o -> absurd o

Notice how we've created an empty Stream using emptyStep and a dummy () state. And on the run side, we loop through the results. The type system (via the Void datatype) prevents the possibility of a meaningful Emit constructor, and we witness this with the absurd function. For Stop we return the final value, and Skip implies another loop.

Fusing StreamConduit

Assuming we have some functions that use StreamConduit, how do we get things to fuse? We still need all of our functions to have a ConduitM type signature, so we start off with a function to convert a StreamConduit into a ConduitM:

unstream :: StreamConduit i o m r -> ConduitM i o m r
unstream (StreamConduit c _) = c
{-# INLINE [0] unstream #-}

Note that we hold off on any inlining until simplification phase 0. This is vital to our next few rewrite rules, which is where all the magic happens.

The next thing we want to be able to do is categorically compose two StreamConduits together. This is easy to do, since a StreamConduit is made up of ConduitMs which compose via the =$= operator, and Stream transformers, which compose via normal function composition. This results in a function:

fuseStream :: Monad m
           => StreamConduit a b m ()
           -> StreamConduit b c m r
           -> StreamConduit a c m r
fuseStream (StreamConduit a x) (StreamConduit b y) = StreamConduit (a =$= b) (y . x)
{-# INLINE fuseStream #-}

That's very logical, but still not magical. The final trick is a rewrite rule:

{-# RULES "fuseStream" forall left right.
        unstream left =$= unstream right = unstream (fuseStream left right)

We're telling GHC that, if we see a composition of two streamable conduits, then we can compose the stream versions of them and get the same result. But this isn't enough yet; unstream will still end up throwing away the stream version. We now need to deal with running these things. The first case we'll handle is connecting two streamable conduits, which is where the connectStream function from above comes into play. If you go back and look at that code, you'll see that the ConduitM fields are never used. All that's left is telling GHC to use connectStream when appropriate:

{-# RULES "connectStream" forall left right.
        unstream left $$ unstream right = connectStream left right

The next case we'll handle is when we connect a streamable source to a non-streamable sink. This is less efficient than the previous case, since it still requires allocating ConduitM constructors, and doesn't expose as many opportunities for GHC to inline and optimize our code. However, it's still better than nothing:

connectStream1 :: Monad m
               => StreamConduit () i    m ()
               -> ConduitM      i  Void m r
               -> m r
connectStream1 (StreamConduit _ fstream) (ConduitM sink0) =
    case fstream $ Stream (const $ return $ Stop ()) (return ()) of
        Stream step ms0 ->
            let loop _ (Done r) _ = return r
                loop ls (PipeM mp) s = mp >>= flip (loop ls) s
                loop ls (Leftover p l) s = loop (l:ls) p s
                loop _ (HaveOutput _ _ o) _ = absurd o
                loop (l:ls) (NeedInput p _) s = loop ls (p l) s
                loop [] (NeedInput p c) s = do
                    res <- step s
                    case res of
                        Stop () -> loop [] (c ()) s
                        Skip s' -> loop [] (NeedInput p c) s'
                        Emit s' i -> loop [] (p i) s'
             in ms0 >>= loop [] (sink0 Done)
{-# INLINE connectStream1 #-}

{-# RULES "connectStream1" forall left right.
        unstream left $$ right = connectStream1 left right

There's a third case that's worth considering: a streamable sink and non-streamable source. However, I ran into two problems when implementing such a rewrite rule:

  • GHC did not end up firing the rule.

  • There are some corner cases regarding finalizers that need to be dealt with. In our previous examples, the upstream was always a stream, which has no concept of finalizers. But when the upstream is a conduit, we need to make sure to call them appropriately.

So for now, fusion only works for cases where all of the functions can by fused, or all of the functions before the $$ operator can be fused. Otherwise, we'll revert to the normal performance of conduit code.


I took the benchmarks from our previous blog post and modified them slightly. The biggest addition was including an example of enumFromTo =$= map =$= map =$= fold, which really stresses out the fusion capabilities, and demonstrates the performance gap stream fusion offers.

The other thing to note is that, in the "before fusion" benchmarks, the sum results are skewed by the fact that we have the overly eager rewrite rules for enumFromTo $$ fold (for more information, see the previous blog post). For the "after fusion" benchmarks, there are no special-case rewrite rules in place. Instead, the results you're seeing are actual artifacts of having a proper fusion framework in place. In other words, you can expect this to translate into real-world speedups.

You can compare before fusion and after fusion. Let me provide a few select comparisons:

Benchmark Low level or vector Before fusion After fusion Speedup
map + sum 5.95us 636us 5.96us 99%
monte carlo 3.45ms 5.34ms 3.70ms 71%
sliding window size 10, Seq 1.53ms 1.89ms 1.53ms 21%
sliding vector size 10, unboxed 2.25ms 4.05ms 2.33ms 42%

Note at the map + sum benchmark is very extreme, since the inner loop is doing very cheap work, so the conduit overhead dominated the analysis.

Streamifying a conduit

Here's an example of making a conduit function stream fusion-compliant, using the map function:

mapC :: Monad m => (a -> b) -> Conduit a m b
mapC f = awaitForever $ yield . f
{-# INLINE mapC #-}

mapS :: Monad m => (a -> b) -> Stream m a r -> Stream m b r
mapS f (Stream step ms0) =
    Stream step' ms0
    step' s = do
        res <- step s
        return $ case res of
            Stop r -> Stop r
            Emit s' a -> Emit s' (f a)
            Skip s' -> Skip s'
{-# INLINE mapS #-}

map :: Monad m => (a -> b) -> Conduit a m b
map = mapC
{-# INLINE [0] map #-}
{-# RULES "unstream map" forall f.
    map f = unstream (StreamConduit (mapC f) (mapS f))

Notice the three steps here:

  • Define a pure-conduit implementation (mapC), which looks just like conduit 1.1's map function.
  • Define a pure-stream implementation (mapS), which looks very similar to vector's mapS.
  • Define map, which by default simply reexposes mapC. But then, use an INLINE statement to delay inlining until simplification phase 0, and use a rewrite rule to rewrite map in terms of unstream and our two helper functions mapC and mapS.

While tedious, this is all we need to do for each function to expose it to the fusion framework.

Vector vs conduit, mapM style

Overall, vector has been both the inspiration for the work I've done here, and the bar I've used to compare against, since it is generally the fastest implementation you can get in Haskell (and tends to be high-level code to boot). However, there seems to be one workflow where conduit drastically outperforms vector: chaining together monadic transformations.

I put together a benchmark which does the same enumFromTo+map+sum benchmark I demonstrated previously. But this time, I have four versions: vector with pure functions, vector with IO functions, conduit with pure functions, and conduit with IO functions. You can see the results here, the important takeaway is:

  • Pure is always faster, since it exposes more optimizations to GHC.
  • vector and conduit pure are almost identical, at 57.7us and 58.1us.
  • Monadic conduit code does have a slowdown (86.3us). However, monadic vector code has a drastic slowdown (305us), presumably because monadic binds defeat its fusion framework.

So there seems to be at least one workflow for which conduit's fusion framework can outperform even vector!


The biggest downside to this implementation of stream fusion is that we need to write all of our algorithms twice. This can possibly be mitigated by having a few helper functions in place, and implementing others in terms of those. For example, mapM_ can be implemented in terms foldM.

There's one exception to this: using the streamSource function, we can convert a Stream into a Source without having to write our algorithm twice. However, due to differences in how monadic actions are performed between Stream and Conduit, this could introduce a performance degredation for pure Sources. We can work around that with a special case function streamSourcePure for the Identity monad as a base.

Getting good performance

In order to take advantage of the new stream fusion framework, try to follow these guidelines:

  • Use fusion functions whenever possible. Explicit usage of await and yield will immediately kick you back to non-fusion (the same as explicit pattern matching defeats list fusion).
  • If you absolutely cannot use an existing fusion function, consider writing your own fusion variant.
  • When mixing fusion and non-fusion, put as many fusion functions as possible together with the $= operator before the connect operator $$.

Next steps

Even though this work is now publicly available on Hackage, there's still a lot of work to be done. This falls into three main categories:

  • Continue rewriting core library functions in streaming style. Michael Sloan has been working on a lot of these functions, and we're hoping to have almost all the combinators from Data.Conduit.List and Data.Conduit.Combinators done soon.
  • Research why rewrite rules and inlining don't play nicely together. In a number of places, we've had to explicitly use rewrite rules to force fusion to happen, when theoretically inlining should have taken care of it for us.
  • Look into any possible alternative formulations of stream fusion that provide better code reuse or more reliable rewrite rule firing.

Community assistance on all three points, but especially 2 and 3, are much appreciated!

August 27, 2014 12:00 AM

August 26, 2014

Edward Z. Yang

A taste of Cabalized Backpack

So perhaps you've bought into modules and modularity and want to get to using Backpack straightaway. How can you do it? In this blog post, I want to give a tutorial-style taste of how to program Cabal in the Backpack style. None of these examples are executable, because only some of this system is in GHC HEAD--the rest are on branches awaiting code review or complete vaporware. However, we've got a pretty good idea how the overall design and user experience should go, and so the purpose of this blog post is to communicate that idea. Comments and suggestions would be much appreciated; while the design here is theoretically well-founded, for obvious reasons, we don't have much on-the-ground programmer feedback yet.

A simple package in today's Cabal

To start, let's briefly review how Haskell modules and Cabal packages work today. Our running example will be the bytestring package, although I'll inline, simplify and omit definitions to enhance clarity.

Let's suppose that you are writing a library, and you want to use efficient, packed strings for some binary processing you are doing. Fortunately for you, the venerable Don Stewart has already written a bytestring package which implements this functionality for you. This package consists of a few modules: an implementation of strict ByteStrings...

module Data.ByteString(ByteString, empty, singleton, ...) where
  data ByteString = PS !(ForeignPtr Word8) !Int !Int
  empty :: ByteString
  empty = PS nullForeignPtr 0 0

...and an implementation of lazy ByteStrings:

module Data.ByteString.Lazy(ByteString, empty, singleton, ...) where
  data ByteString = Empty | Chunk !S.ByteString ByteString
  empty :: ByteString
  empty = Empty

These modules are packaged up into a package which is specified using a Cabal file (for now, we'll ignore the ability to define libraries/executables in the same Cabal file and assume everything is in a library):

name: bytestring
build-depends: base >= 4.2 && < 5, ghc-prim, deepseq
exposed-modules: Data.ByteString, Data.ByteString.Lazy, ...
other-modules: ...

We can then make a simple module and package which depends on the bytestring package:

module Utils where
  import Data.ByteString.Lazy as B
  blank :: IO ()
  blank = B.putStr B.empty
name: utilities
version: 0.1
build-depends: base, bytestring >= 0.10
exposed-modules: Utils

It's worth noting a few things about this completely standard module setup:

  1. It's not possible to switch Utils from using lazy ByteStrings to strict ByteStrings without literally editing the Utils module. And even if you do that, you can't have Utils depending on strict ByteString, and Utils depending on lazy ByteString, in the same program, without copying the entire module text. (This is not too surprising, since the code really is different.)
  2. Nevertheless, there is some amount of indirection here: while Utils includes a specific ByteString module, it is unspecified which version of ByteString it will be. If (hypothetically) the bytestring library released a new version where lazy byte-strings were actually strict, the functionality of Utils would change accordingly when the user re-ran dependency resolution.
  3. I used a qualified import to refer to identifiers in Data.ByteString.Lazy. This is a pretty common pattern when developing Haskell code: we think of B as an alias to the actual model. Textually, this is also helpful, because it means I only have to edit the import statement to change which ByteString I refer to.

Generalizing Utils with a signature

To generalize Utils with some Backpack magic, we need to create a signature for ByteString, which specifies what the interface of the module providing ByteStrings is. Here one such signature, which is placed in the file Data/ByteString.hsig inside the utilities package:

module Data.ByteString where
  import Data.Word
  data ByteString
  instance Eq ByteString
  empty :: ByteString
  singleton :: Word8 -> ByteString
  putStr :: ByteString -> IO ()

The format of a signature is essentially the same of that of an hs-boot file: we have normal Haskell declarations, but omitting the actual implementations of values.

The utilities package now needs a new field to record signatures:

name: utilities
indefinite: True
build-depends: base
exposed-modules: Utils
required-signatures: Data.ByteString

Notice that there have been three changes: (1) We've removed the direct dependency on the bytestring package, (2) we've added a new field indefinite, which indicates that this indefinite package has signatures and cannot be compiled until those signatures are filled in with implementations (this field is strictly redundant, but is useful for documentation purposes, as we will see later), and (3) we have a new field required-signatures which simply lists the names of the signature files (also known as holes) that we need filled in.

How do we actually use the utilities package, then? Let's suppose our goal is to produce a new module, Utils.Strict, which is Utils but using strict ByteStrings (which is exported by the bytestring package under the module name Data.ByteString). To do this, we'll need to create a new package:

name: strict-utilities
build-depends: utilities, bytestring
reexported-modules: Utils as Utils.Strict

That's it! strict-utilities exports a single module Utils.Strict which is utilities using Data.ByteString from bytestring (which is the strict implementation). This is called a mix-in: in the same dependency list, we simply mix together:

  • utilities, which requires a module named Data.ByteString, and
  • bytestring, which supplies a module named Data.ByteString.

Cabal automatically figures out that how to instantiate the utilities package by matching together module names. Specifically, the two packages above are connected through the module name Data.ByteString. This makes for a very convenient (and as it turns out, expressive) mode of package instantiation. By the way, reexported-modules is a new (orthogonal) feature which lets us reexport a module from the current package or a dependency to the outside world under a different name. The modules that are exported by the package are the exposed-modules and the reexported-modules. The reason we distinguish them is to make clear which modules have source code in the package (exposed-modules).

Unusually, strict-utilities is a package that contains no code! Its sole purpose is to mix existing packages.

Now, you might be wondering: how do we instantiate utilities with the lazy ByteString implementation? That implementation was put in Data.ByteString.Lazy, so the names don't match up. In this case, we can use another new feature, module thinning and renaming:

name: lazy-utilities
  bytestring (Data.ByteString.Lazy as Data.ByteString)
reexported-modules: Utils as Utils.Lazy

The utilities dependency is business as usual, but bytestring has a little parenthesized expression next to it. This expression is the thinning and renaming applied to the package import: it controls what modules are brought into the scope of the current package from a dependency, possibly renaming them to different names. When I write build-depends: bytestring (Data.ByteString.Lazy as Data.ByteString), I am saying "I depend on the bytestring package, but please only make the Data.ByteString.Lazy module available under the name Data.ByteString when considering module imports, and ignore all the other exposed modules." In strict-utilities, you could have also written bytestring (Data.ByteString), because this is the only module that utilities uses from bytestring.

An interesting duality is that you can do the renaming the other way:

name: lazy-utilities
  utilities (Utils, Data.ByteString as Data.ByteString.Lazy),

Instead of renaming the implementation, I renamed the hole! It's equivalent: the thing that matters it that the signature and implementation need to be mixed under the same name in order for linking (the instantiation of the signature with the implementation) to occur.

There are a few things to note about signature usage:

  1. If you are using a signature, there's not much point in also specifying an explicit import list when you import it: you are guaranteed to only see types and definitions that are in the signature (modulo type classes... a topic for another day). Signature files act like a type-safe import list which you can share across modules.

  2. A signature can, and indeed often must, import other modules. In the type signature for singleton in Data/ByteString.hsig, we needed to refer to a type Word8, so we must bring it into scope by importing Data.Word.

    Now, when we compile the signature in the utilities package, we need to know where Data.Word came from. It could have come from another signature, but in this case, it's provided by the definite package base: it's a proper concrete module with an implementation! Signatures can depend on implementations: since we can only refer to types from those modules, we are saying, in effect: any implementation of the singleton function and any representation of the ByteString type is acceptable, but regarding Word8 you must use the specific type from Data.Word in prelude.

  3. What happens if, independently of my packages strict-utilities, someone else also instantiatiates utilities with Data.ByteString? Backpack is clever enough to reuse the instantiation of utilities: this property is called applicativity of the module system. The specific rule that we use to decide if the instantiation is the same is to look at how all of the holes needed by a package are instantiated, and if they are instantiated with precisely the same modules, the instantiated packages are considered type equal. So there is no need to actually create strict-utilities or lazy-utilities: you can just instantiate utilities on the fly.

Mini-quiz: What does this package do?

name: quiz-utilities
  utilities (Utils, Data.ByteString as B),
  bytestring (Data.ByteString.Lazy as B)

Sharing signatures

It's all very nice to be able to explicitly write a signature for Data.ByteString in my package, but this could get old if I have to do this for every single package I depend on. It would be much nicer if I could just put all my signatures in a package and include that when I want to share it. I want all of the Hackage mechanisms to apply to my signatures as well as my normal packages (e.g. versioning). Well, you can!

The author of bytestring can write a bytestring-sig package which contains only signatures:

name: bytestring-sig
version: 1.0
indefinite: True
build-depends: base
exposed-signatures: Data.ByteString

...and declare that the bytestring package satisfies this signature:

name: bytestring
implements: bytestring-sig-1.0

The implements fields is purely advisory: it offers a proactive check to library authors to make sure they aren't breaking compatibility with signatures, and it also helps Cabal offer suggestions for how to provide implementations for signatures.

Now, utilities can include this package to indicate its dependence on the signature:

name: utilities
indefinite: True
build-depends: base, bytestring-sig-1.0
exposed-modules: Utils

Unlike normal dependencies, signature dependencies should be exact: after all, while you might want an upgraded implementation, you don't want the signature to change on you!

Another interesting difference is that we specified the signatures using exposed-signatures, as opposed to required-signatures. We can summarize all of the fields as follows:

  1. exposed-modules says that there is a public module defined in this package
  2. other-modules says that there is a private module defined in this package
  3. exposed-signatures says that there is a public signature defined in this package
  4. required-signatures says that there is a "private" signature defined in this package
  5. reexported-modules says that there is a public module or signature defined in a dependency.

In this list, public means that it is available to clients. Notice the first four fields list all of the source code in this package. Here is a simple example of a client:

name: utilities-extras
indefinite: True
build-depends: utilities
exposed-modules: Utils.Extra

Utils/Extra.hs defined in this package can import Utils (because it's exposed by utilities) but can't import Data.ByteString (because it's not exposed). Had we said reexported-modules: Data.ByteString in utilities, then Data.ByteString would have been accessible.

Do note, however, that the package is still indefinite (since it depends on an indefinite package). Despite Data.ByteString being "private" to utilities (not importable), a client may still refer to it in a renaming clause in order to instantiate the module:

name: utilities-extras-lazy
  utilities-extras (Data.ByteString as Data.ByteString.Lazy),

You can't "hide" holes altogether: that would be like saying, "I'm never going to say what the actual implementation is!" But you can choose not to directly rely on them.

By the way, if Utils/Extra.hs, in utilities-extras, wanted to import Data.ByteString (even though utilities did not expose it), utilities-extras simply needs directly depend on the signature package:

name: utilities-extras
indefinite: True
build-depends: utilities, bytestring-sig == 1.0
exposed-modules: Utils.Extra

The Data.ByteString hole from utilities and the new hole included here are automatically checked for compatibility and linked together: you only need to provide one implementation for both of them.

Mini-quiz: What does this package do? Specifically, if I include it in a package, what modules are available for import?

name: attoparsec-sig
version: 1.0
indefinite: True
build-depends: base, bytestring-sig
exposed-signatures: Data.Attoparsec


We've covered a lot of ground, but when it comes down to it, Backpack really comes together because of set of orthogonal features which interact in a good way:

  1. Module signatures (mostly implemented but needs lots of testing): the heart of a module system, giving us the ability to write indefinite packages and mix together implementations,
  2. Module reexports (fully implemented and in HEAD): the ability to take locally available modules and reexport them under a different name, and
  3. Module thinning and renaming (fully implemented and in code review): the ability to selectively make available modules from a dependency.

To compile a Backpack package, we first run the traditional version dependency solving, getting exact versions for all packages involved, and then we calculate how to link the packages together. That's it! In a future blog post, I plan to more comprehensively describe the semantics of these new features, especially module signatures, which can be subtle at times. Also, note that I've said nothing about how to type-check against just a signature, without having any implementation in mind. As of right now, this functionality is vaporware; in a future blog post, I also plan on saying why this is so challenging.

by Edward Z. Yang at August 26, 2014 10:01 PM

Chris Smith

On CodeWorld and Haskell

I’ve been pouring a lot of effort into CodeWorld lately… and I wanted to write a sort of apology to the Haskell community.  Well, perhaps not an apology, because I believe I did the right thing.  But at the same time, I realize that decisions I’ve made haven’t been entirely popular among Haskell programmers.  I’d like to explain what happened, and try to make it up to you!

What Happened

Originally, I started this project using Haskell and the excellent gloss package, by Ben Lippmeier.  CodeWorld has been moving slowly further and further away from the rest of the Haskell community.  This has happened in a sequence of steps:

  1. Way back in 2011, I started “CodeWorld”, but at the time, I called it Haskell for Kids.  At the time, I understood that the reasons I’d chosen Haskell as a language were not about cool stuff like type classes (which I love) and monads and categories and other commonplace uses of solid abstractions (which fascinate me).  Instead, I chose Haskell for the simple reason that it looked like math.  The rest of Haskell came with the territory.  I built the first CodeWorld web site in a weekend, and I had to settle on a language and accept all that came with it.
  2. From the beginning, I made some changes for pedagogical reasons.  For example, gloss defines rotation to be clockwise.  I insisted on rotation working in the counter-clockwise direction, because that’s the convention universally used in math.  Later, I resized the canvas to 20×20, so that typical programs would need to use fractions and decimals, which is a middle school math education goal.  I made thes changes, even though they broke compatibility with a widely used package.  Sorry for anyone that’s struggled with this.
  3. I rebranded “Haskell for Kids” as CodeWorld, and stopped explicitly depending on gloss in favor of just reproducing its general approach in a new Prelude.  This was a deliberate attempt to get away from focusing on the Haskell language and libraries, and also to the accompanying import statements and such.  This hid the ways that Haskell was a general purpose language with uses outside this toy environment.  That is unfortunate.
  4. I rewrote the Haskell Prelude, to remove type classes.  Along the way, I collapsed the whole numeric type class hierarchy into a single type, and even got Luite (the author of GHCJS) to help me with some deep black magic to implement equality on arbitrary Haskell types without type classes.  This threw away much of the beauty of Haskell… in favor of dramatically improved error messages, and fewer things you need to know to get started.  It was a real loss.
  5. Finally, I commited the unforgivable sin.  I dropped curried functions, in favor of defining functions of multiple parameters using tuples.  This finally makes CodeWorld feel like a completely different language from Haskell.  That really sucks, and I know some people are frustrated.

Why It Happened?

First, I want to point out some things that are not the reason for any of this:

  • I did not do this because I think there’s something wrong with Haskell.  I love type classes.  I love currying, and especially love how it’s not just a convenient trick, but sometimes introduces whole new perspectives by viewing tedious functions of multiple parameters as simple, clean, and elegant higher-order functions.
  • I also did not do this because I think anyone is incapable of learning full-fledged Haskell.  In fact, I taught full-fledged Haskell to middle schoolers for a year.  I know they can do it.

So why did I do it?  Two reasons:

  • Teaching mathematics has always been more important to me than teaching Haskell.  While Haskell is an awesome programming language, mathematics is just an awesome perspective on life.  For every student who benefits from learning an inspiring programming language, many students will benefit from learning that humanity has a method called mathematics for thinking about fundamental truths in a systematic, logical way that can capture things precisely.  So any time I have to choose between pushing students further toward their math education or away from it, I’ll choose toward it.
  • Details matter.  Even though I know kids are capable of a lot, they are capable of a lot more without artificial obstacles in their way.  I learned this the hard way teaching this class the first time.  The smallest little things, with absolutely no great significance as a language, matter a lot.  Having to put parentheses around negative numbers obscures students from reaching leaps of understanding.  Confusing error messages mean the difference between a student who spends a weekend learning, and one who gives up on Friday afternoon and doesn’t think about it until the next school day.  Different surface syntax means that a lot of kids never fully make the connection that functions here are the same thing as functions there.

In the end, I do think these were the right decisions… despite the frustration they can cause for Haskell programmers who know there’s a better way.

Making Up For It

A couple weekends ago, though, I worked on something to hopefully restore some of this loss for Haskellers.  You see, all the changes I’ve made, in the end, come from replacing the Prelude module with my own alternative.  Specifically:

  1. I deliberately replaced functions from the Prelude with my modified versions.
  2. Because I provided an alternative Prelude, I had to hide the base package, which made it impossible to import things like Control.Monad.  This was not a deliberate decision.  It just happened.

So I fixed this.  I added to the codeworld-base package re-exports of all of the modules from base.  I renamed Prelude to HaskellPrelude in the process, so that it doesn’t conflict with my own Prelude.  And finally, I added a new module, CodeWorld, that exports all the really new stuff from CodeWorld like pictures, colors, and the interpreters for pictures, animations, simulations, etc.  The result is that you can now start your programs with the following:

import Prelude()
import HaskellPrelude
import CodeWorld -- If you still want to do pictures, etc.

main = putStrLn "Hello, World"

At this point, you can write any Haskell you like!  You aren’t even constrained to pure code, or safe code.  (The exception: TemplateHaskell is still rejected, since the compiler runs on the server, so TH code would execute code on the server.)

In fact, it’s even better!  You’re free to use GHCJS JavaScript foreign imports, to interact with the browser environment!  See a brief example here.  Now you’re out of the sandbox, and are free to play around however you like.

Right now, the CodeWorld module still uses uncurried functions and other CodeWorld conventions like Number for numbers, etc.  There’s no reason for this, and it’s something that I should probably change.  Anyone want to send a pull request?

by cdsmith at August 26, 2014 04:38 PM

Dominic Steinitz

Haskell Vectors and Sampling from a Categorical Distribution


Suppose we have a vector of weights which sum to 1.0 and we wish to sample n samples randomly according to these weights. There is a well known trick in Matlab / Octave using sampling from a uniform distribution.

num_particles = 2*10^7
likelihood = zeros(num_particles,1);
likelihood(:,1) = 1/num_particles;
[_,index] = histc(rand(num_particles,1),[0;cumsum(likelihood/sum(likelihood))]);
s = sum(index);

Using tic and toc this produces an answer with

Elapsed time is 10.7763 seconds.


I could find no equivalent function in Haskell nor could I easily find a binary search function.

> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing  #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
> {-# LANGUAGE BangPatterns                 #-}
> import System.Random.MWC
> import qualified Data.Vector.Unboxed as V
> import Control.Monad.ST
> import qualified Data.Vector.Algorithms.Search as S
> import Data.Bits
> n :: Int
> n = 2*10^7

Let’s create some random data. For a change let’s use mwc-random rather than random-fu.

> vs  :: V.Vector Double
> vs = runST (create >>= (asGenST $ \gen -> uniformVector gen n))

Again, I could find no equivalent of cumsum but we can write our own.

> weightsV, cumSumWeightsV :: V.Vector Double
> weightsV = V.replicate n (recip $ fromIntegral n)
> cumSumWeightsV = V.scanl (+) 0 weightsV

Binary search on a sorted vector is straightforward and a cumulative sum ensures that the vector is sorted.

> binarySearch :: (V.Unbox a, Ord a) =>
>                 V.Vector a -> a -> Int
> binarySearch vec x = loop 0 (V.length vec - 1)
>   where
>     loop !l !u
>       | u <= l    = l
>       | otherwise = let e = vec V.! k in if x <= e then loop l k else loop (k+1) u
>       where k = l + (u - l) `shiftR` 1
> indices :: V.Vector Double -> V.Vector Double -> V.Vector Int
> indices bs xs = (binarySearch bs) xs

To see how well this performs, let’s sum the indices (of course, we wouldn’t do this in practice) as we did for the Matlab implementation.

> js :: V.Vector Int
> js = indices (V.tail cumSumWeightsV) vs
> main :: IO ()
> main = do
>   print $ V.foldl' (+) 0 js

Using +RTS -s we get

Total   time   10.80s  ( 11.06s elapsed)

which is almost the same as the Matlab version.

I did eventually find a binary search function in vector-algorithms and since one should not re-invent the wheel, let us try using it.

> indices' :: (V.Unbox a, Ord a) => V.Vector a -> V.Vector a -> V.Vector Int
> indices' sv x = runST $ do
>   st <- V.unsafeThaw (V.tail sv)
>   V.mapM (S.binarySearch st) x
> main' :: IO ()
> main' = do
>   print $  V.foldl' (+) 0 $ indices' cumSumWeightsV vs

Again using +RTS -s we get

Total   time   11.34s  ( 11.73s elapsed)

So the library version seems very slightly slower.

by Dominic Steinitz at August 26, 2014 03:05 PM

August 25, 2014

Danny Gratzer

Introduction to Dependent Types: Haskell on Steroids

Posted on August 25, 2014

I’d like to start another series of blog posts. This time on something that I’ve wanted to write about for a while, dependent types.

There’s a noticeable lack of accessible materials introducing dependent types at a high level aimed at functional programmers. That’s what this series sets out help fill. Therefore, if you’re a Haskell programmer and don’t understand something, it’s a bug! Please comment so I can help make this a more useful resource for you :)

There are four parts to this series, each answering one question

  1. What are dependent types?
  2. What does a dependently typed language look like?
  3. What does it feel like to write programs with dependent types?
  4. What does it mean to “prove” something?

So first things first, what are dependent types? Most people by now have heard the unhelpful quick answer

A dependent type is a type that depends on a value, not just other types.

But that’s not helpful! What does this actually look like? To try to understand this we’re going to write some Haskell code that pushes us as close as we can get to dependent types in Haskell.

Kicking GHC in the Teeth

Let’s start with the flurry of extensions we need

{-# LANGUAGE DataKinds            #-}
{-# LANGUAGE KindSignatures       #-}
{-# LANGUAGE GADTs                #-}
{-# LANGUAGE TypeFamilies         #-}
{-# LANGUAGE UndecidableInstances #-}

Now our first definition is a standard formulation of natural numbers

    data Nat = Z | S Nat

Here Z represents 0 and S means + 1. So you should read S Z as 1, S (S Z) as 2 and so on and so on.

If you’re having some trouble, this function to convert an Int to a Nat might help

    -- Naively assume n >= 0
    toNat :: Int -> Nat
    toNat 0 = Z
    toNat n = S (toNat $ n - 1)

We can use this definition to formulate addition

    plus :: Nat -> Nat -> Nat
    plus Z n     = n
    plus (S n) m = S (plus n m)

This definition proceeds by “structural induction”. That’s a scary word that pops up around dependent types. It’s not all that complicated, all that it means is that we use recursion only on strictly smaller terms.

There is a way to formally define smaller, if a term is a constructor applied to several (recursive) arguments. Any argument to the constructor is strictly smaller than the original terms. In a strict language if we restrict ourselves to only structural recursion we’re guaranteed that our function will terminate. This isn’t quite the case in Haskell since we have infinite structures.

    toInt :: Nat -> Int
    toInt (S n) = 1 + toInt n
    toInt Z     = 0

    bigNumber = S bigNumber

    main = print (toInt bigNumber) -- Uh oh!

Often people will cheerfully ignore this part of Haskell when talking about reasoning with Haskell and I’ll stick to that tradition (for now).

Now back to the matter at hand. Since our definition of Nat is quite straightforward, it get’s promoted to the kind level by DataKinds.

Now we can “reflect” values back up to this new kind with a second GADTed definition of natural numbers.

    data RNat :: Nat -> * where
      RZ :: RNat Z
      RS :: RNat n -> RNat (S n)

Now, let’s precisely specify the somewhat handwavy term “reflection”. I’m using it in the imprecise sense meaning that we’ve lifted a value into something isomorphic at the type level. Later we’ll talk about reflection precisely mean lifting a value into the type level. That’s currently not possible since we can’t have values in our types!

What on earth could that be useful for? Well with this we can do something fancy with the definition of addition.

    type family Plus n m :: Nat where
      Plus Z n     = n
      Plus (S n) m = S (Plus n m)

Now we’ve reflected our definition of addition to the type family. More than that, what we’ve written above is fairly obviously correct. We can now force our value level definition of addition to respect this type family

    plus' :: RNat n -> RNat m -> RNat (Plus n m)
    plus' RZ n     = n
    plus' (RS n) m = RS (plus' n m)

Now if we messed up this definition we’d get a type error!

    plus' :: RNat n -> RNat m -> RNat (Plus n m)
    plus' RZ n     = n
    plus' (RS n) m = plus' n m -- Unification error! n ~ S n

Super! We know have types that express strict guarantees about our program. But how useable is this?

To put it to the test, let’s try to write some code that reads to integers for standard input and prints their sum.

We can easily do this with our normal plus

    readNat :: IO Nat
    readNat = toNat <$> readLn

    main :: IO ()
    main = plus <$> readNat <*> readNat

Easy as pie! But what about RNat, how can we convert a Nat to an RNat? Well we could try something with type classes I guess

class Reify a where
  type N
  reify :: a -> RNat N

But wait, that doesn’t work since we can only have once instance for all Nats. What if we did the opposite

class Reify (n :: Nat) where
  nat :: RNat n -> Nat

This let’s us go in the other direction.. but that doesn’t help us! In fact there’s no obvious way to propagate runtime values back into the types. We’re stuck.

GHC with Iron Dentures

Now, if we could add some magical extension to GHC could we write something like above program? Yes of course! The key idea is to not reflect up our types with data kinds, but rather just allow the values to exist in the types on their own.

For these I propose two basic ideas

  1. A special reflective function type
  2. Lifting expressions into types

For our special function types, we allow the return type to use the supplied value. These are called pi types. We’ll give this the following syntax

(x :: A) -> B x

Where A :: * and B :: A -> * are some sort of type. Notice that that A in B’s kind isn’t the data kind promoted version, but just the goodness to honest normal value.

Now in order to allow B to actually make use of it’s supplied value, our second idea let’s normal types be indexed on values! Just like how GADTs can be indexed on types. We’ll call these GGADTs.

So let’s define a new version of RNat

    data RNat :: Nat -> * where
      RZ :: RNat Z
      RS :: RNat n -> RNat (S n)

This looks exactly like what we had before, but our semantics are different now. Those Z’s and S’s are meant to represent actual values, not members of some kind. There’s no promoting types to singleton kinds anymore, just plain old values being held in fancier types.

Because we can depend on normal values, we don’t even have to use our simple custom natural numbers.

    data RInt :: Int -> * where
      RZ :: RInt 0
      RS :: RInt n -> RInt (1 + n)

Notice that we allowed our types to call functions, like +. This can potentially be undecidable, something that we’ll address later.

Now we can write our function with a combination of these two ideas

    toRInt :: (n :: Int) -> RInt n
    toRInt 0 = RZ
    toRInt n = RS (toRInt $ n - 1)

Notice how we used pi types to change the return type dependent on the input value. Now we can feed this any old value, including ones we read from standard input.

    main = print . toInt $ plus' <$> fmap toRInt readLn <*> fmap toRInt readLn

Now, one might wonder how the typechecker could possibly know how to handle such things, after all how could it know what’ll be read from stdin!

The answer is that it doesn’t. When a value is reflected to the type level we can’t do anything with it. For example, if we had a type like

    (n :: Int) -> (if n == 0 then Bool else ())

Then we would have to pattern match on n at the value level to propagate information about n back to the type level.

If we did something like

    foo :: (n :: Int) -> (if n == 0 then Bool else ())
    foo n = case n of
      0 -> True
      _ -> ()

Then the typechecker would see that we’re matching on n, so if we get into the 0 -> ... branch then n must be 0. It can then reduce the return type to if 0 == 0 then Bool else () and finally Bool. A very important thing to note here is that the typechecker doesn’t evaluate the program. It’s examining the function in isolation of all other values. This means we sometimes have to hold its hand to ensure that it can figure out that all branches have the correct type.

This means that when we use pi types we often have to pattern match on our arguments in order to help the typechecker figure out what’s going on.

To make this clear, let’s play the typechecker for this function. I’m reverting to the Nat type since it’s nicer for pattern matching.

    toRNat :: (n :: Nat) -> RNat n
    toRNat Z = RZ -- We know that n is `Z` in this branch
    toRNat (S n) = RS (toRNat n {- This has the type RNat n' -})

    p :: (n :: Nat) -> (m :: Int) -> RNat (plus n m)
    p Z m     = toRNat m
    p (S n) m = RS (toRNat n m)

First the type checker goes through toRNat.

In the first branch we have n equals Z, so RZ trivially typechecks. Next we have the case S n.

  • We know that toRNat n has the type RNat n' by induction
  • We also know that S n' = n.
  • Therefore RS builds us a term of type RNat n.

Now for p. We start in much the same manner.

if we enter the p Z m case

  • we know that n is Z.
  • we can reduce plus n m since plus Z m is by definition equal to m Look at the definition of plus to confirm this).
  • We know how to produce RNat m easily since we have a function toRNat :: (n :: Nat) -> RNat n.
  • We can apply this to m and the resulting term has the type RNat m.

In the RS case we know that we’re trying to produce a term of type RNat (plus (S n) m).

  • Now since we know that the constructor for the first argument of plus, we can reduce plus (S n) m to S (plus n m) by the definition of plus.
  • We’re looking to build a term of type plus n m and that’s as simple as a recursive call.
  • From here we just need to apply RS to give us S (plus n m)
  • As we previously noted S (plus n m) is equal to plus (S n) m

Notice how as we stepped through this as the typechecker we never needed to do any arbitrary reductions. We only ever reduce definitions when we have the outer constructor (WHNF) of one of the arguments.

While I’m not actually proposing adding {-# LANGUAGE PiTypes #-} to GHC, it’s clear that with only a few orthogonal editions to system F we can get some seriously cool types.

Wrap Up

Believe or not we’ve just gone through two of the most central concepts in dependent types

  • Indexed type families (GGADTs)
  • Dependent function types (Pi types)

Not so bad was it? :) From here we’ll look in the next post how to translate our faux Haskell into actual Agda code. From there we’ll go through a few more detailed examples of pi types and GGADTs by poking through some of the Agda standard library.

Thanks for reading, I must run since I’m late for class. It’s an FP class ironically enough.

<script type="text/javascript"> /* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */ var disqus_shortname = 'codeco'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = '//' + disqus_shortname + ''; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

August 25, 2014 12:00 AM

August 24, 2014

Alejandro Serrano Mena

Using Emacs for Haskell development

In the last months, the toolchain for using Haskell within Emacs has changed a lot, and has become a lot better. Apart from my additions to ghc-mod, new autocompletion packages such as company-ghc have appeared.

In the past, I've felt that there was a need for a comprehensive article of all the available options for Haskell development in Emacs, including haskell-mode, ghc-mod, company-ghc, HaRe and structured-haskell-mode. To fill this gap, I have written a tutorial covering installation, configuration and use of these tools, especially keeping an eye into making all of them work nicely when put together.

Hope it helps!

by Alejandro Serrano ( at August 24, 2014 10:48 AM

Summer of Code on Emacs!

This summer I've been participating in Google Summer of Code, as I did some years ago. My aim was the same: to make it easier for Haskell developers to interact with their code. But instead of Eclipse, I've focused on another very well-known editor: Emacs. In particular, I've been extending the already excellent ghc-mod.

During the last year I've turned increasingly jealous of the Emacs modes for Agda and Idris, two programming languages which resemble Haskell but add dependent types to the mix. Using those modes, you can work interactively with your code, write pattern matches automatically, refine certain parts of your code, ask the compiler what is the type that a certain code should have, and so on. Furthermore, since version 7.8 GHC includes support for typed holes, so it seemed like all the necessary infrastructure from the compiler was in place to do this.

Instead of a boring description of the outcome of the project, I have prepared a video demostration ;)

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="450" src="" width="600"></iframe>

As a summary, here is the list of new key bindings that you can use since the release (just a few days ago) of ghc-mod 5.0:

  • C-u M-t: create the skeleton of a function from its signature, or the skeleton of a type class instance from its declaration,
  • M-t: perform case splitting on variables;
  • C-c M-n and C-c M-p: navigate between typed holes in your program, to the next or the previous one, respectively;
  • C-c C-f: refine a hole through an expression, including as much holes as needed to make it type check;
  • C-c C-a: try automatic completion of a hole by calling Djinn.

I would like to thank eveybody who has helped me during this summer, especially my mentor David Raymond Christiansen (whose work in idris-mode is just amazing) and Kazu Yamamoto, the creator and maintainer of ghc-mod.

by Alejandro Serrano ( at August 24, 2014 09:09 AM

August 23, 2014

Antti-Juhani Kaijanaho (ibid)

A milestone toward a doctorate

Yesterday I received my official diploma for the degree of Licentiate of Philosophy. The degree lies between a Master’s degree and a doctorate, and is not required; it consists of the coursework required for a doctorate, and a Licentiate Thesis, “in which the student demonstrates good conversance with the field of research and the capability of independently and critically applying scientific research methods” (official translation of the Government decree on university degrees 794/2004, Section 23 Paragraph 2).

The title and abstract of my Licentiate Thesis follow:

Kaijanaho, Antti-Juhani
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Finnish summary

Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.

Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis

A Licentiate Thesis is assessed by two examiners, usually drawn from outside of the home university; they write (either jointly or separately) a substantiated statement about the thesis, in which they suggest a grade. The final grade is almost always the one suggested by the examiners. I was very fortunate to have such prominent scientists as Dr. Stefan Hanenberg and Prof. Stein Krogdahl as the examiners of my thesis. They recommended, and I received, the grade “very good” (4 on a scale of 1–5).

The thesis has been accepted for publication published in our faculty’s licentiate thesis series and will in due course appear has appeared in our university’s electronic database (along with a very small number of printed copies). In the mean time, if anyone wants an electronic preprint, send me email at

<figure class="wp-caption aligncenter" id="attachment_1622" style="width: 334px;">Figure 1 of the thesis: an overview of the mapping process<figcaption class="wp-caption-text">Figure 1 of the thesis: an overview of the mapping process</figcaption></figure>

As you can imagine, the last couple of months in the spring were very stressful for me, as I pressed on to submit this thesis. After submission, it took me nearly two months to recover (which certain people who emailed me on Planet Haskell business during that period certainly noticed). It represents the fruit of almost four years of work (way more than normally is taken to complete a Licentiate Thesis, but never mind that), as I designed this study in Fall 2010.

<figure class="wp-caption aligncenter" id="attachment_1625" style="width: 330px;">Figure 8 of the thesis: Core studies per publication year<figcaption class="wp-caption-text">Figure 8 of the thesis: Core studies per publication year</figcaption></figure>

Recently, I have been writing in my blog a series of posts in which I have been trying to clear my head about certain foundational issues that irritated me during the writing of the thesis. The thesis contains some of that, but that part of it is not very strong, as my examiners put it, for various reasons. The posts have been a deliberately non-academic attempt to shape the thoughts into words, to see what they look like fixed into a tangible form. (If you go read them, be warned: many of them are deliberately provocative, and many of them are intended as tentative in fact if not in phrasing; the series also is very incomplete at this time.)

I closed my previous post, the latest post in that series, as follows:

In fact, the whole of 20th Century philosophy of science is a big pile of failed attempts to explain science; not one explanation is fully satisfactory. [...] Most scientists enjoy not pondering it, for it’s a bit like being a cartoon character: so long as you don’t look down, you can walk on air.

I wrote my Master’s Thesis (PDF) in 2002. It was about the formal method called “B”; but I took a lot of time and pages to examine the history and content of formal logic. My supervisor was, understandably, exasperated, but I did receive the highest possible grade for it (which I never have fully accepted I deserved). The main reason for that digression: I looked down, and I just had to go poke the bridge I was standing on to make sure I was not, in fact, walking on air. In the many years since, I’ve taken a lot of time to study foundations, first of mathematics, and more recently of science. It is one reason it took me about eight years to come up with a doable doctoral project (and I am still amazed that my department kept employing me; but I suppose they like my teaching, as do I). The other reason was, it took me that long to realize how to study the design of programming languages without going where everyone has gone before.

Debian people, if any are still reading, may find it interesting that I found significant use for the dctrl-tools toolset I have been writing for Debian for about fifteen years: I stored my data collection as a big pile of dctrl-format files. I ended up making some changes to the existing tools (I should upload the new version soon, I suppose), and I wrote another toolset (unfortunately one that is not general purpose, like the dctrl-tools are) in the process.

For the Haskell people, I mainly have an apology for not attending to Planet Haskell duties in the summer; but I am back in business now. I also note, somewhat to my regret, that I found very few studies dealing with Haskell. I just checked; I mention Haskell several times in the background chapter, but it is not mentioned in the results chapter (because there were not studies worthy of special notice).

I am already working on extending this work into a doctoral thesis. I expect, and hope, to complete that one faster.

by Antti-Juhani Kaijanaho at August 23, 2014 05:44 PM

Joachim Breitner

This blog goes static

After a bit more than 9 years, I am replacing Serendipity, which as been hosting my blog, by a self-made static solution. This means that when you are reading this, my server no longer has to execute some rather large body of untyped code to produce the bytes sent to you. Instead, that happens once in a while on my laptop, and they are stored as static files on the server.

I hope to get a little performance boost from this, so that my site can more easily hold up to being mentioned on hackernews. I also do not want to worry about security issues in Serendipity – static files are not hacked.

Of course there are down-sides to having a static blog. The editing is a bit more annoying: I need to use my laptop (previously I could post from anywhere) and I edit text files instead of using a JavaScript-based WYSIWYG editor (but I was slightly annoyed by that as well). But most importantly your readers cannot comment on static pages. There are cloud-based solutions that integrate commenting via JavaScript on your static pages, but I decided to go for something even more low-level: You can comment by writing an e-mail to me, and I’ll put your comment on the page. This has the nice benefit of solving the blog comment spam problem.

The actual implementation of the blog is rather masochistic, as my web page runs on one of these weird obfuscated languages (XSLT). Previously, it contained of XSLT stylesheets producing makefiles calling XSLT sheets. Now it is a bit more-self-contained, with one XSLT stylesheet writing out all the various html and rss files.

I managed to import all my old posts and comments thanks to this script by Michael Hamann (I had played around with this some months ago and just spend what seemed to be an hour to me to find this script again) and a small Haskell script. Old URLs are rewritten (using mod_rewrite) to the new paths, but feed readers might still be confused by this.

This opens the door to a long due re-design of my webpage. But not today...

by Joachim Breitner ( at August 23, 2014 03:54 PM

Dominic Steinitz

Importance Sampling

Importance Sampling

Suppose we have an random variable X with pdf 1/2\exp{-\lvert x\rvert} and we wish to find its second moment numerically. However, the random-fu package does not support sampling from such as distribution. We notice that

\displaystyle   \int_{-\infty}^\infty x^2 \frac{1}{2} \exp{-\lvert x\rvert} \mathrm{d}x =  \int_{-\infty}^\infty x^2 \frac{\frac{1}{2} \exp{-\lvert x\rvert}}                                 {\frac{1}{\sqrt{8\pi}}{\exp{-x^2/8}}}                        \frac{1}{\sqrt{8\pi}}{\exp{-x^2/8}}  \,\mathrm{d}x

So we can sample from {\cal{N}}(0, 4) and evaluate

\displaystyle   x^2 \frac{\frac{1}{2} \exp{-\lvert x\rvert}}           {\frac{1}{\sqrt{8\pi}}{\exp{-x^2/8}}}

> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing  #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
> module Importance where
> import Control.Monad
> import Data.Random.Source.PureMT
> import Data.Random
> import Data.Random.Distribution.Binomial
> import Data.Random.Distribution.Beta
> import Control.Monad.State
> import qualified Control.Monad.Writer as W
> sampleImportance :: RVarT (W.Writer [Double]) ()
> sampleImportance = do
>   x <- rvarT $ Normal 0.0 2.0
>   let x2 = x^2
>       u = x2 * 0.5 * exp (-(abs x))
>       v = (exp ((-x2)/8)) * (recip (sqrt (8*pi)))
>       w = u / v
>   lift $ W.tell [w]
>   return ()
> runImportance :: Int -> [Double]
> runImportance n =
>   snd $
>   W.runWriter $
>   evalStateT (sample (replicateM n sampleImportance))
>              (pureMT 2)

We can run this 10,000 times to get an estimate.

ghci> import Formatting
ghci> format (fixed 2) (sum (runImportance 10000) / 10000)

Since we know that the n-th moment of the exponential distribution is n! / \lambda^n where \lambda is the rate (1 in this example), the exact answer is 2 which is not too far from our estimate using importance sampling.

The value of

\displaystyle   w(x) = \frac{1}{N}\frac{\frac{1}{2} \exp{-\lvert x\rvert}}                         {\frac{1}{\sqrt{8\pi}}{\exp{-x^2/8}}}       = \frac{p(x)}{\pi(x)}

is called the weight, p is the pdf from which we wish to sample and \pi is the pdf of the importance distribution.

Importance Sampling Approximation of the Posterior

Suppose that the posterior distribution of a model in which we are interested has a complicated functional form and that we therefore wish to approximate it in some way. First assume that we wish to calculate the expectation of some arbitrary function f of the parameters.

\displaystyle   {\mathbb{E}}(f({x}) \,\vert\, y_1, \ldots y_T) =  \int_\Omega f({x}) p({x} \, \vert \, y_1, \ldots y_T) \,\mathrm{d}{x}

Using Bayes

\displaystyle   \int_\Omega f({x}) {p\left(x \,\vert\, y_1, \ldots y_T\right)} \,\mathrm{d}{x} =  \frac{1}{Z}\int_\Omega f({x}) {p\left(y_1, \ldots y_T \,\vert\, x\right)}p(x) \,\mathrm{d}{x}

where Z is some normalizing constant.

As before we can re-write this using a proposal distribution \pi(x)

\displaystyle   \frac{1}{Z}\int_\Omega f({x}) {p\left(y_1, \ldots y_T \,\vert\, x\right)}p(x) \,\mathrm{d}{x} =  \frac{1}{Z}\int_\Omega \frac{f({x}) {p\left(y_1, \ldots y_T \,\vert\, x\right)}p(x)}{\pi(x)}\pi(x) \,\mathrm{d}{x}

We can now sample X^{(i)} \sim \pi({x}) repeatedly to obtain

\displaystyle   {\mathbb{E}}(f({x}) \,\vert\, y_1, \ldots y_T) \approx \frac{1}{ZN}\sum_1^N  f({X^{(i)}}) \frac{p(y_1, \ldots y_T \, \vert \, {X^{(i)}})p({X^{(i)}})}                              {\pi({X^{(i)}})} =  \sum_1^N w_if({X^{(i)}})

where the weights w_i are defined as before by

\displaystyle   w_i = \frac{1}{ZN} \frac{p(y_1, \ldots y_T \, \vert \, {X^{(i)}})p({X^{(i)}})}                          {\pi({X^{(i)}})}

We follow Alex Cook and use the example from (Rerks-Ngarm et al. 2009). We take the prior as \sim {\cal{Be}}(1,1) and use {\cal{U}}(0.0,1.0) as the proposal distribution. In this case the proposal and the prior are identical just expressed differently and therefore cancel.

Note that we use the log of the pdf in our calculations otherwise we suffer from (silent) underflow, e.g.,

ghci> pdf (Binomial nv (0.4 :: Double)) xv

On the other hand if we use the log pdf form

ghci> logPdf (Binomial nv (0.4 :: Double)) xv
> xv, nv :: Int
> xv = 51
> nv = 8197
> sampleUniform :: RVarT (W.Writer [Double]) ()
> sampleUniform = do
>   x <- rvarT StdUniform
>   lift $ W.tell [x]
>   return ()
> runSampler :: RVarT (W.Writer [Double]) () ->
>               Int -> Int -> [Double]
> runSampler sampler seed n =
>   snd $
>   W.runWriter $
>   evalStateT (sample (replicateM n sampler))
>              (pureMT (fromIntegral seed))
> sampleSize :: Int
> sampleSize = 1000
> pv :: [Double]
> pv = runSampler sampleUniform 2 sampleSize
> logWeightsRaw :: [Double]
> logWeightsRaw = map (\p -> logPdf (Beta 1.0 1.0) p +
>                            logPdf (Binomial nv p) xv -
>                            logPdf StdUniform p) pv
> logWeightsMax :: Double
> logWeightsMax = maximum logWeightsRaw
> weightsRaw :: [Double]
> weightsRaw = map (\w -> exp (w - logWeightsMax)) logWeightsRaw
> weightsSum :: Double
> weightsSum = sum weightsRaw
> weights :: [Double]
> weights = map (/ weightsSum) weightsRaw
> meanPv :: Double
> meanPv = sum $ zipWith (*) pv weights
> meanPv2 :: Double
> meanPv2 = sum $ zipWith (\p w -> p * p * w) pv weights
> varPv :: Double
> varPv = meanPv2 - meanPv * meanPv

We get the answer

ghci> meanPv

But if we look at the size of the weights and the effective sample size

ghci> length $ filter (>= 1e-6) weights

ghci> (sum weights)^2 / (sum $ map (^2) weights)

so we may not be getting a very good estimate. Let’s try

> sampleNormal :: RVarT (W.Writer [Double]) ()
> sampleNormal = do
>   x <- rvarT $ Normal meanPv (sqrt varPv)
>   lift $ W.tell [x]
>   return ()
> pvC :: [Double]
> pvC = runSampler sampleNormal 3 sampleSize
> logWeightsRawC :: [Double]
> logWeightsRawC = map (\p -> logPdf (Beta 1.0 1.0) p +
>                             logPdf (Binomial nv p) xv -
>                             logPdf (Normal meanPv (sqrt varPv)) p) pvC
> logWeightsMaxC :: Double
> logWeightsMaxC = maximum logWeightsRawC
> weightsRawC :: [Double]
> weightsRawC = map (\w -> exp (w - logWeightsMaxC)) logWeightsRawC
> weightsSumC :: Double
> weightsSumC = sum weightsRawC
> weightsC :: [Double]
> weightsC = map (/ weightsSumC) weightsRawC
> meanPvC :: Double
> meanPvC = sum $ zipWith (*) pvC weightsC
> meanPvC2 :: Double
> meanPvC2 = sum $ zipWith (\p w -> p * p * w) pvC weightsC
> varPvC :: Double
> varPvC = meanPvC2 - meanPvC * meanPvC

Now the weights and the effective size are more re-assuring

ghci> length $ filter (>= 1e-6) weightsC

ghci> (sum weightsC)^2 / (sum $ map (^2) weightsC)

And we can take more confidence in the estimate

ghci> meanPvC


Rerks-Ngarm, Supachai, Punnee Pitisuttithum, Sorachai Nitayaphan, Jaranit Kaewkungwal, Joseph Chiu, Robert Paris, Nakorn Premsri, et al. 2009. “Vaccination with ALVAC and AIDSVAX to Prevent HIV-1 Infection in Thailand.” New England Journal of Medicine 361 (23) (December 3): 2209–2220. doi:10.1056/nejmoa0908492.

by Dominic Steinitz at August 23, 2014 08:05 AM

August 21, 2014

Theory Lunch (Institute of Cybernetics, Tallinn)

Transgressing the limits

Today, the 14th of January 2014, we had a special session of our Theory Lunch. I spoke about ultrafilters and how they allow to generalize the notion of limit.

Consider the space \ell^\infty of bounded sequences of real numbers, together with the supremum norm. We would like to define a notion of limit which holds for every \{x_n\}_{n \geq 0} \in \ell^\infty and satisfies the well known properties of standard limit:

  1. Linearity: \lim_{n \to \infty} (\lambda x_n + \mu y_n) = \lambda \lim_{n \to \infty} x_n + \mu \lim_{n \to \infty} y_n.
  2. Homogeneity: \lim_{n \to \infty} (x_n \cdot y_n) = (\lim_{n \to \infty} x_n) \cdot (\lim_{n \to \infty} y_n).
  3. Monotonicity: if x_n \leq y_n for every n \geq 0 then \lim_{n \to \infty} x_n \leq \lim_{n \to \infty} y_n.
  4. Nontriviality: if x_n = 1 for every n \geq 0 then \lim_{n \to \infty} x_n = 1.
  5. Consistency: if the limit exists in the classical sense, then the two notions coincide.

The consistency condition is reasonable also because it avoids trivial cases: if we fix n_0 \in \mathbb{N} and we define the limit of the sequence x_n as the value x_{n_0}, then the first four properties are satisfied.

Let us recall the classical definition of limit: we say that x_n converges to x if and only if, for every \varepsilon > 0, the set of values n \in \mathbb{N} such that |x_n - x| < \varepsilon is cofinite, i.e., has a finite complement: the inequality |x_n - x| \geq \varepsilon can be satisfied at most for finitely many values of n. The family \mathcal{F} of cofinite subsets of \mathbb{N} (in fact, of any set X) has the following properties:

  1. Upper closure: if A \in \mathcal{F} and B \supseteq A then B \in \mathcal{F}.
  2. Meet stability: if A,B \in \mathcal{F} then A \cap B \in \mathcal{F}.

A family \mathcal{F} of subsets of X with the two properties above is called a filter on X. An immediate example is the trivial filter \mathcal{F} = \{X\}; another example is the improper filter \mathcal{F} = \mathcal{P}(X). The family \mathcal{F}(X) of cofinite subset of X is called the Fréchet filter on X. The Fréchet filter is not the improper one if and only if X is infinite.

An ultrafilter on X is a filter \mathcal{U} on X satisfying the following additional conditions:

  1. Properness: \emptyset \not \in \mathcal{U}.
  2. Maximality: for every A \subseteq X, either A \in \mathcal{U} or  X \setminus A \in \mathcal{U}.

For example, if x \in X, then (x) = \{ A \subseteq X \mid x \in A \} is an ultrafilter on X, called the principal ultrafilter generated by x. Observe that \bigcap_{A \in (x)} A = \{x\}: if \bigcap_{A \in \mathcal{U}} A = \emptyset we say that \mathcal{U} is free. These are, in fact, the only two options.

Lemma 1. For a proper filter \mathcal{F} to be an ultrafilter, it is necessary and sufficient that it satisfies the following condition: for every n \geq 2 and nonempty A_1, \ldots, A_n \subseteq X, if \bigcup_{i=1}^n A_i \in \mathcal{F} then A_i \in \mathcal{F} for at least one i \in \{1, \ldots, n\}.

Proof: It is sufficient to prove the thesis with n=2. If A \cup B \in \mathcal{F} with A,B \not \in \mathcal{F}, then \mathcal{F}' = \{ B' \subseteq X \mid B' \neq \emptyset, A \cup B' \in \mathcal{F} \} is a proper filter that properly contains \mathcal{F}. If the condition is satisfied, for every A \subseteq X which is neither \emptyset nor X we have A \cup (X \setminus A) = X \in \mathcal{F}, thus either A \in \mathcal{F} or X \setminus A \in \mathcal{F}. \Box

Theorem 1. Every nonprincipal ultrafilter is free. In addition, an ultrafilter is free if and only if it extends the Fréchet filter. In particular, every ultrafilter over a finite set is principal.

Proof: Let \mathcal{U} be a nonprincipal ultrafilter. Let x \in X: then \mathcal{U} \neq (x), so either there exists B \subseteq X such that x \not \in B and B \in \mathcal{U}, or there exists B' \subseteq X such that x \in B' and B' \not \in \mathcal{U}. In the first case, x \not \in \bigcap_{A \in \mathcal{U}} A; in the second case, we consider B = X \setminus B' and reduce to the first case. As x is arbitrary, \mathcal{U} is free.

Now, for every x \in X the set X \setminus \{x\} belongs to \mathcal{F}(X) but not to (x): therefore, no principal ultrafilter extends the Fréchet filter. On the other hand, if \mathcal{U} is an ultrafilter, A \subseteq X is finite, and X \setminus A \not \in \mathcal{U}, then A \in \mathcal{U} by maximality, hence \{x\} \in \mathcal{U} for some x \in A because of Lemma 1, thus \mathcal{U} = (x) cannot be a free ultrafilter. \Box

So it seems that free ultrafilters are the right thing to consider when trying to expand the concept of limit. There is an issue, though: we have not seen any single example of a free ultrafilter; in fact, we do not even (yet) know whether free ultrafilters do exist! The answer to this problem comes, in a shamelessly nonconstructive way, from the following

Ultrafilter lemma. Every proper filter can be extended to an ultrafilter.

The ultrafilter lemma, together with Theorem 1, implies the existence of free ultrafilters on every infinite set, and in particular on \mathbb{N}. On the other hand, to prove the ultrafilter lemma the Axiom of Choice is required, in the form of Zorn’s lemma. Before giving such proof, we recall that a family of sets has the finite intersection property if every finite subfamily has a nonempty intersection: every proper filter has the finite intersection property.

Proof of the ultrafilter lemma. Let \mathcal{F} be a proper filter on X and let \mathcal{M} be the family of the collections of subsets of X that extend \mathcal{F} and have the finite intersection property, ordered by inclusion. Let \{U_i\}_{i \in I} be a totally ordered subfamily of \mathcal{M}: then U = \bigcup_{i \in I} U_i extends \mathcal{F} and has the finite intersection property, because for every finitely many A_1, \ldots, A_n \in U there exists by construction i \in I such that A_1, \ldots, A_n \in U_i.

By Zorn’s lemma, \mathcal{M} has a maximal element \mathcal{U}, which surely satisfies \emptyset \not \in \mathcal{U} and \mathcal{F} \subseteq \mathcal{U}. If A \in \mathcal{U} and B \supseteq A, then \mathcal{U} \cup \{B\} still has the finite intersection property, therefore B \in \mathcal{U} by maximality. If A,B \in \mathcal{U} then \mathcal{U} \cup \{A \cap B\} still has the finite intersection property, therefore again A \cap B \in \mathcal{U} by maximality.

Suppose, for the sake of contradiction, that there exists A \subseteq X such that A \not \in \mathcal{U} and X \setminus A \not \in \mathcal{U}: then neither \mathcal{U} \cup \{A\} nor \mathcal{U} \cup \{X \setminus A\} have the finite intersection property, hence there exist A_1, \ldots, A_m, B_1, \ldots, B_n \in \mathcal{U} such that A_1 \cap \ldots \cap A_m \cap A = B_1 \cap \ldots \cap B_n \cap (X \setminus A) = \emptyset. But A_1 \cap \ldots \cap A_m \cap A = \emptyset means A_1 \cap \ldots \cap A_m \subseteq X \setminus A, and B_1 \cap \ldots \cap B_n \cap (X \setminus A) = \emptyset means B_1 \cap \ldots \cap B_n \subseteq A: therefore,

A_1 \cap \ldots \cap A_m \cap B_1 \cap \ldots \cap B_n \subseteq (X \setminus A) \cap A = \emptyset,

against \mathcal{U} having the finite intersection property. \Box

We are now ready to expand the idea of limit. Let (X,d) be a metric space and let \mathcal{U} be an ultrafilter on X: we say that x \in X is the ultralimit of the sequence \{x_n\}_{n \geq 0} \subseteq X along \mathcal{U} if for every \varepsilon > 0 the set

\{ n \geq 0 \mid d(x_n, x) < \varepsilon \}

belongs to \mathcal{U}. (Observe how, in the standard definition of limit, the above set is required to belong to the Fréchet filter.) If this is the case, we write

\lim_{n \to \mathcal{U}} x_n = x

Ultralimits, if they exist, are unique and satisfy our first four conditions. Moreover, the choice of a principal ultrafilter \mathcal{U} = (n_0) corresponds to the trivial definition \lim_{n \to \mathcal{U}} x_n = x_{n_0}. So, what about free ultrafilters?

Theorem 2. Every bounded sequence of real numbers has an ultralimit along every free ultrafilter on \mathbb{N}.

Proof: It is not restrictive to suppose x_n \in [0,1] for every n \geq 0. Let \mathcal{U} be an arbitrary, but fixed, free ultrafilter on \mathbb{N}. We will construct a sequence of closed intervals A_k, k \geq 0, such that A_{k+1} \subseteq A_k and \mathrm{diam} \, A_k = 2^{-k} for every k \geq 0. By the Cantor intersection theorem it will be \bigcap_{k \geq 0} A_k = \{x\}: we will then show that \lim_{n \to \mathcal{U}} x_n = x.

Let A_0 = [0,1]. Let A_1 be either [0,1/2] or [1/2,1], chosen according to the following criterion: \{n \geq 0 \mid x_n \in A_1\} \in \mathcal{U}. If both halves satisfy the criterion, then we just choose one once and for all. We iterate the procedure by always choosing A_{k+1} as one of the two halves of A_k such that \{n \geq 0 \mid x_n \in A_{k+1}\} \in \mathcal{U}.

Let \bigcap_{k \geq 0} A_k = \{x\}. Let \varepsilon > 0, and let k be so large that 2^{-k} < \varepsilon: then A_k \subseteq (x-\varepsilon, x+\varepsilon), thus \{n \geq 0 \mid x_n \in A_k\} \subseteq \{n \geq 0 \mid |x_n-x| < \varepsilon\}. As the smaller set belongs to \mathcal{U}, so does the larger one. \Box

We have thus almost achieved our original target: a notion of limit which applies to every bounded sequence of real numbers. Such notion will depend on the specific free ultrafilter we choose: but it is already very reassuring that such a notion exists at all! To complete our job we need one more check: we have to be sure that the definition is consistent with the classical one. And this is indeed what happens!

Theorem 3. Let \{x_n\}_{n \geq 0} be a sequence of real numbers and let x \in \mathbb{R}. Then \lim_{n \to \infty} x_n = x in the classical sense if and only if \lim_{n \to \mathcal{U}} x_n = x for every free ultrafilter \mathcal{U} on \mathbb{N}.

To prove Theorem 3 we make use of an auxiliary result, which is of interest by itself.

Lemma 2. Let \mathcal{M}(X) be the family of collections of subsets of X that have the finite intersection property. The maximal elements of \mathcal{M} are precisely the ultrafilters.

Proof: Every ultrafilter is clearly maximal in \mathcal{M}. If \mathcal{U} is maximal in \mathcal{M}, then it is clearly proper and upper closed, and we can reason as in the proof of the ultrafilter lemma to show that it is actually an ultrafilter. \Box

Proof of Theorem 3: Suppose x_n does not converge to x in the classical sense. Fix \varepsilon_0 > 0 such that the set S = \{n \geq 0 \mid |x_n-x| \geq \varepsilon_0\} is infinite. Then the family \mathcal{V} = \{S \setminus \{n\} \mid n \geq 0\} has the finite intersection property: an ultrafilter \mathcal{U} that extends \mathcal{V} must be free. Then S \in \mathcal{U}, and x_n does not have an ultralimit along \mathcal{U}.

The converse implication follows from the classical definition of limit, together with the very notion of free ultrafilter. \Box

Theorem 3 does hold for sequences of real numbers, but does not extend to arbitrary metric spaces. In fact, the following holds, which we state without proving.

Theorem 4. Let X be a metric space. The following are equivalent.

  1. For some free ultrafilter \mathcal{U} on \mathbb{N}, every sequence in X has an ultralimit along \mathcal{U}.
  2. For every free ultrafilter \mathcal{U} on \mathbb{N}, every sequence in X has an ultralimit along \mathcal{U}.
  3. X is compact.

Ultrafilters are useful in many other contexts. For instance, they are used to construct hyperreal numbers, which in turn allow a rigorous definition of infinitesimals and the foundation of calculus over those. But this might be the topic for another Theory Lunch talk.

by Silvio Capobianco at August 21, 2014 03:15 PM

Edward Z. Yang

The fundamental problem of programming language package management

Why are there so many goddamn package managers? They sprawl across both operating systems (apt, yum, pacman, Homebrew) as well as for programming languages (Bundler, Cabal, Composer, CPAN, CRAN, CTAN, EasyInstall, Go Get, Maven, npm, NuGet, OPAM, PEAR, pip, RubyGems, etc etc etc). "It is a truth universally acknowledged that a programming language must be in want of a package manager." What is the fatal attraction of package management that makes programming language after programming language jump off this cliff? Why can't we just, you know, reuse an existing package manager?

You can probably think of a few reasons why trying to use apt to manage your Ruby gems would end in tears. "System and language package managers are completely different! Distributions are vetted, but that's completely unreasonable for most libraries tossed up on GitHub. Distributions move too slowly. Every programming language is different. The different communities don't talk to each other. Distributions install packages globally. I want control over what libraries are used." These reasons are all right, but they are missing the essence of the problem.

The fundamental problem is that programming languages package management is decentralized.

This decentralization starts with the central premise of a package manager: that is, to install software and libraries that would otherwise not be locally available. Even with an idealized, centralized distribution curating the packages, there are still two parties involved: the distribution and the programmer who is building applications locally on top of these libraries. In real life, however, the library ecosystem is further fragmented, composed of packages provided by a huge variety of developers. Sure, the packages may all be uploaded and indexed in one place, but that doesn't mean that any given author knows about any other given package. And then there's what the Perl world calls DarkPAN: the uncountable lines of code which probably exist, but which we have no insight into because they are locked away on proprietary servers and source code repositories. Decentralization can only be avoided when you control absolutely all of the lines of code in your application.. but in that case, you hardly need a package manager, do you? (By the way, my industry friends tell me this is basically mandatory for software projects beyond a certain size, like the Windows operating system or the Google Chrome browser.)

Decentralized systems are hard. Really, really hard. Unless you design your package manager accordingly, your developers will fall into dependency hell. Nor is there a one "right" way to solve this problem: I can identify at least three distinct approaches to the problem among the emerging generation of package managers, each of which has their benefits and downsides.

Pinned versions. Perhaps the most popular school of thought is that developers should aggressively pin package versions; this approach advocated by Ruby's Bundler, PHP's Composer, Python's virtualenv and pip, and generally any package manager which describes itself as inspired by the Ruby/node.js communities (e.g. Java's Gradle, Rust's Cargo). Reproduceability of builds is king: these package managers solve the decentralization problem by simply pretending the ecosystem doesn't exist once you have pinned the versions. The primary benefit of this approach is that you are always in control of the code you are running. Of course, the downside of this approach is that you are always in control of the code you are running. An all-to-common occurrence is for dependencies to be pinned, and then forgotten about, even if there are important security updates to the libraries involved. Keeping bundled dependencies up-to-date requires developer cycles--cycles that more often than not are spent on other things (like new features).

A stable distribution. If bundling requires every individual application developer to spend effort keeping dependencies up-to-date and testing if they keep working with their application, we might wonder if there is a way to centralize this effort. This leads to the second school of thought: to centralize the package repository, creating a blessed distribution of packages which are known to play well together, and which will receive bug fixes and security fixes while maintaining backwards compatibility. In programming languages, this is much less common: the two I am aware of are Anaconda for Python and Stackage for Haskell. But if we look closely, this model is exactly the same as the model of most operating system distributions. As a system administrator, I often recommend my users use libraries that are provided by the operating system as much as possible. They won't take backwards incompatible changes until we do a release upgrade, and at the same time you'll still get bugfixes and security updates for your code. (You won't get the new hotness, but that's essentially contradictory with stability!)

Embracing decentralization. Up until now, both of these approaches have thrown out decentralization, requiring a central authority, either the application developer or the distribution manager, for updates. Is this throwing out the baby with the bathwater? The primary downside of centralization is the huge amount of work it takes to maintain a stable distribution or keep an individual application up-to-date. Furthermore, one might not expect the entirety of the universe to be compatible with one another, but this doesn't stop subsets of packages from being useful together. An ideal decentralized ecosystem distributes the problem of identifying what subsets of packages work across everyone participating in the system. Which brings us to the fundamental, unanswered question of programming languages package management:

How can we create a decentralized package ecosystem that works?

Here are a few things that can help:

  1. Stronger encapsulation for dependencies. One of the reasons why dependency hell is so insidious is the dependency of a package is often an inextricable part of its outwards facing API: thus, the choice of a dependency is not a local choice, but rather a global choice which affects the entire application. Of course, if a library uses some library internally, but this choice is entirely an implementation detail, this shouldn't result in any sort of global constraint. Node.js's NPM takes this choice to its logical extreme: by default, it doesn't deduplicate dependencies at all, giving each library its own copy of each of its dependencies. While I'm a little dubious about duplicating everything (it certainly occurs in the Java/Maven ecosystem), I certainly agree that keeping dependency constraints local improves composability.
  2. Advancing semantic versioning. In a decentralized system, it's especially important that library writers give accurate information, so that tools and users can make informed decisions. Wishful, invented version ranges and artistic version number bumps simply exacerbate an already hard problem (as I mentioned in my previous post). If you can enforce semantic versioning, or better yet, ditch semantic versions and record the true, type-level dependency on interfaces, our tools can make better choices. The gold standard of information in a decentralized system is, "Is package A compatible with package B", and this information is often difficult (or impossible, for dynamically typed systems) to calculate.
  3. Centralization as a special-case. The point of a decentralized system is that every participant can make policy choices which are appropriate for them. This includes maintaining their own central authority, or deferring to someone else's central authority: centralization is a special-case. If we suspect users are going to attempt to create their own, operating system style stable distributions, we need to give them the tools to do so... and make them easy to use!

For a long time, the source control management ecosystem was completely focused on centralized systems. Distributed version control systems such as Git fundamentally changed the landscape: although Git may be more difficult to use than Subversion for a non-technical user, the benefits of decentralization are diverse. The Git of package management doesn't exist yet: if someone tells you that package management is solved, just reimplement Bundler, I entreat you: think about decentralization as well!

by Edward Z. Yang at August 21, 2014 01:02 PM