Planet Haskell

September 29, 2014

Yesod Web Framework

Persistent 2.1 released

Persistent 2.1, a stable release of the next generation of Persistent, has been released to Hackage.

Persistent is an ORM for Haskell that keeps everything type-safe.

Persistent 2.1 features

  • a flexible, yet more type-safe Key type
  • a simplified monad stack

I already announced persistent 2 and the 2.1 release candidate.

Everyone should set their persistent dependencies to >= 2.1 && < 3. The 2.0.x series was the unstable release and is now deprecated.
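
For example, the relevant line in a cabal file would look something like this (the base entry is just a placeholder for your other dependencies):

build-depends: base
             , persistent >= 2.1 && < 3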

I want to thank all the early persistent 2 adopters for putting up with a fast-moving, buggy code base. This was an experiment in shipping an unstable version, and what I learned from it is that the process worked well, but that we need to make sure Travis CI is running properly, which it is now!

Persistent 2.1 library support

The persistent and persistent-template libraries should support any kind of primary key type that you need. The backends are still catching up to the new features:

  • The persistent-sqlite backend has fully implemented these features.
  • persistent-postgresql and persistent-mysql don't yet support changing the type of the id field.
  • persistent-mongoDB does not yet support composite primary keys.

All of the above packages except persistent-mysql are being well maintained, just developing new features at their own pace. persistent-mysql is in need of a dedicated maintainer. There are some major defects in the migration code that have gone unresolved for a long time now.

  • persistent-redis is in the process of being upgraded to 2.1
  • persistent-zookeeper was just released, but it is on persistent 1.3.*
  • There are other persistent packages out there that I have not had the chance to check on yet, most notably persistent-odbc. Feel free to ask for help with upgrading.

Persistent 2.1 upgrade guide

Simple persistent usage may not need any changes to upgrade.

Because the Key type is now flexible, it may need to be constrained. If you have functions with Key in their type signatures that are not specific to one PersistEntity, you may need to constrain them to the BackendKey type. An easy way to do this is with ConstraintKinds:

type DBEntity record =
    ( PersistEntityBackend record ~ MongoContext
    , PersistEntity record
    , ToBackendKey MongoContext record
    )

A SQL user would use SqlBackend instead of MongoContext. So you can now change the type signature of your functions:

- PersistEntity record => Key record
+ DBEntity record => Key record
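
For instance, a hypothetical helper written against the constraint synonym (the function name is made up; toBackendKey comes from Database.Persist.Class in persistent 2):

-- Works for any entity stored in the MongoDB backend.
backendKeyOf :: DBEntity record => Key record -> BackendKey MongoContext
backendKeyOf = toBackendKey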

Depending on how you set up your monad stacks, you may need some changes. Here is one possible approach to creating small but flexible monad stack type signatures. It requires Rank2Types (plus the ConstraintKinds extension mentioned above), and the code shown is specialized to MongoDB.

type ControlIO m = ( MonadIO m , MonadBaseControl IO m)
type LogIO m = ( MonadLogger m , ControlIO m)

-- these are actually types, not constraints
-- with persistent-2 things work out a lot easier this way
type DB    a = forall m. LogIO m => ReaderT MongoContext m a
type DBM m a =           LogIO m => ReaderT MongoContext m a

The basic type signature is just DB () (no constraints required). For working with different monad stacks, you can use DBM. If you are using conduits, you will have MonadResource m => DBM m (). Here is another example:

class Monad m => HasApp m where
    getApp :: m App 
instance HasApp Handler where
    getApp = getYesod
instance HasApp hasApp => HasApp (ReaderT MongoContext hasApp) where
    getApp = lift $ getApp
instance MonadIO m => HasApp (ReaderT App m) where
    getApp = ask 

-- | synonym for DB plus HasApp operations
type DBApp    a = forall m. HasApp m => DBM m a
type DBAppM m a =           HasApp m => DBM m a
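
As a sketch of how these synonyms get used, here is a hypothetical action (User and its UserVisits field are placeholder schema, not from a real project):

-- Runs in any stack with a HasApp instance; the stack bottoms out in
-- ReaderT MongoContext m, so persistent operations can be used directly.
recordVisit :: Key User -> DBApp ()
recordVisit uid = do
    _app <- getApp                  -- application config is available via HasApp
    update uid [UserVisits +=. 1]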

With this pattern our return type signature is always ReaderT MongoContext m, and we are changing m as needed. A different approach is to have a return type signature of m and to place a MonadReader constraint on it.

type Mongo m = (LogIO m, MonadReader MongoContext m)

Right now this approach requires using a call to Database.MongoDB.liftDB around each database call, but I am sure there are approaches to dealing with that. One approach would be to wrap every persistent "primitive" with liftDB.
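
For instance, a minimal sketch of that wrapping idea, assuming liftDB has roughly the type Mongo m => ReaderT MongoContext m a -> m a (check the actual signature before relying on this):

-- Hypothetical wrapper: lift one persistent primitive once, so that call
-- sites only need the Mongo constraint rather than a concrete ReaderT stack.
get' :: (Mongo m, DBEntity record) => Key record -> m (Maybe record)
get' = liftDB . get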

September 29, 2014 11:51 PM

Robert Harper

Structure and Efficiency of Computer Programs

For decades my colleague, Guy Blelloch, and I have been promoting a grand synthesis of the two “theories” of computer science, combinatorial theory and logical theory.  It is only a small exaggeration to say that these two schools of thought operate in isolation.  The combinatorial theorists concern themselves with efficiency, based on hypothetical translations of high-level algorithms to low-level machines, and have no useful theory of composition, the most important tool for developing large software systems.  Logical theorists concern themselves with composition, emphasizing the analysis of the properties of components of systems and how those components may be combined; the entire subject of logic is a theory of composition (entailment).  But relatively scant attention is paid to efficiency, and, to a distressingly large extent, the situation is worsening, rather than improving.

Guy and I have been arguing, through our separate and joint work, for the applicability of PL ideas to algorithms design, leading, for example, to the concept of adaptive programming that has been pursued aggressively by Umut Acar over the last dozen years.  And we have argued for the importance of cost analysis, for various measures of cost, at the level of the code that one actually writes, rather than in terms of how it is supposedly compiled.  Last spring, after initial discussions with Anindya Banerjee at NSF last winter, I decided to write a position paper on the topic, outlining the scientific opportunities and challenges that would arise from an attempt to unify the two, disparate theories of computing.  The first draft was circulated privately in May, and was revised, very lightly, in July in preparation for a conference call sponsored by NSF among a group of leading researchers in both PL’s and Algorithms with the goal to find common ground and isolate key technical challenges.

I am delighted (and relieved) to see that Swarat Chaudhuri, in a post on his PL Enthusiast blog, has heartily endorsed our proposal for a grand synthesis of the “two theories” of Computer Science.  During his visit, Swarat had lengthy discussions with Umut, Guy, and me on our work in both research and education, but we were surprised and disheartened by his opposition to our approach.  His concern was based on the common misapprehension that it is impossible to give a useful cost model for the lambda calculus, which would thereby undermine the entire body of work that Guy and I, among others, had been pursuing for decades.  Coming from such a distinguished researcher as Chaudhuri, his opposition created for us a period of anxiety: could we be wrong?  But no, it is simply not true.  Guy and John Greiner provided an adequate cost model for the lambda calculus (under both a sequential and a parallel interpretation) in 1993, and that paper has withstood decades of scrutiny.  But it did take quite some time to sort this out and be absolutely sure.  For some mysterious reason, when it comes to the lambda calculus nonsense gets twice around the world before the truth can get its boots on, to borrow a turn of phrase from Mark Twain.  After some back-and-forth, the matter is settled, and I am delighted that we can now count Swarat among our supporters.  It would have been a heavy burden for us to have to bear the opposition of a distinguished researcher such as himself to the very foundations of our proposed program.

Which is not to say that there are not serious obstacles to be overcome if such a grand synthesis is to be accomplished.  The first step is to get the right people together to discuss the issues and to formulate a unified vision of what are the core problems, and what are promising directions for the short- and long-term.  To this end there is likely to be a workshop held during the next academic year to start addressing these problems at a scientific level.  Contrary to what is implied in the PL Enthusiast post, my position paper is not a proposal for funding, but is rather a proposal for a scientific meeting designed to bring together two largely (but not entirely) disparate communities.  This summer NSF hosted a three-hour long conference call among a number of researchers in both areas with a view towards formulating a workshop proposal in the near future.  Please keep an eye out for future announcements.  I think there are many good problems to be considered, and many opportunities for new results.

I would like to mention that I am particularly grateful to Anindya Banerjee at NSF for initiating the discussion last winter that led to my position paper, and for helping to move forward the process of formulating a workshop proposal.  And I am very happy to learn of Swarat’s endorsement; it will go a long way towards helping attract interest from both sides of the great divide.


Filed under: Research Tagged: algorithms, programming languages, research

by Robert Harper at September 29, 2014 12:11 AM

Danny Gratzer

Abstract Types are Existential

Posted on September 29, 2014

I’m part of a paper reading club at CMU. Last week we talked about a classic paper, Abstract Types have Existential Type. The concept described in this paper is interesting and straightforward. Sadly some of the notions and comparisons made in the paper are starting to show their age. I thought it might be fun to give a tldr using Haskell.

The basic idea is that when we have a type with an abstract implementation and some functions upon it, it’s really an existential type.

Some Haskell Code

To exemplify this let’s define an abstract type (in Haskell)

    module Stack (Stack, empty, push, pop) where
    newtype Stack a = Stack [a]

    empty :: Stack a
    empty = Stack []

    push :: a -> Stack a -> Stack a
    push a (Stack xs) = Stack (a : xs)

    pop :: Stack a -> Maybe a
    pop (Stack []) = Nothing
    pop (Stack (x : xs)) = Just x

    shift :: Stack a -> Maybe (Stack a)
    shift (Stack []) = Nothing
    shift (Stack (x : xs)) = Just (Stack xs)

Now we could import this module and use its operations:

    import Stack

    main = do
      let s = push 1 . push 2 . push 3 $ empty
      print (pop s)

What we couldn’t do, however, is pattern match on stacks to take advantage of their internal structure. We can only build new operations out of combinations of the exposed API. The classy terminology would be to say that Stack is abstract.

This is all well and good, but what does it mean type theoretically? If we want to represent Haskell as a typed calculus it’d be a shame to have to include Haskell’s (underpowered) module system just to talk about abstract types.

After all, we’re not really thinking about modules as so much as hiding some details. That sounds like something our type system should be able to handle without having to rope in modules. By isolating the concept of abstraction in our type system, we might be able to more deeply understand and reason about code that uses abstract types.

This is in fact quite possible. Let’s rephrase our definition of Stack:

    module Stack (Stack, StackOps(..), ops) where

    newtype Stack a = Stack [a]

    data StackOps a = StackOps { empty :: Stack a
                               , push  :: a -> Stack a -> Stack a
                               , pop   :: Stack a -> Maybe a
                               , shift :: Stack a -> Maybe (Stack a) }
    ops :: StackOps a
    ops = ...

Now that we’ve lumped all of our operations into one record, our module really only exports a type name and a record of data. We could take this a step further still,

    module Stack (Stack, StackOps(..), ops) where

    newtype Stack a = Stack [a]

    data StackOps s a = StackOps { empty :: s a
                                 , push  :: a -> s a -> s a
                                 , pop   :: s a -> Maybe a
                                 , shift :: s a -> Maybe (s a) }
    ops :: StackOps Stack a
    ops = ...

Now the only thing that needs to know the internals of Stack is ops. It seems like we could really just smush the definition into ops; why should the rest of the file see our private definition?

    module Stack (StackOps(..), ops) where

    data StackOps s a = StackOps { empty :: s a
                                 , push  :: a -> s a -> s a
                                 , pop   :: s a -> Maybe a
                                 , shift :: s a -> Maybe (s a) }
    ops :: StackOps ??? a
    ops = ...

Now what should we fill in ??? with? It’s some type, but it’s meant to be chosen by the callee, not the caller. Does that sound familiar? Existential types to the rescue!

    {-# LANGUAGE PolyKinds, KindSignatures, ExistentialQuantification #-}
    module Stack where

    data Packed (f :: k -> k' -> *) a = forall s. Pack (f s a)

    data StackOps s a = StackOps { empty :: s a
                                 , push  :: a -> s a -> s a
                                 , pop   :: s a -> Maybe a
                                 , shift :: s a -> Maybe (s a) }
    ops :: Packed StackOps a
    ops = Pack ...

The key difference here is Packed. It lets us take a type function, instantiate it with some type, and hide our choice from the user. This means that we can even drop the whole newtype from the implementation of ops:

    ops :: Packed StackOps a
    ops = Pack $ StackOps { empty = []
                          , push  = (:)
                          , pop   = fmap fst . uncons
                          , shift = fmap snd . uncons }
      where uncons [] = Nothing
            uncons (x : xs) = Just (x, xs)

Now that we’ve eliminated the Stack definition from the top level, we can actually just drop the notion that this is in a separate module.

One thing that strikes me as unpleasant is how Packed is defined: we must jump through some hoops to support StackOps being polymorphic in two arguments, not just one.

We could get around this with higher rank polymorphism and making the fields more polymorphic while making the type less so. We could also just wish for type level lambdas or something. Even some of the recent type level lens stuff could be aimed at making a general case definition of Packed.

From the client side this definition isn’t actually so unpleasant to use either.

    {-# LANGUAGE RecordWildCards #-}

    someAdds :: Packed StackOps Int -> Maybe Int
    someAdds (Pack StackOps{..}) = pop (push 1 empty)

With record wild cards, there’s very little boilerplate to introduce our record into scope. Now we might wonder about using a specific instance rather than abstracting over all possible instantiations.

    someAdds :: Maybe Int
    someAdds =
      let (Pack StackOps{..}) = ops in
        pop (push 1 empty)

The resulting error message is amusing :)

Now we might wonder if we gain anything concrete from this. Did all those language extensions actually do something useful?

Well one mechanical transformation we can make is that we can change our existential type into a CPS-ed higher rank type.

    unpackPacked :: (forall s. f s a -> r) -> Packed f a -> r
    unpackPacked cont (Pack f) = cont f

    someAdds' :: StackOps s Int -> Maybe Int
    someAdds' StackOps{..} = pop (push 1 empty)

    someAdds :: Packed StackOps Int -> Maybe Int
    someAdds = unpackPacked someAdds'

Now we’ve factored out the unpacking of existentials into a function called unpackPacked. This takes a continuation which is parametric in the existential variable, s.

The body of someAdds becomes someAdds', but notice something very interesting here: s is now a normal universally quantified type variable. This means we can apply some nice properties we have already used, e.g. parametricity.

This is a nice effect of translating things to core constructs: all the tools we have already figured out can suddenly be applied.

Wrap Up

Now that we’ve gone through transforming our abstract types into existential ones, you can finally appreciate at least one more thing: the subtitle on Bob Harper’s blog. You can’t say you didn’t learn something useful :)

I wanted to keep this post short and sweet. In doing this I’m going to skip some of the more interesting questions we could ask. For the curious reader, I leave you with these:

  • How can we use type classes to prettify our examples?
  • What can we do to generalize Packed?
  • How does this pertain to modules? Higher order modules?
  • How would you implement “sharing constraints” in this model?
  • What happens when we translate existentials to dependent products?

Cheers.


September 29, 2014 12:00 AM

September 27, 2014

Robert Harper

sml-family.org up and running!

After far too long, and far too many obstacles to be overcome, Dave MacQueen, Lars Bergstrom, and I have finally prepared an open-source site for the entire family of languages derived from Standard ML.  The defining characteristic of Standard ML has always been that it has a mathematically precise definition, so that it is always clear what is a valid program and how it should behave.  And indeed we have seven different working compilers, all of which are compatible with each other, with the exception of some corner cases arising from known mistakes in the definition.  Moreover, there are several active projects developing new variations on the language, and it would be good to maintain the principle that such extensions be precisely defined.

To this end the sources of the 1990 and 1997 versions of the definition are on the web site, with the permission of MIT Press, as is the type-theoretic definition formulated by Chris Stone and me, which was subsequently used as the basis for a complete machine-checked proof of type safety for the entire language done by Karl Crary and Daniel K. Lee.  It is to be hoped that the errors in the definition (many are known; we provide links to the extensive lists provided by Kahrs and Rossberg in separate investigations) may now be corrected by a community process.  Anyone is free to propose an alteration to be merged into the main branch, which is called “SML, The Living Language” and also known as “Successor ML”.  One may think of this as the third edition of the definition, but one that is in continual revision by the community.  Computer languages, like natural languages and like mathematics, belong to us all collectively, and we all contribute to their evolution.

Everyone is encouraged to create forks for experimental designs or new languages that enrich, extend, or significantly alter the semantics of the language.  The main branch will be for generally accepted corrections, modifications, and extensions, but it is to be expected that completely separate lines of development will also emerge.

The web site, sml-family.org is up and running, and will be announced in various likely places very soon.

Update: We have heard that some people get a “parked page” error from GoDaddy when accessing sml-family.org.  It appears to be a DNS propagation problem.

Update: The DNS problems have been resolved, and I believe that the web site is stably available now as linked above.


Filed under: Research Tagged: functional programming, sml

by Robert Harper at September 27, 2014 10:04 PM

Neil Mitchell

GHCID - a new GHCi based IDE (ish)

Summary: I've just released ghcid, which interactively shows the errors in your project on each save.

I'm pleased to announce ghcid, which is either "GHCi as a daemon" or "GHC + a bit of an IDE". I've been using it for my development for the last few days, and already find it quite valuable. Unlike other Haskell development tools, ghcid is intended to be incredibly simple. In particular, it doesn't integrate with any editors, doesn't depend on GHC the library and doesn't start web servers. It's under 200 lines of fairly dull Haskell, which talks over pipes to your existing ghci.

Using it

Run cabal update && cabal install ghcid to install it as normal. Then run ghcid --height=10 "--command=ghci Main.hs". The height is the number of lines you are going to resize your console window to (defaults to 8), and the command is how you start this project in ghci. Personally, I always create a .ghci file at the root of all my projects, which usually reads something like:

:set -fwarn-unused-binds -fwarn-unused-imports
:set -isrc
:load Main

With that you can pass --command=ghci (or nothing, since that is the default).

After that, resize your console and make it so you can see it while working in your editor. On Windows the ghcid console will automatically sit on top of all other windows. On Linux, you probably want to use your window manager to make it topmost or use a tiling window manager.

What you get

On every save you'll see a list of the errors and warnings in your project. It uses a single ghci under the hood, so even relatively large projects should update their status pretty quickly. As an example:

Main.hs:23:10:
Not in scope: `verbosit'
Perhaps you meant `verbosity' (imported from System.Console.CmdArgs)
Util.hs:18:1: Warning: Defined but not used: `foo'

Or, if everything is good, you see:

All good

This project is only a few days old, so please report any bugs you find.

What you want

I regularly use an IDE to develop in a Haskell-like language. I find that with the IDE I'm about 25% more productive than without it. While an IDE can provide lots of neat features (go to definition, search, type tooltips) I think most of the productivity gains come from:

  1. Syntax coloring.
  2. A list of errors and warnings...
  3. ...which is updated as you type...
  4. ...and are highlighted in the text (red squiggles).

Every text editor already provides syntax coloring. With ghcid you get the list of errors and warnings. To get the final two features you need to integrate with an editor. I'm hoping that ghcid can do some of the heavy lifting by taking a directory of files to treat as overrides, and producing a list of warnings/errors to a file. The rest is editor specific, and I hope to attempt integration with Sublime Text at some point (although would love some help).

by Neil Mitchell (noreply@blogger.com) at September 27, 2014 08:46 PM

Matthew Sackman

Reading highlights this week

I remember when I first went to University, coming across people who were clearly extremely expert in their fields, from whom I wanted to learn, but who were also religious, and how this utterly baffled me. At that point I would cheerfully describe myself as an avid atheist. My ignorance and naivety were somewhat extensive.

Over a decade later I like to think I have a more nuanced view. The most recent war in Gaza led, obviously, to vast amounts of suffering but some excellent articles on the subject (this one by Hadley Freeman in particular) helped me see perspectives more clearly and articulated how crucial it is to be precise with criticism: are you criticising a religion, a people, a government, a policy or something else? Nothing is ever black-and-white and it seems increasingly important to anticipate the consequences of an ill-thought-through comment or reaction. A good example of that is George Galloway's comments this week in the debate about this country once again getting involved in Iraq. On the face of it, and certainly without being remotely well-enough informed to evaluate the accuracy of his claims, if his claims on the size and makeup of ISIS/ISIL are true then there seems little likelihood that the bombing campaigns being discussed will be effective, and quite likely counter-productive. But all of that got lost due to his description of Iraqis as quiescent. The way in which that description was seized upon by other MPs and the resultant media storm resulted in the over-shadowing not just of the rest of his contribution to the debate, but also of other important aspects of the debate, such as the resignation of Rushanara Ali (Labour's Shadow Minister for Education), citing once again the lack of a credible long-term plan for the region and our involvement.

Addressing the broader and somewhat more abstract issue is this enlightening article by Karen Armstrong. Again, I'm not claiming to be expert in the area, merely I found the article very educative. It had barely occurred to me that the western world's separation of the secular from the sacred was firstly such a recent occurrence, and secondly that it arose from a specific set of circumstances. There is no implicit reason why separation of state from church is an inevitable or even likely happenstance (to me, this reminds me of the question "if humans evolved from monkeys, then why can't we find monkeys still evolving into humans today?", to which the answer is "the circumstances are not right for that to occur"). The fact that the English word "religion" can't really be translated accurately into other languages (especially not languages that predate English such as Greek or Latin; as historically faith is all encompassing of life, not merely a private affair as we treat it today in the west) starts to show quite how odd the separation of secular from sacred in the modern west really is.

More interesting still is the observation that in the west, belonging to a Nation has in some ways subsumed the role of belonging to a Religion, only apparently with more positive overtones: we consider it almost reprehensible to die for your religion, but honourable to die for your nation. It would seem the concept of even belonging to a nation and having any sense of greater community outside your immediate surroundings only came about with the increased ability of governments to engage with (or intrude upon) their citizens. Before that point, presumably with church attendance widespread and frequent, one's interaction with "the wider world" was through the representative of the church. This would seem to explain a lot about why governments of the past sought the blessing of their nation's church for particular courses of action: maybe the church was seen as the bridge between the government (or monarchy) and the people. The whole article is worth a read.

September 27, 2014 02:08 PM

September 24, 2014

Neil Mitchell

Generating Open 3D Viewer Models

Summary: It's not obvious how to generate suitable Open 3D Viewer models, but with the right tools it isn't hard.

The Open 3D Viewer project is very cool, as an example here is a demo of a spinning cow rendered in the browser. For my wife's work I wanted to generate a 3D model of a fossil bedding plane. Effectively, given some x/y/z coordinates, produce a pretty rendering. It took a fair bit of guess work, so I wrote down the steps.

Step 1: Generate an OBJ file

Generate an OBJ file. You probably want an MTL file too, but it seems the 3D viewer only uses the Kd field. There are a few ways to get an OBJ file:

  • There are many samples on the web, including a snail in the Open 3D Viewer repo.
  • You can create OBJ files in a tool such as Blender, but the Blender interface confused me a lot (I am definitely not their intended audience).
  • You can generate an OBJ file using a Haskell script. I picked this method, and I'll write a blog about the script later, once I have some pretty pictures to show. A minimal sketch of the file format follows below.
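
To give a flavour of the format (this is only a hypothetical sketch, not the script used for the real model): an OBJ file is a list of "v x y z" vertex lines followed by "f" face lines that reference vertices by 1-based index.

-- Write a single triangle as a Wavefront OBJ file.
type Point = (Double, Double, Double)

objFile :: [Point] -> [[Int]] -> String
objFile verts faces = unlines $
    [ unwords ("v" : map show [x, y, z]) | (x, y, z) <- verts ] ++
    [ unwords ("f" : map show ixs)       | ixs <- faces ]

main :: IO ()
main = writeFile "mymodel.obj" $
    objFile [(0,0,0), (1,0,0), (0,1,0)] [[1,2,3]]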

Step 2: Get the tools

There are some tools in the WebGL Loader project. Alas, that project says "for now I recommend r50 as the last stable revision". So now there are two tools to try, the latest and r50. I tried both. I had some limited success with r50 (it didn't seem to render properly, but it did run) while the latest revision segfaulted. Fortunately I found the tools in a Google Groups post, and have mirrored them in my repo (with trivial tweaks to support Python 2.7).

Step 3: Run objcompress

You need to run:

objcompress mymodel.obj mymodel.utf8 > mymodel.js

This will generate lots of mymodel*.utf8 files and mymodel.js.

Step 4: Run part_grouping.py

You need to run:

py part_grouping.py

(The file in the email is part&grouping.py, but I renamed my copy.) This script will interactively ask a really long list of questions. I generate the correct inputs into a file and pipe it in:

py part_grouping.py < response.txt

This generates the files groupings.txt and part_info.txt.

Step 5: Run make_viewer_metadata.py

Run:

py make_viewer_metadata.py

This generates the file entity_metadata.json.

Step 6: Get the viewer source

You can get the viewer source from the Open 3D Viewer repo. I have mirrored it in my repo, but I may tweak the viewer over time to match my wife's needs - you should get the original.

Step 7: Copy your files

Copy all the files from steps 1 to 5 to a directory inside the viewer named models/mymodel.

Step 8: Update the model list

Open up scripts/models.js and edit it to point at your model. For example:

o3v.MODELS = [{
  name:         'mymodel.obj',
  scriptName:   'mymodel.js',
  modelPath:    'models/mymodel/',
  metadataFile: 'entity_metadata.json',
  numLayers:    12
}];

Step 9: View the result

You can view the result by opening index.html. In Chrome you may need to pass the flag --allow-file-access-from-files.

by Neil Mitchell (noreply@blogger.com) at September 24, 2014 04:00 PM

Snap Framework

Announcing: Heist v0.14

The Snap team is happy to announce the release of version 0.14 of Heist.

Major Changes

Namespace Support

Heist now has support for namespaces. This allows you to configure Heist so that all of your splices require a namespace. Requiring a namespace allows Heist to be more aggressive with errors for unbound splices. For instance, imagine you set the hcNamespace field in your HeistConfig to “h”, and you bind two splices.

mySplices = do
    "foo" #! fooSplice
    "bar" #! barSplice

With this setup, you would put the “h” namespace on all of the splice tags in your templates. Instead of calling those splices with “<foo>” and “<bar>”, you would use “<h:foo>” and “<h:bar>”. So why go to all this trouble so you have to type more? Because it allows Heist to do more error checking. Without namespaces there is no way for Heist to know whether a tag is supposed to be a splice or not. We could use the list of valid HTML tags to infer it, but that significantly constrains the scope of what we intended for Heist. This approach allows the user to be explicit about which tags should be splices. If you do not want to use namespaces, set the namespace to the empty string.

Along with the namespace field, we introduced the hcErrorNotBound for controlling the error checking. When hcErrorNotBound is True, Heist will generate an error message if it encounters any namespaced tags that do not have splices bound for them. This eliminates a large class of bugs where users were using a splice in their templates, but forgot to bind it in their code. The intent is that users will usually want to have this error checking turned on. But we felt it was also important to be able to use namespacing without the strict enforcement, so we included this flag to give users full control.

Generalized Error Reporting

Since this release is doing more error checking, we decided to expose error facilities to the user. This release exposes a new function tellSpliceError that you can use when error conditions are detected in your splices. If you are using compiled Heist, then all your templates will be processed up front at load time. If any of your load time or compiled splices detect an error condition that the user needs to fix, you can call tellSpliceError with an error message. If there are any errors, Heist initialization will fail and all the errors will be returned.

Restructured HeistConfig

The addition of hcNamespace and hcErrorNotBound to HeistConfig required some restructuring. Previously HeistConfig had a Monoid instance, but we removed that since the new fields make it unclear which instance should be used. But we also did not want to completely get rid of the monoid functionality either. So in order to get the best of both worlds, we refactored all of HeistConfig’s previous fields into another data structure called SpliceConfig. This way we can keep the Monoid instance for SpliceConfig and still avoid imposing a weird set of semantics on the user.

Unfortunately, given the use of field accessors it was impossible to make this change without breaking backwards compatibility. What seems like it should have been a simple addition of a couple parameters ended up being a more significant refactoring. To make these kinds of changes easier in the future Heist now exports lenses to all of the HeistConfig fields as well as an emptyHeistConfig value to use as a starting point. These lenses work with both the lens and lens-family-core packages and we export them without incurring a dependency on either lens package.

The HeistConfig constructor and field accessors have been moved to the Heist.Internal.Types module, so if you really need them, they are still available. However, be aware that Internal modules are subject to change without the deprecation cycle that we use for other modules.

Minor improvements

  • Factored out SpliceAPI module into separate map-syntax package and deprecated the old module which will be removed in the next major release.

  • Snap has been updated to support Heist 0.14.

by Doug Beardsley (mightybyte@gmail.com) at September 24, 2014 10:10 AM

September 23, 2014

Magnus Therning

Moving to Hakyll

I’ve not written much here lately. Partly it’s because I’ve been busy, but mostly I just find the whole experience of logging in and writing in a crappy editor (i.e. it’s not Vim) is starting to be so irritating that I don’t want to write. So in an attempt to rekindle my will to write, I’m switching to using Hakyll.

As anyone who’s had a look at Hakyll knows, its out-of-the-box experience is rather bare bones. In other words, it’ll probably take a little while before I have a new site, and even longer before I have a pretty one.


by Magnus at September 23, 2014 08:58 PM

Well-Typed.Com

Haskell courses and Haskell eXchange

In the beginning of October, my colleague Adam Gundry and I will spend a full week in London again for Haskell-related activities: on Monday and Tuesday (October 6–7), we will teach Fast Track to Haskell, a two-day introductory course on Haskell, targeted at people who want to get started with Haskell or learn more about functional programming in general. On Wednesday (October 8), there’s the Haskell eXchange, a one-day conference full of interesting talks on Haskell-related topics. On Thursday and Friday (October 9–10), we will look at more advanced Haskell concepts and programming patterns in Advanced Haskell.

All three events are still open for registration.

Haskell eXchange

The Haskell eXchange will take place in London for the third time now, and I’m happy to report that there’s going to be a really fantastic program again:

As is almost traditional by now, Simon Peyton Jones (Microsoft Research) himself will open the conference, this time with a talk on “Safe, zero-cost coercions in Haskell”.

This is followed by Jeremy Gibbons (University of Oxford) with “Categories for the Working Haskeller”, explaining to Haskell developers and people who are interested in Haskell what role category theory plays in Haskell and if/how categories can be helpful.

I’m very pleased that Bryan O’Sullivan (Facebook) has agreed to give a presentation at the Haskell eXchange this year. As the author of countless Haskell libraries (many of which are among the most used in the entire Haskell ecosystem), and being a co-author of the widely known O’Reilly book “Real World Haskell”, he’s certainly learned how to squeeze a bit of extra performance out of Haskell code when needed. He’s going to share his experiences and provide valuable advice in his talk.

After lunch, we’ll continue with a pair of talks looking at using Haskell for the development of RESTful web services from slightly different angles. Erik Hesselink (Silk) is going to present the rest framework, which makes it easy to develop and maintain REST APIs, independent of the underlying web framework you use. After that, Chris Dornan (Iris Connect) and Adam Gundry (Well-Typed) will talk about api-tools and in particular address the question of how you can solve the problem of schema migrations nicely.

Tim Williams and Peter Marks (both Barclays) will give a joint talk on Lucid, their in-house non-embedded DSL that is written in Haskell and has a mostly structural type system with interesting features such as row polymorphism and extensible records as well as extensible sum-types.

The talks will be concluded by Oliver Charles (Fynder), well-known for his tireless efforts in the “24 days of Hackage” series, who is going to show us how the use of GHC’s most advanced type system extensions helps him write better real-world code at his company.

After the talks, there’s going to be pizza and beer and an informal “Park Bench Discussion” where we can all discuss the questions that have been raised throughout the day in more detail.

I hope you’re as excited about this program as I am: I think there’s a fantastic range of topics covered, from language features and theoretical aspects, practical advice for programmers to interesting case studies of real-world code. Also, it’s an excellent opportunity to meet fellow developers interested in Haskell. If you’re working for a company using Haskell and are looking for new developers, this may be an excellent opportunity to recruit. On the other hand, if you’d like nothing more than a Haskell job, this is an opportunity to meet people who are actually using it in practice, and may have a job for you or at least be able to give you advice on how to find one.

If you haven’t registered yet, please consider doing so! We’re looking forward to meeting you there.

Fast Track to Haskell and Advanced Haskell

These two successful courses have been offered on a regular basis since October 2012. They’re continuously updated to reflect the latest changes in the Haskell world, such as updates to the infrastructure, new features of the main Haskell compiler GHC, or exciting new libraries.

Both courses are hands-on, comprising a combination of lectures, interactive coding and exercises that the participants are supposed to work on alone or in teams, with the help of the teacher(s).

The Fast Track course is for people who know how to program, but have little or no experience in Functional Programming or Haskell. It teaches Haskell from scratch in just two days, covering important concepts such as datatypes, polymorphism, higher-order functions, type classes, how IO works in Haskell, and ending with an introduction to monads. It’s also interesting for people who are interested in learning about functional programming in general, because Haskell is a prime example of a functional programming language, and the course focuses on the important programming concepts more than on language peculiarities.

The Advanced Haskell course is for people who have some experience with Haskell, but want to learn more. We’re going to discuss (functional) data structures and their complexity, have a detailed look at how lazy evaluation works and how it is implemented, how to reason about performance and how to use various debugging tools. Somewhat depending on demand, there’s also the option to learn more about advanced programming patterns, such as monads, applicative functors and monad transformers, about concurrency and parallelism, or about the more advanced features of the Haskell type system such as GADTs and type families.

Being in the same week as the Haskell eXchange makes it possible to combine one of the courses (or even both) with the eXchange, where you can hear several other viewpoints and get to know more Haskellers!

Other courses

We’re offering additional training courses on-demand and on-site for companies in Europe, the US, or anywhere in the world on request. See our training page for more information.

HacBerlin

By the way, I’m also going to be at the Haskell Hackathon in Berlin this upcoming weekend. On Sunday, I’m going to give a talk on parser combinators. You can still register for the Hackathon, too. It’s free.

by andres at September 23, 2014 10:10 AM

Yesod Web Framework

The woes of multiple package versions

When I've answered the same question more than three times, it's usually time to write up a blog post explaining the situation in more detail, and just link people to that in the future. This is such a blog post.

Many people working with Haskell end up with one of these two classes of inexplicable GHC errors:

  1. Cannot match ByteString with ByteString, with a whole bunch of package name and version junk as well.

  2. SomeTransformerT is not an instance of MonadTrans, when the documentation clearly indicates that SomeTransformerT does define such an instance.

How can a ByteString not be a ByteString? Well, there are two ways I can think of. The first is that you're accidentally trying to mix up a strict ByteString with a lazy ByteString, whose types clearly don't unify. While that problem does pop up, in my experience most people figure that one out pretty quickly. By the time someone's asking a question on Stack Overflow/Reddit/haskell-cafe, it's usually much more insidious: there are two copies of bytestring on their system.

Imagine this situation: you install GHC 7.6, which ships with bytestring-0.10.0.2. You then install text via cabal install text. A few days later, someone mentions that bytestring-0.10.4.0 is out, and it's all new and shiny, so you go ahead and install it with cabal install bytestring. Everything works wonderfully, and life is good. Then you decide to experiment with text a bit, so you write the following program:

{-# LANGUAGE OverloadedStrings #-}
import Data.Text.Encoding (encodeUtf8)
import qualified Data.ByteString.Char8 as S8

main :: IO ()
main = S8.putStrLn $ encodeUtf8 "Hello World!"

Woe unto you! GHC rejects your program with:

foo.hs:6:22:
    Couldn't match expected type `S8.ByteString'
                with actual type `bytestring-0.10.0.2:Data.ByteString.Internal.ByteString'
    In the return type of a call of `encodeUtf8'
    In the second argument of `($)', namely `encodeUtf8 "Hello World!"'
    In the expression: S8.putStrLn $ encodeUtf8 "Hello World!"

When is a ByteString not a ByteString? Here, apparently. It turns out that GHC is actually giving you quite a bit of useful information; you just need to know what to look for. It's expecting the type S8.ByteString, which expands to Data.ByteString.Char8.ByteString, which in reality is just a type synonym for Data.ByteString.Internal.ByteString. So what GHC really means is that it can't unify the following two types:

expected:                     Data.ByteString.Internal.ByteString
actual:   bytestring-0.10.0.2:Data.ByteString.Internal.ByteString

Now the difference just jumps out at you: the actual type comes from the bytestring-0.10.0.2 package, whereas the first comes from... well, somewhere else. As I'm sure you're guessing right now, that "somewhere else" is bytestring-0.10.4.0, but GHC doesn't bother telling us that, since including that level of information in every error message would be overwhelming. To step through why this came up exactly:

  • text is installed against bytestring-0.10.0.2 (it was the only version of bytestring available at the time you installed text).
  • Therefore, encodeUtf8 will generate a ByteString value from version 0.10.0.2.
  • Your program imports Data.ByteString.Char8, which is provided by both bytestring-0.10.0.2 and bytestring-0.10.4.0.
  • GHC's default dependency resolution is: take the latest version of each package, in this case 0.10.4.0.
  • Now we have a S8.putStrLn function expecting a 0.10.4.0 ByteString, but an encodeUtf8 function returning a 0.10.0.2 ByteString.

So how do we work around this problem? I can think of three ways:

  1. Explicitly tell GHC which version of the bytestring package you want to use to force consistency, e.g. runghc -package=bytestring-0.10.0.2 foo.hs.
  2. Never use GHCi, runghc, or ghc directly from the command line. Instead, always create a cabal file first. cabal's default dependency resolution will force consistent package loading.
  3. Don't wind up in the situation in the first place, by ensuring you only have one version of each package installed.
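
To check whether you have already ended up with duplicates, ghc-pkg will list them, and you can unregister the extra copy (the output below is illustrative):

$ ghc-pkg list bytestring
    bytestring-0.10.0.2
    bytestring-0.10.4.0
$ ghc-pkg unregister bytestring-0.10.4.0

Be aware that unregistering a package breaks anything that was compiled against it, so you may need to reinstall some libraries afterwards.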

That last point is what I strongly recommend to all users. And this is exactly the design goal around Stackage, so it will hopefully not come as a surprise that that's exactly what I recommend most Haskell users use to get their packages installed.

Let's demonstrate that second case of MonadTrans. This time, let's try it with GHC 7.8.3. GHC ships with transformers-0.3.0.0. Next, we'll install the either package with cabal install either. Once again, someone comes along and tells us about a shiny new package, transformers-0.4.1.0. Dutifully, we upgrade with cabal install transformers-0.4.1.0. And then we try to run the following simple program:

import Control.Monad.Trans.Class
import Control.Monad.Trans.Either

main :: IO ()
main = do
    x <- runEitherT $ lift $ putStrLn "hey there!"
    print (x :: Either () ())

GHC mocks you with:

foo.hs:6:23:
    No instance for (MonadTrans (EitherT ()))
      arising from a use of ‘lift’
    In the expression: lift
    In the second argument of ‘($)’, namely
      ‘lift $ putStrLn "hey there!"’
    In a stmt of a 'do' block:
      x <- runEitherT $ lift $ putStrLn "hey there!"

"But EitherT is an instance of MonadTrans!" you insist. That may be true, but it's an instance of the wrong MonadTrans. The either package is built against transformers-0.3.0.0, whereas you've imported lift from transformers-0.4.1.0. This can be worked around as above, with runghc -package=transformers-0.3.0.0 foo.hs. And yet again, my strong recommendation is: use Stackage.

There's one more particularly painful thing I need to point out. Some packages are bundled with GHC, and are depended on by the ghc package. The special thing about the ghc package is that it cannot be reinstalled without installing a new version of GHC itself. Any packages depended on by the ghc package cannot be unregistered without breaking ghc, which would in turn break libraries like doctest and hint. If you follow these points to their conclusion, this means that you should never upgrade GHC-bundled libraries. I wrote a blog post on this topic, and the takeaway is: please, always support older versions of packages like bytestring, transformers, and, of course, base.

There's one final case I want to mention. Try running cabal install data-default-0.5.1 http-client, and then run the following program:

import Data.Default
import Network.HTTP.Client

main :: IO ()
main = withManager defaultManagerSettings $ \man -> do
    res <- httpLbs def man
    print res

You'll get the irritating error message:

foo.hs:6:20:
    No instance for (Default Request) arising from a use of ‘def’
    In the first argument of ‘httpLbs’, namely ‘def’
    In a stmt of a 'do' block: res <- httpLbs def man
    In the expression:
      do { res <- httpLbs def man;
           print res }

But if you look at http-client, Request is in fact an instance of Default. "Alright, I know what's going on here" you say. Certainly there are two versions of data-default installed, right? Actually, no, that's not the case. Have a look at the following:

$ ghc-pkg list | grep data-default
    data-default-0.5.1
    data-default-class-0.0.1

There's just a single version of each of these packages available. So why are we getting our mysterious error message? Once again, it's because we have two versions of the Default class. After data-default version 0.5.1, data-default split into a number of packages, and the Default class migrated into data-default-class. http-client defines an instance for Default from data-default-class. And if you use data-default version 0.5.2 or higher, it will simply re-export that same class, and everything will work.

However, our cabal install command forced the installation of the older data-default (0.5.1) which defines its own Default typeclass. Therefore, we end up with two separate Default classes that don't unify. This is a problem that exists whenever packages are split or joined, which is why you should embark on such refactorings with great care.

As it happens, this is yet another problem that is solved by using Stackage, since it forces a consistent set of versions for data-default and data-default-class.

September 23, 2014 12:00 AM

September 22, 2014

Philip Wadler

Russell O'Connor

Hard Drive Failure

A few weeks ago my desktop computer suffered catastrophic hard drive failure. Not only did it not boot, it soon got to the point where even the BIOS would fail to recognise the device. At first I did not worry too much. Although I was not doing automatic backups, I was still doing irregular weekly manual backups of my home directory with tarsnap and I had performed one about three days prior. I was not specifically making backups of my NixOS system and user configuration, but I had some old copies. The configuration files do not change much and they are less important. Importantly, I had backups of my tarsnap keys stored in other places, such as my shell account.

While waiting for a replacement drive to arrive, I realized I had a serious problem. My tarsnap keys were encrypted with my PGP key. I had two specific places where I kept backup of my PGP keys. One place was a USB drive in a safe deposit box. However, at some point I had taken that one out to update it, and then misplaced it before putting it back. Naturally, I had been meaning to get around to replacing that USB drive and the data on it, for some number of years. Also, to my surprise, I had never actually placed my PGP key in my secondary spot.

I was sunk. I had some very old hard drive images with older versions of my PGP key on it, but because I rotate my encryption keys every five years, they were not useful. Within the last five years I had started using full disk encryption. I had some newer hard drive images that also have my PGP keys but I need the passphrases to decrypt these images. I had copies of the passphrase around, but, of course, they were all encrypted with my PGP keys.

After an emotional day and some meditation, slowly my old passphrase came back to me and I was able to decrypt one of my disk images. I was able to rescue my PGP keys and from there I was able to recover everything I had.

I plan to get a bit more serious about distributing copies of my PGP key since I use it so widely. With my PGP key I should be able to always recover everything since I keep all my other key material encrypted with it. Instead of a single USB drive in a safe deposit box, I want to keep two identical USB keys, one at home and one in the box. When I want to update the data, I will update the one at home, swap it with the one in the box, and update the second one and keep it at home until the next update is needed.

I have also gotten more serious about automatic backup. Turns out that NixOS already comes with a tarsnap system service. All that one has to do is place one’s write-only tarsnap key in the appropriate place and specify which directories to back up. I am hoping to make system recovery even easier by also backing up my ~/.nix-profile/manifest.nix, /root/.nix-profile/manifest.nix, /nix/var/nix/profiles/default/manifest.nix, /nix/var/nix/profiles/per-user/*/profile/manifest.nix, /etc/nixos and /var/lib. There are probably a few more things I should back up, like my user profiles, but I am not yet sure how best to do that.

I also want to restart my programme of escrow for my passwords in case something happens to me. I need to improve my documentation of password list to make it easier for others to use. I will use ssss to split my master password and distribute among my escrow agents. The nice thing about public-key cryptography is that I can assign escrow agents without requiring anything from them beyond the fact that they already possess and use PGP keys. I do not even need to inform them that they are my escrow agents. The encrypted components will be stored on my USB drive in the safe deposit box.

Overall, I am glad to have narrowly avoided disaster, and I have definitely learned some lessons. Check your backup policy, everyone!

September 22, 2014 07:22 PM

Tom Schrijvers

Postdoctoral Position in Programming Languages at KU Leuven


The Declarative Languages and Artificial Intelligence (DTAI) group of KU Leuven
(Belgium) invites applicants for a postdoctoral position in the area of
programming languages. This position has been created on the occasion of the
new appointment of prof. Tom Schrijvers as research professor at KU Leuven. The
position's aim is to reinforce the research activities in functional
programming, logic programming and/or programming language theory.

To apply you must hold a recent PhD (or be about to graduate) in one of the
above areas of programming languages. Candidates are expected to have
high-quality publications in peer-reviewed conferences and journals.

The postdoc will work closely with prof. Schrijvers and his PhD students,
participate in ongoing research activities and enjoy the freedom to develop new
lines of research.

The position is for 2 x 1 year and can be further extended. The salary is
competitive and the starting date negotiable.  Moreover, KU Leuven's policy of
equal opportunities and diversity applies to this position.

Please send your application to prof. Tom Schrijvers (tom dot schrijvers at cs
dot kuleuven dot be) by October 15, 2014. Your application should contain:

 - A cover letter explaining your interest in the position.

 - Your curriculum vitae.

 - A short research statement (max. 3 pages).

 - The names and contact details of three people who can, if asked, write
   letters of reference.

by Tom Schrijvers (noreply@blogger.com) at September 22, 2014 03:25 PM

September 21, 2014

Joachim Breitner

Using my Kobo eBook reader as an external eInk monitor

I have an office with a nice large window, but more often than not I have to close the shades to be able to see something on my screen. Even worse: There were so many nice and sunny days where I would have loved to take my laptop outside and work there, but it (a Thinkpad T430s) is simply not usable in bright sun. I have seen those nice eInk based eBook readers, who are clearer the brighter they are. That’s what I want for my laptop, and I am willing to sacrifice color and a bit of usability due to latency for being able to work in the bright daylight!

So while I was in Portland for DebConf14 (where I guess I felt a bit more like tinkering than otherwise) I bought a Kobo Aura HD. I chose this device because it has a resolution similar to my laptop (1440×1080) and I have seen reports from people running their own software on it, including completely separate systems such as Debian or Android.

This week, I was able to play around with it. It was indeed simple to tinker with: You can simply copy a tarball to it which is then extracted over the root file system. There are plenty of instructions online, but I found it easier to take them as inspiration and do it my way – with basic Linux knowledge that’s possible. This way, I extended the system boot script with a hook to a file on the internal SD card, and this file then runs the telnetd daemon that comes with the device’s busybox installation. Then I just have to make the device go online and telnet onto it. From there it is a pretty normal Linux system, albeit without an X server, using the framebuffer directly.

I even found an existing project providing a VNC client implementation for this and other devices, and pretty soon I could see my laptop screen on the Kobo. Black and white worked fine, but colors and greyscales, including all anti-aliased fonts, were quite broken. After some analysis I concluded that it was confusing the bit pattern of the pixels. Luckily kvncclient shares that code with koreader, which worked fine on my device, so I could copy some files and settings from there et voilà: I now have an eInk monitor for my laptop. As a matter of fact, I am writing this text with my Kobo sitting on top of the folded-back laptop screen!

I did some minor adjustments to my laptop:

  • I changed the screen size to match the Kobo’s resolution. Using xrandr’s --panning option this is possible even though my real screen is only 900 pixels high.
  • I disabled the cursor-blink where possible. In general, screen updates should be avoided, so I hide my taffybar (which has a CPU usage monitor) and text is best written at the very end of the line (and not before a, say, </p>).
  • My terminal windows are now black-on-white.
  • I had to increase my font-size a bit (the kobo has quite a high DPI), and color is not helpful (so :set syntax=off in vim).

All this is still very manual (going online with the kobo, finding its IP address, logging in via telnet, killing the Kobo's normal main program, starting x11vnc, finding my ip address, starting the vnc client, doing the adjustments mentioned above), so I need to automate it a bit. Unfortunately, there is no canonical way to extend the Kobo by your own application: The Kobo developers made their device quite open, but stopped short of actually encouraging extensions, so people have created many weird ways to start programs on the Kobo – dedicated start menus, background programs observing when the regular Kobo app opens a specific file, complete replacements for the system. I am considering simply running an SSH server on the device and driving the whole process from the laptop. I’ll keep you up-to-date.

A dream for the future would be to turn the kobo into a USB monitor and simply connect it to any computer, where it then shows up as a new external monitor. I wonder if there is a standard for USB monitors, and if it is simple enough (but I doubt it).

A word about the kobo development scene: It seems to be quite active and healthy, and a number of interesting applications are provided for it. But unfortunately it all happens on a web forum, and they use it not only for discussion, but also as a wiki, a release page, a bug tracker, a feature request list and as a support line – often on one single thread with dozens of posts. This makes it quite hard to find relevant information and decide whether it is still up-to-date. Unfortunately, you cannot really do without it. The PDF viewer that comes with the kobo is barely okish (e.g. no crop functionality), so installing, say, koreader is a must if you read more PDFs than actual ebooks. And then you have to deal with the how-to-start-it problem.

That reminds me: I need to find a decent RSS reader for the kobo, or possibly a good RSS-to-epub converter that I can run automatically. Any suggestions?

PS and related to this project: Thanks to Kathey!

by Joachim Breitner (mail@joachim-breitner.de) at September 21, 2014 08:11 PM

Danny Gratzer

Introduction to Dependent Types: Off, Off to Agda Land

Posted on September 21, 2014

First, an apology. Sorry this has taken so long to push out. I’ve just started my first semester at Carnegie Mellon. I fully intend to keep blogging, but it’s taken a little while to get my feet under me. Happy readings :)

In this second post of my “intro to dependent types” series we’re going on a whirlwind tour of Agda. Specifically we’re going to look at translating our faux-Haskell from the last post into honest to goodness typecheckable Agda.

There are 2 main reasons to go through the extra work of using a real language rather than pseudo-code

  1. This is typecheckable. I can make sure that all the i’s are dotted and t’s crossed.
  2. It’s a lot cleaner. We’re only using the core of Agda, so it’s more or less a very stripped-down Haskell with a much more expressive but simpler type system.

With that in mind let’s dive in!

What’s the Same

There’s quite a bit of shared syntax between Agda and Haskell, so a Haskeller can usually guess what’s going on.

In Agda we still give definitions in much the same way (single : though)

    thingy : Bool
    thingy = true

where as in Haskell we’d say

    name :: Type
    name = val

In fact, we even get Haskell’s nice syntactic sugar for functions.

    function : A -> B -> ... -> C
    function a b ... = c

Will desugar to a lambda.

    function : A -> B -> ... -> C
    function = \a b ... -> c

One big difference between Haskell and Agda is that, due to Agda’s more expressive type system, type inference is woefully undecidable. Those top-level signatures are sadly not optional. Some DT languages work a little harder than Agda when it comes to inference, but for a beginner this is a bit of a feature: you learn what the actual (somewhat scary) types are.

And of course, you always give type signatures in Haskell I’m sure :)

Like Haskell, function application is whitespace and functions are curried:

    -- We could explicitly add parens
    -- foo : A -> (B -> C)
    foo : A -> B -> C
    foo = ...

    a : A
    a = ...

    bar : B -> C
    bar = foo a

Even the data type declarations should look familiar, they’re just like GADTs syntactically.

    data Bool : Set where
      true  : Bool
      false : Bool

Notice that we have this new Set thing lurking in our code. Set is just the kind of normal types, like * in Haskell. In Agda there’s actually an infinite tower of these Bool : Set : Set1 : Set2 ..., but we won’t concern ourselves with anything beyond Set. It’s also worth noting that Agda doesn’t require any particular casing for constructors; traditionally they’re lower case.

Pattern matching in Agda is pretty much identical to Haskell. We can define something like

    not : Bool -> Bool
    not true  = false
    not false = true

One big difference between Haskell and Agda is that pattern matching must be exhaustive. Nonexhaustiveness is a compiler error in Agda.

This brings me to another point worth mentioning. Remember that structural induction I mentioned the other day? Agda only allows recursion when the terms we recurse on are “smaller”.

In other words, all Agda functions are defined by structural induction. This together with the exhaustiveness restriction means that Agda programs are “total”. In other words all Agda programs reduce to a single value, they never crash or loop forever.

This can occasionally cause pain though since not all recursive functions are modelled nicely by structural induction! A classic example is merge sort. The issue is that in merge sort we want to say something like

    mergeSort : List Nat -> List Nat
    mergeSort [] = []
    mergeSort (x :: []) = x :: []
    mergeSort xs = let (l, r) = split xs in
                     merge (mergeSort l, mergeSort r)

But wait, how would the typechecker know that l and r are strictly smaller than xs? In fact, they might not be! We know that length xs > 1, but convincing the typechecker of that fact is a pain! In fact, without elaborate trickery, Agda will reject this definition.
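
For contrast, the direct Haskell translation below (a sketch of my own, assuming a splitAt-based split rather than the split helper above) is accepted by GHC without complaint, precisely because Haskell performs no termination checking:

    mergeSort :: [Int] -> [Int]
    mergeSort []  = []
    mergeSort [x] = [x]
    mergeSort xs  = merge (mergeSort l) (mergeSort r)
      where
        -- splitAt stands in for the split helper used above
        (l, r) = splitAt (length xs `div` 2) xs
        merge [] bs = bs
        merge as [] = as
        merge (a:as) (b:bs)
          | a <= b    = a : merge as (b:bs)
          | otherwise = b : merge (a:as) bs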

So, apart from these restrictions for totality, Agda has pretty much been a stripped-down Haskell. Let’s start seeing what Agda offers over Haskell.

Dependent Types

There wouldn’t be much point in writing Agda if it didn’t have dependent types. In fact the two mechanisms that comprise our dependent types translate wonderfully into Agda.

First we had pi types, remember those?

    foo :: (a :: A) -> B
    foo a = ...

Those translate almost precisely into Agda, where we’d write

    foo : (a : A) -> B

The only difference is the colons! In fact, Agda’s pi types are far more general than what we’d discussed previously. The extra generality comes from what we allow A to be. In our previous post, A was always some normal type with the kind * (Set in Agda). In Agda though, we allow A to be Set itself. In Haskell syntax that would be something like

    foo :: (a :: *) -> B

What could a be then? Well anything with the kind * is a type, like Bool, (), or Nat. So that a is like a normal type variable in Haskell

    foo :: forall a. B

In fact, when we generalize pi types like this, they generalize parametric polymorphism. This is kind of like how we use “big lambdas” in System F to write out polymorphism explicitly.

Here’s a definition for the identity function in Agda.

    id : (A : Set) -> A -> A
    id A a = a

This is how we actually do all parametric polymorphism in Agda, as a specific use of pi types. This comes from the idea that types are also “first class”. We can pass them around and use them as arguments to functions, even dependent arguments :)
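
For comparison, plain Haskell (without dependent types) can only approximate passing a type as an argument. The little sketch below is my own and uses a Proxy tag, which is about as close as we get to Agda’s genuine explicit type argument:

    import Data.Proxy

    -- The Proxy is only a stand-in: Haskell can't pass the type itself as a
    -- value, whereas Agda's `id A a = a` really does receive A as an argument.
    idAt :: Proxy a -> a -> a
    idAt _ a = a

    someBool :: Bool
    someBool = idAt (Proxy :: Proxy Bool) True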

Now our other dependently typed mechanism was our generalized generalized algebraic data types. These also translate nicely to Agda.

    data Foo : Bool -> Set where
      Bar : Foo true

We indicate that we’re going to index our data on something the same way we would in Haskell++, by adding it to the type signature on the top of the data declaration.

Agda’s GGADTs also allow us to add “parameters” instead of indices. These are things which the data type may use, but each constructor handles uniformly without inspecting them.

For example, a list type depends on the type of its elements, but it doesn’t poke further at the type or value of those elements. They’re handled “parametrically”.

In Agda a list would be defined as

    data List (A : Set) : Set where
      nil  : List A
      cons : A -> List A -> List A

If you’re wondering what on earth the difference is, don’t worry! You’ve already in fact used parametric/non-parametric type arguments in Haskell. In Haskell a normal algebraic type can just take several type variables and can’t try to do clever things depending on what the argument is. For example, our definition of lists

    data List a = Cons a (List a) | Nil

can’t do something different if a is Int instead of Bool or something like that. That’s not the case with GADTs though, there we can do clever things like

    data List :: * -> * where
      IntOnlyCons :: Int -> List Int -> List Int
      ...

Now we’re not treating our type argument opaquely; we can figure things out about it depending on what constructor our value uses! That’s the core of the difference between parameters and indices in Agda.

Next let’s talk about modules. Agda’s prelude is absolutely tiny. By tiny I mean essentially non-existent. Because of this I’m using the Agda standard library heavily, and to import something in Agda we’d write

import Foo.Bar.Baz

This isn’t the same as a Haskell import though. By default, imports in Agda import a qualified name to use. To get a Haskell style import we’ll use the special shortcut

open import Foo.Bar

which is short for

import Foo.Bar
open Bar

Because Agda’s prelude is so tiny we’ll have to import things like booleans, numbers, and unit. These are all things defined in the standard library, not even the core language. Expect any Agda code we write to make heavy use of the standard library and begin with a lot of imports.

Finally, Agda’s names are somewhat... unique. Agda and its standard library are unicode heavy, meaning that instead of unit we’d type ⊤ and instead of Void we’d use ⊥. Which is pretty nifty, but it does take some getting used to. If you’re familiar with LaTeX, the Emacs mode for Agda allows LaTeX-style entry. For example ⊥ can be entered as \bot.

The most common unicode name we’ll use is ℕ. This is just the type of natural numbers as they’re defined in Data.Nat.

A Few Examples

Now that we’ve seen what dependent types look like in Agda, let’s go over a few examples of their use.

First let’s import a few things

    open import Data.Nat
    open import Data.Bool

Now we can define a few simple Agda functions just to get a feel for how that looks.

    not : Bool -> Bool
    not true  = false
    not false = true

    and : Bool -> Bool -> Bool
    and true b  = b
    and false _ = false

    or : Bool -> Bool -> Bool
    or false b = b
    or true  _ = true

As you can see, defining functions is mostly identical to Haskell: we just pattern match at the top level and go from there.

We can define recursive functions just like in Haskell

    plus : ℕ -> ℕ -> ℕ
    plus (suc n) m = suc (plus n m)
    plus zero    m = m

Now with Agda we can use our data types to encode “proofs” of sorts.

For example

    data IsEven : ℕ -> Set where
      even-z : IsEven zero
      even-s : (n : ℕ) -> IsEven n -> IsEven (suc (suc n))

Now this inductively defines what it means for a natural number to be even, so that if IsEven n is inhabited then n must be even. We can also state oddness

    data IsOdd : ℕ -> Set where
      odd-o : IsOdd (suc zero)
      odd-s : (n : ℕ) -> IsOdd n -> IsOdd (suc (suc n))
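
For the Haskell-inclined, here is a rough GADT rendering of the same two families (my own sketch using DataKinds, not from the post; note that the index n stays implicit here rather than being an explicit argument as in the Agda version):

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Zero | Suc Nat

    -- A value of type IsEven n is evidence that n is even.
    data IsEven :: Nat -> * where
      EvenZ :: IsEven 'Zero
      EvenS :: IsEven n -> IsEven ('Suc ('Suc n))

    -- Likewise, a value of type IsOdd n witnesses oddness.
    data IsOdd :: Nat -> * where
      OddO :: IsOdd ('Suc 'Zero)
      OddS :: IsOdd n -> IsOdd ('Suc ('Suc n))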

Now we can construct a decision procedure which produces either a proof of evenness or oddness for all natural numbers.

    open import Data.Sum -- The same thing as Either in Haskell; ⊎ is just Either

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n

So we’re setting out to construct a function that, given any n, builds up an appropriate term showing it is either even or odd.

The first two cases of this function are kinda the base cases of this recurrence.

    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o

So if we’re given zero or one, return the base case of IsEven or IsOdd as appropriate. Notice that instead of Left or Right as constructors we have inj₁ and inj₂. They serve exactly the same purpose, just with a shinier unicode name.

Now our next step would be to handle the case where we have

    evenOrOdd (suc (suc n)) = ?

Our code is going to be like the Haskell code

    case evenOrOdd n of
      Left evenProof -> Left (EvenS evenProof)
      Right oddProof -> Right (OddS  oddProof)

In words, we’ll recurse and inspect the result: if we get an even proof we’ll build a bigger even proof, and if we get an odd proof we’ll build a bigger odd proof.

In Agda we’ll use the with keyword. This allows us to “extend” the current pattern matching by adding an expression to the list of expressions we’re pattern matching on.

    evenOrOdd (suc (suc n)) with evenOrOdd n
    evenOrOdd (suc (suc n)) | inj₁ x = ?
    evenOrOdd (suc (suc n)) | inj₂ y = ?

Now we add our new expression to use for matching by saying ... with evenOrOdd n. Then we list out the next set of possible patterns.

From here the rest of the function is quite straightforward.

    evenOrOdd (suc (suc n)) | inj₁ x = inj₁ (even-s n x)
    evenOrOdd (suc (suc n)) | inj₂ y = inj₂ (odd-s n y)

Notice that we had to duplicate the whole evenOrOdd (suc (suc n)) bit of the match? It’s a bit tedious so Agda provides some sugar. If we replace that portion of the match with ... Agda will just automatically reuse the pattern we had when we wrote with.

Now our whole function looks like

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) with evenOrOdd n
    ... | inj₁ x = inj₁ (even-s n x)
    ... | inj₂ y = inj₂ (odd-s n y)

How can we improve this? Well, notice that the suc (suc n) case involved unpacking our Either and then immediately repacking it; this looks like something we can abstract over.

    bimap : (A B C D : Set) -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D
    bimap A B C D f g (inj₁ x) = inj₁ (f x)
    bimap A B C D f g (inj₂ y) = inj₂ (g y)

If we gave bimap a more Haskellish signature, it would read

    bimap :: forall a b c d. (a -> c) -> (b -> d) -> Either a b -> Either c d

One interesting point to notice is that the type arguments in the Agda function (A and B) also appeared in the normal argument pattern! This is because we’re using the normal pi type mechanism for parametric polymorphism, so we’ll actually end up explicitly passing and receiving the types we quantify over. This messed with me quite a bit when I first started learning DT languages; take a moment and convince yourself that this makes sense.

Now that we have bimap, we can use it to simplify our evenOrOdd function.

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) =
      bimap (IsEven n) (IsOdd n)
            (IsEven (suc (suc n))) (IsOdd (suc (suc n)))
            (even-s n) (odd-s n) (evenOrOdd n)

We’ve gotten rid of the explicit with, but at the cost of all those explicit type arguments! Those are both gross and obvious. Agda can clearly deduce what A, B, C and D should be from the arguments and what the return type must be. In fact, Agda provides a convenient mechanism for avoiding this boilerplate. If we simply insert _ in place of an argument, Agda will try to guess it from the information it has about the other arguments and the context. Since these type arguments are so clear from context, Agda can guess them all:

    evenOrOdd : (n : ℕ) -> IsEven n ⊎ IsOdd n
    evenOrOdd zero = inj₁ even-z
    evenOrOdd (suc zero) = inj₂ odd-o
    evenOrOdd (suc (suc n)) =
      bimap _ _ _ _ (even-s n) (odd-s n) (evenOrOdd n)

Now at least the code fits on one line! This also raises something interesting: the types are so strict that Agda can actually figure out parts of our programs for us! I’m not sure about you, but at this point in time my brain mostly melted :) Because of this I’ll try to avoid using _ and other mechanisms for having Agda write programs for us where I can. The exception, of course, being situations like the above where it’s necessary for readability’s sake.

One important exception to that rule is for parametric polymorphism. It’s a royal pain to pass around types explicitly everywhere, so we’re going to use an Agda feature called “implicit arguments”. You should think of these as arguments for which the _ is inserted automatically. So instead of writing

    foo _ zero zero

We could write

    foo zero zero

This more closely mimics what Haskell does for its parametric polymorphism. To indicate we want something to be an implicit argument, we just wrap it in {} instead of (). So for example, we could rewrite bimap as

    bimap : {A B C D : Set} -> (A -> C) -> (B -> D) -> A ⊎ B -> C ⊎ D
    bimap f g (inj₁ x) = inj₁ (f x)
    bimap f g (inj₂ y) = inj₂ (g y)

To avoid all those underscores.

Another simple function we’ll write is that if we can construct an IsOdd n, we can build an IsEven (suc n).

    oddSuc : (n : ℕ) -> IsOdd n -> IsEven (suc n)

Now this function has two arguments, a number and a term showing that that number is odd. To write this function we’ll actually recurse on the IsOdd term.

    oddSuc .1 odd-o = even-s zero even-z
    oddSuc .(suc (suc n)) (odd-s n p) = even-s (suc n) (oddSuc n p)

Now if we squint hard and ignore those . terms, this looks much like we’d expect. We build the IsEven term starting from even-s zero even-z. From there we just recurse and tack on an even-s constructor to scale the IsEven term up by two.

There’s a weird thing going on here though: those . patterns. They capture a nifty little idea in Agda: pattern matching on one thing might force another term to be some value. If we know that our IsOdd n is odd-o, then n must be suc zero. Anything else would just be completely incorrect. To notate these patterns, Agda forces you to prefix them with a dot. You should read .Y as “because of X, this must be Y”.

This isn’t an optional choice though, as . patterns may do several wonky things. The most notable is that they often use pattern variables nonlinearly, notice that n appeared twice in our second pattern clause. Without the . this would be very illegal.

As an exercise to the reader, try to write

    evenSuc : (n : ℕ) -> IsEven n -> IsOdd (suc n)

Wrap Up

That wraps up this post, which came out much longer than I expected. We’ve now covered enough basics to actually discuss meaningful dependently typed programs. That’s right, we can finally kiss natural numbers goodbye in the next post!

Next time we’ll cover writing a small but interesting program and use dependent types to assure ourselves of its correctness.

As always, please comment with any questions :)


September 21, 2014 12:00 AM

Christopher Done

shell-conduit: Write shell scripts in Haskell with Conduit

As part of my series of write-about-personal-projects, my latest obsession is writing shell scripts with Michael Snoyman’s Conduit.

Here is my package, shell-conduit. It’s still in the experimental phase, but I don’t foresee any changes now for a while.

Bash is evil

I hate writing scripts in Bash. Until now, it was the easiest way to just write unix scripts. Its syntax is insane, incredibly error prone, its defaults are awful, and it’s not a real big person programming language.

Perl/Python/Ruby are also evil

If you’re going to go as far as using a real programming language, why bother with these dynamically typed messes? Go straight for Haskell.

Like a glove

I had an inkling a while back that conduits mirror the behaviour of bash pipes very well. I knew there was something to the idea, but didn’t act on it fully for a while. Last week I experimented somewhat and realised that the following Haskell code

source $= conduit $$ sink

does indeed accurately mirror

source | conduit > sink

And that also the following

(do source
    source $= conduit)
$$ sink

is analogous to

source
source | conduit

We’ll see examples of why this works later.

I must Haskell all the things

Another trick I realised is to write some template Haskell code which will calculate all executables in your PATH at compilation time and generate a top-level name that is a Haskell function to launch that process. So instead of writing

run "ls"

you could instead just write

ls

There are a few thousand executables, so it takes about 10 seconds to compile such a module of names. But that’s all.

Again, we’ll see how awesome this looks in a minute.

Modeling stdin, stderr and stdout

My choice of modeling the typical shell scripting pipe handles is by having a type called Chunk:

type Chunk = Either ByteString ByteString

All Left values are from stderr. All Right values are either being pulled from stdin or being sent to stdout. In a conduit the difference between stdin and stdout is more conceptual than real.

When piping two commands, the idea is that any Left values are just re-yielded along, they are not consumed and passed into the process.
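
For example, here is a small conduit of my own (not part of the package) that follows that convention: it re-yields stderr chunks untouched and only transforms stdout chunks:

import Data.Char (toUpper)
import Data.Conduit (Conduit, awaitForever, yield)
import qualified Data.ByteString.Char8 as S8

type Chunk = Either S8.ByteString S8.ByteString

-- Pass Left (stderr) chunks straight through; upper-case Right (stdout) chunks.
upperStdout :: Monad m => Conduit Chunk m Chunk
upperStdout =
  awaitForever
    (\c ->
       case c of
         Left e  -> yield (Left e)
         Right o -> yield (Right (S8.map toUpper o)))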

A process conduit on chunks

Putting the previous model into practice, we come up with a type for launching a process like this:

conduitProcess :: (MonadResource m)
               => CreateProcess -> Conduit Chunk m Chunk

Meaning the process will be launched, and the conduit will accept any upstream stdin (Right values), and send downstream anything that comes from the actual process (both Left and Right values).

Process conduits API

I defined two handy functions for running process conduits:

shell :: (MonadResource m)
      => String -> Conduit Chunk m Chunk
proc :: (MonadResource m)
     => String -> [String] -> Conduit Chunk m Chunk

One to launch via a shell, one to launch via program name and arguments. These functions can be used in your shell scripts. Though, we’ll see in a minute why you should rarely need either.

Executing a shell scripting conduit

First we want something to consume any remainder chunks after a script has finished. That’s writeChunks:

writeChunks :: (MonadIO m)
            => Consumer Chunk m ()
writeChunks =
  awaitForever
    (\c ->
       case c of
         Left e -> liftIO (S.hPut stderr e)
         Right o -> liftIO (S.hPut stdout o))

This simply consumes anything left in the pipeline and outputs to the correct file handles, either stderr or stdout.

Now we can write a simple run function:

run :: (MonadIO m,MonadBaseControl IO m)
    => Conduit Chunk (ShellT m) Chunk -> m ()
run p =
  runResourceT
    (runShellT (sourceList [] $=
                p $$
                writeChunks))

First it yields an empty upstream of chunks. That’s the source. Then our script p is run as the conduit in between, and finally we write out any chunks that remain.

Let’s try that out:

λ> run (shell "echo hello!")
hello!
λ> run (proc "date" ["+%Y"])
2014
λ> run (shell "echo oops > /dev/stderr")
oops

Looks good. Standard output was written properly, as was stderr.

Returning to our mass name generation

Let’s take our earlier work of generating names with template-haskell. With that in place, we have a process conduit for every executable in PATH. Add to that variadic argument handling for each one, and we get a list of names like this:

rmdir :: ProcessType r => r
ls :: ProcessType r => r
egrep :: ProcessType r => r
dmesg :: ProcessType r => r

The real types when instantiated will look like:

rmdir "foo" :: Conduit Chunk m Chunk
ls :: Conduit Chunk m Chunk
ls "." :: Conduit Chunk m Chunk

Putting it all together

We can now provide any number of arguments:

λ> run ls
dist
LICENSE
README.md
Setup.hs
shell-conduit.cabal
src
TAGS
TODO.org
λ> run (ls "/")
bin
boot
cdrom
…

We can pipe things together:

λ> run (do ls "-1" $= head' "-2")
dist
LICENSE
λ> run (ls $= grep "Key" $= shell "cat" $= CL.map (second (S8.map toUpper)))
KEYBOARD.HI
KEYBOARD.HS
KEYBOARD.O

Results are outputted to stdout unless piped into other processes:

λ> run (do shell "echo sup"; shell "echo hi")
sup
hi
λ> run (do shell "echo sup"; sed "s/u/a/"; shell "echo hi")
sup
hi
λ> run (do shell "echo sup" $= sed "s/u/a/"; shell "echo hi")
sap
hi

Live streaming between pipes like in normal shell scripting is possible:

λ> run (do tail' "/tmp/example.txt" "-f" $= grep "--line-buffered" "Hello")
Hello, world!
Oh, hello!

(Remember that grep needs --line-buffered if it is to output things line-by-line).

Error handling

By default, if a process errors out, the whole script ends. This is contrary to Bash, which keeps going regardless of failure. This is bad.

In Bash, to revert this default, you run:

set -e

And the way to ignore erroneous commands on a case-by-case basis is to use || true:

killall nonexistant || true
echo OK, done.

Which means “do foo, or otherwise ignore it, continue the script”.

We can express the same thing using the Alternative instance for the ShellT type:

λ> run (do killall "nonexistant" "-q"; echo "OK, done.")
*** Exception: ShellExitFailure 1
λ> run (do killall "nonexistant" "-q" <|> return (); echo "OK, done.")
OK, done.
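
If you find yourself writing <|> return () a lot, you can of course name the idiom; this little wrapper is my own, not part of shell-conduit:

import Control.Applicative (Alternative, (<|>))

-- The Haskell spelling of Bash's `|| true`: run the command, and if it
-- fails, swallow the failure and carry on.
ignoreFailure :: (Alternative m, Monad m) => m () -> m ()
ignoreFailure cmd = cmd <|> return ()

With that, the second example reads:

λ> run (do ignoreFailure (killall "nonexistant" "-q"); echo "OK, done.")
OK, done.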

String types

If using OverloadedStrings so that you can use Text for arguments, then also enable ExtendedDefaultRules, otherwise you’ll get ambiguous type errors.

{-# LANGUAGE ExtendedDefaultRules #-}

But this isn’t necessary if you don’t need to use Text yet. String literals will be interpreted as String. Though you can pass a value of type Text or any instance of CmdArg without needing conversions.
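
A complete (if hypothetical) script showing that setup might look like this; it assumes echo is one of the generated binding names and that Text goes through the CmdArg instance as described above:

{-# LANGUAGE OverloadedStrings, ExtendedDefaultRules #-}
import Data.Conduit.Shell
import Data.Text (Text)

greeting :: Text
greeting = "hello from Text"   -- a Text literal, thanks to OverloadedStrings

main :: IO ()
main = run (do echo greeting   -- Text argument, no conversion needed
               echo "and a plain String literal")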

Examples of script files

Quick script to reset my keyboard (Linux tends to forget these things when I unplug my keyboard):

import Data.Conduit.Shell
main =
  run (do xmodmap ".xmodmap"
          xset "r" "rate" "150" "50")

Cloning and initializing a repo (ported from a bash script):

import Control.Monad.IO.Class
import Data.Conduit.Shell
import System.Directory
main =
  run (do exists <- liftIO (doesDirectoryExist "fpco")
          if exists
             then rm "fpco/.hsenvs" "-rf"
             else git "clone" "git@github.com:fpco/fpco.git"
          liftIO (setCurrentDirectory "fpco")
          shell "./dev-scripts/update-repo.sh"
          shell "./dev-scripts/build-all.sh"
          alertDone)

Script to restart a web process (ported from an old bash script I had):

import Control.Applicative
import Control.Monad.Fix
import Data.Conduit.Shell
main =
  run (do ls
          echo "Restarting server ... ?"
          killall name "-q" <|> return ()
          fix (\loop ->
                 do echo "Waiting for it to terminate ..."
                    sleep "1"
                    (ps "-C" name $= discardChunks >> loop) <|> return ())
          shell "dist/build/ircbrowse/ircbrowse ircbrowse.conf")
  where name = "ircbrowse"

You’ve seen Shelly, right?

Right. Shelly’s fine. It just lacks a few killer things for me:

  • All names are bound, so I can just use them as normal functions.
  • shell-conduit also, due to its mass name binding, prioritizes commands. For example, Shelly has a group of functions for manipulating the file system. In shell-conduit, you just use your normal commands: rm "x" and mv "x" "y".
  • Not based on conduit. Conduit is a whole suite of streaming utilities perfect for scripting.
  • Piped is not the default, either. There’re a bunch of choices: Shelly, Shelly.Lifted, Shelly.Pipe. Choice is good, but for a scripting language I personally prefer one go-to way to do something.

Also, Shelly cannot do live streams like Conduit can.

Conduits as good scripting libraries

You might want to import the regular Conduit modules qualified, too:

import qualified Data.Conduit.List as CL

Which contains handy functions for working on streams in a list-like way. See the rest of the handy modules for Conduit in conduit-extra.

Also of interest is csv-conduit, html-conduit, and http-conduit.

Finally, see the Conduit category on Hackage for other useful libraries: http://hackage.haskell.org/packages/#cat:Conduit

All of these general purpose Conduits can be used in shell scripting.

Using it for real scripts

So far I have ported a few small scripts to shell-conduit from Bash and have been happy every time. I suck at Bash. I’m pretty good at Haskell.

The next test is applying this to my Hell shell and seeing if I can use it as a commandline shell, too.

My friend complained that having to quote all arguments is a pain. I don’t really agree that this is bad. In Bash it’s often unclear how arguments are going to be interpreted. I’m happier writing something predictable than something super convenient but possibly nonsense.

Summary

I set out a week ago to just stop writing Bash scripts. I’ve written a bunch of scripts in Haskell, but I would still write Bash scripts too. Some things were just too boring to write. I wanted to commit to Haskell for scripting. Today, I’m fairly confident I have a solution that is going to be satisfactory for a long while.

September 21, 2014 12:00 AM

September 20, 2014

Matthew Sackman

Concurrency, Actors, Locks and mailboxes

Having been part of the original team that wrote RabbitMQ (indeed I wrote the very first prototype, back in the summer of 2006), and having worked full time with Erlang since 2009 until fairly recently, it's been interesting doing some work in Go recently.

Go's a language I currently have mixed feelings for. In some ways I like it - it's simple, it doesn't take too long to learn, the syntax is clean, the tool chain is superb (compilation is quick), performance is very good, it's very easy to drop into C whenever you really need to. It also has one very nice high end feature: first class channels - you can pass channels over channels which is pretty powerful. But equally, it's statically typed and has a terrible type system (i.e. no generics. Personally, I don't feel like the proofs offered by type checkers are worth much to me so I'd much rather have no static type checking than one that is this primitive and brain dead), it's not extensible properly (e.g. you can't create your own data structures which work with the range) and worst of all, there's no pattern matching. The lack of pattern matching is particularly horrific given Go's "best practise" of returning tuples from functions (except they're not really tuples - another mistake in the language), the right most of which indicates an error or success. But you can't pattern match on the assignment so you end up with endless if-statements that explicitly check the error for nil. There are other irritations which I've found, particularly related to its locks package (i.e. non-re-entrant; can't upgrade read to write; waiting for a write blocks all attempts to gain reads. Yes, I know I'm free to implement my own locks package if I want to).

Go also doesn't push you to using actors - you have to re-implement all that yourself if you want to. In a recent project, I started off with some locks and within about three days found it utterly impossible to reason about which go-routines can hold which locks and whether or not there's any deadlock potential. Inevitably, there was. So I ripped all the locking code out and wrote my own actor loops.

This was quite interesting as here I could now be more flexible than Erlang. I think most people think that actors mean "only one thread/process/routine can read and write its state" - there is a key concept of owning that state and regulating access to it. However, what I found is that I actually only needed to limit modifying the state to a single go-routine: each actor-receive-loop routine would take a write lock on its state whenever it needs to modify its own state, but it's perfectly reasonable to have anyone read the state, provided they take a read lock before doing so. The fact we can share pointers in Go makes this possible, whereas it's impossible to do this in Erlang (well, not quite - if you use ets then you can do it, which is exactly what we do in RabbitMQ in the rabbit_msg_store - but it's certainly not pretty!). So now we can have concurrent reads and no need to pass read-requests over a channel/mailbox. This seems pretty nice to me.
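
To make that concrete, here is a small Haskell sketch of my own (using STM in place of Go's read/write locks): a single actor thread drains a mailbox and is the only writer of the state, while any other thread may read the state concurrently without going through the mailbox.

import Control.Concurrent.STM
import Control.Monad (forever)

data Msg = Increment | Reset

-- The actor loop: the sole writer of the state, driven by its mailbox.
actorLoop :: TQueue Msg -> TVar Int -> IO ()
actorLoop mailbox state = forever $ do
  msg <- atomically (readTQueue mailbox)
  atomically $ case msg of
    Increment -> modifyTVar' state (+ 1)
    Reset     -> writeTVar state 0

-- Readers bypass the mailbox entirely and just look at the current state.
currentValue :: TVar Int -> IO Int
currentValue = readTVarIO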

Recently I was reading a paper and it suggested that:

In message passing systems, processes interact exclusively by sending and receiving messages and they do not have access to shared memory.

Firstly, on a very technical note, they do have access to shared memory - the mailbox or queue is exactly that. The key reason why it leads to more composable systems is that when you hold the lock to write into a mailbox, you can never do anything other than write into that mailbox - you can never try to acquire multiple locks, so you can't deadlock in this way. And that's even assuming you're using locks for mailboxes - queues make lovely structures for lock-free concurrent access.

Secondly, as I suggest above, it appears to be safe to allow multiple concurrent readers of an actor's state, provided modifications to the state are done atomically by the actor thread - though more care has to be taken now to ensure updates are consistent - you have to make sure you update all the state you need to change in one go under a write lock (the sort of transactional semantics you end up needing to ensure makes me heavily think of STM). Whilst I would probably still call such a system a "message passing system" I can certainly imagine others would disagree and at a minimum it's some sort of hybrid (you could argue that the side effect of releasing the write lock when you've finished modifying the state is to publish an immutable copy of the state to any and all subscribers that want it - except without all that overhead. When viewed in these terms, it makes more intuitive sense that it's safe - provided of course that you don't do anything blocking whilst you're holding a state read-lock). This design also seems to get a fair bit trickier once you get to distributed systems and the need to have proxy objects representing the state and interface of a remote actor. By comparison, in Erlang a reference to an Actor is an immutable process identifier of Pid which is easy to send around and reason about.

But mainly I was thinking about the pattern of data flow: a mailbox allows multiple writers to send data to a single reader (a gather operation, maybe). The actor loop allows the exact opposite: a single reader of the mailbox can then affect multiple things (a scatter) - either by sending out messages to many other actors (in essence, a push action), or by (as I suggest above) modifying state which can be concurrently read by many other actors (correspondingly, a pull action). In my mind's eye, I see a sort of concertina effect as all these messages are pushed into a mailbox, and then slowly the effects of each message spread out again. In some ways it seems slightly odd how powerful this is, but in other ways it makes perfect sense: if you consider a finite state machine then your mailbox is just the stream of events coming into it and you have your little automaton updating the state with some operation combining the current state with the current message. It is the very fact that the next state is dependent on the current state and the current message that requires mutual exclusion around modifying the state. And of course by ensuring that that mutual exclusion lock is (implicitly) held in absence of any other locks that makes actor systems so much easier to reason about and understand - any deadlocks that occur are at the protocol level and, if you model your protocols between actors properly, can be determined statically (though I'm not aware that anyone actually does this - false positives may abound).

This then makes me think about how, once all actors have done their initialisation and reached the core actor loop, the entire system is purely event driven. When looked at like this, are we really sure actors are enough? Are there not other forms of expression that capture the relation between events as inputs, with state, and an output more cleanly? In particular I'm thinking of things like Join calculus and Functional Reactive Programming. Given that actors are apparently becoming rather more mainstream these days, I wonder if that really means they're only part of the solution: sure I can write large distributed systems that scale, perform well, don't deadlock or livelock and are exceedingly robust. But can I write them with less code and cleaner semantics?

September 20, 2014 02:02 PM

Welcome

Yet again, a new blog. This one is much simpler than the last. Whilst one always hopes, don't expect the frequency of posts to be greater than any of the previous blogs here... The styling is currently non-existent - seriously, there is no CSS right now. Sorry.

This one is much simpler than the last - gone is Serendipity, and in its place is PyBlosxom. I'm currently trialing muse for writing posts. Not entirely sure it's worth it, but I need to get my emacs foo improved, so working with a new major mode is likely a good idea.

September 20, 2014 12:32 PM

First post!

This is your first post! If you can see this with a web-browser, then it's likely that everything's working nicely!

September 20, 2014 10:34 AM

Eric Kidd

Rust lifetimes: Getting away with things that would be reckless in C++

Over the years, I've learned to be cautious with C++ pointers. In particular, I'm always very careful about who owns a given pointer, and who's in charge of calling delete on it. But my caution often forces me to write deliberately inefficient functions. For example:

vector<string> tokenize_string(const string &text);

Here, we have a large string text, and we want to split it into a vector of tokens. This function is nice and safe, but it allocates one string for every token in the input. Now, if we were feeling reckless, we could avoid these allocations by returning a vector of pointers into text:

vector<pair<const char *,const char *>> tokenize_string2(const string &text);

In this version, each token is represented by two pointers into text: One pointing to the first character, and one pointing just beyond the last character.1 But this can go horribly wrong:

// Disaster strikes!
auto v = tokenize_string2(get_input_string());
munge(v);

Why does this fail? The function get_input_string returns a temporary string, and tokenize_string2 builds an array of pointers into that string. Unfortunately, the temporary string only lives until the end of the current expression, and then the underlying memory is released. And so all our pointers in v now point into oblivion—and our program just wound up getting featured in a CERT advisory. So personally, I'm going to prefer the inefficient tokenize_string function almost every time.

Rust lifetimes to the rescue!

Going back to our original design, let's declare a type Token. Each token is either a Word or an Other, and each token contains pointers into a pre-existing string. In Rust, we can declare this as follows:

#[deriving(Show, PartialEq)]
enum Token<'a> {
    Word(&'a str),
    Other(&'a str)
}

Read more…

September 20, 2014 03:07 AM

Christopher Done

hindent: A Haskell indenter

A question of style

In this post I’m going to use the word “style” to refer to the way code is printed in concrete terms. No changes in the code that would yield a different syntax tree are considered “style” here.

What’s the deal with code style?

Code style is important! If you’re a professional Haskell programmer, you’re working with Haskell code all day. The following things are affected by the style used:

  • How easily it can be manipulated with regular editors: the more code is laid out in a way that prevents you from readily using your normal editor functions, the less efficient you are.
  • How well general tooling works with it: do diff and things like that work well?
  • How easily you can absorb the structure: do you have to spend time hunting around for the start and end of syntactical nodes? Can’t see the tree for the forest?
  • How quickly you can write it: can you just write code or do you have to spend time thinking about how it’s going to be laid out before writing, or editing the layout afterwards?
  • How aesthetically offended you are1: does the code you’re looking at assault your sense of beauty?

Code style is important! Let’s have a code style discussion. I propose to solve it with tooling.

Is this really an issue, though?

Okay, so I’m one guy with a bee in his bonnet. Let’s do a quick Google and see what others are saying in this StackOverflow question:

Could someone provide a link to a good coding standard for Haskell? I’ve found this and this, but they are far from comprehensive.

The following points refer to style:

  • Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
  • Put spaces around infix operators. Put a space following each comma in a tuple literal.
  • Prefer a space between a function and its argument, even if the argument is parenthesized.
  • Use line breaks carefully. Line breaks can increase readability, but there is a trade-off: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn’t overuse line breaks.
  • When possible, align – lines, = signs, and even parentheses and commas that occur in adjacent lines.

Even the Haskell community is not immune to long, protracted debates about tabs vs spaces. That reddit submission has zero points. That means it’s very controversial. The submission also has 117 comments. That means people are very vocal about this topic. That’s because bike-shedding is inversely proportional to the importance of the debated thing. We know that.

Nevertheless, style is important enough to be discussed.

So let’s formalise a standard style

That’s a simplification. There are many style guides:

These are just public ones. In-house styles are also common, for a particular company. It’s not practical to force everyone into one single style. It’s a well-worn topic in the C world.

Okay, but is this a problem tooling even needs to solve?

There is precedent for other tooling based on style:

Everyone has their own style

So then let’s make a tool with a select number of styles, you might say. The problem is that people don’t even use the standards that exist out there. They use slightly varying versions. Ask any Haskeller what style they use, and they will say “mostly like X, but with some differences.”

For example What are some good style guides for writing Haskell code?2

I use a very similar style, with a few modifications. […] Perhaps I should write out my own style guide at some point, especially as I’ve become pretty set in my style lately.

And Good Haskell Coding Style3:

My Personal Pet Peeves

For the most part the style guides that I have added above (and the tools provided) mirror my own style guide (or perhaps my guide mirrors them). However, there is one item of style that particularly annoys me on a regular basis. […]

Can’t we just use structured editors?

Some more optimistic folk out there might suggest, perhaps, we should just throw away text files and go straight for structured code databases. Put all this formatting nonsense behind us. Layout is just a stylesheet! It’s not data to be stored in a file!

Maybe so. But nobody is using structured editors yet.

A practical way forward

Taking all of the above into consideration, here is my approach to the problem: the hindent library and program. Styled on GNU indent, the intention is that you simply run the program on some source code and it reformats it.

$ hindent
hindent: arguments: --style [fundamental|chris-done|johan-tibell]

hindent has the concept of styles built in. There is a type for it:

data Style =
  forall s. Style {styleName :: !Text
                  ,styleAuthor :: !Text
                  ,styleDescription :: !Text
                  ,styleInitialState :: !s
                  ,styleExtenders :: ![Extender s]
                  ,styleDefConfig :: !Config
                  }

It contains authorship metadata. It holds an initial state which can be used during printing. Most importantly, it has a list of extenders: means to extend the printer and change its behaviour on a node-type-specific basis.

Take a normal pretty printing approach. It’s usually something like:

class Pretty a where
  pretty :: a -> String

Then for all the types in the AST we implement an instance of Pretty.

The limitation here is that we can’t, as a user of the library, decide to print one particular node type differently.

Instead, here’s a new class:

class (Annotated ast,Typeable ast) => Pretty ast where
  prettyInternal :: ast NodeInfo -> Printer ()

(It runs in a Printer type to store some state about the current column and indentation level, things like that.)

Now, we implement the prettyInternal method for all our types. But when implementing instances, we never use the prettyInternal method directly. Instead, we use another function, pretty, which can be considered the main “API” for printing, like before. Here’s the type:

pretty :: (Pretty ast) => ast NodeInfo -> Printer ()

We’ll look at its implementation in a moment. Here is an example instance of Pretty:

instance Pretty ClassDecl where
  prettyInternal x =
    case x of
      ClsTyDef _ this that ->
        do write "type "
           pretty this
           write " = "
           pretty that
      …

See how pretty this and pretty that are used for recursing instead of prettyInternal? This approach is used for all instances.

Now let’s see what that affords us:

pretty :: (Pretty ast) => ast NodeInfo -> Printer ()
pretty a =
  do st <- get
     case st of
       PrintState{psExtenders = es,psUserState = s} ->
         do case listToMaybe (mapMaybe (makePrinter s) es) of
              Just m -> m
              Nothing -> prettyNoExt a
            printComments a
  where makePrinter s (Extender f) =
          case cast a of
            Just v -> Just (f s v)
            Nothing -> Nothing
        makePrinter s (CatchAll f) = (f s a)

In this function, we’re grabbing our (mentioned earlier) list of [Extender s] values from psExtenders and then looking up to see if any of the types match. To clarify, here is the Extender type:

data Extender s where
  Extender :: forall s a. (Typeable a) => (s -> a -> Printer ())
           -> Extender s
  CatchAll :: forall s. (forall a. Typeable a => s -> a -> Maybe (Printer ()))
           -> Extender s

Both constructors are rank-n. Both accept the current state as an argument and the current node. The Extender constructor is Prismesque. It’s existential, and lets you say “I want things of this type”. The CatchAll will just accept anything.
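
For instance, a do-nothing extender built with CatchAll might look like this (a hypothetical sketch of mine, not something shipped with hindent); a real style would return Just a printer for the nodes it wants to override and Nothing for everything else:

-- Decline to handle any node, deferring everything to the fundamental printer.
deferAll :: Extender s
deferAll = CatchAll (\_ _ -> Nothing)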

All that adds up to me being able to do something like this. Here’s a demo style:

demo =
  Style {styleName = "demo"
        ,styleAuthor = "Chris Done"
        ,styleDescription = "Demo for hindent."
        ,styleInitialState = ()
        ,styleExtenders =
           [Extender (\_ x ->
                        case x of
                          InfixApp _ a op b ->
                            do pretty a
                               pretty op
                               pretty b
                          _ -> prettyNoExt x)]
        ,styleDefConfig = def}

(The prettyNoExt is a publicly exposed version of the (private) method prettyInternal.)

Now let’s test the fundamental style versus our demo style:

λ> test fundamental "x = foo (a * b) + what"
x =
  foo
    (a * b) + what
λ> test demo "x = foo (a * b) + what"
x =
  foo
    (a*b)+what

Voilà! We’ve configured how to pretty print infix operators in a few lines of code.

In practice, there are three styles. Here’s a larger example:

foo = do print "OK, go"; foo (foo bar) -- Yep.
          (if bar then bob else pif) (case mu {- cool -} zot of
            Just x -> return ();
            Nothing -> do putStrLn "yay"; return (1,2)) bill -- Etc
  where potato Cakes {} = 2 * x foo * bar / 5

Fundamental style:

foo =
  do print
       "OK, go"
     foo
       (foo
          bar)
       (if bar
           then bob
           else pif)
       (case mu {- cool -}
               zot of
          Just x ->
            return
              ()
          Nothing ->
            do putStrLn
                 "yay"
               return
                 (1,2))
       bill -- Etc
  where potato Cakes{} =
          2 * x
                foo * bar / 5

Johan Tibell’s style (which I’ve only started implementing, there’re some things I need to clarify):

foo = do
    print "OK, go"
    foo
        (foo bar)
        (if bar
             then bob
             else pif)
        (case mu {- cool -} zot of
             Just x ->
                 return ()
             Nothing -> do
                 putStrLn "yay"
                 return (1, 2))
        bill -- Etc
  where
    potato Cakes{} =
        2 * x foo * bar / 5

My style (pretty much complete):

foo =
  do print "OK, go"
     foo (foo bar)
         (if bar
             then bob
             else pif)
         (case mu {- cool -} zot of
            Just x -> return ()
            Nothing ->
              do putStrLn "yay"
                 return (1,2))
         bill -- Etc
  where potato Cakes{} = 2 * x foo * bar / 5

The styles are part of the module hierarchy of the package.

Write your own style!

I welcome anyone to write their own style. All styles are based upon the fundamental style, which should never change, by extending it. You can base yours off of another style, or start from scratch. While you can keep your style locally, like your XMonad configuration, it’s encouraged to contribute your style as a module.
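
As a starting point, a skeleton style that overrides nothing (and therefore behaves exactly like the fundamental style) can be as small as this; the field values here are my own placeholders, mirroring the demo style shown earlier:

myStyle :: Style
myStyle =
  Style {styleName = "my-style"
        ,styleAuthor = "Your Name Here"
        ,styleDescription = "A personal style; add Extender values as needed."
        ,styleInitialState = ()
        ,styleExtenders = []
        ,styleDefConfig = def}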

My style and Johan’s are quite different. But yours may be similar with small tweaks. Another distinctly different style is Simon Peyton Jones’s with explicit braces. This is a style you can implement if you want it.

See the contributing section of the README for more info.

Preserving meaning

A recommendation is to preserve the meaning of the code. Don’t make AST changes. Like removing parentheses, changing $ into parens, moving lets into wheres, etc. You can do it, but the results might surprise you.

Editing advantages

Having implemented Emacs support, I have been using this for a few weeks on my own code. It’s amazing. I’ve all but stopped manually making style changes. I just write code and then hit C-c i and it almost always does exactly what I want.

It can’t always do what I want. It has simple, predictable heuristics. But even when it doesn’t do what I want, I’ve so far been okay with the results. The advantages vastly outweigh that.

Remaining problems

I need to write a test suite for the fundamental style (and maybe the others). This isn’t hard, it’s just a bit laborious so I haven’t gotten round to it yet.

There’re some parts of the AST I haven’t finished filling out. You can see them marked by FIXME’s. This just means if you try to format a node type it doesn’t know how to print, you’ll get a message saying so. Your code won’t be touched.

Comment re-insertion is a little bit of a challenge. I have a decent implementation that generally preserves comments well. There’re some corner cases that I’ve noticed, but I’m confident I have the right change to make to clear that up.

The fundamental printer is fast. My personal ChrisDone style is slower, due to its heuristics of deciding when to layout a certain way. It took 6s to layout a complex and large declaration. I updated my style and brought it down to 2s. That’s okay for now. There are always speed tricks that can be done.

There are of course issues like whether HSE can parse your code, whether you have #ifdef CPP pragmas in your code, and things like that. That’s just part of the general problem space of tools like this.

Remaining ideas

Currently you can only reformat a declaration. I don’t yet trust (any) Haskell printer with my whole file. I invoke it on a per-declaration basis and I see the result. That’s good for me at the moment. But in the future I might extend it to support whole modules.

Implement some operator-specific layouts. There are some operators that can really only be laid out in a certain way. Control.Applicative operators spring to mind:

Foo <$> foo
    <*> bar
    <*> mu

This can easily be handled as part of a style. Other considerations might be strings of lens operators. I’ve noticed that people tend not to put spaces around them, like:4

foo^.bar.^mu.~blah

There’s also alignment, which is another possibility and easily implemented. The challenge will be deciding when alignment will look good versus making the code too wide and whitespacey. In my own style I personally haven’t implemented any alignment as it’s not that important to me, but I might one day.

Summary

Hopefully I’ve motivated the case that style is important, that formalizing style is important, and that automating it is practical and something we should solve and then move on, redirecting the energy we used to waste on manually laying things out and debating.


  1. Granted, this isn’t a technical consideration. But it’s a real one.

  2. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  3. Both of these are referring to conventions other than simple layout, but I don’t think that detracts the point.

  4. I don’t know lens’s operators so that might not be real code, but the point is that could be a style to implement.

September 20, 2014 12:00 AM

Formatting in Haskell

This post is about the formatting package.

What’s wrong with printf?

The Text.Printf module is problematic simply because it’s not type-safe:

λ> import Text.Printf
λ> printf "" 2
*** Exception: printf: formatting string ended prematurely
λ> printf "%s" 2
*** Exception: printf: bad formatting char 's'

And it’s not extensible in the argument type. The PrintfType class does not export its methods.

And it’s not extensible in the formatting. You can’t add a “%K” syntax to it to print a value in Kelvins, for example.

And it’s implicit. You can’t just use your normal API searching facilities to search how to print a Day.

Holy Moly!

A while ago I was inspired by the HoleyMonoid package to use that mechanism to make a general replacement for printf.

It’s a continuation-based way of building up monoidal functions by composition with the ability to insert constants in-between. Example:

let holey = now "x = "
          . later show
          . now ", y = "
          . later show

> run holey 3 5
"x = 3, y = 5"

The now function inserts a monoidal value directly into the composition. So

run (now x . now y)

is equivalent to

x <> y

And

run (later show . now x . later show . now y)

is equivalent to

\a b -> show a <> x <> show b <> y

The Formatting package

The package is available on Hackage as formatting.

Comparison with other formatters

Example:

format ("Person's name is " % text %  ", age is " % hex) "Dave" 54

or with short-names:

format ("Person's name is " % t % ", age is " % x) "Dave" 54

Similar to C’s printf:

printf("Person's name is %s, age is %x","Dave",54);

and Common Lisp’s FORMAT:

(format nil "Person's name is ~a, age is ~x" "Dave" 54)

The Holey type

newtype HoleyT r a m = Holey { runHM :: (m -> r) -> a }

type Holey m r a = HoleyT r a m

This is my version of the HoleyMonoid. To make this into a useful package I changed a few things.

The Category instance implied a name conflict burden with (.), so I changed that to (%):

(%) :: Monoid n => Holey n b c -> Holey n b1 b -> Holey n b1 c

Rather than have the name-conflicting map function, I flipped the type arguments of the type and made it an instance of Functor.
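To make the mechanics concrete, here is a minimal sketch of how now, later, (%) and a generic run can be written against this Holey type. This is my own illustration of the idea rather than the package’s actual source, which may differ in details:

import Data.Monoid ((<>))

-- The type from above, repeated so the sketch is self-contained.
newtype HoleyT r a m = Holey { runHM :: (m -> r) -> a }
type Holey m r a = HoleyT r a m

-- Insert a constant monoidal chunk; no argument is consumed.
now :: m -> Holey m r r
now x = Holey ($ x)

-- Defer to an argument supplied later, rendered with the given function.
later :: (a -> m) -> Holey m r (a -> r)
later f = Holey (\k -> k . f)

-- Composition: build the left part, then the right, and append the results.
(%) :: Monoid n => Holey n b c -> Holey n b1 b -> Holey n b1 c
f % g = Holey (\k -> runHM f (\m1 -> runHM g (\m2 -> k (m1 <> m2))))

-- Feed the identity continuation to extract the accumulated monoid.
run :: Holey m m a -> a
run h = runHM h id

With these definitions, run (now "x = " % later show) 3 gives "x = 3" when the monoid is String, matching the behaviour described above.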

Printers

There is an array of top-level printing functions for various output types:

-- | Run the formatter and return a lazy 'Text' value.
format :: Holey Builder Text a -> a

-- | Run the formatter and return a strict 'S.Text' value.
sformat :: Holey Builder S.Text a -> a

-- | Run the formatter and return a 'Builder' value.
bprint :: Holey Builder Builder a -> a

-- | Run the formatter and print out the text to stdout.
fprint :: Holey Builder (IO ()) a -> a

-- | Run the formatter and put the output onto the given 'Handle'.
hprint :: Handle -> Holey Builder (IO ()) a -> a

All the combinators work on a lazy text Builder which has good appending complexity and can output to a handle in chunks.

The Format type

There is a short-hand type for any formatter:

type Format a = forall r. Holey Builder r (a -> r)

All formatters are written in terms of now or later.

Formatters

There is a standard set of formatters in Formatting.Formatters, for example:

text :: Format Text
int :: Integral a => Format a
sci :: Format Scientific
hex :: Integral a => Format a

Finally, there is a general build function that will build anything that is an instance of the Buildable class from the text-format package:

build :: Buildable a => Format a

For which there are a bunch of instances. See the README for a full set of examples.
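For example, a quick check of my own (not from the README), assuming the Buildable instance for Int that text-format provides:

λ> format ("the answer is " % build) (42 :: Int)
"the answer is 42"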

Composing formatters

%. is like % but feeds one formatter into another:

λ> format (left 2 '0' %. hex) 10
"0a"

Extension

You can include things verbatim in the formatter:

> format (now "This is printed now.")
"This is printed now."

Although with OverloadedStrings you can just use string literals:

> format "This is printed now."
"This is printed now."

You can handle things later which makes the formatter accept arguments:

> format (later (const "This is printed later.")) ()
"This is printed later."

The function passed to later should return a value of some type that is an instance of Monoid.

later :: (a -> m) -> Holey m r (a -> r)

The function you format with (format, bprint, etc.) will determine the monoid of choice. In the case of this library, the top-level formatting functions expect you to build a text Builder:

format :: Holey Builder Text a -> a

Because builders are efficient generators.

So in this case we will be expected to produce Builders from arguments:

format . later :: (a -> Builder) -> a -> Text

To do that for common types you can just re-use the formatting library and use bprint:

> :t bprint
bprint :: Holey Builder Builder a -> a
> :t bprint int 23
bprint int 23 :: Builder

Coming back to later, we can now use it to build our own printer combinators:

> let mint = later (maybe "" (bprint int))
> :t mint
mint :: Holey Builder r (Maybe Integer -> r)

Now mint is a formatter to show Maybe Integer:

> format mint (readMaybe "23")
"23"
> format mint (readMaybe "foo")
""

Although a better, more general combinator might be:

> let mfmt x f = later (maybe x (bprint f))

Now you can use it to maybe format things:

> format (mfmt "Nope!" int) (readMaybe "foo")
"Nope!"

Retrospective

I’ve been using formatting in a bunch of projects since writing it. Happily, its API has been stable since release, apart from some additions.

It has the same advantages as Parsec. It’s a combinator-based mini-language with all the same benefits.

September 20, 2014 12:00 AM

September 19, 2014

JP Moresmau

A new Haskell IDE!

Well, that title is click-bait. It's only a proof of concept, so far (-:, sorry!

I wanted to play with Threepenny-GUI since it allows you to do UI the FRP way, without the trouble of getting a proper UI library to work. And I was going through a period of frustration with EclipseFP, so I thought about something else for a while... It's got the romantic name of HWIDE!

So in fact, this is a very simple integration of CodeMirror and Threepenny-GUI, to be able to edit Haskell sources inside your browser. When you save your buffer, the data is written to disk, and if a cabal file could be found for that source file, a cabal build (in a sandbox) is attempted (with a configure if needed). The output is then parsed by code ripped off BuildWrapper, and the errors/warnings are displayed, so you can click on them and see the offending line in the code.

That's all, really, and even that is not 100% perfect, but it's a start. I could get to play a bit with Events and Behaviors, develop some little widgets. If some FRP experts want to have a look at the code and offer some advice, I'd be all ears!

I probably won't have much time to turn this into the next generation Haskell IDE, but please fork and hack to your heart's content! The repository is at https://github.com/JPMoresmau/dbIDE.

Happy Haskell Hacking!

by JP Moresmau (noreply@blogger.com) at September 19, 2014 01:15 PM

September 18, 2014

Chung-chieh Shan

Derailing

A type checker that derails:
show :: a -> String
“OK”
x :: Int
“OK”
show x
“but you can’t show functions”

A proof checker that derails:
“Birds fly. Tweety the Canary is a bird. So Tweety flies.”
“But Fido the Ostrich doesn’t fly.”


Another type checker that derails:
main = putStrLn s
“OK, how about s = show id”

Another proof checker that derails:
“Fido doesn’t fly.”
“But Fido is a bird, and birds fly.”


A helpful type checker:
show id
“but you can’t show functions”

A helpful proof checker:
“Birds fly. Fido is a bird. So Fido flies.”
“But Fido is an ostrich, and ostriches don’t fly.”


Another helpful type checker:
main = putStrLn s
“OK, how about s = show 42”

Another helpful proof checker:
“Tweety doesn’t fly.”
“But Tweety is a bird, and birds fly.”

September 18, 2014 05:33 PM

JP Moresmau

Learning Scala via unit tests

Last month I followed a big data training session, and we used Scala to write algorithms for Spark. I had looked at Scala some years ago but I think at the time the Eclipse support wasn't very good (the pot calling the kettle black, eh?), and it piqued my curiosity again. So I started looking at tutorials, and found Scala Labs. The idea is interesting: you get a project with sources and tests, and you need to make all the tests pass. The source code for both the code to change and the unit tests is heavily documented, and guides you through the language nicely. And seeing the green ticks appear always triggers the proper reward areas of my brain (-:

I had a little issue getting the code to compile using Eclipse, since there's an sbt-launch-0.13.0.jar library at the root of the project, and Eclipse used that as the top library, which had a weird version of the List class that wasn't genericized! I removed that jar from the class path and everything worked fine.

I'm not aware of a similar tutorial for Haskell, but that would be a good idea!

by JP Moresmau (noreply@blogger.com) at September 18, 2014 05:07 PM

Ian Ross

Non-diffusive atmospheric flow #6: principal components analysis

The pre-processing that we’ve done hasn’t really got us anywhere in terms of the main analysis we want to do – it’s just organised the data a little and removed the main source of variability (the seasonal cycle) that we’re not interested in. Although we’ve subsetted the original geopotential height data both spatially and temporally, there is still a lot of data: 66 years of 181-day winters, each day of which has 72×15 Z500 values. This is a very common situation to find yourself in if you’re dealing with climate, meteorological, oceanographic or remote sensing data. One approach to this glut of data is something called dimensionality reduction, a term that refers to a range of techniques for extracting “interesting” or “important” patterns from data so that we can then talk about the data in terms of how strong these patterns are instead of what data values we have at each point in space and time.

I’ve put the words “interesting” and “important” in quotes here because what’s interesting or important is up to us to define, and determines the dimensionality reduction method we use. Here, we’re going to side-step the question of determining what’s interesting or important by using the de facto default dimensionality reduction method, principal components analysis (PCA). We’ll take a look in detail at what kind of “interesting” and “important” PCA give us a little later.

PCA is, in principle, quite a simple method, but it causes many people endless problems. There are some very good reasons for this:

  • PCA is in some sense nothing more than a generic change of basis operation (with the basis we change to chosen in a special way). The result of this is that a lot of the terminology used about PCA is also very generic, and hence very confusing (words like “basis”, “component”, “eigenvector”, “projection” and so on could mean more or less anything in this context!).

  • PCA is used in nearly every field where multivariate data is analysed, and is the archetypical “unsupervised learning” method. This means that it has been invented, reinvented, discovered and rediscovered many times, under many different names. Some other names for it are: empirical orthogonal function (EOF) analysis, the Karhunen-Loève decomposition, proper orthogonal decomposition (POD), and there are many others. Each of these different fields also uses different terms for the different outputs from PCA. This is very confusing: some people talk about principal components, some about empirical orthogonal functions and principal component time series, some about basis functions, and so on. Here, we’re going to try to be very clear and careful about the names that we use for things to try to alleviate some of the confusion.

  • There is a bit of a conceptual leap that’s necessary to go from very basic examples of using PCA to using PCA to analyse the kind of spatio-temporal data we have here. I used to say something like: “Well, there’s a nice two-dimensional example, and it works just the same in 100 dimensions, so let’s just apply it to our atmospheric data!” A perfectly reasonable response to that is: “WHAT?! Are you an idiot?”. Here, we’re going to take that conceptual leap slowly, and describe exactly how the “change of basis” view of PCA works for spatio-temporal data.

  • There are some aspects of the scaling of the different outputs from PCA that are really confusing. In simple terms, PCA breaks your data down into two parts, and you could choose to put the units of your data on either one of those parts, normalising the other part. Which one you put the units on isn’t always an obvious choice and it’s really easy to screw things up if you do it wrong. We’ll look at this carefully here.

So, there’s quite a bit to cover in the next couple of articles. In this article, we will: explain the basic idea of PCA with a very simple (two-dimensional!) example; give a recipe for how to perform PCA on a data set; talk about why PCA works from an algebraic standpoint; talk about how to do these calculations in Haskell. Then in the next article, we will: describe exactly how we do PCA on spatio-temporal data; demonstrate how to perform PCA on the Z500 anomaly data; show how to visualise the Z500 PCA results and save them for later use. What we will end up with from this stage of our analysis is a set of “important” spatial patterns (we’ll see what “important” means for PCA) and time series of how strong each of those spatial patterns is at a particular point in time. The clever thing about this decomposition is that we can restrict our attention to the few most “important” patterns and discard all the rest of the variability in the data. That makes the subsequent exploration of the data much simpler.

The basic idea of PCA

We’re going to take our first look at PCA using a very simple example. It might not be immediately obvious how the technique we’re going to develop here will be applicable to the spatio-temporal Z500 data we really want to analyse, but we’ll get to that a little later, after we’ve seen how PCA works in this simple example and we’ve done a little algebra to get a clearer understanding of just why the “recipe” we’re going to use works the way that it does.

Suppose we go to the seaside and measure the shells of mussels1. We’ll measure the length and width of each shell and record the data for each mussel as a two-dimensional (length, width) vector. There will be variation in the sizes and shapes of the mussels, some longer, some shorter, some fatter, some skinnier. We might end up with data that looks something like what’s shown below, where there’s a spread of length in the shells around a mean of about 5 cm, a spread in the width of shells around a mean of about 3 cm, and there’s a clear correlation between shell length and width (see Figure 1 below). Just from eyeballing this picture, it seems apparent that maybe measuring shell length and width might not be the best way to represent this data – it looks as though it could be better to think of some combination of length and width as measuring the overall “size” of a mussel, and some other combination of length and width as measuring the “fatness” or “skinniness” of a mussel. We’ll see how a principal components analysis of this data extracts these two combinations in a clear way.

The code for this post is available in a Gist. The Gist contains a Cabal file as well as the Haskell source, to make it easy to build. Just do something like this to build and run the code in a sandbox:

git clone https://gist.github.com/d39bf143ffc482ea3700.git pca-2d
cd pca-2d
cabal sandbox init
cabal install
./.cabal-sandbox/bin/pca-2d

Just for a slight change, I’m going to produce all the plots in this section using Haskell, specifically using the Chart library. We’ll use the hmatrix library for linear algebra, so the imports we end up needing are:

import Control.Monad
import Numeric.LinearAlgebra.HMatrix
import Graphics.Rendering.Chart.Easy hiding (Matrix, Vector, (|>), scale)
import Graphics.Rendering.Chart.Backend.Cairo

There are some name overlaps between the monadic plot interface provided by the Graphics.Rendering.Chart.Easy module and hmatrix, so we just hide the overlapping ones.

We generate 500 synthetic data points:

-- Number of test data points.
n :: Int
n = 500

-- Mean, standard deviation and correlation for two dimensions of test
-- data.
meanx, meany, sdx, sdy, rho :: Double
meanx = 5.0 ; meany = 3.0 ; sdx = 1.2 ; sdy = 0.6 ; rho = 0.75

-- Generate test data.
generateTestData :: Matrix Double
generateTestData =
  let seed = 1023
      mean = 2 |> [ meanx, meany ]
      cov = matrix 2 [ sdx^2       , rho*sdx*sdy
                     , rho*sdx*sdy , sdy^2       ]
      samples = gaussianSample seed n mean cov
  in fromRows $ filter ((> 0) . minElement) $ toRows samples

The mussel shell length and width values are generated from a two-dimensional Gaussian distribution, where we specify mean and standard deviation for both shell length and width, and the correlation between the length and width (as the usual Pearson correlation coefficient). Given this information, we can generate samples from the Gaussian distribution using hmatrix’s gaussianSample function. (If we didn’t have this function, we would calculate the Cholesky decomposition of the covariance matrix we wanted, generate samples from a pair of standard one-dimensional Gaussian distributions and multiply two-dimensional vectors of these samples by one of the Cholesky factors of the covariance matrix – this is just what the gaussianSample function does for us.) We do a little filtering in generateTestData to make sure that we don’t generate any negative values2.
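As a sketch of that manual route (not the code used in this post, and assuming hmatrix exports randn, chol and tr with the usual conventions, where chol returns the upper triangular factor), it might look something like this:

-- Hypothetical alternative to gaussianSample: colour standard normal
-- samples with a Cholesky factor of the target covariance matrix.
choleskySample :: Int -> Vector Double -> Matrix Double -> IO (Matrix Double)
choleskySample nSamples mu cov = do
  z <- randn nSamples (size mu)   -- nSamples x 2 standard Gaussian draws
  let c = chol cov                -- upper triangular, so cov == tr c <> c
  -- each row of (z <> c) then has covariance cov; shift every row by mu
  return $ fromRows $ map (+ mu) $ toRows (z <> c)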

The main program that drives the generation of the plots we’ll look at below is:

main :: IO ()
main = do
  let dat = generateTestData
      (varexp, evecs, projs) = pca dat
      (mean, cov) = meanCov dat
      cdat = fromRows $ map (subtract mean) $ toRows dat
  forM_ [(PNG, "png"), (PDF, "pdf"), (SVG, "svg")] $ \(ptype, suffix) -> do
    doPlot ptype suffix dat evecs projs 0
    doPlot ptype suffix cdat evecs projs 1
    doPlot ptype suffix cdat evecs projs 2
    doPlot ptype suffix cdat evecs projs 3
  putStrLn $ "FRACTIONAL VARIANCE EXPLAINED: " ++ show varexp

and you can see the doPlot function that generates the individual plots in the Gist. I won’t say a great deal about the plotting code, except to observe that the new monadic API to the Chart library makes generating this kind of simple plot in Haskell no harder than it would be using Gnuplot or something similar. The plot code produces one of four plots depending on an integer parameter, which ranges from zero (the first plot above) to three. Because we’re using the Cairo backend to the Chart library, we can generate image output in any of the formats that Cairo supports – here we generate PDF (to insert into LaTeX documents), SVG (to insert into web pages) and PNG (for a quick look while we’re playing with the code).

The main program above is pretty simple: generate test data, do the PCA calculation (by calling the pca function, which we’ll look at in detail in a minute), do a little bit of data transformation to help with plotting, then call the doPlot function for each of the plots we want. Here are the plots we produce, which we’ll refer to below as we work through the PCA calculation:

Figure 1: Synthetic mussel shell test data for two-dimensional PCA example.

Figure 2: Centred synthetic mussel shell test data for two-dimensional PCA example.

Figure 3: PCA eigenvectors for two-dimensional PCA example.

Figure 4: Data projection onto PCA eigenvectors for two-dimensional PCA example.


Let’s now run through the “recipe” for performing PCA, looking at the figures above in parallel with the code for the pca function:

 1  pca :: Matrix Double -> (Vector Double, Matrix Double, Matrix Double)
 2  pca xs = (varexp, evecs, projs)
 3    where (mean, cov) = meanCov xs
 4          (_, evals, evecCols) = svd cov
 5          evecs = fromRows $ map evecSigns $ toColumns evecCols
 6          evecSigns ev = let maxelidx = maxIndex $ cmap abs ev
 7                             sign = signum (ev ! maxelidx)
 8                         in cmap (sign *) ev
 9          varexp = scale (1.0 / sumElements evals) evals
10          projs = fromRows $ map project $ toRows xs
11          project x = evecs #> (x - mean)

We’ll look at just why this recipe works in the next section, but for the moment, let’s just see what happens:

  1. We start with our original mussel shell data (Figure 1 above).

  2. We calculate the mean and covariance of our data (line 3 of the pca function listing). PCA analyses the deviations of our data from the mean, so we effectively look at “centred” data, as shown in Figure 2, where we’ve just removed the mean from each coordinate in our data. The mean and covariance calculation is conveniently done using hmatrix’s meanCov function.

  3. Then we calculate the eigendecomposition of the covariance matrix. Because the covariance matrix is a real symmetric matrix, by construction, we know that the eigenvectors will form a complete set that we can use as a basis to represent our data. (We’re going to blithely ignore all questions of possible degeneracy here – for real data, “almost surely” means always!) Here, we do the eigendecomposition using a singular value decomposition (line 4 in the listing of the pca function). The singular values give us the eigenvalues and the right singular vectors give us the eigenvectors. The choice here to use SVD (via hmatrix’s svd function) rather than some other means of calculating an eigendecomposition is based primarily on the perhaps slightly prejudiced idea that SVD has the best and most stable implementations – here, hmatrix calls out to LAPACK to do this sort of thing, so there’s probably not much to choose, since the other eigendecomposition implementations in LAPACK are also good, but my prejudice in favour of SVD remains! If you want some better justification for why SVD is “the” matrix eigendecomposition, take a look at this very interesting historical review of the development of SVD: G. W. Stewart (1993). On the early history of the singular-value decomposition. SIAM Rev. 35(4), 551-566.

  4. We do a little manipulation of the directions of the eigenvectors (lines 5-8 in the listing), flipping the signs of them to make the largest components point in the positive direction – this is mostly just to make the eigenvectors look good for plotting. The eigenvectors are shown in Figure 3: we’ll call them e1 (the one pointing to the upper right) and e2 (the one pointing to the upper left). Note that these are unit vectors. We’ll talk about this again when we look at using PCA for spatio-temporal data.

  5. Once we have unit eigenvectors, we can project our (centred) data points onto these eigenvectors (lines 10 and 11 of the listing: the project function centres a data point by taking off the mean, then projects onto each eigenvector using hmatrix’s matrix-vector product operator #>). Figure 4 shows in schematic form how this works – we pick out one data point in green and draw lines parallel and orthogonal to the eigenvectors showing how we project the data point onto the eigenvectors. Doing this for each data point is effectively just a change of basis: instead of representing our centred data value by measurements along the x- and y-axes, we represent it by measurements in the directions of e1 and e2. We’ll talk more about this below as well.

  6. Finally, the eigenvalues from the eigendecomposition of the covariance matrix tell us something about how much of the total variance in our input data is “explained” by the projections onto each of the eigenvectors. I’ve put the word “explained” in quotes because I don’t think it’s a very good word to use, but it’s what everyone says. Really, we’re just saying how much of the data variance lies in the direction of each eigenvector. Just as you can calculate the variance of the mussel length and width individually, you can calculate the variance of the projections onto the eigenvectors. The eigenvalues from the PCA eigendecomposition tell you how much variance there is in each direction, and we calculate the “fraction of variance explained” for each eigenvector and return it from the pca function.

So, the pca function returns three things: eigenvalues (actually fractional explained variance calculated from the eigenvalues) and eigenvectors from the PCA eigendecomposition, plus projections of each of the (centred) data points onto each of the eigenvectors. The terminology for all these different things is very variable between different fields. We’re going to sidestep the question of what these things are called by always explicitly referring to PCA eigenvectors (or, later on when we’re dealing with spatio-temporal data, PCA eigenpatterns), PCA explained variance fractions and PCA projected components. These terms are a bit awkward, but there’s no chance of getting confused this way. We could choose terminology from one of the fields where PCA is commonly used, but that could be confusing for people working in other fields, since the terminology in a lot of cases is not very well chosen.

Together, the PCA eigenvectors and PCA projected components constitute nothing more than a change of orthonormal basis for representing our input data – the PCA output contains exactly the same information as the input data. (Remember that the PCA eigenvectors are returned as unit vectors from the pca function, so we really are just looking at a simple change of basis.) So it may seem as though we haven’t really done anything much interesting with our data. The interesting thing comes from the fact that we can order the PCA eigenvectors in decreasing order of the explained variance fraction. If we find that data projected onto the first three (say) eigenvectors explains 80% of the total variance in our data, then we may be justified in considering only those three components. In this way, PCA can be used as a dimensionality reduction method, allowing us to use low-dimensional data analysis and visualisation techniques to deal with input data that has high dimensionality.
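As a concrete illustration of that truncation (a sketch of my own written against the pca outputs above, not something we need later), we can keep only the leading k eigenvectors and rebuild approximate centred data points from the truncated projections:

-- Keep the leading k PCA eigenvectors and reconstruct approximate
-- (centred) data points from the corresponding projected components.
reconstruct :: Int -> Matrix Double -> Matrix Double -> Matrix Double
reconstruct k evecs projs =
  let evecsK = fromRows $ take k $ toRows evecs        -- k x K leading eigenvectors
      projsK = fromColumns $ take k $ toColumns projs  -- N x k leading components
  in projsK <> evecsK                                  -- N x K approximate centred data

For the mussel example, reconstruct 1 evecs projs keeps only the overall “size” component and discards the “fatness” direction; for the Z500 data the same move takes us from 1080 dimensions down to a handful.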

This is exactly what we’re going to do with the Z500 data: we’re going to perform PCA, and take only the leading PCA eigenvectors and components, throwing some information away. The way that PCA works guarantees that the set of orthogonal patterns we keep are the “best” patterns in terms of explaining the variance in our data. We’ll have more to say about this in the next section when we look at why our centre/calculate covariance/eigendecomposition recipe works.

The algebra of PCA

In the last section, we presented a “recipe” for PCA (at least for two-dimensional data): centre the data; calculate the covariance matrix; calculate the eigendecomposition of the covariance matrix; project your centred data points onto the eigenvectors. The eigenvalues give you a measure of the proportion of the variance in your data in the direction of the corresponding eigenvector. And the projection of the data points onto the PCA eigenvectors is just a change of basis, from whatever original basis your data was measured in (mussel shell length and width as the two components of each data point in the example) to a basis with the PCA eigenvectors as basis vectors.

So why does this work? Obviously, you can use whatever basis you like to describe your data, but why is the PCA eigenbasis useful and interesting? I’ll explain this quite quickly, since it’s mostly fairly basic linear algebra, and you can read about it in more detail in more or less any linear algebra textbook3.

To start with, let’s review some facts about eigenvectors and eigenvalues. For a matrix \(A\), an eigenvector \(\mathbf{u}\) and its associated eigenvalue \(\lambda\) satisfy

\[ A\mathbf{u} = \lambda\mathbf{u}. \]

The first thing to note is that any scalar multiple of \(\mathbf{u}\) is also an eigenvector, so an eigenvector really refers to a “direction”, not to a specific vector with a fixed magnitude. If we multiply both sides of this by \(\mathbf{u}^T\) and rearrange a little, we get

\[ \lambda = \frac{\mathbf{u}^T A \mathbf{u}}{\mathbf{u}^T \mathbf{u}}. \]

The denominator of the fraction on the right hand side is just the squared length of the vector \(\mathbf{u}\). Now, we can find the largest eigenvalue \(\lambda_1\) and corresponding eigenvector \(\mathbf{u}_1\) by solving the optimisation problem

\[ \mathbf{u}_1 = \arg\max_{\mathbf{u}^T\mathbf{u} = 1} \mathbf{u}^T A \mathbf{u}, \]

where for convenience, we’ve restricted the optimisation to find a unit eigenvector, and we find \(\lambda_1\) directly from the fact that \(A\mathbf{u}_1 = \lambda_1\mathbf{u}_1\).

We can find the next largest (in magnitude) eigenvalue and corresponding eigenvector of the matrix \(A\) by projecting the rows of \(A\) into the subspace orthogonal to \(\mathbf{u}_1\) to give a new matrix \(A_1\) and solving the optimisation problem

\[ \mathbf{u}_2 = \arg\max_{\mathbf{u}^T\mathbf{u} = 1} \mathbf{u}^T A_1 \mathbf{u}, \]

finding the second largest eigenvalue \(\lambda_2\) from \(A_1\mathbf{u}_2 = \lambda_2\mathbf{u}_2\). Further eigenvectors and eigenvalues can be found in order of decreasing eigenvalue magnitude by projecting into subspaces orthogonal to all the eigenvectors found so far and solving further optimisation problems.

This link between this type of optimisation problem and the eigenvectors and eigenvalues of a matrix is the key to understanding why PCA works the way that it does. Suppose that we have centred our (\(K\)-dimensional) data, and that we call the \(N\) centred data vectors \(\mathbf{x}_i\), \(i = 1, 2, \dots, N\). If we now construct an \(N \times K\) matrix \(X\) whose rows are the \(\mathbf{x}_i\), then the sample covariance of the data is

\[ C = \frac{1}{N-1} X^T X. \]

Now, given a direction represented as a unit vector \(\mathbf{u}\), we can calculate the data variance in that direction as \(\frac{1}{N-1}\|X\mathbf{u}\|^2\), so that if we want to know the direction in which the data has the greatest variance, we solve an optimisation problem of the form

\[ \mathbf{u}_1 = \arg\max_{\mathbf{u}^T\mathbf{u} = 1} (X\mathbf{u})^T X\mathbf{u} = \arg\max_{\mathbf{u}^T\mathbf{u} = 1} \mathbf{u}^T C\,\mathbf{u}. \]

But this optimisation problem is just the eigendecomposition optimisation problem for the covariance matrix \(C\). This demonstrates that we can find the directions of maximum variance in our data by looking at the eigendecomposition of the covariance matrix \(C\) in decreasing order of eigenvalue magnitude.
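Spelling out the step between the two forms (this is just the algebra, made explicit):

\[ \frac{1}{N-1}\|X\mathbf{u}\|^2 = \frac{1}{N-1}(X\mathbf{u})^T X\mathbf{u} = \mathbf{u}^T \Big( \frac{1}{N-1} X^T X \Big) \mathbf{u} = \mathbf{u}^T C\,\mathbf{u}, \]

so maximising the data variance over unit vectors \(\mathbf{u}\) is, up to a constant factor that does not affect the maximiser, exactly the eigenvalue optimisation problem for \(C\).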

There are a couple of things to add to this. First, the covariance matrix is, by construction, a real symmetric matrix, so its eigenvectors form a complete basis – this means that we really can perform a change of basis from our original data to the PCA basis with no loss of information. Second, because the eigenvectors of the covariance matrix are orthogonal, the projections of our data items onto the eigenvector directions (what we’re going to call the PCA projected components) are uncorrelated. We’ll see some consequences of this when we look at performing PCA on the Z500 data. Finally, and related to this point, it’s worth noting that PCA is a linear operation – the projected components are linearly uncorrelated, but that doesn’t mean that there can’t be some nonlinear relationship between them. There are generalisations of PCA to deal with this case, but we won’t be talking about them for the purposes of this analysis.

Everything we’ve done here is pretty straightforward, but you might be wondering why we would want to change to this PCA basis at all? What’s the point? As I noted above, but is worth reiterating, the most common use for PCA, and the way that we’re going to use it with the Z500 data, is as a dimensionality reduction method. For the Z500 data, we have, for each day we’re looking at, 72×15=1080 spatial points, which is a lot of data to look at and analyse. What we usually do is to perform PCA, then ignore all but the first few leading PCA eigenvectors and projected components. Because of the way the optimisation problems described above are set up, we can guarantee that the leading m PCA eigenvectors span the m-dimensional subspace of the original data space containing the most data variance, and we can thus convince ourselves that we aren’t missing interesting features of our data by taking only those leading components. We’ll see how this works in some detail when we do the PCA analysis of the Z500 data, but in the mussel measurement case, this would correspond to thinking just of the projection of the mussel length and width data along the leading e1 eigendirection, so reducing the measurements to a single “size” parameter that neglects the variation in fatness or skinniness of the mussels. (This two-dimensional case is a bit artificial. Things will make more sense when we look at the 1080-dimensional case for the Z500 data.)


  1. Well, not really, since I live in the mountains of Austria and there aren’t too many mussels around here, so I’ll generate some synthetic data!

  2. Obviously, a Gaussian distribution is not right for quantities like lengths that are known to be positive, but here we’re just generating some data for illustrative purposes, so we don’t care all that much. If we were trying to model this kind of data though, we’d have to be more careful.

  3. I like Gilbert Strang’s Linear Algebra and its Applications, although I’ve heard from some people that they think it’s a bit hard for a first textbook on the subject – if you’ve had any exposure to this stuff before though, it’s good.


September 18, 2014 01:39 PM

Keegan McAllister

Raw system calls for Rust

I wrote a small library for making raw system calls from Rust programs. It provides a macro that expands into in-line system call instructions, with no run-time dependencies at all. Here's an example:

#![feature(phase)]

#[phase(plugin, link)]
extern crate syscall;

fn write(fd: uint, buf: &[u8]) {
    unsafe {
        syscall!(WRITE, fd, buf.as_ptr(), buf.len());
    }
}

fn main() {
    write(1, "Hello, world!\n".as_bytes());
}

Right now it only supports x86-64 Linux, but I'd love to add other platforms. Pull requests are much appreciated. :)

by keegan (noreply@blogger.com) at September 18, 2014 12:33 AM

September 17, 2014

Philip Wadler

The military and the referendum.


Many readers will have heard about Lord Dannat in the Telegraph arguing a vote for independence will dishonour Scotland's war dead.

Perhaps not as many will have heard that Jimmy Sinclair (the oldest surviving Desert Rat, aged 102), Colin May (Lieutenant Commander, Faslane), and sixteen others have written a letter slamming Dannat; at least, I didn't hear until this morning. "How dare he take their sacrifice in vain and try to turn it to political advantage?"

Both sides are reported by the BBC, though the headline mentions only one. (More #bbcbias?)

by Philip Wadler (noreply@blogger.com) at September 17, 2014 08:42 PM

Eric Kidd

Deploying Rust applications to Heroku, with example code for Iron

Now with support for Iron, Cargo and Cargo.lock!

You can deploy an example Rust application to Heroku using this button:

Deploy

If you'd prefer to use the command line, you'll need git and the Heroku toolbelt. Once these are installed, run:

git clone https://github.com/emk/heroku-rust-cargo-hello.git
cd heroku-rust-cargo-hello
heroku create --buildpack https://github.com/emk/heroku-buildpack-rust.git
git push heroku master

This will download the example application, create a new Heroku project, and deploy the code to Heroku. That's it!

How it works

Our server is based on the Iron middleware library. We parse URL parameters and dispatch HTTP requests to the appropriate handler routine using Iron's router module:

fn main() {
    let mut router = Router::new();
    router.get("/", hello);
    router.get("/:name", hello_name);
    Iron::new(router).listen(Ipv4Addr(0, 0, 0, 0), get_server_port());
}

The hello function returns an HTTP status code and the content to send to the user:

fn hello(_: &mut Request) -> IronResult<Response> {
    Ok(Response::with(status::Ok, "Hello world!"))
}

The hello_name function is similar, but we look up the value of the :name parameter that we declared to the router, above.

fn hello_name(req: &mut Request) -> IronResult<Response> {
    let params = req.extensions.find::<Router,Params>().unwrap();
    let name = params.find("name").unwrap();
    Ok(Response::with(status::Ok, format!("Hello, {}!", name)))
}

The final piece needed to run on Heroku is a function to look up our port number:

fn get_server_port() -> Port {
    getenv("PORT")
        .and_then(|s| from_str::<Port>(s.as_slice()))
        .unwrap_or(8080)
}

The full source code is available on GitHub. To learn more about Rust, see the excellent Rust guide.

Keep reading for notes on building the program locally and on configuring your own programs to run on Heroku.

Read more…

September 17, 2014 08:35 PM

Philip Wadler

The case for Europe

Readers of this list will know that I don't always see eye-to-eye with Alex Salmond. Nonetheless, I think he put the case for Europe well this morning on the Today Program.
In a BBC interview just the other night, a Spanish minister being interviewed by Kirsty Wark, despite being invited three or four times, refused to say that Spain would attempt to veto Scottish membership. And the reason for that of course is that the Spanish government have already said that in the circumstance of a consented democratic referendum, as they put it, Spain would have nothing to say about it.

We can go through legal opinion and expert opinion as much as we like.
I think the answer is in four figures: 1, 20, 25, and 60.

1% is the percentage of Scotland's population compared to the European Union.

20% is the percentage of the fish stocks of the entire European Union.

25% is the percentage of the renewable energy of the entire European Union offshore.

And 60% is the oil reserves that Scotland has.

Anyone who believes that a country [with these resources] is not
going to be welcome in the wider Europe doesn't understand the process
by which Europe accepts democratic results and that Scotland has a
huge amount of attractiveness to the rest of the European continent.

You can hear the original here, the relevant segment starts at around 8:00.

by Philip Wadler (noreply@blogger.com) at September 17, 2014 08:10 PM

Mike Izbicki

Beginner error messages in C++ vs Haskell

posted on 2014-09-17 by Paul Starkey

Learning Haskell was excruciating. The error messages from the Haskell compiler ghc were way more difficult to understand than the error messages I was used to from g++. I admit I’m still a novice programmer: My only experience is a year of classes in C++ programming. But the Haskell compiler should do a better job generating error messages for beginners like me.

First we’ll see four concrete examples of ghc doing worse than g++, then Mike will talk about some ways to fix ghc’s error messages.

Example 1

Below are two equivalent C++ and Haskell programs. I’ve intentionally added some syntax errors:

    /* C++ Code */
    #include <iostream>

    using namespace std;

    int main () 
    {
        int in = -1;
        cout << "Please choose 1 for a message" << endl;
        cin >> in;
err->   if in == 1 
        {
            cout << "Hello, World!" << endl;
        }
        else{
            cout << "Error, wrong choice" << endl;
        }
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter 1 for a message"
        num <- getLine
        if num == "1"
            then do
                putStrLn "Hello, World" 
err->       
                putStrLn "Error, wrong choice"

Alright, so the first notable difference is that the Haskell code is much shorter. It takes up roughly half the space that the C++ code does, yet they both output hello world when the correct number is entered.

Great!

Haskell already seems better, right?

Wrong!

Notice how I messed up the if statement in both programs. In the C++ version, I forgot the parentheses, and in the Haskell version I forgot the else. Both omissions are simple mistakes that I’ve made while learning these languages.

Now let’s see the error messages:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:15:5: error: expected '(' before 'in'
    main.cpp:19:2: error: 'else' without a previous 'if'
    Compilation failed.
    -- Haskell Error --
    [..]main.hs:19:1:
        parse error (possibly incorrect indentation or mismatched brackets)
    Failed, modules loaded: none.

Both error messages let the programmer know where the mistake happened, but the g++ message is far more helpful. It tells us how to fix the syntax error by adding some missing parentheses. Bam! Problem solved.

Now let us turn to ghc’s output. Okay, something about a parse error… might have indentation errors… and no modules loaded. Cool. Now I’ve never taken a compiler course, so I don’t know what parse error means, and I have no idea how to fix it. The error message is simply not helpful.

Example 2

Here’s another example of parsing errors.

        /* C++ Code */ 
        #include <iostream>
        
        using namespace std;
        
        int main() 
        {
err->       string in = ""
            cout << "Please enter a single word and get the string size back" << endl;
            cin >> in;
        
            cout << "The size of your word, \"" << in << "\", is "
                 << in.length() << "!" << endl;
            return 0;
        }
        {- Haskell Code -}
err->   main = {-do-}
            putStrLn "Please enter a single word and get the string size back"
            num <- getLine
            let size = length num
            putStrLn $ "The size of your word, \"" ++ num ++ "\", is " ++ show size ++ "!"

As you can see, in the C++ I forgot to include a semicolon and in the Haskell I forgot the do in main. As a beginner, I’ve personally made both of these mistakes.

Here are the error messages:

    -- C++ Error --
    main.cpp:8:2: error: expected ',' or ';' before 'cout'
    Compilation failed.
    -- Haskell Error --
    [..]main.hs:4:13:
        parse error on input '<-'
    Failed, modules loaded: none.

C++ delivers a clear message explaining how to fix the error. Haskell, however, is not so helpful. It says there’s a parse error on the input operator. How should I know this is related to a missing do?

Example 3

Next let’s see what happens when you call the built-in strlen and length functions with no arguments at all.

    /* C++ Code */
    #include <iostream>
    #include <cstring>

    using namespace std;
    
    int main (){
        char input[256];
        cout << "Please enter a word" << endl;
        cin >> input;
    
err->   cout << "The size of your string is: " << (unsigned)strlen();
        cout << "!" << endl;
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a word"
        num <- getLine
err->   let size = length 
        putStrLn $ "The size of your string is: " ++ show size ++ "!"

Now let us see the different error messages that are produced:

    -- C++ Error --
    main.cpp: In function 'int main()':
    main.cpp:11:61: error: too few arguments to function 'size_t strlen(const char*)'
    Compilation failed.
    -- Haskell Error --
    [..]main.hs:7:36:
    No instance for (Show ([a0]->Int)) arising from a use of 'show'
    Possible fix: add an instance declaration for (Show ([a0]->Int))
    In the first argument of '(++)', namely 'show size'
    In the second argument of '(++)', namely 'show size ++ "!"'
    In the second argument of '(++)', namely
      '"\", is " ++ show size ++ "!"'
    Failed, modules loaded: none.

Once again, it appears that the C++ compiler g++ knew exactly what was wrong with the code and how to fix the error. It tells me that there are not enough arguments in my function call.

Wow, Haskell’s error message is quite the mouthful this time. I suppose this is better than just a parse error message, but I’m not sure what exactly ghc is even wanting me to correct. The error is simply too technical to help me.

Example 4

Next, we will look at what happens when you pass too many arguments to functions in both languages:

    /* C++ Code */
    #include <iostream>
    using namespace std;

    int main () {
        string in[256];
        cout << "Please enter a single word to get the string size back" << endl;
        cin >> in;
    
err->   cout << "The size of your string, \"" << in << "\", is " << (unsigned)strlen(in, in);
        cout << "!" << endl;
        return 0;
    }
    {- Haskell Code -}
    main = do
        putStrLn "Please enter a single word to get the string size back"
        num <- getLine
err->   let size = length num num
        putStrLn $ "The size of your string, \"" ++ num ++ "\", is " ++ show size ++ "!"

And the errors:

    -- C++ Error --
    main.cpp:16:78: error: too many arguments to function 'int newLength(std::string)'
    main.cpp:6:5: note: declared here
    Compilation failed.
    -- Haskell Error --
    Couldn't match expected type 'String -> t0' with actual type 'Int'  
    The function 'length' is applied to two arguments,
    but its type '[Char] -> Int' has only one
    In the expression: length num num
    In an equation for 'size':size = length num num
    Failed, modules loaded: none.

The C++ error clearly explains how to fix the code, and I even understand the Haskell error this time. Both languages tell me that there are too many arguments. Yet the C++ error message tells me this without a bunch of technical jargon. So even when Haskell is actually helpful with its error messages, it still manages to hide what it wants the user to do.

Conclusion

To me, Haskell seems like a language only for experienced programmers because the errors are not user-friendly. How can I write code if a few simple mistakes cripple my progress? Haskell’s compiler ghc simply lags behind g++ in terms of useful error messages for beginners.

Mike’s Epilogue

I’ve created a patch for ghc that clarifies the specific error messages that Paul had trouble with (and a few related ones). In particular:

  1. Anytime there is a parse error caused by a malformed if, case, lambda, or (non-monadic) let, ghc will now remind the programmer of the correct syntax. In the first example Paul gives above, we would get the much clearer:

    parse error in if statement: missing required else clause
  2. To help with the second example, anytime ghc encounters a parse error caused by a <- token, it now outputs the hint:

    Perhaps this statement should be within a 'do' block?
  3. The third example Paul points out comes from the type checker, rather than the parser. It’s a little less obvious how to provide good hints here. My idea is based on the fact that it is fairly rare for functions to be instances of type classes. The only example I know of is the Monad instance for ((->) a).

    Therefore, if the type checker can’t find an instance for a function, the more likely scenario is that the programmer simply did not pass enough parameters to the function. My proposed change is that in this situation, ghc would output the hint:

    maybe you haven't applied enough arguments to a function?

This patch doesn’t completely fix ghc’s problem with poor error messages. For example, it doesn’t address Paul’s last point about type errors being verbose. But hopefully it will make it a little easier for aspiring haskellers who still aren’t familiar with all the syntax rules.

September 17, 2014 12:00 AM

September 16, 2014

Neil Mitchell

Towards Shake 1.0

Summary: I've just released a new version of Shake, with a --demo feature and an underlying continuation monad. I want to release v1.0 in the near future.

I've just released a new version of the Shake build system, version 0.13.3. While the version number is only 0.0.1 higher, the changelog lists a large number of improvements. In particular, two changes are:

  • The Action monad is now based on continuations (a minimal sketch of the idea follows below), which allows Shake to suspend threads without requiring a GHC RTS thread. The result is significantly less memory used on thread stacks. I still find it quite amazing that Haskell has such powerful and robust abstraction mechanisms that a huge change doesn't even break the API.
  • The shake binary now features a --demo mode, invoked by running shake --demo. This mode generates a Shake project, compiles it, and shows off some of the features of Shake. You can see the output of --demo here.
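To give a flavour of what "based on continuations" means, here is a toy continuation-passing monad of my own, not Shake's actual Action type: when a computation suspends, the rest of it is captured as a function and handed to a scheduler, so no thread has to sit blocked waiting for it to be resumed.

-- A toy continuation-based monad (not Shake's real Action type).
newtype Action a = Action { runAction :: (a -> IO ()) -> IO () }

instance Functor Action where
  fmap f (Action m) = Action $ \k -> m (k . f)

instance Applicative Action where
  pure x = Action ($ x)
  Action mf <*> Action mx = Action $ \k -> mf (\f -> mx (k . f))

instance Monad Action where
  return = pure
  Action m >>= f = Action $ \k -> m (\x -> runAction (f x) k)

-- Suspend the computation: the continuation is handed to 'register',
-- which can store it and call it later (say, when a dependency has
-- finished building). The current thread is free as soon as
-- 'register' returns.
suspend :: ((a -> IO ()) -> IO ()) -> Action a
suspend = Action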

Version 1.0

With the two features above, I'm now looking towards Shake version 1.0. I'm not looking to make any significant backwards-incompatible change, or necessarily any code/API changes at all. However, if you have anything you think should be addressed before reaching such a milestone, please comment on the issue tracker or email the mailing list.

Shake website

The one thing I still want to finish before releasing version 1.0 is to have a proper website for Shake. I've registered shakebuild.com which will host the content, and have set up GitHub pages to serve it up. I have some rough content in the docs directory and a prototype generator in the website directory - as an example it currently generates something a bit like this for the user manual, but with a table of contents when run through the latest generator. I'd appreciate any help with the content, the generator, or the styling - just email the mailing list.

by Neil Mitchell (noreply@blogger.com) at September 16, 2014 08:54 PM

Chung-chieh Shan

A challenge for a better community


Did you know that all ACM-sponsored conferences have an anti-harassment policy? I didn’t, until I chaired the Haskell Symposium last year. The concise policy says, among other things, that people shouldn’t use my family constitution to interfere with my professional participation. And the policy has teeth. That’s great.

My not knowing the policy and not seeing it publicized didn’t make me go out of my way to harass anyone. But it did make me less sure and less comfortable that I belonged at ICFP. Briefly, it’s because I didn’t know if it would be common ground at the conference that my actual self was fully human. That’s not something I can take for granted in general society. Also, it’s because I didn’t know whether my fellow conference attendees were aware of the policy. We could all use a reminder, and a public signal that we mean it.

For these reasons, I’m very happy that ICFP will start to publicize ACM’s existing anti-harassment policy and make sure everyone registered knows it. All ACM conferences should do it. That’s why Tim Chevalier, Clement Delafargue, Adam Foltzer, Eric Merritt, and I are doing two things. We ask you to join us:

  1. Donate to the Ada Initiative. Our goal is for the functional programming community to raise $8192 by the end of Friday (Sept 19) UTC. To count toward this goal, please use this link: http://supportada.org/?campaign=lambda
  2. Call on the ACM and tell your friends. For example, I tweeted this:
    I donate to @AdaInitiative because I want @TheOfficialACM events to announce their anti-harassment policy http://supportada.org/?campaign=lambda #lambda4ada

Thanks for improving our professional homes!

(UPDATE: Wow! We reached our initial goal $4096 in just 5 hours! We increased the goal to $8192, thanks to your generosity. And if we raise $16,384, we will sing “There’s no type class like Show type class” and put a recording on the Internet.)

September 16, 2014 04:09 PM

September 15, 2014

Robert Harper

Scotland: Vote No

So far I’ve ignored the back and forth on the Scottish referendum on secession from the United Kingdom, but this weekend I decided that it was past time for me to sort it out.  For those of you who don’t know me, I’ll mention that I lived for 3.5 years in Scotland quite some time ago, so I am not completely ignorant of the cultural and political issues that underly the debate.  As a rule my political views are very much in line with those of the average Scot, solidly Labour Party back in the day when people like Derek Hatton and Ken Livingston and Roy Hattersley and Tony Benn defined what that meant.  Despite Tony Blair’s slimy “third way” nonsense, and his toadying up to Dick “Dick” Cheney’s sock puppet to help lie us into the Iraq war, Scotland in national politics remains solidly Labour; practically every Scottish seat is a Labour seat.

Although I used to be so up on British politics that I could read and enjoy Private Eye, it’s been a long while since I’ve paid more than scant attention to what’s been going on there, apart from noting that The Scotsman was one of the few sources of truth about the Iraq War back when it really mattered.  The Scots have spines.

I’m no historian, but I do have basic understanding of Scottish history, particularly as regards the English, and am very familiar with the Scottish concept of valor in glorious defeat.  I understand full well that practically every Scotsman harbors some resentment towards the English for centuries of injustices, including the highland clearances, and, more recently, the appropriation of the oil in Scottish territory for the scant benefit of the Scots themselves.  And I am well aware of the bravery and sacrifice that so many Scots made fighting against the Axis during World War II.

My home institution, Carnegie Mellon University, was founded by a Scotsman from Kirkaldy, just across the spectacular Forth Bridge from Edinburgh.  Carnegie was born into penury and died as the wealthiest man on earth, far wealthier relative to GDP than Gates by a wide margin.  Carnegie was extraordinary, but the Scots in general punch far above their weight class in all things, especially industrious self-reliance.

In short, I love Scotland, and consider it to be a second home.  (OK, the weather is appalling, but we’ll set that aside for the time being.)

Emotionally, I am deeply sympathetic to the Scottish independence movement.  I know full well how poorly the U.K. treats Scotland and its interests.  Politics in the UK revolves around the “home counties” in the south of England; the terminology tells you all you need to know.  One time while watching the weather report on the BBC, the national broadcasting network, the announcer said that there was some horrendous weather coming our way, but that “it’ll mostly be up in Scotland, though”.  Though.  Though.

But I urge all my Scottish friends to vote NO on the independence proposal.  It makes no sense whatsoever in its present form, and represents to me a huge scam being perpetrated by the SNP to seize power and impose policies that nearly every Scot, judging from their voting record over decades and decades, would oppose.  The whole movement seems driven by the powerful urge to finally stick it to the English and get their country back, and Salmond is exploiting that to the hilt.  Back when I lived in Scotland I looked into the SNP, because even then I had separatist sympathies, but when I did, it was obvious why they had so few backers.  They’re just Tories without the class structure, more akin to our Tea Party lunatics than to the British Conservatives, and steadfastly opposed to policies, such as well-funded public education, that nearly all Scots support, and determined to follow the post-cold war Slovakian model of slashing taxes on the wealthy in the hope of attracting business to the country.  Having not followed Scottish politics for so long, it is astonishing to me that the SNP has managed to gain a majority in the Scottish Parliament, while the voting pattern at the national level has not changed at all.  How did this happen?  From my position of ignorance of the last decade or so of politics in Scotland, it looks as though Salmond is a slick operator who has pulled off a colossal con by exploiting the nationalist tendencies that lie within every Scot.

But never mind Salmond, the main reason that Scots must vote NO on the referendum is that it proposes to keep the English pound as Scotland’s national currency!  This is such a preposterous idea that I can only suspect dishonesty and deceit, because no sane political leader of honest intent could ever voluntarily place his or her country’s economic future in the hands of another.  The Bank of England will, particularly after separation, have no interest whatsoever in the economic conditions in Scotland when determining its policies on the pound.  And the Bank of Scotland will have no ability to control its own currency, the prime means of maintaining economic balance between labor and capital.  The Scots will, in effect, be putting themselves on a gold standard, the stupidest possible monetary system, so that, in a crisis, they will have to buy or borrow pounds, at interest, in emergency conditions, to deal with, say, the failure of the Royal Bank of Scotland (but don’t worry, that sort of thing can never happen again).  And the Bank of Scotland will have no means of stimulating the economy in a demand slump other than borrowing pounds from somewhere outside the country, rendering themselves in debt beyond their means.  And this will become an excuse for dismantling the social system that has been so important to elevating the Scots from poverty to a decent standard of living within one or two generations.  Just look at the poor PIGS in the Euro-zone being pushed around by Germany, especially, to satisfy the conveniences of the German bankers, and to hell with the living, breathing souls in Greece or Spain or Ireland or Portugal, to name the canonical victims.

A country that does not control its own currency is not independent and cannot be independent.  It’s an illusion.  Just what are Salmond’s true intentions are not entirely clear to me, but on the basis of his monetary policies alone, I implore my Scottish friends to suppress the natural wish to make a statement of pride, and instead do the sensible thing.  The proposal to be voted on this week is not a spittle on the  Heart of Midlothian, it is an irrevocable decision to place Scotland in an even worse position with respect to England than it already is in.

Listen to reason.  Vote NO on independence.



by Robert Harper at September 15, 2014 04:05 PM

The GHC Team

GHC Weekly News - 2014/09/15

Hi *,

Here's a new thing: blog posts! That's right. A while back, we started a new set of emails on the developers list containing weekly updates from GHC HQ. But we eventually decided it should be broader and encompass the work GHC sees as a project - including all the things our contributors do.

So now it's the weekly GHC news - and we (or, well, I) have decided to blogify the weekly emails!

So without further ado, here's the current recap. The original mailing list copy is available here.

  • As Gabor mentioned on the list earlier today, I (Austin) accidentally broke the Windows build. Sorry. :( We really need to get Phab building Windows too ASAP... I'm working on a fix this morning.
  • I sent out the HCAR draft this morning. Please edit it! I think we have a few weeks of lead time however, so we're not in a rush like last time. But I will send reminders. :)
  • The server migration for ghc.haskell.org seems to have gone pretty smoothly in the past week. It's had plenty of traffic so far. The full migration is still ongoing and I want to complete it this week.
  • I've finished reorganizing some of the Git and Repository pages after some discussion. We now have the Repositories[1] page, linked to on the left side, which details some notes on the repositories we use, and links to other various pages. I'm thinking of replacing this side-bar "root" with a link to the main Git[2] page, perhaps.
  • Miscellaneous: ghc.haskell.org and phabricator.haskell.org now set the Strict-Transport-Security header. This just means you always use SSL when you visit those pages (so you can't be connection-hijacked via an insecure HTTP redirect).
  • GHC works on Wine apparently for all you Linux users - thanks Mikolaj![3]
  • Jan had some questions about infrastructure which I just followed up on this morning. In particular: does anyone feel strongly about his first question?[4]
  • Herbert committed the first part of the Traversable/Foldable changes, by moving the typeclasses to Prelude. This is part of an ongoing series of patches. Things like adding Bifunctor will finally come after this.[5]

Also, added bonus: we'll start including some of the tickets we closed this week.

Closed tickets for the past week include: #9585, #9545, #9581, #6086, #9558, and #3658.

Please let me know if you have any questions.

[1] https://ghc.haskell.org/trac/ghc/wiki/Repositories
[2] https://ghc.haskell.org/trac/ghc/wiki/WorkingConventions/Git
[3] https://www.haskell.org/pipermail/ghc-devs/2014-September/006283.html
[4] https://www.haskell.org/pipermail/ghc-devs/2014-September/006275.html
[5] https://phabricator.haskell.org/D209

by thoughtpolice at September 15, 2014 01:47 PM

Mike Izbicki

Comparing AVL Trees in C++ and Haskell

Comparing AVL Trees in C++ and Haskell

posted on 2014-09-15 by Dat Do

This post compares the runtimes of AVL tree operations in C++ vs Haskell. In particular, we insert 713,000 strings from a file into an AVL Tree. This is a \(O(n \log n)\) operation. But we want to investigate what the constant factor looks like in different situations.

Experimental setup: All the code for these tests is available in the github repository. The C++ AVL tree was created in a data structures course that I took recently and the Haskell AVL tree is from the Haskell library Data.Tree.AVL. Additionally, the Haskell code stores the strings as ByteStrings because they are much more efficient than the notoriously slow String. To see how the runtime is affected by files of different sizes, the file was first partitioned into 10 segments. The first segment has 71,300 words, the second 71,300 * 2 words, and so on. Both the C++ and Haskell programs were compiled with the -O2 flag for optimization. The test on each segment is the average runtime of three separate runs.
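
For concreteness, here is a sketch of what the single-threaded Haskell side of the benchmark looks like. The real harness lives in the linked github repository, so the argument handling and the way the tree is forced here are my assumptions; the load function has the same shape as the one in the parallel version shown later.

{-# LANGUAGE TemplateHaskell #-}
import Control.DeepSeq (deepseq)
import Control.DeepSeq.TH (deriveNFData)
import Data.COrdering (fstCC)
import Data.Tree.AVL as A
import System.CPUTime (getCPUTime)
import System.Environment (getArgs)
import qualified Data.ByteString.Char8 as B

$(deriveNFData ''AVL)

-- Insert elements from a list into an AVL tree
load :: AVL B.ByteString -> [B.ByteString] -> AVL B.ByteString
load t []     = t
load t (x:xs) = A.push (fstCC x) x (load t xs)

main :: IO ()
main = do
    [file] <- getArgs
    ws <- fmap B.lines (B.readFile file)
    deepseq ws (return ())          -- force the input before timing
    start <- getCPUTime
    let t = load A.empty ws
    deepseq t (return ())           -- force the whole tree
    end <- getCPUTime
    print (fromIntegral (end - start) / (10 ^ 12) :: Double)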

Here are the results:

[Figure: C++ vs Haskell AVL tree insertion runtimes across input sizes]

C++ is a bit faster than Haskell on the last partition for this test.

This is most likely because Haskell operates on immutable data. Every time a new element is to be inserted into the Haskell AVL tree, new parent nodes must be created because the old parent nodes cannot be changed. This creates quite a bit of overhead. C++, on the other hand, has mutable data and can simply change the node that a parent node is pointing to. This is faster than making a whole new copy the way the Haskell code does.

Is there an easy way to speed up our Haskell code?

There is a Haskell library called parallel that makes parallel computations really convenient. We’ll try to speed up our program with this library.

You might think that it is unfair to compare multithreaded Haskell against C++ that is not multithreaded. And you're absolutely right! But let's be honest, manually working with pthreads in C++ is quite the headache, while parallelism in Haskell is super easy.

Before we look at the results, let's look at the parallelized code. What we do is create four trees, each loaded with a portion of the set of strings. Then we call par on the trees so that they are evaluated in parallel. Afterwards, we union the trees into a single tree. Finally, we call deepseq so that everything is fully evaluated.

{-# LANGUAGE TemplateHaskell #-}
import Control.DeepSeq.TH
import Control.Concurrent
import Data.Tree.AVL as A
import Data.COrdering
import System.CPUTime
import qualified Data.ByteString.Char8 as B
import Control.DeepSeq
import Data.List.Split
import System.Environment
import Control.Parallel

$(deriveNFData ''AVL)

-- Inserts elements from list into AVL tree
load :: AVL B.ByteString -> [B.ByteString] -> AVL B.ByteString
load t [] = t
load t (x:xs) = A.push (fstCC x) x (load t xs)


main = do
    args <- getArgs
    contents <- fmap B.lines $ B.readFile $ args !! 0
    let l = splitEvery (length contents `div` 4) contents
    deepseq contents $ deepseq l $ return ()
    start <- getCPUTime

    -- Loading the tree with the subpartitions
    let t1 = load empty $ l !! 0
    let t2 = load empty $ l !! 1
    let t3 = load empty $ l !! 2
    let t4 = load empty $ l !! 3
    let p = par t1 $ par t2 $ par t3 t4

    -- Calling union to combine the trees
    let b = union fstCC t1 t2
    let t = union fstCC t3 t4
    let bt = union fstCC b t
    let bt' = par b $ par t bt
    deepseq p $ deepseq bt' $ return ()

    end <- getCPUTime
    n <- getNumCapabilities
    let diff = ((fromIntegral (end-start)) / (10^12) / fromIntegral n)
    putStrLn $ show diff

Great, so now that the Haskell code has been parallelized, we can compile and run the program again to see the difference. To compile for parallelism, we must use some special flags.

ghc -O2 filename -rtsopts -threaded

And to run the program (-N4 refers to the number of cores).

./filename +RTS -N4

[Figure: C++ vs Haskell runtimes, with the Haskell code run on 4 cores]

Haskell now gets better runtimes than C++.

Now that we know Haskell is capable of increasing its speeds through parallelism, it would be interesting to see how the runtime is affected by the degree of parallelism.

According to Amdahl's law, a program that is 100% parallelized will see a proportional speed up based on the number of threads of execution. For example, if a program that is 100% parallelized takes 2 seconds to run on 1 thread, then it should take 1 second to run using 2 threads. The code used for our test, however, is not 100% parallelized since there is a union operation performed at the end to combine the trees created by the separate threads. The union of the trees is an \(O(n)\) operation while the insertion of the strings into the AVL tree is an \(O\left(\frac{n \log n }{p}\right)\) operation, where \(p\) is the number of threads. Therefore, the runtime for our test should be

\[O\left(\frac{n\log{n}}{p} + n\right)\]

Here is a graph showing the runtime of the operation on the largest set (713,000 strings) across increasing levels of parallelism.

[Figure: Haskell runtime on the largest input (713,000 strings) vs number of cores]

Taking a look at the results, we can see that the improvement in runtime does not fit the 100% parallelized theoretical model, but does follow it to some extent. Rather than the 2-core runtime being 50% of the 1-core runtime, the 2-core runtime is 56% of the 1-core runtime, with decreasing efficiency as the number of cores increases. Still, it is clear that there are significant improvements in speed through the use of more processor cores, and that parallelism is an easy way to get better runtimes with little effort.
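
As a rough sanity check on that 56% figure (crudely assuming the insertion and union phases have comparable constant factors, which the measurements do not establish), the model above with \(\log_2 713000 \approx 19.4\) predicts

\[\frac{T(2)}{T(1)} \approx \frac{\frac{n\log{n}}{2} + n}{n\log{n} + n} = \frac{\frac{\log{n}}{2} + 1}{\log{n} + 1} \approx \frac{9.7 + 1}{19.4 + 1} \approx 0.52\]

so the observed 56% is in the right ballpark, with the remaining gap presumably due to the usual parallel overheads (spark scheduling, garbage collection, memory bandwidth).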

September 15, 2014 12:00 AM

September 14, 2014

Philip Wadler

The British Biased Corporation

Scandalous! Nick Robinson asks Alex Salmond a question, and Salmond takes seven minutes to answer in detail.

[Video: http://www.youtube.com/embed/rHmLb-RIbrM]


On the evening news, Nick Robinson summarises Salmond's answer in a few seconds as 'He didn't answer'.

[Video: http://www.youtube.com/embed/enrdDaf3uss]

(Above spotted via Arc of Prosperity.)

And today, this.
I used to be a supporter of the BBC, but it's getting harder and harder to justify.

by Philip Wadler (noreply@blogger.com) at September 14, 2014 10:01 PM

Krugman vs. Stiglitz, now with added Stiglitz

My last post quoted Joe Stiglitz, indirectly, to refute Paul Krugman's fear mongering. Now the man himself has spoken in the Sunday Herald.
As Scotland contemplates independence, some, such as Paul Krugman, have questioned the "economics".

Would Scotland, going it alone, risk a decline in standards of living or a fall in GDP? There are, to be sure, risks in any course of action: should Scotland stay in the UK, and the UK leave the EU, the downside risks are, by almost any account, significantly greater. If Scotland stays in the UK, and the UK continues in its policies which have resulted in growing inequality, even if GDP were slightly larger, the standards of living of most Scots could fall.

Cutbacks in UK public support to education and health could force Scotland to face a set of unpalatable choices - even with Scotland having considerable discretion over what it spends its money on.

But there is, in fact, little basis for any of the forms of fear-mongering that have been advanced. Krugman, for instance, suggests that there are significant economies of scale: a small economy is likely, he seems to suggest, not to do well. But an independent Scotland will still be part of Europe, and the great success of the EU is the creation of a large economic zone.

Besides, small political entities, like Sweden, Singapore, and Hong Kong have prospered, while much larger entities have not. By an order of magnitude, far more important is pursuit of the right policies.

Another example of a non-issue is the currency. There are many currency arrangements that would work. Scotland could continue using sterling - with or without England's consent.

Because the economies of England and Scotland are so similar, a common currency is likely to work far better than the euro - even without shared fiscal policy. But many small countries have managed to have a currency of their own - floating, pegged, or "managed."

by Philip Wadler (noreply@blogger.com) at September 14, 2014 06:13 PM

September 13, 2014

The Gentoo Haskell Team

ghc 7.8.3 and rare architectures

After some initially positive experience with ghc-7.8-rc1 I've decided to upstream most of the Gentoo fixes.

On rare arches ghc-7.8.3 behaves a bit badly:

  • ia64 build stopped being able to link itself after ghc-7.4 (gprel overflow)
  • on sparc, ia64 and ppc ghc was not able to create working shared libraries
  • integer-gmp library on ia64 crashed, and we had to use integer-simple

I have written a small story of those fixes here if you are curious.

TL;DR:

To get ghc-7.8.3 working more nicely on exotic arches you will need to backport at least the following patches:

Thank you!


by Sergei Trofimovich at September 13, 2014 12:35 PM

unsafePerformIO and missing NOINLINE

Two months ago Ivan asked me if we had a working darcs-2.8 for ghc-7.8 in Gentoo. We had a workaround to compile darcs at that point, but darcs did not work reliably. Sometimes it needed 2-3 attempts to pull a repository.

A bit later I decided to actually look at the failure case (filed on the darcs bug tracker) and do something about it. My idea for debugging the mystery was simple: reproduce the difference on the same source with ghc-7.6 and ghc-7.8, and keep adding debug info until a difference I could understand popped up.

Darcs has a great debug-verbose option for most commands. I used the debugMessage function to litter the code with more debugging statements until a complete, horrible picture emerged.

As you can see in the bug tracker issue, I posted various intermediate notes there about what I thought went wrong (don't expect those comments to make much sense).

The immediate consequence of the breakage was the overwrite of a partially downloaded file. The event timeline looked simple:

  • darcs scheduled the same file for download twice (two jobs in the download queue)
  • the first download job finished
  • the notified waiter started processing that downloaded temp file
  • the second download started truncating the previously completed download
  • the notified waiter continued processing the partially downloaded file and detected the breakage

Thus I first decided to fix the consequence. It did not fix the problems completely: sometimes darcs pull still complained about remote repositories being broken (missing files), but it made the errors saner (only the remote side was allegedly at fault).

Ideally, that file overwrite should not happen in the first place. Part of the problem was temp-file name predictability.

But, OK. Then I started digging into why the ghc-7.6 and ghc-7.8 download request patterns were so severely different. At first I suspected the new IO manager was the cause of the difference. The paper says it fixed a Haskell thread scheduling issue (the paper is nice even for leisure reading!):

GHC's RTS had a bug in which yield placed the thread back on the front of the run queue. This bug was uncovered by our use of yield, which requires that the thread be placed at the end of the run queue.

Thus I was expecting the bug to come from this side.

Then, being determined to dig A Lot into the darcs source code, I decided to disable optimizations (-O0) to speed up rebuilds. And the bug vanished.

That made things click: unsafePerformIO might be the real problem. I grepped for all unsafePerformIO occurrences and examined all definition sites.

Two were especially interesting:

-- src/Darcs/Util/Global.hs
-- ...
_crcWarningList :: IORef CRCWarningList
_crcWarningList = unsafePerformIO $ newIORef []
{-# NOINLINE _crcWarningList #-}
-- ...
_badSourcesList :: IORef [String]
_badSourcesList = unsafePerformIO $ newIORef []
{- NOINLINE _badSourcesList -}
-- ...

Did you spot the bug?

Thus The Proper Fix was pushed upstream a month ago. The underlying issue is that ghc-7.8 inlines things more aggressively: with the broken pragma, _badSourcesList was inlined at every use site, so each use site got its own fresh IORef and all updates were thrown away.

I don't know whether those newIORef [] expressions could be merged by CSE if the types had the same representation. Ideally the module also needs -fno-cse, or should get rid of unsafePerformIO completely :].
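
For reference, here is a minimal sketch of the corrected pattern (the module is trimmed down here; the real fix went into src/Darcs/Util/Global.hs):

{-# OPTIONS_GHC -fno-cse #-}
module Global (_badSourcesList) where

import Data.IORef (IORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- The brackets are now {-# ... #-}, so GHC actually sees the NOINLINE,
-- and -fno-cse keeps distinct top-level newIORef [] thunks from being
-- merged into one shared IORef.
_badSourcesList :: IORef [String]
_badSourcesList = unsafePerformIO (newIORef [])
{-# NOINLINE _badSourcesList #-}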

(Side thought: top-level global variables in C style are surprisingly non-trivial in "pure" haskell. They are easy to use via peek / poke (in a racy way), but are hard to declare / initialize.)

I wondered how many Haskell packages manage to misspell GHC pragma declarations the way darcs did. And there still are a few such offenders:

$ fgrep -R NOINLINE . | grep -v '{-# NOINLINE' | grep '{-'
--
ajhc-0.8.0.10/lib/jhc/Jhc/List.hs:{- NOINLINE filterFB #-}
ajhc-0.8.0.10/lib/jhc/Jhc/List.hs:{- NOINLINE iterateFB #-}
ajhc-0.8.0.10/lib/jhc/Jhc/List.hs:{- NOINLINE mapFB #-}
--
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _badSourcesList -}
darcs-2.8.4/src/Darcs/Global.hs:{- NOINLINE _reachableSourcesList -}
--
dph-lifted-copy-0.7.0.1/Data/Array/Parallel.hs:{- NOINLINE emptyP #-}
--
dph-par-0.5.1.1/Data/Array/Parallel.hs:{- NOINLINE emptyP #-}
--
dph-seq-0.5.1.1/Data/Array/Parallel.hs:{- NOINLINE emptyP #-}
--
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE showSSI #-}
freesect-0.8/FreeSectAnnotated.hs:{- # NOINLINE FreeSectAnnotated.showSSI #-}
freesect-0.8/FreeSect.hs:{- # NOINLINE fs_warn_flaw #-}
--
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE readInt64MH #-}
http-proxy-0.0.8/Network/HTTP/Proxy/ReadInt.hs:{- NOINLINE mhDigitToInt #-}
--
lhc-0.10/lib/base/src/GHC/PArr.hs:{- NOINLINE emptyP #-}
--
property-list-0.1.0.2/src/Data/PropertyList/Binary/Float.hs:{- NOINLINE doubleToWord64 -}
property-list-0.1.0.2/src/Data/PropertyList/Binary/Float.hs:{- NOINLINE word64ToDouble -}
property-list-0.1.0.2/src/Data/PropertyList/Binary/Float.hs:{- NOINLINE floatToWord32 -}
property-list-0.1.0.2/src/Data/PropertyList/Binary/Float.hs:{- NOINLINE word32ToFloat -}
--
warp-2.0.3.3/Network/Wai/Handler/Warp/ReadInt.hs:{- NOINLINE readInt64MH #-}
warp-2.0.3.3/Network/Wai/Handler/Warp/ReadInt.hs:{- NOINLINE mhDigitToInt #-}

Looks like there is yet something to fix :]

It would be great if hlint were able to detect pragma-like comments and warn when the comment contents form a valid pragma but the comment brackets don't allow it to fire.

{- NOINLINE foo -} -- bad
{- NOINLINE foo #-} -- bad
{-# NOINLINE foo -} -- bad
{-# NOINLINE foo #-} -- ok

Thanks for reading!


by Sergei Trofimovich at September 13, 2014 09:18 AM

September 12, 2014

Bjorn Buckwalter

Haskell tools for satellite operations

At last week's CUFP I did a talk called “Haskell tools for satellite operations”. The abstract is:

Since 2013-04 the presenter has been supporting SSC (the Swedish Space Corporation) in operating the telecommunications satellite “Sirius 3” from its Mission Control Center in Kiruna. Functions in the satellite vendor's operations software are breaking down as the orbit of the ageing satellite degrades. To fill in the gaps in software capabilities the presenter has developed several operational tools using Haskell.
The talk will give an overview of the satellite operations environment, the tools developed in Haskell, how they benefitted (and occasionally suffered) from the choice of implementation language, which (public) libraries were critical to their success, and how they were deployed in the satellite operations environment.
A video recording of the talk is available on the CUFP page for the talk and on youtube.

If this interests you, be sure to check out the other talk from the "Functional programming in space!" track: Michael Oswald's Haskell in the Mission Control Domain.

by Björn Buckwalter (noreply@blogger.com) at September 12, 2014 12:46 PM

Gabriel Gonzalez

Morte: an intermediate language for super-optimizing functional programs

The Haskell language provides the following guarantee (with caveats): if two programs are equal according to equational reasoning then they will behave the same. On the other hand, Haskell does not guarantee that equal programs will generate identical performance. Consequently, Haskell library writers must employ rewrite rules to ensure that their abstractions do not interfere with performance.

Now suppose there were a hypothetical language with a stronger guarantee: if two programs are equal then they generate identical executables. Such a language would be immune to abstraction: no matter how many layers of indirection you might add the binary size and runtime performance would be unaffected.

Here I will introduce such an intermediate language named Morte that obeys this stronger guarantee. I have not yet implemented a back-end code generator for Morte, but I wanted to pause to share what I have completed so far because Morte uses several tricks from computer science that I believe deserve more attention.

Morte is nothing more than a bare-bones implementation of the calculus of constructions, which is a specific type of lambda calculus. The only novelty is how I intend to use this lambda calculus: as a super-optimizer.

Normalization

The typed lambda calculus possesses a useful property: every term in the lambda calculus has a unique normal form if you beta-reduce everything. If you're new to lambda calculus, normalizing an expression equates to indiscriminately inlining every function call.

What if we built a programming language whose intermediate language was lambda calculus? What if optimization was just normalization of lambda terms (i.e. indiscriminate inlining)? If so, then we could abstract freely, knowing that while compile times might increase, our final executable would never change.

Recursion

Normally you would not want to inline everything because infinitely recursive functions would become infinitely large expressions. Fortunately, we can often translate recursive code to non-recursive code!

I'll demonstrate this trick first in Haskell and then in Morte. Let's begin from the following recursive List type along with a recursive map function over lists:

import Prelude hiding (map, foldr)

data List a = Cons a (List a) | Nil

example :: List Int
example = Cons 1 (Cons 2 (Cons 3 Nil))

map :: (a -> b) -> List a -> List b
map f Nil = Nil
map f (Cons a l) = Cons (f a) (map f l)

-- Argument order intentionally switched
foldr :: List a -> (a -> x -> x) -> x -> x
foldr Nil c n = n
foldr (Cons a l) c n = c a (foldr l c n)

result :: Int
result = foldr (map (+1) example) (+) 0

-- result = 9

Now imagine that we disable all recursion in Haskell: no more recursive types and no more recursive functions. Now we must reject the above program because:

  • the List data type definition recursively refers to itself

  • the map and foldr functions recursively refer to themselves

Can we still encode lists in a non-recursive dialect of Haskell?

Yes, we can!

-- This is a valid Haskell program

{-# LANGUAGE RankNTypes #-}

import Prelude hiding (map, foldr)

type List a = forall x . (a -> x -> x) -> x -> x

example :: List Int
example = \cons nil -> cons 1 (cons 2 (cons 3 nil))

map :: (a -> b) -> List a -> List b
map f l = \cons nil -> l (\a x -> cons (f a) x) nil

foldr :: List a -> (a -> x -> x) -> x -> x
foldr l = l

result :: Int
result = foldr (map (+ 1) example) (+) 0

-- result = 9

Carefully note that:

  • List is no longer defined recursively in terms of itself

  • map and foldr are no longer defined recursively in terms of themselves

Yet, we somehow managed to build a list, map a function over the list, and fold the list, all without ever using recursion! We do this by encoding the list as a fold, which is why foldr became the identity function.

This trick works for more than just lists. You can take any recursive data type and mechanically transform the type into a fold and transform functions on the type into functions on folds. If you want to learn more about this trick, the specific name for it is "Boehm-Berarducci encoding". If you are curious, this in turn is equivalent to an even more general concept from category theory known as "F-algebras", which let you encode inductive things in a non-inductive way.
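
To see the generality concretely, here is the same trick applied to natural numbers, sketched in Haskell (this example is not from the original post, but it is exactly the encoding the Morte code below uses for Nat):

{-# LANGUAGE RankNTypes #-}

-- A natural number is encoded as its own fold: given a successor function
-- and a zero, it applies the successor that many times.
type Nat = forall x . (x -> x) -> x -> x

zero :: Nat
zero = \s z -> z

suc :: Nat -> Nat
suc n = \s z -> s (n s z)

plus :: Nat -> Nat -> Nat
plus m n = \s z -> m s (n s z)

toInt :: Nat -> Int
toInt n = n (+ 1) 0

-- toInt (plus (suc zero) (suc (suc zero))) == 3

Just as foldr became the identity function for the list encoding, the fold over a Nat encoded this way is the number itself, which is why the Morte code later in this post defines foldNat as the identity.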

Non-recursive code greatly simplifies equational reasoning. For example, we can easily prove that we can optimize map id l to l:

map id l

-- Inline: map f l = \cons nil -> l (\a x -> cons (f a) x) nil
= \cons nil -> l (\a x -> cons (id a) x) nil

-- Inline: id x = x
= \cons nil -> l (\a x -> cons a x) nil

-- Eta-reduce
= \cons nil -> l cons nil

-- Eta-reduce
= l

Note that we did not need to use induction to prove this optimization because map is no longer recursive. The optimization became downright trivial, so trivial that we can automate it!

Morte optimizes programs using this same simple scheme:

  • Beta-reduce everything (equivalent to inlining)
  • Eta-reduce everything

To illustrate this, I will desugar our high-level Haskell code to the calculus of constructions. This desugaring process is currently manual (and tedious), but I plan to automate this, too, by providing a front-end high-level language similar to Haskell that compiles to Morte:

-- mapid.mt

( \(List : * -> *)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b) -> List a -> List b
)
-> \(id : forall (a : *) -> a -> a)

-> \(a : *) -> map a a (id a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- id
(\(a : *) -> \(va : a) -> va)

This line of code is the "business end" of the program:

\(a : *) -> map a a (id a)

The extra 'a' business is because in any polymorphic lambda calculus you explicitly accept polymorphic types as arguments and specialize functions by applying them to types. Higher-level functional languages like Haskell or ML use type inference to automatically infer and supply type arguments when possible.

We can compile this program using the morte executable, which accepts a Morte program on stdin, outputs the program's type to stderr, and outputs the optimized program on stdout:

$ morte < mapid.mt
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(l : ∀(x : *) → (a → x → x) → x → x) → l

The first line is the type, which is a desugared form of:

forall a . List a -> List a

The second line is the program, which is the identity function on lists. Morte optimized away the map completely, the same way we did by hand.

Morte also optimized away the rest of the code, too. Dead-code elimination is just an emergent property of Morte's simple optimization scheme.

Equality

We could double-check our answer by asking Morte to optimize the identity function on lists:

-- idlist.mt

( \(List : * -> *)
-> \(id : forall (a : *) -> a -> a)

-> \(a : *) -> id (List a)
)

-- List
(\(a : *) -> forall (x : *) -> (a -> x -> x) -> x -> x)

-- id
(\(a : *) -> \(va : a) -> va)

Sure enough, Morte outputs an alpha-equivalent result (meaning the same up to variable renaming):

$ ~/.cabal/bin/morte < idlist.mt
∀(a : *) → (∀(x : *) → (a → x → x) → x → x) → ∀(x : *) → (a
→ x → x) → x → x

λ(a : *) → λ(va : ∀(x : *) → (a → x → x) → x → x) → va

We can even use the morte library to mechanically check if two Morte expressions are alpha-, beta-, and eta- equivalent. We can parse our two Morte files into Morte's Expr type and then use the Eq instance for Expr to test for equivalence:

$ ghci
Prelude> import qualified Data.Text.Lazy.IO as Text
Prelude Text> txt1 <- Text.readFile "mapid.mt"
Prelude Text> txt2 <- Text.readFile "idlist.mt"
Prelude Text> import Morte.Parser (exprFromText)
Prelude Text Morte.Parser> let e1 = exprFromText txt1
Prelude Text Morte.Parser> let e2 = exprFromText txt2
Prelude Text Morte.Parser> import Control.Applicative (liftA2)
Prelude Text Morte.Parser Control.Applicative> liftA2 (==) e1 e2
Right True
$ -- `Right` means both expressions parsed successfully
$ -- `True` means they are alpha-, beta-, and eta-equivalent

We can use this to mechanically verify that two Morte programs optimize to the same result.

Compile-time computation

Morte can compute as much (or as little) at compile time as you want. The more information you encode directly within lambda calculus, the more compile-time computation Morte will perform for you. For example, if we translate our Haskell List code entirely to lambda calculus, then Morte will statically compute the result at compile time.

-- nine.mt

( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *) -> forall (b : *)
-> (a -> b) -> List a -> List b
)
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
)
-> ( \(two : Nat)
-> \(three : Nat)
-> ( \(example : List Nat)

-> foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero
)

-- example
(Cons Nat one (Cons Nat two (Cons Nat three (Nil Nat))))
)

-- two
((+) one one)

-- three
((+) one ((+) one one))
)

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a
)

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero
)

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero
)

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)
)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero
)

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x
)

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)
)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil
)

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas
)

The relevant line is:

foldr Nat (map Nat Nat ((+) one) example) Nat (+) zero

If you remove the type-applications to Nat, this parallels our original Haskell example. We can then evaluate this expression at compile time:

$ morte < nine.mt
∀(a : *) → (a → a) → a → a

λ(a : *) → λ(Succ : a → a) → λ(Zero : a) → Succ (Succ (Succ
(Succ (Succ (Succ (Succ (Succ (Succ Zero))))))))

Morte reduces our program to a church-encoded nine.

Run-time computation

Morte does not force you to compute everything using lambda calculus at compile time. Suppose that we wanted to use machine arithmetic at run-time instead. We can do this by parametrizing our program on:

  • the Int type,
  • operations on Ints, and
  • any integer literals we use

We accept these "foreign imports" as ordinary arguments to our program:

-- foreign.mt

-- Foreign imports
\(Int : *) -- Foreign type
-> \((+) : Int -> Int -> Int) -- Foreign function
-> \((*) : Int -> Int -> Int) -- Foreign function
-> \(lit@0 : Int) -- Literal "1" -- Foreign data
-> \(lit@1 : Int) -- Literal "2" -- Foreign data
-> \(lit@2 : Int) -- Literal "3" -- Foreign data
-> \(lit@3 : Int) -- Literal "1" -- Foreign data
-> \(lit@4 : Int) -- Literal "0" -- Foreign data

-- The rest is compile-time lambda calculus
-> ( \(List : * -> *)
-> \(Cons : forall (a : *) -> a -> List a -> List a)
-> \(Nil : forall (a : *) -> List a)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b) -> List a -> List b
)
-> \( foldr
: forall (a : *)
-> List a
-> forall (r : *)
-> (a -> r -> r) -> r -> r
)
-> ( \(example : List Int)

-> foldr Int (map Int Int ((+) lit@3) example) Int (+) lit@4
)

-- example
(Cons Int lit@0 (Cons Int lit@1 (Cons Int lit@2 (Nil Int))))
)

-- List
( \(a : *)
-> forall (x : *)
-> (a -> x -> x) -- Cons
-> x -- Nil
-> x
)

-- Cons
( \(a : *)
-> \(va : a)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Cons va (vas x Cons Nil)
)

-- Nil
( \(a : *)
-> \(x : *)
-> \(Cons : a -> x -> x)
-> \(Nil : x)
-> Nil
)

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \(l : forall (x : *) -> (a -> x -> x) -> x -> x)
-> \(x : *)
-> \(Cons : b -> x -> x)
-> \(Nil: x)
-> l x (\(va : a) -> \(vx : x) -> Cons (f va) vx) Nil
)

-- foldr
( \(a : *)
-> \(vas : forall (x : *) -> (a -> x -> x) -> x -> x)
-> vas
)

We can use Morte to optimize the above program and Morte will reduce the program to nothing but foreign types, operations, and values:

$ morte < foreign.mt
∀(Int : *) → (Int → Int → Int) → (Int → Int → Int) → Int →
Int → Int → Int → Int → Int

λ(Int : *) → λ((+) : Int → Int → Int) → λ((*) : Int → Int →
Int) → λ(lit : Int) → λ(lit@1 : Int) → λ(lit@2 : Int) →
λ(lit@3 : Int) → λ(lit@4 : Int) → (+) ((+) lit@3 lit) ((+)
((+) lit@3 lit@1) ((+) ((+) lit@3 lit@2) lit@4))

If you study that closely, you'll see that Morte adds lit@3 (the "1" literal) to each literal in the list and then adds them all up. We can then pass this foreign syntax tree to our machine arithmetic backend to transform those foreign operations into efficient operations.

Morte lets you choose how much information you want to encode within lambda calculus. The more information you encode in lambda calculus the more Morte can optimize your program, but the slower your compile times will get, so it's a tradeoff.

Corecursion

Corecursion is the dual of recursion. Where recursion works on finite data types, corecursion works on potentially infinite data types. An example would be the following infinite Stream in Haskell:

data Stream a = Cons a (Stream a)

numbers :: Stream Int
numbers = go 0
where
go n = Cons n (go (n + 1))

-- numbers = Cons 0 (Cons 1 (Cons 2 (...

map :: (a -> b) -> Stream a -> Stream b
map f (Cons a l) = Cons (f a) (map f l)

example :: Stream Int
example = map (+ 1) numbers

-- example = Cons 1 (Cons 2 (Cons 3 (...

Again, pretend that we disable any function from referencing itself so that the above code becomes invalid. This time we cannot reuse the same trick from previous sections because we cannot encode numbers as a fold without referencing itself. Try this if you don't believe me.

However, we can still encode corecursive things in a non-corecursive way. This time, we encode our Stream type as an unfold instead of a fold:

-- This is also valid Haskell code

{-# LANGUAGE ExistentialQuantification #-}

data Stream a = forall s . MkStream
{ seed :: s
, step :: s -> (a, s)
}

numbers :: Stream Int
numbers = MkStream 0 (\n -> (n, n + 1))

map :: (a -> b) -> Stream a -> Stream b
map f (MkStream s0 k) = MkStream s0 k'
where
k' s = (f a, s')
where (a, s') = k s

In other words, we store an initial seed of some type s and a step function of type s -> (a, s) that emits one element of our Stream. The type of our seed s can be anything and in our numbers example, the type of the internal state is Int. Another stream could use a completely different internal state of type (), like this:

-- ones = Cons 1 ones

ones :: Stream Int
ones = MkStream () (\_ -> (1, ()))

The general name for this trick is an "F-coalgebra" encoding of a corecursive type.
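
To convince yourself that this encoding really denotes the stream you expect, here is a small observation function, sketched here rather than taken from the original post and meant to be appended to the snippet above (it uses ordinary recursion, which is fine: it lives outside the recursion-free fragment we are pretending to restrict ourselves to):

takeStream :: Int -> Stream a -> [a]
takeStream n (MkStream s0 k) = go n s0
  where
    -- unfold the stream one step at a time, collecting the emitted elements
    go 0 _ = []
    go m s = let (a, s') = k s in a : go (m - 1) s'

-- takeStream 3 (map (+ 1) numbers) == [1,2,3]
-- takeStream 3 ones                == [1,1,1]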

Once we encode our infinite stream non-recursively, we can safely optimize the stream by inlining and eta reduction:

map id l

-- l = MkStream s0 k
= map id (MkStream s0 k)

-- Inline definition of `map`
= MkStream s0 k'
where
k' = \s -> (id a, s')
where
(a, s') = k s

-- Inline definition of `id`
= MkStream s0 k'
where
k' = \s -> (a, s')
where
(a, s') = k s

-- Inline: (a, s') = k s
= MkStream s0 k'
where
k' = \s -> k s

-- Eta reduce
= MkStream s0 k'
where
k' = k

-- Inline: k' = k
= MkStream s0 k

-- l = MkStream s0 k
= l

Now let's encode Stream and map in Morte and compile the following four expressions:

map id

id

map f . map g

map (f . g)

Save the following Morte file to stream.mt and then uncomment the expression you want to test:

(   \(id : forall (a : *) -> a -> a)
-> \( (.)
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> (a -> c)
)
-> \(Pair : * -> * -> *)
-> \(P : forall (a : *) -> forall (b : *) -> a -> b -> Pair a b)
-> \( first
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (a -> b)
-> Pair a c
-> Pair b c
)

-> ( \(Stream : * -> *)
-> \( map
: forall (a : *)
-> forall (b : *)
-> (a -> b)
-> Stream a
-> Stream b
)

-- example@1 = example@2
-> ( \(example@1 : forall (a : *) -> Stream a -> Stream a)
-> \(example@2 : forall (a : *) -> Stream a -> Stream a)

-- example@3 = example@4
-> \( example@3
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c
)

-> \( example@4
: forall (a : *)
-> forall (b : *)
-> forall (c : *)
-> (b -> c)
-> (a -> b)
-> Stream a
-> Stream c
)

-- Uncomment the example you want to test
-> example@1
-- -> example@2
-- -> example@3
-- -> example@4
)

-- example@1
(\(a : *) -> map a a (id a))

-- example@2
(\(a : *) -> id (Stream a))

-- example@3
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> map a c ((.) a b c f g)
)

-- example@4
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> (.) (Stream a) (Stream b) (Stream c) (map b c f) (map a b g)
)
)

-- Stream
( \(a : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x
)

-- map
( \(a : *)
-> \(b : *)
-> \(f : a -> b)
-> \( st
: forall (x : *)
-> (forall (s : *) -> s -> (s -> Pair a s) -> x)
-> x
)
-> \(x : *)
-> \(S : forall (s : *) -> s -> (s -> Pair b s) -> x)
-> st
x
( \(s : *)
-> \(seed : s)
-> \(step : s -> Pair a s)
-> S
s
seed
(\(seed@1 : s) -> first a b s f (step seed@1))
)
)
)

-- id
(\(a : *) -> \(va : a) -> va)

-- (.)
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : b -> c)
-> \(g : a -> b)
-> \(va : a)
-> f (g va)
)

-- Pair
(\(a : *) -> \(b : *) -> forall (x : *) -> (a -> b -> x) -> x)

-- P
( \(a : *)
-> \(b : *)
-> \(va : a)
-> \(vb : b)
-> \(x : *)
-> \(P : a -> b -> x)
-> P va vb
)

-- first
( \(a : *)
-> \(b : *)
-> \(c : *)
-> \(f : a -> b)
-> \(p : forall (x : *) -> (a -> c -> x) -> x)
-> \(x : *)
-> \(Pair : b -> c -> x)
-> p x (\(va : a) -> \(vc : c) -> Pair (f va) vc)
)

Both example@1 and example@2 will generate alpha-equivalent code:

$ morte < example1.mt
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → st

$ morte < example2.mt
∀(a : *) → (∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a →
s → x) → x) → x) → x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x
: *) → (a → s → x) → x) → x) → x

λ(a : *) → λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) →
(a → s → x) → x) → x) → x) → va

Similarly, example@3 and example@4 will generate alpha-equivalent code:

$ morte < example3.mt
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(st : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → st x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va)))))

$ morte < example4.mt
∀(a : *) → ∀(b : *) → ∀(c : *) → (b → c) → (a → b) → (∀(x :
*) → (∀(s : *) → s → (s → ∀(x : *) → (a → s → x) → x) → x) →
x) → ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (c → s → x)
→ x) → x) → x

λ(a : *) → λ(b : *) → λ(c : *) → λ(f : b → c) → λ(g : a → b)
→ λ(va : ∀(x : *) → (∀(s : *) → s → (s → ∀(x : *) → (a → s
→ x) → x) → x) → x) → λ(x : *) → λ(S : ∀(s : *) → s → (s → ∀
(x : *) → (c → s → x) → x) → x) → va x (λ(s : *) → λ(seed :
s) → λ(step : s → ∀(x : *) → (a → s → x) → x) → S s seed (λ(
seed@1 : s) → λ(x : *) → λ(Pair : c → s → x) → step seed@1 x
(λ(va : a) → Pair (f (g va))))

We inadvertently proved stream fusion for free, but we're still not done, yet! Everything we learn about recursive and corecursive sequences can be applied to model recursive and corecursive effects!

Effects

I will conclude this post by showing how to model both recursive and corecursive programs that have side effects. The recursive program will echo ninety-nine lines from stdin to stdout. The equivalent Haskell program is in the comment header:

-- recursive.mt

-- The Haskell code we will translate to Morte:
--
-- import Prelude hiding (
-- (+), (*), IO, putStrLn, getLine, (>>=), (>>), return )
--
-- -- Simple prelude
--
-- data Nat = Succ Nat | Zero
--
-- zero :: Nat
-- zero = Zero
--
-- one :: Nat
-- one = Succ Zero
--
-- (+) :: Nat -> Nat -> Nat
-- Zero + n = n
-- Succ m + n = m + Succ n
--
-- (*) :: Nat -> Nat -> Nat
-- Zero * n = Zero
-- Succ m * n = n + (m * n)
--
-- foldNat :: Nat -> (a -> a) -> a -> a
-- foldNat Zero f x = x
-- foldNat (Succ m) f x = f (foldNat m f x)
--
-- data IO r
-- = PutStrLn String (IO r)
-- | GetLine (String -> IO r)
-- | Return r
--
-- putStrLn :: String -> IO U
-- putStrLn str = PutStrLn str (Return Unit)
--
-- getLine :: IO String
-- getLine = GetLine Return
--
-- return :: a -> IO a
-- return = Return
--
-- (>>=) :: IO a -> (a -> IO b) -> IO b
-- PutStrLn str io >>= f = PutStrLn str (io >>= f)
-- GetLine k >>= f = GetLine (\str -> k str >>= f)
-- Return r >>= f = f r
--
-- -- Derived functions
--
-- (>>) :: IO U -> IO U -> IO U
-- m >> n = m >>= \_ -> n
--
-- two :: Nat
-- two = one + one
--
-- three :: Nat
-- three = one + one + one
--
-- four :: Nat
-- four = one + one + one + one
--
-- five :: Nat
-- five = one + one + one + one + one
--
-- six :: Nat
-- six = one + one + one + one + one + one
--
-- seven :: Nat
-- seven = one + one + one + one + one + one + one
--
-- eight :: Nat
-- eight = one + one + one + one + one + one + one + one
--
-- nine :: Nat
-- nine = one + one + one + one + one + one + one + one + one
--
-- ten :: Nat
-- ten = one + one + one + one + one + one + one + one + one + one
--
-- replicateM_ :: Nat -> IO U -> IO U
-- replicateM_ n io = foldNat n (io >>) (return Unit)
--
-- ninetynine :: Nat
-- ninetynine = nine * ten + nine
--
-- main_ :: IO U
-- main_ = getLine >>= putStrLn

-- "Free" variables
( \(String : * )
-> \(U : *)
-> \(Unit : U)

-- Simple prelude
-> ( \(Nat : *)
-> \(zero : Nat)
-> \(one : Nat)
-> \((+) : Nat -> Nat -> Nat)
-> \((*) : Nat -> Nat -> Nat)
-> \(foldNat : Nat -> forall (a : *) -> (a -> a) -> a -> a)
-> \(IO : * -> *)
-> \(return : forall (a : *) -> a -> IO a)
-> \((>>=)
: forall (a : *)
-> forall (b : *)
-> IO a
-> (a -> IO b)
-> IO b
)
-> \(putStrLn : String -> IO U)
-> \(getLine : IO String)

-- Derived functions
-> ( \((>>) : IO U -> IO U -> IO U)
-> \(two : Nat)
-> \(three : Nat)
-> \(four : Nat)
-> \(five : Nat)
-> \(six : Nat)
-> \(seven : Nat)
-> \(eight : Nat)
-> \(nine : Nat)
-> \(ten : Nat)
-> ( \(replicateM_ : Nat -> IO U -> IO U)
-> \(ninetynine : Nat)

-> replicateM_ ninetynine ((>>=) String U getLine putStrLn)
)

-- replicateM_
( \(n : Nat)
-> \(io : IO U)
-> foldNat n (IO U) ((>>) io) (return U Unit)
)

-- ninetynine
((+) ((*) nine ten) nine)
)

-- (>>)
( \(m : IO U)
-> \(n : IO U)
-> (>>=) U U m (\(_ : U) -> n)
)

-- two
((+) one one)

-- three
((+) one ((+) one one))

-- four
((+) one ((+) one ((+) one one)))

-- five
((+) one ((+) one ((+) one ((+) one one))))

-- six
((+) one ((+) one ((+) one ((+) one ((+) one one)))))

-- seven
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))

-- eight
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))
-- nine
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one))))))))

-- ten
((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one ((+) one one)))))))))
)

-- Nat
( forall (a : *)
-> (a -> a)
-> a
-> a
)

-- zero
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Zero
)

-- one
( \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> Succ Zero
)

-- (+)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a Succ (n a Succ Zero)
)

-- (*)
( \(m : forall (a : *) -> (a -> a) -> a -> a)
-> \(n : forall (a : *) -> (a -> a) -> a -> a)
-> \(a : *)
-> \(Succ : a -> a)
-> \(Zero : a)
-> m a (n a Succ) Zero
)

-- foldNat
( \(n : forall (a : *) -> (a -> a) -> a -> a)
-> n
)

-- IO
( \(r : *)
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (r -> x)
-> x
)

-- return
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : a -> x)
-> Return va
)

-- (>>=)
( \(a : *)
-> \(b : *)
-> \(m : forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (a -> x)
-> x
)
-> \(f : a
-> forall (x : *)
-> (String -> x -> x)
-> ((String -> x) -> x)
-> (b -> x)
-> x
)
-> \(x : *)
-> \(PutStrLn : String -> x -> x)
-> \(GetLine : (String -> x) -> x)
-> \(Return : b -> x)
-> m x PutStrLn GetLine (\(va : a) -> f va x PutStrLn GetLine Return)
)

-- putStrLn
( \(str : String)
-> \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : U -> x)
-> PutStrLn str (Return Unit)
)

-- getLine
( \(x : *)
-> \(PutStrLn : String -> x -> x )
-> \(GetLine : (String -> x) -> x)
-> \(Return : String -> x)
-> GetLine Return
)
)

This program will compile to a completely unrolled read-write loop, as most recursive programs will:

$ morte < recursive.mt
∀(String : *) → ∀(U : *) → U → ∀(x : *) → (String → x → x) →
((String → x) → x) → (U → x) → x

λ(String : *) → λ(U : *) → λ(Unit : U) → λ(x : *) → λ(PutStr
Ln : String → x → x) → λ(GetLine : (String → x) → x) → λ(Ret
urn : U → x) → GetLine (λ(va : String) → PutStrLn va (GetLin
e (λ(va@1 : String) → PutStrLn va@1 (GetLine (λ(va@2 : Strin
g) → PutStrLn va@2 (GetLine (λ(va@3 : String) → PutStrLn ...
<snip>
... GetLine (λ(va@92 : String) → PutStrLn va@92 (GetLine (λ(
va@93 : String) → PutStrLn va@93 (GetLine (λ(va@94 : String)
→ PutStrLn va@94 (GetLine (λ(va@95 : String) → PutStrLn va@
95 (GetLine (λ(va@96 : String) → PutStrLn va@96 (GetLine (λ(
va@97 : String) → PutStrLn va@97 (GetLine (λ(va@98 : String)
→ PutStrLn va@98 (Return Unit))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
))))))))))))))))))))))))))))))))))))))))))))))))

In contrast, if we encode the effects corecursively we can express a program that echoes indefinitely from stdin to stdout:

-- corecursive.mt

-- data IOF r s
-- = PutStrLn String s
-- | GetLine (String -> s)
-- | Return r
--
-- data IO r = forall s . MkIO s (s -> IOF r s)
--
-- main = MkIO
-- Nothing
-- (maybe (\str -> PutStrLn str Nothing) (GetLine Just))

( \(String : *)
-> ( \(Maybe : * -> *)
-> \(Just : forall (a : *) -> a -> Maybe a)
-> \(Nothing : forall (a : *) -> Maybe a)
-> \( maybe
: forall (a : *)
-> Maybe a
-> forall (x : *)
-> (a -> x)
-> x
-> x
)
-> \(IOF : * -> * -> *)
-> \( PutStrLn
: forall (r : *)
-> forall (s : *)
-> String
-> s
-> IOF r s
)
-> \( GetLine
: forall (r : *)
-> forall (s : *)
-> (String -> s)
-> IOF r s
)
-> \( Return
: forall (r : *)
-> forall (s : *)
-> r
-> IOF r s
)
-> ( \(IO : * -> *)
-> \( MkIO
: forall (r : *)
-> forall (s : *)
-> s
-> (s -> IOF r s)
-> IO r
)
-> ( \(main : forall (r : *) -> IO r)
-> main
)

-- main
( \(r : *)
-> MkIO
r
(Maybe String)
(Nothing String)
( \(m : Maybe String)
-> maybe
String
m
(IOF r (Maybe String))
(\(str : String) ->
PutStrLn
r
(Maybe String)
str
(Nothing String)
)
(GetLine r (Maybe String) (Just String))
)
)
)

-- IO
( \(r : *)
-> forall (x : *)
-> (forall (s : *) -> s -> (s -> IOF r s) -> x)
-> x
)

-- MkIO
( \(r : *)
-> \(s : *)
-> \(seed : s)
-> \(step : s -> IOF r s)
-> \(x : *)
-> \(k : forall (s : *) -> s -> (s -> IOF r s) -> x)
-> k s seed step
)
)

-- Maybe
(\(a : *) -> forall (x : *) -> (a -> x) -> x -> x)

-- Just
( \(a : *)
-> \(va : a)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Just va
)

-- Nothing
( \(a : *)
-> \(x : *)
-> \(Just : a -> x)
-> \(Nothing : x)
-> Nothing
)

-- maybe
( \(a : *)
-> \(m : forall (x : *) -> (a -> x) -> x-> x)
-> m
)

-- IOF
( \(r : *)
-> \(s : *)
-> forall (x : *)
-> (String -> s -> x)
-> ((String -> s) -> x)
-> (r -> x)
-> x
)

-- PutStrLn
( \(r : *)
-> \(s : *)
-> \(str : String)
-> \(vs : s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> PutStrLn str vs
)

-- GetLine
( \(r : *)
-> \(s : *)
-> \(k : String -> s)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> GetLine k
)

-- Return
( \(r : *)
-> \(s : *)
-> \(vr : r)
-> \(x : *)
-> \(PutStrLn : String -> s -> x)
-> \(GetLine : (String -> s) -> x)
-> \(Return : r -> x)
-> Return vr
)

)

This compiles to a state machine that we can unfold one step at a time:

$ morte < corecursive.mt
∀(String : *) → ∀(r : *) → ∀(x : *) → (∀(s : *) → s → (s → ∀
(x : *) → (String → s → x) → ((String → s) → x) → (r → x) →
x) → x) → x

λ(String : *) → λ(r : *) → λ(x : *) → λ(k : ∀(s : *) → s → (
s → ∀(x : *) → (String → s → x) → ((String → s) → x) → (r →
x) → x) → x) → k (∀(x : *) → (String → x) → x → x) (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing) (λ(m : ∀
(x : *) → (String → x) → x → x) → m (∀(x : *) → (String → (∀
(x : *) → (String → x) → x → x) → x) → ((String → ∀(x : *) →
(String → x) → x → x) → x) → (r → x) → x) (λ(str : String)
→ λ(x : *) → λ(PutStrLn : String → (∀(x : *) → (String → x)
→ x → x) → x) → λ(GetLine : (String → ∀(x : *) → (String → x
) → x → x) → x) → λ(Return : r → x) → PutStrLn str (λ(x : *)
→ λ(Just : String → x) → λ(Nothing : x) → Nothing)) (λ(x :
*) → λ(PutStrLn : String → (∀(x : *) → (String → x) → x → x)
→ x) → λ(GetLine : (String → ∀(x : *) → (String → x) → x →
x) → x) → λ(Return : r → x) → GetLine (λ(va : String) → λ(x
: *) → λ(Just : String → x) → λ(Nothing : x) → Just va))

I don't expect you to understand that output other than to know that we can translate the output to any backend that provides functions, and primitive read/write operations.

Conclusion

If you would like to use Morte, you can find the library on both Github and Hackage. I also provide a Morte tutorial that you can use to learn more about the library.

Morte is dependently typed in theory, but in practice I have not exercised this feature so I don't understand the implications of this. If this turns out to be a mistake then I will downgrade Morte to System Fw, which has higher-kinds and polymorphism, but no dependent types.

Additionally, Morte might be usable to transmit code in a secure and typed way in a distributed environment, or to share code between diverse functional languages by providing a common intermediate language. However, both of those scenarios require additional work, such as establishing a shared set of foreign primitives and creating Morte encoders/decoders for each target language.

Also, there are additional optimizations which Morte might implement in the future. For example, Morte could use free theorems (equalities you deduce from the types) to simplify some code fragments even further, but Morte currently does not do this.

My next goals are:

  • Add a back-end to compile Morte to LLVM
  • Add a front-end to desugar a medium-level Haskell-like language to Morte

Once those steps are complete then Morte will be a usable intermediate language for writing super-optimizable programs.

Also, if you're wondering, the name Morte is a tribute to a talking skull from the game Planescape: Torment, since the Morte library is a "bare-bones" calculus of constructions.

Literature

If this topic interests you more, you may find the following links helpful, in roughly increasing order of difficulty:

by Gabriel Gonzalez (noreply@blogger.com) at September 12, 2014 11:38 AM

Ken T Takusagawa

[prbwmqwj] Functions to modify a record

Haskell could use some new syntax LAMBDA_RECORD_MODIFY which could be used as follows:

import qualified Control.Monad.State as State;
data Record = Record { field :: Int };
... State.modify $ LAMBDA_RECORD_MODIFY { field = ... };

which is equivalent to

State.modify $ \x -> x { field = ... }

but without having to name the lambda parameter "x" (twice).

I suspect this is one of the things lenses are trying to do.
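
For comparison, here is a sketch of how the lens library handles this today (assuming the lens package, and renaming the field to _field so that makeLenses can generate a field lens):

{-# LANGUAGE TemplateHaskell #-}
import Control.Lens (makeLenses, (%=), (.=))
import qualified Control.Monad.State as State

data Record = Record { _field :: Int }
makeLenses ''Record            -- generates: field :: Lens' Record Int

bumpField :: State.State Record ()
bumpField = field %= (+ 1)     -- i.e. State.modify $ \x -> x { _field = _field x + 1 }

setField :: Int -> State.State Record ()
setField n = field .= n        -- i.e. State.modify $ \x -> x { _field = n }

The update never has to name the record value at all, which is essentially what the proposed LAMBDA_RECORD_MODIFY syntax is after.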

by Ken (noreply@blogger.com) at September 12, 2014 03:43 AM

September 11, 2014

The GHC Team

Static pointers and serialisation

This longish post gives Simon's reflections on the implementation of Cloud-Haskell-style static pointers and serialisation. See also StaticPointers.

Much of what is suggested here is implemented, in some form, in two existing projects

My goal here is to identify the smallest possible extension to GHC, with the smallest possible trusted code base, that would enable these libraries to be written in an entirely type-safe way.


Background

Background: the trusted code base

The implementation of the Typeable class, and its associated functions, in GHC offers a type-safe abstraction, in the classic sense that "well typed programs won't go wrong". For example, in Data.Typeable we have

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b

We expect cast to be type-safe: if cast returns a value Just x then we really do know that x :: b. Let's remind ourselves of class Typeable:

class Typeable a where
  typeRep :: proxy a -> TypeRep

(It's not quite this, but close.) The proxy a argument is just a proxy for the type argument; its value is never inspected and you can always pass bottom.

Under the hood, cast uses typeRep to get the runtime TypeRep for a and b, and compares them, thus:

cast :: forall a b. (Typeable a, Typeable b) => a -> Maybe b
cast x = if typeRep (Proxy :: Proxy a) == typeRep (Proxy :: Proxy b)
           then Just (unsafeCoerce x)
           else Nothing

Although cast is written in Haskell, it uses unsafeCoerce. For it to truly be type-safe, it must trust the Typeable instances. If the user could write a Typeable instance, they could write a bogus one and defeat type safety. So only GHC is allowed to write Typeable instances.

In short, cast and the Typeable instances are part of the trusted code base, or TCB:

  • The TCB should be as small as possible
  • The TCB should have a small, well-defined, statically-typed API used by client code
  • Client code is un-trusted; if the client code is well-typed, and the TCB is implemented correctly, nothing can go wrong

Background: Typeable a and TypeRep

I'll use the Typeable a type class and values of type TypeRep more or less interchangeably. As you can see from the definition of class Typeable above, its payload is simply a constant function returning a TypeRep. So you can think of a Typeable a as simply a type-tagged version of TypeRep.

Of course, a Typeable a is a type class thing, which is hard to pass around explicitly like a value, but that is easily fixed using the "Dict Trick", well known in Haskell folk lore:

data Dict (c :: Constraint) where
  Dict :: forall c. c => Dict c

Now a value of type Dict (Typeable a) is an ordinary value that embodies a Typeable a dictionary. For example:

f :: Dict (Typeable a) -> Dict (Typeable b) -> a -> Maybe b
f Dict Dict val = cast val

The pattern-matches against the Dict constructor bring the Typeable dictionaries into scope, so they can be used to discharge the constraint arising from the call to cast.

Background: serialisation

I'm going to assume a type class Serialisable, something like this:

class Serialisable a where
  encode :: a -> ByteString
  decode :: ByteString -> Maybe (a, ByteString)

I'll use "encode" and "decode" as synonyms for "serialise" and "deserialise", because the former are easier to pronounce.

Here's an interesting question: are instances of Serialisable part of the TCB? No, they are not. Here is a tricky case:

  decode (encode [True,False]) :: Maybe (Int, ByteString)

Here I have encoded a [Bool] into a ByteString, and then decoded an Int from that ByteString. This may be naughty or undesirable, but it cannot seg-fault: it is type-safe in the sense above. You can think of it like this: a decoder is simply a parser for the bits in the ByteString, so a decoder for (say) Int can fail to parse a full Int (returning Nothing), but it can't return a non-Int.

For the naughtiness, one could imagine that a Cloud Haskell library might send fingerprints or TypeReps or whatnot to eliminate potential naughtiness. But even then it is very valuable if the type-safety of the system does not rely on the CH library. Type safety depends only on the correctness of the (small) TCB; naughtiness-safety might additionally depend on the correctness of the CH library.

Background: static pointers

I'm taking for granted the basic design of the Cloud Haskell paper. That is,

  • A type constructor StaticPtr :: * -> *. Intuitively, a value of type StaticPtr t is represented by a static code pointer to a value of type t. Note "code pointer" not "heap pointer". That's the point!
  • A language construct static <expr>, whose type is StaticPtr t if <expr> has type t.
  • In static <expr>, the free variables of <expr> must all be bound at top level. The implementation almost certainly works by giving <expr> a top-level definition with a new name, static34 = <expr>.
  • A function unStatic :: StaticPtr a -> a, to unwrap a static pointer.
  • Static values are serialisable. Something like instance Serialisable (StaticPtr a). (This will turn out to be not quite right.) Operationally this works by serialising the code pointer, or top-level name (e.g "Foo.static34").

All of this is built-in. It is OK for the implementation of StaticPtr to be part of the TCB. But our goal is that no other code need be in the TCB.

A red herring. I'm not going to address the question of how to serialise a static pointer. One method would be to serialise a machine address, but that only works if the encoding and decoding ends are running identical binaries. But that's easily fixed: encode a static as the name of the static value e.g. "function foo from module M in package p". Indeed, I'll informally assume an implementation of this latter kind.

In general, I will say that what we ultimately serialise is a StaticName. You can think of a StaticName as package/module/function triple, or something like that. The implementation of StaticName is certainly not part of the client-visible API for StaticPtr; indeed, the type StaticName is not part of the API either. But it gives us useful vocabulary.


Serialising static pointers

We can see immediately that we cannot expect to have instance Serialisable (StaticPtr a), which is what the Cloud Haskell paper proposed. If we had such an instance we would have

encodeStatic :: forall a. StaticPtr a -> ByteString
decodeStatic :: forall a. ByteString -> Maybe (StaticPtr a, ByteString)

And it's immediately apparent that decodeStatic cannot be right. I could get a ByteString from anywhere, apply decodeStatic to it, and thereby get a StaticPtr a. Then use unStatic and you have a value of type a, for any type a!

Plainly, what we need is (just as in the case of cast) to do a dynamic typecheck, thus:

decodeStatic :: forall a. Typeable a
                       => ByteString -> Maybe (StaticPtr a, ByteString)

Let's think operationally for a moment:

  • GHC collects all the StaticPtr values in a table, the static pointer table or SPT. Each row contains
    • The StaticName of the value
    • A pointer to closure for the value itself
    • A pointer to its TypeRep
  • decodeStatic now proceeds like this (a sketch in code follows this list):
    • Parse a StaticName from the ByteString (failure => Nothing)
    • Look it up in the table (not found => Nothing)
    • Compare the TypeRep passed to decodeStatic (via the Typeable a dictionary) with the one in the table (not equal => Nothing)
    • Return the value
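Here is a minimal sketch of that procedure in code. This is my own illustration, not a real API: parseStaticName and lookupSPT are hypothetical helpers (with the commented signatures), Any comes from GHC.Exts, ScopedTypeVariables is needed for the Proxy annotation, and the final unsafeCoerce is the trusted step that the preceding TypeRep comparison justifies, just as with cast:

-- Hypothetical pieces of the TCB, not a real API:
-- parseStaticName :: ByteString -> Maybe (StaticName, ByteString)
-- lookupSPT       :: StaticName -> Maybe (TypeRep, Any)

decodeStatic :: forall a. Typeable a
             => ByteString -> Maybe (StaticPtr a, ByteString)
decodeStatic bs
  = do { (name, rest) <- parseStaticName bs            -- parse a StaticName
       ; (rep, val)   <- lookupSPT name                -- look it up in the SPT
       ; if rep == typeRep (Proxy :: Proxy a)          -- compare the TypeReps
         then Just (unsafeCoerce val, rest)            -- trusted coercion, inside the TCB
         else Nothing }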

Side note. Another possibility is for decodeStatic not to take a Typeable a context, but instead for unStatic to do so: unStatic :: Typeable a => StaticPtr a -> Maybe a. But that seems a mess. Apart from anything else, it would mean that a value of type StaticPtr a might or might not point to a value of type a, so there's no point in having the type parameter in the first place. End of side note.

This design has some useful consequences that are worth calling out:

  • A StaticPtr is serialised simply to the StaticName; the serialised form does not need to contain a TypeRep. Indeed it would not even be type-safe to serialise a StaticPtr to a pair of a StaticName and a TypeRep, trusting that the TypeRep described the type of the named function. Why not? Think back to "Background: serialisation" above, and imagine we said
    decode (encode ["wibble", "wobble"])
      :: Typeable a => Maybe (StaticPtr a, ByteString)
    
    Here we create an essentially-garbage ByteString by encoding a [String], and try to decode it. If, by chance, we successfully parse a valid StaticName and TypeRep, there is absolutely no reason to suppose that the TypeRep will describe the type of the function.

    Instead, the TypeRep of the static pointer lives in the SPT, securely put there when the SPT was created. Not only is this type-safe, but it also saves bandwidth by not transmitting TypeReps.
  • Since clients can effectively fabricate a StaticName (by supplying decodeStatic with a bogus ByteString), a StaticName is untrusted. That gives the implementation a good deal of wiggle room for how it chooses to implement static names. Even a simple index in the range 0..N would be type-safe!

    The motivation for choosing a richer representation for StaticName (eg package/module/name) is not type-safety but rather resilience to change. For example, the Haskell programs at the two ends could be quite different, provided only that they agreed about what to call the static pointers that they want to exchange.

Statics and existentials

Here is something very reasonable:

data StaticApp b where
  SA :: StaticPtr (a->b) -> StaticPtr a -> StaticApp b
unStaticApp :: StaticApp a -> a
unStaticApp (SA f a) = unStatic f (unStatic a)

(We might want to add more constructors, but I'm going to focus only on SA.) An SA is just a pair of StaticPtrs, one for a function and one for an argument. We can securely unwrap it with unStaticApp.

Now, here is the question: can we serialise StaticApps? Operationally, of course yes: to serialise a SA, just serialise the two StaticPtrs it contains, and dually for deserialisation. But, as before, deserialisation is the hard bit. We seek:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)

But how can we write decodeSA? Here is the beginning of an attempt:

decodeSA :: Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = case decodeStatic bs :: Maybe (StaticPtr (a->b), ByteString) of
      Nothing -> Nothing
      Just (fun, bs1) -> ...

and you can immediately see that we are stuck. Type variable b is not in scope. More concretely, we need a Typeable (a->b) to pass in to decodeStatic, but we only have a Typeable b to hand.

What can we do? Tantalisingly, we know that if decodeStatic succeeds in parsing a StaticName from bs then, when we look up that StaticName in the Static Pointer Table, we'll find a TypeRep for the value. So rather than passing a Typeable dictionary into decodeStatic, we'd like to get one out!

With that in mind, here is a new type signature for decodeStatic that returns both pieces:

data DynStaticPtr where
  DSP :: Typeable a => StaticPtr a -> DynStaticPtr
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)

(The name DynStaticPtr comes from the fact that this data type is extremely similar to the library definition of Dynamic.)

Operationally, decodeStatic bs works like this:

  • Parse a StaticName from bs (failure => return Nothing)
  • Look it up in the SPT (not found => return Nothing)
  • Return the TypeRep and the value found in the SPT, paired up with DSP. (Indeed the SPT could contain the DynStaticPtr values directly.)

For the construction of DynStaticPtr to be type-safe, we need to know that the TypeRep passed really is a TypeRep for the value; so the construction of the SPT is (unsurprisingly) part of the TCB.

Now we can write decodeSA (the monad is just the Maybe monad, nothing fancy):

decodeSA :: forall b. Typeable b => ByteString -> Maybe (StaticApp b, ByteString)
decodeSA bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (DSP (arg :: StaticPtr targ), bs2) <- decodeStatic bs1
            -- At this point we have
            --     Typeable b      (from caller)
            --     Typeable tfun   (from first DSP)
            --     Typeable targ   (from second DSP)
       ; fun' :: StaticPtr (targ->b) <- cast fun
       ; return (SA fun' arg, bs2) }

The call to cast needs Typeable tfun, and Typeable (targ->b). The former is bound by the first DSP pattern match. The latter is constructed automatically from Typeable targ and Typeable b, both of which we have. Bingo!
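As a small aside (my own illustration, not from the post), the same mechanism can be seen in a standalone function: GHC solves a Typeable constraint for a function type from Typeable constraints on its argument and result types, which is exactly what the call to cast above relies on.

castToFun :: (Typeable x, Typeable p, Typeable q) => x -> Maybe (p -> q)
castToFun = cast   -- Typeable (p -> q) is solved from Typeable p and Typeable q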

Notice that decodeSA is not part of the TCB. Clients can freely write code like decodeSA and be sure that it is type-safe.


From static pointers to closures

The original Cloud Haskell paper defines closures like this:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> ByteString -> Closure a

It is easy to define

unClo :: Closure a -> a
unClo (Clo s e) = unStatic s e

Side note on HdpH

HdpH refines the Cloud Haskell Closure in (at least) two ways. I think (but I am not certain) that this declaration captures the essence:

data Closure a where
  Clo :: StaticPtr (ByteString -> a) -> Put () -> a -> Closure a

The refinements are:

  • The extra argument of type 'a' to avoid costs when we build a closure and then unwrap it with unClo locally, or repeatedly.
  • The use of Put () rather than a ByteString for the serialised environment, to avoid repeated copying when doing nested serialisation.

Both are important, but they are orthogonal to the discussion about static types, so I'll use the CH definition from here on.

Serialising closures

Just as in the case of StaticPtr, it is immediately clear that we cannot expect to have

decodeClo :: ByteString -> Maybe (Closure a, ByteString)

Instead we must play the same trick, and attempt to define

data DynClosure where
  DC :: Typeable a => Closure a -> DynClosure
decodeClo :: ByteString -> Maybe (DynClosure, ByteString)

But there's an immediate problem in writing decodeClo:

decodeClo bs
  = do { (DSP (fun :: StaticPtr tfun), bs1) <- decodeStatic bs
       ; (env, bs2)                         <- decodeByteString bs1
       ; return (DC (Clo fun env), bs2) }  -- WRONG

This won't typecheck because DC needs Typeable a, but we only have Typeable (ByteString -> a).

This is Jolly Annoying. I can see three ways to make progress:

  • Plan A: Provide some (type-safe) way to decompose TypeReps, to get from Typeable (a->b) to Typeable b (and presumably Typeable a as well).
  • Plan B: Serialise a TypeRep a with every Closure a.
  • Plan C: Generalise StaticPtr.

I like Plan C best. They are each discussed next.

Plan A: Decomposing TypeRep

At the moment, GHC provides statically-typed ways to construct and compare a TypeRep (via cast), but no way to decompose one, at least not in a type-safe way. It is tempting to seek this function as part of the TCB:

class Typeable a where
  typeRep :: proxy a -> TypeRep
  decomposeTypeRep :: DecompTR a
data DecompTR a where
  TRApp :: (Typeable p, Typeable q) => DecompTR (p q)
  TRCon :: TyCon -> DecompTR a

This isn't a bad idea, but it does mean that Typeable a must be implemented (and presumably serialised) using a tree, whereas the current API would allow an implementation consisting only of a fingerprint.

(Thought experiment: maybe a Typeable a, and Dict (Typeable a) can be represented as a tree, but a TypeRep could be just a fingerprint?)

Plan B: serialise TypeRep with Closure

Since we need a Typeable a at the far end, we could just serialise it directly with the Closure, like this:

encodeClo :: forall a. Typeable a => Closure a -> ByteString
encodeClo (Clo fun env)
  =  encodeTypeable (Proxy :: Proxy a)
  ++ encodeStatic fun
  ++ encodeByteString env

Here I am assuming (as part of the TCB)

encodeTypeable :: Typeable a => proxy a -> ByteString
decodeTypeable :: ByteString -> Maybe (DynTypeable, ByteString)
data DynTypeable where
  DT :: Typeable a => proxy a -> DynTypeable

which serialises a TypeRep. (Or, operationally, perhaps just its fingerprint.) Now I think we can write decodeClo:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DT (_ :: Proxy a),           bs1)  <- decodeTypeable bs
       ; (DSP (fun :: StaticPtr tfun), bs2)  <- decodeStatic bs1
       ; (env, bs3)                          <- decodeByteString bs2
       ; fun' :: StaticPtr (ByteString -> a) <- cast fun
       ; return (DC (Clo fun' env), bs3) }

But this too is annoying: we have to send these extra TypeReps when morally they are already sitting there in the SPT.

Plan C: Generalising StaticPtr

Our difficulty is that we are deserialising StaticPtr (ByteString -> a) but we want to be given Typeable a not Typeable (ByteString -> a). So perhaps we can decompose the type into a type constructor and type argument, like this:

data StaticPtr (f :: *->*) (a :: *)
unStatic :: StaticPtr f a -> f a
decodeStatic :: ByteString -> Maybe (DynStaticPtr, ByteString)
data DynStaticPtr where
  DSP :: (Typeable f, Typeable a) => StaticPtr f a -> DynStaticPtr

Each row of the SPT contains:

  • The StaticName
  • The value of type f a
  • The Typeable f dictionary
  • The Typeable a dictionary

Now we can define closures thus:

data Closure a where
  Clo :: StaticPtr (ByteString ->) a -> ByteString -> Closure a

and these are easy to deserialise:

decodeClo :: ByteString -> Maybe (DynClosure, ByteString)
decodeClo bs
  = do { (DSP (fun :: StaticPtr f a), bs1) <- decodeStatic bs
       ; (env, bs2)                        <- decodeByteString bs1
           -- Here we have Typeable f, Typeable a
       ; fun' :: StaticPtr (ByteString ->) a <- cast fun
           -- This cast checks that f ~ (ByteString ->)
           -- Needs Typeable f, Typeable (ByteString ->)
       ; return (DC (Clo fun' env), bs2) }
           -- DC needs Typeable a

I like this a lot better, but it has knock-on effects.

  • The old StaticPtr a is now StaticPtr Id a.
  • What becomes of our data type for StaticApp? Perhaps
    data StaticApp f b where
      SA :: StaticPtr f (a->b) -> StaticPtr f a -> StaticApp f b
    unStaticApp :: Applicative f => StaticApp f b -> f b
    

ToDo: ...I have not yet followed through all the details

Applying closures

Can we write closureApply? I'm hoping for a structure like this:

closureApply :: Closure (a->b) -> Closure a -> Closure b
closureApply fun arg = Clo (static caStatic) (encode (fun, arg))
caStatic :: ByteString -> b  -- WRONG
caStatic bs = do { ((fun,arg), bs1) <- decode bs
                 ; return (unClo fun (unClo arg), bs1) }

This is obviously wrong. caStatic clearly cannot have that type. It would at least need to be

caStatic :: Typeable b => ByteString -> b

and now there is the thorny question of where the Typeable b dictionary comes from.

ToDo: ...I have stopped here for now


Polymorphism and serialisation

For this section I'll revert to the un-generalised single-parameter StaticPtr.

Parametric polymorphism

Consider these definitions:

rs1 :: StaticPtr ([Int] -> [Int])
rs1 = static reverse
rs2 :: StaticPtr ([Bool] -> [Bool])
rs2 = static reverse
rs3 :: forall a. Typeable a => StaticPtr ([a] -> [a])
rs3 = static reverse

The first two are clearly fine. The SPT will get one row for each of the two monomorphic calls to reverse, one with a TypeRep of [Int] -> [Int] and one with a TypeRep of [Bool] -> [Bool].

But both will have the same code pointer, namely the code for the polymorphic reverse function. Could we share just one StaticName for all instantiations of reverse, perhaps including rs3 as well?

I think we can. The story would be this:

  • The SPT has a row for reverse, containing
    • The StaticName for reverse
    • A pointer to the code for reverse (or, more precisely, its static closure).
    • A function of type TypeRep -> TypeRep that, given the TypeRep for a, returns the TypeRep for [a] -> [a].
  • When we serialise a StaticPtr we send
    • The StaticName of the (polymorphic) function
    • A list of the TypeReps of the type arguments of the function
  • The rule for static <expr> becomes this: the free term variables of <expr> must all be bound at top level, but it may have free type variables, provided they are all Typeable.

All of this is part of the TCB, of course.

Type-class polymorphism

Consider static sort where sort :: Ord a => [a] -> [a]. Can we make such a StaticPtr? After all, sort takes an implicit value argument, namely an Ord a dictionary. If that dictionary can be defined at top level, well and good, so this should be OK:

ss1 :: StaticPtr ([Int] -> [Int])
ss1 = static sort

But things go wrong as soon as you have polymorphism:

ss2 :: forall a. Ord a => StaticPtr ([a] -> [a])
ss2 = static sort  -- WRONG

Now, clearly, the dictionary is a non-top-level free variable of the call to sort.

We might consider letting you write this:

ss3 :: forall a. StaticPtr (Ord a => [a] -> [a])
ss3 = static sort   -- ???

so now the static wraps a function expecting a dictionary. But that edges us uncomfortably close to impredicative types, which are known to contain many dragons.

A simpler alternative is to use the Dict Trick (see Background above):

ss4 :: forall a. StaticPtr (Dict (Ord a) -> [a] -> [a])
ss4 = static sortD
sortD :: forall a. Dict (Ord a) -> [a] -> [a]
sortD Dict xs = sort xs

Now, at the call site, when we unwrap the StaticPtr, we need to supply an explicit Ord dictionary, like this:

...(unStatic ss4 Dict)....

For now, I propose to deal with type classes via the Dict Trick, which is entirely end-user programmable, leaving only parametric polymorphism for built-in support.

by simonpj at September 11, 2014 01:34 PM

Yesod Web Framework

Clarification of previous blog post

I've heard that my previous blog post has caused a bit of confusion, as sarcasm doesn't really come across in text very well. So let me elaborate (and of course, in the process, kill the joke):

Some years back, Erik found a case that was quite difficult to implement using enumerator. After we cracked our heads on it for long enough, some of us (I don't actually remember who was involved) decided to work on a new streaming library. That library ended up being called conduit (thanks to Yitz for the naming idea). It turns out that most people are unaware of that history, so when at ICFP, I casually mentioned that Erik was the cause of conduit coming into existence, some people were surprised. Erik jokingly chastised me for not giving him enough credit. In response, I decided to write an over-the-top post giving Erik all credit for conduit. I say over the top, since I made it seem like there was some large amount of blame being heaped on as well.

So to be completely clear:

  • Erik and I are good friends, and this was just a bit of an inside joke turned public.
  • No one has said anything offensive to me at all about conduit. There are obviously differing opinions out there about the best library for a job, but there's nothing offensive about it, just healthy discussion around a complicated topic. My purpose in making a big deal about it was not to express frustration at anyone attacking me, but rather to just play up the joke a bit more.

My apologies to anyone who was confused, upset, or worried by the previous post; that was completely unintentional.

September 11, 2014 12:00 AM

FP Complete

We're hiring: Haskell web UI developer

FP Complete is looking to expand its Haskell development team. We’re looking for a Haskeller with a strong background in web UI development. This position will encompass both work on our core products, such as FP Haskell Center and School of Haskell, and helping customers develop frontends to their Haskell applications.

We will want you to start right away. This will be a contractor position, full time for at least 3 months, with the intention to continue long-term on a more or less full-time basis. Additionally, while the main focus of the position will be UI development, there will be many opportunities to expand into other areas of focus.

This is a telecommute position: you can work from home or wherever you choose, with little or no travel. Location in North America is ideal; you will work with colleagues who are on North American and European hours.

Skills required:

  • Strong Haskell coding skills.
  • Experience with creating HTML/CSS/Javascript web applications (fat clients a plus).
  • Ideally: experience with both Yesod and Fay for server and client side coding, respectively. (Perk: you’ll get a chance to work with the authors of both tools.)
  • Experience deploying applications into production, especially at large scale, is a plus.
  • Ability to interact with a distributed development team, and to manage your time without an in-person supervisor
  • Ability to work with clients on gathering requirements
  • General source control/project skills: Git, issue tracking
  • Ability to communicate clearly in issues, bug reports and emails
  • Proficient on a Linux system
  • Plus: experience with deployment, Docker, and/or CoreOS

Please send resume or CV to michael@fpcomplete.com. Any existing work- either a running website or an open source codebase- which you can include links to will be greatly appreciated as well.

September 11, 2014 12:00 AM

September 10, 2014

Mike Izbicki

Polymorphism in Haskell vs C++

posted on 2014-09-10 by Jonathan Dugan

Parametric polymorphism is when you write one function that works on many data types. In C++, this is pretty confusing, but it’s really easy in Haskell. Let’s take a look at an example.

Let’s say we want a function that calculates the volume of a box. In C++, we’d use templates so that our function works with any numeric type:

template<typename T>
T boxVolume(T length, T width, T height)
{
    return length * width * height;
}

Templates have an awkward syntax, but that isn’t too much of a hassle. C++ has much bigger problems. What if in the course of writing your program, you accidentally pass in some strings to this function?

int main()
{
    cout << boxVolume("oops","no","strings") << endl;
}

We get this error when we compile with g++:

test.cpp: In instantiation of 'T boxVolume(T, T, T) [with T = const char*]':
test.cpp:22:47:   required from here
test.cpp:8:19: error: invalid operands of types 'const char*' and 'const char*' to binary 'operator*'
    return length * width * height;

This error message is a little hard to understand because of the templates. If we had written our function to use doubles instead of templates:

double boxVolume(double length, double width, double height)
{
    return length * width * height;
}

We would get this simpler error message:

test.cpp: In function 'int main()':
test.cpp:22:47: error: cannot convert 'const char*' to 'double' for argument '1' to 'double boxVolume(double, double, double)'
    cout << boxVolume("oops","nope","bad!") << endl;

We see that this error is shorter and easier to understand, as it clearly tells us we cannot pass string literals to our function. Plus there is no superfluous comment about our “instantiation” of boxVolume.

Now let’s try to write a polymorphic boxVolume in Haskell:

boxVolume :: a -> a -> a -> a
boxVolume length width height = length * width * height

When we try to compile, we get the error:

test.hs:2:50:
    No instance for (Num a) arising from a use of `*'
    Possible fix:
      add (Num a) to the context of
        the type signature for boxVolume :: a -> a -> a -> a
    In the expression: length * width * height
    In an equation for `boxVolume':
        boxVolume length width height = length * width * height

Uh-oh! An error message! What went wrong? It says that we tried to use the * operator without declaring our parameters as an instance of the Num type class.

But what is a type class? This leads us to ad hoc polymorphism, also known as function overloading. Ad hoc polymorphism is when a function can be applied to different argument types, each with a different implementation. For example, the STL classes stack and queue each have their own push and pop functions, which, although they have the same names, do different things:

stack<int> s;
queue<int> q;

s.push(1); q.push(1);
s.push(2); q.push(2);
s.push(3); q.push(3);

s.pop(); q.pop();

After the above code is executed, the stack s will be left with the numbers 1,2 while the queue q will be left with the numbers 2,3. The function pop behaves differently on stacks and queues: calling pop on a stack removes the item added last, while calling pop on a queue removes the item added first.

Haskell does not support function overloading, except through type classes. For example, if we were to specifically declare our own Stack and Queue classes with push and pop functions:

data Stack = Stack  [Int] deriving Show
data Queue = Queue [Int] deriving Show

push :: Stack -> Int -> Stack
push (Stack xs) x = Stack (x:xs)

pop :: Stack -> Stack
pop (Stack []) = Stack []
pop (Stack xs) = Stack (tail xs)

push :: Queue -> Int -> Queue
push (Queue xs) x = Queue (x:xs)

pop :: Queue -> Queue
pop (Queue []) = Queue []
pop (Queue xs) = Queue (init xs)

It results in a compiler error:

stack.hs:11:1:
    Duplicate type signatures for `push'
    at stack.hs:4:1-4
       stack.hs:11:1-4

stack.hs:12:1:
    Multiple declarations of `push'
    Declared at: stack.hs:5:1
                 stack.hs:12:1

stack.hs:14:1:
    Duplicate type signatures for `pop'
    at stack.hs:7:1-3
       stack.hs:14:1-3

stack.hs:15:1:
    Multiple declarations of `pop'
    Declared at: stack.hs:8:1
                 stack.hs:15:1

Changing the names of our push and pop functions to, say, stackPush, stackPop, queuePush, and queuePop would let the program compile.

A more generic way, however, is to create a type class. Let’s make a Sequence type class that implements our push and pop functions.

class Sequence s where
    push :: s -> Int -> s
    pop :: s -> s

This type class declaration says that any data type that is an instance of this Sequence type class can use the push and pop operations, or, in other words, can add and remove an Int. By making our Stack and Queue instances of the Sequence type class, both data types can have their own implementations of the push and pop functions!

instance Sequence Stack where
    push (Stack xs) x = Stack (x:xs)
    pop (Stack []) = Stack []
    pop (Stack xs) = Stack (tail xs)

instance Sequence Queue where
    push (Queue xs) x = Queue (x:xs)
    pop (Queue []) = Queue []
    pop (Queue xs) = Queue (init xs)

Replacing our function definitions with these instantiations of the Sequence type class lets our program compile.

Type classes are also an important part of using templates in function definitions. In our function boxVolume, we got an error because we tried to use the * operation without declaring the type variable a as an instance of the Num type class. The Num type class is basically for anything that acts like a number, such as Int, Float, and Double, and it lets you use the common operations of +, -, and *.

Let’s change our function to declare that a is a Num:

boxVolume :: (Num a) => a -> a -> a -> a
boxVolume length width height = length * width * height

This is called adding a class constraint. Whenever we want to declare a template function that relies on other functions, we have to add a class constraint that tells both the user and the compiler which types of data can be put into the function.
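As another small example of a class constraint (my own, not from the original post), a function that also needs to compare its arguments simply lists an extra constraint:

-- min needs Ord, while * needs Num, so the signature lists both constraints.
clampedVolume :: (Num a, Ord a) => a -> a -> a -> a -> a
clampedVolume cap length width height = min cap (length * width * height)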

If we were to call boxVolume on strings, we would get this simple error message:

ghci> boxVolume "a" "b" "c"

<interactive>:14:1:
    No instance for (Num [Char]) arising from a use of `boxVolume'
    Possible fix: add an instance declaration for (Num [Char])
    In the expression: boxVolume "a" "b" "c"
    In an equation for `it': it = boxVolume "a" "b" "c"

The compiler tells us it can’t evaluate this function because strings aren’t numbers! If we really wanted to, we could make String an instance of the Num type class, and then this function would work! (Of course, why you would want to do that is beyond me.) That’s the power of parametric polymorphism combined with type classes.
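Just for illustration (and, as noted above, you almost certainly should not do this in real code), here is one silly way such an instance could look; the method definitions are arbitrary choices of mine, and with them boxVolume "a" "b" "c" evaluates to "abc":

{-# LANGUAGE FlexibleInstances #-}

-- A deliberately silly orphan instance, only to show the mechanism.
instance Num String where
  (+)         = (++)
  (*)         = (++)
  fromInteger = show
  abs         = id
  signum _    = ""
  negate      = reverse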

So there you have it. In C++, although we can easily implement ad hoc polymorphism through function overloading, parametric polymorphism is a tricky beast. This is made easier in Haskell, especially with the use of type classes. Type classes guarantee that the data passed to a function supports the operations that function needs, and they tell the user what they can pass into a function. Use type classes to your advantage when you next write a Haskell program!

September 10, 2014 12:00 AM

September 09, 2014

Dominic Steinitz

Fun with (Extended Kalman) Filters

Summary

An extended Kalman filter in Haskell using type level literals and automatic differentiation to provide some guarantees of correctness.

Population Growth

Suppose we wish to model population growth of bees via the logistic equation

\displaystyle  \begin{aligned}  \dot{p} & = rp\Big(1 - \frac{p}{k}\Big)  \end{aligned}

We assume the growth rate r is unknown and drawn from a normal distribution {\cal{N}}(\mu_r, \sigma_r^2) but the carrying capacity k is known and we wish to estimate the growth rate by observing noisy values y_i of the population at discrete times t_0 = 0, t_1 = \Delta T, t_2 = 2\Delta T, \ldots. Note that p_t is entirely deterministic and its stochasticity is only as a result of the fact that the unknown parameter of the logistic equation is sampled from a normal distribution (we could for example be observing different colonies of bees and we know from the literature that bee populations obey the logistic equation and each colony will have different growth rates).

Haskell Preamble

> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing  #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
> {-# LANGUAGE DataKinds                    #-}
> {-# LANGUAGE ScopedTypeVariables          #-}
> {-# LANGUAGE RankNTypes                   #-}
> {-# LANGUAGE BangPatterns                 #-}
> {-# LANGUAGE TypeOperators                #-}
> {-# LANGUAGE TypeFamilies                 #-}
> module FunWithKalman3 where
> import GHC.TypeLits
> import Numeric.LinearAlgebra.Static
> import Data.Maybe ( fromJust )
> import Numeric.AD
> import Data.Random.Source.PureMT
> import Data.Random
> import Control.Monad.State
> import qualified Control.Monad.Writer as W
> import Control.Monad.Loops

Logistic Equation

The logistic equation is a well known example of a dynamical system which has an analytic solution

\displaystyle  p = \frac{kp_0\exp rt}{k + p_0(\exp rt - 1)}

Here it is in Haskell

> logit :: Floating a => a -> a -> a -> a
> logit p0 k x = k * p0 * (exp x) / (k + p0 * (exp x - 1))

We observe a noisy value of population at regular time intervals (where \Delta T is the time interval)

\displaystyle  \begin{aligned}  p_i &= \frac{kp_0\exp r\Delta T i}{k + p_0(\exp r\Delta T i - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

Using the semi-group property of our dynamical system, we can re-write this as

\displaystyle  \begin{aligned}  p_i &= \frac{kp_{i-1}\exp r\Delta T}{k + p_{i-1}(\exp r\Delta T - 1)} \\  y_i &= p_i + \epsilon_i  \end{aligned}

To convince yourself that this re-formulation is correct, think of the population as starting at p_0; after 1 time step it has reached p_1 and after two time steps it has reached p_2, and this ought to be the same as the point reached after 1 time step starting at p_1, for example

> oneStepFrom0, twoStepsFrom0, oneStepFrom1 :: Double
> oneStepFrom0  = logit 0.1 1.0 (1 * 0.1)
> twoStepsFrom0 = logit 0.1 1.0 (1 * 0.2)
> oneStepFrom1  = logit oneStepFrom0 1.0 (1 * 0.1)
ghci> twoStepsFrom0
  0.11949463171139338

ghci> oneStepFrom1
  0.1194946317113934

We would like to infer the growth rate, not just be able to predict the population, so we need to add another variable to our model.

\displaystyle  \begin{aligned}  r_i &= r_{i-1} \\  p_i &= \frac{kp_{i-1}\exp r_{i-1}\Delta T}{k + p_{i-1}(\exp r_{i-1}\Delta T - 1)} \\  y_i &= \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}r_i \\ p_i\end{bmatrix} + \begin{bmatrix}0 \\ \epsilon_i\end{bmatrix}  \end{aligned}

Extended Kalman

This is almost in the form suitable for estimation using a Kalman filter but the dependency of the state on the previous state is non-linear. We can modify the Kalman filter to create the extended Kalman filter (EKF) by making a linear approximation.

Since the measurement update is trivially linear (even in this more general form), the measurement update step remains unchanged.

\displaystyle  \begin{aligned}  \boldsymbol{v}_i & \triangleq  \boldsymbol{y}_i - \boldsymbol{g}(\hat{\boldsymbol{x}}^\flat_i) \\  \boldsymbol{S}_i & \triangleq  \boldsymbol{G}_i \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top + \boldsymbol{\Sigma}^{(y)}_i \\  \boldsymbol{K}_i & \triangleq \hat{\boldsymbol{\Sigma}}^\flat_i  \boldsymbol{G}_i^\top\boldsymbol{S}^{-1}_i \\  \hat{\boldsymbol{x}}^i &\triangleq \hat{\boldsymbol{x}}^\flat_i + \boldsymbol{K}_i\boldsymbol{v}_i \\  \hat{\boldsymbol{\Sigma}}_i &\triangleq \hat{\boldsymbol{\Sigma}}^\flat_i - \boldsymbol{K}_i\boldsymbol{S}_i\boldsymbol{K}^\top_i  \end{aligned}

By Taylor we have

\displaystyle  \boldsymbol{a}(\boldsymbol{x}) \approx \boldsymbol{a}(\boldsymbol{m}) + \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m})\delta\boldsymbol{x}

where \boldsymbol{A}_{\boldsymbol{x}}(\boldsymbol{m}) is the Jacobian of \boldsymbol{a} evaluated at \boldsymbol{m} (for the exposition of the extended filter we take \boldsymbol{a} to be vector valued hence the use of a bold font). We take \delta\boldsymbol{x} to be normally distributed with mean of 0 and ignore any difficulties there may be with using Taylor with stochastic variables.

Applying this at \boldsymbol{m} = \hat{\boldsymbol{x}}_{i-1} we have

\displaystyle  \boldsymbol{x}_i = \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1})(\boldsymbol{x}_{i-1} - \hat{\boldsymbol{x}}_{i-1}) + \boldsymbol{\epsilon}_i

Using the same reasoning as we did as for Kalman filters and writing \boldsymbol{A}_{i-1} for \boldsymbol{A}_{\boldsymbol{x}}(\hat{\boldsymbol{x}}_{i-1}) we obtain

\displaystyle  \begin{aligned}  \hat{\boldsymbol{x}}^\flat_i &=  \boldsymbol{a}(\hat{\boldsymbol{x}}_{i-1}) \\  \hat{\boldsymbol{\Sigma}}^\flat_i &= \boldsymbol{A}_{i-1}  \hat{\boldsymbol{\Sigma}}_{i-1}  \boldsymbol{A}_{i-1}^\top  + \boldsymbol{\Sigma}^{(x)}_{i-1}  \end{aligned}

Haskell Implementation

Note that we pass in the Jacobian of the update function as a function itself in the case of the extended Kalman filter rather than the matrix representing the linear function as we do in the case of the classical Kalman filter.

> k, p0 :: Floating a => a
> k = 1.0
> p0 = 0.1 * k
> r, deltaT :: Floating a => a
> r = 10.0
> deltaT = 0.0005

Relating ad and hmatrix is somewhat unpleasant but this can probably be ameliorated by defining a suitable datatype.

> a :: R 2 -> R 2
> a rpPrev = rNew # pNew
>   where
>     (r, pPrev) = headTail rpPrev
>     rNew :: R 1
>     rNew = konst r
> 
>     (p,  _) = headTail pPrev
>     pNew :: R 1
>     pNew = fromList $ [logit p k (r * deltaT)]
> bigA :: R 2 -> Sq 2
> bigA rp = fromList $ concat $ j [r, p]
>   where
>     (r, ps) = headTail rp
>     (p,  _) = headTail ps
>     j = jacobian (\[r, p] -> [r, logit p k (r * deltaT)])

For some reason, hmatrix with static guarantees does not yet provide an inverse function for matrices.

> inv :: (KnownNat n, (1 <=? n) ~ 'True) => Sq n -> Sq n
> inv m = fromJust $ linSolve m eye

Here is the extended Kalman filter itself. The type signatures on the expressions inside the function are not necessary but did help the implementor discover a bug in the mathematical derivation and will hopefully help the reader.

> outer ::  forall m n . (KnownNat m, KnownNat n,
>                         (1 <=? n) ~ 'True, (1 <=? m) ~ 'True) =>
>           R n -> Sq n ->
>           L m n -> Sq m ->
>           (R n -> R n) -> (R n -> Sq n) -> Sq n ->
>           [R m] ->
>           [(R n, Sq n)]
> outer muPrior sigmaPrior bigH bigSigmaY
>       littleA bigABuilder bigSigmaX ys = result
>   where
>     result = scanl update (muPrior, sigmaPrior) ys
> 
>     update :: (R n, Sq n) -> R m -> (R n, Sq n)
>     update (xHatFlat, bigSigmaHatFlat) y =
>       (xHatFlatNew, bigSigmaHatFlatNew)
>       where
> 
>         v :: R m
>         v = y - (bigH #> xHatFlat)
> 
>         bigS :: Sq m
>         bigS = bigH <> bigSigmaHatFlat <> (tr bigH) + bigSigmaY
> 
>         bigK :: L n m
>         bigK = bigSigmaHatFlat <> (tr bigH) <> (inv bigS)
> 
>         xHat :: R n
>         xHat = xHatFlat + bigK #> v
> 
>         bigSigmaHat :: Sq n
>         bigSigmaHat = bigSigmaHatFlat - bigK <> bigS <> (tr bigK)
> 
>         bigA :: Sq n
>         bigA = bigABuilder xHat
> 
>         xHatFlatNew :: R n
>         xHatFlatNew = littleA xHat
> 
>         bigSigmaHatFlatNew :: Sq n
>         bigSigmaHatFlatNew = bigA <> bigSigmaHat <> (tr bigA) + bigSigmaX

Now let us create some sample data.

> obsVariance :: Double
> obsVariance = 1e-2
> bigSigmaY :: Sq 1
> bigSigmaY = fromList [obsVariance]
> nObs :: Int
> nObs = 300
> singleSample :: Double -> RVarT (W.Writer [Double]) Double
> singleSample p0 = do
>   epsilon <- rvarT (Normal 0.0 obsVariance)
>   let p1 = logit p0 k (r * deltaT)
>   lift $ W.tell [p1 + epsilon]
>   return p1
> streamSample :: RVarT (W.Writer [Double]) Double
> streamSample = iterateM_ singleSample p0
> samples :: [Double]
> samples = take nObs $ snd $
>           W.runWriter (evalStateT (sample streamSample) (pureMT 3))

We created our data with a growth rate of

ghci> r
  10.0

but let us pretend that we have read the literature on growth rates of bee colonies and we have some big doubts about growth rates but we are almost certain about the size of the colony at t=0.

> muPrior :: R 2
> muPrior = fromList [5.0, 0.1]
> 
> sigmaPrior :: Sq 2
> sigmaPrior = fromList [ 1e2, 0.0
>                       , 0.0, 1e-10
>                       ]

We only observe the population and not the rate itself.

> bigH :: L 1 2
> bigH = fromList [0.0, 1.0]

Strictly speaking this should be 0 but this is close enough.

> bigSigmaX :: Sq 2
> bigSigmaX = fromList [ 1e-10, 0.0
>                      , 0.0, 1e-10
>                      ]

Now we can run our filter and watch it switch away from our prior belief as it accumulates more and more evidence.

> test :: [(R 2, Sq 2)]
> test = outer muPrior sigmaPrior bigH bigSigmaY
>        a bigA bigSigmaX (map (fromList . return) samples)


by Dominic Steinitz at September 09, 2014 08:28 AM

Yesod Web Framework

Misassigned credit for conduit

When I was at ICFP last week, it became clear that I had made a huge mistake in the past three years. A few of us were talking, including Erik de Castro Lopo, and when I mentioned that he was the original inspiration for creating the conduit package, everyone else was surprised. So firstly: Erik, I apologize for not making it clear that you initially kicked off development by finding some fun corner cases in enumerator that were difficult to debug.

So to rectify that, I think it's only fair that I write the following:

  • conduit is entirely Erik's fault.
  • If you love conduit, write Erik a thank you email.
  • More importantly, if you hate conduit, there's no need to complain to me anymore. Erik presumably will be quite happy to receive all such further communications.
  • In other words, it's not my company, I just work here.

Thanks Erik :)

UPDATE Please also read my follow-up blog post clarifying this one, just in case you're confused.

September 09, 2014 12:00 AM

September 08, 2014

Jan Stolarek

Promoting functions to type families in Haskell

It’s been very quiet on the blog these past few months, not because I’m spending less time on functional programming but precisely for the opposite reason. Since January I’ve been working together with Richard Eisenberg to extend his singletons library. This work was finished in June, and last Friday I gave a talk about our research at Haskell Symposium 2014. This was the first time I’d been to ICFP and Haskell Symposium. It was pretty cool to finally meet all these people I know only from IRC. I also admit that the atmosphere of the conference quite surprised me, as it often felt like some sort of fan convention rather than the biggest event in the field of functional programming.

The paper Richard and I published is titled “Promoting Functions to Type Families in Haskell”. This work is based on Richard’s earlier paper “Dependently typed programming with singletons”, presented two years ago at Haskell Symposium. Back then Richard presented the singletons library, which uses Template Haskell to generate singleton types and functions that operate on them. Singleton types are types that have only one value (aside from bottom), which allows us to reason about runtime values during compilation (some introduction to singletons can be found in this post on Richard’s blog). This smart encoding allows us to simulate some of the features of dependent types in Haskell. In our current work we extended the promotion capabilities of the library. Promotion is only concerned with generating type-level definitions from term-level ones. The type-level language in GHC has become quite expressive during the last couple of years, but it is still missing many features available in the term-level language. Richard and I have found ways to encode almost all of these missing features using the already existing type-level language features. What this means is that you can write a normal term-level definition and our library will automatically generate an equivalent type family. You’re only forbidden from using infinite terms, the do-notation, and decomposing String literals into Chars. Numeric literals are also very problematic and the support is very limited, but some of the issues can be worked around. What is really cool is that our library allows you to have partial application at the type level, which GHC normally prohibits.
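To give a flavour of what that looks like in code, here is a small sketch of my own (the module name Data.Promotion.TH and the exact set of language extensions required are from memory of the singletons API of that era and may differ between versions): you write an ordinary function inside a Template Haskell quote and promote it.

{-# LANGUAGE TemplateHaskell, DataKinds, PolyKinds, TypeFamilies, KindSignatures #-}
import Data.Promotion.TH (promote)

$(promote [d|
    safeHead :: [a] -> Maybe a
    safeHead []      = Nothing
    safeHead (x : _) = Just x
    |])

-- The splice generates (roughly) a type family
--   type family SafeHead (xs :: [a]) :: Maybe a
-- together with the defunctionalisation symbols described in the paper.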

You can learn more by watching my talk on YouTube, reading the paper, or reading the singletons documentation. Here I’d like to add a bit more information that is not present in the paper. First of all, the paper was concerned only with promotion and didn’t say anything about singletonization. But as we enabled more and more language constructs to be promoted, we also made them singletonizable. So almost everything that can be promoted can also be singletonized. The most notable exception to this rule is type classes, which are not yet implemented at the moment.

An interesting issue was raised by Adam Gundry in a question after the talk: what about the difference between lazy term-level semantics and strict type-level semantics? You can listen to my answer in the video, but I’ll elaborate some more on this here. At one point during our work we were wondering about this issue and decided to demonstrate an example of an algorithm that crucially relies on laziness to work, i.e. fails to work with strict semantics. I think it’s not straightforward to come up with such an algorithm, but luckily I recalled the backwards state monad from Philip Wadler’s paper “The essence of functional programming”1. The bind operator of that monad looks like this (definition copied from the paper):

m `bindS` k = \s2 -> let (a,s0) = m s1
                         (b,s1) = k a s2
                     in  (b,s0)

The tricky part here is that the output of the call to m becomes the input to the call to k, while the output of the call to k becomes the input of m. Implementing this in a strict language does not at all look straightforward. So I promoted that definition expecting it to fail spectacularly, but to my surprise it worked perfectly fine. After some investigation I understood what was going on. Type-level computations performed by GHC are about constraint solving. It turns out that GHC is able to figure out in which order to solve these constraints and get the result. It’s exactly analogous to what happens with the term-level version at runtime: we have an order of dependencies between the closures and there is a way in which we can run these closures to get the final result.

All of this work is a small part of a larger endeavour to push Haskell’s type system towards dependent types. With singletons you can write type-level functions easily by writing their definitions using the term-level language and then promoting these definitions. And then you can singletonize your functions to work on singleton types. There were two other talks about dependent types during the conference: Stephanie Weirich’s “Depending on Types” keynote lecture during ICFP and Richard’s “Dependent Haskell” talk during the Haskell Implementors Workshop. I encourage everyone interested in Haskell’s type system to watch both of these talks.

  1. The awful truth is that this monad does not really work with the released version of singletons. I only realized that when I was writing this post. See issue #94 on singletons bug tracker.

by Jan Stolarek at September 08, 2014 11:02 AM

September 07, 2014

Neil Mitchell

Shake in the wild

Summary: I spotted a few things using Shake, which I had nothing to do with.

In the past few days I have come across several things using the Shake build system. I wasn't involved in any of them, and haven't (yet) tried any of them out, but they certainly look cool.

ToolCabal

Tibor Bremer from Utrecht University gave a talk at the Haskell Implementors Workshop 2014 about his ToolCabal project. This project replaces the "build a package" part of Cabal with something more flexible, supporting multiple simultaneous targets and more flexible preprocessors - all built on top of Shake. It doesn't attempt to tackle dependency resolution yet. There is a video of the talk:

Video: https://www.youtube.com/embed/VUyIu2T1Qss

Samplecount

The folks at Samplecount have written several Shake-based things. None are yet on Hackage, so I suspect they are still somewhat prototypes, but they look like they're already used quite seriously.

  • shake-cabal-build to make it easier to build your Shake build systems with Cabal. Shake build systems need to be compiled with GHC, for which I usually use ghc --make, but this project explains how to get things building with Cabal - important if your build system pulls in other libraries.
  • shake-language-c is a project to simplify building C/C++ projects with Shake. From the docs:

shake-language-c is a cross-platform build system based on the Shake Haskell library. The focus is on cross-compilation of C, C++ and Objective C source code to various target platforms. Currently supported target platforms are iOS, Android NDK, Google Portable Native Client, MacOS X, Linux and Windows (MinGW). Supported host platforms are MacOS X, Linux and Windows.

  • methcla is their mobile sound engine, which is built using this Shake script, which (unsurprisingly) uses shake-language-c and shake-cabal-build.

by Neil Mitchell (noreply@blogger.com) at September 07, 2014 09:02 PM

Edward Z. Yang

Haskell Implementor’s Workshop ’14

This year at ICFP, we had some blockbuster attendance to the Haskell Implementor's Workshop (at times, it was standing room only). I had the pleasure of presenting the work I had done over the summer on Backpack.

[Image: /img/backpack-ufo.png]

You can grab the slides or view the presentation itself (thank you ICFP organizers for being incredibly on-the-ball with videos this year!) The talk intersects a little bit with my blog post A taste of Cabalized Backpack, but there are more pictures, and I also emphasize (perhaps a little too much) the long term direction we are headed in.

[Image: /img/backpack-schema.png]

There were a lot of really nice talks at HiW. Here are some of my personal highlights:

by Edward Z. Yang at September 07, 2014 01:05 PM

Mike Izbicki

Getting started with GitHub, vim, and bash

posted on 2014-09-07 by Rashid Goshtasbi and Kyler Rynear

Learning to use git, vim, and bash was hard for us. These tools are so different than the tools we used when we first learned to program. And they’re confusing! But our professor made us use them… and eventually… after we learned the tools… we discovered that we really like them! So we’ve put together a simple video guide to help you learn and enjoy these tools too. We did this as part of the CS100 open source software development class at UC Riverside.

Click here to watch the full playlist on YouTube.

Getting Started with GitHub

Video: https://www.youtube.com/embed/bap-NSjgPFg?rel=0&vq=hd1080

This video shows you step by step how to create an account on GitHub. Then we see how to create our first repository called test, and transfer it from GitHub onto our local machine using the git clone command.

Creating a file, pushing to GitHub, and pulling from GitHub

Video: https://www.youtube.com/embed/UrLkCZaXg9o?rel=0&vq=hd1080

How do we create files and upload them to GitHub? The touch <filename> command will create an empty file for you. The vim <filename> command will open a file in an advanced text editor that we talk about farther down the page. The git push command sends these files from your local machine up to GitHub, and the git pull command downloads files from GitHub and saves them to your local computer.

Branches

Video: https://www.youtube.com/embed/E8-hUsR7IXA?rel=0&vq=hd1080

Branches let you work on files without messing up your original code. When you finish your changes, you can merge them into the master branch. This is the best part of version control.

Tags

Video: https://www.youtube.com/embed/WKG1u4Y_f3s?rel=0&vq=hd1080

Most programs have different versions, for example: 1.0, 1.1, 1.2, 2.1 and 2.2.1. The git tag command lets you create these versions. They’re just like a checkpoint in a Mario game!

Forking & Pull Requests

Video: https://www.youtube.com/embed/tTnL84EvJTM?rel=0&vq=hd1080

Let’s say you want to contribute to an open source project, but you don’t have permission. In order to contribute to someone else’s repository, you must first “fork” it to create a repo that you do have push permission on. Then you issue a pull request through the GitHub website. This tells the owner of the original repo that you’ve made some changes they can incorporate.

The README.md file

Video: https://www.youtube.com/embed/4UTSEKzsSvM?rel=0&vq=hd1080

README.md files are how you document your projects. The README.md should explain your program, give installation instructions, and list known bugs. Basically, it explains your program to someone who has absolutely no idea what it does or how to code, and gives them the concepts and basic directions they need to run it. The .md extension at the end of the filename indicates that the file uses markdown formatting. This is a simple way to create nice looking documentation.

Learning vim

vim is an advanced text editor for Unix operating systems. It’s powerful, but all the commands are intimidating for first time users. Even though it’s hard to get used to at first, these videos will help you learn some of the basic commands and get comfortable with vim.

Getting Started

Video: https://www.youtube.com/embed/szTtE60fIt8?rel=0&vq=hd1080

It was difficult at first trying to traverse my code while using vim. I was so used to being able to use my mouse and simply click where I wanted to go. There are many ways to maneuver inside of vim. Some may just use the h, j, k, l or up, down, left, right arrow keys, or the w, e, b keys to move. You can also press gg to go to the top of the code, G to go to the bottom of it, and (any number)G to go to the line number typed before the capital G.

Cutting, copying, and pasting took a while to get used to when using vim. Sometimes there was something I wanted in my code that was in the instructions for the assignment. In order to paste I would use the p command, but I could not paste things from outside of vim into it. If I had something copied outside of vim, then to paste it into vim I would right click and just click paste. This would paste it wherever the cursor currently is. If you right click to copy, then it will not affect what is copied by using the command y to copy or the commands d or x to cut. If those commands are used, then just pressing p will paste them. There are other ways to store more than one thing while copying or cutting, but these two ways were the most helpful as I learned how to use vim.

Other personal favorite features of vim are the shift-a command (takes you to the end of the line and into insert mode) and the shift-i command (takes you to the beginning of the line and into insert mode). You can also press a to append after the cursor position, as well as i to insert before the current cursor position.

vim also allows you to use the v or shift-v keys to highlight certain text or lines of code. You can then use other vim commands such as the copy, paste and delete keys to perform your needed actions.

Indentation

Video: https://www.youtube.com/embed/uuztdE_gixs?rel=0&vq=hd1080

At first it felt very time consuming to indent multiple lines. I felt this way until I found out about the V command. V lets users highlight a line, and pressing up or down can highlight as many lines as they desire. All that was left to do was to type > after everything I wanted to indent was highlighted, and it would all be indented once to the right. Typing < would instead indent it to the left if I ever wanted to do that.

Deletion

Video: https://www.youtube.com/embed/x0BMbS2kWYc

There are two commands for deleting single character. x deletes the character that the cursor is on and moves the cursor to the right; and X deletes the character that the cursor is on and moves the cursor to the left.

The d command is a more powerful way to delete. d can be used with many different things after it. dd will delete the entire line. d$ will delete the rest of the current line. de will delete from where the cursor is up until the end of the word.

Replacing

Video: https://www.youtube.com/embed/d-quT7u3f_o

Lower case r replaces a single character, while upper case R enters replace mode and overwrites characters as you type.

There are three c commands that I regularly use for replacement: ce , which deletes up until the end of the word that the cursor is currently on, then allows you to insert immediately; c$ , which deletes from where the cursor is up until the end of the line, then allows you to insert immediately; and cc , which deletes the whole line that the cursor is on and allows you to insert immediately at the beginning of the line.

Customizing your vim editor with the .vimrc file

Video: https://www.youtube.com/embed/VhAiVux6GBg?rel=0&vq=hd1080

Ever wondered how we get our vim editor to work the way we have it, versus the default editor? vim has a file where you can set up its defaults, such as auto parentheses, auto-indent, and much more. By watching our video above, you can easily create new defaults for your vim editor that cut down the time spent formatting your text, so you can spend more time coding.

Learning The Terminal

One of the best features of Unix operating systems is the powerful terminal they provide.

The ls command

Video: https://www.youtube.com/embed/xSSahh5HbUY?rel=0&vq=hd1080

The ls command is one of the most used terminal commands.

The basic ls command, when run, displays the contents within the current working directory. Passing in a directory name as an argument will display the contents of that directory. It is also possible to pass in a path for a directory to display any directory, regardless of the directory the user is currently in.

If the -a flag is passed in with ls, all items in the current working directory whose names begin with a . are also displayed, along with the rest of the items.

Passing in the -l flag prints information for each item in the directory in a series of columns on a single line. The first column displays the read, write, and execute permissions for the file's owner, the file's group, and everyone else, in that order. The next column shows the owner of the item and the next column shows the group owner. The fourth column displays the size, in bytes, of the item. The fifth column displays when the item was last modified, and the last column displays the name of the item.

If the -R flag is passed in, the command displays the contents of the current directory, then recursively enters every directory within it and displays its contents, continuing into each nested directory until there are no more directories left to visit.

All these options are combinable for different uses. For example, I could use the -l and -a flags to display the information for the items prepended with a . , or use -R and -l together.

The cd and mv commands

[Video: https://www.youtube.com/embed/1s5TiFbETh4?rel=0&vq=hd1080]

The cd and mv commands are crucial for actually using the terminal. Without cd, I would forever be stuck in my home directory, and without mv there would be no way to move files from one part of the hard drive to another. The cd command by itself changes the current working directory to the home directory. If passed the name of a directory that is within the current working directory, it changes the current working directory to that directory. cd will also take a path as an argument; when a path is passed in, the current working directory is changed to the directory specified by the path. When cd is passed .., it moves up one level, to the directory that contains the current directory.

The mv command will move an item within a certain directory to the directory passed in.

If the destination argument is not a path, the command looks for the destination in the current working directory. The destination argument can also be a path, so I can move the item to any directory on the hard drive.

Recording terminal sessions via scripts

[Video: https://www.youtube.com/embed/ZnIrku27C94?rel=0&vq=hd1080]

With the script command you can record your terminal session into a file. Just typing script file_name_here starts the recording. You don’t need to create the file beforehand: when you specify the filename, script creates it for you. When you’re done, type exit and your terminal will report that the script session has ended and restate the filename in which it recorded your session.

How To SSH (into the bell server)

[Video: https://www.youtube.com/embed/Letf4txWPic?rel=0&vq=hd1080]

Computer Science students have the ability to log into the school’s server using the ssh command. To access it, type the following into your terminal:

ssh your_NetId@bell.cs.ucr.edu

If it is your first time connecting to the server, you will be asked to trust the encryption key that the server uses, then prompted to enter the password associated with your NetID. Once you have done all those steps, you will be brought to your home directory on the server. To exit the server, type exit into the command prompt and press enter.

A useful command for moving files between your home computer and the remote server is scp. To copy items from your home computer to the school’s server, type into the command prompt:

scp filename/absolute_path your_NetID@bell.cs.ucr.edu:absolute_path

To move items from the remote server onto your home computer, type into the command prompt:

scp your_NetID@bell.cs.ucr.edu:absolute_path absolute_path

Spectacle App: Using the terminal and vim in one screen

[Video: https://www.youtube.com/embed/j1fnYZp4foI?rel=0&vq=hd1080]

One of the first things I noticed about vim, and initially disliked, was that it took over the terminal when I used it. To get around this, I started using two terminals instead of just one while programming: I would run vim in the first terminal and run the executable in the second. It was as simple as using :w to save in vim instead of :wq to save and quit, so I could test my code without ever having to close vim. To view both terminals at once, it helps to snap the windows side by side. Users on Windows 7 and above can do this by dragging a window to the left or right edge of the screen. Unfortunately, OS X doesn’t have this built in, but OS X users can install the Spectacle app, which lets you organize multiple windows on your screen at the touch of a button.

perror

[Video: https://www.youtube.com/embed/GsoVzP3sRsA?rel=0&vq=hd1080]

When programming for Unix-based operating systems (a primary component of CS100), system calls are a prominent part of the code. The perror function looks at the error value (errno) set by a failed system call and prints to stderr an error message describing the type of error that occurred. It takes one C-string argument, a message the user can pass in, which is printed before the error description.

September 07, 2014 12:00 AM

September 06, 2014

Joachim Breitner

ICFP 2014

Another on-the-journey-back blog post; this time from the Frankfurt Airport train station – my flight was delayed (had I known that, I could have watched the remaining Lightning Talks), and so was my train, but despite five minutes of running through the airport that was just not enough to catch it. And now that the free 30 minutes of railway station internet are used up, I have nothing else to do but blog...

Last week I was attending ICFP 2014 in Gothenburg, followed by the Haskell Symposium and the Haskell Implementors Workshop. The justification to attend was the paper on Safe Coercions (joint work with Richard Eisenberg, Simon Peyton Jones and Stephanie Weirich), although Richard got to give the talk, and did so quite well. So I got to leisurely attend the talks, while fighting the jet-lag that I brought from Portland.

There were – as expected – quite a few interesting talks. Among them the first keynote, Kathleen Fisher on the need for formal methods in cars and toy-quadcopters and unmanned battle helicopters, which made me conclude that my Isabelle skills might eventually become relevant in practical applications. And did you know that if someone gains access to your car’s electronics, they can make the seat belt pull you back hard?

Stephanie Weirich’s keynote (and the subsequent related talks by Jan Stolarek and Richard Eisenberg) on what a dependently typed Haskell would look like and what we could use it for was mouth-watering. I am a bit worried that Haskell will become a bit obscure for newcomers and people that simply don’t want to think about types too much; on the other hand, it seems that Haskell as we know it will always stay around, just as a subset of the language.

Similarly interesting were refinement types for Haskell (talks by Niki Vazou and by Eric Seidel), in the form of LiquidTypes, something that I have not paid attention to yet. It seems to be a good way to get higher assurance in Haskell code.

Finally, the Haskell Implementors Workshop had a truckload of exciting developments in and around Haskell: More on GHCJS, Partial type signatures, interactive type-driven development like we know it from Agda, the new Haskell module system and amazing user-defined error messages – the latter unfortunately only in Helium, at least for now.

But it’s not the case that I only sat and listened. During the Haskell Implementors Workshop I gave a talk, “Contributing to GHC”, with a live demo of me fixing a (tiny) bug in GHC, with the aim of getting more people to hack on GHC (slides, video). The main message here is that it is not that big of a deal. And despite me not actually saying much that was interesting in the talk, I got good feedback afterwards. So if it now actually motivates someone to contribute to GHC, I’m even happier.

And then there is of course the Hallway Track. I discussed the issues with fusing a left fold (unfortunately, without a great solution). In order to tackle this problem more systematically, John Wiegley and I created the beginning of a “List Fusion Lab”, i.e. a bunch of list benchmarks and the ability to compare various implementations (e.g. with different RULES) and various compilers. With that we can hopefully better assess the effect of a change to the list functions.

PS: The next train is now also delayed, so I’ll likely miss my tram and arrive home even later...

PPS: I really have to update my 10-year-old picture on my homepage (or redesign it completely). Quite a few people knew my name, but expected someone with shoulder-length hair...

PPPS: Haskell is really becoming mainstream: I just talked to a randomly chosen person (the boy sitting next to me in the train), and he is a Haskell enthusiast, building a structured editor for Haskell together with his brother. And all that as a 12th-grader...

by Joachim Breitner (mail@joachim-breitner.de) at September 06, 2014 10:46 PM

Tim Docker

A New Charting API

One of the challenges with building a library like Chart is the tension between ease of use and flexibility. Users want to produce charts with a minimum of code up front, but later want to refine the details. The chart library addresses this through the use of "defaulted records" using Data.Default.Class. Because such records are often nested, we rely on the somewhat intimidating lens library to modify the default values. We end up with code to create chart elements like this:

sinusoid2 = plot_points_title .~ "fn(x)"
          $ plot_points_values .~ mydata
          $ plot_points_style . point_color .~ opaque red
          $ def

This is much simpler and cleaner than the corresponding code using native record accessors, but it still has a certain amount of syntactic overhead.

I’ve added a simple state monad to the library to further clean up the syntax. The state of the monad is the value being constructed, allowing the use of the monadic lens operators. The above code sample becomes:

sinusoid2 = execEC $ do
    plot_points_title .= "fn(x)" 
    plot_points_values .= mydata
    plot_points_style . point_color .= opaque red

This may seem only a minor syntactic improvement, but it adds up over a typical chart definition.

A few other changes further reduce the clutter in charting code:

  • A new Easy module that includes helper functions and key dependencies
  • Simpler "toFile" functions in the rendering backends
  • Automatic sequencing of colours for successive plots

All this means that a simple plot can now be a one liner:

import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo

mydata :: [(Double, Double)]
mydata = ...

main = toFile def "test.png" $ plot $ points "lines" mydata

But this extends naturally to more complex charts. The code differences between the new stateful API and the existing API can be seen in this example.
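For a rough flavour of what a slightly fuller chart looks like in the stateful style, here is a sketch of my own (the signal data and file name are made up; layout_title, line and points come from the Easy module):

import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo

-- A made-up signal to plot.
signal :: [Double] -> [(Double, Double)]
signal xs = [ (x, sin (x / 10) * exp (negate x / 100)) | x <- xs ]

main :: IO ()
main = toFile def "sketch.png" $ do
    layout_title .= "Damped oscillation"          -- set a layout field
    plot (line "signal" [signal [0,0.5..400]])    -- a line plot
    plot (points "samples" (signal [0,10..400]))  -- a points plot, drawn in the next colour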

The stateful API is available in Chart v1.3. It is a thin layer over the existing API – both will continue to be available in the future.


by Tim Docker at September 06, 2014 05:34 AM

September 04, 2014

Edward Z. Yang

Open type families are not modular

One of the major open problems for building a module system in Haskell is the treatment of type classes, which I have discussed previously on this blog. I've noted how the current mode of use of type classes in Haskell assumes “global uniqueness”, which is inherently anti-modular; breaking this assumption risks violating the encapsulation of many existing data types.

As if we have a choice.

In fact, our hand is forced by the presence of open type families in Haskell, which share many properties with type classes, but with the added property that global uniqueness is required for type safety. We don't have a choice (unless we want type classes with associated types to behave differently from type classes): we have to figure out how to reconcile the inherent non-modularity of type families with the Backpack module system.

In this blog post, I want to carefully lay out why open type families are inherently unmodular and propose some solutions for managing this unmodularity. If you know what the problem is, you can skip the first two sections and go straight to the proposed solutions section.


Before we talk about open type family instances, it's first worth emphasizing the (intuitive) fact that a signature of a module is supposed to be able to hide information about its implementation. Here's a simple example:

module A where
    x :: Int

module B where
    import A
    y = 0
    z = x + y

Here, A is a signature, while B is a module which imports the signature. One of the points of a module system is that we should be able to type check B with respect to A, without knowing anything about what module we actually use as the implementation. Furthermore, if this type checking succeeds, then for any implementation which provides the interface of A, the combined program should also type check. This should hold even if the implementation of A defines other identifiers not mentioned in the signature:

module A where
    x = 1
    y = 2

If B had directly imported this implementation, the identifier y would be ambiguous; but the signature filters out the declarations so that B only sees the identifiers in the signature.


With this in mind, let's now consider the analogous situation with open type families. Assuming that we have some type family F defined in the prelude, we have the same example:

module A where
    type instance F Int
    f :: F Bool

module B where
    import A
    type instance F Bool = Int -> Bool
    x = f 2

Now, should the following module A be a permissible implementation of the signature?

module A where
    type instance F Int = Int
    type instance F Bool = Int
    f = 42

If we view this example with the glasses off, we might conclude that it is a permissible implementation. After all, the implementation of A provides an extra type instance, yes, but when this happened previously with a (value-level) declaration, it was hidden by the signature.

But if we put our glasses on and look at the example as a whole, something bad has happened: we're attempting to use the integer 42 as a function from integers to booleans. The trouble is that F Bool has been given different types in module A and module B, and this is unsound... like, segfault unsound. And if we think about it some more, this should not be surprising: we already knew it was unsound to have overlapping type family instances (and eagerly check for this), and signature-style hiding is an easy way to allow overlap to sneak in.

The distressing conclusion: open type families are not modular.


So, what does this mean? Should we throw our hands up and give up giving Haskell a new module system? Obviously, we’re not going to go without a fight. Here are some ways to counter the problem.

The basic proposal: require all instances in the signature

The simplest and most straightforward way to solve the unsoundness is to require that a signature mention all of the family instances that are transitively exported by the module. So, in our previous example, the implementation of A does not satisfy the signature because it has an instance which is not mentioned in the signature, but would satisfy this signature:

module A where
    type instance F Int
    type instance F Bool

While at first glance this might not seem too onerous, it's important to note that this requirement is transitive. If A happens to import another module Internal, which itself has its own type family instances, those must be represented in the signature as well. (It's easy to imagine this spinning out of control for type classes, where any of the forty imports at the top of your file may be bringing in any manner of type classes into scope.) There are two major user-visible consequences:

  1. Module imports are not an implementation detail—you need to replicate this structure in the signature file, and
  2. Adding instances is always a backwards-incompatible change (there is no weakening).

Of course, as Richard pointed out to me, this is already the case for Haskell programs (and you just hoped that adding that one extra instance was "OK").

Despite its unfriendliness, this proposal serves as the basis for the rest of the proposals, which you can conceptualize as trying to characterize, “When can I avoid having to write all of the instances in my signature?”

Extension 1: The orphan restriction

Suppose that I write the following two modules:

module A where
    data T = T
    type instance F T = Bool

module B where
    import A
    type instance F T = Int -> Int

While it is true that these two type instances overlap and are rightly rejected, they are not equally at fault: in particular, the instance in module B is an orphan. An orphan instance is an instance for type class/family F and data type T (it just needs to occur anywhere on the left-hand side) which lives in a module that defines neither. (The instance in A is not an orphan, since it lives in the same module as the definition of the data type T.)

What we might wonder is, “If we disallowed all orphan instances, could this rule out the possibility of overlap?” The answer is, “Yes! (...with some technicalities).” Here are the rules:

  1. The signature must mention all of what we will call ragamuffin instances transitively exported by the implementations being considered. An instance of a family F is a ragamuffin if it is defined neither with the family definition nor with the type constructor at the head of its first parameter. (Or some specific parameter, decided on a per-family basis.) All orphan instances are ragamuffins, but not all ragamuffins are orphans.
  2. A signature exporting a type family must mention all instances which are defined in the same module as the definition of the type family.
  3. It is strictly optional to mention non-ragamuffin instances in a signature.

(Aside: I don't think this is the most flexible version of the rule that is safe, but I do believe it is the most straightforward.) The whole point of these rules is to make it impossible to write an overlapping instance, while only requiring local checking when an instance is being written. Why did we need to strengthen the orphan condition into a ragamuffin condition to get this non-overlap? The answer is that absence of orphans does not imply absence of overlap, as this simple example shows:

module A where
    data A = A
    type instance F A y = Int

module B where
    data B = B
    type instance F x B = Bool -> Bool

Here, the two instances of F are overlapping, but neither are orphans (since their left-hand sides mention a data type which was defined in the module.) However, the B instance is a ragamuffin instance, because B is not mentioned in the first argument of F. (Of course, it doesn't really matter if you check the first argument or the second argument, as long as you're consistent.)

Another way to think about this rule is that open type family instances are not standalone instances but rather metadata that is associated with a type constructor when it is constructed. In this way, non-ragamuffin type family instances are modular!

A major downside of this technique, however, is that it doesn't really do anything for the legitimate uses of orphan instances in the Haskell ecosystem: when third-parties defined both the type family (or type class) and the data type, and you need the instance for your own purposes.

Extension 2: Orphan resolution

This proposal is based on one that Edward Kmett has been floating around, but which I've refined. The motivation is to give a better story for offering the functionality of orphan instances without gunking up the module system. The gist of the proposal is to allow the package manager to selectively enable/disable orphan definitions; however, to properly explain it, I'd first like to describe a few situations involving orphan type class instances. (The examples use type classes rather than type families because the use-cases are more clear. If you imagine that the type classes in question have associated types, then the situation is the same as that for open type families.)

The story begins with a third-party library which defined a data type but did not provide an instance that you needed:

module Data.Foo where
    data Foo = Foo

module MyApp where
    import Data.Foo
    fooString = show Foo -- XXX no instance for Show

If you really need the instance, you might be tempted to just go ahead and define it:

module MyApp where
    import Data.Foo
    instance Show Foo where -- orphan
        show Foo = "Foo"
    fooString = show Foo

Later, you upgrade Data.Foo to version 1.0.0, which does define a Show instance, and now you get an overlapping instance error! Uh oh.

How do we get ourselves out of the mess? A clue is how many package authors currently “get out of jail” by using preprocessor macros:

{-# LANGUAGE CPP #-}
module MyApp where
    import Data.Foo
#if MIN_VERSION_foo(1,0,0)
    instance Show Foo where -- orphan
        show Foo = "Foo"
#endif
    fooString = show Foo

Morally, we'd like to hide the orphan instance when the real instance is available: there are two variations of MyApp which we want to transparently switch between, one which defines the orphan instance, and one which does not and instead uses the non-orphan instance defined in Data.Foo. The choice depends on which version of foo was chosen, a decision made by the package manager.

Let's mix things up a little. There is no reason the instance has to be a non-orphan coming from Data.Foo. Another library might have defined its own orphan instance:

module MyOtherApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    otherFooString = show Foo

module MyApp where
    import Data.Foo
    instance Show Foo where ... -- orphan
    fooString = show Foo

module Main where
    import MyOtherApp
    import MyApp
    main = print (fooString ++ otherFooString ++ show Foo)

It's a bit awful to get this to work with preprocessor macros, but there are two ways we can manually resolve the overlap: we can erase the orphan instance from MyOtherApp, or we can erase the orphan instance from MyApp. A priori, there is no reason to prefer one over the other. However, depending on which one is erased, Main may have to be compiled differently (if the code in the instances is different). Furthermore, we need to set up a new (instance-only) import from the module that defines the instance to the module whose instance was erased.

There are a few takeaways from these examples. First, the most natural way of resolving overlapping orphan instances is to simply “delete” the overlapping instances; however, which instance to delete is a global decision. Second, which overlapping orphan instances are enabled affects compilation: you may need to add module dependencies to be able to compile your modules. Thus, we might imagine that a solution allows us to do both of these, without modifying source code.

Here is the game plan: as before, packages can define orphan instances. However, the list of orphan instances a package defines is part of the metadata of the package, and the instance itself may or may not be used when we actually compile the package (or its dependencies). When we do dependency resolution on a set of packages, we have to consider the set of orphan instances being provided and only enable a set which is non-overlapping, the so-called orphan resolution. Furthermore, we need to add an extra dependency from packages whose instances were disabled to the package that is the sole definer of an instance (this might constrain which orphan instance we can actually pick as the canonical instance).

The nice thing about this proposal is that it solves an already existing pain point for type class users, namely defining an orphan type class instance without breaking when upstream adds a proper instance. But you might also think of it as a big hack, and it requires cooperation from the package manager (or some other tool which manages the orphan resolution).


The extensions to the basic proposal are not mutually exclusive, but it's an open question whether or not the complexity they incur is worth the benefits they bring to existing uses of orphan instances. And of course, there may be other ways of solving the problem which I have not described here, but this smorgasbord seems to be the most plausible at the moment.

At ICFP, I had an interesting conversation with Derek Dreyer, where he mentioned that when open type families were originally going into GHC, he had warned Simon that they were not going to be modular. With the recent addition of closed type families, many of the major use-cases for open type families stated in the original paper have been superseded. However, even if open type families had never been added to Haskell, we still might have needed to adopt these solutions: the global uniqueness of instances is deeply ingrained in the Haskell community, and even if in some cases we are lax about enforcing this constraint, it doesn't mean we should actively encourage people to break it.

I have a parting remark for the ML community, as type classes make their way in from Haskell: when you do get type classes in your language, don’t make the same mistake as the Haskell community and start using them to enforce invariants in APIs. This way leads to the global uniqueness of instances, and the loss of modularity may be too steep a price to pay.


Postscript. One natural thing to wonder is whether overlapping type family instances are OK if one of the instances “is not externally visible.” Of course, the devil is in the details; what do we mean by external visibility of the type family instances of F?

For some definitions of visibility, we can find an equivalent, local transformation which has the same effect. For example, if we never use the instance at all, it is certainly OK to have overlap. In that case, it would also have been fine to delete the instance altogether. As another example, we could require that there are no (transitive) mentions of the type family F in the signature of the module. However, eliminating the mention of the type family requires knowing enough parameters and equations to reduce it: in which case the type family could have been replaced with a local, closed type family.
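To make that last point concrete, here is a small sketch of my own (not from the Backpack work) of what such a local replacement might look like: a closed type family with the same equations cannot be extended from any other module, so there is no overlap for a signature to hide.

{-# LANGUAGE TypeFamilies #-}
module ALocal where
    -- A closed family (FLocal is my own illustrative name) standing in
    -- for the private equations of F; no other module can add to it.
    type family FLocal a where
      FLocal Int  = Int
      FLocal Bool = Int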

One definition that definitely does not work is if F can be mentioned with some unspecified type variables. Here is a function which coerces an Int into a function:

module A where
  type instance F Int = Int
  f :: Typeable a => a -> F a
  f x = case eqT of
    Just Refl -> x :: Int
    Nothing -> undefined

module ASig where
  f :: Typeable a => a -> F a

module B where
  import ASig
  type instance F Int = Bool -> Bool
  g :: Bool
  g = f 0 True -- oops

...the point being that, even if a signature doesn't directly mention the overlapping instance F Int, type refinement (usually by some GADT-like structure) can mean that an offending instance can be used internally.

by Edward Z. Yang at September 04, 2014 10:12 PM

Roman Cheplyaka

Dependent Haskell

Emulating dependent types (and, more generally, advanced type-level programming) has been a hot topic in the Haskell community recently. Some incredible work has been done in this direction: GADTs, open and closed type families, singletons, etc. The plan is to copy even more features to the type level, like type classes and GADTs, and simplify the promotion of value-level functions.
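As a tiny taste of what this already buys us today, here is the standard length-indexed vector example (nothing tied to any particular library, just DataKinds and GADTs):

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- A vector whose length is tracked in its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Guaranteed safe: the type rules out the empty-vector case entirely.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x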

On the other hand, there’s a good deal of scepticism around this idea. «If you want to program like in Agda, why don’t you program in Agda?»

First, libraries. It took us many years to arrive at a Hackage ecosystem that is suitable for industrial usage — and we still don’t have satisfactory answers to many problems. My guess is that it will take at least as long for the dependently typed community to arrive at the same point — not only because of the smaller community, but also because they will look for even more type-safe solutions, which is naturally a harder problem.

Making your code extremely type-safe is quite hard (or else it would not be worth an ICFP talk). In a real-world application, you’d probably have just a few places where it’s worth the effort. But if you write the whole application in a dependently-typed language, you pay the price for your whole codebase. The price includes, for instance, the absence of type inference.

The compilation problem is not solved either. In particular, there are concerns about type erasure in both Agda and Idris. This is not unsolvable, just hasn’t been done yet.

So maybe you could write some parts in Agda/Idris and the rest in Haskell? Neither Agda nor Idris has a good story for that. For instance, Agda can generate Haskell code, but interfacing with Haskell won’t be very smooth. And the differences in languages mean that it probably won’t ever be effortless.

Don’t get me wrong, I am very enthusiastic about these (and future) languages. However, it doesn’t seem like they will be ready for production usage anytime soon. At the same time, you can satisfy the majority of your dependently typed needs in Haskell right now, no hidden charges. Isn’t that cool?

Disclaimer

I am no expert in Agda/Idris. The above is my impression from talking to different people at ICFP and participating in tutorials on these two languages given by their corresponding authors. I’ll gladly accept rebuttals.

Updates

There’s a discussion happening on /r/haskell, where people dispute many of my points. I encourage you to read that discussion and make your own conclusions.

In particular, it has been pointed out that erasure is not as big of a problem as I thought initially. Still, it’ll take quite some time (and non-trivial systems built with dependent types) for us to be able to trust Idris/Agda compilers as much as we now trust GHC.

September 04, 2014 09:00 PM

Ian Ross

Non-diffusive atmospheric flow #5: pre-processing

Note: there are a couple of earlier articles that I didn’t tag as “haskell” so they didn’t appear in Planet Haskell. They don’t contain any Haskell code, but they cover some background material that’s useful to know (#3 talks about reanalysis data and what Z500 is, and #4 displays some of the characteristics of the data we’re going to be using). If you find terms here that are unfamiliar, they might be explained in one of these earlier articles.

The code for this post is available in a Gist.

Before we can get into the “main analysis”, we need to do some pre-processing of the Z500 data. In particular, we are interested in large-scale spatial structures, so we want to subsample the data spatially. We are also going to look only at the Northern Hemisphere winter, so we need to extract temporal subsets for each winter season. (The reason for this is that winter is the season where we see the most interesting changes between persistent flow regimes. And we look at the Northern Hemisphere because it’s where more people live, so it’s more familiar to more people.) Finally, we want to look at variability about the seasonal cycle, so we are going to calculate “anomalies” around the seasonal cycle.

We’ll do the spatial and temporal subsetting as one pre-processing step and then do the anomaly calculation separately, just for simplicity.

Spatial and temporal subsetting

The title of the paper we’re trying to follow is “Observed Nondiffusive Dynamics in Large-Scale Atmospheric Flow”, so we need to decide what we mean by “large-scale” and to subset our data accordingly. The Z500 data from the NCEP reanalysis dataset is at 2.5° × 2.5° resolution, which turns out to be a little finer than we need, so we’re going to extract data on a 5° × 5° grid instead. We’ll also extract only the Northern Hemisphere data, since that’s what we’re going to work with.

For the temporal subsetting, we need to take 181 days of data for each year starting on 1 November each year. Since the data starts at the beginning of 1948 and goes on to August 2014 (which is when I’m writing this), we’ll have 66 years of data, from November 1948 until April 2014. As usual when handling dates, there’s some messing around because of leap years, but here it basically just comes down to which day of the year 1 November is in a given year, so it’s not complicated.

The daily NCEP reanalysis geopotential height data comes in one file per year, with all the pressure levels used in the dataset bundled up in each file. That means that the geopotential height variable in each file has coordinates: time, level, latitude, longitude, so we need to slice out the 500 mb level as we do the other subsetting.

All this is starting to sound kind of complicated, and this brings us to a regrettable point about dealing with this kind of data – it’s messy and there’s a lot of boilerplate code to read and manipulate coordinate metadata. This is true pretty much whatever language you use for processing these multi-dimensional datasets and it’s kind of unavoidable. The trick is to try to restrict this inconvenient stuff to the pre-processing phase by using a consistent organisation of your data for the later analyses. We’re going to do that here by storing all of our Z500 anomaly data in a single NetCDF file, with 181-day long winter seasons back-to-back for each year. This will make time and date processing trivial.
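To see why this layout makes the later time handling so easy, here is a little sketch (mine, not part of the Gist): with 181-day seasons stored back-to-back, converting between a flat time index and a (season, day-of-season) pair is just integer arithmetic.

-- Not from the Gist: index arithmetic for the back-to-back layout.
daysPerSeason :: Int
daysPerSeason = 181

-- Flat time index -> (winter season number, day within that season).
toSeasonDay :: Int -> (Int, Int)
toSeasonDay it = it `divMod` daysPerSeason

-- (season, day) -> flat time index in the subsetted file.
fromSeasonDay :: Int -> Int -> Int
fromSeasonDay season day = season * daysPerSeason + day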

The code for the data subsetting is in the subset.hs program in the Gist. We’ll deal with it in a few bites.

Skipping the imports that we need, as well as a few “helper” type synonym definitions, the first thing that we need to do is open one of the input NCEP NetCDF files to extract the coordinate metadata information. This listing shows how we do this:

  Right refnc <- openFile $ indir </> "hgt.1948.nc"
  let Just nlat = ncDimLength <$> ncDim refnc "lat"
      Just nlon = ncDimLength <$> ncDim refnc "lon"
      Just nlev = ncDimLength <$> ncDim refnc "level"
  let (Just lonvar) = ncVar refnc "lon"
      (Just latvar) = ncVar refnc "lat"
      (Just levvar) = ncVar refnc "level"
      (Just timevar) = ncVar refnc "time"
      (Just zvar) = ncVar refnc "hgt"
  Right lon <- get refnc lonvar :: SVRet CFloat
  Right lat <- get refnc latvar :: SVRet CFloat
  Right lev <- get refnc levvar :: SVRet CFloat

We open the first of the NetCDF files (I’ve called the directory where I’ve stored these things indir) and use the hnetcdf ncDim and ncVar functions to get the dimension and variable metadata for the latitude, longitude, level and time dimensions; we then read the complete contents of the “coordinate variables” (for level, latitude and longitude) as Haskell values (here, SVRet is a type synonym for a storable vector wrapped up in the way that’s returned from the hnetcdf get functions).

Once we have the coordinate variable values, we need to find the index ranges to use for subsetting. For the spatial subsetting, we find the start and end ranges for the latitudes that we want (17.5°N-87.5°N) and for the level, we find the index of the 500 mb level:

  let late = vectorIndex LT FromEnd lat 17.5
      lats = vectorIndex GT FromStart lat 87.5
      levi = vectorIndex GT FromStart lev 500.0

using a helper function to find the correspondence between coordinate values and indexes:

data IndexStart = FromStart | FromEnd

vectorIndex :: (SV.Storable a, Ord a)
            => Ordering -> IndexStart -> SV.Vector a -> a -> Int
vectorIndex o s v val = case (go o, s) of
  (Nothing, _) -> (-1)
  (Just i, FromStart) -> i
  (Just i, FromEnd) -> SV.length v - 1 - i
  where go LT = SV.findIndex (>= val) vord
        go GT = SV.findIndex (<= val) vord
        vord = case s of
          FromStart -> v
          FromEnd -> SV.reverse v

For the temporal subsetting, we just work out what day of the year 1 November is for leap and non-leap years – since November and December together are 61 days, for each winter season we need those months plus the first 120 days of the following year:

  let inov1non = 305 - 1
      -- ^ Index of 1 November in non-leap years.
      wintertsnon = [0..119] ++ [inov1non..365-1]
      -- ^ Indexes of all winter days for non-leap years.
      inov1leap = 305 + 1 - 1
      -- ^ Index of 1 November in leap years.
      wintertsleap = [0..119] ++ [inov1leap..366-1]
      -- ^ Indexes of all winter days for leap years.
      winterts1948 = [inov1leap..366-1]
      winterts2014 = [0..119]
      -- ^ Indexes for winters in start and end years.

This is kind of hokey, and in some cases you do need to do more sophisticated date processing, but this does the job here.

Once we have all this stuff set up, the critical part of the subsetting is easy – for each input data file, we figure out what range of days we need to read, then use a single call to getS from hnetcdf (“get slice”):

        forM_ winterts $ \it -> do
          -- Read a slice of a single time-step of data: Northern
          -- Hemisphere (17.5-87.5 degrees), 5 degree resolution, 500
          -- mb level only.
          let start = [it, levi, lats, 0]
              count = [1, 1, (late - lats) `div` 2 + 1, nlon `div` 2]
              stride = [1, 1, 2, 2]
          Right slice <- getS nc zvar start count stride :: RepaRet2 CShort

Here, we have a set of start indexes, a set of counts and a set of strides, one for each dimension in our input variable. Since the input geopotential height files have dimensions of time, level, latitude and longitude, we have four entries in each of our start, count and stride lists. The start values are the current day from the list of days we need (called it in the code), the level we’re interested in (levi), the start latitude index (lats) and zero, since we’re going to get the whole range of longitude. The count list gets a single time step, a single level, and a number of latitude and longitude values based on taking every other entry in each direction (since we’re subsetting from a spatial resolution of 2.5° × 2.5° to a resolution of 5° × 5°). Finally, for the stride list, we use a stride of one for the time and level directions (which doesn’t really matter anyway, since we’re only reading a single entry in each of those directions) and a stride of two in the latitude and longitude directions (which gives us the “every other one” subsetting in those directions).
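If you want to convince yourself of the index arithmetic, a throwaway helper like the following (not part of subset.hs) lists exactly which indexes a given start, count and stride combination touches along one dimension:

-- Sketch only: the indexes read along one dimension for a given
-- (start, count, stride) triple.
sliceIndexes :: Int -> Int -> Int -> [Int]
sliceIndexes start count stride = take count [start, start + stride ..]

-- e.g. for the latitude dimension above,
--   sliceIndexes lats ((late - lats) `div` 2 + 1) 2
-- gives every other latitude index from lats up to late.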

All of the other code in the subsetting program is involved in setting up the output file and writing the Z500 slices out. Setting up the output file is slightly tedious (this is very common when dealing with NetCDF files – there’s always lots of metadata to be managed), but it’s made a little simpler by copying attributes from one of the input files, something that doing this in Haskell makes quite a bit easier than in Fortran or C. The next listing shows how the NcInfo for the output file is created, which can then be passed to the hnetcdf withCreateFile function to actually create the output file:

  let outlat = SV.fromList $ map (lat SV.!) [lats,lats+2..late]
      outlon = SV.fromList $ map (lon SV.!) [0,2..nlon-1]
      noutlat = SV.length outlat
      noutlon = SV.length outlon
      outlatdim = NcDim "lat" noutlat False
      outlatvar = NcVar "lat" NcFloat [outlatdim] (ncVarAttrs latvar)
      outlondim = NcDim "lon" noutlon False
      outlonvar = NcVar "lon" NcFloat [outlondim] (ncVarAttrs lonvar)
      outtimedim = NcDim "time" 0 True
      outtimeattrs = foldl' (flip M.delete) (ncVarAttrs timevar)
                     ["actual_range"]
      outtimevar = NcVar "time" NcDouble [outtimedim] outtimeattrs
      outz500attrs = foldl' (flip M.delete) (ncVarAttrs zvar)
                     ["actual_range", "level_desc", "valid_range"]
      outz500var = NcVar "z500" NcShort
                   [outtimedim, outlatdim, outlondim] outz500attrs
      outncinfo =
        emptyNcInfo (outdir </> "z500-subset.nc") #
        addNcDim outlatdim # addNcDim outlondim # addNcDim outtimedim #
        addNcVar outlatvar # addNcVar outlonvar # addNcVar outtimevar #
        addNcVar outz500var

Although we can mostly just copy the coordinate variable attributes from one of the input files, we do need to do a little bit of editing of the attributes to remove some things that aren’t appropriate for the output file. Some of these things are just conventions, but there are some tools that may look at these attributes (actual_range, for example) and may complain if the data doesn’t match the attribute. It’s easier just to remove the suspect ones here.

This isn’t pretty Haskell by any means, and the hnetcdf library could definitely do with having some higher-level capabilities to help with this kind of file processing. I may add some based on the experimentation I’m doing here – I’m developing hnetcdf in parallel with writing this!

Anyway, the result of running the subsetting program is a single NetCDF file containing 11946 days (66 × 181) of Z500 data at a spatial resolution of 5° × 5°. We can then pass this on to the next step of our processing.

Seasonal cycle removal

In almost all investigations in climate science, the annual seasonal cycle stands out as the largest form of variability (not always true in the tropics, but in the mid-latitudes and polar regions that we’re looking at here, it’s more or less always true). The problem, of course, is that the seasonal cycle just isn’t all that interesting. We learn about the difference between summer and winter when we’re children, and although there is much more to say about seasonal variations and how they interact with other phenomena in the climate system, much of the time they just get in the way of seeing what’s going on with those other phenomena.

So what do we do? We “get rid” of the seasonal cycle by looking at what climate scientists call “anomalies”, which are basically just differences between the values of whatever variable we’re looking at and values from a “typical” year. For example, if we’re interested in daily temperatures in Innsbruck over the course of the twentieth century, we construct a “typical” year of daily temperatures, then for each day of our 20th century time series, we subtract the “typical” temperature value for that day of the year from the measured temperature value to give a daily anomaly. Then we do whatever analysis we want based on those anomalies, perhaps looking for inter-annual variability on longer time scales, or whatever.
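In code, the anomaly idea is nothing more than an element-wise subtraction; here is a minimal sketch with plain lists (the real calculation below uses Repa arrays read from the NetCDF file):

-- Sketch only: given one "typical" year of daily values and a long run
-- of observed daily values, the anomaly is the observation minus the
-- matching day of the typical year.
anomalies :: [Double] -> [Double] -> [Double]
anomalies typicalYear observations =
  zipWith (-) observations (cycle typicalYear)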

This approach is very common, and it’s what we’re going to do for our Northern Hemisphere winter-time Z500 data here. To do this, we need to do two things: we need to calculate a “typical” annual cycle, and we need to subtract that typical annual cycle from each year of our data.

OK, so what’s a “typical” annual cycle? First, let’s say a word about what we mean by “annual cycle” in this case. We’re going to treat each spatial point in our data independently, trusting to the natural spatial correlation in the geopotential height to smooth out any shorter-term spatial inhomogeneities in the typical patterns. We then do some sort of averaging in time to generate a “typical” annual cycle at each spatial point. It’s quite common to use the mean annual cycle over a fixed period for this purpose (a period of 30 years is common: 1960-1990, say). In our case, we’re going to use the mean annual cycle across all 66 years of data that we have. Here’s how we calculate this mean annual cycle (this is from the seasonal-cycle.hs program in the Gist):

  -- Each year has 181 days, so 72 x 15 x 181 = 195480 data values.
  let ndays = 181
      nyears = ntime `div` ndays

  -- Use an Int array to accumulate values for calculating mean annual
  -- cycle.  Since the data is stored as short integer values, this is
  -- enough to prevent overflow as we accumulate the 2014 - 1948 = 66
  -- years of data.
  let sh = Repa.Z Repa.:. ndays Repa.:. nlat Repa.:. nlon
      slicecount = [ndays, nlat, nlon]
      zs = take (product slicecount) (repeat 0)
      init = Repa.fromList sh zs :: FArray3 Int

  -- Computation to accumulate a single year's data.
  let doone current y = do
        -- Read one year's data.
        let start = [(y - 1948) * ndays, 0, 0]
        Right slice <- getA innc z500var start slicecount :: RepaRet3 CShort
        return $!
          Repa.computeS $ current Repa.+^ (Repa.map fromIntegral slice)

  -- Accumulate all data.
  total <- foldM doone init [1948..2013]

  -- Calculate the final mean annual cycle.
  let mean = Repa.computeS $
             Repa.map (fromIntegral . (`div` nyears)) total :: FArray3 CShort

Since each year of data has 181 days, a full year’s data has 72 × 15 × 181 data values (72 longitude points, 15 latitude points, 181 days), i.e. 195,480 values. In the case here, since we have 66 years of data, there are 12,901,680 data values altogether. That’s a small enough number that we could probably slurp the whole data set into memory in one go for processing. However, we’re not going to do that, because there are plenty of cases in climate data analysis where the data sets are significantly larger than this, and you need to do “off-line” processing, i.e. to read data from disk piecemeal for processing.

We do a monadic fold (using the standard foldM function) over the list of years of data, and for each year we read a single three-dimensional slice of data representing the whole year and add it to the current accumulated sum of all the data. (This is what the doone function does: the only slight wrinkle here is that we need to deal with conversion from the short integer values stored in the data file to the Haskell Int values that we use in the accumulator. Otherwise, it’s just a question of applying Repa’s element-wise addition operator to the accumulator array and the year’s data.) Once we’ve accumulated the total values across all years, we divide by the number of years and convert back to short integer values, giving a short integer valued array containing the mean annual cycle – a three-dimensional array with one entry for each day in our 181-day winter season and for each latitude and longitude in the grid we’re using.

Once we have the mean annual cycle with which we want to calculate anomalies, determining the anomalies is simply a matter of subtracting the mean annual cycle from each year’s data, matching up longitude, latitude and day of year for each data point. The next listing shows the main loop that does this, reading a single day of data at a time, then subtracting the appropriate slice of the mean annual cycle to produce anomaly values (Repa’s slice function is handy here) and writing these to an output NetCDF file:

      let count = [1, nlat, nlon]
      forM_ [0..ntime-1] $ \it -> do
        -- Read time slice.
        Right slice <- getA innc z500var [it, 0, 0] count :: RepaRet2 CShort

        -- Calculate anomalies and write out.
        let sl = Repa.Z Repa.:. (it `mod` 181) Repa.:. Repa.All Repa.:. Repa.All
            anom = Repa.computeS $
                   slice Repa.-^ (Repa.slice mean sl) :: FArray2 CShort
        putA outnc outz500var [it, 0, 0] count anom

The only thing we have to be a little bit careful about when we create the final anomaly output file is that we need to remove some of the attributes from the Z500 variable: because we’re now working with differences between actual values and our “typical” annual cycle, we no longer need the add_offset and scale_factor attributes that are used to convert from the stored short integer values to floating point geopotential height values. Instead, the values that we store in the anomaly file are the actual geopotential height anomaly values in metres.
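The attribute editing itself mirrors what we did in the subsetting program; roughly something like this (a sketch using plain Data.Map; the exact code in seasonal-cycle.hs may differ):

import Data.List (foldl')
import qualified Data.Map as M

-- Sketch only: drop the packing attributes before writing the anomaly
-- variable, since anomalies are stored as plain values in metres.
dropPackingAttrs :: M.Map String a -> M.Map String a
dropPackingAttrs attrs =
  foldl' (flip M.delete) attrs ["add_offset", "scale_factor"]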

After doing all this pre-processing, what we end up with is a single NetCDF file containing 66 winter seasons of daily Z500 anomalies for the region we’re interested in. The kind of rather boring data processing we’ve had to do to get to this point is pretty typical for climate data processing – you almost always need to do some sort of subsetting of your data, you often want to remove signals that aren’t interesting (like the annual cycle here, although things can get much more complicated than that). This kind of thing is unavoidable, and the best that you can really do is to try to organise things so that you do the pre-processing once and end up with data in a format that’s then easy to deal with for further processing. That’s definitely the case here, where we have fixed-length time series (181 days per winter) for each year, so we don’t need to do any messing around with dates.

In other applications, pre-processing can be a much bigger job. For functional brain imaging applications, for example, as well as extracting a three-dimensional image from the output of whatever (MRI or CT) scanner you’re using, you often need to do something about the low signal-to-noise ratios that you get, you need to compensate for subject motion in the scanner during the measurements, you need to compensate for time-varying physiological nuisance signals (breathing and heart beat), you need to spatially warp images to match an atlas image to enable inter-subject comparison, and so on. And all that is before you get to doing whatever statistical analysis you’re really interested in.

We will look at some of these “warts and all” pre-processing cases for other projects later on, but for now the small amount of pre-processing we’ve had to do here is enough. Now we can start with the “main analysis”.

Before we do that though, let’s take a quick look at what these anomaly values look like. The two figures below show anomaly plots for the same time periods for which the original Z500 data is shown in the plots in this earlier article. The “normal” anomaly plots are a bit more variable than the original Z500 plots, but the persistent pattern over the North Atlantic during the blocking episode in the second set of plots is quite clear. This gives us some hope that we’ll be able to pick out these persistent patterns in the data relatively easily.

First, the “normal” anomalies:


Normal Z500 anomaly snapshots


then the “blocking” anomalies:


Blocking Z500 anomaly snapshots



September 04, 2014 12:54 PM