## May 05, 2016

### Douglas M. Auclair (geophf)

• April 15th, 2016:
(\(intensf, list) -> map (second intensf) list) (insensifierFn, assocList)
Point-free-itize this lambda
• obadz @obadzz uncurry $map . second • April 15th, 2016: foldr (\card -> let c = color card in mappend c *** ((idx card, c):)) (mempty, []) cards Point-free-itize and de-let lambda • Gautier DI FOLCO @gautier_difolco foldr (uncurry (***) . (uncurry fmap . fmap (((,) =<< mappend) . color) . ((,) =<< ((:) .) . (,) . idx))) (mempty, []) cards • me: foldr (uncurry (***) . (mappend . snd &&& (:)) . (idx &&& color)) (mempty, []) • April 15th, 2016: map (\(idx, color) -> C idx (cell bb idx) color) assoclist point-free-itize the lambda in the map function. • Eyal Lotem @EyalL uncurry (C <*> cell bb) ? Not in front of a type checker, can't check myself :) • April 13th, 2016: point-free-itize lambda term in: uncurry (foldr (\a -> min a *** max a)) . ((head &&& head) &&& id) • obadz @obadzz liftA2 (***) min max • April 12th, 2016: minmax :: Ord a => [a] -> (a, a) minmax [1..9] = (1,9) in linear time. minmax = minimum &&& maximum fails as it's 2x too slow. • April 12th, 2016: You have (a,b) (c,d) You need (a*c, b*d) Well, we don't have all day! (Actually, we do) Get to it! Pointless; I MEAN POINT-FREE pls • lotz@Haskell㌠ @lotz84_ ((uncurry . flip . curry$ id).) . ((uncurry.) . uncurry . (flip.) . flip . flip) ((((flip . (flip.)) $((,).) . (*)).) . (*)) • Olivier Iffrig @oiffrig Point-free but star-full: uncurry (***) . ((*) *** (*)) • bazzargh @bazzargh curry((uncurry (*)) *** (uncurry (*))) • bazzargh @bazzargh knew there had to be a bimap answer, eventually got: curry ((bimap <$> id <*> id) (uncurry (*)))
• Андреев Кирилл @nonaem00 (((product *** product) . unzip) .) . (flip (:)) . (:[]) ... I clearly need more coffee this morning.
• April 10th, 2016:

data BBx = Bx (Int, Int) Pt2D Pt2D

define:
mkBB :: [a] -> Pt2D -> BBx
mkBB xs extents = Bx (0, length xs) (0,0) extents
points-free
• Gautier DI FOLCO @gautier_difolco
-- LANGUAGE TupleSections
data BBx = Bx (Int, Int) Pt2D Pt2D
mkBB :: [a] -> Pt2D -> BBx
mkBB = flip Bx (0,0) . (0, ) . length
• Thomas D @tthomasdd mkBB = flip Bx (0,0) . (,) 0 . length
• April 7th, 2016:
type MList a n = [(a, n)]
partSz :: Ord n => n -> MList a n -> (MList a n, MList a n)
partSz µ = partition ((µ >) . snd)
Define points-free
• partSz = partition . (. snd) . (>)  -- via @gautier_difolco

# The Mistake Everyone Makes with KnockoutJS

### Introduction

Knockout is a nice JavaScript library for making values that automatically update when any of their “dependencies” update. Those dependencies can form an arbitrary directed acyclic graph. Many people seem to think of it as “yet another” templating library, but the core idea which is useful far beyond “templating” is the notion of observable values. One nice aspect is that it is a library and not a framework so you can use it as little or as much as you want and you can integrate it with other libraries and frameworks.

At any rate, this article is more geared toward those who have already decided on using Knockout or a library (in any language) offering similar capabilities. I strongly suspect the issues and solutions I’ll discuss apply to all similar sorts of libraries. While I’ll focus on one particular example, the ideas behind it apply generally. This example, admittedly, is one that almost anyone will implement, and in my experience will do it incorrectly the first time and won’t realize the problem until later.

### The Problem

When doing any front-end work, before long there will be a requirement to support “multi-select” of something. Of course, you want the standard select/deselect all functionality and for it to work correctly, and of course you want to do something with the items you’ve selected. Here’s a very simple example:

Number selected:
Item

Here, the number selected is an overly simple example of using the selected items. More realistically, the selected items will trigger other items to show up and/or trigger AJAX requests to update the data or populate other data. The HTML for this example is completely straightforward.

<div id="#badExample">
Number selected: <span data-bind="text: $data.numberSelected()"></span> <table> <tr><th><input type="checkbox" data-bind="checked:$data.allSelected"/></th><th>Item</th></tr>
<!-- ko foreach: { data: $data.items(), as: '$item' } -->
<tr><td><input type="checkbox" data-bind="checked: $data.selected"/></td><td data-bind="text: 'Item number: '+$data.body"></td></tr>
<!-- /ko -->
<tr><td><button data-bind="click: function() { $data.add(); }">Add</button></td></tr> </table> </div> The way nearly everyone (including me) first thinks to implement this is by adding a selected observable to each item and then having allSelected depend on all of the selected observables. Since we also want to write to allSelected to change the state of the selected observables we use a writable computed observable. This computed observable will loop through all the items and check to see if they are all set to determine it’s state. When it is updated, it will loop through all the selected observables and set them to the appropriate state. Here’s the full code listing. var badViewModel = { counter: 0, items: ko.observableArray() }; badViewModel.allSelected = ko.computed({ read: function() { var items = badViewModel.items(); var allSelected = true; for(var i = 0; i < items.length; i++) { // Need to make sure we depend on each item, so don't break out of loop early allSelected = allSelected && items[i].selected(); } return allSelected; }, write: function(newValue) { var items = badViewModel.items(); for(var i = 0; i < items.length; i++) { items[i].selected(newValue); } } }); badViewModel.numberSelected = ko.computed(function() { var count = 0; var items = badViewModel.items(); for(var i = 0; i < items.length; i++) { if(items[i].selected()) count++; } return count; }); badViewModel.add = function() { badViewModel.items.push({ body: badViewModel.counter++, selected: ko.observable(false) }); }; ko.applyBindings(badViewModel, document.getElementById('#badExample')); This should be relatively straightforward, and it works, so what’s the problem? The problem can be seen in numberSelected (and it also comes up with allSelected which I’ll get to momentarily). numberSelected depends on each selected observable and so it will be fired each time each one updates. That means if you have 100 items, and you use the select all checkbox, numberSelected will be called 100 times. For this example, that doesn’t really matter. For a more realistic example than numberSelected, this may mean rendering one, then two, then three, … then 100 HTML fragments or making 100 AJAX requests. In fact, this same behavior is present in allSelected. When it is written, as it’s writing to the selected observables, it is also triggering itself. So the problem is updating allSelected or numberSelected can’t be done all at once, or to use database terminology, it can’t be updated atomically. One possible solution in newer versions of Knockout is to use deferredUpdates or, what I did back in the much earlier versions of Knockout, abuse the rate limiting features. The problem with this solution is that it makes updates asynchronous. If you’ve written your code to not care whether it was called synchronously or asynchronously, then this will work fine. If you haven’t, doing this throws you into a world of shared state concurrency and race conditions. In this case, this solution is far worse than the disease. ### The Solution So, what’s the alternative? We want to update all selected items atomically; we can atomically update a single observable; so we’ll put all selected items into a single observable. Now an item determines if it is selected by checking whether it is in the collection of selected items. More abstractly, we make our observables more coarse-grained, and we have a bunch of small computed observables depend on a large observable instead of a large computed observable depending on a bunch of small observables as we had in the previous code. Here’s an example using the exact same HTML and presenting the same overt behavior. Number selected: Item And here’s the code behind this second example: var goodViewModel = { counter: 0, selectedItems: ko.observableArray(), items: ko.observableArray() }; goodViewModel.allSelected = ko.computed({ read: function() { return goodViewModel.items().length === goodViewModel.selectedItems().length; }, write: function(newValue) { if(newValue) { goodViewModel.selectedItems(goodViewModel.items().slice(0)); // Need a copy! } else { goodViewModel.selectedItems.removeAll(); } } }); goodViewModel.numberSelected = ko.computed(function() { return goodViewModel.selectedItems().length; }); goodViewModel.add = function() { var item = { body: goodViewModel.counter++ } item.selected = ko.computed({ read: function() { return goodViewModel.selectedItems.indexOf(item) > -1; }, write: function(newValue) { if(newValue) { goodViewModel.selectedItems.push(item); } else { goodViewModel.selectedItems.remove(item); } } }); goodViewModel.items.push(item); }; ko.applyBindings(goodViewModel, document.getElementById('#goodExample')); One thing to note is that setting allSelected and numberSelected are now both simple operations. A write to an observable on triggers a constant number of writes to other observables. In fact, there are only two (non-computed) observables. On the other hand, reading the selected observable is more expensive. Toggling all items has quadratic complexity. In fact, it had quadratic complexity before due to the feedback. However, unlike the previous code, this also has quadratic complexity when any individual item is toggled. Unlike the previous code, though, this is simply due to a poor choice of data structure. Equipping each item with an “ID” field and using an object as a hash map would reduce the complexity to linear. In practice, for this sort of scenario, it tends not to make a big difference. Also, Knockout won’t trigger dependents if the value doesn’t change, so there’s no risk of the extra work propagating into still more extra work. Nevertheless, while I endorse this solution for this particular problem, in general making finer grained observables can help limit the scope of changes so unnecessary work isn’t done. Still, the real concern and benefit of this latter approach isn’t the asymptotic complexity of the operations, but the atomicity of the operations. In the second solution, every update is atomic. There are no intermediate states on the way to a final state. This means that dependents, represented by numberSelected but which are realistically much more complicated, don’t get triggered excessively and don’t need to “compensate” for unintended intermediate values. ### Epilogue We could take the coarse-graining to its logical conclusion and have the view model for an application be a single observable holding an object representing the entire view model (and containing no observables of its own). Taking this approach actually does have a lot of benefits, albeit there is little reason to use Knockout at that point. Instead this starts to lead to things like Facebook’s Flux pattern and the pattern perhaps most clearly articulated by Cycle JS. <script lang="text/javascript" src="http://ajax.aspnetcdn.com/ajax/knockout/knockout-3.3.0.js"></script> <script lang="text/javascript"> // Bad View Model var badViewModel = { counter: 0, items: ko.observableArray() }; badViewModel.allSelected = ko.computed({ read: function() { var items = badViewModel.items(); var allSelected = true; for(var i = 0; i < items.length; i++) { // Need to make sure we depend on each item, so don't break out of loop early allSelected = allSelected && items[i].selected(); } return allSelected; }, write: function(newValue) { var items = badViewModel.items(); for(var i = 0; i < items.length; i++) { items[i].selected(newValue); } } }); badViewModel.numberSelected = ko.computed(function() { var count = 0; var items = badViewModel.items(); for(var i = 0; i < items.length; i++) { if(items[i].selected()) count++; } return count; }); badViewModel.add = function() { badViewModel.items.push({ body: badViewModel.counter++, selected: ko.observable(false) }); }; // Good View Model var goodViewModel = { counter: 0, selectedItems: ko.observableArray(), items: ko.observableArray() }; goodViewModel.allSelected = ko.computed({ read: function() { return goodViewModel.items().length === goodViewModel.selectedItems().length; }, write: function(newValue) { if(newValue) { goodViewModel.selectedItems(goodViewModel.items().slice(0)); // Need a copy! } else { goodViewModel.selectedItems.removeAll(); } } }); goodViewModel.numberSelected = ko.computed(function() { return goodViewModel.selectedItems().length; }); goodViewModel.add = function() { var item = { body: goodViewModel.counter++ } item.selected = ko.computed({ read: function() { return goodViewModel.selectedItems.indexOf(item) > -1; }, write: function(newValue) { if(newValue) { goodViewModel.selectedItems.push(item); } else { goodViewModel.selectedItems.remove(item); } } }); goodViewModel.items.push(item); }; ko.applyBindings(badViewModel, document.getElementById('#badExample')); ko.applyBindings(goodViewModel, document.getElementById('#goodExample')); </script> ### Disciple/DDC # DDC 0.4.2 -- the "almost useful" release DDC 0.4.2 was pushed to Hackage a few days ago. Running "cabal update; cabal install ddc-tools" should build it. You'll need a version of the LLVM tools in your path. LLVM 3.4 - 3.8 (the current release) should work. Code generation and runtime system support for higher order functions is finished. The type checker also now automatically inserts the run and box casts which were mentioned a few posts ago. Release notes are on the github page. The programs we can compile are starting to look "almost useful". For example, here is some code from the AlmostPrimes example, which I adapted from Rosetta Code: -- | A stream of k-almost-prime numbers.kPrimes (k: Nat#): Stream Nat# Nat# = sfilter (isKPrime k) (senumFrom 2) -- | Check if some number is k-almost-prime.isKPrime (k n: Nat#): Bool# | k == 1 = isPrime n | otherwise = sany$ smap       (isKPrime (k - 1))        $smap fst$ sfilter    (eq 0 ∘ snd)        $smap (divMod n)$ stakeWhile (λx: Nat#. x < n)        $primes 2main (_: Unit): S Console Unit = do forS (enumFromTo 1 5) (λk: Nat#. do write$ "k = " % showNat k % ": "                forS (listOfStream $stake 10$ kPrimes k)                        (λx: Nat#. write $showNat x % " ") write "\n") The full example code is here Note that Nat# is the type of a primitive natural number. The # just means "primitive" which is not the same as "unboxed" as per GHC. The hashes will go away once I implement type synonyms and can define a nicer synonym. The example invokes some standard stream combinators, smap, sfilter and so on. The forS function is like mapM, but using the Disciple effect system instead of monads. The type of forS is: forS : List a -> (a -> S e Unit) -> S e Unit S e Unit is the type of a suspended computation with effect e which returns a value of type Unit when evaluated. I'm calling this the "almost useful" release because because we still need to finish the garbage collector, as well as desugaring for nested patterns, before I could imagine writing anything substantial in it. Late last year Jacob Stanley was working on the transform to insert GC slot handling into core programs, but the runtime support still needs to be implemented. Current work is to finish support for type synonyms, then I'll do either the GC or nested patterns, depending in which looks like the lowest hanging fruit. ### FP Complete # Stack Security GnuPG Keys # Introduction We are introducing a method of verifying Haskell packages with OpenPGP signatures. The initial code to sign your packages has been included in Stack as an experimental feature for some time. We are going to be improving it and included verification options in the coming weeks. However, we need signatures from package authors before verification is useful. The first step in submitting package signatures is to create a secure OpenPGP key for yourself and publish it. This post will walk you through creating an offline master OpenPGP key-set with GnuPG. If you've never used GPG before and need to generate a secure set of keys, then continue reading. If you already have GPG keys that you are happy with, continue to 'Signing Packages with Stack' towards the bottom of this article. # GPG Keys ## Choosing a version of GPG There are 3 main flavors of GnuPG: • 1.4 (AKA "gpg") - Old - Super stable (11 years old) & limited features • 2.0 (AKA "gpg2") - Stable - Most people use this for personal encryption • 2.1 (AKA "gpg2") - Modern - Latest features: ECC support, Tor, TOFU, etc It is recommended by the GnuPG project version to use the Stable GPG-2 release (2.0). I've included instructions for using GPG 1.4 but I don't recommend using such an old version unless you have no choice. I've also included instructions for GnuPG 2.1 because I use it personally and find the new features useful. • Windows - https://www.gpg4win.org/download.html • OS X • GPG 2.0 is available as part of GPG Suite which is an easy download from https://gpgtools.org (recommended). (Bonus: GPG Suite adds GPG e-mail for the Apple Mail app). (Note: If you install GPG Suite click "Cancel" when prompted to generate a key through their GUI. We'll go through this step below at the command line.) • GnuPG 2.0 is available through Homebrew as well if you prefer. • Linux/BSD - GnuPG 2.x is available as a package for your favorite Linux flavor. (Bonus: Add Engimail while you are at it for easy GPG e-mail.) ## Offline machine for GPG key management Ideally your offline keys will never be on a machine connected to the outside world. The offline box will be the place where you keep your master key, manage your keys & sign other keys. This assures that you can always revoke keys and recover if your online machine(s) are somehow compromised. To have an offline box you'll need two "machines". These can just be the same machine if you utilize a "live" Linux USB/DVD boot drive and a USB stick for the "sneaker net". We'll follow the same procedure to generate your keys, even if you don't want to use an offline machine to generate them. It's highly recommended using an offline master though and keeping secure backups. It's the least hassle for everyone else in your "web of trust" in an event where keys are compromised. With an offline master key, if your laptop is compromised, you can just revoke the compromised sub-keys, create new sub-keys and publish them. If you get into the situation where your master key is compromised, you will have to revoke all your keys, approach people about your new keys & going through the process of reestablishing trust with people in your web-of-trust. ## Configuration Setup$GNUPGHOME directory

umask 077
export GNUPGHOME=$HOME/.gnupg ; # default location but env var rules mkdir -p$GNUPGHOME

Install the secure key-server certificate (Ubuntu/Debian):

mkdir -p /usr/local/share/ca-certificates/
curl -s https://sks-keyservers.net/sks-keyservers.netCA.pem \
| sudo tee /usr/local/share/ca-certificates/sks-keyservers.netCA.crt
sudo update-ca-certificates

Put some or all of the following in your $GNUPGHOME/gpg.conf file: (taken from the riseup.net best gpg practices page) no-emit-version no-comments keyid-format 0xlong with-fingerprint list-options show-uid-validity verify-options show-uid-validity use-agent keyserver hkps://hkps.pool.sks-keyservers.net # or less secure: keyserver hkp://pool.sks-keyservers.net keyserver-options no-honor-keyserver-url keyserver-options include-revoked personal-cipher-preferences AES256 AES192 AES CAST5 personal-digest-preferences SHA512 SHA384 SHA256 SHA224 cert-digest-algo SHA512 default-preference-list SHA512 SHA384 SHA256 SHA224 AES256 AES192 AES CAST5 ZLIB BZIP2 ZIP Uncompressed ## Key Generation ### Master Key Public and Secret 1. Generate Key GPG 1.4 gpg --expert --gen-key GPG 2.0 gpg2 --expert --gen-key GPG 2.1 gpg2 --expert --full-gen-key • Pick the "RSA" and "set your own capabilities" option • Turn off everything until it just says "Certify" then continue • Select the bit count. I pick 4000 for RSA because it's big but it's also not a power of 2 (like everyone else picks.) • Select the expiration period. I picked 1 year. • Pick a good pass-phrase that you can remember but would take a computer a long time to guess. 2. Backup Your Keys GPG 1.4 gpg --armor --export >$GNUPGHOME/public.asc
gpg --armor --export-secret-key > $GNUPGHOME/secret.asc GPG 2.0 or 2.1 gpg2 --armor --export >$GNUPGHOME/public.asc
gpg2 --armor --export-secret-key > $GNUPGHOME/secret.asc 3. Try Recovering Your Keys Delete your public and secret key and re-import them. GPG 1.4 gpg --delete-secret-key <KEYID> gpg --delete-key <KEYID> gpg --import$GNUPGHOME/public.asc $GNUPGHOME/secret.asc gpg --expert --edit-key <KEYID> GPG 2.0 & 2.1 gpg2 --delete-secret-key <KEYID> gpg2 --delete-key <KEYID> gpg2 --import$GNUPGHOME/public.asc $GNUPGHOME/secret.asc gpg2 --expert --edit-key <KEYID> • Type trust [Return] to give your self "ultimate" trust. • Type save [Return] to save your changes to the key. ### Sub-keys 1. Generate Keys Start by editing your key again. GPG 1.4 gpg --expert --edit-key <KEYID> GPG 2.0 or 2.1 gpg2 --expert --edit-key <KEYID> Create a Signing Sub-key: • Type addkey [Return] • Pick the "RSA" and "set your own capabilities" option • Turn off everything until it just says "Sign" then continue Create an Encryption Sub-key: • Type addkey [Return] • Pick the "RSA" and "set your own capabilities" option • Turn off everything until it just says "Encrypt" then continue Create an Authentication Sub-key: • Type addkey [Return] • Pick the "RSA" and "set your own capabilities" option • Add the "Authenticate" capability then continue • Turn off everything until it just says "Authenticate" then continue • Type save [Return] to save and exit Now try encrypting and signing stuff with your keys. GPG 1.4 echo 'hello' | gpg --armor --clearsign | gpg --verify echo 'sekrats!' | gpg --armor --encrypt | gpg --decrypt GPG 2.0 or 2.1 echo 'hello' | gpg2 --armor --clearsign | gpg2 --verify echo 'sekrats!' | gpg2 --armor --encrypt | gpg2 --decrypt 2. Backup Keys GPG 1.4 gpg --armor --export >$GNUPGHOME/public.asc
gpg --armor --export-secret-keys    > $GNUPGHOME/secret.asc gpg --armor --export-secret-subkeys >$GNUPGHOME/subkey.asc

GPG 2.0 or 2.1

gpg2 --armor --export                > $GNUPGHOME/public.asc gpg2 --armor --export-secret-keys >$GNUPGHOME/secret.asc
gpg2 --armor --export-secret-subkeys > $GNUPGHOME/subkey.asc 3. Try Recovery Delete your public and secret key and re-import the offline keys (the full set). GPG 1.4 gpg --delete-secret-key <KEYID> gpg --delete-key <KEYID> gpg --import$GNUPGHOME/public.asc $GNUPGHOME/secret.asc gpg --expert --edit-key <KEYID> GPG 2.0 or 2.1 gpg2 --delete-secret-key <KEYID> gpg2 --delete-key <KEYID> gpg2 --import$GNUPGHOME/public.asc $GNUPGHOME/secret.asc gpg2 --expert --edit-key <KEYID> 1. Type trust [Return] to give your self "ultimate" trust. 2. Type save [Return] to save your changes to the key. Now try encrypting and signing stuff with your keys. GPG 1.4 echo 'hello' | gpg --armor --clearsign | gpg --verify echo 'sekrats!' | gpg --armor --encrypt | gpg --decrypt GPG 2.0 or 2.1 echo 'hello' | gpg2 --armor --clearsign | gpg2 --verify echo 'sekrats!' | gpg2 --armor --encrypt | gpg2 --decrypt ## Switching Secret Key Sets ### Using your offline secret keys Delete your secret keys and re-import the offline secret keys (the full set). GPG 1.4 gpg --delete-secret-key <KEYID> gpg --import$GNUPGHOME/secret.asc

GPG 2.0 or 2.1

gpg2 --delete-secret-key <KEYID>
gpg2 --import $GNUPGHOME/secret.asc With the full set in use (offline only of course!) you can import your buddy's keys, sign them and trust them. Use gpg –import and gpg –export to move keys around to/from USB drives to/from your online machine. If you list your secret keys you should see a plain "sec" next to your key. This indicates a full secret key is present. You may now manage keys and sign other keys. GPG 1.4 gpg --list-secret-keys GPG 2.0 or 2.1 gpg2 --list-secret-keys ### Using your online secret keys Delete your secret keys and re-import the online secret keys (the subset). GPG 1.4 gpg --delete-secret-key <KEYID> gpg --import$GNUPGHOME/subkey.asc

GPG 2.0 or 2.1

gpg2 --delete-secret-key <KEYID>
gpg2 --import $GNUPGHOME/subkey.asc You won't be able to sign other people's keys or create/revoke keys with this key-set. (This is a good thing in case your online machine and it's subkeys are compromised later.) If you list your secret keys you should see a "sec#" next to your key. The '#' indicates that the secret is missing for your master key. If you are following good practices, from now on, it will only be available on your offline computer. GPG 1.4 gpg --list-secret-keys GPG 2.0 or 2.1 gpg2 --list-secret-keys # Signing Packages with Stack Now that we have our online secret keys ready, we can sign our Hackage-bound packages with Stack. There are currently two methods to sign keys in Stack. The first is to sign your package while uploading (the default). Just perform an upload as you might have in the past. stack upload Another way to sign is to create an sdist file. If you add a flag you can publish a signature at the same time. stack sdist --sign For those package authors out there with more than a few packages to sign, we have written a simple tool to bulk sign all your packages published to Hackage. It's called sig-tool. To bootstrap you need to run the following: sig-tool setup <myHackageUser> This will download all of your packages to the current directory & generate a manifest with SHA256 sums for all the releases. You should inspect the files to make sure they are correct at this stage. You can sign all of the releases or you can trim the manifest file if you like. To begin signing run the following: sig-tool sign # Come work with us! **FP Complete has multiple openings for software and devops engineers. If you're interested in working on Haskell, container-based deployment, build systems, or just about anything you read about on this blog, send us your CV.** ### Functional Jobs # Software Engineer – Cryptography at LeapYear (Full-time) ### LeapYear is pioneering a new approach to the data security problem. We’ve built the first platform that enables unrestricted access to information while providing mathematically proven guarantees of security and privacy. Our engineers solve hard, real-world problems that impact every data-driven organization. We are an Agile team at the frontiers of software development, integrating the latest advances in cryptography and machine learning into critical systems using functional programming (99% of our code is written in Haskell). If this sounds like your dream job, read on. ### You will be one of the first ten employees at a rapidly growing, mission-driven company. In our first six months, our team architected, developed, and deployed our product on Fortune 50 datasets with billions of records and secured over five and a half million dollars in bookings. We are backed by Bloomberg, private equity investors, and security industry experts with multi-billion dollar exits. LeapYear has collaborated with some of the nation’s largest healthcare organizations, partnered with a coalition working to deploy$250B in social impact capital, and we’re just getting started.

If your day-to-day concerns include information leaks, timing attacks, and how well PKCS#1 v2 stacks up against v1.5, this is the job for you. You will shape our security design alongside the world's leading experts in cryptography and make key strategic decisions about our product. We will look to you to identify and instill the best practices in security engineering throughout the platform and help us meet high performance benchmarks while maintaining cryptographic guarantees of privacy and security.

To learn more about our mission, team, company culture, and benefits, see http://leapyear.io/careers. For details on the specific responsibilities and requirements of the role, please see below.

### RESPONSIBILITIES

• oversee the development of secure and scalable implementations
• collaborate with developers and researchers to transform theory in cryptography into enterprise applications
• develop performance-critical code
• plan, implement, and optimize new features to carry out our product roadmap

### REQUIREMENTS

• expert knowledge of systems architecture and C
• professional experience in functional programming
• ability to read and understand academic literature in theoretical computer science
• track record of delivering high-quality product features on schedule
• strong foundation in data structures, algorithms, and software design

### PREFERRED

• expert knowledge of practical cryptography and data security
• industry experience in security and/or cryptography

### BONUS

• publications in major security or cryptography journals
• PhD in computer science, math, or related fields
• professional experience in developing secure critical systems

### A FEW OF THE PERKS

• Culture of teaching and learning
• Competitive compensation package of salary and equity
• Open vacation policy
• Catered lunch and dinner every day
• Company outings
• Free yoga and meditation classes at a world-class studio downstairs
• Relocation support

LeapYear is on a mission to resolve the conflict between innovation and data security. If you thrive in an open environment working on deeply technical challenges, we’re looking for you.

Get information on how to apply for this position.

# Software Engineer – Machine Learning at LeapYear (Full-time)

### LeapYear is pioneering a new approach to the data security problem.

We’ve built the first platform that enables unrestricted access to information while providing mathematically proven guarantees of security and privacy.

Our engineers solve hard, real-world problems that impact every data-driven organization. We are an Agile team at the frontiers of software development, integrating the latest advances in cryptography and machine learning into critical systems using functional programming (99% of our code is written in Haskell).

In our first six months, our team architected, developed, and deployed our product on Fortune 50 datasets with billions of records and secured over five and a half million dollars in bookings. We are backed by Bloomberg, private equity investors, and security industry experts with multi-billion dollar exits. LeapYear has collaborated with some of the nation’s largest healthcare organizations, partnered with a coalition working to deploy $250B in social impact capital, and we’re just getting started. If you work at the intersection of machine learning and functional programming, we're looking for you. You will collaborate with developers and researchers to transform machine learning theory into enterprise applications, and the novel algorithms that you develop will be deployed on massive enterprise datasets. To learn more about our mission, team, company culture, and benefits, see http://leapyear.io/careers. For details on the specific responsibilities and requirements of the role, please see below. ### RESPONSIBILITIES • design and implement novel machine learning techniques • develop performance-critical code • plan, implement, and optimize new features to carry out our product roadmap ### REQUIREMENTS • professional or research machine learning experience • professional experience in functional programming • ability to read and understand academic literature in theoretical computer science • track record of delivering high-quality product features on schedule • strong foundation in data structures, algorithms, and software design ### PREFERRED • excellent Haskell skills • experience in enterprise data science and data engineering • background applying advanced machine learning techniques in enterprise data environments ### BONUS • PhD in machine learning or related field ### A FEW OF THE PERKS • Culture of teaching and learning • Competitive compensation package of salary and equity • Open vacation policy • Build your ideal workstation • Catered lunch and dinner every day • Company outings • Free yoga and meditation classes at a world-class studio downstairs • Relocation support LeapYear is on a mission to resolve the conflict between innovation and data security. If you thrive in an open environment working on deeply technical challenges, we’re looking for you. Get information on how to apply for this position. ## May 04, 2016 ### Bill Atkins # NSNotificationCenter, Swift and blocks The conventional way to register observers with NSNotificationCenter is to use the target-action pattern. While this gets the job done, it's inherently not type-safe. For example, the following Swift snippet will compile perfectly: NSNotificationCenter.defaultCenter().addObserver(self, selector: Selector("itemAdded:"), name: MyNotificationItemAdded, object: nil) even though at runtime it will fail unless self has a method named itemAdded that takes exactly one parameter (leaving off that last colon in the selector will turn this line into a no-op). Plus, this method gives you no way to take advantages of Swift's closures, which would allow the observer to access local variables in the method that adds the observer and would eliminate the need to create a dedicated method to handle the event. A better way to do this is to use blocks. And NSNotificationCenter does include a block-based API: NSNotificationCenter.defaultCenter().addObserverForName(MyNotificationItemAdded, object: nil, queue: nil) { note in // ... } This is much nicer, especially with Swift's trailing closure syntax. There are no method names to be looked up at runtime, we can refer to local variables in the method that registered the observer and we can perform small bits of logic in reaction to events without having to create and name dedicated methods. The catch comes in resource management. It's very important that an object remove its event observers when it's deallocated, or else NSNotificationCenter will try to invoke methods on invalid pointers. The traditional target-action method has the one advantage that we can easily handle this requirement with a single call in deinit: deinit { NSNotificationCenter.defaultCenter().removeObserver(self) } With the block API, however, since there is no explicit target object, each call to addObserverForName returns "an opaque object to act as observer." So your observer class would need to track all of these objects and then remove them all from the notification center in deinit, which is a pain. In fact, the hassle of having to do bookkeeping on the observer objects almost cancels out the convenience of using the block API. Frustrated by this situation, I sat down and created a simple helper class, NotificationManager: class NotificationManager { private var observerTokens: [AnyObject] = [] deinit { deregisterAll() } func deregisterAll() { for token in observerTokens { NSNotificationCenter.defaultCenter().removeObserver(token) } observerTokens = [] } func registerObserver(name: String, block: (NSNotification -> Void)) { let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: nil, queue: nil, usingBlock: block) observerTokens.append(newToken) } func registerObserver(name: String, forObject object: AnyObject, block: (NSNotification -> Void)) { let newToken = NSNotificationCenter.defaultCenter().addObserverForName(name, object: object, queue: nil, usingBlock: block) observerTokens.append(newToken) } } First, this simple class provides a Swift-specialized API around NSNotificationCenter. It provides an additional convenience method without an object parameter (rarely used, in my experience) to make it easier to use trailing-closure syntax. But most importantly, it keeps track of the observer objects generated when observers are registered, and removes them when the object is deinit'd. A client of this class can simply keep a member variable of type NotificationManager and use it to register its observers. When the parent class is deallocated, the deinit method will automatically be called on its NotificationManager member variable, and its observers will be properly disposed of: class MyController: UIViewController { private let notificationManager = NotificationManager() override init() { notificationManager.registerObserver(MyNotificationItemAdded) { note in println("item added!") } super.init() } required init(coder: NSCoder) { fatalError("decoding not implemented") } } When the MyController instance is deallocated, its NotificationManager member variable will be automatically deallocated, triggering the call to deregisterAll that will remove the dead objects from NSNotificationCenter. In my apps, I add a notificationManager instance to my common UIViewController base class so I don't have to explicitly declare the member variable in all of my controller subclasses. Another benefit of using my own wrapper around NSNotificationCenter is that I can add useful functionality, like group observers: an observer that's triggered when any one of a group of notifications are posted: struct NotificationGroup { let entries: [String] init(_ newEntries: String...) { entries = newEntries } } extension NotificationManager { func registerGroupObserver(group: NotificationGroup, block: (NSNotification -> ()?)) { for name in group.entries { registerObserver(name, block: block) } } } This can be a great way to easily set up an event handler to run when, for example, an item is changed in any way at all: let MyNotificationItemsChanged = NotificationGroup( MyNotificationItemAdded, MyNotificationItemDeleted, MyNotificationItemMoved, MyNotificationItemEdited ) notificationManager.registerGroupObserver(MyNotificationItemsChanged) { note in // ... } ### Mikhail Glushenkov # What's new in Cabal 1.24 — Nix-style local builds, setup dependencies, HTTPS and more! We’ve just released versions 1.24 of both Cabal and cabal-install. The 1.24 release incorporates more than a thousand commits by 89 different contributors. This post describes what’s new and improved in this version. ## User-visible features • Nix-style local builds in cabal-install (so far only a technical preview). See this post by Edward Z. Yang for more details. • Integration of a new security scheme for Hackage based on The Update Framework. So far this is not enabled by default, pending some changes on the Hackage side. See these three posts by Edsko de Vries and Duncan Coutts for more information. • Support for specifying setup script dependencies in .cabal files. Setup scripts are also now built with the cabal_macros.h-style macros for conditional compilation. See this post by Duncan Coutts for more information. • Support for HTTPS downloads in cabal-install. HTTPS is now used by default for downloads from Hackage. This uses either curl or wget or, on Windows, PowerShell, under the hood. Install target URLs can now also use HTTPS, e.g. cabal install https://example.com/foo-1.0.tar.gz. • cabal upload learned how to upload documentation to Hackage (cabal upload --doc) (#2890). • In related news, cabal haddock now can generate documentation intended for uploading to Hackage (cabal haddock --for-hackage, #2852). cabal upload --doc runs this command automatically if the documentation for current package hasn’t been generated yet. • New cabal-install command: gen-bounds for easy generation of version bounds. See this post by Doug Beardsley for more information. • It’s now possible to limit the scope of --allow-newer to single packages in the install plan, both on the command line and in the config file. See here for an example. • The --allow-newer option can be now used with ./Setup configure (#3163). • New cabal user-config subcommand: init, which creates a default config file in either the default location (~/.cabal/config) or as specified by --config-file (#2553). • New config file field extra-framework-dirs for specifying extra locations to find OS X frameworks in (#3158). Can be also specified as an argument for install and configure commands. • cabal-install solver now takes information about extensions and language flavours into account (#2873). The solver is now also aware of pkg-config constraints (#3023). • New cabal-install option: --offline, which prevents cabal-install from downloading anything from the Internet. • New cabal upload option -P/--password-command for reading Hackage password from arbitrary program output. • New --profiling-detail=$level flag with a default for libraries and executables of ‘exported-functions’ and ‘toplevel-functions’ respectively (GHC’s -fprof-auto-{exported,top} flags) (#193).

• New --show-detail mode: --show-detail=direct; like streaming, but allows the test program to detect that is connected to a terminal, and works reliable with a non-threaded runtime (#2911, and serves as a work-around for #2398)

• Macros VERSION_$pkgname and MIN_VERSION_$pkgname are now also generated for the current package (#3235).

• The builddir option can now be specified via the CABAL_BUILDDIR environment variable and in cabal.config (#2484).

• Added a log file message similar to one printed by make when building in another directory (#2642).

## Bug fixes and minor improvements

• Support for GHC 8. NB: pre-1.24 versions of Cabal won’t work with GHC 8.

• Cabal is now aware of extra C sources generated by preprocessors (e.g. c2hs and hsc2hs) (#2467).

• Cabal now includes cabal_macros.h when running c2hs (#2600).

• C sources are now recompiled only when needed (#2601).

• Support Haddock response files to work around command-line length restrictions on Windows (#2746).

• Library support for multi-instance package DBs (#2948).

• Improvements in the ./Setup configure solver (#3082, #3076).

• If there are multiple remote repos, cabal update now updates them in parallel (#2503).

• cabal program itself now can be used as an external setup method. This fixes an issue when Cabal version mismatch caused unnecessary reconfigures (#2633).

• Fixed space leaks in cabal update (#2826) and in the solver (#2916, #2914). Improved performance of --reorder-goals (#3208).

• cabal exec and sandbox hc-pkg now use the configured compiler (#2859).

• The man page for cabal-install is now automatically generated (#2877).

• Miscellaneous minor and/or internal bug fixes and improvements.

## Acknowledgements

Thanks to everyone who contributed code and bug reports, and to Ryan Thomas for helping with release management. Full list of people who contributed patches to Cabal/cabal-install 1.24 is available here.

## Looking forward

We plan to make a new release of Cabal/cabal-install approximately 6 months after 1.24 — that is, in late October or early November 2016. Main features that are currently targeted at 1.26 are:

• Further work on nix-style local builds, perhaps making that code path the default.

• Enabling Hackage Security by default.

• Native suport for foreign libraries: Haskell libraries that are intended to be used by non-Haskell code.

• New Parsec-based parser for .cabal files.

• A revamped homepage for Cabal, rewritten user manual, and automated build bots for binary releases.

We would like to encourage people considering contributing to take a look at the bug tracker on GitHub, take part in discussions on tickets and pull requests, or submit their own. The bug tracker is reasonably well maintained and it should be relatively clear to new contributors what is in need of attention and which tasks are considered relatively easy. For more in-depth discussion there is also the cabal-devel mailing list.

# The five arguments on why people struggle with monads

People trying to learn Haskell, or about Haskell, often struggle with the concept of monads. People like to talk about it a lot; it’s probably because pedagogy is interesting, because you need to use the class in Haskell (so comprehension is important for language adoption), and people like talking about things that they understand that other people don’t. Specifically, it’s a class called Monad that appears in much of Haskell development.

I’m going to summarize the kind of arguments typically put forth, which I consider all contributory scales of why people struggle with monads as applied in Haskell:

• The alienation argument: The concept of a monad is inherently difficult for puny minds to understand (suddenly programmers aren’t good at abstraction?). It doesn’t matter the quality of or context in which materials present the topic, it’s just really hard and alien.
• The bad pedagogy argument: The educational material is terrible. We just haven’t found the holy grail of monad tutorials.
• The mysticism argument: There’re so many tutorials and “wu wu” (and discussions like this very blog post; you are contributing by reading it! Shame on you!) around monads that make people think they still don’t “get” it, like some kind of zen moment is required.
• The academic saturation argument: Many enthusiastic Haskellers encourage learning category theory, which leads to confusing the Haskell class Monad with category theoretic notion of a monad, which are distinct.
• The foreign language argument: Not having enough understanding of Haskell, specifically, Haskell’s type system (to understand the Monad class in any realistic way, you have to grok type-classes and higher-kinded types), makes understanding a concept employing that language, out of reach.

I have my own opinion about which scales are most to blame, which I’ll elaborate on now.

## The alienation argument

We’ve already seen monads applied in other languages. LINQ in C#, workflows in F#, promises in JavaScript. They’re not all the same as monads–let’s avoid that pedantry–but they’re not some whole other world, either. They have a notion of creating some actions as first class values and combining them in some way and then running that composed thing. If programmers have no problem with this kind of thing in other languages, what’s so hard about Haskell’s Monad class?

Actually, there are a few tutorials that always come up in online discussions as being quality monad tutorials which are followed by echoes of approval as being The Great Monad Tutorial:

Whatever you think about the tutorials, we’re not lacking in quality educational materials. So what’s the problem?

## The mysticism argument

This might be the answer to the previous. There are indeed too many tutorials. It’s been called a fallacy or advised never to read monad tutorials:

The tutorials are not consistent, either, they fall in a few camps:

This is indeed a problem.

Quoting Keith B:

Too many “monad tutorials” are written by either actual mathematicians or frustrated mathematicians–manqué and as a result they focus far too much on trying to transmit the excitement of understanding a cool new concept. Working programmers have little interest in this. They want to gan facility with useful techniques.

Fundamentally there are two audiences, people who want to learn category theory, and people who want to learn Haskell. Unfortunately, sometimes, the latter camp are baffled into believing they ought to be in the former camp.

But ultimately I think people putting the effort in aren’t really mislead by this.

## The foreign language argument

This is the most compelling reason for me.

If you don’t understand a language, you cannot understand concepts expressed in that language. Try to read about a new concept, something abstract, on Wikipedia, but switch the language to one you don’t understand well. Good luck. I’ve tried explaining why Blinkenlights makes me laugh to tears, to non-native English speakers (note: the ‘ATTENTION’ German version is also funny), with much difficulty.

There are two problems:

1. You can’t understand the implications of class Monad m where .. without having a solid understanding of (1) type-classes, and (2) higher-kinded types. The type of return is Monad m => a -> m a. Haskell’s type-classes are a unique feature that few other languages have, supporting value polymorphism. The m there is of kind m :: * -> *. That’s a higher-order type. It’s hard to even explain these two concepts.
2. Haskell starts out of the box with making use of the Monad class. The first thing you encounter as a newbie is also one of the things you are most ill-equipped to understand.

Is it any wonder that this class is such a pain point?

To add insult to injury, said audience aren’t aware that this is their poison. Laurence Gonsalves arrives at an insight without really knowing it:

Of course monad tutorials about the Monad class are written in Haskell, because it uses two language features not present in other popular languages. JavaScript, Ruby, Python don’t have a type system. C# and Java have generics, and Common Lisp has generic functions, but none of them have value polymorphism. The whole reason that Haskell is used to teach this concept is because it’s only a satisfying class in Haskell.

Let’s detour into natural language: French does not have a natural equivalent to the verb “Peck”, which translated into French is “donner des coups de bec” or “Attack with the front of the beak.” Italian does not have a word for “toe”, it has “dito del piede” which is “finger of the foot”. English has “-ish” and “-y”, like “this is greenish” or “it was salady”, whereas Italian has “-one” (large), so “bacio” (kiss) => “baccione” (big kiss), and “-a” and “-o” for male and female: figlio/figlia => son/daughter.

Programming languages are the same. There are some genuinely new features that other languages aren’t able to reproduce, without removing the utility of the thing in doing so.

## Summarizing

In summarizing I’d personally assign the following ratings to each argument:

• The alienation argument: not too convincing
• The bad pedagogy argument: not likely contributory
• The mysticism argument: very contributory
• The academic saturation argument: not a big deal
• The foreign language argument: a big contributor

I think educators can only acknowledge the two problems in The foreign language argument and teach a good understanding of the language before moving onto that class.

# Mac OS X C++ Development (part 2)

As I wrote in my previous post, I recently started working on a Mac. This is a collection of my notes on the problems and surprises I ran into. Maybe it will be useful for you, too.

# Tools

In addition to many of the command line tools you may be familiar with from C++ development on Linux, Mac OS X has specialized tools for working with binaries.

## otool

The otool utility displays information from binary headers. It works with Mach headers (which are the main thing, on Macs) but also works with other “universal” formats as well. There are many options, but the main one that’s been helpful to me so far has been otool -L <lib>, which tells you the dependencies of the archive, which are essential debugging information for dynamic linking problems. Another useful one is otool -D <lib>, which tells you the install names embedded in a given library.

## install_name_tool

This tool allows you to change the install names for a binary. Simple, but important.

By now you must be wondering what install names are.

# Install Names

Let us begin with man dyld. There is no dyld command. Nevertheless, it is the dynamic linker for Mac OS X. The man page tells us many interesting things, perhaps chief among which is how dynamic libraries are located at run-time.

When you link an application to a dynamic library, the path to that library is encoded in the application binary. Well, actually, when you build, the library is interrogated to find out where it is supposed to be installed, and that path gets encoded in the application binary. These paths where things are supposed to be installed are known as install names on Mac OS X. Every .dylib has (at least) one. You can use otool -D to find the install name of a .dylib file.

In most cases, the install name will have an absolute path, like this:

> $otool -D /usr/lib/libiconv.dylib /usr/lib/libiconv.dylib: /usr/lib/libiconv.2.dylib So if you link with /usr/lib/libiconv.dylib, your application will look for /usr/lib/libiconv.2.dylib when it launches. If the .dylib isn’t where it’s expected, the app crashes. ## Relative paths Sometimes a library will have a relative path as its install name. When this happens, dyld will search for a library with that name on several search paths. It’s similar to LD_LIBRARY_PATH in some ways. There are environment variables you can set to control where it searches. There are several, and you should consult man dyld if you want the gory details. One thing that is important to know is that one of these variables, DYLD_FALLBACK_LIBRARY_PATH, has the default value $(HOME)/lib:/usr/local/lib:/lib:/usr/lib. Occasionally you may see advice on the web that recommends symbolic linking a library into your $(HOME)/lib without any further explanation. DYLD_FALLBACK_LIBRARY_PATH is why. ## @rpath If you want to distribute an application that relies on libraries that are not necessarily in known standard places, it is a good idea to use @rpath as part of your install name. This directs dyld to look in a path that is relative to the application binary (or in the case of a library that is dynamically linked to another library, the path is relative to the calling library), rather than looking in the places it normally would. This would allow you to bundle a set of libraries along with your application, and have them be found regardless of where you install the application. There are several other options, such as @executable_path and @loader_path, but @rpath seems to be the right choice in most cases. Again, man dyld has all the details. # An Example: Dionysus I wanted to build Dionysus, a library for working with persistent homology. I wanted to build the Python bindings. ## Linker Errors After I managed to get through the compile phase, I ran into linker problems, with unresolved symbol errors similar to the ones I mentioned in part 1. However, in this case, changing the compiler didn’t fix the problem. I was getting linking errors trying to link with Boost, which I had already reinstalled on my system from source, using gcc: export HOMEBREW_CC=gcc-5 export HOMEBREW_CXX=g++-5 brew reinstall --build-from-source boost brew reinstall --build-from-source boost --with-python3 That wasn’t enough though. I still got linker errors. Googling suggested maybe the problem was that Boost.Python (which Dionysus uses for its Python binding) was linked to a different version of Python than the one I was targeting. This turned out to be a red herring, however. As far as I can tell, Boost.Python doesn’t actually link to Python at all (though it has to be compiled with support for Python 3 if you want that, so there’s certainly a connection, just not a linking one). The problem actually was that Dionysus wanted the C++ 11 version of Boost, and the default Homebrew version doesn’t have that support. For this, you need: brew reinstall --build-from-source boost --with-c++11 brew reinstall --build-from-source boost-python --with-c++11 Build-time linking errors resolved, I thought I was home free. ## Dynamic Linking Errors At last, I had built the library. I fired up python2.7, imported the library. It worked! Then I created an instance of one of the classes, and got this: TypeError: __init__() should return None, not 'NoneType' After a lot of googling, it turned out that the most likely cause for this was that the Dionysus binding had wound up linked to the wrong Python library. I carefully checked the CMakeCache.txt file and eliminated any occurrences of the wrong Python library, or the wrong Boost library. Still no luck. A look at otool -L output for the library showed something funny: >$ otool -L lib_dionysus.dylib
lib_dionysus.dylib:
/Users/me/src/Dionysus/build/bindings/python/lib_dionysus.dylib (compatibility version 0.0.0, current version 0.0.0)
libpython2.7.dylib (compatibility version 2.7.0, current version 2.7.0)
/usr/local/opt/boost-python/lib/libboost_python-mt.dylib (compatibility version 0.0.0, current version 0.0.0)
/usr/local/opt/mpfr/lib/libmpfr.4.dylib (compatibility version 6.0.0, current version 6.3.0)
/usr/local/opt/gmp/lib/libgmp.10.dylib (compatibility version 14.0.0, current version 14.0.0)

Do you notice that all the entries are absolute paths except for the libpython2.7 one? A quick look at the Anaconda libpython2.7.dylib I linked against shows why:

> $otool -D /Users/me/anaconda2/lib/libpython2.7.dylib /Users/me/anaconda2/lib/libpython2.7.dylib: libpython2.7.dylib  So Anaconda’s version of Python instructed the linker to find libpython2.7.dylib by going through the normal dyld process - checking DYLD_LIBRARY_PATH, then DYLD_FALLBACK_LIBRARY_PATH, and so on. Since /usr/lib is on DYLD_FALLBACK_LIBRARY_PATH by default, and the system libpython2.7.dylib is in that directory, Dionysus was getting linked with that one, not with the one that was already in memory in the current process. This led to the strange error I saw. I was able to use install_name_tool to change the binary to point to the right libpython2.7: >$ install_name_tool lib_dionysus.dylib -change libpython2.7.dylib /Users/me/anaconda2/lib/libpython2.7.dylib

After this, everything was fine.

In addition to man dyld, these are useful:

# Announcing cabal new-build: Nix-style local builds

cabal new-build, also known as “Nix-style local builds”, is a new command inspired by Nix that comes with cabal-install 1.24. Nix-style local builds combine the best of non-sandboxed and sandboxed Cabal:

1. Like sandboxed Cabal today, we build sets of independent local packages deterministically and independent of any global state. new-build will never tell you that it can't build your package because it would result in a “dangerous reinstall.” Given a particular state of the Hackage index, your build is completely reproducible. For example, you no longer need to compile packages with profiling ahead of time; just request profiling and new-build will rebuild all its dependencies with profiling automatically.
2. Like non-sandboxed Cabal today, builds of external packages are cached globally, so that a package can be built once, and then reused anywhere else it is also used. No need to continually rebuild dependencies whenever you make a new sandbox: dependencies which can be shared, are shared.

Nix-style local builds work with all versions of GHC supported by cabal-install 1.24, which currently is GHC 7.0 and later. Additionally, cabal-install is on a different release cycle than GHC, so we plan to be pushing bugfixes and updates on a faster basis than GHC's yearly release cycle.

Although this feature is in only beta (there are bugs, see “Known Issues”, and the documentation is a bit sparse), I’ve been successfully using Nix-style local builds exclusively to do my Haskell development. It's hard to overstate my enthusiasm for this new feature: it “just works”, and you don't need to assume that there is a distribution of blessed, version-pegged packages to build against (e.g., Stackage). Eventually, new-build will simply replace the existing build command.

### Quick start

Nix-style local builds “just work”: there is very little configuration that needs to be done to start working with it.

cabal update
cabal install cabal-install


Make sure the newly installed cabal is in your path.

2. To build a single Cabal package, instead of running cabal configure; cabal build, you can use Nix-style builds by prefixing these commands with new-; e.g., cabal new-configure; cabal new-build. cabal new-repl is also supported. (Unfortunately, other commands are not yet supported, e.g. new-clean (#2957) or new-freeze (#2996).)

3. To build multiple Cabal packages, you need to first create cabal.project file in some root directory. For example, in the Cabal repository, there is a root directory with a folder per package, e.g., the folders Cabal and cabal-install. Then in cabal.project, specify each folder:

packages: Cabal/
cabal-install/


Then, in the directory for a package, you can say cabal new-build to build all of the components in that package; alternately, you can specify a list of targets to build, e.g., package-tests cabal asks to build the package-tests test suite and the cabal executable. A component can be built from any directory; you don't have to be cd'ed into the directory containing the package you want to build. Additionally, you can qualify targets by the package they came from, e.g., Cabal:package-tests asks specifically for the package-tests component from Cabal. There is no need to manually configure a sandbox: add a cabal.project file, and it just works!

Unlike sandboxes, there is no need to add-source; just add the package directories to your cabal.project. And unlike traditional cabal install, there is no need to explicitly ask for packages to be installed; new-build will automatically fetch and build dependencies.

There is also a convenient script you can use for hooking up new-build to your Travis builds.

### How it works

Nix-style local builds are implemented with these two big ideas:

### How is CodeWorld related to Summer of Haskell?

The Haskell.org committee will choose a number of student projects based on their impact to the Haskell community.  As part of this, one project will be chosen specifically relating to CodeWorld, and funded by CodeWorld maintainers.

### Should I submit a proposal?

It’s up to you, but I believe you should submit a proposal if:

• You are eligible (see the bottom of the Summer of Haskell info page).
• You are willing and available to take on an essentially full-time commitment for the summer.
• You have a realistic idea you’d like to work on to benefit the Haskell community.

### Any advice for writing a proposal?

Yes!  Here are things you should keep in mind:

1. Propose a project with immediate impact on real people.  “If you build it, they will come” doesn’t work here.  Unless you have an extremely good reason, don’t propose to build something speculative and hope people will just like it so much that they adopt it.  Point to real people who already want this, and who will already be users and will find their lives better if and when it’s completed.
2. Demonstrate that you understand the task.  Provide enough detail to convince us that the project is feasible.  A reasonable and concrete timeline with at least rough deliverables is a good idea.  Poorly defined projects with a low probability of success are often not good fits for this format.
3. Show that you are already becoming a part of the community you’ll be working with.  Are you familiar with the project you’re proposing to contribute to?  Do core people in the project and/or the Haskell community know who you are?  Have you discussed your ideas with people already involved in the project?  Do you know someone who would be your mentor?

You can browse successful projects from last year.  There’s also some good advice by Edward Kmett in an old mailing list thread.

# Typewriters

It will suprise nobody to learn that when I was a child, computers were almost unknown, but it may be more surprising that typewriters were unusual.

Probably the first typewriter I was familiar with was my grandmother’s IBM “Executive” model C. At first I was not allowed to touch this fascinating device, because it was very fancy and expensive and my grandmother used it for her work as an editor of medical journals.

The “Executive” was very advanced: it had proportional spacing. It had two space bars, for different widths of spaces. Characters varied between two and five ticks wide, and my grandmother had typed up a little chart giving the width of each character in ticks, which she pasted to the top panel of the typewriter. The font was sans-serif, and I remember being a little puzzled when I first noticed that the lowercase j had no hook: it looked just like the lowercase i, except longer.

The little chart was important, I later learned, when I became old enough to use the typewriter and was taught its mysteries. Press only one key at a time, or the type bars will collide. Don't use the (extremely satisfying) auto-repeat feature on the hyphen or underscore, or the platen might be damaged. Don't touch any of the special controls; Grandma has them adjusted the way she wants. (As a concession, I was allowed to use the “expand” switch, which could be easily switched off again.)

The little chart was part of the procedure for correcting errors. You would backspace over the character you wanted to erase—each press of the backspace key would move the carriage back by one tick, and the chart told you how many times to press—and then place a slip of correction paper between the ribbon and the paper, and retype the character you wanted to erase. The dark ribbon impression would go onto the front of the correction slip, which was always covered with a pleasing jumble of random letters, and the correction slip impression, in white, would exactly overprint the letter you wanted to erase. Except sometimes it didn't quite: the ribbon ink would have spread a bit, and the corrected version would be a ghostly white letter with a hair-thin black outline. Or if you were a small child, as I was, you would sometimes put the correction slip in backwards, and the white ink would be transferred uselessly to the back of the ribbon instead of to the paper. Or you would select a partly-used portion of the slip and the missing bit of white ink would leave a fragment of the corrected letter on the page, like the broken-off leg of a dead bug.

Later I was introduced to the use of Liquid Paper (don't brush on a big glob, dot it on a bit at a time with the tip of the brush) and carbon paper, another thing you had to be careful not to put in backward, although if you did you got a wonderful result: the typewriter printed mirror images.

From typing alphabets, random letters, my name, and of course qwertyuiops I soon moved on to little poems, stories, and other miscellanea, and when my family saw that I was using the typewriter for writing, they presented me with one of my own, a Royal manual (model HHE maybe?) with a two-color ribbon, and I was at last free to explore the mysteries of the TAB SET and TAB CLEAR buttons. The front panel had a control for a three-color ribbon, which forever remained an unattainable mystery. Later I graduated to a Smith-Corona electric, on which I wrote my high school term papers. The personal computer arrived while I was in high school, but available printers were either expensive or looked like crap.

When I was in first grade our classroom had acquired a cheap manual typewriter, which as I have said, was an unusual novelty, and I used it whenever I could. I remember my teacher, Ms. Juanita Adams, complaining that I spent too much time on the typewriter. “You should work more on your handwriting, Jason. You might need to write something while you’re out on the street, and you won't just be able to pull a typewriter out of your pocket.”

She was wrong.

# Paul Graham on Writing, Briefly

Thanks to Arne Ranta for introducing me to Writing, Briefly by Paul Graham.
I think it's far more important to write well than most people realize. Writing doesn't just communicate ideas; it generates them. If you're bad at writing and don't like to do it, you'll miss out on most of the ideas writing would have generated.

# Introduction

This is a bit different from my usual posts (well apart from my write up of hacking at Odessa) in that it is a log of how I managed to get LibBi (Library for Bayesian Inference) to run on my MacBook and then not totally satisfactorily (as you will see if you read on).

The intention is to try a few more approaches to the same problem, for example, Stan, monad-bayes and hand-crafted.

Kermack and McKendrick (1927) give a simple model of the spread of an infectious disease. Individuals move from being susceptible ($S$) to infected ($I$) to recovered ($R$).

\displaystyle \begin{aligned} \frac{dS}{dt} & = & - \delta S(t) I(t) & \\ \frac{dI}{dt} & = & \delta S(t) I(t) - & \gamma I(t) \\ \frac{dR}{dt} & = & & \gamma I(t) \end{aligned}

In 1978, anonymous authors sent a note to the British Medical Journal reporting an influenza outbreak in a boarding school in the north of England (“Influenza in a boarding school” 1978). The chart below shows the solution of the SIR (Susceptible, Infected, Record) model with parameters which give roughly the results observed in the school.

# LibBi

## Step 1

~/LibBi-stable/SIR-master $./init.sh error: 'ncread' undefined near line 6 column 7 The README says this is optional so we can skip over it. Still it would be nice to fit the bridge weight function as described in Moral and Murray (2015). The README does say that GPML is required but since we don’t (yet) need to do this step, let’s move on. ~/LibBi-stable/SIR-master$ ./run.sh
./run.sh

Error: ./configure failed with return code 77. See
.SIR/build_openmp_cuda_single/configure.log and
.SIR/build_openmp_cuda_single/config.log for details

It seems the example is configured to run on CUDA and it is highly likely that my installation of LibBI was not set up to allow this. We can change config.conf from

--disable-assert
--enable-single
--enable-cuda
--nthreads 2

to

--nthreads 4
--enable-sse
--disable-assert

On to the next issue.

~/LibBi-stable/SIR-master $./run.sh ./run.sh Error: ./configure failed with return code 1. required QRUpdate library not found. See .SIR/build_sse/configure.log and .SIR/build_sse/config.log for details But QRUpdate is installed! ~/LibBi-stable/SIR-master$ brew info QRUpdate
brew info QRUpdate
homebrew/science/qrupdate: stable 1.1.2 (bottled)
http://sourceforge.net/projects/qrupdate/
/usr/local/Cellar/qrupdate/1.1.2 (3 files, 302.6K)
/usr/local/Cellar/qrupdate/1.1.2_2 (6 files, 336.3K)
Poured from bottle
/usr/local/Cellar/qrupdate/1.1.2_3 (6 files, 337.3K) *
Poured from bottle
From: https://github.com/Homebrew/homebrew-science/blob/master/qrupdate.rb
==> Dependencies
Required: veclibfort ✔
Optional: openblas ✔
==> Options
--with-openblas
Build with openblas support
--without-check
Skip build-time tests (not recommended)

Let’s look in the log as advised. So it seems that a certain symbol cannot be found.

checking for dch1dn_ in -lqrupdate

Let’s try ourselves.

nm -g /usr/local/Cellar/qrupdate/1.1.2_3/lib/libqrupdate.a | grep dch1dn_
0000000000000000 T _dch1dn_

So the symbol is there! What gives? Let’s try setting one of the environment variables.

export LDFLAGS='-L/usr/local/lib'

Now we get further.

./run.sh
Error: ./configure failed with return code 1. required NetCDF header
.SIR/build_sse/config.log for details

So we just need to set another environment variable.

export CPPFLAGS='-I/usr/local/include/'

This is more mysterious.

./run.sh
Error: ./configure failed with return code 1. required Boost header
.SIR/build_sse/config.log for details ~/LibBi-stable/SIR-master

Let’s see what we have.

brew list | grep -i boost

Nothing! I recall having some problems with boost when trying to use a completely different package. So let’s install boost.

brew install boost

Now we get a different error.

./run.sh
Error: make failed with return code 2, see .SIR/build_sse/make.log for details

Fortunately at some time in the past sbfnk took pity on me and advised me here to use boost155, a step that should not be lightly undertaken.

/usr/local/Cellar/boost155/1.55.0_1: 10,036 files, 451.6M, built in 15 minutes 9 seconds

Even then I had to say

brew link --force boost155

Finally it runs.

./run.sh 2> out.txt

And produces a lot of output

wc -l out.txt
49999 out.txt

ls -ltrh results/posterior.nc
1.7G Apr 30 19:57 results/posterior.nc

Rather worringly, out.txt has all lines of the form

1: -51.9191 -23.2045 nan beats -inf -inf -inf accept=0.5

nan beating -inf does not sound good.

Now we are in a position to analyse the results.

octave --path oct/ --eval "plot_and_print"
error: 'bi_plot_quantiles' undefined near line 23 column 5

I previously found an Octave package(?) called OctBi so let’s create an .octaverc file which adds this to the path. We’ll also need to load the netcdf package which we previously installed.

addpath ("../OctBi-stable/inst")
pkg load netcdf
~/LibBi-stable/SIR-master $octave --path oct/ --eval "plot_and_print" octave --path oct/ --eval "plot_and_print" warning: division by zero warning: called from mean at line 117 column 7 read_hist_simulator at line 47 column 11 bi_read_hist at line 85 column 12 bi_hist at line 63 column 12 plot_and_print at line 56 column 5 warning: division by zero warning: division by zero warning: division by zero warning: division by zero warning: division by zero warning: print.m: fig2dev binary is not available. Some output formats are not available. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. warning: opengl_renderer: x/y/zdata should have the same dimensions. Not rendering. sh: pdfcrop: command not found I actually get a chart from this so some kind of success. This does not look like the chart in the Moral and Murray (2015), the fitted number of infected patients looks a lot smoother and the “rates” parameters also vary in a much smoother manner. For reasons I haven’t yet investigated, it looks like over-fitting. Here’s the charts in the paper. # Bibliography “Influenza in a boarding school.” 1978. British Medical Journal, March, 587. Kermack, W. O., and A. G. McKendrick. 1927. “A Contribution to the Mathematical Theory of Epidemics.” Proceedings of the Royal Society of London Series A 115 (August): 700–721. doi:10.1098/rspa.1927.0118. Moral, Pierre Del, and Lawrence M Murray. 2015. “Sequential Monte Carlo with Highly Informative Observations.” ## April 30, 2016 ### Dimitri Sabadie # Porting a Haskell graphics framework to Rust (luminance) I wanted to write that new article to discuss about something important I’ve been doing for several weeks. It’s actually been a month that I’ve been working on luminance, but not in the usual way. Yeah, I’ve put my Haskell experience aside to… port luminance into Rust! There are numerous reasons why I decided to jump in and I think it could be interesting for people to know about the differences I’ve been facing while porting the graphics library. # You said Rust? Yeah, Rust. It’s a strong and static language aiming at system programming. Although it’s an imperative language, it has interesting functional conventions that caught my attention. Because I’m a haskeller and because Rust takes a lot from Haskell, learning it was a piece of cake, even though there are a few concepts I needed a few days to wrap my mind around. Having a strong C++11/14 experience, it wasn’t that hard though. ## How does it compare to Haskell? The first thing that amazed me is the fact that it’s actually not that different from Haskell! Rust has a powerful type system – not as good as Haskell’s but still – and uses immutability as a default semantic for bindings, which is great. For instance, the following is forbidden in Rust and would make rustc – the Rust compiler – freak out: let a = "foo";a = "bar"; // wrong; forbidden Haskell works like that as well. However, you can introduce mutation with the mut keyword: let mut a = "foo";a = "bar"; // ok Mutation should be used only when needed. In Haskell, we have the ST monad, used to introduce local mutation, or more drastically the IO monad. Under the wood, those two monads are actually almost the same type – with different warranties though. Rust is strict by default while Haskell is lazy. That means that Rust doesn’t know the concept of memory suspensions, or thunks – even though you can create them by hand if you want to. Thus, some algorithms are easier to implement in Haskell thanks to laziness, but some others will destroy your memory if you’re not careful enough – that’s a very common problem in Haskell due to thunks piling up in your stack / heap / whatever as you do extensive lazy computations. While it’s possible to remove those thunks by optimizing a Haskell program – profiling, strictness, etc., Rust doesn’t have that problem because it gives you full access to the memory. And that’s a good thing if you need it. Rust exposes a lot of primitives to work with memory. In contrast with Haskell, it doesn’t have a garbage collector, so you have to handle memory on your own. Well, not really. Rust has several very interesting concepts to handle memory in a very nice way. For instance, objects’ memory is held by scopes – which have lifetimes. RAII is a very well known use of that concept and is important in Rust. You can glue code to your type that will be ran when an instance of that type dies, so that you can clean up memory and scarce resources. Rust has the concept of lifetimes, used to give names to scopes and specify how long an object reference should live. This is very powerful yet a bit complex to understand in the first place. I won’t go into comparing the two languages because it would require several articles and a lot of spare time I don’t really have. I’ll stick to what I’d like to tell you: the Rust implementation of luminance. # Porting luminance from Haskell to Rust The first very interesting aspect of that port is the fact that it originated from a realization while refactoring some of my luminance Haskell code. Although it’s functional, stateless and type-safe, a typical use of luminance doesn’t really require laziness nor a garbage collector. And I don’t like using a tool – read language – like a bazooka. Haskell is the most powerful language ever in terms of abstraction and expressivity over speed ratio, but all of that power comes with an overhead. Even though you’ll find folks around stating that Haskell is pretty okay to code a video game, I think it will never compete with languages that are made to solve real time computations or reactive programming. And don’t get me wrong: I’m sure you can write a decent video game in Haskell – I qualify myself as a Haskeller and I’ve not been writing luminance just for the joy of writing it. However, the way I use Haskell with luminance shouldn’t require all the overhead – and profiling got me right, almost no GC was involved. So… I looked into Rust and discovered and learned the language in only three days. I think it’s due to the fact that Rust, which is simpler than Haskell in terms of type system features and has almost everything taken from Haskell, is, to me, an imperative Haskell. It’s like having a Haskell minus a few abstraction tools – HKT (but they’ll come soon), GADTs, fundeps, kinds, constraints, etc. – plus a total control of what’s happening. And I like that. A lot. A fucking lot. Porting luminance to Rust wasn’t hard as a Haskell codebase might map almost directly to Rust. I had to change a few things – for instance, Rust doesn’t have the concept of existential quantification as-is, which is used intensively in the Haskell version of luminance. But most Haskell modules map directly to their respective Rust modules. I changed the architecture of the files to have something clearer. I was working on loose coupling in Haskell for luminance. So I decided to directly introduce loose coupling into the Rust version. And it works like a charm. So there are, currently, two packages available: luminance, which is the core API, exporting the whole general interface, and luminance-gl, an OpenGL 3.3 backend – though it will contain more backends as the development goes on. The idea is that you need both the dependencies to have access to luminance’s features. I won’t say much today because I’m working on a demoscene production using luminance. I want it to be a proof that the framework is usable, works and acts as a first true example. Of course, the code will be open-source. The documentation is not complete yet but I put some effort documenting almost everything. You’ll find both the packages here: luminance-0.1.0 luminance-gl-0.1.0 I’ll write another article on how to use luminance as soon as possible! Keep the vibe! ## April 29, 2016 ### Darcs # darcs 2.12.0 release The Darcs team is pleased to announce the release of Darcs 2.12.0. ### Downloading One way of installing Darcs 2.12.0 is with stack:$ stack install darcs-2.12.0

and install Darcs with cabal-install:

$cabal update$ cabal install darcs-2.12.0

http://darcs.net/releases/darcs-2.12.0.tar.gz and build it by hand.

The 2.12 branch is also available as a darcs repository from
http://darcs.net/releases/branch-2.12

### What's new

#### Patch dependency graph export

The new command darcs show dependencies, enables to export the dependency graph of a repository (up to the latest tag, by default) as a Graphviz file:

#### whatsnew output and third-party VC frontend support

The flag whatsnew --machine-readable is a simplified version of whatsnew -s for easier parsability by third-party tools. Darcs 2.12 adds conflict markers to the output of whatnew when summarized (ie, when used with the -s or --machine-readable flags or via the darcs status alias). Thanks to this, Darcs support was reintroduced in meld 3.15.2 .

#### improvements for Git repository imports

File moves are converted to file moves primitives, instead of being file deletes and add as before. This enables to have smaller Darcs respositories with a more understandable history. This change adds to other improvements and fixes that make Git import more practical.

#### repository Weak Hashes

The command darcs show repo now shows a hash that is the XOR
of all hashes of the patches metadata of the repository. Being a XOR,
it does not depend on the patches' ordering. Also it is quite fast to
calculate. This Weak Hash can be useful to quickly check whether two
repositories of a single proyect have the same patches.

# Hacking projects over the next few months

Life’s been really hectic lately, but I’ve been getting (slowly) back into working on my Haskell packages. In particular, since the switch from darcs to github I’ve started getting more comments and feature requests, which is nice. Over the next half-year or so, here’s what I’ll be up to in my free time between work on the dissertation and work on Hakaru:

containers — I’ve been appointed one of the new co-maintainers of our favorite venerable library. I prolly won’t be doing any major work until autumn (as mentioned when I was appointed), but I’ve had a number of conversations with David Feuer about where to take things in terms of cleaning up some old maintenance cruft.

bytestring-trie — A few years back I started reimplementing my tries to use Bagwell’s Array Mapped Tries in lieu of Okasaki’s Big-Endian Patricia Tries, but then got stalled because life. I’ve started up on it again, and it’s just about ready to be released after a few more tweaks. Also, now that I’m working on it again I can finally clear out the backlog of API requests (sorry folks!).

exact-combinatorics — A user recently pointed me towards a new fast implementation of factorial making waves lately. It’s not clear just yet whether it’ll be faster than the current implementation, but should be easy enough to get going and run some benchmarks.

unification-fd — This one isn’t hacking so much as dissemination. I have a backlog of queries about why things are the way they are, which I need to address; and I’ve been meaning to continue the tutorial about how to use this library for your unification needs.

logfloat — We’ve been using this a lot in Hakaru, and there are a few performance tweaks I think I can add. The main optimization area is trying to minimize the conditionals for detecting edge cases. The biggest issue has just been coming up with some decent benchmarks. The problem, of course, is that most programs making use of logfloats do a lot of other work too so it can be tricky to detect the actual effect of changes. I think this is something Hakaru can help a lot with since it makes it easy to construct all sorts of new models.

# CodeWorld & Summer of Haskell 2016

As most Haskell community members know, Haskell was turned down by Google Summer of Code this year, and has instead been seeking to continue the tradition with Summer of Haskell, funded by smaller donations. I’m happy to announce that CodeWorld will be part of Summer of Haskell! I’ve donated to support one student working specifically on CodeWorld.

Are you a student, and interested in helping to build a platform for education in expressive mathematics and computer science? Want to work on a project with immediate impact teaching Haskell in multiple schools?  Please propose a project at https://summer.haskell.org/ between now and May 6th.

### Project Ideas

A great source of CodeWorld project ideas is the bug tracker.  Less well-defined projects are tagged as proposals, while more defined features are tagged as enhancements.  A few big ones to think about are:

• Export of CodeWorld projects as mobile applications
• Better and more language-aware editor support for Haskell in CodeMirror.
• Implementing constructive geometry
• Building social, gallery, and/or showcase features to help student work be more visible.
• Building a purely functional block-based programming environment.
• Implementing visual tools to help students understand substitution, list comprehensions, and more.

I look forward to working with someone this summer building something cool!

By the way, HUGE thanks to Edward Kmett and other Haskell.org committee members for making this happen this year!

# [vcppbmdn] Rubik's cube combinations

We implement in Haskell the formulae at http://www.speedcubing.com/chris/cubecombos.html to compute the number of combinations of a NxN Rubik's cube, as well as a supercube (for which center facet orientation matters) and a super-supercube (also called a "real" cube, for which inner cubie orientations also matter) variations.

import Math.Combinatorics.Exact.Factorial(factorial); div_exact :: Integer -> Integer -> Integer; div_exact x y = case divMod x y of { (q,0) -> q; _ -> error "division had a remainder"; }; common :: Integer -> Integer -> Integer -> Integer; common i den_base n = let { e1 :: Integer; e1 = div (n*n - 2*n) 4; e2 :: Integer; e2 = div ((n-2)^2) 4; numerator :: Integer; numerator = (24 * 2^i * factorial 12)^(mod n 2) * factorial 7 * 3^6 * (factorial 24)^e1; denominator :: Integer; denominator = den_base ^ e2; } in div_exact numerator denominator; -- https://oeis.org/A075152 cube :: Integer -> Integer; cube = common 10 ((factorial 4)^6); -- https://oeis.org/A080660 supercube :: Integer -> Integer; supercube = common 21 2; -- https://oeis.org/A080659 super_supercube :: Integer -> Integer; super_supercube n = let { e1 :: Integer; e1 = div_exact (n^3-4*n+(mod n 2)*(12 - 9*n)) 24; e2 :: Integer; e2 = div (n-2) 2 + mod n 2; e4 :: Integer; e4 = mod n 2 * div n 2; } in (div_exact (factorial 24) 2)^e1 * 24^e2 * (factorial 7 * 3^6)^(div n 2) * (factorial 12 * 2^21)^e4;

Incidentally, the output numbers factor easily. The largest prime factor in them is 23. One could do the calculations over a factor base of primes from 2 to 23 to get the output directly in compact factored form.

Next, we consider the Megaminx, and higher odd-order variants (so not Flowerminx / Kilominx):

-- http://michael-gottlieb.blogspot.com/2008/05/number-of-positions-of-generalized.html -- n=1 megaminx; n=2 gigaminx; etc. megaminx :: Integer -> Integer; megaminx n = div_exact (factorial 30 * factorial 20 * (factorial 60)^(n^2-1) * (factorial 5)^(12*n) * 2^28 * 3^19) ((factorial 5)^(12*n^2) * 2^n); 

# Quantifiers in type theory

All this stuff is "well known", but I want to put it out there for folks who may not have encountered it, or not encountered it all together in one picture.

The Damas–Hindley–Milner type system (i.e., the type system that Algorithm W is inferring types for) is propositional logic extended with rank-1 second-order universal quantifiers. It is interesting because it is so particularly stable with respect to inference, decidability, etc. That is, we can come up with many other algorithms besides Algorithm W and they enjoy nice properties like the fact that adding type signatures won't cause inference to fail. (It's worth noting, that Algorithm W is DEXPTIME-complete; so while in practice it's often linear time, for pathological inputs it can take exponentially long. However, if we put a constant bound on the depth of nested let-bindings, then the upper bound becomes polynomial.)

The extension of DHM with rank-1 second-order existential quantifiers is strictly more powerful. It is interesting because it allows unrestricted use of both of the quantifiers in prenex position; thus, it is the limit/top of the alternating quantifier hierarchy (à la the arithmetical hierarchy) that starts with DHM. Surely there are other interesting properties here, but this system is understudied relative to the ones above and below. Edit: Although GHC gets by with encoding existentials away, it's worth noting that MLF allows existentials where the unpacking is implicit rather than requiring an "unseal" or case eliminator (Leijen 2006); and also that UHC does in fact offer first-class existentials (Dijkstra 2005).

The extension with rank-2 second-order universals (i.e., where the universal quantifier can appear to the left of one function arrow) is strictly more powerful still. Here we can encode rank-1 existentials, but my point in this whole post is to point out that rank-1 existentials themselves are strictly weaker than the rank-2 universals it takes to encode them! Also, one little-known fact: this type system is interesting because it is the last one in this progression where type inference is decidable. The decidability of rank-2 universal quantification is part of the reason why GHC distinguishes between -XRank2Types vs -XRankNTypes. Alas, although inference is decidable —and thus of mathematical interest— it is not decidable in the same robust way that DHM is. That is, if we care about human factors like good error messages or not breaking when the user adds type signatures, then we don't get those properties here. Still, the fact that this system is at the cusp of decidable inference is important to know. Edit: Also of interest, this system has the same typeable terms as simply-typed λ-calculus with rank-2 intersection types, and the type inference problem here is fundamentally DEXPTIME-complete (Jim 1995).

Things keep alternating back and forth between existentials and universals of each rank; so far as I'm aware, none of these systems are of any particular interest until we hit the limit: rank-ω (aka: rank-N) second-order quantification. This type system is often called "System F", but that's a misnomer. It is important to differentiate between the syntactic system (i.e., actual System F) we're inferring types for, vs the type system (aka: propositional logic with second-order quantifiers) in which the inferred types live. That is, we can perfectly well have a syntactic system which doesn't have explicit type abstractions/applications but for which we still ascribe rank-ω types. It so happens that the type inference problem is undecidable for that syntactic system, but it was already undecidable way back at rank-3 so the undecidability isn't particularly novel.

# Data is Code

<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>

The title of this post is a play on the Lisp aphorism: "Code is Data". In the Lisp world everything is data; code is just another data structure that you can manipulate and transform.

However, you can also go to the exact opposite extreme: "Data is Code"! You can make everything into code and implement data structures in terms of code.

You might wonder what that even means: how can you write any code if you don't have any primitive data structures to operate on? Fascinatingly, Alonzo Church discovered a long time ago that if you have the ability to define functions you have a complete programming language. "Church encoding" is the technique named after his insight that you could transform data structures into functions.

This post is partly a Church encoding tutorial and partly an announcement for my newly released annah compiler which implements the Church encoding of data types. Many of the examples in this post are valid annah code that you can play with. Also, to be totally pedantic annah implements Boehm-Berarducci encoding which you can think of as the typed version of Church encoding.

This post assumes that you have basic familiarity with lambda expressions. If you do not, you can read the first chapter (freely available) of the Haskell Programming from First Principles which does an excellent job of teaching lambda calculus.

If you would like to follow along with these examples, you can download and install annah by following these steps:

• Install the stack tool
• Create the following stack.yaml file

$cat > stack.yamlresolver: lts-5.13packages: []extra-deps:- annah-1.0.0- morte-1.6.0<Ctrl-D> • Run stack setup • Run stack install annah • Add the installed executable to your $PATH

#### Lambda calculus

In the untyped lambda calculus, you only have lambda expressions at your disposal and nothing else. For example, here is how you encode the identity function:

λx → x

That's a function that takes one argument and returns the same argument as its result.

We call this "abstraction" when we introduce a variable using the Greek lambda symbol and we call the variable that we introduce a "bound variable". We can then use that "bound variable" anywhere within the "body" of the lambda expression.

+-- Abstraction||+-- Bound variable||vvλx → x     ^     |     +-- Body of lambda expression

Any expression that begins with a lambda is an anonymous function which we can apply to another expression. For example, we can apply the the identity function to itself like this:

(λx → x) (λy → y)-- β-reduction= λy → y

We call this "application" when we supply an argument to an anonymous function.

We can define a function of multiple arguments by nested "abstractions":

λx → λy → x

The above code is an anonymous function that returns an anonymous function. For example, if you apply the outermost anonymous function to a value, you get a new function:

(λx → λy → x) 1-- β-reduceλy → 1

... and if you apply the lambda expression to two values, you return the first value:

(λx → λy → x) 1 2-- β-reduce(λy → 1) 2-- β-reduce1

So our lambda expression behaves like a function of two arguments, even though it's really a function of one argument that returns a new function of one argument. We call this "currying" when we simulate functions of multiple arguments using functions one argument. We will use this trick because we will be programming in a lambda calculus that only supports functions of one argument.

#### Typed lambda calculus

In the typed lambda calculus you have to specify the types of all function arguments, so you have to write something like this:

λ(x : a) → x

... where a is the type of the bound variable named x.

However, the above function is still not valid because we haven't specified what the type a is. In theory, we could specify a type like Int:

λ(x : Int) → x

... but the premise of this post was that we could program without relying on any built-in data types so Int is out of the question for this experiment.

Fortunately, some typed variations of lambda calculus (most notably: "System F") let you introduce the type named a as yet another function argument:

λ(a : *) → λ(x : a) → x

This is called "type abstraction". Here the * is the "type of types" and is a universal constant that is always in scope, so we can always introduce new types as function arguments this way.

The above function is the "polymorphic identity function", meaning that this is the typed version of the identity function that still preserves the ability to operate on any type.

If we had built-in types like Int we could apply our polymorphic function to the type just like any other argument, giving back an identity function for a specific type:

(λ(a : *) → λ(x : a) → x) Int-- β-reductionλ(x : Int) → x

This is called "type application" or (more commonly) "specialization". A "polymorphic" function is a function that takes a type as a function argument and we "specialize" a polymorphic function by applying the function to a specific type argument.

However, we are forgoing built-in types like Int, so what other types do we have at our disposal?

Well, every lambda expression has a corresponding type. For example, the type of our polymorphic identity function is:

∀(a : *) → ∀(x : a) → a

You can read the type as saying:

• this is a function of two arguments, one argument per "forall" (∀) symbol
• the first argument is named a and a is a type
• the second argument is named x and the type of x is a
• the result of the function must be a value of type a

This type uniquely determines the function's implementation. To be totally pedantic, there is exactly one implementation up to extensional equality of functions. Since this function has to work for any possible type a there is only one way to implement the function. We must return x as the result, since x is the only value available of type a.

Passing around types as values and function arguments might seem a bit strange to most programmers since most languages either:

• do not use types at all

Example: Javascript

// The polymorphic identity function in Javascriptfunction id(x) {    return x}// Example use of the functionid(true)
• do use types, but they hide type abstraction and type application from the programmer through the use of "type inference"

-- The polymorphic identity function in Haskellid x = x-- Example use of the functionid True
• they use a different syntax for type abstraction/application versus ordinary abstraction and application

Example: Scala

-- The polymorphic identity function in Scaladef id[A](x : a)-- Example use of the function-- Note: Scala lets you omit the [Boolean] here thanks--       to type inference but I'm making the type--       application explicit just to illustrate that--       the syntax is different from normal function--       applicationid[Boolean](true)

For the purpose of this post we will program with explicit type abstraction and type application so that there is no magic or hidden machinery.

So, for example, suppose that we wanted to apply the typed, polymorphic identity function to itself. The untyped version was this:

(λx → x) (λy → y)

... and the typed version is this:

(λ(a : *) → λ(x : a) → x)    (∀(b : *) → ∀(y : b) → b)    (λ(b : *) → λ(y : b) → y)-- β-reduction= (λ(x : ∀(b : *) → ∀(y : b) → b) → x)      (λ(b : *) → λ(y : b) → y)-- β-reduction= (λ(b : *) → λ(y : b) → y)

So we can still apply the identity function to itself, but it's much more verbose. Languages with type inference automate this sort of tedious work for you while still giving you the safety guarantees of types. For example, in Haskell you would just write:

(\x -> x) (\y -> y)

... and the compiler would figure out all the type abstractions and type applications for you.

Exercise: Haskell provides a const function defined like this:

const :: a -> b -> aconst x y = x

Translate const function to a typed and polymorphic lambda expression in System F (i.e. using explicit type abstractions)

#### Boolean values

Lambda expressions are the "code", so now we need to create "data" from "code".

One of the simplest pieces of data is a boolean value, which we can encode using typed lambda expressions. For example, here is how you implement the value True:

λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

Note that the names have no significance at all. I could have equally well written the expression as:

λ(a : *) → λ(x : a) → λ(y : a) → x

... which is "α-equivalent" to the previous version (i.e. equivalent up to renaming of variables).

We will save the above expression to a file named ./True in our current directory. We'll see why shortly.

We can either save the expression using Unicode characters:

$cat > ./Trueλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True<Ctrl-D> ... or using ASCII, replacing each lambda (i.e. λ) with a backslash (i.e. \) and replacing each arrow (i.e. →) with an ASCII arrow (i.e. ->) $ cat > ./True\(Bool : *) -> \(True : Bool) -> \(False : Bool) -> True<Ctrl-D>

... whichever you prefer. For the rest of this tutorial I will use Unicode since it's easier to read.

Similarly, we can encode False by just changing our lambda expression to return the third argument named False instead of the second argument named True. We'll name this file ./False:

$cat > ./Falseλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False<Ctrl-D> What's the type of a boolean value? Well, both the ./True and ./False files have the same type, which we shall call ./Bool: $ cat > ./Bool∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

... and if you are following along with ASCII you can replace each forall symbol (i.e. ∀) with the word forall:

$cat > ./Boolforall (Bool : *) -> forall (True : Bool) -> forall (False : Bool) -> Bool We are saving these terms and types to files because we can use the annah compiler to work with any lambda expression or type saved as a file. For example, I can use the annah compiler to verify that the file ./True has type ./Bool: $ annah-- Read this as: "./True has type ./Bool"./True : ./Bool<Ctrl-D>./True$echo$?0

If the expression type-checks then annah will just compile the expression to lambda calculus (by removing the unnecessary type annotation in this case) and return a zero exit code. However, if the expression does not type-check:

$annah./True : ./Trueannah: Expression: ∀(x : λ(Bool : *) → λ(True : Bool) → λ(False : Bool)→ True) → λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → TrueError: Invalid input typeType: λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True$ echo $?1 ... then annah will throw an exception and return a non-zero exit code. In this case annah complains that the ./True on the right-hand side of the type annotation is not a valid type. The last thing we need is a function that can consume values of type ./Bool, like an ./if function: $ cat > ./ifλ(x : ./Bool ) → x--          ^--          |--          +-- Note the space.  Filenames must end with a space

The definition of ./if is blindingly simple: ./if is just the identity function on ./Bools!

To see why this works, let's see what the type of ./if is. We can ask for the type of any expression by feeding the expression to the morte compiler via standard input:

$morte < ./if∀(x : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) →∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(x : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) → x morte is a lambda calculus compiler installed alongside annah and annah is a higher-level interface to the morte language. By default, the morte compiler will: • resolve all file references (transitively, if necessary) • type-check the expression • optimize the expression • write the expression's type to standard error as the first line of output • write the optimized expression to standard output as the last line of output In this case we only cared about the type, so we could have equally well just asked the morte compiler to resolve and infer the type of the expression: $ morte resolve < ./Bool/if | morte typecheck∀(x : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) →∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool

The above type is the same thing as:

∀(x : ./Bool ) → ./Bool

If you don't believe me you can prove this to yourself by asking morte to resolve the type:

$echo "∀(x : ./Bool ) → ./Bool" | morte resolve∀(x : ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool) →∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool However, the type will make the most sense if you only expand out the second ./Bool in the type but leave the first ./Bool alone: ./Bool → ∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Bool You can read this type as saying that the ./if function takes four arguments: • the first argument is the ./Bool that we want to branch on (i.e. ./True or ./False) • the second argument is the result type of our ./if expression • the third argument is the result we return if the ./Bool evaluates to ./True (i.e. the "then" branch) • the fourth argument is the result we return if the ./Bool evaluates to ./False (i.e. the "else" branch) For example, this Haskell code: if Truethen Falseelse True ... would translate to this Annah code: $ annah./if ./True    ./Bool   -- The type of the result    ./False  -- The then branch    ./True   -- The else branch<Ctrl-D>./if  ./True  ./Bool  ./False  ./True

annah does not evaluate the expression. annah only translates the expression into Morte code (and the expression is already valid Morte code) and type-checks the expression. If you want to evaluate the expression you need to run the expression through the morte compiler, too:

$morte./if ./True ./Bool -- The type of the result ./False -- The then branch ./True -- The else branch<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False morte deduces that the expression has type ./Bool and the expression evaluates to ./False. morte evaluates the expression by resolving all references and repeatedly applying β-reduction. This is what happens under the hood: ./if ./True ./Bool ./False ./True-- Resolve the ./if reference= (λ(x : ./Bool ) → x) ./True ./Bool ./False ./True-- β-reduce= ./True ./Bool ./False ./True-- Resolve the ./True reference= (λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True) ./Bool ./False ./True-- β-reduce= (λ(True : ./Bool ) → λ(False : ./Bool ) → True) ./False ./True-- β-reduce= (λ(False : ./Bool ) → ./False ) ./True-- β-reduce= ./False-- Resolve the ./False referenceλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False The above sequence of steps is a white lie: the true order of steps is actually different, but equivalent. The ./if function was not even necessary because every value of type ./Bool is already a "pre-formed if expression". That's why ./if is just the identity function on ./Bools. You can delete the ./if from the above example and the code will still work. Now let's define the not function and save the function to a file: $ annah > ./notλ(b : ./Bool ) →    ./if b        ./Bool        ./False  -- If b is ./True  then return ./False        ./True   -- If b is ./False then return ./True<Ctrl-D>

We can now use this file like an ordinary function:

$morte./not ./False<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True$ morte./not ./True<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False

Notice how ./not ./False returns ./True and ./not ./True returns ./False.

Similarly, we can define an and function and an or function:

$annah > andλ(x : ./Bool ) → λ(y : ./Bool ) → ./if x ./Bool y -- If x is ./True then return y ./False -- If x is ./False then return ./False<Ctrl-D> $ annah > orλ(x : ./Bool ) → λ(y : ./Bool ) →    ./if x        ./Bool        ./True   -- If x is ./True  then return ./True        y        -- If x is ./False then return y<Ctrl-D>

... and use them:

$morte./and ./True ./False<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False $ morte./or ./True ./False<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

We started with nothing but lambda expressions, but still managed to implement:

• a ./Bool type
• a ./True value of type ./Bool
• a ./False value of type ./Bool
• ./if, ./not, ./and, and ./or functions

... and we can do real computation with them! In other words, we've modeled boolean data types entirely as code.

Exercise: Implement an xor function

#### Natural numbers

You might wonder what other data types you can implement in terms of lambda calculus. Fortunately, you don't have to wonder because the annah compiler will actually compile data type definitions to lambda expressions for you.

For example, suppose we want to define a natural number type encoded using Peano numerals. We can write:

$annah typestype Natdata Succ (pred : Nat)data Zerofold foldNat<Ctrl-D> You can read the above datatype specification as saying: • Define a type named Nat ... • ... with a constructor named Succ with one field named pred of type Nat ... • ... with another constructor named Zero with no fields • ... and a fold named foldNat annah then translates the datatype specification into the following files and directories: +-- ./Nat.annah -- annah implementation of Nat|-- ./Nat | +-- @ -- morte implementation of Nat | -- | -- If you import the ./Nat directory this file is | -- imported instead | +-- Zero.annah -- annah implementation of Zero | +-- Zero -- morte implementation of Zero | +-- Succ.annah -- annah implementation of Succ | +-- Succ -- morte implementation of Succ | +-- foldNat.annah -- annah implementation of foldNat | -- foldNat -- morte implementation of foldNat Let's see how the Nat type is implemented: ∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Nat All Boehm-Berarducci-encoded datatypes are encoded as substitution functions, including ./Nat. Any value of ./Nat is a function that takes three arguments that we will substitute into our natural number expression: • The first argument replace every occurrence of the Nat type • The second argument replaces every occurrence of the Succ constructor • The third argument replaces every occurrence of the Zero constructor This will make more sense if we walk through a specific example. First, we will build the number 3 using the ./Nat/Succ and ./Nat/Zero constructors: $ morte./Nat/Succ (./Nat/Succ (./Nat/Succ ./Nat/Zero ))∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ Zero))

Now suppose that we want to compute whether or not our natural number is even. The only catch is that we must limit ourselves to substitution when computing even. We have to figure out something that we can substitute in place of the Succ constructors and something that we can substitute in place of the Zero constructors that will then evaluate to ./True if the natural number is even and ./False otherwise.

One substitution that works is the following:

• Replace every Zero with ./True (because Zero is even)
• Replace every Succ with ./not (because Succ alternates between even and odd)

So in other words, if we began with this:

./Nat/Succ (./Nat/Succ (./Nat/Succ ./Nat/Zero ))

... and we substitute with ./Nat/Succ with ./not and substitute ./Nat/Zero with ./True:

./not (./not (./not ./True ))

... then the expression will reduce to ./False.

Let's prove this by saving the above number to a file named ./three:

$morte > ./three./Nat/Succ (./Nat/Succ (./Nat/Succ ./Nat/Zero ))$ cat threeλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ Zero))

The first thing we need to do is to replace the Nat with ./Bool:

./three ./Bool-- Resolve ./three= (λ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →      Succ (Succ (Succ Zero))  ) ./Bool-- β-reduce= λ(Succ : ∀(pred : ./Bool ) → ./Bool ) → λ(Zero : ./Bool ) →      Succ (Succ (Succ Zero))

Now the next two arguments have exactly the right type for us to substitute in ./not and ./True. The argument named ./Succ is now a function of type ∀(pred : ./Bool ) → ./Bool, which is the same type as ./not. The argument named Zero is now a value of type ./Bool, which is the same type as ./True. This means that we can proceed with the next two arguments:

./three ./Bool ./not ./True-- Resolve ./three= (λ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →      Succ (Succ (Succ Zero))  ) ./Bool ./not ./True-- β-reduce= (λ(Succ : ∀(pred : ./Bool ) → ./Bool ) → λ(Zero : ./Bool ) →      Succ (Succ (Succ Zero))  ) ./not ./True-- β-reduce= (λ(Zero : ./Bool ) → ./not (./not (./not Zero))) ./True-- β-reduce= ./not (./not (./not ./True )))

The result is exactly what we would have gotten if we took our original expression:

./Nat/Succ (./Nat/Succ (./Nat/Succ ./Nat/Zero ))

... and replaced ./Nat/Succ with ./not and replaced ./Nat/Zero with ./True.

Let's verify that this works by running the code through the morte compiler:

$morte./three ./Bool ./not ./True<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False morte computes that the number ./three is not even, returning ./False. We can even go a step further and save an ./even function to a file: $ annah > even\(n : ./Nat ) →    n ./Bool        ./not   -- Replace each ./Nat/Succ with ./not        ./True  -- Replace each ./Nat/Zero with ./True

... and use our newly-formed ./even function:

$morte./even ./three<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False $ morte./even ./Nat/Zero<Ctrl-D>∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True

The annah compiler actually provides direct support for natural number literals, so you can also just write:

$annah | morte./even 100∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True What about addition? How do we add two numbers using only substitution? Well, one way we can add two numbers, m and n, is that we substitute each ./Nat/Succ in m with ./Nat/Succ (i.e. keep them the same) and substitute the Zero with n. In other words: $ annah > plusλ(m : ./Nat ) → λ(n : ./Nat ) →    m ./Nat         -- The result will still be a ./Nat        ./Nat/Succ  -- Replace each ./Nat/Succ with ./Nat/Succ        n           -- Replace each ./Nat/Zero with n

Let's verify that this works:

$annah | morte./plus 2 2∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ Zero))) We get back a Church-encoded 4! What happened under the hood was the following substitutions: ./plus 2 2-- Resolve ./plus= (λ(m : ./Nat ) → λ(n : ./Nat ) → m ./Nat ./Nat/Succ n) 2 2-- β-reduce= (λ(n : ./Nat ) → 2 ./Nat ./Nat/Succ n) 2-- β-reduce= 2 ./Nat ./Nat/Succ 2-- Definition of 2= (./Nat/Succ (./Nat/Succ ./Nat/Zero )) ./Nat ./Nat/Succ 2-- Resolve and β-reduce the definition of 2 (multiple steps)= (λ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ (Succ Zero) ) ./Nat ./Nat/Succ 2-- β-reduce= (λ(Succ : ∀(pred : ./Nat ) → ./Nat ) → λ(Zero : ./Nat ) → Succ (Succ Zero) ) ./Nat/Succ 2-- β-reduce= (λ(Zero : ./Nat ) → ./Nat/Succ (./Nat/Succ Zero)) 2-- β-reduce= ./Nat/Succ (./Nat/Succ 2)-- Definition of 2= ./Nat/Succ (./Nat/Succ (./Nat/Succ (./Nat/Succ ./Nat/Zero )))-- Resolve and β-reduce (multiple steps)= λ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ (Succ (Succ (Succ Zero))) So we can encode natural numbers in lambda calculus, albeit very inefficiently! There are some tricks that we can use to greatly speed up both the time complexity and constant factors, but it will never be competitive with machine arithmetic. This is more of a proof of concept that you can model arithmetic purely in code. Exercise: Implement a function which multiplies two natural numbers #### Data types annah also lets you define "temporary" data types that scope over a given expression. In fact, that's how Nat was implemented. You can look at the corresponding *.annah files to see how each type and term is defined in annah before conversion to morte code. For example, here is how the Nat type is defined in annah: $ cat Nat.annahtype Natdata Succ (pred : Nat)data Zerofold foldNatin   Nat

The first four lines are identical to what we wrote when we invoked the annah types command from the command line. We can use the exact same data type specification to create a scoped expression that can reference the type and data constructors we specified.

When we run this expression through annah we get back the Nat type:

$annah < Nat.annah∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Nat You can use these scoped datatype declarations to quickly check how various datatypes are encoded without polluting your current working directory. For example, I can ask annah how the type Maybe is encoded in lambda calculus: $ annahλ(a : *) →    type Maybe    data Just (x : a)    data Nothing    -- You can also leave out this fold if you don't use it    fold foldMaybe    in   Maybe<Ctrl-D>λ(a : *) → ∀(Maybe : *) → ∀(Just : ∀(x : a) → Maybe) →∀(Nothing : Maybe) → Maybe

A Maybe value is just another substitution function. You provide one branch that you substitute for Just and another branch that you substitute for Nothing. For example, the Just constructor always substitutes in the first branch and ignores the Nothing branch that you supply:

$annahλ(a : *) → type Maybe data Just (x : a) data Nothing in Just<Ctrl-D>λ(a : *) → λ(x : a) → λ(Maybe : *) → λ(Just : ∀(x : a) → Maybe)→ λ(Nothing : Maybe) → Just x Vice versa, the Nothing constructor substitutes in the Nothing branch that you supply and ignores the Just branch: $ annahλ(a : *) →      type Maybe    data Just (x : a)    data Nothing    in   Nothing<Ctrl-D>λ(a : *) → λ(Maybe : *) → λ(Just : ∀(x : a) → Maybe) → λ(Nothing : Maybe) → Nothing

Notice how we've implemented Maybe and Just purely using functions. This implies that any language with functions can encode Maybe, like Python!

Let's translate the above definition of Just and Nothing to the equivalent Python code. The only difference is that we delete the type abstractions because they are not necessary in Python:

def just(x):  def f(just, nothing):    return just(x)  return fdef nothing():  def f(just, nothing):    return nothing  return f

We can similarly translate Haskell-style pattern matching like this:

example :: Maybe Int -> IO ()example m = case m of    Just n  -> print n    Nothing -> return ()

... into this Python code:

def example(m):  def just(n):    # This is what we substitute in place of Just    print(n)  def nothing():  # This is what we substitute in place of Nothing    return  m(just, nothing)

... and verify that our pattern matching function works:

>>> example(nothing())>>> example(just(1))1

Neat! This means that any algebraic data type can be embedded into any language with functions, which is basically every language!

Warning: your colleagues may get angry at you if you do this! Consider using a language with built-in support for algebraic data types instead of trying to twist your language into something it's not.

#### Let expressions

You can also translate let expressions to lambda calculus, too. For example, instead of saving something to a file we can just use a let expression to temporarily define something within a program.

For example, we could write:

$annah | mortelet x : ./Nat = ./plus 1 2in ./plus x x∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ (Succ (Succ Zero))))) ... but that doesn't really tell us anything about how annah desugars let because we only see the final evaluated result. We can ask annah to desugar without performing any other transformations using the annah desugar command: $ annah desugarlet x : ./Nat = ./plus 1 2in  ./plus x x<Ctrl-D>(λ(x : ./Nat ) → ./plus  x x) (./plus  (λ(Nat : *) → λ(Succ :∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ Zero) (λ(Nat : *) →λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ (SuccZero)))

... which makes more sense if we clean up the result through the use of numeric literals:

(λ(x : ./Nat ) → ./plus x x) (./plus 1 2)

Every time we write an expression of the form:

let x : t = yin  e

... we decode that to lambda calculus as:

(λ(x : t) → e) y

We can also decode function definitions, too. For example, you can write:

$annah | mortelet increment (x : ./Nat ) : ./Nat = ./plus x 1in increment 3<Ctrl-D>∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ Zero))) ... and the intermediate desugared form also encodes the function definition as a lambda expression: $ annah desugarlet increment (x : ./Nat ) : ./Nat = ./plus x 1in  increment 3<Ctrl-D>(λ(increment : ∀(x : ./Nat ) → ./Nat ) → increment (λ(Nat : *)→ λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ (Succ(Succ Zero)))) (λ(x : ./Nat ) → ./plus  x (λ(Nat : *) → λ(Succ: ∀(pred : Nat) → Nat) → λ(Zero : Nat) → Succ Zero)

... which you can clean up as this expression:

(λ(increment : ∀(x : ./Nat ) → ./Nat ) → increment 3)    (λ(x : ./Nat ) → ./plus x 1)

We can combine let expressions with data type expressions, too. For example, here's our original not program, except without saving anything to files:

$annahtype Booldata Truedata Falsefold ifinlet not (b : Bool) : Bool = if b Bool False Trueinnot False<Ctrl-D>λ(Bool : *) → λ(True : Bool) → λ(False : Bool) → True #### Lists annah also provides syntactic support for lists as well. For example: $ annah[nil ./Bool , ./True , ./False , ./True ]<Ctrl-D>λ(List : *) → λ(Cons : ∀(head : ./Bool ) → ∀(tail : List) →List) → λ(Nil : List) → Cons ./True  (Cons ./False  (Cons./True  Nil))

Just like all the other data types, a list is defined in terms of what you use to substitute each Cons and Nil constructor. I can replace each Cons with ./and and the Nil with ./True like this:

$annah | morte<Ctrl-D>[nil ./Bool , ./True , ./False , ./True ] ./Bool ./and ./True∀(Bool : *) → ∀(True : Bool) → ∀(False : Bool) → Boolλ(Bool : *) → λ(True : Bool) → λ(False : Bool) → False This conceptually followed the following reduction sequence: ( λ(List : *)→ λ(Cons : ∀(head : ./Bool ) → ∀(tail : List) → List)→ λ(Nil : List)→ Cons ./True (Cons ./False (Cons ./True Nil))) ./Bool ./and ./True-- β-reduction= ( λ(Cons : ∀(head : ./Bool ) → ∀(tail : ./Bool ) → ./Bool ) → λ(Nil : ./Bool ) → Cons ./True (Cons ./False (Cons ./True Nil)) ) ./and ./True-- β-reduction= ( λ(Nil : ./Bool ) → ./and ./True (./and ./False (./and ./True Nil)) ) ./True-- β-reduction= ./and ./True (./and ./False (./and ./True ./True)) Similarly, we can sum a list by replacing each Cons with ./plus and replacing each Nil with 0: $ annah | morte[nil ./Nat , 1, 2, 3, 4] ./Nat ./plus 0∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (SuccZero)))))))))

This behaves as if we had written:

./plus 1 (./plus 2 (./plus 3 (./plus 4 0)))

#### Prelude

annah also comes with a Prelude to show some more sophisticated examples of what you can encode in pure lambda calculus. You can find version 1.0 of the Prelude here:

http://sigil.place/prelude/annah/1.0/

You can use these expressions directly within your code just by referencing their URL. For example, the remote Bool expression is located here:

http://sigil.place/prelude/annah/1.0/Bool/@

... and the remote True expression is located here:

http://sigil.place/prelude/annah/1.0/Bool/True

... so we can check if the remote True's type matches the remote Bool by writing this:

$annahhttp://sigil.place/prelude/annah/1.0/Bool/True : http://sigil.place/prelude/annah/1.0/Bool<Ctrl-D>http://sigil.place/prelude/annah/1.0/Bool/True$ echo $?0 Similarly, we can build a natural number (very verbosely) using remote Succ and Zero: $ annah | mortehttp://sigil.place/prelude/annah/1.0/Nat/Succ(   http://sigil.place/prelude/annah/1.0/Nat/Succ    (   http://sigil.place/prelude/annah/1.0/Nat/Succ        http://sigil.place/prelude/annah/1.0/Nat/Zero    ))∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ Zero))

However, we can also locally clone the Prelude using wget if we wish to refer to local file paths instead of remote paths:

$wget -np -r --cut-dirs=3 http://sigil.place/prelude/annah/1.0/$ cd sigil.place$ls(->) Defer.annah List.annah Path Sum0.annah(->).annah Eq Maybe Path.annah Sum1Bool Eq.annah Maybe.annah Prod0 Sum1.annahBool.annah Functor Monad Prod0.annah Sum2Category Functor.annah Monad.annah Prod1 Sum2.annahCategory.annah index.html Monoid Prod1.annahCmd IO Monoid.annah Prod2Cmd.annah IO.annah Nat Prod2.annahDefer List Nat.annah Sum0 Now we can use these expressions using their more succinct local paths: ./Nat/sum (./List/(++) ./Nat [nil ./Nat , 1, 2] [nil ./Nat , 3, 4])<Ctrl-D>∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (SuccZero))))))))) Also, every expression has a corresponding *.annah file that documents the expression's type using a let expression. For example, we can see the type of the ./List/(++) function by studying the ./List/(++).annah file: cat './List/(++).annah' let (++) (a : *) (as1 : ../List a) (as2 : ../List a) : ../List a = \(List : *) -> \(Cons : a -> List -> List) -> \(Nil : List) -> as1 List Cons (as2 List Cons Nil)in (++) The top line tells us that (++) is a function that takes three arguments: • An argument named a for the type list elements you want to combine • An argument named as1 for the left list you want to combine • An argument named as2 for the right list you want to combine ... and the function returns a list of the same type as the two input lists. #### Beyond Exactly how far can you take lambda calculus? Well, here's a program written in annah that reads two natural numbers, adds them, and writes them out: $ annah | morte./IO/Monad ./Prod0 (do ./IO {    x : ./Nat   <- ./IO/get ;    y : ./Nat   <- ./IO/get ;    _ : ./Prod0 <- ./IO/put (./Nat/(+) x y);})∀(IO : *) → ∀(Get_ : ((∀(Nat : *) → ∀(Succ : ∀(pred : Nat) →Nat) → ∀(Zero : Nat) → Nat) → IO) → IO) → ∀(Put_ : (∀(Nat : *)→ ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Nat) → IO →IO) → ∀(Pure_ : (∀(Prod0 : *) → ∀(Make : Prod0) → Prod0) → IO)→ IOλ(IO : *) → λ(Get_ : ((∀(Nat : *) → ∀(Succ : ∀(pred : Nat) →Nat) → ∀(Zero : Nat) → Nat) → IO) → IO) → λ(Put_ : (∀(Nat : *)→ ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Nat) → IO →IO) → λ(Pure_ : (∀(Prod0 : *) → ∀(Make : Prod0) → Prod0) → IO)→ Get_ (λ(r : ∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) →∀(Zero : Nat) → Nat) → Get_ (λ(r : ∀(Nat : *) → ∀(Succ :∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Nat) → Put_ (λ(Nat : *)→ λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) → r@1 Nat Succ(r Nat Succ Zero)) (Pure_ (λ(Prod0 : *) → λ(Make : Prod0) →Make))))

This does not run the program, but it creates a syntax tree representing all program instructions and the flow of information.

annah supports do notation so you can do things like write list comprehensions in annah:

$annah | morte./List/sum (./List/Monad ./Nat (do ./List { x : ./Nat <- [nil ./Nat , 1, 2, 3]; y : ./Nat <- [nil ./Nat , 4, 5, 6]; _ : ./Nat <- ./List/pure ./Nat (./Nat/(+) x y);}))<Ctrl-D>∀(Nat : *) → ∀(Succ : ∀(pred : Nat) → Nat) → ∀(Zero : Nat) → Natλ(Nat : *) → λ(Succ : ∀(pred : Nat) → Nat) → λ(Zero : Nat) →Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ (Succ(Succ (Succ (Succ Zero)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))) The above Annah program is equivalent to the following Haskell program: sum (do x <- [1, 2, 3] y <- [4, 5, 6] return (x + y) ) ... yet it is implemented 100% in a minimal and total lambda calculus without any built-in support for data types. This tutorial doesn't cover how do notation works, but you can learn this and more by reading the Annah tutorial which is bundled with the Hackage package: #### Conclusions A lot of people underestimate how much you can do in a total lambda calculus. I don't recommend pure lambda calculus as a general programming language, but I could see a lambda calculus enriched with high-efficiency primitives to be a realistic starting point for simple functional languages that are easy to port and distribute. One of the projects I'm working towards in the long run is a "JSON for code" and annah is one step along the way towards that goal. annah will likely not be that language, but I still factored out annah as a separate and reusable project along the way so that others could fork and experiment with annah when experimenting with their own language design. Also, as far as I can tell annah is the only project in the wild that actually implements the Boehm-Berarducci encoding outlined in this paper: ... so if you prefer to learn the encoding algorithm by studying actual code you can use the annah source code as a reference real-world implementation. You can find Annah project on Hackage or Github: ... and you can find the Annah prelude hosted online: The Annah tutorial goes into the Annah language and compiler in much more detail than this tutorial, so if you would like to learn more I highly recommend reading the tutorial which walks through the compiler, desugaring, and the Prelude in much more detail: Also, for those who are curious, both the annah and morte compilers are named after characters from the game Planescape: Torment. ### Philip Wadler # John McCarthy presents Recursive Functions of Symbolic Expressions “John McCarthy presents Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I”, Painting, 1662, Ferdinand Bol. From Classic Programmer Paintings. ### Edward Z. Yang # Hindley-Milner with top-level existentials Content advisory: This is a half-baked research post. Abstract. Top-level unpacking of existentials are easy to integrate into Hindley-Milner type inference. Haskell should support them. It's possible this idea can work for internal bindings of existentials as well (ala F-ing modules) but I have not worked out how to do it. Update. And UHC did it first! Update 2. And rank-2 type inference is decidable (and rank-1 existentials are an even weaker system), although the algorithm for rank-2 inference requires semiunification. ### Background The difference between Hindley-Milner and System F. Although in informal discussion, Hindley-Milner is commonly described as a “type inference algorithm”, it should properly be described as a type system which is more restrictive than System F. Both type systems allow polymorphism via universal quantification of variables, but in System F this polymorphism is explicit and can occur anywhere, whereas in Hindley-Milner the polymorphism is implicit, and can only occur at the “top level” (in a so-called “polytype” or “type scheme.”) This restriction of polymorphism is the key which makes inference (via Algorithm W) for Hindley-Milner decidable (and practical), whereas inference for System F undecidable. -- Hindley Milner id :: a -> a id = λx. x -- System F id :: ∀a. a -> a id = Λa. λ(x : a). x  Existential types in System F. A common generalization of System F is to equip it with existential types: Types τ ::= ... | ∃a. τ Terms e ::= ... | pack <τ, e>_τ | unpack <a, x> = e in e  In System F, it is technically not necessary to add existentials as a primitive concept, as they can be encoded using universal quantifiers by saying ∃a. τ = ∀r. (∀a. τ → r) → r. Existential types in Hindley-Milner? This strategy will not work for Hindley-Milner: the encoding requires a higher-rank type, which is precisely what Hindley-Milner rules out for the sake of inference. In any case, it is a fool's game to try to infer existential types: there's no best type! HM always infers the most general type for an expression: e.g., we will infer f :: a -> a for the function f = \x -> x, and not Int -> Int. But the whole point of data abstraction is to pick a more abstract type, which is not going to be the most general type and, consequently, is not going to be unique. What should be abstract, what should be concrete? Only the user knows. Existential types in Haskell. Suppose that we are willing to write down explicit types when existentials are packed, can Hindley-Milner do the rest of the work: that is to say, do we have complete and decidable inference for the rest of the types in our program? Haskell is an existence (cough cough) proof that this can be made to work. In fact, there are two ways to go about doing it. The first is what you will see if you Google for “Haskell existential type”: {-# LANGUAGE ExistentialQuantification #-} data Ex f = forall a. Ex (f a) pack :: f a -> Ex f pack = Ex unpack :: Ex f -> (forall a. f a -> r) -> r unpack m k = case m of Ex x -> f x  Ex f is isomorphic to ∃a. f a, and similar to the System F syntax, they can be packed with the Ex constructor and unpacked by pattern-matching on them. The second way is to directly use the System F encoding using Haskell's support for rank-n types: {-# LANGUAGE RankNTypes #-} type Ex f = forall r. (forall a. f a -> r) -> r pack :: f a -> Ex f pack x = \k -> k x unpack :: Ex f -> (forall a. f a -> r) -> r unpack m k = m k  The boxy types paper demonstrated that you can do inference, so long as all of your higher rank types are annotated. Although, perhaps it was not as simple as hoped, since impredicative types are a source of constant bugs in GHC's type checker. ### The problem Explicit unpacks suck. As anyone who has tried programming with existentials in Haskell can attest, the use of existentials can still be quite clumsy due to the necessity of unpacking an existential (casing on it) before it can be used. That is to say, the syntax let Ex x = ... in ... is not allowed, and it is an easy way to get GHC to tell you its brain exploded. Leijen investigated the problem of handling existentials without explicit unpacks. Loss of principal types without explicit unpacks, and Leijen's solution. Unfortunately, the naive type system does not have principal types. Leijen gives an example where there is no principal type: wrap :: forall a. a -> [a] key :: exists b. Key b -- What is the type of 'wrap key'? -- [exists b. Key b]? -- exists b. [key b]?  Neither type is a subtype of the other. In his paper, Leijen suggests that the existential should be unwrapped as late as possible (since you can go from the first type to the second, but not vice versa), and thus, the first type should be preferred. ### The solution A different approach. What if we always lift the existential to the top level? This is really easy to do if you limit unpacks to the top-level of a program, and it turns out this works really well. (The downside is that dynamic use of existentials is not supported.) There's an existential in every top-level Haskell algebraic data type. First, I want to convince you that this is not all that strange of an idea. To do this, we look at Haskell's support for algebraic data types. Algebraic data types in Haskell are generative: each data type must be given a top-level declaration and is considered a distinct type from any other data type. Indeed, Haskell users use this generativity in conjunction with the ability to hide constructors to achieve data abstraction in Haskell. Although there is not actually an existential lurking about—generativity is not data abstraction—generativity is an essential part of data abstraction, and HM has no problem with this. Top-level generativity corresponds to existentials that are unpacked at the top-level of a program (ala F-ing modules). We don't need existentials embedded inside our Haskell expressions to support the generativity of algebraic data types: all we need is the ability to pack an existential type at the top level, and then immediately unpack it into the top-level context. In fact, F-ing modules goes even further: existentials can always be lifted until they reach the top level of the program. Modular programming with applicative functors (the ML kind) can be encoded using top-level existentials which are immediately unpacked as they are defined. The proposal. So let us suggest the following type system, Hindley-Milner with top-level existentials (where a* denotes zero to many type variables): Term variables ∈ f, x, y, z Type variables ∈ a, b, c Programs prog ::= let f = e in prog | seal (b*, f :: σ) = (τ*, e) in prog | {- -} Type schemes (polytypes) σ ::= ∀a*. τ Expressions e ::= x | \x -> e | e e Monotypes τ ::= a | τ -> τ  There is one new top-level binding form, seal. We can give it the following typing rule: Γ ⊢ e :: τ₀[b* → τ*] a* = free-vars(τ₀[b* → τ*]) Γ, b*, (f :: ∀a*. τ₀) ⊢ prog --------------------------------------------- Γ ⊢ seal (b*, f :: ∀a*. τ₀) = (τ*, e) in prog  It also elaborates directly to System F with existentials: seal (b*, f :: σ) = (τ*, e) in prog ===> unpack <b*, f> = pack <τ*, e>_{∃b*. σ} in prog  A few observations: 1. In conventional presentations of HM, let-bindings are allowed to be nested inside expressions (and are generalized to polytypes before being added to the context). Can we do something similar with seal? This should be possible, but the bound existential type variables must be propagated up. 2. This leads to a second problem: naively, the order of quantifiers must be ∃b. ∀a. τ and not ∀a. ∃b. τ, because otherwise we cannot add the existential to the top-level context. However, there is a "skolemization" trick (c.f. Shao and F-ing modules) by which you can just make b a higher-kinded type variable which takes a as an argument, e.g., ∀a. ∃b. b is equivalent to ∃b'. ∀a. b' a. This trick could serve as the way to support inner seal bindings, but the encoding tends to be quite involved (as you must close over the entire environment.) 3. This rule is not very useful for directly modeling ML modules, as a “module” is usually thought of as a record of polymorphic functions. Maybe you could generalize this rule to bind multiple polymorphic functions? Conclusion. And that's as far as I've worked it out. I am hoping someone can tell me (1) who came up with this idea already, and (2) why it doesn't work. ### Mark Jason Dominus # Steph Curry: fluke or breakthrough? [ Disclaimer: I know very little about basketball. I think there's a good chance this article contains at least one basketball-related howler, but I'm too ignorant to know where it is. ] Randy Olson recently tweeted a link to a New York Times article about Steph Curry's new 3-point record. Here is Olson’s snapshot of a portion of the Times’ clever and attractive interactive chart: (Skip this paragraph if you know anything about basketball. The object of the sport is to throw a ball through a “basket” suspended ten feet (3 meters) above the court. Normally a player's team is awarded two points for doing this. But if the player is sufficiently far from the basket—the distance varies but is around 23 feet (7 meters)—three points are awarded instead. Carry on!) Stephen Curry The chart demonstrates that Curry this year has shattered the single-season record for three-point field goals. The previous record, set last year, is 286, also by Curry; the new record is 406. A comment by the authors of the chart says The record is an outlier that defies most comparisons, but here is one: It is the equivalent of hitting 103 home runs in a Major League Baseball season. (The current single-season home run record is 73, and .) I found this remark striking, because I don't think the record is an outlier that defies most comparisons. In fact, it doesn't even defy the comparison they make, to the baseball single-season home run record. Babe Ruth In 1919, the record for home runs in a single season was 29, hit by Babe Ruth. The 1920 record, also by Ruth, was 54. To make the same comparison as the authors of the Times article, that is the equivalent of hitting home runs in a Major League Baseball season. No, far from being an outlier that defies most comparisons, I think what we're seeing here is something that has happened over and over in sport, a fundamental shift in the way they game is played; in short, a breakthrough. In baseball, Ruth's 1920 season was the end of what is now known as the dead-ball era. The end of the dead-ball era was the caused by the confluence of several trends (shrinking ballparks), rule changes (the spitball), and one-off events (Ray Chapman, the Black Sox). But an important cause was simply that Ruth realized that he could play the game in a better way by hitting a crapload of home runs. The new record was the end of a sudden and sharp upward trend. Prior to Ruth's 29 home runs in 1919, the record had been 27, a weird fluke set way back in 1887 when the rules were drastically different. Typical single-season home run records in the intervening years were in the 11 to 16 range; the record exceeded 20 in only four of the intervening 25 years. Ruth's innovation was promptly imitated. In 1920, the #2 hitter hit 19 home runs and the #10 hitter hit 11, typical numbers for the nineteen-teens. By 1929, the #10 hitter hit 31 home runs, which would have been record-setting in 1919. It was a different game. Takeru Kobayashi For another example of a breakthrough, let's consider competitive hot dog eating. Between 1980 and 1990, champion hot-dog eaters consumed between 9 and 16 hot dogs in 10 minutes. In 1991 the time was extended to 12 minutes and Frank Dellarosa set a new record, 21½ hot dogs, which was not too far out of line with previous records, and which was repeatedly approached in the following decade: through 1999 five different champions ate between 19 and 24½ hot dogs in 12 minutes, in every year except 1993. But in 2000 Takeru Kobayashi (小林 尊) changed the sport forever, eating an unbelievably disgusting 50 hot dogs in 12 minutes. (50. Not a misprint. Fifty. Roman numeral Ⅼ.) To make the Times’ comparison again, that is the equivalent of hitting home runs in a Major League Baseball season. At that point it was a different game. Did the record represent a fundamental shift in hot dog gobbling technique? Yes. Kobayashi won all five of the next five contests, eating between 44½ and 53¾ each time. By 2005 the second- and third-place finishers were eating 35 or more hot dogs each; had they done this in 1995 they would have demolished the old records. A new generation of champions emerged, following Kobayashi's lead. The current record is 69 hot dogs in 10 minutes. The record-setters of the 1990s would not even be in contention in a modern hot dog eating contest. Bob Beamon It is instructive to compare these breakthroughs with a different sort of astonishing sports record, the bizarre fluke. In 1967, the world record distance for the long jump was 8.35 meters. In 1968, Bob Beamon shattered this record, jumping 8.90 meters. To put this in perspective, consider that in one jump, Beamon advanced the record by 55 cm, the same amount that it had advanced (in 13 stages) between 1925 and 1967. Progression of the world long jump record The cliff at 1968 is Bob Beamon Did Beamon's new record represent a fundamental shift in long jump technique? No: Beamon never again jumped more than 8.22m. Did other jumpers promptly imitate it? No, Beamon's record was approached only a few times in the following quarter-century, and surpassed only once. Beamon had the benefit of high altitude, a tail wind, and fabulous luck. Joe DiMaggio Another bizarre fluke is Joe DiMaggio's hitting streak: in the 1941 baseball season, DiMaggio achieved hits in 56 consecutive games. For extensive discussion of just how bizarre this is, see The Streak of Streaks by Stephen J. Gould. (“DiMaggio’s streak is the most extraordinary thing that ever happened in American sports.”) Did DiMaggio’s hitting streak represent a fundamental shift in the way the game of baseball was played, toward high-average hitting? Did other players promptly imitate it? No. DiMaggio's streak has never been seriously challenged, and has been approached only a few times. (The modern runner-up is Pete Rose, who hit in 44 consecutive games in 1978.) DiMaggio also had the benefit of fabulous luck. Is Curry’s new record a fluke or a breakthrough? I think what we're seeing in basketball is a breakthrough, a shift in the way the game is played analogous to the arrival of baseball’s home run era in the 1920s. Unless the league tinkers with the rules to prevent it, we might expect the next generation of players to regularly lead the league with 300 or 400 three-point shots in a season. Here's why I think so. 1. Curry's record wasn't unprecedented. He's been setting three-point records for years. (Compare Ruth’s 1920 home run record, foreshadowed in 1919.) He's continuing a trend that he began years ago. 2. Curry’s record, unlike DiMaggio’s streak, does not appear to depend on fabulous luck. His 402 field goals this year are on 886 attempts, a 45.4% success rate. This is in line with his success rate every year since 2009; last year he had a 44.3% success rate. Curry didn't get lucky this year; he had 40% more field goals because he made almost 40% more attempts. There seems to be no reason to think he couldn't make the same number of attempts next year with equal success, if he wants to. 3. Does he want to? Probably. Curry’s new three-point strategy seems to be extremely effective. In his previous three seasons he scored 1786, 1873, and 1900 points; this season, he scored 2375, an increase of 475, three-quarters of which is due to his three-point field goals. So we can suppose that he will continue to attempt a large number of three-point shots. 4. Is this something unique to Curry or is it something that other players might learn to emulate? Curry’s three-point field goal rate is high, but not exceptionally so. He's not the most accurate of all three-point shooters; he holds the 62nd–64th-highest season percentages for three-point success rate. There are at least a few other players in the league who must have seen what Curry did and thought “I could do that”. (Kyle Korver maybe? I'm on very shaky ground; I don't even know how old he is.) Some of those players are going to give it a try, as are some we haven’t seen yet, and there seems to be no reason why some shouldn't succeed. A number of things could sabotage this analysis. For example, the league might take steps to reduce the number of three-point field goals, specifically in response to Curry’s new record, say by moving the three-point line farther from the basket. But if nothing like that happens, I think it's likely that we'll see basketball enter a new era of higher offense with more three-point shots, and that future sport historians will look back on this season as a watershed. [ Addendum 20160425: As I feared, my Korver suggestion was ridiculous. Thanks to the folks who explained why. Reason #1: He is 35 years old. ] ## April 23, 2016 ### Oliver Charles # Announcing transformers-eff In my last post, I spent some time discussing a few different approaches to dealing with computational effects in Haskell - namely monad transformers, free monads, and the monad transformer library. I presented an approach to systematically building mtl-like type classes based on the idea of lifting languages for a given effect into larger monad transformer stacks. This approach felt so mechanical to me I set about exploring a way to formalise it, and am happy to announce a new experimental library – transformers-eff. transformers-eff takes inspiration from the work of algebraic effects and handlers, and splits each effect into composable programs for introducing effects and handlers that eliminate these effects. As the name indicates, this work is also closely related to monad transformer stacks, as they provide the implementation of the specific effects. I believe the novelty in my approach is that we can do this entirely within the system of monad transformers, and this observation makes it very convenient to create re-usable effects. ## Core API Before looking at an example, I want to start by presenting the core API. First, we have the Eff monad transformer: data Eff (f :: * -> *) (m :: * -> *) (a :: *) If you squint, you’ll see that Eff has the familiar shape of a monad transformer - it transforms a given monad m, providing it access to effects described by f. As Eff f m is itself a monad, it’s possible to stack Effs together. The type parameter f is used to indicate which effects this Eff transformer talks about. Next, the library provides a way to eliminate Eff by translating it into a concrete monad transformer: translate :: (Monad m,Monad (t m),MonadTrans t) => (forall x r. f x -> ContT r (t m) x) -> Eff f m a -> t m a Translations are defined by a single function that is very similar to the type of “lifts” we saw in my previous blog post. The difference here is that the homomorphism maps into ContT, which allows the translation to adjust control flow. For many effects it will be enough to simply lift directly into this, but it can be useful to inspect the continuation, for example to build non-deterministic computations. Finally, we have one type class method: interpret :: (Monad m) => f a -> m a However, this type class is fairly constrained in its instances, so you should read m as actually being some sort of monad transformer stack containing Eff f. ## Examples Let’s dive in and look at some examples. ### Reader effects Last post we spent a lot of time looking at various representations of the reader monad, so let’s see how this looks under transformers-eff. We already have a definition for our language, r -> a as we saw last week. While we could work directly with this, we’ll be interpreting into ReaderT so I’ll use the Reader newtype for a little extra readibility. Given this language, we just need to write a translation into a concrete monad transformer, which will be ReaderT: effToReaderT :: Monad m => Eff (Reader e) m a -> ReaderT e m a effToReaderT = translate (\r -> lift (hoist generalize r)) This is a little dense, so let’s break it down. When we call translate, we have to provide a function with the type: forall a m. Reader r a -> ContT _ (ReaderT r m) a The ReaderT r m part is coming from the type we gave in the call to translate, that is – the type of effToReaderT. We don’t really need to concern outselves with continuations for this effect, as reading from a fixed environment does not change the flow of control - so we’ll begin with lift. We now have to produce a ReaderT r m a from a Reader r a. If we notice that Reader r a = ReaderT r Identity a, we can make use of the tools in the mmorph library, which lets us map that Identity to any m via hoist generalize. We still need a way to easily introduce these effects into our programs, and that means writing an mtl type class. However, the instances require almost no work on our behalf and we only have to provide two, making this is a very quick process: class (Monad m) => EffReader env m | m -> env where liftReader :: Reader env a -> m a instance Monad m => EffReader env (Eff (Reader env) m) where liftReader = interpret instance {-# OVERLAPPABLE #-} EffReader env m => EffReader env (Eff effects m) where liftReader = lift . liftReader I then provide a user-friendly API built on this lift operation: ask :: EffEnv e m => m e ask = liftReader (Reader id) Finally, most users are probably more interested in running the effect rather than just translating it to ReaderT, so let’s provide a convenience function to translate and run all in one go: runReader :: Eff (Reader r) m a -> r -> m a runReader eff r = runReaderT (effToReaderT eff) r In total, the reader effect is described as: class (Monad m) => EffReader env m | m -> env where liftReader :: Reader env a -> m a instance Monad m => EffReader env (Eff (Reader env) m) where liftReader = interpret instance {-# OVERLAPPABLE #-} EffReader env m => EffReader env (Eff effects m) where liftReader = lift . liftReader ask :: EffEnv e m => m e ask = liftReader (Reader id) effToReaderT :: Monad m => Eff (Reader e) m a -> ReaderT e m a effToReaderT = translate (\r -> lift (hoist generalize r)) ### A logging effect We also looked at a logging effect last week, and this can also be built using transformers-eff: data LoggingF message a = Log message deriving (Functor) class (Monad m) => EffLog message m | m -> message where liftLog :: Free (LoggingF message) a -> m a instance Monad m => EffLog env (Eff (Free (LoggingF message)) m) where liftLog = interpret instance {-# OVERLAPPABLE #-} EffLog env m => EffLog env (Eff effects m) where liftLog = lift . liftLog log :: EffLog message m => message -> m () log = liftLog . liftF . Log runLog :: (MonadIO m) => Eff (Free (LoggingF message) e) m a -> (message -> IO ()) -> m a runLog eff = runIdentityT (translate (iterM (\(Log msg) -> liftIO (io msg)))) The interpretation here is given an IO action to perform whenever a message is logged. I could have implemented this in a few ways - perhaps lifting the whole computation into ReaderT (message -> IO ()), but instead I have just used IdentityT as the target monad transformer, and added a MonadIO constraint onto m. Whenever a message is logged, we’ll directly call the given IO action. As you can also see, I’ve used a free monad as the source language for the effect. This example demonstrates that we are free to mix a variety of tools (here free monads, MonadIO and the identity transformer) in order to get the job done. ## What does this approach bring? ### Less type class instances We saw above that when we introduced our EffLog type class, it was immediately available for use along side EffReader effects - and we didn’t have to do anything extra! To me, this is a huge win - I frequently find myself frustrated with the amount of work required to do when composing many different projects together with mtl, and this is not just a theoretical frustration. To provide just one example from today, I wanted to use ListT with some Yesod code that required MonadLogger. There is obviously no MonadLogger instance for ListT, and it’s almost unsolvable to provide such an instance withoutrs/o using orphan instances - neither one of those libraries should need to depend on the other, so we’re stuck! If you stay within Eff, this problem doesn’t occur. Many will be quick to point out that in mtl it doesn’t necessary make sense to have all transformers compose due to laws (despite the lack of any laws actually being stated…), and I’m curious if this is true here. In this library, due to the limitation on having to write your effectful programs based on an underlying algebra, I’m not sure it’s possible to introduce the problematic type class methods like local and catch. ### One effect at a time In the mtl approach a single monad transformer stack might be able to deal with a whole selection of effects in one go. However, I’ve found that this can actually make it quite difficult to reason about the flow of code. To provide an example, let’s consider this small API: findOllie :: (MonadDb m, MonadPlus m) => m Person findOllie = do x <- dbLookup (PersonId 42) guard (personName x == "Ollie") return x type QueryError = String dbLookup :: (MonadDb m, MonadError QueryError m) => PersonId -> m Person data DbT m a instance Monad m => Monad (DbT m) instance Monad m => MonadDb (DbT m) runDb :: (MonadIO m) :: DbT m a -> m a If we just try and apply runDb to findOllie, we’ll get runDb findOllie :: (MonadError QueryError m, MonadIO m, MonadPlus m) => m Person We still need to take care of MonadError and MonadPlus. For MonadError I’ll use ExceptT, and for MonadPlus I’ll use MaybeT: runMaybeT (runExceptT (runDb findOllie)) :: IO (Maybe (Either QueryError Person)) Next, let’s consider a few scenarios. Firstly, the case where everything succeeds - > runMaybeT (runExceptT (runDb findOllie)) Just (Right Person ...) However, that query could fail, which would cause an error > runMaybeT (runExceptT (runDb findOllie)) Just (Left "Table person not found") Still as expected. Finally, person 42 might not actually be me, in which case we get > runMaybeT (runExceptT (runDb findOllie)) Just (Left "") Huh? What’s happened here is that we’ve hit the MonadPlus instance for ExceptT, and because our QueryError is a String we have a Monoid instance, so we were given an “empty” error. This is not at all what we were expecting! While this example is a contrived one, I am very nervous that this accidental choice of instances could happen deep within another section of code, for example where I expect to do some local error handling and accidentally eliminate a chance of failure that I was expecting to deal with elsewhere. In transformers-eff this is not possible, as each Eff deals with one and only one effect at a time. This could be done with mtl by introducing a separate type class for failure and only adding an instance for MaybeT, we are working around the problem by convention, and I would much rather bake that in to the types. ### Fast code The underlying implementation of Eff is built on top of continuations, and due to aggressive inlineing, GHC is able to work some serious magic. In fact, in all the benchmarks I’ve produced so far, Eff is as fast as transformers, and even comes out slightly faster in one (though within the same order of magnitude). ### Compatible with the rest of Hackage As Eff is just another monad transformer, you can stack in other monad transformers. Note that by doing this you may lack the type class instances you need, so explicit lifting might be necessary. I mainly expect this being useful by putting Eff “on the top” - for example I can use Eff locally with in a Snap monad computation, provided I eventually run back down to just Snap. This is the same pattern as locally using transformers. ## April 21, 2016 ### Philip Wadler # Pedal on Parliament Come join Pedal on Parliament! Gather in the Meadows from 11am Saturday 23 April, procession sets off at noon. A few years ago, I took my son with me to ICFP in Copenhagen. We had a blast cycling around the city, and marvelled that there were bike paths everywhere. When I lived in Morningside, my cycle to work was along quiet roads, but even so it felt safer when I arrived on the bike path through the Meadows. Now that I live near the Cameo, I'm even happier to get off the busy road and onto a path. And I look forward to the future, because Edinburgh is a city that invests in cycling and has a plan on the table that includes a cycle path from the Meadows to the Canal, which will run past my flat. Getting more people cycling will cut pollution, benefit health, and increase quality of life. Studies show that people don't cycle because they feel sharing the road with cars is unsafe, so investment in cycle paths can make a huge difference. If people in the UK cycled and walked as much as people do in Copenhagen, the NHS would save around £17 billion within twenty years. The video below makes the case brilliantly. Scotland has set a goal that 10% of all travel should be by cycle or foot (the buzzword is active travel), but only spends about 2% of its budget on active travel. The City of Edinburgh has pledged to up it's active travel budget by 1% a year until it reaches 10%. Pedal on Parliament is our chance to support the positive steps in Edinburgh, and encourage the rest of the country to take action. <iframe allowfullscreen="allowfullscreen" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/eLp4tUtdBWo/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/eLp4tUtdBWo?feature=player_embedded" width="320"></iframe> ### Bryn Keller # Mac OS X C++ Development I recently switched to a MacBook Pro. I have customers that use Linux and Mac, and I wanted to work in a similar environment. Also recently (a few months before the MacBook) I started working with C++ again after a long hiatus. I had thought that the Mac, being a Unix, would be relatively close to Linux, and that things I was building for Linux would be much more likely to work there than on Windows. That might still be true, but it turns out that there are several things on Mac that are not obvious, and seriously complicate native code development compared with Linux. These are my notes on those differences and how to deal with them. Hopefully, it may be useful for other migrants to Mac as well. # Xcode Apple includes something called Xcode. This is apparently a combination of a platform SDK, and an IDE with a user interface similar to iTunes. You have to have it, but you don’t have to use the IDE part. It must be installed from the App Store. Don’t fight it, just install it and move on. ## Command line tools You definitely want the Xcode command line tools. Run: xcode-select --install to install them. This will give you git as well. # Brew There are actually two package managers for Mac OS X, MacPorts and Homebrew, and as a developer you’ll definitely need one of them. I use brew, because other people I know recommended it, and it’s been nice so far. You need it to install libraries and tools that don’t come with the Mac. Most notably gcc, cmake, and so on. # Clang and gcc Apple ships the clang compiler with Mac OS X, so this is the considered the standard compiler for the platform. This means that some libraries (notably Qt) only support building with clang. Some C/C++ projects assume (incorrectly) that everybody builds with gcc. For this reason (I guess), Apple did a really odd thing: they ship a gcc executable, which is actually clang in disguise: >$ gcc
clang: error: no input files

This (I guess) works sometimes, since many flags work the same in both compilers. However, it is deeply confusing and causes problems as well. For example, gcc supports OpenMP, a powerful parallel computing tool, and crucial for the work I’m doing. Recent versions of clang support it as well, but Apple’s fork of clang that ships with Macs does not. So to use OpenMP, I have to have the real gcc. This will cause other problems down the road, we’ll get to them in a bit.

You’ll want to install gcc with brew:

brew install gcc gdb

Since clang is already masquerading as gcc, the Homebrew folks came up with a workaround - the gcc package installs executables called gcc-5 and g++-5 instead of gcc and g++. I added the following in my profile to encourage build systems to use these compilers instead of clang.

export HOMEBREW_CC=gcc-5
export HOMEBREW_CXX=g++-5
export CC=gcc-5
export CXX=g++-5


Note the Homebrew-specific ones. Homebrew generally installs binary, precompiled packages rather than compiling on your machine, but you can pass --build-from-source to it to make it recompile. If you do that, it will honor the HOMEBREW_CC and HOMEBREW_CXX environment variables and use those to do the build.

I also aliased cmake to ensure that cmake uses gcc-5 and g++-5 by default as well:

alias cmake=/usr/local/bin/cmake -DCMAKE_C_COMPILER=$CC -DCMAKE_CXX_COMPILER=$CXX

## Compatibility

C++, unlike C, doesn’t specify a binary interface standard. This means that libraries that are compiled with different C++ compilers can have problems interoperating. So there’s that to consider when you use things that were compile with clang (such as anything you download using brew without recompiling) together with things you’ve been building with g++-5.

The most pressing example of this is related to the C++ standard library. There are, on Mac (and elsewhere too I suppose), at least two implementations: libstdc++, and libc++. By default, clang uses libc++ and gcc-5 uses libstdc++. In practice, this means that if you install a C++ based library with brew, you will be able to compile against it with g++-5, but when you get to the link stage, it will fail with lots of missing symbols. If this happens,

brew reinstall --build-from-source <package>

can often fix the problem. However, there are brew packages (e.g. pkg-config) that will fail to compile under g++-5, so there can be cases where this doesn’t work. One example: I was trying to build mxnet directly using the brew package for OpenCV support, and it failed with the aforementioned link errors. I tried to reinstall opencv with --build-from-source with brew, and it started recompiling all of opencv’s (many) dependencies, including pkg-config, which for some reason fails to compile. So in the end I had to pull opencv as well and build it manually, after which mxnet built fine too.

## Next time

These were some important things to be aware of when starting to develop in C++ on Macs. In the next installment, we’ll talk about dynamic libraries, install names, and other such loveliness.

# λ Software Engineer at Pet Partners, LLC (Full-time)

Come help us build world class software that's fundamentally changing veterinary medicine. Pet Partners, located in Saratoga Springs, NY, owns and operates veterinary hospitals across the United States and we’re growing fast! We are looking for a Software Engineer with strong coding and SQL experience. Your responsibilities will include working on our RESTful API, flash reporting pipeline, enterprise integrations, and intranet site. This position reports directly to the Chief Information Officer.

We support several different electronic medical records applications based on MS Windows and SQL Server for our practices and reporting. In addition we rely on an open source, Linux-based stack for internal development that includes Elixir and Phoenix, Postgres, and continuous integration via Team City. We have servers on premises, collocated, as well as on AWS. The ideal candidate for this position is a great developer who is also comfortable writing and optimizing SQL.

Responsibilities:

• Contribute stories to our agile backlog, contribute to sprints, and collaborate with users to validate solutions

• Write code against stories that relate to our internal API, flash reporting pipeline, enterprise integrations, and intranet site, including solid test coverage

• Design and develop solutions that extend our Kofax and Microsoft Dynamics platforms

• Contribute to the DevOps process for your code

• Thrive in a dynamic, collaborative culture with minimal supervision. You'll be a big part of a small team and your contributions will have a major impact on Pet Partners' success.

Skills and Qualifications:

• Experience with Elixir is desirable but not required, but you’ll need to be an expert with at least one relevant language such as Python, Java, Go, or preferably a functional language like Clojure, Haskell, Scala, Erlang, F#, or LISP

• Significant SQL experience

• Agile, unit testing, continuous integration, and dev/ops experience are helpful

• Previous working experience with Git

• Hands on experience with common data structures like XML and JSON Experience with REST API’s

Please note that this position is full time in Saratoga Springs and is not eligible for working remotely.

Get information on how to apply for this position.

# The sage and the seven horses

A classic puzzle of mathematics goes like this:

A father dies and his will states that his elder daughter should receive half his horses, the son should receive one-quarter of the horses, and the younger daughter should receive one-eighth of the horses. Unfortunately, there are seven horses. The siblings are arguing about how to divide the seven horses when a passing sage hears them. The siblings beg the sage for help. The sage donates his own horse to the estate, which now has eight. It is now easy to portion out the half, quarter, and eighth shares, and having done so, the sage's horse is unaccounted for. The three heirs return the surplus horse to the sage, who rides off, leaving the matter settled fairly.

(The puzzle is, what just happened?)

It's not hard to come up with variations on this. For example, picking three fractions at random, suppose the will says that the eldest child receives half the horses, the middle child receives one-fifth, and the youngest receives one-seventh. But the estate has only 59 horses and an argument ensues. All that is required for the sage to solve the problem is to lend the estate eleven horses. There are now 70, and after taking out the three bequests, horses remain and the estate settles its debt to the sage.

But here's a variation I've never seen before. This time there are 13 horses and the will says that the three children should receive shares of and . respectively. Now the problem seems impossible, because . But the sage is equal to the challenge! She leaps into the saddle of one of the horses and rides out of sight before the astonished heirs can react. After a day of searching the heirs write off the lost horse and proceed with executing the will. There are now only 12 horses, and the eldest takes half, or six, while the middle sibling takes one-third, or 4. The youngest heir should get three, but only two remain. She has just opened her mouth to complain at her unfair treatment when the sage rides up from nowhere and hands her the reins to her last horse.

# New Shake with better wildcard patterns

Summary: The new version of Shake supports ** patterns for directory wildcards.

I've just released Shake 0.15.6. Don't be mislead by the 0.0.1 increment of the release, it's got over 50 entries in the changelog since the last release. There are quite a few bug fixes, documentation improvements and optimisations.

One of the most user visible features is the new wildcard patterns. In the previous version Shake supported // for matching any number of directories and * for matching within a path component, so to match all C source files in src you could write:

src//*.c

In the new version of Shake you can also write:

src/**/*.c

The // patterns remain supported, but I intend to encourage use of ** in new code if these patterns don't end up having any unforeseen problems. The advantages of the patterns in the new version are:

• The ** patterns seem to be the defacto standard nowadays, being used by rsync, Ant, Gradle, Jenkins etc.
• People would often write "src" </> "//*.c", which gives the unexpected result of //*.c. With ** you aren't overloading directories at the same time so everything works out as expected.
• ** patterns only match relative files, not absolute ones, which is what you usually want in a build system. If you want to match absolute files use */**.
• The semantics of patterns were a bit confusing for things like /// - I've now given them precise semantics, but ** avoids this confusion.
• I've optimised the pattern matching for both flavours, so there is more precomputation and less backtracking (in practice I don't think that makes any difference).
• I've optimised directory traversal using a file pattern, so it doesn't list directories that can't possibly match, which gives a significant speedup.

For this release I've also improved the website at shakebuild.com with more documentation - hopefully it is useful.

# Thackeray's illustrations for Vanity Fair

Last month I finished reading Thackeray’s novel Vanity Fair. (Related blog post.) Thackeray originally did illustrations for the novel, but my edition did not have them. When I went to find them online, I was disappointed: they were hard to find and the few I did find were poor quality and low resolution.

 Before After (click to enlarge)

The illustrations are narratively important. Jos Osborne dies suspiciously; the text implies that Becky has something to do with it. Thackeray's caption for the accompanying illustration is “Becky’s Second Appearance in the Character of Clytemnestra”. Thackeray’s depiction of Miss Swartz, who is mixed-race, may be of interest to scholars.

I bought a worn-out copy of Vanity Fair that did have the illustrations and scanned them. These illustrations, originally made around 1848 by William Makepeace Thackeray, are in the public domain. In the printing I have (George Routeledge and Sons, New York, 1886) the illustrations were 9½cm × 12½ cm. I have scanned them at 600 dpi.

Large thumbails

Unfortunately, I was only able to find Thackeray’s full-page illustrations. He also did some spot illustrations, chapter capitals, and so forth, which I have not been able to locate.

Share and enjoy.

# Introduction

In their paper Betancourt et al. (2014), the authors give a corollary which starts with the phrase “Because the manifold is paracompact”. It wasn’t immediately clear why the manifold was paracompact or indeed what paracompactness meant although it was clearly something like compactness which means that every cover has a finite sub-cover.

It turns out that every manifold is paracompact and that this is intimately related to partitions of unity.

Most of what I have written below is taken from some hand-written anonymous lecture notes I found by chance in the DPMMS library in Cambridge University. To whomever wrote them: thank you very much.

# Limbering Up

Let $\{U_i : i \in {\mathcal{I}}\}$ be an open cover of a smooth manifold $M$. A partition of unity on M, subordinate to the cover $\{U_i : i \in {\mathcal{I}}\}$ is a finite collection of smooth functions

$\displaystyle X_j : M^n \longrightarrow \mathbb{R}_+$

where $j = 1, 2, \ldots N$ for some $N$ such that

$\displaystyle \sum_{j = 0}^N X_j(x) = 1 \,\mathrm{for all}\, x \in M$

and for each $j$ there exists $i(j) \in {\mathcal{I}}$ such that

$\displaystyle {\mathrm{supp}}{X_j} \subset U_{i(j)}$

We don’t yet know partitions of unity exist.

First define

$\displaystyle f(t) \triangleq \begin{cases} 0 & \text{if } t \leq 0 \\ \exp{(-1/t)} & \text{if } t > 0 \\ \end{cases}$

Techniques of classical analysis easily show that $f$ is smooth ($t=0$ is the only point that might be in doubt and it can be checked from first principles that $f^{(n)}(0) = 0$ for all $n$).

Next define

\displaystyle \begin{aligned} g(t) &\triangleq \frac{f(t)}{f(t) + f(1 - t)} \\ h(t) &\triangleq g(t + 2)g(2 - t) \end{aligned}

Finally we can define $F: \mathbb{R}^n \rightarrow \mathbb{R}$ by $F(x) = h(\|x\|)$. This has the properties

• $F(x) = 1$ if $\|x\| \leq 1$
• $0 \leq F(x) \leq 1$
• $F(x) = 0$ if $\|x\| > 2$

Now take a point $p \in M$ centred in a chart $(U_p, \phi_p)$ so that, without loss of generality, $B(0,3) \subseteq \phi_p(U_p)$ (we can always choose $r_p$ so that the open ball $B(0,3r_p) \subseteq \phi'_p(U_p)$ and then define another chart $(U_p, \phi_p)$ with $\phi_p(x) = \phi'_p(x)/\|x\|$).

Define the images of the open and closed balls of radius $1$ and $2$ respectively

\displaystyle \begin{aligned} V_p &= \phi_p^{-1}(B(0,1)) \\ W_p &= \phi_p^{-1}\big(\overline{B(0,2)}\big) \\ \end{aligned}

and further define bump functions

$\displaystyle \psi_p(y) \triangleq \begin{cases} F(\phi_p(y)) & \text{if } y \in U_p\\ 0 & \text{otherwise} \\ \end{cases}$

Then $\psi_p$ is smooth and its support lies in $W_p \subset U_p$.

By compactness, the open cover $\{V_p : p \in M\}$ has a finite subcover $\{V_{p_1},\ldots,V_{p_K}\}$. Now define

$\displaystyle X_j : M^n \longrightarrow \mathbb{R}_+$

by

$\displaystyle X_j(y) = \frac{\psi_{p_j}(y)}{\sum_{i=1}^K \psi_{p_i}(y)}$

Then $X_j$ is smooth, ${\mathrm{supp}}{X_j} = {\mathrm{supp}}{\psi_{p_j}} \subset U_{p_j}$ and $\sum_{j=1}^K X_j(y) = 1$. Thus $\{X_j\}$ is the required partition of unity.

# Paracompactness

Because $M$ is a manifold, it has a countable basis $\{A_i\}_{i \in \mathbb{N}}$ and for any point $p$, there must exist $A_i \subset V_p$ with $p \in A_i$. Choose one of these and call it $V_{p_i}$. This gives a countable cover of $M$ by such sets.

Now define

$\displaystyle L_1 = W_{p_1} \subset V_{p_1} \cup V_{p_2} \cup \ldots \cup V_{p_{i(2)}}$

where, since $L_1$ is compact, $V_{p_2}, \ldots, V_{p_{i(2)}}$ is a finite subcover.

And further define

$\displaystyle L_n = W_{p_1} \cup W_{p_2} \cup \ldots \cup W_{p_{i(n)}} \subset V_{p_1} \cup V_{p_2} \cup \ldots \cup V_{p_{i(n+1)}}$

where again, since $L_n$ is compact, $V_{p_{i(n)+1}}, \ldots, V_{p_{i(n+1)}}$ is a finite subcover.

Now define

\displaystyle \begin{aligned} K_n &= L_n \setminus {\mathrm{int}}{L_{n-1}} \\ U_n &= {\mathrm{int}}(L_{n+1}) \setminus L_n \end{aligned}

Then $K_n$ is compact, $U_n$ is open and $K_n \subset U_n$. Furthermore, $\bigcup_{n \in \mathbb{N}} K_n = M$ and $U_n$ only intersects with $U_{n-1}$ and $U_{n+1}$.

Given any open cover ${\mathcal{O}}$ of $M$, each $K_n$ can be covered by a finite number of open sets in $U_n$ contained in some member of ${\mathcal{O}}$. Thus every point in $K_n$ can be covered by at most a finite number of sets from $U_{n-1}, U_n$ and $U_{n+1}$ and which are contained in some member of ${\mathcal{O}}$. This is a locally finite refinement of ${\mathcal{O}}$ and which is precisely the definition of paracompactness.

To produce a partition of unity we define bump functions $\psi_j$ as above on this locally finite cover and note that locally finite implies that $\sum_j \psi_j$ is well defined. Again, as above, define

$\displaystyle X_j(y) = \frac{\psi_{j}(y)}{\sum_{i=1} \psi_{i}(y)}$

to get the required result.

# Bibliography

Betancourt, M. J., Simon Byrne, Samuel Livingstone, and Mark Girolami. 2014. “The Geometric Foundations of Hamiltonian Monte Carlo,” October, 45. http://arxiv.org/abs/1410.5110.

# How to recover lost files added to Git but not committed

If you lose something [in Git], don't panic. There's a good chance that you can find someone who will be able to hunt it down again.

I was not expecting to have a demonstration ready so soon. But today I finished working on a project, I had all the files staged in the index but not committed, and for some reason I no longer remember I chose that moment to do git reset --hard, which throws away the working tree and the staged files. I may have thought I had committed the changes. I hadn't.

If the files had only been in the working tree, there would have been nothing to do but to start over. Git does not track the working tree. But I had added the files to the index. When a file is added to the Git index, Git stores it in the repository. Later on, when the index is committed, Git creates a commit that refers to the files already stored. If you know how to look, you can find the stored files even before they are part of a commit.

Each file added to the Git index is stored as a “blob object”. Git stores objects in two ways. When it's fetching a lot of objects from a remote repository, it gets a big zip file with an attached table of contents; this is called a pack. Getting objects from a pack can be a pain. Fortunately, not all objects are in packs. When when you just use git-add to add a file to the index, git makes a single object, called a “loose” object. The loose objects is basically the file contents, gzipped, with a header attached. At some point Git will decide there are too many loose objects and assemble them into a pack.

To make a loose object from a file, the contents of the file are checksummed, and the checksum is used as the name of the object file in the repository and as an identifier for the object, exactly the same as the way git uses the checksum of a commit as the commit's identifier. If the checksum is 0123456789abcdef0123456789abcdef01234567, the object is stored in

    .git/objects/01/23456789abcdef0123456789abcdef01234567


The pack files are elsewhere, in .git/objects/pack.

So the first thing I did was to get a list of the loose objects in the repository:

    cd .git/objects
find ?? -type f  | perl -lpe 's#/##' > /tmp/OBJ


This produces a list of the object IDs of all the loose objects in the repository:

    00f1b6cc1dfc1c8872b6d7cd999820d1e922df4a
0093a412d3fe23dd9acb9320156f20195040a063
01f3a6946197d93f8edba2c49d1bb6fc291797b0
…
ffd505d2da2e4aac813122d8e469312fd03a3669
fff732422ed8d82ceff4f406cdc2b12b09d81c2e


There were 500 loose objects in my repository. The goal was to find the eight I wanted.

There are several kinds of objects in a Git repository. In addition to blobs, which represent file contents, there are commit objects, which represent commits, and tree objects, which represent directories. These are usually constructed at the time the commit is done. Since my files hadn't been committed, I knew I wasn't interested in these types of objects. The command git cat-file -t will tell you what type an object is. I made a file that related each object to its type:

    for i in $(cat /tmp/OBJ); do echo -n "$i ";
git type $i; done > /tmp/OBJTYPE  The git type command is just an alias for git cat-file -t. (Funny thing about that: I created that alias years ago when I first started using Git, thinking it would be useful, but I never used it, and just last week I was wondering why I still bothered to have it around.) The OBJTYPE file output by this loop looks like this:  00f1b6cc1dfc1c8872b6d7cd999820d1e922df4a blob 0093a412d3fe23dd9acb9320156f20195040a063 tree 01f3a6946197d93f8edba2c49d1bb6fc291797b0 commit … fed6767ff7fa921601299d9a28545aa69364f87b tree ffd505d2da2e4aac813122d8e469312fd03a3669 tree fff732422ed8d82ceff4f406cdc2b12b09d81c2e blob  Then I just grepped out the blob objects:  grep blob /tmp/OBJTYPE | f 1 > /tmp/OBJBLOB  The f 1 command throws away the types and keeps the object IDs. At this point I had filtered the original 500 objects down to just 108 blobs. Now it was time to grep through the blobs to find the ones I was looking for. Fortunately, I knew that each of my lost files would contain the string org-service-currency, which was my name for the project I was working on. I couldn't grep the object files directly, because they're gzipped, but the command git cat-file disgorges the contents of an object:  for i in$(cat /tmp/OBJBLOB ) ; do
git cat-file blob $i | grep -q org-service-curr && echo$i;
done > /tmp/MATCHES


The git cat-file blob $i produces the contents of the blob whose ID is in $i. The grep searches the contents for the magic string. Normally grep would print the matching lines, but this behavior is disabled by the -q flag—the q is for “quiet”—and tells grep instead that it is being used only as part of a test: it yields true if it finds the magic string, and false if not. The && is the test; it runs echo $i to print out the object ID $i only if the grep yields true because its input contained the magic string.

So this loop fills the file MATCHES with the list of IDs of the blobs that contain the magic string. This worked, and I found that there were only 18 matching blobs, so I wrote a very similar loop to extract their contents from the repository and save them in a directory:

    for i in $(cat /tmp/OBJBLOB ) ; do git cat-file blob$i |
grep -q org-service-curr
&& git cat-file blob $i > /tmp/rescue/$i;
done


Instead of printing out the matching blob ID number, this loop passes it to git cat-file again to extract the contents into a file in /tmp/rescue.

The rest was simple. I made 8 subdirectories under /tmp/rescue representing the 8 different files I was expecting to find. I eyeballed each of the 18 blobs, decided what each one was, and sorted them into the 8 subdirectories. Some of the subdirectories had only 1 blob, some had up to 5. I looked at the blobs in each subdirectory to decide in each case which one I wanted to keep, using diff when it wasn't obvious what the differences were between two versions of the same file. When I found one I liked, I copied it back to its correct place in the working tree.

Finally, I went back to the working tree and added and committed the rescued files.

It seemed longer, but it only took about twenty minutes. To recreate the eight files from scratch might have taken about the same amount of time, or maybe longer (although it never takes as long as I think it will), and would have been tedious.

But let's suppose that it had taken much longer, say forty minutes instead of twenty, to rescue the lost blobs from the repository. Would that extra twenty minutes have been time wasted? No! The twenty minutes spent to recreate the files from scratch is a dead loss. But the forty minutes to rescue the blobs is time spent learning something that might be useful in the future. The Git rescue might have cost twenty extra minutes, but if so it was paid back with forty minutes of additional Git expertise, and time spent to gain expertise is well spent! Spending time to gain expertise is how you become an expert!

Git is a core tool, something I use every day. For a long time I have been prepared for the day when I would try to rescue someone's lost blobs, but until now I had never done it. Now, if that day comes, I will be able to say “Oh, it's no problem, I have done this before!”

So if you lose something in Git, don't panic. There's a good chance that you can find someone who will be able to hunt it down again.

# Darcs News #113

### News and discussions

1. We will release Darcs 2.12 by the end of this month:
2. On May 6th-8th in Helsinki, a joint sprint Pijul/Darcs is organized:

### Issues resolved (5)

issue1807 Guillaume Hoffmann
issue2258 Guillaume Hoffmann
issue2393 Guillaume Hoffmann
issue2486 Ben Franksen
issue2494 Ben Franksen

### Patches applied (96)

2016-04-14 Guillaume Hoffmann
• move network-related tests to network dir, update command names
• resolve issue2393: remove whatsnew functionality from annotate
• help of log
2016-04-01 Ganesh Sittampalam
2016-03-29 Guillaume Hoffmann
• merge Repository.Util into Repository.State
• use B and BC instead of BS and BSC in Repository.State
• fix prelude import in Repository.State
• move maybeApplyToTree to Darcs.Patch.Apply
• move getRecursiveDarcsRepos to UI.Commands.Optimize
• move patchSetfMap to Darcs.Patch.Set
• move functions from Repository.Util to Patch.TokenReplace
• comment in Repository.Util
• refactor similar functions in Darcs.Repository.State
• inline a function in Clone
• no longer move index to index.old on mingw32 os
• hlint Darcs.Repository.State
• move External module from Repository to Util
• move Compat and Lock modules from Repository to Util
• merge Darcs.Repository.Ssh into Darcs.Util.Ssh
• make all optimize subcommands require hashed except upgrade
• move copySources from HashedRepo to Clone
• move Storage.Hashed modules to Darcs.Util
• remove unused function from Storage.Hashed.Plain
• fix compile error in Storage.Hashed.Test
• remove Storage.Hashed.Utils, move functions to Darcs.Utils.ByteString
• move index-related functions from Utils to Index
• removed unused or redundant functions from Storage.Hashed.Utils
• remove unused functions from Storage.Hashed.Hash
• hlint Storage.Hashed.Darcs
• reuse functions from Darcs.Util.Path
• remove unused Storage.Hashed.Packs
2016-03-09 Ben Franksen
• revert command: be quiet when requested
• accept issue2480: display unicode in patch content
• slightly improved chaotic indentations and import lists
• refactor: use maybeRestrictSubpaths
• refactor: use Darcs.Util.English.capitalize
• replace Darcs.Util.Printer.<> with <> from Data.Monoid; restructured haddocks
• small code layout fix in whatsnew command
• fixed Darcs.Util.English.andClauses and orClauses
• two simple refactorings in the conflict resolution code
• cleanup in revert command: use debugMessage for debug messages
• cleanup: break over-long line in D.R.Merge
• accept issue2494: output of darcs record with file arguments
• resolve issue2494: output of darcs record with file arguments
• several fixes and refactorings in fixSubPaths and maybeFixSubPaths
• added missing hsep function to D.Util.Printer
• added missing Eq and Show instances for ScanKnown
• add new type IncludeBoring for includeBoring option (was Bool)
• announceFiles only if verbosity /= Quiet
2016-03-05 Guillaume Hoffmann
• rm hashed-storage changelog
• add Storage/Hashed dir to checkdeps contrib script
• merge Storage.Hashed.AnchoredPath into Darcs.Util.Path
• explicit exports for Storage.Hashed.Utils
• list and comment exports of Storage.Hashed.Darcs and Plain
• remove Storage.Hashed
• resolve issue2258: improve patch index error message with suggestion
• resolve issue1807: clarify help of PAGER, DARCS_PAGER
• fix extra-source-file path in darcs.cabal
2016-03-07 Ben Franksen
• Darcs.UI.Commands.Unrecord: honor quiet option everywhere
• resolve issue2486: obliterate --not-in-remote -q should be more quiet
2016-02-25 Ganesh Sittampalam
• print the rebase status even after an error
• in runJob, pull repojob out to first-level decision
• refactor displaying suspended status a bit
• inline repoJobOnRebaseRepo
• use helper types to elide more cases in runJob
• elide some common cases in runJob
• reorder runJob cases by job type
• flatten runJob case statements
• add a helper type for flags needed for Rebase
• lift the runJob debugMessage call outside the case
• lift 'therepo' outside the runJob case statement
• express the V1/V2 patch type switch via a GADT too
• use SRepoType to control the rebase type in runJob
• remove commented-out cases for old TreeJob
• drop unnecessary constraints
• break out a runJob function
• drop CarrierType - it can't ever be Rebasing p now
• drop RecontextRebase
• drop NameHack
• inline MaybeInternal module into Named.Wrapped
• make the Rebase import qualified
• Introduce RebaseP to replace Rebasing type
• add 'activecontents' to replace 'patchcontents' for use in conflict resolution
• stop Convert using Wrapped.patchcontents
• flip dependency between Named.Wrapped and Rebase.Container
• add wrapper type around 'Named'
See darcs wiki entry for details.

# Basic HTTP auth with Scotty

Not so long ago, I needed to write a web app to automate the recording of our Haskell podcast, Bananas and Lenses.

<figure> </figure>

To build it, I chose a lightweight Haskell web framework called Scotty. There is another lightweight Haskell web framework called Spock. Both start with the letter S and are characters from Star Trek, and I have little hope ever being able to tell which is which by name. I can say though that I enjoyed working with the one I happened to pick.

So, anyway, I needed to ensure that only my co-hosts and I could access the app. In such a simple scenario, basic HTTP auth is enough. I did a quick google search for “scotty basic auth”, but all I found was this gist in which the headers are extracted by hand. Ugh.

Indeed, at the time of writing, Scotty itself does not seem to provide any shortcuts for basic auth. And yet the solution is simple and beautiful; you just need to step back to see it. Scotty is based on WAI, the Haskell web application interface, and doesn’t attempt to hide that fact. On the contrary, it conveniently exposes the function

middleware :: Middleware -> ScottyM ()

which “registers” a WAI wrapper that runs on every request. And sure enough, WAI (wai-extra) provides an HttpAuth module.

To put everything together, here’s a minimal password-protected Scotty application (works with Stackage lts-5.1).

{-# LANGUAGE OverloadedStrings #-}
import Web.Scotty
import Network.Wai.Middleware.HttpAuth
import Data.SecureMem -- for constant-time comparison
import Lucid -- for HTML generation

password = secureMemFromByteString "An7aLasi" -- https://xkcd.com/221/

main :: IO ()
main = scotty 8000 $do middleware$ basicAuth (\u p -> return $u == "user" && secureMemFromByteString p == password) "Bananas and lenses recording" get "/" . html . renderText$ do
doctype_
html_ $do head_$ do
title_ "Bananas and lenses recording"

body_ $h1_ "Hello world!" Two security-related points: 1. Data.SecureMem is used to perform constant-time comparison to avoid a timing attack. 2. Ideally, the whole thing should be run over https (as the password is submitted in clear), but this is outside of the scope of this article. ### Functional Jobs # OCaml server-side developer at Ahrefs Research (Full-time) ## Who we are Ahrefs Research is a San Francisco branch of Ahrefs Pte Ltd (Singapore), which runs an internet-scale bot that crawls whole Web 24/7, storing huge volumes of information to be indexed and structured in timely fashion. On top of that Ahrefs is building analytical services for end-users. Ahrefs Research develops a custom petabyte-scale distributed storage to accommodate all that data coming in at high speed, focusing on performance, robustness and ease of use. Performance-critical low-level part is implemented in C++ on top of a distributed filesystem, while all the coordination logic and communication layer, along with API library exposed to the developer is in OCaml. We are a small team and strongly believe in better technology leading to better solutions for real-world problems. We worship functional languages and static typing, extensively employ code generation and meta-programming, value code clarity and predictability, constantly seek out to automate repetitive tasks and eliminate boilerplate, guided by DRY and following KISS. If there is any new technology that will make our life easier - no doubt, we'll give it a try. We rely heavily on opensource code (as the only viable way to build maintainable system) and contribute back, see e.g. https://github.com/ahrefs . It goes without saying that our team is all passionate and experienced OCaml programmers, ready to lend a hand or explain that intricate ocamlbuild rule. Our motto is "first do it, then do it right, then do it better". ## What we need Ahrefs Research is looking for backend developer with deep understanding of operating systems, networks and taste for simple and efficient architectural designs. Our backend is implemented mostly in OCaml and some C++, as such proficiency in OCaml is very much appreciated, otherwise a strong inclination to intensively learn OCaml in a short term will be required. Understanding of functional programming in general and/or experience with other FP languages (F#,Haskell,Scala,Scheme,etc) will help a lot. Knowledge of C++ and/or Rust is a plus. The candidate will have to deal with the following technologies on the daily basis: • networks & distributed systems • 4+ petabyte of live data • OCaml • linux • git The ideal candidate is expected to: • Independently deal with and investigate bugs, schedule tasks and dig code • Make argumented technical choice and take responsibility for it • Understand the whole technology stack at all levels : from network and userspace code to OS internals and hardware • Handle full development cycle of a single component, i.e. formalize task, write code and tests, setup and support production (devops) • Approach problems with practical mindset and suppress perfectionism when time is a priority These requirements stem naturally from our approach to development with fast feedback cycle, highly-focused personal areas of responsibility and strong tendency to vertical component splitting. ## What you get We provide: • Competitive salary • Modern office in San Francisco SOMA (Embarcadero) • Informal and thriving atmosphere • First-class workplace equipment (hardware, tools) • No dress code Get information on how to apply for this position. ### Robert Harper # Practical Foundations for Programming Languages, Second Edition Today I received my copies of Practical Foundations for Programming Languages, Second Edition on Cambridge University Press. The new edition represents a substantial revision and expansion of the first edition, including these: 1. A new chapter on type refinements has been added, complementing previous chapters on dynamic typing and on sub-typing. 2. Two old chapters were removed (general pattern matching, polarization), and several chapters were very substantially rewritten (higher kinds, inductive and co-inductive types, concurrent and distributed Algol). 3. The parallel abstract machine was revised to correct an implied extension that would have been impossible to carry out. 4. Numerous corrections and improvements were made throughout, including memorable and pronounceable names for languages. 5. Exercises were added to the end of each chapter (but the last). Solutions are available separately. 6. The index was revised and expanded, and some conventions systematized. 7. An inexcusably missing easter egg was inserted. I am grateful to many people for their careful reading of the text and their suggestions for correction and improvement. In writing this book I have attempted to organize a large body of material on programming language concepts, all presented in the unifying framework of type systems and structural operational semantics. My goal is to give precise definitions that provide a clear basis for discussion and a foundation for both analysis and implementation. The field needs such a foundation, and I hope to have helped provide one. Filed under: Programming, Research, Teaching ### FP Complete # The Stackage data flow I recently wrote up the Stackage data flow. The primary intent was to assist the rest of the Stackage curation team see how all the pieces fit together. However, it may also be of general interest to the rest of the community. In particular, some of the components used are not widely known and may be beneficial for completely separate projects (such as all-cabal-metadata). Please check out the above linked copy of the file for the most up-to-date content. For convenience, I'm copying in the current content as of publication time below. The Stackage project is really built on top of a number of different subcomponents. This page covers how they fit together. The Stackage data flow diagram gives a good bird's-eye view: ## Inputs There are three inputs into the data flow: • Hackage is the upstream repository of all available open source Haskell packages that are part of our ecosystem. Hackage provides both cabal file metadata (via the 00-index.tar file) and tarballs of the individual packages. • build-constraints.yaml is the primary Stackage input file. This is where package maintainers can add packages to the Stackage package set. This also defines upper bounds, skipped tests, and a few other pieces of metadata. • stackage-content is a Github repository containing static file content served from stackage.org ## Travis For various reasons, we leverage Travis CI for running some processes. In particular: • all-cabal-files clones all cabal files from Hackage's 00-index.tar file into a Git repository without any modification • all-cabal-hashes is mostly the same, but also includes cryptographic hashes of the package tarballs for more secure download (as leveraged by Stack. It is powered by all-cabal-hashes-tool • all-cabal-packages uses hackage-mirror to populate the hackage.fpcomplete.com mirror of Hackage, which provides S3-backed high availability hosting of all package tarballs • all-cabal-metadata uses all-cabal-metadata-tool to query extra metadata from Hackage about packages and put them into YAML files. As we'll see later, this avoids the need to make a lot of costly calls to Hackage APIs Travis does not currently provide a means of running jobs on a regular basis. Therefore, we have a simple cron job on the Stackage build server that triggers each of the above builds every 30 minutes. ## stackage-curator The heart of running Stackage builds is the stackage-curator tool. We run this on a daily basis on the Stackage build server for Stackage Nightly, and on a weekly basis for LTS Haskell. The build process is highly automated and leverages Docker quite a bit. stackage-curator needs to know about the most recent versions of all packages, their tarball contents, and some metadata, all of which it gets from the Travis-generated sources mentioned in the previous section. In addition, it needs to know about build constraints, which can come from one of two places: • When doing an LTS Haskell minor version bump (e.g., building lts-5.13), it grabs the previous version (e.g., lts-5.12) and converts the previous package set into constraints. For example, if lts-5.12 contains the package foo-5.6.7, this will be converted into the constraint foo >= 5.6.7 && < 5.7. • When doing a Stackage Nightly build or LTS Haskell major version bump (e.g., building lts-6.0), it grabs the latest version of the build-constraints.yaml file. By combining these constraints with the current package data, stackage-curator can generate a build plan and check it. (As an aside, this build plan generation and checking also occurs every time you make a pull request to the stackage repo.) If there are version bounds problems, one of the Stackage curators will open up a Github issue and will add upper bounds, temporarily block a package, or some other corrective action. Once a valid build plan is found, stackage-curator will build all packages, build docs, and run test suites. Assuming that all succeeds, it generates some artifacts: • Uploads the build plan as a YAML file to either stackage-nightly or lts-haskell • Uploads the generated Haddock docs and a package index (containing all used .cabal files) to haddock.stackage.org. ## stackage-server-cron On the Stackage build server, we run the stackage-server-cron executable regularly, which generates: • A SQLite database containing information on snapshots, the packages they contain, Hackage metadata about packages, and a bit more. This database is uploaded to S3. • A Hoogle database for each snapshot, which is also uploaded to S3 ## stackage-server The software running stackage.org is a relatively simple Yesod web application. It pulls data from the stackage-content repo, the SQLite database, the Hoogle databases, and the build plans for Stackage Nightly and LTS Haskell. It doesn't generate anything important of its own except for a user interface. ## Stack Stack takes advantage of many of the pieces listed above as well: • It by default uses the all-cabal-hashes repo for getting package metadata, and downloads package contents from the hackage.fpcomplete.com mirror (using the hashes in the repo for verification) • There are some metadata files in stackage-content which contain information on, for example, where to download GHC tarballs from to make stack setup work • Stack downloads the raw build plans for Stackage Nightly and LTS Haskell from the Github repo and uses them when deciding which packages to build for a given stack.yaml file ## April 13, 2016 ### Functional Jobs # Senior Software Engineer (Haskell) at Front Row Education (Full-time) # Position Senior Functional Web Engineer to join fast-growing education startup transforming the way 3+ million K-8 students learn Math and English. ## What you will be doing Architect, design and develop new applications, tools and distributed systems for the Front Row ecosystem in Haskell, Flow, PostgreSQL, Ansible and many others. You will get to work on your deliverable end-to-end, from the UX to the deployment logic. Once you're an integral part of the team you will act as Dev Lead and oversee the success of your team Mentor and support more junior developers in the organization Create, improve and refine workflows and processes for delivering quality software on time and without incurring debt Work at our offices in San Francisco as part of a very small (there's literally half a dozen of us!), world-class team of engineers with a track record of rapidly delivering valuable software to millions of users. Work closely with Front Row educators, product managers, customer support representatives and account executives to help the business move fast and efficiently through relentless automation. ## Why you should join Front Row Our mission is important to us, and we want it to be important to you as well: millions of students learn math using Front Row every month. Our early results show students improve twice as much while using Front Row than their peers who aren’t using the program. You’ll be THE first Senior Engineer ever at Front Row, which means you’ll have an immense impact on our company, product, and culture; you’ll have a ton of autonomy and responsibility; you’ll have equity to match the weight of this role. If you're looking for an opportunity to both grow and do meaningful work, surrounded and supported by like-minded professionals, this is THE place for you. You will be working side by side with many well known world-class personalities in the Haskell and Functional Programming community whose work you've likely used. Front Row is an active participant to the Open Source community and contributor to some of the most popular Haskell libraries. A lot of flexibility: while we all work towards the same goals, you’ll have a lot of autonomy in what you work on. You can work from home up to one day a week, and we have a very flexible untracked vacation days policy The company and its revenue are growing at a rocketship pace. Front Row is projected to make a massive impact on the world of education in the next few years. It's a once in a lifetime opportunity to join a small organization with great odds of becoming the Next Big Thing. ## Must haves • You have experience doing full-stack web development. You understand HTTP, networking, databases and the world of distributed systems. • You have functional programming experience. • Extreme hustle: you’ll be solving a lot of problems you haven’t faced before without the resources and the support of a giant organization. You must thrive on getting things done, whatever the cost. • Soft skills: we want you to move into a leadership position, so you must be an expert communicator ## Nice-to-haves • You have led a software development team before • You have familiarity with a functional stack (Haskell / Clojure / Scala / OCaml etc) • You understand and have worked all around the stack before, from infrastructure automation all the way to the frontend • You're comfortable with the Behavior-Driven Development style • You have worked at a very small startup before: you thrive on having a lot of responsibility and little oversight • You have worked in small and effective Agile/XP teams before • You have delivered working software to large numbers of users before ## Benefits • Competitive salary • Generous equity option grants • Medical, Dental, and Vision • Catered lunch and dinner 4 times a week • Equipment budget • One flexible work day per week • Working from downtown SF, very accessible location • Professional yet casual work environment Get information on how to apply for this position. ### Robert Harper # OPLSS 2016 The program for OPLSS 2016 has been announced, and we are open for registration! This year’s instance is being organized by Zena Ariola, Dan Licata, and me, with renewed emphasis this year on programming languages semantics and verification. Oregon Programming Languages Summer School June 20-July 2, 2016 Eugene, Oregon There are still spaces available at the 15th Annual Oregon Programming Languages Summer School (OPLSS). Please encourage your PhD students, masters students, advanced undergraduates, colleagues, and selves to attend! This year’s program is titled Types, Logic, Semantics, and Verification and features the following courses: * Programming Languages Background — Robert Harper, Carnegie Mellon University and Dan Licata, Wesleyan University * Category Theory Background — Ed Morehouse, Carnegie Mellon University * Logical Relations — Patricia Johann, Appalachian State University * Network Programming — Nate Foster, Cornell University * Automated Complexity Analysis — Jan Hoffman, Carnegie Mellon University * Separation Logic and Concurrency — Aleks Nanevski, IMDEA * Principles of Type Refinement — Noam Zeilberger, INRIA * Logical relations/Compiler verification — Amal Ahmed, Northeastern University Full information on the courses and registration and scholarships is available at https://www.cs.uoregon.edu/research/summerschool/. For more information, please email summerschool@cs.uoregon.edu. Filed under: Teaching Tagged: Summer School ### Daniil Frumin # Frobenius property of weak factorisation systems and Pi-types I’ve put together a note on the Frobenius property for weak factorisaion systems and it’s relation to models of type theory. Awodey and Warren described a way of obtaining a model of type theory with identity types from a model category structure/weak factorisation system. However, in absence of axioms for other type formers (specifically, Π-types), one is required to put an additional restriction on the weak factorisation system, in order to model identity elimination in an arbitrary context/with arbitrary parameters; specifically, it is required that cofibrations are stable under pullbacks along fibrations. In presence of Π-types, however, this is not necessary, as I tried to explain in the note. Tagged: category theory, type theory ### Derek Elkins # Understanding typing judgments ### Introduction For many people interested in type systems and type theory, their first encounter with the literature presents them with this: #frac(Gamma,x:tau_1 |--_Sigma e : tau_2)(Gamma |--_Sigma (lambda x:tau_1.e) : tau_1 -> tau_2) ->I# #frac(Gamma |--_Sigma f : tau_1 -> tau_2 \qquad Gamma |--_Sigma x : tau_1)(Gamma |--_Sigma f x : tau_2) ->E# Since this notation is ubiquitous, authors (reasonably) expect readers to already be familiar with it and thus provide no explanation. Because the notation is ubiquitous, the beginner looking for alternate resources will not escape it. All they will find is that the notation is everywhere but exists in myriad minor variations which may or may not indicate significant differences. At this point the options are: 1) to muddle on and hope understanding the notation isn’t too important, 2) look for introductory resources which typically take the form of$50+ 500+ page textbooks, or 3) give up.

To start, if you are not already familiar with it, get familiar with the Greek alphabet. It will be far easier to (mentally) read mathematical notation of any kind if you can say “Gamma x” rather than “right angle thingy x” or “upside-down L x”.

Using the example from the introduction, the whole thing is a rule. The “|->I|” part is just the name of the rule (in this case being short for “|->| Introduction”). This rule is only part of the definition of the judgment of the form:

#Gamma |--_Sigma e : tau#

The judgment can be viewed as a proposition and the rule is an “if-then” statement read from top to bottom. So the “|->I|” rule says, “if #Gamma, x : tau_1 |--_Sigma e : tau_2# then #Gamma |--_Sigma lambda x : tau_1.e : tau_1 -> tau_2#”. It is often profitable to read it bottom-up as “To prove #Gamma |--_Sigma lambda x : tau_1.e : tau_1 -> tau_2# you need to show #Gamma, x : tau_1 |--_Sigma e : tau_2#”.

So what is the judgment saying? First, the judgment is, in this case, a four argument relation. The arguments of this relation are #Gamma#, #Sigma#, #e#, and #tau#. We could say the name of this relation is the perspicuous #(_)|--_((_)) (_) : (_)#. Note that it does not make sense to ask what “⊢” means or what “:” means anymore than it makes sense to ask what “->” means in Haskell’s \ x -> e.4

In the context of type systems, #Gamma# is called the context, #Sigma# is called the signature, #e# is the term or expression, and #tau# is the type. Given this, I would read #Gamma |--_Sigma e : tau# as “the expression e has type tau in context gamma given signature sigma.” For the “#->E#” rule we have, additionally, multiple judgements above the line. These are joined together by conjunction, that is, we’d read “#->E#” as “if #Gamma |--_Sigma f : tau_1 -> tau_2# and #Gamma |--_Sigma x : tau_1# then #Gamma |--_Sigma f x : tau_2#

In most recent type system research multiple judgments are necessary to describe the type system, and so you may see things like #Gamma |-- e > tau# or #Gamma |-- e_1 "~" e_2#. The key thing to remember is that these are completely distinct relations that will have their own definitions (although the collection of them will often be mutually recursively defined).

### Inductively Defined Relations

#### Relations

Relations in set theory are boolean valued functions. Being programmers, and thus constructivists, we want evidence, so a relation |R : A xx B -> bb2| becomes a type constructor R : (A , B) -> Set. |R(a,b)| holds if we have a value (proof/witness) w : R a b. An inductively defined relation or judgment is then just a type constructor for an (inductive) data type. That means, if R is an inductively defined relation, then its definition is data R : A -> B -> Set where .... A rule is a constructor of this data type. A derivation is a value of this data type, and will usually be a tree-like structure. As a bit of ambiguity in the terminology (arguably arising from a common ambiguity in mathematical notation), it’s a bit more natural to use the term “judgment” to refer to something that can be (at the meta level) true or false. For example, we’d say |R(a,b)| is a judgment. Nevertheless, when we say something like “the typing judgment” it’s clear that we’re referring to the whole relation, i.e. |R|.

#### Parameters of the judgments

Since a judgment is a relation, we need to describe what the arguments to the relation look like. Typically something like BNF is used. The BNF definitions provide the types used as parameters to the judgments. It is common to use a Fortran-esque style where a naming convention is used to avoid the need to explicitly declare the types of meta-variables. For example, the following says meta-variables like #n#, #m#, and #n_1# are all natural numbers.

n, m ::= Z | S n

BNF definitions translate readily to algebraic data types.

data Nat : Set where
Z : Nat
S : Nat -> Nat

Agda note: Set is what is called * in Haskell. “Type” would be a better name. Also, these sidebars will cover details about Agda with the aim that readers unfamiliar with Agda don’t get tripped up by tangential details.

Sometimes it’s not possible to fully capture the constraints on well-formed syntax with BNF. In other words, only a subset of syntactically valid terms are well-formed. For example, Nat Nat is syntactically valid but is not well-formed. We can pick out that subset with a predicate, i.e. a unary relation. This is, of course, nothing but another judgment. As an example, if we wired the Maybe type into our type system, we’d likely have a judgment that looks like #tau\ tt"type"# which would include the following rule:

#frac(tau\ tt"type")(("Maybe"\ tau)\ tt"type")#

In a scenario like this, we’d also have to make sure the rules of our typing judgment also required the types involved to be well-formed. Modifying the example from the introduction, we’d get:

#frac(Gamma,x:tau_1 |--_Sigma e : tau_2 \qquad tau_1\ tt"type" \qquad tau_2\ tt"type")(Gamma |--_Sigma (lambda x:tau_1.e) : tau_1 -> tau_2) ->I#

#### A simple inductively defined relation in Agda

As a very simple example, let’s say we wanted to provide explicit evidence that one natural number was less than or equal to another in Agda. Scenarios like this are common in dependently typed programming, and so we’ll start with the Agda this time and then “informalize” it.

data _isLessThanOrEqualTo_ : Nat -> Nat -> Set where
Z<=n : {n : Nat} -> Z isLessThanOrEqualTo n
Sm<=Sn : {m : Nat} -> {n : Nat} -> m isLessThanOrEqualTo n -> (S m) isLessThanOrEqualTo (S n)

Agda notes: In Agda identifiers can contain almost any character so Z<=n is just an identifier. Agda allows any identifier to be used infix (or more generally mixfix). The underscores mark where the arguments go. So _isLessThanOrEqualTo_ is a binary infix operator. Finally, curly brackets indicate implicit arguments which can be omitted and Agda will “guess” their values. Usually, they’ll be obvious to Agda by unification.

In the informal notation, the types of the arguments are implied by the naming. n is a natural number because it was used as the metavariable (non-terminal) in the BNF for naturals. We also implicitly quantify over all free variables. In the Agda code, this quantification was explicit.

#frac()(Z <= n) tt"Z<=n"#

#frac(m <= n)(S m <= S n) tt"Sm<=Sn"#

Again, I want to emphasize that these are defining isLessThanOrEqualTo and |<=|. They can’t be wrong. They can only fail to coincide with our intuitions or to an alternate definition. A derivation that |2 <= 3| looks like:

In Agda:

twoIsLessThanThree : (S (S Z)) isLessThanOrEqualTo (S (S (S Z)))
twoIsLessThanThree = Sm<=Sn (Sm<=Sn Z<=n)

In the informal notation:

#frac(frac()(Z <= S Z))(frac(S Z <= S (S Z))(S (S Z) <= S (S (S Z)))#

### Big-step operational semantics

Here’s a larger example that also illustrates that these judgments do not need to be typing judgments. Here we’re defining a big-step operational semantics for the untyped lambda calculus.

x variable
v ::= λx.e
e ::= v | e e | x

In informal presentations, binders like #lambda# are handled in a fairly relaxed manner. While the details of handling binders are tricky and error-prone, they are usually standard and so authors assume readers can fill in those details and are aware of the concerns (e.g. variable capture). In Agda, of course, we’ll need to spell out the details. There are many approaches for dealing with binders with different trade-offs. One of the newer and more convenient approaches is parametric higher-order abstract syntax (PHOAS). Higher-order abstract syntax (HOAS) approaches allow us to reuse the binding structure of the host language and thus eliminate much of the work. Below, this is realized by the Lambda constructor taking a function as its argument. In a later section, I’ll use a different approach using deBruijn indices.

-- PHOAS approach to binding
mutual
data Expr (A : Set) : Set where
Val : Value A -> Expr A
App : Expr A -> Expr A -> Expr A
Var : A -> Expr A

data Value (A : Set) : Set where
Lambda : (A -> Expr A) -> Value A

-- A closed expression
CExpr : Set1
CExpr = {A : Set} -> Expr A

-- A closed expression that is a value
CValue : Set1
CValue = {A : Set} -> Value A

Agda note: Set1 is needed for technical reasons that are unimportant. You can just pretend it says Set instead. More important is that the definitions of Expr and Value are a bit different than the definition for _isLessThanOrEqualTo_. In particular, the argument (A : Set) occurs to the left of the colon. When an argument occurs to the left of the colon we say it parameterizes the data declaration and that it is a parameter. When it occurs to the right of the colon we say it indexes the data declaration and that it is an index. The difference is that parameters must occur uniformly in the return type of the data constructors while indexes can be different in each data constructor. The arguments of an inductively defined relation like _isLessThanOrEqualTo_ will always be indexes (though there could be additional parameters.)

#frac(e_1 darr lambda x.e \qquad e_2 darr v_2 \qquad e[x|->v_2] darr v)(e_1 e_2 darr v) tt"App"#

#frac()(v darr v) tt"Trivial"#

The #e darr v# judgment (read as “the expression #e# evaluates to the value #v#”) defines a call-by-value evaluation relation. #e[x|->v]# means “substitute #v# for #x# in the expression #e#”. This notation is not standardized; there are many variants. In more rigorous presentations this operation will be formally defined, but usually the authors assume you are familiar with it. In the #tt"Trivial"# rule, the inclusion of values into expressions is implicitly used. Note that the rule is restricted to values only.

The #tt"App"# rule specifies call-by-value because the #e_2# expression is evaluated and then the resulting value is substituted into #e#. For call-by-name, we’d omit the evaluation of #e_2# and directly substitute #e_2# for #x# in #e#. Whether #e_1# or #e_2# is evaluated first (or in parallel) is not specified in this example.

subst : {A : Set} -> Expr (Expr A) -> Expr A
subst (Var e) = e
subst (Val (Lambda b)) = Val (Lambda (λ a -> subst (b (Var a))))
subst (App e1 e2) = App (subst e1) (subst e2)

data _EvaluatesTo_ : CExpr -> CValue -> Set1 where
EvaluateTrivial : {v : CValue} -> (Val v) EvaluatesTo v
EvaluateApp : {e1 : CExpr} -> {e2 : CExpr}
-> {e : {A : Set} -> A -> Expr A}
-> {v2 : CValue} -> {v : CValue}
-> e1 EvaluatesTo (Lambda e)
-> e2 EvaluatesTo v2
-> (subst (e (Val v2))) EvaluatesTo v
-> (App e1 e2) EvaluatesTo v

The EvaluateTrivial constructor explicitly uses the Val injection of values into expressions. The EvaluateApp constructor starts off with a series of implicit arguments that introduce and quantify over the variables used in the rule. After those, each judgement above the line in the #tt"App"# rule, becomes an argument to the EvaluateApp constructor.

In this case ↓ is defining a functional relation, meaning for every expression there’s at most one value that the expression evaluates to. So another natural way to interpret ↓ is as a definition, in logic programming style, of a (partial) recursive function. In other words we can use the concept of mode from logic programming and instead of treating the arguments to ↓ as inputs, we can treat the first as an input and the second as an output.

↓ gives rise to a partial function because not every expression has a normal form. For _EvaluatesTo_ this is realized by the fact that we simply won’t be able to construct a term of type e EvaluatesTo v for any v if e doesn’t have a normal form. In fact, we can use the inductive structure of the relationship to help prove that statement. (Unfortunately, Agda doesn’t present a very good experience for data types indexed by functions, so the proof is not nearly as smooth as one would like.)

### Type systems

Next we’ll turn to type systems which will present an even larger example, and will introduce some concepts that are specific to type systems (though, of course, they overlap greatly with concepts in logic due to the Curry-Howard correspondence.)

#### Terms and types

Below is an informal presentation of the polymorphic lambda calculus with explicit type abstraction and type application. An interesting fact about the polymorphic lambda calculus is that we don’t need any base types. Via Church-encoding, we can define types like natural numbers and lists.

α type variable
τ ::= τ → τ | ∀α. τ | α

x variable
c constant
v ::= λx:τ.e | Λτ.e | c
e ::= v | e e | e[τ] | x

In this case I’ll be using deBruijn indices to handle the binding structure of the terms and types. This means instead of writing |lambda x.lambda y. x|, you would write |lambda lambda 1| where the |1| counts how many binders (lambdas) you need to traverse to reach the binding site.

data TType : Set where
TTVar : Nat -> TType                    -- α
_=>_ : TType -> TType -> TType          -- τ → τ
Forall : TType -> TType                 -- ∀α. τ

mutual
data TExpr : Set where
TVal : TValue -> TExpr              -- v
TApp : TExpr -> TExpr -> TExpr      -- f x
TTyApp : TExpr -> TType -> TExpr    -- e[τ]
TVar : Nat -> TExpr                 -- x

data TValue : Set where
TLambda : TType -> TExpr -> TValue  -- λx:τ.e
TTyLambda : TExpr -> TValue         -- Λτ.e
TConst : Nat -> TValue              -- c

#### The Context

In formulating the typing rules we need to deal with open terms, that is terms which refer to variables that they don’t bind. This should only happen if some enclosing terms did bind those variables, so we need to keep track of the variables that have been bound by enclosing terms. For example, when type checking |lambda x:tau.x|, we’ll need to type check the subterm |x| which does not contain enough information in itself for us to know what the type should be. So, we keep track of what variables have been bound (and to what type) in a context and then we can just look up the expected type. When authors bother formally spelling out the context, it will look something like the following:

Γ ::= . | Γ, x:τ
Δ ::= . | Δ, α

We see that this is just a (snoc) list. In the first case, |Gamma|, it is a list of pairs of variables and types, i.e. an association list mapping variables to types. Often it will be treated as a finite mapping. In the second case, |Delta|, it is a list of type variables. Since I’m using deBruijn notation, there are no variables so we end up with a list of types in the first case. In the second case, we would end up with a list of nothing in particular, i.e. a list of unit, but that is isomorphic to a natural number. In other words, the only purpose of the type context, |Delta|, is to make sure we don’t use unbound variables, which in deBruijn notation just means we don’t have deBruijn indexes that try to traverse more lambdas than enclose them. The Agda code for the above is completely straight-forward.

data List (A : Set) : Set where
Nil : List A
_,_ : List A -> A -> List A

Context : Set
Context = List TType

TypeContext : Set
TypeContext = Nat

#### The Signature

Signatures keep track of what primitive, “user-defined” constants might exist. Often the signature is omitted since nothing particularly interesting happens with it. Indeed, that will be the case for us. Nevertheless, we see that the signature is just another association list mapping constants to types.

Σ ::= . | Σ, c:τ
Signature : Set
Signature = List TType

The main reason I included the signature, beyond just covering it for the cases when it is included, is that sometimes certain rules can be better understood as manipulations of the signature. For example, in logic, universal quantification is often described by a rule like:

#frac(Gamma |-- P[x|->c] \qquad c\ "fresh")(Gamma |-- forall x.P)#

What’s happening and what “freshness” is is made a bit clearer by employing a signature (which for logic is usually just a list of constants similar to our TypeContext):

#frac(Gamma |--_(Sigma, c) P[x|->c] \qquad c notin Sigma)(Gamma |--_Sigma forall x.P)#

#### Judgment

To define the typing rules we need two judgements. The first, #Delta |-- tau#, will be a simple judgement that says |tau| is a well formed type in |Delta|. This basically just requires that all variables are bound.

#frac(alpha in Delta)(Delta |-- alpha)#

#frac(Delta, alpha |-- tau)(Delta |-- forall alpha. tau)#

#frac(Delta |-- tau_1 \qquad Delta |-- tau_2)(Delta |-- tau_1 -> tau_2)#

The Agda is

data _<_ : Nat -> Nat -> Set where
Z<Sn : {n : Nat} -> Z < S n
Sn<SSm : {n m : Nat} -> n < S m -> S n < S (S m)

data _isValidIn_ : TType -> TypeContext -> Set where
TyVarJ : {n : Nat} -> {ctx : TypeContext} -> n < ctx -> (TTVar n) isValidIn ctx
TyArrJ : {t1 t2 : TType} -> {ctx : TypeContext} -> t1 isValidIn ctx -> t2 isValidIn ctx -> (t1 => t2) isValidIn ctx
TyForallJ : {t : TType} -> {ctx : TypeContext} -> t isValidIn (S ctx) -> (Forall t) isValidIn ctx

The meat is the following typing judgement, depending on the judgement defining well-formed types. I’m not really going to explain these rules because, in some sense, there is nothing to explain. Beyond explaining the notation itself, which was the point of the article, the below is “self-explanatory” in the sense that it is a definition, and whether it is a good definition or “meaningful” depends on whether we can prove the theorems we want about it.

#frac(c:tau in Sigma \qquad Delta |-- tau)(Delta;Gamma |--_Sigma c : tau) tt"Const"#

#frac(x:tau in Gamma \qquad Delta |-- tau)(Delta;Gamma |--_Sigma x : tau) tt"Var"#

#frac(Delta;Gamma |--_Sigma e_1 : tau_1 -> tau_2 \qquad Delta;Gamma |--_Sigma e_2 : tau_1)(Delta;Gamma |--_Sigma e_1 e_2 : tau_2) tt"App"#

#frac(Delta;Gamma |--_Sigma e : forall alpha. tau_1 \qquad Delta |-- tau_2)(Delta;Gamma |--_Sigma e[tau_2] : tau_1[alpha|->tau_2]) tt"TyApp"#

#frac(Delta;Gamma, x:tau_1 |--_Sigma e : tau_2 \qquad Delta |-- tau_1)(Delta;Gamma |--_Sigma (lambda x:tau_1.e) : tau_1 -> tau_2) tt"Abs"#

#frac(Delta, alpha;Gamma |--_Sigma e : tau)(Delta;Gamma |--_Sigma (Lambda alpha.e) : forall alpha. tau) tt"TyAbs"#

Here’s the corresponding Agda code. Note, all Agda is doing for us here is making sure we haven’t written self-contradictory nonsense. In no way is Agda ensuring that this is the “right” definition. For example, it could be the case (but isn’t) that there are no values of this type. Agda would be perfectly content to let us define a type that had no values.

tySubst : TType -> TType -> TType
tySubst t1 t2 = tySubst' t1 t2 Z
where tySubst' : TType -> TType -> Nat -> TType
tySubst' (TTVar Z) t2 Z = t2
tySubst' (TTVar Z) t2 (S _) = TTVar Z
tySubst' (TTVar (S n)) t2 Z = TTVar (S n)
tySubst' (TTVar (S n)) t2 (S d) = tySubst' (TTVar n) t2 d
tySubst' (t1 => t2) t3 d = tySubst' t1 t3 d => tySubst' t2 t3 d
tySubst' (Forall t1) t2 d = tySubst' t1 t2 (S d)

data _isIn_at_ {A : Set} : A -> List A -> Nat -> Set where
Found : {a : A} -> {l : List A} -> a isIn (l , a) at Z
Next : {a : A} -> {b : A} -> {l : List A} -> {n : Nat} -> a isIn l at n -> a isIn (l , b) at (S n)

data _hasType_inContext_and_given_ : TExpr -> TType -> Context -> TypeContext -> Signature -> Set where
ConstJ : {t : TType} -> {c : Nat}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> t isIn Sigma at c
-> t isValidIn Delta
-> (TVal (TConst c)) hasType t inContext Gamma and Delta given Sigma

VarJ : {t : TType} -> {x : Nat}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> t isIn Gamma at x
-> t isValidIn Delta
-> (TVar x) hasType t inContext Gamma and Delta given Sigma

AppJ : {t1 : TType} -> {t2 : TType} -> {e1 : TExpr} -> {e2 : TExpr}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e1 hasType (t1 => t2) inContext Gamma and Delta given Sigma
-> e2 hasType t1 inContext Gamma and Delta given Sigma
-> (TApp e1 e2) hasType t2 inContext Gamma and Delta given Sigma

TyAppJ : {t1 : TType} -> {t2 : TType} -> {e : TExpr}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType (Forall t1) inContext Gamma and Delta given Sigma
-> t2 isValidIn Delta
-> (TTyApp e t2) hasType (tySubst t1 t2) inContext Gamma and Delta given Sigma

AbsJ : {t1 : TType} -> {t2 : TType} -> {e : TExpr}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType t2 inContext (Gamma , t1) and Delta given Sigma
-> (TVal (TLambda t1 e)) hasType (t1 => t2) inContext Gamma and Delta given Sigma

TyAbsJ : {t : TType} -> {e : TExpr}
-> {Sigma : Signature} -> {Gamma : Context} -> {Delta : TypeContext}
-> e hasType t inContext Gamma and (S Delta) given Sigma
-> (TVal (TTyLambda e)) hasType (Forall t) inContext Gamma and Delta given Sigma

Here’s a typing derivation for the polymorphic constant function:


tyLam : TExpr -> TExpr
tyLam e = TVal (TTyLambda e)

lam : TType -> TExpr -> TExpr
lam t e = TVal (TLambda t e)

polyConst
: tyLam (tyLam (lam (TTVar Z) (lam (TTVar (S Z)) (TVar (S Z)))))    -- Λs.Λt.λx:t.λy:s.x
hasType (Forall (Forall (TTVar Z => (TTVar (S Z) => TTVar Z))))     -- ∀s.∀t.t→s→t
inContext Nil and Z
given Nil
polyConst = TyAbsJ (TyAbsJ (AbsJ (AbsJ (VarJ (Next Found) (TyVarJ Z<Sn))))) -- written by Agda

data False : Set where

Not : Set -> Set
Not A = A -> False

wrongType
: Not (tyLam (lam (TTVar Z) (TVar Z))   -- Λt.λx:t.x
hasType (Forall (TTVar Z))       -- ∀t.t
inContext Nil and Z
given Nil)
wrongType (TyAbsJ ())

Having written all this, we have not defined a type checking algorithm (though Agda’s auto tactic does a pretty good job); we’ve merely specified what evidence that a program is well-typed is. Explicitly, a type checking algorithm would be a function with the following type:

data Maybe (A : Set) : Set where
Nothing : Maybe A
Just : A -> Maybe A

typeCheck : (e : TExpr) -> (t : TType) -> (sig : Signature) -> Maybe (e hasType t inContext Nil and Z given sig)
typeCheck = ?

In fact, we’d want to additionally prove that this function never returns Nothing if there does exist a typing derivation that would give e the type t in signature sig. We could formalize this in Agda by instead giving typeCheck the following type:

data Decidable (A : Set) : Set where
IsTrue : A -> Decidable A
IsFalse : Not A -> Decidable A

typeCheckDec : (e : TExpr) -> (t : TType) -> (sig : Signature) -> Decidable (e hasType t inContext Nil and Z given sig)
typeCheckDec = ?

This type says that either typeCheckDec will return a typing derivation, or it will return a proof that there is no typing derivation. As the name Decidable suggests, this may not always be possible. Which is to say, type checking may not always be decidable. Note, we can always check that a typing derivation is valid — we just need to verify that we applied the rules correctly — what we can’t necessarily do is find such a derivation given only the expression and the type or prove that no such derivation exists. Similar concerns apply to type inference which could have one of the following types:

record Σ (T : Set) (F : T -> Set) : Set where
field
fst : T
snd : F fst

inferType : (e : TExpr) -> (sig : Signature) -> Maybe (Σ TType (λ t → e hasType t inContext Nil and Z given sig))
inferType = ?

inferTypeDec : (e : TExpr) -> (sig : Signature) -> Decidable (Σ TType (λ t → e hasType t inContext Nil and Z given sig))
inferTypeDec = ?

where Σ indicates a dependent sum, i.e. a pair where the second component (here of type e hasType t inContext Nil and Z given sig) depends on the first component (here of type TType). With type inference we have the additional concern that there may be multiple possible types an expression could have, and we may want to ensure it returns the “most general” type in some sense. There may not always be a good sense of “most general” type and user-input is required to pick out of the possible types.

Sometimes the rules themselves can be viewed as the defining rules of a logic program and thus directly provide an algorithm. For example, if we eliminate the rules, types, and terms related to polymorphism, we’d get the simply typed lambda calculus. A Prolog program to do type checking can be written in a few lines with a one-to-one correspondence to the type checking rules (and, for simplicitly, also omitting the signature):

lookup(z, [T|_], T).
lookup(s(N), [_|Ctx],T) :- lookup(N, Ctx, T).

typeCheck(var(N), Ctx, T) :- lookup(N, Ctx, T).
typeCheck(app(F,X), Ctx, T2) :- typeCheck(F, Ctx, tarr(T1, T2)), typeCheck(X, Ctx, T1).
typeCheck(lam(B), Ctx, tarr(T1, T2)) :- typeCheck(B, [T1|Ctx], T2).

2. Most dependently typed languages, such as Coq or Epigram would also be adequate.

3. Epigram is most notable by actually using this 2D syntax.

4. See this StackExchange answer for more discussion of this.

# It Is What It Is (And Nothing Else)

A recent discussion of introductory computer science education led to the topic of teaching recursion.  I was surprised to learn that students are being taught that recursion requires understanding something called a “stack” that is nowhere in evidence in their code.  Few, if any, students master the concept, which is usually “covered” only briefly.  Worst, they are encouraged to believe that recursion is a mysterious bit of esoterica that is best ignored.

And thus is lost one of the most important and beautiful concepts in computing.

The discussion then moved on to the implementation of recursion in certain inexplicably popular languages for teaching programming.  As it turns out, the compilers mis-implement recursion, causing unwarranted space usage in common cases.  Recursion is dismissed as problematic and unimportant, and the compiler error is elevated to a “design principle” — to be snake-like is to do it wrong.

And thus is lost one of the most important and beautiful concepts in computing.

And yet, for all the stack-based resistance to the concept, recursion has nothing to do with a stack.  Teaching recursion does not need any mumbo-jumbo about “stacks”.  Implementing recursion does not require a “stack”.  The idea that the two concepts are related is simply mistaken.

What, then, is recursion?  It is nothing more than self-reference, the ability to name a computation for use within the computation itself.  Recursion is what it is, and nothing more.  No stacks, no tail calls, no proper or improper forms, no optimizations, just self-reference pure and simple.  Recursion is not tied to “procedures” or “functions” or “methods”; one can have self-referential values of all types.

Somehow these very simple facts, which date back to the early 1930’s, have been replaced by damaging myths that impede teaching and using recursion in programs.  It is both a conceptual and a practical loss.  For example, the most effective methods for expressing parallelism in programs rely heavily on recursive self-reference; much would be lost without it.  And the allegation that “real programmers don’t use recursion” is beyond absurd: the very concept of a digital computer is grounded in recursive self-reference (the cross-connection of gates to form a latch).  (Which, needless to say, does not involve a stack.)  Not only do real programmers use recursion, there could not even be programmers were it not for recursion.

I have no explanation for why this terrible misconception persists.  But I do know that when it comes to programming languages, attitude trumps reality every time.  Facts?  We don’t need no stinking facts around here, amigo.  You must be some kind of mathematician.

If all the textbooks are wrong, what is right?  How should one explain recursion?  It’s simple.  If you want to refer to yourself, you need to give yourself a name.  “I” will do, but so will any other name, by the miracle of α-conversion.  A computation is given a name using a fixed point (not fixpoint, dammit) operator:  fix x is e stands for the expression e named x for use within e.  Using it, the textbook example of the factorial function is written thus:

fix f is fun n : nat in case n {zero => 1 | succ(n') => n * f n'}.

Let us call this whole expression fact, for convenience.  If we wish to evaluate it, perhaps because we wish to apply it to an argument, its value is

fun n : nat in case n {zero => 1 | succ(n') => n * fact n'}.

The recursion has been unrolled one step ahead of execution.  If we reach fact again, as we will for a positive argument,  fact is evaluated again, in the same way, and the computation continues.  There are no stacks involved in this explanation.

Nor is there a stack involved in the implementation of fixed points.  It is only necessary to make sure that the named computation does indeed name itself.  This can be achieved by a number of means, including circular data structures (non-well-founded abstract syntax), but the most elegant method is by self-application.  Simply arrange that a self-referential computation has an implicit argument with which it refers to itself.  Any use of the computation unrolls the self-reference, ensuring that the invariant is maintained.  No storage allocation is required.

Consequently, a self-referential functions such as

fix f is fun (n : nat, m:nat) in case n {zero => m | succ(n') => f (n',n*m)}

execute without needing any asymptotically significant space.  It is quite literally a loop, and no special arrangement is required to make sure that this is the case.  All that is required is to implement recursion properly (as self-reference), and you’re done.  There is no such thing as tail-call optimization.  It’s not a matter of optimization, but of proper implementation.  Calling it an optimization suggests it is optional, or unnecessary, or provided only as a favor, when it is more accurately described as a matter of getting it right.

So what, then, is the source of the confusion?  The problem seems to be a too-close association between compound expressions and recursive functions or procedures.  Consider the classic definition of factorial given earlier.  The body of the definition involves the expression

n * fact n'

where there is a pending multiplication to be accounted for.  Once the recursive call (to itself) completes, the multiplication can be carried out, and it is necessary to keep track of this pending obligation.  But this phenomenon has nothing whatsoever to do with recursion.  If you write

n * square n'

then it is equally necessary to record where the external call is to return its value.  In typical accounts of recursion, the two issues get confused, a regrettable tragedy of error.

Really, the need for a stack arises the moment one introduces compound expressions.  This can be explained in several ways, none of which need pictures or diagrams or any discussion about frames or pointers or any extra-linguistic concepts whatsoever.  The best way, in my opinion, is to use Plotkin’s structural operational semantics, as described in my Practical Foundations for Programming Languages (Second Edition) on Cambridge University Press.

There is no reason, nor any possibility, to avoid recursion in programming.  But folk wisdom would have it otherwise.  That’s just the trouble with folk wisdom, everyone knows it’s true, even when it’s not.

Update: Dan Piponi and Andreas Rossberg called attention to a pertinent point regarding stacks and recursion.  The conventional notion of a run-time stack records two distinct things, the control state of the program (such as subroutine return addresses, or, more abstractly, pending computations, or continuations), and the data state of the program (a term I just made up because I don’t know a better one, for managing multiple simultaneous activations of a given procedure or function).  Fortran (back in the day) didn’t permit multiple activations, meaning that at most one instance of a procedure can be in play at a given time.  One consequence is that α-equivalence can be neglected: the arguments of a procedure can be placed in a statically determined spot for the call.  As a member of the Algol-60 design committee Dijkstra argued, successfully, for admitting multiple procedure activations (and hence, with a little extra arrangement, recursive/self-referential procedures).  Doing so requires that α-equivalence be implemented properly; two activations of the same procedure cannot share the same argument locations.  The data stack implements α-equivalence using de Bruijn indices (stack slots); arguments are passed on the data stack using activation records in the now-classic manner invented by Dijkstra for the purpose.  It is not self-reference that gives rise to the need for a stack, but rather re-entrancy of procedures, which can arise in several ways, not just recursion.  Moreover, recursion does not always require re-entrancy—the so-called tail call optimization is just the observation that certain recursive procedures are not, in fact, re-entrant.  (Every looping construct illustrates this principle, albeit on an ad hoc basis, rather than as a general principle.)

Filed under: Programming, Teaching

# Worst practices should be hard

In this post I hope to persuade you that Haskell is well-adapted to software engineering in the large. To motivate this post, I would like to begin with a simple thought experiment.

Suppose that we could measure short-term programmer productivity for two hypothetical programming languages named "X" and "Y". In practice, we cannot easily compare productivity between two programming languages, but just pretend that we can for the purpose of this example.

First, we will measure the short-term productivity of languages "X" and "Y" when the programmer carefully follows all best practices. These are made up numbers:

"Best Practices" (whatever that means):                7             +-----+             |     |  5 Arbitrary   |     +-----+Productivity |     |     |   Units     |     |     |             |     |     |             |     |     |             +-----+-----+                X     Y   

Higher productivity is better, so from the above chart alone we might conclude that "X" is a better language than "Y", all other things equal.

However, all other things might not be equal and there might be other redeeming features of language "Y". Perhaps language "Y" excels when the programmer takes short cuts and throws best practices to the wind. Let's call this "worst practices" and measure developer productivity when cutting corners:

"Worst Practices" (whatever that means):                9             +-----+             |     |             |     | Arbitrary   |     |Productivity |     |   Units     |     |  3             |     +-----+             |     |     |             |     |     |             +-----+-----+                X     Y   

Weird! Language "X" still performs better! In fact, language "Y" did such a poor job that developer productivity decreased when they followed worst practices.

Therefore, the choice is clear: language "X" is strictly a better language, right?

Not necessarily!

#### Long-term productivity

I will actually argue that language "Y" is the superior language for large and long-lived projects.

All of our charts so far only measured short-term productivity. However, long-term productivity crucially depends on how much developers follow best practices. That's what makes them "best practices".

Perhaps we don't need to perform additional experiments to predict which language will perform better long-term. To see why, let's transpose our original two charts. This time we will plot short-term productivity for language "X" when following either best practices (i.e. "BP") or worst practices (i.e. "WP"):

Language "X":                9             +-----+             |     |  7             |     +-----+             |     |     | Arbitrary   |     |     |Productivity |     |     |   Units     |     |     |             |     |     |             |     |     |             +-----+-----+               WP    BP   

This chart highlights one major flaw in the design of language X: the language rewards worst practices! Therefore, developers who use language "X" will always be incentivized to do the wrong thing, especially when they are under deadline pressure.

In contrast, language "Y" rewards best practices!

Language "Y":                      5 Arbitrary         +-----+Productivity    3  |     |   Units     +-----+     |             |     |     |             |     |     |             +-----+-----+               WP    BP   

A developer using language "Y" can't help but follow best practices. The more deadline pressure there is, the greater the incentive to do the right thing.

Developers using language "X" might compensate by creating a culture that shames worst practices and glorifies best practices in order to realign incentives, but this does not scale well to large software projects and organizations. You can certainly discipline yourself. You can even police your immediate team to ensure that they program responsibly. But what about other teams within your company? What about third-party libraries that you depend on? You can't police an entire language ecosystem to do the right thing.

I believe that Haskell is the closest modern approximation to the ideal of language "Y". In other words, Haskell aligns short-term incentives with long-term incentives.

Best practices are obviously not a binary "yes/no" proposition or even a linear scale. There are multiple dimensions to best practices and some languages fare better on some dimensions and worse on others. Some dimensions where Haskell performs well are:

• Absence of null
• Limitation of effects
• Immutability
• Generic types
• Strongly typed records

For each of the above points I will explain how Haskell makes it easier to do the right thing than the wrong thing.

Haskell isn't perfect, though, so in the interests of balance I will also devote a section to the following areas where Haskell actively rewards worst practices and punishes best practices!

• Poor performance
• Excessive abstraction
• Unchecked exceptions

#### Absence of null

Haskell has no null/None/nil value built into the language but Haskell does have a Maybe type somewhat analogous to Option in Java. Maybe is defined within the language as an ordinary data type like this:

data Maybe a = Just a | Nothing

Maybe differs from null-like solutions in other languages because Maybe shows up in the type. That means that if I don't see Maybe in the type then I know for sure that the value cannot possibly be empty. For example, I can tell from the following type that z can never be None or null:

z :: String

This differs from many other languages where all values could potentially be null because there is no way to opt out of nullability.

The second thing that differentiates Maybe from other solutions is that you can't forget to handle the empty case. If you have a function that expects an String argument:

exclaim :: String -> Stringexclaim str = str ++ "!"

... you can't accidentally apply that function to a value of type Maybe String or you will get a type error:

>>> x = Just "Hello" :: Maybe String>>> exclaim x  -- TYPE ERROR!

Instead, you have to explicitly handle the empty case, somehow. For example, one approach is to use pattern matching and provide a default value when the argument is empty:

case x of    Just str -> exclaim str    Nothing  -> "Goodbye!"

Haskell rewards best practices with respect to nullable values by making it easier to program without them:

• everything is non-null by default
• nullable types are longer because they must explicitly declare nullability
• nullable code is longer because it must explicitly handle nullability

... therefore Haskell programmers tend to use nullable values sparingly because because it's the path of least resistance!

#### Limitation of input/output effects

Haskell treats input/output effects (IO for short) the same way that Haskell treats nullable values:

• Everything is IO-free by default
• IO types are longer because they must explicitly declare IO effects
• IO code is longer because it must explicitly propagate effects

Therefore Haskell programmers tend to err on the side of using effects sparingly because it's the path of least resistance!

#### Immutability

Haskell treats immutability the same way that Haskell treats nullable values and IO:

• Everything is immutable by default
• Mutable types are longer because they must explicitly declare state/references
• Mutable code is longer because it must thread state/references

Therefore Haskell programmers tend to err on the side of using state and mutable references sparingly because it's the path of least resistance!

#### Generic types

By default, the Haskell compiler infers the most general type for your code. For example, suppose that I define the following top-level function:

addAndShow x y = show (x + y)

I can omit the type signature for that function because the compiler will already infer the following most general type for addAndShow:

>>> :type addAndShowaddAndShow :: (Num a, Show a) => a -> a -> String

I actually have to go out of my way to provide a more specific type for that function. The only way I can get the compiler to infer a less general type is to provide an explicit type signature for addAndShow:

addAndShow :: Int -> Int -> StringaddAndShow x y = show (x + y)

In other words, I have to actively go out of my way to make functions less general. The path of least resistance is to just let the compiler infer the most general type possible!

#### Strongly typed records

If you come to Haskell from a Python or Clojure background you will definitely notice how painful it is to work with dictionaries/maps in Haskell. Haskell has no syntactic support for map literals; the best you can do is something like this:

myMap =    fromList        [ ("foo", "Hello!")        , ("bar", "1"     )        , ("baz", "2.0"   )        ]

You also can't store different types of values for each key; all values in a map must be the same type.

In contrast, defining new records in Haskell is extremely cheap compared to other languages and each field can be a distinct type:

data MyRecord = MyRecord    { foo :: String    , bar :: Int    , baz :: Double    }myRecord = MyRecord    { foo = "Hello"    , bar = 1    , baz = 2.0    }

Therefore, Haskell programmers will very predictably reach for records instead of maps/dictionaries because records are infinitely more pleasant to use.

#### Poor performance

Haskell doesn't always reward best practices, though. Haskell is one of the fastest functional programming languages in experienced hands, but the language very much rewards poor performance in beginner hands.

For example, Haskell's syntax and Prelude highly rewards linked lists over vectors/arrays. Even worse, the default String type is a linked list of characters, which is terrible for performance.

There are more efficient solutions to this problem, such as:

• Using the text library for high-performance text
• Using the vector library for high-performance arrays
• Using the OverloadedStrings and OverloadedLists extensions to overload list literal syntax and string literal syntax to support more efficient types

However, the path of least resistance is still to use linked lists and strings. This problem is so pervasive that I see even respected and experienced library authors who care a lot about performance succumb to the convenience of these less efficient types every once in a while.

#### Excessive abstraction

As you learn the language you begin to realize that Haskell makes some types of advanced abstraction very syntactically cheap and some people get carried away. I know that I'm guilty of this.

From an outside perspective, the typical progression of a Haskell programmer might look something like this:

• Week 1: "How r monad formed?"
• Week 2: "What is the difference between Applicative+Category and Arrow?"
• Week 3: "I'm beginning a PhD in polymorphic type family recursion constraints"
• Week 4: "I created my own language because Haskell's not powerful enough"

(Edit: I do not mean to downplay anybody's difficulty learning Haskell; this is just to illustrate that the language fosters and encourages "architecture astronauts")

The temptation to play with all of Haskell's abstract facilities is so strong for some (like me) because for those people abstraction is too fun!

#### Unchecked exceptions

Fun fact: did you know that the conventional way to throw checked exceptions in Haskell is to use monad transformers?

It's far more easy to just use throwIO to lump all exceptions under the IO type than to declare in the types all the possible exceptions that you might throw. That's at least partially checked in the sense that an IO in the type signature warns you about the possibility of exceptions even if it won't tell you which specific exceptions might be thrown.

However, throwIO is surprisingly not in the Prelude by default, but error is!

error is the worst way to throw a Haskell exception because it is:

• marked pure, so it can strike anywhere
• asynchronous, so it can strike at any time
• completely invisible at the type level
• triggered by accidental evaluation

A lot of people mistakenly use error instead of throwIO without realizing the consequences of doing so. I've seen otherwise very smart and talented Haskell programmers get this wrong.

Following exception best practices requires discipline and effort, which means that under pressure people will generally not take the time to propagate and catalog exceptions correctly.

#### Conclusions

Despite those flaws, I still believe that Haskell incentivizes the right behavior where it counts:

• minimizing null
• minimizing state
• minimizing effects
• minimizing weakly-typed maps/dictionaries

Those are the qualities that I see large projects fall down on time after time due to inexperience, developer churn, and time constraints.

More generally, programming languages should be designed with attention to incentives. Don't just focus on absolute productivity, but also pay attention to the relative productivity of best practices and worst practices. Worst practices should be hard!

# GHCid 0.6 Released

Summary: I've released a new version of GHCid, which can interrupt running tests.

I've just released version 0.6.1 of GHCid. To a first approximation, ghcid opens ghci and runs :reload whenever your source code changes, formatting the output to fit a fixed height console. Unlike other Haskell development tools, ghcid is intended to be incredibly simple - it works when nothing else does. This new version features:

Much faster: Since version 0.5 GHCid passes -fno-code to ghci when it makes sense, which is about twice as fast.

Interruptible test commands: Since version 0.4 ghcid has supported a --test flag to pass a test command (e.g. :main) which is run whenever the code is warning free. As of version 0.6 that command will be interrupted if it needs to :reload, allowing long running tests and persistent "tests" - e.g. spawning web servers or GUIs. Thanks to Reid Draper for showing it was possible as part of his ordeal project and Luigy Leon for merging that with GHCid.

Stack integration: If you have a stack.yaml function and a .stack-work directory it will use stack commands to run your project. Thanks to the Stack Team, in particular Michael Sloan, for helping get through all the hoops and providing the necessary functionality in Stack.

More restart/reload flags: It's been possible for a while to pass --restart to restart ghci if certain files change (e.g. the .cabal file). Now there is a separate --reload flag to cause :reload instead of a full restart, and both flags can take directories instead of individual files.

Major relayering: For 0.6 I significantly refactored much of the GHCid code. There has always been an underlying Language.Haskell.Ghcid API, and GHCid was built on top. With the new version the underlying library has been given a significant refactoring, mostly so that interruptions are handled properly without race conditions and with a sane multithreading story. On top of that is a new Session layer, which provides a session abstraction - a ghci instance which tracks more state (e.g. which warnings have been emitted for already loaded files). Then the Ghcid module builds on top, with much less state management. By simplifying and clarifying the responsibility of each layer certain issues such as leaking old ghci processes and obscure race conditions disappeared.

I've been making use of many of these features in the Shake website generator, which I invoke with:

ghcid --test=":main debug" --reload=parts --reload=../docs

This project uses Stack, so relies on the new stack integration. It runs :main debug as the test suite, which generates the website whenever the code reloads. Furthermore, if any of the parts (template files) or docs (Markdown pages) change the website regenerates. I can now edit the website, and saving it automatically regenerates the web pages within a second.

### Jasper Van der Jeugt

A week or two ago, I gave a talk at the Zurich Haskell Meetup about Haskell mistakes I have made in the past, and how you can fix them. The slides can be found here, and you can watch the talk below, thanks to Simon Meier’s prowess as a cameraman. I’m just glad I managed to strike a great pose in the video thumbnail.

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="315" src="https://www.youtube.com/embed/S3WGPuqfBLg" width="560"> </iframe>

# Introduction

persistent has just released a new major version. With this release, queries which only read data are distinguished, at the type level, from queries which write data. This makes using a read replica database safer and generally makes our types more informative. Breakage should be minimal for end users.

# Scaling a database

Many applications eventually reach the limits of a single database machine and work across multiple machines. Distributing state across multiple machines can introduce a great deal of complexity. Consequently, there are many approaches to dealing with multiple databases. The CAP theorem ensures that none of them is perfect (Is there ever a good impossibility theorem?). Front Row Education chose to go with streaming, native replication from a PostgreSQL write database to a PostgreSQL read database.

## Possible solutions

What's the best way to program against a write database with a read replica? Here are a few approaches:

### 1. Run opaque queries against chosen backend.

In this approach, we don't change any of the persistent machinery. selectList, insert and runSqlPool stay the same:

runSqlPool :: (MonadBaseControl IO m) => ReaderT SqlBackend m a -> Pool SqlBackend -> m a
writeBackend :: SqlBackend

insertStudent :: ReaderT SqlBackend IO StudentId
insertStudent = insert student

selectStudents :: ReaderT SqlBackend IO [Entity Student]
selectStudents = selectList ([] :: [Filter Student]) []

insertAndSelect :: IO [Entity Student]
insertAndSelect = do
_ <- runSqlPool (insertStudent >> insertStudent) writeBackend
runSqlPool selectStudents readBackend

We choose which backend to run our queries against at execution time. That is, we pass one backend to runSqlPool if we want to execute the query against the read database and a different backend if we want to execute our query against the write database.

This approach does work in the most basic sense of the word. But it's manual and error-prone. Nothing stops us from accidentally running a write query against a read database and getting an error at runtime. That's not the Haskell way! We'd be much better off encoding this read and write information in the query's type.

### 2. update, delete and insert write. select reads.

In this approach we create wrappers around SqlBackend called SqlReadBackend and SqlWriteBackend. Then, we specify that all selects (reads) will operate against the read database and all inserts, update, or delete (writes) will operate against the write database. We can intermix queries of different types with multiple (now type safe) calls to runSqlPool:

runSqlPool :: (MonadBaseControl IO m, IsSqlBackend backend) => ReaderT backend m a -> Pool backend -> m a
writeBackend :: SqlWriteBackend

insertStudent :: ReaderT SqlWriteBackend IO StudentId

insertAndSelect :: IO [Entity Student]
insertAndSelect = do
_ <- runSqlPool (insertStudent >> insertStudnet) writeBackend
runSqlPool selectStudents readBackend

Attempting to run insertStudent against on the readBackend will result in a type error. Nice!

Unfortunately, it will also result in a type error when attempting to run selectStudents against the writeBackend. Which is why we used two calls to runSqlPool in the above example. This inability to mix reads and writes in a single transaction is rather restrictive.

This approach also ignores problems of eventual consistency. Even under streaming replication, there is some lag (hopefully, only a few milliseconds or less) between the read database and the write database. If we can't run reads in the same transaction, on the same DB as writes we have a serious problem. In the above example, we have no guarantee that our student insertions will have propagated to the read DB in time for the select that immediately follows the insert.

### 3. update, delete and insert write. select can be used in a read or write context.

We must generalize our read operations so that we can still run them against the write database when we need to.

runSqlPool :: (MonadBaseControl IO m, IsSqlBackend backend) => ReaderT backend m a -> Pool backend -> m a
writeBackend :: SqlWriteBackend

insertStudent :: ReaderT SqlWriteBackend IO StudentId

insertAndSelect :: IO [Entity Student]
insertAndSelect =
runSqlPool (insertStudent >> insertStudent >> selectStudents) writeBackend

We now use type classes to say that write queries can only run against the write database but read queries can run against either type of database and we can defer the decision of where to run a read query until use. But in a safe way.

## persistent

The new version of persistent follows the third approach.

# Types as documentation

IO is sometimes referred to as Haskell's "sin bin". That is, a great number of effects end up marked as IO. Consequently, when you see IO in a type signature, it's hard to determine which effects that function uses. Does the function write to disk or get the current time? A more fine-grained type would make our types more informative.

Along similar lines, splitting the monolithic SqlPersistT into SqlReadT and SqlWriteT allows us to more clearly signal the capabilities leveraged inside a given function. When we see SqlReadT, we can be confident that the underlying database state hasn't changed.

# Breaking changes

This version of persistent shouldn't break application authors and end users of persistent. You can continue to use SqlBackend which is now an instance of SqlBackendCanRead and SqlBackendCanWrite.

Library authors may need to modify some type signatures to work with the new machinery.

For example,

get404
:: (MonadIO m, PersistStore backend, backend ~ PersistEntityBackend val, PersistEntity val)
=> Key val -> ReaderT backend m val

becomes

get404
:: (MonadIO m, PersistStore backend, BaseBackend backend ~ PersistEntityBackend val, PersistEntity val)
=> Key val -> ReaderT backend m val

which leverages

instance HasPersistBackend SqlBackend where
type BaseBackend SqlBackend = SqlBackend
instance HasPersistBackend SqlWriteBackend where
type BaseBackend SqlWriteBackend = SqlBackend

from persistent.

This new concept of BaseBackend allows us to still assign a unique type of backend to each PersistEntity despite the availability of SqlReadBackend, SqlWriteBackend and SqlBackend.

# Conclusion

If you have a read replica, you can now access it more safely. Even if you don't, your types are now a little more informative. And you get this for almost free!

# Stackifying the cookbook

Not too long ago, Sibi did a great job of cleaning up the Yesod cookbook and moving it to its own Github repository. The cookbook has been a collaborative project from the community, and has accumulated lots of useful examples over the years.

I'd now like to encourage others to help further improve it. As a recent Reddit post brought to my attention, some of the cookbook examples do not compile with recent versions of Yesod. I'd like to encourage the following:

• Look through the cookbook and update examples to compile with the latest version of Yesod
• Add Stack script interpreter lines to the beginning of each example

If you don't know about that second point, you can read the docs from the Stack website. But you may find it easier to just look at the commit I made yesterday, or even just this part of it:

#!/usr/bin/env stack
{- stack
--resolver lts-5.10
--install-ghc
runghc
--package yesod
--package yesod-static
--package persistent-sqlite
-}

With this addition, you can save the example into any .hs file and run it with stack foo.hs (or chmod +x foo.hs && ./foo.hs). This gives three really important advantages over an unadorned example:

• It clearly documents which version of GHC and all dependencies are being used (via the resolver line)
• It states which packages need to be available. Note that I'm not providing an exhaustive list of packages, since depending on yesod will automatically include all of its dependencies (e.g., yesod-form, warp)
• For someone trying to get started quickly, it takes care of many details automatically (installing GHC, building packages). All they need to do is install Stack, and the #!/usr/bin/env stack is a good hint in that direction

I'd recommend communicating intentions of working on cookbook examples on the mailing list, so that work isn't duplicated, and so others can review changes. And if there are questions about how to make updates to some examples, definitely ask!

### Douglas M. Auclair (geophf)

One-liners
• March 29th, 2016: You have a list, Show a => [a], of known-length n, n > 100 Show the first three and the last three elements of that list
• March 28th, 2016: You have f :: c -> d and g :: a -> b -> c
define (<<-) such that f <<- g is f (g a b)
Does it exist already (in the Prelude, maybe)?
• March 26th, 2016: insert1 :: Ord k => k -> a -> Map k [a] -> Map k [a] define, points-free, efficiently. (hint: it's not insertWith)
• March 26th, 2016: sqDist :: Fractional ä => [ä] -> [ä] -> ä sqDist v1 v2 = sum (zipWith (\a b -> (a - b) ^2) v1 v2) Point-free-itize sqDist (and λ)
• March 17th, 2016: for f :: (a -> b) -> (a, a) -> (b, b) define f
• March 17th, 2016: removeAll :: Eq a => a -> [a] -> [a]; define
e.g.: removeAll '"' "(\"S312\", \"S204\")" ~> "(S312, S204)"
• Gautier DI FOLCO @gautier_difolco ... same
• March 17th, 2016:
f, g :: (String, String) -> String
let ab = ("a", "b")
f ab ~> "(\"a\", b)"
g ab ~> "(a, \"b\")"
define h that generalizes f and g
• Gautier DI FOLCO @gautier_difolco
cp x y (a,b) = concat ["(", x a, ",", y b, ")"]
f,g :: (String, String) -> String
g = cp id show
f = cp show id

# Dissertating, ahoy!

Usually whenever we think about gradschool we think about the switch from "doing classwork, quals, etc" to "dissertating" is a single-step process, which we call "becoming a PhD candidate" (as opposed to being a PhD student). In practice there are over half a dozen steps. Filing the paperwork declaring your completion of classwork, quals, etc is just the first few. (Okay, easy enough) Then there's the prospectus, for the graduate school. (The what for the who now?) Then the forming of your research committee. (Right, okay) Then the proposal. (Wait, how is this different from the other thing?) Then the proposal defense. (Um, okay, but not every department requires this?) Plus a few other steps I'm surely forgetting.

As of yesterday, I am officially finally totally completely absolutely done with all the paperwork, and can finally get back to actually working on the thesis itself!

# Qt5+D-Bus+CMake, a complete example

Yesterday I started digging into Qt5 and D-Bus. I never found a complete example, so I put one together myself: https://gist.github.com/magthe/2cf7220655bd8bf431259cc7dee99f64.

# Github Offline Issues with IssueSync

For a while I've been looking for something to download the GitHub issues for a project. I do a lot of development work on a train with no internet, so referring to the tickets offline is very useful. I've tried lot of tools, in a very wide variety of languages (Ruby, Python, Perl, Javascript, PHP) - but most of them don't seem to work - and the only one I did manage to get working only gave a curses UI.

Finally, I've found one that works - IssueSync. Installing it worked as described. Running it worked as described. I raised tickets for the author and they fixed them. I even sent a pull request and the author discussed and merged it. It downloads all your issues to Markdown files in an issues directory. I then "list" my issues using:

head -n1 -q issues/*.md | grep -v CLOSED

It's simple and works nicely.

# The network reliability problem and star semirings

In a previous post I defined the network reliability problem. Briefly, we are given a directed graph whose edges are labelled with probabilities, which we can think of as giving the likelihood of a message successfully traversing a link in a network. The problem is then to compute the probability that a message will successfully traverse the network from a given source node to a given target node.

Several commenters pointed out the connection to Bayesian networks. I think they are right, and the network reliability problem is a very special case of Bayesian inference. However, so far this hasn’t seemed to help very much, since the things I can find about algorithms for Bayesian inference are either too general (e.g. allowing arbitrary functions at nodes) or too specific (e.g. only working for certain kinds of trees). So I’m going to put aside Bayesian inference for now; perhaps later I can come back to it.

In any case, Derek Elkins also made a comment which pointed to exactly what I wanted to talk about next.

# Star semirings and path independence

Consider the related problem of computing the reliability of the single most reliable path from $s$ to $t$ in a network. This is really just a disguised version of the shortest path problem, so one can solve it using Dijkstra’s algorithm. But I want to discuss a more general way to think about solving it, using the theory of star semirings. Recall that a semiring is a set with two associative binary operations, “addition” and “multiplication”, which is a commutative monoid under addition, a monoid under multiplication, and where multiplication distributes over addition and $0a = a0 = 0$. A star semiring is a semiring with an additional operation $(-)^*$ satisfying $a^* = 1 + aa^* = 1 + a^*a$. Intuitively, $a^* = 1 + a + a^2 + a^3 + \dots$ (though $a^*$ can still be well-defined even when this infinite sum is not; we can at least say that if the infinite sum is defined, they must be equal). If $S$ is a star semiring, then the semiring of $n \times n$ matrices over $S$ is also a star semiring; for details see Dolan (2013), O’Connor (2011), Penaloza (2005), and Lehmann (1977). In particular, there is a very nice functional algorithm for computing $M^*$, with time complexity $O(n^3)$ (Dolan 2013). (Of course, this is slower than Dijkstra’s algorithm, but unlike Dijkstra’s algorithm it also works for finding shortest paths in the presence of negative edge weights—in which case it is essentially the Floyd-Warshall algorithm.)

Now, given a graph $G = (V,E)$ and labelling $\varphi : E \to \mathbb{P}$, define the $|V| \times |V|$ adjacency matrix $M_G$ to be the matrix of edge probabilities, that is, $(M_G)_{uv} = \varphi(u,v)$. Let $(\mathbb{P}, \max, 0, \times, 1)$ be the star semiring of probabilities under maximum and multiplication (where $a^* = 1$, since $1 = \max(1,a \times 1)$). Then we can solve the single most reliable path problem by computing $M_G^*$ over this semiring, and finding the largest entry. If we want to find the actual most reliable path, and not just its reliability, we can instead work over the semiring $\mathbb{P} \times E^*$, i.e. probabilities paired with paths. You might enjoy working out what the addition, multiplication, and star operations should be, or see O’Connor (2011).

In fact, as shown by O’Connor and Dolan, there are many algorithms that can be recast as computing the star of a matrix, for an appropriate choice of semiring: for example, (reflexive-)transitive closure; all-pairs shortest paths; Gaussian elimination; dataflow analysis; and solving certain knapsack problems. One might hope that there is similarly an appropriate semiring for the network reliability problem. But I have spent some time thinking about this and I do not know of one.

Consider again the simple example given at the start of the previous post:

For this example, we computed the reliability of the network to be $0.835$, by computing the probability of the upper path, $p = 0.45$, and the lower path, $q = 0.7$, and then combining them as $p + q - pq$, the probability of success on either path less the double-counted probability of simultaneous success on both.

Inspired by this example, one thing we might try would be to define operations $p \land q = pq$ and $p \lor q = p + q - pq$. But when we go to check the semiring laws, we run into a problem: distributivity does not hold! $p \land (q \lor r) = p(q + r - qr) = pq + pr - pqr$, but $(p \land q) \lor (p \land r) = pq \lor pr = pq + pr - p^2qr$. The problem is that the addition operation $p \lor q = p + q - pq$ implicitly assumes that the events with probabilities $p$ and $q$ are independent: otherwise the probability that they both happen is not actually equal to $pq$. The events with probabilities $pq$ and $pr$, however, are not independent. In graph terms, they represent two paths with a shared subpath. In fact, our example computation at the beginning of the post was only correct since the two paths from $s$ to $t$ were completely independent.

# Graph reduction

We can at least compute the reliability of series-parallel graphs whose terminals correspond with $s$ and $t$:

• If $G$ consists of a single edge, return that edge’s probability.
• Otherwise, $G$ is a composition of two subgraphs, whose reliabilities we recursively compute. Then:
• If $G$ is a sequential composition of graphs, return the product of their reliabilities.
• If $G$ is a parallel composition of two graphs with reliabilities $p$ and $q$, return $p + q - pq$.

In the second case, having a parallel composition of graphs ensures that there are no shared edges between them, so $p$ and $q$ are indeed independent.

Of course, many interesting graphs are not series-parallel. The simplest graph for which the above does not work looks like this:

Suppose all the edges have probability $1/3$. Can you find the reliability of this network?

More in a future post!

# References

Dolan, Stephen. 2013. “Fun with Semirings: A Functional Pearl on the Abuse of Linear Algebra.” In ACM SIGPLAN Notices, 48:101–10. 9. ACM.

Lehmann, Daniel J. 1977. “Algebraic Structures for Transitive Closure.” Theoretical Computer Science 4 (1). Elsevier: 59–76.

O’Connor, Russell. 2011. “A Very General Method for Computing Shortest Paths.” http://r6.ca/blog/20110808T035622Z.html.

Penaloza, Rafael. 2005. “Algebraic Structures for Transitive Closure.” http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.7650.

# Steven Pinker's The Sense of Style

I recently read Pinker's The Sense of Style, and urge you to read it too. It is chock full of practical advice on how to make your writing better. Chapter One shows you how to appreciate good writing; I never knew an obituary could be so zippy. Chapter Two explains the approach to writing called 'the classical style', which I have used my whole life without realising it. Chapter Three describes how to keep knowing what you are talking about from getting in the way of communicating clearly. Pinker is an expert on modern grammar, and Chapter Four clarifies how to parse your sentences to avoid ambiguity and employ referents correctly. Chapter Five details the mechanics of how to make a passage cohere. Chapter Six catalogues, from Pinker's position on the usage panel for the American Heritage dictionary, contentious points of diction, with his advice on how to resolve them and what points to consider when resolving them for yourself.

I recommend it highly, and doubly so if you are ever likely to write something that I will have to read.

# [btzawyfg] Point-free argument names

In a traditional "pointful" definition of a function in Haskell, the names of the function parameters can provide documentation of what the argument is for:

f num_oranges = ...

In point-free style, that opportunity for documentation goes away.

However, if the function has a type signature, then each type element of the signature can be annotated with documentation stating what it is.  The feature already exists in Haddock, though it is not too frequently used.

f :: Int -- ^ The 'Int' argument
-> Float -- ^ The 'Float' argument
-> IO () -- ^ The return value

# Greenwald on Clinton, Sanders, and Israel

Democracy News interviews Glen Greenwald on Clinton, Sanders, and Israel.
AMY GOODMAN: Let’s go to a clip of Hillary Clinton addressing AIPAC, the American Israel Public Affairs Committee.
HILLARY CLINTON: Many of the young people here today are on the front lines of the battle to oppose the alarming boycott, divestment and sanctions movement known as BDS. ... We must repudiate all efforts to malign, isolate and undermine Israel and the Jewish people.
AMY GOODMAN: That was Hillary Clinton addressing AIPAC. Glenn Greenwald?
GLENN GREENWALD: What she’s doing there is affirming one of the most vile slanders that currently exists. There is a campaign in the United States and in Israel to literally outlaw any advocacy of a boycott movement against Israel, similar to the boycott and divestment and sanctions campaign that brought down Israel and the United States’s closest ally, which was the apartheid regime in South Africa. Now you can certainly raise objections to the tactic of boycotting Israel, and lots of people have, but to render it illegal depends upon this grotesque equating of an advocacy of a boycott of Israel with anti-Semitism and then saying that because anti-Semitism should be banned from universities or from private institutions, that it should be literally outlawed, to ban advocating the boycott of Israel, as well. And people in Europe are actually being arrested for advocating a boycott of Israel. Students in American universities are being sanctioned and punished for doing so.
And what Hillary Clinton did was go before AIPAC and pander, as grotesquely as she typically does, by affirming this line that if you "malign," quote-unquote, the government of Israel and support a boycott of it, in opposition to their decades-long occupation of the Palestinians, it means essentially that you’re guilty of maligning the Jewish people. She is conflating the government of Israel with Jews, which, ironically enough, is itself a long-standing anti-Semitic trope. But it’s just part of her moving to the right in order to position herself for the general election by affirming some of the United States government’s worst and most violent policies.
AMY GOODMAN: Now, Democratic candidate Vermont Senator Bernie Sanders was the only one to skip the AIPAC conference earlier this week. He did address the issue on the campaign trail, though, from Utah, calling for an end to Israel’s occupation of Palestinian territories.
SEN. BERNIE SANDERS: It is absurd for elements within the Netanyahu government to suggest that building more settlements in the West Bank is the appropriate response to the most recent violence. It is also not acceptable that the Netanyahu government decided to withhold hundreds of millions of shekels in tax revenue from the Palestinians, which it is supposed to collect on their behalf.
AMY GOODMAN: That was Bernie Sanders in Utah. Glenn Greenwald, I believe he did offer to address AIPAC by video stream or Skype, as did Romney in 2012, but we heard he was told no.
GLENN GREENWALD: Yeah, I mean, a couple months ago, Donald Trump, on an MSNBC program, said, when asked about Israel and Palestine, that he thought the U.S. should be neutral in order to be a more effective arbiter, which until 20 years ago was a standard mainstream U.S. position, but now has become very shocking. Same with what Bernie Sanders just said. To hear a prominent American politician stand up and actually criticize Israel in such stark and blunt terms, calling them occupiers, essentially, and criticizing how they’re treating the Palestinians, is almost shocking to the ear. Hillary Clinton would never do it, nor would leading Republican politicians. And yet it’s really a very mild way to talk about Israel. And it shows just how far to the right the discourse has shifted in the United States when it comes to Israel, and how much a part of that rightward shift is Hillary Clinton, when you think about how almost shocking it is to hear pretty mild criticisms of Israel coming from Sanders or mild proclamations of neutrality coming from Trump.

# servant 0.6 released

We just released servant-0.6 to hackage. As promised this is only a small increment on the 0.5 release. Here are main changes:

• When encountering parse errors on query parameters, servant-server returns a 400 (was 404).
• servant-client: How the BaseUrl and the Manager arguments are being passed in to the client functions changed. It’s best explained through an example:

 haskell :qa

<section class="level1" id="conclusion">

# Conclusion

We hope you enjoy the new release. It was certainly lots of fun to work on servant-0.5`. We’d like to say thanks to all the people that helped, big or small. Happy hacking.

</section>
Posted on April 4, 2016 by Sönke Hahn