Planet Haskell

September 21, 2018

Holden Karau

Bringing the Jewels of the Python world (including Spacy) to Scala with Spark @ Lambda World Seattle

Thanks for joining me on 2018-09-18 at Lambda World Seattle 2018 Seattle, WA, USA for Bringing the Jewels of the Python world (including Spacy) to Scala with Spark.The talk covered:

With the new Apache Arrow integration in PySpark 2.3, it is now starting become reasonable to look to the Python world and ask what else do we want to steal besides tensorflow, or as a Python developer look and say how can I get my code into production without it being rewritten into a mess of Java?. Regardless of your specific side(s) in the JVM/Python divide, collaboration is getting a lot faster, so lets learn how to share! In this brief talk we will examine sharing some of the wonders of Spacy with the JVM world, which still has a somewhat lackluster set of options for NLP & deep learning.

.I'll update this post with the slides soon.Comment bellow to join in the discussion :).Talk feedback is appreciated at

by Holden Karau ( at September 21, 2018 06:40 AM

September 20, 2018

FP Complete

Deploying Postgres based Yesod web application to Kubernetes using Helm

Yesod is one of the most popular web frameworks in the Haskell land. This post will explore creating a sample Postgres based Yesod web application and then deploying it to Kubernetes cluster. We will create a Helm chart for doing our kubernetes release. Note that the entire source lives here:

by Sibi Prabakaran at September 20, 2018 10:45 PM

September 18, 2018

Chris Smith 2

CodeWorld Update — September 17, 2018

It’s been a couple months, so here’s some of the latest news about CodeWorld.

First of all, if you’re reading this and will be at ICFP, please consider coming to dinner on Monday night, Sep 24, to talk with other interested people about Haskell (and Haskell-like languages) in K-12 education. Please RSVP here as soon as you can.


There have been a number of changes made to the CodeWorld platform lately.


CodeWorld is being taught in quite a few places now!

  • In Louisiana, a pilot program is continuing to teach computational thinking with CodeWorld, organized by a group at LSU. This program reaches 350 students, including many schools and teachers who were trained over the summer.
  • In a separate effort in Louisiana, middle school students have seen a brief introduction to CodeWorld as part of a survey course that also includes p5.js, Arduino, and web programming.
  • In California, Brooklyn Cook is teaching CodeWorld to a new class, keeping it alive there even after my moving to New York.
  • Together with a coworker and a local teacher, I’m leading a class at The Clinton School in New York. This is CodeWorld’s first entry into the New York area. Schools here start quite a bit later in the year, so we’ve just had our first class today.


I’ve been working on updating the Guide to match our current best understanding of how to teach CodeWorld. Every time I do this, though, it gets quite a lot longer. My plan is to try to keep up with my own class this school year, so I’ll have a good chunk of the updates done by the end of the year. I’ve also swapped the Guide to use markdeep, a powerful variant of markdown that makes for a surprisingly complete publishing platform. I’ve extended markdeep locally with some features like collapsible notes that can be expanded for more information. Just seeing the results makes me excited to be writing the guide again!

Fernando Alegre at Louisiana State University is also building a separate curriculum focused on 9th grade, and with different elements and API entry points involved.

by Chris Smith at September 18, 2018 01:04 AM

September 17, 2018

Holden Karau

Understanding Spark tuning with auto-tuning; or, Magical spells to stop your pager going off at 2:00am @ Strata EU 2018

Thanks for joining us (@holdenkarau,@warre_n_peace) on 2018-05-23 at Strata EU 2018 London, UK for Understanding Spark tuning with auto-tuning; or, Magical spells to stop your pager going off at 2:00am.The talk covered:

Apache Spark is an amazing distributed system, but part of the bargain weve made with the infrastructure deamons involves providing the correct set of magic numbers (aka tuning) or our jobs may be eaten by Cthulhu. Tuning Apache Spark is somewhat of a dark art, although thankfully, when it goes wrong, all we tend to lose is several hours of our day and our employers money. Holden Karau, Rachel Warren, and Anya Bida explore auto-tuning jobs using both historical and live job information, using systems like Apache BEAM, Mahout, and internal Spark ML jobs as workloads. Much of the data required to effectively tune jobs is already collected inside of Spark. You just need to understand it. Holden, Rachel, and Anya outline sample auto-tuners and discuss the options for improving them and applying similar techniques in your own work. They also discuss what kind of tuning can be done statically (e.g., without depending on historic information) and look at Sparks own built-in components for auto-tuning (currently dynamically scaling cluster size) and how you can improve them. Even if the idea of building an auto-tuner sounds as appealing as using a rusty spoon to debug the JVM on a haunted supercomputer, this talk will give you a better understanding of the knobs available to you to tune your Apache Spark jobs. Also, to be clear, Holden, and Rachel dont promise to stop your pager going off at 2:00am, but hopefully this helps.


The slides are at video of the talk is up at

<iframe allow="autoplay; encrypted-media" allowfullscreen="allowfullscreen" frameborder="0" height="315" src="" width="560"></iframe><iframe allowfullscreen="allowfullscreen" frameborder="0" height="356" marginheight="0" marginwidth="0" scrolling="no" src="" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" width="427"> </iframe> Comment bellow to join in the discussion :).Talk feedback is appreciated at

by Holden Karau ( at September 17, 2018 07:48 PM

What things are correlated with gender diversity: A dig through the ASF and Jupyter projects @ Jupytercon NY 2018

Thanks for joining us (@holdenkarau,Matt Hunt) on 2018-08-24 at Jupytercon NY 2018 New York for What things are correlated with gender diversity: A dig through the ASF and Jupyter projects.The talk covered:

Many of us believe that gender diversity in open source projects is important (for example, OReilly, Google, and the Python Software Foundation). (If you don't, this isn't going to convince you.) But what things are correlated with improved gender diversity, and what can we learn from similar historic industries?

Holden Karau and Matt Hunt explore the diversity of different projects, examine historic EEOC complaints, and detail parallels and historic solutions. To keep things interesting, Holden and Matt conclude with a comparative analysis of the state of OSS and various complaints handled by the EEOC in the 60s, along with the solutions, suggestions, and binding settlements that were reached for similar diversity problems in other industries. This comparison is not legal advice but rather examples of what we can learn from early equal opportunity commission decisions.

Topics include:

  • Diversity of gender among the different levels of a given project's leadership (committers, PMC, etc.)
  • The existence of codes of conduct
  • Language used in comments, code, and mailing lists
  • The rate of promotions for project participants

You can find the code for this talk at slides are at

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="356" marginheight="0" marginwidth="0" scrolling="no" src="" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" width="427"> </iframe> Join in the discussion at :).Talk feedback is appreciated at

by Holden Karau ( at September 17, 2018 03:48 AM

Understanding Spark tuning with auto-tuning; or, Magical spells to stop your pager going off @ Strata NY

Thanks for joining us (@holdenkarau,@warre_n_peace,@anyabida1) on 2018-09-13 at Strata NY 2018 New York, NY, USA for Understanding Spark tuning with auto-tuning; or, Magical spells to stop your pager going off.The talk covered:

Apache Spark is an amazing distributed system, but part of the bargain weve made with the infrastructure deamons involves providing the correct set of magic numbers (aka tuning) or our jobs may be eaten by Cthulhu. Tuning Apache Spark is somewhat of a dark art, although thankfully, when it goes wrong all we tend to lose is several hours of our day and our employers money. Holden Karau, Rachel Warren, and Anya Bida explore auto-tuning jobs using both historical and live job information, using systems like Apache BEAM, Mahout, and internal Spark ML jobs as workloads. Much of the data required to effectively tune jobs is already collected inside of Spark. You just need to understand it. Holden, Rachel, and Anya outline sample auto-tuners and discuss the options for improving them and applying similar techniques in your own work. They also discuss what kind of tuning can be done statically (e.g., without depending on historic information) and look at Sparks own built-in components for auto-tuning (currently dynamically scaling cluster size) and how you can improve them. Even if the idea of building an auto-tuner sounds as appealing as using a rusty spoon to debug the JVM on a haunted supercomputer, this talk will give you a better understanding of the knobs available to you to tune your Apache Spark jobs. Also, to be clear, Holden, Rachel, and Anya dont promise to stop your pager going off at 2:00am, but hopefully this helps.


You can find the code for this talk at slides are at

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="356" marginheight="0" marginwidth="0" scrolling="no" src="" style="border:1px solid #CCC; border-width:1px; margin-bottom:5px; max-width: 100%;" width="427"> </iframe> Comment bellow to join in the discussion :).Talk feedback is appreciated at

by Holden Karau ( at September 17, 2018 03:48 AM

Validating Big Data Pipelines & ML (w Spark & Beam) - Inspecting Functional Pipelines & Avoiding Awkward Recommendations @ @strangeloop_stl

Come join me on 2018-09-27 at @strangeloop_stl 2018 St. Louis, MO, USA for Validating Big Data Pipelines & ML (w Spark & Beam) - Inspecting Functional Pipelines & Avoiding Awkward Recommendations.The talk covered:

Do you ever wonder if your data pipeline is still producing the correct results? Has it ever not? Have you not tricked anyone else into tacking over the pager for your system? As big data jobs move from the proof-of-concept phase into powering real production services, we have to start consider what will happen when everything eventually goes wrong (such as recommending inappropriate products or other decisions taken on bad data). This talk will attempt to convince you that we will all eventually get aboard the failboat (especially with ~40% of respondents automatically deploying their Spark jobs results to production), and its important to automatically recognize when things have gone wrong so we can stop deployment before we have to update our resumes. Figuring out when things have gone terribly wrong is trickier than it first appears, since we want to catch the errors before our users notice them (or failing that before CNN notices them). We will explore general techniques for validation, look at responses from people validating big data jobs in production environments, and libraries that can assist us in writing relative validation rules based on historical data. Once we feel like we've almost got it all under our belt, we will shift focus from traditional ETL pipelines to all of the wonderful special concerns that come with producing ML models and how we can (try) and validate that things are getting better (or at least not that much worse).

.I'll update this post with the slides soon.Come see to the talk or comment bellow to join in the discussion :).Talk feedback is appreciated at

by Holden Karau ( at September 17, 2018 03:48 AM

September 15, 2018

ERDI Gergo

Very high-level simulation of a CλaSH CPU

Initially, I wanted to talk this week about how I plan to structure the CλaSH description of the CHIP-8 CPU. However, I'm postponing that for now, because I ran into what seems like a CλaSH bug, and I want to see my design run on real hardware before I describe it in too much detail. So instead, here's a post on how I am testing in software.

CPUs as Mealy machines

After stripping away all the nice abstractions that I am using in my description of the CPU, what remains is a Mealy machine, which simply means it is described by a state transition and output function s -> i -> (s, o). If that looks familiar, that is not a coincidence: this is, of course, just one argument flip away from the Kleisli category of the State s monad. Just think of it as being either this or that, depending on which one you have more intuition about. A lot more on this in my upcoming blogpost.

My CHIP-8 CPU is currently described by a Mealy machine over these types:

data CPUIn = CPUIn
    { cpuInMem :: Word8
    , cpuInFB :: Bit
    , cpuInKeys :: KeypadState
    , cpuInKeyEvent :: Maybe (Bool, Key)
    , cpuInVBlank :: Bool

data Phase
    = Init
    | Fetch1
    | Exec
    | StoreReg Reg
    | LoadReg Reg
    | ClearFB (VidX, VidY)
    | Draw DrawPhase (VidX, VidY) Nybble (Index 8)
    | WaitKeyPress Reg

data CPUState = CPUState
    { opHi, opLo :: Word8
    , pc, ptr :: Addr
    , registers :: Vec 16 Word8
    , stack :: Vec 24 Addr
    , sp :: Index 24
    , phase :: Phase
    , timer :: Word8

data CPUOut = CPUOut
    { cpuOutMemAddr :: Addr
    , cpuOutMemWrite :: Maybe Word8
    , cpuOutFBAddr :: (VidX, VidY)
    , cpuOutFBWrite :: Maybe Bit

cpu :: CPUIn -> State CPUState CPUOut

Running the CPU directly

Note that all the types involved are pure: signal inputs are turned into pure input by CλaSH's mealy function, and the pure output is similarly turned into a signal output. But what if we didn't use mealy, and ran cpu directly, completely sidestepping CλaSH, yet still running the exact same implementation?

That is exactly what I am doing for testing the CPU. By running its Mealy function directly, I can feed it a CPUIn and consume its CPUOut result while interacting with the world — completely outside the simulation! The main structure of the code that implements the above looks like this:

stateful :: (MonadIO m) => s -> (i -> State s o) -> IO (m i -> (o -> m a) -> m a)
stateful s0 step = do
    state <- newIORef s0
    return $ \mkInput applyOutput -> do
        inp <- mkInput
        out <- liftIO $ do
            s <- readIORef state
            let (out, s') = runState (step inp) s
            writeIORef state s'
            return out
        applyOutput out

Hooking it up to SDL

I hooked up the main RAM and the framebuffer signals to IOArrays, and wrote some code that renders the framebuffer's contents into an SDL surface and translates keypress events. And, voilà: you can run the CHIP-8 computer, interactively, even allowing you to use good old trace-based debugging (which is thankfully removed by CλaSH during VHDL generation so can even leave them in). The below screencap shows this in action: :main is run from clashi and starts the interactive SDL program, with no Signal types involved.

<object height="350" width="425"><param name="movie" value=""/><param name="wmode" value="transparent"/><embed height="350" src="" type="application/x-shockwave-flash" width="425" wmode="transparent"></embed></object>

September 15, 2018 07:24 PM

Chris Smith 2

The ups and downs of STEM standards

Last week, in a widely hailed move, California adopted model K-12 computer science standards. Headlines have described this move as “California embraces computer science in its schools”, or “California board of education expands access to computer science”. For the last seven years of my life, I’ve volunteered much of my life to computer science at the middle school level. The tools I’ve build are used in a schools across the country by many teachers to reach hundreds of students, and I’m proud of this. You might think I’m thrilled by this news.

You would be wrong, though.

I am still hopeful that this works out for the best. It might raise enough awareness, and motivate and fund enough school districts, to offset the harm of the standards themselves. But the move to quickly adopt standards for K-12 computer science is deeply misguided, and I believe we will likely come to regret the push for early standardization.

<figure><figcaption>Girls learning computer science using Scratch (source)</figcaption></figure>

This is unusual of me to say. The push for updated standards like the Common Core, NGSS, etc., has been a huge part of the education reform movement over the past decade, and it’s admittedly been largely successful. I certainly don’t align myself with those naysayers who see the Common Core as government overreach or argue that children need more drills on long division. But computer science is different, in a few ways.

Computer science is a young field of knowledge.

<figure><figcaption>The first computer, only 70 years ago</figcaption></figure>

It might seem like computer science has been around forever, but it’s pretty new by any measure. The first computers themselves were built barely 70 years ago — barely the blink of an eye compared to the storied millenia of mathematics, language, science, and history. And yet even that 70 years is misleading. Software in 1948 was a very different beast from what it looks like today. It doesn’t look like that pace of change is slowing. If anything, the advent of new fundamental ways of looking at computing is proceeding at breakneck speed.

Computer science shows telltale signs of immaturity. The common university computer science curriculum looks like a list of all the things we thought computer programmers needed to do 30 years ago. There are courses in writing operating systems, implementing compilers, assembly language, databases, network programming, graphics, and artificial intelligence. Compare this to mathematics or science, and you’ll find a quite different story, with a mature field organized around key ideas, not tasks.

If CS is still young, CS in K-12 education is in its infancy.

We aren’t talking about computer science, though, but about computer science education. And that is far, far newer and less certain than the topic as a whole. The current push for computer science education is maybe 15 years old. There have been older generations of the effort, to be sure, but without a lot of success to show for it.

<figure><figcaption>Scratch, a language designed for early computer science</figcaption></figure>

The current thinking in computer science education, led by Scratch and Hour of Code, is that CS can be made accessible to younger children with a drag-and-drop tools and game-like learning environments. Yet most success stories come without real learning objectives, much less authentic transfer of skills. Researchers often measure interest instead of outcomes, and on success or failure rather than building a theory of how students make meaning and learn.

The standards themselves show telltale signs of this. If you compare with the Common Core, the first and most striking difference is how much of California’s new CS standards are about auxiliary concerns — they talk a lot about making sure students know to respect copyright and can talk about the social implications of computing (something that is important, but belongs in a civics curriculum!), but fall very short in enumerating the skills students need to build toward robust technical competency. There’s a reason for that: we don’t really understand those early technical skills even remotely well enough to say what they are, much less where a student should be at which grade levels!

Computer science is a mess of different languages and tools.

A third challenge is that the professional practice of computer programming is a bewildering mixture of different notations, languages, libraries, paradigms, and tools. Great debates can be found, both in professional practice and in the education space, about the choice of tools and notations. Little if anything is known about how well children transfer expertise in one tool (especially drag-and-drop languages, for example) to others, and the early hints we do have don’t look so promising.


Standardization implies that we’ve converged on some consensus, and it’s time to get everyone on the same page. In computer science today, what page is that? By reading the standards, one can only conclude that the page we’re aiming at is a very shallow surface understanding of technical content, in an effort to be as vague as possible about what that content might be.

Computational thinking is still poorly defined.


Finally, it’s important to look back at why we’re doing this. Not every student will develop computer software for a living. But we have taken as an article of faith (and I agree with this wholeheartedly) that easy and ubiquitous computation has changed the world, creating new opportunities for learning, and new ways of thinking and understanding. These are important not just for computer programmers, but for educated citizens in the world at large.

If we believe this, though, why does our computer science education, even in elementary school, look so much like the profession of software engineering? Why are we using languages that either are built for industry (like Python, and JavaScript, and later Java), or mimic their structure and organization (like Scratch or Hour of Code)? There’s been much debate in recent years over how we define computational thinking in a way that keeps faith with its meaning as a universal skill. A great answer is that it’s about using decomposition, abstraction, and modeling to tackle the kinds of enormously complex tasks that arise when the grunt work of computation becomes trivial. Today’s standards don’t reflect that answer.

Why not standardize?

One could ask, even if these aren’t the best standards, why not have standards? Even before adoption, my colleagues in California have already been asked The Question. “Is your activity aligned with the proposed CS standards?” If the standards talked about syntax, this question would exclude using block-based languages. They don’t, though, because there are powerful groups working on block-based languages. This is a huge gap, since learning to be creative in formal notations is a huge part of what students stand to gain from learning computer science. The standards do talk about loops and mutating variables, so this question makes it harder to justify functional programming — despite the fact that Bootstrap, a curriculum based on functional programming, shows the best research evidence I’ve seen of transfer learning between computer science and mathematics.

Misuse of standards and why it matters

There’s an even bigger problem: computer science is an inherently creative field, and the way standards are used in STEM stands in the way of students’ ability to create.

Maybe you’ve decided to do some weight-lifting. You might join a gym and try a circuit on the weight machines. Big mistake, say almost all experts! Instead, they advise you to use free weights. Weight machines isolate and develop a small set of major muscles. But lifting free weights also builds a bunch of little muscles that you need for balance and versatility.

<figure><figcaption>The wrong way to start strength training.</figcaption></figure>

In a very similar way, building interesting and complex things in computer science develops a huge number of small skills, without which you might pass an exam, but you will struggle to build new things on your own. Standards are like the major muscles: if you are missing one in your training routine, it’s worth fixing. But it’s not sufficient to isolate just the standards, and train each one in isolation. There’s a whole second tier of skills that, while each one may not be critical, leaves students in a precarious position if they don’t pick up any!

Unfortunately, the current educational climate is about lessons carefully aligned to standards; the educational equivalent of weight machines. When they learn to read, students read a selection of books that give them a broad exposure to examples of literary value. But almost everywhere else, teachers are expected to choose resources, tools, and curriculum that’s stripped down to spend time only on the specific skills mentioned in the standards document. This laser-like focus on specific standards doesn’t do students any goals if we want them to apply the content independently and creatively!

If not standards, then what?

The push for California to adopt computer science standards was supported by most of the CS education community. I’d suggest this was not at all about a feeling that there’s too little standardization! Instead, it’s just because issuing standards is what the board of education does to encourage things like CS education. It was about recognizing the need for CS, not controlling the content.

But standards don’t fix budget problems. They don’t fix teacher education. And they don’t replace high-quality curriculum. Instead of pushing for the state to declare what teachers should be doing about computer science, I much prefer to help teachers do what they already know is good for their students.

by Chris Smith at September 15, 2018 05:03 AM

September 14, 2018

Mark Jason Dominus

How not to remember the prime numbers under 1,000

A while back I said I wanted to memorize all the prime numbers under 1,000, because I am tired of getting some number like 851 or 857, or even 307, and then not knowing whether it is prime.

The straightforward way to deal with this is: just memorize the list. There are only 168 of them, and I have the first 25 or so memorized anyway.

But I had a different idea also. Say that a set of numbers from to is a “decade”. Each decade contains at most 4 primes, so 4 bits are enough to describe the primes in a single decade. Assign a consonant to each of the 16 possible patterns, say “b” when none of the four numbers is a prime, “d” when only is prime, “f” when only is, and so on.

Now memorizing the primes in the 90 decades is reduced to memorizing 90 consonants. Inserting vowels wherever convenient, we have now turned the problem into one of memorizing around 45 words. A word like “potato” would encode the constellation of primes in three consecutive decades. 45 words is only a few sentences, so perhaps we could reduce the list of primes to a short and easily-remembered paragraph. If so, memorizing a few sentences should be much easier than memorizing the original list of primes.

The method has several clear drawbacks. We would have to memorize the mapping from consonants to bit patterns, but this is short and maybe not too difficult.

More significant is that if we're trying to decide if, say, 637 is prime, we have to remember which consonant in which word represents the 63rd decade. This can be fixed, maybe, by selecting words and sentences of standard length. Say there are three sentences and each contains 30 consonants. Maybe we can arrange that words always appear in patterns, say four words with 1 or 2 consonants each that together total 7 consonants, followed by a single long word with three consonants. Then each sentence can contain three of these five-word groups and it will be relatively easy to locate the 23rd consonant in a sentence: it is early in the third group.

Katara and I tried this, with not much success. But I'm not ready to give up on the idea quite yet. A problem we encountered early on is that we remember consonants not be how words are spelled but by how they sound. So we don't want a word like “hammer” to represent the consonant pattern h-m-m but rather just h-m.

Another problem is that some constellations of primes are much more common than others. We initially assigned consonants to constellations in order. This assigned letter “b” to the decades that contain no primes. But this is the most common situation, so the letter “b” tended to predominate in the words we needed for our mnemonic. We need to be careful to assign the most common constellations to the best letters.

Some consonants in English like to appear in clusters, and it's not trivial to match these up with the common constellations. The mapping from prime constellations to consonants must be carefully chosen to work with English. We initially assigned “s” to the constellation “☆•☆☆” (where and are prime but is not) and “t” to the constellation “☆☆••” (where and are prime but and are not) but these constellations cannot appear consecutively, since at least one of is composite. So any word with “s” and “t” with no intervening consonants was out of play. This eliminated a significant fraction of the entire dictionary!

I still think it could be made to work, maybe. If you're interested in playing around with this, the programs I wrote are available on Github. The mapping from decade constellations to consonant clusters is in

by Mark Dominus ( at September 14, 2018 04:23 PM

September 13, 2018

Neil Mitchell

Review of Apple Watch (series 2)

Summary: I like mine.

I've worn a watch since I was 10, starting with a Casio F-91W, replacing it with an identical model about seven times over the years. Last year the strap broke (it's almost always the strap that breaks), and I made the decision to buy an Apple Watch. I'm very happy with my Apple Watch (series 2, 38mm).

What I use it for

The main things I use my watch for are:

  • Alarms and timers, often using Siri to set them. If you treat the voice interface like a bad command line it works well - if you say "Add an alarm at 4pm" and you had an alarm at 4pm yesterday, it just shows you the alarm but doesn't enable it. Instead you have to say "Set alarm at 4pm". If you say "Set alarm at 4:10pm" the odds of getting an alarm at 10pm are quite high.
  • Receiving notifications, email, texts, Slack messages, phone calls. When you lift your arm you get a quick preview of the message, which is enough to decide whether it's important (take out the phone), or can happily be ignored.
  • Sleep tracking, via the Sleepwatch app. It's an awesome app that tracks your sleep showing trends.
  • Paying for things via the ApplePay. Nowadays I'm on the verge of not shopping at places that don't take ApplePay. It's more secure than contactless, works everywhere contactless works, has a higher limit, and is incredibly convenient. It can also swipe through multiple credit cards. Easy and seamless.

What I wish was better

There are other features the watch offers that I was hoping to use, but for various reasons haven't worked out.

  • I am terrible at navigation, and wanted to use Google Maps on my watch, particularly while cycling. Unfortunately, there is no Google Maps app for the watch, and the Apple Maps one is even less useful than normal Apple Maps. There is a recurrent bug where it loses the map display and just displays a checkered background - which can be fixed by some complex steps including wiping the watch and reinstalling. Part of the reason for buying this watch was for navigation, so I hope this gets resolved eventually.
  • I wanted to quickly and easily track train departures on the move. Unfortunately the National Rail train time app is useless - it omits the time the train is leaving, merely saying "On time" or not. As a consequence you have to memorise the timetable, plus believe it has refreshed and isn't just showing stale information. All in all, impressively close, and totally useless.
  • The actual display of the watch is adequate - it takes a noticeable pause to display the watch face (sub-second), but compared to my Casio, it's not always available. I like actual seconds on my watch, which limits me to exactly one digital watch face. I can't believe "knowing the precise time" is such a niche feature on a watch.
  • You can't hide apps from the watch that you don't use, which means my watch face has 50 odd apps, of which I use maybe 10. Removing most of the apps would make navigation much easier.
  • The watch slows down over time, so after a couple of months it starts to lag. A reboot fixes that.
  • The straps you can buy for the watch are fantastically overpriced. The default one is OK, but my wrist in between two holes, so it's usually either a bit loose or a bit tight.
  • Exercise features haven't been much use to me, but I'd blame that on me rather than the watch...


The positives are enough to make it worth me having an Apple Watch, and inevitably replacing when the battery life gets too bad (for the moment, it runs about 30hrs on a charge, which is fine).

by Neil Mitchell ( at September 13, 2018 03:03 PM

September 12, 2018

Mark Jason Dominus

Perils of hacking on mature software

Yesterday I wrote up an interesting bug in git-log --follow's handling of empty files. Afterward I thought I'd see if I could fix it.

People complain that the trouble of working on mature software like Git is to understand the way the code is structured, its conventions, the accumulated layers of cruft, and where everything is. I think this is a relatively minor difficulty. The hard part is no so much doing what you want, as knowing what you want to do.

My original idea for the fix was this: I can give git log a new option, say --follow-size-threshhold=n. This would disable all copy and rename detection for any files of size less than n bytes. If not specified or configured, n would default to 1, so that the default behavior would disable copy and rename detection of empty files but not of anything else. I was concerned that an integer option was unnecessarily delicate. It might have been sufficient to have a boolean --follow-empty-files flag. But either way the programming would be almost the same and it would be easy to simplify the option later if the Git maintainers wanted it that way

I excavated the code and found where the change needed to go. It's not actually in git-log itself. Git has an internal system for diffing pairs of files, and git-log --follow uses this to decide when two blobs are similar enough for it to switch from following one to the other. So the flag actually needed to be added to git-diff, where I called it --rename-size-threshhold. Then git-log would set that option internally before using the Git diff system to detect renames.

But then I ran into a roadblock. Diff already has an undocumented flag called --rename-empty that tells it to report on renames of empty files in certain contexts — not the context I was interested in unfortunately. The flag is set by default, but it is cleared internally when git-merge is resolving conflicts. The issue it addresses is this: Suppose the merge base has some empty file X. Somewhere along the line X has been removed. In one branch, an unrelated empty file Y has been created, and in the other branch a different unrelated empty file Z has been created. When merging these two branches, Git will detect a merge conflict: was file X moved to location Y or to location Z? This ⸢conflict⸣ is almost certainly spurious, and is is very unlikely that the user will thank us for demanding that they resolve it manually. So git-merge sets --no-rename-empty internally and Git resolves the ⸢conflict⸣ automatically.

(See this commit for further details.)

The roadblock is: how does --rename-empty fit together with my proposed --rename-size-threshhold flag? Should they be the same thing? Or should they be separate options? There appear to be at least three subsystems in Git that try to decide if two similar or identical files (which might have different names, or the same name in different directories) are “the same file” for various purposes. Do we want to control the behavior of these subsystems separately or in unison?

If they should be controlled in unison, should --rename-size-threshhold be demoted to a boolean, or should --rename-empty be promoted to an integer? And if they should be the same, what are the implications for backward compatibility? Should the existing --rename-empty be documented?

If we add new options, how do they interact with the existing and already non-orthogonal flags that do something like this? They include at least the following options of git-diff, git-log, and git-show:


Only git-log has --follow and my new feature was conceived as a modification of it, which is why I named it --follow-size-threshhold. But git-log wouldn't be implementing this itself, except to pass the flag into the diff system. Calling it --follow-size-threshhold in git-diff didn't make sense because git-diff doesn't have a --follow option. It needs a different name. But if I do that, then we have git-diff and git-log options with different names that nevertheless do exactly the same thing. Confusing!

Now suppose you would like to configure a default for this option in your .gitconfig. Does it make sense to have both diff.renameSizeThreshhold and log.followSizeThreshhold options? Not really. It would never be useful to set one but not the other. So eliminate log.followSizeThreshhold. But now someone like me who wants to change the behavior of git-log --follow will not know to look in the right place for the option they need.

The thing to do at this point is to come up with some reasonable-seeming proposal and send it to Jeff King, who created the undocumented --rename-empty feature, and who is also a good person to work with. But coming up with a good solution entirely on my own is unlikely.

Doing any particular thing would not be too hard. The hard part is deciding what particular thing to do.

by Mark Dominus ( at September 12, 2018 03:14 PM

Language fluency in speech and print

Long ago I worked among the graduate students at the University of Pennsylvania department of Computer and Information Sciences. Among other things, I did system and software support for them, and being about the same age and with many common interests, I socialized with them also.

There was one Chinese-Malaysian graduate student who I thought of as having poor English. But one day, reading one of his emailed support requests, I was struck by how clear and well-composed it was. I suddenly realized I had been wrong. His English was excellent. It was his pronunciation that was not so good. When speaking to him in person, this was all I had perceived. In email, his accent vanished and he spoke English like a well-educated native. When I next met him in person I paid more careful attention and I realized that, indeed, I had not seen past the surface: he spoke the way he wrote, but his accent had blinded me to his excellent grammar and diction.

Once I picked up on this, I started to notice better. There were many examples of the same phenomenon, and also the opposite phenomenon, where someone spoke poorly but I hadn't noticed because their pronunciation was good. But then they would send email and the veil would be lifted. This was even true of native speakers, who can get away with all sorts of mistakes because their pronunciation is so perfect. (I don't mean perfect in the sense of pronouncing things the way the book says you should; I mean in the sense of pronouncing things the way a native speaker does.) I didn't notice this unless I was making an effort to look for it.

I'm not sure I have anything else to say about this, except that it seems to me that when learning a foreign language, one ought to consider whether one will be using it primarily for speech or primarily for writing, and optimize one's study time accordingly. For speech, concentrate on good pronunciation; for writing, focus on grammar and diction.

Hmm, put that way it seems obvious. Also, the sky is sometimes blue.

by Mark Dominus ( at September 12, 2018 02:00 PM

September 11, 2018

FP Complete

Deploying Haskell Apps with Kubernetes

Deploying software can certainly have its challenges, but Deni Burtovic walks you through everything you need to know when it comes to deploying Haskell applications with Kubernetes.  As you watch the webinar, we also encourage you to follow along by viewing the presentation which can be found here - Deploying Haskell Apps with Kubernetes.

by Robert Bobbett ( at September 11, 2018 11:24 PM

Chris Smith 2

“Teaching Haskell in K-12” Dinner at ICFP

Dear Haskellers: You are cordially invited to a dinner discussion on “Teaching Haskell in K-12” at ICFP in St. Louis.

<figure><figcaption>Middle school students learning Haskell with CodeWorld</figcaption></figure>

This meeting is for anyone around the world with an interest in teaching Haskell and related languages at the K-12 (that is, pre-university) level. We will hope to find a critical mass of people to build a connected and supportive community. In this community, we can share experiences and resources, ask for and receive help, and encourage each other as we strive to bring functional programming to a new generation.

Personally, I’ve been working for six years on using Haskell to build algebraic thinking skills in middle school students. I am not alone. Emmanuel Schanzer and many others have explored the same approach with Bootstrap. Christopher Anand and colleagues have done the same with Elm at McMaster University as early as 4th grade, and the Gordon Cain Center at Louisiana State University has trained teachers using CodeWorld for a 9th grade computational thinking class for hundreds of students. Just as importantly, the Haskell community is built of thousands of individuals who have something unique contribute to your own communities and children.

You’re invited to come discuss our common goals, how we can organize and teach, which tools and libraries we find helpful, what technology is missing that could help the effort move forward, and even how to scale our efforts to reach as many children as possible around the world.

Let’s meet for dinner about 7:00 pm on Monday the 24th. Please RSVP here. I hope to see you there!

by Chris Smith at September 11, 2018 01:28 AM

September 10, 2018

Mark Jason Dominus

Why hooks and forks in the J language?

I recently said:

I don't know why [Ken] Iverson thought the hook was the thing to embed in the [J] language.

And I think I now recall that the name of the language itself, J, is intended to showcase the hook, so he must have thought it was pretty wonderful.

A helpful Hacker News comment pointed me to the explanation. Here Iverson explains why the “hook” feature: it is actually the S combinator in disguise. Recall that $${\bf S} x y z = x z (y z).$$ This is exactly what J's hook computes when you write (x y) z. For instance, if I understand correctly, in J (+ !) means the one-place operation that takes an argument to .

As McBride and Paterson point out, S is also the same as the <*> operator in the Reader instance of Applicative. Since in J the only possible inputs to a hook are functions, it is operating in the Reader idiom and in that context its hook is doing the same thing as Haskell's <*>. Similarly, J's “fork” feature can be understood as essentially the same as the Reader insance of Haskell's liftA2.

by Mark Dominus ( at September 10, 2018 07:31 PM

git log --follow enthusiastically tracks empty files

This bug I just found in git log --follow is impressively massive. Until I worked out what was going on I was really perplexed, and even considered that my repository might have become corrupted.

I knew I'd written a draft of a blog article about the Watchmen movie, and I went to find out how long it had been sitting around:

    % git log -- movie/
    commit 934961428feff98fa3cb085e04a0d594b083f597
    Author: Mark Dominus <>
    Date:   Fri Feb 3 16:32:25 2012 -0500

        link to Mad Watchmen parody
        also recategorize under movie instead of under book

The log stopped there, and the commit message says clearly that the article was moved from elsewhere, so I used git-log --follow --stat to find out how old it really was. The result was spectacularly weird. It began in the right place:

    commit 934961428feff98fa3cb085e04a0d594b083f597
    Author: Mark Dominus <>
    Date:   Fri Feb 3 16:32:25 2012 -0500

        link to Mad Watchmen parody
        also recategorize under movie instead of under book

     {book => movie}/ | 8 +++++++-
     1 file changed, 7 insertions(+), 1 deletion(-)

Okay, it was moved, with slight modifications, from book to movie, as the message says.

    commit 5bf6e946f66e290fc6abf044aa26b9f7cfaaedc4
    Author: Mark Jason Dominus (陶敏修) <>
    Date:   Tue Jan 17 20:36:27 2012 -0500

        finally started article about Watchment movie

     book/ | 40 ++++++++++++++++++++++++++++++++++++++++
     1 file changed, 40 insertions(+)

Okay, the previous month I added some text to it.

Then I skipped to the bottom to see when it first appeared, and the bottom was completely weird, mentioning a series of completely unrelated articles:

    commit e6779efdc9510374510705b4beb0b4c4b5853a93
    Author: mjd <mjd>
    Date:   Thu May 4 15:21:57 2006 +0000

        First chunk of linear regression article

     prog/maxims/paste-code.notyet => math/linear-regression.notyet | 0
     1 file changed, 0 insertions(+), 0 deletions(-)

    commit 9d9038a3358a82616a159493c6bdc91dd03d03f4
    Author: mjd <mjd>
    Date:   Tue May 2 14:16:24 2006 +0000

        maxims directory reorganization

     tech/mercury.notyet => prog/maxims/paste-code.notyet | 0
     1 file changed, 0 insertions(+), 0 deletions(-)

    commit 1273c618ed6efa4df75ce97255204251678d04d3
    Author: mjd <mjd>
    Date:   Tue Apr 4 15:32:00 2006 +0000

        Thingy about propagation delay and mercury delay lines

     tech/mercury.notyet | 0
     1 file changed, 0 insertions(+), 0 deletions(-)

(The complete output is available for your perusal.)

The log is showing unrelated files being moved to totally unrelated places. And also, the log messages do not seem to match up. “First chunk of linear regression article” should be on some commit that adds text to math/linear-regression.notyet or math/ But according to the output above, that file is still empty after that commit. Maybe I added the text in a later commit? “Maxims directory reorganization” suggests that I reorganized the contents of prog/maxims, but the stat says otherwise.

My first thought was: when I imported my blog from CVS to Git, many years ago, I made a series of mistakes, and mismatched the log messages to the commits, or worse, and I might have to do it over again. Despair!

But no, it turns out that git-log is just intensely confused. Let's look at one of the puzzling commits. Here it is as reported by git log --follow --stat:

    commit 9d9038a3358a82616a159493c6bdc91dd03d03f4
    Author: mjd <mjd>
    Date:   Tue May 2 14:16:24 2006 +0000

        maxims directory reorganization

     tech/mercury.notyet => prog/maxims/paste-code.notyet | 0
     1 file changed, 0 insertions(+), 0 deletions(-)

But if I do git show --stat 9d9038a3, I get a very different picture, one that makes sense:

    % git show --stat 9d9038a3
    commit 9d9038a3358a82616a159493c6bdc91dd03d03f4
    Author: mjd <mjd>
    Date:   Tue May 2 14:16:24 2006 +0000

        maxims directory reorganization

     prog/maxims.notyet            | 226 -------------------------------------------
     prog/maxims/maxims.notyet     |  95 ++++++++++++++++++
     prog/maxims/   | 134 +++++++++++++++++++++++++
     prog/maxims/paste-code.notyet |   0
     4 files changed, 229 insertions(+), 226 deletions(-)

This is easy to understand. The commit message was correct: the maxims are being reorganized. But git-log --stat, in conjunction with --follow, has produced a stat that has only a tenuous connection with reality.

I believe what happened here is this: In 2012 I “finally started article”. But I didn't create the file at that time. Rather, I had created the file in 2009 with the intention of putting something into it later:

    % git show --stat 5c8c5e66
    commit 5c8c5e66bcd1b5485576348cb5bbca20c37bd330
    Author: mjd <mjd>
    Date:   Tue Jun 23 18:42:31 2009 +0000

        empty file

     book/   | 0
     book/Watchmen.notyet | 0
     2 files changed, 0 insertions(+), 0 deletions(-)

This commit does appear in the git-log --follow output, but it looks like this:

    commit 5c8c5e66bcd1b5485576348cb5bbca20c37bd330
    Author: mjd <mjd>
    Date:   Tue Jun 23 18:42:31 2009 +0000

        empty file

     wikipedia/mega.notyet => book/ | 0
     1 file changed, 0 insertions(+), 0 deletions(-)

It appears that Git, having detected that book/ was moved to movie/ in Febraury 2012, is now following book/ backward in time. It sees that in January 2012 the file was modified, and was formerly empty, and after that it sees that in June 2009 the empty file was created. At that time there was another empty file, wikipedia/mega.notyet. And git-log decides that the empty file book/ was copied from the other empty file.

At this point it has gone completely off the rails, because it is now following the unrelated empty file wikipedia/mega.notyet. It then makes more mistakes of the same type. At one point there was an empty wikipedia/ file, but commit ff0d744d5 added some text to it and also created an empty wikipedia/mega.notyet alongside it. The git-log --follow command has interpreted this as the empty wikipedia/ being moved to wikipedia/mega.notyet and a new wikipedia/ being created alongside it. It is now following wikipedia/

Commit ff398402 created the empty file wikipedia/ fresh, but git-log --follow interprets the commit as copying wikipedia/ from the already-existing empty file tech/mercury.notyet. Commit 1273c618 created tech/mercury.notyet, and after that the trail comes to an end, because that was shortly after I started keeping my blog in revision control; there were no empty files before that. I suppose that attempting to follow the history of any file that started out empty is going to lead to the same place, tech/mercury.notyet. On a different machine with a different copy of the repository, the git-log --follow on this file threads its way through ten irrelvant files before winding up at tech/mercury.notyet.

There is a --find-renames=... flag to tell Git how conservative to be when guessing that a file might have been renamed and modified at the same time. The default is 50%. But even turning it up to 100% doesn't help with this problem, because in this case the false positives are files that are actually identical.

As far as I can tell there is no option to set an absolute threshhold on when two files are considered the same by --follow. Perhaps it would be enough to tell Git that it should simply not try to follow files whose size is less than bytes, for some small , perhaps even .

The part I don't fully understand is how git-log --follow is generating its stat outputs. Certainly it's not doing it in the same way that git show is. Instead it is trying to do something clever, to highlight the copies and renames it thinks it has found, and in this case it goes badly wrong.

The problem appears in Git 1.7.11, 2.7.4, and 2.13.0.

[ Addendum 20180912: A followup about my work on a fix for this. ]

by Mark Dominus ( at September 10, 2018 05:45 PM

September 08, 2018

ERDI Gergo

PS/2 keyboard interface in CλaSH

This week, most of my weekday evenings were quite busy, but I did manage to squeeze in a PS/2 keyboard interface in small installments; then today I went down the rabbit hole of clearing up some technical debt I've accumulated so far by not really looking into how CλaSH handled clock domains.

PS/2 signals

(Just to preempt any possible confusion, we're talking about the peripheral port of the IBM Personal System/2 introduced in the late '80s, not the Playstation 2 console)

The same way VGA is ideal for hobbyist video signal generation since it is both simple and ubiquitous, PS/2 is the go-to interface for keyboards. It is a two-directional, synchronous serial protocol with a peripheral-generated clock in the 10-20 KHz range. Any PC keyboard old enough will support it. One important caveat, though, is that the common USB-to-PS/2 adapters don't actually convert signals, and so they only work with keyboards that were originally designed with that conversion in mind. Here, we are only concerned with device to host communication; it is also possible to communicate in the other direction to e.g. change the Num Lock LED's state.

"Synchronous" here means that there is a separate clock line, unlike in the family of asynchronous serial protocols that were used in RS-232; it is this latter one that is usually meant as "serial communication" when unqualified. In synchronous serial communication, everything happens on the clock ticks; in asynchronous communication, there is no separate clock signal, so the data signal has enough structure that the two communicating sides can agree on the exact framing.

Turning the data line of PS/2 into a stream of bits is a straightforward process: the standard prescribes sampling the data line on the falling edge of the clock line. We also apply an 8-cycle debouncer for good measure, just because some pages on the Internet suggest it:

data PS2 dom = PS2
    { ps2Clk :: Signal dom Bit
    , ps2Data :: Signal dom Bit

    :: (HiddenClockReset dom gated synchronous)
    => PS2 dom -> Signal dom (Maybe Bit)
samplePS2 PS2{..} = enable <$> isFalling low ps2Clk' <*> ps2Data'
    ps2Clk' = debounce d3 low ps2Clk
    ps2Data' = debounce d3 low ps2Data        

The second step in that pipeline is to shift in the bits, 11 at a time. A leading low bit signals the start of a packet; eight data bits and one parity bit follow; the packet is finished with one high bit. Of course, only the eight data bits are presented externally. I use a WriterT (Last Word8) (State PS2State) monad to implement this logic, and then turn that into a CλaSH Mealy machine, in a pattern that I plan to use a lot in implementing the CHIP-8 CPU later:

 data PS2State
    = Idle
    | Bit Word8 (Index 8)
    | Parity Word8
    | Stop (Maybe Word8)

    :: (HiddenClockReset dom gated synchronous)
    => Signal dom (Maybe Bit) -> Signal dom (Maybe Word8)
decodePS2 = flip mealyState Idle $ \bit -> fmap getLast . execWriterT . forM_ bit $ \bit -> do
    state <- get
    case state of
        Idle -> do
            when (bit == low) $ put $ Bit 0 0
        Bit x i -> do
            let x' = shiftInLeft bit x
            put $ maybe (Parity x') (Bit x') $ succIdx i
        Parity x -> do
            let checked = bit /= parity x
            put $ Stop $ enable checked x
        Stop x -> do
            when (bit == high) $ tell $ Last x
            put Idle       

A quick change in hardware

To be able to try out on real hardware what I had at this point, I had to leave the trusty LogicStart Mega-Wing of my Papilio Pro, and instead switch over to the Arcade since that one has a PS/2 port. There are actually two ports on it, so that one could connect e.g. a keyboard and a mouse.

This change involved rewriting my UCF file since the pinout is different from the LogicStart. Also, the Arcade has 4+4+4 bits of VGA color output instead of the previous 3+3+2; of course with the black & white graphics of the CHIP-8, that color depth is all going to waste with this project.

PS/2 scan codes

Unfortunately, it is not enough to shift in the PS/2 data into a byte: we also have to make sense of that byte. While this could be as straightforward as interpreting each byte as the ASCII code of the character on the key pressed, the reality is not this simple. Keyboards emit so-called scan codes, where one or several bytes can encode a single keypress or key release event (see here for example for a list of some keyboard scan codes). I haven't been able to come up with an elegant way of handling this yet, so for now I just have some messy Mealy machine that returns a 16-bit code, where the high byte is zero for one-byte codes. You can see in the comment my frustration at both the implementation and the spec itself:

data KeyEvent = KeyPress | KeyRelease
    deriving (Generic, NFData, Eq, Show)

data ScanCode = ScanCode KeyEvent Word16
    deriving (Generic, NFData, Eq, Show)

data ScanState
    = Init
    | Extended Word8
    | Code KeyEvent Word8

-- TODO: rewrite this for clarity.
-- All it does is it parses 0xE0 0xXX into an extended (16-bit) code, and everything else into
-- an 8-bit code. The only complication is that the key release marker 0xF0 is always the
-- second-to-last byte. Who does that?!?
    :: (HiddenClockReset dom gated synchronous)
    => Signal dom (Maybe Word8) -> Signal dom (Maybe ScanCode)
parseScanCode = flip mealyState Init $ \raw -> fmap getLast . execWriterT . forM_ raw $ \raw -> do
    let finish ev ext = do
            tell $ Last . Just $ ScanCode ev $ fromBytes (ext, raw)
            put Init
    state <- get
    case state of
        Init | raw == 0xe0 -> put $ Extended raw
             | raw == 0xf0 -> put $ Code KeyRelease 0x00
             | otherwise -> finish KeyPress 0x00
        Extended ext | raw == 0xf0 -> put $ Code KeyRelease ext
                     | otherwise -> finish KeyPress ext
        Code ev ext -> finish ev ext
    fromBytes :: (Word8, Word8) -> Word16
    fromBytes = unpack . pack        

Driving a CHIP-8 pixel around

With the video output from last time and the keyboard from this post, but no CPU yet, our options to put everything together into something impressive are somewhat limited. I ended up showing a single CHIP-8 pixel that can be moved around in the CHIP-8 screen space with the arrow keys; this results in something tangible without needing a CPU or even a framebuffer yet. Note how well the code lends itself to using applicative do syntax:

VGADriver{..} = vgaDriver vga640x480at60
ps2 = decodePS2 $ samplePS2 PS2{..}

(dx, dy) = unbundle $ do
    key <- parseScanCode ps2
    pure $ case key of
        Just (ScanCode KeyPress 0xe075) -> (0, -1) -- up
        Just (ScanCode KeyPress 0xe072) -> (0, 1)  -- down
        Just (ScanCode KeyPress 0xe06b) -> (-1, 0) -- left
        Just (ScanCode KeyPress 0xe074) -> (1, 0)  -- right
        _ -> (0, 0)

pixel = do
    x <- fix $ register 0 . (+ dx)
    y <- fix $ register 0 . (+ dy)
    x0 <- (chipX =<<) <$> vgaX
    y0 <- (chipY =<<) <$> vgaY
    pure $ case (,) <$> x0 <*> y0 of
        Just (x0, y0) -> (x0, y0) == (x, y)
        _ -> False

<object height="350" width="425"><param name="movie" value=""/><param name="wmode" value="transparent"/><embed height="350" src="" type="application/x-shockwave-flash" width="425" wmode="transparent"></embed></object>

But wait! There's more!

In reality, after getting the PS/2 decoder working, but before hooking it up to the scan code parser, I thought I'd use the serial IO on the Papilio Pro to do a quick test by just transmitting the scan codes straight away as they come out of the decoder. This has then prompted me to clean up a wart on my UART implementation: they took the clock rate as an extra term-level argument to compute the clock division they need to do:

    :: (HiddenClockReset domain gated synchronous)
    => Word32
    -> Word32
    -> Signal domain (Maybe Word8)
    -> TXOut domain
tx clkRate serialRate inp = TXOut{..}
    (txReady, txOut) = unbundle $ mealyState (tx0 $ clkRate `div` serialRate) (0, Nothing) inp        

This bothered me because the clock domain already specifies the clock rate, at the type level. Trying to remove this redundancy has led me down a rabbit hole of what I believe is a CλaSH bug; but at least I managed to work around that for now (until I find an even better way).

This, in turn, forced me to use a proper clock domain with the correct clock period setting in my CHIP-8 design:

-- | 25.175 MHz clock, needed for the VGA mode we use.
-- CLaSH requires the clock period to be specified in picoseconds.
type Dom25 = Dom "CLK_25MHZ" (1000000000000 `Div` 25175000)        

But then, this allowed me to start putting pixel clock specifications into the type of VGATimings, allowing me to statically enforce that the clock domain in which the VGA signal generator runs is at exactly the right frequency:

    :: (HiddenClockReset dom gated synchronous, KnownNat w, KnownNat h)
    => (dom ~ Dom s ps, ps ~ (1000000000000 `Div` rate))
    => VGATimings rate w h
    -> VGADriver dom w h
-- | VGA 640*480@60Hz, 25.175 MHz pixel clock
vga640x480at60 :: VGATimings 25175000 10 10
vga640x480at60 = VGATimings
    { vgaHorizTiming = VGATiming 640 16 96 48
    , vgaVertTiming  = VGATiming 480 11  2 31

September 08, 2018 06:24 PM

Philip Wadler

A new EU law could end the web as we know it (redux)

In June, the EU avoided voting for Articles 11 and 13, which have the potential to end the web as we know it. Now they're back; the vote is 12 September. Up to date info is at Creative Commons, links to help you write, call or tweet are at Below is what I wrote to my MEPs. If you are an EU citizen, you should write too; WriteToThem makes it easy.

Dear Nosheena Mobarik, Alyn Smith, Ian Hudghton, Catherine Stihler, David Martin, and David Coburn,

I write to you greatly disturbed by the new proposals for copyright. These would require any service provider to maintain large, expensive, and ineffective filters. While YouTube, Twitter, or Facebook will have no problems funding these, they will prevent the next innovative service from getting off the ground. Experience with such filters show that they ban all sort of material which should be in the public domain or which is fair use.

I am also concerned by the resurrection of the "link tax". Previous experience with adding a link tax in Germany and Spain showed that the newspapers that requested it soon stopped using it. It is particularly worrying that the legal formulation chosen will invalidate Creative Commons licenses. Many academic journals (including the one which I edit) depend crucially on Creative Commons, and passing a law that invalidates it is against the public interest.

An excellent one-page summary of the issues, with further links, can be found here:

The web has been a huge boon to innovation, creativity, cooperation, and scientific progress. Please don't kill the goose that lays the golden eggs.

Yours sincerely,

Philip Wadler
Professor of Theoretical Computer Science
University of Edinburgh

by Philip Wadler ( at September 08, 2018 12:18 PM

September 06, 2018

Chris Smith 2

My discussion group is meeting on Friday, so I’ll let you know!

My discussion group is meeting on Friday, so I’ll let you know! I’d planned to revisit some of this and share some thoughts after I’d left this for a week or so. But I’ll go ahead and say a few things now.

Personally, I think when people talk about computational thinking, they mostly have the right set of ideas in mind — decomposition, abstraction, and modeling. There’s another bit, though, sort of underlying all of these, that’s harder to name. It’s the sense that, instead of trying to *first* understand and *then* formalize, we can build understanding through the creation of a formally validated symbolic representation. (There’s a beautiful section of the SICP text in which the application of more and more abstraction to a procedure for computing square roots leads to exploring its connections with important ideas like fixpoints, signal damping, and derivatives. In lecturing on this at IBM, Abelson commented on how this lets us comprehend the meaning of Heron of Alexandria’s process, in a way that even Heron could not access.)

The validation is necessary, by the way, as a crutch, because we’ve left the realm in which intuition works well as a guide. Having some intuition is a good thing, but when trying to state things formally, we always make mistakes, and they are usually not even subtle ones. They are glaringly obvious things that make our work nonsense even if we know what we meant to build. All computation environments do some amount of validation, even if it’s just checking syntax, or making the bad behaviors of our broken models obvious and observable. Some do more than that, such as checking consistent use of types or abstraction boundaries.

This is, I think, is one way in which the computer is relevant to computational thinking. Without constant validation, we still go through qualitatively the same abstraction process, but each step toward abstraction introduces a new possibility of error. The work of carefully verifying each step grows to overwhelm the invention of new abstractions, limiting its feasible scope. Similarly, computer scientists are by no means the first to approach difficult tasks by structured decomposition. Yet, when you are designing a bridge, the work of the engineer is necessarily dwarfed by the hundreds of laborers pouring concrete and welding beams, not to mention mining iron or milling parts! It gives quite a different perspective when coping with complexity is the *main* determinant of success or failure.

The problem is that people say all these things, but then they teach what they know. And for many of us who were trained as computer programmers, what we know is the *practice* of computer programming. Some of that practice is relevant to computational thinking, but much of it is an accident of history, or a consequence of which company had more marketing money in which decades. Even when it’s quite justified as a technology, it’s quite a different thing whether a skill is part of that *core* that gets at the essence of how computation opens up new worlds.

Computer programmers are attached to their tools, but the UNIX command line, integrated development environments, and revision control systems are surely not part of that core. Maybe there’s not much debate on that point. Loops, though, are a main player in even elementary school computing activities. Should they be? They seem to me to have been invented entirely for the practice of computer programming, and are not at all important to abstraction, decomposition, or modeling.

Alas, it seems I have written a blog post in this comment. I might recycle a lot of it in my later recap after having thought this over more!

by Chris Smith at September 06, 2018 02:07 AM

September 05, 2018

Philip Wadler

Why Technology Favors Tyranny

A thoughtful article by Yuval Noah Harari in The Atlantic. Anyone working in computing should be considering the issues raised.
[A]s AI continues to improve, even jobs that demand high intelligence and creativity might gradually disappear. The world of chess serves as an example of where things might be heading. For several years after IBM’s computer Deep Blue defeated Garry Kasparov in 1997, human chess players still flourished; AI was used to train human prodigies, and teams composed of humans plus computers proved superior to computers playing alone.

Yet in recent years, computers have become so good at playing chess that their human collaborators have lost their value and might soon become entirely irrelevant. On December 6, 2017, another crucial milestone was reached when Google’s AlphaZero program defeated the Stockfish 8 program. Stockfish 8 had won a world computer chess championship in 2016. It had access to centuries of accumulated human experience in chess, as well as decades of computer experience. By contrast, AlphaZero had not been taught any chess strategies by its human creators—not even standard openings. Rather, it used the latest machine-learning principles to teach itself chess by playing against itself. Nevertheless, out of 100 games that the novice AlphaZero played against Stockfish 8, AlphaZero won 28 and tied 72—it didn’t lose once. Since AlphaZero had learned nothing from any human, many of its winning moves and strategies seemed unconventional to the human eye. They could be described as creative, if not downright genius.

Can you guess how long AlphaZero spent learning chess from scratch, preparing for the match against Stockfish 8, and developing its genius instincts? Four hours. For centuries, chess was considered one of the crowning glories of human intelligence. AlphaZero went from utter ignorance to creative mastery in four hours, without the help of any human guide.

by Philip Wadler ( at September 05, 2018 03:56 PM

September 04, 2018

Philip Wadler

Based on a True Story

How much of a movie "based on a true story" is actually true? Here is the answer for seventeen movies, ranging from 41% to 100% true. Spotted by Simon Fowler.

Update: the original graph comes from Information is Beautiful.

by Philip Wadler ( at September 04, 2018 07:53 PM

Is the staggeringly profitable business of scientific publishing bad for science?

A long read in the Guardian by Stephen Buranyi.
It is an industry like no other, with profit margins to rival Google – and it was created by one of Britain’s most notorious tycoons: Robert Maxwell.
Spotted by Ross Anderson.

by Philip Wadler ( at September 04, 2018 06:24 PM


Compositional zooming for StateT and ReaderT using lens

Consider writing updates in a state monad where the state contains deeply nested structures. As our running example we will consider a state containing multiple “wallets”, where each wallet has multiple “accounts”, and each account has multiple “addresses”. Suppose we want to write an update that changes one of the fields in a particular address. If the address cannot be found, we want a precise error message that distinguishes between the address itself not being found, or one of its parents (the account, or the wallet) not being found. Without the help of suitable abstractions, we might end up writing something monstrous like:

setUsed :: AddrId -> Update UnknownAddr DB ()
setUsed addrId@(accId@(walletId, accIx), addrIx) = do
    db <- get
    -- find the wallet
    case Map.lookup walletId db of
      Nothing ->
        throwError $ UnknownAddrParent
                   $ UnknownAccParent
                   $ UnknownWalletId walletId
      Just wallet ->
        -- find the account
        case Map.lookup accIx wallet of
          Nothing ->
            throwError $ UnknownAddrParent
                       $ UnknownAccId accId
          Just acc ->
            -- find the address
            case Map.lookup addrIx acc of
              Nothing ->
                throwError $ UnknownAddrId addrId
              Just (addr, _isUsed) -> do
                let acc'    = Map.insert addrIx (addr, True) acc
                    wallet' = Map.insert accIx acc' wallet
                    db'     = Map.insert walletId wallet' db
                put db'

In the remainder of this blog post we will show how we can develop some composable abstractions that will allow us to rewrite this as

setUsed :: AddrId -> Update UnknownAddr DB ()
setUsed addrId =
    zoomAddress id addrId $
      modify $ \(addr, _isUsed) -> (addr, True)

for an appropriate definition of zoomAddress given later.


To obtain compositionality, we want to be able to lift updates on a smaller context (such as a particular wallet) to a larger context (the entire state). In order to do that, we will need a way to get the smaller context from the larger, and to be able to lift modifications of the smaller context to modifications of the larger context. This is of course precisely the definition of a lens, and so we arrive at the following signature:

zoom :: Lens' st st' -> State st' a -> State st  a

For the purposes of the first part this blog post we will define State in a somewhat unusual way as

newtype Result a st = Result { getResult :: (a, st) }
type State st a = st -> Result a st

It will become evident why we choose this definition soon; for now, if you squint a bit you can hopefully see that this is equivalent to the state monad we all know and love. A somewhat naive way to write zoom is

zoom :: Lens' st st' -> State st' a -> State st  a
zoom l f large = fmap updSmall $ f (large ^. l)
    updSmall small' = large & l .~ small'

This definition clearly demonstrates what we said above: we use the lens to first get the small state from the large, run the update on that smaller state, and finally use the lens once more to update the larger state with the new value of the smaller state, relying on the fact that Result is a Functor.

If we are using lenses in Van Laarhoven representation, however, we can actually write this in a more direct way. Expanding synoynms, we get

zoom :: (forall f. Functor f => (st' -> f st')  
                             -> (st  -> f st))
     -> (st' -> Result a st')
     -> (st  -> Result a st)

Note how if we take advantage of our somewhat unusual representation of the state monad, we can instantiate f to Result a, so that lens already gives us precisely what we need! In other words, we can rewrite zoom as simply

zoom :: Lens' st st' -> State st' a -> State st  a
zoom = id

Dealing with failure

In order to deal with missing values, we need a variation on zoom:

zoomM :: Lens' st (Maybe st') -> State st' a -> State st (Maybe a)

We can write this in a naive way again, being very explicit about what’s happening:

zoomM :: Lens' st (Maybe st') -> State st' a -> State st (Maybe a)
zoomM l f large =
    case large ^. l of
      Nothing    -> Result (Nothing, large)
      Just small -> bimap Just (updSmall . Just) $ f small
    updSmall small' = large & l .~ small'

As before, we first use the lens to get the smaller state from the larger. This may now fail; if it does, we return Nothing as the result along with the unchanged state. If the smaller state does exist, we run the update on that smaller state, and then wrap its result in Just; this relies on the fact that Result is a Bifunctor. In case you haven’t seen that class before, it’s the “obvious” generalization of Functor to datatypes with two type arguments:

class Bifunctor p where
  bimap :: (a -> b) -> (c -> d) -> p a c -> p b d  

The instance for Result is easy:

instance Bifunctor Result where
  bimap f g (Result (a, st)) = Result (f a, g st)

As before, however, we can use the lens in a more direct way. Expanding synonyms once again, we get:

zoomM :: (forall f. Functor f => (Maybe st' -> f (Maybe st'))
                              -> (st -> f st))
      -> (st' -> Result a st')
      -> (st  -> Result (Maybe a) st)

If we line up the result of the lens with the result we want from zoomM, we see that we must pick Result (Maybe a) for f; all that remains is writing a suitable wrapper:

liftMaybe :: Biapplicative p
          => (st -> p a st) -> Maybe st -> p (Maybe a) (Maybe st)
liftMaybe _ Nothing   = bipure Nothing Nothing
liftMaybe f (Just st) = bimap Just Just $ f st

This relies on Result being Biapplicative, which is again the “obvious” generalization of Applicative to datatypes with two arguments:

class Bifunctor p => Biapplicative p where
  bipure  :: a -> b -> p a b
  (<<*>>) :: p (a -> b) (c -> d) -> p a c -> p b d

The instance for Result again is straight-forward:

instance Biapplicative Result where
  bipure a st = Result (a, st)
  Result (f, g) <<*>> Result (a, st) = Result (f a, g st)

This out of the way, we can now define zoomM as

zoomM :: Lens' st (Maybe st') -> State st' a -> State st (Maybe a)
zoomM l = l . liftMaybe


So far we have been using a non-standard definition of the state monad. In this section we will see how we can avoid doing that and, more importantly, how we can write our zooming combinators in such a way that they can be used also in the reader monad.

Let’s define a monad for updates and a monad for queries using the standard monad transformers:

newtype Update e st a = Update {
    runUpdate :: StateT st (Except e) a
  deriving ( Functor, Applicative
           , Monad, MonadState st, MonadError e )

newtype Query e st a = Query {
    runQuery :: ReaderT st (Except e) a
  deriving ( Functor, Applicative
           , Monad, MonadReader st, MonadError e )

We want to be able to “zoom” in either of these two monads. We saw above that the key to be able to use the lens directly is the ability to express our update as a function

st -> f st

for some suitable functor f. For zoom we picked Result a, for zoomM we picked Result (Maybe a). The choice of Result, however, was specific to our concrete definition of State. If we want to generalize, we need to generalize away from this type:

class Biapplicative (Result z) => Zoomable z where
  type Result z :: * -> * -> *

  wrap   :: (st -> Result z a st) -> z st a
  unwrap :: z st a -> (st -> Result z a st)

In this type class we introduce a type family Result that we can instantiate to different types for different monads; wrap and unwrap are necessary because unlike our bespoke State monad definition above, the conventional definition of the state monad is isomorphic, but not equal, to a function from a state to a state. We saw above why we need Result z to be Biapplicative.

Zoomable instance for Update

In order to be able to define a Zoomable instance for Update, we need to introduce a type that captures the result of an update:

newtype UpdResult e a st = UpdResult {
    getUpdResult :: Except e (a, st)

Defining the Zoomable instance for UpdResult is now easy:

instance Zoomable (Update e) where
  type Result (Update e) = UpdResult e

  wrap   = coerce
  unwrap = coerce

Note that wrap and unwrap are simply coerce; in other words, they exist only to satisfy the type checker, but have no runtime cost.

Zoomable instances for Query

The nice thing is that we can just as easily give a Zoomable instance for Query. The only difference is that the result of the query does not have a final state:

newtype QryResult e a st = QryResult {
    getQryResult :: Except e a

The Zoomable instance is just as simple:

instance Zoomable (Query e) where
  type Result (Query e) = QryResult e

  wrap   = coerce
  unwrap = coerce

Functor from Bifunctor

If we now try to define zoom for any Zoomable monad, we find that we get stuck very quickly: in order to be able to apply the lens, we need Result z a to be a functor; but all we know is that Result z is a bifunctor. Starting from ghc 8.6 we could use quantified constraints and write

class ( Biapplicative (Result z)
      , forall a. Functor (Result z a)
   => Zoomable z where (..)

to insist that Result z a must be a functor for any choice of a. We could also add a Functor (Result z a) constraint to the type of zoom itself, but this gives zoom a more messy signature than it needs to have.

If we want to be compatible with older versions of ghc but still keep the nicer signature, we can take advantage of the fact that if a datatype is a bifunctor it must also be a functor:

newtype FromBi p a st = WrapBi { unwrapBi :: p a st }

instance Bifunctor p => Functor (FromBi p a) where
  fmap f (WrapBi x) = WrapBi (second f x)

Generalizing the zoom operators

We now have everything we need to give the generalized definitions of the zoom operators. In fact, the definition is almost dictated by the types:

zoom :: Zoomable z => Lens' st st' -> z st' a -> z st  a
zoom l k = wrap $ \st -> unwrapBi $ l (WrapBi . unwrap k) st

Although this looks more complicated than the definition we have before, note that

   zoom l k
 -- definition
== wrap $ \st -> unwrapBi $ l (WrapBi . unwrap k) st
 -- wrap and unwrap are both 'coerce'
== \st -> unwrapBi $ l (WrapBi . k) st
 -- unwrapBi and WrapBi are just newtype wrappers
== \st -> l k st
 -- eta-reduce
== l k

In other words, modulo newtype wrapping, we still have zoom = id. The definition of zoomM is similar to what we had above also:

zoomM :: Zoomable z
      => Lens' st (Maybe st')
      -> z st' a
      -> z st (Maybe a)
zoomM l k = wrap $ \st -> unwrapBi $
              l (WrapBi . liftMaybe (unwrap k)) st

The proof that this is equivalent to simply l (liftMaybe k) is left as a simple exercise for the reader.

Finally, we can define a useful variation on zoomM that uses a fallback when the smaller context was not found:

zoomDef :: (Zoomable z, Monad (z st))
        => Lens' st (Maybe st')
        -> z st  a -- ^ When not found
        -> z st' a -- ^ When found
        -> z st  a
zoomDef l def k = zoomMaybe l k `catchNothing` def


catchNothing :: Monad m => m (Maybe a) -> m a -> m a
catchNothing act fallback = act >>= maybe fallback return

Using the combinators

We will now go back to the example from the introduction and show how we can write some domain-specific zoom operators using the building blocks we just defined.


The example is a state consisting of multiple wallets, where each wallet has multiple accounts, and each account has multiple addresses. For the sake of this blog post it doesn’t really matter what “wallets”, “accounts” and “addresses” are, and we will model them very simply as

type DB      = Map WalletId Wallet
type Wallet  = Map AccIx    Account
type Account = Map AddrIx   Address
type Address = (String, Bool)

The top-level state is a mapping from wallet IDs to wallets, but a wallet is a mapping from account indices to accounts. The reason for the difference is that we will reserve the term account ID for the combination of a wallet ID and an account index, and similarly for addresses:

type AccIx  = Int
type AddrIx = Int

type WalletId = Int
type AccId    = (WalletId, AccIx)
type AddrId   = (AccId, AddrIx)

Finally, the requirements stated that we wanted to distinguish between, say, an address not found because although the account exists, it doesn’t have that particular address, and an address not found because its enclosing account (or indeed wallet) does not exist:

data UnknownWallet = UnknownWalletId   WalletId
data UnknownAcc    = UnknownAccId      AccId    
                   | UnknownAccParent  UnknownWallet
data UnknownAddr   = UnknownAddrId     AddrId   
                   | UnknownAddrParent UnknownAcc


Ok, definitions done, we can now define our zoom combinators. Our initial attempt might be something like

zoomWallet :: WalletId
           -> Update e Wallet a
           -> Update e DB     a

If the wallet ID was not found, however, we want to be able to throw an UnknownWallet error. We could change the signature to

zoomWallet :: WalletId
           -> Update UnknownWallet Wallet a
           -> Update UnknownWallet DB     a

but now we cannot use zoomWallet for updates with a richer error type. A better solution is to take as an argument a function that allows us to embed the UnknownWallet error into e:

zoomWallet :: (UnknownWallet -> e)
           -> WalletId
           -> Update e Wallet a
           -> Update e DB     a
zoomWallet embedErr walletId k =
    zoomDef (at walletId)
            (throwError $ embedErr (UnknownWalletId walletId)) $

The definition is pleasantly straightforward. We use the at combinator from lens to give us a lens into the map, and then use zoomDef with a fallback that throws the error to complete the definition.


In order to show that our new combinators are compositional we should be able to define zoomAccount in terms of zoomWallet, and indeed we can:

zoomAccount :: (UnknownAcc -> e)
            -> AccId
            -> Update e Account a
            -> Update e DB      a
zoomAccount embedErr accId@(walletId, accIx) k =
    zoomWallet (embedErr . UnknownAccParent) walletId $
      zoomDef (at accIx)
              (throwError $ embedErr (UnknownAccId accId)) $

Composing the zoom combinators is effectively lens composition, which is taking care of getting the account from the DB by first getting the account in one direction, and updating the DB by first lifting the update on the account to an update on the wallet, and then to an update on the DB itself.

The “embed error” argument is helping with compositionality also: zoomAccount needs its embedErr to embed UnknownAcc into e, but when it calls zoomWallet it composes embedErr with UnknownAccParent to embed UnknownWallet into e.

The definition for address follows the exact same pattern:

zoomAddress :: (UnknownAddr -> e)
            -> AddrId
            -> Update e Address a
            -> Update e DB      a
zoomAddress embedErr addrId@(accId, addrIx) k =
    zoomAccount (embedErr . UnknownAddrParent) accId $
      zoomDef (at addrIx)
              (throwError $ embedErr (UnknownAddrId addrId)) $

so that we can now write the definition we promised in the introduction:

setUnused :: AddrId -> Update UnknownAddr DB ()
setUnused addrId =
    zoomAddress id addrId $
      modify $ \(addr, _isUsed) -> (addr, False)


There is one additional zoom operator that is very useful to define. Suppose we want to clear out all wallets. If we tried to write this with the combinators we have so far, we would end up with something like

emptyAllWallets :: Update UnknownWallet DB ()
emptyAllWallets = do
    walletIds <- gets Map.keys
    forM_ walletIds $ \walletId ->
      zoomWallet id walletId $
        put Map.empty

We get all wallet IDs, then zoom to each wallet in turn and empty it. However, notice the signature: it indicates that emptyAllWallets may throw a UnknownWallet error—but it never will! After all, we just read all wallet IDs, so we know for a fact that they must be present. One “solution” is to do something like

emptyAllWallets :: Update e DB ()
emptyAllWallets = do
    walletIds <- gets Map.keys
    forM_ walletIds $ \walletId ->
      zoomWallet (\_err -> error "can't happen") walletId $
        put Map.empty

but we can do much better: we need a zoom operator that gives us iteration.


In the world of lens, iteration is captured by a Traversal'. Compare the synoynms:

type Lens'      st st' = forall f. Functor     f => (st' -> f st')
                                                 -> (st  -> f st)
type Traversal' st st' = forall f. Applicative f => (st' -> f st')
                                                 -> (st  -> f st)

A traversal will apply its argument to all occurrences of the smaller state; in order to patch the results back together it needs f to be Applicative rather than merely a Functor.

Applicative from Biapplicative

Remember that the f we’re using in Zoomable is the Result z type family, which we know to be Biapplicative. We showed above that we can easily derive Functor from Bifunctor; deriving Applicative from Biapplicative, however, is not so easy! Let’s see what we need to do:

instance Biapplicative p => Applicative (FromBi p a) where
  pure st     = WrapBi $ bipure _e st
  fun <*> arg = WrapBi $ bimap _c ($) (unwrapBi fun)
                           <<*>> unwrapBi arg

There are two problematic holes in this definition:

  • For pure we need to construct an p a st from just a state st; we need to construct an _e :: a out of thin air. This corresponds to having no results at all.
  • For (<*>) we need a function _c :: a -> a -> a that combines two results into a single one.

The usual solution to this problem is to require a to be a monoid. Then we can use mempty for the absence of a result, and mappend to combine results:

instance ( Biapplicative p
         , Monoid a
      => Applicative (FromBi p a) where
  pure st     = WrapBi $ bipure mempty st
  fun <*> arg = WrapBi $ bimap mappend ($) (unwrapBi fun)
                           <<*>> unwrapBi arg


We can now define zoomAllM:

zoomAllM :: (Zoomable z, Monoid a)
         => Traversal' st st' -> z st' a -> z st a
zoomAllM l k = wrap $ \st -> unwrapBi $ l (WrapBi . unwrap k) st

Apart from the signature, the body of this function is literally identical to zoom, and is therefore also equivalent to simply id. Mind-blowing stuff.

We can define two useful wrappers for zoomAllM with slightly simpler types. The first is just a synoynm which can be used when we don’t want to accumulate any results:

zoomAll_ :: Zoomable z => Traversal' st st' -> z st' () -> z st ()
zoomAll_ = zoomAllM

This works because () is trivially a monoid. Finally we can define a wrapper that accumulates results in a list:

zoomAll :: Zoomable z => Traversal' st st' -> z st' a -> z st [a]
zoomAll l k = wrap $ \st -> unwrapBi $
                l (WrapBi . first (:[]) . unwrap k) st

We could have defined zoomAll in terms of zoomAllM if we insist that z st' is a Functor; by unfolding the definition we can take advantage of the fact that Result z is a bifunctor and we keep the signature clean.

Usage example

The example function we were considering was one that cleared out all wallets. With our new combinators in hand, this is now trivial:

emptyAllWallets :: Update e DB ()
emptyAllWallets = zoomAll_ traverse $ put Map.empty


As Haskell programmers, compositionality is one of our most treasured principles. The ability to build larger components from smaller, and reason about larger components by reasoning about the smaller, is crucial to productivity and clean, maintainable code. When dealing with large states (for example, in an acid-state database), lenses are a powerful tool that can be used to lift operations on parts of the state to the whole state. In this blog post we defined some reuseable combinators that can be used both in updates and in queries; they are used extensively in the design of the new Cardano wallet kernel.

Postscript: zoom from Control.Lens.Zoom

The lens library itself also defines a zoom operator. It has the same purpose as the zoom operator we defined here, but generalizes over the underlying monad in a different way (allowing for deeply nested occurrences of StateT in a monad stack), and is not applicable to the reader monad (the equivalent for the reader monad is magnify). However, if compatibility with ReaderT is not required then it is also possible to define zoom, zoomDef, and zoomAll in terms of the lens operator; domain specific combinators like zoomWallet can then be defined just like we have done here.

by edsko at September 04, 2018 01:35 PM

Jasper Van der Jeugt

Dependent Types in Haskell: Binomial Heaps 101

Haskell eXchange

If you’re located near London, you might be interested in the fact that I will give a talk about the contents of this blogpost at the Haskell eXchange 2018 that is happening Thursday 11th - Friday 12th of October.

After that conference, a video should be available as well which I will link here.


This post makes a bit of a departure from the “practical Haskell” I usually try to write about, although – believe it or not – this blogpost actually originated from a very practical origin 1.

This blogpost is a literate Haskell file, which means you can just download it here and load it into GHCi to play around with it. In this case, you can also verify the properties we will be talking about (yes, GHC as a proof checker). Since we are dipping our toes into dependent types territory here, we will need to enable some extensions that are definitely a bit more on the advanced side.

Since the goal of this blogpost is mainly educational, we will only use a few standard modules and generally define things ourselves. This also helps us to show that there is no magic going on behind the scenes: all term-level functions in this file are total and compile fine with -Wall.

I assume most readers will be at least somewhat familiar with the standard length-indexed list:

These vectors carry their length in their types. In GHCi:

*Main> :t VCons "Hello" (VCons "World" VNull)
Vec ('Succ ('Succ 'Zero)) [Char]

This blogpost defines a similar way to deal with binomial heaps. Binomial heaps are one of my favorite data structures because of their simple elegance and the fascinating way their structure corresponds to binary numbers.

We will combine the idea of Peano number-indexed lists with the idea that binomial heaps correspond to binary numbers to lift binary numbers to the type level. This is great because we get O(log(n)) size and time in places where we would see O(n) for the Peano numbers defined above (in addition to being insanely cool). In GHCi:

*Main> :t pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
            pushHeap 'd' $ pushHeap 'e' emptyHeap
Heap ('B1 ('B0 ('B1 'BEnd))) Char

Where 101 2 is, of course, the binary representation of the number 5.

Conveniently, 101 also represents the basics of a subject. So the title of this blogpost works on two levels, and we present an introductory-level explanation of a non-trivial (and again, insanely cool) example of dependent Haskell programming.

Table of contents

  1. Introduction
  2. Singletons and type equality
  3. Binomial heaps: let’s build it up
    1. Binomial trees
    2. Type level binary numbers
    3. Binomial forests
    4. The binomial heap
  4. Binomial heaps: let’s break it down
    1. Taking apart a single tree
    2. More Vec utilities
    3. Popcount and width
    4. Lumberjack
    5. Lumberjack: final form
    6. popHeap: gluing the pieces together
  5. Acknowledgements
  6. Appendices
    1. Appendix 1: runtime cost of this approach
    2. Appendix 2: “pretty”-printing of heaps
    3. Appendix 3: left-to-right increment

Singletons and type equality

If I perform an appropriate amount of hand-waving and squinting, I feel like there are two ways to work with these stronger-than-usual types in Haskell. We can either make sure things are correct by construction, or we can come up with a proof that they are in fact correct.

The former is the simpler approach we saw in the Vec snippet: by using the constructors provided by the GADT, our constraints are always satisfied. The latter builds on the singletons approach introduced by Richard Eisenberg and Stephanie Weirich.

We need both approaches for this blogpost. We assume that the reader is somewhat familiar with the first one and in this section we will give a brief introduction to the second one. It is in no way intended to be a full tutorial, we just want to give enough context to understand the code in the blogpost.

If we consider a closed type family for addition of natural numbers (we are using an N prefix since we will later use B for addition of binary numbers):

We can trivially define the following function:

NAdd 'Zero x is easily reduced to x by GHC since it is simply a clause of the type family, so it accepts the definition id. However, if we try to write

We run into trouble, and GHC will tell us:

Couldn't match type ‘x’ with ‘NAdd x 'Zero’

We will need to prove to GHC that these two types are equal – commutativity doesn’t come for free! This can be done by providing evidence for the equality by way of a GADT constructor 3.

Take a minute to think about the implications this GADT has – if we can construct a QED value, we are actually providing evidence that the two types are equal. We assume that the two types (a and b) have the same kind k 4.

The QED constructor lives on the term-level though, not on the type-level. We must synthesize this constructor using a term-level computation. This means we need a term-level representation of our natural numbers as well. This is the idea behind singletons and again, a much better explanation is available in said paper and some talks, but I wanted to at least provide some intuition here.

The singleton for Nat is called SNat and it’s easy to see that each Nat has a unique SNat and the other way around:

We can use such a SNat to define a proof for what we are trying to accomplish. Since this proof can be passed any n in the form of an SNat, it must be correct for all n.

GHC can figure out the base case on its own by reducing NAdd 'Zero 'Zero to 'Zero:

And we can use induction to complete the proof. The important trick here is that in the body of the pattern match on EqualityProof a b, GHC knows that a is equal to b.

This can be used to write cast02:

cast02 takes an extra parameter and there are several ways to synthesize this value. The common one is a typeclass that can give us an SNat x from a Proxy x. In this blogpost however, we keep things simple and make sure we always have the right singletons on hand by passing them around in a few places. In other words: don’t worry about this for now.

Binomial heaps: let’s build it up

Binomial trees

A binomial heap consists of zero or more binomial trees. I will quote the text from the Wikipedia article here since I think it is quite striking how straightforward the definition translates to GADTs that enforce the structure:

  • A binomial tree of order 0 is a single node
  • A binomial tree of order k has a root node whose children are roots of binomial trees of orders k−1, k−2, …, 2, 1, 0 (in this order).

Some illustrations to make this a bit more clear:

<figure> Illustration of trees of different sizes, with the children array marked in green<figcaption>Illustration of trees of different sizes, with the children array marked in green</figcaption> </figure>

This is definitely a very good example of the correctness by construction approach I talked about earlier: it is simply impossible to create a tree that does not have the right shape.

Empty trees do not exist according to this definition. A singleton tree is easy to create:

We only need to define one operation on trees, namely merging two trees.

A tree of order k has 2ᵏ elements, so it makes sense that merging two trees of order k creates a tree of order k+1. We can see this in the type signature as well:

Concretely, we construct the new tree by taking either the left or the right tree and attaching it as new child to the other tree. Since we are building a heap to use as a priority queue, we want to keep the smallest element in the root of the new tree.

<figure> Merging two trees. Since ‘a’ is smaller than ‘b’ we attach the ‘b’ tree as a new child to the ‘a’ tree.<figcaption>Merging two trees. Since ‘a’ is smaller than ‘b’ we attach the ‘b’ tree as a new child to the ‘a’ tree.</figcaption> </figure>

Type level binary numbers

With these trees defined, we can move on to binomial heaps.

While binomial trees are interesting on their own, they can really only represent collections that have a number of elements that are exactly a power of two.

Binomial heaps solve this in a surprisingly simple way. A binomial heap is a collection of binomial trees where we may only have at most one tree for every order.

This is where the correspondence with binary numbers originates. If we have a binomial heap with 5 elements, the only way to do this is to have binomial trees of orders 2 and 0 (2² + 2⁰ = 5).

We start out by defining a simple datatype that will be lifted to the kind-level, just as we did with Nat:

It’s important to note that we will represent binary numbers in a right-to-left order since this turns out to match up more naturally with the way we will be defining heaps.

For example, the type:

'B0 ('B1 ('B1 'BEnd))

represents the number 6 (conventionally written 110).

I think it is fairly common in Haskell for a developer to play around with different ways of representing a certain thing until you converge on an elegant representation. This is many, many times more important when we are dealing with dependently-typed Haskell.

Inelegant and awkward data representations can make term-level programming clunky. Inelegant and awkward type representations can make type-level programming downright infeasible due to the sheer amount of lemmas that need to be proven.

Consider the relative elegance of defining a type family for incrementing a binary number that is read from the right to the left:

Appendix 3 contains an (unused) implementation of incrementing left-to-right binary numbers. Getting things like this to work is not too much of a stretch these days (even though GHC’s error messages can be very cryptic). However, due to the large amount of type families involved, proving things about it presumably requires ritually sacrificing an inappropriate amount of Agda programmers while chanting Richard Eisenberg’s writings.

To that end, it is almost always worth spending time finding alternate representations that work out more elegantly. This can lead to some arbitrary looking choices – we will see this in full effect when trying to define CutTree further below.

Addition is not too hard to define:

Let’s quickly define a number of examples

This allows us to play around with it in GHCi:

*Main> :set -XDataKinds
*Main> :kind! BAdd BFour BFive
BAdd BFour BFive :: Binary
= 'B1 ('B0 ('B0 ('B1 'BEnd)))

Finally, we define a corresponding singleton to use later on:

Binomial forests

Our heap will be a relatively simple wrapper around a recursive type called Forest. This datastructure follows the definition of the binary numbers fairly closely, which makes the code in this section surprisingly easy and we end up requiring no lemmas or proofs whatsoever.

A Forest k b refers to a number of trees starting with (possibly) a tree of order k. The b is the binary number that indicates the shape of the forest – i.e., whether we have a tree of a given order or not.

Using a handwavy but convenient notation, this means that Forest 3 101 refers to binomial trees of order 3 and 5 (and no tree of order 4).

Note that we list the trees in increasing order here, which contrasts to Children, where we listed them in decreasing order. You can see this in the way we are removing layers of 'Succ as we add more constructors. This is the opposite of what happens in Children.

The empty forest is easily defined:

insertTree inserts a new tree into the forest. This might require merging two trees together – roughly corresponding to carrying in the binary increment operation.

Similarly, merging two forests together corresponds to adding two binary numbers together:

It’s worth seeing how the different branches in insertTree and mergeForests match up almost 1:1 with the different clauses in the definition of the type families BInc and BAdd. If we overlay them visually:

<figure> Overlaying mergeForests and BAdd brings out the similarity<figcaption>Overlaying mergeForests and BAdd brings out the similarity</figcaption> </figure>

That is the intuitive explanation as to why no additional proofs or type-level trickery are required here.

Here is an informal illustration of what happens when we don’t need to merge any trees. The singleton Forest on the left is simply put in the empty F0 spot on the right.

<figure> Simple merge; 1 + 100 = 101<figcaption>Simple merge; 1 + 100 = 101</figcaption> </figure>

When there is already a tree there, we merge the trees using mergeTree and carry that, in a very similar way to how carrying works in the addition of binary numbers:

<figure> Merge with carry, 1 + 101 = 110<figcaption>Merge with carry, 1 + 101 = 110</figcaption> </figure>

The binomial heap

The Forest structure is the main workhorse and Heap is just a simple wrapper on top of that, where we start out with a tree of order 0:

The operations on Heap are also simple wrappers around the previously defined functions:

We are now ready to show this off in GHCi again:

*Main> :t pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
            pushHeap 'd' $ pushHeap 'e' emptyHeap
Heap ('B1 ('B0 ('B1 'BEnd))) Char

We can also take a look at the internals of the datastructure using a custom show instance provided in the appendix 2:

*Main> pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
            pushHeap 'd' $ pushHeap 'e' emptyHeap
(tree of order 0)
(no tree of order 1)
(tree of order 2)


Binomial heaps: let’s break it down

I think it’s interesting that we have implemented an append-only heap without even requiring any lemmas so far. It is perhaps a good illustration of how append-only datastructures are conceptually much simpler.

<figure> Yes, this is a long blogpost. We’ve arrived at the halfway point, so it’s a good time to get a coffee and take a break. You deserve it for sticking with me so far.<figcaption>Yes, this is a long blogpost. We’ve arrived at the halfway point, so it’s a good time to get a coffee and take a break. You deserve it for sticking with me so far.</figcaption> </figure>

Things get significantly more complicated when we try to implement popping the smallest element from the queue. For reference, I implemented the current heap in a couple of hours, whereas I worked on the rest of the code on and off for about a week.

Let’s look at a quick illustration of how popping works.

We first select the tree with the smallest root and remove it from the heap:

We break up the tree we selected into its root (which will be the element that is “popped”) and its children, which we turn into a new heap.

We merge the remainder heap from step 1 together with the new heap we made out of the children of the removed tree:

The above merge requires carrying twice.

Taking apart a single tree

We will start by implementing step 2 of the algorithm above since it is a bit easier. In this step, we are taking all children from a tree and turning that into a new heap.

We need to keep all our invariants intact, and in this case this means tracking them in the type system. A tree of k has 2ᵏ elements. If we remove the root, we have k children trees with 2ᵏ - 1 elements in total. Every child becomes a tree in the new heap. This means that the heap contains k full trees, and its shape will be written as k “1”s. This matches our math: if you write k “1”s, you get the binary notation of 2ᵏ - 1.


<figure> A tree of order 3 results in a heap with an “11” shape<figcaption>A tree of order 3 results in a heap with an “11” shape</figcaption> </figure>

We introduce a type family for computing n “1”s:

We will use a helper function childrenToForest_go to maintain some invariants. The wrapper childrenToForest is trivially defined but its type tells us a whole deal:

We use childrenSingleton to obtain a singleton for n.

The tricky bit is that the list of trees in Children has them in descending order, and we want them in ascending order in Forest. This means we will have to reverse the list.

We can reverse a list easily using an accumulator in Haskell. In order to maintain the type invariants at every step, we will increase the size of the accumulator as we decrease the size of the children. This can be captured by requiring that their sum remains equal (m ~ NAdd x n).

I will not always go into detail on how the lemmas apply but let’s do it here nonetheless.

For the base case, we simply want to return our accumulator. However, our accumulator has the type Forest n (Ones x) and we expect something of the type Forest n (Ones m). Furthermore, we know that:

  n ~ 'Zero, m ~ NAdd x n
⊢ m ~ NAdd x 'Zero

We need to prove that x ~ m in order to do the cast from Forest n (Ones x) to Forest n (Ones m).

We can do so by applying lemma1 to x (the latter represented here by xnat). This gives us lemma1 xnat :: NAdd x 'Zero :~: n. Combining this with what we already knew:

  m ~ NAdd x 'Zero, NAdd x 'Zero ~ n
⊢ m ~ x

…which is what we needed to know.

The inductive case is a bit harder and requires us to prove that:

  m ~ NAdd x n, m ~ NAdd x n, n ~ 'Succ k
⊢ Ones m ~ 'B1 (Ones (NAdd x k))

GHC does a great job and ends up with something like:

Ones (NAdd x (Succ k)) ~ 'B1 (Ones (NAdd x k))

Which only requires us to prove commutativity on NAdd. You can see that proof in lemma2 a bit further below. This case also illustrates well how we carry around the singletons as inputs for our lemmas and call on them when required.

Proving lemma2 is trivial… once you figure out what you need to prove and how all of this works.

It took me a good amount of time to put the different pieces together in my head. It is not only a matter of proving the lemma: restructuring the code in childrenToForest_go leads to different lemmas you can attempt to prove, and figuring out which ones are feasible is a big part of writing code like this.

More Vec utilities

These are some minor auxiliary functions we need to implement on Vec. We mention them here because we’ll also need two type classes dealing with non-zeroness.

First, we need some sort of map, and we can do this by implementing the Functor typeclass.

Secondly, we need a very simple function to convert a Vec to a list. Note that this erases the information we have about the size of the list.

Using vecToList, we can build a function to convert a non-empty Vec to a NonEmpty list. This uses an additional NNonZero typeclass.

Non-zeroness can be defined on binary numbers as well:

You might be asking why we cannot use a simpler type, such as:

It we use this, we run into trouble when trying to prove that a Vec is not empty later on. We would have to construct a singleton for n, and we only have something that looks a bit like ∃n. 'Succ n. Trying to get the n out of that requires some form of non-zeroness constraint… which would be exactly what we are trying to avoid by using the simpler type. 5

Popcount and width

The minimal element will always be the root of one of our trees. That means we have as many choices for our minimal element as there are trees in our heap. We need some way to write down this number as a type.

Since we have a tree for every 1 in our binary number, we can define the number of trees as the popcount of the binary number.

In a weird twist of fate, you can also pretend this stands for “the count of trees which we can pop”, which is exactly what we will be using it for.

Popcount can be used to relate the non-zeroness of a natural number, and the non-zeroness of a binary number.

In addition to caring about the popcount of a binary number, we are sometimes interested in its width (number of bits). This is also easily captured in a type family:

That is a fair amount of type families so far. To make things a bit more clear, here is an informal visual overview of all the type families we have defined, including BDec (binary decrement, defined further below).

<figure> Rectangles represent (lifted) kinds, arrows are type families<figcaption>Rectangles represent (lifted) kinds, arrows are type families</figcaption> </figure>


Now, popping the smallest element from the heap first involves cutting a single tree from the forest inside the heap. We take the root of that tree and merge the children of the tree back together with the original heap.

<figure> Selecting a single tree<figcaption>Selecting a single tree</figcaption> </figure>

However, just selecting (and removing) a single tree turns out to be quite an endeavour on its own. We define an auxiliary GADT which holds the tree, the remainder of the heap, and most importantly, a lot of invariants.

Feel free to scroll down to the datatype from here if you are willing to assume the specific constraint and types are there for a reason.

The two first fields are simply evidence singletons that we carry about. k stands for the same concept as in Forest; it means we are starting with an order of k. x stands for the index of the tree that was selected.

This means the tree that was selected has an order of NAdd k x, as we can see in the third field. If the remainder of the heap is Forest k b a, its shape is denoted by b and we can reason about the shape of the original heap.

The children of tree (Tree (NAdd k x) a) that was selected will convert to a heap of shape Ones x. We work backwards from that to try and write down the type for the original heap. The tree (Tree (NAdd k x) a) would form a singleton heap of shape BInc (Ones x). The remainder (i.e., the forest with this tree removed) had shape b, so we can deduce that the original shape of the forest must have been BAdd b (BInc (Ones x)).

Finally, we restructure the type in that result to BInc (BAdd b (Ones x)). The restructuring is trivially allowed by GHC since it just requires applying the necessary type families. The restructured type turns out to be more easily usable in the places where we case-analyse CutTree further down in this blogpost.

We also carry a constraint here that seems very arbitrary and relates the widths of two binary numbers. It is easier to understand from an intuitive point of view: the new (merged) heap has the same width as the original heap. Why is it here?

Well, it turns out we will need this fact further down in a function definition. If we can conclude it here by construction in the GADT, we avoid having to prove it further down.

Of course, I know that I will need this further down because I already have the code compiling. When writing this, there is often a very, very painful dialogue in between different functions and datatypes, where you try to mediate by making the requested and expected types match by bringing them closer together step by step. In the end, you get a monstrosity like:

Fortunately, this type is internal only and doesn’t need to be exported.

lumberjack_go is the worker function that takes all possible trees out of a heap. For every 1 in the shape of the heap, we have a tree: therefore it should not be a surprise that the length of the resulting vector is Popcount b.

The definition is recursive and a good example of how recursion corresponds with inductive proofs (we’re using lemma1 and lemma2 here). We don’t go into much detail with our explanation here – this code is often hard to write, but surprisingly easy to read.

Lumberjack: final form

Now that we can select Popcount b trees, it’s time to convert this to something more convenient to work with. We will use a NonEmpty to represent our list of candidates to select from.

First, we select the Popcount b trees:

Then we convert it to a NonEmpty. This requires us to call lemma3 (the proof that relates non-zeroness of a binary number with non-zeroness of a natural number through popcount). We need an appropriate SBin to call lemma3 and the auxiliary function forestSingleton defined just below does that for us.

This function is similar to childrenSingleton – it constructs an appropriate singleton we can use in proofs.

popHeap: gluing the pieces together

We can now find all trees in the heap that may be cut. They are returned in a CutTree datatype. If we assume that we are taking a specific CutTree, we can take the root from the tree inside this datatype, and we can construct a new heap from its children using childrenToForest. Then, we merge it back together with the original heap.

The new heap has one less element – hence we use BDec (binary decrement, defined just a bit below).

We deconstruct the CutTree to get the root (x) of the selected tree, the children of the selected trees (children), and the remaining trees in the heap (forest).

We construct a new forest from the children.

We merge it with the remainder of the heap:

The illustration from above applies here:

<figure> Merging back together<figcaption>Merging back together</figcaption> </figure>

Now, we cast it to the result using a new lemma4 with a singleton that we construct from the trees:

This is the type family for binary decrement. It is partial, as expected – you cannot decrement zero. This is a bit unfortunate but necessary. Having the BNonZero type family and using it as a constraint will solve that though.

The weirdly specific lemma4 helps us prove that we can take a number, increment it and then decrement it, and then get the same number back provided incrementing doesn’t change its width. This ends up matching perfectly with the width constraint generated by the CutTree, where the number that we increment is a number of “1”s smaller than the shape of the total heap (intuitively).

Using another constraint in CutTree with another proof here should also be possible. I found it hard to reason about why this constraint is necessary, but once I understood that it wasn’t too abnormal. The proof is easy though.

We don’t need to define a clause for SBEnd since Width SBEnd ~ Width (BInc SBEnd) does not hold.

Tying all of this together makes for a relatively easy readable popHeap:

Out of the different candidates, select the one with the minimal root (minimumBy is total on NonEmpty):

Pop that tree using popForest:

Helper to compare candidates by root:

In GHCi:

*Main> let heap = pushHeap 'a' $ pushHeap 'b' $ pushHeap 'c' $
                pushHeap 'd' $ pushHeap 'e' emptyHeap
*Main> :t heap
heap :: Heap ('B1 ('B0 ('B1 'BEnd))) Char
*Main> :t popHeap heap
popHeap heap :: (Char, Heap ('B0 ('B0 ('B1 'BEnd))) Char)
*Main> fst $ popHeap heap
*Main> snd $ popHeap heap
(no tree of order 0)
(no tree of order 1)
(tree of order 2)

Beautiful! Our final interface to deal with the heap looks like this:


I would like to thank Alex Lang for many discussions about this and for proofreading, Akio Takano and Fumiaki Kinoshita for some whiteboarding, and Titouan Vervack and Becki Lee for many additional corrections.

I am by no means an expert in dependent types so while GHC can guarantee that my logic is sound, I cannot guarantee that my code is the most elegant or that my explanations are waterproof. In particular, I am a bit worried about the fact that binary numbers do not have unique representations – even though it does seem to make the code a bit simpler. If you have any ideas for improvements, however, feel free to reach out!

Update: Lars Brünjes contacted me and showed me a similar implementation he did for leftist heaps. You can see it in this repository. He uses a similar but unique representation of binary numbers, along the lines of:

I think is actually more elegant than the representation I used. The only disadvantage is that is a bit less concise (which is somewhat relevant for a blogpost), requiring two functions and two datatypes for most cases (e.g. a Forest k Binary and a PForest k Positive, with mergeForests and mergePForests, and so on). But if you wanted to use this idea in a real implementation, I encourage you to check that out.


Appendix 1: runtime cost of this approach

Since we represent the proofs at runtime, we incur an overhead in two ways:

  • Carrying around and allocating the singletons;
  • Evaluating the lemmas to the QED constructor.

It should be possible to remove these at runtime once the code has been typechecked, possibly using some sort of GHC core or source plugin (or CPP in a darker universe).

Another existing issue is that the tree of the spine is never “cleaned up”. We never remove trailing F0 constructors. This means that if you fill a heap of eight elements and remove all of them again, you will end up with a heap with zero elements that has the shape 'B0 ('B0 ('B0 ('B0 'BEnd))) rather than B0 'BEnd. However, this sufficed for my use case. It should be possible to add and prove a clean-up step, but it’s a bit outside the scope of this blogpost.

Appendix 2: “pretty”-printing of heaps

Appendix 3: left-to-right increment

Increment gets tricky mainly because we need some way to communicate the carry back in a right-to-left direction. We can do this with a type-level Either and some utility functions. It’s not too far from what we would write on a term-level, but again, a bit more clunky. We avoid this kind of clunkiness since having significantly more code obviously requires significantly more proving.

<section class="footnotes">
  1. For work, I recently put together an interpreter for a lambda calculus that was way faster than I expected it to be – around 30 times as fast. I suspected this meant that something was broken, so in order to convince myself of its correctness, I wrote a well-typed version of it in the style of Francesco’s well-typed suspension calculus blogpost. It used a standard length-indexed list which had the unfortunate side effect of pushing me into O(n) territory for random access. I started looking for an asymptotically faster way to do this, which is how I ended up looking at heaps. In this blogpost, I am using the binomial heap as a priority queue rather than a bastardized random access skip list since that is what readers are presumably more familiar with.

  2. For reasons that will become clear later on, the binary numbers that pop up on the type level should be read right-to-left. A palindrome was chosen as example here to avoid having to explain that at this point.

  3. This type and related utilities are found in Data.Type.Equality, but redefined here for educational purposes.

  4. The datatype in Data.Type.Equality allows equality between heterogeneous kinds as well, but we don’t need that here. This saves us from having to toggle on the “scary” {-# LANGUAGE TypeInType #-}.

  5. I’m not sure if it is actually impossible to use this simpler type, but I did not succeed in finding a proof that uses this simpler type.


by Jasper Van der Jeugt at September 04, 2018 12:00 AM

September 03, 2018

Ken T Takusagawa

[fanlevif] CBC encryption and decryption

We present, for pedagogical purposes, a polymorphic implementation in Haskell of the Cipher Block Chaining mode of operation for a block cipher.

Encryption is an unfold, reflecting the fact that CBC encryption is inherently serial: each block depends on all the previous blocks.

{-# LANGUAGE ScopedTypeVariables #-}
encrypt_cbc :: forall block . (block -> block) -> (block -> block -> block) -> block -> [block] -> [block];
encrypt_cbc cipher combine initialization_vector input_plaintext = let {
    one_encrypt :: (block,[block]) -> Maybe (block,(block,[block]));
    one_encrypt (previous_ciphertext, plaintext) = case plaintext of {
      [] -> Nothing;
      ( (p::block) : (rest::[block]) ) -> let {
        ciphertext :: block;
        ciphertext = cipher $ combine p previous_ciphertext;
      } in Just (ciphertext,(ciphertext,rest));
    }} in unfoldr one_encrypt (initialization_vector, input_plaintext);

Decryption is a zip, illustrating that decryption is parallelizable.

decrypt_cbc :: forall block . (block -> block) -> (block -> block -> block) -> block -> [block] -> [block];
decrypt_cbc decipher uncombine initialization_vector ciphertext = let {
    one_decrypt :: block -> block -> block;
    one_decrypt previous_ciphertext this_ciphertext = uncombine (decipher this_ciphertext) previous_ciphertext;
  } in zipWith one_decrypt (initialization_vector:ciphertext) ciphertext;

The cipher and decipher function arguments are the underlying block cipher operating in the encryption and decryption directions respectively.

The combine and uncombine binary operations take two blocks and combine them into one.  They are both typically implemented as XOR.  It also seems possible to use addition and subtraction modulo N, but I have never seen that explored.

by Ken ( at September 03, 2018 04:17 PM

September 02, 2018

ERDI Gergo

CλaSH video: VGA signal generator

I decided to start with the video signal generation so that I have something fun to look at, and also so that I can connect other components to it at later steps.

VGA signal primer

VGA is a very simple video signal format with separate digital lines for vertical and horizontal synchronization, and three analog lines for the red, green and blue color channels. This is so much simpler than TV signals like PAL or NTSC that were designed to be backwards-compatible with early black and white TV formats and that only support a single row count, it has some quite low-bandwidth standard modes (with pixel clocks at just tens of MHz), and the whole world is filled with displays that support VGA; put it together, and it is ideal for hobbyist computer projects.

The basic mode of operation is you do a bit of a song and dance on the vertical and horizontal sync lines, and then just keep a counter of where you are so you know what color to put on the red/green/blue lines. The clock speed one should use for this counter is intimately tied to the sync pattern, and this is where VGA timing databases come into play.

I'm going to skip the tale of the electron beam scanning the CRT in þe olde times because every page describing VGA has it; for our purposes here it is enough to just regard it as an abstract serial protocol.

CHIP-8 video

The CHIP-8 has 1-bit graphics with 64⨯32 resolution. This is not a typo, there is no unit or scaling factor missing: it really is only 64 (square) pixels across ands 32 pixels down. That is not a lot of pixels; to give you an idea, here is a full-screen dump rendered with no scaling:

To make it full-screen, we need to scale it up. The easiest way I could think of was to scale it by a factor of a power of 2; that way, we can easily convert the screen-space X/Y coordinates to computer-space by just dropping the lowest bits. From 64⨯32, we could scale it for example by 8 to get 512⨯256, or by 16 for 1024⨯512. Of course, given the awkward 2:1 aspect ratio of the CHIP-8, we can't hope to avoid borders altogether: if we look at some lists of VGA modes, these two look promising: the larger one would fit on 1024⨯768 with no vertical bordering, and the other one is a close contender with a small border in 640⨯480, requiring only a 25MHz pixel clock.

A low pixel clock frequency is useful for this project because I'm still learning the ropes with CλaSH, so getting just something to work is an achievement on its own; getting something to work efficiently, at a high clock rate, or using two separate clock domains for the video generator and the CPU would both be more advanced topics for a later version. So for this project, I'm going to go with 640⨯480: divided by 8, this gives us a screen layout that looks like this:

Signal generation from CλaSH

So out of all the 640⨯480 modes, we're going to use 640⨯480@60Hz for this, since the CHIP-8 also has a 60 Hz timer which we'll get for free this way. Let's write down the timing parameters; we'll need a 10-bit counter for both X and Y since the porches push the vertical size to 524.

data VGATiming n = VGATiming
    { visibleSize, pre, syncPulse, post :: Unsigned n

data VGATimings w h = VGATimings
    { vgaHorizTiming :: VGATiming w
    , vgaVertTiming :: VGATiming h

-- | VGA 640*480@60Hz, 25.175 MHz pixel clock
vga640x480at60 :: VGATimings 10 10
vga640x480at60 = VGATimings
    { vgaHorizTiming = VGATiming 640 16 96 48
    , vgaVertTiming  = VGATiming 480 11  2 31

The output of the signal generator is the vertical and horizontal sync lines and the X/Y coordinates of the pixel being drawn; the idea being that there would be some other circuit that is responsible for ensuring the correct color data is put out for that coordinate. I've also bundled two extra lines to signal the start of a line/frame: the frame one will be used for the 60Hz timer, and the line one can be useful for implementing other machines later on: for example, on the Commodore 64, the video chip can be configured to interrupt the processor at some given raster line.

data VGADriver dom w h = VGADriver
    { vgaVSync :: Signal dom Bit
    , vgaHSync :: Signal dom Bit
    , vgaStartFrame :: Signal dom Bool
    , vgaStartLine :: Signal dom Bool
    , vgaX :: Signal dom (Maybe (Unsigned w))
    , vgaY :: Signal dom (Maybe (Unsigned h))

For the actual driver, a horizontal counter simply counts up to the total horizontal size, and a vertical counter is incremented every time the horizontal one is reset. Everything else can be easily derived from these two counters with just pure functions:

    :: (HiddenClockReset dom gated synchronous, KnownNat w, KnownNat h)
    => VGATimings w h
    -> VGADriver dom w h
vgaDriver VGATimings{..} = VGADriver{..}
    vgaVSync = activeLow $ pure vSyncStart .<=. vCount .&&. vCount .<. pure vSyncEnd
    vgaHSync = activeLow $ pure hSyncStart .<=. hCount .&&. hCount .<. pure hSyncEnd
    vgaStartLine = hCount .==. pure hSyncStart
    vgaStartFrame = vgaStartLine .&&. vCount .==. pure vSyncStart
    vgaX = enable <$> (hCount .<. pure hSize) <*> hCount
    vgaY = enable <$> (vCount .<. pure vSize) <*> vCount

    endLine = hCount .==. pure hMax
    endFrame = vCount .==. pure vMax
    hCount = register 0 $ mux endLine 0 (hCount + 1)
    vCount = regEn 0 endLine $ mux endFrame 0 (vCount + 1)

    VGATiming hSize hPre hSync hPost = vgaHorizTiming
    hSyncStart = hSize + hPre
    hSyncEnd = hSyncStart + hSync
    hMax = sum [hSize, hPre, hSync, hPost] - 1

    VGATiming vSize vPre vSync vPost = vgaVertTiming
    vSyncStart = vSize + vPre
    vSyncEnd = vSyncStart + vSync
    vMax = sum [vSize, vPre, vSync, vPost] - 1        

I hooked it up to a small test circuit, connected it to a small CCTV display I had lying around before I unpacked the bigger old VGA screen, and... nothing. This was frustrating because in the CλaSH simulator the sync timings looked right. Then Joe had this simple but brilliant idea to just blink an LED at 1 Hz using the pixel clock, and see if that is correct -- this immediately uncovered that I was using the wrong clock manager settings, and instead of a pixel clock of 25.125 MHz, it was running at 40 MHz. No wonder the signal made no sense to the poor screen... With that out of the way, I finally saw the test pattern:

And so with a bit of bit truncation, I now have a checkers pattern displayed at the CHIP-8 resolution; and I've even ended up bringing out the larger screen:

The full code in progress is on GitHub; in particular, the version that generates the checkerboard pattern is in this code.

September 02, 2018 12:24 PM

September 01, 2018

Sean Seefried

Haskell development with Nix

The Nix package manager home page has this to say:

Nix is a powerful package manager for Linux and other Unix systems that makes package management reliable and reproducible. It provides atomic upgrades and rollbacks, side-by-side installation of multiple versions of a package, multi-user package management and easy setup of build environments.

I like Nix because it allows me to:

  • create reproducible development environments that do not interfere with other software installed on my computer
  • specify the entire stack of software dependencies that my project has, including system libraries and build tools, using a single language: Nix expressions.

Nix has the advantage that it doesn’t just tame Haskell’s dependency hell but also that of the entire system. Unfortunately, there is quite a lot to learn about Nix before one can use it effectively. Fortunately, Gabriel Gonzales has written a great guide on how to use Nix for Haskell development. I’ve also included a link to an addendum Gabriel is going to add to the original guide.

  • Nix and Haskell in production. This is a detailed guide on how to use Cabal + Nix to create completely reproducible builds.

  • How to fetch Nixpkgs with an empty NIX PATH. A short addendum on how to remove impure references to NIX_PATH, this making your builds even more reproducible.

  • NixOS in production. This post is not Haskell specific but shows you how to build a source free Nix binary closure on a build machine and then deploy it to a production machine. This is fantastic workflow for space/computing-limited production machines e.g. t2.micro instances on Amazon AWS.

by Sean Seefried at September 01, 2018 12:00 AM

August 31, 2018

Chris Smith 2

Questions for reflection on computational thinking

Among other things, I am starting a group at work to discuss pedagogy and teaching of STEM subjects. At our meeting next week, we are discussing Shuchi Grover and Roy Pea’s survey article on computational thinking from 2013. I put together a list of reflection questions and prompts to get our conversation started. I’m pretty happy with the list, and thought I’d share it.


Without further adieu, my list of reflection prompts:

  • Since Jeanette Wing’s 2006 introduction, the phrase “computational thinking” has resurrected the notion that within computer programming lies a universally applicable skill. Since then, politicians and other social leaders have jumped on the bandwagon of learning to code. Why has computational thinking become such a rallying cry for CS education? What are its implications? What are people looking for that they didn’t find in “programming”, “coding”, or “computer science”?
  • Jeanette Wing defines computational thinking as the thought processes involved in formulating problems and their solutions so that the solutions are represented in a form that can be carried out by an information processing agent. There is disagreement on this definition. What should computational thinking mean? How is it different from programming? From computer science? How is it related to abstraction, decomposition, algorithmic thinking, mathematical thinking, logical thinking, computational modeling, or digital literacy?
  • Computational thinking is described as a universal skill set that is applicable to everyone, and not just computer scientists. Is this fair? If the current generation of computational thinking curriculum looks a lot like computer science, how are we living up to that standard? If the skill set is so universal, where are the examples from before computer science existed as a topic of study?
  • The skills needed to both use and create computational artifacts are changing rapidly. How does this impact the meaning of computational thinking? Suppose that, in a few decades, most computer software will be created by non-specialists using informal language and NLP; what does that mean for computational thinking?
  • The authors compare computational thinking with computational literacy, which includes social concerns and computation as a medium for math, science, or art. What can we say about the relationship between the two? What, if anything, distinguishes these ideas?
  • The elements of computational thinking are listed as abstraction, information processing, symbolic representations, algorithmic flow control, decomposition, procedural (iterative, recursive, and parallel) thinking, conditional logic, efficiency and performance, and debugging. How do these topics really fit together? How would we see them differently if computer science were not involved?
  • There are hundreds of very different competing programming languages and tools vying for using in computing classes. How do we identify a universal body of knowledge in such an environment? What does this do to the possibility of transfer learning? Can we assess computational thinking in the absence of assumptions about languages and tools?
  • The authors point out that Scratch, the most popular K-8 programming language, is missing procedural abstraction, even though abstraction is “indisputably” the cornerstone of computational thinking. What other important ideas are missing from common tools? What can we do about it?
  • The authors claim that few efforts in computational thinking take into account socio-cultural and situated learning (i.e., how we learn through apprenticeships and relationships with others), distributed and embodied cognition (i.e., how cognitive processes take place across the whole body and through communication between multiple people), or activity, interaction, and discourse analyses (i.e., study of the way communication happens in the classroom). How would these perspectives change our view of computational thinking?

by Chris Smith at August 31, 2018 03:42 AM

August 30, 2018

Neil Mitchell

Licensing my Haskell packages

Summary: I plan to license all future packages under the "BSD-3-Clause OR Apache-2.0" license.

A few weeks ago I calculated that the distribution of Haskell library licenses is approximately:

  • BSD-3-Clause: 67%
  • MIT: 20%
  • Apache-2.0: < 2%

In contrast, Wikipedia suggests for most open-source libraries the Apache-2.0 license beats BSD-3-Clause, and it is the permissive choice of FSF/Google etc. I was curious why it was so rare in the Haskell world, so asked on Twitter. The resulting thread got my thinking about license choices, which changed my view of what license I'd like to use. In this post I'm going to go through my reasoning.

The license I want to use

What I want to say is that anyone is free to use my code for any purpose. If they make changes to my code which would be of general benefit and have a chance of being accepted upstream, they should post those changes publicly. I give my work away freely, and want the version I'm giving away to be the best possible version for everyone. No license matches this intent, none force you to share code that you improve but only use internally, and the legal definition of "general benefit" can only be arbitrated by me. As a result, I'd like people to follow those principles, but chose to release my code with far fewer restrictions, in the hope people will do the right thing and share improvements anyway.

The license I use

When I first started releasing code (around 2004) I originally licensed my code as GPL-2.0, because that was a protective open-source license and I was still dipping my toes in the open source pond. By 2007 I was releasing new libraries as BSD-3-Clause, since that was what everyone in the Haskell community was using and seemed to provide the benefits I wanted (people sent me patches without being legally compelled to, just for the good of the code, which I prefer). It took until 2012 for me to switch my earliest libraries to BSD-3-Clause - one to avoid annoyance at work and another at the request of a commercial company who were linking it to proprietary pieces, and then went on to contribute extensively for the benefit of the project. Currently, all my projects are BSD3 licensed.

The license I will use

But what license should I be using? These questions prompted me to hunt around and I came to the conclusion that the right license for me is:

BSD-3-Clause OR Apache-2.0

Concretely, I plan to license all my future libraries under both the BSD-3-Clause and Apache-2.0 licenses, but a user is able to use it under either. My reasoning is as follows:

Why BSD-3-Clause over MIT?

I debated BSD-3-Clause vs MIT for a long time. They both offer substantially the same freedoms, but BSD-3-Clause also requires you can't use my name to endorse things you build - which seems reasonable. I like MIT because it's less ambiguous as a name, and it makes explicit freedoms that are implicit in the BSD-3-Clause license. The fact that BSD-3-Clause is more commonly used for Haskell libraries and my existing libraries is a point in it's favour. In the end, I picked BSD-3-Clause.

Why Apache-2.0?

The Apache-2.0 license offers a patent grant - I'm promising that if you use my code I'm not going to sue you using my patents. If I gave you code and then later sued you for using it, that would be mean. More importantly (from my side at least) it ensures everyone contributing to my library is following the same basic "don't be mean" principle, so I can continue to use my code free from patent concerns.

Why both?

The OR in the license means that I (and all contributors to my libraries) license all the code under BSD-3-Clause, and entirely separately also license all the code under Apache-2.0. Users are free to use the library under either of the available licenses, making it a user-centric OR. The Apache-2.0 license is incompatible with the GPL-2.0 and LGPL-2.1-only licenses, meaning any library building on my code plus the GTK bindings would be in a license quandary. By licensing under both most users can use Apache-2.0 (it gives you patent protection, so it's in your best interest), and those that would have problems otherwise can stick with BSD-3-Clause.

Next steps

Licensing is a slightly sensitive topic, so I'm declaring my intent, and waiting for feedback. Hopefully this change is positive for everyone, but anyone with concerns should let me know. As to technical detail, Cabal 2.2 supports SPDX license expressions, which is the syntax I've been using throughout this post.

by Neil Mitchell ( at August 30, 2018 08:12 PM

August 29, 2018

ERDI Gergo

RetroChallenge 2018: CHIP-8 in CλaSH

The 2018 September run of the RetroChallenge starts in a couple of days, and I've been playing around with CλaSH for the last couple of weeks, so I thought I'd aim for the stars and try to make a CHIP-8 computer.

What is CλaSH?

CλaSH is, basically, an alternative GHC backend that emits a hardware description from Haskell code. In the more established Lava family of HDLs, like Kansas Lava that I've used earlier, you write Haskell code which uses the Lava eDSL to describe your hardware, then you run said Haskell code and it emits VHDL or Verilog. With CλaSH, you write Haskell code which you then compile to VHDL/Verilog. This allows you to e.g. use algebraic data types with pattern matching, and it will compile to just the right bit width.

Here is an example of CλaSH code by yours truly: the receiver half of a UART.

What is CHIP-8?

There is no shortage of information about the CHIP-8 on the Internet; the Wikipedia page is a good starting point. But basically, it is a VM spec from the mid-'70s; originally meant to be run on home computers built around the the 8-bit COSMAC CPU. There are tens, TENS! of games written for it.

CHIP-8 has 35 opcodes, a one-note beeper, 64x32 1-bit framebuffer graphics that is accessible only via an XOR bit blitting operation (somewhat misleadingly called "sprites"), and 16-key input in a 4x4 keypad layout.

I have some experience with CHIP-8 already: it was the first computer I've implemented on an FPGA (in Kansas Lava back then); I've used it as the test application when I was working on Rust for AVR; I even have a super-barebones 2048 game for the platform.

In summary, CHIP-8 is so simple that it is a very good starting point when getting accustomed with any new way of making computers, be it emulators, electronics boards or FPGAs.

My plan

I want to build an FPGA implementation of CHIP-8 on the Papilio Pro board I have laying around. It is based on the Xilinx Spartan-6 FPGA, a chip so outdated that the last version of Xilinx's development tools that supports it is ISE 14.7 from 2013; but it is what I have so it will have to do. I plan to work outside-in by first doing the IO bits: a VGA signal generator for the video output, and a PS/2 keyboard controller for the input. Afterwards, I will move to making a CPU which uses CHIP-8 as its machine language.

If I can finish all that with time remaining, there are some "stretch goals" as well: implementing sound output (the Papilio IO board has a 1-bit PWM DAC that I could use for that); interfacing with a real 4x4 keypad instead of a full PS/2 keyboard; and implementing some way of switching ROM images without rebuilding the FPGA bitfile.

August 29, 2018 07:18 PM

August 28, 2018

Michael Snoyman

Kids Coding, Part 3

Eliezer asked me a few questions about programming yesterday, and I ended up demonstrating a little bit about pattern matching. He then had a lot of fun showing a friend of his how programming works. I may end up giving some lessons to some of my kids’ friends in the next few weeks, which should be interesting.

Kids IDE

I’ve started using a minimalistic “kids IDE” I threw together for teaching the kids. It’s helped us not worry about details like filepaths, saving, and running in a terminal. It’s not a great tool, but good enough. I’ve included links to better tools in the README, though this fits my needs for the moment.

I’ve set up AppVeyor to upload Windows executables to S3:

You’ll also need to install Stack.

Pattern matching strings

This morning, Eliezer and Gavriella both had their next “official” lesson. I started over with that pattern matching demonstration. First, I showed them this example:

main = do
  sayNickname "Eliezer"
  sayNickname "Gavriella"
  sayNickname "Yakov"
  sayNickname "Lavi"

sayNickname realname =
  putStrLn (nickname realname)

nickname "Eliezer" = "Elie Belly"
nickname "Gavriella" = "Galla Galla"
nickname "Lavi" = "Fat baby"
nickname "Yakov" = "Koko"

They got the basic idea of this. (And I ended up introducing putStrLn around here as well, which they were fine with.) However, as I had them typing out some of this on their own, they ended up with a lot of trouble around capitalization, which was a good introduction to Haskell’s rules around that. We’ll see how in a bit.

Pattern matching ints

After establishing that we could pattern match on strings, I switched the code to this:

main = do
  sayNickname 1
  sayNickname 2
  sayNickname 3
  sayNickname 4

sayNickname realname =
  putStrLn (nickname realname)

nickname 1 = "Elie Belly"
nickname "Gavriella" = "Galla Galla"
nickname "Lavi" = "Fat baby"
nickname "Yakov" = "Koko"

And gave them the challenge to rewrite nickname so that the code worked, which wasn’t too much of a problem. The misordering of Lavi and Yakov between main and nickname did cause some confusion, and then helped them understand better how pattern matching works. (I didn’t intentionally put that in, it somehow slipped in while the kids were working on rewriting things.)

Type signatures

I asked them what the type of nickname was, and they said function (yay!). And then explained to them that you can tell Haskell explicitly what a thing’s type is, and typed this in:

nickname :: ? -> ?

The funny syntax didn’t give them too much trouble, and then we got to fill in the question marks. I asked them what the output of the function was, pointing at the string. I was surprised: they said the type was “nickname” or “name.” They accepted that it was a string, but they didn’t like that. (New theory: humans’ brains are naturally strongly typed.)

I then asked what the input was, and they said “number.” I hadn’t taught them about Int yet, and didn’t know about integers from math, so I told them that integer is a kind of number, and that in Haskell we call it Int. Filling in the type signature was fine.

I pointed out that some things (like Int and String) were upper case, and some were lower case. I pointed out that concrete things that “actually exist” (maybe not the best terminology) are capitalized. We know what an Int is, for example. Variables are things we don’t know yet, and those are lowercase. And finally, you can put whatever you want inside a string, but it has to match exactly. That seemed to click fairly well for them.


I pointed out that referring to the kids as numbers isn’t good. (He responded that I actually do call them 1, 2, 4, and 8 sometimes…) Anyway, I said that a better type for the nickname function would be to take a Child as input. He said “can we say Child = Int,” which was a great insight. I showed up that, yes, we can do type Child = Int, but that there’s a better way.

I introduced the idea that data defines a new datatype, and then showed them:

data Child
  = Eliezer
  | Gavriella
  | Yakov
  | Lavi

Gavriella asked “what are those lines?” and I explained they mean “or.” Therefore, a child is Eliezer, or Gavriella, or Yakov, or Lavi. They got this.

Exercise: fix the following program, uncommenting the lines in main.

main = do
  sayNickname Eliezer
  --sayNickname Gavriella
  --sayNickname Yakov
  --sayNickname Lavi

sayNickname realname =
  putStrLn (nickname realname)

data Child
  = Eliezer
  | Gavriella
  | Yakov
  | Lavi

nickname :: Child -> String
nickname Eliezer = "Elie Belly"
--nickname 2 = "Galla Galla"
--nickname 3 = "Fat baby"
--nickname 4 = "Koko"

Some caveats where they got stuck:

  • Using lower case lavi instead of Lavi. I got to explain how pattern matching a variable works, and “wildcards.” They got this, though still needed to be coached on it.
  • I’d been on the fence about including syntax highlighting in the kids IDE, but it turns out that the different colors of Eliezer and lavi helped Gavriella realize her mistake. Syntax highlighting is a good thing here.
  • I did ultimately have to give her a hint: “Lavi is BIG but you made him small.”

I started to try and define data Person = Kid Child | Adult Parent, but realized quickly it was too complicated for now and aborted.


I did not teach this well, the children did not get the concepts correctly. However, they did understand the code as I wrote it.

main = print total

total = addUp 10

addUp 10 = 10 + addUp 9
addUp 9 = 9 + addUp 8
addUp 8 = 8 + addUp 7
addUp 7 = 7 + addUp 6
addUp 6 = 6 + addUp 5
addUp 5 = 5 + addUp 4
addUp 4 = 4 + addUp 3
addUp 3 = 3 + addUp 2
addUp 2 = 2 + addUp 1
addUp 1 = 1 + addUp 0
addUp 0 = 0

They really hated how repetitive typing that out was, which was exactly the goal. It was easy to convince them that this was stupid. I then changed total to addUp 11 and it broke. They understood why, and so we started working on something better.

main = print total

total = addUp 10

addUp 0 = 0
addUp x = x + addUp (x - 1)

I stepped through the executable of this, complete with pattern matching. Doing this a few times in the future is probably a good idea.

Eliezer asked what happens if we remove the addUp 0 = 0 case. We discussed it, he said it wouldn’t work, and said it would “keep going.” I told him it was called an infinite loop, and we got a stack overflow. Good insight.

Gavriella asked how long it took me to learn this stuff. I told her 12 years; after all, I only started learning Haskell in my twenties. It made them feel pretty good that they were learning this stuff earlier than I did.

I gave them an exercise to implement multTo instead of addUp. They didn’t understand this, or recursion, and I had to help them through the whole thing. Mea culpa completely.

Gavriella asked for another exercise:

main = do
  sayAge Eliezer
  sayAge Gavriella
  sayAge Yakov
  sayAge Lavi

data Child = ...

sayAge child = print (age child)

age ? = ?

She typed something like:

age ? = ?

I showed her that she needed:

age Eliezer = 10

And then she got the rest just fine, though with a few capitalization mistakes.

August 28, 2018 12:12 PM

August 26, 2018

Sean Seefried

Gabriel Gonzalez's notes on Stack, Cabal and Nixpkgs

Gabriel Gonzalez recently tweeted about the history of Stack which I thought were worth recording for posterity here.

When stack was first introduced the killer features were:

  • declarative dependency specification (i.e. stack.yaml)
  • isolated project builds

cabal new-build was when cabal-install began to support the above two features (cabal.project being analogous to stack.yaml)Gabriel Gonzalez added,

Also, once Nixpkgs added Haskell support that also offered a third approach for isolated builds with declarative dependencies

The combination of cabal new-build + Nix’s Haskell integration is what caused stack to lose its competitive advantage on that front

A small advantage of stack was built-in support for using Stackage resolvers to select a working package set en masse

However, you could reuse that within a Cabal project by saving to your project

This compatibility was an intentional goal of @snoyberg

Stackage was probably the single best outcome of the stack project because it benefited all three of stack, cabal-install and Nixpkgs (which closely tracks the latest Stackage resolver, too)

Before Stackage getting a working build of, say, lens or yesod was a nightmare

by Sean Seefried at August 26, 2018 12:00 AM

August 24, 2018

Michael Snoyman

Kids Coding, Part 2

I didn’t expect to be writing the second blog post only 12 hours after the first. But when the kids came downstairs this morning, they were unbelievably excited to do more coding. So here we are!

Eliezer and I discussed using more visual tooling (like Code World or Jupiter) for doing the learning, and I showed him what it looked like. It seems he’s got quite a bit of his father’s preferences, and wanted to stick to plain text for now. We’ll probably circle back to that kind of stuff after they get the basics. It will also give me a chance to get more comfortable with those offerings. (Thanks all for the recommendations on Reddit.)

One final note. I’m extremely happy that we went with Haskell after today’s lessons. Concepts like variable replacement which are natural in Haskell were great teaching tools. I obviously don’t have enough data to prove this yet, but I’m more strongly believing the long-held theory that Haskell is easier for brand new programmers than those familiar with imperative programming.

Coding environment

I moved us over to VSCode, with only syntax highlighting set up. Previously, I’d used my emacs setup with intero, and the red squiggles let them know they’d messed up too early. I used the docked terminal at the bottom, taught them to save, and showed them to press “up enter” in the terminal to rerun stack runghc foo.hs. Perhaps not the ideal setup, but definitely good enough.


  • Started with main = print "hello world"

  • Explained that main is defined as other thing

  • Expanded to

    main = print hello
    hello = "hello world"
  • They were not surprised at all that this just worked

  • hello = 5 + 6, both got confused about whether it would print 11 or 5 + 6, reminded them of strings

  • How about hello = five + six?

  • Both agreed it wouldn’t work. I typed in five = 5, and ask if it would work. They agree that it won’t, and I show them GHC’s error message.

  • “Variable not in scope.” Computers talk funny, just means “I don’t know what six is.” Instilled that they’ll eventually be able to understand those massive error messages, but not to worry about it yet.

  • Exercise: five the program. Both of them got it no problem.

  • I was about to demonstrate five = 4, and then Gavriella figured it out for herself! They understand that, again, “computers are dumb.” Names are just there for our benefit, computer doesn’t care.

  • We do five + six * seven (with appropriate variable definitions), they get that answer, and then (five + six) * seven, they get that too.

  • Now I define fivePlusSix = five + six, and change to hello = (fivePlusSix) * seven (direct replacement). They’re fine with this. Yay replacement.

  • Point out that the parens are now unneeded and remove them. Again, no problem.

  • Parens just tell us “do this first”, not needed for one thing


  • Type in hello = 5 * 2, no problem.
  • How about hello = double 5? They figured out that it won’t work cuz computers be dumb
  • How do we define double? Eliezer guessed * 2, which is really close, but we’re not ready for sections, and I want to teach standard function stuff first.
  • Show them double x = x * 2, that x is a variable, and a function argument. They don’t know algebra yet, but got this fairly quickly.
  • hello = addFive (double 5)
  • Exercise: fix the program!
  • Eliezer did this one, defined addFive, and he started spacing things like me without any prompting. Cool.
  • Exercise: Gavriella, write minusTwo
  • Got it first time, ran it, answer’s still 15. Why?
  • We stepped through how the program is evaluated. “Let’s look at main, oh, I got an action! I need to print hello, what’s that? Oh, it’s addFive .... That means I gotta figure out .... What’s double 5? Oh, it’s 10. So that’s 15. And minusTwo was never (terminology break, I taught them that it wasn’t used).
  • Exercise: make the answer 13. They’re struggling to figure out where to use it. They’re stumped, time to jump in.
  • Stepped through all of the types involved. “double is a function that takes a number and gives you back a number” blah blah blah
  • minusTwo is also a function. It needs one number. Let’s try something wrong: minusTwo addFive (double 5). That’s wrong: it means “apply minusTwo to two values” but it only needs one number.
  • We need to turn addFive (double 5) into one thing. Ask them what we can use. They played with idea of double quotes, and then they figured out the parens! Awesome!
  • One final exercise: make the answer 16 with plus3
  • Eliezer did that typing, so I gave Gavriella one real final one: make the answer 6 with minusTen. She did it.
  • Realizing now: learning to type and navigate an editor is possibly harder for them than the coding itself


Gavriella kept playing with the code (wow, she’s focused on this). She decided she wanted to do division. That was more complicated, but she persevered. I taught her about div being a function that takes 2 arguments. She didn’t know anything about remainders, and was confused that div 15 2 worked at all. I taught her about divMod, and she did great.

August 24, 2018 06:04 AM

August 23, 2018

FP Complete

Haskell Development Workflows (4 ways)

One fantastic aspect of Haskell is that it offers various tools that allow different development workflows. In this blog post, we are going to focus on four different workflows to develop Haskell. These are techniques that the FP Complete team themselves use regularly and we wanted to share them with our readers.

by Roman Gonzalez ( at August 23, 2018 07:45 PM

Michael Snoyman

Kids Coding, Part 1

Kids Coding, Part 1

I’ve been wanting to give my eldest child (Eliezer, now 10) a chance to learn to code for about a year now, with little success. My second child (Gavriella, now 8) is also ready to start learning some coding. “Ready” to me means “they have a decent enough grasp of written English.” (Yakov, my 6 year old, also wants in, but unfortunately is going to have to wait a bit.)

I know how I learned to code: sitting at a DOS prompt since the age of 2 and reading a massive “Teach Yourself QBasic in 21 Days” book. For various reasons, this doesn’t seem to apply to the next generation. I’ve looked into other modern approaches, including the graphical programming environments. My kids enjoyed some of this, but this week told me that “it’s fun, but we’re not learning anything.”

Previously, I tried teaching Eliezer some Haskell by writing up lessons for him. Whether because of my writing or his interest, it didn’t work at the time. I decided to not go down this path again, and an hour ago sat down with Eliezer and Gavriella for their first lesson. I’m winging this completely, but here’s the approach I’m taking:

  • Keep the computer in front of me and show them things
  • See what confuses them, what interests them, and follow that
  • Give them one or two exercises at the end to reinforce the ideas
  • Keep the lessons short to keep their interest
  • Write up a blog post right afterward to record what happened, so I can learn for the future. After all, Yakov and Lavi (the baby, aka #YoungestHaskeller) need to learn too

This is the first of these summary blog posts. Caveats:

  • This will likely be incomplete
  • The material is very much geared towards my kids and probably won’t generalize
  • I have no idea how long or frequently I’ll be doing this.

I considered making these private notes for myself instead, but thought some others may be interested, so I’m posting publicly.

Without further ado!

Python portion

  • Previous experience with Haskell made Eliezer think it was too hard for him, so we decided to learn Python instead. He has some experience with that from Code Combat.
  • I started at the REPL and demonstrated printing strings. We made some typical jokes (yes, they were fart jokes) in text form, which made the kids giggle. Upshot: they learned about strings, and stayed interested.
  • I demonstrated that basic arithmetic works at the REPL.
  • I opened up a file and demonstrated you could write the same stuff from the REPL in a file.

This was the point where Eliezer commented that it looked a lot like the Haskell he’d seen. I’m not sure what it was, but somehow it clicked with him that whatever scared him off of Haskell previously wasn’t a real issue. We decided together to switch over to learning Haskell instead, which I’m quite happy about (more because I know the language better than anything else).

Haskell portion

  • Did the same print, string, and arithmetic stuff at the GHCi prompt and in a file
  • Asked them what 2 + 3 + 4 was, they got that
  • Asked them what 2 * 3 + 4 was, they got that too
  • Then 2 + 3 * 4, and I was surprised to find out that they knew about order of operations already. Yay school system.
  • They mentioned parentheses, and I told them that would be the same way to do things in programming, and showed them (2 + 3) * 4.
  • Next we talked about print 2 + 3. They had some inkling that this wouldn’t work, but couldn’t be sure of why.
  • I then branched into the next topics…


  • Types are really important in Haskell programming
  • Everything has a “type” (this raised confusion and interest, as expected)
  • Explained that 2 is a number, that’s its type (yes, it’s a bit of a lie, Int, and defaulting to Integer, and Num a => a, blah blah blah, this is fine for first lesson)
  • Is 2 a string? No. Is 2 a dog? (giggles) No. Is 2 a missile? (Eliezer got to explain that concept to Gavriella.)
  • print 2 + 3 because of order of operations (just like math) really is (print 2) + 3, what does that mean?
  • What’s the type of print? Confusion
  • We’ve discussed functions a lot before, so pointed out that print is a function that takes a number (here) and returns something else. What is that something else?
  • I introduce the idea of actions. Eliezer got this, Gavriella was more confused. So more humor to explain this.
  • fart = print "Sorry, I farted" lots of giggling. What is the type of fart? Is it a number? No. Is it a dog? No. It’s something that you do, or the computer does. That’s what an action is. (Gavriella translated some words into Hebrew at that point, and the ideas clicked. Got to remember: they’re learning both programming languages and how to learn things in English at the same time.)
  • OK, you can add two numbers together. Does it make sense to add an action and a number together? No. So print 2 + 3 doesn’t make sense!
  • Now, what’s print? It’s a function that takes a number and gives you back an action. And that’s how you use functions in Haskell: just stick things next to each other.
  • They figured out really quickly at this point that they needed parens to fix the program, and ended up with print (2 + 3).

Today’s excercise

Note: Somewhere above, I briefly showed them that you could use do notation and put multiple print calls on different lines. No confusion from this at all. Relevant to the exercise:

  • Write a program that print “2 + 3 + 4”, first as a string, and then as a number.

I started them off with:

main = do
  | -- prompt started here

They both got the answer, one of them with a bit of help:

main = do
  print "2+3+4"
  print (2+3+4)

Things to note:

  • They naturally left off the spaces, despite me using the spaces throughout my typing.
  • They never questioned why print "Hello World" resulted in output that kept the double quotes. I’ll have to explain at some point about putStrLn, but that can come much, much later.

To figure out

  • How much shell do I teach them?
  • What editor will I set them up with?

Next lesson

I don’t want to plan out what to cover next time too intricately, because I want to experiment with them and bounce things around. I’m thinking about showing them how to create their own functions, maybe with lambda syntax, not sure.

August 23, 2018 05:55 PM

August 22, 2018

Philip Wadler

I'll be speaking at Lambdup 2018 in Prague

I'm looking forward to LambdUp, Prague, 13 September 2018. The programme is up!  My keynote will be the talk below.

Categories for the Working Hacker

The talk will explain why category theory is of interest for developers, taking examples from Java and Haskell, and referencing the new blockchain scripting languages Simplicity, Michelson, and Plutus. The principle of Propositions as Types describes a correspondence between, on the one hand, propositions and proofs in logic, and, on the other, types and programs in computing. And, on the third hand, we have category theory! Assuming only high school maths, the talk will explain how categories model three basic data types: products (logical and), sums (logical or), and functions (logical implication). And it explains why you already learned the most important stuff in high school.

by Philip Wadler ( at August 22, 2018 09:23 AM

August 20, 2018

Functional Jobs

Senior Software Engineer at Habito (Full-time)

Launched in 2016, Habito is transforming the mortgage industry through innovation and cutting-edge technology. We are the UK's free online mortgage broker backed by Atomico, Ribbit Capital and Mosaic Ventures, with angel investors including Transferwise CEO Taavet Hinrikus, Funding Circle’s founder Samir Desai and Yuri Milner, influential tech investor.

We are building a great brand synonymous with great customer service, ease and transparency. Simple, fast and honest, coupled with no fees, and no misinformation.

Our team is super-smart, ambitious, collaborative and friendly. We work hard, play hard and learn fast. This is a great opportunity to help steer, shape and mould our business.

Are you excited? Come and make your home at Habito!

The Role

We are looking for a full-time senior software engineer to join our team of full-stack engineers. Our core development values are strong, static typing and correctness, rigorous automation (in both testing and infrastructure) and everyone having a say.

The stack: On the back end we have an HTTP API written in Haskell (and built using stack) with PostgreSQL serving as the persistence layer. All of our data is event sourced thanks to PostgreSQL's great support for relational and blob data (e.g. jsonb) and some in-house libraries for handling ES/CQRS in a type-safe (or what we believe to be, as far as possible) manner. We lean heavily on GHC and its extensions and are also using some of the bigger power-to-weight-ratio players like lens and conduit.

The front end is written in PureScript on top of React and Redux (the bindings to which are our own but are both quite thin and based on the design of purescript-react and later pux).

Infrastructure wise we are sitting atop AWS and using Docker to manage containment. Jenkins manages continuous integration and deployment using a combination of AWS' Java SDK and some management code written in Scala

While we're a technology company first and foremost, initially we're trying to make the UK mortgage brokering market a better place. This means building and maintaining our website, but it also involves a slurry of behind-the-scenes work integrating lenders, insurers, conveyancers and all the other players in the space as we seek to introduce robust automation into what is currently so many manual processes. Slack bots, job queues, working around the fact that some actors don't have APIs -- you'll get to handle it all!

  • Work on our back-end software, written in Haskell and our front-end web applications, written in PureScript, HTML and CSS.
  • Help maintain and extend our infrastructure, hosted in AWS using technologies such as Docker and Kubernetes.
  • Use tools such as QuickCheck, Tasty and HUnit to produce reliable, well-tested software.
  • Work with our product and design teams to produce customer-centric experiences that excite and delight.

The Candidate

  • A passionate developer who cares about clean code and code quality; a full-stack polyglot
  • An experienced engineer familiar with automated testing and deployment
  • Experienced with functional programming and domain-driven design or simply interested and capable of learning on the job
  • Someone who cares about UX and UI; you have a fundamental aversion to ugliness in product

And Finally......

  • Competitive salary & share options
  • Career development, coaching & training
  • Free catered lunches, snacks & team bonding
  • Bupa Healthcare & Life Assurance
  • Contributory pension scheme
  • Bi-weekly massages with Urban Massage
  • Unlimited Holiday
  • Cycle to Work Scheme
  • Childcare Vouchers

Get information on how to apply for this position.

August 20, 2018 05:32 PM

August 18, 2018

Manuel M T Chakravarty

Learning Dutch

About six weeks ago, I moved from Sydney, Australia, to Utrecht in the Netherlands. Naturally, I started to learn Dutch.(*)

This post is mainly a summary of the things I found helpful so far. Apparently, the Netherlands is becoming an increasingly popular expat destination (cough Brexit cough), so this might be useful to some.

I quickly discovered that, even after living in an English-speaking country for many years, as a German native speaker, learning Dutch from the perspective of German is much easier than going via English (although, fluency in English obviously also helps). The first obstacle here is that it is a lot easier to find a good Dutch-English (electronic) dictionary than a Dutch-German one — for instance, my macOS and iOS devices already come with a Dutch-English dictionary that I can hook into the system look-up service with a click (or tap). The best Dutch-German dictionary that I could find so far is uitmuntend — thanks to Chris Eidhof for that very useful recommendation. Its creator also runs the super informative buurtaal forum.

As a starting point, I found the mondly app quite useful. It encourages a daily routine and quickly builds up a basic vocabulary and set of common expressions. It’s not cheap, but light years ahead of other iOS apps I checked out.

The next step for me was to make a habit out of trying to decipher Dutch websites and to refrain from clicking on the English language option. Whether that is the daily whether forecast, a news item, or the instructions for opening a bank account, it is a slow process, but there is no substitute for constant practice.

As a stepping stone, comic books (especially those for kids) are of course a great choice. (I would have never learnt Japanese without a huge stack of manga.)

To practice listening comprehsion and to get into more complex vocabulary and casual language, I really like to watch Zondag met Lubach — thanks to Gabi who got me to watch the show with her. The show is very entertaining, focused around one topic with recurring vocabluary, and you can get all past shows online. (On YouTube, to get started, you can also get English subtitles for many episodes and Dutch subtitles when you start to get the hang of it and to more easily look up words.)

And then, there is speaking. Unfortunately, I know of no better way than to repeatedly and publicly embarrass yourself to make progress here. People won’t understand you, you will say silly things, and you will fail to understand their response. Over and over again.

But it is all worth it for the moments of success. For instance, when you leave a shop for the first time without the Dutch shop assistant having felt the need to switch to English. Tot ziens!

(*) Everybody here switches to fluent English in a heartbeat when they realise that you don’t understand them, but to me, assuming residency in a country obliges me to learn the local language. (I have never been able to relate to that expat lifestyle, where you stay in that bubble of like-minded foreigners.)

August 18, 2018 11:18 PM

August 17, 2018

Gabriel Gonzalez

NixOS in production

<html lang="" xml:lang="" xmlns=""><head> <meta charset="utf-8"/> <meta content="pandoc" name="generator"/> <meta content="width=device-width, initial-scale=1.0, user-scalable=yes" name="viewport"/> <style type="text/css"> code{white-space: pre-wrap;} span.smallcaps{font-variant: small-caps;} span.underline{text-decoration: underline;} div.column{display: inline-block; vertical-align: top; width: 50%;} </style> <style type="text/css">a.sourceLine { display: inline-block; line-height: 1.25; } a.sourceLine { pointer-events: none; color: inherit; text-decoration: inherit; } a.sourceLine:empty { height: 1.2em; position: absolute; } .sourceCode { overflow: visible; } code.sourceCode { white-space: pre; position: relative; } div.sourceCode { margin: 1em 0; } pre.sourceCode { margin: 0; } @media screen { div.sourceCode { overflow: auto; } } @media print { code.sourceCode { white-space: pre-wrap; } a.sourceLine { text-indent: -1em; padding-left: 1em; } } pre.numberSource a.sourceLine { position: relative; } pre.numberSource a.sourceLine:empty { position: absolute; } pre.numberSource a.sourceLine::before { content: attr(data-line-number); position: absolute; left: -5em; text-align: right; vertical-align: baseline; border: none; pointer-events: all; -webkit-touch-callout: none; -webkit-user-select: none; -khtml-user-select: none; -moz-user-select: none; -ms-user-select: none; user-select: none; padding: 0 4px; width: 4em; color: #aaaaaa; } pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; } div.sourceCode { } @media screen { a.sourceLine::before { text-decoration: underline; } } code { color: #ff0000; font-weight: bold; } /* Alert */ code { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code { color: #7d9029; } /* Attribute */ code { color: #40a070; } /* BaseN */ code span.bu { } /* BuiltIn */ code { color: #007020; font-weight: bold; } /* ControlFlow */ code { color: #4070a0; } /* Char */ code { color: #880000; } /* Constant */ code { color: #60a0b0; font-style: italic; } /* Comment */ code { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code { color: #ba2121; font-style: italic; } /* Documentation */ code span.dt { color: #902000; } /* DataType */ code span.dv { color: #40a070; } /* DecVal */ code { color: #ff0000; font-weight: bold; } /* Error */ code span.ex { } /* Extension */ code span.fl { color: #40a070; } /* Float */ code span.fu { color: #06287e; } /* Function */ code { } /* Import */ code { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ code { color: #007020; font-weight: bold; } /* Keyword */ code span.op { color: #666666; } /* Operator */ code span.ot { color: #007020; } /* Other */ code span.pp { color: #bc7a00; } /* Preprocessor */ code { color: #4070a0; } /* SpecialChar */ code { color: #bb6688; } /* SpecialString */ code { color: #4070a0; } /* String */ code { color: #19177c; } /* Variable */ code span.vs { color: #4070a0; } /* VerbatimString */ code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ </style> </head><body>

This is a short post summarizing what I wish I had known when I first started using NixOS in production. Hopefully other people will find this helpful to avoid the pitfalls that I ran into.

The main issue with NixOS is that the manual recommends a workflow that is not suitable for deployment to production. Specifically, the manual encourages you to:

  • keep Nix source code on the destination machine
    • i.e. /etc/nixos/{hardware-,}configuration.nix
  • build on the destination machine
  • use Nix’s channel system to obtain nixpkgs code

This post will describe how you can instead:

  • build a source-free binary closure on a build machine
  • transfer and deploy that binary closure to a separate destination machine

This guide overlaps substantially with what the NixOps tool does and you can think of this guide as a way to transplant a limited subset of what NixOps does to work with other provisioning tools (such as Terraform).

You might also find this guide useful even when using NixOS as a desktop operating system for handling more exotic scenarios not covered by the nixos-rebuild command-line tool.

Building the closure

We’ll build up to the final solution by slowly changing the workflow recommended in the NixOS manual.

Suppose that you already have an /etc/nixos/configuration.nix file and you use nixos-rebuild switch to deploy your system. You can wean yourself off of nixos-rebuild by building the binary closure for the system yourself. In other words, you can reimplement the nixos-rebuild build command from scratch.

To do so, create the following file anywhere on your system:

Then run:

Congratulations, you’ve just built a binary closure for your system!

Deploying the closure

nix-build deposits a result symlink in the current directory pointing to the built binary closure. To deploy the current system to match the built closure, run:

Congratulations, you’ve just done the equivalent of nixos-rebuild switch!

As the above command suggests, the closure contains a ./bin/switch-to-configuration which understands a subset of the commands that the nixos-rebuild command does. In particular, the switch-to-configuration script accepts these commands:

$ ./result/bin/switch-to-configuration --help
Usage: ../../result/bin/switch-to-configuration [switch|boot|test]

switch: make the configuration the boot default and activate now
boot: make the configuration the boot default
test: activate the configuration, but don't make it the boot default
dry-activate: show what would be done if this configuration were activated

Adding a profile

The nixos-rebuild command actually does one more thing in addition to buiding the binary closure and deploying the system. The nixos-rebuild command also creates a symlink pointing to the current system configuration so that you can roll back to that configuration later. The symlink also acts like a garbage collection root, preventing the system from being garbage collected until you remove the symlink (either directly using rm or a higher-level utility such as nix-collect-garbage)

You can record the system configuration in the same way as nixos-rebuild using the nix-env command:

Querying system options

You can use the same nixos.nix file to query what options you’ve set for your system, just like the nixos-option utility. For example, if you want to compute the final value of the networking.firewall.allowedTCPPorts option then you run this command:

Pinning nixpkgs

Now that you’ve taken control of the build you can do fancier things like pin nixpkgs to a specific revision of nixpkgs so that you don’t need to use a channels or the NIX_PATH any longer:

In fact, this makes your build completely insensitive to the NIX_PATH, eliminating a potential source of non-determinism from the build.

Building remotely

Now that you’ve removed nixos-rebuild from the equation you can build the binary closure on a separate machine from the one that you deploy to. You can check your nixos.nix, configuration.nix and hardware-configuration.nix files into version control and nix-build the system on any machine that can check out your version controlled Nix configuration. All you have to do is change the import path to be a relative path to the configuration.nix file within the same repository instead of an absolute path:

Then all your build machine has to do is:

$ git clone "${OWNER}/${REPOSITORY}.git"
$ nix-build --attr system "./${REPOSITORY}/path/to/nixos.nix"

To deploy the built binary closure to another machine, use nix copy. If you have SSH access to the destination machine then you can nix copy directly to that machine:

If you do not have SSH access to the destination machine you can instead use nix-copy to create a binary archive file containing the binary closure of your system:

… and upload the binary archive located at /tmp/system to the destination machine using your upload method of choice. Then import the binary archive into the /nix/store on the destination machine using nix copy:

Once the binary closure is on the machine, you install the closure the same way as before:

… replacing /nix/store/... with the /nix/store path of your closure (since there is no result symlink on the destination machine).


That’s it! Now you should be able to store your NixOS configuration in version control, build a binary closure as part of continuous integration, and deploy that binary closure to a separate destination machine. You can also now pin your build to a specific revision of Nixpkgs so that your build is more deterministic.

I wanted to credit my teammate Parnell Springmeyer who taught me the ./result/bin/switch-to-configuration trick for deploying a NixOS system and who codified the trick into the nix-deploy command-line tool. I also wanted to credit Remy Goldschmidt who interned on our team over the previous summer and taught me how to reliably pin Nixpkgs.


by Gabriel Gonzalez ( at August 17, 2018 02:01 AM

August 15, 2018

Functional Jobs

Haskell : Backend & Web App Developer at Tripshot (Full-time)


I'm Wayne Lewis, Director of Engineering at Tripshot. Let me introduce you to our company and engineering team.

Tripshot was formed in 2014 with a mission to "optimize the commute". After observing the challenges large employers face getting their employees to and from work in crowded metro areas, we asked ourselves how we can help. With our backgrounds covering software product development, large-scale SAAS deployments, and selling and supporting large enterprises, we set upon our present course to deliver the best solution to this problem.

Coming off recent sales wins, we're excited to announce that we're expanding the engineering team.

But before diving into what we're looking for, its fair to explain what our team is all about. Tripshot engineering culture emphasizes:

  1. Meritocracy. Good ideas come from everywhere, and we encourage healthy discussion and debate on the team.

  2. Focus on the important stuff. No bikeshedding. Differentiate the business by tackling the difficult problems. Transportation is ripe with meaty problems in need of well engineered solutions.

  3. Ownership. Find people with the skills, experience, and maturity to own a problem but know when they need to ask for help - then let them loose. Tripshot is a fully distributed company, and since everyone works from home we rely on people who can own problems, and deliver solutions.

  4. Not all tech is created equal. We love static typing, and our tech stack reflects that fact. Backend services are all written in Haskell. Writing Haskell is sheer joy and highly productive. All web applications are developed in GWT, in our opinion the most mature of the "static language to javascript" compilers.

The next addition to the Tripshot engineering team will help with both the backend and web apps. Most of our new features require changes to both in roughly 75/25 proportion, but varies depending on the task.


  • Haskell experience is an absolute requirement. Personal projects are fine if they can be shared with us.

  • Java experience preferred. Its not necessary to have worked with GWT before, but that would be a bonus.

  • Highly motivated individual with excellent personal time management skills. You will be working from home.

  • Excellent analytical skills. Our roadmap is full of interesting problems requiring non-trivial algorithm design.

  • Must be eligible to work in the US.

Tripshot is committed to equal employment opportunity without regard for race, color, national origin, ethnicity, gender, protected veteran status, disability, sexual orientation, gender identity, or religion.

Get information on how to apply for this position.

August 15, 2018 08:43 PM

August 14, 2018

Jeremy Gibbons

Folds on lists

The previous post turned out to be rather complicated. In this one, I return to much simpler matters: {\mathit{foldr}} and {\mathit{foldl}} on lists, and list homomorphisms with an associative binary operator. For simplicity, I’m only going to be discussing finite lists.

Fold right

Recall that {\mathit{foldr}} is defined as follows:

\displaystyle  \begin{array}{@{}ll} \multicolumn{2}{@{}l}{\mathit{foldr} :: (\alpha \rightarrow \beta \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta} \\ \mathit{foldr}\;(\oplus)\;e\;[\,] & = e \\ \mathit{foldr}\;(\oplus)\;e\;(a:x) & = a \oplus \mathit{foldr}\;(\oplus)\;e\;x \end{array}

It satisfies the following universal property, which gives necessary and sufficient conditions for a function {h} to be expressible as a {\mathit{foldr}}:

\displaystyle  \begin{array}{@{}l} h = \mathit{foldr}\;(\oplus)\;e \quad\Longleftrightarrow\quad h\;[\,] = e \;\land\; h\;(a:x) = a \oplus (h\;x) \end{array}

As one consequence of the universal property, we get a fusion theorem, which states sufficient conditions for fusing a following function {g} into a {\mathit{foldr}}:

\displaystyle  \begin{array}{@{}l} g \cdot \mathit{foldr}\;(\oplus)\;e = \mathit{foldr}\; (\otimes)\;e' \quad \Longleftarrow\quad g\;e = e' \;\land\; g\;(a \oplus b) = a \otimes g\;b \end{array}

(on finite lists—infinite lists require an additional strictness condition). Fusion is an equivalence if {\mathit{foldr}\;(\oplus)\;e} is surjective. If {\mathit{foldr}\;(\oplus)\;e} is not surjective, it’s an equivalence on the range of that fold: the left-hand side implies

\displaystyle  g\;(a \oplus b) = a \otimes g\;b

only for {b} of the form {\mathit{foldr}\;(\oplus)\;e\;x} for some {x}, not for all {b}.

As a second consequence of the universal property (or alternatively, as a consequence of the free theorem of the type of {\mathit{foldr}}), we have map–fold fusion:

\displaystyle  \mathit{foldr}\;(\oplus)\;e \cdot \mathit{map}\;g = \mathit{foldr}\;((\oplus) \cdot g)\;e

The code golf{((\oplus) \cdot g)}” is a concise if opaque way of writing the binary operator pointfree: {((\oplus) \cdot g)\;a\;b = g\;a \oplus b}.

Fold left

Similarly, {\mathit{foldl}} is defined as follows:

\displaystyle  \begin{array}{@{}ll} \multicolumn{2}{@{}l}{\mathit{foldl} :: (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta} \\ \mathit{foldl}\;(\oplus)\;e\;[\,] & = e \\ \mathit{foldl}\;(\oplus)\;e\;(a:x) & = \mathit{foldl}\;(\oplus)\;(e \oplus a)\;x \end{array}

({\mathit{foldl}} is tail recursive, so it is unproductive on an infinite list.)

{\mathit{foldl}} also enjoys a universal property, although it’s not so well known as that for {\mathit{foldr}}. Because of the varying accumulating parameter {e}, the universal property entails abstracting that argument on the left in favour of a universal quantification on the right:

\displaystyle  \begin{array}{@{}l} h = \mathit{foldl}\;(\oplus) \quad\Longleftrightarrow\quad \forall b \;.\; h\;b\;[\,] = b \;\land\; h\;b\;(a:x) = h\;(b \oplus a)\;x \end{array}

(For the proof, see Exercise 16 in my paper with Richard Bird on arithmetic coding.)

From the universal property, it is straightforward to prove a map–{\mathit{foldl}} fusion theorem:

\displaystyle  \mathit{foldl}\;(\oplus)\;e \cdot \mathit{map}\;f = \mathit{foldl}\;((\cdot f) \cdot (\oplus))\;e

Note that {((\cdot f) \cdot (\oplus))\;b\;a = b \oplus f\;a}. There is also a fusion theorem:

\displaystyle  (\forall e \;.\; f \cdot \mathit{foldl}\;(\oplus)\;e = \mathit{foldl}\;(\otimes)\;(f\;e)) \quad\Longleftarrow\quad f\;(b \oplus a) = f\;b \otimes a

This is easy to prove by induction. (I would expect it also to be a consequence of the universal property, but I don’t see how to make that go through.)

Duality theorems

Of course, {\mathit{foldr}} and {\mathit{foldl}} are closely related. §3.5.1 of Bird and Wadler’s classic text presents a First Duality Theorem:

\displaystyle  \mathit{foldr}\;(\oplus)\;e\;x = \mathit{foldl}\;(\oplus)\;e\;x

when {\oplus} is associative with neutral element {e}; a more general Second Duality Theorem:

\displaystyle  \mathit{foldr}\;(\oplus)\;e\;x = \mathit{foldl}\;(\otimes)\;e\;x

when {\oplus} associates with {\otimes} (that is, {a \oplus (b \otimes c) = (a \oplus b) \otimes c}) and {a \oplus e = e \otimes a} (for all {a,b,c}, but fixed {e}); and a Third Duality Theorem:

\displaystyle  \mathit{foldr}\;(\oplus)\;e\;x = \mathit{foldl}\;(\mathit{flip}\;f)\;e\;(\mathit{reverse}\;x)

(again, all three only for finite {x}).

The First Duality Theorem is a specialization of the Second, when {{\oplus} = {\otimes}} (but curiously, also a slight strengthening: apparently all we require is that {a \oplus e = e \oplus a}, not that both equal {a}; for example, we still have {\mathit{foldr}\;(+)\;1 = \mathit{foldl}\;(+)\;1}, even though {1} is not the neutral element for {+}).


When is a function a fold?” Evidently, if function {h} on lists is injective, then it can be written as a {\mathit{foldr}}: in that case (at least, classically speaking), {h} has a post-inverse—a function {g} such that {g \cdot h = \mathit{id}}—so:

\displaystyle  \begin{array}{@{}ll} & h\;(a:x) \\ = & \qquad \{ g \cdot h = \mathit{id} \} \\ & h\;(a : g\;(h\;x)) \\ = & \qquad \{ \mbox{let } a \oplus b = h\;(a : g\;b) \} \\ & a \oplus h\;x \end{array}

and so, letting {e = h\;[\,]}, we have

\displaystyle  h = \mathit{foldr}\;(\oplus)\;e

But also evidently, injectivity is not necessary for a function to be a fold; after all, {\mathit{sum}} is a fold, but not the least bit injective. Is there a simple condition that is both sufficient and necessary for a function to be a fold? There is! The condition is that lists equivalent under {h} remain equivalent when extended by an element:

\displaystyle  h\;x = h\;x' \quad\Longrightarrow\quad h\;(a:x) = h\;(a:x')

We say that {h} is leftwards. (The kernel {\mathrm{ker}\;h} of a function {h} is a relation on its domain, namely the set of pairs {(x,x')} such that {h\;x = h\;x'}; this condition is equivalent to {\mathrm{ker}\;h \subseteq \mathrm{ker}\;(h \cdot (a:))} for every {a}.) Clearly, leftwardsness is necessary: if {h = \mathit{foldr}\;(\oplus)\;e} and {h\;x = h\;x'}, then

\displaystyle  \begin{array}{@{}ll} & h\;(a:x) \\ = & \qquad \{ h \mbox{ as } \mathit{foldr} \} \\ & a \oplus h\;x \\ = & \qquad \{ \mbox{assumption} \} \\ & a \oplus h\;x' \\ = & \qquad \{ h \mbox{ as } \mathit{foldr} \mbox{ again} \} \\ & h\;(a:x') \end{array}

Moreover, leftwardsness is sufficient. For, suppose that {h} is leftwards; then pick a function {g} such that, when {b} is in the range of {h}, we get {g\;b = x} for some {x} such that {h\;x = b} (and it doesn’t matter what value {g} returns for {b} outside the range of {h}), and then define

\displaystyle  a \oplus b = h\;(a : g\;b)

This is a proper definition of {\oplus}, on account of leftwardsness, in the sense that it doesn’t matter which value {x} we pick for {g\;b}, as long as indeed {h\;x = b}: any other value {x'} that also satisfies {h\;x' = b} entails the same outcome {h\;(a:x') = h\;(a:x)} for {a \oplus b}. Intuitively, it is not necessary to completely invert {h} (as we did in the injective case), provided that {h} preserves enough distinctions. For example, for {h = \mathit{sum}} (for which indeed {\mathit{sum}\;x = \mathit{sum}\;x'} implies {\mathit{sum}\;(a:x) = \mathit{sum}\;(a:x')}), we could pick {g\;b = [b]}. In particular, {h\;x} is obviously in the range of {h}; then {g\;(h\;x)} is chosen to be some {x'} such that {h\;x' = h\;x}, and so by construction {h\;(g\;(h\;x)) = h\;x}—in other words, {g} acts as a kind of partial inverse of {h}. So we get:

\displaystyle  \begin{array}{@{}ll} & a \oplus h\;x \\ = & \qquad \{ \mbox{definition of } \oplus \mbox{; let } x' = g\;(h\;x) \} \\ & h\;(a : x') \\ = & \qquad \{ h\;x' = h\;x \mbox{; assumption} \} \\ & h\;(a:x) \\ \end{array}

and therefore {h = \mathit{foldr}\;(\oplus)\;e} where {e = h\;[\,]}.

Monoids and list homomorphisms

A list homomorphism is a fold after a map in a monoid. That is, define

\displaystyle  \begin{array}{@{}ll} \multicolumn{2}{@{}l}{\mathit{hom} :: (\beta\rightarrow\beta\rightarrow\beta) \rightarrow (\alpha\rightarrow\beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta} \\ \mathit{hom}\;(\odot)\;f\;e\;[\,] & = e \\ \mathit{hom}\;(\odot)\;f\;e\;[a] & = f\;a \\ \mathit{hom}\;(\odot)\;f\;e\;(x \mathbin{{+}\!\!\!{+}} y) & = \mathit{hom}\;(\odot)\;f\;e\;x \;\odot\; \mathit{hom}\;(\odot)\;f\;e\;y \end{array}

provided that {\odot} and {e} form a monoid. (One may verify that this condition is sufficient for the equations to completely define the function; moreover, they are almost necessary—{\odot} and {e} should form a monoid on the range of {\mathit{hom}\;(\odot)\;f\;e}.) The Haskell library {\mathit{Data.Foldable}} defines an analogous method, whose list instance is

\displaystyle  \mathit{foldMap} :: \mathit{Monoid}\;\beta \Rightarrow (\alpha \rightarrow \beta) \rightarrow [\alpha] \rightarrow \beta

—the binary operator {\odot} and initial value {e} are determined implicitly by the {\mathit{Monoid}} instance rather than being passed explicitly.

Richard Bird’s Introduction to the Theory of Lists states implicitly what might be called the First Homomorphism Theorem, that any homomorphism consists of a reduction after a map (in fact, a consequence of the free theorem of the type of {\mathit{hom}}):

\displaystyle  \mathit{hom}\;(\odot)\;f\;e = \mathit{hom}\;(\odot)\;\mathit{id}\;e \cdot \mathit{map}\;f

Explicitly as Lemma 4 (“Specialization”) the same paper states a Second Homomorphism Theorem, that any homomorphism can be evaluated right-to-left or left-to-right, among other orders:

\displaystyle  \mathit{hom}\;(\odot)\;f\;e = \mathit{foldr}\;(\lambda\;a\;b \;.\; f\;a \odot b)\;e = \mathit{foldl}\;(\lambda\;b\;a \;.\; b \odot f\;a)\;e

I wrote up the Third Homomorphism Theorem, which is the converse of the Second Homomorphism Theorem: any function that can be written both as a {\mathit{foldr}} and as a {\mathit{foldl}} is a homomorphism. I learned this theorem from David Skillicorn, but it was conjectured earlier by Richard Bird and proved by Lambert Meertens during a train journey in the Netherlands—as the story goes, Lambert returned from a bathroom break with the idea for the proof. Formally, for given {h}, if there exists {\oplus, \otimes, e} such that

\displaystyle  h = \mathit{foldr}\;(\oplus)\;e = \mathit{foldl}\;(\otimes)\;e

then there also exists {f} and associative {\odot} such that

\displaystyle  h = \mathit{hom}\;(\odot)\;f\;e

Recall that {h} is “leftwards” if

\displaystyle  h\;x = h\;x' \Longrightarrow h\;(a:x) = h\;(a:x')

and that the leftwards functions are precisely the {\mathit{foldr}}s. Dually, {h} is rightwards if

\displaystyle  h\;x = h\;x' \Longrightarrow h\;(x \mathbin{{+}\!\!\!{+}} [a]) = h\;(x' \mathbin{{+}\!\!\!{+}} [a])

and (as one can show) the rightwards functions are precisely the {\mathit{foldl}}s. So the Specialization Theorem states that every homomorphism is both leftwards and rightwards, and the Third Homomorphism Theorem states that every function that is both leftwards and rightwards is necessarily a homomorphism. To prove the latter, suppose that {h} is both leftwards and rightwards; then pick a function {g} such that, for {b} in the range of {h}, we get {g\;b = x} for some {x} such that {h\;x = b}, and then define

\displaystyle  b \odot c = h\;(g\;b \mathbin{{+}\!\!\!{+}} g\;c)

As before, this is a proper definition of {\odot}: assuming leftwardsness and rightwardsness, the result does not depend on the representatives chosen for {g}. By construction, {g} again satisfies {h\;(g\;(h\;x)) = h\;x} for any {x}, and so we have:

\displaystyle  \begin{array}{@{}ll} & h\;x \odot h\;y \\ = & \qquad \{ \mbox{definition of } \odot \} \\ & h\;(g\;(h\;x) \mathbin{{+}\!\!\!{+}} g\;(h\;y)) \\ = & \qquad \{ h \mbox{ is rightwards; } h\;(g\;(h\;x)) = h\;x \} \\ & h\;(x \mathbin{{+}\!\!\!{+}} g\;(h\;y)) \\ = & \qquad \{ h \mbox{ is leftwards; } h\;(g\;(h\;y)) = h\;y \} \\ & h\;(x \mathbin{{+}\!\!\!{+}} y) \end{array}

Moreover, one can show that {\odot} is associative, and {e} its neutral element (at least on the range of {h}).

For example, sorting a list can obviously be done as a {\mathit{foldr}}, using insertion sort; by the Third Duality Theorem, it can therefore also be done as a {\mathit{foldl}} on the reverse of the list; and because the order of the input is irrelevant, the reverse can be omitted. The Third Homomorphism Theorem then implies that there exists an associative binary operator {\odot} such that {\mathit{sort}\;(x \mathbin{{+}\!\!\!{+}} y) = \mathit{sort}\;x \odot \mathit{sort}\;y}. It also gives a specification of {\odot}, given a suitable partial inverse {g} of {\mathit{sort}}—in this case, {g = \mathit{id}} suffices, because {\mathit{sort}} is idempotent. The only characterization of {\odot} arising directly from the proof involves repeatedly inserting the elements of one argument into the other, which does not exploit sortedness of the first argument. But from this inefficient characterization, one may derive the more efficient implementation that merges two sorted sequences, and hence obtain merge sort overall.

The Trick

Here is another relationship between the directed folds ({\mathit{foldr}} and {\mathit{foldl}}) and list homomorphisms—any directed fold can be implemented as a homomorphism, followed by a final function application to a starting value:

\displaystyle  \mathit{foldr}\;(\oplus)\;e\;x = \mathit{hom}\;(\cdot)\;(\oplus)\;\mathit{id}\;x\;e

Roughly speaking, this is because application and composition are related:

\displaystyle  a \oplus (b \oplus (c \oplus e)) = (a\oplus) \mathbin{\$} (b\oplus) \mathbin{\$} (c\oplus) \mathbin{\$} e = ((a\oplus) \cdot (b\oplus) \cdot (c\oplus))\;e

(here, {\$ } is right-associating, loose-binding function application). To be more precise:

\displaystyle  \begin{array}{@{}ll} & \mathit{foldr}\;(\oplus)\;e \\ = & \qquad \{ \mbox{map-fold fusion: } (\$)\;(a\oplus)\;b = a \oplus b \} \\ & \mathit{foldr}\;(\$)\;e \cdot \mathit{map}\;(\oplus) \\ = & \qquad \{ \mbox{fusion: } (\$ e)\;\mathit{id} = e \mbox{ and } (\$ e)\;((\oplus) \cdot f) = (\oplus)\;(f\;e) = (\$)\;(\oplus)\;((\$ e)\;f) \} \\ & (\$ e) \cdot \mathit{foldr}\;(\cdot)\;\mathit{id} \cdot \mathit{map}\;(\oplus) \\ = & \qquad \{ (\cdot) \mbox{ and } \mathit{id} \mbox{ form a monoid} \} \\ & (\$ e) \cdot \mathit{hom}\;(\cdot)\;(\oplus)\;\mathit{id} \end{array}

This result has many applications. For example, it’s the essence of parallel recognizers for regular languages. Language recognition looks at first like an inherently sequential process; a recognizer over an alphabet {\Sigma} can be represented as a finite state machine over states {S}, with a state transition function of type {S \times \Sigma \rightarrow S}, and such a function is clearly not associative. But by mapping this function over the sequence of symbols, we get a sequence of {S \rightarrow S} functions, which should be subject to composition. Composition is of course associative, so can be computed in parallel, in {\mathrm{log}\;n} steps on {n} processors. Better yet, each such {S \rightarrow S} function can be represented as an array (since {S} is finite), and the composition of any number of such functions takes a fixed amount of space to represent, and a fixed amount of time to apply, so the {\mathrm{log}\;n} steps take {\mathrm{log}\;n} time.

Similarly for carry-lookahead addition circuits. Binary addition of two {n}-bit numbers with an initial carry-in proceeds from right to left, adding bit by bit and taking care of carries, producing an {n}-bit result and a carry-out; this too appears inherently sequential. But there aren’t many options for the pair of input bits at a particular position: two {1}s will always generate an outgoing carry, two {0}s will always kill an incoming carry, and a {1} and {0} in either order will propagate an incoming carry to an outgoing one. Whether there is a carry-in at a particular bit position is computed by a {\mathit{foldr}} of the bits to the right of that position, zipped together, starting from the initial carry-in, using the binary operator {\oplus} defined by

\displaystyle  \begin{array}{@{}ll} (1,1) \oplus b & = 1 \\ (0,0) \oplus b & = 0 \\ (x,y) \oplus b & = b \end{array}

Again, applying {\oplus} to a bit-pair makes a bit-to-bit function; these functions are to be composed, and function composition is associative. Better, such functions have small finite representations as arrays (indeed, we need a domain of only three elements, {G, K, P}; {G} and {K} are both left zeroes of composition, and {P} is neutral). Better still, we can compute the carry-in at all positions using a {\mathit{scanr}}, which for an associative binary operator can also be performed in parallel in {\mathrm{log}\;n} steps on {n} processors.

The Trick seems to be related to Cayley’s Theorem, on monoids rather than groups: any monoid is equivalent to a monoid of endomorphisms. That is, corresponding to every monoid {(M,{\oplus},e)}, with {M} a set, and {(\oplus) :: M \rightarrow M \rightarrow M} an associative binary operator with neutral element {e :: M}, there is another monoid {(N,(\cdot),\mathit{id})} on the carrier {N = \{ (x\oplus) \mid x \in M \}} of endomorphisms of the form {(x\oplus)}; the mappings {(\oplus) :: M \rightarrow N} and {(\$ e) :: N \rightarrow M} are both monoid homomorphisms, and are each other’s inverse. (Cayley’s Theorem is what happens to the Yoneda Embedding when specialized to the one-object category representing a monoid.) So any list homomorphism corresponds to a list homomorphism with endomorphisms as the carrier, followed by a single final function application:

\displaystyle  \begin{array}{@{}ll} & \mathit{hom}\;(\oplus)\;f\;e \\ = & \qquad \{ \mbox{Second Homomorphism Theorem} \} \\ & \mathit{foldr}\;((\oplus) \cdot f)\;e \\ = & \qquad \{ \mbox{The Trick} \} \\ & (\$ e) \cdot \mathit{hom}\;(\cdot)\;((\oplus) \cdot f)\;\mathit{id} \end{array}

Note that {(\oplus) \cdot f} takes a list element {a} the endomorphism {(f\,a\, \oplus)}. The change of representation from {M} to {N} is the same as in The Trick, and is also what underlies Hughes’s novel representation of lists; but I can’t quite put my finger on what these both have to do with Cayley and Yoneda.

by jeremygibbons at August 14, 2018 03:05 PM

August 13, 2018


Setting Up A Private Nix Cache

I recently went through the process of setting up a private Nix binary cache. It was not obvious to me how to go about it at first, so I thought I would document what I did here. There are a few different ways of going about this that might be appropriate in one situation or another, but I’ll just describe the one I ended up using. I need to serve a cache for proprietary code, so I ended up using a cache served via SSH.

Setting up the server

For my cache server I’m using an Amazon EC2 instance with NixOS. It’s pretty easy to create these using the public NixOS 18.03 AMI. I ended up using a t2.medium with 1 TB of storage, but the jury is still out on the ideal tradeoff of specs and cost for our purposes. YMMV.

The NixOS AMI puts the SSH credentials in the root user, so log in like this:

ssh -i /path/to/your/key.pem

To get your new NixOS machine working as an SSH binary cache there are two things you need to do: generate a signing key and turn on cache serving.

Generate a signing key

You can generate a public/private key pair simply by running the following command:

nix-store --generate-binary-cache-key nix-cache-key.sec

Turn on SSH Nix store serving

NixOS ships out of the box with a config option for enabling this, so it’s pretty easy. Just edit /etc/nixos/configuraton.nix and add the following lines:

  nix = {
    extraOptions = ''
      secret-key-files = /root/nix-cache-key.sec

    sshServe = {
      enable = true;
      keys = [
        "ssh-rsa ..."
        "ssh-rsa ..."

The extraOptions section makes the system aware of your signing key. The sshServe section makes the local Nix store available via the nix-ssh user. You grant access to the cache by adding your users’ SSH public keys to the keys section.

Setting up the clients

Now you need to add this new cache to your users’ machines so they can get cached binaries instead of building things themselves. The following applies to multi-user Nix setups where there is a Nix daemon that runs as root. This is now the default when you install Nix on macOS. If you are using single-user Nix, then you may not need to do all of the following.

You need to have an SSH public/private key pair for your root user to use the Kadena Nix cache. This makes sense because everything in your local Nix store is world readable, so private cache access needs to be semantically controlled on a per-machine basis, not a per-user basis.

Generating a Root SSH Key

To generate an SSH key for your root user, run the following commands. After the ssh-keygen command hit enter three times to accept the defaults. It is important that you not set a password for this SSH key because the connection will be run automatically and you won’t be able to type a password. You’ll do the rest of the section as the root user, so start by entering a root shell and generating an SSH key pair.

sudo su -
ssh-keygen -b 4096

Next ssh to the cache server. This will tell you that the authenticity of the server can’t be established and ask if you want to continue. Answer ‘yes’. After it connects and prompts you for a password, just hit CTRL-c to cancel.


This has the effect of adding the server to your .ssh/known_hosts file. If you didn’t do this, SSH would ask you to verify the host authenticity. But SSH will be called automatically by the Nix daemon and it will fail.

Now cat the public key file.

cat ~/.ssh/

Copy the contents of this file and send your key to the administrator of the nix cache.

Telling Nix to use the cache

In your $NIX_CONF_DIR/nix.conf, add your cache to the substituters line and add the cache's public signing key (generated above with the nix-store command or given to you by your cache administrator) to the trusted-public-keys line. It might look something like this:

substituters = ssh://
trusted-public-keys =

Now you need to restart the nix daemon.

On mac:

sudo launchctl stop org.nixos.nix-daemon
sudo launchctl start org.nixos.nix-daemon

On linux:

sudo systemctl restart nix-daemon.service

Populating the Cache

To populate the Nix cache, use the nix-copy-closure command on any nix store path. For instance, the result symlink that is created by a nix-build.

nix-copy-closure -v --gzip --include-outputs --to <nix-store-path>

After you copy binaries to the cache, you need to sign them with the signing key you created with the nix-store command above. You can do that by running the following command on the cache server:

nix sign-paths -k nix-cache-key.sec --all

It’s also possible to sign packages on the machines that build them. This would require copying the private signing key around to other servers, so if you’re going to do that you should think carefully about key management.


At some point it is likely that your cache will run low on disk space. When this happens, the nix-collect-garbage command is your friend for cleaning things up in a gradual way that doesn't suddenly drop long builds on your users.

by mightybyte ( at August 13, 2018 08:40 PM

Ken T Takusagawa

[byhnlnng] Commonly occurring substrings in pi

We calculate the most frequently occurring N-digit sequences in the first 10^N digits of pi past the decimal point.

N=1: The most frequently occurring single digit in the first 10^1=10 digits of pi past the decimal point is 5.  It occurs 3 times: 1415926535.

N=2: The most frequently occurring 2-digit sequence in the first 10^2=100 digits of pi past the decimal point is 62.  It occurs 4 times: 1<wbr></wbr>4<wbr></wbr>1<wbr></wbr>5<wbr></wbr>9<wbr></wbr>2<wbr></wbr>6<wbr></wbr>5<wbr></wbr>3<wbr></wbr>5<wbr></wbr>8<wbr></wbr>9<wbr></wbr>7<wbr></wbr>9<wbr></wbr>3<wbr></wbr>2<wbr></wbr>3<wbr></wbr>8<wbr></wbr>4<wbr></wbr>62<wbr></wbr>6<wbr></wbr>4<wbr></wbr>3<wbr></wbr>3<wbr></wbr>8<wbr></wbr>3<wbr></wbr>2<wbr></wbr>7<wbr></wbr>9<wbr></wbr>5<wbr></wbr>0<wbr></wbr>2<wbr></wbr>8<wbr></wbr>8<wbr></wbr>4<wbr></wbr>1<wbr></wbr>9<wbr></wbr>7<wbr></wbr>1<wbr></wbr>6<wbr></wbr>9<wbr></wbr>3<wbr></wbr>9<wbr></wbr>9<wbr></wbr>3<wbr></wbr>7<wbr></wbr>5<wbr></wbr>1<wbr></wbr>0<wbr></wbr>5<wbr></wbr>8<wbr></wbr>2<wbr></wbr>0<wbr></wbr>9<wbr></wbr>7<wbr></wbr>4<wbr></wbr>9<wbr></wbr>4<wbr></wbr>4<wbr></wbr>5<wbr></wbr>9<wbr></wbr>2<wbr></wbr>3<wbr></wbr>0<wbr></wbr>7<wbr></wbr>8<wbr></wbr>1<wbr></wbr>6<wbr></wbr>4<wbr></wbr>0<wbr></wbr>62<wbr></wbr>8<wbr></wbr>62<wbr></wbr>0<wbr></wbr>8<wbr></wbr>9<wbr></wbr>9<wbr></wbr>8<wbr></wbr>62<wbr></wbr>8<wbr></wbr>0<wbr></wbr>3<wbr></wbr>4<wbr></wbr>8<wbr></wbr>2<wbr></wbr>5<wbr></wbr>3<wbr></wbr>4<wbr></wbr>2<wbr></wbr>1<wbr></wbr>1<wbr></wbr>7<wbr></wbr>0<wbr></wbr>6<wbr></wbr>7<wbr></wbr>9.

Below is a table extended up to 10-digit sequences in the first 10^10 digits of pi.  When there are ties, all sequences are listed.

(N, number of occurrences, [the N-digit sequences that occur most frequently])

We stopped at N=10 because, for a computation involving pi in base 10, stopping at 10^10 digits seemed appropriate.

It is curious that the middle column, number of occurrences, is non-decreasing.  We don't expect this trend to always be true.

Assuming the digits of pi are random, we expect a typical N-digit sequence to occur once in 10^N digits.  This isn't exactly correct because digit sequences overlap so are not independent.  I don't know the probability distribution of the number of occurrences of the most common sequence.

Here is Haskell code used in the computation.  We used strict state threads (Control.Monad.ST) to avoid lazy thunks from piling up, a frequent occurrence in long computationally intensive tasks which produce output only at the end.  We used seq to get rid of one giant space leak at the end when finding the largest item (and its index) of the array discussed below.

We used the unboxed STUArray s Integer Word8 to compactly store the counts for each possible digit string.  We furthermore packed 2 entries into each Word8, one in the upper 4 bits and one in the lower, at the cost of limiting maximum counts to 15 (which turned out to be enough: the maximum seen was 12).  For N=10, we were therefore expecting 5 GB memory usage for the 10^10 digit run, but memory use was 9848268k maxresident (9.8 GB) according to /usr/bin/time, so there is still a space leak somewhere.  CPU time was 6000 seconds.

by Ken ( at August 13, 2018 06:11 AM

August 09, 2018

FP Complete

Haskell Library Audit Reports

Since December, FP Complete has been working with Cardano Foundation on an audit of the Cardano settlement layer. The audit work is ongoing, with the currently released reports available on Cardano's website.

by Michael Snoyman ( at August 09, 2018 09:47 PM

August 07, 2018

Joachim Breitner

Swing Dancer Profile

During my two years in Philadelphia (I was a post-doc with Stephanie Weirich at the University of Pennsylvania) I danced a lot of Swing, in particular at the weekly social “Jazz Attack”.

They run a blog, that blog features dancers, and this week – just in time for my last dance – they featured me with a short interview.

by Joachim Breitner ( at August 07, 2018 04:16 PM

Jeremy Gibbons

Asymmetric Numeral Systems

The previous two posts discussed arithmetic coding (AC). In this post I’ll discuss a related but more recent entropy encoding technique called asymmetric numeral systems (ANS), introduced by Jarek Duda and explored in some detail in a series of 12 blog posts by Charles Bloom. ANS combines the effectiveness of AC (achieving approximately the theoretical optimum Shannon entropy) with the efficiency of Huffman coding. On the other hand, whereas AC acts in a first-in first-out manner, ANS acts last-in first-out, in the sense that the decoded text comes out in reverse; this is not a problem in a batched context, but it makes ANS unsuitable for encoding a communications channel, and also makes it difficult to employ adaptive text models.

Arithmetic coding, revisited

Here is a simplified version of arithmetic coding; compared to the previous posts, we use a fixed rather than adaptive model, and rather than {\mathit{pick}}ing some arbitrary element of the final interval, we pick explicitly the lower bound of the interval (by {\mathit{fst}}, when it is represented as a pair). We suppose a function

\displaystyle  \begin{array}{@{}l} \mathit{count} :: \mathit{Symbol} \rightarrow \mathit{Integer} \\ \end{array}

that returns a positive count for every symbol, representing predicted frequencies of occurrence, from which it is straightforward to derive functions

\displaystyle  \begin{array}{@{}l} \mathit{encodeSym} :: \mathit{Symbol} \rightarrow \mathit{Interval} \\ \mathit{decodeSym} :: \mathit{Rational} \rightarrow \mathit{Symbol} \end{array}

that work on a fixed global model, satisfying the central property

\displaystyle  \mathit{decodeSym}\;x = s \Leftrightarrow x \in \mathit{encodeSym}\;s

For example, an alphabet of three symbols `a’, `b’, `c’, with counts 2, 3, and 5 respectively, could be encoded by the intervals {(0,\frac{1}{5}), (\frac{1}{5},\frac{1}{2}),(\frac{1}{2},1)}. We have operations on intervals:

\displaystyle  \begin{array}{@{}ll} \mathit{weight}\;(l,r)\;x & = l + (r-l) \times x \\ \mathit{scale}\;(l,r)\;x & = (x-l)/(r-l) \\ \mathit{narrow}\;i\;(p,q) & = (\mathit{weight}\;i\;p, \mathit{weight}\;i\;q) \\ \end{array}

that satisfy

\displaystyle  \begin{array}{@{}ll} \mathit{weight}\;i\;x \in i & \Leftrightarrow x \in \mathit{unit} \\ \mathit{weight}\;i\;x = y & \Leftrightarrow \mathit{scale}\;i\;y = x \\ \end{array}

Then we have:

\displaystyle  \begin{array}{@{}l} \mathit{encode}_1 :: [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_1 = \mathit{fst} \cdot \mathit{foldl}\;\mathit{estep}_1\;\mathit{unit} \qquad \mathbf{where} \\ \qquad \begin{array}[t]{@{}l} \mathit{estep}_1 :: \mathit{Interval} \rightarrow \mathit{Symbol} \rightarrow \mathit{Interval} \\ \mathit{estep}_1\;i\;s = \mathit{narrow}\;i\;(\mathit{encodeSym}\;s) \medskip \end{array} \\ \mathit{decode}_1 :: \mathit{Rational} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_1 = \mathit{unfoldr}\;\mathit{dstep}_1 \qquad \mathbf{where} \\ \qquad \begin{array}[t]{@{}l} \mathit{dstep}_1 :: \mathit{Rational} \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, \mathit{Rational}) \\ \mathit{dstep}_1\;x = \mathbf{let}\;s = \mathit{decodeSym}\;x\;\mathbf{in}\; \mathit{Just}\;(s, \mathit{scale}\;(\mathit{encodeSym}\;s)\;x) \end{array} \end{array}

For example, with the alphabet and intervals above, the input text “abc” gets encoded symbol by symbol, from left to right (because of the {\mathit{foldl}}), to the narrowing sequence of intervals {(\frac{0}{1}, \frac{1}{1}), (\frac{0}{1}, \frac{1}{5}), (\frac{1}{25}, \frac{1}{10}), (\frac{7}{100}, \frac{1}{10})} from which we select the lower bound {\frac{7}{100}} of the final interval. (This version doesn’t stream well, because of the decision to pick the lower bound of the final interval as the result. But it will suffice for our purposes.)

Note now that the symbol encoding can be “fissioned” out of {\mathit{encode}_1}, using map fusion for {\mathit{foldl}} backwards; moreover, {\mathit{narrow}} is associative (and {\mathit{unit}} its neutral element), so we can replace the {\mathit{foldl}} with a {\mathit{foldr}}; and finally, now using fusion forwards, we can fuse the {\mathit{fst}} with the {\mathit{foldr}}. That is,

\displaystyle  \begin{array}{@{}cl} & \mathit{encode}_1 \\ = & \qquad \{ \mbox{definition} \} \\ & \mathit{fst} \cdot \mathit{foldl}\;\mathit{estep}_1\;\mathit{unit} \\ = & \qquad \{ \mbox{fusion for } \mathit{foldl} \mbox{, backwards} \} \\ & \mathit{fst} \cdot \mathit{foldl}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{map}\;\mathit{encodeSym} \\ = & \qquad \{ \mathit{narrow} \mbox{ is associative} \} \\ & \mathit{fst} \cdot \mathit{foldr}\;\mathit{narrow}\;\mathit{unit} \cdot \mathit{map}\;\mathit{encodeSym} \\ = & \qquad \{ \mbox{fusion for } \mathit{foldr} \} \\ & \mathit{foldr}\;\mathit{weight}\;0 \cdot \mathit{map}\;\mathit{encodeSym} \end{array}

and we can fuse the {\mathit{map}} back into the {\mathit{foldr}}; so we have {\mathit{encode}_1 = \mathit{encode}_2}, where

\displaystyle  \begin{array}{@{}l} \mathit{encode}_2 :: [\mathit{Symbol}] \rightarrow \mathit{Rational} \\ \mathit{encode}_2 = \mathit{foldr}\;\mathit{estep}_2\;0 \qquad\mathbf{where} \\ \qquad \begin{array}[t]{@{}l} \mathit{estep}_2 :: \mathit{Symbol} \rightarrow \mathit{Rational} \rightarrow \mathit{Rational} \\ \mathit{estep}_2\;s\;x = \mathit{weight}\;(\mathit{encodeSym}\;s)\;x \end{array} \end{array}

Now encoding is a simple {\mathit{foldr}} and decoding a simple {\mathit{unfoldr}}. This means that we can prove that decoding inverts encoding, using the “Unfoldr–Foldr Theorem” stated in the Haskell documentation for {\mathit{unfoldr}} (in fact, the only property of {\mathit{unfoldr}} stated there!). The theorem states that if

\displaystyle  \begin{array}{@{}ll} g\;(f\;x\;z) & = \mathit{Just}\;(x,z) \\ g\;e & = \mathit{Nothing} \end{array}

for all {x} and {z}, then

\displaystyle  \mathit{unfoldr}\;g\;(\mathit{foldr}\;f\;e\;\mathit{xs}) = \mathit{xs}

for all finite lists {\mathit{xs}}. Actually, that’s too strong for us here, because our decoding always generates an infinite list; a weaker variation (hereby called the “Unfoldr–Foldr Theorem With Junk”) states that if

\displaystyle  \begin{array}{@{}ll} g\;(f\;x\;z) & = \mathit{Just}\;(x,z) \end{array}

for all {x} and {z}, then

\displaystyle  \exists \mathit{ys}\,.\, \mathit{unfoldr}\;g\;(\mathit{foldr}\;f\;e\;\mathit{xs}) = \mathit{xs} \mathbin{{+}\!\!\!{+}} \mathit{ys}

for all finite lists {\mathit{xs}}—that is, the {\mathit{unfoldr}} inverts the {\mathit{foldr}} except for appending some junk {\mathit{ys}} to the output. It’s a nice exercise to prove this weaker theorem, by induction.

We check the condition:

\displaystyle  \begin{array}{@{}cl} & \mathit{dstep}_1\;(\mathit{estep}_2\;s\;x) \\ = & \qquad \{ \mathit{estep}_2 \mbox{; let } x' = \mathit{weight}\;(\mathit{encodeSym}\;s)\;x \} \\ & \mathit{dstep}_1\;x' \\ = & \qquad \{ \mathit{dstep}_1 \mbox{; let } s' = \mathit{decodeSym}\;x' \} \\ & \mathit{Just}\;(s', \mathit{scale}\;(\mathit{encodeSym}\;s')\;x') \end{array}

Now, we hope to recover the first symbol, ie we hope that {s' = s}:

\displaystyle  \begin{array}{@{}cl} & \mathit{decodeSym}\;x' = s \\ \Leftrightarrow & \qquad \{ \mbox{central property} \} \\ & x' \in \mathit{encodeSym}\;s \\ \Leftrightarrow & \qquad \{ x' = \mathit{weight}\;(\mathit{encodeSym}\;s)\;x \mbox{; } \mathit{weight} \} \\ & x \in \mathit{unit} \end{array}

Fortunately, it is an invariant of the encoding process that {x}—the result of {\mathit{encode}_2}—is in the unit interval, so indeed {s' = s}. Continuing:

\displaystyle  \begin{array}{@{}cl} & \mathit{dstep}_1\;(\mathit{estep}_2\;s\;x) \\ = & \qquad \{ \mbox{above} \} \\ & \mathit{Just}\;(s, \mathit{scale}\;(\mathit{encodeSym}\;s)\;(\mathit{weight}\;(\mathit{encodeSym}\;s)\;x)) \\ = & \qquad \{ \mathit{scale}\;i \cdot \mathit{weight}\;i = \mathit{id} \} \\ & \mathit{Just}\;(s, x) \end{array}

as required. Therefore decoding inverts encoding, apart from the fact that it continues to produce junk having decoded all the input; so

\displaystyle  \mathit{take}\;(\mathit{length}\;\mathit{xs})\;(\mathit{decode}_1\;(\mathit{encode}_2\;\mathit{xs})) = \mathit{xs}

for all finite {\mathit{xs}}.

From fractions to integers

Whereas AC encodes longer and longer messages as more and more precise fractions, ANS encodes them as larger and larger integers. Given the {\mathit{count}} function on symbols, we can easily derive definitions

\displaystyle  \begin{array}{@{}ll} \mathit{cumul} & :: \mathit{Symbol} \rightarrow \mathit{Integer} \\ \mathit{total} & :: \mathit{Integer} \\ \mathit{find} & :: \mathit{Integer} \rightarrow \mathit{Symbol} \\ \end{array}

such that {\mathit{cumul}\;s} gives the cumulative counts of all symbols preceding {s} (say, in alphabetical order), {\mathit{total}} the total of all symbol counts, and {\mathit{find}\;x} looks up an integer {0 \le x < \mathit{total}}:

\displaystyle  \mathit{find}\;x = s \Leftrightarrow \mathit{cumul}\;s \le x < \mathit{cumul}\;s + \mathit{count}\;s

(for example, we might tabulate the function {\mathit{cumul}} using a scan).

We encode a text as an integer {x}, containing {\log x} bits of information (all logs in this blog are base 2). The next symbol {s} to encode has probability {p = \mathit{count}\;s / \mathit{total}}, and so requires an additional {\log (1/p)} bits; in total, that’s {\log x + \log (1/p) = \log (x/p) = \log (x \times \mathit{total}/\mathit{count}\;s)} bits. So entropy considerations tell us that, roughly speaking, to incorporate symbol {s} into state {x} we want to map {x} to {x' \simeq x \times \mathit{total} / \mathit{count}\;s}. Of course, in order to decode, we need to be able to invert this transformation, to extract {s} and {x} from {x'}; this suggests that we should do the division by {\mathit{count}\;s} first:

\displaystyle  x' = (x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s) \times \mathit{total} \qquad \mbox{\textemdash{} not final}

so that the multiplication by the known value {\mathit{total}} can be undone first:

\displaystyle  x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s = x' \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total}

How do we reconstruct {s}? Well, there is enough headroom in {x'} to add any non-negative value less than {\mathit{total}} without affecting the division; in particular, we can add {\mathit{cumul}\;s} to {x'}, and then we can use {\mathit{find}} on the remainder:

\displaystyle  \begin{array}{@{}l} x' = (x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s) \times \mathit{total} + \mathit{cumul}\;s \qquad \mbox{\textemdash{} also not final} \\ x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s = x' \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total} \\ \mathit{cumul}\;s = x' \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{total} \\ s = \mathit{find}\;(\mathit{cumul}\;s) = \mathit{find}\;(x' \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{total}) \end{array}

But we have still lost some information from the lower end of {x} through the division, namely {x \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{count}\;s}; so we can’t yet reconstruct {x}. Happily, {\mathit{find}\;(\mathit{cumul}\;s) = \mathit{find}\;(\mathit{cumul}\;s+r)} for any {0 \le r < \mathit{count}\;s}, so there is still precisely enough headroom in {x'} to add this lost information too without affecting the {\mathit{find}}, allowing us also to reconstruct {x}:

\displaystyle  \begin{array}{@{}l} x' = (x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s) \times \mathit{total} + \mathit{cumul}\;s + x \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{count}\;s \qquad \mbox{\textemdash{} final} \\ x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s = x' \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total} \\ \mathit{cumul}\;s + x \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{count}\;s = x' \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{total} \\ s = \mathit{find}\;(\mathit{cumul}\;s + x \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{count}\;s) = \mathit{find}\;(x' \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{total}) \\ x = \mathit{count}\;s \times (x \mathbin{\underline{\smash{\mathit{div}}}} \mathit{count}\;s) + x \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{count}\;s = \mathit{count}\;s \times (x' \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total}) + x' \mathbin{\underline{\smash{\mathit{mod}}}} \mathit{total} - \mathit{cumul}\;s \end{array}

We therefore define

\displaystyle  \begin{array}{@{}l} \mathit{encode}_3 :: [\mathit{Symbol}] \rightarrow \mathit{Integer} \\ \mathit{encode}_3 = \mathit{foldr}\;\mathit{estep}_3\;0 \\ \mathit{estep}_3 :: \mathit{Symbol} \rightarrow \mathit{Integer} \rightarrow \mathit{Integer} \\ \mathit{estep}_3\;s\;x = \mathbf{let}\;(q,r) = \mathit{divMod}\;x\;(\mathit{count}\;s)\;\mathbf{in}\;q \times \mathit{total} + \mathit{cumul}\;s + r \medskip \\ \mathit{decode}_3 :: \mathit{Integer} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_3 = \mathit{unfoldr}\;\mathit{dstep}_3 \\ \mathit{dstep}_3 :: \mathit{Integer} \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, \mathit{Integer}) \\ \mathit{dstep}_3\;x = \mathbf{let}\;(q,r) = \mathit{divMod}\;x\;\mathit{total} ; s = \mathit{find}\;r\;\mathbf{in}\; \mathit{Just}\;(s, \mathit{count}\;s \times q + r - \mathit{cumul}\;s) \end{array}

Using similar reasoning as for AC, it is straightforward to show that a decoding step inverts an encoding step:

\displaystyle  \begin{array}{@{}cl} & \mathit{dstep}_3\;(\mathit{estep}_3\;s\;x) \\ = & \qquad \{ \mathit{estep}_3 \mbox{; let } (q,r) = \mathit{divMod}\;x\;(\mathit{count}\;s), x' = q \times \mathit{total} + \mathit{cumul}\;s + r \} \\ & \mathit{dstep}_3\;x' \\ = & \qquad \{ \mathit{dstep}_3 \mbox{; let } (q',r') = \mathit{divMod}\;x'\;\mathit{total} ; s' = \mathit{find}\;r' \} \\ & \mathit{Just}\;(s', \mathit{count}\;s' \times q' + r' - \mathit{cumul}\;s') \\ = & \qquad \{ r' = \mathit{cumul}\;s + r, 0 \le r < \mathit{count}\;s, \mbox{ so } s'=\mathit{find}\;r'=s \} \\ & \mathit{Just}\;(s, \mathit{count}\;s \times q' + r' - \mathit{cumul}\;s) \\ = & \qquad \{ r' - \mathit{cumul}\;s = r, q' = x' \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total} = q \} \\ & \mathit{Just}\;(s, \mathit{count}\;s \times q + r) \\ = & \qquad \{ (q,r) = \mathit{divMod}\;x\;(\mathit{count}\;s) \} \\ & \mathit{Just}\;(s,x) \end{array}

Therefore decoding inverts encoding, modulo final junk, by the Unfoldr–Foldr Theorem With Junk:

\displaystyle  \mathit{take}\;(\mathit{length}\;\mathit{xs})\;(\mathit{decode}_3\;(\mathit{encode}_3\;\mathit{xs})) = \mathit{xs}

for all finite {\mathit{xs}}. For example, with the same alphabet and symbol weights as before, encoding the text “abc” (now from right to left, because of the {\mathit{foldr}}) proceeds through the steps {0, 5, 14, 70}, and the last element {70} is the encoding.

In fact, the inversion property holds whatever value we use to start encoding with (since it isn’t used in the proof); in the next section, we start encoding with a certain lower bound {l} rather than {0}. Moreover, {\mathit{estep}_3} is strictly increasing on states strictly greater than zero, and {\mathit{dstep}_3} strictly decreasing; which means that the decoding process can stop when it returns to the lower bound. That is, if we pick some {l > 0} and define

\displaystyle  \begin{array}{@{}l} \mathit{encode}_4 :: [\mathit{Symbol}] \rightarrow \mathit{Integer} \\ \mathit{encode}_4 = \mathit{foldr}\;\mathit{estep}_3\;l \medskip \\ \mathit{decode}_4 :: \mathit{Integer} \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_4 = \mathit{unfoldr}\;\mathit{dstep}_4 \\ \mathit{dstep}_4 :: \mathit{Integer} \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, \mathit{Integer}) \\ \mathit{dstep}_4\;x = \mathbf{if}\; x==l \;\mathbf{then}\; \mathit{Nothing} \;\mathbf{else}\; \mathit{dstep}_3\;x \end{array}

then the stronger version of the Unfoldr–Foldr Theorem holds, and we have

\displaystyle  \mathit{decode}_4\;(\mathit{encode}_4\;\mathit{xs}) = \mathit{xs}

for all finite {\mathit{xs}}, without junk.

Bounded precision

The previous versions all used arbitrary-precision arithmetic, which is expensive. We now change the approach slightly to use only bounded-precision arithmetic. As usual, there is a trade-off between effectiveness (a bigger bound means more accurate approximations to ideal entropies) and efficiency (a smaller bound generally means faster operations). Fortunately, the reasoning does not depend on much about the actual bounds. We will represent the integer accumulator {z} as a pair {(x,\mathit{ys})} which we call a {\mathit{Number}}:

\displaystyle  \mathbf{type}\; \mathit{Number} = (\mathit{Integer}, [\mathit{Integer}])

such that {\mathit{ys}} is a list of digits in a given base {b}, and {x} is an integer with {l \le x < u} (where we define {u = l \times b} for the upper bound of the range), under the abstraction relation

\displaystyle  z = \mathit{foldl}\;\mathit{inject}\;x\;\mathit{ys}


\displaystyle  \begin{array}{@{}ll} \mathit{inject}\;x\;y & = x \times b + y \\ \mathit{extract}\;x & = \mathit{divMod}\;x\;b \end{array}

We call {x} the “window” and {\mathit{ys}} the “remainder”. For example, with {b=10} and {l=100}, the pair {(123,[4,5,6])} represents {123456}; and indeed, for any {z \ge l} there is a unique representation satisfying the invariants. According to Duda, {\mathit{total}} should divide into {l} evenly. On a real implementation, {b} and {l} should be powers of two, such that arithmetic on values up to {u} fits within a single machine word.

The encoding step acts on the window in the accumulator using {\mathit{estep}_3}, which risks making it overflow the range; we therefore renormalize with {\mathit{enorm}_5} by shifting digits from the window to the remainder until this overflow would no longer happen, before consuming the symbol.

\displaystyle  \begin{array}{@{}l} \mathit{econsume}_5 :: [\mathit{Symbol}] \rightarrow \mathit{Number} \\ \mathit{econsume}_5 = \mathit{foldr}\;\mathit{estep}_5\;(l,[\,]) \medskip \\ \mathit{estep}_5 :: \mathit{Symbol} \rightarrow \mathit{Number} \rightarrow \mathit{Number} \\ \mathit{estep}_5\;s\;(x,\mathit{ys}) = \mathbf{let}\;(x',\mathit{ys}') = \mathit{enorm}_5\;s\;(x,\mathit{ys})\;\mathbf{in}\;(\mathit{estep}_3\;s\;x',\mathit{ys}') \medskip \\ \mathit{enorm}_5 :: \mathit{Symbol} \rightarrow \mathit{Number} \rightarrow \mathit{Number} \\ \mathit{enorm}_5\;s\;(x,\mathit{ys}) = \begin{array}[t]{@{}l@{}} \mathbf{if}\;\mathit{estep}_3\;s\;x < u \;\mathbf{then}\; (x,\mathit{ys})\;\mathbf{else}\; \\ \mathbf{let}\;(q,r) = \mathit{extract}\;x \;\mathbf{in}\; \mathit{enorm}_5\;s\;(q,r:\mathit{ys}) \end{array} \end{array}

Some calculation, using the fact that {\mathit{total}} divides {l}, allows us to rewrite the guard in {\mathit{enorm}_5}:

\displaystyle  \mathit{estep}_3\;s\;x < u \Leftrightarrow x < b \times (l \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total}) \times \mathit{count}\;s

For example, again encoding the text “abc”, again from right to left, with {r=10} and {l=100}, the accumulator starts with {(100,[\,])}, and goes through the states {(205,[\,])} and {(683,[\,])} on consuming the `c’ then the `b’; directly consuming the `a’ would make the window overflow, so we renormalize to {(68,[3])}; then it is safe to consume the `a’, leading to the final state {(340,[3])}.

Decoding is an unfold using the accumulator as state. We repeatedly output a symbol from the window; this may make the window underflow the range, in which case we renormalize if possible by injecting digits from the remainder (and if this is not possible, because there are no more digits to inject, it means that we have decoded the entire text).

\displaystyle  \begin{array}{@{}l} \mathit{dproduce}_5 :: \mathit{Number} \rightarrow [\mathit{Symbol}] \\ \mathit{dproduce}_5 = \mathit{unfoldr}\;\mathit{dstep}_5 \medskip \\ \mathit{dstep}_5 :: \mathit{Number} \rightarrow \mathsf{Maybe}\;(\mathit{Symbol}, \mathit{Number}) \\ \mathit{dstep}_5\;(x,\mathit{ys}) = \begin{array}[t]{@{}l} \mathbf{let}\; \mathit{Just}\;(s, x') = \mathit{dstep}_3\;x ; (x'',\mathit{ys}'') = \mathit{dnorm}_5\;(x',\mathit{ys}) \;\mathbf{in} \\ \mathbf{if}\; x'' \ge l \;\mathbf{then}\; \mathit{Just}\;(s,(x'',\mathit{ys}'')) \;\mathbf{else}\; \mathit{Nothing} \medskip \end{array} \\ \mathit{dnorm}_5 :: \mathit{Number} \rightarrow \mathit{Number} \\ \mathit{dnorm}_5\;(x,y:\mathit{ys}) = \mathbf{if}\; x < l \;\mathbf{then}\; \mathit{dnorm}_5\; (\mathit{inject}\;x\;y, \mathit{ys}) \;\mathbf{else}\; (x,y:\mathit{ys}) \\ \mathit{dnorm}_5\;(x,[\,]) = (x,[\,]) \end{array}

Note that decoding is of course symmetric to encoding; in particular, when encoding we renormalize before consuming a symbol; therefore when decoding we renormalize after emitting a symbol. For example, decoding the final encoding {(340,[3])} starts by computing {\mathit{dstep}_3\;340 = \mathit{Just}\;(\mbox{\texttt{\char39{}a\char39}},68)}; the window value 68 has underflowed, so renormalization consumes the remaining digit 3, leading to the accumulator {(683,[\,])}; then decoding proceeds to extract the `b’ and `c’ in turn, returning the accumulator to {(100,[\,])}.

We can again prove that decoding inverts encoding, although we need to use yet another variation of the Unfoldr–Foldr Theorem, hereby called the “Unfoldr–Foldr Theorem With Invariant”. We say that predicate {p} is an invariant of {\mathit{foldr}\;f\;e} and {\mathit{unfoldr}\;g} if

\displaystyle  \begin{array}{@{}lcl} p\;z &\Rightarrow& p\;(f\;x\;z) \\ p\;z \land g\;z = \mathit{Just}\;(x,z') &\Rightarrow& p\;z' \end{array}

The theorem is that if {p} is such an invariant, and the conditions

\displaystyle  \begin{array}{@{}ll} g\;(f\;x\;z) & = \mathit{Just}\;(x,z) \qquad \Leftarrow p\;z \\ g\;e & = \mathit{Nothing} \end{array}

hold for all {x} and {z}, then

\displaystyle  \mathit{unfoldr}\;g\;(\mathit{foldr}\;f\;e\;\mathit{xs}) = \mathit{xs} \qquad \Leftarrow p\;e

for all finite lists {\mathit{xs}}, provided that the initial state {e} satisfies the invariant {p}. In our case, the invariant is that the window {x} is in range ({l \le x < u}), which is indeed maintained by {\mathit{estep}_5} and {\mathit{dstep}_5}. Then it is straightforward to verify that

\displaystyle  \mathit{dstep}_5\;(l,[\,]) = \mathit{Nothing}


\displaystyle  \mathit{dstep}_5\;(\mathit{estep}_5\;s\;(x,\mathit{ys})) = \mathit{Just}\;(s,(x,\mathit{ys}))

when {x} is in range, making use of a lemma that

\displaystyle  \mathit{dnorm}_5\;(\mathit{enorm}_5\;(x,\mathit{ys})) = (x,\mathit{ys})

when {x} is in range. Therefore decoding inverts encoding:

\displaystyle  \mathit{dproduce}_5\;(\mathit{econsume}_5\;\mathit{xs}) = \mathit{xs}

for all finite {\mathit{xs}}.

Trading in sequences

The version of encoding in the previous section yields a {\mathit{Number}}, that is, a pair consisting of an integer window and a digit sequence remainder. It would be more conventional for encoding to take a sequence of symbols to a sequence of digits alone, and decoding to take the sequence of digits back to a sequence of symbols. For encoding, we have to flush the remaining digits out of the window at the end of the process, reducing the window to zero:

\displaystyle  \begin{array}{@{}l} \mathit{eflush}_5 :: \mathit{Number} \rightarrow [\mathit{Integer}] \\ \mathit{eflush}_5\;(0,\mathit{ys}) = \mathit{ys} \\ \mathit{eflush}_5\;(x,\mathit{ys}) = \mathbf{let}\; (x',y) = \mathit{extract}\;x \;\mathbf{in}\; \mathit{eflush}_5\;(x',y:\mathit{ys}) \end{array}

For example, {\mathit{eflush}_5\;(340,[3]) = [3,4,0,3]}. Then we can define

\displaystyle  \begin{array}{@{}l} \mathit{encode}_5 :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode}_5 = \mathit{eflush}_5 \cdot \mathit{econsume}_5 \end{array}

Correspondingly, decoding should start by populating an initially-zero window from the sequence of digits:

\displaystyle  \begin{array}{@{}l} \mathit{dstart}_5 :: [\mathit{Integer}] \rightarrow \mathit{Number} \\ \mathit{dstart}_5\;\mathit{ys} = \mathit{dnorm}_5\;(0,\mathit{ys}) \end{array}

For example, {\mathit{dstart}_5\;[3,4,0,3] = (340,[3])}. Then we can define

\displaystyle  \begin{array}{@{}l} \mathit{decode}_5 :: [\mathit{Integer}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_5 = \mathit{dproduce}_5 \cdot \mathit{dstart}_5 \end{array}

One can show that

\displaystyle  \mathit{dstart}_5\;(\mathit{eflush}_5\;x) = x

provided that {x} is initially in range, and therefore again decoding inverts encoding:

\displaystyle  \begin{array}{@{}ll} & \mathit{decode}_5\;(\mathit{encode}_5\;\mathit{xs}) \\ = & \qquad \{ \mathit{decode}_5, \mathit{encode}_5 \} \\ & \mathit{dproduce}_5\;(\mathit{dstart}_5\;(\mathit{eflush}_5\;(\mathit{econsume}_5\;\mathit{xs}))) \\ = & \qquad \{ \mathit{econsume}_5 \mbox { yields an in-range window, on which } \mathit{dstart}_5 \mbox{ inverts } \mathit{eflush}_5 \} \\ & \mathit{dproduce}_5\;(\mathit{econsume}_5\;\mathit{xs}) \\ = & \qquad \{ \mathit{dproduce}_5 \mbox { inverts } \mathit{econsume}_5 \} \\ & \mathit{xs} \end{array}

for all finite {\mathit{xs}}.

Note that this is not just a data refinement—it is not the case that {\mathit{encode}_5} yields the digits of the integer computed by {\mathit{encode}_4}. For example, {\mathit{encode}_4\;\mbox{\texttt{\char34{}abc\char34}} = 3411}, whereas {\mathit{encode}_5\;\mbox{\texttt{\char34{}abc\char34}} = [3,4,0,3]}. We have really changed the effectiveness of the encoding in return for the increased efficiency.

Streaming encoding

We would now like to stream encoding and decoding as metamorphisms: encoding should start delivering digits before consuming all the symbols, and conversely decoding should start delivering symbols before consuming all the digits.

For encoding, we have

\displaystyle  \begin{array}{@{}l} \mathit{encode}_5 :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode}_5 = \mathit{eflush}_5 \cdot \mathit{econsume}_5 \end{array}

with {\mathit{econsume}_5} a fold; can we turn {\mathit{eflush}_5} into an unfold? Yes, we can! The remainder in the accumulator should act as a queue of digits: digits get enqueued at the most significant end, so we need to dequeue them from the least significant end. So we define

\displaystyle  \begin{array}{@{}l} \mathit{edeal}_6 :: [\alpha] \rightarrow [\alpha] \\ \mathit{edeal}_6 = \mathit{unfoldr}\;\mathit{unsnoc} \;\mathbf{where} \\ \qquad \begin{array}[t]{@{}ll} \mathit{unsnoc}\;[\,] & = \mathit{Nothing} \\ \mathit{unsnoc}\;\mathit{ys} & = \mathbf{let}\; (\mathit{ys}',y) = (\mathit{init}\;\mathit{ys}, \mathit{last}\;\mathit{ys}) \;\mathbf{in}\; \mathit{Just}\;(y, \mathit{ys}') \end{array} \end{array}

to “deal out” the elements of a list one by one, starting with the last. Of course, this reverses the list; so we have the slightly awkward

\displaystyle  \mathit{eflush}_5 = \mathit{reverse} \cdot \mathit{edeal}_6 \cdot \mathit{eflush}_5

Now we can use unfold fusion to fuse {\mathit{edeal}_6} and {\mathit{eflush}_5} into a single unfold, deriving the coalgebra

\displaystyle  \begin{array}{@{}ll} \multicolumn{2}{@{}l}{\mathit{efstep}_6 :: \mathit{Number} \rightarrow \mathsf{Maybe}\;(\mathit{Integer}, \mathit{Number})} \\ \mathit{efstep}_6\;(0,[\,]) & = \mathit{Nothing} \\ \mathit{efstep}_6\;(x,[\,]) & = \mathbf{let}\; (q,r) = \mathit{extract}\;x \;\mathbf{in}\; \mathit{Just}\;(r,(q,[\,])) \\ \mathit{efstep}_6\;(x,\mathit{ys}) & = \mathbf{let}\; (\mathit{ys}',y) = (\mathit{init}\;\mathit{ys}, \mathit{last}\;\mathit{ys}) \;\mathbf{in}\; \mathit{Just}\;(y, (x,\mathit{ys}')) \end{array}

by considering each of the three cases, getting {\mathit{eflush}_5 = \mathit{eflush}_6} where

\displaystyle  \begin{array}{@{}l} \mathit{eflush}_6 :: \mathit{Number} \rightarrow [\mathit{Integer}] \\ \mathit{eflush}_6 = \mathit{reverse} \cdot \mathit{unfoldr}\;\mathit{efstep}_6 \end{array}

As for the input part, we have a {\mathit{foldr}} where we need a {\mathit{foldl}}. Happily, that is easy to achieve, at the cost of an additional {\mathit{reverse}}, since:

\displaystyle  \mathit{foldr}\;f\;e\;\mathit{xs} = \mathit{foldl}\;(\mathit{flip}\;f)\;e\;(\mathit{reverse}\;\mathit{xs})

on finite {\mathit{xs}}. So we have {\mathit{encode}_5 = \mathit{encode}_6}, where

\displaystyle  \begin{array}{@{}l} \mathit{encode}_6 :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode}_6 = \mathit{reverse} \cdot \mathit{unfoldr}\;\mathit{efstep}_6 \cdot \mathit{foldl}\;(\mathit{flip}\;\mathit{estep}_5)\;(l,[\,]) \cdot \mathit{reverse} \end{array}

It turns out that the streaming condition does not hold for {\mathit{efstep}_6} and {\mathit{flip}\;\mathit{estep}_5}—although we can stream digits out of the remainder:

\displaystyle  \begin{array}{@{}ll} \mathit{efstep}_6\;(123,[4,5,6]) & = \mathit{Just}\;(6,(123,[4,5])) \\ \mathit{estep}_5\;\mbox{\texttt{\char39{}c\char39}}\;(123,[4,5,6]) & = (248,[4,5,6]) \\ \mathit{efstep}_6\;(\mathit{estep}_5\;\mbox{\texttt{\char39{}c\char39}}\;(123,[4,5,6])) & = \mathit{Just}\;(6,(248,[4,5])) \end{array}

we cannot stream the last few digits out of the window:

\displaystyle  \begin{array}{@{}ll} \mathit{efstep}_6\;(123,[\,]) & = \mathit{Just}\;(3,(12,[\,])) \\ \mathit{estep}_5\;\mbox{\texttt{\char39{}c\char39}}\;(123,[\,]) & = (248,[\,]) \\ \mathit{efstep}_6\;(\mathit{estep}_5\;\mbox{\texttt{\char39{}c\char39}}\;(123,[\,])) & = \mathit{Just}\;(8,(24,[\,])) \end{array}

We have to resort to flushing streaming, which starts from an apomorphism

\displaystyle  \mathit{apo} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow \beta \rightarrow [\gamma]

rather than an unfold. Of course, any unfold is trivially an apo, with the trivial flusher that always yields the empty list; but more interestingly, any unfold can be factored into a “cautious” phase (delivering elements only while a predicate {p} holds) followed by a “reckless” phase (discarding the predicate, and delivering the elements anyway)

\displaystyle  \mathit{unfoldr}\;g = \mathit{apo}\;(\mathit{guard}\;p\; g)\;(\mathit{unfoldr}\;g)


\displaystyle  \begin{array}{@{}l} \mathit{guard} :: (b \rightarrow \mathit{Bool}) \rightarrow (b \rightarrow \mathsf{Maybe}\;(c,b)) \rightarrow (b \rightarrow \mathsf{Maybe}\;(c,b)) \\ \mathit{guard}\;p\;g\;x = \mathbf{if}\; p\;x \;\mathbf{then}\; g\;x \;\mathbf{else}\; \mathit{Nothing} \end{array}

In particular, the streaming condition may hold for the cautious coalgebra {\mathit{guard}\;p\;g} when it doesn’t hold for the reckless coalgebra {g} itself. We’ll use the cautious coalgebra

\displaystyle  \begin{array}{@{}l} \mathit{efstep}_7 :: \mathit{Number} \rightarrow \mathsf{Maybe}\;(\mathit{Integer}, \mathit{Number}) \\ \mathit{efstep}_7 = \mathit{guard}\;(\mathit{not} \cdot \mathit{null} \cdot \mathit{snd})\;\mathit{efstep}_6 \end{array}

which carefully avoids the problematic case when the remainder is empty. It is now straightforward to verify that the streaming condition holds for {\mathit{efstep}_7} and {\mathit{flip}\;\mathit{estep}_5}, and therefore {\mathit{encode}_6 = \mathit{encode}_7}, where

\displaystyle  \begin{array}{@{}l} \mathit{encode}_7 :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode}_7 = \mathit{reverse} \cdot \mathit{fstream}\;\mathit{efstep}_7\;(\mathit{unfoldr}\;\mathit{efstep}_6)\;(\mathit{flip}\;\mathit{estep}_5)\;(l,[\,]) \cdot \mathit{reverse} \end{array}

and where {\mathit{encode}_7} streams its output. On the downside, this has to read the input text in reverse, and also write the output digit sequence in reverse.

Streaming decoding

Fortunately, decoding is rather easier to stream. We have

\displaystyle  \mathit{decode}_5 = \mathit{dproduce}_5 \cdot \mathit{dstart}_5

with {\mathit{dproduce}_5} an unfold; can we turn {\mathit{dstart}_5} into a fold? Yes, we can! In fact, we have {\mathit{dstart}_5 = \mathit{foldl}\;\mathit{dsstep}_8\;(0,[\,])}, where

\displaystyle  \begin{array}{@{}ll} \mathit{dsstep}_8\;(x,[\,])\;y & = \mathbf{if}\; x < l \;\mathbf{then}\; (\mathit{inject}\;x\;y, [\,]) \;\mathbf{else}\; (x, [y]) \\ \mathit{dsstep}_8\;(x,\mathit{ys})\;y & = (x, \mathit{ys} \mathbin{{+}\!\!\!{+}} [y]) \end{array}

That is, starting with {0} for the accumulator, digits are injected one by one into the window until this is in range, and thereafter appended to the remainder.

Now we have decoding as an {\mathit{unfoldr}} after a {\mathit{foldl}}, and it is straightforward to verify that the streaming condition holds for {\mathit{dstep}_5} and {\mathit{dsstep}_8}. Therefore {\mathit{decode}_5 = \mathit{decode}_8} on finite inputs, where

\displaystyle  \begin{array}{@{}l} \mathit{decode}_8 :: [\mathit{Integer}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_8 = \mathit{stream}\;\mathit{dstep}_5\;\mathit{dsstep}_8\;(0,[\,]) \end{array}

and {\mathit{decode}_8} streams. It is perhaps a little more symmetric to move the {\mathit{reverse}} from the final step of encoding to the initial step of decoding—that is, to have the least significant digit first in the encoded text, rather than the most significant—and then to view both the encoding and the decoding process as reading their inputs from right to left.

Summing up

Inlining definitions and simplifying, we have ended up with encoding as a flushing stream:

\displaystyle  \begin{array}{@{}l} \mathit{encode}_9 :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode}_9 = \mathit{fstream}\;g'\;(\mathit{unfoldr}\;g)\;f\;(l,[\,]) \cdot \mathit{reverse} \;\mathbf{where} \\ \qquad \begin{array}[t]{@{}ll} f\;(x,\mathit{ys})\;s & = \begin{array}[t]{@{}ll} \mathbf{if} & x < b \times (l \mathbin{\underline{\smash{\mathit{div}}}} total) \times \mathit{count}\;s \\ \mathbf{then} & \mathbf{let}\; (q,r) = \mathit{divMod}\;x\;(\mathit{count}\;s) \;\mathbf{in}\; (q \times \mathit{total} + \mathit{cumul}\;s + r,\mathit{ys}) \\ \mathbf{else} & \mathbf{let}\; (q,r) = \mathit{extract}\;x \;\mathbf{in}\; f\;(q, r : \mathit{ys})\;s \end{array} \\ g\;(0,[\,]) & = \mathit{Nothing} \\ g\;(x,[\,]) & = \mathbf{let}\; (q,r) = \mathit{extract}\;x \;\mathbf{in}\; \mathit{Just}\;(r,(q,[\,])) \\ g\;(x,\mathit{ys}) & = \mathbf{let}\; (\mathit{ys}',y) = (\mathit{init}\;\mathit{ys}, \mathit{last}\;\mathit{ys}) \;\mathbf{in}\; \mathit{Just}\;(y, (x,\mathit{ys}')) \\ g'\;(x,\mathit{ys}) & = \mathbf{if}\; \mathit{null}\;\mathit{ys} \;\mathbf{then}\; \mathit{Nothing} \;\mathbf{else}\; g\;(x,\mathit{ys}) \end{array} \end{array}

and decoding as an ordinary stream:

\displaystyle  \begin{array}{@{}l} \mathit{decode}_9 :: [\mathit{Integer}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode}_9 = \mathit{stream}\;g\;f\;(0,[\,]) \cdot \mathit{reverse} \;\mathbf{where} \\ \qquad \begin{array}[t]{@{}ll} g\;(x,\mathit{ys}) & = \begin{array}[t]{@{}lll} \mathbf{let} & (q,r) & = \mathit{divMod}\;x\;\mathit{total} \\ & s & = \mathit{find}\;r \\ & (x'',\mathit{ys}'') & = n\;(\mathit{count}\;s \times q + r - \mathit{cumul}\;s,\mathit{ys}) \\ \mathbf{in} & \multicolumn{2}{l}{\mathbf{if}\; x'' \ge l \;\mathbf{then}\; \mathit{Just}\;(s,(x'',\mathit{ys}'')) \;\mathbf{else}\; \mathit{Nothing}} \end{array} \\ n\;(x,\mathit{ys}) & = \begin{array}[t]{@{}l} \mathbf{if}\; x\ge l \lor \mathit{null}\;\mathit{ys} \;\mathbf{then}\; (x,\mathit{ys}) \;\mathbf{else} \\ \mathbf{let}\; (y:\mathit{ys}') = \mathit{ys} \;\mathbf{in}\; n\;(\mathit{inject}\;x\;y, \mathit{ys}') \end{array} \\ f\;(x,[\,]) y & = n\;(x,[y]) \\ f\;(x,\mathit{ys})\;y & = (x, \mathit{ys} \mathbin{{+}\!\!\!{+}} [y]) \end{array} \end{array}

Note that the two occurrences of {\mathit{reverse}} can be taken to mean that encoding and decoding both process their input from last to first. The remainder acts as a queue, with digits being added at one end and removed from the other, so should be represented to make that efficient. But in fact, the remainder can be eliminated altogether, yielding simply the window for the accumulator—for encoding:

\displaystyle  \begin{array}{@{}l} \mathit{encode} :: [\mathit{Symbol}] \rightarrow [\mathit{Integer}] \\ \mathit{encode} = h_1\;l \cdot \mathit{reverse} \qquad \mathbf{where} \\ \qquad \begin{array}[t]{@{}ll} h_1\;x\;(s:\mathit{ss}') & = \begin{array}[t]{@{}ll} \mathbf{if} & x < b \times (l \mathbin{\underline{\smash{\mathit{div}}}} \mathit{total}) \times \mathit{count}\;s \\ \mathbf{then} & \mathbf{let}\; (q,r) = \mathit{divMod}\;x\;(\mathit{count}\;s) \;\mathbf{in}\; h_1\;(q \times total + \mathit{cumul}\;s + r)\;\mathit{ss}' \\ \mathbf{else} & \mathbf{let}\; (q,r) = \mathit{extract}\;x \;\mathbf{in}\; r : h_1\;q\;(s:\mathit{ss}') \end{array} \\ h_1\;x\;[\,] & = h_2\;x \\ h_2\;x & = \mathbf{if}\; x==0 \;\mathbf{then}\; [\,] \;\mathbf{else}\;\mathbf{let}\; (q,r) = \mathit{extract}\;x \;\mathbf{in}\; r : h_2\;q \end{array} \end{array}

and for decoding:

\displaystyle  \begin{array}{@{}l} \mathit{decode} :: [\mathit{Integer}] \rightarrow [\mathit{Symbol}] \\ \mathit{decode} = h_0\;0 \cdot \mathit{reverse} \qquad \mathbf{where} \\ \qquad \begin{array}{@{}lll} h_0\;x\;(y:\mathit{ys}) & \mid x < l & = h_0\;(\mathit{inject}\;x\;y)\;\mathit{ys} \\ h_0\;x\;\mathit{ys} && = h_1\;x\;\mathit{ys} \\ h_1\;x\;\mathit{ys} && = \begin{array}[t]{@{}ll} \mathbf{let} & \begin{array}[t]{@{}ll} (q,r) & = \mathit{divMod}\;x\;\mathit{total} \\ s & = \mathit{find}\;r \\ x' & = \mathit{count}\;s \times q + r - \mathit{cumul}\;s \end{array} \\ \mathbf{in} & h_2\;s\;x'\;\mathit{ys} \end{array} \\ h_2\;s\;x\;(y:\mathit{ys}) & \mid x < l & = h_2\;s\;(\mathit{inject}\;x\;y)\;\mathit{ys} \\ h_2\;s\;x\;\mathit{ys} & \mid x \ge l & = s : h_1\;x\;\mathit{ys} \\ h_2\;s\;x\;[\,] && = [\,] \end{array} \end{array}

This gives rather simple programs, corresponding to those in Duda’s paper; however, the calculation of this last version from the previous steps currently eludes me.

by jeremygibbons at August 07, 2018 11:45 AM

August 06, 2018


HyperHaskell — Release of version

It is my pleasure to announce the release of

— the strongly hyped Haskell interpreter —

HyperHaskell is a graphical Haskell interpreter (REPL), not unlike GHCi, but hopefully more awesome. You use worksheets to enter expressions and evaluate them. Results are displayed using HTML.

HyperHaskell is intended to be easy to install. It is cross-platform and should run on Linux, Mac and Windows. That said, the binary distribution is currently only built for Mac. Help is appreciated!

HyperHaskell’s main attraction is a Display class that supersedes the good old Show class. The result looks like this:

Version supports HTML output, binding variables, text cells and GHC language extensions.

For more on HyperHaskell, see its project page on Github.

(EDITED 07 Aug 2018): Related projects that I know of:

Unfortunately, IHaskell has a reputation for being difficult to install, since it depends on Jupyter. Also, I don’t think the integration of Jupyter with the local Desktop environment and file system is particularly great. Haskell for Mac is partially proprietary and Mac only. Hence this new effort.

August 06, 2018 04:07 PM

Lysxia's blog

August 04, 2018


Binary interface files with Shimmer

Over the last couple of months I've merged three big changes that continue to move DDC in the direction of a real compiler rather than a research experiment:
  1. Support for binary interface files via the shimmer project.
  2. Reworked the type checker and compiler driver to only require interface files for directly imported modules, rather than all transitively imported modules.
  3. Rewrote the LLVM code printer to write directly to file, instead of going via a pretty printer.
Shimmer:Shimmer is a scheme-like language intended as a shim between compiler stages and intermediate languages. It provides a s-expression syntax and untyped functional intermediate language in which higher level languages can be encoded. It also has a native binary format, as well as its own REPL for testing. Shimmer is available as a hackage package.

DDC interface files are now serialised as binary shimmer files, using the grammar described on the Discus homepage. These files can be dumped in human readable (sort-of) form using the shimmer CLI that comes with the hackage package. For example, once you've build the DDC base libraries you can do:

$ shimmer -load src/s2/base/Data/List.di
+m-name = %module-name "Data" "List";

+m-deps = %l (%module-name "Class" "Applicative") ...


= %d-alg (%module-name "Class" "Applicative") "Applicative"
(%l (%bn "f" (%ta %tkf %tkd %tkd)))
(%s (%l (%ctor #nat'0 (%module-name "Class" "Applicative") "Applicative"
(%tu "Functor" "f")
(%tl (%bn "a" %tkd) (%tf "a" (%ta "f" "a")))
(%tl (%bn "a" %tkd) (%tl (%bn "b" %tkd) (%tf (%ta "f" (%tf "a" "b")) (%ta "f" "a") (%ta "f" "b"))))
(%tu "Applicative" "f"))));

= %tl (%bn "a" %tkd) (%tl (%bn "b" %tkd)
(%tf (%tf "a" (%tu "Maybe" "b")) (%tu "List" "a") (%tu "List" "b")));

= %xb (%mtn "a" %tkd) (%mtn "b" %tkd)
(%mxn "f" (%tf "a" (%tu "Maybe" "b")))
(%mxn "xx" (%tu "List" "a"))
(%xc "xx" (%ab (%dcn %n %n (%dc "Nil")) (%xa (%dcn %n %n (%dc "Nil")) (%rt "b")))
(%ab (%dcn %n %n (%dc "Cons")) (%bn "x" "a") (%bn "xs" (%tu "List" "a"))
(%xc (%xa "f" "x") (%ab (%dcn %n %n (%dc "Nothing")) (%xa "mapMaybe" (%rt "a") (%rt "b") "f" "xs"))
(%ab (%dcn %n %n (%dc "Just")) (%bn "x'" "b")
(%xa (%dcn %n %n (%dc "Cons")) (%rt "b") "x'"
(%xa "mapMaybe" (%rt "a") (%rt "b") "f" "xs"))))));

This is a simple encoding of the AST of the DDC Core intermediate language. For example @d-Applicative gives the data type declaration for an Applicative type class dictionary. @t-mapMaybe encodes the type of the mapMaybe function, and @x-mapMaybe encodes the declaration. %tl means "forall", %xb is a lambda binder, %xc a case expression, %xa an application, and so on. See the grammar for the rest.

The binary format is a not-completely-horrible packed representation, where I've made an effort to avoid redundant encoding, but haven't done any compression or indexing yet. The main feature is that serialisation and deserialisation are implemented by simply peeking and poking from memory, so performance should be reasonable. The binary grammar is defined in the hackage package.

Q. Why the name Shimmer?
A. Planner, Conniver, Schemer, Shimmer.

Did you know that Scheme used to be called "Schemer", but that Guy Steele's computer at the time only supported 6 character file names? Shimmer is a Scheme-like language, but without the Scheme.

No more transitive imports: Previously, when type checking a module, DDC needed to load the textual interface files for all transitively reachable modules, which in most cases was the entire base library. This approach works fine for a toy language and a minimal base library, but implies that the more useful the base library becomes, the slower compilation gets. In the current version, when some module A re-exports declarations for module B, those declarations are also added to module A's interface file. In general this means that when compiling a given module we only need to load interface files for directly imported modules, not all modules that are transitively reachable. When doing this change I also added in-memory indexes for the various sorts of declarations (data types, type synonyms, value declarations etc) to speed up name resolution. This framework should also support the addition of on-disk database style indexes to shimmer files. We'll need this when the base library grows large enough that deserialisation of interface files becomes a problem again.

Direct write of LLVM code: Haskell programmers love their pretty printer libraries. I had been using wl-pprint to pretty print LLVM code before calling the LLVM compiler to convert them to binary object files. For small output files intended for human consumption, wl-pprint is great. However, if one just wants to dump some LLVM code to feed to the LLVM compiler, then the time to construct and then consume the intermediate Doc type, as well as the overhead of Haskell Strings becomes an unreasonable overhead. Today I rewrote the LLVM generator to just write directly to a file handle -- an "ugly-printer", if you will. Doing that yielded a ~10% reduction in overall compilation time.

These three changes reduced compile time for the ddc-demo tree to about 35% of what it was previously, or about a 2.5x speedup. There are more low hanging fruits, but I've got enough juice for now, so am planning to return to higher level concerns in the immediate future. The DDC reflection system is also starting to work, but that will be the topic of a subsequent blog post.

by Ben Lippmeier ( at August 04, 2018 09:47 AM

August 03, 2018

Douglas M. Auclair (geophf)

July 2018 1HaskellADay Problems and Solutions

by geophf ( at August 03, 2018 04:16 AM

August 01, 2018

Roman Cheplyaka

Probability of a regex occurrence

Given a regular expression \(R\) and a random string \(L\) of length \(n\), what is the probability that \(R\) matches somewhere in \(L\)?

This seems like an intimidating problem at first, but I was surprised how easy it becomes after a couple of insights.

The first insight is to turn a regex into a DFA. This moves us from “how can I possibly know whether a regex matches?” to “feed the string to the automaton and see if it ends up in an accepting state”.

Still, the DFA can be anything at all — how do we know the probability of ending up in an accepting state? The second insight is to ascribe probabilities to the edges (transitions) of the DFA and observe that this turns it into a Markov chain1.

Why is that helpful? Because a Markov chain can be represented by its transition matrix \(P\), and the probability of getting from any state \(a\) to any state \(b\) can be computed in logarithmic time by computing \(P^n_{ab}\), the \(a,b\)-th element of the \(n\)-th power of the matrix \(P\).

So how do we assign the probabilities to DFA transitions? If all characters are equally likely, \(P_{ab}\) is simply the number of characters that move the DFA from state \(a\) to state \(b\) divided by the number of all possible characters. More generally, if every letter \(l\) has a fixed probability \(p_l\), we compute \[ P_{ab}=\sum_{a\overset{l}{\rightarrow} b}p_l, \] where the summation is over all characters \(l\) that lead from \(a\) to \(b\).

And to get the probability of the regex \(R\) matching anywhere inside the string as opposed to matching the whole string, replace \(R\) with \({.}{*}R{.}{*}\) before compiling it to a DFA. Then calculate

\[ \sum_{f\in F} P^n_{sf}, \] where \(s\) is the start state of the DFA, and \(F\) is the set of accepting states.

Here’s a simple implementation of this algorithm. (The library code supports full regular expressions over arbitrary finite alphabets, but the command line tool only accepts simple IUPAC patterns over the DNA alphabet.)


There is nothing new or original here — specialists know how to solve this and even more involved problems. For these more involved problems, generating functions are typically used. It’s nice, however, that for this specific case, simple matrix exponentiation suffices.

For example, Nicodème, Salvy and Flajolet (2002) show how to use generating functions to calculate the probability distribution of the number of occurrences of a regex, both overlapping and non-overlapping, as well as the mean and variance of that number. They also derive the asymptotics and handle (first-order) Markov dependencies between the consequtive positions in \(L\). But for our task — the probability of a regex occurring at least once — their method essentially reduces to the one described here.

Régnier (2000) also uses generating functions but explicitly models possible overlaps between patterns, which allows her to analyze regular expressions more efficiently than Nicodème et al. (see page 20).

Filion 2017 is a very accessible introduction to the paradigm of transfer graphs, which are a bridge between DFAs, transition matrices, and generating functions.

<section class="footnotes">
  1. Here I assume that different positions in \(L\) are independent.


August 01, 2018 08:00 PM

Functional Jobs

Scala developer for NetLogo at Northwestern University (Full-time)

Job Summary

The CCL Lab at Northwestern University is looking for a full-time Scala/Java Software Developer to work on the NetLogo desktop application, a celebrated modeling environment used in both education and research.

This Software Developer position is based at Northwestern University's Center for Connected Learning and Computer-Based Modeling (CCL), working in a small collaborative development team in a university research group that also includes professors, postdocs, graduate students, and undergraduates, supporting the needs of multiple research projects. A major focus would be on development of NetLogo, an open-source modeling environment for both education and scientific research. CCL grants also involve development work on HubNet and other associated tools for NetLogo, including research and educational NSF grants involving building NetLogo-based science curricula for schools.

NetLogo is a programming language and agent-based modeling environment. The NetLogo language is a dialect of Logo/Lisp specialized for building agent-based simulations of natural and social phenomena. NetLogo has many tens of thousands of users ranging from grade school students to advanced researchers. A collaborative extension of NetLogo, called HubNet, enables groups of participants to run participatory simulation activities in classrooms and distributed participatory simulations in social science research. NetLogo can also be extended to interface with external software such as R, Mathematica and Python.

Application information:

The Northwestern campus is in Evanston, Illinois on the Lake Michigan shore, adjacent to Chicago and easily reachable by public transportation.

Specific Responsibilities:

  • Collaborates with the NetLogo development team in designing features for NetLogo and other related software;
  • Writes Scala/Java code independently, and in the context of a team of experienced software engineers and principal investigator;
  • Interacts with commercial and academic partners to help determine design and functional requirements for NetLogo and other related software;
  • Interacts with user community including responding to bug reports, questions, and suggestions, and interacting with open-source contributors;
  • Mentors undergraduate student workers and guide Google Summer of Code participants on contributions to the NetLogo codebase; assists graduate students with issues encountered during their work;
  • Performs data collection, organization, and summarization for projects; assists with coordination of team activities;
  • Participates in the design of web-based applications, including NetLogo Web.
  • Performs other duties as required or assigned.

Minimum Qualifications:

  • A bachelor's degree in computer science or a closely related field or the equivalent combination of education, training and experience from which comparable skills and abilities may be acquired;
  • Demonstrated experience and enthusiasm for writing clean, modular, well-tested code.

Preferred Qualifications:

  • Experience with working effectively as part of a small software development team, including close collaboration, distributed version control, and automated testing;
  • Experience with at least one JVM language, Scala strongly preferred;
  • Experience developing GUI applications, especially Java Swing-based applications;
  • Experience with programming language design and implementation, functional programming (especially Haskell or Lisp), and compilers;
  • Interest in and experience with computer-based modeling and simulation, especially agent-based simulation;
  • Interest in and experience with distributed, multiplayer, networked systems like HubNet;
  • Experience working on research projects in an academic environment;
  • Experience with open-source software development and supporting the growth of an open-source community;
  • Experience with Linux/Unix system administration;
  • Experience with building web-based applications, both server-side and client-side components, particularly with html5 and JavaScript and/or CoffeeScript;
  • Interest in education and an understanding of secondary school math and science content.

As per Northwestern University policy, this position requires a criminal background check. Successful applicants will need to submit to a criminal background check prior to employment.

Northwestern University is an Equal Opportunity, Affirmative Action Employer of all protected classes including veterans and individuals with disabilities.

Get information on how to apply for this position.

August 01, 2018 07:38 PM

FP Complete

Pantry, part 3: Specifying Dependencies

This is part three of a series of blog posts on Pantry, a new storage and download system for Haskell packages. You can see part 1 and part 2.

by Michael Snoyman ( at August 01, 2018 05:28 PM

July 30, 2018

Joachim Breitner

The merits of a yellow-red phase

In my yesterday post on Continuity Lines in North America I mentioned in passing that I am big fan of German1 traffic lights, which have a brief yellow-red phase between the red and and the green phase (3 to 5 seconds, depending on road speed). A German reader of my blog was surprised by this, said that he considers it pointless, and would like to hear my reasons. So where we go…

Life without Yellow-Red

Lights that switch directly from red to green cause more stress. Picture yourself at the first car waiting at a traffic light, with a bunch of cars behind you. Can you relax, maybe observe the cars passing in front of you, switch the radio station, or simply look somewhere else for a moment? Well, you can, but you risk missing how the light switches from red to green. When your look at the traffic light again and see it bright green, you have no idea how long it has been on green. Hyper-aware of all the cars behind you waiting to get going, you’ll rush to get started, and if you don’t do that really fast now, surely one of the people waiting behind you will have honked.

So at the next intersection, you better don’t take your eyes off the light – or, alternatively, you develop a screw-you-I-don’t-care-attitude towards the other cars, which might allow you to relax in this situation, but is in general not desirable.

Maybe this is less of a problem on the West Coast, where supposedly everybody is nice, relaxed, and patient, and you can take all the time you want to get rolling. But in the stereotypical East Coast traffic full of angry, impatient and antisocial drivers, you really don’t get away with that.

Life with Yellow-Red

The yellow-red phase solves this problem elegantly: As long as the light is red, you don’t have to watch the light constantly and with maximum attention. You can relax: it suffices to check it often enough to catch the red-yellow phase, once every two or three seconds. When you see it, you get ready to start; and when it actually turns green, you start on time.

I would expect that it is not only less stressful, it is also more efficient: Because every car in the lane has the heads-up warning “green in a moment”, the cars can start rolling in quicker succession. Without the warning, every car to account much more for the possible slower reaction of the car before.

PS: A friend of mine wonders whether the German Yellow-Red is needed because cars with manual transmissions are much more common in Germany than in the US, and you need more time to, well, get into gear with these cars.

  1. Also Great Britain, Croatia, Latvia, Norway, Austria, Poland, Russia, Saudi-Arabia, Sweden, Switzerland, Hungary and others.

by Joachim Breitner ( at July 30, 2018 09:41 PM

Continuity Lines

I am currently on a road trip going from Philly going North along the Atlantic Coast into Canada, and aiming for Nova Scotia. When I passed from the US into Canada, I made had an unexpected emotional response to the highways there: I felt more at home!

And I believe it has to do with the pavement markings on US vs. Canadian freeways.

Consider this image, taken from the Manual on Uniform Traffic Control Devices of the United States, an official document describing (among other things) how the pavement of a freeway ought to be paved:

Parallel Deceleration Lane for Exit Ramp

Parallel Deceleration Lane for Exit Ramp

This is a typical exit ramp from the freeway. On ramps and cloverleaf ramps look similar. Note that the right-most lane goes somewhere else than the left and the middle lane, yet the lanes look completely identical. In particular, the lines between the lanes are shaped the same!

Now, for comparison, the corresponding image in a Canadian manual, namely the Manual of Standard Traffic Signs & Pavement Markings for British Columbia:

Canadian off-ramps

Canadian off-ramps

Here, there are different lines between the different lanes: normal lane lines left, but a so-called continuity line, with a distinctly different pattern, between the normal lanes and the exit lane. It’s like in Germany!

With this is mind I understand one reason1 why driving in the US2 noticeably more stressful: There is just always anxiety whether you are accidentally in an exit lane!

Update (2018-07-30): AS Paul Johnson pointed out (see comment below), I was looking at an old version of the MUTCD. The current version, from 2009, has these lines:

Current US off-ramps

Current US off-ramps

They have published a (very long) document describing the changes in the new version of the manual , and Section 306 describes the rationale:

[..] the FHWA adopts language to clarify that dotted lane lines, rather than broken lane lines, are to be used for non-continuing lanes, including acceleration lanes, deceleration lanes, and auxiliary lanes. [..] a number of States and other jurisdictions currently follow this practice, which is also the standard practice in Europe and most other developed countries. The FHWA believes that the existing use of a normal broken lane line for these non- continuing lanes does not adequately inform road users of the lack of lane continuity ahead and that the standardized use of dotted lane lines for non-continuing lanes as adopted in this final rule will better serve this important purpose in enhancing safety and uniformity.

So all is well! But it means that either Pennsylvania was slower than allowed in implementing these changes (the deadline was December 2016), or it was something else alltogether that made me feel more at home on the Canadian freeway.

  1. I say “one reason”, not “the reason”, because there are many more – “Rechtsfahrgebot”, no red-and-yellow-phase in the traffic light, Pennsylvanian road quality…

  2. Supposedly, Pennsylvania is particularly bad with roads in general, but also with this particular problem, and California has exit lanes clearly separated. But, of course, because this is the US, not using the same patter as the others (Canada, Europe), but using spaced dots…

by Joachim Breitner ( at July 30, 2018 12:25 AM

July 28, 2018

Joachim Breitner

Build tool semantic aware build systems

I just had a lovely visit at Ben Gamari and Laura Dietz, and at the breakfast table we had an idea that I want to record here.

The problem

Laura and Ben talked about the struggles they had using build systems like make or Nix in data science applications. A build system like nix is designed around the idea that builds are relatively cheap, and that any change in a dependency ought to trigger a rebuild, just to be sure that all build outputs are up-to-date. This is a perfectly valid assumption when building software, where build times are usually minutes, or maybe hours. But when some of the data processing takes days, then you really really want to avoid any unnecessary builds.

One way of avoiding unnecessary rebuilds that is supported by build systems like shake and (I presume, maybe with some extra work) Nix, is to do output hashing: If one build stage has its input changed and need to be re-run, but its output is actually unchanged, then later build stages do not have to be re-run. Great!

But even with this feature in place, there one common problem remains: Build tools! Your multi-gigabyte data is going to be processed by some code you write. What if the code changes? Clearly, if you change the algorithm, or fix a bug in your code, you want the output to be re-run. But if you just change some strings in the --help flag, or update some libraries, or do something else that does not change the program in a way that is significant for the task at hand, wouldn’t you prefer to not pay for that by futile multi-day data processing step?

Existing approaches

There are various ways you can tackle this these days:

  • You bite the bullet, add the build tool as a dependency of the processing step. You get the assurance that your data always reflects the output of the latest version of your tool, but you get lots of rebuilds.

  • You don’t track the tool as part of your build system. It is now completely up to you to decide when the tool has changed in significant ways. When you think it has, you throw away all build artifacts and start from scratch. Very crude, and easy to get wrong.

  • You keep track of a “build tool version”, e.g. a text file with a number, that you depend on in lieu of the build tool itself. When you make a change that you think is significant, you bump that version number. This is similar to the previous approach, but more precise: Only the build outputs that used this particular tools will be invalidated, and it also scales better to working in a team. But of course, you can still easily get it wrong.

Build tool semantics

Why are all these approaches so unsatisfying? Because none allow you to express the intent, which is to say “this build step depends on the semantics (i.e. behaviour) of the build tool”. If we could somehow specify that, then all would be great: Build tool changes, but its semantics is unaffected? no rebuild. Build tool changes, semantics change? rebuild.

This ideal is not reachable (blah blah halting problem blah blah) – but I believe we can approximate it. At least if good practices were used and the tool has a test suite!

Assume for now that the tool is a simple patch-processing tool, i.e. takes some input and produces some output. Assume further that there is a test suite with a small but representative set of example inputs, or maybe some randomly generated inputs (using a fixed seed). If the test suite is good, then the (hash of) the output on the test suite example is an approximation of the programs semantic.

And the build system can use this “semantics hash”. If the build tool changes, the build system can re-run the test suite and compare the output with the previous run. If they change, then the tools seems to have changed in significant ways, and it needs to re-run the data processing. But if the test suite outputs are unchanged, then it can assume that the behaviour of the tool has not changed, re-use the existing data outputs.

That is the core idea! A few additional thoughts, though:

  • What if the test suite changes? If the above idea is implemented naively, then adding a test case to the test suite will change the semantics, and re-build everything. That would be horrible in terms of incentives! So instead, the build systems should keep the old version of the build tool around, re-calculate its semantics hash based on the new test suite, and then compare that. This way, extending the test suite does not cause recompilation.

Hash-and-store-based build systems like Nix should have no problem keeping the previous version of the build tool around as long there is output that depends on it.

  • A consequence of this approach is that if you find and fix a bug in your tool that is not covered by the existing test suite, then you absolutely have to add a test for that to your test suite, otherwise the bug fix will not actually make it to your data output. This might be the biggest downside of the approach, but at least it sets incentives right in that it makes you maintain a good regression test suite.

  • What if things go wrong, the test suite is incomplete and the build system re-uses build output because it wrongly assumes that two versions of the build tool have the same semantics?

If done right, this can be detected and fixed: The build system needs to record which tool version (and not just which “semantics hash”) was used to produce a particular output. So once the test suite uncovers the difference, the build systems will no longer consider the two versions equivalent and – after the fact – invalidate the re-used of the previous build output, and re-build what needs to be rebuild

I am curious to hear if anybody has played with these or similar ideas before? How did it go? Would it be possible to implement this in Nix? In Shake? Other systems? Tell me your thoughts!

by Joachim Breitner ( at July 28, 2018 03:03 PM

July 22, 2018

Joachim Breitner

WebGL, Fragment Shader, GHCJS and reflex-dom

What a potpourri of topics... too long to read? Click here!

On the side and very slowly I am working on a little game that involves breeding spherical patterns… more on that later (maybe). I want to implement it in Haskell, but have it run in the browser, so I reached for GHCJS, the Haskell-to-Javascript compiler.

WebGL for 2D images

A crucial question was: How do I draw a generative pattern onto a HTML canvas element. My first attempt was to calculate the pixel data into a bit array and use putImageData() to push it onto the canvas, but it was prohibitively slow. I might have done something stupid along the way, and some optimization might have helped, but I figured that I should not myself calculate the colors of each pixel, but leave this to who is best at it: The browser and (ideally) the graphic card.

So I took this as an opportunity to learn about WebGL, in particular fragment shaders. The term shader is misleading, and should mentally be replaced with “program”, because it is no longer (just) about shading. WebGL is intended to do 3D graphics, and one sends a bunch of coordinates for triangles, a vertex shader and a fragment shader to the browser. The vertex shader can places the vertices, while the fragment shader colors each pixel on the visible triangles. This is a gross oversimplification, but that is fine: We only really care about the last step, and if our coordinates always just define a rectangle that fills the whole canvas, and the vertex shader does not do anything interesting, then what remains is a HTML canvas that takes a program (written in the GL shader language), which is run for each pixel and calculates the color to be shown at that pixel.

Perfect! Just what I need. Dynamically creating a program that renders the pattern I want to show is squarely within Haskell’s strengths.

A reflex-dom widget

As my game UI grows, I will at some point no longer want to deal with raw DOM access, events etc., and the abstraction that makes creating such user interfaces painless is Functional Reactive Programming (FRP). One of the main mature implementations is Ryan Trinkle's reflex-dom, and I want to use this project to get more hands-on experience with it.

Based on my description above, once I hide all the details of the WebGL canvas setup, what I really have is a widget that takes a text string (representing the fragment shader), and creates a DOM element for it. This would suggest a function with this type signature

fragmentShaderCanvas ::
    MonadWidget t m =>
    Dynamic t Text ->
    m ()

where the input text is dynamic, meaning it can change over time (and the canvas will be updated) accordingly. In fact, I also want to specify attributes for the canvas (especially width and height), and if the supplied fragment shader source is invalid and does not compile, I want to get my hands on error messages, as provided by the browser. So I ended up with this:

fragmentShaderCanvas ::
    MonadWidget t m =>
    Map Text Text ->
    Dynamic t Text ->
    m (Dynamic t (Maybe Text))

which very pleasingly hides all the complexity of setting up the WebGL context from the user. This is abstraction at excellence!

I published this widget in the package on Hackage.

A Demo

And because reflex-dom make it so nice, I created a little demo program; it is essentially a fragment shader playground!

On you will find a text area where you can edit the fragment shader code. All your changes are immediately reflected in the canvas on the right, and in the list of warnings and errors below the text area. The code for this demo is pretty short.

A few things could be improved, of course: For example, the canvas element should have its resolution automatically adjusted to the actual size on screen, but it is somewhat tricky to find out when and if a DOM element has changed size. Also, the WebGL setup should be rewritten to be more defensively, and fail more gracefully if things go wrong.

BTW, if you need a proper shader playground, check out Shadertoy.

Development and automatic deployment

The reflex authors all use Nix as their development environment, and if you want to use reflex-dom, then using Nix is certainly the path of least resistance. But I would like to point out that it is not a necessity, and you can stay squarely in cabal land if you want:

  • You don’t actually need ghcjs to develop your web application: reflex-dom builds on jsaddle which has a mode where you build your program using normal GHC, and it runs a web server that your browser connects to. It works better with Chrome than with Firefox at the moment, but is totally adequate to develop a program.

  • If you do want to install ghcjs, then it is actually relatively easily: The README on the ghc-8.2 branch of GHCJS tells you how to build and install GHCJS with cabal new-build.

  • cabal itself supports ghcjs just like ghc! Just pass --ghcjs -w ghcjs to it.

  • Because few people use ghcjs and reflex with cabal some important packages (ghcjs-base, reflex, reflex-dom) are not on Hackage, or only with old versions. You can point cabal to local checkouts using a cabal.project file or even directly to the git repositories. But it is simpler to just use a Hackage overlay that I created with these three packages, until they are uploaded to Hackage.

  • If the application you create is a pure client-based program and could therefore be hosted on any static web host, wouldn’t it be nice if you could just have it appear somewhere in the internet whenever you push to your project? Even that is possible, as I describe in an example repository!

It uses Travis CI to build GHCJS and the dependencies, caches them, builds your program and – if successful – uploads the result to GitHub Pages. In fact, the demo linked above is produced using that. Just push, and 5 minutes later the changes available online!

I know about rumors that Herbert’s excellent multi-GHC PPA repository might provide .deb packages with GHCJS prebuilt soon. Once that happens, and maybe ghcjs-base and reflex get uploaded to Hackage, then the power of reflex-based web development will be conveniently available to all Haskell developers (even those who shunned Nix so far), and I am looking forward to many cool projects coming out of that.

by Joachim Breitner ( at July 22, 2018 02:41 PM

July 19, 2018

Luke Palmer

Playground Programming

Every now and then, I have a small idea about a development environment feature I’d like to see. At that point, I usually say to myself, “make a prototype to see if I like it and/or show the world by example”, and start putting pressure on myself to make it. But of course, there are so many ideas and so little time, and at the end of the day, instead of producing a prototype, I manage just to produce some guilt.

This time, I’m just going to share my idea without any self-obligation to make it.

I’m working on Chrome’s build system at Google. We are switching the build scripts to a new system which uses an ingenious testing system that I’ve never seen before (though it seems like the kind of thing that would be well-known). For each build script, we have a few test inputs to run it on. The tests run all of our scripts on all of their test inputs, but rather than running the commands, they simply record the commands that would have been run into “test expectation” files, which we then check into source control.

Checking in these auto-generated files is the ingenious part. Now, when we want to change or refactor anything about the system, we simply make the change, regenerate the expectations, and do a git diff. This will tell us what the effects of our change are. If it’s supposed to be a refactor, then there should be no expectation diffs. If it’s supposed to change something, we can look through the diffs and make sure that it changed exactly what it was supposed to. These expectation files are a form of specification, except they live at the opposite end of the development chain.

This fits in nicely with a Haskell development flow that I often use. The way it usually goes: I write a function, get it to compile cleanly, then I go to ghci and try it on a few conspicuous inputs. Does it work with an empty list? What about an infinite list (and I trim the output if the output is also infinite to sanity check). I give it enough examples that I have a pretty high confidence that it’s correct. Then I move on, assuming it’s correct, and do the same with the next function.

I really enjoy this way of working. It’s “hands on”.

What if my environment recorded my playground session, so that whenever I changed a function, I could see its impact? It would mark the specific cases that changed, so that I could make sure I’m not regressing. It’s almost the same as unit tests, but with a friendlier workflow and less overhead (reading rather than writing). Maybe a little less intentional and therefore a little more error-prone, but it would be less error-prone than the regression testing strategy I currently use for my small projects (read: none).

It’s bothersome to me that this is hard to make. It seems like such a simple idea. Maybe it is easy and I’m just missing something.

by Luke at July 19, 2018 01:46 AM

July 16, 2018

Ken T Takusagawa

[xjzqtjzg] Top-heavy perfect powers

Below are a list of perfect powers b^n (b<=n) in increasing order, up to a googol.  There are 2413 values.  The list omits bases which are perfect powers themselves, e.g., 4^n, 8^n, 9^n.

Here is Haskell source code to compute the list.  We used functions from the data-ordlist package to merge an infinite list of ordered infinite lists into a single ordered list.  This seems to be a nontrivial function to write, having written it a few times ((1) and (2)) before discovering Data.List.Ordered.  (Incidentally, data-ordlist is not indexed by Hoogle version 4.  It is indexed by the alpha version (Hoogle 5), but Hoogle 5 cannot search by type Ord a => [[a]] -> [a].)  We also used Data.List.Ordered.minus to compute the set difference of two infinite lists.

The motivation was to investigate increasing the breadth of the Cunningham project, which currently factors integers b^n +/- 1 with b restricted to {2, 3, 5, 6, 7, 10, 11, 12}.  Within the original Cunningham range up to about 2^1200, there are 4357 perfect powers in the restricted set, but there would be 19256 in our expanded set.

What is the growth rate of this sequence?

2^2 2^3 2^4 3^3 2^5 2^6 3^4 2^7 3^5 2^8 2^9 3^6 2^10 2^11 3^7 5^5 2^12 3^8 2^13 5^6 2^14 3^9 2^15 6^6 3^10 2^16 5^7 2^17 3^11 2^18 6^7 5^8 2^19 3^12 7^7 2^20 3^13 6^8 5^9 2^21 2^22 3^14 7^8 2^23 5^10 6^9 3^15 2^24 2^25 7^9 3^16 5^11 6^10 2^26 3^17 2^27 5^12 2^28 7^10 6^11 3^18 2^29 2^30 3^19 5^13 7^11 2^31 6^12 3^20 2^32 5^14 2^33 10^10 3^21 6^13 7^12 2^34 5^15 3^22 2^35 2^36 6^14 3^23 7^13 10^11 2^37 5^16 2^38 3^24 11^11 6^15 2^39 7^14 5^17 3^25 10^12 2^40 2^41 3^26 6^16 11^12 5^18 2^42 7^15 3^27 2^43 12^12 10^13 6^17 2^44 5^19 3^28 7^16 11^13 2^45 3^29 2^46 5^20 10^14 6^18 12^13 2^47 3^30 7^17 2^48 13^13 11^14 5^21 2^49 6^19 3^31 10^15 2^50 12^14 7^18 3^32 2^51 5^22 6^20 13^14 11^15 2^52 3^33 2^53 10^16 14^14 7^19 5^23 12^15 3^34 2^54 6^21 2^55 11^16 3^35 13^15 5^24 2^56 7^20 10^17 6^22 2^57 3^36 14^15 12^16 2^58 5^25 15^15 3^37 11^17 7^21 2^59 13^16 6^23 10^18 2^60 3^38 5^26 14^16 12^17 2^61 7^22 3^39 2^62 6^24 11^18 15^16 5^27 13^17 2^63 10^19 3^40 2^64 12^18 7^23 6^25 14^17 3^41 2^65 5^28 11^19 2^66 15^17 10^20 3^42 13^18 2^67 6^26 5^29 7^24 2^68 12^19 3^43 14^18 2^69 11^20 17^17 5^30 3^44 10^21 6^27 2^70 7^25 13^19 15^18 2^71 3^45 12^20 5^31 2^72 14^19 6^28 11^21 3^46 7^26 2^73 10^22 17^18 2^74 13^20 15^19 5^32 3^47 6^29 2^75 18^18 12^21 7^27 2^76 3^48 11^22 14^20 10^23 5^33 2^77 6^30 17^19 3^49 13^21 2^78 15^20 7^28 12^22 5^34 2^79 18^19 3^50 11^23 10^24 14^21 2^80 6^31 19^19 3^51 2^81 5^35 13^22 7^29 17^20 2^82 15^21 3^52 12^23 6^32 2^83 11^24 10^25 18^20 5^36 14^22 2^84 3^53 7^30 19^20 2^85 13^23 6^33 3^54 17^21 5^37 15^22 2^86 12^24 10^26 20^20 11^25 2^87 7^31 3^55 18^21 14^23 6^34 2^88 5^38 3^56 13^24 2^89 19^21 12^25 10^27 7^32 15^23 17^22 11^26 2^90 3^57 6^35 5^39 20^21 2^91 14^24 18^22 3^58 2^92 21^21 13^25 7^33 5^40 2^93 10^28 6^36 12^26 11^27 19^22 3^59 15^24 2^94 17^23 2^95 20^22 3^60 14^25 5^41 7^34 6^37 18^23 2^96 13^26 10^29 21^22 3^61 12^27 11^28 2^97 5^42 15^25 19^23 2^98 17^24 22^22 6^38 7^35 3^62 14^26 2^99 20^23 10^30 5^43 3^63 13^27 2^100 18^24 11^29 12^28 6^39 2^101 21^23 7^36 3^64 15^26 19^24 2^102 5^44 17^25 22^23 14^27 10^31 2^103 3^65 6^40 13^28 20^24 11^30 7^37 12^29 2^104 23^23 18^25 5^45 3^66 2^105 21^24 15^27 6^41 2^106 3^67 19^25 17^26 10^32 14^28 7^38 5^46 2^107 22^24 11^31 13^29 12^30 3^68 2^108 20^25 18^26 23^24 6^42 2^109 5^47 3^69 15^28 7^39 10^33 21^25 2^110 24^24 17^27 14^29 19^26 11^32 3^70 2^111 13^30 12^31 6^43 5^48 22^25 2^112 7^40 20^26 3^71 18^27 10^34 2^113 23^25 15^29 6^44 5^49 2^114 3^72 11^33 21^26 14^30 17^28 24^25 19^27 13^31 12^32 2^115 7^41 3^73 22^26 2^116 5^50 10^35 6^45 20^27 18^28 2^117 15^30 3^74 23^26 11^34 7^42 2^118 14^31 12^33 13^32 5^51 17^29 21^27 3^75 6^46 19^28 2^119 24^26 10^36 2^120 22^27 3^76 7^43 5^52 18^29 2^121 20^28 11^35 15^31 6^47 14^32 12^34 2^122 3^77 13^33 23^27 26^26 17^30 10^37 21^28 2^123 5^53 19^29 7^44 3^78 24^27 2^124 6^48 11^36 22^28 2^125 15^32 18^30 3^79 20^29 5^54 12^35 14^33 13^34 2^126 10^38 7^45 23^28 6^49 17^31 3^80 26^27 2^127 21^29 19^30 5^55 11^37 2^128 24^28 3^81 15^33 2^129 12^36 7^46 6^50 18^31 22^29 14^34 13^35 10^39 20^30 3^82 2^130 5^56 17^32 2^131 23^29 11^38 3^83 26^28 19^31 21^30 6^51 7^47 2^132 5^57 12^37 15^34 10^40 24^29 2^133 3^84 13^36 14^35 18^32 22^30 20^31 2^134 6^52 28^28 5^58 3^85 7^48 17^33 11^39 2^135 23^30 19^32 2^136 21^31 10^41 12^38 3^86 26^29 15^35 13^37 5^59 2^137 6^53 14^36 24^30 7^49 18^33 3^87 2^138 22^31 20^32 11^40 17^34 2^139 5^60 28^29 3^88 10^42 6^54 12^39 2^140 19^33 23^31 7^50 21^32 13^38 15^36 14^37 29^29 2^141 26^30 3^89 5^61 18^34 11^41 2^142 24^31 6^55 20^33 3^90 22^32 10^43 2^143 17^35 7^51 12^40 5^62 2^144 28^30 3^91 13^39 19^34 15^37 14^38 23^32 6^56 21^33 2^145 11^42 26^31 29^30 3^92 18^35 7^52 2^146 10^44 5^63 24^32 20^34 12^41 2^147 17^36 22^33 30^30 6^57 3^93 2^148 13^40 15^38 14^39 5^64 19^35 11^43 7^53 3^94 2^149 28^31 23^33 21^34 10^45 6^58 2^150 18^36 26^32 12^42 3^95 29^31 5^65 2^151 17^37 20^35 24^33 7^54 22^34 13^41 2^152 30^31 3^96 11^44 14^40 15^39 6^59 10^46 19^36 2^153 5^66 31^31 21^35 3^97 23^34 28^32 2^154 12^43 18^37 7^55 2^155 6^60 26^33 17^38 3^98 13^42 29^32 5^67 20^36 11^45 24^34 2^156 22^35 14^41 10^47 15^40 3^99 2^157 30^32 19^37 7^56 6^61 12^44 5^68 2^158 21^36 23^35 18^38 3^100 31^32 28^33 2^159 13^43 11^46 17^39 10^48 26^34 14^42 20^37 2^160 7^57 3^101 15^41 5^69 6^62 29^33 24^35 22^36 2^161 12^45 19^38 3^102 30^33 2^162 21^37 5^70 11^47 18^39 10^49 13^44 7^58 23^36 6^63 2^163 3^103 28^34 31^33 17^40 14^43 2^164 15^42 20^38 26^35 3^104 5^71 12^46 22^37 2^165 24^36 29^34 6^64 7^59 19^39 2^166 11^48 10^50 3^105 33^33 13^45 18^40 30^34 21^38 2^167 5^72 23^37 14^44 17^41 15^43 2^168 3^106 6^65 28^35 7^60 31^34 12^47 20^39 2^169 26^36 10^51 22^38 5^73 11^49 3^107 24^37 19^40 2^170 29^35 13^46 6^66 18^41 2^171 3^108 7^61 21^39 14^45 33^34 17^42 30^35 5^74 23^38 15^44 2^172 12^48 10^52 3^109 20^40 11^50 34^34 2^173 28^36 6^67 31^35 26^37 22^39 13^47 2^174 7^62 5^75 19^41 24^38 3^110 29^36 2^175 18^42 14^46 12^49 21^40 17^43 6^68 15^45 3^111 2^176 10^53 23^39 11^51 5^76 33^35 30^36 7^63 2^177 20^41 3^112 13^48 28^37 2^178 34^35 31^36 6^69 22^40 19^42 26^38 5^77 24^39 14^47 2^179 3^113 12^50 18^43 10^54 35^35 7^64 15^46 29^37 17^44 11^52 2^180 21^41 3^114 23^40 6^70 2^181 5^78 13^49 20^42 30^37 33^36 2^182 3^115 7^65 19^43 28^38 10^55 14^48 12^51 22^41 2^183 34^36 31^37 26^39 11^53 24^40 5^79 18^44 6^71 15^47 3^116 17^45 2^184 21^42 29^38 35^36 2^185 13^50 7^66 3^117 23^41 5^80 20^43 2^186 10^56 6^72 12^52 30^38 14^49 33^37 11^54 19^44 2^187 3^118 22^42 28^39 15^48 18^45 24^41 2^188 26^40 17^46 5^81 7^67 34^37 31^38 3^119 6^73 13^51 21^43 2^189 10^57 29^39 35^37 23^42 2^190 12^53 20^44 3^120 11^55 14^50 5^82 7^68 2^191 19^45 6^74 30^39 15^49 33^38 22^43 3^121 18^46 2^192 17^47 28^40 13^52 24^42 10^58 26^41 5^83 37^37 2^193 31^39 21^44 34^38 3^122 12^54 7^69 11^56 6^75 2^194 14^51 29^40 20^45 23^43 35^38 3^123 2^195 5^84 15^50 19^46 18^47 10^59 2^196 13^53 17^48 22^44 30^40 6^76 7^70 3^124 33^39 2^197 28^41 24^43 12^55 11^57 5^85 26^42 21^45 37^38 14^52 2^198 3^125 31^40 34^39 20^46 2^199 23^44 6^77 29^41 15^51 10^60 7^71 38^38 19^47 5^86 3^126 13^54 2^200 35^39 18^48 17^49 11^58 22^45 12^56 2^201 30^41 3^127 6^78 24^44 33^40 14^53 28^42 2^202 5^87 21^46 26^43 7^72 10^61 3^128 2^203 31^41 20^47 15^52 37^39 34^40 13^55 23^45 19^48 2^204 29^42 11^59 6^79 18^49 5^88 12^57 17^50 3^129 38^39 7^73 2^205 22^46 35^40 14^54 10^62 2^206 3^130 30^42 39^39 24^45 21^47 5^89 28^43 6^80 26^44 33^41 2^207 15^53 13^56 20^48 11^60 3^131 7^74 12^58 2^208 31^42 23^46 19^49 37^40 17^51 18^50 34^41 29^43 5^90 2^209 3^132 10^63 6^81 14^55 22^47 38^40 2^210 35^41 7^75 3^133 21^48 24^46 13^57 15^54 30^43 2^211 11^61 5^91 39^40 12^59 26^45 28^44 20^49 33^42 6^82 2^212 3^134 19^50 17^52 10^64 23^47 18^51 40^40 2^213 31^43 14^56 7^76 37^41 5^92 34^42 29^44 3^135 2^214 22^48 11^62 6^83 13^58 15^55 2^215 12^60 38^41 21^49 35^42 24^47 3^136 30^44 10^65 5^93 2^216 20^50 7^77 26^46 28^45 17^53 19^51 39^41 18^52 33^43 2^217 14^57 23^48 6^84 3^137 11^63 31^44 2^218 40^41 5^94 13^59 22^49 29^45 12^61 3^138 34^43 15^56 37^42 7^78 2^219 10^66 21^50 41^41 6^85 2^220 24^48 3^139 38^42 20^51 35^43 5^95 17^54 30^45 14^58 19^52 26^47 2^221 18^53 28^46 11^64 23^49 7^79 3^140 33^44 39^42 2^222 13^60 12^62 6^86 10^67 15^57 5^96 31^45 22^50 2^223 29^46 3^141 40^42 34^44 2^224 37^43 21^51 7^80 14^59 24^49 20^52 17^55 11^65 6^87 2^225 41^42 3^142 19^53 18^54 5^97 26^48 38^43 35^44 30^46 13^61 12^63 10^68 28^47 2^226 23^50 42^42 15^58 3^143 33^45 2^227 39^43 7^81 22^51 6^88 5^98 31^46 2^228 3^144 11^66 29^47 21^52 14^60 40^43 17^56 34^45 2^229 20^53 10^69 37^44 24^50 18^55 19^54 13^62 12^64 3^145 5^99 2^230 6^89 7^82 26^49 41^43 15^59 30^47 23^51 28^48 35^45 38^44 2^231 3^146 11^67 42^43 22^52 2^232 33^46 5^100 14^61 10^70 39^44 6^90 21^53 31^47 17^57 3^147 2^233 7^83 12^65 13^63 29^48 43^43 20^54 18^56 19^55 24^51 2^234 34^46 40^44 15^60 37^45 5^101 3^148 2^235 26^50 23^52 6^91 11^68 30^48 28^49 41^44 7^84 10^71 35^46 2^236 14^62 38^45 3^149 22^53 12^66 13^64 5^102 2^237 17^58 33^47 21^54 42^44 18^57 20^55 3^150 31^48 6^92 39^45 19^56 2^238 29^49 15^61 24^52 7^85 11^69 43^44 2^239 34^47 5^103 10^72 3^151 40^45 37^46 26^51 23^53 14^63 2^240 12^67 44^44 28^50 6^93 30^49 13^65 22^54 3^152 2^241 35^47 41^45 17^59 38^46 7^86 5^104 21^55 18^58 2^242 20^56 33^48 19^57 11^70 15^62 3^153 10^73 42^45 31^49 29^50 6^94 2^243 24^53 39^46 14^64 12^68 5^105 2^244 3^154 43^45 34^48 13^66 7^87 23^54 26^52 40^46 37^47 2^245 28^51 17^60 22^55 30^50 6^95 11^71 3^155 44^45 10^74 21^56 2^246 18^59 5^106 15^63 35^48 20^57 19^58 41^46 38^47 2^247 7^88 45^45 33^49 3^156 12^69 14^65 24^54 31^50 29^51 13^67 2^248 42^46 6^96 39^47 5^107 23^55 3^157 2^249 11^72 26^53 10^75 34^49 17^61 43^46 22^56 7^89 28^52 2^250 15^64 37^48 40^47 18^60 30^51 21^57 3^158 19^59 20^58 6^97 5^108 12^70 2^251 44^46 14^66 35^49 13^68 41^47 38^48 2^252 3^159 24^55 33^50 10^76 11^73 29^52 45^46 7^90 31^51 2^253 5^109 23^56 6^98 17^62 42^47 3^160 39^48 26^54 15^65 2^254 46^46 22^57 18^61 34^50 12^71 21^58 28^53 19^60 20^59 2^255 43^47 14^67 30^52 3^161 37^49 13^69 5^110 40^48 7^91 10^77 6^99 11^74 2^256 35^50 44^47 24^56 3^162 2^257 38^49 41^48 33^51 29^53 17^63 31^52 5^111 23^57 15^66 2^258 45^47 12^72 7^92 3^163 6^100 26^55 18^62 22^58 42^48 14^68 39^49 2^259 13^70 10^78 19^61 21^59 20^60 11^75 34^51 28^54 46^47 3^164 2^260 5^112 30^53 43^48 37^50 40^49 2^261 47^47 6^101 7^93 24^57 3^165 35^51 17^64 12^73 15^67 2^262 44^48 33^52 29^54 23^58 5^113 38^50 10^79 41^49 31^53 18^63 14^69 13^71 11^76 2^263 3^166 22^59 26^56 19^62 21^60 45^48 20^61 6^102 7^94 2^264 42^49 39^50 28^55 34^52 3^167 5^114 30^54 2^265 46^48 12^74 15^68 37^51 17^65 10^80 43^49 24^58 2^266 40^50 6^103 3^168 11^77 13^72 14^70 47^48 7^95 35^52 18^64 23^59 2^267 5^115 29^55 33^53 44^49 31^54 22^60 19^63 38^51 3^169 41^50 26^57 21^61 20^62 2^268 48^48 6^104 12^75 2^269 10^81 45^49 28^56 5^116 3^170 7^96 39^51 15^69 42^50 34^53 17^66 11^78 30^55 2^270 13^73 14^71 24^59 46^49 37^52 2^271 3^171 18^65 43^50 23^60 40^51 6^105 5^117 35^53 19^64 2^272 22^61 29^56 47^49 20^63 7^97 21^62 33^54 10^82 12^76 31^55 3^172 26^58 38^52 44^50 2^273 41^51 11^79 15^70 48^49 13^74 17^67 5^118 2^274 6^106 28^57 14^72 3^173 45^50 34^54 30^56 39^52 2^275 42^51 24^60 7^98 18^66 10^83 3^174 23^61 2^276 12^77 37^53 19^65 46^50 5^119 22^62 6^107 20^64 21^63 43^51 40^52 11^80 29^57 35^54 2^277 26^59 3^175 15^71 31^56 33^55 13^75 47^50 7^99 14^73 17^68 2^278 38^53 44^51 41^52 5^120 28^58 3^176 2^279 10^84 6^108 48^50 18^67 12^78 24^61 30^57 34^55 2^280 45^51 39^53 11^81 19^66 42^52 23^62 3^177 7^100 20^65 22^63 5^121 2^281 21^64 13^76 15^72 37^54 46^51 14^74 6^109 29^58 2^282 26^60 17^69 40^53 35^55 3^178 43^52 50^50 10^85 31^57 33^56 2^283 12^79 5^122 47^51 38^54 7^101 18^68 28^59 11^82 3^179 44^52 41^53 2^284 24^62 6^110 30^58 19^67 48^51 34^56 13^77 23^63 2^285 15^73 20^66 3^180 22^64 39^54 21^65 14^75 45^52 5^123 10^86 42^53 2^286 17^70 7^102 37^55 29^59 26^61 12^80 3^181 6^111 2^287 11^83 46^52 35^56 31^58 40^54 33^57 43^53 18^69 50^51 5^124 2^288 28^60 3^182 13^78 38^55 47^52 24^63 19^68 2^289 10^87 15^74 7^103 51^51 41^54 44^53 14^76 30^59 23^64 6^112 20^67 22^65 21^66 34^57 2^290 3^183 17^71 5^125 12^81 48^52 11^84 39^55 2^291 45^53 42^54 26^62 29^60 3^184 37^56 18^70 7^104 2^292 6^113 31^59 10^88 13^79 35^57 5^126 33^58 40^55 46^53 2^293 15^75 43^54 19^69 14^77 3^185 28^61 24^64 50^52 38^56 20^68 12^82 2^294 23^65 11^85 21^67 17^72 22^66 47^53 30^60 41^55 6^114 7^105 3^186 44^54 5^127 51^52 2^295 34^58 10^89 39^56 2^296 48^53 13^80 18^71 26^63 29^61 3^187 52^52 45^54 42^55 15^76 37^57 14^78 2^297 5^128 31^60 6^115 19^70 35^58 11^86 12^83 7^106 33^59 3^188 2^298 24^65 40^56 28^62 20^69 46^54 17^73 43^55 23^66 21^68 22^67 10^90 2^299 50^53 38^57 30^61 5^129 3^189 13^81 6^116 47^54 2^300 41^56 34^59 18^72 44^55 7^107 51^53 14^79 26^64 15^77 11^87 2^301 12^84 3^190 29^62 39^57 48^54 19^71 5^130 42^56 2^302 45^55 52^53 37^58 31^61 10^91 6^117 17^74 20^70 24^66 35^59 33^60 3^191 28^63 2^303 21^69 23^67 7^108 22^68 40^57 13^82 53^53 46^55 43^56 2^304 5^131 30^62 3^192 38^58 18^73 11^88 14^80 12^85 15^78 50^54 2^305 6^118 34^60 41^57 47^55 26^65 10^92 44^56 19^72 3^193 2^306 7^109 29^63 51^54 5^132 39^58 17^75 20^71 2^307 13^83 31^62 48^55 24^67 37^59 42^57 21^70 3^194 45^56 23^68 6^119 28^64 22^69 33^61 35^60 52^54 11^89 2^308 12^86 14^81 18^74 15^79 40^58 7^110 5^133 10^93 2^309 3^195 30^63 43^57 53^54 46^56 38^59 2^310 19^73 6^120 26^66 34^61 50^55 17^76 3^196 41^58 54^54 13^84 29^64 2^311 47^56 5^134 20^72 44^57 11^90 7^111 24^68 39^59 21^71 12^87 51^55 2^312 31^63 23^69 22^70 14^82 3^197 10^94 28^65 15^80 37^60 18^75 33^62 42^58 48^56 6^121 35^61 2^313 45^57 5^135 52^55 3^198 40^59 2^314 30^64 19^74 7^112 13^85 43^58 17^77 11^91 46^57 38^60 26^67 2^315 53^55 6^122 3^199 34^62 12^88 20^73 10^95 29^65 5^136 2^316 14^83 50^56 41^59 21^72 24^69 15^81 54^55 47^57 22^71 44^58 23^70 18^76 3^200 2^317 31^64 39^60 7^113 28^66 51^56 37^61 33^63 6^123 55^55 2^318 35^62 5^137 42^59 13^86 11^92 48^57 45^58 3^201 19^75 17^78 10^96 30^65 2^319 12^89 52^56 40^60 26^68 14^84 20^74 2^320 7^114 38^61 43^59 3^202 15^82 46^58 5^138 34^63 6^124 29^66 21^73 53^56 24^70 2^321 22^72 18^77 23^71 41^60 50^57 11^93 3^203 13^87 2^322 31^65 28^67 44^59 47^58 10^97 54^56 39^61 12^90 5^139 33^64 19^76 7^115 17^79 37^62 2^323 6^125 35^63 51^57 3^204 42^60 14^85 55^56 30^66 48^58 2^324 45^59 20^75 15^83 26^69 40^61 3^205 52^57 2^325 21^74 5^140 11^94 56^56 18^78 38^62 29^67 24^71 22^73 10^98 43^60 34^64 13^88 7^116 23^72 6^126 46^59 2^326 12^91 53^57 3^206 41^61 28^68 31^66 17^80 2^327 19^77 50^58 5^141 14^86 44^60 39^62 47^59 33^65 2^328 54^57 3^207 15^84 37^63 35^64 6^127 7^117 20^76 11^95 30^67 10^99 42^61 2^329 51^58 26^70 13^89 21^75 18^79 45^60 48^59 55^57 3^208 5^142 12^92 40^62 22^74 2^330 24^72 23^73 29^68 38^63 52^58 34^65 6^128 2^331 43^61 56^57 17^81 14^87 3^209 7^118 19^78 46^60 28^69 31^67 2^332 5^143 15^85 11^96 41^62 10^100

by Ken ( at July 16, 2018 06:54 AM