Planet Haskell

December 07, 2018

Serokell Blog

Introduction to Tagless Final

I’m Vasiliy Kevroletin, and I work at Serokell with a lot of different people. Teams consist not only of Haskell experts (who contribute to GHC and Haskell libraries), but also of people like me who have less Haskell experience, but strive hard to learn and expand their Haskell knowledge.

Recently, my team decided to implement an eDSL using the tagless final style for one of our new projects. Although it’s a fairly well-known technique, I had zero experience with tagless final, and there were some difficulties associated with the terms tagless, final, and eDSL.

In preparation for the task, I’ve done some research and organized the learned material into a small HowTo. Now I want to share it with others.


I assume that the reader is fairly comfortable with MTL because I will use a lot of analogies with MTL.


Recall your everyday MTL-style programming. Forget about concrete monad transformers and concentrate on type classes. Without transformers, only two things are left:

  1. Functions are declared using type constraints instead of concrete types:

    getUser :: (MonadReader r m, Has DatabaseConfig r, MonadIO m) => Name -> m User
  2. Instantiation of a polymorphic function to a concrete type (aka interpreter) happens somewhere “in the end”:

    liftIO $ runReader (getUser (Name "Pedro")) env

That’s all. We’ve just covered the tagless final style:

  1. Write code using overloaded functions.
  2. Run code using any of the suitable implementations (aka interpreters).
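These two steps fit in a few lines. Here is a minimal, self-contained sketch (the Env type and greet function are made up for illustration): greet is declared against a constraint, and only at the very end is it instantiated with a concrete interpreter via runReader:

```haskell
import Control.Monad.Reader

-- A hypothetical environment type for this example.
newtype Env = Env { userName :: String }

-- Step 1: write code against a type constraint, not a concrete monad.
greet :: MonadReader Env m => m String
greet = do
  env <- ask
  pure ("Hello, " ++ userName env)

-- Step 2: pick a concrete implementation "in the end".
main :: IO ()
main = putStrLn (runReader greet (Env "Pedro"))
```

Any other type with a MonadReader Env instance could run greet unchanged; that freedom is the whole point.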

All good (and bad) in tagless final comes from ad-hoc polymorphism and type classes (i.e. overloading). Your output depends directly on your commitment to overload things.

A distinct feature of tagless final is extensibility in two dimensions, which is, in fact, a significant achievement (see The Expression Problem Revisited for an explanation for why it’s hard to achieve this property).

Let’s discuss extensibility while keeping this function signature in mind:

wimble :: (MonadReader Env m, MonadState State m) => m ()

We can run wimble using a custom new implementation of MonadReader and MonadState (by defining a new data type and defining instances for it). This is an extension in the first dimension: a new interpreter. Furthermore, we can use a new set of operations, say MonadWriter, and use wimble in a new function which uses all 3 classes: MonadReader, MonadState and MonadWriter (i.e. old and new operations). This is an extension in the second dimension: a new set of operations.
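Both dimensions can be sketched concretely (Env, AppState, and the App newtype below are stand-ins invented for this example):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Reader
import Control.Monad.State
import Control.Monad.Writer

-- Stand-ins for the real types behind wimble.
data Env = Env
type AppState = Int

wimble :: (MonadReader Env m, MonadState AppState m) => m ()
wimble = do
  _env <- ask      -- uses the MonadReader operations
  modify (+ 1)     -- uses the MonadState operations

-- Dimension 1, a new interpreter: any type providing both classes
-- can run the untouched wimble.
newtype App a = App (ReaderT Env (State AppState) a)
  deriving (Functor, Applicative, Monad,
            MonadReader Env, MonadState AppState)

runApp :: App a -> Env -> AppState -> (a, AppState)
runApp (App m) env st = runState (runReaderT m env) st

-- Dimension 2, a new set of operations: reuse wimble in a function
-- that additionally demands MonadWriter.
womble :: (MonadReader Env m, MonadState AppState m, MonadWriter [String] m)
       => m ()
womble = do
  wimble
  tell ["wimble finished"]

main :: IO ()
main = print (snd (runApp wimble Env 0))
```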

From my point of view, available learning resources show two different approaches to using tagless final:

  1. Define operations abstracted over a monad

    In that case, we can use do notation.

  2. Define an Abstract Syntax Tree using overloaded functions

    In that case, potentially, we can pretty print, inspect and optimize AST.

People who have experience with the technique might say that the two approaches are exactly the same. After learning about tagless final, this opinion makes sense to me. But earlier, when I had just started searching through the available learning resources, I was confused by the difference in the look and feel of the resulting code. Also, some people say that having do notation is enough to call something an eDSL, while others say that an eDSL should define an AST. So, by saying tagless final, different people might assume slightly different approaches, which might look like completely different techniques to a novice Haskell programmer. We will briefly explore programming with monads and defining ASTs using tagless final, and also touch on a few other relevant topics.

Application Monad

It’s common among Haskell programmers to organize effectful application code using monads. Details vary between implementations but the basic idea is to define a monad together with a set of operations available in this monad. Similarly to this blog post, I’ll call a monad for organizing effectful application code an application monad.

Tagless final is a suitable technique for defining application monads. In fact, thanks to MTL, it is one of the most widely used tools for that task.

Let’s take a simplified problem of fetching/deleting a user from a database as an example to demonstrate how tagless final can be used to define operations available in do notation. Our application monad will provide two operations: getUser and deleteUser. By applying the tagless final approach, we will define a set of overloaded functions and will later provide their implementation. Right from the start, there is a design decision to make: which operations to overload. We can define a new typeclass with getUser/deleteUser operations, or we can use more generic functions from MTL and build on top of them. Although in practice I would often choose the second option, here I’ll show the first one because in our particular case it leads to shorter code:

data Name = Name String
data User = User { name :: Name, age :: Int }

class Monad m => MonadDatabase m where
    getUser    :: Name -> m User
    deleteUser :: User -> m ()

Using operations given above we can define some logic like this:

test :: MonadDatabase m => m ()
test = do user <- getUser (Name "Pedro")
          when (age user < 18) (deleteUser user)

Note that the test function is abstract: it can be executed using a different implementation of MonadDatabase. Now, let’s define a MonadDatabase instance suitable to run that test function. One way to do it is to build on top of MTL transformers. I’ve assumed that getUser/deleteUser functions can be implemented on top of Reader and IO monads and I’ve also omitted some implementation details (marked by ...):

data DatabaseConfig = DatabaseConfig { ... }

newtype AppM a =
    AppM { unAppM :: ReaderT DatabaseConfig IO a }
    deriving (Functor, Applicative, Monad, MonadIO, MonadReader DatabaseConfig)

instance MonadDatabase AppM where
    getUser name = do cfg <- ask
                      ...

    deleteUser user = do cfg <- ask
                         ...

runAppM :: AppM a -> DatabaseConfig -> IO a
runAppM app config = runReaderT (unAppM app) config

Now, we can execute the abstract test function using a particular AppM implementation:

main = do cfg <- ...
          runAppM test cfg

By using tagless final style, we have separated the definition of abstract operations from their implementation which gives us extensibility. With such an approach it is possible to define a new set of operations and use it together with MonadDatabase. It is also possible to add a new interpretation (for example, for testing purposes).
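For instance, a test interpreter might keep a list of users in a State monad instead of talking to a real database. The TestM type below is a hypothetical sketch; it repeats the definitions above so that it stands alone, and its getUser assumes the requested user exists:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad (when)
import Control.Monad.State

data Name = Name String deriving Eq
data User = User { name :: Name, age :: Int }

class Monad m => MonadDatabase m where
  getUser    :: Name -> m User
  deleteUser :: User -> m ()

test :: MonadDatabase m => m ()
test = do user <- getUser (Name "Pedro")
          when (age user < 18) (deleteUser user)

-- A hypothetical pure interpreter for tests: the "database" is just a
-- list of users kept in State, so no real connection is needed.
newtype TestM a = TestM (State [User] a)
  deriving (Functor, Applicative, Monad)

instance MonadDatabase TestM where
  getUser n    = TestM (gets (head . filter ((== n) . name)))
  deleteUser u = TestM (modify (filter ((/= name u) . name)))

runTestM :: TestM a -> [User] -> (a, [User])
runTestM (TestM m) = runState m

main :: IO ()
main = do
  let pedro = User (Name "Pedro") 12
  -- Pedro is under 18, so test deletes him from the "database".
  print (length (snd (runTestM test [pedro])))
```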

Even with such a small example, there were many possibilities to organize the code. The first question: how to choose the set of overloaded operations? Is it better to define a new typeclass such as MonadDatabase with application-specific functions, or is it better to stick to MTL typeclasses and define operations on top of more generic functions? The second question: how to write the implementation? Are there practical alternatives to MTL transformers? Although it’s very tempting to discuss those and several other questions here, I don’t know all the answers, and the topic of proper application architecture is too broad. For more in-depth resources on application architecture, you can visit other blogs:
(1, 2, 3, 4, 5, 6).

Mini Q&A

Q. I heard you were describing tagless final using MTL. What about the famous n^2 problem?

A. The n^2 problem appears in the transformers implementation (because transformers need to propagate the methods of their underlying monads). Transformers have nothing to do with tagless final. We were only talking about type constraints and the freedom to switch between implementations.

If you are still wondering about the n^2 problem, here is a small trick to mitigate it (export method implementations as separate functions with a hope that other instances will use your implementation).

If you create many similar implementations, tagless final causes some effort duplication. In that case, you might want to use transformers, which leads to the n^2 problem.
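The trick can be sketched like this (the names below are mine, not from the linked post): the method body lives in a plain function constrained only on the capabilities it needs, and each instance delegates to it, so similar instances share one implementation:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Reader

data DatabaseConfig = DatabaseConfig
newtype Name = Name String
newtype User = User { userName :: String }

class Monad m => MonadDatabase m where
  getUser :: Name -> m User

-- The implementation, exported as a plain function; any instance
-- whose monad satisfies these constraints can reuse it verbatim.
getUserImpl :: MonadReader DatabaseConfig m => Name -> m User
getUserImpl (Name n) = do
  _cfg <- ask            -- a real version would query the database here
  pure (User n)

newtype AppM a = AppM (Reader DatabaseConfig a)
  deriving (Functor, Applicative, Monad, MonadReader DatabaseConfig)

runAppM :: AppM a -> DatabaseConfig -> a
runAppM (AppM m) = runReader m

instance MonadDatabase AppM where
  getUser = getUserImpl  -- one line instead of a copied method body

main :: IO ()
main = putStrLn (userName (runAppM (getUser (Name "Pedro")) DatabaseConfig))
```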

Q. You were talking about an “Application” monad and MTL style. Is it really
an eDSL?

A. Even if there is a canonical scientific definition for the term eDSL, people use the term to talk about different things. Opinions range from “eDSL is a completely distinct language with its own semantics” to “eDSL is a library with a nice, consistent interface”. Here are the answers I got in several public Haskell-related channels to the question “what are good examples of eDSLs implemented in Haskell?”: SBV, diagrams, accelerate, blaze, esqueleto, shake, lens, gloss, streaming libraries (pipes, streamly, etc.), Servant, opaleye, frp-arduino, HHDL, ivory, pandoc. As you can see, those answers clearly show that the term eDSL is vague. But anyway, tagless final can be used to create both “true” eDSLs and nice library interfaces (probably with monads).


The most complete discussion of the tagless final approach was done by Oleg Kiselyov and his colleagues. He talks mostly about the embedding of different versions of typed lambda calculus using the tagless final encoding. He achieves very motivating results, such as embedding lambda calculus with linear types and transforming ASTs.

Let’s pick a simple language as an example and explore two ways to encode its AST: Initial and Final encodings. The chosen language has integer constants, an addition operation, and lambda functions. On the one hand, it’s simple enough to fit most of the implementation in this blog post. On the other hand, it’s complicated enough to discuss two versions of Initial encoding and to demonstrate the extensibility of the tagless final approach.

Initial encoding

Initial encoding means that we are representing an AST using values of a given algebraic data type. The term “Initial encoding” was inspired by category theory: it follows from the observation that an inductive data type can be viewed as an “initial algebra”. Bartosz Milewski gives a gentle description of what an initial algebra is and why an inductive data structure can be viewed as one.

Tagged initial encoding

Here is one way to represent an Abstract Syntax Tree for our simple language. For simplicity, we re-use Haskell lambda functions in the definition of Lambda so that we don’t need to implement identifier assignment/lookup ourselves; this approach is called higher-order abstract syntax:

data Expr = IntConst Int
          | Lambda   (Expr -> Expr)
          | Apply    Expr Expr
          | Add      Expr Expr

This representation allows us to define well-formed eDSL expressions like these:

-- 10 + 20
t1 = IntConst 10 `Add` IntConst 20

-- (\x -> 10 + x) 20
t2 = Apply (Lambda $ \x -> IntConst 10 `Add` x) (IntConst 20)

Unfortunately, it also allows us to define malformed expressions like these:

-- Trying to call integer constant as a function
e1 = Apply (IntConst 10) (IntConst 10)

-- Trying to add lambda functions
e2 = Add f f where f = Lambda (\x -> x)

Evaluation of Expr values can produce errors because our representation of the eDSL allows encoding malformed expressions. Consequently, the eval interpreter should check for type errors as it works. To be more precise, it should pattern-match on the values that come out of eval at runtime to ensure that the Add operation was applied to integer constants and that the Apply operation was used with a lambda function. We define the Result data type to represent possible resulting values and use Maybe Result to represent possible errors:

data Result = IntResult Int
            | LambdaResult (Expr -> Expr)

eval :: Expr -> Maybe Result
eval e@(IntConst x) = Just (IntResult x)
eval e@(Lambda   f) = Just (LambdaResult f)
eval (Apply f0 arg) = do
    f1  <- eval f0
    case f1 of
        LambdaResult f -> eval (f arg)
        _              -> Nothing
eval (Add l0 r0) = do
    l1 <- eval l0
    r1 <- eval r0
    case (l1, r1) of
        (IntResult l, IntResult r) -> Just $ IntResult (l + r)
        _                          -> Nothing

The technique is called “tagged” because sum types in Haskell are tagged sum types. At runtime, such values are represented as a pair (tag, payload), and the tag is used to perform pattern matches. The eval function pattern-matches on IntResult and LambdaResult to perform type and error checking; in other words, it uses tags at runtime. Hence the name.
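To see the tags at work, here is the interpreter in action (the definitions are repeated so the snippet stands alone, plus a small asInt helper of my own): the well-formed expression evaluates to Just a result, while the malformed one is caught at runtime:

```haskell
data Expr = IntConst Int
          | Lambda   (Expr -> Expr)
          | Apply    Expr Expr
          | Add      Expr Expr

data Result = IntResult Int
            | LambdaResult (Expr -> Expr)

eval :: Expr -> Maybe Result
eval (IntConst x)   = Just (IntResult x)
eval (Lambda   f)   = Just (LambdaResult f)
eval (Apply f0 arg) = do
    f1 <- eval f0
    case f1 of
        LambdaResult f -> eval (f arg)
        _              -> Nothing
eval (Add l0 r0) = do
    l1 <- eval l0
    r1 <- eval r0
    case (l1, r1) of
        (IntResult l, IntResult r) -> Just (IntResult (l + r))
        _                          -> Nothing

-- Project an integer result out, if there is one.
asInt :: Maybe Result -> Maybe Int
asInt (Just (IntResult n)) = Just n
asInt _                    = Nothing

main :: IO ()
main = do
  print (asInt (eval (IntConst 10 `Add` IntConst 20)))      -- Just 30
  print (asInt (eval (Apply (IntConst 10) (IntConst 10))))  -- Nothing
```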

Tagless initial encoding

The idea is that we can use a GADT to add type information to the Expr type and thus make malformed eDSL expressions unrepresentable. We no longer need the Result data type, and there is no more runtime type checking in the eval function. In the Finally Tagless, Partially Evaluated paper, the authors refer to their versions of the data constructors IntResult and LambdaResult as “tags”. And because the GADT-based approach has no tags, they call it “tagless initial” encoding.

The GADT-based AST definition and the corresponding eval interpreter are given below. The new AST is capable of representing the examples t1 and t2 from the previous section while making e1 and e2 unrepresentable. The idea of the Expr a data type is that the parameter holds the type to which a given expression should evaluate. IntConst and Lambda just duplicate their field types in the parameter because evaluating such a value simply means unwrapping it. In the case of the Add constructor, the parameter is equal to Int, which means that Add evaluates to an integer. Apply evaluates to the result of the passed lambda function.

data Expr a where
    IntConst :: Int                     -> Expr Int
    Lambda   :: (Expr a -> Expr b)      -> Expr (Expr a -> Expr b)
    Apply    :: Expr (Expr a -> Expr b) -> Expr a -> Expr b
    Add      :: Expr Int -> Expr Int    -> Expr Int

eval :: Expr a -> a
eval (IntConst x) = x
eval (Lambda f)   = f
eval (Apply f x)  = eval (eval f x)
eval (Add l r)    = (eval l) + (eval r)
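A quick session with this interpreter (repeating the definitions so the snippet is self-contained): the well-formed t2 from the previous section now evaluates without any Maybe, and the malformed e1 simply does not type-check:

```haskell
{-# LANGUAGE GADTs #-}

data Expr a where
    IntConst :: Int                     -> Expr Int
    Lambda   :: (Expr a -> Expr b)      -> Expr (Expr a -> Expr b)
    Apply    :: Expr (Expr a -> Expr b) -> Expr a -> Expr b
    Add      :: Expr Int -> Expr Int    -> Expr Int

eval :: Expr a -> a
eval (IntConst x) = x
eval (Lambda f)   = f
eval (Apply f x)  = eval (eval f x)
eval (Add l r)    = eval l + eval r

main :: IO ()
main = do
  -- t2 from the tagged section: (\x -> 10 + x) 20
  print (eval (Apply (Lambda (\x -> IntConst 10 `Add` x)) (IntConst 20)))
  -- e1 = Apply (IntConst 10) (IntConst 10) is now rejected by the
  -- type checker, so eval needs no runtime tag checks at all.
```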

Final encoding

Although the term “Initial” came from category theory, the term “Final” didn’t. Oleg shows that “the final and initial typed tagless representations are related by bijection” which means that these approaches are equivalent in some sense and both are “Initial” from the category theorist’s point of view. The Finally Tagless paper states “We call this approach final (in contrast to initial) because we represent each object term not by its abstract syntax, but by its denotation in a semantic algebra”. My best guess is that the name “Final” was chosen to differentiate from the term Initial as much as possible.

With tagless final, we build expressions using overloaded functions instead of data constructors. The expression from the previous section will look like this:

test = lambda (\x -> add x (intConst 20))

Machinery to make it work consists of two parts.

  1. Combinators definition:

    class LambdaSYM repr where
        intConst :: Int -> repr Int
        lambda   :: (repr a -> repr b) -> repr (a -> b)
        apply    :: repr (a -> b) -> repr a -> repr b
  2. Interpreter implementation:

    data R a = R { unR :: a }
    instance LambdaSYM R where
        intConst x = R x
        lambda f   = R $ \x -> unR (f (R x))
        apply f a  = R $ (unR f) (unR a)
    eval :: R a -> a
    eval x = unR x

Applying interpreter:

testSmall :: LambdaSYM repr => repr Int
testSmall = apply (lambda (\x -> x)) (intConst 10)

main = print (eval testSmall) -- 10

Interesting points:

  1. eval function instantiates testSmall expression to a concrete type R Int (aka interpreter).

  2. It’s easy to define other interpreters. For example, a pretty printer. There is a little twist, though: a pretty printer needs to allocate names for free variables and keep track of allocated names, so the printing interpreter will pass an environment and it will look very similar to a Reader monad.

  3. It’s extremely easy to extend the language with new operations.

Adding a new add operation to our previous example requires only defining a new type class and implementing a new instance for each interpreter. Functions that use the new add operation should add an additional AddSYM repr constraint to their types.

class AddSYM repr where
    add :: repr Int -> repr Int -> repr Int

instance AddSYM R where
    add a b = R $ (unR a) + (unR b)

test :: (LambdaSYM repr, AddSYM repr) => repr Int
test = apply (apply (lambda (\y -> lambda (\x -> x `add` y))) (intConst 10)) (intConst 20)

Please note that in this particular case we are lucky because it’s possible to write instance AddSYM R. Or, in other words, it’s possible to implement the new operation on top of an existing interpreter. Sometimes we will need to extend existing interpreters or write new ones.
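To make the earlier claim about pretty printing concrete, here is a hypothetical printing interpreter for LambdaSYM (the P carrier and the naming scheme are mine): the phantom parameter is ignored, and the carrier threads a counter for fresh variable names, much like a Reader environment:

```haskell
class LambdaSYM repr where
    intConst :: Int -> repr Int
    lambda   :: (repr a -> repr b) -> repr (a -> b)
    apply    :: repr (a -> b) -> repr a -> repr b

-- The carrier: a function from the next free variable index to text.
newtype P a = P { unP :: Int -> String }

instance LambdaSYM P where
    intConst n = P (\_ -> show n)
    lambda f   = P (\i -> let v = "x" ++ show i
                          in "(\\" ++ v ++ " -> "
                             ++ unP (f (P (const v))) (i + 1) ++ ")")
    apply f a  = P (\i -> "(" ++ unP f i ++ " " ++ unP a i ++ ")")

pretty :: P a -> String
pretty e = unP e 0

testSmall :: LambdaSYM repr => repr Int
testSmall = apply (lambda (\x -> x)) (intConst 10)

main :: IO ()
main = putStrLn (pretty testSmall)
```

Because testSmall is polymorphic, the very same expression can be fed to the evaluating interpreter R or to this printing one; that is the extensibility dimension at work.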

Introspection. Host vs target language

In Oleg’s papers, he pretty prints and transforms tagless final ASTs. At first, the existence of such utilities seems counterintuitive because the combinators are functions, and we are used to manipulating values of algebraic data types. Yet it is possible to extract facts about the structure of tagless final ASTs (i.e. introspection is possible) and to transform them. For proof of that claim, check section 3.4 (page 28) of Oleg’s course, where he presents a pretty-printer and a transformer.

The pretty-printer and the transformer of ASTs are just tagless final interpreters which keep track of some additional information and propagate it during interpreting from parents to children. Both are extensible in the same ways as other tagless final interpreters.

However, if we return to our first section and think about applying a tagless final approach to a simple case of defining Application monad, then we will quickly find out that we can’t inspect and transform resulting monads. Consider our simple example:

class Monad m => HasDatabaseConfig m where
    getDatabaseConfig :: m DatabaseConfig

getUser :: (HasDatabaseConfig m, MonadIO m) => Name -> m User
getUser = ...

test :: (HasDatabaseConfig m, MonadIO m) => m String
test = do user <- getUser (Name "Pedro")
          if age user > 3 then pure "Fuzz"
                          else pure "Buzz"

Although the getDatabaseConfig function is overloaded, a lot of logic is expressed using functions and other constructions which are not overloaded. Therefore, there is no way to statically inspect the resulting monadic value. This is an important point: if you want introspection and transformation of ASTs, then you need to keep track of what’s overloaded and what’s not. Oleg obtained his great results because he overloaded everything and expressed embedded lambda calculus by using only overloaded functions. In other words, the power of tagless final depends on how far you want to go with overloading.

Relation to Free monads

People often compare tagless final and free monads. Both approaches give you machinery to define overloaded operations inside a monadic context. I am not an expert in free monads, but tagless final:

  • is faster;
  • is extensible (one can easily add new operations);
  • requires less boilerplate.

One argument for free monads is that it’s possible to statically introspect free monads. That is not completely true. Yes, you can easily execute actions one by one, and it helps to combine monadic values by interleaving actions (we can achieve a similar thing by interpreting into continuation with tagless final). But here is a blog post which describes the difficulties of free monad introspection (we’ve already covered the gist of the problem in the previous section). Also, see this blog post where the author describes difficulties associated with free monads and suggests using tagless final instead.

Here is a very good overview of free monad performance challenges. Here Edward Kmett gives his perspective on the same problem.

In short:

  1. A simple implementation of free monads gives O(n^2) asymptotics for left-associated monadic binds, because each bind appends one element to the end of a “list”, as in [1, 2, 3] ++ [4].
  2. Using continuations (similarly to the DList package) gives O(n) binds, but makes some operations slow (for example, combining two sequences of commands by interleaving them).
  3. Using a technique similar to the Seq data structure leads to good asymptotics for all operations, but also adds a significant constant overhead.


General techniques for optimizing Haskell code with polymorphic functions apply here. In short, using overloaded functions sometimes causes the compiler to generate code which uses method dictionaries to dispatch calls. The compiler often knows how to specialize functions and get rid of the dictionaries, but module boundaries prevent that from happening. To help the compiler, read the “Specializing” section of this document and then use an INLINEABLE pragma like this:

getUser :: (MonadReader r m, Has DatabaseConfig r, MonadIO m) => Name -> m User
{-# INLINEABLE getUser #-}


Haskell lacks first-class polymorphism (aka impredicative polymorphism), which means that we can’t specialize an existing data type to hold a polymorphic value like this:

Maybe (LambdaSYM repr => repr Int)

It follows that we can’t interpret such a polymorphic value twice (though this situation doesn’t come up very frequently when we just want an application monad with some overloaded operations). This is an issue when, for example, we parse some text file, obtain a tagless final AST, and want to interpret it twice: to evaluate it and to pretty print it. There is a limited workaround: define a newtype wrapper around the polymorphic value. The wrapper fixes a concrete set of type constraints and hence kills one extensibility dimension of tagless final.
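The newtype workaround can be sketched as follows (Wrapped, R, and S are illustrative names of mine): the wrapper pins the constraint set to LambdaSYM, after which the same stored value can be interpreted twice:

```haskell
{-# LANGUAGE RankNTypes #-}

class LambdaSYM repr where
  intConst :: Int -> repr Int

-- The wrapper fixes the constraints to exactly LambdaSYM, so values
-- of Wrapped can live inside Maybe, lists, etc.
newtype Wrapped a = Wrapped (forall repr. LambdaSYM repr => repr a)

-- Two interpreters: an evaluator and a printer.
newtype R a = R { unR :: a }
instance LambdaSYM R where
  intConst = R

newtype S a = S { unS :: String }
instance LambdaSYM S where
  intConst = S . show

-- The same wrapped expression interpreted twice.
interpretTwice :: Wrapped Int -> (Int, String)
interpretTwice (Wrapped e) = (unR e, unS e)

main :: IO ()
main = print (interpretTwice (Wrapped (intConst 42)))
```

The price is that adding a new class (say, AddSYM) means changing the Wrapped type, which is exactly the lost extensibility dimension mentioned above.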

Oleg’s paper also presents another workaround: a special “duplicating” interpreter. Unfortunately, it is presented using a simple eDSL without lambda functions, and I failed to apply the same technique to a more complicated AST with lambdas. I mention it here just for the sake of completeness.

Also, note that sometimes people want to change implementations (aka interpreters) at runtime, not at compile time, or even change only a part of the existing behaviour: for example, change the data source but leave all other application-specific logic intact. Tagless final can support this with an interpreter that is configurable at runtime using some sort of method dictionary (see the handle pattern).


Thanks to MTL, the tagless final style of programming is battle-tested and widely adopted. In my opinion, it’s quite a natural way to write Haskell code because it utilizes a very basic Haskell feature: type classes. It also goes far beyond MTL: it can be used both for writing application-specific logic (with or without monad transformers) and for “true” eDSLs with their own semantics.

I also found that it’s not a hard concept to grasp, so it can safely be used in a large team of developers with different backgrounds.

That’s all, I hope my post will help others to grasp the main idea of tagless final and to use it in their projects.


Many thanks to Gints Dreimanis, Vlad Zavialov and others from Serokell for their help in writing this article. Without their reviews and suggestions, this post would not have happened.


by Vasiliy Kevroletin at December 07, 2018 12:00 AM

January 19, 2019

Haskell at Work

Purely Functional GTK+, Part 2: TodoMVC


In the last episode we built a "Hello, World" application using gi-gtk-declarative. It's now time to convert it into a to-do list application, in the style of TodoMVC.

To convert the “Hello, World!” application to a to-do list application, we begin by adjusting our data types. The Todo data type represents a single item, with a Text field for its name. We also need to import the Text type from Data.Text.

data Todo = Todo
  { name :: Text
  }

Our state will no longer be (), but a data type holding a Vector of Todo items. This means we also need to import Vector from Data.Vector.

data State = State
  { todos :: Vector Todo
  }

As the run function returns the last state value of the state reducer loop, we need to discard that return value in main. We wrap the run action in void, imported from Control.Monad.

Let’s rewrite our view function. We change the title to “TodoGTK+” and replace the label with a todoList, which we’ll define in a where binding. We use container to declare a Gtk.Box, with vertical orientation, containing all the to-do items. Using fmap and a typed hole, we see that we need a function Todo -> BoxChild Event.

view' :: State -> AppView Gtk.Window Event
view' s = bin
  Gtk.Window
  [#title := "TodoGTK+", on #deleteEvent (const (True, Closed))]
  todoList
  where
    todoList = container Gtk.Box
                         [#orientation := Gtk.OrientationVertical]
                         (fmap _ (todos s))

The todoItem will render a Todo value as a Gtk.Label displaying the name.

view' :: State -> AppView Gtk.Window Event
view' s = bin
  Gtk.Window
  [#title := "TodoGTK+", on #deleteEvent (const (True, Closed))]
  todoList
  where
    todoList = container Gtk.Box
                         [#orientation := Gtk.OrientationVertical]
                         (fmap todoItem (todos s))
    todoItem todo = widget Gtk.Label [#label := name todo]

Now, GHC tells us there’s a “non-type variable argument in the constraint”. The type of todoList requires us to add the FlexibleContexts language extension.

{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE OverloadedLabels  #-}
{-# LANGUAGE OverloadedLists   #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where

The remaining type error is in the definition of main, where the initial state cannot be a () value. We construct a State value with an empty vector.

main :: IO ()
main = void $ run App
  { view         = view'
  , update       = update'
  , inputs       = []
  , initialState = State {todos = mempty}
  }

Adding New To-Do Items

While our application type-checks and runs, there are no to-do items to display, and there’s no way of adding new ones. We need to implement a form, where the user inserts text and hits the Enter key to add a new to-do item. To represent these events, we’ll add two new constructors to our Event type.

data Event
  = TodoTextChanged Text
  | TodoSubmitted
  | Closed

TodoTextChanged will be emitted each time the text in the form changes, carrying the current text value. The TodoSubmitted event will be emitted when the user hits Enter.

When the to-do item is submitted, we need to know the current text to use, so we add a currentText field to the state type.

data State = State
  { todos       :: Vector Todo
  , currentText :: Text
  }

We modify the initialState value to include an empty Text value.

main :: IO ()
main = void $ run App
  { view         = view'
  , update       = update'
  , inputs       = []
  , initialState = State {todos = mempty, currentText = mempty}
  }

Now, let’s add the form. We wrap our todoList in a vertical box, containing the todoList and a newTodoForm widget.

view' :: State -> AppView Gtk.Window Event
view' s = bin
  Gtk.Window
  [#title := "TodoGTK+", on #deleteEvent (const (True, Closed))]
  (container Gtk.Box
             [#orientation := Gtk.OrientationVertical]
             [todoList, newTodoForm])

The form consists of a Gtk.Entry widget, with the currentText of our state as its text value. The placeholder text will be shown when the entry isn’t focused. We use onM to attach an effectful event handler to the changed signal.

view' :: State -> AppView Gtk.Window Event
view' s = bin
  Gtk.Window
  [#title := "TodoGTK+", on #deleteEvent (const (True, Closed))]
  (container Gtk.Box
             [#orientation := Gtk.OrientationVertical]
             [todoList, newTodoForm])
  where
    newTodoForm = widget
      Gtk.Entry
      [ #text := currentText s
      , #placeholderText := "What needs to be done?"
      , onM #changed _
      ]

The typed hole tells us we need a function Gtk.Entry -> IO Event. The reason we use onM is to get an IO action returning the event, instead of a pure function: we need to query the underlying GTK+ widget for its current text value. By using entryGetText and mapping our event constructor over that IO action, we get a function of the correct type.

    newTodoForm = widget
      Gtk.Entry
      [ #text := currentText s
      , #placeholderText := "What needs to be done?"
      , onM #changed (fmap TodoTextChanged . Gtk.entryGetText)
      ]
It is often necessary to use onM and effectful GTK+ operations in event handlers, as the callback type signatures rarely have enough information in their arguments. But for the next event, TodoSubmitted, we don’t need any more information, and we can use on to declare a pure event handler for the activated signal.

    newTodoForm = widget
      Gtk.Entry
      [ #text := currentText s
      , #placeholderText := "What needs to be done?"
      , onM #changed (fmap TodoTextChanged . Gtk.entryGetText)
      , on #activate TodoSubmitted
      ]

Moving to the next warning, we see that the update' function is no longer total. We are missing cases for our new events. Let’s give the arguments names and pattern match on the event. The case for Closed will be the same as before.

update' :: State -> Event -> Transition State Event
update' s e = case e of
  Closed -> Exit

When the to-do text value changes, we’ll update the currentText state using a Transition. The first argument is the new state, and the second argument is an action of type IO (Maybe Event). We don’t want to emit any new event, so we use (pure Nothing).

update' :: State -> Event -> Transition State Event
update' s e = case e of
  TodoTextChanged t -> Transition s { currentText = t } (pure Nothing)
  Closed -> Exit

For the TodoSubmitted event, we define a newTodo value with the currentText as its name, and transition to a new state with the newTodo item appended to the todos vector. We also reset the currentText to be empty.

To use Vector.snoc, we need to add a qualified import.

import           Control.Monad                 (void)
import           Data.Text                     (Text)
import           Data.Vector                   (Vector)
import qualified Data.Vector                   as Vector
import qualified GI.Gtk                        as Gtk
import           GI.Gtk.Declarative
import           GI.Gtk.Declarative.App.Simple

Running the application, we can start adding to-do items.

Improving the Layout

Our application doesn’t look very good yet, so let’s improve the layout a bit. We’ll begin by left-aligning the to-do items.

todoItem todo =
  widget Gtk.Label
         [#label := name todo, #halign := Gtk.AlignStart]

To push the form down to the bottom of the window, we’ll wrap the todoList in a BoxChild, and override the defaultBoxChildProperties to have the child widget expand and fill all the available space of the box.

todoList =
  BoxChild defaultBoxChildProperties { expand = True, fill = True }
    $ container Gtk.Box
                [#orientation := Gtk.OrientationVertical]
                (fmap todoItem (todos s))

We re-run the application, and see it has a nicer layout.

Completing To-Do Items

There’s one very important feature missing: being able to mark a to-do item as completed. We add a Bool field called completed to the Todo data type.

data Todo = Todo
  { name      :: Text
  , completed :: Bool
  }

When creating new items, we set it to False.

update' :: State -> Event -> Transition State Event
update' s e = case e of
  TodoSubmitted ->
    let newTodo = Todo {name = currentText s, completed = False}
    in  Transition
          s { todos = todos s `Vector.snoc` newTodo, currentText = mempty }
          (pure Nothing)

Instead of simply rendering the name, we’ll use strike-through markup if the item is completed. We define completedMarkup, and using guards we’ll either render the new markup or render the plain name. To make it strike-through, we wrap the text value in <s> tags.

    todoItem todo = widget
      Gtk.Label
      [ #label := completedMarkup todo
      , #halign := Gtk.AlignStart
      ]
    completedMarkup todo
      | completed todo = "<s>" <> name todo <> "</s>"
      | otherwise      = name todo

For this to work, we need to enable markup for the label by setting #useMarkup to True.

todoItem todo =
  widget
    Gtk.Label
    [ #label := completedMarkup todo
    , #useMarkup := True
    , #halign := Gtk.AlignStart
    ]
  where
    completedMarkup todo
      | completed todo = "<s>" <> name todo <> "</s>"
      | otherwise      = name todo

In order for the user to be able to toggle the completed status, we wrap the label in a Gtk.CheckButton bin. The #active property will be set to the current completed status of the Todo value. When the check button is toggled, we want to emit a new event called TodoToggled.

todoItem i todo =
  bin Gtk.CheckButton
      [#active := completed todo, on #toggled (TodoToggled i)]
    $ widget
        Gtk.Label
        [ #label := completedMarkup todo
        , #useMarkup := True
        , #halign := Gtk.AlignStart
        ]
Let’s add the new constructor to the Event data type. It will carry the index of the to-do item.

data Event
  = TodoTextChanged Text
  | TodoSubmitted
  | TodoToggled Int
  | Closed

To get the corresponding index of each Todo value, we’ll iterate using Vector.imap instead of using fmap.

    todoList =
      BoxChild defaultBoxChildProperties { expand = True, fill = True }
        $ container Gtk.Box
                    [#orientation := Gtk.OrientationVertical]
                    (Vector.imap todoItem (todos s))
    todoItem i todo =

The pattern match on events in the update' function is now missing a case for the new event constructor. Again, we’ll do a transition where we update the todos somehow.

update' :: State -> Event -> Transition State Event
update' s e = case e of
  TodoToggled i -> Transition s { todos = _ (todos s) } (pure Nothing)

We need a function Vector Todo -> Vector Todo that modifies the value at the index i. There’s no handy function like that available in the vector package, so we’ll create our own. Let’s call it mapAt.

update' :: State -> Event -> Transition State Event
update' s e = case e of
  TodoToggled i -> Transition s { todos = mapAt i _ (todos s) } (pure Nothing)

It will take as arguments the index, a mapping function, and a Vector a, and return a Vector a.

mapAt :: Int -> (a -> a) -> Vector a -> Vector a

We implement it using Vector.modify, and actions on the mutable representation of the vector. We overwrite the value at i with the result of mapping f over the existing value at i.

mapAt :: Int -> (a -> a) -> Vector a -> Vector a
mapAt i f = Vector.modify (\v -> MVector.write v i . f =<< MVector.read v i)
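For intuition, here is a pure list-based analogue of the same operation. Note that mapAtList is our own illustrative name, not part of the article’s code, which works on Vector:

```haskell
-- A list-based analogue of mapAt, for intuition only: apply f to the
-- element at index i and leave every other element untouched.
mapAtList :: Int -> (a -> a) -> [a] -> [a]
mapAtList i f xs =
  [ if j == i then f x else x | (j, x) <- zip [0 ..] xs ]
```

The Vector version above does the same thing, but in-place on a temporary mutable copy, which avoids rebuilding the whole structure.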

To use mutable vector operations through the MVector name, we add the qualified import.

import qualified Data.Vector.Mutable           as MVector

Finally, we implement the function to map, called toggleCompleted.

toggleCompleted :: Todo -> Todo
toggleCompleted todo = todo { completed = not (completed todo) }

update' :: State -> Event -> Transition State Event
update' s e = case e of
  TodoToggled i -> Transition s { todos = mapAt i toggleCompleted (todos s) } (pure Nothing)

Now, we run our application, add some to-do items, and mark or unmark them as completed. We’re done!

Learning More

Building our to-do list application, we have learned the basics of gi-gtk-declarative and the “App.Simple” architecture. There’s more to learn, though, and I recommend checking out the project documentation. There are also a bunch of examples in the Git repository.

Please note that this project is very young, and that APIs are not necessarily stable yet. I think, however, that it’s a much nicer way to build GTK+ applications using Haskell than the underlying APIs provided by the auto-generated bindings.

Now, have fun building your own functional GTK+ applications!

by Oskar Wickström at January 19, 2019 12:00 AM

August 24, 2019

Joachim Breitner

ICFP 2019

ICFP 2019 in Berlin ended yesterday, and it was – as always – a great pleasure. This year was particularly noteworthy for the quite affordable conference hotel and the absolutely amazing food during the coffee breaks.

Since I am no longer a proper academic, I unsurprisingly did not have real research to present. Luckily I found ways to not just be a passive participant this year:

  • At FARM, I presented Kaleidogen, a small game (or toy, some would say) of mine. The room was packed with people, so thanks for all your interest! If you missed it, you can soon see the recording or read the demo abstract.

  • At PLMW, the mentoring workshop for young researchers, I ran the “Social event” together with Niki Vazou. Like last year, we randomly grouped the students and held a little competition where they had to match program listings to languages and algorithms. This was great fun, and we even managed to solve the sudden problem of two ties in an ad-hoc extra quiz.

  • During his “State of GHC” speech, Simon Peyton Jones asked me to speak about the GHC Proposal Process for a few slides.

  • And since that is not enough stage time, I secured two spots in local stand-up comedy open mics on Monday and Friday, and even dragged sizable crowds of ICFP participants to these venues. One was a boat, and the other one a pretty dodgy bar in Neukölln, so that alone was a memorable experience. And the host was visibly surprised when his joke “I couldn’t be a software developer – I can’t commit” was met by such a roaring response…

Anyways, ICFP is over, back to disappear in the churn of every day work, and I hope to see you all next year.

by Joachim Breitner at August 24, 2019 06:35 AM

August 19, 2019

Monday Morning Haskell

Adding Features for Better Behavior


Last week we started exploring the idea of an AI built on an evaluation function. This has the potential to let us avoid a lot of the hand-crafting that comes with AI design. The hard old way specified all the rules for the AI to follow. In the new approach, we create a mathematical function to evaluate a game position. Then we can look at all our possible moves and select the one with the best result. We could, if we wanted to, turn the input to our evaluation function into a vector of numbers. Its output is also a number. This property will help us realize our goal of machine-learning this function in the future.

We made a rudimentary version of this function last week. Even before turning to machine learning, there are a couple ways to improve our function. We can try tweaking the weights we applied to each feature. But we can also try coming up with new features, or try different combinations of features. This week, we'll try the latter approach.
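As a concrete illustration of the evaluate-and-pick-the-best loop described above, the selection step might look like the following sketch. The names chooseBestMove, possibleMoves, and evaluate are ours for illustration, not the game's actual API:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Pick the successor state with the highest evaluation score.
-- `possibleMoves` generates candidate next states; `evaluate` scores one.
chooseBestMove :: (world -> [world]) -> (world -> Float) -> world -> world
chooseBestMove possibleMoves evaluate w =
  maximumBy (comparing evaluate) (possibleMoves w)
```

Everything AI-specific lives in evaluate; tweaking weights or adding features never touches the selection logic.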

In the coming weeks as we start exploring machine learning, we'll use Tensor Flow with Haskell! To get prepared, download our Haskell Tensor Flow guide!

Existing Features

Last week, we came up with a few different features that could help us navigate this maze. These features included:

  1. Maze distance to goal
  2. Manhattan distance to goal
  3. Whether or not an enemy is on our location
  4. Whether or not our stun is available
  5. The number of drills we have available
  6. The number of enemies that are nearby (using manhattan distance)

But there were some clear sub-optimal behaviors with our bot. We tend to get "zoned out" by enemies, even when they aren't near us by maze distance. Obviously, it would suit us to use maze distance instead of manhattan distance. But we also want to be willing to approach enemies aggressively when we have our stun, and retreat intelligently without it. To that end, let's add a couple more features:

  1. The number of enemies on the shortest path to the goal.
  2. The shortest distance to an enemy from a particular square (only up to 5)

We'll impose a penalty for close enemies if we don't have our stun. Otherwise, we'll ignore that feature. Then we'll also impose a penalty for having more enemies on our shortest path. This will make us more willing to use the stun, rather than waiting.

Enemies In The Way

Our first order of business will be to determine how many enemies lie on our shortest path. We'll filter the path itself based on membership in the active enemies set:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    activeEnemyLocations = …

    shortestPath =
      getShortestPath (worldBoundaries w) playerLoc goalLoc

    enemiesOnPath = length $ filter
      (\l -> Set.member l (Set.fromList activeEnemyLocations))
      shortestPath
Then we'll assign each enemy on this path a penalty greater than the value of using the stun. We'll add this score to our other scores.

evaluateWorld :: World -> Float
evaluateWorld w =
  enemiesOnPathScore + ...
  where
    enemiesOnPath = ...
    enemiesOnPathScore = -85.0 * fromIntegral enemiesOnPath

Maze Distance

Next lets get the shortest maze distance to a nearby enemy. We'll actually want to generalize the behavior of our existing BFS function for this. We want to find the shortest path to any one of the enemy locations. So instead of supplying a single target location, we'll supply a set of target locations. Then we'll cap the distance to search so we aren't doing a full BFS of the maze every time. This gives an optional range parameter. Let's use these ideas to make an expanded API that our original function will use.

getShortestPathToTargetsWithLimit
  :: Maze
  -> Location
  -> Set.Set Location
  -> Maybe Int
  -> [Location]
getShortestPathToTargetsWithLimit maze initialLocation targetLocations maxRange = ...

-- Original function call!
getShortestPath maze initialLocation targetLocation =
  getShortestPathToTargetsWithLimit maze initialLocation
    (Set.singleton targetLocation) Nothing

bfs
  :: Maze
  -> Location
  -> Set.Set Location -- Now a set of targets
  -> Maybe Int -- Added range parameter
  -> State BFSState [Location]
bfs = ...

We'll have to make a few tweaks to our algorithm now. Each search state element will have a "distance" associated with it.

data BFSState = BFSState
  { bfsSearchQueue :: Seq.Seq (Location, Int)
  , ...
  }

-- Our initial state has a distance of 0
getShortestPathToTargetsWithLimit maze initialLocation targetLocations maxRange =
  evalState
    (bfs maze initialLocation targetLocations maxRange)
    (BFSState
      (Seq.singleton (initialLocation, 0))
      (Set.singleton initialLocation)
      Map.empty)

Now we need a couple modifications to the core bfs function. When extracting the next element in the queue, we have to consider its distance. All new items we create will increment that distance. And if we're at the max distance, we won't add anything to the queue. Finally, when evaluating if we're done, we'll check against the set of targets, rather than a single target. Here's our bfs code, with differences noted.

bfs
  :: Maze
  -> Location
  -> Set.Set Location
  -> Maybe Int
  -> State BFSState [Location]
bfs maze initialLocation targetLocations maxRange = do
  BFSState searchQueue visitedSet parentsMap <- get
  if Seq.null searchQueue
    then return []
    else do

      -- ! Unwrap distance as well
      let (nextLoc, distance) = Seq.index searchQueue 0

      -- ! Check set membership, not equality
      if Set.member nextLoc targetLocations
        then return (unwindPath parentsMap [nextLoc])
        else do

          -- ! Add the new distance to each adjacent cell
          let adjacentCells = (, distance + 1) <$> 
                getAdjacentLocations maze nextLoc

          -- ! Account for the distance with a new helper function
          let unvisitedNextCells = filter
                (shouldAddNextCell visitedSet)
                adjacentCells

          let newSearchQueue = foldr
                (flip (Seq.|>))
                (Seq.drop 1 searchQueue)
                unvisitedNextCells
              newVisitedSet = Set.insert nextLoc visitedSet
              newParentsMap = foldr
                (\(l, _) -> Map.insert l nextLoc)
                parentsMap unvisitedNextCells
          put (BFSState newSearchQueue newVisitedSet newParentsMap)
          bfs maze initialLocation targetLocations maxRange
  where
    -- ! Helper function to account for distance when adding to queue
    shouldAddNextCell visitedSet (loc, distance) = case maxRange of
      Nothing -> not (Set.member loc visitedSet)
      Just x -> distance <= x && not (Set.member loc visitedSet)

    unwindPath parentsMap currentPath = ...

Now to use this feature, we'll use our new different shortest path call. If the distance is "0", this means we have no enemies near us, and there's no penalty. We also won't apply a penalty if our stun is available. Otherwise, we'll provide a stiffer penalty the shorter the path. Then we mix it in with the other scores.

evaluateWorld :: World -> Float
evaluateWorld w =
  nearestEnemyDistanceScore + ...
  where
    nearestEnemyDistance = length $ getShortestPathToTargetsWithLimit
      (worldBoundaries w)
      playerLoc
      (Set.fromList activeEnemyLocations)
      (Just 4)

    nearestEnemyDistanceScore =
      if nearestEnemyDistance == 0 || stunAvailable then 0.0
        else -100.0 * fromIntegral (5 - nearestEnemyDistance)

We'll also drop the enemy manhattan distance weight to -5.0.


From this change, our player suddenly appears much more intelligent! It will back away from enemies when it is missing its stun. It will use the stun and go past the enemy when appropriate.

There are still ways we could improve the AI. It doesn't account for future space to retreat when running away. It sometimes uses the stun too early, when it might be better to wait for more enemies to come into range. But it's not clear how we could improve it by tweaking the weights. This means it's time to consider machine learning as an option to get better weights!


Next week, we'll re-acquaint ourselves with the basics of machine learning and Tensor Flow. This will set us up to write a program that will determine our AI weights.

We're going to start working with Tensor Flow next week! To make sure you can keep up, download our Haskell Tensor Flow Guide. It'll help you with the basics of making this complex Haskell library work.

by James Bowen at August 19, 2019 02:30 PM

Magnus Therning

Hedgehog on a REST API, part 3

In my previous post on using Hedgehog on a REST API, Hedgehog on a REST API, part 2, I ran the test a few times and adjusted the model to deal with the incorrect assumptions I had initially made. In particular, I had to adjust how I modelled the User ID. Because of the simplicity of the API that wasn’t too difficult. However, that kind of completely predictable ID isn’t found in all APIs. In fact, it’s not uncommon to have completely random IDs in APIs (often they are UUIDs).

So, I set out to try to deal with that. I’m still using the simple API from the previous posts, but this time I’m pretending that I can’t build the ID into the model myself, or, put another way, I’m capturing the ID from the responses.

The model state

When capturing the ID it’s no longer possible to use a simple Map Int Text for the state, because I don’t actually have the ID until I have an HTTP response. However, the ID plays an important role in the construction of a sequence of actions. The trick is to use Var Int v instead of an ordinary Int. As I understand it (and I believe that’s a good enough understanding to make use of Hedgehog possible), this way the ID is an opaque blob in the construction phase, and it’s turned into a concrete value during execution. While in the opaque state it implements enough type classes to be useful for my purposes.

The API calls: add user

When taking a closer look at the Callback type, not all the callbacks get the state in the same form, opaque or concrete, and one of them, Update, actually receives the state in both forms depending on the phase of execution. This has the most impact on the add user action. To deal with it there’s a need to rearrange the code a bit; to be specific, commandExecute can no longer return a tuple of both the ID and the status of the HTTP response, because the update function can’t reach into the tuple, which it needs to do to update the state.

That means the commandExecute function will have to do tests too. It would be nicer to keep all tests in the callbacks, but sticking a MonadTest m constraint on commandExecute turns it into a nice solution anyway.

I found that once I’d come around to folding the Ensure callback into the commandExecute function the rest fell out from the types.

The API calls: delete user

The other actions, deleting a user and getting a user, required only minor changes and the changes were rather similar in both cases.

Now the type for the action needs to take a Var Int v instead of just a plain Int.

This in turn affects the implementation of HTraversable.

Then the changes to the Command mostly comprise use of concrete in places where the real ID is needed.

deleteUser :: (MonadGen n, MonadIO m) => Command n m State
deleteUser = Command gen exec [ Update u
                              , Require r
                              , Ensure e
                              ]
  where
    gen (State m) = case M.keys m of
      [] -> Nothing
      ks -> Just $ DeleteUser <$> Gen.element ks

    exec (DeleteUser vi) = liftIO $ do
      mgr <- newManager defaultManagerSettings
      delReq <- parseRequest $ "DELETE http://localhost:3000/users/" ++ show (concrete vi)
      delResp <- httpNoBody delReq mgr
      return $ responseStatus delResp

    u (State m) (DeleteUser i) _ = State $ M.delete i m

    r (State m) (DeleteUser i) = i `elem` M.keys m

    e _ _ (DeleteUser _) r = r === status200


This post concludes my playing around with state machines in Hedgehog for this time. I certainly hope I find the time to put it to use on some larger API soon. In particular I’d love to put it to use at work; I think it’d be an excellent addition to the integration tests we currently have.

August 19, 2019 12:00 AM

August 18, 2019

Michael Snoyman

Haskell kata: withTryFileLock

This is the first Haskell code kata I’ve put on this blog (to my knowledge). The idea is to present a self contained, relatively small coding challenge to solidify some skills with Haskell. If people like this and would like to see more, let me know. Caveat: these will almost certainly be supply driven. As I notice examples like this in my code, I’ll try to extract them like this blog post.

OK, here’s the story. The filelock library provides a set of functions for working with locked files. Some of these will block until a file lock is available. However, some will instead return a Maybe value and use Nothing to represent the case where a lock is not available.

What’s interesting about this is the withTryFileLock function, which is a rare combination of the bracket pattern and potential failure. Its signature is:

withTryFileLock
  :: FilePath
  -> SharedExclusive
  -> (FileLock -> IO a)
  -> IO (Maybe a)

The FilePath parameter says which file to try and lock. SharedExclusive says the type of lock to take. The third parameter is the action to perform with the file lock. That action will return an IO a action. Then, if the lock is taken, that a value ends up wrapped in a Just constructor and returned from withTryFileLock. If the lock failed, then Nothing is returned.
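To make that shape concrete, here is a minimal analogue of the try-bracket pattern built on an MVar instead of a file lock. withTryLock is our own illustrative function, not part of the filelock package; a full MVar plays the role of an available lock:

```haskell
import Control.Concurrent.MVar (MVar, newMVar, putMVar, tryTakeMVar)
import Control.Exception (finally)

-- Run the action only if the "lock" (a full MVar) can be taken,
-- and release the lock afterwards even if the action throws.
withTryLock :: MVar () -> (() -> IO a) -> IO (Maybe a)
withTryLock lock action = do
  taken <- tryTakeMVar lock
  case taken of
    Nothing -> pure Nothing                               -- lock busy: report failure
    Just () -> (Just <$> action ()) `finally` putMVar lock ()
```

As with withTryFileLock, a nested attempt on the same lock sees Nothing while the outer action still holds it.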

The thing is, there’s an alternative function signature we could have instead, which would provide a Maybe FileLock to the inner action. It looks like this:

  :: FilePath
  -> SharedExclusive
  -> (Maybe FileLock -> IO a)
  -> IO a

Why would you want one versus the other? It’s not the topic I’m focusing on today, and it honestly doesn’t matter that much. Here’s the code kata:

Implement the second version in terms of the first, and the first version in terms of the second.

To complete these code kata:

  1. Copy/paste the code snippet below into a file called Main.hs
  2. Make sure you have Stack installed.
  3. Make tweaks to Main.hs.
  4. Run stack Main.hs.
  5. If you get an error in step 4, go back to 3.
  6. Congratulations, you’ve successfully fixed the program and parsed my BASIC-esque goto statement!

Bonus points: generalize version1 and version2 to work in any MonadUnliftIO.

#!/usr/bin/env stack
-- stack --resolver lts-14.1 script
import System.FileLock (FileLock, SharedExclusive (..), withTryFileLock)

-- We've imported this function:
-- withTryFileLock
--   :: FilePath
--   -> SharedExclusive
--   -> (FileLock -> IO a)
--   -> IO (Maybe a)

-- | Implement this function by using the 'withTryFileLock' imported above.
version1
  :: FilePath
  -> SharedExclusive
  -> (Maybe FileLock -> IO a)
  -> IO a
version1 = _

-- | And now turn it back into the original type signature. Use the
-- 'version1' function we just defined above.
version2
  :: FilePath
  -> SharedExclusive
  -> (FileLock -> IO a)
  -> IO (Maybe a)
version2 = _

-- | Just a simple test harness
main :: IO ()
main = do
  version1 "version1.txt" Exclusive $ \(Just _lock) ->
    version1 "version1.txt" Exclusive $ \Nothing ->
    putStrLn "Yay, it worked!"

  Just _ <- version2 "version2.txt" Exclusive $ \_lock -> do
    Nothing <- version2 "version2.txt" Exclusive $
      error "Should not be called"
    pure ()
  putStrLn "Yay, it worked!"

August 18, 2019 12:04 PM

August 16, 2019

Functional Jobs

Developer - Erlang, Elm, and Haskell at Driebit (Full-time)

Are you convinced of the benefits of functional programming? Are you always thinking about what could be the simplest and most elegant way to achieve something? And are you interested in art, culture, sustainability, and education? Then you’re the person we’re looking for!


Our team

We are a team of fifteen people. Three directors coordinate the various subdivisions, but also contribute substantively to the work we do.

We treasure an open culture in which personal opinions are appreciated. That shows in our projects, but also at the lunch table. We like to talk about what we do, and there is always room for new ideas.

A free culture like ours can only flourish because we take our responsibilities seriously. For good relationships with our clients, for the things we build, and for you; so you can have a healthy work-life balance.

alt text

What you do at Driebit

  • You build websites! Either by yourself or together with your colleagues, you strive to find the most elegant solutions to your problems.
  • You think hard about the simple way to build things, and you are capable of explaining to clients why that’s different from the easy way.
  • Because we believe in the right tools for the right job, you feel at liberty to introduce interesting technical solutions, and you do so freely and happily.
  • You improve existing websites by fixing bugs or adding new features.
  • You work on tools that help you and your colleagues be fitter, happier, more productive.
  • You may organise and host meetups, about functional programming for example!

Why you like it

  • At Driebit we work with Erlang, Haskell, and Elm.
  • Your input, talent, and knowledge are all put towards a positive contribution to society.
  • Because of longstanding relationships with our clients, you will develop interesting connections.
  • We always aim for the highest quality, and sometimes even win prizes!
  • We have lunch together every day, with nice bread and organic hagelslag!
  • We have a beautiful, sun-lit office in the best area of Amsterdam, offering a pleasant view of the canal and an excellent launch pad for afternoon walks.

alt text

What are we looking for in you?

  • Someone who loves FP and especially Haskell <3
  • Keep It Simple, Silly. Someone who’d rather do three things very well, than ten things so-so.
  • Someone for whom meaningful work is not a wish, but a prerequisite.
  • Someone who’s available 4-5 days a week.
  • A basic understanding of Dutch
  • You live in the Netherlands, preferably in or around Amsterdam

So what’s next?

We would like to get to know you a little better. Please write a short summary of why you think this job fits you, and what sort of colleague you are.

If you have any questions, please do not hesitate to contact Dorien Drees (dorien [at] driebit [dot] nl).

IMPORTANT MESSAGE TO RECRUITERS: please, please, please do not contact us.

Get information on how to apply for this position.

August 16, 2019 09:12 AM

August 12, 2019

Functional Jobs

Scala developer for NetLogo at Northwestern University (Full-time)

The Center for Connected Learning at Northwestern University is looking for a full-time Scala/Java Software Developer to work on the NetLogo desktop application, a computational modeling environment widely-used in both education and research. This might be a good opportunity for you if you would enjoy working on:

  • A software project with thousands of users all over the world.
  • A programming language compiler and runtime.
  • A graphical simulation and modeling environment.
  • An open-source project with a growing public ecosystem.

This Software Developer position is based at Northwestern University's Center for Connected Learning and Computer-Based Modeling (CCL), working in a small collaborative development team in a university research group that also includes professors, postdocs, graduate students, and undergraduates, supporting the needs of multiple research projects. Many of the projects undertaken by the lab use the NetLogo software.

NetLogo is a programming language and an agent-based modeling environment. The NetLogo language is a dialect of Logo/Lisp specialized for building agent-based simulations of natural and social phenomena. NetLogo has hundreds of thousands of users ranging from grade school students to advanced researchers. NetLogo also features an expansive API that members of the NetLogo community use to extend the language to integrate with software like GIS databases, Python, R, and Mathematica, and to interface with hardware devices like Arduino boards and video cameras.

Application information:

The Northwestern campus is in Evanston, Illinois on the Lake Michigan shore, adjacent to Chicago and easily reachable by public transportation.

Specific Responsibilities:

  • Independently implements NetLogo features and bug fixes in Scala and Java, including doing code analysis to identify clean designs and architecture changes as needed.
  • Collaborates with the NetLogo development team and principal research investigators in planning and designing enhancements for NetLogo and other related projects.
  • Interacts with lab members and the NetLogo user community including responding to bug reports, questions, and suggestions, and reviewing open-source contributions; provides feedback and guidance to student workers.
  • Performs other duties as required or assigned.

Minimum Qualifications:

  • A bachelor's degree in computer science or a closely related field or the equivalent combination of education, training and experience from which comparable skills and abilities can be acquired.
  • Two or more years of software development experience, with demonstrated efforts at improving software development skills and knowledge.

Preferred Qualifications:

  • Experience working effectively as part of a small software development team, including maintaining close collaboration, using distributed version control, and implementing automated testing.
  • Experience with at least one JVM language, Scala strongly preferred.
  • Experience developing GUI applications, especially Java Swing-based applications.
  • Experience with programming language design and implementation, functional programming (especially Haskell or Lisp), and compilers.
  • Interest in and experience with computer-based modeling and simulation, especially agent-based simulation.
  • Experience working on research projects in an academic environment.
  • Experience with open-source software development and supporting the growth of an open-source community.
  • Interest in education and an understanding of secondary school math and science content.

As per Northwestern University policy, this position requires a criminal background check. Successful applicants will need to submit to a criminal background check prior to employment.

Visa sponsorship may be available for qualified candidates.

Northwestern University is an Equal Opportunity, Affirmative Action Employer of all protected classes including veterans and individuals with disabilities.

Get information on how to apply for this position.

August 12, 2019 06:28 PM

Monday Morning Haskell

Building a Better Brain


In the last few weeks, we've focused a lot on the player AI for our game. We've used a few more advanced tricks to help our player navigate the maze using drills. But that's come at a performance cost. The game can now get a little choppy when there are a lot of enemies, or when our player is far away from the goal. It also takes longer to run our analysis iterations than we would like.

This week, we'll improve the performance of our AI by caching the determined path. Lots of our calculations for shortest path measurements get repeated. We can keep track of these, and avoid the entire BFS algorithm altogether in a lot of circumstances!

This week, you should take a look at the search-caching branch on our Github repository for the complete code we're implementing here. We'll focus on changes in the MazeUtils.hs file.

We're also going to do a little bit of profiling for this article. Profiling your code is an important skill to learn about if you ever want to use Haskell in production. For some other useful skills, check out our Production Checklist!

Profiling Our Code

As alluded to above, we have a pretty good idea of where the performance bottleneck is for our code. But it always pays to be sure. So to double check, we're going to run our code under profiling. We'll go through some of the basics here, but you should also check out this article we did on profiling a while back.

We'll get a readout for our code that will tell us which functions are taking the most time. This will tell us where we can make the most effective improvements. It will also give us a concrete way to prove our improvement later.

To start, we'll need to rebuild our code with stack build --profile. Be warned this can take a while, since all the libraries also need to be re-built. Then we can re-run the analysis program we used last week:

stack exec -- analyze-game maze_save_2 --enemies +RTS -p

Here's the abbreviated readout from the generated profiling file:

total time = 32.62 secs

COST CENTRE                                  %time
drillBFS.newParentsMap.\                     21.9
drillBFS.unvisitedNextItems.\                21.7
drillBFS.newVisitedSet                       19.4
getDrillAdjacentItems                        6.2
drillBFS                                     4.5
drillBFS.newSearchQueue                      4.0
getDrillAdjacentItems.mkItemFromResult       3.0
bfs.newParentsMap.\                          2.1
bfs.newVisitedSet                            2.0
getDrillAdjacentItems.mkItemFromResult.(...) 1.7
drillBFS.unvisitedNextItems                  1.4
bfs.unvisitedNextCells.\                     1.1
drillBFS.newParentsMap                       1.0
getDrillAdjacentItems.bounds                 1.0
bfs                                          0.6
getAdjacentLocations                         0.5

Unsurprisingly, we see that drillBFS and its helpers are the biggest culprits. They account for the top seven entries on the list and a whopping 82% of the time we spend. The enemy AI calculations come in a distant second at around 6.3% of the time. So let's focus on fixing the player algorithm.

A Basic Cache for the Player

As we try to improve our player AI, there's one big observation we can make. Perhaps some of you already noted this when reading about that AI in the first place. For the most part, our player follows a single path the whole time. We calculate the complete path from start to finish on each player move cycle, but then throw most of it away. The only time we get "blown off" this path is when we have to run away from enemies.

There are only a few circumstances where we change this path! So let's make a PlayerMemory type that will keep track of it. This should save us a ton of time!

newtype PlayerMemory = PlayerMemory (Maybe [Location])

data Player = Player
  { …
  , playerMemory :: PlayerMemory
  }

We'll add this memory to our player type. When we initialize it from JSON instances, it should start out empty. There's no need to keep track of this in a save-game file.

This change will complicate our move API a little bit. It will now produce the PlayerMemory as an output:

makePlayerMove :: World -> (PlayerMove, PlayerMemory)

Using Our Memory

When it comes to making our move, we first need to put the path into memory. To start, we'll make the PlayerMemory out of the path we get from BFS.

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w =
  ( PlayerMove finalMoveDirection useStun drillDirection
  , ...
  )
  where
    shortestPath = getShortestPathWithDrills …
    memoryFromMove = PlayerMemory (Just shortestPath)

In general, we'll want to return this "memory". But there's one circumstance where we'll want to invalidate it. When we have to retreat from our enemies, we'll diverge from this ideal path. In this case, we'll return Nothing. Here's what that logic looks like:

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w =
  ( PlayerMove finalMoveDirection useStun drillDirection
  , if emptyCache then PlayerMemory Nothing else memoryFromMove
  )
  where
    (finalMoveDirection, useStun, emptyCache) = if not enemyClose
      then (shortestPathMoveDirection, False, False)
      else if canStun
        then (shortestPathMoveDirection, True, False)
        else case find (/= shortestPathMoveLocation) possibleMoves of
          Nothing -> (DirectionNone, False, True)
          Just l  -> (getMoveDirection playerLoc l, False, True)

Now let's consider when we use the cached information, as this will let us skip the BFS call altogether! We'll add one more validity check when doing this. We'll ensure that the list is non-empty and that our current location is at the head of the list. Then we can use the tail of the memory list as the result of the shortest path call!

makePlayerMove :: World -> (PlayerMove, PlayerMemory)
makePlayerMove w = ...
  where
    (useCache, cachePath) = case playerMemory currentPlayer of
      PlayerMemory (Just (first : rest)) ->
        (first == playerLoc, rest)
      _ -> (False, [])
    shortestPath = if useCache then cachePath
      else getShortestPathWithDrills ...

The last thing we need is to ensure that the memory makes it back into the player's state. This is a simple modification of our function for updating the world after the player's move:

modifyWorldForPlayerMove :: World -> Location -> PlayerMemory -> World
modifyWorldForPlayerMove w newLoc memory = ...
  where
    currentPlayer = worldPlayer w
    playerWithMemory = currentPlayer { playerMemory = memory }
    playerAfterMove = movePlayer newLoc playerWithMemory

Now we can run our analysis again. We'll see that our Player's AI functions are still the biggest contributor. But the percentage has gone down a lot. They now take up only around 55% of our total time, instead of 82%! Meanwhile, the percentage of time from the normal BFS functions is now up to around 35%. Most importantly, the total time for the analysis declined nearly five-fold. On the first run it was 32.62 seconds, and it now takes only 6.79 seconds, a huge improvement!

total time = 6.79 secs

COST CENTRE                                  %time
drillBFS.unvisitedNextItems.\                14.3
drillBFS.newParentsMap.\                     14.2
drillBFS.newVisitedSet                       12.6
bfs.newParentsMap.\                          9.9
bfs.newVisitedSet                            9.2
bfs.unvisitedNextCells.\                     5.7
getDrillAdjacentItems                        4.3
drillBFS.newSearchQueue                      2.8
getAdjacentLocations                         2.8
drillBFS                                     2.6
bfs                                          2.6
getDrillAdjacentItems.mkItemFromResult       2.0
bfs.newSearchQueue                           1.8
getDrillAdjacentItems.mkItemFromResult.(...) 1.1
bfs.unwindPath                               1.1
bfs.unvisitedNextCells                       1.0
drillBFS.unvisitedNextItems                  0.9
bfs.newParentsMap                            0.7


Profiling is an important tool we can use for improving our code, no matter what language we're working in. When our program isn't performing how we like, we have to be sure to address the right parts of it. It may have been tempting to make a different assumption from the start. Since there are many enemy characters, it would be natural to tackle that algorithm first. But our profiling output made it clear that the player AI was the problem.

Next week, we'll start exploring different AI concepts. We'll start moving towards a kind of AI that can be machine-learned. Our code will be simpler, but our product won't be as good, at least at the start! But we'll start getting used to the way an AI can evaluate positions.

For more useful resources in improving your Haskell skills, download our Production Checklist! It has a lot of different tools and libraries to check out!

by James Bowen at August 12, 2019 02:30 PM

August 11, 2019

Shayne Fletcher

Partitions of a set

Calculating the partitions of a set

Having "solved" a bunch of these divide & conquer problems, I'm the first to admit to having been lulled into a false sense of security. At first glance, the problem of this post seemed deceptively simple and consequently I struggled with it, sort of "hand-waving", not really engaging my brain and getting more and more frustrated at how the dang thing wouldn't yield to my experience! I think the moral of the story is that math doesn't care about your previous successes, so don't let your past practice trick you into laziness. Be guided by your experience but fully apply yourself to the problem at hand!

Suppose a set of two elements, {2, 3}. There are only two ways it can be partitioned: (23), (3)(2). For meaning, you might think of these two partitions like this: in the first partition, there is a connection between the elements 2 and 3; in the second, 2 and 3 are isolated from each other.

Suppose a set of elements {1, 2, 3}. There are five partitions of this set: (123), (23)(1), (13)(2), (3)(21), (3)(2)(1) (I've carefully written them out this way to help with the elucidation). Maybe you want to break here and see if you can write an algorithm for calculating them before reading on?

Observe that we can get the partitions of {1, 2, 3} from knowledge of the partitions of {2, 3} by looking at each partition of {2, 3} in turn and considering the partitions that would result by inclusion of the element 1. So, for example, the partition (23) gives rise to the partitions (123) and (23)(1). Similarly, the partition (3)(2) gives rise to the partitions (13)(2), (3)(21) and (3)(2)(1). We might characterize this process of computing new partitions of {1, 2, 3} from a partition p of {2, 3} as "extending" p.

Suppose then we write a function extend x p to capture the above idea. Let's start with the signature of extend. What would it be? Taking (23)(1) as an exemplar, we see that a component of a partition can be represented as [a] and so a partition itself then as [[a]]. We know that extend takes an element and a partition and returns a list of (new) partitions so it must have signature extend :: a -> [[a]] -> [[[a]]] (yes, lists of lists of lists are somehow easy to get confused about).

Now for writing the body of extend. The base case is the easiest of course - extending the empty partition:

extend x [] = [[[x]]]
That is, a singleton list of partitions where that one partition has one component. The inductive case yields the partition obtained by "pushing" x into the first component of p, together with the extensions that leave the first component of p alone.
extend x (h : tl) = ((x : h) : tl) : map (h :) (extend x tl)

We can now phrase the function partition with signature partition :: [a] -> [[[a]]] like this:

partition [] = [[]]
partition (h : tl) = concatMap (extend h) (partition tl)
The base case says that the only partition of the empty set is the empty partition.

Wrapping it all up, the algorithm in entirety is

partition :: [a] -> [[[a]]]
partition [] = [[]]
partition (h : tl) = concatMap (extend h) (partition tl)
extend :: a -> [[a]] -> [[[a]]]
extend x [] = [[[x]]]
extend x (h : tl) = ((x : h) : tl) : map (h :) (extend x tl)
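
As a quick check that the algorithm behaves, the number of partitions of an n-element set should be the nth Bell number (1, 1, 2, 5, 15, 52, …). Here are the definitions again, repeated so the snippet stands alone, with a small verification:

```haskell
-- Repeated from above so this check is self-contained.
partition :: [a] -> [[[a]]]
partition []       = [[]]
partition (h : tl) = concatMap (extend h) (partition tl)

extend :: a -> [[a]] -> [[[a]]]
extend x []       = [[[x]]]
extend x (h : tl) = ((x : h) : tl) : map (h :) (extend x tl)

-- Counting the partitions of sets of size 0..5 yields the Bell numbers.
bellNumbers :: [Int]
bellNumbers = map (length . partition . enumFromTo 1) [0 .. 5 :: Int]
-- bellNumbers == [1, 1, 2, 5, 15, 52]
```

For the running example, partition [2,3] produces [[[2,3]], [[3],[2]]], i.e. the two partitions (23) and (3)(2).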

by Shayne Fletcher at August 11, 2019 04:55 PM

Oleg Grenrus

ANN: cabal-fmt

Posted on 2019-08-11 by Oleg Grenrus

As Cabal- is now released, I uploaded the cabal-fmt tool to Hackage. I have been using cabal-fmt for over half a year now for my own Haskell projects, and have been happy with this minimal yet useful tool. cabal-fmt formats .cabal files, preserving the field ordering and comments.

cabal-fmt is based on Distribution.Fields functionality; cabal-fmt is a thin addition on top. The same Distribution.Fields (and the related Distribution.FieldsGrammar) is also used in haskell-ci to parse and print .cabal-like files. I also use it in other tools to implement configuration files. In my opinion, the lexical structure of .cabal files is more flexible and human-writing-friendly than YAML or JSON. YMMV. For example, the header for this post is written as

with quotes needed to disambiguate YAML. That's silly :) Cabal-like syntax would be
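
The snippets are missing here; presumably the contrast was along these lines (reconstructed from the description, with the exact header fields assumed):

```yaml
title: "ANN: cabal-fmt"
```

whereas the Cabal-like syntax needs no quoting:

```cabal
title: ANN: cabal-fmt
```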

However, enough bashing YAML.

cabal-fmt is an opinionated tool; it formats a few fields to my liking. Let us see how.


build-depends entries are formatted comma-first (with cabal-version: 2.2 also with a leading comma), tabulated, sorted, and with ^>= preferred when it can be used. For example:

or (for older cabal-version):

A single build-depends entry is formatted on a single line, like
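
The three inline samples above did not survive; based on the description, the formatting looks roughly like this (package names and version bounds are invented for illustration):

```cabal
-- cabal-version: 2.2 or later (leading comma on every entry):
build-depends:
  , base        ^>=4.12
  , bytestring  ^>=0.10
  , containers  ^>=0.6

-- older cabal-version (comma-first, no leading comma):
build-depends:
    base        ^>=4.12
  , bytestring  ^>=0.10
  , containers  ^>=0.6

-- a single dependency stays on one line:
build-depends: base ^>=4.12
```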

nub & sort

exposed-modules, other-modules, default-extensions and other-extensions are sorted and duplicates are removed. For example:

Sometimes you'll prefer some module to be first, for cabal repl. In that case, I would use two exposed-modules fields.


tested-with is one more field whose default formatting I don't like. This field drives the job selection in haskell-ci. cabal-fmt combines version ranges for compilers, and prints GHC and GHCJS in upper case.

The generated line is long, especially for packages supporting a lot of GHC versions. I don't yet have a clear preference for how to handle that.

Extra: expand exposed-modules and other-modules

The recent addition is the ability to (re)write field contents while formatting. There's an old, ongoing discussion about allowing wildcard specification of exposed-modules in the .cabal format. I'm against that change. Instead, I'd rather have cabal-fmt (or an imaginary IDE) regenerate parts of the .cabal file given some commands.

cabal-fmt: expand <directory> is one such command (the only one at the moment).

cabal-fmt will look in the directory for files, turn the filenames into module names, and append them to the contents of exposed-modules. As the field is then nubbed and sorted, expanding is idempotent. For example, cabal-fmt itself has:
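
The example is missing here; the usage looks roughly like this (the directory and module names are assumptions, not cabal-fmt's actual file):

```cabal
library
  hs-source-dirs:  src
  exposed-modules:
    -- cabal-fmt: expand src
    CabalFmt
    CabalFmt.Comments
```

On each run, cabal-fmt re-scans src, appends any new module names, then nubs and sorts the field, so the result is stable.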

The functionality is simple. There is no removal of other-modules or main-is. I think that using a different directory for these is a good enough workaround, and it may make things clearer: a directory for public modules and a directory for private ones.


And that's all that cabal-fmt does. Formatting of other fields comes directly from Cabal. I have a few ideas about what else could be done, e.g.

  • cabal-fmt: expand for extra-source-files
  • formatting of reexported-modules
  • sorting of fields, e.g. putting type and default-language to the top of the component stanzas.

But these don't bother me enough yet, so they are not there.

The implicit goal of the project is to iterate independently of cabal-install: find out what could be useful and how it can be done, and later merge it into cabal-install's cabal format functionality, while providing enough configuration knobs to not be so opinionated.

August 11, 2019 12:00 AM

August 10, 2019

Magnus Therning

Architecture of a service

Early this summer it was finally time to put this one service I’ve been working on into our sandbox environment. It’s been running without hiccups, so last week I turned it on for production as well. In this post I thought I’d document the how and why of the service in the hope that someone will find it useful.

The service functions as an interface to external SMS-sending services, offering a single place to change if we find that we are unhappy with the service we’re using.1 This service replaces an older one, written in Ruby, that no one really dares touch. Hopefully the Haskell version will prove to be a joy to work with over time.

Overview of the architecture

The service is split into two parts: a web server using scotty, and streaming data processing using conduit. Persistent storage is provided by a PostgreSQL database. The general idea is that events are picked up from the database and acted upon, which in turn results in other events that are written to the database. Those are then picked up, and round and round we go. The web service accepts requests, turns them into events and writes them to the database.

Hopefully this crude diagram clarifies it somewhat.

Diagram of the service architecture

There are a few things that might need some explanation

  • In the past we’ve wanted to have the option to use multiple external SMS services at the same time. One is randomly chosen as the request comes in. There’s also a possibility to configure the frequency for each external service.

    Picker implements the random picking and I’ve written about that earlier in Choosing a conduit randomly.

    Success and fail are dummy senders. They don’t actually send anything, and the former succeeds at it while the latter fails. I found them useful for manual testing.

  • Successfully sending off a request to an external SMS service, getting status 200 back, doesn’t actually mean that the SMS has been sent, or even that it ever will be. Due to the nature of SMS messaging there are no guarantees of timeliness at all. Since we are interested in finding out whether an SMS actually is sent a delayed action is scheduled, which will fetch the status of a sent SMS after a certain time (currently 2 minutes). If an SMS hasn’t been sent after that time it might as well never be – it’s too slow for our end-users.

    This is what report-fetcher and fetcher-func do.

  • The queue src and queue sink are actually sourceTQueue and sinkTQueue. Splitting the stream like that makes it trivial to push in events by using writeTQueue.

  • I use sequenceConduits in order to send a single event to multiple Conduits and then combine all their results back into a single stream. The ease with which this can be done in conduit is one of the main reasons why I chose to use it.2

Effects and tests

I started out writing everything based on a type like ReaderT <my cfg type> IO and using liftIO for effects that needed lifting. This worked nicely while I was setting up the basic structure of the service, but as soon as I hooked in the database I really wanted to do some testing also of the effectful code.

After reading Introduction to Tagless Final and The ReaderT Design Pattern, playing a bit with both approaches, and writing Tagless final and Scotty and The ReaderT design pattern or tagless final?, I finally chose to go down the route of tagless final. There’s no strong reason for that decision, maybe it was just because I read about it first and found it very easy to move in that direction in small steps.

There’s a split between property tests and unit tests:

  • Data types, their instances (like JSON (de-)serialisation), pure functions and a few effects are tested using properties. I’m using QuickCheck for that. I’ve since looked a little closer at hedgehog and if I were to do a major overhaul of the property tests I might be tempted to rewrite them using that library instead.

  • Most of the Conduits are tested using HUnit.


Configuration

The service will be run in a container and we try to follow the 12-factor app rules, where the third one says that configuration should be stored in the environment. All previous Haskell projects I’ve worked on have been command line tools where configuration is done (mostly) using command line arguments. For that I usually use optparse-applicative, but it’s not applicable in this setting.

After a bit of searching on hackage I settled on etc. It turned out to be nice and easy to work with. The configuration is written in JSON and only specifies environment variables. It’s then embedded in the executable using file-embed. The only thing I miss is a ToJSON instance for Config – we’ve found it quite useful to log the active configuration when starting a service and that log entry would become a bit nicer if the message was JSON rather than the (somewhat difficult to read) string that Config’s Show instance produces.


Logging

There are two requirements we have when it comes to logging:

  1. All log entries tied to a request should have a correlation ID.
  2. Log requests and responses

I’ve written about correlation ID before, Using a configuration in Scotty.

Logging requests and responses is an area where I’m not very happy with scotty. It feels natural to solve it using middleware, but the representation, especially of responses, is a bit complicated, so for the time being I’ve skipped logging the bodies of both. I’d be most interested to hear of libraries that could make that easier.

Data storage and picking up new events

The data stream processing depends heavily on being able to pick up when new events are written to the database. Especially when there is more than one instance running (we usually have at least two instances running in the production environment). To get that working I’ve used postgresql-simple’s support for LISTEN and NOTIFY via the function getNotification.

When I wrote about this earlier, in Conduit and PostgreSQL, I got some really good feedback that made my solution more robust.

Delayed actions

Some things in Haskell feel almost like cheating. The light-weight threading makes me confident that a forkIO followed by a threadDelay (or in my case, the ones from unliftio) will suffice.
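
As a sketch of that pattern using plain base (the function name here is mine, not the service's code):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (void)

-- Run an action once, after roughly the given number of microseconds,
-- on a cheap green thread. The caller doesn't wait for it.
scheduleIn :: Int -> IO () -> IO ()
scheduleIn micros act = void (forkIO (threadDelay micros >> act))
```

For the two-minute report fetch described above, the delay would be 120 * 1000 * 1000 microseconds.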

  1. It has happened in the past that we’ve changed SMS service after finding that they weren’t living up to our expectations.

  2. A while ago I was experimenting with other streaming libraries, but I gave up on getting re-combination to work – Zipping streams

August 10, 2019 12:00 AM

August 09, 2019

Oliver Charles

Who Authorized These Ghosts!?

Recently at CircuitHub we’ve been making some changes to how we develop our APIs. We previously used Yesod with a custom router, but we’re currently exploring Servant for API modelling, in part due to its potential for code generation for other clients (e.g., our Elm frontend). Along the way, this is requiring us to rethink and reinvent previously established code, and one of those areas is authorization.

To recap, authorization is

the function of specifying access rights/privileges to resources related to information security and computer security in general and to access control in particular.

This is in contrast to authentication, which is the act of showing that someone is who they claim to be.

Authorization is a very important process, especially in a business like CircuitHub where we host many confidential projects. Accidentally exposing this data could be catastrophic to both our business and customers, so we take it very seriously.

Out of the box, Servant has experimental support for authorization, which is a good start. servant-server gives us Servant.Server.Experimental.Auth which makes it a doddle to plug in our existing authentication mechanism (cookies & Redis). But that only shows that we know who is asking for resources; how do we check that they are allowed to access the resources?

As a case study, I want to have a look at a particular endpoint, /projects/:id/price. This endpoint calculates the pricing options CircuitHub can offer a project, and there are a few important points to how this endpoint works:

  1. The pricing for a project depends on the user viewing it. This is because some users can consign parts so CircuitHub won’t order them. Naturally, this affects the price, so pricing is viewer dependent.
  2. Some projects are owned by organizations, and should be priced by the organization as a whole. If a user is a member of the organization that owns the project pricing has been requested for, return the pricing for the organization. If the user is not in the organization, return their own custom pricing.
  3. Private projects should only expose their pricing to superusers, the owner of the project, and any members of the project’s organization (if it’s owned by an organization).

This specification is messy and complicated, but that’s just reality doing its thing.

Our first approach was to try and represent this in Servant’s API type. We start with the “vanilla” route, with no authentication or authorization:

Next, we add authorization:

At this point, we’re on our own - Servant offers no authorization primitives (though there are discussions on this topic).

My first attempt to add authorization to this was:

There are two new routing combinators here: AuthorizeWith and CanView. The idea is AuthorizeWith somehow captures the result of authenticating, and provides that information to CanView. CanView itself does some kind of authorization using a type class based on its argument - here Capture "id" ProjectId. The result is certainly something that worked, but I was unhappy with both the complexity of implementing it (which leaves scope to get it wrong), and the lack of actual evidence of authorization.

The latter point needs some expanding. What I mean by “lacking evidence” is that with the current approach, the authorization is essentially like writing the following code:

If I later add more resource access into doThings, what will hold me accountable for checking authorization on those resources? The answer is… nothing! This is similar to boolean blindness - we performed a logical check, only to throw all the resulting evidence away immediately.

At this point I wanted to start exploring some different options. While playing around with ideas, I was reminded of the wonderful paper “Ghosts of Departed Proofs”, and it got me thinking… can we use these techniques for authorization?

Ghosts of Departed Proofs

The basic idea of GDP is to name values using higher-rank quantification, and then - in trusted modules - produce proofs that refer to these names. To name values, we introduce a Named type, and the higher-ranked function name to name things:

Note that the only way to construct a Named value outside of this module is to use name, which introduces a completely distinct name for a limited scope. Within this scope, we can construct proofs that refer to these names. As a basic example, we could use GDP to prove that a number is prime:
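
The original code samples are missing here; following the Ghosts of Departed Proofs paper, the machinery looks roughly like this (a sketch, not the post's exact code; in the real module the Named constructor would not be exported):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A value of type 'Named n a' is an 'a' tagged with a phantom name 'n'.
-- Keeping the constructor private makes 'name' the only introduction form.
newtype Named n a = Named a

forgetName :: Named n a -> a
forgetName (Named a) = a

-- The rank-2 type means the name 'n' is fresh and abstract inside the
-- continuation, so proofs about one named value can't be confused
-- with proofs about another.
name :: a -> (forall n. Named n a -> r) -> r
name x k = k (Named x)

-- A proof witness that the Int named 'n' is prime.
data IsPrime n = IsPrime

-- Like a boolean primality test, but the evidence refers to the name.
checkPrime :: Named n Int -> Maybe (IsPrime n)
checkPrime n
  | isPrime (forgetName n) = Just IsPrime
  | otherwise              = Nothing
  where
    isPrime k = k > 1 && all (\d -> k `mod` d /= 0) [2 .. k - 1]
```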

Here we have our first proof witness - IsPrime. We can witness whether or not a named Int is prime using checkPrime - like the boolean value isPrime this determines if a number is or isn’t prime, but we get evidence that we’ve checked a specific value for primality.

This is the whirlwind tour of GDP; I highly recommend reading the paper for a more thorough explanation. Also, the library justified-containers explores these ideas in the context of maps, where we have proofs that specific items are in the map (giving us total lookups, rather than partial lookups).

GDP and Authorization

This is all well and good, but how does this help with authorization? The basic idea is that authorization is itself a proof - a proof that we can view or interact with resources in a particular way. First, we have to decide which functions need authorization - these functions will be modified to require proof values that refer to the function arguments. In this example, we’ll assume our Servant handler is going to itself make a call to the price :: ProjectId -> UserId -> m Price function. However, given the specification above, we need to make sure that the user and project are compatible. To do this, we’ll name the arguments, and then introduce a proof that the user in question can view the project:

But what is this CanViewProject proof?

A first approximation is to treat it as some kind of primitive or axiom. A blessed function can postulate this proof with no further evidence:

This is a good start! Our price function can only be called with a CanViewProject that matches the named arguments, and the only way to construct such a value is to use canViewProject. Of course we could get the implementation of this wrong, so we should focus our testing efforts to make sure it’s doing the right thing.

However, the Agda programmer in me is a little unhappy about just blindly postulating CanViewProject at the end. We’ve got a bit of vision back from our boolean blindness, but the landscape is still blurry. Fortunately, all we have to do is recruit more of the same machinery to subdivide this proof into smaller ones:

Armed with these smaller authorization primitives, we can build up our richer authorization scheme:
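
The sample code is missing here; based on the surrounding description, the subdivided proof might look roughly like this (all constructor and witness names are guesses, not the post's actual code):

```haskell
-- Smaller, primitive authorization witnesses, each postulated by its own
-- trusted check (database lookups, etc.). The type parameters stand for
-- the phantom names of the user and project being discussed.
data UserIsSuperUser user         = UserIsSuperUser
data ProjectIsPublic project      = ProjectIsPublic
data UserOwnsProject user project = UserOwnsProject
data UserBelongsToProjectOrganization user project =
       UserBelongsToProjectOrganization

-- The top-level proof is a sum of the cases, each carrying the evidence
-- it relies on, so callers can pattern match and reuse the underlying facts.
data CanViewProject user project
  = SuperUserCanView  (UserIsSuperUser user)
  | PublicProjectView (ProjectIsPublic project)
  | OwnerCanView      (UserOwnsProject user project)
  | OrgMemberCanView  (UserBelongsToProjectOrganization user project)
```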

Now canViewProject just calls out to the other authorization routines to build its proof. Furthermore, there’s something interesting here. CanViewProject doesn’t postulate anything - everything is attached with a proof of the particular authorization case. This means that we can actually open up the whole CanViewProject module to the world - there’s no need to keep anything private. By doing this and allowing people to pattern match on CanViewProject, authorization results become reusable - if something else only cares that a user is a super user, we might be able to pull this directly out of CanViewProject - no need for any redundant database checks!

In fact, this very idea can help us implement the final part of our original specification:

Some projects are owned by organizations, and should be priced by the organization as a whole. If a user is a member of the organization that owns the project pricing has been requested for, return the pricing for the organization. If the user is not in the organization, return their own custom pricing.

If we refine our UserBelongsToProjectOrganization proof, we can actually maintain a bit of extra evidence:

Now whenever we have a proof UserBelongsToProjectOrganization, we can pluck out the actual organization that we’re talking about. We also have evidence that the organization owns the project, so we can easily construct a new CanViewProject proof - proofs generate more proofs!

Relationship to Servant

At the start of this post, I mentioned that the goal was to integrate this with Servant. So far, we’ve looked at adding authorization to a single function, so how does this interact with Servant? Fortunately, it requires very little to change. The Servant API type is authorization free, but does mention authentication.

It’s only when we need to call our price function that we need to have performed some authorization, and this happens in the server-side handler. We do this by naming the respective arguments, witnessing the authorization proof, and then calling price:


That’s where I’ve got to so far. It’s early days, but the approach is promising. What I really like is that there is almost a virtual slider between ease and rigour. It can be easy to get carried away, naming absolutely everything and trying to find the most fundamental proofs possible. I’ve found so far that it’s better to back off a little bit - are you really going to get some set membership checks wrong? Maybe. But a property check is probably going to be enough to keep that function in check. We’re not in a formal proof engine setting; pretending we are just makes things harder than they need to be.

by Oliver Charles at August 09, 2019 12:00 AM

August 07, 2019

Mark Jason Dominus

Technical devices for reducing the number of axioms

In a recent article, I wrote:

I guessed it was a mere technical device, similar to the one that we can use to reduce five axioms of group theory to three. …

The fact that you can discard two of the axioms is mildly interesting, but of very little practical value in group theory.

There was a sub-digression, which I removed, about a similar sort of device that does have practical value. Suppose you have a group G with a nonempty subset H, and you want to show that H is a subgroup of G. To do this it is sufficient to show three things:

  1. H is closed under the group operation
  2. G's identity element e is in H
  3. For each element of H, its inverse is also in H

Often, however, it is more convenient to show instead:

For each a and b in H, the product ab⁻¹ is also in H

which takes care of all three at once.
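
To see why the single condition suffices, note that H is nonempty, so we can pick some a ∈ H and derive all three properties from the one rule:

```latex
\begin{align*}
e      &= a a^{-1} \in H
       && \text{(take } b = a \text{)} \\
b^{-1} &= e\, b^{-1} \in H \quad \text{for each } b \in H
       && \text{(take } a = e \text{)} \\
a b    &= a \left(b^{-1}\right)^{-1} \in H
       && \text{(apply the rule to } a \text{ and } b^{-1} \text{)}
\end{align*}
```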

by Mark Dominus at August 07, 2019 03:45 PM

Philip Wadler

IOHK is hiring!

The Plutus team at IOHK is headed by Manuel Chakravarty and consists of a small but strong team of developers; I work on it as a consultant. We are designing a smart contract language, based on Haskell (for both offchain and onchain user-level programming) and System F (as the core code that runs onchain, the equivalent of the EVM for Ethereum). IOHK, unlike any other firm I know, is committed to building on peer-reviewed research, so publication is encouraged. We are hiring!
As a Functional Compiler Engineer at IOHK you will work closely with our programming language theory and cryptography researchers, our formal methods team, and our engineering team throughout the smart contracts development programme involving design, coding, testing and integrating of new smart scripting languages into our blockchain technology. This also includes the design and implementation of relevant domain specific languages (DSLs). You will have a strong understanding of programming language design, type systems, operational semantics, interpreters, and compiler implementation techniques.
Applications from folk who will increase our diversity are encouraged. Details here.

by Philip Wadler at August 07, 2019 09:07 AM

August 05, 2019

Mark Jason Dominus

Princess Andromeda

After decapitating Medusa the Gorgon, Perseus flies home on the winged sandals lent to him by Hermes, but he stops off to do some heroing. Below, he spots the beautiful princess Andromeda, chained to a rock.

Here's the description my kids got from D'Aulaire's Book of Greek Myths:

On the way home, as he flew over the coast of Ethiopia, Perseus saw, far below, a beautiful maiden chained to a rock by the sea. She was so pale that at first he thought she was a marble statue, but then he saw tears trickling from her eyes.

Here's the d’Aulaires’ picture of the pasty-faced princess:

Andromeda has been left there to distract a sea monster, which will devour her instead of ravaging the kingdom. Perseus rescues her, then murders her loser ex-boyfriend, who was conspicuously absent from the rendezvous with the monster. Perseus eventually marries Andromeda and she bears his children.

Very good. Except, one problem here. Andromeda is Princess Royal of Ethiopia, the daughter of King Cepheus and Queen Cassiopeia. She is not pale like a marble statue. She has dark skin.

How dark is not exactly clear. For the Greeks “Aethiopia” was a not entirely specific faraway land. But its name means the land of people with burnt faces, not the land of people who are pale like white marble.

The D'Aulaires are not entirely at fault here. Ovid's Metamorphoses compares her with marble:

As soon as Perseus, great-grandson of Abas, saw her fastened by her arms to the hard rock, he would have thought she was a marble statue, except that a light breeze stirred her hair, and warm tears ran from her eyes.

But he's also quite clear (in Book II) that Ethiopians have dark skin:

It was [during Phaethon episode], so they believe, that the Ethiopians acquired their dark colour, since the blood was drawn to the surface of their bodies.

(Should we assume that Ovid evokes marble for its whiteness? Some marble isn't white. I don't know and I'm not going to check the original Latin today. Or perhaps he only intended to evoke its stillness, for the contrast in the next phrase. Anyway, didn't the Romans paint their marble statuary?)

Andromeda was a popular subject for painting and sculpture over the centuries, since she comes with a built-in excuse for depicting her naked or at least draped with wet fabric. European artists, predictably, made her white:

Painting by Gustave Doré, 1869.

But at least not every time:

by Mark Dominus at August 05, 2019 06:43 PM

Monday Morning Haskell

Moving Towards ML: Evaluation Functions


Before we get started, here's a reminder that today (August 5th) is the last day of enrollment for our Haskell From Scratch course! Sign-ups close at midnight Pacific time! Don't miss out!

This week, we're going to start taking our AI in a somewhat new direction. Right now, we're hard-coding specific decisions for our player to make. But this week, we'll make a more general function for evaluating different positions. Our initial results will be inferior to the AI we've hand-coded. But we'll set ourselves up to have a much better AI in the future by applying machine learning.

For more details on the code for this article, take a look at the evaluation-game-function branch on our Github Repository! This article also starts our move towards machine learning related concepts. So now would be a good time to review our Haskell AI Series. You can download our Tensor Flow Guide to learn more about using Haskell and Tensor Flow!

Evaluation as a Strategy

Currently, our AI follows a strict set of rules. It performs pretty well for the current problem space. But suppose circumstances changed. Suppose we use different maze structures. Or we could add a completely new feature to the game. In these cases, we might need a completely different set of ideas to build a competent AI.

Our new strategy will be much more general. We'll supply our AI with a function that can evaluate a particular board position. That is, it will look at the world, and create a numeric output scoring it. Then our brain will look at all possible moves, score each position, and choose the move with the best result.

If game rules change, we'll need to rethink the evaluation function. But, by making the problem one of numbers to numbers, it'll be easier to use machine learning (instead of our own logic) to devise this function. This way, we can radically change the nature of the game, and we won't need to do too much manual work to change the AI. We might need to add new features (as we'll discuss later). But otherwise we would just need to re-train the evaluation function.

Top Down Development

To implement this approach, we'll put the "function" in functional programming. We'll start by outlining our decision making process with a series of type signatures. Let's remember that first, our overarching goal is a function that takes a World and gives us a PlayerMove:

makePlayerMove :: World -> PlayerMove

We should first determine the set of possible moves:

possibleMoves :: World -> [PlayerMove]

Then we'll need to calculate the new World from each of those moves. (We won't go over this function in this article. It mainly consists of refactoring code we already have for manipulating the game).

applyPlayerMove :: World -> PlayerMove -> World

Then we'll score each of those resulting worlds. This is where the real "brain" is going to live now:

evaluateWorld :: World -> Float

Now that we know the functions we're writing, we can already implement makePlayerMove. We'll assume our helpers already exist and then we apply the process outlined above:

makePlayerMove :: World -> PlayerMove
makePlayerMove w = bestMove
  where
    -- 1. Get our moves
    allMoves = possibleMoves w

    -- 2. See what the results of each move are
    possibleWorlds = applyPlayerMove w <$> allMoves

    -- 3. Score each resulting world
    scores = evaluateWorld <$> possibleWorlds

    -- 4. Pair each move with its score and choose the best one
    movesWithScores = zip allMoves scores
    bestMove = fst $ maximumBy (\(_, score1) (_, score2) ->
      compare score1 score2) movesWithScores

This will compile, and we can now move on to the individual components.
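As a side note, the select-the-best step can be written a little more tidily with Data.Ord.comparing in place of the explicit lambda. Here is a self-contained sketch of that pattern; the Move type and the scores are made up for illustration and are not the game's real types:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Toy stand-ins for PlayerMove and the scoring pipeline (names assumed).
data Move = North | South | East | West deriving (Show, Eq)

scoreMove :: Move -> Float
scoreMove North = 10.0
scoreMove South = -5.0
scoreMove East  = 42.0
scoreMove West  = 0.0

-- The same zip-score-maximumBy pattern as makePlayerMove, on toy data.
bestMove :: [Move] -> Move
bestMove moves = fst $ maximumBy (comparing snd) movesWithScores
  where
    movesWithScores = zip moves (map scoreMove moves)
```

Here `comparing snd` is equivalent to the lambda `\(_, s1) (_, s2) -> compare s1 s2`, so `bestMove [North, South, East, West]` picks `East`, the highest-scoring move.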

Getting Possible Moves

Let's start with getting all the possible moves. When it comes to movement, we generally have five options: stand still, or move in one of four directions. If we're out of drills, or near the boundary of the world, our options can be restricted. But we always have the sure option of standing still, so let's start with that:

possibleMoves :: World -> [PlayerMove]
possibleMoves w = ...
  where
    standStillMove = PlayerMove DirectionNone False DirectionNone

Now in every direction, we'll have a Maybe move possibility. If it's a WorldBoundary, we'll get Nothing. Otherwise if it's a wall, then we'll have a possible move as long as a drill is available. Otherwise the move is possible, and we won't need a drill. We'll wrap these behaviors in a helper function, and then it's easy to use that in each direction:

possibleMoves :: World -> [PlayerMove]
possibleMoves w = baseMoves
  where
    standStillMove = PlayerMove DirectionNone False DirectionNone
    player = worldPlayer w
    bounds = (worldBoundaries w) Array.! (playerLocation player)

    possibleMove :: (CellBoundaries -> BoundaryType) ->
      MoveDirection -> Maybe PlayerMove
    possibleMove boundaryFunc direction =
      case boundaryFunc bounds of
        WorldBoundary -> Nothing
        Wall _ -> if playerDrillsRemaining player > 0
          then Just $ PlayerMove direction False direction
          else Nothing
        AdjacentCell _ -> Just $
          PlayerMove direction False DirectionNone

    upMove = possibleMove upBoundary DirectionUp
    rightMove = possibleMove rightBoundary DirectionRight
    downMove = possibleMove downBoundary DirectionDown
    leftMove = possibleMove leftBoundary DirectionLeft

    baseMoves = standStillMove : (catMaybes [upMove, rightMove, downMove, leftMove])

Now we have to factor in that each move can also apply the stun if it's available.

possibleMoves :: World -> [PlayerMove]
possibleMoves w = baseMoves ++ stunMoves
  where
    baseMoves = standStillMove : (catMaybes [upMove, rightMove, downMove, leftMove])

    stunMoves = if playerCurrentStunDelay player /= 0 then []
      else [ m { activateStun = True } | m <- baseMoves ]

And now we've got our moves!
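The record update inside a list comprehension used for stunMoves is a general idiom worth knowing. A minimal standalone version, with a toy record whose field names are assumptions for the example:

```haskell
-- Toy record mirroring the activateStun field of PlayerMove.
data Move = Move { moveName :: String, activateStun :: Bool }
  deriving (Show, Eq)

-- Produce a stun-activated variant of every base move.
withStun :: [Move] -> [Move]
withStun baseMoves = [ m { activateStun = True } | m <- baseMoves ]
```

The update syntax `m { activateStun = True }` builds a new record that shares every other field with `m`, so the comprehension doubles the move list without mutating anything.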

Evaluating the Game Position

Now let's start tackling the problem of evaluating a particular game situation. Any manual solution we come up with here is likely to have problems. This is where machine learning will come in. But here's the general approach we want.

First, we'll select particular "features" of the world. For instance, how far away are we from the end of the maze? How many enemies are within our stun radius? We'll consider all these elements, and then come up with a "weight" for each feature. A weight is a measurement of whether that feature makes the position "good" or "bad". Then, we'll add together the weighted feature values to get a score. So here's a list of the features we're going to use:

  1. How close are we (in maze search terms) to the target location? This will use pure BFS and it will not account for using drills.
  2. How close are we in manhattan distance terms to the target location?
  3. Is there an active enemy on the same square as the player? (This will receive a heavy negative weight!)
  4. How many enemies are within our stun radius?
  5. Is our stun available?
  6. How many drills do we have left?
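The weighting scheme described above amounts to a dot product of a feature vector with a weight vector. A sketch with made-up numbers (illustrative only, not the article's tuned weights):

```haskell
-- Score a position as the weighted sum of its feature values.
score :: [Float] -> [Float] -> Float
score weights features = sum (zipWith (*) weights features)

-- Illustrative values for the 6 features in the order listed above.
sampleWeights, sampleFeatures :: [Float]
sampleWeights  = [-20.0, -5.0, -1000.0, -100.0, 80.0, 30.0]
sampleFeatures = [12.0, 5.0, 0.0, 2.0, 1.0, 3.0]
```

Casting the evaluation this way is exactly what makes it machine-learning friendly later: the features stay fixed while the weight vector becomes the thing we learn.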

Let's start by getting all these features:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    player = worldPlayer w
    playerLoc@(px, py) = playerLocation player
    radius = stunRadius . playerGameParameters . worldParameters $ w
    goalLoc@(gx, gy) = endLocation w
    activeEnemyLocations = enemyLocation <$>
      (filter (\e -> enemyCurrentStunTimer e == 0) (worldEnemies w))

    onActiveEnemy = playerLocation player `elem` activeEnemyLocations

    shortestPathLength = length $
      getShortestPath (worldBoundaries w) playerLoc goalLoc

    manhattanDistance = abs (gx - px) + abs (gy - py)

    stunAvailable = playerCurrentStunDelay player == 0

    numNearbyEnemies = length
      [ el | el@(elx, ely) <- activeEnemyLocations,
        abs (elx - px) <= radius && abs (ely - py) <= radius ]

    drillsRemaining = playerDrillsRemaining player

Now let's move on to assigning scores. If our player is on the same square as an active enemy, we lose. So let's give this a weight of -1000. Conversely, the closer we get to the target, the closer we are to winning. So let's devise a function where if that distance is 0, the score is 1000. Then the farther away we get, the more points we lose. Let's say, 20 points per square. For manhattan distance, we'll use a strict penalty, rather than reward:

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    onActiveEnemyScore = if onActiveEnemy then -1000.0 else 0.0
    shortestPathScore = 1000.0 - (20.0 * (fromIntegral shortestPathLength))
    manhattanDistanceScore = (-5.0) * (fromIntegral manhattanDistance)

Now we want to generally reward having our power ups available to us. This will stop the bot from needlessly using them and also reward it for picking up new drills. We'll also penalize having enemies too close to us.

evaluateWorld :: World -> Float
evaluateWorld w = ...
  where
    stunAvailableScore = if stunAvailable then 80.0 else 0.0
    numNearbyEnemiesScore = -100.0 * (fromIntegral numNearbyEnemies)
    drillsRemainingScore = 30.0 * (fromIntegral drillsRemaining)

And to complete the function, we'll just add these together:

evaluateWorld :: World -> Float
evaluateWorld w =
  onActiveEnemyScore +
  shortestPathScore +
  manhattanDistanceScore +
  stunAvailableScore +
  numNearbyEnemiesScore +
  drillsRemainingScore

How Well Does it Work?

When we run the game now with the AI active, we see some interesting behaviors. Our bot will generally navigate the maze well. Its path isn't optimal, as it was with drillBFS. But it makes decent choices about drilling. Its behavior around enemies is a bit strange. It tends to stay away from them, even if they're not actually close in maze distance. This makes it take longer than it needs to.

We still don't have good retreating behavior in certain cases. It will often stand still and let an enemy grab it instead of running away.

At this point, we have a couple options for improving the AI. First, we could try tweaking the weights. This will be tedious for us to do manually. This is why we want to apply machine learning techniques to come up with optimal weights.

But the other option is to update the feature space. If we can come up with more intelligent features, we won't need as precise weights.


Next week, we'll try to fix our behavior around enemies. We'll use true maze distance in more places as opposed to manhattan distance. This should give us some big improvements. Then we'll start looking into how we can learn better weights.

We'll be coming up pretty soon on using Tensor Flow for this program! Download our Haskell Tensor Flow Guide to learn more!

And if you're still a Haskell beginner, there's never been a better time to learn! Register for our Haskell From Scratch course to jump-start your Haskell journey! Enrollment ends at midnight TODAY! (August 5th).

by James Bowen at August 05, 2019 02:30 PM

Manuel M T Chakravarty

Functional Blockchain Contracts

Check out the draft of the paper describing the principles underlying the Plutus Platform. Here is the abstract:

Distributed cryptographic ledgers — aka blockchains — should be a functional programmer’s dream. Their aim is immutability: once a block has been added to the chain it should not be altered or removed. The seminal blockchain, Bitcoin, uses a graph-based model that is purely functional in nature. But Bitcoin has limited support for smart contracts and distributed applications. The seminal smart-contract platform, Ethereum, uses an imperative and object-oriented model of accounts. Ethereum has been subject to numerous exploits, often linked to its use of shared mutable state by way of its imperative and object-oriented features in a concurrent and distributed system. Coding a distributed application for Ethereum requires two languages: Javascript to run off-chain, which submits transactions written in Solidity to run on-chain.

This paper describes Plutus Platform, a functional blockchain smart contract system for coding distributed applications on top of the Cardano blockchain. Most blockchain programming platforms depend on a custom language, such as Ethereum’s Solidity, but Plutus is provided as a set of libraries for Haskell. Both off-chain and on-chain code are written in Haskell: off-chain code using the Plutus library, and on-chain code in a subset of Haskell using Template Haskell. On-chain code is compiled to a tiny functional language called Plutus Core, which is System Fω with iso-recursive types and suitable primitives.

Plutus and Cardano are available open source, and Plutus Playground provides a web-based IDE that enables users to try out the system and to develop simple applications.

August 05, 2019 10:34 AM

August 03, 2019

Mark Jason Dominus

Git wishlist: aggregate changes across non-contiguous commits

(This is actually an essay on the difference between science and engineering.)

My co-worker Lemuel recently asked if there was a way to see all the changes to master from the last week that pertained to a certain ticket. The relevant commit messages all contained the ticket ID, so he knew which commits he wanted; that part is clear. Suppose Lemuel wanted to see the changes introduced in commits C, E, and H, but not those from A, B, D, F, or G.

The closest he could come was git show H E C, which wasn't quite what he wanted. It describes the complete history of the changes, but what he wants is more analogous to a diff. For comparison, imagine a world in which git diff A H didn't exist, and you were told to use git show A B C D E F G H instead. See the problem? What Lemuel wants is more like diff than like show.

Lemuel's imaginary command would solve another common request: How can I see all the changes that I have landed on master in a certain time interval? Or similarly: how can I add up the git diff --stat line counts for all my commits in a certain interval?

He said:

It just kinda boggles my mind you can't just get a collective diff on command for a given set of commits

I remember that when I was first learning Git, I often felt boggled in this way. Why can't it just…? And there are several sorts of answers, of which one or more might apply in a particular situation:

  1. It surely could, but nobody has done it yet
  2. It perhaps could, but nobody is quite sure how
  3. It maybe could, but what you want is not as clear as you think
  4. It can't, because that is impossible
  5. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question

Often, engineers will go straight to #5, when actually the answer is in a higher tier. Or they go to #4 without asking if maybe, once the desiderata are clarified a bit, it will move from “impossible” to merely “difficult”. These are bad habits.

I replied to Lemuel's (implicit) question here and tried to make it a mixture of 2 and 3, perhaps with a bit of 4:

Each commit is a snapshot of the state of the repo at a particular instant. A diff shows you the difference between two snapshots. When you do git show commit you're looking at the differences between the snapshot at that commit and at its parent.

Now suppose you have commit A with parent B, and commit C with parent D. I come to you and say I want to see the differences in both A and C at the same time. What would you have it do?

If A and B are on a separate branch and are completely unrelated to C and D, it is hard to see what to do here. But it's not impossible. Our hypothetical command could produce the same output as git show A C. Or it could print an error message Can't display changes from unrelated commits A, C and die without any more output. Either of those might be acceptable.

And if A, B, C, D are all related and on the same branch, say with D, then C, then B, then A, the situation is simpler and perhaps we can do better.

If so, very good, because this is probably the most common case by far. Note that Lemuel's request is of this type.

I continued:

Suppose, for example, that C changes some setting from 0 to 1, then B changes it again to be 2, then A changes it a third time, to say 3. What should the diff show?

This is a serious question, not a refutation. Lemuel could quite reasonably reply by saying that it should show 0 changing to 3, the intermediate changes being less important. (“If you wanted to see those, you should have used git show A C.”)

It may be that that wouldn't work well in practice, that you'd find there were common situations where it really didn't tell you what you wanted to know. But that's something we'd have to learn by trying it out.

I was trying really hard to get away from “what you want is stupid” and toward “there are good reasons why this doesn't exist, but perhaps they are surmountable”:

(I'm not trying to start an argument, just to reduce your bogglement by explaining why this may be less well-specified and more complex than you realize.)

I hoped that Lemuel would take up my invitation to continue the discussion and I tried to encourage him:

I've wanted this too, and I think something like it could work, especially if all the commits are part of the same branch. … Similarly people often want a way to see all the changes made only by a certain person. Your idea would answer that use case also.

Let's consider another example. Suppose some file contains functions X, Y, Z in that order. Commit A removes Y entirely. Commit B adds a new function, YY, between X and Z. Commit C modifies YY to produce YY'. Lemuel asks for the changes introduced by A and C; he is not interested in B. What should happen?

If Y and YY are completely unrelated, and YY just happens to be at the same place in the file, I think we definitely want to show Y being removed by A, and then that C has made a change to an unrelated function. We certainly don't want to show all of YY being added. But if YY is considered to be a replacement for Y, I'm not as sure. Maybe we can show the same thing? Or maybe we want to pretend that A replaced Y with YY? That seems dicier now than when I first thought about it, so perhaps it's not as big a problem as I thought.

Or maybe it's enough to do the following:

  1. Take all the chunks produced by the diffs in the output of git show .... In fact we can do better: if A, B, and C are a contiguous sequence, with A the parent of B and B the parent of C, then don't use the chunks from git show A B C; use git diff A C.

  2. Sort the chunks by filename.

  3. Merge the chunks that are making changes to the same file:

    • If two chunks don't overlap at all, there's no issue, just keep them as separate chunks.

    • If two chunks overlap and don't conflict, merge them into a single chunk

    • If they overlap and do conflict, just keep them separate but retain the date and commit ID information. (“This change, then this other change.”)

  4. Then output all the chunks in some reasonable order: grouped by file, and if there were unmergeable chunks for the same file, in chronological order.

This is certainly doable.
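To make step 3 concrete, here is a rough Haskell sketch of merging overlapping, non-conflicting chunks within the same file. The Chunk type is a deliberate simplification invented for this sketch (a real diff hunk carries the changed text, not just a line range), and conflict handling from the third bullet is left out:

```haskell
import Data.List (sortOn)

-- Hypothetical simplified chunk: a file plus the line range it modifies.
data Chunk = Chunk { file :: FilePath, start :: Int, end :: Int }
  deriving (Show, Eq)

-- Sort by (file, start), then merge overlapping chunks within a file.
mergeChunks :: [Chunk] -> [Chunk]
mergeChunks = foldr merge [] . sortOn (\c -> (file c, start c))
  where
    merge c (c' : rest)
      | file c == c' `seq` file c' && end c >= start c' =
          Chunk (file c) (start c) (max (end c) (end c')) : rest
    merge c acc = c : acc
```

For example, chunks covering lines 1-5 and 4-9 of the same file collapse into one chunk covering lines 1-9, while chunks in different files stay separate. A real version would detect textual conflicts and keep those chunks apart, as the list above suggests.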

If there were no conflicts, it would certainly be better than git show ... would have been. Is it enough better to offset whatever weirdness might be introduced by the overlap handling? (We're grouping chunks by filename. What if files are renamed?) We don't know, and it does not even have an objective answer. We would have to try it, and then the result might be that some people like it and use it and other people hate it and refuse to use it. If so, that is a win!

by Mark Dominus at August 03, 2019 06:38 PM

August 02, 2019

Dominic Orchard

Considering the order of results when computing Cartesian product [short]

Sometimes in programming you need to do a pairwise comparison of some elements coming from two collections, for example, checking possible collisions between particles (which may be embedded inside a quadtree representation for efficiency). A handy operation is then the Cartesian product of the two sets of elements, to get the set of all pairs, which can then be traversed.

Whenever I need a Cartesian product of two lists in Haskell, I whip out the list monad to generate the Cartesian product:

cartesianProduct :: [a] -> [b] -> [(a, b)]
cartesianProduct as bs = as >>= (\a -> bs >>= (\b -> [(a, b)]))

Thus for every element a in as we get every element b in bs and form the singleton list of the pair (a, b), all of which get concatenated to produce the result. For example:

*Main> cartesianProduct [1..4] [1..4]
[(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4),(3,1),(3,2),(3,3),(3,4),(4,1),(4,2),(4,3),(4,4)]

This traverses the Cartesian space in row order. That is, if we imagine the square grid of possibilities here (where the element at the ith row and jth column corresponds to element (i, j)), then cartesianProduct generates the pairs in the following order:


In mathematics, the Cartesian product is defined on sets, so the order of these pairs is irrelevant. Indeed, if we want to examine all pairs, then it may not matter in what order. However, if we learn something as we look at the pairs (i.e., accumulating some state), then the order can be important.

In some recent research, I was building an automated algebra tool to find axioms for some algebraic structure, based on an interpretation. This involved generating all pairs of syntax trees t and t' of terms over the algebraic structure, up to a particular depth, and evaluating whether t = t'. I also had some automated proof search machinery to make sure that this equality wasn’t already derivable from the previously generated axioms. I could have done this derivability check as a filtering afterwards, but I was exploring a very large space, and did not expect to even be able to generate all the possibilities. I just wanted to let the thing run for a few hours and see how far it got. Therefore, I needed Cartesian product to get all pairs, but the order in which I generated the pairs became important for the effectiveness of my approach.  The above ordering (row major order) was not so useful as I was unlikely to find interesting axioms quickly by traversing the span of a (very long) row; I needed to unpack the space in a more balanced way.

My first attempt at a more balanced Cartesian product was the following:

cartesianProductBalanced :: [a] -> [b] -> [(a, b)]
cartesianProductBalanced as bs =
     concatMap (zip as) (tails bs)
  ++ concatMap (flip zip bs) (tail (tails as))

tails gives the list obtained by successively applying tail, e.g., tails [1..4] = [[1,2,3,4],[2,3,4],[3,4],[4],[]]

This definition for cartesianProductBalanced essentially traverses the diagonal of the space and then the lower triangle of the matrix, progressing away from the diagonal to the bottom-left, then traversing the upper triangle of the matrix (the last line of code), progressing away from the diagonal to the top-right corner. The ordering for a 4×4 space is then:

*Main> cartesianProductBalanced [1..4] [1..4]
[(1,1),(2,2),(3,3),(4,4),(1,2),(2,3),(3,4),(1,3),(2,4),(1,4),(2,1),(3,2),(4,3),(3,1),(4,2),(4,1)]

This was more balanced, giving me a better view of the space, but does not scale well: traversing the elements of this Cartesian product linearly means first traversing all the way down the diagonal, which could be very long! So, we’ve ended up with a similar problem to the row-major traversal.

Instead, I finally settled on a “tiling” Cartesian product:

cartesianProductTiling :: [a] -> [b] -> [(a, b)]
cartesianProductTiling [] _ = []
cartesianProductTiling _ [] = []
cartesianProductTiling [a] [b] = [(a, b)]
cartesianProductTiling as bs =
       cartesianProductTiling as1 bs1
    ++ cartesianProductTiling as2 bs1
    ++ cartesianProductTiling as1 bs2
    ++ cartesianProductTiling as2 bs2
  where
    (as1, as2) = splitAt ((length as) `div` 2) as
    (bs1, bs2) = splitAt ((length bs) `div` 2) bs

This splits the space into four quadrants and recursively applies itself to the upper-left, then lower-left, then upper-right, then lower-right, e.g.,

*Main> cartesianProductTiling [1..4] [1..4]
[(1,1),(2,1),(1,2),(2,2),(3,1),(4,1),(3,2),(4,2),(1,3),(2,3),(1,4),(2,4),(3,3),(4,3),(3,4),(4,4)]


Thus, we explore the upper-left portion first, without having to traverse down a full row, column, or diagonal. This turned out to be much better for my purposes of searching the space of possible axioms for an algebraic structure.

Note that this visits the elements in Z-order (also known as the Lebesgue space-filling curve) [Actually, it is a reflected Z-order curve, but that doesn’t matter here].

The only downside to this tiling approach is that it does not work for infinite spaces (as it calculates the length of as and bs), so we cannot exploit Haskell’s laziness here to help us in the infinite case. I’ll leave it as an exercise for the reader to define a tiling version that works for infinite lists.

Of course, there are many other possibilities depending on your application domain!

Later edit:  In the case of the example I mentioned, I actually do not need the full Cartesian product since the order of pairs was not relevant to me (equality is symmetric) and neither do I need the pairs of identical elements (equality is reflexive). So for my program, computing just the lower triangle of the Cartesian product square is much more efficient, i.e.:

cartesianProductLowerTriangle :: [a] -> [b] -> [(a, b)]
cartesianProductLowerTriangle as bs = concatMap (zip as) (tail (tails bs))

However, this does not have the nice property of tiling product where we visit the elements in a more balanced fashion. I’ll leave that one for another day (I got what I needed from my program in the end anyway).

by dorchard at August 02, 2019 06:37 PM

August 01, 2019


Exploring Cloud Builds in Hadrian

GHC’s new build system, Hadrian, now supports a cloud build option: --share. With that enabled, build artifacts will be cached and shared between builds. This can improve build times, but getting this feature to function correctly is still a work in progress. This blog post explores the requirements for correctness, some examples in Hadrian, the current state of the project, and how developers can use cloud builds to improve iteration time.


For much of its life GHC has had a build system based around GNU Make. Though it has served its purpose well, the make-based build system is being replaced by a new and improved project called Hadrian. Hadrian is a build system written in Haskell and based on the Shake library. With Hadrian comes cleaner code and better maintainability, but also some interesting features of Shake such as cloud builds. Cloud builds are enabled with the --share option. A cloud build keeps a cache of build artifacts that can be shared between builds, eliminating a lot of duplicate work that had already been done on a previous build. This feature seems particularly useful in the context of continuous integration (CI). Merge Requests made via GitLab must all pass a comprehensive set of tests on the CI server (the CI pipeline). This involves building GHC multiple times and running the test suite. Due to limited resources, long build times, and the large number of tested build configurations, the CI pipeline typically takes 6 to 8 hours to complete. Keep in mind that each merge request is typically only making a small change to the compiler. By utilizing Shake’s cloud build system to cache build artifacts between CI runs, one would expect a large speedup.

After an initial effort, Andrey Mokhov achieved a successful cloud build and rebuild of GHC with Hadrian. After Andrey’s changes were merged I did a cloud build (using the --share option) of GHC to fill up the cloud build cache, deleted the build directory, then timed a rebuild. This resulted in a 10x speedup of the rebuild compared to a clean non-cloud build, a very positive result. On my machine this took a 25 minute build down to only 2.5 minutes! However, this is an overly optimistic test: no source files have changed, so we expect full cache utilization. In practice we expect source files to have changed and so should expect less cache utilization. Regardless, the prospect of using cloud builds to improve CI times looked very promising. Unfortunately, the correctness of cloud builds came into question, and the last thing we want in CI is an incorrect build. We needed a better understanding of the requirements for correct cloud builds.

Build rules

To understand the problems at hand, one must understand the basics of how a build system like Hadrian is implemented using Shake. Put simply, you must create a set of build rules that map file name patterns to Actions that perform some IO to generate the corresponding file or files, i.e. the “targets” of the rule. In particular, Action is a monad and the need :: [FilePath] -> Action () function is used to dynamically specify the rule’s inputs (i.e. dependencies) that must be built before the action continues. Here is an example of a typical build rule that generates a file with the extension .output:
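The original snippet did not survive in this copy of the post, so the following is a representative sketch of such a rule using Shake's %> operator; the .input/.output naming and the rule body are assumptions for illustration, not the post's actual code:

```haskell
import Data.Char (toUpper)
import Development.Shake
import Development.Shake.FilePath ((-<.>))

-- A sketch of a typical Shake rule: build "foo.output" from "foo.input".
outputRule :: Rules ()
outputRule = "*.output" %> \out -> do
  let input = out -<.> "input"
  need [input]                 -- declare the rule's input before using it
  contents <- readFile' input
  writeFile' out (map toUpper contents)
```

The call to need is the crucial part for the correctness discussion that follows: it is how a rule declares its inputs to Shake.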

Evaluating Cloud Build Correctness

Let’s consider what it means to have a correct build system. A correct build system is one that always produces a correct set of build artifacts according to some specification. In the case of Hadrian, while not formally specified, this would include producing a functioning ghc binary that reflects the build inputs (e.g. source files and build flavour). We can break correctness down into three specific definitions:

  1. Clean Build Correctness: Start with no existing build artifacts. Running a build will succeed and produce correct build artifacts.
  2. Incremental Build Correctness: Start with existing build artifacts (e.g. by doing a full or partial build). Possibly change some build inputs. Running a build will succeed and produce correct build artifacts.
  3. Cloud Build Correctness: Start with an existing and populated cloud cache and possibly existing build artifacts (e.g. by doing a cloud build then possibly deleting build artifacts). Possibly change some build inputs. Running a cloud build will succeed and produce correct build artifacts.

For each definition above, a requirement for correctness is that the build produces correct build artifacts. Another way of thinking about this is that correct clean, incremental, and cloud builds should always produce an equal set of build artifacts given the same build inputs. In practice, “equal” ignores some insignificant differences due to nondeterminism in GHC and other build tools.

A main source of incremental and cloud build correctness bugs relates to build rules declaring an incorrect set of inputs via need. In an incremental build, the rule inputs indicate when the rule needs to be rerun to produce fresh build artifacts. If an input is undeclared and that input changes, it could result in the rule failing to rerun and hence stale build artifacts. In cloud builds, the rule inputs are used as keys into the cache. If an input is undeclared and that input changes, it could result in an (incorrect) cache hit and again stale build artifacts.

In this blog post we only focus on declaring the correct set of inputs. Rules are assumed to otherwise correctly call produces and generate build artifacts for that rule. Note, Andrey Mokhov’s initial effort largely focused on the use of produces.

Let’s take a closer look at what we mean by “input”. The natural definition of inputs, which we call “direct inputs”, is the set of all files used (i.e. read) by the rule. We can also define an “indicating inputs set”. Consider comparing build artifacts of two clean builds where some build inputs were changed. Here “changed” means the files differ between the first and second build. An indicating inputs set for a rule is a set of files such that:

a rule’s output may change only if at least one of the rule’s indicating inputs has changed.

The direct inputs is always a valid indicating inputs set. There can be other valid indicating inputs sets that need not be a subset or equal to the direct inputs. An indicating inputs set may include files outside the set of direct inputs (the libffi rules are a good example). Generally we try to pick a minimal or close to minimal set of indicating inputs in order to keep the rule implementation simple.

On that note, the invariants to achieve the three levels of correctness defined above are as follows. Understanding these invariants is fundamental to evaluating incremental and cloud build correctness.

  1. Clean Build: All rules need enough to build all direct inputs (this can happen via direct or transitively triggered rules).
  2. Incremental Build: Clean build invariants + all rules need a valid indicating inputs set.
  3. Cloud Build: Clean and incremental invariants + all rules produce all outputs (excluding the rule target(s)).

Dependency Lint Errors

Now that we understand the requirements for correctness, how can we find correctness issues in Hadrian? Shake has a great linting feature (enabled with --lint-fsatrace) that utilizes the Filesystem Access Tracer tool (fsatrace) to observe file accesses within a rule. This feature results in a linting error when any file is read without first being declared with need. In other words, we get linting errors when there is an omitted need for a direct input. Zero linting errors would imply that all direct inputs are needed, and since the direct inputs are a valid indicating inputs set, this would imply incremental build correctness. Great! So how many lint errors do we have? At the start of my endeavor, about 400,000 lint errors! That’s 400,000 files read by various rules that forgot to need them first.

Are all lint errors indicative of a correctness issue? Not necessarily. Remember, lint errors identify omitted direct inputs, but we are concerned with indicating inputs. As such, lint errors must be investigated individually. The general workflow is:

  1. Pick a lint error. The error contains the rule and the omitted input file paths.
  2. Choose an indicating input set for the rule.
  3. Ensure all indicating inputs are needed in that rule.
  4. The rule is now correct, so trackAllow any direct inputs not in the indicating inputs set (this disables lint errors for those files).


So far there is nothing too daunting about this task. Even the large number of lint errors is not too bad when you consider that individual fixes can resolve large numbers of lint errors if the rule is run many times. It sounds like we have our work cut out for us, but let’s look at two cases that prove tricky to resolve.

GHC Dependency Generation and CPP Inputs

Imagine we have a Haskell module that uses CPP:
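A minimal sketch of such a module (the file contents here are assumed; only the names A.hs, B.h, and C.h come from the discussion below, with B.h itself doing #include "C.h"):

```haskell
{-# LANGUAGE CPP #-}
-- A.hs: a module whose source pulls in CPP headers.
module A where

#include "B.h"

answer :: Int
answer = 42
```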

Now when we build A.hs we must make sure to need the CPP includes, i.e. the rule for A.hs has direct inputs A.hs, B.h, and C.h. We ideally want to discover the .h inputs dynamically so that if we, e.g., add new #include directives, we don’t need to update our build system rules. In Hadrian we do this by generating a .dependencies file using GHC’s dependency generation options -M -include-cpp-deps; this happens in a rule for the .dependencies file itself. Now the rule to build A.hs first needs the .dependencies file, then dynamically needs the dependencies (i.e. inputs) listed within. In this case A.hs, B.h, and C.h. The rules look something like this:
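A possible sketch of the rule pair (assumed Shake code; parseDepFile is a hypothetical helper that extracts the dependency paths from the generated Makefile fragment):

```haskell
-- Dependency-generation rule plus the compile rule that
-- dynamically needs whatever the first rule discovered.
".dependencies" %> \out -> do
    need ["A.hs"]  -- the root source is known statically
    cmd_ "ghc" ["-M", "-include-cpp-deps", "-dep-makefile", out, "A.hs"]

["A.o", "A.hi"] &%> \_ -> do
    need [".dependencies"]
    deps <- parseDepFile <$> readFile' ".dependencies"
    need deps      -- dynamically needs A.hs, B.h, and C.h
    cmd_ "ghc" ["-c", "A.hs"]
```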

So what is the problem here? The rule for ["A.o", "A.hi"] is fine as it tracks the correct inputs. What about the ".dependencies" rule? Running ghc -M -include-cpp-deps A.hs ... reads A.hs then traverses all the #included files, hence the direct inputs of .dependencies are A.hs, B.h, and C.h. This can’t be reduced to a smaller indicating inputs set; any of those files can change the output of .dependencies. Well, this is awkward! The inputs of .dependencies are exactly the inputs that we are trying to discover in the first place. If we now include D.h from C.h and recompile, A.hs will recompile (it needs C.h, which has changed), but .dependencies will not be recalculated, so ["A.o", "A.hi"] will not track the new D.h input. Now change D.h and rebuild. A.hs will not be rebuilt (i.e. the ["A.o", "A.hi"] rule will not rerun) even though the D.h input has changed.

A possible solution to this is to use Shake’s needed action in the .dependencies rule to declare the inputs after running GHC’s dependency generation. This has a couple of issues:

  • GHC’s dependency generation also outputs the source file and other Haskell interface files that are imported from this module. Unlike the CPP includes, these are not indicating inputs of the .dependencies file. The distinction could be made by observing file extensions, .hi/.hs vs others, though this is not foolproof as CPP includes have no restriction on file extension.
  • This works fine if the CPP dependencies are source files in the source tree, but doesn’t work if one of the dependencies needs to be generated by the build system: we would need to need them a priori to ensure they exist and so can be read to discover any further dependencies.

Currently this is an unresolved issue in Hadrian. It is hard to say how likely this issue is to surface as a bug in incremental/cloud builds.

Compiling Haskell Modules

Consider a rule for a file X.o that compiles X.hs with ghc. We use ghc -M which returns the following Makefile fragment:
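The fragment would look roughly like this (a reconstruction from the set of inputs derived below; the exact formatting of ghc -M output may differ):

```makefile
# Makefile fragment emitted by `ghc -M X.hs`
X.o : X.hs
X.o : X.hi
X.o : Y.hi
```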

The Y.hi is there because module X imports module Y. We conclude that direct inputs = indicating inputs = { X.hs, X.hi, Y.hi }. So we implement the rule like this (normally this is done dynamically using .dependencies as described above, but we will list the dependencies statically here to illustrate the problem better):
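Such a rule might be sketched as follows (assumed Shake code, with the input set derived above listed statically):

```haskell
"X.o" %> \_ -> do
    -- direct inputs as derived above: the source plus interface files
    need ["X.hs", "X.hi", "Y.hi"]
    cmd_ "ghc" ["-c", "X.hs"]
```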

This seems correct, but running the rule with --lint-fsatrace complains that the rule used a file Z.hi without needing it. Upon further inspection, module Y imports module Z. It turns out ghc often uses a subset of transitively imported modules. This means the set of direct inputs is larger than what we originally thought: it contains transitive interface files. Unfortunately there doesn’t seem to be any feasible way to predict which transitive interface files ghc will use at compile time. Fortunately we only need to need indicating inputs, so we can stay optimistic and try to show that transitive interface files (.hi files) are not indicating inputs. This can be done by showing that a change in any transitive interface file results in a change in the non-transitive interface files (this is explained in more detail on the wiki):

  change({transitive interface files})
        → change({indicating inputs} \ {transitive interface files})
= change({ Z.hi })
        → change({ X.hs, X.hi, Y.hi })

Where → means “logically implies”. With some insight from an experienced GHC developer we see that each interface file contains “a list of the fingerprints of everything it used when it last compiled the file” (from the user guide). Making the practical assumption that the fingerprint’s hashing function is injective, we know that change({ transitive interface files }) implies change({ X.hi }), and trivially change({ X.hs, X.hi, Y.hi }), hence it is safe to not need transitive interface files. So we can need the direct interface files reported by ghc -M and trackAllow all other interface files.

In this particular case, the lint errors were a non-issue: there were no omitted indicating inputs. This is good, but establishing correctness required in-depth knowledge of the contents and invariants of interface files as well as a solid understanding of how to manipulate the indicating inputs set.


Caching Staged Builds

Having seen two examples, we can see that fixing these lint errors isn’t entirely trivial. So the cost of enabling caching has increased, but the benefits still sound great: a large reduction in CI build times. Unfortunately this expectation is compromised by the fact that GHC is usually built in two stages. The majority of the second stage is built using the GHC binary produced in the first stage. Stage 1 GHC is less featureful than stage 2 and so takes only one-quarter to one-third of the total build time. The key observation is that stage 1 GHC is a dependency for most of stage 2. This means that any change to GHC will result in cache misses for the majority of stage 2, severely degrading cache utilization. This cost should not be understated.

A possible solution is to bypass the two-stage build process with the help of Hadrian’s --freeze1 option, which freezes the stage 1 GHC. By keeping stage 1 frozen across multiple CI runs, stage 2 can now better utilize the cache. While this sounds like a good idea at first, we strongly want to avoid incorrect builds on CI, and the correctness of freezing stage 1 is questionable:

  • Bugs may appear or be hidden by the frozen stage 1 compiler.
  • Stage 2 GHC is linked against the base package built by stage 1 GHC. This could cause bugs or build failures if the base package or the interface file format differs between stage 1 and stage 2 GHC (i.e. when using --freeze1).

On a positive note, stage 1 still takes a significant portion of time to build, and does not suffer from this cache utilization issue. A reduction in stage 1 compile times is already something to be celebrated.

Changes in Hadrian Build Rules

An important note is that Hadrian is currently seeing regular updates and changes to its build rules. Hadrian is under version control with the rest of GHC, so when changing between commits you’re also potentially changing the version of Hadrian. Cloud/incremental builds may be incorrect or fail when Hadrian rules change between builds. While there are ways to protect against this, i.e. by making the Hadrian version a dependency of all rules, Hadrian currently doesn’t do this. As a result it is not safe to run cloud/incremental builds across changes to Hadrian. As an example, doing a cloud build of commit A then commit B gives a failed build complaining of a missing GHC.Version Haskell module: the build rules surrounding that module were changed between commits A and B. Hence the cache directory should be cleared when Hadrian has changed compared to previous cloud builds. Alternatively, you can maintain separate cache directories based on the version of Hadrian, e.g. the latest commit that changed the hadrian directory: git rev-list -n1 HEAD -- ./hadrian.

Relocatable Build Artifacts

We want to take advantage of the cloud cache regardless of the absolute filesystem path of the build directory and GHC source directory. Unfortunately some build artifacts depend on these paths. In other words, the build artifacts are not relocatable. This makes cloud builds impractical when the build and source paths are not held constant: it would result in poor cache utilization. Issue #16956 documents this problem further.

What This Means for GHC Developers

Compared with the make build system, potential issues are much more visible due to the fsatrace linting feature. Considering Hadrian is a port of the make system, now with many lint errors fixed, it’s reasonable to expect similar, if not greater, confidence in Hadrian’s incremental builds as in make’s. More confidence in incremental builds is a welcome improvement and the most impactful benefit of this work.

We can expect a similar level of confidence in cloud builds assuming that Hadrian rules are unchanged between builds. If you’re planning on running the same builds multiple times, then using the cloud build feature can benefit you. Simply use the --share command line option when running Hadrian (or --share=DIR to specify a cache location). Consider the use case of swapping between build flavours. A simple solution would be to maintain separate build directories per flavour, but you may not want to manage these directories manually, or perhaps you expect the flavours to share a significant number of build artifacts from the cloud build cache. Here I’ve run through all steps in order with cloud build enabled, then all steps again with cloud build disabled (i.e. just as incremental builds). Before step 1, I always start with a clean build and cache directory. You can see that cloud builds are much faster on the second build:

    Action (flavour)               Cloud Build Time   Non-Cloud Build Time
  1 build a39a3cd663 (default)     24m37.775s         23m49.780s
  2 build a39a3cd663 (quick)       11m55.601s         11m16.476s
  3 build a39a3cd663 (default)     1m23.450s          18m58.530s
  4 build a39a3cd663 (quick)       1m2.805s           11m23.408s


The project of supporting correct cloud builds resulted in numerous accomplishments. The number of fsatrace lint errors has been reduced from 400,000 to 90,000 thanks to numerous bug fixes in Hadrian. We now have greater confidence in incremental and cloud builds. GHC developers can benefit from cloud builds locally in some narrow use cases, and they frequently benefit from improved incremental builds. We’ve developed a clear vocabulary for reasoning about the degrees of build system correctness and what these require. Use of the fsatrace linting feature will be invaluable for understanding and fixing newly surfaced Hadrian bugs in the future. GHC’s -include-cpp-deps option was implemented as part of this project, enabling more precise dependency tracking in Hadrian and users’ build systems. This project also had a part to play in recent ideas to simplify Haskell interface files (see issue #16885).

Still, there is plenty of room to further improve incremental and cloud builds. This blog post is in part a call for help from the community. The remaining work is challenging and interesting, requires little compiler expertise, and yet can have a significant impact on GHC developers. #16926 is a top level “getting started” issue for continuing this project. Most of the remaining work is focused on resolving fsatrace lint errors summarized in issue #16400. That issue also explains how to get started using fsatrace linting. Specific issues of interest include:

  • #16868 Unexpected CPP dependencies in later (non CPP) build stages of GHC.
  • #16815 Undeclared hs-boot inputs when generating .dependencies files.
  • #16983 Undeclared inputs for _build/generated/platformConstants/platformConstants.
  • #16956 Some build artifacts are sensitive to build directory location. This destroys cloud build cache utilization when using multiple build directories.

It’s highly recommended that you read through and contribute to the Developing Hadrian wiki page which goes into more detail about incremental and cloud build correctness.

by davide at August 01, 2019 04:10 PM

Tweag I/O

Code Line Patterns:
Creating maps of Stackage and PyPi

Simeon Carstens, Matthias Meschede

In a recent blog post, we analyzed the most obvious patterns in Haskell and Python source code, namely import statements and LANGUAGE pragmas. As promised in that blog post, let us now explore what unknown patterns are hidden in our data sets. To this end, we will create two-dimensional "maps" of code, which allow for a nice visualization of code patterns and may make you appreciate the beauty and complex structure of the code you and your friends and colleagues write. Such code patterns also provide insight into the coding habits of a community in a programming language ecosystem. We will conclude this blog post with an outlook on how these and similar patterns can be exploited.

Quantitative representation of source code

The sort of patterns that we are looking for is some kind of similarity between lines of code. While the concept of similarity between two lines of code might be intuitively clear for us humans, we don't have time to print out a large number of lines of code on little snips of paper and sort them in piles according to their similarity. So obviously, we want to use a computer for that. Unfortunately, computers a priori don't know what similarity between two lines of code means.

To enable computers to measure similarity, a first step is to represent each line of code as a vector, which is a quantitative representation of what we think are the most important characteristics ("features") of a line of code. This is a common task in Natural Language Processing (NLP), which is concerned with automatic analysis of large amounts of natural language data. While the languages we are considering are not natural, but constructed, we expect key concepts to be transferable. We thus borrow the idea of a bag-of-words from NLP: for a given line of code, we neglect grammar and word order, but keep a measure of presence of each word. In our case, we only take into account the 500 most frequent words in our data set and simply check which of these words is present in a line of code and which are not. That way, we end up with a 500-dimensional feature vector, in which each dimension encodes the occurrence of a word: it's either one (if the word is present in that line of code) or zero (if it is not).
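As a sketch of this encoding (written here in Haskell, though the original analysis used Python tooling; the vocabulary and example line are made up), the binary bag-of-words vectorization could look like:

```haskell
-- Encode a line of code as a binary bag-of-words vector over a
-- fixed vocabulary (in the analysis, the 500 most frequent words).
featureVector :: [String] -> String -> [Int]
featureVector vocab line =
    [ if w `elem` ws then 1 else 0 | w <- vocab ]
  where
    ws = words line

-- featureVector ["import", "qualified", "data"] "import qualified Data.Map"
--   == [1, 1, 0]
```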

Visualizing source code

We could now race off and compare all these representations of lines of code in the 500-dimensional feature space. But do all of these dimensions really contain valuable information which sets apart one pattern from another? And, how would we visualize such a high-dimensional space?

A popular way to address these questions is to reduce the number of dimensions significantly while keeping as much information about the structure of the data as possible. For the purpose of visualization, we would love to have a two-dimensional representation so that we can draw a map of points in which points with similar features are placed close to each other and form groups ("clusters"). To achieve that, we need a measure of similarity (or, equivalently, distance) in feature space. Here, we define similarity between two lines of code by the number of words they have in common: we count the number of unequal entries of two feature vectors and the lower that number is, the lower is the so-called Hamming distance between the two feature vectors and the more similar they are.
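The Hamming distance itself is simple to state (a minimal sketch, continuing in Haskell):

```haskell
-- Number of positions at which two equal-length binary vectors differ.
hamming :: [Int] -> [Int] -> Int
hamming xs ys = length (filter id (zipWith (/=) xs ys))

-- hamming [1,0,1] [1,1,0] == 2
```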

A popular technique to reduce the number of dimensions while trying to maintain the structure of the data or, equivalently, the distances between similar data points, is UMAP. We first use UMAP to calculate a two-dimensional map of both the Python and the Haskell data set and color points depending on whether the line of code they represent contains certain keywords:

We immediately notice the complex structure of the data set, which seems to consist of a large number of clusters comprising very similar lines of code. We expect many lines of code containing the keywords we chose to be close together on the map, which is indeed what we observe, although they do not necessarily form connected clusters. Furthermore, one line of code could contain several of the keywords, which would not be visible in our representation.

To find out what kind of code other clusters represent, we perform clustering on the UMAP embedding, meaning that we run an algorithm on the map which automatically determines which points belong to which cluster. Starting with the Python data set, we annotate the twenty most dense and reasonably large clusters with the two words which co-occur most frequently in a cluster:

We are not surprised to find clusters with the most frequent words being for and in, which corresponds to Python for-loops, or assertEqual and self, which stems from the unittest framework. Some other clusters, though, do not remind us of common Python idioms. The most common words in cluster 17 are xd and xc. Looking up what kinds of lines of code contain both of these words, we find that these occur very often in byte strings. In our earlier blog post we already learned that in Haskell, byte strings feature prominently in a few select packages.

Performing the same analysis for the Haskell data set, we find clusters such as a very well-defined one containing often both text and maybe and clusters corresponding to popular imports:

Furthermore, the big blue cluster (#16) seems to contain mostly auto-generated code from the amazonka package, which implements communication with Amazon Web Service-compatible APIs.


We found several interesting patterns in both Haskell and Python code—clusters of common language idioms, but also unexpected clusters stemming from byte strings. With these results, we could now build a very basic code completion tool: while you type, that tool would continuously check to which cluster the line you're typing most likely belongs to and suggest you words from that cluster. An obvious limitation, though, is the complete absence of context-sensitivity, meaning that proposed words neither depend on the order of previous words in the same line nor on adjacent lines of code.

Other projects have taken the application of machine learning techniques much further, resulting in tools which can significantly facilitate programmers' lives. Kite performs context-sensitive code completion for Python and is available as a plug-in for several popular IDEs. The Learning from Big Code website lists several other interesting projects. For example, the Software Reliability Lab at ETH Zurich developed JSNice, a tool to automatically rename and deobfuscate JavaScript code and to infer types. Finally, Naturalize learns coding conventions from an existing codebase to improve consistency.

Appendix: data preprocessing and technical details

We limited our analysis to 100,000 randomly chosen lines of code, which we tokenized, meaning we replaced all types of whitespace by a single space, retained only letters, and converted upper-case letters to lower-case ones.
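That preprocessing can be sketched as follows (an assumed reconstruction in Haskell; the original pipeline used Python tooling):

```haskell
import Data.Char (isLetter, toLower)

-- Keep letters (lower-cased), map everything else to spaces, and
-- split on whitespace, collapsing runs of spaces in the process.
tokenize :: String -> [String]
tokenize = words . map normalize
  where
    normalize c
      | isLetter c = toLower c
      | otherwise  = ' '

-- tokenize "import Data.Map" == ["import", "data", "map"]
```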

To perform a dimensionality reduction of our data sets, it is necessary to create informative feature vectors from each line of code. We used count vectorization as implemented in scikit-learn to turn each line of code into a binary vector, whose dimensions correspond to the 500 most frequent words in our Python / Haskell code corpus. We don't care how often a word occurs in a line of code, only whether it's there (1) or not (0). Furthermore, single-letter words are neglected.

Having built these feature vectors, we applied a popular dimensionality reduction technique, UMAP. UMAP is a manifold embedding technique, meaning that it tries to represent each data point in the high-dimensional feature space by a point on a lower-dimensional manifold in a way that similar points in the feature space lie close together on the lower-dimensional manifold. It is a non-linear mapping, which means it's well suited for data sets which can not easily be projected on flat manifolds. UMAP requires a measure of similarity in feature space, which we chose as the Hamming distance between two binary feature vectors.

To assign points in the two-dimensional representations of our data sets to clusters, we used the Python implementation of the recent clustering algorithm HDBSCAN (paywalled). A big advantage of HDBSCAN over many other clustering algorithms is that it automatically determines the number of clusters and can classify data points as noise.

August 01, 2019 12:00 AM

July 31, 2019

Oleg Grenrus

Should fmap coerce = coerce hold?

Posted on 2019-07-31 by Oleg Grenrus

Discussion around (limitations of) Coercible, GeneralizedNewtypeDeriving or DerivingVia resume once in a while.

One common question is whether fmap coerce = coerce should be a law of the Functor type class or not. Intuitively, it's similar to the identity law fmap id = id, but intuition alone is not a good enough reason.

Category theory may say...

The identity law comes from category theory, so what could category theory say about a proposition like fmap coerce = coerce?

Let us consider an endofunctor F from a three element category \mathcal{C} to itself, which "rotates the category".

F maps objects A, B, B' to B', B, A respectively. A careful reader would ask how F maps morphisms; see the appendix for a justification.

To make it more concrete, we can think of it as:

with isomorphisms in between (assuming Int and Word have the same number of bits).

Then, we know that Coercible B B'. And arrows g and g^{-1} on the diagram are coerce @B @B' and coerce @B' @B. But Coercible (F B) (F B') = Coercible B' A doesn't exist. F(g) is a function like \(B' w) -> fromIntegral w which is not coerce.

The functor F doesn't preserve the Coercible relation, even though it's a categorically lawful functor.

... but ...

But the functor F above is not a Functor as in Haskell. Not even if we try to define

we won't be able to write fmap :: (a -> b) -> F a -> F b, as b can be anything, not only A, B, or B'.

fmap is polymorphically parametric. There are papers and blogs explaining parametricity1.

The key insight of parametricity is that we can read types as relations. For example a list defined as
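The definition in question is presumably the standard one (reconstructed from the Nil and Cons cases used below):

```haskell
data List a = Nil | Cons a (List a)
```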

can be read as a relation constructor:

Given a relation R :: Relation X Y, xs :: List X and ys :: List Y are related with List R :: Relation (List X) (List Y), if either

  • xs = Nil and ys = Nil, or
  • xs = Cons x xs', ys = Cons y ys', and x and y are related by R, and xs' and ys' are recursively related by List R.

In other words, List R relates lists of equal length, when all elements are related pointwise.

We can derive free theorems for polymorphic functions over lists, or ...

We could take R X Y to be Coercible X Y; a relation doesn't need to be functional, and Coercible is a binary relation. We can reason that List Coercible is Coercible for lists: lists related by coerce @(List X) @(List Y) are of the same length and each element is related by coerce @X @Y.

Functor should be parametric!

Maybe instead of asking whether fmap coerce = coerce should be a law of the Functor type class, we should ask whether Functor should be a type class of parametric type constructors? Parametricity would imply fmap coerce = coerce. Not requiring parametricity would mean that opaque types can be "lawfully" Functor, e.g. similarly to how Set a has an Eq instance.

The latter choice allows us to write type internally used in the bifunctors library:

Mag is not a lawful Functor: it violates the fmap id = id property, but it is used to implement traverseBiaWith.

However, there is a comment in the code:

Then Mag would be Coercible in all arguments; the current Mag has type role Mag representational nominal nominal.

A third alternative is to use

GHC is smart enough to infer that this Mag variant is also representational in all arguments. Unfortunately, to implement traverseBiaWith, we'll need to change the required constraints

... or require Bifunctor (a superclass of Biapplicative) to be parametric in its both arguments!

For me, it looks like the addition of a (forall x y. Coercible x y => Coercible (f x) (f y)) constraint to Functor (and similarly to Bifunctor, Profunctor, Contravariant, ...) can be worked around in all cases. Mag is the only example mentioned as a useful but unlawful Functor2, and we have demonstrated the required change.

Note: Mag will still be unlawful, but it can be made representational in all arguments. In the implementation of traverseBiaWith we will use coerce which is free operationally, so there shouldn't be any performance degradation.

I cannot come up with an example of f :: Type -> Type which would violate the fmap id = id law but obey the fmap coerce = coerce one (or the opposite direction). Nor can I show that fmap coerce = coerce follows from the other two Functor laws without using parametricity. Edward Kmett explains how the Functor composition law follows from the identity law and parametricity. The coerce law is implied by parametricity directly.


Finally, my conclusion is that both

  • fmap coerce = coerce should be a law of GHC Haskell Functor, and
  • the forall a b. Coercible a b => Coercible (f a) (f b) constraint should be a Functor super-constraint

even though this change departs far from Haskell 2010. The justification is that these requirements are implied by parametricity, and the Functor class should contain only parametric type constructors. We would still be able to have unlawful Functors if really needed.
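As a sketch, the proposed super-constraint could be expressed with QuantifiedConstraints (the class name FunctorRep is hypothetical, standing in for a changed Functor; GHC's handling of quantified Coercible constraints has some known caveats):

```haskell
{-# LANGUAGE QuantifiedConstraints #-}
module FunctorCoercible where

import Data.Coerce (Coercible)

-- Hypothetical variant of Functor carrying the coercibility
-- super-constraint discussed above: any representational change
-- of the argument lifts through f.
class (forall a b. Coercible a b => Coercible (f a) (f b))
      => FunctorRep f where
  fmapRep :: (a -> b) -> f a -> f b
```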

Some immediate benefits are the ability to put join into Monad, or that van Laarhoven Lens s t a b would be representational in all arguments.

The disadvantage is that there's Haskell 2010 code which would be broken by that change. Consider

We can write an (unlawful) Functor instance in Haskell 2010, which won't be accepted, as Foo's role is nominal. Yet, the GHC documentation about DatatypeContexts says: "This is widely considered a misfeature, and is going to be removed from the language." You need to enable a language extension to write Foo; in other words, that code is already broken (since November 2010!).

Also, some non-Haskell 2010 code will be broken:

You may think there's an easy solution, but there is a five-year-old issue about roles for type families. Also, it has to be investigated whether someone actually wrote an (unlawful) Functor wrapping a type family which pattern matches on an argument (i.e. is non-parametric), and why they did that!

Just adding a fmap coerce = coerce law wouldn't break any code. Something which is unlawful will be a bit more unlawful. The hypothesis is that fmap coerce = coerce won't turn any currently lawful Functors into unlawful ones.


I'm thankful to Ryan Scott for many discussions and valuable insights. And to Andres Löh for comments on a draft of this post.

Appendix: Mapping of morphisms

In this category there is a unique arrow between any two objects, so the mapping is trivial.

We can extend the category by adding more objects and morphisms, and extend F so it keeps all other objects in place. In this case the mapping of morphisms is trickier. If we require that f^{-1} \circ f = 1_A and the other identities suggested by the names of the arrows, we can make it work.

Let us define two families of morphisms indexed by objects X in \mathcal{C}: \mathsf{from}_X : X \to F(X) and \mathsf{to}_X : F(X) \to X

 \begin{aligned} \mathsf{from}_A  &= f   \qquad & \mathsf{to}_A  &= f^{-1} \\ \mathsf{from}_B  &= g          & \mathsf{to}_B  &= g^{-1} \\ \mathsf{from}_{B'} &= h          & \mathsf{to}_{B'} &= h^{-1} \\ \mathsf{from}_X  &= 1_X        & \mathsf{to}_X  &= 1_X \end{aligned}

Using these, we can define mapping of morphisms as

 F (j : X \to Y) : F(X) \to F(Y) = \mathsf{from}_Y \circ j \circ \mathsf{to}_X

For example f : A \to B is mapped to

 \begin{aligned} F(f) &= \mathsf{from}_B \circ f \circ \mathsf{to}_A \\      &= g \circ f \circ f^{-1} \\      &= g \end{aligned}

Check by looking at the diagram!

The identity and composition properties hold, for example

 \begin{aligned} F(1_A) &= \mathsf{from}_A \circ 1_A \circ \mathsf{to}_A \\        &= f \circ 1_A \circ f^{-1} \\        &= 1_B \\        &= 1_{F(A)} \end{aligned}

  1. Some links about parametricity

  2. For example in

July 31, 2019 12:00 AM

July 29, 2019

Monday Morning Haskell

Haskell From Scratch Re-Opened!


This week we're taking a break from our Gloss/AI series to make a special announcement! Haskell From Scratch, our beginners' course, is now re-opened for enrollment! We've added some more content since the last time we offered it. The biggest addition is a mini-project to help you practice your new skills!

Enrollment will only be open for another week, so don't wait! Next Monday, August 5th, will be the last day to sign up! Enrollments will close at midnight. Once you sign up for the course, you'll have permanent access to the course material. This includes any new content we add in the future. So even if you don't have the time now, it's still a good idea to sign up!

I also want to take this opportunity to tell a little bit of the story of how I learned Haskell. I want to share the mistakes I made, since those motivated me to make this course.

My History with Haskell

I first learned Haskell in college as part of a course on programming language theory. I admired the elegance of a few things in particular. I liked how lists and tuples worked well with the type system. I also appreciated the elegance of Haskell's type definitions. No other language I had seen represented the idea of sum types so well. I also saw how useful pattern matching and recursion were. They made it very easy to break problems down into manageable parts.

After college, I had the idea for a code generation project. A college assignment had taught me some useful Haskell libraries for the task. So I got to work writing some Haskell. At first things were quite haphazard. Eventually though, I developed some semblance of test driven development and product organization.

About nine months into that project, I had the great fortune of landing a Haskell project at my day job. As I ramped up on this project, I saw how deficient my knowledge was in a lot of areas. I realized then a lot of the mistakes I had been making while learning the language. This motivated me to start the Monday Morning Haskell blog.

Main Advice

Of course, I've tried to incorporate my learnings throughout the material on this blog. But if I had to distill the key ideas, here's what they'd be.

First, learn tools and project organization early! Learn how to use Stack and/or Cabal! For help with this, you can check out our free Stack mini-course! After several months on my side project, I had to start from scratch to some extent. The only "testing" I was doing was running some manual executables and commands in GHCI. So once I learned more about these tools, I had to re-work a lot of code.

Second, it helps a lot to have some kind of structure when you're first learning the language. Working on a project is nice, but there are a lot of unknown-unknowns out there. You'll often find a "solution" for your problem, only to see that you need a lot more knowledge to implement it. You need to have a solid foundation on the core concepts before you can dive in on anything. So look for a source that provides some kind of structure to your Haskell learning, like a book (or an online course!).

Third, let's get to monads. They're an important key to Haskell and widely misunderstood. But there are a couple things that will help a lot. First, learn the syntactic patterns of do-syntax. Second, learn how to use run functions (runState, runReaderT, etc.). These are how you bring monadic expressions into the rest of your code. You can check out our Monads Series for some help on these ideas. (And of course, you'll learn all about monads in Haskell From Scratch!)
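To make that second point concrete, here is a tiny self-contained example of do-syntax plus a run function (the counter example is mine, not from the course; it uses the mtl package):

```haskell
import Control.Monad.State (State, get, put, runState)

-- Hand out two consecutive labels, using do-syntax to sequence
-- the stateful steps.
twoLabels :: State Int (Int, Int)
twoLabels = do
  a <- get        -- read the counter
  put (a + 1)     -- advance it
  b <- get
  put (b + 1)
  return (a, b)

-- runState turns the monadic value back into plain data:
-- runState twoLabels 0 evaluates to ((0, 1), 2)
```

The run function is the bridge: inside the do-block you think monadically, and runState hands the final value and state back to ordinary code.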

Finally, ask for help earlier! I still don't plug into the Haskell network as much as I should. There are a lot of folks out there who are more than willing to help. Freenode is a great place, as is Reddit and even Twitter!


There's never been a better time to start learning Haskell! The language tools have developed a ton in the last few years and the community is growing stronger. And of course, we've once again opened up our Haskell From Scratch Beginners Course! You don't need any Haskell experience to take this course. So if you always wanted to learn more about Haskell but needed more organization, this is your chance!

If you want to stay up to date with the latest at Monday Morning Haskell, make sure to Subscribe to our mailing list! You'll hear the latest about upcoming articles, as well as any new course offerings. You'll also get access to our Subscriber Resources.

by James Bowen at July 29, 2019 02:30 PM

Chris Smith 2

Solving a puzzle in Haskell

This post isn’t particularly deep or insightful, but I hope it’s fun!

Some time last December, I purchased a Christmas gift for a family member from the holiday fair in Union Square park in New York. It’s called a “snake cube puzzle”, and is made out of small wooden cubes, through which a single length of elastic cord is threaded via holes drilled in the smaller cubes. There are 27 small cubes, and the challenge is to assemble them into one large cube, three to a side.

You can see the original puzzle in this YouTube video:

Well, I ended up changing my mind about the gift, so this puzzle sat on my desk unused for quite a few months. At some point, I unfolded it and attempted it myself. It was not a success! So now I had an unfolded puzzle on my desk. The puzzle that I purchased wasn’t the same configuration as the video above, so I couldn’t just follow those steps, either. Eventually, I decided to apply skills that I do have to solve the problem — so I wrote some code to find a solution.

If you’re impatient, here’s what the solution looks like, and a link to explore it interactively.


What follows is an explanation of the process of writing this code.

Step 1: Understanding the problem

The first step is to understand how the puzzle works. Essentially, there are two kinds of smaller cubes: in some, the string passes straight through, while in others, it turns 90 degrees. The straight-line pieces don’t present any choice at all, but the 90-degree turns can be rotated to any of the four directions that are orthogonal to the previous direction. The trick is just to decide the correct direction from which to leave each of the 90-degree turns.

Since there’s no choice to be made in the straight-through pieces, there’s no need to model them as separate pieces. Therefore, I separated the puzzle not into smaller cubes but into segments, where a segment consists of all of the pieces leading up to a corner. I defined the length of a segment to be the number of squares one moves to get from one end to the other, so the puzzle is made up of a sequence of segments, each of length either 1 or 2.

In the video above, the segment lengths are something like: 2, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 2, 1, 2, 1, 1, 2. But my puzzle, again, is different. So I wrote:

segments :: [Int]
segments =
[2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2]

It’s clear that I’m going to be working with positions in space, and directions. So I next defined some types for these, and a basic operation relating them.

type Pos = (Int, Int, Int)
data Dir = F | B | L | R | U | D deriving (Show)
move :: Dir -> Pos -> Pos
move F (i, j, k) = (i + 1, j, k)
move B (i, j, k) = (i - 1, j, k)
move L (i, j, k) = (i, j - 1, k)
move R (i, j, k) = (i, j + 1, k)
move U (i, j, k) = (i, j, k + 1)
move D (i, j, k) = (i, j, k - 1)

Admittedly, I was a bit cryptic with one-letter names. If you didn’t figure it out, they are short for Forward, Backward, Left, Right, Up, Down. I’m considering them absolute directions with respect to a fixed orientation. So “left” means to the left from my perspective outside the cube, not a relative left turn from the previous direction.

Step 2: Recursive tree search

Now that I have the basic vocabulary to think about the problem, I’m working up toward writing what I know will be a tree search. At some point, I’m going to have a current direction, and want to know which directions are possible 90-degree turns. Let’s do that.

turns :: Dir -> [Dir]
turns F = [L, R, U, D]
turns B = [L, R, U, D]
turns L = [F, B, U, D]
turns R = [F, B, U, D]
turns U = [F, B, L, R]
turns D = [F, B, L, R]

Okay, time to tackle the tree search. I now ask myself the central question. When I’ve half-completed an attempt at the puzzle, what state do I need to know to complete the attempt? My answer to that is the SearchState type:

data SearchState = SearchState {
  currentPos :: Pos,
  currentDir :: Dir,
  remainingSegments :: [Int],
  usedPositions :: Set Pos }

In other words, I need to know:

  1. What’s the last position of a segment that I’ve decided to place?
  2. What direction am I planning to move next?
  3. What are the segments I haven’t placed yet?
  4. Which positions already have blocks in them (since two blocks cannot land in the same position)?

With this in mind, I can choose the initial search states. Because of symmetry, it suffices to consider only solutions that start on the near face of the cube, moving forward. (Because the first segment has length 2, I know the solution cannot start with the first cube in the center, either!) So I’ll define the start states with this in mind.

initialSearchStates :: [SearchState]
initialSearchStates =
  [ SearchState {
      currentPos = (0, j, k),
      currentDir = F,
      remainingSegments = segments,
      usedPositions = Set.empty }
  | j <- [0..2], k <- [0..2] ]

I’ll also need a way to decide if it’s okay to place a block at a specific position. It’s okay if (a) the position is in-bounds, and (b) there’s not already a block at the position.

goodPosition :: Set Pos -> Pos -> Bool
goodPosition used pos =
inRange ((0,0,0),(2,2,2)) pos && not (pos `Set.member` used)

And finally, the search itself.

search :: SearchState -> [[Dir]]
search SearchState{ remainingSegments = [] } = [[]]
search SearchState{..}
  | all (goodPosition usedPositions) cover
  = [ currentDir : solution
    | dir' <- turns currentDir
    , solution <- search SearchState{
        currentPos = last cover,
        currentDir = dir',
        remainingSegments = segs,
        usedPositions = foldr Set.insert usedPositions
                              (init cover) } ]
  | otherwise = []
  where s:segs = remainingSegments
        cover = take (s + 1) (iterate (move currentDir) currentPos)

Here, cover is defined as the set of spaces that will be covered by the next segment. One simply checks that the next segment can be placed, and then computes the successor states. The successor states have a new position, one of the 90-degree turns as the new direction, and an augmented set of already-used positions. The successor states should be recursively searched.

There’s no particular care taken here to only generate a solution once. In fact, since we make a meaningless choice of a new direction at the end of the puzzle, there will always be four identical solutions in the result set for each list of directions. I could fix this by being more cautious about the recursion setup, but I don’t really care in the end. I just want the first solution.

theAnswer :: [Dir]
theAnswer = head (concatMap search initialSearchStates)

Step 3: Visualization

Now one can just print theAnswer, and have a working solution. I did this, and tried it out, and solved the puzzle. Success!

But it’s not very satisfying. Trying to share this solution with others, I quickly found that strings of F, B, L, R, U, and D are hard for others to understand. So let’s visualize. Fortunately, I have CodeWorld, the programming website I use for teaching functional programming. So, opening CodeWorld in its Haskell mode (because I want to use all of Haskell, not the simplified subset designed for children), I started a visualization.

To visualize, I don’t actually want a list of directions. I want a list of blocks to draw in the appropriate order. I can get this by zipping the directions with the segment lengths:

blocks :: [Pos]
blocks = follow (0, 0, 0) (zip theAnswer segments)
  where follow p [] = [p]
        follow p ((_, 0) : steps) = follow p steps
        follow p ((d, n) : steps)
          = p : follow (move d p) ((d, n - 1) : steps)

I’d like the user to be able to scroll through the partially complete puzzle by pressing up and down arrow keys, so I first built a simulation whose state is the number of blocks to show:

main :: IO ()
main = activityOf initial change picture

initial :: Int
initial = 1

change :: Event -> Int -> Int
change (KeyPress "Up") n = n + 1
change (KeyPress "Down") n = n - 1
change other n = n

All that remains is to define picture. Using CodeWorld drawing combinators, I can draw a block with a polygon for each visible side, and the complete picture as a collection of blocks. This leaves a few details:

  1. The pictures of individual blocks must be overlapped in the right order, so that blocks nearest the user obscure the blocks further away.
  2. The sides must be projected from three dimensions into two. I chose an isometric projection, so that I can be sure there are only three visible sides.

The result looks something like this:

picture :: Int -> Picture
picture n = pictures
    [ drawBlock p
    | p <- sortBy (comparing viewSortKey) (take n blocks) ]
  where viewSortKey (i, j, k) = (i, -j, -k)

drawBlock :: Pos -> Picture
drawBlock (i, j, k) = pictures [
    -- The front face.
    colored (light gray) $ solidPolygon [
      project p | p <- [ (x + 0.5, y + 0.5, z - 0.5),
                         (x + 0.5, y - 0.5, z - 0.5),
                         (x - 0.5, y - 0.5, z - 0.5),
                         (x - 0.5, y + 0.5, z - 0.5) ] ],
    -- The top face.
    colored gray $ solidPolygon [
      project p | p <- [ (x + 0.5, y + 0.5, z + 0.5),
                         (x + 0.5, y + 0.5, z - 0.5),
                         (x - 0.5, y + 0.5, z - 0.5),
                         (x - 0.5, y + 0.5, z + 0.5) ] ],
    -- The right face.
    colored (dark gray) $ solidPolygon [
      project p | p <- [ (x + 0.5, y + 0.5, z + 0.5),
                         (x + 0.5, y + 0.5, z - 0.5),
                         (x + 0.5, y - 0.5, z - 0.5),
                         (x + 0.5, y - 0.5, z + 0.5) ] ] ]
  where x = fromIntegral j
        y = fromIntegral k
        z = fromIntegral i

project :: (Double, Double, Double) -> Point
project (x, y, z) = (3 * x + (1 + sin t / 4) * z,
                     3 * y + (1 + cos t / 4) * z)
  where t = 0  -- in the final version, t is the animation time

And we’re done!

Okay, sort of… for the final version above, I made three small additions. First, I added text to explain how to scroll through the solution. Second, I added color to the blocks. The alternating colors make it look more like the original puzzle. And finally, I added some change over time to the details of the projection. This last addition was made after I noticed that the 3D effect can be hard for the eye to decode in a static image. Adding some camera motion makes the effect much clearer.

The solver for the snake puzzle

by Chris Smith at July 29, 2019 05:26 AM

July 28, 2019

Joachim Breitner

Custom firmware for the YQ8003 bicycle light

This blog post is about 18 months late, but better late than never...

The YQ8003

1½ years ago, when I was still a daredevil biking in Philly, I got interested in these fancy strips of LED lights that you put into your bike wheel: when you ride fast enough, they form a stable image. Partly for the additional visibility and safety, but also because they seem to be fun gadgets.

There are brands like Monkey Lights, but they are pretty expensive, and there are cheaper similar no-name products available, such as the YQ8003, which you can either order from China or hope to find on eBay for around $30 per piece.

The YQ8003 bike light


Sucky software

The hardware is nice: water proof, easy to install, bright, long-lasting battery. But the software, oh my!

You need Windows to load your own pictures onto the device, and the application is really unpleasant to use; you can’t easily save your edits and sequences of images, and so on.

But the software on the device itself (which sports a microcontroller) was also unsatisfying: the transformation it applies to the image assumes that the bar of LEDs goes through the center of the wheel. Obviously that is wrong, as there is the hub. With a small hub the difference is not so bad, but I have rather large hubs (a generator in the front hub, and internal gears in the rear hub), and this makes the image jump back and forth a bit instead of staying stable.

Time to DIY!

So obviously I had to do something about it. At first I planned to just find out how to load my own pictures onto the hardware, using the existing software on the device. So I needed to find out the protocol.

I was running their program on Windows in VirtualBox, and quickly noticed that the USB connection that you use to load your data onto the YQ8003 is actually a serial-over-USB port. I found a sniffer for serial communication and used that to dump what the Windows app sent to the device. That was all pretty hairy, and I only did it once (and deleted the Windows setup soon), but luckily one dump was sufficient.

I did not find out where in the data sent to the light the image was encoded. But I did find that the protocol used to talk to the device is a standard protocol to talk to microcontrollers, something called “STC ISP”. With that information, I could find out that the microcontroller is a STC12LE5A60S2 with 22MHz and 60KB of RAM, and that it is “8051 compatible”, whatever that means.

So this is how I, for the first and so far only time, ventured into microcontroller territory. It was pretty straight-forward to get a toolchain to compile programs for this microcontroller (using sdcc) and to upload code to it (using stcgal), and I could talk to my code over the serial port. This is promising!

Reverse engineering

I also quickly found out how the magnet (which the device uses to notice when the wheel has done one rotation) is accessed: It triggers interrupt 0.

But finding out how to actually access the LEDs and make them light up was very tricky. This kind of information is not specific to the microcontroller (STC12LE5A60S2), for which I could find documentation, but really depends on how it is wired up.

I was able to extract, from the serial port communication dump mentioned earlier, the firmware in a way I could send it to the microcontroller. So I could always go back to a working state. Moreover, I could disassemble that code and try to make sense of it. But I could not make sense of it.

So if thinking does not help, maybe brute force does? I wrote a program that would take the working firmware, zero out parts of it. Then I would try that firmware and note if it still works. This way, my program would zero out ever more of the firmware, until only a few instructions are left that would still make the LEDs light up.
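The loop is easy to sketch. In the sketch below (my code, not the actual bisect-binary implementation), the works oracle is an invented stand-in for "flash the candidate firmware and see whether the LEDs still light up"; bisect-binary proper is smarter about chunk sizes, but the shape is this:

```haskell
import qualified Data.ByteString as BS

-- Walk over the firmware in fixed-size chunks, zero each chunk in turn,
-- and keep the zeroed candidate whenever the oracle says it still works.
-- Bytes that survive are the ones the observed behaviour depends on.
shrink :: Int -> (BS.ByteString -> Bool) -> BS.ByteString -> BS.ByteString
shrink chunk works = go 0
  where
    go off fw
      | off >= BS.length fw = fw
      | works candidate     = go (off + chunk) candidate
      | otherwise           = go (off + chunk) fw
      where
        (pre, rest) = BS.splitAt off fw
        (mid, post) = BS.splitAt chunk rest
        candidate   = BS.concat [pre, BS.replicate (BS.length mid) 0, post]
```

In the real setting each works call means re-flashing the device, which is why minimizing the number of probes matters.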

In the end I had, I think, 13 instructions left that made the LEDs light up faintly. Success! Or so I thought … the resulting program was pretty nonsensical. It essentially increments a value and writes another value to the address stored in the first value. So it just spews data all over the address range, wrapping around when at the end. No surprise it triggers the LEDs somewhere along the way…

(Still, I published the program to minimize binary data under the name bisect-binary – maybe you’ll find it useful for something.)

I actually don’t remember how I eventually figured out what to do, and which bytes and bits to toggle in which order. Maybe more reading, and some advice from people who know more about LEDs.


With that knowledge I could finally write my own firmware and user application. The part that goes onto the device is written in C and compiled with sdcc. And the part that runs on your computer is a command line application written in Haskell, that takes the pictures and animations you want, applies the necessary transformations (now taking the width of your hub into account!) and embeds that into the compiled C code to produce a firmware file that you can load onto your device using stcgal.

It supports images in all common formats, produces 8 colors, and can store up to 8 images on the device, which it then cycles through according to the times you specify. I dubbed the software bSpokeLight.

The light in action with more lights at the GPN19 (The short shutter speed of the camera prevents the visual effect in the eye that allows you to see the images well here)


It actually supports reading GIF animations, but I found that they are much harder to recognize later, unless I rotate the wheel very fast and you know what to look for. I am not sure if this is a limitation of the hardware (and our eyes), a problem with my code or a problem with the particular animations I have tried. Will need to experiment more.

Can you see the swing dancing couple?


As always, I am sharing the code in the hope that others find it useful as well. Thanks to Haskell, Nix and the iohk-nix project I can easily provide pre-compiled binaries for Windows and Linux, statically compiled for the latter for distribution-independence. Let me know if you try to use it and how that went.

by Joachim Breitner at July 28, 2019 09:30 AM

July 23, 2019

Mark Jason Dominus

Obtuse axiomatization of category theory

About ten years ago I started an article, addressed to my younger self, reviewing various books in category theory. I doubt I will ever publish this. But it contained a long, plaintive digression about Categories, Allegories by Peter Freyd and Andre Scedrov:

I keep this one around on the shelf just so that I can pick it up every few months and marvel at its opacity.

It is a superb example of the definition-theorem-remark style of mathematics textbooks. I have heard that this was a style pioneered by a book you are already familiar with, John Kelley's General Topology of 1955. If so, all I can say is, sometimes it works, and sometimes it doesn't. It worked for Kelley in 1955 writing about topology.

Here is an example of what is wrong with this book.

Everyone who knows anything about category theory knows that a category is a sort of abstraction of a domain of mathematical objects, like sets, groups, or topological spaces. A category has a bunch of "objects", which are the sets, the groups, or the topological spaces, and it has a bunch of "morphisms", which are maps between the objects that preserve the objects' special structure, be it algebraic, topological, or whatever. If the objects are sets, the morphisms are simply functions. If the objects are groups, the morphisms are group homomorphisms; if the objects are topological spaces, the morphisms are continuous maps. The basic point of category theory is to study the relationships between these structure-preserving maps, independent of the underlying structure of the objects themselves. We ignore the elements of the sets or groups, and the points in the topological spaces, and instead concentrate on the relationships between whole sets, groups, and spaces, by way of these "morphisms".

Here is the opening section of Categories, Allegories:


The theory of CATEGORIES is given by two unary operations and a binary partial operation. In most contexts lower-case variables are used for the 'individuals' which are called morphisms or maps. The values of the operations are denoted and pronounced as:
□x, the source of x,
x□, the target of x,
xy, the composition of x and y.
The axioms:
xy is defined iff x□ = □y,
(□x)□ = □x and □(x□) = x□,
(□x)x = x and x(x□) = x,
□(xy) = □(x(□y)) and (xy)□ = ((x□)y)□.

In light of my capsule summary of category theory, can you figure out what is going on here? Even if you already know what is supposed to be going on you may not be able to make much sense of this. What to make of the axiom that (□x)□ = □x, for example?

The explanation is that Freyd has presented a version of category theory in which the objects are missing. Since every object A in a category is associated with a unique identity morphism 1_A from A to itself, Freyd has identified each object with its identity morphism. If x : A → B, then □x is 1_A and x□ is 1_B. The axiom (□x)□ = □x is true because both sides are equal to 1_A.

Still, why phrase it this way? And what about that □(xy) = □(x(□y)) thing? I guessed it was a mere technical device, similar to the one that we can use to reduce five axioms of group theory to three. Normally, one defines a group to have an identity element e such that ae = ea = a for all a, and each element a has an inverse a⁻¹ such that aa⁻¹ = a⁻¹a = e. But if you are trying to be clever, you can observe that it is sufficient for there to be a left identity and a left inverse:

There must be an identity e such that ea = a for all a, and for each a there must be an a⁻¹ such that a⁻¹a = e.

We no longer require ae = a or aa⁻¹ = e, but it turns out that you can prove these anyway, from what is left. The fact that you can discard two of the axioms is mildly interesting, but of very little practical value in group theory.
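For the record, here is the standard derivation (my sketch, not from the book under review): first recover the right inverse, then the right identity.

$$\begin{aligned} aa^{-1} &= e(aa^{-1}) = \bigl((a^{-1})^{-1}a^{-1}\bigr)(aa^{-1}) = (a^{-1})^{-1}\bigl((a^{-1}a)a^{-1}\bigr) \\ &= (a^{-1})^{-1}(ea^{-1}) = (a^{-1})^{-1}a^{-1} = e \\ ae &= a(a^{-1}a) = (aa^{-1})a = ea = a \end{aligned}$$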

I thought that probably the □(xy) = □(x(□y)) thing was some similar bit of “cleverness”, and that perhaps by adopting this one axiom Freyd was able to reduce his list of axioms. For example, from that mysterious fourth axiom you can conclude that x(□y) is defined if and only if xy is, and therefore, by the first axiom, that x□ = □(□y) if and only if x□ = □y, so that □(□y) = □y. So perhaps the phrasing of the axiom was chosen to allow Freyd to dispense with an additional axiom stating that □(□y) = □y.

Today I tinkered with it a little bit and decided I think not.

Freyd has:

$$\begin{align} xy \text{ is defined if and only if } x□ & = □y \tag{1} \\ (□x)□ & = □x \tag{2} \\ (□x)x & = x \tag{3} \\ □(xy) & = □(x(□y)) \tag{4} \end{align} $$

and their duals. Also composition is associative, which I will elide.

In place of 4, let's try this much more straightforward axiom:

$$ □(xy) = □x\tag{$4\star$} $$

I can now show that axioms 1–3 and 4⋆ together imply 4.

First, a lemma: □(□x) = □x. Axiom 3 says (□x)x = x, so therefore □((□x)x) = □x. By 4⋆, the left-hand side reduces to □(□x), and we are done.

Now I want to show 4, that □(xy) = □(x(□y)). Before I can even discuss the question I first need to show that x(□y) is defined whenever xy is; that is, whenever x□ = □y. But by the lemma, □(□y) = □y, so x□ = □(□y) whenever x□ = □y, which is just what we needed.

At this point, 4⋆ implies 4 directly: both sides of 4 have the form □(x…), and 4⋆ tells us that both are equal to □x.

Conversely, 4 implies 4⋆. So why didn't Freyd use 4⋆ instead of 4? I emailed him to ask, but he's 83 so I may not get an answer. Also, knowing Freyd, there's a decent chance I won't understand the answer if I do get one.

My plaintive review of this book continued:

Another, briefer complaint about this book: Early on, no later than page 13, Freyd begins to allude to "Lazard sheaves". These are apparently an important example. Freyd does not define or explain what "Lazard sheaves" are. Okay, you are expected to do some background reading, perhaps. Fair enough. But you are doomed, because "Lazard sheaves" is Freyd's own private coinage, and you will not be able to look it up under that name.

Apparently some people like this book. I don't know why, and perhaps I never will.

by Mark Dominus at July 23, 2019 09:01 PM

Oleg Grenrus

All kinds of lattices

Posted on 2019-07-23 by Oleg Grenrus

This post contains excerpts from a PDF "post" I have written. It looks way better in PDF.

A partially ordered set, or poset for short, is a well-worn example in category theory books. Yet posets are not too familiar to an average CT-curious Haskeller, even though they are easy to visualise! Let’s do exactly that, and mention elementary bits of category theory along the way. In particular, I’ll scan over the first six chapters of Category Theory by Steve Awodey, repeating related definitions and poset examples.1


Definition (Awodey 1.1) A category consist of the following data

  • Objects: A, B, C, \ldots

  • Arrows: f, g, h, \ldots

  • For each arrow f, there are given objects

    \mathrm{dom}(f), \qquad \mathrm{cod}(f)

    called the domain and codomain of f. We write

    f : A \to B

    to indicate that A = \mathrm{dom}(f) and B = \mathrm{cod}(f).

  • Given arrows f : A \to B and g : B \to C, that is, with

    \mathrm{cod}(f) = \mathrm{dom}(g)

    there is given an arrow

    g \circ f : A \to C

    called the composite of f and g.

  • For each object A, there is given an arrow

    1_A : A \to A

    called the identity arrow of A.

  • Associativity:

    h \circ (g \circ f) = (h \circ g) \circ f

    for all f : A \to B, g : B \to C, h : C \to D.

  • Unit:

    f \circ 1_A = f = 1_B \circ f

    for all f : A \to B.

We’ll see how the Category type class is related in a later section.


Before proceeding, we’ll answer a question: is there a category with posets as objects? Yes, it’s called \mathbf{Pos}! What are the arrows in this category? An arrow from a poset A to a poset B is a function

m : A \to B

that is monotone, in the sense that, for all a, a' \in A,

a \le_A a' \qquad\text{implies}\qquad m(a) \le_B m(a').

We need to know that 1_A : A \to A is monotone, and also that if f : A \to B and g : B \to C are monotone, then g \circ f : A \to C is monotone. That holds, check!

Recall, posets are categories, so monotone functions are "mappings" between categories. A "homomorphism2 of categories" is called a functor.

Definition (Awodey 1.2) A functor

F : \mathbf{C} \to \mathbf{D}

between categories \mathbf{C} and \mathbf{D} is a mapping of objects to objects and arrows to arrows, in such a way that

  • F (f : A \to B) = F(f) : F(A) \to F(B),

  • F(1_A) = 1_{F(A)},

  • F(g \circ f) = F(g) \circ F(f).

That is, F preserves domains and codomains, identity arrows, and composition. A functor F : \mathbf{C} \to \mathbf{D} thus gives a sort of "picture" – perhaps distorted – of \mathbf{C} in \mathbf{D}.

The Haskell version looks quite different.

There is a mismatch between the notation of category theory and what can be written as code. In CT notation, F acts both on objects and arrows. In Haskell, f acts on objects, and fmap f acts on arrows. Substituting id and (.) into the definition of a functor also gives the familiar laws.

However, Haskell's Functor class is only for functors from a pseudo-category \mathbf{Hask} to itself, where f, a mapping from types to types, is a type constructor, not an arbitrary type family. Functor is a very special case of the category-theoretical variant.
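Spelled out in code, here is a small check of the two laws at sample points, using Maybe (my example, not from the post):

```haskell
-- base defines:
--   class Functor f where
--     fmap :: (a -> b) -> f a -> f b
--
-- Substituting id for identity arrows and (.) for composition in the
-- CT definition yields the familiar Functor laws, checked here at a
-- couple of sample points for Maybe:

lawIdentity :: Bool
lawIdentity = fmap id (Just (3 :: Int)) == id (Just 3)

lawComposition :: Bool
lawComposition =
  fmap ((+ 1) . (* 2)) (Just (3 :: Int))
    == (fmap (+ 1) . fmap (* 2)) (Just 3)
```

Of course, checking a law at a few points is not a proof; for Haskell's Functor, the composition law in fact follows from the identity law by parametricity.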

With small posets, like Bool and M2, we can visualise a monotone function, a functor. Let’s consider a function f :: Bool -> M2.

Graph of f :: Bool -> M2

The graph of f is on [fig:f-bool-to-m2]. Dotted and dashed lines are arrows in Bool and M2 respectively. We can see on figure, that f indeed gives a picture of Bool in M2.

In Haskell we have only written a mapping of objects, True and False. The mapping of arrows is something we need to check to be able to claim that f is a functor, and therefore a monotone function. The other way around: there are mappings from Bool to M2 which aren’t monotone and aren’t functors.
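That check can be brute-forced over these finite posets. Here is a sketch (my code; I'm assuming M2 is the four-point "diamond" lattice with two incomparable middle elements, which I name M2A and M2B):

```haskell
-- The diamond lattice M2: M2Bot <= everything, everything <= M2Top,
-- and M2A, M2B incomparable.
data M2 = M2Bot | M2A | M2B | M2Top deriving (Eq, Show)

leqM2 :: M2 -> M2 -> Bool
leqM2 M2Bot _     = True
leqM2 _     M2Top = True
leqM2 x     y     = x == y

-- A function Bool -> M2 is monotone iff every related pair of inputs
-- (here just False <= True, plus the reflexive pairs) maps to a
-- related pair of outputs.
monotone :: (Bool -> M2) -> Bool
monotone f = and [ leqM2 (f x) (f y)
                 | x <- [False, True], y <- [False, True], x <= y ]
```

For instance, monotone (\b -> if b then M2Top else M2A) is True, while monotone (\b -> if b then M2A else M2B) is False.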

In this section we went backwards. A more principled approach would have been to first consider functors between poset categories. The monotonicity requirement is implied by the first functor requirement. This is a power of category theory: when you look for something, category theory tells you which properties it should have. Once you find something which satisfies the requirements, you know that it’s the right one (up to an isomorphism).

Exponential lattices with M2 are pretty. ZHO -> M2 ([fig:zho2m2]) has a nice planar graph. Yet the final stress test, (ZHO -> ZHO) -> ZHO, or \mathtt{ZHO}^{\mathtt{ZHO}^\mathtt{ZHO}}, is on [fig:big]. A beautiful monster.

ZHO -> M2 lattice
(ZHO -> ZHO) -> ZHO lattice

  1. If you want to learn category theory, getting a book is a small investment. Your public or university library probably have a copy.

  2. morphism preserving the structure

July 23, 2019 12:00 AM

July 22, 2019

Roman Cheplyaka

A curious associativity of the <$> operator

The <$> operator in Haskell is an infix synonym for fmap and has the type

(<$>) :: Functor f => (a -> b) -> f a -> f b

For example, negate <$> Just 5 evaluates to Just (-5).

Like all other operators, <$> has a fixity, which can be infixl, infixr, or infix. The fixity defines how an expression like

abs <$> negate <$> Just 5

is parsed.

If <$> were defined as infix, then the above expression wouldn’t parse at all, giving an error like

Precedence parsing error
    cannot mix ‘<$>’ [infix 4] and ‘<$>’ [infix 4] in the same infix expression

If <$> instead were defined as infixr, then the above expression would parse as

abs <$> (negate <$> Just 5)

meaning first apply negate to 5 inside Just, and then go inside Just once more and apply abs.

However, <$> is defined as infixl, so that applicative chains like

f <$> x <*> y <*> z

would parse correctly. This means that

abs <$> negate <$> Just 5

is parsed as

(abs <$> negate) <$> Just 5
which is probably not what you meant. Or is it?

It turns out that if you paste that expression (either parenthesized or not) into ghci, you’ll get back Just 5 in all cases.

This happens because of the following instance defined in the base library:

instance Functor ((->) r) where
    fmap = (.)

(Note that (->) r is essentially the Reader monad, so it’s not surprising that it’s a functor.)

So the expression

(abs <$> negate) <$> Just 5

does make sense and is equivalent to

(abs . negate) <$> Just 5
The operators * for which a * (b * c) = (a * b) * c are called associative. Usually associative operators have the type a -> a -> a (either polymorphic or for a specific a), so that both ways to put parentheses typecheck.

The <$> operator is curious because it doesn’t have such a type and yet is associative:

f <$> (g <$> x) = (f <$> g) <$> x
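To see this concretely, here are both parses of the chain from above (my snippet; the left-parenthesized version relies on the (->) instance, the right one on the Maybe instance):

```haskell
leftParse, rightParse :: Maybe Int
leftParse  = (abs <$> negate) <$> Just 5  -- abs . negate, mapped over Just 5
rightParse = abs <$> (negate <$> Just 5)  -- two successive fmaps over Maybe
-- Both evaluate to Just 5.
```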

July 22, 2019 08:00 PM

Monday Morning Haskell

Analyzing Our Parameters


Our last couple articles have focused on developing an AI for the player character in our game. It isn't perfect, but it's a decent approximation of how a human would try to play the game. This means we can now play iterations of the game without any human involvement. And by changing the parameters of our world, we can play a lot of different versions of the game.

Our goal for this week will be to write some simple analysis functions. These will play through the game without needing to display anything on the screen. Then we'll be able to play different versions in quick succession and compare results.

As always, the code for this project is on a Github Repository. For this article, take a look at the analyze-game branch.

If you're completely new to Haskell, a simple game like this is a great way to get started! But you should start with our Beginners Checklist! It'll help you get everything set up with the language on your local machine! Then you can move onto our Liftoff Series to learn more about Haskell's mechanics.

Generating a Result

The first thing we need is a function that takes a world state and generates a result for it. Our game does have a degree of randomness. But once we fix the starting random generator, everything is deterministic. This means we need a function like:

runGameToResult :: World -> GameResult

We'll want to use our updateFunc from the main game runner. This is our "evolution" function. Its job is to go from one World state to another. It evolves the game over the course of one timestep by allowing each of the agents to make a decision (or wait). (Note: we don't use the Float parameter in our game; it's just needed by Gloss.)

updateFunc :: Float -> World -> World

Since we want to track an ever evolving stateful variable, we'll use the State monad. For each iteration, we'll change the world using this update step. Then we'll check its result and see if it's finished. If not, we'll continue to run the game.

runGameToResult :: World -> GameResult
runGameToResult = evalState runGameState
  where
    runGameState :: State World GameResult
    runGameState = do
      modify (updateFunc 1.0)
      currentResult <- gets worldResult
      if currentResult /= GameInProgress
        then return currentResult
        else runGameState

Analysis: Generating World Iterations

Now that we can run a given world to its conclusion, let's add another step to the process. We'll run several different iterations with any given set of parameters on a world. Each of these will have a different set of starting enemy locations and drill power-ups. Let's make a function that will take a random generator and a "base world". It will derive a new world with random initial enemy positions and drill locations.

generateWorldIteration :: World -> StdGen -> World

We'll use a helper function from our game that generates random locations in our maze. It's stateful over the random generator.

generateRandomLocation :: (Int, Int) -> State StdGen Location

So first let's get all our locations:

generateWorldIteration :: World -> StdGen -> World
generateWorldIteration w gen1 = ...
  where
    params = worldParameters w
    rowCount = numRows params
    columnCount = numColumns params
    enemyCount = numEnemies params
    drillCount = numDrillPowerups params

    (enemyLocations, gen2) = runState
      (replicateM enemyCount (generateRandomLocation (rowCount, columnCount)))
      gen1
    (drillLocations, gen3) = runState
      (replicateM drillCount (generateRandomLocation (rowCount, columnCount)))
      gen2

Then we have to use the locations to generate our different enemies. Last, we'll plug all these new elements into our base world and return it!

generateWorldIteration :: World -> StdGen -> World
generateWorldIteration w gen1 = w
  { worldEnemies = enemies
  , worldDrillPowerUpLocations = drillLocations
  , worldRandomGenerator = gen3
  , worldTime = 0
  }
  where
    params = worldParameters w
    (enemyLocations, gen2) = ...
    (drillLocations, gen3) = ...
    enemies = mkNewEnemy (enemyGameParameters params) <$> enemyLocations

Analysis: Making Parameter Sets

For our next order of business, we want to make what we'll call a parameter set. We want to run the game with different parameters each time. For instance, we can take a base set of parameters, and then change the number of enemies present in each one:

varyNumEnemies :: GameParameters -> [GameParameters]
varyNumEnemies baseParams = newParams <$> allEnemyNumbers
  where
    baseNumEnemies = numEnemies baseParams
    allEnemyNumbers = [baseNumEnemies..(baseNumEnemies + 9)]
    newParams i = baseParams { numEnemies = i }

We can do the same for the number of drill pickups:

varyNumDrillPickups :: GameParameters -> [GameParameters]
varyNumDrillPickups baseParams = newParams <$> allDrillNumbers
  where
    baseNumDrills = numDrillPowerups baseParams
    allDrillNumbers = [baseNumDrills..(baseNumDrills + 9)]
    newParams i = baseParams { numDrillPowerups = i }

Finally, we can have a different cooldown time for our player's stun ability.

varyPlayerStunCooldown :: GameParameters -> [GameParameters]
varyPlayerStunCooldown baseParams = newParams <$> allCooldowns
  where
    basePlayerParams = playerGameParameters baseParams
    baseCooldown = initialStunTimer basePlayerParams
    allCooldowns = [(baseCooldown - 4)..(baseCooldown + 5)]
    newParams i = baseParams
      { playerGameParameters = basePlayerParams { initialStunTimer = i }}

If you fork our code, you can try altering some other parameters. You can even try combining certain parameters to see what the results are!

Tying It Together

We've done most of the hard work now. We'll have a function that takes a number of iterations per parameter set, the base world, and a generator for those sets. It'll match up each parameter set to the number of wins the player gets over the course of the iterations.

runAllIterations
  :: Int
  -> World
  -> (GameParameters -> [GameParameters])
  -> [(GameParameters, Int)]
runAllIterations numIterations w paramGenerator =
  map countWins results
  where
    aiParams = (worldParameters w) { usePlayerAI = True }
    paramSets = paramGenerator aiParams

    runParamSet :: GameParameters -> [GameResult]
    runParamSet ps = map
      (runGame (w { worldParameters = ps }))
      [1..numIterations]

    runGame :: World -> Int -> GameResult
    runGame baseWorld seed = runGameToResult
      (generateWorldIteration baseWorld (mkStdGen seed))

    results :: [(GameParameters, [GameResult])]
    results = zip paramSets (map runParamSet paramSets)

    countWins :: (GameParameters, [GameResult]) -> (GameParameters, Int)
    countWins (gp, gameResults) =
      (gp, length (filter (== GameWon) gameResults))

We need one more function. It will read an input file and apply our steps over a particular parameter group. Here's an example with varying the number of enemies:

analyzeNumEnemies :: FilePath -> IO ()
analyzeNumEnemies fp = do
  world <- loadWorldFromFile fp
  let numIterations = 10
  putStrLn "Analyzing Different Numbers of Enemies"
  let results = runAllIterations numIterations world varyNumEnemies
  forM_ results $ \(gp, numWins) -> putStrLn $
    "With " ++ (show (numEnemies gp)) ++ " Enemies: " ++ (show numWins)
      ++ " wins out of " ++ (show numIterations) ++ " iterations."

Now we're done! In the appendix, you can find some basic results of our investigation!


Soon, we'll take our analysis steps and apply them in a more systematic way. We'll try to gauge the difficulty of a particular game level. Then we can make levels that get more and more challenging!

But first, we'll start exploring a few ways we can improve the player and enemy AI abilities. We'll start by implementing some basic caching mechanisms in our breadth first search. Then we'll consider some other AI patterns besides simple BFS.

For a review of the code in this article, take a look at our Github Repository. You'll want to explore the analyze-game branch!

We'll soon be exploring machine learning a bit more as we try to improve the game. Make sure to read our series on Haskell and AI to learn more! Download our Haskell Tensorflow Guide to see how we can use tensor flow with Haskell!


With 4 drills and 10 cooldown time:

Analyzing Different Numbers of Enemies
With 4 Enemies: 10 wins out of 10 iterations.
With 5 Enemies: 9 wins out of 10 iterations.
With 6 Enemies: 9 wins out of 10 iterations.
With 7 Enemies: 10 wins out of 10 iterations.
With 8 Enemies: 9 wins out of 10 iterations.
With 9 Enemies: 9 wins out of 10 iterations.
With 10 Enemies: 9 wins out of 10 iterations.
With 11 Enemies: 9 wins out of 10 iterations.
With 12 Enemies: 8 wins out of 10 iterations.
With 13 Enemies: 7 wins out of 10 iterations.

With 13 enemies and 10 cooldown time:

With 2 Drills: 5 wins out of 10 iterations.
With 3 Drills: 7 wins out of 10 iterations.
With 4 Drills: 8 wins out of 10 iterations.
With 5 Drills: 8 wins out of 10 iterations.
With 6 Drills: 8 wins out of 10 iterations.
With 7 Drills: 7 wins out of 10 iterations.
With 8 Drills: 8 wins out of 10 iterations.
With 9 Drills: 8 wins out of 10 iterations.
With 10 Drills: 8 wins out of 10 iterations.
With 11 Drills: 8 wins out of 10 iterations.

by James Bowen at July 22, 2019 02:30 PM

July 21, 2019

Neil Mitchell

Downloading all of Hackage

Summary: I wanted to download the latest version of every package in Hackage. Here's a script and explanation of how to do it.

Imagine you want the latest version of every package on Hackage. I found two tools that mirror every version of every package:

  • Using hackage-mirror you can do hackage-mirror --from="" --to="C:/hackage-mirror". But this project is long deprecated and doesn't actually work anymore.
  • Using hackage-mirror-tool you might be able to do it, but it requires a new Cabal, isn't on Hackage, doesn't seem to work on Windows and doesn't say whether it downloads to disk or not.

Given it's a fairly simple problem, after investigating these options for an hour, I decided to cut my losses and write a script myself. Writing the script took a lot less than an hour, and I even wrote this blog post while the download was running. The complete script is at the bottom of this post, but I thought it might be instructive to explain how I went about developing it.

Step 0: Set up my working environment

I created a file named Download.hs where I was writing the source code, used ghcid Download.hs in a VS Code terminal to get fast error feedback using Ghcid, and opened another terminal to execute runhaskell Download.hs for testing.

Step 1: Find where a download link is

You can download a package from Hackage at a URL derived from its name and version. You can also use https, but for my purposes of bulk downloading I figured http was fine. I hunted around to find a link which didn't contain the version number (as then I wouldn't have to compute the version number), but failed.

Step 2: Find a list of package versions

Looking at the cabal tool I found the cabal list --simple command, which prints a big list of packages in the form:

foo 1.0
foo 2.1
bar 1.0

For each package on Hackage I get all versions sequentially, with the highest version number last. I can execute this command using systemOutput_ "cabal list --simple" (where systemOutput_ comes from the extra library).

Step 3: Generate the list of URLs

Now that I have the data as a big string, I want to convert it into a list of URLs. The full pipeline is:

map (toUrl . last) . groupOn fst .  map word1 . lines

Reading from right to left, I split the output into a list of lines with lines, then split each line on its first space (using word1 from the extra library). I then use groupOn fst so that I get consecutive runs of each package (no points for guessing where groupOn comes from). For each list of versions for a package I take the last (since I know that's the highest one) and transform it into the URL using:

let toUrl (name, ver) = "" ++ name ++ "/" ++ name ++ "-" ++ ver ++ ".tar.gz"

Step 4: Download the URLs

I could make multiple calls to wget, but that's very slow, so instead I write them to a file and make a single call:

writeFile "_urls.txt" $ unlines urls
system_ "wget --input-file=_urls.txt"

I use the name _urls.txt so I can spot that special file in amongst all the .tar.gz files this command produces.

Step 5: Putting it all together

The complete script is:

import Data.List.Extra
import System.Process.Extra

main :: IO ()
main = do
let toUrl (name, ver) = "" ++ name ++ "/" ++ name ++ "-" ++ ver ++ ".tar.gz"
urls <- map (toUrl . last) . groupOn fst . map word1 . lines <$> systemOutput_ "cabal list --simple"
writeFile "_urls.txt" $ unlines urls
system_ "wget --input-file=_urls.txt"

After waiting 46 minutes I had 13,258 packages weighing in at 861Mb.

Update: In the comments Janek Stolarek suggested the simpler alternative of cabal list --simple | cut -d' ' -f1 | sort | uniq | xargs cabal get (I had missed the existence of cabal get). Niklas Hambüchen also shares a script which can download even faster.

by Neil Mitchell at July 21, 2019 04:22 AM

July 20, 2019

Oleg Grenrus

Embedding secret data into Docker images

Posted on 2019-07-20 by Oleg Grenrus engineering

In Multi-stage docker build of Haskell webapp blog post I briefly mentioned data-files. They are problematic. A simpler way is to use e.g. file-embed-lzma or similar functionality to embed data into the final binary.

You can also embed secret data if you first encrypt it. This would reduce the pain when dealing with (large) secrets. I personally favor configuration (of running Docker containers) through environment variables. Injecting extra data into containers is inelegant: that's another way to "configure" a running container, when one way would be enough.

In this blog post, I'll show that dealing with encrypted data in Haskell is not too complicated. The code is in the same repository as the previous post. This post is based on Tutorial: AES Encryption and Decryption with OpenSSL, but is updated and adapted for Haskell.

Encrypting: OpenSSL Command Line

To encrypt a plaintext using AES with OpenSSL, the enc command is used. The following command will prompt you for a password, encrypt a file called plaintext.txt and Base64 encode the output. The output will be written to encrypted.txt.
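The original command snippet was lost here, so the following is a reconstructed sketch. The flag values follow the notes below (PBKDF2 with 100000 iterations and the SHA1 digest); -pass is used instead of an interactive prompt so the command is scriptable:

```shell
# Reconstructed sketch of the encryption step described above.
# -pbkdf2/-iter select PBKDF2 key derivation; -md sha1 matches the
# pkcs5_pbkdf2_hmac_sha1 function used later on the Haskell side.
echo 'my secret configuration' > plaintext.txt
openssl enc -aes-256-cbc -base64 -salt -pbkdf2 -iter 100000 -md sha1 \
  -pass pass:HaskellCurry -in plaintext.txt -out encrypted.txt
# The output begins with the Base64 encoding of "Salted__" plus the salt.
head -c 16 encrypted.txt
```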

This will result in different output each time it is run, because a different (random) salt is used each time. The salt is written as part of the output, and we will read it back in the next section. I used HaskellCurry as the password, and placed an encrypted file in the repository.

Note that we use the -pbkdf2 flag. It's available since OpenSSL 1.1.1, which is available in Ubuntu 18.04 at the time of writing. Update your systems! We use 100000 iterations.

The choice of the SHA1 digest is made because pkcs5_pbkdf2_hmac_sha1 exists directly in HsOpenSSL. We will use it to derive the key and IV from a password in Haskell. Alternatively, you could use the -p flag, so that openssl prints the used Key and IV, and provide these to the running service.

Decrypting: OpenSSL Command Line

To decrypt file on command line, we'll use -d option:
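The decryption snippet was also lost in extraction; here is a self-contained reconstruction (it encrypts first so the -d step has something to work on):

```shell
# Sketch: encrypt, then decrypt with the -d option, using the same
# PBKDF2 flags as in the encryption section above.
echo 'hello' > plain.txt
openssl enc -aes-256-cbc -base64 -salt -pbkdf2 -iter 100000 -md sha1 \
  -pass pass:HaskellCurry -in plain.txt -out enc.txt
openssl enc -d -aes-256-cbc -base64 -pbkdf2 -iter 100000 -md sha1 \
  -pass pass:HaskellCurry -in enc.txt
```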

This command is useful to check "what's there". Next, the Haskell version.

Decrypting: Haskell

To decrypt the output of an AES encryption (aes-256-cbc) we will use the HsOpenSSL library. Unlike on the command line, each step must be performed explicitly. Luckily, it's a lot nicer than using C. There are 6 steps:

  1. Embed a file
  2. Decode Base64
  3. Extract the salt
  4. Get a password
  5. Compute the key and initialization vector
  6. Decrypt the ciphertext

Embed a file

To embed the file we use Template Haskell, specifically embedByteString from the file-embed-lzma library.

Decode Base64

Decoding Base64 is a one-liner in Haskell. We use decodeLenient because we are quite sure the input is valid.

Note: HsOpenSSL can also handle Base64, but doesn't seem to provide a lenient variant. HsOpenSSL throws exceptions on errors.

Extract the salt

Once we have decoded the ciphertext, we can read the salt. The salt is identified by the 8-byte header (Salted__), followed by the 8-byte salt itself. We start by ensuring the header exists, and then we extract the following 8 bytes:
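A sketch of this step using plain bytestring code (the function name is mine; the Salted__ framing is OpenSSL's salted-output format):

```haskell
import qualified Data.ByteString.Char8 as BS

-- Split "Salted__" || 8-byte salt || ciphertext,
-- as produced by `openssl enc -salt`.
extractSalt :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
extractSalt bs
  | header == BS.pack "Salted__" = Just (salt, ciphertext)
  | otherwise                    = Nothing
  where
    (header, rest)     = BS.splitAt 8 bs
    (salt, ciphertext) = BS.splitAt 8 rest
```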

Get a password

Probably you have some setup to extract configuration from environment variables. The following is a very simple way, which is enough for us.

We use unix package, and System.Posix.Env.ByteString.getEnv to get environment variable as ByteString directly. The program will run in Docker in Linux: depending on unix is not a problem.

We also initialize the OpenSSL library using withOpenSSL.

Compute the key and initialization vector

Once we have extracted the salt, we can use the salt and password to generate the Key and Initialization Vector (IV). To determine the Key and IV from the password we use the pkcs5_pbkdf2_hmac_sha1 function. PBKDF2 (Password-Based Key Derivation Function 2) is a key derivation function. We (as openssl does) derive both key and IV simultaneously:
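A sketch of this step. I'm assuming HsOpenSSL's pkcs5_pbkdf2_hmac_sha1 takes the password, the salt, the iteration count, and the desired output length (check the installed version for the exact module and signature); the 100000 iterations match the -iter flag used on the command line:

```haskell
import qualified Data.ByteString as BS
import OpenSSL.EVP.Digest (pkcs5_pbkdf2_hmac_sha1)  -- module name assumed

-- Derive 48 bytes in one call and split into a 32-byte AES-256 key
-- and a 16-byte IV, mirroring what `openssl enc` does internally.
deriveKeyIv :: BS.ByteString -> BS.ByteString -> (BS.ByteString, BS.ByteString)
deriveKeyIv password salt = BS.splitAt 32 keyIv
  where
    keyIv = pkcs5_pbkdf2_hmac_sha1 password salt 100000 (32 + 16)
```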

Decrypting the ciphertext

With the Key and IV computed, and the ciphertext decoded from Base64, we are now ready to decrypt the message.


In this post we embedded an encrypted file into Haskell application, which is then decrypted at run time. The complete copy of the code is at same repository, and changes done for this post are visible in a pull request.

July 20, 2019 12:00 AM

July 19, 2019

Roman Cheplyaka

Decompose ContT

Can we decompose a ContT action into separate acquire and release functions?


Consider some resource, such as a file handle, socket, database connection etc. The two actions common to all resources are acquiring and releasing.

There are two ways to represent these actions: as a pair of functions

acquireResource :: IO Resource
releaseResource :: Resource -> IO ()

or as a single function

withResource :: (Resource -> IO r) -> IO r

As I explained in Why would you use ContT?, the latter can be nicely wrapped into a continuation monad:

getResource :: ContT r IO Resource
getResource = ContT withResource

We can go from an (acquireResource, releaseResource) pair to the ContT version by using bracket from Control.Exception:

getResource :: ContT r IO Resource
getResource = ContT $ bracket acquireResource releaseResource

But can we go in the opposite direction, i.e. decompose a value of type ContT r IO Resource into separate acquire and release functions?

To give an example of why we may want to do that, say a library only provides the withResource or getResource form, but we want to cache the allocated resource instances, deciding dynamically which ones to free. Another example would be combining resources and coroutines, see Understanding ResourceT by Michael Snoyman.

Note that not every ContT action follows the pattern “acquire; call the continuation; release”. Instead, the continuation may be called multiple times like so:

ContT $ \k -> k 1 >> k 2

(The parametricity ensures, however, that an action of type forall r . ContT r m a will call its continuation at least once or throw an exception.)

So what we are really asking is, assuming the ContT action has a certain form such as bracket acquire release, can we recover the acquire and release1 actions?

A beautiful but useless solution

If our ContT action has the form

then we can recover the acquire and release x actions by… adding another ContT layer! This may sound bizarre, but the intuition behind this is simple: the release x is itself a continuation of k x—it is what happens when k x returns—and the ContT monad allows us to capture that continuation. Here’s the code:

Here, shift captures the current continuation delimited by runC; see Oleg’s delimited continuations tutorial.

Here is an example of how we can use decomposeContT. We generate two random numbers and then “release” them in their increasing order:

However, this solution has a major drawback. It relies on the ability to run our ContT action in a non-standard base monad (another ContT). Therefore, it cannot be used to decompose ContT $ bracket acquire release, because bracket cannot run in the ContT monad.

A more practical solution

A less elegant solution, but one that actually solves real-world problems, uses threads. Every resource is acquired in its own thread. The thread then blocks on an MVar until it receives a signal that the resource is no longer needed, in which case it proceeds to release it.

Here is the code.

{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import Control.Monad.Cont
import Control.Concurrent
import Control.Exception

data ContCall a
  = ContException !SomeException
    -- ^ an exception occurred before the continuation was called
  | ContCall a
    -- ^ the continuation was called with this argument
  | ContNoCall
    -- ^ the ContT action returned without ever calling the continuation

-- | Retrieve a resource from a 'ContT' action and return an action to
-- release it.
decomposeContT :: forall a . (forall r . ContT r IO a) -> IO (a, IO ())
decomposeContT ca = mask $ \restore -> do
  -- mvar_a is used to pass the 'a' result (or an exception) from the ContT thread to the
  -- calling thread
  mvar_a :: MVar (ContCall a) <- newEmptyMVar
  -- mvar_r is used to signal the ContT thread that its resources can be
  -- freed
  mvar_r :: MVar () <- newEmptyMVar
  -- mvar_e is used to communicate a possible final exception to the
  -- calling thread
  mvar_e :: MVar (Maybe SomeException) <- newEmptyMVar
  let
    -- tell the ContT thread to release its resources
    freeResources :: IO ()
    freeResources = void $ tryPutMVar mvar_r ()
    -- like freeResources, but also check and propagate any exception that
    -- arose while trying to free the resource
    freeResourcesCheckException :: IO ()
    freeResourcesCheckException = do
      void $ tryPutMVar mvar_r ()
      mb_e <- readMVar mvar_e
      maybe (return ()) throwIO mb_e
  pid <- forkIO $ do
    --             The code below runs in a new thread
    r <- try . restore $ runContT ca $ \a -> do
      -- Try recording the argument of the continuation call in the MVar.
      -- Might fail (return False) if this is not the first continuation
      -- call, but we don't care.
      _ :: Bool <- tryPutMVar mvar_a (ContCall a)
      -- Then wait until it's ok to free the resources.
      () <- readMVar mvar_r
      return ()
    case r of
      Right () -> do
        -- The ContT action returned successfully. We don't know how many
        -- times the continuation was called. Try putting ContNoCall in
        -- case it wasn't called at all. If it was called, then tryPutMVar
        -- will return False; we don't care.
        _ :: Bool <- tryPutMVar mvar_a ContNoCall
        _ :: Bool <- tryPutMVar mvar_e Nothing
        return ()
      Left e -> do
        -- An exception was raised. We don't know whether it was before or
        -- after the continuation was called. Try putting the exception in
        -- both mvar_a and mvar_e so that it surfaces exactly once
        -- (assuming the cleanup function eventually gets called) in the
        -- calling thread.
        _ :: Bool <- tryPutMVar mvar_a $ ContException e
        _ :: Bool <- tryPutMVar mvar_e $ Just e
        return ()
    --             The code above runs in a new thread

  -- Now, in the calling thread, we wait until mvar_a is filled
  -- readMVar is blocking, so it may receive exceptions. If there is
  -- an exception, we also send it to the ContT thread.
  contCall <- readMVar mvar_a `catch` (\(e :: SomeException) -> do throwTo pid e; throwIO e)
  case contCall of
    ContCall a -> return (a, freeResourcesCheckException)
    ContException e -> do
      throwIO e
    ContNoCall -> do
      throwIO $ ErrorCall "decomposeContT: the continuation was never called"

  1. Or, more precisely, release x, where x is the value returned by acquire. We cannot infer how release would act on any other value, of course.

July 19, 2019 08:00 PM

Oleg Grenrus

Insertion sort "toposorts" when given a partial order

Posted on 2019-07-19 by Oleg Grenrus agda

I wonder: suppose we have a decidable partial order, like the leq method of the PartialOrd type class in the lattices package.

Can we sort a list using leq so that we get a kind of topological sorting?

Note, topological sorting is used for graphs: there we don’t have the leq-like luxury. We’d first need to compute the transitive closure of the graph. On the other hand, a given list of "nodes" may not include all nodes, and may include duplicates.

Insertion sort is a simple sorting algorithm, trivial to implement for lists (i.e. not in-place). I continue wondering: maybe it will produce a topological ordering. Even though insertion sort is simple, I still had an uneasy feeling: does it really work? Insertion sort terminates no matter what function you pass as a comparator (even an impure random one!), so what kind of output does it produce when given a lawful leq?

I must admit that exposure to Haskell and Agda made me suspect all pen and paper proofs, especially those made by myself. So let’s certify insertion sort. The plan is to show how to certify insertion sort first for a total order and then for a partial order.

Certifying insertion sort

Module definitions and imports:

Next, the definitions of insert and sort. These are straightforward and familiar. Note that _≤?_ decides whether the ≤ relation holds. Returning Dec doesn’t forget what we compute (it returns the evidence), compared with simply returning Bool.

Twan van Laarhoven proved complete correctness of various sort algorithms (code gist). We’ll only prove sortedness of insertion sort.

Twan uses a _≤?_ : (x y : A) → (x ≤ y ⊎ y ≤ x) comparator (⊎ is Either), which implies that the ordering has to be total. Our ... Dec (x ≤ y) version is less powerful, therefore we’ll need to assume ≤-flip.

Twan also defines insert to operate on Sorted xs; our proof is completely external.

There are a few auxiliary lemmas, culminating with a lemma that insert preserves sortedness, and the theorem that sort produces a sorted list.

One could also show that sort produces a permutation of the input list, but that’s something I’m quite confident about already. Note: the proof of that fact won’t need any additional assumptions about ≤.

By the way, these proofs show how dependently typed programming is full of various list types.1 Luckily (or not), Agda allows reuse of constructor names.

module Total (A : Set) (_≤_ : A → A → Set)
  (_≤?_ : (x y : A) → Dec (x ≤ y))
  (≤-trans : ∀ {x y z} → x ≤ y → y ≤ z → x ≤ z)
  (≤-flip  : ∀ {x y} → ¬ (x ≤ y) → y ≤ x) -- precise enough!
  where

  infix 4 _≤*_

  open Sort _≤_ _≤?_

  data _≤*_ (x : A) : List A → Set where
    []  : x ≤* []
    _∷_ : ∀ {y ys} → x ≤ y → x ≤* ys → x ≤* y ∷ ys

  data Sorted : List A → Set where
    [] : Sorted []
    _∷_ : ∀ {x xs} → x ≤* xs → Sorted xs → Sorted (x ∷ xs)

  ≤*-trans : ∀ x y ys → x ≤ y → y ≤* ys → x ≤* ys
  ≤*-trans x y []        x≤y []             = []
  ≤*-trans x y (y' ∷ ys) x≤y (y≤y' ∷ y'≤ys) =
    ≤-trans x≤y y≤y' ∷ ≤*-trans x y ys x≤y y'≤ys

  lem-cons-all≤ : ∀ x y ys → x ≤ y → y ≤* ys → x ≤* y ∷ ys
  lem-cons-all≤ x y ys x≤y y≤ys = x≤y ∷ ≤*-trans x y ys x≤y y≤ys

  lem-skip : ∀ x y ys → y ≤ x → y ≤* ys → y ≤* insert x ys
  lem-skip x y []        y≤x y≤ys        = y≤x ∷ y≤ys
  lem-skip x y (y' ∷ ys) y≤x (y≤y' ∷ y'≤ys) with x ≤? y'
  ... | yes x≤y' = y≤x ∷ y≤y' ∷ y'≤ys
  ... | no ¬x≤y' = y≤y' ∷ lem-skip x y ys y≤x y'≤ys

  lem-insert-sorted : ∀ x xs → Sorted xs → Sorted (insert x xs)
  lem-insert-sorted x [] s = [] ∷ s
  lem-insert-sorted x (y ∷ ys) (y≤ys ∷ sys) with x ≤? y
  ... | yes x≤y = lem-cons-all≤ x y ys x≤y y≤ys ∷ y≤ys ∷ sys
  ... | no ¬x≤y = lem-skip x y ys (≤-flip ¬x≤y) y≤ys ∷ sorted-insert-x-ys
    where sorted-insert-x-ys = lem-insert-sorted x ys sys

  thm-sort-sorted : ∀ xs → Sorted (sort xs)
  thm-sort-sorted []       = []
  thm-sort-sorted (x ∷ xs) =
    lem-insert-sorted x (sort xs) (thm-sort-sorted xs)

... for only partial order

But what about partial order? Wikipedia says following about the topological sort:

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.

We massage that into a simple "there aren’t edges pointing backwards". So instead of saying "for all sublists x ∷ ys, x is less-than-or-equal to any y ∈ ys" we say "for all sublists x ∷ ys, x is not greater-than any y ∈ ys".

After that, the proof structure is quite similar. I needed to use antisymmetry of ≤, which shows that this Sorted predicate won’t hold for a preorder. I’m not sure whether insertion sort would work for a preorder; I’m not sure it won’t work either.

module Partial (A : Set) (_≤_ : A → A → Set)
  (_≤?_ : (x y : A) → Dec (x ≤ y))
  (≤-trans : ∀ {x y z} → x ≤ y → y ≤ z → x ≤ z)
  (≤-antisym : ∀ {x y} → x ≤ y → y ≤ x → x ≡ y)
  where

  open Sort _≤_ _≤?_

  record _<_ (x y : A) : Set where
    constructor le
    field
      is-le  : x ≤ y
      not-eq : ¬ (x ≡ y)

  open _<_

  <-trans₁ : ∀ {x y z} → x < y → y ≤ z → x < z
  <-trans₁ (le x≤y ¬x≡y) y≤z = le
    (≤-trans x≤y y≤z)
    (λ x≡z → ¬x≡y (≤-antisym x≤y (subst (λ i → _ ≤ i) (sym x≡z) y≤z)))

  infix 4 _¬>*_

  -- x ¬>* ys = x is not larger than any y ∈ ys
  data _¬>*_ (x : A) : List A → Set where
    []  : x ¬>* []
    _∷_ : ∀ {y ys} → ¬ (y < x) → x ¬>* ys → x ¬>* y ∷ ys

  data Sorted : List A → Set where
    [] : Sorted []
    _∷_ : ∀ {x xs} → x ¬>* xs → Sorted xs → Sorted (x ∷ xs)

  lem-trans-none> : ∀ x y ys → x ≤ y → y ¬>* ys → x ¬>* ys
  lem-trans-none> x y []       x≤y [] = []
  lem-trans-none> x y (z ∷ zs) x≤y (¬z<y ∷ y≤zs) =
    (λ z<y → ¬z<y (<-trans₁ z<y x≤y)) ∷ lem-trans-none> x y zs x≤y y≤zs

  lem-flip : ∀ {x y} → x ≤ y → ¬ (y < x)
  lem-flip {x} {y} x≤y with y ≤? x
  ... | yes y≤x = λ y<x → not-eq y<x (≤-antisym y≤x x≤y)
  ... | no ¬y≤x = λ y<x → ¬y≤x (is-le y<x)

  lem-cons-none> : ∀ x y ys → x ≤ y → y ¬>* ys → x ¬>* y ∷ ys
  lem-cons-none> x y ys x≤y y≤ys =
    lem-flip x≤y ∷ lem-trans-none> _ _ _ x≤y y≤ys

  lem-skip : ∀ x y ys → ¬ (x ≤ y) → y ¬>* ys → y ¬>* insert x ys
  lem-skip x y []       ¬x≤y []   = (λ p → ¬x≤y (is-le p)) ∷ []
  lem-skip x y (z ∷ zs) ¬x≤y (¬z<y ∷ y≤zs) with x ≤? z
  ... | yes x≤z = (λ x<y → ¬x≤y (is-le x<y)) ∷ ¬z<y ∷ y≤zs
  ... | no ¬x≤z = ¬z<y ∷ lem-skip x y zs ¬x≤y y≤zs

  lem-insert-sorted : ∀ x xs → Sorted xs → Sorted (insert x xs)
  lem-insert-sorted x [] s = [] ∷ s
  lem-insert-sorted x (y ∷ ys) (y≤ys ∷ sys) with x ≤? y
  ... | yes x≤y = lem-cons-none> x y ys x≤y y≤ys ∷ y≤ys ∷ sys
  ... | no ¬x≤y = lem-skip x y ys ¬x≤y y≤ys ∷ sorted-insert-x-ys
    where sorted-insert-x-ys = lem-insert-sorted x ys sys

  thm-sort-sorted : ∀ xs → Sorted (sort xs)
  thm-sort-sorted []       = []
  thm-sort-sorted (x ∷ xs) =
    lem-insert-sorted x (sort xs) (thm-sort-sorted xs)

Would merge sort work with partial order? I don’t know yet!

Experimenting in Haskell

Let’s try it in Haskell. You should test your hypotheses before trying to formally prove them. Proving false statements can take a lot of time!

After a few imports, we’ll define an isSorted check and try it on the N5 lattice. insertionSort works with leq, but mergeSort works only with <=.

import Algebra.PartialOrd
import Algebra.Lattice.N5
import Algebra.Lattice.Ordered
import Data.List (sortBy)
import Test.QuickCheck

lt :: PartialOrd a =>  a -> a -> Bool
lt x y = x /= y && leq x y

notGt :: PartialOrd a => a -> a -> Bool
notGt x y = not (lt y x)

-- | This checks that list is sorted in PartialOrd sense.
isSorted :: PartialOrd a => [a] -> Bool
isSorted []       = True
isSorted (x : ys) = isSorted ys && all (notGt x) ys

-- | Sorted holds when list is sorted using @Ord@
-- +++ OK, passed 100 tests.
totalProp :: [Ordered Int] -> Bool
totalProp = isSorted . sortBy compare

-- | Next, let's define insertion sort.
insertionSort :: (a -> a -> Bool) -> [a] -> [a]
insertionSort le = go where
    go []     = []
    go (x:xs) = insert x (go xs)

    insert x [] = [x]
    insert x (y : ys) | le x y    = x : y : ys
                      | otherwise = y : insert x ys

-- | And try with a partially ordered set.
-- +++ OK, passed 100 tests.
-- Works!
m5Prop :: [N5] -> Bool
m5Prop = isSorted . insertionSort leq

-- Then, naive mergesort.
mergeSort :: (a -> a -> Bool) -> [a] -> [a]
mergeSort f = go where
    go []  = []
    go [x] = [x]
    go xs  = let (ys, zs) = split xs
             in merge (go ys) (go zs)

    merge []     ys                 = ys
    merge xs     []                 = xs
    merge (x:xs) (y:ys) | f x y     = x : merge xs (y:ys)
                        | otherwise = y : merge (x:xs) ys

-- | >>> split [1..10]
-- ([1,3,5,7,9],[2,4,6,8,10])
split :: [a] -> ([a],[a])
split [] = ([], [])
split (x:xs) = case split xs of
    ~(ys,zs) -> (x:zs,ys)

-- | Our 'mergeSort' returns correct results. Works like 'sort'.
-- +++ OK, passed 100 tests
mergeProp :: [Int] -> Property
mergeProp xs = mergeSort (<=) xs === sortBy compare xs

-- >>> quickCheck m5Prop2
-- *** Failed! Falsified (after 18 tests and 4 shrinks):
-- [N5b,N5a,N5c]
-- sorted [N5a,N5c,N5b]
-- >>> insertionSort leq [N5b,N5a,N5c]
-- [N5c,N5b,N5a]
-- Sort is not unique: insertionSort finds an ordering:
-- See picture at
-- >>> leq N5b N5a
-- True
-- >>> isSorted [N5c,N5b,N5a]
-- True
-- >>> isSorted [N5b,N5a,N5c]
-- True
-- >>> isSorted [N5b,N5c,N5a]
-- True
m5Prop2 :: [N5] -> Property
m5Prop2 xs =
    let xs' = mergeSort leq xs
    in counterexample ("sorted " ++ show xs') $ isSorted xs'

Note: isSorted is somewhat lax: as the examples above show, a list containing incomparable elements can count as sorted in more than one order.

But that’s not an issue for me, as I’ll be sorting lists with distinct elements. More on that later.
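To see the laxness without pulling in the lattices package, consider a hand-rolled three-element poset where A and B are incomparable and both sit below T (the names P, leqP, notGtP, and isSortedP are mine, mirroring the definitions above):

```haskell
-- A minimal sketch of a partial order: A and B are incomparable,
-- and both are below the top element T.
data P = A | B | T deriving (Eq, Show)

-- x `leqP` y holds when x equals y or y is the top element.
leqP :: P -> P -> Bool
leqP x y = x == y || y == T

-- Same shape as notGt above: y is not strictly greater than x.
notGtP :: P -> P -> Bool
notGtP x y = not (x /= y && leqP y x)

-- Same shape as isSorted above.
isSortedP :: [P] -> Bool
isSortedP []       = True
isSortedP (x : ys) = isSortedP ys && all (notGtP x) ys
```

Both [A, B, T] and [B, A, T] count as sorted, while [T, A, B] does not: that is the laxness in action.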

  1. In Haskell there’s a variety of string types. In Dependent Haskell there will be a variety of list types and natural number types... and of string types.

July 19, 2019 12:00 AM

July 17, 2019

Tweag I/O

Revelations from repetition:
Source code headers in Haskell and Python

Simeon Carstens, Matthias Meschede

Every day we write repetitive code. A lot of it is boilerplate that you write only to satisfy your compiler/interpreter: code that is not related to the main logic of the program, like import and export lists, language extensions, and file headers. But how do languages differ in their boilerplate content? Is it only the content of the boilerplate that changes, or also its quantity? We explore these questions using data sets of Python and Haskell code. Besides satisfying our curiosity, we will learn about community-wide habits and realize that after all, repetition is not necessarily uninteresting!

A first look at the data

Our data sets come from public repositories of Haskell and Python packages. In the case of Haskell, we use a current snapshot of all packages on the Stackage server. For Python, we downloaded a random subset of approximately 2% of all packages on the Python Package Index. Based on our sample, we estimate the total size of all (compressed!) packages on PyPI to be approximately 19 Gb. We chose to use only a small sample from PyPI so that we could perform analyses on our laptops. This sampling allows us to load the Python data set in memory, while keeping its size comparable to the Haskell one.

Let's first look at a few key characteristics of our data sets, namely the number of packages, total number of lines of code (LOC), LOC per package, number of words, and the most common word:

                         Python       Haskell
Number of packages       3414         2312
LOC                      6,048,755    3,862,107
Average LOC per package  1772         1760
Number of words          36,577,867   23,174,821
Most common word         x (6.7%)     NUL (4.5%)

Hold on! NUL is the most common word in Stackage packages? Surprising, but true: \NUL is the escape code for the null character, and a small number of packages (2.7%) have inline byte strings with many, many copies of \NUL in them. The next most common Haskell word is "a", which is a common type and term variable name. It is also interesting to see that the average number of lines of code is very, very similar in the Haskell and the Python data sets!

Import statements and language extensions—how many are there?

Now let's take a closer look and see what we can learn from this data. As you might know, in both Python and Haskell, files start with a list of import statements. In Haskell, file headers also contain a list of LANGUAGE pragmas, which add extensions to the language. We thus expect import statements to be a common pattern in the source code data sets. In Haskell, we imagine LANGUAGE pragmas to be another common pattern.

Let's find out whether there are any differences in the frequency of these patterns between Python and Haskell code. We can easily determine a package's fraction of lines of code that correspond to import statements and LANGUAGE pragmas: this fraction is just the number of lines of code with import and language pragma keywords divided by the number of all lines of code. The following histograms show the results:

Haskell tends to have more import and LANGUAGE statements per package than Python, as indicated by the average percentage (dashed lines): for Python, it's about 6%, while for Haskell it's about 9.5%. Interestingly, in both languages, a few packages have a very high fraction of this specific kind of boilerplate code. Those can be found from the 50% mark on, but they are not visible in the figure because of their low package count. In the case of Python, such packages often are setuptools scripts, while for Haskell, they are module exports and setup files.
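The per-package fraction described above can be sketched in a few lines of Haskell (boilerplateFraction is an invented name; the post does not show its actual analysis code):

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- Fraction of a package's source lines that are import statements or
-- LANGUAGE pragmas: count matching lines, divide by all lines.
boilerplateFraction :: [String] -> Double
boilerplateFraction [] = 0
boilerplateFraction ls =
    fromIntegral (length (filter isBoilerplate ls))
        / fromIntegral (length ls)
  where
    isBoilerplate raw =
        let l = dropWhile isSpace raw
        in "import " `isPrefixOf` l || "{-# LANGUAGE" `isPrefixOf` l
```

Running this over every package and plotting a histogram of the resulting fractions is exactly the kind of measurement shown in the figures.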

... and which are the most frequently used?

Answering this question amounts to extracting the name of the imported package or used LANGUAGE pragma for each line of code. For Python, we first look at basic import [...] statements:

Few surprises for Python's basic imports - os and sys are the most frequently imported modules. In fact, they make up 27% and 19% of all basic imports. But things change dramatically when considering from [...] import [...] statements:

40% of all from [...] import [...] statements import things from TensorFlow, a popular machine learning library. We know that TensorFlow is popular, but that popular? It turns out that our random sample of Python packages happens to contain a complete version of TensorFlow and that "self-imports" within that package account for 83% of all TensorFlow imports. It is thus a single, big package which leads to this surprisingly high percentage of TensorFlow imports. Disregarding that package, around 2.5% of all import statements are concerned with TensorFlow, which would still crack the top 10.

Onwards to Haskell:

Here we find an unexpectedly high occurrence of explicit Prelude imports. Imports from the Data namespace make up 34% of all import statements, which matches our intuition that its contents are very frequently used.

When considering the most frequently used language pragmas, perhaps unsurprisingly, the OverloadedStrings extension leads the field: 40% of all Haskell packages in our data set use this extension. The popularity of this extension makes a good case for OverloadedStrings to enter the Haskell standard. Furthermore, it's surprising that TypeFamilies is the third most common language pragma. Type families are a fairly advanced subject and one would thus expect them not to be that commonly used. We can also compare our results to a previous analysis of Haskell source on GitHub, which, too, finds that OverloadedStrings is the most popular extension. The ten most popular extensions listed in the figure above also feature in that analysis' list of the 20 most frequently used language extensions, although not necessarily in the same order. The reason for that is not immediately clear—it might be that our Haskell data set is not representative of all Haskell code on GitHub; after all, at the time of writing, there are around 45,000 Haskell projects on GitHub, while our data set contains only 2,312 packages.


We took a first look at our data sets and investigated import statements and LANGUAGE pragmas in Python and Haskell. By filtering code lines for certain keywords we were able to answer interesting questions about the frequency with which these basic coding patterns occur and how their frequency differs between Python and Haskell code. But our data sets have potential for much deeper analysis of less obvious patterns. A pattern can be characterized by a set of similar lines of code, and we expect to find, for example, control structure patterns such as for loops. But are there unknown patterns to be discovered that we didn't think of? How do we discover them in our data sets? How do they differ between programming languages? And can we somehow exploit these patterns—think code completion or tools such as presented here? We're excited to see what other insights these data have to offer—stay tuned!

July 17, 2019 12:00 AM

Lysxia's blog

July 15, 2019

Sandy Maguire

Nimic: A language about nothing

If you haven’t heard, I recently solicited couches to stay on. The idea is to cruise around the globe, meeting cool programmers, and collaborating with them on whatever project they’re most excited about.

This weekend I spent time with the inimitable David Rusu. The bar for my trip has been set extremely high; not only is David an amazing host, but we hashed out a particularly interesting project in a couple of days. We call it nimic.

Nimic is a free-form macro language, without any real syntax, or operational semantics. We have a super bare bones parser that groups parenthetical expressions, and otherwise admits any tokens, separated by whitespace. The language will attempt to run each of its macros on the deepest, leftmost part of this grouping structure. If nothing matches, the program is stuck and we just leave it.

Therefore, hello world in nimic is just the stuck program:

hello world

which you have to admit is about as small as you can get it. The core language installs four built-in macros; the most interesting of which is macro — allowing you to define new macros. The syntax is macro pattern rewrite, which itself will be rewritten as the stuck term defined.

As a first program, we can use macro to rewrite the defined term:

macro defined (nimic is weird)

which will step to defined via the definition of macro, and then step to nimic is weird via the new defined rule. Here it gets stuck and does no more work.

You can use the special tokens #foo to perform pattern matching in a macro. These forms are available in the rewrite rule. For example,

macro (nimic is #adjective) (nimic is very #adjective)

will replace our nimic is weird term with nimic is very weird. You can bind as many subterms in a macro as you’d like.
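To make the matching story concrete, here is a toy model in Haskell of how such a rule could be applied (my own sketch with invented names like Term, match, and subst; this is not nimic's actual engine):

```haskell
import qualified Data.Map as M

-- Terms are bare symbols or parenthesised groups.
data Term = Sym String | Node [Term] deriving (Eq, Show)

type Binding = M.Map String Term

-- Symbols of the form #foo bind whatever subterm they are matched
-- against. (Conflicting bindings for a repeated #-name are not
-- checked in this sketch.)
match :: Term -> Term -> Maybe Binding
match (Sym ('#' : v)) t = Just (M.singleton v t)
match (Sym a) (Sym b) | a == b = Just M.empty
match (Node ps) (Node ts)
    | length ps == length ts = M.unions <$> sequence (zipWith match ps ts)
match _ _ = Nothing

-- Replay the bindings into the rewrite template.
subst :: Binding -> Term -> Term
subst b s@(Sym ('#' : v)) = M.findWithDefault s v b
subst _ s@(Sym _)         = s
subst b (Node ts)         = Node (map (subst b) ts)

-- One rewrite step with a single macro (pattern, rewrite).
rewrite :: (Term, Term) -> Term -> Maybe Term
rewrite (pat, rhs) t = (\b -> subst b rhs) <$> match pat t
```

Running the nimic is #adjective rule through rewrite turns the term nimic is weird into nimic is very weird, mirroring the step described above.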

Because nimic attempts to run all of its macros on the deepest, leftmost part of the tree it can, we can exploit this fact to create statements. Consider the program:

(macro (#a ; #b) #b)
; ( (macro (what is happening?) magic)
  ; (what is happening?)
  )
Here we’ve built a cons list of the form (head ; tail). Our default evaluation order will dive into the leftmost leaf of this tree, and evaluate the (macro (#a ; #b) #b) term there, replacing it with defined. Our tree now looks like this:

; ( (macro (what is happening?) magic)
  ; (what is happening?)
  )

where our new #a ; #b rule now matches, binding #a to defined, and #b to the tail of this cons cell. This rule will drop the defined, and give us the new tree:

( (macro (what is happening?) magic)
; (what is happening?)
)

whereby we now can match on the leftmost macro again. After a few more steps, our program gets stuck with the term magic. We’ve defined sequential evaluation!

But writing cons cells by hand is tedious. This brings us to the second of our built-in macros, rassoc #prec #symbol. Evaluating it also results in defined, but it modifies the parser so that #symbol becomes right-associative with precedence #prec. As a result, we can use rassoc 1 ; to modify the parser in order to turn a ; b ; c into a ; (b ; (c)).

Therefore, the following program will correctly get stuck on the term finished:

(macro (#a ; #b) #b)
; ((rassoc 1 ;)
( this is now
; parsed correctly as a cons cell
; finished
))

The final primitive macro defined in nimic is bash #cmd, which evaluates #cmd in bash, and replaces itself with the resulting output. To illustrate, the following program is another way of writing hello world:

bash (echo "hellozworld" | tr "z" " ")

Note that the bash command isn’t doing any sort of bash parsing here. It just takes the symbols echo, "hellozworld", |, tr, "z", and " ", and dumps them out pretty-printed into bash. There are no string literals.

We can use the bash command to do more interesting things. For example, we can use it to define an import statement:

macro (import #file) (bash (cat #file))

which when you evaluate import some/file.nim, will be replaced with (bash (cat some/file.nim)), which in turn with the contents of some/file.nim. You have to admit, there’s something appealing about even the module system being defined in library code.

But we can go further. We can push our math runtime into bash!

macro (math #m) (bash (bc <<< " #m "))

which will correctly evaluate any math expressions you shoot its way.

Personally, I’m not a big fan of the macro #a #b notation. So instead I defined my own sequent notation:

  rassoc 2 ----------
; macro (#a ---------- #b) (macro #a #b)

This thing defines a macro, which, when expanded, will itself define a macro. Now David and I don’t need to have any discussions bikeshedding over syntax. We can just define whatever we want!

For a longer example, I wanted to implement pattern matching a la Haskell.

My first step was to define a lazy if statement. Because macros are tried most-recently-defined first, I can define my operational semantics first. The rule is to force the condition:

; if #cond then #a else #b
  ----------
  if !#cond then #a else #b

(the exclamation marks here are magic inside of the macro macro, which will force macro expansion at whatever it’s attached to) Next, I give two more expansion rules for what to do if my condition is true and false:

; if True then #a else #b
  ----------
  #a

; if False then #a else #b
  ----------
  #b

Great! We can define syntactic equality of stuck terms by forcing two subterms, and then checking them in bash for string equality:

; judge #a #b
  ----------
  is zero !(bash ([[ " !#a " == " !#b " ]]; echo $?))

; is zero #n
  ----------
  False

; is zero 0
  ----------
  True

We can try this out. judge hello hello and judge (macro (a a a) b) defined both step to True, but judge foo bar steps to False.

Finally, we’re ready to define pattern matching. We start with the base case in which there is only one pattern we can match on:

; match #a of (#m1 -> #r1)
  ----------
  if !(judge #a #m1) then #r1 else (failed match)

We replace our match statement with an equality test, and produce failed match if the two things aren’t identical.

There’s also an induction case, where we want to match on several possibilities.

; match #a of (#m1 -> #r1 ; #z)
  ----------
  if !(judge #a #m1) then #r1 else (match !#a of #z)

Here we’d like to perform the same rewrite, but the else case should pop off the failed match.

Amazingly, this all just works. Let’s try it:

; rassoc 3 =

; #a = #b
  ----------
  macro #a #b

; not #t =
    match #t of
      ( True -> False
      ; False -> True
      )

; match (not True) of
    ( True -> hello
    ; False -> it works!
    )

This program will get stuck at it works!. Pretty sweet!

The core nimic implementation is 420 lines of Haskell code, including a hand-rolled parser and macro engine. But wait, there’s more! There’s also an additional 291-line interactive debugger, which allows you to step through the macro expansion of your program. It’s even smart enough to colorize the variables in your source tree that are being matched by the current macro.

The Glorious Nimic Stepper

Not bad for a weekend of work. I can barely contain my excitement for what other cool projects are on the horizon.

July 15, 2019 01:44 PM

Stackage Blog

Upcoming stackage LTS 14 snapshot with ghc-8.6.5

The stackage curator team has started preparations for an LTS major bump which will include ghc-8.6.5. This is not a change in GHC major version from the current LTS 13 series, which also uses ghc-8.6. In addition to allowing new, breaking changes of packages into an LTS release, this LTS 14 will be the first LTS using the new curator tool.

To give everyone some time to prepare, we are announcing the planned release date as approximately two to four weeks from the date of this post.

Getting your package into the nightly snapshot will ensure its inclusion in LTS 14 as well. To add your package, please submit a PR to commercialhaskell/stackage. To add your package to LTS 14 after it’s released, please open a new issue on commercialhaskell/lts-haskell. Please feel free to contact the stackage curators with any questions you may have via the stackage mailing list, or the stackage gitter chat.

For more insight into how stackage curators handle these sorts of things, see: the LTS major bump process.

July 15, 2019 12:01 PM

Michael Snoyman

How to lose weight

This is a guide for losing weight for health reasons. This assumes a few things:

  1. You already know that you need to lose weight
  2. When you say “lose weight,” what you mean is “I have too much fat, and I want to lose that fat, even if I lose some muscle”

For most people in western countries, both of these will be true. There are four basic ways to approach losing weight. They can be approached simultaneously, but it’s important to distinguish starting points:

  1. Exercise of some form (cardio, resistance training)
  2. Change how much you eat (reduce calories)
  3. Change what you eat (e.g. keto diet, vegan diet)
  4. Change when you eat (e.g. intermittent fasting)


Overall, if your goal is to lose weight, exercise is not where you should start. I strongly recommend exercise, and it’s a good thing to do. In particular, resistance training can help you build muscle which can improve overall health and assist in long term weight loss. But assuming you have limited time, energy, and will power, this is not going to give the best result.

How much you eat

This is in line with standard diet advice: eat less, move more! This advice has failed for decades. 95% of people who “diet” end up gaining back all the weight within a year. It’s for a simple reason: eating the same foods in the same way, but just eating less of them, requires a huge amount of willpower. You spend every minute of every day fighting hunger and cravings. You sit down to a meal, try to eat a small amount, and eventually give in, consume everything in sight, and gain the weight back.

There are lots of hormonal issues at play that make this happen, but the most important thing most people need to hear is:

  • Standard dieting advice is almost always doomed to fail
  • It is rarely, if ever, the fault of the dieter
  • Do not let medical professionals make you feel bad about this; they’re the ones providing bad advice!

What you eat

The first law of thermodynamics says that energy can’t be created or destroyed. Some people take this law of physics to mean that “it doesn’t matter what you eat, it’s all about calories.” This is wrong, shortsighted, and idiotic. The type of food you eat affects you hormonally, mentally, and even emotionally. What you choose to eat can increase or decrease your appetite, change how much you want to exercise, and alter your body’s metabolic rate.

There are a few simple rules for “what to eat” that are fairly universal:

  • Avoid sugar and refined grains
  • Try to eat “real foods”. If someone in the 1800s would recognize it, it’s real food. If it’s produced in a factory, it’s a food-like product.

From there it gets much more nuanced, and this is where people lose hope. Should I be vegan? Should I eat keto? Is fat going to kill me? Are carbs?

My personal recommendation is low carb. It helps fight cravings the best in my opinion. That said, any diet that sticks to real foods, and doesn’t overwhelm you with too much of the combination of both carbs and fat will be successful. You can lose weight on both a carnivore and vegan diet. Just choose one and stick with it!

When you eat

Modern medical advice includes insanity like “eat 6 small meals a day.” This is dumb. “Eat more often so you eat less.” No, that’s dumb. I’ll prove it to you. I know a method that every person on the planet agrees will result in weight loss. You know what that is: stop eating for a few days. Therefore: you don’t need to eat 6 small meals a day to lose weight.

There are lots of points in favor of restricted eating windows, where you have certain times of the day or the week when you don’t eat at all. It affects you hormonally, letting insulin levels drop, for instance. Will you eat more at the next meal? Probably. But it’s OK, because overall you’re eating less and losing weight.

I find that a 12pm-8pm eating window is really easy to incorporate. Start the morning with a cup of coffee or tea if you like, with a small amount of cream if desired, and then don’t eat until noon. Then try to keep your eating to two meals (lunch and dinner), finishing before 8pm. Add in 1 snack if you need to, ideally something like nuts.


You want to get started, and just get told what to do? OK, follow these steps, which are optimized to avoid demanding a lot of willpower:

  1. Identify the junk food you binge on, and remove it from your house. Give it away, throw it away, burn it and dance around the bonfire. It requires 0 willpower to eat food that isn’t there.
  2. Introduce a 12pm-8pm eating window. Allow yourself 1 day a week to be flexible on it, but otherwise stick to it solidly. You’ll be hungry the first few days, and then you’ll get used to it. Very little willpower necessary.
  3. Reduce carb intake. Sugary beverages should be the first to go. Do not make any meals based around breads, pastas, etc. This one is more complex, but low carb food is really delicious. It takes effort to pull this off, but you’re not fighting hunger and cravings on a daily basis.
  4. If you’re adventurous, try out a multiday fast. It’s best to spend some time eating a low carb diet first, but you can jump right in. The first time you do this will be hard, don’t expect otherwise. But I believe the mental freedom you get from realizing that you are not dependent on a constant food drip is worth the effort.


There are lots of more advanced topics. I mean, a lot. “How do I improve my lipids?” “How do I gain muscle while losing fat?” And so on. Don’t think about those now! Prove to yourself you can lose some weight, keep the weight off, and not suffer in the process. Don’t worry about the long term effects of what you’re doing. If you’re overweight or obese, and you bring down your body fat levels, you’re almost certainly making yourself healthier in the long term.

July 15, 2019 10:13 AM

Chris Smith 2

Building and Debugging FRP with CodeWorld and Reflex

Two weeks ago, I wrote about FRP with Reflex and CodeWorld. The integration has been popular, and I’m honored and thrilled that people find it useful and interesting.

(Aside: By the way, I neglected to mention last time that Joachim Breitner proposed this idea, and implemented it two years ago. I wasn’t able to merge his code at the time, because it was blocked on a Reflex Hackage release that took a long time. Ryan and I implemented the idea again at the NYHaskell CoHack, and I’ve been developing it further since then.)

However, the first implementation didn’t go as far as I wanted to.

  • While Reflex helped with composability, the CodeWorld API didn’t live up to the goal.
  • Reflex-based programs lacked the scene graph inspection features of the traditional CodeWorld APIs.
  • Reflex-based programs were missing the on-screen exploratory controls that are available for traditional CodeWorld APIs.

I’m now announcing some changes to CodeWorld’s Reflex integration to address these problems. I’ll show you how the system has changed, and what you can do now that you couldn’t do before.

Composability and the builder monad

I was still brand new to FRP when I started this project. While I understood the core abstractions — Event, Behavior, and Dynamic, which I introduced in the last post — I did not appreciate some of the more structural choices in how the reflex-dom library works.

Recall from last time that the key advantage of programming in FRP style, versus the simpler functional MVC style, is that the program can be assembled from well defined components that have inputs and outputs. (By comparison, in functional MVC style, a single interaction must be sprinkled through several parts of the program: its state must be added to the state type, it must be initialized in the initial state, its event handlers must be added to the program-wide event handling code, and its visualization must be added to the overall picture!) As I started to actually build non-trivial structured programs in CodeWorld’s Reflex integration, though, I found myself feeling like I hadn’t really escaped that world.

The original API for CodeWorld Reflex was this:

reflexOf :: (_ => ReactiveInput t -> m (Dynamic t Picture)) -> IO ()

Here, the inputs to the CodeWorld program are passed in, and one single picture is returned. The plumbing to thread the input everywhere, and especially to combine the resulting pictures everywhere was simply too much. The plumbing of output pictures obscured the way that higher-level interactions between components of the program were wired together.

The new API I’ve defined looks like this:

class ReflexCodeWorld t m
getKeyPress :: ReflexCodeWorld t m => m (Event t Text)
getKeyRelease :: ReflexCodeWorld t m => m (Event t Text)
getTextEntry :: ReflexCodeWorld t m => m (Event t Text)
getPointerClick :: ReflexCodeWorld t m => m (Event t Point)
getPointerPosition :: ReflexCodeWorld t m => m (Dynamic t Point)
isPointerDown :: ReflexCodeWorld t m => m (Dynamic t Bool)
getTimePassing :: ReflexCodeWorld t m => m (Event t Double)
draw :: ReflexCodeWorld t m => Dynamic t Picture -> m ()
reactiveOf :: (forall t m. ReflexCodeWorld t m => m ()) -> IO ()

Instead of passing an input object as an argument, the low-level input controls are available in the ambient monad. In practice, it was easier to just pass around the entire set of inputs anyway: it’s not really the caller’s business whether a control uses one or another physical input event. And instead of returning Dynamic t Picture as a result, the draw action can be used by any code running in the picture-builder monad to emit pictures to the screen.

This is similar to reflex-dom, too. There, if you want to add a checkbox or button to the screen, actually adding the button itself is handled as an effect in the DOM-builder monad, so that FRP values are only used for communicating higher-level interactions with the widget.
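The effect of this design can be modelled with a plain Writer monad. This is a deliberately simplified sketch: Picture is faked as a list of strings, draw takes a plain value rather than a Dynamic, and the names Builder, runScene, and button are mine, not CodeWorld's:

```haskell
import Control.Monad.Writer

-- Stand-in for CodeWorld's Picture; the point is only that it's a Monoid.
type Picture = [String]

-- Drawing is an effect in the builder monad: any code can emit pictures,
-- and the final scene is their monoidal combination.
type Builder = Writer Picture

draw :: Picture -> Builder ()
draw = tell

runScene :: Builder a -> Picture
runScene = execWriter

-- A component draws itself; callers never see its picture output.
button :: String -> Builder ()
button label = draw ["button: " ++ label]

scene :: Builder ()
scene = do
    button "Up"
    button "Down"
    draw ["dot"]
```

The point is that button emits its own picture as an effect, so the caller composes components rather than plumbing Picture values through every function.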

An example of this new API is here. Go ahead and play with it to see how it behaves.

{-# LANGUAGE OverloadedStrings #-}
import CodeWorld.Reflex
import Data.Text (Text)
import Reflex

main :: IO ()
main = reactiveOf $ do
    up    <- button "Up"    (0, -4)
    down  <- button "Down"  (0, -8)
    left  <- button "Left"  (-4, -6)
    right <- button "Right" (4, -6)
    pos <- foldDyn ($) (0, 0) $ mergeWith (.)
        [ (\(x, y) -> (x, y + 1)) <$ up
        , (\(x, y) -> (x, y - 1)) <$ down
        , (\(x, y) -> (x + 1, y)) <$ right
        , (\(x, y) -> (x - 1, y)) <$ left
        ]
    draw $ uncurry translated <$> pos <*> pure (solidCircle 0.2)
    return ()

button :: ReflexCodeWorld t m => Text -> Point -> m (Event t ())
button label (bx, by) = do
    draw $ constDyn $ translated bx by $
        dilated 0.75 (lettering label) <>
        rectangle 3 1 <>
        colored (light gray) (solidRectangle 3 1)
    click <- getPointerClick
    return (() <$ ffilter onButton click)
  where
    onButton (x, y) = abs (x - bx) < 1.5 && abs (y - by) < 0.5

This is a very minimal rendition of a button, but notice that once it’s been defined, button nicely captures the logical structure of a button: it needs a label and a point, and produces an event indicating it has been clicked. All of the details of a button, such as ensuring that it has been drawn to the screen and that it listens for mouse clicks, are handled as details inside its implementation. Because of this, main reads nicely as a description of which buttons appear where, and how they affect the important state: the position of the movable dot.

It’s interesting to consider what happens if I want a fancier button that turns a different shade when one is hovering over it. I can just change the implementation to add this new feature:

button :: ReflexCodeWorld t m => Text -> Point -> m (Event t ())
button label (bx, by) = do
    hover <- fmap onButton <$> getPointerPosition
    let color = bool (light gray) (light (light gray)) <$> hover
    draw $ pic <$> color
    click <- getPointerClick
    return (() <$ ffilter onButton click)
  where
    onButton (x, y) = abs (x - bx) < 1.5 && abs (y - by) < 0.5
    pic color = translated bx by $
        dilated 0.75 (lettering label) <>
        rectangle 3 1 <>
        colored color (solidRectangle 3 1)

In the functional MVC model, such as used by traditional CodeWorld, Gloss, Elm, or Racket’s universe model, something like this wouldn’t be so localized! The button needs new state to know whether it’s hovered over, so a change must be made in a data type for the state. New event handlers must be added to change the hover state in response to mouse-movement events. And the picture rendering must be updated, as was done here. Using FRP makes this painless, and the new ReflexCodeWorld class lets you encapsulate even more and stay focused on the application logic rather than plumbing event and picture values.

More Effects

Reflex offers a few more options, as well, for effects in the builder monad. I’ve also extended CodeWorld to implement some of these: most notably, the type classes PerformEvent, Adjustable, and PostBuild.

The PerformEvent type class lets you schedule effects to happen as a result of events in the FRP system. It’s basically MonadIO for FRP! Using this, we can cheat a little bit, and build CodeWorld programs that escape the canvas!

{-# LANGUAGE JavaScriptFFI #-}
import CodeWorld.Reflex
import Control.Monad.Trans (liftIO)
import Reflex

foreign import javascript unsafe "alert($1 + ',' + $2)"
    alert :: Double -> Double -> IO ()

main :: IO ()
main = reactiveOf $ do
    click <- getPointerClick
    performEvent_ $ (\(x, y) -> liftIO $ alert x y) <$> click

The special performEvent_ takes an Event of actions to perform, and performs them when the event fires. The actions are in a monad with a MonadIO instance, so among other things, they can perform I/O as you can see here.

The first use I was able to make of this new PerformEvent monad was to use it to handle drawing the screen and interacting with CodeWorld’s debug interface (see below). This helped me to move more of the code into the FRP fold.

Adjustable extends PerformEvent with the ability to not just perform actions, but even rebuild parts of the FRP network itself. This can be handy when you’re dealing with dynamic data and want to create controls for each element of a list, or something like that. I don’t recommend that beginners try to use Adjustable on its own, but there are some higher-level functions built on this in the “Collection management” section of the Reflex Quick Reference.

Finally, PostBuild just offers an event that fires as soon as the builder monad has completed building. Again, I don’t recommend using it directly, but it’s used in some higher-level functions in the Quick Reference.

(Another type class that’s part of Reflex, but not yet integrated into CodeWorld is TriggerEvent, which allows you to create an Event together with an action that fires it. This can be seen as the complement of PerformEvent. Perhaps there’s some value to adding this class, but in general CodeWorld programs don’t have significant logic that lives outside of CodeWorld, so it’s less useful. The main example in the Quick Reference has to do with creating a time ticker, and CodeWorld offers a better one using requestAnimationFrame anyway, so I’m not in a hurry.)

FRP and Debugging

Two debugging features were also missing from the Reflex API.

The Inspector, a feature accessed through the purple Inspect button on a running program, lets you browse the scene graph: the structure of primitive shapes, transformations, etc. that make up the screen. It even links from the screen and the tree view to the relevant source code. Originally implemented by Eric Roberts as part of Summer of Haskell a few years ago, this is a really powerful feature for debugging graphics programs.

The CodeWorld Inspector

The implementation of this feature was specific to traditional CodeWorld entry points, and the new Reflex integration originally didn’t offer the inspector. That’s now changed. If you create a program with CodeWorld’s reflexOf entry point, you will see the “Inspect” button on the bottom toolbar, and opening it will show you the scene graph and let you navigate the picture exactly as you can with the traditional CodeWorld library.

A second debugging feature is CodeWorld’s on-screen controls. These controls give a user the ability to zoom in or out, pan, pause, fast-forward, and so on, with a running program. While the Inspector is great for examining the parts of the results, the on-screen controls are useful for exploring the behavior of the whole program. These, too, were implemented in a way that was specific to traditional CodeWorld entry points.

On-screen debug controls in a CodeWorld program

I’ve now implemented basic on-screen controls for Reflex programs, as well. To try it out, you’ll need to:

  1. Write your code to the new reactiveOf instead of the old reflexOf entry point.
  2. Actually, use debugReactiveOf instead of reactiveOf. This has the same type, so it’s just six characters of changes, but it enables the on-screen controls.

I actually had to reimplement this feature entirely, as the previous implementation was tied very closely to the functional MVC architecture. This was a non-trivial implementation effort with some interesting challenges, and it has guided my learning about how to build UI code in Reflex with better composition and modularity.

I wasn’t able to entirely reproduce the earlier on-screen debugging feature set. Certain controls depended on some knowledge of the state: either they kept a state log, or they required a specific state type. The quintessential example here is the time-traveling debugger that Krystal Maughan wrote last summer, which keeps a log of past states and allows a developer to scroll through them and find the exact point something went wrong! This is the cost of Reflex’s increased modularity, I suppose: since the complete program state is no longer reflected in the program’s types, it’s not accessible or visible for inspection and debugging in the same way it is with functional MVC. Perhaps there’s a solution to that, but it’s not an obvious one.

What else?

I’m still not done. There are a few details I’m still figuring out.

  1. Performance isn’t where I’d like it to be. The new builder monad seems to do a lot of allocations, and the frame rate starts to get choppy toward the end of a garbage collection interval.
  2. One challenge is supporting common UI idioms, such as keyboard focus, or event propagation between controls. I haven’t got a good answer yet. I played around with a scheme for effectfully taking and filtering out events, so that a control could claim a mouse click and other controls wouldn’t see the same click. But getting that to work with both MonadFix and Adjustable in useful ways is not easy. (It took me a while to reach this intuition, but Adjustable is largely incompatible with effects that are observable inside the network. DynamicWriterT is the answer to this, but it’s limiting.)
  3. There are lots of places where I’d like to modify both the inputs and outputs to the program in a local scope. I’m thinking about API options for this, which would be added to the ReflexCodeWorld type class.
  4. I am getting increasingly interested in putting together a more detailed intro to FRP and Reflex using CodeWorld, that walks through the parts with more deliberate examples and exercises and structure. I don’t know if I’m quite qualified to write this, yet, but I am definitely at the stage where I’m excited about the prospect of it existing.

In any case, this has been a great project for me. I’ve got a moderately complex bunch of Reflex code totaling nearly a thousand lines and tackling some non-trivial challenges. I’m learning more than I expected, and having a lot of fun. Here’s to finding out what happens next!

by Chris Smith at July 15, 2019 05:10 AM

July 14, 2019

Donnacha Oisín Kidney

Bachelor's Thesis

Posted on July 14, 2019
Tags: Agda

I recently finished my undergrad degree in UCC. I’m putting my final-year project up here for reference purposes.

Here is the pdf.

And here’s a BibTeX entry:

	address = {Cork, Ireland},
	type = {Bachelor thesis},
	title = {Automatically and {Efficiently} {Illustrating} {Polynomial} {Equalities} in {Agda}},
	url = {},
	abstract = {We present a new library which automates the construction of equivalence proofs between polynomials over commutative rings and semirings in the programming language Agda [20]. It is significantly faster than Agda’s existing solver. We use reflection to provide a simple interface to the solver, and demonstrate how to use the constructed proofs to provide step-by-step solutions.},
	language = {en},
	school = {University College Cork},
	author = {Kidney, Donnacha Oisín},
	month = apr,
	year = {2019}

by Donnacha Oisín Kidney at July 14, 2019 12:00 AM

Lysxia's blog

July 12, 2019

Serokell Blog

Serokell Contributions to the Next Generation

Big businesses are often more interested in earning money than in investing in young people. As long as their goal is immediate results, they have neither the time nor the motivation to train new employees, and their job ads tend to target very experienced workers. We at Serokell believe in self-development and the practical application of the latest scientific advances, which is why we support students and gladly hire young specialists. We feel that our position is not only beneficial for business but also makes the world a better place.

Give Students Opportunities to Learn

Supporting children and youth doesn’t pay off quickly, so not every big IT company pays much attention to it. It’s much simpler to hire an experienced specialist who will bring immediate results than to search through thousands of fresh graduates’ CVs trying to find a brilliant one. As a result, there are not many opportunities for young people who want to grow in this domain.

We still remember being in those students’ shoes ourselves and want to change the situation by developing a supportive infrastructure for young people who like computer science and programming.

To achieve that, we act in several directions. First of all, many Serokell employees give lectures in the Computer Technology department of ITMO – one of Russia’s National Research Universities, well known in the tech community because ITMO students regularly win programming contests and championships. Teaching is mutually beneficial: students get a highly practice-oriented scientific approach, and we get access to the best young talent. Sounds like the dream of every HR specialist in tech, doesn’t it? That’s why some Serokell employees are ITMO graduates.

However, sharing experience and knowledge with younger people is not enough. Communication is equally important, which is why we also support different science-related activities for youth. For example, this year we financed the Bioinformatics Contest and are going to sponsor a computer summer school. It is a place where school students can participate in amazing workshops and learn useful things about computer science and programming, and we hope it will help them enter great universities.

In the future, we plan to become one of the sponsors of the International Collegiate Programming Contest, a programming competition among the universities of the world.

Encouraging children to learn modern technologies and programming is our investment in the future. These people will be developing the industry, which, in turn, will help us create more advanced products. We’re interested in scientific progress and want to help clever people grow, yet that’s not all. What drives us forward is a passion for finding practical applications for the latest scientific advances.

Young people learning modern technologies
Encouraging young people to learn modern technologies and programming is our investment in the future. – The source.

Employ Young Specialists

Many companies are reluctant to hire young specialists without considerable job experience. It’s easy to understand their doubts, since such employees often don’t know how to work in a team and may not be very well-disciplined. But what these big and rich companies tend to forget is that young specialists are usually open-minded and have up-to-date knowledge as well as an unhindered perspective.

For this reason, we are not scared of hiring fresh graduates. We give them an opportunity to work on real projects under the supervision of experienced core teams. Of course, beginners never work on crucial parts of projects, but we make sure they get their share of real industrial coding. Serokell CEO Arseniy Seroka explains how it works:

We never judge potential employees by their age. If a person is young and clever but doesn’t have enough skills to organise their work well, we will offer them a well-organised infrastructure and help them to operate in a way that is beneficial for both sides.

The idea is to make our company a supportive and friendly workspace for young employees. One of the basic principles of Serokell is freedom of expression: every intern can discuss anything with the team lead or the CEO, ask for help, and take part in the decision-making process.

Serokell also runs an internship program for university students. In our company, they get an opportunity to participate in real internal projects and hone their skills in the industry. After the internship, many of yesterday’s students start working at Serokell.

Supportive and friendly workspace
We want to make our company a supportive and friendly workspace for young employees. – The source.

Promote Science and Drive Progress

It’s fair enough that big businesses are primarily interested in money. We don’t have anything against money either, actually. But some people, namely scientists, want to move forward and drive human progress without thinking about profit in the first place. One doesn’t usually get into science to make a fortune, but rather to make the world a better place. Arseniy states:

Science, self-development and progress are the most important things in life, and I want Serokell to help the progress and reach the most ambitious goals.

It’s not an exaggeration to say that most of the Serokell team members share that view.

The company has far-reaching plans to keep supporting science. In particular, we are organising our commercial departments so that they can support a non-commercial department where scientists will be free to work on projects they’re passionate about and occasionally create something that may change the world.

We’re doing some good stuff already – we contribute to GHC, conducted Numeric Haskell research, organised a Machine Learning Laboratory – and we’re going to keep developing our non-commercial activity. Our goal here is not only to blend business and scientific progress but to prove that such an approach can be successful, in the hope that others may follow the example.

And that’s not all we do. At Serokell, we always encourage our colleagues to write scientific papers and participate in conferences on paid time. We also translate the most interesting articles written by young scientists and post them in a special section of the blog (just filter by Mathematics, and you’ll see them).

We’re always eager to collaborate with open-minded, passionate people, so if you feel we have similar views or want to discuss anything, don’t hesitate to contact us on social media or via email.

by Ellina Poponnikova at July 12, 2019 12:00 AM

July 11, 2019

Sandy Maguire

How I Intend to Help Steer GHC

Yesterday I was (somewhat surprisingly) brought on as a new member of the GHC steering committee. I believe transparency is very important in situations like these, and so I’d like to outline my intentions for my tenure in the committee.

I nominated myself because of problems I perceived in the status-quo. Lately, there have been a few notable cases of dissatisfaction in the community with the proposal process. I have certainly had my qualms with it in the past, but I decided that rather than complain about it, I should just get involved and try my best to help.

Which takes us to today.

I’m extremely humbled to be included with such a group of incredible individuals as my co-committee members are. I don’t claim to know everything about writing industrial-grade compilers, nor about the intricacies of cutting-edge type system research. Most of the time I’m only barely smart enough to keep up with the conversation, let alone have a strongly informed opinion on the topic. With that in mind…

I don’t intend to be a gatekeeper. Instead, I intend to be a facilitator; someone whose job it is to know the right people to bring into the discussion. Nobody in this community is an expert at everything, but everyone has a niche in which they are an expert. I want to make sure that those people are involved in the process in which their expertise can be used.

That’s not to say I’m not going to do everything in my power to understand each and every proposal under my stewardship. Just that I’ll be the first to admit that I don’t know everything, and I don’t want my understanding to get in the way of good ideas!

I intend to inspire the committee to move faster. I think any process which can take upwards of two years is too slow — it’s a waste of everyone’s time, energy and attention. My preference is fast responses to proposals, whether it be yea or nay. I don’t want to railroad any decisions, but I also don’t want to sit on the fence out of fear of making the wrong move.

Along those lines, I intend to enthusiastically support any quality-of-life improvements to the language that are small in scope, compose well, and don’t get in the way of active, longer-term goals. Delaying a decision on the basis of “there’s probably a more general solution here that nobody has time to think about, nor do they plan to actually propose” seems like a bad strategy for me. I don’t want to let perfect be the enemy of the much better.

I intend to preemptively shut down discussions around bikeshedding. Too often bikeshedding completely derails the conversation. It’s an enormous waste of time. Feelings get hurt, and everyone just talks past one another anyway. I intend to give the proposal author and the Simons the only say on bikeshedding matters. They can canvas for opinions if they’d like, but in my opinion, should do so outside the official proposal channels.

To be clear, I am not saying you shouldn’t offer suggestions if they improve a proposal’s composability, type-safety, elegance or parsimony. But let’s agree as a community that opinions of style are just opinions, and are never going to convince anyone anyway.

I intend to remove as many barriers to entry as humanly possible. The Haskell community is by far the smartest, kindest, most incredible group of people I’ve ever had the good fortune to be a part of. I firmly believe that each and every one of us has great things to contribute to this community, and I suspect the world would be a better place if more of us felt like we had some ownership over the language.

Please take a moment to remember that Haskell is only as great as we make it. There is no “community,” nor is there a “GHCHQ.” There are just people who step up to solve the problems that they see. I encourage you to be one of those people. You don’t need permission. You just need to be kind, genuine and hard-working. Remember that nobody involved in our community is stupid, and if they “just don’t get it,” maybe your argument wasn’t as persuasive as you thought.

All of this goes beyond making proposals to the compiler. Be the community you want to see. Don’t assume somebody else is going to do it for you.

Last but not least, I intend to be a good addition to this committee. I solicit anonymous feedback, so please don’t hesitate to call me out if you’re ever unhappy with my behavior. I promise to not be offended by any constructive feedback you send my way.

I will do my best to stay honest to these intentions, but help keep me in tow when I slip up.

The GHC steering committee is a thankless job. I’d like to take a moment to explicitly call out every member of the committee, past and present, for their selfless commitments to the community. These are people who are taking time out of their own research and projects in order to make our fantastic language much more transparent than it would be otherwise. Really and truly, these people are heroes, even if we don’t always agree with their decisions. Thank you, from the very bottom of my heart.

Finally, I intend to live up to everyone’s expectations. Thanks for your time.

July 11, 2019 05:40 PM

Mark Jason Dominus

The New York City passport office

Earlier this week I reported on a good visit I had had to the Philadelphia offices of the Social Security Administration.

Philadelphia government offices, in my experience, are generally better than those I have visited elsewhere. I've never been to the New York DMV office (do they even have one?) but the Philadelphia ones are way better than the New Jersey ones I used to use. Instead of standing in line for forty-five minutes, you get a number and sit down until your number is called.

The passport office was the biggest surprise. I first went in to deal with some passport thing shortly after I arrived in Philadelphia, maybe 1990 or so. The office was clean and quiet, the line was short, I got my business done quickly. None of those is the case in the New York passport office.

The New York passport office. Wow. Where to begin? I want to say that it defies description. But, I learned later, it has been described by no less a person than Samuel Beckett:

The abode is a flattened cylinder with rubber walls fifty meters in circumference and eighteen meters high. It is constantly illuminated by a dim, yellow light, and the temperature fluctuates between 5°C and 25°C, sometimes in as small an interval as four seconds. This leads to extremely parched skin, and the bodies brush against each other like dry leaves. Kisses make an "indescribable sound" and the rubber makes the footsteps mostly silent. There are 200 inhabitants, or one per square meter. Some are related to each other. Some are even married to each other, but the conditions make recognition difficult.

Here's a story of the New York passport office told to me by a friend many years ago. He stood in the line for forty-five minutes, and when he reached the window, he handed over his forms. The clerk glared at him for a few seconds, then, without a word, pushed them back.

“Is something wrong?” asked my friend.

There was a long pause. The clerk, too disgusted or enraged to reply immediately, finally said “They're not stapled.”

“Oh,” said my friend. “I see you have a stapler on your desk there.”

“You're supposed to staple them.”

“May I use your stapler?”

“No, your stapler is on the table at the back of the room.”

At this point my friend realized he was dealing with a monster. “Okay, but I can come right back to the window afterward, right?”

“No, you have to wait, like everyone else.”

At that moment my friend felt a tap on his shoulder. A man a few places behind him in line reached into his suit pocket and handed him a stapler. My friend says that as he stapled his papers and turned them in, the look on the clerk's face was of someone whose whole day had just been ruined.

“Thanks so much,” said my friend to Stapler Man. “Why did you happen to have a stapler in your pocket?”

“Oh,” said Stapler Man. “I do a lot of business at the passport office.”

by Mark Dominus at July 11, 2019 02:27 PM

July 06, 2019

Magnus Therning

Elasticsearch, types and indices

The other day I added some more logging into a service at work, but not all logs appeared in Kibana. Some messages got lost between CloudWatch Logs and Elasticsearch. After turning up the logging in the Lambda shuffling log messages I was in for a bit of learning about Elasticsearch.

Running the following in a Kibana console will show what the issue was
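The original snippets didn’t make it into this page, but a pair of commands along these lines reproduces it (my reconstruction; the index and field names are made up):

```
PUT idx/_doc/1
{ "v": 1 }

PUT idx/_doc/2
{ "v": "a string" }
```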

Executing them in order results in the following error on the second command

The reason for this is that a schema for the data is built up dynamically as documents are pushed in.1 It is possible to turn off dynamic schema building for an index using a mapping. For the documents above it’d look something like this
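A mapping that turns off dynamic field creation would look roughly like this (a sketch; in Elasticsearch versions before 7, the mappings block additionally sits under a type name):

```
PUT idx
{
  "mappings": {
    "dynamic": false
  }
}
```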

Now it’s possible to push both documents, however searching is not possible, because, as the documentation for dynamic says:

fields will not be indexed so will not be searchable but will still appear in the _source field of returned hits

If there’s something that determines the value of logs it’s them being searchable.

As far as I understand one solution to all of this would have been mapping types, but that’s being removed (see removal of mapping types) so isn’t a solution. I’m not sure if Elasticsearch offers any good solution to it nowadays. There’s however a workaround, more indices.

Using two indices instead of one does work. So modifying the first commands to use separate indices works.
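Concretely (again with made-up names), that means routing documents to an index per value type:

```
PUT idx-long/_doc/1
{ "v": 1 }

PUT idx-string/_doc/1
{ "v": "a string" }
```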

When creating an index pattern for idx-* there’s a warning about many analysis functions not working due to the type conflict. However, searching does work and that’s all I really care about in this case.

When shuffling the logs from CloudWatch Logs to Elasticsearch we already use multiple indices. They’re constructed based on service name, deploy environment (staging, production) and date (a new index each day). To deal with these type conflicts I added a log type that’s taken out of the log message itself. It’s not an elegant solution – it puts the solution into the services themselves – but it’s acceptable.

  1. Something that makes me wonder what the definition of schema-free is. I sure didn’t expect there to ever be a type constraint preventing pushing a document into something that’s called schema-free (see the Wikipedia article). (The initiated say it’s Lucene, not Elasticsearch, but to me that doesn’t make any difference at all.)

July 06, 2019 12:00 AM

July 05, 2019

Brent Yorgey

Lightweight invertible enumerations in Haskell

In a previous post I introduced a new Haskell library for enumerations (now on Hackage as simple-enumeration). The Data.Enumeration module defines a type Enumeration a, represented simply by a function Integer -> a which picks out the value of type a at a given index. This representation has a number of advantages, including the ability to quickly index into very large enumerations, and the convenience that comes from having Functor, Applicative, and Alternative instances for Enumeration.

I’ve just uploaded version 0.2 of the package, which adds a new Data.Enumeration.Invertible module with a new type, IEnumeration a, representing invertible enumerations. Whereas a normal enumeration is just a function from index to value, an invertible enumeration is a bijection between indices and values. In particular, alongside the Integer -> a function for picking out the value at an index, an invertible enumeration also stores an inverse function a -> Integer (called locate) for finding the index of a given value.

On the one hand, this comes at a cost: because the type parameter a now occurs both co- and contravariantly, IEnumeration is no longer an instance of Functor, Applicative, or Alternative. There is a mapE combinator provided for mapping IEnumeration a to IEnumeration b, but in order to work it needs both an a -> b function and an inverse b -> a.

On the other hand, we also gain something: of course the ability to look up the index of a value is nifty, and beyond that we also get a combinator

functionOf :: IEnumeration a -> IEnumeration b -> IEnumeration (a -> b)

which works as long as the IEnumeration a is finite. This is not possible to implement with normal, non-invertible enumerations: we have to take an index and turn it into a function a -> b, but that function has to take an a as input and decide what to do with it. There’s nothing we can possibly do with a value of type a unless we have a way to connect it back to the IEnumeration a it came from.
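To make the point concrete, here is a tiny self-contained toy version (my own sketch, not the library’s actual implementation) showing how the inverse is exactly what lets a function enumeration decode an index into a function:

```haskell
-- A toy invertible enumeration: a cardinality, an index function, and
-- its inverse. (Names mirror the library's API only loosely.)
data IEnum a = IEnum
  { cardE   :: Integer
  , selectE :: Integer -> a
  , locateE :: a -> Integer
  }

boolE :: IEnum Bool
boolE = IEnum 2 (== 1) (\b -> if b then 1 else 0)

-- Encode a function a -> b as a number in base (cardE eb), one digit
-- per possible argument. Decoding an index into a function needs
-- locateE ea to find the digit position for a given argument -- which
-- is why invertibility is essential here.
funcE :: IEnum a -> IEnum b -> IEnum (a -> b)
funcE ea eb = IEnum
  { cardE   = cardE eb ^ cardE ea
  , selectE = \i a -> selectE eb ((i `div` cardE eb ^ locateE ea a) `mod` cardE eb)
  , locateE = \f -> sum [ locateE eb (f (selectE ea k)) * cardE eb ^ k
                        | k <- [0 .. cardE ea - 1] ]
  }
```

In this toy version, locateE (funcE boolE boolE) not returns an index at which selectE produces a function behaving like not again (the concrete index depends on the enumeration order, so it need not match the library’s).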

Here’s a simple example of using the functionOf combinator to enumerate all Bool -> Bool functions, and then locating the index of not:

>>> bbs = functionOf (boundedEnum @Bool) (boundedEnum @Bool)
>>> card bbs
Finite 4
>>> locate bbs not
2
>>> map (select bbs 2) [False, True]
[True,False]

And here’s an example of enumerating recursive trees, which is parallel to an example given in my previous post. Note, however, how we can no longer use combinators like <$>, <*>, and <|>, but must explicitly use <+> (disjoint sum of enumerations) and >< (enumeration product) in combination with mapE. In return, though, we can find the index of any given tree in addition to selecting trees by index.

data Tree = L | B Tree Tree
  deriving Show

toTree :: Either () (Tree, Tree) -> Tree
toTree = either (const L) (uncurry B)

fromTree :: Tree -> Either () (Tree, Tree)
fromTree L       = Left ()
fromTree (B l r) = Right (l,r)

trees :: IEnumeration Tree
trees = infinite $ mapE toTree fromTree (unit <+> (trees >< trees))

>>> locate trees (B (B L (B L L)) (B (B L (B L L)) (B L (B L L))))
123
>>> select trees 123
B (B L (B L L)) (B (B L (B L L)) (B L (B L L)))

Of course, the original Data.Enumeration module remains available; there is clearly an inherent tradeoff to invertibility, and you are free to choose either style depending on your needs. Other than the tradeoffs outlined above and a couple other minor exceptions, the two modules export largely identical APIs.

by Brent at July 05, 2019 01:34 PM

Manuel M T Chakravarty

My talk “Rethinking Blockchain Contract Development”...

My talk “Rethinking Blockchain Contract Development” from Lambda Days 2019, where I outline why blockchains and functional programming are a good fit and how we are exploiting that in the development of Plutus Platform — the contract layer on top of the Cardano proof-of-stake blockchain. I am also discussing the research-driven development methodology around Cardano.

July 05, 2019 09:53 AM

July 04, 2019

Stackage Blog

New Stackage snapshot format

I’ve been blogging off and on about Pantry and how it affects both Stack and Stackage. There have been a number of Github issues lying around about changes to how we’ll build Stackage snapshots. Most of those changes have landed, and it’s worth recapping the problems we were trying to solve, how things used to work, how they work now, and what’s left.

Old build process overview

The stackage repo has a build-constraints.yaml file, which specifies which packages should be included in the next nightly, which tests are expected to fail, upper bounds, etc. The file is where most people interact with the Stackage project, where pull requests land, and where the Stackage Curator team applies changes.

The stackage-curator tool is used on Travis CI to put together the constraints file with the current state of Hackage to come up with a snapshot, and then check to make sure all of the bounds on that snapshot are valid. When you get an issue from someone on the Curator team reporting bounds failures, it’s a report from stackage-curator.

We run a build server that rebuilds nightly snapshots on a regular basis. It uses the stackage-curator tool as well to create the snapshot definition, perform the build, run tests, upload Haddocks to Amazon S3, and commit the new snapshot file to the stackage-nightly repo. If there are build failures, the Stackage Curator team reports on them and adds more constraints to work around them.

LTS snapshots are basically the same thing, except for an LTS minor version bump, the constraints come from the previous snapshot (e.g., lts-13.22 is based on the constraints implicit in lts-13.21), and the snapshots are committed to the lts-haskell repo.

Finally, Stackage Server (a.k.a. takes information from stackage-nightly, lts-haskell, and the uploaded docs on S3 to provide everything it does: lists of snapshots, packages, Haddocks, and Hoogle search.


  • stackage-curator implements its own Cabal package building logic by working directly with Cabal-the-library. There’s a lot of feature overlap with Stack, and different behavior in some corner cases.
  • The original Stackage file format used in stackage-nightly and lts-haskell differs from the “custom snapshot” format that Stack supports, for purely historical reasons.
  • The snapshot format is big. It contains a lot of information about module names, constraints, users, etc, that build tools don’t need.
  • While there is cryptographic hash information on cabal files to deal with Hackage Revisions, there is no cryptographic hash information on the package tarball contents themselves.


  • We’re scrapping the previous stackage-curator tool entirely.
  • There’s a brand new curator tool, which uses the same snapshot format as Stack 2, Pantry, and what was previously known as custom snapshots.
  • This curator tool still does bounds checking, and is now included in Travis CI. However, this tool does not build packages. It shells out to Stack to do that instead. That makes this tool much simpler.
  • Important end user consideration: the old tool would automatically add dependencies of packages to a snapshot. After some discussion and lots of requests around this, all packages to be included in a snapshot must appear explicitly in build-constraints.yaml. Some people have already gotten reports about this, which is what initially spurred this blog post.
  • New snapshots are landing in the stackage-snapshots repo.
  • For backwards compatibility, we currently convert snapshots to the original format in the original repos, so stackage-nightly and lts-haskell are still being populated. This will not continue indefinitely, though we do not yet have a timeframe for when we will discontinue it.
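For reference, a snapshot in the new Pantry-style format looks roughly like this (a from-memory sketch; the field names and exact locator syntax may not match the real files, and the hash and size are placeholders):

```yaml
name: nightly-2019-07-04
compiler: ghc-8.6.5
packages:
  - hackage: example-package-1.0.0@sha256:<cabal-file-hash>,<size>
```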

Current status

We’ve moved over Nightly builds completely to the new tool. We have not yet moved LTS Haskell over. Due to differences in how constraints are tracked, we’re going to make the switchover when we release LTS 14. We’ll put out a separate post about LTS 14 fairly soon.


The two most important takeaways for most people are:

  1. If you depend on a package that’s not present in Stackage, you’ll need to add that package to Stackage to get your package included.
  2. If you’re writing tooling that uses the snapshot files, you should check out stackage-snapshots and the pantry library.

July 04, 2019 08:24 AM

July 02, 2019

Joey Hess

custom type checker errors for propellor

Since propellor is configured by writing Haskell, type errors are an important part of its interface. As more type level machinery has been added to propellor, it's become more common for type errors to refer to hard to understand constraints. And sometimes simple mistakes in a propellor config result in the type checker getting confused and spewing an error that is thousands of lines of gobbledygook.

Yesterday's release of the new type-errors library got me excited to improve propellor's type errors.

Most of the early wins came from using ghc's TypeError class, not the new library. I wanted custom type errors that were able to talk about problems with Property targets, like these:

    • ensureProperty inner Property is missing support for: 

    • This use of tightenTargets would widen, not narrow, adding: 
        ArchLinux + FreeBSD

    • Cannot combine properties:
        Property FreeBSD
        Property HasInfo + Debian + Buntish + ArchLinux

So I wrote a type-level pretty-printer for propellor's MetaType lists. One interesting thing about it is that it rewrites types such as Targeting OSDebian back to the Debian type alias that the user expects to see.

To generate the first error message above, I used the pretty-printer like this:

    ('Text "ensureProperty inner Property is missing support for: "
        ':$$: PrettyPrintMetaTypes (Difference (Targets outer) (Targets inner)))
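For readers who haven't used custom type errors before, ghc's TypeError mechanism from GHC.TypeLits can be demonstrated in isolation. This is a minimal illustration of the general technique, not propellor's code:

```haskell
{-# LANGUAGE ConstraintKinds, DataKinds, TypeFamilies,
             TypeOperators, UndecidableInstances #-}
import Data.Kind (Constraint)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (ErrorMessage (..), Symbol, TypeError)

-- A closed type family that reduces to the trivial constraint for a
-- supported case and to a custom compile-time error otherwise.
type family Supported (os :: Symbol) :: Constraint where
  Supported "Debian" = ()
  Supported os       =
    TypeError ('Text "Property is missing support for: " ':<>: 'Text os)

deploy :: Supported os => Proxy os -> IO ()
deploy _ = putStrLn "deploying"

-- deploy (Proxy :: Proxy "Debian") compiles and runs, while
-- deploy (Proxy :: Proxy "FreeBSD") is rejected at compile time with
-- the custom message above.
```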

Often a property constructor in propellor gets a new argument added to it. A propellor config that has not been updated to include the new argument used to result in this kind of enormous and useless error message:

    • Couldn't match type ‘Propellor.Types.MetaTypes.CheckCombinable
                                (Propellor.Types.MetaTypes.NonTargets y0)
                                      ('Propellor.Types.MetaTypes.Targeting 'OSDebian)
                                      (Propellor.Types.MetaTypes.Targets y0))
                                   ('Propellor.Types.MetaTypes.Targeting 'OSDebian
                                      : Data.Type.Bool.If
                                             ('Propellor.Types.MetaTypes.Targeting 'OSBuntish)
    -- many, many lines elided
    • In the first argument of ‘(&)’, namely
        ‘props & osDebian Unstable’

The type-errors library was a big help. It's able to detect when the type checker gets "stuck" reducing a type function, and is going to dump it all out to the user. And you can replace that with a custom type error, like this one:

    • Cannot combine properties:
        Property <unknown>
        Property HasInfo + Debian + Buntish + ArchLinux + FreeBSD
        (Property <unknown> is often caused by applying a Property constructor to the wrong number of arguments.)
    • In the first argument of ‘(&)’, namely
        ‘props & osDebian Unstable’
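Messages like these bottom out in GHC's TypeError and the ErrorMessage combinators ':$$: (vertical composition) and 'Text. A minimal, self-contained sketch — not propellor's code; the CheckCombinable family here is a hypothetical stand-in that some other type function would reduce to a Bool:

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.Kind (Constraint)
import GHC.TypeLits (ErrorMessage (..), TypeError)

-- The 'False branch renders a multi-line custom error with ':$$:.
type family CheckCombinable (ok :: Bool) :: Constraint where
  CheckCombinable 'True  = ()
  CheckCombinable 'False =
    TypeError ('Text "Cannot combine properties:"
               ':$$: 'Text "  Property <unknown>"
               ':$$: 'Text "  Property HasInfo + Debian + Buntish + ArchLinux + FreeBSD")

-- Compiles fine: the constraint reduces to the empty constraint.
fine :: CheckCombinable 'True => String
fine = "ok"

-- A use at CheckCombinable 'False would instead report the
-- custom "Cannot combine properties" error at compile time.

main :: IO ()
main = putStrLn fine
```

The trick in the post is arranging for the 'False branch (or a stuck reduction, via the type-errors library) to be reached exactly when the user has made a mistake.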

Detecting when the type checker is "stuck" also let me add some custom type errors to handle cases where type inference has failed:

    • ensureProperty outer Property type is not able to be inferred here.
      Consider adding a type annotation.
    • When checking the inferred type
        writeConfig :: forall (outer :: [Propellor.Types.MetaTypes.MetaType]) t.

    • Unable to infer desired Property type in this use of tightenTargets.
      Consider adding a type annotation.

Unfortunately, the use of TypeError caused one problem. When too many arguments are passed to a property constructor that's being combined with other properties, ghc used to give its usual error message about too many arguments, but now it gives the custom "Cannot combine properties" type error, which is not as useful.

Seems likely that's a ghc bug but I need a better test case to make progress on that front. Anyway, I decided I can live with this problem for now, to get all the other nice custom type errors.

The only other known problem with propellor's type errors is that, when there is a long list of properties being combined together, a single problem can result in a cascade of many errors. Sometimes that also causes ghc to use a lot of memory. While custom error messages don't help with this, at least the error cascade is nicer and individual messages are not as long.

Propellor 5.9.0 has all the custom type error messages discussed here. If you see a hard to understand error message when using it, get in touch and let's see if we can make it better.

This was sponsored by Jake Vosloo and Trenton Cronholm on Patreon.

July 02, 2019 08:56 PM

July 01, 2019

Neil Mitchell

Thoughts for a Haskell IDE

Summary: We have been working on pieces for a Haskell IDE at Digital Asset.

At Digital Asset, we wrote the DAML programming language. The compiler builds on GHC, and one of the important tools for using DAML is an IDE. You can try the DAML IDE online or download it. Since we wrote the DAML IDE in Haskell, and DAML uses GHC under the hood, it's possible to take the work we did for the DAML IDE and turn it into pieces for a Haskell IDE. In the rest of this post I'll outline what we wrote, and how I think it can make a full Haskell IDE.

What has Digital Asset written?

We have written a Haskell library hie-core, which serves as the "core" of an IDE. It maintains state about which files are open. It generates diagnostics. It runs the parser and type checker. It doesn't figure out how to load your package, and it doesn't have integrations with things like HLint etc. In my view, it should never gain such features - it's deliberately a small core of an IDE, which can be extended with additional rules and handlers after-the-fact.

On the technical side, at the heart of the IDE is a key-value store, where keys are pairs of file names and stages (e.g. TypeCheck) and values are dependent on the stage. We use the Shake build system in memory-only mode to record dependencies between phases. As an example of a rule:

define $ \TypeCheck file -> do
    pm <- use_ GetParsedModule file
    deps <- use_ GetDependencies file
    tms <- uses_ TypeCheck (transitiveModuleDeps deps)
    packageState <- use_ GhcSession ""
    opt <- getIdeOptions
    liftIO $ Compile.typecheckModule opt packageState tms pm

To type check a file, we get the parse tree, the transitive dependencies, a GHC session, and then call a typecheckModule helper function. If any of these dependencies change (e.g. the source file changes) the relevant pieces will be rerun.
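The (file, stage) key-value idea can be sketched without Shake. This is a toy model in plain Haskell: the stage names echo the rule above, but the store and the compute function are invented for illustration, and the real hie-core adds dependency tracking, invalidation, and parallelism on top:

```haskell
-- Toy model of a (file, stage) key-value store; base + containers only.
import qualified Data.Map.Strict as Map

data Stage = GetParsedModule | GetDependencies | TypeCheck
  deriving (Eq, Ord, Show)

type Key = (FilePath, Stage)

-- Values are stage-dependent; a String stands in for real results here.
compute :: Map.Map Key String -> Key -> String
compute store (file, TypeCheck) =
  "typechecked(" ++ Map.findWithDefault "?" (file, GetParsedModule) store ++ ")"
compute _ (file, GetParsedModule) = "parsed(" ++ file ++ ")"
compute _ (file, GetDependencies) = "deps(" ++ file ++ ")"

main :: IO ()
main = do
  let parsed = compute Map.empty ("A.hs", GetParsedModule)
      store  = Map.fromList [(("A.hs", GetParsedModule), parsed)]
  putStrLn (compute store ("A.hs", TypeCheck))
-- prints "typechecked(parsed(A.hs))"
```

What Shake adds over this sketch is exactly the hard part: recording which keys each computation used, so that when a source file changes only the stale keys are rerun.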

Building on top of Shake wasn't our first choice - we initially explored two painful dead ends. While Shake isn't perfect for what we want, it's about 90% of the way there, and having robust parallelism and many years of solid engineering is worth some minor compromises in a few places. Having all the features of Shake available has also been exceptionally helpful, allowing us to try out new things quickly.

What else is required for an IDE?

My hope is that hie-core can become the core of a future IDE - but what else is required?

  • Something to load up a GHC session with the right packages and dependencies in scope. For DAML, we have a custom controlled environment so it's very easy, but real Haskell needs a better solution. My hope is that hie-bios becomes the solution, since I think it has a great underlying design.
  • Some plugins to add features, such as the as-yet-unwritten hie-hlint and hie-ormolu. Since we add lots of features on top of hie-core to make the DAML IDE, we have a good story for extensions in hie-core. Importantly, because shake is designed to be extensible, these extensions can integrate with the full dependency graph.
  • Something to talk Language Server Protocol (LSP) to communicate with editors, for which we use the existing haskell-lsp.
  • An extension for your editor. We provide a VS Code extension in hie-core, but it's a fairly boilerplate LSP implementation, and people have got it working for Emacs already.
  • Something to put it all together into a coherent project, generate it, distribute it etc. A project such as haskell-ide-engine might be the perfect place to bring everything together.

Can I try it now?

Yes - instructions here. I've been using hie-core as my primary Haskell development environment since ZuriHac two weeks ago, and I like it a lot. However, beware:

  • The IDE doesn't load all the relevant files, only the ones you have open.
  • Integration with things like stack doesn't work very well - I've been using hie-bios in "Direct" mode - giving it the flags to start ghci myself. See my integrations for shake and hlint.
  • Features like hs-boot files and Template Haskell need more work to be fully supported, although a bit of Template Haskell has been observed to work.

These issues are being discussed on the hie-bios issue tracker.

Hypothetical FAQ

Q: Is something like FRP better than Shake for describing dependencies? A: I think it's clear that an IDE should use some dependency/incremental computation/parallel rebuilding approach. Shake offers one of those, and is well tested, exception safe, performant etc. The mapping from Shake to what we really want is confined to a single module, so feel free to experiment with alternatives.

Q: Who has contributed? A: Many, many people have contributed pieces, including the whole team at Digital Asset, in particular Tim Williams, David Millar-Durant, Neil Mitchell and Moritz Kiefer.

Q: What is the relationship to haskell-ide-engine? A: My hope is this piece can slot into the other great things that have been done to make IDE tooling better, specifically haskell-ide-engine. This post is intended to start that discussion.

by Neil Mitchell at July 01, 2019 02:24 PM

FP Complete

When children processes exit - a debugging story

This is a story about how some bad API design on my part caused some ugly race conditions that were very tricky to break down. I’m writing this story as a word of warning to others! The code itself was written in Haskell, but the lessons apply to anyone working with Unix-style processes.

by Michael Snoyman at July 01, 2019 09:10 AM

Chris Smith 2

Functional Reactive Programming with Reflex and CodeWorld

TL;DR: I’m releasing a Reflex-based FRP interface to CodeWorld. It’s more complex, but also far more compositional than CodeWorld’s existing API.


(You can skip this section if you already know a bit about CodeWorld.)

Since its inception, my CodeWorld project has always subscribed to the gloss programming model. That is, in essence, that a program consists of:

  • An initial state
  • A plain pure function which, given a state and something that happens in that state, produces a new state.
  • A plain pure function which, given a state, produces a picture to be displayed on the screen.
A Gloss-model CodeWorld program

This is, I think, the simplest general-purpose programming model that is possible. (CodeWorld has a few simpler models, but they require programs to be stateless, so do not really qualify as “general-purpose”.) It has been reinvented many times, and I’ve heard it described by various communities with names like “functional MVC”, or “the universe model”. It is also closely related to dynamical systems from mathematical modeling (which do not involve pictures or UI events) or the Elm Architecture (which extends it with a system called “ports” for outgoing I/O).
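For readers who haven't seen it, the three pieces look roughly like this in CodeWorld's traditional API. A sketch assuming the current codeworld-api activityOf entry point; the click-counter state is invented for illustration:

```haskell
import CodeWorld

-- 1. An initial state.
initial :: Int
initial = 0

-- 2. A pure function from an event and a state to a new state.
change :: Event -> Int -> Int
change (PointerPress _) n = n + 1
change _                n = n

-- 3. A pure function from a state to a picture.
picture :: Int -> Picture
picture n = solidCircle (1 + fromIntegral n / 10)

-- Wire the three pieces together: a circle that grows on each click.
main :: IO ()
main = activityOf initial change picture
```

Everything the program ever does is expressed through those two pure functions over one state type — which is both the model's charm and, as discussed next, its scaling problem.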

This model is radically simple, but it’s not very compositional. It’s difficult to abstract over parts of the application, or break apart a complex program into smaller pieces. I have written several medium-sized applications in this model, and Joachim Breitner even programmed his slides for a talk within CodeWorld in this model (and wrote a glorious hack to be able to build them in CodeWorld itself). I think we’d all agree we’ve stretched the upper scaling limits of this simple model.

There is also an alternative model with a long history now in functional programming: Functional Reactive Programming (FRP).

Intro to Reflex and FRP

(You can skip this section if you already know a bit about Reflex.)

Functional reactive programming has been around for a long time, now, with a dizzying list of implementations. But over the last several years, the Reflex library has proven to be a stable, reliable, and production-ready implementation of the idea, and it appears to be rising to dominance. In Reflex, FRP is implemented in terms of three basic abstractions:

  • An Event is a thing that can happen. This might be something at a low level of abstraction, like a mouse being clicked. Or it might be something at a much higher level, such as moving a chess piece. But it happens at specific times, yielding specific values that describe what occurred. (When I first encountered FRP several years ago, the word “Event” was very confusing to me. An FRP Event is not a single event. It’s more like a stream of events that may continue occurring, yielding different values. By now, though, the word has unfortunately caught on.)
  • A Behavior is a value that may change over time. This might be the current position of the mouse pointer, or the current score in a game.
  • A Dynamic is both a Behavior and an Event. It has a value over time, but the changes in that value are observable as an event that fires when updates are made.

It’s often said that in FRP, an event is like a list of timestamped values (so similar to [(Time, a)]), and a behavior is like a function from time to a value (so Time -> a). This is true in the sense of general mathematical values — but it’s not true if you think of these as Haskell types. Sure, a behavior has a value for each point of time, but that function isn’t computable, because it depends on things the user has yet to do. It only makes sense to interact with that value in a limited sense: you can ask for the current value, but you can’t sensibly expect to be able to ask for the value 100 years from now, or 300 years ago!
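That intuition can be written down as literal Haskell. A toy semantic model only — reflex's actual Event and Behavior are abstract, and the names here are invented for illustration:

```haskell
-- A toy semantic model of FRP events and behaviors; base only.
type Time = Double

newtype Event a    = Event [(Time, a)]     -- occurrences, with timestamps
newtype Behavior a = Behavior (Time -> a)  -- a value at every moment

-- Mapping over an event transforms the values, keeping the timestamps.
instance Functor Event where
  fmap f (Event occs) = Event [ (t, f a) | (t, a) <- occs ]

-- Sample a behavior at one moment; reflex only lets you do this "now".
sampleAt :: Behavior a -> Time -> a
sampleAt (Behavior f) = f

main :: IO ()
main = do
  let presses = Event [(1.0, "Up"), (2.5, "Down")]
      Event occs = fmap length presses
  print occs
-- prints [(1.0,2),(2.5,4)]
```

The model makes the limitation in the paragraph above concrete: the full [(Time, a)] list and the Time -> a function are never available to a running program, which is exactly why the real types are kept abstract.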

So to ensure that you only interact with these values in well-defined coherent ways, they are abstract data types, with their own APIs. One of the nice things for Haskell programmers is that these APIs are not all new in FRP: Event is a Functor, and Behavior and Dynamic are Applicative and Monad instances, as well. That gives these types a familiar API. All of these abstractions are also Monoids and Semigroups. But on top of this, there are more specialized combinators for combining them.

CodeWorld’s Reflex Integration

You may have heard that getting started with Reflex is a pain. This is not true! What is true is that getting started with cross-platform development using reflex-dom and GHCJS can be painful. (There are projects like reflex-platform and Obelisk designed to mitigate that pain… your mileage may vary.) But Reflex itself is just a library, and can be installed the same way you’d install any other Haskell library.

Now it’s even easier, though: All you need is a web browser!

To use Reflex with CodeWorld, just hop over to the CodeWorld web site, and type something like this:

import CodeWorld.Reflex
import Reflex
main = reflexOf $ \input -> return $
    constDyn codeWorldLogo

Click the Run button, and you’ll see a beautiful CodeWorld logo drawn via Reflex!

Hmm… not impressed? Okay, I admit that functional reactive programming is more interesting if it can actually, well, react to things. To do this, you’ll want to look at the definition of reflexOf. Hit the “Guide” button at the bottom of your screen, and choose the Reflex section on the topic bar to the left.

reflexOf
    :: (forall t m. (Reflex t, MonadHold t m, MonadFix m)
        => ReactiveInput t -> m (Dynamic t Picture))
    -> IO ()

Let’s take this in parts.

  • The argument to the reflexOf function is a function of its own. That makes sense: you passed a lambda for that argument in the starter program. That function is a rank 2 type with a forall and context, but ignore that context line, and focus on its base type:
    ReactiveInput t -> m (Dynamic t Picture).
  • The output type is a Dynamic t Picture: a picture that can change over time. That should make sense! (If you’re curious why it’s a Dynamic instead of a Behavior, that’s because it would be a waste to keep redrawing a screen that’s not changing. Dynamic contains enough information to avoid redrawing when the screen doesn’t change.)
  • The input type is a ReactiveInput t. This is just a bundle of information you may want to use in your program. The Guide page tells you what you can get out of a ReactiveInput t.
keyPress        :: ReactiveInput t -> Event   t Text
keyRelease      :: ReactiveInput t -> Event   t Text
textEntry       :: ReactiveInput t -> Event   t Text
pointerPress    :: ReactiveInput t -> Event   t Point
pointerRelease  :: ReactiveInput t -> Event   t Point
pointerPosition :: ReactiveInput t -> Dynamic t Point
pointerDown     :: ReactiveInput t -> Dynamic t Bool
currentTime     :: ReactiveInput t -> Dynamic t Double
timePassing     :: ReactiveInput t -> Event   t Double

If you’re familiar with the basic CodeWorld API, some of these should look familiar. With Reflex, though, CodeWorld doesn’t use its own event type at all. You get your events straight from the FRP abstractions in this input bundle, which are sometimes much nicer to use. For example, you no longer need to stash away the most recent mouse pointer position in a field somewhere, because it’s always available from the pointerPosition dynamic value.

Let’s do something with these values. Here’s a compass that always points n̶o̶r̶t̶h̶ toward the mouse pointer.

import CodeWorld.Reflex
import Reflex

main :: IO ()
main = reflexOf $ \input -> return $ do
    let angle = vectorDirection <$> pointerPosition input
    rotated <$> angle <*> pure needle
  where needle = solidRectangle 6 0.3

This is using only Functor and Applicative operations. First, we decide which direction to rotate the needle, by calculating the direction toward the current mouse pointer position. The Functor class lets us apply this calculation to a dynamic value, and get a dynamic result called angle. Next, we apply the rotated function using Applicative syntax to build a dynamic picture. (Applicative wasn’t actually required here, but it’s available and looks cleaner than writing ($ needle) . rotated <$> angle to avoid it.)

Something to pay attention to is how we’re still doing what’s known as “wholemeal programming”. This is the opposite of piecemeal. Instead of focusing on what’s happening at some particular moment, we just say, in essence, that the picture is a needle, rotated in the direction of the current mouse position. It would be a mistake to think of this as an action that’s run at some specific time: it’s instead a specification for how the picture is related to the mouse position in this program, and that’s enough!


This is all well and good, but it doesn’t address the question of state. The compass relied on the position of the mouse pointer, so at least it’s more stateful than pure CodeWorld animations. But in a more complex program, you’ll want your own state. Everything we’ve seen so far requires that your program be a pure function of the dynamic values from the input bundle, and that’s very limiting!

To manipulate state, you’ll use the monad that’s called m in the type signature for reflexOf. It’s also sometimes referred to as the “builder monad”. Instead of writing reflexOf $ \input -> return $ ... (which ignores the builder monad), we can write code inside of it. Let’s make the length of our compass needle adjustable.

{-# LANGUAGE OverloadedStrings #-}
import CodeWorld.Reflex
import Control.Monad.Fix
import Reflex

main :: IO ()
main = reflexOf $ \input -> do
    len <- needleLen input
    let angle = vectorDirection <$> pointerPosition input
    return $ rotated <$> angle <*> (needle <$> len)

needleLen :: (Reflex t, MonadHold t m, MonadFix m)
          => ReactiveInput t -> m (Dynamic t Double)
needleLen input = do
    let lenChange = fmapMaybe change (keyPress input)
    foldDyn (+) 6 lenChange
  where
    change "Up"   = Just 1
    change "Down" = Just (-1)
    change _      = Nothing

needle len = solidRectangle len 0.3

Because the needle length is new state, it must be defined in the builder monad. To do that, we first use fmapMaybe on the built-in keyPress event, to extract just presses from the relevant keys and what change they should add to the cumulative length. We then use foldDyn to build a stateful Dynamic that remembers the state over time.

Notice that we’re now postponing the return in the builder monad until the last line of main. That lets us use the builder monad to construct this new dynamic value that holds onto state.

It’s worth comparing the way stateful programs typically break down in traditional CodeWorld (and Elm, and Racket universe, and similar models) versus the Reflex case.

  • In traditional CodeWorld programs, composition was possible for each event type. Within an event handler, one could delegate, decompose, and abstract changes. But different event handlers communicated via a shared global state! In fact, it wasn’t uncommon to see a student solve a problem by storing a value like 0.001 into a variable, planning for some completely different event handler to react in some way if the value is greater than 0. These kinds of spooky action-at-a-distance dependencies are exactly what functional programming tries to avoid by eschewing mutable state!
  • Here, though, one module of the program can safely create and manipulate state that is private to that module, and depends in well-defined ways on the remaining program state. That module can define how this state responds to various inputs, at various levels of abstraction. There is no action-at-a-distance, though, because the inputs to every piece of code are explicitly passed in. There’s no shared global state.

The cost, though, is the need to use all these new abstractions. Composition sometimes comes at a cost in complexity.

Traditional CodeWorld or Reflex?

I’m thrilled to offer this new choice for doing quick and powerful graphics programming in CodeWorld. However, I don’t expect it to replace uses of the simpler CodeWorld API.

I don’t ever intend to introduce Reflex or any kind of FRP in the CodeWorld education dialect. It’s just not suited for teaching new programmers. Even for new Haskell programmers, such as in university classes, I’d strongly recommend avoiding Reflex for a first exposure to the language. That said, though, I’m really excited about Reflex providing even more of a smooth path from early learning experiences into more abstract functional programming. Abstraction may be the enemy of the beginner, but it is not the enemy of the intermediate programmer! Many of us love Haskell for providing powerful tools to solve complex problems. Being able to transition from the traditional CodeWorld model to Reflex without relearning the output format or UI concepts opens up some really cool possibilities.

Another direction I’m very much pursuing is to use this new interface to clean up the implementation of CodeWorld itself! CodeWorld started out with a hand-coded tail-recursive event loop, and as it’s accumulated new features, time has not been kind to the implementation. I’m already making plans to reimplement many of CodeWorld’s debugging features on top of Reflex, so that eventually this new implementation will be the implementation of CodeWorld, and the simpler variants will be implemented on top of this base. This change shouldn’t be visible to users, but anyone who has recently browsed the implementation may appreciate the change!

I’m also interested in what the community wants to see happen. I am excited about the possibility of using CodeWorld to demo and share Haskell with others. CodeWorld can already be used for:

  • Console-mode programs, replacing some uses of GHCi with a platform that gives you easily shareable code snippets to send to others who don’t have Haskell installed.
  • QuickCheck tests, to show off property-based testing to others.
  • The traditional CodeWorld graphics API, for very concise and purely functional graphics demos, animations, and games.
  • And now Reflex-based FRP for bringing in more powerful abstractions and modularity.

I’d also love to incorporate a traditional gloss implementation for compatibility with other people’s gloss demos, as well as a diagrams backend. If there are other libraries or techniques that you’d like to be able to show off easily using a web-based platform, please open a bug or ask.

Thanks for listening!

by Chris Smith at July 01, 2019 05:12 AM

April 13, 2017

Shayne Fletcher

Dealing with source code locations (in lexical and syntax analysis)


Writing compilers and interpreters requires rigorous management of source code locations harvested during syntax analysis and associated error handling mechanisms that involve reporting those locations along with details of errors they associate to.

This article does a "deep dive" into the Location module of the OCaml compiler. The original source can be found in the ocaml/parsing directory of an OCaml distribution (copyright Xavier Leroy).

Location is a masterclass in using the standard library Format module. If you have had difficulties understanding Format and what it provides the OCaml programmer, then this is for you. Furthermore, Location contains invaluable idioms for error reporting & exception handling. Learn them here to be able to apply them in your own programs.

Describing locations

A location corresponds to a range of characters in a source file. Location defines this type and a suite of functions for the production of location values.

type t = {
  loc_start : Lexing.position;
  loc_end : Lexing.position;
  loc_ghost : bool;
}

val none : t
val in_file : string → t
val init : Lexing.lexbuf → string → unit
val curr : Lexing.lexbuf → t

val symbol_rloc : unit → t
val symbol_gloc : unit → t
val rhs_loc : int → t

type 'a loc = {
  txt : 'a;
  loc : t;
}

val mkloc : 'a → t → 'a loc
val mknoloc : 'a → 'a loc

A value of the (standard library) type Lexing.position describes a point in a source file.

type position = {
  pos_fname : string;
  pos_lnum : int;
  pos_bol : int;
  pos_cnum : int;
}
The fields of this record have the following meanings:
  • pos_fname is the file name
  • pos_lnum is the line number
  • pos_bol is the offset of the beginning of the line (the number of characters between the beginning of the lexbuf and the beginning of the line)
  • pos_cnum is the offset of the position (number of characters between the beginning of the lexbuf (details below) and the position)
The difference between pos_cnum and pos_bol is the character offset within the line (i.e. the column number, assuming each character is one column wide).

A location in a source file is defined by two positions : where the location starts and where the location ends.

type t = {
  loc_start : position;
  loc_end : position;
  loc_ghost : bool;
}
The third field loc_ghost is used to disambiguate locations that do not appear explicitly in the source file. A location will not be marked as ghost if it contains a piece of code that is syntactically valid and corresponds to an AST node and will be marked as a ghost location otherwise.

There is a specific value denoting a null position. It is called none and it is defined by the function in_file.

let in_file (name : string) : t =
  let loc : position = {
    pos_fname = name; (*The name of the file*)
    pos_lnum = 1;     (*The line number of the position*)
    pos_bol = 0;      (*Offset from the beginning of the lexbuf of the line*)
    pos_cnum = -1;    (*Offset of the position from the beginning of the lexbuf*)
  } in
  { loc_start = loc; loc_end = loc; loc_ghost = true }

let none : t = in_file "_none_"

Lexing.lexbuf is the (standard library) type of lexer buffers. A lexer buffer is the argument passed to the scanning functions defined by generated scanners (lexical analyzers). The lexer buffer holds the current state of the scanner plus a function to refill the buffer from the input.

type lexbuf = {
  refill_buff : lexbuf → unit;
  mutable lex_buffer : bytes;
  mutable lex_buffer_len : int;
  mutable lex_abs_pos : int;
  mutable lex_start_pos : int;
  mutable lex_curr_pos : int;
  mutable lex_last_pos : int;
  mutable lex_last_action : int;
  mutable lex_eof_reached : bool;
  mutable lex_mem : int array;
  mutable lex_start_p : position;
  mutable lex_curr_p : position;
}
At each token, the lexing engine will copy lex_curr_p to lex_start_p, then change the pos_cnum field of lex_curr_p by updating it with the number of characters read since the start of the lexbuf. The other fields are left unchanged by the lexing engine. In order to keep them accurate, they must be initialized before the first use of the lexbuf and updated by the relevant lexer actions.

(*Set the file name and line number of the [lexbuf] to be the start
  of the named file*)
let init (lexbuf : Lexing.lexbuf) (fname : string) : unit =
  let open Lexing in
  lexbuf.lex_curr_p <- {
    pos_fname = fname;
    pos_lnum = 1;
    pos_bol = 0;
    pos_cnum = 0;
  }
The location of the current token in a lexbuf is computed by the function curr.

(*Get the location of the current token from the [lexbuf]*)
let curr (lexbuf : Lexing.lexbuf) : t =
  let open Lexing in {
    loc_start = lexbuf.lex_start_p;
    loc_end = lexbuf.lex_curr_p;
    loc_ghost = false;
  }

Parsing is the run-time library for parsers generated by ocamlyacc. The functions symbol_start and symbol_end are to be called in the action part of a grammar rule (only). They return the offset of the string that matches the left-hand-side of the rule : symbol_start returns the offset of the first character; symbol_end returns the offset after the last character. The first character in a file is at offset 0.

symbol_start_pos and symbol_end_pos are like symbol_start and symbol_end but return Lexing.position values instead of offsets.

(*Compute the span of the left-hand-side of the matched rule in the
program source*)
let symbol_rloc () : t = {
  loc_start = Parsing.symbol_start_pos ();
  loc_end = Parsing.symbol_end_pos ();
  loc_ghost = false;
}

(*Same as [symbol_rloc] but designates the span as a ghost range*)
let symbol_gloc () : t = {
  loc_start = Parsing.symbol_start_pos ();
  loc_end = Parsing.symbol_end_pos ();
  loc_ghost = true;
}

The Parsing functions rhs_start and rhs_end are the same as symbol_start and symbol_end but return the offset of the string matching the n-th item on the right-hand-side of the rule where n is the integer parameter to rhs_start and rhs_end. n is 1 for the leftmost item. rhs_start_pos and rhs_end_pos return a position instead of an offset.

(*Compute the span of the [n]th item of the right-hand-side of the
matched rule*)
let rhs_loc n = {
  loc_start = Parsing.rhs_start_pos n;
  loc_end = Parsing.rhs_end_pos n;
  loc_ghost = false;
}

The type 'a loc associates a value with a location.

(*A type for the association of a value with a location*)
type 'a loc = {
  txt : 'a;
  loc : t;
}

(*Create an ['a loc] value from an ['a] value and location*)
let mkloc (txt : 'a) (loc : t) : 'a loc = { txt ; loc }

(*Create an ['a loc] value bound to the distinguished location called
  [none]*)
let mknoloc (txt : 'a) : 'a loc = mkloc txt none

Error reporting with locations

Location has a framework for error reporting across modules concerned with locations (think lexer, parser, type-checker, etc).

open Format

type error = {
  loc : t;
  msg : string;
  sub : error list;
}

val error_of_printer : t → (formatter → 'a → unit) → 'a → error
val errorf_prefixed : ?loc : t → ?sub : error list → ('a, Format.formatter, unit, error) format4 → 'a
So, in the definition of the error record, loc is a location in the source code, msg an explanation of the error and sub a list of related errors. We deal here with the error formatting functions. The utility function print_error_prefix simply writes an error prefix to a formatter.

let error_prefix = "Error"

let warning_prefix = "Warning"

let print_error_prefix (ppf : formatter) () : unit =
  fprintf ppf "@{<error>%s@}:" error_prefix
The syntax "@{<error>%s@}" associates the embedded text with the named tag "error".

Next another utility, pp_ksprintf.

let pp_ksprintf
    ?(before : (formatter → unit) option)
    (k : string → 'd)
    (fmt : ('a, formatter, unit, 'd) format4) : 'a =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  begin match before with
  | None → ()
  | Some f → f ppf
  end;
  kfprintf
    (fun (_ : formatter) : 'd →
      pp_print_flush ppf ();
      let msg = Buffer.contents buf in
      k msg)
    ppf fmt
It proceeds as follows. A buffer and a formatter over that buffer are created. When presented with all of the arguments of the format operations implied by the fmt argument, if the before argument is non-empty, it is called on the formatter. Finally, kfprintf (from the standard library Format module) performs the format operations on the buffer before handing control to a function that retrieves the contents of the now formatted buffer and passes them to the user-provided continuation k.

With pp_ksprintf at our disposal, one can write the function errorf_prefixed.

let errorf_prefixed
    ?(loc : t = none)
    ?(sub : error list = [])
    (fmt : ('a, Format.formatter, unit, error) format4) : 'a =
  let e : 'a =
    pp_ksprintf
      ~before:(fun ppf → fprintf ppf "%a " print_error_prefix ())
      (fun (msg : string) : error → {loc; msg; sub})
      fmt
  in e
errorf_prefixed computes a function. The function it computes provides the means to produce error values by way of formatting operations to produce the msg field of the error result value. The formatting operations include prefixing the msg field with the error_prefix string. The type of the arguments of the computed function unifies with the type variable 'a. In other words, the type of the computed function is 'a → error. For example, the type of errorf_prefixed "%d %s" is int → string → error.

The groundwork laid with errorf_prefixed above means a polymorphic function error_of_printer can now be produced.

let error_of_printer
    (loc : t)
    (printer : formatter → 'error_t → unit)
    (x : 'error_t) : error =
  let mk_err : 'error_t → error =
    errorf_prefixed ~loc "%a@?" printer in
  mk_err x
The idea is that error_of_printer is provided a function that can format a value of type 'error_t. This function is composed with errorf_prefixed, thereby producing a function of type 'error_t → error. For example, we can illustrate how this works by making an error from a simple integer with code like the following:

# error_of_printer none (fun ppf x → Format.fprintf ppf "Code %d" x) 3;;
- : error =
{loc =
{loc_start =
{pos_fname = "_none_"; pos_lnum = 1; pos_bol = 0; pos_cnum = -1};
loc_end = {pos_fname = "_none_"; pos_lnum = 1; pos_bol = 0; pos_cnum = -1};
loc_ghost = true};
msg = "Error: Code 3"; sub = []}

So, that's error_of_printer. The following utility function is much simpler - it simply writes a given filename to a formatter.

let print_filename (ppf : formatter) (file : string) : unit =
fprintf ppf "%s" file
Next, a set of constants for consistent messages that involve locations and a function to get the file, line and column of a position.

let (msg_file, msg_line, msg_chars, msg_to, msg_colon) =
("File \"", (*'msg file'*)
"\", line ", (*'msg line'*)
", characters ", (*'msg chars'*)
"-", (*'msg to'*)
":") (*'msg colon'*)

let get_pos_info pos = (pos.pos_fname, pos.pos_lnum, pos.pos_cnum - pos.pos_bol)
Making use of the above, we now have print_loc: a function to print a location on a formatter in terms of file, line and character numbers.

let print_loc (ppf : formatter) (loc : t) : unit =
let (file, line, startchar) = get_pos_info loc.loc_start in
let endchar = loc.loc_end.pos_cnum - loc.loc_start.pos_cnum + startchar in
if file = "//toplevel//" then
fprintf ppf "Characters %i-%i"
loc.loc_start.pos_cnum loc.loc_end.pos_cnum
else begin
fprintf ppf "%s@{<loc>%a%s%i" msg_file print_filename file msg_line line;
if startchar >= 0 then
fprintf ppf "%s%i%s%i" msg_chars startchar msg_to endchar;
fprintf ppf "@}"
end
Locations, generally speaking, come out in a format along the lines of: File "<string>", line 1, characters 0-10:

let print (ppf : formatter) (loc : t) : unit =
(* The syntax, [@{<loc>%a@}] associates the embedded text with the
named tag 'loc'*)
fprintf ppf "@{<loc>%a@}%s@." print_loc loc msg_colon
That last function, print is just a small wrapper over print_loc that appends a colon to the location.

Exception handling involving errors with locations

This section is concerned with the following section of the Location's signature.

val register_error_of_exn : (exn → error option) → unit
val error_of_exn : exn → error option
val error_reporter : (formatter → error → unit) ref
val report_error : formatter → error → unit
val report_exception : formatter → exn → unit
Location contains a mutable list of exception handlers where an exception handler is a function of type exn → error option.

let error_of_exn : (exn → error option) list ref = ref []
A function is provided that adds an exception handler to the above list.

let register_error_of_exn f = error_of_exn := f :: !error_of_exn
The next function, error_of_exn (it shadows the error_of_exn list defined previously, and is the only remaining function that manipulates that list directly), walks the list looking for a handler, returning the contents of the result of the first function that doesn't return None.

let error_of_exn exn =
let rec loop = function
| [] → None
| f :: rest →
match f exn with
| Some _ as r → r
| None → loop rest
in
loop !error_of_exn

We now define a "default" error-reporting function. Given a formatter and an error, it writes the error location and an explanation of the error to the formatter, and then does the same for any associated "sub" errors.

let rec default_error_reporter
(ppf : formatter) ({loc; msg; sub} : error) : unit =
print ppf loc;
Format.pp_print_string ppf msg;
List.iter (Format.fprintf ppf "@\n@[<2>%a@]" default_error_reporter) sub
Now, error_reporter itself is a reference cell with default value default_error_reporter.

let error_reporter = ref default_error_reporter
This next function, print_updating_num_loc_lines, looks more complicated than it is, but it does demonstrate a rather advanced usage of Format: it calls pp_get_formatter_out_functions and pp_set_formatter_out_functions to temporarily replace the default function for writing strings. The purpose of the function is to print an error on a formatter while incidentally recording the number of lines required to do so.

(* A mutable line count*)
let num_loc_lines : int ref = ref 0

(*Prints an error on a formatter incidentally recording the number of
lines required to do so*)
let print_updating_num_loc_lines
(ppf : formatter)
(f : formatter → error → unit)
(arg : error) : unit =
(*A record of functions of output primitives*)
let out_functions : formatter_out_functions
= pp_get_formatter_out_functions ppf () in
(*The strategy is to temporarily replace the basic function for
writing a string with this one*)
let out_string (str : string) (start : int) (len : int) : unit =
(*A function for counting line breaks in [str]. [c] is the current
count, [i] is the current char under consideration*)
let rec count (i : int) (c : int) : int =
if i = start + len then c
else if String.get str i = '\n' then count (succ i) (succ c)
else count (succ i) c
in
(*Update the count*)
num_loc_lines := !num_loc_lines + count start 0;
(*Write the string to the formatter*)
out_functions.out_string str start len
in
(*Replace the standard string output primitive with the one just
defined *)
pp_set_formatter_out_functions ppf
{out_functions with out_string} ;
(*Write the error to the formatter*)
f ppf arg ;
pp_print_flush ppf ();
(*Restore the standard primitive*)
pp_set_formatter_out_functions ppf out_functions
The function report_error uses the currently installed error reporter to write an error report for a given error and formatter incidentally updating a count indicating the number of lines written.

let report_error (ppf : formatter) (err : error) : unit =
print_updating_num_loc_lines ppf !error_reporter err

This next function, report_exception_rec, tries a number of times to find a handler for a given exception and, if successful, formats the resulting error. In the worst case a handler is never found and the exception propagates.

let rec report_exception_rec (n : int) (ppf : formatter) (exn : exn) : unit =
(*Try to find a handler for the exception*)
try match error_of_exn exn with
| Some err →
(*Format the resulting error using the current error reporter*)
fprintf ppf "@[%a@]@." report_error err
(*The syntax @[%a@]@ writes function output in a box followed by a
'cut' break hint*)
| None → raise exn (*A handler could not be found*)
with exn when n > 0 →
(*A handler wasn't found. Try again*)
report_exception_rec (n - 1) ppf exn
The last piece is report_exception. It attempts to write an error report for the given exception on the provided formatter. The exception can be re-raised if no handler is found.

let report_exception (ppf : formatter) (exn : exn) : unit =
report_exception_rec 5 ppf exn


In this section we see an example of how the above machinery is used. Consider defining a lexical analyzer. Suppose the scanner is defined by the file lexer.mll (the input file to ocamllex). We can imagine its header containing code like the following.

(*The cases of lexer errors*)
type error =
| Illegal_character of char
| Unterminated_comment of Location.t

(*The lexer exception type*)
exception Error of error * Location.t

(*This function takes a formatter and an instance of type
[error] and writes a message to the formatter explaining the
meaning. This is a "printer"*)
let report_error (ppf : Format.formatter) : error → unit = function
| Illegal_character c →
Format.fprintf ppf "Illegal character (%s)" (Char.escaped c)
| Unterminated_comment _ →
Format.fprintf ppf "Comment not terminated"

(*Note that [report_error] is a function that unifies with the
[formatter → 'error_t → unit] parameter of [Location.error_of_printer]*)

(*Register an exception handler for the lexer exception type*)
let () =
Location.register_error_of_exn
(function
| Error (err, loc) →
Some (Location.error_of_printer loc report_error err)
| _ → None
)

rule token = ...
A function to handle errors with attached locations (in a REPL for example) is expressible as an idiom as simple as something like this.

let handle_interpreter_error ?(finally=(fun () → ())) ex =
finally ();
Location.report_exception (Format.std_formatter) ex

let safe_proc ?finally f =
try f ()
with exn → handle_interpreter_error ?finally exn

by Shayne Fletcher ( at April 13, 2017 12:49 PM

June 28, 2019

Shayne Fletcher

Harvesting annotations from the GHC parser

Harvesting annotations from the GHC parser

My last post on parsing in the presence of dynamic pragmas left us with this outline for calling the GHC parser.

      flags <-
        parsePragmasIntoDynFlags
          (defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
case parse file flags s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m

Now, it's a fact that you'll not find in a GHC parse tree certain things like comments and the location of keywords (e.g. let, in and so on). Certainly, if you're writing refactoring tools (think programs like Neil Mitchell's awesome hlint for example), access to these things is critical!

So, how does one go about getting these program "annotations"? You guessed it... there's an API for that.

If we assume the existence of a function analyzeModule :: DynFlags -> Located (HsModule GhcPs) -> ApiAnns -> IO () then, here's the gist of the code that exercises it:

            POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m (harvestAnns s)
Here harvestAnns is defined as
    harvestAnns pst =
      ( Map.fromListWith (++) $ annotations pst
      , Map.fromList ((noSrcSpan, comment_q pst) : annotations_comments pst)
      )

The type ApiAnns is a pair of maps: the first contains keyword and punctuation locations; the second maps locations of comments to their values.
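Concretely, in the GHC versions this post targets, the synonym looks roughly like this (a sketch from memory of the GHC API, not a verbatim copy of the source):

```haskell
type ApiAnns =
  ( Map.Map ApiAnnKey [SrcSpan]                  -- keyword/punctuation locations
  , Map.Map SrcSpan [Located AnnotationComment]  -- comments keyed by location
  )
```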

You might think that's the end of this story but there's one twist left : the GHC lexer won't harvest comments by default - you have to tell it to do so by means of the Opt_KeepRawTokenStream (general) flag (see the GHC wiki for details)!

Taking the above into account, to parse with comments, the outline now becomes:

      flags <-
        parsePragmasIntoDynFlags
          (defaultDynFlags fakeSettings fakeLlvmConfig) file s
whenJust flags $ \flags ->
case parse file (flags `gopt_set` Opt_KeepRawTokenStream) s of
PFailed s ->
report flags $ snd (getMessages s flags)
POk s m -> do
let (wrns, errs) = getMessages s flags
report flags wrns
report flags errs
when (null errs) $ analyzeModule flags m (harvestAnns s)

For a complete program demonstrating all of this see this example in the ghc-lib repo.

by Shayne Fletcher ( at June 28, 2019 06:11 PM

Serokell Blog

Best Haskell Conferences to Attend

We don’t want you to drown in the information noise, so we have started a new series of blog posts about the most valuable events chosen by our experts. In this article, we will tell you about the most interesting Haskell events.

Big Events

YOW! Lambda Jam

YOW! Lambda Jam
Photo by Manuel Chakravarty

YOW is one of the major organizers of events for software developers. If we needed to pick only one event for functional programmers to participate in, Lambda Jam would be our choice. They have a good program committee and the experience to gather Scala, Elixir, Haskell, Clojure, Elm, F#, and Functional JavaScript developers under one roof every year in a very friendly atmosphere.

Price: n/a.

The Haskell Symposium

The Haskell Symposium
Photo by Patrik Jansson

The Symposium is one of the most significant Haskell events, and it is not focused exclusively on theory or practice: it’s all in one. There you will find discussions about the future of the Haskell programming language and current research, as well as tool reviews and experience reports. This year, the program committee includes our colleagues from Well-Typed, NIA / NASA Formal Methods, Bloomberg, and universities all over the world.

Price: 400 USD; free for student volunteers.

Symposium on Principles of Programming Languages

Symposium on Principles of Programming Languages
Photo by Kalev Alpernas

The annual Symposium on Principles of Programming Languages is a forum where one can discuss all aspects of programming languages and programming systems. This event is not only for Haskellers. Why did we choose it? Because of the speaker list, which stays impressive year after year. At this event, people from big companies and small startups meet and share their outstanding ideas from the same stage.

Price: n/a; free for student volunteers.

Skills Matter’s Haskell eXchange

Skills Matter's Haskell eXchange
Photo by Marcelo Lazaroni

This conference is a traditional event held by Skills Matter, one of the biggest communities for education and skills exchange. During the event, you will have an opportunity to meet the best Haskellers and discuss new technologies in a relatively informal way, so that even beginners will have a chance to communicate with top-level professionals from different countries. By the way, this year the list of speakers is pretty awesome – take a look at their program.

Price: 695 GBP (495 GBP until July 8th); free for volunteers.

Compose:: Conference

Compose:: Conference
Photo by Compose:: Conference

This conference is not only about speakers and the companies they represent. The announced topics are stunning. For example, Donya Quick will give a talk about the process of algorithmic music creation. Sounds good, doesn’t it?

Price: 250–500 USD.


Photo by Manuel Chakravarty

If you plan to visit London next month, there’s also a remarkable event you may attend without waiting for Skills Matter’s conference. There will be independent speakers as well as speakers from Microsoft Research, Mozilla, IOHK, Google, Facebook and other big companies. Hot topics and small talks during coffee breaks – don’t miss a chance to get the most from this summer.

Price: 100–600 EUR.


f(by)

f(by)
Photo by Masha Traskovskaya

When we talk about big conferences, we think about London or other big cities in Western Europe. But the world around us is changing, and it’s time to look at what is happening in the East. f(by) is held in Belarus, and the event can definitely be compared with famous big conferences all over the world. On the IT side, Belarus is famous not only for the f(by) conference: if you’re going to visit Minsk, find time to look at their Tech Park.

Price: n/a.


LambdaWorld

LambdaWorld
Photo by Rahul Goma Phulore

LambdaWorld is a perfect choice for people who want to share their experience with the functional community – Scala, Clojure, Haskell, Kotlin developers will be there to talk about their work and business opportunities. It is held in Seattle and Cadiz, an ancient Spanish port city.

Price: 335–350 USD in Seattle, 75–150 EUR in Cadiz.


Photo by Stephen Pimentel

It is one of the biggest conferences for developers in North America – just take a look at their topics. Everything you could want – from overhyped blockchain to programming language theory. Workshops, an unconference program, gourmet meals (we know how important that can be) and good prices – there’s no reason to ignore this event.

Price: 850–1350 USD.

BOB Konf

BOB Konf
Photo by BOB Konf

BOB is a conference for those who are interested in functional programming languages. It takes place during ICFP – the premier gathering of functional programmers. The program is divided into two tracks: for practitioners and researchers. If you are going to ICFP, you can attend this conference without any fees.

Price: 75–240 EUR.


ZuriHac

ZuriHac
Photo by ZuriHac

ZuriHac is the biggest Haskell hackathon, organised by the Zürich Friends of Haskell association. The main goal of this coding festival is to expand the community and improve Haskell tools, libraries and infrastructure. By the way, ZuriHac is not only about hacking – there are also interesting keynotes, tracks and opportunities for small talk. The main advantages: it is entirely free for all participants, it suits both beginners and professionals (beginners can get mentorship during all three days of ZuriHac), and it takes place not far from Zürich, the financial center of Switzerland.

Price: Free.

Local Meetups

In addition, a lot of different local meetups happen every month, and they’re worth visiting if you’re in the area – for example, the Toronto and Berlin Haskell user meetups. Similar meetups may well be happening near you from time to time.

That’s all for now. Next time we will write about Elixir events. Feel free to share your favourite Elixir events with us on Twitter.

by Denis Oleynikov ( at June 28, 2019 12:00 AM

June 27, 2019

Tweag I/O

CPP considered harmful

Mathieu Boespflug

Edsger Dijkstra took issue with the "unbridled" use of the goto programming construct in 1968. He noted that reasoning about the dynamic behaviour of imperative programs, already a difficult task, becomes nigh impossible in the presence of goto-jumps to arbitrary labels. With goto, control flow is permitted to be entirely disconnected from the structure of the code as written, so it becomes very hard to guess at the value of variables without actually running the program. Just like goto for dynamic behaviour, the unbridled use of the C preprocessor (CPP) to conditionally compile code hampers our ability to analyse and manipulate code. In fact, the commonality in the arguments made it difficult to resist the temptation to reuse the already overused title of Dijkstra’s paper. In this post, I want to rehash the argument that CPP should be dispensed with because it makes bad code too tempting to write, just as Dijkstra did for goto.

The idea that unrestricted conditional compilation should be avoided is old news. While it is extremely common in programming languages of the 70's (like C), it is nonexistent in popular programming languages of the 00's (like Rust or Go). Haskell, a language born in the late 80's, punted on difficult problems like multi-platform support and handling breaking changes in upstream dependencies. Using CPP to deal with these issues was a quick fix, and a historical accident that somehow survived to this day.

Say I'm writing innocent enough code:

module Main where

main = do
  name <- getLine
  putStrLn $ "Hello " <> name

This works fine with the latest GHC. But using GHC circa 2016, this won't compile, because (<>) is not part of the Prelude. At this point I have four choices:

  1. Decide that I don't care about old GHC (or in general any old version of a package dependency) and move on.
  2. Use (++) instead of (<>) (they happen to have the same meaning in this particular case), or otherwise rewrite my code, e.g. by adding import Data.Semigroup ((<>)).
  3. Write two versions of my module: one that works for newer GHC, and one that works for older GHC, and letting the build system decide which one to pick.
  4. Use conditional compilation.

This last solution might look like this:

module Main where

main = do
  name <- getLine
#if MIN_VERSION_base(X, X, X)
  putStrLn $ "Hello " <> name
#else
  putStrLn $ "Hello " ++ name
#endif

Clearly Option 1 or Option 2 would work out better than this already unreadable mess, which might only get worse when other incompatibilities arise, requiring nested conditional compilation. Some might argue that Option 1 (dropping support) isn’t used nearly often enough. I agree, all the more so given the success and broad adoption of Stackage, but that’s a debate for another day. Option 2 is about resisting the temptation to use new functions not previously available. But how do we deal with breaking changes in dependencies? In such cases only Option 3 and Option 4 are available.
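Option 2 can be made concrete with a sketch like the following, which compiles both before and after (<>) entered the Prelude because the operator is imported explicitly (greet is a name introduced here purely for illustration):

```haskell
module Main where

-- Importing (<>) explicitly makes the code compile on GHCs from both
-- before and after its addition to the Prelude: no CPP required.
import Data.Semigroup ((<>))

greet :: String -> String
greet name = "Hello " <> name

main :: IO ()
main = do
  name <- getLine
  putStrLn (greet name)
```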

In version 2.1, the singletons library exposed two datatype definitions:

data Proxy t :: * -> *
data KProxy t :: * -> *

From version 2.2 onwards, only one, more general datatype is exposed:

data Proxy k (t :: k) :: * -> *

In user code, KProxy now needs to be replaced everywhere with Proxy. Unlike in our previous example, Option 2 is not available: there is no way to change the code in such a way that it compiles with both singletons-2.1 and singletons-2.2. Option 3 doesn’t look terribly appealing at first blush because it violates Don’t Repeat Yourself (DRY).

The temptation is high to introduce conditional compilation everywhere. The common way to do so is with CPP at each use site:

#if MIN_VERSION_singletons(2,2,0)
  ... Proxy ...
#else
  ... KProxy ...
#endif

The problem is that with conditional compilation using CPP, we lose a great deal. It is no longer possible to do any syntactic analysis of your modules, say to lint check or to apply code formatters. The source is no longer syntactically valid Haskell: it needs a preprocessor to defang it first. That would be fine, except that if you want to analyze every branch of every #if and #ifdef, you need to run the preprocessor with every combination of every predicate (true/false or defined/undefined for every macro). This leads to an exponential blowup in the number of times you need to run HLint, while other automatic tools, like code formatters, are thrown out the window entirely.

What if we taught these tools CPP syntax, to obviate having to evaluate each branch of the preprocessor? Like unrestricted goto, which allows labels pretty much anywhere, CPP conditionals can appear anywhere at all: in module headers, in import declarations, in the middle of an expression, or indeed in arbitrary different combinations of these in each branch of a conditional. Each branch need not even be syntactically correct, with balanced parentheses and well-scoped names. Parsing becomes a very complicated problem.

The vexing issue is that we seldom need CPP for conditional compilation in the first place, if at all. There is no silver bullet for avoiding CPP. Giving up goto means turning to structured control-flow constructs (e.g. while-loops or try/catch), neither of which completely replaces goto, but which together cover most of its use cases. Likewise, giving up CPP means turning to any of the following strategies to achieve broader compatibility:

  • Push all configuration to the build system: if you're writing a cross-platform network library, put all Win32 code in separate files from the Linux code. Let the build system choose what modules to build depending on the target platform. No CPP required.
  • Designing for extensibility: the network library has a datatype of socket address domains. Since not all platforms support all domains, this forces conditional compilation in socket address primitives. By contrast, the socket library has an open type family of domains. Support for each domain can be kept in a dedicated source file, as above.
  • Abstract away compatibility concerns: if you really need to target multiple versions of a dependency, create a small module that abstracts away the differences and depend on that.
  • Use structured conditional compilation: if you really have to use conditional compilation within a source file, prefer structured conditional compilation. Template Haskell can conditionally define a function. Unlike CPP, using Template Haskell is still syntactically correct Haskell.
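The third strategy can be sketched as follows. Compat is a hypothetical module name: it is the only module allowed to mention the dependency's moving parts, and the build system selects this file or a sibling written against the older API, depending on the resolved dependency version. The rest of the codebase imports Compat and never sees a version check:

```haskell
-- Compat.hs: built only when the new API is available; a sibling file
-- with the same export list covers the old API. Here Data.Proxy stands
-- in for the version-sensitive dependency.
module Compat (mkProxy) where

import Data.Proxy (Proxy (..))

-- Stable alias: callers use mkProxy and never name the underlying
-- constructor that changed between versions.
mkProxy :: Proxy a
mkProxy = Proxy
```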

We should challenge the idea that CPP is unavoidable. After all, multi-platform support and backwards compatibility are universal concerns for all programming languages. Unstructured conditional compilation is highly unusual in many of these, even for multi-platform code. Like some of the pioneers of software did with the goto of old, we overestimate the need for the power of non-structure, while forgetting about the benefits of structure. Foregoing CPP entirely should rid us of our illusions.

June 27, 2019 12:00 AM

June 26, 2019

Functional Jobs

Compiler Engineer at Axoni (Full-time)

Compiler Engineer

Headquartered in New York City, we are currently working intensely on what we expect to become the most widely used programming language for blockchain smart contracts: AxLang. Based on Scala, AxLang enables secure and full featured smart contract development by supporting both functional programming and formal verification. Its design is driven by the rigorous requirements for solutions serving the world's largest financial institutions.

AxLang is part of Axoni's blockchain infrastructure, which underpins the broadest reaching and most ambitious permissioned ledger production projects in the world, including $11 trillion of credit derivatives, the world's leading foreign exchange connectivity network, and various other industry implementations.

By joining our growing compiler team, you will help us build cutting edge compiler technology that targets the Ethereum open-source community. Overall, this is a unique opportunity to contribute to one of the most promising technologies today by helping us revolutionize the way smart contracts are developed in the Ethereum ecosystem.

The ideal candidate should be comfortable with programming language concepts such as type systems and formal methods and compiler design concepts such as abstract syntax tree (AST) and source-to-source transformations.

Relevant Skills and Experience

  • 2+ years of industry experience with a functional programming language such as Scala, Haskell, or OCaml
  • Experience with compiler development a plus
  • Experience with formal verification and/or semantic analysis a plus
  • Research in the field of programming languages, compilers, and formal verification a plus
  • Knowledge of (Ethereum) smart contracts a plus
  • Familiarity with best practices for Agile and Test Driven Development
  • Strong communication skills and a collaborative approach to teamwork

Get information on how to apply for this position.

June 26, 2019 07:55 PM

Sandy Maguire

Polysemy Internals: The Effect-Interpreter Effect

aka “what the hell is that Tactics stuff?”

This is the second post in a series of implementation details in polysemy — a fast, powerful and low-boilerplate effect-system library.

In the last post we discussed the Yo type, which accumulates weaving functions of the form Functor f => f () -> (∀ x. f (m x) -> n (f x)) -> e m a -> e n (f a). As a quick reminder, the f functor corresponds to some piece of (possibly trivial) state, and the ∀ x. f (m x) -> n (f x) is a distribution function, roughly analogous to a function like runStateT.

Where our story left off, we had accumulated all of our desired weaves into Yo, but hadn’t yet used them for anything. The developer experience around Yo is fraught with peril, and even as the guy who implemented it, I’m often stymied about how to get all the types to line up. Such a detail is not the sort of thing you can expose in a library that you expect people to actually use.

At the types Yo usually gets instantiated, it looks something like Yo (State s) (Sem r) Int. Which looks easy enough, until you realize that packed inside of this thing is an existential m (which was originally a Sem r0 for some unknown effect row r0), and an existential functor f which is all of the initial state we’re carrying around from other effects who have already run.

Yo is the free Effect, which means that like all free structures, it provides dependency injection so you can later decide what that Effect means. It’s not a magic bullet — you still need to actually write code somewhere. Somebody needs to actually use that f () and ∀ x. f (m x) -> n (f x) to actually do something!

As a first attempt, let’s see what happens if we just expose them in the interpretation API. We’ll write a quick interpret function which can handle an effect e m x by producing a Sem r (f x). The implementation is given below. Don’t worry too much about its body; but pay attention to just how gruesome the type is.

For example, we can use it to implement an interpretation of the Reader effect:

Because Ask doesn’t have any embedded computations, it doesn’t need to do anything fancy. It can just ice-cream cone to put i inside of the state it was given, and return that. But Local is a more complicated beast! It must ice-cream cone its ma computation into the state, and then distrib that thing into a Sem (Reader i ': r), and then run the Reader effect off of that!

It’s not the end of the world, but it’s a nontrivial amount of boilerplate that needs to be duplicated for every interpreter. Combined with the terrifying types, this feels like a no-go.

Let’s look at an interpretation for the Resource effect (which gives bracket semantics). Resource is more complicated than Reader, and this complexity serves to illustrate some common patterns that come up again and again when writing interpreters.

The bracket function allocates some resource of type a, provides it to the use block for some computation, and ensures that it will be cleaned up via finalize — even if the use block crashed.

There are a few subtleties in the type instantiation here. In the comment marked -- 1, we run distrib on our m a parameter, which transforms it into an Sem (Resource ': r) (f a). Note that we’ve introduced an f here! This in turn unifies our finalize and use types as f a -> m b and f a -> m c, respectively. Because we later need to distribute to turn those ms into Sem (Resource ': r)s, we also introduce fs into b and c.

In essence, we end up with functions alloc :: Sem (Resource ': r) (f a), finalize :: f a -> Sem (Resource ': r) (f b) and use :: f a -> Sem (Resource ': r) (f c). This threading of f, evident in the types, corresponds directly to the fact that we need to keep track of other people’s state. As we’ll see in a future post, it is indicative of a huge problem with the naive semantics we’ve given to Resource here.

Anyway, looking at runReader and runResource, we see two particular patterns emerge in our interpreters:

  1. distrib $ ma <$ state for the case of an m a argument
  2. \fa -> distrib $ fmap mb fa for the case of an a -> m b argument

The insight here is that maybe we can just make these combinators a part of the interpret interface directly, rather than have people write them by hand for each interpreter. It doesn’t help the horrifying types:

But it sure as heck improves the ergonomics:

Much nicer! If only we could do something about those gnarly types, we’d be in business!

The last conceptual step here is to realize that the start :: ∀ y. m y -> Sem (e ': r) (f y) and continue :: ∀ y z. (y -> m z) -> f y -> Sem (e ': r) (f z) parameters are static. That means we could stick them into a reader monad — or perhaps more mind-crushingly, an effect.

And so, we can provide the two following primitive actions in our Tactics effect, and then derive start and continue from them:
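A sketch of those primitives, reconstructed from polysemy’s Polysemy.Internal.Tactics of that era (an approximation, not a verbatim copy; names may have drifted in later releases):

```haskell
data Tactics f n r m a where
  -- Retrieve the initial state (the f () that Yo carries around).
  GetInitialState     :: Tactics f n r m (f ())
  -- Lift a continuation into a distribution function over f.
  HoistInterpretation :: (a -> n b) -> Tactics f n r m (f a -> Sem r (f b))
```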

This thing is a mess of type parameters, but f is exactly what you’d expect. The n corresponds to what m used to be (it’s standard operating procedure in polysemy to use m as the name of the second-last type argument.) And we introduce r which corresponds to the effect row that we’re trying to interpret.

Interpreters for effect actions e m end up running with the ∀ f. Functor f => WithTactics e f m r effect row. This thing gives us access to a Tactics capable of producing Sem (e ': r)s, but doesn’t itself have access to e effects.

Finally, we use a type synonym to hide most of the nasty details.

Given an appropriate runTactics interpreter:

We can finally implement interpret:

We’ve hidden all of the nasty types inside of that Tactical synonym (which admittedly is still gross, but at least it’s not rank 3). And we’ve created an effect-interpreter effect, in which we can put any combinators people will need for writing interpreters.

After renaming start to runT and continue to bindT for branding purposes, runResource ends up in its final form:

I’m unable to properly express the amount of joy I get in using a library to implement core features in itself. The result is one of the most mind-crushingly meta things I’ve ever written, but it elegantly solves a real problem — so why not?

In the next post in this series, we’ll discuss the semantics behind the order in which you interpret effects, and how this can get you into trouble with things like runResource. Stay tuned.

June 26, 2019 05:26 PM

Michael Snoyman

My new home network setup

I didn’t mention it on this blog, and I think not even on Twitter. But back in October of 2018, we started a major renovation project on our house. The project ended up taking seven months, and we just moved back in two weeks ago. Two special thanks are in order:

  • A big thank you to Miriam, who moved us back home while I was off at LambdaConf for a week
  • A big thank you to my parents for letting Miriam and me, plus four children, invade their house for seven months straight

Anyway, most of the changes we were attempting were fairly straightforward, non-technical, and (except to those of us living in the house) uninteresting. Things like replacing leaking pipes, adding insulation to the walls so we don’t freeze in the winter, and so on.

But this work also included a major change to how our home network works. I’m far from an expert on anything hardware related, including wired networking and wireless devices. This process involved a lot of investigation on my part, and trying to plan out choices many months in advance. I consulted with friends, especially coworkers at FP Complete who are more knowledgeable about this stuff than I am. So far, I’m very happy with the end result.

I’m writing up this blog post for people who are curious about how this setup ended up looking, and in case it may help others make some decisions.


Something about my house: it’s tall. We live on the side of a mountain, and so we have 4 split levels. This makes it difficult to get a good WiFi signal across the whole house. Originally, before the renovation, I bought a D-Link COVR 1203 set of devices (1 router, 2 access points). These devices work with a wireless backhaul, meaning they communicate with each other over a wireless rather than a wired connection. This worked, but we always had dead spots in the top level, and much degraded speed. So we wanted something better. (Those devices are now living happily at my parents’ house, and were a lifesaver for surviving there for the past seven months.)

Anyway, here’s a breakdown of what we need:

  • First and foremost, the majority of my work revolves around voice and video calls. I need a 100% reliable connection for that, otherwise I have major disruptions in my ability to do my job. I have an office on the second floor, but also sometimes need to work from our bedroom on the fourth floor when we have guests staying with us. For me, both wired and wireless connections are an option.
  • We have a TV in the living room on the first floor, previously connected to a Chromecast, now connected to a Roku. (I strongly recommend Roku devices, we’ve been really happy with this one.) We stream on this Roku, and it needs a solid signal. We had a lot of frustration with flaky wireless to the Chromecast before.
  • Beyond those key points, we overall want solid wireless throughout the house, which probably isn’t too much of a surprise. It’s not a hard requirement to be able to use WiFi everywhere in the house, but it’s nice to have.
  • Finally, as a nice to have, we’d like to have the option in the future of getting our kids desktops in their bedrooms and setting up wired connections.

That all sounds pretty standard I’m guessing. Some things that aren’t required:

  • We don’t have any special router requirements. All devices are OK with a standard IPv4 NAT setup. No special firewall rules are necessary.

OK, let’s talk about what we did.

Modem and router

We set up both phone line and cable line jacks in the living room right by the TV. This area is our entertainment center, with the TV, Roku, and Nintendo Switch. So adding in a modem was natural. We went with the cable company HOT. I really dislike the company… but what can you do, every company in this space seems to suck.

Living room entertainment center

They gave us what they call a Hot Box (heh), which is a pretty standard modem + router + WiFi access point combo. For those not familiar with the distinction between these:

  • A modem converts an analog signal (like a television cable line) to a network signal. It allows a single device on each end of the line to talk to each other.
  • A router allows multiple devices on a local area network to connect to a wide area network (usually the internet). It handles tasks like DHCP, network address translation (NAT), and firewalling.
  • A wireless access point allows devices to connect to a network over WiFi instead of over a wired connection.

The wireless access point in the Hot Box was terrible: it was slow to connect and used an old wireless standard (more on that later). I disabled that feature. However, since we don’t have advanced router needs, we’re using the router capabilities of the Hot Box.

Cables in the walls

The magic of all of this is, now that we were doing construction, we had a chance to run network cables through all of the walls of the house. Our electrician ran three CAT6 cables through the house: one to my office on the second floor, one to one of the kids’ bedrooms on the third floor, and one to our bedroom on the fourth floor. All of the cables originate in the same cabinet holding the Hot Box.

The Hot Box provides four network jacks. As you’ll see, we have more than four devices in the house that want a wired connection. So the next addition to this show is…

Network switches

A network switch is a device which allows multiple devices to connect, and routes packets between them intelligently. For our purposes, a switch allows us to make one connection to the Hot Box, and then plug multiple devices into its other ports, resulting in all connected devices having an internet connection.

We purchased TP-Link 5-port switches. The reason for this choice was a combination of reading reviews, price, and availability of the product in Israel. I’m happy with them, and have no complaints at all, but have no reason to believe them to be qualitatively better than other devices on the market.

TP-Link Switches

By adding a switch next to the Hot Box itself, we ended up with 4 + 5 = 9 ports. However, connecting the two devices uses a port on each device, so we had seven usable ports. Each of the CAT6 cables to the other three floors got its own port. The Roku gets a port. The Nintendo Switch (naming here is really confusing) will ultimately get a port, once we buy an ethernet adapter for it. And as we’ll see in the next section, the wireless access point gets one too.

In my office, I put in a network switch as well, since I’ll be connecting more than one device on a wired connection in here. Similarly, in the kids’ room, I put in another network switch. From that bedroom, we have a network cable running into a second bedroom, so that both rooms end up with a wired connection without having to run two cables through the walls of the house.

I’m honestly not convinced I made a great decision on this, and perhaps running extra cables and buying a single, larger switch for the living room would have been smarter. But this definitely works.

With this much of the setup, we now have the ability to have a wired, stable, fast connection for my laptop in our bedroom, for a proper workstation in my office, for the kids’ computers in the future, for streaming videos on our Roku, and for playing Nintendo Switch games online. (I still have to have my brother over to test out Fortnite at some point.) But there’s no wireless connection yet, so…

Wireless access points

This is the area I least understood before starting this project. I still won’t claim to be an expert, and I’ll be happy to modify this section if people have corrections (or just send a PR using the link above). But here’s what I’ve learned.

The basics of a wireless access point (WAP) are simple. They provide wireless devices with something to talk to wirelessly. They then route packets from those devices to others. Arguably the simplest kind of WAP will have a wired connection to a router or switch, and provides a bridge between the wired and wireless connections. Other kinds of access points work differently, such as by connecting to some other WAP for the internet signal. (This is what my COVR devices with wireless backhaul did, for instance.)

I did not want to deal with a wireless backhaul, since it has multiple limitations. Firstly, the effective speed is cut in half, since half the wireless bandwidth needs to be used to talk to the source signal. Also, the devices need to be close enough together to talk, which is a problem with our house. And as you’ll see shortly, with the distances in my house, this would drastically degrade the speed.

So instead, the goal with this renovation was to have a WAP with a wired connection on each floor, which would provide complete house coverage and no issues with degraded signal due to distance between the devices. Having determined the kind of device to look for, I started researching the topic.

The next thing I learned—which I’m sure many readers already knew—was about the difference between the 2.4GHz and 5GHz bands. Successive wireless standards, like 802.11n and then 802.11ac, have added the ability to communicate over 5GHz in addition to 2.4GHz. 2.4GHz is advantageous in that it can work over longer distances and more easily penetrate obstacles like walls. By contrast, 5GHz offers faster speeds. Given that distance was hopefully no longer going to be an issue, getting devices with support for 802.11ac was advantageous.

With the same review process and criteria as I used for the network switches, I ended up choosing the TP-Link EAP245. This is a business class solution, and likely overpowered for our needs, but I’d rather get this set up once and not have to think about it again for years. These devices require more tweaking than the network switches, but the interface was pleasant to use and the mobile application intuitive.

TP-Link EAP245

So far, everything’s been great with these devices. I’ve done signal checks throughout the house, and everywhere I’ve checked we have well above 80% signal (usually closer to 95%) on the 5GHz band. A speed test in my bedroom topped out at 195Mbps, over a cable line providing a maximum of 200Mbps. So I think it’s fair to say that the wireless signal is no longer the bottleneck in the house. In fact, I also tested the network speed using a cheap WiFi dongle I bought years back, and it was significantly slower than the wireless connection.

These devices include some features that are either really great or just well-marketed snake oil. “Fast roaming” means that, for client devices that support it, there will be a seamless transition between two of the access points when moving through the house. This can be great in theory, but I haven’t tested it in practice yet.

I set up the devices to use the same SSID for both the 2.4GHz and 5GHz bands. I read conflicting information on whether this was a good idea or not. The advantage for me is simpler configuration of devices (no need to tell guests which network to use), and in theory devices will automatically switch between 2.4GHz and 5GHz depending on how far they are from the respective signals.

It’s been a complete pleasure having reliable WiFi on all our devices throughout the house.

Power over Ethernet

As a side note, the WAPs do not have a power port. They have two network ports only: an incoming port for the internet signal and power, and an outgoing port for sharing a signal with another device if desired. I was unfamiliar with Power over Ethernet (PoE) before this project, but it’s pretty nifty. In our bedroom, it means the WAP has just a single cable going into it from the wall, with the PoE injector living downstairs next to the Hot Box itself.

Network attached storage

This was a nice little perk I hadn’t expected. We have a portable USB hard drive with movies and music, which we connect to our Roku. Any time we want to transfer files, we unplug it, bring it to one of our computers, plug it in, do the transfers, and then reconnect it to the Roku. I feel like an absolute caveman doing this!

Now that our in-house bandwidth is so high, we started playing with the idea of buying a Network Attached Storage (NAS) device so that the Roku could stream off of it over the network, and we could transfer files to it from our computers. However, to my surprise, the Hot Box itself has such capabilities: I plugged the USB hard drive into the Hot Box’s USB port, and was immediately able to access the drive from both the Roku and our laptops. This was a great little perk, and being able to watch movies or listen to music from multiple devices at once is awesome.


I don’t have all the figures at my disposal, but overall, the costs involved in this setup were:

  • Network switches and WAPs: about $612. I ended up buying extra devices due to a sale at the company I purchased from, and ended up with 5 of each, even though I only needed 3 switches and 4 WAPs. Happy to have the backups.
  • Since we were already doing renovations and fixing lots of electrical work in the house, running the three CAT6 cables wasn’t a major expense. I think it was order-of-magnitude less than $200 total.
  • Additional network cables for connecting the switches to each other, the PoE injectors to the switches, and the WAPs to the PoE injectors, cost me around $60. This turned out to be many more cables than I was expecting, and I cleaned out the local hardware store of their 1 meter cables. Twice.

Considering how vital all of this is to my job, that’s an easily absorbable cost. And the joy of never fighting with a flaky Chromecast again is well worth it!


Most everything went off without a hitch. Ultimately, I think I asked the electrician to put the network jack in our bedroom in the wrong location. Now that we’re using it just for the WAP, having it higher on the wall would have been less intrusive. I’m also pretty terrible at making jumbles of cables look nice, so there’s still some cleanup in my office and the kids’ rooms to make it presentable.


I planned out all of this towards the beginning of our construction. I then got to wait around for seven months to see if the plans were going to work out correctly in practice. It’s a huge relief to find that they did.

I would strongly recommend that anyone doing serious renovations consider running network cables through their walls at the same time. And certainly, for any new construction, make sure you include network cables along with power lines.

If I discover any problems or improvements, I’ll try to blog about them too.

I hope people found this useful, or at least interesting.

June 26, 2019 08:34 AM