# An instructive example of expected value

I think this example is very illuminating of something, although I'm not sure yet what.

Suppose you are making a short journey somewhere. You leave two minutes later than planned. How does this affect your expected arrival time? All other things being equal, you should expect to arrive two minutes later than planned. If you're walking or driving, it will probably be pretty close to two minutes no matter what happens.

Now suppose the major part of your journey involves a train that runs every hour, and you don't know just what the schedule is. Now how does your two minutes late departure affect your expected arrival time?

The expected arrival time is still two minutes later than planned. But it is not uniformly distributed. With probability , you catch the train you planned to take. You are unaffected by your late departure, and arrive at the same time. But with probability you miss that train and have to take the next one, arriving an hour later than you planned. The expected amount of lateness is

$$0 \text{ minutes}·\frac{58}{60} + 60 \text{ minutes}·\frac{2}{60} = 2 \text{ minutes}$$

the same as before.

[ Addendum: Richard Soderberg points out that one thing illuminated by this example is that the mathematics fails to capture the emotional pain of missing the train. Going in a slightly different direction, I would add that the expected value reduces a complex situation to a single number, and so must necessarily throw out a lot of important information. I discussed this here a while back in connection with lottery tickets.

But also I think this failure of the expected value is also a benefit: it does capture something interesting about the situation that might not have been apparent before: Considering the two minutes as a time investment, there is a sense in which the cost is knowable; it costs exactly two minutes. Yes, there is a chance that you will be hit by a truck that you would not have encountered had you left on time. But this is exactly offset by the hypothetical truck that passed by harmlessly two minutes before you arrived on the scene but which would have hit you had you left on time. ]

# GHC 8.2.2 is available

The GHC Team is pleased to announce a new minor release of GHC. This release builds on the performance and stability improvements of 8.2.1, fixing a variety of correctness bugs, improving error messages, and making the compiler more portable.

Notable bug-fixes include

• A correctness issue resulting in segmentation faults in some FFI-users (#13707, #14346)
• A correctness issue resulting in undefined behavior in some programs using STM (#14171)
• A bug which may have manifested in segmentation faults in out-of-memory condition (#14329)
• clearBit of Natural no longer bottoms (#13203)
• A specialisation bug resulting in exponential blowup of compilation time in some specialisation-intensive programs (#14379)
• ghc-pkg now works even in environments with misconfigured NFS mounts (#13945)
• GHC again supports production of position-independent executables (#13702)

A thorough list of the changes in the release can be found in the release notes,

## How to get it

For older versions see

We supply binary builds in the native package format for many platforms, and the source distribution is available from the same place.

## Background

Haskell is a standard lazy functional programming language.

GHC is a state-of-the-art programming suite for Haskell. Included is an optimising compiler generating efficient code for a variety of platforms, together with an interactive system for convenient, quick development. The distribution includes space and time profiling facilities, a large collection of libraries, and support for various language extensions, including concurrency, exceptions, and foreign language interfaces. GHC is distributed under a BSD-style open source license.

A wide variety of Haskell related resources (tutorials, libraries, specifications, documentation, compilers, interpreters, references, contact information, links to research groups) are available from the Haskell home page (see below).

On-line GHC-related resources

Relevant URLs on the World-Wide Web:

## Supported Platforms

The list of platforms we support, and the people responsible for them, is here

Ports to other platforms are possible with varying degrees of difficulty. The Building Guide describes how to go about porting to a new platform.

## Developers

We welcome new contributors. Instructions on accessing our source code repository, and getting started with hacking on GHC, are available from the GHC's developer's site run by Trac.

## Community Resources

There are mailing lists for GHC users, develpoers, and monitoring bug tracker activity; to subscribe, use the Mailman web interface.

There are several other Haskell and GHC-related mailing lists on haskell.org; for the full list, see the lists page.

Some GHC developers hang out on the #ghc and #haskell of the Freenode IRC network, too. See the Haskell wiki for details.

Please report bugs using our bug tracking system. Instructions on reporting bugs can be found here.

# mega-sdist: the mega repo helper

Many years ago, I wrote a utility called mega-sdist to help me with managing mega repos (more on that below). I've been using it myself ever since, making some minor improvements over the years. But I realized recently that I never really announced it to others, and especially not to the people whom it would help the most: other Yesod contributors and maintainers. Consider this the (massively belated) announcement.

You can find the most up-to-date information in the project README.md on Github. Below is the current content of that file, to help save you a click.

This is a utility written to address the specific needs in maintaining Haskell "mega-repos," or Git repositories containing multiple Cabal projects. It is intended to ease the process of deciding which packages need to be released and tagging those releases appropriately.

It provides the following functionality:

• Detect when local code has changed from what's on Hackage
• Note that, due to Hackage revisions, sometimes this logic isn't perfect
• Detect when a version number needs to be updated
• Dump the difference between the Hackage version of your package and the local version

To install it... well, listen. This tool is intended for people authoring Haskell packages. Odds are, you already know how to do this. And if you don't know, this probably isn't a tool that will help you. Anyway, in order to install it, first install Stack and then run stack install mega-sdist, or just stack install inside this repository.

## Opinionated tool

This utility is highly opinionated in some ways, e.g.:

• It only supports one style of Git tag name: packagename/version. This may look weird in non-mega-repos, where v1.2.3 looks better than foo/1.2.3, but for mega-repos the former doesn't make sense.
• It depends on Stack for both discovering all of your local packages, and for uploading to Hackage.

If you're OK with these opinions, keep reading for usage.

## Have I changed anything?

Let's say I'm working on the monad-unlift megarepo (chosen as an example of a relatively small repo). I've merged some PRs recently, or at least think I have. But I don't remember which of the individual packages within the repo this affected. Instead of looking at the commit history like some caveman, I'll typically do:

$git pull # make sure I have all latest changes$ mega-sdist

The mega-sdist command will:

• Build tarballs for all local packages
• Check what the latest versions of my packages on Hackage are
• Do a full diff on these two things and see if anything's changed

At the time of writing, here's the output from this repo:

The following packages from Hackage have not changed:

The following packages require a version bump:
monad-unlift-ref-0.2.1

What this means is:

• The monad-unlift package I have locally is at version 0.2.0. And it perfectly matches that version on Hackage. No actions necessary.
• The monad-unlift-ref package I have locally is at version 0.2.1. And it doesn't match the code on Hackage. Therefore, if I wanted to run stack upload monad-unlift-ref successfully, I'd need to bump the version number.

## What did I change?

Well, again, if I wanted to see what changed, I could run (again, like a caveman):

$git diff monad-unlift-ref/0.2.1 -- monad-unlift-ref But that's long! mega-sidst's got your back. Just run: $ mega-sdist monad-unlift-ref --get-diffs

This will print out the difference between the tarball uploaded to Hackage and what you have locally. Besides my tongue-in-cheek comment above, this is also useful if, for some reason, you either don't have or don't trust the tags in your Git repo.

One other thing: this diff is currently based on the pristine tarball from Hackage, ignoring cabal file revisions. So the difference may be slightly different from what you'd get from stack unpack monad-unlift-ref-0.2.1. But ¯\_(ツ)_/¯ that's revisions for you.

The default behavior of mega-sdist is to look at all packages specified in your stack.yaml. Targets can be any directory. And mega-sdist will automatically look at packages in any subdirectory, so that mega-sdist . is the same as mega-sdist at the root of your repo*.

* Assuming all of your packages are actually in your repo, but only crazy people would do otherwise.

## Preparing a new release

OK, now I continue working on my project, and I've:

• Made some changes to monad-unlift
• Updated the cabal file's version number
• And of course I also updated the ChangeLog.md, I'm not some monster

From the root of my repo, I run:

$mega-sdist monad-unlift Or, equivalently, from inside the monad-unlift subdirectory I run: $ mega-sdist .

Either way, I get:

The following new packages exist locally:

No version bumps required, good to go!

This tells me that my package has local changes, and the version number has been updated, so that stack upload monad-unlift will work. Neato! Now, you could just run stack upload ..., but here's what I usually do. First, I'll review the changes I'm about to upload and make sure there are no surprises:

$mega-sdist --get-diffs . The following new packages exist locally: monad-unlift-0.2.1 diff -r old/monad-unlift-0.2.0/ChangeLog.md new/monad-unlift-0.2.1/ChangeLog.md 0a1,4 > ## 0.2.1 > > * Silly changes > diff -r old/monad-unlift-0.2.0/Control/Monad/Trans/Unlift.hs new/monad-unlift-0.2.1/Control/Monad/Trans/Unlift.hs 51a52,54 > > -- I just need some space > diff -r old/monad-unlift-0.2.0/monad-unlift.cabal new/monad-unlift-0.2.1/monad-unlift.cabal 2c2 < version: 0.2.0 --- > version: 0.2.1 No version bumps required, good to go! OK, that's what I wanted. Time to release. Next, I'm going to use mega-sdist to tag the release: $ mega-sdist --gittag .

From the root of my repo, this would notice that monad-unlift-ref still requires a version bump, and refuse to proceed. But inside the monad-unlift directory, it notices that all necessary version bumps are done, and happily tags:

$mega-sdist --gittag . The following new packages exist locally: monad-unlift-0.2.1 No version bumps required, good to go! Raw command: git tag monad-unlift/0.2.1 And suddenly I notice something new: $ ls tarballs/
monad-unlift-0.2.1.tar.gz

Neat, mega-sdist left behind tarballs I can upload! To do so, I run:

$stack upload tarballs/* Note that this will work whether I'm trying to upload just one package, or all of the updated packages in my repo. Finally, I need to push the new tags to Github (or wherever): $ git push --tags

And in fact, this upload sequence is so common that I have a shell alias set up:

# Mathematical jargon for quibbling

Mathematicians tend not to be the kind of people who shout and pound their fists on the table. This is because in mathematics, shouting and pounding your fist does not work. If you do this, other mathematicians will just laugh at you. Contrast this with law or politics, which do attract the kind of people who shout and pound their fists on the table.

However, mathematicians do tend to be the kind of people who quibble and pettifog over the tiniest details. This is because in mathematics, quibbling and pettifogging does work.

Mathematics has a whole subjargon for quibbling and pettifogging, and also for excluding certain kinds of quibbles. The word “nontrivial” is preeminent here. To a first approximation, it means “shut up and stop quibbling”. For example, you will often hear mathematicians having conversations like this one:

A: Mihăilescu proved that the only solution of Catalan's equation is .

B: What about when and are consecutive and ?

A: The only nontrivial solution.

B: Okay.

Notice that A does not explain what “nontrivial” is supposed to mean here, and B does not ask. And if you were to ask either of them, they might not be able to tell you right away what they meant. For example, if you were to inquire specifically about , they would both agree that that is also excluded, whether or not that solution had occurred to either of them before. In this example, “nontrivial” really does mean “stop quibbling”. Or perhaps more precisely “there is actually something here of interest, and if you stop quibbling you will learn what it is”.

In some contexts, “nontrivial” does have a precise and technical meaning, and needs to be supplemented with other terms to cover other types of quibbles. For example, when talking about subgroups, “nontrivial” is supplemented with “proper”:

If a nontrivial group has no proper nontrivial subgroup, then it is a cyclic group of prime order.

Here the “proper nontrivial” part is not merely to head off quibbling; it's the crux of the theorem. But the first “nontrivial” is there to shut off a certain type of quibble arising from the fact that 1 is not considered a prime number. By this I mean if you omit “proper”, or the second “nontrivial”, the statement is still true, but inane:

If a nontrivial group has no subgroup, then it is a cyclic group of prime order.

(It is true, but vacuously so.) In contrast, if you omit the first “nontrivial”, the theorem is substantively unchanged:

If a group has no proper nontrivial subgroup, then it is a cyclic group of prime order.

This is still true, except in the case of the trivial group that is no longer excluded from the premise. But if 1 were considered prime, it would be true either way.

Looking at this issue more thoroughly would be interesting and might lead to some interesting conclusions about mathematical methodology.

• Can these terms be taxonomized?
• How do mathematical pejoratives relate? (“Abnormal, irregular, improper, degenerate, inadmissible, and otherwise undesirable”) Kelley says we use these terms to refer to “a problem we cannot handle”; that seems to be a different aspect of the whole story.
• Where do they fit in Lakatos’ Proofs and Refutations theory? Sometimes inserting “improper” just heads off a quibble. In other cases, it points the way toward an expansion of understanding, as with the “improper” polyhedra that violate Euler's theorem and motivate the introduction of the Euler characteristic.
• Compare with the large and finely-wrought jargon that distinguishes between proofs that are “elementary”, “easy”, “trivial”, “straightforward”, or “obvious”.
• Is there a category-theoretic formulation of what it means when we say “without loss of generality, take ”?

[ Addendum: Kyle Littler reminds me that I should not forget “pathological”. ]

# Type-Directed Code Generation

<time>November 18, 2017</time>

aka “Type-Level Icing Sugar”

## Context

At work recently I’ve been working on a library to get idiomatic gRPC support in our Haskell project. I’m quite proud of how it’s come out, and thought it’d make a good topic for a blog post. The approach demonstrates several type-level techniques that in my opinion are under-documented and exceptionally useful in using the type-system to enforce external contracts.

Thankfully the networking side of the library had already been done for me by Awake Security, but the interface feels like a thin-wrapper on top of C bindings. I’m very, very grateful that it exists, but I wouldn’t expect myself to be able to use it in anger without causing an uncaught type error somewhere along the line. I’m sure I’m probably just using it wrong, but the library’s higher-level bindings all seemed to be targeted at Awake’s implementation of protobuffers.

We wanted a version that would play nicely with proto-lens, which, at time of writing, has no official support for describing RPC services via protobuffers. If you’re not familiar with proto-lens, it generates Haskell modules containing idiomatic types and lenses for protobuffers, and can be used directly in the build chain.

So the task was to add support to proto-lens for generating interfaces to RPC services defined in protobuffers.

My first approach was to generate the dumbest possible thing that could work – the idea was to generate records containing fields of the shape Request -> IO Response. Of course, with a network involved there is a non-negligible chance of things going wrong, so this interface should expose some means of dealing with errors. However, the protobuffer spec is agnostic about the actual RPC backend used, and so it wasn’t clear how to continue without assuming anything about the particulars behind errors.

More worrisome, however, was that RPCs can be marked as streaming – on the side of the client, server, or both. This means, for example, that a method marked as server-streaming has a different interface on either side of the network:

serverSide :: Request -> (Response -> IO ()) -> IO ()
clientSide :: Request -> (IO (Maybe Response) -> IO r) -> IO r

This is problematic. Should we generate different records corresponding to which side of the network we’re dealing with? An early approach I had was to parameterize the same record based on which side of the network, and use a type family to get the correct signature:

{-# LANGUAGE DataKinds #-}

data NetworkSide = Client | Server

data MyService side = MyService
{ runServerStreaming :: ServerStreamingType side Request Response
}

type family ServerStreamingType (side :: NetworkSide) input output where
ServerStreamingType Server input output =
input -> (output -> IO ()) -> IO ()

ServerStreamingType Client input output =
forall r. input -> (IO (Maybe output) -> IO r) -> IO r

This seems like it would work, but in fact the existence of the forall on the client-side is “illegally polymorphic” in GHC’s eyes, and it will refuse to compile such a thing. Giving it up would mean we wouldn’t be able to return arbitrarily-computed values on the client-side while streaming data from the server. Users of the library might be able to get around it by invoking IORefs or something, but it would be ugly and non-idiomatic.

So that, along with wanting to be backend-agnostic, made this approach a no-go. Luckily, my brilliant coworker Judah Jacobson (who is coincidentally also the author of proto-lens), suggested we instead generate metadata for RPC services in proto-lens, and let backend library code figure it out from there.

With all of that context out of the way, we’re ready to get into the actual meat of the post. Finally.

According to the spec, a protobuffer service may contain zero or more RPC methods. Each method has a request and response type, either of which might be marked as streaming.

While we could represent this metadata at the term-level, that won’t do us any favors in terms of getting type-safe bindings to this stuff. And so, we instead turn to TypeFamilies, DataKinds and GHC.TypeLits.

For reasons that will become clear later, we chose to represent RPC services via types, and methods in those services as symbols (type-level strings). The relevant typeclasses look like this:

class Service s where
type ServiceName    s :: Symbol

class HasMethod s (m :: Symbol) where
type MethodInput       s m :: *
type MethodOutput      s m :: *
type IsClientStreaming s m :: Bool
type IsServerStreaming s m :: Bool

For example, the instances generated for the RPC service:

service MyService {
rpc BiDiStreaming(stream Request) returns(stream Response);
}

would look like this:

data MyService = MyService

instance Service MyService where
type ServiceName    MyService = "myService"

instance HasMethod MyService "biDiStreaming" where
type MethodInput       MyService "biDiStreaming" = Request
type MethodOutput      MyService "biDiStreaming" = Response
type IsClientStreaming MyService "biDiStreaming" = 'True
type IsServerStreaming MyService "biDiStreaming" = 'True

You’ll notice that these typeclasses perfectly encode all of the information we had in the protobuffer definition. The idea is that with all of this metadata available to them, specific backends can generate type-safe interfaces to these RPCs. We’ll walk through the implementation of the gRPC bindings together.

## The Client Side

The client side of things is relatively easy. We can the HasMethod instance directly:

runNonStreamingClient
:: HasMethod s m
=> s
-> Proxy m
-> MethodInput s m
-> IO (Either GRPCError (MethodOutput s m))
runNonStreamingClient =  -- call the underlying gRPC code

runServerStreamingClient
:: HasMethod s m
=> s
-> Proxy m
-> MethodInput s m
-> (IO (Either GRPCError (Maybe (MethodOutput s m)) -> IO r)
-> IO r
runServerStreamingClient =  -- call the underlying gRPC code

-- etc

This is a great start! We’ve got the interface we wanted for the server-streaming code, and our functions are smart enough to require the correct request and response types.

However, there’s already some type-unsafety here; namely that nothing stops us from calling runNonStreamingClient on a streaming method, or other such silly things.

Thankfully the fix is quite easy – we can use type-level equality to force callers to be attentive to the streaming-ness of the method:

runNonStreamingClient
:: ( HasMethod s m
, IsClientStreaming s m ~ 'False
, IsServerStreaming s m ~ 'False
)
=> s
-> Proxy m
-> MethodInput s m
-> IO (Either GRPCError (MethodOutput s m))

runServerStreamingClient
:: ( HasMethod s m
, IsClientStreaming s m ~ 'False
, IsServerStreaming s m ~ 'True
)
=> s
-> Proxy m
-> MethodInput s m
-> (IO (Either GRPCError (Maybe (MethodOutput s m)) -> IO r)
-> IO r

-- et al.

Would-be callers attempting to use the wrong function for their method will now be warded off by the type-system, due to the equality constraints being unable to be discharged. Success!

The actual usability of this code leaves much to be desired (it requires being passed a proxy, and the type errors are absolutely disgusting), but we’ll circle back on improving it later. As it stands, this code is type-safe, and that’s good enough for us for the time being.

## The Server Side

### Method Discovery

Prepare yourself (but don’t panic!): the server side of things is significantly more involved.

In order to run a server, we’re going to need to be able to handle any sort of request that can be thrown at us. That means we’ll need an arbitrary number of handlers, depending on the service in question. An obvious thought would be to generate a record we could consume that would contain handlers for every method, but there’s no obvious place to generate such a thing. Recall: proto-lens can’t, since such a type would be backend-specific, and so our only other strategy down this path would be Template Haskell. Yuck.

Instead, recall that we have an instance of HasMethod for every method on Service s – maybe we could exploit that information somehow? Unfortunately, without Template Haskell, there’s no way to discover typeclass instances.

But that doesn’t mean we’re stumped. Remember that we control the code generation, and so if the representation we have isn’t powerful enough, we can change it. And indeed, the representation we have isn’t quite enough. We can go from a HasMethod s m to its Service s, but not the other way. So let’s change that.

We change the Service class slightly:

class Service s where
type ServiceName    s :: Symbol
type ServiceMethods s :: [Symbol]

If we ensure that the ServiceMethods s type family always contains an element for every instance of HasService, we’ll be able to use that info to discover our instances. For example, our previous MyService will now get generated thusly:

data MyService = MyService

instance Service MyService where
type ServiceName    MyService = "myService"
type ServiceMethods MyService = '["biDiStreaming"]

instance HasMethod MyService "biDiStreaming" where
type MethodInput       MyService "biDiStreaming" = Request
type MethodOutput      MyService "biDiStreaming" = Response
type IsClientStreaming MyService "biDiStreaming" = 'True
type IsServerStreaming MyService "biDiStreaming" = 'True

and we would likewise add the m for any other HasMethod MyService m instances if they existed.

This seems like we can now use ServiceMethods s to get a list of methods, and then somehow type-level map over them to get the HasMethod s m constraints we want.

And we almost can, except that we haven’t told the type-system that ServiceMethods s relates to HasService s m instances in this way. We can add a superclass constraint to Service to do this:

class HasAllMethods s (ServiceMethods s) => Service s where
-- as before

But was is this HasAllMethods thing? It’s a specialized type-level map which turns our list of methods into a bunch of constraints proving we have HasMethod s m for every m in that promoted list.

class HasAllMethods s (xs :: [Symbol])

instance HasAllMethods s '[]
instance (HasMethod s x, HasAllMethods s xs) => HasAllMethods s (x ': xs)

We can think of xs here as the list of constraints we want. Obviously if we don’t want any constraints (the '[] case), we trivially have all of them. The other case is induction: if we have a non-empty list of constraints we’re looking for, that’s the same as looking for the tail of the list, and having the constraint for the head of it.

Read through these instances a few times; make sure you understand the approach before continuing, because we’re going to keep using this technique in scarier and scarier ways.

With this HasAllMethods superclass constraint, we can now convince ourselves (and, more importantly, GHC), that we can go from a Service s constraint to all of its HasMethod s m constraints. Cool!

## Typing the Server

We return to thinking about how to actually run a server. As we’ve discussed, such a function will need to be able to handle every possible method, and, unfortunately, we can’t pack them into a convenient data structure.

Our actual implementation of such a thing might take a list of handlers. But recall that each handler has different input and output types, as well as different shapes depending on which bits of it are streaming. We can make this approach work by existentializing away all of the details.

While it works as far as the actual implementation of the underlying gRPC goes, we’re left with a great sense of uneasiness. We have no guarantees that we’ve provided a handler for every method, and the very nature of existentialization means we have absolutely no guarantees that any of these things are the right ype.

Our only recourse is to somehow use our Service s constraint to put a prettier facade in front of this ugly-if-necessary implementation detail.

The actual interface we’ll eventually provide will, for example, for a service with two methods, look like this:

runServer :: HandlerForMethod1 -> HandlerForMethod2 -> IO ()

Of course, we can’t know a priori how many methods there will be (or what type their handlers should have, for that matter). We’ll somehow need to extract this information from Service s – which is why we previously spent so much effort on making the methods discoverable.

The technique we’ll use is the same one you’ll find yourself using again and again when you’re programming at the type-level. We’ll make a typeclass with an associated type family, and then provide a base case and an induction case.

class HasServer s (xs :: [Symbol]) where
type ServerType s xs :: *

We need to make the methods xs explicit as parameters in the typeclass, so that we can reduce them. The base case is simple – a server with no more handlers is just an IO action:

instance HasServer s '[] where
type ServerType s '[] = IO ()

The induction case, however, is much more interesting:

instance ( HasMethod s x
, HasMethodHandler s x
, HasServer s xs
) => HasServer s (x ': xs) where
type ServerType s (x ': xs) = MethodHandler s x -> ServerType s xs

The idea is that as we pull methods x off our list of methods to handle, we build a function type that takes a value of the correct type to handle method x, which will take another method off the list until we’re out of methods to handle. This is exactly a type-level fold over a list.

The only remaining question is “what is this MethodHandler thing?” It’s going to have to be a type family that will give us back the correct type for the handler under consideration. Such a type will need to dispatch on the streaming variety as well as the request and response, so we’ll define it as follows, and go back and fix HasServer later.

class HasMethodHandler input output cs ss where
type MethodHandler input output cs ss :: *

cs and ss refer to whether we’re looking for client-streaming and/or server-streaming types, respectively.

Such a thing could be a type family, but isn’t because we’ll need its class-ness later in order to actually provide an implementation of all of this stuff. We provide the following instances:

-- non-streaming
instance HasMethodHandler input output 'False 'False where
type MethodHandler input output 'False 'False =
input -> IO output

-- server-streaming
instance HasMethodHandler input output 'False 'False where
type MethodHandler input output 'False 'True =
input -> (output -> IO ()) -> IO ()

-- etc for client and bidi streaming

With MethodHandler now powerful enough to give us the types we want for handlers, we can go back and fix HasServer so it will compile again:

instance ( HasMethod s x
, HasMethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
, HasServer s xs
) => HasServer s (x ': xs) where
type ServerType s (x ': xs)
= MethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
-> ServerType s xs

It’s not pretty, but it works! We can convince ourselves of this by asking ghci:

ghci> :kind! ServerType MyService (ServiceMethods MyService)

(Request -> (Response -> IO ()) -> IO ()) -> IO () :: *

and, if we had other methods defined for MyService, they’d show up here with the correct handler type, in the order they were listed in ServiceMethods MyService.

## Implementing the Server

Our ServerType family now expands to a function type which takes a handler value (of the correct type) for every method on our service. That turns out to be more than half the battle – all we need to do now is to provide a value of this type.

The generation of such a value is going to need to proceed in perfect lockstep with the generation of its type, so we add to the definition of HasServer:

class HasServer s (xs :: [Symbol]) where
type ServerType s xs :: *
runServerImpl :: [AnyHandler] -> ServerType s xs

What is this [AnyHandler] thing, you might ask. It’s an explicit accumulator for existentialized handlers we’ve collected during the fold over xs. It’ll make sense when we look at the induction case.

For now, however, the base case is trivial as always:

instance HasServer s '[] where
type ServerType s '[] = IO ()
runServerImpl handlers = runGRPCServer handlers

where runGRPCServer is the underlying server provided by Awake’s library.

We move to the induction case:

instance ( HasMethod s x
, HasMethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
, HasServer s xs
) => HasServer s (x ': xs) where
type ServerType s (x ': xs)
= MethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
-> ServerType s xs
runServerImpl handlers f = runServerImpl (existentialize f : handlers)

where existentialize is a new class method we add to HasMethodHandler We will elide it here because it is just a function MethodHandler i o cs mm -> AnyHandler and is not particularly interesting if you’re familiar with existentialization.

It’s evident here what I meant by handlers being an explicit accumulator – our recursion adds the parameters it receives into this list so that it can pass them eventually to the base case.

There’s a problem here, however. Reading through this implementation of runServerImpl, you and I both know what the right-hand-side means, unfortunately GHC isn’t as clever as we are. If you try to compile it right now, GHC will complain about the non-injectivity of HasServer as implied by the call to runServerImpl (and also about HasMethodHandler and existentialize, but for the exact same reason.)

The problem is that there’s nothing constraining the type variables s and xs on runServerImpl. I always find this error confusing (and I suspect everyone does), because in my mind it’s perfectly clear from the HasServer s xs in the instance constraint. However, because SeverType is a type family without any injectivity declarations, it means we can’t learn s and xs from ServerType s xs.

Let’s see why. For a very simple example, let’s look at the following type family:

type family NotInjective a where
NotInjective Int  = ()
NotInjective Bool = ()

Here we have NotInjective Int ~ () and NotInjective Bool ~ (), which means even if we know NotInjective a ~ () it doesn’t mean that we know what a is – it could be either Int or Bool.

This is the exact problem we have with runServerImpl: even though we know what type runServerImpl has (it must be ServerType s xs, so that the type on the left-hand of the equality is the same as on the right), that doesn’t mean we know what s and xs are! The solution is to explicitly tell GHC via a type signature or type application:

instance ( HasMethod s x
, HasMethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
, HasServer s xs
) => HasServer s (x ': xs) where
type ServerType s (x ': xs)
= MethodHandler (MethodInput       s x)
(MethodOutput      s x)
(IsClientStreaming s x)
(IsServerStreaming s x)
-> ServerType s xs
runServerImpl handlers f = runServerImpl @s @xs (existentialize f : handlers)

(For those of you playing along at home, you’ll need to type-apply the monstrous MethodInput and friends to the existentialize as well.)

And finally, we’re done! We can slap a prettier interface in front of this runServerImpl to fill in some of the implementation details for us:

runServer
:: forall s
. ( Service s
, HasServer s (ServiceMethods s)
)
=> s
-> ServerType s (ServiceMethods s)
runServer _ = runServerImpl @s @(ServiceMethods s) []

Sweet and typesafe! Yes!

## Client-side Usability

Sweet and typesafe all of this might be, but the user-friendliness on the client-side leaves a lot to be desired. As promised, we’ll address that now.

### Removing Proxies

Recall that the runNonStreamingClient function and its friends require a Proxy m parameter in order to specify the method you want to call. However, m has kind Symbol, and thankfully we have some new extensions in GHC for turning Symbols into values.

We can define a new type, isomorphic to Proxy, but which packs the fact that it is a KnownSymbol (something we can turn into a String at runtime):

data WrappedMethod (sym :: Symbol) where
WrappedMethod :: KnownSymbol sym => WrappedMethod sym

We change our run*Client friends to take this WrappedMethod m instead of the Proxy m they used to:

runNonStreamingClient
:: ( HasMethod s m
, IsClientStreaming s m ~ 'False
, IsServerStreaming s m ~ 'False
)
=> s
-> WrappedMethod m
-> MethodInput s m
-> IO (Either GRPCError (MethodOutput s m))

and, with this change in place, we’re ready for the magic syntax I promised earlier.

import GHC.OverloadedLabel

instance ( KnownSymbol sym
, sym ~ sym'
) => IsLabel sym (WrappedMethod sym') where
fromLabel _ = WrappedMethod

This sym ~ sym' thing is known as the constraint trick for instances, and is necessary here to convince GHC that this can be the only possible instance of IsLabel that will give you back WrappedMethods.

Now turning on the {-# LANGUAGE OverloadedLabels #-} pragma, we’ve changed the syntax to call these client functions from the ugly:

runBiDiStreamingClient MyService (Proxy @"biDiStreaming")

into the much nicer:

runBiDiStreamingClient MyService #biDiStreaming

### Better “Wrong Streaming Variety” Errors

The next step in our journey to delightful usability is remembering that the users of our library are only human, and at some point they are going to call the wrong run*Client function on their method with a different variety of streaming semantics.

At the moment, the errors they’re going to get when they try that will be a few stanza long, the most informative of which will be something along the lines of unable to match 'False with 'True. Yes, it’s technically correct, but it’s entirely useless.

Instead, we can use the TypeError machinery from GHC.TypeLits to make these error messages actually helpful to our users. If you aren’t familiar with it, if GHC ever encounters a TypeError constraint it will die with a error message of your choosing.

We will introduce the following type family:

type family RunNonStreamingClient (cs :: Bool) (ss :: Bool) :: Constraint where
RunNonStreamingClient 'False 'False = ()
RunNonStreamingClient 'False 'True = TypeError
( Text "Called 'runNonStreamingClient' on a server-streaming method."
:$$: Text "Perhaps you meant 'runServerStreamingClient'." ) RunNonStreamingClient 'True 'False = TypeError ( Text "Called 'runNonStreamingClient' on a client-streaming method." :$$: Text "Perhaps you meant 'runClientStreamingClient'."
)
RunNonStreamingClient 'True 'True = TypeError
( Text "Called 'runNonStreamingClient' on a bidi-streaming method."
:$$: Text "Perhaps you meant 'runBiDiStreamingClient'." ) The :$$: type operator stacks message vertically, while :<>: stacks it horizontally.

We can change the constraints on runNonStreamingClient:

runNonStreamingClient
:: ( HasMethod s m
, RunNonStreamingClient (IsClientStreaming s m)
(IsServerStreaming s m)
)
=> s
-> WrappedMethod m
-> MethodInput s m
-> IO (Either GRPCError (MethodOutput s m))

and similarly for our other client functions. Reduction of the resulting boilerplate is left as an exercise to the reader.

With all of this work out of the way, we can test it:

runNonStreamingClient MyService #biDiStreaming
Main.hs:45:13: error:
• Called 'runNonStreamingClient' on a bidi-streaming method.
Perhaps you meant 'runBiDiStreamingClient'.
• In the expression: runNonStreamingClient MyService #bidi

Amazing!

### Better “Wrong Method” Errors

The other class of errors we expect our users to make is to attempt to call a method that doesn’t exist – either because they made a typo, or are forgetful of which methods exist on the service in question.

As it stands, users are likely to get about six stanzas of error messages, from No instance for (HasMethod s m) to Ambiguous type variable 'm0', and other terrible things that leak our implementation details. Our first thought might be to somehow emit a TypeError constraint if we don’t have a HasMethod s m instance, but I’m not convinced such a thing is possible.

But luckily, we can actually do better than any error messages we could produce in that way. Since our service is driven by a value (in our example, the data constructor MyService), by the time things go wrong we do have a Service s instance in scope. Which means we can look up our ServiceMethods s and given some helpful suggestions about what the user probably meant.

The first step is to implement a ListContains type family so we can determine if the method we’re looking for is actually a real method.

type family ListContains (n :: k) (hs :: [k]) :: Bool where
ListContains n '[]       = 'False
ListContains n (n ': hs) = 'True
ListContains n (x ': hs) = ListContains n hs

In the base case, we have no list to look through, so our needle is trivially not in the haystack. If the head of the list is the thing we’re looking for, then it must be in the list. Otherwise, take off the head of the list and continue looking. Simple really, right?

We can now use this thing to generate an error message in the case that the method we’re looking for is not in our list of methods:

type family RequireHasMethod s (m :: Symbol) (found :: Bool) :: Constraint where
RequireHasMethod s m 'False = TypeError
( Text "No method "
:<>: ShowType m
:<>: Text " available for service '"
:<>: ShowType s
:<>: Text "'."
:$$: Text "Available methods are: " :<>: ShowType (ServiceMethods s) ) RequireHasMethod s m 'True = () If found ~ 'False, then the method m we’re looking for is not part of the service s. We produce a nice error message informing the user about this (using ShowType to expand the type variables). We will provide a type alias to perform this lookup: type HasMethod' s m = ( RequireHasMethod s m (ListContains m (ServiceMethods s) , HasMethod s m ) Our new HasMethod' s m has the same shape as HasMethod, but will expand to our custom type error if we’re missing the method under scrutiny. Replacing all of our old HasMethod constraints with HasMethod' works fantastically: Main.hs:54:15: error: • No method "missing" available for service 'MyService'. Available methods are: '["biDiStreaming"] Damn near perfect! That list of methods is kind of ugly, though, so we can write a quick pretty printer for showing promoted lists: type family ShowList (ls :: [k]) :: ErrorMessage where ShowList '[] = Text "" ShowList '[x] = ShowType x ShowList (x ': xs) = ShowType x :<>: Text ", " :<>: ShowList xs Replacing our final ShowType with ShowList in RequireHasMethod now gives us error messages of the following: Main.hs:54:15: error: • No method "missing" available for service 'MyService'. Available methods are: "biDiStreaming" Absolutely gorgeous. ## Conclusion This is where we stop. We’ve used type-level metadata to generate client- and server-side bindings to an underlying library. Everything we’ve made is entirely typesafe, and provides gorgeous, helpful error messages if the user does anything wrong. We’ve found a practical use for many of these seemingly-obscure type-level features, and learned a few things in the process. In the words of my coworker Renzo Carbonara1: “It is up to us, as people who understand a problem at hand, to try and teach the type system as much as we can about that problem. And when we don’t understand the problem, talking to the type system about it will help us understand. Remember, the type system is not magic, it is a logical reasoning tool.” This resounds so strongly in my soul, and maybe it will in yours too. If so, I encourage you to go forth and find uses for these techniques to improve the experience and safety of your own libraries. 1. Whose article “Opaleye’s sugar on top” was a strong inspiration on me, and subsequently on this post. </article> ## November 16, 2017 ### Mark Jason Dominus # Another system software error [ Warning: This article is meandering and does not end anywhere in particular ] My recent article about system software errors kinda blew up the Reddit / Hacker News space, and even got listed on Voat, which I understand is the Group W Bench where they send you if you aren't moral enough to be in Reddit. Many people on these fora were eager to tell war stories of times that they had found errors in the compiler or other infrastructural software. This morning I remembered another example that had happened to me. In the middle 1990s, I was just testing some network program on one of the Sun Solaris machines that belonged to the Computational Linguistics program, when the entire machine locked up. I had to go into the machine room and power-cycle it to get it to come back up. I returned to my desk to pick up where I had left off, and the machine locked up, again just as I ran my program. I rebooted the machine again, and putting two and two together I tried the next run on a different, less heavily-used machine, maybe my desk workstation or something. The problem turned out to be a bug in that version of Solaris: if you bound a network socket to some address, and then tried to connect it to the same address, everything got stuck. I wrote a five-line demonstration program and we reported the bug to Sun. I don't know if it was fixed. My boss had an odd immediate response to this, something along the lines that connecting a socket to itself is not a sanctioned use case, so the failure is excusable. Channeling Richard Stallman, I argued that no user-space system call should ever be able to crash the system, no matter what stupid thing it does. He at once agreed. I felt I was on safe ground, because I had in mind the GNU GCC bug reporting instructions of the time, which contained the following unequivocal statement: If the compiler gets a fatal signal, for any input whatever, that is a compiler bug. Reliable compilers never crash. I love this paragraph. So clear, so pithy! And the second sentence! It could have been left off, but it is there to articulate the writer's moral stance. It is a rock-firm committment in a wavering and uncertain world. Stallman was a major influence on my writing for a long time. I first encountered his work in 1985, when I was browsing in a bookstore and happened to pick up a copy of Dr. Dobb's Journal. That issue contained the very first publication of the GNU Manifesto. I had never heard of Unix before, but I was bowled over by Stallman's vision, and I read the whole thing then and there, standing up.  Order The Crazy Ape from Powell's (It hit the same spot in my heart as Albert Szent-Györgyi's The Crazy Ape, which made a similarly big impression on me at about the same time. I think programmers don't take moral concerns seriously enough, and this is one reason why so many of them find Stallman annoying. But this is what I think makes Stallman so important. Perhaps Dan Bernstein is a similar case.) I have very vague memories of perhaps finding a bug in gcc, which is perhaps why I was familiar with that particular section of the gcc documentation. But more likely I just read it because I read a lot of stuff. Also Stallman was probably on my “read everything he writes” list. Why was I trying to connect a socket to itself, anyway? Oh, it was a bug. I meant to connect it somewhere else and used the wrong variable or something. If the operating system crashes when you try, that is a bug. Reliable operating systems never crash. [ Final note: I looked for my five-line program that connected a socket to itself, but I could not find it. But I found something better instead: an email I sent in April 1993 reporting a program that caused g++ version 2.3.3 to crash with an internal compiler error. And yes, my report does quote the same passage I quoted above. ] ### Tweag I/O # Parallelising your array code Manuel M T Chakravarty This is the fifth post in a series about array programming in Haskell — you might be interested in the first, second, third, and fourth, too. A recurring theme in array programming is performance. After all, many algorithms in numerical computing and data science are computationally intensive. Once the sequential implementation of an array program has been fully optimised, the natural next step is to use one or multiple forms of parallelism to achieve further performance improvements. This can be parallelism within one computational core (SIMD parallelism), multicore parallelism, or distributed multi-machine parallelism. Unfortunately, at this point matters become much more complicated, because parallel programming comes with its own set of serious challenges. In this post, we will focus on multicore parallelism for computations operating on multi-dimensional arrays. In other words, in relation to the vector package, which we discussed in the last post, we have two new ingredients. Firstly, instead of one-dimensional Int-indexed arrays, we have multi-dimensional Int-indexed arrays. Secondly, the collective operations provided on these arrays come with parallel implementations. In fact, the library API is designed to favour collective operations that have good parallel implementations. Similarly, the move to explicitly multi-dimensional arrays is motivated by being able to provide parallel implementations that take the array shape into account, wherever that is an advantage. To make matters concrete, we will discuss the Repa library. Internally it uses many of the same techniques as vector, including strictness, unboxing, and a two-phase initialisation strategy. However, it uses a second array fusion strategy in addition to vector’s stream fusion. More precisely, Repa internally uses vector to represent plain boxed and unboxed arrays and to execute sequential computations on those, which still benefit from stream fusion. However, Repa introduces additional array representations, such as delayed arrays, to also achieve fusion across parallel computations. This additional complication is necessary as stream fusion, by itself, tends to turn parallel into sequential code. In other words, one of the challenges of high-performance parallel array implementations that are built on collective operations is to apply fusion while preserving parallelism. To really get good performance, we need to simultaneously optimize along two orthogonal dimensions: get more done simultaneously, by parallelizing, but also make each sequential unit of work run faster. A second consequence of targeting a parallelisation-friendly API is a very limited use of mutable arrays. Mutable structures generally interact badly with concurrency and parallelism, opening the door to a whole range of hard to diagnose faults. In fact, the focus on immutable arrays for parallel programming is one of the most compelling conceptual improvements of functional over imperative parallel array programming. (To be precise, Repa’s API does provide access to the mutable array structures used to implement two-phase initialisation, but it is usually not necessary to use them directly.) ## Multiple dimensions The obvious structure for indexing multi-dimensional Int-indexed arrays are tuples of Ints. However, they come with two severe drawbacks: (1) they force us to fix the dimensionality of all functions over arrays and (2) they are not sufficient to characterise operations on lower-dimensional subarrays of an array (e.g., a two-dimensional plane within a three-dimensional cube). As an example of the first drawback, consider a fold function that given a three-dimensional cube, reduces it along, say, the x-axis to a two-dimensional plane of sums. The only difference of that operation compared to a fold that sums a two-dimensional plane across one axis to a one-dimensional vector is the number of dimensions that we do not reduce along. Now, we could have a family of fold functions (fold1, fold2, and so on), one for each possible dimension of argument array. But that is hardly satisfactory. Instead, Repa uses a custom datatype for indexing. Index types are built from the infix constructor (:.) and the constant Z, representing a zero-dimensional array (which is the special cases of a singleton array). For example, the type of two-dimensional indices is Z :. Int :. Int and one of its values is Z :. 3 :. 5. By using a type variable instead of Z, we can denote indices with a particular minimum dimensionality. For instance, sh :. Int has at least one dimension, but it might have more, depending on how the type variable sh is instantiated — in any case, instances of sh need to be drawn from the class Shape. On the basis of this index representation, we can capture the entire family of multi-dimensional fold functions in a single type: foldS :: (Shape sh, Source r a, Unbox a) => (a -> a -> a) -> a -> Array r (sh :. Int) a -> Array U sh a  The function foldS implements a sequential, multi-dimensional reduction; hence, the S suffix. It gets three arguments: 1. a -> a -> a is the type of the binary reduction function, which needs to be associative, 2. a is the reduction function’s neutral (i.e, together they form a monoid), and 3. Array e (sh :. Int) a is an at least one-dimensional array of elements of type a, which the type constraint Unbox a requires to be a type that has an associated unboxed representation. Finally, the result of type Array U sh a has one dimension less than the argument array, but contains elements of the same type a. This leaves us with wondering about the meaning of the first type argument of Arrayr and U, respectively— as well as the type constraint Source r a. ## Indexed arrays The first type argument of Array determines the array representation. The available representations include boxed (V) and unboxed (U) representations, but also delayed (D) and cursored (C) representations. The latter are guaranteed to be removed by fusion, but can lead to the superfluous recomputation of array elements that are used more than once. Repa makes the choice of representation explicit to place it under programmer control — experience shows that compiler heuristics for automatic representation selection tend to be fragile and unreliable. A consequence of a representation that is fused away, such as delayed D and cursored C, is that it can only be a data Source of a computation. Hence, the type class of the same name provides elementary array access functions for arrays. The opposite, a Target, provides the functionality to fill an array as part of two-phase initialisation and is only available to manifest representations, such as the boxed V and unboxed U representation. A manifest representation is one which, in contrast to a fused-away delayed representation, is actually stored in memory. In addition to concrete representations, Repa representation tags can also include meta information, such as the interleaving hint I. An array tagged I U uses an unboxed interleaved representation, which improves parallel load balancing in parallel computations where the amount of work strongly varies between different regions in the parallel array. A standard example is computing a Mandelbrot set, where black pixels are significantly more expensive than others. ## Parallelism As we saw above with foldS, Repa follows the convention of adding an S to sequential array operations. Similarly, it uses a P as a suffix for parallel functions. For example, we have foldP :: (Shape sh, Source r a, Unbox a, Monad m) => (a -> a -> a) -> a -> Array r (sh :. Int) a -> m (Array U sh a)  for the parallel version of fold. The distinction between sequential and parallel functions is an important one, since Repa does not support nested parallelism. That is, a parallel function (e.g., foldP) cannot use another parallel function as an argument (e.g., as the combination function). In addition to the suffix, the parallel fold distinguishes itself from the sequential by the use of a not further specified monad. The purpose of this monad is to ensure the one-by-one execution of pipelines of parallel computations. This is important to prevent inadvertent nesting of parallel computations as Haskell is a lazy language and we might otherwise feed a suspended (i.e., not yet evaluated) parallel computation into another parallel computation. ## Parallel matrix multiplication As a simple example of a parallel computation, consider the multiplication of two matrices arr and brr of type Array U DIM2 Double (two-dimensional, unboxed arrays), where type DIM2 = Z :. Int :. Int: mmultP :: Monad m => Array U DIM2 Double -> Array U DIM2 Double -> m (Array U DIM2 Double) mmultP arr brr = do trr <- transpose2P brr computeP (fromFunction (Z :. h1 :. w2) dotp) where (Z :. h1 :. _) = extent arr (Z :. _ :. w2) = extent brr dotp ix = sumAllS  zipWith (*) (slice arr (Any :. (row ix) :. All)) (slice trr (Any :. (col ix) :. All))  We assume the existence of a helper function transpose2P, which transposes a matrix in parallel — for example, by using Repa’s backpermute function. Then, we generate the manifest result array by computing all elements of fromFunction (Z :. h1 :. w2) dotpin parallel with computeP. The shape (i.e., the size of the dimensions) of the result is h1 times w2, and fromFunction turns a function, which takes an array index to the corresponding array element , into a delayed array: fromFunction :: sh -> (sh -> a) -> Array D sh a  At each index ix of the resulting array, we evaluate dotp, which only involves a sequential computation. It’s sequential nature is important for two reasons. Firstly, as mentioned, Repa does not support nested parallelism, so the computations on each result array index triggered by computeP in parallel may themselves not be parallel. Secondly, the work complexity of matrix multiplication is n^3 — that is the number of scalar multiplications that need to be performed. Performing them all in parallel would lead to (a) too much and (b) too fine-grained parallelism. Both too much parallelism and parallel workloads that are each too little work lead to bad performance as they result in too much administrative overhead. In contrast, the sequential computation performed by dotp obtains a row of the matrix arr and a column of brr (actually, a row of the transposed brr, which is trr) with slice, which extracts an entire subarray from an array. Then, it multiples the row and column pointwise with zipWith (*) and sums up the products with sumAllS, where zipWith :: (Shape sh, Source r1 a, Source r2 b) => (a -> b -> c) -> Array r1 sh a -> Array r2 sh b -> Array D sh c sumAllS :: (Shape sh, Source r a, Num a) => Array r sh a -> a  This example highlights how reasoning about the decomposition of an algorithm into parallel and sequential components is crucial for good parallel performance. This is assisted by Repa’s clear distinction between sequential and parallel operations. ## Further reading Repa went through three major iterations before arriving at the current interface. The underlying concepts are described and supported by benchmarks in the papers Regular, shape-polymorphic, parallel arrays in Haskell, Efficient Parallel Stencil Convolution in Haskell, and Guiding Parallel Array Fusion with Indexed Types, respectively. In addition, Data Flow Fusion with Series Expressions in Haskell proposes a further improvement to the fusion system. However, this has not been integrated into the main package. ## November 15, 2017 ### FP Complete # My DevOps Journey and How I Became a Recovering IT Operations Manager ### DevOps Challenges I managed an eight-person team that supported data integration tools for a Fortune 100 tech company. One of the tools we supported was adopted by every business unit to integrate a large number of applications, database, and information repositories together. Over a period of 7 - 8 years the number of production integration applications grew to over 800. During the years of scaling multiple environments we faced several challenges: ### Jeremy Gibbons # The Digits of Pi In the previous post we were introduced to metamorphisms, which consist of an unfold after a fold—typically on lists, and the fold part typically a ${\mathit{foldl}}$. A canonical example is the conversion of a fraction from one base to another. For simplicity, let’s consider here only infinite fractions, so we don’t have to deal with the end of the input and flushing the state: $\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \end{array} \end{array} \end{array}$ So for example, we can convert an infinite fraction in base 3 to one in base 7 with $\displaystyle \mathit{stream}\;\mathit{next}\;\mathit{step}\;(0,1)$ where $\displaystyle \begin{array}{@{}lcl@{}} \mathit{next}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathbf{if}\;\lfloor{y}\rfloor = \lfloor{7 \times (u+1) \times v}\rfloor\;\begin{array}[t]{@{}l@{\;}l}\mathbf{then}&\mathit{Just}\;(y,(u - y/(v \times 7), v \times 7))\\\mathbf{else}&\mathit{Nothing} \\ \end{array} \end{array} \\ \mathit{stepl}\;(u,v)\;d &=& (u \times 3 + d, v / 3) \end{array}$ In this post, we’ll see another number conversion problem, which will deliver the digits of ${\pi}$. For more details, see my paper—although the presentation here is simpler now. ## Series for pi Leibniz showed that $\displaystyle \displaystyle \frac{\pi}{4} = \sum_{i=0}^{\infty} \frac{(-1)^i}{2i+1}$ From this, using Euler’s convergence-accelerating transformation, one may derive $\displaystyle \pi = \sum_{i=0}^{\infty} \frac{(i!)^2\,2^{i+1}}{(2i+1)!}$ or equivalently $\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr)$ This can be seen as the number ${(2;2,2,2...)}$ in a funny mixed-radix base ${(\frac{1}{3}, \frac{2}{5}, \frac{3}{7}...)}$, just as the usual decimal expansion $\displaystyle \pi = 3 + \frac{1}{10} \times \biggl(1 + \frac{1}{10}\times \biggl(4 + \frac{1}{10}\times \biggl( \cdots\biggr)\biggr)\biggr)$ is represented by the number ${(3;1,4,1...)}$ in the fixed-radix base ${(\frac{1}{10},\frac{1}{10},\frac{1}{10}...)}$. Computing the decimal digits of ${\pi}$ is then a matter of conversion from the mixed-radix base to the fixed-radix base. ## Conversion from a fixed base Let’s remind ourselves of how it should work, using a simpler example: conversion from one fixed base to another. We are given an infinite-precision fraction in the unit interval $\displaystyle x = \frac{1}{m} \times \biggl(x_0 + \frac{1}{m}\times \biggl(x_1 + \frac{1}{m}\times \biggl( \cdots\biggr)\biggr)\biggr)$ in base ${m}$, in which ${0 \le x_i < m}$ for each digit ${x_i}$. We are to convert it to a similar representation $\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \frac{1}{n}\times \biggl(y_1 + \frac{1}{n}\times \biggl( \cdots\biggr)\biggr)\biggr)$ in base ${n}$, in which ${0 \le y_j < n}$ for each output digit ${y_j}$. The streaming process maintains a state ${(u,v)}$, a pair of rationals; the invariant is that after consuming ${i}$ input digits and producing ${j}$ output digits, we have $\displaystyle x = \frac{1}{n} \times \biggl(y_0 + \cdots + \frac{1}{n}\times \biggl(y_{j-1} + v \times (u + \frac{1}{m} \times \biggl( x_i + \frac{1}{m} \times \biggl(x_{i+1} + \cdots \biggr)\biggr)\biggr)\biggr)\biggr)$ so that ${(u,v)}$ represents a linear function ${(v\times) \cdot (u+)}$ that should be applied to the value represented by the remaining input. We can initialize the process with ${i=0, j=0, u=0, v=1}$. At each step, we first try to produce another output digit. The remaining input digits ${x_i, x_{i+1},...}$ represent a value in the unit interval; so if ${n \times v \times (u+0)}$ and ${n \times v \times (u+1)}$ have the same integer part, then that must be the next output digit, whatever the remaining input digits are. Let ${y_j = \lfloor n \times v \times u \rfloor}$ be that integer. Now we need to find ${(u',v')}$ such that $\displaystyle \frac{1}{n} \times \biggl(y_j + v' \times (u' + r)\biggr) = v \times (u + r)$ for any remainder ${r}$; then we can increment ${j}$ and set ${(u,v)}$ to ${(u',v')}$ and the invariant is maintained. A little algebra shows that we should take ${v' = n \times v}$ and ${u' = u - y_j/v'}$. If ${v \times u}$ and ${v \times (u+1)}$ have different integer parts, we cannot yet tell what the next output digit should be, so we must consume the next input digit instead. Now we need to find ${(u',v')}$ such that $\displaystyle v \times \biggl(u + \frac{1}{m} \times \biggl(x_i + r\biggr)\biggr) = v' \times (u' + r)$ for any remainder ${r}$; then we can increment ${i}$ and set ${(u,v)}$ to ${(u',v')}$ and the invariant is again maintained. Again, algebraic manipulation leads us to ${v' = v/m}$ and ${u' = m \times u + x_i}$. For example, ${\frac{1}{4} = 0.020202..._3 = 0.151515..._7}$, and the conversion starts as follows: $\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 0 && 2 && 0 && && 2 && && 0 && 2 \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{0}{1},\frac{1}{3}\bigr) && \bigl(\frac{2}{1},\frac{1}{9}\bigr) && \bigl(\frac{6}{1},\frac{1}{27}\bigr) && \bigl(\frac{15}{7},\frac{7}{27}\bigr) && \bigl(\frac{59}{7},\frac{7}{81}\bigr) && \bigl(\frac{8}{49},\frac{49}{81}\bigr) && \bigl(\frac{24}{49},\frac{49}{243}\bigr) && \bigl(\frac{170}{49},\frac{49}{729}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && && && 1 && && 5 \end{array}$ That is, the initial state is ${u_0=\frac{0}{1}, v_0=\frac{1}{1}}$. This state does not yet determine the first output digit, so we consume the first input digit 0 to yield the next state ${u_1 = \frac{0}{1}, v_1 = \frac{1}{3}}$. This state still does not determine the first output, and nor will the next; so we consume the next two input digits 2 and 0, yielding state ${u_3 = \frac{6}{1}, v_3 = \frac{1}{27}}$. This state does determine the next digit: ${v_3 \times u_3 = 0.020_3 = 0.136..._7}$ and ${v_3 \times (u_3+1) = 0.021_3 = 0.154..._7}$ both start with a 1 in base 7. So we can produce a 1 as the first output digit, yielding state ${u_4 = \frac{15}{7}, v_4 = \frac{7}{27}}$. And so on. The process tends to converge. Each production step widens the non-empty window ${[n \times v \times u, n \times v \times (u+1))}$ by a factor of ${n}$, so it will eventually contain multiple integers; therefore we cannot produce indefinitely. Each consumption step narrows the window by a factor of ${m}$, so it will tend towards eventually producing the next output digit. However, this doesn’t always work. For example, consider converting ${0.333..._{10}}$ to base 3: $\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 3 && 3 && 3 & \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{3}{1},\frac{1}{10}\bigr) && \bigl(\frac{33}{1},\frac{1}{100}\bigr) && \bigl(\frac{333}{1},\frac{1}{1000}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && \end{array}$ The first output digit is never determined: if the first non-3 in the input is less than 3, the value is less than a third, and the first output digit should be a 0; if the first non-3 is greater than 3, then the value is definitely greater than a third, and it is safe to produce a 1 as the first output digit; but because the input is all 3s, we never get to make this decision. This problem will happen whenever the value being represented has a finite representation in the output base. ## Conversion from a mixed base Let’s return now to computing the digits of ${\pi}$. We have the input $\displaystyle \pi = 2 + \frac{1}{3} \times \biggl(2 + \frac{2}{5}\times \biggl(2 + \frac{3}{7}\times \biggl( \cdots \biggl(2 + \frac{i}{2i+1}\times \biggl(\cdots\biggr)\biggr)\biggr)\biggr)\biggr)$ which we want to convert to decimal. The streaming process maintains a pair ${(u,v)}$ of rationals—but this time representing the linear function ${(u+) \cdot (v\times)}$, since this time our expression starts with a sum rather than a product. The invariant is similar: after consuming ${i-1}$ input digits and producing ${j}$ output digits, we have $\displaystyle \pi = y_0 + \frac{1}{10} \times \biggl(\cdots y_{j-1} + \frac{1}{10} \times \biggl(u + v \times \biggl(x_i + \frac{i}{2i+1} \times \biggl(x_{i+1} + \frac{i+1}{2i+3} \times \cdots\biggr)\biggr)\biggr)\biggr)$ Note that the output base is fixed at 10; but more importantly, the input digits ${x_i}$ are all fixed at 2, and it is the input base that varies from digit to digit. We can initialize the process with ${i=1, j=0, u=0, v=1}$. At each step, we first try to produce an output digit. What value might the remaining input $\displaystyle r = 2 + \frac{i}{2i+1} \times \biggl(2 + \frac{i+1}{2i+3} \times \cdots \biggr)$ represent? Each of the bases is at least ${\frac{1}{3}}$, so it is clear that ${r_{\mathrm{min}} \le r}$, where $\displaystyle r_{\mathrm{min}} = 2 + \frac{1}{3} \times r_{\mathrm{min}}$ which has unique solution ${r_{\mathrm{min}} = 3}$. Similarly, each of the bases is less than ${\frac{1}{2}}$, so it is clear that ${r < r_{\mathrm{max}}}$, where $\displaystyle r_{\mathrm{max}} = 2 + \frac{1}{2} \times r_{\mathrm{max}}$ which has unique solution ${r_{\mathrm{max}} = 4}$. So we consider the bounds ${\lfloor u + v \times 3 \rfloor}$ and ${\lfloor u + v \times 4 \rfloor}$; if these have the same integer part ${y_j}$, then that is the next output digit. Now we need to find ${(u',v')}$ such that $\displaystyle y_j + \frac{1}{10} \times (u' + v' \times r) = u + v \times r$ for any remainder ${r}$, so we pick ${u' = 10 \times (u - y_j)}$ and ${v' = 10 \times v}$. Then we can increment ${j}$ and set ${(u,v)}$ to ${(u',v')}$, and the invariant is maintained. If the two bounds have different integer parts, we must consume the next input digit instead. Now we need to find ${(u',v')}$ such that $\displaystyle u' + v' \times r = u + v \times \biggl(x_i + \frac{i}{2i+1} \times r\biggr)$ for all ${r}$, so we pick ${u' = u + v \times x_i}$ and ${v' = v \times i / (2i+1)}$. Then we can increment ${i}$ and set ${(u,v)}$ to ${(u',v')}$, and again the invariant is maintained. The conversion starts as follows: $\displaystyle \begin{array}{c|c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}c@{}cc} x_i & & 2 && && 2 && 2 && && 2 && 2 \\ \hline (u,v) & \bigl(\frac{0}{1},\frac{1}{1}\bigr) && \bigl(\frac{2}{1},\frac{1}{3}\bigr) && \bigl(\frac{-10}{1},\frac{10}{3}\bigr) && \bigl(\frac{10}{3},\frac{4}{3}\bigr) && \bigl(\frac{-2}{3},\frac{4}{7}\bigr) && \bigl(\frac{-50}{3},\frac{40}{7}\bigr) && \bigl(\frac{-110}{21},\frac{160}{63}\bigr) && \bigl(\frac{-10}{63},\frac{800}{693}\bigr) & \cdots \vrule height 2.5ex depth 1.5ex width 0pt \\ \hline y_j & & && 3 && && && 1 && && \end{array}$ Happily, non-termination ceases to be a problem: the value being represented does not have a finite representation in the output base, being irrational. ## Code We can plug these definitions straight into the ${\mathit{stream}}$ function above: $\displaystyle \mathit{piDigits} = \mathit{stream}\;g\;f\;(1,0,0\%1,1\%1)\;(\mathit{repeat}\;2)$ where $\displaystyle g\;(i,j,u,v) = \begin{array}[t]{@{}l} \mathbf{if}\;y == \mathit{floor}\;(u + v \times 4) \\ \mathbf{then}\;\mathit{Just}\;(y, (i,j+1, 10 \times (u - \mathit{fromIntegral}\;y), 10 \times v)) \\ \mathbf{else}\;\mathit{Nothing} \\ \mathbf{where}\;y = \mathit{floor}\;(u + v \times 3) \end{array}$ and $\displaystyle f\;(i,j,u,v)\;x = \begin{array}[t]{@{}l} (i+1,j,u + v \times \mathit{fromIntegral}\;x, v \times i' / (2 \times i' + 1)) \\ \mathbf{where}\;i' = \mathit{fromIntegral}\;i \end{array}$ (The ${\%}$s make rational numbers in Haskell, and force the ambiguous fractional type to be ${\mathit{Rational}}$ rather than ${\mathit{Double}}$.) In fact, this program can be considerably simplified, by inlining the definitions. In particular, the input digits are all 2, so we need not supply them. Moreover, the ${j}$ component of the state is never used, because we treat each output digit in the same way (in contrast to the input digits); so that may be eliminated. Finally, we can eliminate some of the numeric coercions if we represent the ${i}$ component as a rational in the first place: $\displaystyle \mathit{piDigits} = \begin{array}[t]{@{}l} \mathit{go}\;((1,0,1) :: (\mathit{Rational},\mathit{Rational},\mathit{Rational}))\;\mathbf{where} \\ \qquad \mathit{go}\;(i,u,v) = \begin{array}[t]{@{}ll} \mathbf{if} & y == \mathit{floor}\;(u+v \times 4) \\ \mathbf{then} & y : \mathit{go}\;(i,10 \times (u-\mathit{fromIntegral}\;y),10 \times v) \\ \mathbf{else} & \mathit{go}\;(i+1,u+2 \times v, (v \times i) / (2 \times i+1)) \\ \multicolumn{2}{@{}l}{\qquad \mathbf{where}\; y = \mathit{floor}\;(u+v \times 3)} \end{array} \end{array}$ Then we have $\displaystyle \mathit{piDigits} = [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3...$ ### Mark Jason Dominus # Down with the negation sign! [ Credit where it is due: This was entirely Darius Bacon's idea. ] In connection with “Recognizing when two arithmetic expressions are essentially the same”, I had several conversations with people about ways to normalize numeric expressions. In that article I observed that while everyone knows the usual associative law for addition$$ (a + b) + c = a + (b + c)$$nobody ever seems to mention the corresponding law for subtraction:$$ (a+b)-c = a + (b-c).$$And while everyone “knows” that subtraction is not associative because$$(a - b) - c ≠ a - (b-c)$$nobody ever seems to observe that there is an associative law for subtraction:$$\begin{align} (a - b) + c & = a - (b - c) \\ (a -b) -c & = a-(b+c).\end{align}$$This asymmetry is kind of a nuisance, and suggests that a more symmetric notation might be better. Darius Bacon suggested a simple change that I think is an improvement: Write the negation of as$$a\star$$so that one has, for all ,$$a+a\star = a\star + a = 0.$$The operation obeys the following elegant and simple laws:$$\begin{align} a\star\star & = a \\ (a+b)\star & = a\star + b\star \end{align} $$Once we adopt , we get a huge payoff: We can eliminate subtraction: Instead of we now write . The negation of is$$(a+b\star)\star = a\star + b{\star\star} = a\star +b.$$We no longer have the annoying notational asymmetry between and where the plus sign appears from nowhere. Instead, one is and the other is , which is obviously just the usual commutativity of addition. The is of course nothing but a synonym for multiplication by . But it is a much less clumsy synonym. means , but with less inkjunk. In conventional notation the parentheses in are essential and if you lose them the whole thing is ruined. But because is just a special case of multiplication, it associates with multiplication and division, so we don't have to worry about parentheses in . They are all equal to just . and you can drop the parentheses or include them or write the terms in any order, just as you like, just as you would with . The surprising associativity of subtraction is no longer surprising, because$$(a + b) - c = a + (b - c)$$is now written as$$(a + b) + c\star = a + (b + c\star)$$so it's just the usual associative law for addition; it is not even disguised. The same happens for the reverse associative laws for subtraction that nobody mentions; they become variations on$$ \begin{align} (a + b\star) + c\star & = a + (b\star + c\star) \\ & = a + (b+c)\star \end{align}  and such like.

The is faster to read and faster to say. Instead of “minus one” or “negative one” or “times negative one”, you just say “star”.

The is just a number, and it behaves like a number. Its role in an expression is the same as any other number's. It is just a special, one-off notation for a single, particularly important number.

Open questions:

1. Do we now need to replace the sign? If so, what should we replace it with?

2. Maybe the idea is sound, but the itself is a bad choice. It is slow to write. It will inevitably be confused with the * that almost every programming language uses to denote multiplication.

3. The real problem here is that the symbol is overloaded. Instead of changing the negation symbol to and eliminating the subtraction symbol, what if we just eliminated subtraction? None of the new notation would be incompatible with the old notation: still means exactly what it used to. But you are no longer allowed to abbreviate it to .

This would fix the problem of the taking too long to write: we would just use in its place. It would also fix the concern of point 2: now means or which is not hard to remember or to understand. Another happy result: notations like and do not change at all.

Curious footnote: While I was writing up the draft of this article, it had a reminder in it: “How did you and Darius come up with this?” I went back to our email to look, and I discovered the answer was:

1. Darius suggested the idea to me.
2. I said, “Hey, that's a great idea!”

I wish I could take more credit, but there it is. Hmm, maybe I will take credit for inspiring Darius! That should be worth at least fifty percent, perhaps more.

[ This article had some perinatal problems. It escaped early from the laboratory, in a not-quite-finished state, so I apologize if you are seeing it twice. ]

# Metamorphisms

It appears that I have insufficient time, or at least insufficient discipline, to contribute to this blog, except when I am on sabbatical. Which I now am… so let’s see if I can do better.

## Hylomorphisms

I don’t think I’ve written about them yet in this series—another story, for another day—but hylomorphisms consist of a fold after an unfold. One very simple example is the factorial function: ${n!}$ is the product of the predecessors ${[n,...,1]}$ of ${n}$. The predecessors can be computed with an unfold:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{preds} &::& \mathit{Integer} \rightarrow [\mathit{Integer}] \\ \mathit{preds} &=& \mathit{unfoldr}\;\mathit{step} \; \mathbf{where} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{step}\;0 &=& \mathit{Nothing} \\ \mathit{step}\;n &=& \mathit{Just}\;(n, n-1) \end{array} \end{array}$

and the product as a fold:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{prod} &::& [\mathit{Integer}] \rightarrow \mathit{Integer} \\ \mathit{prod} &=& \mathit{foldr}\;(\times)\;1 \end{array}$

and then factorial is their composition:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{factorial} &::& \mathit{Integer} \rightarrow \mathit{Integer} \\ \mathit{factorial} &=& \mathit{prod} \cdot \mathit{preds} \end{array}$

Another example is a tree-based sorting algorithm that resembles Hoare’s quicksort: from the input list, grow a binary search tree, as an unfold, and then flatten that tree back to a sorted list, as a fold. This is a divide-and-conquer algorithm; in general, these can be modelled as unfolding a tree of subproblems by repeatedly dividing the problem, then collecting the solution to the original problem by folding together the solutions to subproblems.

## An unfold after a fold

This post is about the opposite composition, an unfold after a fold. Some examples:

• ${\mathit{regroup}\;n = \mathit{group}\;n \cdot \mathit{concat}}$ to reformat a list of lists to a given length;
• ${\mathit{heapsort} = \mathit{flattenHeap} \cdot \mathit{buildHeap}}$ to sort a list;
• ${\mathit{baseconv}\;(b,c) = \mathit{toBase}\;b \cdot \mathit{fromBase}\;c}$ to convert a fraction from base ${c}$ to base ${b}$;
• ${\mathit{arithCode} = \mathit{toBits} \cdot \mathit{narrow}}$ to encode a text in binary by “arithmetic coding”.

In each of these cases, the first phase is a fold, which consumes some structured representation of a value into an intermediate unstructured format, and the second phase is an unfold, which generates a new structured representation. Their composition effects a change of representation, so we call them metamorphisms.

Hylomorphisms always fuse, and one can deforest the intermediate virtual data structure. For example, one need not construct the intermediate list in the factorial function; since each cell gets constructed in the unfold only to be immediately deconstructed in the fold, one can cut to the chase and go straight to the familiar recursive definition. For the base case, we have:

$\displaystyle \begin{array}{ll} & \mathit{factorial}\;0 \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;0) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;[\,] \\ = & \qquad \{ \mathit{prod} \} \\ & 1 \end{array}$

and for non-zero argument ${n}$, we have:

$\displaystyle \begin{array}{ll} & \mathit{factorial}\;n \\ = & \qquad \{ \mathit{factorial} \} \\ & \mathit{prod}\;(\mathit{preds}\;n) \\ = & \qquad \{ \mathit{preds} \} \\ & \mathit{prod}\;(n : \mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{prod} \} \\ & n \times \mathit{prod}\;(\mathit{preds}\;(n-1)) \\ = & \qquad \{ \mathit{factorial} \} \\ & n \times \mathit{factorial}\;(n-1) \\ \end{array}$

In contrast, metamorphisms only fuse under certain conditions. However, when they do fuse, they also allow infinite representations to be processed, as we shall see.

Fusion seems to depend on the fold being tail-recursive; that is, a ${\mathit{foldl}}$:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{foldl} &::& (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow \beta \\ \mathit{foldl}\;f\;b\;(a:x) &=& \mathit{foldl}\;f\;(f\;b\;a)\;x \\ \mathit{foldl}\;f\;b\;[\,] &=& b \end{array}$

For the unfold phase, we will use the usual list unfold:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{unfoldr} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{unfoldr}\;g\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{unfoldr}\;g\;b' \\ \mathit{Nothing} &\rightarrow& [\,] \end{array} \end{array}$

We define a metamorphism as their composition:

$\displaystyle \begin{array}{l} \mathit{meta} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma] \\ \mathit{meta}\;g\;f\;b = \mathit{unfoldr}\;g \cdot \mathit{foldl}\;f\;b \end{array}$

This transforms input of type ${[A]}$ to output of type ${[C]}$: in the first phase, ${\mathit{foldl}\;f\;b}$, it consumes all the input into an intermediate value of type ${B}$; in the second phase, ${\mathit{unfoldr}\;g}$, it produces all the output.

## Streaming

Under certain conditions, it is possible to fuse these two phases—this time, not in order to eliminate an intermediate data structure (after all, the intermediate type ${B}$ need not be structured), but rather in order to allow some production steps to happen before all the consumption steps are complete.

To that end, we define the ${\mathit{stream}}$ function as follows:

$\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l}{\mathit{stream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{stream}\;g\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{stream}\;g\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{stream}\;g\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& [\,] \end{array} \end{array} \end{array}$

This takes the same arguments as ${\mathit{meta}}$. It maintains a current state ${b}$, and produces an output element ${c}$ when it can; and when it can’t produce, it consumes an input element instead. In more detail, it examines the current state ${b}$ using function ${g}$, which is like the body of an unfold; this may produce a first element ${c}$ of the result and a new state ${b'}$; when it yields no element, the next element ${a}$ of the input is consumed using function ${f}$, which is like the body of a ${\mathit{foldl}}$; and when no input remains either, we are done.

The streaming condition for ${f}$ and ${g}$ is that

$\displaystyle g\;b = \mathit{Just}\;(c,b') \quad\Rightarrow\quad \forall a \mathbin{.} g\;(f\;b\;a) = \mathit{Just}\;(c, f\;b'\;a)$

Consider a state ${b}$ from which the body ${g}$ of the unfold is productive, yielding some ${\mathit{Just}\;(c,b')}$. From here we have two choices: we can either produce the output ${c}$, move to intermediate state ${b'}$, then consume the next input ${a}$ to yield a final state ${f\;b'\;a}$; or we can consume first to get the intermediate state ${f\;b\;a}$, and again try to produce. The streaming condition says that this intermediate state ${f\;b\;a}$ will again be productive, and will yield the same output ${c}$ and the same final state ${f\;b'\;a}$. That is, instead of consuming all the inputs first, and then producing all the outputs, it is possible to produce some of the outputs early, without jeopardizing the overall result. Provided that the streaming condition holds for ${f}$ and ${g}$, then

$\displaystyle \mathit{stream}\;g\;f\;b\;x = \mathit{meta}\;g\;f\;b\;x$

for all finite lists ${x}$.

As a simple example, consider the buffering’ process ${\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}$, where

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{uncons}\;x &=& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} [\,] &\rightarrow& \mathit{Nothing} \\ c:x' &\rightarrow& \mathit{Just}\;(c,x') \end{array} \end{array}$

Note that ${\mathit{unfoldr}\;\mathit{uncons} = \mathit{id}}$, so ${\mathit{meta}\;\mathit{uncons}\;(\mathbin{{+}\!\!\!{+}})\;[\,]}$ is just a complicated way of writing ${\mathit{concat}}$ as a ${\mathit{foldl}}$. But the streaming condition holds for ${\mathbin{{+}\!\!\!{+}}}$ and ${\mathit{uncons}}$ (as you may check), so ${\mathit{concat}}$ may be streamed. Operationally, the streaming version of ${\mathit{concat}}$ consumes one list from the input list of lists, then peels off and produces its elements one by one; when they have all been delivered, it consumes the next input list, and so on.

## Flushing

The streaming version of ${\mathit{concat}}$ is actually rather special, because the production steps can always completely exhaust the intermediate state. In contrast, consider the regrouping’ example ${\mathit{regroup}\;n = \mathit{meta}\;(\mathit{chunk}\;n)\;(\mathbin{{+}\!\!\!{+}})\;[\,]}$ where

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{chunk}\;n\;[\,] &=& \mathit{Nothing} \\ \mathit{chunk}\;n\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \end{array}$

from the introduction (here, ${\mathit{splitAt}\;n\;x}$ yields ${(y,z)}$ where ${y \mathbin{{+}\!\!\!{+}} z = x}$, with ${\mathit{length}\;y=n}$ when ${n \le \mathit{length}\;x}$ and ${y=x}$ otherwise). This transforms an input list of lists into an output list of lists, where each output chunk’ except perhaps the last has length ${n}$—if the content doesn’t divide up evenly, then the last chunk is short. One might hope to be able to stream ${\mathit{regroup}\;n}$, but it doesn’t quite work with the formulation so far. The problem is that ${\mathit{chunk}}$ is too aggressive, and will produce short chunks when there is still some input to consume. (Indeed, the streaming condition does not hold for ${\mathbin{{+}\!\!\!{+}}}$ and ${\mathit{chunk}\;n}$—why not?) One might try the more cautious producer ${\mathit{chunk'}}$:

$\displaystyle \begin{array}{@{}lclcl@{}} \mathit{chunk'}\;n\;x &\mid& n \le \mathit{length}\;x &=& \mathit{Just}\;(\mathit{splitAt}\;n\;x) \\ &\mid& \mathbf{otherwise} &=& \mathit{Nothing} \end{array}$

But this never produces a short chunk, and so if the content doesn’t divide up evenly then the last few elements will not be extracted from the intermediate state and will be lost.

We need to combine these two producers somehow: the streaming process should behave cautiously while there is still remaining input, which might influence the next output; but it should then switch to a more aggressive strategy once the input is finished, in order to flush out the contents of the intermediate state. To achieve this, we define a more general flushing stream operator:

$\displaystyle \begin{array}{@{}lcl@{}} \multicolumn{3}{@{}l@{}}{\mathit{fstream} :: (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow (\beta \rightarrow \alpha \rightarrow \beta) \rightarrow \beta \rightarrow [\alpha] \rightarrow [\gamma]} \\ \mathit{fstream}\;g\;h\;f\;b\;x &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{fstream}\;g\;h\;f\;b'\;x \\ \mathit{Nothing} &\rightarrow& \mathbf{case}\;x\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} a:x' &\rightarrow& \mathit{fstream}\;g\;h\;f\;(f\;b\;a)\;x' \\ {[\,]} &\rightarrow& h\;b \end{array} \end{array} \end{array}$

This takes an additional argument ${h :: \beta \rightarrow [\gamma]}$; when the cautious producer ${g}$ is unproductive, and there is no remaining input to consume, it uses ${h}$ to flush out the remaining output elements from the state. Clearly, specializing to ${h\;b=[\,]}$ retrieves the original ${\mathit{stream}}$ operator.

The corresponding metamorphism uses an apomorphism in place of the unfold. Define

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{apo} &::& (\beta \rightarrow \mathsf{Maybe}\;(\gamma,\beta)) \rightarrow (\beta \rightarrow [\gamma]) \rightarrow \beta \rightarrow [\gamma] \\ \mathit{apo}\;g\;h\;b &=& \mathbf{case}\;g\;b\;\mathbf{of} \\ & & \quad \begin{array}[t]{@{}lcl@{}} \mathit{Just}\;(c,b') &\rightarrow& c : \mathit{apo}\;g\;h\;b' \\ \mathit{Nothing} &\rightarrow& h\;b \end{array} \end{array}$

Then ${\mathit{apo}\;g\;h}$ behaves like ${\mathit{unfoldr}\;g}$, except that if and when ${g}$ stops being productive it finishes up by applying ${h}$ to the final state. Similarly, define flushing metamorphisms:

$\displaystyle \mathit{fmeta}\;g\;h\;f\;b = \mathit{apo}\;g\;h \cdot \mathit{foldl}\;f\;b$

Then we have

$\displaystyle \mathit{fstream}\;g\;h\;f\;b\;x = \mathit{fmeta}\;g\;h\;f\;b\;x$

for all finite lists ${x}$ if the streaming condition holds for ${f}$ and ${g}$. In particular,

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{regroup}\;n\;\mathit{xs} &=& \mathit{fmeta}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \\ &=& \mathit{fstream}\;(\mathit{chunk'}\;n)\;(\mathit{unfoldr}\;(\mathit{chunk}\;n))\;(\mathbin{{+}\!\!\!{+}})\;[\,]\;\mathit{xs} \end{array}$

on finite inputs ${\mathit{xs}}$: the streaming condition does hold for ${\mathbin{{+}\!\!\!{+}}}$ and the more cautious ${\mathit{chunk'}\;n}$, and once the input has been exhausted, the process can switch to the more aggressive ${\mathit{chunk}\;n}$.

## Infinite input

The main advantage of streaming is that it can allow the change-of-representation process also to work on infinite inputs. With the plain metamorphism, this is not possible: the ${\mathit{foldl}}$ will yield no result on an infinite input, and so the ${\mathit{unfoldr}}$ will never get started, but the ${\mathit{stream}}$ may be able to produce some outputs before having consumed all the inputs. For example, the streaming version of ${\mathit{regroup}\;n}$ also works for infinite lists, providing that the input does not end with an infinite tail of empty lists. And of course, if the input never runs out, then there is no need ever to switch to the more aggressive flushing phase.

As a more interesting example, consider converting a fraction from base 3 to base 7:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{foldr}\;\mathit{stepr}\;0 \quad \mathbf{where}\;\mathit{stepr}\;d\;x = (d+x)/3 \\ \mathit{toBase7} &=& \mathit{unfoldr}\;\mathit{next} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next}\;0 &=& \mathit{Nothing} \\ \mathit{next}\;x &=& \mathbf{let}\;y=7\times x\;\mathbf{in}\;\mathit{Just}\;(\lfloor y\rfloor, y-\lfloor y\rfloor) \end{array} \end{array}$

We assume that the input digits are all either 0, 1 or 2, so that the number being represented is in the unit interval.

The fold in ${\mathit{fromBase3}}$ is of the wrong kind; but we have also

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{fromBase3} &=& \mathit{extract} \cdot \mathit{foldl}\;\mathit{stepl}\;(0,1) \quad \mathbf{where}\; \mathit{stepl}\;(u,v)\;d = (u \times 3 + d, v / 3) \end{array}$

Here, the intermediate state ${(u,v)}$ can be seen as a defunctionalized representation of the function ${(v\times) \cdot (u+)}$, and ${\mathit{extract}}$ applies this function to ${0}$:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{apply}\;(u,v)\;x &=& v \times (u + x) \\ \mathit{extract}\;(u,v) &=& \mathit{apply}\;(u,v)\;0 \end{array}$

Now there is an extra function ${\mathit{extract}}$ between the ${\mathit{foldl}}$ and the ${\mathit{unfoldr}}$; but that’s no obstacle, because it fuses with the ${\mathit{unfoldr}}$:

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{toBase7} \cdot \mathit{extract} &=& \mathit{unfoldr}\;\mathit{next'} \quad \mathbf{where}\; \begin{array}[t]{@{}lcl@{}} \mathit{next'}\;(0,v) &=& \mathit{Nothing} \\ \mathit{next'}\;(u,v) &=& \begin{array}[t]{@{}l} \mathbf{let}\;y = \lfloor{7 \times u \times v}\rfloor\;\mathbf{in} \\ \mathit{Just}\;(y,(u - y/(v \times 7), v \times 7)) \end{array} \end{array} \end{array}$

However, the streaming condition does not hold for ${\mathit{stepl}}$ and ${\mathit{next'}}$. For example,

$\displaystyle \begin{array}{@{}lcl@{}} \mathit{next'}\;(1,{{}^{1\!}/_{\!3}}) &=& \mathit{Just}\;(2, ({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!3}})) \\ \mathit{next'}\;(\mathit{stepl}\;(1,{{}^{1\!}/_{\!3}})\;1) &=& \mathit{next'}\;(4,{{}^{1\!}/_{\!9}}) \\ &=& \mathit{Just}\;(3,({{}^{1\!}/_{\!7}},{{}^{7\!}/_{\!9}})) \end{array}$

That is, ${0.1_3 \simeq 0.222_7}$, but ${0.11_3 \simeq 0.305_7}$, so it is premature to produce the first digit 2 in base 7 having consumed only the first digit 1 in base 3. The producer ${\mathit{next'}}$ is too aggressive; it should be more cautious while input remains that might invalidate a produced digit.

Fortunately, on the assumption that the input digits are all 0, 1, or 2, the unconsumed input—a tail of the original input—again represents a number in the unit interval; so from the state ${(u,v)}$ the range of possible unproduced outputs represents a number between ${\mathit{apply}\;(u,v)\;0}$ and ${\mathit{apply}\;(u,v)\;1}$. If these both start with the same digit in base 7, then (and only then) is it safe to produce that digit. So we define

$\displaystyle \mathit{next''}\;(u,v) = \mathbf{if}\;\lfloor{u \times v \times 7}\rfloor = \lfloor{(u+1) \times v \times 7}\rfloor\;\mathbf{then}\;\mathit{next'}\;(u,v)\;\mathbf{else}\;\mathit{Nothing}$

and we have

$\displaystyle \mathit{unfoldr}\;\mathit{next'} = \mathit{apo}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'})$

Now, the streaming condition holds for ${\mathit{stepl}}$ and ${\mathit{next''}}$ (as you may check), and therefore

$\displaystyle \mathit{toBase7}\;(\mathit{fromBase3}\;x) = \mathit{fstream}\;\mathit{next''}\;(\mathit{unfoldr}\;\mathit{next'})\;\mathit{stepl}\;(0,1)\;x$

on finite digit sequences ${x}$ in base 3. Moreover, the streaming program works also on infinite digit sequences, where the original does not.

(Actually, the only way this could possibly produce a finite output in base 7 would be for the input to be all zeroes. Why? If we are happy to rule out this case, we could consider only the case of taking infinite input to infinite output, and not have to worry about reaching the end of the input or flushing the state.)

# Functional Conf

This coming weekend, I will present Haskell SpriteKit — a Purely Functional API for a Stateful Animation System and Physics Engine as well as a workshop on Functional Programming in Swift at Functional Conf in Bangalore.

# Backend Ruby and Haskell engineer at Health eFilings (Full-time)

Our backend engineering team manages the ingestion and normalization of data sets, from data extraction through to product delivery. We want to work smarter instead of harder, and create domain specific languages, meta-programming etc. where possible.

Our current code base is written in Ruby and Coffee Script, but some new modules are being written in Haskell. You will be on the front lines of creating a Haskell-based infrastructure that is maintainable and can scale to support our needs as we grow.

We currently expect that about 80% of your work will be in Ruby/CoffeeScript, and 20% in Haskell, but that ratio will decrease over time as we move more of our functionality to Haskell. (The faster you can work to migrate functionality to Haskell, the more Haskell you will be doing.)

WHAT WE WILL EXPECT FROM YOU

You will have ownership of an entire module, including responsibility for:

• Creating new features in a clean and maintainable way
• Re-factoring existing code to ensure that we stay agile
• Reviewing teammates’ code and providing feedback
• Keeping yourself focused and your projects on track
• An “I can run through walls” mentality to ensure that goals are met
• Answering questions from our implementation team and squashing bugs on a monthly support rotation

We are a small team (four engineers), and so it is critical that you be a team player, willing to pitch in and help out your colleagues.

WHAT YOU CAN EXPECT FROM US

• Autonomy to solve problems in the way you best see fit
• A manager who is accountable for ensuring you meet your professional goals
• A team who helps each other and always strives to improve
• The time to focus on creating the right solution, instead of the easiest one

REQUIREMENTS

• Professional experience as a software engineer
• Experience with Haskell and Ruby
• A desire for continual self-improvement
• An understanding of best practices regarding maintainability and scalability
• Must have US work authorization and be located in the US (we cannot sponsor visas at this time)
• There are no formal education requirements for this position

BONUS POINTS

• Experience with data scraping and parsing

LOCATION

This is expected to be a remote position, although our Madison, Wisconsin office is also available as a work location.

Get information on how to apply for this position.

# Algebraic Data Types in Java

At Helix we often code backend services in java. I find modern java acceptable as a language for getting things done. As a long time haskell developer, however, I find java’s facilities for data types frustrating indeed. These frustrations are twofold. Java lacks support for algebraic data types (ADTs), and requires large amounts of boilerplate to define even simple types.

When designing systems, I place great value in applying the "make illegal states unrepresentable" principle1. Using ADTs to more accurately model data is a excellent step in this direction. However, it’s a burden to do in languages like java that lack support for sum types.

Even for regular product types (ie records of fields) java can be tedious. Defining a record of a few fields should really only take a corresponding few lines of code. Yet for a useful value type in java one will generally need to write: constructors, accessors, a comparison function, a hash implementation, serialisation logic etc. It’s common in the java world to use IDEs to automatically generate this kind of boilerplate, but subtle bugs can creep in over time as the once generated code isn’t manually updated to reflect subsequent changes in the data model.

Hence, at Helix we now often use my ADL language to define data types, and generate the corresponding java code from them. As a tiny example, these adl definitions (see complete file here):

    struct Rectangle
{
Double width;
Double height;
};

union Picture
{
Circle circle;
Rectangle rectangle;
Vector<Picture> composed;
Translated<Picture> translated;
};

result in the corresponding Rectangle.java and Picture.java. These two definitions alone correspond to 280 lines of java code (that you really don’t want to write and maintain). As can be seen in the Translated<> type, parametric polymorphism is supported.

I find that being able to define data types concisely encourages me to build more accurate data models, resulting in systems that are more robust and better reflect the problem domain. And ADL’s multi language support (java, haskell, typescript) allows us to easily serialize and transfer the corresponding data values between our java services, and our typescript web and mobile UIs.

1. attributed to Yaron Minsky

# Scala Developer at LeadIQ (Full-time)

Are you the type of engineer who punches juke boxes to make the music start? Do you consider riding your motorcycle off into the a sunset a personal hobby? Is architecting a system from the ground up no big deal to you? We're looking for full-time Scala developer to make this happen.

The Product

We are on a mission to revolutionize Sales industry using data science. Our product helps our customers to collect and enrich their target prospects. Our internal data processing combines human intelligence and data science to enable our customers to find perfect contact information and save to their existing platforms like Salesforce, etc.

The Challenge

• We are at an exciting stage in our growth. We are getting traction with big customers, scaling out, and solving increasingly complex engineering problems.

• Our systems are mostly written in Scala. We have used Kafka as backbone to communicate between our API server and micro-services. Smart architecture design is crucial in order to guarantee our micro-services based systems run smoothly and reliably.

• We're looking for someone who can drive our product backend integration features, refactor existing code for faster responses and becomes an important asset to the rest of the engineering team.

• Data quality is one of the critical factors to make our product successful. We often have needs to process 3rd parties data and clean existing data using Spark. So you need to be comfortable writing Spark scripts.

• We have very complex integrations with 3rd parties systems like Salesforce, etc. These integrations are core to what we're offering to our customers. We're looking for someone who is willing to listen to customer feedback to improve existing features and provide new features for customer success.

The Stack

Scala, Kafka, Spark, MongoDB, ElasticSearch, Docker, Vue.js

The Team

We want team members with attributes like:

• Focus on delivering value to the customer
• Strong belief in collaboration
• Passion that drives you to execute and innovate
• Ability to self-manage and take ownership of a feature
• Ability to juggle many projects and responsibilities
• Extremely entrepreneurial and self-driven
• Not afraid of a steep learning curve
• Passionate about building a big business that transforms the sales industry
• Exceptional at writing scalable, production-ready code
• Thrive in a fast-paced environment
• Avoid over-engineering
• Simple designs and fast execution
• Discipline in following process and documenting your work

These personality traits define the culture we are building and are more important to us than a particular set of technical skills.

The Responsibilities

If you join LeadIQ, you will learn a lot: In terms of technical ability there are many cool tools, technologies, patterns and other great developers that will sharpen your skills. Personally you be given the chance to step up, lead and make your mark in a growing startup as we tackle the challenges in our next phase of growth.

On the technical front, we need you skilled in:

• Scala (but experience in another functional language helps, e.g. Haskell or Clojure)
• Play framework
• Concurrency (futures, actors, basic understanding of threads)

So if you feel like you're a good fit for us, drop us a line! We love meeting developers who are excited by our product!

Get information on how to apply for this position.

# Future proofing test suites

I'll start with the specific case I've seen pop up a few times recently, and then expand to the general. If you're a package author who has been affected by this, please note: I'm putting this information into a blog post since it's easier to state this once and link to it rather than rewrite an explanation on lots of different bug trackers.

hlint is a great tool for getting advice on improving your Haskell codebase (another great Neil Mitchell product). And as such tools go, hlint has new versions which improve its ability to provide useful advice. This means that, sometimes, code which triggered no hlint warnings previously may suddenly present with such warnings under a new hlint version.

Twice recently in my Stackage curation, I've seen a number of test suites fail, even though the code for those packages was unmodified. It turns out that the upgrade to a new version of hlint caused a previously successful test suite to now fail. Clearly the code isn't suddenly broken because a new version of hlint has been released, but as far as the diagnostics of test suite failures are concerned, that's exactly what happened.

## Recommendation

I do strongly recommend projects use hlint to get code improvements. And I've seen some great results with using it as part of the CI process, such as on Stack. (For the record: it wasn't my idea and I didn't implement it. I was just pleasantly surprised when my PRs failed because I had some style errors.) However, making the test suite for the entire package fail because of a new version of hlint is too much. Therefore:

• DO Have some way to run hlint from your CI process, if you want these warnings to block PRs. There are two approaches I can think of:

• The way Stack does it: have a separate part of the build matrix just for style errors. The cabal file for the project itself knows nothing about hlint.
• Via a test suite in your cabal file which is disabled by default. Then: turn on that test suite with a flag from your CI configuration.
• DON'T Set up your package which is uploaded to Hackage/built by Stackage such that it will fail if a style-based error occurs.

## General recommendation

The general takeaway from this is: when you're building your code on CI, be as strict as you want. Set high standards, block PRs, call master broken, for whatever trivial or non-trivial issues you deem worthy. Turn on -Wall -Werror, respect hlint, error out if someone uses tabs* or includes trailing whitespace. That's all good.

* Cue necessary tabs-vs-spaces argument

However, when you're releasing your code elsewhere, make the tests as lenient as possible on optional features. If the code fails to build: that's a problem. If the code builds, but returns incorrect runtime results: that's a problem. These should stop build systems like Stackage from including your package. But stylistic issues, or newly introduced warnings from the compiler, or myriad other issues, should not trigger a failure for downstream consumers of your package.

# No, it is not a compiler error. It is never a compiler error.

When I used to hang out in the comp.lang.c Usenet group, back when there was a comp.lang.c Usenet group, people would show up fairly often with some program they had written that didn't work, and ask if their compiler had a bug. The compiler did not have a bug. The compiler never had a bug. The bug was always in the programmer's code and usually in their understanding of the language.

When I worked at the University of Pennsylvania, a grad student posted to one of the internal bulletin boards looking for help with a program that didn't work. Another graduate student, a super-annoying know-it-all, said confidently that it was certainly a compiler bug. It was not a compiler bug. It was caused by a misunderstanding of the way arguments to unprototyped functions were automatically promoted.

This is actually a subtle point, obscure and easily misunderstood. Most examples I have seen of people blaming the compiler are much sillier. I used to be on the mailing list for discussing the development of Perl 5, and people would show up from time to time to ask if Perl's if statement was broken. This is a little mind-boggling, that someone could think this. Perl was first released in 1987. (How time flies!) The if statement is not exactly an obscure or little-used feature. If there had been a bug in if it would have been discovered and fixed by 1988. Again, the bug was always in the programmer's code and usually in their understanding of the language.

Here's something I wrote in October 2000, which I think makes the case very clearly, this time concerning a claimed bug in the stat() function, another feature that first appeared in Perl 1.000:

On the one hand, there's a chance that the compiler has a broken stat and is subtracting 6 or something. Maybe that sounds likely to you but it sounds really weird to me. I cannot imagine how such a thing could possibly occur. Why 6? It all seems very unlikely.

Well, in the absence of an alternative hypothesis, we have to take what we can get. But in this case, there is an alternative hypothesis! The alternative hypothesis is that [this person's] program has a bug.

Now, which seems more likely to you?

• Weird, inexplicable compiler bug that nobody has ever seen before

or

• Programmer fucked up

Hmmm. Let me think.

I'll take Door #2, Monty.

Presumably I had to learn this myself at some point. A programmer can waste a lot of time looking for the bug in the compiler instead of looking for the bug in their program. I have a file of (obnoxious) Good Advice for Programmers that I wrote about twenty years ago, and one of these items is:

Looking for a compiler bug is the strategy of LAST resort. LAST resort.

Anyway, I will get to the point. As I mentioned a few months ago, I built a simple phone app that Toph and I can use to find solutions to “twenty-four puzzles”. In these puzzles, you are given four single-digit numbers and you have to combine them arithmetically to total 24. Pennsylvania license plates have four digits, so as we drive around we play the game with the license plate numbers we see. Sometimes we can't solve a puzzle, and then we wonder: is it because there is no solution, or because we just couldn't find one? Then we ask the phone app.

The other day we saw the puzzle «5 4 5 1», which is very easy, but I asked the phone app, to find out if there were any other solutions that we missed. And it announced No solutions.” Which is wrong. So my program had a bug, as my programs often do.

The app has a pre-populated dictionary containing all possible solutions to all the puzzles that have solutions, which I generated ahead of time and embedded into the app. My first guess was that bug had been in the process that generated this dictionary, and that it had somehow missed the solutions of «5 4 5 1». These would be indexed under the key 1455, which is the same puzzle, because each list of solutions is associated with the four input numbers in ascending order. Happily I still had the original file containing the dictionary data, but when I looked in it under 1455 I saw exactly the two solutions that I expected to see.

So then I looked into the app itself to see where the bug was. Code Studio's underlying language is Javascript, and Code Studio has a nice debugger. I ran the app under the debugger, and stopped in the relevant code, which was:

    var x = [getNumber("a"), getNumber("b"), getNumber("c"), getNumber("d")].sort().join("");


This constructs a hash key (x) that is used to index into the canned dictionary of solutions. The getNumber() calls were retrieving the four numbers from the app's menus, and I verified that the four numbers were «5 4 5 1» as they ought to be. But what I saw next astounded me: x was not being set to 1455 as it should have been. It was set to 4155, which was not in the dictionary. And it was set to 4155 because

the built-in sort() function

was sorting the numbers

into

the

wrong

order.

For a while I could not believe my eyes. But after another fifteen or thirty minutes of tinkering, I sent off a bug report… no, I did not. I still didn't believe it. I asked the front-end programmers at my company what my mistake had been. Nobody had any suggestions.

Then I sent off a bug report that began:

I think that Array.prototype.sort() returned a wrongly-sorted result when passed a list of four numbers. This seems impossible, but …

I was about 70% expecting to get a reply back explaining what I had misunderstood about the behavior of Javascript's sort().

But to my astonishment, the reply came back only an hour later:

Wow! You're absolutely right. We'll investigate this right away.

In case you're curious, the bug was as follows: The sort() function was using a bubble sort. (This is of course a bad choice, and I think the maintainers plan to replace it.) The bubble sort makes several passes through the input, swapping items that are out of order. It keeps a count of the number of swaps in each pass, and if the number of swaps is zero, the array is already ordered and the sort can stop early and skip the remaining passes. The test for this was:

    if (changes <= 1) break;


but it should have been:

    if (changes == 0) break;


Ouch.

The Code Studio folks handled this very creditably, and did indeed fix it the same day. (The support system ticket is available for your perusal, as is the Github pull request with the fix, in case you are interested.)

I still can't quite believe it. I feel as though I have accidentally spotted the Loch Ness Monster, or Bigfoot, or something like that, a strange and legendary monster that until now I thought most likely didn't exist.

A bug in the sort() function. O day and night, but this is wondrous strange!

[ Addendum 20171113: Thanks to Reddit user spotter for pointing me to a related 2008 blog post of Jeff Atwood's, “The First Rule of Programming: It's Always Your Fault”. ]

[ Addendum 20171113: Yes, yes, I know sort() is in the library, not in the compiler. I am using “compiler error” as a synecdoche for “system software error”. ]

[ Addendum 20171116: I remembered examples of two other fundamental system software errors I have discovered, including one honest-to-goodness compiler bug. ]

# Ghcid with VS Code

Summary: New versions of Ghcid and the VS Code extension work even better together.

I've just released Ghcid v0.6.8 and the associated VS Code extension haskell-ghcid v0.2.0. Together they vastly simplify the Ghcid VS Code experience.

A new feature in Ghcid is that if there is a .ghcid file in the current directory it will load it as additional arguments. For example, in the Shake repo I have a .ghcid file:

-c "ghci -fno-code -ferror-spans"

Which tells ghcid to not guess at the command (e.g. using stack if you have a .stack-work) but always run ghci -fno-code -ferror-spans. This command works because I have a .ghci file which loads all the necessary files, while -fno-code speeds up compilation and -ferror-spans gives better error highlighting.

Ghcid VS Code starts ghcid

A new feature in the VS Code extension is the action Start Ghcid which starts a new ghcid terminal, writes the output to a temporary file, and uses that output to populate the Problems pane. Importantly, the extension runs ghcid with no command line arguments, so having a sensible .ghcid lets you control what it does.

The effect of these changes is that to start ghcid in VS Code is now a few key strokes, whereas before it required special flags, opening files, running commands etc.

# Nix on the <br/> Windows Subsystem for Linux

Jonas Chevalier

Nix on Windows: does it run yet? That's the question I wondered about while testing the latest NixOS release, version 17.09. To that end, I had the idea of running the Nix installation process from inside the Windows Subsystem for Linux (WSL) see if it worked. And it worked! Success!

So what does this mean?

You might remember that the Windows NT kernel used to have a POSIX layer. Unfortunately, The POSIX layer always had compatibility issues with BSD and Linux software, because typical applications seldom fit completely and entirely within the confines of an age old API. Nevertheless, the NT kernel was designed from the start to support different subsystems, not just Win32, and the POSIX layer of old was a step in the right direction. The WSL is a revival of that idea but with a specific focus on the Linux ABI. It means that it is now possible to run Linux software natively on Windows. Think of it as reverse Wine. Linux software can execute Windows software and vice versa.

It's not perfect yet. I/O and symlink resolution seem to be slow and not all Linux syscalls have been implemented yet. This is more about the promised land that Microsoft is showing. WSL is not available on the server edition yet, but it looks like they are going to deliver on it.

At Tweag.io we often use Nix to declaratively specify reproducible build environments for our projects and those of our clients. Nix is a good fit for project that mix different languages. It works really well at providing reproducible builds and compose the various parts of the project with external dependencies. Unfortunately it is also not supported on Windows so we have to decide upfront whether to use it based in part on whether Windows is going to become a target platform or not. Thanks to WSL it looks like we will have an escape hatch, at least for non graphical applications.

Another potential use-case that I see is for Haskell development. Today, a lot of good software is being developed directly on top of Linux and macOS. For some of these projects Windows is not a prime target environment anymore. The Glasgow Haskell Compiler (GHC) is actually quite well behaved on Windows when compiling pure Haskell code. But as soon as C library dependencies are involved, the story gets a lot more complicated. In that case, deploying via WSL might just be easier than aiming for a native Windows port.

## How to install

Enable and install WSL following these instructions: https://msdn.microsoft.com/en-us/commandline/wsl/install_guide.

Make sure to have the latest version of Windows 10 installed. I had this version at the time of install:

• Windows Edition: Windows 10 Pro
• Windows Version: 1703
• Windows OS Build: 15063.540
• System Type: 64-bit operating system

Start the “Bash On Ubuntu On Windows” program and type curl https://nixos.org/nix/install | sh.

## Known issues

WSL is an experimental subsystem still. At this point in time, there are still important issues to know about. Here are the workarounds I came up with:

• curl is hanging. Hit Ctrl+C and retry.
• Nix installation crash. Older versions of WSL didn't support all the syscalls needed by Nix. Update Windows and try again.
• nix-shell is broken. Fails with synchronous I/O disk error https://github.com/NixOS/nix/issues/1203. Here's a workaraund: edit /etc/nix/nix.conf and add use-sqlite-wal=false
• It’s slow. Yes, especially I/O and symlinks seem to be quite slow. The only solution here is to wait for Microsoft to optimise their syscalls.
• Nix environment is not started in new logins. Workaround: Run source ~/.profile

## Conclusion

For now, it's just a technology preview that opens new possibilities. Hopefully in the future, when the performance of I/O operations improves, it will also be enjoyable to develop Linux programs under WSL directly. Meanwhile, Microsoft has put out useful resources to go further with WSL:

# A proof by contradiction is not a proof that derives a contradiction

It is well-known that constructivists renounce “proof by contradiction”, and that classicists scoff at the critique.  “Those constructivists,” the criticism goes, “want to rule out proofs by contradiction.  How absurd!  Look, Pythagoras showed that the square root of two is irrational by deriving a contradiction from the assumption that it is rational.  There is nothing wrong with this.  Ignore them!”

On examination that sort of critique fails, because a proof by contradiction is not a proof that derives a contradiction.  Pythagoras’s  proof is valid, one of the eternal gems of mathematics.  No one questions the validity of that argument, even if they question proof by contradiction.

Pythagoras’s Theorem expresses a negation: it is not the case that the square root of two can be expressed as the ratio of two integers.  Assume that it can be so represented.  A quick deduction shows that this is impossible.  So the assumption is false.  Done.  This is a direct proof of a negative assertion; it is not a “proof by contradiction”.

What, then, is a proof by contradiction?  It is the affirmation of a positive statement by refutation of its denial.  It is a direct proof of the negation of a negated assertion that is then pressed into service as a direct proof of the assertion, which it is not.  Anyone is free to ignore the distinction for the sake of convenience, as a philosophical issue, or as a sly use of “goto” in a proof, but the distinction nevertheless exists and is important.  Indeed, part of the beauty of constructive mathematics is that one can draw such distinctions. Once drawn, such distinctions can be disregarded; once blurred, forever blurred, a pure loss of expressiveness.

For the sake of explanation, let me rehearse a standard example of a genuine proof by contradiction.  The claim is that there exists irrationals a and b such that a to the b power is rational.  Here is an indirect proof, a true proof by contradiction.  Let us prove instead that it is impossible that any two irrationals a and b are such that a to the b is irrational.  This is a negative statement, so of course one proves it by deriving a contradiction from assuming that which is negated.  Suppose, for a contradiction, that for every two irrationals a and b, the exponentiation a to the b power is irrational.  We know from Pythagoras that root two is irrational, so plug it in for both a and b, and conclude that root two to the root two power is irrational.  Now use the assumption again, taking a to be root two to the root two, and b to be root two.  Calculate a to the power of b, it is two, which is eminently rational.  Contradiction.

We have now proved that it is not the case that every pair of irrationals, when exponentiated, give an irrational.  There is nothing questionable about this proof.  But it does not prove that there are two irrationals whose exponent is rational!  If you think it does, then I ask you, please name them for me.  That information is not in this proof (there are other proofs that do name them, but that is not relevant for my purposes).  You may, if you wish, disregard the distinction I am drawing, that is your prerogative, and neither I nor anyone has any problem with that.  But you cannot claim that it is a direct proof, it is rather an indirect proof, that proceeds by refuting the negative of the intended assertion.

So why am I writing this?  Because I have learned, to my dismay, that in U.S. computer science departments–of all places!–students are being taught, erroneously, that any proof that derives a contradiction is a “proof by contradiction”.  It is not.  Any proof of a negative must proceed by contradiction.  A proof by contradiction in the long-established sense of the term is, contrarily, an indirect proof of a positive by refutation of the negative.  This distinction is important, even if you want to “mod out” by it in your work, for it is only by drawing the distinction that one can even define the equivalence with which to quotient.

That’s my main point.  But for those who may not be familiar with the distinction between direct and indirect proof, let me take the opportunity to comment on why one might care to draw such a distinction.  It is entirely a matter of intellectual honesty: the information content of the foregoing indirect proof does not fulfill the expectation stated in the theorem.  It is a kind of boast, an overstatement, to claim otherwise.  Compare the original statement with the reformulation used in the proof.  The claim that it is not the case that every pair of irrationals exponentiate to an irrational is uncontroversial.  The proof proves it directly, and there is nothing particularly surprising about it.  One would even wonder why anyone would bother to state it.  Yet the supposedly equivalent claim stated at the outset appears much more fascinating, because most people cannot easily think up an example of two irrationals that exponentiate to rationals.  Nor does the proof provide one. Once, when shown the indirect proof, a student of mine blurted out “oh that’s so cheap.”  Precisely.

Why should you care?  Maybe you don’t, but there are nice benefits to keeping the distinction, because it demarcates the boundary between constructive proofs, which have direct interpretation as functional programs, and classical proofs, which have only an indirect such interpretation (using continuations, to be precise, and giving up canonicity).  Speaking as a computer scientist, this distinction matters, and it’s not costly to maintain.  May I ask that you adhere to it?

Edit: rewrote final paragraph, sketchy and irrelevant, and improved prose throughout. Word-smithing, typos.

Filed under: Programming, Research, Teaching

# Amazon GovCloud has no Route53! How to solve this?

Users of Amazon Web Services are used to having Route53 available to provide DNS records within their clouds. However, if you’re a government contractor or an agency, your deployments likely live on an Amazon GovCloud environment. And on that environment, Route53 is not yet available. So does that mean you should forego deploying any services that need custom DNS records on GovCloud? Of course not!

# Sequentiality as the Essence of Parallelism

I recently thought of a nice way to structure a language for parallel programming around the concept of sequential composition.  Think of parallelism as the default—evaluate everything in parallel unless the semantics of the situation precludes it.  Sums are posterior to summands, but the summands can be evaluated simultaneously.  You need a way to express the necessary dependencies without introducing any spurious ones.

There’s a tool for that, called lax logic, introduced by Fairtlough and Mendler and elaborated by Davies and Pfenning, which I use extensively in PFPL.  The imperative language Modernized Algol is formulated in the lax style, distinguishing two modes, or levels, of syntax, the (pure) expressions and the (impure) commands.  The lax modality, which links the two layers, behaves roughly like a monad, but, all the hype notwithstanding, it is not the central player.  It’s the modes, not the modality, that matter.  (See the Commentary on PFPL for more.)

The lax modality is just the ticket for expressing parallelism.  Rather than separate expressions from commands, here we distinguish between values and computations.  The names are important, to avoid possible confusion.  Values are fully evaluated; they are not a source of parallelism.  (If they were called “pure”, it would be irresistible to think otherwise.)  Computations have yet to be evaluated; they engender parallelism by sequential composition.  What?  No, you didn’t nod off! Let me explain.

Parallelism is all about the join points.  If parallel execution is the default, then the job of the programmer is not to induce parallelism, but to harness it.  And you do that by saying, “this computation depends on these others.”  Absent that, there is nothing else to say, just go for it.  No sub-languages.  No program analysis.  No escaping the monad.  Just express the necessary dependencies, and you’re good to go.

So, what are the join points?  They are the elimination forms for two parallel modalities.  They generalize the sequential case to allow for statically and dynamically determined parallelism.   A value of parallel product type is a tuple of unevaluated computations, a kind of “lazy” tuple (but not that kind of laziness!).  The elimination form evaluates all of the component computations in parallel, creates a value tuple from their values, and passes it to the body of the form.  A value of parallel sequence type is a generator consisting of two values, a natural number n indicating its size, and a function determining the ith component computation for each 1≤i<n.  The elimination form activates all n component computations, binds their values to a value sequence, and passes it to the body of the form.

The join point effects a change of type, from encapsulated computations to evaluated values, neatly generalizing sequential composition from a unary to a binary join.  If you’d like, the parallel products and parallel sequences are “generalized monads” that encapsulate not just one, but many, unevaluated computations.  But they are no more monads than they are in any other functional language: the equational laws need not hold in the presence of, say, divergence, or exceptions.

The dynamics assigns costs to computations, not to values, whose cost of creation has already been paid.  The computation that just returns a value has unit work and span.  Primitive operations take unit work and span.  The sequential composition of a parallel product with n components induces span one more than the maximum span of the constituents, and induces work one more than the sum of their work.  The dynamics of sequential composition for parallel sequences is similar, with the “arity” being determined dynamically rather than statically.

Programming in this style means making the join points explicit.  If you don’t like that, you can easily define derived forms—and derived costs—for constructs that do it for you.    For example, a pair of computations might be rendered as activating a parallel pair of its components, then returning the resulting value pair.  And so on and so forth.  It’s no big deal.

En passant this solves a nasty technical problem in a substitution-based cost semantics that does not make the modal distinction.  The issue is, how to distinguish between the creation of a value, and the many re-uses of it arising from substitution?  It’s not correct to charge again and again for the value each time you see it (this cost can be asymptotically significant), but you have to charge for creating it (it’s not free, and it can matter).  And, anyway, how is one to account for the cost of assessing whether an expression is, in fact, a value?  The usual move is to use an environment semantics to manage sharing.  But you don’t have to, the modal framework solves the problem, by distinguishing between a value per se; the computation that returns it fully created; and the computation that incrementally construcs it from its constituent parts.  It’s the old cons-vs-dotted pair issue, neatly resolved.

Please see Section 10 of the Commentary on PFPL for a fuller account.  The main idea is to generalize a type of single unevaluated computations, which arises in lax logic, to types of statically- and dynamically many unevaluated computations.  The bind operation becomes a join operation for these computations, turning a “lazy” tuple or sequence into eager tuples or sequences.  (This is not laziness in the sense of lazy evaluation though!)

Updates: word-smithing, added cite to Davies-Pfenning, replaced cite of course notes with reference to commentary.

Filed under: Programming, Research, Teaching Tagged: functional programming, parallelism, programming languages, semantics

## November 05, 2017

### Douglas M. Auclair (geophf)

• October 20th, 2017:
You have a list of numbers: [1,2,3,4]
You have a list of the same length of number fns: [succ, id, id, succ]
You want: [2,2,3,5]
•  🇪🇺 Cλément D  🌈  🐇 @clementd zipWith (flip ($)) ? • he adds: zipWith (flip id) is a bit shorter tho • Simon Courtenage @SCourtenage zipWith ($) [succ,id,id,succ] [1,2,3,4]
• lukasz @lukaszklekot getZipList $ZipList [succ, id, id, succ] <*> ZipList [1, 2, 3, 4] • Alexey Radkov @sheshanaag (map (uncurry ($)) .) . zip
• October 5th, 2017: "reverse the sequencing"
You have [[(1,2),(1,3),(1,7)],[(9,2)],[(11,3)]]
You want [(1,[2,3,7]),(9,[2]),(11,[3])]
• bazzargh @bazzargh map ((,) <$> head.(map fst) <*> (map snd)) • bazzargh @bazzargh map ((first head).unzip) • Chris Martin @chris__martin \x -> [(a, b : fmap snd xs) | Just ((a, b) :| xs) <- fmap="" li="" nonempty="" x=""> • Simon Courtenage @SCourtenage fmap (\x -> (fst . head$ x, fmap snd x))
• Denis Stoyanov  🐜 @xgrommx Your solution nice) but u can do it with point free style like
• Denis Stoyanov  🐜 @xgrommx My solution is ugly, but I wanna to solve it with traverse)
• fmap(first head . traverse (first (:[])))
• Andreas Källberg @Anka213 map$fst.head&&&map snd • Scott Fleischma‏ @scottfleischman traverse$ _1
(\case
[y] -> Just y
_ -> Nothing
. nub
)
. unzip
:: [[(Int, Int)]] -> Maybe [(Int, [Int])]
• Scott Fleischman @scottfleischman
let
•  sing [] = Left "Too few"
sing [x] = Right x
sing (_ : _) = Left "Too many"
valid = sing . nub
go = _1 valid . unzip
in traverse go
• matt @themattchan map ((head *** id ) . unzip)
• October 3rd, 2017:
you have [(1,[2,3,4]),(10,[5,6,7])]
you want [(1,2),(1,3),(1,4),(10,5),(10,6),(10,7)]

or, generally: [(a,[b])] -> [(a,b)]

Go!

• bazzargh @bazzargh (uncurry (zip . repeat) =<<)
• Denis Stoyanov  🐜 @xgrommx fmap (uncurry (liftA2(,) . (:[])))
• Darren G @Kludgy I like that this doesn't unnecessarily implicate the sequentiality of bind.
• Darren G @Kludgy Funny this same product came up at work last week.
concatMap a,bs) -> fmap (\b -> (a,b)) bs ## November 04, 2017 ### Neil Mitchell # Understanding HLint rules Summary: I added a degenerate foldr to map rule in the new version of HLint, here I describe how it works. I've just released HLint 2.0.10, which includes a rule to recognise uses of foldr that should really be map. As an example: foldr (\curr acc -> (+1) curr : acc) [] Can be rewritten as: map (\curr -> (+1) curr) Which is much more readable (and then subsequently HLint will suggest map (+1), which is vastly clearer than the initial foldr). The change required to HLint was to add a rule to the hlint.yaml saying: - warn: {lhs: "foldr (\\c a -> x : a) []", rhs: "map (\\c -> x)"} You can read this statement as saying if you see foldr (\c a -> x : a) [], suggest map (\c -> x) as a warning. The HLint matching engine then applies that template to every subexpression in your program. In the rest of the post I'll talk through the steps HLint performs. Step 1: Unification The first step is to try unifying the template foldr (\c a -> x : a) [] against the users subexpression, namely foldr (\curr acc -> (+1) curr : acc) []. HLint is trying to find assignments for the single-letter variables in the template (namely c, a and x) which cause it to match the subexpression. Unification proceeds top-down, and if it finds anything concrete that does not match (e.g. the user had written foldl) then it fails. In this case the unification succeeds with the bindings: • c = curr (from the first argument to the lambda) • a = acc (from the second argument to the lambda) • x = (+1) curr (from before the cons) • a = acc (from after the cons) An example of a subexpression that would have failed unification is foldl (\curr acc -> (+1) curr : acc) []. Step 2: Validity The next step is to check that any value which has been bound more than once is equal in all bindings. In our case only a has been used twice, and it always binds to acc, so the unification is valid. An example of a subexpression that would have failed validity is foldr (\curr acc -> (+1) curr : xs) []. Step 3: Substitution Now we've got some bindings, we can substitute them into the RHS, namely map (\c -> x). We replace c and x using the bindings above. Note that a isn't mentioned on the RHS, so we don't use it. After substitution we get: map (\curr -> (+1) curr) Step 4: Free variable check Consider the expression foldr (\curr acc -> f acc : acc) []. Using the rules above we'd end up with map (\curr -> f acc), which is terrible, since we've gone from referring to a locally bound acc to whatever acc is in scope (if any). To solve that, we check that the result doesn't introduce any new free variables: (freeVars result \\ freeVars hintRuleRHS) isSubsetOf freeVars original Specifically any free variables introduced in the result, which weren't in the RHS (excluding the fake unification variables), must have been in the original subexpression. With that, for foldr, we're done. There are a handful of other steps that apply in some cases. Step A: Dot expansion in the template If you write a hint map f (map g x) ==> map (f . g) x then HLint notices that also implies the rule map f . map g ==> map (f . g) and adds it. As a result, you shouldn't write your HLint rules in point-free style. Step B: Dot/dollar expansion in the subexpression When matching a subexpression HLint will expand f x and (f . g) x if doing so results in a match. These operators are used commonly enough that they are often treated more like brackets than functions. Step C: Scope matching When unifying qualified function names, HLint uses the active imports to guess whether they match. If you have import qualified Data.Vector as V then the subexpression V.length will unify with Data.Vector.length. Since HLint doesn't have complete import information it uses a few heuristics to figure out matching. Step D: Scope moving Similarly to scope matching on the LHS of a rule, after matching, HLint tries to requalify any necessary values on the RHS. As an example, assuming we are producing Data.Vector.null, if we know about import qualified Data.Vector as V then we suggest V.null. Full code To see the full code and all supporting definitions go to the HLint source, which defines matchIdea - here I show a gently simplified version. Given scope information, a rule (LHS and RHS) and a subexpression, we optionally produce a resulting expression after substitution. matchIdea :: Scope -> HintRule -> Exp_ -> Maybe Exp_matchIdea s HintRule{..} original = do u <- unifyExp hintRuleLHS original u <- validSubst u -- need to check free vars before unqualification, but after subst (with e) -- need to unqualify before substitution (with res) let result = substitute u hintRuleRHS guard (freeVars result Set.\\ Set.filter (not . isUnifyVar) (freeVars hintRuleRHS)) Set.isSubsetOf freeVars original -- check no unexpected new free variables return result ### Gabriel Gonzalez # Semantic integrity checks are the next generation of semantic versioning <html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/> <meta content="text/css" http-equiv="Content-Style-Type"/> <meta content="pandoc" name="generator"/> <style type="text/css">code{white-space: pre;}</style> <style type="text/css">div.sourceCode { overflow-x: auto; } table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode { margin: 0; padding: 0; vertical-align: baseline; border: none; } table.sourceCode { width: 100%; line-height: 100%; } td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; } td.sourceCode { padding-left: 5px; } code > span.kw { color: #007020; font-weight: bold; } /* Keyword */ code > span.dt { color: #902000; } /* DataType */ code > span.dv { color: #40a070; } /* DecVal */ code > span.bn { color: #40a070; } /* BaseN */ code > span.fl { color: #40a070; } /* Float */ code > span.ch { color: #4070a0; } /* Char */ code > span.st { color: #4070a0; } /* String */ code > span.co { color: #60a0b0; font-style: italic; } /* Comment */ code > span.ot { color: #007020; } /* Other */ code > span.al { color: #ff0000; font-weight: bold; } /* Alert */ code > span.fu { color: #06287e; } /* Function */ code > span.er { color: #ff0000; font-weight: bold; } /* Error */ code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ code > span.cn { color: #880000; } /* Constant */ code > span.sc { color: #4070a0; } /* SpecialChar */ code > span.vs { color: #4070a0; } /* VerbatimString */ code > span.ss { color: #bb6688; } /* SpecialString */ code > span.im { } /* Import */ code > span.va { color: #19177c; } /* Variable */ code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ code > span.op { color: #666666; } /* Operator */ code > span.bu { } /* BuiltIn */ code > span.ex { } /* Extension */ code > span.pp { color: #bc7a00; } /* Preprocessor */ code > span.at { color: #7d9029; } /* Attribute */ code > span.do { color: #ba2121; font-style: italic; } /* Documentation */ code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ </style></head><body> The Dhall configuration language just added support for "semantic integrity checks". This post explains what "semantic integrity check" means, motivates the new feature, and compares to semantic versioning. ## The problem I added this feature in response to user concerns about code injection in Dhall configuration files. We'll illustrate the problem using the following example.dhall configuration file which derives a summary of student information from a list of students:  -- Example of an expression imported by URL let map = http://prelude.dhall-lang.org/List/map -- Example of an expression imported by pathin let students = ./students.dhallin let getName = λ(student : { name : Text, age : Natural }) → student.namein { classSize = List/length { name : Text, age : Natural } students , names = map { name : Text, age : Natural } Text getName students } This configuration imports a helper function named map from the Dhall Prelude by URL:  let map = http://prelude.dhall-lang.org/List/mapin ... ... and that URL currently hosts a text file encoding the following Dhall function:  curl -L http://prelude.dhall-lang.org/List/map{-Tranform a list by applying a function to each elementExamples:./map Natural Bool Natural/even ([+2, +3, +5] : List Natural)= [True, False, False] : List Bool./map Natural Bool Natural/even ([] : List Natural)= [] : List Bool-}let map : ∀(a : Type) → ∀(b : Type) → (a → b) → List a → List b = λ(a : Type) → λ(b : Type) → λ(f : a → b) → λ(xs : List a) → List/build b ( λ(list : Type) → λ(cons : b → list → list) → List/fold a xs list (λ(x : a) → cons (f x)) )in map Similarly, our example configuration imports student data from another configuration file by path: ...in let students = ./students.dhall... ... and we'll assume that file contains the following list of student records: [ { name = "Jane Doe" , age = +19 }, { name = "John Rivera" , age = +18 }, { name = "Alice O'Hare", age = +19 }] Values, functions, and types are all Dhall expressions, so we can inject all of them in our code via URLs or paths. When we interpret a Dhall configuration file these imports get substituted with their contents and then we evaluate the fully resolved configuration file as an expression in a functional language:  dhall <<< './example.dhall' | dhall-format{ classSize : Natural, names : List Text }{ classSize = +3, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text} Users were concerned that these imports could be compromised, resulting in malicious code injection ## The solution The latest release of Dhall added support for import integrity checks to address user concerns about malicious tampering. We can use these integrity checks to "freeze" our imports by adding a SHA-256 hash after each import. First, we ask the dhall-hash utility to compute the current hash for our imports:  dhall-hash <<< 'http://prelude.dhall-lang.org/List/map'sha256:3063e9b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbb dhall-hash <<< './students.dhall' sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06 ... and then we append the hash after each import to freeze the import:  let map = http://prelude.dhall-lang.org/List/map sha256:3063e9b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbbin let students = ./students.dhall sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06in let getName = λ(student : { name : Text, age : Natural }) → student.namein { classSize = length { name : Text, age : Natural } students , names = map { name : Text, age : Natural } Text getName students }  Once you add these integrity checks the Dhall interpreter will enforce them when resolving imports. In this case, the example configuration still successfully evaluates to the same result after adding the integrity checks:  dhall <<< './example.dhall' | dhall-format{ classSize : Natural, names : List Text }{ classSize = +3, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text} The integrity check passes because we haven't yet modified any of our imports. ## Semantic integrity Once you freeze an import with a hash, Dhall guarantees that the meaning of the import never changes. These are semantic hashes, not textual hashes. For example, suppose that we modify ./students.dhall to add a comment, reorder record fields, and modify the formatting, like this: -- Class of 2017[ { age = +19, name = "Jane Doe" }, { name = "John Rivera" , age = +18 }, { name = "Alice O'Hare", age = +19 } ] These changes do not affect the computed hash of the file and the interpreter still accepts the ./students.dhall import that we protected with an integrity check:  dhall <<< './example.dhall' | dhall-format # Still succeeds{ classSize : Natural, names : List Text }{ classSize = +3, names = [ "Jane Doe", "John Rivera", "Alice O'Hare" ] : List Text} The Dhall interpreter accepted the import of ./students.dhall because the semantic hash never changed:  dhall-hash <<< './students.dhall' sha256:6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06 However, now suppose we try to change the substance of the file by modifying John's age: -- Class of 2017[ { age = +19, name = "Jane Doe" }, { name = "John Rivera" , age = +20 }, { name = "Alice O'Hare", age = +19 } ] Now the semantic integrity check fails:  dhall <<< './example.dhall'Error: Import integrity check failedExpected hash:↳ 6c4205ed51c0201abcccd1d90be4d7cd4c492246176ab404c35886a03d9dfc06Actual hash:↳ 808d921914de5349f50ac656bed93c2894dfe35401991e1ca0c89861834023fb Dhall recognizes that this is no longer the same expression and rejects the import. Only an import that represents the same value can pass the check. This means, for example, that malicious users cannot tamper with our imports, even if we were to distribute the imported code over an insecure channel. The worst that an attacker can do is cause our configuration to reject the import, but they cannot trick the configuration into silently accepting the wrong expression. ## Refactoring We can use these integrity checks to do more than just secure code. We can also repurpose these checks to assert that our code refactors are safe and behavior-preserving. For example, suppose that we change the student list to: -- Class of 2017 let double = λ(x : Natural) → x * +2in [ { name = "Jane Doe" , age = +19 } , { name = "John Rivera" , age = double +9 } , { name = "Alice O'Hare", age = +19 } ] This will still pass the integrity check because the student list still evaluates to the same expected result. We can also refactor our project layout, too. For example, we could modify the student list to import the double function from another file: -- Class of 2017[ { name = "Jane Doe" , age = +19 }, { name = "John Rivera" , age = ./double.dhall +9 }, { name = "Alice O'Hare", age = +19 }] ... where ./double.dhall has the following contents: λ(x : Natural) → x * +2 ... and the integrity check would still pass. I originally introduced semantic integrity checks to protect against malicious code modification then later realized that they can also be used to protect against non-malicious modifications (such as a refactor gone wrong). # Textual hashes The semantic hash provides a more information than a textual hash of the import. For example, suppose we changed our ./double.dhall function to triple the argument: λ(x : Natural) → x * +3 A textual hash of the ./students.dhall import would not detect this change because the real change took place in the text of another file that ./students.dhall imported. However, A semantic hash can follow these imports to detect transitive changes to dependencies. The semantic hash is also more flexible than a textual hash because the semantic hash does not change when we make cosmetic changes like refactoring, reformatting, or commenting code. ## Caveats Dhall's semantic versioning can reject some behavior-preserving changes to functions. Dhall only attempts to detect if two functions are β-equivalent (i.e. the same if fully β-reduced). For example, the following two functions are equivalent, but will not produce the same hash: λ(x : Bool) → x λ(x : Bool) → if x then True else False Similarly, Dhall's semantic hash cannot detect that these two functions are the same: λ(x : Natural) → x * +2 λ(x : Natural) → x + x On the other hand, Dhall will (almost) never give two semantically distinct expressions the same hash. Only an astronomically improbable hash collision can cause this and at the time of this writing there is no known vulnerability in the SHA-256 hash algorithm. Dhall will support other hash algorithms should SHA-256 ever be broken. This is why Dhall prefixes the hash with the algorithm to leave the door open for new hash algorithms. ## Semantic versioning You might wonder how semantic integrity checks compare to semantic versioning. I like to think of semantic integrity checks and semantic versions as two special cases of the following abstract interface: • a package publishes a version string for each official release • you can compare two version strings to detect a breaking change to the package Semantic versioning is one special case of that abstract interface where: • the version string has a major number and minor number • a difference in major version numbers signals a breaking change Some variations on semantic versioning propose independently versioning each exported function/value/type instead of versioning the package as a whole. Also, some languages (like Elm) mechanically enforce semantic versioning by detecting API changes programmatically and forcing a major version bump if there is a breaking change. A semantic integrity check is another special case of that abstract interface where: • the version string is a SHA-256 hash • if two hashes are different then that signals a breaking change The key difference between semantic versioning and semantic integrity checks is how we define "a breaking change". Semantic version numbers (usually) treat changes to types as breaking changes whereas semantic integrity checks treat changes to values as breaking changes. (To be totally pedantic: semantic integrity checks treat changes to expressions as breaking changes, and in a language like Dhall everything is an expression, including types). This does not imply that semantic integrity checks are better than semantic version numbers. Sometimes you want to automatically pick up small changes or improvements from your dependencies without adjusting a hash. In cases like those you want the expected type to be the contract with your dependency and you don't want to pin the exact value. For example, we could "simulate" semantic versioning in Dhall by attaching a type annotation to our ./students.dhall import like this:  let map = http://prelude.dhall-lang.org/List/map sha256:3063e9b34fd4235165a7a46e3ee3e0d0d7cded5da16f5572cc9e459ed5452fbbin let students = ./students.dhall : List { name : Text, age : Natural }in let getName = λ(student : { name : Text, age : Natural }) → student.namein { classSize = List/length { name : Text, age : Natural } students , names = map { name : Text, age : Natural } Text getName students }  ... and now we can add or remove students from our imported list without breaking anything. We've used the type system as a coarser integrity check to state that certain changes to our configuration file's meaning are okay. ## Conclusion You can think of a semantic integrity check as a "value annotation" (i.e. the term-level equivalent of a type annotation). Instead of declaring an expected type we declare an expected value summarized as a hash. This is why the title of this post declares that "semantic integrity checks are the next generation of semantic versioning". If you think of a semantic version as a concise summary of an imported package's type, then a semantic integrity check is a concise summary of an imported package's value. </body></html> ### Keegan McAllister # On depression, privilege, and online activism Update (November 2017): I'm leaving this up as a snapshot of how I felt at the time. Since then a lot has changed in my life, I'm much less angry in general and I no longer give a shit what the toxic assholes think of me, which is pretty great! [Content warning: depression, privilege, online activism] This isn't a general account of my experiences with depression. Many people have written about that, and I don't have much to add. But there's one aspect that I don't hear about very often. It's something that bothers me a lot, and others have told me that it bothers them too. The thing is, I'm not just a person with a mental illness. I'm also a well-off white guy, and I enjoy a whole set of unearned privileges from that. Every day people around the world are harassed, abused, and killed over things I never have to worry about. Even in mundane daily life, most everyone is playing on a higher difficulty setting than I ever will. I've thought about this a lot over the past few years, and I'm trying to understand how I can help make the world more fair and less oppressive. So I give money and I volunteer a little and I speak up when it seems useful, but mostly I listen. I listen to the experiences of people who are different from me. I try to get some understanding of how they feel and why. How is this related to depression? Because the reality of privilege and oppression is fucking depressing. Of course it's depressing to those who are directly harmed. That's a lot of what I read about, and some of the despair transfers to me. But my profiting from the suffering of others in a way that I mostly can't change is also depressing, at least if I make an attempt not to ignore it. And my distress over my role in systems of oppression brings its own layer of guilt. People are actually suffering and I feel sorry for myself because I'm dimly aware of it? But this comes from the voice that has always taunted me about depression. “How can you be sad? Your life is great. If you had real problems you wouldn't be so pathetic. You're not really sick. You're just a whiner.” All of which is part of the disease. I need to own it and work on it every day. But it seems like every time I read an online discussion about social justice, I take a huge step backwards. It's hard to shrug off the “men are horrible” comments when I spend so much effort trying to convince myself that I'm not horrible. When I hear people gloating about delicious white male tears, I think about all the times when I would come home from work and collapse in bed crying. Is this what they want my life to be? I can't give myself permission to tune out, because the same people lecture constantly about my obligation to be a good ally, which mostly takes the form of “shut up and listen.” And then when I'm upset by the things they say, the response is “This isn't for you! Why are you listening?” A local group, one that had recently invited me to hang out as a guest, retweeted a member's declaration to would-be allies: “We're not friends. Fuck you.” Can you see why it feels like they're trying to hurt me? Let me be clear: I truly don't care if people in a room somewhere are talking about how men are the worst. I don't feel oppressed by it, and I have no desire to argue with it. But I can't handle direct exposure. And don't tell me that I'm too stupid to understand why they say these things. I know intellectually that it's not about me. I understand the need to vent and the importance of building solidarity. None of that matters on the emotional level where these comments register like a punch to the gut. I do feel this way, even if I shouldn't and I wish I didn't. I'm talking about mental health, triggers, and unintentionally hurtful speech. Does that sound familiar? One reason I was drawn to intersectional feminism is that it seemed to have a good set of ground rules for how to treat everyone decently. But now I feel like I'm excluded from protection. “Men are horrible” is apparently the one form of speech where intent is all that matters, and I'm a bad person if it triggers something. I've been told it's offensive that I would even try to describe my experience in those terms. It hurts a whole lot to try and really feel someone's pain, and then realize they don't even slightly give a shit about me. It hurts even more when they'll bend over backwards for anyone except me. Look, I get it. You argue all the time with trolls who claim that men have it just as bad as women and will shout “what about the men” as a way to disrupt any discussion. When you're engaged in meme warfare, you can't show them any human empathy. They certainly wouldn't return the favor. And if my voice sounds a little like theirs, that's just too bad for me. I know that this article will serve as ammunition for some people with views I find disgusting. That sucks, but I'm done using political strategy as a reason to stay silent. I understand tone policing as a derailing tactic, and I understand the need to call it out. But at this point it seems there's no room for a sincere request for kindness, especially coming from someone who doesn't get much benefit of the doubt. (The Geek Feminism Wiki basically says that asking for kindness is tone policing if and only if you're a man.) I'm not trying to silence anyone here. I'm not jumping in and derailing an existing conversation. I'm writing on my own blog, on my own schedule, about my own feelings. But I'm told that even this is crossing a line. I know that I can't dictate how others feel about our fucked-up world. Does that mean I must absolutely suppress the way I feel? Even when we agree about the substance of what's wrong? I know that if I ask someone to share their life experiences, they have a right to express anger. When does expressing anger become sustained, deliberate cruelty? “People are being oppressed and you're asking us to care about your feelings?” Yes, I am asking you to care. Just a little bit. I don't claim that my feelings should be a top priority. I hope it wouldn't come up very often. But according to the outspoken few who set the tone, I'm never allowed to bring it up. I don't deserve to ask them to be nice. And that's why I can no longer have anything to do with this movement. It's really that simple. I guess it says something about my state of mind that I felt the need to attach 1,700 words of preemptive defenses. The truth is, when I'm not allowed to say or even think “not all men,” part of me hears “Yes, all men, especially you.” And if I'm ever confused about whether I'm allowed to say “not all men,” there are a dozen unprompted reminders every day. Little jokes, repeated constantly to set the climate about what will and won't be tolerated. When you treat me like one of the trolls, I start to believe that I am one. Guys who say “I support feminism but sometimes they go too far” are usually trying to excuse sexist behavior. So what do I conclude about myself when I have the same thought? I get that “ally” is not a label you self-apply, it's a thing you do, and the label comes from others. The problem is, if a hundred people say I'm a good ally, and one person says I'm a sexist asshole, who do you think I'm going to believe? I'm not allowed to stand up for myself, because doing so is automatically an act of oppression. If a woman treats me like shit, and she's being “more feminist” than me, I conclude that I deserve to be treated like shit. That is the model I've learned of a good ally. I'm not a good ally, or even a bad one. I'm collateral damage. If the point of all this is to give me a tiny little taste of the invalidation that others experience on a regular basis, then congratulations, it worked. You've made your point. Now that you've broken me, how can I possibly help you, when it seems like I'm part of the problem just by existing? It feels like all I can do is engage in emotional self-harm to repay the debt of how I was born. I can't just take a break “until I feel better.” My depressive symptoms will always come and go, and some thoughts will reliably bring them back. I spent years reading about how the most important thing I can do, as a winner of the birth lottery, is to be an ally to marginalized people. And now I've realized that I'm too sick and weak to do it. Even if I give up on being an ally, I can't avoid this subject. It affects a lot of my friends, and I feel even worse when I ask them not to talk about it around me. I don't want to silence anyone. At least I've mostly stopped using Twitter. So this is how I feel, but I'm not sure anyone else can do anything about it. Really, most of the people I've talked to have been sympathetic. Maybe I need to learn not to let bullies get to me, even when they're bullying in service of a cause I support. They don't seem to get much pushback from the wider community, at any rate. What gives me hope is, I recognize that my participation in the endless shouting online wasn't really useful to anyone. If I can let myself ignore all that, maybe I can recover some of my energy for other activities that actually help people. That's all I have to say right now. Thank you for listening to me. ## November 03, 2017 ### Brent Yorgey # Sum of heights in a binary tree Executive summary: every year when teaching data structures I always forget how to analyze the cost of building a binary heap, which amounts to summing the heights of all the nodes in a full binary tree. So I’m writing down the (lovely) proof here in the hopes that I will remember it next time. Suppose you have a full binary tree and you do an operation on every node, where the cost of the operation is proportional to the height of that node. That is, the cost for each of the $n/2$ leaves is $0$, for each of the $n/4$ nodes in the next level up the cost is $1$, and so on. We can visualize the scenario like this: As a function of the total number of nodes $n$, how expensive is this? We can see that $O(n \lg n)$ is an upper bound, since there are $n$ nodes and the height of each node is at most $\lg n$. But it seems like it might actually be faster than this in reality, since, intuitively, most of the nodes have a height which is much smaller than $\lg n$. (One specific motivation for this scenario is that we can build a binary heap from an arbitrary set of data by looping over the nodes from the bottom up and calling reheapDown on each; in the worst case reheapDown takes time proportional to the height of the node, as in this scenario. But it doesn’t matter if you don’t know about binary heaps.) Let’s take the same tree and put a dollar at every node, for a total of $\n$: Now imagine sliding all the money as far up and to the right as it will go. That is, we take each dollar, and keep moving it up as long as it is a left child. As soon as we reach a node which is a right child we stop. The tree ends up looking like this: Now take each pile of money and move it up one step to its parent, except the money at the root of the tree, which you can put in your pocket. And voilà! We now have exactly enough money at each node to pay for the cost of the operations, and we even have a bit left over (which we can use to buy coffee). But we started with $\n$ and only shuffled money around; this shows that the total cost is actually $O(n)$. Exercise for the reader: what does this have to do with the number of bit flips needed to count from $1$ to $n$ with a binary counter? ## November 02, 2017 ### Robert Harper # PFPL Commentary I am building a web page devoted to the 2nd edition of Practical Foundations for Programming Languages, recently published by Cambridge University Press. Besides an errata, the web site features a commentary on the text explaining major design decisions and suggesting alternatives. I also plan to include additional exercises and to make sample solutions available to faculty teaching from the book. The purpose of the commentary is to provide the “back story” for the development, which is often only hinted at, or is written between the lines, in PFPL itself. To emphasize enduring principles over passing fads, I have refrained from discussing particular languages in the book. But this makes it difficult for many readers to see the relevance. One purpose of the commentary is to clarify these connections by explaining why I said what I said. As a starting point, I explain why I ignore the familiar concept of a “paradigm” in my account of languages. The idea seems to have been inspired by Kuhn’s (in)famous book The Structure of Scientific Revolutions, and was perhaps a useful device at one time. But by now the idea of a paradigm is just too vague to be useful, and there are many better ways to explain and systematize language structure. And so I have avoided it. I plan for the commentary to be a living document that I will revise and expand as the need arises. I hope for it to provide some useful background for readers in general, and teachers in particular. I wish for the standard undergraduate PL course to evolve from a superficial taxonomy of the weird animals in the language zoo to a systematic study of the general theory of computation. Perhaps PFPL can contribute to effecting that change. Update: As I had hoped, I have been making many new additions to the commentary, exposing alternatives, explaining decisions, and expanding on topics in PFPL. There are also a few errors noted in the errata; so far, nothing major has come up. (The sections on safety are safely sound.) Filed under: Research, Teaching # It Is What It Is (And Nothing Else) A recent discussion of introductory computer science education led to the topic of teaching recursion. I was surprised to learn that students are being taught that recursion requires understanding something called a “stack” that is nowhere in evidence in their code. Few, if any, students master the concept, which is usually “covered” only briefly. Worst, they are encouraged to believe that recursion is a mysterious bit of esoterica that is best ignored. And thus is lost one of the most important and beautiful concepts in computing. The discussion then moved on to the implementation of recursion in certain inexplicably popular languages for teaching programming. As it turns out, the compilers mis-implement recursion, causing unwarranted space usage in common cases. Recursion is dismissed as problematic and unimportant, and the compiler error is elevated to a “design principle” — to be serpentine is to do it wrong. And thus is lost one of the most important and beautiful concepts in computing. And yet, for all the stack-based resistance to the concept, recursion has nothing to do with a stack. Teaching recursion does not need any mumbo-jumbo about “stacks”. Implementing recursion does not require a “stack”. The idea that the two concepts are related is simply mistaken. What, then, is recursion? It is nothing more than self-reference, the ability to name a computation for use within the computation itself. Recursion is what it is, and nothing more. No stacks, no tail calls, no proper or improper forms, no optimizations, just self-reference pure and simple. Recursion is not tied to “procedures” or “functions” or “methods”; one can have self-referential values of all types. Somehow these very simple facts, which date back to the early 1930’s, have been replaced by damaging myths that impede teaching and using recursion in programs. It is both a conceptual and a practical loss. For example, the most effective methods for expressing parallelism in programs rely heavily on recursive self-reference; much would be lost without it. And the allegation that “real programmers don’t use recursion” is beyond absurd: the very concept of a digital computer is grounded in recursive self-reference (the cross-connection of gates to form a latch). (Which, needless to say, does not involve a stack.) Not only do real programmers use recursion, there could not even be programmers were it not for recursion. I have no explanation for why this terrible misconception persists. But I do know that when it comes to programming languages, attitude trumps reality every time. Facts? We don’t need no stinking facts around here, amigo. You must be some kind of mathematician. If all the textbooks are wrong, what is right? How should one explain recursion? It’s simple. If you want to refer to yourself, you need to give yourself a name. “I” will do, but so will any other name, by the miracle of α-conversion. A computation is given a name using a fixed point (not fixpoint, dammit) operator: fix x is e stands for the expression e named x for use within e. Using it, the textbook example of the factorial function is written thus: fix f is fun n : nat in case n {zero => 1 | succ(n') => n * f n'}. Let us call this whole expression fact, for convenience. If we wish to evaluate it, perhaps because we wish to apply it to an argument, its value is fun n : nat in case n {zero => 1 | succ(n') => n * fact n'}. The recursion has been unrolled one step ahead of execution. If we reach fact again, as we will for a positive argument, fact is evaluated again, in the same way, and the computation continues. There are no stacks involved in this explanation. Nor is there a stack involved in the implementation of fixed points. It is only necessary to make sure that the named computation does indeed name itself. This can be achieved by a number of means, including circular data structures (non-well-founded abstract syntax), but the most elegant method is by self-application. Simply arrange that a self-referential computation has an implicit argument with which it refers to itself. Any use of the computation unrolls the self-reference, ensuring that the invariant is maintained. No storage allocation is required. Consequently, a self-referential functions such as fix f is fun (n : nat, m:nat) in case n {zero => m | succ(n') => f (n',n*m)} execute without needing any asymptotically significant space. It is quite literally a loop, and no special arrangement is required to make sure that this is the case. All that is required is to implement recursion properly (as self-reference), and you’re done. There is no such thing as tail-call optimization. It’s not a matter of optimization, but of proper implementation. Calling it an optimization suggests it is optional, or unnecessary, or provided only as a favor, when it is more accurately described as a matter of getting it right. So what, then, is the source of the confusion? The problem seems to be a too-close association between compound expressions and recursive functions or procedures. Consider the classic definition of factorial given earlier. The body of the definition involves the expression n * fact n' where there is a pending multiplication to be accounted for. Once the recursive call (to itself) completes, the multiplication can be carried out, and it is necessary to keep track of this pending obligation. But this phenomenon has nothing whatsoever to do with recursion. If you write n * square n' then it is equally necessary to record where the external call is to return its value. In typical accounts of recursion, the two issues get confused, a regrettable tragedy of error. Really, the need for a stack arises the moment one introduces compound expressions. This can be explained in several ways, none of which need pictures or diagrams or any discussion about frames or pointers or any extra-linguistic concepts whatsoever. The best way, in my opinion, is to use Plotkin’s structural operational semantics, as described in my Practical Foundations for Programming Languages (Second Edition) on Cambridge University Press. There is no reason, nor any possibility, to avoid recursion in programming. But folk wisdom would have it otherwise. That’s just the trouble with folk wisdom, everyone knows it’s true, even when it’s not. Update: Dan Piponi and Andreas Rossberg called attention to a pertinent point regarding stacks and recursion. The conventional notion of a run-time stack records two distinct things, the control state of the program (such as subroutine return addresses, or, more abstractly, pending computations, or continuations), and the data state of the program (a term I just made up because I don’t know a better one, for managing multiple simultaneous activations of a given procedure or function). Fortran (back in the day) didn’t permit multiple activations, meaning that at most one instance of a procedure can be in play at a given time. One consequence is that α-equivalence can be neglected: the arguments of a procedure can be placed in a statically determined spot for the call. As a member of the Algol-60 design committee Dijkstra argued, successfully, for admitting multiple procedure activations (and hence, with a little extra arrangement, recursive/self-referential procedures). Doing so requires that α-equivalence be implemented properly; two activations of the same procedure cannot share the same argument locations. The data stack implements α-equivalence using de Bruijn indices (stack slots); arguments are passed on the data stack using activation records in the now-classic manner invented by Dijkstra for the purpose. It is not self-reference that gives rise to the need for a stack, but rather re-entrancy of procedures, which can arise in several ways, not just recursion. Moreover, recursion does not always require re-entrancy—the so-called tail call optimization is just the observation that certain recursive procedures are not, in fact, re-entrant. (Every looping construct illustrates this principle, albeit on an ad hoc basis, rather than as a general principle.) Filed under: Programming, Teaching ## November 01, 2017 ### Tweag I/O # The Exodus to Streamgard,<br/> an epic poem Yves Parès If Haskell was a god, often would he be depicted with the ravens Modularity and Abstraction flying above him, hovering the world and reporting to him every detail of our whereabouts. Haskell would sit on the Throne of Purity and look upon the world with full of wisdom. And in his hand, the mighty Haskell would wield the Spear of Lazy Lists, which is said to have the power to tackle each and every problem the world might have to face. And to honour him, we would code and abstract everything with lazy lists. For millenia would lists be used to map, filter, separate, merge, group, . But, one day, the , son of the wicked , would come. And the Real-World Serpent carries an eternal hatred towards lazy lists. Oh, that dreaded Serpent, that will throw everything it can muster to prevent us from staying within the warm comfort of abstraction and laziness. The Serpent will assemble its minions, Early-close and Strictness of effects, and unleash its wrath upon our world. Foldl, son of Haskell and brother of Foldr, would lead humanity to its last bastion, Streamgard, and organize the final fight... So, long story short, streaming is a library that allows you to leverage the insights you have gained while manipulating lazy lists in Haskell to handle effectful streams of data. We already talked about streaming on this blog, with this post discussing the IO part and this one comparing it to pipes and conduit. Here, we will be using streaming for highly efficient data processing and filtering. To this effect, we will use it conjointly with another library, foldl, which gives us an Applicative interface to the usual list functions. In this blog post we will apply them to the task of computing some statistics about a distribution of data. We want to be able to: • process the input data stream only once (aka in one pass), • never repeat the effects that were used to produce that data stream, • maintain the possibility to use the input stream as if it were a list, for instance by splitting it into two subparts, sending each subpart to be processed by a specific function. So lets imagine that the statistics I want to compute on my input data distributions take the shape of a simple summary. This is what I want to obtain in the end: data Summary v a = Summary { summaryLength :: Int , summaryMins :: [a] , summaryMaxes :: [a] , summaryMean :: v , summaryStdDev :: v } deriving (Show)  Nothing too fancy here, I just want to be able to compute the length, the n smallest elements, the n' biggest elements, the mean and the standard deviation of my distribution. We distinguish the types a and v here because our input distribution does not have to be numerical, as long as we have a projection a -> v available. This way, we can compute a summary of a stream of (Double, String) tuples, for instance, if the projection is just fst. So let's have a little reminder of our conditions. We want to be able to read the input data only once. But, we still want modularity and reusability. We do not want to have to recode our Summary-computing function every time we want to add a new field, and we would like to reuse already existing functions computing these statistics. And this is where the foldl package comes in. This package defines a type Fold as follows: data Fold a b = forall acc. Fold (acc -> a -> acc) acc (acc -> b)  You might recognize here the typical arguments of the classical foldl function of the Prelude: a is the type of each element of the input stream we consume, the first field (acc -> a -> acc) is an accumulation function and the second field acc is the initial value of the accumulator. The new component is the b type parameter and the last field (acc -> b). This one is called extract. It is used to extract the final value out of the accumulator. This is necessary so that Fold a can be a Functor and therefore an Applicative. See the original blog post and this talk by Gabriel Gonzalez for more detail, though be aware that Fold had a different shape back then. One of the central ideas of the foldl library is that Fold implements the Applicative type class: instance Applicative (Fold a)  Crucially, this instance combines two Folds, into a guaranteed one-pass traversal of the data. Therefore we can safely decompose the computation of a Summary as follows: import qualified Control.Foldl as L import Data.Function (on) summarizeBy :: (Floating v, Ord v) => (a -> v) -> Int -> Int -> L.Fold a (Summary v a) summarizeBy f nMins nMaxes = Summary <> L.length <*> collect ((>=) on f) nMins <*> collect ((<=) on f) nMaxes <*> L.premap f L.mean <*> L.premap f L.std  What's happening here? We are using a few of the functions already present in the foldl package and a new one, so let's delve into it a bit. The function summarizeBy takes a projection f, which we talked about earlier, the number of smallest elements we want to collect and the number of biggest elements. Then our five statistics are computed: • L.length :: L.Fold a Int gives us the number of elements in the input. • collect, which we will define a bit later, accumulates either the mins or the maxes given a comparison function. • L.mean gives us the average. We use L.premap f to turn it into a fold that will work on our projection f. • L.std gives us the standard deviation. The combination of the above gives us a Fold a (Summary v a), something that will consume a stream of a's and output a summary. At this point, nothing is consumed, we have only composed folds together, and a Fold is agnostic of the exact nature of the input. Running it on any Foldable datatype for instance is just a matter of calling: L.fold (summarizeBy id 3 3) [1..100]  The only function is the collect function. Defining it as a brand new Fold is simple: import Data.Sequence as Seq collect :: (a -> a -> Bool) -> Int -> L.Fold a [a] collect skipPred n = L.Fold insertPop Seq.empty (L.fold L.list) where insertPop acc x | Seq.length acc < n = insert x acc | otherwise = pop (insert x acc) insert x s = let (before, after) = Seq.spanl (skipPred x) s in before <> Seq.singleton x <> after pop s = case viewr s of s' :> _ -> s' _ -> s  Here we manually defined a new Fold from the three elements we mentioned earlier: an accumulation function (insertPop), an initial accumulator value (Seq.empty) and an extract function ((L.fold L.list), which also uses a Fold to turn the final sequence into a plain list). Now, the astute reader will notice we left streaming aside. Let's get back to it. Let's use as an input the classic Titanic dataset: PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S 2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C 3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S 4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S ...  We want to get two different summaries for the fares: one for the passengers that survived and one for those who did not. First, let's load the CSV into a stream by using the streaming-cassava and streaming-bytestring packages: {-# LANGUAGE OverloadedStrings #-} import Control.Monad (mzero) import qualified Data.ByteString.Streaming as BS import Streaming import Streaming.Cassava data Passenger { name :: !String, fare :: !Double, survived :: !Bool } deriving (Show) instance FromNamedRecord Passenger where parsedNamedRecord m = Person <> m .: "Name" <*> m .: "Fare" <*> (toBool =<< (m .: "Survived")) where toBool 0 = return False toBool 1 = return True toBool _ = mzero streamCsv :: (MonadResource m) => Stream (Of Passenger) m () streamCsv = decodeByName (BS.readFile ".../titanic.csv")  Nothing too fancy here, just a bit of required boilerplate to be able to read Passengers from the CSV file. MonadResource is necessary to track the files opened by our program. The type Stream (Of Passenger) m () means that we will be manipulating a stream whose elements are Passengers, that will run some effects in a monad m and return no result in the end. Now, lets split that input in two different substreams: import qualified Streaming.Prelude as S aliveDead :: Stream (Of Passenger) (Stream (Of Passenger) m) () aliveDead = S.partition survived streamCsv  Let's look at the type of aliveDead: it is a Stream over another Stream. Stream (Of a) is actually a monad transformer, the way the partitioning happens is by creating two layers: one for the live passengers and one for the dead ones. It's not exactly a tuple of two streams (as it would be with Data.List.partition), but is has the same advantages: each layer can be processed by different functions which don't have to know where the stream they process lies in the monad stack. Therefore, each one of these functions can be expressed as: summarizePassengers :: (Monad m) => Stream (Of Passenger) m a -> m (Of (Summary Double Passenger) a) summarizePassengers = L.purely S.fold (summarizeBy fare 3 3)  where m can be any monad. This can be the bottom MonadResource or another Stream, summarizePassengers does not mind and does not have to! Of behaves like a tuple, so it simply means that we return both the newly computed Summary and an a (a may just be (), but here we have to be a little more general). S.fold is the basic folding function for streams. L.purely fn f "unpacks" a Fold f and calls a folding function fn. So now, getting our summaries is just a matter of runAll = runResourceT do (summaryAlive :> summaryDead :> ()) <- summarizePassengers summarizePassengers aliveDead ...  So in the end, we splitted the input file in two substreams, we computed various statistics twice, and despite all this streaming and foldl guarantee that the input will be read only once in bounded memory. These techniques are currently being applied by Tweag I/O in the context of a project with Novadiscovery. Novadiscovery is a consulting company for in silico clinical trials, namely simulation of virtual patients through biomodeling. Parts of this blog post are actual code from the tools we develop with them. ## October 31, 2017 ### Douglas M. Auclair (geophf) # October 2017 1HaskellADay problems and solutions ## October 29, 2017 ### Christopher Allen # How I make stew I think the first thing my mother taught me to cook was Kraft Mac-n-Cheese at age 9. Fortunately, I’ve been able to move past that since then. My repertoire is a bit limited but I like to think that by zeroing in on specific kinds of meals, I’m able to make them go a bit farther. A friend of mine asked how I do crockpot recipes and after stewing on it for awhile I thought I would write a post explaining my thought process. So, first the reason I cook at home is usually to practice. I’m a bachelor and probably wouldn’t bother if cooking wasn’t necessary for me to either eat healthy or to entertain. The exception to “cooking-as-practice” are my stews. I usually make stews to eat healthier, avoid going out, and sometimes as part of a weight cut. Another difference for stew from other kinds of food I make is that I’m more comfortable composing arbitrary ingredients in stews and knowing what I’ll get out of it. Most people, especially sedentary programmers, could benefit from eating less as well as eating healthier and I find stews help a lot with portion control. When I’m on a protein-sparing modified fast (PSMF) I’ll often make a single stew containing 1 or 2 pounds of meat for the whole week. For equipment I use a fairly ordinary slow cooker that has an off, low, and high setting. The nice thing about my slow cooker is that the crock pot can be lifted out of the heating element so that I can easily store the food in the fridge. I don’t typically freeze my stews in gladware/tupperware as I’m fine eating the same thing for a week straight but people with families may want to store and alternate. Now to the ingredients and process! ## Meat and vegetables (and not much else) Typically I’ll use a red meat (beef, pork, lamb, goat) and a complement of vegetables. Sometimes I’ll make stews with chicken but I typically don’t as I don’t feel they contribute as much flavor to the stew. A near constant in my stews is mushrooms as the umami (savoriness) they bring is vital for a good stew. Other common ingredients are carrots, celery, bell peppers, and tomatoes. Often recipes will call for stewed or crushed tomatoes. I will typically use tomato paste and water separately instead so that I can more finely control the tomatoey-ness. Another thing is that stews often call for potatoes, I never use them or any other starchy ingredient as I do not need empty or low-fiber calories. # I stopped trying to replicate curries One of the key things that improved my stews was not attempting to mimic things like curries which are very heavily laden with seasonings and instead focusing on getting the most out of my ingredients through browning. I still add seasoning (especially salt!) but it’s there to complement the ingredients, not to be the primary source of flavor. They never really came out that well and the super-strong flavors meant that I tired of what I made much faster than if it was more subtle. # You have to bring out the flavor in your ingredients Before you put them in the slow cooker. This is particularly vital for the meat and mushrooms, but it applies to vegetables depending on your preference. The trick here is to purge moisture and brown the ingredients in a skillet or pan before throwing them in the slow cooker. This will make all the difference in the quality of your stew, especially if you do a good job of this with the meat and mushrooms! Butter-browned mushrooms are manna from heaven when done right! They should taste good by themselves before being added to the stew. I strongly recommend getting good at browning ingredients and learning to cook for moisture targets if this isn’t something you’re already comfortable with. I got much better at this by learning to cook Sichuan food from Fuchsia Dunlop’s Land of Plenty and it has improved everything I prepare. I also appreciate vegetables much more as a result. # Seasoning Some common standard spices for stews: • Salt • Pepper • Paprika • Parsley • Thyme • Oregano • Basil • Rosemary • MSG (hate if you want, but it helps and works with the mushrooms and meat to make the stew fantastically savory) I usually end up using more salt than any recipe I’m working off of calls for. I start low and add more salt in one hour intervals until I’m satisfied with my taste-test from the crockpot. # Composing ingredients, not functions I mentally categorize ingredients by what flavor or texture component they bring to the stew. I’ll put a (B) next to ingredients that I feel should be browned. ## Savory • Beef, pork, lamb, goat (B) • Mushrooms, usually baby bella because of the surface area to volume ratio and flavor (B) • Tomatoes, usually as a paste for flavor density. I am more likely to use tomato paste with a beef or lamb than I am pork or chicken. ## Sweet • Carrots • Peas • Corn I don’t typically use corn but it’s a solid option if you want it • Peppers Bell peppers are typically neutral but others like chili peppers will taste a little sweet. I no longer try to make my stews spicy though. • Red cabbage • Onion (B) It depends on the onion. Vidalia is going to be most sweet, red onion middling, yellow or white onion is going to be the most astringent. ## Neutral • Celery • Bell peppers You can brown bell peppers but I don’t think there’s much point when it’ll hit the texture I want just from slow cooking and I can’t tell much difference in the flavor. ## Astringent or sulphuric • Onion (B) I usually sautee some onions and also throw in some raw onions to get both sides of the onion flavor. Raw onions are more astringent, sauteed/browned onions are more savory and sweet as the sulphuric components denature in heat. • Garlic cloves (B) Garlic is pretty much an always-ingredient in my stews along with onions and mushrooms. I strongly recommended learning to brown garlic cloves if you aren’t already in the practice of doing so as it does wonderful things for the flavor of garlic. You can sautee some of the ingredients you need to brown together. Don’t brown too many ingredients at once in the skillet as it will make it hard to purge moisture accurately for each ingredient and get the timing of the browning right. • White cabbage • Bok choy I don’t have as much experience with bok choy and am uncertain whether it should be browned or not but it’s been a nice replacement for white cabbage in the past. • Broccoli Confess I do not care for broccoli much and so rarely use it but I think broccoli would be improved by cooking in butter before addition to the crockpot. • Cauliflower Sister to broccoli. • Artichoke No strong opinions here, it’s been good when I’ve used it. Similar to asparagus. • Asparagus (B) I will often roast asparagus on the grill in olive oil, salt, and pepper and it’s lovely. You can do similar in the pan before adding to a crockpot. • Turnip I used turnip in my most recent stew and thought it added a nice edge to the stew. Only (downside?) is that the turnips will retain their fibrous texture even after a good slow cook so if you want them to disintegrate you may need to cook separately. This may be necessary if you have children you don’t want detecting the presence of vegetables. Downside to doing so may be less fiber content in the stew. ### Side note on “astringent” Pre-modern Europeans used garlic as an antibiotic for wounds. ## Citrus/acidic I don’t do as much of these as I’m typically going for a balance of sweet/astringent/savory in my stews but they’re an option if it’s something you want. I won’t listen them out as I don’t have as much experience with them in stews but these are going to be your fruits, berries, and some varieties of nuts. I’ve considered roasting walnuts for addition to a stew but haven’t tried it yet. ## Oils and fats Typically when I make a stew I’m trying to make it “fatty” without it developing a grotesque top layer of fat on top when it settles in the fridge. To that end, I’ve started using slightly leaner meats than the 70/30 I used to use, more like 80/20 or 85/15. The browning purges some fat as well. To complement the meat fats, I’ll sometimes add button to the stew while it’s slow cooking. I’ll taste to determine whether or not it needs more fat. ## Bones (not the rapper) The last stew I made was with beef rib. I should’ve separated the beef from the bones before putting them in the crockpot. I ended up having to do this after it was mostly done slow cooking. I thought the slow cooking would break down the connective tissue between the bone and protein but it did not. That said, bones, especially if they have some marrow in them, are excellent for stews and worth getting ahold of if you can. Goat can be good for this as the meat often comes on the bone with some good marrow. # An example recipe This is from memory what my last stew was, I’ll try to link tweets describing it as well. It was a modified version of this low carb beef stew recipe. • Beef rib, browned • Mushrooms (baby bella) I browned them in a little bit of butter and some worcestershire (wurster-shur) sauce. Note the color. • Tomato paste • Beef stock. If you want a very strong flavor, use beef consomme. • Some added water. Not a lot, I don’t care much how thick the stew gets for the most part as long as everything gets cooked properly and the texture is right. I’ll often heat water with my electric kettle and pour it over reheated stew that doesn’t have enough liquid. Tastes great, not watery/bland. • Salt, oregano, thyme, fresh ground black pepper, paprika, garlic powder • Raw onion, minced (I use a fork to mince) garlic cloves • Carrots, turnips, bell peppers. I didn’t add any xanthan gum, it was plenty thick at the end. I only had a 1 and a half, maybe 2 pounds of beef rib in the crockpot. # Coda I hope this helps some of y’all in making tasty, healthy food! I really like stews and think they’re a wonderfully adaptable and accessible way to make food. I know this site is a bit of a disaster zone, but if you like my writing or think you could learn something useful from me, please take a look at the Haskell book I've been writing. There's a free sample available too! ## October 27, 2017 ### Tweag I/O # Using Stackage for GHC regression testing Manuel M T Chakravarty A recent development in Haskell land is the formation of the GHC DevOps Group, which was the topic of last week’s blog post. The group is a community of parties committed to the future of GHC. Tweag I/O is one such party. We are helping the group achieve its goals with concrete engine room work. We presented some of our plans at the Haskell Implementor's Workshop, but here's a post for those who weren't there. (We also made the presentation slides available.) ## GHC release quality GHC’s past release schedule and release quality has varied. In fact, in our sample of users and clients, many dread major GHC releases. While the GHC DevOps Group has already committed to improve the continuous integration setup, automatically running tests at every new commit is practice that has been in place for some time. This means running the GHC regression testsuite by way of the ./validate script for every commit and for every pull request or differential. However, major GHC releases have in the past shipped with serious bugs that passed GHC’s own regression tests and only manifested in third party packages. Given GHC’s enormous surface area, this is not particularly surprising. Consequently, the integration of testing against third party packages into GHC’s development process appears to be a logical step to improve release quality. ## Third party challenges While conceptually attractive, the addition of third party packages to regression testing comes with its own set of challenges. Firstly, changes in GHC or core library behaviour or simple alterations of core package versions break some third party packages. This is compounded by package dependencies when a package that many others depend on breaks. Secondly, package authors or maintainers may not be willing or able to immediately respond to changes in GHC and its core libraries, but at the same time, package maintenance shouldn’t be offloaded onto the already very busy GHC developers. Thirdly, some packages are more important to the ecosystem and the GHC user base than others; we would like to focus our scarce resources on those packages that matter the most. But we are all fortunate to have in the Haskell community a uniquely valuable asset that few other languages have developed to the same scale: Stackage, a huge curated set of Haskell packages that have been painstakingly tested to work together and whose respective maintainers have agreed to keep packages updated in a timely manner. Combined with the popularity of Stackage among developers, this ensures a representative set of the most widely used and best maintained packages in the Haskell ecosystem. Hence, it is the perfect package candidate set for regression testing. ## From Stackage Nightly to Stackage HEAD Stackage currently builds package sets in two flavours: nightlies and LTS (long term support) sets. The nightly set is based on the most recent package versions from Hackage for the latest release version of GHC, whereas LTS sets are versioned and maintain stable interfaces by only occasionally updating package versions beyond patch updates. For the purposes of regression testing the current development version of GHC, the nightly flavour of Stackage is the appropriate starting point, as it continuously tracks package updates by package maintainers. Building of so called Stackage snapshots is a two-phase process. First, a Docker image called stackage:nightly, containing the appropriate version of GHC and the rest of the Haskell toolchain in combination with all C libraries and other non-Haskell dependencies for the package set is being build on Travis CI. Second, a tool called stackage-curator builds the Haskell packages that are part of the package set inside a Docker container running stackage:nightly. To perform regression testing of GHC HEAD, we need to alter both steps. Firstly, we use stackage:nightly as the basis for a Docker image that contains all the same non-Haskell dependencies, but includes the latest development version of GHC. We call it stackage:head. This is illustrated in the below diagram. ## Pruning constraints The second step in the Stackage build process, based on stackage-curator, is itself a two-phase process. First, stackage-curator converts a specification of build constraints into a concrete build plan. These build constraints are manually maintained by a group of people known as the Stackage curators. Secondly, stackage-curator (the tool) executes the build plan by building all packages in the package set. However, not every generated build plan can be executed. In case of package version conflicts, we may get an invalid plan. When using the latest development version of GHC, the HEAD, in combination with the build constraints of Stackage Nightly (which is curated to work with the latest stable release version of GHC), we invariably get an invalid plan. As we want regression testing to be a fully automatic process, we don’t want any manual intervention in the form of manually curating a set of build constraints specifically for GHC HEAD. Instead, we use a small Haskell script that prunes the build constraints by simply removing all packages that participate in a conflict. We call the resulting set of build constraints the pruned build constraints. They are then used to build packages. That build process may fail for individual packages if there is a regression or a conscious change in GHC. Overall, we get the following architecture. ## Assessing changes to GHC One of the interesting questions that we want to answer with the HEAD build of Stackage is whether a change to GHC involves a regression. Given that a HEAD build will typically involve failing packages and package failure may have a variety of reasons, it seems difficult to make that determination. However, a change to GHC is always a change with respect to a particular earlier version of GHC HEAD — this may be in the form of a pull request or a differential. Hence, what we are actually interested in is the change in package failures between two only slightly different versions of GHC. Any package whose build fails for both versions can simply be ignored. In contrast, whenever a pull request or differential leads to a new package failure, we have got a situation, where a code reviewer or code author needs to assess whether the failure is acceptable (GHC’s behaviour or core library APIs underwent a planned change) or whether it indicates a regression. ## Collaboration Adding a new Stackage flavour tracking GHC HEAD buys us, 1. that GHC developers can identify upfront which packages are affected by planned changes to GHC and the core libraries, 2. and that package maintainers are given plenty of advance notice about those changes that do break backwards compatibility (because say packages were relying on buggy behaviour by the compiler). Ultimately, this scheme empowers package developers to adapt to those changes early, or at any rate, to get a longer lead time to plan and schedule the work involved. 3. Quantify just how many current Stackage packages (and by proxy the entire ecosystem at large) are ready for the next GHC release, and track that number over time. In short, this new Stackage flavour should prove useful for both GHC developers upstream, and library maintainers downstream. Better yet, regression testing against all of Stackage has the potential to allow a closer collaboration between GHC developers and package authors. New versions of previously breaking packages on Hackage get picked for regression testing automatically once builds and tests start succeeding. ## October 26, 2017 ### FP Complete # Intro to Devops on GovCloud ### What I would have wanted to know about AWS GovCloud While assisting a US municipal government with their cloud migration, we recently had the opportunity to deploy a complete hosting platform to the GovCloud Region. Our task was to provide a platform based on kubernetes, running within a secure VPC built on private subnets and with VPN links to an enterprise-class network that spanned multiple datacenters. ### Noam Lewis # Type-safe enums in C, using a clang plugin The C programming language generally treats enums as integers (see “Appendix: For Language Lawyers” for reference). Wouldn’t it be nice if we could do more with enums, and do it safely? Some other languages have anything from integer-incompatible enums to full-blown sum types. It would be nice to have something like that in C. I wrote the enums_conversion clang plugin aiming to do just that, by treating enums as incompatible with integers (except via explicit casting). ## A motivating example Some people are surprised at the goals of this plugin. Here is a simple example to explain the motivation. Consider the following (totally fabricated) API: enum OpGetResult { OP_GET_ERROR, OP_GET_OK, }; enum OpGetResult get_items(void); /* Implementation: */ enum OpGetResult get_items(void) { /* ... do something with side effects ... */ return OP_GET_OK; }  So far so good. Save it as test.c and compile this program with gcc: gcc -std=c11 -Wall -Wextra -Werror -c test.c  No errors, yay! ### A simple bug Now, let’s introduce a bug. Someone decided the API is no good, and get_items should just return the number of items it “got”. So the new API is: /* This enum is in use elsewhere... */ enum OpGetResult { OP_GET_ERROR, OP_GET_OK, }; int get_items(void); /* return value changed to 'int' */ /* Implementation: */ int get_items(void) { /* ... do something with side effects ... */ return OP_GET_OK; /* oops! forgot to change this */ }  The bug is that get_items still returns the enum value OP_GET_OK instead of a number. Save as test2.c and compile (tested on gcc 6.3.0): gcc -std=c11 -Wall -Wextra -Werror -c test2.c  Oh no! No error! Let’s try with clang 5.0 and the wonderful -Weverything which enables all warnings: clang -std=c11 -Weverything -Werror -c test2.c  Nope! Still no error. The compilers are ok with this code because it’s allowed. However, it’s clearly not what we intended. ### A bunch of other possible bugs Here is a snippet with different ‘bad code’ examples: (for testing it can be appended to one of the previous files) int func(enum OpGetResult e, unsigned int x, unsigned int y); int func(enum OpGetResult e, unsigned int x, unsigned int y) { handle_result(x); /* passing arbitrary integer where one of several enum values was expected */ enum OpGetResult e2 = x; /* assigning from arbitrary integer (which may not be a valid enum value) */ if (e2 == y) { /* comparing enum to arbitrary integer */ } return e; /* returning enum where arbitrary integer is expected by caller */ }  Neither gcc 6.3.0 nor clang 5.0 emit any kind of warning about the above code. # Let's try gcc with some extra warnings: gcc -std=c11 -Wall -Wextra -Werror -Wconversion -Wenum-compare -Wswitch-enum -Wsign-conversion -c test2.c # clang with -Weverything: clang -std=c11 -Weverything -Werror -c test2.c  ### clang plugin to the rescue The enums_converesion clang plugin detects and warns about all of the above. # clang -std=c11 -Weverything -c test2.c -Xclang -load -Xclang ./clang_plugins.so -Xclang -add-plugin -Xclang enums_conversion test2.c:22:23: error: enum conversion to or from enum OpGetResult handle_result(x); /* passing arbitrary integer where one of several enum values was expected */ ^ test2.c:24:31: error: enum conversion to or from enum OpGetResult enum OpGetResult e2 = x; /* assigning from arbitrary integer (which may not be a valid enum value) */ ^ test2.c:26:13: error: enum conversion to or from enum OpGetResult if (e2 == y) { /* comparing enum to arbitrary integer */ ^ test2.c:29:16: error: enum conversion to or from enum OpGetResult return e; /* returning enum where arbitrary integer is expected by caller */ ^ 4 errors generated.  ## Frequently Asked Questions 1. But this isn’t standard C! Correct, it is a restrictive subset of C. Some “valid” C programs will be flagged by this plugin. I believe writing code in the spirit of this plugin will improve your code’s readability while preventing a class of bugs from ever occurring. 1. How is this different from gcc’s -Wenum-compare? The warning flag -Wenum-compare find comparisons between different enums, but does not look at comparing enums to integers, implicit casting to/from integers, etc. In the following program only the second if is flagged by -Wenum-compare: enum A { A_FIRST, A_SECOND }; enum B { B_FIRST, B_SECOND }; int foo(enum A a, unsigned int x); int foo(enum A a, unsigned int x) { if (x == a) { // no warning emitted return 1; } if (B_FIRST == a) { // will cause warning: comparison between ‘enum B’ and ‘enum A’ return 2; } return 0; }  1. How is this different from clang’s -Wenum-conversion? -Wenum-conversion doesn’t catch implicit casts to/from integral types (the plugin does). -Wenum-conversion does catch conversion from one enum type to another, like so: enum EnumA { E_A }; enum EnumB { E_B }; enum EnumA do_something(void) { return E_B; }  1. What about enums being used as combinable bits? Won’t the plugin disallow them? A common pattern is using an enum to describe the allowed bits for an “options” value that can be ORed together. For example: enum Flags { FLAG_NONE = 0, FLAG_READ = 1, FLAG_WRITE = 2, }; enum Flags do_something(void); enum Flags do_something(void) { return FLAG_WRITE | FLAG_READ; }  The plugin is OK with this. clang -Weverything doesn’t like this (-Wassign-enum): clang -std=c11 -c /tmp/test.c -Weverything /tmp/test.c:8:12: warning: integer constant not in range of enumerated type 'enum Flags' [-Wassign-enum] return FLAG_WRITE | FLAG_READ; ^ 1 warning generated.  That’s a false error (if you use | with a runtime variable, -Wassign-enum seems to not flag this). However, the plugin does catch errors of putting an invalid value in the OR expression: ... return FLAG_WRITE | 5;  Now clang -Weverything doesn’t complain (despite the possible bug). Running with the plugin gives: /tmp/test.c:10:16: error: enum conversion to or from enum Flags return FLAG_WRITE | 5;  1. I’m afraid to use this in production. The plugin only analyzes the AST produced by clang, and does not affect the emitted code in any way. 1. I don’t use clang! Can I benefit from this plugin? At elastifile, the plugin is being used as part of the CI process. Code that is being merged into master must pass the plugin’s checks (as well as other plugins from this suite). The actual production executable is built by gcc (for various unrelated reasons). The plugin is available as part of the elfs-clang-plugins suite github. ## Appendix: For Language Lawyers The C11 standard (draft) says: An enumeration comprises a set of named integer constant values. Each distinct enumeration constitutes a different enumerated type. The type char, the signed and unsigned integer types, and the enumerated types are collectively called integer types… ### Tweag I/O # The Exodus to Streamgard,<br/> an epic poem Yves Parès If Haskell was a god, often would he be depicted with the ravens Modularity and Abstraction flying above him, hovering the world and reporting to him every detail of our whereabouts. Haskell would sit on the Throne of Purity and look upon the world with an eye full of wisdom[^1]. And in his hand, the mighty Haskell would wield the Spear of Lazy Lists, which is said to have the power to tackle each and every problem the world might have to face. And to honour him, we would code and abstract everything with lazy lists. For millenia would lists be used to map, filter, separate, merge, group, and so forth[^2]. But, one day, the Real-World Serpent[^3], son of the wicked Foldr[^4], would come. And the Real-World Serpent carries an eternal hatred towards lazy lists. Oh, that dreaded Serpent, that will throw everything it can muster to prevent us from staying within the warm comfort of abstraction and laziness. The Serpent will assemble its minions, Early-close and Strictness of effects, and unleash its wrath upon our world. Foldl, son of Haskell and brother of Foldr, would lead humanity to its last bastion, Streamgard, and organize the final fight... So, long story short, streaming is a library that allows you to leverage the insights you have gained while manipulating lazy lists in Haskell to handle effectful streams of data. We already talked about streaming on this blog, with this post discussing the IO part and this one comparing it to pipes and conduit. Here, we will be using streaming for highly efficient data processing and filtering. To this effect, we will use it conjointly with another library, foldl, which gives us an Applicative interface to the usual list functions. In this blog post we will apply them to the task of computing some statistics about a distribution of data. We want to be able to: • process the input data stream only once (aka in one pass), • never repeat the effects that were used to produce that data stream, • maintain the possibility to use the input stream as if it were a list, for instance by splitting it into two subparts, sending each subpart to be processed by a specific function. So lets imagine that the statistics I want to compute on my input data distributions take the shape of a simple summary. This is what I want to obtain in the end: data Summary v a = Summary { summaryLength :: Int , summaryMins :: [a] , summaryMaxes :: [a] , summaryMean :: v , summaryStdDev :: v } deriving (Show)  Nothing too fancy here, I just want to be able to compute the length, the n smallest elements, the n' biggest elements, the mean and the standard deviation of my distribution. We distinguish the types a and v here because our input distribution does not have to be numerical, as long as we have a projection a -> v available. This way, we can compute a summary of a stream of (Double, String) tuples, for instance, if the projection is just fst. So let's have a little reminder of our conditions. We want to be able to read the input data only once. But, we still want modularity and reusability. We do not want to have to recode our Summary-computing function every time we want to add a new field, and we would like to reuse already existing functions computing these statistics. And this is where the foldl package comes in. This package defines a type Fold as follows: data Fold a b = forall acc. Fold (acc -> a -> acc) acc (acc -> b)  You might recognize here the typical arguments of the classical foldl function of the Prelude: a is the type of each element of the input stream we consume, the first field (acc -> a -> acc) is an accumulation function and the second field acc is the initial value of the accumulator. The new component is the b type parameter and the last field (acc -> b). This one is called extract. It is used to extract the final value out of the accumulator. This is necessary so that Fold a can be a Functor and therefore an Applicative. See the original blog post and this talk by Gabriel Gonzalez for more detail, though be aware that Fold had a different shape back then. One of the central ideas of the foldl library is that Fold implements the Applicative type class: instance Applicative (Fold a)  Crucially, this instance combines two Folds, into a guaranteed one-pass traversal of the data. Therefore we can safely decompose the computation of a Summary as follows: import qualified Control.Foldl as L import Data.Function (on) summarizeBy :: (Floating v, Ord v) => (a -> v) -> Int -> Int -> L.Fold a (Summary v a) summarizeBy f nMins nMaxes = Summary <> L.length <*> collect ((>=) on f) nMins <*> collect ((<=) on f) nMaxes <*> L.premap f L.mean <*> L.premap f L.std  What's happening here? We are using a few of the functions already present in the foldl package and a new one, so let's delve into it a bit. The function summarizeBy takes a projection f, which we talked about earlier, the number of smallest elements we want to collect and the number of biggest elements. Then our five statistics are computed: • L.length :: L.Fold a Int gives us the number of elements in the input. • collect, which we will define a bit later, accumulates either the mins or the maxes given a comparison function. • L.mean gives us the average. We use L.premap f to turn it into a fold that will work on our projection f. • L.std gives us the standard deviation. The combination of the above gives us a Fold a (Summary v a), something that will consume a stream of a's and output a summary. At this point, nothing is consumed, we have only composed folds together, and a Fold is agnostic of the exact nature of the input. Running it on any Foldable datatype for instance is just a matter of calling: L.fold (summarizeBy id 3 3) [1..100]  The only function not provided by the foldl package is the collect function[^5]. Defining it as a brand new Fold is simple: import Data.Sequence as Seq collect :: (a -> a -> Bool) -> Int -> L.Fold a [a] collect skipPred n = L.Fold insertPop Seq.empty (L.fold L.list) where insertPop acc x | Seq.length acc < n = insert x acc | otherwise = pop (insert x acc) insert x s = let (before, after) = Seq.spanl (skipPred x) s in before <> Seq.singleton x <> after pop s = case viewr s of s' :> _ -> s' _ -> s  Here we manually defined a new Fold from the three elements we mentioned earlier: an accumulation function (insertPop), an initial accumulator value (Seq.empty) and an extract function ((L.fold L.list), which also uses a Fold to turn the final sequence into a plain list). Now, the astute reader will notice we left streaming aside. Let's get back to it. Let's use as an input the classic Titanic dataset: PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked 1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S 2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C 3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S 4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S ...  We want to get two different summaries for the fares: one for the passengers that survived and one for those who did not. First, let's load the CSV into a stream by using the streaming-cassava and streaming-bytestring packages: {-# LANGUAGE OverloadedStrings #-} import Control.Monad (mzero) import qualified Data.ByteString.Streaming as BS import Streaming import Streaming.Cassava data Passenger { name :: !String, fare :: !Double, survived :: !Bool } deriving (Show) instance FromNamedRecord Passenger where parsedNamedRecord m = Person <> m .: "Name" <*> m .: "Fare" <*> (toBool =<< (m .: "Survived")) where toBool 0 = return False toBool 1 = return True toBool _ = mzero streamCsv :: (MonadResource m) => Stream (Of Passenger) m () streamCsv = decodeByName (BS.readFile ".../titanic.csv")  Nothing too fancy here, just a bit of required boilerplate to be able to read Passengers from the CSV file. MonadResource is necessary to track the files opened by our program. The type Stream (Of Passenger) m () means that we will be manipulating a stream whose elements are Passengers, that will run some effects in a monad m and return no result in the end. Now, lets split that input in two different substreams: import qualified Streaming.Prelude as S aliveDead :: Stream (Of Passenger) (Stream (Of Passenger) m) () aliveDead = S.partition survived streamCsv  Let's look at the type of aliveDead: it is a Stream over another Stream. Stream (Of a) is actually a monad transformer, the way the partitioning happens is by creating two layers: one for the live passengers and one for the dead ones. It's not exactly a tuple of two streams (as it would be with Data.List.partition), but is has the same advantages: each layer can be processed by different functions which don't have to know where the stream they process lies in the monad stack. Therefore, each one of these functions can be expressed as: summarizePassengers :: (Monad m) => Stream (Of Passenger) m a -> m (Of (Summary Double Passenger) a) summarizePassengers = L.purely S.fold (summarizeBy fare 3 3)  where m can be any monad. This can be the bottom MonadResource or another Stream, summarizePassengers does not mind and does not have to! Of behaves like a tuple, so it simply means that we return both the newly computed Summary and an a (a may just be (), but here we have to be a little more general). S.fold is the basic folding function for streams. L.purely fn f "unpacks" a Fold f and calls a folding function fn. So now, getting our summaries is just a matter of runAll = runResourceT do (summaryAlive :> summaryDead :> ()) <- summarizePassengers summarizePassengers aliveDead ...  So in the end, we splitted the input file in two substreams, we computed various statistics twice, and despite all this streaming and foldl guarantee that the input will be read only once in bounded memory. These techniques are currently being applied by Tweag I/O in the context of a project with Nova Discovery. Nova Discovery is a consulting company for in silico clinical trials, namely simulation of virtual patients through biomodeling. Parts of this blog post are actual code from the tools we develop with them. [^1]: Yes, of course Haskell would be one-eyed. And he'd have a list of like 200 awe-inspiring nicknames, like The Monadbringer or The Father of all things pure, but that's another story. [^2]: Full cosmogony in the religion of Haskell is left as an exercise to the reader. [^3]: Yes, all that buildup for a lousy pun, I know. [^4]: Also seen written as Folður. [^5]: foldl provides minimum and maximum, but here we want more than that. ## October 23, 2017 ### Disciple/DDC # Release v0.5.1 DDC v0.5.1 was released today. Here are the release notes: The Disciple language is an experimental dialect of Haskell which investigates static typing and program transformation in the presence of computational effects. Version 0.5.1 is “working alpha” quality, meaning there is a complete system that can be hacked around with, but it’s not yet industrial strength. ### Features • Haskell-like source language, so Haskell-like programs should work with minor modifications. • Modal region and effect system using ‘box’ and ‘run’ to suspend and force computations. • Higher rank polymorphism with bidirectional type inference. • Simple two space copying garbage collection. • Default call-by-value evaluation. • Typed external core language. ### New In This Release • Copying garbage collection. We now do simple two space Cheney-scan garbage collection, and grow the heap as needed. We use LLVM shadow stack support to retrieve the GC roots. • Implicit parameters. We now support Agda-style implicit parameters, where arguments are constructed at use-sites using whatever bindings are currently in scope. For example, we can perform Haskell-style ad-hoc overloading using implicit dictionaries:  elem {Eq a} (k: a) (xx: List a): Bool = case xx of Nil -> False Cons x xs | x == k -> True | otherwise -> elem k xs • Floating point primitives. The addition of floating point primitives combined with proper storage management now lets us write more realistic example programs, like the ray tracer demo, which was also described on the blog. • Travis continuous integration. Every push to the GitHub repo now gets tested by the Travis buildbot. Branches can also be tested individually before being merged. • Support for LLVM versions 3.1 - 5.0. We query the version of the installed LLVM compiler and generate LLVM code with the required syntax. This process now works for versions 3.1 through to 5.0, which is the latest at the time of this release. • New home page with the start of a language specification. The home page now consists of the user manual generated from Sphinx / Restructured Text source. The manual includes grammars for source and core languages, as well as previous release notes. The bug tracker is still accessible via the development wiki. • Bug fixes for offside rule, compilation of nested pattern matching, string escapes, unsolved meta variables. Now with more bake. ### Work In Progress We are moving to a new indexed binary format for interface files, as re-parsing interface files is currently a bottleneck. The file format is to be provided by the Shimmer project, which has been split out into a separate repository. ### People The following people contributed to DDC since the last release: • Chris Hall - Travis continuous integration. • Amos Robinson - Build system fixes, start on machine fusion transform. • Ben Sinclair - Build system fixes. • Jacob Stanley - Copying garbage collection. • Ben Lippmeier - Copying garbage collection, implicit parameters, floating point, new home page. ### Previous Releases • 2016/09 DDC 0.4.3: Nested patterns and guards, simple type synonyms. • 2016/04 DDC 0.4.2: Compilation of higher order functions, run and box casts. • 2014/03 DDC 0.4.1: Added bi-directional type inference and region extension. • 2013/07 DDC 0.3.2: Added Tetra and Flow language fragments. ### More Information ### Michael Snoyman # Manipulating Maintainers Real title should be: how to get members of any open source community to be interested in helping you. But the given title is catchier. There's an old "ha ha, only serious" joke. If you go to a Linux forum and ask for help fixing your WiFi driver, everyone will ignore you. If, instead, you say "Linux sucks, you can't even get a f*&ing WiFi driver working!" thousands of people will solve the problem for you. This story is a great example of manipulating people, but it's obviously a negative take on it. I'd like to share some thoughts on this from a much more positive standpoint, which will help you get people to pay more attention, be more helpful, and—perhaps most importantly—create a healthier open source community over all. These items will appear in no particular order, and will almost all fall into either the attractor or obstacle category. An attractor is something you can do to make people want to participate with you. An obstacle is something you should not do, which would prevent people from interacting with you. And it should go without saying, but I'll say it anyway: this is an opinionated list, written by one guy. I'm including in here things that I personally care about, and things which friends and colleagues have shared with me. No example is specific to any individual, so don't think I'm calling you out: I'm most certainly not. And some people may disagree, or have other items for this list. Sharing such differing thoughts would be very healthy. ## Don't waste people's time This is my biggest one to be honest. Remember that for the most part, people you interact with in open source settings are doing this in their free time. Either because they love a project, they want to help people, they want to change the world, or something else. You're asking for a slice of their lives. Make that slice as small as possible. The advice is vague, so let me follow up with some concrete examples: • File a good bug report. If you write an issue that says "my code doesn't compile," and don't include the error message, full code, OS, compiler version, and other such things, people will have to spend time prying it out of you. Be forthcoming with relevant information. • In slight contradiction to that: be concise. Start off with the most highly pertinent information. Make it clear what you're trying to do. If you have a 400 line error message, perhaps put it in an lpaste or Gist and link to it. • Provide a minimal repro. "Here's a link to my 1500 SLOC project that doesn't compile, kthxbye" is a bad start. As someone helping you, I'm going to have to strip off the extraneous bits until I'm staring the problem in the face. Why should I spend my time doing that, when you—the person asking for help—could be? • Make sure to mention any custom configuration. If I spend 5 days trying to fix an issue with your code not linking, and then you say "oh, I've been trying out the prerelease linker, I forgot to mention that, you think that could be the problem?" I'm going to be annoyed. Don't do that. Be forthcoming with all relevant info, and call out in particular custom things on your system. • Don't fall into the XY problem. And don't get offended if you get accused of hitting the XY problem. Instead of trying to explain what this problem is, I'll just provide a link. ## Demonstrate you've tried You've been staring at your screen for the past 5 days. The code hasn't compiled. You have no idea why. You're pulling out your hair at this point (side note: bald is awesome). You finally, in a fit of desperation, go to Stack Overflow and say: I'm trying to make an HTTP request in Haskell and can't figure it out. Any advice? Good luck. Not only is your question vague, but it sounds like you're asking for help on a homework assignment and are too lazy to do any research. Make it clear that you're not just getting someone else to do your work, and are really deserving of assistance. Below you'll find a snippet of code I've been trying to get working to make an HTTP PUT request and parse the response body as XML in a streaming fashion. As you can see, I'm trying to connect the source body to the parseXML function from xml-conduit, but I get the error message below. If anyone could point me in the right direction, I'd appreciate it. Make sure to include the import statements and language extensions too, so that anyone reading can just copy-paste your example and get the same error message. (I like using reproducible Stack scripts for this.) You may notice an overlap with minimal repro from above: that's intentional. ## Help other people If I see people answering questions on a mailing list or Stack Overflow, I'm appreciative. I consider them a comrade-in-arms. And I consider them assets to the community. They've earned my respect, I'm indebted to them, and I want to entice them to continue. Honestly: all of you helpers are awesome in my book. If one of those people asks a question, I want to help more. In addition to all of the feelings I mentioned above, there's also a more basic one: if this person is having trouble, it's probably not the most basic, boring question. In reality, the person may be barely a beginner, and may be asking beginner questions. But I know statistically that just having that helpful person's name associated with the question increases the likelihood of it being interesting. In other words: I'm nerd sniped. This points out something which really applies to all of these sections: people have memories. As soon as you start interacting with the community, you're building a reputation. You can change that reputation over time (for better or worse), but you have to acknowledge that it's there. ## Don't be rude Compare: Thank you for your great software, I really appreciate the time you've taken to make it. I'd appreciate your help with... With: I've been pulling my hair out trying to parse your documentation for the past two days. You think you can help me make sense of it? I'm trying to... And even further: Since I'm stuck using your piece of s*&# software with crappy documentation, the least you can do is help me overcome... If your goal is to get someone to help you, I'd place a large wager that the first approach will attract the most assistance. It doesn't matter if the other two approaches really do capture your current mental state. Are you trying to vent at someone, or get help? I recently had someone argue a few times that the tone of a question shouldn't matter. (In fact, those interactions were partially what encouraged this blog post.) I argue that that's not only naive, but dangerous: • We're human beings, and we like being treated nicely. As I mentioned above, open source community members are giving up a portion of their lives to help strangers. You should make that sacrifice feel as rewarding as possible. • Rude comments like this scare other people away. Encouraging people to continue with them by rewarding them with a helpful answer has the real possibility of scaring away more constructive community members. • All of life is a series of choices between different things I can be doing. If you make it miserable enough to interact with you, I may very well choose "watch paint dry and see this jerk badmouth my project in public" over "give this guy any more of my time." • Whether correct or not, being rude is a signal to me of other likely tendencies. I'm likely to guess that someone rude is also selfish, unwilling to minimize time wastage for others, and unlikely to contribute back by helping people. If I have to make a snap judgement on you based on a question and your tone of voice, a rude tone is not going to help you. I honestly haven't found the best approach to this specific problem. In some cases, a private message saying "your message would be more well received if you modified it" can help. But if I'm honest, by the time I think about writing such a message, I've basically decided that this is a person not worth my time, and trying to encourage him/her to behave better isn't worth it. The situation is slightly different if someone has been in the community for a while, and suddenly has an outburst of rudeness. I'm not excusing it, but I am saying I'd be more willing to consider that he/she is having a bad day, or that the problem is really bad, instead of "this person is just a jerk." Also, the reverse is true: if you've been rude to someone for the past 10 interactions, it may be very difficult to convince them to help you on the 11th, even if the rudeness disappears. (Overwhelming niceness, however, can help.) Side note: I implied above that the project documentation sucks. That may be the case. See "offer to help" for advice on pointing that out. ## Say thank you I'll preface this one with a few caveats so I don't get a flurry of guilt-ridden "thank you" notes. Most people don't say thank you in open source. I rarely write a thank you note to a package author. I don't expect it, I've never met anyone who expects it, it is not necessary. When someone has received assistance on a mailing list, I get happy. When that person responds with a sincere thank you, I get happier. When I'm the person who did the helping, I'm even happier still. It's simple, it's trivial, but it's often missed. Most people are only doing open source work to help others. Gratitude may be their only reward. Taking it a step farther: there have been a few times over the years where, out of nowhere, I've received a very kind personal email thanking me for work I've done. You can ask my wife and she'll confirm: it's truly touching to receive such messages. I know I've had views of open source maintainers as being far beyond the lowly likes of me in the past. I don't think it's generally true. Most people are, at the end of the day, just people. And they respond like any other people to kind words. Though I have a feeling that Linus Torvalds may be a bit confused if you pop him an email saying "love Linux, thanks!" ## Admit if you're new This one works one time per community. If you're new, you don't know what you're doing, and are asking for help, say straight out that you're new. It will get you some sympathy (unless you're lying, then people will hate you). It will get a more softball answer, and likely some guides to places explaining how to interact better. For example: if you come to the Yesod issue tracker and say "I'm new, I'm not sure if this is the best place to ask about installing GHC," you'll likely get pointed to an install page and Stack Overflow for further "please help me" questions. ## Offer to help This may be the first surprising piece of advice. Let's say the docs on my library suck. You could come in and say "help me solve X because your docs suck." And I might answer. Now consider this: I was having trouble doing X with your library (thank you for it by the way!). I'd be happy to prepare a documentation PR to help other people in my situation, if you'd be able to guide me towards an answer. Whoa, what is this? Help someone and they'll take away the dreaded documentation writing task from my plate? Awesome, I'll get right on it! In addition to docs, similar thoughts apply to: • Offering to write test cases • Offering to add some missing functionality • Offering to answer people's questions on other issues/the mailing list/Stack Overflow The point is: convince the maintainer (or whoever) that giving time to you is an investment. ## Give money This isn't at all a universal one. And to be clear: I'm not asking for it, if I have an envelope with unmarked bills on my doorstep tomorrow, I'll be weirded out. Some people just need money. They like contributing to open source work, but they have to pay the bills. If they've at all expressed a willingness to accept money for their work (like setting up Flattr or Patreon or whatever is popular these days), considering donating. Consider how much of their time you're taking. Consider how much of your time they would be saving you. Consider what a typical software developer hourly rate is. And then realize that buying someone a beer, or even a nice dinner, is probably a cheap price to pay for an answer to a question. And for those who aren't asking for any money, offering to buy the beer/coffee/soda when you meet up at a conference is a nice way to make this one a reality too. # Effective Ways to Get Help from Maintainers This blog post was previously titled "Manipulating Maintainers," but has been retitled to more accurately reflect what it's about (with a less cheeky tone). It's about how to most effectively interact with and request assistance from maintainers of open source projects, and open source community members in general. There's an old "ha ha, only serious" joke. If you go to a Linux forum and ask for help fixing your WiFi driver, everyone will ignore you. If, instead, you say "Linux sucks, you can't even get a f*&ing WiFi driver working!" thousands of people will solve the problem for you. This story is a great example of manipulating people, but it's obviously a negative take on it. I'd like to share some thoughts on this from a much more positive standpoint, which will help you get people to pay more attention, be more helpful, and—perhaps most importantly—create a healthier open source community over all. These items will appear in no particular order, and will almost all fall into either the attractor or obstacle category. An attractor is something you can do to make people want to participate with you. An obstacle is something you should not do, which would prevent people from interacting with you. And it should go without saying, but I'll say it anyway: this is an opinionated list, written by one guy. I'm including in here things that I personally care about, and things which friends and colleagues have shared with me. No example is specific to any individual, so don't think I'm calling you out: I'm most certainly not. And some people may disagree, or have other items for this list. Sharing such differing thoughts would be very healthy. ## Don't waste people's time This is my biggest one to be honest. Remember that for the most part, people you interact with in open source settings are doing this in their free time. Either because they love a project, they want to help people, they want to change the world, or something else. You're asking for a slice of their lives. Make that slice as small as possible. The advice is vague, so let me follow up with some concrete examples: • File a good bug report. If you write an issue that says "my code doesn't compile," and don't include the error message, full code, OS, compiler version, and other such things, people will have to spend time prying it out of you. Be forthcoming with relevant information. • In slight contradiction to that: be concise. Start off with the most highly pertinent information. Make it clear what you're trying to do. If you have a 400 line error message, perhaps put it in an lpaste or Gist and link to it. • Provide a minimal repro. "Here's a link to my 1500 SLOC project that doesn't compile, kthxbye" is a bad start. As someone helping you, I'm going to have to strip off the extraneous bits until I'm staring the problem in the face. Why should I spend my time doing that, when you—the person asking for help—could be? • Make sure to mention any custom configuration. If I spend 5 days trying to fix an issue with your code not linking, and then you say "oh, I've been trying out the prerelease linker, I forgot to mention that, you think that could be the problem?" I'm going to be annoyed. Don't do that. Be forthcoming with all relevant info, and call out in particular custom things on your system. • Don't fall into the XY problem. And don't get offended if you get accused of hitting the XY problem. Instead of trying to explain what this problem is, I'll just provide a link. ## Demonstrate you've tried You've been staring at your screen for the past 5 days. The code hasn't compiled. You have no idea why. You're pulling out your hair at this point (side note: bald is awesome). You finally, in a fit of desperation, go to Stack Overflow and say: I'm trying to make an HTTP request in Haskell and can't figure it out. Any advice? Good luck. Not only is your question vague, but it sounds like you're asking for help on a homework assignment and are too lazy to do any research. Make it clear that you're not just getting someone else to do your work, and are really deserving of assistance. Below you'll find a snippet of code I've been trying to get working to make an HTTP PUT request and parse the response body as XML in a streaming fashion. As you can see, I'm trying to connect the source body to the parseXML function from xml-conduit, but I get the error message below. If anyone could point me in the right direction, I'd appreciate it. Make sure to include the import statements and language extensions too, so that anyone reading can just copy-paste your example and get the same error message. (I like using reproducible Stack scripts for this.) You may notice an overlap with minimal repro from above: that's intentional. ## Help other people If I see people answering questions on a mailing list or Stack Overflow, I'm appreciative. I consider them a comrade-in-arms. And I consider them assets to the community. They've earned my respect, I'm indebted to them, and I want to entice them to continue. Honestly: all of you helpers are awesome in my book. If one of those people asks a question, I want to help more. In addition to all of the feelings I mentioned above, there's also a more basic one: if this person is having trouble, it's probably not the most basic, boring question. In reality, the person may be barely a beginner, and may be asking beginner questions. But I know statistically that just having that helpful person's name associated with the question increases the likelihood of it being interesting. In other words: I'm nerd sniped. This points out something which really applies to all of these sections: people have memories. As soon as you start interacting with the community, you're building a reputation. You can change that reputation over time (for better or worse), but you have to acknowledge that it's there. ## Don't be rude Compare: Thank you for your great software, I really appreciate the time you've taken to make it. I'd appreciate your help with... With: I've been pulling my hair out trying to parse your documentation for the past two days. You think you can help me make sense of it? I'm trying to... And even further: Since I'm stuck using your piece of s*&# software with crappy documentation, the least you can do is help me overcome... If your goal is to get someone to help you, I'd place a large wager that the first approach will attract the most assistance. It doesn't matter if the other two approaches really do capture your current mental state. Are you trying to vent at someone, or get help? I recently had someone argue a few times that the tone of a question shouldn't matter. (In fact, those interactions were partially what encouraged this blog post.) I argue that that's not only naive, but dangerous: • We're human beings, and we like being treated nicely. As I mentioned above, open source community members are giving up a portion of their lives to help strangers. You should make that sacrifice feel as rewarding as possible. • Rude comments like this scare other people away. Encouraging people to continue with them by rewarding them with a helpful answer has the real possibility of scaring away more constructive community members. • All of life is a series of choices between different things I can be doing. If you make it miserable enough to interact with you, I may very well choose "watch paint dry and see this jerk badmouth my project in public" over "give this guy any more of my time." • Whether correct or not, being rude is a signal to me of other likely tendencies. I'm likely to guess that someone rude is also selfish, unwilling to minimize time wastage for others, and unlikely to contribute back by helping people. If I have to make a snap judgement on you based on a question and your tone of voice, a rude tone is not going to help you. I honestly haven't found the best approach to this specific problem. In some cases, a private message saying "your message would be more well received if you modified it" can help. But if I'm honest, by the time I think about writing such a message, I've basically decided that this is a person not worth my time, and trying to encourage him/her to behave better isn't worth it. The situation is slightly different if someone has been in the community for a while, and suddenly has an outburst of rudeness. I'm not excusing it, but I am saying I'd be more willing to consider that he/she is having a bad day, or that the problem is really bad, instead of "this person is just a jerk." Also, the reverse is true: if you've been rude to someone for the past 10 interactions, it may be very difficult to convince them to help you on the 11th, even if the rudeness disappears. (Overwhelming niceness, however, can help.) Side note: I implied above that the project documentation sucks. That may be the case. See "offer to help" for advice on pointing that out. ## Say thank you I'll preface this one with a few caveats so I don't get a flurry of guilt-ridden "thank you" notes. Most people don't say thank you in open source. I rarely write a thank you note to a package author. I don't expect it, I've never met anyone who expects it, it is not necessary. When someone has received assistance on a mailing list, I get happy. When that person responds with a sincere thank you, I get happier. When I'm the person who did the helping, I'm even happier still. It's simple, it's trivial, but it's often missed. Most people are only doing open source work to help others. Gratitude may be their only reward. Taking it a step farther: there have been a few times over the years where, out of nowhere, I've received a very kind personal email thanking me for work I've done. You can ask my wife and she'll confirm: it's truly touching to receive such messages. I know I've had views of open source maintainers as being far beyond the lowly likes of me in the past. I don't think it's generally true. Most people are, at the end of the day, just people. And they respond like any other people to kind words. Though I have a feeling that Linus Torvalds may be a bit confused if you pop him an email saying "love Linux, thanks!" ## Admit if you're new This one works one time per community. If you're new, you don't know what you're doing, and are asking for help, say straight out that you're new. It will get you some sympathy (unless you're lying, then people will hate you). It will get a more softball answer, and likely some guides to places explaining how to interact better. For example: if you come to the Yesod issue tracker and say "I'm new, I'm not sure if this is the best place to ask about installing GHC," you'll likely get pointed to an install page and Stack Overflow for further "please help me" questions. ## Offer to help This may be the first surprising piece of advice. Let's say the docs on my library suck. You could come in and say "help me solve X because your docs suck." And I might answer. Now consider this: I was having trouble doing X with your library (thank you for it by the way!). I'd be happy to prepare a documentation PR to help other people in my situation, if you'd be able to guide me towards an answer. Whoa, what is this? Help someone and they'll take away the dreaded documentation writing task from my plate? Awesome, I'll get right on it! In addition to docs, similar thoughts apply to: • Offering to write test cases • Offering to add some missing functionality • Offering to answer people's questions on other issues/the mailing list/Stack Overflow The point is: convince the maintainer (or whoever) that giving time to you is an investment. ## Give money This isn't at all a universal one. And to be clear: I'm not asking for it, if I have an envelope with unmarked bills on my doorstep tomorrow, I'll be weirded out. Some people just need money. They like contributing to open source work, but they have to pay the bills. If they've at all expressed a willingness to accept money for their work (like setting up Flattr or Patreon or whatever is popular these days), considering donating. Consider how much of their time you're taking. Consider how much of your time they would be saving you. Consider what a typical software developer hourly rate is. And then realize that buying someone a beer, or even a nice dinner, is probably a cheap price to pay for an answer to a question. And for those who aren't asking for any money, offering to buy the beer/coffee/soda when you meet up at a conference is a nice way to make this one a reality too. ## October 16, 2017 ### Gabriel Gonzalez # Advice for Haskell beginners <html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/> <meta content="text/css" http-equiv="Content-Style-Type"/> <meta content="pandoc" name="generator"/> <style type="text/css">code{white-space: pre;}</style></head><body> This post summarizes advice that I frequently give to Haskell beginners asking how to start out learning the language First, in general I recommend reading the Haskell Programming from first principles book, mainly because the book teaches Haskell without leaving out details and also provides plenty of exercises to test your understanding. This is usually good enough if you are learning Haskell as your first language. However, I would like to give a few additional tips for programmers who are approaching Haskell from other programming languages. ## Learn Haskell for the right reasons Some people learn Haskell with the expectation that they will achieve some sort of programming enlightenment or nirvana. You will be disappointed if you bring these unrealistic expectations to the language. Haskell is not an achievement to unlock or a trophy to be won because learning is a never-ending process and not a finish line. I think a realistic expectation is to treat Haskell as a pleasant language to use that lets you focus on solving real problems (as opposed to wasting your time fixing silly self-induced problems like null pointers and "undefined is not a function"). ## Avoid big-design-up-front Haskell beginners commonly make the mistake of trying to learn as much of the language as possible before writing their first program and overengineering the first draft. This will quickly burn you out. You might come to Haskell from a dynamically typed background like JavaScript, Python, or Ruby where you learned to avoid refactoring large code bases due to the lack of static types. This aversion to refactoring promotes a culture of "big-design-up-front" where you try to get your project as close to correct on the first try so that you don't have to refactor your code later. This is a terrible way to learn Haskell, for two reasons. First, Haskell has a much higher ceiling than most other programming languages, so if you wait until you hit that ceiling before building something you will wait a looooooong time. Second, refactoring is cheap in Haskell so you don't need to get things right the first time. You will accelerate your learning process if you get dirty and make mistakes. Write really ugly and embarrassing code and then iteratively refine your implementation. There is no royal road to learning Haskell. ## Avoid typeclass abuse Specifically, avoid creating new typeclasses until you are more comfortable with the language. Functional programming languages excel because many language features are "first class". For example, functions and effects are first class in Haskell, meaning that you can stick them in a list, add them, nest them, or pass them as arguments, which you can't (easily) do in imperative languages. However, typeclasses are not first-class, which means that if you use them excessively you will quickly depend on advanced language features to do even simple things. Programming functionally at the term-level is much simpler and more enjoyable than the type-level Prolog that type-classes encourage. Begin by learning how to solve problems with ordinary functions and ordinary data structures. Once you feel like you understand how to solve most useful problems with these simple tools then you can graduate to more powerful tools like typeclasses. Typeclasses can reduce a lot of boilerplate in proficient hands, but I like to think of them as more of a convenience than a necessity. You can also take this approach with you to other functional languages (like Elm, Clojure, Elixir, or Nix). You can think of "functions + data structures" as a simple and portable programming style that will improve all the code that you write, Haskell or not. ## Build something useful Necessity is the mother of invention, and you will learn more quickly if you try to build something that you actually need. You will quickly convince yourself that Haskell is useless if you only use the language to solve Project Euler exercises or Sudoku puzzles. You are also much more likely to get a Haskell job if you have a portfolio of one or two useful projects to show for your time. These sorts of projects demonstrate that you learned Haskell in order to build something instead of learning Haskell for its own sake. ## Conclusion Hopefully these tips will help provide some guard rails for learning the language for the first time. That's not to say that Haskell is perfect, but I think you will enjoy the language if you avoid these common beginner pitfalls. </body></html> ### Ken T Takusagawa # [duimuqdd] Compressing prime numbers Given a large prime number p not of any special form, can one data-compress it to specify it in less than log_2(p) bits? Straightforward is to say it is the m-th prime. Expressing m will take about log_2(p/log(p)) bits by the prime number theorem. This is the theoretical optimum by some measure, because the m's map 1-to-1 with the primes. The number of bits saved is log_2(p)- log_2(p/log(p)) = log_2(p) - [log_2(p) - log_2(log(p))] = log_2(log(p)) which, because of the twice-iterated logarithm, is a very small savings. For 370-bit primes around exp(256), the savings is 8 bits, or a reduction of 1 part in 46. For larger primes, the relative savings is even less. Around exp(2^16), the reduction is 1 part in 5909. Despite the mostly uselessness of this endeavor, we charge forward. Computing m is not practical for large primes. For a practical method, we form a new number x = 2^k * floor(p / 2^k), equivalent to setting k lower bits of p to zero. Then we can encode p as the n-th prime larger than x. For small k, n is easy to compute, and it is easy to recover p from n and x (e.g., with a prime sieve starting from x). Storing n instead of the lower bits of p results in some data savings because primes are less dense. Curiously, the number of bits saved on average is log_2(log(p)), which is the theoretical maximum, and not dependent on k. However, we also need to store k, which eats up some of those savings. Because the number of bits saved is constant, this method paradoxically works less and less well for larger k. However, with small k, n might vary significantly from the average, which could be good or bad. Here is some Haskell code to compute n. Here is some example output on the prime 3^233 + 176, which is around exp(2^8). We can see there is generally around 8 bits of savings, except for 2^21 where we got lucky. input 1476564251485392778927857721313837180933869708288569663932077079002031653266328641356763872492873429131586567699 prime # 1 after 2^ 9 * 2883914553682407771343472111941088244011464274001112624867338044925843072785798127649929438462643416272630015 prime # 3 after 2^ 10 * 1441957276841203885671736055970544122005732137000556312433669022462921536392899063824964719231321708136315007 prime # 8 after 2^ 11 * 720978638420601942835868027985272061002866068500278156216834511231460768196449531912482359615660854068157503 prime # 12 after 2^ 12 * 360489319210300971417934013992636030501433034250139078108417255615730384098224765956241179807830427034078751 prime # 24 after 2^ 13 * 180244659605150485708967006996318015250716517125069539054208627807865192049112382978120589903915213517039375 prime # 59 after 2^ 14 * 90122329802575242854483503498159007625358258562534769527104313903932596024556191489060294951957606758519687 prime # 129 after 2^ 15 * 45061164901287621427241751749079503812679129281267384763552156951966298012278095744530147475978803379259843 prime # 252 after 2^ 16 * 22530582450643810713620875874539751906339564640633692381776078475983149006139047872265073737989401689629921 prime # 526 after 2^ 21 * 704080701582619084800652371079367247073111395019802886930502452374473406441845246008283554312168802800935 prime # 8789 after 2^ 22 * 352040350791309542400326185539683623536555697509901443465251226187236703220922623004141777156084401400467 Future exploration: "Luck" above is easily explained by long internal strings of zeroes in the binary representation of p: the least significant 21 bits of p are 000011111111000010011. We could also try to take advantage of long internal strings of ones. More generally, sieving for primes works well for regions defined by arithmetic progressions. We might search generally for representations of p as the "a-th prime number of the form b*x + c larger/smaller than d*e^k". Above we explored b=1, c=0, "larger", e=2, d unrestricted. We could also explore anchor expressions more complicated than d*e^k. ## October 14, 2017 ### Dan Piponi (sigfpe) # A tail we don't need to wag Introduction I've been reading a little about concentration inequalities recently. I thought it would be nice to see if you can use the key idea, if not the actual theorems, to reduce the complexity of computing the probability distribution of the outcome of stochastic simulations. Examples might include random walks, or queues. The key idea behind concentration inequalities is that very often most of the probability is owned by a small proportion of the possible outcomes. For example, if we toss a fair coin enough (say ) times we expect the number of heads to lie within of the mean about 99.99% of the time despite there being different total numbers possible. The probable outcomes tend to concentrate around the expectation. On the other hand, if we consider not the total number of heads, but the possible sequences of tosses, there are possibilities, all equally likely. In this case there is no concentration. So a key ingredient here is a reduction operation: in this case reducing a sequence of tosses to a count of the number that came up heads. This is something we can use in a computer program. I (and many others) have written about the "vector space" monad that can be used to compute probability distributions of outcomes of simulations and I'll assume some familiarity with that. Essentially it is a "weighted list" monad which is similar to the list monad except that in addition to tracking all possible outcomes, it also propagates a probability along each path. Unfortunately it needs to follow through every possible path through a simulation. For example, in the case of simulating coin tosses it needs to track different possiblities, even though we're only interested in the possible sums. If, after each bind operation of the monad, we could collect together all paths that give the same total then we could make this code much more efficient. The catch is that to collect together elements of a type the elements need to be comparable, for example instances of Eq or Ord. This conflicts with the type of Monad which requires that we can use the >>= :: m a -> (a -> m b) -> m b and return :: a -> m a functions with any types a and b. I'm going to deal with this by adapting a technique presented by Oleg Kiselyov for efficiently implementing the Set monad. Instead of Set I'm going to use the Map type to represent probability distributions. These will store maps saying, for each element of a type, what the probability of that element is. So part of my code is going to be a direct translation of that code to use the Map type instead of the Set type. > {-# LANGUAGE GADTs, FlexibleInstances #-}> {-# LANGUAGE ViewPatterns #-}> module Main where> import Control.Monad> import Control.Arrow> import qualified Data.Map as M> import qualified Data.List as L The following code is very similar to Oleg's. But for first reading I should point out some differences that I want you to ignore. The type representing a probability distribution is P: > data P p a where> POrd :: Ord a => p -> M.Map a p -> P p a> PAny :: p -> [(a, p)] -> P p a But note how the constructors take two arguments - a number that is a probability, in addition to a weighted Map or list. For now pretend that first argument is zero and that the functions called trimXXX act similarly to the identity: > instance (Ord p, Num p) => Functor (P p) where> fmap = liftM> instance (Ord p, Num p) => Applicative (P p) where> pure = return> (<*>) = ap> instance (Ord p, Num p) => Monad (P p) where> return x = PAny 0 [(x, 1)]> m >>= f = > let (e, pdf) = unP m> in trimAdd e collect map (f *** id) pdf> returnP :: (Ord p, Num p, Ord a) => a -> P p a> returnP a = POrd 0 M.singleton a 1> unP :: P p a -> (p, [(a, p)])> unP (POrd e pdf) = (e, M.toList pdf)> unP (PAny e pdf) = (e, pdf)> fromList :: (Num p, Ord a) => [(a, p)] -> M.Map a p> fromList = M.fromListWith (+)> union :: (Num p, Ord a) => M.Map a p -> M.Map a p -> M.Map a p> union = M.unionWith (+)> scaleList :: Num p => p -> [(a, p)] -> [(a, p)]> scaleList weight = map (id *** (weight *))> scaleMap :: (Num p, Ord a) => p -> M.Map a p -> M.Map a p> scaleMap weight = fromList . scaleList weight . M.toList This is a translation of Oleg's crucial function that allows us to take a weighted list of probability distributions and flatten them down to a single probability distribution: > collect :: Num p => [(P p a, p)] -> P p a> collect [] = PAny 0 []> collect ((POrd e0 pdf0, weight) : rest) => let wpdf0 = scaleMap weight pdf0> in case collect rest of> POrd e1 pdf1 -> POrd (weight*e0+e1) wpdf0 union pdf1> PAny e1 pdf1 -> POrd (weight*e0+e1) wpdf0 union fromList pdf1> collect ((PAny e0 pdf0, weight) : rest) => let wpdf0 = scaleList weight pdf0> in case collect rest of> POrd e1 pdf1 -> POrd (weight*e0+e1) fromList wpdf0 union pdf1> PAny e1 pdf1 -> PAny (weight*e0+e1) wpdf0 ++ pdf1 But now I really must explain what the first argument to POrd and PAny is and why I have all that "trimming". Even though the collect function allows us to reduce the number of elements in our PDFs, we'd like to take advantage of concentration of probability to reduce the number even further. The trim function keeps only the top probabilities in a PDF, discarding the rest. To be honest, this is the only point worth taking away from what I've written here :-) When we throw away elements of the PDF our probabilities no longer sum to 1. So I use the first argument of the constructors as a convenient place to store the amount of probability that I've thrown away. The trim function keeps the most likely outcomes and sums the probability of the remainder. I don't actually need to keep track of what has been discarded. In principle we could reconstruct this value by looking at how much the probabilities in our trimmed partial PDFs fall short of summing to 1. But confirming that our discarded probability and our partial PDF sums to 1 gives a nice safety check for our code and can give us some warning if numerical errors start creeping in. I'll call the total discarded probability the tail probability. Here is the core function to keep the top values. In this case is given by a global constant called trimSize. (I'll talk about how to do this better later.) > trimList :: (Ord p, Num p) => [(a, p)] -> (p, [(a, p)])> trimList ps => let (keep, discard) = L.splitAt trimSize (L.sortOn (negate . snd) ps)> in (sum (map snd discard), keep)> trimAdd :: (Ord p, Num p) => p -> P p a -> P p a> trimAdd e' (POrd e pdf) => let (f, trimmedPdf) = trimList (M.toList pdf)> in POrd (e'+e+f) (M.fromList trimmedPdf)> trimAdd e' (PAny e pdf) => let (f, trimmedPdf) = trimList pdf> in PAny (e'+e+f) trimmedPdf> runP :: (Num p, Ord a) => P p a -> (p, M.Map a p)> runP (POrd e pdf) = (e, pdf)> runP (PAny e pdf) = (e, fromList pdf) And now some functions representing textbook probability distributions. First the uniform distribution on a finite set. Again this is very similar to Oleg's chooseOrd function apart from the fact that it assigns weights to each element: > chooseP :: (Fractional p, Ord p, Ord a) =>> [a] -> P p a> chooseP xs = let p = 1/fromIntegral (length xs)> in POrd 0 fromList map (flip (,) p) xs And the Bernoulli distribution, i.e. tossing a Bool coin that comes up True with probability : > bernoulliP :: (Fractional p, Ord p) =>> p -> P p Bool> bernoulliP p = POrd 0 fromList [(False, 1-p), (True, p)] Now we can try a random walk in one dimension. At each step we have a 50/50 chance of standing still or taking a step to the right: > random_walk1 :: Int -> P Double Int> random_walk1 0 = returnP 0> random_walk1 n = do> a <- random_walk1 (n-1)> b <- chooseP [0, 1]> returnP a+b Below in main we take 2048 steps but only track 512 probabilities. The tail probability in this case is about . So only tracking 1/4 of the outcomes has had almost no impact on the numbers. This also illustrates why it is good to track the tail probabilities rather than inferring them from the missing probabilities in the bulk of the PDF - they can be so small they vanish compared to floating poimnt errors. We can afford to track a lot fewer than 512 (out of 2049 possible) outcomes and still have a good representative PDF. Now here's a two-dimensional random walk for 32 steps. The tail probability is about so we are getting a reasonably representative PDF. We have to run fewer steps than before, however, because the space of possible outcomes spans two dimensions, meaning that reduction doesn't help as much as it does in one dimension. > random_walk2 :: Int -> (Int, Int) -> P Double (Int, Int)> random_walk2 0 (x, y) = returnP (x, y)> random_walk2 n (x, y) = do> (x',y') <- random_walk2 (n-1) (x, y)> dx <- chooseP [-1, 1]> dy <- chooseP [-1, 1]> returnP (x'+dx, y'+dy) One last simulation. This is a queing scenario. Tasks come in once every tick of the clock. There are four queues a task can be assigned to. A task is assigned to the shortest queue. Meanwhile each queue as a 1/4 probability of clearing one item at each tick of the clock. We build the PDF for the maximum length any queue has at any time. The first argument to queue is the number of ticks of the clock. The second argument is the list of lengths of the queues. It returns a PDF, not just on the current queue size, but also on the longest queue it has seen. > queue :: Int -> [Int] -> P Double (Int, [Int])> queue 0 ls = returnP (maximum ls, ls)> queue n ls = do> (longest, ls1) <- queue (n-1) ls> ls2 <- forM ls1 \l -> do> served <- bernoulliP (1/4)> returnP if served && l > 0 then l-1 else l> let ls3 = L.sort head ls2+1 : tail ls2> returnP (longest max maximum ls3, ls3) For the queing simulation the tail probability is around despite the fact that we have discarded a vast possible set of possible outcomes. It's a little ugly that trimSize is a global constant: > trimSize = 512 The correct solution is probably to separate the probability "syntax" from its "semantics". In other words, we should implement a free monad supporting the language of probability with suitable constructors for bernoulliP and choiceP. We can then write a separate interpreter which takes a trimSize as argument. This has another advantage too: the Monad above isn't a true monad. It uses a greedy approach to discarding probabilities and different rearrangements of the code, that ought to give identical results, may end up diferent. By using a free monad we ensure that our interface is a true monad and we can put the part of the code that breaks the monad laws into the interpreter. The catch is that my first attempt at writing a free monad resulted in code with poor performance. So I'll leave an efficient version as an exercise :-) > main = do> print runP random_walk1 2048> print runP random_walk2 32 (0, 0)> print runP do> (r, _) <- queue 128 [0, 0, 0, 0]> returnP r ## October 12, 2017 ### Functional Jobs # Analytics Developer for a mobile app at EmployeeChannel, Inc. (Full-time) Who we are EmployeeChannel is a leading provider of award-winning mobile apps for employee engagement and communication, enabling HR and Internal Communications teams to boost the impact and effectiveness of employee communication, to create a positive employee experience, and to drive cultural and business outcomes. The EmployeeChannel app extends the knowledge and reach of organizational experts to employees anytime, anywhere and is dedicated to the interactions between an organization and its employees. Employees can find and receive organizational information easier and faster, get personalized responses to requests, and react quickly to time-sensitive events. HR and Internal Communications teams can respond real-time to organizational imperatives and employee needs using behavioral insights from Voice of the Employee analytics. To learn more about the EmployeeChannel app and how it can be used to engage and communicate with employees, please visit www.employeechannelinc.com. What we are looking for We are an early stage start-up in downtown San Francisco with great product, a solid business model, and tremendous upside. Fully funded and about to scale – a great time to join! We are searching for talented and motivated individuals who like a good challenge, enjoy teamwork, and are passionate about building something of real value. You’ll be joining an experienced team that knows how to execute to a market-defining strategy. We work hard, because we enjoy what we are doing and the people we work with. We are confident that EmployeeChannel is the team to beat in what is panning out as a huge opportunity. The person we're looking for is first and foremost a developer, who understands and does functional programming, with Haskell experience, and enough math background to understand statistics and analytics. Sound like what you’re looking for? Read on… What you will do As a member of our Product team, you will contribute to both leading edge research and rapid development in our delivery of high-quality, engaging, SaaS-based products. Machine learning, NLP, and analytics drive a significant portion of our product’s value proposition. You will engage with our highly experienced research team to explore and model state-of-the-art use of these frameworks to provide unique insights for our customers. Your work with other developers can have a big impact, heavily influencing our approach to delivering high quality products. We are committed to delivering robust, scalable products that bring a consumer best-in-class feel to the Enterprise market. Here¹s a sample of what that would look like: • You will work with data scientists and software engineers to implement analytics and machine learning algorithms in production-quality Haskell code and deploy them to production servers. You will learn a ton and be challenged regularly in this role. • You will devise tests of integrity and consistency for both data and analytics results. • You¹ll work in an Agile environment with a team whose track record of meeting or exceeding their commitments is strong. • You¹ll enjoy partnering with other developers, knowing that collaborative design sessions and the occasional pair programming are part of how we operate. • We deliver quality code, because we are committed to test-driven development and we take pride in our work. • 1 + 1 = 3 here. We look to leverage open source wherever it makes sense, knowing that we can move faster as a result. Speed matters. • You won¹t feel pigeon-holed here. Working together with Product Managers and our Hosted Ops counterparts is a daily occurrence. • You¹ll have a measurable impact on the success of our product and company. If you¹re looking for a place where you can get your fingerprints all over the product, this is a good place to call home. Your qualifications • The ideal candidate we're looking for is first and foremost a developer, who understands and does functional programming, with Haskell experience, and enough math background to understand statistics and analytics. And above all, our ideal candidate loves functional programming. • 2-3 years of relevant work experience is required. • You have experience with, and appreciation for, functional programming and where it can be advantageously applied. • You have produced production-quality code for a high scale commercial application. • You have used Haskell in projects. You know your way around GHCi, stack, and cabal. • You have designed database schema and worked with databases like Postgres. • You have built the back-end of a web service. • While developing, you formally test your code with unit-test packages such as HSpec. • You are no stranger to mathematics, statistics, and data analysis. If you encountered this with Python, we will not hold that against you. • You have worked as part of an Agile development team. • Git, Apache, and HSpec are all used in our shop. • A BS in a Mathematics-intensive degree (Comp Sci, Math, Physics) or equivalent is required. Get information on how to apply for this position. ### Joachim Breitner # Isabelle functions: Always total, sometimes undefined Often, when I mention how things work in the interactive theorem prover [Isabelle/HOL] (in the following just “Isabelle”1) to people with a strong background in functional programming (whether that means Haskell or Coq or something else), I cause confusion, especially around the issue of what is a function, are function total and what is the business with undefined. In this blog post, I want to explain some these issues, aimed at functional programmers or type theoreticians. Note that this is not meant to be a tutorial; I will not explain how to do these things, and will focus on what they mean. ### HOL is a logic of total functions If I have a Isabelle function f :: a ⇒ b between two types a and b (the function arrow in Isabelle is ⇒, not →), then – by definition of what it means to be a function in HOL – whenever I have a value x :: a, then the expression f x (i.e. f applied to x) is a value of type b. Therefore, and without exception, every Isabelle function is total. In particular, it cannot be that f x does not exist for some x :: a. This is a first difference from Haskell, which does have partial functions like spin :: Maybe Integer -> Bool spin (Just n) = spin (Just (n+1)) Here, neither the expression spin Nothing nor the expression spin (Just 42) produce a value of type Bool: The former raises an exception (“incomplete pattern match”), the latter does not terminate. Confusingly, though, both expressions have type Bool. Because every function is total, this confusion cannot arise in Isabelle: If an expression e has type t, then it is a value of type t. This trait is shared with other total systems, including Coq. Did you notice the emphasis I put on the word “is” here, and how I deliberately did not write “evaluates to” or “returns”? This is because of another big source for confusion: ### Isabelle functions do not compute We (i.e., functional programmers) stole the word “function” from mathematics and repurposed it2. But the word “function”, in the context of Isabelle, refers to the mathematical concept of a function, and it helps to keep that in mind. What is the difference? • A function a → b in functional programming is an algorithm that, given a value of type a, calculates (returns, evaluates to) a value of type b. • A function a ⇒ b in math (or Isabelle) associates with each value of type a a value of type b. For example, the following is a perfectly valid function definition in math (and HOL), but could not be a function in the programming sense: definition foo :: "(nat ⇒ real) ⇒ real" where "foo seq = (if convergent seq then lim seq else 0)" This assigns a real number to every sequence, but it does not compute it in any useful sense. From this it follows that ### Isabelle functions are specified, not defined Consider this function definition: fun plus :: "nat ⇒ nat ⇒ nat" where "plus 0 m = m" | "plus (Suc n) m = Suc (plus n m)" To a functional programmer, this reads plus is a function that analyses its first argument. If that is 0, then it returns the second argument. Otherwise, it calls itself with the predecessor of the first argument and increases the result by one. which is clearly a description of a computation. But to Isabelle, the above reads plus is a binary function on natural numbers, and it satisfies the following two equations: … And in fact, it is not so much Isabelle that reads it this way, but rather the fun command, which is external to the Isabelle logic. The fun command analyses the given equations, constructs a non-recursive definition of plus under the hood, passes that to Isabelle and then proves that the given equations hold for plus. One interesting consequence of this is that different specifications can lead to the same functions. In fact, if we would define plus' by recursing on the second argument, we’d obtain the the same function (i.e. plus = plus' is a theorem, and there would be no way of telling the two apart). ### Termination is a property of specifications, not functions Because a function does not evaluate, it does not make sense to ask if it terminates. The question of termination arises before the function is defined: The fun command can only construct plus in a way that the equations hold if it passes a termination check – very much like Fixpoint in Coq. But while the termination check of Fixpoint in Coq is a deep part of the basic logic, in Isabelle it is simply something that this particular command requires for its internal machinery to go through. At no point does a “termination proof of the function” exist as a theorem inside the logic. And other commands may have other means of defining a function that do not even require such a termination argument! For example, a function specification that is tail-recursive can be turned in to a function, even without a termination proof: The following definition describes a higher-order function that iterates its first argument f on the second argument x until it finds a fixpoint. It is completely polymorphic (the single quote in 'a indicates that this is a type variable): partial_function (tailrec) fixpoint :: "('a ⇒ 'a) ⇒ 'a ⇒ 'a" where "fixpoint f x = (if f x = x then x else fixpoint f (f x))" We can work with this definition just fine. For example, if we instantiate f with (λx. x-1), we can prove that it will always return 0: lemma "fixpoint (λ n . n - 1) (n::nat) = 0" by (induction n) (auto simp add: fixpoint.simps) Similarly, if we have a function that works within the option monad (i.e. |Maybe| in Haskell), its specification can always be turned into a function without an explicit termination proof – here one that calculates the Collatz sequence: partial_function (option) collatz :: "nat ⇒ nat list option" where "collatz n = (if n = 1 then Some [n] else if even n then do { ns <- collatz (n div 2); Some (n # ns) } else do { ns <- collatz (3 * n + 1); Some (n # ns)})" Note that lists in Isabelle are finite (like in Coq, unlike in Haskell), so this function “returns” a list only if the collatz sequence eventually reaches 1. I expect these definitions to make a Coq user very uneasy. How can fixpoint be a total function? What is fixpoint (λn. n+1)? What if we run collatz n for a n where the Collatz sequence does not reach 1?3 We will come back to that question after a little detour… ### HOL is a logic of non-empty types Another big difference between Isabelle and Coq is that in Isabelle, every type is inhabited. Just like the totality of functions, this is a very fundamental fact about what HOL defines to be a type. Isabelle gets away with that design because in Isabelle, we do not use types for propositions (like we do in Coq), so we do not need empty types to denote false propositions. This design has an important consequence: It allows the existence of a polymorphic expression that inhabits any type, namely undefined :: 'a The naming of this term alone has caused a great deal of confusion for Isabelle beginners, or in communication with users of different systems, so I implore you to not read too much into the name. In fact, you will have a better time if you think of it as arbitrary or, even better, unknown. Since undefined can be instantiated at any type, we can instantiate it for example at bool, and we can observe an important fact: undefined is not an extra value besides the “usual ones”. It is simply some value of that type, which is demonstrated in the following lemma: lemma "undefined = True ∨ undefined = False" by auto In fact, if the type has only one value (such as the unit type), then we know the value of undefined for sure: lemma "undefined = ()" by auto It is very handy to be able to produce an expression of any type, as we will see as follows ### Partial functions are just underspecified functions For example, it allows us to translate incomplete function specifications. Consider this definition, Isabelle’s equivalent of Haskell’s partial fromJust function: fun fromSome :: "'a option ⇒ 'a" where "fromSome (Some x) = x" This definition is accepted by fun (albeit with a warning), and the generated function fromSome behaves exactly as specified: when applied to Some x, it is x. The term fromSome None is also a value of type 'a, we just do not know which one it is, as the specification does not address that. So fromSome None behaves just like undefined above, i.e. we can prove lemma "fromSome None = False ∨ fromSome None = True" by auto Here is a small exercise for you: Can you come up with an explanation for the following lemma: fun constOrId :: "bool ⇒ bool" where "constOrId True = True" lemma "constOrId = (λ_.True) ∨ constOrId = (λx. x)" by (metis (full_types) constOrId.simps) Overall, this behavior makes sense if we remember that function “definitions” in Isabelle are not really definitions, but rather specifications. And a partial function “definition” is simply a underspecification. The resulting function is simply any function hat fulfills the specification, and the two lemmas above underline that observation. ### Nonterminating functions are also just underspecified Let us return to the puzzle posed by fixpoint above. Clearly, the function – seen as a functional program – is not total: When passed the argument (λn. n + 1) or (λb. ¬b) it will loop forever trying to find a fixed point. But Isabelle functions are not functional programs, and the definitions are just specifications. What does the specification say about the case when f has no fixed-point? It states that the equation fixpoint f x = fixpoint f (f x) holds. And this equation has a solution, for example fixpoint f _ = undefined. Or more concretely: The specification of the fixpoint function states that fixpoint (λb. ¬b) True = fixpoint (λb. ¬b) False has to hold, but it does not specify which particular value (True or False) it should denote – any is fine. ### Not all function specifications are ok At this point you might wonder: Can I just specify any equations for a function f and get a function out of that? But rest assured: That is not the case. For example, no Isabelle command allows you define a function bogus :: () ⇒ nat with the equation bogus () = Suc (bogus ()), because this equation does not have a solution. We can actually prove that such a function cannot exist: lemma no_bogus: "∄ bogus. bogus () = Suc (bogus ())" by simp (Of course, not_bogus () = not_bogus () is just fine…) ### You cannot reason about partiality in Isabelle We have seen that there are many ways to define functions that one might consider “partial”. Given a function, can we prove that it is not “partial” in that sense? Unfortunately, but unavoidably, no: Since undefined is not a separate, recognizable value, but rather simply an unknown one, there is no way of stating that “A function result is not specified”. Here is an example that demonstrates this: Two “partial” functions (one with not all cases specified, the other one with a self-referential specification) are indistinguishable from the total variant: fun partial1 :: "bool ⇒ unit" where "partial1 True = ()" partial_function (tailrec) partial2 :: "bool ⇒ unit" where "partial2 b = partial2 b" fun total :: "bool ⇒ unit" where "total True = ()" | "total False = ()" lemma "partial1 = total ∧ partial2 = total" by auto If you really do want to reason about partiality of functional programs in Isabelle, you should consider implementing them not as plain HOL functions, but rather use HOLCF, where you can give equational specifications of functional programs and obtain continuous functions between domains. In that setting, ⊥ ≠ () and partial2 = ⊥ ≠ total. We have done that to verify some of HLint’s equations. ### You can still compute with Isabelle functions I hope by this point, I have not scared away anyone who wants to use Isabelle for functional programming, and in fact, you can use it for that. If the equations that you pass to fun are a reasonable definition for a function (in the programming sense), then these equations, used as rewriting rules, will allow you to “compute” that function quite like you would in Coq or Haskell. Moreover, Isabelle supports code extraction: You can take the equations of your Isabelle functions and have them expored into Ocaml, Haskell, Scala or Standard ML. See Concon for a conference management system with confidentially verified in Isabelle. While these usually are the equations you defined the function with, they don't have to: You can declare other proved equations to be used for code extraction, e.g. to refine your elegant definitions to performant ones. Like with code extraction from Coq to, say, Haskell, the adequacy of the translations rests on a “moral reasoning” foundation. Unlike extraction from Coq, where you have an (unformalized) guarantee that the resulting Haskell code is terminating, you do not get that guarantee from Isabelle. Conversely, this allows you do reason about and extract non-terminating programs, like fixpoint, which is not possible in Coq. There is currently ongoing work about verified code generation, where the code equations are reflected into a deep embedding of HOL in Isabelle that would allow explicit termination proofs. ### Conclusion We have seen how in Isabelle, every function is total. Function declarations have equations, but these do not define the function in an computational sense, but rather specify them. Because in HOL, there are no empty types, many specifications that appear partial (incomplete patterns, non-terminating recursion) have solutions in the space of total functions. Partiality in the specification is no longer visible in the final product. ### PS: Axiom undefined in Coq This section is speculative, and an invitation for discussion. Coq already distinguishes between types used in programs (Set) and types used in proofs Prop. Could Coq ensure that every t : Set is non-empty? I imagine this would require additional checks in the Inductive command, similar to the checks that the Isabelle command datatype has to perform4, and it would disallow Empty_set. If so, then it would be sound to add the following axiom Axiom undefined : forall (a : Set), a. wouldn't it? This axiom does not have any computational meaning, but that seems to be ok for optional Coq axioms, like classical reasoning or function extensionality. With this in place, how much of what I describe above about function definitions in Isabelle could now be done soundly in Coq. Certainly pattern matches would not have to be complete and could sport an implicit case _ ⇒ undefined. Would it “help” with non-obviously terminating functions? Would it allow a Coq command Tailrecursive that accepts any tailrecursive function without a termination check? 1. Isabelle is a metalogical framework, and other logics, e.g. Isabelle/ZF, behave differently. For the purpose of this blog post, I always mean Isabelle/HOL. 2. Isabelle is a metalogical framework, and other logics, e.g. Isabelle/ZF, behave differently. For the purpose of this blog post, I always mean Isabelle/HOL. 3. Let me know if you find such an n. Besides n = 0. 4. Like fun, the constructions by datatype are not part of the logic, but create a type definition from more primitive notions that is isomorphic to the specified data type. ## October 10, 2017 ### Russell O'Connor # Functor-Oriented Programming My style of Haskell programming has been evolving over the 15 years that I have been working with it. It is turning into something that I would like to call “functor oriented programming”. The traditional use of typed functional programming focuses on data types. One defines data types to model the data structures that your program operates on, and one writes functions to transform between these structures. One of the primary goals in this traditional methodology is to create data structures that exclude erroneous states to the extent that is reasonably possible. As long as one ensures that pattern matching is complete, then the type system catches many errors that would otherwise lead to these erroneous states, which have been crafted to be unrepresentable. Functor oriented programming is a refinement of this traditional focus on data types. I was reminded of this concept recently when I was working with wren’s fantastic unification-fd library. With functor oriented programming, one divides data structures into layers of functors that, when composed together, form the data structures that your program operates on. Instead of writing transformations between data structures, one writes natural transformations between functors, where a natural transformation between functors F and G is a polymorphic function of type forall a. F a -> G a. While traditional functions often operate on products of multiple inputs and/or outputs, with functor oriented programming one will often see functions operating on compositions of functors, including but not limited to distribution functions of type forall a. F (G a) -> G (F a) and half-way distribution functions forall a. F (G a) -> G (H a), and many others. By dividing data structures up into layers of functors, one can create a separation of concerns that does not occur in traditional functional programming. With functor oriented programming, polymorphism is not necessarily about using functions polymorphically. Instead, polymorphism provides correctness guarantees by ensuring that a particular function can only touch the specific layers of functors in its type signature and is independent of the rest of the data structure. One benefits from polymorphism even when a function is only ever invoked at a single type. The appearance of many natural transformations is one hallmark of functor oriented programming. Higher-order natural transformations will invoke Rank2Types and RankNTypes, which is another hallmark of functor oriented programming. Other hallmarks of functor oriented programming include open recursive types, which allows one to divide up recursive types into their layers of recursion and create natural transformations that operate on a single layer at a time. Open recursive types plays an important role in wren’s unification library. As fine of a language that Haskell is, it is not actually that suitable for functor oriented programming. The problem is that, under normal circumstances, there is no reduction or equivalence classes at the type level. For example, the identity functor does not transparently disappear during composition, the Compose functor is not transparently associative, and the Swap functor composed with itself does not reduce to the identity functor. To cope with this one must litter one’s code with newtype wrapper and unwrappers to make all these natural transformations explicit. In principle, these transformations should have no run-time consequences, but when they are used in higher-order ways, unfortunately they sometimes do. Despite the problems, I am not aware of any another practical language that better supports this style of programming. I think Haskell’s higher-kinded type classes and the progression of Monad, Applicative, Foldable, Traversable, etc. classes have been instrumental in leading to the development of this style of programming as they further motivate the division of one’s data structures into these layers of functors. I have been thinking about writing this post for a few years now, and wanted to write something convincing; however, I do not think I am up to the task. Instead of trying to persuade the reader, I have elected to try to simply name and describe this style of programming so that the reader might notice it themselves when reading and writing code. Hopefully someone more capable than me can evangelize this approach, and perhaps even create a practical language suitable for this style of programming. ## October 08, 2017 ### Russell O'Connor # Taking Propositions as Types Seriously A while back I decided to try my hand at using Agda and I wrote a proof of the Church-Rosser theorem in it. It was a fun exercise. I took all the knowledge I have picked up over the years about using dependent types to represent binders, to define well-typed terms, and what I have learned about the Category of Contexts and applied it to my definition of Terms for a simply typed lambda calculus. I am proud of the result. They say you do not understand a topic in mathematics until you can teach it. And you do not really understand it until you can prove it in Coq. And you do not really really understand it until you can prove it in Agda. What really struck me was how my exercise in Adga affected my understanding of “Propositions as Types”. I view types as being divided into roughly two categories. Some types are propositions. They are the types that have at most one inhabitant, which is to say types for which you can prove that all their inhabitants are equal. Other types are data types. They are types with potentially more than one inhabitant. As such you can distinguish between values by (possibly indirectly) doing case analysis on them. Indeed, HoTT defines propositions in exactly this way. This classification of types is not fundamental in the theory of types. The theory of types treats both propositions and data types uniformly. I simply find it a useful way of characterizing types when programming and reasoning about programs with dependent type theory. The void and unit types, ⊥ and ⊤ respectively, are both propositions. We can define a function from the Boolean type to the universe of types which map the two Boolean values to these two types. In this way we can covert any Boolean valued (computable) predicate into a logical (type-theoretical) predicate. To me the phrase “Proposition as Types” just meant the embedding of logical proposition as types with at most one inhabitant. For example, given a decidable type A, we can compute if a given value of type A is a member of a given list of As. This computable predicate can be lifted to a logical predicate to define a logical membership relationship. Expressions using this logical membership relationship are propositions according to the above definition of proposition. This is a fine way to do formal reasoning. However, this is not the way that membership is typically defined in Agda. Instead, Agda defines the membership relation as an inductive family whose constructors witness that an element is either the head of the list, or is a member of the tail of the list. (Coq also defines the membership relation this way; however it is marked as non-extractable for computation by virtue of being a proposition.) The type (al) is, in general, a data type rather than a proposition. When the value a occurs multiple times in l, then the type (al) has multiple different “proofs” corresponding the the different occurrences of a within l. Values of this type act as “the type of indexes where a occurs in l”, and one can write programs that operate over this type. Given two lists, l1 and l2, the “proposition” that one list is a subset of the other states that any element of l1 is also an element of l2: l1l2 ≔ ∀a. al1al2 In dependent type theory this implication is represented as a function. Because our membership relation is a data type, this subset relation is represented by a real program. Specifically it is a program that, for any value, maps each index where it occurs in l1 to some index where it occurs in l2; you can really evaluate this function. This subset type is also, in general, a data type because there can be multiple such functions, which represent all the possible permutations that maps l1 onto l2. The consequences of this are fantastic. For example, what you normally think of as a theorem for weakening, weaken : ∀ {Γ₁ Γ₂ A} → Γ₁ ⊆ Γ₂ → Term Γ₁ A → Term Γ₂ A also captures contraction and exchange due to this definition of subset. All the structural rules of the lambda calculus are subsumed within a single theorem! It took me several attempts before I learned to take full advantage of this sort of logic. For example, I originally defined a module as an inductive family. However, I eventually settled on a functional representation for modules: -- A Module is a list of terms of types Δ all in the same context Γ. -- A Module is a block of terms for simultaneous substitution. -- A Module is an arrow in the category of contexts. -- A Module is a profuctor. Module : List Type → List Type → Set Module Γ Δ = ∀ {A} → A ∈ Δ → Term Γ A This definition means a module can evaluate. In particular modules can be partially evaluated during type-checking, which greatly simplified many of my proofs as compared to using the inductive family way of defining modules. Notice how we make use of values of A ∈ Δ as data defining an index into the context Δ. If Δ has multiple occurrences of A, then the term generated by the module can be different depending on which occurrence the index is referencing. In conclusion, although I do like thinking of types as divided between propositional types and data types, it can be fruitful to view expressions that mathematics traditionally treats as propositions instead as real data types, and really program with them. This is what I mean by taking "Propositions as Types" seriously. ### Gabriel Gonzalez # Why do our programs need to read input and write output? <html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/> <meta content="text/css" http-equiv="Content-Style-Type"/> <meta content="pandoc" name="generator"/> <style type="text/css">code{white-space: pre;}</style> <style type="text/css">div.sourceCode { overflow-x: auto; } table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode { margin: 0; padding: 0; vertical-align: baseline; border: none; } table.sourceCode { width: 100%; line-height: 100%; } td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; } td.sourceCode { padding-left: 5px; } code > span.kw { color: #007020; font-weight: bold; } /* Keyword */ code > span.dt { color: #902000; } /* DataType */ code > span.dv { color: #40a070; } /* DecVal */ code > span.bn { color: #40a070; } /* BaseN */ code > span.fl { color: #40a070; } /* Float */ code > span.ch { color: #4070a0; } /* Char */ code > span.st { color: #4070a0; } /* String */ code > span.co { color: #60a0b0; font-style: italic; } /* Comment */ code > span.ot { color: #007020; } /* Other */ code > span.al { color: #ff0000; font-weight: bold; } /* Alert */ code > span.fu { color: #06287e; } /* Function */ code > span.er { color: #ff0000; font-weight: bold; } /* Error */ code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ code > span.cn { color: #880000; } /* Constant */ code > span.sc { color: #4070a0; } /* SpecialChar */ code > span.vs { color: #4070a0; } /* VerbatimString */ code > span.ss { color: #bb6688; } /* SpecialString */ code > span.im { } /* Import */ code > span.va { color: #19177c; } /* Variable */ code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ code > span.op { color: #666666; } /* Operator */ code > span.bu { } /* BuiltIn */ code > span.ex { } /* Extension */ code > span.pp { color: #bc7a00; } /* Preprocessor */ code > span.at { color: #7d9029; } /* Attribute */ code > span.do { color: #ba2121; font-style: italic; } /* Documentation */ code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ </style></head><body> I wrote this post to challenge basic assumptions that people make about software architecture, which is why I chose a deliberately provocative title. You might not agree with all the points that I am about to make, but I do hope that this post changes the way that you think about programming This post is an attempt to restate in my own words what Conal Elliot (and others before him) have been saying for a while: modern programming is a Rube-Goldberg machine that could be much simpler if we change the way we compose code. Most programmers already intuitively understand this at some level. They will tell you that the programming ecosystem is deeply flawed, fragmented, and overly complex. I know that I felt that way myself for a long time, but in retrospect I believe that my objections at the time were superficial. There were deeper issues with programming that I was blind to because they are so ingrained in our collective psyche. Disclaimer: This post uses my pet configuration language Dhall to illustrate several points, mainly because Dhall is a constructive proof-of-concept of these ideas. The purpose of this post is not so much to convince you to use Dhall but rather to convince you to think about software in a new way ## Input and output Consider the title of this post for example: "Why do our programs need to read input and write output?" Most people will answer the title with something like: • "Our programs need a way to communicate with the outside world" • "Programs need to do something other than heat up CPUs" Now suppose I phrased the question in a different way: "What if only the compiler could read input and write output?" "What's the difference?", you might ask. "Surely you mean that the language provides some library function that I can call to read or write input to some handles, whether they be file handles or standard input/output." No, I mean that only the compiler implementation is allowed to read input or write output, but programs written within the compiled language cannot read input or write output. You can only compute pure functions within the language proper. Again, this probably seems ridiculous. How could you communicate at all with the program? ## Imports Most languages have some way to import code, typically bundled in the form of packages. An imported function or value is something that our compiler reads as input statically at compile time as opposed to a value read by our program dynamically at run time. Suppose I told you that our hypothetical programming language could only read input values by importing them "Ridiculous!" you exclaim as you spit out your coffee. "Nobody would ever use such a language." You probably wouldn't even know where to begin since so many things seem wrong with that proposal Perhaps you would object to the heavyweight process for publishing and subscribing to new values. You would recite the package management process for your favorite programming language: • Create a source module containing your value • Create a standard project layout • Create a package description file • Check your project into version control • Publish your package to a package repository Perhaps you would object to the heavyweight process for configuring programs via imports? Your favorite programming language would typically require you to: • Retrieve the relevant program from version control • Modify the project description file to reference the newly published dependency • Modify project code to import your newly published value • Compile the program • Run the program Why would a non-technical end user do any of that just to read and write values? This is exactly the Rube-Goldberg machine I'm referring to. We have come to expect a heavyweight process for source code to depend on other source code ## Importing paths Distributing code doesn't have to be heavyweight, though. Consider Dhall's import system which lets you reference expressions directly by their paths. For example, suppose we saved the value True to a file named example.dhall:  echo 'True' > example.dhall Another Dhall program can reference the above file anywhere the program expects a boolean value:  dhall <<< './example.dhall || False'BoolTrue This is the exact same as if we had just replaced the path with the file's contents:  dhall <<< 'True || False'BoolTrue Dhall doesn't need to support an explicit operation to read input because Dhall can read values by just importing them Similarly, Dhall doesn't need to support an explicit write operation either. Just save a Dhall expression to a file using your favorite text editor. "What if I need a way to automate the generation of files?" You don't need to automate the process of saving a file because one file is always sufficiently rich to store as much information as you need. Dhall is a programmable configuration language which supports lists and records so any one file can store or programmatically generate any number of values. Files are human-generated artifacts which exist purely for our convenience but Dhall code does not behave any differently whether or not the program spans multiple files or a single file. Most of the time people need to automate reads and writes because they are using non-programmable configuration file formats or data storage formats ## Programmable configuration You might object: "Configuration files shouldn't be Turing-complete!" However, Dhall is not Turing-complete. Dhall is a total programming language, meaning that evaluation eventually halts. In practice, we don't actually care that much if Dhall halts, since we cannot guarantee that the program halts on reasonable human timescales. However, we can statically analyze Dhall and most of the Halting Problem objections to static analysis don't apply. For example, Dhall can statically guarantee that programs never fail or throw exceptions (because obviously a configuration file should never crash). Dhall also lets you simplify confusing files by eliminating all indirection because Dhall can reduce every program to a canonical normal form. In fact, most objections to programmable configuration files are actually objections to Turing-completeness ## Importing URLs Dhall also lets you import code by URL instead of path. Dhall hosts the Prelude of standard utilities online using IPFS (a distributed hashtable for the web),and you can browse the Prelude using this link, which redirects to the latest version of the Prelude: For example, you can browse the above Prelude to find the not function hosted here: ... which has the following contents:  curl https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/not{-Flip the value of a BoolExamples:./not True = False./not False = True-}let not : Bool → Bool = λ(b : Bool) → b == Falsein not ... and we can "apply" our URL to our file as if they were both just ordinary functions and values:  dhall <<< 'https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/not ./example.dhall'BoolFalse ... or we could use let-expressions to improve readability without changing the result:  let not = https://ipfs.io/ipfs/QmQ8w5PLcsNz56dMvRtq54vbuPe9cNnCCUXAQp6xLc6Ccx/Prelude/Bool/notin let example = ./example.dhallin not example "That's a security vulnerability!", you protest. "You are ... literally ... injecting remote code into your program." Playing the devil's advocate, I ask you what is wrong with remote code injection "Well, for starters, if the URL is compromised the attacker can run arbitrary code like ..." ... reading and writing files? Dhall is totally pure and doesn't support any effects at all (besides heating up CPUs ☺). This brings us full circle back to our original question: "Why do our programs need to read input and write output?" ## Conclusion Software doesn't have to be so complex if we challenge our assumptions that got us into this mess These ideas were heavily inspired the following post by Conal Elliot: ... and this post is not about Dhall so much as Conal's vision of an effect-free purely functional future for programming. I believe explicitly reading input and writing output will eventually become low-level implementation details of higher-level languages, analogous to allocating stack registers or managing memory. </body></html> ### Joachim Breitner # e.g. in TeX When I learned TeX, I was told to not write e.g. something, because TeX would think the period after the “g” ends a sentence, and introduce a wider, inter-sentence space. Instead, I was to write e.g.\␣. Years later, I learned from a convincing, but since forgotten source, that in fact e.g.\@ is the proper thing to write. I vaguely remembering that e.g.\␣ supposedly affected the inter-word space in some unwanted way. So I did that for many years. Until I recently was called out for doing it wrong, and that infact e.g.\␣ is the proper way. This was supported by a StackExchange answer written by a LaTeX authority and backed by a reference to documentation. The same question has, however, another answer by another TeX authority, backed by an analysis of the implementation, which concludes that e.g.\@ is proper. What now? I guess I just have to find it out myself. The above image shows three variants: The obviously broken version with e.g., and the two contesting variants to fix it. Looks like they yield equal results! So maybe the difference lies in how \@ and \␣ react when the line length changes, and the word wrapping require differences in the inter-word spacing. Will there be differences? Let’s see; I cannot see any difference. But the inter-sentence whitespace ate most of the expansion. Is there a difference visible if we have only inter-word spacing in the line? Again, I see the same behaviour. Conclusion: It does not matter, but e.g.\␣ is less hassle when using lhs2tex than e.g.\@ (which has to be escaped as e.g.\@@), so the winner is e.g.\␣! (Unless you put it in a macro, then \@ might be preferable, and it is still needed between a captial letter and a sentence period.) ## October 06, 2017 ### Sandy Maguire # Review: Bananas, Lenses, Envelopes and Barbed Wire <article> <header> # Review: Bananas, Lenses, Envelopes and Barbed Wire </header> <time>October 6, 2017</time> Today’s classic functional programming paper we will review is Meijer et al.’s Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire. The exciting gist of the paper is that all explicit recursion can be factored out into a few core combinators. As such, the reasoning is that we should instead learn these combinators (or “recursion schemes” as they’re called), rather than doing our own ad-hoc recursion whenever we need it. Despite being a marvelous paper, it falls into the all-too-common flaw of functional programming papers, which is to have an absolutely horrible title. “Bananas”, “lenses”, “envelopes” and “barbed wire” correspond to obscure pieces of syntax invented to express these ideas. In our treatment of the literature, we will instead use standard Haskell syntax, and refer to the paper as Functional Programming with Recursion Schemes. ## Specialized Examples of Recursion Schemes ### Catamorphisms over Lists Catamorphisms refer to a fold over a datastructure. A mnemonic to remember this is that a catamorphism tears down structures, and that if that structure were our civilization it’d be a catastrophe. By way of example, Meijer et al. present the following specialization of a catamorphism over lists: Let default :: b and step :: a -> b -> b, then a list-catamorphism h :: [a] -> b if a function of the following form: h [] = default h (a : as) = step a (h as) This definition should look pretty familiar; if you specialize the function foldr to lists, you’ll see it has the type: foldr :: (a -> b -> b) -> b -> [a] -> b We can view foldr as taking our values step :: a -> b -> b and default :: b, and then giving back a function that takes an [a] and computes some b. For example, we can write a few of the common prelude functions over lists as catamorphisms of this form. length :: [a] -> Int length = foldr (\_ n -> n + 1) 0 filter :: forall a. (a -> Bool) -> [a] -> [a] filter p = foldr step [] where step :: a -> [a] -> [a] step a as = if p a then a : as else as When written this way – Meijer et al. are quick to point out – the so-called “fusion” law is easily seen: f . foldr step default = foldr step' default' where step' a b = step a (f b) default' = f default which intuitively says that you can “fuse” a catamorphism with a subsequent composition into a single catamorphism. ### Anamorphisms over Lists If a catamorphism refers to a “fold”, an anamorphism corresponds to an unfold of a data structure. A good mnemonic for this is that an anamorphism builds things up, just like anabolic steroids can be an easy way to build up muscle mass. Meijer et al. present this concept over lists with the following (again, very specialized) definition: unfoldr :: (b -> Maybe (a, b)) -> b -> [a] Given a function produce :: b -> Maybe (a, b), a list-anamorphism h :: b -> [a] is defined as: h seed = case produce seed of Just (a, b) = a : h b Nothing = [] As expected, this corresponds to the unfoldr :: (b -> Maybe (a, b)) -> b -> [a] function from Data.List. By way of example, they provide the following: zip :: ([a], [b]) -> [(a, b)] zip = unfoldr produce where produce (as, bs) = if null as || null bs then Nothing else Just ((head as, head bs), (tail as, tail bs)) and iterate :: (a -> a) -> a -> [a] iterate f = unfoldr (\a -> Just (a, f a)) An interesting case is that of map :: (a -> b) -> [a] -> [b]. We note that both the input and output of this function are lists, and thus might suspect the function can be written as either a catamorphism or an anamorphism. And indeed, it can be: cataMap :: (a -> b) -> [a] -> [b] cataMap f = foldr (\a bs -> f a : bs) [] anaMap :: (a -> b) -> [a] -> [b] anaMap f = unfoldr produce where produce [] = Nothing produce (a : as) = Just (f a, as) Neat! ### Hylomorphisms over Lists A hylomorphism over lists is a recursive function of type a -> b whose call-tree is isomorphic to a list. A hylomorphism turns out to be nothing more than a catamorphism following an anamorphism; the anamorphism builds up the call-tree, and the catamorphism evaluates it. An easy example of hylomorphisms is the factorial function, which can be naively (ie. without recursion schemes) implemented as follows: fact :: Int -> Int fact 0 = 1 fact n = n * fact (n - 1) When presented like this, it’s clear that fact will be called a linear number of times in a tail-recursive fashion. That sounds a lot like a list to me, and indeed we can implement fact as a hylomorphism: fact :: Int -> Int fact = foldr (*) 1 . unfoldr (\n -> if n == 0 then Nothing else Just (n, n - 1)) The hylomorphic representation of fact works by unfolding its argument n into a list [n, n-1 .. 1], and then folding that list by multiplying every element in it. However, as Meijer et al. point out, this implementation of fact is a little unsatisfactory. Recall that the natural numbers are themselves an inductive type (data Nat = Zero | Succ Nat), however, according to the paper, there is no easy catamorphism (nor anamorphism) that implements fact. ### Paramorphisms Enter paramorphisms: intuitively catamorphisms that have access to the current state of the structure-being-torn-down. Meijer et al.: [Let init :: b and merge :: Nat -> b -> b. ] For type Nat, a paramorphism is a function h :: Nat -> b of the form: h Zero = init h (Succ n) = merge n (h n) As far as I can tell, there is no function in the Haskell standard library that corresponds to this function-as-specialized, so we will write it ourselves: paraNat :: (Nat -> b -> b) -> b -> Nat -> b paraNat _ init Zero = init paraNat merge init (Succ n) = merge n (paraNat merge init n) We can thus write fact :: Nat -> Nat as a paramorphism: fact :: Nat -> Nat fact = paraNat (\n acc -> (1 + n) * acc) 1 Similarly, we can define paramorphisms over lists via the function: paraList :: (a -> [a] -> b -> b) -> b -> [a] -> b paraList _ init [] = init paraList merge init (a : as) = merge a as (paraList merge init as) with which we can write the function tails :: [a] -> [[a]]: tails :: forall a. [a] -> [[a]] tails = paraList merge [] where merge :: a -> [a] -> [[a]] -> [[a]] merge a as ass = (a : as) : ass ## General Recursion Schemes ### Intuition As you’ve probably guessed, the reason we’ve been talking so much about these recursion schemes is that they generalize to all recursive data types. The trick, of course, is all in the representation. Recall the standard definition of list: data List a = Nil | Cons a (List a) However, there’s no reason we need the explicit recursion in the Cons data structure. Consider instead, an alternative, “fixable” representation: data List' a x = Nil' | Cons' a x If we were somehow able to convince the typesystem to unify x ~ List' a x, we’d get the type List' a (List' a (List' a ...)), which is obviously isomorphic to List a. We’ll look at how to unify this in a second, but a more pressing question is “why would we want to express a list this way?”. It’s a good question, and the answer is we’d want to do this because List' a x is obviously a functor in x. Furthermore, in general, any datastructure we perform this transformation on will be a functor in its previously-recursive x parameter. We’re left only more curious, however. What good is it to us if List' a x is a functor in x? It means that we can replace x with some other type b which is not isomorphic to List a. If you squint and play a little loose with the type isomorphisms, this specializes fmap :: (x -> b) -> List' a x -> List' a b to (List a -> b) -> List' a x -> List' a b. Notice the List a -> b part of this function – that’s a fold of a List a into a b! Unfortunately we’re still left with a List' a b, but this turns out to be a problem only in our handwaving of x ~ List' a x, and the actual technique will in fact give us just a b at the end of the day. ### Algebras An f-algebra is a function of type forall z. f z -> z, which intuitively removes the structure of an f. If you think about it, this is spiritually what a fold does; it removes some structure as it reduces to some value. As it happens, the type of an f-algebra is identical to the parameters required by a catamorphism. Let’s look at List' a-algebras and see how the correspond with our previous examples of catamorphisms. length :: [a] -> Int length = foldr (\_ n -> n + 1) 0 lengthAlgebra :: List' a Int -> Int lengthAlgebra Nil' = 0 lengthAlgebra (Cons' _ n) = n + 1 filter :: forall a. (a -> Bool) -> List a -> List a filter p = foldr step Nil where step a as = if p a then Cons a as else as filterAlgebra :: (a -> Bool) -> List' a (List a) -> (List a) filterAlgebra _ Nil' = Nil filterAlgebra p (Cons' a as) = if p a then Cons a as else as ### Coalgebras f-algebras correspond succinctly to the parameters of catamorphisms over fs. Since catamorphisms are dual to anamorphisms, we should expect that by turning around an algebra we might get a representation of the anamorphism parameters. And we’d be right. Such a thing is called an f-coalgebra of type forall z. z -> f z, and corresponds exactly to these parameters. Let’s look at our previous examples of anamorphisms through this lens: zip :: (List a, List b) -> List (a, b) zip = unfoldr produce where produce (as, bs) = if null as || null bs then Nothing else Just ((head as, head bs), (tail as, tail bs)) zipCoalgebra :: ([a], [b]) -> List' (a, b) ([a], [b]) zipCoalgebra (as, bs) = if null as || null bs then Nil' else Cons (head as, head bs) (tail as, tail bs) iterate :: (a -> a) -> a -> [a] iterate f = unfoldr (\a -> Just (a, f a)) iterateAlgebra :: (a -> a) -> a -> List' a a iterateAlgebra f a = Cons a (f a) You might have noticed that these coalgebras don’t line up as nicely as the algebras did, due namely to the produce functions returning a type of Maybe (a, b), while the coalgebras return a List' a b. Of course, these types are isomorphic (Nothing <=> Nil', Just (a, b) <=> Cons a b), it’s just that the authors of unfoldr didn’t have our List' functor to play with. ### From Algebras to Catamorphisms As we have seen, f-algebras correspond exactly to the parameters of a catamorphism over an f. But how can we actually implement the catamorphism? We’re almost there, but first we need some machinery. type family Fixable t :: * -> * The type family Fixable takes a type to its fixable functor representation. For example, we can use it to connect our List a type to List' a: type instance Fixable (List a) = List' a Now, assuming we have a function toFixable :: t -> Fixable t t, which for lists looks like this: toFixable :: List a -> Fixable (List a) (List a) -- equivalently: toFixable :: List a -> List' a (List a) toFixable Nil = Nil' toFixable (Cons a as) = Cons' a as We can now write our catamorphism! cata :: (Fixable t z -> z) -> (t -> z) cata algebra = algebra . fmap (cata algebra) . toFixable Very cool. What we’ve built here is general machinery for tearing down any inductive data structure t. All we need to do it is its Fixable t representation, and a function project :: t -> Fixable t t. These definitions turn out to be completely mechanical, and thankfully, can be automatically derived for you via the recursion-schemes package. ### Coalgebras and Catamorphisms We can turn all of our machinery around in order to implement anamorphisms. We’ll present this material quickly without much commentary, since there are no new insights here. Given fromFixable :: Fixable t t -> t, we can implement ana: ana :: (z -> Fixable t z) -> z -> t ana coalgebra = fromFixable . fmap (ana coalgebra) . coalgebra ### General Paramorphisms Because there is nothing interesting about hylomorphisms when viewed via our Fixable machinery, we skip directly to paramorphisms. Recall that a paramorphism is a fold that has access to both the accumulated value being constructed as well as the remainder of the structure at any given point in time. We can represent such a thing “algebraically:” Fixable t (t, z) -> z. With a minor tweak to cata, we can get para: para :: (Fixable t (t, z) -> z) -> t -> z para alg = teardown where teardown = alg . fmap (\t -> (t, teardown t)) . toFixable ## Miscellaneous Findings ### All Injective Functions are Catamorphisms Meijer et al. make the somewhat-unsubstantiated claim that all injective functions are catamorphisms. We will reproduce their proof here, and then work through it to convince ourselves of its correctness. Let f :: A -> B be a strict function with left-inverse g. Then for any φ :: F A -> A, we have: g . f = id f . cata φ = cata (f . φ . fmap g) Taking φ = fromFixable we immediately get that any strict injective function can be written as a catamorphism: f = cata (f . fromFixable . fmap g) Sounds, good? I guess? Meijer et al. must think I’m very smart, because it took me about a week of bashing my head against this proof before I got it. There were two stumbling blocks for me which we’ll tackle together. To jog our memories, we’ll look again at the definition of cata: cata φ = φ . fmap (cata φ) . toFixable There are two claims we need to tackle here, the first of which is that given φ = fromFixable: f . cata φ = cata (f . φ . fmap g) We can show this by mathematical induction. We’ll first prove the base case, by analysis of the [] case over list-algebras.  cata (f . φ . fmap g) [] -- definition of cata = f . φ . fmap g . fmap (cata (f . φ . fmap g)) toFixable [] -- definition of toFixable = f . φ . fmap g . fmap (cata (f . φ . fmap g)) NilF -- fmap fusion = f . φ . fmap (g . cata (f . φ . fmap g)) NilF -- NilF is a constant functor = f . φ NilF -- substitute φ = fromFixable = f fromFixable NilF -- definition of fromFixable = f [] Great! That’s the base case tackled. It’s easy to see why this generalizes away from lists to any data structure; the only way to terminate the recursion of a cata is for one of the type’s data constructors to not be recursive (such as Nil). If the data constructor isn’t recursive, its Fixable representation must be a constant functor, and thus the final fmap will always fizzle itself out. Let’s tackle the inductive case now. We’ll look at cases of cata that don’t fizzle out immediately:  cata (f . φ . fmap g) -- definition of cata = f . φ . fmap g . fmap (cata (f . φ . fmap g)) . toFixable -- fmap fusion = f . φ . fmap (g . cata (f . φ . fmap g)) . toFixable -- definition of cata = f . φ . fmap (g . f . φ . fmap g . fmap (cata (f . φ . fmap g)) . toFixable) . toFixable -- fmap fusion = f . φ . fmap (g . f . φ . fmap (g . cata (f . φ . fmap g)) . toFixable) . toFixable -- g . f = id = f . φ . fmap (id . φ . fmap (g . cata (f . φ . fmap g)) . toFixable) . toFixable -- definition of id = f . φ . fmap (φ . fmap (g . cata (f . φ . fmap g)) . toFixable) . toFixable As you can see here, subsequent expansions of cata line their gs and fs up in such a way that they cancel out. Also, we know from our experience looking at the base case that the final g will always sizzle out, and so we don’t need to worry about it only being a left-inverse. The other stumbling block for me was that cata fromFixable = id, but this turns out to be trivial:  cata fromFixable = fromFixable . fmap (cata fromFixable) . toFixable Eventually this will all bottom out when it hits the constant functor, which will give us a giant chain of fromFixable . toFixables, which is obviously id. To circle back to our original claim that all injective functions are catamorphisms, we’re now ready to tackle it for real. f . cata φ = cata (f . φ . fmap g) f . cata fromFixable = cata (f . fromFixable . fmap g) f . id = cata (f . fromFixable . fmap g) f = cata (f . fromFixable . fmap g) \(\blacksquare ### All Surjective Functions are Anamorphisms Anamorphisms are dual to catamorphisms, and surjective functions are dual (in $$\mathbb{Set}$$) to injective functions. Therefore, we can get this proof via duality from the proof that injective functions are catamorphisms. $$\blacksquare$$ ## Closing Remarks Functional Programming with Recursion Schemes has some other (in my opinion) minor contributions about this stuff, such as how catamorphisms preserve strictness, but I feel like we’ve tackled the most interesting pieces of it. It is my hope that this review will serve as a useful complement in understanding the original paper. </article> ### Christopher Allen # Comparing Persistent with Ecto and ActiveRecord Rejected title: You’re not special I saw this article comparing Ecto and ActiveRecord: https://www.dailydrip.com/blog/ecto-vs-activerecord.html I thought I would track alongside that post and show what the equivalent code looks like if you’re using the Persistent Haskell library. # Some examples from their side-by-side comparison I’ll start by simply translating some small, simple examples linked to at the beginning of their article. ## Get all records ### ActiveRecord Model.all ### Ecto Repo.all(App.Model) ### Persistent getAllUsers :: DB [Entity User] getAllUsers = selectList [] [] ## Search by name ### ActiveRecord Model.find_by(name: name) ### Ecto Repo.one(from t in App.Model, where: t.name == ^name, limit: 1) ### Persistent getFirstUserByEmail :: Text -> DB (Maybe (Entity User)) getFirstUserByEmail email = selectFirst [UserEmail ==. email] [] ## Fetch a single record based on id=1 ### ActiveRecord Model.find(1) ### Ecto Model |> Repo.get!(1) ### Persistent getIdOneUser :: DB (Maybe (Entity User)) getIdOneUser = getEntity (toSqlKey 1) # Comparing against the rest of the article I’ll do my usual thing and cite what the original article said, then reply with either code or prose. Let’s talk about the main ideas behind Ecto, and try to compare it with ActiveRecord. Main difference ActiveRecord: We can represent data using: behaviors + state. Ecto: We need to represent data using: functions. It’s just functions and data in Haskell too. Active Record pattern ActiveRecord has a general pattern of accessing data used in Object-oriented languages. So, this is not specfically the Active Record pattern. Using ActiveRecord, we can do: artist = Artist.get(1) artist.name = "Name" artist.save This makes a lot of sense for Object-Oriented languages. Data has behavior and state. This is pretty straightforward. How does Ecto handle that? Repository Pattern As a functional language we don’t have data with state, nor do we have behavior. We only have functions. In general, if you want to talk with the database, you need to talk with the Repository first. artist = Repo.get(Artist, 1) changeset = Artist.changeset(artist, name: "Changed name") Repo.update(changeset) If we check side-by-side what Active Record and repository does, we cannot see when Active Record touches the Database. We just do a save and it hits the database implicitly. In Ecto, you always interact with the database explicitly. Ecto will not talk to the database without you asking it to. Everything is totally explicit. Any interaction with the database should pass through the Repository. This is true of Haskell too, it only talks to the database when you ask it to. Most people have a function named runDB or similar which makes it easy to audit what code actually talks to the database. It also serves to tell you where your transaction boundaries are which is tremendously helpful for atomicity and correctly using your SQL database. Here’s the name-changing example above in Haskell: updateFirstUserName :: Text -> DB () updateFirstUserName newName = do update (toSqlKey 1) [UserName =. newName] If you wanted something that could do the same for any primary key: updateFirstUserName' :: Key User -> Text -> DB () updateFirstUserName' userKey newName = do update userKey [UserName =. newName] updateFirstUserName :: Text -> DB () updateFirstUserName = updateFirstUserName' (toSqlKey 1) ## Schema Schema is normally a map between your types and your database. But not necessarily. If we check the documentation: An Ecto schema is used to map any data source into an Elixir struct. One of such use cases is to map data coming from a repository, usually a table, into Elixir structs. An interesting thing to mention is that we don’t need a schema for using Ecto. We can bypass the use of Schemas by using the table name as a string. Schemas are very flexible. Here is an example of Schema definition in Ecto. defmodule SlackPosting.Journals.Post do use Ecto.Schema import Ecto.Changeset alias SlackPosting.Journals.Post schema "posts" do field :user_slack_id, :string field :user_name, :string field :text, :string many_to_many :tags, SlackPosting.Journals.Tag, join_through: SlackPosting.Journals.PostTag has_many :comments, SlackPosting.Journals.Comment timestamps() end @doc false def changeset(%Post{} = post, attrs) do post |> cast(attrs, [:text, :user_slack_id, :user_name]) |> validate_required([:text, :user_slack_id]) end end Then in Persistent, with a little more of the mentioned tables fleshed out: share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase| Post userSlackId Text userName Text someText Text someOtherText Text deriving Show Comment comment Text post PostId deriving Show Tag tagName Text PostTags tag TagId post PostId |] ## Migrations Migrations Ecto also has migrations. This is not really different from what ActiveRecord offers to us. defmodule SlackPosting.Repo.Migrations.CreatePosts do use Ecto.Migration def change do create table(:posts) do add :text, :text add :user_slack_id, :string add :user_name, :string timestamps() end end end Persistent does too, but approaches it differently by focusing on generating fresh and differential migrations against the database rather than creating a migration DSL. The macros generate the code necessary to see what migrations it recommends or to run the migrations directly automatically. The usual runDB function for running a database action against the database works for this. dumpMigration :: DB () dumpMigration = printMigration migrateAll runMigrations :: DB () runMigrations = runMigration migrateAll If we were to dump the post/comment/tag schema from earlier for SQLite, the migration would look like: CREATE TABLE "post"("id" INTEGER PRIMARY KEY,"user_slack_id" VARCHAR NOT NULL,"user_name" VARCHAR NOT NULL,"some_text" VARCHAR NOT NULL,"some_other_text" VARCHAR NOT NULL); CREATE TABLE "comment"("id" INTEGER PRIMARY KEY,"comment" VARCHAR NOT NULL,"post" INTEGER NOT NULL REFERENCES "post"); CREATE TABLE "tag"("id" INTEGER PRIMARY KEY,"tag_name" VARCHAR NOT NULL); CREATE TABLE "post_tags"("id" INTEGER PRIMARY KEY,"tag" INTEGER NOT NULL REFERENCES "tag","post" INTEGER NOT NULL REFERENCES "post"); ## Changeset I have no idea what this is about and their post doesn’t make it clearer. Validation is orthogonal to Persistent, you usually validate stuff at the edges so that any value of type MyModel is only ever a valid value for that database table. ## Associations We covered this a little earlier with the Post/Comment/Tag example but I’ll explain a little: Comment comment Text post PostId deriving Show Tag tagName Text PostTags tag TagId post PostId You can reference the primary key column of a model defined elsewhere in the quasiquoter, so TagId is something the code understands is primary key of the Tag table. From there, it’s able to generate the foreign key relationships in the migrations automatically. It also gives you better type-safety with managing keys: Prelude> :t Comment Comment :: Text -> Key Post -> Comment Prelude> let tagKey :: Key Tag; tagKey = toSqlKey 1 Prelude> Comment "my comment" tagKey <interactive>:12:22: error: • Couldn't match type ‘Tag’ with ‘Post’ Expected type: Key Post Actual type: Key Tag • In the second argument of ‘Comment’, namely ‘tagKey’ In the expression: Comment "my comment" tagKey In an equation for ‘it’: it = Comment "my comment" tagKey When our keys aren’t just strings or numbers, we can avoid a lot of unnecessary mistakes! ## Lazy loading Ecto does not support Lazy Loading. Persistent doesn’t either. If you want to pull related data together you can do so via separate database actions in a transaction or you can use Esqueleto to do in so: def list_posts do Repo.all(Post) |> Repo.preload([:comments, :tags]) end Here’s a somewhat serious (this is modeled after some production code I wrote) example of how to do this with Esqueleto on top of Persistent, by returning a mapping of posts, the comments on the posts, and all tags associated with the posts. tagsForPosts :: [Key Post] -> DB [(Key Post, Entity Tag)] tagsForPosts postKeys = unValueThePostKeys
select $from$ \ ( postTag InnerJoin tag ) -> do
on (tag ^. TagId
E.==. postTag ^. PostTagTag)
where_ (postTag ^. PostTagPost
in_ valList postKeys)
return (postTag ^. PostTagPost, tag)
where unValueThePostKeys :: DB [(E.Value (Key Post), Entity Tag)]
-> DB [(Key Post, Entity Tag)]
unValueThePostKeys = (fmap . fmap) (first E.unValue)

(Key Post)
(Entity Post, [Entity Comment], [Entity Tag]))
let postKeys = fmap (entityKey . fst) postsAndComments
postKeysWithTags <- tagsForPosts postKeys
return postsWithTags
where
posts =
select $from$ \ ( post
InnerJoin
comment ) -> do
on (post ^. PostId
E.==. (comment ^. CommentPost))
return (post, comment)

postsInitialMap :: [(Entity Post, Entity Comment)]
-> Map (Key Post) (Entity Post, [Entity Comment], [Entity Tag])
where insertPostCom m (post, comment) =
M.insertWith
(entityKey post) (post, [comment], []) m

addTagsToMap :: [(Key Post, Entity Tag)]
-> Map (Key Post) (Entity Post, [Entity Comment], [Entity Tag])
-> Map (Key Post) (Entity Post, [Entity Comment], [Entity Tag])
foldl' insertPostKeyTag initialMap postKeysTags
where insertPostKeyTag :: Map (Key Post)
(Entity Post, [Entity Comment], [Entity Tag])
-> (Key Post, Entity Tag)
-> Map (Key Post)
(Entity Post, [Entity Comment], [Entity Tag])
insertPostKeyTag m (postKey, tagEntity) =
(\(post, comment, tags) ->
(post, comment, tagEntity : tags))
postKey
m

Then running this code with some test data:

migrateFixturesTest :: IO ()
migrateFixturesTest = do
runDB $do runMigrations pk1 <- insert$ Post "slack1" "name1" "" ""
pk2 <- insert $Post "slack2" "name2" "" "" pk3 <- insert$ Post "slack2" "name3" "" ""
_ <- insert $Comment "pk1 c1" pk1 _ <- insert$ Comment "pk1 c2" pk1
_ <- insert $Comment "pk2 c3" pk2 tg1 <- insert$ Tag "tag1"
ptg1 <- insert $PostTag tg1 pk1 pwcat <- postsWithCommentsAndTags liftIO$ pPrint pwcat

We get the following output, which looks right!

Prelude> migrateFixturesTest
Migrating: CREATE TABLE "post"("id" INTEGER PRIMARY KEY,"user_slack_id" VARCHAR NOT NULL,"user_name" VARCHAR NOT NULL,"some_text" VARCHAR NOT NULL,"some_other_text" VARCHAR NOT NULL)
Migrating: CREATE TABLE "comment"("id" INTEGER PRIMARY KEY,"comment" VARCHAR NOT NULL,"post" INTEGER NOT NULL REFERENCES "post")
Migrating: CREATE TABLE "tag"("id" INTEGER PRIMARY KEY,"tag_name" VARCHAR NOT NULL)
Migrating: CREATE TABLE "post_tag"("id" INTEGER PRIMARY KEY,"tag" INTEGER NOT NULL REFERENCES "tag","post" INTEGER NOT NULL REFERENCES "post")
fromList
[ ( PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 1 } }
, ( Entity
{ entityKey =
PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 1 } }
, entityVal =
Post
{ postUserSlackId = "slack1"
, postSomeText = ""
, postSomeOtherText = ""
}
}
, [ Entity
{ entityKey =
CommentKey { unCommentKey = SqlBackendKey { unSqlBackendKey = 2 } }
, entityVal =
Comment
{ commentComment = "pk1 c2"
, commentPost =
PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 1 } }
}
}
, Entity
{ entityKey =
CommentKey { unCommentKey = SqlBackendKey { unSqlBackendKey = 1 } }
, entityVal =
Comment
{ commentComment = "pk1 c1"
, commentPost =
PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 1 } }
}
}
]
, [ Entity
{ entityKey =
TagKey { unTagKey = SqlBackendKey { unSqlBackendKey = 1 } }
, entityVal = Tag { tagTagName = "tag1" }
}
]
)
)
, ( PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 2 } }
, ( Entity
{ entityKey =
PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 2 } }
, entityVal =
Post
{ postUserSlackId = "slack2"
, postSomeText = ""
, postSomeOtherText = ""
}
}
, [ Entity
{ entityKey =
CommentKey { unCommentKey = SqlBackendKey { unSqlBackendKey = 3 } }
, entityVal =
Comment
{ commentComment = "pk2 c3"
, commentPost =
PostKey { unPostKey = SqlBackendKey { unSqlBackendKey = 2 } }
}
}
]
, []
)
)
]

This stuff could get abstracted away or code-gen’d but I haven’t had cause to bother yet. Incidentally, this happens to be a decent example of how to pull together data associated by one-to-many and many-to-many relationships using Esqueleto and Persistent.

If you’d like a working, running git repository of what I did in this article, take a look here.

Some relevant libraries:

I know this site is a bit of a disaster zone, but if you like my writing or think you could learn something useful from me, please take a look at the Haskell book I've been writing. There's a free sample available too!

## October 03, 2017

### Douglas M. Auclair (geophf)

• September 26th, 2017:
art2RawNames art = Raw (artId art) <$> (Map.lookup "Person"$ metadata art)
curry away the 'art' var. ref: Y2017.M09.D26.Exercise
• September 19th, 2017:  The #1Liner to follow (next tweet) is a question based off a code snippet from Y2017.M09.D18.Solution. What is a better/more elegant definition?
fmap (wordStrength . wc) (BL.readFile filename) >>= \dict ->
return (Map.insert filename dict m)
Reformulate. Curry the dict-ref.

# Yesod Tutorial 1. My First Web Site

This is the first in the series of tutorials introducing a new approach to web development using Haskell and Yesod. Haskell is a functional language and Yesod is a web development framework written in Haskell by Michael Snoyman who also wrote a book about it. Michael is a member of the FP Complete team and he provided a lot of valuable feedback for this tutorial.

# It's time for Functional Programming

by Aaron Contorer - CEO

In 1999 I went to Bill Gates with an idea to create a software tools group dedicated to shipping complex software faster. Engineers’ time is valuable, and more importantly, software that ships on time with fewer defects is worth a lot. I organized a team that analyzed what was missing from the old toolsets for our most valuable products. Based on developers’ input, we conceived and delivered five in-house tools within our first year, spanning areas from build to source control to localization. I'm proud of the talented and motivated people who chose to be on that team, and the positive impact it had: If you have a copy of Windows or Office today, it probably arrived on your desk a little sooner and a little better thanks to the Productivity Tools Team. From a business point of view, Microsoft got its tools investment back many times over.

# Engineering Manager at Takt (Full-time)

Takt’s Engineering organization is growing rapidly, and we’re looking for Engineering Managers to lead and scale the team. Takt distills complex customer data into uniquely tailored experiences; we orchestrate physical and digital exchanges into one seamless journey. Our business is building lasting, trusted relationships between people and brands — and making it look easy.

As an Engineering Manager at Takt, you deeply value and understand the work your team is accomplishing, and serve to inspired and develop each individual while ensuring business objectives are met. As we continue to expand our business, your team will build a rich, complex real-time product that serves our enterprise partners and their end customers while you enrich the lives and careers of your team.

Key Responsibilities:

• Build and manage a team of up to 8 software engineers, including coaching, mentorship, and performance management
• Develop and optimize development processes and practices to maintain the balance between building a complex innovative product with shipping quickly
• Instill a pragmatic, delivery-focused support-oriented mindset that values long-term efficiency and value
• Collaborate with engineers and software architects to design and develop software components for current and prospective Takt clients
• Be accountable for technical components, including planning, prioritization, and delivery
• Partner with the recruiting organization to fill the pipeline with exceptional talent, promote referrals, nurture relationships through informal meetings, and serve as an ambassador for Takt at events and conferences
• Promote continuous feedback, address growth areas for team members, and celebrate strengths and contributions of your team
• Be emblematic of Takt’s values and lead by example, set the right context, and inspire the team to do their best work by fostering an environment that’s open, transparent and respectful of each other
• Develop Engineering onboarding processes and documentation practices, ensure new team members have a seamless experience and are productive quickly

Skills and Experience:

• Prior experience in technical leadership and proven ability to build and develop successful teams consisting of varied levels of experience
• Technical expertise with Functional Programming in Scala or Haskell
• Demonstrated understanding of software design and a deep appreciation for Customer Success
• Excellent interpersonal, written, presentation and closing skills
• Prior success with hiring top technical talent and contributing to recruiting processes
• Experience running projects and understanding of agile development methodologies
• Select teams may have domain specific requirements (e.g. Machine Learning, Infrastructure, Security)
• Experience managing team members in a distributed environment is a plus
• Both mid-large company and startup experience would be a plus

Takt distills complex customer data into uniquely tailored experiences; we orchestrate physical and digital exchanges into one seamless journey. Our business is building lasting, trusted relationships between people and brands—and making it look easy.

We're already reaching millions of people a day, and we're just getting started. Our founding leadership is equal parts business, design, and engineering—because we believe differing perspectives + passionate discourse achieve the greatest outcomes. We are collectively talented, but also humble. We give our whole selves. We love learning new things.

Takt is committed to inclusion and diversity and is an equal opportunity employer. All applicants will receive consideration without regard to race, color, religion, gender, gender identity, sexual orientation, national origin, disability, or veteran status.

If you're up for the challenge of a lifetime, we're looking for outstanding talent to join our team.

Get information on how to apply for this position.

# Type-driven strictness

<html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/> <meta content="text/css" http-equiv="Content-Style-Type"/> <meta content="pandoc" name="generator"/> <style type="text/css">code{white-space: pre;}</style> <style type="text/css">div.sourceCode { overflow-x: auto; } table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode { margin: 0; padding: 0; vertical-align: baseline; border: none; } table.sourceCode { width: 100%; line-height: 100%; } td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; } td.sourceCode { padding-left: 5px; } code > span.kw { color: #007020; font-weight: bold; } /* Keyword */ code > span.dt { color: #902000; } /* DataType */ code > span.dv { color: #40a070; } /* DecVal */ code > span.bn { color: #40a070; } /* BaseN */ code > span.fl { color: #40a070; } /* Float */ code > span.ch { color: #4070a0; } /* Char */ code > span.st { color: #4070a0; } /* String */ code > span.co { color: #60a0b0; font-style: italic; } /* Comment */ code > span.ot { color: #007020; } /* Other */ code > span.al { color: #ff0000; font-weight: bold; } /* Alert */ code > span.fu { color: #06287e; } /* Function */ code > span.er { color: #ff0000; font-weight: bold; } /* Error */ code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */ code > span.cn { color: #880000; } /* Constant */ code > span.sc { color: #4070a0; } /* SpecialChar */ code > span.vs { color: #4070a0; } /* VerbatimString */ code > span.ss { color: #bb6688; } /* SpecialString */ code > span.im { } /* Import */ code > span.va { color: #19177c; } /* Variable */ code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */ code > span.op { color: #666666; } /* Operator */ code > span.bu { } /* BuiltIn */ code > span.ex { } /* Extension */ code > span.pp { color: #bc7a00; } /* Preprocessor */ code > span.at { color: #7d9029; } /* Attribute */ code > span.do { color: #ba2121; font-style: italic; } /* Documentation */ code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */ code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */ code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */ </style></head><body>

I was recently trying to optimize Dhall's performance because the interpreter was performing poorly on some simple examples.

For example, consider this tiny expression that prints 3000 exclamation marks:

Natural/fold +3000 Text (λ(x : Text) → x ++ "!") ""

The above Dhall expression takes over 14 seconds to evaluate to normal form, which is not acceptable:

$bench 'dhall <<< './exclaim'benchmarking dhall <<< ./exclaimtime 14.42 s (14.23 s .. 14.57 s) 1.000 R² (1.000 R² .. 1.000 R²)mean 14.62 s (14.54 s .. 14.66 s)std dev 70.22 ms (0.0 s .. 77.84 ms)variance introduced by outliers: 19% (moderately inflated) ## Strict The performance suffers because Dhall is lazy in the accumulator, meaning that the accumulator builds up a gigantic expression like this: (λ(x : Text) → x ++ "!")( (λ(x : Text) → x ++ "!") ( (λ(x : Text) → x ++ "!") ( (λ(x : Text) → x ++ "!") ( (λ(x : Text) → x ++ "!") ... {Repeat this nesting 3000 times} ... "" ) ) )) ... and then attempts to normalize the entire expression, which takes a long time and wastes a lot of memory. The reason for this lazy behavior is the following code for evaluating Natural/fold in the Dhall interpreter: normalize (App (App (App (App NaturalFold (NaturalLit n0)) _) succ') zero) = normalize (go n0) where go !0 = zero go !n = App succ' (go (n - 1)) You can read that as saying: • Given an expression of the form Natural/Fold n succ zero • Wrap the value zero in n calls of the function succ • i.e. succ (succ (succ (... {n times} ... (succ zero) ...))) • Then normalize that A smarter approach would be to keep the accumulator strict, which means that we evaluate as we go instead of deferring all evaluation to the end. For example, the accumulator starts off as just the empty string: "" ... then after one iteration of the loop we get the following accumulator: (λ(x : Text) → x ++ "!") "" ... and if we evaluate that accumulator immediately we get: "!" Then the next iteration of the loop produces the following accumulator: (λ(x : Text) → x ++ "!") "!" ... which we can again immediately evaluate to get: "!!" This is significantly more efficient than leaving the expression unevaluated. We can easily implement such a strict loop by making the following change to the interpreter: normalize (App (App (App (App NaturalFold (NaturalLit n0)) _) succ') zero) = go n0 where go !0 = normalize zero go !n = normalize (App succ' (go (n - 1))) The difference here is that we still build up a chain of n calls to succ but now we normalize our expression in between each call to the succ function instead of waiting until the end to normalize. Once we do this runtime improves dramatically, going down from 15 seconds to 90 milliseconds: $ bench 'dhall <<< './example'benchmarking dhall <<< ./exampletime                 88.92 ms   (87.14 ms .. 90.74 ms)                     0.999 R²   (0.999 R² .. 1.000 R²)mean                 86.06 ms   (84.98 ms .. 87.15 ms)std dev              1.734 ms   (1.086 ms .. 2.504 ms)

... or in other words about 30 microseconds per element. We could still do more to optimize this but at least we're now in the right ballpark for an interpreter. For reference, Python is 4x faster on my machine for the following equivalent program:

print ("!" * 3000)
$bench 'python exclaim.py'benchmarking python exclaim.pytime 24.55 ms (24.09 ms .. 25.02 ms) 0.998 R² (0.996 R² .. 1.000 R²)mean 24.53 ms (24.16 ms .. 24.88 ms)std dev 798.4 μs (559.8 μs .. 1.087 ms) However, these results don't necessarily imply that a strict accumulator is always better. ## Lazy Sometimes laziness is more efficient, though. Consider this program: List/buildInteger( λ(list : Type)→ λ(cons : Integer → list → list)→ Natural/fold +6000 list (cons 1)) The above example uses Natural/fold to build a list of 6000 1s. In this case the accumulator of the fold is a list that grows by one element after each step of the fold. We don't want to normalize the list on each iteration because that would lead to quadratic time complexity. Instead we prefer to defer normalization to the end of the loop so that we get linear time complexity. We can measure the difference pretty easily. A strict loop takes over 6 seconds to complete: bench 'dhall <<< ./ones'benchmarking dhall <<< ./onestime 6.625 s (6.175 s .. 7.067 s) 0.999 R² (0.998 R² .. 1.000 R²)mean 6.656 s (6.551 s .. 6.719 s)std dev 95.98 ms (0.0 s .. 108.3 ms)variance introduced by outliers: 19% (moderately inflated) ... whereas a lazy loop completes in about 180 milliseconds: $ bench 'dhall <<< ./g'benchmarking dhall <<< ./gtime                 182.5 ms   (175.1 ms .. 191.3 ms)                     0.998 R²   (0.995 R² .. 1.000 R²)mean                 177.5 ms   (172.1 ms .. 180.8 ms)std dev              5.589 ms   (2.674 ms .. 8.273 ms)variance introduced by outliers: 12% (moderately inflated)

Moreover, the difference in performance will only worsen with larger list sizes due to the difference in time complexity.

Also, in case you were wondering, Python is about 7x faster:

print ([1] * 6000)

## Conclusion

Many people associate dynamic languages with interpreters, but Dhall is an example of a statically typed interpreter. Dhall's evaluator is not sophisticated at all but can still take advantage of static type information to achieve comparable performance with Python (which is a significantly more mature interpreter). This makes me wonder if the next generation of interpreters will be statically typed in order to enable better optimizations.

</body></html>

# StrataConf NY 2017

If you're in New York for Strata this week come join me on Wednesday for a book signing at 10:55 ( https://conferences.oreilly.com/strata/strata-ny/public/content/author-signings ), talk at 2:55 ( https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/60802 ), and office hours at 4:35 ( https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/63328 ) :)

# Haskell or Scala engineer at Courex (Full-time)

## What we do

Courex, a subsidiary of Keppel T&T, is an 8 year old ecommerce logistics company driven by technology. We help our customers manage their supply chain so they can focus on selling. We do the following

• last mile delivery
• warehousing
• omnichannel integration

Our operations is driven by technology. Some interesting stuff

• We run a hybrid crowd-sourced(uber style) + fixed fleet model.
• We built an automated parcel dimension measurement machine using Kinect
• We have autonomous robots coming in late 2017 to pick and sort parcels

Experience a different sort of scale. Not bits and bytes, but parcels, machines and people. Your work affects the real world in a huge traditional industry.

As part of the Keppel group, your work will reach the supply chain across Southeast Asia and China. Help us digitise the supply chain.

## What are we looking for

We have openings in 2 teams. The inventory management product, which is written in Haskell, and the inventory synchronisation product which is written in Scala. Getting the inventory right is crucial in the supply chain, and functional programming gives us the confidence to do that.

The inventory management product manages how inventory flows in and out of the various warehouses in the region. Whereas the inventory synchronisation team synchronises the state of the inventory to various places such as Amazon, Lazada and Shopee.

We are looking for people interested in functional programming. It doesn't matter if you don't have working experience in them. Qualifications are not necessary. We like people who are practical and prolific. We are expanding to Southeast Asia. Our HQ is in Singapore, but you can work from one of Malaysia, Indonesia or Vietnam.

Get information on how to apply for this position.

# Alternatives to Typed Holes for talking to your compiler

Rejected title: Type Praxis

I frequently see people recommend that others use typed holes. I think people are more apt to recommend typed holes than the alternatives because it’s a bespoke feature intended to enable discovering the type of a sub-expression more easily. Which is fair enough, except it doesn’t really have a good use-case! I will demonstrate in this post why.

I frequently find myself relying on GHC Haskell’s features in order to off-load brain effort. The idea behind typed holes is that if you have an incomplete expression and aren’t sure what type the remaining part should be, you can ask the compiler! Lets reuse the example from the Haskell Wiki: https://wiki.haskell.org/GHC/Typed_holes

pleaseShow :: Show a => Bool -> a -> Maybe String
pleaseShow True a = Just (show _a)

The idea here is that we aren’t sure what should go at the end of the final line and we’re using _a to ask GHC what the type of _a should be. You get a type error that tries to describe the typed hole as follows:

    • Found hole: _a :: a0
Where: ‘a0’ is an ambiguous type variable
Or perhaps ‘_a’ is mis-spelled, or not in scope
• In the first argument of ‘show’, namely ‘_a’
In the first argument of ‘Just’, namely ‘(show _a)’
In the expression: Just (show _a)
• Relevant bindings include
a :: a
pleaseShow :: Bool -> a -> Maybe String

Okay so here’s the problem. There’s a Show constraint on a but the typed hole message doesn’t bother saying so:

    • Found hole: _a :: a0

This represents sort of a problem. Typeclass constraints aren’t always as syntactically obvious as they are from the declaration of pleaseShow here:

pleaseShow :: Show a => Bool -> a -> Maybe String

Sometimes they arise from other sub-expressions in your code and aren’t manifest in the type of your declaration!

You can’t productively point new people to typed holes because they’ll get extremely confused about type variables that have no constraints. If they’re reading good learning material, they’ll know that means they can’t actually do anything with something that is parametrically polymorphic. Even that framing aside, they just won’t know what terms are available to them for anything polymorphic.

Then we come to the expert. The expert is more likely to be working with code leveraging typeclasses and polymorphism and therefore…typed holes is of less help to them. If they’re aware of what typeclass constraints are attached to a type variable, fine, but the compiler is still forcing the programmer to juggle more context in their head than is really necessary.

## In which I offer a better alternative

pleaseShow :: Show a => Bool -> a -> Maybe String
let x :: z
x = a
in Just (show undefined)

This time we get an error that mentions where the original type came from along with the relevant typeclass constraints:

    • Couldn't match expected type ‘z’ with actual type ‘a’
‘a’ is a rigid type variable bound by
the type signature for:
pleaseShow :: forall a. Show a => Bool -> a -> Maybe String
‘z’ is a rigid type variable bound by
the type signature for:
x :: forall z. z

Keep in mind, this isn’t perfect! It’s not strictly the same as typed holes either as it’s more about contradicting the compiler about what type a had in order to discover what its type is. However, at least this way, we get a more complete picture of what the type of a is. Also note how I used undefined in order to ignore the parts of my code I wasn’t interested in getting errors about. This isn’t a perfect fit here as it results in GHC wanting to know which type it’s meant to expect from undefined, but in more typical circumstances, it works great for positing hypotheticals without bothering to write the actual code.

We’re about to do something more gnarly looking in the next section, the tl;dr is this:

### TL;DR

Use let expressions, undefined, impossible types and the like instead of typed holes. And don’t recommend Typed Holes to new people, they’re more confusing than helpful and the facilities of typed holes don’t scale well to more complicated contexts anyway.

## Tackling slightly more complicated situations

Warning: If you haven’t worked through about 2/3s of the Haskell Book or possess the equivalent practice and knowledge, you are unlikely to grok this section.

Sometimes you want to be able to posit something or lay down types for sub-expressions in a situation where you have a polymorphic type arising from a typeclass instance or function declaration. In those situations, knowing how to combine ScopedTypeVariables, InstanceSigs, and let expressions can be very valuable!

What if we’re stumped on something like this?

doubleBubble :: ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
undefined

So we try to start by assigning a type to a sub-expression:

doubleBubble :: ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
let x :: z
x = f
in undefined

And get the following type error:

    • Couldn't match expected type ‘z’
with actual type ‘f1 (f2 (a -> b))’

Fair enough, what if we try to make the types agree?

doubleBubble :: ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
let x :: f1 (f2 (a -> b))
x = f
in undefined

We get a type error?!

    • Couldn't match type ‘f1’ with ‘f4’
‘f1’ is a rigid type variable bound by
the type signature for:
doubleBubble :: forall (f1 :: * -> *) (f2 :: * -> *) a b.
(Applicative f1, Applicative f2) =>
f1 (f2 (a -> b)) -> f1 (f2 a) -> f1 (f2 b)
‘f4’ is a rigid type variable bound by
the type signature for:
x :: forall (f4 :: * -> *) (f5 :: * -> *) a1 b1. f4 (f5 (a1 -> b1))

The issue is that types usually only last the scope of a single type signature denoted by ::, so the variables f1, a, b, and the like can only be referenced in our declaration. That kinda sucks, how do we keep referring to the same type variables under our declaration? ScopedTypeVariables!

{-# LANGUAGE ScopedTypeVariables #-}

doubleBubble :: forall f1 f2 a b
. ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
let x :: f1 (f2 (a -> b))
x = f
in undefined

This now type-checks because we used forall to tell GHC that we wanted those variables to be lexically scoped! Now we’re really cooking with gas. Lets follow a chain of experiments and how they change our type errors:

doubleBubble :: forall f1 f2 a b
. ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
let x :: z
x = fmap (<*>) f
in undefined
    • Couldn't match expected type ‘z’
with actual type ‘f1 (f2 a -> f2 b)’
doubleBubble :: forall f1 f2 a b
. ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b)
doubleBubble f ffa =
let x :: z
x = (fmap (<*>) f) <*> ffa
in undefined
    • Couldn't match expected type ‘z’ with actual type ‘f1 (f2 b)’
-- this typechecks.
doubleBubble :: forall f1 f2 a b
. ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b) -- <---
doubleBubble f ffa =      ------ hmm
let x :: f1 (f2 b) -- <--------
x = (fmap (<*>) f) <*> ffa
in undefined

And now we’re done:

doubleBubble :: forall f1 f2 a b
. ( Applicative f1
, Applicative f2 )
=> f1 (f2 (a -> b))
-> f1 (f2 a)
-> f1 (f2 b) -- <---
doubleBubble f ffa =      ------ hmm
let x :: f1 (f2 b) -- <--------
x = (fmap (<*>) f) <*> ffa
in x

The intuition here is that we have to applicatively (monoidal functor, remember?) combine the * -> * kinded structure twice,

   f1 (f2 (a -> b))
-- <>  <>
-> f1 (f2     a)

Once for f1 of the function and f1 of the value, once for f2 of the function and f2 of the value.

Prelude> :t (\f a -> f <*> a)
(\f a -> f <*> a) :: Applicative f => f (a -> b) -> f a -> f b
Prelude> :t (\f a -> (fmap (<*>) f) <*> a)
(\f a -> (fmap (<*>) f) <*> a)
:: (Applicative f, Applicative f1) =>
f1 (f (a -> b)) -> f1 (f a) -> f1 (f b)

The following doesn’t fit because we end up triggering the Reader (function type) Applicative:

Prelude> :t (\f a -> ((<*>) f) <*> a)
(\f a -> ((<*>) f) <*> a)
:: (a1 -> a -> b) -> ((a1 -> a) -> a1) -> (a1 -> a) -> b

Rewriting the working solution a little:

apply = (<*>)
doubleAp f a = apply (fmap apply f) a
Prelude> let apply = (<*>)
Prelude> let doubleAp f a = apply (fmap apply f) a
Prelude> :t doubleAp
doubleAp
:: (Applicative f1, Applicative f) =>
f1 (f (a -> b)) -> f1 (f a) -> f1 (f b)

Then breaking down:

doubleAp f a = apply (fmap apply f) a
--             [1]    [2]  [3]
1. This apply grafts in the pre-lifted apply, cf.
Prelude> import Data.Void
Prelude> let v :: Void; v = undefined
Prelude> let doubleAp f a = v (fmap apply f) a

<interactive>:104:20: error:
• Couldn't match expected type ‘f1 (f a -> f b) -> t1 -> t’
with actual type ‘Void’
1. This fmap lifts a regular apply into a type that can graft together two values embedded in f such that the type is: f a -> f b, cf.
Prelude> let doubleAp f a = apply (v apply f) a

<interactive>:105:27: error:
• Couldn't match expected type ‘(f0 (a0 -> b0) -> f0 a0 -> f0 b0)
-> t -> f (a -> b)’
with actual type ‘Void’
1. This is the apply lifted by fmap, transformed from:

(f0 (a0 -> b0) into f0 a0 -> f0 b0

(The void error here is less useful)

Kicking in the contradiction we get for a if we replace it with the Void typed v variable:

Prelude> let doubleAp f a = apply (fmap apply f) v

<interactive>:107:41: error:
• Couldn't match expected type ‘f1 (f a)’ with actual type ‘Void’
• In the second argument of ‘apply’, namely ‘v’
In the expression: apply (fmap apply f) v

Not bad eh? I find it’s better to teach people these techniques than to point them to typed holes, but reasonable minds disagree. Even when a learner is relatively early in the learning process, these techniques can be made approachable/digestible.

That’s all folks. Below is just a demonstration of the missing-constraint problem with an example from the Haskell Wiki.

## Re-demonstration of the missing constraint problem using the Haskell Wiki’s example

module FreeMonad where

data Free f a
= Pure a
| Free (f (Free f a))

-- These are just to shut the compiler up, we
-- are not concerned with these right now.
instance Functor f => Functor (Free f) where
fmap = undefined

instance Functor f => Applicative (Free f) where
pure = undefined
(<*>) = undefined

instance Functor f => Monad (Free f) where
return a     = Pure a
Pure a >>= f = f a
Free f >>= g = Free _a
code/FreeMonad.hs:20:23: error:
• Found hole: _a :: f (Free f b)
Where: ‘f’ is a rigid type variable bound by
‘b’ is a rigid type variable bound by
the type signature for:
(>>=) :: forall a b. Free f a -> (a -> Free f b) -> Free f b
Or perhaps ‘_a’ is mis-spelled, or not in scope
• In the first argument of ‘Free’, namely ‘_a’
In the expression: Free _a
In an equation for ‘>>=’: (Free f) >>= g = Free _a
• Relevant bindings include
g :: a -> Free f b (bound at code/FreeMonad.hs:20:14)
f :: f (Free f a) (bound at code/FreeMonad.hs:20:8)
(>>=) :: Free f a -> (a -> Free f b) -> Free f b
Failed, modules loaded: none.
^^ Look ma, no Functor.
_a :: f (Free f b)
instance Functor f => Monad (Free f) where