Planet Haskell

March 02, 2015

ERDI Gergo

Initial version of my Commodore PET

In my quest to build more and more complicated computers on FPGAs, armed with nothing but a crappy hobbyist mindset and some hazy ideas of how Kansas Lava is supposed to work, I've reached another milestone: my first real computer.

That is, unlike the Brainfuck CPU that I designed myself, or the CHIP-8, which was originally a virtual machine spec (with all the implementation leeway that implies), this latest subject is a bona fide 8-bit home computer from the seventies: the Commodore PET.

A Commodore PET 2001 from 1977

The Commodore PET in a nutshell

The PET is a very simple machine compared to later Commodore models, which is why I thought it would make a good first step on a journey that I hope will one day culminate in implementing a Commodore 64. Its centerpiece is the MOS 6502 CPU (practically the same as the MOS 6510 used in the C=64), and there are only four other components: a text-only video signal generator and three IO interface chips (two PIAs and one VIA) for keyboard, Datasette and extension port communication. Just hooking up one of the PIAs is enough to get a minimal system working with keyboard input.

12 KBytes of PET ROM contain the implementation of the basic IO routines (the so-called "kernal"), the full-screen text editor, and Microsoft's BASIC interpreter. Then there's a separate character ROM (not addressable from the CPU) used by the video generator.

MOS 6502

The 6502 microprocessor was a staple of the eight-bit home computer era of the late seventies and eighties. By today's standards, it is incredible to imagine what it must have been like to design it manually, drawing the layout with pencils on paper. On the other hand, if it was designed in such a low-tech way, I figured it shouldn't be too difficult to build something compatible using modern tools even by a hobbyist like myself. And of course there are already dozens of home-built 6502 implementations out there, to varying degrees of compatibility.

The ultimate reference on the 6502 must be the Visual 6502 Project, which I deliberately avoided consulting. I don't really see the educational value in copying the original 6502 design; so instead, I went with a more black-box approach by just looking at the opcode descriptions and interrupt model and working from that.

The first milestone I aimed for was to get enough of the CPU working that I could run the sample programs on 6502asm.com, which defines a tiny microcomputer-like architecture that doesn't have interrupts or any fancy video modes: you just have 32×32 pixels with a fixed 16-color palette mapped to main RAM from $0200, and a zero page-mapped register for keyboard input that you can do polling on. The Kansas Lava implementation is really simple and I plan to reuse it later if I do a similar project with the Z80.
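
To make the memory map concrete, here is a small Haskell sketch of how a byte of that video RAM corresponds to a pixel. The one-byte-per-pixel layout and the "low nibble is the colour index" detail are my assumptions about 6502asm.com, not something stated above.

import Data.Bits ((.&.))
import Data.Word (Word8)

-- Base of the memory-mapped frame buffer described above.
videoBase :: Int
videoBase = 0x0200

-- Map an (address, byte) pair from video RAM to a pixel coordinate and a
-- palette index. Assumed: one byte per pixel, colour taken from the low 4 bits.
pixelOf :: Int -> Word8 -> ((Int, Int), Int)
pixelOf addr byte =
  let offset = addr - videoBase
      (y, x) = offset `divMod` 32            -- 32×32 grid, row-major
      colour = fromIntegral (byte .&. 0x0F)  -- 16-colour palette
  in  ((x, y), colour)

Something like this mapping, applied over the bytes starting at $0200, is all the GTK+ preview mentioned below needs to do.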

My workflow was that I would use ca65 to assemble test programs, burn them into ROM, and run them in the Kansas Lava simulator for a couple thousand cycles; then render the video RAM into a GTK+ window. I would start with this program that does nothing but move data around in memory (drawing the Commodore logo pixel by pixel), and basically I implemented the 6502 opcodes as I went along. After two days of work, I finally got it working:

First positive result

Seeing this was an incredible feeling. The input was valid 6502 machine code, and my very own CPU managed to run it correctly for the approximately 40,000 cycles that it took to draw this image. There was no stopping at this point: I already had a working VGA frame buffer implementation from the CHIP-8, so the next day I synthesized it and ran it on real hardware, my venerable Papilio Pro:

Hi-tech from the seventies, today!

Simulation-based testing

As I added more and more opcodes and started running more and more complicated programs, things very quickly stopped working. My CPU was full of bugs, and figuring out what went wrong by looking at the simulation logs after running it for tens of thousands of cycles was very tedious.

And so, it was at this point that I started adding unit tests. The framework for writing tests exposes a monad where the available effects are making observations on the state of the system (CPU registers and contents of the memory) and executing instructions. This presents an API that allows writing tests in an imperative way:

php = do
    flags <- observe statusFlags
    sp <- observe regSP
    execute0 0x08
    sp' <- observe regSP
    pushed <- observe $ mem (stackAddr <$> sp)
    assertEq "Stack pointer is decremented" sp' (pred <$> sp)
    assertEq "Status is correctly pushed" pushed flags
      

A test like this is turned into a ROM image containing $08 at the address pointed to by the reset vector. The simulation is then run until the CPU enters the Fetch internal state for the second time (the first time is when it fetches the opcode under testing, i.e. the PHP ($08) instruction), and then the observations are evaluated by looking at the simulation output in the same cycles as the Fetch state. Of course, this means you shouldn't be able to write tests like the following:

impossiblyDynamicTest = do
    arg <- observe regX
    execute1 0x00 arg
    a' <- observe regA
    assertEq "A is updated" a' arg
      

This is ensured by observe returning values wrapped in an Obs type, and execute1 requiring unwrapped arguments:

observe :: Query a -> TestM (Obs a)
execute1 :: Byte -> Byte -> TestM ()
assertEq :: (Eq a, Show a) => String -> Obs a -> Obs a -> TestM ()
      

To allow assertions over derived values, Obs is an applicative functor (in fact, it is the free applicative functor over the co-Yoneda functor of the primitive observations).
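
For the curious, here is a minimal, self-contained sketch of what that construction can look like. Obs and Query are the only names taken from the post; the concrete definitions below (including leaving Query abstract) are my assumptions rather than the library's actual code.

{-# LANGUAGE GADTs #-}

-- Coyoneda turns the primitive query type into a Functor for free.
data Coyoneda f a where
  Coyoneda :: (b -> a) -> f b -> Coyoneda f a

instance Functor (Coyoneda f) where
  fmap g (Coyoneda h x) = Coyoneda (g . h) x

-- The free Applicative over a Functor: either a pure value, or a primitive
-- effect applied to the rest of the computation.
data FreeA f a where
  Pure :: a -> FreeA f a
  Ap   :: f a -> FreeA f (a -> b) -> FreeA f b

instance Functor f => Functor (FreeA f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Ap x k) = Ap x (fmap (g .) k)

instance Functor f => Applicative (FreeA f) where
  pure = Pure
  Pure g <*> y = fmap g y
  Ap x k <*> y = Ap x (flip <$> k <*> y)

-- Placeholder for the primitive observations (registers, memory, ...).
data Query a

-- Obs is then (roughly) the free Applicative over the Coyoneda of Query,
-- so assertions can combine observations with <$> and <*>.
type Obs a = FreeA (Coyoneda Query) a

An interpreter would then fold this structure down, looking up each primitive Query in the simulation output for the relevant cycle.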

I think this approach has merit as a general framework for hardware simulator-based unit testing and I intend to extend it and maybe even carve it out into a separate library in the future.

A Minimal Viable PET

Once I had a sufficiently working CPU, I started building the other pieces around it. I took the PET emulator from the VICE suite and commented out all the PIA and VIA code, replacing writes with nops and reads with hardcoded values, until I was able to boot it up with the stock ROM to get to the READY. prompt. Of course, since the PIA supplying the interrupt used for timing was removed by that point, I had no flashing cursor or keyboard input. All in all, the system got to a steady state in about 80,000 operations. (Since my implementation is not yet cycle-accurate, I had to switch to counting operations instead of cycles beyond this point. Every operation is at least as fast as on the real chip, so I hope by adding some wait cycles I'll be able to take care of this at some later point.)

After hooking up the same hardcoded values on the same addresses to the CPU, the next step was running the simulator and peeking at the video memory area ($8000..$8FFF on the PET), using the original fonts to render the screen. The initial version showed there might be someone home (sorry for crap quality on the screenshot):

Yeah...

By comparing detailed logs from running the emulator and the simulator, I was able to make observations like "the first 12,345 steps seem to be in agreement", which was a big boost to productivity, getting me, in short order, to this:

Almost there!

After fixing some more bugs in the arithmetic opcodes, I was finally rewarded by this sight:

Ah... much better.

Adding more components

While working on the CPU, I also started writing the character generator, on top of the VGA signal generator in the kansas-lava-papilio package that I originally made for the CHIP-8. This way, the VGA synchronization signals were abstracted away from me and I just had to take care of pumping out the actual pixels. This turned out to be trickier than I originally thought, since you have to time all read-aheads just right so that everything is at hand just in time for the next pixel. So before it finishes drawing the 8 pixels that make up a single row of a character, the next character index is loaded from RAM, and then the character ROM is consulted for the first row of the font image of the next indexed character. Initial versions were having some ghosting issues, or even more fun, full character transpositions (like showing the character from one line above in the first position of each line).
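
To illustrate the kind of read-ahead timing involved, here is a rough sketch of a fetch schedule (purely my own simplification for illustration, not the actual Kansas Lava circuit): while the 8 pixels of the current character row are being shifted out, the next character index and its font row have to be requested ahead of time.

-- What needs to be requested during each of the 8 pixel columns of a
-- character cell so that the next cell's font row is ready at column 0.
-- The exact offsets are made up for illustration; the real circuit works
-- with synchronous RAM/ROM reads and their own latencies.
data Fetch
  = FetchCharIndex  -- read the next character code from video RAM
  | FetchFontRow    -- look up that character's current row in the char ROM
  | EmitOnly        -- nothing to request, just shift out pixels
  deriving Show

schedule :: Int -> Fetch
schedule pixelCounter = case pixelCounter `mod` 8 of
  6 -> FetchCharIndex
  7 -> FetchFontRow
  _ -> EmitOnly

Get one of those offsets wrong and you end up with the kind of ghosting and transposition artifacts described above.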

The Commodore PET diverts the vsync signal from the video generator to one of the PIA chips, which generates a CPU interrupt that can be acknowledged by reading from one of its memory-mapped registers. So the next obvious step was to implement this functionality to get the cursor blinking! This required more than just implementing a PIA, since I didn't even have interrupts in the CPU at that point.

But all that work was totally worth it:

Video: http://www.youtube.com/v/6vq3cpXr2iM

And now, here we are

The current version supports keyboard input from PS/2 keyboards (but not all keys are mapped yet), so for the first time since I started working on this more than a month ago, it can be used to write and run BASIC programs!

What you can't see on the video below is that there's still a bug somewhere that causes the classic 10 PRINT "FOO": 20 GOTO 10 program to terminate with an out of memory error after some time.

Video: http://www.youtube.com/v/MHZwn0ve43k

Apart from fixing these bugs, the big outstanding feature is to add Datasette support so that programs can be loaded and saved to virtual "cassettes". For a first version, I'll just burn some extra ROM onto the FPGA containing the tape images and hook that up to the PIA controlling the cassette player; but I guess the proper way to do this would be to use something like an SD card reader to get proper persistent, read-writable storage. Or maybe, alternatively, have some kind of serial-over-USB communication with a computer acting as the Datasette unit.

March 02, 2015 07:46 PM

Douglas M. Auclair (geophf)

February 2015 1HaskellADay Problems and Solutions


February 2015
  • February 27th, 2015: Let's put all our primes in a circle, shall we? http://lpaste.net/121154 Ah, how cute! Today's #haskell problem suggested by @jamestanton @bretthall defines a solution, with some analysis, at http://lpaste.net/121292
  • February 26th, 2015: Okay, check this! Today's #haskell problem is as easy as 1, 2, 3! http://lpaste.net/121250 An easy as π solution is posted at http://lpaste.net/121273 ... mmm! π! But is it apple π or ... raspberry π? #couldntresist
  • February 25th, 2015: I'd like moccha sprinkles on my double capicua, please. http://lpaste.net/121153 No. Wait. Today's #haskell problem suggested by a tweet from @OnThisDayinMath
  • February 24th, 2015: Not satisfied with a simple square-from-square problem @BenVitale takes it up a notch and asks us to cube dat square! http://lpaste.net/121060 We have one solution out of 20k squares posted at http://lpaste.net/121150 : [(39204,39304)]
  • February 23rd, 2015: I hereby christen this week a fun-with-numb3rs week, and start us off by learning that 'easierest' is a word, http://lpaste.net/121034, just like 'ginormous.' And the winner is...no, wait! Ladies and gentlemen, we have MULTIPLE winners! http://lpaste.net/121052 Good thing the solution isn't the Lotto. No, wait. The problem statement wasn't read carefully by the solution-poster. I blame that wicked, wicked @geophf! Bad geophf! Bad! ;)
  • February 20th, 2015: It's a #Borderlands2 Truxican Standoff for you to resolve in today's #haskell problem http://lpaste.net/120819 The solution is simple really: just a projection into a monad category in order to do some monoidal deduction is all! http://lpaste.net/120916
  • February 19th, 2015: For today's #haskell problem we learn that 'AEKL' is a word in English http://lpaste.net/120775 (no, it's not, actually) A megalokaryocyte-solution is posted at http://lpaste.net/120816 (now, that's a common word ... NOT! ;)
  • February 18th, 2015: Instead of doing today's #haskell problem, let's go ice skating. No, let's do BOTH! http://lpaste.net/120722 Okay, a loquacious solution? garrulous? Yes. Once one cuts through all the attempts, it's a simple solution, too: http://lpaste.net/120873 ... and updated the solution with the monoidal guarantee-by-implication that the result is unique (instead of just hoping it is) #coercivelogic
  • February 17th, 2015: For today's #haskell problem we are asked to 'fix' our monetary problems (or the library, at least) http://lpaste.net/120630 Made Data.Monetary.USD a Fractional instance http://lpaste.net/109653 and eliminated floating point creep #haskell #money #precision 
  • February 16th, 2015: Getting a jump-start on the day with two really, really hard math problems for today's #haskell problem http://lpaste.net/120560 A solution in which we learn a baseball costs a nickel ... IN WHICH CENTURY? http://lpaste.net/120560
  • February 13th, 2015: Today's SCARY #haskell problem comes by way of @OnThisDayinMath http://lpaste.net/120422 Tonight's late-night movie, Friday the 13th, part III http://lpaste.net/120466 as the solution to today's #haskell problem.
  • February 12th, 2015: Gurer ner gvzrf jura ... yeah. That. http://lpaste.net/120388 Today's #haskell problem, thanks to @KenKenPuzzle The solution shows us that it's INTENSE unscrambling words http://lpaste.net/120407
  • February 11th, 2015: Change adds up quickly in today's #haskell problem http://lpaste.net/120320 So, but ... does that mean programming is like ... maths? Nah! http://lpaste.net/120333 A solution to today's #haskell problem
  • February 10th, 2015: The price of a letter (or of all letters of the alphabet) http://lpaste.net/120236 is the question for today's haskell problem. In the solution we learn the geophfmeister is down-low on the B.I.G. 323, yo! http://lpaste.net/120294
  • February 9th, 2015: 'Oh, no!' ... just another Manic Monday ... AND today's #haskell problem http://lpaste.net/120196 Oh, noes! Mr. Bill! http://lpaste.net/120208 A solution to the oh, no!-problem.
  • February 6th, 2015: It's Friday! Friday! Hava #haskell problem on Friday! Fun-fun-fun! Lookin' forward to the weekend! http://lpaste.net/120087 Groovin' to a Rebecca Black solution at http://lpaste.net/120095
  • February 5th, 2015: Triangles and Squares as numbers http://lpaste.net/120037 for today's Haskell problem. A Triangulated-squares solution is provided by @bretthall at http://lpaste.net/120040
  • February 4th, 2015: Three birds in the hand is better than today's #haskell problem http://lpaste.net/119926 an ornithologist's delight inspired by @KenKenPuzzle. The solution, using MultiMaps, PartitionedSets, and Arrows is (@geophf-obviously) "WOOT! WOOT! WOOT!" http://lpaste.net/120065
  • February 3rd, 2015: We entertain a foolish attempt at a #haskell problem, and then we get serious with six sewing seamstresses http://lpaste.net/119867 A silly seamstress solution is posted at http://lpaste.net/119888 
  • February 2nd, 2015: Today's #haskell exercise is all about the #SuperBowl! (No, it's not, but that makes for good copy), or 110, 210, ... http://lpaste.net/119791 The moral (and solution) to this story is: Don't eat cheerios seasoned with basil. http://lpaste.net/119823 Or something like that. Inspired by @bretthall solution, I expanded to include last-3-of-4 digits for solutions to bases 4,5,6 @jamestanton

by geophf ([email protected]) at March 02, 2015 02:06 AM

Manuel M T Chakravarty

Type-safe Runtime Code Generation with LLVM

The embedded high-performance language Accelerate has got a new backend based on LLVM, which generates code for multicore GPUs and CPUs. In a new paper, we describe this code generator, which is —as far as we are aware— the first practical code generator that is fully type preserving from the source language down to generating low-level LLVM intermediate code (in SSA form). In addition to its novel safety features, the new backend generates vectorised SIMD code for x86_64 CPUs (using LLVM’s vectorisation support) and achieves competitive performance on a range of preliminary benchmarks.

March 02, 2015 02:04 AM

March 01, 2015

Dimitri Sabadie

al 0.1.0.2 – documentation and default paths

al patch

This is a very short article to make you notice that al received two important changes:

  • I uploaded documentation (hurray!);
  • OpenAL paths now default to the default installation location on Windows systems.

I tested the latter on a 64-bit Windows machine:

cabal update
cabal install al

That’s all. You should try it out as well. If you have errors about OpenAL libraries and/or header files, check whether OpenAL is correctly installed. If the error comes up again, proceed as I said here.

Also, I'd welcome feedback about installs from people on Linux ;).

by Dimitri Sabadie ([email protected]) at March 01, 2015 04:03 PM

Tim Docker

Haskell on Yosemite (OSX 10.10)

Nearly all my development has been done under linux. Only occasionally have I worked under osx. This is all to change – osx is to be my primary development platform. In the past, my experiences with ghc on osx have been a little fraught. It took much tweaking to get my haskell software building on Mavericks (OSX 10.9). Problems I had included:

  • issues with ghc 7.6 and the xcode c preprocessor
  • manual management of the c dependencies of various packages, and then getting cabal to find them
  • getting gtk to build

etc, etc.

I’m pleased to discover that things have improved immensely. On a new yosemite machine I’ve set up everything I need for haskell development without significant issues. Three things work together to make this possible: the ghcformacosx distribution, the brew package manager, and cabal sandboxes.

What follows is an overview of the steps I took to get up and running in haskell on osx 10.10.

1. Install the xcode command line tools

Everything (including ghc) seems to depend on these.

xcode-select --install

2. Install Brew

This is quick and easy, following the instructions on the brew homepage.

3. Install ghcformacosx

"ghcformacosx" is a "drag and drop" installation of ghc 7.8.4 and cabal 1.22.0.0. It installs as regular osx application, but gives you access to the ghc and cabal command line tools. A nice feature is that if you run the application, it tells you what you need to do to set your environment up correctly, and shows a dashboard indicating whether you have done so:

Once this is done you need to bring the local package database up to date:

cabal update

4. Use brew to install some key tools and libraries

One of my libraries has pcre-light as a transitive dependency. It needs a corresponding c library. Also cairo is the fastest rendering backend for my haskell charting library, and gtk is necessary if you want to show charts in windows. Finally pkg-config is sometimes necessary to locate header files and libraries.

brew install pkg-config
brew install pcre

# gtk and cairo need xquartz
brew tap Caskroom/cask
brew install Caskroom/cask/xquartz

# later steps in the build processes need to find libraries
# like xcb-shm via package config. Tell pkg-config
# where they are.
export PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig

brew install cairo
brew install gtk

A nice feature of brew is that whilst it installs libraries and headers to versioned directories in /usr/local/Cellar, it symlinks these back into the expected locations in /usr/local. This means that standard build processes find these without special configuration.

5. Setup some favorite command line tools

I use pandoc and ghc-mod a lot, and still need darcs sometimes. Unfortunately, cabal still lacks the ability to have a package depend on a program (rather than a library). Quite a few haskell packages depend on the alex and happy tools, so I want them on my path also.

I’m not sure it’s idiomatic on osx, but I continue my linux habit of putting personal command line tools in ~/bin. I like to build all of these tools in a single cabal sandbox, and then link them into ~/bin. Hence, assuming ~/bin is on my path:

cd ~/bin
mkdir hackage
(cd hackage && cabal sandbox init)
(cd hackage && cabal install alex happy)
ln -s hackage/.cabal-sandbox/bin/alex
ln -s hackage/.cabal-sandbox/bin/happy
(cd hackage && cabal install pandoc darcs ghc-mod)
ln -s hackage/.cabal-sandbox/bin/pandoc
ln -s hackage/.cabal-sandbox/bin/darcs
ln -s hackage/.cabal-sandbox/bin/ghc-mod

(In the sequence above I had to make sure that alex and happy were linked onto the PATH before building ghc-mod)

6. Build gtk2hs in its own sandbox

The hard work is already done by brew. We can now build gtk2hs following the standard build instructions:

export PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig
export PATH=.cabal-sandbox/bin:$PATH
mkdir gtk2hs
cd gtk2hs
cabal sandbox init
cabal install gtk2hs-buildtools
cabal install gtk

Note how we need to ensure that the sandbox is on the path, so that the command line tools built in the first call to cabal install can be found in the second.

Summary

All in all, this process was much smoother than before. Both ghcformacosx and brew are excellent pieces of work – kudos to their developers. ghc is, of course, as awesome as ever. When used with sandboxes cabal works well (despite the "cabal hell" reputation). However, having to manually resolve dependencies on build tools is tedious; I’d really like to see this cabal issue resolved.

Update [2015-03-01]

One issue cropped up after this post. It turns out that ghc-mod has some constraints on the combinations of ghc and cabal versions, and unfortunately the combination provided in ghcformacosx is not supported. I worked around this by installing an older version of cabal in ~/bin:

cd ~/bin/hackage
cabal install --constraint "Cabal < 1.22" cabal-install
cd ~/bin
ln -s hackage/.cabal-sandbox/bin/cabal

by Tim Docker at March 01, 2015 01:14 AM

Magnus Therning

Timing out in conduit

The previous post in my play-with-state-machines-and-conduit series was a bit of a detour into how to combine inputs in conduit. In this post I’ll try to wrap it up by making a conduit pipeline with a timeout.

Prelude

This post doesn’t reflect my first attempt at writing a state machine that times out. My first attempt was to actually write a state machine that timed out. I combined the input from the user with a tick generated once a second into a single event using Either. Each state of the machine then needed to keep track of the number of ticks it had seen in order to time out on seeing the fourth one. The counter was reset on each state transition. Obviously this led to a rather complicated state machine.

Then I decided to attempt to explore having more but simpler parts in the conduit pipeline. I think that strategy resulted in a much cleaner and simpler design.

A state machine for timing out

The state machine for the addition is unchanged from the earlier post and the timing out is added via a second state machine:

data TOState = TOGood Int | TOTimedOut
    deriving (Eq, Show)

data TOSignal a = TONoop | TOVal a | TOTime

toMachine :: Machine TOState (Either t a) (TOSignal a)
toMachine = createMachine (TOGood 0) go
    where
        go (TOGood _) (Right e) = (TOGood 0, TOVal e)

        go (TOGood n) (Left _)
            | n > 3 = (TOTimedOut, TOTime)
            | otherwise = (TOGood (n + 1), TONoop)

        go TOTimedOut _ = (TOTimedOut, TOTime)

It’s a tiny machine of only two states. The ticks come in via Left _ values and cause the counter to be increased each time. The Right e values are simply re-wrapped and passed on together with a resetting of the counter.

The process

The process is built up by combining the user’s input with a ticker source; the combination is passed through the time-out machine. After that the signal is inspected to see if it’s time to terminate; if not, the inner value is extracted so it can be shown to the user.

The ticker source is a source that repeatedly waits and then yields a Tick:

sourceTick :: MonadIO m => Int -> Source m Tick
sourceTick w = forever $ do
    liftIO $ threadDelay w
    C.yield (Tick w)

The full process can then be written like this:

process :: ResourceT IO ()
process = ticker source $= timerOuter $= calcRes $$ output
    where
        source = stdinC $= concatMapC unpack $= calculator $= mapC Right

        calculator = wrapMachine calcMachine

        ticker s = whyTMVar s tickSource =$= wrapMachine toMachine
            where
                tickSource = sourceTick 1000000 $= mapC Left

        timerOuter = takeWhileC p1 =$= filterC p2 =$= mapC u
            where
                p1 TOTime = False
                p1 _ = True

                p2 (TOVal _) = True
                p2 _ = False

                u (TOVal a) = a
                u _ = undefined -- partial, but all right

        calcRes = mapC f
            where
                f CalcNothing = ""
                f (CalcResult n) = " =" ++ show n ++ "\n"
                f e@(CalcError {}) = "\n*** " ++ show e ++ "\n"

        output = mapC pack =$ stdoutC

Using the same definition of main as from the earlier post results in the following behaviour:

% runhaskell CalculatorTimeOut.hs
5 6 + =11
5 6 7
*** CalcError (ReadOperator 5 6) "Bad operator"
42 17 + =59

The process can be terminated by either pressing Ctrl-C, just like before, or by waiting for a time-out, like I did above.

March 01, 2015 12:00 AM

February 28, 2015

Dimitri Sabadie

al 0.1 released!

What is OpenAL?

OpenAL is an open-source and free audio library developed by Creative Technology. The first version, however, was made by Loki Software. The API targets spatialized audio and sound management, so it’s simple to simulate sound attenuation over distance and the Doppler effect. OpenAL is supported on a wide variety of platforms, from PC (Windows, Linux, FreeBSD, Mac OSX and so on) to consoles (PS3, Xbox…).

The latest version was released in 2005 (9 years ago!) and is OpenAL 1.1. It’s then pretty logical that we should have OpenAL 1.1 in Haskell.

OpenAL and Haskell

“Wait… we already have OpenAL in Haskell!”

That’s true. However, the OpenAL Haskell package has some serious issues.

The first one is that the package is not a direct binding to OpenAL. It’s a high-level wrapper that wraps stuff with obscure types, like StateVar or ObjectName. Even though such abstractions are welcome, we should have a raw binding so that people who don’t want those abstractions can still use OpenAL. Moreover, such high-level libraries tend to lift resource allocations / deallocations into the GC [1]. That is a pretty bad idea because we sometimes want to control how long objects live.

The second one, which is just as bad, is the fact that those abstractions aren’t defined by the Haskell OpenAL package, but by another one, which is… OpenGL. The latter is not a raw binding to OpenGL – for that, have a look at OpenGLRaw or gl – but the same kind of high-level oriented package. A lot of us have asked the authors of both packages to agree on making those abstractions shared between the two libraries; they never seemed to agree.

And in any case, having a dependency between OpenAL and OpenGL is insane!

Sooooo… since Edward Kmett had the idea of gl to fix similar issues, I figured I should make al to go along with gl. The idea is that if you want to use one, you can use the other one without any conflict.

al 0.1

Yesterday night, around 20:00, I decided to make it. It took me several hours to write the raw binding, but I finally made it and released al 0.1 the very next day – uh, today!

Because OpenAL is not shipped with your OS [2] nor anything else special, and thus because you have to explicitly install it, installing al requires a bit more than just typing cabal install al.

Installing al

cabal update

First things first:

cabal update

You should have al available if you type cabal info al afterwards.

c2hs

al requires you to have c2hs – version 0.23.* – installed. If not:

cabal install c2hs

Be sure it’s correctly installed:

c2hs --version
C->Haskell Compiler, version 0.23.1 Snowbounder, 31 Oct 2014
build platform is "i386-mingw32" <1, True, True, 1>

OpenAL SDK

Of course, you should have the OpenAL SDK installed as well. If not, install it. It should be in your package manager if you’re on Linux / FreeBSD, and you can find it here for Windows systems.

Important: write down the installation path, because we’ll need it!

Installing al

Here we are. In order to install, al needs the paths to your OpenAL SDK. I’m not sure whether it could vary a lot, though. The default Windows paths are:

  • C:\Program Files (x86)\OpenAL 1.1 SDK\libs for the libraries ;
  • C:\Program Files (x86)\OpenAL 1.1 SDK\include for the header files.

In future releases, I’ll default to those so that you don’t have to explicitly pass the paths if you have a default installation. But for the moment, you have to pass those paths when installing:

cabal install al --extra-lib-dirs="C:\Program Files (x86)\OpenAL 1.1 SDK\libs"
--extra-include-dirs="C:\Program Files (x86)\OpenAL 1.1 SDK\include"

Of course, you may adapt paths to your situation. You may even try without that for UNIX systems – it might already be in your PATH – or use slashes for MinGW.

Using al

al is split into two module trees:

  • Sound.AL
  • Sound.ALC

The former exports anything you might need to use the OpenAL API, while the latter exports symbols for the OpenAL Context API. Please feel free to dig into each tree to import specific types or functions.

OpenAL documentation

al doesn’t have any documentation for a very simple reason: since it’s a raw binding to the C OpenAL API, you should read the C documentation. Translating to Haskell is straightforward.

The OpenAL 1.1 Specifications

What’s next?

Up to now, al doesn’t cover extensions. It’s not shipped with ALUT either as I want to keep the package low-level and raw – and don’t even try to make me change my mind about that ;) .

If you have any issues, please let me know on the issues tracker. I’m responsive and I’ll happily try to help you :) .

Once again, have fun, don’t eat too much chocolate nor sweets, and happy Easter Day!


  1. Garbage Collector

  2. Operating System

by Dimitri Sabadie ([email protected]) at February 28, 2015 03:58 PM

Edwin Brady

Practical Erasure in Dependently Typed Languages

Matus Tejiscak and I have produced a new draft paper titled Practical Erasure in Dependently Typed Languages, in which we explain how Idris erases computationally irrelevant parts of programs. The abstract is:

Full-spectrum dependently typed languages and tools, such as Idris and Agda, have recently been gaining interest due to the expressive power of their type systems, in particular their ability to describe precise properties of programs which can be verified by type checking.

With full-spectrum dependent types, we can treat types as first- class language constructs: types can be parameterised on values, and types can be computed like any other value. However, this power brings new challenges when compiling to executable code. Without special treatment, values which exist only for compile-time checking may leak into compiled code, even in relatively simple cases. Previous attempts to tackle the problem are unsatisfying in that they either fail to erase all irrelevant information, require user annotation or in some other way restrict the expressive power of the language.

In this paper, we present a new erasure mechanism based on whole-program analysis, currently implemented in the Idris programming language. We give some simple examples of dependently typed functional programs with compile-time guarantees of their properties, but for which existing erasure techniques fall short. We then describe our new analysis method and show that with it, erasure can lead to asymptotically faster code thanks to the ability to erase not only proofs but also indices.

Comments, feedback, questions, etc, all welcome!


by edwinb at February 28, 2015 03:13 PM

Danny Gratzer

An Explanation of Type Inference for ML/Haskell

Posted on February 28, 2015
Tags: sml, haskell, types

A couple of days ago I wrote a small implementation of a type inferencer for a mini ML language. It turns out there are very few explanations of how to do this properly and the ones that exist tend to be the really naive, super exponential algorithm. I wrote the algorithm in SML but nothing should be unfamiliar to the average Haskeller.

Type inference breaks down into essentially 2 components

  1. Constraint Generation
  2. Unification

We inspect the program we’re trying to infer a type for and generate a bunch of statements (constraints) which are of the form

This type is equal to this type

These types have “unification variables” in them. These aren’t normal ML/Haskell type variables. They’re generated by the compiler, for the compiler, and will eventually be filled in with either

  1. A rigid polymorphic variable
  2. A normal concrete type

They should be thought of as holes in an otherwise normal type. For example, if we’re looking at the expression

   f a

We first just say that f : 'f where 'f is one of those unification variables I mentioned. Next we say that a : 'a. Since we're applying f to a, we can generate the constraints that

'f ~ 'x -> 'y
'a ~ 'x

This is because we can only apply things of the form _ -> _. We then unify these constraints to produce f : 'a -> 'y and a : 'a. We'd then use the surrounding constraints to produce more information about what exactly 'a and 'y might be. If these were all the constraints we had, we'd then "generalize" 'a and 'y to be normal type variables, making our expression have the type y where f : a -> y and a : a.

Now onto some specifics

Set Up

In order to actually talk about type inference we first have to define our language. We have the abstract syntax tree:

    type tvar = int
    local val freshSource = ref 0 in
    fun fresh () : tvar =
        !freshSource before freshSource := !freshSource + 1
    end


    datatype monotype = TBool
                      | TArr of monotype * monotype
                      | TVar of tvar
    datatype polytype = PolyType of int list * monotype

    datatype exp = True
                 | False
                 | Var of int
                 | App of exp * exp
                 | Let of exp * exp
                 | Fn of exp
                 | If of exp * exp * exp

First we have type variables which are globally unique integers. To give us a method for actually producing them we have fresh which uses a ref-cell to never return the same result twice. This is probably surprising to Haskellers: SML isn’t purely functional and frankly this is less noisy than using something like monad-gen.

From there we have mono-types. These are normal ML types without any polymorphism. There are type/unification variables, booleans, and functions. Polytypes are just monotypes with an extra forall at the front. This is where we get polymorphism from. A polytype binds a number of type variables, stored in this representation as an int list. There is one ambiguity here: when looking at a variable it's not clear whether it's supposed to be a type variable (bound in a forall) or a unification variable. The idea is that we never ever inspect a type bound under a forall except when we're converting it to a monotype with fresh unification variables in place of all of the bound variables. Thus, when inferring a type, every variable we come across is a unification variable.

Finally, we have expressions. Aside from the normal constants, we have variables, lambdas, applications, let, and if. The way we represent variables here is with DeBruijn variables. A variable is a number that tells you how many binders are between it and where it was bound. For example, const would be written Fn (Fn (Var 1)) in this representation.

With this in mind we define some helpful utility functions. When type checking, we have a context full of information. The two facts we know are

    datatype info = PolyTypeVar of polytype
                  | MonoTypeVar of monotype

    type context = info list

Where the ith element of a context indicates the piece of information we know about the ith DeBruijn variable. We’ll also need to substitute a type variable for a type. We also want to be able to find out all the free variables in a type.

    fun subst ty' var ty =
        case ty of
            TVar var' => if var = var' then ty' else TVar var'
          | TArr (l, r) => TArr (subst ty' var l, subst ty' var r)
          | TBool => TBool

    fun freeVars t =
        case t of
            TVar v => [v]
          | TArr (l, r) => freeVars l @ freeVars r
          | TBool => []

Both of these functions just recurse over types and do some work at the variable case. Note that freeVars can contain duplicates; this turns out not to matter except in one case: generalizeMonoType. The basic idea is that given a monotype with a bunch of unification variables and a surrounding context, figure out which variables can be bound up in a polymorphic type. If they don't appear in the surrounding context, we generalize them by binding them in a new poly type's forall spot.

    fun dedup [] = []
      | dedup (x :: xs) =
        if List.exists (fn y => x = y) xs
        then dedup xs
        else x :: dedup xs

    fun generalizeMonoType ctx ty =
        let fun notMem xs x = List.all (fn y => x <> y) xs
            fun free (MonoTypeVar m) = freeVars m
              | free (PolyTypeVar (PolyType (bs, m))) =
                List.filter (notMem bs) (freeVars m)

            val ctxVars = List.concat (List.map free ctx)
            val polyVars = List.filter (notMem ctxVars) (freeVars ty)
        in PolyType (dedup polyVars, ty) end

Here the bulk of the code is deciding whether or not a variable is free in the surrounding context using free. It looks at a piece of info to determine what variables occur in it. We then accumulate all of these variables into ctxVars and use this list to decide what to generalize.

Next we need to take a polytype to a monotype. This is the specialization of a polymorphic type that we love and use when we use map on a function from int -> double. This works by taking each bound variable and replacing it with a fresh unification variable. This is nicely handled by folds!

    fun mintNewMonoType (PolyType (ls, ty)) =
        foldl (fn (v, t) => subst (TVar (fresh ())) v t) ty ls

Last but not least, we have a function to take a context and a variable and give us a monotype which corresponds to it. This may produce a new monotype if we think the variable has a polytype.

    exception UnboundVar of int
    fun lookupVar var ctx =
        case List.nth (ctx, var) handle Subscript => raise UnboundVar var of
            PolyTypeVar pty => mintNewMonoType pty
          | MonoTypeVar mty => mty

For the sake of nice error messages, we also throw UnboundVar instead of just Subscript in the error case. Now that we've gone through all of the utility functions, on to unification!

Unification

A large part of this program is basically “I’ll give you a list of constraints and you give me the solution”. The program to solve these proceeds by pattern matching on the constraints.

In the empty case, we have no constraints so we give back the empty solution.

    fun unify [] = []

In the next case we actually have to look at what constraint we’re trying to solve.

      | unify (c :: constrs) =
        case c of

If we’re lucky, we’re just trying to unify TBool with TBool, this does nothing since these types have no variables and are equal. In this case we just recurse.

       (TBool, TBool) => unify constrs

If we’ve got two function types, we just constrain their domains and ranges to be the same and continue on unifying things.

     | (TArr (l, r), TArr (l', r')) => unify ((l, l') :: (r, r') :: constrs)

Now we have to deal with finding a variable. We definitely want to avoid adding (TVar v, TVar v) to our solution, so we’ll have a special case for trying to unify two variables.

     | (TVar i, TVar j) =>
       if i = j
       then unify constrs
       else addSol i (TVar j) (unify (substConstrs (TVar j) i constrs))

This is our first time actually adding something to our solution so there’s several new elements here. The first is this function addSol. It’s defined as

    fun addSol v ty sol = (v, applySol sol ty) :: sol

So in order to make sure our solution is internally consistent it's important that whenever we add a type to our solution we first apply the solution to it. This ensures that we can substitute a variable in our solution for its corresponding type and not worry about whether we need to do something further. Additionally, whenever we add a new binding we substitute for it in the constraints we have left to ensure we never have a solution which is just inconsistent. This prevents us from unifying v ~ TBool and v ~ TArr(TBool, TBool) in the same solution! The actual code for doing this is that substConstrs (TVar j) i constrs bit.

The next case is the general case for unifying a variable with some type. It looks very similar to this one.

     | ((TVar i, ty) | (ty, TVar i)) =>
       if occursIn i ty
       then raise UnificationError c
       else addSol i ty (unify (substConstrs ty i constrs))

Here we have the critical occursIn check. This checks to see if a variable appears in a type and prevents us from making erroneous unifications like TVar a ~ TArr (TVar a, TVar a). This occurs check is actually very easy to implement

    fun occursIn v ty = List.exists (fn v' => v = v') (freeVars ty)

Finally we have one last case: the failure case. This is the catch-all case for if we try to unify two things that are obviously incompatible.

     | _ => raise UnificationError c

All together, that code was

    fun applySol sol ty =
        foldl (fn ((v, ty), ty') => subst ty v ty') ty sol
    fun applySolCxt sol cxt =
        let fun applyInfo i =
                case i of
                    PolyTypeVar (PolyType (bs, m)) =>
                    PolyTypeVar (PolyType (bs, (applySol sol m)))
                  | MonoTypeVar m => MonoTypeVar (applySol sol m)
        in map applyInfo cxt end

    fun addSol v ty sol = (v, applySol sol ty) :: sol

    fun occursIn v ty = List.exists (fn v' => v = v') (freeVars ty)

    fun unify ([] : constr list) : sol = []
      | unify (c :: constrs) =
        case c of
            (TBool, TBool) => unify constrs
          | (TVar i, TVar j) =>
            if i = j
            then unify constrs
            else addSol i (TVar j) (unify (substConstrs (TVar j) i constrs))
          | ((TVar i, ty) | (ty, TVar i)) =>
            if occursIn i ty
            then raise UnificationError c
            else addSol i ty (unify (substConstrs ty i constrs))
          | (TArr (l, r), TArr (l', r')) =>
            unify ((l, l') :: (r, r') :: constrs)
          | _ => raise UnificationError c

Constraint Generation

The other half of this algorithm is the constraint generation part. We generate constraints and use unify to turn them into solutions. This boils down to two functions. The first is to glue together solutions.

    fun <+> (sol1, sol2) =
        let fun notInSol2 v = List.all (fn (v', _) => v <> v') sol2
            val sol1' = List.filter (fn (v, _) => notInSol2 v) sol1
        in
            map (fn (v, ty) => (v, applySol sol1 ty)) sol2 @ sol1'
        end
    infixr 3 <+>

Given two solutions, we figure out which things don't occur in the second solution. Next, we apply solution 1 everywhere in the second solution, giving a consistent solution which contains everything in sol2; finally we add in all the stuff not in sol2 but in sol1. This doesn't check that the solutions are actually consistent with each other; that is done elsewhere.

Next is the main function here, constrain. This actually generates a solution and a type given a context and an expression. The first few cases are nice and simple

    fun constrain ctx True = (TBool, [])
      | constrain ctx False = (TBool, [])
      | constrain ctx (Var i) = (lookupVar i ctx, [])

In these cases we don’t infer any constraints, we just figure out types based on information we know previously. Next for Fn we generate a fresh variable to represent the arguments type and just constrain the body.

      | constrain ctx (Fn body) =
        let val argTy = TVar (fresh ())
            val (rTy, sol) = constrain (MonoTypeVar argTy :: ctx) body
        in (TArr (applySol sol argTy, rTy), sol) end

Once we have the solution for the body, we apply it to the argument type which might replace it with a concrete type if the constraints we inferred for the body demand it. For If we do something similar except we add a few constraints of our own to solve.

      | constrain ctx (If (i, t, e)) =
        let val (iTy, sol1) = constrain ctx i
            val (tTy, sol2) = constrain (applySolCxt sol1 ctx) t
            val (eTy, sol3) = constrain (applySolCxt (sol1 <+> sol2) ctx) e
            val sol = sol1 <+> sol2 <+> sol3
            val sol = sol <+> unify [ (applySol sol iTy, TBool)
                                    , (applySol sol tTy, applySol sol eTy)]
        in
            (tTy, sol)
        end

Notice how we apply each solution to the context for the next thing we’re constraining. This is how we ensure that each solution will be consistent. Once we’ve generated solutions to the constraints in each of the subterms, we smash them together to produce the first solution. Next, we ensure that the subcomponents have the right type by generating a few constraints to ensure that iTy is a bool and that tTy and eTy (the types of the branches) are both the same. We have to carefully apply the sol to each of these prior to unifying them to make sure our solution stays consistent.

This is practically the same as the App case

      | constrain ctx (App (l, r)) =
        let val (domTy, ranTy) = (TVar (fresh ()), TVar (fresh ()))
            val (funTy, sol1) = constrain ctx l
            val (argTy, sol2) = constrain (applySolCxt sol1 ctx) r
            val sol = sol1 <+> sol2
            val sol = sol <+> unify [(applySol sol funTy,
                                      applySol sol (TArr (domTy, ranTy)))
                                    , (applySol sol argTy, applySol sol domTy)]
        in (ranTy, sol) end

The only real difference here is that we generate different constraints: we make sure we’re applying a function whose domain is the same as the argument type.

The most interesting case here is Let. This implements let generalization, which is how we actually get polymorphism. After inferring the type of the thing we're binding we generalize it, giving us a poly type to use in the body of let. The key to generalizing it is the generalizeMonoType function we had before.

      | constrain ctx (Let (e, body)) =
        let val (eTy, sol1) = constrain ctx e
            val ctx' = applySolCxt sol1 ctx
            val eTy' = generalizeMonoType ctx' (applySol sol1 eTy)
            val (rTy, sol2) = constrain (PolyTypeVar eTy' :: ctx') body
        in (rTy, sol1 <+> sol2) end

We do pretty much everything we had before except now we carefully ensure to apply the solution we get for the body to the context and then to generalize the type with respect to that new context. This is how we actually get polymorphism, it will assign a proper polymorphic type to the argument.

That wraps up constraint generation. Now all that's left to see is the overall driver for type inference.

    fun infer e =
        let val (ty, sol) = constrain [] e
        in generalizeMonoType [] (applySol sol ty) end
    end

So all we do is infer and generalize a type! And there you have it, that’s how ML and Haskell do type inference.

Wrap Up

Hopefully that clears up a little of the magic of how type inference works. The next challenge is to figure out how to do type inference on a language with patterns and ADTs! This is actually quite fun; pattern checking involves synthesizing a type from a pattern, which needs something like linear logic to handle pattern variables correctly.

With this we’re actually a solid 70% of the way to building a type checker to SML. Until I have more free time though, I leave this as an exercise to the curious reader.

Cheers,


February 28, 2015 12:00 AM

A Twelf Introduction

Posted on February 28, 2015
Tags: twelf, types

For the last 3 or so weeks I’ve been writing a bunch of Twelf code for my research (hence my flat-lined github punch card). Since it’s actually a lot of fun I thought I’d share a bit about Twelf.

What Is Twelf

Since Twelf isn’t a terribly well known language it’s worth stating what exactly it is we’re talking about. Twelf is a proof assistant. It’s based on a logic called LF (similarly to how Coq is based on CiC).

Twelf is less powerful than some other proof assistants but by limiting some of its power it's wonderfully suited to proving certain types of theorems. In particular, Twelf admits true "higher order abstract syntax" (don't worry if you don't know what this means), which makes it great for formalizing programming languages with variable bindings.

In short, Twelf is a proof assistant which is very well suited for defining and proving things about programming languages.

Getting Twelf

It’s much more fun to follow along a tutorial if you actually have a Twelf installation to try out the code. You can download and compile the sources to Twelf with SML/NJ or Mlton. You could also use smackage to get the compiler.

Once you’ve compiled the thing you should be left with a binary twelf-server. This is your primary way of interacting with the Twelf system. There’s quite a slick Emacs interface to smooth over this process. If you’ve installed twelf into a directory ~/twelf/ all you need is the incantation

    (setq twelf-root "~/twelf/")
    (load (concat twelf-root "emacs/twelf-init.el"))

Without further ado, let’s look at some Twelf code.

Some Code

When writing Twelf code we encode the thing that we’re studying, the object language, as a bunch of type families and constructors in Twelf. This means that when we edit a Twelf file we’re just writing signatures.

For example, if we want to encode natural numbers we’d write something like

    nat : type.
    z   : nat.
    s   : nat -> nat.

This is an LF signature, we declare a series of constants with NAME : TYPE.. Note the period at the end of each declaration. First we start by declaring a type for natural numbers called nat with nat : type. Here type is the base kind of all types in Twelf. Next we go on to declare what the values of type nat are.

In this case there are two constructors for nat. We either have zero, z, or the successor of another value of type nat, s. This gives us a canonical forms lemma for natural numbers: All values of type nat are either

  • z
  • s N for some value N : nat

Later on, we’ll justify the proofs we write with this lemma.

Anyways, now that we’ve encoded the natural numbers I wanted to point out a common point of confusion about Twelf. We’re not writing programs to be run. We’re writing programs exclusively for the purpose of typechecking. Heck, we’re not even writing programs at the term level! We’re just writing a bunch of constants out with their types! More than this even, Twelf is defined so that you can only write canonical forms. This means that if you write something in your program, it has to be in normal form, fully applied! In PL speak it has to be β-normal and η-long. This precludes actually writing programs for the sake of reducing them. You’re never going to write a web server in Twelf, you even be writing “Hello World”. You might use it to verify the language your writing them in though.

Now that we’ve gotten the awkward bit out the way, let’s now define a Twelf encoding of a judgment. We want to encode the judgment + which is given by the following rules

—————————
z + n = n

   m + n = p
———————————————
s(m) + n = s(p)

In the rest of the world we have this idea that propositions are types. In twelf, we’re worried about defining logics and systems, so we have the metatheoretic equivalent: judgments are types.

So we define a type family plus.

    plus : nat -> nat -> nat -> type.

So plus is a type indexed over 3 natural numbers. This is our first example of dependent types: plus is a type which depends on 3 terms. Now we can list out how to construct a derivation of plus. This means that inference rules in a meta theory correspond to constants in Twelf as well.

    plus/z : {n : nat} plus z n n.

This is some new syntax: in Twelf, {NAME : TYPE} TYPE is a dependent function type, a pi type. This notation is awfully similar to Agda and Idris if you're familiar with them. This means that this constructor takes a natural number, n, and returns a derivation of plus z n n. The fact that the return type depends on what nat we supply is why this needs a dependent type.

In fact, this is such a common pattern that Twelf has sugar for it. If we write an unbound capital variable name Twelf will automagically introduce a binder {N : ...} at the front of our type. We can thus write our inference rules as

    plus/z : plus z N N
    plus/s : plus N M P -> plus (s N) M (s P)

These rules go together with our declaration of plus. In fact, there's something kinda special about these two rules. We know that for any term n : nat which is in canonical form, there should be an applicable rule. In Twelf speak, we say that this type family is total.

We can ask Twelf to check this fact for us by saying

    plus : nat -> nat -> nat -> type.
    %mode plus +N +M -P.

    plus/z : plus z N N.
    plus/s : plus N M P -> plus (s N) M (s P).

    %worlds () (plus _ _ _).
    %total (N) (plus N _ _).

We want to show that for all terms n, m : nat in canonical form, there is a term p in canonical form so that plus n m p. This sort of theorem is what we’d call a ∀∃-theorem. This is literally because it’s a theorem of the form “∀ something. ∃ something. so that something”. These are the sort of thing that Twelf can help us prove.

Here’s the workflow for writing one of these proofs in Twelf

  1. Write out the type family
  2. Write out a %mode specification to say what is bound in the ∀ and what is bound in the ∃.
  3. Write out the different constants in our type family
  4. Specify the context to check our proof in with %worlds, usually we want to say the empty context, ()
  5. Ask Twelf to check that we’ve created a proof according to the mode with %total where the N specifies what to induct on.

In our case we have a case for each canonical form of nat so our type family is total. This means that our theorem passes. Hurray!

Believe it or not this is what life is like in Twelf land. All the code I’ve written these last couple of weeks is literally type signatures and 5 occurrences of %total. What’s kind of fun is how unreasonably effective a system this is for proving things.

Let’s wrap things up by proving one last theorem, if plus A B N and plus A B M both have derivations, then we should be able to show that M and N are the same. Let’s start by defining what it means for two natural numbers to be the same.

    nat-eq : nat -> nat -> type.
    nat-eq/r : nat-eq N N.
    nat-eq/s : nat-eq N M -> nat-eq (s N) (s M).

I’ve purposefully defined this so it’s amenable to our proof, but it’s still a believable formulation of equality. It’s reflexive and if N is equal to M, then s N is equal to s M. Now we can actually state our proof.

    plus-fun : plus N M P -> plus N M P' -> nat-eq P P' -> type.
    %mode plus-fun +A +B -C.

Our theorem says if you give us two derivations of plus with the same arguments, we can prove that the outputs are equal. There are two cases we have to cover for our induction so there are two constructors for this type family.

    plus-fun/z : plus-fun plus/z plus/z nat-eq/r.
    plus-fun/s : plus-fun (plus/s L) (plus/s R) (nat-eq/s E)
                  <- plus-fun L R E.

A bit of syntactic sugar here: I used the backwards arrow, which is identical to the normal -> except that its arguments are flipped. Finally, we ask Twelf to check that we’ve actually proven something here.

    %worlds () (plus-fun _ _ _).
    %total (P) (plus-fun P _ _).

And there you have it: an actual theorem we’ve mechanically checked using Twelf.

Wrap Up

I wanted to keep this short, so now that we’ve covered Twelf basics I’ll just refer you to one of the more extensive tutorials. You may be interested in

If you’re interested in learning a bit more about the nice mathematical foundations for LF you should check out “The LF Paper”.


February 28, 2015 12:00 AM

February 26, 2015

Dimitri Sabadie

Smoothie, a Haskell library for creating smooth curves

The smoother the better!

It’s been a while since I’ve written anything on my blog. A bit of refreshment doesn’t hurt, what do you think?

As a demoscener, I attend demoparties, and there will be a very important and fun one in about a month. I’m rushing on my 3D application so that I can finish something to show, but I’m not sure I’ll have enough spare time. That being said, I need to be able to represent smooth moves and transitions without any tearing. I had a look at a few Haskell spline libraries, but I haven’t found anything that is both interesting and not discontinued.

Because I do need splines, I decided to write my very own package. Meet smoothie, my BSD3 Haskell spline library.

Why splines?

A spline is a curve defined by several polynomials. It has several uses, like vector graphics, signal interpolation, animation tweening or simply plotting a spline to see how neat and smooth it looks!

Splines are defined using polynomials. Each polynomial is a piece of the curve, and the pieces are connected one to the next. Depending on which polynomial(s) you choose, you end up with a different shape.

For instance, 1-degree polynomials are used to implement straight lines.

As you can see, we can define a few points, and interpolate in between. This is great, because we can turn a discrete set of points into lines.

Even better, we could use 3-degree polynomials or cosine functions to make each part of the spline smoother:

We still start from discrete points, but we end up with a smooth curve. Typically, imagine sampling the spline over time to drive a camera movement: it helps us build smooth moves, which is pretty important when doing animation. If you’re curious about that, I highly recommend having a look at key frames.

Using splines in Haskell

So I’ve set about implementing splines in Haskell in the most general way possible. However, I don’t cover – yet? – all kinds of splines. In order to explain my design choices, I need to explain a few very simple concepts first.

Sampling

A spline is often defined by a set of points and polynomials. The first point has the starting sampling value. For our purpose, we’ll set that to 0:

let startSampler = 0

The sampling value is often Float, but it depends on your spline and the use you want to make of it. It could be Int. The general rule is that it should be orderable. If we take two sampling values s and t, we should be able to compare s and t (that’s done through the typeclass constraint Ord in Haskell).

So, if you have a spline and a sampling value, the idea is that sampling the spline with startSampler gives you the first point, and sampling with t with t > startSampler gives you another point, interpolated using points of the spline. It could use two points, three, four or even more. It actually depends on the polynomials you use, and the interpolating method.

In smoothie, the sampling type is denoted by s.

Control points

A spline is made of points. Those points are called control points and smoothie uses CP s a to refer to them, where s is the sampling type and a the carried value.

Although they’re often used to express the fact that the curve should pass through them, they don’t have to lie on the curve itself. A very common and ultra useful kind of spline is the B-spline.

With that kind of spline, the property that the curve passes through the control points doesn’t hold. It passes through the first and last ones, but the ones in between are used to shape it, a bit like magnets attract things around them.

Keep in mind that control points are very important and used to define the main aspect of the curve.

Polynomials

Polynomials are key to spline interpolation: they’re used to compute the sampled points. Interpolation is a very general term used in plenty of domains. If you’re not familiar with it, you should read up on linear interpolation and cubic interpolation, which are a very good start.

Polynomials are denoted by Polynomial s a in smoothie, where s and a have the same meaning as in CP s a.

Getting started with smoothie

Types and constraints

smoothie thus has three important types:

  • CP s a, the control points
  • Polynomial, the polynomials used to interpolate between control points
  • Spline s a, of course

The whole package is parameterized by s and a. As said earlier, s is very likely to require an Ord constraint. And a… Well, since we want to represent points, let’s wonder: which points? What kind of points? Why even “points”? That’s a good question. And this is why you may find smoothie great: it doesn’t actually know anything about points. It accepts any kind of values. Any? Almost. Any values that are in an additive group.

“What the…”

I won’t go into details; I’ll just simplify things so you can quickly get your feet wet. That constraint, translated into Haskell, means that a must be an endofunctor – i.e. have a Functor instance – and be additive – i.e. have an Additive instance. It also requires a to be a type constructor – i.e. its kind should be * -> *.

With Functor and Additive, we can do two important things:

  • First, Functor lets us lift computations onto the inner type: we can, for instance, map a function like (*k) or (/10) inside a.
  • Then, Additive lets us add our values together, like a + b.

We can then make linear combinations, like ak + bq. This property is well known for vector spaces.

The fun consequence is that providing correct Functor and Additive instances makes your type usable with smoothie as the carried value of a spline! You might have to implement Num and Ord as well, though.

Creating a spline

Creating a spline is done with the spline function, whose signature is:

spline :: (Ord a, Ord s) => [(CP s a, Polynomial s a)] -> Spline s a

It takes a list of control points paired with polynomials and outputs a spline. That requires some explanation… When you sample the spline, smoothie looks up which interpolation method it has to use. This is determined by the nearest control point below the sampled value. Basically, a pair (cp, polynomial) defines a new point and the interpolation method to use for the stretch of curve ahead of that point.

Of course, the last point’s polynomial won’t be used. You can set it to whatever you want – protip: you can even use undefined, thanks to laziness.

Although the list will be sorted by spline, I highly recommend passing an already sorted list, because dealing with unordered points rarely makes sense.

A control point is created by providing a sample value and the carried value. For instance, using linear’s V2 type:

let cp0 = CP 0 $ V2 1 pi

That’s a control point that represents V2 1 pi when sampling is at 0. Let’s create another:

let cp1 = CP 3.341 $ V2 0.8 10.5

Now, let’s attach a polynomial to them!

Hold that for me please

The simplest polynomial – which is actually not a polynomial, but heh, don’t look at me that way – is the 0-degree polynomial. Yeah, a constant function. It takes the lower control point and holds it everywhere along the curve. You could picture that as a staircase function:

You might say that’s useless; it’s actually not; it’s even pretty nice. Imagine you want to attach your camera position to such a curve: it will make the camera jump in space, which could be desirable for flashy looks!

Use the hold Polynomial to get this behavior.

Oh yeah, it’s straightforward!

1-degree functions often describe lines. That is, linear is the Polynomial to use to connect control points with… straight lines.

Have fun on the unit circle!

One very interesting Polynomial is cosine, which defines cosine interpolation, used to smooth the spline and make it nicer for moves and transitions.

Highway to the danger zone

If you’re crazy, you can experiment with linearBy, which is basically a 1-degree polynomial if you pass id, but can produce far more complex shapes if you pass another function of type (s -> s). Dig into the documentation on Hackage for further information.

Sampling our spline

Ok, let’s use a linear interpolation to sample our spline:

let spl = spline [(cp0,linear),(cp1,hold)]

Note: I used hold as a final polynomial because I don’t like using undefined.

Ok, let’s see how to sample that. smoothie exports a convenient function for sampling:

smooth :: Ord s => Spline s a -> s -> Maybe a

smooth spl s takes the sampling value s and maybe interpolates it in the spl spline.

“Maybe? Why aren’t you sure?”

Well, that’s pretty simple. In some cases the curve is not defined at the sampling value you pass – before the first point and after the last one, basically. In those cases, you get Nothing.
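
To make that concrete, here’s a rough sketch of sampling the spl spline defined above (my own illustration; the commented results are indicative rather than verified output):

    smooth spl (-1)   -- Nothing: before the first control point
    smooth spl 0      -- Just (V2 1.0 3.14…): the first control point
    smooth spl 1.5    -- Just a V2 linearly interpolated between cp0 and cp1
    smooth spl 10     -- Nothing: past the last control point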

That’s not the end

I wrote smoothie in a few hours, in a single day. You might have ideas. I want it to spread and be widely used by awesome people. If you do graphics programming, sound programming, animation or anything else involving splines or smoothness, please provide feedback!

For people who would like to contribute, here are the GitHub page and the issue tracker.

If no one else gets to it first, I’ll try to add some cubic interpolation methods, like Hermite splines, and one of my favorites, the famous Catmull-Rom spline interpolation method.

As always, have fun hacking around, and keep doing cool stuff and sharing it!

by Dimitri Sabadie ([email protected]) at February 26, 2015 10:55 AM

February 25, 2015

Joachim Breitner

DarcsWatch End-Of-Life’d

Almost seven years ago – at a time when the “VCS wars” had not even properly started yet, GitHub was seven days old, and most Haskell-related software projects were using Darcs as their version control system of choice – when you submitted a patch, you simply ran darcs send and a mail with your changes would be sent to the right address, e.g. the maintainer or a mailing list. This was almost as convenient as Pull Requests are on GitHub now, except that it was tricky to keep track of what was happening with a patch, and it was easy to forget to follow up on it.

So back then I announced DarcsWatch: a service that you could CC on your patch-submitting mail, which would then monitor the repository and tell you about the patch’s status, i.e. whether it was applied or obsoleted by another patch.

Since then, it quietly did its work without many hiccups. But by now, a lot of projects have moved away from Darcs, so I don’t really use it myself any more. Also, its Darcs patch parser does not like every submission produced by a contemporary darcs, so it is becoming more and more unreliable. I asked around on the xmonad and darcs mailing lists whether others were still using it, and nobody spoke up. Therefore, after seven years and 4660 monitored patches, I am officially ceasing to run DarcsWatch.

The code and data is still there, so if you believe this was a mistake, you can still speak up -- but be prepared to be asked to take over maintaining it.

I have a dislike for actually deleting data, so I’ll keep the static parts of the DarcsWatch web page running in their current state.

I’d like to thank the guys from spiny.org.uk for hosting DarcsWatch on urching for the last 5 years.

by Joachim Breitner ([email protected]) at February 25, 2015 11:39 PM

Neil Mitchell

Making withSocketsDo unnecessary

Summary: Currently you have to call withSocketsDo before using the Haskell network library. In the next version you won't have to.

The Haskell network library has always had a weird and unpleasant invariant. Under Windows, you must call withSocketsDo before calling any other functions. If you forget, the error message isn't particularly illuminating (e.g. getAddrInfo, does not exist, error 10093). Calling withSocketsDo isn't harmful under Linux, but equally isn't necessary, and thus easy to accidentally omit. The network library has recently merged some patches so that in future versions there is no requirement to call withSocketsDo, even on Windows.

Existing versions of network

The reason for requiring withSocketsDo is so that the network library can initialise the Windows Winsock library. The code for withSocketsDo was approximately:

withSocketsDo :: IO a -> IO a
#if WINDOWS
withSocketsDo act = do
    initWinsock
    act `finally` termWinsock
#else
withSocketsDo act = act
#endif

Where initWinsock and termWinsock were C functions. Both checked a mutable variable so they only initialised/terminated once. The initWinsock function immediately initialised the Winsock library. The termWinsock function did not terminate the library, but merely installed an atexit handler, providing a function that ran when the program shut down which terminated the Winsock library.

As a result, in all existing versions of the network library, it is fine to nest calls to withSocketsDo, call withSocketsDo multiple times, and to perform networking operations after withSocketsDo has returned.

Future versions of network

My approach to removing the requirement to call withSocketsDo was to make it very cheap, then sprinkle it everywhere it might be needed. Making such a function cheap on non-Windows just required an INLINE pragma (although its very likely GHC would have always inlined the function anyway).

For Windows, I changed to:

withSocketsDo act = do evaluate withSocketsInit; act 

{-# NOINLINE withSocketsInit #-}
withSocketsInit = unsafePerformIO $ do
    initWinsock
    termWinsock

Now withSocketsDo is very cheap, with subsequent calls requiring no FFI calls, and thanks to pointer tagging, just a few cheap instructions. When placing additional withSocketsDo calls my strategy was to make sure I called it before constructing a Socket (which many functions take as an argument), and when taking one of the central locks required for the network library. In addition, I identified a few places not otherwise covered.

In newer versions of the network library it is probably never necessary to call withSocketsDo - if you find a place where one is necessary, let me know. However, for compatibility with older versions on Windows, it is good practice to always call withSocketsDo. Libraries making use of the network library should probably call withSocketsDo on their users' behalf.
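
As a sketch of what that might look like (my own example, not from the post), a library built on network could wrap its entry points so that users never have to think about it; withSocketsDo itself comes from Network.Socket:

    import Network.Socket (withSocketsDo)

    -- Hypothetical library entry point: a no-op wrapper on Linux, and on
    -- Windows it guarantees Winsock is initialised before any socket work.
    withMyNetworkLibrary :: IO a -> IO a
    withMyNetworkLibrary = withSocketsDo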

by Neil Mitchell ([email protected]) at February 25, 2015 10:11 PM

February 24, 2015

Functional Jobs

Senior Software Engineer at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District.

We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include JavaScript especially, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you do have some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

February 24, 2015 03:34 PM

Thiago Negri

Peg Solitaire

I'm proud to announce my new game: Peg Solitaire.

Peg solitaire is a classic board puzzle game for one player involving movement of pegs on a board with holes.

The objective is, making valid moves, to empty the entire board except for a solitary peg. A valid move is to jump a peg orthogonally over an adjacent peg into a hole two positions away and then to remove the jumped peg. Thus valid moves in each of the four orthogonal directions are:
  • Jump to right;
  • Jump to left;
  • Jump up;
  • Jump down.
The first challenge is to solve the puzzle, and then you can challenge yourself and try to solve it with the minimum number of movements. Each jump counts as a move, but multiple consecutive jumps with the same peg are considered a single move, so you can keep jumping with the same peg and still count it as one movement. The game displays the count of how many movements you have made at the top of the screen.

Features:
  • Multiple board types to select;
  • Five different board types: English (Standard), European (French), Wiegleb, Asymmetrical (Bell) and Diamond;
  • Display game move count;
  • Quick replay same board;
  • Give up current game;
  • Restart current game;
  • Shows system status bar, so you can keep track of other things while playing, for example current time
The game is entirely free and displays ads; there are no in-app purchase options.

It's built with web technologies (HTML, JavaScript, CSS) and deployed as a native application via Apache Cordova. It has been translated into six languages. Really exciting!

I tried to get the look and feel of each platform individually, using a flat design for iOS, a Material design for Android and a dark theme for Windows Phone. Take a look, I hope it's pretty obvious which screenshot belongs to which platform:




There are still updates to come. iOS has one in the queue for review process and I'm delivering another one for Windows Phone this week.

Take a look at the game, download, play it and leave a rating or feedback:

Hope to see you soon!

by Thiago Negri ([email protected]) at February 24, 2015 01:13 AM

Jasper Van der Jeugt

Writing an LRU Cache in Haskell

Introduction

In-memory caches form an important optimisation for modern applications. This is one area where people often tend to write their own implementation (though usually based on an existing idea). The reason for this is mostly that having a one-size-fits all cache is really hard, and people often want to tune it for performance reasons according to their usage pattern, or use a specific interface that works really well for them.

However, this sometimes results in less-than-optimal design choices. I thought I would take some time and explain how an LRU cache can be written in a reasonably straightforward way (the code is fairly short), while still achieving great performance. Hence, it should not be too much trouble to tune this code to your needs.

The data structure usually underpinning an LRU cache is a Priority Search Queue, where the priority of an element is the time at which it was last accessed. A number of Priority Search Queue implementations are provided by the psqueues package, and in this blogpost we will be using its HashPSQ data type.

Disclaimer: I am one of the principal authors of the psqueues package.

This blogpost is written in literate Haskell, so you should be able to plug it into GHCi and play around with it. The raw file can be found here.

A pure implementation

First, we import some things, including the Data.HashPSQ module from psqueues.

> {-# LANGUAGE BangPatterns #-}
> import           Control.Applicative ((<$>))
> import           Data.Hashable       (Hashable, hash)
> import qualified Data.HashPSQ        as HashPSQ
> import           Data.IORef          (IORef, newIORef, atomicModifyIORef')
> import           Data.Int            (Int64)
> import           Data.Maybe          (isNothing)
> import qualified Data.Vector         as V
> import           Prelude             hiding (lookup)

Let’s start with our datatype definition. Our Cache type is parameterized by k and v, which represent the types of our keys and values respectively. The priorities of our elements will be the logical time at which they were last accessed, or the time at which they were inserted (for elements which have never been accessed). We will represent these logical times by values of type Int64.

> type Priority = Int64

The cTick field stores the “next” logical time – that is, the value of cTick should be one greater than the maximum priority in cQueue. At the very least, we need to maintain the invariant that all priorities in cQueue are smaller than cTick. A consequence of this is that cTick should increase monotonically. This is violated in the case of an integer overflow, so we need to take special care of that case.

> data Cache k v = Cache
>     { cCapacity :: !Int       -- ^ The maximum number of elements in the queue
>     , cSize     :: !Int       -- ^ The current number of elements in the queue
>     , cTick     :: !Priority  -- ^ The next logical time
>     , cQueue    :: !(HashPSQ.HashPSQ k Priority v)
>     }

Creating an empty Cache is easy; we just need to know the maximum capacity.

> empty :: Int -> Cache k v
> empty capacity
>     | capacity < 1 = error "Cache.empty: capacity < 1"
>     | otherwise    = Cache
>         { cCapacity = capacity
>         , cSize     = 0
>         , cTick     = 0
>         , cQueue    = HashPSQ.empty
>         }

Next, we will write a utility function to ensure that the invariants of our datatype are met. We can then use that in our lookup and insert functions.

> trim :: (Hashable k, Ord k) => Cache k v -> Cache k v
> trim c

The first thing we want to check is whether our logical time has reached the maximum value it can take. If this is the case, we can either reset all the ticks in our queue, or we can clear it. We choose the latter here, since that is simply easier to code, and we are talking about a scenario that should not happen very often.

>     | cTick c == maxBound  = empty (cCapacity c)

Then, we just need to check if our size is still within bounds. If it is not, we drop the oldest item – that is the item with the smallest priority. We will only ever need to drop one item at a time, because our cache is number-bounded and we will call trim after every insert.

>     | cSize c > cCapacity c = c
>         { cSize  = cSize c - 1
>         , cQueue = HashPSQ.deleteMin (cQueue c)
>         }
>     | otherwise             = c

Insert is pretty straightforward to implement now. We use the insertView function from Data.HashPSQ which tells us whether or not an item was overwritten.

insertView
  :: (Hashable k, Ord p, Ord k)
  => k -> p -> v -> HashPSQ k p v -> (Maybe (p, v), HashPSQ k p v)

This is necessary, since we need to know whether or not we need to update cSize.

> insert :: (Hashable k, Ord k) => k -> v -> Cache k v -> Cache k v
> insert key val c = trim $!
>     let (mbOldVal, queue) = HashPSQ.insertView key (cTick c) val (cQueue c)
>     in c
>         { cSize  = if isNothing mbOldVal then cSize c + 1 else cSize c
>         , cTick  = cTick c + 1
>         , cQueue = queue
>         }

Lookup is not that hard either, but we need to remember that in addition to looking up the item, we also want to bump the priority. We can do this using the alter function from psqueues: that allows us to modify a value (bump its priority) and return something (the value, if found) at the same time.

alter
    :: (Hashable k, Ord k, Ord p)
    => (Maybe (p, v) -> (b, Maybe (p, v)))
    -> k -> HashPSQ.HashPSQ k p v -> (b, HashPSQ.HashPSQ k p v)

The b in the signature above becomes our lookup result.

> lookup
>     :: (Hashable k, Ord k) => k -> Cache k v -> Maybe (v, Cache k v)
> lookup k c = case HashPSQ.alter lookupAndBump k (cQueue c) of
>     (Nothing, _) -> Nothing
>     (Just x, q)  ->
>         let !c' = trim $ c {cTick = cTick c + 1, cQueue = q}
>         in Just (x, c')
>   where
>     lookupAndBump Nothing       = (Nothing, Nothing)
>     lookupAndBump (Just (_, x)) = (Just x,  Just ((cTick c), x))

That basically gives a clean and simple implementation of a pure LRU Cache. If you are only writing pure code, you should be good to go! However, most applications deal with caches in IO, so we will have to adjust it for that.

A simple IO-based cache

Using an IORef, we can update our Cache to be easily usable in the IO Monad.

> newtype Handle k v = Handle (IORef (Cache k v))

Creating one is easy:

> newHandle :: Int -> IO (Handle k v)
> newHandle capacity = Handle <$> newIORef (empty capacity)

Our simple interface only needs to export one function. cached takes the key of the value we are looking for, and an IO action which produces the value. However, we will only actually execute this IO action if it is not present in the cache.

> cached
>     :: (Hashable k, Ord k)
>     => Handle k v -> k -> IO v -> IO v
> cached (Handle ref) k io = do

First, we check the cache using our lookup function from above. This uses atomicModifyIORef', since our lookup might bump the priority of an item, and in that case we modify the cache.

>     lookupRes <- atomicModifyIORef' ref $ \c -> case lookup k c of
>         Nothing      -> (c,  Nothing)
>         Just (v, c') -> (c', Just v)

If it is found, we can just return it.

>     case lookupRes of
>         Just v  -> return v

Otherwise, we execute the IO action and call atomicModifyIORef' again to insert it into the cache.

>         Nothing -> do
>             v <- io
>             atomicModifyIORef' ref $ \c -> (insert k v c, ())
>             return v
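
As a quick illustration (not part of the original literate code), using the handle could look roughly like this; the key and the IO action are arbitrary examples:

    example :: IO ()
    example = do
        handle <- newHandle 100   -- cache at most 100 entries
        -- The file is only read on a cache miss; a subsequent call with the
        -- same key returns the cached contents (until the entry is evicted).
        contents <- cached handle "/etc/hosts" (readFile "/etc/hosts")
        putStrLn contents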

Contention

This scheme already gives us fairly good performance. However, that can degrade a little when lots of threads are calling atomicModifyIORef' on the same IORef.

atomicModifyIORef' is implemented using a compare-and-swap, so conceptually it works a bit like this:

atomicModifyIORef' :: IORef a -> (a -> (a, b)) -> IO b
atomicModifyIORef' ref f = do
    x <- readIORef ref
    let (!x', !y) = f x

    -- Atomically write x' if value is still x
    swapped <- compareAndSwap ref x x'
    if swapped
        then return y
        else atomicModifyIORef' ref f  -- Retry

We can see that this can lead to contention: if we have a lot of concurrent atomicModifyIORef's, we can get into a retry loop. It cannot cause a deadlock (i.e., it should still eventually finish), but it will still bring our performance to a grinding halt. This is a common problem with IORefs which I have also personally encountered in real-world scenarios.

A striped cache

A good solution around this problem, since we already have a Hashable instance for our key anyway, is striping the keyspace. We can even reuse our Handle in quite an elegant way. Instead of just using one Handle, we create a Vector of Handles instead:

> newtype StripedHandle k v = StripedHandle (V.Vector (Handle k v))

The user can configure the number of handles that are created:

> newStripedHandle :: Int -> Int -> IO (StripedHandle k v)
> newStripedHandle numStripes capacityPerStripe =
>     StripedHandle <$> V.replicateM numStripes (newHandle capacityPerStripe)

Our hash function then determines which Handle we should use:

> stripedCached
>     :: (Hashable k, Ord k)
>     => StripedHandle k v -> k -> IO v -> IO v
> stripedCached (StripedHandle v) k =
>     cached (v V.! idx) k
>   where
>     idx = hash k `mod` V.length v

Because our access pattern is now distributed among the different Handles, we should be able to avoid the contention problem.

Conclusion

We have implemented a very useful data structure for many applications, with two variations and decent performance. Thanks to the psqueues package, the implementations are very straightforward, small in code size, and it should be possible to tune the caches to your needs.

Many variations are possible: you can use real timestamps (UTCTime) as priorities in the queue and have items expire after a given amount of time. Or, if modifications of the values v are allowed, you can add a function which writes the updates to the cache as well as to the underlying data source.

For embedding the pure cache into IO, there are many alternatives to using IORefs: for example, we could have used MVars or TVars. There are also other strategies for reducing contention besides striping.

You could even write a cache which is bounded by its total size on the heap, rather than by the number of elements in the queue. If you want a single bounded cache for use across your entire application, you could allow it to store heterogeneously-typed values, and provide multiple strongly-typed interfaces to the same cache. However, implementing these things is a story for another time.

Thanks to the dashing Alex Sayers for proofreading and suggesting many corrections and improvements.

February 24, 2015 12:00 AM

February 23, 2015

The GHC Team

GHC Weekly News - 2015/02/23

Hi *,

It's once again time for your sometimes-slightly-irregularly-scheduled GHC news! This past Friday marked the end of the FTP vote for GHC 7.10, there's an RC on the way (see below), we've closed out a good set of patches and tickets from users and pushed them into HEAD, and to top it off - it's your editor's birthday today, so that's nice!

Quick note: as said above, GHC HQ is expecting to make a third release candidate for GHC 7.10.1 in early March, since the delay has allowed us to pick up some more changes and bugfixes. We plan on the final release being close to the end of March (previously the end of February).

This week, GHC HQ met up again to discuss and write down our current priorities and thoughts:

  • After discussing our current timetable - as we're currently hovering around the ICFP deadline - we're hoping to make our third GHC 7.10.1 release candidate on Friday, March 13th, with the final release on Friday, March 27th. This was the main take away from our meeting today.

We've also had a little more list activity this week than we did before:

Some noteworthy commits that went into ghc.git in the past week include:

Closed tickets the past week include: #9266, #10095, #9959, #10086, #9094, #9606, #9402, #10093, #9054, #10102, #4366, #7604, #9103, #10104, #7765, #7103, #10051, #7056, #9907, #10078, #10096, #10072, #10043, #9926, #10088, #10091, #8309, #9049, #9895, and #8539.

by thoughtpolice at February 23, 2015 11:52 PM

Austin Seipp

The Future of community.haskell.org

Community.haskell.org is a server in our ecosystem that comparatively few know about these days. It actually was, to my knowledge, a key part of how the whole haskell.org community infrastructure got set up way back when. The sparse homepage still even says: "This server is run by a mysterious group of Haskell hackers who do not wish to be known as a Cabal, and is funded from money earned by haskell.org mentors in the Google Summer-of-Code programme." At a certain point after this server was created, it ceased to be run by a "mysterious group of Haskell hackers" and instead became managed officially by the haskell.org Committee that we know today. You can see the original announcement email in the archives.

The community server, first set up in 2007, played a key role back before the current set of cloud-based services we know today was around. It provided a shared host offering many of the services a software project needs -- VCS hosting, public webspace for documentation, issue trackers, mailing lists, and so forth.

Today, the server is somewhat of a relic of another time. People prefer to host projects in places like github, bitbucket, or darcs hub. Issue trackers likewise tend to be associated with these hosts, and there are other free, hosted issue trackers around as well. When folks want a mailing list, they tend to reach for google groups.

Meanwhile, managing a big box full of shell accounts has become a much more difficult, riskier proposition. Every shell account is a security vulnerability waiting to happen, and there are more adversarial "scriptkiddie" hackers than ever looking to claim new boxes to spam and otherwise operate from.

Managing a mailman installation is likewise more difficult. There are more spammers out there, with better methods, and unmaintained lists can quickly turn into ghost towns filled with pages of linkspam and nothing else. The same sad fate befalls unmaintained tracs.

As a whole, the internet is a more adversarial world for small, self-hosted services, especially those whose domain names have some "google juice". We think it would be good to, to the extent possible, get out of the business of doing this sort of hosting. And indeed, very few individuals tend to request accounts, since there are now so many nicer, better ways of getting the benefits that community.haskell.org once was rare in providing.

So what next? Well, we want to "end of life" most of community.haskell.org, but in as painless a way as possible. This means finding what few tracs, if any, are still active, and helping their owners migrate. Similarly for mailing lists. Of course we will find a way to continue to host their archives for historical purposes.

Similarly, we will attempt to keep source repositories accessible for historical purposes, but would very much like to encourage owners to move to better-supported code hosting. One purpose that, until recently, was hard to serve elsewhere was the hosting of private darcs repositories with shared access -- such as academics might use to collaborate on a work in progress. However, that capability is now also provided on http://hub.darcs.net. At this point, we can't think of anything in this regard that is not better provided elsewhere -- but if you can, please let us know.

On webspace, it may be the case that a little more leniency is in order. For one, it is possible to provide restricted accounts that are able to control web-accessible files but have no other rights. For another, while many open source projects now host documentation through github pages or similar, and there are likewise many services for personal home pages, nonetheless it seems a nice thing to allow projects to host their resources on a system that is not under the control of a for-profit third party that, ultimately is responsible to its bottom line and not its users.

But all this is open for discussion! Community.haskell.org was put together to serve the open source community of Haskell developers, and its direction needs to be determined based on feedback regarding current needs. What do you think? What would you like to see continued to be provided? What do you feel is less important? Are there other good hosted services that should be mentioned as alternatives?

And, of course, are you interested in rolling up your sleeves to help with any of the changes discussed? This could mean simply helping out with sorting out the mailman and trac situation, inventorying the active elements and collaborating with their owners. Or, it could mean a more sustained technical involvement. Whatever you have to offer, we will likely have a use for it. As always, you can email [email protected] or hop on the #haskell-infrastructure freenode channel to get involved directly.

by gershomb (Gershom) at February 23, 2015 10:50 PM

February 22, 2015

Russell O'Connor

Parallel Evaluation of the Typed Lambda Calculus

There is much written about the duality between strict-order (call-by-value) evaluation for the lambda calculus and normal-order (call-by-need) evaluation (or, semantically equivalently, lazy evaluation). In the simply typed lambda calculus, all evaluation eventually terminates, so both evaluation strategies result in the same values. However, when general recursion is added to the simply typed lambda calculus (via a fixpoint operator, for example) then evaluation of some expressions does not terminate. More expressions terminate with normal-order evaluation than with strict-order evaluation. In fact, if evaluation terminates in any order, then it terminates with normal-order evaluation.

I would like to discuss the possibility of a third, even laxer evaluation strategy for the typed lambda calculus that allows even more expressions to terminate. I did just say that normal-order evaluation is, in some sense, the best possible evaluation order, so, in order to beat it, we will be adding more redexes, namely the commuting conversions.

The typed lambda calculus enjoys certain commuting conversions for case expressions that allow every elimination term to pass through the case expression. For example, the commuting conversion for the π₁ elimination term and the case expression says that

π₁(case e₀ of σ₁ x. e₁ | σ₂ y. e₂)

converts to

case e₀ of σ₁ x. π₁(e₁) | σ₂ y. π₁(e₂)

These commuting conversions are required so that the subformula property holds.

My understanding is that a corollary of this says that

f(case e₀ of σ₁ x. e₁ | σ₂ y. e₂)

and

case e₀ of σ₁ x. f(e₁) | σ₂ y. f(e₂)

are denotationally equivalent whenever f is a strict function.

I would like to develop a version of the lambda calculus that allows these two expressions to denote the same value for any f. Call this, the unrestricted commuting conversion property. A lambda calculus with this property would necessarily be parallel and thus will require a parallel evaluation strategy. For example, the natural definition of or becomes the parallel-or operation.

or x y := if x then True else y

This definition has the usual short-circuit property that or True ⊥ is True, where ⊥ is defined by

⊥ := fix id

If we use the unrestricted commuting conversion property then we also have that or ⊥ True is True:

or ⊥ True
 = {definition of or}
if ⊥ then True else True
 = {β-expansion}
if ⊥ then const True ⟨⟩ else const True ⟨⟩
 = {commuting}
const True (if ⊥ then ⟨⟩ else ⟨⟩)
 = {β-reduction}
True

Hence or is parallel-or.
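
For contrast, here is a quick Haskell sketch (my addition, not from the post) showing that under ordinary normal-order evaluation, without unrestricted commuting conversions, the same definition is only short-circuiting in its first argument:

    -- The natural definition of or, transcribed into Haskell.
    or' :: Bool -> Bool -> Bool
    or' x y = if x then True else y

    -- or' True undefined  ==> True
    -- or' undefined True  ==> diverges: lazy evaluation alone never notices
    --                         that both branches of the if would be True.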

Other parallel functions, such as the majority function, also follow from their natural definitions.

maj x y z := if x then (or y z) else (and y z)

In this case maj ⊥ True True is True.

maj ⊥ True True
 = {definition of maj}
if ⊥ then (or True True) else (and True True)
 = {evaluation of (or True True) and (and True True)}
if ⊥ then True else True
 = {commuting}
True

It is easy to verify that maj True ⊥ True and maj True True ⊥ are also both True.

My big question is whether we can devise some nice operational semantics for the lambda calculus that will have the unrestricted commuting conversions property that I desire. Below I document my first attempt at such operational semantics, but, spoiler alert, it does not work.

The usual rule for computing the weak head normal form of the case expression case e₀ of σ₁ x. e₁ | σ₂ y. e₂ says to first convert e₀ to weak head normal form. If it is σ₁ e₀′ then return the weak head normal form of e₁[x ↦ e₀′]. If it is σ₂ e₀′ then return the weak head normal form of e₂[y ↦ e₀′]. In addition to this rule, I want to add another rule for computing the weak head normal form of the case expression. This alternative rule says that we compute the weak head normal forms of e₁ and e₂. If we get C₁ e₁′ and C₂ e₂′ respectively for introduction terms (a.k.a. constructors) C₁ and C₂, and if additionally C₁ = C₂, then return the following as a weak head normal form: C₁ (case e₀ of σ₁ x. e₁′ | σ₂ y. e₂′). If C₁ ≠ C₂, or if we get stuck on a neutral term (e.g. a variable), then this rule does not apply.

This new rule is in addition to the usual rule for case. Any implementation must run these two rules in parallel because it is possible that either rule (or both) can result in non-termination when recursively computing weak head normal forms of sub-terms. I suppose that in case one has unlifted products then when computing the weak head normal form of a case expression having a product type or function type, one can immediately return

⟨case e₀ of σ₁ x. π₁ e₁ | σ₂ y. π₁ e₂, case e₀ of σ₁ x. π₂ e₁ | σ₂ y. π₂ e₂⟩
or
λz. case e₀ of σ₁ x. e₁ z | σ₂ y. e₂ z

This amended computation of weak head normal form seems to work for computing the or and maj functions above so that they are non-strict in every argument, but there is another example where even this method of computing weak head normal form is not sufficient. Consider the functions that implement associativity for the sum type.

assocL : A + (B + C) -> (A + B) + C
assocL z := case z of σ₁ a. σ₁ (σ₁ a) | σ₂ bc. (case bc of σ₁ b. σ₁ (σ₂ b) | σ₂ c. σ₂ c)

assocR : (A + B) + C -> A + (B + C)
assocR z := case z of σ₁ ab. (case ab of σ₁ a. σ₁ a | σ₂ b. σ₂ (σ₁ b)) | σ₂ c. σ₂ (σ₂ c)

Now let us use unrestricted commuting conversions to evaluate assocR (assocL (σ₂ ⊥)).

assocR (assocL (σ₂ ⊥))
 = { definition of assocL and case evaluation }
assocR (case ⊥. σ₁ b. σ₁ (σ₂ b) | σ₂ c. σ₂ c)
 = { commuting conversion }
case ⊥. σ₁ b. assocR (σ₁ (σ₂ b)) | σ₂ c. assocR (σ₂ c)
 = { definition of assocR and case evaluations }
case ⊥. σ₁ b. σ₂ (σ₁ b) | σ₂ c. σ₂ (σ₂ c)
 = { commuting conversion }
σ₂ (case ⊥. σ₁ b. σ₁ b | σ₂ c. σ₂ c) 
 = { η-contraction for case }
σ₂ ⊥

Even if η-contraction is not a reduction rule used for computation, it is still the case that t and case t. σ₁ b. σ₁ b | σ₂ c. σ₂ c should always be denotationally equivalent. Anyhow, we see that by using commuting conversions, a weak head normal form of assocR (assocL (σ₂ ⊥)) should expose the σ₂ constructor. However, even if you apply my amended computation of weak head normal form, it will not produce any constructor.

What I find particularly surprising is the domain semantics of assocL and assocR. assocL seems to map σ₂ ⊥ to ⊥ because no constructor can be exposed. assocR maps ⊥ to ⊥. Therefore, according to the denotational semantics, the composition should map σ₂ ⊥ to ⊥, but as we saw, under parallel evaluation it does not. It would seem that the naive denotational semantics does not capture the semantics of parallel evaluation. The term case ⊥. σ₁ b. σ₁ (σ₂ b) | σ₂ c. σ₂ c seems to be more defined than ⊥, even though no constructor is available in the head position.

Although my attempt at nice operational semantics failed, I am still confident some nice computation method exists. At the very least, I believe a rewriting system will work which has all the usual rewriting rules plus a few extra redexes: one that says an elimination term applied to a case expression commutes into all of the branches, another that says when all branches of a case expression contain the same introduction term, that introduction term is commuted to the outside of the case expression, and maybe also the rules I listed above for unlifted products. I conjecture that this rewriting system is confluent and that unrestricted commuting conversions are convertible (probably requiring η-conversions as well).

Without proofs of my conjectures I am a little concerned that this all does not actually work out. There may be some bad interaction with fixpoints that I am not seeing. If this does all work out then shouldn’t I have heard about it by now?

February 22, 2015 11:31 PM

Roman Cheplyaka

Examples of monads in a dynamic language

In Monads in dynamic languages, I explained what the definition of a monad in a dynamic language should be and concluded that there’s nothing precluding them from existing. But I didn’t give an example either.

So, in case you are still wondering whether non-trivial monads are possible in a dynamic language, here you go. I’ll implement a couple of simple monads — Reader and Maybe — with proofs.

And all that will take place in the ultimate dynamic language — the (extensional) untyped lambda calculus.

The definitions of the Reader and Maybe monads are not anything special; they are the same definitions as you would write, say, in Haskell, except Maybe is Church-encoded.

What I find fascinating about this is that despite the language being untyped, which allows more things to go wrong than in a typed one, the monad laws still hold. You can still write monadic code and reason about it in the untyped lambda calculus in the same way as you would do in a typed language.

Reader

return x = λr.x
a >>= k  = λr.k(ar)r
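
For comparison (my addition, not part of the original proofs), the same Reader definitions written as ordinary Haskell functions on r -> a, with the environment passed explicitly:

    returnR :: a -> (r -> a)
    returnR x = \_ -> x          -- λr.x

    bindR :: (r -> a) -> (a -> (r -> b)) -> (r -> b)
    bindR a k = \r -> k (a r) r  -- λr.k(ar)r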

Left identity

return x >>= k
  { inline return }
  = λr.x >>= k
  { inline >>= }
  = λr.k((λr.x)r)r
  { β-reduce }
  = λr.kxr
  { η-reduce }
  = kx

Right identity

a >>= return
  { inline return }
  = a >>= λx.λr.x
  { inline >>= }
  = λr.(λx.λr.x)(ar)r
  { β-reduce }
  = λr.ar
  { η-reduce }
  = a

Associativity

a >>= f >>= g
  { inline 1st >>= }
  = λr.f(ar)r >>= g
  { inline 2nd >>= }
  = λr.g((λr.f(ar)r)r)r
  { β-reduce }
  = λr.g(f(ar)r)r
a >>= (λx. f x >>= g)
  { inline 2nd >>= }
  = a >>= λx.λr.g((fx)r)r
  { inline 1st >>= }
  = λr.(λx.λr.g(fxr)r)(ar)r
  { β-reduce }
  = λr.g(f(ar)r)r

Maybe

return x = λj.λn.jx
a >>= k  = λj.λn.a(λx.kxjn)n
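
Again for comparison (my addition), the Church-encoded Maybe can be transcribed into Haskell with RankNTypes; a value takes a "just" continuation and a "nothing" result:

    {-# LANGUAGE RankNTypes #-}

    type MaybeC a = forall r. (a -> r) -> r -> r

    returnM :: a -> MaybeC a
    returnM x = \j _n -> j x                  -- λj.λn.jx

    bindM :: MaybeC a -> (a -> MaybeC b) -> MaybeC b
    bindM a k = \j n -> a (\x -> k x j n) n   -- λj.λn.a(λx.kxjn)n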

Left identity

return x >>= k
  { inline return }
  = λj.λn.jx >>= k
  { inline >>= }
  = λj.λn.(λj.λn.jx)(λx.kxjn)n
  { β-reduce }
  = λj.λn.kxjn
  { η-reduce }
  = kx

Right identity

a >>= return
  { inline return }
  = a >>= λx.λj.λn.jx
  { inline >>= }
  = λj.λn.a(λx.(λx.λj.λn.jx)xjn)n
  { β-reduce }
  = λj.λn.a(λx.jx)n
  { η-reduce }
  = λj.λn.ajn
  { η-reduce }
  = a

Associativity

a >>= f >>= g
  { inline 1st >>= }
  = (λj.λn.a(λx.fxjn)n) >>= g
  { inline 2nd >>= }
  = (λj.λn.(λj.λn.a(λx.fxjn)n)(λx.gxjn)n)
  { β-reduce }
  = λj.λn.a(λx.fx(λx.gxjn)n)n
a >>= (λx. f x >>= g)
  { inline 2nd >>= }
  = a >>= (λx.λj.λn.fx(λx.gxjn)n)
  { inline 1st >>= }
  = λj.λn.a(λx.(λx.λj.λn.fx(λx.gxjn)n)xjn)n
  { β-reduce }
  = λj.λn.a(λx.fx(λx.gxjn)n)n

February 22, 2015 08:00 PM

Robert Harper

OPLSS 2015 Announcement

We are pleased to announce the preliminary program for the 14th Annual Oregon Programming Languages Summer School (OPLSS) to be held June 15th to 27th, 2015 at the University of Oregon in Eugene.

This year’s program is titled Types, Logic, Semantics, and Verification and features the following speakers:

  • Amal Ahmed, Northeastern University
  • Nick Benton, Microsoft Cambridge Research Lab
  • Adam Chlipala, Massachusetts Institute of Technology
  • Robert Constable, Cornell University
  • Peter Dybjer, Chalmers University of Technology
  • Robert Harper, Carnegie Mellon University
  • Ed Morehouse, Carnegie Mellon University
  • Greg Morrisett, Harvard University
  • Frank Pfenning, Carnegie Mellon University

The registration deadline is March 16, 2015.

Full information on registration and scholarships is available at https://www.cs.uoregon.edu/research/summerschool/.

Please address all inquiries to [email protected].

Best regards from the OPLSS 2015 organizers!

Robert Harper

Greg Morrisett

Zena Ariola

 

Filed under: Research, Teaching Tagged: OPLSS15

by Robert Harper at February 22, 2015 07:40 PM

February 21, 2015

Edward Kmett

Free Monoids in Haskell

It is often stated that Foldable is effectively the toList class. However, this turns out to be wrong. The real fundamental member of Foldable is foldMap (which should look suspiciously like traverse, incidentally). To understand exactly why this is, it helps to understand another surprising fact: lists are not free monoids in Haskell.

This latter fact can be seen relatively easily by considering another list-like type:

 
data SL a = Empty | SL a :> a
 
instance Monoid (SL a) where
  mempty = Empty
  mappend ys Empty = ys
  mappend ys (xs :> x) = (mappend ys xs) :> x
 
single :: a -> SL a
single x = Empty :> x
 

So, we have a type SL a of snoc lists, which are a monoid, and a function that embeds a into SL a. If (ordinary) lists were the free monoid, there would be a unique monoid homomorphism from lists to snoc lists. Such a homomorphism (call it h) would have the following properties:

 
h [] = Empty
h (xs <> ys) = h xs <> h ys
h [x] = single x
 

And in fact, this (together with some general facts about Haskell functions) should be enough to define h for our purposes (or any purposes, really). So, let's consider its behavior on two values:

 
h [1] = single 1
 
h [1,1..] = h ([1] <> [1,1..]) -- [1,1..] is an infinite list of 1s
          = h [1] <> h [1,1..]
 

This second equation can tell us what the value of h is at this infinite value, since we can consider it the definition of a possibly infinite value:

 
x = h [1] <> x = fix (single 1 <>)
h [1,1..] = x
 

(single 1 <>) is a strict function, so the fixed point theorem tells us that x = ⊥.

This is a problem, though. Considering some additional equations:

 
[1,1..] <> [n] = [1,1..] -- true for all n
h [1,1..] = ⊥
h ([1,1..] <> [1]) = h [1,1..] <> h [1]
                   = ⊥ <> single 1
                   = ⊥ :> 1
                   ≠ ⊥
 

So, our requirements for h are contradictory, and no such homomorphism can exist.

The issue is that Haskell types are domains. They contain these extra partially defined values and infinite values. The monoid structure on (cons) lists has infinite lists absorbing all right-hand sides, while the snoc lists are just the opposite.

This also means that finite lists (or any method of implementing finite sequences) are not free monoids in Haskell. They, as domains, still contain the additional bottom element, and it absorbs all other elements, which is incorrect behavior for the free monoid:

 
pure x <> ⊥ = ⊥
h ⊥ = ⊥
h (pure x <> ⊥) = [x] <> h ⊥
                = [x] ++ ⊥
                = x:⊥
                ≠ ⊥
 

So, what is the free monoid? In a sense, it can't be written down at all in Haskell, because we cannot enforce value-level equations, and because we don't have quotients. But, if conventions are good enough, there is a way. First, suppose we have a free monoid type FM a. Then for any other monoid m and embedding a -> m, there must be a monoid homomorphism from FM a to m. We can model this as a Haskell type:

 
forall a m. Monoid m => (a -> m) -> FM a -> m
 

Where we consider the Monoid m constraint to be enforcing that m actually has valid monoid structure. Now, a trick is to recognize that this sort of universal property can be used to define types in Haskell (or, GHC at least), due to polymorphic types being first class; we just rearrange the arguments and quantifiers, and take FM a to be the polymorphic type:

 
newtype FM a = FM { unFM :: forall m. Monoid m => (a -> m) -> m }
 

Types defined like this are automatically universal in the right sense. [1] The only thing we have to check is that FM a is actually a monoid over a. But that turns out to be easily witnessed:

 
embed :: a -> FM a
embed x = FM $ \k -> k x
 
instance Monoid (FM a) where
  mempty = FM $ \_ -> mempty
  mappend (FM e1) (FM e2) = FM $ \k -> e1 k <> e2 k
 

Demonstrating that the above is a proper monoid delegates to instances of Monoid being proper monoids. So as long as we trust that convention, we have a free monoid.
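
To make the universal property concrete (this helper is my addition, though it follows directly from the definition): the unique monoid homomorphism out of FM a determined by a map of generators is just application.

    -- Eliminate an FM a into any Monoid m, given how to embed the generators.
    foldFM :: Monoid m => (a -> m) -> FM a -> m
    foldFM k (FM e) = e k

    -- For example, foldFM (:[]) maps FM a onto ordinary lists, and
    -- foldFM k (embed x) = k x, as required of the universal map.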

However, one might wonder what a free monoid would look like as something closer to a traditional data type. To construct that, first ignore the required equations, and consider only the generators; we get:

 
data FMG a = None | Single a | FMG a :<> FMG a
 

Now, the proper FM a is the quotient of this by the equations:

 
None :<> x = x = x :<> None
x :<> (y :<> z) = (x :<> y) :<> z
 

One way of mimicking this in Haskell is to hide the implementation in a module, and only allow elimination into Monoids (again, using the convention that Monoid ensures actual monoid structure) using the function:

 
unFMG :: forall a m. Monoid m => FMG a -> (a -> m) -> m
unFMG None _ = mempty
unFMG (Single x) k = k x
unFMG (x :<> y) k = unFMG x k <> unFMG y k
 

This is actually how quotients can be thought of in richer languages; the quotient does not eliminate any of the generated structure internally, it just restricts the way in which the values can be consumed. Those richer languages just allow us to prove equations, and enforce properties by proof obligations, rather than conventions and structure hiding. Also, one should note that the above should look pretty similar to our encoding of FM a using universal quantification earlier.

Now, one might look at the above and have some objections. For one, we'd normally think that the quotient of the above type is just [a]. Second, it seems like the type is revealing something about the associativity of the operations, because defining recursive values via left nesting is different from right nesting, and this difference is observable by extracting into different monoids. But aren't monoids supposed to remove associativity as a concern? For instance:

 
ones1 = embed 1 <> ones1
ones2 = ones2 <> embed 1
 

Shouldn't we be able to prove these are the same, because of an argument like:

 
ones1 = embed 1 <> (embed 1 <> ...)
      ... reassociate ...
      = (... <> embed 1) <> embed 1
      = ones2
 

The answer is that the equation we have only specifies the behavior of associating three values:

 
x <> (y <> z) = (x <> y) <> z
 

And while this is sufficient to nail down the behavior of finite values, and finitary reassociating, it does not tell us that infinitary reassociating yields the same value back. And the "... reassociate ..." step in the argument above was decidedly infinitary. And while the rules tell us that we can peel any finite number of copies of embed 1 to the front of ones1 or the end of ones2, it does not tell us that ones1 = ones2. And in fact it is vital for FM a to have distinct values for these two things; it is what makes it the free monoid when we're dealing with domains of lazy values.

Finally, we can come back to Foldable. If we look at foldMap:

 
foldMap :: (Foldable f, Monoid m) => (a -> m) -> f a -> m
 

we can rearrange things a bit, and get the type:

 
Foldable f => f a -> (forall m. Monoid m => (a -> m) -> m)
 

And thus, the most fundamental operation of Foldable is not toList, but toFreeMonoid, and lists are not free monoids in Haskell.

[1]: What we are doing here is noting that (co)limits are objects that internalize natural transformations, but the natural transformations expressible by quantification in GHC are already automatically internalized using quantifiers. However, one has to be careful that the quantifiers are actually enforcing the relevant naturality conditions. In many simple cases they are.

by Dan Doel at February 21, 2015 11:08 PM

Keegan McAllister

Turing tarpits in Rust's macro system

Bitwise Cyclic Tag is an extremely simple automaton slash programming language. BCT uses a program string and a data string, each made of bits. The program string is interpreted as if it were infinite, by looping back around to the first bit.

The program consists of commands executed in order. There is a single one-bit command:

0: Delete the left-most data bit.

and a single two-bit command:

1 x: If the left-most data bit is 1, copy bit x to the right of the data string.

We halt if ever the data string is empty.
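
Before looking at the macro, here is a small Haskell sketch of the same semantics (my own illustration, not from the post), which may make the trace further down easier to follow:

    -- A sketch of BCT: the program is treated as cyclic, and we collect the
    -- successive data strings, stopping when the data string is empty.
    bctSteps :: [Int] -> [Int] -> [[Int]]
    bctSteps prog = go (cycle prog)
      where
        -- Command 0: delete the left-most data bit.
        go (0 : ps) (_ : ds) = ds : go ps ds
        -- Command "1 x": if the left-most data bit is 1, append x on the right.
        go (1 : x : ps) ds@(1 : _) = ds' : go ps ds' where ds' = ds ++ [x]
        go (1 : _ : ps) ds@(_ : _) = ds : go ps ds
        -- Halt when the data string is empty (or the program is malformed).
        go _ _ = []

    -- For the program and data used below:
    --   take 5 (bctSteps [0,0,1,1,1] [1,0,1]) == [[0,1],[1],[1,1],[1,1,0],[1,0]]
    -- which matches the data strings in the first few lines of the macro trace.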

Remarkably, this is enough to do universal computation. Implementing it in Rust's macro system gives a proof (probably not the first one) that Rust's macro system is Turing-complete, aside from the recursion limit imposed by the compiler.


#![feature(trace_macros)]

macro_rules! bct {
    // cmd 0: d ... => ...
    (0, $($ps:tt),* ; $_d:tt)
        => (bct!($($ps),*, 0 ; ));
    (0, $($ps:tt),* ; $_d:tt, $($ds:tt),*)
        => (bct!($($ps),*, 0 ; $($ds),*));

    // cmd 1p: 1 ... => 1 ... p
    (1, $p:tt, $($ps:tt),* ; 1)
        => (bct!($($ps),*, 1, $p ; 1, $p));
    (1, $p:tt, $($ps:tt),* ; 1, $($ds:tt),*)
        => (bct!($($ps),*, 1, $p ; 1, $($ds),*, $p));

    // cmd 1p: 0 ... => 0 ...
    (1, $p:tt, $($ps:tt),* ; $($ds:tt),*)
        => (bct!($($ps),*, 1, $p ; $($ds),*));

    // halt on empty data string
    ( $($ps:tt),* ; )
        => (());
}

fn main() {
trace_macros!(true);
bct!(0, 0, 1, 1, 1 ; 1, 0, 1);
}

This produces the following compiler output:

bct! { 0 , 0 , 1 , 1 , 1 ; 1 , 0 , 1 }
bct! { 0 , 1 , 1 , 1 , 0 ; 0 , 1 }
bct! { 1 , 1 , 1 , 0 , 0 ; 1 }
bct! { 1 , 0 , 0 , 1 , 1 ; 1 , 1 }
bct! { 0 , 1 , 1 , 1 , 0 ; 1 , 1 , 0 }
bct! { 1 , 1 , 1 , 0 , 0 ; 1 , 0 }
bct! { 1 , 0 , 0 , 1 , 1 ; 1 , 0 , 1 }
bct! { 0 , 1 , 1 , 1 , 0 ; 1 , 0 , 1 , 0 }
bct! { 1 , 1 , 1 , 0 , 0 ; 0 , 1 , 0 }
bct! { 1 , 0 , 0 , 1 , 1 ; 0 , 1 , 0 }
bct! { 0 , 1 , 1 , 1 , 0 ; 0 , 1 , 0 }
bct! { 1 , 1 , 1 , 0 , 0 ; 1 , 0 }
bct! { 1 , 0 , 0 , 1 , 1 ; 1 , 0 , 1 }
bct! { 0 , 1 , 1 , 1 , 0 ; 1 , 0 , 1 , 0 }
bct! { 1 , 1 , 1 , 0 , 0 ; 0 , 1 , 0 }
bct! { 1 , 0 , 0 , 1 , 1 ; 0 , 1 , 0 }
bct! { 0 , 1 , 1 , 1 , 0 ; 0 , 1 , 0 }
...
bct.rs:19:13: 19:45 error: recursion limit reached while expanding the macro `bct`
bct.rs:19 => (bct!($($ps),*, 1, $p ; $($ds),*));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can try it online, as well.

Notes about the macro

I would much rather drop the commas and write


// cmd 0: d ... => ...
(0 $($ps:tt)* ; $_d:tt $($ds:tt)*)
=> (bct!($($ps)* 0 ; $($ds)*));

// cmd 1p: 1 ... => 1 ... p
(1 $p:tt $($ps:tt)* ; 1 $($ds:tt)*)
=> (bct!($($ps)* 1 $p ; 1 $($ds)* $p));

// cmd 1p: 0 ... => 0 ...
(1 $p:tt $($ps:tt)* ; $($ds:tt)*)
=> (bct!($($ps)* 1 $p ; $($ds)*));

but this runs into the macro future-proofing rules.

If we're required to have commas, then it's at least nice to handle them uniformly, e.g.


// cmd 0: d ... => ...
(0 $(, $ps:tt)* ; $_d:tt $(, $ds:tt)*)
=> (bct!($($ps),*, 0 ; $($ds),*));

// cmd 1p: 1 ... => 1 ... p
(1, $p:tt $(, $ps:tt)* ; $($ds:tt),*)
=> (bct!($($ps),*, 1, $p ; 1 $(, $ds)*, $p));

// cmd 1p: 0 ... => 0 ...
(1, $p:tt $(, $ps:tt)* ; $($ds:tt),*)
=> (bct!($($ps),*, 1, $p ; $($ds),*));

But this too is disallowed. An $x:tt variable cannot be followed by a repetition $(...)*, even though it's (I believe) harmless. There is an open RFC about this issue. For now I have to handle the "one" and "more than one" cases separately, which is annoying.

In general, I don't think macro_rules! is a good language for arbitrary computation. This experiment shows the hassle involved in implementing one of the simplest known "arbitrary computations". Rather, macro_rules! is good at expressing patterns of code reuse that don't require elaborate compile-time processing. It does so in a way that's declarative, hygienic, and high-level.

However, there is a big middle ground of non-elaborate, but non-trivial computations. macro_rules! is hardly ideal for that, but procedural macros have problems of their own. Indeed, the bct! macro is an extreme case of a pattern I've found useful in the real world. The idea is that every recursive invocation of a macro gives you another opportunity to pattern-match the arguments. Some of html5ever's macros do this, for example.

by keegan ([email protected]) at February 21, 2015 01:55 AM

February 20, 2015

Robert Harper

Structure and Efficiency of Computer Programs

For decades my colleague, Guy Blelloch, and I have promoted a grand synthesis of the two “theories” of computer science, combinatorial theory and logical theory.  It is only a small exaggeration to say that these two schools of thought work in isolation.  The combinatorial theorists concern themselves with efficiency, based on hypothetical translations of high-level algorithms to low-level machines, and have no useful theory of composition, the most important tool for developing large software systems.  Logical theorists concern themselves with composition, emphasizing the analysis of the properties of components of systems and how those components are combined; the heart of logic is a theory of composition (entailment).  But relatively scant attention is paid to efficiency, and, to a distressingly large extent, the situation is worsening, and not improving.

Guy and I have argued, through our separate and joint work, for the applicability of PL ideas to algorithms design, leading, for example, to the concept of adaptive programming that Umut Acar has pursued aggressively over the last dozen years.  And we have argued for the importance of cost analysis, for various measures of cost, at the level of the code that one actually writes, and not how it is compiled.  Last spring, prompted by discussions with Anindya Banerjee at NSF in the winter of 2014, I decided to write a position paper on the topic, outlining the scientific opportunities and challenges that would arise in an attempt to unify the two disparate theories of computing.  I circulated the first draft privately in May, and revised it in July to prepare for a conference call among algorithms and PL researchers (sponsored by NSF) to find common ground and isolate key technical challenges to achieving its goals.

There are serious obstacles to be overcome if a grand synthesis of the “two theories” is to be achieved.  The first step is to get the right people together to discuss the issues and to formulate a unified vision of what are the core problems, and what are promising directions for short- and long-term research.  The position paper is not a proposal for funding, but is rather a proposal for a meeting designed to bring together two largely (but not entirely) disparate communities.  In the summer of 2014 NSF hosted a three-hour-long conference call among a number of researchers in both areas with a view towards hosting a workshop in the near future.  Please keep an eye out for future developments.

I am grateful to Anindya Banerjee at NSF for initiating the discussion last winter that led to the paper and discussion, and I am grateful to Swarat Chaudhuri for his helpful comments on the proposal.

[Update: word smithing, corrections, updating, removed discussion of cost models for fuller treatment later, fixed incoherence after revision.]


Filed under: Research Tagged: algorithms, programming languages, research

by Robert Harper at February 20, 2015 10:37 PM

Austin Seipp

The New Haskell Homepage

Roughly 3/4 of a year after Chris Done first proposed his redesign, we finally went live with the new https://haskell.org homepage.

Much of the intermediate time was spent on cleanup of the introductory text, as well as adding features and tweaking designs. There was also a very lengthy process of setting everything up to ensure that the "try it" feature could be deployed and well supported. Finally, we had to do all the work to ensure both that the wiki remained well hosted elsewhere with proper rewrite rules, and that the non-wiki content hosted under haskell.org (things like /ghc, /platform, /hoogle, etc) continued to work. Some of the more publicly visible elements of this are the move of mailing list hosting to http://mail.haskell.org and of the wiki content to http://wiki.haskell.org.

When we did finally go live, we got great feedback on reddit and made #1 on hacker news.

There were also a lot of things we missed, which those threads pointed out -- in particular, we didn't pay enough attention to the content of various pages. The "documentation" and "community" sections needed lots of cleanup and expansion. Furthermore, the "downloads" page didn't point to the platform. Whoops! (Note that whether or not to recommend the platform is itself a matter of some discussion now, for example on this active reddit thread.)

We think that we have fixed most of the issues raised. But there are still issues! The code sample for primes doesn't work in the repl. We have an active github ticket discussing the best way to fix this -- either change the sample, change the repl, or both. This is one of a number of issues under discussion on the github tracker.

We still want feedback, we still want patches, and there's still plenty to be made more awesome. For example: should we have some sort of integrated search? How? Where? The github infrastructure makes taking patches and tracking these issues easy. The repo now has 27 contributors, and we're looking forward to more.

One balance that has been tricky to strike has been in making the site maximally useful for new users just looking to learn and explore Haskell, while also providing access to the wide range of resources and entry points that the old wiki did. One advantage that we have now is that the wiki is still around, and is still great. Freed from needing to also serve as a default haskell landing page, the wiki can hopefully grow to be an even better resource. It needs volunteers to chip in and curate the content. And ideally it needs a new front page that highlights the things that you _won't_ find on the main homepage, instead of the things you now can. One place we could use more help is in wiki admins, if anyone wants to volunteer. We need people to curate the news and events sections, to fight spam, to manage plugins and investigate adding new features, to update and clean up the stylesheets, and to manage new user accounts. If you're interested, please get in touch at [email protected]

All the resources we have today are the result of our open-source culture, that believes in equal parts in education, innovation, and sharing. They stem from people who enjoy Haskell wanting to contribute back to others -- wanting to make more accessible the knowledge and tools they've struggled to master, and wanting to help us make even better and more powerful tools.

There's always more infrastructure work to be done, and there are always more ways to get involved. In a forthcoming blog, I'll write further about some of the other pressing issues we hope to tackle, and where interested parties can chip in.

by gershomb (Gershom) at February 20, 2015 08:55 PM

February 19, 2015

Noam Lewis

Ad-hoc polymorphism, type families, ambiguous types, and Javascript

Imagine a language, where all we have are strings and numbers, and where + is a built-in function that can either add numbers or concatenate strings. Consider the following code:

a + b

What are the types of a, b and +?

Using a Haskell-like type signature we can say that + has either of these types:

+ :: (Number, Number) -> Number

or

+ :: (String, String) -> String

(Currying avoided intentionally.)

This is a classic case of ad-hoc polymorphism. With type classes one could say:

class Plus a where
  (+) :: (a, a) -> a

instance Plus Number where
  x + y = ... implement addition ...

instance Plus String where
  x + y = ... implement string concatenation ...

That’s great! Now we can type our a + b:

a + b :: Plus t => t

Where a :: t and b :: t.

Yes, there are also Scala-style implicits (recently discussed here in a proposal for OCaml), and probably other solutions I’m less aware of.

Notice that in the + example, a constraint on a single type (expressed through type class requirements) was enough to solve the problem.

Constraints on multiple types

Now, let’s look at a more complicated problem, involving multiple types at once.

Consider a language with only parameterized lists [a] and maps from strings to some type, Map a. Now, throw in a terribly ambiguous syntax where brackets denote either accessing a list at some index, or accessing a map at some key (wat):

   x[i]

That syntax means “access the container x at index/key i”. Now, what are the types of x, i and the brackets? There are two possibilities: if x is an array, then i must be a number; otherwise if x is a map, then i is a string.

Type families can be used to encode the above constraints:

class Indexable a where
  type Index a
  type Result a
  atIndex :: (a, Index a) -> Result a

The syntax means that any instance of the type class Indexable a “comes with” two accompanying types, Index a and Result a which are uniquely determined by the appropriate choice of a. For [t], Index = Number and Result = t. For Map t, Index = String and Result = t.

Now we just need syntax sugar where x[i] = x `atIndex` i. We could then define instances for our two types, [a] and Map a (remember, in our language the map keys are always Strings):

instance Indexable [a] where
  type Index [a] = Number
  type Result [a] = a
  atIndex = ... implement list lookup by index ...

instance Indexable (Map a) where
  type Index (Map a) = String
  type Result (Map a) = a
  atIndex = ... implement map lookup by key ...

Nice. Now, to type our original expression x[i]:

x[i] :: Indexable a => Result a

Where x :: a and i :: Index a.

Great! Type families (or rather, “type dependencies”) provide a way to represent inter-type constraints and can be used to resolve ambiguous expressions during type inference. (I’ve heard that type families are equivalent to functional dependencies or even GADTs for some value of “equivalent”, maybe where “equivalent = not equivalent at all”, but that’s off-topic.) See also a valid Haskell implementation of the above example (thanks to Eyal Lotem).
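
For readers who want something that compiles as-is, here is a minimal sketch of the Indexable example; Int stands in for the toy language's Number, Data.Map with String keys stands in for its Map, and lookups are partial, matching the post's signature:

{-# LANGUAGE TypeFamilies #-}

import qualified Data.Map as M

class Indexable c where
  type Index c
  type Result c
  atIndex :: (c, Index c) -> Result c   -- currying avoided, as in the post

instance Indexable [a] where
  type Index [a] = Int
  type Result [a] = a
  atIndex (xs, i) = xs !! i             -- list lookup by index

instance Indexable (M.Map String a) where
  type Index (M.Map String a) = String
  type Result (M.Map String a) = a
  atIndex (m, k) = m M.! k              -- map lookup by key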

I don’t know if Scala-style implicits can be used to such effect – let me know if you do.

Ambiguous Types

Now, here’s an altogether different way to approach the ad-hoc polymorphism problem. This was my idea for a solution to ad-hoc polymorphism with inter-type constraints, before I realized type families could do that too.

Define an “ambiguous type” as a type that represents a disjunction between a set of known types. For example, for + we could say:

+ :: (a = (String | Number)) => (a, a) -> a

The type variable a is constrained to be either a String or a Number, explicitly, without a type class. I see a few main differences with type classes, from a usage point of view. First, this type is closed: because there is no class, you can’t define new instances; it must be a Number or a String. Second, you don’t need to add + to some class, and if we have more operators that require a similar constraint, or even user-defined functions that require some kind of ad-hoc constraint, we don’t need to define a type class and add functions/operators to it. Lastly, the type is straightforward, and is easier to explain to people familiar with types but not with type classes.

By the way, I have no idea (yet) if this leads to a sound type system.

Ambiguous types involving multiple type variables

Let’s continue to our more complicated, ambiguous “atIndex” function that can either index into lists (using a number) or into maps (using a string). A simple disjunction (or, |) is not enough: we must express the dependencies between the container type and the index type. To that end we add conjunctions (and, &) so we can express arbitrary predicates on types, such as (t & s) | u. The type of atIndex will now be:

atIndex :: (a = [t] & i = Number) | (a = Map t & i = String) => a -> i -> t

This definitely does the job. Is it sound? Will it blend? I don’t know (yet).

The main drawback of this system is the combinatorial explosion of the predicate when combining ambiguous (overloaded) functions, such as in the following program:

x[i] + y[j]

x could be either a list or map of either numbers or strings, and so can y, so i and j can be either numbers or strings (to index into the lists or maps). We quickly get to a very large predicate expression, that is slow to analyze and more importantly, very hard to read and understand.

Nevertheless, I implemented it.

Implementing ambiguous types

Infernu is a type inference engine for JavaScript. All of the examples described above are “features” of the JavaScript language. One of the goals of Infernu is to infer the safest most general type. JS expressions such as x[i] + d should be allowed to be used in a way that makes sense. To preserve safety, Infernu doesn’t allow implicit coercions anywhere, including in +, or when indexing into a map (JS objects can act as string-keyed maps). To retain a pseudo-dynamic behavior safely, polymorphism with fancy type constraints as discussed above are required.

To properly implement ambiguous types I had to solve a bunch of implementation problems, such as:

  1. How to unify two types that have constraints on them? The problem is of finding the intersection of two arbitrary boolean expressions. My solution was to convert both equations to DNF, which is just a big top-level OR between many AND (conjunction) expressions (and no further OR or ANDs). Then compare every combination pair of conjunctions and rule out ones that don’t match. The surviving set of conjunctions is used to build a new DNF expression.
  2. How to simplify the resulting predicate? While an optimal solution exists, it is NP-hard, and more importantly, I was too lazy to implement it. Instead I ended up using a heuristic constructive approach where while building expressions, I try to find patterns that can be simplified.
  3. A software engineering challenge: when building the constraint unifier, how to design an API that allows effectful unification at the leaf nodes when we are testing conjunctions? I ended up using some Applicative/Traversable magic (but the code isn’t pretty). Rumor has it that lenses may make it much nicer.
  4. How to represent constraints in the types AST? I followed what is allegedly GHC’s approach by defining a wrapper data type, which wraps a type with an accompanying predicate, something like: data QualType t = QualType (Pred t) (Type t). Then, the unification function is left unchanged: when doing inference, I also call the constraint unifier separately, using its result to “annotate” the inferred type with a constraint predicate.

Conclusions

Apparently, the ambiguous types are not a good solution due to the complex type signatures. I’ll either leave the ambiguous types in (having already implemented them) or just get rid of them and implement type families, which will require another crusade on my code.

I’m still thinking about whether or not type families cover all cases that my ambiguous types can. One example is the “type guard” pattern common in JavaScript:

if (typeof obj == 'string') {
    .. here obj should be treated as a string! ...
}

Can ambiguous types and type families both be used coherently to implement compile-time checking inside type guards? (Haven’t given it much thought – so far I think yes for both.)

Also, since ambiguous types are closed, they may offer some features that type families can’t, such as warnings about invalid or incomplete guards, which can be seen as type pattern matching. Maybe closed type families are a solution: I don’t know much about them yet.

I also don’t know if ambiguous types lead to a sound type system, or whether there are pitfalls I haven’t considered. Remember that these ambiguous types may also interact with many features: parametric polymorphism, row-type polymorphism, the requirement for principal types and full type inference without annotations, etc.

Lastly, I’ve probably re-invented the wheel and somebody has written a bunch of papers in 1932, and there’s some well-accepted wisdom I’ve somehow missed.

Acknowledgement

Thanks to Eyal Lotem for a short, but very fruitful conversation where he showed me how type families may obsolete my idea.


Tagged: Haskell, Infernu, Javascript

by sinelaw at February 19, 2015 10:26 PM

February 18, 2015

The GHC Team

GHC Weekly News - 2015/02/17

Hi *,

It's time for the GHC weekly news. It's been a particularly quiet week, and the ghc-7.10 branch has seen little activity, so the notes are relatively short this week.

This week, GHC HQ met up to discuss some new stuff:

  • Most of the discussion this week was about particular bugs for GHC 7.10, including getting some tickets fixed like #10058, #8276, and #9968.
  • Since the 7.10 release is getting close, we'll soon be starting up a new status page about GHC 7.12 and our plans for it (and probably get started writing things for the HCAR report in May). Watch this space!

As usual, we've had a healthy amount of random assorted chatter on the mailing lists:

Some noteworthy commits that went into ghc.git in the past week include:

Closed tickets the past week include: #10047, #10082, #10019, #10007, #9930, #10085, #10080, #9266, #10095, and #3649.

by thoughtpolice at February 18, 2015 04:06 AM

February 17, 2015

Neil Mitchell

nub considered harmful

Summary: Don't use nub. A much faster alternative is nubOrd from the extra package.

The Haskell Data.List module contains the function nub, which removes duplicate elements. As an example:

nub [1,2,1,3] ==  [1,2,3]

The function nub has the type Eq a => [a] -> [a]. The complexity of take i $ nub xs is O(length xs * i). Assuming all elements are distinct and you want them all, that is O(length xs ^ 2). If we only have an Eq instance, that's the best complexity we can achieve. The reason is that given a list as ++ [b], to check if b should be in the output requires checking b for equality against nub as, which requires a linear scan. Since checking each element requires a linear scan, we end up with a quadratic complexity.

However, if we have an Ord instance (as we usually do), we have a complexity of O(length xs * log i) - a function that grows significantly slower. The reason is that we can build a balanced binary-tree for the previous elements, and check each new element in log time. Does that make a difference in practice? Yes. As the graph below shows, by the time we get to 10,000 elements, nub is 70 times slower. Even at 1,000 elements nub is 8 times slower.


The fact nub is dangerous isn't new information, and I even suggested changing the base library in 2007. Currently there seems to be a nub hit squad, including Niklas Hambüchen, who go around raising tickets against various projects suggesting they avoid nub. To make that easier, I've added nubOrd to my extra package, in the Data.List.Extra module. The function nubOrd has exactly the same semantics as nub (both strictness properties and ordering), but is asymptotically faster, so is almost a drop-in replacement (just the additional Ord context).
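
For intuition, here is a minimal sketch of an Ord-based nub along the lines described above, using Data.Set as the balanced tree; the real nubOrd in extra is written more carefully (and matches nub's exact strictness and ordering behaviour), so prefer it over this sketch.

import qualified Data.Set as Set

-- Keep a set of elements seen so far; each membership test and insert is
-- O(log i), giving the O(length xs * log i) complexity described above.
nubOrdSketch :: Ord a => [a] -> [a]
nubOrdSketch = go Set.empty
  where
    go _ [] = []
    go seen (x:xs)
        | x `Set.member` seen = go seen xs
        | otherwise           = x : go (Set.insert x seen) xs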

For the curious, the above graph was generated in Excel, with the code below. I expect the spikes in nub correspond to garbage collection, or just random machine fluctuations.

import Control.Exception
import Data.List.Extra
import Control.Monad
import System.Time.Extra

benchmark xs = do
    n <- evaluate $ length xs
    (t1,_) <- duration $ evaluate $ length $ nub xs
    (t2,_) <- duration $ evaluate $ length $ nubOrd xs
    putStrLn $ show n ++ "," ++ show t1 ++ "," ++ show t2

main = do
    forM_ [0,100..10000] $ \i -> benchmark $ replicate i 1
    forM_ [0,100..10000] $ \i -> benchmark [1..i]

by Neil Mitchell ([email protected]) at February 17, 2015 09:00 PM

FP Complete

Primitive Haskell

I originally wrote this content as a chapter of Mezzo Haskell. I'm going to be starting up a similar effort to Mezzo Haskell in the next few days, and I wanted to get a little more attention on this content to get feedback on style and teaching approach. I'll be discussing that new initiative on the Commercial Haskell mailing list.


The point of this chapter is to help you peel back some of the layers of abstraction in Haskell coding, with the goal of understanding things like primitive operations, evaluation order, and mutation. Some concepts covered here are generally "common knowledge" in the community, while others are less well understood. The goal is to cover the entire topic in a cohesive manner. If a specific section seems like it's not covering anything you don't already know, skim through it and move on to the next one.

While this chapter is called "Primitive Haskell," the topics are very much GHC-specific. I avoided calling it "Primitive GHC" for fear of people assuming it was about the internals of GHC itself. To be clear: these topics apply to anyone compiling their Haskell code using the GHC compiler.

Note that we will not be fully covering all topics here. There is a "further reading" section at the end of this chapter with links for more details.

Let's do addition

Let's start with a really simple question: tell me how GHC deals with the expression 1 + 2. What actually happens inside GHC? Well, that's a bit of a trick question, since the expression is polymorphic. Let's instead use the more concrete expression 1 + 2 :: Int.

The + operator is actually a method of the Num type class, so we need to look at the Num Int instance:

instance  Num Int  where
    I# x + I# y = I# (x +# y)

Huh... well that looks somewhat magical. Now we need to understand both the I# constructor and the +# operator (and what's with the hashes all of a sudden?). If we do a Hoogle search, we can easily find the relevant docs, which leads us to the following definition:

data Int = I# Int#

So our first lesson: the Int data type you've been using since you first started with Haskell isn't magical at all, it's defined just like any other algebraic data type... except for those hashes. We can also search for +#, and end up at some documentation giving the type:

+# :: Int# -> Int# -> Int#

Now that we know all the types involved, go back and look at the Num instance I quoted above, and make sure you feel comfortable that all the types add up (no pun intended). Hurrah, we now understand exactly how addition of Ints works. Right?

Well, not so fast. The Haddocks for +# have a very convenient source link... which (apparently due to a Haddock bug) doesn't actually work. However, it's easy enough to find the correct hyperlinked source. And now we see the implementation of +#, which is:

infixl 6 +#
(+#) :: Int# -> Int# -> Int#
(+#) = let x = x in x

That doesn't look like addition, does it? In fact, let x = x in x is another way of saying bottom, or undefined, or infinite loop. We have now officially entered the world of primops.

primops

primops, short for primitive operations, are core pieces of functionality provided by GHC itself. They are the magical boundary between "things we do in Haskell itself" and "things which our implementation provides." This division is actually quite elegant; as we already explored, the standard + operator and Int data type you're used to are actually themselves defined in normal Haskell code, which provides many benefits: you get standard type class support, laziness, etc. We'll explore some of that in more detail later.

Look at the implementation of other functions in GHC.Prim; they're all defined as let x = x in x. When GHC reaches a call to one of these primops, it automatically replaces it with the real implementation for you, which will be some assembly code, LLVM code, or something similar.

Why do all of these functions end in a #? That's called the magic hash (enabled by the MagicHash language extension), and it is a convention to distinguish boxed and unboxed types and operations. Which, of course, brings us to our next topic.

Unboxed types

The I# constructor is actually just a normal data constructor in Haskell, which happens to end with a magic hash. However, Int# is not a normal Haskell data type. In GHC.Prim, we can see that its implementation is:

data Int#

Which, like everything else in GHC.Prim, is really a lie. In fact, it's provided by the implementation, and is really a normal long int from C (32-bit or 64-bit, depending on architecture). We can see something even funnier about it in GHCi:

> :k Int
Int :: *
> :k Int#
Int# :: #

That's right, Int# has a different kind than normal Haskell datatypes: #. To quote the GHC docs:

Most types in GHC are boxed, which means that values of that type are represented by a pointer to a heap object. The representation of a Haskell Int, for example, is a two-word heap object. An unboxed type, however, is represented by the value itself, no pointers or heap allocation are involved.

See those docs for more information on distinctions between boxed and unboxed types. It is vital to understand those differences when working with unboxed values. However, we're not going to go into those details now. Instead, let's sum up what we've learnt so far:

  • Int addition is just normal Haskell code in a typeclass
  • Int itself is a normal Haskell datatype
  • GHC provides Int# and +# as an unboxed long int and addition on that type, respectively. This is exported by GHC.Prim, but the real implementation is "inside" GHC.
  • An Int contains an Int#, which is an unboxed type.
  • Addition of Ints takes advantage of the +# primop.

More addition

Alright, we understand basic addition! Let's make things a bit more complicated. Consider the program:

main = do
    let x = 1 + 2
        y = 3 + 4
    print x
    print y

We know for certain that the program will first print 3, and then print 7. But let me ask you a different question. Which operation will GHC perform first: 1 + 2 or 3 + 4? If you guessed 1 + 2, you're probably right, but not necessarily! Thanks to referential transparency, GHC is fully within its rights to rearrange evaluation of those expressions and add 3 + 4 before 1 + 2. Since neither expression depends on the result of the other, we know that it is irrelevant which evaluation occurs first.

Note: This is covered in much more detail on the GHC wiki's evaluation order and state tokens page.

That begs the question: if GHC is free to rearrange evaluation like that, how could I say in the previous paragraph that the program will always print 3 before printing 7? After all, it doesn't appear that print y uses the result of print x at all, so why can't we rearrange the calls? To answer that, we again need to unwrap some layers of abstraction. First, let's evaluate and inline x and y and get rid of the do-notation sugar. We end up with the program:

main = print 3 >> print 7

We know that print 3 and print 7 each have type IO (), so the >> operator being used comes from the Monad IO instance. Before we can understand that, though, we need to look at the definition of IO itself

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

We have a few things to understand about this line. Firstly, State# and RealWorld. For now, just pretend like they are a single type; we'll see when we get to ST why State# has a type parameter.

The other thing to understand is that (# ... #) syntax. That's an unboxed tuple, and it's a way of returning multiple values from a function. Unlike a normal, boxed tuple, unboxed tuples involve no extra allocation and create no thunks.

So IO takes a real world state, and gives you back a real world state and some value. And that right there is how we model side effects and mutation in a referentially transparent language. You may have heard the description of IO as "taking the world and giving you a new one back." What we're doing here is threading a specific state token through a series of function calls. By creating a dependency on the result of a previous function, we are able to ensure evaluation order, yet still remain purely functional.

Let's see this in action, by coming back to our example from above. We're now ready to look at the Monad IO instance:

instance  Monad IO  where
    (>>) = thenIO

thenIO :: IO a -> IO b -> IO b
thenIO (IO m) k = IO $ \ s -> case m s of (# new_s, _ #) -> unIO k new_s

unIO :: IO a -> (State# RealWorld -> (# State# RealWorld, a #))
unIO (IO a) = a

(Yes, I changed things a bit to make them easier to understand. As an exercise, check that this version is in fact equivalent to what is actually defined in GHC.Base.)

Let's inline these definitions into print 3 >> print 7:

main = IO $ \s0 ->
    case unIO (print 3) s0 of
        (# s1, res1 #) -> unIO (print 7) s1

Notice how, even though we ignore the result of print 3 (the res1 value), we still depend on the new state token s1 when we evaluate print 7, which forces the order of evaluation to first evaluate print 3 and then evaluate print 7.

If you look through GHC.Prim, you'll see that a number of primitive operations are defined in terms of State# RealWorld or State# s, which allows us to force evaluation order.

Exercise: implement a function getMaskingState :: IO Int using the getMaskingState# primop and the IO data constructor.
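
One possible solution sketch for this exercise, assuming getMaskingState# has its usual type State# RealWorld -> (# State# RealWorld, Int# #):

{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Base (IO (..))
import GHC.Exts (Int (I#))
import GHC.Prim (getMaskingState#)

-- Run the primop, then box the unboxed Int# result inside the IO wrapper.
getMaskingState :: IO Int
getMaskingState = IO $ \s0 ->
    case getMaskingState# s0 of
        (# s1, i #) -> (# s1, I# i #)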

The ST monad

Let's compare the definitions of the IO and ST types:

newtype IO   a = IO (State# RealWorld -> (# State# RealWorld, a #))
newtype ST s a = ST (State# s         -> (# State# s,         a #))

Well that looks oddly similar. Said more precisely, IO is isomorphic to ST RealWorld. ST works under the exact same principles as IO for threading state through, which is why we're able to have things like mutable references in the ST monad.

By using an uninstantiated s value, we can ensure that we aren't "cheating" and running arbitrary IO actions inside an ST action. Instead, we just have "local state" modifications, for some definition of local state. The details of using ST correctly and the Rank2Types approach to runST are interesting, but beyond the scope of this chapter, so we'll stop discussing them here.

Since ST RealWorld is isomorphic to IO, we should be able to convert between the two of them. base does in fact provide the stToIO function.

Exercise: write a pair of functions to convert between IO a and ST RealWorld a.
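
One way to do the exercise is simply to unwrap one constructor and wrap the other, since both newtypes hold the same shape of state-threading function (the primed name just avoids clashing with base's stToIO):

import Control.Monad.ST (RealWorld)
import GHC.Base         (IO (..))
import GHC.ST           (ST (..))

-- Both newtypes wrap a State#-threading function, so converting is re-wrapping.
ioToST :: IO a -> ST RealWorld a
ioToST (IO f) = ST f

stToIO' :: ST RealWorld a -> IO a
stToIO' (ST f) = IO f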

Exercise: GHC.Prim has a section on mutable variables, which forms the basis of IORef and STRef. Provide a new implementation of STRef, including newSTRef, readSTRef, and writeSTRef.
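
A sketch of one possible answer, closely mirroring what GHC.STRef itself does with the MutVar# primops:

{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Prim (MutVar#, newMutVar#, readMutVar#, writeMutVar#)
import GHC.ST   (ST (..))

-- An STRef is just a boxed wrapper around the unboxed mutable variable.
data STRef s a = STRef (MutVar# s a)

newSTRef :: a -> ST s (STRef s a)
newSTRef x = ST $ \s0 ->
    case newMutVar# x s0 of
        (# s1, var #) -> (# s1, STRef var #)

readSTRef :: STRef s a -> ST s a
readSTRef (STRef var) = ST (readMutVar# var)

writeSTRef :: STRef s a -> a -> ST s ()
writeSTRef (STRef var) x = ST $ \s0 ->
    case writeMutVar# var x s0 of
        s1 -> (# s1, () #)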

PrimMonad

It's a bit unfortunate that we have to have two completely separate sets of APIs: one for IO and another for ST. One common example of this is IORef and STRef, but, as we'll see at the end of this section, there are plenty of operations that we'd like to be able to generalize.

This is where PrimMonad, from the primitive package, comes into play. Let's look at its definition:

-- | Class of primitive state-transformer monads
class Monad m => PrimMonad m where
  -- | State token type
  type PrimState m

  -- | Execute a primitive operation
  primitive :: (State# (PrimState m) -> (# State# (PrimState m), a #)) -> m a

Note: I have not included the internal method, since it will likely be removed. In fact, at the time you're reading this, it may already be gone!

PrimState is an associated type giving the type of the state token. For IO, that's RealWorld, and for ST s, it's s. primitive gives a way to lift the internal implementation of both IO and ST to the monad under question.

Exercise: Write implementations of the PrimMonad IO and PrimMonad (ST s) instances, and compare against the real ones.
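
Here is a sketch of what the instances can look like; the class is re-declared locally (so the instances don't clash with the real ones if you import Control.Monad.Primitive), and the internal method is omitted as above:

{-# LANGUAGE TypeFamilies, MagicHash, UnboxedTuples #-}

import GHC.Base (IO (..))
import GHC.Prim (RealWorld, State#)
import GHC.ST   (ST (..))

-- Local copy of the class as shown above.
class Monad m => PrimMonad m where
    type PrimState m
    primitive :: (State# (PrimState m) -> (# State# (PrimState m), a #)) -> m a

-- Lifting a primitive operation is just applying the data constructor.
instance PrimMonad IO where
    type PrimState IO = RealWorld
    primitive = IO

instance PrimMonad (ST s) where
    type PrimState (ST s) = s
    primitive = ST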

The primitive package provides a number of wrappers around types and functions from GHC.Prim and generalizes them to both IO and ST via the PrimMonad type class.

Exercise: Extend your previous STRef implementation to work in any PrimMonad. After you're done, you may want to have a look at Data.Primitive.MutVar.

The vector package builds on top of the primitive package to provide mutable vectors that can be used from both IO and ST. This chapter is not a tutorial on the vector package, so we won't go into any more details now. However, if you're curious, please look through the Data.Vector.Generic.Mutable docs.

ReaderIO monad

To tie this off, we're going to implement a ReaderIO type. This will flatten together the implementations of ReaderT and IO. Generally speaking, there's no advantage to doing this: GHC should always be smart enough to generate the same code for this and for ReaderT r IO (and in my benchmarks, they perform identically). But it's a good way to test that you understand the details here.

You may want to try implementing this yourself before looking at the implementation below.

{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MagicHash             #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE UnboxedTuples         #-}
import Control.Applicative        (Applicative (..))
import Control.Monad              (ap, liftM)
import Control.Monad.IO.Class     (MonadIO (..))
import Control.Monad.Primitive    (PrimMonad (..))
import Control.Monad.Reader.Class (MonadReader (..))
import GHC.Base                   (IO (..))
import GHC.Prim                   (RealWorld, State#)

-- | Behaves like a @ReaderT r IO a@.
newtype ReaderIO r a = ReaderIO
    (r -> State# RealWorld -> (# State# RealWorld, a #))

-- standard implementations...
instance Functor (ReaderIO r) where
    fmap = liftM
instance Applicative (ReaderIO r) where
    pure = return
    (<*>) = ap

instance Monad (ReaderIO r) where
    return x = ReaderIO $ \_ s -> (# s, x #)
    ReaderIO f >>= g = ReaderIO $ \r s0 ->
        case f r s0 of
            (# s1, x #) ->
                let ReaderIO g' = g x
                 in g' r s1

instance MonadReader r (ReaderIO r) where
    ask = ReaderIO $ \r s -> (# s, r #)
    local f (ReaderIO m) = ReaderIO $ \r s -> m (f r) s

instance MonadIO (ReaderIO r) where
    liftIO (IO f) = ReaderIO $ \_ s -> f s

instance PrimMonad (ReaderIO r) where
    type PrimState (ReaderIO r) = RealWorld

    primitive f = ReaderIO $ \_ s -> f s

    -- Cannot properly define internal, since there's no way to express a
    -- computation that requires an @r@ input value as one that doesn't. This
    -- limitation of @PrimMonad@ is being addressed:
    --
    -- https://github.com/haskell/primitive/pull/19
    internal (ReaderIO f) =
        f (error "PrimMonad.internal: environment evaluated")

Exercise: Modify the ReaderIO monad to instead be a ReaderST monad, and take an s parameter for the specific state token.

Further reading

February 17, 2015 10:00 AM

Danny Gratzer

A Twelf Introduction

Posted on February 17, 2015
Tags: twelf, types

For the last 3 or so weeks I’ve been writing a bunch of Twelf code for my research (hence my flat-lined github punch card). Since it’s actually a lot of fun I thought I’d share a bit about Twelf.

What Is Twelf

Since Twelf isn’t a terribly well known language it’s worth stating what exactly it is we’re talking about. Twelf is a proof assistant. It’s based on a logic called LF (similarly to how Coq is based on CiC).

Twelf is less powerful than some other proof assistants, but by limiting some of its power it’s wonderfully suited to proving certain types of theorems. In particular, Twelf admits true “higher order abstract syntax” (don’t worry if you don’t know what this means), which makes it great for formalizing programming languages with variable bindings.

In short, Twelf is a proof assistant which is very well suited for defining and proving things about programming languages.

Getting Twelf

It’s much more fun to follow along a tutorial if you actually have a Twelf installation to try out the code. You can download and compile the sources to Twelf with SML/NJ or Mlton. You could also use smackage to get the compiler.

Once you’ve compiled the thing you should be left with a binary twelf-server. This is your primary way of interacting with the Twelf system. There’s quite a slick Emacs interface to smooth over this process. If you’ve installed twelf into a directory ~/twelf/ all you need is the incantation

    (setq twelf-root "~/twelf/")
    (load (concat twelf-root "emacs/twelf-init.el"))

Without further ado, let’s look at some Twelf code.

Some Code

When writing Twelf code we encode the thing that we’re studying, the object language, as a bunch of type families and constructors in Twelf. This means that when we edit a Twelf file we’re just writing signatures.

For example, if we want to encode natural numbers we’d write something like

    nat : type.
    z   : nat.
    s   : nat -> nat.

This is an LF signature, we declare a series of constants with NAME : TYPE.. Note the period at the end of each declaration. First we start by declaring a type for natural numbers called nat with nat : type. Here type is the base kind of all types in Twelf. Next we go on to declare what the values of type nat are.

In this case there are two constructors for nat. We either have zero, z, or the successor of another value of type nat, s. This gives us a canonical forms lemma for natural numbers: All values of type nat are either

  • z
  • s N for some value N : nat

Later on, we’ll justify the proofs we write with this lemma.

Anyways, now that we’ve encoded the natural numbers I wanted to point out a common point of confusion about Twelf. We’re not writing programs to be run. We’re writing programs exclusively for the purpose of typechecking. Heck, we’re not even writing programs at the term level! We’re just writing a bunch of constants out with their types! More than this even, Twelf is defined so that you can only write canonical forms. This means that if you write something in your program, it has to be in normal form, fully applied! In PL speak it has to be β-normal and η-long. This precludes actually writing programs for the sake of reducing them. You’re never going to write a web server in Twelf; you won’t even be writing “Hello World”. You might use it to verify the language you’re writing them in, though.

Now that we’ve gotten the awkward bit out the way, let’s now define a Twelf encoding of a judgment. We want to encode the judgment + which is given by the following rules

—————————
z + n = n

   m + n = p
———————————————
s(m) + n = s(p)

In the rest of the world we have this idea that propositions are types. In twelf, we’re worried about defining logics and systems, so we have the metatheoretic equivalent: judgments are types.

So we define a type family plus.

    plus : nat -> nat -> nat -> type.

So plus is a type indexed over 3 natural numbers. This is our first example of dependent types: plus is a type which depends on 3 terms. Now we can list out how to construct a derivation of plus. This means that inference rules in a meta theory correspond to constants in Twelf as well.

    plus/z : {n : nat} plus z n n

This is some new syntax, in Twelf {NAME : TYPE} TYPE is a dependent function type, a pi type. This notation is awfully similar to Agda and Idris if you’re familiar with them. This means that this constructor takes a natural number, n and returns a derivation that plus z n n. The fact that the return type depends on what nat we supply is why this needs a dependent type.

In fact, this is such a common pattern that Twelf has sugar for it. If we write an unbound capital variable name Twelf will automagically introduce a binder {N : ...} at the front of our type. We can thus write our inference rules as

    plus/z : plus z N N
    plus/s : plus N M P -> plus (s N) M (s P)

These rules, together with our declaration of plus, complete the definition of addition. In fact, there’s something kinda special about these two rules. We know that for any term n : nat which is in canonical form, there should be an applicable rule. In Twelf speak, we say that this type family is total.

We can ask Twelf to check this fact for us by saying

    plus : nat -> nat -> nat -> type.
    %mode plus +N +M -P.

    plus/z : plus z N N.
    plus/s : plus N M P -> plus (s N) M (s P).

    %worlds () (plus _ _ _).
    %total (N) (plus N _ _).

We want to show that for all terms n, m : nat in canonical form, there is a term p in canonical form so that plus n m p. This sort of theorem is what we’d call a ∀∃-theorem. This is literally because it’s a theorem of the form “∀ something. ∃ something. so that something”. These are the sort of thing that Twelf can help us prove.
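
As an aside for readers who think in Haskell, the plus relation above has a rough analogue as a GADT indexed by promoted naturals (a sketch with names of my own choosing); what this cannot express is Twelf’s %mode and %total checking, which is exactly the part that turns the encoding into a proof:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    data Nat = Z | S Nat

    -- A derivation of Plus n m p witnesses that n + m = p, mirroring plus/z
    -- and plus/s above. Nothing here checks totality for us, though.
    data Plus :: Nat -> Nat -> Nat -> * where
        PlusZ :: Plus 'Z n n
        PlusS :: Plus n m p -> Plus ('S n) m ('S p)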

Here’s the workflow for writing one of these proofs in Twelf

  1. Write out the type family
  2. Write out a %mode specification to say what is bound in the ∀ and what is bound in the ∃.
  3. Write out the different constants in our type family
  4. Specify the context to check our proof in with %worlds, usually we want to say the empty context, ()
  5. Ask Twelf to check that we’ve created a proof according to the mode with %total where the N specifies what to induct on.

In our case we have a case for each canonical form of nat so our type family is total. This means that our theorem passes. Hurray!

Believe it or not this is what life is like in Twelf land. All the code I’ve written these last couple of weeks is literally type signatures and 5 occurrences of %total. What’s kind of fun is how unreasonably effective a system this is for proving things.

Let’s wrap things up by proving one last theorem, if plus A B N and plus A B M both have derivations, then we should be able to show that M and N are the same. Let’s start by defining what it means for two natural numbers to be the same.

    nat-eq : nat -> nat -> type.
    nat-eq/r : nat-eq N N.
    nat-eq/s : nat-eq N M -> nat-eq (s N) (s M).

I’ve purposefully defined this so it’s amenable to our proof, but it’s still a believable formulation of equality. It’s reflexive and if N is equal to M, then s N is equal to s M. Now we can actually state our proof.

    plus-fun : plus N M P -> plus N M P' -> nat-eq P P' -> type.
    %mode plus-fun +A +B -C.

Our theorem says if you give us two derivations of plus with the same arguments, we can prove that the outputs are equal. There are two cases we have to cover for our induction so there are two constructors for this type family.

    plus-fun/z : plus-fun plus/z plus/z nat-eq/r.
    plus-fun/s : plus-fun (plus/s L) (plus/s R) (nat-eq/s E)
                  <- plus-fun L R E.

A bit of syntactic sugar here, I used the backwards arrow which is identical to the normal -> except its arguments are flipped. Finally, we ask Twelf to check that we’ve actually proven something here.

    %worlds () (plus-fun _ _ _).
    %total (P) (plus-fun P _ _).

And there you have it, some actual theorem we’ve mechanically checked using Twelf.

Wrap Up

I wanted to keep this short, so now that we’ve covered Twelf basics I’ll just refer you to one of the more extensive tutorials. You may be interested in

If you’re interested in learning a bit more about the nice mathematical foundations for LF you should check out “The LF Paper”.


February 17, 2015 12:00 AM

February 16, 2015

Functional Jobs

Engineering Manager (OCaml or Haskell under Linux) at FireEye (Full-time)

Position Title: Engineering Manager

Location: Dresden, Germany

The Company

FireEye has invented a purpose-built, virtual machine-based security platform that provides real-time threat protection to enterprises and governments worldwide against the next generation of cyber attacks. These highly sophisticated cyber attacks easily circumvent traditional signature-based defenses, such as next-generation firewalls, IPS, anti-virus, and gateways. The FireEye Threat Prevention Platform provides real-time, dynamic threat protection without the use of signatures to protect an organization across the primary threat vectors and across the different stages of an attack life cycle. The core of the FireEye platform is a virtual execution engine, complemented by dynamic threat intelligence, to identify and block cyber attacks in real time. FireEye has over 3,100 customers across 67 countries, including over 200 of the Fortune 500.

Job Description:

In Dresden, Germany, an outstanding team of formal methods engineers uses formal methods tools, such as proof assistants, to develop correctness proofs for FireEye's leading edge products. In real world applications of formal methods tools, automation is often not sufficient for the specific problems at hand. Therefore, we are seeking outstanding software developers with a passion for implementing both well-designed and ad-hoc formal methods software tools for proof refactoring, proof search, systematic testing and other areas.

Responsibilities:

  • Design and development of software tools, scripts and automated processes for various tasks, supporting formal method tools
  • Refactoring, redesigning and rewriting earlier ad-hoc solutions in a systematic way
  • Maintaining existing tools and scripts
  • Continuous focus and contribution in the areas of performance, reliability, scalability, and maintainability of the product
  • Active participation in our ongoing process enhancements in software development and verification practices

Desired Skills & Experience

  • BS or MS in Computer Science or equivalent experience
  • Strong programming skills in OCaml or Haskell under Linux
  • Good skills in scripting languages, such as bash, perl or python
  • Good knowledge of logic and formal logical reasoning
  • Experience with parallel programs running on multiple machines and cores
  • Interest in acquiring basic knowledge about proof assistants for higher-order logic
  • Excellent interpersonal and teamwork skills
  • Strong problem solving, troubleshooting and analysis skills
  • Excellent oral and written communication skills

Get information on how to apply for this position.

February 16, 2015 05:29 PM

February 15, 2015

Ketil Malde

Thoughts on phylogenetic trees

An important aspect of studying evolution is the construction of phylogenetic trees, graphically representing the relationship between current and historic species. These trees are usually calculated based on similarities and differences between the genetic material of current species, and one particular challenge is that the topology of the resulting trees depends on the selection of genes used to construct them. Quite often, the species tree based on one set of genes differs substantially from the tree based on another set of genes.

The phylogenetic tree is usually presented as a simple tree of species. The end points of branches at the bottom of the tree (leaves) represent current species, and branching points higher up (internal nodes) represent the most recent common ancestor, or MRCA, for the species below it.

A very simple example could look something like this:

Evolution as a tree of species

Here you have two current species, and you can trace back their lineage to a MRCA, and further back to some ancient origin. Varying colors indicate that gradual change along the branches has introduced differences, and that the current species now have diverged from each other, and their origin.

This representation has the advantage of being nice and simple, and the disadvantage of being misleading. For instance, one might get the impression that a species is a well-defined concept, and ask questions like: when the first member of a species diverged from its ancestor species, how did it find a mate?

Species are populations

But we are talking about species here - that is, not individuals but populations of individuals. So a more accurate representation might look like this:

Evolution as a tree of populations

Circles now represent individuals, and it should perhaps be clearer that there is no such thing as the “first” of anything. At the separation point, there is no difference between the two populations, and it is only after a long period of separation that differences can arise. (Of course, if there is selective pressure favoring specific properties - perhaps redness is very disadvantageous for species B, for instance - this change will be much quicker. Look at how quickly we have gotten very different breeds of dogs by keeping populations artificially separate, and selecting for specific properties.)

The so-called “speciation” is nothing more than a population being split into two separate parts. Typically, the split is geographical - a few animals being carried to Madagascar on pieces of driftwood, say - but anything that prevents members of one branch from mating with members of the other one will suffice. At the outset, the two branches are just indistinguishable subpopulations of the same species, but if the process goes on long enough, differences between the two populations can become large enough that they can no longer interbreed, and we can consider them different species.

In practice, such a separation is often not complete, some individuals can stray between the groups. In that case, speciation is less likely to happen, since the property of being unable to breed with the other group represents a reproductive disadvantage, and it would therefore be selected against. In other words, if your neighbor is able to have children with more members of the population than you, his chances of having children are better than yours. Your genes get the short end of the stick. Kind of obvious, no?

Populations consist of genes

But we can also view evolution as working on populations, not of individuals, but of individual genes. This is illustrated in the next picture:

Evolution as a tree of gene populations

The colored circles now represent genes, and an individual is here just a sample from the population of genes - illustrated by circling three gene circles. (Note that by “gene”, we here mean an abstract unit of inheritance. In other fields of biology, the word might be synonymous with a genetic region that codes for a protein, or is transcribed to (possibly non-coding) RNA.)

Here, we see that although the genes themselves do not change (in reality they are subject to mutations), the availability of the different genes vary over time, and some might disappear from one of the branches entirely - like red genes from species B here. This kind of genetic drift can still cause distinct changes in individuals.

Ancestry of individual genes

Each individual typically gets half its genes from each parent, one fourth from each grandparent, and so on, so after a few generations, all genes come from essentially different ancestors. This means you can calculate the MRCA for each gene individually, and this is exactly what has been done to estimate the ages of our “mitochondrial Eve” and “Y-chromosomal Adam”. Here is the example lineage for the green gene:

Lineage and MRCA for the green gene

We see that the green-gene MRCA is much older than the speciation event. In addition, each gene has its unique history. This means that when we try to compute the MRCA, different genes will give different answers, and it can be difficult to construct a sensible consensus.

For example, bits of our genome appear to come from the Neanderthal, and those bits will have a MRCA that predates the time point where Neanderthal branched from H. sapiens (possibly 1-2 million years ago). (Interestingly, “Adam” and “Eve” are both estimated to have lived in the neighborhood of 200000 years ago. This means that although 20% of the Neanderthal genome is claimed to have survived, all Neanderthal mitochondria and Y-chromosomes have been eradicated.)

February 15, 2015 08:00 PM

February 13, 2015

Yesod Web Framework

The awesome Haskell community

Recently the Haskell community has been engaged in a very intense discussion around potential changes to the Prelude (aka "burning bridges" or "FTP"). Here's the most recent incarnation of the discussion for context. The changes under discussion are non-trivial, and many people are putting in a huge amount of energy to try and make Haskell the best it can be. And to be clear: I'm talking about people arguing on both sides of this discussion, and people trying to moderate it. As someone who's mostly been sitting on the sidelines in this one, I want to start by expressing a big thank you to everyone working on this.

(If anyone's wondering why I'm mostly sitting this one out, it's because I don't feel very strongly about it either way. I think there are great arguments going both ways, and over the past week I've fluctuated between being -0.2 on the proposal and being +0.2.)

When a big discussion like this happens, it's easy for people to misinterpret it as something unhealthy. I'm here to remind everyone that, in fact, the opposite is true: what we're seeing now is the sign of an incredibly healthy community, based on an amazing language, that is undertaking extraordinary things. And there are of course some warts being revealed too, but those are relatively minor, and I believe we will be able to overcome them.

So to begin my cheerleading:

The fact that we can even consider doing this is amazing. I don't think very many languages could sustain a significant rewrite of their most core library. Let's just keep things in perspective here: even the worst case scenario damage from this change involves updating some documentation and modifying a relatively small amount of code in such a way that will be backwards compatible with old versions of the library. This is a true testament not only to the power of the Haskell language, but to the thoughtfulness with which this proposal was made.

The discussion has been incredibly civil. This topic had all the makings for an internet flame war: strongly held opinions, good arguments on both sides, lots of time and effort involved, and Reddit. I am happy to say that I have not seen a single personal attack in the entire discussion. Almost every piece of discourse has been beyond reproach, and the few times where things have gotten close to crossing the line, people on both sides have politely expressed that sentiment, leading to the original content being removed.

To some extent, I think we're all a bit spoiled by how good the civility in the Haskell world is, and we should take a moment to appreciate it. That's not to say we should ever expect any less, but we should feel comfortable patting ourselves on the back a bit.

We're dynamically coming up with new, better processes. When opinions are so strongly divided, it's difficult to make any kind of progress. As a community, we're adapting quickly and learning how to overcome that. As you can see in the thread I linked above, we now have a clear path forward: a feedback form that will be processed by Simon PJ and Simon Marlow, who will make the ultimate decision. This process is clear, and we couldn't be more fortunate to have such great and well respected leaders in our community.

Nothing else has stopped. If you look at issue trackers, commit logs, and mailing list discussions, you can see that while many members of the community are actively participating in this discussion, nothing else has ground to a halt. We're a dynamic community with many things going on, so the ability to digest a major issue while still moving forward elsewhere is vital.


That said, I think there are still some areas for improvement. The biggest one lies with the core libraries committee, of which I'm a member. We need to learn how to be better at communicating with the community about these kinds of large scale changes. I'm taking responsibility for that problem, so if you don't see improvements on that front in the next few weeks, you can blame me.

More generally, I think there are some process and communications improvements that can be made at various places in the community. I know that's an incredibly vague statement, but that's all I have for the moment. I intend to follow up in the coming weeks with more concrete points and advice on how to improve things.

In sum: Haskell's an amazing language, which has attracted an amazing community. This discussion doesn't detract from that statement, but rather emphasizes it. Like any group, we can still learn to do a few things better, but we've demonstrated time and again (including right now!) that we have the strength to learn and improve, and I'm certain we'll do so again.

I'm proud to be part of this community, and everyone else should be as well.

February 13, 2015 07:20 AM

Functional Jobs

Senior Developer (Big Data Tech Stack) at Recruit IT Group Ltd (Full-time)

Recruit IT are looking for a Senior Developer to join a bleeding edge Big Data organisation in the New Zealand market. You will bring a proven background in big data systems, business intelligence, and/or data warehouse technologies along with a preference for functional programming and open source solutions.

You will be playing an integral part in ensuring the growth and performance of a state of the art Big Data platform. To do this you will need to have a good understanding of the importance of analytics and a variety of Big Data technologies.

Your experience to date will include:

  • Experience with big data, business intelligence, and data warehouse applications. This will include: Hadoop, Hive, Spring, Pivotal HD, Cloud Foundry, HAWQ, GreenPlum, MongoDB, Cassandra, Hortonworks, Cloudera, MapReduce, Flume, or Sqoop, to name a few!

  • Ideally, functional programming experience, including Scala, Haskell, Lisp, Python, etc.

  • Passion for bleeding edge tech is a MUST!

If you are interested in finding out more, apply online to Kaleb at Recruit IT with your CV and an overview of your current situation.

Get information on how to apply for this position.

February 13, 2015 01:39 AM

February 12, 2015

Roman Cheplyaka

Foldable, Traversable, and parametricity

The Foldable-Traversable proposal (aka FTP) has spawned a lot of debate in the Haskell community.

Here I want to analyze the specific concern which Ben Moseley raised in his post, FTP dangers.

Ben’s general point is that more polymorphic (specifically, ad-hoc polymorphic, i.e. using type classes) functions are less readable and reliable than their monomorphic counterparts.

On the other hand, Tony Morris and Chris Allen argue on twitter that polymorphic functions are more readable due to parametricity.

Is that true, however? Are the ad-hoc generalized functions more parametric than the monomorphic versions?

Technically, the Functor-based type is more parametric. A function with type (a -> b) -> [a] -> [b] is something like map, except it may drop or duplicate some elements. On the other hand, Functor f => (a -> b) -> f a -> f b has to be fmap.
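To make that concrete, here is a small illustration (my own, not from the post): two functions can share map’s monomorphic type yet behave very differently, while anything with fmap’s polymorphic type is pinned down by parametricity.

-- Both of these have map's type, but only the first is map:
mapLike1 :: (a -> b) -> [a] -> [b]
mapLike1 = map

mapLike2 :: (a -> b) -> [a] -> [b]
mapLike2 f = map f . take 1   -- drops elements, yet type-checks fine

-- Any total function with this type must be fmap:
fmapLike :: Functor f => (a -> b) -> f a -> f b
fmapLike = fmap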

But this is a trick question! The first thing we see in the code is the function’s name, not its type. What carries more information, map or fmap? (Assuming both come from the current Prelude.) Certainly map. When fmap is instantiated at the list type, it is nothing more than map. When we see fmap, we know that it may or may not be map. When we see map, we know it is map and nothing else.

The paradox is that there are more functions with map’s type than fmap’s, but there are more functions with fmap’s name than map’s. Even though fmap is more parametric, that doesn’t win us much.

Nevertheless, is there a benefit in using more parametric functions in your code? No. If it were true, we’d all be pumping our code with «parametricity» by writing id 3 instead of 3. You can’t get more parametric than id.

Merely using parametric functions doesn’t make code better. Parametricity may pay off when we’re defining polymorphic parametric functions in our code instead of their monomorphic instantiations, since parametric types are more constrained and we’re more likely to get a compile error should we do anything stupid.

(It won’t necessarily pay off; the type variables and constraints do impose a tax on types’ readability.)

But if you have an existing, monomorphic piece of code that works with lists, simply replacing Data.List functions with Data.Foldable ones inside it, ceteris paribus, will not make your code any safer or more readable.

February 12, 2015 08:00 PM

Leon P Smith

Announcing blaze-builder-0.4

After a recent chat with Simon Meier, we decided that I would take over the maintenance of the exceedingly popular blaze-builder package.

Of course, this package has been largely superseded by the new builder shipped inside bytestring itself. The point of this new release is to offer a smooth migration path from the old to the new.

If you have a package that only uses the public interface of the old blaze-builder, all you should have to do is compile it against blaze-builder-0.4 and you will in fact be using the new builder. If your program fails to compile against the old public interface, or there’s any change in the semantics of your program, then please file a bug against my blaze-builder repository.

If you are looking for a function to convert Blaze.ByteString.Builder.Builder to Data.ByteString.Builder.Builder or back, it is id. These two types are exactly the same, as the former is just a re-export of the latter. Thus inter-operation between code that uses the old interface and the new should be efficient and painless.
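A tiny sketch of what that means in practice (the module names are the real ones; the helper names below are just for illustration):

import qualified Blaze.ByteString.Builder as Old
import qualified Data.ByteString.Builder  as New

-- With blaze-builder >= 0.4 the two Builder types coincide,
-- so both "conversions" are literally id.
toNew :: Old.Builder -> New.Builder
toNew = id

toOld :: New.Builder -> Old.Builder
toOld = id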

The one caveat is that the old implementation has all but disappeared, and programs and libraries that touch the old internal modules will need to be updated.

This compatibility shim is especially important for those libraries that have the old blaze-builder as part of their public interface, as now you can move to the new builder without breaking your interface.

There are a few things to consider in order to make this transition as painless as possible, however: libraries that touch the old internals should probably move to the new bytestring builder as soon as possible, while libraries that depend only on the public interface should probably hold off for a bit and continue to use this shim.

For example, blaze-builder is part of the public interface of both the Snap Framework and postgresql-simple. Snap touches the old internals, while postgresql-simple uses only the public interface. Both libraries are commonly used together in the same projects.

There would be some benefit to postgresql-simple to move to the new interface. However, let’s consider the hypothetical situation where postgresql-simple has transitioned, and Snap has not. This would cause problems for any project that 1.) depends on this compatibility shim for interacting with postgresql-simple, and 2.) uses Snap.

Any such project would have to put off upgrading postgresql-simple until Snap is updated, or interact with postgresql-simple through the new bytestring builder interface and continue to use the old blaze-builder interface for Snap. The latter option could be anywhere from trivial to extremely painful, depending on how entangled the usage of Builders is between postgresql-simple and Snap.

By comparison, as long as postgresql-simple continues to use the public blaze-builder interface, it can easily use either the old or new implementation. If postgresql-simple holds off until after Snap makes the transition, then there’s little opportunity for these sorts of problems to arise.


by lpsmith at February 12, 2015 06:42 PM

Announcing snaplet-postgresql-simple-0.6

In the past, I’ve said some negative things1 about Doug Beardsley’s snaplet-postgresql-simple, and in this long overdue post, I retract my criticism.

The issue was that a connection from the pool wasn’t reserved for the duration of the transaction. This meant that the individual queries of a transaction could be issued on different connections, and that queries from other requests could be issued on the connection that’s in a transaction. Setting the maximum size of the pool to a single connection fixes the first problem, but not the second.

At Hac Phi 2014, Doug and I finally sat down and got serious about fixing this issue. The fix did require breaking the interface in a fairly minimal fashion. Snaplet-postgresql-simple now offers the withPG and liftPG operators that will exclusively reserve a single connection for a duration, and in turn uses withPG to implement withTransaction.
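As a rough illustration of the fixed interface (a sketch only: the App type, its snaplet wiring, and the table names are hypothetical, and the exact constraints on these functions may differ from what is shown), wrapping a handler body in withTransaction now keeps every statement on one reserved connection and commits them atomically:

{-# LANGUAGE OverloadedStrings #-}
import Snap.Snaplet (Handler)
import Snap.Snaplet.PostgresqlSimple (withTransaction, execute)
import Database.PostgreSQL.Simple (Only(..))

-- Both inserts succeed or fail together, on a single reserved connection.
placeOrder :: Handler App App ()
placeOrder = withTransaction $ do
  _ <- execute "INSERT INTO orders (customer_id) VALUES (?)" (Only (42 :: Int))
  _ <- execute "INSERT INTO order_items (order_id, sku) VALUES (?, ?)"
               (1 :: Int, "ABC-123" :: String)
  return ()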

We were both amused by the fact that apparently a fair number of people have been using snaplet-postgresql-simple, even with transactions in some cases, without obviously noticing the issue. One could speculate about the reasons why, but Doug did mention that he pretty much never uses transactions. So in response, I came up with a list of five common use cases: the first three involve changing the database, and the last two are useful even in a read-only context.

  1. All-or-nothing changes

    Transactions allow one to make a group of logically connected changes so that either they are all reflected in the resulting state of the database, or none of them are. So if anything fails before the commit, say due to a coding error or even something outside the control of the software, the database isn’t polluted with partially applied changes.

  2. Bulk inserts

    Databases that provide durability, like PostgreSQL, are limited in the number of transactions per second by the rotational speed of the disk they are writing to. Thus individual DML statements are rather slow, as each PostgreSQL statement that isn’t run in an explicit transaction is run in its own individual, implicit transaction. Batching multiple insert statements into a single transaction is much faster.

    This use case is relatively less important when writing to a solid state disk, which is becoming increasingly common. Alternatively, postgresql allows a client program to turn synchronous_commit off for the connection or even just a single transaction, if sacrificing a small amount of durability is acceptable for the task at hand.

  3. Avoiding Race Conditions

    Transactional databases, like Software Transactional Memory, do not automatically eliminate all race conditions, they only provide a toolbox for avoiding and managing them. Transactions are the primary tool in both toolboxes, though there are considerable differences around the edges.

  4. Using Cursors

    Cursors are one of several methods to stream data out of PostgreSQL, and you’ll almost always want to use them inside a single transaction.2 One advantage that cursors have over the other streaming methods is that one can interleave the cursor with other queries, updates, and cursors over the same connection, and within the same transaction.

  5. Running multiple queries against a single snapshot

    If you use the REPEATABLE READ or higher isolation level, then every query in the transaction will be executed on a single snapshot of the database.

So I no longer have any reservations about using snaplet-postgresql-simple if it is a good fit for your application, and I do recommend that you learn to use transactions effectively if you are using Postgres. Perhaps in a future post, I’ll write a bit about picking an isolation level for your postgres transactions.


  1. See for example, some of my comments in the github issue thread on this topic, and the reddit thread which is referenced in the issue.

  2. There is the WITH HOLD option for keeping a cursor open after a transaction commits, but this just runs the cursor to completion, storing the data in a temporary table. Which might occasionally be acceptable in some contexts, but is definitely not streaming.


by lpsmith at February 12, 2015 05:16 PM

February 11, 2015

Ben Moseley

FTP dangers

I am concerned about the Foldable Traverable Proposal (FTP) (https://ghc.haskell.org/trac/ghc/wiki/Prelude710) :

My first and biggest concern is simply that it's harder to read code which uses highly polymorphic functions unnecessarily.

You can see that even by considering plain-old fmap vs map: it's harder to read "fmap f . fmap g" than "map f . map g" because with the former you're having to manually search for more information about what Functor is being used in each case.

My second concern is that the more you overload, the more you risk having something unexpected happen:

Say we have two variables:
*Borders.Base.Utils> let a' = ["alice", "bob"]
*Borders.Base.Utils> let a  = (True, a')

we want to count the characters so we type:
*Borders.Base.Utils> length $ concat a

..but we've accidentally forgotten the prime character...

... right now we get:
<interactive>:6:17:
   Couldn't match expected type ‘[[a0]]’
               with actual type ‘(Bool, [[Char]])’
   In the first argument of ‘concat’, namely ‘a’
   In the second argument of ‘($)’, namely ‘concat a’

so we fix it and get our desired result:

*Borders.Base.Utils> length $ concat a'
8

...but under the FTP proposals (where concat would become Data.Foldable.concat) we get:

*Borders.Base.Utils> length $ Data.Foldable.concat a
2

(because pairs are Foldable in their second argument).
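Concretely, the instance in base looks roughly like this (reproduced from memory), which is why the Bool half of the pair silently vanishes and length reports 2:

instance Foldable ((,) a) where
  foldMap f (_, y) = f y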

This cannot be a good thing.

I believe that the more generalised functions of FTP should be opt-in (i.e. you should - as at present - need to import them explicitly).

by Ben ([email protected]) at February 11, 2015 11:55 AM

Functional Jobs

Senior Development Engineer at Lookingglass Cyber Solutions (Full-time)

Lookingglass is seeking a qualified Senior Development Engineer to join our team!

Are you an experienced senior software engineer in security, networking, cloud and big data? Are you interested in cyber security or improving the security of the Internet? Do you push yourself to be creative and innovative and expect the same of others?

At Lookingglass, we are driven and passionate about what we do. We believe that teams, not individuals, deliver great products. We inspire each other and our customers every day with technology that improves the security of the Internet and of our customers. Behind our success is a team that thrives on collaboration and creativity, delivering meaningful impact to our customers.

Skills & Requirements

Required:

  • US Citizen or Green Card Holder
  • Location: MD/VA or CA based
  • Bachelor’s or Masters degree in: computer science, engineering, information systems or mathematics
  • 5-8 yrs experience with the full development life-cycle, with shipped products
  • 4-8 yrs experience with functional (particularly Clojure) and OO languages – have worked in functional paradigms with immutable data models
  • 5+ yrs building distributed system products including messaging, NoSQL, RPC / RMI mechanisms – including building Service Oriented Architectures
  • Proficiency with data structure and algorithm analysis
  • Experience working in a TDD environment
  • Comfortable with aggressive refactoring
  • Architectural and design skills to map a solution across hardware, software, and business layers in a distributed architecture

Nice to Have:

  • Product development experience in network security, content security or cybersecurity intelligence
  • Experience with CSP concurrency models
  • Experience with key-value distributed databases
  • Experience deploying production software in Haskell, OCAML, or Lisp
  • Comfortable writing one or more of the following Javascript, GoLang, Ruby

Get information on how to apply for this position.

February 11, 2015 02:21 AM

Danny Gratzer

Notes on Proof Theory: Part 1

Posted on February 11, 2015
Tags: types

I write a lot about types. Up until now however, I’ve only made passing references to the thing I’ve actually been studying in most of my free time lately: proof theory. Now I have a good reason for this: the proof theory I’m interested in is undeniably intertwined with type theory and computer science as a whole. In fact, you occasionally see someone draw the triangle

           Type Theory
          /           \
         /             \
 Proof Theory ---- Category Theory

Which nicely summarizes the lay of the land in the world I’m interested in. People will often pick up something well understood on one corner of the triangle and drag it off to another, producing a flurry of new ideas and papers. It’s all very exciting and leads to really cool stuff. I think the most talked about example lately is homotopy type theory, which takes a mathematical structure (weak infinite groupoids) and hoists it off to type theory!

If you read the [unprofessional, mostly incorrect, and entirely more fun to read] blog posts on these subjects you’ll find most of the lip service is paid to category theory and type theory with poor proof theory shunted off to the side.

In this post, I’d like to jot down my notes on Frank Pfenning’s introduction to proof theory materials to change that in some small way.

What is Proof Theory

The obvious question is just “What is proof theory?”. The answer is that proof theory is the study of proofs. In this world we study proofs as first class mathematical objects which we prove interesting things about. This is the branch of math that formalizes our handwavy notion of a proof into a precise object governed by rules.

We can then prove things like “Given a derivation of Γ ⊢ A and another derivation of Γ, A ⊢ B, we can produce a derivation of Γ ⊢ B.” Such a theorem is utterly crazy unless we can formalize what it means to derive something.

From this we grow beautiful little sets of rules and construct derivations with them. Later, we can drag these derivations off to type theory and use them to model all sorts of wonderful phenomena. My most recent personal example was when folks noticed that the rules for modal logic perfectly capture what the semantics of static pointers ought to be.

So in short, proof theory is devoted to answering that question that every single one of your math classes dodged

Professor, what exactly is a proof?

Basic Building Blocks

In every logic that we’ll study we’ll keep circling back to two core objects: judgments and propositions. The best explanation of judgments I’ve read comes from Frank Pfenning

A judgment is something we may know, that is, an object of knowledge. A judgment is evident if we in fact know it.

So judgments are the things we’ll structure our logic around. You’ve definitely heard of one judgment: A true. This judgment signifies whether or not some proposition A is true. Judgments can be much fancier though: we might have a whole bunch of judgments like n even, A possible or A resource.

These judgments act across various syntactic objects. In particular, from our point of view we’ll understand the meaning of a proposition by the ways we can prove it, that is, by the proofs that make A true evident.

We prove a judgment J through inference rules. An inference rule takes the form

J₁ J₂ .... Jₓ
—————————————
     J

Which should be read as “When J₁, J₂ … and Jₓ hold, then so does J”. Here the things above the line are premises and the ones below are conclusions. What we’ll do is define a bunch of these inference rules and use them to construct proofs of judgments. For example, we might have the inference rules

             n even
 ——————    ————————————
 0 even    S(S(n)) even

for the judgment n even. We can then form proofs to show that n even holds for some particular n.

       ——————
       0 even
    ————————————
    S(S(0)) even
 ——————————————————
 S(S(S(S(0)))) even

This tree, for example, is evidence that 4 even holds. We apply the second inference rule to S(S(S(S(0)))) first. This leaves us with one premise to show, S(S(0)) even. For this we repeat the process and end up with the new premise 0 even, which we can discharge with the first inference rule, which has no premises, completing our proof.
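If you like to think in programs, here is one way (my own aside, not part of Pfenning’s notes) to transcribe these two inference rules as a Haskell GADT, where a well-typed value is exactly a derivation tree:

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- One constructor per inference rule for the judgment "n even".
data Even :: Nat -> * where
  EvenZ  :: Even 'Z                      -- 0 even
  EvenSS :: Even n -> Even ('S ('S n))   -- from "n even" conclude "S(S(n)) even"

-- The derivation tree above, as a value:
fourEven :: Even ('S ('S ('S ('S 'Z))))
fourEven = EvenSS (EvenSS EvenZ)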

One judgment we’ll often see is A prop. It simply says that A is a well formed proposition, not necessarily true but syntactically well formed. This judgment is defined inductively over the structure of A. An example rule would be

A prop  B prop
——————————————
  A ∧ B prop

Which says that A ∧ B (A and B) is a well formed proposition if and only if A and B are! We can imagine a whole bunch of these rules

                A prop B prop
——————  ——————  ————————————— ...
⊤ prop  ⊥ prop    A ∨ B prop

that lay out the propositions of our logic. This doesn’t yet tell us how to prove any of these propositions to be true, but it’s a start. After we formally specify what sentences are propositions in our logic, we need to discuss how to prove that one is true. We do this with a different judgment, A true, which is once again defined inductively.

For example, we might want to give meaning to the proposition A ∧ B. To do this we define its meaning through the inference rules for proving that A ∧ B true. In this case, we have the rule

A true  B true
—————————————— (∧ I)
  A ∧ B true

I claim that this defines the meaning of ∧: to prove a conjunction to be true we must prove its left and right halves. The rather proof-theoretic thing we’ve done here is say that the meaning of something is what we use to prove it. This is sometimes called the “verificationist perspective”. Finally, note that I annotated this rule with the name ∧ I simply so it’s convenient to refer to it.

Now that we know what A ∧ B means, what does having a proof of it imply? Well, we should be able to “get out what we put in”, which would mean we’d have two inference rules

A ∧ B true    A ∧ B true
——————————    ——————————
  A true        B true

We’ll refer to these rules as ∧ E1 and ∧ E2 respectively.

Now for a bit of terminology, rules that let us “introduce” new proofs of propositions are introduction rules. Once we have a proof, we can use it to construct other proofs. The rules for how we do that are called elimination rules. That’s why I’ve been adding I’s and E’s to the ends of our rule names.

How do we convince ourselves that these rules are correct with respect to our understanding of ∧? This question leads us to the first sort of proofs-about-proofs we’ll make.

Local Soundness and Completeness

What we want to say is that the introduction and elimination rules match up. This should mean that anytime we prove something with an introduction rule immediately followed by an elimination rule, we should be able to rewrite the proof to avoid this detour. This also hints that the rules aren’t too powerful: we can’t prove anything with the elimination rules that we didn’t have a proof for at some point already.

For ∧, this proof looks like this

  D  E
  –  –
  A  B            D
 —————— ∧I   ⇒  ————
  A ∧ B           A
 —————— ∧E 1
    A

So whenever we introduce a ∧ and then eliminate it with ∧ E1 we can always rewrite our proof to not use the elimination rule. Here notice that D and E range over derivations in this proof. They represent a chain of rule applications that let us produce an A or B in the end. Note that I got a bit lazy and started omitting the true judgments; this is something I’ll do a lot since it’s mostly unambiguous.

The proof for ∧E2 is similar.

  D  E
  –  –
  A  B            E
  ————— ∧I   ⇒  ————
  A ∧ B           B
  ————— ∧E 2
    B

Given this we say that the elimination rules for ∧ are “locally sound”. That is, when used immediately after an introduction rule, they don’t let us produce anything truly new.

Next we want to show that if we have a proof of A ∧ B, the elimination rules give us enough information that we can pick the proof apart and produce a reassembled A ∧ B.

           D           D
         ————–       ————–
  D      A ∧ B       A ∧ B
————— ⇒ —————∧E1   ——————∧E2
A ∧ B      A           B
         ———————————————— ∧I
               A ∧ B

This somewhat confusing derivation takes our original proof of A ∧ B, pulls it apart into proofs of A and B, and uses these to assemble a new proof of A ∧ B. This means that our elimination rules give us all the information we put in, so we say they’re locally complete.

These two properties combined, local soundness and completeness, are how we show that an elimination rule is balanced with its introduction rule.

If you’re more comfortable with programming languages (I am) our local soundness property is equivalent to stating that

fst (a, b) ≡ a
snd (a, b) ≡ b

And local completeness is that

a ≡ (fst a, snd a)

The first two equations are reductions and the last is an expansion. These correspond to the beta and eta rules we expect a programming language to have! This is a nice little example of why proof theory is useful: it gives a systematic way to define some parts of the behavior of a program. Given the logic a programming language gives rise to, we can double check that all rules are locally sound and complete, which gives us confidence that our language isn’t horribly broken.
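Spelled out as (trivial) Haskell programs, just to make the correspondence concrete; this is my own gloss, not from the original notes:

-- Local soundness ~ beta: build with the intro form, then eliminate.
beta1 :: a -> b -> a
beta1 a b = fst (a, b)   -- reduces to a

beta2 :: a -> b -> b
beta2 a b = snd (a, b)   -- reduces to b

-- Local completeness ~ eta: take a pair apart and rebuild it.
eta :: (a, b) -> (a, b)
eta p = (fst p, snd p)   -- equivalent to p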

Hypothetical Judgments

Before I wrap up this post I wanted to talk about one last important concept in proof theory: judgments with hypotheses. This is best illustrated by trying to write the introduction and elimination rules for “implies” or “entailment”, written A ⊃ B.

Clearly A ⊃ B is supposed to mean we can prove B true assuming A true to be provable. In other words, we can construct a derivation of the form

 A true
 ——————
   .
   .
   .
 ——————
 B true

We can notate our rules then as

 —————— u
 A true
 ——————
   .
   .
   .
 ——————
 B true           A ⊃ B    A
 —————————— u     ——————————
 A ⊃ B true         B true

This notation is a bit clunky, so we’ll opt for a new one: Γ ⊢ J. In this notation Γ is some list of judgments we assume to hold and J is the thing we want to show holds. Generally we’ll end up with the rule

J ∈ Γ
—————
Γ ⊢ J

Which captures the fact that Γ contains assumptions we may or may not use to prove our goal. This specific rule may vary depending on how we want to express how assumptions work in our logic (substructural logics spring to mind here). For our purposes, this is the most straightforward characterization of how this ought to work.

Our hypothetical judgments come with a few rules which we call “structural rules”. They modify the structure of the judgment, rather than any particular proposition we’re trying to prove.

Weakening
  Γ ⊢ J
—————————
Γ, Γ' ⊢ J

Contraction
Γ, A, A, Γ' ⊢ J
———————————————
 Γ, A, Γ' ⊢ J

Exchange
Γ' = permute(Γ)   Γ' ⊢ A
————————————————————————
        Γ ⊢ A

Finally, we get a substitution principle. This allows us to eliminate some of the assumptions we made to prove a theorem.

Γ ⊢ A   Γ, A ⊢ B
————————————————
     Γ ⊢ B

These five rules give meaning to our hypothetical judgments. We can then restate our formulation of entailment with less clunky notation as

A prop  B prop
——————————————
  A ⊃ B prop

Γ, A ⊢ B      Γ ⊢ A ⊃ B    Γ ⊢ A
—————————     ——————————————————
Γ ⊢ A ⊃ B           Γ ⊢ B

One thing in particular to note here is that entailment actually internalizes the notion of hypothetical judgments into our logic. This is the aspect of it that makes it behave so differently from the other connectives we’ve looked at.
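For the programming-languages reader, here is the same observation in Curry-Howard clothing (again my own aside, not part of the original notes): A ⊃ B is the function type, ⊃ introduction is lambda abstraction, ⊃ elimination is application, and the substitution principle is just plugging a proof in.

-- ⊃ E (modus ponens): from A ⊃ B and A, conclude B.
impliesElim :: (a -> b) -> a -> b
impliesElim f a = f a

-- Substitution principle: a proof of A plus a proof of B under the
-- hypothesis A gives a proof of B.
substitution :: a -> (a -> b) -> b
substitution a f = f a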

As an exercise to the reader: prove the local soundness and completeness of these rules.

Conclusion

In this post we’ve laid out a bunch of rules and I’ve hinted that a bunch more are possible. When put together these rules define a logic using “natural deduction”, a particular way of specifying proofs that uses inference rules rather than axioms or something entirely different.

Hopefully I’ve inspired you to poke a bit further into proof theory; if so, I heartily recommend Frank Pfenning’s lectures at the Oregon Summer School for Programming Languages.

Cheers,

February 11, 2015 12:00 AM

February 10, 2015

The GHC Team

GHC Weekly News - 2015/02/10

Hi *,

Welcome! This is the first GHC Weekly news of February 2015. You might be wondering what happened to the last one. Well, your editor was just in New York for the past week attending Compose Conference, making friends and talking a lot about Haskell (luckily we missed a snow storm that may have messed it up quite badly!)

The conference was great. I got to have some interesting discussions about GHC and Haskell with many friendly faces from all around at an incredibly well run conference with a stellar set of people. Many thanks to NY Haskell (organizing), Spotify (hosting space), and to all the speakers for a wonderful time. (And of course, your editor would like to thank his employer Well-Typed for sending him!)

But now, since your author has returned, GHC HQ met back up this week for some discussion, with some regularly scheduled topics. For the most part it was a short meeting this week - our goals are pretty well identified:

  • It's likely GHC HQ will do a third 7.10.1 Release Candidate at the very end of February after the votes are included. We missed some patches in RC2 (such as Phab:D347) and incorporated even more bugfixes, so this is worth a test drive by users.
  • For the most part, things for 7.10 have been going very smoothly other than the debates and a few bugs trickling in - there has not been much ticket activity the past two weeks, so things feel pretty good right now. Austin will mostly be focused on shipping 7.10 and keeping the normal review/patch/triaging cycles going until it's done. We're on track to fix all the major bugs we've assigned (see milestone:7.10.1).

Since my last post, we've also had other random assorted chatter on the mailing lists by the dev team:

  • In light of a recent large bug in GHC which can be used to derive unsafeCoerce, GHC HQ has decided to push back the 7.10 release a bit longer, to about March, in order to fix this bug and ferret out any fallout afterwards. It turns out this isn't a simple bug to fix, but luckily a fix is being worked on already. https://www.haskell.org/pipermail/ghc-devs/2015-January/008189.html
  • David Feuer has a question: why is undefined so special? In particular, it seems as if undefined can be specially used as a value with a type of kind # as well as *. It turns out GHC has a special notion of subkinding, and undefined has a type more special than meets the eye which allows this, as Adam Gundry replied. https://www.haskell.org/pipermail/ghc-devs/2015-February/008222.html
  • Greg Weber opened up a discussion about 'Restricted Template Haskell', which would hopefully make it easier for users to see what a TH computation is actually doing. It turns out - as noted by Simon - that Typed Template Haskell is perhaps closer to what Greg wants. The proposal and discussion then resulted in us realising that the typed TH documentation is rather poor! Hopefully Greg or someone can swing in to improve things. https://www.haskell.org/pipermail/ghc-devs/2015-February/008232.html

Closed tickets the past two weeks include: #10028, #10040, #10031, #9935, #9928, #2615, #10048, #10057, #10054, #10060, #10017, #10038, #9937, #8796, #10030, #9988, #10066, #7425, #7424, #7434, #10041, #2917, #4834, #10004, #10050, #10020, #10036, #9213, and #10047.

by thoughtpolice at February 10, 2015 11:14 PM

Neil Mitchell

Why is the Hoogle index so out of date?

Summary: Hoogle 4 is out of date. The alpha version Hoogle 5 has fresh code and data every day (and isn't yet ready).

Someone recently asked why Hoogle's index is so out of date. Making the index both more current (updated daily) and larger (indexing all of Stackage) is one of the goals behind my Hoogle 5 rewrite (which still isn't finished). Let's compare the different update processes:

Hoogle 4 updates took about two hours to complete, if they went well, and often had to be aborted. I first compiled the Hoogle binary on the haskell.org machines, which often failed, as typically the version of GHC was very old. Once I'd got a compiled binary, I needed to generate the database, which took about 2 hours and occasionally failed halfway through. Once I had the new binary and databases, I moved everything to the correct place for Apache, accepting a small window of downtime during the move. Assuming that worked, I did a few test searches and smiled. Often the new Hoogle binary failed to start (usually failure to find some files, sometimes permissions) and I had to switch back to the old copy. Fixing up such issues took up to an hour. I had a mix of Windows .bat and Linux .sh scripts to automate some of the steps, but they weren't very robust and required babysitting.

Hoogle 5 updates happen automatically at 8pm every night, take 4 minutes, and have yet to fail. I have a cron script that checks out the latest code and runs an update script. That script clones a fresh repo, compiles Hoogle, builds the databases, runs the test suite, kills the old version and launches the new version. The Hoogle code is all tested on Travis, so I don't expect that to fail very often. The upgrade script is hard to test, but the two failure modes are upgrading to a broken version, or not upgrading. The upgrade script runs checks and fails if anything doesn't work as expected, so it errs on the side of not upgrading. I use Uptime Robot to run searches and check the server is working, along with a canary page which raises an error if no upgrade happens for two days.

Clearly, the Hoogle 5 version update story is better. But why didn't I do it that way with Hoogle 4? The answer is that Hoogle 4 came out over six years ago, and a lot has changed since then:

  • Hoogle 4 is a CGI binary, served through Apache, while Hoogle 5 is a Haskell Warp server. By moving the logic into Haskell, it's far easier for me to configure and manage. Warp was only released on Hackage in 2011.
  • Hoogle 4 runs on the main haskell.org server, where my mistakes can easily take out the haskell.org home page (as a result, the haskell.org home page once said "moo" for 10 minutes). Hoogle 5 runs on a dedicated VM where I have root, and no one else runs anything, so I can experiment with settings for swap files, IP tables and cron jobs.
  • My job has provided a lot of practice doing drive-by sysadmining over the last 6 years. I've also had a lot of practice doing critical releases on a nightly basis. In comparison, Hoogle is pretty simple.
  • The revised/rethought approach to Hoogle databases is a lot faster and uses a lot less memory, so it takes under a minute to generate databases, instead of over an hour. That time difference makes it much easier to experiment with different approaches.

When will Hoogle 5 be ready? It doesn't yet do type search, there is no offline version and no API. There are probably lots of other little pieces missing. If you want, feel free to use it now at hoogle.haskell.org. You can still use Hoogle 4 at haskell.org/hoogle, or the more up-to-date FP complete hosted Hoogle 4.

by Neil Mitchell ([email protected]) at February 10, 2015 06:22 PM

The GHC Team

GHC 7.10 Prelude: we need your opinion

This post asks for your help in deciding how to proceed with some Prelude changes in GHC 7.10. Please read on, but all the info is also at the survey link, here: http://goo.gl/forms/XP1W2JdfpX. Deadline is 21 Feb 2015.

The Core Libraries Committee (CLC) is responsible for developing the core libraries that ship with GHC. This is an important but painstaking task, and we owe the CLC a big vote of thanks for taking it on.

For over a year the CLC has been working on integrating the Foldable and Traversable classes (shipped in base in GHC 7.8) into the core libraries, and into the Prelude in particular. Detailed planning for GHC 7.10 started in the autumn of 2014, and the CLC went ahead with this integration.

Then we had a failure of communication. As these changes affect the Prelude, which is in scope for all users of Haskell, these changes should be held to a higher bar than the regular libraries@ review process. However, the Foldable/Traversable changes were not particularly well signposted. Many people have only recently woken up to them, and some have objected (both in principle and detail).

This is an extremely unfortunate situation. On the one hand we are at RC2 for GHC 7.10, so library authors have invested effort in updating their libraries to the new Prelude. On the other, altering the Prelude is in effect altering the language, something we take pretty seriously. We should have had this debate back in 2014, but here we are, and it is unproductive to argue about whose fault it is. We all share responsibility. We need to decide what to do now. A small group of us met by Skype and we've decided to do this:

  • Push back GHC 7.10's release by at least a month, to late March. This delay also gives us breathing space to address an unrelated show-stopping bug, Trac #9858.
  • Invite input from the Haskell community on which of two approaches to adopt (this survey). The main questions revolve around impact on the Haskell ecosystem (commercial applications, teaching, libraries, etc etc), so we want to ask your opinion rather than guess it.
  • Ask Simon Marlow and Simon Peyton Jones to decide which approach to follow for GHC 7.10.

Wiki pages have been created summarizing these two primary alternatives, including many more points and counter-points and technical details:

This survey invites your input on which plan we should follow. Would you please

  • Read the details of the alternative plans on the three wiki pages above
  • Add your response to the survey

Please do read the background. Well-informed responses will help. Thank you!

DEADLINE: 21 February 2015

Simon PJ

by simonpj at February 10, 2015 03:52 PM

Noam Lewis

Inference of ‘new’ statements, and ambiguous types

To prevent possible name clashes, the name of my type inference engine for JavaScript is now Infernu (it used to be Inferny). I hope I don’t have to change the name again!

In other news, here is some recent progress:

‘new’ now requires ‘this’ to be a row type

I changed inference of ‘new’ calls, so that the constructed function must have a row type as its implicit “this” parameter (nothing else makes sense).

The change to “new” typing made it possible to define the built-in String, Number and Boolean type coercion functions in a way that disallows constructing them (e.g. “new String(3)”), which is considered bad practice in JavaScript. The reason is that the constructed values are “boxed” (kind of), so they are compared by reference rather than by value, unlike normal strings, booleans and numbers. For example, new String(3) == '3' but at the same time, new String(3) !== '3'.

Ambiguous Types

I added an initial implementation of what I call ambiguous types. These are simple type constraints that limit a type variable to a set of specific types.

The motivation for type constraints is that some JavaScript operators make sense for certain types, but not all types. So having a fully polymorphic type variable would be too weak. For example, the + operator has weird outputs when using all sorts of different input types (NaNNaNNaNNaNNaNNaN….batman!). I would like to constrain + to work only between strings or between numbers.

With the new type constraints, + has the following type:

a = (TNumber | TString) => ((a, a) -> a)

The syntax is reminiscent of Haskell’s type classes, and means: given a type variable “a” that is either a TNumber or a TString, the type of + is: (a, a) -> a

I’m thinking about implementing full-blown type classes, or alternatively, a more powerful form of ambiguous types, to deal with some other more complicated examples.


Tagged: Haskell, Javascript

by sinelaw at February 10, 2015 08:16 AM

February 09, 2015

Philip Wadler

Say no to a Scottish national ID system

The Scottish government has opened for consultation plans that would lead to database sharing among a vast range of organisations, and could lead to the introduction of de facto ID cards via the back door. Responses to the consultation are due by 25 February 2015. ORG Scotland writes:

A minor, barely noticed consultation is not the way to make a major change to Scottish citizens’ privacy and their relationship with the state. Creating a national ID register was rejected by the SNP and the UK, and the bare minimum should be for the Scottish Executive to introduce primary legislation whereby the public and MSPs can debate the nature of these changes and whether they are acceptable.

Respond to the consultation quickly, courtesy of ORG.

ORG is planning meetings to discuss how we can stop the Scottish Government's plans in Edinburgh, Glasgow and Aberdeen, and is tracking developments on its blog.

Here is the original consultation, and a detailed response by ORG.

by Philip Wadler ([email protected]) at February 09, 2015 08:39 PM

Roman Cheplyaka

Dealing with broken Haskell packages

As we approach the release of GHC 7.10, there is a new wave of Haskell packages that require trivial fixes to build with the new versions of the compiler and standard libraries, but whose authors/maintainers are not around to apply the fixes. This is especially annoying when there is a pull request on GitHub, and all the maintainer would have to do is to press the green Merge button, and upload the new version on hackage.

If you are a responsible maintainer and don’t want this to happen to your packages in the future, you should appoint backup maintainers for your packages.

But what if you are a user of a package that remains broken on hackage, even though a fix is available? Here I review several ways to deal with this problem, including the new and promising Stackage snapshots.

Building the package locally

If all you care about is to get something installed locally (be it the broken package itself, or something that directly or indirectly depends on it), you can install the fixed version locally.

Non-sandboxed way

Check out the repository or branch with the fix, and cabal-install it:

% git clone -b ghc710 https://github.com/markus1189/feed.git
% cabal install ./feed

(I prepend ./ to make sure cabal understands that I mean the directory, and not the package name on hackage.)

Sandboxed way

If you’re installing in the sandbox, then you can use add-source (although the non-sandboxed version will work in this case, too):

% git clone -b ghc710 https://github.com/markus1189/feed.git
% cabal sandbox add-source feed
% cabal install whatever-you-needed

If the package whatever-you-needed has feed among its transitive dependencies, cabal will automatically install it from the add-source’d directory.

Limitations

This approach doesn’t work well if:

  1. You are a maintainer of a package that depends on the broken library. It’s hard to ask your users to check out and build the fixed version by hand.

  2. You work on an in-house application that your coworkers should be able to build, for the same reason.

Forking the package

You cannot upload the fixed version of a package on hackage bypassing the maintainer. However, you can upload it under a new name. This works well if you are a maintainer of a package that directly depends on the broken package, because you can easily make your package depend on your fork.

Examples of this are tasty depending on regex-tdfa-rc (a fork of regex-tdfa) and tasty-golden depending on temporary-rc (a fork of temporary).

Limitations

  1. This doesn’t work well if there’s a chain of dependencies leading from your package to the broken one. You have to either persuade the other maintainer(s) to depend on your fork or fork the entire chain.

  2. If the broken package becomes actively developed again, you need to either move back to using it or backport the bugfixes from it to your fork. (I only fork packages when I find this scenario unlikely.)

  3. Other packages that depend on the broken package won’t automatically get fixed.

  4. Some people get upset when you fork packages.

Stackage snapshots

Instead of uploading the fixed version to hackage (which you can’t), you can upload it to Stackage instead, by creating a custom snapshot.

The procedure is described in Experimental package releases via Stackage Server. You create four files:

  • The fixed tarball (produced by cabal sdist). You probably want to bump the package’s version, so that it doesn’t conflict with the version already on hackage.
  • Two text files: desc and slug. The first one contains a human-readable description of the snapshot; the second contains an id that will become part of the snapshot’s URL.
  • A text file with the packages to be copied directly from hackage. For the purpose of this article, you probably want to leave this file empty. (I don’t know if it’s ok not to include it at all.)

Then you pack these four files into a tarball (that’s right, it’ll be a tarball with a tarball inside) and upload to stackage (after registering, if you haven’t registered before).

The outcome will be a custom hackage-like repository which will contain the single version of a single package — the one you’ve uploaded. (Of course, you can include multiple packages or versions if you like.)

The Stackage website will give you the remote-repo line that you can put into your cabal.config along with the hackage or stackage repos that are already there.
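For concreteness, the added line looks something like the following (the snapshot name and URL here are hypothetical; use whatever the Stackage site shows for your upload):

remote-repo: my-feed-fix:http://www.stackage.org/snapshot/my-feed-fix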

In contrast to building packages locally, you can easily tell your users or coworkers to add that repo as well.

Limitations

  1. If the new hackage release of the broken package will get the same version number as your stackage version, there will be a conflict. (I actually don’t know what happens in that case; my guess is that cabal will silently pick one of the two available versions.)

  2. If the package you maintain (which depends on the broken package) is a small one, or is deep down the dependency chain, it may be hard to tell your users to add the repository. If, on the other hand, you maintain a major web framework or other such thing, it would probably work.

Taking over a package

There’s a procedure for taking over a package described on the wiki. You’ll need to contact the current maintainer; wait an indefinite amount of time (there’s no consensus about it; estimates vary from 2 weeks to 6-12 months); ask again on the mailing list and wait again; finally ask Hackage admins to grant you the rights.

Limitations

  1. Since this procedure takes a long time, it’s almost never sufficient by itself, and you’ll need to resort to one of the other strategies until you’re given the upload rights.

  2. It’s not clear how long you actually need to wait.

  3. I find it odd that you need to jump through all these hoops in order to do a service to the community.

February 09, 2015 08:00 PM

Lee Pike

15 Minutes

Some of the work I lead at Galois was highlighted in the initial story on 60 Minutes last night, a spot interviewing Dan Kaufman at DARPA. I’m Galois’ principal investigator for the HACMS program, focused on building more reliable software for automobiles and aircraft and other embedded systems. The piece provides a nice overview for the general public on why software security matters and what DARPA is doing about it; HACMS is one piece of that story.

I was busy getting married when filming was scheduled, but two of my colleagues (Dylan McNamee and Pat Hickey) appear in brief cameos in the segment (don’t blink!). Good work, folks! I’m proud of my team and the work we’ve accomplished so far.

You can see more details about how we have been building better programming languages for embedded systems and using them to build unpiloted air vehicle software here.


by Lee Pike at February 09, 2015 04:31 PM

Yesod Web Framework

The brittle Haskell toolchain

A few weeks ago I received a bug report against streaming-commons. Since then, the details of what we discovered when discussing this report have been bothering me quite a bit, as they expose a lot of the brittleness of the Haskell toolchain. I'm documenting all of these aspects now to make clear how fragile our tooling is, and thereby explain why I think Stackage is so vital to our community.

In this blog post, I'm going to describe six separate problems I've identified when looking into this issue, and explain how Stackage (or some similar deterministic build system) would have protected users against these problems had it been employed.

The story

streaming-commons is a library that provides helper utilities for a number of different streaming concepts, one of them being a streaming way to convert blaze-builder Builders to filled ByteString buffers. Since blaze-builder was released a few years ago, a new set of modules was added to the bytestring package in version 0.10 known as a "bytestring builder." I asked one of the engineers at FP Complete, Emanuel Borsboom, to start working on a new module for streaming-commons to provide similar functionality for bytestring builder.

And now we run into the first problem with the Haskell toolchain. You would think that we should just add a lower bound of bytestring >= 0.10 in the streaming-commons.cabal file. However, setting restrictive lower bounds on packages that ship with GHC, like bytestring, can be a problem. Fortunately, Leon Smith already solved this problem for us with bytestring-builder, which provides a compatibility layer for older bytestrings (much like Ed's transformers-compat). The idea is that, when compiled against an older version of bytestring, the bytestring-builder package provides the necessary missing modules, and otherwise does nothing.

When Emanuel wrote his changes to streaming-commons, he added a dependency on bytestring-builder. We then proceeded to test this on multiple versions of GHC via Travis CI and Herbert's multi-ghc-travis. Everything compiled and passed tests, so we shipped the updated version.

However, that original bug report I linked to- reported by Ozgun Ataman- told us there was a problem with GHC 7.6. This was pretty surprising, given that we'd tested on GHC 7.6. Fortunately Lane Seppala discovered the culprit: the Cabal library. It turns out that installing a new version of the Cabal library causes the build of streaming-commons to break, whereas our tests just used the default version of Cabal shipped with GHC 7.6. (We'll get back to why that broke things in a bit.)

After some digging, Emanuel discovered the deeper cause of the problem: Bryan O'Sullivan reported an issue a year ago where, when using a new version of the Cabal library, bytestring-builder does not in fact provide its compatibility modules. This leads us to our second issue: this known bug existed for almost a year without resolution, and since it only occurs in unusual circumstances, was not detected by any of our automated tooling.

The reason this bug existed though is by far the most worrisome thing I saw in this process: the Cabal library silently changed the semantics of one of its fields in the 1.18 (or 1.20? I'm not sure) release. You see, bytestring-builder was detecting which version of bytestring it was compiled against by inspecting the configConstraints field (you can see the code yourself on Hackage). And starting in Cabal 0.19.1 (a development release), that field was no longer being populated. As a result, as soon as that newer Cabal library was installed, the bytestring-builder package became worse than useless.

As an aside, this points to another problematic aspect of our toolchain: there is no way to specify constraints on dependencies used in custom Setup.hs files. That actually causes more difficulty than it may sound, but I'll skip diving into it for now.

The fix for this was relatively simple: use some flag logic in the cabal file instead of a complicated custom Setup.hs file. (Once this pull request was merged in and released, it did fix the original bug report.) But don't take this as a critique of Leon's choice of a complicated Setup.hs file: in reality, the flag trick, while the "standard" solution to this problem, broke cabal-install's dependency solver for quite a while. To be fair, I'm still not completely convinced that the bug is fixed, but for now that bug is the lesser of two evils vs the Cabal library bug.
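For readers unfamiliar with the flag trick, the general shape is something like the following sketch (illustrative only, not the actual bytestring-builder.cabal; the flag name and module list are made up): the solver picks whichever branch it can satisfy, and the compatibility modules are only built against an old bytestring.

flag use_shipped_builder
  -- rely on the builder modules shipped with bytestring >= 0.10.4
  default: True

library
  if flag(use_shipped_builder)
    build-depends: bytestring >= 0.10.4
  else
    build-depends: bytestring < 0.10.4
    hs-source-dirs: compat
    exposed-modules: Data.ByteString.Builder
                     Data.ByteString.Builder.Extra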

And finally, based on the bug report from Ozgun, it seems like an internal build failed based on all of this occurring. This has been a constant criticism I've made about the way we generally do builds in the Haskell world. Rarely is reproducibility a part of the toolchain. To quote Ozgun:

We are in fact quite careful in dependency management with lower-upper bounds on most outside packages, so breakages like this are unexpected.

And many people feel that this is the way things should be. But as this discussion hopefully emphasizes, just playing with lower and upper bounds is not sufficient to avoid build failures in general. In this case, we're looking at a piece of software that was broken by a change in a library that it didn't depend on, namely Cabal, since our tooling makes an implicit dependency on that library, and we have no way of placing bounds on it.

The case for Stackage

So here are the toolchain problems I've identified above:

  1. Tight coupling between GHC version and some core libraries like bytestring.
  2. A known issue lasting undocumented for a corner case for over a year, without any indication on the Hackage page that we should be concerned.
  3. The Cabal library silently changed the semantics of a field, causing complete breakage of a package.
  4. cabal-install's solver gets confused by standard flag usage, at least in slightly older versions.
  5. Not all dependencies are actually specified in a cabal file. At the very least, the Cabal library version is unconstrained, and any other packages used by Setup.hs.
  6. Default Haskell toolchain doesn't protect us against these kinds of problems, or give us any concept of reproducibility.

Stackage completely solves (2), (3), (5), and (6) for end users. By specifying all library versions used, and then testing all of those versions together, we avoid many possible corner cases of weird library interactions, and provide a fully reproducible build. (Note that Stackage doesn't solve all such cases: the operating system, system libraries, executables, etc. are still unpinned. That's why FP Complete is working on Docker-based tooling.)

(1) is highly mitigated by Stackage because, even though the tight coupling still exists, Stackage provides a set of packages that take that coupling into account for you, so you're not stuck trying to put the pieces together yourself.

As for (4)... Stackage helps the situation by making the job of the solver simpler by pinning down version numbers. Unfortunately, there are still potential gotchas when encountering solver bugs. Sometimes we end up needing to implement terribly awkward solutions to work around those bugs.

February 09, 2015 04:16 PM

Ian Ross

Non-diffusive atmospheric flow #12: dynamics warm-up

The analysis of preferred flow regimes in the previous article is all very well, and in its way quite illuminating, but it was an entirely static analysis – we didn’t make any use of the fact that the original $Z_{500}$ data we used was a time series, so we couldn’t gain any information about transitions between different states of atmospheric flow. We’ll attempt to remedy that situation now.

What sort of approach can we use to look at the dynamics of changes in patterns of $Z_{500}$? Our $(\theta, \phi)$ parameterisation of flow patterns seems like a good start, but we need some way to model transitions between different flow states, i.e. between different points on the $(\theta, \phi)$ sphere. Each of our original $Z_{500}$ maps corresponds to a point on this sphere, so we might hope that we can come up with a way of looking at trajectories of points in $(\theta, \phi)$ space that will give us some insight into the dynamics of atmospheric flow.

Since atmospheric flow clearly has some stochastic element to it, a natural approach to take is to try to use some sort of Markov process to model transitions between flow states. Let me give a very quick overview of how we’re going to do this before getting into the details. In brief, we partition our $(\theta, \phi)$ phase space into $P$ components, assign each $Z_{500}$ pattern in our time series to a component of the partition, then count transitions between partition components. In this way, we can construct a matrix $M$ with

$$M_{ij} = \frac{N_{i \to j}}{N_{\mathrm{tot}}}$$

where $N_{i \to j}$ is the number of transitions from partition $i$ to partition $j$ and $N_{\mathrm{tot}}$ is the total number of transitions. We can then use this Markov matrix to answer some questions about the type of dynamics that we have in our data – splitting the Markov matrix into its symmetric and antisymmetric components allows us to respectively look at diffusive (or irreversible) and non-diffusive (or conservative) dynamics.
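
To make the bookkeeping concrete, here’s a minimal Haskell sketch (my own, using plain lists rather than the hmatrix types used elsewhere in this series) of building such a transition matrix from a sequence of partition component indices and splitting it into its symmetric and antisymmetric parts:

import qualified Data.Map.Strict as M

-- | Transition matrix from a sequence of partition component indices
-- (each in [0 .. p-1]): count transitions between consecutive states
-- and normalise by the total number of transitions.
markovMatrix :: Int -> [Int] -> [[Double]]
markovMatrix p states = [ [ get (i, j) / ntot | j <- idx ] | i <- idx ]
  where
    idx    = [0 .. p - 1]
    pairs  = zip states (tail states)
    counts = M.fromListWith (+) [ (ij, 1 :: Double) | ij <- pairs ]
    get ij = M.findWithDefault 0 ij counts
    ntot   = fromIntegral (length pairs)

-- | Symmetric part: the diffusive (irreversible) component.
symmetricPart :: [[Double]] -> [[Double]]
symmetricPart m = [ [ (m !! i !! j + m !! j !! i) / 2 | j <- idx ] | i <- idx ]
  where idx = [0 .. length m - 1]

-- | Antisymmetric part: the non-diffusive (conservative) component.
antisymmetricPart :: [[Double]] -> [[Double]]
antisymmetricPart m = [ [ (m !! i !! j - m !! j !! i) / 2 | j <- idx ] | i <- idx ]
  where idx = [0 .. length m - 1]

(The quadratic list indexing doesn’t matter here, since the number of partition components will be small.)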

Before trying to apply these ideas to our $Z_{500}$ data, we’ll look (in the next article) at a very simple Markov matrix calculation by hand to get some understanding of what these concepts really mean. Before that though, we need to take a look at the temporal structure of the $Z_{500}$ data – in particular, if we’re going to model transitions between flow states by a Markov process, we really want uncorrelated samples from the flow, and our daily $Z_{500}$ data is clearly correlated, so we need to do something about that.

Autocorrelation properties

Let’s look at the autocorrelation properties of the PCA projected component time series from our original $Z_{500}$ data. We use the autocorrelation function in the statistics package to calculate and save the autocorrelation for these PCA projected time series. There is one slight wrinkle – because we have multiple winters of data, we want to calculate autocorrelation functions for each winter and average them. We do not want to treat all the data as a single continuous time series, because if we do we’ll be treating the jump from the end of one winter to the beginning of the next as “just another day”, which would be quite wrong. We’ll need to pay attention to this point when we calculate Markov transition matrices too. Here’s the code to calculate the autocorrelation:

npcs, nday, nyear :: Int
npcs = 10
nday = 151
nyear = 66

main :: IO ()
main = do
  -- Open projected points data file for input.
  Right innc <- openFile $ workdir </> "z500-pca.nc"
  let Just ntime = ncDimLength <$> ncDim innc "time"
  let (Just projvar) = ncVar innc "proj"
  Right (HMatrix projsin) <-
    getA innc projvar [0, 0] [ntime, npcs] :: HMatrixRet CDouble

  -- Split projections into one-year segments.
  let projsconv = cmap realToFrac projsin :: Matrix Double
      lens = replicate nyear nday
      projs = map (takesV lens) $ toColumns projsconv

  -- Calculate autocorrelation for one-year segment and average.
  let vsums :: [Vector Double] -> Vector Double
      vsums = foldl1 (SV.zipWith (+))
      fst3 (x, _, _) = x
      doone :: [Vector Double] -> Vector Double
      doone ps = SV.map (/ (fromIntegral nyear)) $
                 vsums $ map (fst3 . autocorrelation) ps
      autocorrs = fromColumns $ map doone projs

  -- Generate output file.
  let outpcdim = NcDim "pc" npcs False
      outpcvar = NcVar "pc" NcInt [outpcdim] M.empty
      outlagdim = NcDim "lag" (nday - 1) False
      outlagvar = NcVar "lag" NcInt [outlagdim] M.empty
      outautovar = NcVar "autocorr" NcDouble [outpcdim, outlagdim] M.empty
      outncinfo =
        emptyNcInfo (workdir </> "autocorrelation.nc") #
        addNcDim outpcdim # addNcDim outlagdim #
        addNcVar outpcvar # addNcVar outlagvar #
        addNcVar outautovar

  flip (withCreateFile outncinfo) (putStrLn . ("ERROR: " ++) . show) $
    \outnc -> do
      -- Write coordinate variable values.
      put outnc outpcvar $
        (SV.fromList [0..fromIntegral npcs-1] :: SV.Vector CInt)
      put outnc outlagvar $
        (SV.fromList [0..fromIntegral nday-2] :: SV.Vector CInt)
      put outnc outautovar $ HMatrix $
        (cmap realToFrac autocorrs :: Matrix CDouble)
      return ()

We read in the component time series as an hmatrix matrix, split the matrix into columns (the individual component time series), then split each time series into year-long segments. Then we use the autocorrelation function on each segment of each time series (dropping the confidence limit values that the autocorrelation function returns, since we’re not so interested in those here) and average across segments of each time series. The result is an autocorrelation function (for lags from zero to $\mathtt{nday}-2$) for each PCA component. We write those to a NetCDF file for further processing.

The plot below shows the autocorrelation functions for the first three PCA projected component time series. The important thing to notice here is that there is significant autocorrelation in each of the PCA projected component time series out to lags of 5–10 days (the horizontal line on the plot is at a correlation of $e^{-1}$). This makes sense – even at the bottom of the atmosphere, where temporal variability tends to be less structured than at 500 mb, we expect the weather tomorrow to be reasonably similar to the weather today.

It appears that there is pretty strong correlation in the $Z_{500}$ data at short timescales, which would be an obstacle to performing the kind of Markov matrix analysis we’re going to do next. To get around this, we’re going to average our data over non-overlapping 7-day windows (seven days seems like a good compromise between throwing lots of data away and reducing the autocorrelation to a low enough level) and work with those 7-day means instead of the unprocessed PCA projected component time series. This does mean that we now need to rerun all of our spherical PDF analysis for the 7-day mean data, but that’s not much of a problem because everything is nicely scripted and it’s easy to rerun it all.
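
For concreteness, the windowed averaging can be done with something like the following (a small sketch of my own, not the code used for the plots; blockMeans is a made-up name):

import qualified Data.Vector.Storable as SV

-- | Average a time series over non-overlapping windows of the given
-- length, dropping any incomplete window at the end.
blockMeans :: Int -> SV.Vector Double -> SV.Vector Double
blockMeans len xs = SV.generate nblocks mean1
  where
    nblocks = SV.length xs `div` len
    mean1 i = SV.sum (SV.slice (i * len) len xs) / fromIntegral len

Applied per winter with len = 7 and nday = 151, this gives 21 weekly means per winter, i.e. 1386 points over the 66 winters, which is where the data point count mentioned below comes from.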

Spherical PDF for 7-day means

The figures below show the same plots as we earlier had for all the PCA projected component time series, except this time we’re looking at the 7-day means of the projected component time series, to ensure that we have data without significant temporal autocorrelation.


The first figure tab (“Projected points”) shows the individual 7-day mean data points, plotted using $(\theta, \phi)$ polar coordinates. Comparing with the corresponding plot for all the data in the earlier article, we can see (obviously!) that there’s less data here, but also that it’s not really any easier to spot clumping in the data points than it was when we used all the data. It again makes sense to do KDE to find a smooth approximation to the probability density of our atmospheric flow patterns.

The “Spherical PDF” tab shows the spherical PDF of 7-day mean PCA components (parametrised by spherical polar coordinates $\theta$ and $\phi$) calculated by kernel density estimation: darker colours show regions of greater probability density. Two “bumps” are labelled for further consideration. Compared to the “all data” PDF, the kernel density estimate of the probability density for the 7-day mean data is more concentrated, with more of the probability mass appearing in the two labelled bumps on the plot. (Recall that the “all data” PDF had four “bumps” that we picked out to look at – here we only really have two clear bumps.)

We can determine the statistical significance of those bumps in exactly the same way as we did before. The “Significance” tab above shows the results. As you’d expect, both of the labelled bumps are highly significant. However, notice that the significance scale here extends only to 99% significance, while that for the “all data” case extends to 99.9%. The reduced significance levels are simply a result of having fewer data points – we have 1386 7-day mean points compared to 9966 “all data” points, which means that we have more sampling variability in the null hypothesis PDFs that we use to generate the histograms for the significance calculation. That increased sampling variability translates into less certainty that our “real data” couldn’t have occurred by chance, given the assumptions of the null hypothesis. Still, 99% confidence isn’t too bad!

Finally, we can plot the spatial patterns of atmospheric flow corresponding to the labelled bumps in the PDF, just as we did for the “all data” case. The “Bump patterns” tab shows the patterns for the two most prominent bumps in the 7-day means PDF. As before, the two flow patterns seem to distinguish quite clearly between “normal” zonal flow (in this case, pattern #2) and blocking flow (pattern #1).


Now that we’ve dealt with this autocorrelation problem, we’re ready to start thinking about how we model transitions between different flow states. In the next article, we’ll use a simple low-dimensional example to explain what we’re going to do.


February 09, 2015 08:31 AM

February 06, 2015

Philip Wadler

A paean to the Western General


I resided in Ward 42 of the Regional Infectious Disease Unit of Edinburgh's Western General Hospital from 17 Dec—2 Jan. For most of the time I received antibiotic by drip or injection every four hours, day and night, and I thought my stay would be six weeks. Fortunately, my infection reacted well to antibiotic and I could be released early for outpatient treatment.

The building is ugly, but the people inside it are beautiful. I cannot recall another time or place where everyone was so unfailingly helpful and friendly. On the first day, a nurse found a staff member willing to lend me a charger when my phone ran down. The doctors took as much time as needed to answer my questions. The nurses were cheerful despite constant interruptions. The men who brought the coffee remembered that I liked it with milk, and one often asked after my twins (he had twins as well). No one objected when my daughter brought me a poster of the Justice League and I blue-tacked it to my wall; several admired it.

Most often, I interact with institutions where the people who help you are clearly in it for the pay. They are nice only to the extent required by their job, and often less than that. Part of the difference in attitude here must be that the people know they are actively helping patients in need. But I take my hat off to an institution that inculcates a caring attitude in everyone.

(Picture above courtesy of The Edinburgh Sketcher, whose pictures adorn a couple of corridors in the Western General. The RIDU is a different building, but somehow every building in the complex contrives to be equally unattractive.)

Related: Status report, Status report 2.

by Philip Wadler ([email protected]) at February 06, 2015 11:12 AM

Status Report 2



An update to my status. My doctors continue to monitor my heart infection, but it appears cleared up, along with the problems in my abdomen.

I met with my urologist on 4 Feb. My latest CAT scan (27 Jan) shows a small mass in my liver and that the tumour on my left kidney has not grown. The mass is unlikely to be a metastasis of the tumour, but the first order of business is to biopsy my liver; this should happen in the next two weeks, and it may take a further two weeks to get the results. Meanwhile, I am on the waiting list for keyhole surgery to remove my left kidney; this should happen in about six weeks. (Hospitals are fined £1000 if it takes more than four weeks, but the Western General currently has thirty people over that limit.) Recovery time is about four weeks. So, with luck, back to work in ten weeks, mid-April.

All four kidney surgeons at the Western General are in the top 10% in the country, so I am in good hands. If keyhole surgery converts to ordinary surgery the recovery time is three months; this happens in 4% of cases. My doctor says it is unlikely to happen to me because, compared to most of his patients, I am young, fit, and slim. Not words I usually hear applied to myself!

Previously: Status report, A paean to the Western General.

by Philip Wadler ([email protected]) at February 06, 2015 11:11 AM

Status report

I am off work this semester, being treated for two things: an infection affecting my heart and abdomen; and a tumour on my kidney. I was in hospital 17 Dec—2 Jan, and self-administered antibiotics as an outpatient 3 Jan—29 Jan. The photo shows me partway through self-administration, which required 90 minutes each day.

The infection of my heart and abdomen appears cured, and I am feeling much better. I am awaiting an appointment with urology. It is likely the kidney will need to be removed. The tumour, I am told, is too small to have metastasised. I will have better information once I meet my urologist, but my current guess is that I will be back at work sometime in March.

My thanks to the NHS and to the Western General Hospital for the excellent treatment I have received, and to all my colleagues for their support.

by Philip Wadler ([email protected]) at February 06, 2015 10:33 AM

February 05, 2015

Neil Mitchell

Refactoring with Equational Reasoning

Summary: Haskell is great for refactoring, thanks to being able to reason about and transform programs with confidence.

I think one of Haskell's strengths as a practical language is that it's easy to refactor, and more importantly, easy to refactor safely. Programs in the real world often accumulate technical debt - code that is shaped more by its history than its current purpose. Refactoring is one way to address that technical debt, making the code simpler, but not changing any meaningful behaviour.

When refactoring, you need to think of which alternative forms of code are equivalent but better. In C++ I've removed unused variables, only to find they were RAII variables, and their mere presence had a semantic effect. In R I've removed redundant if expressions, only to find the apparently pure condition had the effect of coercing a variable and changing its type. In Haskell, it's equally possible to make refactorings that at first glance appear safe but aren't - however, in practice, it happens a lot less. I think there are a few reasons for that:

  • Haskell is pure and evaluation strategy is largely unobservable - moving a statement "before" or "after" another lexically is usually safe.
  • Refactorings that do go wrong, for example variables that accidentally get moved out of scope or types which are no longer as constrained, usually result in compiler errors.
  • The Haskell community cares about semantics and laws. The Monad laws are satisfied by almost all monads; flagrantly breaking those laws is rare.
  • Functions like unsafePerformIO, which could harm refactoring, are almost always used behind a suitable pure abstraction.

Note that these reasons are due to both the language, and the conventions of the Haskell community. (Despite these factors, there are a few features that can trip up refactorings, e.g. exceptions, record wildcards, space-leaks.)

To take a very concrete example, today I was faced with the code:

f = fromMaybe (not b) . select
if f v == b then opt1 else opt2

At one point the function f was used lots, had a sensible name and nicely abstracted some properties. Now f is used once, the semantics are captured elsewhere, and the code is just unclear. We can refactor this statement, focusing on the condition:

f v == b
-- inline f
(fromMaybe (not b) . select) v == b
-- remove brackets and inline (.)
fromMaybe (not b) (select v) == b
-- expand to a case statement
(case select v of Nothing -> not b; Just x -> x) == b
-- push the == down
case select v of Nothing -> not b == b; Just x -> x == b
-- simplify not b == b
case select v of Nothing -> False; Just x -> x == b
-- collapse back up
select v == Just b

And now substitute back in:

if select v == Just b then opt1 else opt2

Our code is now much simpler and more direct. Thanks to the guarantees I expect of Haskell programs, I also have a high degree of confidence this code really is equivalent - even if it isn't obvious just looking at beginning and end.

by Neil Mitchell ([email protected]) at February 05, 2015 09:14 PM

ERDI Gergo

Typed embedding of STLC into Haskell

Someone posted to the Haskell subreddit this blogpost of Lennart where he goes step-by-step through implementing an evaluator and type checker for CoC. I don't know why this post from 2007 showed up on Reddit this week, but it's a very good post, pedagogically speaking. Go and read it.

In this post, I'd like to elaborate on the simply-typed lambda calculus part of his blogpost. His typechecker defines the following types for representing STLC types, terms, and environments:

data Type = Base
          | Arrow Type Type
          deriving (Eq, Show)

type Sym = String

data Expr = Var Sym
          | App Expr Expr
          | Lam Sym Type Expr
          deriving (Eq, Show)
      

The signature of the typechecker presented in his post is as follows:

type ErrorMsg = String
type TC a = Either ErrorMsg a
newtype Env = Env [(Sym, Type)] deriving (Show)

tCheck :: Env -> Expr -> TC Type
      

My approach is to instead create a representation of terms of STLC in such a way that only well-scoped, well-typed terms can be represented. So let's turn on a couple of heavy-weight language extensions from GHC 7.8 (we'll see how each of them is used), and define a typed representation of STLC terms:

{-# LANGUAGE GADTs, StandaloneDeriving #-}
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TemplateHaskell #-} -- sigh...

import Data.Singletons.Prelude
import Data.Singletons.TH
import Data.Type.Equality

-- | A (typed) variable is an index into a context of types
data TVar (ts :: [Type]) (a :: Type) where
    Here :: TVar (t ': ts) t
    There :: TVar ts a -> TVar (t ': ts) a
deriving instance Show (TVar ctx a)

-- | Typed representation of STLC: well-scoped and well-typed by construction
data TTerm (ctx :: [Type]) (a :: Type) where
    TConst :: TTerm ctx Base
    TVar :: TVar ctx a -> TTerm ctx a
    TLam :: TTerm (a ': ctx) b -> TTerm ctx (Arrow a b)
    TApp :: TTerm ctx (Arrow a b) -> TTerm ctx a -> TTerm ctx b
deriving instance Show (TTerm ctx a)
      

The idea is to represent the context of a term as a list of types of variables in scope, and index into that list, de Bruijn-style, to refer to variables. This indexing operation maintains the necessary connection between the pointer and the type that it points to. Note the type of the TLam constructor, where we extend the context at the front for the inductive step.

To give a taste of how convenient it is to work with this representation programmatically, here's a total evaluator:

-- | Interpretation (semantics) of our types
type family Interp (t :: Type) where
    Interp Base = ()
    Interp (Arrow t1 t2) = Interp t1 -> Interp t2

-- | An environment gives the value of all variables in scope in a given context
data Env (ts :: [Type]) where
    Nil :: Env '[]
    Cons :: Interp t -> Env ts -> Env (t ': ts)

lookupVar :: TVar ts a -> Env ts -> Interp a
lookupVar Here      (Cons x _)  = x
lookupVar (There v) (Cons _ xs) = lookupVar v xs

-- | Evaluate a term of STLC. This function is total!
eval :: Env ctx -> TTerm ctx a -> Interp a
eval env TConst = ()
eval env (TVar v) = lookupVar v env
eval env (TLam lam) = \x -> eval (Cons x env) lam
eval env (TApp f e) = eval env f $ eval env e
      

Of course, the problem is that this representation is not at all convenient for other purposes. For starters, it is certainly not how we would expect human beings to type in their programs.

My version of the typechecker, instead of giving the type of a term (when it is well-typed), transforms the loose representation (Term, i.e. Lennart's Expr type quoted above) into the tight one (TTerm). A Term is well-scoped and well-typed (under some binders) iff there is a TTerm corresponding to it. Let's use singletons to store type information in existential positions:

$(genSingletons [''Type])
$(singDecideInstance ''Type)

-- | Existential version of 'TTerm'
data SomeTerm (ctx :: [Type]) where
    TheTerm :: Sing a -> TTerm ctx a -> SomeTerm ctx

-- | Existential version of 'TVar'
data SomeVar (ctx :: [Type]) where
    TheVar :: Sing a -> TVar ctx a -> SomeVar ctx

-- | A typed binder of variable names
data Binder (ctx :: [Type]) where
    BNil :: Binder '[]
    BCons :: Sym -> Sing t -> Binder ts -> Binder (t ': ts)
      

Armed with these definitions, we can finally define the type inferer. I would argue that it is no more complicated than Lennart's version. In fact, it has the exact same shape, with value-level equality tests replaced with Data.Type.Equality-based checks.

-- | Type inference for STLC
infer :: Binder ctx -> Term -> Maybe (SomeTerm ctx)
infer bs (Var v) = do
    TheVar t v' <- inferVar bs v
    return $ TheTerm t $ TVar v'
infer bs (App f e) = do
    TheTerm (SArrow t0 t) f' <- infer bs f
    TheTerm t0' e' <- infer bs e
    Refl <- testEquality t0 t0'
    return $ TheTerm t $ TApp f' e'
infer bs (Lam v ty e) = case toSing ty of
    SomeSing t0 -> do
        TheTerm t e' <- infer (BCons v t0 bs) e
        return $ TheTerm (SArrow t0 t) $ TLam e'

inferVar :: Binder ctx -> Sym -> Maybe (SomeVar ctx)
inferVar (BCons u t bs) v
  | v == u = return $ TheVar t Here
  | otherwise = do
      TheVar t' v' <- inferVar bs v
      return $ TheVar t' $ There v'
inferVar _ _ = Nothing        
      

Note that pattern matching on Refl in the App case brings in scope type equalities that are crucial to making infer well-typed.

Of course, because of the existential nature of SomeVar, we should provide a typechecker as well which is a much more convenient interface to work with:

-- | Typechecker for STLC
check :: forall ctx a. (SingI a) => Binder ctx -> Term -> Maybe (TTerm ctx a)
check bs e = do
    TheTerm t' e' <- infer bs e
    Refl <- testEquality t t'
    return e'
  where
    t = singByProxy (Proxy :: Proxy a)

-- | Typechecker for closed terms of STLC
check_ :: (SingI a) => Term -> Maybe (TTerm '[] a)
check_ = check BNil        
      

(The SingI a constraint is an unfortunate implementation detail; the kind of a is Type, which is closed, so GHC should be able to know there is always going to be a SingI a instance).
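
As a quick usage sketch (my own example, not from the post), here's the identity function being checked at type Arrow Base Base; Term is the loose representation quoted from Lennart's post above (there spelled Expr):

-- The identity function \x:Base. x in the loose representation.
idTerm :: Term
idTerm = Lam "x" Base (Var "x")

-- Checking it at Base -> Base succeeds, and evaluating the result
-- gives an actual Haskell function of type Interp Base -> Interp Base:
idChecked :: Maybe (TTerm '[] (Arrow Base Base))
idChecked = check_ idTerm

-- Checking it at the wrong type fails:
idWrong :: Maybe (TTerm '[] Base)
idWrong = check_ idTerm   -- Nothing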

To review, we've written a typed embedding of STLC into Haskell, with a total evaluator and a typechecker, in about 110 lines of code.

If we were doing this in something more like Agda, one possible improvement would be to define a function untype :: TTerm ctx a -> Term and use that to give check basically a type of Binder ctx -> (e :: Term) -> Either ((e' :: TTerm ctx a) -> untype e' == e -> Void) (TTerm ctx a), i.e. to give a proof in the non-well-typed case as well.

Full code as a gist on Github

February 05, 2015 05:28 PM

JP Moresmau

Genetic evolution of a neural network

In  a previous post I was trying the LambdaNet library for neural networks, training one with a function my own little human brain designed. But can we make the network learn without knowing the actual algorithm?

As an implementation of a genetic algorithm, I use simple-genetic-algorithm, which is, ahem, simple to use compared to some other packages. I've modified it, though, to use MonadRandom instead of having to thread through the RandomGen, so if you want to run the code make sure to take the version from my fork in the MonadRandom branch, until these changes are released by the maintainer on Hackage.

To use a genetic algorithm, we need two things: a representation of the structure we want to evolve that is suitable for the genetic transformations, and the implementation of these transformations (crossover and mutation) themselves.

For the first task, we convert the network to simple vectors:

-- | Convert a network for only vectors structures
toVectors :: (Floating (Vector a), Container Vector a, Floating a) => 
  Network a -> [([Vector a],Vector a)]
toVectors = map (tovs . layerToShowable) . layers
  where
    tovs (ShowableLayer ws bs) = (toRows ws,bs)

-- | Convert back to a network given a network definition and vector structures
fromVectors :: (Floating (Vector a), Container Vector a, Floating a) => 
  LayerDefinition a -> [([Vector a],Vector a)] -> Network a 
fromVectors ld = Network . map ((\sl-> showableToLayer (sl,ld)) . frvs)
  where
    frvs (vs,v)=ShowableLayer (fromRows vs) v

Note that we need a LayerDefinition to rebuild the network from the Vectors. Currently each layer of the network has the same definition and the algorithm does NOT evolve this structure, only the weights and biases.

We're going to keep that information along with some Worlds that we use for fitness testing:

-- | Store the network information as vectors
data NetworkData = NetworkData [([Vector Float],Vector Float)] [World]
  deriving (Show,Read,Eq,Ord)


Then we can implement the Chromosome class. Crossover takes the average of all weights and biases of the two parents, mutation changes one value randomly. Of course other implementations could be found.

-- | Chromosome instance
instance Chromosome NetworkData where
    -- Take average
    crossover (NetworkData xs ws) (NetworkData ys _) =
        return [ NetworkData (Prelude.zipWith zipW xs ys) ws]
        where
          zipW (vs1,v1) (vs2,v2) = (Prelude.zipWith zipW1 vs1 vs2,zipW1 v1 v2)
          zipW1 = V.zipWith (\x y -> (x + y) / 2) 

    -- Mutate one weight randomly
    mutation (NetworkData xs ws) = do
        xs' <- randomChange r1 xs
        return $ NetworkData xs' ws
      where
        randomChange _ [] =  return []
        randomChange f xs2 = do
          idx <- getRandomR (0, length xs2 - 1)
          mapM (\(i,x)->if i==idx then f x else return x) $ zip [0..] xs2
        r1 (vs,v) = do
          (v3:vs2) <- randomChange r2 (v:vs)
          return (vs2,v3)
        r2 v2 = do
          idx2 <- getRandomR (0, V.length v2 - 1)
          -- note: the mutation range is partly reconstructed from the
          -- garbled source; only the upper bound (20) survives there
          dx   <- getRandomR (-20, 20)
          return $ v2  V.// [(idx2,dx)]

    -- calculate fitness
    fitness (NetworkData xs ws) = sum (map (fitWorld xs) ws) / fromIntegral (length ws)

For the fitness function we calculate the fitness for each given world and average it. I'm not trying to be too clever with that code.
For each world we run the food searching algorithm from each corner, and evaluate how far we are from the target and, if we reached it, how long it took us. So networks that find the food will always rank higher than the ones that don't, and the quicker among them will rank higher again.

-- | Calculate fitness on a given world   
fitWorld :: [([Vector Float],Vector Float)] -> World -> Double
fitWorld dat w = sum (map fitCorner $ corners $ wSize w) / 4
  where
    fitCorner pos = 
      let network = fromVectors layerDef dat
          poss = algSteps w (neuralAlg w network) 50 pos
          endSmell = currentSmell w $ last poss
          possLength = length poss
      in fitFormula w endSmell possLength (distance pos $ wFood w)
      
-- | Fitness formula
fitFormula :: World -> Int -> Int -> Int -> Double
fitFormula w endSmell possLength dist = case fromIntegral endSmell / fromIntegral (wSmell w) of
    1 -> 2 + (fromIntegral dist / fromIntegral possLength)
    n -> n

Then we just need a stop condition: either a maximum number of generations or reaching the maximum possible fitness (shortest path found from all corners)

-- | Maximum for the fitness
maxFit :: Double
maxFit = fitFormula (World undefined undefined 10 undefined) 10 10 10

-- | Stop function
stopf ::  NetworkData -> Int -> IO Bool
stopf nd gen= return $ gen > 300 || fitness nd == maxFit

And code to generate random networks

-- | Build a random network data
buildNetworkData :: (Monad m,RandomGen g) => [World] -> RandT g m NetworkData 
buildNetworkData ws = do
  g <- getSplit
  let n = buildNetwork g
  return $ NetworkData (toVectors n) ws

We can then evolve our population of networks, say on two worlds, w and w2:

runGAIO 64 0.1 (buildNetworkData [w,w2]) stopf

And after a couple of minutes we get a network that can find the food from a different point than the tested corners:

Iteration:1
##########
#........#
#........#
#........#
#...X....#
#........#
#........#
#........#
#.......@#
#........#
#........#
##########
Iteration:2
##########
#........#
#........#
#........#
#...X....#
#........#
#........#
#......@.#
#........#
#........#
#........#
##########
Iteration:3
##########
#........#
#........#
#........#
#...X....#
#........#
#.....@..#
#........#
#........#
#........#
#........#
##########
Iteration:4
##########
#........#
#........#
#........#
#...X....#
#....@...#
#........#
#........#
#........#
#........#
#........#
##########
Iteration:5
##########
#........#
#........#
#........#
#...@....#
#........#
#........#
#........#
#........#
#........#
#........#
##########
Yummy!

Interested in all this? I recommend the ai-junkie tutorial, then! My code is of course on github. Feedback and suggestions welcome!

by JP Moresmau ([email protected]) at February 05, 2015 04:42 PM

FP Complete

FP Complete is Hiring Test Engineers

San Francisco, CA – Telecommute

FP Complete is excited to announce we are hiring again. Currently we are looking to hire creative software engineers to fill a couple testing roles we have open. You will be joining our Haskell development team to help us test and validate our products. Here is a little more about the positions we're looking to fill.

Haskell Test Software Engineer – Scientific Simulation

For this role we are looking for someone to help build our test and delivery capabilities. You will be working as a member of the development team providing direct input and support on product implementation, testing, and quality. Your mission is to innovate on the test infrastructure, enabling and implementing automated tests and test suites across multiple product components. Learn more here: Haskell Test Software Engineer – Scientific Simulation

Software Test Engineer – Scientific Medical Simulation

For this position we are looking for creative software test engineers to work on our scientific medical SaaS product. You will be working as a member of an international product team and you will be expected to provide direct input on product implementation, testing, and quality. Your mission is to represent the customer. You will learn the system from top to bottom, validating the product and making sure it delivers what the customer needs. Learn more here: Software Test Engineer – Scientific Medical Simulation

If you’d like to be part of our team and shape the future of Haskell, please send a resume or CV to [email protected]. Please include the title of the position you're applying for in the subject line.

February 05, 2015 10:00 AM

February 03, 2015

Well-Typed.Com

Monads: From Web 2.0 to Hardware Drivers

Monads are often considered to be a stumbling block for learning Haskell. Somehow they are thought of as scary and hard to understand. They are however nothing more than a design pattern, and a rather simple one at that. Monads give us a standardized way to glue a particular kind of computations together, passing the result of each computation to the next. As such they are useful well beyond functional programming.

JavaScript

My favourite example is the well-known “callback hell” in JavaScript. JavaScript functions often take a callback as argument, a function that they invoke on completion. Functions that take a callback argument are called asynchronous functions. The problem arises when we want to call another asynchronous function inside the callback; we end up with deeply nested callbacks.

Let’s consider an example. Suppose we want to collect configuration files for our application in the current directory, in the user’s home directory, and in the system-wide configuration directory. We might do something like

function getConfigFiles(callback) {
  function extractConfigFiles(filess) { ... }

  fs.readdir(".", function(hereErr, here) {
    fs.readdir(process.env.HOME, function(homeErr, home) {
      fs.readdir("/etc", function(etcErr, etc) {
        callback(extractConfigFiles([here, home, etc]));
      })
    })
  });
}

Since readdir is an asynchronous function, and we need to call it three times, we end up with this deeply nested structure. You can see how this might get out of hand. By contrast, in Haskell we would recognize that these kinds of asynchronous functions form a monad (the continuation monad, to be precise); after all, this fits the pattern: we want to glue asynchronous functions together, passing the result of each function to the next. In Haskell we would write the above code as

getConfigFiles = do
    here <- readdir "."
    home <- readdir homedir
    etc  <- readdir "/etc"
    return $ extractConfigFiles [here, home, etc]
  where
    extractConfigFiles = ...

This looks and feels like simple sequential code, but it isn't; this code is precisely equivalent to the JavaScript example above it. Note that there are tons of attempts to address callback hell in JavaScript, many of which are in fact inspired by monads.
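
To make the connection explicit, here's a minimal Haskell sketch (my own, not from the post: readdirCB is a stubbed stand-in for Node's fs.readdir, and homedir and extractConfigFiles are placeholders) showing how wrapping callback-style functions in the continuation monad transformer ContT recovers exactly the do-notation above:

import Control.Monad.Trans.Cont (ContT (..), runContT)

type Callback a = a -> IO ()

-- A callback-style directory listing in the Node.js style (stubbed out).
readdirCB :: FilePath -> Callback [FilePath] -> IO ()
readdirCB _dir k = k ["example.conf"]

-- Wrap it in ContT: the callback becomes the continuation, and the
-- asynchronous function becomes an ordinary monadic action.
readdir :: FilePath -> ContT () IO [FilePath]
readdir dir = ContT (readdirCB dir)

homedir :: FilePath
homedir = "/home/user"

extractConfigFiles :: [[FilePath]] -> [FilePath]
extractConfigFiles = concat

getConfigFiles :: ContT () IO [FilePath]
getConfigFiles = do
    here <- readdir "."
    home <- readdir homedir
    etc  <- readdir "/etc"
    return $ extractConfigFiles [here, home, etc]

main :: IO ()
main = runContT getConfigFiles print

The nesting of callbacks is still there at runtime, of course; the monad just hides the plumbing, which is exactly the point.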

Ziria

Let’s now move from the mainstream and high level to the more esoteric and low level. Ziria is a domain specific language designed at Microsoft Research specifically for the development of Software-defined radios. Well-Typed have been working with Microsoft Research over the last few months to improve the Ziria compiler (itself written in Haskell), primarily the front end (internal code representation, parser, scope resolution and type checking) and the optimizer.

Ziria is a two-level language. The expression language is a fairly standard imperative language. For example, we can write a function that computes factorial as

fun fac(n : int) {
  var result : int := 1;
  while(n > 0) {
    result := result * n;
    n := n - 1;
  }
  return result
}

The expression language contains all the usual suspects (mutable variables, arrays, structs, control flow constructs, etc.) as well as good support for bit manipulation and complex numbers.

The more interesting part of the language however is the computation language. Ziria programs define stream computations: programs that transform an incoming stream of bits into an outgoing stream of bits. In particular, Ziria is designed so that we can write such stream computations in a high level way and yet end up with highly performant code; Ziria’s flagship example is an implementation of the WiFi 802.11a/g protocol that is able to operate in realtime.

The simplest way to turn our simple factorial computation into a stream transformer is to map our fac function:

let comp streamFac = map fac

But there are also other ways to build up stream computations. There are two fundamental primitive stream computations: take reads an element from the input stream, and emit writes an element to the output stream. Of course now we need a way to glue stream computations together; you guessed it, they form a monad. For example, here is a stream processor which creates an output stream such that each element of the output stream is the sum of all preceding elements of the input stream, starting with some initial value init:

fun comp sum(init : int) {
  var total : int := init;
  repeat {
    x <- take;
    do { total := total + x }
    emit total;
  }
}

As a second example, consider this function which copies the first n elements from the input stream to the output stream and then terminates, returning the last element it wrote to the output stream:

fun comp prefix(n : int) {
  var last : int;
  times n {
    x <- take;
    do { last := x }
    emit x;
  }
  return last
}

This kind of monadic composition where the result of one stream computation is passed to another is known as “control path composition” in the Ziria literature. We also have “data path composition”, where the output stream of one processor becomes the input stream of another. For example, consider

let comp compositionExample = {
  var total : int := 0;
  repeat {
    newTotal <- sum(total) >>> prefix(5);
    do { total := newTotal - 1 }
  }
}

We use data path composition (>>>) to make the output stream of the sum stream transformer the input stream of the prefix stream computer. We then use control path composition to update a local variable with the last value that was written minus one, and we repeat. The effect is that we sum the input stream, but decrement the running total by one every 5 elements. So, given an input stream

1,2,3,4,5,6,7,8,9,10

we output

1,3,6,10,15,20,27,35,44,54

Ziria also offers a variant operator (|>>>|) which makes sure that the computations performed by the two stream processors happen in parallel.

A Haskell perspective. Stream computers can be compared to Haskell pipes, where a stream computer is something of type Pipe a b IO r, with an input stream of type a, an output stream of type b and a result of type r; do is the equivalent of liftIO. Control path composition corresponds to monadic or “horizontal” composition, while data path composition corresponds to vertical composition.
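
To make the analogy concrete, here is a rough pipes translation of the sum and prefix stream computers above (my own sketch, not code from the Ziria project; sumP and prefixP are made-up names):

import Pipes

-- sum(init): emit a running total of the input stream.
sumP :: Monad m => Int -> Pipe Int Int m r
sumP = go
  where
    go total = do
      x <- await
      let total' = total + x
      yield total'
      go total'

-- prefix(n): copy n elements to the output stream, then terminate,
-- returning the last element written (assumes n >= 1).
prefixP :: Monad m => Int -> Pipe Int Int m Int
prefixP n = go n 0
  where
    go 0 lastX = return lastX
    go k _     = do
      x <- await
      yield x
      go (k - 1) x

-- Data path composition (Ziria's >>>) is pipes' (>->); the composite
-- returns the result of prefixP, just as in the Ziria example.
sumThenPrefix :: Monad m => Int -> Pipe Int Int m Int
sumThenPrefix total = sumP total >-> prefixP 5

Control path composition is then just (>>=) on the Pipe monad, so the repeat loop in compositionExample corresponds to ordinary monadic recursion.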

Optimization

Monads come with laws: properties that must hold true for all monadic code. The Ziria compiler takes advantage of these laws in the optimizer. For example, suppose you write

x <- take
y <- return (x + 1)
emit y

At this point the so-called “left identity” law kicks in, and the compiler rewrites this to

x <- take
let y = x + 1
emit y

which will then subsequently be cleaned up by the inliner. Other optimizations applied by the Ziria compiler include loop unrolling, partial evaluation, etc. It also uses a simple SMT solver to remove unnecessary conditionals, following the approach we described in a previous blogpost.

One optimization deserves special mention, although it’s not related to monads per se. The vectorization optimization turns a stream computer that takes single elements from the input stream and outputs single elements to the output stream into a stream computer which takes arrays of elements from the input stream and outputs arrays of elements to the output stream, so that the resulting code can be further optimized to operate on multiple elements simultaneously and to reduce the overhead involved from reading and writing to stream buffers.

For example, consider the following Ziria program:

fun sumArray(xs : arr int) {
  var total : int := 0;
  for i in [0, length(xs)] {
    total := total + xs[i];
  }
  return total;
}

let comp sum4 = {
  repeat {
    xs <- takes 4;
    emit sumArray(xs);
  }
}

let comp stutter = {
  repeat {
    x <- take;
    emit x;
    emit x;
  }
}

let comp stutterSum = sum4 >>> stutter

Computation sum4 takes 4 elements from the input stream, sums them up, and emits the result; we say that the cardinality of sum4 is 4:1. Computation stutter writes every element in the input stream twice to the output stream; we say that its cardinality is 1:2; the cardinality of the composition stutterSum is therefore 2:1. The optimizer turns this program into this (cleaned up for readability only):

fun sum4_vect(xs : arr[288] int) {
  var ys : arr[72] int
  for i in [0, 72] {
    let xs : arr[4] int = xs[i*4:+4]
    var total : int
    total := xs[0];
    total := total+xs[1];
    total := total+xs[2];
    total := total+xs[3];
    ys[i] := total
  };
  return ys
}

fun stutter_vect(xs : arr[72] int) {
  var ys : arr[144] int
  for i in [0, 72] {
    ys[i*2]   := xs[i];
    ys[i*2+1] := xs[i]
  };
  return ys
}

let comp stutterSum = map sum4_vect >>> map stutter_vect

The optimizer has done an excellent job here. Both sum4 and stutter have become expressions, rather than computations, that are passed as arguments to map, which can be optimized better in the code generator; sum4_vect now takes arrays of 288 elements and returns arrays of 72 elements (4:1), with its inner 4-element summation unrolled, while stutter_vect takes arrays of 72 elements and returns arrays of 144 elements (2:1).

This ability of the Ziria compiler to optimize the code for different kinds of pipeline widths is one of the reasons that it is possible to write software-defined radios in Ziria in a high level manner; with other approaches such as Sora this kind of optimization had to be done by hand. The Ziria compiler also does a number of other optimizations; for a more detailed discussion, see Ziria: A DSL for wireless systems programming.

Conclusion

Monads aren’t an obscure concept that has been invented just to work around peculiarities of Haskell. They are a very general and universal design principle with applications everywhere. The concept of monads has been found useful in JavaScript, C++, Java, Scala, C#, Ruby, Rust, Go, and many other languages. Recognizing the monad pattern when it arises in your code can lead to nicer, more readable and easier to maintain designs, and recognizing the monad pattern in language design helps programmers do this. In Ziria monads turn out to be precisely the right abstraction that makes it possible to have a language in which we can write software-defined radios at a very high level of abstraction and which can yet be compiled down to highly efficient code.

by edsko at February 03, 2015 10:27 AM

February 02, 2015

Robert Harper

Summer of Programming Languages

Having just returned from the annual Oregon Programming Languages Summer School, at which I teach every year, I am once again impressed with the growth in the technical sophistication of the field and with its ability to attract brilliant young students whose enthusiasm and idealism are inspiring.  Eugene was, as ever, an ideal setting for the summer school, providing a gorgeous environment for work and relaxation.  I was particularly glad for the numerous chances to talk with students outside of the classroom, usually over beer, and I enjoyed, as usual, the superb cycling conditions in Eugene and the surrounding countryside.  Many students commented to me that the atmosphere at the summer school is wonderful, filled with people who are passionate about programming languages research, and suffused with a spirit of cooperation and sharing of ideas.

Started by Zena Ariola a dozen years ago, this year's instance was organized by Greg Morrisett and Amal Ahmed in consultation with Zena.  As usual, the success of the school depended critically on the dedication of Jim Allen, who has been the de facto chief operating officer since its inception.  Without Jim, OPLSS could not exist.  His attention to detail and his engagement with the students are legendary.  Support from the National Science Foundation CISE Division, ACM SIGPLAN, Microsoft Research, Jane Street Capital, and BAE Systems was essential for providing an excellent venue, for supporting a roster of first-rate lecturers, and for supporting the participation of students who might otherwise not have been able to attend.  And, of course, an outstanding roster of lecturers donated their time to come to Eugene for a week to share their ideas with the students and their fellow lecturers.

The schedule of lectures is posted on the web site, all of which were taped, and are made available on the web.  In addition many speakers provided course notes, software, and other backing materials that are also available online.  So even if you were not able to attend, you can still benefit from the summer school, and perhaps feel more motivated to come next summer.  Greg and I will be organizing, in consultation with Zena.  Applying the principle “don’t fix what isn’t broken”, we do not anticipate major changes, but there is always room for improvement and the need to freshen up the content every year.  For me the central idea of the summer school is the applicability of deep theory to everyday practice.  Long a dream held by researchers such as me, these connections become more “real” every year as the theoretical abstractions of yesterday become the concrete practices of today.  It’s breathtaking to see how far we’ve come from the days when I was a student just beginning to grasp the opportunities afforded by ideas from proof theory, type theory, and category theory (the Holy Trinity) to building beautiful software systems.  No longer the abstruse fantasies of mad (computer) scientists, these ideas are the very air we breathe in PL research.  Gone are the days of ad hoc language designs done in innocence of the foundations on which they rest.  Nowadays serious industrial-strength languages are emerging that are grounded in theory and informed by practice.

Two examples have arisen just this summer, Rust (from Mozilla) and Swift (from Apple), that exemplify the trend.  Although I have not had time to study them carefully, much less write serious code using them, it is evident from even a brief review of their web sites that these are serious languages that take account of the academic developments of the last couple of decades in formulating new language designs to address new classes of problems that have arisen in programming practice.  These languages are type safe, a basic criterion of sensibility, and feature sophisticated type systems that include ideas such as sum types, which have long been missing from commercial languages, or provided only in comically obtuse ways (such as objects).  The infamous null pointer mistakes have been eradicated, and the importance of pattern matching (in the sense of the ML family of languages) is finally being appreciated as the cure for Boolean blindness.  For once I can look at new industrial languages without an overwhelming sense of disappointment, but instead with optimism and enthusiasm that important ideas are finally, at long last, being recognized and adopted.  As has often been observed, it takes 25 years for an academic language idea to make it into industrial practice.  With Java it was simply the 1970's idea of automatic storage management; with languages such as Rust and Swift we are seeing ideas from the 80's and 90's make their way into industrial practice.  It's cause for celebration, and encouragement for those entering the field: the right ideas do win out in the end, one just has to have the courage to be irrelevant.

I hope to find the time to comment more meaningfully on the recent developments in practical programming languages, including Rust and Swift, but also languages such as Go and OCaml that are also making inroads into programming practice.  (The overwhelming success and future dominance of Haskell is self-evident.  Kudos!) But for now, let me say that the golden age of programming language research is here and now, and promises to continue indefinitely.

Update: word smithing.


Filed under: Programming, Research, Teaching Tagged: OPLSS14, programming languages

by Robert Harper at February 02, 2015 03:51 PM

Ian Ross

Non-diffusive atmospheric flow #10: significance of flow patterns

The spherical PDF we constructed by kernel density estimation in the article before last appeared to have “bumps”, i.e. it’s not uniform in $\theta$ and $\phi$. We’d like to interpret these bumps as preferred regimes of atmospheric flow, but before we do that, we need to decide whether these bumps are significant. There is a huge amount of confusion that surrounds this idea of significance, mostly caused by blind use of “standard recipes” in common data analysis cases. Here, we have some data analysis that’s anything but standard, and that will rather paradoxically make it much easier to understand what we really mean by significance.

So what do we mean by “significance”? A phenomenon is significant if it is unlikely to have occurred by chance. Right away, this definition raises two questions. First, chance implies some sort of probability, a continuous quantity, which leads to the idea of different levels of significance. Second, if we are going to be thinking about probabilities, we are going to need to talk about a distribution for those probabilities. The basic idea is thus to compare our data to a distribution that we explicitly decide based on a null hypothesis. A bump in our PDF will be called significant if it is unlikely to have occurred in data generated under the assumptions in our null hypothesis.

So, what’s a good null hypothesis in this case? We’re trying to determine whether these bumps are a significant deviation from “nothing happening”. In this case, “nothing happening” would mean that the data points we use to generate the PDF are distributed uniformly over the unit sphere parametrised by $\theta$ and $\phi$, i.e. that no point in $(\theta, \phi)$ space is any more or less likely to occur than any other. We’ll talk more about how we make use of this idea (and how we sample from our “uniform on the sphere” null hypothesis distribution) below.

We thus want some way of comparing the PDF generated by doing KDE on our data points to PDFs generated by doing KDE on “fake” data points sampled from our null hypothesis distribution. We’re going to follow a sampling-based approach: we will generate “fake” data sets, do exactly the same analysis we did on our real data points to produce “fake” PDFs, then compare these “fake” PDFs to our real one (in a way that will be demonstrated below).

There are a couple of important things to note here, which I usually think of under the heading of “minimisation of cleverness”. First, we do exactly the same analysis on our “fake” data as we do on our “real” data. That means that we can treat arbitrarily complex chains of data analysis without drama: here, we’re doing KDE, which is quite complicated from a statistical perspective, but we don’t really need to think about that, because the fact that we treat the fake and real data sets identically means that we’re comparing like with like and the complexity just “divides out”. Second, because we’re simulating, in the sense that we generate fake data based on a null hypothesis on the data and run it through whatever data analysis steps we’re doing, we make any assumptions that go into our null hypothesis perfectly explicit. If we assume Gaussian data, we can see that, because we’ll be sampling from a Gaussian to generate our fake data. If, as in this case, our null hypothesis distribution is something else, we’ll see that perfectly clearly because we sample directly from that distribution to generate our fake data.

This is a huge contrast to the usual case for “normal” statistics, where one chooses some standard test ($t$-test, Mann-Whitney test, Kolmogorov-Smirnov test, and so on) and turns a crank to produce a test statistic. In this case, the assumptions inherent in the form of the test are hidden – a good statistician will know what those assumptions are and will understand the consequences of them, but a bad statistician (I am a bad statistician) won’t and will almost certainly end up applying tests outside of the regime where they are appropriate. You see this all the time in published literature: people use tests that are based on the assumption of Gaussianity on data that clearly isn’t Gaussian, people use tests that assume particular variance structures that clearly aren’t correct, and so on. Of course, there’s a very good reason for this. The kind of sampling-based strategy we’re going to use here needs a lot of computing power. Before compute power was cheap, standard tests were all that you could do. Old habits die hard, and it’s also easier to teach a small set of standard tests than to educate students in how to design their own sampling-based tests. But we have oodles of compute power, we have a very non-standard situation, and so a sampling-based approach allows us to sidestep all the hard thinking we would have to do to be good statisticians in this sense, which is what I meant by “minimisation of cleverness”.

So, we’re going to do sampling-based significance testing here. It is shockingly easy to do and, if you’ve been exposed to the confusion of “traditional” significance testing, shockingly easy to understand.
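
Schematically, the whole procedure looks something like this (a sketch of my own with made-up names, not the article’s actual code):

import Data.List (transpose)

-- Generic sampling-based significance test: run the identical analysis
-- pipeline on the real data and on many realisations of the null
-- hypothesis, then see how often the null results fall below the real
-- ones.
significance
  :: Int                    -- number of null hypothesis realisations
  -> IO dataset             -- sample one fake data set
  -> (dataset -> [Double])  -- full analysis pipeline (e.g. KDE on a grid)
  -> dataset                -- the real data
  -> IO [Double]            -- per grid point: fraction of fake values below the real value
significance n sampleNull analyse real = do
  let realVals = analyse real
  fakeVals <- mapM (const (fmap analyse sampleNull)) [1 .. n]
  return [ fromIntegral (length (filter (< r) fs)) / fromIntegral n
         | (r, fs) <- zip realVals (transpose fakeVals) ]

The important point is that analyse is the same function applied to real and fake data alike: all the complexity of the KDE step “divides out”, just as described above.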

Simulating from the null hypothesis

In order to test the significance of the bumps we see in our spherical PDF, we need some way of sampling points from our null hypothesis distribution, i.e. from a probability distribution that is uniform on the unit sphere. The simplest way to do this is to sample points from a spherically symmetric three-dimensional probability distribution, then project those points onto the unit sphere. The most suitable three-dimensional distribution, at least from the point of view of convenience, is a three-dimensional Gaussian distribution with zero mean and unit covariance matrix. This is particularly convenient because if we sample points 𝐮 = (x, y, z) from this distribution, each of the coordinates x, y and z is individually distributed as a standard Gaussian, i.e. x ∼ 𝒩(0, 1), y ∼ 𝒩(0, 1), z ∼ 𝒩(0, 1). To generate N random points uniformly distributed on the unit sphere, we can thus just generate 3N standard random deviates, partition them into 3-vectors and normalise each vector. Haskell code to do this sampling using the mwc-random package is shown below – here, nData is the number of points we want to sample, and the randPt function generates a single normalised (x, y, z) as a Haskell 3-tuple (as usual, the code is in a Gist; this is from the make-unif-pdf-sample.hs program):

  -- Random data point generation.
  gen <- create
  let randPt gen = do
        unnorm <- SV.replicateM 3 (standard gen)
        let mag = sqrt $ unnorm `dot` unnorm
            norm = scale (1.0 / mag) unnorm
        return (norm ! 0, norm ! 1, norm ! 2)

  -- Create random data points, flatten to vector and allocate on
  -- device.
  dataPts <- replicateM nData (randPt gen)

If we sample the same number of points from this distribution that we have in our real data and then use the same KDE approach that we used for the real data to generate an empirical PDF on the unit sphere, what do we get? Here’s what one distribution generated by this procedure looks like, using the same colour scale as the “real data” distribution in the earlier article to aid in comparison (darker colours show regions of greater probability density):


Spherical PDF for null hypothesis data


We can see that our sample from the null hypothesis distribution also has “bumps”, although they seem to be less prominent than the bumps in the PDF for our real data. Why do we see bumps here? Our null hypothesis distribution is uniform, so why is the simulated empirical PDF bumpy? The answer, of course, is sampling variation. If we sample 9966 points on the unit sphere, we are going to get some clustering of points (leading to bumps in the KDE-derived distribution) just by chance. Those chance concentrations of points are what lead to the bumps in the plot above.
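
If you want to convince yourself that there’s nothing mysterious going on, here’s a quick stand-alone check (not part of the analysis code): draw 9966 values from a perfectly uniform distribution on the unit interval and histogram them with the histogram_ and range functions from the statistics package that we’ll also use below – the bin counts come out noticeably uneven, purely through sampling variation:

  -- Sampling variation in miniature: even a uniform sample of this size
  -- gives uneven bin counts (a quick stand-alone check, not part of the
  -- analysis programs).
  import System.Random.MWC (createSystemRandom, uniform)
  import qualified Data.Vector.Unboxed as V
  import Statistics.Sample.Histogram (histogram_, range)

  main :: IO ()
  main = do
    gen <- createSystemRandom
    xs <- V.replicateM 9966 (uniform gen) :: IO (V.Vector Double)
    -- With 20 bins we'd expect about 498 values per bin if things were
    -- perfectly even; the actual counts scatter around that value.
    let (lo, hi) = range 20 xs
        counts = histogram_ 20 lo hi xs :: V.Vector Double
    print counts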

What we ultimately want to do then is to answer the question: how likely is it that the bumps in the distribution of our real data could have arisen by chance, assuming that our real data arose from a process matching our null hypothesis?

Testing for significance

The way we’re going to answer the question posed in the last section is purely empirically. We’re going to generate empirical distributions (histograms) of the possible values of the null hypothesis distribution to get a picture of the sampling variability that is possible, then we’re going to look at the values of our “real data” distribution and calculate the proportion of the null hypothesis distribution values less than the real data distribution values. This will give us the probability that our real data distribution could have arisen by chance if the data really came from the null hypothesis distribution.

In words, it sounds complicated. In reality and in code, it’s not. First, we generate a large number of realisations of the null hypothesis distribution, by sampling points on the unit sphere and using KDE to produce PDFs from those point distributions in exactly the same way that we did for our real data, as shown here (code from the make-hist.hs program):

  -- Generate PDF realisations.
  pdfs <- forM [1..nrealisations] $ \r -> do
    putStrLn $ "REALISATION: " ++ show r

    -- Create random data points.
    dataPts <- SV.concat <$> replicateM nData (randPt gen)
    SV.unsafeWith dataPts $ \p -> CUDA.pokeArray (3 * nData) p dDataPts

    -- Calculate kernel values for each grid point/data point
    -- combination and accumulate into grid.
    CUDA.launchKernel fun gridSize blockSize 0 Nothing
      [CUDA.IArg (fromIntegral nData), CUDA.VArg dDataPts, CUDA.VArg dPdf]
    CUDA.sync
    res <- SVM.new (ntheta * nphi)
    SVM.unsafeWith res $ \p -> CUDA.peekArray (ntheta * nphi) dPdf p
    unnormv <- SV.unsafeFreeze res
    let unnorm = reshape nphi unnormv

    -- Normalise.
    let int = dtheta * dphi * sum (zipWith doone sinths (toRows unnorm))
    return $ cmap (realToFrac . (/ int)) unnorm :: IO (Matrix Double)

(This is really the key aspect of this sampling-based approach: we perform exactly the same data analysis on the test data sampled from the null hypothesis distribution that we perform on our real data.) We generate 10,000 realisations of the null hypothesis distribution (stored in the pdfs value), in this case using CUDA to do the actual KDE calculation, so that it doesn’t take too long.

Then, for each spatial point on our unit sphere, i.e. each point in the (θ, ϕ) grid that we’re using, we collect the values that each of our null hypothesis PDF realisations takes at that point – this is the samples value in this code:

  -- Convert to per-point samples and generate per-point histograms.
  let samples = [SV.generate nrealisations (\s -> (pdfs !! s) ! i ! j) :: V |
                 i <- [0..ntheta-1], j <- [0..nphi-1]]
      (rngmin, rngmax) = range nbins $ SV.concat samples
      hists = map (histogram_ nbins rngmin rngmax) samples :: [V]
      step i = rngmin + d * fromIntegral i
      d = (rngmax - rngmin) / fromIntegral nbins
      bins = SV.generate nbins step

For each point on our grid on the unit sphere, we then calculate a histogram of the samples from the 10,000 empirical PDF realisations, using the histogram_ function from the statistics package. We use the same bins for all the histograms to make life easier in the next step.

There’s one thing that’s worth commenting on here. You might think that we’re doing excess work. Our null hypothesis distribution is spherically symmetric, so shouldn’t the histograms be the same for all points on the unit sphere? Well, should they? Or might the exact distribution of samples depend on θ, since the (θ, ϕ) grid cells will be smaller at the poles of our unit sphere than at the equator? Well, to be honest, I don’t know. And I don’t really care. By taking the approach I’m showing you here, I don’t need to worry about that question, because I’m generating independent histograms for each grid cell on the unit sphere, so my analysis is immune to any effects related to grid cell size. Furthermore, this approach also enables me to change my null hypothesis if I want to, without changing any of the other data analysis code. What if I decide that this spherically symmetric null hypothesis is too weak? What if I want to test my real data against the hypothesis that there is a spherically symmetric background distribution of points on my unit sphere, plus a couple of bumps (of specified amplitude and extent) representing what I think are the most prominent patterns of atmospheric flow? That’s quite a complicated null hypothesis, but as long as I can define it clearly and sample from it, I can use exactly the same data analysis process that I’m showing you here to evaluate the significance of my real data compared to that null hypothesis. (And sampling from a complicated distribution is usually easier than doing anything else with it. In this case, I might say what proportion of the time I expect to be in each of my bump or background regimes; for the background I can sample uniformly on the sphere, and for the bumps I can sample from a Kent distribution¹.)
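
To make that last point concrete, sampling a single point from such a “background plus bumps” mixture might be sketched like this (just a sketch: the bump sampler stands in for a Kent distribution sampler, which I haven’t written, and the background sampler would be the Gaussian-projection randPt function from earlier):

  -- Sketch of sampling one point from a "background plus bumps" null
  -- hypothesis.  The bump sampler is a stand-in for a Kent distribution
  -- sampler (not implemented here); the background sampler is the
  -- uniform-on-sphere one shown earlier.
  import System.Random.MWC (GenIO, uniform)

  type Pt = (Double, Double, Double)   -- a point on the unit sphere

  sampleMixture :: Double              -- proportion of points from the background
                -> (GenIO -> IO Pt)    -- background: uniform-on-sphere sampler
                -> (GenIO -> IO Pt)    -- bump: e.g. a Kent distribution sampler
                -> GenIO
                -> IO Pt
  sampleMixture pBackground background bump gen = do
    u <- uniform gen :: IO Double
    if u < pBackground
      then background gen              -- background regime
      else bump gen                    -- bump regime

The rest of the pipeline – KDE, per-point histograms, significance – wouldn’t need to change at all.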

Once we have the histograms for each grid point on the unit sphere, we can calculate the significance of the values of the real data distribution (this is from the make-significance.hs program – I split these things up to make checking what was going on during development easier):

  -- Split histogram values for later processing.
  let nhistvals = SV.length histvals
      oneslice i = SV.slice i nbin histvals
      histvecs = map oneslice [0, nbin.. nhistvals - nbin]
      hists = A.listArray ((0, 0), (ntheta-1, nphi-1)) histvecs
      nrealisations = SV.sum $ hists A.! (0, 0)

  -- Calculate significance values.
  let doone :: CDouble -> CDouble -> CDouble
      doone dith diph =
        let ith = truncate dith ; iph = truncate diph
            pdfval = pdf ! ith ! iph
            hist = hists A.! (ith, iph)
            pdfbin0 = truncate $ (pdfval - minbin) / binwidth
            pdfbin = (pdfbin0 `max` 0) `min` (nbin - 1)
        in (SV.sum $ SV.take pdfbin hist) / nrealisations
      sig = build (ntheta, nphi) doone :: Matrix CDouble

We read the histograms into the histvals value from an intermediate NetCDF file and build an array of histograms indexed by grid cell indexes in the θ and ϕ directions. Then, for each grid cell, we determine which histogram bin the relevant value from the real data distribution falls into, and sum the counts in that cell’s histogram over all bins below that one. Dividing this sum by the total number of null hypothesis distribution realisations used to construct the histograms gives us the fraction of null hypothesis distribution values for this grid cell that are smaller than the actual value from the real data distribution.
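
Stripped of the NetCDF reading and the array plumbing, the per-grid-point calculation is equivalent (up to the binning of the null hypothesis samples) to something as simple as this little function, which works directly on the raw per-point samples rather than on histograms (just a sketch, not the actual make-significance.hs code):

  -- Per-grid-point significance, sketched directly on the raw samples
  -- instead of on histograms (equivalent up to binning).
  import qualified Data.Vector.Storable as SV

  pointSignificance :: SV.Vector Double  -- null hypothesis PDF values at one grid point
                    -> Double            -- real data PDF value at the same grid point
                    -> Double            -- fraction of null values below the real value
  pointSignificance nullVals realVal =
    fromIntegral (SV.length (SV.filter (< realVal) nullVals))
      / fromIntegral (SV.length nullVals)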

For instance, if the real data distribution value is greater than 95% of the values generated by the null hypothesis distribution simulation, then we say that we have a significance level of 95% at that point on the unit sphere. We can plot these significance levels in the same way that we’ve been plotting the spherical PDFs. Here’s what those significance levels look like, choosing contour levels for the plot to highlight the most significant regions, i.e. the regions least likely to have occurred by chance if the null hypothesis is true:


Significance levels for spherical PDF


In particular, we see that each of the three bumps picked out with labels in the “real data” PDF plot in the earlier article are among the most significant regions of the PDF according to this analysis, being larger than 99.9% of values generated from the null hypothesis uniform distribution.

It’s sort of traditional to try to use some other language to talk about these kinds of results, giving specific terminological meanings to the words “significance levels” and “p-values”, but I prefer to keep away from that because, as was the case for the terminology surrounding PCA, the “conventional” choices of words are often confusing, either because no-one can agree on what the conventions are (as for PCA) or because the whole basis for setting up the conventions is confusing. In the case of hypothesis testing, there are still papers being published in statistical journals arguing about what significance and p-values and hypothesis testing really mean, nearly 100 years after these ideas were first outlined by Ronald Fisher and others. I’ve never been sure enough about what all this means to be comfortable using the standard terminology, but the sampling-based approach we’ve used here makes it much harder to get confused – we can say “our results are larger than 99.9% of results that could be encountered as a result of sampling variability under the assumptions of our null hypothesis”, which seems quite unambiguous (if a little wordy!).

In the next article we’ll take a quick look at what these “bumps” in our “real data” PDF represent in terms of atmospheric flow.


  1. This is a sort of “bump” distribution on the unit sphere (see on Wikipedia).


February 02, 2015 07:35 AM

Douglas M. Auclair (geophf)

January 2015 1HaskellADay Problems and Solutions


January 2015
  • January 30th, 2015: Catted Prime Factors http://lpaste.net/119587 Today's answer shows us composed primes. http://lpaste.net/119639 No. Wait. Primes aren't composed, by definition, right? Well, anyway...
  • January 29th, 2015: For today's haskell problem we peace-out by making triangles, http://lpaste.net/119524 not war. And our solution: http://lpaste.net/119562 make my day, punk! ... with triangles (extra-spicy, please)! Bleh, our non-partial solution: no fun, no triangles! http://lpaste.net/119562 Can you fix it?
  • January 28th, 2015: Criss-cross; http://lpaste.net/119353 APPLESAUCE! ... but not Christopher Cross for today's Haskell problem. Anndddd, safety first, or don't cross that yellow line! http://lpaste.net/119434 for today's solution.
  • January 27th, 2015: What's 'All in the Family'? http://lpaste.net/119319 Elegance is. Today's Haskell problem is about a little bit more elegance. And lines. And triangles. Our solution http://lpaste.net/119326 involves slopes and stuff ... from Elementary School. Remember?
  • January 26th, 2015: Just a simple counting-triangles problem using Data.Graph for today's #haskell problem http://lpaste.net/119228 In our solution we put the triangles into a Graph, counting crows http://lpaste.net/119256 ... I meant: 'triangles.'
  • January 23rd, 2015: A simple request on this fine Friday: solve the 'aug'mented matrix for today's #haskell problem http://lpaste.net/119049 Row-echelon solver SOLVED! (WOOT!s included for free!) lpaste.net/119102
  • January 22nd, 2015: Row-echelon form; not solved yet, http://lpaste.net/118964 but PDC! ('pretty durn closer!') Aug! Our matrix has been row-echeloned! http://lpaste.net/119045 A solution for today's Haskell problem.
  • January 21st, 2015: Vectors are Num(bers), too ... adding two (same-length) vectors for today's Haskell problem http://lpaste.net/118906 ... and rollin' the vector-addition solution is posted at http://lpaste.net/118907
  • January 20th, 2015: May I introduce Miss Abby Normal http://lpaste.net/118862 for today's Haskell problem? Vector normalization for reduced row-echelon form. NORM! A solution to today's Haskell problem is CHEERSfully posted at http://lpaste.net/118898
  • January 19th, 2015: Oh, something new for today's #haskell problem? Well, then: how about a ... MATH PROBLEM! ... 'sorta' Eheh. http://lpaste.net/118714 We have a guest who decided to solve the problem: Þor (Thor). http://lpaste.net/118729 THOR SMASH! No. Wait. That's the Hulk. Never mind.
  • January 16th, 2015: Let it snow, let it snow, let it matrix-solver! for today's #haskell problem with (Ooh! Ooh!) pretty matrix pictures! http://lpaste.net/118571 Picture. Terminal. Snowflake-matrix. BlinchtenLightzen.
  • January 15th, 2015: n, k, and b are all you need (oh, and the answers) for today's Haskell problem http://lpaste.net/118488
  • January 14th, 2015: More fun with fractals for today's #haskell problem: plotting the Harriss Spiral http://lpaste.net/118414
  • January 13th, 2015: Yesterday we did not learn about the (constructive?) truth of 2+2 = 4. Sorry. Today we try to equate two circled rows http://lpaste.net/118367
  • January 9th, 2015: In which it is shown that snarkiness is unavoidable when @geophf presents a Mensa-puzzle for today's #haskell problem http://lpaste.net/118121
  • January 8th, 2015: Cressida, a smart little girl (in fact: a Mensan), would like you to determine her age for today's #haskell problem http://lpaste.net/118059
  • January 7th, 2015: Fibbing down-low on the fibs! for today's haskell problem lpaste.net/118013
  • January 5th, 2015: I'm looking over a 4-leaf tuple/That I overlooked before! #haskell problem today: over definition for Lens-on-tuples http://lpaste.net/117906 A solution is 'overdefined' by @bretthall at http://lpaste.net/117934
  • January 4th, 2015: It's a new year, and I've just woken up and smelled the Lens-coffee! Bob Marley welcomes you to the New Year with a #haskell problem about #lenses http://lpaste.net/117815 Sing with me! set _2 "love" (1, 2) A solution is offered by @bretthall at http://lpaste.net/117933

by geophf ([email protected]) at February 02, 2015 06:03 AM