October 08, 2015

Gabriel Gonzalez

The Haskell community self-selects for people interested in unique things that Haskell can do that other languages cannot do. Consequently, a large chunk of Haskell example code in the wild uses advanced idioms (and I'm guilty of that, too).

So I would like to swing the pendulum in the other direction by just writing five small but useful programs without any imports, language extensions, or advanced features. These are programs that you could write in any other language and that's the point: you can use Haskell in the same way that you use other languages.

They won't be the prettiest or most type-safe programs, but that's okay.

Example #1: TODO program

This program is an interactive TODO list:

putTodo :: (Int, String) -> IO ()putTodo (n, todo) = putStrLn (show n ++ ": " ++ todo)prompt :: [String] -> IO ()prompt todos = do    putStrLn ""    putStrLn "Current TODO list:"    mapM_ putTodo (zip [0..] todos)    command <- getLine    interpret command todosinterpret :: String -> [String] -> IO ()interpret ('+':' ':todo) todos = prompt (todo:todos)interpret ('-':' ':num ) todos =    case delete (read num) todos of        Nothing -> do            putStrLn "No TODO entry matches the given number"            prompt todos        Just todos' -> prompt todos'interpret  "q"           todos = return ()interpret  command       todos = do    putStrLn ("Invalid command: " ++ command ++ "")    prompt todosdelete :: Int -> [a] -> Maybe [a]delete 0 (_:as) = Just asdelete n (a:as) = do    let n' = n - 1    as' <- n' seq delete n' as    return (a:as')delete _  []    = Nothingmain = do    putStrLn "Commands:"    putStrLn "+ <String> - Add a TODO entry"    putStrLn "- <Int>    - Delete the numbered entry"    putStrLn "q          - Quit"    prompt []

Example usage:

Example #4 - Decode RNA

This program converts an RNA sequence read from standard input into the equivalent sequence of amino acids written to standard output, using the genetic code:

data RNA = A | U | C | G    deriving (Read)data AminoAcid    = Ala | Cys | Asp | Glu | Phe | Gly | His | Ile | Lys | Leu    | Met | Asn | Pro | Gln | Arg | Ser | Thr | Val | Trp | Tyr    | Stop    deriving (Show)decode :: RNA -> RNA -> RNA -> AminoAcid decode U U U = Phedecode U U C = Phedecode U U A = Leudecode U U G = Leudecode U C _ = Serdecode U A U = Tyrdecode U A C = Tyrdecode U A A = Stopdecode U A G = Stopdecode U G U = Cysdecode U G C = Cysdecode U G A = Stopdecode U G G = Trpdecode C U _ = Leudecode C C _ = Prodecode C A U = Hisdecode C A C = Hisdecode C A A = Glndecode C A G = Glndecode C G _ = Argdecode A U U = Iledecode A U C = Iledecode A U A = Iledecode A U G = Metdecode A C _ = Thrdecode A A U = Asndecode A A C = Asndecode A A A = Lysdecode A A G = Lysdecode A G U = Serdecode A G C = Serdecode A G A = Argdecode A G G = Argdecode G U _ = Valdecode G C _ = Aladecode G A U = Aspdecode G A C = Aspdecode G A A = Gludecode G A G = Gludecode G G _ = GlydecodeAll :: [RNA] -> [AminoAcid]decodeAll (a:b:c:ds) = decode a b c : decodeAll dsdecodeAll  _         = []main = do    str <- getContents    let rna :: [RNA]        rna = map (\c -> read [c]) str    let aminoAcids :: [AminoAcid]        aminoAcids = decodeAll rna    putStrLn (concatMap show aminoAcids)

Example usage:

Conclusion

If you would like to contribute a simple example of your own, try sharing a paste of your program under the #Haskell #BackToBasics hashtags.

Benefits of Investing in Cycling

Benefits of Investing in Cycling, a report by Dr. Rachel Aldred, Senior Lecturer in Transport, Faculty of Architecture and the Built Environment, University of Westminster. Some choice statistics:

If people in urban England and Wales cycled and walked as much as people do in Copenhagen, the NHS could save around £17 billion within twenty years.
Shifting 10% of short urban trips outside London from car to cycle could save over 100 premature deaths from air pollution related illnesses annually.
If cycling was as safe in the UK as in  The Netherlands we would see around  80 fewer cycle deaths each year.
Installing protected space for cycling can increase retail sales by up to a quarter.
Spotted by Ewen Maclean.  Thanks, Ewen!

Seeking Software and Systems Engineers (Telecommute)

COMPANY

FP Complete is a software engineering firm that helps sophisticated companies develop cutting-edge analytics & modeling software for biology, finance, Internet of Things, and other advanced data projects. We emphasize functional programing (mainly in Haskell) and distributed, cloud-based devops (AWS and other clouds).

Founded in 2012, we are a modern and lightweight company of about 20 people. Many of our team are veterans and innovators from Microsoft, Intel, Google, several PhD programs, and several prominent open-source projects. We work in distributed, work-from-home style, so we hire people who can work with fairly little supervision, who are very smart, and who are great communicators. Most of us are in North America and Western Europe.

Our clients are category innovators: startups to Fortune 500 firms, putting information to work. Together we build new software-driven systems for powerful data handling, data analysis, proprietary applications, and online operations.

Our team, the level of autonomy and growth you get, and our work-from-anywhere management style are extraordinary. Very smart, no-nonsense people love working at FP Complete.

JOB DESCRIPTIONS

To continue growing we are currently hiring 3 software engineers. You may strongly fit any of the 3 profiles listed below, and we are also open to a blend of two different descriptions.

SKILL SET 1: strong at Haskell programming, and will architect and develop server-based infrastructure and data-modeling code, and possibly Web UI code as well.

SKILL SET 2: strong at devops and cloud Linux sysadmin, and will create and maintain systems for multi-machine computations and for advanced engineering collaboration.

SKILL SET 3: strong at customer problem analysis and solution design, including writing proposals and specifications, and will play a major role in working with our customers (and their bosses and regulators) and winning new customers for the company.

• Configure and administer cloud servers running Linux

• Develop components and applications in Haskell

• Analyze customer needs and write documents describing the problem and the proposed solution

• Assess requirements and pick the best server software to meet each need, including when to develop new software or when to use off-the-shelf software or cloud services

• Install and test Haskell software and libraries

• Install and test non-Haskell software and cloud services

• Routine website maintenance and administration

• Collaborate with distributed colleagues via online collaboration tools, email, chat, and voice

• Document technical work so that new colleagues can understand it

You should have these skills:

• Experience programming at a professional level either in Haskell, or in multiple other languages with a desire to learn more.

• Ability to design solutions based on expressed user needs, including asking questions to make sure you understand the real requirements

• Ability to write working, maintainable programs, including infrastructure code (system management, data management, libraries) and application code

• Ability to work for multiple days without close supervision

• Experience working in a team environment with online collaboration tools such Atlassian, GitHub, JIRA, etc.

• Absolute honesty and integrity

• Strong written and spoken English

As we have many opportunities, these additional skills are also valued:

• Experience managing Linux servers (user account management, security, networks, services)

• Ability to write Javascript code

• Familiarity with a Haskell web framework like Yesod, Snap, or Happstack

• Experience managing a nontrivial public website with complex features

• Expertise in security

• Expertise with replicated (multi server) environments

• Experience managing virtual machines (VMware, etc.) or cloud machines (AWS, etc.)

• Familiarity with Apache, Apache extensions, and PHP

• Experience developing or maintaining code in Python, Ruby, C, C++, and/or Java

• Experience as either an engineering team lead or manager

Unleashing the power of textures!

From luminance-0.1 to luminance-0.2 included, it was not possible to use any texture types different than two-dimensional textures. This blog entry tags the new release, luminance-0.3, which adds support for several kinds of texture.

A bit more dimensions

Texture1D, Texture2D and Texture3D are all part of the new release. The interface has changed – hence the breaking changes yield a major version increment – and I’ll explain how it has.

Basically, textures are now fully polymorphic and are constrained by a typeclass: Texture. That typeclass enables ad hoc polymorphism. It is then possible to add more texture types without having to change the interface, which is cool. Like everything else in luminance, you just have to ask the typesystem which kind of texture you want, and everything will be taken care of for you.

Basically, you have three functions to know:

All those functions work on (Texture t) => t, so it will work with all kinds of texture.

Cubemaps

Cubemaps are also included. They work like other textures but add the concept of faces. Feel free to dig in the documentation for further details.

What’s next?

I need to find a way to wrap texture arrays, which are very nice and useful for layered rendering. After that, I’ll try to expose the change to the framebuffers so that we can create framebuffers with cubemaps or that kind of cool feature.

In the waiting, have a good a week!

Intersectional Types

Chris Martens (@chrisamaphone) has founded Intersectional Types, a new mailing list for programming language researchers.
In some ways, this list should be considered just another research list, such as the TYPES forum. This space can be used for research questions, literature guidance, starting collaborative efforts, introductions and updates to current research projects, open-ended philosophical questions about grand research visions, links to blog posts/papers, announcement of CFPs and job postings, announcements of achievements and breakthroughs.

In addition, this list is a response to a problem: that PL research communities have a really hard time attracting, retaining, and especially *valuing* people who are marginalized in society. This problem is in no way unique to PL, but the purpose of this list is to bring together folks with similar enough research interests that we can provide each other support that’s meaningful within the context of our specific field.

Some specific examples of activity we encourage, but don’t see on traditional research fora, are: requests for career mentorship and advice (especially along an academic career track); requests for feedback on papers and blog posts; giving (remote) practice talks; organizing local meetups and events; posting about mentorship programs, fellowships, summer schools, and other opportunities; venting about the ways our environments are unwelcoming and dysfunctional; and discussing how we ourselves can create more welcoming and supportive environments when we are in positions of leadership.
I previously posted about Chris's research on linear logic and storytelling. Intersectional Types, along with Lambda Ladies, marks an important step toward supporting diversity in the PL community. Well done, Chris, and welcome!

Lambda Days

I am on the programme committee for Lambda Days, Krakow, 18-19 February 2016.

Polymorphism for dummies

This tutorial explains how polymorphism is implemented under the hood in Haskell using the least technical terms possible.

The simplest example of a polymorphic function is the identity function:

id :: a -> aid x = x

The identity function works on any type of value. For example, I can apply id to an Int or to a String:

$ghciPrelude> id 44Prelude> id "Test""Test" Under the hood, the id function actually takes two arguments, not one. -- Under the hood:id :: forall a . a -> aid @a x = x The first argument of id (the @a) is the same as the a in the type signature of id. The type of the second argument (x) can refer to the value of the first argument (the @a). If you don't believe me, you can prove this yourself by just taking the following module: module Id whereimport Prelude hiding (id)id :: a -> aid x = x ... and ask ghc to output the low-level "core" representation of the above id function: $ ghc -ddump-simpl id.hs[1 of 1] Compiling Id               ( id.hs, id.o )==================== Tidy Core ====================Result size of Tidy Core = {terms: 4, types: 5, coercions: 0}Id.id :: forall a_apw. a_apw -> a_apw[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]Id.id = \ (@ a_aHC) (x_apx :: a_aHC) -> x_apx

The key part is the last line, which if you clean up looks like this:

id = \(@a) (x :: a) -> x

ghc prefixes types with @ when using them as function arguments.

In other words, every time we "generalize" a function (i.e. make it more polymorphic), we add a new hidden argument to that function corresponding to the polymorphic type.

Specialization

We can "specialize" id to a narrower type that is less polymorphic (a.k.a. "monomorphic"):

idString :: String -> StringidString = id

Under the hood, what actually happened was that we applied the id function to the String type, like this:

-- Under the hood:idString :: String -> StringidString = id @String

We can prove this ourselves by taking this module:

module Id whereimport Prelude hiding (id)id :: a -> aid x = xidString :: String -> StringidString = id

... and studying the core that this module generates:

$ghc -ddump-simpl id.hs[1 of 1] Compiling Id ( id.hs, id.o )==================== Tidy Core ====================Result size of Tidy Core = {terms: 6, types: 8, coercions: 0}Id.id :: forall a_apx. a_apx -> a_apx[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]Id.id = \ (@ a_aHL) (x_apy :: a_aHL) -> x_apyId.idString :: GHC.Base.String -> GHC.Base.String[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType]Id.idString = Id.id @ GHC.Base.String If we clean up the last line, we get: idString = id @String In other words, every time we "specialize" a function (i.e. make it less polymorphic), we apply it to a hidden type argument. Again, ghc prefixes types with @ when using them as values. So back in the REPL, when we ran code like this: >>> id 44>>> id "Test""Test" ... ghc was implicitly inserting hidden type arguments for us, like this: >>> id @Int 44>>> id @String "Test" Conclusion That's it! There's really nothing more to it than that. The general trick for passing around type parameters as ordinary function arguments was first devised as part of System F. This is the same way that many other languages encode polymorphism (like ML, Idris, Agda, Coq) except that some of them use a more general mechanism. However, the basic principles are the same: • When you make something more polymorphic you are adding additional type arguments to your function • When you make something less polymorphic you apply your function to type values In ghc-8.0, you will be allowed to explicitly provide type arguments yourself if you prefer. For example, consider the read function: read :: Read a => Read -> a If you wanted to specify what type to read without a type annotation, you could provide the type you desired as an argument to read: read @Int :: Read -> Int Here are some other examples: [] @Int :: [Int]show @Int :: Int -> String Previously, the only way to pass a type as a value was to use the following Proxy type: data Proxy a = Proxy ... which reified a type as a value. Now you will be able to specialize functions by providing the type argument directly instead of adding a type annotation. October 02, 2015 Brent Yorgey Experience report: oral final exam This past spring I taught a standard data structures course (stacks, queues, binary trees, heaps, asymptotic analysis, that kind of thing). Inspired by a group I participated in exploring pedagogy and course design—led by the wonderful Betsy Burris—I decided to give an oral final exam instead of a typical written exam. For whatever reason, oral exams don’t seem very popular in the US these days, but overall I think my experiment was a success and I definitely hope to repeat it in the future. This post reports on how I organized things, what went well and poorly, and what I learned overall. I’d love to hear from anyone else who tries something like this in their own course. I’m also happy to answer questions, either in a comment on this post or via email. Exam content I prepared three questions for the exam. The first was fairly simple (“explain algorithm X and analyze its time complexity”) and I actually told the students ahead of time what it would be—to help them feel more comfortable and prepared. The other questions were a bit more open-ended: • The second question was of the form “I want to store X information and do operations Y and Z on it. What sorts of data structure(s) might you use, and what would be the tradeoffs?” There were then a couple rounds of “OK, now I want to add another operation W. How does that change your analysis?” In answering this I expected them to deploy metrics like code complexity, time and memory usage etc. to compare different data structures. I wanted to see them think about a lot of the different data structures we had discussed over the course of the semester and their advantages and disadvantages at a high level. • The final question was of the form “Here is some code. What does it do? What is its time complexity? Now please design a more efficient version that does the same thing.” With some students there was enough time to have them actually write code, with other students I just had them talk through the design of an algorithm. This question got more at their ability to design and analyze appropriate algorithms on data structures. The algorithm I asked them to develop was not something they had seen before, but it was similar to other things they had seen, put together in a new way. Overall I was happy with the questions and the quality of the responses they elicited. If I do this again I would use similar sorts of questions. Time You might well be wondering how long all of this took. I had about 30 students. I planned for the exam to take 30 minutes, and blocked out 45-minute chunks of time (to allow time for transitioning and for the exam to go a bit over 30 minutes if necessary; in practice the exams always went at least 40 minutes and I was scrambling at the end to jot down final thoughts before the next students showed up). I allowed them to choose whether to come in by themselves or with a partner (more on this later). As seems typical, about 1/3 of them chose to come by themselves, and the other 2/3 in pairs, for a total of about 20 exam slots. 20 slots at 45 minutes per slot comes out to 15 hours, or 3 hours per day for a week. This might sound like a lot, but if you compare it to the time required for a traditional written exam it compares quite favorably. First of all, I spent only two or three hours preparing the exam, whereas I estimate I would easily spend 5 or 10 hours preparing a written exam—a written exam has to be very precise in explaining what is wanted and in trying to anticipate potential questions and confusions. When you are asking the questions in person, it is easy to just clear up these confusions as they arise. Second, I was mostly grading students during their exam (more on this in the next section) so that by about five minutes after the end of their slot I had their exam completely graded. With a written exam, I could easily have spent at least 15 hours just grading all the exams. So overall, the oral exam took up less of my time, and I can tell you, hands down, that my time was spent much more enjoyably than it would have been with a written exam. It was really fun to have each student come into my office, to get a chance to talk with them individually (or as a pair) and see what they had learned. It felt like a fitting end to the semester. Assessment In order to assess the students, I prepared a detailed rubric beforehand, which was really critical. With a written exam you can just give the exam and then later come up with a rubric when you go to grade them (although I think even written exams are usually improved by coming up with a rubric beforehand, as part of the exam design process—it helps you to analyze whether your exam is really assessing the things you want it to). For an oral exam, this is impossible: there is no way to remember all of the responses that each student gives, and even if you write down a bunch of notes during or after each exam, you would probably find later that you didn’t write down everything that you should have. In any case, it worked pretty well to have a rubric in front of me, where I could check things off or jot down quick notes in real time. Social aspects People are often surprised when I say that I allowed the students to come in pairs. My reasons were as follows: • For many students, being able to come with a partner may help ease their anxiety about the exam. I was especially sensitive to this since for many of them an oral exam was a new experience. (I asked beforehand, and only two or three of them had ever had an oral exam, and those were in foreign language classes.) • Computer science, like most disciplines, is often collaborative. No matter whether a student ends up going into industry or academia, it is very likely that any data structure design or analysis that they end up doing in “real life” will be in collaboration with others. So in this sense a collaborative oral exam is fairly authentic—much more so than a written exam. That is, the skills, activities, and modes of thinking and communicating required by a collaborative oral exam are actually fairly similar to what might be required of the students in the “real world”. • There was also, of course, the simple reason of expedience: it cut down on the amount of time I had to spend administering the exam. Overall I was really happy with the result. Many of the students had been working with a particular partner on their labs for the whole semester and came to the exam with that same partner. For quite a few pairs this obviously worked well for them: it was really fun to watch the back-and-forth between them as they suggested different ideas, debated, corrected each other, and occasionally even seemed to forget that I was in the room. One might worry about mismatched pairs, where one person does all of the talking and the other is just along for the ride. I only had this happen to some extent with one or two pairs. I told all the students up front that I would take points off in this sort of situation (I ended up taking off 10%). In the end this almost certainly meant that one member of the pair still ended up with a higher grade than they would have had they taken the exam individually. I decided I just didn’t care. I imagine I might rethink this for an individual class where there were many of these sorts of pairings going on during the semester—but in that case I would also try to do something about it before the final exam. Another interesting social aspect of the process was figuring out what to do when students were floundering. One explicit thing one can do is to offer a hint in exchange for a certain number of points off, but I only ended up using this explicit option a few times. More often, after the right amount of time, I simply guided them on to the next part, either by suggesting that we move on in the interest of time, or by giving them whatever part of the answer they needed to move on to the next part of the question. I then took off points appropriately in my grading. It was difficult figuring out how to verbally respond to students: on the one hand, stony-faced silence would be unnatural and unnerving; on the other hand, responding enthusiastically when they said something correct would give too much away (i.e. by the absence of such a response when they said something incorrect). As the exams went on I got better (I think) at giving interested-yet-non-committal sorts of responses that encouraged the students but didn’t give too much away. But I still found this to be one of the most perplexing challenges of the whole process. Coverage One might wonder how much of the material from an entire semester can really be covered in a 30-minute conversation. Of course, you most certainly cannot cover every single detail. But you can actually cover quite a lot of the important ideas, along with enough details to get a sense for whether a student understands the details or not. In the end, after all, I don’t care whether a student remembers all the details from my course. Heck, I don’t even remember all the details from my course. But I care a great deal about whether they remember the big ideas, how the details fit, and how to re-derive or look up the details that they have forgotten. Overall, I am happy with the way the exam was able to cover the high points of the syllabus and to test students’ grasp of its breadth. My one regret, content-wise, is that with only 30 minutes, it’s not really possible to put truly difficult questions on the exam—the sorts of questions that students might have to wrestle with for ten or twenty minutes before getting a handle on them. Next time Would I do this again? Absolutely, given the right circumstances. But there are probably a few things I would change or experiment with. Here are a few off the top of my head: • This time, I only decided to do an oral final exam quite late in the semester. Next time, with more planning, I would like to think carefully about how to explicitly prepare students for this type of assessment, i.e. to give them chances to practice the necessary skills and to receive feedback. I am not yet sure what that would look like. • I might try something like a system for e-mailing students a harder question a certain amount of time before their slot—the idea being that they would have that time to think about and prepare their answer to the question, which they could then present during the slot (in addition to answering some questions on the spot). However, a potential major downside is that students who struggle with the hard question might come in feeling demoralized and do even worse on the rest of the exam. • Next time I will definitely be more careful and intentional about writing down a script for exactly what I want to say at the start of each exam. There were a lot of things I wanted to remember to say, and specific ways I wanted to frame the time, and at times it seemed a bit haphazard and inconsistent. Again, I’m happy to answer questions in the comments or by email. If you are inspired to also try giving an oral exam, let me know how it goes! Mark Jason Dominus A message to the aliens, part 17/23 (DNA chemistry) This is page 17 of the Cosmic Call message. An explanation follows. The 10 digits are:  0 1 2 3 4 5 6 7 8 9 This page depicts the chemical structures of the four nucleobases that make up the information-carrying part of the DNA molecule. Clockwise from top left, they are thymine , adenine , guanine , and cytosine . The deoxyribose and phosphate components of the nucleotides, shown at right, are not depicted. These form the spiral backbone of the DNA and are crucial to its structure. Will the recipients understand why the nucleobases are important enough for us to have mentioned them? The next article will discuss page 18, shown at right. (Click to enlarge.) Try to figure it out before then. FP Complete Retiring FP Haskell Center Back in April, I announced plans for the future of School of Haskell and FP Haskell Center. We're now half a year later, and it's time to start moving ahead with those plans. The summary is: • FP Haskell Center will be retired by the end of the year, please migrate your projects now. • School of Haskell will be transitioned to its own domain name, schoolofhaskell.com, with hopefully no interruption in service. Migrating projects from FP Haskell Center In order to migrate your projects, please: 1. Open your project on fpcomplete.com 2. Select the Git menu from the top bar 3. To get a tarball with all of your sources, choose "Source dist" 4. To retain your full project history, push to an external Git repository (such as on Github or Bitbucket) As readers of this blog are likely familiar already, FP Complete and the Commercial Haskell group have been putting a lot of effort into the Stack build tool for easily and reliably building Haskell code. We recommend migrating your project to Stack. School of Haskell We have three milestones along the path to our new, open sourced School of Haskell: 1. Release ide-backend - the core engine powering School of Haskell - as open source. We completed this process in March, and ide-backend has continued as a successful open source project since then. 2. Create a School of Haskell API service, allowing arbitrary websites to "activate" Haskell code on their sites to create interactive snippets for their users. The open source schoolofhaskell repository is mostly feature-complete, and we are currently making refinements before launching the service. 3. Extract the School of Haskell HTML display and edit code to its own project and host on schoolofhaskell.com. At that point, users will be able to view and edit their content on schoolofhaskell.com instead. Once all three steps are complete, we will begin redirecting users from fpcomplete.com to schoolofhaskell.com. Timeline We expect the School of Haskell changes to be completed by the end of October. We will be sharing more information about this process as it unfolds, and once the new deployment is available, will welcome contributions to improve the School of Haskell. We also look forward to seeing how others are able to take advantage of the new service API to extend their own websites. For FP Haskell Center: we will be shutting down the service completely at the end of 2015. We will soon deploy changes to provide a warning when accessing FP Haskell Center about the impending shutdown. Once the service is shut down, your data will be inaccessible. Please ensure that you have backed up any data and code you wish to retain. October 01, 2015 Douglas M. Auclair (geophf) September 2015 1HaskellADay Problems and Solutions • September 30th, 2015: Now, not only DLists are Functors, for today's #haskell problem we make them Applicative! http://lpaste.net/4112595477009006592 Come to find that Applied Applicative DLists taste like Apples http://lpaste.net/792542463930662912 • September 29th, 2015: For today's #haskell problem we look at DLists as Functors! I know! Exciting! http://lpaste.net/8043489210054737920 and we enfunctorfy DLists ... which are already functions ... http://lpaste.net/2809553771506434048 ... hmmm ... That sounds like DLists are APPLICATIVE! • September 28th, 2015: So we indexed a set of rows last week, let's re(re)cluster them, AGAIN! http://lpaste.net/5440430731631263744 for today's #haskell problem. And now we re(re)clustered the data, now with colors! http://lpaste.net/3487731800489328640 • September 24th, 2015: Okay, yesterday we indexed rows, so, for today's #haskell problem, let's save and load those rows as CSV http://lpaste.net/2517275148160073728 • September 23rd, 2015: Data Row, o, Data Row: wherefore art thoust identity? Today's #haskell problem adds unique ids for rows of data http://lpaste.net/348288308305985536 Simply by using Data.Array we get our data in (uniquely-identified) rows http://lpaste.net/2095758966012248064 • September 22nd, 2015: For today's #haskell problem we go To Infinity ... and Beyond. Yes: we're coding Haskell-on-the-web, yo! http://lpaste.net/1872496352533938176 simpleHTTP makes HTTP GET-requests, ... well: simple! http://lpaste.net/1054132180147503104 • September 21st, 2015: For today's #haskell problem, we'll fade a circle to black http://lpaste.net/1617260331761926144 • September 17th, 2015: For today's #haskell problem, we receive data one way, but want to see it in another way. What to do? http://lpaste.net/4132020892534308864 Data. EnCSVified. (that's a word, now) http://lpaste.net/6250288158647255040 • September 16th, 2015: Today's #haskell problem asks 'why JSONify when you can represent clusters yourself?' Why, indeed! http://lpaste.net/9129398633454632960 • September 15th, 2015: For today's #haskell problem we 'unJSONify' some, well, JSON http://lpaste.net/2687400576576126976 • September 14th, 2015: For today's #haskell problem, we relook and recenterclusters from the cluster center http://lpaste.net/8570863670190407680 So, re-en-cluster-i-fied ... uh: clusters! YAY! (with, ooh! pics!) http://lpaste.net/8152035196972040192 • September 11th, 2015: Yesterday we displayed one cluster. For today's #haskell problem, let's display them all! http://lpaste.net/6049110928429940736 • September 10th, 2015: This past week we've been clustering data, for today's #Haskell problem we look at visualizing one of these clustershttp://lpaste.net/6385518627050749952 Cluster: shone! ('Schön'? sure!) http://lpaste.net/6270301353331916800 • September 9th, 2015: Okay, yesterday we clustered some data. For today's #haskell problem: let's see some clustered results, then! http://lpaste.net/4070941204840185856 It don't mean a thing, ... If it ain't got the (spreadsheet/CSV) schwing. http://lpaste.net/8321027338137501696 • September 8th, 2015: Today we get to do what all those other peeps do in other programming languages. Today we get to WRITE A PROGRAM!http://lpaste.net/4909208397410205696 wow. I'M K-MEANSIN' ON FIRE TODAY!(okay, geophf, calm down now) A program in Haskellhttp://lpaste.net/8770140562062835712 • September 7th, 2015: Happy Labor Day in the U.S.A. Today's #haskell problem is to look at recentering clusters for the K-Means algorithm http://lpaste.net/4755592492567494656 SEMIGROUPOID! (not 'monoid') is the key the solution for today's #haskell problem http://lpaste.net/9124358884469768192 (ScoreCard has no valid 'zero') • September 4th, 2015: Today's #haskell problem we store color-coding for score cards we obtain from rows of data http://lpaste.net/6316202708206354432 And, color-coded score cards ... SAVED! (makes me wanna scream 'SAIL!') http://lpaste.net/1760010119669612544 <iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/tgIqecROs5M/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/tgIqecROs5M?feature=player_embedded" width="320"></iframe> • September 3rd, 2015: For today's #haskell problem we look at reclustering the rows of data using K-Means clustering http://lpaste.net/5001877831559413760 K-Means clustering in #haskell (well, for 1 epoch. Something's not right with step 3: recentered) http://lpaste.net/87582513538531328 (and it's DOG-slow) • September 2nd, 2015: Drawing information from #BigData is magical, or so says today's #haskell problem http://lpaste.net/8551081789559406592 Ooh! Big Data is o-so-pretty! http://lpaste.net/2956050488883150848 But what does it mean? Stay tuned! • September 1st, 2015: For today's #haskell problem we look (obliquely) at the problem of 'indices as identity' http://lpaste.net/2039451038523588608 What is identity, anyway? 100+ clusters for 3,000 rows? Sounds legit. http://lpaste.net/4068559039884165120 Noam Lewis Const when you need it infernu uses row-type polymorphism to propagate read/write capabilities on record fields. Using row-type polymorphism to describe more than just which fields are present bears a vague resemblance to polymorphic constraints. In C, a pointer to a field in a const struct is automatically const’ed: struct foo { int field; }; void useInt(const int *); int main(void) { const struct foo x; useInt(&x.field); // no warnings because &x.field is 'const int *' return 0; }  Thus, a function that extracts a pointer to a (possibly deep) field from a const struct, will also return a const pointer: const int *getField(const struct foo *x) { return &x->field; }  (All the code compiles with -Wall and -Wextra) But, what if I want to use getField on a non-const struct, to get an accessor to a field within it? Almost works: struct foo y; int *bla = getField(&y); *bla = 2;  Uh oh. We get a warning: warning: initialization discards ‘const’ qualifier from pointer target type [enabled by default] int *bla = getField(&y); ^  The compiler is angry because int *bla should be const int *bla. But we don’t want that! We just want to get an accessor – a writable accessor – to a field in our not-const struct value. C++ (not C) does have a non-solution: const_cast. That isn’t what we want: it’s unsafe. What we want is, if a function doesn’t get a const struct, the ‘non-constness’ should propagate to the field accessor being returned (and vice versa: if the given struct was const, so should the accessor). In fancier words, we need const polymorphism, which I imagine would be written with a ‘constness type variable’ C like this made-up syntax: const<C> int *getField(const<C> struct foo *x) { return &x->field; }  And then we would expect this to compile with no problems:  struct foo y; int *bla = getField(&y);  …because, as ‘y’ is not const, ergo the pointer returned from getField is not pointing at a const. Unfortunately, no such thing. We could represent this in a type system in a number of ways. One simple way is to say that constness is a constraint on a type (using something like Haskell’s type classes). Another way is to have ‘write into a field’ be a kind of a capability that’s part of the type. The latter, write-capability approach is what I use in Infernu. Here there are no structs (it’s JavaScript) but there are polymorphic records. The type system includes two flavors for each field label: Get and Set. If a field is only being read, the record (or object or row) that contains it only needs to have the ‘Get’ accessor for that field. Here’s infernu’s output for a simple example: // obj : { subObj: { val: Number } } var obj = { subObj: { val: 3 } };  Our object is simple. The comment is what infernu infers, a reasonably simple type. In the notation I (shamelessly) invented, read-only fields have a prefix ‘get’ in the type, and read/write fields don’t have any prefix. So a read-only field ‘bla’ would be: { get bla : t }. If ‘bla’ is required to be writable, the type is written as { bla : t }. So in the above ‘obj’ example, we see that literal objects are by default inferred to be writable (type annotations would allow you to control that). Next let’s make a function that only reads ‘subObj’: // readSubObj : ( { get subObj: h | g} -> h) function readSubObj(x) { return x.subObj; }  The type inferred says “readSubObj is a function, that takes an object with a readable field subObj, (hence “get subObj”: it doesn’t require the ‘Set’ capability!). subObj has any type ‘h’, and the function returns that same type, ‘h’. (By the way, that ‘| g‘ means the passed object is allowed to contain also other fields, we don’t care.) Example of a nested read: // readVal : ( { get subObj: { get val: d | c} | b} -> d) function readVal(x) { return x.subObj.val; }  Now we need to ‘get subObj’ but subObj itself is an object with a readable field ‘val’ of type d. The function returns a ‘d’. We can use readSubObj on a writable object with no problems: // sub : { val: Number } var sub = readSubObj(obj);  When infernu supports type annotations (eventually) one could take advantage of this type-system feature by marking certain fields ‘get’. While this isn’t exactly the same as the problem we discussed with C const pointers, the same idea could be used to implement polymorphic constness. The main idea here is that ideas from row-type polymorphism can be used to implement a certain kind of ‘capabilities’ over types, constraints that are propagated. This may be a nicer way to implement (some kind of) polymorphic constraints. (For example, in a language that supports row extension/reduction, a function { x : Int | r } -> { | r } would retain the unspecified constraints from ‘r’. I’m sure there are more interesting examples.) If you can refer me to something like this, please do! Tagged: Haskell, Infernu, Javascript September 30, 2015 JP Moresmau Symbolic differentiation to the rescue! ... Or not. I'm still playing with my LSTM networks, inspired by a few blog posts about their "unreasonable efficiency". I spent a lot of time messing with genetic algorithms, but then I came back to more predictable methods, namely gradient descent. I was trying to optimize the performance of the cost function I use via AD (even asking help on stack overflow) when I stumbled across a couple of blog posts on symbolic differentiation (here and here). The last one, combining automatic and symbolic differentiation, struck a chord. If my differentiation calculation was taking so much time to calculate, could I just not calculate it once with symbolic expressions, then close the resulting expression over my variables (my LSTM network) repeatedly while applying the gradients. I should only pay the price for the derivation once! So I extended the data type suggested in the blog post to include all operations I was using in my function, manage to sort out all the types and verify via a few tests that it would work. I had great hopes! But as soon as I started testing on a real LSTM, the code just crawled to a halt. I even thought I had some infinite loop, maybe some recursion on the expression, but testing more thoroughly showed that it was the sheer size of the generated expression that was the issue. A LSTM of 2 cells is represented in the cost function used by AD as an array of 44 doubles, so basically for a LSTM of 2 cells, I'll have 44 variables in my gradient expression. My simple test that tries to use a LSTM to generate the string "hello world!" uses 9 cells (9 different characters in the string) , which is 702 variables. Even printing the expression takes forever. Running it through a simplifying step takes also forever. So my idea was not as good as it first looked, but it was fun testing it! The code for my expressions can be found here, and the code doing the gradient descent via the symbolic differentiation is here. All of that looks probably very naive for you calculus and machine learning experts but hey, I'm a human learning... If everybody has any idea to speed up my code, I'l happily listen! Mark Jason Dominus A message to the aliens, part 16/23 (vital statistics) This is page 16 of the Cosmic Call message. An explanation follows. The 10 digits are:  0 1 2 3 4 5 6 7 8 9 This page, about human vital statistics and senses, is in three sections. The text in the top left explains the population of the Earth: around 6,000,000,000 people at the time the message was sent. The three following lines give the life expectancy (70 years), mass (80 kg), and body temperature (311K) of humans. In each case it is stated explicitly that the value for men and for women is the same, which is not really true. The glyph used for life expectancy is the same one used to denote the age of the Earth back on page 13 even though the the two notions are not really the same. And why 311K when the commonly-accepted value is 310K? The diagram at right attempts to explain the human sense of hearing, showing a high-frequency wave at top and a low frequency one at bottom, annotated with the glyph for frequency and the upper and lower frequency limits of human hearing, 20,000 Hz and 20 Hz respectively. I found this extremely puzzling the first time I deciphered the message, so much so that it was one of the few parts of the document that left me completely mystified, even with the advantage of knowing already what humans are like. A significant part of the problem here is that the illustration is just flat out wrong. It depicts transverse waves: but sound waves are not transverse, they are compression waves. The aliens are going to think we don't understand compression waves. (To see the difference, think of water waves, which are transverse: the water molecules move up and down—think of a bobbing cork—but the wave itself travels in a perpendicular direction, not vertically but toward the shore, where it eventually crashes on the beach. Sound waves are not like this. The air molecules move back and forth, parallel to the direction the sound is moving.) I'm not sure what would be better; I tried generating some random compression waves to fit in the same space. (I also tried doing a cartoon of a non-random, neatly periodic compression wave, but I couldn't get anything I thought looked good.) I think the compression waves are better in some ways, but perhaps very confusing: On the one hand, I think they express the intended meaning more clearly; on the other hand, I think they're too easy to confuse with glyphs, since they happen to be on almost the same scale. I think the message might be clearer if a little more space were allotted for them. Also, they could be annotated with the glyph for pressure , maybe something like this: This also gets rid of the meaningless double-headed arrow. I'm not sure I buy the argument that the aliens won't know about arrows; they may not have arrows but it's hard to imagine they don't know about any sort of pointy projectile, and of course the whole purpose of a pointy projectile (the whole point, one might say) is that the point is on the front end. But the arrows here don't communicate motion or direction or anything like that; even as a human I'm not sure what they are supposed to communicate. The bottom third of the diagram is more sensible. It is a diagram showing the wavelengths of light to which the human visual system is most sensitive. The x-axis is labeled with “wavelength” and the y-axis with a range from 0 to 1. The three peaks have their centers at 295 nm (blue), 535 nm (green), and 565 nm (often called “red”, but actually yellow). These correspond to the three types of cone cells in the retina, and the exsistence of three different types is why we perceive the color space as being three-dimensional. (I discussed this at greater length a few years ago.) Isn't it interesting that the “red” and green sensitivities are so close together? This is why we have red-green color blindness. The next article will discuss page 17, shown at right. (Click to enlarge.) Try to figure it out before then. September 29, 2015 Johan Tibell Video of my performance optimization talk at ZuriHac 2015 This blog post is a couple of months late, but I thought I should mention that a talk on performance optimization I gave a while ago at ZuriHac 2015 is now on YouTube (slides). Unfortunately the first minutes got cut off and the sound quality isn't the best, but hopefully it's something. P.S. I will be giving a rewritten and expanded version of this talk at Haskell eXchange 2015. September 28, 2015 Mark Jason Dominus A message to the aliens, part 15/23 (human anatomy) This is page 15 of the Cosmic Call message. An explanation follows. The 10 digits are:  0 1 2 3 4 5 6 7 8 9 This page starts a new section of the document, each page headed with the glyph for “biology” . The illustration is adapted from the Pioneer plaque; the relevant portion is shown below. Copies of the plaque were placed on the 1972 and 1973 Pioneer spacecraft. The Pioneer image has been widely discussed and criticized; see the Wikipedia article for some of the history here. The illustration suffers considerably from its translation to a low-resolution bitmap. The original picture omits the woman's vulva; the senders have not seen fit to correct this bit of prudery. The man and the woman are labeled with the glyphs and , respectively. The glyph for “people” , which identified the stick figures on the previous page, is inexplicably omitted here. The ruler on the right somewhat puzzlingly goes from a bit above the man's toe to a bit below the top of the woman's head; it does not measure either of the two figures. It is labeled 1.8 meters, a typical height for men. The original Pioneer plaque spanned the woman exactly and gave her height as 168 cm, which is conveniently an integer multiple of the basic measuring unit (21 cm) defined on the plaque. To prevent the recipients from getting confused about which end of the body is the top, a parabolic figure (shown here at left), annotated with the glyph for “acceleration”, shows the direction of gravitational acceleration as on the previous page. The next article will discuss page 16, shown at right. (Click to enlarge.) Try to figure it out before then. Vincent Hanquez Combining Rust and Haskell Rust is a pretty interesting language, in the area of C++ but more modern / better. The stated goal of rust are: “a systems programming language focused on three goals: safety, speed, and concurrency”. Combining Rust with Haskell could create some interesting use cases, and could replace use of C in some projects while providing a more high level and safer approach where Haskell cannot be used. One of my reason for doing this, is that writing code targetting low-level features is simpler in Rust than Haskell. For example, writing inline assembly or some lowlevel OS routines. Also the performance of Rust is quite close to C++, and I could see this being useful in certain case where Haskell is not as optimised. In this short tutorial, let’s call Rust functions from Haskell. The Rust library First we start with an hypothetical rust library that takes a value, print to console and return a value. Our entry point in Rust is a simple rust_hello, in a src/lib.rs file: #[no_mangle] pub extern fn rust_hello(v: i32) -> i32 { println!("Hello Rust World: {}", v); v+1 }  One of the key thing here is the presence of the no_mangle pragma, that allow exporting the name of the function as-is in the library we’re going to generate. Rust uses Cargo to package library and executable, akin to Cabal for haskell. We can create the Cargo.toml for our test library: [package] name = "hello" version = "0.0.1" authors = ["Vincent Hanquez <vincent@snarc.org>"] [lib] name = "hello" crate-type = ["staticlib"] The only special trick is that we ask Cargo to build a static library in the crate-type section, instead of the default rust lib (.rlib). Haskell doesn’t know other calling / linking convention like C++ (yet) or Rust, which is why we need to go through those hoops. We can now build with our Rust library with: cargo build If everything goes according to plan, you should end up with a target directory where you can find the libhello.a library. The haskell part Now the haskell part is really easy, as this point there’s no much difference than linking with some static C library; first we create a src/Main.hs: {-# LANGUAGE ForeignFunctionInterface #-} module Main where import Foreign.C.Types foreign import ccall unsafe "rust_hello" rust_hello :: CInt -> IO CInt main = do v <- rust_hello 2901 putStrLn ("Rust returned: " ++ show v) Nothing special if you’ve done some C bindings yourself, otherwise I suggest having a look at the FFI article. we can try directly linking with ghc: ghc -o hello-rust --make src/Main.hs -lhello -Ltarget/debug and running: $ ./hello-rust
Hello Rust World: 2901
Rust returned: 2902

That achieve the goal above. From there we can polish things and add this to a cabal file:

name:                hello-rust
version:             0.1.0.0
author:              Vincent Hanquez
maintainer:          vincent@snarc.org
category:            System
build-type:          Simple
cabal-version:       >=1.10
extra-source-files:  src/lib.rs

executable hello-rust
main-is:             Main.hs
other-extensions:    ForeignFunctionInterface
build-depends:       base >=4.8 && <4.9
hs-source-dirs:      src
extra-lib-dirs:      target/release, target/debug
extra-libraries:     hello

Note: The target/release path is to support building with the -release flag of cargo build. by listing the target/release and then target/debug, it should allow you to pickup the release library in preference to the debug library. It can also create some confusion, and print a warning on my system when one of the directory is missing.

The missing step is either adding some pre-build rules to cabal Setup.hs to run cargo build, or some more elaborate build system, both which are left as exercice to the interested reader.

Where this could go

Going forward, this could lead to having another language from Haskell to target that is not as lowlevel as C, but offer stellar performance and more high level constructs (than C) without imposing any other runtime system. This is interesting where you need to complete some tasks that Haskell is not quite ready to handle (yet).

For example, a non exhaustive list:

• Writing cryptographic bindings in rust&asm instead of C&asm
• Heavily Vector-Optimised routines
• Operating system routines (e.g. page table handling) for a hybrid and safer operating system kernel.
• Servo embedding

Let me know in the comments anything else that might be of interests !

Two Different Flavors of Type Theory

Posted on September 27, 2015
Tags: types

So summer seems to be about over. I’m very happy with mine, I learned quite a lot. In particular over the last few months I’ve been reading and fiddling with a different kind of type theory than I was used to: computational type theory. This is the type theory that underlies Nuprl (or JonPRL cough cough).

One thing that stood out to me was that you could do all these absolutely crazy things in this system that seemed impossible after 3 years of Coq and Agda. In this post I’d like to sketch some of the philosophical differences between CTT and a type theory more in the spirit of CiC.

Formal Type Theory and Props-as-Types #1

First things first, let’s go over the more familiar notion of type theory. To develop one of these type theories you start by discussing some syntax. You lay out the syntax for some types and some terms

A ::= Σ x : A. A | Π x : A. A | ⊤ | ⊥ | ...
M ::= M M | λ x : A. M | <M, M> | π₁ M | ⋆ | ...

And now we want to describe the all important M : A relation. This tells us that some term has some type. It’s is inductively defined from a finite set of inferences. Ideally, it’s even decidable for philosophical reasons I’ve never cared too much about. In fact, it’s this relation that really governs our whole type theory, everything else is going to stem from this.

As an afterthought, we may decide that we want to identify certain terms which other terms this is called definitional equality. It’s another inductively defined (and decidable) judgment M ≡ N : A. Two quick things to note here

1. Definitional equality is completely arbitrary; it exists in the way it does because we defined it that way and for no other reason
2. The complexity of proving M ≡ N : A is independent of the complexity of A

The last point is some concern because it means that equality for functions is never going to be right for what we want. We have this uniformly complex judgment M ≡ N : A but when A = Π x : B. C the complexity should be greater and dependent on the complexity of B and C. That’s how it works in math after all, equality at functions is defined pointwise, something we can’t really do here if ≡ is to be decidable or just be of the same complexity no matter the type.

Now we can do lots of things with our theory. One thing we almost always want to do is now go back and build an operational semantics for our terms. This operational semantics should be some judgment M ↦ M with the property that M ↦ N will imply that M ≡ N. This gives us some computational flavor in our type theory and lets us run the pieces of syntax we carved out with M : A.

But these terms that we’ve written down aren’t really programs. They’re just serializations of the collections of rules we’ve applied to prove a proposition. There’s no ingrained notion of “running” an M since it’s built on after the fact. What we have instead is this ≡ relation which just specifies which symbols we consider equivalent but even it is was defined arbitrarily. There’s no reason we ≡ needs to be a reasonable term rewriting system or anything. If we’re good at our jobs it will be, sometimes (HoTT) it’s not completely clear what that computation system is even though we’re working to find it. So I’d describe a (good) formal type theory as an axiomatic system like any other that we can add a computational flavor to.

This leads to the first interpretation of the props-as-types correspondence. This states that the inductively defined judgments of a logic give rise to a type theory whose terms are proof terms for those same inductively defined judgments. It’s an identification of similar looking syntactic systems. It’s useful to be sure if you want to develop a formal type theory, but it gives us less insight into the computational nature of a logic because we’ve reflected into a type theory which we have no reason to suspect has a reasonable computational characterization.

Behavioural/Computational Type Theory and Props-as-Types #2

Now we can look at a second flavor of type theory. In this setting the way we order our system is very different. We start with an programming language, a collection of terms and an untyped evaluation relation between them. We don’t necessarily care about all of what’s in the language. As we define types later we’ll say things like “Well, the system has to include at least X” but we don’t need to exhaustively specify all of the system. It follows that we have actually no clue when defining the type theory how things compute. They just compute somehow. We don’t really even want the system to be strongly normalizing, it’s perfectly valid to take the lambda calculus or Perl (PerlPRL!).

So we have some terms and ↦, on top of this we start by defining a notion of equality between terms. This equality is purely computational and has no notion of types yet (like M ≡ N : A) because we have no types yet. This equality is sometimes denoted ~, we usually define it as M ~ N if and only if M ↦ O(Ms) if and only if N ↦ O(Ns) and if they terminate than Ms ~ Ns. By this I mean that two terms are the same if they compute in the same way, either by diverging or running to the same value built from ~ equal components. For more on this, you could read Howe’s paper.

So now we still have a type theory with no types.. To fix this we go off an define inferences to answer three questions.

1. What other values denote types equal to it? (A = B)
2. What values are in the type? (a ∈ A)
3. What values are considered equal at that type? (a = b ∈ A)

The first questions is usually answered in a boring way, for instance, we would say that Π x : A. B = Π x : A'. B' if we know that A = A' and B = B' under the assumption that we have some x ∈ A. We then specify two and three. There we just give the rules for demonstrating that some value, which is a program existing entirely independently of the type we’re building, is in the type. Continuing with functions, we might state that

  e x ∈ B (x ∈ A)
———————————————————
e ∈ Π x : A. B

Here I’m using _ (_) as syntax for a hypothetical judgment, we have to know that e ∈ B under the assumption that we know that x ∈ A. Next we have to decide what it means for two values to be equal as functions. We’re going to do this behaviourally, by specifying that they behave as equal programs when used as functions. Since we use functions by applying them all we have to do is specify that they behave equally on application

 v x = v' x ∈ B (x ∈ A)
————————————————————————
v = v' ∈ Π x : A. B

Equality is determined on a per type basis. Furthermore, it’s allowed to use the equality of smaller types in its definition. This means that when defining equality for Π x : A. B we get to use the equalities for A and B! We make no attempt to maintain either decidability or uniform complexity in the collections of terms specified by _ = _ ∈ _ as we did with ≡. As another example, let’s have a look at the equality type.

 A = A'  a = a' ∈ A  b = b' ∈ A
————————————————————————————————
I(a; b; A) = I(a'; b'; A')

a = b ∈ A
——————————————
⋆ ∈ I(a; b; A)

a = b ∈ A
——————————————————
⋆ = ⋆ ∈ I(a; b; A)

Things to notice here, first off the various rules depend on the rules governing membership and equality in A as we should expect. Secondly, ⋆ (the canonical occupant of I(...)) has no type information. There’s no way to reconstruct whatever reasoning went into proving a = b ∈ A because there’s no computational content in it. The thing on the left of the ∈ only describes the portions of our proof that involve computation and equalities in computational type theory are always computationally trivial. Therefore, they get the same witness no matter the proof, no matter the types involved. Finally, the infamous equality reflection rule is really just the principle of inversion that we’re allowed to use in reasoning about hypothetical judgments.

This leads us to the second cast of props-as-types. This one states that constructive proof has computational character. Every proof that we write in a logic like this gives us back an (untyped) program which we can run as appropriate for the theorem we’ve proven. This is the idea behind Kleene’s realizability model. Similar to what we’d do with a logical relation we define what each type means by defining the class of appropriate programs that fit its specification. For example, we defined functions to be the class of things that apply and proofs of equality are ⋆ when the equality is true and there are no proofs when it’s false. Another way of phrasing this correspondence is types-as-specs. Types are used to identify a collection of terms that may be used in some particular way instead of merely specifying the syntax of their terms. To read a bit more about this see Stuart Allen and Bob Harper’s work on the do a good job of explaining how this plays out for type theory.

Building Proof Assistants

A lot of the ways we actually interact with type theories is not on the blackboard but through some proof assistant which mechanizes the tedious aspects of using a type theory. For formal type theory this is particularly natural. It’s decidable whether M : A holds so the user just writes a term and says “Hey this is a proof of A” and the computer can take care of all the work of checking it. This is the basic experience we get with Coq, Agda, Idris, and others. Even ≡ is handled without us thinking about it.

With computational type theory life is a little sadder. We can’t just write terms like we would for a formal type theory because M ∈ A isn’t decidable! We need to help guide the computer through the process of validating that our term is well typed. This is the price we pay for having an exceptionally rich notion of M = N ∈ A and M ∈ A, there isn’t a snowball’s chance in hell of it being decidable 1. To make this work we switch gears and instead of trying to construct terms we start working with what’s called a program refinement logic, a PRL. A PRL is basically a sequent calculus with a central judgment of

H ≫ A ◁ e

This is going to be set up so that H ⊢ e ∈ A holds, but there’s a crucial difference. With ∈ everything was an input. To mechanize it we would write a function accepting a context and two terms and checking whether one is a member of the other. With H ≫ A ◁ e only H and A are inputs, e should be thought of as an output. What we’ll do with this judgment is work with a tactic language to construct a derivation of H ≫ A without even really thinking with that ◁ e and the system will use our proof to construct the term for us. So in Agda when I want to write a sorting function what I might do is say

    sort : List Nat → List Nat
sort xs = ...

I just give the definition and Agda is going to do the grunt work to make sure that I don’t apply a nat to a string or something equally nutty. In a system like (Jon|Nu|Meta|λ)prl what we do instead is define the type that our sorting function ought to have and use tactics to prove the existence of a realizer for it. By default we don’t really specify what exactly that realizer. For example, if I was writing JonPRL maybe I’d say

    ||| Somehow this says a list of nats is a sorted version of another
Operator sorting : (0; 0).

Theorem sort : [(xs : List Nat) {ys : List Nat | is-sorting(ys; xs)}] {
||| Tactics go here.
}

I specify a sufficiently strong type so that if I can construct a realizer for it then I clearly have constructed a sorting algorithm. Of course we have tactics which let us say things “I want to use this realizer” and then we have to go off and show that the candidate realizer is a validate realizer. In that situation we’re actually acting as a type checker, constructing a derivation implying e ∈ A.

Wrap Up

Well, that’s this summer in a nutshell. Before I finish I had one more possible look on things. Computational type theory is not concerned with something being provable in an axiomatic system, rather it’s about describing constructions. Brouwer’s core idea is that a proof is a mental construction and computational type theory is a system for proving that a particular a computable process actually builds the correct object. It’s a translation of Brouwer’s notion of proof into terms a computer scientist might be interested in.

1. To be clear, this is the chance of the snowball not melting. Not the snowball’s chances of being able to decide whether or not M ∈ A holds. Though I suppose they’re roughly the same.

A message to the aliens, part 14/23 (terrain)

This is page 14 of the Cosmic Call message. An explanation follows.

The 10 digits are:

 0 1 2 3 4 5 6 7 8 9

This is my favorite page: there is a lot of varied information and the illustration is ingenious. The page heading says to match up with the corresponding labels on the previous three pages. The page depicts the overall terrain of the Earth. The main feature is a large illustration of some mountains (yellow in my highlighted illustration below) plunging into the sea (blue).

The land part is labeled , the air part , and the water part . Over on the left of the land part are little stick figures, labeled people . This is to show that people live on the land part of the Earth, not under water or in the air. The stick figures may not be clear to the recipients, but they are explained in more detail on the next page.

Each of the three main divisions is annotated with its general chemical composition, with compounds listed in order of prevalence., All the chemical element symblog were introduced earlier, on pages 6 and 7:

The lithosphere : silicon dioxide (SiO2) ; aluminium oxide (Al2O3) ; iron(III) oxide (Fe2O3) ; iron(II) oxide (FeO) . Wikipedia and other sources dispute this listing, giving instead: SiO2, MgO, FeO, Al2O3, CaO, Na2O, Fe2O3 in that order.

The atmosphere : nitrogen gas (N2) ; oxygen gas (O2) ; argon (Ar) ; carbon dioxide (CO2) .

The hydrosphere : water (H2O) ; sodium (Na) ; chlorine (Cl) .

There are rulers extending upward from the surface of the water to the height of top of the mountain and downward to the bottom of the ocean. The height ruler is labeled 8838 meters, which is the height the peak of Mount Everest, the point highest above sea level. The depth ruler is labeled 11000 meters, which is the depth of the Challenger Deep in the Mariana Trench, the deepest part of the ocean. The two rulers have the correct sizes relative to one another. The human figures at left are not to scale (they around about 1.7 miles high), but the next page will explain how big they really are.

I don't think the message contains anything to tell the recipients the temperature of the Earth, so it may not be clear that the hydrosphere is liquid water. But perhaps the wavy line here will suggest that. The practice of measuring the height of the mountains and depth of the ocean from the surface may also be suggestive of a liquid ocean, since it would not otherwise have a flat surface to provide a global standard.

There is a potential problem with this picture: how will the recipients know which edge is the top? What if they hold it upside-down, and think the human figures are pointing down into the earth, heads downwards?

This problem is solved in a clever way: the dots at the right of the page depict an object accelerating under the influence of gravity, falling in a characteristic parabolic path. To make the point clear, the dots are labeled with the glyph for acceleration.

Finally, the lower left of the page states the acceleration due to gravity at the Earth's surface, 9.7978 m/s2. The recipients can calculate this value from the mass and radius of the Earth given earlier. Linked with the other appearance of acceleration on the page, this should suggest that the dots depict an object falling under the influence of gravity toward the bottom of the page.

The next article will discuss page 15, shown at right. (Click to enlarge.) Try to figure it out before then.

Two implementations of DHM type inference

Here are two simple implementations of Damas-Hindley-Milner type inference.

First, is my Haskell version of a region-based optimized type checker as explained by Oleg Kiselyov, in his excellent review of the optimizations to generalization used in OCaml. Oleg gives an SML implementation, which I’ve Haskellized rather mechanically (using ST instead of mutable references, etc.) The result is a bit ugly, but it does include all the optimizations explained by Oleg above (both lambda-depth / region / level fast generalization and instantiation, plus path compression on linked variables, and not doing expensive occurs checks by delaying them to whenever we traverse the types anyway).

Second, here’s my much shorter and more elegant implementation using the neat unification-fd package by Wren Romano. It’s less optimized though – currently I’m not doing regions or other optimizations. I’m not entirely satisfied with how it looks: I’m guessing this isn’t how the author of unification-fd intended generalization to be implemented, but it works. Generalization does the expensive lookup of free metavariables in the type environment. Also the instantiate function is a bit clunky. The unification-fd package itself is doing inexpensive occurs checks as above, and path compression, but doesn’t provide an easy way to keep track of lambda-depth so I skipped that part. Perhaps the Variable class should include a parameterized payload per variable, which could be used for (among other things) keeping track of lambda-depth.

darcs hacking sprint 9 report

After a one year and a half absence, the Darcs Hacking Sprint returned!

Once again, the event occurred at the IRILL (Innovation and Research Initiative for Free Software) in Paris, on September 18th to 20th.

The sprint had 7 participants: Danill Frumin, Eric Kow, Florent Becker, Ganesh Sittampalam, Guillaume Hoffmann, Thomas Miedema and Vinh Dang.

Darcs and GHC 8

Thomas Miedema is a Haskell and GHC hacker, and came on the first day of the sprint. Since Darcs is a system that aims at supporting the various GHC versions out there, Thomas helped us preparing for GHC 8, the next major version. He explained us one issue of GHC 8 that got triggered by Darcs: a bug with the PatternSynonyms extension. Fortunately it seems that the bug will be fixed in GHC HEAD. (First release candidate is planned for December).

 Thomas explaining PatternSynonyms to Eric and Ganesh

Diving into SelectChanges and PatchChoices code

On the first day I (Guillaume) claimed the "rollback takes ages" bug, which made me look into SelectChanges and PatchChoices code. The result is that I still haven't yet fixed the bug, but I discovered that patch matching was unnecessarily strict, which I could fix easily. Internally, there are two interesting patch types when it comes to matching:
• NamedPatch: represent the contents of a patch file in _darcs/patches, that is, its info and its contents
• PatchInfoAnd: represents the info of a patch as read from an inventory file (from _darcs/inventories or _darcs/hashed_inventory) and a lazy field to its corresponding NamedPatch.
Now, getting the NamedPatch for some patch is then obviously more costly than a PatchInfoAnd. You may even have to download the patch file in order to read it (in the case of lazy repositories). Moreover,  the majority of matchers only need the patch info (or metadata), not its actual contents. Only two matchers (hunk and touch) need to actually read the patch file, while matching or a patch name for instance (probably the most common operation) does not.

So, before the sprint, as soon as you wanted to match on a patch file, you had to open (and maybe download) its file, even if this was useless. With my change (mostly in Darcs.Patch.Match) we gained a little more laziness; and the unreasonably slow command "rollback -p ." passes from 2 minutes to ~15 seconds on my laptop. I hope to push this change into Darcs 2.10.2.

 Eric, Guillaume and Vinh

Now, the real source of the "rollback -p ." slowness is that patch selection is done on FL's (Forward List), while commands like rollback and obliterate naturally work backwards in time on RL. Currently, an RL is inverted and then given to the patch selection code, which is not convenient at all! Moreover, the actual representation of history of a Darcs repository is (close to being) an RL. So it seems like a proper fix for the bug is to generalize the patch selection code to also work on RL's; which may involve a good amount of typeclass'ing in the relevant modules. I think this will be too big/risky to port to the 2.10 branch, so it will wait for Darcs 2.12.

Ganesh's new not-yet-officially-named stash command

A few days before the sprint, Ganesh unveiled his "stash" branch. It feature a refactoring that enables to suspend patches (ie, put them into a state such that they have no effect in the working copy) but without changing their identity (which is currently what occurs with the darcs rebase command). This enables to implement a git-stash-like feature.

The sprinters (IRL and on IRC) discussed the possible name of the command that should encapsulate this stash feature. More importantly, on the last day we discussed what would be the actual UI of such a feature. As always when a new feature is coming to darcs, we want to make the UI as darcsy as possible :-)

Coming back to the code, Ganesh's refactoring, if extensive, will also simplify the existing types for suspended patches. We decided to go with it.

Dan's den

 Dan demonstrating den (on the left: Florent)
Daniil Frumin was this years Google Summer of Code student for Darcs. Mentored by Ganesh, he brought improvements to Darcsden, many of them being already deployed. Among them, it is possible to launch a local instance of Darcsden (using an executable called den), not unlike Mercurial's "serve" command.

Dan tells more about his work and this sprint in his latest blog post.

A better website and documentation

As a newcomer to the project, Vinh took a look at the documentation, especially the website of the project. He implemented changes to make the front page less intimidating and more organized. He also had a fresh look at our "quickstart" and proposed improvements which we felt were much needed!

Florent's projects

For this sprint, Florent was more an external visitor than a Darcs hacker. He talked about one of his current projects: Pijul, a version control system with another approach. Check out their website!

Conclusion and the next sprint

In the end this sprint turned out to be more productive and crowded than we initially thought! It has been a lot of time since the previous one, so we had a lot of things to share at first. Sprints do make synchronization between contributors more effective. They are also a moment when we can get more concentrated on the Darcs codebase, and spend more time tacking some issue.

 Avenue d'Italie, Paris
We would like to thank the IRILL people for hosting the sprint for the third time and our generous donators to make travelling to sprints easier.

We already have a time and a place for the next sprint: Sevilla, Spain in January 2016! The exact moment will be announced later, but you can already start organizing yourself and tell us if you're going.

 Thomas, Eric and Ganesh
 From left to right: Vinh, Florent, Dan, Ganesh and Eric

Darcsden improvements and Darcs sprint

This post is intended to be a short summary of my SoC project, as well as my recent trip to Darcs sprint.

Introduction

I am finishing up this post on the train back from the Autumn 2015 Darcs sprint. Today (Sept 20, Sun) was a very fun day full of darcs chatting and coding. By the end of the day we’ve heard a number of presentations

• Ganesh described his work on "stash" command for darcs (naming subject to change!). It involves some refactoring of the rebase code. I hope we would hear more from him on that, because the internal workings are actually quite interesting — I believe it’s the first time singleton types and DataKinds are used in the darcs codebase;
• Florent Becker gave a presentation about Pijul and the theory behind it — A Categorical Theory of Patches by Samuel Mimram and Cinzia Di Giusto, see arXiv:1311.3903;
• Vinh Dang talked about his improvements on the darcs wiki (it’s about time to organize the website), his goal was to make it more accessible to the newcomers;
• Yours truly gave a small presentation, outline of which you will find below:

Looking back

I have spent this summer hacking on DarcsDen as part of the Google Summer of Code program.

My basic goal was to create a "local" version of darcsden. It was not a trivial task to install darcsden (and probably installation is still not very easy!). It uses a third-party software like Redis and CouchDB. During my coding process I modifed darcsden such that it now can be a good choice for local (or lightweight single user) darcs UI. The local darcsden version can be used without any databases, tracking the repositories in the local file system. This way darcsden can be used by a developer on her local computer, like darcsum, (for working with/comparing repositories) as well as a replacement for darcsweb/cgit — a single user web front for darcs repositories.

Besides that a user of a local version can use darcsden’s interactive UI for recording new patches, as well as a command-line tool den for a quick way of browsing the repositories.

Installing darcsden-local is currently not as easy as I want to it be, but I hope that soon you will be able to install it just by running cabal install darcsden or brew install darcsden. As for now, one could do the following:

1. darcs get --lazy http://hub.darcs.net/co-dan/darcsden-local
2. cabal install . or stack install

This should install the darcsden binary and all the related css/js files. You can start darcsden by running darcsden --local. If you open your web browser you should see a list of repositories in the current directory.

However, you might have repositories scattered all over the place, and scanning your whole system for darcs repositories is just inefficient. For this purposes darcs keeps a list of repositories in a file inside your ~/.darcs directory. You can manage that list either by hand, or using the command-line den tool:

• den $PATH — add$PATH to the list of repositories in ~/.darcs/darcsden_repos (if it’s not already present there), start darcsden server in the background and launch the browser pointing to $PATH; • den — the same as den .; • den --add$PATH — add $PATH to the list of repositories in ~/.darcs/darcsden_repos; • den --remove$PATH — remove $PATH from the list of repositories in ~/.darcs/darcsden_repos. In order to further customize darcsden, one can tweak the configuration file located at ~/.darcs/darcsden.conf. Apart from the usual darcsden settings one may pay attention to the following variables: • homeDir (default .), points to the "root" directory with repositories. If the list file ~/.darcs/darcsden_repos is not present darcsden will recursively search repositories in that directory • unLocal, pwLocal: the username and the password of the "local" user The user/password credentials are required for editing the repositories and recording new patches. However, the den binary should automatically pick them up from the config file and log you in. Once you are logged in, and you have unrecorded changes in the repository, you can use darcsden UI to record a new patch. Below you can see an example of recording and merging patches from a branch. Darsden allows you to create forks/branches of your repositories, and it keeps track of the patch dependencies in your branches. More "internal" changes: • Instead of having to specify some parts of the configuration in DarcsDen.Settings, darcsden now uses runtime flags: –hub for using hub-specific modifications, –local for using the local backend and no flag for default behaviour • The flag actually choose what are called instances — something that a bit less fine-grained than settings. Instances allow you to pick backend, overwrite settings, modify the looks of the front page. • HTTP-testing using wreq. The previous test suite used selenium and it got bit-rotten. The wreq-based is easier to run and perhaps slightly easier to maintain. • HTTP auth, which is used as part of the local instance; the den tool utilizes it to log the user in automatically. • Support for repositories inside directories and nested repositories. • All the backend code that is used for handling repositories and meta-data on the file system. • Functionality for downloading zipped dist archives of darcs repositories. • Assorted mini-fixes What now? During the sprint I hacked together some code for viewing suspended patches along the regular ones. The next step would be to have a similar interface for managing the suspended patches. We have also discussed the possibility of adding rewrite rules implementing short-cut fusion for the directed types in Darcs. In order to see if it’s really worth it we would have to bring back to life the benchmarking suite (or at least check on it!). It was a really exciting weekend for me and I was delighted to meet some of my IRC friends. As it turns out, it is a small world and despite being from different parts of it we have a bunch of common IRL friends, professors. As the French would (probably not) say, très bien. The next darcs sprint will probably be in January, and probably in Europe, again. Tagged: darcs, haskell Yesod Web Framework The true root of the PVP debate I recently wrote a new Stack feature and blogged about it. The feature is about adding support for PVP bounds in cabal files. I did my very best to avoid stirring up anything controversial. But as is usual when the PVP comes up, a disagreement broke out on Reddit about version bounds. This essentially comes down to two different ways to view an upper bound on a package: • We've tested, and know for certain that the new version of a dependency is incompatible with our package • I can't guarantee that any new versions of a dependency will be compatible with our package If you look through the history of PVP debates, you'll see that this argument comes up over and over again. I'm going to make a bold statement: if a core feature to how we do package management is so easily confused, there's a problem with how we're using the feature. I made an offhand comment about this on Twitter: <script async="async" charset="utf-8" src="http://platform.twitter.com/widgets.js"></script> Based on the positive feedback to that tweet, I'm moving ahead with making a more official proposal. How PVP bounds work today Here's the theory of how you're supposed to write PVP bounds in a file: • Test your package against a range of dependencies. For example, let's say we tested with text-1.1.2 and text-1.2.0.3 • Modify the build-depends in your .cabal file to say text >= 1.1.2 && < 1.3, based on the fact that it's known to work with at least version 1.1.2, and unknown to work on anything than major version 1.2. • Next time you make a release, go through this whole process again. PVP detractors will respond with a few points: • You don't know that your package won't work with text-1.1.1 • You don't know that your package won't work with text-1.3 • You don't know for certain that your package will work with text-1.1.3, text-1.2.0.0, or text-1.2.1 (yes, it should based on PVP rules, but mistakes can happen. • Repeating this testing/updating process manually each time you make a code change is tedious and error-prone. Collect the real information If you notice, what we did in the above was extract the cabal file's metadata (version bounds) from what we actually know (known versions that the package works with). I'm going to propose a change: let's capture that real information, instead of the proxy data. The data could go into the cabal file itself, a separate metadata file, or a completely different database. In fact, the data doesn't even need to be provided by the package author. Stackage Nightly, for instance, would be a wonderful source of this information. A dependency solver - perhaps even cabal-install's - would then be able to extract exactly the same version bound information we have today from this data. We could consider it an intersection between the .cabal file version bounds and the external data. Or, we could ignore .cabal file bounds and jump straight to the database. Or we could even get more adventurous, e.g. preferring known-good build plans (based on build plan history). In theory, this functionality - if done as a separate database from the .cabal files themselves - means that on-Hackage revisions would be much less important, possibly even a feature that no one needs in the future. Automation! And here's the best part: this doesn't require authors to do anything. We can automate the entire process. There could even be build servers sitting and churning constantly trying to find combinations that build together. We've seen already how difficult it is to get authors to adopt a policy. The best policy is one that can be automated and machine run. Downsides Problems I've thought of so far: • Some packages (notably Edward Kmett's) have a versioning scheme which expresses more information than the PVP itself, and therefore the generated version bounds from this scheme may be too strict. But that won't necessarily be a problem, since a build server will be able to just test new versions as they come out. • This very blog post may start a flame war again, which I sincerely hope doesn't happen. Adoption In order for this to really happen, we need: 1. Broad support for the idea 2. Changes to the cabal-install dependency solver (or an alternate replacement) 3. Central infrasturcture for tracking the build successes 4. Tooling support for generating the build success information And to be blunt: this is not a problem that actually affects me right now, or most people I work with (curation is in a sense a simplified version of this). If the response is essentially "don't want it," let's just drop it. But if people relying on version bounds today think that this may be a good path forward, let's pursue it. Joachim Breitner The Incredible Proof Machine In a few weeks, I will have the opportunity to offer a weekend workshop to selected and motivated high school students1 to a topic of my choice. My idea is to tell them something about logic, proofs, and the joy of searching and finding proofs, and the gratification of irrevocable truths. While proving things on paper is already quite nice, it is much more fun to use an interactive theorem prover, such as Isabelle, Coq or Agda: You get immediate feedback, you can experiment and play around if you are stuck, and you get lots of small successes. Someone2 once called interactive theorem proving “the worlds most geekiest videogame”. Unfortunately, I don’t think one can get high school students without any prior knowledge in logic, or programming, or fancy mathematical symbols, to do something meaningful with a system like Isabelle, so I need something that is (much) easier to use. I always had this idea in the back of my head that proving is not so much about writing text (as in “normally written” proofs) or programs (as in Agda) or labeled statements (as in Hilbert-style proofs), but rather something involving facts that I have proven so far floating around freely, and way to combine these facts to new facts, without the need to name them, or put them in a particular order or sequence. In a way, I’m looking for labVIEW wrestled through the Curry-Horward-isomorphism. Something like this: So I set out, rounded up a few contributors (Thanks!), implemented this, and now I proudly present: The Incredible Proof Machine3 This interactive theorem prover allows you to do perform proofs purely by dragging blocks (representing proof steps) onto the paper and connecting them properly. There is no need to learn syntax, and hence no frustration about getting that wrong. Furthermore, it comes with a number of example tasks to experiment with, so you can simply see it as a challenging computer came and work through them one by one, learning something about the logical connectives and how they work as you go. For the actual workshop, my plan is to let the students first try to solve the tasks of one session on their own, let them draw their own conclusions and come up with an idea of what they just did, and then deliver an explanation of the logical meaning of what they did. The implementation is heavily influenced by Isabelle: The software does not know anything about, say, conjunction (∧) and implication (→). To the core, everything is but an untyped lambda expression, and when two blocks are connected, it does unification4 of the proposition present on either side. This general framework is then instantiated by specifying the basic rules (or axioms) in a descriptive manner. It is quite feasible to implement other logics or formal systems on top of this as well. Another influence of Isabelle is the non-linear editing: You neither have to create the proof in a particular order nor have to manually manage a “proof focus”. Instead, you can edit any bit of the proof at any time, and the system checks all of it continuously. As always, I am keen on feedback. Also, if you want to use this for your own teaching or experimenting needs, let me know. We have a mailing list for the project, the code is on GitHub, where you can also file bug reports and feature requests. Contributions are welcome! All aspects of the logic are implemented in Haskell and compiled to JavaScript using GHCJS, the UI is plain hand-written and messy JavaScript code, using JointJS to handle the graph interaction. Obviously, there is still plenty that can be done to improve the machine. In particular, the ability to create your own proof blocks, such as proof by contradiction, prove them to be valid and then use them in further proofs, is currently being worked on. And while the page will store your current progress, including all proofs you create, in your browser, it needs better ways to save, load and share tasks, blocks and proofs. Also, we’d like to add some gamification, i.e. achievements (“First proof by contradiction”, “50 theorems proven”), statistics, maybe a “share theorem on twitter” button. As the UI becomes more complicated, I’d like to investigating moving more of it into Haskell world and use Functional Reactive Programming, i.e. Ryan Trickle’s reflex, to stay sane. Customers who liked The Incredible Proof Machine might also like these artifacts, that I found while looking whether something like this exists: • Easyprove, an interactive tool to create textual proofs by clicking on rules. • Domino On Acid represents natural deduction rules in propositional logic with → and ⊥ as a game of dominoes. • Proofscape visualizes the dependencies between proofs as graphs, i.e. it operates on a higher level than The Incredible Proof Machine. • Proofmood is a nice interactive interface to conduct proofs in Fitch-style. • Proof-Game represents proofs trees in a sequent calculus with boxes with different shapes that have to match. • JAPE is an editor for proofs in a number of traditional proof styles. (Thanks to Alfio Martini for the pointer.) • Logitext, written by Edward Z. Yang, is an online tool to create proof trees in sequent style, with a slick interface, and is even backed by Coq! (Thanks to Lev Lamberov for the pointer.) • Carnap is similar in implementation to The Incredible Proof Machine (logical core in Haskell, generic unification-based solver). It currently lets you edit proof trees, but there are plans to create something more visual. • Clickable Proofs is a (non-free) iOS app that incorporates quite a few of the ideas that are behind The Incredible Proof Machine. It came out of a Bachelor’s thesis of Tim Selier and covers propositional logic. • Euclid the game by Kasper Peulen is a nice game to play with geometric constructions. 1. Students with migration background supported by the START scholarship 2. Does anyone know the reference? 3. We almost named it “Proofcraft”, which would be a name our current Minecraft-wild youth would appreciate, but it is alreay taken by Gerwin Kleins blog. Also, the irony of a theorem prover being in-credible is worth something. 4. Luckily, two decades ago, Tobias Nipkow published a nice implementation of higher order pattern unification as ML code, which I transliterated to Haskell for this project. September 23, 2015 Dimitri Sabadie luminance 0.1 released! Here we are luminance-0.1 was released yesterday night, along with luminance-samples-0.1! I’ll need to enhance the documentation and add directions so that people don’t feel too overwhelmed. I’m also going to write a wiki to help people get their mind wrapped around luminance. If you think something is missing; if you think something could be enhanced; or if you’ve found a bug, please, feel free to fill in an issue on the issues tracker. Next big steps I need to test the framework. I need a lot of tests. I’ll write a demoscene production with it so that I can give a good feedback to the community and prove that luminance can be used and works. In the waiting, keep the vibe! luminance first tutorial Woah! I’m very happy about people getting interested about my luminance graphics framework. I haven’t received use case feedback yet, but I’m pretty confident I will sooner or later. In the waiting, I decided to write an embedded tutorial. It can be found here. That tutorial explains all the basic types of luminance – not all though, you’ll have to dig in the documentation ;) – and describes how you should use it. I will try to add more documentation for each modules in order to end up with a very well documented piece of software! Let’s sum up what you need People on reddit complain – they are right to – about the fact the samples just “didn’t work. They actually did, but the errors were muted. I released luminance-0.1.1 to fix that issue. Now you’ll get the proper error messages. The most common issue is when you try to run a sample without having the required hardware implementation. luminance requires OpenGL 4.5. On Linux, you might need to use primusrun or optirun if you have the Optimus technology. On Windows, I guess you have to allow the samples to run on the dedicated GPU. And on Mac OSX… I have no idea; primusrun / optirun, I’d go. Anyways, I’d like to thank all people who have/will tried/try the package. As always, I’ll keep you informed about all the big steps I take about luminance. Keep the vibe! Wolfgang Jeltsch Constrained monads There are Haskell types that have an associated monad structure, but cannot be made instances of the Monad class. The reason is typically that the return or the bind operation of such a type m has a constraint on the type parameter of m. As a result, all the nice library support for monads is unusable for such types. This problem is called the constrained-monad problem. In my article The Constraint kind, I described a solution to this problem, which involved changing the Monad class. In this article, I present a solution that works with the standard Monad class. This solution has been developed by Neil Sculthorpe, Jan Bracker, George Giorgidze, and Andy Gill. It is described in their paper The Constrained-Monad Problem and implemented in the constrained-normal package. This article is a write-up of a Theory Lunch talk I gave quite some time ago. As usual, the source of this article is a literate Haskell file, which you can download, load into GHCi, and play with. Prerequisites We have to enable a couple of language extensions: {-# LANGUAGE ConstraintKinds, ExistentialQuantification, FlexibleInstances, GADTSyntax, Rank2Types #-} Furthermore, we need to import some modules: import Data.Set hiding (fold, map) import Data.Natural hiding (fold) These imports require the packages containers and natural-numbers to be installed. The set monad The Set type has an associated monad structure, consisting of a return and a bind operation: returnSet :: a -> Set a returnSet = singleton bindSet :: Ord b => Set a -> (a -> Set b) -> Set b bindSet sa g = unions (map g (toList sa)) We cannot make Set an instance of Monad though, since bindSet has an Ord constraint on the element type of the result set, which is caused by the use of unions. For a solution, let us first look at how monadic computations on sets would be expressed if Set was an instance of Monad. A monadic expression would be built from non-monadic expressions and applications of return and (>>=). For every such expression, there would be a normal form of the shape s1 >>= \ x1 -> s2 >>= \ x2 ->  -> sn >>= \ xn -> return r where the si would be non-monadic expressions of type Set. The existence of a normal form would follow from the monad laws. We define a type UniSet of such normal forms: data UniSet a where ReturnSet :: a -> UniSet a AtmBindSet :: Set a -> (a -> UniSet b) -> UniSet b We can make UniSet an instance of Monad where the monad operations build expressions and normalize them on the fly: instance Monad UniSet where return a = ReturnSet a ReturnSet a >>= f = f a AtmBindSet sa h >>= f = AtmBindSet sa h' where h' a = h a >>= f Note that these monad operations are analogous to operations on lists, with return corresponding to singleton construction and (>>=) corresponding to concatenation. Normalization happens in (>>=) by applying the left-identity and the associativity law for monads. We can use UniSet as an alternative set type, representing a set by a normal form that evaluates to this set. This way, we get a set type that is an instance of Monad. For this to be sane, we have to hide the data constructors of UniSet, so that different normal forms that evaluate to the same set cannot be distinguished. Now we need functions that convert between Set and UniSet. Conversion from Set to UniSet is simple: toUniSet :: Set a -> UniSet a toUniSet sa = AtmBindSet sa ReturnSet Conversion from UniSet to Set is expression evaluation: fromUniSet :: Ord a => UniSet a -> Set a fromUniSet (ReturnSet a) = returnSet a fromUniSet (AtmBindSet sa h) = bindSet sa g where g a = fromUniSet (h a) The type of fromUniSet constrains the element type to be an instance of Ord. This single constraint is enough to make all invocations of bindSet throughout the conversion legal. The reason is our use of normal forms. Since normal forms are “right-leaning”, all applications of (>>=) in them have the same result type as the whole expression. The multiset monad Let us now look at a different monad, the multiset monad. We represent a multiset as a function that maps each value of the element type to its multiplicity in the multiset, with a multiplicity of zero denoting absence of this value: newtype MSet a = MSet { mult :: a -> Natural } Now we define the return operation: returnMSet :: Eq a => a -> MSet a returnMSet a = MSet ma where ma b | a == b = 1 | otherwise = 0 For defining the bind operation, we need to define a class Finite of finite types whose sole method enumerates all the values of the respective type: class Finite a where values :: [a] The implementation of the bind operation is as follows: bindMSet :: Finite a => MSet a -> (a -> MSet b) -> MSet b bindMSet msa g = MSet mb where mb b = sum [mult msa a * mult (g a) b | a <- values] Note that the multiset monad differs from the set monad in its use of constraints. The set monad imposes a constraint on the result element type of bind, while the multiset monad imposes a constraint on the first argument element type of bind and another constraint on the result element type of return. Like in the case of sets, we define a type of monadic normal forms: data UniMSet a where ReturnMSet :: a -> UniMSet a AtmBindMSet :: Finite a => MSet a -> (a -> UniMSet b) -> UniMSet b The key difference to UniSet is that UniMSet involves the constraint of the bind operation, so that normal forms must respect this constraint. Without this restriction, it would not be possible to evaluate normal forms later. The MonadUniMSet instance declaration is analogous to the MonadUniSet instance declaration: instance Monad UniMSet where return a = ReturnMSet a ReturnMSet a >>= f = f a AtmBindMSet msa h >>= f = AtmBindMSet msa h' where h' a = h a >>= f Now we define conversion from MSet to UniMSet: toUniMSet :: Finite a => MSet a -> UniMSet a toUniMSet msa = AtmBindMSet msa ReturnMSet Note that we need to constrain the element type in order to fulfill the constraint incorporated into the UniMSet type. Finally, we define conversion from UniMSet to MSet: fromUniMSet :: Eq a => UniMSet a -> MSet a fromUniMSet (ReturnMSet a) = returnMSet a fromUniMSet (AtmBindMSet msa h) = bindMSet msa g where g a = fromUniMSet (h a) Here we need to impose an Eq constraint on the element type. Note that this single constraint is enough to make all invocations of returnMSet throughout the conversion legal. The reason is again our use of normal forms. A generic solution The solutions to the constrained-monad problem for sets and multisets are very similar. It is certainly not good if we have to write almost the same code for every new constrained monad that we want to make accessible via the Monad class. Therefore, we define a generic type that covers all such monads: data UniMonad c t a where Return :: a -> UniMonad c t a AtmBind :: c a => t a -> (a -> UniMonad c t b) -> UniMonad c t b The parameter t of UniMonad is the underlying data type, like Set or MSet, and the parameter c is the constraint that has to be imposed on the type parameter of the first argument of the bind operation. For every c and t, we make UniMonad c t an instance of Monad: instance Monad (UniMonad c t) where return a = Return a Return a >>= f = f a AtmBind ta h >>= f = AtmBind ta h' where h' a = h a >>= f We define a function lift that converts from the underlying data type to UniMonad and thus generalizes toUniSet and toUniMSet: lift :: c a => t a -> UniMonad c t a lift ta = AtmBind ta Return Evaluation of normal forms is just folding with the return and bind operations of the underlying data type. Therefore, we implement a fold operator for UniMonad: fold :: (a -> r) -> (forall a . c a => t a -> (a -> r) -> r) -> UniMonad c t a -> r fold return _ (Return a) = return a fold return atmBind (AtmBind ta h) = atmBind ta g where g a = fold return atmBind (h a) Note that fold does not need to deal with constraints, neither with constraints on the result type parameter of return (like Eq in the case of MSet), nor with constraints on the result type parameter of bind (like Ord in the case of Set). This is because fold works with any result type r. Now let us implement Monad-compatible sets and multisets based on UniMonad. In the case of sets, we face the problem that UniMonad takes a constraint for the type parameter of the first bind argument, but bindSet does not have such a constraint. To solve this issue, we introduce a type class Unconstrained of which every type is an instance: class Unconstrained a instance Unconstrained a The implementation of Monad-compatible sets is now straightforward: type UniMonadSet = UniMonad Unconstrained Set toUniMonadSet :: Set a -> UniMonadSet a toUniMonadSet = lift fromUniMonadSet :: Ord a => UniMonadSet a -> Set a fromUniMonadSet = fold returnSet bindSet The implementation of Monad-compatible multisets does not need any utility definitions, but can be given right away: type UniMonadMSet = UniMonad Finite MSet toUniMonadMSet :: Finite a => MSet a -> UniMonadMSet a toUniMonadMSet = lift fromUniMonadMSet :: Eq a => UniMonadMSet a -> MSet a fromUniMonadMSet = fold returnMSet bindMSet Tagged: Andy Gill, constrained-normal (Haskell package), Constraint (kind), containers (Haskell package), functional programming, GADT, George Giorgidze, GHC, Haskell, Institute of Cybernetics, Jan Bracker, literate programming, monad, natural-numbers (Haskell package), Neil Sculthorpe, normal form, talk, Theory Lunch Mark Jason Dominus A message to the aliens, part 13/23 (days, months, and years) This is page 13 of the Cosmic Call message. An explanation follows. The 10 digits are:  0 1 2 3 4 5 6 7 8 9 There are three diagrams on this page, each depicting something going around. Although the direction is ambiguous (unless you understand arrows) it should at least should be clear that all three rotations are in the same direction. This is all you can reasonably say anyhow, because the rotations would all appear to be going the other way if you looked at them from the other side. The upper left diagram depicts the Earth going around the Sun and underneath is a note that says that the time is equal to 315569268 seconds, and is also equal to one year . This defines the year. The upper-right diagram depicts the Moon going around the Earth ; the notation says that this takes 2360591 seconds, or around 27⅓ days. This is not the 29½ days that one might naïvely expect, because it is the sidereal month rather than the synodic month. Suppose the phase of the Moon is new, so that the Moon lies exactly between the Earth and the Sun. 27⅓ days later the Moon has made a complete trip around the Earth, but because the Earth has moved, the Moon is not yet again on the line between the Earth and the Sun; the line is in a different direction. The Earth has moved about of the way around the sun, so it takes about another days before the moon is again between Earth and Sun and so a total of about 29½ days between new moons. The lower-right diagram depicts the rotation of the Earth, giving a time of 86163 seconds for the day. Again, this is not the 86400 seconds one would expect, because it is the sidereal day rather than the solar day; the issue is the same as in the eprvious paragraph. None of the three circles appears to be circular. The first one is nearly circular, but it looks worse than it is because the Sun has been placed off-center. The curve representing the Moon's orbit is decidedly noncircular. This is reasonable, because the Moon's orbit is elliptical to approximately the same degree. In the third diagram, the the curve is intended to represent the surface of the Earth, so its eccentricity is indefensible. The ellipse is not the same as the one used for the Moon's orbit, so it wasn't just a copying mistake. The last two lines state that the ages of the Sun and the Earth are each 4550000000 years. This is the first appearance of the glyph for “age”. The next article will discuss page 14, shown at right. (Click to enlarge.) Try to figure it out before then. September 22, 2015 Neil Mitchell Three Space Leaks Summary: Using the technique from the previous post, here are three space leaks I found. Every large Haskell program almost inevitably contains space leaks. This post examines three space leaks I found while experimenting with a space-leak detection algorithm. The first two space leaks have obvious causes, but I remain mystified by the third. Hoogle leak 1 The motivation for looking at space leak detection tools was that Hoogle 5 kept suffering space leaks. Since Hoogle 5 is run on a virtual machine with only 1Gb of RAM, a space leak will often cause it to use the swap file for the heap, and destroy performance. I applied the detection techniques to the hoogle generate command (which generates the databases), which told me that writeDuplicates took over 100K of stack. The body of writeDuplicates is: xs <- return$ map (second snd) $sortOn (fst . snd)$ Map.toList $Map.fromListWith (\(x1,x2) (y1,y2) -> (min x1 y1, x2 ++ y2)) [(s,(p,[t])) | (p,(t,s)) <- zip [0::Int ..] xs]storeWrite store TypesDuplicates$ jaggedFromList $map (reverse . snd) xsreturn$ map fst xs

I don't expect readers to understand the purpose of the code, but it is interesting to consider if you can spot the space leak, and if you'd have realised so while writing the code.

In order to narrow down the exact line, I inserted evaluate $rnf ... between each line, along with print statements. For example: print "step 1"evaluate$ rnf xsprint "step 2"xs <- return $map (second snd)$ sortOn (fst . snd) $Map.toList$    Map.fromListWith (\(x1,x2) (y1,y2) -> (min x1 y1, x2 ++ y2))                     [(s,(p,[t])) | (p,(t,s)) <- zip [0::Int ..] xs]evaluate $rnf xsprint "step 3" storeWrite store TypesDuplicates$ jaggedFromList $map (reverse . snd) xsprint "step 4"let res = map fst xsevaluate$ rnf resprint "step 5"return res

(Debugging tip: always use print for debugging and never for real code, that way getting rid of all debugging output is easy.) It failed after printing step 2, but before printing step 3. Pulling each subexpression out and repeating the evaluate/rnf pattern I reduced the expression to:

Map.fromListWith (\(x1,x2) (y1,y2) -> (min x1 y1, x2 ++ y2)) xs

The fromListWith function essentially performs a foldl over values with duplicate keys. I was using Data.Map.Strict, meaning it the fold was strict, like foldl'. However, the result is a pair, so forcing the accumulator only forces the pair itself, not the first component, which contains a space leak. I effectively build up min x1 (min x1 (min x1 ... in the heap, which would run faster and take less memory if reduced eagerly. I solved the problem with:

Map.fromListWith (\(x1,x2) (y1,y2) -> (, x2 ++ y2) $! min x1 y1) xs After that the stack limit could be reduced a bit. Originally fixed in commit 102966ec, then refined in 940412cf. Hoogle leak 2 The next space leak appeared in the function: spreadNames (reverse . sortOn snd -> xs@((_,limit):_)) = check$ f (99 + genericLength xs) maxBound xs    where        check xs | all (isCon . snd) xs && length (nubOrd $map snd xs) == length xs = xs | otherwise = error "Invalid spreadNames" -- I can only assign values between mn and mx inclusive f :: Word16 -> Word16 -> [(a, Int)] -> [(a, Name)] f !mn !mx [] = [] f mn mx ((a,i):xs) = (a, Name real) : f (mn-1) (real-1) xs where real = fromIntegral$ max mn $min mx ideal ideal = mn + floor (fromIntegral (min commonNameThreshold i) * fromIntegral (mx - mn) / fromIntegral (min commonNameThreshold limit)) I had already added ! in the definition of f when writing it, on the grounds it was likely a candidate for space leaks (an accumulating map), so was immediately suspicious that I hadn't got it right. However, adding bang patterns near real made no difference, so I tried systematically reducing the bug. Since this code isn't in IO, the evaluate technique from the previous leak doesn't work. Fortunately, using seq works, but is a bit more fiddly. To check the argument expression (reverse . sortOn) wasn't leaking I made the change: spreadNames (reverse . sortOn snd -> xs@((_,limit):_)) = rnf xs seq trace "passed xs" (check$ f (99 + genericLength xs) maxBound xs)

I was slightly worried that the GHC optimiser may break the delicate seq/trace due to imprecise exceptions, but at -O0 that didn't happen. Successive attempts at testing different subexpressions eventually lead to genericLength xs, which in this case returns a Word16. The definition of genericLength reads:

genericLength []        =  0genericLength (_:l)     =  1 + genericLength l

Alas, a very obvious space leak. In addition, the base library provides two rules:

{-# RULES  "genericLengthInt"     genericLength = (strictGenericLength :: [a] -> Int);  "genericLengthInteger" genericLength = (strictGenericLength :: [a] -> Integer); #-}

If you use genericLength on Int or Integer then it is replaced with a strict version without a space leak - but on Word16 the space leak remains. To solve this space leak I replaced genericLength xs with fromIntegral (length xs) in commit 12c46e93, which worked. After that change, the Hoogle test suite can be run with 1Kb of stack - a test that has been added to the continuous integration.

Shake leak

After solving the space leak from the original post, I was then able to run the entire test suite with 1Kb stack on my Windows machine. I made that an RTS option to the Cabal test suite, and my Linux continuous integration started failing. Further experimentation on a Linux VM showed that:

• The entire test failed at 50K, but succeeded at 100K.
• The excessive stack usage could be replicated with only two of the tests - the tar test followed by the benchmark test. The tar test is incredibly simple and likely any of the tests before before benchmark would have triggered the issue.
• The tests succeeded in 1K if running benchmark followed by tar.

The initial assumption was that some CAF was being partially evaluated or created by the first test, and then used by the second, but I have yet to find any evidence of that. Applying -xc suggested a handful of possible sites (as Shake catches and rethrows exceptions), but the one that eventually lead to a fix was extractFileTime, defined as:

extractFileTime x = ceiling $modificationTimeHiRes x * 1e4 And called from: getFileInfo x = handleBool isDoesNotExistError (const$ return Nothing) $do s <- getFileStatus$ unpackU_ x    return $Just (fileInfo$ extractFileTime s, fileInfo $fromIntegral$ fileSize s)

There is a small (constant sized) space leak here - the result does not force extractTime, but returns a pair containing thunks. In fact, getFileStatus from the unix library allocates a ForeignPtr to store s, so by not forcing the pair we cause the ForeignPtr to live much longer than would be otherwise required. The fix from commit 2ee36a99 is simple:

getFileInfo x = handleBool isDoesNotExistError (const $return Nothing)$ do    s <- getFileStatus $unpackU_ x a <- evaluate$ fileInfo $extractFileTime s b <- evaluate$ fileInfo $fromIntegral$ fileSize s    return $! Just$! (a, b)

Afterwards the entire Shake test suite can be run in 1K. Since getFileInfo is different on Windows vs Linux, I understand why the space leak doesn't occur on Windows. What I still don't understand is:

• How does running one test first cause the space leak in the second test?
• How does what looks like a small space leak result in over 49K additional stack space?
• Is the fact that ForeignPtr is involved behind the scenes somehow relevant?

I welcome any insights.

Using stack on NixOS

I’ve got myself a new laptop recently and decided to try NixOS. It’s
been a great experience so far, but there are some rough edges. One
of them is stack not completely working out of the box for my
projects.

The reason is that some Haskell packages depend on system C libraries,
but neither stack nor Cabal-the-library are able to find them on
NixOS. As an example, you won’t find /usr/lib/libz.so on my system.
/nix/store/2zmlykvqx69q5bh1l3jqyhrj2493vqdx-zlib-1.2.8/lib/libz.so.

Being a NixOS newbie, I’ve tried some solutions but none of them
worked. I’ve then asked for Peter Simon’s help, which he gladly and
swiftly provided (thanks, Peter!). For my use case, I’ve adapted his
suggestions into the following script:

#!/usr/bin/env bash
ZLIB="$(nix-build --no-out-link "<nixpkgs>" -A zlib)" PSQL="$(nix-build --no-out-link "<nixpkgs>" -A postgresql)"
exec stack                                                             \
--extra-lib-dirs=${ZLIB}/lib --extra-include-dirs=${ZLIB}/include \
--extra-lib-dirs=${PSQL}/lib --extra-include-dirs=${PSQL}/include \
$* My transitive dependencies need zlib and postgresql libraries, so I use nix-build to find out where these packages are and pass their directories to stack explicitly. This solution is not without drawbacks. The biggest one is that your built Haskell libraries will be hardcoded to these C libraries, but NixOS doesn’t know anything about this dependency. If you upgrade your system and garbage collect the old C libraries, you’ll have to recompile the Haskell libraries (probably with rm -R ~/.stack/snapshots). However, I quite like its conciseness, and one doesn’t need to understand much about NixOS’s internals to use it. At the moment this hack is serving me well. If you’re reading this blog post more than a couple of months after I wrote it, take a look around to see if a better solution has been developed in the mean time :). September 21, 2015 Neil Mitchell Detecting Space Leaks Summary: Below is a technique for easily detecting space leaks. It's even found a space leak in the base library. Every large Haskell program almost inevitably contains space leaks. Space leaks are often difficult to detect, but relatively easy to fix once detected (typically insert a !). Working with Tom Ellis, we found a fairly simple method to detect such leaks. These ideas have detected four space leaks so far, including one in the base library maximumBy function, which has now been fixed. For an introduction to space leaks, see this article. Our approach is based around the observation that most space leaks result in an excess use of stack. If you look for the part of the program that results in the largest stack usage, that is the most likely space leak, and the one that should be investigated first. Method Given a program, and a representative run (e.g. the test suite, a suitable input file): • Compile the program for profiling, e.g. ghc --make Main.hs -rtsopts -prof -auto-all. • Run the program with a specific stack size, e.g. ./Main +RTS -K100K to run with a 100Kb stack. • Increase/decrease the stack size until you have determined the minimum stack for which the program succeeds, e.g. -K33K. • Reduce the stack by a small amount and rerun with -xc, e.g. ./Main +RTS -K32K -xc. • The -xc run will print out the stack trace on every exception, look for the one which says stack overflow (likely the last one) and look at the stack trace to determine roughly where the leak is. • Attempt to fix the space leak, confirm by rerunning with -K32K. • Repeat until the test works with a small stack, typically -K1K. • Add something to your test suite to ensure that if a space leak is ever introduced then it fails, e.g. ghc-options: -with-rtsopts=-K1K in Cabal. I have followed these steps for Shake, Hoogle and HLint, all of which now contain -K1K in the test suite or test scripts. Example: Testing on Shake Applying these techniques to the Shake test suite, I used the run ./shake-test self test, which compiles Shake using Shake. Initially it failed at -K32K, and the stack trace produced by -xc was: *** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace: Development.Shake.Profile.generateSummary, called from Development.Shake.Profile.writeProfile, called from Development.Shake.Core.run.\.\.\, called from Development.Shake.Core.run.\.\, called from Development.Shake.Database.withDatabase.\, called from Development.Shake.Storage.withStorage.continue.\, called from Development.Shake.Storage.flushThread, called from Development.Shake.Storage.withStorage.continue, called from Development.Shake.Storage.withStorage.\, called from General.FileLock.withLockFile.\, called from General.FileLock.withLockFile, called from Development.Shake.Storage.withStorage, called from Development.Shake.Database.withDatabase, called from Development.Shake.Core.run.\, called from General.Cleanup.withCleanup, called from Development.Shake.Core.lineBuffering, called from Development.Shake.Core.run, called from Development.Shake.Shake.shake, called from Development.Shake.Args.shakeArgsWith, called from Test.Type.shakeWithClean, called from Test.Type.shaken.\, called from Test.Type.noTest, called from Test.Type.shaken, called from Test.Self.main, called from Test.main, called from :Main.CAF:mainstack overflow Looking at the generateSummary function, it takes complete profile information and reduces it to a handful of summary lines. As a typical example, one line of the output is generated with the code: let f xs = if null xs then "0s" else (\(a,b) -> showDuration a ++ " (" ++ b ++ ")")$ maximumBy (compare on fst) xs in    "* The longest rule takes " ++ f (map (prfExecution &&& prfName) xs) ++    ", and the longest traced command takes " ++ f (map (prfTime &&& prfCommand) $concatMap prfTraces xs) ++ "." Most of the code is map, maximum and sum in various combinations. By commenting out pieces I was able to still produce the space leak using maximumBy alone. By reimplementing maximumBy in terms of foldl', the leak went away. Small benchmarks showed this space leak was a regression in GHC 7.10, which I reported as GHC ticket 10830. To fix Shake, I added the helper: maximumBy' cmp = foldl1'$ \x y -> if cmp x y == GT then x else y

After switching to maximumBy' I was able to reduce the stack to -K1K. While this space leak was not problematic in practice (it's rarely used code which isn't performance sensitive), it's still nice to fix. I modified the Shake test suite to pass -K1K so if I ever regress I'll get an immediate notification. (Shake actually had one additional Linux-only space leak, also now fixed, but that's a tale for a future post.)

Caveats

This method has found several space leaks - two in Shake and two in Hoogle (I also ran it on HLint, which had no failures). However, there are a number of caveats:

• GHC's strictness analyser often removes space leaks by making accumulators strict, so -O2 tends to remove some space leaks, and profiling may reinsert them by blocking optimisations. I currently check my code using -O0, but using libraries I depend on with whatever optimisation they install with by default. By ensuring optimisations do not remove space leaks, it is less likely that minor code tweaks will introduce space leaks due to missed optimisations.
• The stack trace produced by -xc omits duplicate adjacent elements, which is often the interesting information when debugging a stack overflow. In practice, it's a little inconvenient, but not terrible. Having GHC provide repetition counts (e.g. Main.recurse *12) would be useful.
• The stack traces don't entries for things in imported libraries, which is unfortunate, and often means the location of the error is a 20 line function instead of the exact subexpression. The lack of such information makes fixing leaks take a little longer.
• The -xc flag prints stack information on all exceptions, which are often numerous. Lots of IO operations make use of exceptions even when they succeed. As a result, it's often easier to run without -xc to figure out the stack limit, then turn -xc on. Usually the stack overflow exception is near the end.
• There are sometimes a handful of exceptions after the stack overflow, as various layers of the program catch and rethrow the exception. For programs that catch exceptions and rethrow them somewhat later (e.g. Shake), that can sometimes result in a large number of exceptions to wade through. It would be useful if GHC had an option to filter -xc to only certain types of exception.
• Some functions in the base libraries are both reasonable to use and have linear stack usage - notably mapM. For the case of mapM in particular you may wish to switch to a constant stack version while investigating space leaks.
• This technique catches a large class of space leaks, but certainly not all. As an example, given a Map Key LargeValue, if you remove a single Key but don't force the Map, it will leak a LargeValue. When the Map is forced it will take only a single stack entry, and thus not be detected as a leak. However, this technique would have detected a previous Shake space leak, as it involved repeated calls to delete.

Feedback

If anyone manages to find space leaks using this technique we would be keen to know. I have previously told people that there are many advantages to lazy programming languages, but that space leaks are the big disadvantage. With the technique above, I feel confident that I can now reduce the number of space leaks in my code.

Improvements

Pepe Iborra suggested two tips to make this trick even more useful:

• Instead of -xc, I find it's much better to catch for StackOverflow exceptions in main, and then print the stack trace using GHC.Stack.currentCallStack
• For imported libraries, you can cabal unpack them and extend the .cabal descriptor Library section with a ghc-prof-options entry that enables -auto-all.

stack and the PVP

We recently had a very large discussion of stack on Reddit, which I thought was great for kicking off some discussion. In that discussion, there was a very active thread about how stack relates to the Package Versioning Policy (aka, the PVP).

The PVP - and in particular its policy on preemptive upper bounds - has been discussed at great length many times in the past, and I have no wish to revisit that discussion here. I also went into some detail on Reddit explaining how stack and Stackage relate to the PVP, and won't repeat those details here (tl;dr: they're orthogonal issues, and upper bounds are neither required nor discouraged by either).

However, the discussion brought up one of my long-held beliefs: "the right way to solve this is with tooling." Specifically: manually keeping track of lots of lower and upper bounds is a hard, tedious process that many people get wrong regularly. One great effort in this direction is the Hackage Matrix Builder project, which helps find overly lax (and, I believe, overly restrictive) bounds on Hackage and report them to authors. I'm announcing an experimental feature I've just added to stack master, and hoping it can help with the initial creation of upper bounds.

The feature itself is quite simple. When you run the sdist or upload commands, there's a new option: --pvp-bounds, which can be set to none (don't modify the cabal file at all), upper (add upper bounds), lower (add lower bounds), and both (add both upper and lower bounds). The default is none (we shouldn't change an author's cabal file without permission), but that default can be overridden in the stack.yaml file (either for your project, or in ~/.stack/stack.yaml). The algorithm is this:

• Find every single dependency in the .cabal file (e.g., bytestring >= 0.9)
• If the user has asked for lower bounds to be added, and that dependency has no lower bound, set the lower bound to the package version specified in the stack.yaml file
• If the user has asked for upper bounds to be added, and that dependency has no upper bound, set the upper bound to be less than the next major version of the package relative to what's specified in the stack.yaml file

That was a bit abstract, so let's give an example. Support you're using LTS 3.0, which includes aeson-0.8.0.2, attoparsec-0.12.1.6, and text-1.2.1.3. Let's further say that in your cabal file you have the following:

build-depends: aeson >= 0.7
, text < 2
, attoparsec

If you specify --pvp-bounds both on the command line, you'll end up with the following changes:

• aeson will now be specified as aeson >= 0.7 && < 0.9. Reason: We respect the existing lower bound (>= 0.7), but add in an upper bound based on the version of aeson used in your snapshot. Since we're currently using 0.8.0.2, the next major bump will be 0.9.
• text will be text >= 1.2.1.3 && < 2. We respect the existing upper bound (even though it's no in compliance with PVP rules - this allows you as an author to maintain more control when using this feature), but add in a lower bound to the currently used version of text.
• attoparsec will be attoparsec >= 0.12.1.6 && < 0.13. Since there are no upper or lower bounds, we add both of them in.

Just to round out the feature description:

• if you use --pvp-bounds none, your bounds are unmodified
• with --pvp-bounds lower you get
• aeson >= 0.7
• text >= 1.2.1.3 && < 2
• attoparsec >= 0.12.1.6
• with --pvp-bounds upper you get
• aeson >= 0.7 && < 0.9
• text < 2
• attoparsec < 0.13

The motivation behind this approach is simplicity. For users of stack, your versioning work usually comes down to choosing just one number: the LTS Haskell version (or Stackage Nightly date), which is relatively easy to deal with. Managing version ranges of every single dependency is an arduous process, and hard to get right. (How often have you accidentally started relying on a new feature in a minor version bump of a dependency, but forgotten to bump the lower bound?) Now, stack will handle that drudgery for you.

Of course, there are cases where you'll want to tell stack that you know better than it, e.g. "I'm using a subset of the aeson API that's compatible between both 0.7 and 0.8, so I want to override stack's guess at a lower bound." Or, "even though the PVP says text-1.3 may have a breaking change, I've spoken with the author, and I know that the parts I'm using won't be changed until version 2."

This feature should still be considered experimental, but I'm hopeful that it will be an experiment that works, and can make both upper bounds advocates and detractors happy. As a member of the latter group, I'm actually planning on trying out this functionality on some of my packages for future releases.

There are still downsides to this feature, which are worth mentioning:

• Lower bounds may be too restrictive by default. Solution: consider just using --pvp-bounds upper
• You'll still have to manually relax upper bounds on Hackage. Solution: make sure your project is in Stackage so you get early notifications, and add relaxed upper bounds manually to your .cabal file as necessary.
• This does nothing to add upper bounds to existing uploads to Hackage. Maybe someone can add a tool to do that automatically for you (or, even better, enhance cabal's dependency solver to take date information into account)
• In the past, a strong reason to avoid upper bounds was that it would trigger bugs in the dependency solver that could prevent a valid build plan from being found. For the most part, those bugs have been resolved, but on occasion you may still need to use the --reorder-goals --max-backjumps=-1 flag to cabal. (Note: when using stack's dependency solving capabilities, it passes in these flags for you automatically.)

Abstract Binding Trees in Hakaru

Earlier this year Neel Krishnaswami talked about abstract binding trees (ABTs) [part 1] [part 2]. IMO, the best way to think about ABTs is as a generalization of abstract syntax trees (ASTs), though this is not a perspective sanctioned by the CMUers I’ve talked to. CMUers oppose this way of phrasing things, in part, because the ABT libraries they’re familiar with make crucial use of the design pattern of two-level types; but I think the essential insights of ABTs and two-level types are quite different, and we ought to keep the benefits of these two techniques distinct.

Over the past year I’ve been working on the inferential language1 Hakaru, and in the new version of the compiler we’re using ABTs for our syntax trees. However, contrary to Neel’s stance against using strongly-typed internal representations for syntax, we extend the ABT approach to make use of GADTs to guarantee local well-typedness— since this in turn can be used to guarantee that program transformations are also well-typed. (If you don’t want those guarantees, then take a look at Jon Sterling’s abt library on Hackage2.) In this post I’m going to present a simplified version of our architecture, and then talk about some of the extra stuff bringing it closer to our production architecture.

First things first

Since we want everything to be well-typed, we first must introduce some universe, U, of all the types in our language. (In Haskell we can implement such a universe by using the -XDataKinds extension, so I’ll equivocate between calling U a “universe” vs a “kind”.) For the rest of this post it doesn’t actually matter what lives in that universe3, just so long as things match up when they need to. Since the choice of universe is irrelevant, we could abstract over U by turning on the -XPolyKinds extension; but I avoid doing so below, just to help keep things more concrete.

Implementing ASTs

The simplest way of thinking about well-typed ASTs is that they capture the set of terms generated by a (typed) signature; that is, the fixed point of some Σ ∷ [U] → U → ⭑. Unpacking the type for Σ, we have that every syntactic constructor s ∈ Σ is associated with some arity (the length of the list), each argument to s has some type in U (the elements of the list), and applying s to the right number of ASTs of the right types will generate a new AST with some type in U (the second argument to Σ).

To implement this fixed point we define an AST type which is parameterized by its signature. To ensure well-aritiedness (and well-typedness) of our ASTs with respect to that signature, we’ll need to introduce a helper type SArgs4. And to ensure that we obtain the least fixed-point of the signature, we’ll make everything strict.

infix  4 :$infixr 5 :* data SArgs ∷ (U → ⭑) → [U] → ⭑ where End ∷ SArgs ast [] (:*) ∷ !(ast u) → !(SArgs ast us) → SArgs ast (u : us) data AST ∷ ([U] → U → ⭑) → U → ⭑ where (:$) ∷ !(σ us u)
→ !(SArgs (AST σ) us)
→ AST σ u

Implementing ABTs

The problem with ASTs is that they have no notion of variables, and thus have no notion of variable binding. Naively we could implement binders like lambda-abstraction by having something like λ ∷ Σ [u, v] (u :→ v) but then we’d need to do a post-hoc check to ensure that the first argument to λ is in fact a variable. To build that check into the datatype itself we’d have to move λ into the definition of AST (since the first argument is of type Variable u rather than AST Σ u). If lambda-abstraction were the only binder we had, that might not be so bad; but any real-world language has a plethora of binders, and this approach doesn’t scale.

The essential idea behind ABTs is to abstract over the notion of binding itself. Given a single uniform definition of what it means to be a binding form, we don’t have to worry about adding a bunch of ad-hoc constructors to our AST datatype. Moreover, we can then provide single uniform definitions for things which mess with variables and are homomorphic over the signature. Things like capture-avoiding substitution and providing a HOAS API for our first-order representation.

The crucial step is to adjust our notion of what a signature contains. The basic signatures used above only contained applicative forms; i.e., things we can apply to locally-closed terms; i.e., what are called “functors” in the logic programming community. For ABTs we’ll want to allow our signatures to include any generalized quantifier. That is, our signatures will now be of type Σ ∷ [[U] × U] → U → ⭑. Previously, the arguments were indexed by U; now, they’re indexed by [U] × U. The length of the list gives the number of variables being bound, the types in the list give the types of those variables, and the second component of the pair gives the type of the whole locally-open expression.

To implement this we need to extend our syntax tree to include variable bindings and variable uses:

data SArgs ∷ ([U] → U → ⭑) → [[U] × U] → ⭑ where
End  ∷ SArgs abt []

(:*) ∷ !(abt vs u)
→ !(SArgs abt vus)
→ SArgs abt ((vs,u) : vus)

data ABT ∷ ([[U] × U] → U → ⭑) → [U] → U → ⭑ where
(:$) ∷ !(σ vus u) → !(SArgs (ABT σ) vus) → ABT σ [] u Var ∷ !(Variable v) → ABT σ [] v Bind ∷ !(Variable v) → !(ABT σ vs u) → ABT σ (v : vs) u Time for an example of how this all fits together. To add lambda-abstraction to our language we’d have λ ∷ Σ [([u],v)] (u :→ v): that is, the λ constructor takes a single argument which is a locally-open term, binding a single variable of type u, and whose body has type v. So given some x ∷ Variable u and e ∷ ABT Σ [] v we’d have the AST (λ :$ Bind x e :* End) ∷ ABT Σ [] (u :→ v).

“Local” vs “global” well-typedness

With the ABT definition above, every term of type ABT Σ vs u must be locally well-typed according to the signature Σ. I keep saying “locally” well-typed because we only actually keep track of local binding information. This is an intentional design decision. But only tracking local well-typedness does have some downsides.

So what are the downsides? Where could things go wrong? Given a locally-closed term (i.e., either Var x or f :$e) any free variables that occur inside will not have their U-types tracked by Haskell’s type system. This introduces some room for the compiler writer to break the connection between the types of a variable’s binder and its use. That is, under the hood, every variable is represented by some unique identifier like an integer or a string. Integers and strings aren’t U-indexed Haskell types, thus it’s possible to construct a Variable u and a Variable v with the same unique identifier, even though u and v differ. We could then Bind the Variable u but Var the Variable v. In order to ensure global well-typedness we need to ensure this can’t happen. One way is to keep track of global binding information, as we do in the paper presentation of languages. Unfortunately, to do this we’d need to teach Haskell’s typechecker about the structural rules of our language. Without a type-level implementation of sets/maps which respects all the axioms that sets/maps should, we’d be forced to do things like traverse our ASTs and rebuild them identically, but at different type indices. This is simply too hairy to stomach. Implementing the axioms ourselves is doubly so. Or we could fake it, using unsafeCoerce to avoid the extraneous traversals or the complicated pattern matching on axioms. But doing this we’d erase all guarantees that adding global binding information has to offer. A third approach, and the one we take in Hakaru, is compartmentalize the places where variables can be constructed. The variable generation code must be part of our trusted code base, but unlike the unsafeCoerce approach we can keep all the TCB code together in one spot rather than spread out across the whole compiler. Stratifying our data types The above definition of ABTs is a simplified version of what we actually use in Hakaru. For example, Hakaru has user-defined algebraic data types, so we also need case analysis on those data types. Alas, generic case analysis is not a generalized quantifier, thus we cannot implement it with (:$). We could consider just adding case analysis to the ABT definition, but then we’d start running into extensibility issues again. Instead, we can break the ABT type apart into two types: one for capturing variable uses and bindings, and the other for whatever syntax we can come up with. Thus,

data Syntax ∷ ([[U] × U] → U → ⭑) → ([U] → U → ⭑) → U → ⭑ where
(:$) ∷ !(σ vus u) → !(SArgs abt vus) → Syntax σ abt [] u data ABT ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Syn ∷ !(Syntax σ (ABT σ) u) → ABT σ [] u Var ∷ !(Variable v) → ABT σ [] v Bind ∷ !(Variable v) → !(ABT σ vs u) → ABT σ (v : vs) u Of course, since we’re going to be extending Syntax with all our language-specific details, there’s not a whole lot of benefit to parameterizing over σ. Thus, we can simplify the types considerably by just picking some concrete Σ to plug in for σ. By breaking Syntax apart from ABT we can now extend our notion of syntax without worrying about the details of variable binding (which can be defined once and for all on ABT). But we could still run into extensibility issues. In particular, often we want to separate the fixed-point portion of recursive types from their generating functor so that we can do things like add annotations at every node in the recursive data type. A prime example of such annotations is keeping track of free variables, as in Neel’s original post. To allow this form of extensibility we need to break up the ABT type into two parts: the recursion, and the Syn/Var/Bind view of the ABT. data ABT ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Unview ∷ !(View σ (ABT σ) vs u) → ABT σ vs u view ∷ ABT σ vs u → View σ (ABT σ) vs u view (Unview e) = e data View ∷ ([U] → U → ⭑) → [U] → U → ⭑ where Syn ∷ !(Syntax σ abt u) → View σ abt [] u Var ∷ !(Variable v) → View σ abt [] v Bind ∷ !(Variable v) → !(View σ abt vs u) → View σ abt (v : vs) u Now, to allow arbitrary annotations we’ll replace the data type ABT with an equivalent type class. Each instance of the ABT class defines some sort of annotations, and we can use the view and unview methods to move between the instance and the concrete View type. There’s one last form of extensibility we may want to add. Using fixed point combinators gives us a way of describing complete trees. A different way of introducing recursion is with free monads. The free-monad combinator is just like the fixed-point combinator, except that we have an additional type parameter for metavariables and we have a data constructor for using those metavariables instead of requiring the recursion to ground out with a complete syntax tree. The reasons why this might be nice to do are beyond the scope of this post, but the point is we might want to do that so we need to split the ABT class into two parts: one for the recursion itself, and another for the annotations. In the end, we have a four-level type: the Syntax, the View, the annotations, and the recursion. [1] In the accepted/current parlance, Hakaru is a “probabilistic programming language”; but I and a number of other folks working on such languages have become disaffected with that term of late, since it’s not entirely clear what should and should not count as a “probabilistic” PL. Over the course of a number of discussions on the topic, I’ve settled on “inferential” PL as describing what is (or, at least what I find) interesting about “probabilistic” PL. I’ve been meaning to write a post about the subject, and hopefully this footnote will remind me to do so. [2] N.B., the indexing used in that package is what we get if we erase/compactify the universe U. That is: the erasure of U is a singleton set; the erasure of [U] is isomorphic to the Peano numbers; the erasure of [[U] × U] is isomorphic to a list of Peano numbers; etc. [3] Though at one point I assume we have functions, (:→), just for the sake of an example. [4] Ideally we’d be able to flatten this type to avoid all the overhead of the linked list implementation. In fact, the entire AST node of (:$) together with its SArgs should be flattened. These nodes have the same general layout as the heap objects in the STG machine: a record with a pointer to the data constructor (i.e., element of the signature) followed by an appropriate number of arguments; and so in principle we ought to be able to implement them directly with a single STG heap object.

Darcs internals, part 1: typesafe directed datastructures

I am editing this post from IRILL, where the darcs sprint is taking place

One of the things that I particularly like about the Darcs codebase is that you can see that the developers were not shy about using intermediate-slash-advanced Haskell/GHC features to help achieving type safety. You can see GADTs, associated types, phantom types, existential types actively used.

In this post I would like to discuss the representation of patches and the use of type witnesses in darcs.

A word about patches and contexts

This post is intended for people interested in patch theory and its implementation in darcs. A passing familiarity with patches, contexts, inverses, etc is useful, but not required, as we restate some basic definitions in this section.

A primitive patch is a basic unit in patch theory. It constitutes an atomic change in the repository. The definition of a primitive patch may vary, but usually the following changes are considered primitive patches:

• Removing a file from the repository
• Adding file to the repository
• Changing a line n in a file in the repository
• Identity patch, i.e. an empty/non-existing change that does not modify the state of the repository at all

Every primitive patch has a pre-context and a post-context. Roughly, you can think of a pre-context as the full state of the repository before the change was made, and of the post-context as the full state of the repository after the change was applied. We write (x)-A->(y) for a patch A with a pre-context x and a post-context y.

If a primitive patch A has a pre-context a, a post-context o, and a primitive patch B has a pre-context o, a post-context b, then we can combine two patches to obtain a sequential patch AB with the pre-context a and the post-context b.

Every primitive patch (x)-A->(y) has an inverse (y)-A^{-1}->(x), such that (x)-A->(y)-A^{-1}->(x) is equivalent to the identity patch (x)-1->(x).

In the next sections we will see how those things are implemented in darcs.

Primitive patches and witnesses

A primitive patch, which constitutes a single fine grained change, can be represented as a (G)ADT:

data Prim where
Move :: FileName -> FileName -> Prim
RmFile :: FileName -> Prim
Hunk :: FileName -> Int -> ByteString -> ByteString -> Prim
RmDir :: FileName -> Prim
…

We can represent complex patches as sequences of primitive patches:

data Patch where
NilP :: Patch
PrimP :: Prim -> Patch
SeqP :: Patch -> Patch -> Patch

This seems reasonable enough. But if we implement our patch theory this way we seem to be missing something — patches have (pre- and post-) contexts. Having contexs allows us to enforce patch composition on the level of type system. Consider the following definition, which uses phantom types as type witnesses for contexts.

data Prim x y where
Move :: FileName -> FileName -> Prim x y
RmFile :: FileName -> Prim x y
AddFile :: FileName -> Prim x y
Hunk :: FileName -> Int -> ByteString -> ByteString -> Prim x y
RmDir :: FileName -> Prim x y
…

data Patch x y where
NilP :: Patch x x
PrimP :: Prim x y -> Patch x y
SeqP :: Patch x y -> Patch y z -> Patch x z

We call the types with witnesses representing pre- and post-contexts directed types. Intuitively, the directed type D x y has a “direction” from x to y, written as (x)->(y). The Prim datatype looks pretty much like the type actually used in Darcs. The Patch datatype, however, is completely artificial. We will see in the next sections how Darcs really models complex patches.

Directed programming

FL & RL

Two particularly useful directed types used in darcs are directed lists: forwards lists of the type FL a x y and reverse lists RL a x y. Forward lists “go” from the head to the tail; reverse lists “go” from the tail to the head. The lists are polymorphic over a just like regular lists.

data FL a x y where
NilFL :: FL a x x    {- The empty list stays at (x) -}
(:>:) :: a x y -> FL a y z -> FL a x z
infixr 5 :>:

For myself, I visualise forward lists like this:

(x) —a—> (y) —b—> (z) ——Nil——> (z)
a :: Patch x y
b :: Patch y z
(a :>: b :>: NilFL) :: FL Patch x z

The reverse lists are “going” from tail to head1

data RL a x y where
NilRL :: RL a x x
(:<:) :: RL a x y -> a y z -> RL a x z
infixl 5 :<:

(Mnemonically, the the head is always “greater” than the tail)

The reason I used the word “go” inside quotation marks so far is the following. Reverse lists and forward lists represent the same concept: a sequence of patches (or a sequence of directed things for that matter). They only differ in the associativity of the elements. Forward lists associate to the right, but reverse lists associate to the left.

p = Move "foo" "bar"

-- [[(x) --Nil--> (x) -p-> (y)] -id-> (y)] -p^{-1}-> (x)
example :: RL Patch x x
example = NilRL :<: p :<: Identity :<: inverse p

-- (x) -p-> [(y) -p^{-1}-> (x) --Nil--> (x)]
example2 :: FL Patch x x
example2 = p :>: inverse p :>: NilFL

The right-associated/reverse lists provide easy access to the last element of the sequence; the left-associated/forward lists provide easy access to the first element of the sequence. Therefore, if we view a repository as a directed sequence of patches, right-associated lists are useful for operations that work on the “latest” patches in the repository (such as record/unrecord), and left-associated lists are useful for commands that scan the repository from the beginning (like clone).

We can reassociate the lists easily, and verify that the two representations are isomoprhic:2

reverseRL :: RL a wX wY -> FL a wX wY
reverseRL = r NilFL
-- the type signature of r basically gives us an invariant
-- wZ is slowly "decreasing" reaching the point where
-- wZ = wX; at that point the first argument is of type FL a wX wY
where r :: FL a wZ wY -> RL a wX wZ -> FL a wX wY
r a NilRL = a
r a (xs :<: x) = r (x :>: a) xs

For example,

-- [[(x) --Nil--> (x) -p-> (y1)] -q-> (y2)] -r-> (z)

Turns into

-- (x) -p-> [(y1) -q-> (y2) -r-> [(z) --Nil--> (z)]]

Exercise: write a function reverseFL :: FL a wX wY -> RL a wX wY

We can write a lot of directed analogues of familiar list functions. For example, here is a directed append:

infixl 5 +<+
(+<+) :: RL a wX wY -> RL a wY wZ -> RL a wX wZ
xs +<+ NilRL = xs
xs +<+ (ys :<: y) = (xs +<+ ys) :<: y

Exercise: write directed append for forward lists: (+>+) :: FL a wX wY -> FL a wY wZ -> FL a wX wZ

Type witnesses

So we wrote a bunch of standard list functions for directed lists; what about some of the other functions? Can we, for example, implement filter for directed lists. Can it look like filterFL :: (a wX wY -> Bool) -> FL a wX wY? Well, we can try writing

filterFL :: (a wX wY -> Bool) -> FL a wX wY
filterFL p NilFL = NilFL
filterFL p (x :>: xs) | p x = filterFL p xs
| otherwise = x :>: filterFL p xs

However, under closer scrutiny we realize that it does not typecheck! In the second clause of filterFL we have the following information:

x :: a x y, xs :: FL a y z
filterFL xs :: FL a y z

Thus, in the first case (in which p x holds) we try to return something of the type FL a wY wZ, when FL a wX wZ was expected. It is clear that generally we can do this only if x :: a wX wX, i.e. wY = wX. But a simple predicate of the type p :: a wX wZ -> Bool won’t tell us anything about that. We need an additional type witness in our system telling us that if p x holds, then x :: a wX wX. For that purpose we introdue the EqCheck datatype.

data EqCheck wX wY where
IsEq :: EqCheck wX wX
NotEq :: EqCheck wX wY

then the type of a predicate would be

type Pred a = forall wX wY. a wX wY -> EqCheck wX wY

If (p x) = IsEq, then the typechecker will know that x :: a wX wX. We can then finally write

filterFL :: Pred a -> FL a wX wY -> FL a wX wY
filterFL p NilFL = NilFL
filterFL p (x :>: xs) | IsEq <- p x = filterFL p xs
| otherwise   = x :>: filterFL p xs

EqCheck is used this way in the darcs source code to e.g., filter our internal patches. Sometimes darcs stores information — like suspended patches — in the so called internal patches. Every patch type implements the internal patch checker (code slightly adapted):

-- |Provides a hook for flagging whether a patch is "internal" to the repo
-- and therefore shouldn't be referred to externally, e.g. by inclusion in tags.
-- Note that despite the name, every patch type has to implement it, but for
-- normal (non-internal) types the default implementation is fine.
-- Currently only used for rebase internal patches.
class MaybeInternal p where
-- | @maybe (const NotEq) (fmap isInternal patchInternalChecker) p@
-- returns 'IsEq' if @p@ is internal, and 'NotEq' otherwise.
-- The two-level structure is purely for efficiency: 'Nothing' and 'Just (InternalChecker (const NotEq))' are
-- semantically identical, but 'Nothing' allows clients to avoid traversing an entire list.
-- The patch type is passed as an 'FL' because that's how the internals of named patches are stored.
patchInternalChecker :: Maybe (forall wX wY . FL p wX wY -> EqCheck wX wY)
patchInternalChecker = Nothing

When the user runs darcs tag in the repository, darcs creates a dummy patch that explicitly depends on all the previous patches — apart from the internal ones of course. Thus, the tag command uses the following function (slightly adapted):

filterNonInternal :: MaybeInternal p => PatchSet p wX wY -> PatchSet p wX wY
filterNonInternal =
case patchInternalChecker of
Nothing -> id
Just f -> \l -> PatchSet (filterRL (f . patchcontents . hopefully) (unPatchSet l))

where a PatchSet is the list of PatchInfoAnd patches — patches together with the meta-information.

It is worth noting that EqCheck x y is isomorphic Maybe (x :~: y), but the propositional equality datatype has only been added to base since 4.7.0.0. In the future versions darcs will probably switch to using Data.Type.Equality.

Conclusion

We’ve briefly touched upon patch representation in darcs and talked about directed types and directed programming.

A good if a bit outdated reference is Jason Dagit’s master thesis (specifically the bits from chapter 4). The wiki is currently lacking in material, but I hope to improve the situation eventually.

Next time we will probably discuss either directed pairs and their use in darcs, or sealed datatypes, or both.

1. Essentially, a reverse list is a directed version of a snoc-list.

2. Before darcs 2.10.1 the right-associated lists had a slightly different datatype, and for the old lists those functions would “reverse” the list. Hence the names of the functions.

Is no-reinstall Cabal coming to GHC 8.0?

You might be wondering: with the beta release of no-reinstall Cabal, is this functionality be coming to GHC 8.0? (Or even a new release of Cabal, since the no-reinstall feature works with GHC 7.10). Unfortunately, there is a split among the Cabal developers over whether or not the actual no-reinstall behavior should go into Cabal by default as is. Duncan Coutts, in particular, has argued that it's a bad idea to enable no-reinstall without other (unimplemented) changes to Cabal. Since the extra needed changes are not fully implemented yet, it's unclear if Duncan will manage them for GHC 8.0.

I've heard a smattering of feedback that no-reinstall Cabal actually is working just fine for people, so I suspect many people would be in favor of just biting the bullet and putting in the "good" (but not "best") solution into Cabal. But I want to foster an informed discussion, so I'd like to explain what the (known) problems with no-reinstall are.

What is no reinstall?

Currently, GHC and Cabal maintain an invariant in the installed package database that for any package name and version (i.e. (source) package ID), there is at most one matching package in the database:

The arrows indicate a "depends on" relationship: so if you have a database that has bar-0.1, bar-0.2 and an instance of foo-0.1 built against bar-0.1, you aren't allowed to install another instance of foo-0.1 built against bar-0.2 (though you are allowed to install foo-0.2 built against bar-0.2).

If cabal-install wants to install a package with the same package ID as a package already in the database, but with different dependencies, it must destructively overwrite the previous entry to maintain this invariant, pictured below:

No reinstall relaxes this invariant, so that "reinstalling" a package with different dependencies just works:

The recently beta released no-reinstall Cabal achieves this with two small features. First, in GHC 7.10, we added the flag --enable-multi-instance to ghc-pkg which makes ghc-pkg no longer error if you attempt to add multiple copies of the same package in the database. Second, in Vishal Agrawal's patchset for Cabal, cabal-install is modified to use this flag, so that the dependency solver no longer has to avoid reinstalls.

However, breaking this invariant has consequences. Let's look at some of them.

Problem 1: It doesn't work on old versions of GHC

Summary: In GHC 7.8 and earlier, it's not possible to directly implement no reinstall (because ghc-pkg will reject it.) And even if it were possible, installing a new instance of a package (which has the same source package ID of an existing package) either (1) causes the old package and all of its dependents to become hidden from the default view of GHC, even though they are still usable, or (2) fails to be exposed in the default view of GHC.

Suppose that a package foo-0.1, which defines a type Foo, and has been compiled twice with different versions of its dependencies:

GHC 7.8 could not distinguish between two compilations of the package: symbols from both packages would live in the foo-0.1 namespace, and colliding symbols would simply be considered the same. Disaster! To avoid this situation, GHC has a shadowing algorithm which remove incompatible packages from its visible set. Here is an example:

We have two package databases, the user database and the global database, laid side-to-side (the user database is "on top"). When there is a conflicting package ID in the combined database, GHC prefers the package from the topmost database: thus, in our example the global foo-0.1 is shadowed (any packages which transitively have it as a dependency are also shadowed). When a package is shadowed, it doesn't exist at all to GHC: GHC will not suggest it or make any mention it exists.

No reinstall requires us to allow these duplicate packages the same database! In this case, GHC will apply shadowing; however, it is not well-defined which package should be shadowed. If GHC chooses to shadow the old package, they "vanish" from GHC's default view (it is as if they do not exist at all); if GHC chooses to shadow the new package, a package that a user just cabal-install'd will be mysteriously absent! Troublesome.

Problem 2: Using multiple instances of the same package is confusing

Summary: In GHC 7.10 or later, multiple instances of the same package may be used together in the same GHC/GHCi session, which can result in confusing type inequalities.

In GHC 7.10, we now use "package keys" to test for type identity. A package key is a source package ID augmented with a hash of the package keys of all the dependencies. This means that GHC no longer needs to apply shadowing for soundness, and you can register duplicates of a package using the --enable-mult-instances flag on ghc-pkg.

However, this can still result in confusing behavior. Consider the previous example in GHC 7.10:

Both versions of foo are visible, and so if we try to import Foo, GHC will complain that it doesn't know which Foo we want. This can be fixed by hiding one package or the other. However, suppose that both baz and qux are exposed, and furthermore, they both export a value foo which has type Foo. These types are "distinct", despite the fact that they are: (1) both named Foo, and (2) come from a package named foo-0.1: they are two different instances of foo-0.1. Confusing!

Problem 3: Nix hashing non-sdist'ed packages is difficult

It is easy to "trick" Cabal into hashing a set of source files which is not representative of the true input of the build system: for example, you can omit files in the other-modules field, or you can modify files in between the time Cabal has computed the source hash and the time it builds the files. And if you can't trust the Nix hash, you now have to worry about what happens when you really need to clobber an old entry in the Nix database (which incorrectly has the "same" hash as what you are attempting to install).

This problem doesn't exist for tarballs downloaded from Hackage, because you can simply hash the tarball and that is guaranteed to be the full set of source for building the file.

Duncan's patchset

To deal with these problems, Duncan has been working on a bigger patchset, with the following properties:

1. To support old versions of GHC, he is maintaining a separate "default view" package database (which is used by bare invocations of GHC and GHCi) from the actual "Nix store" package database. cabal-install is responsible for maintaining a consistent default view, but also installs everything into the Nix store database.
2. Nix-style hashing is only done on Hackage packages; local source tree are to be built and installed only into a sandbox database, but never the global database. Thus, an actual Nix hash is only ever computed by cabal-install.
3. He also wants to make it so that cabal-install's install plan doesn't depend on the local state of the Nix database: it should give the same plan no matter what you have installed previously. This is done by dependency resolving without any reference to the Nix database, and then once IPIDs are calculated for each package, checking to see if they are already built. This plan would also make it possible to support cabal install --enable-profiling without having to blow away and rebuild your entire package database.

Vishal's patchset

Vishal was also cognizant of the problems with the default view of the package database, and he worked on some patches to GHC for support for modifying package environments, which would serve a similar role to Duncan's extra package databases. Unfortunately, these patches have been stuck in code review for a bit now, and they wouldn't help users of old versions of GHC. While the code review process for these patches may get unstuck in the near future, I'm hesitant to place any bets on these changes landing.

Conclusion

My view is that, historically, problems one and two have been the big stated reasons why "no reinstall", while being a simple change, hasn't been added to Cabal as the default mode of operation. However, there's been rising sentiment (I think I can safely cite Simon Marlow in this respect) among some that these problems are overstated, and that we should bite the bullet.

If we want to turn on "no reinstall" before Duncan finishes his patchset (whenever that will be—or maybe someone else will finish it), I think there will need to be some concerted effort to show that these problems are a small price to pay for no reinstall Cabal, and that the Haskell community is willing to pay... at least, until a better implementation comes around.

September 18, 2015

Gabriel Gonzalez

This post summarizes a few tips that increase the readability of Haskell code in my anecdotal experience. Each guideline will have a corresponding rationale.

Do not take this post to mean that all Haskell code should be written this way. These are guidelines for code that you wish to use as expository examples to introduce people to the language without scaring them with unfamiliar syntax.

Rule #1: Don't use ($) This is probably the most controversial guideline but I believe this is the recommendation which has the highest impact on readability. A typical example of this issue is something like the following code: print$ even $4 * 2 ... which is equivalent to this code: print (even (4 * 2)) The biggest issue with the dollar sign is that most people will not recognize it as an operator! There is no precedent for using the dollar sign as an operator in any other languages. Indeed, the vast majority of developers program in languages that do not support adding new operators at all, such as Javascript, Java, C++, or Python, so you cannot reasonably expect them to immediately recognize that the dollar symbol is an operator. This then leads people to believe that the dollar sign is some sort of built-in language syntax, which in turn convinces them that Haskell's syntax is needlessly complex and optimized for being terse over readable. This perception is compounded by the fact that the most significant use of the dollar symbol outside of Haskell is in Perl (a language notorious for being write-only). Suppose that they do recognize that the symbol represents an operator. They still cannot guess at what the operator means. There is no obvious mental connection between a symbol used for currency and function application. There is also no prior art for this connection outside of the Haskell language. Even if a newcomer is lucky enough to guess that the symbol represents function application, it's still ambiguous because they cannot tell if the symbol is left- or right-associative. Even people who do actually take the time to learn Haskell in more depth have difficulty understanding how ($) behaves and frequently confuse it with the composition operator, (.). If people earnestly learning the language have difficulty understanding ($), what chance do skeptics have? By this point you've already lost many people who might have been potentially interested in the language, and for what? The dollar sign does not even shorten the expression. Rule #2: Use operators sparingly Rule #1 is a special case of Rule #2. My rough guideline for which operators to use is that assocative operators are okay, and all other operators are not okay. Okay: • (.) • (+) / (*) • (&&) / (||) • (++) Not okay: • (<$>) / (<*>) - Use liftA{n} or ApplicativeDo in the future
• (^.) / (^..) / %~ / .~ - Use view / toListOf / over / set instead

You don't have to agree with me on the specific operators to keep or reject. The important part is just using them more sparingly when teaching Haskell.

The issues with operators are very similar in principle to the issue with the dollar sign:

• They are not recognizable as operators to some people, especially if they have no equivalent in other languages
• Their meaning is not immediately obvious
• Their precedence and fixity are not obvious, particular for Haskell-specific operators

The main reason I slightly prefer associative operators is that their fixity does not matter and they usually have prior art outside the language as commonly used mathematical operators.

Rule #3: Use do notation generously

Prefer do notation over (>>=) or fmap when available, even if it makes your code a few lines longer. People won't reject a language on the basis of verbosity (Java and Go are living proof of that), but they will reject languages on the basis of unfamiliar operators or functions.

This means that instead of writing this:

example = getLine >>= putStrLn . (++ "!")

You instead write something like this:

example = do    str <- getLine    putStrLn (str ++ "!")

If you really want a one-liner you can still use do notation, just by adding a semicolon:

example = do str <- getLine; putStrLn (str ++ "!")

do notation and semicolons are immediately recognizable to outsiders because they resemble subroutine syntax and in the most common case (IO) it is in fact subroutine syntax.

A corollary of this is to use the newly added ApplicativeDo extension, which was recently merged into the GHC mainline and will be available in the next GHC release. I believe that ApplicativeDo will be more readable to outsiders than the (<$>) and (<*>) operators. Rule #4: Don't use lenses Don't get me wrong: I'm one of the biggest advocates for lenses and I think they firmly belong as a mainstream Haskell idiom. However, I don't feel they are appropriate for beginners. The biggest issues are that: • It's difficult to explain to beginners how lenses work • They require Template Haskell or boilerplate lens definitions • They require separate names for function accessors and lenses, and one or the other is bound to look ugly as a result • They lead to poor inferred types and error message, even when using the more monomorphic versions in lens-family-core Lenses are wonderful, but there's no hurry to teach them. There are already plenty of uniquely amazing things about the Haskell language worth learning before even mentioning lenses. Rule #5: Use where and let generously Resist the temptation to write one giant expression spanning multiple lines. Break it up into smaller sub-expressions each defined on their own line using either where or let. This rule exists primarily to ease imperative programmers into functional programming. These programmers are accustomed to frequent visual "punctuation" in the form of statement boundaries when reading code. let and where visually simulate decomposing a larger program into smaller "statements" even if they are really sub-expressions and not statements. Rule #6: Use point-free style sparingly Every Haskell programmer goes through a phase where they try to see if they can eliminate all variable names. Spoiler alert: you always can, but this just makes the code terse and unreadable. For example, I'll be damned if I know what this means without some careful thought and some pen and paper: ((==) <*>) ... but I can tell at a glance what this equivalent expression does: \f x -> x == f x This is a real example, by the way. There's no hard and fast rule for where to draw the line, but when in doubt err on the side of being less point-free. Conclusion That's it! Those six simple rules go a very long way towards improving the readability of Haskell code to outsiders. Haskell is actually a supremely readable language once you familiarize yourself with the prevailing idioms, thanks to: • purity • the opinionated functional paradigm • minimization of needless side effects and state However, we should make an extra effort to make our code readable even to complete outsiders with absolutely no familiarity or experience with the language. The entry barrier is one of the most widely cited criticisms of the language and I believe that a simple and clean coding style can lower that barrier. Yesod Web Framework The new runtime Hamlet API The Hamlet HTML templating system is, by default, parsed at compile time, allowing it to do quite a bit of static checking. But it can also be useful to parse these templates at runtime, such as if you wanted to allow dynamic content to be submitted by users. A few weeks ago, a discussion on Reddit pointed out that the API for runtime Hamlet in shakespeare was pretty bad. Just a week later, I came up with a use case for runtime Hamlet myself, and decided to put together a new, more user friendly API. This new API is provided in Text.Hamlet.Runtime. Hopefully it's much easier to follow going on here. And this time, there are even comments on the functions! I'm including the example from the module below to give a better feel for how this works: {-# LANGUAGE OverloadedStrings #-} import Text.Hamlet.Runtime import qualified Data.Map as Map import Text.Blaze.Html.Renderer.String (renderHtml) main :: IO () main = do template <- parseHamletTemplate defaultHamletSettings$ unlines
[ "<p>Hello, #{name}"
, "$if hungry" , " <p>Available food:" , " <ul>" , "$forall food <- foods"
, "      <li>#{food}"
]
let hamletDataMap = Map.fromList
[ ("name", "Michael")
, ("hungry", toHamletData True) -- always True
, ("foods", toHamletData
[ "Apples"
, "Bananas"
, "Carrots"
])
]
html <- renderHamletTemplate template hamletDataMap
putStrLn $renderHtml html Edward Kmett Some Rough Notes on Univalent Foundations and B-Systems, Part I I recently attended RDP in Warsaw, where there was quite a bit of work on Homotopy Type Theory, including a special workshop organized to present recent and ongoing work. The organizers of all the events did a fantastic job and there was a great deal of exciting work. I should add that I will not be able to go to RDP next year, as the two constituent central conferences (RTA — Rewriting Techniques and Applications and TLCA — Typed Lambda Calculus and Applications) have merged and changed names. Next year it will now be called FSCD — Formal Structures for Computation and Deduction. So I very much look forward to attending FSCD instead. In any case, one of the invited speakers was Vladimir Voevodsky, who gave an invited talk on his recent work relating to univalent foundations titled "From Syntax to Semantics of Dependent Type Theories — Formalized”. This was a very clear talk that helped me understand his current research direction and the motivations for it. I also had the benefit of some very useful conversations with others involved in collaboration with some of this work, who patiently answered my questions. The notes below are complimentary to the slides from his talk. I had sort of understood what the motivation for studying “C-Systems” was, but I had not taken it on myself to look at Voevodsky’s “B-Systems” before, nor had I grasped how his research programme fit together. Since I found this experience enlightening, I figured I might as well write up what I think I understand, with all the usual caveats. Also note, in all the below, by “type theory” I invariably mean the intensional sort. So all the following is in reference to the B-systems paper that Voevodsky has posted on arXiv (arXiv:1410.5389). That said, if anything I describe here strikes you as funny, it is more likely that I am not describing things right than that the source material is troublesome — i.e. take this with a grain of salt. And bear in mind that I am not attempting to directly paraphrase Voevodsky himself or others I spoke to, but rather I am giving an account of where what they described resonated with me, and filtered through my own examples, etc. Also, if all of the “why and wherefore” is already familiar to you, feel free to skip directly to the “B-Systems” section where I will just discuss Voevodsky’s paper on this topic, and my attempts to understand portions of it. And if you already understand B-Systems, please do reply and explain all the things I’m sure I’m missing! Some Review We have a model of type theory in simiplicial sets that validates the univalence axiom (and now a few other models that validate this axiom as well). This is to say, it is a model with not only higher dimensional structure, but higher structure of a very “coherent” sort. The heart of this relates to our construction of a “universe”. In our categorical model, all our types translate into objects of various sorts. The “universe,” aka the type-of-types, translates into a very special object, one which “indexes” all other objects. A more categorical way of saying this is that all other types are “fibered over” the universe — i.e. that from every other type there is a map back to a specific point within the universe. The univalence axiom can be read as saying that all equivalent types are fibered over points in the universe that are connected (i.e. there is a path between those points). Even in a relatively simple dependent type theory, equivalence of types quickly becomes undecidable in general, as it is a superset of the problem of deciding type inhabitation, which in turn corresponds to the decidability of propositions in the logic corresponding to a type theory, and then by any number of well-known results cannot be resolved in general for most interesting theories. This in turn means that the structure of a univalent universe is “describable” but it is not fully enumerable, and is very complex. We also have a line of work dating back to before the introduction of univalence, which investigated the higher groupoid structure (or, if you prefer, higher topological structure or quillen model structure) induced by identity types. But without either univalence or higher-inductive types, this higher groupoid structure is unobservable internally. This is to say, models were possible that would send types to things with higher structure, but no particular use would be made of this higher structure. So, such models could potentially be used to demonstrate that certain new axioms were not conservative over the existing theory, but on their own they did not provide ideas about how to extend the theory. How to relate this higher groupoid structure to universes? Well, in a universe, one has paths. Without univalence, these are just identity paths. But regardless, we now get a funny “completion” as our identity paths must themselves be objects in our universe, and so too the paths between them, etc. In models without higher structure, we might say “there is only one path from each object to itself” and then we need not worry too much about this potential explosion of paths at each level. But by enforcing the higher groupoid structure, this means that our universe now blossoms with all the potentially distinct paths at each level. However, with the only way in our syntax to create such “extra paths” as reflexivity, any such path structure in our model remains “latent”, and can be added or removed without any effect. The univalence axiom relies on these higher groupoid structures, but it cannot be reduced to them. Rather, in the model, we must have a fibration over the universe with identity lifting along this fibration to reach the next step — to then modify the universe by forcing paths other than identity paths — those between equivalent types. This is in a sense a further “higher completion” of our universe, adding in first all the possible paths between types, but then the paths between those paths, and so on up. Because, by univalence, we can state such paths, then in our model we must include all of them. The Problem All along I have been saying “models of type theory”. And it is true enough. We do know how to model type theories of various sorts categorically (i.e. representing the translation from their syntax into their semantics as functorial). But we do not have full models of "fully-featured" type theories; i.e. if we view type theories as pizzas we have models of cheese slices, and perhaps slices with olives and slices with pepperoni, etc. But we do not have models of pizzas with "all the toppings". Here, by "toppings" I mean things such as the addition of "all inductive types," "some coinductive types," "certain higher-inductive types," "pattern matching," "induction-induction," "induction-recursion," "excluded middle as an axiom," "choice as an axiom," "propositional resizing as an axiom," etc. Rather, we have a grab bag of tricks, as well as a few slightly different approaches — Categories with Attributes, Categories with Families, and so forth. One peculiar feature of these sorts of models, as opposed to the models of extensional type theory, is that these models are not indexed by types, but by “lists of types” that directly correspond to the contexts in which we make typing judgments in intensional theories. In any case, these models are usually used in an ad-hoc fashion. If you want to examine a particular feature of a language, you first pick from one of these different but related sorts of models. Then you go on to build a version with the minimal set of what you need — so maybe identity types, maybe sigma types, maybe natural numbers, and then you introduce your new construction or generate your result or the like. So people may say “we know how to work with these things, and we know the tricks, so given a theory, we can throw together the facts about it pretty quickly.” Now of course there are maybe only a hundred people on the planet (myself not among them) who can really just throw together a categorical model of some one or another dependent type theory at the drop of a hat. But there’s a broader problem. How can we speak about the mutual compatibility of different extensions and features if each one is validated independently in a different way? This is a problem very familiar to us in the field of programming languages — you have a lot of “improvements” to your language, all of a similar form. But then you put such “improvements” together and now something goes wrong. In fact, the famous “newtype deriving bug” in GHC some years back, which opened a big hole in the type system, was of exactly that form — two extensions (in that case, newtype deriving and type families) that are on their own safe and useful, together have an unexpected bad effect. It is also possible to imagine bad interactions occuring only when three extensions exist together, and soforth. So as the number of extensions increases, the number of interactions to check spirals upwards in a very difficult fashion. So the correct way to have confidence in the coexistence of these various extensions is to have a general model that contains the sort of theory we actually want to work in, rather than these toy theories that let us look at portions in isolation. And this certainly involves having a theory that lets us validate all inductive types at once in the model, rather than extending it over and over for each new type we add. Additionally, people tend to model things with at most one universe. And when we are not looking at universes, it is often omitted altogether, or done “incorrectly” as an inhabitant of itself, purely for the sake of convenience. So now, if I tell someone with mathematical experience what my theory “means” and they say “is this actually proven” I’m in the embarrassing position of saying “no, it is not. but the important bits all are and we know how to put them together.” So here I am, trying to advocate the idea of fully formal verification, but without a fully top-to-bottom formally verified system myself — not even codewise, but in even the basic mathematical sense. Univalence makes this problem more urgent. Without univalence, we can often get away with more hand-wavy arguments, because things are “obvious”. Furthermore, they relate to the way things are done elsewhere. So logic can be believed by analogy to how people usually think about logic, numbers by analogy to the peano system, which people already “believe in,” and soforth. Furthermore, without univalence, most operations are “directly constructive” in the sense that you can pick your favorite “obvious” and non-categorical model, and they will tend to hold in that as well — so you can think of your types as sets, and terms as elements of sets. Or you can think of your types as classifying computer programs and your terms as runnable code, etc. In each case, the behavior leads to basically what you would expect. But in none of these “obvious” models does univalence hold. And furthermore, it is “obviously” wrong in them. And that is just on the “propaganda” side as people say. For the same reasons, univalence tends to be incompatible with many “obvious” extensions — for example, not only “uniqueness of identity proofs” has to go, but pattern matching had to be rethought so as not to imply it, and furthermore it is not known if it is sound in concert with many other extensions such as general coinductive types, etc. (In fact, the newtype deriving bug itself can be seen as a "very special case" of the incompatibility of univalence with Uniqueness of Identity Proofs, as I have been discussing with people informally for quite some time). Hence, because univalence interacts with so many other extensions, it feels even more urgent to have a full account. Unlike prior research, which really focused on developing and understanding type systems, this is more of an engineering problem, although a proof-engineering problem to be sure. The Approach Rather than just giving a full account of “one important type system,” Voevodsky seems to be aiming for a generally smooth way to develop such full accounts even as type systems change. So he is interested in reusable technology, so to speak. One analogy may be that he is interested in building the categorical semantics version of a logical framework. His tool for doing this is what he calls a “C-system”, which is a slight variant of Cartmell’s Categories with Attributes mentioned above. One important aspect of C-systems seems to be that that they stratify types and terms in some fashion, and that you can see them as generated by some “data” about a ground set of types, terms, and relations. To be honest, I haven’t looked at them more closely than that, since I saw at least some of the “point” of them and know that to really understand the details I'll have to study categorical semantics more generally, which is ongoing. But the plan isn’t just to have a suitable categorical model of type theories. Rather it is to give a description of how one goes from the “raw terms” as syntax trees all the way through to how typing judgments are passed on them and then to their full elaborations in contexts and finally to their eventual “meaning” as categorically presented. Of course, most of these elements are well studied already, as are their interactions. But they do not live in a particularly compatible formulation with categorical semantics. This then makes it difficult to prove that “all the pieces line up” and in particular, a pain to prove that a given categorical semantics for a given syntax is the “initial” one — i.e. that if there is any other semantics for that syntax, it can be arrived at by first “factoring through” the morphism from syntax to the initial semantics. Such proofs can be executed, but again it would be good to have “reusable technology” to carry them out in general. pre-B-Systems Now we move into the proper notes on Voevodsky's "B-Systems" paper. If C-systems are at the end-point of the conveyor belt, we need the pieces in the middle. And that is what a B-system is. Continuing the analogy with the conveyor belt, what we get out at the end is a “finished piece” — so an object in a C-system is a categorified version of a “type in-context” capturing the “indexing” or “fibration” of that type over its “base space” of types it may depend on, and also capturing the families of terms that may be formed in various different contexts of other terms with other types. B-systems, which are a more “syntactic” presentation, have a very direct notion of “dynamics” built in — they describe how objects and contexts may be combined and put together, and directly give, by their laws, which slots fit into which tabs, etc. Furthermore, B-systems are to be built by equipping simpler systems with successively more structure. This gives us a certain sort of notion of how to talk about the distinction between things closer to "raw syntax" (not imbued with any particular meaning) and that subset of raw syntactic structures which have certain specified actions. So enough prelude. What precisely is a B-system? We start with a pre-B-system, as described below (corresponding to Definition 2.1 in the paper). First there is a family of sets, indexed by the natural numbers. We call it B_n. B_0 is to be thought of as the empty context. B_1 as the set of typing contexts with one element, B_2 as the set with two elements, where the second may be indexed over the first, etc. Elements of B_3 thus can be thought of as looking like "x_1 : T_1, x_2 : T_2(x_1), x_3 : T_3(x_1,x_2)" where T_2 is a type family over one type, T_3 a type family over two types, etc. For all typing contexts of at least one element, we can also interpret them as simply the _type_ of their last element, but as indexed by the types of all their prior elements. Conceptually, B_n is the set of "types in context, with no more than n-1 dependencies". Now, we introduce another family of sets, indexed by the natural numbers starting at 1. We call this set ˜B_n. ˜B_1 is to be thought of as the set of all values that may be drawn from any type in the set B_1, and soforth. Thus, each set ˜B_n is to be thought of as fibered over B_n. We think of this as "terms in context, whose types have no more than n-1 dependencies". Elements of ˜B_3 can be though of as looking like "x_1 : T_1, x_2 : T_2(x_1), x_3 : T_3(x_1,x_2) ⊢ y : x". That is to say, elements of B_n for some n look like "everything to the left of the turnstile" and elements of ˜B_n for some n look like "the left and right hand sides of the turnstile together." We now, for each n, give a map: ∂ : ˜B_n+1 -> B_n+1. This map is the witness to this fibration. Conceptually, it says "give me an element of some type of dependency level n, and I will pick out which type this is an element of". We can call ∂ the "type of" operator. We add a second basic map: ft : B_n+1 -> B_n This is a witness to the fact that all our higher B_n are built as extensions of smaller ones. It says "Give me a context, and I will give you the smaller context that has every element except the final one". Alternately, it reads "Give me a type indexed over a context, and I will throw away the type and give back just the context." Or, "Give me a type that may depend on n+1 things, and I will give the type it depends on that may only depend on n things. We can call ft the "context of" operator. Finally, we add a number of maps to correspond to weakening and substitution -- four in all. In each case, we take m >= n. we denote the i-fold application of ft by ft_i. 1) T (type weakening). T : (Y : B_n+1) -> (X : B_m+1) -> ft(Y) = ft_(m+1-n)(X) -> B_m+2  This reads: Give me two types-in-context, X and Y. Now, if the context for Y agrees with the context for X in the initial segment (i.e. discarding the elements of the context of X which are "longer" than the context for Y), then I can give you back X again, but now in an extended context that includes Y as well. 2) ˜T (term weakening). ˜T : (Y : B_n+1) -> (r : ˜B_m+1) -> ft(Y)=ft_(m+1-n)(∂(r)) -> ˜B_m+2 This reads: Give me a type-in-context Y, and a term-in-context r. Now, if the context of Y agrees with the context for the type of r as above, then I can give you back r again, but now as a term-in-context whose type has an extended context that includes Y as well. 3) S (type substitution). S : (s : ˜B_n+1) -> (X : B_m+2) -> ∂(s) = ft_(m+1-n)(X) -> B_m+1 This reads: give me a term-in-context s, and a type-in-context X. Now, if the context of the type of s agrees with the context of the X in the initial segment, we may then produce a new type, which is X with one less element in its context (because we have substituted the explicit term s for where the prior dependency data was recorded). 4) ˜S (term substitution). ˜S : (s : ˜B_n+1) -> (r : ˜B_m+2) -> ∂(s) = ft_(m+1-n)(∂(r)) -> ˜B_m+1 This reads: give me two terms-in-context, r and s. Now given the usual compatibility condition on contexts, we can produce a new term, which is like r, but where the context has one less dependency (because we have substituted the explicit term s for everywhere where there was dependency data prior). Let us now review what we have: We have dependent terms and types, related by explicit maps between them. For every term we have its type, and for every type we have its context. Furthermore, we have weakening by types of types and terms -- so we record where "extra types" may be introduced into contexts without harm. We also have substitution of terms into type and terms -- so we record where reductions may take place, and the resulting effect on the dependency structure. Unital pre-B-systems We now introduce a further piece of data, which renders a pre-B-system a _unital_ pre-B-system, corresponding to Definition 2.2 in the paper. For each n we add an operation: δ : B_n+1 -> ˜B_n+2 This map "turns a context into a term". Conceptually it is the step that equips a pre-B-system with a universe, as it is what allows types to transform into terms. I find the general definition a bit confusing, but I believe it can be rendered syntactically for for B_2, it can be as something like the following: "x_1 : T_1, x_2 : T_2(x_1) -> x_1 : T_1, x_2 : T_2(x_1), x_3 : U ⊢ x_2 : x_3". That is to say, given any context, we now have a universe U that gives the type of "universes of types", and we say that the type itself is a term that is an element of a universe. Informally, one can think of δ as the "term of" operator. But this specific structure is not indicated by any laws yet on δ. Indeed, the next thing we do is to introduce a "B0-system" which adds some further coherence conditions to restrict this generality. B0-systems The following are my attempt to "verbalize" the B0 system conditions (as restrictions on non-unital pre-B-systems) as covered in definition 2.5. I do not reproduce here the actual formal statements of these conditions, for which one should refer to the paper and just reason through very carefully. 1. The context of a weakening of a type is the same as the weakening of the context of a type. 2. The type of the weakening of a term is the same as the weakening of the type of a term. 3. The context of a substitution into a type is the same as the substitution into a context of a type 4. The type of a substitution into a term is the same as a substitution into the type of a term. Finally, we "upgrade" a non-unital B0-system to a unital B0-system with one further condition: 5. ∂(δ(X)) =T(X,X). I read this to say that, "for any type-in-context X, the type of the term of X is the same as the weakening of the context of X by the assumption of X itself." This is to say, if I create a term for some type X, and then discard that term, this is the same as extending the context of X by X again. Here, I have not even gotten to "full" B-systems yet, and am only on page 5 of a seventeen page paper. But I have been poking at these notes for long enough without posting them, so I'll leave off for now, and hopefully, possibly, when time permits, return to at least the second half of section 2. September 17, 2015 The GHC Team GHC Weekly News - 2015/09/17 Hi *, Welcome for the latest entry in the GHC Weekly News. It's been a little while, but here we are! And your author has finally returned from his 8 week sabbatical, too! So without any futher ado, lets get going... 8.0.1 release roadmap So HEAD has been steaming along pretty smoothly for the past few months now. After talking with Simon last week, we decided that the best course of action would be to release 8.0.1 (a super-major release) sometime around late February, which were the plans for 7.10 (modulo a few weeks due to the FTP debates). The current schedule is roughly: • November: Fork the new ghc-8.0 STABLE branch • At this point, master development will probably slow as we fix bugs. • This gives us 2 months or so until branch, from Today. • This is nice as the branch is close to the first RC. • December: First release candidate • Mid/late February: Final release. "Why call it 8.0.1?", you ask? Because we have a lot of excellent features planned! I'm particularly partial to Richard's work for merging types and kinds (Phab:D808). But there's a lot of other stuff. For all the nitty gritty details, be sure to check 8.0.1 status page to keep track of everything - it will be our prime source of information and coordination. And be sure to read my email to ghc-devs for more info. ... and a 7.10.3 release perhaps? On top of this, we've been wondering if another release in the 7.10 branch should be done. Ben did the release shortly after I left, and for the most part looked pretty great. But there have been some snags, as usual. So we're asking: who needs GHC 7.10.3? We'd really like to know of any major showstoppers you've found with 7.10 that are preventing you from using it. Especially if you're stuck or there's no clear workaround. Currently, we're still not 100% committed to this course of action (since the release will take time away from other things). However, we'll keep the polls open for a while - so please get in touch with us if you need it! (Be sure to read my email for specific information.) List chatter (Over the past two weeks) Noteworthy commits (Over the past several weeks) Closed tickets (Over the past two weeks) September 14, 2015 Neil Mitchell Making sequence/mapM for IO take O(1) stack Summary: We have a version of mapM for IO that takes O(1) stack and is faster than the standard Haskell/GHC one for long lists. The standard Haskell/GHC base library sequence function in IO takes O(n) stack space. However, working with Tom Ellis, we came up with a version that takes O(1) stack space. Our version is slower at reasonable sizes, but faster at huge sizes (100,000+ elements). The standard definition of sequence (when specialised for both IO and []) is equivalent to: sequence :: [IO a] -> IO [a]sequence [] = return []sequence (x:xs) = do y <- x; ys <- sequence xs; return (y:ys) Or, when rewritten inlining the monadic bind and opening up the internals of GHC's IO type, becomes: sequence :: [IO a] -> IO [a]sequence [] = IO$ \r -> (# r, () #)sequence (y:ys) = IO $\r -> case unIO y r of (# r, y #) -> case unIO (sequence xs) r of (# r, ys #) -> (# r, y:ys #) For those not familiar with IO, it is internally defined as: newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #) Each IO action takes a RealWorld token and returns a RealWorld token, which ensures that IO actions run in order. See here for a full tutorial. Our observation was that this version requires O(n) stack space, as each recursive call is performed inside a case. The algorithm proceeds in two phases: • First, it traverses the input list, evaluating each action and pushing y on the stack. • After reaching the [] at the end of the list, it traverses the stack constructing the output list. By constructing the list directly on the heap we can avoid the extra copy and use O(1) stack. Our version is: sequenceIO :: [IO a] -> IO [a]sequenceIO xs = do ys <- IO$ \r -> (# r, apply r xs #)        evaluate $demand ys return ys where apply r [] = [] apply r (IO x:xs) = case x r of (# r, y #) -> y : apply r xs demand [] = () demand (x:xs) = demand xs Here the two traversals are explicit: • First, we traverse the list using apply. Note that we pass the RealWorld token down the list (ensuring the items happen in the right order), but we do not pass it back. • To ensure all the IO actions performed during apply happen before we return any of the list, we then demand the list, ensuring the [] element has been forced. Both these traversals use O(1) stack. The first runs the actions and constructs the list. The second ensures evaluation has happened before we continue. The trick in this algorithm is: ys <- IO$ \r -> (# r, apply r xs #)

Here we cheat by duplicating r, which is potentially unsafe. This line does not evaluate apply, merely returns a thunk which when evaluated will force apply, and cause the IO to happen during evaluation, at somewhat unspecified times. To regain well-defined evaluation order we force the result of apply on the next line using demand.

Benchmarks

We benchmarked using GHC 7.10.2, comparing the default sequence (which has identical performance to the specialised monomorphic variant at the top of this post), and our sequenceIO. We benchmarked at different lengths of lists. Our sequenceIO is twice as slow at short lists, draws even around 10,000-100,000 elements, and goes 40% faster by 1,000,000 elements.

Our algorithm saves allocating stack, at the cost of iterating through the list twice. It is likely that by tuning the stack allocation flags the standard algorithm would be faster everywhere.

Using sequence at large sizes

Despite improved performance at large size, we would not encourage people to use sequence or mapM at such large sizes - these functions still require O(n) memory. Instead:

• If you are iterating over the elements, consider fusing this stage with subsequence stages, so that each element is processed one-by-one. The conduit and pipes libraries both help with that approach.
• If you are reducing the elements, e.g. performing a sum, consider fusing the mapM and sum using something like foldM.

For common operations, such a concat after a mapM, an obvious definition of concatMapM is:

concatMapM :: Monad m => (a -> m [b]) -> [a] -> m [b]concatMapM f = liftM concat . mapM f

But that if the result of the argument is regularly [] then a more efficient version is:

concatMapM op = foldr f (return [])    where f x xs = do x <- op x                      if null x then xs else liftM (x ++) xs

For lists of 10,000 elements long, using the function const (return []), this definition is about 4x faster. Version 1.4.2 of the extra library uses this approach for both concatMapM and mapMaybeM.

What are type families?

I just returned from yet another fantastic ICFP week. I always have a blast at the conference. It would be easy to take the conversations I had there and turn them into a full-blown research program that would easily take me to next year’s ICFP… and then rinse and repeat!

But in this post, I want to focus on one particlar thorn in my side that started bothering me more than it had before: type families. I’ve realized, in the course of many conversations, that I don’t really understand them. This post will be more questions than answers as I try to write down what ails me.

> {-# LANGUAGE TypeFamilies, DataKinds, UndecidableInstances, GADTs,
>              MultiParamTypeClasses, FunctionalDependencies,
>              FlexibleInstances, FlexibleContexts #-}
> module WhatAreTypeFamilies where


Total type families are OK

A total type family feels quite comfortable. By total here, I mean a type family whose equations cover all the possible cases and which is guaranteed to terminate. That is, a total type family is properly a function on types. Those other dependently typed languages have functions on types, and they seem to work nicely. I am completely unbothered by total type families.

Non-covering type families are strange

A non-covering type family is a type family whose patterns do not cover the whole space. Let’s consider closed and open families separately, because the issues that come up are different.

Closed type families

For example:

> type family F1 a where
>   F1 Int = Bool


Note that F1 is closed – we won’t ever be adding more equations. F1 is clearly non-covering. Yet, the type F1 Char is perfectly well formed.

> sillyId :: F1 Char -> F1 Char
> sillyId x = x


But sillyId can’t ever be called on a value. This doesn’t stop us from doing chicanery with F1 though.

> data SillyGadt a where


Now, even though F1 Char is nonsense, SillyGadt (F1 Char) is inhabited. And GHC distinguishes between F1 Char and F1 Double.

In GHC, a type family that can’t reduce is considered “stuck”. Such a type is effectively a new type, equal only to itself, like empty datatypes a programmer might declare. The only thing different about something like F1 Char from something like Maybe Char is that F1 Char cannot be used in a type pattern (that is, the left-hand side of a type family equation or data family instance). That’s the only difference. So, when we declare F1, we create an infinite family of unique types containing F1 applied to all types of kind *, except for Int, because F1 Int is identically Bool.

I find this treatment of non-covering closed type families rather bizarre. Compare against term-level behavior. When a term-level function is non-covering, passing an argument for which there are no equations immediately results in a call to error. Yet the same thing in types behaves very differently.

I’m not the only one who finds this strange: Lennart Augustsson posted GHC bug #9636 about this very issue. Indeed, my current views on the subject are strongly influenced by Lennart’s comments.

But, sometimes you want stuckness

Sadly, sometimes authors want stuck type families. I know of two examples.

1. The bowels of GHC defines this:
type family Any :: k where {}

Thus, Any is a closed type family, available at any kind, with no equations. GHC makes convenient use for Any. Consider the expression length []. This expression has some pointless polymorphism in it: length can be applied to a list of any type, and [] can be a list of any type. Yet, GHC must choose precisely which type these polymorphic definitions are specialized to. Not to play favorites, GHC chooses Any :: *. Such pointless polymorphism can indeed happen at any kind, so Any is polymorphic in its return kind. (Before kind polymorphism, Any existed as a special form. It’s not so special any more.)

It’s quite important that Any is a type family, not a datatype. If it were a datatype (and we forget about the weirdness of a datatype polymorphic in its return kind), then it could be matched in type patterns, making this strangeness possible:

type family ThreeBools (b :: Bool) :: Nat where
ThreeBools 'True  = 1
ThreeBools 'False = 2
ThreeBools Any    = 3

Bizarre! And yet this was possible in GHC 7.6 and below (modulo the closed type family, which didn’t exist yet). And it caused problems.

Of course, with Any, we really want it to be stuck. We don’t want Any to reduce to some other type, and we don’t want uses of Any to be errors.

2. In his work implementing a units-of-measure library via type-checker plugins, Adam Gundry uses closed type families to define type functions whose behavior is specified only via a plugin, not by GHC’s built-in type family reduction rules. He wanted empty closed type families specifically, and so implemented them within GHC. The idea is that units-of-measure with their integer powers form an abelian group, which can be solved via custom machinery but not by GHC’s rather limited left-to-right rewriting used in type family reduction.

Once again, we want a closed type family that is stuck but is not an error. The existence of multiple such things got me ruminating on all of this some time ago. I attempted to figure this all out in a post on a GHC ticket, but that didn’t get me very far. And I won’t claim to have made any real progress since that post.

Open type families

Having a stuck open type family makes moderately more sense than a stuck closed type family. Suppose we have this:

> type family F2 a
> type instance F2 Int = Bool


Because new equations can be added to open type families at any time, talking about F2 Char isn’t utterly bogus. But it might be nice to have to assert that an equation for F2 Char is in scope before we use it in a type.

Injective type families

Injective type families are a new feature that just got merged with GHC head. Jan Stolarek, Simon Peyton Jones, and I had a paper at Haskell Symposium about them.

The idea is that you can put an injectivity annotation on a type family declaration, which is rather like a functional dependency. (The injective type family code in this post is not runnable, as it works only with GHC HEAD. The feature will be available with GHC 8.0, due out early 2016.)

type family F3 a = r | r -> a where
F3 Int  = Bool
F3 Bool = Int
F3 a    = a

Note that if you know the value of F3 a, then you can discover the value of a: that’s what injectivity is all about.

Now consider this:

type family F4 a = r | r -> a where
F4 (Maybe a) = a

Looks injective, right? Wrong. The problem is that, if this were injective, we could take a proof that F4 a ~ F4 b and get a proof of a ~ b. Now, consider F4 (Maybe (F4 Int)). This reduces to F4 Int. Thus, we have F4 (Maybe (F4 Int)) ~ F4 Int. And now, by injectivity, we get Maybe (F4 Int) ~ Int. That is, a Maybe <blah> equals an Int. Terrible!

So, in our implementation of injectivity, we exclude type families like F4. But it’s really unexpected that F4 should be rejected (at least to me). The problem, at its core, is about partiality. Note that we needed F4 Int – an un-covered case – to make the bad equality proof.

Looping type families are strange

At first blush, it seems that a looping type families cause chiefly one problem: a looping GHC. But I’m very unconcerned about a looping GHC. After all, if a user has written a non-terminating type family, that amounts to a non-terminating program to be run at compile time. No one should be very surprised when a program that contains a loop does, in fact, loop.

The problem is that looping type families can actually create effectively infinite types. GHC has various protections against actually running off to infinity, but a clever programmer can defeat these protections. And infinite types can cause unsoundness. Indeed, while writing the closed type family paper, we uncovered one such unsoundness, closing bug #8162.

However, even with #8162 gone, we still have a lingering problem. Consider

> type family F5 a b where
>   F5 a a = 'True
>   F5 a b = 'False


This is a straightforward equality test, and it works in many cases. But if we ask F5 a (Maybe a) (for some unknown a), the type family doesn’t reduce. It really should! There is no way a finite a can be the same as Maybe a. But we can have an infinite a, instantiated to MaybeInf:

> type family MaybeInf where MaybeInf = Maybe MaybeInf


MaybeInf here is a perfectly well-formed nullary (0-argument) type family. If the a in F5 a (Maybe a) were instantiated with MaybeInf, then reducing to 'True would be correct. And thus GHC can’t reduce F5 a (Maybe a) to 'False, as one might like.

And this isn’t purely an academic concern. I don’t have links I can share, but this has bitten real programmers doing real work. Once again, the problem boils down to partiality: GHC must be conservative about reducing closed type families (such as F5) because of the possibility that someone, somewhere will write a non-terminating type family. If F5 a (Maybe a) were to reduce to 'False, I’m confident that a clever enough person could use this behavior to implement unsafeCoerce. Similar concerns prevent certain type families from being considered injective – see the paper for the example. If we could somehow be sure that there were no non-terminating type families, these concerns would go away.

Contrast with functional dependencies

Simply, functional dependencies don’t suffer from these problems. Let’s look at the issue around injectivity, as it’s representative. First, recast in terms of functional dependencies:

> class F4' a r | a -> r, r -> a
> instance F4' (Maybe a) a


Note the bidirectional functional dependency. The left-to-right dependency is the built-in dependency inherent in type families. The right-to-left dependency arises from the injectivity annotation.

Our problem case above was F4 (Maybe (F4 Int)). This case maps to

fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => ()
fundep = ()

Couldn't match type ‘Maybe f4_int’ with ‘Int’
arising from a functional dependency between:
constraint ‘F4' Int f4_int’
arising from the type signature for
fundep :: (F4' Int f4_int, F4' (Maybe f4_int) result) => Int
instance ‘F4' (Maybe a) a’

So GHC notices that there is a problem. But even if it didn’t, we’d be OK. That’s because the F4' Int f4_int constraint essentially asserts that there is some value for F4' Int. Of course, there isn’t any such value, and so fundep would never be callable. Indeed, we can’t write any more instances of F4', as anything we could write would violate the dependencies.

The bottom line here is that there is no way to get into trouble here because the constraints necessary to use bogus functional dependencies can never be satisfied.

So, what should we do about all of this? I’m inspired by the non-problems with functional dependencies. I thus propose the following. (Actually, I don’t. This is too radical. But I’m trying to provoke discussion. So take this all with a heap of salt.)

1. Remove unassociated open type families. They’re quite awkward anyway – either they should be closed and define some sort of concrete function, or they will need a class constraint nearby to be useful.
2. Require class constraints to use associated type families. That is, if F a is associated with a class C a, writing F Int without a C Int constraint in scope is an error. This means that the use of an associated type family in the right-hand side of another associated type family will require the class constraint in the instance context.
3. Require closed type families to be total, with an improved termination checker. Naturally, closed type families would be unable to call partial type families.
4. Devise a mechanism to associate a closed type family with a class. This would allow for partial closed type families, which indeed are useful.

The bottom line to all of this is that any partial type family should be associated with a class constraint. If the constraint can be satisfied, this means that the type family is defined and well-behaved at the arguments provided. With this, I believe all the problems above go away.

Another way to see this proposal is that functional dependencies are the Right Way to do this, and type families (at least partial ones) are wonky. But type families have much better syntax, and I’d like to keep that.

So, where to go from here? I’m not sure. This proposal is pretty radical, but it could potentially be implemented with the requisite deprecation period, etc. But, mostly, I’m curious to see what conversation this all provokes.

September 13, 2015

I’ve been thinking of writing such an article for a while. A few weeks ago, I got contacted by people who wanted to know more about my experience with luminance so that they can have more hindsight about their own APIs and products.

I came to the realization that I could write a blog entry to discuss designs decisions and, at some extent, what a good design entails. Keep in mind it’s only personal thoughts and that I won’t talk for someone else.

Elegancy

I love mathematics because they’re elegant. Elegancy implies several traits, among simplicity, flexibility and transparency. They solve problems with very nice abstractions. In mathematics, we have a concept that is – astonishingly – not very spread and barely known outside of math geeks circles: free objects.

The concept of free is a bit overwhelming at first, because people are used to put labels and examples on everything. For instance, if I say that an object is free, you might already have associated some kind of lock to that object, so that you can get why it’s free. But we’re mistaken. We don’t need locks to define what free implies. In mathematic, a free object is an object that can’t be defined in terms of others. It’s a bit like a core object. It’s free because it can be there, no matter what other objects are around. It has no dependency, it doesn’t require no other interaction. You can also say that such an object is free of extra features that wouldn’t be linked to its nature.

This free property is a very interesting property in mathematics, because it’s surprisingly simple! We can leverage that mathematic abstraction to software design. I like keeping my softwares as much free as possible. That is – with a more human language to say it – constraining them to keep low responsibilities about what they’re doing.

Responsibility domains

The important thing to keep in mind is that you should, at first, define what the responsibility domain is all about. Let’s say you’d like to create a library to implement audio effects, like the Doppler effect – that effect actually exists for any kind of wave, but it’s interesting to synthetize it for a sound-related application. If you end up writing functions or routines to play sound or to load audio samples, you’re already doing it wrong! You’d have violated your reponsibility domain, which is, “audio effects”. Unfortunately, a lot of libraries do that. Adding extra stuff – and sometimes, worse; relying on them!

A lot of people tend to disagree with that – or they just ignore / don’t know. There’re plenty of examples of libraries and softwares that can do everything and nothing. For instance, take Qt – pronounce cute or cutie. At first, Qt is a library and an API to build up GUIs – Graphical User Interfaces – and handle windows, events and so on. Let’s have a look at the documentation of modules, here.

You can see how the responsibility domain is huge! GUI, radio, audio, video, camera, network, database, printing, concurrency and multithreading… Qt isn’t a library anymore; it’s a whole new language!

People tend to like that. “Yeah, I just have to use Qt, and I can do everything!”. Well, that’s a point. But you can also think it another way. Qt is a very massive “library” you’ll spend hours reading the documentation and will use a lot of different classes / functions from different aspects. That doesn’t compose at all. What happens when you want to – or when you don’t have the choice? – use something else? For instance, if you want to use a smaller–but–dedicated threading library? What happens if you want to use a database service you wrote or that you know it’s great? Do you wipeout your Qt use? Do you… try to make both work in harmony? If so, do you have to write a lot of boilerplate code? Do you forget about those technologies and fallback on Qt? Do the concepts map to each others?

The problem with massive libraries is the tight bound it creates between the libraries and the developers. It’s very hard with such libraries to say that you can use it whenever you want because you perfectly know them. You could even just need a few things from it; like, the SQL part. You’ll then have to install a lot of code you’ll perhaps use 10% of.

KISS

I love how the free objects from mathematics can be leveraged to build simpler libraries here. The good part about free objects is the fact that they don’t have any extra features embedded. That’s very cool, because thanks to that, you can reason in terms of such objects as-is. For instance, OpenAL is a very free audio library. Its responsibility domain is to be able to play sound and apply simple effects on them – raw and primary effects. You won’t find anything to load music from files nor samples. And that’s very nice, because the API is small, simple and straight-forward.

Those adjectives are the base of the KISS principle. The ideas behind KISS are simple: keep it simple and stupid. Keep it simple, because the simpler the better. A too complex architecture is bloated and ends up unmaintainable. Simplicity implies elegancy and then, flexibility and composability.

That’s why I think a good architecture is a small – in terms of responsibility – and simple one. If you need complexity, that’s because your responsibility domain is already a bit more complex than the common ones. And even though the design is complex for someone outside of the domain, for the domain itself, it should stay simple and as most straight-forward as possible.

API

I think a good API design is to pick a domain, and stick to it. Whatever extra features you won’t provide, you’ll be able to create other libraries to add those features. Because those features will also be free, they will be useful in other projects that you don’t even have any idea they exist! That’s a very cool aspect of free objects!

There’s also a case in which you have to make sacrifices – and crucial choices. For instance, event-driven programming can be implemented via several techniques. A popular one in the functional programming world nowadays is FRP. Such a library is an architectural codebase. If you end up adding FRP-related code lines in your networking-oriented library, you might be doing it wrong. Because, eh, what if I just want to use imperative event-driven idioms, like observers? You shouldn’t integrate such architectural design choices in specific libraries. Keep them free, so that everyone can quickly learn them, and enjoy them!

I like to see good-designed libraries as a set of very powerful, tiny tools I can compose and move around freely. If a tool gets broken or if it has wrong performances, I can switch to a new one or write my very own. Achieving such a flexibility without following the KISS principle is harder or may be impossible to reach.

So, in my opinion, we should keep things simple and stupid. They’re simpler to reason about, they compose and scale greatly and they of course are easier to maintain. Compose them with architectural or whatever designs in the actual final executable project. Don’t make premature important choices!

What do we mean when we say "the Haskell community"?

One of the folks I've chatted with a bunch online and finally got to meet in-person this year was Gershom. Towards the end of the symposium, he mentioned the post I linked to last time about my pulling away from Haskell communities. But the way he phrased it really struck me. And I wanted to comment on that.

I guess the crux of it comes down to, what do we mean when we say "the Haskell community"? What I meant when writing that post over a year ago was the online community of Haskell learners/practitioners (i.e., the folks who dominate spaces like reddit). But what Gershom's comment suggested was the academic community of FP researchers who work on/with Haskell (i.e., the folks who dominate spaces like ICFP). These two communities have always been intertwined, but as Haskell gains popularity they feel more and more distinct.

FWIW, I do still feel welcome as part of the academic community. There are some things I could comment on, sure, but those have more to do (i think) the general issue of being a woman in tech than they do with Haskell per se. Of course, this only underscores the concern of my original post. The free exchange between academics and practitioners was always a big part of what I've loved about the (general) Haskell community. Were not for that exchange, the spark planted by Mark Jones would never have fanned into the flame of my current research. As that free exchange unravels, this pathway from interested novice to researcher is cut off. And that pathway is crucial for the vibrancy of the community, as well as for addressing those general issues of being a minority in tech.

[ajoyvtsr] Executable specification of MapReduce

We present, for pedagogical purposes, a simple Haskell implementation of MapReduce.  MapReduce is a technology idea popularized by Google.  The polymorphic type signature specifies exactly what the input functions passed in by the user (mapFn and reduceFn) are supposed to do. Of course, a real MapReduce implementation would be parallel across multiple machines, fault tolerant, use external sort etc.

{-# LANGUAGE ScopedTypeVariables #-}import Data.List(groupBy, sortOn);mapReduce :: forall a key value b . Ord key => (a -> [(key,value)]) -> (key -> [(a,value)] -> b) -> [a] -> [b]; mapReduce mapFn reduceFn input = map (uncurry reduceFn) $shuffle$ do { -- list monad   x :: a <- input;   y :: (key,value) <- mapFn x;   return (x,y); -- all pairs }; -- profiling reveals the majority of the computation time is spent here, not too surprising. shuffle :: forall a key value . Ord key => [(a,(key,value))] -> [(key,[(a,value)])]; shuffle = let {   get_a :: (a,(key,value)) -> a;   get_a (a1,_) = a1;   get_key :: (a,(key,value)) -> key;   get_key (_,(k,_)) = k;   get_value :: (a,(key,value)) -> value;   get_value (_,(_,v)) = v;   rearrange :: [(a,(key,value))] -> (key,[(a,value)]);   rearrange l = (get_key $head l, zip (map get_a l) (map get_value l)); } in map rearrange . groupBy (equating get_key) . sortOn get_key; -- point-free style -- regular list sort 364.75 sec -- Seq.unstableSortBy = 440.98 sec -- cf Data.Ord.comparing equating :: Eq b => (a -> b) -> a -> a -> Bool; equating f x y = (f x) == (f y);  Well-Typed.Com Haskell Hackathon, Haskell eXchange, Haskell courses in London, October 2015 Haskell events in London In the time from 5–13 October, we are (co-)organizing a number of Haskell-related events in London, with Skills Matter. Here’s the overview: Haskell infrastructure Hackathon We’ll co-organize and participate in a two-day Haskell Hackathon, which takes place directly after the Haskell eXchange. This Hackathon aims at bringing together Haskell developers – both beginners and experts – who want to help improve the Haskell infrastructure, predominantly Hackage and Cabal. We’ll aim to arrange some introductory overview talks, to e.g. provide an overview over the code bases and the most important open issues. Participation is free, but please register via Skills Matter. Haskell eXchange The Haskell eXchange 2015 will be bigger than ever before. Expanded to two days and two tracks, it features four keynotes, two tutorials, and more than twenty speakers in total. Here’s the preliminary list of speakers and topics: • Keynote by Simon Peyton Jones on “Into the Core: understanding GHC’s intermediate language” • Keynote by Lennart Augustsson on “Giving Types to Relations” • Keynote by Simon Marlow on “Fun with Haxl” • Keynote by Luite Stegeman on “Solving the JavaScript Problem” • Workshop by Tom Ellis on “Opaleye” • Workshop by Ivan Perez on “Game programming for fun and profit” • Talk by Jasper van der Jeugt on “The Ludwig DSL” • Talk by Gershom Bazerman on “Programming with Universal Properties” • Talk by Neil Mitchell on “Shake” • Talk by Johan Tibell on “High-performance Haskell” • Talk by Francesco Mazzoli on “inline-c” • Talk by Matthew Pickering on “ghc-exactprint” • Talk by Alp Mestanogullari on “Servant” • Talk by Alfredo di Napoli on “Using Haskell at Iris Connect” • Talk by Miëtek Bak on “Building your own proof assistant” • Talk by Lars Hupel and Miles Sabin on “What Haskell can learn from Scala” • Talk by Vladimir Kirillov on “Haskell goes Devops” • Talk by Nicolas Wu on “Transformers: handlers in disguise” • Short talk by Martijn van Steenbergen on “JsonGrammar” • Short talk by Bodil Stokke on “PureScript” • Short talk by Blair Archibald on “HdpH” • Short talk by Philipp Kant on “Data avoidance in Haskell” • Short talk by Andraž Bajt on “Using Haskell as a thinking tool” • Short talk by San Gillis on “Haskell development with Docker” Registration is possible via Skills Matter. The promo code HASKELL-EXCHANGE-25 (has been extended to be valid until September 19!) can be used to get a 25% reduction. Haskell courses In connection with the Haskell eXchange and the Haskell infrastructure hackathon, Well-Typed are offering courses with Skills Matter. If you cannot come to London in October, but are interested in our course offerings, see Training for more information. Fast Track to Haskell The Fast Track course is a two-day compact introduction to Haskell, assuming previous programming experience, but no familiarity with Haskell or functional programming. It covers topics such as defining datatypes and functions, higher-order functions, explicit side effects and monads. Guide to the Haskell Type System The Guide to the Haskell Type System course is a one-day introduction to various type-system extensions that GHC offers, such as GADTs, rank-n polymorphism, type families and more. It assumes familiarity with Haskell. It does not make use of any other advanced Haskell concepts except for the ones it introduces, so it is in principle possible to follow this course directly after Fast Track. However, as this course focuses very much on the extreme aspects of Haskell’s type system, it should probably only be taken by participants who are enthusiastic about static types and perhaps familiar with a strong static type system from another language. Advanced Haskell The Advanced Haskell course is targeted at Haskellers who are comfortable with the Haskell basics, and want to learn more about how to write larger Haskell programs. The course covers topics such as data structures, their complexity, pitfalls of lazy evaluation, profiling, GHC’s internal core language, and some more advanced design patterns such as monad transformers and how to use them effectively. Once again, strictly speaking this course can be followed when just having completed Fast Track. But the nature of this course’s contents also means that several of the topics can be more appreciated if one has written more Haskell code in practice already. wren gayle romano Back home from ICFP Got back from Vancouver a couple days ago. The return flight was hell, but other than that the trip was pretty good. Got to meet a whole bunch of folks who read my blog (hi!), which was super cool. It always gives me warm fuzzies to know people are actually interested in my stuff. (You'd think I'd get used to it after a while, but no, evidently childhood scars still remain years after developing a solid sense of self-esteem. Go fig.) Of course, now I feel bad about not having written much of late. I have a few posts I've been meaning to write —mostly technical ones—, and I'm thinking I might should try to kick those out as a way to get back into the daily writing habit needed for finishing up my dissertation. Since handing out my card at ICFP, I've been hacking around on my website. I'm still working on getting my publications/presentations to be properly formatted, and the timestamps in the footers on a couple pages are busted, but other than that I think it's in pretty good shape. I also took part in a keysigning party for the purpose of building a WoT as part of the developing story for how we securely deliver packages in Haskell. My key is up on Keybase as well as on my website (along with all my other contact info). After a week of great talks, too much drinking, and too little sleep, I met up with a dear friend from undergrad. It's been far far too long since we've had a chance to hang out (alas, he couldn't make the wedding last year), but he's one of those friends you can just fall right back into. It's funny how much can change and yet stay the same. In one of our ambling walks I dragged him into a clothing store —something I never would've done pre-transition. He didn't seem to mind. He just went along like it's the sort of thing we've always done. And it felt to me like the sort of thing we could've always done. Recently he's been ref'ing for roller derby up in Victoria, and after talking about it I'm thinking I might try my hand at it. Sounds like a lot of fun. We have a team in Bloomington, and they're doing a training/recruiting thing, though alas I'll miss it since I'll be in Mountain View on the 14th. I'll see if I can't find some other way to get introduced. comments September 10, 2015 Don Stewart (dons) Senior Haskell developer roles at Standard Chartered [Singapore] The Strats team at Standard Chartered has two open positions for senior typed functional programming developers, based in Singapore. Strats are a specialized software engineering and quantitative analysis team who build a broad range of software for financial markets users at Standard Chartered. You will work on the trading floor, directly with traders, building software to automate their work and improve their efficiency. The role is highly development focused and you will use Haskell for almost all tasks: data analysis, market data publishing, database access, web services, desktop GUIs, large parallel tasks, quantitative models, solvers, everything. This is a fast paced role – code you write today will be deployed within hours to hundreds of users and has to work. These are permanent, director-level positions, in Singapore as part of the Strats global team. Demonstrated experience in typed FP (Haskell, OCaml, F# etc) is required. We have around 2.5 million lines of Haskell, and our own Haskell compiler. In this context we look for skill and taste in typed functional programming to capture and abstract over complex, messy systems. Experience writing typed APIs to external systems such as databases, web services, pub/sub platforms is very desirable. We like working code, so if you have Hackage or github libraries, we definitely want to see them. We also like StackOverflow answers, blog posts, academic papers, or other arenas where you can show broad FP ability. A PhD in computer science is a strong advantage. The role requires physical presence on the trading floor in Singapore. Remote work is not an option. You will have some project and client management skills — you will talk to users, understand their problems and then implement and deliver what they really need. No financial background is required. These director-level positions have attractive remuneration for the right candidates. Relocation support will also be provided. Contracting-based positions are also possible if desired. Applicants who don’t necessarily meet all criteria but have an interest in working in Singapore and have an FP background are encouraged to apply. More info about our development process is in the 2012 PADL keynote, and a 2013 HaskellCast interview. If this sounds exciting to you, please send your resume to me – donald.stewart <at> sc.com. Tagged: jobs, singapore September 09, 2015 Philip Wadler Paul Hudak Symposium: Putting the Funk in Functional Programming Thank you to John Peterson, who is organising a symposium in memory of Paul Hudak. I was sorry to miss the session devoted to Paul at ICFP. Paul made huge contributions to FP, and though he was only a little older than me I was proud to have him as a mentor. I'm looking forward to the meeting. The symposium will take place Friday 29--Saturday 30 April 2016 at Yale. September 08, 2015 Dimitri Sabadie Luminance – ASAP We’re almost there! luminance, the Haskell graphics framework I’ve been working on for a month and a half, will be released very soon as 0.1 on hackage. I’m still working actively on several parts of it, especially the embedded documentation, wikis and main interface. Keep in mind that the internal design is 80% done, but the end-user interface might change a lot in the future. Because I’m a demoscener, I’ll be using luminance for the next months to release a demoscene production in Germany and provide you with a nice feedback about how usable it is, so that I can make it more mature later on. What to expect? Currently, luminance works. You can create buffers, shaders, framebuffers, textures and blend the whole thing to create nice (animated) images. Everything is strongly and (almost) dependently typed, so that you have an extra type safety. As I was developing the interface, I also wrote a new package that will be released on hackage as well: luminance-samples. As you might have guessed, that package contains several executables you can launch to test luminance. Those are just features sets. There’s an Hello, World! executable, a depth test executable, a blending one, a texture one, and so on and so forth. I’ll refactor them to make the code cleaner, but you should have a look to see what it entails to use luminance! ;) I’ll be very open-minded about what you guys think of luminance once it gets released. Even though I’ve started writing it for my own purposes, I clearly understand that a lot of people are interested in that project. I’ve been contacted by the developers of waylandmonad to explain them the choices I made with luminance so that they can do the same thing when migrating xmonad from the Xorg technology to Wayland. If I can help in any ways, even if it’s not about luminance directly, don’t hesitate then contact me! I can’t give you a 0.1 release milestone yet, but you should be able to install it from hackage and stackage very soon! I’ll write an article when it gets released, I promise. In the waiting, keep the vibe. Happy hacking around! September 07, 2015 Oliver Charles Announcing a new set of high-level SDL2 bindings It’s with great pleasure that on behalf of the haskell-game group, I’d like to announce the release of a new set of high-level bindings to the SDL library. SDL is a C library providing a set of cross-platform functions for handling graphics, window management, audio, joystick/gamepad interaction, and more. For a while, we’ve had bindings to SDL 2 on Hackage, but these bindings are as close to 1:1 as you can get in Haskell. This results in a library that certainly can be used in Haskell, but does not feel particularly like writing ordinary Haskell! A real concern here is that this raises the barrier to entry for those new to either game programming or writing games in Haskell (or both!) - a barrier that I would certainly like to see lowered. To address this, myself and many others have spent the last year working on high-level bindings to abstract away the C-like feel of the existing library, and to present a more Haskell interface. To give you an idea of how things look, here’s a basic application that opens a window, clears the screen, and quits when the user presses ‘q’: {-# LANGUAGE OverloadedStrings #-} module Main where import SDL import Linear (V4(..)) import Control.Monad (unless) main :: IO () main = do initialize [InitEverything] window <- createWindow "My SDL Application" defaultWindow renderer <- createRenderer window (-1) defaultRenderer appLoop renderer appLoop :: Renderer -> IO () appLoop renderer = do events <- pollEvents let eventIsQPress event = case eventPayload event of KeyboardEvent keyboardEvent -> keyboardEventKeyMotion keyboardEvent == Pressed && keysymKeycode (keyboardEventKeysym keyboardEvent) == KeycodeQ _ -> False qPressed = not (null (filter eventIsQPress events)) rendererDrawColor renderer$= V4 0 0 255 255
clear renderer
present renderer
unless qPressed (appLoop renderer)

Hopefully you’ll agree that the code above is close to idiomatic Haskell.

We’ve tried to be extensive with the bindings, and at the moment the following should (!) all be working:

• Graphics routines have been our highest priority. The bindings should give you full control over window management, access to SDL’s new hardware-accelerated 2D rendering routines, and also the ability to set up an OpenGL context (which is compatible with the OpenGL and gl libraries).
• SDL’s audio abstraction is available, and the bindings allow you to open audio devices and stream audio data.
• A clean implementation of SDL’s event system, designed for use with pattern matching.
• Access to input devices, including keyboard, mouse, pointer devices, and joysticks.
• A large collection of example code, ported from the popular “lazyfoo” and “twinklebear” tutorials.

The bindings are not 100% exhaustive - we’ve omitted some routines that are already provided by the Haskell runtime, but we also currently lack bindings to the following:

• Force-feedback (SDL’s “haptic” functionality). While we do have some code in the repository here, none of the contributors own a device that is compatible with SDL2 to actually test this work. If you do, please drop us a line and help out!
• Gesture recording for touch screens. We’re currently targetting desktops and laptops, but SDL has support for Android and iOS. Hopefully when GHC is easier to target these devices, we can start to explore these SDL bindings.
• Other SDL2 compatible libraries, such as SDL2_net and SDL2_ttf. We’d love for these projects to have the same treatment, and are more than happy to host them under the haskell-game Github account.

We hope this enables more people to begin building interactive software and games in Haskell. It’s still early days for these bindings, so if you find any bugs (runtime problems or API bugs), or if you find the bindings lacking in anyway, please don’t hesitate to open an issue on our issue tracker.

Happy hacking!

The actual base of decibels

The usual definition of the decibel is of course that the dB value y is related to the proportion x by

y = 10 · log10(x).

It bothers me a bit that there's two operations in there. After all, if we expect that y can be manipulated as a logarithm is, shouldn't there be simply some log base we can use, since changing log base is also a multiplication (rather, division, but same difference) operation? With a small amount of algebra I found that there is:

y = log(100.1)(x).

Of course, this is not all that additionally useful in most cases. If you're using a calculator or a programming language, you usually have loge and maybe log10, and 10·log10 will have less floating-point error than involving the irrational value 100.1. If you're doing things by hand, you either have a table (or memorized approximations) of dB (or log10) and are done already, or you have a tedious job which carrying around 100.1 is not going to help.

September 05, 2015

WIRED's business section has an article on Haskell, Facebook's New Spam-Killer Hints at the Future of Coding
LOUIS BRANDY PAUSES before answering, needing some extra time to choose his words. “I’m going to get in so much trouble,” he says. The question, you see, touches on an eternally controversial topic: the future of computer programming languages.
Brandy is a software engineer at Facebook, and alongside a team of other Facebookers, he spent the last two years rebuilding the system that removes spam—malicious, offensive, or otherwise unwanted messages—from the world’s largest social network. That’s no small task—Facebook juggles messages from more than 1.5 billion people worldwide—and to tackle the problem, Brandy and team made an unusual choice: they used a programming language called Haskell.
...
If you consider that companies like Facebook, Google, and Amazon represent where the rest of the internet is going—as the internet grows, so many other online services will face the same problems it faces today—Facebook’s Haskell project can indeed point the way for the programming world as a whole. That doesn’t mean Haskell will be ubiquitous in the years to come. Because it’s so different from traditional programming languages, coders often have trouble learning to use it; undoubtedly, this will prevent widespread adoption. But Facebook’s work is a sign that other languages will move in Haskell’s general direction.
...
What about Haskell itself? In the long run, could it evolve to the point where it becomes the norm? Could coders evolve to the point where they embrace it large numbers? “I don’t know,” Brandy says. “But I don’t think it would be a bad thing.”
Spotted via Katie Miller (@codemiller) and Manuel Chakravarty (@TechnicalGrace).

Week at ICFP: Injective Type Families merged into GHC HEAD

I spent last week at Vancouver, Canada attending Haskell Implementors Workshop, ICFP and the Haskell Symposium. Yesterday I gave a talk on injective type families, described also in my previous post. Slides are on my web page but the talk itself is not yet online. Day prior to the talk I finally managed to merge the injective type families branch, which means that this feature is definitely making it into the next stable release of GHC. (In case you haven’t heard this will be GHC 8.0, not 7.12.)

My next plans include extending injective type families to fully match expressiveness of functional dependencies, as outlined in Section 7.2 of the injectivity paper. I also hope to finally implement support for typed holes in Template Haskell. The patch was supposed to be trivial and I started working on it several months ago. But then I ran into several problems with it and abandoned it to focus on ITFs.

1Liners August 2015

• August 20th, 2015: Okay this: \(a,b) -> foo a b c d e Somehow curry-i-tize the above expression (make a and b go away!) Is this Applicative?
• JP @japesinator uncurry $flip flip e . flip flip d . flip flip c . foo • Conor McBride @pigworker (|foo fst snd (|c|) (|d|) (|e|)|) • August 19th, 2015: points-free define unintify: unintify :: (Int, Int) -> (Float, Float) where unintify (a,b) = (fromIntegral a, fromIntegral b) • August 19th, 2015: points-free define timeser: timeser :: (Float, Float) -> (Float, Float) -> (Float, Float) where timeser (a,b) (c,d) = (a*c, b*d) • August 18th, 2015: foo :: (Float, Float) -> (Float, Float) -> Int -> (Float, Float) points-free if: foo (a,b) (c,d) e = ((c-a)/e, (d-b)/e) Arrows? Bimaps? 1Liners July 2015 • July 29th, 2015: ... on a roll: Point-free-itize foo :: (a -> b, a -> b) -> (a, a) -> (b, b) foo (f,g) (x,y) = (f x, g y) • $c^3$ @das_kube uncurry (***) • July 29th, 2015: I can't believe this wasn't a #1Liner already. Point-free-itize dup: dup :: a -> (a,a) dup x = (x,x) • Antonio Nikishaev @lelff join (,) • $c^3$ @das_kube id &&& id • July 23rd, 2015: define pairsies so that, e.g.: pairsies [1,2,3] = {{1, 2}, {1, 3}, {2, 3}} pairsies :: [a] -> Set (Set a) • pairsies list = concat (list =>> (head &&& tail >>> sequence)) • July 23rd, 2015: define both :: (a -> b) -> (a,a) -> (b,b) • Chris Copeland @chrisncopeland point-freer: both = uncurry . on (,) • Brian McKenna @puffnfresh both = join bimap • July 23rd, 2015: point-free-itize: gen :: Monad m => (m a, m b) -> m (a, b) • Bob Ippolito @etrepum gen = uncurry (liftM2 (,)) • July 17th, 2015: You may have seen this before, but here we go. point-free-itize swap: swap :: (a,b) -> (b,a) 1Liners Pre-July 2015 • Point-free define: foo :: (Ord a, Ord b) => [([a], [b])] -> (Set a, Set b) • Андреев Кирилл @nonaem00 foo = (Set.fromList . concat *** Set.fromList . concat) . unzip • point-free-itize computeTotalWithTax :: Num b => ((a, b), b) -> b computeTotalWithTax ((a, b), c) = b + c • point-free-itize foo (k,v) m = Map.insert k b m with obvs types for k, v, and m. • point-free-itize: shower :: forall a. forall b. Show a => [b -> a] -> b -> [a] shower fns thing = map (app . flip (,) thing) fns • row :: String -> (Item, (USD, Measure)) given csv :: String -> [String] and line is = "apple,$1.99 Lb" hint: words "a b" = ["a","b"] ... all types mentioned above are in today's @1HaskellADay problem at http://lpaste.net/4698665561507233792
• For Read a, point-free-itize: f a list = read a:list (f is used in a foldr-expression)
• Or you could just do: map read
• point-free-itize f such that: f a b c = a + b + c

Using Caps Lock as Menu/Apps keys on Emacs

I’m an ergoemacs-mode user, a mode that changes most key bindings so that they put less strain on your hands.  For example, it uses Alt instead of Ctrl most of the time, which is easier to press: use your curled thumb instead of a karate chop.  Also, many commands are activated by first pressing the Menu/Apps key (that key near the Right Ctrl which usually opens the context menu).  For example, pressing Menu then T allows you to switch buffers.

However, the keyboard on my new notebook doesn’t have a dedicated Menu key.  Instead, one needs to press Fn+Right Ctrl, which is of course extremely painful.

Menu key hidden on the Right Ctrl.

I’ve found a workaround, though.  A very hackish workaround.

The ergoemacs-mode FAQ suggests using Caps Lock as a Menu/Apps key for Mac users.  Using xmodmap it’s trivial to make Caps Lock a Menu key:

$xmodmap -e "keycode 66 = Menu" However, using xmodmap properly with Gnome is nigh impossible. It’s recommend to use xkb instead, but xkb doesn’t support mapping Caps Lock to the Menu key out-of-the-box (at least not yet). At this point, having wandered through many documentation pages, I’ve decided to try using some of the xkb options that already exist. At first I tried setting Caps Lock as the Hyper key. However, by default the Hyper key gets the same modifier code as the Super key (which is usually the key with the Windows logo). There’s a straightforward way of separating them, but I couldn’t find a way to make it play nice with Gnome. And even if I could, it’s not clear to me if I could use the Hyper key as a substitute for the Menu key on emacs. When ready to admit defeat, I’ve set the Caps Lock behavior to “Caps Lock is disabled” in preparation of trying a hack using xmodmap. Much to my surprise, I accidentally discovered that emacs then began treating the disabled Caps Lock key as <VoidSymbol>! The gears started turning in my head, then I added the following line to my ~/.emacs file: (define-key key-translation-map (kbd "<VoidSymbol>") (kbd "<menu>")) Surprisingly, it worked! Now pressing Caps Lock then T will switch buffers, for example. As a bonus, pressing Caps Lock accidentally while on another application won’t do anything. It’s not clear to me how fragile this hack really is. I’ll update this blog post if I ever find some drawback to it. But right now it seems to work quite nicely. September 02, 2015 Roman Cheplyaka MonadFix example: compiling regular expressions {-# LANGUAGE RecursiveDo, BangPatterns #-} import Control.Applicative import Data.Function (fix) import Data.IntMap as IntMap import Control.Monad.Fix (mfix) import Control.Monad.Trans.State import Control.Monad.Trans.Class (lift) import Text.Read (readMaybe) MonadFix is an odd beast; many Haskell programmers will never use it in their careers. It is indeed very rarely that one needs MonadFix; and for that reason, non-contrived cases where MonadFix is needed are quite interesting to consider. In this article, I’ll introduce MonadFix and show how it can be handy for compiling the Kleene closure (also known as star or repetition) of regular expressions. What is MonadFix? If you hear about MonadFix for the first time, you might think that it is needed to define recursive monadic actions, just like ordinary fix is used to define recursive functions. That would be a mistake. In fact, fix is just as applicable to monadic actions as it is to functions: guessNumber m = fix$ \repeat -> do
putStrLn "Enter a guess"
n <- readMaybe <$> getLine if n == Just m then putStrLn "You guessed it!" else do putStrLn "You guessed wrong; try again" repeat So, what is mfix for? First, recall that in Haskell, one can create recursive definitions not just for functions (which makes sense in other, non-lazy languages) or monadic actions, but for ordinary data structures as well. This is known as cyclic (or circular, or corecursive) definitions; and the technique itself is sometimes referred to as tying the knot. The classic example of a cyclic definition is the (lazy, infinite) list of Fibonacci numbers: fib = 0 : 1 : zipWith (+) fib (tail fib) Cyclic definitions are themselves rare in day-to-day Haskell programming; but occasionally, the right hand side will be not a pure value, but a monadic computation that needs to be run in order to obtain the value. Consider this (contrived) example, where we start the sequence with an arbitrary number entered by the user: fibIO1 = do putStrLn "Enter the start number" start <- read <$> getLine
return $start : 1 : zipWith (+) fibIO1 (tail fibIO1) This doesn’t typecheck because fibIO is not a list; it’s an IO action that produces a list. But if we try to run the computation, it doesn’t make much sense either: fibIO2 = do putStrLn "Enter the start number" start <- read <$> getLine
fib <- fibIO2
return $start : 1 : zipWith (+) fib (tail fib) This version of fibIO will ask you to enter the start number ad infinitum and never get to evaluating anything. Of course, the simplest thing to do would be to move IO out of the recursive equation; that’s why I said the example was contrived. But MonadFix gives another solution: fibIO3 = mfix$ \fib -> do
putStrLn "Enter the start number"
start <- read <$> getLine return$ start : 1 : zipWith (+) fib (tail fib)

Or, using the do-rec syntax:

fibIO4 = do
rec
fib <- do
putStrLn "Enter the start number"
start <- read <$> getLine return$ start : 1 : zipWith (+) fib (tail fib)
return fib

Compiling regular expressions

As promised, I am going to show you an example usage of MonadFix that solved a problem other than “how could I use MonadFix?”. This came up in my work on regex-applicative.

For a simplified presentation, let’s consider this type of regular expressions:

data RE
= Sym Char  -- symbol
| Seq RE RE -- sequence
| Alt RE RE -- alternative
| Rep RE    -- repetition

Our goal is to compile a regular expression into a corresponding NFA. The states will be represented by integer numbers. State 0 corresponds to successful completion; and each Sym inside a regex will have a unique positive state in which we are expecting the corresponding character.

type NFAState = Int

The NFA will be represented by a map

type NFA = IntMap (Char, [NFAState])

where each state is mapped to the characters expected at that state and the list of states where we go in case we get the expected character.

To compile a regular expression, we’ll take as an argument the list of states to proceed to when the regular expression as a whole succeeds (otherwise we’d have to compile each subexpression separately and then glue NFAs together). This is essentially the continuation-passing style; only instead of functions, our continuations are NFA states.

During the compilation, we’ll use a stack of two State monads: one to assign sequential state numbers to Syms; the other to keep track of the currently constructred NFA.

-- Returns the list of start states and the transition table
compile :: RE -> ([NFAState], NFA)
compile re = runState (evalStateT (go re [0]) 0) IntMap.empty

-- go accepts exit states, returns entry states
go :: RE -> [NFAState] -> StateT NFAState (State NFA) [NFAState]
go re exitStates =
case re of
Sym c -> do
!freshState <- gets (+1); put freshState
lift $modify' (IntMap.insert freshState (c, exitStates)) return [freshState] Alt r1 r2 -> (++) <$> go r1 exitStates <*> go r2 exitStates
Seq r1 r2 -> go r1 =<< go r2 exitStates

This was easy so far: alternatives share their exit states and their entry states are combined; and consequtive subexpressions are chained. But how do we compile Rep? The exit states of the repeated subexpression should become its own entry states; but we don’t know the entry states until we compile it!

And this is precisely where MonadFix (or recursive do) comes in:

    Rep r -> do
rec
let allEntryStates = ownEntryStates ++ exitStates
ownEntryStates <- go r allEntryStates
return allEntryStates

Why does this circular definition work? If we unwrap the State types, we’ll see that the go function actually computes a triple of three non-strict fields:

1. The last used state number
2. The list of entry states
3. The NFA map

The elements of the triple may depend on each other as long as there are no actual loops during evaluation. One can check that the fields can be indeed evaluated linearly in the order in which they are listed above:

1. The used state numbers at each step depend only on the regular expression itself, so it can be computed wihtout knowing the other two fields.
2. The list of entry states relies only on the state number information; it doesn’t need to know anything about the NFA transitions.
3. The NFA table needs to know the entry and exit states; but that is fine, we can go ahead and compute that information without creating any reverse data dependencies.

An ASM Monad – a similar example from a different domain.

Oliver Charles’s 24 Days of GHC Extensions: Recursive Do.

Levent Erkok’s thesis which contains all you need to know about MonadFix, including several other examples.

Todd Wilson points out that Douglas McIlroy describes a similar regular expression compilation technique in his 2004 JFP Functional Pearl Enumerating the strings of regular languages. Like this article, Douglas’s paper uses a circular definition when compiling the Kleene closure. But the circular definition is not monadic there: instead of using the State monad, Douglas passes the state around by hand.

Wolfgang Jeltsch

In the Theory Lunch of the last week, James Chapman talked about the MU puzzle from Douglas Hofstadter’s book Gödel, Escher, Bach. This puzzle is about a string rewriting system. James presented a Haskell program that computes derivations of strings. Inspired by this, I wrote my own implementation, with the goal of improving efficiency. This blog post presents this implementation. As usual, it is available as a literate Haskell file, which you can load into GHCi.

The puzzle

Let me first describe the MU puzzle shortly. The puzzle deals with strings that may contain the characters $\mathrm M$, $\mathrm I$, and $\mathrm U$. We can derive new strings from old ones using the following rewriting system:

$\begin{array}{rcl} x\mathrm I & \rightarrow & x\mathrm{IU} \\ \mathrm Mx & \rightarrow & \mathrm Mxx \\ x\mathrm{III}y & \rightarrow & x\mathrm Uy \\ x\mathrm{UU}y & \rightarrow & xy \end{array}$

The question is whether it is possible to turn the string $\mathrm{MI}$ into the string $\mathrm{MU}$ using these rules.

You may want to try to solve this puzzle yourself, or you may want to look up the solution on the Wikipedia page.

The code

The code is not only concerned with deriving $\mathrm{MU}$ from $\mathrm{MI}$, but with derivations as such.

Preliminaries

We import Data.List:

import Data.List

Basic things

We define the type Sym of symbols and the type Str of symbol strings:

data Sym = M | I | U deriving Eq

type Str = [Sym]

instance Show Sym where

show M = "M"
show I = "I"
show U = "U"

showList str = (concatMap show str ++)

Next, we define the type Rule of rules as well as the list rules that contains all rules:

data Rule = R1 | R2 | R3 | R4 deriving Show

rules :: [Rule]
rules = [R1,R2,R3,R4]

Rule application

We first introduce a helper function that takes a string and returns the list of all splits of this string. Thereby, a split of a string str is a pair of strings str1 and str2 such that str1 ++ str2 == str. A straightforward implementation of splitting is as follows:

splits' :: Str -> [(Str,Str)]
splits' str = zip (inits str) (tails str)

The problem with this implementation is that walking through the result list takes quadratic time, even if the elements of the list are left unevaluated. The following implementation solves this problem:

splits :: Str -> [(Str,Str)]
splits str = zip (map (flip take str) [0 ..]) (tails str)

Next, we define a helper function replace. An expression replace old new str yields the list of all strings that can be constructed by replacing the string old inside str by new.

replace :: Str -> Str -> Str -> [Str]
replace old new str = [front ++ new ++ rear |
(front,rest) <- splits str,
old isPrefixOf rest,
let rear = drop (length old) rest]

We are now ready to implement the function apply, which performs rule application. This function takes a rule and a string and produces all strings that can be derived from the given string using the given rule exactly once.

apply :: Rule -> Str -> [Str]
apply R1 str        | last str == I = [str ++ [U]]
apply R2 (M : tail)                 = [M : tail ++ tail]
apply R3 str                        = replace [I,I,I] [U] str
apply R4 str                        = replace [U,U]   []  str
apply _  _                          = []

Derivation trees

Now we want to build derivation trees. A derivation tree for a string str has the following properties:

• The root is labeled with str.
• The subtrees of the root are the derivation trees for the strings that can be generated from str by a single rule application.
• The edges from the root to its subtrees are marked with the respective rules that are applied.

We first define types for representing derivation trees:

data DTree = DTree Str [DSub]

data DSub  = DSub Rule DTree

Now we define the function dTree that turns a string into its derivation tree:

dTree :: Str -> DTree
dTree str = DTree str [DSub rule subtree |
rule <- rules,
subStr <- apply rule str,
let subtree = dTree subStr]

Derivations

A derivation is a sequence of strings with rules between them such that each rule takes the string before it to the string after it. We define types for representing derivations:

data Deriv = Deriv [DStep] Str

data DStep = DStep Str Rule

instance Show Deriv where

show (Deriv steps goal) = "        "           ++
concatMap show steps ++
show goal            ++
"\n"

showList derivs
= (concatMap ((++ "\n") . show) derivs ++)

instance Show DStep where

show (DStep origin rule) = show origin ++
"\n-> ("    ++
show rule   ++
") "

Now we implement a function derivs that converts a derivation tree into the list of all derivations that start with the tree’s root label. The function derivs traverses the tree in breadth-first order.

derivs :: DTree -> [Deriv]
derivs tree = worker [([],tree)] where

worker :: [([DStep],DTree)] -> [Deriv]

rootDerivs :: [([DStep],DTree)] -> [Deriv]
rootDerivs tasks = [Deriv (reverse revSteps) root |

DSub rule subtree          <- subs]

Finally, we implement the function derivations which takes two strings and returns the list of those derivations that turn the first string into the second:

derivations :: Str -> Str -> [Deriv]
derivations start end
= [deriv | deriv@(Deriv _ goal) <- derivs (dTree start),
goal == end]

You may want to enter

derivations [M,I] [M,U,I]

at the GHCi prompt to see the derivations function in action. You can also enter

derivations [M,I] [M,U]

to get an idea about the solution to the MU puzzle.

Tagged: Douglas Hofstadter, functional programming, Gödel, Escher, Bach (book), Haskell, Institute of Cybernetics, James Chapman, literate programming, MU puzzle, string rewriting, talk, Theory Lunch

MIU in Curry

More than two years ago, my colleague Denis Firsov and I gave a series of three Theory Lunch talks about the MIU string rewriting system from Douglas Hofstadter’s MU puzzle. The first talk was about a Haskell implementation of MIU, the second talk was an introduction to the functional logic programming language Curry, and the third talk was about a Curry implementation of MIU. The blog articles MIU in Haskell and A taste of Curry are write-ups of the first two talks. However, a write-up of the third talk has never seen the light of day so far. This is changed with this article.

I want to thank all the people from the Curry mailing list who have helped me improving the code in this article.

Preliminaries

We import the module SearchTree:

import SearchTree

Basic things

We define the type Sym of symbols and the type Str of symbol strings:

data Sym = M | I | U

showSym :: Sym -> String
showSym M = "M"
showSym I = "I"
showSym U = "U"

type Str = [Sym]

showStr :: Str -> String
showStr str = concatMap showSym str

Next, we define the type Rule of rules:

data Rule = R1 | R2 | R3 | R4

showRule :: Rule -> String
showRule R1 = "R1"
showRule R2 = "R2"
showRule R3 = "R3"
showRule R4 = "R4"

So far, the Curry code is basically the same as the Haskell code. However, this is going to change below.

Rule application

Rule application becomes a lot simpler in Curry. In fact, we can code the rewriting rules almost directly to get a rule application function:

applyRule :: Rule -> Str -> Str
applyRule R1 (init ++ [I])              = init ++ [I, U]
applyRule R2 ([M] ++ tail)              = [M] ++ tail ++ tail
applyRule R3 (pre ++ [I, I, I] ++ post) = pre ++ [U] ++ post
applyRule R4 (pre ++ [U, U] ++ post)    = pre ++ post

Note that we do not return a list of derivable strings, as we did in the Haskell solution. Instead, we use the fact that functions in Curry are nondeterministic.

Furthermore, we do not need the helper functions splits and replace that we used in the Haskell implementation. Instead, we use the ++-operator in conjunction with functional patterns to achieve the same functionality.

Now we implement a utility function applyRules for repeated rule application. Our implementation uses a similar trick as the famous Haskell implementation of the Fibonacci sequence:

applyRules :: [Rule] -> Str -> [Str]
applyRules rules str = tail strs where

strs = str : zipWith applyRule rules strs

The Haskell implementation does not need the applyRules function, but it needs a lot of code about derivation trees instead. In the Curry solution, derivation trees are implicit, thanks to nondeterminism.

Derivations

A derivation is a sequence of strings with rules between them such that each rule takes the string before it to the string after it. We define types for representing derivations:

data Deriv = Deriv [DStep] Str

data DStep = DStep Str Rule

showDeriv :: Deriv -> String
showDeriv (Deriv steps goal) = "        "                ++
concatMap showDStep steps ++
showStr goal              ++
"\n"

showDerivs :: [Deriv] -> String
showDerivs derivs = concatMap ((++ "\n") . showDeriv) derivs

showDStep :: DStep -> String
showDStep (DStep origin rule) = showStr origin ++
"\n-> ("       ++
showRule rule  ++
") "

Now we implement a function derivation that takes two strings and returns the derivations that turn the first string into the second:

derivation :: Str -> Str -> Deriv
derivation start end
| start : applyRules rules start =:= init ++ [end]
= Deriv (zipWith DStep init rules) end where

rules :: [Rule]
rules free

init :: [Str]
init free

Finally, we define a function printDerivations that explicitly invokes a breadth-first search to compute and ultimately print derivations:

printDerivations :: Str -> Str -> IO ()
printDerivations start end = do
searchTree <- getSearchTree (derivation start end)
putStr \$ showDerivs (allValuesBFS searchTree)

You may want to enter

printDerivations [M, I] [M, I, U]

at the KiCS2 prompt to see the derivations function in action.

Tagged: breadth-first search, Curry, Denis Firsov, Douglas Hofstadter, functional logic programming, functional pattern, functional programming, Haskell, Institute of Cybernetics, KiCS2, literate programming, logic programming, MU puzzle, string rewriting, talk, Theory Lunch