Planet Haskell

July 19, 2018

FP Complete

Pantry, part 1: The Package Index

Back in January, I published a two part blog post on hash-based package downloads. Some project needs at FP Complete have pushed this to the forefront recently, and as a result I've gotten started on implementing these ideas. I'm hoping to publish regular blog posts on the topic as I continue implementation.

by Michael Snoyman (michael@fpcomplete.com) at July 19, 2018 03:09 PM

Luke Palmer

Playground Programming

Every now and then, I have a small idea about a development environment feature I’d like to see. At that point, I usually say to myself, “make a prototype to see if I like it and/or show the world by example”, and start putting pressure on myself to make it. But of course, there are so many ideas and so little time, and at the end of the day, instead of producing a prototype, I manage just to produce some guilt.

This time, I’m just going to share my idea without any self-obligation to make it.

I’m working on Chrome’s build system at Google. We are switching the build scripts to a new system which uses an ingenious testing system that I’ve never seen before (though it seems like the kind of thing that would be well-known). For each build script, we have a few test inputs to run it on. The tests run all of our scripts on all of their test inputs, but rather than running the commands, they simply record the commands that would have been run into “test expectation” files, which we then check into source control.

Checking in these auto-generated files is the ingenious part. Now, when we want to change or refactor anything about the system, we simply make the change, regenerate the expectations, and do a git diff. This will tell us what the effects of our change are. If it’s supposed to be a refactor, then there should be no expectation diffs. If it’s supposed to change something, we can look through the diffs and make sure that it changed exactly what it was supposed to. These expectation files are a form of specification, except they live at the opposite end of the development chain.
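
A toy version of the same idea in Haskell (every name here is made up for illustration; this is not Chrome’s actual tooling):

    -- The commands a build script *would* run for one test input,
    -- computed without actually running anything.
    plannedCommands :: FilePath -> [String]
    plannedCommands src = ["cc -c " ++ src, "ld -o app " ++ src ++ ".o"]

    -- Regenerate the checked-in expectation file; after a change to the
    -- build logic, `git diff` on this file shows exactly what the change did.
    regenerate :: FilePath -> FilePath -> IO ()
    regenerate expectationFile src =
      writeFile expectationFile (unlines (plannedCommands src))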

This fits in nicely with a Haskell development flow that I often use. The way it usually goes: I write a function, get it to compile cleanly, then I go to ghci and try it on a few conspicuous inputs. Does it work with an empty list? What about an infinite list? (I trim the output to sanity-check it if the output is also infinite.) I give it enough examples that I have a pretty high confidence that it’s correct. Then I move on, assuming it’s correct, and do the same with the next function.
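
For instance, a session along these lines (a made-up illustration, using Data.List.group as a stand-in for a freshly written function):

    ghci> import Data.List (group)
    ghci> group ([] :: [Int])
    []
    ghci> group [1,1,2,3,3,3]
    [[1,1],[2],[3,3,3]]
    ghci> take 3 (group (cycle [1,1,2]))   -- trim the infinite output
    [[1,1],[2],[1,1]]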

I really enjoy this way of working. It’s “hands on”.

What if my environment recorded my playground session, so that whenever I changed a function, I could see its impact? It would mark the specific cases that changed, so that I could make sure I’m not regressing. It’s almost the same as unit tests, but with a friendlier workflow and less overhead (reading rather than writing). Maybe a little less intentional and therefore a little more error-prone, but it would be less error-prone than the regression testing strategy I currently use for my small projects (read: none).

It’s bothersome to me that this is hard to make. It seems like such a simple idea. Maybe it is easy and I’m just missing something.

by Luke at July 19, 2018 01:46 AM

July 18, 2018

FP Complete

DevSecOps - Putting the Sec in DevOps

We did something very different this month for our monthly webinar. We invited Ford Winslow, CEO of ICE Cybersecurity, and the new security platform, Aeonian, to present the topic of DevSecOps, or, as Ford puts it, "Putting the Sec in DevOps". ICE Cybersecurity is a strategic business partner of FP Complete, and our companies are firm believers that security begins with the first line of written code.

by Robert Bobbett (robert@fpcomplete.com) at July 18, 2018 08:11 PM

July 17, 2018

FP Complete

Deploying Rust with Docker and Kubernetes


Hello! My name is Chris Allen and I'm going to use a tiny Rust app to demonstrate deploying Rust with Docker and Kubernetes. Rust and Haskell apps are deployed in similar ways. Much of this is because the compilers for both languages generate native binary executables.

by Chris Allen (chrisallen@fpcomplete.com) at July 17, 2018 09:36 PM

Mark Jason Dominus

The food I couldn't eat

[ I wrote this in 2007 and it seems I forgot to publish it. Enjoy! ]

I eat pretty much everything. Except ketchup. I can't stand ketchup. When I went to Taiwan a couple of years ago my hosts asked if there were any foods I didn't eat. I said no, except for ketchup.

"Ketchup? You mean that red stuff?"

Right. Yes, it's strange.

When I was thirteen my grandparents took me to Greece, and for some reason I ate hardly anything but souvlaki the whole time. When I got home, I felt like a complete ass. I swore that I would never squander another such opportunity, and that if I ever went abroad again I would eat absolutely everything that was put before me.

This is a good policy not just because it exposes me to a lot of delicious and interesting food, and not just because it prevents me from feeling like a complete ass, but also because I don't have to worry that perhaps my hosts will be insulted or disappointed that I won't eat the food they get for me.

On my second trip to Taiwan, I ate at a hot pot buffet restaurant. They give you a pot of soup, and then you go to the buffet and load up with raw meat and vegetables and things, and cook them at your table in the soup. It's fun. In my soup there were some dark reddish-brown cubes that had approximately the same texture as soft tofu. I didn't know what it was, but I ate it and tried to figure it out.

The next day I took the bus to Lishan (梨山), and through good fortune was invited to eat dinner with a Taiwanese professor of criminology and his family. The soup had those red chunks in it again, and I said "I had these for lunch yesterday! What are they?" I then sucked one down quickly, because sometimes people interpret that kind of question as a criticism, and I didn't want to offend the professor.

Actually it's much easier to ask about food in China than it is in, say, Korea. Koreans are defensive about their cuisine. They get jumpy if you ask what something is, and are likely to answer "It's good. Just eat it!". They are afraid that the next words out of your mouth will be something about how bad it smells. This is because the Japanese, champion sneerers, made about one billion insulting remarks about smelly Korean food while they were occupying the country between 1910 and 1945. So if you are in Korea and you don't like the food, the Koreans will take it very personally.

Chinese people, on the other hand, know that they have the best food in the world, and that everyone loves Chinese food. If you don't like it, they will not get offended. They will just conclude that you are a barbarian or an idiot, and eat it themselves.

Anyway, it turns out that the reddish-brown stuff was congealed duck's blood. Okay. Hey, I had congealed duck blood soup twice in two days! No way am I going home from this trip feeling like an ass.

So the eat-absolutely-everything policy has worked out well for me, and although I haven't liked everything, at least I don't feel like I wasted my time.

The only time I've regretted the policy was on my first trip to Taiwan. I was taken out to dinner and one of the dishes turned out to be pieces of steamed squid. That's not my favorite food, but I can live with it. But the steamed squid was buried under a big, quivering mound of sugared mayonnaise.

I remembered my policy, and took a bite. I'm sure I turned green.

So that's the food that I couldn't eat.

[ Some of this 2007 article duplicates stuff I have said since; for example I cut out the chicken knuckles story, which would have been a repeat. Also previously; also previously; and another one ]

by Mark Dominus (mjd@plover.com) at July 17, 2018 12:00 AM

July 16, 2018

Ken T Takusagawa

[xjzqtjzg] Top-heavy perfect powers

Below is a list of perfect powers b^n (b<=n) in increasing order, up to a googol.  There are 2413 values.  The list omits bases which are perfect powers themselves, e.g., 4^n, 8^n, 9^n.

Here is Haskell source code to compute the list.  We used functions from the data-ordlist package to merge an infinite list of ordered infinite lists into a single ordered list.  This seems to be a nontrivial function to write; we had written it a few times ((1) and (2)) before discovering Data.List.Ordered.  (Incidentally, data-ordlist is not indexed by Hoogle version 4.  It is indexed by the alpha version (Hoogle 5), but Hoogle 5 cannot search by type Ord a => [[a]] -> [a].)  We also used Data.List.Ordered.minus to compute the set difference of two infinite lists.
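
A sketch of that computation (it assumes the data-ordlist package; unionAll and mergeAll both require that the heads of the inner lists be increasing, which holds here):

    import qualified Data.List.Ordered as O

    googol :: Integer
    googol = 10 ^ (100 :: Int)

    -- All perfect powers >= 4, in increasing order, without duplicates.
    perfectPowers :: [Integer]
    perfectPowers = O.unionAll [[b ^ n | n <- [2 ..]] | b <- [2 ..]]

    -- Bases that are not themselves perfect powers: 2, 3, 5, 6, 7, 10, 11, ...
    bases :: [Integer]
    bases = [2 ..] `O.minus` perfectPowers

    -- Top-heavy powers b^n with b <= n, merged into one increasing list.
    topHeavy :: [Integer]
    topHeavy = O.mergeAll [[b ^ n | n <- [b ..]] | b <- bases]

    main :: IO ()
    main = print (length (takeWhile (<= googol) topHeavy))  -- the post counts 2413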

The motivation was to investigate increasing the breadth of the Cunningham project, which currently factors integers b^n +/- 1 with b restricted to {2, 3, 5, 6, 7, 10, 11, 12}.  Within the original Cunningham range up to about 2^1200, there are 4357 perfect powers in the restricted set, but there would be 19256 in our expanded set.

What is the growth rate of this sequence?

2^2 2^3 2^4 3^3 2^5 2^6 3^4 2^7 3^5 2^8 2^9 3^6 2^10 2^11 3^7 5^5 2^12 3^8 2^13 5^6 2^14 3^9 2^15 6^6 3^10 2^16 5^7 2^17 3^11 2^18 6^7 5^8 2^19 3^12 7^7 2^20 3^13 6^8 5^9 2^21 2^22 3^14 7^8 2^23 5^10 6^9 3^15 2^24 2^25 7^9 3^16 5^11 6^10 2^26 3^17 2^27 5^12 2^28 7^10 6^11 3^18 2^29 2^30 3^19 5^13 7^11 2^31 6^12 3^20 2^32 5^14 2^33 10^10 3^21 6^13 7^12 2^34 5^15 3^22 2^35 2^36 6^14 3^23 7^13 10^11 2^37 5^16 2^38 3^24 11^11 6^15 2^39 7^14 5^17 3^25 10^12 2^40 2^41 3^26 6^16 11^12 5^18 2^42 7^15 3^27 2^43 12^12 10^13 6^17 2^44 5^19 3^28 7^16 11^13 2^45 3^29 2^46 5^20 10^14 6^18 12^13 2^47 3^30 7^17 2^48 13^13 11^14 5^21 2^49 6^19 3^31 10^15 2^50 12^14 7^18 3^32 2^51 5^22 6^20 13^14 11^15 2^52 3^33 2^53 10^16 14^14 7^19 5^23 12^15 3^34 2^54 6^21 2^55 11^16 3^35 13^15 5^24 2^56 7^20 10^17 6^22 2^57 3^36 14^15 12^16 2^58 5^25 15^15 3^37 11^17 7^21 2^59 13^16 6^23 10^18 2^60 3^38 5^26 14^16 12^17 2^61 7^22 3^39 2^62 6^24 11^18 15^16 5^27 13^17 2^63 10^19 3^40 2^64 12^18 7^23 6^25 14^17 3^41 2^65 5^28 11^19 2^66 15^17 10^20 3^42 13^18 2^67 6^26 5^29 7^24 2^68 12^19 3^43 14^18 2^69 11^20 17^17 5^30 3^44 10^21 6^27 2^70 7^25 13^19 15^18 2^71 3^45 12^20 5^31 2^72 14^19 6^28 11^21 3^46 7^26 2^73 10^22 17^18 2^74 13^20 15^19 5^32 3^47 6^29 2^75 18^18 12^21 7^27 2^76 3^48 11^22 14^20 10^23 5^33 2^77 6^30 17^19 3^49 13^21 2^78 15^20 7^28 12^22 5^34 2^79 18^19 3^50 11^23 10^24 14^21 2^80 6^31 19^19 3^51 2^81 5^35 13^22 7^29 17^20 2^82 15^21 3^52 12^23 6^32 2^83 11^24 10^25 18^20 5^36 14^22 2^84 3^53 7^30 19^20 2^85 13^23 6^33 3^54 17^21 5^37 15^22 2^86 12^24 10^26 20^20 11^25 2^87 7^31 3^55 18^21 14^23 6^34 2^88 5^38 3^56 13^24 2^89 19^21 12^25 10^27 7^32 15^23 17^22 11^26 2^90 3^57 6^35 5^39 20^21 2^91 14^24 18^22 3^58 2^92 21^21 13^25 7^33 5^40 2^93 10^28 6^36 12^26 11^27 19^22 3^59 15^24 2^94 17^23 2^95 20^22 3^60 14^25 5^41 7^34 6^37 18^23 2^96 13^26 10^29 21^22 3^61 12^27 11^28 2^97 5^42 15^25 19^23 2^98 17^24 22^22 6^38 7^35 3^62 14^26 2^99 20^23 10^30 5^43 3^63 13^27 2^100 18^24 11^29 12^28 6^39 2^101 21^23 7^36 3^64 15^26 19^24 2^102 5^44 17^25 22^23 14^27 10^31 2^103 3^65 6^40 13^28 20^24 11^30 7^37 12^29 2^104 23^23 18^25 5^45 3^66 2^105 21^24 15^27 6^41 2^106 3^67 19^25 17^26 10^32 14^28 7^38 5^46 2^107 22^24 11^31 13^29 12^30 3^68 2^108 20^25 18^26 23^24 6^42 2^109 5^47 3^69 15^28 7^39 10^33 21^25 2^110 24^24 17^27 14^29 19^26 11^32 3^70 2^111 13^30 12^31 6^43 5^48 22^25 2^112 7^40 20^26 3^71 18^27 10^34 2^113 23^25 15^29 6^44 5^49 2^114 3^72 11^33 21^26 14^30 17^28 24^25 19^27 13^31 12^32 2^115 7^41 3^73 22^26 2^116 5^50 10^35 6^45 20^27 18^28 2^117 15^30 3^74 23^26 11^34 7^42 2^118 14^31 12^33 13^32 5^51 17^29 21^27 3^75 6^46 19^28 2^119 24^26 10^36 2^120 22^27 3^76 7^43 5^52 18^29 2^121 20^28 11^35 15^31 6^47 14^32 12^34 2^122 3^77 13^33 23^27 26^26 17^30 10^37 21^28 2^123 5^53 19^29 7^44 3^78 24^27 2^124 6^48 11^36 22^28 2^125 15^32 18^30 3^79 20^29 5^54 12^35 14^33 13^34 2^126 10^38 7^45 23^28 6^49 17^31 3^80 26^27 2^127 21^29 19^30 5^55 11^37 2^128 24^28 3^81 15^33 2^129 12^36 7^46 6^50 18^31 22^29 14^34 13^35 10^39 20^30 3^82 2^130 5^56 17^32 2^131 23^29 11^38 3^83 26^28 19^31 21^30 6^51 7^47 2^132 5^57 12^37 15^34 10^40 24^29 2^133 3^84 13^36 14^35 18^32 22^30 20^31 2^134 6^52 28^28 5^58 3^85 7^48 17^33 11^39 2^135 23^30 19^32 2^136 21^31 10^41 12^38 3^86 26^29 15^35 13^37 5^59 2^137 6^53 14^36 24^30 7^49 18^33 3^87 2^138 22^31 20^32 11^40 17^34 2^139 5^60 28^29 3^88 10^42 6^54 12^39 2^140 19^33 23^31 7^50 21^32 13^38 15^36 14^37 29^29 2^141 26^30 3^89 5^61 18^34 11^41 2^142 24^31 6^55 
20^33 3^90 22^32 10^43 2^143 17^35 7^51 12^40 5^62 2^144 28^30 3^91 13^39 19^34 15^37 14^38 23^32 6^56 21^33 2^145 11^42 26^31 29^30 3^92 18^35 7^52 2^146 10^44 5^63 24^32 20^34 12^41 2^147 17^36 22^33 30^30 6^57 3^93 2^148 13^40 15^38 14^39 5^64 19^35 11^43 7^53 3^94 2^149 28^31 23^33 21^34 10^45 6^58 2^150 18^36 26^32 12^42 3^95 29^31 5^65 2^151 17^37 20^35 24^33 7^54 22^34 13^41 2^152 30^31 3^96 11^44 14^40 15^39 6^59 10^46 19^36 2^153 5^66 31^31 21^35 3^97 23^34 28^32 2^154 12^43 18^37 7^55 2^155 6^60 26^33 17^38 3^98 13^42 29^32 5^67 20^36 11^45 24^34 2^156 22^35 14^41 10^47 15^40 3^99 2^157 30^32 19^37 7^56 6^61 12^44 5^68 2^158 21^36 23^35 18^38 3^100 31^32 28^33 2^159 13^43 11^46 17^39 10^48 26^34 14^42 20^37 2^160 7^57 3^101 15^41 5^69 6^62 29^33 24^35 22^36 2^161 12^45 19^38 3^102 30^33 2^162 21^37 5^70 11^47 18^39 10^49 13^44 7^58 23^36 6^63 2^163 3^103 28^34 31^33 17^40 14^43 2^164 15^42 20^38 26^35 3^104 5^71 12^46 22^37 2^165 24^36 29^34 6^64 7^59 19^39 2^166 11^48 10^50 3^105 33^33 13^45 18^40 30^34 21^38 2^167 5^72 23^37 14^44 17^41 15^43 2^168 3^106 6^65 28^35 7^60 31^34 12^47 20^39 2^169 26^36 10^51 22^38 5^73 11^49 3^107 24^37 19^40 2^170 29^35 13^46 6^66 18^41 2^171 3^108 7^61 21^39 14^45 33^34 17^42 30^35 5^74 23^38 15^44 2^172 12^48 10^52 3^109 20^40 11^50 34^34 2^173 28^36 6^67 31^35 26^37 22^39 13^47 2^174 7^62 5^75 19^41 24^38 3^110 29^36 2^175 18^42 14^46 12^49 21^40 17^43 6^68 15^45 3^111 2^176 10^53 23^39 11^51 5^76 33^35 30^36 7^63 2^177 20^41 3^112 13^48 28^37 2^178 34^35 31^36 6^69 22^40 19^42 26^38 5^77 24^39 14^47 2^179 3^113 12^50 18^43 10^54 35^35 7^64 15^46 29^37 17^44 11^52 2^180 21^41 3^114 23^40 6^70 2^181 5^78 13^49 20^42 30^37 33^36 2^182 3^115 7^65 19^43 28^38 10^55 14^48 12^51 22^41 2^183 34^36 31^37 26^39 11^53 24^40 5^79 18^44 6^71 15^47 3^116 17^45 2^184 21^42 29^38 35^36 2^185 13^50 7^66 3^117 23^41 5^80 20^43 2^186 10^56 6^72 12^52 30^38 14^49 33^37 11^54 19^44 2^187 3^118 22^42 28^39 15^48 18^45 24^41 2^188 26^40 17^46 5^81 7^67 34^37 31^38 3^119 6^73 13^51 21^43 2^189 10^57 29^39 35^37 23^42 2^190 12^53 20^44 3^120 11^55 14^50 5^82 7^68 2^191 19^45 6^74 30^39 15^49 33^38 22^43 3^121 18^46 2^192 17^47 28^40 13^52 24^42 10^58 26^41 5^83 37^37 2^193 31^39 21^44 34^38 3^122 12^54 7^69 11^56 6^75 2^194 14^51 29^40 20^45 23^43 35^38 3^123 2^195 5^84 15^50 19^46 18^47 10^59 2^196 13^53 17^48 22^44 30^40 6^76 7^70 3^124 33^39 2^197 28^41 24^43 12^55 11^57 5^85 26^42 21^45 37^38 14^52 2^198 3^125 31^40 34^39 20^46 2^199 23^44 6^77 29^41 15^51 10^60 7^71 38^38 19^47 5^86 3^126 13^54 2^200 35^39 18^48 17^49 11^58 22^45 12^56 2^201 30^41 3^127 6^78 24^44 33^40 14^53 28^42 2^202 5^87 21^46 26^43 7^72 10^61 3^128 2^203 31^41 20^47 15^52 37^39 34^40 13^55 23^45 19^48 2^204 29^42 11^59 6^79 18^49 5^88 12^57 17^50 3^129 38^39 7^73 2^205 22^46 35^40 14^54 10^62 2^206 3^130 30^42 39^39 24^45 21^47 5^89 28^43 6^80 26^44 33^41 2^207 15^53 13^56 20^48 11^60 3^131 7^74 12^58 2^208 31^42 23^46 19^49 37^40 17^51 18^50 34^41 29^43 5^90 2^209 3^132 10^63 6^81 14^55 22^47 38^40 2^210 35^41 7^75 3^133 21^48 24^46 13^57 15^54 30^43 2^211 11^61 5^91 39^40 12^59 26^45 28^44 20^49 33^42 6^82 2^212 3^134 19^50 17^52 10^64 23^47 18^51 40^40 2^213 31^43 14^56 7^76 37^41 5^92 34^42 29^44 3^135 2^214 22^48 11^62 6^83 13^58 15^55 2^215 12^60 38^41 21^49 35^42 24^47 3^136 30^44 10^65 5^93 2^216 20^50 7^77 26^46 28^45 17^53 19^51 39^41 18^52 33^43 2^217 14^57 23^48 6^84 3^137 11^63 31^44 2^218 40^41 5^94 13^59 22^49 29^45 12^61 3^138 34^43 15^56 37^42 7^78 2^219 10^66 21^50 41^41 
6^85 2^220 24^48 3^139 38^42 20^51 35^43 5^95 17^54 30^45 14^58 19^52 26^47 2^221 18^53 28^46 11^64 23^49 7^79 3^140 33^44 39^42 2^222 13^60 12^62 6^86 10^67 15^57 5^96 31^45 22^50 2^223 29^46 3^141 40^42 34^44 2^224 37^43 21^51 7^80 14^59 24^49 20^52 17^55 11^65 6^87 2^225 41^42 3^142 19^53 18^54 5^97 26^48 38^43 35^44 30^46 13^61 12^63 10^68 28^47 2^226 23^50 42^42 15^58 3^143 33^45 2^227 39^43 7^81 22^51 6^88 5^98 31^46 2^228 3^144 11^66 29^47 21^52 14^60 40^43 17^56 34^45 2^229 20^53 10^69 37^44 24^50 18^55 19^54 13^62 12^64 3^145 5^99 2^230 6^89 7^82 26^49 41^43 15^59 30^47 23^51 28^48 35^45 38^44 2^231 3^146 11^67 42^43 22^52 2^232 33^46 5^100 14^61 10^70 39^44 6^90 21^53 31^47 17^57 3^147 2^233 7^83 12^65 13^63 29^48 43^43 20^54 18^56 19^55 24^51 2^234 34^46 40^44 15^60 37^45 5^101 3^148 2^235 26^50 23^52 6^91 11^68 30^48 28^49 41^44 7^84 10^71 35^46 2^236 14^62 38^45 3^149 22^53 12^66 13^64 5^102 2^237 17^58 33^47 21^54 42^44 18^57 20^55 3^150 31^48 6^92 39^45 19^56 2^238 29^49 15^61 24^52 7^85 11^69 43^44 2^239 34^47 5^103 10^72 3^151 40^45 37^46 26^51 23^53 14^63 2^240 12^67 44^44 28^50 6^93 30^49 13^65 22^54 3^152 2^241 35^47 41^45 17^59 38^46 7^86 5^104 21^55 18^58 2^242 20^56 33^48 19^57 11^70 15^62 3^153 10^73 42^45 31^49 29^50 6^94 2^243 24^53 39^46 14^64 12^68 5^105 2^244 3^154 43^45 34^48 13^66 7^87 23^54 26^52 40^46 37^47 2^245 28^51 17^60 22^55 30^50 6^95 11^71 3^155 44^45 10^74 21^56 2^246 18^59 5^106 15^63 35^48 20^57 19^58 41^46 38^47 2^247 7^88 45^45 33^49 3^156 12^69 14^65 24^54 31^50 29^51 13^67 2^248 42^46 6^96 39^47 5^107 23^55 3^157 2^249 11^72 26^53 10^75 34^49 17^61 43^46 22^56 7^89 28^52 2^250 15^64 37^48 40^47 18^60 30^51 21^57 3^158 19^59 20^58 6^97 5^108 12^70 2^251 44^46 14^66 35^49 13^68 41^47 38^48 2^252 3^159 24^55 33^50 10^76 11^73 29^52 45^46 7^90 31^51 2^253 5^109 23^56 6^98 17^62 42^47 3^160 39^48 26^54 15^65 2^254 46^46 22^57 18^61 34^50 12^71 21^58 28^53 19^60 20^59 2^255 43^47 14^67 30^52 3^161 37^49 13^69 5^110 40^48 7^91 10^77 6^99 11^74 2^256 35^50 44^47 24^56 3^162 2^257 38^49 41^48 33^51 29^53 17^63 31^52 5^111 23^57 15^66 2^258 45^47 12^72 7^92 3^163 6^100 26^55 18^62 22^58 42^48 14^68 39^49 2^259 13^70 10^78 19^61 21^59 20^60 11^75 34^51 28^54 46^47 3^164 2^260 5^112 30^53 43^48 37^50 40^49 2^261 47^47 6^101 7^93 24^57 3^165 35^51 17^64 12^73 15^67 2^262 44^48 33^52 29^54 23^58 5^113 38^50 10^79 41^49 31^53 18^63 14^69 13^71 11^76 2^263 3^166 22^59 26^56 19^62 21^60 45^48 20^61 6^102 7^94 2^264 42^49 39^50 28^55 34^52 3^167 5^114 30^54 2^265 46^48 12^74 15^68 37^51 17^65 10^80 43^49 24^58 2^266 40^50 6^103 3^168 11^77 13^72 14^70 47^48 7^95 35^52 18^64 23^59 2^267 5^115 29^55 33^53 44^49 31^54 22^60 19^63 38^51 3^169 41^50 26^57 21^61 20^62 2^268 48^48 6^104 12^75 2^269 10^81 45^49 28^56 5^116 3^170 7^96 39^51 15^69 42^50 34^53 17^66 11^78 30^55 2^270 13^73 14^71 24^59 46^49 37^52 2^271 3^171 18^65 43^50 23^60 40^51 6^105 5^117 35^53 19^64 2^272 22^61 29^56 47^49 20^63 7^97 21^62 33^54 10^82 12^76 31^55 3^172 26^58 38^52 44^50 2^273 41^51 11^79 15^70 48^49 13^74 17^67 5^118 2^274 6^106 28^57 14^72 3^173 45^50 34^54 30^56 39^52 2^275 42^51 24^60 7^98 18^66 10^83 3^174 23^61 2^276 12^77 37^53 19^65 46^50 5^119 22^62 6^107 20^64 21^63 43^51 40^52 11^80 29^57 35^54 2^277 26^59 3^175 15^71 31^56 33^55 13^75 47^50 7^99 14^73 17^68 2^278 38^53 44^51 41^52 5^120 28^58 3^176 2^279 10^84 6^108 48^50 18^67 12^78 24^61 30^57 34^55 2^280 45^51 39^53 11^81 19^66 42^52 23^62 3^177 7^100 20^65 22^63 5^121 2^281 21^64 13^76 15^72 37^54 46^51 14^74 6^109 
29^58 2^282 26^60 17^69 40^53 35^55 3^178 43^52 50^50 10^85 31^57 33^56 2^283 12^79 5^122 47^51 38^54 7^101 18^68 28^59 11^82 3^179 44^52 41^53 2^284 24^62 6^110 30^58 19^67 48^51 34^56 13^77 23^63 2^285 15^73 20^66 3^180 22^64 39^54 21^65 14^75 45^52 5^123 10^86 42^53 2^286 17^70 7^102 37^55 29^59 26^61 12^80 3^181 6^111 2^287 11^83 46^52 35^56 31^58 40^54 33^57 43^53 18^69 50^51 5^124 2^288 28^60 3^182 13^78 38^55 47^52 24^63 19^68 2^289 10^87 15^74 7^103 51^51 41^54 44^53 14^76 30^59 23^64 6^112 20^67 22^65 21^66 34^57 2^290 3^183 17^71 5^125 12^81 48^52 11^84 39^55 2^291 45^53 42^54 26^62 29^60 3^184 37^56 18^70 7^104 2^292 6^113 31^59 10^88 13^79 35^57 5^126 33^58 40^55 46^53 2^293 15^75 43^54 19^69 14^77 3^185 28^61 24^64 50^52 38^56 20^68 12^82 2^294 23^65 11^85 21^67 17^72 22^66 47^53 30^60 41^55 6^114 7^105 3^186 44^54 5^127 51^52 2^295 34^58 10^89 39^56 2^296 48^53 13^80 18^71 26^63 29^61 3^187 52^52 45^54 42^55 15^76 37^57 14^78 2^297 5^128 31^60 6^115 19^70 35^58 11^86 12^83 7^106 33^59 3^188 2^298 24^65 40^56 28^62 20^69 46^54 17^73 43^55 23^66 21^68 22^67 10^90 2^299 50^53 38^57 30^61 5^129 3^189 13^81 6^116 47^54 2^300 41^56 34^59 18^72 44^55 7^107 51^53 14^79 26^64 15^77 11^87 2^301 12^84 3^190 29^62 39^57 48^54 19^71 5^130 42^56 2^302 45^55 52^53 37^58 31^61 10^91 6^117 17^74 20^70 24^66 35^59 33^60 3^191 28^63 2^303 21^69 23^67 7^108 22^68 40^57 13^82 53^53 46^55 43^56 2^304 5^131 30^62 3^192 38^58 18^73 11^88 14^80 12^85 15^78 50^54 2^305 6^118 34^60 41^57 47^55 26^65 10^92 44^56 19^72 3^193 2^306 7^109 29^63 51^54 5^132 39^58 17^75 20^71 2^307 13^83 31^62 48^55 24^67 37^59 42^57 21^70 3^194 45^56 23^68 6^119 28^64 22^69 33^61 35^60 52^54 11^89 2^308 12^86 14^81 18^74 15^79 40^58 7^110 5^133 10^93 2^309 3^195 30^63 43^57 53^54 46^56 38^59 2^310 19^73 6^120 26^66 34^61 50^55 17^76 3^196 41^58 54^54 13^84 29^64 2^311 47^56 5^134 20^72 44^57 11^90 7^111 24^68 39^59 21^71 12^87 51^55 2^312 31^63 23^69 22^70 14^82 3^197 10^94 28^65 15^80 37^60 18^75 33^62 42^58 48^56 6^121 35^61 2^313 45^57 5^135 52^55 3^198 40^59 2^314 30^64 19^74 7^112 13^85 43^58 17^77 11^91 46^57 38^60 26^67 2^315 53^55 6^122 3^199 34^62 12^88 20^73 10^95 29^65 5^136 2^316 14^83 50^56 41^59 21^72 24^69 15^81 54^55 47^57 22^71 44^58 23^70 18^76 3^200 2^317 31^64 39^60 7^113 28^66 51^56 37^61 33^63 6^123 55^55 2^318 35^62 5^137 42^59 13^86 11^92 48^57 45^58 3^201 19^75 17^78 10^96 30^65 2^319 12^89 52^56 40^60 26^68 14^84 20^74 2^320 7^114 38^61 43^59 3^202 15^82 46^58 5^138 34^63 6^124 29^66 21^73 53^56 24^70 2^321 22^72 18^77 23^71 41^60 50^57 11^93 3^203 13^87 2^322 31^65 28^67 44^59 47^58 10^97 54^56 39^61 12^90 5^139 33^64 19^76 7^115 17^79 37^62 2^323 6^125 35^63 51^57 3^204 42^60 14^85 55^56 30^66 48^58 2^324 45^59 20^75 15^83 26^69 40^61 3^205 52^57 2^325 21^74 5^140 11^94 56^56 18^78 38^62 29^67 24^71 22^73 10^98 43^60 34^64 13^88 7^116 23^72 6^126 46^59 2^326 12^91 53^57 3^206 41^61 28^68 31^66 17^80 2^327 19^77 50^58 5^141 14^86 44^60 39^62 47^59 33^65 2^328 54^57 3^207 15^84 37^63 35^64 6^127 7^117 20^76 11^95 30^67 10^99 42^61 2^329 51^58 26^70 13^89 21^75 18^79 45^60 48^59 55^57 3^208 5^142 12^92 40^62 22^74 2^330 24^72 23^73 29^68 38^63 52^58 34^65 6^128 2^331 43^61 56^57 17^81 14^87 3^209 7^118 19^78 46^60 28^69 31^67 2^332 5^143 15^85 11^96 41^62 10^100

by Ken (noreply@blogger.com) at July 16, 2018 06:54 AM

Chris Smith

I’m moving to medium for the future.

Thanks to everyone who has read my very inconsistent blog entries over the past 11 years.  I am looking forward to communicating more often in the future, but it’s time to modernize my tools.  I’m primarily switching to:

  1. Twitter for status updates and short thoughts.
  2. Medium for longer pieces.

Please feel free to follow these to stay up to date.

I still intend to leave this site here, to avoid breaking links.  But at the same time, I’m looking for something to get the Medium blog started, so I will likely start to move some of my favorite older content to Medium as well.

by cdsmith at July 16, 2018 03:00 AM

July 15, 2018

Dominic Steinitz

Estimating Parameters in Chaotic Systems

Post available from my new site. Sadly WordPress doesn’t allow me to render the HTML exported by a Jupyter notebook.

by Dominic Steinitz at July 15, 2018 02:28 PM

Michael Snoyman

Post Fast Write-up

As I blogged about last week, I did a multiday fast. There was definitely interest in this topic on Twitter, and at least one request for a post-fast write-up. So here we go.

I fasted from about 8pm on Saturday, July 7 to 8am on Friday, July 13. This works out to 5.5 days, or 132 hours of fasting. Additionally, I had done a shorter 1.5 day fast the previous week (from Thursday, July 5 till Friday, July 6), which as you’ll see is relevant to the weight results.

That previous blog post shared some of my reasons for fasting. The short version is: there are lots of potential health benefits, but my goal right now was to try to lose some fat with minimal muscle loss.

Weight change

Since the primary goal of the fast was fat loss, let me share the numbers. I always weigh in first thing in the morning for consistency. I additionally record skinfold caliper and waist measurement approximations of body fat percentage, which should be taken with a large grain of salt. I’ll include them in the information below.

Unfortunately, I forgot to record my weight on Sunday, July 8. The most recent weigh-in I’d had before was Wednesday, July 4. I took measurements on Monday, Wednesday, and Friday (just before breaking the fast), and again Sunday morning (48 hours after breaking the fast).

Finally, for context, I am 175cm tall (5’9”).

Day       Weight   Body fat % (est)   Fat mass (est)   Lean mass (est)
July 4    81.8kg   21%                17.4kg           64.4kg
July 9    79.3kg   21%                16.9kg           62.4kg
July 11   77.7kg   20%                15.5kg           62.2kg
July 13   76.8kg   19%                14.3kg           62.5kg
July 15   78.5kg   20%                15.7kg           62.8kg

Again, take the numbers with a grain of salt: body fat measurements are not perfect. A secondary measurement I’ll get to in the next section is changes in lifting strength. Here are some takeaways:

  • There was a precipitous drop in weight right after starting the fast. This is expected as a loss of water weight.
  • Similarly, I put on almost 2kg of weight again in the 48 hours following the fast. This is again expected as I was regaining water weight. (I ended up eating around 100g of carbs over the course of Saturday.)
  • There was a relatively steady weight loss during the course of the fast.

To me, it looks like I lost between 2-3kg of fat from the fast, including the 1.5 days the previous week. This would come out to between 15,400 and 23,100 calories, which would mean between 2,369 and 3,554 calories per day. All of which sounds perfectly reasonable given my weight and activity level.

Strength during fast

There is a major confounding factor in measuring my strength: I’ve been recovering from a wrist injury which has affected all of my major lifts (squat, deadlift, and bench). That said: I had no major drop in strength measured by a 5 rep max during the fast. I did tire out more quickly on subsequent sets, however. I’ll have a better feel for whether I had any strength loss over the coming week.

Difficulty

This was by far the easiest fast I’ve ever tried. The second day was, as usual, the hardest day. But it wasn’t really that difficult. I had a high level of mental focus, had plenty of interesting (and some non-interesting) work tasks to perform, and plenty of physical activity. In other words, I was able to distract myself from eating.

There were a few times during the week when the hunger was intense, but only for 1-2 hours at a time. There were a few times when I didn’t feel like I could focus on anything, but that seemed more to do with an overabundance of things grabbing my attention than the fast itself.

I had decided in advance that I would break my fast Friday morning. I was certainly excited to get back to eating, but when I woke up Friday morning, I did not feel ravenous, nauseous, or in any other way in need of breaking the fast. Had circumstances allowed it, I have no doubt that I could have easily kept going.

Hunger after fast

Usually after a fast it takes me a while to feel properly hungry again. I broke my fast on a (for me) relatively light meal of 2 scrambled eggs. Twenty minutes later, I added in some cheese, and then took the baby for a walk to visit my parents. That’s when the hunger really kicked in. I started snacking on anything I could get access to. I came home and fried up a pan full of tuna. I continued eating a lot throughout the day, finishing off with some large pieces of meat Miriam made for dinner.

Saturday felt much the same, though Miriam seems to think I ate less than I thought I did.

I neither measured my food intake nor tried to restrict it in any way. As you can see from the weight chart above, between Friday morning and Sunday morning, I put on 1.7kg, which I believe would be entirely accounted for by water weight. I don’t believe I had any fat gain or “refeeding” issues, though I acknowledge that the body fat estimates paint a different picture.

Plans for the future

This was a win for me. I have to see if it significantly impacts performance in the gym, but otherwise, I found this far easier than trying to do a prolonged weight loss “cut.” Assuming all goes well, I may try to do alternating weeks of eating a high fat, moderate protein diet, followed by a week with a multiday fast (between 3 and 5 days), and see how that affects weight and strength.

I’ll continue taking body measurements, and will share results when I have more meaningful data.

If anyone has specific questions they’d like me to answer, let me know!

July 15, 2018 05:55 AM

July 14, 2018

Edward Z. Yang

A year into Backpack

It's been a year since I got my hood and gown and joined Facebook (where I've been working on PyTorch), but while I've been at Facebook, Backpack hasn't been sleeping; in fact, there's been plenty of activity, more than I could have ever hoped for. I wanted to summarize some of the goings on in this blog post.

Libraries using Backpack

There's been some really interesting work going on in the libraries using Backpack space. Here are the two biggest ones I've seen from the last few months:

unpacked-containers. The prolific Edward Kmett wrote the unpacked-containers package, which uses the fact that you can unpack through Backpack signatures to give you generic container implementations with hypercharged performance, 15-45% better than you could get with the usual boxed representation. A lot of discussion happened in this Reddit thread.

hasktorch. hasktorch, by Austin Huang and Sam Stites, is a tensor and neural network library for Haskell. It binds to the TH library (which also powers PyTorch), but it uses Backpack, giving the post Backpack for deep learning from Kaixi Ruan new legs. This is quite possibly one of the biggest instances of Backpack that I've seen thus far.

Backpack in the Ecosystem

Eta supports Backpack. Eta, a JVM fork of GHC, backported Backpack support into their fork, which means that you can use Backpack in your Eta projects now. It was announced in this Twitter post and there was some more discussion about it at this post.

GSOC on multiple public libraries. Francesco Gazzetta, as part of Google Summer of Code, is working on adding support for multiple public libraries in Cabal. Multiple public libraries will make many use-cases of Backpack much easier to write, since you will no longer have to split your Backpack units into separate packages, writing distinct Cabal files for each of them.

Backpack in GHC and Cabal

By and large, we haven't changed any of the user-facing syntax or semantics of Backpack since its initial release. However, there have been some prominent bugfixes (perhaps fewer than one might expect), both merged and coming down the pipe:

  • #13955: Backpack now supports non-* kinds, so you can do levity polymorphism with Backpack.
  • #14525: Backpack now works with the CPP extension.
  • #15138: Backpack will soon support data T :: Nat signatures, which can be instantiated with type T = 5. Thank you Piyush Kurur for diagnosing the bug and writing a patch to fix this.
  • A fix for Cabal issue #4754: Backpack now works with profiling.

Things that could use help

Stack support for Backpack. In Stack issue #2540 I volunteered to implement Backpack support for Stack. However, over the past year, it has become abundantly clear that I don't actually have enough spare time to implement this myself. Looking for brave souls to delve into this; and I am happy to advise about the Backpack aspects.

Pattern synonym support for Backpack. You should be able to fill a signature data T = MkT Int with an appropriate bidirectional pattern synonym, and vice versa! This is GHC issue #14478. We don't think it should be too difficult; we have to get the matchers induced by constructors and check they match, but it requires some time to work out exactly how to do it.
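
To make the intent concrete, here is a rough sketch of what one would like to be accepted once this is fixed (the module names are illustrative, and the Cabal mixin wiring is omitted):

    -- Str.hsig: the signature demands an honest constructor
    signature Str where

    data T = MkT Int

    -- Str/Impl.hs: the implementing module, mixed in as Str, would like to
    -- satisfy the signature with a bidirectional pattern synonym instead
    {-# LANGUAGE PatternSynonyms #-}
    module Str.Impl where

    newtype T = T Int

    pattern MkT :: Int -> T
    pattern MkT x = T x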

by Edward Z. Yang at July 14, 2018 11:34 PM

Magnus Therning

QuickCheck on a REST API

Since I’m working with web stuff nowadays I thought I’d play a little with translating my old post on using QuickCheck to test C APIs to the web.

The goal and how to reach it

I want to use QuickCheck to test a REST API. Just like in the case of the C API, the idea is to

  1. generate a sequence of API calls (a program), then
  2. run the sequence against a model, as well as
  3. run the sequence against the web service, and finally
  4. compare the resulting model against reality.

The REST API

I’ll use a small web service I’m working on, and then concentrate on only a small part of the API to begin with.

The parts of the API I’ll use for the programs at this stage are

Method   Route        Example in                                 Example out
POST     /users       {"userId": 0, "userName": "Yogi Berra"}    {"userId": 42, "userName": "Yogi Berra"}
DELETE   /users/:id

The following API calls will also be used, but not in the programs

Method   Route        Example in   Example out
GET      /users                    [0,3,7]
GET      /users/:id                {"userId": 42, "userName": "Yogi Berra"}
POST     /reset

Representing API calls

Given the information about the API above, it seems the following is enough to represent the two calls of interest, together with a constructor representing the end of a program.
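
Something along these lines would do (a sketch; the constructor names are my assumptions, not necessarily the post’s):

    import Data.Text (Text)

    data ApiCall
      = AddUser Text    -- POST /users, with the user name in the body
      | DeleteUser Int  -- DELETE /users/:id
      | EndProgram      -- marks the end of a generated program
      deriving (Eq, Show)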

And a program is just a sequence of calls, so a list of ApiCall will do. However, since I want to generate sequences of calls, i.e. implement Arbitrary, I’ll wrap it in a newtype.
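
For example (again a sketch, not the post’s exact code):

    newtype Program = Program [ApiCall]
      deriving (Show)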

Running against a model (simulation)

First of all I need to decide what model to use. Based on the part of the API I’m using, I’ll use an ordinary dictionary of Int and Text.
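
In other words, something like this (a sketch, assuming the containers package):

    import Data.Map (Map)
    import Data.Text (Text)

    type Model = Map Int Text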

Simulating execution of a program is simulating each call against a model that’s updated with each step. I expect the final model to correspond to the state of the real service after the program is run for real. The simulation begins with an empty dictionary.

The simulation of the API calls must then be a function taking a model and a call, returning an updated model.
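
A sketch of such a function, building on the types above (simulateProgram simply folds it over a whole program, starting from the empty dictionary):

    import qualified Data.Map as Map

    simulateCall :: Model -> ApiCall -> Model
    simulateCall m (AddUser name) = Map.insert newId name m
      where
        newId = if Map.null m then 1 else succ (fst (Map.findMax m))
    simulateCall m (DeleteUser i) = Map.delete i m
    simulateCall m EndProgram     = m

    simulateProgram :: Program -> Model
    simulateProgram (Program calls) = foldl simulateCall Map.empty calls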

Here I have to make a few assumptions. First, I assume the indices for the users start at 1. Second, that the next index used is always the successor of the highest currently used index. We’ll see how well this holds up to reality later on.

Running against the web service

Running the program against the actual web service follows the same pattern, but here I’m dealing with the real world, so it’s a little more messy, i.e. IO is involved. First, the running of a single call.
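
A sketch using http-client and aeson (the base URL and JSON shape are assumptions based on the tables above):

    {-# LANGUAGE OverloadedStrings #-}
    import Data.Aeson (encode, object, (.=))
    import Network.HTTP.Client

    baseUrl :: String
    baseUrl = "http://localhost:3000"

    runCall :: Manager -> ApiCall -> IO ()
    runCall mgr (AddUser name) = do
      req <- parseRequest ("POST " ++ baseUrl ++ "/users")
      let body = encode (object ["userId" .= (0 :: Int), "userName" .= name])
      _ <- httpLbs req { requestBody = RequestBodyLBS body
                       , requestHeaders = [("Content-Type", "application/json")]
                       } mgr
      pure ()
    runCall mgr (DeleteUser i) = do
      req <- parseRequest ("DELETE " ++ baseUrl ++ "/users/" ++ show i)
      _ <- httpLbs req mgr
      pure ()
    runCall _ EndProgram = pure ()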

The running of a program is slightly more involved. Of course I have to set up the Manager needed for the HTTP calls, but I also need to

  1. ensure that the web service is in a well-known state before starting, and
  2. extract the state of the web service after running the program, so I can compare it to the model

The call to POST /reset resets the web service. I would have liked to simply restart the service completely, but I failed in automating it. I think I’ll have to take a closer look at the implementation of scotty to find a way.

Extracting the web service state and packaging it in a Model is a matter of calling GET /users and then repeatedly calling GET /users/:id with each id gotten from the first call.
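
A sketch of the whole run, reusing baseUrl and runCall from the previous sketch (again, the field names follow the tables above):

    {-# LANGUAGE OverloadedStrings #-}
    import Control.Monad (forM)
    import Data.Aeson (FromJSON (..), decode, withObject, (.:))
    import qualified Data.Map as Map
    import Data.Maybe (fromMaybe)
    import Data.Text (Text)
    import Network.HTTP.Client

    data User = User Int Text

    instance FromJSON User where
      parseJSON = withObject "User" $ \o ->
        User <$> o .: "userId" <*> o .: "userName"

    runProgram :: Manager -> Program -> IO Model
    runProgram mgr (Program calls) = do
      -- 1. put the service into a well-known state
      _ <- flip httpLbs mgr =<< parseRequest ("POST " ++ baseUrl ++ "/reset")
      -- run the generated program for real
      mapM_ (runCall mgr) calls
      -- 2. read the state back: GET /users, then GET /users/:id for each id
      idsResp <- flip httpLbs mgr =<< parseRequest ("GET " ++ baseUrl ++ "/users")
      let ids = fromMaybe [] (decode (responseBody idsResp)) :: [Int]
      users <- forM ids $ \i -> do
        resp <- flip httpLbs mgr =<< parseRequest ("GET " ++ baseUrl ++ "/users/" ++ show i)
        case decode (responseBody resp) of
          Just (User uid name) -> pure (uid, name)
          Nothing              -> fail ("unexpected response for user " ++ show i)
      pure (Map.fromList users)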

Generating programs

My approach to generating a program is based on the idea that given a certain state there is only a limited number of possible calls that make sense. Given a model m it makes sense to make one of the following calls:

  • add a new user
  • delete an existing user
  • end the program

Based on this, writing genProgram is rather straightforward.
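
A sketch of that generator, building on the pieces above (the pool of user names is arbitrary):

    import qualified Data.Map as Map
    import qualified Data.Text as T
    import Test.QuickCheck (Gen, elements)

    genProgram :: Gen Program
    genProgram = Program <$> go Map.empty
      where
        go m = do
          name <- T.pack <$> elements ["Alice", "Bob", "Yogi Berra"]
          let additions = [AddUser name]
              deletions = map DeleteUser (Map.keys m)
          call <- elements (EndProgram : additions ++ deletions)
          case call of
            EndProgram -> pure [EndProgram]
            _          -> (call :) <$> go (simulateCall m call)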

Armed with that, the Arbitrary instance for Program can be implemented as follows.¹
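
Which, in sketch form, is just:

    import Test.QuickCheck (Arbitrary (..))

    instance Arbitrary Program where
      arbitrary = genProgram
      -- shrink is deliberately omitted here; see the footnote below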

The property of an API

The steps in the first section can be used as a recipe for writing the property.
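
Putting the sketches above together, the property might read like this (runProgram and simulateProgram are the functions sketched earlier):

    import Network.HTTP.Client (defaultManagerSettings, newManager)
    import Test.QuickCheck (Property)
    import Test.QuickCheck.Monadic (assert, monadicIO, run)

    prop_usersApi :: Program -> Property
    prop_usersApi program = monadicIO $ do
      -- steps 1 and 2: run the program against the model
      let expected = simulateProgram program
      -- step 3: run the same program against the web service
      actual <- run $ do
        mgr <- newManager defaultManagerSettings
        runProgram mgr program
      -- step 4: compare the model against reality
      assert (expected == actual)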

What next?

There are some improvements that I’d like to make:

  • Make the generation of Program better in the sense that the programs become longer. I think this is important as I start tackling larger APIs.
  • Write an implementation of shrink for Program. With longer programs it’s of course more important to actually implement shrink.

I’d love to hear if others are using QuickCheck to test REST APIs in some way, if anyone has suggestions for improvements, and of course ideas for how to implement shrink in a nice way.

  1. Yes, I completely skip the issue of shrinking programs at this point. This is OK for now, though, because the generated Programs end up being very short indeed.


July 14, 2018 12:00 AM

July 13, 2018

Paul Johnson

Donald Trump is Zaphod Beeblebrox

Here are some quotes from the Hitchhiker’s Guide to the Galaxy, lightly edited...

One of the major difficulties Trillian Theresa May experienced in her relationship with Zaphod Donald Trump was learning to distinguish between him pretending to be stupid just to get people off their guard, pretending to be stupid because he couldn’t be bothered to think and wanted someone else to do it for him, pretending to be outrageously stupid to hide the fact that he actually didn’t understand what was going on, and really being genuinely stupid.

It is a well known fact that those people who most want to rule people are, ipso facto, those least suited to do it. To summarize the summary: anyone who is capable of getting themselves made President should on no account be allowed to do the job.

The President in particular is very much a figurehead — he wields no real power whatsoever. He is apparently chosen by the government voters, but the qualities he is required to display are not those of leadership but those of finely judged outrage. For this reason the President is always a controversial choice, always an infuriating but fascinating character. His job is not to wield power but to draw attention away from it. On those criteria Zaphod Beeblebrox Donald Trump is one of the most successful Presidents the Galaxy United States has ever had — he has already spent two of his ten presidential years in prison for fraud.

by Paul Johnson (noreply@blogger.com) at July 13, 2018 10:21 PM

July 12, 2018

Mark Jason Dominus

Don't do this either

Here is another bit of Perl code:

 sub function {
   my ($self, $cookie) = @_;
   $cookie = ref $cookie && $cookie->can('value') ? $cookie->value : $cookie;
   ...
 }

The idea here is that we are expecting $cookie to be either a string, passed directly, or some sort of cookie object with a value method that will produce the desired string. The ref … && … condition distinguishes the two situations.

A relatively minor problem is that if someone passes an object with no value method, $cookie will be set to that object instead of to a string, with mysterious results later on.

But the real problem here is that the function's interface is not simple enough. The function needs the string. It should insist on being passed the string. If the caller has the string, it can pass the string. If the caller has a cookie object, it should extract the string and pass the string. If the caller has some other object that contains the string, it should extract the string and pass the string. It is not the job of this function to know how to extract cookie strings from every possible kind of object.

I have seen code in which this obsequiousness has escalated to absurdity. I recently saw a function whose job was to send an email. It needs an EmailClass object, which encapsulates the message template and some of the headers. Here is how it obtains that object:

    12    my $stash = $args{stash} || {};
    …
    16    my $emailclass_obj = delete $args{emailclass_obj}; # isn't being passed here
    17    my $emailclass = $args{emailclass_name} || $args{emailclass} || $stash->{emailclass} || '';
    18    $emailclass = $emailclass->emailclass_name if $emailclass && ref($emailclass);
    …  
    60    $emailclass_obj //= $args{schema}->resultset('EmailClass')->find_by_name($emailclass);

Here the function needs an EmailClass object. The caller can pass one in $args{emailclass_obj}. But maybe the caller doesn't have one, and only knows the name of the emailclass it wants to use. Very well, we will allow it to pass the string and look it up later.

But that string could be passed in any of $args{emailclass_name}, or $args{emailclass}, or $args{stash}{emailclass} at the caller's whim and we have to rummage around hoping to find it.

Oh, and by the way, that string might not be a string! It might be the actual object, so there are actually seven possibilities:

    $args{emailclass}
    $args{emailclass_obj}
    $args{emailclass_name}
    $args{stash}{emailclass}
    $args{emailclass}->emailclass_name
    $args{emailclass_name}->emailclass_name
    $args{stash}{emailclass}->emailclass_name

Notice that if $args{emailclass_name} is actually an emailclass object, the name will be extracted from that object on line 18, and then, 42 lines later, the name may be used to perform a database lookup to recover the original object again.

We hope by the end of this rigamarole that $emailclass_obj will contain an EmailClass object, and $emailclass will contain its name. But can you find any combinations of arguments where this turns out not to be true? (There are several.) Does the existing code exercise any of these cases? (I don't know. This function is called in 133 places.)

All this because this function was not prepared to insist firmly that its arguments be passed in a simple and unambiguous format, say like this:

    my $emailclass = $args->{emailclass} 
          || $self->look_up_emailclass($args->{emailclass_name})
          || croak "one of emailclass or emailclass_name is required";

I am not certain why programmers think it is a good idea to have functions communicate their arguments by way of a round of Charades. But here's my current theory: some programmers think it is discreditable for their function to throw an exception. “It doesn't have to die there,” they say to themselves. “It would be more convenient for the caller if we just accepted either form and did what they meant.” This is a good way to think about user interfaces! But a function's calling convention is not a user interface. If a function is called with the wrong arguments, the best thing it can do is to drop dead immediately, pausing only long enough to gasp out a message explaining what is wrong, and incriminating its caller. Humans are deserving of mercy; calling functions are not.

Allowing an argument to be passed in seven different ways may be convenient for the programmer writing the call, who can save a few seconds looking up the correct spelling of emailclass_name, but debugging what happens when elaborate and inconsistent arguments are misinterpreted will eat up the gains many times over. Code is written once, and read many times, so we should be willing to spend more time writing it if it will save trouble reading it again later.

Novice programmers may ask “But what if this is business-critical code? A failure here could be catastrophic!”

Perhaps a failure here could be catastrophic. But if it is a catastrophe to throw an exception, when we know the caller is so confused that it is failing to pass the required arguments, then how much more catastrophic to pretend nothing is wrong and to continue onward when we are surely ignorant of the caller's intentions? And that catastrophe may not be detected until long afterward, or at all.

There is such a thing as being too accommodating.

by Mark Dominus (mjd@plover.com) at July 12, 2018 12:00 AM

July 10, 2018

Philip Wadler

Lambda Days Research Track

CFP for the Lambda Days Research Track is out. Please submit!

by Philip Wadler (noreply@blogger.com) at July 10, 2018 02:10 PM

Michael Snoyman

Thoughts on Fasting

Yesterday on Twitter, a lot of people were interested in the multiday fast I’m currently doing. Since I’m awake at 4am (more on that below), now seems like a great time to explain a bit about the fasting. I will drop the major caveat at the start: I am in no way an expert on fasting. For that, you’d want Dr. Jason Fung, and likely his book The Complete Guide to Fasting. This will be my personal anecdotes version of that information.

There are many reasons to consider fasting, including:

  • Religious reasons. As many know, I’m a religious Jew, so this concept is not at all foreign to me, and probably helped me more easily get into fasting. However, that’s not my motivation today, or related to the rest of this blog post.
  • Losing weight. It’s pretty hard to put on weight when you’re not eating anything. (But you probably wanted to burn fat, not muscle, right?)
  • Mental clarity. Many people—myself included—report feelings of clear-mindedness and focus during a fast. I find I get some of my best writing and coding done when fasting.
  • Extra energy. This may seem counter-intuitive: where’s the energy coming from? We’ll get to that.
  • General health benefits. Fasting has been well demonstrated to help treat epilepsy, and more recently there have also been claims of anti-cancer benefits. It can also be a highly effective treatment for type 2 diabetes.

Alright, let’s get into the details of how to do this.

Caveats

I’ll get this out of the way right now. I am not a doctor. I cannot give you health advice. I’m telling you about what works for me, a decently healthy person without preexisting conditions. If you are taking medication, especially insulin or anything that affects your blood sugar, talk with a doctor before doing this, or at the very least read one of Dr. Fung’s books.

How to fast

The process is simple: stop eating. This isn’t a “bread and water” fast. This isn’t a “maple syrup and cayenne pepper” fast (worst TED talk I’ve ever seen by the way, I refuse to link to it). You are consuming virtually no calories. (We’ll get to virtually in a second.)

You should continue drinking water, and lots of it. Drinking green tea is helpful too, and black and herbal teas are fine. There’s also no problem with coffee. In fact: the caffeine in coffee can help you burn fat, which will help the fast progress more easily.

You should not drink any sugary drinks. (That’s not just advice for a fast, that’s advice for life too.) You should probably avoid artificial sweeteners as well, since they can cause an insulin spike, and some contain calories.

There are three other things you can include in your fast if desired:

  • Salts, including sodium (normal table salt), potassium, and magnesium. Personally, I’ve used a bit of normal table salt, himalayan salt, or high-potassium salt.
    • If you’re wondering: how do you consume the salt? Yup: I drink salt water sometimes. Delicious.
  • Some cream in your coffee. Don’t overdo it: you’re not making bulletproof coffee here. This is intended to just take some of the edge off of your coffee if you can’t stand black coffee.
  • Bone broth, essentially beef (or other) bones cooking in a crockpot with salt for 24 hours straight. This is a nice way to get in some salts, a few other trace minerals, and give you a satisfied feeling before going to sleep.

The latter two do contribute some calories, almost entirely from fat. Don’t overdo this, you’ll be counteracting the point of the fast. And definitely avoid getting protein or any carbs mixed in; do not eat some of the meat off the bones, or throw in carrots or other starchy vegetables.

If you’re doing a fast for anti-cancer reasons, I’ve read that you should eliminate the cream and broth as well, and maybe consider leaving out the coffee and even tea. Again: I’m not an expert here, and I’ve never tried that kind of fast.

When to fast

Fasting does not require any preparation. Just decide to do it, and stop eating! I typically like to stop eating around 6pm one day, and see how many days I can go before I cave in and eat. I’ll mention some reasons to cave in below.

My maximum fasting time is currently four days. It’s doubtful I’ll ever get beyond six days due to the weekly Jewish Sabbath, but that’s currently my goal. If you’re doing a multiday fast, I’d recommend targeting at least three days, since the second day is by far the hardest (from personal experience, and what I’ve read), and you should see it through till it gets easier.

By the way, the world record longest fast is 382 days. The rules for fasting change a bit at that point, and you would need to take vitamins and minerals to make it through.

Hunger

The biggest fear people have is hunger. This is a normal, natural reaction. As someone who has done fasts plenty of times, I still sometimes feel the fear of being hungry. Allow me to share some personal experience.

When I’m in the middle of eating food, the thought of purposely withholding from eating is terrifying. Once I’ve stopped eating for half a day or so, that terror seems in retrospect to be completely irrational. I strongly believe that I, like many others, can react with an almost chemical addiction-like response to food. It’s nice to occasionally break that dependence.

Many people assume that they’ll be overcome with hunger once they’ve gone somewhere around 8 hours without eating. I’d recommend thinking of it as a challenge: can you handle not eating? Your body is built to withstand times of no food, you’ll be fine. This is a straight mind-over-matter moment.

Another concern many have is memories of trying to diet, and feeling miserable the whole time. Let me relate that it is significantly easier to spend a day not eating anything, and another day eating normally, than it is to spend two days eating less. The simple decision to not eat at all is massively simplifying.

Relationship to ketosis

When most people hear “keto,” they’re thinking of a diet plan that involves eating an absurd amount of bacon with an occasional avocado. In reality, ketosis is a state in the body where your liver is generating ketone bodies for energy. Here’s the gist of it:

  • The two primary fuel sources for the body are fat and carbs (sugar/glucose)
  • Most of your body can happily use either of these
  • However, fat cannot cross the blood-brain barrier, so your brain typically runs exclusively on glucose
  • When you dramatically restrict your carb intake (and to some extent your protein intake), your body has insufficient glucose to run your brain
  • Instead: your liver will convert fat into ketones, which is an alternate energy source your body, and in particular your brain, can use

I’m honestly a bit unsure of the whole topic of proteins in ketosis, as I’ve read conflicting information. But the basic idea of restricting carbs is straightforward. One way to do this is to eat a very-high-fat diet. But another approach is to eat nothing at all. The same process will take place in your body, and you’ll end up in ketosis, likely faster than with a ketogenic diet.

At this point, your whole body will be running off of a combination of ketones and fats, with a small amount of sugar circulating for some cells which can only run on glucose. This explains some of the purported benefits I mentioned above:

  • Running your brain on a different energy source can give you mental focus. This is not universal: some people report no such feeling.
  • Once you switch over to ketosis on a fast, you’re burning your (likely ample) body fat stores, providing a nearly-unlimited supply of raw calories. Compare this to having 3 meals a day: you are typically dependent on each meal to provide your energy, and are just waiting for your next infusion of food for a pick-me-up.
  • Using your body fat as a fuel source obviously means you’ll be burning fat, which is the right way to lose weight! But fasting+ketosis does something even better: it trains your body to be better at burning fat, making it easier to continue burning fat in the future.

One final note: people familiar with Type 1 diabetes may be terrified of the term “keto” due to ketoacidosis. Ketoacidosis is a dangerous medical condition, and does involve ketones. However, it is not the same state as ketosis, and being in ketosis is not dangerous. (But remember my caveats about not being a doctor above, especially if you are a type 1 diabetic.)

Exercise

My personal approach to exercise on a fast is:

  • Follow my normal lifting routine
  • Try to train as hard and heavy as I can, accepting that the lack of incoming food will make me a bit weaker.
  • Schedule a lifting session at the very beginning of the fast, followed by a High Intensity Interval Training (HIIT) session. I still have plenty of energy and glycogen stores to get me through this session on the first day, and the session helps drain those glycogen stores and kick me into ketosis faster.

For weight lifting, I follow an RPE-based program, which essentially adjusts the weight on the bar to how hard it feels. Therefore, as I progress through the fast, I can typically continue following the same program, but the weight I lift will naturally go down a bit.

Side note: at the time of writing this blog post, I’m recovering from wrist tendonitis, and was already deloaded in my lifting, which is why I’m not lifting hard. It’s not because of the fast, but it is affecting how I’m responding to the fast most likely.

Side effects

There are some side effects to fasting worth mentioning, including why I’m writing this blog post now:

  • Insomnia. For me, the ketone rush seems to disrupt my sleep cycle. I don’t actually feel tired during the day, so maybe this is a good thing.
  • I won’t do it justice, but Dr. Fung describes many hormonal changes that occur when you’re in a fasted state. These include things like increased human growth hormone and insulin-like growth factor 1. The upshot of this is that you should experience less protein and muscle wasting during a fast than when following a calorie-restricted diet.
    • This is the primary reason I’m doing the multiday fast right now: I’ve been on a “cut” for a few months and unhappy with the amount of muscle loss I’ve had. I want to finish up the cut quickly and get back into maintenance mode eating, without incurring greater muscle loss. I intend to continue experimenting into the future with using multiday fasts as mini-cuts.
  • Nausea. The first time I did a multiday fast, I got so nauseous that by the end of the third day I had to break the fast. It was bad enough that I couldn’t stand the taste of coffee for the next two months. However, that first fast included no cream or bone broth. Since I’ve added those two things, I haven’t had any more nausea when fasting.
  • Dizziness. Especially when first adapting to the fasted state, you may feel lightheaded. Just be careful, and be sure to drink enough water, and supplement salts as needed. More generally: make sure you get enough fluids.

There are certainly others, these are just the ones that have most affected me.

Ending the fast

I’ve ended the fast in the past because I either felt sick, or because of some personal obligation where I didn’t want to be “compromised” in a fasted state. (This was likely just a cop-out on my part, going on a 4 hour trip with a group of fourth graders isn’t that taxing.) I would recommend setting a goal for how long you want to try to fast for, and be ready to be flexible with that goal (either longer or shorter) depending on how you’re responding.

I don’t really have any great recommendations on what to break your fast with. However, you may want to continue eating a low carb or ketogenic diet. You’ve just put yourself into a deep state of ketosis, which many people struggle to do. May as well take advantage of it!

This may not be universal, and I won’t go into details, but expect some GI distress after the fast ends. Your digestive system has been off duty for a few days, it needs to reboot.

Recommendations

Alright, so that was a lot of information all over the place. You’re probably wondering: what should I do? Frankly, you can do whatever you want! You can choose to dive in at the deep end and start off immediately with a multiday fast. Dr. Fung mentions this in his book. However, in my own experience, I started much more gradually:

  • Low carb and ketogenic diets
  • Intermittent fasting: only eating between 12pm and 8pm, for example
  • Experience with religious fasts
  • Randomly skipping meals
  • A 24-hour fast (stop eating at 6pm, eat the next day at 6pm)
  • A multiday fast

Keep in mind: many people in the world today have never skipped a meal. Consider starting with something as easy as not having breakfast to prove to yourself that you don’t need to consume calories all waking hours of the day.

If you have questions you’d like me to answer in a follow up post, either on fasting, or other related health topics, let me know in the comments below or on Twitter.

Extra: muscle loss

I’m not going to claim that fasting is the miracle that allows us to gain muscle and lose fat. I don’t have enough personal experience to vouch for that even in my own body, and the research I’ve seen is conflicted on the topic. As I love doing n=1 experiments on myself, I intend to do more tests with fasting following periods of moderate overeating to see what happens.

Extra: some calorie math

For anyone in a healthy or above weight range, your body is carrying more than enough calories in fat to sustain you for days, likely weeks. If you do some quick math: there are 3,500 calories in 1 pound (0.45 kg) of fat. If you burn about 2,000 calories a day, you can burn roughly half a pound (.22kg) of fat a day. A 165 pound (70kg) person at 20% bodyfat would have 115,500 calories in fat storage, enough to survive over 55 days. Point being: once you’ve entered the ketosis stage of your fast, you essentially have unlimited energy stores available to use, as opposed to needing another hit from a meal to recharge.
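For the curious, here is the same arithmetic written out as a tiny Haskell sketch (the figures are the illustrative ones above, not precise physiological constants):

caloriesPerPoundOfFat, dailyBurn, bodyWeightLbs, bodyFatFraction :: Double
caloriesPerPoundOfFat = 3500
dailyBurn             = 2000
bodyWeightLbs         = 165
bodyFatFraction       = 0.20

-- 165 lb at 20% bodyfat is 33 lb of fat, or 115,500 kcal of stored energy
storedCalories :: Double
storedCalories = bodyWeightLbs * bodyFatFraction * caloriesPerPoundOfFat

-- at 2,000 kcal/day, that is roughly 57-58 days of energy
daysOfEnergy :: Double
daysOfEnergy = storedCalories / dailyBurn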

Extra: anti-cancer

To be a broken record: I’m not an expert on this topic. But the basic idea of anti-cancer claims for fasting come down to starving out the cancer cells. Cancer cells require lots of glucose (and a bit of protein) for energy. Fasting starves out the cancer cells. I’m not telling anyone to skip chemo or radiation. But I do personally know someone who defeated breast cancer with an (I believe) 11 day fast.

Extra: relevant tweets

Here are some of the relevant tweets from the discussion yesterday.

July 10, 2018 03:10 AM

July 09, 2018

mightybyte

Simplicity

I'm reposting this here because it cannot be overstated.  I've been saying roughly the same thing for a while now.  It's even the title of this blog!  Drew DeVault sums it up very nicely in his opening sentences:
The single most important quality in a piece of software is simplicity. It’s more important than doing the task you set out to achieve. It’s more important than performance.
Here's the full post: https://drewdevault.com/2018/07/09/Simple-correct-fast.html

I would also like to add another idea.  It has been my observation that the attributes of "smarter" and "more junior" tend to be more highly correlated with losing focus on simplicity.  Intuitively this makes sense because smarter people will tend to grasp complicated concepts more easily.  Also, smarter people tend to be enamored by clever complex solutions.  Junior engineers usually don't appreciate how important simplicity is as much as senior engineers--at least I know I didn't!  I don't have any great wisdom about how to solve this problem other than that the first step to combating the problem is being aware of it.

by mightybyte (noreply@blogger.com) at July 09, 2018 08:22 PM

Stackage Blog

Announce: Stackage LTS 12 with ghc-8.4.3

We are pleased to announce the release of lts-12.0, the first in a new LTS Haskell snapshot series, using ghc-8.4.3. (See: changes from lts-11.17.) We thank all of the package authors involved in supporting the Haskell ecosystem. LTS Haskell would not be possible without you!
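If you build your projects with Stack, pointing at the new snapshot is a one-line change in stack.yaml, for example:

resolver: lts-12.0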

To add your package to future snapshots in the LTS 12 series, please open a new issue on commercialhaskell/lts-haskell.

As per our usual LTS release procedure, we have lifted many constraints from the Stackage nightly build plan. Various packages were removed in order to accomplish this, and package authors should have been notified on the corresponding github issues. To add (or restore) your package to the Stackage nightly builds, see MAINTAINERS.md on the stackage repo.

It is uncertain at this time whether LTS 13 (anticipated to release in 3 to 5 months) will use ghc-8.4 or ghc-8.6. We encourage package authors to make use of the ghc alpha and rc releases in order to test their packages and speed up the adoption of new ghc versions.

July 09, 2018 12:01 AM


July 08, 2018

Neil Mitchell

Inside the paper: Build Systems a la Carte

Summary: How we went about writing a build systems comparison paper, how we learnt what we learnt, and why the end result surprised us. A glimpse inside writing a paper.

The final version of the Build Systems a la Carte paper has just been sent to ICFP 2018 - see an overview from one of my co-authors. The paper is a collaboration between Andrey Mokhov at Newcastle University, Simon Peyton Jones at Microsoft Research and me (working in industry). Below is the tale of how we discovered what we discovered, hopefully slightly demystifying the research process. While the paper is a collaboration, this blog post is my view and mine alone.

The paper started with the idea of comparing and contrasting build systems. There were two motivating factors, I wanted a blueprint for writing Cloud Shake, while Simon wanted to compare build systems (Andrey wanted a bit of both). The thing we all agreed on was that Haskell is a great executable specification for describing build systems, and that refactoring is a powerful tool. Armed with that approach, we went off to try and implement various build systems, chosen based on our familiarity with them and the perceived differences between them. You can see our progress in the git repo, starting 20th Feb (less than a month before the ICFP deadline!).

All of us came to the table with some predefined notions of what should and shouldn't be in the model. Andrey brought the Store abstraction. I brought the ideas of monadic vs applicative dependencies. We iterated and quickly made our first "breakthrough", a task abstraction which nicely modelled user rules, including the difference between monadic and applicative dependencies:

type Tasks c k v = forall f . c f => (k -> f v) -> (k -> f v)

Essentially, given a way to build dependencies, I can give you a way to build any key. By parameterising the Tasks by c (of type Constraint) we can produce Tasks Monad and Tasks Applicative, nicely capturing the differences in power. It was only later when preparing an artefact for evaluation that we noticed Docker is a Tasks Functor build system. We made a number of iterations on this Tasks type (adding and removing newtype, how to represent input files, where to put the second k etc) - but fundamentally had a model to work with.
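To make the difference in power concrete, here is a small sketch in the spirit of the paper's spreadsheet examples (the keys are invented purely for illustration, and the question of how input keys are represented is glossed over by simply erroring on them):

{-# LANGUAGE ConstraintKinds, RankNTypes #-}

-- restating the Tasks type from above
type Tasks c k v = forall f . c f => (k -> f v) -> (k -> f v)

-- Applicative: the dependencies of "B1" are fixed and knowable up front.
sprsh1 :: Tasks Applicative String Integer
sprsh1 fetch "B1" = (+) <$> fetch "A1" <*> fetch "A2"
sprsh1 _     key  = error ("input key, not built by these tasks: " ++ key)

-- Monadic: which key gets fetched next depends on the value of "C1".
sprsh2 :: Tasks Monad String Integer
sprsh2 fetch "B1" = do
  c1 <- fetch "C1"
  if c1 == 1 then fetch "A1" else fetch "A2"
sprsh2 _     key  = error ("input key, not built by these tasks: " ++ key)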

The next step was to start writing build systems. We picked Make, Shake, Excel, Ninja and Bazel as our first set to get working. Implementing these systems effectively became a four-dimensional optimisation problem:

  • Closeness of the model to the underlying system it was based on.
  • Simplicity of code for each individual system.
  • Reuse of code across build systems.
  • Reuse of abstractions across build systems.

The first versions were separate monoliths of code, reusing a handful of helper functions, with a fairly arbitrary set of what to model and what to exclude. Since we had executable specifications, with tests, we came up with possible improvements, tried them, and decided whether to keep them or discard them. We iterated, as individuals, as pairs (all three possible pairs) and as a group - making improvements along various dimensions. For a good few weeks Andrey and myself had competing directories in the repo, with different underlying ideas but stealing refinements from each other. I think there were about 25 separate "breakthroughs" to move the code to where we ended up. As the code became clearer, we started to understand the commonalities behind build systems, which helped the code become clearer - a virtuous cycle. Simon's role was to say "We have to make this simpler" or "I don’t understand that". Some of the time it couldn't be simpler; and we had to make sure the explanations were really understandable. But most of the time we really did make it simpler and the exposition is much better as a result.

The most academically interesting breakthrough was to realise that build systems can be split into something that decides what to rebuild, and something that orders the rebuilding, putting build systems in a two-dimensional table. While the result feels natural (if you carefully structure your description of a build system it even falls out grammatically!), it was entirely non-obvious beforehand, and emerged naturally by following the abstraction opportunities presented by the code.

By the end we were able to faithfully model details of Make/Excel/Shake that initially eluded us, with each build system being just two functions, where all functions could be combined to produce working build systems. As an example, Shake is:

shake = suspending vtRebuilder

The suspending is also used by Nix, and the vtRebuilder is also used by Ninja. Shake is just putting two existing things together, so we have great reuse of code and abstractions between build systems. In some places the code is more complex than I'd like, but you can't have everything (or maybe you can - we may well improve the models further).

After submitting the paper to ICFP 2018, we also put a draft online, which led to a deluge of comments from experts in many of the systems we talked about - the acknowledgements in the paper start to show how much excellent feedback we got. The most interesting feedback was that we'd misclassified Bazel - it's actually more like Excel than we realised. What was particularly nice is that our framework was able to describe what we thought Bazel was in enough detail that people involved with Bazel could correct us - a clear sign we were modelling interesting details.

Now that the paper is done, I hope the abstractions can start proving their value. In the context of Shake, I hope it can serve as a design document. Ever since the earliest days of Shake, I've used a two-timestamp approach to recording what happened with a key, as described in S2.3.1 of the original paper. Unfortunately, whenever I've tried to explain this trick to someone in person, their eyes glaze over. Fortunately, given a clear model, I now know that what I've really implemented is an optimisation over vtRebuilder. Furthermore, I now have the framework to talk about the benefits and costs of the optimisation, making it much easier to understand and justify.

My goal before writing the paper was to turn Shake into Cloud Shake, and to do that in a principled way. Now the paper is finished I can resume that quest, with a fairly good understanding of how to do it. One thing the paper sensibly abstracts over is all the technical details (parallelism, network traffic etc) - armed with the right abstractions those technical details are what I'll be focusing on for Cloud Shake.

Thinking more broadly, the things this paper taught me (or that I already thought but it confirmed):

  • Follow the Simon Peyton Jones how to write a paper guidelines, of which number 1 is most important. "Writing papers is a primary mechanism for doing research (not just for reporting it)".
  • Innovation isn't thinking in isolation, it's thinking of a process that gives you the right problems, the right tools, and the right direction. With those things in place, the chances of ending up somewhere interesting increase dramatically.
  • Deadlines spur writing papers. It feels like we should be better, and not need the external deadlines, but it seems to help in practice.
  • Simplicity is valuable in its own right. The main research contribution of this paper sounds obvious a minute after explaining it, which makes me very happy.
  • Co-authors matter. As a set of co-authors we agree on some things (e.g. Haskell), but disagree strongly on others (e.g. two parallel branches of development, 6 rejected pull requests). I am sure the paper would have been significantly worse with any one of us removed (these are my conclusions, I don't guarantee my co-authors agree!).
  • Feedback is super valuable, whether it comes from peer reviewers or Reddit comments. The feedback improved the paper, and also motivated us.

Hopefully this post lets people in on the secret that writing academic papers isn't magic, that papers don't emerge fully formed, and that it involves a lot of work and refinement.

by Neil Mitchell (noreply@blogger.com) at July 08, 2018 10:25 PM

Sandy Maguire

Gauging Interest in a Type-Level Programming Book

I've been working on a book about type-level programming in Haskell for a few weeks now. I'm releasing the first three chapters today in order to gauge interest, as well as get feedback on its style. If this sounds like something that interests you, I'd love to hear from you. Thanks for your time!

You can find a very early pre-release of it here.

July 08, 2018 12:51 PM


July 07, 2018

Mark Jason Dominus

Don't do this

[ This article has undergone major revisions since it was first published yesterday. ]

Here is a line of Perl code:

  if ($self->fidget && blessed $self->fidget eq 'Widget::Fidget') {

This looks to see if $self has anything in its fidget slot, and if so it checks to see if the value there is an instance of the class Widget::Fidget. If both are true, it runs the following block.

That blessed check is bad practice for several reasons.

  1. It duplicates the declaration of the fidget member data:

    has fidget => (
      is  => 'rw',
      isa => 'Widget::Fidget',
      init_arg => undef,
    );
    

    So the fidget slot can't contain anything other than a Widget::Fidget, because the OOP system is already enforcing that. That means that the blessed … eq test is not doing anything — unless someone comes along later and changes the declared type, in which case the test will then be checking the wrong condition.

  2. Actually, that has already happened! The declaration, as written, allows fidget to be an instance not just of Widget::Fidget but of any class derived from it. But the blessed … eq check prevents this. This reneges on a major promise of OOP, that if a class doesn't have the behavior you need, you can subclass it and modify or extend it, and then use objects from the subclass instead. But if you try that here, the blessed … eq check will foil you.

    So this is a prime example of “… in which case the test will be checking the wrong condition” above. The test does not match the declaration, so it is checking the wrong condition. The blessed … eq check breaks the ability of the class to work with derived classes of Widget::Fidget.

  3. Similarly, the check prevents someone from changing the declared type to something more permissive, such as

    “either Widget::Fidget or Gidget::Fidget”

    or

    “any object that supports wiggle and waggle methods”

    or

    “any object that adheres to the specification of Widget::Interface”

    and then inserting a different object that supports the same interface. But the whole point of object-oriented programming is that as long as an object conforms to the required interface, you shouldn't care about its internal implementation.

  4. In particular, the check above prevents someone from creating a mock Widget::Fidget object and injecting it for testing purposes.

  5. We have traded away many of the modularity and interoperability guarantees that OOP was trying to preserve for us. What did we get in return? What are the purported advantages of the blessed … eq check? I suppose it is intended to detect an anomalous situation in which some completely wrong object is somehow stored into the self.fidget member. The member declaration will prevent this (that is what it is for), but let's imagine that it has happened anyway. This could be a very serious problem. What will happen next?

    With the check in place, the bug will go unnoticed because the function will simply continue as if it had no fidget. This could cause a much more subtle failure much farther down the road. Someone trying to debug this will be mystified: At best “it's behaving as though it had no fidget, but I know that one was set earlier”, and at worst “why is there two years of inconsistent data in the database?” This could take a very long time to track down. Even worse, it might never be noticed, and the method might quietly do the wrong thing every time it was used.

    Without the extra check, the situation is much better: the function will throw an exception as soon as it tries to call a fidget method on the non-fidget object. The exception will point a big fat finger right at the problem: “hey, on line 2389 you tried to call the rotate method on a Skunk::Stinky object, but that class has no such method”. Someone trying to debug this will immediately ask the right question: “Who put a skunk in there instead of a widget?”

It's easy to get this right. Instead of

  if ($self->fidget && blessed $self->fidget eq 'Widget::Fidget') {

one can simply use:

  if ($self->fidget) {

Moral of the story: programmers write too much code.

I am reminded of something chess master Aron Nimzovitch once said, maybe in Chess Praxis, that amateur chess players are always trying to be Doing Something.

by Mark Dominus (mjd@plover.com) at July 07, 2018 06:45 PM

Sandy Maguire

Static Analysis of Free Monads

Motivation

A common misperception of free monads is that they allow for analysis of a program expressed with them. This is not true, and in fact, monads are too liberal an abstraction to allow for inspection in general.

In order to see why, consider the following monadic expression:

      getLine
  >>= \str -> if str == "backdoor"
                 then launchNukes
                 else pure ()

The problem here is that bind is expressed via a continuation, and we're unable to look inside that continuation without calling the function. So we're stuck. We can't determine if the above program will ever call launchNukes unless we just happen to call the lambda with the exact string "backdoor".

So, in general, we're unable to statically inspect monads. We can run them (not necessarily in the IO monad) and see what happens, but getting a free monad to help with this is equivalent to mocking the exact problem domain. But, even though we can't do it in general, it seems like we should be able to do it in certain cases. Consider the following monadic expression:

            getLine
  >>= \_ -> launchNukes

In this case, the computation doesn't actually care about the result of getLine, so in theory we can just call the continuation with undefined and find that yes indeed this expression will call launchNukes.

Notice that we can't use this strategy in the first expression we looked at, because that one scrutinized the result of getLine, and branched depending on it. If we tried passing undefined to it, it would crash with an error when we tried to force the final value of the monad (although this might be preferable to actually launching nukes.)
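For concreteness, here is roughly what those two programs look like when encoded as a free monad. The CmdF functor and its smart constructors are invented for this sketch (using the free package); they are not from any particular effect library:

{-# LANGUAGE DeriveFunctor #-}

import Control.Monad.Free (Free, liftF)

-- A hypothetical command functor covering the two operations above.
data CmdF k
  = GetLine (String -> k)
  | LaunchNukes k
  deriving Functor

getLine' :: Free CmdF String
getLine' = liftF (GetLine id)

launchNukes' :: Free CmdF ()
launchNukes' = liftF (LaunchNukes ())

-- The branching program: inspection gets stuck at the lambda.
backdoor :: Free CmdF ()
backdoor = getLine' >>= \str ->
  if str == "backdoor" then launchNukes' else pure ()

-- The non-branching program: the continuation ignores its argument,
-- so feeding it undefined would still reveal LaunchNukes.
careless :: Free CmdF ()
careless = getLine' >> launchNukes'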

This example of launchNukes is admittedly rather silly. My original motivation for investigating this is in the context of ecstasy in which users can query and manipulate disparate pieces of data. For example, if we wanted to write a physics simulator where each object may or may not have any of a position :: V2 Double, a velocity :: V2 Double and a hasPhysics :: Bool, we could write the following piece of code to update the positions of any entities that have a velocity and are, in fact, affected by physics:

emap $ do
  p <- query position
  v <- query velocity
  h <- query hasPhysics

  guard h

  pure unchanged
    { position = Set $ p + v ^* timeDelta
    }

Because objects are not required to have all of the possible data, mapping this function will intentionally fail for any of the following reasons:

  • the object did not have a position field
  • the object did not have a velocity field
  • the object did not have a hasPhysics field
  • the object had a hasPhysics field whose value was False

Without being able to statically analyze this monadic code, our only recourse is to attempt to run it over every object in the universe, and be happy when we succeed. While such an approach works, it's terribly inefficient if the universe is large but any of the position, velocity or hasPhysics fields is sparse.

What would be significantly more efficient for large worlds with sparse data would be to compute the intersection of objects who have all three of position, velocity and hasPhysics, and then run the computation only over those objects. Free applicatives (which are amenable to static analysis) are no good here, since our guard h line really-and-truly is necessarily monadic.

Any such static analysis of this monad would be purely an optimization, which suggests we don't need it to be perfect; all that we are asking for is for it to be better than nothing. A best-effort approach in the spirit of our earlier "just pass undefined around and hope it doesn't crash" would be sufficient, provided we can be convinced it won't actually crash.

What we'd really like to be able to do is count every occurrence of query in this monad before it branches based on the result of an earlier query. Which is to say we'd like to pass undefined around, do as much static analysis as we can, and then somehow fail our analysis just before Haskell would crash due to evaluating an undefined.

Prospecting Monads

I've been playing around with this conceptual approach for some time, but could never seem to get it to work. Laziness can sure be one hell of a bastard when you're trying to pervert Haskell's execution order.

However, last week Foner et al. dropped a bomb of a paper Keep Your Laziness in Check, which describes a novel approach for observing evaluations of thunks in Haskell. The gist of the technique is to use unsafePerformIO to build an IORef, and then set its value at the same time you force the thunk. If you (unsafely) read from the IORef and see that it hasn't been set, then nobody has forced your value yet.

We can use a similar approach to accomplish our optimization goals. For the crux of the approach, consider the follow verify function that will evaluate a pure thunk and return empty if it instead found a bottom:

{-# LANGUAGE BangPatterns, ScopedTypeVariables #-}

import Control.Applicative (Alternative (..))
import Control.Exception (SomeException, catch)
import System.IO.Unsafe (unsafePerformIO)

verify :: Alternative f => a -> f a
verify a = unsafePerformIO $ do
  catch
    (let !_ = a      -- force the thunk to WHNF
      in pure $ pure a)
    (\(_ :: SomeException) -> pure empty)

The bang pattern !_ = a tells Haskell to seq a, which reduces it to WHNF, which, if its WHNF is bottom, will be caught inside of the catch. unsafePerformIO is necessary here, because exceptions can only be caught in IO.

Using this approach, if we're very careful, we can tear down a free monad by following its continuations using bottom, and doing the verify trick to stop exactly when we need to.

I call such a thing prospect, and you can find it on github. The name comes from the fact that this can lead to gold, but carries with it the intrinsic dangers of plumbing into the depths, such as cave-ins, blackened lungs, or the worse things that dwell in the darkness.

The primary export of prospect is the titular function prospect :: Free f a -> (Maybe a, [f ()]), which tears down a free monad, tells you whether or not it has a pure return value, and spits out as many f constructors as it could before the computation branched. If you got a Just back, it means it found every constructor, but there are no other guarantees.

Happy prospecting!


Huge shoutouts to Vikrem who was a very patient sounding-board for all of my crazy ideas, and to kcsongor who suggested that I pay a lot more attention to where I'm being strict.

July 07, 2018 03:45 PM

July 06, 2018

Mark Jason Dominus

In which, to my surprise, I find myself belonging to a group

My employer ZipRecruiter had a giant crisis last month, of a scale that I have never seen at this company, and indeed, have never seen at any well-run company before. A great many of us, all the way up to the CTO, made a heroic effort for a month and got it sorted out.

It reminded me a bit of when Toph was three days old and I got a call from the hospital to bring her into the emergency room immediately. She had jaundice, which is not unusual in newborn babies. It is easy to treat, but if untreated it can cause permanent brain damage. So Toph and I went to the hospital, where she underwent the treatment, which was to have very bright lights shined directly on her skin for thirty-six hours. (Strange but true!)

The nurses in the hospital told me they had everything under control, and they would take care of Toph while I went home, but I did not go. I wanted to be sure that Toph was fed immediately and that her diapers were changed timely. The nurses have other people to take care of, and there was no reason to make her wait to eat and sleep when I could be there tending to her. It was not as if I had something else to do that I felt was more important. So I stayed in the room with Toph until it was time for us to go home, feeding her and taking care of her and just being with her.

It could have been a very stressful time, but I don't remember it that way. I remember it as a calm and happy time. Toph was in no real danger. The path forward was clear. I had my job, to help Toph get better, and I was able to do it undistracted. The hospital (Children's Hospital of Philadelphia) was wonderful, and gave me all the support I needed to do my job. When I got there they showed me the closet where the bedding was and the other closet where the snacks were and told me to help myself. They gave me the number to call at mealtimes to order meals to be sent up to my room. They had wi-fi so I could work quietly when Toph was asleep. Everything went smoothly, Toph got better, and we went home.

This was something like that. It wasn't calm; it was alarming and disquieting. But not in an entirely bad way; it was also exciting and engaging. It was hard work, but it was work I enjoyed and that I felt was worth doing. I love working and programming and thinking about things, and doing that extra-intensely for a few weeks was fun. Stressful, but fun.

And I was not alone. So many of the people I work with are so good at their jobs. I had all the support I needed. I could focus on my part of the work and feel confident that the other parts I was ignoring were being handled by competent and reasonable people who were at least as dedicated as I was. The higher-up management was coordinating things from the top, connecting technical and business concerns, and I felt secure that the overall design of the new system would make sense even if I couldn't always understand why. I didn't want to think about business concerns, I wanted someone else to do it for me and tell me what to do, and they did. Other teams working on different components that my components would interface with would deliver what they promised and it would work.

And the other programmers in my group were outstanding. We were scattered all over the globe, but handed off tasks to one another without any mishaps. I would come into work in the morning and the guys in Europe would be getting ready to go to bed and would tell me what they were up to and the other east-coasters and I could help pick up where they left off. The earth turned and the west-coasters appeared and as the end of the day came I would tell them what I had done and they could continue with it.

I am almost pathologically averse to belonging to groups. It makes me uncomfortable and even in groups that I have been associated with for years I feel out of place and like my membership is only provisional and temporary. I always want to go my own way and if everyone around me is going a different way I am suspicious and contrarian. When other people feel group loyalty I wonder what is wrong with them.

The up-side of this is that I am more willing than most people to cross group boundaries. People in a close-knit community often read all the same books and know all the same techniques for solving problems. This means that when a problem comes along that one of them can't solve, none of the rest can solve it either. I am sometimes the person who can find the solution because I have spent time in a different tribe and I know different things. This is a role I enjoy.

Higher-Order Perl exemplifies this. To write Higher-Order Perl I visited functional programming communities and tried to learn techniques that those communities understood that people outside those communities could use. Then I came back to the Perl community with the loot I had gathered.

But it's not all good. I have sometimes been able to make my non-belonging work out well. But it is not a choice; it's the way I am made, and I can't control it. When I am asked to be part of a team, I immediately become wary and wonder what the scam is. I can be loyal to people personally, but I have hardly any group loyalty. Sometimes this can lead to ugly situations.

But in fixing this crisis I felt proud to be part of the team. It is a really good team and I think it says something good about me that I can work well with the rest of them. And I felt proud to be part of this company, which is so functional, so well-run, so full of kind and talented people. Have I ever had this feeling before? If I have it was a long, long time ago.

G.H. Hardy once wrote that when he found himself forced to listen to pompous people, he would console himself by thinking:

Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.

Well, I was at ZipRecruiter during the great crisis of June 2018 and I was able to do my part and to collaborate with those people on equal terms, and that is something to be proud of.

by Mark Dominus (mjd@plover.com) at July 06, 2018 06:27 AM

July 04, 2018

Mark Jason Dominus

Jackson and Gregg on optimization

Today Brendan Gregg's blog has an article Evaluating the Evaluation: Benchmarking Checklist that begins:

A co-worker introduced me to Craig Hanson and Pat Crain's performance mantras, which neatly summarize much of what we do in performance analysis and tuning. They are:

Performance mantras

  1. Don't do it
  2. Do it, but don't do it again
  3. Do it less
  4. Do it later
  5. Do it when they're not looking
  6. Do it concurrently
  7. Do it cheaper

I found this striking because I took it to be an obvious reference to Michael A. Jackson's advice in his brilliant 1975 book Principles of Program Design. Jackson said:

We follow two rules in the matter of optimization:

Rule 1: Don't do it.
Rule 2 (for experts only). Don't do it yet.

The intent of the two passages is completely different. Hanson and Crain are offering advice about what to optimize. “Don't do it” means that to make a program run faster, eliminate some of the things it does. “Do it, but don't do it again” means that to make a program run faster, have it avoid repeating work it has already done, say by caching results. And so on.

Jackson's advice is of a very different nature. It is only indirectly about improving the program's behavior. Instead it is addressing the programmer's behavior: stop trying to optimize all the damn time! It is not about what to optimize but whether, and Jackson says that to a first approximation, the answer is no.

Here are Jackson's rules with more complete context. The quotation is from the preface (page vii) and is discussing the style of the examples in his book:

Above all, optimization is avoided. We follow two rules in the matter of optimization:

Rule 1. Don't do it.
Rule 2 (for experts only). Don't do it yet — that is, not until you have a perfectly clear and unoptimized solution.

Most programmers do too much optimization, and virtually all do it too early. This book tries to act as an antidote. Of course, there are systems which must be highly optimized if they are to be economically useful, and Chapter 12 discusses some relevant techniques. But two points should always be remembered: first, optimization makes a system less reliable and harder to maintain, and therefore more expensive to build and operate; second, because optimization obscures structure it is difficult to improve the efficiency of a system which is already partly optimized.

Here's some code I dealt with this month:

    my $emailclass = $args->{emailclass};
    if (!$emailclass && $args->{emailclass_name} ) {
      # do some caching so if we're called on the same object over and over we don't have to do another find.
      my $last_emailclass = $self->{__LAST_EMAILCLASS__};
      if ( $last_emailclass && $last_emailclass->{name} eq $args->{emailclass_name} ) {
        $emailclass = $last_emailclass->{emailclass};
      } else {
        $emailclass = $self->schema->resultset('EmailClass')
          ->find_by_name($args->{emailclass_name});
        $self->{__LAST_EMAILCLASS__} = {
                                        name => $args->{emailclass_name},
                                        emailclass => $emailclass,
                                       };
      }
    }

Holy cow, this is wrong in so many ways. 8 lines of this mess, for what? To cache a single database lookup (the ->find_by_name call), in a single object, if it happens to be looking for the same name as last time. If caching was actually wanted, it should have been addressed in the ->find_by_name call, which could do the caching more generally, and which has some hope of knowing something about when the cache entries should be expired. Even stipulating that caching was wanted and for some reason should have been put here, why such an elaborate mechanism, all to cache just the last lookup? It could have been:

    $emailclass = $self->emailclass_for_name($args->{emailclass_name});
    ...

    sub emailclass_for_name {
      my ($self, $name) = @_;
      $self->{emailclass}{$name} //=
        $self->schema->resultset('EmailClass')->find_by_name($name);
      return $self->{emailclass}{$name};
    }   

I was able to do a bit better than this, and replaced the code with:

    $emailclass = $self->schema->resultset('EmailClass')
          ->find_by_name($args->{emailclass_name});

My first thought was that the original caching code had been written by a very inexperienced programmer, someone who with more maturity might learn to do their job with less wasted effort. I was wrong; it had been written by a senior developer, someone who with more maturity might learn to do their job with less wasted effort.

The tragedy did not end there. Two years after the original code was written a more junior programmer duplicated the same unnecessary code elsewhere in the same module, saying:

I figured they must have had a reason to do it that way…

Thus is the iniquity of the fathers visited on the children.

In a nearby piece of code, an object A, on the first call to a certain method, constructed object B and cached it:

  B->new(
    base_path => ...
    schema    => $self->schema,
    retry     => ...,
  );

Then on subsequent calls, it reused B from the cache.

But the cache was shared among many instances of A, not all of which had the same ->schema member. So some of those instances of A would ask B a question and get the answer from the wrong database. A co-worker spent hours and hours in the middle of the night last month tracking this down. Again, the cache was not only broken but completely unnecessary. What was being saved? A single object construction, probably a few hundred bytes and a few hundred microseconds at most. And again, the code was perpetrated by a senior developer who should have known better. My co-worker replaced 13 lines of broken code with four that worked.

Brendan Gregg is unusually clever, and an exceptional case. Most programmers are not Brendan Gregg, and should take Jackson's advice and stop trying to be so clever all the time.

by Mark Dominus (mjd@plover.com) at July 04, 2018 08:42 AM

July 03, 2018

Douglas M. Auclair (geophf)

June 2018 1HaskellADay Problems and Solutions

by geophf (noreply@blogger.com) at July 03, 2018 02:24 PM

July 01, 2018

Michael Snoyman

Stop supporting older GHCs

Vincent Hanquez and I have been playing a proverbial game of hot potato over the past few months about writing this blog post. A cabal-install bug recently came to my attention that finally pushed me over the edge into writing this myself. I'll discuss that a bit later.

The basic claim here is simple:

  • Supporting older GHCs together with new ones adds a real maintenance cost to a library.
  • This is even higher if your goal is to make the code compile without warnings across all supported versions. You'll often end up resorting to CPP or weird import hacks to work around redundant imports.
  • There's nothing wrong with maintaining a code base using an old version of tools and libraries. However, you should make sure that you are pinning your dependencies in any such project, and therefore missing out on the latest updates to a library is not a critical issue, excluding security patches.

Usually, when I drop support for an older version of GHC in a library, I receive no complaints. Occasionally, if I do receive a concern about it, it's from someone trying to maintain their own CI matrix with older GHC support. I rarely, if ever, receive a complaint from someone trying to actually use an older version of GHC and the newest version of my code.

Instead of spinning wheels to maintain compatibility on the off chance that someone may desire bleeding-edge libraries with years-old compilers, I recommend cutting out support for older GHCs, updating your cabal files to reflect this decision, and keeping your CI build matrix curated appropriately. Only if a user has a specific request for a feature to work with an older GHC would I consider changing direction on this.

A generally accepted rule of thumb is three major GHC releases. At the time of writing, that would mean supporting GHC 8.0, 8.2, and 8.4. I recommend only supporting the latest minor version of each line, which would mean GHC 8.0.2, 8.2.2, and 8.4.3.

Updating cabal files

The most common method for specifying which GHC versions you support is to use the version of the base library that ships with GHC. You can use this handy lookup table I put together, which I made because I can never remember this information. Using that table, if you decided "I want to support GHC 8.0.2 and later", you'd look up the corresponding base version, find that it's 4.9.1.0, and add the following to your cabal file:

build-depends: base >= 4.9.1.0

(Or equivalent if using hpack/package.yaml.)
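For example, with hpack the same lower bound in package.yaml would look roughly like this (a sketch; merge it into your existing dependencies section):

dependencies:
- base >= 4.9.1.0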

Even though this is the standard approach today, there are a few problems with it:

  • The meaning isn't immediately clear. Most people in my experience think about GHC versions, not base versions. Someone reading that line will need to go look up what GHC version that corresponds to.
  • Newer users may not understand that base is a special, non-upgradeable package, and not realize that this is pinning the GHC version. (Many bug reports and support requests about both cabal-install and Stack back up this claim.)
  • At some point in the future, in theory, base may in fact be upgradeable, at which point this approach won't work.
  • base library versions don't always bump with new versions of GHC. For example, both GHC 8.4.2 and 8.4.3 ship with base-4.11.1.0. If you wanted to state that you only support GHC 8.4.3, you can't use the base approach.
  • It's quite possible to write some code which is compatible with an older base library version, but depends on newer features of GHC, like a new language extension.

Therefore, I recommend taking a two-pronged approach for dropping support for older GHC versions:

  • Add a lower bound on base as mentioned above. It's the standard approach, and some tooling may depend on it.
  • Add a stanza to your cabal file using cabal conditionals based on the GHC version, e.g:
if impl(ghc < 8.0.2)
  build-depends: unsupported-ghc-version > 1 && < 1

Originally, Vincent and I had discussed using buildable: False, but that's not an option, because...

Recently uncovered cabal-install bug

I received a bug report on Friday about my new release of yaml causing Travis build breakages for GHC 7.10.3. Later this was opened as a Hackage Trustee issue, where I found out that the same bug was being triggered by an earlier http-client release.

It turns out that with all released versions of cabal-install, there's a bug that works something like this: if a component is buildable: False, then all of its dependencies are ignored. However, the dependency solver will still select such a package as a library dependency, even though the library component is marked as non-buildable. Then, when trying to build this dependency, cabal-install will (rightfully) complain that it cannot build a package with no library or executables.

This bug has been fixed on cabal-install HEAD (and maybe the 2.2 maintenance branch? I'm not 100% sure). However, it's unreasonable to expect people to upgrade to bleeding-edge versions of tooling. So instead of the buildable: False approach, I've adapted the cabal files in question to use an impossible-to-satisfy constraint, which the dependency solver does better with:

if impl(ghc < 8.0.2)
  -- Disable building with GHC before 8.0.2.
  -- Due to a cabal bug, do not use buildable: False,
  -- but instead give it an impossible constraint.
  -- See: https://github.com/haskell-infra/hackage-trustees/issues/165
  build-depends: unsupported-ghc-version > 1 && < 1

The dependency solver realizes that it cannot find a version of the (non-existent) unsupported-ghc-version package which is both greater than and less than version 1, and so ignores this version of the package when using GHC before 8.0.2. Problem solved!

I strongly recommend adding a comment explaining this unusual formulation, as it's not clear to a reader why such a constraint is being added instead of the arguably clearer buildable: False.

July 01, 2018 01:00 PM

June 29, 2018

Philip Wadler

A new EU law could end the web as we know it


Laws concerning copyright filters (Article 13) and link tax (Article 11) have passed the committee stage and are about to be voted on by the entire European Parliament. While some results of EU legislation, such as GDPR, are largely to the good, these new laws are a disaster. They could end the web as we know it.

Below is what I wrote to my MEPs. If you are an EU citizen, you should write too; WriteToThem makes it easy.

Dear Nosheena Mobarik, Alyn Smith, Ian Hudghton, Catherine Stihler, David Martin, and David Coburn,

I write to you greatly disturbed by the new proposals for copyright. These would require any service provider to maintain large, expensive, and ineffective filters. While YouTube, Twitter, or Facebook will have no problems funding these, they will prevent the next innovative service from getting off the ground. Experience with such filters shows that they ban all sorts of material which should be in the public domain or which is fair use.

I am also concerned by the resurrection of the "link tax". Previous experience with adding a link tax in Germany and Spain showed that the newspapers that requested it soon stopped using it. It is particularly worrying that the legal formulation chosen will invalidate Creative Commons licenses. Many academic journals (including the one which I edit) depend crucially on Creative Commons, and passing a law that invalidates it is against the public interest.

An excellent one-page summary of the issues, with further links, can be found here:

  https://boingboing.net/2018/04/11/evidence-free-zone.html

The web has been a huge boon to innovation, creativity, cooperation, and scientific progress. Please don't kill the goose that lays the golden eggs.

Yours sincerely,



Philip Wadler
Professor of Theoretical Computer Science
University of Edinburgh

by Philip Wadler (noreply@blogger.com) at June 29, 2018 01:49 PM

June 28, 2018

FP Complete

Why blockchain and cryptocurrency audits?

FP Complete now does blockchain audit services. Why have we chosen to work in this field, and what are we aiming to accomplish?

by Aaron Contorer (aaron@fpcomplete.com) at June 28, 2018 09:01 PM

June 27, 2018

Well-Typed.Com

8-hours remote interactive course on
"Type-level programming with GHC"

We are going to introduce new courses and online versions of our courses. As a first step, next month, we’re offering an online course on type-level programming with GHC which is now open for registrations:

Type-level programming with GHC

23-24 July 2018, 0800-1200 UTC

  • Delivered by Andres Löh
  • Two half days of lectures, discussions and live coding
  • Delivered online, probably via Google Hangouts
  • Up to 6 participants (first come, first served)
  • GBP 360 including VAT (GBP 300 without VAT)
  • Register by email to info@well-typed.com

About the course

In this course, we are going to cover the following topics:

  • What is type-level programming?
  • When is it useful, and what costs are associated with it?
  • Generalised algebraic data types (GADTs)
  • Kinds and promotion
  • Higher-rank polymorphism
  • Singleton types
  • Retaining evidence, avoiding Booleans, learning by testing
  • Establishing boundaries between untrusted and trusted code
  • Type families
  • Proxies and injectivity
  • Type-level equality and proofs
  • Future perspectives

The course will be a mixture of lectures, discussions and live coding. The maximum course size is deliberately kept small so that it is still possible to ask and discuss individual questions.

There will be some exercises, but compared to an on-site course, we are going to keep this component relatively small, because it is not so easy to supervise the coding efforts of participants in this format.

About the teacher

Andres Löh has more than 20 years of Haskell experience, and more than 15 years of teaching experience. He has taught many courses on all things Haskell, including a type-level programming course at the Summer School on Generic and Effectful Programming. He helped establish the Utrecht Summer School on Applied Functional Programming and IOHK’s course on Haskell and Cryptocurrencies. He is a co-author or contributor to various packages that involve type-level programming, including generics-sop and servant. He has a PhD from Utrecht University.

Registration

If you are interested in registering, please send an email to info@well-typed.com including an invoice address. Payment is in advance via bank transfer. There is no minimum number of registrations. The course will take place even with a single registered participant. If we have to cancel the course for whatever reason, you will be entitled to a full refund or a replacement course at a different time.

If you are interested in the format, but not the topic or cannot make the time, feel free to contact us with requests for courses on other topics or at other times.

Other courses and events

We are of course still offering on-site courses. Our next public courses will be in London, in connection with the Haskell eXchange and in collaboration with Skills Matter:

We can also do courses on-site for your company, on the topics you are most interested in and individually tailored to your needs. Check out more detailed information or just contact us.

by andres at June 27, 2018 01:00 PM

June 26, 2018

Tom Schrijvers

PHD POSITION in Functional Programming and Programming Language Theory

I have a position available for a PhD student in the area of Functional Programming and Programming Language Theory.

My team conducts research on topics like:

  • type systems
  • mechanisation of programming language meta-theory
  • category theoretical foundations of programming languages
  • monads, continuations, algebraic effects and handlers, ...
  • recursion schemes and equational reasoning
  • DSLs

by Tom Schrijvers (noreply@blogger.com) at June 26, 2018 01:17 PM

Sandy Maguire

Coercions and Roles for Dummies

A discussion about roles broke out yesterday on fpchat which is a great place to hang out and talk about nerdy functional programming stuff if you don't already know about it. Go sign up! Anyway, people suggested I clean up my little soliloquy about roles and turn it into a blog post, so here we are.

You've heard of the type system, which makes sure your terms are sane. Maybe you're also aware of the kind system, whose job it is to make sure your types are reasonable! But did you know Haskell has an even more obscure system than these? It's called the role system, and its purpose in life is to prevent you from shooting yourself in the foot when dealing with coercions.

Coercions and roles have been around since 2014, but there's been surprisingly little discussion about them in the blogosphere. In short, if two types have the same representation at runtime, then it should be safe to coerce a value of one into a value of the other. The role system is used to describe under what circumstances such a coercion is legal.

To illustrate the point, let's talk about newtypes. Consider the following:

newtype AnInt = AnInt Int

The promise of a newtype in Haskell is that it is zero-overhead; at runtime, AnInt is exactly identical to Int. Newtypes are often used for adding type-safety; it's nice if you have a newtype Temperature = Temperature Int and a newtype Money = Money Int because the extra type wrappers ensure you can't accidentally add the weather to your bank account, even if at the end of the day they are both just integers.

AnInt and Int are not literally the same type, but they don't actually differ at runtime. This property is known as being representationally equal. If two types are representationally equal, we should be able to do the equivalent of C++'s reinterpret_cast and just pretend like a value of one is in fact a value of the other. Since these types correspond exactly at runtime, this is usually a safe thing to do.

If AnInt and Int are the same type at runtime, it means we should be able to coerce :: AnInt -> Int (and backwards) freely between the two types without any problems. Morally, this coerce function is just id, because we're not actually doing any work to the value.

Consider now the slightly more interesting type:

newtype Identity a = Identity a

Again, because Identity is a newtype, we should expect Identity a to be representationally equal to a. Since this is true, we expect that Identity AnInt is also representationally equal to Identity Int, via Identity AnInt --> AnInt --> Int --> Identity Int. And thus, we should be able to coerce :: Identity AnInt -> Identity Int. We can see that Identity a preserves the coercion relationship of its type parameter a, and this property is known as a having role representational.

More generally, if the type parameter a in F a has role representational, then F X is representationally equal to F Y whenever X is representationally equal to Y. This works whether F be a data or a newtype.
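As a quick sanity check, here is a minimal sketch of that chain in code, using the coerce function from Data.Coerce (discussed below) and restating the newtypes so the snippet stands alone:

import Data.Coerce (coerce)

newtype AnInt = AnInt Int
newtype Identity a = Identity a

-- Fine: AnInt is representationally equal to Int, and Identity's
-- (and []'s) type parameter has role representational.
unwrap :: Identity AnInt -> Int
unwrap = coerce

unwrapAll :: [Identity AnInt] -> [Int]
unwrapAll = coerce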

However, not all type parameters have role representational! Consider Data.Map.Map k v which has keys k and values v. Because Map is implemented as a balanced tree, it uses the Ord k instance to figure out where to store a kv-pair.

One of the reasons we write newtypes is to give a different typeclass instance than the underlying type has. For example, newtype ZipList a = ZipList [a] has a different Applicative instance than [] does. In general, we have no reason to expect that a newtype and its underlying type have instances that agree with one another.

Which leads us to a problem. Because a value of Map k v is a balanced tree which depends on the Ord k instance, we can't simply swap in SomeOtherK and expect everything to work hunky-dory. They have different Ord instances, and things would go screwy at runtime. All of this is to say that we do not want to be able to coerce :: Map AnInt v -> Map Int v because it's likely to crash at runtime.

However, it is still fine to coerce :: Map k AnInt -> Map k Int, because the values don't have this implicit dependency on any typeclass instances. There are no invariants to maintain on the v parameter, and so we are free to coerce to our hearts' content.

The role system is what describes this difference between the k and v type parameters of Data.Map.Map. While v is still role representational, k has role nominal.

nominal coercions of the form coerce :: a -> b are allowed iff you already have a proof that a ~ b, which is to say that a and b are literally the same type.

There's also a third role, phantom, which, you guessed it, is given to phantom type parameters (eg. the a in data Const x a = Const x.) Because phantom types are by-definition not referenced in the data definition of a type, we are always free to coerce a phantom type to any other type.

All of this cashes out in the form of Data.Coerce's coerce :: Coercible a b => a -> b. GHC will automatically provide instances of Coercible a b whenever a and b are representationally coercible. That means you get these instances (and all of their symmetries):

  • Given a ~ b, Coercible a b
  • If NT is a newtype over T, Coercible NT T
  • If p in F p has role phantom, Coercible (F a) (F b)
  • If r in F r has role representational, Coercible a b => Coercible (F a) (F b)
  • If n in F n has role nominal, (a ~ b) => Coercible (F a) (F b)

GHC is pretty clever, and has a role-inference mechanism. It works by knowing that (->) has two representational roles, that (~) has two nominal roles, and propagates from there. Every type parameter is assumed to have role phantom until it is used, whence it gets upgraded to the more restrictive role corresponding to the position it was used in. For example, if a is used in a ~ Bool, a gets role nominal since both of (~)'s parameters have nominal roles.

GADTs are syntactic sugar on top of (~) so expect GADTs to have nominal role type parameters. Furthermore, any parameters of a type family that are scrutinized will also have role nominal (the motivated reader will be able to find an interesting implementation of unsafeCoerce :: forall a b. a -> b if this were not the case.)

This inference mechanism will give you the most permissible roles that don't obviously destroy the type system, but sometimes it's necessary to explicitly give a role annotation, like in the Data.Map.Map example. Role annotations can be given by eg. type role Map nominal representational after turning on {-# LANGUAGE RoleAnnotations #-}. It's worth pointing out that you can only give less permissive roles than GHC has inferred; there's no fighting with it on this one.
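For example, a toy stand-in for Data.Map.Map (invented here just to show the syntax) might look like this:

{-# LANGUAGE RoleAnnotations #-}
module ToyMap (ToyMap) where

-- Lookups would rely on k's Ord instance, so forbid representational
-- coercions on the key while still allowing them on the values.
data ToyMap k v = ToyMap [(k, v)]

type role ToyMap nominal representational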


At the end of the day, why is any of this stuff useful? Besides being nice to know, custom role annotations can provide type-safety ala Data.Map.Map. But we can also get asymptotic performance gains out of coerce:

If f :: Coercible a b => a -> b (common if f is a newtype un/wrapper, or composition of such), then fmap f (which is O(n)) is equivalent to coerce (which is O(1)) in most cases1. In fact, mpickering has written a GHC source plugin that will tell you if you can apply this transformation. Cool!
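As a rough sketch of the kind of rewrite being described (the Age newtype is made up for illustration):

import Data.Coerce (coerce)

newtype Age = Age Int

-- Walks the whole list, building a new spine: O(n)
agesSlow :: [Int] -> [Age]
agesSlow = fmap Age

-- Reuses the original list unchanged at runtime: O(1)
agesFast :: [Int] -> [Age]
agesFast = coerce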


  1. Assuming the Functor's type parameter does not have role nominal.

June 26, 2018 10:07 AM

June 25, 2018

Monday Morning Haskell

Contributing to GHC 3: Hacking Syntax and Parsing

In last week's article, we made more progress in understanding GHC. We got our basic development cycle down and explored the general structure of the code base. We also made the simplest possible change by updating one of the error messages. This week, we'll make some more complex changes to the compiler, showing the ways you can tweak the language. It's unlikely you would make changes like these to fix existing issues. But it'll help us get a better grasp of what's going on.

As always, you can learn more about the basics of Haskell by checking out our other resources. Take a look at our Liftoff Series or download our Beginners Checklist!

Comments and Changing the Lexer

Let's get warmed up with a straightforward change. We'll add some new syntax to allow different kinds of comments. First we have to get a little familiar with the Lexer, defined in parser/Lexer.x. Let's try and define it so that we'll be able to use four apostrophes to signify a comment. Here's what this might look like in our code and the error message we'll get if we try to do this right now.

module Main where

'''' This is our main function
main :: IO ()
main = putStrLn "Hello World!"

…

Parser error on `''`
Character literals may not be empty
  |
5 | '''' This is our main function
  | ^^

Now, it's easy enough to add a new line describing what to do with this token. We can follow the example in the Lexer file. Here's where GHC defines a normal single line comment:

"-- " ~$docsym .* { lineCommentToken }
"--" [^$symbol \ ] . * { lineCommentToken }

It needs two cases because of Haddock comments. But we don't need to worry about that. We can specify our symbol on one line like so:

"''''" .* { lineCommentToken }

Now we can add the comment above into our code, and it compiles!

Adding a New Keyword

Let's now look at how we could add a new keyword to the language. We'll start with a simple substitution. Suppose we want to use the word iffy like we use if. Here's what a code snippet would look like, and what the compiler error we get is at first:

main :: IO ()
main = do
  i <- read <$> getLine
  iffy i `mod` 2 == 0
    then putStrLn "Hello"
    else putStrLn "World"

…

Main.hs:11:5: error: parse error on input 'then'
   |
11 |     then putStrLn "Hello"
   |     ^^^^

Let's do a quick search for where the keyword "if" already exists in the parser section. We'll find two spots. The first is a list of all the reserved words in the language. We can update this by adding our new keyword to the list. We'll look for the reservedIds set in basicTypes/Lexeme.hs, and we can add it:

reservedIds :: Set.Set String
reservedIds = Set.fromList [ …
  , "_", "iffy" ]

Now we also have to parse it so that it maps against a particular token. We can see a line in Lexer.x where this happens:

( "if", ITif, 0)

We can add another line right below it, matching it to the same ITif token:

( "iffy", ITif, 0)

The lexer will now match it against the same token once we start putting the language together. With that, our code compiles and produces the expected result!

lghc Main.hs
./prog.exe
5
World

Reversing If

Now let's add a little twist to this process. We'll add another "if" keyword and call it reverseif. This will change the ordering of the if-statement. So when the boolean is false, our code will execute the first branch instead of the second. We'll need to work a little further upstream. We want to re-use as much of the existing machinery as possible and just reverse our two expressions at the right moment. Let's use the same code as above, except with the reverse keyword. Then if we input 5 we should get Hello instead of World.

main :: IO ()
main = do
  i <- read <$> getLine
  reverseif i `mod` 2 == 0
    then putStrLn "Hello"
    else putStrLn "World"

So we'll have to start by adding a new constructor to our Token type, under the current if token in the lexer.

data Token =
  …
  | ITif
  | ITreverseif
  ...

Now we'll have to add a line to convert our keyword into this kind of token.

...
("if", ITif, 0),
("reverseif", ITreverseif, 0),
...

As before, we'll also add it to our list of keywords:

reservedIds :: Set.Set String
reservedIds = Set.fromList [ …
  , "_", "iffy", "reverseif" ]

Let's take a look now at the different places where we use the ITif constructor, so we can handle ITreverseif in the same way. We can find two more instances in Lexer.x. First, there's the function maybe_layout, which dictates whether a syntactic construct might need an open brace. Then there's the isALRopen function, which tells us whether we can start some other kind of indentation. In both of these, we'll follow the example of ITif:

maybe_layout :: Token -> P ()
…
  where
    f ITif = pushLexState layout_if
    f ITreverseif = pushLexState layout_if

...
isALRopen ITif = True
isALRopen ITreverseif = True
...

There's also a bit in Parser.y where we'll need to parse our new token:

%token
 …
 'if' { L _ ITif }
 'reverseif' { L _ ITreverseif }

Now we need to figure out how these tokens create syntactic constructs. This also seems to occur in Parser.y. We can look, for instance, at the section that constructs basic if statements:

| 'if' exp optSemi 'then' exp optSemi 'else' exp
    {% checkDoAndIfThenElse $2 (snd $3) $5 (snd $6) $8 >>
      Ams (sLL $1 $> $ mkHsIf $2 $5 $8)
        (mj AnnIf $1:mj AnnThen $4
          :mj AnnElse $7
          :(map (\l -> mj AnnSemi l) (fst $3))
         ++(map (\l -> mj AnnSemi l) (fst $6))) }

There's a lot going on here, and we're not going to try to understand it all right now! But there are only two things we'll need to change to make a new rule for reverseif. First, we'll obviously need to use that token instead of if on the first line.

Second, see that mkHsIf statement on the third line? This is where we make the actual Haskell "If" expression in our syntax tree. The $5 refers to the second instance of exp in the token list, and the $8 refers to the third and final expression. These are, respectively, the True and False branch expressions of our "If" statement. Thus, to reverse our "If", all we need to do is flip these arguments on the third line!

| 'reverseif' exp optSemi 'then' exp optSemi 'else' exp
    {% checkDoAndIfThenElse $2 (snd $3) $5 (snd $6) $8 >>
      Ams (sLL $1 $> $ mkHsIf $2 $8 $5)
        (mj AnnIf $1:mj AnnThen $4
          :mj AnnElse $7
          :(map (\l -> mj AnnSemi l) (fst $3))
         ++(map (\l -> mj AnnSemi l) (fst $6))) }

Finally, there's one more change we need to make. Adding this line will introduce a couple new shift/reduce conflicts into our grammar. There are already 233, so we're not going to worry too much about that right now. All we need to do is change the count on the assertion for the number of conflicts:

%expect 235 -- shift/reduce conflicts

Now when we compile and run our simple program, we'll indeed see that it works as expected!

lghc Main.hs
./prog.exe
5
Hello

Conclusion

So this week we saw some more complicated changes to GHC that have tangible effects. Next week, we'll wrap up our discussion of GHC by looking at the contribution process. We'll see the "simple" way with Github first. Then we'll also walk through the more complicated process using tools like Arc and Phabricator.

To learn more about Haskell, you should check out some of our basic materials! If you're a beginner to the language, read our Liftoff Series. It'll teach you how to use Haskell from the ground up. You can also take a look at our Haskell Web Series to see some more advanced and practical skills!

by James Bowen at June 25, 2018 02:30 PM

June 23, 2018

Brent Yorgey

New ICFP functional pearl on subtracting bijections

Kenny Foner and I have a paper accepted to ICFP on subtracting bijections. Here's the basic problem: suppose you have a bijection h between two sum types, h : A + B \leftrightarrow A' + B', and another bijection g : B \leftrightarrow B'. Of course A and A' must have the same size, but can you construct an actual bijection f between A and A'?
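
Here is a rough sketch of the classic "ping-pong" construction, assuming the bijections are handed to us together with the inverse of g (the paper itself works with a more careful representation):

-- Start from an element of A and push it through h; if it lands in A' we
-- are done, otherwise bounce it back through the inverse of g and try
-- again. For finite sets this terminates because h and g are bijections.
subtractBij
  :: (Either a b -> Either a' b')  -- h
  -> (b' -> b)                     -- inverse of g
  -> (a -> a')
subtractBij h gInv = go . Left
  where
    go x = case h x of
      Left  a' -> a'
      Right b' -> go (Right (gInv b'))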

This problem and its solution have been well-studied in the context of combinatorics; we were wondering what additional insight could be gained from a higher-level functional approach. You can get a preprint here.

by Brent at June 23, 2018 09:47 PM

Toby Goodwin

Letter to my MP

Dear Alan Brown MP,

When historians 500 years from now come to write a short summary of the early 21st Century, Scottish independence will not be mentioned. Those future historians, if they exist, will skip straight over Brexit. Trump? He may warrant a brief mention as an exemplar of the insanity that gripped humanity in those far-off days.

I write as your constituent in Kilmarnock and an SNP member.

My putative historians will primarily focus on the astonishing inventions and remarkable enterprises of the 2020s and 2030s that saved the world. I cannot of course tell you exactly what they will be, although some of the outlines are becoming clear. But this is scientific certainty: if there are historians in 2518, they will owe their careers and their very lives to the creation of large-scale BECCS systems by us and our children. (“Bio-energy with carbon capture and storage (BECCS) is a future greenhouse gas mitigation technology which produces negative carbon dioxide emissions” says Wikipedia — the vital word there being “future”.)

I understand that you will be voting in the Commons on Monday on the government's Airports National Policy Statement (NPS). I know SNP party policy supports the third runway at Heathrow, in the belief that it will benefit the Scottish economy. There are two compelling reasons why I urge you to vote against the government.

The first is the current constitutional crisis. With the Conservatives ripping up the Sewel convention, how can you give them succour? It will not be well received in Scotland if you troop into the lobbies with the same Conservative MPs who only last week were pouring scorn and derision on you, our country, and the Scottish Parliament.

Abstaining does not go far enough — I expect you've seen the recent jokes about the Labour Party. I understand that opposition parties do more than just oppose but I've been astonished recently that this weak minority government still seems to be able to do anything they damn well please.

The second reason you must oppose Heathrow expansion I alluded to at the start of this letter. Global warming poses an urgent existential threat to civilization itself. We are now in the extraordinary position that while science tells us what we need to do, our body politic has belatedly accepted the science, but is intrinsically unable to act on it.

As you know, the Paris Accord sets an aspiration of keeping global warming to 1.5℃ by the end of the century. To achieve that, without postulating BECCS or similar mitigation technologies which do not yet exist, we need to reach zero carbon by about 2026.

Zero carbon in less than a decade is politically ludicrous. Instead, governments (including the UK's) have drawn up plans and carbon budgets predicated on the assumption that BECCS can save us. By 2050, we are expecting BECCS to remove 2Gt of CO₂ per annum. Two billion tonnes of carbon dioxide, essentially sucked out of the air then stuffed in a hole in the ground. Quite apart from the technological and financial challenges, we are expecting our children — around 9 billion by 2050 — to devote good arable land with an area the size of India to growing biofuels.

The section on carbon emissions in the Airports NPS is woefully inadequate. Section 5.82 states that Heathrow can go ahead provided it does not blow the country's entire carbon budget. Contemplating any increase in carbon emissions is irresponsible, particularly in the aviation sector which has no credible plan for getting to zero carbon apart from bio-energy. To accept increases provided they stay within budget, yet with no plan or even suggestion of where the corresponding savings will come from is altogether reckless.

I urge you to vote against the Airports NPS.

Yours sincerely,

Toby Goodwin.

June 23, 2018 08:43 PM

June 22, 2018

Simon Marlow

Rethinking Static Reference Tables in GHC

It seems rare these days to be able to make an improvement that’s unambiguously better on every axis. Most changes involve a tradeoff of some kind. With a compiler, the tradeoff is often between performance and code size (e.g. specialising code to make it faster leaves us with more code), or between performance and complexity (e.g. adding a fancy new optimisation), or between compile-time performance and runtime performance.

Recently I was lucky enough to be able to finish a project I’ve been working on intermittently in GHC for several years, and the result was satisfyingly better on just about every axis.

  • Code size: overall binary sizes are reduced by ~5% for large programs, ~3% for smaller programs.

  • Runtime performance: no measurable change on benchmarks, although some really bad corner cases where the old code performed terribly should now be gone.

  • Complexity: some complex representations were removed from the runtime, making GC simpler, and the compiler itself also became simpler.

  • Compile-time performance: slightly improved (0.2%).

To explain what the change is, first we’ll need some background.

Garbage collecting CAFs

A Constant Applicative Form (CAF) is a top-level thunk. For example:

myMap :: HashMap Text Int
myMap = HashMap.fromList [
  -- lots of data
  ]

Now, myMap is represented in the compiled program by a static closure that looks like this:

When the program demands the value of myMap for the first time, the representation will change to this:

At this point, we have a reference from the original static closure, which is part of the compiled program, into the dynamic heap. The garbage collector needs to know about this reference, because it has to treat the value of myMap as live data, and ensure that this reference remains valid.

How could we do that? One way would be to just keep all the CAFs alive for ever. We could keep a list of them and use the list as a source of roots in the GC. That would work, but we’d never be able to garbage-collect any top-level data. Back in the distant past GHC used to work this way, but it interacted badly with the full-laziness optimisation which likes to float things out to the top level - we had to be really careful not to float things out as CAFs because the data would be retained for ever.

Or, we could track the liveness of CAFs properly, like we do for other data. But how can we find all the references to myMap? The problem with top-level closures is that their references appear in code, not just data. For example, somewhere else in our program we might have

myLookup :: String -> Maybe Int
myLookup name = HashMap.lookup name myMap

and in the compiled code for myLookup will be a reference to myMap.

To be able to know when we should keep myMap alive, the garbage collector has to traverse all the references from code as well as data.

Of course, actually searching through the code for symbols isn’t practical, so GHC produces an additional data structure for all the code it compiles, called the Static Reference Table (SRT). The SRT for myLookup will contain a reference to myMap.

The naive way to do this would be to just have a table of all the static references for each code block. But it turns out that there’s quite a lot of opportunities for sharing between SRTs - lots of code blocks refer to the same things - so it makes sense to try to use a more optimised representation.

The representation that GHC 8.4 and earlier used was this:

All the static references in a module were collected together into a single table (ThisModule_srt in the diagram), and every static closure selects the entries it needs with a combination of a pointer (srt) into the table and a bitmap (srt_bitmap).

This had a few problems:

  • On a 64-bit machine we need at least 96 bits for the SRT in every static closure and continuation that has at least one static reference: 64 bits to point to the table and a 32-bit bitmap.

  • Sometimes the heuristics in the compiler for generating the table worked really badly. I observed some cases with particularly large modules where we generated an SRT containing two entries that were thousands of entries apart in the table, which required a huge bitmap.

  • There was complex code in the RTS for traversing these bitmaps, and complex code in the compiler to generate this table that nobody really understood.

The shiny new way

The basic idea is quite straightforward: instead of the single table and bitmap representation, each code block that needs an SRT will have an associated SRT object, like this:

Firstly, this representation is a lot simpler, because an SRT object has exactly the same representation as a static constructor, so we need no new code in the GC to handle it. All the code to deal with bitmaps goes away.

However, just making this representation change by itself will cause a lot of code growth, because we lose many of the optimisations and sharing that we were able to do with the table and bitmap representation.

But the new representation has some great opportunities for optimisation of its own, and exploiting all these optimisations results in more compact code than before.

We never need a singleton SRT

If an SRT has one reference in it, we replace the pointer to the SRT with the pointer to the reference itself.

The SRT field for each code block can be 32 bits, not 96

Since we only need a pointer, not a pointer and a bitmap, the overhead goes down to 64 bits. Furthermore, by exploiting the fact that we can represent local pointers by 32-bit offsets (on x86_64), the overhead goes down to 32 bits.

We can common up identical SRTs

This is an obvious one: if multiple code blocks have the same set of static references, they can share a single SRT object.

We can drop duplicate references from an SRT

Sometimes an SRT refers to a closure that is also referred to by something that is reachable from the same SRT. For example:

In this case we can drop the reference to x in the outer SRT, because it’s already contained in the inner SRT. That leaves the outer SRT with a single reference, which means the SRT object itself can just disappear, by the singleton optimisation mentioned earlier.

For a function, we can combine the SRT with the static closure itself

A top-level function with an SRT would look like this:

We might as well just merge the two objects together, and put the SRT entries in the function closure, to give this:

Together, these optimisations were enough to reduce code size compared with the old table/bitmap representation.

Show me the code

Look out for (slightly) smaller binaries in GHC 8.6.1.

June 22, 2018 12:00 AM

June 20, 2018

Simon Marlow

Fixing 17 space leaks in GHCi, and keeping them fixed

In this post I want to tackle a couple of problems that have irritated me from time to time when working with Haskell.

  • GHC provides some powerful tools for debugging space leaks, but sometimes they’re not enough. The heap profiler shows you what’s in the heap, but it doesn’t provide detailed visibility into the chain of references that cause a particular data structure to be retained. Retainer profiling was supposed to help with this, but in practice it’s pretty hard to extract the signal you need - retainer profiling will show you one relationship at a time, but you want to see the whole chain of references.

  • Once you’ve fixed a space leak, how can you write a regression test for it? Sometimes you can make a test case that will use O(n) memory if it leaks instead of O(1), and then it’s straightforward. But what if your leak is only a constant factor?

We recently noticed an interesting space leak in GHCi. If we loaded a set of modules, and then loaded the same set of modules again, GHCi would need twice as much memory as just loading the modules once. That’s not supposed to happen - GHCi should release whatever data it was holding about the first set of modules when loading a new set. What’s more, after further investigation we found that this effect wasn’t repeated the third time we loaded the modules; only one extra set of modules was being retained.

Conventional methods for finding the space leak were not helpful in this case. GHCi is a complex beast, and just reproducing the problem proved difficult. So I decided to try a trick I’d thought about for a long time but never actually put into practice: using GHC’s weak pointers to detect data that should be dead, but isn’t.

Weak pointers can detect space leaks

The System.Mem.Weak library provides operations for creating “weak” pointers. A weak pointer is a reference to an object that doesn’t keep the object alive. If we have a weak pointer, we can attempt to dereference it, which will either succeed and return the value it points to, or it will fail in the event that the value has been garbage collected. So a weak pointer can detect when things are garbage collected, which is exactly what we want for detecting space leaks.

Here’s the idea:

  1. Call mkWeakPtr v Nothing where v is the value you’re interested in.
  2. Wait until you believe v should be garbage.
  3. Call System.Mem.performGC to force a full GC.
  4. Call System.Mem.Weak.deRefWeak on the weak pointer to see if v is alive or not.
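
A minimal sketch of this recipe might look like the following (checkStillAlive and its surroundings are illustrative only, not the actual GHC patch):

import Data.Maybe (isJust)
import System.Mem (performGC)
import System.Mem.Weak (mkWeakPtr, deRefWeak)

-- Returns True if the value is still reachable after the action that was
-- supposed to make it garbage. Some care is needed that nothing else in
-- scope (including this function's own closure) still retains the value.
checkStillAlive :: a -> IO () -> IO Bool
checkStillAlive value dropReferences = do
  weak <- mkWeakPtr value Nothing   -- step 1: weak pointer to the value
  dropReferences                    -- step 2: run the code after which it should be dead
  performGC                         -- step 3: force a full GC
  result <- deRefWeak weak          -- step 4: Just _ means it is still retained
  return (isJust result)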

Here’s how I implemented this for GHCi. One thing to note is that just because v was garbage-collected doesn’t mean that there aren’t still pieces of v being retained, so you might need to have several weak pointers to different components of v, like I did in the GHC patch. These really did detect multiple different space leaks.

This patch reliably detected leaks in trivial examples, including many of the tests in GHCi’s own test suite. That meant we had a way to reproduce the problem without having to use unpredictable measurement methods like memory usage or heap profiles. This made it much easier to iterate on finding the problems.

Back to the space leaks in GHCi

That still leaves us with the problem of how to actually diagnose the leak and find the cause. Here the techniques are going to get a bit more grungy: we’ll use gdb to poke around in the heap at runtime, along with some custom utilities in the GHC runtime to help us search through the heap.

To set things up for debugging, we need to

  1. Compile GHC with -g and -debug, to add debugging info to the binary and debugging functionality to the runtime, respectively.
  2. Load up GHCi in gdb (that's a bit fiddly and I won't go into the details here).
  3. Set things up to reproduce the test case.
*Main> :l
Ok, no modules loaded.
-fghci-leak-check: Linkable is still alive!
Prelude>

The -fghci-leak-check code just spat out a message when it detected a leak. We can Ctrl-C to break into gdb:

Program received signal SIGINT, Interrupt.
0x00007ffff17c05b3 in __select_nocancel ()
    at ../sysdeps/unix/syscall-template.S:84
84	../sysdeps/unix/syscall-template.S: No such file or directory.

Next I’m going to search the heap for instances of the LM constructor, which corresponds to the Linkable type that the leak detector found. There should be none of these alive, because the :l command tells GHCi to unload everything, so any LM constructors we find must be leaking:

(gdb) p findPtr(ghc_HscTypes_LM_con_info,1)
0x4201a073d8 = ghc:HscTypes.LM(0x4201a074b0, 0x4201a074c8, 0x4201a074e2)
-->
0x4200ec2000 = WEAK(key=0x4201a073d9 value=0x4201a073d9 finalizer=0x7ffff2a077d0)
0x4200ec2000 = WEAK(key=0x4201a073d9 value=0x4201a073d9 finalizer=0x7ffff2a077d0)
0x42017e2088 = ghc-prim:GHC.Types.:(0x4201a073d9, 0x7ffff2e9f679)
0x42017e2ae0 = ghc-prim:GHC.Types.:(0x4201a073d9, 0x7ffff2e9f679)
$1 = void

The findPtr function comes from the RTS; it's designed specifically for searching through the heap from inside gdb. I asked it to search for ghc_HscTypes_LM_con_info, which is the info pointer for the LM constructor - every instance of that constructor will have this pointer as its first word.

The findPtr function doesn’t just search for objects in the heap, it also attempts to find the object’s parent, and will continue tracing back through the chain of ancestors until it finds multiple parents.

In this case, it found a single LM constructor, which had four parents: two WEAK objects and two ghc-prim:GHC.Types.: objects, which are the list constructor (:). The WEAK objects we know about: those are the weak pointers used by the leak-checking code. So we need to trace the parents of the other objects, which we can do with another call to findPtr:

(gdb) p findPtr(0x42017e2088,1)
0x42016e9c08 = ghc:Linker.PersistentLinkerState(0x42017e2061, 0x7ffff3c2bc63, 0x42017e208a, 0x7ffff2e9f679, 0x42016e974a, 0x7ffff2e9f679)
-->
0x42016e9728 = THUNK(0x7ffff74790c0, 0x42016e9c41, 0x42016e9c09)
-->
0x42016e9080 = ghc:Linker.PersistentLinkerState(0x42016e9728, 0x7ffff3c2e7bb, 0x7ffff2e9f679, 0x7ffff2e9f679, 0x42016e974a, 0x7ffff2e9f679)
-->
0x4200dbe8a0 = THUNK(0x7ffff7479138, 0x42016e9081, 0x42016e90b9, 0x42016e90d1, 0x42016e90e9)
-->
0x42016e0b00 = MVAR(head=END_TSO_QUEUE, tail=END_TSO_QUEUE, value=0x4200dbe8a0)
-->
0x42016e0828 = base:GHC.MVar.MVar(0x42016e0b00)
-->
0x42016e0500 = MUT_VAR_CLEAN(var=0x42016e0829)
-->
0x4200ec6b80 = base:GHC.STRef.STRef(0x42016e0500)
-->
$2 = void

This time we traced through several objects, until we came to an STRef, and findPtr found no further parents. Perhaps the next parent is a CAF (a top-level thunk) which findPtr won’t find because it only searches the heap. Anyway, in the chain we have two PersistentLinkerState objects, and some THUNKs - it looks like perhaps we’re holding onto an old version of the PersistentLinkerState, which contains the leaking Linkable object.

Let’s pick one THUNK and take a closer look.

(gdb) p4 0x42016e9728
0x42016e9740:	0x42016e9c09
0x42016e9738:	0x42016e9c41
0x42016e9730:	0x0
0x42016e9728:	0x7ffff74790c0 <sorW_info>

The p4 command is just a macro for dumping memory (you can get these macros from here).

The header of the object is 0x7ffff74790c0 <sorW_info>, which is just a compiler-generated symbol. How can we find out what code this object corresponds to? Fortunately, GHC’s new -g option generates DWARF debugging information which gdb can understand, and because we compiled GHC itself with -g we can get gdb to tell us what code this address corresponds to:

(gdb) list *0x7ffff74790c0
0x7ffff74790c0 is in sorW_info (compiler/ghci/Linker.hs:1129).
1124
1125	      itbl_env'     = filterNameEnv keep_name (itbl_env pls)
1126	      closure_env'  = filterNameEnv keep_name (closure_env pls)
1127	
1128	      new_pls = pls { itbl_env = itbl_env',
1129	                      closure_env = closure_env',
1130	                      bcos_loaded = remaining_bcos_loaded,
1131	                      objs_loaded = remaining_objs_loaded }
1132	
1133	  return new_pls

In this case it told us that the object corresponds to line 1129 of compiler/ghci/Linker.hs. This is all part of the function unload_wkr, which is part of the code for unloading compiled code in GHCi. It looks like we’re on the right track.

Now, -g isn’t perfect - the line it pointed to isn’t actually a thunk. But it’s close: the line it points to refers to closure_env' which is defined on line 1126, and it is indeed a thunk. Moreover, we can see that it has a reference to pls, which is the original PersistentLinkerState before the unloading operation.

To avoid this leak, we could pattern-match on pls eagerly rather than doing the lazy record selection (closure_env pls) in the definition of closure_env'. That’s exactly what I did to fix this particular leak, as you can see in the patch that fixes it.
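
The general shape of the problem and the fix looks something like this (illustrative types only, not the actual Linker code):

data LinkerState = LinkerState
  { bigEnv   :: [String]  -- stands in for the large environments
  , smallBit :: Int
  }

-- Leaky: the unevaluated thunk (smallBit old) keeps the whole 'old'
-- record alive, including bigEnv, until something forces it.
unloadLazy :: LinkerState -> LinkerState
unloadLazy old = LinkerState { bigEnv = [], smallBit = smallBit old }

-- Fixed: pattern-match eagerly, so the result only holds on to the one
-- field it actually needs.
unloadStrict :: LinkerState -> LinkerState
unloadStrict (LinkerState _big small) =
  LinkerState { bigEnv = [], smallBit = small }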

Fixing one leak isn’t necessarily enough: the data structure might be retained in multiple different ways, and it won’t be garbage collected until all the references are squashed. In total I found

  • 7 leaks in GHCi that were collectively responsible for the original leak, and
  • A further 10 leaks that only appeared when GHC was compiled without optimisation. (It seems that GHC's optimiser is pretty good at fixing space leaks by itself.)

You might ask how anyone could have found these without undergoing this complicated debugging process. And whether there are more lurking that we haven’t found yet. These are really good questions, and I don’t have a good answer for either. But at least we’re in a better place now:

  • The leaks are fixed, and we have a regression test to prevent them being reintroduced.
  • If you happen to write a patch that introduces a leak, you’ll know what the patch is, so you have a head start in debugging it.

Could we do better?

Obviously this is all a bit painful and we could definitely build better tools to make this process easier. Perhaps something based on heap-view which was recently added to GHC? I’d love to see someone tackle this.

June 20, 2018 12:00 AM

June 19, 2018

FP Complete

Sed: a debugging story

This blog post is a semi-complete retelling of my debugging adventures on a particularly crazy bug. Consider it a combination of fun story telling for the audience and catharsis for myself. This affects anyone trying to use AppVeyor and Stack for projects that use the network package with GHC 8.4 and up. If you want to cheat and get the solution, check out the pull request. Otherwise, strap yourselves in, it's gonna be a bumpy ride.

by Michael Snoyman (michael@fpcomplete.com) at June 19, 2018 07:12 PM

Magnus Therning

A simple zaw widget for jumping to git projects

A colleague at work showed me a script he put together to quickly jump to the numerous git projects we work with. It’s based on dmenu and looks rather nice. However, I’d rather have something based on zsh, but when looking around I didn’t find anything that really fit. So, instead I ended up writing a simple widget for zaw.

I then bound it like this

June 19, 2018 12:00 AM

June 18, 2018

Monday Morning Haskell

Contributing to GHC 2: Basic Hacking and Organization

Last week, we took our first step into the world of GHC, the Glasgow Haskell Compiler. We summarized the packages and tools we needed to install to get it building. We did this even in the rather hostile environment of a Windows laptop. But, at the end of the day, we can now build the project with make and create our local version of GHC.

This week, we’ll establish our development cycle by looking at a very simple change we can make to the compiler. We’ll also discuss the architecture of the repository so we’ll can make some cooler changes next week.

GHC is truly a testament to some of the awesome benefits of open source software. Haskell would not be the same language without it. But to understand GHC, you first have to have a decent grasp of Haskell itself! If you’ve never written a line of Haskell before, take a look at our Liftoff Series for some tips on how to get going. You can also download our Beginners Checklist.

You may have also heard that while Haskell is a neat language, it’s useless from an industry perspective. But if you take a look at our Production Checklist, you’ll find tons of tools to write more interesting Haskell programs!

Getting Started

Let’s start off by writing a very simple program in Main.hs.

module Main where

main :: IO ()
main = do
  putStrLn "Using GHC!"

We can compile this program into an executable using the ghc command. We start by running:

ghc -o prog Main.hs

This creates our executable prog.exe (or just prog if you’re not using Windows). Then we can run it like we can run any kind of program:

./prog.exe
Using GHC!

However, this is using the system-level GHC we had to install in order to build GHC locally!

which ghc
/mingw/bin/ghc

When we build GHC, it creates executables for each stage of the compilation process. It produces these in a directory called ghc/inplace/bin. So we can create an alias that will simplify things for us. We’ll write lghc to be a "local GHC" command:

alias lghc="~/ghc/inplace/bin/ghc-stage2.exe -o prog"

This will enable us to compile our single module program with lghc Main.hs.

Hacking Level 0

Ultimately, we want to be able to verify our changes. So we should be able to modify the compiler, build it again, use it on our program, and then see our changes reflected in the code. One simple way to test the compiler’s behavior is to change the error messages. For example, we could try to import a module that doesn’t exist:

module Main where

import OtherModule (otherModuleString)

main :: IO ()
main = do
  putStrLn otherModuleString

Of course, we’ll get an error message:

[1 of 1] Compiling Main (Main.hs, Main.o)

Main.hs:3:1: error:
    Could not find module 'OtherModule'
    Use -v to see a list of the files searched for.
   |
3  |import OtherModule (otherModuleString)
   |^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let’s try now changing the text of this error message. We can do a quick search for this message in the compiler section of the codebase and find where it’s defined:

cd ~/ghc/compiler
grep -r "Could not find module" .
./main/Finder.hs:cannotFindModule = cantFindErr (sLit "Could not find module")

Let’s go ahead and update that string to something a little different:

cannotFindModule :: DynFlags -> ModuleName -> FindResult -> SDoc
cannotFindModule = cantFindErr
  (sLit "We were unable to locate the module")
  (sLit "Ambiguous module name")

Now let’s go ahead and rebuild, except let’s use some of the techniques from last week to make the process go a bit faster. First, we’ll copy mk/build.mk.sample to mk/build.mk. We’ll uncomment the following line, as per the recommendation from the setup guide:

BuildFlavour=devel2

We’ll also uncomment the line that says stage=2. This will restrict the compiler to only building the final stage of the compiler. It will skip past stage 0 and stage 1, which we’ve already build.

We’ll also build from the compiler directory instead of the root ghc directory. Note though that since we’ve changed our build file, we’ll have to boot and configure once again. But after we’ve re-compiled, we’ll now find that we have our new error message!

[1 of 1] Compiling Main (Main.hs, Main.o)

Main.hs:3:1: error:
    We were unable to locate the module 'OtherModule'
    Use -v to see a list of the files searched for.
   |
3  |import OtherModule (otherModuleString)
   |^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

General Architecture

Next week, we’ll look into making a more sophisticated change to the compiler. But at least now we’ve validated that we can develop properly. We can make a change, compile in a short amount of time, and then determine that the change has made a difference. But now let’s consider the organization of the GHC repository. This will help us think some more about the types of changes we’ll make. I’ll be drawing on this description written by Simon Peyton Jones and Simon Marlow.

There are three main parts to the GHC codebase. The first of these is the compiler itself. The job of the compiler is to take our Haskell source code and convert it into machine executable code. Here is a very non-exhaustive list of some of the compiler's tasks:

  1. Determining the location of referenced modules
  2. Reading a single source file
  3. Breaking that source into its simplest syntactic representation

Then there is the boot section. This section deals with the libraries that the compiler itself depends on. They include things such as low-level types like Int, as well as core libraries like Data.Map. This section is somewhat more stable, so we won't look at it too much.

The last major section is the Runtime System (RTS). This takes the code generated by the compiler above and determines how to run it. A lot of magic happens in this part that makes Haskell particularly strong at tasks like concurrency and parallelism. It’s also where we handle mechanics like garbage collection.

We’ll try to spend most of our time in the compiler section. The compilation pipeline has many stages, like type checking and de-sugaring. This will let us zero in on a particular stage and make a small change. Also the Runtime System is mostly C code, while much of the compiler is in Haskell itself!

Conclusion

Next week we’ll take a look at a couple more ways to modify the compiler. After that, we’ll start looking at taking real issues from GHC and see what we can do to try and fix them! We’ll eventually take a peak at the submission process both with Github and Phabricator.

If you want to start out your Haskell journey, you should read our Liftoff Series! It will help you learn the basics of this awesome language. For more updates, you can also subscribe to our monthly newsletter!

by James Bowen at June 18, 2018 02:30 PM

Functional Jobs

Senior Software Engineer at Habito (Full-time)

Launched in 2016, Habito is transforming the mortgage industry through innovation and cutting-edge technology. We are the UK's free online mortgage broker backed by Atomico, Ribbit Capital and Mosaic Ventures, with angel investors including Transferwise CEO Taavet Hinrikus, Funding Circle’s founder Samir Desai and Yuri Milner, influential tech investor.

We are building a great brand synonymous with great customer service, ease and transparency. Simple, fast and honest, coupled with no fees, and no misinformation.

Our team is super-smart, ambitious, collaborative and friendly. We work hard, play hard and learn fast. This is a great opportunity to help steer, shape and mould our business.

Are you excited? Come and make your home at Habito!

The Role

We are looking for a full-time senior software engineer to join our team of full-stack engineers. Our core development values are strong, static typing and correctness, rigorous automation (in both testing and infrastructure) and everyone having a say.

The stack: On the back end we have an HTTP API written in Haskell (and built using stack) with PostgreSQL serving as the persistence layer. All of our data is event sourced thanks to PostgreSQL's great support for relational and blob data (e.g. jsonb) and some in-house libraries for handling ES/CQRS in a type-safe (or what we believe to be, as far as possible) manner. We lean heavily on GHC and its extensions and are also using some of the bigger power-to-weight-ratio players like lens and conduit.

The front end is written in PureScript on top of React and Redux (the bindings to which are our own but are both quite thin and based on the design of purescript-react and later pux).

Infrastructure-wise we are sitting atop AWS and using Docker to manage containment. Jenkins manages continuous integration and deployment using a combination of AWS' Java SDK and some management code written in Scala.

While we're a technology company first and foremost, initially we're trying to make the UK mortgage brokering market a better place. This means building and maintaining our website, but it also involves a slurry of behind-the-scenes work integrating lenders, insurers, conveyancers and all the other players in the space as we seek to introduce robust automation into what is currently so many manual processes. Slack bots, job queues, working around the fact that some actors don't have APIs -- you'll get to handle it all!

  • Work on our back-end software, written in Haskell and our front-end web applications, written in PureScript, HTML and CSS.
  • Help maintain and extend our infrastructure, hosted in AWS using technologies such as Docker and Kubernetes.
  • Use tools such as QuickCheck, Tasty and HUnit to produce reliable, well-tested software.
  • Work with our product and design teams to produce customer-centric experiences that excite and delight.

The Candidate

  • A passionate developer who cares about clean code and code quality; a full-stack polyglot
  • An experienced engineer familiar with automated testing and deployment
  • Experienced with functional programming and domain-driven design or simply interested and capable of learning on the job
  • Someone who cares about UX and UI; you have a fundamental aversion to ugliness in product

And Finally......

  • Competitive salary & share options
  • Career development, coaching & training
  • Free catered lunches, snacks & team bonding
  • Bupa Healthcare & Life Assurance
  • Contributory pension scheme
  • Bi-weekly massages with Urban Massage
  • Unlimited Holiday
  • Cycle to Work Scheme
  • Childcare Vouchers

Get information on how to apply for this position.

June 18, 2018 12:34 PM

June 17, 2018

Toby Goodwin

On an Independent Scotland

Contrail saltire

This has been a momentous week in Scottish politics. There is now a full-blown constitutional crisis between the Scottish government at Holyrood and the UK government at Westminster. It would be premature to say that Scottish independence is now inevitable, but the momentum towards independence is greater than ever before. In this article I will describe the situation in terms that I hope will be comprehensible to outsiders, and also give my own views as a relative newcomer to Scotland.

Contrail saltire © Karl and Ali - see here for original and licence

Personal background

It is nearly 4 years ago that I moved with my family to Scotland from England, where I was born and bred. Going back just 3 generations though, I have ancestors from every part of the British Isles. Our primary reason for the move was to be closer to my wife’s family. The timing was chosen to cause the least possible disruption to the younger family members, coinciding with my stepson’s transition to secondary school and stepdaughter’s starting at university.

It was entirely coincidental that we arrived here just in time to join the electoral register and vote in the 2014 Independence Referendum. Scottish independence was not a subject I’d felt particularly strongly about. So that summer I listened to the arguments, talked it over with my wife — a habitual Labour voter but indy supporter — watched some of the debates on TV. The adjective invariably attached to “referendum” by Unionists is “divisive”, but that was not my experience. There seemed to be genuine interest in the arguments. Worries, yes, particularly about the economy, pensions, the currency question, the border. And the EU: would the new Scotland be allowed to continue as a member, or rejoin immediately, or would it have to wait at the “back of the queue”?

Well, in 2014, independence, the "Yes" answer on the referendum, lost, albeit by a narrower margin than pundits had expected. But the Yes movement did not wither away. Having won over so many hearts, having planted the idea of independence in so many minds... surely time and care would allow that seed to grow, flourish, and eventually prevail? Nevertheless, independence was firmly on the back burner, till the 23rd June 2016. That was the day that the UK voted by nearly 52% to 48% to leave the European Union (EU). Scotland, however, backed Remain by 62% to 38% (as did Northern Ireland, by nearly 56% to 44%). It was soon after the EU referendum that I joined the Scottish National Party (SNP), which supports independence and has formed the Holyrood government since 2007.

My own political views seem much more at home in Scotland than England. All my life I’ve voted for left or centre parties... and more often than not ended up with Conservative governments at Westminster. Scotland has the same problem, in spades. We both remember Margaret Thatcher with loathing. The hated poll tax, which was her eventual undoing, was introduced to Scotland a year before the rest of the UK. Having suffered as Thatcher’s guinea pigs, Scotland only had to endure another 5 years of Conservative misrule under John Major till at last in 1997 we got Tony Blair’s Labour government.

Labour, of course, made some bad mistakes, most obviously the Iraq war. But the early years were good, particularly to Scotland. Soon after coming to power, Labour passed the Scotland Act 1998 which established the devolved parliament for Scotland at Holyrood in Edinburgh.

The EU Withdrawal Bill

The row this week concerns the EU withdrawal bill. Among other issues, this deals with powers which are currently exercised by the EU (or “Brussels”). When the UK exits the EU, do these powers pass to Westminster or to Holyrood? The Scotland Act, which is the constitutional basis of Scottish devolution, is quite clear: powers not explicitly reserved to Westminster devolve to Holyrood. But the Conservative government’s EU withdrawal bill wants to reserve some of the powers returning from the EU to Westminster for 7 years.

Any bill, such as the EU Withdrawal bill, which impinges on devolved matters is debated in the Scottish parliament, which votes to give consent to, or withhold it from, the bill. This is known as the “Sewel convention”. The EU Withdrawal bill failed to gain consent at Holyrood. Since the Scottish parliament has only been in existence 20 years, such a Legislative Consent Motion has only fallen once before — in that case the matter was fairly easily resolved. So there is little precedent on how to proceed in such a case.

It’s important to note that it was not just the SNP (the main party of independence) that opposed giving consent to the bill at Holyrood. It was also opposed by Labour and the Liberal Democrats (both Unionist parties) and the Greens (who also favour independence). Only the Conservatives supported it.

The Conservative government of Westminster — no friends to devolution — have decided that they can simply ignore the lack of consent, effectively trashing the Sewel convention. When the EU Withdrawal bill came to the House of Commons this week there were supposed to be three hours to discuss the part of the bill that deals with the smaller nations of the UK. So not just the Scottish question, but also Welsh and Northern Irish devolution, and the matter of the Northern Ireland border with Ireland. (The NI border is undoubtedly the most controversial part of the Brexit negotiations so far.) So that's a lot to get through in 3 hours. Yet due to frankly archaic parliamentary procedures, this time was reduced to 19 minutes. During the "debate", only a single Conservative minister was given time to speak.

This travesty of parliamentary democracy was scarcely reported in the Scottish media, and even less in the national UK media. But the SNP were not going to let it go by the board. The next day, they attempted to call an emergency debate during Prime Minister’s Questions. Although this was not directly successful, the ensuing row did make national headlines. As a result, the Secretary of State for Scotland, David Mundell, was required to make a statement to the House. And the SNP have been granted their emergency debate tomorrow. And, last week alone, the SNP gained over 7000 new members. This is the tide which the Conservatives must now compromise with, or attempt to resist.

Details of the Bill

What are the powers that Westminster wants to reserve? Among them are agriculture, fisheries, and food labelling. These are incredibly important to Scotland: whisky and salmon are our biggest exports. Scotland’s farmers, fishers, and food producers should have their interests represented by the Scottish government.

Why does the Conservative government want to reserve these powers? They claim that they need them to protect the internal UK market, which is “so vital to Scotland’s businesses”. This is patronising, as if the Scottish government were unaware of the importance of the UK market, or was unable to act in the best interests of the people they represent. What they are not saying out loud is that they need these powers so they can negotiate trade deals for the UK without needing input from Scotland. There is particular concern about a possible trade deal with the US. Securing a US trade deal is vital to prevent Brexit becoming an unmitigated disaster, but it has been widely reported that it is likely to include acquiescing to US demands that we accept chlorinated chicken, GM foods, and whiskey which has been matured for less than three years.

It’s hard to imagine that any part of the UK will be pleased by such proposals. Why should Scotland give up control of areas where it is clear that the UK government will act only in the interests of the biggest businesses? We have been repeatedly told during and after the Brexit referendum that negotiating trade deals after we left the EU would be easy. It seems those making such claims do not even understand how the UK is constituted!

Conservative justification

The Conservative spin is that the Scottish government is being obstructionist but the claims do not stand up to scrutiny. (The following are paraphrases, not direct quotes.)

  • “We have compromised: originally we were planning to take 153 powers, but now it’s only 24.”
  • “Scotland has never had these powers, they were ceded to the EU in 1973.”
  • “SNP wants an independent Scotland to remain in the EU and these powers to remain with Brussels so they clearly don’t want or deserve the powers.”

These first three claims each receive the same simple response: so what? Whatever the number and history of the powers and the debates surrounding them, they are powers which will be newly exercised in the UK and the Scotland Act says where they should reside.

  • “The Welsh assembly gave consent to the bill: you should too.”

More fools the Welsh then! In any case, the Welsh and Scottish devolution settlements are very different. And the Welsh Assembly is dominated by the Labour party, whose own thinking on Brexit is extremely muddled. As far as I am aware, Plaid Cymru (the Welsh nationalist party) did not consent to the bill.

  • “The Sewel convention only applies ‘normally’: these are not normal times.”

They certainly are not normal times! The UK is in crisis, and may only be months away from complete economic meltdown. The government is beset on all sides with intractable dilemmas. But this situation has been brought about solely by the actions of the Conservative Party.

Theresa May did not have to trigger article 50 before she had the faintest idea where she planned to take the country or how to get it there. She did not need to call a general election last year, throwing away a workable majority. David Cameron did not have to call the damned referendum. (Well, probably he did as it was a manifesto commitment. But he didn’t have to put it in the manifesto in the first place. That was a cynical gesture to win UKIP votes, made in the belief that were he still Prime Minister after 2015 it would be in another Lib Dem coalition and the Lib Dems would never permit the referendum to go ahead.) Cameron and May both were extremely foolish to stoke fears of a country overrun by immigrants, welfare scroungers and health tourists, and to make promises to reduce immigration numbers which they knew they could not keep. They did so to deflect attention from the savage cuts to services enacted in the name of austerity by George Osborne. (Osborne was Cameron's chancellor, who resigned alongside his boss after the EU referendum.)

Yes the Conservatives are in a hole, and they can ask the Scottish government for a favour. But it is hardly surprising if they are given a shovel rather than a rope. Imagine if it were the other way round, and the SNP were asking the Conservatives for help!

  • “Scotland is only one part of the UK: it cannot be allowed to derail the whole Brexit process.”

That may be true, but part of the point of a constitution is to prevent democracy from becoming a tyranny of the majority. It is right and proper for the Scottish government to do the best for the people of Scotland, though they be only 10% of the population of the UK. And it is simply false to think that Holyrood is trying to stop Brexit. It has engaged as far as possible with Brexit, having published several papers on the matter, passed its own EU Continuity Bill, and engaged in dialogue with the other nations of the UK. It is fair to say that Holyrood wishes to stop a hard Brexit, but that puts them in the good company of a majority of Westminster MPs and — as far as anyone can judge — a majority of the UK population.

Conclusion

Tomorrow there will be an emergency debate at Westminster on the Sewel convention. My guess is that the Conservatives will come out fighting. This will only serve to make starker the choice that Scotland now faces: independence must be embraced while we have the chance before the Conservatives can succeed in completely undermining devolution.


Toby Goodwin
Kilmarnock, Scotland
2018-06-17

Contact me by email or on Twitter.

Notes

These are not formal references, more like a dump of the tabs I had open in my browser at the end of writing this piece! The first is particularly interesting — it is a BBC “Reality Check” article, so it is attempting to establish the unbiased facts of the matter. Here's a quote.

So Ms Sturgeon [leader of the SNP and First Minister of the Holyrood government] is correct to say that pushing through the EU Withdrawal Bill without the consent of the Scottish Parliament means that the long-standing convention has, in effect, been "ripped up".

June 17, 2018 07:45 PM

Stackage Blog

Upcoming stackage LTS 12 snapshot with ghc-8.4.3

After a recent discussion on the Haskell-cafe mailing list, the stackage curator team has decided to start the preparations for an LTS major bump which will include ghc-8.4.3.

We’d like to give everyone enough time to prepare, which is why the planned release date is two weeks from the date of this post.

Getting your package into the nightly snapshot will ensure its inclusion in LTS 12 as well. To add your package, please submit a PR to commercialhaskell/stackage. To add your package to LTS 12 after it’s released, please open a new issue on commercialhaskell/lts-haskell.

For those interested, here’s an overview of the LTS major bump process we’ll be adhering to from now on.

June 17, 2018 12:01 AM

June 15, 2018

Michael Snoyman

Deprecating the Haskell markdown library

Back in 2012, I was working on a closed-source project which required a Markdown parsing library. At the time, the only option in the Haskell ecosystem I was aware of was Pandoc, which wasn't an option for the project in question due to licensing (GPL). As a result, I ended up creating my own library called markdown (inventive name, I know). It also had the benefit of significantly fewer dependencies than Pandoc, and therefore faster build times. Today, I'm considering deprecating this library, for a number of reasons:

  • The license concern no longer affects me. But more to the point: there are other Haskell markdown libraries with liberal licenses and low dependency footprints. Some examples include:

  • There are too many different flavors of markdown floating around today, and I'm contributing to the problem by having yet another flavor.

  • I'm not particularly happy with the flavor I created. I'm not in any way an expert at parsing, and did not have a deep understanding of the Markdown spec (or lack thereof) when I wrote this library. I'm personally more comfortable using, for example, Github flavored Markdown.

I'm not worried about maintenance burden in this case; this library hasn't caused me much trouble. The deprecation would primarily be to get a flavor of Markdown off the market, and encourage people to use something more standard. I'm also open to alternative solutions here, like using the markdown package namespace for a better implementation.

My biggest concern with a deprecation or changing the library significantly is that it may break existing projects that are relying on this specific flavor of markdown. But for those cases, the current version of the code will always be available. I may end up using it myself for some sites where I don't want to go back and fix content.

The purpose of this blog post, then, is to find out if there are people out there who have a strong reason for me to not deprecate this library, or would like to see something specific happen with the package name on Hackage/Stackage. If so, please let me know.

And as a second, lesser goal, I'm interested in hearing about the current state of Markdown libraries in Haskell. Have I missed some very popular ones? Is there any kind of general consensus on which library to use? I've been favoring sundown recently, since I've been working on projects where I want identical HTML output between Github or Gitlab's web interface and the code I'm writing. But Common Mark is a much more exciting project on the horizon.

June 15, 2018 06:32 AM

June 11, 2018

Monday Morning Haskell

Contributing to GHC 1: Preparation

In the last few weeks, we've looked at a few different open source Haskell projects like HNix and Codeworld. This week, we'll start looking at perhaps the biggest and most important open source element of the Haskell ecosystem. This is GHC, the Glasgow Haskell Compiler. Without GHC and the hard work that goes into it from many volunteers, Haskell would not be the language it is today. So in the next few weeks we'll be exploring the process of building and (hopefully) contributing to GHC.

I’m currently operating on a Windows laptop, which brings many frustrations. Getting GHC to build on Windows is a non-trivial task with many potential hurdles. On the bright side, I view this as an opportunity to show that one can contribute even in the most adverse circumstances. So most of this article will focus on the trials of using Windows. There is a section further down that goes over the most important parts of building for Mac and Linux. I’ll be following this guide by Simon Peyton Jones, sharing my own complications.

Now, you need to walk before you can run. If you’ve never used Haskell before, you have to try it out first to understand GHC! Download our Beginner’s Checklist to get started! You can also read our Liftoff Series to learn more about the language basics.

MSys

The main complication with Windows is that the build tools for GHC are made for Unix-like environments. These tools include programs like autoconf and make. And they don’t work in the normal Windows terminal environment. This means we need some way of emulating a Unix terminal environment in Windows. There are a couple different options for this. One is Cygwin, but the more supported option for GHC is MSYS 2. So my first step was to install this program. This environment uses the “Minimalist GNU for Windows” libraries, abbreviated as “MinGW”.

Installing this worked fine the first time. However, there did come a couple points where I decided to nuke everything and start from scratch. Re-installing did bring about one problem I’ll share. In a couple circumstances where I decided to start over, I would run the installer, only to find an error stating bash.exe: permission denied. This occurred because the old version of this program was still running on a process. You can delete the process or else just restart your machine to get around this.

Once MSys is working, you’ll want to set up your terminal to use MinGW programs by default. To do this, you’ll want to set the path to put the mingw directory first:

echo "export PATH=/mingw<bitness>/bin:\$PATH" >> ~/.bash_profile

Use either 32 or 64 for <bitness> depending on your system. Also don’t forget the quotation marks around the command itself!

Package Preparation

Our next step will be to get all the necessary packages for GHC. MSys 2 uses a package manager called pacman, which operates kind of like apt-get. First you’ll need to update your package repository with this command:

pacman -Syuu

As per the instructions in SPJ’s description, you may need to run this a couple times if a connection times out. This happened to me once. Now that pacman is working, you’ll need to install a host of programs and libraries that will assist in building GHC:

pacman -S --needed git tar bsdtar binutils autoconf make xz \
    curl libtool automake python python2 p7zip patch ca-certificates \
    mingw-w64-$(uname -m)-gcc mingw-w64-$(uname -m)-python3-sphinx \
    mingw-w64-$(uname -m)-tools-git

This command typically worked fine for me. The final items we’ll need are alex and happy. These are Haskell programs for lexing and parsing. We’ll want to install Cabal to do this. First let’s set a couple variables for our system:

arch=x86_64 # could also be i386
bitness=64  # could also be 32

Now we’ll get a pre-built GHC binary that we’ll use to bootstrap our own build later:

curl -L https://downloads.haskell.org/~ghc/8.2.2/ghc-8.2.2-${arch}-unknown-mingw32.tar.xz | tar -xJ -C /mingw${bitness} --strip-components=1

Now we’ll use Cabal to get those packages. We’ll place them (and Cabal) in /usr/local/bin, so we’ll make sure that’s created first:

mkdir -p /usr/local/bin
curl -L https://www.haskell.org/cabal/release/cabal-install-2.2.0.0/cabal-install-2.2.0.0-${arch}-unknown-mingw32.zip | bsdtar -xzf- -C /usr/local/bin

Now we’ll update our Cabal repository and get both alex and happy:

cabal update
cabal install -j --prefix=/usr/local/bin alex happy

Once while running this command I found that happy failed to install due to an issue with the mtl library. I got errors of this sort when running the ghc-pkg check command:

Cannot find any of ["Control\\Monad\\Cont.hi", "Control\\Monad\\Cont.p_hi", "Control\\Monad\\Cont.dyn_hi"]
Cannot find any of ["Control\\Monad\\Cont\\Class.hi", "Control\\Monad\\Cont\\Class.p_hi", "Control\\Monad\\Cont\\Class.dyn_hi"]

I managed to fix this by doing a manual re-install of the mtl package:

cabal install -j --prefix=/usr/local/ mtl --reinstall

After this step, there were no errors on ghc-pkg check, and I was able to install happy without any problems.

cabal install -j --prefix=/usr/local/ happy
Resolving dependencies…
Configuring happy-1.19.9…
Building happy-1.19.9…
Installed happy-1.19.9

Getting the Source and Building

Now that our dependencies are all set up, we can actually go get the source code! The main workflow for contributing to GHC uses some other tools, but we can start from Github.

git clone --recursive git://git.haskell.org/ghc.git

Now, you should run the ./boot command from the ghc directory. This resulted in some particularly nasty problems for me thanks to my antivirus. It decided that perl was an existential threat to my system and threw it in the Virus Chest. You might see an error like this:

sh: /usr/bin/autoreconf: /usr/bin/perl: bad interpreter: No such file or directory

Even after copying another version of perl over to the directory, I saw errors like the following:

Could not locate Autom4te/ChannelDefs.pm in @INC (@INC contains /usr/share/autoconf C:/msys64/usr/lib .) at C:/msys64/usr/bin/autoreconf line 39

In reality, the @INC path should have a lot more entries than that! It took me a while (and a couple complete do-overs) to figure out that my antivirus was the problem here. Everything worked once I dug perl out of the Virus chest. Once boot runs, you’re almost set! You now need to configure everything:

./configure --enable-tarballs-autodownload

The extra option is only necessary on Windows. Finally you’ll use the make command to build everything. Expect this to take a while (12 hours and counting for me!). Once you’re familiar with the codebase, there are a few different ways you can make things build faster. For instance, you can customize the build.mk file in a couple different ways. You can set BuildFlavour=devel2, or you can set stage=2. The latter will skip the first stage of the compiler.

You can also run make from the sub-directories rather than the main directory. This will only build the sub-component, rather than the full compiler. Finally, there’s also a make fast command that will skip building a lot of the dependencies.

Mac and Linux

I won’t go into depth on the instructions for Mac and Linux here, since I haven’t gotten the chance to try them myself. But due to the developer-friendly nature of those systems, they’re likely to have fewer hitches than Windows.

On Linux for instance, you can actually do most of the setup by using a Docker container. You’ll download the source first, and then you can run this docker command:

>> sudo docker run --rm -i -t -v `pwd`:/home/ghc gregweber/ghc-haskell-dev /bin/bash

On Mac, you’ll need to install some similar programs to windows, but there’s no need to use a terminal emulator like MSys. If you have the basic developer tools and a working version of GHC and Cabal already, it might be as simple as:

>> brew install autoconf automake
>> cabal install alex happy haddock
>> sudo easy_install pip
>> sudo pip install sphinx

For more details, check here. But once you’re set up, you’ll follow the same boot, configure and make instructions as for Windows.

Conclusion

So that wraps up our first look at GHC. There’s plenty of work to do just to get it to build! But next week we’ll start looking at some of the simplest modifications we can make to GHC. That way, we can start getting a feel for how the code base works.

If you haven’t written Haskell, it’s hard to appreciate the brilliance of GHC! Get started by downloading our Beginners Checklist and reading our Liftoff Series!

by James Bowen at June 11, 2018 02:30 PM

Luke Palmer

Smart Contracts, Luke, and Music

A recruiter for DFinity, a nonprofit cryptocurrency company working in Haskell, reached out to me the other day.  I did some research, read their whitepaper.  The tech is pretty clever and interesting, in particular providing solutions to my main two misgivings about cryptocurrency: (1) proof of work, which always seemed like a huge waste of computational resources, and (2) immutability (which some cherish about the technology, but I don’t personally think “pure capitalism” has humanity’s best interests in mind).  The alternative to proof-of-work in particular is quite appealing, instead delegating block generation to a series of randomly-chosen “committees”.  The immutability solution, “algorithmic governance”, has some clever premises and ideas, though it gets a bit abstract for me, and I remain unconvinced that it would actually work as intended (but it’s possible, I just need more time and information to digest).

Some context in my life: I left Google for “mini-retirement” in early 2016 after I had earned four years worth of savings––I wanted to find out what I would do when there was nothing I had to do.  Not really to my surprise, I ended up spending most of my time on music, and have improved vastly as a musician in that time.  I still spend some time coding as a hobby, since I still enjoy it.  It’s a great life, and I have learned a lot about myself.  But one thing I notice that is missing in this life is a sense of purpose––when I try to justify that my music helps people, it always feels like I’m talking out my ass.  So, while dedicating myself to my art, I’ve also had a radar out for things to do that will tangibly help humanity.  But I’m still in limbo––am I just avoiding my True Purpose as a musician because it’s scary?; am I wanting to help people just for the status?; is believing that my music doesn’t help people actually some self-devaluing belief that I need to let go of?, etc. etc.  And I wonder if such questions are just what being alive is like and they never really go away.

ANYWAY thanks for reading my little journal entry there.  I’ve been asking myself, if I did take a job with them, how might that be of service in ways that matter to me?  And I can think of ways, and it’s getting me excited.  I’m not really very deep in the cryptocurrency world, so these ideas are probably either naive or old news.  Nonetheless I’m an invent-first, research-later kind of person.

The idea of financial contracts being written precisely and formally is a great idea to me, replacing pseudo-precise legalese with actually-precise math.  But smart contracts don’t actually improve anything if they are so complicated that humans, who they ultimately serve, can’t understand them (and we know how quickly code can get unbearably complex).  It’s also possible to write misleading code, and in a world based on smart contracts, there is a great incentive to do so.  We need excellent tools for understanding and verifying contracts: assurances that they actually express the intent on the label.

Indeed, in a world of public contracts, there are new possibilities for “integration tests” that could detect instabilities, possible market crashes, and the like (though it is difficult to comprehend the magnitude of such an undertaking).  There is a story about Simon Peyton-Jones formalizing the Enron scandal, which was allegedly built on a series of impenetrably complex contracts, and finding its error.  The story might even be true, since he and others published a functional pearl about financial contracts.

Imagine a continuous integration system of our global financial system, monitoring it for health, automatically rolling back unhealthy contracts, protecting people from shit like Enron and the subprime mortgage crisis before it happened.  Imagine also moving to New Orleans and getting deep in the music scene.  Imagine doing both at once.  Does that sound like a good life, Luke?

by Luke at June 11, 2018 03:50 AM

June 06, 2018

Lysxia's blog

June 04, 2018

Monday Morning Haskell

Codeworld: Haskell as a First Programming Language

In the last couple weeks, we’ve explored a couple different Haskell open source projects. We checked out the Nix package manager and its Haskell cousin. Open source is very important to the Haskell community, so we’ll continue in this vein for a little while longer. This week, we’ll explore Codeworld, another project I learned about at Bayhac about a month ago. In the coming weeks, we’ll look at GHC itself, a vital open-source component of the Haskell ecosystem.

What is Codeworld?

Codeworld is an educational tool for teaching kids about mathematics and programming. The most basic version of Codeworld allows students to create geometric images. They do this using simple programming expressions similar to Haskell. Here’s a very basic program we can write and the picture it would draw.

leaves = sector(0, 180, 4)
trunk = solidRectangle(1,4)
tree = colored(leaves, translucent(green)) & colored(trunk, dark(brown))

program = drawingOf(tree)
(Image: code_world_0.png)

This is different from similar sorts of programs and language in many ways. The Logo programming language that I first learned used a more procedural style. You create “turtles” that move around the screen and perform commands. For example, you could tell a turtle to start drawing, move 25 pixels, turn, and move again. You might also approach drawing in an object oriented fashion. You'd create shapes that have different properties and change these over time. But Codeworld eschews both these approaches in favor of a more functional style.

Your program is ultimately a single drawing. You can compose this drawing with different components, always represented by expressions. As you learn more about the different patterns, you can create your own functions.

leaves = sector(0, 180, 4)
trunk = solidRectangle(1,4)

tree :: (Color, Color) -> Picture
tree(c1, c2) = colored(leaves, translucent(c1)) &
               colored(trunk, dark(c2))

myTree :: (Number, Color, Color) -> Picture
myTree(x, c1, c2) = translated(tree(c1, c2), x, 0)

program = drawingOf(myTree(-5, green, brown) & myTree(5, red, black))
(Image: code_world_1.png)

Within a few examples, it’s relatively easy to teach the concept of recursion! Here’s a simple example showing repetition and fractals:

branch :: Number -> Picture
branch(0) = blank
branch(n) =
    polyline([(0,0), (0, 5)]) &
    translated(smallBranch, 0, 5) &
    translated(rotated(smallBranch,  30), 0, 2.5) &
    translated(rotated(smallBranch, -30), 0, 2.5)
  where smallBranch = scaled(branch(n-1), 0.5, 0.5)

tree :: Picture
tree = branch(7)

program = drawingOf(tree)
(Image: code_world_2.png)

Codeworld Haskell

Now the basic version of Codeworld is like Haskell but with some simplifications and syntactic changes. There is also Codeworld Haskell, which employs the full Haskell feature set. This lets you use more complex items and dive into the type signatures a bit more.

It also involves more complex functions than drawing. You can create animations and interactions between different elements, or track a global state. It’s even possible to create simple games. The interactionOf function allows you to handle input events that can affect the world. The collaborationOf function looks a bit complicated with its use of StaticPtr. But it allows you to create multiplayer games with relative ease!

drawingOf :: Picture -> IO ()

animationOf :: (Double -> Picture) -> IO ()

simulationOf
  :: world
  -> (Double -> world -> world)
  -> (world -> Picture)
  -> IO ()

interactionOf
  :: world
  -> (Double -> world -> world)
  -> (Event -> world -> world)
  -> (world -> Picture)
  -> IO ()

collaborationOf
  :: Int
  -> StaticPtr (StdGen -> world)
  -> StaticPtr (Double -> world -> world)
  -> StaticPtr (Int -> Event -> world -> world)
  -> StaticPtr (Int -> world -> Picture)
  -> IO ()
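
To make these signatures concrete, here is a small sketch of a simulationOf program in Codeworld Haskell. It is only an illustration: the world is a single Double tracking elapsed time, and the picture is a circle orbiting the origin. The helper names (translated, solidCircle) come from the CodeWorld API and may differ slightly between versions.

import CodeWorld

-- The world is just the elapsed time, a Double.
main :: IO ()
main = simulationOf initial step draw
  where
    initial :: Double
    initial = 0

    -- Advance the elapsed time by the time step.
    step :: Double -> Double -> Double
    step dt t = t + dt

    -- Draw a unit circle orbiting the origin at radius 5.
    draw :: Double -> Picture
    draw t = translated (5 * cos t) (5 * sin t) (solidCircle 1)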

Using Codeworld

The easiest way to get started is to go to https://code.world, follow the Guide, and make some simple programs! Everything takes place in your web browser, so you can get a feel for how it works without needing to do any setup.

If you want to contribute to or fiddle with the source code at all, you’ll have to do some more involved work. You’ll need to follow the instructions on the Github repository, which are primarily for the main Linux distributions. You’ll also need to sign a Google Contributor License Agreement if you haven’t already. But if you want to help on some kind of educational Haskell tool, this is a great project to contribute to! It’s already in use in several schools!

Conclusion

Next week we’ll continue our open-source focus by beginning to look at the process of contributing to GHC. This compiler is a mainstay of the Haskell community. And it depends entirely on volunteer contributions! Naturally though, it's difficult to understand all the inner workings of a compiler. So we’ll start at a very basic level and work our way up. We'll begin by looking at contributions to less technical areas. Only at the end of our discussion will we start looking at any of the organization of the code itself.

If you’ve never written any Haskell before, Codeworld is actually a great way to introduce yourself to some of the fundamentals! But for a more classical introduction, you can also get our Haskell Beginner’s Checklist. It’ll walk you through the basics of setting Haskell up on your system.

by James Bowen at June 04, 2018 02:30 PM

Douglas M. Auclair (geophf)

May 2018 1HaskellADay Problems and Solutions

by geophf (noreply@blogger.com) at June 04, 2018 01:25 PM

May 31, 2018

Well-Typed.Com

Semi-Formal Development: The Cardano Wallet

TL;DR: A combination of formal modelling and testing using QuickCheck is a powerful tool in the design and implementation of high assurance software. Consistency of the model can be checked by testing invariants, and the real implementation can be tested by comparing it against the model.

As part of our consulting work for IOHK, Well-Typed have been working on the design and implementation of the new version of the Cardano cryptocurrency wallet. As a crucial component of this process, we have written a semi-formal specification of the wallet: a mathematical model of the wallet along with invariants and lemmas about how it behaves.

We refer to this specification as “semi-formal” because while it states many of the wallet’s properties, and proves some of them, it by no means proves all of them. As we will see, however, we can use QuickCheck to test such properties, producing counter-examples where they fail to hold. Not only is this an invaluable tool during the development of the specification itself, it also gives us a very principled way of testing the real implementation, even if later we do prove all the remaining properties as well.

In this blog post we will take a look at the specification and see how it drives the design and testing of the new wallet. We will show parts of the formal development, but only to give some idea about what it looks like; we will not really discuss any of its details. The goal of this blog post is not to describe the mathematics but rather the approach and its advantages.

Note: all figure, invariant and section numbers used in this blog post refer to version 1.1 of the specification.

Background: UTxO-style Accounting

Regular bank transactions transfer from and to bank accounts. For example, a transaction might transfer $100 from Alice to Bob. Transactions in cryptocurrencies such as Cardano or Bitcoin are a little different. The to part is similar, except that there might be multiple “outputs”; for example, a transaction might transfer $70 to Bob and $30 to Carol. The from part however works in quite a distinct way; transaction inputs are not accounts but other transactions. For example, let’s call the transaction that transfers $70 to Bob and $30 to Carol “t1”.

t1 inputs:
outputs: $70 to Bob, $30 to Carol

Now suppose Bob wants to transfer $50 to Dave. He can create a new transaction that says “take the first output of transaction t1, transfer $50 to Dave and transfer $20 back to me”.1

t2 inputs: first output of t1
outputs: $50 to Dave, $20 to Bob

It is important that Bob transfers the $20 “change” back to himself, because a transaction output can be “spent” (that is, used as an input) only once. This style of transactions is called “UTxO” style; UTxO stands for “Unspent Transaction (Tx) Outputs”.

The blockchain is basically a long list of such transactions. The specification gives a corresponding formal definition of transactions, their inputs and outputs, and the UTxO.
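
To give a rough idea of the ingredients, here is an illustrative Haskell sketch of the types involved. The names are chosen for this blog post only; they are not the definitions from the specification or from the Cardano code base.

import Data.Map (Map)
import Data.Set (Set)

type Address = String
type Coin    = Integer

newtype TxId = TxId String
  deriving (Eq, Ord, Show)

-- A transaction input points at one output of an earlier transaction.
data TxIn = TxIn
  { txInTx  :: TxId
  , txInIdx :: Int
  } deriving (Eq, Ord, Show)

-- A transaction output pays some amount to some address.
data TxOut = TxOut
  { txOutAddr  :: Address
  , txOutValue :: Coin
  } deriving (Eq, Show)

data Tx = Tx
  { trIns  :: Set TxIn
  , trOuts :: [TxOut]
  } deriving (Eq, Show)

-- The UTxO: a map from not-yet-spent inputs to the outputs they resolve to.
type Utxo = Map TxIn TxOut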

Wallet

The cryptocurrency’s wallet is a piece of software that monitors the state of the blockchain, keeps track of the user’s funds (more precisely, their UTxO) and enables them to create new transactions to be included into the blockchain. The wallet is the primary means by which users interact with the blockchain. Note that the verification of these new transactions and the decision whether or not to include them into the global blockchain is not up to the wallet; how this happens is beyond the scope of this blog post.

A formal specification of the wallet is a mathematical abstraction that strips away all irrelevant detail and just focuses on the core functionality of the wallet. In the basic model of the Cardano wallet, we stripped down the wallet state to just the wallet’s UTxO and its set of pending transactions; the specification is small enough to fit on a single page.

Such a model is simple enough to be studied in-depth and support mathematical proofs of its properties. Yet it is accurate enough to be an invaluable guide in the design of the wallet and be the base for unit tests that drive the real implementation. It can also be used to study where we most need to worry about performance and how we can address those concerns.

Balance

As an example of one of the trickier aspects of the wallet’s design, we will discuss reporting balance in a bit more detail. It may be surprising that reporting balance is non-trivial, but even for regular bank accounts we already have two notions of balance. After Alice transfers $100 to Bob, her online bank account might tell her

  • Your current balance is $1000
  • There is a pending transaction of $100, so your available balance is $900.

Unlike regular bank transactions, UTxO-style transactions involve change; this leads naturally to three notions of balance. If we take as example transaction t2 above, Bob’s balance might be reported as

  • Your UTxO balance is $1070
  • There is a pending transaction t2, so your available balance is $1000
  • However, transaction t2 returns $20 in change, so your total balance is $1020.

Note that the change from t2 becomes available only when transaction t2 has been included into the blockchain.
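
Using the illustrative types from the sketch above, the three notions of balance can be written down directly. Again, this is a sketch for exposition rather than the wallet's actual code; it assumes a predicate ours that recognizes the wallet's own addresses.

import qualified Data.Map as Map
import qualified Data.Set as Set

type Pending = Map TxId Tx

-- UTxO balance: everything currently in the wallet's UTxO.
utxoBalance :: Utxo -> Coin
utxoBalance = sum . map txOutValue . Map.elems

-- Available balance: the UTxO balance minus the outputs that are
-- spent by still-pending transactions.
availableBalance :: Pending -> Utxo -> Coin
availableBalance pending utxo =
    utxoBalance (utxo `Map.withoutKeys` spentByPending)
  where
    spentByPending = Set.unions (map trIns (Map.elems pending))

-- Total balance: the available balance plus the change that pending
-- transactions send back to one of our own addresses.
totalBalance :: (Address -> Bool) -> Pending -> Utxo -> Coin
totalBalance ours pending utxo =
    availableBalance pending utxo
  + sum [ txOutValue out
        | tx  <- Map.elems pending
        , out <- trOuts tx
        , ours (txOutAddr out)
        ]

In Bob's example this gives a UTxO balance of $1070, an available balance of $1000, and a total balance of $1020.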

Although there are user interface questions here (how should we report these different kinds of balance to the user?), from the wallet design perspective this is not yet particularly difficult. The real complexity comes from the fact that there may be temporary disagreement about which transactions are included in the blockchain (such disagreements are known as “forks”; how they arise and are resolved is again well beyond the scope of this blog post).

Let’s stick to the above example, and suppose that t2 is pending, but there was disagreement about t1. After the disagreement got resolved the wallet discovers that t1 is not actually present in the blockchain after all (perhaps it will be later). The wallet’s available balance is now still $1000, but would it really make sense to report its total balance as $1020? It would be strange to have the total balance be larger than the UTxO balance; not wrong per se, but confusing nonetheless.

In the specification we therefore define the concept of minimum balance: the minimum (UTxO) balance of the user “across all possible futures”; this is the only balance the user can be certain of. In the example, this is the future in which neither t1 nor t2 ever make it into the blockchain, and hence we report $1000 as the minimum balance (note that it can never happen that t2 is included but t1 is not, since t2 depends on t1). While this concept makes intuitive sense, in order to make it precise and be able to compute it we needed to introduce the concept of “expected UTxO”: unspent transaction outputs that the wallet expects will become available in the future but haven’t yet.

Of course, even without a formal specification it is possible that we could have come up with this concept of minimum balance and a way to compute it. But having the formal model allows us to think much more clearly about the problems, remove all the clutter that surrounds a “real” implementation of the wallet and focus only on the core ideas.

Internal consistency: invariants

When we introduce a novel concept such as “expected UTxO” into the specification, there is a tricky question: how do we know that what we specified is right? Actually, since we introduced the concept ourselves, the question of whether it was “right” or not is not particularly meaningful. Instead we ask: is what we specified useful?

One way to answer this question is by stating invariants. Invariants are properties that we expect to be true at all times. For example, for the basic model shown above (Figure 3) we have the following invariant:

Invariant 3.4. txins pending ⊆ dom utxo

What this invariant says is that the pending transactions can only spend from the wallet’s unspent outputs. Intuitively, this makes a lot of sense: the wallet should not allow the user to spend funds they don’t have. As we have seen however in the previous section, this invariant does not always hold when we take disagreements about the blockchain into account! When Bob submits transaction t2, spending an output of transaction t1, and only later discovers that actually t1 should not have been included in the blockchain after all, he will have spent an output that is not in his UTxO.

The concept of expected UTxO comes to the rescue, again. The invariant for the full model is instead

Invariant 7.8. txins pending ⊆ dom (utxo ∪ expected)

In other words, pending transactions can only spend available outputs or outputs that we expect to become available again (because they were available previously).

Another useful invariant that helps further solidify our intuition about what we mean by expected UTxO says that an output can never be in both the actual UTxO and the expected UTxO

Invariant 7.6. dom utxo ∩ dom expected = ∅

After all, it would be strange to say that an output is expected if we already have it in our wallet. Stating invariants like this allows us to make our intuitions about the concepts we introduce precise, and proving them gives us strong guarantees that our specification makes sense and is internally consistent.

Semi-formal development

Formally proving invariants such as the ones discussed above can be time consuming and requires mathematical training. Naturally, doing the proofs would be ideal, but the main point of this blog post is that we can push this approach a long way even if we don't. After all, we program in Haskell precisely because we can easily translate back and forth between Haskell and mathematics.

To translate the various wallet models from the specification, we use the approach we described in a previous blog on Object Oriented Programming in Haskell (indeed, we developed that approach specifically for this purpose). For instance, here is the Haskell translation of the basic model:

mkWallet :: (Hash h a, Ord a, Buildable st)
         => Ours a -> Lens' st (State h a) -> WalletConstr h a st
mkWallet ours l self st =
  (mkDefaultWallet (l . statePending) self st) {
      utxo       = st ^. l . stateUtxo
    , ours       = ours
    , applyBlock = \b -> self (st & l %~ applyBlock' ours b)
    }

applyBlock' :: Hash h a
            => Ours a -> Block h a -> State h a -> State h a
applyBlock' ours b State{..} = State {
    _stateUtxo    = updateUtxo ours b _stateUtxo
  , _statePending = updatePending   b _statePending
  }

updateUtxo :: forall h a. Hash h a
           => Ours a -> Block h a -> Utxo h a -> Utxo h a
updateUtxo p b = remSpent . addNew
  where
    addNew, remSpent :: Utxo h a -> Utxo h a
    addNew   = utxoUnion (utxoRestrictToOurs p (txOuts b))
    remSpent = utxoRemoveInputs (txIns b)

updatePending :: Hash h a
              => Block h a -> Pending h a -> Pending h a
updatePending b = Map.filter $ \t -> disjoint (trIns t) (txIns b)

It deals with more details than the specification; for instance, it explicitly abstracts away from a specific choice of hash h, as well as the types of addresses a. It is therefore a bit more complicated than the spec, but it nonetheless follows the specification very closely. In particular, it is still a model: it does not deal with any networking issues or persistent storage, may not be particularly performant, etc. In other words, this is not intended to be the design of the real wallet. Having this model is nonetheless incredibly useful, for two reasons. The first is that we can use it to test the real wallet; we will discuss that in the next section.

The second reason is that we can use the model to test the invariants. For example, here is the translation of Invariants 7.8 and 7.6 from the previous section to Haskell:

pendingInUtxoOrExpected :: WalletInv h a
pendingInUtxoOrExpected l e =
  invariant (l <> "/pendingInUtxoOrExpected") e $ \w ->
   checkSubsetOf
    ("txins pending",
      txIns (pending w))
    ("utxo ∪ expected",
      utxoDomain (utxo w) `Set.union` utxoDomain (expectedUtxo w))

utxoExpectedDisjoint :: WalletInv h a
utxoExpectedDisjoint l e =
  invariant (l <> "/utxoExpectedDisjoint") e $ \w ->
   checkDisjoint
    ("dom utxo",
      utxoDomain (utxo w))
    ("dom expected",
      utxoDomain (expectedUtxo w))                    

As for the wallet implementation, the Haskell translation of the invariants deals with more details than the spec; in this case, one of the main differences is that the infrastructure for these invariants is designed to give detailed error reports when the invariants do not hold. Nonetheless, the main part of the invariant is again a direct translation of the specification.

The big gain is that we can now use QuickCheck to test these invariants. We generate random (but valid) events for the wallet (“apply this block”, “submit this new transaction”, “switch to a different fork”) and then check that the invariants hold at each point. For example, in the first release of the wallet specification there was a silly mistake: when the wallet was notified of a new block, it removed the inputs of that block from the expected UTxO, rather than the outputs. A silly mistake, but easy to miss when just reviewing the specification by hand. A proof would have found the mistake, of course, but so can QuickCheck:


Wallet unit tests
  Test pure wallets
    Using Cardano model FAILED [1]

Failures:

  test/unit/Test/Spec/Models.hs:36:
  1) Wallet unit tests, Test pure wallets, Using Cardano model
       predicate failed on: Invalid [] InvariantViolation {
           name:     full/utxoExpectedDisjoint
         , evidence: NotSubsetOf {
                 dom utxo: ..
               , dom expected: ..
               , dom utxo `intersection` dom expected: ..
             }
         , events:   {
                 state: ..
               , action: ApplyBlock ..
               , state: ..
               , action: NewPending Transaction{ .. }
               , state: ..
               ..
               , action: Rollback
               ..
             }
         }

Not only does this tell us that the invariant did not hold; it actually gives us the specific sequence of events that leads to the wallet state in which the invariant does not hold (as well as all the intermediate wallet states), and it tells us what the domain of the UTxO and the expected UTxOs are in that state as well as their intersection (which should have been empty but wasn't).

Testing the real implementation

As mentioned, the Haskell translation of the wallet specification is still a model which ignores a lot of the real world complexity that the full wallet implementation must deal with. Even the datatypes that the model works with are simplified versions of the real thing: transactions don’t include signatures, blocks are not real blocks but just lists of transactions, etc.

Nonetheless, we can use the model to test the real implementation also. We can translate the simplified types of the model to their real counterparts. Since we have QuickCheck generators for the simplified types and we can test the model because we have an implementation of it, we can test the real wallet by checking a commuting diagram between the model and the real implementation.

In words, what this means is that we use the QuickCheck generator to generate wallet events using the simplified model types and then do two things:

  • We execute the model on the simplified types and translate after.
  • We translate first and then execute the real wallet on the translated input.

Either way, we end up with two final answers (both in terms of the real Cardano types), one executed in the model and one executed in the real wallet. We can compare these two, and if they match we can conclude that the real wallet implements the model correctly. Since we check this property at each step and we know that the invariants hold in the model at each step, we can then also conclude that the invariants hold in the real implementation at each step.
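
To make the shape of this test concrete, here is a tiny self-contained toy in the same spirit. None of the names below come from the Cardano code base: the "model" of an account balance is a plain Integer, the "implementation" keeps the full event history, and the property checks that running the events through the model agrees with running them through the implementation and translating the result back.

import Test.QuickCheck

data Event = Deposit Integer | Withdraw Integer
  deriving (Show)

instance Arbitrary Event where
  arbitrary = oneof [ Deposit  <$> choose (0, 100)
                    , Withdraw <$> choose (0, 100) ]

-- The model: an account balance is just a running total.
runModel :: [Event] -> Integer
runModel = foldl step 0
  where
    step acc (Deposit n)  = acc + n
    step acc (Withdraw n) = acc - n

-- The "real" implementation: it keeps the entire event history.
newtype Ledger = Ledger [Event]

runImpl :: [Event] -> Ledger
runImpl = Ledger

-- Translate the implementation's state back into the model's terms.
translate :: Ledger -> Integer
translate (Ledger evs) = sum (map delta evs)
  where
    delta (Deposit n)  = n
    delta (Withdraw n) = negate n

-- The commuting-diagram property.  (In this toy the two sides trivially
-- agree; in the real test suite the implementation is far more complex
-- than the model, which is what makes the comparison worthwhile.)
prop_commutes :: [Event] -> Property
prop_commutes evs = runModel evs === translate (runImpl evs)

main :: IO ()
main = quickCheck prop_commutes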

For example, when a bug in the full wallet meant that change from pending transactions was sometimes double-counted (in the case where pending transactions use the change of other pending transactions as inputs, a case that can only happen in the presence of forks), the generator will be able to find a counter-example to the above commuting diagram, and then give us the exact sequence of wallet events that leads to a wallet state in which the real wallet and the model disagree as well as the precise value that they disagree on.

Conclusions

Software specifications, where they exist at all, are often informal documents that describe the features that the software is intended to have in natural language. Such specifications do not lend themselves easily to verification or even testing. At the other end of the spectrum, fully formal specifications where every property is verified mathematically are costly and time consuming to produce and require specialized expertise. They are of course the golden standard, but there is also a useful middle-ground: by specifying the model and its properties formally, we can test its properties using QuickCheck. Moreover, we get a model in which we can reason about the core functionality of the application and which we can then compare against the real implementation.

IOHK’s development of the new wallet is open source and can be found on GitHub. Excitingly, IOHK have recently hired someone to start work on the Coq formalization of the wallet specification, which will put the whole specification on an even stronger footing. Of course, that does not make any of the existing work useless: although it will become less important to test the invariants in the model, having the QuickCheck generators available to check the real implementation is still very valuable. Moreover, having the QuickCheck tests validate the invariants before we attempt to prove them formally can also save valuable time if we discover early that an invariant is not, in fact, true.

References

[spec] “Formal specification for a Cardano wallet”, Duncan Coutts and Edsko de Vries
[github] GitHub repository for the new wallet


  1. We ignore transaction fees for the sake of simplicity.

by edsko at May 31, 2018 04:05 PM

May 28, 2018

Functional Jobs

Expert-Grade Software Engineer for Enterprise Software on Distributed Ledger Technology (Blockchain) at MCZ Moschin Executive AG (Full-time)

MCZ Moschin Executive AG is a headhunting agency located in Zurich-City with over 30 years of experience.

At the moment we are looking for a Software Engineer for one of our clients.

The Company Profile

  • They are a rapidly growing Swiss startup developing a fundamentally new kind of enterprise ecosystem software for global players in the digital age based on unique breakthrough Distributed Ledger Technology (DLT).
  • The company culture is “with passion”: a young, innovative and transparent leadership and people culture.
  • Their success is based on a radically process-oriented Distributed Ledger Technology stack, a laser focused vision and strategy, a truly international team of outstanding collaborating business and technology experts, an excellent cross-industry client base of global players as well as a strong external partner network of consulting companies and integrators.
  • Their headquarters is in Switzerland with a subsidiary in Germany. Further openings across Europe to follow clients and talent are planned.

The Vision

Today, globally and cross-industry business processes and whole business models are under disruption by a combination of new mega trend technologies such as AI, DLT and IoT leading to an end-to-end digitalization of ecosystems with new market structures, products and customer experiences.

Completely new software architectures based on formal protocols (often called Smart Contracts) and DLT (often called Blockchain) platforms will unleash the second digitalization wave allowing the agile, fast, secure and low cost implementation of new processes, applications and automations across enterprises and ecosystems.

At the heart of the new architectures are formal protocols (such as contracts, processes, regulations or policies) defining agreed processes and obligations including involved objects, mandatory or possible actions. The execution of these protocols is transparent/analyzable for all parties involved and synchronized on a distributed ledger.

Your Opportunity

We are looking for exceptional personalities with large-scale distributed software development skills in enterprise environments to take mission-critical leading roles within the software engineering team to further expand our business.

  • Rare opportunity to join a rapidly growing cutting-edge Swiss startup delivering real-world enterprise software to international tier-1 industry clients with a unique revolutionizing technology.
  • Take a key technical lead role to shape the software and have a critical and direct impact on the company success.
  • Deliver real-world enterprise software on distributed ledger technology for ecosystems in the digital age. Change the way whole industries do business.
  • Combine nascent technologies such as distributed ledger technology (blockchain), formal multiparty protocols (smart contracts), functional programming and domain specific languages, new development and testing tools (such as formal verification), high-performance computing, IoT and workflow automation.
  • Build cross-silo software for enterprises and whole ecosystems by formalizing contracts, queries and individual workflows that seamlessly automate collaboration between people, companies and machines.
  • Work with top-tier clients across different industries e.g. in the energy, automotive, chemical and health care industries.
  • Work in an international team of highly collaborative exceptional engineers.

Your Responsibilities

  • Design and develop high quality software efficiently and rapidly, using functional and non-functional languages, within an agile development environment.
  • Refine the specification, make critical design decisions and concept software architecture.
  • Develop enterprise scale distributed ledger technology that will be deployed at global players.
  • Technological responsibility for the delivery of software for customer projects, product development and research projects.
  • Depending on your preferences and focus bring in your exceptional technical expertise, people management and/or product delivery skills.

Your Qualifications

  • 10+ years experience as a software engineer, working in large-scale software development environments.
  • Proven back-end work experience with functional and non-functional programming languages including but not limited to: Kotlin, Scala, and Haskell, Java, C/C++, C#, Objective C, Python, JavaScript, or Go.
  • Basic knowledge of Web protocols, including JavaScript, HTML, CSS and HTTP. Knowledge of front-end development including Typescript, React, Angular or similar.
  • Recently worked in an Agile/Scrum environment.
  • Affinity with formal verification methods and technology (e.g. Coq) is a plus.
  • Knowledge of cloud services and containers.
  • Experience with distributed and concurrent systems, scalable and high-availability computing.
  • Deep understanding of secure coding, testing and releasing. Strong enterprise grade delivery orientation.
  • Strong communication skills.
  • BS, MS or PhD in computer science.

The Offer for you

  • Work as part of our young, energy sparkling, innovative core team at our headquarters in Switzerland.
  • Work in Switzerland for at least one year required. Subsidiary in Germany and further subsidiaries in Europe planned so that remote work or moving to another office could be possible in the future.
  • Work in picturesque crypto-valley with lowest income taxes in Europe.
  • Swiss citizen or holder of a valid Swiss work permit type B or C, or EU/EFTA citizen required.
  • Highly competitive compensation package plus participation shares in the company.
  • Excellent personal and career growth potential.

Get information on how to apply for this position.

May 28, 2018 09:34 PM

Expert-Grade Software Engineers for Enterprise Software on Distributed Ledger Technology (Blockchain) at MCZ Moschin Executive AG (Full-time)

MCZ Moschin Executive AG is a headhunting agency located in Zurich-City with over 30 years of experience and long-standing connections in the financial industry.

At the moment we are looking for Software Engineers for one of our clients.

The Company Profile

They are a rapidly growing Swiss startup developing a fundamentally new kind of enterprise ecosystem software for global players in the digital age based on unique breakthrough Distributed Ledger Technology (DLT). The company culture is “with passion”: a young, innovative and transparent leadership and people culture. Their success is based on a radically process-oriented Distributed Ledger Technology stack, a laser focused vision and strategy, a truly international team of outstanding collaborating business and technology experts, an excellent cross-industry client base of global players as well as a strong external partner network of consulting companies and integrators. Their headquarters is in Switzerland with a subsidiary in Germany. Further openings across Europe to follow clients and talent are planned.

The Vision

Today, globally and cross-industry business processes and whole business models are under disruption by a combination of new mega trend technologies such as AI, DLT and IoT leading to an end-to-end digitalization of ecosystems with new market structures, products and customer experiences. Completely new software architectures based on formal protocols (often called Smart Contracts) and DLT (often called Blockchain) platforms will unleash the second digitalization wave allowing the agile, fast, secure and low cost implementation of new processes, applications and automations across enterprises and ecosystems. At the heart of the new architectures are formal protocols (such as contracts, processes, regulations or policies) defining agreed processes and obligations including involved objects, mandatory or possible actions. The execution of these protocols is transparent/analyzable for all parties involved and synchronized on a distributed ledger.

Your Opportunity

We are looking for exceptional personalities with large-scale distributed software development skills in enterprise environments to take mission-critical leading roles within the software engineering team to further expand our business.

Rare opportunity to join a rapidly growing cutting-edge Swiss startup delivering real-world enterprise software to international tier-1 industry clients with a unique revolutionizing technology. Take a key technical lead role to shape the software and have a critical and direct impact on the company success. Deliver real-world enterprise software on distributed ledger technology for ecosystems in the digital age. Change the way whole industries do business. Combine nascent technologies such as distributed ledger technology (blockchain), formal multiparty protocols (smart contracts), functional programming and domain specific languages, new development and testing tools (such as formal verification), high-performance computing, IoT and workflow automation. Build cross-silo software for enterprises and whole ecosystems by formalizing contracts, queries and individual workflows that seamlessly automate collaboration between people, companies and machines. Work with top-tier clients across different industries e.g. in the energy, automotive, chemical and health care industries. Work in an international team of highly collaborative exceptional engineers.

Your Responsibilities

Design and develop high quality software efficiently and rapidly, using functional and non-functional languages, within an agile development environment. Refine the specification, make critical design decisions and concept software architecture. Develop enterprise scale distributed ledger technology that will be deployed at global players. Technological responsibility for the delivery of software for customer projects product development and research projects. Depending on your preferences and focus bring in your exceptional technical expertise, people management and/or product delivery skills.

Your Qualifications

10+ years experience as a software engineer, working in large-scale software development environments. Proven back-end work experience with functional and non-functional programming languages including but not limited to: Kotlin, Scala, and Haskell, Java, C/C++, C#, Objective C, Python, JavaScript, or Go. Basic knowledge of Web protocols, including JavaScript, HTML, CSS and HTTP. Knowledge of front-end development including Typescript, React, Angular or similar. Recently worked in an Agile/Scrum environment. Affinity with formal verification methods and technology (e.g. Coq) is a plus. Knowledge of cloud services and containers. Experience with distributed and concurrent systems, scalable and high-availability computing. Deep understanding of secure coding, testing and releasing. Strong enterprise grade delivery orientation. Strong communication skills. BS, MS or PhD in computer science.

The Offer for you

Work as part of our young, energy sparkling, innovative core team at our headquarters in Switzerland. Work in Switzerland for at least one year. Subsidiary in Germany and further subsidiaries in Europe planned so that remote work or moving to another office could be possible in the future. Work in picturesque crypto-valley with lowest income taxes in Europe. Swiss citizen or holder of a valid Swiss work permit type B or C, or EU/EFTA citizen. Highly competitive compensation package plus participation shares in the company. Excellent personal and career growth potential.

Get information on how to apply for this position.

May 28, 2018 09:34 PM

Monday Morning Haskell

HNix: Enhancing Nix with Haskell


Last week we introduced Nix, the purely functional package manager. We saw how it used some different conceptual techniques from functional programming. With these concepts, it seeks to solve some problems in package management. It shares many concepts with Haskell, so it is most often used by Haskell developers.

Because of the Haskell community's interest in Nix, an interesting project has arisen alongside it. This is HNix, which I mentioned a few weeks ago in my article about BayHac. HNix is a Haskell implementation of various components of Nix. In this quick article, we’ll look at the different elements of this project.

The Nix Language and the Nix Store

The term “Nix” is a little overloaded. It refers to the package manager or the operating system, but also refers to a language. The Nix language is how we specify the values that represent our different packages. The core repository of this project implements the Nix language in Haskell.

This implementation would make it easier to integrate Nix with your Haskell code. For example, you could combine Nix versioning of your packages with a database schema. This could ensure that you can automatically handle migrations.

Another part of the project is an interface to the Nix Store. The store deals with how Nix actually saves all the different packages on your system. While Nix does sandbox its packages, it can still be useful to have a programmatic interface to them. This allows you to manipulate a representation of this store in-memory, rather than on the file system. For instance, one store implementation has no side effects at all, to allow for unit testing. Another would read from the file system. But then it would perform all write effects in memory without modifying anything.

Open Source Haskell

One of the main reasons I’m discussing HNix is that it’s a good gateway to open source Haskell. If you’ve wanted to contribute to an OS Haskell project and weren’t sure where to start, HNix is a great option. The maintainers are very friendly guys. They'd be more than happy to help you get started in understanding the code base. At BayHac I was very impressed with how well organized the project was. Specifically, the maintainers made it easy for new people to get involved in the project. They laid out many different issue tickets that were doable even for non-experts.

So to get started, take a look at the repository. The README instructions are pretty thorough. Then you can go through the issues section for a little bit and pick up one of the tickets with a “Help Wanted” label. You can email one of the maintainers for help (John Wiegley is probably your best bet). Tagging them in an issue comment should also work if you need some direction.

Conclusion

Haskell depends a lot on open source contributions. A lot of the core pieces of infrastructure (GHC, Stack, Cabal) are all maintained via open source. When you make these contributions, you’ll be able to rapidly improve your Haskell, add value to the community, and meet great people along the way! Next week, we’ll look at another open source Haskell project.

And if you’ve never written any Haskell before, don’t be afraid! You can start your learning journey with our Beginners Checklist. You’ll be able to make solid contributions much quicker than you think!

by James Bowen at May 28, 2018 02:30 PM

Michael Snoyman

I am not snoyjerk

There's a user on Reddit with the handle snoyjerk. Since there has been some confusion recently, I'd like to set the record straight:

  • I am not snoyjerk
  • I don't know who snoyjerk is
  • I have never used "sock puppet" accounts to discuss Haskell or other programming topics.

    • I have in the past used alternative accounts to discuss other interests, like weight lifting, but have since stopped to avoid any possible confusion.
  • Unless I'm overlooking something, all of my social media accounts are under the name snoyberg.

    • I obviously can't guarantee that every snoyberg out there is me, there are simply too many social media sites.
    • I also have access to some other social media accounts. For example, I'm pretty sure I have the credentials for FP Complete's Twitter account. But I'd imagine that's pretty obviously not intended to obfuscate my identity.

In the past, I've jokingly/sarcastically had some banter with snoyjerk, which unfortunately gave the impression—or even stated—that it was, in fact, my account. I can't blame anyone but myself for that. I apologize for the confusion, and hope this sets the record straight.

Some relevant links:

Thanks to simonmic, duplode, and gelisam for helping clarify this in yesterday's Reddit thread.

May 28, 2018 08:43 AM

May 25, 2018

Roman Cheplyaka

QuickCheck vs SmallCheck

QuickCheck and SmallCheck are the two classical libraries for property testing in Haskell. There are newer developments, like SmartCheck and Hedgehog, but I haven’t tried them much yet.

So here is my take on the comparative strengths and weaknesses of the current versions of QuickCheck and SmallCheck (2.11.3 and 1.1.4, respectively). Being the maintainer of SmallCheck, I am obviously somewhat biased towards it. But quite often I would prefer QuickCheck to SmallCheck for a specific task. It also sometimes happens that I think QuickCheck would work better in principle for a given task, but in practice SmallCheck is more convenient for that task — usually it has to do with generic instances or failure reporting. So here are the considerations I take into account when choosing between the two.

If you do not know much about QuickCheck and/or SmallCheck, the main difference between them is that QuickCheck generates random test cases, whereas SmallCheck enumerates test cases exhaustively up to some depth. Some (though not all) of the pros and cons come from this fundamental difference in paradigms.

Advantages of QuickCheck

Control over the number of tests/execution time

With SmallCheck, the number of tests is usually exponential in their depth. Say, when enumerating trees (from Data.Tree), depth 5 gives you 189 tests in less than 0.01s, while depth 6 gives 6479 tests in 4s. If you only have the patience to run 1000 tests, you have to settle for 189. QuickCheck, on the other hand, allows you to set the exact number of tests independently from their depth (which is called size in QuickCheck).

Update (2018-07-05). In smallcheck-1.1.5, I added a function
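
limit :: Monad m => Int -> Series m a -> Series m a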

which acts similarly to take for lists. It is still not as convenient as QuickCheck, because you have to apply it to each “big” generator as opposed to setting a single global limit on the number of tests, but you can use it to fight the exponential blowup.

Floating-point numbers

SmallCheck has instances for floating-point numbers, e.g.

These are probably enough if you need to fill some fractional field in a data type, but the logic of your functions doesn’t depend on the number in a complex way.

But if you are testing any truly numerical code (say, matrix operations or physics simulation), it is much more useful to have QuickCheck generate random floating-point numbers.

Advantages of SmallCheck

Generic deriving

If you have a couple of types that rarely change, writing the instances by hand may not be a big deal.

Often, especially when writing tests for commercial code, there is a large number of nested algebraic data types. And even if you have the patience to write the instances, they will become incomplete the next time a constructor is added to one of the types (and the compiler won’t warn you).

So it is essential to be able to derive most of the instances. SmallCheck has had generic instances since 2011, thanks to the contribution from Bas van Dijk. QuickCheck still does not have them (github issue).

There is a separate package, generic-arbitrary, but it doesn’t seem to support recursive data types. The recursive types also seem to be the reason why QuickCheck itself doesn’t have the generic implementation.

The solution to the recursive type problem is straightforward: reduce the depth (size) every time you descend into a nested type. As an example, here’s an instance for trees from Data.Tree:
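
import Test.QuickCheck
import Data.Tree

-- A sketch of such an instance: halve the size whenever we descend
-- into the subtrees, so that generation terminates.
instance Arbitrary a => Arbitrary (Tree a) where
  arbitrary = sized $ \n -> do
    label    <- arbitrary
    children <-
      if n <= 0
        then pure []
        else do
          k <- choose (0, 3 :: Int)
          vectorOf k (resize (n `div` 2) arbitrary)
    pure (Node label children)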

This approach is not ideal (your size may get down to 0 too quickly), but IMO it is better than no generic instances at all.

Update (2018-07-05). There are some third-party packages that implement generic deriving for QuickCheck and support recursive types, in particular testing-feat and generic-random.

testing-feat looks very cool and powerful from a theoretical perspective, and I’d like to study it in depth some day. In practice it involves definitions like

Sometimes you need to define an Enumerable instance for an abstract or primitive type, which I ended up doing like this (for Text):

generic-random is a bit easier to use, since there’s no additional Enumerable class. The declaration looks like this:

In both cases, genericShrink comes from the QuickCheck package itself, but it’s not the default implementation for shrink; the default is shrink _ = [].

Failure reporting

Say we have a property
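
-- for instance, a round-trip property over lists
prop_doubleReverse :: [Int] -> Bool
prop_doubleReverse x = reverse (reverse x) == x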

By default, both SmallCheck and QuickCheck only tell you the generated values that led to the failure (i.e. x). But I would also like to know why the test failed — in this case, what reverse (reverse x) was.

There is a StackOverflow question about this with two answers, neither of which is perfect.

Edward Z. Yang says:

Because QuickCheck gives you the inputs to the function, and because the code under test is pure (it is, right?), you can just feed those inputs to the function and get the result. This is more flexible, because with those inputs you can also repeatedly test with tweaks to the original function until it’s correct.

There are many reasons why this may not work:

  1. The code may not be pure by design — both SmallCheck and QuickCheck are capable of testing IO code.
  2. The code may be intended to be pure, but the very bug you are trying to fix may lead to it being impure/non-deterministic.
  3. Copy-pasting values may not be very convenient: think large matrices.
  4. Also, strings produced by Show do not always map back to the original values effortlessly. E.g. Maps, Sets, and Vectors all result in fromList [...] when shown. If you have several of them in the same structure, you’ll have to disambiguate those fromLists.

The other answer, by Paul Johnson, gives a working solution (using MkResult), but it is rather unwieldy, and you have to include the parameter values manually if you need them.

Update (2018-07-05). I’ve since found and posted a much better way to report a failure in QuickCheck:
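
One way to get such reporting (not necessarily the exact snippet from that post) is QuickCheck's counterexample combinator, which attaches extra output to a failing case:

    import Test.QuickCheck

    -- On failure, QuickCheck now prints both the input x and the
    -- intermediate value reverse (reverse x).
    prop_doubleReverse :: [Int] -> Property
    prop_doubleReverse x =
      counterexample ("reverse (reverse x) = " ++ show (reverse (reverse x))) $
        reverse (reverse x) == x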

In SmallCheck, you can return Either String String instead of Bool, where the Strings give an explanation for failure and success, respectively. For example, with a plain Bool result, this property
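(sketched here over lists of Ints; the name prop_palindromic is just for illustration)

    prop_palindromic :: [Int] -> Bool
    prop_palindromic x = reverse x == x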

results in the message

there exists [0,1] such that
  condition is false

But we can change the property to explain itself
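
A sketch of the self-explaining version, returning Either String String (the success and failure messages are illustrative):

    prop_palindromic :: [Int] -> Either String String
    prop_palindromic x
      | reverse x == x = Right "x is a palindrome"
      | otherwise      = Left ("reverse x = " ++ show (reverse x))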

which results in

there exists [0,1] such that
  reverse x = [1,0]

Better than shrinking

QuickCheck tends to produce complex counterexamples and then tries to shrink them to find simpler ones. While shrinking does lead to somewhat simpler examples, they are usually not minimal. Also, it’s on you to implement the shrinking function for your own types (although there is genericShrink).

Because SmallCheck enumerates the values in order, from smaller to bigger, it automatically produces the minimal example.

Determinism and exhaustiveness

Because QuickCheck generates its examples randomly, the set of test cases your property is tested on is rather arbitrary. (It also changes from run to run by default, though this can be fixed by supplying a fixed seed.)

Sometimes that’s good: if you have a high- or infinite-dimensional space of possibilities, all you can hope for is to test on a few diverse samples from that space. But other times, that space is narrow, and it’s better to test deterministically and exhaustively.

May 25, 2018 08:00 PM

May 23, 2018

Joachim Breitner

The diameter of German+English

Languages never map directly onto each other. The English word fresh can mean frisch or frech, but frisch can also be cool. Jumping from one word to another like this yields entertaining sequences that take you to completely different things. Here is one I came up with:

frech – fresh – frisch – cool – abweisend – dismissive – wegwerfend – trashing – verhauend – banging – Geklopfe – knocking – …

And I could go on … but how far? So here is a little experiment I ran:

  1. I obtained a German-English dictionary. Conveniently, after registration, you can get dict.cc’s translation file, which is simply a text file with three columns: German, English, Word form.

  2. I wrote a program that takes these words and first canonicalizes them a bit: Removing attributes like [ugs.] [regional], {f}, the to in front of verbs and other embellishment.

  3. I created the undirected, bipartite graph of all these words. This is a pretty big graph – ~750k words in each language, a million edges. A path in this graph is precisely a sequence like the one above.

  4. In this graph, I tried to find a diameter. The diameter of a graph is the longest shortest path, i.e. the greatest distance between two nodes that are connected at all (a rough sketch of this computation follows the list).
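
A rough sketch of such a diameter computation, done the naive way with one breadth-first search per node (not necessarily the code I actually ran):

    import qualified Data.Map.Strict as M
    import qualified Data.Set as S

    type Graph = M.Map String [String]

    -- Shortest-path distances from a start node, computed level by level.
    distances :: Graph -> String -> M.Map String Int
    distances g start = go (M.singleton start 0) (S.singleton start) 1
      where
        go dist frontier d
          | S.null next = dist
          | otherwise   = go dist' next (d + 1)
          where
            next = S.fromList
              [ w
              | v <- S.toList frontier
              , w <- M.findWithDefault [] v g
              , not (w `M.member` dist)
              ]
            dist' = S.foldr (\w m -> M.insert w d m) dist next

    -- The diameter: the largest distance any BFS finds, over all start nodes.
    -- One BFS per node, so this is slow on a graph with ~1.5 million nodes.
    diameter :: Graph -> Int
    diameter g = maximum [ maximum (M.elems (distances g v)) | v <- M.keys g ]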

Because the graph is big (and my code is maybe not fully optimized), it ran for a few hours, but here it is: The English expression be annoyed by sb. and the German noun Icterus are related by 55 translations. Here is the full list:

  • be annoyed by sb.
  • durch jdn. verärgert sein
  • be vexed with sb.
  • auf jdn. böse sein
  • be angry with sb.
  • jdm. böse sein
  • have a grudge against sb.
  • jdm. grollen
  • bear sb. a grudge
  • jdm. etw. nachtragen
  • hold sth. against sb.
  • jdm. etw. anlasten
  • charge sb. with sth.
  • jdn. mit etw. [Dat.] betrauen
  • entrust sb. with sth.
  • jdm. etw. anvertrauen
  • entrust sth. to sb.
  • jdm. etw. befehlen
  • tell sb. to do sth.
  • jdn. etw. heißen
  • call sb. names
  • jdn. beschimpfen
  • abuse sb.
  • jdn. traktieren
  • pester sb.
  • jdn. belästigen
  • accost sb.
  • jdn. ansprechen
  • address oneself to sb.
  • sich an jdn. wenden
  • approach
  • erreichen
  • hit
  • Treffer
  • direct hit
  • Volltreffer
  • bullseye
  • Hahnenfuß-ähnlicher Wassernabel
  • pennywort
  • Mauer-Zimbelkraut
  • Aaron's beard
  • Großkelchiges Johanniskraut
  • Jerusalem star
  • Austernpflanze
  • goatsbeard
  • Geißbart
  • goatee
  • Ziegenbart
  • buckhorn plantain
  • Breitwegerich / Breit-Wegerich
  • birdseed
  • Acker-Senf / Ackersenf
  • yellows
  • Gelbsucht
  • icterus
  • Icterus

Pretty neat!

So what next?

I could try to obtain an even longer chain by forgetting whether a word is English or German (and lower-casing everything), thus allowing wild jumps like hat – hut – hütte – lodge.

Or write a tool where you can enter two arbitrary words and it finds such a path between them, if there exists one. Unfortunately, it seems that the terms of the dict.cc data dump would not allow me to create such a tool as a web site (but maybe I can ask).

Or I could throw in additional languages!

What would you do?

Update (2018-06-17):

I ran the code again, this time lower-casing all words, and allowing false-friend translations, as suggested above. The resulting graph has – surprisingly – precisely the same diameter (55), but with a partly different list:

  • peyote
  • peyote
  • mescal
  • meskal
  • mezcal
  • blaue agave
  • maguey
  • amerikanische agave
  • american agave
  • jahrhundertpflanze
  • century plant
  • fächerlilie
  • tumbleweed
  • weißer fuchsschwanz
  • common tumbleweed
  • acker-fuchsschwanz / ackerfuchsschwanz
  • rough pigweed
  • grünähriger fuchsschwanz
  • love-lies-bleeding
  • stiefmütterchen
  • pansy
  • schwuchtel
  • fruit
  • ertrag
  • gain
  • vorgehen
  • approach
  • sich an jdn. wenden
  • address oneself to sb.
  • jdn. ansprechen
  • accost sb.
  • jdn. belästigen
  • pester sb.
  • jdn. traktieren
  • abuse sb.
  • jdn. beschimpfen
  • call sb. names
  • jdn. etw. heißen
  • tell sb. to do sth.
  • jdm. etw. befehlen
  • entrust sth. to sb.
  • jdm. etw. anvertrauen
  • entrust sb. with sth.
  • jdn. mit etw. [dat.] betrauen
  • charge sb. with sth.
  • jdm. etw. zur last legen
  • hold sth. against sb.
  • jdm. etw. nachtragen
  • bear sb. a grudge
  • jdm. grollen
  • have a grudge against sb.
  • jdm. böse sein
  • be angry with sb.
  • auf jdn. böse sein
  • be mad at sb.
  • auf jdn. einen (dicken) hals haben

Note that there is not actually a false friend in this list – it seems that adding the edges just changed the order of edges in the graph representation, and my code just happened to find a different diameter.

by Joachim Breitner (mail@joachim-breitner.de) at May 23, 2018 06:35 AM

May 22, 2018

Brent Yorgey

Ertugrul Söylemez: 1985-2018

I do not know many details, but IRC user mupf joined the #haskell-ops channel yesterday to tell us that Ertugrul Söylemez died unexpectedly on May 12. This is a great loss for the Haskell community. Going variously by mm_freak, ertes, ertesx or ertes-w, Ertugrul has been very active on the #haskell IRC channel for almost 11 years—since June 2007—and he will be remembered as a knowledgeable, generous, and patient teacher. In addition to teaching on IRC, he wrote several tutorials, such as this one on foldr and this one on monoids. The thing Ertugrul was probably most well-known for in the Haskell community was his netwire FRP package, which he had developed since 2011, and more recently its successor package wires. I think it is no exaggeration to say that his work on and insights into FRP helped shape a whole generation of approaches.

He will be greatly missed. I will leave the comments open for anyone to share memories of Ertugrul, his writing, or his code.

by Brent at May 22, 2018 02:02 AM

Gabriel Gonzalez

How I evaluate Haskell packages


This post summarizes the rough heuristics that I use for evaluating Haskell packages. Usually I use these rules of thumb when:

  • choosing between multiple similar packages
  • deciding whether to depend on a package at all

Some of these guidelines work for other programming languages, too, but some of them are unique to Haskell. Even if you are a veteran programmer in another language you still might find some useful tidbits here when approaching the Haskell ecosystem for the first time.

I've organized the metrics I use into five categories:

  • "Large positive" - Things that impress me
  • "Small positive" - Positive features that make me more likely to adopt
  • "Irrelevant" - Things I never pay attention to when judging package quality
  • "Small negative" - Things that turn me off
  • "Large negative" - Huge red flags

Large positive

  • High quality documentation

    Excellent documentation is one of the strongest signals of maintainer commitment to a package. Extensive documentation is very expensive to keep up to date with a package because documentation is difficult to machine check in general.

    You can definitely have the build check that type signatures or doctests are correct, but anything more elaborate usually requires human intervention to keep up to date.

    Disclaimer: I'm biased here because I'm a huge proponent of documenting things well.

  • Impossible to use incorrectly

    One of the nice features of Haskell is that you can use the type system to make impossible states unrepresentable or enforce invariants to guarantee proper use. I strongly prefer libraries that take advantage of this to rule out errors at compile time so that I can sleep soundly at night knowing that my work built on top of a solid foundation.

  • On Stackage

    All packages on Stackage are guaranteed to build and have a known good build plan (called a Stackage "resolver") that both stack and cabal can reuse. All packages on Stackage "play nice" with each other, which greatly reduces the total cost of ownership for depending on these Haskell packages.

    Being on Stackage is also a good signal that the package is maintained by somebody because packages require some upkeep to ensure that they continue to build against the latest version of other packages.

  • Competitive Benchmarks

    I strongly prefer packages that post benchmarks comparing their package to their competitors. Such a comparison implies that the author went to the trouble of using and understanding alternative packages and using them idiomatically. That perspective often improves the quality and design of their own package.

    Also, competitive benchmarks are rare and require significant upkeep, so they double as strong signal of maintainer commitment.

  • 100+ reverse dependencies

    You can check the packages that depend on a given package (a.k.a. "reverse dependencies") using the following URL:

    https://packdeps.haskellers.com/reverse/${package-name}

    Packages that have 100 or more reverse dependencies have substantial buy-in from the Haskell ecosystem and are safe to depend on. You're also likely to depend on these packages anyway through your transitive dependencies.

    (Suggested by: Tim Humphries)

Small positive

  • One complete example

    This means that the package has one "end-to-end" example showing how all the pieces fit together. I prefer packages that do this because they don't force me to waste a large amount of time reinventing what the author could have communicated in a much shorter amount of time.

    You would be surprised how many Haskell packages have extensive Haddock coverage but still lack that one overarching example showing how to use the package.

  • 100+ downloads in last 30 days

    Download statistics are not a very strong signal of activity on Hackage since most Haskell build tools (i.e. cabal, stack, Nix) will cache dependencies. These tools also don't require frequent cache invalidations like some other build tools which shall not be named.

    However, having 100+ downloads in the last 30 days is a reasonable signal that enough people were recently able to get the package working. That means that if anything goes wrong then I assume that there must be some solution or workaround.

  • Maintained/used/endorsed by a company

    I think this is a pretty obvious signal. Similar to download metrics, this implies that somebody was able to extract something commercially valuable from this package.

  • Pure or mostly pure API

    Packages that provide pure APIs are easier to test and easier to integrate into a code base. This reduces the total cost of ownership for that dependency.

  • Upper bounds on dependencies that are up to date

    I'm not going to take a strong stance on the "great PVP war" in this post. All I will note is that if a package has upper bounds that are up to date then that is a good sign that the package is actively maintained (whether or not you agree that upper bounds are a good idea).

    In other words, upper bounds are one way that a package author can "virtue signal" that they have the time and resources to maintain a package.

Irrelevant

  • On Hackage

    Many people new to the Haskell ecosystem assume that there is some sort of quality bar to be on Hackage just because Hackage is not GitHub. There is no such quality bar; anybody can obtain a Hackage account and upload their packages (even joke packages).

  • Stability field

    The stability field of packages fell out of fashion years ago. Also, in my experience people are poor judges of whether or not their own packages will be stable. I prefer to look at the frequency of releases and how often they are breaking changes as empirical evidence of a package's stability.

  • Version number

    First off, make sure you understand PVP. The key bit is that the first two components of a version number represent the major version of a package. That means that if the package goes from version 1.0 to 1.1 then that signals just as much of a breaking change to the API as if the package went from version 1.0 to 2.0. The usual rule of thumb for interpreting version numbers is:

    X.Y.Z.A
    ↑ ↑ ↑ ↑
    | | | |
    | | | Non-breaking change that is invisible (i.e. documentation or refactor)
    | | | Note: Not all packages use a fourth number in their versions
    | | |
    | | Non-breaking change
    | |
    | Breaking change without fanfare
    |
    Breaking change with great fanfare

    The absolute version number basically does not matter. There are many widely used and mature packages on Hackage that are still version 0.*. This can be jarring coming from another language ecosystem where people expect a mature package to be at least version 1.0.

  • Votes

    Hackage 2 supports voting on packages but I've never consulted a package's votes when deciding whether or not to adopt. I prefer quality signals that cannot be easily faked or gamed.

  • Tests

    I've never checked whether or not I'm using a well-tested package. Maybe that's not a good idea, but I'm being honest about how I evaluate packages.

    This probably comes as a surprise to a person coming to Haskell from another language ecosystem, especially a dynamically typed one that relies heavily on tests to ensure quality. Haskell programmers lean much more on the type system to enforce and audit quality. For example, type signatures show me at a glance if a package is overusing side effects or deals with too many states or corner cases that I need to account for.

Small negative

  • Upper bounds that are not up to date

    The other nice thing about adding upper bounds to a package is that they also accurately signal when a package is no longer maintained. They require constant "gardening" to keep up to date so when an author abandons a package they naturally transform from a positive signal into a negative signal simply by virtue of inaction.

  • Large dependency footprint

    This used to be a much bigger issue before the advent of Stackage. People would aggressively trim their dependency trees out of fear that one bad dependency would cause them to waste hours finding a valid build plan.

    I still spend some time these days trying to minimize my dependency footprint, mainly so that I don't need to respond as often to breaking changes in my immediate dependencies. However, this is a relatively minor concern in comparison to how bad things used to be.

  • Uses linked lists inappropriately

    The #1 source of Haskell performance degradation is keeping too many objects on the heap, and linked lists trash your heap if you materialize the entire list into memory (as opposed to streaming over the list so that you only materialize at most one element at any time).

    I try to avoid dependencies that use linked lists unless they are the optimal data structure because I will be paying the price for those dependencies every time my program runs a garbage collection.

  • Frequent breaking API changes

    Breaking API changes are not really a big deal in Haskell due to the strong type system. They are mainly a minor maintenance burden: if you depend on a library that releases breaking changes all the time, you usually only need to spend a few minutes updating your package to build against the latest API.

    This matters more if you maintain a large number of packages or if you have a huge number of dependencies, but if you don't then this is a relatively small concern.

  • Idiosyncratic behavior

    "Idiosyncratic" means that the package author does something weird unique to them. Some examples include using a private utility library or custom Prelude that nobody else uses or naming all of their classes C and all their types T (you know who you are).

    When I see this I assume that the package author has no interest in building consensus or being a "well-behaved" member of the Haskell ecosystem. That sort of package is less likely to respond or adapt to user feedback.

Large negative

  • Uses String inappropriately

    This is a special case of "Uses linked lists inappropriately" since the default String type in Haskell is stored as a linked list of characters.

    Strings tend to be rather large linked lists of characters that are commonly fully materialized. This implies that even modest use of them will add lots of objects to the heap and tax the garbage collector.

  • No documentation

    If I see no documentation on a package I "nope" right out of there. I assume that if a maintainer does not have time to document the package then they do not have time to address user feedback.

  • Not on Hackage

    I assume that if a package is not on Hackage that the maintainer has no interest in supporting or maintaining the package. The bar for adding a package to Hackage is so low that the absence of a package on Hackage is a red flag.

    Note that this does not imply that packages on Hackage will be supported. There are packages on Hackage where the author does not intend to support the package.

    Note: a reader pointed out that this is confusing given that I treat "on Hackage" as "Irrelevant". Perhaps another way to phrase this is that being on Hackage is a necessary but not sufficient signal of maintainer support.

  • Maintainer ignoring pull requests

    This is obviously the clearest sign that a maintainer does not respond to user feedback. There are few things that upset me more than a maintainer who completely ignores volunteers who submit pull requests to fix bugs.

Conclusion

Hopefully that gives you some quick and dirty heuristics you can use when evaluating Haskell packages for the first time. If you think that I missed something important just let me know and I'll update this post.


by Gabriel Gonzalez (noreply@blogger.com) at May 22, 2018 12:56 AM

May 21, 2018

Functional Jobs

Frontend Engineer at Trivium Real Estate (Full-time)

TriviumRE is an investor-backed, data-mining startup for the commercial real estate industry; we're making data entry jobs less painful. Our MVP is moments away from launching and customers are queuing to take it for a spin. As such, we're growing our team to tackle the challenges ahead.

What about the team? We place a strong emphasis on continual learning. If you're not learning something new on the job, it's time for a new one. Our tech stack reflects this; Haskell and Elm are the main languages. We don't mind if it's your first exposure to either. We're not afraid of experimenting or making mistakes. The most important quality of anyone on our team is their ability to learn and teach.

Since we're still young, working with us means you have a lot of influence in shaping the culture and direction of the company. You'll also have the chance to grow your skill set faster than you would elsewhere.

The product: We're building a data-mining tool that can explicitly learn a relational data model based on sample inputs. It radically improves data entry and data cleaning for financial analysts. Customers have loved our demos, and we're understandably coy in our public descriptions.

Front-End Engineer

This is for someone who:

  • is comfortable in functional programming, especially Elm; and
  • is experienced shipping front-end software to production; and
  • has empathy for users and loves friendly UX; and
  • has an eagerness to learn and willingness to share knowledge.

Benefits

  • £50-70k DOE. + Equity.
  • Highly intelligent, supportive team.
  • 20+ holidays per year.
  • Support for conferences & other tech related education.

A Typical Day

  • Start off writing a feature that requires elm-ports. These can be tricky, so you pair with another developer who has shipped such code before.

  • After lunch you're requested to review a Pull Request that fixes a bug you accidentally introduced a couple weeks back. There's no blame, instead some discussion about how our tests and review process could have caught this earlier.

  • The afternoon is spent with the Product guys around a whiteboard. You're helping them sketch out a realistic wire-frame for a feature we plan to deliver in a couple of weeks.

Some Tech we like:

We're more interested in a willingness and ability to learn than what you currently know. But in case you're interested in the stack we're using:

  • Haskell
  • Elm
  • PostgreSQL
  • Javascript
  • Python / scikit-learn
  • Tensorflow / Deep learning
  • AWS
  • CircleCI
  • Automated Testing

Get information on how to apply for this position.

May 21, 2018 04:42 PM

May 19, 2018

Brent Yorgey

Future of Coding podcast interview

I recently did an interview with my friend and former student Steve Krouse for his Future of Coding podcast. I think I still agree with almost everything I said, and I didn’t say anything too embarrassing, so if you want to listen to it I won’t stop you! We chatted about teaching and learning, functional programming and type systems, how to read academic papers, and a bunch of other things. You can listen to it here.

by Brent at May 19, 2018 03:16 PM