Planet Haskell

November 07, 2024

Donnacha Oisín Kidney

POPL Paper—Algebraic Effects Meet Hoare Logic in Cubical Agda

Posted on November 7, 2024
New paper: “Algebraic Effects Meet Hoare Logic in Cubical Agda”, by myself, Zhixuan Yang, and Nicolas Wu, will be published at POPL 2024.

Zhixuan has a nice summary of it here.

The preprint is available here.

by Donnacha Oisín Kidney at November 07, 2024 12:00 AM

November 06, 2024

Well-Typed.Com

The Haskell Unfolder Episode 35: distributive and representable functors

Today, 2024-11-06, at 1930 UTC (11:30 am PST, 2:30 pm EST, 7:30 pm GMT, 20:30 CET, …) we are streaming the 35th episode of the Haskell Unfolder live on YouTube.

We’re going to look at two somewhat more exotic type classes in the Haskell library ecosystem: Distributive and Representable. The former allows you to distribute one functor over another, the latter provides you with a notion of an index to access the elements. As an example, we’ll return once more to the grids used in Episodes 32 and 33 to describe the tic-tac-toe game, and we’ll see how some operations we used can be made more elegant in terms of these type classes. This episode is, however, self-contained; having seen the previous episodes is not required.
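As a rough illustration of the two ideas, here is a minimal sketch for a single three-cell row. It deliberately does not use the real Distributive and Representable classes; Three, Ix, tabulateThree, indexThree, distributeThree and transpose3 are hypothetical local stand-ins, and the episode's own grid code may well look different.

```haskell
-- A three-cell row, e.g. one row of a tic-tac-toe grid.
data Three a = Three a a a
  deriving (Eq, Show)

instance Functor Three where
  fmap f (Three a b c) = Three (f a) (f b) (f c)

-- The "index" type: exactly one value per position in a Three.
data Ix = I0 | I1 | I2
  deriving (Eq, Show)

-- Stand-in for the Representable idea: build a row from an index function...
tabulateThree :: (Ix -> a) -> Three a
tabulateThree f = Three (f I0) (f I1) (f I2)

-- ...and look up a cell by its index.
indexThree :: Three a -> Ix -> a
indexThree (Three a _ _) I0 = a
indexThree (Three _ b _) I1 = b
indexThree (Three _ _ c) I2 = c

-- Stand-in for the Distributive idea: pull Three outside of any Functor.
distributeThree :: Functor f => f (Three a) -> Three (f a)
distributeThree fa = tabulateThree (\i -> fmap (`indexThree` i) fa)

-- Distributing a row of rows swaps rows and columns.
transpose3 :: Three (Three a) -> Three (Three a)
transpose3 = distributeThree

grid :: Three (Three Int)
grid = Three (Three 1 2 3) (Three 4 5 6) (Three 7 8 9)

main :: IO ()
main = print (transpose3 grid)
```

Under these assumptions, distributing Three over itself is just grid transposition, which is the kind of operation that becomes a one-liner once the index structure is available.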

About the Haskell Unfolder

The Haskell Unfolder is a YouTube series about all things Haskell hosted by Edsko de Vries and Andres Löh, with episodes appearing approximately every two weeks. All episodes are live-streamed, and we try to respond to audience questions. All episodes are also available as recordings afterwards.

We have a GitHub repository with code samples from the episodes.

And we have a public Google calendar (also available as ICal) listing the planned schedule.

There’s now also a web shop where you can buy t-shirts and mugs (and potentially in the future other items) with the Haskell Unfolder logo.

by andres, edsko at November 06, 2024 12:00 AM

November 05, 2024

Jeremy Gibbons

Alan Jeffrey, 1967–2024

My friend Alan Jeffrey passed away earlier this year. I described his professional life at a Celebration in Oxford on 2nd November 2024. This post is a slightly revised version of what I said.

Edinburgh, 1983–1987

I’ve known Alan for over 40 years—my longest-standing friend. We met at the University of Edinburgh in 1983, officially as computer science freshers together, but really through the clubs for science fiction and for role-playing games. Alan was only 16: like many in Scotland, he skipped the final school year for an earlier start at university. It surely helped that his school had no computers, so he wasted no time in transferring to a university that did. His brother David says that it also helped that he would then be able to get into the student union bars.

Oxford, 1987–1991

After Edinburgh, Alan and I wound up together again as freshers at the University of Oxford. We didn’t coordinate this; we independently and simultaneously applied to the same DPhil programme (Oxford’s name for the PhD). We were officemates for those 4 years, and shared a terraced hovel on St Mary’s Road in bohemian East Oxford with three other students for most of that time. He was clever, funny, kind, and serially passionate about all sorts of things. It was a privilege and a pleasure to have known him.

Alan had a career that spanned academia and industry, and he excelled at both. He described himself as a “semanticist”: using mathematics instead of English for precise descriptions of programming languages. He had already set out in that direction with his undergraduate project on concurrency under Robin Milner at Edinburgh; and he continued to work on concurrency for his DPhil under Bill Roscoe at Oxford, graduating in 1992.

Chalmers, 1991–1992

Alan spent the last year of his DPhil as a postdoc working for K V S Prasad at Chalmers University in Sweden. While there, he was assigned to host fellow Edinburgh alumnus Carolyn Brown visiting for an interview; Carolyn came bearing a bottle of malt whisky, as one does, which she and Alan proceeded to polish off together that evening.

Sussex, 1992–1999

Carolyn’s interview was successful; but by the time she arrived at Chalmers, Alan had left for a second postdoc under Matthew Hennessy at the University of Sussex. They worked together again when Carolyn was in turn hired as a lecturer at Sussex. In particular, they showed in 1994 that “string diagrams”—due to Roger Penrose and Richard Feynman in physics—provide a “fully abstract” calculus for hardware circuits, meaning that everything true of the diagrams is true of the hardware, and vice versa. This work foreshadowed a hot topic in the field of Applied Category Theory today.

Matthew essentially left Alan to his own devices: as Matthew put it, “something I was very happy with as he was an exceptional researcher”. Alan was soon promoted to a lectureship himself. He collaborated closely with Julian Rathke, then Matthew’s PhD student and later postdoc, on the Full Abstraction Factory project, developing a bunch more full abstraction results for concurrent and object-oriented languages. That fruitful collaboration continued even after Alan left Sussex.

DePaul, 1999–2004

Alan presented a paper A Fully Abstract Semantics for a Nondeterministic Functional Language with Monadic Types at the 1995 conference on Mathematical Foundations of Programming Semantics in New Orleans. I believe that this is where he met Karen Bernstein, who also had a paper. One thing led to another, and Alan took a one-year visiting position at DePaul University in Chicago in 1998, then formally left Sussex in 1999 for a regular Associate Professor position at DePaul. He lived in Chicago for the rest of his life.

Alan established the Foundations of Programming Languages research group at DePaul, attracting Radha Jagadeesan from Loyola, James Riely from Sussex, and Corin Pitcher from Oxford, working among other things on “relaxed memory”—modern processors don’t actually read and write their multiple levels of memory in the order you tell them to, when they can find quicker ways to do things concurrently and sometimes out of order.

James remembers showing Alan his first paper on relaxed memory, co-authored with Radha. Alan thought their approach was an “appalling idea”; the proper way was to use “event structures”, an idea from the 1980s. This turned in 2016 into a co-authored paper at LICS (Alan’s favourite conference), and what James considers his own best ever talk—an on-stage reenactment of the to and fro of their collaboration, sadly not recorded for posterity.

James was Alan’s most frequent collaborator over the years, with 14 joint papers. Their modus operandi was that, having identified a problem together, Alan would go off by himself and do some Alanny things, eventually coalescing on a solution, and choose an order of exposition, tight and coherent; this is about 40% of the life of the paper. But then there are various tweaks, extensions, corrections… Alan would never look at the paper again, and would be surprised years later to learn what was actually in it. However, Alan was always easy to work with: interested only in the truth, although it must be beautiful. He had a curious mix of modesty and egocentricity: always convinced he was right (and usually right that he was right). Still, he had no patience for boring stuff, especially university admin.

While at DePaul, Alan also had a significant collaboration with Andy Gordon from Microsoft on verifying security protocols. Their 2001 paper Authenticity by Typing for Security Protocols won a Test Of Time Award this year at the Symposium on Computer Security Foundations, “given to outstanding papers with enduring significance and impact”—recognition which happily Alan lived to see.

Bell Labs, 2004–2015

After the dot com crash in 2000, things got more difficult at DePaul, and Alan left in 2004 for Bell Labs, nominally as a member of technical staff in Naperville but actually part of a security group based at HQ in Murray Hill NJ. He worked on XPath, “a modal logic for XML”, with Michael Benedikt, now my databases colleague at Oxford. They bonded because only Alan and Michael lived in Chicago rather than the suburbs. Michael had shown Alan a recent award-winning paper in which Alan quickly spotted an error in a proof—an “obvious” and unproven lemma that turned out to be false—which led to their first paper together.

(A recurring pattern. Andy Gordon described Alan’s “uncanny ability to find bugs in arguments”: he found a type unsoundness bug in a released draft specification for Java, and ended up joining the standards committee to help fix it. And as a PhD examiner he “shockingly” found a subtle bug that unpicked the argument of half of the dissertation, necessitating major corrections: it took a brave student to invite Alan as examiner—or a very confident one.)

Michael describes Alan as an “awesome developer”. They once had an intern; it didn’t take long after the intern had left for Alan to discard the intern’s code and rewrite it from scratch. Alan was unusual in being able to combine Euro “abstract nonsense” and US engineering. Glenn Bruns, another Bell Labs colleague, said that “I think Alan was the only person I’ve met who could do theory and also low-level hackery”.

At Bell Labs Alan also worked with Peter Danielsen on the Web InterFace Language, WIFL for short: a way of embedding API descriptions in HTML. Peter recalls: “We spent a few months working together on the conceptual model. In the early stages of software development, however, Alan looked at what I’d written and said, “I wouldn’t do it that way at all!”, throwing it all away and starting over. The result was much better; and he inadvertently taught me a new way to think in JavaScript (including putting //Sigh… comments before unavoidable tedious code.)”

Mozilla Research, 2015–2020

The Bell Labs group dissolved in 2015, and Alan moved to Mozilla Research as a staff research engineer to work on Servo, a new web rendering engine in the under-construction programming language Rust.

For one of Alan’s projects at Mozilla, he took a highly under-specified part of the HTML specification about how web links and the back and forwards browser buttons should interact, created a formal model in Agda based on the existing specification, identified gaps in it as well as ways that major browsers did not match the model, then wrote it all up as a paper. Alan’s manager Josh Matthews recalls the editors of the HTML standard being taken aback by Alan’s work suddenly being dropped in their laps, but quickly appreciating how much more confidently they could make changes based on it.

Josh also recalled: “Similarly, any time other members of the team would talk about some aspect of the browser engine being safe as long as some property was upheld by the programmer, Alan would get twitchy. He had a soft spot for bad situations that looked like they could be made impossible by clever enough application of static types.”

In 2017 Alan made a rather surprising switch to working on augmented reality for the web, partly driven by internal politics at Mozilla. He took the lead on integrating Servo into the Magic Leap headset; the existing browser was clunky, the only way to interact with pages being via an awkward touchpad on the hand controller. This was not good enough for Alan: after implementing the same behaviour for Servo and finding it frustrating, he had several email exchanges with the Magic Leap developers, figured out how to access some interfaces that weren’t technically public but also were not actually private, and soon he proudly showed off a more natural laser pointer-style means of interacting with pages in augmented reality in Servo—to much acclaim from the team and testers.

Roblox, 2020–2024

Then in 2020, Mozilla’s funding stream got a lot more constrained, and Alan moved to the game platform company Roblox. Alan was a principal software engineer, and the language owner of Luau, “a fast, small, safe, gradually typed embeddable scripting language derived from Lua”, working on making the language easier to use, teach, and learn. Roblox supports more than two million “content creators”, mostly kids, creating millions of games a year; Alan’s goal was to empower them to build larger games with more characters.

The Luau product manager Bryan Nealer says that “people loved Alan”. Roblox colleagues appreciated his technical contributions: “Alan was meticulous in what he built and wrote at Roblox. He would stress not only the substance of his work, but also the presentation. His attention to detail inspired the rest of us!”; “One of the many wonderful things Alan did for us was to be the guy who could read the most abstruse academic research imaginable and translate it into something simple, useful, interesting, and even fun.” They also appreciated the more personal contributions: Alan led an internal paper reading group, meeting monthly to study some paper on programming or networking, but he also established the Roblox Book Club: “He was always thoughtful when discussing books, and challenged us to think about the text more deeply. He also had an encyclopedic knowledge of scifi. He recommended Iain M. Banks’s The Culture series to me, which has become my favorite scifi series. I think about him every time I pick up one of those books.”

Envoie

From my own perspective, one of the most impressive things about Alan is that he was impossible to pigeonhole: like Dr Who, he was continually regenerating. He explained to me that he got bored quickly with one area, and moved on to another. As well as his academic abilities, he was a talented and natural cartoonist: I still have a couple of the tiny fanzine comics he produced as a student.

Of course he did some serious science for his DPhil and later career: but he also took a strong interest in typography and typesetting. He digitized some beautiful Japanese crests for the chapter title pages of his DPhil dissertation. Alan dragged me into typography with him, a distraction I have enjoyed ever since. Among other projects, Alan and I produced a font containing some extra symbols so that we could use them in our papers, and named it St Mary’s Road after our Oxford digs. And Alan produced a full blackboard bold font, complete with lowercase letters and punctuation: you can see some of it in the order of service. But Alan was not satisfied with merely creating these things; he went to all the trouble to package them up properly and get them included in standard software distributions, so that they would be available for everyone: Alan loved to build things for people to use. These two fonts are still in regular use 35 years later, and I’m sure they will be reminding us of him for a long time to come.

by jeremygibbons at November 05, 2024 01:13 PM

GHC Developer Blog

GHC 9.12.1-alpha2 is now available

Zubin Duggal - 2024-11-05

The GHC developers are very pleased to announce the availability of the second alpha release of GHC 9.12.1. Binary distributions, source distributions, and documentation are available at downloads.haskell.org.

We hope to have this release available via ghcup shortly.

GHC 9.12 will bring a number of new features and improvements, including:

  • The new language extension OrPatterns allowing you to combine multiple pattern clauses into one.

  • The MultilineStrings language extension to allow you to more easily write strings spanning multiple lines in your source code.

  • Improvements to the OverloadedRecordDot extension, allowing the built-in HasField class to be used for records with fields of non-lifted representations.

  • The NamedDefaults language extension has been introduced allowing you to define defaults for typeclasses other than Num.

  • More deterministic object code output, controlled by the -fobject-determinism flag, which substantially improves build determinism (though not completely) at the cost of some compiler performance (1-2%). See #12935 for the details.

  • GHC now accepts type syntax in expressions as part of GHC Proposal #281.

  • The WASM backend now has support for TemplateHaskell.

  • … and many more

A full accounting of changes can be found in the release notes. As always, GHC’s release status, including planned future releases, can be found on the GHC Wiki status.

We would like to thank GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, the Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

by ghc-devs at November 05, 2024 12:00 AM

November 04, 2024

in Code

Functors to Monads: A Story of Shapes

For many years now I’ve been using a mental model and intuition that has guided me well for understanding and teaching and using functors, applicatives, monads, and other related Haskell abstractions, as well as for approaching learning new ones. Sometimes when teaching Haskell I talk about this concept and assume everyone already has heard it, but I realize that it’s something universal yet easy to miss depending on how you’re learning it. So, here it is: how I understand the Functor and other related abstractions and free constructions in Haskell.

The crux is this: instead of thinking about what fmap changes, ask: what does fmap keep constant?

This isn’t a rigorous understanding and isn’t going to explain every aspect about every Functor, and will probably only be useful if you already know a little bit about Functors in Haskell. But it’s a nice intuition trick that has yet to majorly mislead me.

The Secret of Functors

First of all, what is a Functor? A capital-F Functor, that is, the Haskell typeclass and abstraction. Ask a random Haskeller on the street and they’ll tell you that it’s something that can be “mapped over”, like a list or an optional. Maybe some of those random Haskellers will feel compelled to mention that this mapping should follow some laws…they might even list the laws. Ask them why these laws are so important and maybe you’ll spend a bit of time on this rhetorical street of Haskellers before finding one confident enough to give an answer.

So I’m going to make a bit of a tautological leap: a Functor gives you a way to “map over” values in a way that preserves shape. And what is “shape”? A shape is the thing that fmap preserves.

The Functor typeclass is simple enough: for Functor f, you have a function fmap :: (a -> b) -> f a -> f b, along with fmap id = id and fmap f . fmap g = fmap (f . g). Cute things you can drop into QuickCheck to test for your instance, but it seems like those laws are hiding some sort of deeper, fundamental truth.

The more Functors you learn about, the more you see that fmap seems to always preserve “something”:

  • For lists, fmap preserves length and relative orderings.
  • For optionals (Maybe), fmap preserves presence (the fact that something is there or not). It cannot flip a Just to a Nothing or vice versa.
  • For Either e, fmap preserves the error (if it exists) or the fact that it was successful.
  • For Map k, fmap preserves the keys: which keys exist, how many there are, their relative orderings, etc.
  • For IO, fmap preserves the IO effect. Every bit of external I/O that an IO action represents is unchanged by an fmap, as well as exceptions.
  • For Writer w or (,) w, fmap preserves the “logged” w value, leaving it unchanged. Same for Const w.
  • For Tree, fmap preserves the tree structure: how many layers, how big they are, how deep they are, etc.
  • For State s, fmap preserves what happens to the input state s. How a State s transforms a state value s is unchanged by fmap.
  • For ConduitT i o m from conduit, fmap preserves what the conduit pulls upstream and what it yields downstream. fmap will not cause the conduit to yield more or different objects, nor cause it to consume/pull more or less.
  • For parser-combinator Parser, fmap preserves what input is consumed or would fail to be consumed. fmap cannot change whether an input string would fail or succeed, and it cannot change how much it consumes.
  • For optparse-applicative Parsers, fmap preserves the command line arguments available. It leaves the --help message of your program unchanged.
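A few of the simpler entries in this list can be checked directly. The following is only a minimal sketch on hand-picked sample values (not a proof, and not a QuickCheck suite); the names listShapeKept, maybeShapeKept and mapShapeKept are made up for illustration.

```haskell
import qualified Data.Map as Map
import Data.Maybe (isJust)

-- Lists: fmap keeps the length.
listShapeKept :: Bool
listShapeKept = length (fmap show xs) == length xs
  where
    xs = [1, 2, 3] :: [Int]

-- Maybe: fmap keeps presence; a Just stays a Just, a Nothing stays a Nothing.
maybeShapeKept :: Bool
maybeShapeKept =
  isJust (fmap (+ 1) present) == isJust present
    && isJust (fmap (+ 1) absent) == isJust absent
  where
    present = Just 41 :: Maybe Int
    absent = Nothing :: Maybe Int

-- Map k: fmap keeps exactly the same keys, in the same order.
mapShapeKept :: Bool
mapShapeKept = Map.keys (fmap negate m) == Map.keys m
  where
    m = Map.fromList [("a", 1), ("b", 2)] :: Map.Map String Int

main :: IO ()
main = print (listShapeKept, maybeShapeKept, mapShapeKept)
```

In each case the check compares some observable "shape" (length, isJust, keys) before and after fmap, without ever looking at the mapped values themselves.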

It seems like as soon as you define a Functor instance, or as soon as you find out that some type has a Functor instance, it magically induces some sort of … “thing” that must be preserved.1 A conserved quantity must exist. It reminds me a bit of Noether’s Theorem in Physics, where any continuous symmetry “induces” a conserved quantity (like how translation symmetry “causes” conservation of momentum). In Haskell, every lawful Functor instance induces a conserved quantity. I don’t know if there is a canonical name for this conserved quantity, but I like to call it “shape”.

A Story of Shapes

The word “shape” is chosen to be as devoid of external baggage/meaning as possible while still having some. The word isn’t important as much as saying that there is some “thing” preserved by fmap, and not exactly the nature of that “thing”. The nature of that thing changes a lot from Functor to Functor, where we might better call it an “effect” or a “structure” specifically, but that some “thing” exists is almost universal.

Of course, the value of this “thing” having a canonical name at all is debatable. If I were to coin a completely new term I might call it a “conserved charge” or “gauge” in allusion to physics. But the most useful name probably would be shape.

For some Functor instances, the word shape is more literal than others. For trees, for instance, you have the literal shape of the tree preserved. For lists, the “length” could be considered a literal shape. Map k’s shape is also fairly literal: it describes the structure of keys that exist in the map. But for Writer w and Const w, shape can be interpreted as some information outside of the values you are mapping that is left unchanged by mapping. For Maybe and Either e shape also considers if there has been any short-circuiting. For State s and IO and Parser, “shape” involves some sort of side-computation or consumption that is left unchanged by fmap, often called an effect. For optparse-applicative, “shape” involves some sort of inspectable and observable static aspects of a program. “Shape” comes in all forms.

But, this intuition of “looking for that conserved quantity” is very helpful for learning new Functors. If you stumble onto a new type that you know is a Functor instance, you can immediately ask “What shape is this fmap preserving?”, and it will almost always yield insight into that type.

This viewpoint also sheds insight onto why Set.map isn’t a good candidate for fmap for Data.Set: What “thing” does Set.map f preserve? Not size, for sure. In a hypothetical world where we had ordfmap :: Ord b => (a -> b) -> f a -> f b, we would still need Set.map to preserve something for it to be useful as an “Ord-restricted Functor”.2
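To see the problem concretely, here is a small sketch (the names original and collapsed are just for illustration) where Set.map merges elements and so changes the size:

```haskell
import qualified Data.Set as Set

original :: Set.Set Int
original = Set.fromList [1, 2, 3, 4]

-- Halving sends both 2 and 3 to 1, so two elements collapse into one.
collapsed :: Set.Set Int
collapsed = Set.map (`div` 2) original

main :: IO ()
main = do
  print (Set.size original)    -- 4
  print (Set.toList collapsed) -- [0,1,2]
  print (Set.size collapsed)   -- 3
```

Whatever candidate "shape" we might pick for a set, its size cannot be part of it, because the function being mapped gets to decide how many elements survive.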

A Result

Before we move on, let’s look at another related and vague concept that is commonly used when discussing functors: fmap is a way to map a function that preserves the shape and changes the result.

If shape is the thing that is preserved by fmap, result is the thing that is changed by it. fmap cleanly splits the two.

Interestingly, most introductions to Functors begin with describing functor values as having a result and fmap as the thing that changes it, in some way. Ironically, though it’s a more common term, it’s by far the more vague and hard-to-intuit concept.

For something like Maybe, “result” is easy enough: it’s the value present if it exists. For parser-combinator Parsers too it’s relatively simple: the “shape” is the input consumed but the “result” is the Haskell value you get as a result of the consumption. For optparse-applicative parsers, it’s the actual parsed command line arguments given by the user at runtime. But sometimes it’s more complicated: for the technical List functor, the “non-determinism” functor, the “shape” is the number of options to choose from and the order you get them in, and the “result” (to use precise semantics) is the non-deterministic choice that you eventually pick or iterate over.

So the “result” can become a bit confusing to generalize. In my mind, I usually reduce the definitions to:

  • Shape: the “thing” that fmap preserves: the f in f a
  • Result: the “thing” that fmap changes: the a in f a

With this you could “derive” the Functor laws:

  • fmap id == id: fmap leaves the shape unchanged, id leaves the result unchanged. So the entire thing must remain unchanged!
  • fmap f . fmap g == fmap (f . g). In both cases the shape remains unchanged, but one changes the result by f after g, and the other changes the result by f . g. They must be the same transformation!
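As a sanity check of this reading, here is a hedged sketch that spot-checks both laws on one sample list, next to a deliberately broken mapper. badMap is a hypothetical function, not anything from a library: it changes the relative ordering (part of a list's shape) while mapping, and the identity law catches it.

```haskell
-- A hypothetical "shape-disturbing" mapper: it reverses while it maps.
-- It has the right type for fmap on lists, but it does not preserve order.
badMap :: (a -> b) -> [a] -> [b]
badMap f = reverse . map f

sample :: [Int]
sample = [1, 2, 3]

-- The real list fmap satisfies both laws on this sample.
identityLawHolds :: Bool
identityLawHolds = fmap id sample == id sample

compositionLawHolds :: Bool
compositionLawHolds =
  (fmap (+ 1) . fmap (* 2)) sample == fmap ((+ 1) . (* 2)) sample

-- badMap id reverses the list, so the identity law fails.
badIdentityHolds :: Bool
badIdentityHolds = badMap id sample == sample

main :: IO ()
main = print (identityLawHolds, compositionLawHolds, badIdentityHolds)
-- prints (True,True,False)
```

The point of the broken example is that the laws are exactly what rule out a "map" that touches the shape: if mapping id changes anything, something other than the result was modified.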

All neat and clean, right? So, maybe the big misdirection is focusing too much on the “result” when learning Functors, when we should really be focusing more on the “shape”, or at least the two together.

Once you internalize “Functor gives you shape-preservation”, this helps you understand the value of the other common typeclass abstractions in Haskell as well, and how they function based on how they manipulate “shape” and “result”.

Traversable

For example, what does the Traversable typeclass give us? Well, if Functor gives us a way to map pure functions and preserve shape, then Traversable gives us a way to map effectful functions and preserve shape.

Whenever someone asks me about my favorite Traversable instance, I always say it’s the Map k traversable:

traverse :: Applicative f => (a -> f b) -> Map k a -> f (Map k b)

Notice how it has no constraints on k? Amazing isn’t it? traverse for Map k lets us map an (a -> f b) over the values at each key in a map, and collects the results under the key the a was originally under.

In essence, you can be assured that the result map has the same keys as the original map, perfectly preserving the “shape” of the map. The Map k instance is the epitome of beautiful Traversable instances. We can recognize this by identifying the “shape” that traverse is forced to preserve.
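Here is a small sketch of that guarantee, using Maybe as the effect and readMaybe as the effectful function. The settings map and the names rawSettings, parsedSettings, brokenSettings and keysPreserved are invented for the example.

```haskell
import qualified Data.Map as Map
import Text.Read (readMaybe)

rawSettings :: Map.Map String String
rawSettings = Map.fromList [("port", "8080"), ("workers", "4")]

-- Parse every value; the Maybe effect succeeds only if all values parse.
parsedSettings :: Maybe (Map.Map String Int)
parsedSettings = traverse readMaybe rawSettings

-- One unparseable value makes the whole traversal fail.
brokenSettings :: Maybe (Map.Map String Int)
brokenSettings = traverse readMaybe (Map.insert "port" "eighty" rawSettings)

-- When the traversal succeeds, the keys are exactly the original keys.
keysPreserved :: Bool
keysPreserved = fmap Map.keys parsedSettings == Just (Map.keys rawSettings)

main :: IO ()
main = do
  print parsedSettings
  print brokenSettings
  print keysPreserved
```

The effect (here, possible failure) lives outside the map, while inside it every result sits under the same key its input came from.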

Applicative

What does the Applicative typeclass give us? It has ap and pure, but its laws are infamously difficult to understand.

But, look at liftA2 (,):

liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)

It lets us take “two things” and combine their shapes. And, more importantly, it combines the shapes without considering the results.

  • For Writer w, <*> lets us combine the two logged values using mappend while ignoring the actual a/b results.
  • For list, <*> (the cartesian product) lets us multiply the lengths of the input lists together. The length of the new list ignores the actual contents of the list.
  • For State s, <*> lets you compose the s -> s state functions together, ignoring the a/bs.
  • For Parser, <*> lets you sequence input consumption in a way that doesn’t depend on the actual values you parse: it’s “context-free” in a sense, aside from some caveats.
  • For optparse-applicative, <*> lets you combine your command line argument specs together, without depending on the actual values provided at runtime by the caller.

The key takeaway is that the “final shape” only depends on the input shapes, and not the results. You can know the length of <*>-ing two lists together with only knowing the length of the input lists, and you can also know the relative ordering of inputs to outputs. Within the specific context of the semantics of IO, you can know what “effect” <*>-ing two IO actions would produce only knowing the effects of the input IO actions3. You can know what command line arguments <*>-ing two optparse-applicative parsers would have only knowing the command line arguments in the input parsers. You can know what strings <*>-ing two parser-combinator parsers would consume or reject, based only on the consumption/rejection of the input parsers. You can know the final log of <*>-ing two Writer w a values together by only knowing the logs of the input writer actions.

And hey…some of these combinations feel “monoidal”, don’t they?

  • Writer w sequences using mappend
  • List lengths sequence by multiplication
  • State s functions sequence by composition

You can also imagine “no-op” actions:

  • Writer w’s no-op action would log mempty, the identity of mappend
  • List’s no-op action would have a length 1, the identity of multiplication
  • State s’s no-op action would be id, the identity of function composition

That might sound familiar — these are all pure from the Applicative typeclass!

So, the Applicative typeclass laws aren’t that mysterious at all. If you understand the “shape” that a Functor induces, Applicative gives you a monoid on that shape! This is why Applicative is often called the “higher-kinded” Monoid.

This intuition takes you pretty far, I believe. Look at the examples above where we clearly identify specific Applicative instances with specific Monoid instances (Monoid w, Monoid (Product Int), Monoid (Endo s)).

Put in code:

-- A part of list's shape is its length and the monoid is (*, 1)
length (xs <*> ys) == length xs * length ys
length (pure r) == 1

-- Maybe's shape is isJust and the monoid is (&&, True)
isJust (mx <*> my) == isJust mx && isJust my
isJust (pure r) == True

-- State's shape is execState and the monoid is (flip (.), id)
execState (sx <*> sy) == execState sy . execState sx
execState (pure r) == id

-- Writer's shape is execWriter and the monoid is (<>, mempty)
execWriter (wx <*> wy) == execWriter wx <> execWriter wy
execWriter (pure r) == mempty
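
The equations above are written as properties rather than as a program. As a rough, runnable spot-check using only base, the sketch below tests three of them on sample values; it uses the pair Applicative (,) w in place of Writer w, with fst standing in for execWriter, and all the names are made up for illustration.

```haskell
import Data.Maybe (isJust)

-- Lists: lengths multiply, and pure has length 1.
listShapeLaw :: Bool
listShapeLaw =
  length (fs <*> xs) == length fs * length xs
    && length (pure 'x' :: String) == 1
  where
    fs = [(+ 1), (* 2), subtract 3] :: [Int -> Int]
    xs = [10, 20] :: [Int]

-- Maybe: presence combines with (&&), and pure is always present.
maybeShapeLaw :: Bool
maybeShapeLaw =
  and [isJust (mf <*> mx) == (isJust mf && isJust mx) | mf <- mfs, mx <- mxs]
    && isJust (pure 'x' :: Maybe Char)
  where
    mfs = [Just (+ 1), Nothing] :: [Maybe (Int -> Int)]
    mxs = [Just 1, Nothing] :: [Maybe Int]

-- (,) w as a stand-in for Writer w: logs combine with (<>), pure logs mempty.
writerShapeLaw :: Bool
writerShapeLaw =
  fst (wf <*> wx) == fst wf <> fst wx
    && fst (pure 'x' :: (String, Char)) == mempty
  where
    wf = ("parsed;", (+ 1)) :: (String, Int -> Int)
    wx = ("validated;", 41) :: (String, Int)

main :: IO ()
main = print (listShapeLaw, maybeShapeLaw, writerShapeLaw)
```

Note that none of the checks inspect the values produced by <*>; each one only looks at the shape on both sides.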

We can also extend this to non-standard Applicative instances: the ZipList newtype wrapper gives us an Applicative instance for lists where <*> is zipWith. These two have the same Functor instances, so their “shape” (length) is the same. And for both the normal Applicative and the ZipList Applicative, you can know the length of the result based on the lengths of the input, but ZipList combines shapes using the Min monoid, instead of the Product monoid. And the identity of Min is positive infinity, so pure for ZipList is an infinite list.

-- A part of ZipList's shape is length and its monoid is (min, infinity)
length (xs <*> ys) == length xs `min` length ys
length (pure r) == infinity
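Since "infinity" is not a real Int, the ZipList equations cannot be run literally. A small sketch with base's Control.Applicative shows the same idea on finite prefixes; the helper names zipLength and purePrefix are hypothetical.

```haskell
import Control.Applicative (ZipList (..))

-- Length of a zippy application: the shorter input decides.
zipLength :: [a -> b] -> [a] -> Int
zipLength fs xs = length (getZipList (ZipList fs <*> ZipList xs))

-- pure for ZipList repeats its argument forever, so only a prefix
-- can be inspected safely.
purePrefix :: Int -> a -> [a]
purePrefix n x = take n (getZipList (pure x))
```

For example, zipLength applied to three functions and two arguments gives 2, and purePrefix 3 'x' gives "xxx".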

The “know-the-shape-without-knowing-the-results” property is actually leveraged by many libraries. It’s how optparse-applicative can give you --help output: the shape of the optparse-applicative parser (the command line arguments list) can be computed without knowing the results (the actual arguments themselves at runtime). You can list out what arguments are expecting without ever getting any input from the user.

This is also leveraged by the async library to give us the Concurrently Applicative instance. Normally <*> for IO gives us sequential combination of IO effects. But, <*> for Concurrently gives us parallel combination of IO effects. We can launch all of the IO effects in parallel at the same time because we know what the IO effects are before we actually have to execute them to get the results. If we needed to know the results, this wouldn’t be possible.

This also gives some insight into the Backwards Applicative wrapper: because the shape of the final action does not depend on the result of either input, we are free to combine the shapes in whatever order we want. In the same way that every monoid gives rise to a “backwards” monoid:

ghci> "hello" <> "world"
"helloworld"
ghci> getDual $ Dual "hello" <> Dual "world"
"worldhello"

Every Applicative gives rise to a “backwards” Applicative that does the shape “mappending” in reverse order:

ghci> putStrLn "hello" *> putStrLn "world"
hello
world
ghci> forwards $ Backwards (putStrLn "hello") *> Backwards (putStrLn "world")
world
hello

The monoidal nature of Applicative with regard to shapes and effects is the heart of the original intent, and I’ve discussed this in earlier blog posts.

Alternative

The main function of the Alternative typeclass is <|>:

(<|>) :: Alternative f => f a -> f a -> f a

At first this might look a lot like <*> or liftA2 (,):

liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)

Both of them take two f-wrapped values and squish them into a single one. Both are also monoidal on the shape, independent of the result. However, <|> uses a different monoidal action on the shape than <*> does:

-- A part of list's shape is its length:
-- the Ap monoid is (*, 1), the Alt monoid is (+, 0)
length (xs <*> ys) == length xs * length ys
length (pure r) == 1
length (xs <|> ys) == length xs + length ys
length empty == 0

-- Maybe's shape is isJust:
-- The Ap monoid is (&&, True), the Alt monoid is (||, False)
isJust (mx <*> my) == isJust mx && isJust my
isJust (pure r) == True
isJust (mx <|> my) == isJust mx || isJust my
isJust empty == False

If we understand that functors have a “shape”, Applicative implies that the shapes are monoidal, and Alternative implies that the shapes are a “double-monoid”. The exact nature of how the two monoids relate to each other is not universally agreed upon. For many instances, the two do happen to form a semiring, where empty “annihilates” via empty <*> x == empty, and <*> distributes over <|> like x <*> (y <|> z) == (x <*> y) <|> (x <*> z). But this is not universal.
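To make "not universal" concrete, here is a small sketch using only base. It checks annihilation for Maybe and lists, checks the distribution equation for Maybe on one input, and shows that for ordinary lists the same equation holds only up to the order of the elements. The binding names are made up for illustration.

```haskell
import Control.Applicative (Alternative (..))
import Data.List (sort)

-- empty annihilates on the left: empty <*> x == empty.
annihilates :: Bool
annihilates =
  ((empty :: Maybe (Int -> Int)) <*> Just 1) == Nothing
    && null ((empty :: [Int -> Int]) <*> [1, 2])

-- x <*> (y <|> z) == (x <*> y) <|> (x <*> z), checked for Maybe.
maybeDistributes :: Bool
maybeDistributes =
  (Just (+ 1) <*> (Nothing <|> Just 2))
    == ((Just (+ 1) <*> Nothing) <|> (Just (+ 1) <*> Just (2 :: Int)))

-- Both sides of the same equation for lists.
listLeft, listRight :: [Int]
listLeft = [(+ 1), (* 2)] <*> ([10] <|> [20])
listRight = ([(+ 1), (* 2)] <*> [10]) <|> ([(+ 1), (* 2)] <*> [20])

-- Same elements, but not necessarily in the same order.
sameElements :: Bool
sameElements = sort listLeft == sort listRight
```

Here listLeft is [11,21,20,40] while listRight is [11,20,21,40]: the shapes (lengths) agree, but the lists themselves differ.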

However, what does Alternative bring to our shape/result dichotomy that Applicative did not? Notice the subtle difference between the two:

liftA2 (,) :: Applicative f => f a -> f b -> f (a, b)
(<|>) :: Alternative f => f a -> f a -> f a

For Applicative, the “result” comes from the results of both inputs. For Alternative, the “result” could come from one or the other input. So, this introduces a fundamental data dependency for the results:

  • Applicative: Shapes merge monoidally independent of the results, but to get the result of the final, you need to produce the results of both of the two inputs in the general case.
  • Alternative: Shapes merge monoidally independent of the results, but to get the result of the final, you need the results of one or the other input in the general case.

This also implies that the choice of combination method for shapes in Applicative vs Alternative isn’t arbitrary: the former has to be “conjoint” in a sense, and the latter has to be “disjoint”.

See again that clearly separating the shape and the result gives us the vocabulary to say precisely what the different data dependencies are.

Monad

Understanding shapes and results also helps us appreciate more the sheer power that Monad gives us. Look at >>=:

(>>=) :: Monad m => m a -> (a -> m b) -> m b

Using >>= means that the shape of the final action is allowed to depend on the result of the first action! We are no longer in the Applicative/Alternative world where shape only depends on shape.

Now we can write things like:

greet = do
  putStrLn "What is your name?"
  n <- getLine
  putStrLn ("Hello, " ++ n ++ "!")

Remember that for “IO”, the shape is the IO effects (in this case, what exactly gets sent to the terminal) and the “result” is the Haskell value computed from the execution of that IO effect. In our case, the shape of the final action (what values are printed) depends on the result of the intermediate actions (the getLine). You can no longer know in advance what action the program will have without actually running it and getting the results.

The same thing happens when you start sequencing parser-combinator parsers: you can’t know what counts as a valid parse or how much a parser will consume until you actually start parsing and getting your intermediate parse results.

Monad is also what makes guard and co. useful. Consider the purely Applicative:

evenProducts :: [Int] -> [Int] -> [Bool]
evenProducts xs ys = (\x y -> even (x * y)) <$> xs <*> ys

If you passed in a list of 100 items and a list of 200 items, you can know that the result has 100 * 200 = 20000 items, without actually knowing any of the items in the list.

But, consider an alternative formulation where we are allowed to use Monad operations:

evenProducts :: [Int] -> [Int] -> [(Int, Int)]
evenProducts xs ys = do
  x <- xs
  y <- ys
  guard (even (x * y))
  pure (x, y)

Now, even if you knew the lengths of the input lists, you cannot know the length of the output list without actually knowing what’s inside your lists. You need to actually start “sampling”.
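The two versions can be put side by side in one file. In this sketch the Applicative version is renamed evenFlags (a hypothetical name, chosen only to avoid the name clash); the monadic version is as in the text.

```haskell
import Control.Monad (guard)

-- Applicative version: one output per (x, y) pair, whatever the values are.
evenFlags :: [Int] -> [Int] -> [Bool]
evenFlags xs ys = (\x y -> even (x * y)) <$> xs <*> ys

-- Monadic version: guard drops pairs depending on their values.
evenProducts :: [Int] -> [Int] -> [(Int, Int)]
evenProducts xs ys = do
  x <- xs
  y <- ys
  guard (even (x * y))
  pure (x, y)
```

With inputs [1,3] and [5,7], evenFlags returns four items and evenProducts returns none. With [2,4] and [5,7], which have the same lengths, evenFlags still returns four items but evenProducts now returns four pairs. The input lengths alone do not determine the output length in the monadic version.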

That’s why there is no Monad instance for Backwards or optparse-applicative parsers. For Backwards, it doesn’t work because we’ve now introduced an asymmetry (the m b depends on the a of the m a) that can’t be reversed. For optparse-applicative, it’s because we want to be able to inspect the shape without knowing the results at runtime (so we can show a useful --help without getting any actual arguments): but, with Monad, we can’t know the shape without knowing the results!

In a way, Monad simply “is” the way to combine Functor shapes together where the final shape is allowed to depend on the results. Hah, I tricked you into reading a monad tutorial!

Free Structures

I definitely write way too much about free structures on this blog. But this “shapeful” way of thinking also gives rise to why free structures are so compelling and interesting to work with in Haskell.

Before, we were describing shapes of Functors and Applicatives and Monads that already existed. We had this Functor, what was its shape?

However, what if we had a shape that we had in mind, and wanted to create an Applicative or Monad that manipulated that shape?

For example, let’s roll our own version of optparse-applicative that only supported --myflag somestring options. We could say that the “shape” is the list of supported options and parsers. So a single element of this shape would be the specification of a single option:

data Option a = Option { optionName :: String, optionParse :: String -> Maybe a }
  deriving Functor

The “shape” here is the name and also what values it would parse, essentially. fmap won’t affect the name of the option and won’t affect what would succeed or fail.

Now, to create a full-fledged multi-argument parser, we can use Ap from the free library:

type Parser = Ap Option

We specified the shape we wanted, now we get the Applicative of that shape for free! We can now combine our shapes monoidally using the <*> instance, and then use runAp_ to inspect it:

data Args = Args { myStringOpt :: String, myIntOpt :: Int }

parseTwo :: Parser Args
parseTwo = Args <$> liftAp stringOpt <*> liftAp intOpt
  where
    stringOpt = Option "string-opt" Just
    intOpt = Option "int-opt" readMaybe

getAllOptions :: Parser a -> [String]
getAllOptions = runAp_ (\o -> [optionName o])

ghci> getAllOptions parseTwo
["string-opt", "int-opt"]
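Inspecting the shape is only half of the story; the same Parser value can also be interpreted to produce results. The sketch below collects the pieces above into one file (it needs the free package) and adds a hypothetical runParser that interprets each Option into Maybe using runAp. Taking the input as a list of already-split (name, value) pairs is an assumption made to keep the example short, and the Eq and Show instances on Args are added only for testing.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Applicative.Free (Ap, liftAp, runAp, runAp_)
import Text.Read (readMaybe)

data Option a = Option { optionName :: String, optionParse :: String -> Maybe a }
  deriving Functor

type Parser = Ap Option

data Args = Args { myStringOpt :: String, myIntOpt :: Int }
  deriving (Eq, Show)

parseTwo :: Parser Args
parseTwo = Args <$> liftAp stringOpt <*> liftAp intOpt
  where
    stringOpt = Option "string-opt" Just
    intOpt = Option "int-opt" readMaybe

-- Shape only: list the option names without any user input.
getAllOptions :: Parser a -> [String]
getAllOptions = runAp_ (\o -> [optionName o])

-- Results: look each option up by name and parse its value.
-- Fails with Nothing if any option is missing or does not parse.
runParser :: [(String, String)] -> Parser a -> Maybe a
runParser args = runAp (\o -> lookup (optionName o) args >>= optionParse o)
```

Both functions consume the same parseTwo: getAllOptions needs no input at all, while runParser needs the actual argument values.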

Remember that Applicative is like a “monoid” for shapes, so Ap gives you a free “monoid” on your custom shape: you can now create list-like “sequences” of your shape that merge via concatenation through <*>. You can also know that fmap on Ap Option will not add or remove options: it’ll leave the actual options unchanged. It’ll also not affect what options would fail or succeed to parse.

You could also write a parser combinator library this way too! Remember that the “shape” of a parser combinator Parser is the string that it consumes or rejects. The single element might be a parser that consumes or rejects a single Char:

newtype Single a = Single { satisfies :: Char -> Maybe a }
  deriving Functor

The “shape” is whether or not it consumes or rejects a char. Notice that fmap for this cannot change whether or not a char is rejected or accepted: it can only change the Haskell result a value. fmap can’t flip the Maybe into a Just or Nothing.

Now we can create a full monadic parser combinator library by using Free from the free library:

type Parser = Free Single

Again, we specified the shape we wanted, and now we have a Monad for that shape! For more information on using this, I’ve written a blog post in the past. Ap gives you a free “monoid” on your shapes, but in a way Free gives you a “tree” for your shapes, where the sequence of shapes depends on which way you go down their results. And, again, fmap won’t ever change what would or would not be parsed.
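As a rough illustration of the "tree" idea, the sketch below (which needs the free package) builds a tiny context-sensitive parser on top of Free Single: a digit followed by that many characters. The primitives anyChar and digit, the lengthPrefixed parser, and the runParser interpreter are all hypothetical names written for this example, not part of any library.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Monad (replicateM)
import Control.Monad.Free (Free (..), liftF)
import Data.Char (digitToInt, isDigit)

newtype Single a = Single { satisfies :: Char -> Maybe a }
  deriving Functor

type Parser = Free Single

-- Primitive parsers, each looking at exactly one Char.
anyChar :: Parser Char
anyChar = liftF (Single Just)

digit :: Parser Int
digit = liftF (Single (\c -> if isDigit c then Just (digitToInt c) else Nothing))

-- Context-sensitive: the first result decides how many more Chars to consume.
lengthPrefixed :: Parser String
lengthPrefixed = do
  n <- digit
  replicateM n anyChar

-- Interpreter: returns the result and the unconsumed input, or Nothing
-- if a Char is rejected or the input runs out.
runParser :: Parser a -> String -> Maybe (a, String)
runParser (Pure x) rest = Just (x, rest)
runParser (Free (Single step)) input = case input of
  [] -> Nothing
  c : cs -> step c >>= \next -> runParser next cs
```

Running lengthPrefixed on "3abcd" yields Just ("abc", "d"), while "2a" and "xab" both yield Nothing. How much input is consumed depends on the value of the first digit, which is exactly what the Applicative version cannot express.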

How do we know what free structure to pick? Well, we ask questions about what we want to be able to do with our shape. If we want to inspect the shape without knowing the results, we’d use the free Applicative or free Alternative. As discussed earlier, using the free Applicative means that our final result must require producing all of the input results, but using the free Alternative means it doesn’t. If we wanted to allow the shape to depend on the results (like for a context-sensitive parser), we’d use the free Monad. Understanding the concept of the “shape” makes this choice very intuitive.

The Shape of You

Next time you encounter a new Functor, I hope these insights can be useful. Ask yourself, what is fmap preserving? What is fmap changing? And from there, its secrets will unfold before you. Emmy Noether would be proud.

Special Thanks

I am very humbled to be supported by an amazing community, who make it possible for me to devote time to researching and writing these posts. Very special thanks to my supporter at the “Amazing” level on patreon, Josh Vera! :)


  1. There are some exceptions, especially degenerate cases like Writer () aka Identity which add no meaningful structure. So for these this mental model isn’t that useful.↩︎

  2. Incidentally, Set.map does preserve one thing: non-emptiness. You can’t Set.map an empty set into a non-empty one and vice versa. So, maybe if we recontextualized Set as a “search for at least one result” Functor or Monad where you could only ever observe a single value, Set.map would work for Ord-restricted versions of those abstractions, assuming lawful Ord instances.↩︎

  3. That is, if we take the sum consideration of all input-output with the outside world, independent of what happens within the Haskell results, we can say the combination of effects is deterministic.↩︎

by Justin Le at November 04, 2024 07:44 PM

November 03, 2024

Haskell Interlude

57: Gabriele Keller

Gabriele Keller, professor at Utrecht University, is interviewed by Andres and Joachim. We follow her journey through the world as well as through programming languages, and learn why Haskell is the best environment for embedding languages, how the desire to implement parallel programming sparked the development of type families in Haskell, and that teaching functional programming works better with graphics.

by Haskell Podcast at November 03, 2024 08:00 PM

November 02, 2024

Brent Yorgey

Competitive Programming in Haskell: Union-Find

Posted on November 2, 2024

Union-find

A union-find data structure (also known as a disjoint set data structure) keeps track of a collection of disjoint sets, typically with elements drawn from \(\{0, \dots, n-1\}\). For example, we might have the sets

\(\{1,3\}, \{0, 4, 2\}, \{5, 6, 7\}\)

A union-find structure must support three basic operations:

  • We can \(\mathit{create}\) a union-find structure with \(n\) singleton sets \(\{0\}\) through \(\{n-1\}\). (Alternatively, we could support two operations: creating an empty union-find structure, and adding a new singleton set; occasionally this more fine-grained approach is useful, but we will stick with the simpler \(\mathit{create}\) API for now.)

  • We can \(\mathit{find}\) a given \(x \in \{0, \dots, n-1\}\), returning some sort of “name” for the set \(x\) is in. It doesn’t matter what these names are; the only thing that matters is that for any \(x\) and \(y\), \(\mathit{find}(x) = \mathit{find}(y)\) if and only if \(x\) and \(y\) are in the same set. The most important application of \(\mathit{find}\) is therefore to check whether two given elements are in the same set or not.

  • We can \(\mathit{union}\) two elements, so the sets that contain them become one set. For example, if we \(\mathit{union}(2,5)\) then we would have

    \(\{1,3\}, \{0, 4, 2, 5, 6, 7\}\)

Note that \(\mathit{union}\) is a one-way operation: once two sets have been unioned together, there’s no way to split them apart again. (If both merging and splitting are required, one can use a link/cut tree, which is very cool—and possibly something I will write about in the future—but much more complex.) However, these three operations are enough for union-find structures to have a large number of interesting applications!

In addition, we can annotate each set with a value taken from some commutative semigroup. When creating a new union-find structure, we must specify the starting value for each singleton set; when unioning two sets, we combine their annotations via the semigroup operation.

  • For example, we could annotate each set with its size; singleton sets always start out with size 1, and every time we union two sets we add their sizes.
  • We could also annotate each set with the sum, product, maximum, or minimum of all its elements.
  • Of course there are many more exotic examples as well.

We typically use a commutative semigroup, as in the examples above; this guarantees that a given set always has a single well-defined annotation value, regardless of the sequence of union-find operations that were used to create it. However, we can actually use any binary operation at all (i.e. any magma), in which case the annotations on a set may reflect the precise tree of calls to \(\mathit{union}\) that were used to construct it; this can occasionally be useful.

  • For example, we could annotate each set with a list of values, and combine annotations using list concatenation; the order of elements in the list associated to a given set will depend on the order of arguments to \(\mathit{union}\).

  • We could also annotate each set with a binary tree storing values at the leaves. Each singleton set is annotated with a single leaf; to combine two trees we create a new branch node with the two trees as its children. Then each set ends up annotated with the precise tree of calls to \(\mathit{union}\) that were used to create it.
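The difference between commutative and non-commutative annotations can be seen with plain Semigroup values, independent of any union-find code. The sketch below uses only base and takes the sets {1,3} and {0,4,2} from the earlier example; the binding names are made up for illustration.

```haskell
import Data.Semigroup (Max (..), Sum (..))

-- Commutative semigroups: the merged annotation ignores argument order.
-- Sizes of {1,3} and {0,4,2}:
sizeEitherWay :: (Sum Int, Sum Int)
sizeEitherWay = (Sum 2 <> Sum 3, Sum 3 <> Sum 2)

-- Maxima of {1,3} and {0,4,2}:
maxEitherWay :: (Max Int, Max Int)
maxEitherWay = (Max 3 <> Max 4, Max 4 <> Max 3)

-- List concatenation is associative but not commutative, so the
-- annotation records the order in which the sets were merged.
listEitherWay :: ([Int], [Int])
listEitherWay = ([1, 3] <> [0, 4, 2], [0, 4, 2] <> [1, 3])
```

Both components of sizeEitherWay and of maxEitherWay are equal, while the two components of listEitherWay differ.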

Implementing union-find

My implementation is based on one by Kwang Yul Seo, but I have modified it quite a bit. The code is also available in my comprog-hs repository. This blog post is not intended to be a comprehensive union-find tutorial, but I will explain some things as we go.

{-# LANGUAGE RecordWildCards #-}

module UnionFind where

import Control.Monad (when)
import Control.Monad.ST
import Data.Array.ST

Let’s start with the definition of the UnionFind type itself. UnionFind has two type parameters: s is a phantom type parameter used to limit the scope to a given ST computation; m is the type of the arbitrary annotations. Note that the elements are also sometimes called “nodes”, since, as we will see, they are organized into trees.

type Node = Int
data UnionFind s m = UnionFind {

The basic idea is to maintain three mappings:

  • First, each element is mapped to a parent (another element). There are no cycles, except that some elements can be their own parent. This means that the elements form a forest of rooted trees, with the self-parenting elements as roots. We store the parent mapping as an STUArray (see here for another post where we used STUArray) for efficiency.
  parent :: !(STUArray s Node Node),
  • Each element is also mapped to a size. We maintain the invariant that for any element which is a root (i.e. any element which is its own parent), we store the size of the tree rooted at that element. The size associated to other, non-root elements does not matter.

    (Many implementations store the height of each tree instead of the size, but it does not make much practical difference, and the size seems more generally useful.)

  sz :: !(STUArray s Node Int),
  • Finally, we map each element to a custom annotation value; again, we only care about the annotation values for root nodes.
  ann :: !(STArray s Node m) }

To \(\mathit{create}\) a new union-find structure, we need a size and a function mapping each element to an initial annotation value. Every element starts as its own parent, with a size of 1. For convenience, we can also make a variant of createWith that gives every element the same constant annotation value.

createWith :: Int -> (Node -> m) -> ST s (UnionFind s m)
createWith n m =
  UnionFind
    <$> newListArray (0, n - 1) [0 .. n - 1]    -- Every node is its own parent
    <*> newArray (0, n - 1) 1                   -- Every node has size 1
    <*> newListArray (0, n - 1) (map m [0 .. n - 1])

create :: Int -> m -> ST s (UnionFind s m)
create n m = createWith n (const m)

To perform a \(\mathit{find}\) operation, we keep following parent references up the tree until reaching a root. We can also do a cool optimization known as path compression: after finding a root, we can directly update the parent of every node along the path we just traversed to be the root. This means \(\mathit{find}\) can be very efficient, since it tends to create trees that are extremely wide and shallow.

find :: UnionFind s m -> Node -> ST s Node
find uf@(UnionFind {..}) x = do
  p <- readArray parent x
  if p /= x
    then do
      r <- find uf p
      writeArray parent x r
      pure r
    else pure x

connected :: UnionFind s m -> Node -> Node -> ST s Bool
connected uf x y = (==) <$> find uf x <*> find uf y
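To see path compression at work in isolation, here is a self-contained sketch that runs the same recursive scheme on a bare parent array, without sizes or annotations. The names findRoot and compressionDemo are hypothetical; the starting array encodes the chain 3 → 2 → 1 → 0 with 0 as the root.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)

-- Same recursive shape as find above, on a bare parent array.
findRoot :: STUArray s Int Int -> Int -> ST s Int
findRoot parent x = do
  p <- readArray parent x
  if p /= x
    then do
      r <- findRoot parent p
      writeArray parent x r -- path compression: point x straight at the root
      pure r
    else pure x

-- Parent array before and after a single find on the deepest node.
compressionDemo :: ([Int], [Int])
compressionDemo = runST $ do
  parent <- newListArray (0, 3) [0, 0, 1, 2]
  before <- getElems parent
  _ <- findRoot parent 3
  after <- getElems parent
  pure (before, after)
```

Before the call the parents are [0,0,1,2]; afterwards they are [0,0,0,0], so every node on the traversed path now points directly at the root.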

Finally, to implement \(\mathit{union}\), we find the roots of the given nodes; if they are not the same we make the root with the smaller tree the child of the other root, combining sizes and annotations as appropriate.

union :: Semigroup m => UnionFind s m -> Node -> Node -> ST s ()
union uf@(UnionFind {..}) x y = do
  x <- find uf x
  y <- find uf y
  when (x /= y) $ do
    sx <- readArray sz x
    sy <- readArray sz y
    mx <- readArray ann x
    my <- readArray ann y
    if sx < sy
      then do
        writeArray parent x y
        writeArray sz y (sx + sy)
        writeArray ann y (mx <> my)
      else do
        writeArray parent y x
        writeArray sz x (sx + sy)
        writeArray ann x (mx <> my)

Note the trick of writing x <- find uf x: this looks kind of like an imperative statement that updates the value of a mutable variable x, but really it just makes a new variable x which shadows the old one.

Finally, a few utility functions. First, one to get the size of the set containing a given node:

size :: UnionFind s m -> Node -> ST s Int
size uf@(UnionFind {..}) x = do
  x <- find uf x
  readArray sz x

Also, we can provide functions to update and fetch the custom annotation value associated to the set containing a given node.

updateAnn :: Semigroup m => UnionFind s m -> Node -> m -> ST s ()
updateAnn uf@(UnionFind {..}) x m = do
  x <- find uf x
  old <- readArray ann x
  writeArray ann x (old <> m)
  -- We could use modifyArray above, but the version of the standard library
  -- installed on Kattis doesn't have it

getAnn :: UnionFind s m -> Node -> ST s m
getAnn uf@(UnionFind {..}) x = do
  x <- find uf x
  readArray ann x

Challenge

Here are a couple of problems I challenge you to solve for next time:

by Brent Yorgey at November 02, 2024 12:00 AM

October 31, 2024

Abhinav Sarkar

Going REPLing with Haskeline

So you went ahead and created a new programming language, with an AST, a parser, and an interpreter. And now you hate how you have to write the programs in your new language in files to run them? You need a REPL! In this post, we’ll create a shiny REPL with lots of nice features using the Haskeline library to go along with your new PL that you implemented in Haskell.

This post was originally published on abhinavsarkar.net.

The Demo

First a short demo:

[Demo recording]

That is a pretty good REPL, isn’t it? You can even try it online[1], running entirely in your browser.

Dawn of a New Language

Let’s assume that we have created a new small Lisp[2], just large enough to be able to conveniently write and run the Fibonacci function that returns the nth Fibonacci number. That’s it, nothing more. This lets us focus on the features of the REPL[3], not the language.

We have a parser to parse the code from text to an AST, and an interpreter that evaluates an AST and returns a value. We are not going into the details of the parser and the interpreter; listing the type signatures of the functions they provide is enough for this post.

Let’s start with the AST:

module Language.FiboLisp.Types where

import Data.Text qualified as Text
import Data.Text.Lazy qualified as LText
import Text.Pretty.Simple qualified as PS
import Text.Printf (printf)

type Ident = String

data Expr
  = Num_ Integer
  | Bool_ Bool
  | Var Ident
  | BinaryOp Op Expr Expr
  | If Expr Expr Expr
  | Apply Ident [Expr]
  deriving (Show)

data Op = Add | Sub | LessThan
  deriving (Show, Enum)

data Def = Def {defName :: Ident, defParams :: [Ident], defBody :: Expr}

data Program = Program [Def] [Expr]
  deriving (Show)

carKeywords :: [String]
carKeywords = ["def", "if", "+", "-", "<"]

instance Show Def where
  show Def {..} =
    printf "(Def %s [%s] (%s))" defName (unwords defParams) (show defBody)

showProgram :: Program -> String
showProgram =
  Text.unpack
    . LText.toStrict
    . PS.pShowOpt
      ( PS.defaultOutputOptionsNoColor
          { PS.outputOptionsIndentAmount = 2,
            PS.outputOptionsCompact = True,
            PS.outputOptionsCompactParens = True
          }
      )

That’s right! We named our little language FiboLisp.

FiboLisp is expression oriented; everything is an expression. So naturally, we have an Expr AST. Writing the Fibonacci function does not require many syntactic facilities. In FiboLisp we have:

  • integer numbers,
  • booleans,
  • variables,
  • addition, subtraction, and less-than binary operations on numbers,
  • conditional if expressions, and
  • function calls by name.

We also have function definitions, captured by Def, which records the function name, its parameter names, and its body as an expression.

And finally we have Programs, which are a bunch of function definitions to define, and another bunch of expressions to evaluate.

Short and simple. We don’t need anything more[4]. This is how the Fibonacci function looks in FiboLisp:

(def fibo [n]
  (if (< n 2)
    n
    (+ (fibo (- n 1)) (fibo (- n 2)))))

We can see all the AST types in use here. Note that FiboLisp is lexically scoped.
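For a concrete picture, the sketch below builds by hand the Def value that the fibo definition plausibly corresponds to. What the real parser emits is not shown in this post, so the exact shape (for instance, that (< n 2) becomes a BinaryOp rather than an Apply) is an assumption. The types are repeated here with derived Show and Eq instances so that the example stands alone, and appliedNames is a hypothetical helper.

```haskell
type Ident = String

data Expr
  = Num_ Integer
  | Bool_ Bool
  | Var Ident
  | BinaryOp Op Expr Expr
  | If Expr Expr Expr
  | Apply Ident [Expr]
  deriving (Show, Eq)

data Op = Add | Sub | LessThan
  deriving (Show, Eq, Enum)

data Def = Def {defName :: Ident, defParams :: [Ident], defBody :: Expr}
  deriving (Show, Eq)

-- Hand-built AST for the fibo definition (assumed shape, see above).
fiboDef :: Def
fiboDef =
  Def
    { defName = "fibo",
      defParams = ["n"],
      defBody =
        If
          (BinaryOp LessThan (Var "n") (Num_ 2))
          (Var "n")
          ( BinaryOp
              Add
              (Apply "fibo" [BinaryOp Sub (Var "n") (Num_ 1)])
              (Apply "fibo" [BinaryOp Sub (Var "n") (Num_ 2)])
          )
    }

-- Names of the functions applied anywhere inside an expression.
appliedNames :: Expr -> [Ident]
appliedNames expr = case expr of
  BinaryOp _ l r -> appliedNames l ++ appliedNames r
  If c t e -> concatMap appliedNames [c, t, e]
  Apply f args -> f : concatMap appliedNames args
  _ -> []
```

Applying appliedNames to defBody fiboDef finds the two recursive calls to fibo.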

The module also lists a bunch of keywords (carKeywords) that can appear in the car[5] position of a Lisp expression, which we use later for auto-completion in the REPL, and some functions to convert the AST types to nice-looking strings.

For the parser, we have this pared-down code:

module Language.FiboLisp.Parser (ParsingError(..), parse) where

import Control.DeepSeq (NFData)
import Control.Exception (Exception)
import GHC.Generics (Generic)
import Language.FiboLisp.Types

parse :: String -> Either ParsingError Program

data ParsingError = ParsingError String | EndOfStreamError
  deriving (Show, Generic, NFData)

instance Exception ParsingError

The essential function is parse, which takes the code as a string, and returns either a ParsingError on failure, or a Program on success. If the parser detects that an S-expression is not properly closed, it returns an EndOfStreamError error.

We also have this pretty-printer module that converts function ASTs back to pretty Lisp code:

module Language.FiboLisp.Printer (prettyShowDef) where

import Language.FiboLisp.Types

prettyShowDef :: Def -> String

Finally, the last thing before we hit the real topic of this post, the FiboLisp interpreter:

module Language.FiboLisp.Interpreter
  (Value, RuntimeError, interpret, builtinFuncs, builtinVals) where

import Control.DeepSeq (NFData)
import Control.Exception (Exception)
import Data.Map.Strict qualified as Map
import GHC.Generics (Generic)
import Language.FiboLisp.Types

interpret :: (String -> IO ()) -> Program -> IO (Either RuntimeError Value)

newtype RuntimeError = RuntimeError String
  deriving (Show, Generic, NFData)

instance Exception RuntimeError

data Value = ...
  deriving (Show, Generic, NFData)

builtinFuncs :: Map.Map String Value

builtinVals :: [Value]

We have elided the details again. All that matters to us is the interpret function that takes a program, and returns either a runtime error or a value. Value is the runtime representation of the values of FiboLisp expressions, and all we care about is that it can be shown and fully evaluated via NFData[6]. interpret also takes a String -> IO () function, which will be demystified when we get into implementing the REPL.
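Why full evaluation matters is not spelled out at this point. One plausible reason, stated here as an assumption, is laziness: an error or expensive work hidden inside an unevaluated value would otherwise surface later, outside the code meant to catch or time it. The sketch below uses a list of Int as a stand-in for Value and shows force from the deepseq package combined with evaluate and try from base; forceAll is a hypothetical name.

```haskell
import Control.DeepSeq (force)
import Control.Exception (ErrorCall, evaluate, try)

-- Fully evaluate a value inside IO, surfacing any error hidden in it.
-- evaluate alone would stop at the outermost constructor.
forceAll :: [Int] -> IO (Either ErrorCall [Int])
forceAll = try . evaluate . force
```

Calling forceAll on [1, error "boom"] returns a Left, because force walks the whole list and hits the hidden error inside the try.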

Lastly, we have a map of built-in functions and a list of built-in values. We expose them so that they can be treated specially in the REPL.

If you want, you can go ahead and fill in the missing code using your favourite parsing and pretty-printing libraries[7], and the method of writing interpreters. For this post, those implementation details are not necessary.

Let’s package all this functionality into a module for ease of importing:

module Language.FiboLisp
  ( module Language.FiboLisp.Types,
    module Language.FiboLisp.Parser,
    module Language.FiboLisp.Printer,
    module Language.FiboLisp.Interpreter,
  )
where

import Language.FiboLisp.Interpreter
import Language.FiboLisp.Parser
import Language.FiboLisp.Printer
import Language.FiboLisp.Types

Now, with all the preparations done, we can go REPLing.

A REPL of Our Own

The main functionality that a REPL provides is entering expressions and definitions, one at a time, that it Reads, Evaluates, and Prints, and then Loops back, letting us do the same again. This can be accomplished with a simple program that prompts the user for an input and does all these with it. However, such a REPL will be quite lackluster.
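For reference, such a bare-bones loop might look like the following sketch, which uses only base. The evalLine function is a hypothetical stand-in for "parse, interpret, and show"; it just counts words so that the example stays self-contained.

```haskell
import System.IO (hFlush, isEOF, stdout)

-- Hypothetical stand-in for parsing and interpreting one line of input.
evalLine :: String -> String
evalLine line = "words: " <> show (length (words line))

-- Read a line, evaluate it, print the result, and loop.
-- Stops when the input reaches end-of-file.
simpleRepl :: IO ()
simpleRepl = do
  putStr "> "
  hFlush stdout
  eof <- isEOF
  if eof
    then pure ()
    else do
      line <- getLine
      putStrLn (evalLine line)
      simpleRepl
```

There is no history, no line editing, no completion, and no multiline input here, which is the gap the rest of the post sets out to fill.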

These days programming languages come with advanced REPLs like IPython and nREPL, which provide many functionalities beyond simple REPLing. We want FiboLisp to have a great REPL too.

You may have already noticed some advanced features that our REPL provides in the demo. Let’s state them here:

  1. Commands starting with colon:
    1. to set and unset settings: :set and :unset,
    2. to load files into the REPL: :load,
    3. to show the source code of functions: :source,
    4. to show a help message: :help.
  2. Settings to enable/disable:
    1. dumping of parsed ASTs: dump,
    2. showing program execution times: time.
  3. Multiline expressions and functions, with correct indentation.
  4. Colored output and messages.
  5. Auto-completion of commands, code and file names.
  6. Safety checks when loading files.
  7. Readline-like navigation through the history of previous inputs.

Haskeline — the Haskell library that we use to create the REPL — provides only basic functionalities, upon which we build to provide these features. Let’s begin.

State and Settings

As usual, we start the module with many imports[8]:

{-# LANGUAGE TemplateHaskell #-}

module Language.FiboLisp.Repl (run) where

import Control.DeepSeq qualified as DS
import Control.Exception (Exception (..), evaluate)
import Control.Lens.Basic qualified as Lens
import Control.Monad (when)
import Control.Monad.Catch qualified as Catch
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Identity (IdentityT (..))
import Control.Monad.Reader (MonadReader, ReaderT (runReaderT))
import Control.Monad.Reader qualified as Reader
import Control.Monad.State.Strict (MonadState, StateT (runStateT))
import Control.Monad.State.Strict qualified as State
import Control.Monad.Trans (MonadTrans, lift)
import Data.Char qualified as Char
import Data.Functor ((<&>))
import Data.List
  (dropWhileEnd, foldl', isPrefixOf, isSuffixOf, nub, sort, stripPrefix)
import Data.Map.Strict qualified as Map
import Data.Maybe (fromJust)
import Data.Set qualified as Set
import Data.Time (NominalDiffTime, diffUTCTime, getCurrentTime)
import Language.FiboLisp qualified as L
import System.Console.Haskeline qualified as H
import System.Console.Terminfo qualified as Term
import System.Directory (canonicalizePath, doesFileExist, getCurrentDirectory)

Notice that we import the previously shown Language.FiboLisp module qualified as L, and Haskeline as H. Another important library that we use here is terminfo, which helps us do colored output.

A REPL must preserve the context through a session. In the case of FiboLisp, this means we should be able to define a function[9] as one input, and then use it later in the session, one or many times[10]. The REPL should also respect the REPL settings through the session until they are unset.

Additionally, the REPL has to remember whether it is in the middle of writing a multiline input. To support multiline input, the REPL also needs to remember the previous indentation, and the input entered in previous lines of a multiline input. Together these form the ReplState:

data ReplState = ReplState
  { _replDefs :: Defs,
    _replSettings :: Settings,
    _replLineMode :: LineMode,
    _replIndent :: Int,
    _replSeenInput :: String
  }

type Defs = Map.Map L.Ident L.Def
type Settings = Set.Set Setting
data Setting = Dump | MeasureTime deriving (Eq, Ord, Enum)
data LineMode = SingleLine | MultiLine deriving (Eq)

instance Show Setting where
  show = \case
    Dump -> "dump"
    MeasureTime -> "time"

Let’s deal with settings first. We set and unset settings using the :set and :unset commands. So, we write the code to parse these commands:

data SettingMode = Set | Unset deriving (Eq, Enum)

instance Show SettingMode where
  show = \case
    Set -> ":set"
    Unset -> ":unset"

parseSetting :: String -> Maybe Setting
parseSetting = \case
  "dump" -> Just Dump
  "time" -> Just MeasureTime
  _ -> Nothing

parseSettingMode :: String -> Maybe SettingMode
parseSettingMode = \case
  ":set" -> Just Set
  ":unset" -> Just Unset
  _ -> Nothing

parseSettingCommand :: String -> Either String (SettingMode, Setting)
parseSettingCommand command = case words command of
  [modeStr, settingStr] -> case parseSettingMode modeStr of
    Just mode -> case parseSetting settingStr of
      Just setting -> Right (mode, setting)
      Nothing -> Left $ "Unknown setting: " <> settingStr
    Nothing -> Left $ "Unknown command: " <> command
  [modeStr]
    | Just _ <- parseSettingMode modeStr -> Left "No setting specified"
  _ -> Left $ "Unknown command: " <> command

Nothing fancy here, just splitting the input into words and going through them to make sure they are valid.
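The Show instances and the parsers are meant to be inverses of each other, which matters later because the completion code offers `show`n settings as candidates. Here is a minimal sketch that restates the Setting type and checks the round trip; the name roundTrips is an illustrative assumption and is not part of the REPL code.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Restated from the post so that the sketch is self-contained.
data Setting = Dump | MeasureTime deriving (Eq, Ord, Enum)

instance Show Setting where
  show = \case
    Dump -> "dump"
    MeasureTime -> "time"

parseSetting :: String -> Maybe Setting
parseSetting = \case
  "dump" -> Just Dump
  "time" -> Just MeasureTime
  _ -> Nothing

-- Hypothetical check: parsing the shown form of a setting gives it back.
roundTrips :: Bool
roundTrips = all (\s -> parseSetting (show s) == Just s) [Dump ..]
```

If a new setting is added to the data type but not to parseSetting, roundTrips becomes False, which makes it a cheap test to keep around.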

The REPL is a monad that wraps over ReplState:

newtype Repl a = Repl
  { runRepl_ :: StateT ReplState (ReaderT AddColor IO) a
  }
  deriving
    ( Functor,
      Applicative,
      Monad,
      MonadIO,
      MonadState ReplState,
      MonadReader AddColor,
      Catch.MonadThrow,
      Catch.MonadCatch,
      Catch.MonadMask
    )

type AddColor = Term.Color -> String -> String

runRepl :: AddColor -> Repl a -> IO a
runRepl addColor =
  fmap fst
    . flip runReaderT addColor
    . flip runStateT (ReplState Map.empty Set.empty SingleLine 0 "")
    . runRepl_

Repl also lets us do IO (is it really a REPL if you can’t print?) and deal with exceptions. Additionally, we have a read-only state that is a function, which will be explained soon. The REPL starts in the single line mode, with no indentation, function definitions, settings, or previously seen input.
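To see how the layers in runRepl peel off, here is a miniature of the same stack built only with the transformers package. The Mini type, tick, and runMini are hypothetical names for illustration; the real Repl uses a newtype and derived mtl-style instances instead of explicit lifts.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State.Strict (StateT, get, modify', runStateT)

-- Hypothetical miniature of the Repl stack: an Int counter as the mutable
-- state and a String label as the read-only environment.
type Mini = StateT Int (ReaderT String IO)

-- Increment the counter, then describe it using the environment.
tick :: Mini String
tick = do
  modify' (+ 1)
  n <- get
  label <- lift ask
  pure (label <> show n)

-- Peel the layers from the outside in, discarding the final state,
-- in the same order as runRepl does.
runMini :: String -> Mini a -> IO a
runMini label = fmap fst . flip runReaderT label . flip runStateT 0
```

Running `runMini "step " (tick >> tick)` threads the counter through both calls while the label stays fixed, which is the same split between ReplState and AddColor.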

REPLing Down the Prompt

Let’s go top-down. We write the run function that is the entry point of this module:

run :: IO ()
run = do
  term <- Term.setupTermFromEnv
  let addColor =
        case Term.getCapability term $ Term.withForegroundColor @String of
          Just fc -> fc
          Nothing -> \_ s -> s
  runRepl addColor . H.runInputT settings $ do
    H.outputStrLn $ addColor promptColor "FiboLisp REPL"
    H.outputStrLn $ addColor infoColor "Press <TAB> to start"
    repl
  where
    settings =
      H.setComplete doCompletions $
        H.defaultSettings {H.historyFile = Just ".fibolisp"}

This sets up Haskeline to run our REPL using the functions we provide in the later sections: repl and doCompletions. This also demystifies the read-only state of the REPL: a function that adds colors to our output strings, depending on the capabilities of the terminal in which our REPL is running. We also set up a history file to remember the previous REPL inputs.

When the REPL starts, we output some messages in nice colors, which are defined as:

promptColor, printColor, outputColor, errorColor, infoColor :: Term.Color
promptColor = Term.Green
printColor = Term.White
outputColor = Term.Green
errorColor = Term.Red
infoColor = Term.Cyan

Off we go repling now:

type Prompt = H.InputT Repl

repl :: Prompt ()
repl = do
  replLineMode .= SingleLine
  replIndent .= 0
  replSeenInput .= ""
  Catch.handle (\H.Interrupt -> repl) . H.withInterrupt $
    readInput >>= \case
      EndOfInput -> outputWithColor promptColor "Goodbye."
      input -> evalAndPrint input >> repl

outputWithColor :: Term.Color -> String -> Prompt ()
outputWithColor color text = do
  addColor <- getAddColor
  H.outputStrLn $ addColor color text

getAddColor :: Prompt AddColor
getAddColor = lift Reader.ask

We infuse our Repl with the powers of Haskeline by wrapping it with Haskeline’s InputT monad transformer, and call it the Prompt type. In the repl function, we readInput, evalAndPrint it, and repl again.

We also deal with the user quitting the REPL (the EndOfInput case), and hitting Ctrl + C to interrupt typing or a running evaluation (the handling for H.Interrupt).

Wait a minute! What is that imperative looking .= doing in our Haskell code? That’s right, we are looking through some lenses!

type Lens' s a = Lens.Lens s s a a

replDefs :: Lens' ReplState Defs
replDefs = $(Lens.field '_replDefs)

replSettings :: Lens' ReplState Settings
replSettings = $(Lens.field '_replSettings)

replLineMode :: Lens' ReplState LineMode
replLineMode = $(Lens.field '_replLineMode)

replIndent :: Lens' ReplState Int
replIndent = $(Lens.field '_replIndent)

replSeenInput :: Lens' ReplState String
replSeenInput = $(Lens.field '_replSeenInput)

use :: (MonadTrans t, MonadState s m) => Lens' s a -> t m a
use l = lift . State.gets $ Lens.view l

(.=) :: (MonadTrans t, MonadState s m) => Lens' s a -> a -> t m ()
l .= a = lift . State.modify' $ Lens.set l a

(%=) :: (MonadTrans t, MonadState s m) => Lens' s a -> (a -> a) -> t m ()
l %= f = lift . State.modify' $ Lens.over l f

If you’ve never encountered lenses before, you can think of them as pairs of setters and getters. The repl* lenses above are for setting and getting the corresponding fields from the ReplState data type11. The use, .=, and %= functions are for getting, setting and modifying respectively the state in the State monad using lenses. We see them in action at the beginning of the repl function when we use .= to set the various fields of ReplState to their initial values in the State monad.
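If the getter-and-setter intuition feels too abstract, the following sketch builds a tiny lens toolkit from scratch using only base. It is a hand-rolled illustration and not the API of the library used above: mkLens, MiniState, and the field names are assumptions made up for this example.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- A van Laarhoven lens: a single function that encodes a getter and a setter.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Hypothetical helper: build a lens from an explicit getter and setter.
mkLens :: (s -> a) -> (s -> a -> s) -> Lens' s a
mkLens getter setter f s = setter s <$> f (getter s)

-- Read the focused field by running the lens with the Const functor.
view :: Lens' s a -> s -> a
view l = getConst . l Const

-- Modify the focused field by running the lens with the Identity functor.
over :: Lens' s a -> (a -> a) -> s -> s
over l f = runIdentity . l (Identity . f)

-- Overwrite the focused field.
set :: Lens' s a -> a -> s -> s
set l a = over l (const a)

-- A cut-down stand-in for ReplState with just two fields.
data MiniState = MiniState {_indent :: Int, _seenInput :: String}
  deriving (Eq, Show)

indent :: Lens' MiniState Int
indent = mkLens _indent (\s a -> s {_indent = a})

seenInput :: Lens' MiniState String
seenInput = mkLens _seenInput (\s a -> s {_seenInput = a})
```

With these, `set indent 2` and `over seenInput (<> "(+ 1")` are ordinary pure functions on MiniState; the use, .=, and %= helpers above are the same operations run inside the State monad.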

All that is left now is actually reading the input, evaluating it and printing the results.

Reading the Input

Haskeline gives us functions to read the user’s input as text. However, being Haskellers, we prefer some structure around it:

data Input
  = Setting (SettingMode, Setting)
  | Load FilePath
  | Source String
  | Help
  | Program L.Program
  | BadInputError String
  | EndOfInput

We’ve got all previously mentioned cases covered with the Input data type. We also do some input validation and capture errors for the failure cases with the BadInputError constructor. EndOfInput is used for when the user quits the REPL.

Here is how we read the input:

readInput :: Prompt Input
readInput = do
  addColor <- getAddColor
  lineMode <- use replLineMode
  prevIndent <- use replIndent

  let promptSym = case lineMode of SingleLine -> "λ"; _ -> "|"
      prompt = addColor promptColor $ promptSym <> "> "

  mInput <- H.getInputLineWithInitial prompt (replicate prevIndent ' ', "")
  let currentIndent = maybe 0 (length . takeWhile (== ' ')) mInput

  case trimStart . trimEnd <$> mInput of
    Nothing -> return EndOfInput
    Just input | null input -> do
      replIndent .= case lineMode of
        SingleLine -> prevIndent
        MultiLine -> currentIndent
      readInput
    Just input@(':' : _) -> parseCommand input
    Just input -> parseCode input currentIndent

trimStart :: String -> String
trimStart = dropWhile Char.isSpace

trimEnd :: String -> String
trimEnd = dropWhileEnd Char.isSpace

We use the getInputLineWithInitial function provided by Haskeline to show a prompt and read the user’s input as a string. The prompt shown depends on the LineMode of the REPL state. In the SingleLine mode we show λ>, whereas in the MultiLine mode we show |>.

If there is no input, that means the user has quit the REPL. In that case we return EndOfInput, which is handled in the repl function. If the input is empty, we read more input, keeping the indentation of the empty line (currentIndent) in the MultiLine mode, and the previous indentation (prevIndent) otherwise.

If the input starts with :, we parse it for various commands:

parseCommand :: String -> Prompt Input
parseCommand input
  | ":help" `isPrefixOf` input = return Help
  | ":load" `isPrefixOf` input =
      checkFilePath . trimStart . fromJust $ stripPrefix ":load" input
  | ":source" `isPrefixOf` input = do
      return . Source . trimStart . fromJust $ stripPrefix ":source" input
  | input == ":" = return $ BadInputError "No command specified"
  | otherwise = case parseSettingCommand input of
      Right setting -> return $ Setting setting
      Left err -> return $ BadInputError err

checkFilePath :: String -> Prompt Input
checkFilePath file
  | null file = return $ BadInputError "No file specified"
  | otherwise =
      isSafeFilePath file <&> \case
        True -> Load file
        False -> BadInputError $ "Cannot access file: " <> file

isSafeFilePath :: (MonadIO m) => FilePath -> m Bool
isSafeFilePath fp =
  liftIO $ isPrefixOf <$> getCurrentDirectory <*> canonicalizePath fp

The :help and :source cases are straightforward. In case of :load, we make sure to check that the file asked to be loaded is located somewhere inside the current directory of the REPL or its recursive subdirectories. Otherwise, we deny loading by returning a BadInputError. We parse the settings using the parseSettingCommand function we wrote earlier.
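One caveat worth keeping in mind: isPrefixOf compares the two paths as plain strings, so a sibling directory whose name merely starts with the current directory’s name (say /work/repl-old next to /work/repl) would also pass the check. A stricter variant compares path components instead. The sketch below shows only the pure part; isInsideDir is a hypothetical helper, and it assumes both paths are absolute and already canonicalized.

```haskell
import Data.List (isPrefixOf)
import System.FilePath (splitDirectories)

-- Hypothetical pure helper: is `path` equal to or nested under `dir`?
-- Both arguments are assumed to be absolute and already canonicalized.
-- Comparing lists of path components avoids accepting a sibling directory
-- that only shares a name prefix with `dir`.
isInsideDir :: FilePath -> FilePath -> Bool
isInsideDir dir path = splitDirectories dir `isPrefixOf` splitDirectories path
```

It could be dropped into isSafeFilePath as `isInsideDir <$> getCurrentDirectory <*> canonicalizePath fp`, keeping the rest of the code unchanged.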

If the input is not a command, we parse it as code:

parseCode :: String -> Int -> Prompt Input
parseCode currentInput indent = do
  seenInput <- use replSeenInput
  let input = seenInput <> " " <> currentInput
  case L.parse input of
    Left L.EndOfStreamError -> do
      replLineMode .= MultiLine
      replIndent .= indent
      replSeenInput .= input
      readInput
    Left err ->
      return $ BadInputError $ "ERROR: " <> displayException err
    Right program -> return $ Program program

We append the current input to the previously seen input (in case of multiline input) and parse the result using the parse function provided by the Language.FiboLisp module. If parsing fails with an EndOfStreamError, it means that the input is incomplete. In that case, we set the REPL line mode to MultiLine, the REPL indentation to the current indentation, and the seen input to the previously seen input appended with the current input, and read more input. If it is some other error, we return a BadInputError with it.

If the result of parsing is a program, we return it as a Program input.
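The accumulate-until-complete loop can be tried out in isolation with a toy parser. In the sketch below, toyParse is a made-up stand-in for L.parse that only tracks parenthesis depth, and feedLines plays the role of the parseCode and readInput pair, taking its lines from a list instead of the terminal.

```haskell
-- Outcome of trying to parse the text gathered so far.
data ParseResult = Complete String | Incomplete | Malformed
  deriving (Eq, Show)

-- Toy stand-in for L.parse: it only tracks the parenthesis depth.
toyParse :: String -> ParseResult
toyParse input = go (0 :: Int) input
  where
    go depth []
      | depth > 0 = Incomplete -- ran out of input inside a form
      | otherwise = Complete input
    go depth (c : cs)
      | c == '(' = go (depth + 1) cs
      | c == ')' = if depth == 0 then Malformed else go (depth - 1) cs
      | otherwise = go depth cs

-- Feed lines one at a time: on Incomplete, keep the accumulated text and
-- continue with the next line; on anything else, stop with that result.
feedLines :: [String] -> ParseResult
feedLines = go ""
  where
    go _ [] = Incomplete
    go seen (line : rest) =
      let input = seen <> " " <> line
       in case toyParse input of
            Incomplete -> go input rest
            result -> result
```

Feeding it the two lines "(let ((x 1))" and "  (+ x 1))" yields a Complete result only after the second line, whereas a lone ")" is rejected as Malformed right away, just as a non-EndOfStreamError parse error ends the multiline input in the REPL.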

That’s it for reading the user input. Next, we evaluate it.

Evaluating the Input

Recall that the repl function calls the evalAndPrint function with the read input:

evalAndPrint :: Input -> Prompt ()
evalAndPrint = \case
  EndOfInput -> return ()
  BadInputError err -> outputWithColor errorColor err
  Help -> H.outputStr helpMessage
  Setting (Set, setting) -> replSettings %= Set.insert setting
  Setting (Unset, setting) -> replSettings %= Set.delete setting
  Source ident -> showSource ident
  Load fp -> loadAndEvalFile fp
  Program program -> interpretAndPrint program
  where
    helpMessage =
      unlines
        [ "Available commands",
          ":set/:unset dump       Dumps the program AST",
          ":set/:unset time       Shows the program execution time",
          ":load <file>           Loads a source file",
          ":source <func_name>    Prints the source code of a function",
          ":help                  Shows this help"
        ]

The cases of EndOfInput, BadInputError and Help are straightforward. For settings, we insert or remove the setting from the REPL settings, depending on it being set or unset. For the other cases, we call the respective helper functions.

For a :source command, we check if the requested identifier maps to a user-defined or builtin function, and if so, print its source. Otherwise we print an error.

showSource :: L.Ident -> Prompt ()
showSource ident = do
  defs <- use replDefs
  case Map.lookup ident defs of
    Just def -> outputWithColor infoColor $ L.prettyShowDef def
    Nothing -> case Map.lookup ident L.builtinFuncs of
      Just func -> outputWithColor infoColor $ show func
      Nothing ->
        outputWithColor errorColor $ "No such function: " <> ident

For a :load command, we check if the requested file exists. If so, we read and parse it, and interpret the resultant program. In case of any errors in reading or parsing the file, we catch and print them.

loadAndEvalFile :: FilePath -> Prompt ()
loadAndEvalFile fp =
  liftIO (doesFileExist fp) >>= \case
    False -> outputWithColor errorColor $ "No such file: " <> fp
    True -> Catch.handleAll outputError $ do
      code <- liftIO $ readFile fp
      outputWithColor infoColor $ "Loaded " <> fp
      case L.parse code of
        Left err -> outputError err
        Right program -> interpretAndPrint program

outputError :: (Exception e) => e -> Prompt ()
outputError err =
  outputWithColor errorColor $ "ERROR: " <> displayException err

Finally, we come to the workhorse of the REPL: the interpretation of the user provided program:

interpretAndPrint :: L.Program -> Prompt ()
interpretAndPrint (L.Program pDefs exprs) =
  Catch.handleAll outputError $ do
    defs <- use replDefs
    settings <- use replSettings

    let defs' =
          foldl' (\ds d -> Map.insert (L.defName d) d ds) defs pDefs
        program = L.Program (Map.elems defs') exprs
    when (Dump `Set.member` settings) $
      outputWithColor infoColor (L.showProgram program)

    addColor <- getAddColor
    extPrint <- H.getExternalPrint

    (execTime, val) <- liftIO . measureElapsedTime $ do
      val <- L.interpret (extPrint . addColor printColor) program
      evaluate $ DS.force val

    case val of
      Left err -> outputError err
      Right v -> do
        let output = show v
        if null output
          then return ()
          else outputWithColor outputColor $ "=> " <> output

    when (MeasureTime `Set.member` settings) $
      outputWithColor infoColor $
        "(Execution time: " <> show execTime <> ")"

    replDefs .= defs'

measureElapsedTime :: IO a -> IO (NominalDiffTime, a)
measureElapsedTime f = do
  start <- getCurrentTime
  ret <- f
  end <- getCurrentTime
  return (diffUTCTime end start, ret)

We start by merging the functions defined in the current input into the functions defined earlier in the session, such that the current definitions override the earlier ones with the same names. At this point, if the dump setting is set, we print the program AST.
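The override behaviour comes from Map.insert replacing any existing entry for a key. A minimal sketch with a simplified, hypothetical Def type (the real L.Def is not shown in this section) looks like this:

```haskell
import Data.List (foldl')
import qualified Data.Map as Map

-- Hypothetical stand-in for L.Def: a name plus its source text.
data Def = Def {defName :: String, defBody :: String}
  deriving (Eq, Show)

-- Fold the new definitions into the session's definitions. Map.insert
-- replaces an existing entry, so later definitions win over earlier ones.
mergeDefs :: Map.Map String Def -> [Def] -> Map.Map String Def
mergeDefs = foldl' (\ds d -> Map.insert (defName d) d ds)
```

Note that Map.elems returns the definitions in ascending order of their names, so the program handed to the interpreter lists them alphabetically and not in the order in which they were entered.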

Then we invoke the interpret function provided by the Language.FiboLisp module. Recall that the interpret function takes the program to interpret and a function of type String -> IO (). This function is a color-adding wrapper over the function returned by the Haskeline function getExternalPrint12. This function allows non-REPL code to safely print to the Haskeline driven REPL without garbling the output. We pass it to the interpret function so that the interpreter can invoke it when the user code invokes the builtin print function or similar.

We make sure to force and evaluate the value returned by the interpreter so that any lazy values or errors are fully evaluated13, and the measured elapsed time is correct.
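The difference between evaluating and fully forcing a value is easy to miss, so here is a small sketch of it using the deepseq package. The names lazyBomb, shallow, deep, and surfaced are made up for this example; a list of Int stands in for the interpreter’s result.

```haskell
import Control.DeepSeq (force)
import Control.Exception (ErrorCall, evaluate, try)

-- A list whose first cell is fine but whose last element throws when
-- it is inspected.
lazyBomb :: [Int]
lazyBomb = [1, 2, error "boom"]

-- Evaluating only to weak head normal form stops at the first cons cell,
-- so the error stays hidden inside the result.
shallow :: IO (Either ErrorCall [Int])
shallow = try (evaluate lazyBomb)

-- Forcing to normal form evaluates every element, so the error is thrown
-- here, inside the region that we are timing and handling.
deep :: IO (Either ErrorCall [Int])
deep = try (evaluate (force lazyBomb))

-- Did the exception surface during evaluation?
surfaced :: Either ErrorCall a -> Bool
surfaced = either (const True) (const False)
```

Without force, the error would only be thrown later, when the value is shown, which would be outside the measured time.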

If the interpreter returns an error, we print it. Else we convert the value to a string, and if it is not empty14, we print it.

Finally, we print the execution time if the time setting is set, and set the REPL defs to the current program defs.

That’s all! We have completed our REPL. But wait, I think we forgot one thing …

Doing the Completions

The REPL would work fine with this much code, but it would not be a good experience for the user, because they’d have to type everything without any help from the REPL. To make it convenient for the user, we provide contextual auto-completion functionality while typing. Haskeline lets us plug in our custom completion logic by setting a completion function, which we did way back at the start. Now we need to implement it.

doCompletions :: H.CompletionFunc Repl
doCompletions =
  fmap runIdentityT . H.completeWordWithPrev Nothing " " $ \leftRev word -> do
    defs <- use replDefs
    lineMode <- use replLineMode
    settings <- use replSettings
    let funcs = nub $ Map.keys defs <> Map.keys L.builtinFuncs
        vals = map show L.builtinVals
    case (word, lineMode) of
      ('(' : rest, _) ->
        pure
          [ H.Completion ('(' : hint) hint True
            | hint <- nub . sort $ L.carKeywords <> funcs,
              rest `isPrefixOf` hint
          ]
      (_, SingleLine) -> case word of
        "" | null leftRev ->
          pure [H.Completion "" s True | s <- commands <> funcs <> vals]
        ':' : _ | null leftRev ->
          pure [H.simpleCompletion c | c <- commands, word `isPrefixOf` c]
        _
          | "tes:" `isSuffixOf` leftRev ->
            pure
              [ H.simpleCompletion $ show s
                | s <- [Dump ..], s `notElem` settings, word `isPrefixOf` show s
              ]
          | "tesnu:" `isSuffixOf` leftRev ->
            pure
              [ H.simpleCompletion $ show s
                | s <- [Dump ..], s `elem` settings, word `isPrefixOf` show s
              ]
          | "daol:" `isSuffixOf` leftRev ->
            isSafeFilePath word >>= \case
              True -> H.listFiles word
              False -> pure []
          | "ecruos:" `isSuffixOf` leftRev ->
            pure
              [ H.simpleCompletion ident
                | ident <- funcs,
                  ident `Map.notMember` L.builtinFuncs,
                  word `isPrefixOf` ident
              ]
          | otherwise ->
            pure [H.simpleCompletion c | c <- funcs <> vals, word `isPrefixOf` c]
      _ -> pure []
  where
    commands = ":help" : ":load" : ":source" : map show [Set ..]

Haskeline provides us the completeWordWithPrev function to easily create our own completion function. It takes a callback function that it calls with the current word being completed (the word immediately to the left of the cursor), and the content of the line before the word (to the left of the word), reversed. We use these to return different completion lists of strings.

Going case by case:

  1. If the word starts with (, it means we are in the middle of writing FiboLisp code. So we return the carKeywords and the user-defined and builtin function names that start with the current word sans the initial (. This happens regardless of the current line mode. The rest of the cases below apply only in the SingleLine mode.
  2. If the entire line is empty, we return the names of all commands, functions, and builtin values.
  3. If the word starts with :, and is at the beginning of the line, we return the commands that start with the word.
  4. If the line starts with
    1. :set, we return the not set settings
    2. :unset, we return the set settings
    3. :load, we return the names of the files and directories in the current directory
    4. :source, we return the names of the user-defined functions
    that start with the word.
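  5. Otherwise, we return the function names and builtin values that start with the word. In the MultiLine mode, we return no completions unless the first case applies.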

This covers all cases, and provides helpful completions, while avoiding bad ones. And this completes the implementation of our wonderful REPL.
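The reversed strings such as "tes:" and "daol:" in doCompletions may look odd at first. They exist because the left context is handed to the callback reversed, so checking that the line starts with a command becomes a suffix check. Here is a small pure sketch of the idea; lineStartsWith and contextOf are hypothetical helpers, and contextOf only approximates what Haskeline passes to the callback (it splits on the last space and ignores escaping).

```haskell
import Data.List (isSuffixOf)

-- Hypothetical helper: does the line begin with the given command?
-- `leftRev` is the text to the left of the current word, reversed.
lineStartsWith :: String -> String -> Bool
lineStartsWith cmd leftRev = reverse cmd `isSuffixOf` leftRev

-- Hypothetical helper: approximate the callback's arguments for a line
-- with the cursor at its end. Returns the reversed left context and the
-- word being completed.
contextOf :: String -> (String, String)
contextOf line = (leftRev, reverse wordRev)
  where
    (wordRev, leftRev) = break (== ' ') (reverse line)
```

For the line ":set du", contextOf gives the left context " tes:" and the word "du", so the ":set" branch matches and the ":unset" branch does not.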

Conclusion

I wrote this REPL for a Lisp that I implemented15 while going through the Essentials of Compilation book, which I thoroughly recommend for getting started with compilers. It started as a basic REPL, and gathered a lot of nice features over time. So I decided to extract and share it here. I hope that this Haskeline tutorial helps you in creating beautiful and useful REPLs. Here is the complete code for the REPL.


  1. The online demo is rather slow to load and to run, and works only on Firefox and Chrome. Even though I managed to put it together somehow, I don’t actually know how it exactly works, and I’m unable to fix the issues with it.↩︎

  2. Lisps are awesome and I absolutely recommend creating one or more of them as an amateur PL implementer. Some resources I recommend are: the Build Your Own Lisp book, and the Make-A-Lisp tutorial.↩︎

  3. REPLs are wonderful for doing interactive and exploratory programming where you try out small snippets of code in the REPL, and put your program together piece-by-piece. They are also good for debugging because they let you inspect the state of running programs from within. I still fondly remember the experience of connecting (or jacking in) to running production systems written in Clojure over REPL, and figuring out issues by dumping variables.↩︎

  4. We don’t even need let. We can, and have to, define variables by creating functions, with parameters serving the role of variables. In fact, we can’t even assign or reassign variables. Functions are the only scoping mechanism in FiboLisp, much like old-school JavaScript with its IIFEs.↩︎

  5. car is obviously Contents of the Address part of the Register, the first expression in a list form in a Lisp.↩︎

  6. You may be wondering about why we need the NFData instances for the errors and values. This will become clear when we write the REPL.↩︎

  7. I recommend the sexp-grammar library, which provides both parsing and printing facilities for S-expressions based languages. Or you can write something by yourself using the parsing and pretty-printing libraries like megaparsec and prettyprinter.↩︎

  8. We assume that our project’s Cabal file sets the default-language to GHC2021, and the default-extensions to LambdaCase, OverloadedStrings, RecordWildCards, and StrictData.↩︎

  9. Recall that there is no way to define variables in FiboLisp.↩︎

  10. If the interpreter allows mutually recursive function definitions, functions can be called before defining them.↩︎

  11. We are using the basic-lens library here, which is the tiniest lens library, and provides only the five functions and types we see used here.↩︎

  12. Using the function returned from getExternalPrint is not necessary in our case because the REPL blocks when it invokes the interpreter. That means, nothing but the interpreter can print anything while it is running. So the interpreter can actually print directly to stdout and nothing will go wrong.

    However, imagine a case in which our code starts a background thread that needs to print to the REPL. In such case, we must use the Haskeline provided print function instead of printing directly. When printing to the REPL using it, Haskeline coordinates the prints so that the output in the terminal is not garbled.↩︎

  13. Now we see why we derive NFData instances for errors and Value.↩︎

  14. Returned value could be of type void with no textual representation, in which case we would not print it.↩︎

  15. I wrote the original REPL code almost three years ago. I refactored, rewrote and improved a lot of it in the course of writing this post. As they say, writing is thinking.↩︎

If you liked this post, please leave a comment.

by Abhinav Sarkar (abhinav@abhinavsarkar.net) at October 31, 2024 12:00 AM

October 25, 2024

Derek Elkins

Classical First-Order Logic from the Perspective of Categorical Logic

Introduction

Classical First-Order Logic (Classical FOL) has an absolutely central place in traditional logic, model theory, and set theory. It is the foundation upon which ZF(C), which is itself often taken as the foundation of mathematics, is built. When classical FOL was being established there was a lot of study and debate around alternative options. There are a variety of philosophical and metatheoretic reasons supporting classical FOL as The Right Choice.

This all happened, however, well before category theory was even a twinkle in Mac Lane’s and Eilenberg’s eyes, and when type theory was taking its first stumbling steps.

My focus in this article is on what classical FOL looks like to a modern categorical logician. This can be neatly summarized as “classical FOL is the internal logic of a Boolean First-Order Hyperdoctrine”. Each of the three words in this term, “Boolean”, “First-Order”, and “Hyperdoctrine”, suggests a distinct axis along which to vary the (class of categorical models of the) logic. All of them have compelling categorical motivations to be varied.

Boolean

The first and simplest is the term “Boolean”. This is what differentiates the categorical semantics of classical (first-order) logic from constructive (first-order) logic. Considering arbitrary first-order hyperdoctrines would give us a form of intuitionistic first-order logic.

It is fairly rare that the categories categorists are interested in are Boolean. For example, most toposes, all of which give rise to first-order hyperdoctrines, are not Boolean. The assumption that they are tends to correspond to a kind of “discreteness” that’s often at odds with the purpose of the topos. For example, a category of sheaves on a topological space is Boolean if and only if that space is a Stone space. These are certainly interesting spaces, but they are also totally disconnected unlike virtually every non-discrete topological space one would typically mention.

First-Order

The next term is the term “first-order”. As the name suggests, a first-order hyperdoctrine has the necessary structure to interpret first-order logic. The question, then, is what kind of categories have this structure and only this structure. The answer, as far as I’m aware, is not many.

Many (classes of) categories have the structure to be first-order hyperdoctrines, but often they have additional structure as well that it seems odd to ignore. The most notable and interesting example is toposes. All elementary toposes (which includes all Grothendieck toposes) have the structure to give rise to a first-order hyperdoctrine. But, famously, they also have the structure to give rise to a higher order logic. Even more interesting, while Grothendieck toposes, being elementary toposes, technically do support the necessary structure for first-order logic, the natural morphisms of Grothendieck toposes, geometric morphisms, do not preserve that structure, unlike the logical functors between elementary toposes.

The natural internal logic for Grothendieck toposes turns out to be geometric logic. This is a logic that lacks universal quantification and implication (and thus negation) but does have infinitary disjunction. This leads to a logic that is, at least superficially, incomparable to first-order logic. Closely related logics are regular logic and coherent logic which are sub-logics of both geometric logic and first-order logic.

We see, then, just from the examples of the natural logics of toposes, that none of them are first-order logic, and we get examples that are more powerful, less powerful, and incomparable to first-order logic. Other common classes of categories give other natural logics, such as the cartesian logic from left exact categories, and monoidal categories give rise to (ordered) linear logics. We get the simply typed lambda calculus from cartesian closed categories which leads to the next topic.

Hyperdoctrine

A (posetal) hyperdoctrine essentially takes a category and, for each object in that category, assigns to it a poset of “predicates” on that object. In many cases, this takes the form of the Sub functor assigning to each object its poset of subobjects. Various versions of hyperdoctrines will require additional structure on the source category, these posets, and/or the functor itself to interpret various logical connectives. For example, a regular hyperdoctrine requires the source category to have finite limits, the posets to be meet-semilattices, and the functor to give rise to monotonic functions with left adjoints satisfying certain properties. This notion of hyperdoctrines is suitable for regular logic.
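To make the regular case a bit more concrete, the data can be written out as follows. This is a sketch under the usual conventions, not a full definition: I am reading the “certain properties” as Frobenius reciprocity together with the Beck-Chevalley condition, and the symbols P, f, and the existential quantifier notation are the standard ones and not notation fixed elsewhere in this article.

```latex
% A posetal hyperdoctrine over a category C with finite limits:
P : \mathcal{C}^{\mathrm{op}} \to \mathbf{Pos}

% Each f : X \to Y gives a monotone reindexing map with a left adjoint:
P(f) : P(Y) \to P(X), \qquad \exists_f \dashv P(f), \qquad
\exists_f(\varphi) \le \psi \iff \varphi \le P(f)(\psi)

% Frobenius reciprocity, for \varphi \in P(X) and \psi \in P(Y):
\exists_f\bigl(\varphi \wedge P(f)(\psi)\bigr) = \exists_f(\varphi) \wedge \psi
```

The Beck-Chevalley condition additionally asks that these left adjoints commute with reindexing along pullback squares, which is what makes existential quantification stable under substitution.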

It’s very easy to recognize that these functors are essentially indexed |(0,1)|-categories. This immediately suggests that we should consider higher categorical versions or at the very least normal indexed categories.

What this means for the logic is that we move from proof-irrelevant logic to proof-relevant logic. We now have potentially multiple ways a “predicate” could “entail” another “predicate”. We can present the simply typed lambda calculus in this indexed category manner. This naturally leads/connects to the categorical semantics of type theories.

Pushing forward to |(\infty, 1)|-categories is also fairly natural, as it’s natural to want to talk about an entailment holding for distinct but “equivalent” reasons.

Summary

Moving in all three of these directions simultaneously leads pretty naturally to something like Homotopy Type Theory (HoTT). HoTT is a naturally constructive (but not anti-classical) type theory aimed at being an internal language for |(\infty, 1)|-toposes.

Why Classical FOL?

Okay, so why did people pick classical FOL in the first place? It’s not like the concept of, say, a higher-order logic wasn’t considered at the time.

Classical versus Intuitionistic was debated at the time, but at that time it was primarily a philosophical argument, and the defense of Intuitionism was not very compelling (to me and obviously people at the time). The focus would probably have been more on (classical) FOL versus second- (or higher-)order logic.

Oversimplifying, the issue with second-order logic is fairly evident from the semantics. There are two main approaches: Henkin-semantics and full (or standard) semantics. Henkin-semantics keeps the nice properties of (classical) FOL but fails to get the nice properties, namely categoricity properties, of second-order logic. This isn’t surprising as Henkin-semantics can be encoded into first-order logic. It’s essentially syntactic sugar. Full semantics, however, states that the interpretation of predicate sorts is power sets of (cartesian products of) the domain1. This leads to massive completeness problems as our metalogical set theory has many, many ways of building subsets of the domain. There are metatheoretic results that state that there is no computable set of logical axioms that would give us a sound and complete theory for second-order logic with respect to full semantics. This aspect is also philosophically problematic, because we don’t want to need set theory to understand the very formulation of set theory. Thus Quine’s comment that “second-order logic [was] set theory in sheep’s clothing”.

On the more positive and (meta-)mathematical side, we have results like Lindström’s theorem which states that classical FOL is the strongest logic that simultaneously satisfies (downward) Löwenheim-Skolem and compactness. There’s also a syntactic result by Lindström which characterizes first-order logic as the only logic having a recursively enumerable set of tautologies and satisfying Löwenheim-Skolem2.

The Catch

There’s one big caveat to the above. All of the above results are formulated in traditional model theory which means there are various assumptions built into their statements. In the language of categorical logic, these assumptions can basically be summed up in the statement that the only category of semantics that traditional model theory considers is Set.

This is an utterly bizarre thing to do from the standpoint of categorical logic.

The issues with full semantics follow directly from this choice. If, as categorical logic would have us do, we considered every category with sufficient structure as a potential category of semantics, then our theory would not be forced to follow every nook and cranny of Set’s notion of subset to be complete. Valid formulas would need to be true not only in Set but in wildly different categories, e.g. every (Boolean) topos.

These traditional results are also often very specific to classical FOL. Dropping this constraint of classical logic would lead to an even broader class of models.

Categorical Perspective on Classical First-Order Logic

A Boolean category is just a coherent category where every object has a complement. Since coherent functors preserve complements, we have that the category of Boolean categories is a full subcategory of the category of coherent categories.

One nice thing about, specifically, classical first-order logic from the perspective of category theory is the following. First, coherent logic is a sub-logic of geometric logic restricted to finitary disjunction. Via Morleyization, we can encode classical first-order logic into coherent logic such that the categories of models of each are equivalent. This implies that a classical FOL formula is valid if and only if its encoding is. Morleyization allows us to analyze classical FOL using the tools of classifying toposes. On the one hand, this once again suggests the importance of coherent logic, but it also means that we can use categorical tools with classical FOL.

Conclusion

There are certain things that I and, I believe, most logicians take as table stakes for a (foundational) logic3. For example, checking a proof should be computably decidable. For these reasons, I am in complete accord with early (formal) logicians that classical second-order logic with full semantics is an unacceptably worse alternative to classical first-order logic.

However, when it comes to statements about the specialness of FOL, a lot of them seem to be more statements about traditional model theory than FOL itself, and also statements about the philosophical predilections of the time. I feel that philosophical attitudes among logicians and mathematicians have shifted a decent amount since the beginning of the 20th century. We have different philosophical predilections today than then, but they are informed by another hundred years of thought, and they are more relevant to what is being done today.

Martin-Löf type theory (MLTT) and its progeny also present an alternative path with their own philosophical and metalogical justifications. I mention this to point out actual cases of foundational frameworks that a (very) superficial reading of traditional model theory results would seem to have “ruled out”. Even if one thinks that FOL+ZFC (or whatever) is the better foundation, I think it is unreasonable to assert that MLTT derivatives are unworkable as a foundation.


  1. It’s worth mentioning that this is exactly what categorical logic would suggest: our syntactic power objects should be mapped to semantic power objects.

  2. While nice, it’s not clear that compactness and, especially, Löwenheim-Skolem are sacrosanct properties that we’d be unwilling to do without. Lindström’s first theorem is thus a nice abstract characterization theorem for classical FOL, but it doesn’t shut the door on considering alternatives even in the context of traditional model theory.

  3. I’m totally fine thinking about logics that lack these properties, but I would never put any of them forward as an acceptable foundational logic.

October 25, 2024 12:55 AM

October 20, 2024

GHC Developer Blog

GHC 9.8.3 is now available

Ben Gamari - 2024-10-20

The GHC developers are happy to announce the availability of GHC 9.8.3. Binary distributions, source distributions, and documentation are available on the release page.

This release is primarily a bugfix release in the 9.8 series. The fixes include:

  • Significantly improve performance of code loading via dynamic linking (#23415)
  • Fix a variety of miscompilations involving sub-word-size FFI arguments (#25018, #24314)
  • Fix a rare miscompilation by the x86 native code generator (#24507)
  • Improve runtime performance of some applications of runRW# (#25055)
  • Reduce fragmentation when using the non-moving garbage collector (#23340)
  • Fix source links in Haddock’s hyperlinked sources output (#24086)

A full accounting of changes can be found in the release notes. As some of the fixed issues affect correctness, users are encouraged to upgrade promptly.

We would like to thank Microsoft Azure, GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

Happy compiling!

- Ben

by ghc-devs at October 20, 2024 12:00 AM

October 17, 2024

Tweag I/O

Introducing rules_gcs

At Tweag, we are constantly striving to improve the developer experience by contributing tools and utilities that streamline workflows. We recently completed a project with IMAX, where we learned that they had developed a way to simplify and optimize the process of integrating Google Cloud Storage (GCS) with Bazel. Seeing value in this tool for the broader community, we decided to publish it together under an open source license. In this blog post, we’ll dive into the features, installation, and usage of rules_gcs, and how it provides you with access to private resources.

What is rules_gcs?

rules_gcs is a Bazel ruleset that facilitates the downloading of files from Google Cloud Storage. It is designed to be a drop-in replacement for Bazel’s http_file and http_archive rules, with features that make it particularly suited for GCS. With rules_gcs, you can efficiently fetch large amounts of data, leverage Bazel’s repository cache, and handle private GCS buckets with ease.

Key Features

  • Drop-in Replacement: rules_gcs provides gcs_file and gcs_archive rules that can directly replace http_file and http_archive. They take a gs://bucket_name/object_name URL and internally translate this to an HTTPS URL. This makes it easy to transition to GCS-specific rules without major changes to your existing Bazel setup.

  • Lazy Fetching with gcs_bucket: For projects that require downloading multiple objects from a GCS bucket, rules_gcs includes a gcs_bucket module extension. This feature allows for lazy fetching, meaning objects are only downloaded as needed, which can save time and bandwidth, especially in large-scale projects.

  • Private Bucket Support: Accessing private GCS buckets is seamlessly handled by rules_gcs. The ruleset supports credential management through a credential helper, ensuring secure access without the need to hardcode credentials or use gsutil for downloading.

  • Bazel’s Downloader Integration: rules_gcs uses Bazel’s built-in downloader and repository cache, optimizing the download process and ensuring that files are cached efficiently across builds, even across multiple Bazel workspaces on your local machine.

  • Small footprint: Apart from the gcloud CLI tool (for obtaining authentication tokens), rules_gcs requires no additional dependencies or Bazel modules. This minimalistic approach reduces setup complexity and potential conflicts with other tools.
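The translation mentioned in the "Drop-in Replacement" item can be sketched in a few lines of Python. This is only an illustration: the post does not say which HTTPS endpoint rules_gcs targets internally, so the `storage.googleapis.com/<bucket>/<object>` layout and the helper name `gs_to_https` are assumptions made here.

```python
from urllib.parse import quote

# Assumption: objects are downloaded from this GCS endpoint.
GCS_HOST = "storage.googleapis.com"


def gs_to_https(url: str) -> str:
    """Translate gs://bucket_name/object_name into an HTTPS URL (hypothetical helper)."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError(f"not a gs:// URL: {url!r}")
    # Everything up to the first slash is the bucket, the rest is the object name.
    bucket, _, object_name = url[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://bucket/object, got {url!r}")
    # Percent-encode the object name, keeping "/" so nested paths stay readable.
    return f"https://{GCS_HOST}/{bucket}/{quote(object_name)}"


print(gs_to_https("gs://my_org_assets/testdata.bin"))
```

Because the result is an ordinary HTTPS URL, it can be handed to Bazel's regular downloader, which is what lets the repository cache and credential helper apply unchanged.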

Understanding Bazel Repositories and Efficient Object Fetching with rules_gcs

Before we dive into the specifics of rules_gcs, it’s important to understand some key concepts about Bazel repositories and repository rules, as well as the challenges of efficiently managing large collections of objects from a Google Cloud Storage (GCS) bucket.

Bazel Repositories and Repository Rules

In Bazel, external dependencies are managed using repositories, which are declared in your WORKSPACE or MODULE.bazel file. Each repository corresponds to a package of code, binaries, or other resources that Bazel fetches and makes available for your build. Repository rules, such as http_archive or git_repository, and module extensions define how Bazel should download and prepare these external dependencies.

However, when dealing with a large number of objects, such as files stored in a GCS bucket, using a single repository to download all objects can be highly inefficient. This is because Bazel’s repository rules typically operate in an “eager” manner—they fetch all the specified files as soon as any target of the repository is needed. For large buckets, this means downloading potentially gigabytes of data even if only a few files are actually needed for the build. This eager fetching can lead to unnecessary network usage, increased build times, and larger disk footprints.

The rules_gcs Approach: Lazy Fetching with a Hub Repository

rules_gcs addresses this inefficiency by introducing a more granular approach to downloading objects from GCS. Instead of downloading all objects at once into a single repository, rules_gcs uses a module extension that creates a “hub” repository, which then manages individual sub-repositories for each GCS object.

How It Works
  1. Hub Repository: The hub repository acts as a central point of reference, containing metadata about the individual GCS objects. This follows the “hub-and-spoke” paradigm with a central repository (the bucket) containing references to a large number of small repositories for each object. This architecture is commonly used by Bazel module extensions to manage dependencies for different language ecosystems (including Python and Rust).

  2. Individual Repositories per GCS Object: For each GCS object specified in the lockfile, rules_gcs creates a separate repository using the gcs_file rule. This allows Bazel to fetch each object lazily—downloading only the files that are actually needed for the current build.

  3. Methods of Fetching: Users can choose between different methods in the gcs_bucket module extension. The default method of creating symlinks is efficient while preserving the file structure set in the lockfile. If you need to access objects as regular files, choose one of the other methods.

    • Symlink: Creates a symlink from the hub repo pointing to a file in its object repo, ensuring the object repo and symlink pointing to it are created only when the file is accessed.
    • Alias: Similar to symlink, but uses Bazel’s aliasing mechanism to reference the file. No files are created in the hub repo.
    • Copy: Creates a copy of a file in the hub repo when accessed.
    • Eager: Downloads all specified objects upfront into a single repository.

This modular approach is particularly beneficial for large-scale projects where only a subset of the data is needed for most builds. By fetching objects lazily, rules_gcs minimizes unnecessary data transfer and reduces build times.
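The difference between eager and lazy fetching can be illustrated with a toy model in Python. This is not how Bazel or rules_gcs is implemented; the classes `HubRepo` and `ObjectRepo` and the `fake_download` function are hypothetical stand-ins that only mimic the hub-and-spoke idea described above.

```python
from typing import Callable, Dict, List, Optional


class ObjectRepo:
    """Stands in for one per-object repository; downloads at most once."""

    def __init__(self, name: str, download: Callable[[str], bytes]) -> None:
        self.name = name
        self._download = download
        self._content: Optional[bytes] = None

    def content(self) -> bytes:
        if self._content is None:  # fetch lazily, on first access only
            self._content = self._download(self.name)
        return self._content


class HubRepo:
    """Stands in for the hub: it knows every object but fetches none up front."""

    def __init__(self, names: List[str], download: Callable[[str], bytes]) -> None:
        self._repos: Dict[str, ObjectRepo] = {
            name: ObjectRepo(name, download) for name in names
        }

    def get(self, name: str) -> bytes:
        return self._repos[name].content()


downloaded: List[str] = []  # records which objects were actually fetched


def fake_download(name: str) -> bytes:
    downloaded.append(name)
    return f"contents of {name}".encode()


hub = HubRepo(["a.bin", "b.bin", "c.bin"], fake_download)
hub.get("b.bin")
hub.get("b.bin")  # second access is served from the already fetched copy
print(downloaded)
```

Only the object that a build actually touches ends up in `downloaded`; an eager strategy would have fetched all three when the hub was created.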

Integrating with Bazel’s Credential Helper Protocol

Another critical aspect of rules_gcs is its seamless integration with Bazel’s credential management system. Accessing private GCS buckets securely requires proper authentication, and Bazel uses a credential helper protocol to handle this.

How Bazel’s Credential Helper Protocol Works

Bazel’s credential helper protocol is a mechanism that allows Bazel to fetch authentication credentials dynamically when accessing private resources, such as a GCS bucket. The protocol is designed to be simple and secure, ensuring that credentials are only used when necessary and are never hardcoded into build files.

When Bazel’s downloader prepares a request and a credential helper is configured, it invokes the credential helper with the command get. Additionally, the request URI is passed to the helper’s standard input, encoded as JSON. The helper is expected to return a JSON object containing HTTP headers, including the necessary Authorization header, which Bazel will then include in its requests.

Here’s a breakdown of how the credential_helper script used in rules_gcs works:

  1. Authentication Token Retrieval: The script uses the gcloud CLI tool to obtain an access token via gcloud auth application-default print-access-token. This token is tied to the user’s current authentication context and can be used to fetch any objects the user is allowed to access.

  2. Output Format: The script outputs the token in a JSON format that Bazel can directly use:

    {
      "headers": {
        "Authorization": ["Bearer ${TOKEN}"]
      }
    }

    This JSON object includes the Authorization header, which Bazel uses to authenticate its requests to the GCS bucket.

  3. Integration with Bazel: To use this credential helper, you need to configure Bazel by specifying the helper in the .bazelrc file:

    common --credential_helper=storage.googleapis.com=%workspace%/tools/credential-helper

    This line tells Bazel to use the specified credential_helper script whenever it needs to access resources from storage.googleapis.com. If a request returns an error code or unexpected content, credentials are invalidated and the helper is invoked again.
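To make the request and response shapes concrete, here is a minimal Python sketch of such a helper. The helper shipped with rules_gcs is a separate script and may differ; the function names `fetch_token`, `build_response` and `handle_get` are hypothetical, and the `"uri"` key in the request is how this sketch assumes the request URI is encoded.

```python
import json
import subprocess
from typing import Callable


def fetch_token() -> str:
    """Ask the gcloud CLI for an application-default access token."""
    result = subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def build_response(token: str) -> dict:
    """Build the JSON object the helper writes to standard output."""
    return {"headers": {"Authorization": [f"Bearer {token}"]}}


def handle_get(request_json: str, get_token: Callable[[], str] = fetch_token) -> str:
    """Handle one `get` invocation: request JSON in, response JSON out."""
    request = json.loads(request_json)
    request["uri"]  # assumed key; raises KeyError if the request carries no URI
    # This sketch returns the same token for every URI it is asked about.
    return json.dumps(build_response(get_token()))


# A real script would end with something like:
#   if sys.argv[1:] == ["get"]:
#       print(handle_get(sys.stdin.read()))
```

Passing the token source as a parameter keeps the JSON handling testable without a working gcloud installation.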

How rules_gcs Hooks Into the Credential Helper Protocol

rules_gcs leverages this credential helper protocol to manage access to private GCS buckets securely and efficiently. By providing a pre-configured credential helper script, rules_gcs ensures that users can easily set up secure access without needing to manage tokens or authentication details manually.

Moreover, by limiting the scope of the credential helper to the GCS domain (storage.googleapis.com), rules_gcs reduces the risk of credentials being misused or accidentally exposed. The helper script is designed to be lightweight, relying on existing gcloud credentials, and integrates seamlessly into the Bazel build process.

Installing rules_gcs

Adding rules_gcs to your Bazel project is straightforward. The latest version is available on the Bazel Central Registry. To install, simply add the following to your MODULE.bazel file:

bazel_dep(name = "rules_gcs", version = "1.0.0")

You will also need to include the credential helper script in your repository:

mkdir -p tools
wget -O tools/credential-helper https://raw.githubusercontent.com/tweag/rules_gcs/main/tools/credential-helper
chmod +x tools/credential-helper

Next, configure Bazel to use the credential helper by adding the following lines to your .bazelrc:

common --credential_helper=storage.googleapis.com=%workspace%/tools/credential-helper
# optional setting to make rules_gcs more efficient
common --experimental_repository_cache_hardlinks

These settings ensure that Bazel uses the credential helper specifically for GCS requests. Additionally, the setting --experimental_repository_cache_hardlinks allows Bazel to hardlink files from the repository cache instead of copying them into a repository. This saves time and storage space, but requires the repository cache to be located on the same filesystem as the output base.

Using rules_gcs in Your Project

rules_gcs provides three primary rules: gcs_bucket, gcs_file, and gcs_archive. Here’s a quick overview of how to use each:

  • gcs_bucket: When dealing with multiple files from a GCS bucket, the gcs_bucket module extension offers a powerful and efficient way to manage these dependencies. You define the objects in a JSON lockfile, and gcs_bucket handles the rest.

    gcs_bucket = use_extension("@rules_gcs//gcs:extensions.bzl", "gcs_bucket")
    
    gcs_bucket.from_file(
        name = "trainingdata",
        bucket = "my_org_assets",
        lockfile = "@//:gcs_lock.json",
    )

  • gcs_file: Use this rule to download a single file from GCS. It’s particularly useful for pulling in assets or binaries needed during your build or test processes. Since it is a repository rule, you have to invoke it with use_repo_rule in a MODULE.bazel file (or wrap it in a module extension).

    gcs_file = use_repo_rule("@rules_gcs//gcs:repo_rules.bzl", "gcs_file")
    
    gcs_file(
        name = "my_testdata",
        url = "gs://my_org_assets/testdata.bin",
        sha256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    )

  • gcs_archive: This rule downloads and extracts an archive from GCS, making it ideal for pulling in entire repositories or libraries that your project depends on. Since it is a repository rule, you have to invoke it with use_repo_rule in a MODULE.bazel file (or wrap it in a module extension).

    gcs_archive = use_repo_rule("@rules_gcs//gcs:repo_rules.bzl", "gcs_archive")
    
    gcs_archive(
        name = "magic",
        url = "gs://my_org_code/libmagic.tar.gz",
        sha256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        build_file = "@//:magic.BUILD",
    )

Try it Out

rules_gcs is a versatile and simple solution for integrating Google Cloud Storage with Bazel. We invite you to try out rules_gcs in your projects and contribute to its development. As always, we welcome feedback and look forward to seeing how this tool enhances your workflows. Check out the full example to get started!

Thanks to IMAX for sharing their initial implementation of rules_gcs and allowing us to publish the code under an open source license.

October 17, 2024 12:00 AM

October 16, 2024

Well-Typed.Com

The Haskell Unfolder Episode 34: you already understand monads

Today, 2024-10-16, at 1830 UTC (11:30 am PDT, 2:30 pm EDT, 7:30 pm BST, 20:30 CEST, …) we are streaming the 34th episode of the Haskell Unfolder live on YouTube.

The Haskell Unfolder Episode 34: you already understand monads

Function composition is the idea that we can take two functions and create a new function, which applies the two functions one after the other. When viewed from the right angle, monads generalize this idea from functions to programs: construct new programs by running other programs one after the other. In this episode we make this simple idea precise. We will also see what the monad laws look like in this setting, and we will discuss an example of what goes wrong when the monad laws are broken.

About the Haskell Unfolder

The Haskell Unfolder is a YouTube series about all things Haskell hosted by Edsko de Vries and Andres Löh, with episodes appearing approximately every two weeks. All episodes are live-streamed, and we try to respond to audience questions. All episodes are also available as recordings afterwards.

We have a GitHub repository with code samples from the episodes.

And we have a public Google calendar (also available as ICal) listing the planned schedule.

There’s now also a web shop where you can buy t-shirts and mugs (and potentially in the future other items) with the Haskell Unfolder logo.

by andres, edsko at October 16, 2024 12:00 AM

GHC Developer Blog

GHC 9.12.1-alpha1 is now available

Zubin Duggal - 2024-10-16

The GHC developers are very pleased to announce the availability of the first alpha release of GHC 9.12.1. Binary distributions, source distributions, and documentation are available at downloads.haskell.org.

We hope to have this release available via ghcup shortly.

GHC 9.12 will bring a number of new features and improvements, including:

  • The new language extension OrPatterns allowing you to combine multiple pattern clauses into one.

  • The MultilineStrings language extension to allow you to more easily write strings spanning multiple lines in your source code.

  • Improvements to the OverloadedRecordDot extension, allowing the built-in HasField class to be used for records with fields of non-lifted representations.

  • The NamedDefaults language extension has been introduced allowing you to define defaults for typeclasses other than Num.

  • More deterministic object code output, controlled by the -fobject-determinism flag, which substantially improves the determinism of builds (though not completely) at the cost of some compiler performance (1-2%). See #12935 for the details.

  • GHC now accepts type syntax in expressions as part of GHC Proposal #281.

  • … and many more

A full accounting of changes can be found in the release notes. As always, GHC’s release status, including planned future releases, can be found on the GHC Wiki status.

We would like to thank GitHub, IOG, the Zw3rk stake pool, Well-Typed, Tweag I/O, Serokell, Equinix, SimSpace, the Haskell Foundation, and other anonymous contributors whose on-going financial and in-kind support has facilitated GHC maintenance and release management over the years. Finally, this release would not have been possible without the hundreds of open-source contributors whose work comprises this release.

As always, do give this release a try and open a ticket if you see anything amiss.

by ghc-devs at October 16, 2024 12:00 AM

October 15, 2024

Philip Wadler

You can help Cards Against Humanity pay "blue leaning" nonvoters $100 to vote


How is this not illegal??? Cards Against Humanity is PAYING people who didn't vote in 2020 to apologize, make a voting plan, and post #DonaldTrumpIsAHumanToilet—up to $100 for blue-leaning people in swing states. I helped by getting a 2024 Election Pack: checkout.giveashit.lol. Spotted via BoingBoing. More info at The Register. (Only American citizens and residents can participate. If, like me, you are an American citizen but non-resident, you will need a VPN.)

by Philip Wadler (noreply@blogger.com) at October 15, 2024 08:11 AM

October 14, 2024

Edward Z. Yang

Tensor programming for databases, with first class dimensions

Tensor libraries like PyTorch and JAX have developed compact and accelerated APIs for manipulating n-dimensional arrays. N-dimensional arrays are somewhat similar to tables in a database, which raises a natural question: could you set up a Tensor-like API to run queries on databases that would normally be written in SQL? We have two challenges:

  • Tensor computation is typically uniform and data-independent. But SQL relational queries are almost entirely about filtering and joining data in a data-dependent way.
  • JOINs in SQL can be thought of as performing outer products (followed by a filter), which is not a very common operation in tensor computation.

However, we have a secret weapon: first class dimensions, which were primarily designed as a new frontend syntax that makes it easy to express einsum, batching and tensor indexing expressions. They might be good for SQL too.

Representing the database. First, how do we represent a database? A simple model, following columnar databases, is to have every column be a distinct 1D tensor, where all columns that are part of the same table share a consistent indexing scheme. For simplicity, we'll assume that we support rich dtypes for the tensors (e.g., so I can have a tensor of strings). So if we consider our classic customer database of (id, name, email), we would represent this as:

customers_id: int64[C]
customers_name: str[C]
customers_email: str[C]

Here, C is the number of entries in the customer database. Our tensor type is written as dtype[DIM0, DIM1, ...], where I reuse the name that I will use for the first class dimension that represents it. Let's suppose that the index into C does not coincide with id (which is good, because if they did coincide, you would have a very bad time if you ever wanted to delete an entry from the database!).

This gives us an opportunity for baby's first query: let's implement this query:

SELECT c.name, c.email FROM customers c WHERE c.id = 1000

Notice that the result of this operation is data-dependent: it may contain zero rows or one row, depending on whether the id is in the database. Here is a naive implementation in standard PyTorch:

mask = customers_id == 1000
return (customers_name[mask], customers_email[mask])

Here, we use boolean masking to perform the data-dependent filtering operation. In eager mode this implementation is a bit inefficient: we materialize a full boolean mask that is then fed into the subsequent operations, whereas you would prefer a compiler to fuse the masking and indexing together. First class dimensions as they stand don't really help with this example; we need to introduce some new extensions to them. First, here is what we can already do:

C = dims(1)
c_id = customers_id[C]  # {C} => int64[]
c_name = customers_name[C]  # {C} => str[]
c_email = customers_email[C]  # {C} => str[]
c_mask = c_id == 1000  # {C} => bool[]

Here, a tensor with first class dimensions has a more complicated type {DIM0, DIM1, ...} => dtype[DIM2, DIM3, ...]. The first class dimensions are all reported in the curly braces to the left of the double arrow; curly braces are used to emphasize the fact that first class dimensions are unordered.

What next? The problem is that now we want to do something like torch.where(c_mask, c_name, ???) but we are now in a bit of trouble, because we don't want anything in the false branch of where: we want to provide something like "null" and collapse the tensor to a smaller number of elements, much like how boolean masking did it without first class dimensions. To express this, we'll introduce a binary version of torch.where that does exactly this, as well as returning the newly allocated FCD for the new, data-dependent dimension:

C2, c2_name = torch.where(c_mask, c_name)  # {C2} => str[]
_C2, c2_email = torch.where(c_mask, c_email)  # {C2} => str[], n.b. C2 == _C2
return c2_name, c2_email

Notice that torch.where introduces a new first-class dimension. I've chosen that this FCD gets memoized with c_mask, so whenever we do more torch.where invocations we still get consistently the same new FCD.

Having to type out all the columns can be a bit tiresome. If we assume all elements in a table have the same dtype (let's call it dyn, short for dynamic type), we can more compactly represent the table as a 2D tensor, where the first dimension is the indexing as before, and the second dimension is the columns of the database. For clarity, we'll support using the string name of the column as a shorthand for the numeric index of the column. If the tensor is contiguous, this gives a more traditional row-wise database. The new database can be conveniently manipulated with FCDs, as we can handle all of the columns at once instead of typing them out individually:

customers: dyn[C, C_ATTR]
C = dims(1)
c = customers[C]  # {C} => dyn[C_ATTR]
C2, c2 = torch.where(c["id"] == 1000, c)  # {C2} => dyn[C_ATTR]
return c2[["name", "email"]].order(C2)  # dyn[C2, ["name", "email"]]

We'll use this for the rest of the post, but the examples should be interconvertible.
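Since the binary torch.where used above is a proposed extension rather than an existing PyTorch function, its filtering semantics can be pinned down with a plain-Python emulation. The sketch below uses ordinary lists instead of tensors, ignores first class dimensions and the memoization of the new dimension, and uses made-up sample data; the helper name `where` is chosen here only to mirror the post.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def where(mask: Sequence[bool], column: Sequence[T]) -> List[T]:
    """Emulated binary `where`: keep entries whose mask is True, drop the rest."""
    if len(mask) != len(column):
        raise ValueError("mask and column must have the same length")
    return [value for keep, value in zip(mask, column) if keep]


# Columnar representation: one list per column, sharing the same index.
customers_id = [999, 1000, 1001]
customers_name = ["Ada", "Grace", "Edsger"]
customers_email = ["ada@example.com", "grace@example.com", "edsger@example.com"]

# SELECT c.name, c.email FROM customers c WHERE c.id = 1000
mask = [cid == 1000 for cid in customers_id]
names = where(mask, customers_name)
emails = where(mask, customers_email)
print(names, emails)
```

The length of the result depends on the data, which is exactly the property that the new data-dependent dimension C2 is meant to capture.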

Aggregation. What's the average age of all customers, grouped by the country they live in?

SELECT AVG(c.age) FROM customers c GROUP BY c.country;

PyTorch doesn't natively support this grouping operation, but essentially what is desired here is a conversion into a nested tensor, where the jagged dimension is the country (each of which will have a varying number of customers). Let's hallucinate a torch.groupby analogous to its Pandas equivalent:

customers: dyn[C, C_ATTR]
customers_by_country = torch.groupby(customers, "country")  # dyn[COUNTRY, JC, C_ATTR]
COUNTRY, JC = dims(2)
c = customers_by_country[COUNTRY, JC]  # {COUNTRY, JC} => dyn[C_ATTR]
return c["age"].mean(JC).order(COUNTRY)  # f32[COUNTRY]

Here, I gave the generic indexing dimension the name JC, to emphasize that it is a jagged dimension. But everything proceeds like we expect: after we've grouped the tensor and rebound its first class dimensions, we can take the field of interest and explicitly specify a reduction on the dimension we care about.

In SQL, aggregations have to operate over the entirety of groups specified by GROUP BY. However, because FCDs explicitly specify what dimensions we are reducing over, we can potentially decompose a reduction into a series of successive reductions on different columns, without having to specify subqueries to progressively perform the reductions we are interested in.
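As a reference point for what the hypothetical torch.groupby followed by mean(JC) should compute, here is a plain-Python emulation using dictionaries and lists. It is a sketch with made-up data; `group_by` is a helper defined here, not a PyTorch or Pandas function.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List

customers = [
    {"name": "Ada", "country": "UK", "age": 30},
    {"name": "Alan", "country": "UK", "age": 40},
    {"name": "Grace", "country": "US", "age": 50},
]


def group_by(rows: List[dict], key: str) -> Dict[str, List[dict]]:
    """Split rows into groups; each group may have a different length (jagged)."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


# SELECT AVG(c.age) FROM customers c GROUP BY c.country
by_country = group_by(customers, "country")
average_age = {
    country: fmean(row["age"] for row in rows)  # reduce over the jagged dimension
    for country, rows in by_country.items()
}
print(average_age)
```

The outer dictionary plays the role of the COUNTRY dimension and each inner list plays the role of the jagged JC dimension that the mean reduces away.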

Joins. Given an order table, join it with the customer referenced by the customer id:

SELECT o.id, c.name, c.email FROM orders o JOIN customers c ON o.customer_id = c.id

First class dimensions are great at doing outer products (although, like with filtering, it will expensively materialize the entire outer product naively!)

customers: dyn[C, C_ATTR]
orders: dyn[O, O_ATTR]
C, O = dims(2)
c = customers[C]  # {C} => dyn[C_ATTR]
o = orders[O]  # {O} => dyn[O_ATTR]
mask = o["customer_id"] == c["id"]  # {C, O} => bool[]
outer_product = torch.cat(o[["id"]], c[["name", "email"]])  # {C, O} => dyn[["id", "name", "email"]]
CO, co = torch.where(mask, outer_product)  # {CO} => dyn[["id", "name", "email"]]
return co.order(CO)  # dyn[CO, ["id", "name", "email"]]
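The same outer-product, mask, then filter structure can be written in plain Python as a sanity check on what the tensor version is expected to return. This is only an emulation with made-up rows; like the naive tensor version, it materializes every (order, customer) pair before filtering.

```python
customers = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Grace", "email": "grace@example.com"},
]
orders = [
    {"id": 10, "customer_id": 2},
    {"id": 11, "customer_id": 1},
    {"id": 12, "customer_id": 2},
]

# Outer product over the two index dimensions (O, C).
pairs = [(o, c) for o in orders for c in customers]

# mask has one entry per (order, customer) pair.
mask = [o["customer_id"] == c["id"] for o, c in pairs]

# Keep only the matching pairs and project the requested columns.
# SELECT o.id, c.name, c.email FROM orders o JOIN customers c ON o.customer_id = c.id
joined = [
    {"id": o["id"], "name": c["name"], "email": c["email"]}
    for keep, (o, c) in zip(mask, pairs)
    if keep
]
print(joined)
```

A real implementation would want to avoid building `pairs` in full, for example by hashing on the join key, which is the kind of optimization SQL engines already perform.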

What's the point? There are a few reasons why we might be interested in the correspondence here. First, we might be interested in applying SQL ideas to the Tensor world: a lot of things people want to do in preprocessing are similar to what you do in traditional relational databases, and SQL can teach us what optimizations and what use cases we should think about. Second, we might be interested in applying Tensor ideas to the SQL world: in particular, I think first class dimensions are a really intuitive frontend for SQL which can be implemented entirely embedded in Python without necessitating the creation of a dedicated DSL. Also, this might be the push needed to get TensorDict into core.

by Edward Z. Yang at October 14, 2024 05:07 AM

Brent Yorgey

MonadRandom: major or minor version bump?

Posted on October 14, 2024

tl;dr: a fix to the MonadRandom package may cause fromListMay and related functions to extremely rarely output different results than they used to. This could only possibly affect anyone who is using fixed seed(s) to generate random values and is depending on the specific values being produced, e.g. a unit test where you use a specific seed and test that you get a specific result. Do you think this should be a major or minor version bump?


The Fix

Since 2013 I have been the maintainer of MonadRandom, which defines a monad and monad transformer for generating random values, along with a number of related utilities.

Recently, Toni Dietze pointed out a rare situation that could cause the fromListMay function to crash (as well as the other functions which depend on it: fromList, weighted, weightedMay, uniform, and uniformMay). This function is supposed to draw a weighted random sample from a list of values decorated with weights. I’m not going to explain the details of the issue here; suffice it to say that it has to do with conversions between Rational (the type of the weights) and Double (the type that was being used internally for generating random numbers).

Even though this could only happen in rare and/or strange circumstances, fixing it definitely seemed like the right thing to do. After a bit of discussion, Toni came up with a good suggestion for a fix: we should no longer use Double internally for generating random numbers, but rather Word64, which avoids conversion and rounding issues.

In fact, Word64 is already used internally in the generation of random Double values, so we can emulate the behavior of the Double instance (which was slightly tricky to figure out) so that we make exactly the same random choices as before, but without actually converting to Double.

The Change

…well, not exactly the same random choices as before, and therein lies the rub! If fromListMay happens to pick a random value which is extremely close to a boundary between choices, it’s possible that the value will fall on one side of the boundary when using exact calculations with Word64 and Rational, whereas before it would have fallen on the other side of the boundary after converting to Double due to rounding. In other words, it will output the same results almost all the time, but for a list of \(n\) weighted choices there is something like an \(n/2^{64}\) chance (or less) that any given random choice will be different from what it used to be. I have never observed this happening in my tests, and indeed, I do not expect to ever observe it! If we generated one billion random samples per second continuously for a thousand years, we might expect to see it happen once or twice. I am not even sure how to engineer a test scenario to force it to happen, because we would have to pick an initial PRNG seed that forces a certain Word64 value to be generated.
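Both claims in this paragraph can be checked with a little exact arithmetic. The Python sketch below is not MonadRandom's code (which is Haskell) and does not reproduce its sampling algorithm; it assumes, purely for illustration, that a choice is made by testing whether draw / 2^64 lies below a cumulative weight, and it takes n = 1 for the estimate.

```python
from fractions import Fraction

WORD64_RANGE = 2 ** 64

# Two choices with weights 1/3 and 2/3: the boundary sits at 1/3.
boundary = Fraction(1, 3)

# The largest Word64 draw that is still strictly below the boundary.
draw = WORD64_RANGE // 3

# Exact comparison with rationals: the draw is on the lower side.
exact_below = Fraction(draw, WORD64_RANGE) < boundary

# Same comparison after rounding everything to double precision.
float_below = float(draw) / float(WORD64_RANGE) < float(boundary)

print(exact_below, float_below)

# One billion samples per second for a thousand (365-day) years,
# each with a 1 / 2^64 chance of landing on such a boundary case.
samples = 10 ** 9 * 60 * 60 * 24 * 365 * 1000
expected_flips = Fraction(samples, WORD64_RANGE)
print(float(expected_flips))
```

For this particular draw the rounded value coincides with the rounded boundary, so the two comparisons disagree; and the expected number of such events over the thousand years comes out between one and two, matching the "once or twice" estimate.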

To PVP or not to PVP?

Technically, a function exported by MonadRandom has changed behavior, so according to the Haskell PVP specification this should be a major version bump (i.e. 0.6 to 0.7). (Actually, I am not even 100% clear on this. The decision tree on the PVP page says that changing the behavior of an exported function necessitates a major version bump; but the actual specification does not refer to behavior at all; as I read it, it is exclusively concerned with API compatibility, i.e. whether things will still compile.)

But there seem to be some good arguments for doing just a minor version bump (i.e. 0.6 to 0.6.1).

  • Arguments in favor of a minor version bump:

    • A major version bump would cause a lot of (probably unnecessary) breakage! MonadRandom has 149 direct reverse dependencies, and about 3500 distinct transitive reverse dependencies. Forcing all those packages to update their upper bound on MonadRandom would be a lot of churn.

    • What exactly constitutes the “behavior” of a function to generate random values? It depends on your point of view. If we view the function as a pure mathematical function which takes a PRNG state as input and produces some value as output, then its behavior is defined precisely by which outputs it returns for which input seeds, and its behavior has changed. However, if we think of it in more effectful terms, we could say its “behavior” is just to output random values according to a certain distribution, in which case its behavior has not changed.

    • It’s extremely unlikely that this change will cause any breakage; moreover, as argued by Boyd Stephen Smith, anyone who cares enough about reproducibility to be relying on specific outputs for specific seeds is probably already pinning all their package versions.

  • Arguments in favor of a major version bump:

    • It’s what the PVP specifies; what’s the point of having a specification if we don’t follow it?

    • In the unlikely event that this change does cause any breakage, it could be extremely difficult for package maintainers to track down. If the behavior of a random generation function completely changes, the source of the issue is obvious. But if it only changes for very rare inputs, you might reasonably think the problem is something else. A major version bump will force maintainers to read the changelog for MonadRandom and assess whether this is a change that could possibly affect them.

So, do you have opinions on this? Would the release affect you one way or the other? Feel free to leave a comment here, or send me an email with your thoughts. Note there has already been a bit of discussion on Mastodon as well.


by Brent Yorgey at October 14, 2024 12:00 AM

October 13, 2024

Michael Snoyman

Buying Bitcoin or selling dollars?

The act of trading means that both sides give up one good for something they value more. When I go to the supermarket, I’m giving the supermarket dollars (or euros, or shekels) in exchange for food. I value the food more than the money. The supermarket values the money more than the food. Everyone walks away happy with a successful trade.

But we don’t normally talk about going to the supermarket and trading for food. We generally say we’re buying food. Buying is simply a trade where you give money. Similarly, the supermarket is selling food, where selling is a trade where you receive money.

Now let’s say I’m going on a trip to Europe and need some cash. I have US dollars, and I need Euros. Both of those are money. So am I buying Euros, or am I selling dollars? We generally use the term exchange in that case.

You may notice, all of these acts are really identical to trading, it’s just a matter of nomenclature. The terms we use represent how we view the assets at play.

Which brings me to the point of this post: buying Bitcoin.

Buying Bitcoin

I come from a fairly traditional, if very conservative, financial background. I was raised in a house that believes putting money in the stock market is essentially reckless gambling, and then my university education included a lot of economics and finance courses, which gave me a broader view. I’m still fairly conservative in my investments, and was very crypto-wary for a while. I care more about long-term security, not short term gains. Investing in Bitcoin seemed foolish.

At some point in the past 5 years, I changed my opinion on this slightly. I began to see Bitcoin as a prudent hedge against risks in other asset classes. From that world view, I began to buy Bitcoin. Dollars are the real money, and Bitcoin is the risk asset that I’m speculatively investing in and hoping for a return. Meaning: I ultimately intend to sell that Bitcoin for more dollars than I spent to get it. Much like I would treat stock.

As those 5 years have trudged along, I’ve become more confident in Bitcoin, and simultaneously less confident in fiat currency. Like many others, the rampant money printing and high levels of inflation have me worried about staking my future on fiat currencies. Investing in stocks would be the traditional inflation protection hedge, but I’m coming around more to a Bitcoin maxi-style belief that fixed total supply is the most important feature of anything we use for long term storage.

All of this led to the question that kicked off this blog post:

Am I buying Bitcoin, or selling dollars?

Remember that buying and selling are both the same thing as trading. There’s no difference between the act of buying Bitcoin with dollars, or selling dollars for Bitcoin. It’s just a difference in what you view as the real money. Most people in the world would consider the dollar to be the real money in the equation.

I have some background in Talmudic study, and one of the common phrases we use in studying Talmud is מאי נפקא מינה, pronounced “my nafka meena,” or “what is the practical difference between these two?” There’s no point having a pure debate about terminology. Is there any practical difference in how I relate to the world whether I’m buying Bitcoin or selling dollars? And after some thinking, I realized what it is.

Entering a trade

Forget Bitcoin entirely. I wake up one morning, and go to my brokerage account. I’ve got $50,000 in cash sitting there, waiting to be invested. Let’s say that represents half of my net worth. I start looking at the charts, doing some research, and I strongly believe that a company’s stock is undervalued and is about to go up significantly. What do I do?

Well, most likely I’m going to buy some of that stock. Am I going to put in the entire $50k? Probably not, I’m very risk averse, and I like to hedge my risks. Investing half my net worth in one stock, based on the price on one day, is too dangerous for my taste. (Others invest differently, and there’s certainly value in being more aggressive, just sharing my own views.) Buying the stock is called entering a trade.

Similarly, if two weeks later, that stock has gone up 20%, I’m sitting on a bunch of profits, and I hear some news that may negatively impact that stock, I may decide to sell the stock or exit the trade.

But let’s change things a bit. Let’s say I’m not that confident the stock will go up at the beginning of this story. Am I going to buy in? Probably not. For those familiar, this may sound like status-quo bias: the bias to stick to whatever we’re currently doing barring additional information. But I think there’s something more subtle going on here as well.

Let’s say I did buy the stock, it did go up 20%, and now I’m nervous it’s about to tank. I’m not confident at all, just a hunch. Depending on the strength of that hunch, I’m going to sell. My overall confidence threshold for buying in is much higher than selling out. And the reason for this is simple: risk. Overall, I view the dollar as the stable asset, and the stock as the risk asset.

By selling early, I risk losing out on further potential gains. Economically, that’s equivalent to losing money when you view things as opportunity costs. But the risks of losing value, to someone fiscally conservative and risk averse like me, outweigh the potential gains.

The price of Bitcoin, the price of the dollar

Alright, back to Bitcoin. My practical difference absolutely applies here. Let’s say (for simplicity of numbers) that the current price of Bitcoin is $50,000. I’m sitting on 1 BTC and $50,000 cash. I have three options:

  1. Trade my dollars to get more Bitcoin
  2. Do nothing
  3. Trade my Bitcoin to get more dollars

But there’s a problem with this framing. By quoting the price of Bitcoin in dollars, I’ve already injected a bias into the analysis. I’m implicitly viewing dollars as money, and Bitcoin as the risk asset. We can equivalently view the current price as 0.00002 BTC per dollar. And, since playing with numbers like that is painful, we can talk about uBTC (micro-BTC, or a millionth of a Bitcoin) instead, and say the current price of a dollar is 20 uBTC.

(Side note: personally, I think the unit ksat, or thousand satoshis, or a one-hundred-thousandth of a Bitcoin, is a good unit for discussing prices, but I’ve never seen anyone else use it, so I’ll stick to uBTC.)
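The unit conversions above are plain arithmetic and can be checked with a few lines of Haskell. This is only a sketch; the function names are made up for illustration, and `Rational` is used so that no rounding is involved.

```haskell
-- Price of one dollar in BTC, given the price of one BTC in dollars.
dollarInBtc :: Rational -> Rational
dollarInBtc btcInDollars = 1 / btcInDollars

-- 1 BTC = 1,000,000 uBTC
toMicroBtc :: Rational -> Rational
toMicroBtc btc = btc * 1000000

-- 1 BTC = 100,000,000 satoshis = 100,000 ksat
toKsat :: Rational -> Rational
toKsat btc = btc * 100000
```

At $50,000 per Bitcoin, `toMicroBtc (dollarInBtc 50000)` is 20 and `toKsat (dollarInBtc 50000)` is 2.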

Anyway, let’s come back to the case in point. We have two different world views, and three different cases for each world view:

  1. Bitcoin is priced at $50,000
    1. I think the price will go up, so I should buy Bitcoin
    2. I think the price will go down, so I should sell Bitcoin
    3. I don’t know the direction the price will take
  2. The dollar is priced at 20 uBTC
    1. I think the price will go up, so I should buy dollars
    2. I think the price will go down, so I should sell dollars
    3. I don’t know the direction the price will take

You may notice that cases 1a and 2b are equivalent: the price of Bitcoin going up is the same as the price of the dollar going down. The same with cases 1b and 2a. And more obviously, cases 1c and 2c are the same: in both cases, I don’t know where I think the prices will go.

Risk-averse defaults

This is where risk aversion should come into play. Put simply: what is the least risky asset to hold? In our stock case, it was clearly the dollar. And if you asked me 5 years ago, I absolutely would have said holding onto dollars is far less risky than holding onto Bitcoin.

And this is where I think I begin down the path of the Bitcoin Maxi. I started seriously considering Bitcoin as an investment due to rampant money printing and inflation. It started as a simple hedge, throwing in yet another risky asset with others. But I’ve realized my viewpoint on the matter is changing over time. As many others have put it before me, fiat currency goes to 0 over time as more printing occurs. It’s not a question of “will the dollar lose value,” there’s a guarantee that the dollar will lose value over time, unless monetary policy is significantly altered. And there’s no reason to believe it will be.

I understand and completely respect the viewpoint that Bitcoin is imaginary internet money with no inherent value. I personally disagree, at least today, though it was my dominant view 5 years ago. Assuming sufficient people continue to believe Bitcoin is more than a ponzi scheme and is instead a scarce asset providing a true store of value with no long-term devaluation through money printing, Bitcoin will continue to go up, not down, over time.

In other words, as I stared at this argument, I came to a clear conclusion: my worldview is that the risk-averse asset to hold these days is Bitcoin, not dollars. But this bothered me even more.

Tzvei dinim

OK, I’m a full-on Bitcoin Maxi. I should liquidate all my existing investments and convert them to Bitcoin. Every time I get a paycheck, I should convert the full value into Bitcoin. I’ll never touch a dollar again. Right?

Well, no. Using my framework above, there’s no reason to avoid investing in stocks, fiat, metals, or anything else that you believe will go up in value. It’s a question of the safe default. But even so, I haven’t gone ahead with taking every dollar I have and buying up Bitcoin with it. I still leave my paycheck in dollars and only buy up some Bitcoin when I have a sufficient balance. This felt like cognitive dissonance to me, and I needed to figure out why I was behaving inconsistently!

And fortunately another Talmudic study philosophy came into play. Tzvei dinim is a Yiddish phrase that means “two laws,” and it indicates that two cases have different outcomes because the situations are different. And for me, the answer is that money (and investments in general) have two radically different purposes:

  1. Short-term usage for living. This includes paying rent, buying groceries, and a rainy day fund. Depending on how risk-averse you are, that rainy day fund could be to cover 1 month of expenses while you look for another job, or years of savings in case your entire industry is destroyed by AI.
  2. Long-term store of value.

What’s great about this breakdown is that I’ve lived my entire adult life knowing it, and I bet many of you have too! We’ve all heard phrases around the stock market like “don’t invest more than you can afford to lose.” The point of this is that the price of stocks can fluctuate significantly, and you don’t want to be forced to sell at a low point to cover grocery bills. Keep enough funds for short-term usage, and only invest what you have for long-term store of value.

This significantly assuaged my feelings of cognitive dissonance. And it allows me to answer my question above pretty well about whether I’d buy/sell Bitcoin or dollars:

  • Keep enough money in dollars to cover expected expenses in the near term
  • Invest money speculatively based on strong beliefs about where asset prices are heading
  • And beyond that, keep the rest of the money in Bitcoin, not dollars. Over time, the dollar will decrease in value, and Bitcoin will increase in value. I’d rather have my default exposure be to the asset that’s going up, not down.

Conclusion

Thanks for going on this journey with me. The point here isn’t to evangelize anything in particular. As I said, I understand and respect the hesitancy to buy into a new asset class. I’ve been working in the blockchain field for close to a decade now, and I've only recently come around to this way of thinking. And it’s entirely possible that I’m completely wrong, Bitcoin will turn out to be a complete scam asset and go to 0, and I’ll bemoan my stupid view of the world I’m sharing in this post. If so, please don’t point and laugh when you see me.

My point in this post is primarily to solidify my own viewpoint for myself. And since I do that best by writing up a blog post as a form of rubber ducking, I decided to do so. As I’m writing this, I still don’t know if I’ll even publish it!

And if I did end up publishing this and you’re reading it now, here’s my secondary point: helping others gain a new perspective. I think it’s always valuable to challenge your assumptions. If you’ve been looking at “cryptobros” as crazy investors hoping to make 10,000% returns on a GIF, I’m hoping this post gives you a different perspective of viewing Bitcoin as a better store of value than traditional assets. Feel free to disagree with me! But I hope you at least give the ideas some time to percolate.

Appendix 1: risk aversion

I’m sure plenty of people will read this and think I’m lying to myself. I claim to be risk averse, but I’m gambling on a new and relatively untested asset class. Putting money into the stock market is a far more well-established mechanism for providing inflation protection, and investing in indices like the S&P 500 provides good hedging of risks. So why would I buy into Bitcoin instead?

This is another contradiction that can be resolved by the tzvei dinim approach. You can evaluate risk either based on empirical data (meaning past performance), or by looking at fundamental principles and mechanisms. The stock market is demonstrably a good performer by empirical standards, delivering reliable returns.

Some people might try to claim that Bitcoin has the same track record: it’s gone up in value stupendously during its existence. I don’t actually believe that at all. Yes, Bitcoin has appreciated a lot, but the short time frame means I don’t really care about its track record, definitely not as much as I do the stock market’s.

Instead, when I look at Bitcoin, I’m more persuaded by the mechanism, which simply put is fixed supply. There will never be more than 21,000,000 BTC. If there was a hard fork of the network that started increasing that supply, I’d lose faith in Bitcoin completely and likely sell out of it. I’m a believer in the mechanism of a deflationary currency. And there is no better asset I can think of for fixed supply than Bitcoin. (Though gold comes very close… if people are interested, I may follow up later with a Bitcoin vs gold blog post.)

By contrast, the underlying mechanism for the stock market going up over time is less clear. Some of that is inherent by dint of money printing: more money being printed will flow into stocks, because that’s where people park their newly printed money. My main concern with the stock market is that most people aren’t following any fundamental valuation technique, and are instead treating it as a Ponzi scheme. Said differently, I want to analyze the value of a stock based on my expected future revenues from dividends (or some equivalent objective measure). Instead, stocks are mostly traded based on how much you think someone else will value it in the future.

My views on the stock market are somewhat extreme and colored by the extremely risk-averse viewpoint I received growing up. Others will likely disagree completely that the stock market is pure speculation. And they’d also probably laugh at the idea that Bitcoin has more inherent value than the way stocks are traded. It’s still my stance.

Appendix 2: cryptobros

I mentioned cryptobros above, and made a reference to NFTs. Before getting deeper into the space, I had–like many others–believed “Bitcoin” and “crypto” were more or less synonymous. True believers in Bitcoin, and I’m slowly coming to admit that I’m one of them, disagree completely. Bitcoin is a new monetary system based on fixed supply, no centralized control, censorship resistance, and pseudo-anonymity. Crypto in many of its forms is little more than get-rich-quick schemes.

I don’t believe that’s true across the board for all crypto assets. I do believe that was true for much of the NFT hype and for meme coins. Ethereum to me has intrinsic value, because the ability to have your financial transaction logged on the most secure blockchain in the world is valuable in its own right.

So just keep in mind, crypto does not necessarily mean the same thing as Bitcoin.

Appendix 3: drei dinim

I mentioned “tzvei dinim” above, meaning “two laws.” I want to introduce a drei dinim, meaning three laws. (And if I mistransliterated Yiddish, my apologies, I don’t actually speak the language at all.) I described short-term vs long-term above. In reality, I think there are really three different ideas at play:

  1. Short-term money holding for expenses
  2. Long-term store of value
  3. Speculative investments because you think an asset will outperform the safe asset

My view is that, due to the inflationary nature of fiat currency, groups (2) and (3) have been unfairly lumped together for most people. Want to store value for the next 30 years? Don’t keep it in dollars, you better buy stocks! I don’t like that view of the world. The skill of choosing what to invest in is not universal, it requires work, and many people lose their shirts trying to buy into the right stock. (Side note, that’s why many people recommend investing in indices, specifically to avoid those kinds of concerns.)

I want a world where there’s an asset that retains its value over time, regardless of inflation and money printing. Bitcoin is designed to do just that. But if you really think a stock is going to go up 75% in a week, category (3) still gives plenty of room to do speculative investment, without violating the rest of the cognitive framework I’ve described.

Appendix 4: why specifically Bitcoin?

The arguments I’ve given above just argue for a currency that has a fixed maximum supply. You could argue decentralization is a necessary feature too, since it’s what guarantees the supply won’t be changed. So why is Bitcoin in particular the thing we go with? To go to the absurd, why doesn’t each person on the planet make their own coin (e.g. my Snoycoin) and use that as currency?

This isn’t just a theoretical idea. One of the strongest (IMO) arguments against Bitcoin is exactly this: anyone can create a new one, so the fixed supply is really just a lie. There’s an infinite supply of made-up internet money, even if each individual token may have a fixed supply.

To me, this comes down to the question of competition, as does virtually everything else in economics. Bitcoin is a direct competitor to the dollar. The dollar has strengths over Bitcoin: institutional support, clear regulatory framework, requirement for US citizens to pay taxes with dollars, requirement of US business to accept dollars for payment. Bitcoin is competing with the strengths I’ve described above.

I believe that, ultimately, the advantages of Bitcoin will continue to erode the strength of the dollar. That’s why I’m buying into it, literally and figuratively.

However, new coins don’t have the same competitive power. If I make Snoycoin, it’s worse in every way imaginable to Bitcoin. It simply won’t take off. And it shouldn’t, despite all the money I’d make from it.

There is an argument to be made that Ethereum is a better currency than Bitcoin, since it allows for execution of more complex smart contracts. I personally don’t see Ethereum (or other digital assets) dethroning Bitcoin as king of the hill any time soon.

October 13, 2024 12:00 AM

October 08, 2024

Well-Typed.Com

18 months of the Haskell Unfolder

Eighteen months ago we launched The Haskell Unfolder, a YouTube series where we discuss all things Haskell. From the beginning the goal was to cover as broad a spectrum of topics as possible, as well as show-case the work of many different people in the Haskell world. We’d like to think we succeeded at that; below you will find a summary of the topics covered in the first year and a half, organized roughly by category.

All episodes are live-streamed, and we welcome interactivity through questions or comments submitted in the video chat. If you’d like to catch us live, episodes are generally on Wednesdays 8:30 pm CEST/CET (18:30 UTC/19:30 UTC), usually once every two weeks. New episodes are announced on YouTube, the Well-Typed blog, Reddit, Discourse, Twitter and Mastodon. Most episodes are roughly half an hour. The code samples discussed in the episodes are all available in the Unfolder GitHub repository.

If you enjoy the show, a like-and-subscribe is always appreciated, as is sharing the show with other people who might be interested. There is also some Unfolder merch available.1

Episodes

Beginner friendly

unfoldr (episode 1)

In this first eponymous episode we take a look at the unfoldr function

unfoldr :: (b -> Maybe (a, b)) -> b -> [a]

and discuss how it works and how it can be used.
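As a small illustration (the two functions below are our own examples, not taken from the episode), the step function receives the current seed and either stops with Nothing or produces the next element together with a new seed:

```haskell
import Data.List (unfoldr)

-- Count down from n to 1 (for non-negative n); Nothing ends the list.
countdown :: Int -> [Int]
countdown = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n, n - 1)

-- Decimal digits of a non-negative number, least significant first.
digits :: Int -> [Int]
digits = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n `mod` 10, n `div` 10)
```

For example, `countdown 3` is `[3,2,1]` and `digits 1234` is `[4,3,2,1]`.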

Composing left folds (episode 5)

Based on a former Well-Typed interview problem, we discuss how to perform multiple simultaneous computations on a text file, gathering some statistics. We use this as an example to discuss issues of performance, lazy evaluation, and elegance. We also take a look at how we can use the foldl library [Gabriella Gonzalez] to write well-performing code without giving up on compositionality.
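The core idea can be sketched without any dependencies. The code below is a simplified, self-written imitation of the approach, not the foldl library's actual source; `runFold`, `sumF`, `lengthF` and `average` are names chosen for this sketch. A left fold is packaged as data, and the Applicative instance pairs up two folds so that both run in a single pass.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (foldl')

-- A left fold as data: step function, initial state, final extraction.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

-- Strict pair, so that neither accumulator builds up thunks.
data Pair x y = Pair !x !y

instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (const b)
  (Fold stepL beginL doneL) <*> (Fold stepR beginR doneR) =
    Fold step (Pair beginL beginR) done
    where
      step (Pair l r) a = Pair (stepL l a) (stepR r a)
      done (Pair l r)   = doneL l (doneR r)

-- Run a fold over a list in a single traversal.
runFold :: Fold a b -> [a] -> b
runFold (Fold step begin done) = done . foldl' step begin

sumF :: Num a => Fold a a
sumF = Fold (+) 0 id

lengthF :: Fold a Int
lengthF = Fold (\n _ -> n + 1) 0 id

-- Two statistics combined compositionally, still one pass over the input.
average :: Fold Double Double
average = (/) <$> sumF <*> (fromIntegral <$> lengthF)
```

With these definitions, `runFold average [1, 2, 3, 4]` evaluates to `2.5` while traversing the list only once.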

Learning by testing (aka “Boolean blindness”, episode 7)

We discuss how Booleans convey little information about the outcome of a test, and how replacing Booleans by other datatypes that produce a witness of the success or failure of a test can lead to more robust and therefore better code. For example, instead of filter

filter :: (a -> Bool) -> [a] -> [a]

we might instead prefer mapMaybe

mapMaybe :: (a -> Maybe b) -> [a] -> [b]

This idea is known under many names, such as “Learning by testing” [Conor McBride], “Boolean blindness” [Bob Harper] or “Parse, don’t validate” [Alexis King].
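A small sketch of the difference, using made-up helper names: the Boolean test only says that a string looks numeric, whereas the Maybe-returning test hands back the parsed number as a witness of success.

```haskell
import Data.Char (isDigit)
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Boolean test: we learn that the check passed, but nothing else.
looksNumeric :: String -> Bool
looksNumeric s = not (null s) && all isDigit s

-- Witness-producing test: success hands back the parsed number.
parseNumber :: String -> Maybe Int
parseNumber = readMaybe

-- With filter we still hold strings and must parse them again later.
numericStrings :: [String] -> [String]
numericStrings = filter looksNumeric

-- With mapMaybe the evidence (the Int) is the result.
numbers :: [String] -> [Int]
numbers = mapMaybe parseNumber
```

For the input `["12", "x", "7"]`, `numericStrings` returns `["12","7"]` and `numbers` returns `[12,7]`.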

A new perspective on foldl' (episode 19)

We introduce a useful combinator called repeatedly

repeatedly :: (a -> b -> b) -> ([a] -> b -> b)

which captures the concept “repeatedly execute an action to a bunch of arguments”. We discuss both how to implement this combinator as well as some use cases.
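One possible implementation, in line with the episode title, is a thin wrapper around foldl'. The sketch below is our own illustration; `insertSorted` and `insertAll` are hypothetical examples of use.

```haskell
import Data.List (foldl')

-- Apply the update function once per list element, left to right.
repeatedly :: (a -> b -> b) -> ([a] -> b -> b)
repeatedly f as b = foldl' (flip f) b as

-- Example update: insert one element into an already sorted list.
insertSorted :: Ord a => a -> [a] -> [a]
insertSorted x ys = smaller ++ x : larger
  where
    (smaller, larger) = span (< x) ys

-- Lifting the single-element update to a whole list of elements.
insertAll :: Ord a => [a] -> [a] -> [a]
insertAll = repeatedly insertSorted
```

Here `insertAll [3, 1, 2] []` yields `[1,2,3]`.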

Dijkstra’s shortest paths (episode 20)

We use Dijkstra’s shortest paths algorithm as an example of how one can go about implementing an algorithm given in imperative pseudo-code in idiomatic Haskell. For an alternative take, check out the Monday Morning Haskell [James Bowen] episode on the same topic.

From Java to Haskell (episode 25)

We translate a gRPC server written in Java (from a blog post by Carl Mastrangelo) to Haskell. We use this as an example to demonstrate some of the conceptual differences of the two languages, but also observe that the end result of the translation looks perhaps more similar to the Java version than one might expect. Unlike most of our episodes, we hope that this one is understandable to any software developer, even people without any previous exposure to Haskell. Of course, we won’t be able to explain everything, but the example used should help to establish an idea of the look and feel of Haskell code, and perhaps learn a bit more about the relationship between the object-oriented and functional programming paradigms.

This episode is not about doing object oriented programming in Haskell; if you’d like to know more about that see our blogpost on this topic, as well as episode 13 of the Haskell Unfolder on open recursion.

Duality (episode 27)

Duality is the idea that two concepts are “similar but opposite” in some precise sense. The discovery of a duality enables us to use our understanding of one concept to help us understand the dual concept, and vice versa. It is a powerful technique in many disciplines, including computer science. We discuss how we can use duality in a very practical way, as a guiding principle leading to better library interfaces and a tool to find bugs in our code.

This episode focuses on design philosophy rather than advanced Haskell concepts, and should consequently be of interest to beginners and advanced Haskell programmers alike (we do not use any Haskell beyond Haskell 2010). Indeed, the concepts apply in other languages also (but we assume familiarity with Haskell syntax).

Episode 24 on generic folds and unfolds discusses another example of duality often exploited in Haskell, though that application is perhaps somewhat more advanced.

Solving tic-tac-toe (episode 32)

We develop an implementation of a simple game from scratch: tic-tac-toe. After having implemented the rules, we show how to actually solve the game and allow optimal play by producing a complete game tree and using a naive minimax algorithm for evaluating states.

Generating visualizations with diagrams (episode 33)

We look at the diagrams package [Brent Yorgey], which provides a domain-specific language embedded into Haskell for describing all sorts of pictures and visualisations. Concretely, we visualise the game tree of tic-tac-toe that we computed in episode 32, with the goal of producing a picture similar to the one in xkcd “Tic-Tac-Toe”. However, this episode is understandable without having watched the previous episode.

Reasoning

Laws (episode 8)

Many of Haskell’s abstractions come with laws; well-known examples include the Functor type class with the functor laws and the Monad type class with the monad laws, but this is not limited to type classes; for example, lenses come with a set of laws, too. To people without a mathematical background such laws may seem intimidating; how would one even start to think about proving them for our own abstractions? We discuss examples of these laws, show how to prove them, and discuss common techniques for such proofs, including induction. We also discuss why these laws matter; what goes wrong when they do not hold?
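As a flavour of what such a proof looks like, here is a sketch for a small binary tree type of our own (not necessarily an example used in the episode). The comment gives the induction argument for the identity law, and the two functions state the laws as executable checks on a given tree.

```haskell
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Show)

instance Functor Tree where
  fmap _ Leaf         = Leaf
  fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

-- Identity law (fmap id = id), by induction on the tree:
--   fmap id Leaf         = Leaf
--   fmap id (Node l x r) = Node (fmap id l) (id x) (fmap id r)
--                        = Node l x r   -- induction hypothesis on l and r

-- The functor laws as checks on a particular tree.
identityLaw :: Tree Int -> Bool
identityLaw t = fmap id t == t

compositionLaw :: Tree Int -> Bool
compositionLaw t = fmap ((+ 1) . (* 2)) t == (fmap (+ 1) . fmap (* 2)) t
```

A check on one tree is of course not a proof; it only complements the equational argument in the comment.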

Parametricity (episode 12)

We look at the concept of parametricity, also known as “theorems for free”. The idea of parametricity is that you can make non-trivial statements about polymorphic functions simply by looking at their types, and without knowing anything about their implementation. We focus on the intuition, not the theory, and explain how parametricity can generally help with reasoning about programs. We also briefly talk about how parametricity can help with property based testing [J.P. Bernardy, P. Jansson and K. Claessen].

For a more in-depth discussion of the theory of parametricity, refer to the Parametricity Tutorial part 1 and part 2 on the Well-Typed blog.
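A standard example of such a statement: any function r of type [a] -> [a] can only rearrange, drop or duplicate elements, so it must commute with map. The sketch below states that property for an arbitrary r; `everyOther` and `commutesWithMap` are names invented for this illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A hypothetical polymorphic list function: keep every other element.
everyOther :: [a] -> [a]
everyOther (x : _ : rest) = x : everyOther rest
everyOther xs             = xs

-- Free theorem for r :: forall x. [x] -> [x]:
--   map f . r == r . map f
-- The rank-2 argument lets us use r at both element types.
commutesWithMap :: Eq b => (forall x. [x] -> [x]) -> (a -> b) -> [a] -> Bool
commutesWithMap r f xs = map f (r xs) == r (map f xs)
```

Parametricity says `commutesWithMap r f xs` holds for every choice of `r`, `f` and `xs`; a property-based test of it is therefore guaranteed to pass.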

Performance and optimization

In addition to the Unfolder episodes mentioned in this section, Well-Typed offers a course on Performance and Optimisation.

GHC Core (episode 9)

After parsing, renaming and type checking, GHC translates Haskell programs into its internal Core language. It is Core where most optimisations happen, so in order to get a better idea of what your program gets compiled to, being able to read and understand Core code is quite useful. Core is in essence a drastically simplified and more explicit version of Haskell. We look at a number of simple Haskell programs and see how they get represented in Core.

foldr-build fusion (episode 22)

When composing several list-processing functions, GHC employs an optimisation called foldr-build fusion, sometimes also known as short-cut fusion or deforestation [A. Gill, J. Launchbury, S. Peyton Jones]. Fusion combines functions in such a way that any intermediate lists can often be eliminated completely. We look at how this optimisation works, and at how it is implemented in GHC: not as built-in compiler magic, but rather via user-definable rewrite rules.
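The sketch below shows the shape of the idea with a local copy of the combinator, here called `build'` to avoid clashing with GHC's own build. The producer `upTo` and both consumers are our own examples, and the fusion step is carried out by hand rather than by the compiler.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Local copy of the combinator: a list is described by how it would
-- be built from an abstract cons and nil.
build' :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build' g = g (:) []

-- A producer written in terms of build'.
upTo :: Int -> [Int]
upTo n = build' (\cons nil ->
  let go i | i > n     = nil
           | otherwise = i `cons` go (i + 1)
  in go 1)

-- Consumer written with foldr; as written, the list is materialised.
sumUpToUnfused :: Int -> Int
sumUpToUnfused n = foldr (+) 0 (upTo n)

-- The fusion rule is, roughly:  foldr k z (build g) = g k z
-- Applying it by hand replaces cons with (+) and nil with 0,
-- so no intermediate list is constructed.
sumUpToFused :: Int -> Int
sumUpToFused n = go 1
  where
    go i | i > n     = 0
         | otherwise = i + go (i + 1)
```

Both `sumUpToUnfused 10` and `sumUpToFused 10` evaluate to 55; the second is what the rewrite rule aims to produce from the first.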

Specialisation (episode 23)

Overloaded functions are common in Haskell, but they come with a cost. Thankfully, the GHC specialiser is extremely good at removing that cost. We can therefore write high-level, polymorphic programs and be confident that GHC will compile them into very efficient, monomorphised code. In this episode, we demystify the seemingly magical things that GHC is doing to achieve this.

This episode is hosted by a guest speaker, Finley McIlwaine, who has also written two blog posts on this topic: Choreographing a dance with the GHC specializer (Part 1) and part 2.
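To give a rough idea of what is being removed: an overloaded function receives a class dictionary as a hidden argument, and a specialised copy fixes the type so that the dictionary disappears. In the sketch below, `sumSquares` and `sumSquaresInt` are hypothetical examples; the SPECIALIZE pragma is a way to request such a copy explicitly, although GHC frequently creates specialisations on its own.

```haskell
-- Overloaded: every call passes a Num dictionary behind the scenes.
sumSquares :: Num a => [a] -> a
sumSquares = foldr (\x acc -> x * x + acc) 0
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

-- What a specialised copy amounts to: the same code at a fixed type,
-- where (*) and (+) are known to be the Int operations.
sumSquaresInt :: [Int] -> Int
sumSquaresInt = foldr (\x acc -> x * x + acc) 0
```

Both functions compute the same result; for instance, applying either to `[1, 2, 3]` gives 14.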

Types

If you are interested in the topics around Haskell’s advanced type system, you might be interested in the Well-Typed course on the Haskell Type System.

Quantified constraints (episode 2)

We discuss the QuantifiedConstraints language extension, which was introduced in the paper Quantified Class Constraints [G. Bottu, G. Karachalias, T. Schrijvers, B. Oliveira and P. Wadler]. We also briefly talk about the problems of combining quantified constraints with type families, as also discussed in GHC ticket #14860.

We assume familiarity with type classes. An understanding of type families is helpful for a part of the episode, but is not a requirement.
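A well-known motivating example is a rose tree that is abstract in the container holding its children. The sketch below follows that familiar example and is not necessarily the code used in the episode. The instance context asks for more than Eq (f a) at one particular type: it requires that f preserves equality at every element type.

```haskell
{-# LANGUAGE QuantifiedConstraints #-}

-- A rose tree, abstracted over the container f used for the children.
data Rose f a = Rose a (f (Rose f a))

-- The quantified constraint says: whenever b has equality, so does f b.
-- It is needed to compare the children, which have type f (Rose f a).
instance (Eq a, forall b. Eq b => Eq (f b)) => Eq (Rose f a) where
  (Rose x1 rs1) == (Rose x2 rs2) = x1 == x2 && rs1 == rs2
```

With f instantiated to lists, the constraint is discharged by the ordinary Eq instance for lists, so values of type `Rose [] Int` can be compared with `==`.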

Injectivity (episode 3)

We discuss in what way parameterised datatypes are injective and type families are not. We also discuss injective type families, a relatively recent extension based on the paper Injective Type Families [J. Stolarek, S. Peyton Jones, R. Eisenberg].

We assume familiarity with Haskell fundamentals, in particular datatypes. Knowledge of type families is certainly helpful, but not required.

Computing type class dictionaries (episode 6)

A function with a Show a constraint

f :: Show a => ...

wants evidence that type a has a Show instance. But what if we want to return such evidence from a function?

f :: ..... -> .. Show a ..

We discuss what the signature of such a function looks like, how we can compute such constraints, and when we might want to.
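One common encoding, sketched here under our own naming (`ShowDict`, `Ty`, `showDict` and `render` are all hypothetical), is a GADT whose constructor captures the instance. Returning a value of that type returns the evidence, and pattern matching on it brings the instance back into scope.

```haskell
{-# LANGUAGE GADTs #-}

-- Evidence type: a value of ShowDict a carries a Show a instance.
data ShowDict a where
  ShowDict :: Show a => ShowDict a

-- A small universe of runtime type descriptions.
data Ty a where
  TInt  :: Ty Int
  TBool :: Ty Bool
  TList :: Ty a -> Ty [a]

-- Compute the evidence from the description of the type.
showDict :: Ty a -> ShowDict a
showDict TInt      = ShowDict
showDict TBool     = ShowDict
showDict (TList t) = case showDict t of ShowDict -> ShowDict

-- Matching on the evidence makes show available at type a.
render :: Ty a -> a -> String
render t x = case showDict t of ShowDict -> show x
```

For example, `render (TList TInt) [1, 2, 3]` produces `"[1,2,3]"` even though `render` itself has no Show constraint in its signature.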

Higher-kinded types (episode 14)

We take a look at the common design pattern where we abstract all the fields of a record type over a type constructor which can then be instantiated to the identity to get the original record type back, but also to various other interesting type constructors.

data Episode f = MkEpisode {
    number   :: f Int
  , title    :: f Text
  , abstract :: f Text
  }

We look at a few examples, and discuss how common operations on such types naturally lead to the use of higher-rank polymorphic types.

There are a number of packages on Hackage that implement (variations on) this pattern: hkd [Edward Kmett], barbies [Daniel Gorin], rel8 [Oliver Charles], beam [Travis Athougies] as well as our own package sop-core.
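A minimal sketch of two such operations, using String instead of Text to stay within base; `mapEpisode` and `complete` are names made up for this illustration. Changing the type constructor uniformly across all fields requires a function that works at every field type, which is where the higher-rank type comes from.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (Identity (..))

data Episode f = MkEpisode {
    number   :: f Int
  , title    :: f String
  , abstract :: f String
  }

-- Change the type constructor in every field; the argument must be
-- polymorphic in the field type, hence the rank-2 signature.
mapEpisode :: (forall x. f x -> g x) -> Episode f -> Episode g
mapEpisode nt (MkEpisode n t a) = MkEpisode (nt n) (nt t) (nt a)

-- Turn a partially filled-in record into a complete one, if possible.
complete :: Episode Maybe -> Maybe (Episode Identity)
complete (MkEpisode n t a) =
  MkEpisode <$> fmap Identity n <*> fmap Identity t <*> fmap Identity a
```

Instantiating f to Identity recovers the plain record, while Maybe describes a record in which any field may still be missing.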

Computing constraints (episode 18)

Sometimes, for example when working with type-level lists, you have to compute with constraints. For example, you might want to say that a constraint holds for all types in a type-level list:

type All :: (Type -> Constraint) -> [Type] -> Constraint
type family All c xs :: Constraint where
  ..

We explore this special case of type-level programming in Haskell. We also revisit type class aliases and take a closer look at exactly how and why they work.
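One common way to write the equations is by recursion over the list, producing a nested tuple of constraints. This completion is an assumption on our part (the episode's own definition may differ), and `HList` and `showAll` are hypothetical names used to show the constraint at work.

```haskell
{-# LANGUAGE ConstraintKinds, DataKinds, FlexibleContexts, GADTs,
             KindSignatures, StandaloneKindSignatures, TypeFamilies,
             TypeOperators #-}
import Data.Kind (Constraint, Type)

-- The constraint c must hold for every type in the list.
type All :: (Type -> Constraint) -> [Type] -> Constraint
type family All c xs where
  All c '[]       = ()
  All c (x ': xs) = (c x, All c xs)

-- A heterogeneous list indexed by the types of its elements.
data HList (xs :: [Type]) where
  HNil  :: HList '[]
  HCons :: x -> HList xs -> HList (x ': xs)

-- Matching on HCons refines xs, so All Show xs reduces to a pair
-- containing Show x for the head and All Show for the tail.
showAll :: All Show xs => HList xs -> [String]
showAll HNil         = []
showAll (HCons x xs) = show x : showAll xs
```

For a list such as `HCons (1 :: Int) (HCons True HNil)`, the constraint `All Show '[Int, Bool]` reduces to `(Show Int, (Show Bool, ()))`, which GHC can solve from the existing instances.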

Type families and overlapping instances (episode 28)

We discuss a programming technique which allows us to replace overlapping instances with a decision procedure implemented using type families. The result is a bit more verbose, but arguably clearer and more flexible.

The type of runST (episode 30)

In Haskell, the ST type offers a restricted subset of the IO functionality: it provides mutable variables, but nothing else. The advantage is that we can use mutable storage locally, because unlike IO, ST allows us to escape from its realm via the function runST. However, runST has a so-called rank-2 type:

runST :: (forall s. ST s a) -> a

We discuss why this seemingly complicated type is necessary to preserve the safety of the operation.
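
A small sketch of what "using mutable storage locally" means in practice; `sumST` is a hypothetical example function, not taken from the episode.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- | A pure function that uses a mutable accumulator internally.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref

-- The rank-2 type rules out leaking the reference itself. The following is
-- rejected by the type checker, because the result type would mention s:
--
--   leak = runST (newSTRef (0 :: Int))
```

From the outside, `sumST` is an ordinary pure function: `sumST [1 .. 10]` evaluates to `55`.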

Testing

falsify (episode 4)

We discuss falsify, our new library for property-based testing in Haskell, inspired by the Hypothesis [David MacIver] library for Python.

For a more in-depth discussion, see also the blog post falsify: Hypothesis-inspired shrinking for Haskell or the paper falsify: Internal shrinking reimagined for Haskell [Edsko de Vries].

Testing without a reference (episode 21)

The best case scenario when testing a piece of software is when we have a reference implementation to compare against. Often, however, such a reference is not available, raising the question of how to test a function if we cannot verify what that function computes exactly. We consider how to define properties to verify the implementation of Dijkstra’s shortest path algorithm we discussed in Episode 20; you may wish to watch that episode first, but it’s not required: we mostly treat the algorithm as a black box for the sake of testing it.

We can only scratch the surface here; for an in-depth discussion of this topic, we highly recommend How to Specify It!: A Guide to Writing Properties of Pure Functions [John Hughes].

Exceptions, annotations and backtraces (episode 29)

Version 9.10 of GHC introduces an extremely useful new feature: exception annotations and automatic exception backtraces. This new feature, four years in the making, can be a life-saver when debugging code and has not received nearly as much attention as it deserves. We therefore give an overview of the changes and discuss how we can take advantage of them.

Note: in the episode we discuss that part 4 of the proposal on WhileHandling was still under discussion; it has since been accepted.

Debugging and preventing space leaks with nothunks (episode 31)

Debugging space leaks can be one of the more difficult aspects of writing professional Haskell code. An important source of space leaks is unevaluated thunks in long-lived application data; we take a look at how we can take advantage of the nothunks library to make debugging and preventing these kinds of leaks significantly easier.

Programming techniques

generalBracket (episode 10)

Exception handling is difficult, especially in the presence of asynchronous exceptions. In this episode we revise the basics of bracket and why it’s so important. We then discuss its generalisation generalBracket and its application in monad stacks.
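
As a reminder of what plain `bracket` guarantees, here is a minimal sketch using only base. The "resource" is just a log held in an `IORef`; `withResource` and `demo` are hypothetical names for this example, and `generalBracket` itself is not shown.

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | Run a body between an acquire step and a release step; both steps
-- record themselves in the log (most recent entry first).
withResource :: IORef [String] -> IO a -> IO a
withResource logRef body =
  bracket
    (modifyIORef' logRef ("acquire" :))
    (\_ -> modifyIORef' logRef ("release" :))
    (\_ -> body)

-- | The body throws, yet the release step still runs before the exception
-- reaches the handler.
demo :: IO [String]
demo = do
  logRef <- newIORef []
  r <- try (withResource logRef (throwIO (ErrorCall "boom")))
  case r of
    Left (ErrorCall msg) -> modifyIORef' logRef (("caught " ++ msg) :)
    Right ()             -> pure ()
  reverse <$> readIORef logRef
```

Running `demo` yields `["acquire","release","caught boom"]`: the release action runs even though the body failed.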

Open recursion (episode 13)

Open recursion is a technique for defining objects in Haskell whose behaviour can be adjusted after they have been defined. It can be used to do some form of object-oriented programming in Haskell, and is also an interesting technique in its own right. Part of the episode is based on the paper The Different Aspects of Monads and Mixins [B. Oliveira].

If you’d like to step through the evaluation of the examples discussed in the episode yourself, you can find them in the announcement of the episode. For a more in-depth discussion of using open recursion for object oriented programming in Haskell, see the blog post Object Oriented Programming in Haskell.
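
For a quick taste of the technique, here is a minimal sketch with hypothetical names (`Counter`, `baseCounter`, `fromTen`, `new`); the examples in the episode are more elaborate.

```haskell
import Data.Function (fix)

-- | A tiny "object": a record of methods.
data Counter = Counter
  { start    :: Int
  , describe :: String
  }

-- | A "class" written with open recursion: methods refer to each other
-- through an explicit self argument instead of directly.
baseCounter :: Counter -> Counter
baseCounter self = Counter
  { start    = 0
  , describe = "counter starting at " ++ show (start self)
  }

-- | A "subclass" that overrides start and inherits describe.
fromTen :: Counter -> Counter
fromTen self = (baseCounter self) { start = 10 }

-- | Instantiating an object ties the knot: self becomes the final object.
new :: (Counter -> Counter) -> Counter
new = fix
```

Because `describe` goes through `self`, the inherited method sees the override: `describe (new baseCounter)` is `"counter starting at 0"`, while `describe (new fromTen)` is `"counter starting at 10"`.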

Interruptible operations (episode 15)

In episode 10 on generalBracket we discussed asynchronous exceptions: exceptions that can be thrown to a thread at any point. In that episode we saw that correct exception handling in the presence of asynchronous exceptions relies on carefully controlling precisely when they are delivered by masking (temporarily postponing) asynchronous exceptions at specific points. However, even when asynchronous exceptions are masked, some specific instructions can still be interrupted by asynchronous exceptions (technically, these are now synchronous). We discuss how this works, why it is important, and how to take interruptibility into account.

Monads and deriving-via (episode 16)

DerivingVia is a GHC extension based on the paper Deriving Via: or, How to Turn Hand-Written Instances into an Anti-Pattern [B. Blöndal, Andres Löh, R. Scott]. We discuss how this extension can be used to capture rules that relate type classes to each other. As a specific example, we discuss the definition of the Monad type class: ever since this definition was changed back in 2015 in the Applicative-Monad Proposal, instantiating Monad to a new datatype requires quite a bit of boilerplate code. By making the relation between “classic monads” and “new monads” explicit and using deriving-via, we can eliminate the boilerplate.

This episode was inspired by a Mastodon post by Martin Escardo.
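
The Monad example from the episode needs some setup, so here is a much smaller sketch of the extension itself: it reuses the `Semigroup` and `Monoid` instances of `Sum Int` for a new type. `Score` and `total` are hypothetical names for this illustration.

```haskell
{-# LANGUAGE DerivingVia #-}

import Data.Monoid (Sum (..))

-- | Scores combine by addition; the instances are borrowed from Sum Int,
-- which has the same runtime representation.
newtype Score = Score Int
  deriving stock (Eq, Show)
  deriving (Semigroup, Monoid) via (Sum Int)

-- | Add up a list of scores using the derived Monoid instance.
total :: [Score] -> Score
total = mconcat
```

With this, `total [Score 1, Score 2, Score 3]` evaluates to `Score 6` without any hand-written instance.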

Circular programs (episode 17)

A circular program is a program that depends on its own result. It may be surprising that this works at all, but laziness makes it possible if output becomes available sooner than it is required. We take a look at several examples of circular programs: the classic yet somewhat contrived RepMin problem [Richard Bird], the naming of bound variables in a lambda expression [E. Axelsson and K. Claessen], and breadth-first labelling [G. Jones and J. Gibbons].
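
The RepMin problem makes a compact illustration: replace every leaf of a tree by the minimum leaf value, in a single traversal. The following is a standard way to write it, given here as a sketch; the episode's version may differ in details.

```haskell
data Tree = Leaf Int | Node Tree Tree
  deriving (Eq, Show)

-- | Replace every leaf by the tree's minimum in one pass. The minimum m is
-- computed by go, and go also uses m to build the new leaves; laziness
-- makes this circular definition well-defined.
repMin :: Tree -> Tree
repMin t = t'
  where
    (m, t') = go t
    go (Leaf x)   = (x, Leaf m)
    go (Node l r) = (min ml mr, Node l' r')
      where
        (ml, l') = go l
        (mr, r') = go r
```

For example, `repMin (Node (Leaf 3) (Node (Leaf 1) (Leaf 2)))` evaluates to `Node (Leaf 1) (Node (Leaf 1) (Leaf 1))`.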

Generic (un)folds (episode 24)

We connect back to the very beginning of the Haskell Unfolder and talk about unfolds and folds. But this time, not only on lists, but on a much wider class of datatypes, namely those that can be written as a fixed point of a functor.
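
As a rough sketch of the idea, here are a fixed point of a functor together with a generic fold and unfold. The names `In`, `out`, `cata` and `ana` follow common usage, and `ListF`, `countdown` and `sumList` are hypothetical examples; the episode may use different names.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- | The fixed point of a functor f.
newtype Fix f = In { out :: f (Fix f) }

-- | Generic fold: collapse a structure one layer at a time.
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . out

-- | Generic unfold: grow a structure one layer at a time from a seed.
ana :: Functor f => (a -> f a) -> a -> Fix f
ana coalg = In . fmap (ana coalg) . coalg

-- | Lists as a fixed point: r marks the recursive position.
data ListF e r = NilF | ConsF e r
  deriving Functor

-- | Unfold the list n, n-1, ..., 1.
countdown :: Int -> Fix (ListF Int)
countdown = ana step
  where
    step n
      | n <= 0    = NilF
      | otherwise = ConsF n (n - 1)

-- | Fold a list to the sum of its elements.
sumList :: Fix (ListF Int) -> Int
sumList = cata alg
  where
    alg NilF        = 0
    alg (ConsF x s) = x + s
```

Here `sumList (countdown 4)` evaluates to `10`. The same `cata` and `ana` work unchanged for any other datatype expressed as `Fix` of a functor.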

Variable-arity functions (episode 26)

We take a look at how one can use Haskell’s class system to encode functions that take a variable number of arguments, and also discuss some examples where such functions can be useful.
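
A minimal sketch of the encoding, with hypothetical names (`SumArgs`, `sumWith`, `sumOf`): a class whose instances cover both "done, return the result" and "take one more argument".

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

-- | Result types that sumOf can produce: either the final Int, or a
-- function that accepts one more Int.
class SumArgs r where
  sumWith :: Int -> r

-- | Base case: no more arguments, return the accumulator.
instance SumArgs Int where
  sumWith acc = acc

-- | Recursive case: accept one more argument. The equality constraint
-- lets numeric literals be inferred as Int.
instance (a ~ Int, SumArgs r) => SumArgs (a -> r) where
  sumWith acc x = sumWith (acc + x)

-- | Sum any number of Int arguments; the arity is chosen by the call site.
sumOf :: SumArgs r => r
sumOf = sumWith 0
```

The result type annotation decides when the argument list ends: `sumOf 1 2 3 :: Int` evaluates to `6`, and `sumOf :: Int` to `0`.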

Community

Haskell at ICFP (episode 11)

We chat about ICFP (the International Conference on Functional Programming), the Haskell Symposium, and HIW (the Haskell Implementors’ Workshop) from 4-9 September 2023 in Seattle. We highlight a few select papers/presentations from these events:

Going forward

Of course this is only the beginning. The Haskell Unfolder continues, with the next episode airing next week (October 16). If you have any feedback on the Unfolder, or if there are specific topics you’d like to see covered, we’d love to hear from you! Feel free to email us at info@well-typed.com or leave a comment below any of the videos.


  1. Well-Typed makes no money off the Unfolder merch sales; your money goes straight to Spreadshirt.↩︎

by edsko, andres at October 08, 2024 12:00 AM

Michael Snoyman

Bitcoin vs Gold

I just watched an interview between Peter Schiff and Jack Mallers about gold versus Bitcoin. I’d recommend it to anyone interested in either asset, or more generally to those interested in the theories of economics and money in general:

Peter Schiff & Jack Mallers Debate Bitcoin Vs. Gold, Collapse Of Dollar

Peter represented the pro-gold side of this debate, with Jack taking the pro-Bitcoin side.

The premise

The debate itself takes a lot of premises for granted. In particular, both sides view rampant money printing and an out-of-control money supply as being unsound foundations for an economy. While I personally agree with this completely, it’s not at all a universally held belief. Many economists in fact recommend having low levels of inflation in the economy, due to the dangers of deflation.

The debate never touched on justifying these premises since both sides completely agreed. If you’re a viewer (or reader of this post) who hasn’t completely bought into this way of thinking, the discussion may be somewhat confusing. I may decide to write a post on this topic on its own in the future. (And if that’s something you’re interested in, let me know, I’m more likely to write it up if people want to see it.)

In any event, the upshot of this is that both parties agree that current fiat currency, without any backing by a hard asset, is a mistake. They both seemed to agree that the major breakdown in fiat currency was the complete removal of the gold standard in 1971, though they have crucial differences of opinion about why that happened which I’ll cover below.

The question is: what asset is better for fixing this problem, gold or Bitcoin?

Which asset is money?

Overall, I thought both sides represented their arguments well, and I’ll reference some of them below. There was only one piece of the debate that I think completely missed the point, but it’s crucial enough that I’ll start my analysis there. Peter and Jack discussed, at length, whether gold or Bitcoin counted as money. This of course raises the question: what’s your definition of money? Using the Wikipedia definition:

Money is any item or verifiable record that is generally accepted as payment for goods and services and repayment of debts, such as taxes, in a particular country or socio-economic context. The primary functions which distinguish money are: medium of exchange, a unit of account, a store of value and sometimes, a standard of deferred payment.

That first bit, “generally accepted as payment,” got a lot of attention in the debate. There were discussions for a while about whether you could walk into a store and buy things with either gold or Bitcoin, and more so, whether goods were priced in gold or Bitcoin. Both sides tried to make arguments around this claiming their side was money.

My disagreement on this part of the debate is that, in my opinion fairly obviously, neither asset is generally used as “money” today, at least by this definition of “used to buy goods in stores.” Instead, both assets are more generally used as investments, speculation, store of value, or any other long-term holding term you’d want to use. So by the focus of the debate, both assets fail at being money.

But that’s the wrong question! It’s not about whether or not these assets are money. Instead, I would want to ask two separate questions:

  • In an ideal world, which asset works better as money?
  • Which asset has the best path forward to becoming money?

In other words, I don’t think the question is about today. The question is instead about which future (gold vs Bitcoin) is better, and which future is more attainable.

And I think the debate provided a lot of food for thought to analyze these two questions.

Intrinsic value of gold

Peter pointed out that gold has intrinsic value. Gold is desired for jewelry, manufacturing, technology, and other purposes, in addition to being sought as a store of value. Jack was fairly dismissive of this, since money doesn’t need to have any intrinsic value. It only needs to be widely accepted. (Case in point: when I receive a stimulus check from the US government as a wire transfer into my bank account, there is 0 “intrinsic value” to that digital update of my bank account, but I can use it as money regardless.)

While I agree with Jack that intrinsic value is not a necessary property of money, I have to give the win to Peter on this. While intrinsic value isn’t necessary, it’s certainly a perk, and makes it less likely that the asset will stop being accepted for payment of some kind in the future. And we can see this with how the price of gold behaved after the end of the gold standard: it started to rise significantly.

Intrinsic value of Bitcoin

On the other hand, in the debate, neither side ever really applied the term “intrinsic value” to Bitcoin. Instead, Jack made some other and very strong arguments for advantages of Bitcoin over gold, namely:

  • Built in network for settlement. By contrast, gold can easily be exchanged physically with others locally, but cannot be settled in a global economy without the assistance of institutions or governments.
  • Completely fixed supply at 21 million BTC. By contrast, gold has an uncapped supply, which can be expanded through (expensive) mining.
  • Ability to self custody funds.
  • Censorship resistance.

Are these “intrinsic value?” Probably not, but it’s really an issue of semantics. The reality is that Bitcoin does provide these features, and gold mostly doesn’t. You could argue that physical gold is completely censorship-resistant because you can transfer gold to others without external approval. But that only works locally, not for non-local payments, which will be an important point in a bit.

My point in this section is that there are absolutely features of Bitcoin which gold does not have, and which might make it a better money. Is that more important than the “intrinsic value” argument for gold? That’s a highly subjective question. But my subjectivity says that yes, these make Bitcoin a better vehicle to act as money than gold.

Volatility

Another topic that was brought up was the volatility of Bitcoin. How can you use an asset as money when its price swings regularly between $55,000 and $65,000? (And that’s just in the past month!) The debate had some back-and-forth about the difference between volatility and risk. That was another semantics issue that didn’t matter much to me. The fact is, the price of Bitcoin in terms of dollars does fluctuate significantly.

Does that make Bitcoin a worse money? I’d say no, it doesn’t. It might be a barrier to the adoption of Bitcoin as money, since people will be hesitant to accept payment in an asset whose “real world” value changes so dramatically. But remember, I’m rephrasing the question not to “is Bitcoin money,” but rather to “is Bitcoin a good future money?” In that world, the fact that the exchange rate with another currency changes significantly is not a barrier to usage as money.

But perhaps more directly, it seems likely to me that Bitcoin being adopted as money would cause a significant reduction in volatility. Instead of exchanging Bitcoin for dollars each time you want to buy something, people would be living in a new Bitcoin-denominated economy. Fluctuations in exchange rate don’t preclude that.

And yes, this argument equally applies to gold being a good money. The difference is that the volatility of gold is significantly lower than Bitcoin.

What’s the better money?

I see both sides of the argument as valid and strong. For me, gold has the advantages of intrinsic value, existing adoption, and likely the longest track record in human history as being used as money. Those are some solid advantages.

By contrast, Bitcoin has a fully fixed supply and a network that allows for fast global settlement and self custody.

We could get into the other details discussed, but in my opinion none of the other points really address which is the better money overall.

For me, Bitcoin has a serious advantage here. Lack of centralized control is vital. And it becomes more so when we analyze the second question.

Which money can win?

I’ll say directly: I don’t see a world in which gold ends up being money again. I don’t think fiat currencies can go back to a gold standard without some insane debasement of the currency. And there’s no reason to believe the will exists among governments, politicians, and institutions to turn off the money printer. All the conditions that led to the removal of the gold standard in 1971 still apply today.

This is where Jack’s arguments really hit home for me. At a local level, maybe I could convince people in my town to accept gold coins. But my local supermarket is part of a multinational conglomerate. They won’t be sending shipments of gold coins across the world to pay vendors. The globalization of the economy is a large part of why the world moved towards the dollar–while still backed by the gold standard–as its reserve currency in the 20th century. It was easy to move around a representation of gold. The removal of the gold standard was simply the next logical step, allowing the US to create money out of thin air.

By contrast, Bitcoin is ready to compete now. There are systems already which allow you to connect credit cards and other “normal” payment methods to your Bitcoin balance. Services (such as Jack’s Strike) allow you to easily convert your paycheck to Bitcoin. Bitcoin may not win at displacing the dollar, but there’s a clear plan from the pro-Bitcoin side towards making it easy to use Bitcoin while still keeping your funds in a non-inflationary asset. Market forces and the self-interest of many can simply continue to drive adoption. (This is another topic I’ve been thinking about writing a post on, so if the details here seem flimsy, ask for details and I might write that one up too.)

To be fair, gold could do much of this as well. Peter mentioned tokenization of gold multiple times in the debate. And I don’t disagree with him overall. However, in practice, Jack’s point stands that this relies on institutions and governments, and there’s no reason to believe another kind of debasement couldn’t happen again in the future. And empirically, we haven’t seen a move back towards the gold standard in the past fifty years, while the Bitcoin revolution has momentum.

In other words, if you asked me to place a bet on which asset will end up being used as money at scale, my bet is on Bitcoin.

Fallible humans

I used my own analysis, not Peter and Jack’s, at the end of the previous section. Let me go back to the actual arguments they made. We know that human beings made a decision to move the US away from the gold standard and towards our current fiat system. Both Peter and Jack believe this was a mistake. But their takes on this are slightly–but importantly–different:

  • Peter points to “fallible humans” as the problem. Politicians got greedy, wanted more money to print to buy votes, wage wars, buy off lobbyists, or whatever else they wanted to do. It’s not an inherent flaw in gold, it’s an inherent flaw in people.
  • Jack agrees with the fallible humans part (I think). But he lays the blame directly on gold itself. Because gold necessitates centralization with institutions and governments to allow for global trade, a gold-based money will always put too much power in the hands of those fallible people. Bitcoin, by contrast, has no centralization of power at all.

Jack’s argument overall wins the day for me.

Takeaways

The debate was great, and I’d recommend others take the time to watch it. My conclusion above is clear: I think Bitcoin has the edge for becoming a new money system. But that’s just theoretical. What should individuals do about all this?

At the moment, neither Bitcoin nor gold is used as money, at least not widely. Right now, for the most part, by buying these assets, you’re hoping for a long-term store of value which defeats inflation.

Jack made some good data-driven points that, in fact, gold has not achieved that since the end of the gold standard. Gold has averaged a 7% annual appreciation, while average consumer prices have risen 8% annually. Stocks have gone up even more at 11%. (I haven’t checked these numbers myself, just repeating them.)

Bitcoin, by contrast, has massively outperformed everything over its lifetime. It’s gone from less than a dollar to over $60,000 in the span of 13 years. That’s an unbelievably good asset to invest in… if it continues.

And that’s where Peter’s points land solidly too. Judging Bitcoin based on only 13 years of data, and trying to extrapolate to the future from that, is naive at best. While in theory Bitcoin is poised to be a great money, and at least a powerful store of value, it’s entirely possible that it will fail. Gold, by contrast, has little risk of losing a significant portion of its value over time, barring significant technological changes making it cheaper to increase the supply (e.g., space mining, alchemy, new terrestrial deposits).

One of Peter’s last comments was to recommend Bitcoiners “take profits” on their Bitcoin and hedge with some gold investments. I put “take profits” in scare quotes, since it implicitly identifies Bitcoin as nothing more than a speculative asset, presuming Peter’s world view that Bitcoin is not money. Nonetheless, I think this is wise advice.

What I’m doing

I wrote up another post, not yet published, about my current stance on Bitcoin; I’ll likely publish it in the near future. I’ll get into my overall approach there. For this specific debate, I can say that I put money into both Bitcoin and gold, and intend to continue doing so. I have no intention of selling my holdings in either in the foreseeable future. And I always keep enough fiat around to cover unexpected events (home repair, job loss, etc).

October 08, 2024 12:00 AM

October 07, 2024

Michael Snoyman

Personal update, upcoming blogging

My blog posts have slowed down quite a bit over the past few years. I’m probably going to be ramping back up on posting, and covering some new topics. So I thought a quick update on what’s been happening with me would be in order.

Twins

In 2021, we welcomed two wonderful babies into the world, and just celebrated their third birthday. This by far accounts for most of my radio silence over the past few years. Between a difficult high-risk pregnancy and then the challenge of juggling two babies (on top of our other kids), Miriam’s and my lives have been pretty thoroughly dominated for the past 3.5 years.

Things have certainly gotten easier in the past year, and time spent at school has always given us a bit of a breather. Though that hasn’t exactly been consistent…

War

My family moved to a small town in Northern Israel in 2009. We live 10 kilometers from the Lebanese border. For the past year, we’ve been in the line of fire from Hezbollah rocket attacks. We didn’t live through the previous war in 2006. When we moved here, we met lots of people who told us war stories of living for months on end in bomb shelters.

I consider us to be relatively lucky. Iron Dome has brought life much closer to normalcy. But we’re still living through regular rocket attacks, artillery responses, air raid sirens, and overall tension. At this point, everyone in the family jumps if we hear a car door close too loudly. Our children haven’t had a regular year of school for five years running (between COVID and now entering our second year of war). Our three year old daughter is terrified any time the sirens go off or the explosions shake the house.

Lots of friends and family have been worried about us. We’ve set up a Telegram broadcast channel to send updates, especially after large attacks on Northern Israel. (I morbidly call it the “proof of life” channel.) I don’t feel comfortable sharing that link publicly, but if you have my contact info and would like an invite, just send me a private message.

Additionally, since we’ve gotten the question a lot: at this time, we do not have plans to leave Israel. But we’re open to changing our mind on that as the situation develops.

Illness

Perhaps a result of the above two stressors, or perhaps completely unrelated, I was fairly sick for the past 2.5 years. I had something called Silent Sinus Syndrome, a perpetual infection in my left maxillary sinus that created negative pressure and resulted in weeks-long bouts of fever, especially after physical exertion. I eventually had to give up weight lifting as the disease progressed.

I was scheduled to have surgery last October, but when the war broke out, all elective surgeries were canceled. Miriam was ultimately able to get the surgery set up through a private hospital in March, and after a few months I began to feel much better. Today, I feel healthier than I have in at least five years. I’m back to weight lifting, dieting properly again, and overall simply relieved to be free of a chronic illness.

Blockchain space

For the past three years, I’ve more or less been working full time in the blockchain space. This has still been work done through FP Complete for our customers, and has touched on GameFi, DeFi, and a few other areas.

I’ve been working off-and-on in the space for the past eight years or so. When I first got into the space, I wasn’t particularly excited or impressed by what I found. Like many others, I saw a world of scams and poorly implemented technology, of get-rich-quick schemes that could generously be called Ponzis.

I’ve definitely changed my perspective a bit, and I think the industry itself has reached a new level of maturity. “Crypto” is still young, and it’s still evolving rapidly, but it seems to me like we’re past the initial few phases of the hype curve, and we’re beginning to find what blockchain is actually a good technology for.

Many of you know that my formal education was not in software, but in actuarial science. Getting to leverage the math, finance, and statistics muscles again has been a huge perk of moving deeper into the blockchain space, and I’m looking forward to more of that.

Rust

Most of the blockchain work I’ve done has been in the Cosmos ecosystem, focused on CosmWasm smart contracts. That’s necessitated a lot more work with Rust than I had done previously. The necessity of Rust in smart contracts has really been a forcing function for me to use Rust in even more places. At this point, our preferred tech stack for Cosmos projects is heavily Rust, leveraging our FP Complete cosmos-rs library for backend services, CosmWasm in contracts, Rust for off-chain data analytics, and occasionally even using Leptos for putting together frontends.

In addition to Rust, I also picked up quite a bit more experience with TypeScript and React over the past few years, which may explain why I like Leptos so much.

Haskell hasn’t had much of my attention over the last few years.

I haven’t spent much time blogging about Rust like I used to with Haskell. Part of that has simply been a time issue. Part of it has been a desire issue. I’m not quite as interested in churning out technical blog posts as I used to be. The topics still interest me, and if I come across an interesting topic or receive a blog topic request, I’ll likely still do a write-up. But a lot of my extra brain cycles have moved over to other topics to ponder.

Blogs

While I intend to continue blogging on technical topics, I’m planning on expanding my focus on this blog. I already started adding in some health/diet/exercise posts in the past. I’m planning on expanding this a bit further to economics. My work in the blockchain space has really woken up those old muscles. And the times we’re living through, with wars, COVID stimulus, general money printing, and overall chaos, are all leading to very interesting changes in the world. I plan on putting on my amateur economist hat.

I really enjoy writing on topics I’m passionate about. I find it cathartic. So don’t be surprised to see upticks in blog posts during the worst of the war in Israel. Nothing better than sitting in a safe room with my wife and six kids typing up a blog post :).

Conferences

Between COVID, the twins, and the war, I’ve done very little traveling over the past five years. I’m starting to change that up, air travel permitting. I’m attending Cosmoverse in Dubai later this month. If anyone’s going to be attending, let me know, I’m looking forward to meeting people in real life again!

Most of my conference attendance will likely be in the blockchain space, but if timing allows, I’ll probably try to make it to some Rust and functional programming conferences too.

October 07, 2024 12:00 AM

October 05, 2024

Lysxia's blog

Unicode shenanigans: Martine écrit en UTF-8

An old French meme
Martine écrit en UTF-8 (parody cover of the Martine series of French children's books)

On my feed aggregator haskell.pl-a.net, I occasionally saw posts with broken titles like this (from ezyang’s blog):

Whatâ€™s different this time? LLM edition

Yesterday I decided to do something about it.

Locating the problem

Tracing back where it came from, that title was sent already broken by Planet Haskell, which is itself a feed aggregator for blogs. The blog originally produces the correct, unbroken title. Therefore the blame lies with Planet Haskell. It’s probably a misconfigured locale. Maybe someone will fix it. It seems to be running archaic software on an old machine, stuff I wouldn’t deal with myself so I won’t ask someone else to.

ASCII diagram of how a blog title travels through the relevant parties
      Blog
       |
       | What’s
       v
 Planet Haskell
       | 
       | Whatâ€™s
       v
haskell.pl-a.net (my site)
       |
       | Whatâ€™s
       v
  Your screen

In any case, this mistake can be fixed after the fact. Mis-encoded text is such a ubiquitous issue that there are nicely packaged solutions out there, like ftfy.

ftfy has been used as a data processing step in major NLP research, including OpenAI’s original GPT.

But my hobby site is written in OCaml and I would rather have fun solving this encoding problem than figure out how to install a Python program and call it from OCaml.

Explaining the problem

This is the typical situation where a program is assuming the wrong text encoding.

Text encodings

A quick summary for those who don’t know about text encodings.

Humans read and write sequences of characters, while computers talk to each other using sequences of bytes. If Alice writes a blog, and Bob wants to read it from across the world, the characters that Alice writes must be encoded into bytes so her computer can send it over the internet to Bob’s computer, and Bob’s computer must decode those bytes to display them on his screen. The mapping between sequences of characters and sequences of bytes is called an encoding.

Multiple encodings are possible, but it’s not always obvious which encoding to use to decode a given byte string. There are good and bad reasons for this, but the net effect is that many text-processing programs arbitrarily guess and assume the encoding in use, and sometimes they assume wrong.

Back to the problem

UTF-8 is the most prevalent encoding nowadays.1 I’d be surprised if one of the Planet Haskell blogs doesn’t use it, which is ironic considering the issue we’re dealing with.

  1. A blog using UTF-8 encodes the right single quote2 " ’ " as three consecutive bytes (226, 128, 153) in its RSS or Atom feed.
  2. The culprit, Planet Haskell, read those bytes but wrongly assumed an encoding different from UTF-8 where each byte corresponds to one character.
  3. It did some transformation to the decoded text (extract the title and body and put it on a webpage with other blogs).
  4. It encoded the final result in UTF-8.

ASCII diagram of how text gets encoded and decoded (wrongly)
      What the blog sees →       '’'
                                  |
                                  | UTF-8 encode (one character into three bytes)
                                  v
                             226 128 153
                                  |
                                  | ??? decode (not UTF-8)
                                  v
What Planet Haskell sees →   'â' '€' '™'
                                  |
                                  | UTF-8 encode
                                  v
                                (...)
                                  |
                                  | UTF-8 decode
                                  v
            What you see →   'â' '€' '™'

The final encoding doesn’t really matter, as long as everyone else downstream agrees with it. The point is that Planet Haskell outputs three characters “â€™” in place of the right single quote " ’ ", all because UTF-8 represents " ’ " with three bytes.
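
The pipeline in the diagram can be reproduced in a few lines. This is a Haskell sketch (not the code running on either site), and it assumes the wrong decoder is Windows-1252, which is what the rest of this post ends up establishing. All function names are made up for the example.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | UTF-8 encode a single character (one to four bytes).
utf8Encode :: Char -> [Word8]
utf8Encode c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n      = ord c
    hi k   = fromIntegral (n `shiftR` k)
    cont k = 0x80 .|. (fromIntegral (n `shiftR` k) .&. 0x3F)

-- | Code points for bytes 0x80..0x9F in Windows-1252; 0 marks the five
-- unused positions.
cp1252High :: [Int]
cp1252High =
  [ 0x20AC, 0, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021
  , 0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0, 0x017D, 0
  , 0, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014
  , 0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0, 0x017E, 0x0178
  ]

-- | Decode one byte as one Windows-1252 character. Unused positions are
-- passed through as the code point of the same number (an assumption).
cp1252Decode :: Word8 -> Char
cp1252Decode b
  | b >= 0x80 && b <= 0x9F, cp /= 0 = chr cp
  | otherwise                       = chr (fromIntegral b)
  where
    cp = cp1252High !! (fromIntegral b - 0x80)

-- | Encode correctly, then decode with the wrong encoding.
mojibake :: String -> String
mojibake = map cp1252Decode . concatMap utf8Encode
```

Here `utf8Encode '\8217'` (the right single quote) gives `[226,128,153]`, and `mojibake "What\8217s"` gives `"What\226\8364\8482s"`, that is, the three characters â, € and ™ in place of the quote.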

In spite of their differences, most encodings in practice agree at least about ASCII characters, in the range 0-127, which is sufficient to contain the majority of English language writing if you can compromise on details such as confusing the apostrophe and the single quotes. That’s why in the title “What’s different this time?” everything but one character was transferred fine.

Solving the problem

The fix is simple: replace “â€™” with " ’ ". Of course, we also want to do that with all other characters that are mis-encoded the same way: those are exactly all the non-ASCII Unicode characters. The more general fix is to invert Planet Haskell’s decoding logic. Thank the world that this mistake can be reversed to begin with. If information had been lost by mis-encoding, I may have been forced to use one of those dreadful LLMs to reconstruct titles.3

  1. Decode Planet Haskell’s output in UTF-8.
  2. Encode each character as a byte to recover the original output from the blog.
  3. Decode the original output correctly, in UTF-8.

There is one missing detail: what encoding to use in step 2? I first tried the naive thing: each character is canonically a Unicode code point, which is a number between 0 and 1114111, and I just hoped that those which did occur would fit in the range 0-255. That amounts to making the hypothesis that Planet Haskell is decoding blog posts in Latin-1. That seems likely enough, but you will have guessed correctly that the naive thing did not reconstruct the right single quote in this case. The Latin-1 hypothesis was proven false.

As it turns out, the euro sign “€” and the trademark symbol “™” are not in the Latin-1 alphabet. They are code points 8364 and 8482 in Unicode, which are not in the range 0-255. Planet Haskell has to be using an encoding that features these two symbols. I needed to find which one.

Faffing about, I came across the Wikipedia article on Western Latin character sets which lists a comparison table. How convenient. I looked up the two symbols to find what encoding had them, if any. There were two candidates: Windows-1252 and Macintosh. Flip a coin. It was Windows-1252.

Windows-1252 differs from Latin-1 (and thus Unicode) in 27 positions, those whose byte starts with 8 or 9 in hexadecimal (27 valid characters + 5 unused positions): that’s 27 characters that I had to map manually to the range 0-255 according to the Windows-1252 encoding, and the remaining characters would be mapped for free by Unicode. This data entry task was autocompleted halfway through by Copilot, because of course GPT-* knows Windows-1252 by heart.

let windows1252_hack (c : Uchar.t) : int =
  let c = Uchar.to_int c in
  if      c = 0x20AC then 0x80
  else if c = 0x201A then 0x82
  else if c = 0x0192 then 0x83
  else if c = 0x201E then 0x84
  else if c = 0x2026 then 0x85
  else if c = 0x2020 then 0x86
  else if c = 0x2021 then 0x87
  else if c = 0x02C6 then 0x88
  else if c = 0x2030 then 0x89
  else if c = 0x0160 then 0x8A
  else if c = 0x2039 then 0x8B
  else if c = 0x0152 then 0x8C
  else if c = 0x017D then 0x8E
  else if c = 0x2018 then 0x91
  else if c = 0x2019 then 0x92
  else if c = 0x201C then 0x93
  else if c = 0x201D then 0x94
  else if c = 0x2022 then 0x95
  else if c = 0x2013 then 0x96
  else if c = 0x2014 then 0x97
  else if c = 0x02DC then 0x98
  else if c = 0x2122 then 0x99
  else if c = 0x0161 then 0x9A
  else if c = 0x203A then 0x9B
  else if c = 0x0153 then 0x9C
  else if c = 0x017E then 0x9E
  else if c = 0x0178 then 0x9F
  else c

And that’s how I restored the quotes, apostrophes, guillemets, accents, et autres in my feed.
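The three-step repair described above can also be sketched end to end. The following is a minimal Haskell sketch, not the code used for the feed: `toByte`, `decodeUtf8`, and `fixMojibake` are hypothetical names, the Windows-1252 table is deliberately abbreviated to a handful of entries, and the UTF-8 decoder does no validation of continuation bytes. It assumes step 1 has already happened, so the input is a string of the mis-decoded characters.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr, ord)

-- Step 2: map each mis-decoded character back to the byte it came from.
-- Only a few of the 27 Windows-1252 special cases are listed here
-- (an abbreviation for illustration); the rest would follow the same pattern.
toByte :: Char -> Int
toByte c = case ord c of
  0x20AC -> 0x80 -- euro sign
  0x2018 -> 0x91 -- left single quote
  0x2019 -> 0x92 -- right single quote
  0x201C -> 0x93 -- left double quote
  0x201D -> 0x94 -- right double quote
  0x2122 -> 0x99 -- trademark sign
  n      -> n    -- code points 0-255 coincide with Latin-1 bytes

-- Step 3: a minimal UTF-8 decoder (hypothetical helper, no validation of
-- continuation bytes). Anything unexpected becomes U+FFFD.
decodeUtf8 :: [Int] -> String
decodeUtf8 [] = []
decodeUtf8 (b : bs)
  | b < 0x80 = chr b : decodeUtf8 bs
  | b .&. 0xE0 == 0xC0, (c1 : rest) <- bs =
      safeChr (((b .&. 0x1F) `shiftL` 6) .|. cont c1) : decodeUtf8 rest
  | b .&. 0xF0 == 0xE0, (c1 : c2 : rest) <- bs =
      safeChr (((b .&. 0x0F) `shiftL` 12) .|. (cont c1 `shiftL` 6) .|. cont c2)
        : decodeUtf8 rest
  | b .&. 0xF8 == 0xF0, (c1 : c2 : c3 : rest) <- bs =
      safeChr (((b .&. 0x07) `shiftL` 18) .|. (cont c1 `shiftL` 12)
                 .|. (cont c2 `shiftL` 6) .|. cont c3)
        : decodeUtf8 rest
  | otherwise = '\xFFFD' : decodeUtf8 bs
  where
    cont x = x .&. 0x3F
    safeChr n = if n <= 0x10FFFF then chr n else '\xFFFD'

-- Steps 2 and 3 composed: characters -> original bytes -> intended text.
fixMojibake :: String -> String
fixMojibake = decodeUtf8 . map toByte
```

Under these assumptions, `fixMojibake "Whatâ€™s"` recovers the bytes 226, 128, 153 and decodes them to the single character " ’ ". Plain ASCII passes through unchanged, since both mappings are the identity below 128.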



Update: When Planet Haskell picked up this post, it fixed the intentional mojibake in the title.

Screenshot of Planet Haskell with a correctly displayed diacritic. October 05, 2024. Lysxia's blog. Unicode shenanigans: Martine écrit en UTF-8

There is no room for this in my mental model. Planet Haskell is doing something wild to parse blog titles.


  1. As of September 2024, UTF-8 is used by 98.3% of surveyed web sites.↩︎

  2. The Unicode right single quote is sometimes used as an apostrophe, to much disapproval.↩︎

  3. Or I could just query the blogs directly for their titles.↩︎

by Lysxia at October 05, 2024 12:00 AM

Christopher Allen

Routines in caring for children

I have 4 children aged 4, 3, almost 2, and 19 weeks. Parents are increasingly isolated from each other socially so it's harder to compare tactics and strategies for caregiving. I want to share a run-down of how my wife and I care for our children and what has seemed to work and what has not.

by Unknown at October 05, 2024 12:00 AM

October 04, 2024

Derek Elkins

Global Rebuilding, Coroutines, and Defunctionalization

Introduction

In 1983, Mark Overmars described global rebuilding in The Design of Dynamic Data Structures. The problem it was aimed at solving was turning the amortized time complexity bounds of batched rebuilding into worst-case bounds. In batched rebuilding we perform a series of updates to a data structure which may cause the performance of operations to degrade, but occasionally we expensively rebuild the data structure back into an optimal arrangement. If the updates don’t degrade performance too much before we rebuild, then we can achieve our target time complexity bounds in an amortized sense. An update that doesn’t degrade performance too much is called a weak update.

Taking an example from Okasaki’s Purely Functional Data Structures, we can consider a binary search tree where deletions occur by simply marking the deleted nodes as deleted. Then, once about half the tree is marked as deleted, we rebuild the tree into a balanced binary search tree and clean out the nodes marked as deleted at that time. In this case, the deletions count as weak updates because leaving the deleted nodes in the tree even when it corresponds to up to half the tree can only mildly impact the time complexity of other operations. Specifically, assuming the tree was balanced at the start, then deleting half the nodes could only reduce the tree’s depth by about 1. On the other hand, naive inserts are not weak updates as they can quickly increase the tree’s depth.
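To make the binary search tree example concrete, here is a minimal sketch of batched rebuilding with deletion marks. It is an illustration under stated assumptions, not Okasaki's code: the names `LazySet`, `fromSortedList`, `delete`, and `rebuild` are hypothetical, insertion is omitted entirely, and `build` recomputes lengths for brevity rather than efficiency.

```haskell
-- Each node carries a liveness flag; a deleted node stays in place
-- until the next rebuild.
data Tree a = Leaf | Node (Tree a) a Bool (Tree a)

data LazySet a = LazySet
  { size :: !Int   -- nodes physically present in the tree
  , dead :: !Int   -- nodes currently marked as deleted
  , tree :: Tree a
  }

-- Build a balanced tree from an ascending list.
fromSortedList :: [a] -> LazySet a
fromSortedList xs = LazySet (length xs) 0 (build xs)

build :: [a] -> Tree a
build [] = Leaf
build xs = Node (build l) x True (build r)
  where (l, x : r) = splitAt (length xs `div` 2) xs

member :: Ord a => a -> LazySet a -> Bool
member v = go . tree
  where
    go Leaf = False
    go (Node l x live r) = case compare v x of
      LT -> go l
      GT -> go r
      EQ -> live

-- The weak update: mark the node as deleted, then rebuild in one batch
-- once half of the nodes are dead.
delete :: Ord a => a -> LazySet a -> LazySet a
delete v s = case mark (tree s) of
    Nothing -> s
    Just t' ->
      let s' = s { dead = dead s + 1, tree = t' }
      in if 2 * dead s' >= size s' then rebuild s' else s'
  where
    mark Leaf = Nothing
    mark (Node l x live r) = case compare v x of
      LT -> (\l' -> Node l' x live r) <$> mark l
      GT -> (\r' -> Node l x live r') <$> mark r
      EQ | live      -> Just (Node l x False r)
         | otherwise -> Nothing

-- The expensive batch step: keep only live elements and rebalance.
rebuild :: LazySet a -> LazySet a
rebuild = fromSortedList . liveElems . tree

-- In-order traversal that skips nodes marked as deleted.
liveElems :: Tree a -> [a]
liveElems t = go t []
  where
    go Leaf acc = acc
    go (Node l x live r) acc =
      go l (if live then x : go r acc else go r acc)
```

Starting from eight elements, the first three deletions only flip flags, and the fourth one triggers the rebuild. That single expensive step is exactly the cost that global rebuilding aims to spread across the preceding operations.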

The idea of global rebuilding is relatively straightforward, though how you would actually realize it in any particular example is not. The overall idea is simply that instead of waiting until the last moment and then rebuilding the data structure all at once, we’ll start the rebuild sooner and work at it incrementally as we perform other operations. If we update the new version faster than we update the original version, we’ll finish it by the time we would have wanted to perform a batch rebuild, and we can just switch to this new version.

More concretely, though still quite vaguely, global rebuilding involves, when a threshold is reached, rebuilding by creating a new “empty” version of the data structure called the shadow copy. The original version is the working copy. Work on rebuilding happens incrementally as operations are performed on the data structure. During this period, we service queries from the working copy and continue to update it as usual. Each update needs to make more progress on building the shadow copy than it worsens the working copy. For example, an insert should insert more nodes into the shadow copy than the working copy. Once the shadow copy is built, we may still have more work to do to incorporate changes that occurred after we started the rebuild. To this end, we can maintain a queue of update operations performed on the working copy since the start of a rebuild, and then apply these updates, also incrementally, to the shadow copy. Again, we need to apply the updates from the queue at a fast enough rate so that we will eventually catch up. Of course, all of this needs to happen fast enough so that 1) the working copy doesn’t get too degraded before the shadow copy is ready, and 2) we don’t end up needing to rebuild the shadow copy before it’s ready to do any work.

Coroutines

Okasaki passingly mentions that global rebuilding “can be usefully viewed as running the rebuilding transformation as a coroutine”. Also, the situation described above is quite reminiscent of garbage collection. There the classic half-space stop-the-world copying collector is naturally the batched rebuilding version. More incremental versions often have read or write barriers and break the garbage collection into incremental steps. Garbage collection is also often viewed as two processes coroutining.

The goal of this article is to derive global rebuilding-based data structures from an expression of them as two coroutining processes. Ideally, we should be able to take a data structure implemented via batched rebuilding and simply run the batch rebuilding step as a coroutine. Modifying the data structure’s operations and the rebuilding step should, in theory, just be a matter of inserting appropriate yield statements. Of course, it won’t be that easy since the batched version of rebuilding doesn’t need to worry about concurrent updates to the original data structure.

In theory, such a representation would be a perfectly effective way of articulating the global rebuilding version of the data structure. That said, I will be using the standard power move of CPS transforming and defunctionalizing to get a more data structure-like result.

I’ll implement coroutines as a very simplified case of modeling cooperative concurrency with continuations. In that context, a “process” written in continuation-passing style “yields” to the scheduler by passing its continuation to a scheduling function. Normally, the scheduler would place that continuation at the end of a work queue and then pick up a continuation from the front of the work queue and invoke it resuming the previously suspended “process”. In our case, we only have two “processes” so our “work queue” can just be a single mutable cell. When one “process” yields, it just swaps its continuation into the cell and the other “process’” out and invokes the continuation it read.

Since the rebuilding process is always driven by the main process, the pattern is a bit more like generators. This has the benefit that only the rebuilding process needs to be written in continuation-passing style. The following is a very quick and dirty set of functions for this.

module Coroutine ( YieldFn, spawn ) where
import Control.Monad ( join )
import Data.IORef ( IORef, newIORef, readIORef, writeIORef )

type YieldFn = IO () -> IO ()

yield :: IORef (IO ()) -> IO () -> IO ()
yield = writeIORef

resume :: IORef (IO ()) -> IO ()
resume = join . readIORef

terminate :: IORef (IO ()) -> IO ()
terminate yieldRef = writeIORef yieldRef (ioError $ userError "Subprocess completed")

spawn :: (YieldFn -> IO () -> IO ()) -> IO (IO ())
spawn process = do
    yieldRef <- newIORef undefined
    writeIORef yieldRef $ process (yield yieldRef) (terminate yieldRef)
    return (resume yieldRef)

A simple example of usage is:

import Control.Monad ( forM_ )
import Coroutine ( YieldFn, spawn )

process :: YieldFn -> Int -> IO () -> IO ()
process     _ 0 k = k
process yield i k = do
    putStrLn $ "Subprocess: " ++ show i
    yield $ process yield (i-1) k

example :: IO ()
example = do
    resume <- spawn $ \yield -> process yield 10
    forM_ [(1 :: Int) .. 10] $ \i -> do
        putStrLn $ "Main process: " ++ show i
        resume
    putStrLn "Main process done"

with output:

Main process: 1
Subprocess: 10
Main process: 2
Subprocess: 9
Main process: 3
Subprocess: 8
Main process: 4
Subprocess: 7
Main process: 5
Subprocess: 6
Main process: 6
Subprocess: 5
Main process: 7
Subprocess: 4
Main process: 8
Subprocess: 3
Main process: 9
Subprocess: 2
Main process: 10
Subprocess: 1
Main process done

Queues

I’ll use queues since they are very simple and Purely Functional Data Structures describes Hood-Melville Real-Time Queues in Figure 8.1 as an example of global rebuilding. We’ll end up with something quite similar which could be made more similar by changing the rebuilding code. Indeed, the differences are just an artifact of specific, easily changed details of the rebuilding coroutine, as we’ll see.

The examples I’ll present are mostly imperative, not purely functional. There are two reasons for this. First, I’m not focused on purely functional data structures and the technique works fine for imperative data structures. Second, it is arguably more natural to talk about coroutines in an imperative context. In this case, it’s easy to adapt the code to a purely functional version since it’s not much more than a purely functional data structure stuck in an IORef.

For a more imperative structure with mutable linked structure and/or in-place array updates, it would be more challenging to produce a purely functional version. The techniques here could still be used, though there are more “concurrency” concerns. While I don’t include the code here, I did a similar exercise for a random-access stack (a fancy way of saying a growable array). There the “concurrency” concern is that the elements you are copying to the new array may be popped and potentially overwritten before you switch to the new array. In this case, it’s easy to solve, since if the head pointer of the live version reaches the source offset for the copy, you can just switch to the new array immediately.

Nevertheless, I can easily imagine scenarios where it may be beneficial, if not necessary, for the coroutines to communicate more and/or for there to be multiple “rebuild” processes. The approach used here could be easily adapted to that. It’s also worth mentioning that even in simpler cases, non-constant-time operations will either need to invoke resume multiple times or need more coordination with the “rebuild” process to know when it can do more than a constant amount of work. This could be accomplished by the “rebuild” process simply recognizing this from the data structure state, or some state could be explicitly set to indicate this, or the techniques described earlier could be used, e.g. a different process for non-constant-time operations.

The code below uses the extensions BangPatterns, RecordWildCards, and GADTs.

Batched Rebuilding Implementation

We start with the straightforward, amortized constant-time queues where we push to a stack representing the back of the queue and pop from a stack representing the front. When the front stack is empty, we need to expensively reverse the back stack to make a new front stack.

I intentionally separate out the reverse step as an explicit rebuild function.

module BatchedRebuildingQueue ( Queue, new, enqueue, dequeue ) where
import Data.IORef ( IORef, newIORef, readIORef, writeIORef, modifyIORef )

data Queue a = Queue {
    queueRef :: IORef ([a], [a])
}

new :: IO (Queue a)
new = do
    queueRef <- newIORef ([], [])
    return Queue { .. }

dequeue :: Queue a -> IO (Maybe a)
dequeue q@(Queue { .. }) = do
    (front, back) <- readIORef queueRef
    case front of
        (x:front') -> do
            writeIORef queueRef (front', back)
            return (Just x)
        [] -> case back of
                [] -> return Nothing
                _ -> rebuild q >> dequeue q

enqueue :: a -> Queue a -> IO ()
enqueue x (Queue { .. }) =
    modifyIORef queueRef (\(front, back) -> (front, x:back))

rebuild :: Queue a -> IO ()
rebuild (Queue { .. }) =
    modifyIORef queueRef (\([], back) -> (reverse back, []))

Global Rebuilding Implementation

This step is where a modicum of thought is needed. We need to make the rebuild step from the batched version incremental. This is straightforward, if tedious, given the coroutine infrastructure. In this case, we incrementalize the reverse by reimplementing reverse in CPS with some yield calls inserted. Then we need to incrementalize append. Since we’re not waiting until front is empty, we’re actually computing front ++ reverse back. Incrementalizing append is hard, so we actually reverse front and then use an incremental reverseAppend (which is basically what the incremental reverse does anyway1).
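The identity behind this trick can be checked with a non-incremental sketch. The names `revAppend` and `rotate` below are hypothetical and appear nowhere in the implementation. The sketch only shows that reversing both lists and then reverse-appending one onto the other computes `front ++ reverse back`.

```haskell
-- Reverse the first list onto the front of the accumulator.
revAppend :: [a] -> [a] -> [a]
revAppend [] acc = acc
revAppend (x : xs) acc = revAppend xs (x : acc)

-- front ++ reverse back, expressed with three reverse-style passes:
-- reverse back, reverse front, then reverse-append the reversed front
-- onto the reversed back.
rotate :: [a] -> [a] -> [a]
rotate front back = revAppend rfront rback
  where
    rback  = revAppend back []
    rfront = revAppend front []
```

Each pass is a simple tail-recursive loop that moves one element per step, which is what makes it easy to cut into constant-sized pieces with yield calls.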

One of the first things to note about this code is that the actual operations are largely unchanged other than inserting calls to resume. In fact, dequeue is even simpler than in the batched version as we can just assume that front is always populated when the queue is not empty. dequeue is freed from the responsibility of deciding when to trigger a rebuild. Most of the bulk of this code is from reimplementing a reverseAppend function (twice).

The parts of this code that require some deeper thought are 1) knowing when a rebuild should begin, 2) knowing how “fast” the incremental operations should go2 (e.g. incrementalReverse does two steps at a time and the Hood-Melville implementation has an explicit exec2 that does two steps at a time), and 3) dealing with “concurrent” changes.

For the last, Overmars describes a queue of deferred operations to perform on the shadow copy once it finishes rebuilding. This kind of suggests a situation where the “rebuild” process can reference some “snapshot” of the data structure. In our case, that is the situation we’re in, since our data structures are essentially immutable data structures in an IORef. However, it can easily not be the case, e.g. the random-access stack. Also, this operation queue approach can easily be inefficient and inelegant. None of the implementations below will have this queue of deferred operations. It is easier, more efficient, and more elegant to just not copy over parts of the queue that have been dequeued, rather than have an extra phase of the rebuilding that just pops off the elements of the front stack that we just pushed. A similar situation happens for the random-access stack.

The use of drop could probably be easily eliminated. (I’m not even sure it’s still necessary.) It is mostly an artifact of (not) dealing with off-by-one issues.

module GlobalRebuildingQueue ( Queue, new, dequeue, enqueue ) where
import Data.IORef ( IORef, newIORef, readIORef, writeIORef, modifyIORef, modifyIORef' )
import Coroutine ( YieldFn, spawn )

data Queue a = Queue {
    resume :: IO (),
    frontRef :: IORef [a],
    backRef :: IORef [a],
    frontCountRef :: IORef Int,
    backCountRef :: IORef Int
}

new :: IO (Queue a)
new = do
    frontRef <- newIORef []
    backRef <- newIORef []
    frontCountRef <- newIORef 0
    backCountRef <- newIORef 0
    resume <- spawn $ const . rebuild frontRef backRef frontCountRef backCountRef
    return Queue { .. }

dequeue :: Queue a -> IO (Maybe a)
dequeue q = do
    resume q
    front <- readIORef (frontRef q)
    case front of
        [] -> return Nothing
        (x:front') -> do
            modifyIORef' (frontCountRef q) pred
            writeIORef (frontRef q) front'
            return (Just x)

enqueue :: a -> Queue a -> IO ()
enqueue x q = do
    modifyIORef (backRef q) (x:)
    modifyIORef' (backCountRef q) succ
    resume q

rebuild :: IORef [a] -> IORef [a] -> IORef Int -> IORef Int -> YieldFn -> IO ()
rebuild frontRef backRef frontCountRef backCountRef yield = let k = go k in go k where
  go k = do
    frontCount <- readIORef frontCountRef
    backCount <- readIORef backCountRef
    if backCount > frontCount then do
        back <- readIORef backRef
        front <- readIORef frontRef
        writeIORef backRef []
        writeIORef backCountRef 0
        incrementalReverse back [] $ \rback ->
            incrementalReverse front [] $ \rfront ->
                incrementalRevAppend rfront rback 0 backCount k
      else do
        yield k

  incrementalReverse [] acc k = k acc
  incrementalReverse [x] acc k = k (x:acc)
  incrementalReverse (x:y:xs) acc k = yield $ incrementalReverse xs (y:x:acc) k

  incrementalRevAppend [] front !movedCount backCount' k = do
    writeIORef frontRef front
    writeIORef frontCountRef $! movedCount + backCount'
    yield k
  incrementalRevAppend (x:rfront) acc !movedCount backCount' k = do
    currentFrontCount <- readIORef frontCountRef
    if currentFrontCount <= movedCount then do
        -- This drop count should be bounded by a constant.
        writeIORef frontRef $! drop (movedCount - currentFrontCount) acc
        writeIORef frontCountRef $! currentFrontCount + backCount'
        yield k
      else if null rfront then
        incrementalRevAppend [] (x:acc) (movedCount + 1) backCount' k
      else
        yield $! incrementalRevAppend rfront (x:acc) (movedCount + 1) backCount' k

Defunctionalized Global Rebuilding Implementation

This step is completely mechanical.

There’s arguably no reason to defunctionalize. It produces a result that is more data-structure-like, but, unless you need the code to work in a first-order language, there’s nothing really gained by doing this. It does lead to a result that is more directly comparable to other implementations.

For some data structures, having the continuation be analyzable would provide a simple means for the coroutines to communicate. The main process could directly look at the continuation to determine its state, e.g. if a rebuild is in-progress at all. The main process could also directly manipulate the stored continuation to change the “rebuild” process’ behavior. That said, doing this would mean that we’re not deriving the implementation. Still, the opportunity for additional optimizations and simplifications is nice.

As a minor aside, while it is, of course, obvious from looking at the previous version of the code, it’s neat how the Kont data type implies that the call stack is bounded and that most calls are tail calls. REVERSE_STEP is the only constructor that contains a Kont argument, but its type means that that argument can’t itself be a REVERSE_STEP. Again, I just find it neat how defunctionalization makes this concrete and explicit.

module DefunctionalizedQueue ( Queue, new, dequeue, enqueue ) where
import Data.IORef ( IORef, newIORef, readIORef, writeIORef, modifyIORef, modifyIORef' )

data Kont a r where
  IDLE :: Kont a ()
  REVERSE_STEP :: [a] -> [a] -> Kont a [a] -> Kont a ()
  REVERSE_FRONT :: [a] -> !Int -> Kont a [a]
  REV_APPEND_START :: [a] -> !Int -> Kont a [a]
  REV_APPEND_STEP :: [a] -> [a] -> !Int -> !Int -> Kont a ()

applyKont :: Queue a -> Kont a r -> r -> IO ()
applyKont q IDLE _ = rebuildLoop q
applyKont q (REVERSE_STEP xs acc k) _ = incrementalReverse q xs acc k
applyKont q (REVERSE_FRONT front backCount) rback =
    incrementalReverse q front [] $ REV_APPEND_START rback backCount
applyKont q (REV_APPEND_START rback backCount) rfront =
    incrementalRevAppend q rfront rback 0 backCount
applyKont q (REV_APPEND_STEP rfront acc movedCount backCount) _ =
    incrementalRevAppend q rfront acc movedCount backCount

rebuildLoop :: Queue a -> IO ()
rebuildLoop q@(Queue { .. }) = do
    frontCount <- readIORef frontCountRef
    backCount <- readIORef backCountRef
    if backCount > frontCount then do
        back <- readIORef backRef
        front <- readIORef frontRef
        writeIORef backRef []
        writeIORef backCountRef 0
        incrementalReverse q back [] $ REVERSE_FRONT front backCount
      else do
        writeIORef resumeRef IDLE

incrementalReverse :: Queue a -> [a] -> [a] -> Kont a [a] -> IO ()
incrementalReverse q [] acc k = applyKont q k acc
incrementalReverse q [x] acc k = applyKont q k (x:acc)
incrementalReverse q (x:y:xs) acc k = writeIORef (resumeRef q) $ REVERSE_STEP xs (y:x:acc) k

incrementalRevAppend :: Queue a -> [a] -> [a] -> Int -> Int -> IO ()
incrementalRevAppend (Queue { .. }) [] front !movedCount backCount' = do
    writeIORef frontRef front
    writeIORef frontCountRef $! movedCount + backCount'
    writeIORef resumeRef IDLE
incrementalRevAppend q@(Queue { .. }) (x:rfront) acc !movedCount backCount' = do
    currentFrontCount <- readIORef frontCountRef
    if currentFrontCount <= movedCount then do
        -- This drop count should be bounded by a constant.
        writeIORef frontRef $! drop (movedCount - currentFrontCount) acc
        writeIORef frontCountRef $! currentFrontCount + backCount'
        writeIORef resumeRef IDLE
      else if null rfront then
        incrementalRevAppend q [] (x:acc) (movedCount + 1) backCount'
      else
        writeIORef resumeRef $! REV_APPEND_STEP rfront (x:acc) (movedCount + 1) backCount'

resume :: Queue a -> IO ()
resume q = do
    kont <- readIORef (resumeRef q)
    applyKont q kont ()

data Queue a = Queue {
    resumeRef :: IORef (Kont a ()),
    frontRef :: IORef [a],
    backRef :: IORef [a],
    frontCountRef :: IORef Int,
    backCountRef :: IORef Int
}

new :: IO (Queue a)
new = do
    frontRef <- newIORef []
    backRef <- newIORef []
    frontCountRef <- newIORef 0
    backCountRef <- newIORef 0
    resumeRef <- newIORef IDLE
    return Queue { .. }

dequeue :: Queue a -> IO (Maybe a)
dequeue q  = do
    resume q
    front <- readIORef (frontRef q)
    case front of
        [] -> return Nothing
        (x:front') -> do
            modifyIORef' (frontCountRef q) pred
            writeIORef (frontRef q) front'
            return (Just x)

enqueue :: a -> Queue a -> IO ()
enqueue x q = do
    modifyIORef (backRef q) (x:)
    modifyIORef' (backCountRef q) succ
    resume q

Functional Defunctionalized Global Rebuilding Implementation

This is just a straightforward reorganization of the previous code into purely functional code. This produces a persistent queue with worst-case constant time operations.

It is, of course, far uglier and more ad-hoc than Okasaki’s extremely elegant real-time queues, but the methodology to derive it was simple-minded. The result is also quite similar to the Hood-Melville Queues even though I did not set out to achieve that. That said, I’m pretty confident you could derive pretty much exactly the Hood-Melville queues with just minor modifications to Global Rebuilding Implementation.

module FunctionalQueue ( Queue, empty, dequeue, enqueue ) where

data Kont a r where
  IDLE :: Kont a ()
  REVERSE_STEP :: [a] -> [a] -> Kont a [a] -> Kont a ()
  REVERSE_FRONT :: [a] -> !Int -> Kont a [a]
  REV_APPEND_START :: [a] -> !Int -> Kont a [a]
  REV_APPEND_STEP :: [a] -> [a] -> !Int -> !Int -> Kont a ()

applyKont :: Queue a -> Kont a r -> r -> Queue a
applyKont q IDLE _ = rebuildLoop q
applyKont q (REVERSE_STEP xs acc k) _ = incrementalReverse q xs acc k
applyKont q (REVERSE_FRONT front backCount) rback =
    incrementalReverse q front [] $ REV_APPEND_START rback backCount
applyKont q (REV_APPEND_START rback backCount) rfront =
    incrementalRevAppend q rfront rback 0 backCount
applyKont q (REV_APPEND_STEP rfront acc movedCount backCount) _ =
    incrementalRevAppend q rfront acc movedCount backCount

rebuildLoop :: Queue a -> Queue a
rebuildLoop q@(Queue { .. }) =
    if backCount > frontCount then
        let q' = q { back = [], backCount = 0 } in
        incrementalReverse q' back [] $ REVERSE_FRONT front backCount
      else
        q { resumeKont = IDLE }

incrementalReverse :: Queue a -> [a] -> [a] -> Kont a [a] -> Queue a
incrementalReverse q [] acc k = applyKont q k acc
incrementalReverse q [x] acc k = applyKont q k (x:acc)
incrementalReverse q (x:y:xs) acc k = q { resumeKont = REVERSE_STEP xs (y:x:acc) k }

incrementalRevAppend :: Queue a -> [a] -> [a] -> Int -> Int -> Queue a
incrementalRevAppend q [] front' !movedCount backCount' =
    q { front = front', frontCount = movedCount + backCount', resumeKont = IDLE }
incrementalRevAppend q (x:rfront) acc !movedCount backCount' =
    if frontCount q <= movedCount then
        -- This drop count should be bounded by a constant.
        let !front = drop (movedCount - frontCount q) acc in
        q { front = front, frontCount = frontCount q + backCount', resumeKont = IDLE }
      else if null rfront then
        incrementalRevAppend q [] (x:acc) (movedCount + 1) backCount'
      else
        q { resumeKont = REV_APPEND_STEP rfront (x:acc) (movedCount + 1) backCount' }

resume :: Queue a -> Queue a
resume q = applyKont q (resumeKont q) ()

data Queue a = Queue {
    resumeKont :: !(Kont a ()),
    front :: [a],
    back :: [a],
    frontCount :: !Int,
    backCount :: !Int
}

empty :: Queue a
empty = Queue { resumeKont = IDLE, front = [], back = [], frontCount = 0, backCount = 0 }

dequeue :: Queue a -> (Maybe a, Queue a)
dequeue q =
    case front of
        [] -> (Nothing, q)
        (x:front') ->
            (Just x, q' { front = front', frontCount = frontCount - 1 })
  where q'@(Queue { .. }) = resume q

enqueue :: a -> Queue a -> Queue a
enqueue x q@(Queue { .. }) = resume (q { back = x:back, backCount = backCount + 1 })

Hood-Melville Implementation

This is just the Haskell code from Purely Functional Data Structures adapted to the interface of the other examples.

This code is mostly to compare. The biggest difference, other than some code structuring differences, is the front and back lists are reversed in parallel while my code does them sequentially. As mentioned before, to get a structure like that would simply be a matter of defining a parallel incremental reverse back in the Global Rebuilding Implementation.

Again, Okasaki’s real-time queue, which can be seen as an application of the lazy rebuilding and scheduling techniques described in his thesis and book, is a better implementation than this in pretty much every way.

module HoodMelvilleQueue (Queue, empty, dequeue, enqueue) where

data RotationState a
  = Idle
  | Reversing !Int [a] [a] [a] [a]
  | Appending !Int [a] [a]
  | Done [a]

data Queue a = Queue !Int [a] (RotationState a) !Int [a]

exec :: RotationState a -> RotationState a
exec (Reversing ok (x:f) f' (y:r) r') = Reversing (ok+1) f (x:f') r (y:r')
exec (Reversing ok [] f' [y] r') = Appending ok f' (y:r')
exec (Appending 0 f' r') = Done r'
exec (Appending ok (x:f') r') = Appending (ok-1) f' (x:r')
exec state = state

invalidate :: RotationState a -> RotationState a
invalidate (Reversing ok f f' r r') = Reversing (ok-1) f f' r r'
invalidate (Appending 0 f' (x:r')) = Done r'
invalidate (Appending ok f' r') = Appending (ok-1) f' r'
invalidate state = state

exec2 :: Int -> [a] -> RotationState a -> Int -> [a] -> Queue a
exec2 !lenf f state lenr r =
    case exec (exec state) of
        Done newf -> Queue lenf newf Idle lenr r
        newstate -> Queue lenf f newstate lenr r

check :: Int -> [a] -> RotationState a -> Int -> [a] -> Queue a
check !lenf f state !lenr r =
    if lenr <= lenf then exec2 lenf f state lenr r
    else let newstate = Reversing 0 f [] r []
         in exec2 (lenf+lenr) f newstate 0 []

empty :: Queue a
empty = Queue 0 [] Idle 0 []

dequeue :: Queue a -> (Maybe a, Queue a)
dequeue q@(Queue _ [] _ _ _) = (Nothing, q)
dequeue (Queue lenf (x:f') state lenr r) =
    let !q' = check (lenf-1) f' (invalidate state) lenr r in
    (Just x, q')

enqueue :: a -> Queue a -> Queue a
enqueue x (Queue lenf f state lenr r) = check lenf f state (lenr+1) (x:r)

Empirical Evaluation

I won’t reproduce the evaluation code as it’s not very sophisticated or interesting. It randomly generated a sequence of enqueues and dequeues with an 80% chance to produce an enqueue over a dequeue so that the queues would grow. It measured the average time of an enqueue and a dequeue, as well as the maximum time of any single dequeue.
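For readers who want to try something similar, the workload described above might be generated along the following lines. This is a hypothetical sketch, not the evaluation code: `Op`, `nextSeed`, `workload`, and `finalSize` are invented names, a toy linear congruential generator stands in for a real random number generator, and no timing is performed. It runs the operations against a plain two-list queue only to show the queue growing.

```haskell
import Data.List (foldl')

data Op = Enqueue Int | Dequeue
  deriving (Eq, Show)

-- Toy linear congruential generator (an assumption for illustration;
-- a real harness would likely use a proper RNG library).
nextSeed :: Int -> Int
nextSeed s = (s * 1103515245 + 12345) `mod` 2147483648

-- n operations, each an enqueue in roughly 8 out of 10 cases.
workload :: Int -> Int -> [Op]
workload seed n = take n (go (nextSeed seed) 0)
  where
    go s i
      | (s `div` 65536) `mod` 10 < 8 = Enqueue i : go (nextSeed s) (i + 1)
      | otherwise                    = Dequeue : go (nextSeed s) i

-- A plain batched two-list queue: (front, back).
type Queue = ([Int], [Int])

step :: Queue -> Op -> Queue
step (f, b) (Enqueue x) = (f, x : b)
step ([], []) Dequeue = ([], [])
step ([], b) Dequeue = step (reverse b, []) Dequeue
step (_ : f, b) Dequeue = (f, b)

-- Number of elements left in the queue after running all operations.
finalSize :: [Op] -> Int
finalSize ops = length f + length b
  where (f, b) = foldl' step ([], []) ops
```

Swapping `step` for the enqueue and dequeue of another implementation, and wrapping each call in a timer, would give a harness of the kind described, though the measurement details are left open here.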

The main thing I wanted to see was relatively stable average enqueue and dequeue times with only the batched implementation having a growing maximum dequeue time. This is indeed what I saw, though it took about 1,000,000 operations (or really a queue of a couple hundred thousand elements) for the numbers to stabilize.

The results were mostly unsurprising. Unsurprisingly, in overall time, the batched implementation won. Its enqueue is also, obviously, the fastest. (Indeed, there’s a good chance my measurement of its average enqueue time was largely a measurement of the timer’s resolution.) The operations’ average times were stable illustrating their constant (amortized) time. At large enough sizes, the ratio of the maximum dequeue time versus the average stabilized around 7000 to 1, except, of course, for the batched version which grew linearly to millions to 1 ratios at queue sizes of tens of millions of elements. This illustrates the worst-case time complexity of all the other implementations, and the merely amortized time complexity of the batched one.

While the batched version was best in overall time, the difference wasn’t that great. The worst implementations were still less than 1.4x slower. All the worst-case optimal implementations performed roughly the same, but there were still some clear winners and losers. Okasaki’s real-time queue (not listed) is almost on-par with the batched implementation in overall time and handily beats the other implementations in average enqueue and dequeue times. The main surprise for me was that the loser was the Hood-Melville queue. My guess is this is due to invalidate which seems like it would do more work and produce more garbage than the approach taken in my functional version.

Conclusion

The point of this article was to illustrate the process of deriving a deamortized data structure from an amortized one utilizing batch rebuilding by explicitly modeling global rebuilding as a coroutine.

The point wasn’t to produce the fastest queue implementation, though I am pretty happy with the results. While this is an extremely simple example, it was still nice that each step was very easy and natural. It’s especially nice that this derivation approach produced a better result than the Hood-Melville queue.

Of course, my advice is to use Okasaki’s real-time queue if you need a purely functional queue.


  1. This code could definitely be refactored to leverage this similarity to reduce code. Alternatively, one could refunctionalize the Hood-Melville implementation at the end.↩︎

  2. Going “too fast”, so long as it’s still a constant amount of work for each step, isn’t really an issue asymptotically, so you can just crank the knobs if you don’t want to think too hard about it. That said, going faster than you need to will likely give you worse worst-case constant factors. In some cases, going faster than necessary could reduce constant factors, e.g. by better utilizing caches and disk I/O buffers.↩︎

October 04, 2024 08:24 AM

Edward Z. Yang

What’s different this time? LLM edition

One of the things that I learned in grad school is that even if you've picked an important and unsolved problem, you need some reason to believe it is solvable--especially if people have tried to solve it before! In other words, "What's different this time?" This is perhaps a dreary way of shooting down otherwise promising research directions, but you can flip it around: when the world changes, you can ask, "What can I do now that I couldn't do before?"

This post is a list of problems in areas that I care about (half of this is PL flavor, since that's what I did my PhD in), where I suspect something has changed with the advent of LLMs. It's not a list of recipes; there is still hard work to figure out how exactly an LLM can be useful (for most of these, just feeding the entire problem into ChatGPT usually doesn't work). But I often talk to people who want to get started on something, anything, but have no idea where to start. Try here!

Static analysis. The chasm between academic static analysis work and real world practice is the scaling problems that come with trying to apply the technique to a full size codebase. Asymptotics strike as LOC goes up, language focused techniques flounder in polyglot codebases, and "Does anyone know how to write cmake?" But this is predicated on the idea that static analysis has to operate on a whole program. It doesn't; humans can do perfectly good static analysis on fragments of code without having to hold the entire codebase in their head, without needing access to a build system. They make assumptions about APIs and can do local reasoning. LLMs can play a key role in drafting these assumptions so that local reasoning can occur. What if the LLM gets it wrong? Well, if an LLM could get it wrong, an inattentive junior developer might get it wrong too--maybe there is a problem in the API design. LLMs already do surprisingly well if you one-shot prompt them to find bugs in code; with more traditional static analysis support, maybe they can do even better.

DSL purgatory. Consider a problem that can be solved with code in a procedural way, but only by writing lots of tedious, error prone boilerplate (some examples: drawing diagrams, writing GUIs, SQL queries, building visualizations, scripting website/mobile app interactions, end to end testing). The PL dream is to design a sweet compositional DSL that raises the level of abstraction so that you can render a Hilbert curve in seven lines of code. But history also abounds with cases where the DSL did not solve the problem, or maybe it did solve the problem but only after years of grueling work, and so there are still many problems that feel like there ought to be a DSL that solves them but there isn't. The promise of LLMs is that they are extremely good at regurgitating low level procedural actions that could conceivably be put together in a DSL. Many of the best successes of LLMs today come from putting coding powers in the hands of domain experts who otherwise do not know how to code; could they also help in putting domain expertise in the hands of people who can code?

I am especially interested in these domains:

  • SQL - Its strange syntax purportedly makes it easier for non-software engineers to understand, whereas many (myself included) would often prefer a more functional syntax à la LINQ/list comprehensions. It's pretty hard to make an alternate SQL syntax take off though, because SQL is not one language, but many many dialects everywhere with no obvious leverage point. That sounds like an LLM opportunity. Or heck, just give me one of those AI editor environments but specifically fine tuned for SQL/data visualization, don't even bother with general coding.
  • End to end testing - This is https://momentic.ai/ but personally I'm not going to rely on a proprietary product for testing in my OSS projects. There's definitely an OSS opportunity here.
  • Scripting website/mobile app interactions - The website scraping version of this is https://reworkd.ai/ but I am also pretty interested in this from the browser extension angle: to some extent I can take back control of my frontend experience with browser extensions; can I go further with LLMs? And we typically don't imagine that I can do the same with a mobile app... but maybe I can??

OSS bread and butter. Why is Tesseract still the number one OSS library for OCR? Why is smooth and beautiful text to voice not ubiquitous? Why is the voice control on my Tesla so bad? Why is the wake word on my Android device so unreliable? Why isn't the screenshot parser on a fansite for my favorite mobage able to parse out icons? The future has arrived, but it is not uniformly distributed.

Improving the pipeline from ephemeral to durable stores of knowledge. Many important sources of knowledge are trapped in "ephemeral" stores, like Discord servers, private chat conversations, Reddit posts, Twitter threads, blog posts, etc. In an ideal world, there would be a pipeline of this knowledge into more durable, indexable forms for the benefit of all, but actually doing this is time consuming. Can LLMs help? Note that the dream of LLMs is you can just feed all of this data into the model and just ask questions to it. I'm OK with something a little bit more manual, we don't have to solve RAG first.

by Edward Z. Yang at October 04, 2024 04:30 AM

October 02, 2024

Ken T Takusagawa

[mlzpqxqu] import with type signature

proposal for a Haskell language extension: when importing a function from another module, one may optionally also specify a type signature for the imported function.  this would be helpful for code understanding.  the reader would have immediately available the type of the imported symbol, not having to go track down the type in the source module (which may be many steps away when modules re-export symbols, and the source module might not even have a type annotation), nor use a tool such as ghci to query it.  (maybe the code currently fails to compile for other reasons, so ghci is not available.)

if a function with the specified type signature is not exported by an imported module, the compiler can offer suggestions of other functions exported by the module which do have, or unify with, the imported type signature.  maybe the function got renamed in a new version of the module.

or, the compiler can do what Hoogle does and search among all modules in its search path for functions with the given signature.  maybe the function got moved to a different module.

the specified type signature may be narrower than how the function was originally defined.  this can limit some of the insanity caused by the Foldable Traversable Proposal (FTP):

    import Prelude(length :: [a] -> Int) -- prevent length from being called on tuples and Maybe
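
for comparison, a similar narrowing can be approximated in today's Haskell with a local wrapper.  the sketch below only illustrates that workaround, not the proposed extension; it assumes the definitions live in a module of your own that wants a list-only length.

```haskell
import Prelude hiding (length)
import qualified Prelude

-- list-only wrapper around the Foldable-polymorphic Prelude.length.
-- calling it on a tuple or a Maybe is now a type error.
length :: [a] -> Int
length = Prelude.length

listLen :: Int
listLen = length "abc"         -- fine: the argument is a list

-- tupleLen = length (1, 2)    -- rejected by the type checker
```

the drawback, compared with the proposal, is that the narrowed type is stated in a separate definition rather than on the import line itself, so the reader still has to look in two places.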

various potentially tricky issues:

  1. a situation similar to the diamond problem (multiple inheritance) in object-oriented programming: module A defines a polymorphic function f, imported then re-exported by modules B and C.  module D imports both B and C, unqualified.  B imports and re-exports f from A with a type signature more narrow than originally defined in A.  C does not change the type signature.  what is the type of f as seen by D?  which version of f, which path through B or C, does D see?  solution might be simple: if the function through different paths are not identical, then the user has to qualify.

  2. the following tries to make List.length available only for lists, and Foldable.length available for anything else.  is this asking for trouble?

    import Prelude hiding(length);
    import qualified Prelude(length :: [a] -> Int) as List;
    import qualified Prelude(length) as Foldable;

by Unknown (noreply@blogger.com) at October 02, 2024 12:43 AM

Well-Typed.Com

The Haskell Unfolder Episode 33: diagrams

Today, 2024-10-02, at 1830 UTC (11:30 am PDT, 2:30 pm EDT, 7:30 pm BST, 20:30 CEST, …) we are streaming the 33rd episode of the Haskell Unfolder live on YouTube.

In this episode, we will look at the “diagrams” package, which provides a domain-specific language embedded into Haskell for describing all sorts of pictures and visualisations. Concretely, we will try to visualise the game tree of tic-tac-toe that we computed in Episode 32. However, this episode is understandable without having watched the previous episode, and should also be suitable for beginners.

About the Haskell Unfolder

The Haskell Unfolder is a YouTube series about all things Haskell hosted by Edsko de Vries and Andres Löh, with episodes appearing approximately every two weeks. All episodes are live-streamed, and we try to respond to audience questions. All episodes are also available as recordings afterwards.

We have a GitHub repository with code samples from the episodes.

And we have a public Google calendar (also available as ICal) listing the planned schedule.

There’s now also a web shop where you can buy t-shirts and mugs (and potentially in the future other items) with the Haskell Unfolder logo.

by andres, edsko at October 02, 2024 12:00 AM

October 01, 2024

Haskell Interlude

56: Satnam Singh

Today on the Haskell Interlude, Matti and Sam are joined by Satnam Singh. Satnam has been a lecturer at Glasgow, and Software Engineer at Google, Meta, and now Groq. He talks about convincing people to use Haskell, laying out circuits and why community matters.

PS: After the recording, it was important to Satnam to clarify that his advice to “not be afraid to lose your job” was specifically meant to encourage people to quit jobs that are not good for them, if possible, but he acknowledges that unfortunately not everybody can afford that risk.

by Haskell Podcast at October 01, 2024 05:00 PM

Brent Yorgey

Retiring BlogLiterately

Posted on October 1, 2024

Way back in 2012 I took over maintainership of the BlogLiterately tool from Robert Greayer, its initial author. I used it for many years to post to my Wordpress blog, added a bunch of features, solved some fun bugs, and created the accompanying BlogLiterately-diagrams plugin for embedding diagrams code in blog posts. However, now that I have fled Wordpress and rebuilt my blog with hakyll, I don’t use BlogLiterately any more (there is even a diagrams-pandoc package which does the same thing BlogLiterately-diagrams used to do). So, as of today I am officially declaring BlogLiterately unsupported.

The fact is, I haven’t actually updated BlogLiterately since March of last year. It currently only builds on GHC 9.4 or older, and no one has complained, which I take as strong evidence that no one else is using it either! However, if anyone out there is actually using it, and would like to take over as maintainer, I would be very happy to pass it along to you.

I do plan to continue maintaining HaXml and haxr, at least for now; unlike BlogLiterately, I know they are still in use, especially HaXml. However, BlogLiterately was really the only reason I cared about these packages personally, so I would be happy to pass them along as well; please get in touch if you would be willing to take over maintaining one or both packages.

by Brent Yorgey at October 01, 2024 12:00 AM

September 30, 2024

Chris Reade

PenroseKiteDart User Guide

Introduction

PenroseKiteDart is a Haskell package with tools to experiment with finite tilings of Penrose’s Kites and Darts. It uses the Haskell Diagrams package for drawing tilings. As well as providing drawing tools, this package introduces tile graphs (Tgraphs) for describing finite tilings. (I would like to thank Stephen Huggett for suggesting planar graphs as a way to represent the tilings.)

This document summarises the design and use of the PenroseKiteDart package.

The PenroseKiteDart package is now available on Hackage.

The source files are available on GitHub at https://github.com/chrisreade/PenroseKiteDart.

There is a small art gallery of examples created with PenroseKiteDart here.

Index

  1. About Penrose’s Kites and Darts
  2. Using the PenroseKiteDart Package (initial set up).
  3. Overview of Types and Operations
  4. Drawing in more detail
  5. Forcing in more detail
  6. Advanced Operations
  7. Other Reading

1. About Penrose’s Kites and Darts

The Tiles

In figure 1 we show a dart and a kite. All angles are multiples of 36^{\circ} (a tenth of a full turn). If the shorter edges are of length 1, then the longer edges are of length \phi, where \phi = (1+ \sqrt{5})/ 2 is the golden ratio.

Figure 1: The Dart and Kite Tiles

Aperiodic Infinite Tilings

What is interesting about these tiles is:

It is possible to tile the entire plane with kites and darts in an aperiodic way.

Such a tiling is non-periodic and does not contain arbitrarily large periodic regions or patches.

The possibility of aperiodic tilings with kites and darts was discovered by Sir Roger Penrose in 1974. There are other shapes with this property, including a chiral aperiodic monotile discovered in 2023 by Smith, Myers, Kaplan, and Goodman-Strauss. (See the Penrose Tiling Wikipedia page for the history of aperiodic tilings.)

This package is entirely concerned with Penrose’s kite and dart tilings also known as P2 tilings.

In figure 2 we add a temporary green line marking purely to illustrate a rule for making legal tilings. The purpose of the rule is to exclude the possibility of periodic tilings.

If all tiles are marked as shown, then whenever tiles come together at a point, they must all be marked or must all be unmarked at that meeting point. So, for example, each long edge of a kite can be placed legally on only one of the two long edges of a dart. The kite wing vertex (which is marked) has to go next to the dart tip vertex (which is marked) and cannot go next to the dart wing vertex (which is unmarked) for a legal tiling.

Figure 2: Marked Dart and Kite

Correct Tilings

Unfortunately, having a finite legal tiling is not enough to guarantee you can continue the tiling without getting stuck. Finite legal tilings which can be continued to cover the entire plane are called correct and the others (which are doomed to get stuck) are called incorrect. This means that decomposition and forcing (described later) become important tools for constructing correct finite tilings.

2. Using the PenroseKiteDart Package

You will need the Haskell Diagrams package (See Haskell Diagrams) as well as this package (PenroseKiteDart). When these are installed, you can produce diagrams with a Main.hs module. This should import a chosen backend for diagrams such as the default (SVG) along with Diagrams.Prelude.

    module Main (main) where
    
    import Diagrams.Backend.SVG.CmdLine
    import Diagrams.Prelude

For Penrose’s Kite and Dart tilings, you also need to import the PKD module and (optionally) the TgraphExamples module.

    import PKD
    import TgraphExamples

Then, to output the someExample figure:

    fig::Diagram B
    fig = someExample

    main :: IO ()
    main = mainWith fig

Note that the token B is used in the diagrams package to represent the chosen backend for output. So a diagram has type Diagram B. In this case B is bound to SVG by the import of the SVG backend. When the compiled module is executed it will generate an SVG file. (See Haskell Diagrams for more details on producing diagrams and using alternative backends).

3. Overview of Types and Operations

Half-Tiles

In order to implement operations on tilings (decompose in particular), we work with half-tiles. These are illustrated in figure 3 and labelled RD (right dart), LD (left dart), LK (left kite), RK (right kite). The join edges where left and right halves come together are shown with dashed lines, leaving one short edge and one long edge on each half-tile (excluding the join edge). We have shown a red dot at the vertex we regard as the origin of each half-tile (the tip of a half-dart and the base of a half-kite).

Figure 3: Half-Tile pieces showing join edges (dashed) and origin vertices (red dots)

The labels are actually data constructors introduced with the type constructor HalfTile, which has an argument type (rep) to allow for more than one representation of the half-tiles.

    data HalfTile rep 
      = LD rep -- Left Dart
      | RD rep -- Right Dart
      | LK rep -- Left Kite
      | RK rep -- Right Kite
      deriving (Show,Eq)

Tgraphs

We introduce tile graphs (Tgraphs) which provide a simple planar graph representation for finite patches of tiles. For Tgraphs we first specialise HalfTile with a triple of vertices (positive integers) to make a TileFace such as RD(1,2,3), where the vertices go clockwise round the half-tile triangle starting with the origin.

    type TileFace  = HalfTile (Vertex,Vertex,Vertex)
    type Vertex    = Int  -- must be positive

The function

    makeTgraph :: [TileFace] -> Tgraph

then constructs a Tgraph from a TileFace list after checking the TileFaces satisfy certain properties (described below). We also have

    faces :: Tgraph -> [TileFace]

to retrieve the TileFace list from a Tgraph.

As an example, the fool (short for fool’s kite and also called an ace in the literature) consists of two kites and a dart (= 4 half-kites and 2 half-darts):

    fool :: Tgraph
    fool = makeTgraph [RD (1,2,3), LD (1,3,4)   -- right and left dart
                      ,LK (5,3,2), RK (5,2,7)   -- left and right kite
                      ,RK (5,4,3), LK (5,6,4)   -- right and left kite
                      ]

To produce a diagram, we simply draw the Tgraph

    foolFigure :: Diagram B
    foolFigure = draw fool

which will produce the diagram on the left in figure 4.

Alternatively,

    foolFigure :: Diagram B
    foolFigure = labelled drawj fool

will produce the diagram on the right in figure 4 (showing vertex labels and dashed join edges).

Figure 4: Diagram of fool without labels and join edges (left), and with (right)

When any (non-empty) Tgraph is drawn, a default orientation and scale are chosen based on the lowest numbered join edge. This is aligned on the positive x-axis with length 1 (for darts) or length \phi (for kites).

Tgraph Properties

Tgraphs are actually implemented as

    newtype Tgraph = Tgraph [TileFace]
                     deriving (Show)

but the data constructor Tgraph is not exported, to avoid accidentally by-passing checks for the required properties. The properties checked by makeTgraph ensure the Tgraph represents a legal tiling as a planar graph with positive vertex numbers, and that the collection of half-tile faces is connected and has no crossing boundaries (see note below). Finally, there is a check to ensure two or more distinct vertex numbers are not used to represent the same vertex of the graph (a touching vertex check). An error is raised if there is a problem.

Note: If the TileFaces are faces of a planar graph there will also be exterior (untiled) regions, and in graph theory these would also be called faces of the graph. To avoid confusion, we will refer to these only as exterior regions, and unless otherwise stated, face will mean a TileFace. We can then define the boundary of a list of TileFaces as the edges of the exterior regions. There is a crossing boundary if the boundary crosses itself at a vertex. We exclude crossing boundaries from Tgraphs because they prevent us from calculating relative positions of tiles locally and create touching vertex problems.

For convenience, in addition to makeTgraph, we also have

    makeUncheckedTgraph :: [TileFace] -> Tgraph
    checkedTgraph   :: [TileFace] -> Tgraph

The first of these (performing no checks) is useful when you know the required properties hold. The second performs the same checks as makeTgraph except that it omits the touching vertex check. This could be used, for example, when making a Tgraph from a sub-collection of TileFaces of another Tgraph.

Main Tiling Operations

There are three key operations on finite tilings, namely

    decompose :: Tgraph -> Tgraph
    force     :: Tgraph -> Tgraph
    compose   :: Tgraph -> Tgraph

Decompose

Decomposition (also called deflation) works by splitting each half-tile into either 2 or 3 new (smaller scale) half-tiles, to produce a new tiling. The fact that this is possible is used to establish the existence of infinite aperiodic tilings with kites and darts. Since our Tgraphs have abstracted away from scale, the result of decomposing a Tgraph is just another Tgraph. However, if we wish to compare before and after with a drawing, the latter should be scaled by a factor 1/{\phi} = \phi - 1 times the scale of the former, to reflect the change in scale.

Figure 5: fool (left) and decompose fool (right)

We can, of course, iterate decompose to produce an infinite list of finer and finer decompositions of a Tgraph

    decompositions :: Tgraph -> [Tgraph]
    decompositions = iterate decompose
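
As a rough illustration of how quickly the decompositions grow, the following sketch tracks only the numbers of half-darts and half-kites rather than a real Tgraph. It assumes the usual kite and dart subdivision (a half-dart splits into one half-dart and one half-kite, a half-kite into one half-dart and two half-kites). The names Counts, decomposeCounts, countsList, scaleAfter and foolCounts are made up for this example and are not part of PenroseKiteDart.

```haskell
-- | Numbers of (half-darts, half-kites) in a patch.
type Counts = (Integer, Integer)

-- | One decomposition step on counts only:
-- every half-dart yields 1 half-dart and 1 half-kite,
-- every half-kite yields 1 half-dart and 2 half-kites.
decomposeCounts :: Counts -> Counts
decomposeCounts (d, k) = (d + k, d + 2 * k)

-- | Counts after 0, 1, 2, ... decompositions (an infinite list,
-- built with iterate in the same way as decompositions).
countsList :: Counts -> [Counts]
countsList = iterate decomposeCounts

-- | The golden ratio.
phi :: Double
phi = (1 + sqrt 5) / 2

-- | Relative drawing scale after n decompositions (n >= 0):
-- (1/phi)^n, computed as (phi - 1)^n.
scaleAfter :: Int -> Double
scaleAfter n = (phi - 1) ^ n

-- | The fool has 2 half-darts and 4 half-kites.
foolCounts :: Counts
foolCounts = (2, 4)
```

Under these assumptions, take 3 (countsList foolCounts) gives [(2,4),(6,10),(16,26)], so two decompositions of the fool already contain 42 half-tiles. Because the list is lazy, only the decompositions that are actually inspected get computed.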

Force

Force works by adding any TileFaces on the boundary edges of a Tgraph which are forced, that is, where there is only one legal choice of TileFace addition consistent with the seven possible vertex types. Such additions are continued until either (i) there are no more forced cases, in which case a final (forced) Tgraph is returned, or (ii) the process finds the tiling is stuck, in which case an error is raised indicating an incorrect tiling. [In the latter case, the argument to force must have been an incorrect tiling, because the forced additions cannot produce an incorrect tiling starting from a correct tiling.]

An example is shown in figure 6. When forced, the Tgraph on the left produces the result on the right. The original is highlighted in red in the result to show what has been added.

Figure 6: A Tgraph (left) and its forced result (right) with the original shown red

Compose

Composition (also called inflation) is an opposite to decompose but this has complications for finite tilings, so it is not simply an inverse. (See Graphs, Kites and Darts and Theorems for more discussion of the problems.) Figure 7 shows a Tgraph (left) with the result of composing (right) where we have also shown (in pale green) the faces of the original that are not included in the composition – the remainder faces.

Figure 7: A Tgraph (left) and its (part) composed result (right) with the remainder faces shown pale green

Under some circumstances composing can fail to produce a Tgraph because there are crossing boundaries in the resulting TileFaces. However, we have established that

  • If g is a forced Tgraph, then compose g is defined and it is also a forced Tgraph.

Try Results

It is convenient to use types of the form Try a for results where we know there can be a failure. For example, compose can fail if the result does not pass the connected and no crossing boundary check, and force can fail if its argument is an incorrect Tgraph. In situations when you would like to continue some computation rather than raise an error when there is a failure, use a try version of a function.

    tryCompose :: Tgraph -> Try Tgraph
    tryForce   :: Tgraph -> Try Tgraph

We define Try as a synonym for Either String (which is a monad) in module Tgraph.Try.

    type Try a = Either String a

Successful results have the form Right r (for some correct result r) and failure results have the form Left s (where s is a String describing the problem as a failure report).

The function

    runTry:: Try a -> a
    runTry = either error id

will retrieve a correct result but raise an error for failure cases. This means we can always derive an error raising version from a try version of a function by composing with runTry.

    force = runTry . tryForce
    compose = runTry . tryCompose
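
Because Try is just Either String, try versions of functions can be chained with the usual monadic operators, and a failure report from any step is passed through unchanged. The sketch below shows the pattern on plain integers so that it is self-contained. Only Try and runTry are taken from the text above; halve, halveTwice and halveTwiceOr are hypothetical stand-ins for functions such as tryForce and tryCompose.

```haskell
type Try a = Either String a

runTry :: Try a -> a
runTry = either error id

-- | A hypothetical operation that can fail: halve an even number.
halve :: Int -> Try Int
halve n
  | even n    = Right (n `div` 2)
  | otherwise = Left ("halve: " ++ show n ++ " is odd")

-- | Chain two fallible steps; the first failure (if any) is the result.
halveTwice :: Int -> Try Int
halveTwice n = halve n >>= halve

-- | Continue with a default value instead of raising an error.
halveTwiceOr :: Int -> Int -> Int
halveTwiceOr def = either (const def) id . halveTwice
```

Here halveTwice 12 is Right 3, while halveTwice 6 is Left "halve: 3 is odd". With the package, the same shape should apply to a chain such as tryForce g >>= tryCompose, since both functions have type Tgraph -> Try Tgraph.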

Elementary Tgraph and TileFace Operations

The module Tgraph.Prelude defines elementary operations on Tgraphs relating vertices, directed edges, and faces. We describe a few of them here.

When we need to refer to particular vertices of a TileFace we use

    originV :: TileFace -> Vertex -- the first vertex - red dot in figure 3
    oppV    :: TileFace -> Vertex -- the vertex at the opposite end of the join edge from the origin
    wingV   :: TileFace -> Vertex -- the vertex not on the join edge

A directed edge is represented as a pair of vertices.

    type Dedge = (Vertex,Vertex)

So (a,b) is regarded as a directed edge from a to b. In the special case that a list of directed edges is symmetrically closed [(b,a) is in the list whenever (a,b) is in the list] we can think of this as an edge list rather than just a directed edge list.

For example,

    internalEdges :: Tgraph -> [Dedge]

produces an edge list, whereas

    graphBoundary :: Tgraph -> [Dedge]

produces single directions. Each directed edge in the resulting boundary will have a TileFace on the left and an exterior region on the right. The function

    graphDedges :: Tgraph -> [Dedge]

produces all the directed edges obtained by going clockwise round each TileFace so not every edge in the list has an inverse in the list.

The above three functions are defined using

    faceDedges :: TileFace -> [Dedge]

which produces a list of the three directed edges going clockwise round a TileFace starting at the origin vertex.

When we need to refer to particular edges of a TileFace we use

    joinE  :: TileFace -> Dedge  -- shown dashed in figure 3
    shortE :: TileFace -> Dedge  -- the non-join short edge
    longE  :: TileFace -> Dedge  -- the non-join long edge

which are all directed clockwise round the TileFace. In contrast, joinOfTile is always directed away from the origin vertex, so is not clockwise for right darts or for left kites:

    joinOfTile:: TileFace -> Dedge
    joinOfTile face = (originV face, oppV face)
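
These conventions can be checked on the fool without the package. The following is a self-contained sketch, not the code in Tgraph.Prelude: it re-declares HalfTile locally and works on a plain list of TileFaces, so none of the makeTgraph checks are performed. The helper names faceVs, foolFaces, vertexList, allDedges and boundaryDedges are made up for the example. The vertex selectors are written to agree with the description above: vertices are listed clockwise from the origin, and joinOfTile is not clockwise for right darts and left kites.

```haskell
import Data.List (nub, sort)

data HalfTile rep = LD rep | RD rep | LK rep | RK rep
  deriving (Show, Eq)

type Vertex   = Int
type TileFace = HalfTile (Vertex, Vertex, Vertex)
type Dedge    = (Vertex, Vertex)

-- | The vertex triple of a face, whatever its label.
faceVs :: TileFace -> (Vertex, Vertex, Vertex)
faceVs (LD vs) = vs
faceVs (RD vs) = vs
faceVs (LK vs) = vs
faceVs (RK vs) = vs

originV, oppV, wingV :: TileFace -> Vertex
originV face = let (a, _, _) = faceVs face in a
-- The join edge leaves the origin clockwise for LD and RK,
-- and anticlockwise for RD and LK.
oppV (LD (_, b, _)) = b
oppV (RD (_, _, c)) = c
oppV (LK (_, _, c)) = c
oppV (RK (_, b, _)) = b
wingV (LD (_, _, c)) = c
wingV (RD (_, b, _)) = b
wingV (LK (_, b, _)) = b
wingV (RK (_, _, c)) = c

-- | The three directed edges clockwise round a face, starting at the origin.
faceDedges :: TileFace -> [Dedge]
faceDedges face = [(a, b), (b, c), (c, a)]
  where (a, b, c) = faceVs face

joinOfTile :: TileFace -> Dedge
joinOfTile face = (originV face, oppV face)

-- | The faces of the fool, as listed earlier.
foolFaces :: [TileFace]
foolFaces = [ RD (1,2,3), LD (1,3,4), LK (5,3,2)
            , RK (5,2,7), RK (5,4,3), LK (5,6,4) ]

-- | All vertices used by a list of faces, sorted and without duplicates.
vertexList :: [TileFace] -> [Vertex]
vertexList = sort . nub . concatMap (\f -> [originV f, oppV f, wingV f])

-- | All directed edges, clockwise round every face.
allDedges :: [TileFace] -> [Dedge]
allDedges = concatMap faceDedges

-- | Boundary edges: the reverses of face edges whose reverse is not
-- itself a face edge, so each has a face on its left.
boundaryDedges :: [TileFace] -> [Dedge]
boundaryDedges fcs = [ (b, a) | (a, b) <- es, (b, a) `notElem` es ]
  where es = allDedges fcs
```

In this sketch, sort (boundaryDedges foolFaces) evaluates to [(1,4),(2,1),(4,6),(5,7),(6,5),(7,2)], so the fool has six boundary edges and vertex 3 is its only interior vertex. Also, map joinOfTile foolFaces gives [(1,3),(1,3),(5,2),(5,2),(5,4),(5,4)], with each join appearing once for each half of a tile.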

Patches (Scaled and Positioned Tilings)

Behind the scenes, when a Tgraph is drawn, each TileFace is converted to a Piece. A Piece is another specialisation of HalfTile using a two dimensional vector to indicate the length and direction of the join edge of the half-tile (from the originV to the oppV), thus fixing its scale and orientation. The whole Tgraph then becomes a list of located Pieces called a Patch.

    type Piece = HalfTile (V2 Double)
    type Patch = [Located Piece]

Piece drawing functions derive vectors for other edges of a half-tile piece from its join edge vector. In particular (in the TileLib module) we have

    drawPiece :: Piece -> Diagram B
    dashjPiece :: Piece -> Diagram B
    fillPieceDK :: Colour Double -> Colour Double -> Piece -> Diagram B

where the first draws the non-join edges of a Piece, the second does the same but adds a dashed line for the join edge, and the third takes two colours – one for darts and one for kites, which are used to fill the piece as well as using drawPiece.

Patch is an instance of class Transformable so a Patch can be scaled, rotated, and translated.

Vertex Patches

It is useful to have an intermediate form between Tgraphs and Patches, that contains information about both the location of vertices (as 2D points), and the abstract TileFaces. This allows us to introduce labelled drawing functions (to show the vertex labels) which we then extend to Tgraphs. We call the intermediate form a VPatch (short for Vertex Patch).

    type VertexLocMap = IntMap.IntMap (Point V2 Double)
    data VPatch = VPatch {vLocs :: VertexLocMap,  vpFaces::[TileFace]} deriving Show

and

    makeVP :: Tgraph -> VPatch

calculates vertex locations using a default orientation and scale.

VPatch is made an instance of class Transformable so a VPatch can also be scaled and rotated.

One essential use of this intermediate form is to be able to draw a Tgraph with labels, rotated but without the labels themselves being rotated. We can simply convert the Tgraph to a VPatch, and rotate that before drawing with labels.

    labelled draw (rotate someAngle (makeVP g))

We can also align a VPatch using vertex labels.

    alignXaxis :: (Vertex, Vertex) -> VPatch -> VPatch 

So if g is a Tgraph with vertex labels a and b we can align it on the x-axis with a at the origin and b on the positive x-axis (after converting to a VPatch), instead of accepting the default orientation.

    labelled draw (alignXaxis (a,b) (makeVP g))
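
To make the effect of alignXaxis concrete, here is a sketch of the underlying geometry using plain coordinate pairs in place of the diagrams types. It is only an illustration under that assumption: Pt, LocMap and alignXaxisLocs are names invented for this example, and the real function works on a whole VPatch.

```haskell
import qualified Data.IntMap as IntMap

type Pt     = (Double, Double)
type LocMap = IntMap.IntMap Pt   -- stand-in for VertexLocMap

-- | Translate so vertex a is at the origin, then rotate so vertex b
-- lies on the positive x-axis. Returns Nothing if a or b has no location.
alignXaxisLocs :: (Int, Int) -> LocMap -> Maybe LocMap
alignXaxisLocs (a, b) locs = do
  (ax, ay) <- IntMap.lookup a locs
  (bx, by) <- IntMap.lookup b locs
  let theta = atan2 (by - ay) (bx - ax)        -- current direction from a to b
      c = cos theta
      s = sin theta
      move (x, y) =
        let dx = x - ax                        -- translate a to the origin
            dy = y - ay
        in  (dx * c + dy * s, dy * c - dx * s) -- rotate by -theta
  pure (IntMap.map move locs)
```

Only the locations change; the vertex labels (the keys of the map) are untouched, which matches the idea of transforming the VPatch first and adding labels when drawing.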

Another use of VPatches is to share the vertex location map when drawing only subsets of the faces (see Overlaid examples in the next section).

4. Drawing in More Detail

Class Drawable

There is a class Drawable with instances Tgraph, VPatch, Patch. When the token B is in scope standing for a fixed backend then we can assume

    draw   :: Drawable a => a -> Diagram B  -- draws non-join edges
    drawj  :: Drawable a => a -> Diagram B  -- as with draw but also draws dashed join edges
    fillDK :: Drawable a => Colour Double -> Colour Double -> a -> Diagram B -- fills with colours

where fillDK clr1 clr2 will fill darts with colour clr1 and kites with colour clr2 as well as drawing non-join edges.

These are the main drawing tools. However they are actually defined for any suitable backend b so have more general types.

(Update Sept 2024) As of version 1.1 of PenroseKiteDart, these will be

    draw ::   (Drawable a, OKBackend b) =>
              a -> Diagram b
    drawj ::  (Drawable a, OKBackend b) =>
              a -> Diagram b
    fillDK :: (Drawable a, OKBackend b) =>
              Colour Double -> Colour Double -> a -> Diagram b

where the class OKBackend is a check to ensure a backend is suitable for drawing 2D tilings with or without labels.

In these notes we will generally use the simpler description of types using B for a fixed chosen backend for the sake of clarity.

The drawing tools are each defined via the class function drawWith using Piece drawing functions.

    class Drawable a where
        drawWith :: (Piece -> Diagram B) -> a -> Diagram B
    
    draw = drawWith drawPiece
    drawj = drawWith dashjPiece
    fillDK clr1 clr2 = drawWith (fillPieceDK clr1 clr2)

To design a new drawing function, you only need to implement a function to draw a Piece, (let us call it newPieceDraw)

    newPieceDraw :: Piece -> Diagram B

This can then be elevated to draw any Drawable (including Tgraphs, VPatches, and Patches) by applying the Drawable class function drawWith:

    newDraw :: Drawable a => a -> Diagram B
    newDraw = drawWith newPieceDraw
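
The pattern of one class function from which every drawing function is derived can be seen in isolation in the following toy sketch. It is not the PenroseKiteDart code: the types Piece and Patch, the list instance, and the String "renderings" are stand-ins invented for illustration (the real functions produce Diagram values).

```haskell
-- Toy model of the drawWith pattern. All types here are hypothetical
-- stand-ins; the real library draws Diagram values, not Strings.
data Piece = Dart | Kite deriving (Show, Eq)

newtype Patch = Patch [Piece]

class Drawable a where
  -- Lift a function that draws a single piece to the whole structure.
  drawWith :: (Piece -> String) -> a -> String

instance Drawable Patch where
  drawWith pieceDraw (Patch ps) = concatMap pieceDraw ps

-- A list of drawables is drawn by drawing each element in turn.
instance Drawable a => Drawable [a] where
  drawWith pieceDraw = concatMap (drawWith pieceDraw)

-- Two piece-level drawing functions ...
plainPiece :: Piece -> String
plainPiece Dart = "d"
plainPiece Kite = "k"

loudPiece :: Piece -> String
loudPiece Dart = "D"
loudPiece Kite = "K"

-- ... each elevated to every Drawable instance with no extra work.
drawPlain, drawLoud :: Drawable a => a -> String
drawPlain = drawWith plainPiece
drawLoud  = drawWith loudPiece
```

Adding a new instance gives it every existing drawing function, and adding a new piece-level function gives a new drawing function for every instance.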

Class DrawableLabelled

Class DrawableLabelled is defined with instances Tgraph and VPatch, but Patch is not an instance (because this does not retain vertex label information).

    class DrawableLabelled a where
        labelColourSize :: Colour Double -> Measure Double -> (Patch -> Diagram B) -> a -> Diagram B

So labelColourSize c m modifies a Patch drawing function to add labels (of colour c and size measure m). Measure is defined in Diagrams.Prelude with pre-defined measures tiny, verySmall, small, normal, large, veryLarge, huge. For most of our diagrams of Tgraphs, we use red labels and we also find small is a good default size choice, so we define

    labelSize :: DrawableLabelled a => Measure Double -> (Patch -> Diagram B) -> a -> Diagram B
    labelSize = labelColourSize red

    labelled :: DrawableLabelled a => (Patch -> Diagram B) -> a -> Diagram B
    labelled = labelSize small

and then labelled draw, labelled drawj, labelled (fillDK clr1 clr2) can all be used on both Tgraphs and VPatches as well as (for example) labelSize tiny draw, or labelColourSize blue normal drawj.

Further drawing functions

There are a few extra drawing functions built on top of the above ones. The function smart is a modifier to add dashed join edges only when they occur on the boundary of a Tgraph

    smart :: (VPatch -> Diagram B) -> Tgraph -> Diagram B

So smart vpdraw g will draw dashed join edges on the boundary of g before applying the drawing function vpdraw to the VPatch for g. For example the following all draw dashed join edges only on the boundary for a Tgraph g

    smart draw g
    smart (labelled draw) g
    smart (labelSize normal draw) g

When using labels, the function rotateBefore allows a Tgraph to be drawn rotated without rotating the labels.

    rotateBefore :: (VPatch -> a) -> Angle Double -> Tgraph -> a
    rotateBefore vpdraw angle = vpdraw . rotate angle . makeVP

So for example,

    rotateBefore (labelled draw) (90@@deg) g

makes sense for a Tgraph g. Of course if there are no labels we can simply use

    rotate (90@@deg) (draw g)

Similarly alignBefore allows a Tgraph to be aligned on the X-axis using a pair of vertex numbers before drawing.

    alignBefore :: (VPatch -> a) -> (Vertex,Vertex) -> Tgraph -> a
    alignBefore vpdraw (a,b) = vpdraw . alignXaxis (a,b) . makeVP

So, for example, if Tgraph g has vertices a and b, both

    alignBefore draw (a,b) g
    alignBefore (labelled draw) (a,b) g

make sense. Note that the following examples are wrong. Even though they type check, they re-orient g without repositioning the boundary joins.

    smart (labelled draw . rotate angle) g      -- WRONG
    smart (labelled draw . alignXaxis (a,b)) g  -- WRONG

Instead use

    smartRotateBefore (labelled draw) angle g
    smartAlignBefore (labelled draw) (a,b) g

where

    smartRotateBefore :: (VPatch -> Diagram B) -> Angle Double -> Tgraph -> Diagram B
    smartAlignBefore  :: (VPatch -> Diagram B) -> (Vertex,Vertex) -> Tgraph -> Diagram B

are defined using

    restrictSmart :: Tgraph -> (VPatch -> Diagram B) -> VPatch -> Diagram B

Here, restrictSmart g vpdraw vp uses the given vp for drawing boundary joins and drawing faces of g (with vpdraw) rather than converting g to a new VPatch. This assumes vp has locations for vertices in g.

Overlaid examples (location map sharing)

The function

    drawForce :: Tgraph -> Diagram B

will (smart) draw a Tgraph g in red overlaid (using <>) on the result of force g as in figure 6. Similarly

    drawPCompose  :: Tgraph -> Diagram B

applied to a Tgraph g will draw the result of a partial composition of g as in figure 7. That is a drawing of compose g but overlaid with a drawing of the remainder faces of g shown in pale green.

Both these functions make use of sharing a vertex location map to get correct alignments of overlaid diagrams. In the case of drawForce g, we know that a VPatch for force g will contain all the vertex locations for g since force only adds to a Tgraph (when it succeeds). So when constructing the diagram for g we can use the VPatch created for force g instead of starting afresh. Similarly for drawPCompose g the VPatch for g contains locations for all the vertices of compose g so compose g is drawn using the VPatch for g instead of starting afresh.

The location map sharing is done with

    subVP :: VPatch -> [TileFace] -> VPatch

so that subVP vp fcs is a VPatch with the same vertex locations as vp, but replacing the faces of vp with fcs. [Of course, this can go wrong if the new faces have vertices not in the domain of the vertex location map so this needs to be used with care. Any errors would only be discovered when a diagram is created.]

For cases where labels are only going to be drawn for certain faces, we need a version of subVP which also gets rid of vertex locations that are not relevant to the faces. For this situation we have

    restrictVP:: VPatch -> [TileFace] -> VPatch

which filters out un-needed vertex locations from the vertex location map. Unlike subVP, restrictVP checks for missing vertex locations, so restrictVP vp fcs raises an error if a vertex in fcs is missing from the keys of the vertex location map of vp.
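
The difference between the two can be sketched with a toy model. Everything below is a simplification made up for illustration: a face is just a triple of vertex numbers, VP stands in for VPatch, and the checked version returns an Either instead of raising an error as the real restrictVP does.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical simplified types: a face is just its three vertex numbers,
-- and a location is a pair of coordinates.
type Vertex = Int
type Face   = (Vertex, Vertex, Vertex)
type Point  = (Double, Double)

data VP = VP { locations :: Map.Map Vertex Point, faces :: [Face] }

faceVertices :: Face -> [Vertex]
faceVertices (a, b, c) = [a, b, c]

-- Like subVP: keep the whole location map, swap in the new faces, no checks.
subVP' :: VP -> [Face] -> VP
subVP' vp fcs = vp { faces = fcs }

-- Like restrictVP: keep only the locations the new faces need,
-- reporting any vertex that has no location.
restrictVP' :: VP -> [Face] -> Either String VP
restrictVP' vp fcs =
  case filter (`Map.notMember` locations vp) vs of
    []      -> Right (VP (Map.filterWithKey (\v _ -> v `elem` vs) (locations vp)) fcs)
    missing -> Left ("no location for vertices " ++ show missing)
  where
    vs = concatMap faceVertices fcs
```

In this model the unchecked version would only reveal a missing vertex when something later looks up its location, which mirrors the remark that errors are only discovered when a diagram is created.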

5. Forcing in More Detail

The force rules

The rules used by our force algorithm are local and derived from the fact that there are seven possible vertex types as depicted in figure 8.

Figure 8: Seven vertex types

Our rules are shown in figure 9 (omitting mirror symmetric versions). In each case the TileFace shown yellow needs to be added in the presence of the other TileFaces shown.

Figure 9: Rules for forcing

Main Forcing Operations

To make forcing efficient we convert a Tgraph to a BoundaryState to keep track of boundary information of the Tgraph, and then calculate a ForceState which combines the BoundaryState with a record of awaiting boundary edge updates (an update map). Then each face addition is carried out on a ForceState, converting back when all the face additions are complete. It makes sense to apply force (and related functions) to a Tgraph, a BoundaryState, or a ForceState, so we define a class Forcible with instances Tgraph, BoundaryState, and ForceState.

This allows us to define

    force :: Forcible a => a -> a
    tryForce :: Forcible a => a -> Try a

The first will raise an error if a stuck tiling is encountered. The second uses a Try result which produces a Left string for failures and a Right a for successful result a.

There are several other operations related to forcing including

    stepForce :: Forcible a => Int -> a -> a
    tryStepForce  :: Forcible a => Int -> a -> Try a

    addHalfDart, addHalfKite :: Forcible a => Dedge -> a -> a
    tryAddHalfDart, tryAddHalfKite :: Forcible a => Dedge -> a -> Try a

The first two force (up to) a given number of steps (=face additions) and the other four add a half dart/kite on a given boundary edge.

Update Generators

An update generator is used to calculate which boundary edges can have a certain update. There is an update generator for each force rule, but also a combined (all update) generator. The force operations mentioned above all use the default all update generator (defaultAllUGen) but there are more general (with) versions that can be passed an update generator of choice. For example

    forceWith :: Forcible a => UpdateGenerator -> a -> a
    tryForceWith :: Forcible a => UpdateGenerator -> a -> Try a

In fact we defined

    force = forceWith defaultAllUGen
    tryForce = tryForceWith defaultAllUGen

We can also define

    wholeTiles :: Forcible a => a -> a
    wholeTiles = forceWith wholeTileUpdates

where wholeTileUpdates is an update generator that just finds boundary join edges to complete whole tiles.

In addition to defaultAllUGen there is also allUGenerator which does the same thing apart from how failures are reported. The reason for keeping both is that they were constructed differently and so are useful for testing.

In fact UpdateGenerators are functions that take a BoundaryState and a focus (list of boundary directed edges) to produce an update map. Each Update is calculated as either a SafeUpdate (where two of the new face edges are on the existing boundary and no new vertex is needed) or an UnsafeUpdate (where only one edge of the new face is on the boundary and a new vertex needs to be created for a new face).

    type UpdateGenerator = BoundaryState -> [Dedge] -> Try UpdateMap
    type UpdateMap = Map.Map Dedge Update
    data Update = SafeUpdate TileFace 
                | UnsafeUpdate (Vertex -> TileFace)

Completing (executing) an UnsafeUpdate requires a touching vertex check to ensure that the new vertex does not clash with an existing boundary vertex. Using an existing (touching) vertex would create a crossing boundary so such an update has to be blocked.

Forcible Class Operations

The Forcible class operations are higher order and designed to allow for easy additions of further generic operations. They take care of conversions between Tgraphs, BoundaryStates and ForceStates.

    class Forcible a where
      tryFSOpWith :: UpdateGenerator -> (ForceState -> Try ForceState) -> a -> Try a
      tryChangeBoundaryWith :: UpdateGenerator -> (BoundaryState -> Try BoundaryChange) -> a -> Try a
      tryInitFSWith :: UpdateGenerator -> a -> Try ForceState

For example, given an update generator ugen and any f:: ForceState -> Try ForceState , then f can be generalised to work on any Forcible using tryFSOpWith ugen f. This is used to define both tryForceWith and tryStepForceWith.

We also specialize tryFSOpWith to use the default update generator

    tryFSOp :: Forcible a => (ForceState -> Try ForceState) -> a -> Try a
    tryFSOp = tryFSOpWith defaultAllUGen

Similarly given an update generator ugen and any f:: BoundaryState -> Try BoundaryChange , then f can be generalised to work on any Forcible using tryChangeBoundaryWith ugen f. This is used to define tryAddHalfDart and tryAddHalfKite.

We also specialize tryChangeBoundaryWith to use the default update generator

    tryChangeBoundary :: Forcible a => (BoundaryState -> Try BoundaryChange) -> a -> Try a
    tryChangeBoundary = tryChangeBoundaryWith defaultAllUGen

Note that the type BoundaryChange contains a resulting BoundaryState, the single TileFace that has been added, a list of edges removed from the boundary (of the BoundaryState prior to the face addition), and a list of the (3 or 4) boundary edges affected around the change that require checking or re-checking for updates.

The class function tryInitFSWith will use an update generator to create an initial ForceState for any Forcible. If the Forcible is already a ForceState it will do nothing. Otherwise it will calculate updates for the whole boundary. We also have the special case

    tryInitFS :: Forcible a => a -> Try ForceState
    tryInitFS = tryInitFSWith defaultAllUGen

Efficient chains of forcing operations.

Note that (force . force) does the same as force, but we might want to chain other force related steps in a calculation.

For example, consider the following combination which, after decomposing a Tgraph, forces, then adds a half dart on a given boundary edge (d) and then forces again.

    combo :: Dedge -> Tgraph -> Tgraph
    combo d = force . addHalfDart d . force . decompose

Since decompose :: Tgraph -> Tgraph, the instances of force and addHalfDart d will have type Tgraph -> Tgraph, so each of these operations will begin and end with conversions between Tgraph and ForceState. We would do better to avoid these wasted intermediate conversions by working only with ForceStates, keeping only the necessary conversions at the beginning and end of the whole sequence.

This can be done using tryFSOp. To see this, let us first re-express the forcing sequence using the Try monad, so

    force . addHalfDart d . force

becomes

    tryForce <=< tryAddHalfDart d <=< tryForce

Note that (<=<) is the Kleisli composition operator, which replaces function composition for monads (defined in Control.Monad). (We could also have expressed this right-to-left sequence with a left-to-right version: tryForce >=> tryAddHalfDart d >=> tryForce.) The definition of combo becomes

    combo :: Dedge -> Tgraph -> Tgraph
    combo d = runTry . (tryForce <=< tryAddHalfDart d <=< tryForce) . decompose

This has no performance improvement, but now we can pass the sequence to tryFSOp to remove the unnecessary conversions between steps.

    combo :: Dedge -> Tgraph -> Tgraph
    combo d = runTry . tryFSOp (tryForce <=< tryAddHalfDart d <=< tryForce) . decompose

The sequence actually has type Forcible a => a -> Try a but when passed to tryFSOp it specialises to type ForceState -> Try ForceState. This ensures the sequence works on a ForceState and any conversions are confined to the beginning and end of the sequence, avoiding unnecessary intermediate conversions.
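
For readers less familiar with Kleisli composition, here is a minimal self-contained sketch. It assumes Try can be modelled as Either String (suggested by the Left/Right description earlier, though the library's own definition may differ), and tryHalve and tryAddOne are made-up stand-ins for fallible forcing steps.

```haskell
import Control.Monad ((<=<), (>=>))

-- Assumption: Try is modelled as Either String; the library's own
-- definition may differ.
type Try a = Either String a

-- Hypothetical stand-ins for fallible steps on some state (here an Int).
tryHalve :: Int -> Try Int
tryHalve n
  | even n    = Right (n `div` 2)
  | otherwise = Left ("cannot halve odd number " ++ show n)

tryAddOne :: Int -> Try Int
tryAddOne = Right . (+ 1)

-- Right-to-left, mirroring ordinary composition with (.)
rightToLeft :: Int -> Try Int
rightToLeft = tryHalve <=< tryAddOne <=< tryHalve

-- The same pipeline written left-to-right.
leftToRight :: Int -> Try Int
leftToRight = tryHalve >=> tryAddOne >=> tryHalve
```

Starting from 6 both versions give Right 2, while starting from 4 the final step fails and the whole pipeline returns that Left report: the first failure short-circuits the remaining steps.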

A limitation of forcing

To avoid creating touching vertices (or crossing boundaries) a BoundaryState keeps track of locations of boundary vertices. At around 35,000 face additions in a single force operation the calculated positions of boundary vertices can become too inaccurate to prevent touching vertex problems. In such cases it is better to use

    recalibratingForce :: Forcible a => a -> a
    tryRecalibratingForce :: Forcible a => a -> Try a

These work by recalculating all vertex positions at 20,000 step intervals to get more accurate boundary vertex positions. For example, 6 decompositions of the kingGraph has 2,906 faces. Applying force to this should result in 53,574 faces but will go wrong before it reaches that. This can be fixed by calculating either

    recalibratingForce (decompositions kingGraph !!6)

or using an extra force before the decompositions

    force (decompositions (force kingGraph) !!6)

In the latter case, the final force only needs to add 17,864 faces to the 35,710 produced by decompositions (force kingGraph) !!6.

6. Advanced Operations

Guided comparison of Tgraphs

Asking if two Tgraphs are equivalent (the same apart from choice of vertex numbers) is an NP-complete problem. However, we do have an efficient guided way of comparing Tgraphs. In the module Tgraph.Relabelling we have

    sameGraph :: (Tgraph,Dedge) -> (Tgraph,Dedge) -> Bool

The expression sameGraph (g1,d1) (g2,d2) asks if g2 can be relabelled to match g1 assuming that the directed edge d2 in g2 is identified with d1 in g1. Hence the comparison is guided by the assumption that d2 corresponds to d1.

It is implemented using

    tryRelabelToMatch :: (Tgraph,Dedge) -> (Tgraph,Dedge) -> Try Tgraph

where tryRelabelToMatch (g1,d1) (g2,d2) will either fail with a Left report if a mismatch is found when relabelling g2 to match g1 or will succeed with Right g3 where g3 is a relabelled version of g2. The successful result g3 will match g1 in a maximal tile-connected collection of faces containing the face with edge d1 and have vertices disjoint from those of g1 elsewhere. The comparison tries to grow a suitable relabelling by comparing faces one at a time starting from the face with edge d1 in g1 and the face with edge d2 in g2. (This relies on the fact that Tgraphs are connected with no crossing boundaries, and hence tile-connected.)

The above function is also used to implement

    tryFullUnion:: (Tgraph,Dedge) -> (Tgraph,Dedge) -> Try Tgraph

which tries to find the union of two Tgraphs guided by a directed edge identification. However, there is an extra complexity arising from the fact that Tgraphs might overlap in more than one tile-connected region. After calculating one overlapping region, the full union uses some geometry (calculating vertex locations) to detect further overlaps.

Finally we have

    commonFaces:: (Tgraph,Dedge) -> (Tgraph,Dedge) -> [TileFace]

which will find common regions of overlapping faces of two Tgraphs guided by a directed edge identification. The resulting common faces will be a sub-collection of faces from the first Tgraph. These are returned as a list as they may not be a connected collection of faces and therefore not necessarily a Tgraph.

Empires and SuperForce

In Empires and SuperForce we discussed forced boundary coverings which were used to implement both a superForce operation

    superForce:: Forcible a => a -> a

and operations to calculate empires.

We will not repeat the descriptions here other than to note that

    forcedBoundaryECovering:: Tgraph -> [Tgraph]

finds boundary edge coverings after forcing a Tgraph. That is, forcedBoundaryECovering g will first force g, then (if it succeeds) find a collection of (forced) extensions to force g such that

  • each extension has the whole boundary of force g as internal edges.
  • each possible addition to a boundary edge of force g (kite or dart) has been included in the collection.

(possible here means – not leading to a stuck Tgraph when forced.) There is also

    forcedBoundaryVCovering:: Tgraph -> [Tgraph]

which does the same except that the extensions have all boundary vertices internal rather than just the boundary edges.

Combinations

Combinations such as

    compForce:: Tgraph -> Tgraph      -- compose after forcing
    allCompForce:: Tgraph -> [Tgraph] -- iterated (compose after force) while not emptyTgraph
    maxCompForce:: Tgraph -> Tgraph   -- last item in allCompForce (or emptyTgraph)

make use of theorems established in Graphs,Kites and Darts and Theorems. For example

    compForce = uncheckedCompose . force 

which relies on the fact that composition of a forced Tgraph does not need to be checked for connectedness and no crossing boundaries. Similarly, only the initial force is necessary in allCompForce with subsequent iteration of uncheckedCompose because composition of a forced Tgraph is necessarily a forced Tgraph.

Tracked Tgraphs

The type

    data TrackedTgraph = TrackedTgraph
       { tgraph  :: Tgraph
       , tracked :: [[TileFace]] 
       } deriving Show

has proven useful in experimentation as well as in producing artwork with darts and kites. The idea is to keep a record of sub-collections of faces of a Tgraph when doing both force operations and decompositions. A list of the sub-collections forms the tracked list associated with the Tgraph. We make TrackedTgraph an instance of class Forcible by having force operations only affect the Tgraph and not the tracked list. The significant idea is the implementation of

    decomposeTracked :: TrackedTgraph -> TrackedTgraph

Decomposition of a Tgraph involves introducing a new vertex for each long edge and each kite join. These are then used to construct the decomposed faces. For decomposeTracked we do the same for the Tgraph, but when it comes to the tracked collections, we decompose them re-using the same new vertex numbers calculated for the edges in the Tgraph. This keeps a consistent numbering between the Tgraph and tracked faces, so each item in the tracked list remains a sub-collection of faces in the Tgraph.

The function

    drawTrackedTgraph :: [VPatch -> Diagram B] -> TrackedTgraph -> Diagram B

is used to draw a TrackedTgraph. It uses a list of functions to draw VPatches. The first drawing function is applied to a VPatch for any untracked faces. Subsequent functions are applied to VPatches for the tracked list in order. Each diagram is beneath later ones in the list, with the diagram for the untracked faces at the bottom. The VPatches used are all restrictions of a single VPatch for the Tgraph, so will be consistent in vertex locations. When labels are used, there are also drawTrackedTgraphRotated and drawTrackedTgraphAligned for rotating or aligning the VPatch prior to applying the drawing functions.

Note that the result of calculating empires (see Empires and SuperForce ) is represented as a TrackedTgraph. The result is actually the common faces of a forced boundary covering, but a particular element of the covering (the first one) is chosen as the background Tgraph with the common faces as a tracked sub-collection of faces. Hence we have

    empire1, empire2 :: Tgraph -> TrackedTgraph
    
    drawEmpire :: TrackedTgraph -> Diagram B

Figure 10 was also created using TrackedTgraphs.

Figure 10: Using a TrackedTgraph for drawing

7. Other Reading

Previous related blogs are:

  • Diagrams for Penrose Tiles – the first blog introduced drawing Pieces and Patches (without using Tgraphs) and provided a version of decomposing for Patches (decompPatch).
  • Graphs, Kites and Darts introduced Tgraphs. This gave more details of implementation and results of early explorations. (The class Forcible was introduced subsequently).
  • Empires and SuperForce – these new operations were based on observing properties of boundaries of forced Tgraphs.
  • Graphs,Kites and Darts and Theorems established some important results relating force, compose, decompose.

by readerunner at September 30, 2024 11:15 AM

September 27, 2024

Chris Smith 2

Playing With a Game

In a recent comment (that I sadly cannot find any longer) in https://www.reddit.com/r/math/, someone mentioned the following game. There are n players, and they each independently choose a natural number. The player with the lowest unique number wins the game. So if two people choose 1, a third chooses 2, and a fourth chooses 5, then the third player wins: the 1s were not unique, so 2 was the least among the unique numbers chosen. (Presumably, though this wasn’t specified in the comment, if there is no unique number among all players, then no one wins).

I got nerd-sniped, so I’ll share my investigation.

For me, since the solution to the general problem wasn’t obvious, it made sense to specialize. Let’s say there are n players, and just to make the game finite, let’s say that instead of choosing any natural number, you choose a number from 1 to m. Choosing very large numbers is surely a bad strategy anyway, so intuitively I expect any reasonably large choice of m to give very similar results.

n = 2

Let’s start with the case where n = 2. This one turns out to be easy: you should always pick 1, daring your opponent to pick 1, as well. We can induct on m to prove this. If m = 1, then you are required to pick 1 by the rules. But if m > 1, suppose you pick m. Either your opponent also picks m and you both lose, or your opponent picks a number smaller than m and you still lose. Clearly, this is a bad strategy, and you always do at least as well choosing one of the first m - 1 options instead. This reduces the game to one where we already know the best strategy is to pick 1.

That wasn’t very interesting, so let’s try more players.

n = 3, m = 2

Suppose there are three players, each choosing either 1 or 2. It’s impossible for all three players to choose a different number! If you do manage to pick a unique number, then, you will be the only player to do so, so it will always be the least unique number simply because it’s the only one!

If you don’t think your opponents will have figured this out, you might be tempted to pick 2, in hopes that your opponents go for 1 to try to get the least number, and you’ll be the only one choosing 2. But this makes you predictable, so the other players can try to take advantage. But if one of the other players reasons the same way, you both are guaranteed to lose! What we want here is a Nash equilibrium: a strategy for all players such that no single player can do better by deviating from that strategy.

It’s not hard to see that all players should flip a coin, choosing either 1 or 2 with equal probability. There’s a 25% chance each that a player picks the unique number and wins, and there’s a 25% chance that they all choose the same number and all lose. Regrettable, but anything you do to try to avoid that outcome just makes your play more predictable so that the other players could exploit that.

It’s interesting to look at the actual computation. When computing a Nash equilibrium, we generally rely on the indifference principle: a player should always be indifferent between any choice that they make at random, since otherwise, they would take the one with the better outcome and always play that instead.

This is a bit counter-intuitive! Naively, you might think that the optimal strategy is the one that gives the best expected result, but when a Nash equilibrium involves a random choice (known as a mixed strategy), any single player actually does equally well against other optimal players no matter which mix of those random choices they make! In this game, though, predictability is a weakness. Just as a poker player tries to avoid ‘tells’ that give away the strength of their hand, players in this number-choosing game need to be unpredictable. The reason for playing the Nash equilibrium isn’t that it gives the best expected result against optimal opponents, but rather that it can’t be exploited by an opponent.

Let’s apply this indifference principle. This game is completely symmetric (there’s no order of turns, and all players have the same choices and payoffs available), so an optimal strategy ought to be the same for any player. Then, let’s say p is the probability that any single player will choose 1. Then if you choose 1, you will win with probability (1 - p)², while if you choose 2, you’ll win with probability p². If you set these equal to each other as per the indifference principle, and solve the equation, you get p = 0.5, as we reasoned above.
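
As a quick sanity check, the two winning chances can be written out directly. This is only a sketch of the arithmetic for n = 3, m = 2; the function names are mine.

```haskell
-- Probability of winning with each choice when both opponents
-- independently pick 1 with probability p (n = 3, m = 2).
winWith1, winWith2 :: Double -> Double
winWith1 p = (1 - p) ^ (2 :: Int)  -- both opponents must pick 2
winWith2 p = p ^ (2 :: Int)        -- both opponents must pick 1

-- Advantage of choosing 1 over choosing 2; zero at the equilibrium.
advantage :: Double -> Double
advantage p = winWith1 p - winWith2 p

-- Probability that all three players pick the same number (no winner).
noWinner :: Double -> Double
noWinner p = p ^ (3 :: Int) + (1 - p) ^ (3 :: Int)
```

The advantage simplifies to 1 - 2p, which is positive when opponents favour 2, negative when they favour 1, and zero only at p = 0.5, where each choice wins 25% of the time and noWinner is also 25%.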

n = 3, m = 3

Things get more interesting if each player can choose 1, 2, or 3. Now it’s possible for each player to choose uniquely, so it starts to matter which unique number you pick. Let’s say each player chooses 1, 2, and 3 with the probabilities p, q, and r respectively. We can analyze the probability of winning with each choice.

  • If you pick 1, then you always win unless someone else also picks a 1. Your chance of winning, then, is (q + r)².
  • If you pick 2, then for you to win, either both other players need to pick 1 (eliminating each other because of uniqueness and leaving you to win by default), or both other players need to pick 3, so that you’ve picked the least number. Your chance of winning is p² + r².
  • If you pick 3, then you need both opponents to pick the same number as each other, different from yours: either 1 or 2. Your chance of winning is p² + q².

Setting these equal to each other immediately shows us that since p² + q² = p² + r², we must conclude that q = r. Then p² + q² = (q + r)² = 4q², so p² = 3q² = 3r². Together with p + q + r = 1, we can conclude that p = 2√3 - 3 ≈ 0.464, while q = r = 2 - √3 ≈ 0.268.
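
These values are easy to check numerically. The sketch below just plugs the claimed p, q and r into the three winning chances; the names win1, win2 and win3 are mine.

```haskell
-- Claimed equilibrium probabilities for n = 3, m = 3.
p, q, r :: Double
p = 2 * sqrt 3 - 3
q = 2 - sqrt 3
r = q

-- Winning chances for each choice against two opponents playing (p, q, r).
win1, win2, win3 :: Double
win1 = (q + r) ^ (2 :: Int)              -- neither opponent picks 1
win2 = p ^ (2 :: Int) + r ^ (2 :: Int)   -- both pick 1, or both pick 3
win3 = p ^ (2 :: Int) + q ^ (2 :: Int)   -- both pick 1, or both pick 2
```

All three come out at about 0.287, so no choice is better than another against opponents playing this mix, which is exactly what the indifference principle demands.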

This is our first really interesting result. Can we generalize?

n = 3, in general

The reasoning above generalizes well. If there are three players, and you pick a number k, you are betting that either the other two players will pick the same number less than k, or they will each pick numbers greater than k (regardless of whether they are the same one).

I’ll switch notation here for convenience. Let X be a random variable representing a choice by a player from the Nash equilibrium strategy. Then if you choose k, your probability of winning is P(X=1)² + … + P(X=k-1)² + P(X>k)². The indifference principle tells us that this should be equal for any choice of k. Equivalently, for any k from 1 to m - 1, the probability of winning when choosing k is the same as the probability when choosing k + 1. So:

  • P(X=1)² + … + P(X=k-1)² + P(X>k)² = P(X=1)² + … + P(X=k)² + P(X>k+1)²
  • Cancelling the common terms: P(X>k)² = P(X=k)² + P(X>k+1)²
  • Rearranging: P(X=k) = √(P(X≥k+1)² - P(X>k+1)²)

This gives us a recursive formula that we can use (in reverse) to compute P(X=k), if only we knew P(X=m) to get started. If we just pick something arbitrary, though, it turns out that all the results are just multiples of that choice. We can then divide by the sum of them all to normalize the probabilities to sum to 1.

Here I can write some code (in Haskell):

import Probability.Distribution (Distribution, categorical, probabilities)

nashEquilibriumTo :: Integer -> Distribution Double Integer
nashEquilibriumTo m = categorical (zip allPs [1 ..])
  where
    allPs = go m 1 0 []
    go 1 pEqual pGreater ps = (/ (pEqual + pGreater)) <$> (pEqual : ps)
    go k pEqual pGreater ps =
      let pGreaterEqual = pEqual + pGreater
       in go
            (k - 1)
            (sqrt (pGreaterEqual * pGreaterEqual - pGreater * pGreater))
            pGreaterEqual
            (pEqual : ps)

main :: IO ()
main = print (probabilities (nashEquilibriumTo 100))

I’ve used a probability library from https://github.com/cdsmith/prob that I wrote with Shae Erisson during a fun hacking session a few years ago. It doesn’t help yet, but we’ll play around with some of its further features below.

Trying a few large values for m confirms my suspicion that any reasonably large choice of m gives effectively the same result.

1 -> 0.4563109873079237
2 -> 0.24809127016999155
3 -> 0.1348844977362459
4 -> 7.333521940168612e-2
5 -> 3.987155303205954e-2
6 -> 2.1677725302500214e-2
7 -> 1.1785941067126387e-2

By inspection, this appears to be a geometric distribution, parameterized by the probability 0.4563109873079237. We can check that the distribution is geometric, which just means that for all k < m - 1, the ratio P(X > k) / P(X ≥ k) is the same as P(X > k + 1) / P(X ≥ k + 1). This is the defining property of a geometric distribution, and some simple algebra confirms that it holds in this case.
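
The geometric shape can also be checked numerically without the probability library. The sketch below re-implements the same backward recursion on plain lists (nashList and ratios are my names, not part of any library) and looks at the ratios of successive probabilities, which are constant for a geometric distribution.

```haskell
-- Library-free version of the backward recursion, returning
-- [P(X=1), ..., P(X=m)] for the three-player game.
nashList :: Int -> [Double]
nashList m = map (/ sum raw) raw
  where
    raw = go m 1 0 []
    -- k: value whose weight is pEq; pGt: total weight of values above k
    go :: Int -> Double -> Double -> [Double] -> [Double]
    go 1 pEq _   acc = pEq : acc
    go k pEq pGt acc =
      let pGe = pEq + pGt
      in go (k - 1) (sqrt (pGe * pGe - pGt * pGt)) pGe (pEq : acc)

-- Ratios P(X=k+1) / P(X=k); constant for a geometric distribution.
ratios :: [Double] -> [Double]
ratios ps = zipWith (/) (drop 1 ps) ps
```

For m = 100 the first several ratios agree with 1 - P(X=1) ≈ 0.5437. Near the top of the range the ratios drift (the last two probabilities are always equal, since P(X > m) = 0), so the geometric shape is best read as a description of the small values of k when m is large.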

But what is this bizarre number? A few Google queries get us to an answer of sorts. A 2002 Ph.D. dissertation by Joseph Myers seems to arrive at the same number in the solution to a question about graph theory, where it’s identified as the real root of the polynomial x³ - 4x² + 6x - 2. We can check that this is right for a geometric distribution. Starting with P(X=k) = √(P(X≥k+1)² - P(X>k+1)²) where k = 1, we get P(X=1) = √(P(X ≥ 2)² - P(X > 2)²). If P(X=1) = p, then P(X ≥ 2) = 1 - p, and P(X > 2) = (1 - p)², so we have p = √((1-p)² - ((1 - p)²)²), which indeed expands to p⁴ - 4p³ + 6p² - 2p = 0, so either p = 0 (which is impossible for a geometric distribution), or p³ - 4p² + 6p - 2 = 0, giving the probability seen above. (How and if this is connected to the graph theory question investigated in that dissertation, though, is certainly beyond my comprehension.)
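
The root itself is easy to find numerically. This is a plain bisection sketch (the helper names are mine); it works here because the cubic is -2 at 0, 1 at 1, and increasing in between.

```haskell
-- The cubic whose real root is the limiting probability of choosing 1.
cubic :: Double -> Double
cubic x = x ^ (3 :: Int) - 4 * x ^ (2 :: Int) + 6 * x - 2

-- Plain bisection on an interval where f changes sign.
bisect :: (Double -> Double) -> Double -> Double -> Double
bisect f lo hi
  | hi - lo < 1e-15 = mid
  | signum (f mid) == signum (f lo) = bisect f mid hi
  | otherwise = bisect f lo mid
  where
    mid = (lo + hi) / 2

root :: Double
root = bisect cubic 0 1
```

The result agrees with the 0.4563... value computed by the recursion.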

You may wonder, in these large limiting cases, how often it turns out that no one wins, or that we see wins with each number. Answering questions like this is why I chose to use my probability library. We can first define a function to implement the game’s basic rule:

import Data.List (group, sort)
import Data.Maybe (listToMaybe)

leastUnique :: (Ord a) => [a] -> Maybe a
leastUnique xs = listToMaybe [x | [x] <- group (sort xs)]

And then we can define the whole game using the strategy above for each player:

gameTo :: Integer -> Distribution Double (Maybe Integer)
gameTo m = do
  ns <- replicateM 3 (nashEquilibriumTo m)
  return (leastUnique ns)

Then we can update main to tell us the distribution of game outcomes, rather than plays:

main :: IO ()
main = print (probabilities (gameTo 100))

And get these probabilities:

Nothing -> 0.11320677243374572
Just 1 -> 0.40465349320873445
Just 2 -> 0.22000565820506113
Just 3 -> 0.11961465909617276
Just 4 -> 6.503317590749513e-2
Just 5 -> 3.535782320137907e-2
Just 6 -> 1.9223659987298684e-2
Just 7 -> 1.0451692718822408e-2

An 11% probability of no winner for large m is an improvement over the 25% we computed for m = 2. Once again, a least unique number greater than 7 has less than 1% probability, and the probabilities drop even more rapidly from there.
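
These outcome probabilities can be cross-checked by brute force on plain lists. The sketch below is an approximation under a stated assumption: it swaps the exact equilibrium for a geometric distribution truncated at m and renormalised, and truncatedGeometric and outcomeProbability are names I made up for the purpose.

```haskell
import Data.List (group, sort)
import Data.Maybe (listToMaybe)

leastUnique :: Ord a => [a] -> Maybe a
leastUnique xs = listToMaybe [x | [x] <- group (sort xs)]

-- Geometric weights truncated at m and renormalised; a stand-in for the
-- large-m equilibrium strategy (an approximation, not the exact solution).
truncatedGeometric :: Double -> Int -> [(Int, Double)]
truncatedGeometric p m = [(k, w / total) | (k, w) <- zip [1 ..] ws]
  where
    ws = [p * (1 - p) ^ (k - 1) | k <- [1 .. m]]
    total = sum ws

-- Probability of a given outcome when three players draw independently.
outcomeProbability :: [(Int, Double)] -> Maybe Int -> Double
outcomeProbability dist outcome =
  sum [ pa * pb * pc
      | (a, pa) <- dist, (b, pb) <- dist, (c, pc) <- dist
      , leastUnique [a, b, c] == outcome ]
```

With p = 0.4563109873079237 and m = 20, outcomeProbability gives roughly 0.113 for Nothing and 0.405 for Just 1, in line with the figures above. With three players there is no winner exactly when all three choose the same number, so the first value is also the sum of the cubes of the individual probabilities.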

More than three players?

With an arbitrary number of players, the expressions for the probability of winning grow rather more involved, since you must consider the possibility that some other players have chosen numbers greater than yours, while others have chosen smaller numbers that are duplicated, possibly in twos or in threes.

For the four-player case, this isn’t too bad. The three winning possibilities are:

  • All three other players choose the same smaller number. This has probability P(X=1)³ + … + P(X=k-1)³
  • All three other players choose larger numbers, though not necessarily the same one. This has probability P(X > k)³
  • Two of the three other players choose the same smaller number, and the third chooses a larger number. This has probability 3 P(X > k) (P(X=1)² + … + P(X=k-1)²)
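The three cases can be checked against a brute-force enumeration. The sketch below assumes a strategy is given as a plain list pmf with pmf !! (j - 1) = P(X = j); winClosed, winBrute and examplePmf are names invented for this illustration, and the example strategy is arbitrary rather than an equilibrium.

```haskell
import Data.List (group, sort)
import Data.Maybe (listToMaybe)

leastUnique :: (Ord a) => [a] -> Maybe a
leastUnique xs = listToMaybe [x | [x] <- group (sort xs)]

-- Closed form: chance that choosing k wins against three opponents who
-- each draw independently from pmf. Sums the three cases listed above.
winClosed :: [Double] -> Int -> Double
winClosed pmf k =
  sum (map (^ 3) below) + above ^ 3 + 3 * above * sum (map (^ 2) below)
  where
    below = take (k - 1) pmf  -- P(X = 1) .. P(X = k - 1)
    above = sum (drop k pmf)  -- P(X > k)

-- Brute force: enumerate every triple of opponent choices.
winBrute :: [Double] -> Int -> Double
winBrute pmf k =
  sum
    [ pa * pb * pc
    | (a, pa) <- choices
    , (b, pb) <- choices
    , (c, pc) <- choices
    , leastUnique [k, a, b, c] == Just k
    ]
  where
    choices = zip [1 ..] pmf

-- An arbitrary strategy over the numbers 1 .. 4, for testing only.
examplePmf :: [Double]
examplePmf = [0.4, 0.3, 0.2, 0.1]
```

For examplePmf the two functions agree up to rounding for every k from 1 to 4, which suggests the case analysis has not missed a way to win.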

You could possibly work out how to compute this one without too much difficulty. The algebra gets harder, though, and I dug deep enough to determine that the Nash equilibrium is no longer a geometric distribution. If you assume the Nash equilibrium is geometric, then numerically, the probability of choosing 1 that gives 1 and 2 equal rewards would need to be about 0.350788, but this choice gives too small a reward for choosing 3 or more, implying they ought to be chosen less often.

For larger n, even stating the equations turns into a nontrivial problem of accurately counting the possible ways to win. I’d certainly be interested if there’s a nice-looking result here, but I do not yet know what it is.

Numerical solutions

We can solve this numerically, though. Using the probability library mentioned above, one can easily compute, for any finite game and any strategy (as a probability distribution of moves) the expected benefit for each choice.

expectedOutcomesTo :: Int -> Int -> Distribution Double Int -> [Double]
expectedOutcomesTo n m dist =
  [ probability (== Just i) $ leastUnique . (i :) <$> replicateM (n - 1) dist
  | i <- [1 .. m]
  ]

We can then iteratively adjust the probability of each choice slightly based on how its expected outcome compares to other expected outcomes in the distribution. It turns out to be good enough to compare with an immediate neighbor. Just so that all of our distributions remain valid, instead of working with the global probabilities P(X=k), we’ll do the computation with conditional probabilities P(X = k | X ≥ k), so that any sequence of probabilities is valid, without worrying about whether they sum to 1. Given this list of conditional probabilities, we can produce a probability distribution like this.

distFromConditionalStrategy :: [Double] -> Distribution Double Int
distFromConditionalStrategy = go 1
  where
    go i [] = pure i
    go i (q : qs) = do
      choice <- bernoulli q
      if choice then pure i else go (i + 1) qs
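To see why any list of conditional probabilities yields a valid distribution, here is a minimal list-based sketch of the same conversion. It does not use the probability library; globalFromConditional is a hypothetical helper that returns P(X = 1) through P(X = m) directly.

```haskell
-- Turn conditional probabilities q_k = P(X = k | X >= k) for k = 1 .. m - 1
-- into global probabilities P(X = k) for k = 1 .. m.
globalFromConditional :: [Double] -> [Double]
globalFromConditional = go 1
  where
    -- reach is P(X >= k), the chance that no earlier number was picked.
    go reach [] = [reach]  -- the last number takes whatever is left
    go reach (q : qs) = reach * q : go (reach * (1 - q)) qs
```

For instance, globalFromConditional [0.5, 0.5, 0.5] evaluates to [0.5, 0.25, 0.125, 0.125]. The leftover mass always goes to the final choice, so the result sums to 1 whenever every q lies between 0 and 1.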

Then we can optimize numerically, using the difference of each choice’s win probability from its neighbor as a diff to add to the conditional probability of that choice.

refine :: Int -> Int -> [Double] -> Distribution Double Int
refine n iters strategy
  | iters == 0 = equilibrium
  | otherwise =
      let ps = expectedOutcomesTo n m equilibrium
          delta = zipWith subtract (drop 1 ps) ps
          adjs = zipWith (+) strategy delta
       in refine n (iters - 1) adjs
  where
    m = length strategy + 1
    equilibrium = distFromConditionalStrategy strategy

It works well enough to run this for 10,000 iterations at n = 4, m = 10.

main :: IO ()
main = do
  let n = 4
      m = 10
      d = refine n 10000 (replicate (m - 1) 0.3)
  print $ probabilities d
  print $ expectedOutcomesTo n m d

The resulting probability distribution is, to me, at least, quite surprising! I would have expected that more players would incentivize you to choose a higher number, since the additional players make collisions on low numbers more likely. But it seems the opposite is true. While three players at least occasionally (with 1% or more probability) should choose numbers up to 7, four players should apparently stop at 3.

Nash equilibrium strategy for n = 4, m = 10

Huh. I’m not sure why this is true, but I’ve checked the computation in a few ways, and it seems to be a real phenomenon. Please leave a comment if you have a better intuition for why it ought to be so!

With five players, at least, we see some larger numbers again in the Nash equilibrium, lending support to the idea that there was something unusual going on with the four player case. Here’s the strategy for five players:

Nash equilibrium strategy for n = 5, m = 10

The six player variant retracts the distribution a little, reducing the probabilities of choosing 5 or 6, but then 7 players expands the choices a bit, and it’s starting to become a pattern that even numbers of players lend themselves to a tighter style of play, while odd numbers open up the strategy.

Nash equilibrium strategy for n = 6, m = 10
Nash equilibrium strategy for n = 7, m = 10
Nash equilibrium strategy for n = 8, m = 10

In general, it looks like this is converging to something. The computations are also getting progressively slower, so let’s stop there.

Game variants

There is plenty of room for variation in the game, which would change the analysis. If you’re looking for a variant to explore on your own, in addition to expanding the game to more players, you might try these:

  • What if a tie awards each player an equal fraction of the reward for a full win, instead of nothing at all? (This actually simplifies the analysis a bit!)
  • What if, instead of all wins being equal, we found the least unique number, and paid that player an amount equal to the number itself? Now there’s somewhat less of an incentive for players to choose small numbers, since a larger number gives a large payoff! This gives the problem something like a prisoner’s dilemma flavor, where players could coordinate to make more money, but leave themselves open to being undercut by someone willing to make a small profit by betraying the coordinated strategy.

What other variants might be interesting?

Addendum (Sep 26): Making it faster

As is often the case, the naive code I originally wrote can be significantly improved. In this case, the code was evaluating probabilities by enumerating all the ways players might choose numbers, and then computing the winner for each one. For large values of m and n this is a lot, and it grows exponentially.

There’s a better way. We don’t need to remember each individual choice to determine the outcome of the game in the presence of further choices. Instead, we need only determine which numbers have been chosen once, and which have been chosen more than once.

data GameState = GameState
  { dups :: Set Int,
    uniqs :: Set Int
  }
  deriving (Eq, Ord)

To add a new choice to a GameState requires checking whether it’s one of the existing unique or duplicate choices:

addToState :: Int -> GameState -> GameState
addToState n gs@(GameState dups uniqs)
  | Set.member n dups = gs
  | Set.member n uniqs = GameState (Set.insert n dups) (Set.delete n uniqs)
  | otherwise = GameState dups (Set.insert n uniqs)

We can now directly compute the distribution of GameState corresponding to a set of n players playing moves with a given distribution. The use of simplify from the probability library here is crucial: it combines all the different paths that lead to the same outcome into a single case, avoiding the exponential explosion.

stateDist :: Int -> Distribution Double Int -> Distribution Double GameState
stateDist n moves = go n (pure (GameState mempty mempty))
  where
    go 0 states = states
    go i states = go (i - 1) (simplify $ addToState <$> moves <*> states)
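The effect of merging equal outcomes after every step can be illustrated without the probability library. The sketch below models a distribution as a plain list of weighted outcomes; Dist, collapse, step and runDraws are invented for this illustration and say nothing about how the library's simplify is actually implemented.

```haskell
import qualified Data.Map.Strict as Map

-- A toy finite distribution: outcomes paired with probabilities.
type Dist a = [(a, Double)]

-- Merge duplicate outcomes by adding their probabilities.
collapse :: (Ord a) => Dist a -> Dist a
collapse = Map.toList . Map.fromListWith (+)

-- Combine one more independent draw into an accumulated state.
step :: (Ord s) => (a -> s -> s) -> Dist a -> Dist s -> Dist s
step f moves states =
  collapse [(f x s, p * q) | (x, p) <- moves, (s, q) <- states]

-- Fold n independent draws into a start state, collapsing after each draw.
runDraws :: (Ord s) => Int -> (a -> s -> s) -> s -> Dist a -> Dist s
runDraws n f start moves = iterate (step f moves) [(start, 1)] !! n
```

As a small check, runDraws 3 max 0 [(1, 0.5), (2, 0.5)] evaluates to [(1, 0.125), (2, 0.875)], the distribution of the largest of three fair draws from 1 and 2. The list of states never grows beyond the number of distinct summaries, rather than doubling with every draw.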

Now it remains to determine whether a certain move can win, given the game state resulting from the remaining moves.

win :: Int -> GameState -> Bool
win n (GameState dups uniqs) =
  not (Set.member n dups) && maybe True (> n) (Set.lookupMin uniqs)
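One way to gain confidence in this summary is to compare it with the direct rule on every small game. The sketch below repeats the definitions so that it stands alone (with the pattern variables renamed to ds and us to avoid shadowing the field names); winsVia and agrees are hypothetical helpers added for this check.

```haskell
import Control.Monad (replicateM)
import Data.List (group, sort)
import Data.Maybe (listToMaybe)
import Data.Set (Set)
import qualified Data.Set as Set

leastUnique :: (Ord a) => [a] -> Maybe a
leastUnique xs = listToMaybe [x | [x] <- group (sort xs)]

data GameState = GameState
  { dups :: Set Int,
    uniqs :: Set Int
  }
  deriving (Eq, Ord, Show)

addToState :: Int -> GameState -> GameState
addToState n gs@(GameState ds us)
  | Set.member n ds = gs
  | Set.member n us = GameState (Set.insert n ds) (Set.delete n us)
  | otherwise = GameState ds (Set.insert n us)

win :: Int -> GameState -> Bool
win n (GameState ds us) =
  not (Set.member n ds) && maybe True (> n) (Set.lookupMin us)

-- Summarize the other players' choices, then ask whether move i would win.
winsVia :: Int -> [Int] -> Bool
winsVia i others =
  win i (foldr addToState (GameState Set.empty Set.empty) others)

-- True when the summary agrees with the direct rule for every move i and
-- every combination of three opposing moves drawn from 1 .. m.
agrees :: Int -> Bool
agrees m =
  and
    [ winsVia i os == (leastUnique (i : os) == Just i)
    | i <- [1 .. m]
    , os <- replicateM 3 [1 .. m]
    ]
```

Evaluating agrees 4 checks all 256 four-player rounds over the numbers 1 to 4 and returns True.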

Finally, we update the function that computes win probabilities to use this new code.

expectedOutcomesTo :: Int -> Int -> Distribution Double Int -> [Double]
expectedOutcomesTo n m dist = [probability (win i) states | i <- [1 .. m]]
  where
    states = stateDist (n - 1) dist

The result is that while I previously had to leave the code running overnight to compute the n = 8 case, I can now compute cases up to 15 players with enough patience. The naive code would need to compute the winner for about a quadrillion games, making it hopeless, but the simplification reduces that to something feasible.

Nash equilibria for 2 through 15 players

It seems that once you leave behind small numbers of players where odd combinatorial things happen, the equilibrium eventually follows a smooth pattern. I suppose with enough players, the probability for every number would peak and then decline, just as we see for 4 and 5 here, as it becomes worthwhile to spread your choices even further to avoid duplicates. That’s a nice confirmation of my intuition.

by Chris Smith at September 27, 2024 07:19 AM

September 25, 2024

Oskar Wickström

How I Built "The Monospace Web"

Recently, I published The Monospace Web, a minimalist design exploration. It all started with this innocent post, yearning for a simpler web. Perhaps too typewriter-nostalgic, but it was an interesting starting point. After some hacking and sharing early screenshots, @noteed asked for grid alignment, and down the rabbit hole I went.

September 25, 2024 10:00 PM

September 24, 2024

Tweag I/O

Python Packaging in the Real World: Biomedical projects vs. PyPI

The Python programming language, and its huge ecosystem (there are more than 500,000 projects hosted on the main Python repository, PyPI), is used both for software engineering and scientific research. Both have similar requirements for reproducibility. But, as we will see, the practices are quite different.

In fact, the Python ecosystem and community are notorious for the countless ways of declaring dependencies. As we were developing FawltyDeps [1], a tool to ensure that declared dependencies match the actual imports in the code, we had to accommodate many of these ways. This got us thinking: Could FawltyDeps be used to gain insights into how packaging is done across Python ecosystems?

In this blog post, we look at project structures and dependency declarations across Python projects, both from biomedical scientific papers (as an example of scientific usage of Python) as well as from more general and widely used Python packages. We’ll try to answer the following questions:

  • What practices does the community actually follow? And how do they differ between software engineering and scientific research?
  • Could such differences be related to why it’s often hard to reproduce results from scientific notebooks published in the data science community?

Experiment setup

In the following, we discuss the experimental setup — how we decided which data to use, where to get this data from, and what tools we use to analyze it, before we discuss our results in depth.

Data

First, we need to collect the names and source code locations of projects that we want to include in the analysis. Now, where did we find these projects? We selected projects for analysis based on two key areas: impactful real-world applications and broad community adoption.

  1. Biomedical data analysis repositories: biomedical data plays a vital role in healthcare and research. To capture its significance, we focused on packages directly linked to biomedical data, sourced from repositories supported or referenced by scientific biomedical articles. This criterion anchored our experiment in real-world scientific applications.
  2. Popular PyPI packages: to analyze software engineering practices, and acknowledging the importance of widely adopted packages, we scanned the most downloaded and frequently used PyPI packages.

Biomedical data

We leverage a recent study by Samuel, S., & Mietchen, D. (2024): Computational reproducibility of Jupyter notebooks from biomedical publications. This study analyzed 2,177 GitHub repositories associated with publications indexed in PubMed Central to assess computational reproducibility. Specifically, we reused the dataset they generated (found here) for our own analyses.

PyPI data

In order to start analyzing actual projects published to PyPI, we still needed to access some basic metadata about these projects: the project’s name, source URL, and any extra metadata which could be useful for further analysis such as project tags.

While this information is available via the PyPI REST API, this API is subject to rate limiting and is not really designed for bulk analyses such as ours. Conveniently, Google maintains a public BigQuery dataset of PyPI download statistics and project metadata which we leveraged instead. As a starting point for our analysis, we produced a CSV with relevant metadata for top packages downloaded in 2023 using a simple SQL query. Since the above-mentioned biomedical database contains 2,177 projects, we conducted a scan of the first 2,000 PyPI packages to create a dataset of comparable size.

Using FawltyDeps to analyze the source code data

Now that we have the source URLs of our projects of interest, we downloaded all sources and ran an analysis script that wraps around FawltyDeps on the packages. For safety, all of this happened in a virtual machine.

Post-processing and filtering of FawltyDeps analysis results

While the data we collected from PyPI was quite clean (modulo broken or inaccessible project URLs), the biomedical dataset contained some projects written in R and some projects written in Python 2.X, which are outside of our scope. To further filter for relevant projects that are written in Python 3.X, we applied the following rules:

  • there should be .py or .ipynb files in the source code directory of the data. If there are only .ipynb files and no imports, then it is most likely an R project and not taken into account.
  • we are also only interested in Python projects that have 3rd-party imports, as these are the projects we would expect to declare their dependencies.

After these filtering steps, we have 1,260 biomedical projects and 1,118 PyPI packages to be analyzed.

Results

Now that we had crunched thousands of Python packages, we were curious to see what secrets the data produced by FawltyDeps would reveal!

Dependency declaration patterns

First, we investigated which dependency declaration file choices were made in both samples. The following pie charts show the proportion of projects with and without dependency declaration files, and whether these files actually contain dependency declarations.

Figure 1. Percent of projects with dependency declaration files and actual dependency(ies) declared.

We find that about 60% of biomedical projects have dependency declaration files, while for PyPI packages, that number is almost 100%. That is expected, as the top PyPI projects are written to be reproducible: they are downloaded by a large group of people and if they are not working due to lack of dependency declarations, it would be noticed immediately by the users.

Interestingly, we found that some biomedical projects (6.8%) and PyPI packages (16.0%) have dependency declaration files with no dependencies listed inside them. This might be because they genuinely have no third-party dependencies, but more commonly it is a symptom of either:

  • setup.py files with complex dependency calculations: although FawltyDeps supports parsing simple setup.py files with a single setup() call and no computation involved for setting the install_requires and extras_require arguments, it is currently not able to analyze more complex scenarios.
  • pyproject.toml might be used to configure tools with sections like [tool.black] or [tool.isort], and declaring dependencies (and other project metadata) in the same file is not strictly required.

For the remainder of the analysis, we do not take these cases into account.

We then examined how different package types utilize various dependency declaration methods. The following chart shows the distribution of requirements.txt, pyproject.toml, and setup files across biomedical projects and PyPI packages (note that these three categories are not exclusive):

Figure 2. Percent of projects with dependencies declared in `requirements.txt`, `pyproject.toml` and setup files.

For biomedical projects, requirements.txt and setup.py/setup.cfg files make up the majority of declaration files. In contrast, PyPI projects show a higher occurrence of pyproject.toml than biomedical projects. pyproject.toml is the suggested modern way of declaring dependencies. This result should not come as a surprise: top PyPI projects are actively maintained and are more likely to follow best practices. A requirements.txt file, on the other hand, is easier to add, and if you do not need to package your project, it is the simpler option.

Now let’s have a more detailed view in which categories are exclusive:

Figure 3. Distribution of mutually exclusive dependency file choices.

For biomedical data there are a lot of projects that have either requirements.txt or setup.py/setup.cfg files (or a combination of both) present. The traditional method of using setup files utilizing setuptools to create Python packages has been around for a while and is still heavily relied upon in the scientific community.

On the PyPI side, no single method for declaring dependencies stood out, as different approaches were used with similar frequency across all projects. However, when it comes to using pyproject.toml, PyPI packages were about five times more likely to adopt this method compared to biomedical projects, suggesting that PyPI package authors tend to favor pyproject.toml significantly more often for dependency management.

Also, almost no biomedical projects (only 2 out of 1,260) and very few PyPI packages (only 25 out of 1,118) used pyproject.toml and setup files together: it seems that projects don’t often mix the older method (setup files) with the more modern one (pyproject.toml).

A different way to visualize the subset of results pertaining to requirements.txt, pyproject.toml and setup.py/setup.cfg files is a Venn diagram:

Figure 4. Venn diagram of projects with dependencies declared with categories including combination of dependency files.

While these diagrams don’t contain new insights, they show clearly how much more common pyproject.toml usage is for PyPI packages.

Source code directories

We next examined where projects store their source code, which we refer to as the “source code directory”. In the following analysis, we defined this directory as the directory that contains the highest number of Python code files and does not have names like “test”, “example”, “sample”, “doc”, or “tutorial”.

Figure 5. Source code directories choices.

We can make some interesting observations: over half (53%) of biomedical projects store their main source code in a directory whose name differs from the project name, and source code is rarely stored in directories named src or src-python (7%). For PyPI projects, the numbers are lower, with 37% storing their main code in a directory that matches the project name. However, naming the source code directory differently from the package name is still fairly common for PyPI projects, appearing in 36% of cases. A somewhat surprising finding: the src layout, recommended by the Python Packaging User Guide, appears in only 14% of cases.

Another noteworthy observation is that 23% of biomedical projects store all their source code in the root directory of the project. In contrast, only 12% of PyPI projects follow this pattern. This difference makes sense, as scientists working on biomedical projects might be less concerned about maintaining a strict code structure compared to developers on PyPI. Additionally, a lot of biomedical projects might be a loose collection of notebooks/scripts not intended to be packaged/importable, and thus will typically not need to add any subdirectories at all. On the other hand, everything from the PyPI data set is an importable package. Even in the “flat” layout (according to discussion), related modules are collected in a subdirectory named after the package.

The top PyPI projects that keep their code in the root directory are often small Python modules or plugins, like “python-json-patch”, “appdirs”, and “python-json-pointer”. These projects usually have all their source code in a single file, so storing it in the root directory makes sense.

Key results

Many people have preconceptions about how a Python project should look, but the reality can be quite different. Our analysis reveals distinct differences between top PyPI projects and biomedical projects:

  • PyPI projects tend to use modern tools like pyproject.toml more frequently, reflecting better overall project structure and dependency management practices.
  • In contrast, biomedical projects display a wide variety of practices; some store code in the root directory and fail to declare dependencies altogether.

This discrepancy is partially explained by the selection criteria: popular PyPI packages, by necessity, must be usable and thus correctly declare their dependencies, while biomedical projects accompanying scientific papers do not face such stringent requirements.

Conclusion

We found that biomedical projects are written with less attention to coding best practices, which compromises their reproducibility. Many projects declare no dependencies at all. The use of pyproject.toml, which is the current state-of-the-art way to declare dependencies, is less common in biomedical packages. In our opinion, though, it’s essential for any package to adhere to the same high standards of reproducibility as top PyPI packages. This includes implementing robust dependency management practices and embracing modern packaging standards. Enhancing these practices will not only improve reproducibility but also foster greater trust and adoption within the scientific community.

While our initial analysis revealed some interesting insights, we feel that there might be some more interesting treasures to be found within this dataset - you can check yourself in our FawltyDeps-analysis repository! We invite you to join the discussion on FawltyDeps and reproducibility in package management on our Discord channel.

Finally, this experiment also served as a real-world stress test for FawltyDeps itself and identified several edge cases we had not yet accounted for, suggesting avenues of further development for FawltyDeps: One of the main challenges was to parse unconventional require and extra-require sections in setup.py files. This issue has been addressed by the FawltyDeps project, specifically through the improvements made in FawltyDeps PR #440. Furthermore, it was also not trivial to handle projects with multiple packages declared in one. Addressing these issues will be a focus as we continue to refine and improve FawltyDeps.

Stay tuned as we will drill deeper into the data we’ve collected. So far, we’ve reused part of FawltyDeps’ code for our analysis, but the next step will be to run the full FawltyDeps tool on a large number of packages. Join us as we examine how FawltyDeps performs under rigorous testing and what improvements can be made to enhance its capabilities!


  1. For more insights, refer to our previous talk at PyData Global: Finding undeclared and unused dependencies in your notebooks and projects.

September 24, 2024 12:00 AM

September 16, 2024

Dan Piponi (sigfpe)

What does it take to be a hero? and other questions from statistical mechanics.

1 We only hear about the survivors

In the classic Star Trek episode Errand of Mercy, Spock computes the chance of success:

CAPTAIN JAMES T. KIRK : What would you say the odds are on our getting out of here?

MR. SPOCK : Difficult to be precise, Captain. I should say, approximately 7,824.7 to 1.


And yet they get out of there. Are Spock’s probability computations unreliable? Think of it another way. The Galaxy is a large place. There must be tens of thousands of Spocks, and Grocks, and Plocks out there on various missions. But we won’t hear (or don’t want to hear) about the failures. So they may all be perfectly good at probability theory, but we’re only hearing about the lucky ones. This is an example of survivor bias.


2 Simulation


We can model this. I’ve written a small battle simulator for a super-simple made up role-playing game...


And the rest of this article can be found at github


(Be sure to download the actual PDF if you want to be able to follow links.)

by sigfpe (noreply@blogger.com) at September 16, 2024 04:11 PM

September 12, 2024

Tweag I/O

Reflecting away from definitions in Liquid Haskell

We’ve all been there: wasting a couple of days on a silly bug. Good news for you: formal methods have never been easier to leverage.

In this post, I will discuss the contributions I made during my internship to Liquid Haskell (LH), a tool that makes proving that your Haskell code is correct a piece of cake.

LH lets you write contracts for your functions inside your Haskell code. In other words, you write pre-conditions (what must be true when you call it) and post-conditions (what must always be true when you leave the function). These are then fed into an SMT solver that proves your code satisfies them! You may have to write a few lemmas to guide LH, but it makes verification easier than proving them completely in a proof assistant.

My contributions enhance the reflection mechanism, which allows LH to unfold function definitions in logic formulas when verifying a program. I have explored three approaches that are described in what follows.

The problem

Imagine that, in the course of your work, you wanted to define a function that inserts into an association list.

{-@
smartInsert
  :: k:String
  -> v:Int
  -> l:[(String, Int)]
  -> {res : [(String, Int)] |
        lookup k l = Just v || head res = (k , v)
     }
@-}
smartInsert :: String -> Int -> [(String, Int)] -> [(String, Int)]
smartInsert k v l
  | lookup k l == Just v = l
  | otherwise = (k, v) : l

LH runs as a compiler plugin. While the bulk of the compiler ignores the special comments {-@ ... @-}, LH processes the annotations therein.

The annotation that you see in the first snippet is the specification of smartInsert, with the post-condition establishing that the result of the function must have the pair (k, v) at the front, or the pair must be already present in the original list.

Let us say that you also want to use that smartInsert function later in the logic or proofs, so you want to reflect it to the logic. For that, you will introduce another annotation:

{-@ reflect smartInsert @-}

This annotation is telling LH that the equations of the Haskell definition of smartInsert can be used to unfold calls to smartInsert in logic formulas.

As a human, you may agree that the specification is valid for this implementation, but you get this error from the machine:

error:
Illegal type specification for `Test.smartInsert`
[...]
    Unbound symbol GHC.Internal.List.lookup --- perhaps you meant: GHC.Internal.Base.. ?

Do not despair! This tells you that lookup is not defined in the logic. Despite lookup being a respectable function in Haskell, defined in GHC.List, LH knows nothing about it. Not all functions in Haskell can simply be used in the logic, at least not without reflecting them first. Far from being discouraged, you decide to reflect it like the others, but you realize that lookup wasn’t defined in your own module, it comes from the Prelude! This makes reflection impossible, as LH points out:

error:
Cannot lift Haskell function `lookup` to logic
"lookup" is not in scope

If you consider for a moment, LH needs the definition of the function in order to reflect it. So it can only complain when it is asked to reflect a function whose definition is not available because it was defined in some library dependency.

This is a recurring problem, especially when working with dependencies, and this is exactly what I have been working on during this internship at Tweag, in three different ways, as described below.

Idea #1: Define our own reflection of the function

Your first thought might be: “if I cannot reflect lookup because it comes from a foreign library, I will just define my own version of it myself”. Even better would be if you could still link your custom definition of lookup to the original symbol. Creating this link was my first contribution.

Step one is to define the pretend function. For this to work out correctly in the end, its definition must be equivalent to the original definition of the imported function.

The definition of the pretend function might look like this:

myLookup :: Eq a => a -> [(a, b)] -> Maybe b
myLookup _ [] = Nothing
myLookup key ((x, y):xys)
  | key == x  = Just y
  | otherwise = myLookup key xys

So far, so good. Of course, we give it a different name from the actual function, as they refer to different definitions, and we want to be able to refer to both so that we can link them together later.

Now, we reflect this myLookup function, which LH has no problem doing, since this reflect command is located in the same module as its definition.

{-@ reflect myLookup @-}

Then, the magic happens with this annotation that links the two lookups together:

{-@ assume reflect lookup as myLookup @-}

Read it as “reflect lookup, assuming that its definition is the same as myLookup”. This is enough to get the smartInsert function verified. Just for the record, here is the working snippet:

{-@ reflect myLookup @-}
myLookup :: Eq a => a -> [(a, b)] -> Maybe b
myLookup _ [] = Nothing
myLookup key ((x, y):xys)
  | key == x  = Just y
  | otherwise = myLookup key xys

{-@ assume reflect lookup as myLookup @-}

{-@
reflect smartInsert
smartInsert
  :: k:String
  -> v:Int
  -> l:[(String, Int)]
  -> {res : [(String, Int)] |
       lookup k l = Just v || head res = (k , v)
     }
@-}
smartInsert :: String -> Int -> [(String, Int)] -> [(String, Int)]
smartInsert k v l
  | lookup k l == Just v = l
  | otherwise = (k, v) : l

The question you may be asking at this point is: why does it work?

In order to verify the code, LH has to prove side-conditions (called subtyping relations) between the actual output and the post-condition to be verified. For the first equation of smartInsert, it needs to be proved that

lookup k l = Just v && res = l
  =>
lookup k l = Just v || head res = (k , v)

For the second equation, it needs to be proved that

res = (k, v) : l
  =>
lookup k l = Just v || head res = (k , v)

Because we started with such a simple example, the reflection of lookup is actually unused here (even though LH conservatively insists on it). But that’s just a coincidence; in fact, we can use a more direct post-condition that does actually use the reflection:

{-@
smartInsert
  :: k:String
  -> v:Int
  -> l:[(String, Int)]
  -> {res : [(String, Int)] | lookup k res = Just v}
@-}

This time, the subtyping constraints require proving:

-- constraint for the first equation
lookup k l = Just v && res = l
  =>
lookup k res = Just v

-- constraint for the second equation
res = (k, v) : l
  =>
lookup k res = Just v

The first constraint can still be solved without going into the definition of lookup. But the second constraint isn’t something that we can prove for any definition of lookup. Thanks to reflection, we have the following unfoldings at our disposal:

lookup key l = myLookup key l

myLookup key l =
  if isEmpty l then Nothing
  else if key = fst (head l) then
    Just (snd (head l))
  else
    myLookup key (tail l)

The first equality is from assume-reflection. It links the pretend and actual functions. The second one is the reflection of myLookup.

With that in mind, let’s move on to prove the second constraint. We reduce the left-hand side to the right-hand side.

lookup k res
    = lookup k ((k, v):l)       (hypothesis)
    = myLookup k ((k, v) : l)   (lookup unfolding)
    = Just v                    (myLookup unfolding)

Q.E.D. Furthermore, you notice that the equation connecting lookup and myLookup was crucial. That is the gist of what we added to LH to make the proof work.
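The refined post-condition can also be spot-checked at runtime in plain Haskell. This is only a sanity test on a handful of inputs, not a substitute for the proof that LH carries out for all inputs; postcondition and spotChecks are names made up for this sketch.

```haskell
-- Same function as above, without the Liquid Haskell annotations.
smartInsert :: String -> Int -> [(String, Int)] -> [(String, Int)]
smartInsert k v l
  | lookup k l == Just v = l
  | otherwise = (k, v) : l

-- The refined post-condition, phrased as an ordinary Boolean test.
postcondition :: String -> Int -> [(String, Int)] -> Bool
postcondition k v l = lookup k (smartInsert k v l) == Just v

-- Hand-picked inputs covering both equations of smartInsert.
spotChecks :: Bool
spotChecks =
  and
    [ postcondition "a" 1 []                    -- second equation, empty list
    , postcondition "a" 1 [("a", 1)]            -- first equation, pair present
    , postcondition "a" 2 [("a", 1)]            -- second equation, key rebound
    , postcondition "b" 3 [("a", 1), ("b", 4)]  -- second equation, later key
    ]
```

In each second-equation case the new pair lands at the head of the list, so lookup finds it before any older binding for the same key, mirroring the unfolding steps of the proof.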

In addition to the implementation, I contributed a specification of assume-reflection that spells out the validation of the new annotation and the resolution rules when the same function is assume-reflected at different locations. It is worth noting that if there exist two assume-reflections in your imports that contradict each other, then one of them must be false, so your axiom environment will not be sound.

Idea #2: opaque reflection

We noted already that we didn’t truly need to know what lookup was about to prove the first, simpler specification, namely:

{-@
smartInsert
  :: k:String
  -> v:Int
  -> l:[(String, Int)]
  -> {res : [(String, Int)] |
       lookup k res = Just v || head res = (k, v)
     }
@-}

The only issue we had was that lookup was not defined in the logic. Similarly, it is possible that our own functions to be reflected use imported, unreflected functions whose content is irrelevant. We want to reflect the expressions of our functions, but do not care about the expression of some of the functions that appear inside them. Here, we want to reflect smartInsert, which contains lookup, but we don’t need to know exactly what lookup is about to prove our lemmas. Either lookup comes from a dependency, or it has a non-trivial implementation, or it uses primitives not implemented in Haskell.

We allowed this through what we call opaque reflection. Opaque reflection introduces a symbol, without any equation, for all the symbols in your reflections that aren’t defined yet in the logic.

For instance, when reflecting the definition of smartInsert,

smartInsert k v l
  | lookup k l == Just v = l
  | otherwise = (k, v) : l

LH looks for any free symbols in there that are not present in the logic. Here, it will see that lookup is something new to the logic, and it will introduce an uninterpreted function for it. Uninterpreted functions are symbols used by the SMT solver, for which it only knows it satisfies function congruence, i.e. that if two values are equal v = w, then when the function is applied to them, the result is still the same f v = f w.

As it turns out, we could also do that manually using the measure annotation. These annotations let you introduce an uninterpreted function in the logic yourself, and specify the refinement type of it.

For instance, we could define a measure like this:

{-@
measure GHC.Internal.List.lookup :: k:a -> xs:[(a, b)] -> Maybe b
GHC.Internal.List.lookup
  :: k:a
  -> xs:[(a, b)]
  -> {VV : Maybe b | VV == GHC.Internal.List.lookup k xs}
@-}

The measure annotation creates an uninterpreted function with the same name as the function in the Haskell code. The second line links both the uninterpreted and Haskell functions by strengthening the post-condition of the Haskell function with the uninterpreted function from the logic.

The new opaque reflection does all that for you automatically! It’s even more powerful when you think about imports. If two modules are opaque-reflecting the same function from some common import, the uninterpreted symbols are considered the same because they refer to the same thing.

In contrast, if you were to use measure annotations in both imports for the same external function (say, lookup), and then import those modules into another one, LH would complain about it. Indeed, there cannot be two measures with identical names in scope. Since LH doesn’t know what you’re using those measures for, or whether they actually stand for the same uninterpreted function, it cannot resolve the ambiguity. The full specification is here.

Idea #3: Using the unfoldings

At this point, someone might object that Haskell can inline even imported functions when optimizing the code, so it must have access to the original definitions. As such, there is no need for assume-reflection or opaque-reflection, if we could just reflect the function definition wherever the optimizer finds it.

It is indeed the case for some functions, and under some circumstances (note the precautions I’m taking here), that some information about the implementation of functions is passed in interface files.

What are interface files? These are the files that contain the information that the other modules need to know. Part of this information is the unfoldings of the exported functions, in a syntax that is slightly different from GHC’s CoreExprs, but can easily be converted to it.

After some experimentation, I observed that the unfoldings of many functions are available in interface files, unless prevented by the -fignore-interface-pragmas or -fomit-interface-pragmas flags (note that -O0 implies those flags, but -O1 does not). Since most packages are compiled with at least -O1, the unfoldings of many functions are available without any further tuning. In particular, those functions that are small enough to be included in the interface files are available.

Once implemented, it suffices to use the same reflect annotation as before, but this time even for imported functions!

{-@ reflect flip @-}

LH will automatically detect if this function is defined in the current module or in the dependencies, and in the latter case it will look for possible unfoldings.

Unfortunately, these unfoldings turned out to have some drawbacks.

  • The presence of these unfoldings depends on some GHC flags, and heuristics from GHC. As such, it’s possible for a new version of a library to suddenly exclude an unfolding without the library author realizing it. This predicament is akin to that of the HERMIT tool, and it is difficult to solve without rebuilding the dependencies with custom configuration.
  • The unfoldings are based on the optimized version of the functions, which is sometimes harder to reason about. Also, it is subject to change if the GHC optimizations change, which means that any proof based on these unfoldings could be broken by a change to those optimizations.
  • Many functions are not possible to reflect as they are. If they use local recursive definitions, or lambda abstractions, LH cannot reflect them at the moment.
  • If the unfolding of a function depends on non-exported definitions, LH does not offer a mechanism to request these definitions to be reflected. Even if it did, this breaks encapsulation to some point, and makes our code dependent on internal implementation details of imported code, to the point where even a dot release could break the verification.
  • Reflections are still limited in their capabilities. At the time of writing, reflected functions cannot contain lambda abstractions or local recursive bindings. Recursive bindings are allowed, but local ones are not, since LH has no sense of locality (yet). Because unfoldings tend to have a lot of these, we cannot reflect them (yet).

For these reasons, further work and experimentation will be needed to make this approach truly useful. Nevertheless, we have included the implementation in a PR in the hope that it may be helpful in some cases, and that improving the capabilities of reflections in general will make it more and more valuable.

Conclusion

Liquid Haskell’s reflection is handy and powerful, but until now, if your function used dependencies that were not yet reflected, you were stuck. We presented three ways to proceed: assert an equivalence between the imported function and a definition in the current module (ideally copy-pasted from the original source file), introduce some uninterpreted function in the logic for dependencies, or try to find the unfoldings of those dependencies in interface files.

All of these features have been implemented and pulled into Liquid Haskell. The implementation fits well into LH’s machinery, reusing the existing pipeline for uninterpreted symbols and reflections. We also added tests, especially for module imports, and checked the implementation against the numerous regression tests already in place. An enticing next step would be to improve the capabilities of reflection, which would also allow diving deeper into the reflection of unfoldings in interface files.

I hope this will improve the ease of proof-writing in LH, and that reading this post will encourage you to write more specifications and proofs about your code, seeing how much of a breeze it can be!

I would like to thank Tweag for this wonderful opportunity to work on Liquid Haskell; it has been an enriching internship that has allowed me to grow in Haskell experience and in contributing to large codebases. In particular, I’d like to express my heartfelt thanks to my supervisor, Facundo Domínguez, for his constant support, guidance, and invaluable assistance.

September 12, 2024 12:00 AM

September 09, 2024

Magnus Therning

Followup on secrets in my work notes

I got the following question on my post on how I handle secrets in my work notes:

Sounds like a nice approach for other secrets but how about :dbconnection for Orgmode and sql-connection-alist?

I have to admit I'd never come across the variable sql-connection-alist before. I've never really used sql-mode for more than editing SQL queries and setting up code blocks for running them was one of the first things I used yasnippet for.

I did a little reading and unfortunately it looks like sql-connection-alist can only handle string values. However, there is a variable sql-password-search-wallet-function, with the default value of sql-auth-source-search-wallet, so using auth-source is already supported for the password itself.

There seems to be a lack of good tutorials for setting up sql-mode in a secure way – all articles I found place the password in clear-text in the config – filling that gap would be a nice way to contribute to the Emacs community. I'm sure it'd prompt me to re-evaluate incorporating sql-mode in my workflow.

September 09, 2024 08:36 PM

in Code

My Physics and Math Heritage

This is just a “personal life update” kind of post, but I recently found out a couple of cool things about my academic history that I thought were neat enough to write down so that I don’t forget them.

Oppenheimer

When the Christopher Nolan biopic about the life of J. Robert Oppenheimer was about to come out, it was billed as an “Avengers of Physics”, where every major physicist working in the US in the early and middle 20th century would be featured. I had the thought of tracing my “academic family tree” to see if my PhD advisor’s advisor’s advisor’s advisor was involved in any of the major physics projects depicted in the movie, so that I could spot them portrayed on screen as a nice personal connection.

If you’re not familiar with the concept, the relationship between a PhD candidate and their doctoral advisor is a very personal and individual one: they personally direct and guide the candidate’s research and thesis. To an extent, they are like an academic parent.

I was able to find my academic family tree and, to my surprise, my academic lineage actually traces directly back to a key figure in the movie!

  • My advisor, Hesham El-Askary, received his PhD under the advisory of Menas Kafatos at George Mason University.
  • Dr. Kafatos received his PhD under the advisory of Philip Morrison at the Massachusetts Institute of Technology.
  • Dr. Morrison received his PhD in 1940 at University of California, Berkeley under the advisory of none other than J. Robert Oppenheimer himself!

So, I started this out on a quest to figure out if I was “academically descended” from anyone in the movie, and I ended up finding out I was Oppenheimer’s advisee’s advisee’s advisee’s advisee! I ended up being able to watch the movie and identify my great-great-grand advisor no problem, and I think even my great-grand advisor. A fun little unexpected surprise and a cool personal connection to a movie that I enjoyed a lot.

Erdos

As an employee at Google, you can customize your directory page with “badges”, which are little personalized accomplishments or achievements, usually unrelated to any actual work you do. I noticed that some people had an “Erdos Number N” badge (1, 2, 3, etc.). I had never given any thought to my own personal Erdos number (it was probably really high, in my mind) but I thought maybe I could look into it in order to get a shiny worthless badge.

In academia, Paul Erdos is someone who wrote so many papers and collaborated with so many people that it became a joking “non-accomplishment” to say that you wrote a paper with him. Then after a while it became a joking non-accomplishment to say that you wrote a paper with someone who wrote a paper with him (because, who hasn’t?). And then it became an even more joking non-accomplishment to say you had an Erdos Number of 3 (you wrote a paper with someone who wrote a paper with someone who wrote a paper with Dr. Erdos).

Anyway, I just wanted to get that badge, so I tried to figure it out. It turns out my most direct trace is through:

  1. I co-authored “Application of recurrent neural networks for drought projections in California” with Daniele C. Struppa.
  2. Dr. Struppa co-authored “Applications of commutative and computational algebra to partial differential equations” with William W. Adams.
  3. Dr. Adams co-authored “Non-Archimedian analytic functions taking the same values at the same points” with Ernst G. Straus.
  4. Dr. Straus collaborated with many people, including Einstein, Graham, Goldberg, and 20 papers with Erdos.

So I guess my Erdos number is 4? The median number for mathematicians today seems to be 5, so it’s just one step above that. Not really a note-worthy accomplishment, but still neat enough that I want a place to put the work tracking this down the next time I am curious again.

Anyways, I submitted the information above and they gave me that sweet Erdos 4 badge! It was nice to have for about a month before quitting the company.

That’s It

Thanks for reading and I hope you have a nice rest of your day!

by Justin Le at September 09, 2024 05:28 AM

Brent Yorgey

Decidable equality for indexed data types

Posted on September 9, 2024

Recently, as part of a larger project, I wanted to define decidable equality for an indexed data type in Agda. I struggled quite a bit to figure out the right way to encode it to make Agda happy, and wasn’t able to find much help online, so I’m recording the results here.

The tl;dr is to use mutual recursion to define the indexed data type along with a sigma type that hides the index, and to use the sigma type in any recursive positions where we don’t care about the index! Read on for more motivation and details (and wrong turns I took along the way).

This post is literate Agda; you can download it here if you want to play along. I tested everything here with Agda version 2.6.4.3 and version 2.0 of the standard library.

Background

First, some imports and a module declaration. Note that the entire development is parameterized by some abstract set B of base types, which must have decidable equality.

open import Data.Product using (Σ ; _×_ ; _,_ ; -,_ ; proj₁ ; proj₂)
open import Data.Product.Properties using (≡-dec)
open import Function using (_∘_)
open import Relation.Binary using (DecidableEquality)
open import Relation.Binary.PropositionalEquality using (_≡_ ; refl)
open import Relation.Nullary.Decidable using (yes; no; Dec)

module OneLevelTypesIndexed (B : Set) (≟B : DecidableEquality B) where

We’ll work with a simple type system containing base types, function types, and some distinguished type constructor □. So far, this is just to give some context; it is not the final version of the code we will end up with, so we stick it in a local module so it won’t end up in the top-level namespace.

module Unindexed where
  data Ty : Set where
    base : B → Ty
    _⇒_ : Ty → Ty → Ty
    □_ : Ty → Ty

For example, if \(X\) and \(Y\) are base types, then we could write down a type like \(\square ((\square \square X \to Y) \to \square Y)\):

  infixr 2 _⇒_
  infix 30 □_

  postulate
    BX BY : B

  X : Ty
  X = base BX
  Y : Ty
  Y = base BY

  example : Ty
  example = □ ((□ □ X ⇒ Y) ⇒ □ Y)

However, for reasons that would take us too far afield in this blog post, I don’t want to allow immediately nested boxes, like \(\square \square X\). We can still have multiple boxes in a type, and even boxes nested inside of other boxes, as long as there is at least one arrow in between. In other words, I only want to rule out boxes immediately applied to another type with an outermost box. So we don’t want to allow the example type given above (since it contains \(\square \square X\)), but, for example, \(\square ((\square X \to Y) \to \square Y)\) would be OK.

Encoding invariants

How can we encode this invariant so it holds by construction? One way would be to have two mutually recursive data types, like so:

module Mutual where
  data Ty : Set
  data UTy : Set

  data Ty where
    □_ : UTy → Ty
    ∙_ : UTy → Ty

  data UTy where
    base : B → UTy
    _⇒_ : Ty → Ty → UTy

UTy consists of types which have no top-level box; the constructors of Ty just inject UTy into Ty by adding either one or zero boxes. This works, and defining decidable equality for Ty and UTy is relatively straightforward (again by mutual recursion). However, it seemed to me that having to deal with Ty and UTy everywhere through the rest of the development was probably going to be super annoying.

The other option would be to index Ty by values indicating whether a type has zero or one top-level boxes; then we can use the indices to enforce the appropriate rules. First, we define a data type Boxity to act as the index for Ty, and show that it has decidable equality:

data Boxity : Set where
  [0] : Boxity
  [1] : Boxity

Boxity-≟ : DecidableEquality Boxity
Boxity-≟ [0] [0] = yes refl
Boxity-≟ [0] [1] = no λ ()
Boxity-≟ [1] [0] = no λ ()
Boxity-≟ [1] [1] = yes refl

My first attempt to write down a version of Ty indexed by Boxity looked like this:

module IndexedTry1 where
  data Ty : Boxity → Set where
    base : B → Ty [0]
    _⇒_ : {b₁ b₂ : Boxity} → Ty b₁ → Ty b₂ → Ty [0]
    □_ : Ty [0] → Ty [1]

base always introduces a type with no top-level box; the □ constructor requires a type with no top-level box, and produces a type with one (this is what ensures we cannot nest boxes); and the arrow constructor does not care how many boxes its arguments have, but constructs a type with no top-level box.

This is logically correct, but I found it very difficult to work with. The sticking point for me was injectivity of the arrow constructor. When defining decidable equality we need to prove lemmas that each of the constructors are injective, but I was not even able to write down the type of injectivity for _⇒_. We would want something like this:

⇒-inj :
  {bσ₁ bσ₂ bτ₁ bτ₂ : Boxity}
  {σ₁ : Ty bσ₁} {σ₂ : Ty bσ₂} {τ₁ : Ty bτ₁} {τ₂ : Ty bτ₂} →
  (σ₁ ⇒ σ₂) ≡ (τ₁ ⇒ τ₂) →
  (σ₁ ≡ τ₁) × (σ₂ ≡ τ₂)

but this does not even typecheck! The problem is that, for example, σ₁ and τ₁ have different types, so the equality proposition σ₁ ≡ τ₁ is not well-typed.

At this point I tried turning to heterogeneous equality, but it didn’t seem to help. I won’t record here all the things I tried, but the same issues seemed to persist, just pushed around to different places (for example, I was not able to pattern-match on witnesses of heterogeneous equality because of types that didn’t match).

Sigma types to the rescue

At ICFP last week I asked Jesper Cockx for advice (which felt a bit like asking Rory McIlroy to give some tips on your mini-golf game), and he suggested trying to prove decidable equality for the sigma type pairing an index with a type having that index, like this:

  ΣTy : Set
  ΣTy = Σ Boxity Ty

This turned out to be the key idea, but it still took me a long time to figure out the right way to make it work. Given the above definitions, if we go ahead and try to define decidable equality for ΣTy, injectivity of the arrow constructor is still a problem.

After days of banging my head against this off and on, I finally realized that the way to solve this is to define Ty and ΣTy by mutual recursion: the arrow constructor should just take two ΣTy arguments! This perfectly captures the idea that we don’t care about the indices of the arrow constructor’s argument types, so we hide them by bundling them up in a sigma type.

ΣTy : Set
data Ty : Boxity → Set

ΣTy = Σ Boxity Ty

data Ty where
  □_ : Ty [0] → Ty [1]
  base : B → Ty [0]
  _⇒_ : ΣTy → ΣTy → Ty [0]

infixr 2 _⇒_
infix 30 □_

Now we’re cooking! We now make quick work of the required injectivity lemmas, which all go through trivially by matching on refl:


□-inj : {τ₁ τ₂ : Ty [0]} → (□ τ₁ ≡ □ τ₂) → (τ₁ ≡ τ₂)
□-inj refl = refl

base-inj : {b₁ b₂ : B} → base b₁ ≡ base b₂ → b₁ ≡ b₂
base-inj refl = refl

⇒-inj : {σ₁ σ₂ τ₁ τ₂ : ΣTy} → (σ₁ ⇒ σ₂) ≡ (τ₁ ⇒ τ₂) → (σ₁ ≡ τ₁) × (σ₂ ≡ τ₂)
⇒-inj refl = refl , refl

Notice how the type of ⇒-inj is now perfectly fine: we just have a bunch of ΣTy values that hide their indices, so we can talk about propositional equality between them with no trouble.

Finally, we can define decidable equality for Ty and ΣTy by mutual recursion.

ΣTy-≟ : DecidableEquality ΣTy

{-# TERMINATING #-}
Ty-≟ : ∀ {b} → DecidableEquality (Ty b)

Sadly, I had to reassure Agda that the definition of Ty-≟ is terminating—more on this later.

To define ΣTy-≟ we can just use a lemma from Data.Product.Properties which derives decidable equality for a sigma type from decidable equality for both components.

ΣTy-≟ = ≡-dec Boxity-≟ Ty-≟

The only thing left is to define decidable equality for any two values of type Ty b (given a specific boxity b), making use of our injectivity lemmas; now that we have the right definitions, this falls out straightforwardly.

Ty-≟ (□ σ) (□ τ) with Ty-≟ σ τ
... | no σ≢τ = no (σ≢τ ∘ □-inj)
... | yes refl = yes refl
Ty-≟ (base x) (base y) with ≟B x y
... | no x≢y = no (x≢y ∘ base-inj)
... | yes refl = yes refl
Ty-≟ (σ₁ ⇒ σ₂) (τ₁ ⇒ τ₂) with ΣTy-≟ σ₁ τ₁ | ΣTy-≟ σ₂ τ₂
... | no σ₁≢τ₁ | _ = no (σ₁≢τ₁ ∘ proj₁ ∘ ⇒-inj)
... | yes _ | no σ₂≢τ₂ = no (σ₂≢τ₂ ∘ proj₂ ∘ ⇒-inj)
... | yes refl | yes refl = yes refl
Ty-≟ (base _) (_ ⇒ _) = no λ ()
Ty-≟ (_ ⇒ _) (base _) = no λ ()
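
For readers more at home in Haskell, the same shape can be sketched with a GADT indexed by a promoted Boxity, plus an existential wrapper playing the role of ΣTy. This is only an analogy under stated assumptions: the names (Zero, One, Box, Base, Arr, SomeTy, eqTy, eqSome) are made up for this sketch, base types are represented by String as a stand-in for the abstract set B, and the functions return a plain Bool rather than a proof-carrying decision as in the Agda development.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Index recording whether a type has zero or one top-level boxes.
data Boxity = Zero | One

-- Types indexed by boxity. Box only accepts an unboxed argument,
-- so immediately nested boxes are rejected by the type checker.
data Ty (b :: Boxity) where
  Box  :: Ty 'Zero -> Ty 'One
  Base :: String -> Ty 'Zero
  Arr  :: SomeTy -> SomeTy -> Ty 'Zero

-- Existential wrapper hiding the index, analogous to the sigma type.
data SomeTy where
  SomeTy :: Ty b -> SomeTy

-- Boolean equality on types; the two indices are allowed to differ.
eqTy :: Ty a -> Ty b -> Bool
eqTy (Box s)     (Box t)     = eqTy s t
eqTy (Base x)    (Base y)    = x == y
eqTy (Arr s1 s2) (Arr t1 t2) = eqSome s1 t1 && eqSome s2 t2
eqTy _           _           = False

-- Equality on wrapped types, mutually recursive with eqTy.
eqSome :: SomeTy -> SomeTy -> Bool
eqSome (SomeTy s) (SomeTy t) = eqTy s t
```

As in the Agda version, the arrow constructor takes wrapped arguments because it does not care about their indices. An expression such as Box (Box (Base "X")) fails to type check, which is the invariant the index is there to enforce.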

Final thoughts

First, the one remaining infelicity is that Agda could not tell that Ty-≟ is terminating. I am not entirely sure why, but I think it may be that the way the recursion works is just too convoluted for it to analyze properly: Ty-≟ calls ΣTy-≟ on structural subterms of its inputs, but then ΣTy-≟ works by providing Ty-≟ as a higher-order parameter to ≡-dec. If you look at the definition of ≡-dec, all it does is call its function parameters on structural subterms of its input, so everything should be nicely terminating, but I guess I am not surprised that Agda is not able to figure this out. If anyone has suggestions on how to make this pass the termination checker without using a TERMINATING pragma, I would love to hear it!

As a final aside, I note that converting back and forth between Ty (with ΣTy arguments to the arrow constructor) and IndexedTry1.Ty (with expanded-out Boxity and Ty arguments to arrow) is trivial:

Ty→Ty1 : {b : Boxity} → Ty b → IndexedTry1.Ty b
Ty→Ty1 (□ σ) = IndexedTry1.□ (Ty→Ty1 σ)
Ty→Ty1 (base x) = IndexedTry1.base x
Ty→Ty1 ((b₁ , σ₁) ⇒ (b₂ , σ₂)) = (Ty→Ty1 σ₁) IndexedTry1.⇒ (Ty→Ty1 σ₂)

Ty1→Ty : {b : Boxity} → IndexedTry1.Ty b → Ty b
Ty1→Ty (IndexedTry1.base x) = base x
Ty1→Ty (σ₁ IndexedTry1.⇒ σ₂) = -, (Ty1→Ty σ₁) ⇒ -, (Ty1→Ty σ₂)
Ty1→Ty (IndexedTry1.□ σ) = □ (Ty1→Ty σ)

I expect it is also trivial to prove this is an isomorphism, though I’m not particularly motivated to do it. The point is that, as anyone who has spent any time proving things with proof assistants knows, two types can be completely isomorphic, and yet one can be vastly easier to work with than the other in certain contexts. Often when I’m trying to prove something in Agda it feels like at least half the battle is just coming up with the right representation that makes the proofs go through easily.


by Brent Yorgey at September 09, 2024 12:00 AM

September 07, 2024

Dan Piponi (sigfpe)

How to hide information from yourself in a solo RPG

A more stable version of this article can be found on github.

The Problem

Since the early days of role-playing games there has been debate over which rolls the GM should make and which are the responsibility of the players. But I think that for “perception” checks it doesn’t really make sense for a player to roll. If, as a player, you roll to hear behind a door and succeed, but you’re told there is no sound, then you know there is nothing to be heard. But you ought to just be left in suspense.

If you play a solo RPG the situation is more challenging. If there is a probability p of a room being occupied, and probability q of you hearing the occupant if you listen at the door, how can you simulate listening without making a decision about whether the room is occupied before opening the door? I propose a little mathematical trick.

Image: Helena Listening, by Arthur Rackham

Simulating conditional probabilities

Let M be the event that a monster is present and H the event that you hear something. Suppose P(M) = p and P(H|M) = q (and P(H|not M) = 0). Then P(H) = pq. So to simulate the probability of hearing something at a new door: roll to see if a monster is present, and then roll to hear it. If both come up positive then you hear a noise.

But...but...you object, if the first roll came up positive you know there is a monster, removing the suspense if the second roll fails. Well this process does produce the correct (marginal) probability of hearing a noise at a fresh door. So you reinterpret the first roll not as determining whether a monster is present, but as just the first step in a two-step process to determine if a sound is heard.

But what if no sound is heard and we decide to open the door? We need to reduce the probability that we find a monster behind the door. In fact we need to sample P(M|not H). We could use Bayes’ theorem to compute this but chances are you won’t have any selection of dice that will give the correct probability. And anyway, you don’t want to be doing mathematics in the middle of a game, do you?

There’s a straightforward trick. In the event that you heard no noise at the door and want to now open the door: roll (again) to see if there is a monster behind the door, and then roll to listen again. If the outcome of the two rolls matches the information that you know, i.e. it predicts you hear nothing, then you can now accept the first roll as determining whether the monster is present. In that case the situation is more or less vacuously described by P(M|not H). If the two rolls disagree with what you know, i.e. they predict you hear something, then repeat the roll of two dice. Keep repeating until it agrees with what you know.
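
For anyone who does want the mathematics, here is a short check that the rerolling trick gives the right answer. It uses only the quantities defined above, P(M) = p, P(H|M) = q and P(H|not M) = 0, and assumes pq < 1 so that hearing nothing is possible at all.

```latex
\begin{align*}
P(\lnot H) &= 1 - P(H) = 1 - pq, \\
P(M \mid \lnot H) &= \frac{P(\lnot H \mid M)\,P(M)}{P(\lnot H)}
                   = \frac{p(1-q)}{1-pq}. \\
\intertext{A pair of rolls is accepted only when it predicts silence, so}
P(\text{monster} \mid \text{accepted})
  &= \frac{p(1-q)}{p(1-q) + (1-p)}
   = \frac{p(1-q)}{1-pq}.
\end{align*}
```

The two expressions agree, so the accepted roll is a sample from P(M|not H). Each pair of rolls is accepted with probability 1 - pq, so on average 1/(1 - pq) pairs are needed. With p = q = 1/2, for example, the chance of a monster behind a silent door drops from 1/2 to 1/3.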

In general

There is a general method here though it’s only practical for simple situations. If you need to generate some hidden variables as part of a larger procedure, just generate them as usual, keep the variables you observe, and discard the hidden part. If you ever need to generate those hidden variables again, and remain consistent with previous rolls, resimulate from the beginning, restarting the rolls if they ever disagree with your previous observations.
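
The general recipe can be sketched in Haskell by working with exact finite distributions instead of dice. This is an illustration, not a dice simulator: the names Dist, listenAtDoor, condition, probability and monsterGivenSilence are all made up for this sketch, and condition computes directly the distribution that "reroll until it agrees with what you observed" samples from.

```haskell
import Data.Ratio ((%))

-- A finite probability distribution: outcomes paired with their weights.
type Dist a = [(a, Rational)]

-- One round of the two-step procedure, as (monster present, noise heard).
-- p is the chance of a monster, q the chance of hearing it when it is there.
listenAtDoor :: Rational -> Rational -> Dist (Bool, Bool)
listenAtDoor p q =
  [ ((True,  True),  p * q)
  , ((True,  False), p * (1 - q))
  , ((False, False), 1 - p)
  ]

-- Keep only the outcomes consistent with the observation and renormalise.
-- Assumes at least one consistent outcome has non-zero weight.
condition :: (a -> Bool) -> Dist a -> Dist a
condition consistent dist = [ (x, w / total) | (x, w) <- kept ]
  where
    kept  = filter (consistent . fst) dist
    total = sum (map snd kept)

-- Total probability of an event under a distribution.
probability :: (a -> Bool) -> Dist a -> Rational
probability event dist = sum [ w | (x, w) <- dist, event x ]

-- Chance of a monster behind a door at which nothing was heard.
monsterGivenSilence :: Rational -> Rational -> Rational
monsterGivenSilence p q =
  probability fst (condition (not . snd) (listenAtDoor p q))
```

In GHCi, monsterGivenSilence (1 % 2) (1 % 2) evaluates to 1 % 3. The observed part of each outcome (the second component) is what you keep; the hidden part (the first component) is what gets regenerated whenever you need it.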

In principle you could even do something like simulate an entire fight against a creature whose hit points remain unknown to you. But you’ll spend a lot of time rerolling the entire fight from the beginning. So it’s better for situations that only have a small number of steps, like listening at a door.

by sigfpe (noreply@blogger.com) at September 07, 2024 11:06 PM

September 05, 2024

Tweag I/O

Adding algebraic data types to Nickel

Our Nickel language is a configuration language. It’s also a functional programming language. Functional programming isn’t a well-defined term: it can encompass anything from being vaguely able to pass functions as arguments and to call them (in that respect, C and JavaScript are functional) to being a statically typed, pure and immutable language based on the lambda-calculus, like Haskell.

However, if you ask a random developer, I can guarantee that one aspect will be mentioned every time: algebraic data types (ADTs) and pattern matching. They are the bread and butter of typed functional languages. ADTs are relatively easy to implement (for language maintainers) and easy to use. They’re part of the 20% of the complexity that makes for 80% of the joy of functional programming.

But Nickel didn’t have ADTs until recently. In this post, I’ll tell the story of Nickel and ADTs, starting from why they were initially lacking, the exploration of different possible solutions and the final design leading to the eventual retro-fitting of proper ADTs in Nickel. This post is intended for Nickel users, for people interested in configuration management, but also for anyone interested in programming language design and functional programming. It doesn’t require prior Nickel knowledge.

A quick primer on Nickel

Nickel is a gradually typed, functional, configuration language. From this point, we’ll talk about Nickel before the introduction of ADTs in the 1.5 release, unless stated otherwise. The core language features:

  • let-bindings: let extension = ".ncl" in "file.%{extension}"
  • first-class functions: let add = fun x y => x + y in add 1 2
  • records (JSON objects): {name = "Alice", age = 42}
  • static typing: let mult : Number -> Number -> Number = fun x y => x * y. By default, expressions are dynamically typed. A static type annotation makes a definition or an inline expression typechecked statically.
  • contracts look and act almost like types but are evaluated at runtime: { port | Port = 80 }. They are used to validate configurations against potentially complex schemas.

The lifecycle of a Nickel configuration is to be 1) written, 2) evaluated and 3) serialized, typically to JSON, YAML or TOML. An important guideline that we set first was that every native data structure (record, array, enum, etc.) should be trivially and straightforwardly serializable to JSON. In consequence, Nickel started with the JSON data model: records (objects), arrays, booleans, numbers and strings.

There’s one last primitive value: enums. As in C or in JavaScript, an enum in Nickel is just a tag. An enum value is an identifier with a leading ', such as in {protocol = 'http, server = "tweag.io"}. An enum is serialized as a string: the previous expression is exported to JSON as {"protocol": "http", "server": "tweag.io"}.

So why not just use strings? Because enums can better represent a finite set of alternatives. For example, the enum type [| 'http, 'ftp, 'sftp |] is the type of values that are either 'http, 'ftp or 'sftp. Writing protocol : [| 'http, 'ftp, 'sftp |] will statically (at typechecking time) ensure that protocol doesn’t take forbidden values such as 'https. Even without static typing, using an enum conveys to the reader that a field isn’t a free-form string.

Nickel has a match which corresponds to C or JavaScript’s switch:

is_http : [| 'http, 'ftp, 'sftp |] -> Bool =
  match {
    'http => true,
    _ => false,
  }

As you might notice, there are no ADTs in sight yet.

ADTs in a configuration language

While Nickel is a functional language, it’s first and foremost a configuration language, which comes with specific design constraints.

Because we’re telling the story of ADTs before they landed in Nickel, we can’t really use a proper Nickel syntax yet to provide examples. In what follows, we’ll use a Rust-like syntax to illustrate the examples: enum Foo<T> { Bar(i32), Baz(bool, T) } is an ADT parametrized by a generic type T with two constructors Bar and Baz, where the first one takes an integer as an argument and the other takes a pair of a boolean and a T. Concrete values are written as Bar(42) or Baz(true, "hello").

An unexpected obstacle: serialization

As said earlier, we want values to be straightforwardly serializable to the JSON data model.

Now, take a simple ADT such as enum Foo<T,U> { SomePair(T,U), Nothing }. You can find reasonable serializations for SomePair(1,2), such as {"tag": "SomePair", "a": 1, "b": 2}. But why not {"flag": "SomePair", "0": 1, "1": 2} or {"mark": "SomePair", "data": [1, 2]}? While those representations are isomorphic, it’s hard to know the right choice for the right use-case beforehand, as it depends on the consumer of the resulting JSON. We really don’t want to make an arbitrary choice on behalf of the user.
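
To make the arbitrariness concrete, here is a small Haskell sketch that writes out the three candidate encodings by hand. Everything in it is illustrative: Foo is fixed to Int payloads, NoPair stands in for the Nothing constructor (renamed to avoid clashing with the Prelude), the function names are made up, the payload-free encodings are one possible choice among several, and the JSON is built with plain string concatenation rather than a JSON library.

```haskell
-- A concrete instance of the Foo<T,U> example with both parameters set to Int.
data Foo = SomePair Int Int | NoPair

-- Encoding 1: a tag plus named fields.
encodeNamed :: Foo -> String
encodeNamed (SomePair a b) =
  "{\"tag\": \"SomePair\", \"a\": " ++ show a ++ ", \"b\": " ++ show b ++ "}"
encodeNamed NoPair = "{\"tag\": \"Nothing\"}"

-- Encoding 2: a tag plus positional keys.
encodePositional :: Foo -> String
encodePositional (SomePair a b) =
  "{\"flag\": \"SomePair\", \"0\": " ++ show a ++ ", \"1\": " ++ show b ++ "}"
encodePositional NoPair = "{\"flag\": \"Nothing\"}"

-- Encoding 3: a tag plus an array holding the payload.
encodeArray :: Foo -> String
encodeArray (SomePair a b) =
  "{\"mark\": \"SomePair\", \"data\": [" ++ show a ++ ", " ++ show b ++ "]}"
encodeArray NoPair = "{\"mark\": \"Nothing\", \"data\": []}"
```

All three functions carry the same information about a Foo value, yet a consumer written against one of them cannot read the output of the others. That is the choice a language with built-in ADT serialization would have to make for its users.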

Additionally, while ADTs are natural for a classical typed functional language, they might not entirely fit the configuration space. A datatype like enum Literal { String(String), Number(Number) } that can store either a string or a number is usually represented directly as an untagged union in a configuration, that is {"literal": 5} or {"literal": "hello"}, instead of the less natural tagged union (another name for ADTs) {"literal": {"tag": "Number", "value": 5}}.

This led us to look at (untagged) union types instead. Untagged unions have the advantage of not making any choice about the serialization: they aren’t a new data structure, as are ADTs, but rather new types (and contracts) to classify values that are already representable.

The road of union types

A union type is a type that accepts different alternatives. We’ll use the fictitious \/ type combinator to write a union in Nickel (| is commonly used elsewhere but it’s already taken in Nickel). Our previous example of a literal that can be either a string or a number would be {literal: Number \/ String}. Those types are broadly useful independently of ADTs. For example, JSON Schema features unions through the core combinator anyOf.

Our hope was to kill two birds with one stone by adding unions both as a way to better represent existing configuration schemas, but also as a way to emulate ADTs. Using unions lets users represent ADTs directly as plain records using their preferred serialization scheme. Together with flow-sensitive typing, we can get as expressive as ADTs while letting the user decide on the encoding. Here is an example in a hypothetical Nickel enhanced with unions and flow-sensitive typing:

let sum
  : {tag = 'SomePair, a : Number, b : Number} \/ {tag = 'Nothing}
    -> Number
  = match {
    {tag = 'SomePair, a, b} => a + b,
    {tag = 'Nothing} => 0,
  }

Using unions and flow-sensitive typing as ADTs is the approach taken by TypeScript, where the previous example would be:

type Foo = { tag: "SomePair"; a: number; b: number } | { tag: "Nothing" }

function sum(value: Foo): number {
  switch (value.tag) {
    case "SomePair":
      return value.a + value.b
    case "Nothing":
      return 0
  }
}

In Nickel, any type must have a contract counterpart. Alas, union and intersection contracts are hard (in fact, union types alone are also not a trivial feat to implement!). In the linked blog post, we hint at possible pragmatic solutions for union contracts that we finally got to implement for Nickel 1.8. While sufficient for practical union contracts, this is far from the general union types that could subsume ADTs. This puts a serious stop to the idea of using union types to represent ADTs.

What are ADTs really good for?

As we have been writing more and more Nickel, we realized that we have been missing ADTs a lot for library functions - typically the types enum Option<T> { Some(T), None } and enum Result<T,E> { Ok(T), Error(E) } - where we don’t care about serialization. Those ADTs are “internal” markers that wouldn’t leak out to the final exported configuration.

Here are a few motivating use-cases.

std.string.find

std.string.find is a function that searches for a substring in a string. Its current type is:

String
-> String
-> { matched : String, index : Number, groups : Array String }

If the substring isn’t found, {matched = "", index = -1, groups = []} is returned, which is error-prone if the consumer doesn’t defend against such values. We would like to return a proper ADT instead, such as Found {matched : String, index : Number, groups : Array String} or NotFound, which would make for a better and safer interface[1].
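To make the benefit concrete for readers coming from Haskell, here is a minimal sketch of the same idea. It is not Nickel's API: FindResult, findSubstring and describe are hypothetical names, the search is a plain substring search rather than whatever std.string.find does internally, and the groups field is left out for brevity.

```haskell
import Data.List (isPrefixOf, tails)

-- Hypothetical analogue of the proposed result type (no `groups` field).
data FindResult
  = Found {matched :: String, index :: Int}
  | NotFound
  deriving (Eq, Show)

-- Plain substring search: report the first position where `needle` occurs.
findSubstring :: String -> String -> FindResult
findSubstring needle haystack =
  case [i | (i, suffix) <- zip [0 ..] (tails haystack), needle `isPrefixOf` suffix] of
    i : _ -> Found {matched = needle, index = i}
    [] -> NotFound

-- The consumer has to handle NotFound explicitly; there is no -1 to forget about.
describe :: FindResult -> String
describe r = case r of
  Found {index = i} -> "found at " ++ show i
  NotFound -> "not found"

main :: IO ()
main = do
  putStrLn (describe (findSubstring "ck" "nickel")) -- found at 2
  putStrLn (describe (findSubstring "zz" "nickel")) -- not found
```

The point of the sketch is only that the "not found" case becomes a separate constructor the caller must match on, instead of a sentinel record that looks like a valid result.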

Contract definition

Contracts are a powerful validation system in Nickel. The ability to plug in your own custom contracts is crucial.

However, the general interface to define custom contracts can seem bizarre. Custom contracts need to set error reporting data on a special label value and use the exception-throwing-like std.contract.blame function. Here is a simplified definition of std.number.Nat which checks that a value is a natural number:

fun label value =>
  if std.typeof value == 'Number then
    if value % 1 == 0 && value >= 0 then
      value
    else
      let label = std.contract.label.with_message "not a natural" in
      std.contract.blame label
  else
    let label = std.contract.label.with_message "not a number" in
    std.contract.blame label

There are good (and bad) reasons for this situation, but if we had ADTs, we could cover most cases with an alternative interface where custom contracts return a Result<T,E>, which is simpler and more natural:

fun value =>
  if std.typeof value == 'Number then
    if value % 1 == 0 && value >= 0 then
      Ok
    else
      Error("not a natural")
  else
    Error("not a number")

Of course, we could just encode this using a record, but it’s just not as nice.
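For comparison, the Result-returning style is what a Haskell programmer would write with Either. The following is only an illustrative sketch under stated assumptions: Value is a hypothetical stand-in for a dynamically typed Nickel value, and natContract is a made-up name, not part of any Nickel or Haskell library.

```haskell
-- Hypothetical stand-in for a dynamically typed configuration value.
data Value = VNumber Double | VString String
  deriving (Eq, Show)

-- A contract in "Result" style: Right () on success, Left message on failure.
natContract :: Value -> Either String ()
natContract (VNumber n)
  | n >= 0 && n == fromIntegral (floor n :: Integer) = Right ()
  | otherwise = Left "not a natural"
natContract _ = Left "not a number"

main :: IO ()
main = do
  print (natContract (VNumber 3))   -- Right ()
  print (natContract (VNumber 2.5)) -- Left "not a natural"
  print (natContract (VString "x")) -- Left "not a number"
```

No label threading and no exception-like call is needed: the error message is just data carried by the failing branch.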

Let it go, let it go!

The list of other examples of using ADTs to make libraries nicer is endless.

Thus, for the first time, we decided to introduce a native data structure that isn’t serializable.

Note that this doesn’t break any existing code and is forward-compatible with making ADTs serializable in the future, should we change our mind and settle on one particular encoding. Besides, another feature is being explored independently to make serialization more customizable through metadata, which would let users plug in custom (de)serializers for ADTs easily.

Ok, let’s add the good old-fashioned ADTs to Nickel!

The design

Structural vs nominal

In fact, we won’t exactly add the old-fashioned version. ADTs are traditionally implemented in their nominal form.

A nominal type system (such as that of C, Rust, Haskell, Java, etc.) decides if two types are equal based on their name and definition. For example, values of enum Alias1 { Value(String) } and enum Alias2 { Value(String) } are entirely interchangeable in practice, but Rust still doesn’t accept Alias1::Value(s) where an Alias2 is expected, because those types have distinct definitions. Similarly, you can’t swap a class for another in Java just because they have exactly the same fields and methods.

A structural type system, on the other hand, only cares about the shape of data. TypeScript has a structural type system. For example, the types interface Ball { diameter: number; } and interface Sphere { diameter: number; } are entirely interchangeable, and {diameter: 42} is both a Ball and a Sphere. Some languages, like OCaml[2] or Go[3], mix both.

Nickel’s current type system is structural because it’s better equipped to handle arbitrary JSON-like data. Because ADTs aren’t serializable, this consideration doesn’t weigh as much for our motivating use-cases, meaning ADTs could still be either nominal or structural.

However, nominal types aren’t really usable without some way of exporting and importing type definitions, which Nickel currently lacks. It sounds more natural to go for structural ADTs, which seamlessly extend the existing enums and would overall fit better with the rest of the type system.

Structural ADTs look like the better choice for Nickel. We can build, typecheck, and match on ADTs locally without having to know or to care about any type declaration. Structural ADTs are a natural extension of Nickel (structural) enums, syntactically, semantically, and on the type level, as we will see.

While less common, structural ADTs do exist in the wild and they are pretty cool. OCaml has both nominal ADTs and structural ADTs, the latter being known as polymorphic variants. They are an especially powerful way to represent a non-trivial hierarchy of overlapping data types, such as abstract syntax trees or sets of error values.

Syntax

C-style enums are just a special case of ADTs, namely ADTs where constructors don’t have any argument. The dual conclusion is that ADTs are enums with arguments. We thus write the ADT Some("hello") as an enum with an argument in Nickel: 'Some "hello".

We apply the same treatment to types. [| 'Some, 'None |] was a valid enum type, and now [| 'Some String, 'None |] is also a valid type (which would correspond to Rust’s Option<String>).

There is a subtlety here: what should be the type inferred for 'Some now? In a structural type system, 'Some is just a free-standing symbol. The typechecker can’t know if it’s a constant that will stay as it is - and thus has the type [| 'Some |] - or a constructor that will be eventually applied, of type a -> [| 'Some a |]. This difficulty just doesn’t exist in a nominal type system like Rust: there, Option::Some refers to a unique, known and fixed ADT constructor that is known to require precisely one argument.

To make it work, 'Ok 42 isn’t actually a normal function application in Nickel: it’s an ADT constructor application, and it’s parsed differently. We just repurpose the function application syntax[4] in this special case. 'Ok isn’t a function, and let x = 'Ok in x 42 is an error (applying something that isn’t a function).

You can still recover Rust-style constructors that can be applied by defining a function (eta-expanding, in the functional jargon): let ok = fun x => 'Ok x.

We restrict ADTs to a single argument. You can use a record to emulate multiple arguments: 'Point {x = 1, y = 2}.

ADTs also come with pattern matching. The basic switch that was match is now a powerful pattern matching construct, with support for ADTs but also arrays, records, constants, wildcards, or-patterns and guards (if side-conditions).

Typechecking

Typechecking structural ADTs is a bit different from nominal ADTs. Take the simple example (the enclosing : _ annotation is required to make the example statically typed in Nickel)

(
  let data = 'Ok 42 in
  let process = match {
    'Ok x => x + 1,
    'Error => 0,
  } in

  process data
) :  _

process is inferred to have type [| 'Ok Number, 'Error |] -> Number. What type should we infer for data = 'Ok 42? The most obvious one is [| 'Ok Number |]. But then [| 'Ok Number |] and [| 'Ok Number, 'Error |] don’t match and process data doesn’t typecheck! This is silly, because this example should be perfectly valid.

One possible solution is to introduce subtyping, which is able to express this kind of inclusion relation: here that [| 'Ok Number |] is included in [| 'Ok Number, 'Error |]. However, subtyping has some defects and is a whole can of worms when mixed with polymorphism (which Nickel has).

Nickel instead relies on another approach called row polymorphism, which is the ability to abstract over not just a type, as in classical polymorphism, but a whole piece of an enum type. Row polymorphism is well studied in the literature, and is for example implemented in PureScript. Nickel already features row polymorphism for basic enum types and for record types.

Here is how it works:

let process : forall a. [| 'Ok Number, 'Error; a |] -> Number = match {
  'Ok x => x + 1,
  'Error => 0,
  _ => -1,
} in

process 'Other

Because there’s a catch-all case _ => -1, the type of process is polymorphic, expressing that it can handle any other variant besides 'Ok Number and 'Error (this isn’t entirely true: 'Ok String is forbidden for example, because it can’t be distinguished from 'Ok Number). Here, a can be substituted for a subsequence of an enum type, such as 'Foo Bool, 'Bar {x : Number}.

Equipped with row polymorphism, we can infer the type forall a. [| 'Ok Number; a |] for 'Ok 42[5]. When typechecking process data in the original example, a will be instantiated to the single row 'Error and the example typechecks. You can learn more about structural ADTs and row polymorphism in the corresponding section of the Nickel user manual.

Conclusion

While ADTs are part of the basic package of functional languages, Nickel didn’t have them until relatively recently because of peculiarities of the design of a configuration language. After exploring the route of union types, which came to a dead-end, we settled on a structural version of ADTs that turns out to be a natural extension of the language and didn’t require too much new syntax or concepts.

ADTs already prove useful to write cleaner and more concise code, and to improve the interface of libraries, even in a gradually typed configuration language. Some concrete usages can be found in try_fold_left and validators already.


  1. Unfortunately, we can’t change the type of std.string.find without breaking existing programs (at least not until a Nickel 2.0), but this use-case still applies to external libraries or future stdlib functions.
  2. In OCaml, Objects, polymorphic variants and modules are structural while records and ADTs are nominal.
  3. In Go, interfaces are structural while structs are nominal.
  4. Repurposing application is theoretically backward incompatible because 'Ok 42 was already valid Nickel syntax before 1.5, but it was meaningless (an enum applied to a constant) and would always error out at runtime, so it’s ok.
  5. In practice, we infer a simpler type [| 'Ok Number; ?a |] where ?a is a unification variable which can still have limitations. Interestingly, we decided early on to not perform automatic generalization, as opposed to the ML tradition, for reasons similar to the ones exposed here. Doing so, we get (predicative) higher-rank polymorphism almost for free, while it’s otherwise quite tricky to combine with automatic generalization. It turned out to pay off in the case of structural ADTs, because it makes it possible to side-step those usual enum types inclusion issues (widening) by having the user add more polymorphic annotations. Or we could even actually infer the polymorphic type forall a. [| 'Ok Number; a |] for literals.

September 05, 2024 12:00 AM

September 04, 2024

in Code

Seven Levels of Type Safety in Haskell: Lists

One thing I always appreciate about Haskell is that you can often choose the level of type-safety you want to work at. Haskell offers tools to be able to work at both extremes, whereas most languages only offer some limited part of the spectrum. Picking the right level often comes down to being consciously aware of the benefits/drawbacks/unique advantages to each.

So, here is a rundown of seven “levels” of type safety that you can operate at when working with the ubiquitous linked list data type, and how to use them! I genuinely believe all of these are useful (or useless) in their own different circumstances, even though the “extremes” at both ends are definitely pushing the limits of the language.

This post is written for an intermediate Haskeller, who is already familiar with ADTs and defining their own custom list type like data List a = Nil | Cons a (List a). But, be advised that most of the techniques discussed in this post (especially at both extremes) are considered esoteric at best and harmful at worst for most actual real-world applications. The point of this post is more to inspire the imagination and demonstrate principles that could be useful to apply in actual code, and not to present actual useful data structures.

All of the code here is available online here, and if you check out the repo and run nix develop you should be able to load them all in ghci as well:

$ cd code-samples/type-levels
$ nix develop
$ ghci
ghci> :load Level1.hs

Level 1: Could be anything

Code available here

What’s the moooost type-unsafe you can be in Haskell? Well, we can make a “black hole” data type that could be anything:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L12-L13

data Any :: Type where
  MkAny :: a -> Any

(This data type declaration is written using GADT syntax, and the name was chosen because it resembles the Any type in base)

So you can have values:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L15-L22

anyInt :: Any
anyInt = MkAny (8 :: Int)

anyBool :: Any
anyBool = MkAny True

anyList :: Any
anyList = MkAny ([1, 2, 3] :: [Int])

A value of any type can be given to MkAny, and the resulting type will have type Any.

However, this type is truly a black hole; you can’t really do anything with the values inside it because of parametric polymorphism: you must treat any value inside it in a way that is compatible with a value of any type. But there aren’t too many useful things you can do with something in a way that is compatible with a value of any type (things like, id :: a -> a, const 3 :: a -> Int). In the end, it’s essentially isomorphic to unit ().

However, this isn’t really how dynamic types work. In other languages, we are at least able to query and interrogate a type for things we can do with it using runtime reflection. To get there, we can instead allow some sort of witness on the type of the value. Here’s Sigma, where Sigma p is a value a paired with some witness p a:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L24-L25

data Sigma :: (Type -> Type) -> Type where
  MkSigma :: p a -> a -> Sigma p

And the most classic witness is TypeRep from base, which is a witness that lets you “match” on the type.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L27-L32

showIfBool :: Sigma TypeRep -> String
showIfBool (MkSigma tr x) = case testEquality tr (typeRep @Bool) of
  Just Refl -> case x of -- in this branch, we know x is a Bool
    False -> "False"
    True -> "True"
  Nothing -> "Not a Bool"

This uses type application syntax, @Bool, that lets us pass in the type Bool to the function typeRep :: Typeable a => TypeRep a.

Now we can use TypeRep’s interface to “match” (using testEquality) on if the value inside is a Bool. If the match works (and we get Just Refl) then we can treat x as a Bool in that case. If it doesn’t (and we get Nothing), then we do what we would want to do otherwise.

ghci> let x = MkSigma typeRep True
ghci> let y = MkSigma typeRep (4 :: Int)
ghci> showIfBool x
"True"
ghci> showIfBool y
"Not a Bool"

This pattern is common enough that base ships the Data.Dynamic module, whose Dynamic type is essentially Sigma TypeRep, and where testEquality is replaced with that module’s fromDynamic:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L40-L45

showIfBoolDynamic :: Dynamic -> String
showIfBoolDynamic dyn = case fromDynamic dyn of
  Just x -> case x of -- in this branch, we know x is a Bool
    False -> "False"
    True -> "True"
  Nothing -> "Not a Bool"

To make our lives easier in the future, let’s write a version of fromDynamic for our Sigma TypeRep:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L47-L53

castSigma :: TypeRep a -> Sigma TypeRep -> Maybe a
castSigma tr (MkSigma tr' x) = case testEquality tr tr' of
  Just Refl -> Just x
  Nothing -> Nothing

castSigma' :: Typeable a => Sigma TypeRep -> Maybe a
castSigma' = castSigma typeRep

But the reason why I’m presenting the more generic Sigma instead of the specific type Dynamic = Sigma TypeRep is that you can swap out TypeRep to get other interesting types. For example, if you had a witness of showability:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L55-L62

data Showable :: Type -> Type where
  WitShowable :: Show a => Showable a

showableInt :: Sigma Showable
showableInt = MkSigma WitShowable (3 :: Int)

showableBool :: Sigma Showable
showableBool = MkSigma WitShowable True

(This type is related to Dict Show from the constraints library; it’s technically Compose Dict Show)

And now we have a type Sigma Showable that’s kind of “not-so-black”: we can at least use show on it:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L64-L65

showSigma :: Sigma Showable -> String
showSigma (MkSigma WitShowable x) = show x -- here, we know x is Show

ghci> let x = MkSigma WitShowable True
ghci> let y = MkSigma WitShowable 4
ghci> showSigma x
"True"
ghci> showSigma y
"4"

This is the “existential typeclass antipattern”[1], but since we are talking about different ways we can push the type system, it’s probably worth mentioning. In particular, Show is a silly typeclass to use in this context because a Sigma Showable is equivalent to just a String: once you match on the constructor to get the value, the only thing you can do with the value is show it anyway.

One fun thing we can do is provide a “useless witness”, like Proxy:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L67-L70

data Proxy a = Proxy

uselessBool :: Sigma Proxy
uselessBool = MkSigma Proxy True

So a value like MkSigma Proxy True :: Sigma Proxy is truly a useless data type (basically our Any from before), since we know that MkSigma contains some value of some type, but there’s no witness to give us any clue on how we can use it. A Sigma Proxy is isomorphic to ().

On the other extreme, we can use a witness to constrain the value to only be a specific type, like IsBool:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L72-L76

data IsBool :: Type -> Type where
  ItsABool :: IsBool Bool

justABool :: Sigma IsBool
justABool = MkSigma ItsABool False

So you can have a value like MkSigma ItsABool True :: Sigma IsBool, or MkSigma ItsABool False, but MkSigma ItsABool 2 will not typecheck; remember, to make a Sigma, you need a p a and an a. ItsABool :: IsBool Bool, so the a you put in must be Bool to match. Sigma IsBool is essentially isomorphic to Bool.
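The isomorphism claim can be spelled out as a pair of conversion functions. This is a small sketch rather than code from the post’s repository: toBool and fromBool are hypothetical names, and the Sigma and IsBool definitions are repeated so the snippet stands alone.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

data Sigma :: (Type -> Type) -> Type where
  MkSigma :: p a -> a -> Sigma p

data IsBool :: Type -> Type where
  ItsABool :: IsBool Bool

-- Matching on ItsABool tells the typechecker that the hidden type is Bool.
toBool :: Sigma IsBool -> Bool
toBool (MkSigma ItsABool b) = b

-- Going back only needs the one available witness.
fromBool :: Bool -> Sigma IsBool
fromBool = MkSigma ItsABool

main :: IO ()
main = print (map (toBool . fromBool) [False, True]) -- [False,True]
```

Since ItsABool is the only constructor of IsBool, the two functions are inverses of each other, which is what “essentially isomorphic” means here.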

There’s a general version of this too, (:~:) a (from Data.Type.Equality in base). (:~:) Bool is our IsBool earlier. Sigma ((:~:) a) is essentially exactly a…basically bringing us incidentally back to complete type safety? Weird. Anyway.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level1.hs#L78-L79

justAnInt :: Sigma ((:~:) Int)
justAnInt = MkSigma Refl 10 -- Refl :: Int :~: Int

I think one interesting thing to see here is that being “type-unsafe” in Haskell can be much less convenient than doing something similar in a dynamically typed language like Python. The Python ecosystem is designed around runtime reflection and inspection for properties and interfaces, whereas the dominant implementation of interfaces in Haskell (typeclasses) doesn’t gel with this. There’s no runtime typeclass instantiation: we can’t pattern match on a TypeRep and check if it’s an instance of Ord or not.

That’s why I don’t fancy those memes/jokes about how dynamically typed languages are just “static types with a single type”. The actual way you use those types (and the ecosystem built around them) lend themselves to different ergonomics, and the reductionist take doesn’t quite capture that nuance.

Level 2: Heterogeneous List

Code available here

The lowest level of safety in which a list might be useful is the dynamically heterogeneous list. This is the level where lists (or “arrays”) live in most dynamic languages.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level2.hs#L12-L12

type HList p = [Sigma p]

We tag values with a witness p for the same reason as before: if we don’t provide some type of witness, our type is useless.

The “dynamically heterogeneous list of values of any type” is HList TypeRep. This is somewhat similar to how functions with positional arguments work in a dynamic language like JavaScript. For example, here’s a function that connects to a host (String), optionally taking a port (Int) and a method (Method).

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level2.hs#L14-L33

data Method = HTTP | HTTPS

indexHList :: Int -> HList p -> Maybe (Sigma p)
indexHList _ [] = Nothing -- any index into an empty list is out of range
indexHList 0 (x : _) = Just x
indexHList n (_ : xs) = indexHList (n - 1) xs

mkConnection :: HList TypeRep -> IO ()
mkConnection args = doTheThing host port method
  where
    host :: Maybe String
    host = castSigma' =<< indexHList 0 args
    port :: Maybe Int
    port = castSigma' =<< indexHList 1 args
    method :: Maybe Method
    method = castSigma' =<< indexHList 2 args

Of course, this would probably be better expressed in Haskell as a function of type Maybe String -> Maybe Int -> Maybe Method -> IO (). But maybe this could be useful in a situation where you would want to offer the ability to take arguments in any order? We could “find” the first value of a given type:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level2.hs#L35-L36

findValueOfType :: Typeable a => HList TypeRep -> Maybe a
findValueOfType = listToMaybe . mapMaybe castSigma'

Then we could write:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level2.hs#L39-L47

mkConnectionAnyOrder :: HList TypeRep -> IO ()
mkConnectionAnyOrder args = doTheThing host port method
  where
    host :: Maybe String
    host = findValueOfType args
    port :: Maybe Int
    port = findValueOfType args
    method :: Maybe Method
    method = findValueOfType args

But is this a good idea? Probably not.
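One concrete reason for the skepticism: looking arguments up by type stops working as soon as two arguments share a type. The sketch below repeats the relevant definitions so it stands alone; the args list (a port followed by a timeout, both Int) is a hypothetical example, not something from the post’s repository.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Maybe (listToMaybe, mapMaybe)
import Data.Type.Equality (testEquality, (:~:) (Refl))
import Type.Reflection (TypeRep, Typeable, typeRep)

data Sigma :: (Type -> Type) -> Type where
  MkSigma :: p a -> a -> Sigma p

type HList p = [Sigma p]

castSigma :: TypeRep a -> Sigma TypeRep -> Maybe a
castSigma tr (MkSigma tr' x) = case testEquality tr tr' of
  Just Refl -> Just x
  Nothing -> Nothing

findValueOfType :: Typeable a => HList TypeRep -> Maybe a
findValueOfType = listToMaybe . mapMaybe (castSigma typeRep)

-- Hypothetical argument list: a port and a timeout, both Ints.
args :: HList TypeRep
args = [MkSigma typeRep (8080 :: Int), MkSigma typeRep (30 :: Int)]

main :: IO ()
main = do
  print (findValueOfType args :: Maybe Int)    -- Just 8080
  print (findValueOfType args :: Maybe String) -- Nothing
```

The second Int can never be reached through findValueOfType, so an “any order” interface of this kind only works while every argument has a distinct type.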

Anyway, one very common usage of this type is for “extensible” systems that let you store components of different types in a container, as long as they all support some common interface (i.e., the widgets system from the Luke Palmer post).

For example, we could have a list of any item as long as the item is an instance of Show: that’s HList Showable!

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level2.hs#L52-L55

showAll :: HList Showable -> [String]
showAll = map showSigma
  where
    showSigma (MkSigma WitShowable x) = show x

ghci> let xs = [MkSigma WitShowable 1, MkSigma WitShowable True]
ghci> showAll xs
["1","True"]

Again, Show is a bad typeclass to use for this because we might as well be storing [String]. But for fun, let’s imagine some other things we could fill in for p. If we use HList Proxy, then we basically don’t have any witness at all. We can’t use the values in the list in any meaningful way; HList Proxy is essentially the same as Natural, since the only information is the length.

If we use HList IsBool, we basically have [Bool], since every item must be a Bool! In general, HList ((:~:) a) is the same as [a].
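Both correspondences can be written down directly. The following is a sketch under a few assumptions: it uses Proxy from Data.Proxy instead of the post’s own Proxy definition, and the function names hlistProxyToNatural, hlistEqToList and listToHListEq are hypothetical.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import Data.Type.Equality ((:~:) (Refl))
import Numeric.Natural (Natural)

data Sigma :: (Type -> Type) -> Type where
  MkSigma :: p a -> a -> Sigma p

type HList p = [Sigma p]

-- With a Proxy witness, the length is the only observable information.
hlistProxyToNatural :: HList Proxy -> Natural
hlistProxyToNatural = fromIntegral . length

-- With an equality witness, every element is known to be an `a`.
hlistEqToList :: HList ((:~:) a) -> [a]
hlistEqToList = map unwrap
  where
    unwrap :: Sigma ((:~:) b) -> b
    unwrap (MkSigma Refl x) = x

listToHListEq :: [a] -> HList ((:~:) a)
listToHListEq = map (MkSigma Refl)

main :: IO ()
main = do
  print (hlistProxyToNatural [MkSigma Proxy True, MkSigma Proxy 'x']) -- 2
  print (hlistEqToList (listToHListEq [1, 2, 3 :: Int]))              -- [1,2,3]
```

Note that hlistProxyToNatural accepts a list mixing a Bool and a Char, while listToHListEq can only ever produce lists whose elements share one type.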

Level 3: Homogeneous Dynamic List

Code available here

A next level of type safety we can add is to ensure that all elements in the list are of the same type. This adds a layer of usefulness because there are a lot of things we might want to do with the elements of a list that are only possible if they are all of the same type.

First of all, let’s clarify a subtle point here. It’s very easy in Haskell to consume lists where all elements are of the same (but not necessarily known) type. Functions like sum :: Num a => [a] -> a and sort :: Ord a => [a] -> [a] do that. This is “polymorphism”, where the function is written to not worry about the type, and the ultimate caller of the function must pick the type they want to use with it. For the sake of this discussion, we aren’t talking about consuming values — we’re talking about producing and storing values where the producer (and not the consumer) controls the type variable.

To do this, we can flip the witness to outside the list:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L17-L18

data SomeList :: (Type -> Type) -> Type where
  MkSomeList :: p a -> [a] -> SomeList p

We can write some meaningful predicates on this list. For example, we can check if it is monotonic (the items increase in order):

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L21-L32

data Comparable :: Type -> Type where
  WitOrd :: Ord a => Comparable a

monotonic :: Ord a => [a] -> Bool
monotonic [] = True
monotonic (x : xs) = go x xs
  where
    go _ [] = True -- the last element has nothing left to compare against
    go y (z : zs) = (y <= z) && go z zs

monotonicSomeList :: SomeList Comparable -> Bool
monotonicSomeList (MkSomeList WitOrd xs) = monotonic xs

This is fun, but, as mentioned before, monotonicSomeList doesn’t have any advantage over monotonic, because the caller determines the type. What would be more motivating here is a function that produces “any sortable type”, and the caller has to use it in a way generic over all sortable types. For example, a database API might let you query a database for a column of values, but you don’t know ahead of time what the exact type of that column is. You only know that it is “some sortable type”. In that case, a SomeList could be useful.

For a contrived one, let’s think about pulling such a list from IO:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L34-L54

getItems :: IO (SomeList Comparable)
getItems = do
  putStrLn "would you like to provide int or bool or string?"
  ans <- getLine
  case map toLower ans of
    "int" -> MkSomeList WitOrd <$> replicateM 3 (readLn @Int)
    "bool" -> MkSomeList WitOrd <$> replicateM 3 (readLn @Bool)
    "string" -> MkSomeList WitOrd <$> replicateM 3 getLine
    _ -> throwIO $ userError "no"

getAndAnalyze :: IO ()
getAndAnalyze = do
  MkSomeList WitOrd xs <- getItems
  putStrLn $ "Got " ++ show (length xs) ++ " items."
  let isMono = monotonic xs
      isRevMono = monotonic (reverse xs)
  when isMono $
    putStrLn "The items are monotonic."
  when (isMono && isRevMono) $ do
    putStrLn "The items are monotonic both directions."
    putStrLn "This means the items are all identical."

Consider also an example where we process items differently based on what type they have:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L62-L68

processList :: SomeList TypeRep -> Bool
processList (MkSomeList tr xs)
  | Just Refl <- testEquality tr (typeRep @Bool) = and xs
  | Just Refl <- testEquality tr (typeRep @Int) = sum xs > 50
  | Just Refl <- testEquality tr (typeRep @Double) = sum xs > 5.0
  | Just Refl <- testEquality tr (typeRep @String) = "hello" `elem` xs
  | otherwise = False

(That’s pattern guard syntax, if you were wondering)

In this specific situation, using a closed ADT of all the types you’d actually want is probably preferred (like data Value = VBool Bool | VInt Int | VDouble Double | VString String), since we only ever get one of four different types. Using Comparable like this gives you a completely open type that can take any instance of Ord, and using TypeRep gives you a completely open type that can take literally anything.
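As a rough sketch of what the closed alternative could look like while keeping the list homogeneous, one option is to put one constructor per supported element type around the whole list. ValueList and processValueList are hypothetical names chosen for this illustration; the branch logic mirrors processList above.

```haskell
-- Hypothetical closed counterpart of SomeList TypeRep:
-- one constructor per supported element type, wrapping the whole list.
data ValueList
  = VBools [Bool]
  | VInts [Int]
  | VDoubles [Double]
  | VStrings [String]

-- Same decisions as processList, but by ordinary pattern matching.
processValueList :: ValueList -> Bool
processValueList vl = case vl of
  VBools xs -> and xs
  VInts xs -> sum xs > 50
  VDoubles xs -> sum xs > 5.0
  VStrings xs -> "hello" `elem` xs

main :: IO ()
main = do
  print (processValueList (VInts [20, 40]))   -- True
  print (processValueList (VStrings ["bye"])) -- False
```

The trade-off is the usual one: the compiler can check that every supported type is handled and there is no fallback branch, but adding a new element type means editing the data type.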

This pattern is overall similar to how lists are often used in practice for dynamic languages: often when we use lists in dynamically typed situations, we expect them all to have items of the same type or interface. However, using lists this way (in a language without type safety) makes it really tempting to hop down into Level 2, where you start throwing “alternatively typed” things into your list, as well, for convenience. And then the temptation comes to also hop down to Level 1 and throw a null in every once in a while. All of a sudden, any consumers must now check the type of every item, and a lot of things are going to start needing unit tests.

Now, let’s talk a bit about ascending and descending between levels. In the general case we don’t have much to work with, but let’s assume our constraint is TypeRep here, so we can match for type equality.

We can move from Level 3 to Level 2 by moving the TypeRep into the values of the list, and we can move from Level 3 to Level 1 by converting our TypeRep a into a TypeRep [a]:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L75-L86

someListToHList :: SomeList TypeRep -> HList TypeRep
someListToHList (MkSomeList tr xs) = MkSigma tr <$> xs

someListToSigma :: SomeList TypeRep -> Sigma TypeRep
someListToSigma (MkSomeList tr xs) = MkSigma (typeRep @[] `App` tr) xs

App, used here as a constructor, lets us combine TypeReps: App :: TypeRep f -> TypeRep a -> TypeRep (f a).

Going the other way around is trickier. For HList, we don’t even know if every item has the same type, so we can only successfully move up if every item has the same type. So, first we get the typeRep for the first value, and then cast the other values to be the same type if possible:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L70-L73

hlistToSomeList :: HList TypeRep -> Maybe (SomeList TypeRep)
hlistToSomeList = \case
  [] -> Nothing
  MkSigma tr x : xs -> MkSomeList tr . (x :) <$> traverse (castSigma tr) xs

To go from Sigma TypeRep, we first need to match the TypeRep as some f a application using the App pattern…then we can check if f is [] (list), then we can create a SomeList with the TypeRep a. But, testEquality can only be called on things of the same kind, so we have to verify that f has kind Type -> Type first, so that we can even call testEquality on f and []! Phew! Dynamic types are hard!

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level3.hs#L78-L83

sigmaToHList :: Sigma TypeRep -> Maybe (SomeList TypeRep)
sigmaToHList (MkSigma tr xs) = do
  App tcon telem <- Just tr
  Refl <- testEquality (typeRepKind telem) (typeRep @Type)
  Refl <- testEquality tcon (typeRep @[])
  pure $ MkSomeList telem xs

Level 4: Homogeneous Typed List

Ahh, now right in the middle, we’ve reached Haskell’s ubiquitous list type! It is essentially:

data List :: Type -> Type where
    Nil  :: List a
    Cons :: a -> List a -> List a

I don’t have too much to say here, other than to acknowledge that this is truly a “sweet spot” in terms of safety vs. unsafety and usability. This simple List a / [a] type has so many benefits from type-safety:

  • It lets us write functions that can meaningfully say that the input and result types are the same, like take :: Int -> [a] -> [a]
  • It lets us write functions that can meaningfully link lists and the items in the list, like head :: [a] -> a and replicate :: Int -> a -> [a].
  • It lets us write functions that can meaningfully state relationships between input and results, like map :: (a -> b) -> [a] -> [b]
  • We can require two input lists to have the same type of items, like (++) :: [a] -> [a] -> [a]
  • We can express complex relationships between inputs and outputs, like zipWith :: (a -> b -> c) -> [a] -> [b] -> [c].

The property of being able to state and express relationships between the values of input lists and output lists and the items in those lists is extremely powerful, and also extremely ergonomic to use in Haskell. It can be argued that Haskell, as a language, was tuned explicitly to be used with the least friction at this exact level of type safety. Haskell is a “Level 4 language”.

Level 5: Fixed-size List

Code available here

From here on, we aren’t going to be “building up” linearly on safety, but rather showing three structural type safety mechanisms of increasing strength and complexity.

For Level 5, we’re not going to try to enforce anything on the contents of the list, but we can try to enforce something on the spine of the list: the number of items!

To me, this level still feels very natural in Haskell to write in, although in terms of usability we are starting to bump into some of the things Haskell is lacking for higher type safety ergonomics. I’ve talked about fixed-length vector types in depth before, so this is going to be a high-level view contrasting this level with the others.2

The essential concept is to introduce a phantom type, a type parameter that doesn’t do anything other than indicate something that we can use in user-space. Here we will create a type that structurally encodes the natural numbers 0, 1, 2…:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L15-L15

data Nat = Z | S Nat

So, Z will represent zero, S Z will represent one, S (S Z) will represent two, etc. We want to create a type Vec n a, where n will be a type of kind Nat (promoted using DataKinds, which lets us use Z and S as type constructors), representing a linked list with n elements of type a.

We can define Vec in a way that structurally matches how Nat is constructed, which is the key to making things work nicely:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L17-L21

data Vec :: Nat -> Type -> Type where
  VNil :: Vec Z a
  (:+) :: a -> Vec n a -> Vec (S n) a

infixr 5 :+

This is offered in the vec library. Here are some example values:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L23-L33

zeroItems :: Vec Z Int
zeroItems = VNil

oneItem :: Vec (S Z) Int
oneItem = 1 :+ VNil

twoItems :: Vec (S (S Z)) Int
twoItems = 1 :+ 2 :+ VNil

threeItems :: Vec (S (S (S Z))) Int
threeItems = 1 :+ 2 :+ 3 :+ VNil

Note two things:

  1. 1 :+ 2 :+ VNil gets automatically type-inferred to be a Vec (S (S Z)) a, because every application of :+ adds an S to the phantom type.
  2. There is only one way to construct a Vec (S (S Z)) a: by using :+ twice. That means that such a value is a list of exactly two items.
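
One immediate payoff of these two points, sketched below with hypothetical helpers vhead and vtail (they are not part of the code samples): since a Vec (S n) a can only be built with :+, we can write a head function that never fails and needs no Maybe.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

-- | Total: the type rules out VNil, so there is no missing case.
vhead :: Vec ('S n) a -> a
vhead (x :+ _) = x

-- | The tail is exactly one item shorter, and the type says so.
vtail :: Vec ('S n) a -> Vec n a
vtail (_ :+ xs) = xs

secondOfThree :: Int
secondOfThree = vhead (vtail (1 :+ 2 :+ 3 :+ VNil))

-- vhead VNil would be rejected by the type checker,
-- because 'Z does not match 'S n.
```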

However, the main benefit of this system is not so you can create a two-item list…just use tuples or data V2 a = V2 a a from linear for that. No, the main benefit is that you can now encode how arguments in your functions relate to each other with respect to length.

For example, the type alone of map :: (a -> b) -> [a] -> [b] does not tell you that the length of the result list is the same as the length of the input list. However, consider vmap :: (a -> b) -> Vec n a -> Vec n b. Here we see that the output list must have the same number of items as the input list, and it’s enforced right there in the type signature!

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L35-L38

vmap :: (a -> b) -> Vec n a -> Vec n b
vmap f = \case
  VNil -> VNil
  x :+ xs -> f x :+ vmap f xs

And how about zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]? It’s not clear or obvious at all how the final list’s length depends on the input lists’ lengths. However, a vzipWith would ensure the input lengths are the same size and that the output list is also the same length:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L40-L45

vzipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
vzipWith f = \case
  VNil -> \case
    VNil -> VNil
  x :+ xs -> \case
    y :+ ys -> f x y :+ vzipWith f xs ys

Note that both of the inner pattern matches are known by GHC to be exhaustive: if it knows that the first list is VNil, then it knows that n ~ Z, so the second list has to also be VNil. Thanks GHC!
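
As a usage sketch, here is a dot product built on vzipWith. The helpers vsum and dot are hypothetical names added for illustration; the point is that the two arguments of dot are forced to have the same length:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

vzipWith :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
vzipWith f = \case
  VNil -> \case
    VNil -> VNil
  x :+ xs -> \case
    y :+ ys -> f x y :+ vzipWith f xs ys

-- | Hypothetical helper: add up all the items.
vsum :: Num a => Vec n a -> a
vsum = \case
  VNil -> 0
  x :+ xs -> x + vsum xs

-- | Hypothetical helper: both inputs must share the same n.
dot :: Num a => Vec n a -> Vec n a -> a
dot xs ys = vsum (vzipWith (*) xs ys)

example :: Int
example = dot (1 :+ 2 :+ 3 :+ VNil) (4 :+ 5 :+ 6 :+ VNil)

-- dot (1 :+ VNil) (1 :+ 2 :+ VNil) does not type check:
-- the lengths 'S 'Z and 'S ('S 'Z) do not unify.
```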

From here on out, we’re now always going to assume that GHC’s exhaustiveness checker is on, so we always handle every branch that GHC tells us is necessary, and skip handling branches that GHC tells us are unnecessary (through compiler warnings).

We can even express more complicated relationships with type families (type-level “functions”):

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L47-L63

type family Plus (x :: Nat) (y :: Nat) where
  Plus Z y = y
  Plus (S z) y = S (Plus z y)

type family Times (x :: Nat) (y :: Nat) where
  Times Z y = Z
  Times (S z) y = Plus y (Times z y)

vconcat :: Vec n a -> Vec m a -> Vec (Plus n m) a
vconcat = \case
  VNil -> id
  x :+ xs -> \ys -> x :+ vconcat xs ys

vconcatMap :: (a -> Vec m b) -> Vec n a -> Vec (Times n m) b
vconcatMap f = \case
  VNil -> VNil
  x :+ xs -> f x `vconcat` vconcatMap f xs

Note that all of these only work in GHC because the structure of the functions themselves matches exactly the structure of the type families. If you follow the pattern matches in the functions, note that they match exactly with the different equations of the type family.
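
To see the type family doing arithmetic, here is a small sketch that asks GHC to confirm 1 + 2 = 3 at the type level. The names onePlusTwo, joined and vtoList are hypothetical and only there for the demonstration:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Type.Equality ((:~:) (..))

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

type family Plus (x :: Nat) (y :: Nat) :: Nat where
  Plus 'Z y = y
  Plus ('S z) y = 'S (Plus z y)

vconcat :: Vec n a -> Vec m a -> Vec (Plus n m) a
vconcat = \case
  VNil -> id
  x :+ xs -> \ys -> x :+ vconcat xs ys

-- | GHC unfolds the equations of Plus, so Refl type checks here.
onePlusTwo :: Plus ('S 'Z) ('S ('S 'Z)) :~: 'S ('S ('S 'Z))
onePlusTwo = Refl

-- | The result length is computed by Plus and checked against the signature.
joined :: Vec ('S ('S ('S 'Z))) Int
joined = vconcat (1 :+ VNil) (2 :+ 3 :+ VNil)

-- | Hypothetical helper: forget the length again.
vtoList :: Vec n a -> [a]
vtoList = \case
  VNil -> []
  x :+ xs -> x : vtoList xs
```

By contrast, GHC cannot see on its own that Plus n Z is the same as n for an abstract n, since no equation of Plus fires until the first argument is known. That is the practical meaning of the functions having to follow the structure of the type family.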

Famously, we can totally index into fixed-length lists, in a way that indexing will not fail. To do that, we have to define a type Fin n, which represents an index into a list of length n. So, Fin (S (S (S Z))) will be either 0, 1, or 2, the three possible indices of a three-item list.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L65-L76

data Fin :: Nat -> Type where
  -- | if z is non-zero, FZ :: Fin z gives you the first item
  FZ :: Fin ('S n)
  -- | if i indexes into length z, then (i+1) indexes into length (z+1)
  FS :: Fin n -> Fin ('S n)

vindex :: Fin n -> Vec n a -> a
vindex = \case
  FZ -> \case
    x :+ _ -> x
  FS i -> \case
    _ :+ xs -> vindex i xs

Fin takes the place of Int in index :: Int -> [a] -> a. You can use FZ in any non-empty list, because FZ :: Fin (S n) will match any Vec (S n) (which is necessarily of length greater than 0). You can use FS FZ only on something that matches Vec (S (S n)). This is the type-safety.
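
A short usage sketch, with the hypothetical names letters, second and finToInt added for illustration:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

data Fin :: Nat -> Type where
  FZ :: Fin ('S n)
  FS :: Fin n -> Fin ('S n)

vindex :: Fin n -> Vec n a -> a
vindex = \case
  FZ -> \case
    x :+ _ -> x
  FS i -> \case
    _ :+ xs -> vindex i xs

-- | Hypothetical helper: forget the bound and recover a plain Int.
finToInt :: Fin n -> Int
finToInt = \case
  FZ -> 0
  FS i -> 1 + finToInt i

letters :: Vec ('S ('S ('S 'Z))) Char
letters = 'a' :+ 'b' :+ 'c' :+ VNil

-- | Index 1 of a three-item list.
second :: Char
second = vindex (FS FZ) letters

-- vindex (FS (FS (FS FZ))) letters is a compile-time error:
-- index 3 needs a list of length at least four.
```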

We can also specify non-trivial relationships between lengths of lists, like making a more type-safe take :: Int -> [a] -> [a]. We want to make sure that the result list has a length less than or equal to the input list. We need another “int” that can only be constructed in the case that the result length is less than or equal to the first length. These are called “proofs” or “witnesses”, and they play the same role that TypeRep, (:~:), etc. did above for our Sigma examples.

We want a type LTE n m that is a “witness” that n is less than or equal to m. It can only be constructed if n is less than or equal to m. For example, you can create a value of type LTE (S Z) (S (S Z)), but not of LTE (S (S Z)) Z.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L78-L87

data LTE :: Nat -> Nat -> Type where
  -- | Z is less than or equal to any number
  LTEZ :: LTE Z m
  -- | if n <= m, then (n + 1) <= (m + 1)
  LTES :: LTE n m -> LTE ('S n) ('S m)

vtake :: LTE n m -> Vec m a -> Vec n a
vtake = \case
  LTEZ -> \_ -> VNil
  LTES l -> \case x :+ xs -> x :+ vtake l xs

Notice the similarity to how we would define take :: Int -> [a] -> [a]. We just spiced up the Int argument with type safety.
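
For a concrete call, we build the witness by hand. In this sketch twoLTEThree, firstTwo and vtoList are hypothetical names; the witness LTES (LTES LTEZ) plays the role of the Int 2:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

data LTE :: Nat -> Nat -> Type where
  LTEZ :: LTE 'Z m
  LTES :: LTE n m -> LTE ('S n) ('S m)

vtake :: LTE n m -> Vec m a -> Vec n a
vtake = \case
  LTEZ -> \_ -> VNil
  LTES l -> \case x :+ xs -> x :+ vtake l xs

-- | A proof that 2 <= 3: two LTES steps on top of 0 <= 1.
twoLTEThree :: LTE ('S ('S 'Z)) ('S ('S ('S 'Z)))
twoLTEThree = LTES (LTES LTEZ)

firstTwo :: Vec ('S ('S 'Z)) Int
firstTwo = vtake twoLTEThree (10 :+ 20 :+ 30 :+ VNil)

-- | Hypothetical helper: forget the length again.
vtoList :: Vec n a -> [a]
vtoList = \case
  VNil -> []
  x :+ xs -> x : vtoList xs
```

There is no way to write a value of type LTE (S (S (S Z))) (S (S Z)), so asking for three items out of a two-item list cannot even be expressed.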

Another thing we would like to do is be able to create lists of arbitrary length. We can look at replicate :: Int -> a -> [a], and create a new “spicy int” SNat n, so vreplicate :: SNat n -> a -> Vec n a:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L89-L96

data SNat :: Nat -> Type where
  SZ :: SNat Z
  SS :: SNat n -> SNat (S n)

vreplicate :: SNat n -> a -> Vec n a
vreplicate = \case
  SZ -> \_ -> VNil
  SS n -> \x -> x :+ vreplicate n x

Notice that this type has a lot more guarantees than replicate. For replicate :: Int -> a -> [a], we can’t guarantee (as the caller) that the returned list does have the length we give it. But for vreplicate :: SNat n -> a -> Vec n a, it does!

SNat n is actually kind of special. We call it a singleton, and it’s useful because it perfectly reflects the structure of n the type, as a value…nothing more and nothing less. By pattern matching on SNat n, we can exactly determine what n is. SZ means n is Z, SS SZ means n is S Z, etc. This is useful because we can’t directly pattern match on types at runtime in Haskell (because of type erasure), but we can pattern match on singletons at runtime.
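
To make “pattern matching on the singleton tells us n” concrete, here is a sketch with two hypothetical helpers: snatToInt turns the singleton into an ordinary Int, and vlength recovers the singleton from a vector.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

data SNat :: Nat -> Type where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- | Hypothetical helper: read the type-level number back as a value.
snatToInt :: SNat n -> Int
snatToInt = \case
  SZ -> 0
  SS n -> 1 + snatToInt n

-- | Hypothetical helper: the spine of the vector already determines n,
-- so we can rebuild the singleton by walking it.
vlength :: Vec n a -> SNat n
vlength = \case
  VNil -> SZ
  _ :+ xs -> SS (vlength xs)
```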

We actually encountered singletons before in this post! TypeRep a is a singleton for the type a: by pattern matching on it (like with App earlier), we can essentially “pattern match” on the type a itself.

In practice, we often write typeclasses to automatically generate singletons, similar to Typeable from before:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L98-L108

class KnownNat n where
  nat :: SNat n

instance KnownNat Z where
  nat = SZ

instance KnownNat n => KnownNat (S n) where
  nat = SS nat

vreplicate' :: KnownNat n => a -> Vec n a
vreplicate' = vreplicate nat
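
As a usage sketch (threeXs and vtoList are hypothetical names), the typeclass lets the type annotation alone decide how many copies we get:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

data SNat :: Nat -> Type where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

class KnownNat (n :: Nat) where
  nat :: SNat n

instance KnownNat 'Z where
  nat = SZ

instance KnownNat n => KnownNat ('S n) where
  nat = SS nat

vreplicate :: SNat n -> a -> Vec n a
vreplicate = \case
  SZ -> \_ -> VNil
  SS n -> \x -> x :+ vreplicate n x

vreplicate' :: KnownNat n => a -> Vec n a
vreplicate' = vreplicate nat

-- | The length comes entirely from the type signature.
threeXs :: Vec ('S ('S ('S 'Z))) Char
threeXs = vreplicate' 'x'

-- | Hypothetical helper: forget the length again.
vtoList :: Vec n a -> [a]
vtoList = \case
  VNil -> []
  x :+ xs -> x : vtoList xs
```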

One last thing: moving back and forth between the different levels. We can’t really write a [a] -> Vec n a, because in Haskell, the type variables are determined by the caller. We want n to be determined by the list, and the function itself. And now suddenly we run into the same issue that we ran into before, when moving between levels 2 and 3.

We can do the same trick as before and write an existential wrapper:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L110-L116

data SomeVec a = forall n. MkSomeVec (SNat n) (Vec n a)

toSomeVec :: [a] -> SomeVec a
toSomeVec = \case
  [] -> MkSomeVec SZ VNil
  x : xs -> case toSomeVec xs of
    MkSomeVec n ys -> MkSomeVec (SS n) (x :+ ys)

It is common practice (and a good habit) to always include a singleton (or a singleton-like typeclass constraint) to the type you are “hiding” when you create an existential type wrapper, even when it is not always necessary. That’s why we included TypeRep in HList and SomeList earlier.

SomeVec a is essentially isomorphic to [a], except you can pattern match on it and get the length n as a type you can use.

There’s a slightly more light-weight method of returning an existential type: by returning it in a continuation.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L118-L121

withVec :: [a] -> (forall n. SNat n -> Vec n a -> r) -> r
withVec = \case
  [] -> \f -> f SZ VNil
  x : xs -> \f -> withVec xs \n ys -> f (SS n) (x :+ ys)

That way, you can use the type variable within the continuation. Doing withVec xs \n v -> ... is identical to case toSomeVec xs of MkSomeVec n v -> ...

However, since you don’t get the n itself until runtime, you might find yourself struggling to use concepts like Fin and LTE. To use them comfortably, you have to write functions to “check” if your LTE is even possible, known as “decision functions”:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level5.hs#L123-L128

isLTE :: SNat n -> SNat m -> Maybe (LTE n m)
isLTE = \case
  SZ -> \_ -> Just LTEZ
  SS n -> \case
    SZ -> Nothing
    SS m -> LTES <$> isLTE n m
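
Putting the pieces together, here is a sketch of how a decision function lets us get from untyped runtime values to vtake and back. The functions takeExactly, withSNat and vtoList are hypothetical helpers written for this example; withVec is written with explicit arguments rather than \case so that no extra extensions are needed.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE RankNTypes #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vec :: Nat -> Type -> Type where
  VNil :: Vec 'Z a
  (:+) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :+

data SNat :: Nat -> Type where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

data LTE :: Nat -> Nat -> Type where
  LTEZ :: LTE 'Z m
  LTES :: LTE n m -> LTE ('S n) ('S m)

vtake :: LTE n m -> Vec m a -> Vec n a
vtake = \case
  LTEZ -> \_ -> VNil
  LTES l -> \case x :+ xs -> x :+ vtake l xs

isLTE :: SNat n -> SNat m -> Maybe (LTE n m)
isLTE = \case
  SZ -> \_ -> Just LTEZ
  SS n -> \case
    SZ -> Nothing
    SS m -> LTES <$> isLTE n m

withVec :: [a] -> (forall n. SNat n -> Vec n a -> r) -> r
withVec xs0 f = case xs0 of
  [] -> f SZ VNil
  x : xs -> withVec xs (\n ys -> f (SS n) (x :+ ys))

-- | Hypothetical helper: reify an Int as a singleton.
-- Non-positive inputs are treated as zero.
withSNat :: Int -> (forall n. SNat n -> r) -> r
withSNat k f
  | k <= 0 = f SZ
  | otherwise = withSNat (k - 1) (\n -> f (SS n))

-- | Hypothetical helper: forget the length again.
vtoList :: Vec n a -> [a]
vtoList = \case
  VNil -> []
  x :+ xs -> x : vtoList xs

-- | Take exactly k items, or fail if the list is too short.
-- isLTE either produces the witness that vtake demands or returns Nothing.
takeExactly :: Int -> [a] -> Maybe [a]
takeExactly k xs =
  withSNat k (\n ->
    withVec xs (\m v ->
      fmap (\l -> vtoList (vtake l v)) (isLTE n m)))
```

The only place a failure can happen is the call to isLTE; once we hold the witness, vtake itself cannot go wrong.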

This was a very whirlwind introduction, and I definitely recommend reading this post on fixed-length lists for a more in-depth guide and tour of the features. In practice, fixed-length lists are not that useful because the situations where you want lazily linked lists and the situations where you want them to be statically sized have very little overlap. But you will often see fixed-length vectors in real life code, mostly numerical code.

Overall as you can see, at this level we gain some powerful guarantees and tools, but we also run into some small inconveniences (like manipulating witnesses and singletons). This level is fairly comfortable to work with in modern Haskell tooling. However, if you live here long enough, you’re going to eventually be tempted to wander into…

Level 6: Local Structure Enforced List

Code available here

For our next level let’s jump back into constraints on the contents of the list. Let’s imagine a priority queue on top of a list. Each value in the list will be a (priority, value) pair. To make the pop operation (pop out the value of lowest priority) efficient, we can enforce that the list is always sorted by priority: the lowest priority is always first, the second lowest is second, etc.

If we didn’t care about type safety, we could do this by always inserting a new item so that it is sorted:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L21-L26

insertSortedList :: (Int, a) -> [(Int, a)] -> [(Int, a)]
insertSortedList (p, x) = \case
  [] -> [(p, x)]
  (q, y) : ys
    | p <= q -> (p, x) : (q, y) : ys
    | otherwise -> (q, y) : insertSortedList (p, x) ys

This method enforces a local structure: between every item x and the next item y in x:y:zs, the priority of x has to be less than or equal to the priority of y. Keeping our structure local means we only need to enforce local invariants.

Writing it all willy nilly type unsafe like this could be good for a single function, but we’re also going to need some more complicated functions. What if we wanted to “combine” (merge) two sorted lists together? Using a normal list, we don’t have any assurances that we have written it correctly, and it’s very easy to mess up. How about we leverage type safety to ask GHC to ensure that our functions are always correct, and always preserve this local structure? Now you’re thinking in types!
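
In the untyped world, the best we can do is check this local invariant at runtime. A sketch of such a check follows; isLocallySorted and queue are hypothetical names, and insertSortedList is repeated so the example stands alone:

```haskell
{-# LANGUAGE LambdaCase #-}

insertSortedList :: (Int, a) -> [(Int, a)] -> [(Int, a)]
insertSortedList (p, x) = \case
  [] -> [(p, x)]
  (q, y) : ys
    | p <= q -> (p, x) : (q, y) : ys
    | otherwise -> (q, y) : insertSortedList (p, x) ys

-- | Hypothetical check: every adjacent pair of priorities is in order.
-- Only neighbours are compared, which is what makes the invariant local.
isLocallySorted :: [(Int, a)] -> Bool
isLocallySorted xs = and (zipWith (<=) ps (drop 1 ps))
  where
    ps = map fst xs

-- | Building a queue only through insertSortedList keeps it sorted.
queue :: [(Int, Char)]
queue = foldr insertSortedList [] [(4, 'a'), (3, 'b'), (5, 'c')]
```

Nothing stops another function from consing an out-of-order pair onto queue, though, and a check like isLocallySorted would only notice after the fact.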

Introducing level 6: enforcing local structure!

But, first, a quick note before we dive in: for the rest of this post, for the sake of simplicity, let’s switch from inductively defined types (like Nat above) to GHC’s built-in opaque Nat type. You can think of it as essentially the same as the Nat we wrote above, but opaque and provided by the compiler. Under the hood, it’s implemented using machine integers for efficiency. And, instead of using concrete S (S (S Z)) syntax, you’d use abstract numeric literals, like 3. There’s a trade-off: because it’s opaque, we can’t pattern match on it and create or manipulate our own witnesses; we are at the mercy of the API that GHC gives us. We get +, <=, Min, etc., but in total it’s not that extensive. That’s why I never use these without also bringing typechecker plugins (ghc-typelits-natnormalise and ghc-typelits-knownnat) to help automatically bring witnesses and equalities and relationships into scope for us. Everything here could be done using hand-defined witnesses and types, but we’re using TypeNats here just for the sake of example.

{-# OPTIONS_GHC -fplugin GHC.TypeLits.KnownNat.Solver #-}
{-# OPTIONS_GHC -fplugin GHC.TypeLits.Normalise #-}

With that disclaimer out of the way, let’s create our types! Let’s make an Entry n a type that represents a value of type a with priority n.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L28-L28

newtype Entry (n :: Nat) a = Entry a

We’d construct this like Entry @3 "hello", which produces Entry 3 String. Again this uses type application syntax, @3, that lets us pass in the type 3 to the constructor Entry :: forall n a. a -> Entry n a.
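
Since the priority only lives in the type, we need natVal to read it back as a value. A minimal sketch, where entryPriority and hello are hypothetical names:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}

import Data.Proxy (Proxy (..))
import GHC.TypeNats (KnownNat, Nat, natVal)
import Numeric.Natural (Natural)

newtype Entry (n :: Nat) a = Entry a

-- | Hypothetical helper: recover the phantom priority at runtime.
entryPriority :: forall n a. KnownNat n => Entry n a -> Natural
entryPriority _ = natVal (Proxy @n)

hello :: Entry 3 String
hello = Entry @3 "hello"
```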

Now, let’s think about what phantom types we want to include in our list. The fundamental strategy in this, as I learned from Conor McBride’s great writings on this topic, is:

  • Think about what “type safe operations” you want to have for your structure
  • Add just enough phantom types to perform those operations.

In our case, we want to be able to cons an Entry n a to the start of a sorted list. To ensure this, we need to know that n is less than or equal to the list’s current minimum priority. So, we need our list type to be Sorted n a, where n is the current minimum priority.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L33-L35

data Sorted :: Nat -> Type -> Type where
  SSingle :: Entry n a -> Sorted n a
  SCons :: (KnownNat m, n <= m) => Entry n a -> Sorted m a -> Sorted n a

To keep things simple, we are only going to talk about non-empty lists, so the minimum priority is always defined.

So, a Sorted n a is either SSingle (x :: Entry n a), where the single item is a value of priority n, or SCons x xs, where x has priority n and xs :: Sorted m a, where n <= m. In our previous inductive Nat, you could imagine this as SCons :: SNat m -> LTE n m -> Entry n a -> Sorted m a -> Sorted n a, but here we will use GHC’s built-in <= typeclass-based witness of less-than-or-equal-to-ness.

This works! You should be able to write:

Entry @1 'a' `SCons` Entry @2 'b' `SCons` SSingle (Entry @4 'c')

This creates a valid list where the priorities are all sorted from lowest to highest. You can now pop using pattern matching, which gives you the lowest element by construction. If you match on SCons x xs, you know that no entry in xs has a priority lower than x.

Critically, note that creating something out-of-order like the following would be a compiler error:

Entry @9 'a' `SCons` Entry @2 'b' `SCons` SSingle (Entry @4 'c')
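
To try both expressions out, here is a minimal self-contained sketch that needs no typechecker plugins, since all the priorities are literals. Two things in it are assumptions for the sake of the example: the right-associative fixity declaration for SCons (needed for the unparenthesized infix chain to parse as intended) and the helper sortedValues.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import GHC.TypeNats (KnownNat, Nat, type (<=))

newtype Entry (n :: Nat) a = Entry a

data Sorted :: Nat -> Type -> Type where
  SSingle :: Entry n a -> Sorted n a
  SCons :: (KnownNat m, n <= m) => Entry n a -> Sorted m a -> Sorted n a

-- Assumption: make `SCons` associate to the right, like (:).
infixr 5 `SCons`

-- | The head priority 1 shows up in the type.
ascending :: Sorted 1 Char
ascending = Entry @1 'a' `SCons` Entry @2 'b' `SCons` SSingle (Entry @4 'c')

-- | Hypothetical helper: list the values in priority order.
sortedValues :: Sorted n a -> [a]
sortedValues = \case
  SSingle (Entry x) -> [x]
  SCons (Entry x) xs -> x : sortedValues xs

-- Swapping @1 for @9 above makes GHC reject the definition,
-- because the constraint 9 <= 2 cannot be satisfied.
```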

Now, the users of our priority queue probably won’t often care about having the minimum priority in the type. In this case, we are using the phantom type to ensure that our data structure is correct by construction, for our own sake, and also to help us write internal functions in a correct way. So, for practical end-user usage, we want to existentially wrap out n.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L103-L120

data SomeSorted a = forall n. KnownNat n => SomeSorted (Sorted n a)

popSomeSorted :: Sorted n a -> (Entry n a, Maybe (SomeSorted a))
popSomeSorted = \case
  SSingle x -> (x, Nothing)
  SCons x xs -> (x, Just (SomeSorted xs))

popSomeSorted takes a Sorted n a and returns the Entry n a promised at the start of it, and then the rest of the list if there is anything left, eliding the phantom parameter.

Now let’s get to the interesting parts where we actually leverage n: let’s write insertSortedList, but the type-safe way!

First of all, what should the type be if we insert an Entry n a into a Sorted m a? If n <= m, it would be Sorted n a. If n > m, it should be Sorted m a. GHC gives us a type family Min n m, which returns the minimum between n and m. So our type should be:

insertSorted :: Entry n a -> Sorted m a -> Sorted (Min n m) a

To write this, we can use some helper functions: first, to decide if we are in the n <= m or the n > m case:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L41-L51

data DecideInsert :: Nat -> Nat -> Type where
  DIZ :: (n <= m, Min n m ~ n) => DecideInsert n m
  DIS :: (m <= n, Min n m ~ m) => DecideInsert n m

decideInsert :: forall a b. (KnownNat a, KnownNat b) => DecideInsert a b
decideInsert = case cmpNat (Proxy @a) (Proxy @b) of
  LTI -> DIZ -- if a < b, DIZ
  EQI -> DIZ -- if a == b, DIZ
  GTI -> case cmpNat (Proxy @b) (Proxy @a) of
    LTI -> DIS -- if a > b, DIS, except GHC isn't smart enough to know this
    GTI -> error "absurd, we can't have both a > b and b > a"

We can use decideInsert to branch on if we are in the case where we insert the entry at the head or the case where we have to insert it deeper. DecideInsert here is our witness, and decideInsert constructs it using cmpNat, provided by GHC to compare two Nats. We use Proxy :: Proxy n to tell it what nats we want to compare. KnownNat is the equivalent of our KnownNat class we wrote from scratch, but with GHC’s TypeNats instead of our custom inductive Nats.

cmpNat :: (KnownNat a, KnownNat b) => p a -> p b -> OrderingI a b

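data OrderingI :: k -> k -> Type where
    LTI :: Compare a b ~ 'LT => OrderingI a b -- in this branch, a < b
    EQI :: Compare a a ~ 'EQ => OrderingI a a -- in this branch, a ~ b
    GTI :: Compare a b ~ 'GT => OrderingI a b -- in this branch, a > b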

Note that GHC and our typechecker plugins aren’t smart enough to know we can rule out b > a if a > b is true, so we have to leave an error that we know will never be called. Oh well. If we were writing our witnesses by hand using inductive types, we could write this ourselves, but since we are using GHC’s Nat, we are limited by what their API can prove.

Let’s start writing our insertSorted:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L64-L76

insertSorted ::
  forall n m a.
  (KnownNat n, KnownNat m) =>
  Entry n a ->
  Sorted m a ->
  Sorted (Min n m) a
insertSorted x = \case
  SSingle y -> case decideInsert @n @m of
    DIZ -> SCons x (SSingle y)
    DIS -> SCons y (SSingle x)
  SCons @q y ys -> case decideInsert @n @m of
    DIZ -> SCons x (SCons y ys)
    DIS -> sConsMin @n @q y (insertSorted x ys)

The structure is more or less the same as insertSortedList, but now type safe! We basically use our handy helper function decideInsert to dictate where we go. I also used a helper function sConsMin to insert into the recursive case:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L53-L62

sConsMin ::
  forall q r n a.
  (KnownNat q, KnownNat r, n <= q, n <= r) =>
  Entry n a ->
  Sorted (Min q r) a ->
  Sorted n a
sConsMin = case cmpNat (Proxy @q) (Proxy @r) of
  LTI -> SCons :: Entry n a -> Sorted q a -> Sorted n a
  EQI -> SCons :: Entry n a -> Sorted q a -> Sorted n a
  GTI -> SCons :: Entry n a -> Sorted r a -> Sorted n a

sConsMin isn’t strictly necessary, but it saves a lot of unnecessary pattern matching. The reason why we need it is because we want to write SCons y (insertSorted x ys) in the last line of insertSorted. However, in this case, SCons does not have a well-defined type. It can either be Entry n a -> Sorted q a -> Sorted n a or Entry n a -> Sorted r a -> Sorted n a. Haskell requires functions to be specialized at the place we actually use them, so this is no good. We would have to pattern match on cmpNat and LTI/EQI/GTI in order to know how to specialize SCons. So, we use sConsMin to wrap this up for clarity.

How did I know this? I basically tried writing it out the full messy way, bringing in as many witnesses and as much pattern matching as I could, until I got it to compile. Then I spent time factoring out the common parts until I got what we have now!

Note that we use a feature called “Type Abstractions” to “match on” the existential type variable q in the pattern SCons @q y ys. Recall from the definition of SCons that the first type variable is the minimum priority of the tail.

And just like that, we made our insertSortedList type-safe! We can no longer return an unsorted list: it always inserts sortedly, by construction, enforced by GHC. We did cheat a little with error, but that was only because we used GHC’s TypeNats…if we used our own inductive types, all unsafety could be avoided.

Let’s write the function to merge two sorted lists together. This is essentially the merge step of a merge sort: take two lists, look at the head of each one, cons the smaller of the two heads, then recurse.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L78-L92

mergeSorted ::
  forall n m a.
  (KnownNat n, KnownNat m) =>
  Sorted n a ->
  Sorted m a ->
  Sorted (Min n m) a
mergeSorted = \case
  SSingle x -> insertSorted x
  SCons @q x xs -> \case
    SSingle y -> case decideInsert @n @m of
      DIZ -> sConsMin @q @m x (mergeSorted xs (SSingle y))
      DIS -> SCons y (SCons x xs)
    SCons @r y ys -> case decideInsert @n @m of
      DIZ -> sConsMin @q @m x (mergeSorted xs (SCons y ys))
      DIS -> sConsMin @n @r y (mergeSorted (SCons x xs) ys)

Again, this looks a lot like how you would write the normal function to merge two sorted lists…except this time, it’s type-safe! You can’t return an unsorted list because the result list has to be sorted by construction.

To wrap it all up, let’s write our conversion functions. First, an insertionSort function that takes a normal non-empty list of priority-value pairs and throws them all into a Sorted, which (by construction) is guaranteed to be sorted:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L107-L135

insertionSort ::
  forall a.
  NonEmpty (Natural, a) ->
  SomeSorted a
insertionSort ((k0, x0) :| xs0) = withSomeSNat k0 \(SNat @k) ->
  go xs0 (SomeSorted (SSingle (Entry @k x0)))
  where
    go :: [(Natural, a)] -> SomeSorted a -> SomeSorted a
    go [] = id
    go ((k, x) : xs) = \case
      SomeSorted @_ @n ys -> withSomeSNat k \(SNat @k) ->
        go xs $
          someSortedMin @k @n $
            insertSorted (Entry @k x) ys

someSortedMin ::
  forall n m a.
  (KnownNat n, KnownNat m) =>
  Sorted (Min n m) a ->
  SomeSorted a
someSortedMin = case cmpNat (Proxy @n) (Proxy @m) of
  LTI -> SomeSorted
  EQI -> SomeSorted
  GTI -> SomeSorted

Some things to note:

  1. We’re using the nonempty list type from base, because Sorted always has at least one element.
  2. We use withSomeSNat to turn a Natural into the type-level n :: Nat, the same way we wrote withVec earlier. This is just the function that GHC offers to reify a Natural (non-negative Integer) to the type level.
  3. someSortedMin is used to clean up the implementation, doing the same job that sConsMin did.

ghci> case insertionSort ((4, 'a') :| [(3, 'b'), (5, 'c'), (4, 'd')]) of
          SomeSorted xs -> print xs
SCons Entry @3 'b' (SCons Entry @4 'd' (SCons Entry @4 'a' (SSingle Entry @5 'c')))

Finally, a function to convert back down to a normal non-empty list, using GHC’s natVal to “demote” a type-level n :: Nat to a Natural:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level6.hs#L137-L140

fromSorted :: forall n a. KnownNat n => Sorted n a -> NonEmpty (Natural, a)
fromSorted = \case
  SSingle (Entry x) -> (natVal (Proxy @n), x) :| []
  SCons (Entry x) xs -> (natVal (Proxy @n), x) NE.<| fromSorted xs

Level 7: Global Structure Enforced List

Code available here

For our final level, let’s imagine a “weighted list” of (Int, a) pairs, where each item a has an associated weight or cost. Then, imagine a “bounded weighted list”, where the total cost must not exceed some limit value. Think of it as a list of files and their sizes and a maximum total file size, or a backpack for a character in a video game with a maximum total carrying weight.

There is a fundamental difference here between this type and our last type: we want to enforce a global invariant (total cannot exceed a limit), and we can’t “fake” this using local invariants like last time.

Introducing level 7: enforcing global structure! This brings some extra complexities, similar to the ones we encountered in Level 5 with our fixed-length lists: whatever phantom type we use to enforce this “global” invariant now becomes entangled with the overall structure of our data type itself.

Let’s re-use our Entry type, but interpret an Entry n a as a value of type a with a weight n. Now, we’ll again “let McBride be our guide” and ask the same question we asked before: what “type-safe” operation do we want, and what minimal phantom types do we need to allow this type-safe operation? In our case, we want to insert into our bounded weighted list in a safe way, to ensure that there is enough room. So, we need two phantom types:

  1. One phantom type lim to establish the maximum weight of our container
  2. Another phantom type n to establish the current used capacity of our container.

We want Bounded lim n a:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L24-L31

data Bounded :: Nat -> Nat -> Type -> Type where
  BNil :: Bounded lim 0 a
  BCons ::
    forall n m lim a.
    (KnownNat m, n + m <= lim) =>
    Entry n a ->
    Bounded lim m a ->
    Bounded lim (n + m) a

  • The empty bounded container BNil :: Bounded lim 0 a can satisfy any lim, and has weight 0.
  • If we have a Bounded lim m a, then we can use BCons to add an Entry n a and get a Bounded lim (n + m) a, provided that n + m <= lim.
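
Here is a minimal sketch of building such a container by hand, with literal weights so that no plugins are needed. The names backpack and usedWeight are hypothetical, and hiding the Prelude’s Bounded class is an assumption needed to make this standalone snippet compile, because our type has the same name:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeNats (KnownNat, Nat, natVal, type (+), type (<=))
import Numeric.Natural (Natural)
import Prelude hiding (Bounded)

newtype Entry (n :: Nat) a = Entry a

data Bounded :: Nat -> Nat -> Type -> Type where
  BNil :: Bounded lim 0 a
  BCons ::
    forall n m lim a.
    (KnownNat m, n + m <= lim) =>
    Entry n a ->
    Bounded lim m a ->
    Bounded lim (n + m) a

-- | Limit 10, used weight 3 + 4 = 7. GHC checks both 4 <= 10 and 7 <= 10.
backpack :: Bounded 10 7 String
backpack = BCons (Entry @3 "rope") (BCons (Entry @4 "lamp") BNil)

-- | Hypothetical helper: read the used weight back from the type.
usedWeight :: forall lim n a. KnownNat n => Bounded lim n a -> Natural
usedWeight _ = natVal (Proxy @n)

-- Changing the signature to Bounded 6 7 String is rejected,
-- because the constraint 3 + 4 <= 6 cannot be satisfied.
```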

Let’s try this out by seeing how the end user would “maybe insert” into a bounded list if it had enough capacity:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L133-L145

data SomeBounded :: Nat -> Type -> Type where
  SomeBounded :: KnownNat n => Bounded lim n a -> SomeBounded lim a

insertSomeBounded ::
  forall lim n a.
  (KnownNat lim, KnownNat n) =>
  Entry n a ->
  SomeBounded lim a ->
  Maybe (SomeBounded lim a)
insertSomeBounded x (SomeBounded @m xs) = case cmpNat (Proxy @(n + m)) (Proxy @lim) of
  LTI -> Just $ SomeBounded (BCons x xs)
  EQI -> Just $ SomeBounded (BCons x xs)
  GTI -> Nothing

First we match on the SomeBounded to see what the currently used weight m is. Then we check using cmpNat to see if the Bounded can hold n + m. If it can, we return successfully. Note that we define SomeBounded using GADT syntax so we can precisely control the order of the type variables, so SomeBounded @m xs binds m to the used weight of the inner list.

Remember in this case that the end user here isn’t necessarily using the phantom types to their advantage (except for lim, which could be useful). Instead, it’s us who is going to be using n to ensure that if we ever create any Bounded (or SomeBounded), it will always be within capacity by construction.

Now that the usage makes sense, let’s jump in and write some type-safe functions using our fancy phantom types!

First, let’s notice that we can always “resize” our Bounded lim n a to a Bounded lim' n a as long as the total usage n fits within the new carrying capacity:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L35-L38

reBounded :: forall lim lim' n a. n <= lim' => Bounded lim n a -> Bounded lim' n a
reBounded = \case
  BNil -> BNil
  BCons x xs -> BCons x (reBounded xs)

Note that we have full type safety here! GHC will prevent us from using reBounded if we pick a new lim that is less than what the bag currently weighs! You’ll also see the general pattern here that changing any “global” properties for our type here will require recursing over the entire structure to adjust the global property.

How about a function to combine two bags that share the same limit? Well, this should be legal as long as the new combined weight is still within the limit:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L48-L56

concatBounded ::
  forall n m lim a.
  (KnownNat n, KnownNat m, KnownNat lim, n + m <= lim) =>
  Bounded lim n a ->
  Bounded lim m a ->
  Bounded lim (n + m) a
concatBounded = \case
  BNil -> id
  BCons @x @xs x xs -> BCons x . concatBounded xs

Aside

This is completely unrelated to the topic at hand, but if you’re a big nerd like me, you might enjoy the fact that this function makes Bounded lim n a the arrows of a Category whose objects are the natural numbers less than or equal to lim, the identity arrow is BNil, and arrow composition is concatBounded. Between object n and m, if n <= m, its arrows are values of type Bounded lim (m - n) a. Actually wait, it’s the same thing with Vec and vconcat above isn’t it? I guess we were moving so fast that I didn’t have time to realize it.

Anyway this is related to the preorder category, but not thin. A thicc preorder category, if you will. Always nice to spot a category out there in the wild.

It should be noted that the reason that reBounded and concatBounded look so clean so fresh is that we are heavily leveraging typechecker plugins. But, these are all still possible with normal functions if we construct the witnesses explicitly.

Now for a function within our business logic, let’s write takeBounded, which constricts a Bounded lim n a to a Bounded lim' q a with a smaller limit lim', where q is the weight of all of the elements that fit in the new limit. For example, if we had a bag of limit 15 containing items weighing 4, 3, and 5 (total 12), but we wanted to takeBounded with a new limit 10, we would take the 4 and 3 items, but leave behind the 5 item, to get a new total weight of 7.
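
Before adding the phantom types, it can help to pin down the intended behavior on plain lists. The following is only a value-level model, assuming items are (weight, payload) pairs with non-negative weights; the name takeWeighted is hypothetical. It stops at the first item that does not fit, the same way the typed takeBounded in this post does.

```haskell
-- | Value-level model of takeBounded (hypothetical name).
-- Each item is a (weight, payload) pair; weights are assumed non-negative.
takeWeighted :: Int -> [(Int, a)] -> [(Int, a)]
takeWeighted _ [] = []
takeWeighted lim ((w, x) : rest) = case compare w lim of
  LT -> (w, x) : takeWeighted (lim - w) rest  -- fits with room to spare
  EQ -> [(w, x)]                              -- fills the remaining room exactly
  GT -> []                                    -- does not fit: stop here
```

With the example above, map fst (takeWeighted 10 [(4, "a"), (3, "b"), (5, "c")]) gives [4,3], for a total weight of 7. Nothing in this version stops a buggy implementation from exceeding the limit, which is the gap the typed version closes.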

It’d be nice to have a helper data type to existentially wrap the new q weight in our return type:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L113-L118

data TakeBounded :: Nat -> Nat -> Type -> Type where
  TakeBounded ::
    forall q lim n a.
    (KnownNat q, q <= n) =>
    Bounded lim q a ->
    TakeBounded lim n a

So the type of takeBounded would be:

takeBounded ::
  (KnownNat lim, KnownNat lim', KnownNat n) =>
  Bounded lim n a ->
  TakeBounded lim' n a

Again I’m going to introduce some helper functions that will make sense soon:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L40-L46

bConsExpand :: KnownNat m => Entry n a -> Bounded lim m a -> Bounded (n + lim) (n + m) a
bConsExpand x xs = withBoundedWit xs $ BCons x (reBounded xs)

withBoundedWit :: Bounded lim n a -> (n <= lim => r) -> r
withBoundedWit = \case
  BNil -> \x -> x
  BCons _ _ -> \x -> x

From the type, we can see bConsExpand adds a new item while also increasing the limit: bConsExpand :: Entry n a -> Bounded lim m a -> Bounded (n + lim) (n + m) a. This is always safe conceptually because we can always add a new item into any bag if we increase the limit of the bag: Entry 100 a -> Bounded 5 3 a -> Bounded 105 103 a, for instance.

Next, you’ll notice that if we write this as BCons x (reBounded xs) alone, we’ll get a GHC error complaining that this requires m <= lim. This is something that we know has to be true (by construction), since there isn’t any constructor of Bounded that will give us a total weight m bigger than the limit lim. However, this requires a bit of witness manipulation for GHC to know this: we have to essentially enumerate over every constructor, and within each constructor GHC knows that m <= lim holds. This is what withBoundedWit does. We “know” n <= lim, we just need to enumerate over the constructors of Bounded lim n a so GHC is happy in every case.

withBoundedWit’s type might be a little confusing if this is the first time you’ve seen an argument of the form (constraint => r): it takes a Bounded lim n a and a “value that is only possible if n <= lim”, and then gives you that value.
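
If the (constraint => r) shape is unfamiliar, the same pattern can be tried with an ordinary class constraint. This is a small sketch with hypothetical names (Showable, withShow, describe) that are not from the post's code samples; it needs only the GADTs and RankNTypes extensions and no typechecker plugins.

```haskell
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

-- | A value packaged together with a Show instance for its type.
data Showable a where
  Showable :: Show a => a -> Showable a

-- | Same shape as withBoundedWit: matching on the constructor brings
-- Show a into scope, which is exactly what the continuation needs.
withShow :: Showable a -> (Show a => r) -> r
withShow (Showable _) r = r

-- | describe has no Show a constraint of its own; show xs typechecks
-- only because it is passed as the (Show a => r) argument.
describe :: Showable a -> [a] -> String
describe w xs = withShow w (show xs)
```

For example, describe (Showable (1 :: Int)) [2, 3] evaluates to "[2,3]". withBoundedWit works the same way, except that the constraint recovered from the constructor is n <= lim rather than Show a.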

With that, we’re ready:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L120-L131

takeBounded ::
  forall lim lim' n a.
  (KnownNat lim, KnownNat lim', KnownNat n) =>
  Bounded lim n a ->
  TakeBounded lim' n a
takeBounded = \case
  BNil -> TakeBounded BNil
  BCons @x @xs x xs -> case cmpNat (Proxy @x) (Proxy @lim') of
    LTI -> case takeBounded @lim @(lim' - x) xs of
      TakeBounded @q ys -> TakeBounded @(x + q) (bConsExpand x ys)
    EQI -> TakeBounded (BCons x BNil)
    GTI -> TakeBounded BNil

Thanks to the types, we ensure that the returned bag must weigh at most lim'!

As an exercise, try writing splitBounded, which is like takeBounded but also returns the items that were leftover. Solution here.

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L91-L103

data SplitBounded :: Nat -> Nat -> Nat -> Type -> Type where
  SplitBounded ::
    forall q lim lim' n a.
    (KnownNat q, q <= n) =>
    Bounded lim' q a ->
    Bounded lim (n - q) a ->
    SplitBounded lim lim' n a

splitBounded ::
  forall lim lim' n a.
  (KnownNat lim, KnownNat lim', KnownNat n) =>
  Bounded lim n a ->
  SplitBounded lim lim' n a

One final example, how about a function that reverses the Bounded lim n a? We’re going to write a “single-pass reverse”, similar to how it’s often written for lists:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L68-L73

reverseList :: [a] -> [a]
reverseList = go []
  where
    go res = \case
      [] -> res
      x : xs -> go (x : res) xs

Now, reversing a Bounded should be legal, because reversing the order of the items shouldn’t change the total weight. However, we basically “invert” the structure of the Bounded type, which, depending on how we set up our phantom types, could mean a lot of witness reshuffling. Luckily, our typechecker plugin handles most of it for us in this case, but it exposes one gap:

-- source: https://github.com/mstksg/inCode/tree/master/code-samples/type-levels/Level7.hs#L58-L89

reverseBounded ::
  forall lim n a. (n <= lim, KnownNat lim, KnownNat n) => Bounded lim n a -> Bounded lim n a
reverseBounded = go BNil
  where
    go ::
      forall m q.
      (KnownNat m, KnownNat q, m <= lim, m + q <= lim) =>
      Bounded lim m a ->
      Bounded lim q a ->
      Bounded lim (m + q) a
    go res = \case
      BNil -> res
      BCons @x @xs x xs ->
        solveLte @m @q @x @lim $
          go @(x + m) @xs (BCons @x @m x res) xs

solveLte ::
  forall a b c n r.
  (KnownNat a, KnownNat c, KnownNat n, a + b <= n, c <= b) =>
  (a + c <= n => r) ->
  r
solveLte x = case cmpNat (Proxy @(a + c)) (Proxy @n) of
  LTI -> x
  EQI -> x
  GTI -> error "absurd: if a + b <= n and c <= b, then a + c can't be > n"

Due to how everything gets exposed, we need to prove that if a + b <= n and c <= b, then a + c <= n. This is always true, but the typechecker plugin needs a bit of help, and we have to resort to an unsafe operation to get this to work. However, if we were using our manually constructed inductive types instead of GHC’s opaque ones, we could write this in a type-safe and total way. We run into these kinds of issues a lot more often with global invariants than we do with local invariants, because the invariant phantom becomes so entangled with the structure of our data type.
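
To give a flavor of what "type-safe and total" looks like with inductive types, here is a sketch that proves a simpler lemma than the one solveLte needs: transitivity of <= over hand-rolled Peano naturals. The names N, Le, leTrans and leSteps are all hypothetical and do not come from the post's code samples.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Kind (Type)

-- | Peano naturals, used at the type level via DataKinds.
data N = Z | S N

-- | An inductive witness that the first index is <= the second.
data Le :: N -> N -> Type where
  LeZ :: Le 'Z n
  LeS :: Le n m -> Le ('S n) ('S m)

-- | Transitivity, by structural recursion on the witnesses. There is no
-- error call: the missing case (LeS, LeZ) is ruled out by the types.
leTrans :: Le a b -> Le b c -> Le a c
leTrans LeZ _ = LeZ
leTrans (LeS p) (LeS q) = LeS (leTrans p q)

-- | Number of LeS steps in a witness, just to observe a proof at runtime.
leSteps :: Le a b -> Int
leSteps LeZ = 0
leSteps (LeS p) = 1 + leSteps p
```

Composing a witness of 1 <= 2 with a witness of 2 <= 3 yields a witness of 1 <= 3, and the compiler checks every case. The cost is that arithmetic facts GHC's built-in Nat gets for free have to be proven by hand.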

And…that’s about as far as we’re going to go with this final level! If this type of programming with structural invariants is appealing to you, check out Conor McBride’s famous type-safe red-black trees in Haskell paper, or Edwin Brady’s Type-Driven Development in Idris for how to structure entire programs around these principles.

Evident from the fact that Conor’s work is in Agda, and Brady’s in Idris, you can tell that in doing this, we are definitely pushing the boundaries of what is ergonomic to write in Haskell. Well, depending on who you ask, we already zipped past that boundary long ago. Still, there’s definitely a certain kind of joy to defining invariants in your data types and then essentially proving to the compiler that you’ve followed them. But, most people will be happier just writing a property test to fuzz the implementation on a non type-safe structure. And some will be happy with…unit tests. Ha ha ha ha. Good joke right?

Anyway, hope you enjoyed the ride! I hope you found some new ideas for ways to write your code in the future, or at least found them interesting or eye-opening. Again, none of the data structures here are presented to be practically useful as-is — the point is more to present these typing principles and mechanics in a fun manner and to inspire a sense of wonder.

Which level is your favorite, and what level do you wish you could work at if things got a little more ergonomic?

Special Thanks

I am very humbled to be supported by an amazing community, who make it possible for me to devote time to researching and writing these posts. Very special thanks to my supporter at the “Amazing” level on patreon, Josh Vera! :)


  1. Luke’s blog has been known to switch back and forth from private to non-private, so I will link to the official post and respect the decision of the author on whether or not it should be visible. However, the term itself is quite commonly used and if you search for it online you will find much discussion about it.

  2. Note that I don’t really like calling these “vectors” any more, because in a computer science context the word vector carries implications of contiguous-memory storage. “Lists” of fixed length is the more appropriate description here, in my opinion. The term “vector” for this concept arises from linear algebra, where a vector is inherently defined by its vector space, which does have an inherent dimensionality. But we are talking about computer science concepts here, not mathematical concepts, so we should pick the name that provides the most useful implicit connotations.

by Justin Le at September 04, 2024 05:35 PM

September 02, 2024

Christopher Allen

Obtaining happiness by using Diesel Async in anger

I've been getting some questions from people about how to use Diesel and particularly diesel-async for interacting with SQL databases in Rust. I thought I'd write up a quick post with some patterns and examples.

The example project on GitHub for this post is located at: https://github.com/bitemyapp/better-living-through-petroleum/tree/blog/diesel-async-in-anger

The blog/diesel-async-in-anger Git tag is so you can see the version of the code that I'm using for this post.

by Unknown at September 02, 2024 12:00 AM

September 01, 2024

Magnus Therning

Improving how I handle secrets in my work notes

At work I use org-mode to keep notes about useful ways to query our systems, mostly that involves using the built-in SQL support to access DBs and ob-http to send HTTP requests. In both cases I often need to provide credentials for the systems. I'm embarrassed to admit it, but for a long time I've taken the easy path and kept all credentials in clear text. Every time I've used one of those code blocks I've thought I really ought to find a better way of handling these secrets one of these days. Yesterday was that day.

I ended up with two functions that use auth-source and its ~/.authinfo.gpg file.

(require 'auth-source)
(require 'dash) ; provides the -> threading macro used below

(defun mes/auth-get-pwd (host)
  "Get the password for a host (authinfo.gpg)"
  (-> (auth-source-search :host host)
      car
      (plist-get :secret)
      funcall))

(defun mes/auth-get-key (host key)
  "Get a key's value for a host (authinfo.gpg)

Not usable for getting the password (:secret), use 'mes/auth-get-pwd'
for that."
  (-> (auth-source-search :host host)
      car
      (plist-get key)))

It turns out that the library can handle more keys than the documentation suggests so for DB entries I'm using a machine (:host) that's a bit shorter and easier to remember than the full AWS hostname. Then I keep the DB host and name in dbhost (:dbhost) and dbname (:dbname) respectively. That makes an entry look like this:

machine db.svc login user port port password pwd dbname dbname dbhost dbhost

If I use it in a property drawer it looks like this:

:PROPERTIES:
:header-args:sql: :engine postgresql
:header-args:sql+: :dbhost (mes/auth-get-key "db.svc" :dbhost)
:header-args:sql+: :dbport (string-to-number (mes/auth-get-key "db.svc" :port))
:header-args:sql+: :dbuser (mes/auth-get-key "db.svc" :user)
:header-args:sql+: :dbpassword (mes/auth-get-pwd "db.svc")
:header-args:sql+: :database (mes/auth-get-key "db.svc" :dbname)
:END:

September 01, 2024 01:03 PM

Mark Jason Dominus

Another corner of Pennsylvania

[ Previously: [1] [2] [3] ]

A couple of years back I wrote:

I live in southeastern Pennsylvania, so the Pennsylvania-New Jersey-Delaware triple point must be somewhere nearby. I sat up and got my phone so I could look at the map, and felt foolish.

Map of the Pennsylvania-New Jersey-Delaware triple border, about a kilometer offshore from Marcus Hook, PA, further described below.

As you can see, the triple point is in the middle of the Delaware River, as of course it must be; the entire border between Pennsylvania and New Jersey, all the hundreds of miles from its northernmost point (near Port Jervis) to its southernmost (shown above), runs right down the middle of the Delaware.

I briefly considered making a trip to get as close as possible, and photographing the point from land. That would not be too inconvenient. Nearby Marcus Hook is served by commuter rail. But Marcus Hook is not very attractive as a destination. Having been to Marcus Hook, it is hard for me to work up much enthusiasm for a return visit.

I was recently passing by Marcus Hook on the way back from Annapolis, so I thought what the heck, I'd stop in and see if I could get a look in the direction of the tripoint. As you can see from this screencap, I was at least standing in the right place, pointed in the right direction.

Screencap of my phone's map app, showing the same part of the river as the map above.  This one is marked with a blue dot (me) near the Marcus Hook Industrial Complex, pointed towards the tripoint, also labeled.

I didn't quite see the tripoint itself because this buoyancy-operated aquatic transport was in the way. I don't mind, it was more interesting to look at than open water would have been.

Photo of the Delaware river, taken from the Pennsylvania shore.  The near bank is covered with pretty green and purple weeds.  Floating in the river directly ahead is a pale green ship with a white superstructure, the BW Messina

Thanks to the Wonders of the Internet, I have learned that this is an LPG tanker. Hydrocarbons from hundreds of miles away are delivered to the refinery in Marcus Hook via rail, road, and pipeline, and then shipped out on vessels like this one. Infrastructure fans should check it out.

I was pleased to find that Marcus Hook wasn't as dismal as I remembered, it's just a typical industrial small town. I thought maybe I should go back and look around some more. If you hoped I might have something more interesting or even profound to say here, sorry.

Oh, I know. Here, I took this picture in Annapolis:

A sandstone plinth with the Maryland state coat of arms carved in bas-relief at the top. Under this are engraved words: ALBERT CABELL RITCHIE / 1876 – 1936 / FOUR TIMES GOVERNOR OF MARYLAND / HE WHO IS WORTHY OF HONOR DOES NOT DIE.

Perhaps he who is worthy of honor does not die. But fame is fleeting. Even if he who is worthy of honor does get a plinth, the grateful populace may not want to shell out for a statue.

by Mark Dominus (mjd@plover.com) at September 01, 2024 03:16 AM

August 29, 2024

Gabriella Gonzalez

Firewall rules: not as secure as you think

This post introduces some tricks for jailbreaking hosts behind “secure” enterprise firewalls in order to enable arbitrary inbound and outbound requests over any protocol. You’ll probably find the tricks outlined in the post useful if you need to deploy software in a hostile networking environment.

The motivation for these tricks is that you might be a vendor that sells software that runs in a customer’s datacenter (a.k.a. on-premises software), so your software has to run inside of a restricted network environment. You (the vendor) can ask the customer to open their firewall for your software to communicate with the outside world (e.g. your own datacenter or third party services), but customers will usually be reluctant to open their firewall more than necessary.

For example, you might want to ssh into your host so that you can service, maintain, or upgrade the host, but if you ask the customer to open their firewall to let you ssh in they’ll usually push back on or outright reject the request. Moreover, this isn’t one of those situations where you can just ask for forgiveness instead of permission because you can’t begin to do anything without explicitly requesting some sort of firewall change on their part.

So I’m about to teach you a bunch of tricks for efficiently tunneling whatever you want over seemingly innocuous openings in a customer’s firewall. These tricks will culminate with the most cursed trick of all, which is tunneling inbound SSH connections inside of outbound HTTPS requests. This will grant you full command-line access to your on-premises hosts using the most benign firewall permission that a customer can grant. Moreover, this post is accompanied by a repository named holepunch containing NixOS modules automating this ultimate trick which you can either use directly or consult as a working proof-of-concept for how the trick works.

Overview

Most of the tricks outlined in this post assume that you control the hosts on both ends of the network request. In other words, we’re going to assume that there is some external host in your datacenter and some internal host in the customer’s datacenter and you control the software running on both hosts.

There are four tricks in our arsenal that we’re going to use to jailbreak internal hosts behind a restrictive customer firewall:

  • forward proxies
  • reverse proxies
  • reverse tunnels
  • corkscrew

Once you master these four tools you will typically be able to do basically anything you want using the slimmest of firewall permissions.

You might also want to read another post of mine: Forward and reverse proxies explained. It’s not required reading for this post, but you might find it helpful or interesting if you like this post.

Proxies

We’re going to start with proxies since they’re the easiest thing to explain and have no other conceptual dependencies.

A proxy is a host that can connect to other hosts on a client’s behalf (instead of the client making a direct connection to those other hosts). We will call these other hosts “upstream hosts”.

One of the most common tricks when jailbreaking an internal host (in the customer’s datacenter) is to create an external host (in your datacenter) that is a proxy. This is really effective because the customer has no control over traffic between the proxy and upstream hosts. The customer’s firewall can only see, manage, and intercept traffic between the internal host and the proxy, but everything else is invisible to them.

There are two types of proxies, though: forward proxies and reverse proxies. Both types of proxies are going to come in handy for jailbreaking our internal host.

Forward proxy

A forward proxy is a proxy that lets the client decide which upstream host to connect to. In our case, the “client” is the internal host that resides in the customer datacenter that is trying to bypass the firewall.

Forward proxies come in handy when the customer restricts which hosts you’re allowed to connect to. For example, suppose that your external host’s address is external.example.com and your internal host’s address is internal.example.com. Your customer might have a firewall rule that prevents internal.example.com from connecting to any host other than external.example.com. The intention here is to prevent your machine from connecting to other (potentially malicious) machines. However, this firewall rule is quite easy for a vendor to subvert.

All you have to do is host a forward proxy at external.example.com and then any time internal.example.com wants to connect to any other domain (e.g. google.com) it can just route the request through the forward proxy hosted at external.example.com. For example, squid is a forward proxy that you can use for this purpose, and you could configure it like this:

acl internal src ${SUBNET OF YOUR INTERNAL SERVER(S)}

http_access allow internal
http_access deny all

… and then squid will let any program on internal.example.com connect to any host reachable from external.example.com so long as the program configured http://external.example.com:3128 as the forward proxy. For example, you’d be able to run this command on internal.example.com:

$ curl --proxy http://external.example.com:3128 https://google.com

… and the request would succeed despite the firewall because from the customer’s point of view they can’t tell that you’re using a forward proxy. Or can they?

Reverse proxy

Well, actually the customer can tell that you’re doing something suspicious. The connection to squid isn’t encrypted (note that the scheme for our forward proxy URI is http and not https), and most modern firewalls will be smart enough to monitor unencrypted traffic and notice that you’re trying to evade the firewall by using a forward proxy (and they will typically block your connection if you try this). Oops!

Fortunately, there’s a very easy way to evade this: encrypt the traffic to the proxy! There are quite a few ways to do this, but the most common approach is to put a “TLS-terminating reverse proxy” in front of any service that needs to be encrypted.

So what’s a “reverse proxy”? A reverse proxy is a proxy where the proxy decides which upstream host to connect to (instead of the client deciding). A TLS-terminating reverse proxy is one whose sole purpose is to provide an encrypted endpoint that clients can connect to and then it forwards unencrypted traffic to some (fixed) upstream endpoint (e.g. squid running on external.example.com:3128 in this example).

There are quite a few services created for doing this sort of thing, but the three I’ve personally used the most throughout my career are:

  • nginx
  • haproxy
  • stunnel

For this particular case, I actually will be using stunnel to keep things as simple as possible (nginx and haproxy require a bit more configuration to get working for this).

You would run stunnel on external.example.com with a configuration that would look something like this:

[default]
accept = 443
connect = localhost:3128
cert = /path/to/your-certificate.pem

… and now connections to https://external.example.com are encrypted and handled by stunnel, which will decrypt the traffic and route those requests to squid running on port 3128 of the same machine.

In order for this to work you’re going to need a valid certificate for external.example.com, which you can obtain for free using Let’s Encrypt. Then you concatenate the certificate and its private key into the single PEM file that you reference in the above stunnel configuration.

So if you’ve gotten this far your server can now access any publicly reachable address despite the customer’s firewall restriction. Moreover, the customer can no longer detect that anything is amiss because all of your connections to the outside world will appear to the customer’s firewall as encrypted HTTPS connections to external.example.com:443, which is an extremely innocuous type of connection.

Reverse tunnel

We’re only getting started, though! By this point we can make whatever outbound connections we want, but WHAT ABOUT INBOUND CONNECTIONS?

As it turns out, there is a trick known as a reverse tunnel which lets you tunnel inbound connections over outbound connections. Most reverse tunnels exploit two properties of TCP connections:

  • TCP connections may be long-lived (sometimes very long-lived)
  • TCP connections must necessarily support network traffic in both directions

Now, in the common case a lot of TCP connections are short-lived. For example, when you open https://google.com in your browser that is an HTTPS request which is layered on top of a TCP connection. The HTTP request message is data sent in one direction over the TCP connection and the HTTP response message is data sent in the other direction over the TCP connection and then the TCP connection is closed.

But TCP is much more powerful than that and reverse tunnels exploit that latent protocol power. To illustrate how that works I’ll use the most widely known type of reverse tunnel: the SSH reverse tunnel.

You typically create an SSH reverse tunnel by running a command like this from the internal machine (e.g. internal.example.com):

$ ssh -R "${EXTERNAL_PORT}:localhost:${INTERNAL_PORT}" -N external.example.com

In an SSH reverse tunnel, the internal machine (e.g. internal.example.com) initiates an outbound TCP request to the SSH daemon (sshd) listening on the external machine (e.g. external.example.com). When sshd receives this TCP request it keeps the TCP connection alive and then listens for inbound requests on EXTERNAL_PORT of the external machine. sshd forwards all requests received on that port through the still-alive TCP connection back to the INTERNAL_PORT on the internal machine. This works fine because TCP connections permit arbitrary data flow both ways and the protocol does not care if the usual request/response flow is suddenly reversed.

In fact, an SSH reverse tunnel doesn’t just let you make inbound connections to the internal machine; it lets you make inbound connections to any machine reachable from the internal machine (e.g. other machines inside the customer’s datacenter). However, those kinds of connections to other internal hosts can be noticed and blocked by the customer’s firewall.

From the point of view of the customer’s firewall, our internal machine has just made a single long-lived outbound connection to external.example.com and they cannot easily tell that the real requests are coming in the other direction (inbound) because those requests are being tunneled inside of the outbound request.

However, this is not foolproof, for two reasons:

  • A customer’s firewall can notice (and ban) a long-lived connection

    I believe it is possible to disguise a long-lived connection as a series of shorter-lived connections, but I’ve never personally done that before so I’m not equipped to explain how to do that.

  • A customer’s firewall will notice that you’re making an SSH connection of some sort

    Even when the SSH connection is encrypted it is still possible for a firewall to detect that the SSH protocol is being used. A lot of firewalls will be configured to ban SSH traffic by default unless explicitly approved.

However, there is a great solution to that latter problem, which is …

corkscrew

corkscrew is an extremely simple tool that wraps an SSH connection in an HTTP connection. This lets us disguise SSH traffic as HTTP traffic (which we can then further disguise as HTTPS traffic by encrypting the connection using stunnel).

Normally, the only thing we’d need to do is to extend our ssh -R command to add this option:

ssh -R -o 'ProxyCommand /path/to/corkscrew external.example.com 443 %h %p' …

… but this doesn’t work because corkscrew doesn’t support HTTPS connections (it’s an extremely simple program written in just a couple hundred lines of C code). So in order to work around that we’re going to use stunnel again, but this time we’re going to run stunnel in “client mode” on internal.example.com so that it can handle the HTTPS logic on behalf of corkscrew.

[default]
client = yes
accept = 3128
connect = external.example.com:443

… and then the correct ssh command is:

$ ssh -R -o 'ProxyCommand /path/to/corkscrew localhost 3128 %h %p' …

… and now you are able to disguise an outbound SSH request as an outbound HTTPS request.

MOREOVER, you can use that disguised outbound SSH request to create an SSH reverse tunnel which you can use to forward inbound traffic from external.example.com to any INTERNAL_PORT on internal.example.com. Can you guess what INTERNAL_PORT we’re going to pick?

That’s right, we’re going to forward inbound traffic to port 22: sshd. Also, we’re going to arbitrarily set EXTERNAL_PORT to 17705:

$ ssh -R 17705:localhost:22 -N external.example.com

Now, (separately from the above command) we can ssh into our internal server via our external server like this:

$ ssh -p 17705 external.example.com

… and we have complete command-line access to our internal server and the customer is none the wiser.
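
The ssh commands above show the ProxyCommand option and the reverse-tunnel flags separately. As a rough sketch (not taken from the holepunch repository), here is how the argument list for a single invocation combining them might be assembled. The function name reverseTunnelArgs and its parameters are hypothetical; the flags themselves (-o, -R, -N) are standard OpenSSH options, and the ports and paths in the usage note are the ones used earlier in this post.

```haskell
-- | Hypothetical helper: the argument list for one ssh invocation that both
-- connects through the local stunnel client (via corkscrew) and opens the
-- reverse tunnel. This is an assumption about how the pieces combine.
reverseTunnelArgs ::
  FilePath -> -- path to the corkscrew binary
  Int ->      -- local port where stunnel (client mode) listens
  Int ->      -- EXTERNAL_PORT to listen on at the external host
  Int ->      -- INTERNAL_PORT to forward to on the internal host
  String ->   -- external host name
  [String]
reverseTunnelArgs corkscrew stunnelPort externalPort internalPort externalHost =
  [ "-o"
  , "ProxyCommand=" ++ unwords [corkscrew, "localhost", show stunnelPort, "%h", "%p"]
  , "-R"
  , show externalPort ++ ":localhost:" ++ show internalPort
  , "-N"
  , externalHost
  ]
```

Calling reverseTunnelArgs "/path/to/corkscrew" 3128 17705 22 "external.example.com" produces the arguments for ssh with the ProxyCommand pointing at localhost 3128 and the tunnel specification 17705:localhost:22. Passing the arguments as a list rather than one shell string sidesteps the quoting needed around the ProxyCommand value.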

From the customer’s perspective, we just ask them for an innocent-seeming firewall rule permitting outbound HTTPS traffic from internal.example.com to external.example.com. That is the most innocuous firewall change we can possibly request (short of not opening the firewall at all).

Conclusion

I don’t think all firewall rules are ineffective or bad, but if the same person or organization controls both ends of a connection then typically anything short of completely disabling internet access can be jailbroken in some way with off-the-shelf open source tools. It does require some work, but as you can see with the associated holepunch repository even moderately sophisticated firewall escape hatches can be neatly packaged for others to reuse.

by Gabriella Gonzalez (noreply@blogger.com) at August 29, 2024 01:49 PM

Tweag I/O

Deploying Buildbarn on Kubernetes with mTLS on the side

We have shown the benefits of using a shared build cache as well as using remote build execution (RBE) to offload builds to a remote build farm. Our customers are interested in leveraging RBE to improve developer experience and reduce continuous integration (CI) run times, giving us an opportunity to learn all aspects of deploying different RBE solutions. I would like to share how one can deploy one of them, Buildbarn, and secure all communications in it.

What is it and why do we care?

We want developers to be productive. Being productive requires spending as little time as possible waiting for build/test feedback, not having to switch to a different task while the build is running.

Remote caching

One part of achieving this is to never build the same thing twice. Tools like Bazel support caching the result of every action, every tool execution. While many tools support storing results in a local directory, Bazel tracks the actions and their inputs with high granularity, resulting in more frequent “cache hits”. This is already a good gain for a single developer working on one machine. However, Bazel also supports conducting builds in a controlled environment with identical tooling and using a remote cache that can be shared between team members and CI, taking things a significant step further. You won’t have to rebuild anything that has been built by your colleagues or by CI, which means starting up on a new machine, onboarding a new team member or reproducing issues becomes faster.

Remote build execution

The second part of keeping developers productive is allowing them to use the right tools for the job. They still often need to build new things, and their local machine may not be the fastest, may not have enough charge, or may have the wrong architecture or OS. Remote build execution extends remote caching by executing actions on shared builders when their results are not cached already. This allows setting up a shared pool of necessary hardware or virtual compute for both developers and CI. In Bazel this was implemented using the RBE API.
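
The relationship between caching and execution can be sketched as a service that first consults a shared cache and only dispatches to a matching worker on a miss. This Python model is illustrative only: Worker and RemoteExecutionService are hypothetical names, and the real RBE API is a gRPC protocol with far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    properties: dict  # for example {"OSFamily": "linux"}

    def execute(self, action_key):
        # Stand-in for actually running a compiler or linker.
        return f"{action_key} built on {self.name}"


@dataclass
class RemoteExecutionService:
    workers: list
    cache: dict = field(default_factory=dict)

    def run(self, action_key, required):
        # 1. Shared cache first: anyone's earlier build counts.
        if action_key in self.cache:
            return self.cache[action_key], "remote cache hit"
        # 2. Otherwise pick a worker whose properties satisfy the request.
        for worker in self.workers:
            if all(worker.properties.get(k) == v for k, v in required.items()):
                result = worker.execute(action_key)
                self.cache[action_key] = result
                return result, "remote"
        raise LookupError(f"no worker matches {required}")
```

A client on a macOS laptop can thus request an action with OSFamily set to linux and receive the result without having a Linux toolchain locally.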

RBE implementations

Since the last post, RBE for Google Cloud Platform (GCP) has disappeared, and several new self-service and commercial services have been created. The RBE API has also gained popularity with different build systems, including Bazel (where it started), Buck2, and BuildStream. It is also used in projects that cannot change their build systems easily, but can use reclient to wrap all build actions and forward them to an RBE service. Examples of such setups include Android, Fuchsia and Chromium.

We’ll focus on one of the open-source RBE API servers, Buildbarn.

Securing remote cache and builds

Any shared infrastructure implies some security risks. When sending code to be built remotely we expose it on the network, where it can be intercepted or altered. When reading from the cache, we trust it to contain valid, unaltered results. When setting up a pool of compute resources, we expect them to be used only for building our code, and not for enriching third parties. All these expectations mean that we require all communications with remote infrastructure and within it to be encrypted and authenticated. The industry standard for achieving this is mTLS: the Transport Layer Security (TLS) protocol with mutual authentication. It uses public key infrastructure (PKI) to allow both clients and servers to verify each other’s identities before sending any data, and makes sure that the data sent on one side matches the data received on the other side.
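
What distinguishes mTLS from ordinary TLS is that the server also demands a certificate from the client. The following sketch uses Python’s standard ssl module to show both sides; it is not how Buildbarn or Bazel are implemented, and the file arguments are optional only so that the contexts can be inspected without certificate files at hand.

```python
import ssl


def server_context(ca_file=None, cert_file=None, key_file=None):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The "mutual" part: refuse clients that present no valid certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)   # our own identity
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # whom we trust
    return ctx


def client_context(ca_file=None, cert_file=None, key_file=None):
    # PROTOCOL_TLS_CLIENT verifies the server certificate and hostname
    # by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)   # presented to the server
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

In a real deployment all three files are required on both sides: a CA certificate to verify the peer, plus a certificate and private key to prove one’s own identity.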

Overview

In this extended blog post we’ll start by showing how to deploy Buildbarn on a Kubernetes cluster running in a local VM and configure a simple Bazel example to use it. Then we’ll turn on mTLS with the help of cert-manager for all Buildbarn pieces communicating with one another, and, finally, configure Bazel on a developer or CI machine to authenticate over the RBE API with a certificate and verify the one presented by the build server.

This blog post contains a lot of code snippets that let you follow the installation process step by step. If you copy each command into your terminal in order, you should see the same results as described. If you prefer to jump to the final result and look at the complete picture, you can check out our fork of the upstream buildbarn/bb-deployments repository and follow the instructions there.

Deploying Buildbarn

In this section we’ll create a local Buildbarn deployment on a Kubernetes cluster running in a VM. We’ll create a local VM with Kubernetes using an example config provided by lima. Then we’ll configure persistent volumes for Buildbarn storage inside that VM. After that we’ll use the Kubernetes example from a repository provided by Buildbarn to deploy Buildbarn itself.

Setting up a Kubernetes instance

If you already have access to a Kubernetes cluster that you can use, you can skip this section. Here we’ll deploy a local VM with Kubernetes running in it. The subsequent steps assume that you’re using a local VM, so you’ll have to adjust some parameters accordingly if your setup differs.

I’ve found that the easiest and most portable way to get Kubernetes running locally is to use the lima (Linux Machines) project. You can follow the official docs to install it. I prefer using Nix and direnv, so I’ve created a .envrc file with one line use nix and shell.nix with the following contents:

{ nixpkgs ? builtins.getFlake "nixpkgs"
, system ? builtins.currentSystem
, pkgs ? nixpkgs.legacyPackages.${system}
}:
pkgs.mkShell {
  packages = with pkgs; [
    kubectl
    lima-bin
    jq
  ];
}

Then you just need to run direnv allow and it will fetch the necessary packages and make them available in your shell.

Now we can create a Lima VM from the k8s template. We remove mounts from the template to specify our own later. We also need to add some special options for running on macOS:

limactl create template://k8s --name k8s --tty=false \
  --set '.provision |= . + {"mode":"system","script":"#!/bin/bash
for d in /mnt/fast-disks/vol{0,1,2,3}; do sudo mkdir -p $d; sudo mount --bind $d $d; done"}' \
  $([ "$(uname -s)" = "Darwin" ] && { echo "--vm-type vz"; [ "$(uname -m)" = "arm64" ] && echo "--rosetta"; })

The arguments are:

  • --name k8s sets a name for the new VM; it defaults to the template name, but let’s keep it explicit
  • --set '.provision ...' uses a yq expression (jq-like syntax) to add an additional provision step to the resulting YAML file, creating the necessary mountpoints for persistent volumes
  • --tty=false disables console prompts and confirmations
  • for macOS we also add --vm-type vz to use the native macOS Virtualization framework instead of QEMU for a faster VM
  • for Apple Silicon we also add --rosetta to enable the translation layer, allowing us to run x86_64 containers in the VM with little overhead
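
The conditional at the end of the limactl command can be hard to read as a shell one-liner. The same decision, written as a small Python function for illustration (lima_extra_args is a hypothetical name; the flags are the ones from the command above):

```python
import platform


def lima_extra_args(system, machine):
    """Mirror the `uname -s` / `uname -m` checks from the shell command."""
    args = []
    if system == "Darwin":
        # Use the macOS Virtualization framework instead of QEMU.
        args += ["--vm-type", "vz"]
        if machine == "arm64":
            # Apple Silicon: enable Rosetta for x86_64 containers.
            args.append("--rosetta")
    return args


if __name__ == "__main__":
    print(lima_extra_args(platform.system(), platform.machine()))
```

On Linux the function returns an empty list, which matches the shell version expanding to nothing.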

You can start the final VM and check if it is ready with:

limactl start k8s
export KUBECONFIG=~/.lima/k8s/copied-from-guest/kubeconfig.yaml
kubectl get node

It will take some time to bootstrap Kubernetes, after which it should show you one node called lima-k8s with Ready status:

NAME       STATUS   ROLES           AGE     VERSION
lima-k8s   Ready    control-plane   4m54s   v1.29.2

Buildbarn will need some PersistentVolumes to store data. Let’s teach it to use the mounts that we created earlier for that. First, configure a storage class:

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-disks
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF

It should respond with storageclass.storage.k8s.io/fast-disks created.

Then start a local volume provisioner from sig-storage-local-static-provisioner:

curl -L https://raw.githubusercontent.com/kubernetes-sigs/sig-storage-local-static-provisioner/master/deployment/kubernetes/example/default_example_provisioner_generated.yaml | kubectl apply -f -

Run kubectl get pv to see that it created four volumes. They may take several seconds to appear. You can check the provisioner’s logs for any errors with kubectl logs daemonset/local-volume-provisioner.

Deploying Buildbarn

bb-deployments provides a Kustomize template to deploy Buildbarn. Let’s clone it, patch one service so that we can run it locally, and deploy:

git clone https://github.com/buildbarn/bb-deployments.git
pushd bb-deployments/kubernetes
cat >> kustomization.yaml <<EOF

# patch frontend service to not require external load balancers
patches:
  - target:
      kind: Service
      name: frontend
    patch: |
      - op: replace
        path: /spec/type
        value: NodePort
      - op: add
        path: /spec/ports/0/nodePort
        value: 30080
EOF
kubectl apply -k .
kubectl rollout status -k . 2>&1 | grep -Ev "no status|unable to decode"

The last command will wait for everything to start. We’ve filtered out all messages about resources that it doesn’t know how to wait for.
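
The patch we appended to kustomization.yaml is a JSON Patch: a list of operations, each addressing one location in the target object. As a rough illustration of what Kustomize does with it, here is a deliberately tiny Python interpreter for the two operations used above. The Service below is a simplified, assumed shape rather than the real manifest, and the function handles neither the - (append) token nor escaped path segments.

```python
import copy


def apply_patch(doc, ops):
    """Apply 'add'/'replace' operations that target object members."""
    doc = copy.deepcopy(doc)
    for op in ops:
        *parents, last = op["path"].lstrip("/").split("/")
        target = doc
        for part in parents:
            # Numeric segments index into lists, others into mappings.
            target = target[int(part)] if isinstance(target, list) else target[part]
        if op["op"] == "replace" and last not in target:
            raise KeyError(op["path"])  # replace requires an existing member
        target[last] = op["value"]
    return doc


# Assumed, simplified shape of the frontend Service before patching.
service = {"spec": {"type": "LoadBalancer", "ports": [{"port": 8980}]}}
ops = [
    {"op": "replace", "path": "/spec/type", "value": "NodePort"},
    {"op": "add", "path": "/spec/ports/0/nodePort", "value": 30080},
]
patched = apply_patch(service, ops)
```

After patching, the Service is of type NodePort and its first port is exposed on node port 30080, which is the port the health check in the next step connects to.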

To check that the Buildbarn frontend is accessible, we can use grpc-client-cli. Add it to the list in shell.nix, save it and run:

grpc-client-cli -a 127.0.0.1:30080 health

It should report that it is SERVING:

{
 "status": "SERVING"
}

We can exit the bb-deployments directory now:

popd

In this section we’ve deployed Buildbarn and verified that its API is accessible. Now we’ll move on to setting up a small Bazel project to use it. Then we’ll configure mTLS on Buildbarn, and finally configure Bazel to work with mTLS.

Using Buildbarn

Let’s set up a small Bazel project to use our Buildbarn instance. In this section we’ll use the Bazel examples repo and show how to build it using Bazel locally and with RBE. We’ll also see how remote caching speeds up builds by caching intermediate results.

We will be using Bazelisk to fetch and run the upstream distribution of Bazel. First we’ll need to install Bazelisk by adding bazelisk to shell.nix. If you are running NixOS, you will have to create an FHS environment to run Bazel. If you are running macOS and don’t have Xcode command line tools installed, you also need to provide the necessary libraries to the bazel invocation. Add this to your shell.nix:

pkgs.mkShell {
  packages = with pkgs; [
    ...
    bazelisk
  ];
  env = pkgs.lib.optionalAttrs pkgs.stdenv.isDarwin {
    BAZEL_LINKOPTS = with pkgs.darwin.apple_sdk;
      "-F${frameworks.Foundation}/Library/Frameworks:-L${objc4}/lib";
    BAZEL_CXXOPTS = "-I${pkgs.libcxx.dev}/include/c++/v1";
  };
  # fhs is only used on NixOS
  passthru.fhs = (pkgs.buildFHSUserEnv {
    name = "bazel-userenv";
    runScript = "zsh";  # replace with your shell of choice
    targetPkgs = pkgs: with pkgs; [
      libz  # required for bazelisk to unpack Bazel itself
    ];
  }).env;
}

Then on NixOS you can run nix-shell -A fhs to enter an environment where directories like /bin, /usr and /lib are set up as tools made for other Linux distributions expect.

Now we can clone the Bazel examples repo and enter the simple C++ example in it:

git clone --depth 1 https://github.com/bazelbuild/examples
pushd examples/cpp-tutorial/stage1

On macOS we’ll need to configure compiler and linker flags to look for libraries in the Nix store:

echo "build:macos --action_env=BAZEL_CXXOPTS=${BAZEL_CXXOPTS}" >> .bazelrc
echo "build:macos --action_env=BAZEL_LINKOPTS=${BAZEL_LINKOPTS}" >> .bazelrc

We will be building remotely for the Linux platform later, so we should specify a concrete platform and toolchain to use for Linux:

echo "build:linux --platforms=@aspect_gcc_toolchain//platforms:x86_64_linux" >> .bazelrc
echo "build:linux --extra_execution_platforms=@aspect_gcc_toolchain//platforms:x86_64_linux" >> .bazelrc

And then build and run the example locally:

bazelisk run //main:hello-world

You should see output like:

Starting local Bazel server and connecting to it...
INFO: Analyzed target //main:hello-world (38 packages loaded, 165 targets configured).
INFO: Found 1 target...
Target //main:hello-world up-to-date:
  bazel-bin/main/hello-world
INFO: Elapsed time: 7.545s, Critical Path: 0.94s
INFO: 8 processes: 6 internal, 2 processwrapper-sandbox.
INFO: Build completed successfully, 8 total actions
INFO: Running command line: bazel-bin/main/hello-world
Hello world

Note that if we run bazelisk run //main:hello-world again, it’ll be much faster, because Bazel only spends a fraction of a second on computing the action graph and making sure that nothing needs to be rebuilt:

...
INFO: Elapsed time: 0.113s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
...

We can also run bazelisk clean to remove previous output and re-run it to make sure we can rebuild from scratch.

Now let’s try building it using Buildbarn. First we need to configure execution properties to match the ones set up in Buildbarn’s worker config:

echo "build:remote --remote_default_exec_properties OSFamily=linux" >> .bazelrc
echo "build:remote --remote_default_exec_properties container-image=docker://ghcr.io/catthehacker/ubuntu:act-22.04@sha256:5f9c35c25db1d51a8ddaae5c0ba8d3c163c5e9a4a6cc97acd409ac7eae239448" >> .bazelrc

Then we should tell Bazel to use Buildbarn as a remote executor:

echo "build:remote --remote_executor grpc://127.0.0.1:30080" >> .bazelrc

Now we can build it with bazelisk build --config=linux --config=remote //main:hello-world. Note that it will take some time to extract the Linux compiler and supplemental files first:

INFO: Invocation ID: d70b9d30-1865-4d1f-8d52-77c6fc5ec607
INFO: Build options --extra_execution_platforms, --incompatible_enable_cc_toolchain_resolution, and --platforms have changed, discarding analysis cache.
INFO: Analyzed target //main:hello-world (3 packages loaded, 6315 targets configured).
INFO: Found 1 target...
Target //main:hello-world up-to-date:
  bazel-bin/main/hello-world
INFO: Elapsed time: 96.249s, Critical Path: 52.72s
INFO: 5 processes: 3 internal, 2 remote.
INFO: Build completed successfully, 5 total actions

As you can see, two actions were executed remotely: compilation and linking. But we can find the result locally in bazel-bin/main/hello-world (and run it if we’re on an appropriate platform):

 % file bazel-bin/main/hello-world
bazel-bin/main/hello-world: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 4.9.0, not stripped

Now if we clean local caches and rebuild, we can see that it reuses results already stored in Buildbarn (remote cache hits):

 % bazelisk clean
INFO: Invocation ID: d655d3f2-071d-48ff-b3e9-e0b1c61ae5fb
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
 % bazelisk build --config=linux --config=remote //main:hello-world
INFO: Invocation ID: d38526d8-0242-4b91-92da-20ddd110d3ae
INFO: Analyzed target //main:hello-world (41 packages loaded, 6315 targets configured).
INFO: Found 1 target...
Target //main:hello-world up-to-date:
  bazel-bin/main/hello-world
INFO: Elapsed time: 0.663s, Critical Path: 0.07s
INFO: 5 processes: 2 remote cache hit, 3 internal.
INFO: Build completed successfully, 5 total actions
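
If you want to check for cache hits in a script rather than by eye, the process summary line can be parsed. The sketch below is based only on the output format shown in this post, which may differ between Bazel versions; parse_process_summary is a hypothetical helper.

```python
import re


def parse_process_summary(line):
    """Parse e.g. 'INFO: 5 processes: 2 remote cache hit, 3 internal.'"""
    match = re.search(r"INFO: (\d+) process(?:es)?: (.*)\.$", line.strip())
    if not match:
        return None
    counts = {}
    for part in match.group(2).split(", "):
        number, kind = part.split(" ", 1)
        counts[kind] = int(number)
    return {"total": int(match.group(1)), "by_kind": counts}


summary = parse_process_summary(
    "INFO: 5 processes: 2 remote cache hit, 3 internal."
)
# summary["by_kind"].get("remote cache hit", 0) is the number of reused actions.
```

A CI job could, for example, fail or warn when the number of remote cache hits is unexpectedly zero after a clean rebuild.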

We can exit the examples directory now:

popd

In this section we’ve configured a Bazel project to be built using our Buildbarn instance. Now we’ll configure mTLS on Buildbarn and then finally reconfigure this Bazel project to access Buildbarn using mTLS.

Configuring TLS in Buildbarn

We want each component of Buildbarn to have its own automatically generated certificate and use it to connect to other components. On the other side, each component that accepts connections should verify that the incoming connection is accompanied by a valid certificate as well. In this section we’ll use cert-manager to generate certificates and a more secure CSI driver to request certificates and propagate them to Buildbarn components. Then we’ll configure Buildbarn components to verify both sides of each connection. Here’s what this process should look like for frontend and storage containers, for example:

Node 1                       │        Kubernetes API             │ Node 2
                             │                                   │
┌─────────────────────────┐  │                                   │  ┌─────────────────────────┐
│ Frontend pod            │  │              mTLS                 │  │             Storage pod │
│      bb-storage process │<───────────────────────────────────────>│ bb-storage process      │
├─────────────────────────┤  │       ┌──────────────┐            │  ├─────────────────────────┤
│ CSI volume       ca.crt │  │       │ cert-manager │            │  │ ca.crt       CSI volume │
│       tls.key   tls.crt │  │       └─────┬────────┘            │  │ tls.crt   tls.key       │
└──────────^─────────^────┘  │             │ fills out           │  └───^─────────^───────────┘
           │         │       │             V                     │      │         │
       generates  stores     │    apiVersion: cert-manager.io/v1 │   stores   generates
           │         │            kind: CertificateRequest              │         │
          ┌┴─────────┴─┐ creates  spec:                                ┌┴─────────┴─┐
          │ CSI driver │────────>   request: LS0tLS...                 │ CSI driver │
          └────────────┘          status:                              └────────────┘
                     ^ retrieves    certificate: ...
                     └───────────   ca: ...
  1. The CSI driver sees the CSI volume and generates a key in tls.key there.
  2. The CSI driver uses the key from tls.key to generate a Certificate Signing Request (CSR) and creates a CertificateRequest resource in the Kubernetes API with it.
  3. cert-manager signs the CertificateRequest with the CA certificate and puts both the resulting certificate and the CA certificate in the CertificateRequest’s status.
  4. The CSI driver stores them in tls.crt and ca.crt respectively in the CSI volume.
  5. The bb-storage process in the frontend pod uses the certificate and key from tls.crt and tls.key to establish a TLS connection to the storage pod, verifying that the latter presents a valid certificate signed by the CA certificate from ca.crt.
  6. On the storage side, tls.key, tls.crt and ca.crt are filled out in a similar manner.
  7. The bb-storage process in the storage pod verifies the incoming certificate with the CA certificate from ca.crt and presents the certificate from tls.crt to the frontend.

Notice how with this approach secret keys never leave the node where they are generated and used, and the connection between frontend and storage pods is authenticated on both ends.

Installing cert-manager

To generate certificates for our Buildbarn we need to install and configure cert-manager itself and its CSI driver. cert-manager is responsible for generating and updating certificates requested via Kubernetes API objects. The CSI driver lets users create special volumes in pods where private keys are generated locally and certificates are requested from cert-manager and provided to the pod.

First, let’s fetch all necessary manifests and add them to our deployment. The cert-manager project publishes a ready-to-use Kubernetes manifest, so we can manually fetch it:

pushd bb-deployments/kubernetes
curl -LO https://github.com/cert-manager/cert-manager/releases/download/v1.14.3/cert-manager.yaml

And then add it to the resources section of our kustomization.yaml:

resources:
  - ...
  - cert-manager.yaml

Unfortunately, the cert-manager CSI driver doesn’t directly provide a k8s manifest, but rather a Helm chart. Add kubernetes-helm to your shell.nix and then run:

helm template -n cert-manager -a storage.k8s.io/v1/CSIDriver https://charts.jetstack.io/charts/cert-manager-csi-driver-v0.7.1.tgz > cert-manager-csi-driver.yaml

-a storage.k8s.io/v1/CSIDriver makes sure that the chart uses the latest version of the Kubernetes API to register itself.

Then we can add it to the resources section of our kustomization.yaml:

resources:
  - ...
  - cert-manager.yaml
  - cert-manager-csi-driver.yaml

Let’s deploy and wait for everything to start. We will use cmctl to check that cert-manager is working correctly, so you’ll need to add it to shell.nix.

kubectl apply -k .
kubectl rollout status -k . 2>&1 | grep -Ev "no status|unable to decode"
cmctl check api --wait 10m
kubectl get csinode -o yaml

cmctl should report The cert-manager API is ready, and the last command should output your only node with one driver called csi.cert-manager.io installed:

namespace/buildbarn unchanged
namespace/cert-manager created
...
mutatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
validatingwebhookconfiguration.admissionregistration.k8s.io/cert-manager-webhook created
...
The cert-manager API is ready
apiVersion: v1
items:
- apiVersion: storage.k8s.io/v1
  kind: CSINode
  metadata:
    ...
    name: lima-k8s
    ...
  spec:
    drivers:
    - name: csi.cert-manager.io
      nodeID: lima-k8s
      topologyKeys: null
kind: List
metadata:
  resourceVersion: ""

If it says drivers: null, re-run kubectl get csinode -o yaml a bit later to allow more time for driver deployment and startup.
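
Instead of re-running the command by hand, the check can be polled. This is a hedged sketch: it assumes kubectl is on the PATH and uses -o json rather than -o yaml so that the output can be parsed with the standard library; csi_drivers, wait_for_driver and fetch_from_kubectl are hypothetical helper names.

```python
import json
import subprocess
import time


def csi_drivers(csinode_list):
    """Driver names from the parsed output of `kubectl get csinode -o json`."""
    names = []
    for item in csinode_list.get("items", []):
        # `drivers` is null until a driver has registered on the node.
        for driver in item["spec"].get("drivers") or []:
            names.append(driver["name"])
    return names


def wait_for_driver(fetch, name="csi.cert-manager.io", attempts=30, delay=2.0):
    """Call fetch() until the named driver shows up or attempts run out."""
    for _ in range(attempts):
        if name in csi_drivers(fetch()):
            return True
        time.sleep(delay)
    return False


def fetch_from_kubectl():
    out = subprocess.run(
        ["kubectl", "get", "csinode", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

Calling wait_for_driver(fetch_from_kubectl) then returns True as soon as the driver is registered, or False after roughly a minute with the default settings.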

Creating CA certificate

First we need to create a CA certificate and an Issuer that cert-manager will use to generate certificates for our needs. Note that to generate a self-signed certificate we’ll also need to create another issuer. Put this in ca.yaml:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned
  namespace: buildbarn
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ca
  namespace: buildbarn
spec:
  isCA: true
  commonName: ca
  secretName: ca
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned
    kind: Issuer
    group: cert-manager.io
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: ca
  namespace: buildbarn
spec:
  ca:
    secretName: ca

Then add it to the resources section of our kustomization.yaml:

resources:
  - ...
  - ca.yaml

Then apply it and check the status of the issuers:

kubectl apply -k .
kubectl -n buildbarn get issuers -o wide

Both issuers should be there, and the ca issuer should have the Signing CA verified status:

NAME         READY   STATUS                AGE
ca           True    Signing CA verified   14s
selfsigned   True                          14s

If it says something like secrets "ca" not found, it means it needs some time to generate the certificate. Re-run kubectl -n buildbarn get issuers -o wide.

Generating certificates for Buildbarn components

As mentioned before, we will be generating certificates for each component using cert-manager’s CSI driver. To do this, we need to add a volume to each pod and mount it into the main container so that the service can read it. We also need to pass the CA certificate into all these containers to verify the other side of each connection. Unfortunately, Buildbarn doesn’t support reading it from a file, so we’ll have to pass it statically via config. Let’s prepare this config file using a command that reads the CA certificate via the Kubernetes API and formats it using jq into a JSON string:

kubectl -n buildbarn get certificaterequests ca-1 -o jsonpath='{.status.ca}' | base64 -d | jq --raw-input --slurp . > config/ca-cert.jsonnet
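
The base64 -d and jq --raw-input --slurp . stages of this pipeline decode the certificate and wrap the whole PEM text in a single JSON string literal. Since a JSON string is also a valid Jsonnet expression, the resulting file can be imported from other Jsonnet files. A Python equivalent of those two stages, for illustration only (ca_to_jsonnet is a hypothetical name):

```python
import base64
import json


def ca_to_jsonnet(status_ca_b64):
    """Turn the base64 value of .status.ca into a one-line JSON string."""
    pem = base64.b64decode(status_ca_b64).decode("utf-8")
    # json.dumps escapes the newlines inside the PEM block as \n,
    # like `jq --raw-input --slurp .` does.
    return json.dumps(pem) + "\n"
```

The output is a single line starting with "-----BEGIN CERTIFICATE-----\n, which is what you should see if you open config/ca-cert.jsonnet.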

Now we can configure all pods by adding the following patches in kustomization.yaml:

patches:
  - ...
  - target:
      kind: Deployment
      namespace: buildbarn
    patch: |
      - op: add
        path: /spec/template/spec/volumes/-
        value:
          name: tls-cert
          csi:
            driver: csi.cert-manager.io
            readOnly: true
            volumeAttributes:
              csi.cert-manager.io/issuer-name: ca
      - op: add
        path: /spec/template/spec/containers/0/volumeMounts/-
        value:
          mountPath: /cert
          name: tls-cert
          readOnly: true
  - target:
      kind: Deployment
      namespace: buildbarn
      name: frontend
    patch: |
      - op: add
        path: /spec/template/spec/volumes/0/configMap/items/-
        value:
          key: ca-cert.jsonnet
          path: ca-cert.jsonnet
      - op: add
        path: /spec/template/spec/volumes/1/csi/volumeAttributes/csi.cert-manager.io~1dns-names
        value: frontend,frontend.${POD_NAMESPACE},frontend.${POD_NAMESPACE}.svc.cluster.local
      - op: add
        path: /spec/template/spec/volumes/1/csi/volumeAttributes/csi.cert-manager.io~1ip-sans
        value: 127.0.0.1
  - target:
      kind: Deployment
      namespace: buildbarn
      name: browser
    patch: |
      - op: add
        path: /spec/template/spec/volumes/0/configMap/items/-
        value:
          key: ca-cert.jsonnet
          path: ca-cert.jsonnet
      - op: add
        path: /spec/template/spec/volumes/1/csi/volumeAttributes/csi.cert-manager.io~1dns-names
        value: browser,browser.${POD_NAMESPACE},browser.${POD_NAMESPACE}.svc.cluster.local
  - target:
      kind: Deployment
      namespace: buildbarn
      name: scheduler-ubuntu22-04
    patch: |
      - op: add
        path: /spec/template/spec/volumes/0/configMap/items/-
        value:
          key: ca-cert.jsonnet
          path: ca-cert.jsonnet
      - op: add
        path: /spec/template/spec/volumes/1/csi/volumeAttributes/csi.cert-manager.io~1dns-names
        value: scheduler,scheduler.${POD_NAMESPACE}
  - target:
      kind: Deployment
      namespace: buildbarn
      name: worker-ubuntu22-04
    patch: |
      - op: add
        path: /spec/template/spec/volumes/1/configMap/items/-
        value:
          key: ca-cert.jsonnet
          path: ca-cert.jsonnet
      - op: add
        path: /spec/template/spec/volumes/3/csi/volumeAttributes/csi.cert-manager.io~1dns-names
        value: worker,worker.${POD_NAMESPACE}
  - target:
      kind: StatefulSet
      namespace: buildbarn
      name: storage
    patch: |
      - op: add
        path: /spec/template/spec/volumes/0/configMap/items/-
        value:
          key: ca-cert.jsonnet
          path: ca-cert.jsonnet
      - op: add
        path: /spec/template/spec/volumes/-
        value:
          name: tls-cert
          csi:
            driver: csi.cert-manager.io
            readOnly: true
            volumeAttributes:
              csi.cert-manager.io/issuer-name: ca
              csi.cert-manager.io/dns-names: ${POD_NAME}.storage,${POD_NAME}.storage.${POD_NAMESPACE}
      - op: add
        path: /spec/template/spec/containers/0/volumeMounts/-
        value:
          mountPath: /cert
          name: tls-cert
          readOnly: true

To avoid repetition, the first patch is applied to all Deployment objects, and subsequent patches only add the proper list of DNS names for each certificate. Note that many of those DNS names will not be used as only some of these services actually accept connections. For the frontend Deployment we also add the 127.0.0.1 IP so that it can be accessed via a port forwarded to localhost as we currently use it on the host machine. For the storage StatefulSet we configure a unique DNS name for each Pod because they are contacted directly and not through a common service. For each of these we also add ca-cert.jsonnet to the list of files used from the configuration ConfigMap. We also need to add it to the ConfigMap itself by adding it to the list in config/kustomization.yaml:

configMapGenerator:
  - name: buildbarn-config
    namespace: buildbarn
    files:
      - ...
      - ca-cert.jsonnet
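
One detail in the patches earlier in this section deserves a note: paths such as .../volumeAttributes/csi.cert-manager.io~1dns-names contain ~1 because the attribute name itself includes a slash, and JSON Pointer (RFC 6901) uses / as the segment separator. The following sketch shows how such a path is split and decoded; pointer_tokens is a hypothetical helper name.

```python
def pointer_tokens(path):
    """Split a JSON Pointer into decoded reference tokens (RFC 6901)."""
    if path == "":
        return []
    if not path.startswith("/"):
        raise ValueError("a JSON Pointer must be empty or start with '/'")
    # Decode ~1 before ~0 so that "~01" becomes "~1" and not "/".
    return [
        token.replace("~1", "/").replace("~0", "~")
        for token in path[1:].split("/")
    ]
```

Applied to the path above, the last token decodes to csi.cert-manager.io/dns-names, which is the attribute name the CSI driver expects.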

We can apply all these changes with:

kubectl apply -k .
kubectl rollout status -k . 2>&1 | grep -Ev "no status|unable to decode"

Now you can fetch the list of CertificateRequest objects to see their statuses:

kubectl -n buildbarn get certificaterequest

It will output one request for the ca certificate named ca-1 and a bunch of requests generated for each pod:

NAME                                   APPROVED   DENIED   READY   ISSUER       REQUESTOR                                                    AGE
14468f64-909f-43d1-b67d-07b0844c0683   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m
1d9e41a6-e58f-4c13-b9e6-0b1ba1d5a4f6   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
2c2f1177-81fc-45e5-8487-9b66bc0d6f73   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
31fdb0ef-0c0b-4a06-94af-fb17875ee05d   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
376d0933-c0e9-4d39-b5c6-b76071c65966   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m58s
3967cdd6-7d48-4814-8cec-542041182dd0   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
464a1f35-f0ba-4236-aeec-294f880d9675   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m57s
5181e602-276e-413e-8888-76c4bd1ede21   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m57s
6f02092d-b8a3-4eb7-8ff2-5e4a433d59bb   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
710a458e-6ba0-4a44-87ab-5115b5a2c213   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m58s
753c4653-71ae-447e-bbe5-022ce35cee9d   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
8bcbb5a0-4575-40ad-b842-9c86bde8fdb8   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m56s
8df59bf5-ed23-47af-bfcc-3cf8a9053b9b   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m1s
b47fff23-40b4-43ed-8e34-35d988eb434d   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m56s
be72bdc6-c61d-4f1b-928e-f743df0f6188   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   4m57s
c14a52d5-dc20-4626-afe6-975442103d8b   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m
ca-1                                   True                True    selfsigned   system:serviceaccount:cert-manager:cert-manager              3d22h
ceabf1ab-06a7-47c0-855a-2009bbbd2418   True                True    ca           system:serviceaccount:cert-manager:cert-manager-csi-driver   5m

Using certificates

Now that we’ve generated all necessary certificates and made them available to all pods, we can configure all components to use them. We’ll use similar stanzas for each service, so let’s first add some helper functions to the top of config/common.libsonnet:

local localKeyPair = {
  files: {
    certificate_path: '/cert/tls.crt',
    private_key_path: '/cert/tls.key',
    refresh_interval: '3600s',
  },
};

local grpcClientWithTLS = function(address) {
  address: address,
  tls: {
    server_certificate_authorities: import 'ca-cert.jsonnet',
    client_key_pair: localKeyPair,
  },
};

local oneListenAddressWithTLS = function(address) [{
  listenAddresses: [address],
  authenticationPolicy: {
    tls_client_certificate: {
      client_certificate_authorities: import 'ca-cert.jsonnet',
      validation_jmespath_expression: '`true`',
      metadata_extraction_jmespath_expression: '`{}`',
    },
  },
  tls: {
    server_key_pair: localKeyPair,
  },
}];

And then expose these functions to use in other configs at the end of the file:

  ...
  grpcClientWithTLS: grpcClientWithTLS,
  oneListenAddressWithTLS: oneListenAddressWithTLS,
}

Note that local certificate and key files will be reloaded every hour per the refresh_interval setting, but the CA certificate will need to be reconfigured manually every time it refreshes.

Also note that we accept all valid certificates by setting validation_jmespath_expression to `true`. This expression can be configured later for each service if needed.
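
To see what these helpers evaluate to, it can help to look at the resulting data structure. The Python functions below mirror the two Jsonnet functions field by field. CA_CERT is a placeholder for the string that import 'ca-cert.jsonnet' yields, and the Python names are hypothetical.

```python
import json

# Placeholder for the string that `import 'ca-cert.jsonnet'` evaluates to.
CA_CERT = "<PEM text of the CA certificate>"

LOCAL_KEY_PAIR = {
    "files": {
        "certificate_path": "/cert/tls.crt",
        "private_key_path": "/cert/tls.key",
        "refresh_interval": "3600s",
    },
}


def grpc_client_with_tls(address):
    # Client side: verify the server against the CA, present our key pair.
    return {
        "address": address,
        "tls": {
            "server_certificate_authorities": CA_CERT,
            "client_key_pair": LOCAL_KEY_PAIR,
        },
    }


def one_listen_address_with_tls(address):
    # Server side: present our key pair, require a client certificate
    # signed by the CA, and accept every certificate that validates.
    return [{
        "listenAddresses": [address],
        "authenticationPolicy": {
            "tls_client_certificate": {
                "client_certificate_authorities": CA_CERT,
                "validation_jmespath_expression": "`true`",
                "metadata_extraction_jmespath_expression": "`{}`",
            },
        },
        "tls": {"server_key_pair": LOCAL_KEY_PAIR},
    }]


if __name__ == "__main__":
    print(json.dumps(one_listen_address_with_tls(":8981"), indent=2))
```

Both directions reuse the same key pair from /cert, so every component acts as a client and as a server with a single identity.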

Now we’re ready to configure the Buildbarn services.

Storage

Let’s start with storage. The client-side configuration is the same for all services that connect to it and is stored in config/common.libsonnet. Replace lines like this one:

backend: { grpc: { address: 'storage-0.storage.buildbarn:8981' } },

with usage of our new function:

backend: { grpc: grpcClientWithTLS('storage-0.storage.buildbarn:8981') },

Keep the address the same (storage-0 and storage-1 should remain in place).

Now in config/storage.jsonnet replace these gRPC server configuration lines:

grpcServers: [{
  listenAddresses: [':8981'],
  authenticationPolicy: { allow: {} },
}],

with a call to another function:

grpcServers: common.oneListenAddressWithTLS(':8981'),

Make sure that the address itself is the same again.

Now let’s apply it and wait for all pods to restart:

kubectl apply -k .
kubectl rollout status -k . 2>&1 | grep -Ev "no status|unable to decode"

Let’s check that the storage service is still accessible via the frontend service by rebuilding our example project:

pushd ../../examples/cpp-tutorial/stage1
bazelisk clean
bazelisk build --config=linux --config=remote //main:hello-world
popd

It should show that it fetched output from the remote cache:

...
INFO: 5 processes: 2 remote cache hit, 3 internal.
...

Scheduler

The scheduler exposes at least four gRPC endpoints, but we’ll cover only the client (frontend) and worker sides as we don’t use the other endpoints yet. Just like with storage, you should replace the clientGrpcServers and workerGrpcServers settings in config/scheduler.jsonnet with calls to oneListenAddressWithTLS, passing each listen address as the argument:

...
clientGrpcServers: common.oneListenAddressWithTLS(':8982'),
workerGrpcServers: common.oneListenAddressWithTLS(':8983'),
...

The scheduler itself only connects to storage, and that part has already been configured in config/common.libsonnet.

Workers

Workers only connect to the scheduler and storage. With the latter already configured, we only need to change the scheduler setting in config/worker-ubuntu22-04.jsonnet:

...
scheduler: common.grpcClientWithTLS('scheduler:8983'),
...

Frontend

The frontend listens for incoming connections from clients and fans them out, either to storage or to the scheduler. Storage access has already been covered, so we only need to replace grpcServers and schedulers settings in config/frontend.jsonnet:

grpcServers: common.oneListenAddressWithTLS(':8980'),
schedulers: {
  '': {
    endpoint: common.grpcClientWithTLS('scheduler:8982') {
      addMetadataJmespathExpression: |||
        {
          "build.bazel.remote.execution.v2.requestmetadata-bin": incomingGRPCMetadata."build.bazel.remote.execution.v2.requestmetadata-bin"
        }
      |||,
    },
  },
},

Note that we preserve all addresses and keep the additional addMetadataJmespathExpression field that augments requests to the scheduler.

Applying it all

Now we can apply all these settings with:

kubectl apply -k .
kubectl rollout status -k . 2>&1 | grep -Ev "no status|unable to decode"

All deployments should eventually roll out and work. This means that all internal communications between Buildbarn components are encrypted and authenticated.

In this section we’ve achieved our goal of securing the Buildbarn deployment using mTLS. Now all that’s left is to reconfigure Bazel to use and verify certificates while accessing Buildbarn’s RBE API endpoint.

Configuring certificates on client

So far we’ve configured Buildbarn to always use TLS-encrypted connections. This means that our current client setup will no longer work, because it doesn’t expect TLS. In this section we’ll generate a client certificate using the cmctl tool, configure Bazel to both validate the server certificate and present this new client certificate when communicating with Buildbarn, and show the final complete example.

First, note that, as mentioned, running Bazel with the current client configuration fails because it makes an unencrypted connection to an encrypted endpoint:

pushd ../../examples/cpp-tutorial/stage1
bazelisk clean
bazelisk build --config=linux --config=remote //main:hello-world

The error will look like this:

INFO: Invocation ID: dc8188ca-e77f-4884-a596-612779c6ae33
ERROR: Failed to query remote execution capabilities: UNAVAILABLE: Network closed for unknown reason

To configure the client to use an encrypted connection, we need to replace the grpc protocol with grpcs in .bazelrc and try again:

sed -i s/grpc/grpcs/ .bazelrc
bazelisk build --config=linux --config=remote //main:hello-world

Now the error will indicate that something else is missing - in this case, a client certificate:

INFO: Invocation ID: 7dcb900f-17eb-4dbb-ab9c-df9c70bc2c92
ERROR: Failed to query remote execution capabilities: UNAVAILABLE: io exception
Channel Pipeline: [SslHandler#0, ProtocolNegotiators$ClientTlsHandler#0, WriteBufferingAndExceptionHandler#0, DefaultChannelPipeline$TailContext#0]

To address that, we need to generate client certificates and configure Bazel to use them.

Generating the client certificate

We will use cert-manager and its CLI client cmctl to generate a certificate for our client. First, we need to create a Certificate object template in cert-template.yaml:

cat > cert-template.yaml <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
spec:
  commonName: client
  usages:
  - client auth
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: ca
    kind: Issuer
    group: cert-manager.io
EOF

Then we can use it to create the actual certificate:

cmctl create certificaterequest -n buildbarn client --from-certificate-file cert-template.yaml --fetch-certificate

It will use this certificate template as if it were created in Kubernetes: it will generate a key in client.key, create a Certificate Signing Request (CSR) from it, embed that in a cert-manager CertificateRequest and send it, wait for the server to sign it, and finally save the resulting certificate to client.crt.

We also need a CA certificate to verify server certificates. We can use the same command we used for Buildbarn configuration here:

kubectl -n buildbarn get certificaterequests ca-1 -o jsonpath='{.status.ca}' | base64 -d > ca.crt

You can make sure that the client certificate is signed with this CA certificate by adding openssl to shell.nix and running:

openssl verify -CAfile ca.crt client.crt

It will output client.crt: OK if everything is correct.

Building with certificates

All that’s left is to tell Bazel to use these certificates to connect to Buildbarn. We’ll need to convert the private key to PKCS#8 format for it and add these settings to .bazelrc:

openssl pkcs8 -topk8 -nocrypt -in client.key -out client.pem
echo "build:remote --tls_certificate=ca.crt" >> .bazelrc
echo "build:remote --tls_client_certificate=client.crt" >> .bazelrc
echo "build:remote --tls_client_key=client.pem" >> .bazelrc

Now let’s clean the Bazel cache and run the build:

bazelisk clean
bazelisk build --config=linux --config=remote //main:hello-world

You will see that the remote cache is in use, which means that TLS has been configured successfully:

...
INFO: Elapsed time: 0.601s, Critical Path: 0.10s
INFO: 5 processes: 2 remote cache hit, 3 internal.
...

To make sure that the actual build also works, we can change the source file a bit and re-run the build:

echo >> main/hello-world.cc
bazelisk build --config=linux --config=remote //main:hello-world

It will now take some time and actually show that it has built one action remotely:

...
INFO: Elapsed time: 15.866s, Critical Path: 15.69s
INFO: 2 processes: 1 internal, 1 remote.
...

Conclusion

We’ve shown how to deploy Buildbarn on Kubernetes, how to configure mTLS between all its components, and how to use TLS authentication with RBE API clients using Bazel as an example. This is a starting configuration that can be improved in several aspects not covered here:

  • The Buildbarn browser and the scheduler web UIs are neither exposed nor encrypted;
  • cert-manager is not configured to limit access to certificate generation, meaning that anyone with access to Kubernetes API has access to all its capabilities;
  • no limits are imposed on client certificates, they only need to be valid;
  • there is no automation for client certificate renewal;
  • and only certificates are used for authentication, which is secure but can be enhanced or replaced with OAuth, which is more flexible and provides better control.

All these are interesting topics that would each deserve their own blog post.

August 29, 2024 12:00 AM

Abhinav Sarkar

Getting Started with Nix for Haskell

So, you’ve heard of the new hotness that is Nix, for creating reproducible and isolated development environments, and want to use it for your new Haskell project? But you are unclear about how to get started? Then this is the guide you are looking for.

This post was originally published on abhinavsarkar.net.

Nix is notoriously hard to get started with. If you are familiar with Haskell, you may have an easier time learning the Nix language, but it is still difficult to figure out the various toolchains and library functions needed to put your knowledge of the Nix language to use. There are some frameworks for setting up Haskell projects with Nix, but again, they are hard to understand because of their large feature scopes. So, in this post, I’m going to show a really easy way for you to get started.

Nix for Haskell

But first, what does it mean to use Nix for a Haskell project? It means that all the dependencies of our projects (Haskell packages, and non-Haskell ones too) come from Nixpkgs, a repository of software configured and managed using Nix[1]. It also means that all the tools we use for development, such as builders, linters, style checkers, LSP servers, and everything else, also come from Nixpkgs[2]. And all of this happens by writing some configuration files in the Nix language.

Start with creating a new directory for the project. For the purpose of this post, we name this project ftr:

$ mkdir ftr
$ cd ftr

The first thing to do is to set up the project to point to the Nixpkgs repo (more specifically, a particular fixed version of the repo) so that our builds are reproducible[3]. We do this by using Niv.

Niv is a tool for pinning/locking down the version of the Nixpkgs repo, much like cabal freeze or npm shrinkwrap. But instead of pinning each dependency at some version, we pin the entire repo (from which all the dependencies come) at a version.

Run the following commands:

$ nix-shell -p niv
$ niv init

Running nix-shell -p niv drops us into a nested shell in which the niv executable is available. Running niv init sets up Niv for our project, creating the nix/sources.json and nix/sources.nix files. The nix/sources.json file is where the Nixpkgs repo version is pinned[4]. If we open it now, it may look something like this:

{
    "nixpkgs": {
        "branch": "nixos-unstable",
        "description": "Nix Packages collection",
        "homepage": null,
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "6c43a3495a11e261e5f41e5d7eda2d71dae1b2fe",
        "sha256": "16f329z831bq7l3wn1dfvbkh95l2gcggdwn6rk3cisdmv2aa3189",
        "type": "tarball",
        "url": "https://github.com/NixOS/nixpkgs/archive/6c43a3495a11e261e5f41e5d7eda2d71dae1b2fe.tar.gz",
        "url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
    }
}
nix/sources.json

By default, Niv sets up the Nixpkgs repo, pinned to some version. Let’s pin it to the latest stable version as of the time of writing this post: 24.05. Run:

$ niv drop nixpkgs
$ niv add NixOS/nixpkgs -n nixpkgs -b nixos-24.05

Now, nix/sources.json may look like this:

{
    "nixpkgs": {
        "branch": "nixos-24.05",
        "description": "Nix Packages collection & NixOS",
        "homepage": "",
        "owner": "NixOS",
        "repo": "nixpkgs",
        "rev": "36bae45077667aff5720e5b3f1a5458f51cf0776",
        "sha256": "0mkbsp2f07lrqcnlsnybi6kbxdr7sjs3hiz4kf4jkqirk4qgswfi",
        "type": "tarball",
        "url": "https://github.com/NixOS/nixpkgs/archive/36bae45077667aff5720e5b3f1a5458f51cf0776.tar.gz",
        "url_template": "https://github.com/<owner>/<repo>/archive/<rev>.tar.gz"
    }
}
nix/sources.json

Pinning is done. Now, let’s get some stuff from the repo. But wait, first we have to configure Nixpkgs. Create a file nix/nixpkgs.nix:

{ system ? builtins.currentSystem }:
let
  sources = import ./sources.nix;
in import sources.nixpkgs {
  inherit system;
  overlays = [ ];
  config = { };
}
nix/nixpkgs.nix

Well, I lied. We could configure Nixpkgs if we had to[5], but for this post, we leave all the settings empty, and just import it from Niv sources.

At this point, we could start pulling things from Nixpkgs manually, but to make it declarative and reproducible, let’s create our own Nix shell.

Shelling Out

Create a file named shell.nix:

{ system ? builtins.currentSystem, devTools ? true }:
let
  pkgs = import ./nix/nixpkgs.nix { inherit system; };
  myHaskellPackages = pkgs.haskellPackages;
in myHaskellPackages.shellFor {
  packages = p: [ ];
  nativeBuildInputs = with pkgs;
    [ ghc cabal-install ] ++ lib.optionals devTools [
      niv
      hlint
      ormolu
      (ghc.withPackages (p: [ p.haskell-language-server ]))
    ];
}
shell.nix

Ah! Now, the Nix magic is shining through. shell.nix creates a custom Nix shell in which the things we mention are already available. pkgs.haskellPackages.shellFor is how we create the custom shell, and nativeBuildInputs are the tools we want available.

We make ghc and cabal-install always available, because they are necessary for doing any Haskell development; and niv, hlint, ormolu and haskell-language-server[6][7] optionally available (depending on the passed devTools flag), because we need them only when writing code.

Exit the previous Nix shell, and start a new one to start working on the project[8]:

$ nix-shell --arg devTools false

Okay, I lied again, we are still setting up. In this new shell, hlint, ormolu, etc. are not available, but we can run cabal now. We use it to initialize the Haskell project:

$ cabal init -p ftr

After answering all the questions Cabal asks us, we are left with a ftr.cabal file, along with some starter Haskell code in the right directories. Let’s build and run the starter code:

$ cabal run
Hello, Haskell!

It works!

Edit the ftr.cabal file now to add some new Haskell dependency (without a version), such as extra. If we run cabal build now, Cabal will start downloading the extra package. Cancel that! We want our dependencies to come from Nixpkgs, not Hackage. For that we need to tell Nix about our Haskell project.

Create a file package.nix:

{ system ? builtins.currentSystem }:
let
  pkgs = import ./nix/nixpkgs.nix { inherit system; };
  hlib = pkgs.haskell.lib.compose;
in pkgs.lib.pipe
(pkgs.haskellPackages.callCabal2nix "ftr" (pkgs.lib.cleanSource ./.) { })
[ hlib.dontHaddock ]
package.nix

The package.nix file is the Nix representation of the Cabal package for our project. We use cabal2nix here, a tool that makes Nix aware of Cabal files, making it capable of pulling the right Haskell dependencies from Nixpkgs. We also configure Nix to not run Haddock on our code by setting the hlib.dontHaddock option[9], since we are not going to write any documentation for this demo project.

Now, edit shell.nix to make it aware of our new Nix package:

{ system ? builtins.currentSystem, devTools ? true }:
let
  pkgs = import ./nix/nixpkgs.nix { inherit system; };
  myHaskellPackages = pkgs.haskellPackages.extend
    (final: prev: { ftr = import ./package.nix { inherit system; }; });
in myHaskellPackages.shellFor {
  packages = p: [ p.ftr ];
  nativeBuildInputs = with pkgs;
    [ ghc cabal-install ] ++ lib.optionals devTools [
      niv
      hlint
      ormolu
      (ghc.withPackages (p: [ p.haskell-language-server ]))
    ];
}
shell.nix

We extend Haskell packages from Nixpkgs with our own package ftr, and add an entry in the previously empty packages list. This makes all the Haskell dependencies we mention in ftr.cabal available in the Nix shell. Exit the Nix shell now, and restart it by running:

$ nix-shell --arg devTools false

We can run cabal build now. Notice that nothing is downloaded from Hackage this time.

Even better, we can now build our project using Nix:

$ nix-build package.nix

This builds our project in a truly isolated environment outside the Nix shell, and puts the results in the result directory. Go ahead and try running it:

$ result/bin/ftr
Hello, Haskell!

Great! Now we can quit and restart the Nix shell without the --arg devTools false option. This will download and set up all the fancy dev tools we configured. Then we can start our favorite editor from the terminal and have access to all of them in it[10].

This is all we need to get started on a Haskell project with Nix. There are some inconveniences in this setup: for example, we need to restart the Nix shell and the editor every time we modify our project dependencies. These days, though, most editors come with extensions that do this automatically, without needing restarts. For a more seamless experience in the terminal, we could install direnv and nix-direnv, which refresh the Nix shells automatically[11].

Bonus Round: Flakes

As a bonus, I’m going to show how to easily set up a Nix Flake for this project. Simply create a flake.nix file:

{
  description = "ftr is a demo project for using Nix to manage Haskell projects";
  inputs.flake-utils.url = "github:numtide/flake-utils";

  outputs = { self, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let ftr = import ./package.nix { inherit system; };
      in rec {
        devShells.default = import ./shell.nix { inherit system; };
        packages.default = ftr;
        apps.default = {
          type = "app";
          program = "${ftr}/bin/ftr";
        };
      });
}
flake.nix

We reuse the package and shell Nix files we created earlier. We have to commit everything to our VCS at this point. After that, we can run the newfangled Nix commands such as[12]:

$ nix develop # same as: nix-shell
$ nix build # same as: nix-build package.nix
$ nix shell # builds the package and starts a shell with the built executable available
$ nix run # builds the package and runs the built executable
$ nix profile install # builds the package and installs the built executable in our Nix profile

If we upload the project to a public GitHub repo, anyone with Nix set up can run and/or install our package executable with a single command:

$ nix run github:username/ftr # downloads, builds and runs without installing
$ nix profile install github:username/ftr # downloads, builds and installs

If that’s not super cool, then I don’t know what is.

Bonus Round 2: Statically Linked Executable

Create a file package-static.nix and nix-build it to create a statically linked executable on Linux[13], which can be run on any Linux machine without installing any dependency libraries or even Nix[14]:

{ system ? builtins.currentSystem }:
let
  sources = import ./nix/sources.nix;
  nixpkgs = import sources.nixpkgs {
    inherit system;
    overlays = [
      (final: prev: {
        haskellPackages = prev.haskellPackages.override {
          ghc = prev.haskellPackages.ghc.override {
            enableRelocatedStaticLibs = true;
            enableShared = false;
            enableDwarf = false;
          };
          buildHaskellPackages =
            prev.haskellPackages.buildHaskellPackages.override
            (old: { ghc = final.haskellPackages.ghc; });
        };
      })
    ];
    config = { };
  };
  pkgs = nixpkgs.pkgsMusl;
  hlib = pkgs.haskell.lib.compose;
in pkgs.lib.pipe
(pkgs.haskellPackages.callCabal2nix "ftr" (pkgs.lib.cleanSource ./.) { }) [
  hlib.dontHaddock
  hlib.justStaticExecutables
  hlib.disableSharedLibraries
  hlib.enableDeadCodeElimination
  (hlib.appendConfigureFlags [
    "-O2"
    "--ghc-option=-fPIC"
    "--ghc-option=-optl=-static"
    "--extra-lib-dirs=${pkgs.gmp6.override { withStatic = true; }}/lib"
    "--extra-lib-dirs=${
      pkgs.libffi.overrideAttrs (old: { dontDisableStatic = true; })
    }/lib"
    "--extra-lib-dirs=${pkgs.ncurses.override { enableStatic = true; }}/lib"
    "--extra-lib-dirs=${pkgs.zlib.static}/lib"
  ])
]
package-static.nix

Conclusion

This post shows a quick and easy way to get started with using Nix for managing simple Haskell projects. Unfortunately, if we have any complex requirements, such as custom dependency versions, patched dependencies, custom non-Haskell dependencies, custom configuration for Nixpkgs, multi-component Haskell projects, using a different GHC version, custom build scripts, etc., this setup does not scale. In such cases, you can either grow this setup by learning Nix in more depth with the help of the official Haskell with Nix docs and this great tutorial, or switch to using a framework like Nixkell or haskell-flake.

This post only scratches the surface of all things possible to do with Nix. I hope I was able to showcase some benefits of Nix, and help you get started. Happy Haskelling and happy Nixing!


  1. One big advantage that Nix has over using Cabal for managing Haskell projects is the Nix binary cache that provides pre-built libraries and executables for download. That means no more waiting for Cabal to build scores of dependencies from sources.↩︎

  2. Search Nixpkgs for packages at search.nixos.org.↩︎

  3. I’m assuming that you’ve already set up Nix at this point. If you have not, follow this guide.↩︎

  4. Of course, we can use Niv to manage any number of source repos, not just Nixpkgs. But we don’t need any other for this post.↩︎

  5. We could do all sorts of interesting and useful things here, like patching some Nixpkgs packages with our own patches, reconfiguring the build flags of some packages, etc.↩︎

  6. hlint is a Haskell linter, ormolu is a Haskell file formatter, and haskell-language-server is an LSP server for Haskell. Other tools that I find useful are stan, the Haskell static analyzer, just, the command runner, and nixfmt, the Nix file formatter. All of them and more are available through Nixpkgs. You can start using them by adding them to nativeBuildInputs.↩︎

  7. If you are wondering why we need to wrap only haskell-language-server with all the ghc stuff, that’s because, to work correctly, haskell-language-server must be compiled with the same version of ghc that your project is going to use. The other tools do not have this restriction.↩︎

  8. You may notice Nix downloading a lot of stuff from Nixpkgs. It may occasionally need to build a few things as well, if they are not available in the binary cache.

    You may need to tweak the connect-timeout and download-attempts settings in the nix.conf file if you are on a slow network.↩︎

  9. There are many more options that we can set here. These options roughly correspond to the command line options for the cabal command. See a comprehensive list here.↩︎

  10. To update the tools and dependencies of the project, run niv update nixpkgs, and restart the Nix shell.↩︎

  11. Use this .envrc file to configure direnv for automatic refreshes for this project:

    #!/usr/bin/env bash
    use nix
    watch_file shell.nix
    watch_file nix/sources.json
    watch_file ftr.cabal
    ↩︎
  12. First, we would have to modify our nix.conf file to enable these commands by adding the line:

    experimental-features = nix-command flakes
    ↩︎
  13. This might take several hours to finish when run for the first time. Also, the enableDwarf = false config requires GHC >= 9.6.↩︎

  14. Another benefit of statically linked executables is, if you package them in Docker/OCI containers, the container sizes are much smaller than ones created for dynamically linked executables.↩︎

If you liked this post, please leave a comment.

by Abhinav Sarkar (abhinav@abhinavsarkar.net) at August 29, 2024 12:00 AM

August 26, 2024

Michael Snoyman

Let the API protect you

Let's write a simple program to manage purchases at a small convenience store. The store only sells two items: eggs and apples. We know the price of each item, and we need to set aside 5% of every purchase for taxes. We should really use a decimal type instead of floats for handling currency, but we'll simplify things a bit here for convenience.

fn main() {
    let mut accounts = Accounts::default();
    accounts.buy_eggs(6);
    accounts.buy_apples(10);
    println!("{accounts:#?}");
}

const TAX_RATE: f64 = 0.05;
const PRICE_PER_EGG: f64 = 0.75;
const PRICE_PER_APPLE: f64 = 0.5;

#[derive(Debug, Default)]
struct Accounts {
    company_balance: f64,
    taxes_paid: f64,
}

impl Accounts {
    fn log_purchase(&mut self, money: f64) {
        let taxes = money * TAX_RATE;
        self.taxes_paid += taxes;
        self.company_balance += money - taxes;
    }

    fn buy_eggs(&mut self, eggs: u64) {
        self.log_purchase(eggs as f64 * PRICE_PER_EGG);
    }

    fn buy_apples(&mut self, apples: u64) {
        self.log_purchase(apples as f64 * PRICE_PER_APPLE);
    }
}

We now have a highly sophisticated and bullet-proof accounting system for our store. No tax auditor could ever object to such pristine bookkeeping! We continue to run our successful little business and soon make enough money to open a second location. Let's say our first business was in Arizona, and now we want to expand into the Nevada market.

All good... except that the tax rates in the two states are different! While Arizona is 5%, Nevada is 8%. How can we model this in our code?

One possibility would be to pass in the tax rate as a parameter to log_purchase. Let's give that a shot:

fn main() {
    let mut accounts = Accounts::default();
    accounts.buy_eggs(6, TAX_RATE_ARIZONA);
    accounts.buy_apples(10, TAX_RATE_NEVADA);
    println!("{accounts:#?}");
}

const TAX_RATE_ARIZONA: f64 = 0.05;
const TAX_RATE_NEVADA: f64 = 0.08;
const PRICE_PER_EGG: f64 = 0.75;
const PRICE_PER_APPLE: f64 = 0.5;

#[derive(Debug, Default)]
struct Accounts {
    company_balance: f64,
    taxes_paid: f64,
}

impl Accounts {
    fn log_purchase(&mut self, money: f64, tax_rate: f64) {
        let taxes = money * tax_rate;
        self.taxes_paid += taxes;
        self.company_balance += money - taxes;
    }

    fn buy_eggs(&mut self, eggs: u64, tax_rate: f64) {
        self.log_purchase(eggs as f64 * PRICE_PER_EGG, tax_rate);
    }

    fn buy_apples(&mut self, apples: u64, tax_rate: f64) {
        self.log_purchase(tax_rate, apples as f64 * PRICE_PER_APPLE);
    }
}

That's not too bad... until you realize that there's a bug in the code above. Look at the implementation of buy_apples. We've accidentally provided the tax_rate as the amount of money the apples cost! Easy mistake to make, and thankfully easy enough to fix:

fn buy_apples(&mut self, apples: u64, tax_rate: f64) {
    self.log_purchase(apples as f64 * PRICE_PER_APPLE, tax_rate);
}

"Huh," some vague part of my brain screams out. "It was way too easy to write buggy code. Can we fix that?" At this point, I think that proponents of dynamic typing can (rightfully) claim a small victory here. I've written some reasonable code in Rust, a statically typed language, and the compiler couldn't stop me from making a silly mistake. As a proponent of types, I begin to question the fabric of reality and my entire stance on programming. But no time for that, I'm too busy expanding my store to other states!

Soon enough, we're ready to expand further into Utah. Utah also has a sales tax, but they exempt eggs from their sales tax because it's an essential good. (And if anyone's about to fact check me: I've completely made up all the tax rates and rules in this post.) Anyway, our existing Accounts struct and its API is totally up to the challenge here, and we can easily implement this correctly:

fn main() {
    let mut accounts = Accounts::default();
    accounts.buy_eggs(6, TAX_RATE_ARIZONA);
    accounts.buy_apples(10, TAX_RATE_NEVADA);
    accounts.buy_eggs(12, TAX_RATE_UTAH);
    accounts.buy_apples(2, 0.0); // essential goods have no taxes in Utah
    println!("{accounts:#?}");
}

Easy peasy... and broken! Once again, I've made a simple mistake, and the type system and my APIs have done nothing to protect me. I've set the tax rate in Utah at 0%... but for the purchase of apples, not eggs! Once again, it's an easy fix:

accounts.buy_eggs(12, 0.0); // essential goods have no taxes in Utah
accounts.buy_apples(2, TAX_RATE_UTAH);

But these recurring bugs are frustrating, and frankly the code structure is completely unsatisfactory. I've needed to put some of the logic for tax collection into our main function, while other parts live in log_purchase. And the types do nothing to protect us. Is there anything we can do about this?

Strong types, local logic

I want to bash apart the code above using two principles:

  1. Use strong types when possible. This isn't the same as static types. Static typing simply means that all variables have a known type. Strong typing is about making those types meaningful. In our log_purchase method, we currently have weak typing. We take two parameters, money and tax_rate. They're both f64s, and nothing prevents us from swapping the argument order by mistake.
  2. Keep logic as local as possible. We're currently making decisions on the taxes in two places: determining the tax rate in main, and calculating the taxes incurred in log_purchase. We also need to pass that logic through the buy_eggs and buy_apples methods.
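One way to picture the first principle is with wrapper types around the two f64 values. This is only a sketch of the general idea, not the code used in the rest of this post: the Money and TaxRate types and the accounts_after helper are hypothetical names introduced here for illustration.

```rust
// Hypothetical wrapper types: neither `Money` nor `TaxRate` appears in the
// original code, which passes bare f64 values.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Money(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct TaxRate(f64);

#[derive(Debug, Default)]
struct Accounts {
    company_balance: f64,
    taxes_paid: f64,
}

impl Accounts {
    // The two parameters now have distinct types, so the compiler rejects
    // a call that passes them in the wrong order.
    fn log_purchase(&mut self, money: Money, tax_rate: TaxRate) {
        let taxes = money.0 * tax_rate.0;
        self.taxes_paid += taxes;
        self.company_balance += money.0 - taxes;
    }
}

// Hypothetical helper for illustration: logs one purchase on fresh accounts.
fn accounts_after(money: Money, tax_rate: TaxRate) -> Accounts {
    let mut accounts = Accounts::default();
    accounts.log_purchase(money, tax_rate);
    // Uncommenting the next line fails to compile with a mismatched-types error:
    // accounts.log_purchase(tax_rate, money);
    accounts
}
```

With types like these, the swapped call from the earlier buy_apples bug would be caught at compile time instead of silently producing wrong numbers.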

Let's start with trying to address the second point. I'd like to have all tax logic present in log_purchase. That means I need to know if the purchase is taxable or not. One possibility would be adding a new parameter to indicate if taxes should be collected:

fn log_purchase(&mut self, money: f64, tax_rate: f64, collect_taxes: bool) {
    let taxes = if collect_taxes { money * tax_rate } else { 0.0 };
    self.taxes_paid += taxes;
    self.company_balance += money - taxes;
}

But this fails on both of the principles above:

  1. We've added in a new parameter, but it's just as weakly typed as an f64. (In this case, I'd call it boolean blindness.) While we don't have to worry about accidentally swapping parameters, who's to say if true means "collect taxes" versus "exempt from taxes?" Sure, you can look at the code or read the docs... but who's going to do that? I want my compiler to save me!
  2. It still requires performing logic in the caller to determine if this particular purchase is required to pay taxes, which still keeps our logic split up.

Instead of this slapdash approach, let's try to think of it from the bottom up.

Data driven

What information do we need to know to determine if taxes can be charged? Two things:

  1. Which state the purchase took place in
  2. What item was purchased

With that stated, it's easy enough to create some helper data types to begin modeling this more appropriately:

fn main() {
    let mut accounts = Accounts::default();
    accounts.buy_eggs(6, TAX_RATE_ARIZONA, State::Arizona);
    accounts.buy_apples(10, TAX_RATE_NEVADA, State::Nevada);
    accounts.buy_eggs(12, TAX_RATE_UTAH, State::Utah);
    accounts.buy_apples(2, TAX_RATE_UTAH, State::Utah);
    println!("{accounts:#?}");
}

const TAX_RATE_ARIZONA: f64 = 0.05;
const TAX_RATE_NEVADA: f64 = 0.08;
const TAX_RATE_UTAH: f64 = 0.09;
const PRICE_PER_EGG: f64 = 0.75;
const PRICE_PER_APPLE: f64 = 0.5;

#[derive(Debug, Default)]
struct Accounts {
    company_balance: f64,
    taxes_paid: f64,
}

enum State {
    Arizona,
    Nevada,
    Utah,
}

enum Item {
    Apples,
    Eggs,
}

impl Accounts {
    fn log_purchase(&mut self, money: f64, tax_rate: f64, state: State, item: Item) {
        let collect_taxes = match (state, item) {
            (State::Utah, Item::Eggs) => false,
            _ => true,
            // Or if, like me, you like to be really explicit:
            // (State::Arizona, Item::Apples)
            // | (State::Arizona, Item::Eggs)
            // | (State::Nevada, Item::Apples)
            // | (State::Nevada, Item::Eggs)
            // | (State::Utah, Item::Apples) => true,
        };
        let taxes = if collect_taxes { money * tax_rate } else { 0.0 };
        self.taxes_paid += taxes;
        self.company_balance += money - taxes;
    }

    fn buy_eggs(&mut self, eggs: u64, tax_rate: f64, state: State) {
        self.log_purchase(eggs as f64 * PRICE_PER_EGG, tax_rate, state, Item::Eggs);
    }

    fn buy_apples(&mut self, apples: u64, tax_rate: f64, state: State) {
        self.log_purchase(
            apples as f64 * PRICE_PER_APPLE,
            tax_rate,
            state,
            Item::Apples,
        );
    }
}

Now we're fully implementing our "essential goods" check within log_purchase, with none of the logic leaking out. And our new types are properly strong types; it's impossible to accidentally swap the State and Item with one of the f64 parameters, since they have totally different types.
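To see the exemption rule on its own, the match can be pulled out into a small standalone function. This is a sketch under my own assumptions: is_taxable is a hypothetical name that doesn't appear in the code above, and I've added derives so the enums can be copied and compared. It uses the fully explicit form of the match, so adding a new state or item later produces a compile error until the rule is updated.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Arizona,
    Nevada,
    Utah,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Item {
    Apples,
    Eggs,
}

// Hypothetical helper: the same decision that log_purchase makes inline.
// There is no wildcard arm, so the match must list every combination.
fn is_taxable(state: State, item: Item) -> bool {
    match (state, item) {
        (State::Utah, Item::Eggs) => false,
        (State::Arizona, Item::Apples)
        | (State::Arizona, Item::Eggs)
        | (State::Nevada, Item::Apples)
        | (State::Nevada, Item::Eggs)
        | (State::Utah, Item::Apples) => true,
    }
}
```

A function like this is also easy to check in isolation, since it takes two enum values and returns a bool without touching the account balances.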

It's not like everything is perfect yet. We can still easily write this incorrect code:

accounts.buy_apples(2, TAX_RATE_UTAH, State::Nevada);

But this is also easily rectified. Now that we're passing in a State parameter to log_purchase, we can determine the tax rate ourself within that function. And passing in a State value instead of an f64 prevents us from accidentally providing the parameters in the wrong order.

But you may have noticed something else: the tax_rate parameter is now redundant! Thanks to providing more information to log_purchase, it can be more intelligent in its own functioning, reducing the burden on callers and ruling out a mismatch like the one in the code above:

#[derive(Clone, Copy)]
enum State {
    Arizona,
    Nevada,
    Utah,
}

impl State {
    fn tax_rate(self) -> f64 {
        match self {
            State::Arizona => 0.05,
            State::Nevada => 0.08,
            State::Utah => 0.09,
        }
    }
}

fn log_purchase(&mut self, money: f64, state: State, item: Item) {
    let collect_taxes = match (state, item) {
        (State::Utah, Item::Eggs) => false,
        _ => true,
    };
    let taxes = if collect_taxes {
        money * state.tax_rate()
    } else {
        0.0
    };
    self.taxes_paid += taxes;
    self.company_balance += money - taxes;
}

And just like that, log_purchase doesn't require any outside logic to determine how to collect taxes. You simply, declaratively, and in a strongly-typed manner, provide it the information necessary for it to do its job, and the method carries out all the logic.

We could even go a step farther if we wanted, and have log_purchase handle the calculation of the cost of the goods too:

fn log_purchase(&mut self, quantity: u64, state: State, item: Item) {
    let collect_taxes = match (state, item) {
        (State::Utah, Item::Eggs) => false,
        _ => true,
    };
    let money = quantity as f64 * item.price();
    let taxes = if collect_taxes {
        money * state.tax_rate()
    } else {
        0.0
    };
    self.taxes_paid += taxes;
    self.company_balance += money - taxes;
}

And with that in place, you may even decide that helper methods like buy_eggs and buy_apples aren't worth it:

fn main() {
    let mut accounts = Accounts::default();
    accounts.buy(6, State::Arizona, Item::Eggs);
    accounts.buy(10, State::Nevada, Item::Apples);
    accounts.buy(12, State::Utah, Item::Eggs);
    accounts.buy(2, State::Utah, Item::Apples);
    accounts.buy(2, State::Nevada, Item::Apples);
    println!("{accounts:#?}");
}

#[derive(Debug, Default)]
struct Accounts {
    company_balance: f64,
    taxes_paid: f64,
}

#[derive(Clone, Copy)]
enum State {
    Arizona,
    Nevada,
    Utah,
}

impl State {
    fn tax_rate(self) -> f64 {
        match self {
            State::Arizona => 0.05,
            State::Nevada => 0.08,
            State::Utah => 0.09,
        }
    }
}

#[derive(Clone, Copy)]
enum Item {
    Apples,
    Eggs,
}

impl Item {
    fn price(self) -> f64 {
        match self {
            Item::Apples => 0.5,
            Item::Eggs => 0.75,
        }
    }
}

impl Accounts {
    fn buy(&mut self, quantity: u64, state: State, item: Item) {
        let collect_taxes = match (state, item) {
            (State::Utah, Item::Eggs) => false,
            _ => true,
        };
        let money = quantity as f64 * item.price();
        let taxes = if collect_taxes {
            money * state.tax_rate()
        } else {
            0.0
        };
        self.taxes_paid += taxes;
        self.company_balance += money - taxes;
    }
}

Conclusion

OK, so we moved some code around, centralized some logic, and now everything is nicer. We have some type safety in place too. You may be looking at this as small gains for introducing a lot of type complexity. But here are my closing thoughts:

  1. Sure, this silly example may not warrant the type machinery for protection. But it's very easy to scale up from such a simple example to real-world use cases where the type safety prevents far more complex and insidious bugs.
  2. I'd argue that there's not really any complexity here. We introduced two new data types and a new method on each of them, but also removed two helper functions and five constants. I'd take that trade in complexity any day.
  3. The next set of features we want to implement will become even easier to make. For example, take both the original weakly typed version and the new strongly typed version, and try implementing these changes:
    1. In Arizona only, reduce the cost of apples to 0.45 per apple when you purchase 12 or more.
    2. Allow the price of the goods to change during the course of execution. In other words, don't hard-code in all the prices.

    In my opinion, the strongly typed version makes both of these tasks much easier and safer.
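As a rough sketch of the first exercise in the strongly typed version, one option is to let the unit price depend on the state and the quantity. The `unit_price` function below is a hypothetical name introduced for illustration, and the 0.5 and 0.75 prices are copied from the `Item::price` method above; this is only one possible shape for the change.

```rust
#[derive(Clone, Copy)]
enum State {
    Arizona,
    Nevada,
    Utah,
}

#[derive(Clone, Copy)]
enum Item {
    Apples,
    Eggs,
}

/// Hypothetical helper: the per-unit price now depends on where and how much you buy.
fn unit_price(item: Item, state: State, quantity: u64) -> f64 {
    match (item, state) {
        // Arizona-only bulk discount on apples (12 or more).
        (Item::Apples, State::Arizona) if quantity >= 12 => 0.45,
        (Item::Apples, _) => 0.5,
        (Item::Eggs, _) => 0.75,
    }
}

fn main() {
    let purchases = [
        (State::Arizona, 11),
        (State::Arizona, 12),
        (State::Nevada, 12),
        (State::Utah, 12),
    ];
    for (state, quantity) in purchases {
        let total = quantity as f64 * unit_price(Item::Apples, state, quantity);
        println!("{quantity} apples cost {total:.2}");
    }
    println!("one egg costs {}", unit_price(Item::Eggs, State::Utah, 1));
}
```

Because the discount lives in one `match`, a `buy` method only needs to call `unit_price(item, state, quantity)` instead of `item.price()`; callers do not change at all.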

So what's the overarching lesson to be learned here? I'd put it this way:

Identify the inputs needed for your functions to perform all their logic, avoiding splitting up that logic into multiple parts of your code base. Use well defined, strong types to represent that input cleanly.

It may sound simple, and perhaps obvious. But the next time you feel yourself succumbing to writing yet-another-weird-hack to address an unexpected business requirement, see if reframing the question from "how can I quickly add this feature" to "what's the best way to model the requirements as inputs and outputs" helps you come up with a better design.

August 26, 2024 12:00 AM

August 24, 2024

Mark Jason Dominus

Dancing bread

Marnanel Thurman reported the following item that they found in an 1875 book titled How to Entertain a Social Party:

To Make a Loaf of Bread Dance on the Table.

— Having a quill filled with quicksilver and stopped close, you secretly thrust it into a hot roll or loaf, which will put it in motion.

(Bottom of page 46.) No further explanation is given.

This may remind you of an episode from Huckleberry Finn:

Well, then I happened to think how they always put quicksilver in loaves of bread and float them off, because they always go right to the drownded carcass and stop there.

(Chapter 8.)

When I first read this I assumed it was a local Southern superstition, characteristic of that place and time. But it seems not! According to this article by Dan Rolph of the Historical Society of Pennsylvania, the belief was longstanding and widespread, lasting from at least 1767 to 1872, and appearing also in London and in Pennsylvania.

Details of the dancing bread trick are lacking. I guess the quicksilver stays inside the stopped-up quill. (Otherwise, there would be no need to “stop it close”.) Then perhaps on being heated by the bread, the quicksilver expands lengthwise as in a thermometer, and then… my imagination fails me.

The procedure for making drowned-body-finding bread is quite different. Rolph's sources all agree: you poke in your finger and scoop out a bit of the inside, pour the quicksilver into the cavity, and then plug up the hole. So there's no quill; the quicksilver is just sloshing around loose in there. Huckleberry Finn agrees:

I took out the plug and shook out the little dab of quicksilver…

Does anyone have more information about this? Does hot bread filled with mercury really dance on the table, and if so why? Is the superstition about bread finding drowned bodies related to this, or is it a coincidence?

Also, what song did the sirens sing, and by what name was Achilles called when he hid among women?

by Mark Dominus (mjd@plover.com) at August 24, 2024 08:55 PM

August 22, 2024

Mark Jason Dominus

XKCD game theory question

Six-panel cartoon from XKCD. Each panel gives a one-question mathematics ‘final exam’ from a different level of education from ‘kindergarten’ to  ‘postgraduate math’.  This article concerns the fifth, which says “Game Theory Final Exam: Q. Write down 10 more than the average of the class’s answers.  A. (blank).”

(Source: XKCD “Exam numbers”.)

This post is about the bottom center panel, “Game Theory final exam”.

I don't know much about game theory and I haven't seen any other discussion of this question. But I have a strategy I think is plausible and I'm somewhat pleased with.

(I assume that answers to the exam question must be real numbers, not $\infty$, and that “average” here is short for 'arithmetic mean'.)

First, I believe the other players and I must find a way to agree on what the average will be, or else we are all doomed. We can't communicate, so we should choose a Schelling point and hope that everyone else chooses the same one. Fortunately, there is only one distinguished choice: zero. So I will try to make the average zero and I will hope that others are trying to do the same.

If we succeed in doing this, any winning entry will therefore be $10$. Not all $n$ players can win because the average must be $0$. But $n-1$ can win, if the one other player writes $-10(n-1)$. So my job is to decide whether I will be the loser. I should select a random integer between $0$ and $n-1$. If it is zero, I have drawn a short straw, and will write $-10(n-1)$. Otherwise I write $10$.

(The straw-drawing analogy is perhaps misleading. Normally, exactly one straw is short. Here, any or all of the straws might be short.)

If everyone follows this strategy, then I will win if exactly one person draws a short straw and if that one person isn't me. The former has a probability that rapidly approaches $1/e$ as $n$ increases, and the latter is $\frac{n-1}n$. In an $n$-person class, the probability of my winning is $$\left(\frac{n-1}n\right)^n$$ which increases slowly toward $1/e$ as $n$ grows.
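To get a feel for the numbers, here is a small sketch that tabulates the winning probability $\left(\frac{n-1}n\right)^n$ from the formula above for a few class sizes and compares it with $1/e$. The function name `straw_win_probability` and the chosen class sizes are my own, picked only for illustration.

```rust
/// Probability that I win with the short-straw strategy in an n-person class:
/// exactly one player draws zero, and that player is not me.
fn straw_win_probability(n: u32) -> f64 {
    let p_long_straw = (f64::from(n) - 1.0) / f64::from(n);
    p_long_straw.powi(n as i32)
}

fn main() {
    for n in [2u32, 3, 5, 10, 30, 100] {
        println!("n = {n:3}: {:.4}", straw_win_probability(n));
    }
    // The limit of ((n-1)/n)^n as n grows is 1/e.
    println!("1/e     = {:.4}", (-1.0f64).exp());
}
```

The values climb with $n$ but stay below $1/e$, which is the limit of the formula.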

Some miscellaneous thoughts:

  1. The whole thing depends on my idea that everyone will agree on $0$ as a Schelling point. Is that even how Schelling points work? Maybe I don't understand Schelling points.

  2. I like that the probability $1/e$ appears. It's surprising how often this comes up, often when multiple agents try to coordinate without communicating. For example, in ALOHAnet a number of ground stations independently try to send packets to a single satellite transceiver, but if more than one tries to send a packet at a particular time, the packets are garbled and must be retransmitted. At most $1/e$ of the available bandwidth can be used, the rest being lost to packet collisions.

  3. The first strategy I thought of was plausible but worse: flip a coin, and write down $10$ if it is heads and $-10$ if it is tails. With this strategy I win if exactly $\frac n2$ of the class flips heads and if I do too. The probability of this happening is only $$\frac{n\choose n/2}{2^n}\cdot \frac12 \approx \frac1{\sqrt{2\pi n}}.$$ Unlike the other strategy, this decreases to zero as $n$ increases, and in no case is it better than the first strategy. It also fails badly if the class contains an odd number of people.

    Thanks to Brian Lee for figuring out the asymptotic value of ${n\choose n/2}$ so I didn't have to.

  4. Just because this was the best strategy I could think of in no way means that it is the best there is. There might have been something much smarter that I did not think of, and if there is then my strategy will sabotage everyone else.

    Game theorists do think of all sorts of weird strategies that you wouldn't expect could exist. I wrote an article about one a few years back.

  5. Going in the other direction, even if $n-1$ of the smartest people all agree on the smartest possible strategy, if the $n$th person is Leeroy Jenkins, he is going to ruin it for everyone.

  6. If I were grading this exam, I might give full marks to anyone who wrote down either $10$ or $-10(n-1)$, even if the average came out to something else.

  7. For a similar and also interesting but less slippery question, see Wikipedia's article on Guess ⅔ of the average. Much of the discussion there is directly relevant. For example, “For Nash equilibrium to be played, players would need to assume both that everyone else is rational and that there is common knowledge of rationality. However, this is a strong assumption.”

… players (including Vidkun) win if exactly one of them rolls zero. Vidkun's chance of winning increases. Intuitively, the other players' chances of winning ought to decrease. But by how much? I think I keep messing up the calculation because I keep getting zero. If this were actually correct, it would be a fascinating paradox!

by Mark Dominus (mjd@plover.com) at August 22, 2024 02:43 PM

August 21, 2024

Mark Jason Dominus

I DON'T KNOW

If you're an annoying know-it-all like me, I suggest that you try playing the following game when you attend a conference or a user group meetup or even a work meeting. The game is:

If someone asks you a question, and you say “I don't know”, you score a point.

That's it. That's the game. “I don't know” doesn't have to be perfectly truthful, only approximately truthful.

I forgot, there is one other rule:

If you follow up with something like “But if I had to guess…” you lose your point again.

by Mark Dominus (mjd@plover.com) at August 21, 2024 03:45 PM

Jasper Van der Jeugt

Turnstyle

I am delighted and horrified to announce a new graphical programming language called Turnstyle.


In the time leading up to ZuriHac 2024 earlier this year, I had been thinking about Piet a little. We ended up working on something else during the Hackathon, but this was still in the back of my mind.

Some parts of Piet’s design are utter genius (using areas for number literals, using hue/lightness as cycles). There are also things I don’t like, such as the limited number of colors, the difficulty of reusing code, and the lack of a way to extend it with new primitive operations. I suspect these are part of the reason nobody has yet tried to write, say, an RDBMS or a web browser in Piet.

Given the amount of attention going to programming languages in the functional programming community, I was quite surprised nobody had ever tried to do a functional variant of it (as far as I could find).

I wanted to create something based on Lambda Calculus. It forms a nice basis for a minimal specification, and I knew that while code would still be somewhat frustrating to write, there is the comforting thought of being able to reuse almost everything once it’s written.

Cheatsheet for the specification

You can see the full specification here.

After playing around with different designs this is what I landed on. The guiding principle was to search for a specification that was as simple as possible, while still covering lambda calculus extended with primitives that, you know, allow you to interact with computers.

One interesting aspect that I discovered (not invented) is that it’s actually somewhat more expressive than Lambda Calculus, since you can build Abstract Syntax Graphs (rather than just Trees). This is illustrated in the loop example above, which recurses without the need for a fixed-point combinator.

For the full specification and more examples take a look at the Turnstyle website and feel free to play around with the sources on GitHub.

Thanks to Francesco Mazzoli for useful feedback on the specification and website.

by Jasper Van der Jeugt at August 21, 2024 12:00 AM

August 16, 2024

Haskell Interlude

55: Sebastian Ullrich

In this episode, Niki and Andres talk with Sebastian, one of the main developers of Lean, currently working at the Lean Focused Research Organization. Today we talk about the addictive notion of theorem provers, what is a sweet spot between dependent types and simple programming and how Lean is both a theorem prover and an efficient general purpose programming language. 

by Haskell Podcast at August 16, 2024 07:00 PM

August 14, 2024

Mark Jason Dominus

Poor Richard's Almanack

Benjamin Franklin wrote and published Poor Richard's Almanack annually from 1732 to 1758. Paper was expensive and printing difficult and time-consuming. The type would be inked, the sheet of paper laid on the press, the apprentices would press the sheet, by turning a big screw. Then the sheet was removed and hung up to dry. Then you can do another printing of the same page. Do this ten thousand times and you have ten thousand prints of a sheet. Do it ten thousand more to print a second sheet. Then print the second side of the first sheet ten thousand times and print the second side of the second sheet ten thousand times. Fold 20,000 sheets into eighths, cut and bind them into 10,000 thirty-two page pamphlets and you have your Almanacks.

As a youth, Franklin was apprenticed to his brother James, also a printer, in Boston. Franklin liked the work, but James drank and beat him, so he ran away to Philadelphia. When James died, Benjamin sent his widowed sister-in-law Ann five hundred copies of the Almanack to sell. When I first heard that I thought it was a mean present but I was being a twenty-first-century fool. The pressing of five hundred almanacks is no small feat of toil. Ann would have been able to sell those Almanacks in her print shop for fivepence each, or ₤10 8s. 4d. That was a lot of money in 1735.

In 1748 Franklin increased the size and the price. Here's a typical page from the 1748 Almanack:

detailed description in the article

Wow, there's a lot of stuff going on there. Here's a smaller excerpt, this time from November 1753:

The leftmost column is the day of the month, and then the next column is the day of the week, with 2–7 being Monday through Saturday. Sunday is denoted with a letter “G”. I thought this was G for God, but I see that in 1748 Franklin used “C” and in 1752 he used “A”, so I don't know.

The third column combines a weather forecast and a calendar. The weather forecast is in italic type, over toward the right: “Clouds and threatens cold rains and snow” in the early part of the month. Sounds like November in Philadelphia. The roman type gives important days. For example, November 1 is All Saints Day and November 5 is the anniversary of the Gunpowder Plot. November 10 is given as the birthday of King George II, then still the King of Great Britain.

The Sundays are marked with some description in the Christian liturgical calendar. For example, “20 past Trin.” means it's the start of the 20th week past Trinity Sunday.

This column also has notations like “Days dec. 4 32” and “Days dec. 5 h.” that I haven't been able to figure out. Something about the decreasing length of the day in November maybe? [ Addendum: Yes. See below. ] The notation on November 6 says “Day 10 10 long” which is consistent with the sunrise and sunset times Franklin gives for that day. The fourth and fifth columns, labeled “☉ ris” and “☉ set” are the times of sunrise and sunset, 6:55 (AM) and 5:05 (PM) respectively for November 6, ten hours and ten minutes apart as Franklin says.

“☽ pl.” is the position of the moon in the sky. (I guess “pl.” is short for “place”.) The sky is divided into twelve “houses” of 30 degrees each, and when it says that the “☽ pl.” on November 6 is “♓ 25” I think it means the moon is $\frac{25}{30}$ of the way along in the house of Pisces on its way to the house of Aries ♈. If you look at the January 1748 page above you can see the moon making its way through the whole sky in 29 days, as it does.

The last column, “Aspects, &c.” contains more astronomy. “♂ rise 6 13” means that Mars will rise at 6:13 that day. (But in the morning or the evening?) ⚹♃♀ on the 12th says that Jupiter is in sextile aspect to Venus, which means that they are in the sky 60 degrees apart. Similarly □☉♃ means that the Sun and Jupiter are in Square aspect, 90 degrees apart in the sky.

Also mixed into that last column, taking up the otherwise empty space, are the famous wise sayings of Poor Richard. Here we see:

Serving God is Doing Good to Man,
but Praying is thought an easier Service,
and therefore more generally chosen.

Back on the January page you can see one of the more famous ones, Lost Time is never found again.

Franklin published an Almanack in 1752, the year that the British Calendar Act of 1751 updated the calendar from Julian to Gregorian reckoning. To bring the calendar into line with Gregorian, eleven days were dropped from September that year. I wondered what Franklin's calendar looked like that month. Here it is with the eleven days clearly missing:

The leftmost day-of-the-month column skips right from September 2 to September 14, as the law required. On this copy someone has added the old dates in the margin. Notice that St. Michael's Day, which would have been on Friday September 18th in the old calendar, has been moved up to September 29th. In most years Poor Richard's Almanack featured an essay by Poor Richard, little poems, and other reference material. The 1752 Almanack omitted most of this so that Franklin could use the space to instead reprint the entire text of the Calendar Act.

This page also commemorates the Great Fire of London, which began September 2, 1666.

Wikipedia tells me that Franklin may have gotten the King's birthday wrong. Franklin says November 10, but Wikipedia says November 9, and:

Over the course of George's life, two calendars were used: the Old Style Julian calendar and the New Style Gregorian calendar. Before 1700, the two calendars were 10 days apart. Hanover switched from the Julian to the Gregorian calendar on 19 February (O.S.) / 1 March (N.S.) 1700. Great Britain switched on 3/14 September 1752. George was born on 30 October Old Style, which was 9 November New Style, but because the calendar shifted forward a further day in 1700, the date is occasionally miscalculated as 10 November.

Ugh, calendars.

I got these scans from a web site called The Rare Book Room, but I found their user interface very troublesome, so I have scraped all the images they had. You may find them at https://pic.blog.plover.com/calendar/poor-richards-almanack/archive/. I'm pretty sure the copyright has expired, so share and enjoy.

Addenda

Several people have pointed out that the mysterious letters G, C, A on Sundays are the so-called dominical letters, used in remembering the correspondence between days of the month and days of the week, and important in the determination of the dates of Easter and other moveable feasts.

Why Franklin included them in the Almanack is not clear to me, as one of the main purposes of the almanac itself is so that you do not have to remember or calculate those things, you can just look them up in the almanac.

Mikkel Paulson explained the 'days dec.' and 'days inc.' notations: they describe the length of the day, reported relative to the length of the day at the most recent solstice. For example, the November 1753 excerpt for November 2 says "Days dec. 4 32". Going by the times of sunrise and sunset on that day, the day was 10 hours 18 minutes long. Adding the 4 hours 32 minutes from the notation we have 14 hours 50 minutes, which is indeed the length of the day on the summer solstice in Philadelphia, or close to it.

Similarly the notation on November 14 says "Days dec. 5 h" for a day that is 9 hours 50 minutes between sunrise and sunset, five hours shorter than on the summer solstice, and the January 3 entry says "Days inc. 18 m." for a 9h 28m day which is 18 minutes longer than the 9h 10m day one would have on the winter solstice.
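The arithmetic in these two paragraphs is easy to check mechanically. This is just a sketch using the figures quoted above; the helper names `minutes` and `show` are mine.

```rust
/// Convert hours and minutes to a total number of minutes.
fn minutes(h: u32, m: u32) -> u32 {
    h * 60 + m
}

/// Format a number of minutes as, for example, "14h 50m".
fn show(total: u32) -> String {
    format!("{}h {:02}m", total / 60, total % 60)
}

fn main() {
    // November 2: a 10h 18m day plus "Days dec. 4 32".
    let from_nov_2 = minutes(10, 18) + minutes(4, 32);
    // November 14: a 9h 50m day plus "Days dec. 5 h".
    let from_nov_14 = minutes(9, 50) + minutes(5, 0);
    // January 3: a 9h 28m day minus "Days inc. 18 m.".
    let from_jan_3 = minutes(9, 28) - minutes(0, 18);

    println!("summer solstice via Nov 2:  {}", show(from_nov_2));
    println!("summer solstice via Nov 14: {}", show(from_nov_14));
    println!("winter solstice via Jan 3:  {}", show(from_jan_3));
}
```

Both November entries point back to the same 14h 50m summer-solstice day, and the January entry points back to a 9h 10m winter-solstice day.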

by Mark Dominus (mjd@plover.com) at August 14, 2024 04:05 AM

Chris Penner

August 12, 2024

ERDI Gergo

Formatting serial streams in hardware

I've been playing around with building a Sudoku solver circuit on an FPGA: you connect to it via a serial port, send it a Sudoku grid with some unknown cells, and after solving it, you get back the solved (fully filled-in) grid. I wanted the output to be nicely human-readable, for example for a 3,3-Sudoku (i.e. the usual Sudoku size where the grid is made up of a 3 ⨯ 3 matrix of 3 ⨯ 3 boxes):

4 2 1  9 5 8  6 3 7  
8 7 3  6 2 1  9 5 4  
5 9 6  4 7 3  2 1 8  

3 1 2  8 4 6  7 9 5  
7 6 8  5 1 9  3 4 2  
9 4 5  7 3 2  8 6 1  

2 8 9  1 6 4  5 7 3  
1 3 7  2 9 5  4 8 6  
6 5 4  3 8 7  1 2 9  
    

This post is about how I structured the stream transformer that produces all the right spaces and newlines, yielding a clash-protocols based circuit.

With clash-protocols, the type of a circuit that transforms a serial stream of a values into a stream of b values is Circuit (Df dom a) (Df dom b), with the Df type constructor taking care of representing acknowledgement signals. So for our formatter, we are looking to implement it as a Circuit (Df dom a) (Df dom (Either Char a)): the output is a mixed stream of forwarded data a and punctuation characters.

If we were writing normal (software) Haskell, we could write the corresponding [a] -> [Either Char a] in many different ways, but since we want to describe a hardware circuit, there are a couple more constraints:

  • There is limited control over our input and output. The Circuit/Df abstraction provides a way to exert backpressure upstream, but that's a double edged sword: it also means whatever circuit we write has to be ready to handle backpressure from further downstream. And upstream can stall us: sometimes there is no input available.
  • Everything, of course, has to be finite. This includes both data and recursion depth.
  • We don't want to waste parts on our FPGA. If a counter can be just 11 bits, using an 11-bit data type instead of a Word16 can translate to actual savings.

Luckily, clash-protocols takes care of the first constraint via the expander function:

expander
    :: forall dom i o s. (HiddenClockResetEnable dom, NFDataX s)
    => s
    -> (s -> i -> (s, o, Bool)) -- ^ Return `True` when you're finished
                                -- with the current input value and are ready for the next one.
    -> Circuit (Df dom i) (Df dom o)
    

So basically expander reduces the problem to just writing a "normal" pure Haskell function s -> i -> (s, o, Bool). This is reassuring since I am otherwise not too familiar with clash-protocols, but this takes me back to the terra firma of Haskell.

A simpler formatter

Let's warm up by writing a state machine for expander that just puts two spaces between each input:

data DoubleSpacedState
    = CopyTheInput
    | FirstSpace
    | SecondSpace
    deriving (Generic, NFDataX) -- So that Clash can store the state in a register

-- | Transform the stream 'a','b',... to Right 'a', Left ' ', Left ' ', Right 'b', Left ' ', Left ' ', ...
doubleSpaced :: (HiddenClockResetEnable dom) => Circuit (Df dom a) (Df dom (Either Char a))
doubleSpaced = expander CopyTheInput \s x -> case s of
    CopyTheInput -> (FirstSpace,   Right x,  True)
    FirstSpace   -> (SecondSpace,  Left ' ', False)
    SecondSpace  -> (CopyTheInput, Left ' ', False)
    

We can try it out using the Circuit simulator simulateCSE which on its own has an intimidating type because clash-protocols supports more protocols than just Df:

λ» :t simulateCSE
simulateCSE
    :: (Protocols.Internal.Drivable a, Protocols.Internal.Drivable b, KnownDomain dom)
    => (Clock dom -> Reset dom -> Enable dom -> Circuit a b)
    -> Protocols.Internal.ExpectType a
    -> Protocols.Internal.ExpectType b
    

But once we partially apply it on doubleSpaced with the Clash clock, reset and enable lines made explicit, the type suddenly makes a ton of sense:

λ» :t simulateCSE @System (exposeClockResetEnable  doubleSpaced) 
simulateCSE @System (exposeClockResetEnable doubleSpaced)
    :: (NFDataX a, ShowX a, Show a) => [a] -> [Either Char a]
    

Let's try applying it on a string like "Hello" to make sure it works correctly:

λ» simulateCSE @System (exposeClockResetEnable doubleSpaced) "Hello"
[Right 'H',Left ' ',Left ' ',
 Right 'e',Left ' ',Left ' ',
 Right 'l',Left ' ',Left ' ',
 Right 'l',Left ' ',Left ' ',
 Right 'o'
    

The simulation seemingly hangs because, once it runs out of the user-specified input, simulateCSE starts feeding "input is not yet available" signals to the simulated circuit. So it's not hanging; our circuit is just not producing any more output. And when we look at the type of expander, we see that this has to be how it works, since if there is no input, there is no i argument to pass to the state machine function. But this is not exactly what we want, since we want the two spaces to be sent as soon as possible, after the preceding character, not before the next character. So we change doubleSpaced to only consume the input after all spaces are output:

doubleSpaced :: (HiddenClockResetEnable dom) => Circuit (Df dom a) (Df dom (Either Char a))
doubleSpaced = expander CopyTheInput \s x -> case s of
    CopyTheInput -> (FirstSpace,   Right x,  False) -- Here...
    FirstSpace   -> (SecondSpace,  Left ' ', False)
    SecondSpace  -> (CopyTheInput, Left ' ', True)  -- ... and here
    
λ» simulateCSE @System (exposeClockResetEnable doubleSpaced) "Hello"
[Right 'H',Left ' ',Left ' ',
 Right 'e',Left ' ',Left ' ',
 Right 'l',Left ' ',Left ' ',
 Right 'l',Left ' ',Left ' ',
 Right 'o',Left ' ',Left ' '
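The difference between the two versions can be reproduced with a plain software model. The sketch below is not clash-protocols: it ignores clocks and backpressure and only assumes what the text above says about expander, namely that the step function runs only while an input value is available and that the input is held until the step function returns `True` (here `true`). All names in it (`run_expander`, `early_consume`, `late_consume`, `Out`) are made up for the illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Out {
    Data(char),  // plays the role of `Right x`
    Punct(char), // plays the role of `Left c`
}

#[derive(Clone, Copy)]
enum DoubleSpacedState {
    CopyTheInput,
    FirstSpace,
    SecondSpace,
}

/// Software-only model of an expander-style loop: `step` is called only while
/// an input is available, and the current input is kept until `step` says so.
fn run_expander<S: Copy>(
    init: S,
    step: impl Fn(S, char) -> (S, Out, bool),
    input: &str,
) -> Vec<Out> {
    let mut state = init;
    let mut out = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&x) = chars.peek() {
        let (next, o, consume) = step(state, x);
        state = next;
        out.push(o);
        if consume {
            chars.next();
        }
    }
    out
}

/// First version: consume the input as soon as it has been copied.
fn early_consume(s: DoubleSpacedState, x: char) -> (DoubleSpacedState, Out, bool) {
    use DoubleSpacedState::*;
    match s {
        CopyTheInput => (FirstSpace, Out::Data(x), true),
        FirstSpace => (SecondSpace, Out::Punct(' '), false),
        SecondSpace => (CopyTheInput, Out::Punct(' '), false),
    }
}

/// Second version: consume the input only after both spaces are out.
fn late_consume(s: DoubleSpacedState, x: char) -> (DoubleSpacedState, Out, bool) {
    use DoubleSpacedState::*;
    match s {
        CopyTheInput => (FirstSpace, Out::Data(x), false),
        FirstSpace => (SecondSpace, Out::Punct(' '), false),
        SecondSpace => (CopyTheInput, Out::Punct(' '), true),
    }
}

fn main() {
    let early = run_expander(DoubleSpacedState::CopyTheInput, early_consume, "Hello");
    let late = run_expander(DoubleSpacedState::CopyTheInput, late_consume, "Hello");
    println!("early: {} tokens, last = {:?}", early.len(), early.last());
    println!("late:  {} tokens, last = {:?}", late.len(), late.last());
}
```

In this model the first version stops right after the final character, while the second one still emits the two trailing spaces, which matches the two simulation outputs shown above.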
    

Getting rid of the bespoke state datatype

If we want to write similar state machines for more complex formats, we should start thinking about composability. For example, we can think of our simple double-spaced formatter as a combination of a formatter that just forwards the input with another that outputs two spaces.

At any given moment, we are either in the forwarding state or the space-printing state:

data Forward = Forward deriving ...
data Spaces = FirstSpace | SecondSpace deriving ...
type DoubleSpacedState = Either Forward Spaces
    

Or using even more generic types, we can say that there is just one Forwarding state, and two states for the Spaces:

type DoubleSpacedState = Either (Index 1) (Index 2)
    

Here, Index :: Nat -> Type is a Clash-provided type with an exact number of distinct values 0, 1, ..., n-1, with Index n represented as ⌈log₂ n⌉ bits.

One nice thing about using only Either, Index and tuples for our state representation is that we can then use Clash's Counter class to iterate through the states:

doubleSpaced :: (HiddenClockResetEnable dom) => Circuit (Df dom a) (Df dom (Either Char a))
doubleSpaced = expander (Left 0 :: Either (Index 1) (Index 2)) \s x ->
    let output = case s of
            Left 0  -> Right x
            Right 0 -> Left ' '
            Right 1 -> Left ' '
        s' = countSucc s
        consume = case s' of
            Left 0 -> True
            _      -> False
    in (s', output, consume)
    

Here, I've changed the code computing whether the current input should be consumed so that it looks at the next state instead of the current one. Because this is really what we are doing – we want to go through as many states as we can, until we get to the point that the next time around we will need new input.
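To see the state cycle without any Clash machinery, here is a small model of the counter. It assumes, as the code above suggests, that countSucc on this Either type walks through the Left value, then the Right values, and then wraps around to Left 0. The `State` enum and `count_succ` function are hypothetical stand-ins, not Clash's API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Forward,   // models `Left 0`, the single value of `Index 1`
    Space(u8), // models `Right i` with `i :: Index 2`, so 0 or 1
}

/// Successor with wrap-around, mimicking the assumed `countSucc` behaviour.
fn count_succ(s: State) -> State {
    match s {
        State::Forward => State::Space(0),
        State::Space(0) => State::Space(1),
        State::Space(_) => State::Forward, // wrap back to the start
    }
}

fn main() {
    let mut s = State::Forward;
    for _ in 0..6 {
        let next = count_succ(s);
        // Consume the input exactly when the NEXT state needs fresh input.
        let consume = next == State::Forward;
        println!("{s:?} -> {next:?}, consume = {consume}");
        s = next;
    }
}
```

The consume flag is only true on the step out of the last space state, which is the "look at the next state" rule described above.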

Declarative formatting

Compared to the initial version with three distinct, named constructors, we have gained generality, in that we can now imagine what the state would need to look like for our original formatting example. But already at this simplified example, it has cost us legibility: looking at the latest definition of doubleSpaced, it is not immediately obvious what format it corresponds to.

So of course the next thing we want to do is use a declarative syntax for the format, and derive everything else from that. We can take a page out of Servant and give users a library of type-level combinators corresponding to regular expressions without alternatives. Our end goal is to be able to write our Sudoku example as just a single type definition, using :++ for concatenation, :* for repetition, and type-level strings for literals:

type GridFormat n m = ((((Forward :++ " ") :* n :++ " ") :* m :++ "\r\n") :* m :++ "\r\n") :* n
    

So we Forward the data and follow it up with a single space, then after each nth repetition, we insert one more space. Do this whole thing m times, and end the line (using the old serial format instead of the "modern" newline-only Unix format), then after this is done m times, we insert the extra newline between the blocks.
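One way to sanity-check this reading of the format is to unfold it by hand in ordinary software, with one loop per repetition. The sketch below does that; it is only a model of the intended output, with `format_grid` and `sample_cells` as made-up names, and it assumes `:*` binds tighter than `:++`, as the explanation above implies.

```rust
/// Lay out `cells` following the shape of `GridFormat n m`.
/// Returns `None` if there are fewer cells than the format needs.
fn format_grid(n: usize, m: usize, cells: &[char]) -> Option<String> {
    let mut cells = cells.iter().copied();
    let mut out = String::new();
    for _ in 0..n {
        // outermost `:* n`: one band of boxes
        for _ in 0..m {
            // `:* m`: the lines within a band
            for _ in 0..m {
                // `:* m`: the boxes across a line
                for _ in 0..n {
                    // `(Forward :++ " ") :* n`
                    out.push(cells.next()?);
                    out.push(' ');
                }
                out.push(' '); // extra space after each group
            }
            out.push_str("\r\n"); // end of line
        }
        out.push_str("\r\n"); // blank line between bands
    }
    Some(out)
}

/// The solved grid from the start of the post, row by row.
fn sample_cells() -> Vec<char> {
    concat!(
        "421958637", "873621954", "596473218",
        "312846795", "768519342", "945732861",
        "289164573", "137295486", "654387129",
    )
    .chars()
    .collect()
}

fn main() {
    let grid = format_grid(3, 3, &sample_cells()).expect("need 81 cells");
    print!("{grid}");
}
```

With n = 3 and m = 3 this reproduces the layout at the top of the post, including the two trailing spaces on each line.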

Implementing this idea starts with capturing the essence of a format specifier: it needs to be associated with a counter type used for the given formatter's state, and we need to know how to produce the next single formatting token in any given state.

data PunctuatedBy c
    = Literal c
    | ForwardData
    deriving (Generic, NFDataX)

class (Counter (State fmt), NFDataX (State fmt)) => Format (fmt :: k) where
    type State fmt
    format1 :: proxy fmt -> State fmt -> PunctuatedBy Char
    

Then, using this interface, we can write a generic formatter using expander, similar to our earlier attempt:

format
    :: (HiddenClockResetEnable dom, Format fmt)
    => Proxy fmt
    -> Circuit (Df dom a) (Df dom (Either Word8 a))
format fmt = Df.expander countMin \s x ->
    let output = case format1 fmt s of
            ForwardData -> Right x
            Literal sep -> Left (ascii sep)
        s' = countSucc s
        consume = case format1 fmt s' of
            ForwardData -> True
            _ -> False
    in (s', output, consume)
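
The interplay between the output and the consume flag can be simulated on plain lists. The following is only a sketch under assumptions: expand is a hypothetical list model of an expander (not the real Df.expander circuit), and formatList takes the format as a non-empty cyclic pattern with an Int position as its state.

```haskell
data PunctuatedBy c = Literal c | ForwardData

-- Pure list model of an expander (hypothetical): the step function sees the
-- current state and the input at the head of the queue, and returns the next
-- state, one output, and whether the head input is now used up.
expand :: s -> (s -> a -> (s, b, Bool)) -> [a] -> [b]
expand _ _ [] = []
expand s step (x : xs) =
  let (s', y, consume) = step s x
  in y : expand s' step (if consume then xs else x : xs)

-- A format given as a non-empty cyclic pattern; the state is a position in it.
formatList :: [PunctuatedBy Char] -> String -> [Either Char Char]
formatList pat = expand 0 step
  where
    at i = pat !! (i `mod` length pat)
    step s x =
      let output = case at s of
            ForwardData -> Right x
            Literal c   -> Left c
          s' = (s + 1) `mod` length pat
          -- Release the input only if the next state will want a fresh one.
          consume = case at s' of
            ForwardData -> True
            _           -> False
      in (s', output, consume)
```

In this model, formatList [ForwardData, Literal ' '] holds on to each input while its trailing space is emitted, and releases it just before the next forwarding state. Note that a pattern starting with a literal would discard its first input in this model, so the sketch is only meaningful for patterns that begin with ForwardData.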
    

The easy formatters

Let's get all the easy cases out of the way. These are the formatters where we can either directly write format1, or it can be delegated to other formatters:

-- | Consume one token of input and forward it to the output
data Forward

instance Format Forward where
    type State Forward = Index 1
        -- Here, `Index 1` stands in for `()` but with a (trivial) `Counter` instance

    format1 _ _ = ForwardData

-- | Concatenation
data a :++ b

instance (Format a, Format b) => Format (a :++ b) where
    type State (a :++ b) = Either (State a) (State b)
        -- The order is important, since `countMin @(Either a b) = Left countMin`

    format1 _ = either (format1 (Proxy @a)) (format1 (Proxy @b))      

-- | Repetition
data a :* (rep :: Nat)

instance (Format a, KnownNat rep, 1 ≤ rep) => Format (a :* rep) where
    type State (a :* rep) = (Index rep, State a)
        -- The order is important, since that's how the `Counter` instance for tuples cascades increments

    format1 _ (_, fmt) = format1 (Proxy @a) fmt
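
The two "order is important" comments are easier to see with a toy counter class. The sketch below is a standalone model, not Clash's Counter class: it assumes that Either counts through all Left values before the Right ones, and that in a tuple the right component increments fastest and carries into the left one, which is how the comments above read.

```haskell
-- Toy model of a wrapping counter (an assumption, not Clash's actual class).
class Counter a where
  countMin :: a
  -- Successor, plus a flag that is True when we wrapped around to countMin.
  countSuccOverflow :: a -> (Bool, a)

countSucc :: Counter a => a -> a
countSucc = snd . countSuccOverflow

instance Counter () where
  countMin = ()
  countSuccOverflow () = (True, ())

instance Counter Bool where
  countMin = False
  countSuccOverflow False = (False, True)
  countSuccOverflow True  = (True, False)

-- Concatenation: run through the Left states, then the Right states.
instance (Counter a, Counter b) => Counter (Either a b) where
  countMin = Left countMin
  countSuccOverflow (Left a) = case countSuccOverflow a of
    (True, _)   -> (False, Right countMin)
    (False, a') -> (False, Left a')
  countSuccOverflow (Right b) = case countSuccOverflow b of
    (True, _)   -> (True, Left countMin)
    (False, b') -> (False, Right b')

-- Repetition: the right component ticks every step; the left one ticks
-- only when the right one wraps around.
instance (Counter a, Counter b) => Counter (a, b) where
  countMin = (countMin, countMin)
  countSuccOverflow (a, b) = case countSuccOverflow b of
    (False, b') -> (False, (a, b'))
    (True, b')  -> let (o, a') = countSuccOverflow a in (o, (a', b'))
```

Under these assumptions, a state of type (Index rep, State a) steps through the whole inner format once per value of the repetition index, which is the behaviour the :* instance relies on.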
    

Reflecting symbols character by character

What we want to do for the Format (sep :: Symbol) instance is to use Index n as the state, where n is the length of the symbol, and then format1 _ i would return the ith character of our separator sep.

Unfortunately, this requires considerably more elbow grease than the previous instances. Currently, there aren't many type-level functions over Symbol in base so we have to implement it all ourselves based on just the UnconsSymbol type family.

type SymbolLength s = SymbolLength' (UnconsSymbol s)
type IndexableSymbol s = IndexableSymbol' (UnconsSymbol s)

class (KnownNat (SymbolLength' s)) => IndexableSymbol' (s :: Maybe (Char, Symbol)) where
    type SymbolLength' s :: Nat

    symbolAt :: proxy s -> Index (SymbolLength' s) -> Char

instance IndexableSymbol' Nothing where
    type SymbolLength' Nothing = 0

    symbolAt _ i = error "impossible"

instance (IndexableSymbol s, KnownChar c) => IndexableSymbol' (Just '(c, s)) where
    type SymbolLength' (Just '(c, s)) = 1 + SymbolLength s

    {-# INLINE symbolAt #-}
    symbolAt _ i
        | i == 0
        = charVal (Proxy @c)

        | otherwise
        = symbolAt (Proxy @(UnconsSymbol s)) (fromIntegral (i - 1))
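
The same UnconsSymbol recursion can be tried out in plain GHC without Clash. This is a simplified sketch, assuming GHC 9.2 or later (where GHC.TypeLits exports UnconsSymbol and KnownChar); Length, Chars and symbolChars are hypothetical names, and the sketch reflects the whole symbol as a String instead of indexing it with an Index.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, FlexibleInstances, KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables, TypeApplications, TypeFamilies #-}
{-# LANGUAGE TypeOperators, UndecidableInstances #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits

-- Length of a symbol, computed from its UnconsSymbol view.
type family Length' (s :: Maybe (Char, Symbol)) :: Nat where
  Length' 'Nothing        = 0
  Length' ('Just '(c, s)) = 1 + Length' (UnconsSymbol s)

type Length s = Length' (UnconsSymbol s)

-- Reflect every character of a symbol to the value level.
class Chars (s :: Maybe (Char, Symbol)) where
  chars :: proxy s -> String

instance Chars 'Nothing where
  chars _ = ""

instance (KnownChar c, Chars (UnconsSymbol s)) => Chars ('Just '(c, s)) where
  chars _ = charVal (Proxy @c) : chars (Proxy @(UnconsSymbol s))

-- Convenience wrapper that hides the UnconsSymbol step.
symbolChars :: forall s. Chars (UnconsSymbol s) => Proxy s -> String
symbolChars _ = chars (Proxy @(UnconsSymbol s))
```

The structure mirrors the instances above: one instance for the empty symbol, and one that peels off the first character with KnownChar and recurses on the rest.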
    

With the help of these utility classes, we can now write the formatter for Symbols. The lower bound on SymbolLength is needed because the degenerate type Index 0 (isomorphic to Void) would just screw everything up.

-- | Literal
instance (IndexableSymbol sep, KnownNat (SymbolLength sep), 1 ≤ SymbolLength sep) => Format sep where
    type State sep = Index (SymbolLength sep)

    format1 _ i = Literal $ symbolAt (Proxy @(UnconsSymbol sep)) i
    

Note that the indirection between a format fmt and its state type State fmt was only needed in the first place because we wanted Symbols to be valid formatters without wrapping them in an extra layer. If we were content with extra noise like Literal "\r\n" instead of just "\r\n", we could collapse the two types.

In my real code, I ended up having to change format slightly so that it directly produces 8-bit ASCII values instead of Chars, because I found that composing it with another Circuit that does the conversion wasn't getting inlined enough to produce Verilog that avoids the large 21-bit-wide multiplexers for Char.

August 12, 2024 07:16 PM

August 11, 2024

Oskar Wickström

A Flexible Minimalist Neovim for 2024

In the eternal search of a better text editor, I’ve recently gone back to Neovim. I’ve taken the time to configure it myself, with as few plugins and other cruft as possible. My goal is a minimalist editing experience, tailored for exactly those tasks that I do regularly, and nothing more. In this post, I’ll give a brief tour of my setup and its motivations.

August 11, 2024 10:00 PM

Magnus Therning

August 08, 2024

Brent Yorgey

Competitive Programming in Haskell: tree path decomposition, part II

Posted on August 8, 2024

In a previous post I discussed the first half of my solution to Factor-Full Tree. In this post, I will demonstrate how to decompose a tree into disjoint paths. Technically, we should clarify that we are looking for directed paths in a rooted tree, that is, paths that only proceed down the tree. One could also ask about decomposing an unrooted tree into disjoint undirected paths; I haven’t thought about how to do that in general but intuitively I expect it is not too much more difficult.

For this particular problem, we want to decompose a tree into maximum-length paths (i.e. we start by taking the longest possible path, then take the longest path from what remains, and so on); I will call this the max-chain decomposition (I don’t know if there is a standard term). However, there are other types of path decomposition, such as heavy-light decomposition, so we will try to keep the decomposition code somewhat generic.

Preliminaries

This post is literate Haskell; you can find the source code on GitHub. We begin with some language pragmas and imports.

{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TupleSections #-}

module TreeDecomposition where

import Control.Arrow ((>>>), (***))
import Data.Bifunctor (second)
import Data.ByteString.Lazy.Char8 (ByteString)
import Data.ByteString.Lazy.Char8 qualified as BS
import Data.List (sortBy)
import Data.List.NonEmpty (NonEmpty)
import Data.List.NonEmpty qualified as NE
import Data.Map (Map, (!), (!?))
import Data.Map qualified as M
import Data.Ord (Down(..), comparing)
import Data.Tree (Tree(..), foldTree)
import Data.Tuple (swap)

import ScannerBS

See here for the ScannerBS module.

Generic path decomposition

Remember, our goal is to split up a tree into a collection of linear paths.

What do we need in order to specify a decomposition of a tree into disjoint paths this way? Really, all we need is to choose at most one linked child for each node. In other words, at every node we can choose to continue the current path into a single child node (in which case all the other children will start their own new paths), or we could choose to terminate the current path (in which case every child will be the start of its own new path). We can represent such a choice with a function of type

type SubtreeSelector a = a -> [Tree a] -> Maybe (Tree a, [Tree a])

which takes as input the value at a node and the list of all the subtrees, and possibly returns a selected subtree along with the list of remaining subtrees. (Of course, there is nothing in the type that actually requires a SubtreeSelector to return one of the trees from its input paired with the rest, but nothing we will do depends on this being true. In fact, I expect there may be some interesting algorithms obtainable by running a “path decomposition” with a “selector” function that actually makes up new trees instead of just selecting one, similar to the chop function.)

Given such a subtree selection function, a generic path decomposition function will then take a tree and turn it into a list of non-empty paths (we could also imagine wanting information about the parent of each path, and a mapping from tree nodes to some kind of path ID, but we will keep things simple for now):

pathDecomposition :: SubtreeSelector a -> Tree a -> [NonEmpty a]

Implementing pathDecomposition is a nice exercise; you might like to try it yourself! You can find my implementation at the end of this blog post.

Max-chain decomposition

Now, let’s use our generic path decomposition to implement a max-chain decomposition. At each node we want to select the tallest subtree; in order to do this efficiently, we can first annotate each tree node with its height, via a straightforward tree fold:

type Height = Int

labelHeight :: Tree a -> Tree (Height, a)
labelHeight = foldTree node
 where
  node a ts = case ts of
    [] -> Node (0, a) []
    _ -> Node (1 + maximum (map (fst . rootLabel) ts), a) ts

Our subtree selection function can now select the subtree with the largest Height annotation. Instead of implementing this directly, we might as well make a generic function for selecting the “best” element from a list (we will reuse it later):

selectMaxBy :: (a -> a -> Ordering) -> [a] -> Maybe (a, [a])
selectMaxBy _ [] = Nothing
selectMaxBy cmp (a : as) = case selectMaxBy cmp as of
  Nothing -> Just (a, [])
  Just (b, bs) -> case cmp a b of
    LT -> Just (b, a : bs)
    _ -> Just (a, b : bs)
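
A quick check of how selectMaxBy behaves, repeating the definition so the snippet stands alone; the sample values are made up for illustration.

```haskell
import Data.Ord (comparing)

selectMaxBy :: (a -> a -> Ordering) -> [a] -> Maybe (a, [a])
selectMaxBy _ [] = Nothing
selectMaxBy cmp (a : as) = case selectMaxBy cmp as of
  Nothing -> Just (a, [])
  Just (b, bs) -> case cmp a b of
    LT -> Just (b, a : bs)
    _ -> Just (a, b : bs)

-- The maximum is split off from the rest.
plain :: Maybe (Int, [Int])
plain = selectMaxBy compare [3, 1, 4, 1, 5]
-- Just (5, [3, 1, 4, 1])

-- On ties the earliest maximal element wins, and the leftovers
-- are not necessarily in their original order.
tied :: Maybe ((Int, Char), [(Int, Char)])
tied = selectMaxBy (comparing fst) [(1, 'a'), (0, 'b'), (1, 'c')]
-- Just ((1, 'a'), [(1, 'c'), (0, 'b')])
```

The reordering of the leftovers is harmless for path decomposition, since each remaining subtree simply starts its own path.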

We can now put the pieces together to implement max-chain decomposition. We first label the tree by height, then do a path decomposition that selects the tallest subtree at each node. We leave the height annotations in the final output since they might be useful—for example, we can tell how long each path is just by looking at the Height annotation on the first element. If we don’t need them we can easily get rid of them later. We also sort by descending Height, since getting the longest chains first was kind of the whole point.

maxChainDecomposition :: Tree a -> [NonEmpty (Height, a)]
maxChainDecomposition =
  labelHeight >>>
  pathDecomposition (const (selectMaxBy (comparing (fst . rootLabel)))) >>>
  sortBy (comparing (Down . fst . NE.head))

Factor-full tree solution

To flesh this out into a full solution to Factor-Full Tree, after computing the chain decomposition we need to assign prime factors to the chains. From those, we can compute the value for each node if we know which chain it is in and the value of its parent. To this end, we will need one more function which computes a Map recording the parent of each node in a tree. Note that if we already know all the edges in a given edge list are oriented the same way, we can build this much more simply as e.g. map swap >>> M.fromList; but when (as in general) we don’t know which way the edges should be oriented first, we might as well first build a Tree a via DFS with edgesToTree and then construct the parentMap like this afterwards.

parentMap :: Ord a => Tree a -> Map a a
parentMap = foldTree node >>> snd
 where
  node :: Ord a => a -> [(a, Map a a)] -> (a, Map a a)
  node a b = (a, M.fromList (map (,a) as) <> mconcat ms)
   where
    (as, ms) = unzip b
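
As a small sanity check, here is parentMap applied to a made-up eight-node tree (the tree itself is only an illustration, not data from the problem):

```haskell
{-# LANGUAGE TupleSections #-}

import Control.Arrow ((>>>))
import Data.Map (Map)
import qualified Data.Map as M
import Data.Tree (Tree (..), foldTree)

parentMap :: Ord a => Tree a -> Map a a
parentMap = foldTree node >>> snd
 where
  -- Each fold step returns the subtree's root together with the parent
  -- map of everything strictly below it.
  node :: Ord a => a -> [(a, Map a a)] -> (a, Map a a)
  node a b = (a, M.fromList (map (,a) as) <> mconcat ms)
   where
    (as, ms) = unzip b

-- Hypothetical example tree: 1 has children 2 and 3; 2 - 4 - 8 is a chain;
-- 3 has the three leaves 5, 6, 7.
example :: Tree Int
example =
  Node 1
    [ Node 2 [Node 4 [Node 8 []]]
    , Node 3 [Node 5 [], Node 6 [], Node 7 []]
    ]

parents :: [(Int, Int)]
parents = M.toList (parentMap example)
-- [(2,1),(3,1),(4,2),(5,3),(6,3),(7,3),(8,4)]
```

The root has no entry, which is why the solution below treats node 1 separately.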

Finally, we can solve Factor-Full Tree. Note that some code from my previous blog post is needed as well, and is included at the end of the post for completeness. Once we compute the max chain decomposition and the prime factor for each node, we use a lazy recursive Map to compute the value assigned to each node.

solve :: TC -> [Int]
solve TC{..} = M.elems assignment
  where
    -- Build the tree and compute its parent map
    t = edgesToTree Node edges 1
    parent = parentMap t

    -- Compute the max chain decomposition, and use it to assign a prime factor
    -- to each non-root node
    paths :: [[Node]]
    paths = map (NE.toList . fmap snd) $ maxChainDecomposition t

    factor :: Map Node Int
    factor = M.fromList . concat $ zipWith (\p -> map (,p)) primes paths

    -- Compute an assignment of each node to a value, using a lazy map
    assignment :: Map Node Int
    assignment = M.fromList $ (1,1) : [(v, factor!v * assignment!(parent!v)) | v <- [2..n]]
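
The lazy recursive Map trick can be seen in isolation on a four-node toy instance. The parent and factor tables below are invented for illustration; only the shape of the definition of value matches the assignment above.

```haskell
import Data.Map (Map, (!))
import qualified Data.Map as M

-- Hypothetical toy data: node 1 is the root, 2 and 3 are its children,
-- and 4 is a child of 2.
parent :: Map Int Int
parent = M.fromList [(2, 1), (3, 1), (4, 2)]

-- Hypothetical factor assigned to each non-root node.
factor :: Map Int Int
factor = M.fromList [(2, 2), (3, 3), (4, 2)]

-- The map refers to itself: each value is a thunk that looks up the
-- parent's value in the very map being defined. This works because the
-- lazy Data.Map is strict in keys but not in values.
value :: Map Int Int
value = M.fromList $ (1, 1) : [(v, factor ! v * value ! (parent ! v)) | v <- [2 .. 4]]

values :: [Int]
values = M.elems value
-- [1, 2, 3, 4]
```

No explicit traversal order is needed: demanding the value at node 4 forces the value at node 2, which in turn forces the value at the root.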

For an explanation of this code for primes, see this old blog post.

primes :: [Int]
primes = 2 : sieve primes [3 ..]
 where
  sieve (p : ps) xs =
    let (h, t) = span (< p * p) xs
     in h ++ sieve ps (filter ((/= 0) . (`mod` p)) t)

Bonus: heavy-light decomposition

We can easily use our generic path decomposition to compute a heavy-light decomposition as well:

type Size = Int

labelSize :: Tree a -> Tree (Size, a)
labelSize = foldTree $ \a ts -> Node (1 + sum (map (fst . rootLabel) ts), a) ts

heavyLightDecomposition :: Tree a -> [NonEmpty (Size, a)]
heavyLightDecomposition =
  labelSize >>>
  pathDecomposition (const (selectMaxBy (comparing (fst . rootLabel))))

I plan to write about this in a future post.

Leftover code

Here’s my implementation of pathDecomposition; how did you do?

pathDecomposition select = go
 where
  go = selectPath select >>> second (concatMap go) >>> uncurry (:)

selectPath :: SubtreeSelector a -> Tree a -> (NonEmpty a, [Tree a])
selectPath select = go
 where
  go (Node a ts) = case select a ts of
    Nothing -> (NE.singleton a, ts)
    Just (t, ts') -> ((a NE.<|) *** (ts' ++)) (go t)
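
To see the two decompositions disagree, here is a self-contained sketch that runs both on a made-up tree where the tallest child is not the largest one. The definitions are compact restatements of the ones in this post; byLabel, maxChain and heavyLight are hypothetical helper names, and the labels are stripped from the output for readability.

```haskell
import Control.Arrow ((***), (>>>))
import Data.Bifunctor (second)
import Data.List (sortBy)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE
import Data.Ord (Down (..), comparing)
import Data.Tree (Tree (..), foldTree)

type SubtreeSelector a = a -> [Tree a] -> Maybe (Tree a, [Tree a])

pathDecomposition :: SubtreeSelector a -> Tree a -> [NonEmpty a]
pathDecomposition select = go
 where
  go = selectPath select >>> second (concatMap go) >>> uncurry (:)

selectPath :: SubtreeSelector a -> Tree a -> (NonEmpty a, [Tree a])
selectPath select = go
 where
  go (Node a ts) = case select a ts of
    Nothing -> (a :| [], ts)
    Just (t, ts') -> ((a NE.<|) *** (ts' ++)) (go t)

selectMaxBy :: (a -> a -> Ordering) -> [a] -> Maybe (a, [a])
selectMaxBy _ [] = Nothing
selectMaxBy cmp (a : as) = case selectMaxBy cmp as of
  Nothing -> Just (a, [])
  Just (b, bs) -> case cmp a b of
    LT -> Just (b, a : bs)
    _ -> Just (a, b : bs)

-- Annotate every node with its height, or with its subtree size.
labelHeight, labelSize :: Tree a -> Tree (Int, a)
labelHeight = foldTree $ \a ts ->
  Node (if null ts then 0 else 1 + maximum (map (fst . rootLabel) ts), a) ts
labelSize = foldTree $ \a ts -> Node (1 + sum (map (fst . rootLabel) ts), a) ts

-- Follow the child with the largest annotation.
byLabel :: SubtreeSelector (Int, a)
byLabel = const (selectMaxBy (comparing (fst . rootLabel)))

maxChain, heavyLight :: Tree a -> [[a]]
maxChain =
  labelHeight >>> pathDecomposition byLabel
    >>> sortBy (comparing (Down . fst . NE.head))
    >>> map (NE.toList . fmap snd)
heavyLight = labelSize >>> pathDecomposition byLabel >>> map (NE.toList . fmap snd)

-- Hypothetical tree: the subtree at 2 is taller, the subtree at 3 is bigger.
example :: Tree Int
example =
  Node 1
    [ Node 2 [Node 4 [Node 8 []]]
    , Node 3 [Node 5 [], Node 6 [], Node 7 []]
    ]
```

On this tree, maxChain example gives [[1,2,4,8],[3,5],[6],[7]], following the tall chain through node 2, while heavyLight example gives [[1,3,5],[2,4,8],[6],[7]], following the larger subtree at node 3.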

We also include some input parsing and tree-building code from last time.

main :: IO ()
main = BS.interact $ runScanner tc >>> solve >>> map (show >>> BS.pack) >>> BS.unwords

type Node = Int
data TC = TC { n :: Int, edges :: [(Node, Node)] }
  deriving (Eq, Show)

tc :: Scanner TC
tc = do
  n <- int
  edges <- (n - 1) >< pair int int
  return TC{..}

edgesToMap :: Ord a => [(a, a)] -> Map a [a]
edgesToMap = concatMap (\p -> [p, swap p]) >>> dirEdgesToMap

dirEdgesToMap :: Ord a => [(a, a)] -> Map a [a]
dirEdgesToMap = map (second (: [])) >>> M.fromListWith (++)

mapToTree :: Ord a => (a -> [b] -> b) -> Map a [a] -> a -> b
mapToTree nd m root = dfs root root
 where
  dfs parent root = nd root (maybe [] (map (dfs root) . filter (/= parent)) (m !? root))

edgesToTree :: Ord a => (a -> [b] -> b) -> [(a, a)] -> a -> b
edgesToTree nd = mapToTree nd . edgesToMap

by Brent Yorgey at August 08, 2024 12:00 AM

August 04, 2024

Haskell Interlude

54: Dominic Orchard

In this episode, Wouter and Sam interview Dominic Orchard. Dominic has many roles, including: senior lecturer at the University of Kent, co-director of the Institute of Computing for Climate Science, and bye-fellow of Queen’s College in Cambridge. We will not only discuss his work on Granule - graded monads, coeffects, and linear types - but also his collaboration with actual scientists to improve the languages with which they work.

by Haskell Podcast at August 04, 2024 08:00 PM